Bayesian Methods in Case-Control Studies with Applications in Genetic Epidemiology

f6d9a2835118c59232a994614c6f648e5cc8ade0
1051912 F20101219_AACWID zhang_l_Page_029.jp2
7a51c759876e64e8f5d1c99e8849f861
ddc8d0ab1aafa8b150b12c6afd6ff9380f0a6c01
110225 F20101219_AACWHP zhang_l_Page_071.jpg
0b519d1a321c1e028ba960c32e595b07
e3bc97dc78c0b9fb66a31b7b629d65c68c97444b
2264 F20101219_AACWIE zhang_l_Page_071.txt
b123e43e43f4178b1f03d6b23d50a94f
26662558a746361cd49164de94ed0b98df585993
23279 F20101219_AACWHQ zhang_l_Page_137.QC.jpg
4846ce329cf45e58c138fa8f79860cb3
e041b1398686295e3663f1eb4be3c377e07fcfb1
115039 F20101219_AACWIF zhang_l_Page_046.jpg
f2bc1db4987e9d16cec18ba79ba16f94
899e2a3a32f2a698cd588505fa2ea179a19d8fd1
46943 F20101219_AACWHR zhang_l_Page_019.pro
471b30f8e7b8e874bf1a9411c2fba54f
cebae49c5ceb16365e44a79a4cb23a6610d2f3fa
F20101219_AACWIG zhang_l_Page_087.tif
4fb4039a8357604aef9b108ab1e1edef
9691a6552ec475fec0423e509fa428a01aab2412
F20101219_AACWHS zhang_l_Page_029.tif
5ca766d24eec5a4b885cee1384b09838
f132cc9463b5cabe44f63f4bde32fa6773032bc3
2339 F20101219_AACWIH zhang_l_Page_144.txt
b8245ffe85ce86c45830015bfd617b9e
0524f62425501cde9c7cd42c37f86e3bb0a3259e
42917 F20101219_AACWHT zhang_l_Page_041.pro
e5e312824d6955825ebc2d6d78130370
3a50ce8e9e5db3e69403d3771d76994ba0fb92d9
57373 F20101219_AACWII zhang_l_Page_075.pro
3494d12c7d07deaa0cff397873dd6cc0
ad849cde2a565797918632cee54c5c787637fb07
F20101219_AACWHU zhang_l_Page_002.tif
47f6b06f6dccd835ed75f970ffb55123
187109ab91a26422cd37fdf65c251c7b2b85d8ea
F20101219_AACWIJ zhang_l_Page_006.tif
1c0f89cb1ec17bd27f9c8640a2b81ec2
b4bdf454369c9253015162b9428dc44df4310910
48087 F20101219_AACWHV zhang_l_Page_026.pro
6723b08ce0a2721f2b741e70e507fd6a
bde9b3e38833a0aaa54e15a575e7f790c2711c5f
6749 F20101219_AACWIK zhang_l_Page_051thm.jpg
1e7808678ed0751d55d3fab4c607f399
a2ae4a7418e074a53e30b1f89bd589869b3b63e4
84456 F20101219_AACWHW zhang_l_Page_016.jpg
ba464fc5a8191aded1e225a5ac815c17
c031e8f6027779a89001c46ea48cdecbc7b79569
F20101219_AACWJA zhang_l_Page_028.tif
64c7dc7da35ad58dfe328e09f5b8b65c
a51b16a7ba5ff0f0b4beb479a79531f60d91b550
56001 F20101219_AACWIL zhang_l_Page_091.pro
6e6c122f08921e03940619567cc64848
b0eae8698b0b3a41b3cd6e3db166161b5c2e0d3a
108345 F20101219_AACWHX zhang_l_Page_015.jp2
3cdb20ee37f2feff23f7bd4123605b50
5e74453a5144842ae051df78d63c91bde15c31b1
1051945 F20101219_AACWJB zhang_l_Page_031.jp2
fe9ed7ca0b74c7dedf5fc844d8c9f66e
3d2cc244b424ade5bba5e581a0e8b52e08957d0e
35114 F20101219_AACWIM zhang_l_Page_121.pro
909b929842760abda9bf406bf0357c2e
728625e692f1546d70ca81433c0bcaeaedadd324
F20101219_AACWHY zhang_l_Page_076.tif
044c2dfcb8ed6d067713b843bc8a6eca
91d206fe03b224b4faffd09b259a2adc1431eaad
21230 F20101219_AACWIN zhang_l_Page_122.QC.jpg
9c0ea88232198f11c7bb945a22948c0a
ef5843b80505d7d1ac815d1f380503e9646f3295
112146 F20101219_AACWHZ zhang_l_Page_128.jpg
3b546bc2c66a84f032d4657ae1674fd2
6108f272bb4dd974178aad6821d1002a17409a3e
1054428 F20101219_AACWJC zhang_l_Page_093.tif
a66a4117f7c08e5e491edaa0b2d41d2b
72010581728a7ac80f8eba1497a366f40ba9421e
40012 F20101219_AACWIO zhang_l_Page_099.jp2
3e97d11a1b60c323f390252137eed8c6
873650842152a98f4d19fa886048a28af00571af
1558 F20101219_AACWJD zhang_l_Page_120.txt
6977f2be41b765530d9a804aefe9df05
68ab6a6b9620eb767b53f068e66b00111fb86fb1
35780 F20101219_AACWIP zhang_l_Page_045.QC.jpg
09b06eb771b970cafb83fe3414d8cd76
4164825a55cfe5b91c20741fc4bfdb55821ebc8e
7206 F20101219_AACWJE zhang_l_Page_118thm.jpg
9382f414b92bef2239eb0a6fb4575b13
fcd942bb6bd05283e1f3858ff6dd8c570986eb6c
40359 F20101219_AACWIQ zhang_l_Page_094.jpg
1190c5cbeab1a886d62074f25f56b99e
ba6f99c6e355d584d34a1fb4b1c8444106d727aa
17105 F20101219_AACWIR zhang_l_Page_141.QC.jpg
b64357efc085b284d854f4549ae7aab3
2f7ba4574d4a8242232ecc3abdd3fe7c96b8b815
F20101219_AACWJF zhang_l_Page_137thm.jpg
a012dbaca2616e0b05c332f266e37887
18179cb5b438c2e4a8eb09fedeb9c8ad511872a5
30014 F20101219_AACWIS zhang_l_Page_116.QC.jpg
e9e3b08aefa323c205fdee123d527e4e
281713af77891208e17e70b7784255c748d28b8b
2197 F20101219_AACWJG zhang_l_Page_006.txt
4b6d28d42b6e20bb6556eeedd2231665
2b9cf8606a5ac65cd4372f1da1436615693c8d0e
25415 F20101219_AACWIT zhang_l_Page_078.QC.jpg
4cad99f75bdc9513138c7bd1c1facc9d
a0fdb16c465e4a13d73e99f69c0f935720a1c3b4
112651 F20101219_AACWJH zhang_l_Page_112.jpg
a0337fca2d3248ce416b16cca80a65f3
e1fd5debb37778c4eeee452ece40daf8076d3aca
2219 F20101219_AACWIU zhang_l_Page_086.txt
c7085927661f1e55d8c1119c1d9fd6b7
7afb4b3c8de62995aef5b1baf0e2ca338a413c5c
37421 F20101219_AACWJI zhang_l_Page_104.pro
9f32c300ccc9a179afb480dc0c730d7d
d53a7859018b44fb3d57be366415402d4815c7e2
1237 F20101219_AACWIV zhang_l_Page_131.txt
94abb9426681c95218b97b3107115423
4302893e4afae85a50f54dd1dec2b7526d9e47dc
62945 F20101219_AACWJJ zhang_l_Page_040.jpg
dcbd0fbd71efebfedeb53bc5be8d7a1a
15ae72ff72bdc28ae3e8e3b8826ed501fe5e90d4
34482 F20101219_AACWIW zhang_l_Page_082.QC.jpg
bf77e992494c76da2d9e960f94410cc4
21d46d89283cff0296fc51752d868e6f356b3600
87053 F20101219_AACWJK zhang_l_Page_130.jpg
3dc769062618354153698de3d74d174d
fa0df046a2a0029ea7b57960b5fdf483538af7bc
110819 F20101219_AACWIX zhang_l_Page_070.jpg
8f66b5e8c01fa9ca7a634071a0e03871
7cae07f2d63b698cdf25c94f57befdfcea16edae
F20101219_AACWKA zhang_l_Page_015.tif
54c22c0b533e8adc6f3155b06fe29650
bc7241d9111e37f2e43d062aa88fe412f5b58364
6583 F20101219_AACWJL zhang_l_Page_025thm.jpg
d0d15b9504deedca78563f2f9621d873
5fe166f7eefa80b98219d53b27c38d5a971afed3
1828 F20101219_AACWIY zhang_l_Page_143.txt
9b67818d5317c69acdc809e2f3b36542
b0493b9d938c31d971ce5eea98f886d4178d44c1
74683 F20101219_AACWKB zhang_l_Page_095.jp2
b6bc540b6f8a0265961ab8fd51441545
c84edea9ebcefa518c892a5973aba0bca29cdb6f
F20101219_AACWJM zhang_l_Page_131.tif
35bb0976fc670f8b41749db22db05263
c761525c4fbeef079e8519a9ee2d8e973b293b20
58952 F20101219_AACWIZ zhang_l_Page_061.pro
100c91e8cffe2448c9b94725798752c3
79f03f0063e01b245fdad82d3b51c0a3c8faedac
3939 F20101219_AACWKC zhang_l_Page_011thm.jpg
82cef970bc731ba0a6afe70d853743f8
076d4d62f4a0d43adeb002b947d9eacfe6ede660
17177 F20101219_AACWJN zhang_l_Page_011.QC.jpg
46d39b0f1c5f7cef2542bed6b982b361
f7c67627af0c1e17d8bee02bc97322e636b7de3e
113772 F20101219_AACWJO zhang_l_Page_056.jpg
b120264c2dfbb57fe940b18e2b22387a
d7aecb6ccdf3a4ff0ff968e27baeeb38f14daba7
F20101219_AACWKD zhang_l_Page_013.tif
54b48f2267c1aa47d28cc886770e3eb3
4a1778b23df1b6f6c6e16c94029a808bcf565f71
30392 F20101219_AACWJP zhang_l_Page_126.QC.jpg
64d04690023ccebb150dc59a43b0ce86
2d6cebd5f74fbcc72d6582796614b57413b5ed55
F20101219_AACWKE zhang_l_Page_125.tif
3a9504e520ac1b961e93014da2fbf16e
7ed3d64c3784b6c46dba7649e0beb498c292487b
27071 F20101219_AACWJQ zhang_l_Page_130.QC.jpg
eb29fc065ba0b61938029c536808ff3f
6fba7882bfc17f0b5dece2c5aff57ee39f59f801
F20101219_AACWKF zhang_l_Page_069.tif
682dd97dca974cd781af7206c2ed86af
0974cf893a018e9a8d03ce15ad7084db77fcd974
7628 F20101219_AACWJR zhang_l_Page_106thm.jpg
c9b36b23e19426cf2f4e342d6a661c98
efd85878fd790a06a4146b0245319dc8ecc395b2
F20101219_AACWKG zhang_l_Page_151.tif
eb3f8ac2cfc5298a2178cace72d3d25d
8ec4619bd5ab954c8a26258beea7745402a3c46b
109072 F20101219_AACWJS zhang_l_Page_087.jpg
346a570549517669343e6a5776423f1b
7b3e0efafbd8bc46ad0610ba96c54451169d073d
123580 F20101219_AACWKH zhang_l_Page_045.jp2
a00c0bf2eb6eff86808864601e36e444
c890abcae5f4499308b9188283dac2ea53beabbe
68313 F20101219_AACWJT zhang_l_Page_119.jpg
64b8ad7f1badc94ce147128d7c47241b
4db093bf28653f2cfd6e3ed7f27db8dd33b037a4
116458 F20101219_AACWKI zhang_l_Page_091.jp2
fc3c5d4189a5ab1e0d4d4ad05fefcb04
7a4d75e3a03ed6d5d33a1bd818cf9483fabb7fbe
20306 F20101219_AACWJU zhang_l_Page_040.QC.jpg
8aef8e7fc0418ffa9a44aa4d22d1a4cd
9911a5727f72837c5f1527d4b193091ea4a244cc
30647 F20101219_AACWKJ zhang_l_Page_020.QC.jpg
fcccea3302ceb98775db273ea46c9431
7916f89e5d726ac907f0551439b4b907e3f4904a
54693 F20101219_AACWJV zhang_l_Page_030.pro
555c8c34d924af50bb988d59d5c9691d
faf443378add96334418c8f2733d26123231f7f9
31601 F20101219_AACWKK zhang_l_Page_145.QC.jpg
98799ff94a5904da34230afb4371306e
35979c42c291e5758311e9f1acff564c9ca962c5
29997 F20101219_AACWJW zhang_l_Page_035.pro
b00f34e10f59acfd4fdbd2a9b4577415
2c4d93c3aeda2b5b588a750db06f30532adb169f
49981 F20101219_AACWLA zhang_l_Page_036.pro
6c55d492f8a0e1713370e5556e0d540e
1ce815c75cbaa71ae5c39a5a416a8a8a6fcfdf55
857015 F20101219_AACWKL zhang_l_Page_078.jp2
0e9b4ac355be9df59577b10397f92db8
2d618e2d086573e4ee1bb6ab39dbd9ebdf982609
111744 F20101219_AACWJX zhang_l_Page_057.jpg
edea087db9085fa16674466f7d0bfe09
5b08601670470b1122c1a364b52c7887809c4c02
20471 F20101219_AACWLB zhang_l_Page_039.QC.jpg
20039e7d2ce31e60f4b3794af29bd0ce
d3f57b78791b00be79826c5030ff7f2a7ff2683e
F20101219_AACWKM zhang_l_Page_061.tif
6c4c0acefb01e8c936645ef2beb4068b
0c7162457ccf7a8df52502a1f22721c42aeeaeb8
48331 F20101219_AACWJY zhang_l_Page_004.pro
2de5165f59b27f84e6a411b9e68c1204
7299fbb169093ef518c5514aaffc8b846fef174c
F20101219_AACWLC zhang_l_Page_107.jp2
7ed515b276f057a71de2f5c155024aea
cc7f004b3f0123ff714f09198110e189ee350267
1184 F20101219_AACWKN zhang_l_Page_151.txt
82c769bdaec6f96f3e3d861ac744fb8e
fc438253a076617c2e7dd23f2177f8873d34c9fd
738665 F20101219_AACWJZ zhang_l_Page_035.jp2
8db08fba240678b14b8d5058002de3f8
9ad79064f5718f589149f7c6c29f56dd3668ccb3
35062 F20101219_AACWLD zhang_l_Page_128.QC.jpg
c039d3f425b67f084103d6a675cf2520
d7e6bc65003195158a3288d6ce9a38f2b7a23cb8
7843 F20101219_AACWKO zhang_l_Page_144thm.jpg
3011543efb964ebb837b08c3c9cf4dbf
fa43027e99aa504651b993adbcafcf149c7fd64d
1799 F20101219_AACWKP zhang_l_Page_048.txt
916c1adb7c34cef084142883d7504339
49300509dec669cfd3df1519f682bcd195961778
20812 F20101219_AACWLE zhang_l_Page_068.pro
957018cf62e3ce05f52929e9ef82c308
06eb2ec08078aef727d9531dbad68eea6bd04181
16043 F20101219_AACWKQ zhang_l_Page_124.QC.jpg
876bec2cd1f0a055dbb53566ca886f38
430bff12bf9393544db806fe6cf6b79a3279269d
55637 F20101219_AACWLF zhang_l_Page_087.pro
f14d59dbe23431d9e57d8d01dccce3e8
5788f0fe4449b57fbdb69e07836f42c18f1c6b9b
460 F20101219_AACWKR zhang_l_Page_001.txt
47b3f2d0e8db84c70ce0495e87390649
7e59f53655b9ee99b9b6174b09056e706334658d
2482 F20101219_AACWLG zhang_l_Page_149.txt
8434788abb0fa21449f8f1b850b494ee
d433040f924acff9e313292455f4b862d71cf867
22814 F20101219_AACWKS zhang_l_Page_037.QC.jpg
efe00e6fe58ce5641227dfe23c2083c8
b368a6978370075ab0f6fe308f7d27b90cfc6fa4
29422 F20101219_AACWLH zhang_l_Page_041.QC.jpg
23f0e87e034e6a5e213999c0a11c715f
975bd231b467334efa55205c6186ef8950bff57a
20877 F20101219_AACWKT zhang_l_Page_039.pro
a154758e0443f7b7b943e1b1390a56b2
81279e50563c958899d02dd7af8a79060d13be6e
26779 F20101219_AACWLI zhang_l_Page_042.QC.jpg
158861b8d650f4ee5767444ff01cd836
a15afcbb399e464cf51f50497a174f8ec1bec5e6
72093 F20101219_AACWKU zhang_l_Page_120.jpg
41d1df977cf00e2de7c1fe576e530e16
8c063a0b51ccee3ac331c2b5a49d11f38d55b2f3
3889 F20101219_AACWLJ zhang_l_Page_064.txt
0fe53719bdf7417dc03af6744bd6787d
0727fc88fa72753e7405a0b7848603dcc98a7287
3648 F20101219_AACWKV zhang_l_Page_138thm.jpg
13e3bb877bfe0df902c36a4b852d19bc
37c816e63576fb3a706b70e81c4e2fe3d8545c5c
112283 F20101219_AACWLK zhang_l_Page_075.jpg
b0fdc11b363ec70ac94dd40d33989096
d9a4aa55f54ae62301766d18814187c6a8737177
2023 F20101219_AACWKW zhang_l_Page_001thm.jpg
4c9d8d3471b4b7c0f78b390627193904
b91685e1860bc573c318db27ed037bd4e5d35fbc
3963 F20101219_AACWLL zhang_l_Page_097thm.jpg
a8868026e06dd5d20d074d8db8a37df0
fd1f435a4242f06896e842be3fb6709f9d4d3a01
1051980 F20101219_AACWKX zhang_l_Page_115.jp2
f956aaa7e82ebdf3d01ced9eab00702d
4b86c1fdf2cf4c13d48f83e3ccee678ced5a69bf
59493 F20101219_AACWMA zhang_l_Page_046.pro
e338be2d169743bb4ae527710e808357
c1122dd4b0ff32c089beb70a5976de700d4c362d
18507 F20101219_AACWLM zhang_l_Page_114.QC.jpg
232832f3f8f76d7dc50bd450edd482ab
954ac95d4886d293ce8a05de7dd3fbaee44e2d5d
F20101219_AACWKY zhang_l_Page_052.tif
7875fc8d17d4e78a21e1050b5b5f84db
eb5230ee2c9b8623189b3b7c0037f6d9d2e45a72
44452 F20101219_AACWMB zhang_l_Page_098.jpg
07240419f45c9b0dc1421f32bd41170a
4050b43aedf649e40900414e6eb8c0d8c539ac41
78077 F20101219_AACWLN zhang_l_Page_049.jpg
3e637dc50ddfd89ad6387063784e5432
32166a439605f76766b6a2769a4d580de7bf8a05
46029 F20101219_AACWKZ zhang_l_Page_140.jpg
34330fde4a1551d807ee94543f380268
0ef2a6558ef3708bc5265554861582baba52c175
F20101219_AACWMC zhang_l_Page_100.tif
7d7ac9c156034eb16c70e12e2d838152
f7c9f2dd60b8165bcb8dc3680dda2b7ccbfbe9b4
4163 F20101219_AACWLO zhang_l_Page_002.jpg
d1b77751278bcdd196c77b6e7d683cd2
c65a217788bd576f4599f21b91d67f5e12232c85
117289 F20101219_AACWMD zhang_l_Page_072.jpg
bb65a1134491ef511e0146eebb5dd90c
356798cca615294084a8b24549223562cf507bd7
F20101219_AACWLP zhang_l_Page_097.tif
b6a1e62876e04b30ab8b9aa426b1d5fb
1d015d7a77c9087e35c6cc90eb3e36fc18728415
501 F20101219_AACWME zhang_l_Page_152.txt
a9b2715b9507ae1d71ce6d35779957e8
3c43bfbbdf93154a917a1bd8760011f6f8ecdc67
7890 F20101219_AACWLQ zhang_l_Page_081thm.jpg
4075fde488b1c9214cba0292ae598c08
ca28ed58a4fb36be764e1a6942ef775a743e6df7
1958 F20101219_AACWLR zhang_l_Page_027.txt
5b78e5f66e8991d8f6bc96b43bbaa592
0e33842a357fb737601c2b6abcabd5bea6db7d59
F20101219_AACWMF zhang_l_Page_030.tif
6e9bd868a0a13ccc7cfa32e432c70693
941069852ff955879ba8723fa522026c97c7b443
120272 F20101219_AACWLS zhang_l_Page_128.jp2
d65778b88ef47dd4265f22fe63cebff6
b7d8b636a743610b44208d90368bf586151ca380
39337 F20101219_AACWMG zhang_l_Page_051.pro
e5b1485cf1144f3c74ddb6e6d87846a9
76e9a4b646c2d985c023c91ab7f0d87da665ec18
22201 F20101219_AACWLT zhang_l_Page_014.QC.jpg
6b46dd6346ee1cf3e9e5252d318a2f3a
0aa8d85f0fffafb9bf9bec5cc7ff889b35e88259
121278 F20101219_AACWMH zhang_l_Page_056.jp2
7b8a499adcb55d9c998404abb5dabdeb
0a6b9bf92ca3fced7ead03b3fea4ba2feae6dc20
F20101219_AACWLU zhang_l_Page_019.tif
560e71496a1ec71ee9ec818271802da7
fbfe16e5b0e1825c7d8b1a8330b3e30041f4bf5a
F20101219_AACWMI zhang_l_Page_011.tif
346b7a0fc277d5e24d1493e52b353b7a
f3684d1f77cb4f765c2123154fdfa58732160b86
F20101219_AACWLV zhang_l_Page_118.tif
6d2e7fc9e980918715d481ec6d7a174e
c8c9a69f5b14e9a8b6b2bd836c7b8bb91c398990
2210 F20101219_AACWMJ zhang_l_Page_089.txt
1a4c58649a5e51935dfbe331d58a407b
5a0dd43206742321757d213da7ed693a6e59da5f
116434 F20101219_AACWLW zhang_l_Page_080.jpg
22bbb217b5b683c03c883eb3810e8549
bc731f6ea3084338b128cb8fcb3f3a3bba7d64b0
34483 F20101219_AACWMK zhang_l_Page_083.QC.jpg
8ff45a1fceee9d250f8ba22fdc0f0c96
3c05f25dda12882a9620af55ba893c6c3bf5140b
589 F20101219_AACWLX zhang_l_Page_005.txt
a4ceefd6201439d4b003ac313982839b
bab5b2cf0e12dd0561d76278bfe45bc5962abbed
7183 F20101219_AACWNA zhang_l_Page_134thm.jpg
17de0fcf6144181b376f1693248fa97d
eb068c86a5c3b95364f7845de668e8df8b6f59ef
5922 F20101219_AACWML zhang_l_Page_139thm.jpg
4013b2e1afbdce3e1829f1e9c0ba2552
53ff1106369033b6af7484830e47346a165eaa8c
32420 F20101219_AACWLY zhang_l_Page_044.QC.jpg
695749573060265ee60b3d50346c0309
5f1f0d7962fa56be0f9f19705d48167af1aa4637
1396 F20101219_AACWNB zhang_l_Page_114.txt
27bd3b3b00a18d827d593ef955e7ec97
9c06cc2cc8c6a57849bdd0f3a94d86574d8723c5
1145 F20101219_AACWMM zhang_l_Page_142.txt
c6c8400a8e89ae2c40e11c18b1dadbfc
6340d9ae7d4a728cb62c76d62fec074046e12e74
4811 F20101219_AACWLZ zhang_l_Page_095thm.jpg
b91b697e8ea10a2e070a2f76bcaf05ef
18df72cffa2f55b27b475fa78884a0e5ce3ac33a
117520 F20101219_AACVKA zhang_l_Page_023.jp2
d922031d26021dd2575f331946c7b34c
f99ee7cc5a94d47f97fc5122b50195a5189d6d88
F20101219_AACWNC zhang_l_Page_123.tif
5d5cf7abd72e69e8443dba943b7f3fc7
58367db0ccfa9718d7a2121c6b7b463c2ad092f7
97058 F20101219_AACWMN zhang_l_Page_111.jpg
d10a67516df90ea18ea82ec98db9eac2
93e970e6956951a8afd649ccaeaa9ee3b54d4121
1051979 F20101219_AACVKB zhang_l_Page_012.jp2
a43a146e46bb51bf208bd621b9f8bf14
52ac8f44e9bf0bff900a803664f0dde9a21a827b
F20101219_AACWND zhang_l_Page_129.tif
30486fd54e268ddd785f51a97df4675d
a4a0515503f658cc194f96d150e1c6346882dcec
19904 F20101219_AACWMO zhang_l_Page_132.pro
1ac089472f13e473234bbd3bb3080e48
77979536037374c29012c0444f8884d03e7d1598
14903 F20101219_AACVKC zhang_l_Page_125.QC.jpg
d3457c42b5cc8035a69de9a28d80a40c
fec20b9f379c906076f89af3b85b15d29ea8daa4
F20101219_AACWNE zhang_l_Page_112.tif
79499f97f2ac6e121055cab2fe7a9549
50b9fa5c834d11ba4d6dae99e31287dddf6e2fbf
5917 F20101219_AACWMP zhang_l_Page_037thm.jpg
788e5536646f59682a4156242f16bf8b
ed61cd5898c7e1da134cc9f3e0d58827b0960074
F20101219_AACVKD zhang_l_Page_072.tif
b21bae6f821f0f732e19afd6c74c8e57
d0061fdc0cee446b1421df9fcbefe6a46c820373
41253 F20101219_AACWNF zhang_l_Page_068.jpg
2c36763c4a50127485c460dd93f75b1f
93548d4461cb1ead95570d905cdf81b4f9e7adb8
58270 F20101219_AACWMQ zhang_l_Page_114.jpg
1fe7debd9f0b4c63b09298623b81a8b8
f9869d2741a7bcbb832c2a442970d3a86de74af6
1046169 F20101219_AACWMR zhang_l_Page_020.jp2
3fde1f479e873e2049468f6f819438b5
19e7e0c202d3c061345ee66d8a22a42967c9ac4f
83329 F20101219_AACVKE zhang_l_Page_076.jpg
624d243bee3ea95fe45cfc806b9c381d
bc55dae8e4e9c45066d8e83ba57b89d4cad20fce
1051975 F20101219_AACWNG zhang_l_Page_047.jp2
d9391d12b941cdbddb4b2452963e9b35
24ea2e1bd467674be9929243fc9e42e585014c60
F20101219_AACWMS zhang_l_Page_090.tif
e4a12b57f52add82ca06598306aa6e38
e10f058e14c6cde3378d2f5e90dd70fd344d7eec
13718 F20101219_AACVKF zhang_l_Page_098.QC.jpg
744532f090a2f6b8f633699c62d6491c
9d175717bce5711420bcadf1c36372df64df5e04
56201 F20101219_AACWNH zhang_l_Page_059.pro
d82634327ebbd87507fca7d781f4849a
c9d4f6f932186b3b05648d246f4adae82d8b617d
7783 F20101219_AACWMT zhang_l_Page_031thm.jpg
9cb6dce74286115c7bbac771e4513b1f
8a5eb593c9278d24d31b6d5e9af86cc609c00d1b
3904 F20101219_AACVKG zhang_l_Page_099thm.jpg
38ca6f6e3e883519c5dfef5752494286
f30e2b0ad1b244d5ada9334a1f280e7652335b1d
732544 F20101219_AACWNI zhang_l_Page_037.jp2
84baa8865e8c3c56fcae8736cc9f64da
dec889e0e2a8e0bbb6e853c7c284e1f2c5e4c1b3
6874 F20101219_AACWMU zhang_l_Page_069thm.jpg
f1bd74031b557194a34b56486a753251
22039a11d4ceb62c60471888cf44ab6e66ada529
73439 F20101219_AACVKH zhang_l_Page_137.jpg
26ca06fba0c3f2ace9be67be65ee5066
f299d4ef9177d6e326e2ac3cfa363b1bb98bea8d
1341 F20101219_AACWNJ zhang_l_Page_067.txt
81488ef256b3855715aad34ed7944480
39a83dccc4a7c2cd678e68fc5456136ea8c73056
8185 F20101219_AACWMV zhang_l_Page_102thm.jpg
b23c1d376fb6647750ff2527831efb37
cc1604e0437c37a8fd2d192acbabc2d382ade058
F20101219_AACVKI zhang_l_Page_108.tif
221841ed1a314b810edccb59e0b962a3
1e7213675e4da5b6c9b947f8f3e990c83740dae0
5081 F20101219_AACWNK zhang_l_Page_114thm.jpg
de5fba0b19f2681a79e5337fb0678263
45fa915000373b32547c03c4c3cb502db828e68e
6858 F20101219_AACWMW zhang_l_Page_113thm.jpg
1536159592dd306abc1845cb52c92ef8
eaa4a8ac157e3af637d185b210ea013103172c5d
53222 F20101219_AACWOA zhang_l_Page_060.pro
8dc9524b717251f9119ce913c29ea042
70bbcb55d814aa556ee42f266f28c6be1151891e
29023 F20101219_AACVKJ zhang_l_Page_150.QC.jpg
3802b0a281b0ae16a05c73c84254022d
663c62ae0feb5a98dbf06202716d4d80c74bd810
99082 F20101219_AACWNL zhang_l_Page_033.jpg
5108ef3e2ad069aaa11b32b975dc6987
0bdd136b4d715a4ce17750a37212baca24be151f
F20101219_AACVJV zhang_l_Page_139.tif
272abd086957114fa8e9311192503943
890486dc216e80bf6fa957905ce405943263a88b
28907 F20101219_AACWMX zhang_l_Page_069.QC.jpg
10130833bfc16eb6b1b7318043c9c02a
bc0fadeb98627b6c7a8259d8f7c05b02498dc19f
5486 F20101219_AACWOB zhang_l_Page_123thm.jpg
933641d969c5caaf24aa27a936c3f679
bdcb3cfdbe318ff4ce6ee33dd175a50c714602ff
F20101219_AACVKK zhang_l_Page_142.tif
1aaf22d34e1286a6b855bbe35f4679ff
39cddb53b648818da023a3e4634888082670b62a
6906 F20101219_AACWNM zhang_l_Page_126thm.jpg
0eba48b24b457380b0df31bd545570a8
553b9259827b2107be9626a9ecc4f036c2f7a7bb
2119 F20101219_AACVJW zhang_l_Page_033.txt
ef44ce66c51754efcfb24ff9371a0def
606347ba0ae70c8a3ef8690bc14cf94f144e0819
55786 F20101219_AACWMY zhang_l_Page_071.pro
aa515e22ac9a109162d6bcfed132b33f
83c4b17b78fe584d7f17c7738aa7d00ff07f0743
87611 F20101219_AACWOC zhang_l_Page_013.jpg
fede1a2e158b38b0da5d9af45483879c
2872e00a75f2b72e105c197afead0bf33728d77e
49226 F20101219_AACVKL zhang_l_Page_066.jpg
12403652023c474c8b0e5a134dd03906
df1d9aab48dc0d8374b3b74fc0dd3a0012ac39cb
106491 F20101219_AACWNN zhang_l_Page_085.jpg
01c5ae06c7c23c157993e67321c49415
6966d979c0f2d7d452ed6809be264dda7622df9c
56663 F20101219_AACVJX zhang_l_Page_062.pro
7906ff96e66b84fcd6662ff2db1a2b5a
dfc6ae7a5386c86921e9494c7aa9b576f7f32293
4216 F20101219_AACWMZ zhang_l_Page_142thm.jpg
4038e830f605685a0cd9c128060d8ab1
31c14d04eed138a9c556ccd2ca3a43df0085c6bf
35011 F20101219_AACVLA zhang_l_Page_062.QC.jpg
9d1902007659e959a38911351a33f5c3
d00288fcabe910e6501ff97b484c7b2eb339d07d
F20101219_AACWOD zhang_l_Page_039.tif
0f671a14f791675774abc311eb257cf0
3e3ab8418412994259b3944f2b09a0f3aca450d8
8364 F20101219_AACVKM zhang_l_Page_080thm.jpg
86fa6bb56adb689e40bf086c112380cf
accc6dead58c46d4b54af21f845c0d357ea5c596
22362 F20101219_AACWNO zhang_l_Page_139.QC.jpg
04a44fdd0f348728c07f12bb92572c57
f0121d255ec32bdf7bf8a5b4c243aaff925f4a4c
60551 F20101219_AACVJY zhang_l_Page_072.pro
98fdcbaaa223e6b5c51d5b9b4340235f
204130fdbab959b83a10f3cabec0b43b5372804a
60072 F20101219_AACVLB zhang_l_Page_146.pro
abe1d62d06528b4da54bd22b7e6773df
551faf660ae21a009c5bdea56954ced4cc9507cb
32183 F20101219_AACWOE zhang_l_Page_030.QC.jpg
e8e7f6670cbdc884ae28a44c5ac06652
e2dc02c9370aadf045fc21a518e2372334cbaf04
84254 F20101219_AACVKN zhang_l_Page_108.jpg
6c1f8fffa78dc0b8220f59ec3e3a36f7
d771971201811fb67ca4050c079b4d45ff379463
16326 F20101219_AACWNP zhang_l_Page_067.QC.jpg
ee9a0e259ed9b934ab0e802b48f5f613
7715feafd54481d22750db63714852b4093b1c18
F20101219_AACVJZ zhang_l_Page_005.tif
3ceedc440938d610a9e04a52f575245b
d2aca3d945a7c9b4d63a62581599f9b199495c33
2102 F20101219_AACVLC zhang_l_Page_152thm.jpg
078eda20c468e4e8e7d3c92a76dc1dbc
6e1eead1188294cd2beff68883a96b7f00dbab95
F20101219_AACWOF zhang_l_Page_041.tif
cb373d06bc95e8b6a9fa7ff8b401433f
f8578a72d9393d27f35c256b64499e4c8ecb632f
F20101219_AACVKO zhang_l_Page_037.tif
696dca9ffd9b8338fc22375c3f67eb9f
42f8bb811fac02d02b24b5df25fc185cdb415ef0
7604 F20101219_AACWNQ zhang_l_Page_145thm.jpg
fa8f79dbafcea3b73c4e70b53bbcb063
ee81c2482ae76dfa4d3b4c8446a92fc0a0802347
25032 F20101219_AACVLD zhang_l_Page_021.QC.jpg
03849dfcd153ad3688a8429a632458fb
a2b972638b830e0757f3f0badf715cc72479b987
44619 F20101219_AACWOG zhang_l_Page_048.pro
c46e7fb642c2904ab588b4d34899e36c
176dd44a994bd4d20d89762f08a0b4a8521b03ef
F20101219_AACVKP zhang_l_Page_114.tif
2dcefac913de383cbd09f99a47f67458
821c83bdf74c270765f50359802eb2586b39a302
54820 F20101219_AACWNR zhang_l_Page_077.pro
837efa94fb6c721585b923d67d6324b5
75d621a7cb30f5d0d2e7cf209d0c8253f5e7941b
31172 F20101219_AACVLE zhang_l_Page_107.QC.jpg
66e214150efd9fa69b0d323ace404dc0
0d2bc0da883de0678054d04b032965c369fc9eba
70997 F20101219_AACVKQ zhang_l_Page_038.jpg
065ec9f33be892f95a79300eac2611b3
468e840c306ae03700253029d36b2b24b052227d
2238 F20101219_AACWNS zhang_l_Page_031.txt
7bcc95d4497c78bba15a0925d340f2da
a4198a22d6f08b9952555d8b18501babccb382a3
5366 F20101219_AACWOH zhang_l_Page_131thm.jpg
c71b5769ebf233d36981bfae30dece0f
3ee0fff6f34ecbf91c3ed0885c3c67396283079d
56542 F20101219_AACVKR zhang_l_Page_080.pro
924ab325357ee6699285975886029bc4
f30459cb161a0bb2cb6a1f79b3522f04deef44bf
67824 F20101219_AACWNT zhang_l_Page_065.jpg
31ffcb05403d28c5fa41b137ed629ddc
84a2cb63d15bdff4a3853af63c75c4a5ebc08239
56684 F20101219_AACVLF zhang_l_Page_081.pro
0d79cbddb4164c29b56d1bff6f016de0
f1f69e3b3801cae561b788d6be49110838d5de56
31532 F20101219_AACWOI zhang_l_Page_015.QC.jpg
87e7ca7168fb2f4267643187e552ee6d
505f2534333017ec6b950121f67efe67f7bb3509
100382 F20101219_AACVKS zhang_l_Page_015.jpg
e43999c7af127d3f4438d95a0e4dbe56
89ca208e66ee738abc093d312dd8f58f67b0dfdc
F20101219_AACWNU zhang_l_Page_152.tif
62819dea76fbd05cc294be9296264b19
121af72d6dea7be54742c79dd5e30d72704ada57
5583 F20101219_AACVLG zhang_l_Page_032thm.jpg
e81a0021c34b082d124faffad840433f
c0859480e528f8bcd22d4ae605a2481ca251918a
2322 F20101219_AACWOJ zhang_l_Page_128.txt
67e54f00c713d9336ea88b49f8496694
21c6217db4eec09a9fc043d28713546708e62920
11328 F20101219_AACVKT zhang_l_Page_094.QC.jpg
8c77ceb07d1f43457653df0f9771f0b3
d0fea836ff5c6211fb6327b83c42f83631981640
1169 F20101219_AACWNV zhang_l_Page_141.txt
8448c64d73cdf50ba785cd107cada3b7
84941828b9e5f6fdc9fd48d060a316163527b747
62790 F20101219_AACVLH zhang_l_Page_151.jp2
1ddca39923d52a9f1be1a64c4d05e941
bfe5bcfa9f85b0a3abf6cffce01942581fa8d5a7
100545 F20101219_AACWOK zhang_l_Page_127.jp2
c2cbbd4eee919e85dc780a5780fbf1c2
1a2c6f4d3c557bc4f1a712871a1739c225003fc3
93790 F20101219_AACVKU zhang_l_Page_026.jpg
96065e00b5a87c7cef42e6b7088ccc19
2410beecaed668a26f3097fd407378baeb1671b4
7180 F20101219_AACWNW zhang_l_Page_053thm.jpg
a6e8c1fc8a29cb52eae1a932fd6e30a7
ed96307c5546be7846746e7f6998194714ada37c
1787 F20101219_AACVLI zhang_l_Page_104.txt
1b4c0662b06e6ebc41405dc8af356745
b2ec7ded718f7bffd9ac91f75b2d8ea9dd755d02
67140 F20101219_AACWPA zhang_l_Page_105.jpg
27dacddeeb325c301a80d67659ca1eff
deb36c88909aff3137ddee946ad029f233c885fb
725407 F20101219_AACWOL zhang_l_Page_105.jp2
f6ad93ff69e56e502a28ccaa5b33fc50
68afb7de7566e569c3643da9682a51353da5d64d
4062 F20101219_AACVKV zhang_l_Page_067thm.jpg
8fa157aa639ca8b6642476d80d34bddb
e780841d4b65dac79048224ae215c3548bc1f686
51774 F20101219_AACWNX zhang_l_Page_015.pro
c7af92de6bb5ea02c8b952f4d1e6de6d
a8c7ec3916d5c0f72e18543be33f6afc27c73e0f
24460 F20101219_AACVLJ zhang_l_Page_104.QC.jpg
097ede9c0c054d5bcf962464b923a328
2c89c850d9e148df47c873f95aa489cddd7bb624
78241 F20101219_AACWPB zhang_l_Page_120.jp2
ca04d47e6496a605a85d6c0cd65144c7
b96a93184f71eb5146f1fe58452c61e4883bdc01
48461 F20101219_AACWOM zhang_l_Page_132.jp2
442529dc72cc1d032f2d6eb17629095b
eb1867666d48968374146601292c14726d5734db
28843 F20101219_AACVKW zhang_l_Page_147.QC.jpg
321e506f67955ed55768e3ad69cbc2f7
1b935382e8005edc94e1359ed0cc7cec80038497
7446 F20101219_AACWNY zhang_l_Page_044thm.jpg
5e9fc8315111b663ce202681a3b60684
a31a3d114849a260f2f70c4c071ced7816b44c4e
29248 F20101219_AACVLK zhang_l_Page_106.QC.jpg
c66c58453f8a833fda18c80af87978a5
6436db66a2c3ea56d182c80f93094afdb11ce632
1963 F20101219_AACWPC zhang_l_Page_020.txt
0ac2a4e36ce448633e6b7b0b11ad7b19
71fe89cc1e79db0b9ed781f96f852d09e3175399
22178 F20101219_AACWON zhang_l_Page_038.QC.jpg
2b023cfee52446507a4a9b71058c1fd5
38bd98da37fbff924345bf057fd4051420e94cfa
92506 F20101219_AACVKX zhang_l_Page_013.jp2
974647a085e3e783313f3fda50c77b4c
1acceab71a7b215e06841bebfb3277a4cc67c53d
2187 F20101219_AACWNZ zhang_l_Page_030.txt
0fd03117d20dcaf98d29e7dc6147ce7d
acd6a7ef4d31cc71a1d7895e0571b5f36e2f062c
23441 F20101219_AACVMA zhang_l_Page_022.pro
f4fc36dde9b641bc5131ca1950c61904
9c3f869fa26d9ab5083718d5d731f8495152b145
36096 F20101219_AACVLL zhang_l_Page_102.QC.jpg
699ca9674299135f3239af76ff894d93
7adb505ffc09f5c2475f5205c785735cbf0e29de
8248 F20101219_AACWPD zhang_l_Page_001.pro
b326b05bf9ddefcf2c5e59f60956ee3d
f2cf4170e18b86420f781fa42e7bd3e337eccef0
6644 F20101219_AACWOO zhang_l_Page_078thm.jpg
9925fff93b3b1fe7bcdd5f32115e63f6
94c013881d5908b30b54eeeef773f00f82f21040
F20101219_AACVKY zhang_l_Page_122.tif
5ba85c31deed5cc8258375358376f6ca
b3d793e73ca3c118d8006c7bf5ae2033731fa33d
13170 F20101219_AACVMB zhang_l_Page_099.QC.jpg
d0aacbea38da10ef3c55222d2f69600a
6295ec3e3a9d4d1a5a79bb91e463be206a5163bf
19648 F20101219_AACVLM zhang_l_Page_095.QC.jpg
c80f6b4a21b283498dd18e1bbcf40927
c937e35c981a35c2bc604db5ec1c80123a3b3508
7209 F20101219_AACWPE zhang_l_Page_116thm.jpg
978feef8b07d69be9850a3e192d7704c
84bc2f71a03b659e18faefe5bbc715b0c9ca2ac4
1659 F20101219_AACWOP zhang_l_Page_139.txt
21c974411e1edd08d85255c901e8d83e
1a2e6ca5d0af743dc816d617df78b30fa83f6a9f
6844 F20101219_AACVKZ zhang_l_Page_004thm.jpg
73498a2d598eb09b25bd5bf91754f46b
56f15ab4e77a28a10e1af0b797453d92a63bd743
1051965 F20101219_AACVMC zhang_l_Page_004.jp2
ef224b666dabbe2f6cba55a7ec203256
58a9abc22a5283fb5d32e74038e35f4897cfb0f4
23193 F20101219_AACVLN zhang_l_Page_142.pro
e82f8bb7ccb635d50c7e2de5cbe17c12
e06899a30eb63703f68345faea5d7a452b4c7ced
78156 F20101219_AACWPF zhang_l_Page_014.jp2
8c7ea4dd73fc6dc77fe49020211c6c5c
3e968fc1cf0351bfeb8e81b8cfe48e1e8df04f85
5968 F20101219_AACWOQ zhang_l_Page_017thm.jpg
0f508f318db6febc40abb6bfa72d888d
943952201d80433360637a2d829db39c7f2db47f
F20101219_AACVMD zhang_l_Page_068.tif
6b08f8d6042676baed3eaa48529b8514
61012d0a590dd00ce6fb22b64d8991e126d197c1
46996 F20101219_AACVLO zhang_l_Page_054.pro
a3b1b84e0d3b0ba2a3c4ec2de6247c00
3d69da62d441bcfa5949c0529d74b45a5ad22ac6
1179 F20101219_AACWPG zhang_l_Page_011.txt
5ff8f51e32767cf93b77e242c12b5aa7
58e729b3b2d9c38041d824d770f8a8bf6b6c2356
F20101219_AACWOR zhang_l_Page_016.txt
936b66b3cae011999d8680dd784f7308
6be2e7ff163c46730811c68e978f2eea1e2d49fa
1633 F20101219_AACVME zhang_l_Page_137.txt
01be26ce883a1bf4818b0462e2502b07
0dd110fb6ca106443af7a6e5c4278527c5db80be
58198 F20101219_AACVLP zhang_l_Page_073.pro
372e878ac7b71139589dc61e94ca5cb8
6d5ba90039fbe59a273fcdccf9836e5527214693
7080 F20101219_AACWPH zhang_l_Page_015thm.jpg
28c7aa9c29d4b367fdca5a0d0356dd6f
d00d6f9651d6e51bed425d867d2d22fec79e5575
7527 F20101219_AACWOS zhang_l_Page_088thm.jpg
3dc7b8507c0a5d61b9176035c77c7930
d6ef12d0bddede56dd7bb4204276e95c6244943d
F20101219_AACVMF zhang_l_Page_034.tif
ec150c1750595ac340d60e8c28fea254
16dc2b7a992fef1b35649faa350c9e9bdf180737
2000 F20101219_AACVLQ zhang_l_Page_018.txt
3ae3e8d8d108023eb7dcf9989cfe606e
d4dae1e203101d6cff56980a98f4441aa543722e
73314 F20101219_AACWOT zhang_l_Page_038.jp2
2c4549f236b84fed210234a8d591f6b7
e23d4bbad1b271e3a59af384477fa30adc2944ce
38434 F20101219_AACVLR zhang_l_Page_049.pro
85aa879135899acc2f1b209bed4f6c77
8a07c67404271d152533d3e02e193a7dc638570d
150388 F20101219_AACWPI zhang_l_Page_010.jpg
01714ee350d1f8bc7b8fa070584c4584
d521bc60c4d87e93614d504b22af99a454b25a9a
F20101219_AACWOU zhang_l_Page_045.tif
ff9e0547f6ccd2bd31e56fc7ebe3737b
aef11e6d87f3ee00fc248c1557cec2db6e1c07b3
816113 F20101219_AACVMG zhang_l_Page_104.jp2
64ae35a33fb4e7df30c685d052f0597b
be7f279621cad5761b0d07a287541e3ad4859374
48722 F20101219_AACVLS zhang_l_Page_047.pro
734c34287ef04151780bef54e96d46f6
4ec4ff6274cbb24fa193ebe07a2c6772a779c06e
33971 F20101219_AACWPJ zhang_l_Page_031.QC.jpg
92e555ce8bc393222576e78f8d7a0ae8
3714c5a27f3e0a3d735a2cd1a8d9b795bfeffb5f
51518 F20101219_AACWOV zhang_l_Page_125.jpg
682a6894a3eed97302aced221bfeefe4
7326dd1ccd0c3de8858af0a7c12871c2dafa78f2
33179 F20101219_AACVMH zhang_l_Page_101.QC.jpg
d98ce758e81da89e9757f70ab717ddf5
04e7ef958229c5fa53858176c2d331351c350ccd
93348 F20101219_AACVLT zhang_l_Page_055.jpg
5b4cad0bf0e6acf58952c01024b1f97d
2796244171ac8a71adeb0caad102dd239eed546f
56532 F20101219_AACWPK zhang_l_Page_103.pro
34416e5426ddc963a8baaa5247bccc0b
eb44064c5f15841ae4494a799131411e490dc0d2
F20101219_AACWOW zhang_l_Page_109.tif
00898f040792e9e47ce34c4d930e331b
b4d982a180429b91ace35b1d7e0f9dd50e0c8155
30623 F20101219_AACVMI zhang_l_Page_137.pro
f2004d8fae07bad2c45754a2d28442f3
27c4e0b2de06931cad081f297d48e0ba566fdcb2
1982 F20101219_AACVLU zhang_l_Page_069.txt
d49d6597e23054e3088334d62f3a3a9e
db7fc35c7b4fe215daf5b345610b50c46d8e86ad
2431 F20101219_AACWQA zhang_l_Page_007.txt
1dc1174dae83aced6bd0fddbab103f72
a2721568d22686bd6ae49c59657115b790378eec
103509 F20101219_AACWPL zhang_l_Page_100.jp2
de5d375d00ca9a0c5e0db8d81ce303c0
4ec9af0b3a0d18166a1f0e9844cc03b28c2b277f
35806 F20101219_AACWOX zhang_l_Page_129.QC.jpg
ff113d364c44ac97951e39b623cab7ce
f87ce41dacb255dbf0d105168edeb46a9c7be32a
2342 F20101219_AACVMJ zhang_l_Page_056.txt
d49c6a568fe0acb01ef3fec011ed9b09
3c908dbc5122012ee95c2789ba411d78bc31014c
907495 F20101219_AACVLV zhang_l_Page_050.jp2
8fb6bfbf6ff34a13ec001fb22e16bfea
4bef662999a82d2db0d0bb4ddb3317cd4435ca21
111576 F20101219_AACWQB zhang_l_Page_083.jpg
1151a4d8e59929adedf3a44a56b7a62b
594ac80beb5a6cf7f72f54a04ee78efbf60de230
F20101219_AACWPM zhang_l_Page_080.jp2
dd5897b5314088e546ac4946b2cafd19
cbc477b19e7fab86e754a75d185ef11d7025b0cd
3348 F20101219_AACWOY zhang_l_Page_008.QC.jpg
9f9b5ac578f1f7bc7a1d3f15e7991f5c
29e87dc0895306bef553bc14518db4eb4923eb86
2234 F20101219_AACVMK zhang_l_Page_059.txt
f9ed620583779037b1ed1fd817c346d2
c2f36c2412a08aec29e57779c24f3fccec939f22
F20101219_AACVLW zhang_l_Page_138.tif
a9077c132535783cc1ff837eb464b550
c43800b7c5a61da02c39feb10e9ddb0f54668e3e
F20101219_AACWQC zhang_l_Page_065.tif
a6f3b9f36e25ede41fa46cec7319bb95
9a5affd1d391555571fb73c29c77e39b569d7edb
F20101219_AACWPN zhang_l_Page_080.tif
dac0d885feff1053634495e21510f257
da49027105fb9d307c550f5490c4442c99465f3a
5231 F20101219_AACWOZ zhang_l_Page_002.jp2
4512292875e6253274acc48979306bcb
814598135e19b0f870527bab9f8ba9b93407b2f3
6353 F20101219_AACVNA zhang_l_Page_104thm.jpg
6ed25226a4f49819ee9c52031bb6149d
c83e172fcf4f69c970c610a3187e95eebe43eead
5750 F20101219_AACVML zhang_l_Page_136thm.jpg
c282ee5676cac18bf065a009a1ed46e5
17921d8c7d17b931a1b2e642f0bbc8fac4ef13f9
F20101219_AACVLX zhang_l_Page_048.tif
a93f8d58edf13b936a3974c9de536aa1
69b196ef3aa7e9c1e220ca9e58724e84a72a313a
70 F20101219_AACWQD zhang_l_Page_003.txt
34530259ac5829f20788cc2b2e786ee6
745fe6fed2d41fcf14beb948a0ac16342dc9ef50
41291 F20101219_AACWPO zhang_l_Page_016.pro
ec428f1310dc45e5870f28be26c89246
7dad601bc75115d31dc41a0b8d7dd46a2e06bf73
5597 F20101219_AACVNB zhang_l_Page_003.jp2
0a749567ba38844ecb2a64100de4a792
b7c8e6f7f8d773b40cf3159e7026ded0c659ead5
119053 F20101219_AACVMM zhang_l_Page_070.jp2
98c1937188d88657b1747517fbdbf281
18ab869103ac6881a8dfe3e95b493f6eb1492981
1953 F20101219_AACVLY zhang_l_Page_088.txt
eff877024e2f9893966308085eefaf85
d9442bf2afde5c98af6e190501074d1c71733481
5006 F20101219_AACWQE zhang_l_Page_141thm.jpg
1d08c2c6cfdc588fdc2c8001373a9e50
5026db1944a83adf9c5454f052dc18353455a24f
45839 F20101219_AACWPP zhang_l_Page_109.pro
2ba63cdba8b8936c86604506a7ee0240
d25190b3db774a6f686d28aee4cc2ab4fda35b5d
29373 F20101219_AACVNC zhang_l_Page_004.QC.jpg
0b76b5b6d73e30511a578b246395bbc9
2e14fed3450a18829a2486519e5e50e43d77eb33
69237 F20101219_AACVMN zhang_l_Page_094.jp2
f0eb0ac8168e12a05b8bf31e59c206e4
cf4ca4247e07729dc47cfec242e99744d4a88a77
1051966 F20101219_AACVLZ zhang_l_Page_057.jp2
6f776c9486cea36bc2629a8b402c000a
174ca34701074f0f72bb295db095c5789256b8ef
7770 F20101219_AACWQF zhang_l_Page_149thm.jpg
364a42f75ee48b95fac719564309a640
86c15d6cf12a292080c17ee60254987b7e44b114
8049 F20101219_AACWPQ zhang_l_Page_112thm.jpg
a43624aa750c91b4df8df1704cc8fb5a
d5906509959bea8ca518c4769cf7c68f3f24fd60
56302 F20101219_AACVND zhang_l_Page_086.pro
7aca0a0f5f3531e69e25c7a7681137b5
63480338f654ef029d342b90fc809adb4e0ab084
55553 F20101219_AACVMO zhang_l_Page_082.pro
1bbf64747944d6c9a761b73752d68aee
a5eaff5dafb7952a49e9a0c864227081416d675b
48014 F20101219_AACWQG zhang_l_Page_127.pro
ee53e8d290e4805a596479d22e463b6a
83ca1c83d91696b631b8b891de66738cee895bd1
120480 F20101219_AACWPR zhang_l_Page_083.jp2
502e8e44c29d12481e155d66402e61d1
01668ea03e04d02630ba12a6197b23a4075ae6eb
26544 F20101219_AACVNE zhang_l_Page_076.QC.jpg
a9387aad0eb580137de6ebdf8b7af82c
490ca172c427711a0604b6536f5f23c6517bfae2
F20101219_AACVMP zhang_l_Page_059.tif
efeab62ba953beb21020f3413ee4e9e4
d626262839a31f7804345914021cc7c9c15079c4
26340 F20101219_AACWQH zhang_l_Page_016.QC.jpg
890e72e801790ad30619dc2f1d296015
144ba8891ea2f1171834f355d79a69202f9e0c38
76700 F20101219_AACWPS zhang_l_Page_136.jp2
cb55ffda414dbdc9dbfc35696810fcfb
5dea7b060de62fc6b891264d2ebe3bccc3811ab4
7090 F20101219_AACVNF zhang_l_Page_043thm.jpg
7f099bc4f0e54525f6f512f7bde44c2b
7c1a686cba6f67fc42df643ef6f72b5960444424
F20101219_AACVMQ zhang_l_Page_050.tif
4a618bef6563f89bdc4c267dec894209
f708f22cfd49f0cfeffdcbae44783b1a8cb47656
F20101219_AACWQI zhang_l_Page_016.tif
58ae9821ed0ad1a0d14028ed83778096
02217d196d75583fe5f13787924402045dff846f
78774 F20101219_AACWPT zhang_l_Page_121.jp2
6d082e9657271d683e0eaf892d8a0cdd
1cf6cacb998793b8f4c0d6aa0320631a5afe8fe6
F20101219_AACVNG zhang_l_Page_145.tif
7a9420f4d966d6b99ec3534bc559cc29
451ee9d59fd10d9edb6886302d8027ba537e1a5e
7827 F20101219_AACVMR zhang_l_Page_023thm.jpg
c7dfd9ff5573a2e4ca7d584483278d75
134836172225184f5e2f88c3bebad7a56f56f3c7
53234 F20101219_AACWPU zhang_l_Page_058.pro
5fab79a8fcd1237797d0240fa575fe10
472881a94aca9e89db90d669c6a388b3e3604fa7
34151 F20101219_AACVMS zhang_l_Page_087.QC.jpg
516fb67be270291a62e7cc9c395ce355
fbd9e2b10325c113d2f159139792a15d63699268
2112 F20101219_AACWQJ zhang_l_Page_058.txt
615c08cc7477d324699182cd52911c6a
2b884269b2fb55c9ae138673334e6242cab2a77f
23987 F20101219_AACWPV zhang_l_Page_143.QC.jpg
2a9ad7ea0454edc70f848cac111a28de
711d2cc32f059513d15a17f0b53cfdf6cf45474c
34183 F20101219_AACVNH zhang_l_Page_070.QC.jpg
ac7cf97b0235d5c71877479abbce6278
62b910762ea8fe55df5cec41eec93401ffa5d886
1051967 F20101219_AACVMT zhang_l_Page_010.jp2
bd29cbd526df947dd3f7b0ca2bb136c2
a5baa6d250f6d88314a7786d71396ad71b442c84
666124 F20101219_AACWQK zhang_l_Page_133.jp2
2c7f3f8d14127a9b1b33ba931af2a65f
137852159ac3513c0e2b7cb1ea46421735aeb69c
F20101219_AACWPW zhang_l_Page_134.tif
3d8a2e1df593d4c62cdcef88803d56ea
7c4eb5949de68f839560dfebd164c7ef37946ec3
103568 F20101219_AACVNI zhang_l_Page_074.jpg
93b939dc7e4d9dc14081bd34d0b68243
9d727b366d9ef2994870c35042dc312295a72898
29750 F20101219_AACVMU zhang_l_Page_148.QC.jpg
2876b292fba0bb23e725b9e8bb6b0c24
79142fc942de8d63d24c88fcc610efd011d95d88
62998 F20101219_AACWRA zhang_l_Page_145.pro
0e00cef9b5089ea95d622cca99e243a8
d225c4d5195f2d3bc7a218ae606daf039312bdb5
F20101219_AACWQL zhang_l_Page_113.tif
b5e7c0cb9c01ec73c350c7e9d4fac4c7
c9c961a5dee53717d62be55b3db184b4f3fe5c62
56583 F20101219_AACWPX zhang_l_Page_112.pro
8991d220dd02124acfe8704c62d604eb
64f6261fd6c307ddeb9d2602d3383fc503290d0e
F20101219_AACVNJ zhang_l_Page_092.tif
6f270e0e767053ab6df1a6997c7a7e82
44547a6f0a5aea5016d7a3928e474eaffc38d103
2257 F20101219_AACVMV zhang_l_Page_070.txt
2c9cf56e9bab84f61400882145b80341
3bf4ac91e6a8dd818a162f559b05d859894d7b2f
35339 F20101219_AACWRB zhang_l_Page_073.QC.jpg
5bbedbd781c475dd520c360548087a31
bc393cb974f6fecfafc88b6b808bafa5306793f1
111512 F20101219_AACWQM zhang_l_Page_062.jpg
2a6de6069365d8f96bad2ef8db05580b
f0bdeb1858cb768b986a536686c79dbd79dd5617
4616 F20101219_AACWPY zhang_l_Page_065thm.jpg
93afbf035e87f7680ae49feedc4e9981
d9c30ed52457b2c5e57ac79b3ec6e5871c5da099
58310 F20101219_AACVNK zhang_l_Page_064.jpg
e3cf9c55d7dd2e10dc8a1f4dd0151655
362a83204df277eab1b768f73c6751f655b996eb
F20101219_AACVMW zhang_l_Page_141.tif
1a60d0f3ca7224d8d992af4d23e7e128
8e2c6424201f017e7fdc75e85349f06e671fec6c
39784 F20101219_AACWRC zhang_l_Page_113.pro
1ea5508f8899b4dafd320a7cf45ae69f
bae5e1bb26d5f9e2eaf080406fb2767128026a56
37767 F20101219_AACWQN zhang_l_Page_076.pro
cbb6b2cf44b78820e6c4049d8b339c3e
6051462c07db5e15a72c296123194d17698d92cc
8164 F20101219_AACWPZ zhang_l_Page_073thm.jpg
b898111db30a66a172c6de944ab638c0
9d3062b4d626b96fc8827050aea415d5a549677b
4368 F20101219_AACVNL zhang_l_Page_140thm.jpg
3d33254caaa5e59b34ec7165dfe407fa
3c71708e12fb4063ca04eedc744d7cc2808b2cb3
73305 F20101219_AACVMX zhang_l_Page_065.jp2
e3ea91e3c11e1d9e7f9c199e32ab702f
99c15ceac83ea3fec9ad9e9ce5aaccd8e560f956
51580 F20101219_AACVOA zhang_l_Page_074.pro
46c427b999d7797868b9ecf8e1257ecb
641e72e4fa2b71dff6c466f6cf079c9f719bb46f
870 F20101219_AACWRD zhang_l_Page_140.txt
f968099ca350041087037813f935f495
5a67c34ad37a198b7c494134298507efba8c3541
32030 F20101219_AACWQO zhang_l_Page_093.jpg
1b9ffcb1dfc1b8f08bd381515c7c160b
8c23bc92eea1f8e10f8637fbf3d06215ae15a86b
5489 F20101219_AACVNM zhang_l_Page_120thm.jpg
7c4331c87fc70aeed6251ebb447a07d5
61115bbdd2ec7fc12d65afa85cb08e2d84942381
32944 F20101219_AACVMY zhang_l_Page_105.pro
4c46c1da5933f3da5ea410726a97a4d0
2f886df1e48c8929cebd2c07c229fe0f54f5e6f1
116062 F20101219_AACVOB zhang_l_Page_148.jp2
a46ae6f4b5b561ce4bb2dc9712f5720e
9db41c134d557c7651f4a3dc6361dfd580d9b285
5478 F20101219_AACWRE zhang_l_Page_121thm.jpg
acf20e7fde3917796fe61613d77c0ae8
70879ba8ae6bcfe029abacc8151db60879132038
110492 F20101219_AACWQP zhang_l_Page_150.jp2
ae2defe2f02420bbfbee1f843c7bd388
c5da47728504f18f5303cc015ac011c3795b770b
11844 F20101219_AACVNN zhang_l_Page_068.QC.jpg
5c7b18769c6371e6e9e4bcd3c0fd9ae1
a961b2e5cddf824031f41fb292028f66d0188f8e
7726 F20101219_AACVMZ zhang_l_Page_147thm.jpg
c8f296ea7170fc51714184cce63df525
3e50d41b1866aac3bdcda96acfc7825bb16b93cd
22353 F20101219_AACVOC zhang_l_Page_096.QC.jpg
ff30e7521f9dd8a9d6097f54c97e6ff8
639bfdac7d9a22ddf4b9496ed6344966a9270e0f
100 F20101219_AACWRF zhang_l_Page_002.txt
593dd15de6b3e708c03fe25f72d64bba
8edb87435054ba26c8a3e25c4e2ef705ee3cb419
89562 F20101219_AACWQQ zhang_l_Page_048.jpg
5e6eb5bf0269a152ea8cc7ed1e9d4b00
21cbbe88ad3d527ee1d715f36331e6fcc6f78a1d
7200 F20101219_AACVNO zhang_l_Page_036thm.jpg
9e464dbecf3a49ae3ede90b14cb05272
6898280fb109618f67d4e37786f2f2656d6e3384
2937 F20101219_AACVOD zhang_l_Page_008.pro
7705e444905d18620920297c89a3e0af
b1c0b2222117ec4abacc6308ac116495df7dfc4b
20984 F20101219_AACWRG zhang_l_Page_141.pro
fd02dd218ae5f1a1100f9b6997ca4438
3c0bd42d07ce7cf7656137e8c63150c07bbeec2d
58983 F20101219_AACWQR zhang_l_Page_128.pro
70624360d94e92e38ee1f942cce1cc64
c3fac446d67bf74b0bd43b4d495b91cfb36b756b
106294 F20101219_AACVNP zhang_l_Page_060.jpg
e3f94f2d27018c693cb815c0db4dacc9
3970400e8491d44b5c968ad1bb568ea78fd0c92f
83971 F20101219_AACVOE zhang_l_Page_143.jpg
8c56b1086cde1776251454219ea9316a
b71c16b791b698c293b7f980b29811ea60097311
2109 F20101219_AACWRH zhang_l_Page_060.txt
fa4f2523a5b09b646af6b5acddd3c933
2c63992607e35732bcb557873f10d436d2b67bfb
35094 F20101219_AACWQS zhang_l_Page_122.pro
74bd644ccf752af2c1e932341a418e04
64d4f429e2bb48dbf0e406532eb2e658907d529b
5792 F20101219_AACVOF zhang_l_Page_040thm.jpg
4d7e7c8389029082245f7d24cdb874ba
f7910d8b7465411cb2075c271d9e42e105ba47fd
52394 F20101219_AACVNQ zhang_l_Page_093.jp2
f18c9c6bb08ae6f0717fae5df13421af
230211826cc69ab98c009a8fb77e7c6777edda76
47328 F20101219_AACWRI zhang_l_Page_140.jp2
27108e65b538c3f3cd689c42adf1bcfd
008d14a2ce854c07dbfaaf6256fbd7abbd739b5b
7548 F20101219_AACWQT zhang_l_Page_060thm.jpg
cdb975155ae396f63ba8aec29048f870
ff525e2722f96099cdfa65762f2398eb093b50f6
100145 F20101219_AACVOG zhang_l_Page_107.jpg
580af73db6fca7caa4dc06b7c2fdb107
a6d63b4f02dc54fd3e0f0db4b530163f321d4cd1
82456 F20101219_AACVNR zhang_l_Page_050.jpg
384ab2544ab1a1aa7c1ff6d43fd94abf
464e0ad2f80fe10cd1a3f7224ba445a09f13fe5f
56623 F20101219_AACWRJ zhang_l_Page_031.pro
ae20474e6c6753bebbc39494b5c07578
7e8a1c0405468c4a1954956ce90525cc841b946f
2151 F20101219_AACWQU zhang_l_Page_044.txt
9c790a7c3511b9904af543eeae6c18af
f16c5c63de7d6ec60791fa120653eeb298aaf5c4
7455 F20101219_AACVOH zhang_l_Page_091thm.jpg
88708b77fe83aa74c40fe0936e031c1d
c0ceb45a33f7fe83d317bd2d62458a5ee1bc6d9b
1689 F20101219_AACVNS zhang_l_Page_038.txt
3f422c92a68b1fb2c335d86047a76c12
9336e494875a35d821ae3b5f654933f22efb0ee4
27839 F20101219_AACWQV zhang_l_Page_048.QC.jpg
5e960c070048b776351842d2c93a088c
a87a8bc5baf16c28910b5cf2b0aef082116a2f1c
112506 F20101219_AACVNT zhang_l_Page_082.jpg
e1a2bc11049316eb7e40b2916b627585
89265a0259575404a4b8c383dea92bcb36102844
4115 F20101219_AACWRK zhang_l_Page_003.jpg
fd51e6b38608bee5ab33907674b6b98a
a150656c1d9dd947b0948806c5685e6f817d87cb
29843 F20101219_AACWQW zhang_l_Page_144.QC.jpg
4543cffe767cbc927127a29138817a82
bcb3dbf501b54d5b8cc765723de921db71fb0044
915483 F20101219_AACVOI zhang_l_Page_135.jp2
b78ddd8c6c97ce6aea89d9ee5f0593d8
afb78bb5d8da31161a92ceed644b50eff6a6520e
1073 F20101219_AACVNU zhang_l_Page_022.txt
593332dfe02398c05fad590f3e6cc18e
63b6afca2864802491cb9646d912c0b0b6efb236
1051974 F20101219_AACWSA zhang_l_Page_058.jp2
65c92741681eb541fe6f10429b7682aa
32aa60693289ffa219170da5cb83d0f8bae432cf
84675 F20101219_AACWRL zhang_l_Page_052.jpg
fa2ecdc079568403ac7cf2f9de8b8a34
d688f5b8a2a58c993ad3b09a9beeafae9951d98e
70098 F20101219_AACWQX zhang_l_Page_037.jpg
3f2fcefe0c7b8294ef597a550bb92308
2ba769856f931762d864a9753d492ecf2ce24ee5
2231 F20101219_AACVOJ zhang_l_Page_062.txt
f9012018e0ab7a3e194cd047cffe70c2
9bb293c9c74301174c669aad5e08dedd2545c9b0
3029 F20101219_AACVNV zhang_l_Page_010.txt
bbb086e18e5425ce0f7094e3dfc4e11d
052da05f0b7501cf20c7ed4734c81f692ee72cd6
8097 F20101219_AACWSB zhang_l_Page_057thm.jpg
0ac1de716d44f38bc39a4c546f455be1
2506315ae52e265d1ff19f163349755d969eb6e3
1725 F20101219_AACWRM zhang_l_Page_113.txt
6e6c9eba745d706dd3e7bdfe38c2e5ef
892327daeba8a3aa40d20a72707ee4da031684ae
1821 F20101219_AACWQY zhang_l_Page_025.txt
eee536d8d0e442342f37356323eaffea
caf2b802181e9ec48bbce10d83dec49764ca6f3f
31592 F20101219_AACVOK zhang_l_Page_149.QC.jpg
8cdeb0aaea73e29481732b0c533ea2e0
ea95586632274844ec08e593401930b4dac7f77e
7483 F20101219_AACVNW zhang_l_Page_030thm.jpg
0889c48e23bbc48bbce2656c550d005a
ac1df9bedb2bbc7542803190118c314408430d15
59651 F20101219_AACWSC zhang_l_Page_045.pro
e609f09e69d5e604cb48d4b6883836e3
a759397f6d409157887cdfb5becb4a5028e9fedc
27891 F20101219_AACWRN zhang_l_Page_050.QC.jpg
7e787d085e050249b21f4f79c6f030be
cea93d67cbd5ea9a5804978b52de4b4f16ee863f
7879 F20101219_AACWQZ zhang_l_Page_045thm.jpg
59e81156a08d92519bcb69bc20fa2022
9668c07da3083aa646a60b0f4f1d2a33d3ddbf13
8147 F20101219_AACVPA zhang_l_Page_090thm.jpg
24037c37835af8d0d09384af72e8fc0d
fd60dbc2bb088a8b1b4646638d84c755fcbb1351
651790 F20101219_AACVOL zhang_l_Page_125.jp2
00272fe30c3f5bfc3a7c89736e24709f
045252018a8003087f67de6c3d7f7e3596bfdf43
20568 F20101219_AACVNX zhang_l_Page_131.QC.jpg
c9528acbc94471b8ad0b40d36ff63f76
5d7e6186f2d70971addb7a26bc356f42186909a2
56036 F20101219_AACWSD zhang_l_Page_150.pro
f73038196ee6a15d075251b0839c03b9
f07d05886aec7fe0ed799907e233b32dce4bc280
30314 F20101219_AACWRO zhang_l_Page_033.QC.jpg
31ada5852192cb64679ed374d6208e08
4944a88f32a4194500f769e53412858f3652d232
F20101219_AACVPB zhang_l_Page_120.QC.jpg
2948d43860f0f1c7589a67a2c1b17a01
a15e7bfc2323f1af9fc666545eb5f0ed2340b854
2055 F20101219_AACVOM zhang_l_Page_118.txt
09da895c07777a8cf47baa168b47931d
59171678e87a23ccbd29cf21d888664f8cbed467
36502 F20101219_AACVNY zhang_l_Page_025.pro
7a14c62cbb2b23bf81080d3854eca3b9
3023f2cfaf8b39d93fbe900a0cdd5300e89a3a86
30396 F20101219_AACWSE zhang_l_Page_111.QC.jpg
fef9584b00953fe9bec47b2982cf054a
58de300c306f6aafe0fd3ce9130951cf4c8fa0f9
973 F20101219_AACWRP zhang_l_Page_068.txt
014565ee5e50432ad516d66be6633cb3
abd293a587e5778cbfbacf50eb9ea08658a7d5b1
70628 F20101219_AACVPC zhang_l_Page_110.jpg
d87491bdbea090d174e3271a748025aa
2ec4089adcf720d2547d1453f1669ef6292b718d
F20101219_AACVON zhang_l_Page_094.tif
4ac967a90319d5e3aa3634287856d3b0
8e09a1fa709c189cae16f24075f15768f5c3e977
112089 F20101219_AACVNZ zhang_l_Page_117.jpg
c4b5882e18f090497bdd62462b96cbd1
a792a3a494679b8796bdc5d628c30dbc7999262f
F20101219_AACWSF zhang_l_Page_014.tif
8678019c07bea5a25238e93a38b03385
76d26fa0b167d83c3ce2af5802bce4e0eb902a78
108395 F20101219_AACWRQ zhang_l_Page_033.jp2
b27ecc12a0a7118ea66a4360f9f9d10b
e851d2f837e7af7578861e12567b0cb0879596a8
100602 F20101219_AACVPD zhang_l_Page_150.jpg
e34f1b5b4501d4a2cdbc5cfdc6d91a6e
ddd42710965dacbb047dc4a5bee55f18a39b4667
999353 F20101219_AACVOO zhang_l_Page_041.jp2
cd9d557e149b9a6b7155f685257e12f4
2fe620e26a74e31fdb79301bd510769ae715b8c6
F20101219_AACWSG zhang_l_Page_077.tif
295ed7a5779ad518ed7513e0c2c6a375
fa1d3fd408d2f94b7aa34e433fdd32f0ca99092d
F20101219_AACWRR zhang_l_Page_133.tif
669f967061048276696b423024a79817
409c08c3a7725ab214164be28b387a58a1a3ca1d
F20101219_AACVPE zhang_l_Page_067.tif
42f834002d28e5f82419d16f2b07881f
dd18935a40e88f8420fecf06957d20eabc0dfe43
21355 F20101219_AACVOP zhang_l_Page_119.QC.jpg
85a9147cdf48fd524371ff4195fb8ffb
aa09a6aa7e783d67db0721f25dc91104e0eb6831
226183 F20101219_AACWSH UFE0015583_00001.xml FULL
fd114e41500a80de7292e5a61d725bf5
ed3cd7c435df3cc8e93f38130ce338fbef67d5ac
BROKEN_LINK
zhang_l_Page_001.tif
F20101219_AACWRS zhang_l_Page_044.tif
d05b7315d4aa0a4e1b870db83aab6177
399da91432a2253b547d638e58c2a7b016b410e0
72887 F20101219_AACVPF zhang_l_Page_136.jpg
48a976299560473576199a09c2923e94
86da92c7d50b5a4136b92bdaf24d244129f69382
F20101219_AACVOQ zhang_l_Page_116.tif
351937712dbce385b056b5edf1cc1ba0
8bdbc072e953c42f9eac521171855e2f62916949
2248 F20101219_AACWRT zhang_l_Page_080.txt
17c3d6f62d9d6c91bf4d611616852378
9ce5199d99525292792235ecba50dedd4fa1ee4c
1728 F20101219_AACVPG zhang_l_Page_105.txt
cc5c934f6fe6fd2cac56238eeeb62b07
4a6412a76ad35aa1339214552072c7c803bf5b27
67486 F20101219_AACVOR zhang_l_Page_095.jpg
bd4d762bd0894e14d4ff0c3ecff99845
bc0765f56c895228da9361e1fa4964c016f05403
3241 F20101219_AACWRU zhang_l_Page_066thm.jpg
0bec5e42a654487151470937bcacc9a6
974267af73067cb9cd1a68f6088ddba8f21f9dd4
F20101219_AACVPH zhang_l_Page_098.tif
f028941bd6b3eca29e74a62b7fa9a1b8
561b04c85b042e761cad5b628b75673178eb4e8a
7043 F20101219_AACVOS zhang_l_Page_019thm.jpg
0884a7a8dcb13a5fcc9845353e0637bd
a992d359e6942522ca8f7a10501b4d8352f28c40
96450 F20101219_AACWSK zhang_l_Page_004.jpg
1c5d56426c3a6161d818c77efbe85851
d428286f5f382fd5a3660eb9498dcb7fc524cffd
78722 F20101219_AACWRV zhang_l_Page_122.jp2
4490f06c4bc4825337b2abd8b62398e8
596c00fc330b08f3f675abcf445aebd05105fb16
354301 F20101219_AACVPI zhang_l_Page_138.jp2
0afdc514ac6cda6028564b62047badd1
494ec27328bc9f775c6c4f2a5e878fd15c5feddf
34627 F20101219_AACVOT zhang_l_Page_007.QC.jpg
a6b158e15b3e770b2acfa504b3687e53
35e50682748dba02bcb30c580af7d4bc81417adf
43030 F20101219_AACWRW zhang_l_Page_099.jpg
884352390b56c2b158422f6710aa59d7
75d7e90766c2559d85e10680ffaae4ef4818daf7
F20101219_AACVOU zhang_l_Page_088.tif
a690a3c12ff37aa4257ea44ec9c74d26
f480b953b78628dafa701e28fb0f84d6f559c6f2
93352 F20101219_AACWTA zhang_l_Page_100.jpg
9aaec4dd4ff6865662203bc5269f1e09
66f4bae2678e93f52fe150cfd9c6d213f8968f0a
57383 F20101219_AACWSL zhang_l_Page_012.jpg
5d7bc557e8ad701ce73dbba214dfe40b
f0bf3095865b4f477ed43869267e5f25c3bec596
F20101219_AACWRX zhang_l_Page_007.jp2
04f5585b84f9cbfde9e9ad4f3ed73df6
6dddf2c0fc57a7a81ae450d86cc8f34b6314c4aa
2243 F20101219_AACVPJ zhang_l_Page_087.txt
5a3835aec36980c6ccb18ae269e5e645
06ca310a3fb6f43023e489dab165388e02a9eedc
42886 F20101219_AACVOV zhang_l_Page_053.pro
95ca9c41e601e4497cf29efa4ae9f9e2
ce7d72df3da6bdc8f86e105bc2137aac65f20113
117036 F20101219_AACWTB zhang_l_Page_102.jpg
962587ab76e4af0485e169ba0c6f3875
498b0c90dae32885b56c7c57b0709138fbf8423b
68047 F20101219_AACWSM zhang_l_Page_017.jpg
15c9f2aa7b35e30a36d427f978e2003f
dca9807dab267d2def61854da6e335c014bf894c
1573 F20101219_AACWRY zhang_l_Page_125.txt
426a91560d295ed5c7bae1afb662d062
cd17d9f00d56c10f5f2ca5f083a082c9dc68efc9
527 F20101219_AACVPK zhang_l_Page_002thm.jpg
a996bcd07d6c1caea2a8426d66458d12
2178a16bb3c278324bf84886d803950e3186ba1f
117526 F20101219_AACVOW zhang_l_Page_071.jp2
5f1a8609b3d68963b1e94329a717d1ac
a93dcffbaa69b2c4dc23145c3982e6abe19fc038
106817 F20101219_AACWTC zhang_l_Page_115.jpg
79414eaff09c53bc869599322cf34076
7019efa6add38263c6c36d3f22d32036037e3117
88055 F20101219_AACWSN zhang_l_Page_018.jpg
9d97796ce3fb20b74465b9e3d502b4ce
1798e68146974f16ac6a0b22a3504536b36240b9
110868 F20101219_AACWRZ zhang_l_Page_031.jpg
1b2bd251d9e3a18ead072e6d96fce428
47c2c91a576124c9c20da8b90f7083f350303883
1051986 F20101219_AACVPL zhang_l_Page_006.jp2
a8de0a1278211c20f12c374db2fbcc16
85116231c258623859ea9f228c1cd3f48d07eb80
F20101219_AACVOX zhang_l_Page_009.tif
94ed3c981b4368f0d8bc3700852c1776
c0057823331c20bd76e882743d97a29cadf02d4a
F20101219_AACVQA zhang_l_Page_143.tif
336a1d079a6d0ab847073894059d8864
0ad3a6a45eef478e70f46372a63ffbe7dfa5758a
97110 F20101219_AACWTD zhang_l_Page_126.jpg
737f434c9e34f02a5065eef9ad8c50b7
de0a99d037a30fc5eeb9bd16747c013e9db52543
79161 F20101219_AACWSO zhang_l_Page_021.jpg
d940cbcc7a3575a261769ee64bd9ae46
2574b43ec78ab2e536739237722ff1599df181e0
34789 F20101219_AACVPM zhang_l_Page_075.QC.jpg
c84c21fcb08408044d3665903c34d294
7d9d12498f1fadfb6ac58d3a02c278f852563b95
30744 F20101219_AACVOY zhang_l_Page_038.pro
1c84fbedc6f4824c8a9da0eff0ddbc23
5914f80ff2fc9cfa5905c5d0ddd240bdedd2aaa3
98287 F20101219_AACVQB zhang_l_Page_088.jpg
82c33134c976798cdc67d71adbdb90ee
10fbea09a980bf37a94ca8159e055b9bc8f9adb2
62503 F20101219_AACWTE zhang_l_Page_131.jpg
4d83ac6a2ac000907b7d1735debe6b74
60adf87d219b4fa1c907d75fb70f8f4b0da49081
103403 F20101219_AACWSP zhang_l_Page_030.jpg
8816578bfcaf1c802b2dd57003c76e29
768ef572cc14549fd6c378c6a7e8ec1ddbaf0cce
676535 F20101219_AACVPN zhang_l_Page_131.jp2
9dfe23faf98abbefc36e23da4ae88673
1b9499ba9c176ee1c36c7f2ced46b2b671d65413
2445 F20101219_AACVOZ zhang_l_Page_145.txt
9eb0aae4a28de39afc06860cbac013ac
17311441de3b4f28f6a784672c9134e796c02f3d
97776 F20101219_AACVQC zhang_l_Page_020.jpg
41ca2fc0023aba6217b50f2dda3aeec4
2d3a3708357de99dfee82104f7a08163a5929879
64055 F20101219_AACWTF zhang_l_Page_133.jpg
1b4318aceaa2a93230d9f1e95cac2984
48d756549f3cdff87d1072422ea8e1bbb2276eff
73524 F20101219_AACWSQ zhang_l_Page_032.jpg
c45bfdc9cfbe2a8d795a89bd65a47390
4ccc40e876aea3dbb909b3c91ba44843c46ca519
F20101219_AACVPO zhang_l_Page_062.tif
b6339e04c08ebb634493833d6da63596
bbc4ede5e426b1aee4dabbe07b24067ce354bae1
95327 F20101219_AACVQD zhang_l_Page_109.jpg
ec6a1ff49e59dfaa1b13ea701103a976
23d0f885e8f956bb5416d798d2e6fd3b7003c77f
88678 F20101219_AACWTG zhang_l_Page_135.jpg
146cf119aaa98673cf110f7a1734fe74
6d984e3358279c99ff1f101ccade42d9d8785fb9
70486 F20101219_AACWSR zhang_l_Page_035.jpg
4c6719192d7f9ec12eea60667f426c64
7f63c02bf231afe2c13ec328464bd92416c7269b
16317 F20101219_AACVPP zhang_l_Page_142.QC.jpg
39bb6b1d13e8534a7470f1a8070fab8d
eb2335266f270bb5c2259da29491e519bd49da88
1998 F20101219_AACVQE zhang_l_Page_116.txt
7026f486cd91953d2375fc8c8156f870
fdc4e9841cdb4126bafcc11d47fe7fe0b65e654f
51943 F20101219_AACWTH zhang_l_Page_142.jpg
e8384a45dfe503aefe68956818d23d5a
0f596e5a08a6dd1fe937ed31246a2c581357ebb1
113666 F20101219_AACWSS zhang_l_Page_059.jpg
c5fd19c7145fced17596ae1934d9620e
2f60c19f1748ae11fc8ec485917ea17f77e0fa5d
2191 F20101219_AACVPQ zhang_l_Page_150.txt
6b24b956fdbd1b4d2317f9943f0456e9
0229098d2e55944c976c3cf3bb42fcefe949d592
5635 F20101219_AACVQF zhang_l_Page_105thm.jpg
2675535856a7a8565ac62c5f87dcfcd7
e8cc96191385e5f77b392cc3ba1d327d73b862d0
111354 F20101219_AACWTI zhang_l_Page_145.jpg
5ce8f720a9179b7443c2c378dd956dc0
d30fcb23c049a1adad539215ea2e725fe42969d5
60192 F20101219_AACWST zhang_l_Page_067.jpg
15aecb0e84333fce464b90dac3258eee
7adf63d8d3e289631418fa9f4e7e8928cb563b29
7765 F20101219_AACVPR zhang_l_Page_083thm.jpg
dbd83328b17382eabc6d9fe5eb133a95
964069f9c5bf521f5660bd48baa35b8178a813ca
F20101219_AACVQG zhang_l_Page_051.tif
0d9344e868188b047cfd73bd9727c420
610a713edeb7e4ee5b15989fdbf87df6ad0cd081
53873 F20101219_AACWTJ zhang_l_Page_151.jpg
a12c86cdd4184ecd6c47c16ad411f017
9ad39a3df725d39e16477cbf567491d7ee907335
82164 F20101219_AACWSU zhang_l_Page_079.jpg
9166042de9952c581d0d2b8eca26c5dc
d54248333fb3a7c7fade903d6be0d147e0679aef
8157 F20101219_AACVPS zhang_l_Page_062thm.jpg
62e65d6f3411bb7d06b50a0c3bd8c895
95881683eca9183c93444b4680a5f6c242bf2ced
F20101219_AACVQH zhang_l_Page_106.txt
859e60abfdb6cb56e234548e7370b12f
0e20eb16fe3fab83aa1b4c89f8b23743e4a59d0d
1051970 F20101219_AACWTK zhang_l_Page_009.jp2
d0ef0740acc05704ec689d7cbdd2b430
f6a3df4f992ca0c94d6c94d2f23ac87847d9b07a
111108 F20101219_AACWSV zhang_l_Page_084.jpg
f3694db7ed00a8f471f1475d8b1bdd1f
6ac382249323b6a24a5829b7e42dbf60bcedc2e1
491 F20101219_AACVPT zhang_l_Page_003thm.jpg
9902da73236cfcabdbf52af3f57134fa
f8222365d6ad0a59cf9d8c3c97706fd39a6a147a
56998 F20101219_AACVQI zhang_l_Page_090.pro
f75f3c1c33a1d80163dbf8deec000890
123395e4493d145bccbb88de64ed1ff55984983f
874609 F20101219_AACWTL zhang_l_Page_021.jp2
27c07ce479851efd5c2e534872d2b7dc
2b1ffe5f066af16b9cf727c675013581d655c14f
109842 F20101219_AACWSW zhang_l_Page_089.jpg
f5d41e537a9727f0f78ac941fa83aab4
086c52f18ac5f25e5c10cd92230a8030540a13fc
F20101219_AACVPU zhang_l_Page_096.tif
b4f29fbf4ea3fbfe4882784ca89f0c14
445ea6e2752ce68b6dbcc1acbe2661e61940c8a4
34616 F20101219_AACVQJ zhang_l_Page_023.QC.jpg
d70596fa69079be6a77fd170e394e841
8f99af9ce2dc30d7828744eb4570f157b9fe9a75
1051940 F20101219_AACWUA zhang_l_Page_082.jp2
11a308c86fb6bbb768577a6470f6a491
9df86d5aa3e70d3b1e946ee3e6c25ea85da79259
109779 F20101219_AACWSX zhang_l_Page_091.jpg
5cf50c10bc63cf28778124e14fbcc9c8
9a2f4c9025d326225350c5a4f7416de0e2ed5944
25044 F20101219_AACVPV zhang_l_Page_025.QC.jpg
acc2a16d467576b07c246a1a8904c56a
a44349bb343393f44715e25c6a927a362cf6e800
1051930 F20101219_AACWUB zhang_l_Page_084.jp2
89b1d8b9441e67bbb236b19336bb166e
64e52825239277e010e1e60066b7e741cff3e76b
1051944 F20101219_AACWTM zhang_l_Page_034.jp2
503a76140d8a3724c402389cd41a5194
6ff1c08a47b2734fe8907448d982217be733cb1e
31382 F20101219_AACWSY zhang_l_Page_092.jpg
af1e76836238770a5bc2ca9e6ca3a815
524c1ef1bc3481a9f6db4c1b2eaf0c35b33b9c1a
40117 F20101219_AACVPW zhang_l_Page_108.pro
3a1fdd809ddb6695b15fdc4f073c5da5
0e355b6e0b28d8891fdb7e1ec93aa40501508e01
1051922 F20101219_AACVQK zhang_l_Page_087.jp2
c62bbe7400c586e54008392c6aedbd8d
b57437d7152708ba53bde84843964b50e3d12d72
1051947 F20101219_AACWUC zhang_l_Page_103.jp2
5b56033b71d8bbc854132b45b59c4f1a
e57e5b9ef927fde889d27e254f8ed9c77a3dee97
F20101219_AACWTN zhang_l_Page_046.jp2
9561e544dff56e0baa2ffe10e084701a
3e32c7ccfddd77821fc8e91a4d4c3b3675ae0d67
52538 F20101219_AACWSZ zhang_l_Page_097.jpg
24eda3df12d8bd016707a33ff9d9aad9
f416fd3f6f0764b603fef65faf78e074d474dbe4
2353 F20101219_AACVPX zhang_l_Page_102.txt
47486e5ecc51ab5fd1abc8cda73d2eca
01bd1405e98bba4f9576135f784f333104026222
31435 F20101219_AACVRA zhang_l_Page_139.pro
1be11232e454c5aabad28d83859ab387
9762cc5ac38d4f4ca41562fa1e97993a46839512
102808 F20101219_AACVQL zhang_l_Page_069.jp2
1849910fe43e024d9747e8bfdb7ae32b
0c350af6abee50febb98f84359483ed883a0a939
763539 F20101219_AACWUD zhang_l_Page_110.jp2
306dbc6c7c4497d89cf7c0b80d3c2c14
3a3988b7b89f3e05543176b1a8bf02619537c3e9
94422 F20101219_AACWTO zhang_l_Page_048.jp2
4bca6f87124828d3f84790d2d7af9127
0a15b6c264c4e5b51d1b4fad0e969bb782016c3b
29407 F20101219_AACVPY zhang_l_Page_063.jp2
8edf2289ee4fcdc9d4b990372211a7a6
f606b28b8f1babeda52155f4e38c43abd36b2b39
59536 F20101219_AACVRB zhang_l_Page_028.pro
bcde7f6a923cc16b46cb8063a56124b3
888bf5a6e1a184c499b0b59b89f7b94c5ff77535
F20101219_AACVQM zhang_l_Page_020.tif
b9292044f1746fcf0ea36caf27134b81
34d6f78e46023752c4f135370cda5359fa97dd66
1051976 F20101219_AACWUE zhang_l_Page_117.jp2
05c4ab79f4088011e329a49f3d07f308
c2986dfa283f0f3be7ad3df89b14d2705318ff01
948636 F20101219_AACWTP zhang_l_Page_053.jp2
e7b190e9e0eae866f33bf11ce1cb3fcb
540dc85645d64b793f14c1c9b3278f15c25068ea
31222 F20101219_AACVPZ zhang_l_Page_017.pro
6e699ad826f85835b51167f2c380a855
bc7566d2b8be422d3a40526725b0ffa7277f8a5f
15982 F20101219_AACVRC zhang_l_Page_151.QC.jpg
3c338fcef819a0d8fe35f618f28991f0
a88d345d9c22371bb703997c2b36006e09263632
104957 F20101219_AACVQN zhang_l_Page_148.jpg
bca19af2bf7923db87240bfcdee32891
ba0b3dbdc0fe12eff2c91c3812fd12f84c5e0db9
F20101219_AACWUF zhang_l_Page_118.jp2
9cfafba1df8f8599c7c635cdb1919f12
b6a95ff6495cd6dda9656fbf955c9d53c89bb3ac
97266 F20101219_AACWTQ zhang_l_Page_054.jp2
588627ad194a56628119a16eb309fdde
6abd4ad4f8be03f95b3efef791160d5bce71912c
33903 F20101219_AACVRD zhang_l_Page_115.QC.jpg
b8a08b8da3aff82c89379e2a1be22ff4
724edfa3f4ac00530f9d8144e87ff81c8364f4d8
117859 F20101219_AACVQO zhang_l_Page_144.jp2
a08b098389fc2a1379999d89356c493e
40ba0ae002f37d5bd6db37f1b1e0dbe067aa3085
59911 F20101219_AACWUG zhang_l_Page_124.jp2
a631f5d734819164c8dd26342df75f77
5e90b823fac461ded4f6fac1cbae93f50b9c3fd2
1006592 F20101219_AACWTR zhang_l_Page_055.jp2
5cf9cf7163daa8cfe35793303b3f1f9e
4031f50dbe6d70e9e4fd5c6ae52c0db57099a558
8087 F20101219_AACVRE zhang_l_Page_046thm.jpg
768f2929b0a3fbfe8c630fc0445f84a0
6506cca2911ea1370aca081d1b327167b0ae3d29
34741 F20101219_AACVQP zhang_l_Page_081.QC.jpg
d6d9392323247a2467c3956c1c4c4197
bf4e02bf2d05131876b885c014818446e7f48dea
979808 F20101219_AACWUH zhang_l_Page_134.jp2
282557742980f588b9ef81142532d03f
a63f181eb4f89e93e14143fc3b8ffd5a77f254f2
1051969 F20101219_AACWTS zhang_l_Page_059.jp2
478ac69f8467fc347b5bb09aee7780dc
f7e5bc6bd71c4fd2b9fd96e884b749e43123eb28
94617 F20101219_AACVRF zhang_l_Page_130.jp2
5127b5e36ed851d7b9d4376b9d6f6671
ca6a674f4436a5c1b6eeafb642fbea0d31711ee0
76634 F20101219_AACVQQ zhang_l_Page_104.jpg
6957f5bc3189e797948130f31de583ac
c9f218d54d0aef742b8e84b99db6b6868e4609a8
777935 F20101219_AACWUI zhang_l_Page_137.jp2
9d4e95ce67e8ad0084261c7dcfe5b1a8
f464327081b5d28507176aeae39a6716b907deb7
1051953 F20101219_AACWTT zhang_l_Page_060.jp2
d73bfff1d903d8945b722f8cf24d2479
4095cc1a81c4ecd79ddef762144c20ce533e3637
1779 F20101219_AACVRG zhang_l_Page_096.txt
3c6adee41275d1c68e974eed55922b0c
2be83a0e6fa79dd5201600d388b51f12ec4526ac
28609 F20101219_AACVQR zhang_l_Page_110.pro
a925c1d144967b74d1b773a6f746acc4
2dfb60ad197c6ebc20103afe4f329fdf00a4e0ca
538307 F20101219_AACWUJ zhang_l_Page_142.jp2
daaea7cfcac5d2a253b2337fcf6bbbaa
c93bea034dab70e90026c66df8ce68f372985fee



PAGE 1

BAYESIAN METHODS IN CASE-CONTROL STUDIES WITH APPLICATIONS IN GENETIC EPIDEMIOLOGY

By
LI ZHANG

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
2006

PAGE 2

Copyright 2006 by Li Zhang

PAGE 3

To my husband, Xin, and my parents.

PAGE 4

ACKNOWLEDGMENTS

First of all, I would like to express my sincere gratitude to both of my advisors Professor Malay Ghosh and Professor Bhramar Mukherjee for their immense help at every stage of my research. I remain grateful for their constant encouragement, and mental support throughout the hardship of my graduate study at the University of Florida. Without their patience, guidance and encouragement, none of this work would have been possible. As mentors, their wisdom, kindness and enthusiasm benefitted me greatly in both my research work and life. Their valuable insights and ideas directly and significantly contributed to the work in this dissertation.

I would also like to give special thanks to Professor Rongling Wu for many fruitful discussions, great help and providing the dataset analyzed in Chapter 3 of this dissertation. I also extend my gratitude to Professor Michael Daniels and Professor Paul Duncan for serving on my committee. I appreciate their constructive suggestions and precious time. I thank all the other professors in our department for their help throughout my graduate study.

I would like to convey my appreciation to Dr. Nilanjan Chatterjee who is a Senior Investigator at the National Cancer Institute for being my mentor during my training fellowship at the National Cancer Institute and for providing us a wonderful dataset which directly motivated the work in Chapter 4 of this dissertation.

I would like to take this opportunity to thank my fellow graduate students in the Department of Statistics at the University of Florida. In particular, I thank Dr. Samiran Sinha currently on the faculty at Texas A&M University for many helpful discussions and for his contribution to the work in Chapter 4 of this dissertation. I thank my friend

PAGE 5

Dr. Yan Gong who is a faculty member at the University of Florida for sharing her expertise in genetics. I thank the College of Liberal Arts and Sciences at the University of Florida for awarding me the Keene Dissertation Fellowship Award, which provided a wonderful opportunity for me to focus solely on my research during the last stage of my dissertation.

Last, but not the least, my sincere thanks go to my family for their endless love, continuous support and encouragement during my life. This work is dedicated to all of them.

PAGE 6

TABLE OF CONTENTS
                                                                      page
ACKNOWLEDGMENTS    iv
LIST OF TABLES    ix
LIST OF FIGURES    xii
ABSTRACT    xiii

CHAPTER

1 OVERVIEW    1
  1.1 Introduction: The Frequentist Development in Case-Control Studies    1
    1.1.1 The Mantel-Haenszel Era    2
    1.1.2 Logistic Regression in Case-Control Studies    5
    1.1.3 Equivalence of Prospective and Retrospective Models in Case-Control Studies    6
    1.1.4 Matched Case-Control Studies    10
  1.2 Bayesian Analysis of Case-Control Studies    12
  1.3 Topics of This Dissertation    16

2 EQUIVALENCE OF POSTERIORS IN THE BAYESIAN ANALYSIS OF THE MULTINOMIAL-POISSON TRANSFORMATION    19
  2.1 Introduction    19
  2.2 A General Result on Posterior Equivalence    20
  2.3 Stratified Case-Control Studies with Missing Exposures    22
  2.4 Discussion    28

3 BAYESIAN MODELING FOR GENETIC ASSOCIATION IN CASE-CONTROL STUDIES: ACCOUNTING FOR UNKNOWN POPULATION SUBSTRUCTURE    29
  3.1 Introduction    29
  3.2 Model and Notation    33
    3.2.1 Statistical Model    33
    3.2.2 Genetic Model    34
    3.2.3 Inference on I for The Model with Admixture    35
  3.3 Likelihood and Priors    37
    3.3.1 Likelihood    37

PAGE 7

    3.3.2 Priors and Posteriors    38
    3.3.3 Computational Details    39
  3.4 Simulation    40
  3.5 Application to A Real Dataset    45
  3.6 Discussion    47

4 SEMIPARAMETRIC BAYESIAN ANALYSIS OF CASE-CONTROL DATA UNDER GENE-ENVIRONMENT INDEPENDENCE AND POPULATION STRATIFICATION    55
  4.1 Introduction    55
  4.2 Model, Likelihood, Priors and Posteriors    59
  4.3 The Israeli Ovarian Cancer Data    68
  4.4 Simulation    73
  4.5 Discussion    76

5 ACCOUNTING FOR ERROR DUE TO MISCLASSIFICATION OF EXPOSURES IN CASE-CONTROL STUDIES OF GENE-ENVIRONMENT INTERACTION    86
  5.1 Introduction    86
  5.2 Unmatched Case-Control Studies of Gene Environment Interaction    89
    5.2.1 Maximum Likelihood Estimation under G-E Independence Assumption    90
    5.2.2 Maximum Likelihood Estimation in The Presence of Misclassification    95
    5.2.3 Case-only Method with Possible Misclassification    99
  5.3 Simulation Studies    101
  5.4 Conclusion    104

6 FUTURE WORK AND CONCLUSION    112

APPENDIX

A APPENDIX TO CHAPTER 3    117

B APPENDIX TO CHAPTER 4    118
  B.1 Proof of Lemmas and Results    118
  B.2 Likelihood for The EDPM Model    120
  B.3 Computational Details    120

C APPENDIX TO CHAPTER 5    123
  C.1 The Constrained ML Equations under G-E Independence and Rare Disease Assumptions in Unmatched Case-Control studies    123
  C.2 Obtain Restriction (5-6)    127
  C.3 Proof of Remark 3    128

PAGE 8

REFERENCES    129
BIOGRAPHICAL SKETCH    138

PAGE 9

LIST OF TABLES

Table                                                                 page

1-1  Case-control data with a binary exposure variable    2

1-2  Series of 2×2 tables for stratified case-control data    3

1-3  Matched case-control data with a binary exposure variable    11

3-1  Allele frequencies for twelve STR loci in the four Argentinean subpopulations.    50

3-2  The results of simulated rare-disease data with marker loci in linkage equilibrium with the candidate gene D6S366. Ratio of the sample sizes of cases to controls is 125/125 and 250/250. X12 and X6 represent that the parameters were estimated by using the twelve and the first six additional marker loci, respectively. X0 is the analysis without using any additional marker loci. Mean and posterior standard deviation refer to the average of the Bayes estimates and posterior standard deviations obtained in 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.    51

3-3  The results of simulated rare-disease data with marker loci in linkage equilibrium with the candidate gene D6S366 which are analyzed by Satten et al. (2001). 125/125 and 250/250 denote ratio of the sample sizes of cases to controls. X12 and X6 represent that the parameters were estimated by using the twelve and the first six of the additional marker loci, respectively. Mean and standard error refer to the average of the estimates and standard errors obtained in 500 replications.    52

3-4  The results of simulated common-disease data with marker loci in linkage equilibrium with the candidate gene D6S366. Ratio of the sample sizes of cases to controls is 125/125 and 250/250. X12 and X6 represent that the parameters were estimated by using the twelve and the first six additional marker loci, respectively. X0 is the analysis without using any additional marker loci. Mean and posterior standard deviation refer to the average of the Bayes estimates and posterior standard deviations obtained in 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.    53

PAGE 10

3-5  The results of real data analysis with the posterior mean (Estimate), posterior standard deviation and 95% highest posterior density (HPD) interval; MLE and confidence interval (CI) for the ordinary logistic regression model.    54

4-1  Analysis of Israeli ovarian cancer data by all five methods, considering OC use as the only environmental exposure, with 95% HPD and confidence intervals    79

4-2  Analysis of Israeli ovarian cancer data by all five methods, considering both OC use and parity as environmental exposures, with 95% HPD and confidence intervals    80

4-3  Simulation scenarios: E is Zero-Inflated; G: rare or common; G-E independence assumption holds (G-E dependence parameter equal to 0) or does not hold (parameter equal to 0.25). Mean denotes the mean estimate based on 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.    81

4-4  Simulation scenarios: E: Mixture of two normals; G: with parametric logistic in terms of S as in (4-8) or commonly prevalent as in (4-4); G-E independence holds (G-E dependence parameter equal to 0) or does not hold (parameter equal to 0.25). Mean denotes the mean estimate based on 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.    82

4-5  Simulation scenarios: E: Mixture of two normals; G: rarely prevalent; G-E independence holds (G-E dependence parameter equal to 0) or does not hold (parameter equal to 0.25). Mean denotes the mean estimate based on 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.    83

5-1  Data for an unmatched case-control study with a binary genetic factor and a binary environmental exposure.    90

5-2  In the absence of misclassification, the MLEs of the odds ratios and their estimated asymptotic variances in terms of observed counts r_dj for both the traditional model and the model under G-E independence and rare disease.    92

5-3  In the presence of misclassification, the MLEs of the true odds ratios in terms of estimated starred expected counts r*_dj for the traditional model (Model 1) and r*_dj^IR for the model under G-E independence and rare disease assumptions (Model 2).    99

5-4  Results of unmatched case-control data (50/750), where specificity for both genetic and environmental factor = 1.0, se_0G = se_1G = 0.95 and se_0E = se_1E = 0.9. P(D=1) ≈ 0.01, P(E=1) ≈ 0.5 and P(G=1) ≈ 0.2    106

5-5  Results of unmatched case-control data (1000/1000), where specificity for both genetic and environmental factor = 1.0, se_0G = se_1G = 0.95 and se_0E = se_1E = 0.9. P(D=1) ≈ 0.01, P(E=1) ≈ 0.5 and P(G=1) ≈ 0.2    107

PAGE 11

5-6  Results of unmatched case-control data (50/750), where specificity for both genetic and environmental factor = 1.0, se_0G = se_1G = 0.9 and se_0E = se_1E = 0.8. P(D=1) ≈ 0.01, P(E=1) ≈ 0.5 and P(G=1) ≈ 0.2    108

5-7  Results of unmatched case-control data (1000/1000), where specificity for both genetic and environmental factor = 1.0, se_0G = se_1G = 0.9 and se_0E = se_1E = 0.8. P(D=1) ≈ 0.01, P(E=1) ≈ 0.5 and P(G=1) ≈ 0.2    109

5-8  Minimum number of cases (case:control ratio = 1) required to detect a 2-fold multiplicative interaction (OR10 = OR01 = 2 and OR11 = 8) with 80% power for different levels of sensitivities and specificities of the environmental and genetic factors, where P(E=1) = 0.5 and P(G=1) = 0.2.    110

5-9  Minimum number of cases (case:control ratio = 1) required to detect a 3-fold multiplicative interaction (OR10 = 1.3, OR01 = 7 and OR11 = 3) with 80% power for different levels of sensitivities and specificities of the environmental and genetic factors, where P(E=1) = 0.2 and P(G=1) = 0.01.    110

PAGE 12

LIST OF FIGURES

Figure                                                                page

4-1  Real data analyzed with EDPM model by considering OC use as an environmental exposure: Histogram of last 5000 MCMC values for the main effects and interaction parameter with overlaid smoothed kernel density.    84

4-2  Details of DPM model by considering OC use as an environmental exposure: Histograms corresponding to the approximate posterior distributions of the DPM model parameters, including K. Also plotted are histograms of variances of the i-indexed parameters (i = 1, ..., 24), calculated for each of the last 5000 MCMC runs.    85

5-1  Minimum number of cases (case:control ratio = 1) required to detect a 2-fold interaction (OR10 = 2, OR01 = 2, and OR11 = 8) with 80% power as a function of the true prevalence of the environmental factor, P(E=1), for the prevalence of the genetic factor being 0.2, and for selected values of sensitivity and specificity of the exposure assessment.    111

PAGE 13

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

BAYESIAN METHODS IN CASE-CONTROL STUDIES WITH APPLICATIONS IN GENETIC EPIDEMIOLOGY

By
Li Zhang
August 2006

Chair: Malay Ghosh
Cochair: Bhramar Mukherjee
Major Department: Statistics

The fundamental idea behind case-control studies is to compare selected persons having a disease (the cases) with those not having the disease (the controls) by assessing to what extent they have been exposed to the disease's possible risk factors. The natural likelihood to use for a case-control study is a "retrospective" likelihood, i.e. a likelihood based on the probability of exposure given disease status. I prove the equivalence of posterior inference for the log odds ratio parameters based on prospective and retrospective likelihoods in stratified case-control studies in which some of the exposure variables could be missing completely at random.

My dissertation also addresses three problems in the domain of genetic epidemiology to explore a variety of disease-gene association and gene-environment interaction. First, I consider the problem of detecting association between a disease and a candidate gene in the presence of population admixture. I propose a two-stage parametric Bayesian approach implemented via Markov chain Monte Carlo (MCMC) numerical integration technique, which first estimates the posterior probability of different unknown population substructures and then integrates this information into

PAGE 14

a disease-gene association model through the technique of Bayesian model averaging. Thus, the uncertainty in estimating the population substructure is taken into account while providing credible intervals for parameters in the disease-gene association model. Second, I present a Bayesian semiparametric approach to model the effect of stratification variables under the assumption of gene-environment independence in the control population conditional on some other covariates to study the gene-environment interaction. I take account of stratum heterogeneity in the exposure distribution by adopting the Dirichlet process mixture (DPM) of normal prior to the distribution of the environmental exposure and a flexible model for the distribution of the genetic factor. I illustrate the methods by applying them to an Israeli ovarian cancer study to investigate the effect of BRCA1/2 mutations, oral contraceptive use and parity in the development of ovarian cancer. Third, I consider analysis of unmatched case-control studies in which binary exposures are potentially misclassified. I describe a relatively simple approach to adjust the estimation of the parameters of interest in gene-environment association studies in the presence of misclassification and by exploiting the G-E independence assumption. Concluding remarks and directions for future work are included at the end.

PAGE 15

CHAPTER 1
OVERVIEW

1.1 Introduction: The Frequentist Development in Case-Control Studies

The goal of an epidemiologic study is to find the causes of a disease and to assess the degree of association between the disease and its potential risk factors. Case-control studies are perhaps the most dominant form of analytical research in epidemiology, especially in cancer epidemiology. The fundamental idea behind such investigations is to compare selected persons having a disease (the cases) with those not having the disease (the controls) by assessing to what extent they have been exposed to the disease's possible risk factors. The ultimate goal often is to evaluate the hypothesis that one or more of the exposure variables is a cause of the disease.

There are several popular study designs to ascertain disease-exposure association. A case-control study is retrospective in the sense that separate random samples from case and control populations are collected first and then exposure information is ascertained for the selected subjects. In such a study design, one collects exposure information conditional on the disease status of the subject. A cohort study, on the other hand, is prospective in nature as an initially healthy cohort is followed over time to assess the disease incidence rate and possible disease-exposure association.

The case-control study design became popular in the 1920's. Initially, there were doubts regarding the validity of using case-control data to extract information on the relative risks of the disease, i.e., the odds of the occurrence of a disease for those exposed relative to those unexposed. Cornfield (1951) demonstrated that the exposure odds ratio for cases versus controls equals the disease odds ratio for exposed versus unexposed, and that the latter in turn approximates the ratio of disease rates, or the relative risk of the disease, provided that the disease is rare. To understand

PAGE 16

this issue in the simplest setting, consider a case-control study with a single binary exposure variable X (X = 1 exposed, and X = 0 unexposed) and let D denote the disease status (D = 1 for cases, D = 0 for controls). Table 1-1 presents the data layout and cell frequencies for each disease-exposure combination.

Table 1-1: Case-control data with a binary exposure variable

  Disease Status   Exposed   Non-Exposed   Total
  Case             n11       n10           n1
  Control          n01       n00           n0
  Total            e1        e0            N

One may note that

\[
\underbrace{\frac{P(X=1 \mid D=1)\,P(X=0 \mid D=0)}{P(X=1 \mid D=0)\,P(X=0 \mid D=1)}}_{\text{exposure odds ratio}}
= \underbrace{\frac{P(D=1 \mid X=1)\,P(D=0 \mid X=0)}{P(D=0 \mid X=1)\,P(D=1 \mid X=0)}}_{\text{disease odds ratio}}
\approx \underbrace{\frac{P(D=1 \mid X=1)}{P(D=1 \mid X=0)}}_{\text{relative risk}}.
\tag{1-1}
\]

The approximation holds for a rare disease, as $P(D=0 \mid X=1) \approx P(D=0 \mid X=0) \approx 1$. So the disease odds ratio, say, $\psi = \exp(\beta)$ where $\beta$ denotes the log-odds ratio parameter, is the same as the exposure odds ratio, which approximates the relative risk of the disease for a rare disease. Therefore, an odds ratio of 1 implies that there is no association between the disease and the exposure, whereas an odds ratio other than 1 implies that exposure is either synergistic or antagonistic with the disease. Also, one estimates $\psi$ by $\hat{\psi} = n_{11} n_{00} / (n_{10} n_{01})$ and $\beta = \log \psi$ by $\hat{\beta} = \log \hat{\psi}$.

1.1.1 The Mantel-Haenszel Era

It is well-known that for a large sample, $\hat{\beta} - \beta$ has an asymptotic normal distribution with mean 0 and variance $1/n_{11} + 1/n_{10} + 1/n_{01} + 1/n_{00}$ (Agresti, 2001). For a small sample size, exact inference is based on a noncentral hypergeometric

PAGE 17

distribution,

\[
\Pr(n_{11} \mid n_1, n_0, e_1, e_0; \psi)
= \frac{\binom{n_1}{n_{11}} \binom{n_0}{e_1 - n_{11}} \psi^{n_{11}}}
       {\sum_{u} \binom{n_1}{u} \binom{n_0}{e_1 - u} \psi^{u}},
\tag{1-2}
\]

which is the conditional distribution of paired binomial data given the marginal totals (the marginal totals are considered as approximately ancillary in the sense that they do not contain any information about the parameter of interest). One can use Fisher's exact test to test $H_0: \psi = \psi_0$ against $H_1: \psi > \psi_0$, by calculating the upper tail probability under the distribution shown in (1-2),

\[
p_u = \sum_{u \ge n_{11}} P(u \mid n_1, n_0, e_1, e_0; \psi_0).
\tag{1-3}
\]

Similarly, to test $H_0$ against $H_1: \psi < \psi_0$ one should calculate the corresponding lower tail probability.

Mantel and Haenszel (1959) proposed an alternative to Fisher's exact test. Assuming a common odds ratio across a series of 2×2 tables, they proposed an estimator for the common odds ratio. Specifically, suppose one has $I$ such tables and the $i$-th table is represented by the data layout in Table 1-2.

Table 1-2: Series of 2×2 tables for stratified case-control data

  Disease Status   Exposed   Non-Exposed   Total
  Case             n11i      n10i          n1i
  Control          n01i      n00i          n0i
  Total            e1i       e0i           Ni

The Mantel-Haenszel (MH) odds ratio estimate is given by

\[
\hat{\psi}_{MH} = \exp(\hat{\beta}_{MH})
= \frac{\sum_i n_{11i} n_{00i} / N_i}{\sum_i n_{01i} n_{10i} / N_i}.
\tag{1-4}
\]
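The quantities in (1-2) to (1-4), together with the large-sample variance of the log odds ratio, can be sketched numerically as follows. This is a minimal illustration using only the Python standard library; the function names and the tuple layout `(n11, n10, n01, n00)` are assumptions made here for illustration and do not come from the dissertation.

```python
from math import comb, exp, log, sqrt


def odds_ratio_wald(n11, n10, n01, n00, z=1.96):
    """Odds ratio estimate and a Wald interval from one 2x2 table.

    Uses the large-sample variance 1/n11 + 1/n10 + 1/n01 + 1/n00
    of the estimated log odds ratio.
    """
    beta_hat = log(n11 * n00 / (n10 * n01))
    se = sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    return exp(beta_hat), (exp(beta_hat - z * se), exp(beta_hat + z * se))


def upper_tail_p(n11, n1, n0, e1, psi0=1.0):
    """Upper-tail probability (1-3) under the noncentral hypergeometric law (1-2)."""
    lo, hi = max(0, e1 - n0), min(n1, e1)  # support of the exposed-case count
    weights = {u: comb(n1, u) * comb(n0, e1 - u) * psi0 ** u
               for u in range(lo, hi + 1)}
    total = sum(weights.values())
    return sum(w for u, w in weights.items() if u >= n11) / total


def mantel_haenszel(tables):
    """MH common odds ratio (1-4); each table is a tuple (n11, n10, n01, n00)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(c * b / (a + b + c + d) for a, b, c, d in tables)
    return num / den


if __name__ == "__main__":
    psi_hat, ci = odds_ratio_wald(10, 5, 4, 12)
    print(psi_hat, ci)
    print(upper_tail_p(10, n1=15, n0=16, e1=14))
    print(mantel_haenszel([(10, 5, 4, 12), (8, 6, 5, 9)]))
```

With `psi0 = 1` the tail probability reduces to the usual one-sided Fisher exact test, and with a single table the MH estimate reduces to the ordinary cross-product ratio.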

PAGE 18

To test for homogeneity of odds ratios across the tables, i.e., $H_0: \psi_1 = \psi_2 = \cdots = \psi_I$, the MH test statistic is

\[
\chi^2 = \frac{\left\{ \left| \sum_i n_{11i} - \sum_i E(n_{11i} \mid \hat{\psi}_{MH}) \right| - \tfrac{1}{2} \right\}^2}
              {\sum_i \mathrm{Var}(n_{11i} \mid \hat{\psi}_{MH})},
\tag{1-5}
\]

which has an approximate $\chi^2$ distribution with $I - 1$ degrees of freedom. Mantel and Haenszel presented no variance formula for their estimator and referred to the work by Cornfield (1956) for calculation of the interval estimates.

Robins, Breslow and Greenland (1986) and Phillips and Holland (1987) independently proposed variance estimators of the MH estimator covering the two different types of asymptotic structure: (1) a small number of tables with large frequencies, and (2) a large number of tables with small frequencies. The main idea is the following. First, $E(R_i) = \psi_i E(S_i)$, where $R_i = n_{11i} n_{00i} / N_i$, $S_i = n_{01i} n_{10i} / N_i$, and $\psi_i$ denotes the true odds ratio in table $i$. Thus $\hat{\psi}_{MH}$ is the solution of the unbiased estimating equation $R - \psi S = 0$, with $R = \sum_i R_i$ and $S = \sum_i S_i$, assuming a common value $\psi$ for $\psi_i$. Second, under paired binomial sampling, the variances of the individual contributions to this estimating equation satisfy

\[
N_i^2 \, \mathrm{Var}(R_i - \psi S_i)
= \tfrac{1}{2} E\left\{ (n_{11i} n_{00i} + \psi\, n_{01i} n_{10i}) \left[ n_{11i} + n_{00i} + \psi (n_{01i} + n_{10i}) \right] \right\}.
\tag{1-6}
\]

Now, with a one-step Taylor expansion,

\[
\hat{\beta}_{MH} = \log \hat{\psi}_{MH}
= \log \psi + \frac{R - \psi S}{E(R)}
+ o_p\left( \frac{\mathrm{Var}(R)}{E^2(R)} + \frac{\mathrm{Var}(S)}{E^2(S)} \right).
\tag{1-7}
\]

The last two equations together yield

\[
\mathrm{Var}(\hat{\beta}_{MH}) \doteq \frac{\mathrm{Var}(R - \psi S)}{E^2(R)}
= \frac{\sum_i \mathrm{Var}(R_i - \psi S_i)}{E^2(R)}.
\tag{1-8}
\]

However, the MH methods concern the effects of a single binary risk factor. One may extend the methods to a single categorical exposure and then to multiple categorical exposures only by considering each factor at a time after stratification

PAGE 19

with respect to levels of the other factors. Continuous exposures cannot be handled in this framework unless one categorizes them.

1.1.2 Logistic Regression in Case-Control Studies

Methods to evaluate simultaneous effects of multiple quantitative risk factors started being developed in the 1960's. Cornfield et al. (1961) noted that if the multivariate distribution of exposure $X$ among persons with and without disease $D$ were normal with separate means but a common covariance matrix, then the probability of developing disease for an individual with values $X = x$ was given by the logistic response curve

\[
\Pr(D=1 \mid X=x) = \frac{\exp(\alpha + \beta^T x)}{1 + \exp(\alpha + \beta^T x)}.
\tag{1-9}
\]

Day and Kerridge (1967) confirmed that logistic regression was efficient in a semiparametric sense. They noted that the full joint likelihood with exposure variables having an arbitrary distribution $p(x)$ can be written as $p(D, X) = \Pr(D \mid X)\, p(x)$, and the two factors in the likelihood could be maximized separately, leading to semiparametric efficiency of the logistic model.

A key feature of the logistic model for case-control studies is that the regression coefficients have a nice risk interpretation (Seigel and Greenhouse 1973) in the following sense:

\[
\frac{\Pr(D=1 \mid X=x_1)\,\Pr(D=0 \mid X=x_0)}{\Pr(D=0 \mid X=x_1)\,\Pr(D=1 \mid X=x_0)}
= \exp\{\beta^T (x_1 - x_0)\}.
\tag{1-10}
\]

Thus $\beta^T (x_1 - x_0)$ represents the log relative risk for a subject with exposure $x_1$ versus one with exposure $x_0$. But the natural likelihood for case-control sampling is the "retrospective" likelihood, and is of the form $p(X \mid D)$ rather than $\Pr(D \mid X)$, which is the form of a "prospective" likelihood obtained from a cohort study. As Mantel and Haenszel (1959) stated in their seminal paper: "a primary goal is to reach the same

PAGE 20

conclusions in a retrospective study as would have been obtained from a prospective study, if one had been done."

Prospective logistic regression analysis is indeed more convenient than fitting retrospective models. In a retrospective formulation, modeling the distribution of the exposure may pose certain challenges, especially when the exposure is high dimensional or a mixture of discrete and continuous variables. But the use of the prospective model in analyzing case-control data needed more theoretical validation, which was provided by Anderson (1972) and Prentice and Pyke (1979). I will discuss this issue in greater detail in the next section.

1.1.3 Equivalence of Prospective and Retrospective Models in Case-Control Studies

As stated in (1-10), the prospective logistic regression model may be used to induce a retrospective model, which also turns out to be of a logistic form (Prentice and Pyke, 1979). Beginning with (1-10) and defining

\[
\alpha = \log\left\{ \frac{\Pr(D=1 \mid x_0)}{\Pr(D=0 \mid x_0)} \right\} - \beta^T x_0
\tag{1-11}
\]

one can recover (1-9). Similarly the odds ratio representation (1-10) allows one to calculate

\[
p(X = x \mid D = d) = \frac{\exp\{\gamma(x) + d\,\beta^T x\}}{\int \exp\{\gamma(x) + d\,\beta^T x\}\, dx}, \qquad d = 0, 1,
\tag{1-12}
\]

where $\gamma = \gamma(x) = \log\{\Pr(X = x \mid D = 0) / \Pr(X = x_0 \mid D = 0)\}$ for all $x$. Furthermore, if $X$ has $K$ distinct values, the integration becomes summation over all $K$ distinct values. The prospective model (1-9) and the retrospective model (1-12) are precisely equivalent provided that $\alpha$ in (1-9) and $\gamma$ in (1-12) are unrestricted.

Anderson (1972) provides a deeper look into the proposition of retrospective data being analyzed by a prospective model. Suppose a discrete exposure variable $X$ takes $K$ distinct values $z_1, \ldots, z_K$. There are $n = n_0 + n_1$ samples with $n_0$ controls and $n_1$

PAGE 21

cases. Let $n_{0k}$ and $n_{1k}$ denote the number of controls and cases observed, respectively, with $X = z_k$. Denote $p_{1k} = 1 - p_{0k} = \Pr(D = 1 \mid X = z_k)$, which is specified by the logistic model (1-9), and the marginal probabilities corresponding to the exposure are given by $q_k = \Pr(X = z_k)$. Assuming the marginal disease probabilities $\Pr(D = d) = \pi_d$ are known and by using $\Pr(X \mid D) = \Pr(D \mid X) \Pr(X) / \Pr(D)$, the case-control likelihood is proportional to

\[
L_1 L_2 = \prod_{d=0}^{1} \prod_{k=1}^{K} p_{dk}^{\,n_{dk}} \; \prod_{k=1}^{K} q_k^{\,n_{+k}},
\tag{1-13}
\]

where $n_{+k} = n_{0k} + n_{1k}$. But the parameters are constrained by fixed marginal probabilities of disease: $\sum_k p_{dk} q_k = \pi_d$, for $d = 0, 1$. Anderson (1972) discovered that estimates and covariance matrix for the coefficients $\beta$ were identical to those of ordinary logistic regression involving maximization of $L_1$ alone.

Prentice and Pyke (1979) extended Anderson's (1972) results on logistic discrimination and generalized the findings of Breslow and Powers (1978) on the equivalence of odds ratio estimators when both prospective and retrospective logistic models are applied to case-control studies. They started from another factorization of the likelihood. Again, let us consider $n_0$ controls and $n_1$ cases but an arbitrary exposure variable $x$. The retrospective likelihood function is

\[
L = \prod_{j:\,\text{cases}} P(x_j \mid D = 1) \prod_{j:\,\text{controls}} P(x_j \mid D = 0).
\tag{1-14}
\]

Denote $S$ as a sampling indicator ($S = 1$, an individual is selected in the case-control sample; $S = 0$, otherwise). Because conditional on disease status, sampling is

PAGE 22

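independent of exposure, by Bayes's theorem,

\[
P(x \mid D = d) = P(x \mid D = d, S = 1)
= \frac{P(D = d \mid x, S = 1)\, P(x \mid S = 1)}{P(D = d \mid S = 1)}
= P(D = d \mid x, S = 1)\, P(x \mid S = 1)\, \frac{n}{n_d}.
\tag{1-15}
\]

As in Mantel (1973), we can obtain

\[
P(D = 1 \mid x, S = 1) = \frac{P(S = 1 \mid D = 1)\, P(D = 1 \mid x)}{\sum_{d=0}^{1} P(S = 1 \mid D = d)\, P(D = d \mid x)},
\tag{1-16}
\]

by the fact that sampling is independent of exposure within cases and controls. This is the conditional probability that an individual is a case, given exposure $x$ and given that the individual was sampled for the study. Since $P(D = 1 \mid S = 1) / P(D = 0 \mid S = 1) = n_1 / n_0$, inserting (1-9) into (1-16), one obtains

\[
P(D = 1 \mid x, S = 1) = \frac{\exp(\alpha^* + \beta^T x)}{1 + \exp(\alpha^* + \beta^T x)},
\tag{1-17}
\]

where $\alpha^* = \alpha + \log\left( \dfrac{n_1 \pi_0}{n_0 \pi_1} \right)$. Now substituting (1-15) into (1-14), one obtains

\[
L \propto L_1 L_2,
\tag{1-18}
\]

where

\[
L_1 = \prod_{d=0}^{1} \prod_{j=1}^{n_d} P(D = d \mid x_{dj}, S = 1), \qquad
L_2 = \prod_{d=0}^{1} \prod_{j=1}^{n_d} P(x_{dj} \mid S = 1).
\tag{1-19}
\]

Note that the parameters $\alpha^*$, $\beta$ and $q(x) = P(x \mid S = 1)$ are restricted by $n_d / n = \int P(x \mid S = 1)\, P(D = d \mid x, S = 1)\, dx$.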
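The step from (1-16) to (1-17) says that case-control sampling only shifts the intercept of the logistic model by the log ratio of the sampling fractions. A small numerical check of that identity is sketched below for a scalar exposure. The function names are illustrative, and the marginal disease probability `pi1` is simply treated as a given number; the sampling fractions are taken proportional to $n_d / \pi_d$, which is all the identity requires.

```python
from math import exp, log


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-t))


def selected_prob(x, alpha, beta, pi1, n1, n0):
    """P(D=1 | x, S=1) computed directly from the Bayes formula (1-16)."""
    # Sampling fractions P(S=1 | D=d) are proportional to n_d / pi_d;
    # the common proportionality constant cancels in the ratio below.
    s1, s0 = n1 / pi1, n0 / (1.0 - pi1)
    p1 = expit(alpha + beta * x)  # prospective model (1-9)
    return s1 * p1 / (s1 * p1 + s0 * (1.0 - p1))


def shifted_logistic(x, alpha, beta, pi1, n1, n0):
    """The same probability through the shifted-intercept form (1-17)."""
    alpha_star = alpha + log(n1 * (1.0 - pi1) / (n0 * pi1))
    return expit(alpha_star + beta * x)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.5):
        a = selected_prob(x, alpha=-4.0, beta=0.7, pi1=0.02, n1=200, n0=400)
        b = shifted_logistic(x, alpha=-4.0, beta=0.7, pi1=0.02, n1=200, n0=400)
        print(x, a, b)
```

The two columns printed for each `x` agree up to floating-point error, while the slope `beta` is untouched, which is why the odds ratio parameters can be read off a prospective fit.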

PAGE 23

Prentice and Pyke (1979) demonstrated that the solution to the unconstrained maximization problem, with $(\hat{\alpha}^*, \hat{\beta})$ from the ordinary logistic regression coefficients based on $L_1$ and $\hat{q}(x) = s/n$, which is assigned to any value of $x$ that is observed with multiplicity $s$ (the sample $X$ distribution), actually satisfied the constraints and thus yielded the desired estimates. They further showed that the estimating equations derived from $L_1$ were unbiased and, using estimating equation theory, confirmed that the usual covariance matrix for $\hat{\beta}$ remained valid under case-control sampling. Because the intercept was a free parameter, it did not matter that the $\pi_d$'s were unknown.

Carroll, Wang and Wang (1995) extended the Prentice and Pyke (1979) results to validate fitting of prospective logistic regression models to case-control data in the presence of measurement error and partial missingness in exposure values. They showed that, in general, using prospectively derived standard errors is at worst asymptotically conservative; in addition, they derived a simple sufficient condition guaranteeing that prospective standard errors are asymptotically correct.

Roeder, Carroll and Lindsay (1996) extended the Prentice and Pyke (1979) results to the case where covariates are measured with error. They proved that the prospective and retrospective models generate the same profile likelihood for the log odds ratio. By using a mixture model, the relationship between the true covariate $X$ and the response $D$ can be modeled appropriately for both complete and reduced data. The likelihood depends on the marginal distribution of $X$ and the measurement error density $[W \mid X, D]$. The latter is modeled parametrically based on the validation sample. The marginal distribution of the true covariate is modeled using a nonparametric mixture distribution.

Seaman and Richardson (2004) presented an alternative proof of equality of the two profile likelihoods in the absence of measurement error, where they applied the multinomial-Poisson (MP) transformation. Furthermore, they proved that a Bayesian

PAGE 24

analysis which uses the prospective likelihood and assumes a uniform prior distribution for the log odds that an individual with baseline exposure is diseased, is exactly equivalent to an analysis that uses the retrospective likelihood and assumes a Dirichlet prior distribution for the exposure probabilities in the control group. This means that Bayesian analysis of case-control studies may, like the classical frequentist analysis, be carried out using a prospective model, thus significantly reducing its complexity.

Seaman and Richardson (2004), like Prentice and Pyke (1979), considered unmatched case-control problems. They left open the question of similar equivalence results in the context of matched case-control problems and also for situations with missing data. In my dissertation I address the problem of extending the equivalence results to stratified case-control studies in which some of the exposure variables could be missing completely at random.

1.1.4 Matched Case-Control Studies

So far I have concentrated on unmatched case-control study designs, but my dissertation will involve some matched case-control settings as well, so I briefly review the matched study design. Matching is often implemented as a design strategy to eliminate effects due to confounding. In a matched case-control study, controls are matched with a case or several cases on the basis of some matching factors (confounding variables) such as age, gender, region, ethnicity etc.

There are two types of matching commonly used. One is frequency matching, in which the number of controls are selected according to the number of cases in broad homogeneous strata defined by the values of matching factors to maintain a specific case:control ratio in each stratum. The other is individual matching, in which controls are selected individually corresponding to each selected case by matching with respect to certain factors.

The simplest situation of matched data arises when one case is matched with one control, and they are categorized on the basis of a binary exposure. Suppose one

PAGE 25

has $m_{11}$, $m_{10}$, $m_{01}$ and $m_{00}$ matched pairs under different levels of $D$ and $X$ as shown in Table 1-3.

Table 1-3: Matched case-control data with a binary exposure variable

  Disease Status   Exposed   Non-Exposed
  Case             m11       m10
  Control          m01       m00

Let $\theta$ be the conditional probability of observing a matched pair with an exposed case and unexposed control given a discordant pair:

\[
\theta = \frac{P(X=1 \mid D=1)\,P(X=0 \mid D=0)}{P(X=1 \mid D=1)\,P(X=0 \mid D=0) + P(X=0 \mid D=1)\,P(X=1 \mid D=0)}
= \frac{\psi}{\psi + 1}.
\tag{1-20}
\]

Note that $m_{10} \mid m_{10} + m_{01} \sim \mathrm{Bin}(m_{10} + m_{01}, \theta)$. So the Mantel-Haenszel estimator of the common odds ratio parameter, the MLE of $\psi$, is $m_{10}/m_{01}$. Note that when $\psi = 1$, $\theta = 1/2$. Hence the test statistic to test $H_0: \psi = 1$ is

\[
\chi^2 = \frac{\left\{ \left| m_{10} - E_{H_0}(m_{10} \mid m_{10} + m_{01}) \right| - \tfrac{1}{2} \right\}^2}
              {\mathrm{Var}_{H_0}(m_{10} \mid m_{10} + m_{01})},
\tag{1-21}
\]

which is known as McNemar's (1947) test. One of the potential problems with this estimator and this test is that it uses only the discordant pairs of observations and discards the information contained in the concordant set.

In the case of 1:M matching, the Mantel-Haenszel estimator of the common odds ratio is

\[
\hat{\psi}_{MH} = \frac{\sum_{r=1}^{M} (M - r + 1)\, m_{1,r-1}}{\sum_{r=1}^{M} r\, m_{0,r}},
\tag{1-22}
\]

where $m_{1,r}$ is the number of matched sets where the case and $r$ controls are exposed; and $m_{0,r}$ is the number of matched sets where the case is unexposed but $r$ controls are exposed. The test statistic for testing $H_0: \psi = 1$ is

\[
\chi^2 = \frac{\left\{ \left| \sum_{r=1}^{M} \left( m_{1,r-1} - \dfrac{r\, t_r}{M+1} \right) \right| - \tfrac{1}{2} \right\}^2}
              {\sum_{r=1}^{M} \dfrac{r\, t_r (M - r + 1)}{(M+1)^2}},
\tag{1-23}
\]

where $t_r = m_{1,r-1} + m_{0,r}$ is the number of matched sets with exactly $r$ exposed subjects.
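The matched-pair quantities above are simple enough to compute by hand, but a short sketch makes the bookkeeping in (1-21) and (1-22) explicit. The function names and the list layout (index `r` holds the count of sets with `r` exposed controls) are assumptions made for this illustration.

```python
def matched_pair_or(m10, m01):
    """Matched-pair odds ratio estimate m10 / m01 (discordant pairs only)."""
    return m10 / m01


def mcnemar_chi2(m10, m01):
    """Continuity-corrected McNemar statistic (1-21)."""
    n = m10 + m01
    # Under H0, m10 given n is Bin(n, 1/2): mean n/2 and variance n/4.
    return (abs(m10 - n / 2) - 0.5) ** 2 / (n / 4)


def mh_one_to_m(m1, m0, M):
    """MH odds ratio (1-22) for 1:M matching.

    m1[r]: number of sets with an exposed case and r exposed controls, r = 0..M.
    m0[r]: number of sets with an unexposed case and r exposed controls, r = 0..M.
    """
    num = sum((M - r + 1) * m1[r - 1] for r in range(1, M + 1))
    den = sum(r * m0[r] for r in range(1, M + 1))
    return num / den


if __name__ == "__main__":
    print(matched_pair_or(25, 10))
    print(mcnemar_chi2(25, 10))
    # With M = 1 the 1:M estimator reduces to m10 / m01.
    print(mh_one_to_m([25, 40], [30, 10], M=1))
```

Note that the concordant counts (`m1[1]` and `m0[0]` in the M = 1 call) do not affect the estimate, which matches the remark that only discordant sets contribute.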

PAGE 26

Let us now focus on logistic regression models in matched case-control studies. In the simplest setting, the data consist of $I$ strata and there are $M_i$ controls matched with a case, for stratum $S_i$, $i = 1, \ldots, I$. As before, one assumes a prospective logistic incidence model for disease

\[
P(D = 1 \mid z, S_i) = \frac{\exp\{\alpha_i + \beta^T (z - z_0)\}}{1 + \exp\{\alpha_i + \beta^T (z - z_0)\}},
\tag{1-24}
\]

where $\alpha_i$'s are stratum specific intercept terms. Without loss of generality, assuming that the first subject in each stratum is a case and the rest of the subjects are controls, conditioning on the sufficient statistics $\sum_{j=1}^{M_i+1} D_{ij}$ for $\alpha_i$, one obtains the conditional likelihood

\[
L_c = \prod_{i=1}^{I} P\left( \{D_{ij}\}_{j=1}^{M_i+1} \,\Big|\, \{z_{ij}\}_{j=1}^{M_i+1},\, S_i,\, \sum_{j=1}^{M_i+1} D_{ij} = 1 \right)
= \prod_{i=1}^{I} \frac{\exp(\beta^T z_{i1})}{\sum_{j=1}^{M_i+1} \exp(\beta^T z_{ij})}.
\tag{1-25}
\]

This method is known as conditional logistic regression (CLR). Breslow (1996) illustrated that unmatched analysis of matched data based on unconditional full likelihood led to biased and inconsistent estimates of the relative risk parameters. The difference between unconditional and conditional analysis depends on the degree of association between the exposure and the matching variables. It is indeed important to acknowledge the matched study design in any model proposed for matched data.

1.2 Bayesian Analysis of Case-Control Studies

Since the methods I propose in my dissertation are mostly based on the Bayesian paradigm, I will now present a brief account of the current state of the art in Bayesian methods for case-control studies. In spite of the vast literature in the frequentist domain, Bayesian methods for analyzing case-control data were first proposed in the 1980's. With the arrival of Markov chain Monte Carlo (MCMC) techniques in the 1990's, it became possible to address more complex and unorthodox data scenarios

PAGE 27

like missingness and measurement error in the context of a case-control study even in a Bayesian framework.

Zelen and Parker (1986), Nurminen and Mutanen (1987), Marshall (1988), and Ashby et al. (1993) developed Bayesian methods for analyzing case-control studies with only a single binary exposure variable. All of them used versions of the following model: Let $\phi$ and $\gamma$ be the probabilities of exposure in control and case populations, respectively. The retrospective likelihood is

\[
l(\phi, \gamma) \propto \phi^{n_{01}} (1 - \phi)^{n_{00}}\, \gamma^{n_{11}} (1 - \gamma)^{n_{10}},
\tag{1-26}
\]

where $n_{01}$ and $n_{00}$ are the number of exposed and unexposed observations in a control population, whereas $n_{11}$ and $n_{10}$ denote the same for a case population. Independent conjugate prior distributions for $\phi$ and $\gamma$ are assumed to be $\mathrm{Beta}(u_1, u_2)$ and $\mathrm{Beta}(v_1, v_2)$ respectively. After reparametrization, one obtains the posterior distribution of the log odds ratio parameter, $\beta = \log\{\gamma (1 - \phi) / [\phi (1 - \gamma)]\}$, as

\[
p(\beta \mid n_{11}, n_{10}, n_{01}, n_{00}) \propto \exp\{(n_{11} + v_1)\beta\}
\int_0^1 \frac{\phi^{\,n_{11} + n_{01} + v_1 + u_1 - 1} (1 - \phi)^{\,n_{10} + n_{00} + v_2 + u_2 - 1}}
              {\{1 - \phi + \phi \exp(\beta)\}^{\,n_{11} + n_{10} + v_1 + v_2}}\, d\phi.
\tag{1-27}
\]

The posterior density of $\beta$ does not exist in closed form, but may be evaluated by numerical integration.

Since interest often lies in the hypothesis $\beta = 0$, Zelen and Parker (1986) recommended calculating the ratio of the two posterior probabilities $p(\beta)/p(0)$ at selected deviates $\beta$. When $\beta$ is set at the posterior mode, a large value of this ratio will indicate concentration of the posterior away from 0 and one would infer disease-exposure association. However, the critical value suggested for this ratio is completely arbitrary. They also provided a normal approximation to the posterior distribution of $\beta$ to avoid

PAGE 28

numerical computation, and discussed the problem of choosing a prior distribution based on some prior data on exposure information in a Bayesian framework.

Nurminen and Mutanen (1987) considered a more general parametrization in terms of the odds ratio $\psi = \exp(\beta)$ which covers risk ratio and risk differences. They provided a complicated exact formula for the cumulative distribution function of this general comparative parameter, which can be related to Fisher's exact test for comparing two proportions in sampling theory. The Bayesian point estimates were considered as posterior median and mode, whereas inference was based on highest posterior density interval for the comparative parameter of interest.

Marshall (1988) provided a closed-form expression for the moments of the posterior distribution of the odds ratio. He mentioned that an approximation to the exact posterior density of the odds ratio parameter can be obtained by power series expansion of the hypergeometric functions involved in the expression for the density, but acknowledged the problem of slow convergence in adopting this method. Instead Marshall used Lindley's (1964) result for the approximate normality of the log odds ratio, which works very well over a wide range of situations. In the absence of exposure information, Marshall recommended using independent priors on the parameters. He suggested that a perception about the value of the odds ratio should guide the choice of prior parameters rather than attempting to exploit the exposure proportions as suggested in Zelen-Parker. Inference again is based on posterior credible intervals.

Muller and Roeder (1997) proposed a semiparametric Bayesian approach to case-control studies having continuous exposures with measurement error. They used a Bayesian nonparametric model for the joint marginal distribution of the true exposure (where available), the surrogate and the measurement error. Their methods are intrinsically designed for continuous exposure. Muller et al. (1999) proposed a hierarchical Bayesian approach for combining the data from a case-control study and a prospective cohort study, and to estimate the absolute risk of the disease. They

PAGE 29

modeled the retrospective distribution of the exposure variable given the disease status, and accounted for parameter heterogeneity across studies by using a hierarchical Bayesian approach.

Diggle, Morris and Wakefield (2000) presented the first Bayesian analysis for individually matched case-control data, in which appropriate nuisance parameters are introduced to represent the separate effect of matching in each matched set to recognize the study design. They considered matched data when exposure of primary interest is defined by the spatial location of an individual relative to a point or line source of pollution.

Seaman and Richardson (2001) extended the binary exposure model of Zelen-Parker to any number of categorical exposures, by simply replacing the binomial likelihoods in (1-26) by a multinomial likelihood, and then adopting an MCMC strategy with respect to a baseline category. They also adapted the Muller-Roeder approach to the setting with categorical exposures and illustrated that under certain specific choices of a discrete Dirichlet prior on the exposure distribution, the Zelen-Parker and Muller-Roeder approaches became approximately equivalent.

Ghosh and Chen (2002) developed general Bayesian inferential techniques for matched case-control problems in the presence of one or more binary exposure variables. Their model was more general than that of Zelen and Parker (1986), and was based on an unconditional likelihood rather than a conditional likelihood, unlike Diggle, Morris and Wakefield (2000). The general Bayesian methodology based on the full likelihood that they proposed worked beyond the logit link. Their procedure included not only the probit and the complementary log links but also some new symmetric as well as skewed links. The propriety of posteriors was proved under a very general class of priors that need not always be proper.


Sinha et al. (2005a) presented a unified semiparametric Bayesian approach to matched case-control studies with missing exposure. They assumed a Dirichlet process prior with a mixing normal distribution on the distribution of the stratum effects on the exposure distribution. The proposed method possessed certain attractive robustness properties under varying degrees of stratum heterogeneity in the exposure distribution.

Sinha et al. (2004) considered matched case-control studies with multiple disease states. They further extended their methods to model multivariate exposure with association and partial missingness (Sinha et al., 2005b). To summarize, they presented an ensemble of methods to handle unorthodox data scenarios in matched case-control studies.

1.3 Topics of This Dissertation

A resurgence of interest has recently been expressed in genetic case-control studies (Risch and Merikangas, 1996; Morton and Collins, 1998; Sullivan et al., 2001) to explore a variety of disease-gene associations and gene-environment interactions. The Bayesian pathways have remained less explored in the case-control context, mainly because of the computing needs for implementing the models.

In genetic case-control studies, accounting for population substructure is a critical issue in a population where admixture of several ancestries has taken place. A systematic difference in ancestry between cases and controls can lead to false discovery of association. In my dissertation, I propose a two-stage parametric Bayesian approach which integrates the model uncertainty into a disease-gene association model through the technique of Bayesian model averaging, where the analysis is not limited to binary genotypes, irrespective of whether or not the disease is rare.

Many human diseases result from the interplay of genetic factors and environmental exposures. One may exploit the gene-environment independence in order to derive more efficient estimation techniques than the traditional logistic regression analysis.


I provide Bayesian nonparametric methods to capture stratification effects on the distribution of environmental exposures under the gene-environment independence assumption in the control population. Also, in a Bayesian paradigm I can effectively use the prior knowledge while modeling the individual genotype frequencies in each stratum and thus relax the stringent logistic assumption. My objective will be not only to estimate the interaction effect parameter, but also to estimate the effects of the genetic factor and environmental exposures.

Measurement error in exposure assessment is one of the major sources of bias in epidemiological studies. When ignored, these errors bias our point and interval estimates of effect, and invalidate p-values of hypothesis tests. Less attention has been given to the influence of misclassification on the assessment of interactions between two or more factors. Based on sensitivity and specificity of the genetic and environmental factors, I describe a relatively simple approach to adjust the estimation of the parameters of interest in gene-environment association studies in the presence of misclassification while exploiting the G-E independence assumption.

The outline of the rest of my dissertation is as follows. In Chapter 2, I present a general result which shows that the posterior inference for the parameters from a multinomial likelihood is exactly equivalent to that from the corresponding Poisson likelihood with an arbitrary proper prior for the parameters of interest and independent uniform priors for the latent parameters. The result is then extended to prove the equivalence of posterior inference for the odds ratio parameter based on prospective and retrospective likelihoods in stratified case-control studies where some of the exposure variables could be missing completely at random.

In Chapter 3, I propose a parametric Bayesian approach to examine the association between a candidate gene and the occurrence of a disease in the presence of population admixture. Two unmatched case-control simulation studies based on an


admixed Argentinean population as described in Sala et al. (1998, 1999) are performed to illustrate the methods and computing scheme. The method is also applied to a real data set coming from a genetic association study on obesity.

In Chapter 4, I provide a novel semiparametric Bayesian approach to model stratification effects under the assumption of gene-environment independence in the control population. I illustrate the methods by applying them to data from a population-based case-control study on ovarian cancer conducted in Israel. Simulation studies are conducted to compare our method with other popular choices. The results reflect that the semiparametric Bayesian model allows incorporation of key scientific evidence in the form of a prior and offers a flexible, robust alternative when standard parametric model assumptions do not hold.

In Chapter 5, I derive analytic formulations to obtain estimates and confidence intervals for the misclassified case in an unmatched case-control set-up, which reduce back to standard analytic forms as the error probabilities reduce to zero. I adapt and extend the work of Rice and Holmans (2003) to the situation when one has a binary genetic risk factor, a binary environmental exposure, and both are potentially subject to misclassification.

Concluding remarks and directions for future work are stated in Chapter 6.


CHAPTER 2
EQUIVALENCE OF POSTERIORS IN THE BAYESIAN ANALYSIS OF THE MULTINOMIAL-POISSON TRANSFORMATION

2.1 Introduction

Baker (1994) presented a general result which showed how maximum likelihood estimation of parameters from a multinomial distribution could be carried out from a corresponding Poisson likelihood by exploiting the multinomial-Poisson relationship. Henceforth, this will be referred to as the multinomial-Poisson (MP) transformation. Baker considered situations where the multinomial probabilities were ratios of functions of parameters to the sum of these functions. The motivation was to simplify the maximum likelihood computation as well as computation of the asymptotic variance-covariance matrix of the maximum likelihood estimate (MLE). Baker's result unified a large number of analyses involving log-linear models, capture-recapture models, proportional hazards models with categorical covariates, generalized Rasch models, voter plurality models, conditional logistic regression and two-stage case-control studies.

Baker's ideas were extended in the context of Bayesian analysis of case-control studies by Seaman and Richardson (2004). The natural likelihood to use for a case-control study is a "retrospective" likelihood, i.e., a likelihood based on the probability of exposure given the disease status. Prentice and Pyke (1979) showed that, when a logistic regression is assumed for the probability of a disease given certain exposures, the maximum likelihood estimators and asymptotic covariance matrix of the log odds ratios obtained from the retrospective likelihood are the same as those obtained from the prospective likelihood, i.e., that based on the probability of a disease given exposures. The objective of Seaman and Richardson (2004) was to verify a result similar to Prentice and Pyke (1979) for the posterior distribution of the log odds ratios in


a Bayesian analysis. They proved that a Bayesian analysis that uses the prospective likelihood, and assumes a uniform prior for the log odds that an individual with baseline exposure is diseased, is equivalent to an analysis that uses the retrospective likelihood and assumes a Dirichlet prior for the exposure probabilities in the control group. Earlier, an approximate equivalence result was indicated by Gustafson et al. (2002). Seaman and Richardson left open the question of similar equivalence for stratified case-control data with missing exposure values.

In Section 2.2 of this chapter, first based on an MP transformation in a Bayesian framework, I prove a general result which shows that the posterior inference for the parameters of a multinomial likelihood is the same as that for the corresponding Poisson likelihood with arbitrary proper priors for the parameters of interest and uniform priors for the latent parameters introduced in the Poisson likelihood. Propriety of posteriors under the assumed priors follows as an immediate consequence. In Section 2.3, I extend the results of Seaman and Richardson (2004) to stratified case-control problems where some of the exposure variables could be missing completely at random. Stratified case-control problems without any missingness can be handled as special cases. The individually matched case-control design is a special case of the stratified case-control design where the matched sets define the strata. Finally, some concluding remarks are made in Section 2.4.

2.2 A General Result on Posterior Equivalence

Let $\{Y_{ij};\ j \in J_i;\ i = 1, 2, \ldots, I\}$ denote a vector of discrete random variables with a realization $\{y_{ij};\ j \in J_i;\ i = 1, 2, \ldots, I\}$. The subscript $i$ indexes levels of a categorical covariate or a cross-classification of categorical covariates, and $J_i$ (indexed by $j$) denotes the set of subjects in level $i$. I assume that the vector $\{Y_{ij};\ j \in J_i\}$ follows a multinomial distribution with parameters $\{g_{ij}(\beta)/G_i(\beta),\ \text{for } j \in J_i\}$, where the $g_{ij}(\beta)$ are some functions of $\beta$, $G_i(\beta) = \sum_{j \in J_i} g_{ij}(\beta)$, and $\beta = (\beta_1, \ldots, \beta_q)^T$. The


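likelihood function is then proportional to
\[
L_M(\beta) = \prod_{i=1}^{I} \prod_{j \in J_i} \left\{ \frac{g_{ij}(\beta)}{G_i(\beta)} \right\}^{y_{ij}}. \tag{2-1}
\]
Let $\phi = (\phi_1, \ldots, \phi_i, \ldots, \phi_I)^T$ indicate a set of parameters. The MP transformation of (2-1) as given by Baker (1994) is the corresponding Poisson likelihood proportional to
\[
L_P(\beta, \phi) = \prod_{i=1}^{I} \prod_{j \in J_i} \{ g_{ij}(\beta) \exp(\phi_i) \}^{y_{ij}} \exp\{ -g_{ij}(\beta) \exp(\phi_i) \}. \tag{2-2}
\]

Theorem 1. Suppose $\sum_{j \in J_i} y_{ij} \ge 1$ for all $i = 1, 2, \ldots, I$. Assume independent improper priors $p(\phi_i) \propto 1$, for $i = 1, \ldots, I$, and a proper prior $p(\beta)$ for $\beta$ which is independent of $\phi$. Then the posterior distribution for $\beta$ derived from $L_M(\beta)$ is equivalent to that generated from $L_P(\beta, \phi)$.

Proof: Let $\lambda_i = \exp(\phi_i)$, $i = 1, \ldots, I$. Then $\lambda_i$ has the prior $p(\lambda_i) \propto \lambda_i^{-1}$. The marginal posterior of $\beta$ from $L_P(\beta, \phi)$ is now given by
\[
\begin{aligned}
\pi(\beta \mid y) &\propto p(\beta) \prod_{i=1}^{I} \int_0^{\infty} \Bigl\{ \lambda_i^{-1} \prod_{j \in J_i} \{ \lambda_i g_{ij}(\beta) \}^{y_{ij}} \exp\{ -\lambda_i g_{ij}(\beta) \} \Bigr\} \, d\lambda_i \\
&= p(\beta) \prod_{i=1}^{I} \Bigl\{ \prod_{j \in J_i} \{ g_{ij}(\beta) \}^{y_{ij}} \int_0^{\infty} \lambda_i^{\sum_{j \in J_i} y_{ij} - 1} \exp\{ -\lambda_i G_i(\beta) \} \, d\lambda_i \Bigr\} \\
&\propto p(\beta) \prod_{i=1}^{I} \prod_{j \in J_i} \left\{ \frac{g_{ij}(\beta)}{G_i(\beta)} \right\}^{y_{ij}} = p(\beta) L_M(\beta),
\end{aligned}
\]
which is the same as the posterior distribution of $\beta$ generated from $L_M(\beta)$.

The following corollary establishes the propriety of the above posterior under very mild conditions.

Corollary 1. If $\sum_{j \in J_i} y_{ij} \ge 1$ for all $i = 1, \ldots, I$, and $p(\beta)$ is proper, then $\pi(\beta \mid y)$ is proper.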


Proof. Let $\mathcal{B}$ denote the support of $\beta$. Then by Theorem 1,
\[
\int_{\beta \in \mathcal{B}} \pi(\beta \mid y) \, d\beta \propto \int_{\beta \in \mathcal{B}} p(\beta) L_M(\beta) \, d\beta.
\]
Propriety of the posterior thus follows as an immediate consequence of the equivalence of the two analyses.

Remark 1. If instead I use independent priors $p(\lambda_i) \propto \lambda_i^{a_i - 1}$ $(a_i > 0)$ for all $i = 1, \ldots, I$, then the assumption $\sum_{j \in J_i} y_{ij} \ge 1$ can be dropped to establish propriety of the resulting posterior for $\beta$. But this posterior will no longer be proportional to $p(\beta) L_M(\beta)$, as $G_i(\beta)$ will then have the power $\sum_{j \in J_i} y_{ij} + a_i$ rather than $\sum_{j \in J_i} y_{ij}$.

Bayesian analogues of all the examples of Baker (1994) can now be handled from this general theorem. For brevity, I omit these examples, and proceed to the next section to show the equivalence of posteriors based on prospective and retrospective likelihoods in stratified case-control studies where some of the exposure variables could be missing completely at random (Little and Rubin, 2002).

2.3 Stratified Case-Control Studies with Missing Exposures

In this section, I prove that a Bayesian analysis of stratified case-control data with missing exposure that uses the prospective likelihood, and assumes a uniform prior for the log odds that an individual with baseline exposure is diseased, is exactly equivalent to an analysis that uses the retrospective likelihood and assumes a uniform prior distribution for the exposure probabilities in the control group. My analysis handles the case when some of the exposure variables are missing completely at random.

Suppose there are $I$ strata where each stratum has $s$ cases and $t$ controls in a stratified case-control study. Let $S_i$ denote the $i$-th stratum. Let $D_{ij} = 1$ or $0$ correspond to the presence or absence of a disease for the $j$th individual in the $i$th stratum, and let $x_{ij}$ denote the vector of discrete exposure variables for the $j$th observed subject in the $i$th stratum. I assume that each $x_{ij}$ can take one of the $K$


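possible values $\{z_1, \ldots, z_K\}$. Suppose now
\[
P(D_{ij} = 1 \mid X_{ij} = z_k, S_i) = \frac{\alpha_i \exp(\beta^T z_k)}{1 + \alpha_i \exp(\beta^T z_k)}, \qquad
P(X_{ij} = z_k \mid D_{ij} = 0, S_i) = \frac{\gamma_{ik}}{\sum_{l=1}^{K} \gamma_{il}}. \tag{2-3}
\]
The probability that individual $j$ in stratum $i$ has exposure value $z_k$, given that the individual is a member of the control population, is proportional to $\gamma_{ik}$. For each exposure value $z_k$, these probabilities are assumed to be the same for all controls in stratum $i$ and do not depend on $j$. Using (2-3), I can obtain the distribution of the exposure in the case population and write the prospective and retrospective models in the following form:
\[
P(D_{ij} = d \mid X_{ij} = z_k, S_i) = \frac{\alpha_i^{d} \exp(d \beta^T z_k)}{\sum_{l=0}^{1} \alpha_i^{l} \exp(l \beta^T z_k)}, \qquad
P(X_{ij} = z_k \mid D_{ij} = d, S_i) = \frac{\gamma_{ik} \exp(d \beta^T z_k)}{\sum_{l=1}^{K} \gamma_{il} \exp(d \beta^T z_l)}, \tag{2-4}
\]
where $d = 0, 1$.

Let $\delta_{ij}$ denote the missingness indicator for the $i$th stratum, indicating missingness with
\[
P(\delta_{ij} = 1 \mid S_i) = 1 - P(\delta_{ij} = 0 \mid S_i) = \eta_i. \tag{2-5}
\]
Let $\eta = (\eta_1, \ldots, \eta_I)^T$. With the missing completely at random assumption, $\eta_i$ does not depend on the parameters $\gamma_{ik}$, $\alpha_i$ or $\beta$. Let $y_{idk} = \sum_{j=1}^{s+t} \{ I[X_{ij} = z_k] \, I[D_{ij} = d] \, I[\delta_{ij} = 1] \}$, $d = 0, 1$; i.e., $y_{i0k}$ and $y_{i1k}$ are the respective numbers of undiseased and diseased subjects having $X = z_k$ in the $i$th stratum, and $I[\cdot]$ denotes the usual indicator function. Now, the prospective likelihood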


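is
\[
L_P = \prod_{i=1}^{I} \prod_{j=1}^{s+t} \bigl[ P(D_{ij} \mid x_{ij}, S_i) \bigr]^{\delta_{ij}}
= \prod_{i=1}^{I} \prod_{d=0}^{1} \prod_{k=1}^{K} \left[ \frac{\alpha_i^{d} \exp(d \beta^T z_k)}{\sum_{l=0}^{1} \alpha_i^{l} \exp(l \beta^T z_k)} \right]^{y_{idk}}, \tag{2-6}
\]
and the retrospective likelihood is
\[
L_R = \prod_{i=1}^{I} \prod_{d=0}^{1} \prod_{k=1}^{K} \left[ \frac{\gamma_{ik} \exp(d \beta^T z_k)}{\sum_{l=1}^{K} \gamma_{il} \exp(d \beta^T z_l)} \right]^{y_{idk}}. \tag{2-7}
\]
I now have the following equivalence theorem.

Theorem 2. Suppose $\sum_{k=1}^{K} y_{i1k} \ge 1$ and $\sum_{k=1}^{K} y_{i0k} \ge 1$, for all $i = 1, \ldots, I$. Assume mutually independent priors for the $\alpha_i$, $\gamma_{ik}$, $\eta$ and $\beta$, where $p(\alpha_i) \propto \alpha_i^{-1}$, $p(\gamma_{ik}) \propto \gamma_{ik}^{-1}$, while $\eta$ and $\beta$ have proper priors $\pi_1(\eta)$ and $\pi_2(\beta)$. Then the posterior distribution of $\beta$ derived from the prospective likelihood is exactly equivalent to that from the retrospective likelihood.

Proof: Suppose that random variables $Y_{idk}$ are independently distributed as $Y_{idk} \sim \text{Poisson}(\mu_{idk})$, where
\[
\log \mu_{idk} = \log \eta_i + \log \gamma_{ik} + d (\log \alpha_i + \beta^T z_k). \tag{2-8}
\]
Then writing $\alpha = (\alpha_1, \ldots, \alpha_I)^T$ and $\gamma = (\gamma_{11}, \ldots, \gamma_{1K}, \ldots, \gamma_{I1}, \ldots, \gamma_{IK})^T$, the joint prior is
\[
\pi(\alpha, \gamma, \eta, \beta) \propto \pi_1(\eta) \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \Bigl( \alpha_i^{-1} \prod_{k=1}^{K} \gamma_{ik}^{-1} \Bigr) \Bigr\}.
\]
The joint posterior is now given by
\[
\begin{aligned}
\pi(\alpha, \gamma, \eta, \beta \mid y)
&\propto \prod_{i=1}^{I} \prod_{d=0}^{1} \prod_{k=1}^{K} \Bigl\{ \frac{\exp(-\mu_{idk}) \, \mu_{idk}^{y_{idk}}}{y_{idk}!} \Bigr\} \pi_1(\eta) \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \Bigl( \alpha_i^{-1} \prod_{k=1}^{K} \gamma_{ik}^{-1} \Bigr) \Bigr\} \\
&= \prod_{i=1}^{I} \prod_{d=0}^{1} \prod_{k=1}^{K} \Bigl\{ \Bigl[ \frac{\exp\{ -\gamma_{ik} \eta_i \alpha_i^{d} \exp(d \beta^T z_k) \} \bigl( \gamma_{ik} \eta_i \alpha_i^{d} \exp(d \beta^T z_k) \bigr)^{y_{idk}}}{y_{idk}!} \Bigr] \Bigr\} \\
&\qquad \times \Bigl\{ \prod_{i=1}^{I} \prod_{k=1}^{K} \gamma_{ik}^{-1} \Bigr\} \pi_1(\eta) \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \alpha_i^{-1} \Bigr\}.
\end{aligned} \tag{2-9}
\]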


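First note that
\[
\int_0^{\infty} \exp\bigl( -\eta_i \gamma_{ik} [1 + \alpha_i \exp(\beta^T z_k)] \bigr) \, \gamma_{ik}^{y_{i0k} + y_{i1k} - 1} \, d\gamma_{ik}
\propto \eta_i^{-(y_{i0k} + y_{i1k})} [1 + \alpha_i \exp(\beta^T z_k)]^{-(y_{i0k} + y_{i1k})}.
\]
Thus the joint posterior of $\alpha$, $\eta$ and $\beta$ is given by
\[
\pi(\alpha, \eta, \beta \mid y) \propto \prod_{i=1}^{I} \prod_{d=0}^{1} \prod_{k=1}^{K} \left[ \frac{\alpha_i^{d} \exp(d \beta^T z_k)}{\sum_{l=0}^{1} \alpha_i^{l} \exp(l \beta^T z_k)} \right]^{y_{idk}} \pi_1(\eta) \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \alpha_i^{-1} \Bigr\}
= L_P \, \pi_1(\eta) \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \alpha_i^{-1} \Bigr\}.
\]
Next, integrating out $\eta$, the joint posterior of $\alpha$ and $\beta$ is
\[
\pi(\alpha, \beta \mid y) \propto L_P \, \pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1}. \tag{2-10}
\]
Let $\theta_{ik} = \gamma_{ik} / \sum_{l=1}^{K} \gamma_{il}$ and $\varphi_i = \sum_{l=1}^{K} \gamma_{il}$; thus $\gamma_{ik} = \varphi_i \theta_{ik}$. The Jacobian of this transformation is
\[
\left| \frac{\partial(\gamma_{i1}, \ldots, \gamma_{iK})}{\partial(\theta_{i1}, \ldots, \theta_{i,K-1}, \varphi_i)} \right| = \varphi_i^{K-1}.
\]
Thus, the prior structure on $\gamma_{ik}$ implies the following prior structure for $\varphi = (\varphi_1, \ldots, \varphi_I)$ and $\theta = (\theta_{11}, \ldots, \theta_{1K}, \ldots, \theta_{I1}, \ldots, \theta_{IK})$:
\[
p(\varphi, \theta) = \prod_{i=1}^{I} \varphi_i^{K-1} \prod_{i=1}^{I} \prod_{k=1}^{K} (\varphi_i \theta_{ik})^{-1} = \prod_{i=1}^{I} \varphi_i^{-1} \prod_{i=1}^{I} \prod_{k=1}^{K} \theta_{ik}^{-1}.
\]
Now, the joint posterior given in (2-9) can be written as
\[
\begin{aligned}
\pi(\alpha, \theta, \varphi, \eta, \beta \mid y)
&\propto \prod_{i=1}^{I} \prod_{k=1}^{K} \Bigl\{ \bigl[ \exp\{ -\varphi_i \theta_{ik} \eta_i \} (\varphi_i \theta_{ik} \eta_i)^{y_{i0k}} \bigr] \Bigr\} \\
&\qquad \times \prod_{i=1}^{I} \prod_{k=1}^{K} \Bigl\{ \bigl[ \exp\{ -\varphi_i \theta_{ik} \eta_i \alpha_i \exp(\beta^T z_k) \} \bigl( \varphi_i \theta_{ik} \eta_i \alpha_i \exp(\beta^T z_k) \bigr)^{y_{i1k}} \bigr] \Bigr\} \\
&\qquad \times \Bigl\{ \prod_{i=1}^{I} \Bigl( \varphi_i^{-1} \prod_{k=1}^{K} \theta_{ik}^{-1} \Bigr) \Bigr\} \pi_1(\eta) \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \alpha_i^{-1} \Bigr\}.
\end{aligned} \tag{2-11}
\]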


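Again, note that
\[
\int_0^{\infty} \exp\Bigl\{ -\alpha_i \varphi_i \eta_i \sum_{k=1}^{K} \theta_{ik} \exp(\beta^T z_k) \Bigr\} \, \alpha_i^{\sum_{k=1}^{K} y_{i1k} - 1} \, d\alpha_i
\propto (\eta_i \varphi_i)^{-\sum_{k=1}^{K} y_{i1k}} \Bigl\{ \sum_{k=1}^{K} \theta_{ik} \exp(\beta^T z_k) \Bigr\}^{-\sum_{k=1}^{K} y_{i1k}}.
\]
Thus,
\[
\begin{aligned}
\pi(\theta, \varphi, \eta, \beta \mid y)
&\propto \prod_{i=1}^{I} \prod_{k=1}^{K} \Bigl\{ \bigl[ \exp\{ -\varphi_i \theta_{ik} \eta_i \} (\varphi_i \theta_{ik} \eta_i)^{y_{i0k}} \bigr] \Bigr\} \\
&\qquad \times \prod_{i=1}^{I} \Bigl\{ \Bigl[ \sum_{k=1}^{K} \theta_{ik} \exp(\beta^T z_k) \Bigr]^{-\sum_{k=1}^{K} y_{i1k}} \prod_{k=1}^{K} \bigl( \theta_{ik} \exp(\beta^T z_k) \bigr)^{y_{i1k}} \Bigr\} \\
&\qquad \times \Bigl\{ \prod_{i=1}^{I} \Bigl( \varphi_i^{-1} \prod_{k=1}^{K} \theta_{ik}^{-1} \Bigr) \Bigr\} \pi_1(\eta) \pi_2(\beta).
\end{aligned}
\]
Integrating with respect to $\varphi$, I have
\[
\pi(\theta, \eta, \beta \mid y) \propto \prod_{i=1}^{I} \Bigl[ \Bigl( \sum_{k=1}^{K} \theta_{ik} \Bigr)^{-\sum_{k=1}^{K} y_{i0k}} \prod_{k=1}^{K} \theta_{ik}^{y_{i0k}} \prod_{k=1}^{K} \Bigl\{ \frac{\theta_{ik} \exp(\beta^T z_k)}{\sum_{l=1}^{K} \theta_{il} \exp(\beta^T z_l)} \Bigr\}^{y_{i1k}} \Bigr] \pi_1(\eta) \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \prod_{k=1}^{K} \theta_{ik}^{-1} \Bigr\}.
\]
Then, integrating with respect to $\eta$, rewriting $L_R$ as given in (2-7) as
\[
L_R = \prod_{i=1}^{I} \Bigl[ \prod_{k=1}^{K} \theta_{ik}^{y_{i0k}} \prod_{k=1}^{K} \Bigl\{ \frac{\theta_{ik} \exp(\beta^T z_k)}{\sum_{l=1}^{K} \theta_{il} \exp(\beta^T z_l)} \Bigr\}^{y_{i1k}} \Bigr],
\]
and noting that $\sum_{k=1}^{K} \theta_{ik} = 1$, I have
\[
\pi(\theta, \beta \mid y) \propto L_R \, \pi_2(\beta) \Bigl\{ \prod_{i=1}^{I} \prod_{k=1}^{K} \theta_{ik}^{-1} \Bigr\}. \tag{2-12}
\]
Since the order of integration of the joint posterior does not matter as long as the posterior is proper, comparing (2-10) and (2-12), it follows that after integrating out the nuisance parameters, $\alpha$ or $\theta$, the posterior for $\beta$ generated from $L_P$ or $L_R$ remains the same.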
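The conclusion of Theorem 2 can be checked numerically in the simplest setting. The sketch below is only an illustration under stated assumptions: a single stratum with no missingness, one binary exposure ($z_1 = 0$, $z_2 = 1$), a scalar $\beta$, and made-up counts `Y`. It integrates the prospective likelihood against $\alpha^{-1}$ and the retrospective likelihood against $\theta_1^{-1}\theta_2^{-1}$ with a plain midpoint rule, and compares the two marginal kernels of $\beta$; the function names are hypothetical.

```python
import math

def midpoint(f, n=4000):
    """Midpoint rule for the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

# Hypothetical data: Y[d][k] = count with disease status d and exposure Z[k].
Z = (0.0, 1.0)
Y = ((3, 2), (1, 4))

def prospective_marginal(beta, y=Y, z=Z):
    """Integral of L_P(alpha, beta) * alpha^{-1} over alpha in (0, inf)."""
    def integrand(u):
        alpha = u / (1.0 - u)                  # maps (0, 1) onto (0, inf)
        log_f = -math.log(alpha)               # improper prior alpha^{-1}
        for k, zk in enumerate(z):
            log_f += y[1][k] * (math.log(alpha) + beta * zk)
            log_f -= (y[0][k] + y[1][k]) * math.log1p(alpha * math.exp(beta * zk))
        return math.exp(log_f) / (1.0 - u) ** 2   # Jacobian d(alpha)/du
    return midpoint(integrand)

def retrospective_marginal(beta, y=Y, z=Z):
    """Integral of L_R(theta, beta) * (theta_1 theta_2)^{-1} over the simplex."""
    def integrand(t):
        theta = (t, 1.0 - t)
        log_f = -math.log(t) - math.log1p(-t)  # improper prior on theta
        for d in (0, 1):
            norm = math.log(sum(th * math.exp(d * beta * zl)
                                for th, zl in zip(theta, z)))
            for k, zk in enumerate(z):
                log_f += y[d][k] * (math.log(theta[k]) + d * beta * zk - norm)
        return math.exp(log_f)
    return midpoint(integrand)

# If the two marginal posteriors of beta coincide, this ratio is constant in beta.
betas = (-1.0, 0.0, 0.7, 2.0)
ratios = [prospective_marginal(b) / retrospective_marginal(b) for b in betas]
```

Under these assumptions the entries of `ratios` should agree across the grid of $\beta$ values, which is what proportionality of the two kernels (and hence equality of the normalized posteriors for any common proper prior $\pi_2(\beta)$) requires.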


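Remark 2. Theorem 2 indicates that the marginal posterior distributions of $\beta$ from either (2-6) or (2-7) are the same. Thus, in the presence of exposures missing completely at random, one may fit either the prospective or the retrospective model to stratified case-control data.

Remark 3. A stratified case-control study without missing exposures is a special case, where $P(\delta_{ij} = 1 \mid S_i) = 1 - P(\delta_{ij} = 0 \mid S_i) = 1$. A $1{:}M$ individually matched case-control study is a special case of a stratified case-control study with $s = 1$ and $t = M$, where $M$ is a positive integer and the strata are defined as the matched sets. Note that we could very well assume that there are $s_i$ cases and $t_i$ controls in each stratum, and the proof will still carry through.

Remark 4. It is interesting to note that the posterior $\pi(\alpha, \eta, \beta \mid y)$ is non-identifiable in $\eta$, in the sense of Dawid (1979), since $\pi(\eta \mid \alpha, \beta, y) = \pi_1(\eta)$, which does not depend on $y$. This, however, does not impede the propriety of the joint posterior as shown in the following theorem. For a general result relating non-identifiability with propriety or impropriety of posteriors, I refer to Ghosh et al. (2000).

The next theorem proves the propriety of the posterior under the assumed model under certain conditions. I need the notation $n_{dk} = \sum_{i=1}^{I} y_{idk}$, $d = 0, 1$.

Theorem 3. Assume (i) $\sum_{k=1}^{K} y_{idk} \ge 1$ and (ii) $E[\exp\{ (2d - 1) \beta^T \sum_{k=1}^{K} z_k n_{dk} \}] < \infty$ for $d = 0$ and $d = 1$. Here $E$ denotes expectation with respect to the prior distribution on $\beta$, namely, $\pi_2(\beta)$. Then $\pi(\alpha, \gamma, \eta, \beta \mid y)$ is proper.

Proof. It suffices to show that $\pi(\alpha, \beta \mid y)$ is proper. This amounts to showing $\int \prod_{i=1}^{I} I_i \, \pi_2(\beta) \, d\beta < \infty$, where
\[
I_i = \int_0^{\infty} \alpha_i^{-1} \prod_{k=1}^{K} \prod_{d=0}^{1} \Bigl\{ \frac{\alpha_i^{d} \exp(d \beta^T z_k)}{\sum_{l=0}^{1} \alpha_i^{l} \exp(l \beta^T z_k)} \Bigr\}^{y_{idk}} \, d\alpha_i, \qquad i = 1, \ldots, I.
\]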


Let $\omega_i = \log \alpha_i$, $i = 1, \ldots, I$. Then
\[
I_i = \int_{-\infty}^{\infty} \prod_{k=1}^{K} \Bigl\{ \frac{1}{1 + \exp(\omega_i + \beta^T z_k)} \Bigr\}^{y_{i0k}} \Bigl\{ \frac{\exp(\omega_i + \beta^T z_k)}{1 + \exp(\omega_i + \beta^T z_k)} \Bigr\}^{y_{i1k}} \, d\omega_i,
\]
and it then follows that
\[
\prod_{i=1}^{I} I_i < 2^{I-1} \Bigl\{ \exp\Bigl( -\beta^T \sum_{i=1}^{I} \sum_{k=1}^{K} z_k y_{i0k} \Bigr) + \exp\Bigl( \beta^T \sum_{i=1}^{I} \sum_{k=1}^{K} z_k y_{i1k} \Bigr) \Bigr\}
= 2^{I-1} \Bigl\{ \exp\Bigl( -\beta^T \sum_{k=1}^{K} z_k n_{0k} \Bigr) + \exp\Bigl( \beta^T \sum_{k=1}^{K} z_k n_{1k} \Bigr) \Bigr\}.
\]
The proof is now completed by assumption (ii), which essentially requires the finiteness of the moment generating function corresponding to the prior distribution $\pi_2(\beta)$.

2.4 Discussion

As is well known, the MP transformation can simplify maximization of a multinomial likelihood by considering a Poisson likelihood with additional parameters. Introducing some specific priors for the latent parameters, and arbitrary priors for the parameters of interest in the Poisson likelihood, I show that the marginal posterior distribution of the parameters of interest is exactly equivalent to that generated from the multinomial likelihood. However, the MP transformation requires categorical covariates. If some of them are continuous, the current practice is either to discretize them or to follow the Bayesian bootstrap as proposed in Gustafson et al. (2002). An important open question is the extension of the present results to continuous exposures.
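The multinomial-Poisson equivalence summarized above (Theorem 1) can also be illustrated numerically. The following sketch is an assumption-laden toy example, not part of the original analysis: it takes a single level $i$ with three cells, a hypothetical choice $g_j(\beta) = \exp(\beta x_j)$, and made-up counts `Y`. It integrates the Poisson likelihood against the prior $\lambda^{-1}$ and divides by the multinomial kernel; by the gamma integral in the proof, the ratio should equal $\Gamma(\sum_j y_j)$ for every $\beta$.

```python
import math

def midpoint(f, n=20000):
    """Midpoint rule for the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

X = (0.0, 1.0, 2.0)   # hypothetical covariate value attached to each cell j
Y = (2, 1, 3)         # hypothetical cell counts y_j (sum is 6)

def g(beta):
    """Assumed functional form g_j(beta) = exp(beta * x_j)."""
    return [math.exp(beta * x) for x in X]

def multinomial_kernel(beta):
    """L_M(beta) = prod_j {g_j / G}^{y_j} for one level."""
    gs = g(beta)
    total = sum(gs)
    return math.prod((gj / total) ** yj for gj, yj in zip(gs, Y))

def poisson_marginal(beta):
    """Integral over lambda of lambda^{-1} * prod_j (lambda g_j)^{y_j} exp(-lambda g_j)."""
    gs = g(beta)
    def integrand(u):
        lam = u / (1.0 - u)                 # maps (0, 1) onto (0, inf)
        log_f = -math.log(lam)              # prior p(lambda) proportional to 1/lambda
        for gj, yj in zip(gs, Y):
            log_f += yj * math.log(lam * gj) - lam * gj
        return math.exp(log_f) / (1.0 - u) ** 2   # Jacobian d(lambda)/du
    return midpoint(integrand)

# The ratio should not depend on beta; here it should be Gamma(6) = 120.
ratios = [poisson_marginal(b) / multinomial_kernel(b) for b in (-1.0, 0.0, 1.5)]
```

Because the ratio is free of $\beta$, multiplying either kernel by the same proper prior $p(\beta)$ and normalizing gives the same posterior, which is the content of Theorem 1 in this toy case.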


CHAPTER 3
BAYESIAN MODELING FOR GENETIC ASSOCIATION IN CASE-CONTROL STUDIES: ACCOUNTING FOR UNKNOWN POPULATION SUBSTRUCTURE

3.1 Introduction

The evaluation of the association between molecular markers and disease status can be used to study the genetic basis of common human diseases (Risch and Merikangas, 1996; Morton and Collins, 1998; Sullivan et al., 2001). The basic principle for such so-called association studies arises from the dependence of allele frequencies at marker loci upon those of disease variants, that is, the linkage disequilibria between alleles from different genetic loci. A significant association detected between a marker and the disease can be considered as evidence for close physical linkage between the marker and a disease locus, given that the linkage disequilibrium between any two genes always decays exponentially with their genetic distance in a random mating idealized population (Lynch and Walsh, 1998).

In practice, however, there rarely exists an idealized population, as a result of the action of various evolutionary forces (Lynch and Walsh, 1998). Evolutionary forces, such as population structure and population admixture, operating on a population can result in spurious associations between a phenotype and markers that are not linked to any causative loci. The presence of spurious association means that a detected statistical association does not necessarily imply physical linkage: the disease phenotype may be associated with arbitrary markers that have no physical linkage to causative loci (Lander and Schork, 1994). A classic example of spurious association caused by population substructure is presented in Knowler et al. (1988). In this study, based on a sample of Native Americans of the Pima and Papago tribes, a very strong negative association between the Gm haplotype Gm3;5,13,14 and type 2 or


non-insulin-dependent diabetes mellitus was detected. One might conclude from this observation that the absence of this haplotype, or the presence of a closely linked gene, is a causal risk factor for the disease. However, Gm3;5,13,14 is a marker for Caucasian admixture, and it is most likely that the presence of Caucasian alleles and the decrease in Indian alleles led to lower susceptibility to type 2 diabetes, rather than the direct action of the haplotype or of a closely linked locus. This study demonstrates the effects of confounding due to population substructure, and the importance of considering genetic admixture while investigating the association between a disease and genetic markers.

In order to overcome the problem of spurious associations, many different genetic strategies have been proposed. Spielman et al. (1993) used the transmission disequilibrium test (TDT) to measure the association between a candidate gene and disease status by incorporating the genotypes of parents of affected individuals. This test has been instrumental in genetic association studies of human diseases (Spielman and Ewens, 1998), but it is often limited because of difficulties with DNA sampling. For this reason, a simple case-control design that uses affected individuals and unrelated controls has recently received increased attention (Freedman et al., 2004; Marchini et al., 2004). A number of approaches have been developed to avoid the generation of spurious associations in case-control studies of disease-gene association. For a comprehensive recent review of admixture mapping for complex traits, see McKeigue (2005).

Pritchard and colleagues used multilocus genotype data to estimate population substructure. They proposed a model-based clustering method to identify the population structure by genotyping samples at additional unlinked markers (Pritchard et al., 2000a). This method was then extended to allow for the linkage between different markers (Falush et al., 2003). A software package, STRUCTURE, has been


written to implement their algorithms that consider both linked and unlinked markers. Pritchard et al. (2000b) proposed a two-stage procedure in which first the population structure is inferred by employing the method of Pritchard et al. (2000a), and then the tests of association within subpopulations are conducted conditional on the imputed substructure. However, this method does not develop a model for the probability of disease incidence and cannot be generalized easily to provide estimates of the odds ratio corresponding to the genetic risk factor. Hoggart et al. (2003, 2004) developed a combination of Bayesian and classical approaches for association studies based on the admixture between populations with different ancestries. Apart from STRUCTURE, two other software packages that employ Bayesian ideas for statistical modeling of genetic data from admixed populations are ADMIXMAP (Hoggart et al., 2003, 2004) and ANCESTRYMAP (Patterson et al., 2004).

Different from the above treatments, Satten et al. (2001) provided a novel latent-class analysis to study the association between the disease and the candidate genes based on a series of additional markers that are in linkage equilibrium with each other and with the candidate genes within subpopulations. Based on the Akaike information criterion (AIC), their method can estimate the number of subpopulations. But by either assuming the disease to be rare, or collapsing multiple genotypes into various binary genotypes, their method has not fully capitalized on the information about the multiple-genotype inheritance of the candidate gene.

In this chapter, I provide an alternative parametric Bayesian model for inferring on disease-gene association after accounting for population substructure. As in Satten et al. (2001), I use the latent-class approach to estimate the association parameters, while I account for the population substructure in a way similar to that of Pritchard et al. (2000a). However, unlike Satten et al. (2001), our analysis does not require the rare disease assumption or analyzing multicategory genotypes by several analyses using various possible binary genotypes of the candidate gene. Our model


can also handle multi-allelic genotypes of the candidate genes, extending earlier approaches for the genotypic analysis of only biallelic loci. The computational strategy followed in Satten et al. (2001) involved use of the EM algorithm to estimate the parameters in the model, combined with a parametric bootstrap strategy to obtain standard error estimates. The MCMC strategy designed in this chapter reduces the computational complexity, with posterior standard deviation estimates and credible intervals being obtained from the random observations generated from the full conditional distributions of the parameters.

I should emphasize that in our Bayesian analysis, inference on the disease-gene association is not carried out on the basis of the particular imputed structure as done in Pritchard et al. (2000a). Instead, through use of model averaging (see, for example, Madigan and Raftery, 1994), the association parameters are estimated by incorporating the uncertainty in estimating the substructure. In particular, instead of assuming the number of subpopulations $I$ to be fixed, I put a prior on $I$ and obtain the posterior distribution of $I$. For each possible value of $I$ with positive posterior probability, I then estimate the association parameters in the disease-gene risk model. Finally, I take the weighted average of these estimates, the weights being proportional to the posterior probabilities of the different values of $I$. The explicit model averaging formulas are given in Section 3.3.2. Our analysis thus combines the substructure estimation ideas of Pritchard et al. (2000a) using Bayesian clustering, and the latent class disease risk models of Satten et al. (2001) posed in a purely frequentist framework, through a more general unified Bayesian approach. This chapter presents a novel two-stage model with a clustering algorithm for inferring on cryptic population structure, followed by a logistic model for disease incidence, tied together through the technique of Bayesian model averaging.

The outline of the chapter is as follows. Section 3.2 states both the statistical model and the genetic model, and briefly introduces the methods in Pritchard et al.


(2000a) to estimate the number of subpopulations. Section 3.3 derives the underlying likelihood. I also introduce in this section the appropriate priors for the model parameters and obtain their estimates based on the posteriors. The posteriors are analytically intractable, so the Bayesian procedure is implemented by the MCMC numerical integration technique. In Section 3.4, I state the simulation strategy and provide results on simulated case-control studies under both a rare disease and a common disease assumption. The simulation studies are conducted in the same setting as in Satten et al. (2001) and mimic an admixed Argentinean population as described in Sala et al. (1998, 1999). Under the rare disease assumption, I compare our results with those obtained in Satten et al. (2001). In Section 3.5, I apply our methods to real data collected in a genetic association study with obesity as the disease outcome and the $\beta_2$-adrenergic receptor ($\beta_2$AR) as the candidate gene under investigation. Some concluding remarks are made in Section 3.6.

3.2 Model and Notation

3.2.1 Statistical Model

Let the binary variable $D$ denote disease and let $G$ be a possibly vector-valued genetic risk factor. I assume that the overall population of size $N$ is comprised of $I$ subpopulations, each having different frequencies of $G$ and $D$. By the unmeasured covariate $Z$, I indicate the subpopulation to which an individual belongs. Thus, $D_j = 1$ or $0$ corresponds to the presence or absence of a disease for the $j$th individual with a genetic risk factor $G_j$, $j = 1, \ldots, N$. I assume $G_j$ to be a univariate discrete random variable, taking $M + 1$ values $g_0 = 0, g_1, \ldots, g_M$. I assume that the prospective conditional logistic distribution for the disease status is
\[
\Pr(D_j = 1 \mid G_j = g_m, Z = i) = H\{ \beta_{0i} + \beta_{1m} \}, \qquad m = 0, \ldots, M, \tag{3-1}
\]


where $H(u) = \{ 1 + \exp(-u) \}^{-1}$. Here $\beta_{0i}$ is a term representing the subpopulation effect on the probability of disease for individuals belonging to a particular subpopulation $i$, and $\beta_{1m}$ is the coefficient corresponding to the genetic exposure variable in the above logistic regression model. For parameter identifiability, I set $\beta_{10} = 0$. The method can immediately be extended to a vector-valued genetic risk factor $G_j$ for individual $j$.

3.2.2 Genetic Model

Since different subpopulations may have different frequencies of other marker genes, I use a latent-class approach to infer about the population substructure by using information on those additional marker loci. Consider $x_{cl}$ as the allele at marker $l$ on chromosome $c = 1, 2$ (the labeling of the two chromosomes in a given pair as 1 or 2 is arbitrary), and let $X = (x_{11}, x_{21}, \ldots, x_{1L}, x_{2L})$, where $L$ is the number of marker loci under consideration. First, I assume that the genes at the additional marker loci are unrelated to disease, that is,
\[
\Pr(D_j = 1 \mid G_j, X_j, Z = i) = \Pr(D_j = 1 \mid G_j, Z = i). \tag{3-2}
\]
In the analysis that follows, I assume that Hardy-Weinberg equilibrium holds for each subpopulation. Human populations rarely show much divergence from the Hardy-Weinberg equilibrium once population substructure has been accounted for (Report of Committee on DNA Forensic Science 1996, pp. 104 and references cited therein). Further, by choosing additional marker loci on different chromosomes from the chromosome where $G$ is found, I first assume that the additional mutually independent marker genes are in linkage equilibrium with the candidate gene $G$, so that
\[
\Pr(G_j, X_j \mid Z = i) = \Pr(G_j \mid Z = i) \Pr(X_j \mid Z = i). \tag{3-3}
\]


By Hardy-Weinberg equilibrium,
\[
\Pr(X_j \mid Z = i) = \prod_{l=1}^{L} \prod_{c=1}^{2} p_{l i x_{cl}}, \tag{3-4}
\]
where $p_{l i x_{cl}}$ is the proportion of persons in subpopulation $i$ having allele $x_{cl}$ at marker locus $l$, $L$ being the number of marker loci. Suppose the candidate gene $G$ has $w$ alleles, e.g., $a_1, \ldots, a_w$, and the frequency of the allele $a_u$ $(u = 1, \ldots, w)$ in the $i$th subpopulation is
\[
\psi_{iu} = \Pr[G_{cl} = a_u \mid Z = i].
\]
Then by Hardy-Weinberg equilibrium the probabilities of the genotypes of $G$, $a_u a_v$ $(u, v = 1, \ldots, w)$, are given by
\[
\Pr[G = a_u a_v \mid Z = i] =
\begin{cases}
\psi_{iu}^{2}, & u = v, \\
2 \psi_{iu} \psi_{iv}, & u \ne v.
\end{cases} \tag{3-5}
\]

3.2.3 Inference on $I$ for the Model with Admixture

I consider the situation where I have multilocus genotype data from individuals sampled from a population with possibly unknown structure. Pritchard et al. (2000a) used the genotypes of a sample of individuals to identify the presence of population structure, which is difficult to detect using visible characters but may be significant in genetic terms. As Pritchard et al. (2000a) pointed out, the problem of inferring the number of unknown populations, $I$, present in a data set is a very difficult task. In a Bayesian paradigm, with a suitably chosen prior distribution on $I$, one can base inference for $I$ on the posterior distribution
\[
P(I \mid X) \propto P(X \mid I) P(I), \tag{3-6}
\]
where $X$ denotes the vector of genotypes of the sampled individuals including the candidate gene $G$. Let $Z$ denote the unknown population of origin of the individuals,


36 Pdenotetheunknownallelefrequencyvectorinallpopulations,andQdenotethevectorofadmixtureproportionsforeachindividual.TheharmonicmeanestimatorisoneofthesimplestwaysofestimatingPXjI,1 PXjI=ZPZ;P;QjX;I PXjZ;P;Q;IdZdPdQ1 KKXk=11 PXjZk;Pk;Qk;I:{7Howeverthisestimatorisnotoriouslyunstable,oftenhavinginnitevariance,andthusposesseverecomputationalchallenges.Pritchardetal.000adescribedanalternativeapproachwhichisamoreadhocbuteectiveapproachbasedontheBayesiandeviancefunctionDVZ;P;Q=)]TJ/F15 11.955 Tf 9.298 0 Td[(2logPXjZ;P;Q:{8Letk=1;2;denotethek-thiterationintheMarkovchain.OneestimatestheconditionalmeanandvarianceofthedeviancefunctionDVgivenXasfollows:EDVZ;P;QjX1 KKXk=1)]TJ/F15 11.955 Tf 9.299 0 Td[(2logPXjZk;Pk;Qk=^;VarDVZ;P;QjX1 KKXk=1)]TJ/F15 11.955 Tf 9.299 0 Td[(2logPXjZk;Pk;Qk)]TJ/F15 11.955 Tf 12.875 0 Td[(^2=^2:ByassumingthattheconditionaldistributionofthedeviancefunctionDVgivenXisnormal,itfollowsfrom 3{7 that)]TJ/F15 11.955 Tf 9.298 0 Td[(2logPXjI^+^2=4:{9AnanalyticalexplanationofthisapproximationisprovidedinAppendix A .AnalternativeinterpretationofthismethodisthatmodelselectionisbasedonpenalizingthemeanoftheBayesiandeviancebyaquarterofitsvariance.Pritchardetal.000apointedoutthatreplacingtheassumptionofnormalitywiththeassumptionoftheBayesiandeviancefunctionbeingdistributedasaGammarandomvariable


may be asymptotically more justifiable, but makes little or no difference in terms of estimation accuracy in practical applications.

One may use (3-9) to estimate $P(X \mid I)$ for each $I$ and then substitute the estimate into (3-6) to obtain approximate estimates of $P(I \mid X)$ (see Pritchard et al., 2000a, for a detailed algorithm). One would then impute the estimated substructure while conducting tests for disease-gene association. I will essentially follow the same technique for estimating $P(I \mid X)$ and embed the derived information into a disease risk model as described in the following section.

3.3 Likelihood and Priors

In this section, I derive the likelihood function, state the prior distributions and derive the posteriors. The key aspect of the modeling is in how I develop algorithms for estimating the model parameters and at the same time account for the population structure in our framework.

3.3.1 Likelihood

Because different subpopulations may have different frequencies of other marker genes, I make inference based on the marginal joint distribution of $D$, $G$ and $X$, summing over all possible values of $Z$, the latent variate. Let $\Pr(Z = i) = q_i$, which is the proportion of persons in subpopulation $i$. Note that for subject $j$, $G_j$ takes one of the values $g_m$, $m = 0, 1, \ldots, M$. By (3-3) and (3-4), for given $I$, the full likelihood $L_I$ is factorized as follows:
\[
\begin{aligned}
L_I &= \prod_{j=1}^{N} \Pr(D_j, G_j, X_j)
= \prod_{j=1}^{N} \sum_{i=1}^{I} \Pr(Z = i) \Pr(G_j, X_j \mid Z = i) \Pr(D_j \mid G_j, Z = i) \\
&= \prod_{j=1}^{N} \sum_{i=1}^{I} \Bigl[ q_i \Bigl\{ \prod_{l=1}^{L} \prod_{c=1}^{2} p_{l i x_{cl}} \Bigr\} \Pr(G_j = g_m \mid Z = i) \frac{\exp\{ D_j (\beta_{0i} + \beta_{1m}) \}}{1 + \exp\{ \beta_{0i} + \beta_{1m} \}} \Bigr],
\end{aligned} \tag{3-10}
\]


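where $\Pr(G_j \mid Z = i)$ is a function of $\psi_{iu}$ $(u = 1, \ldots, w)$ as described in (3-5), and $L$ is the number of marker loci which are in linkage equilibrium with $G$.

I use a marginal likelihood rather than a conditional likelihood approach. The likelihood involves the parameters of interest $\beta_{1m}$ $(m = 1, \ldots, M)$, and the nuisance parameters $\beta_{0i}$, $\psi_{iu}$, $q_i$ and $p_{lix}$ $(i = 1, \ldots, I;\ \forall l \text{ and } \forall x)$, which grow in direct proportion to the number of subpopulations. This gives rise to the well-known Neyman-Scott phenomenon, where MLEs turn out to be inconsistent if $I$ grows with sample size. Typically I deal with $I$ between 1 and 7, and handling nuisance parameters is not a difficult issue in such scenarios. However, the marginal model does contain a large number of parameters, and I carry out Bayesian inference by introducing appropriate prior distributions for these parameters.

3.3.2 Priors and Posteriors

The main problem is to estimate the regression parameters $\beta_{1m}$, $m = 1, \ldots, M$. I consider the following mutually independent normal priors:
\[
\beta_{0i} \sim \text{Normal}(\mu_{0i}, \sigma_{0i}^2), \quad i = 1, \ldots, I; \qquad
\beta_{1m} \sim \text{Normal}(\mu_{1m}, \sigma_{1m}^2), \quad m = 1, \ldots, M.
\]
When inferring the number of subpopulations $I$, I consider a discrete uniform prior on the domain of $I$. The priors for $P$ and $Q$ correspondingly are the following:
\[
(q_1, \ldots, q_I) \sim \text{Dirichlet}(\alpha, \ldots, \alpha); \qquad
\psi_{iu} \sim \text{Beta}(a_i, b_i); \qquad
(p_{li1}, p_{li2}, \ldots, p_{liX_l}) \sim \text{Dirichlet}(\lambda_{li1}, \lambda_{li2}, \ldots, \lambda_{liX_l}).
\]
With the above model and prior specifications, one can obtain the full conditional distributions for the parameters $\beta_{0i}$, $\beta_{1m}$, $\psi_{iu}$, $q_i$ and $p_{lix}$. None of the conditionals has a standard distributional form.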
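The deviance-based approximation (3-9), combined with the discrete uniform prior on $I$ in (3-6), can be sketched as follows. This is a minimal illustration rather than the STRUCTURE implementation: the function names are hypothetical, and the log-likelihood draws in `draws` are made-up numbers standing in for values of $\log P(X \mid Z^{(k)}, P^{(k)}, Q^{(k)})$ saved along an MCMC run for each candidate $I$.

```python
import math

def log_marginal_from_deviance(log_liks):
    """Approximate log P(X | I) using -2 log P(X | I) ~ mu_hat + sigma_hat^2 / 4."""
    deviances = [-2.0 * ll for ll in log_liks]
    k = len(deviances)
    mu_hat = sum(deviances) / k
    var_hat = sum((d - mu_hat) ** 2 for d in deviances) / k
    return -0.5 * (mu_hat + var_hat / 4.0)

def posterior_over_I(log_liks_by_I):
    """P(I | X) under a discrete uniform prior on the candidate values of I."""
    log_m = {I: log_marginal_from_deviance(ll) for I, ll in log_liks_by_I.items()}
    top = max(log_m.values())                      # subtract the max for stability
    weights = {I: math.exp(v - top) for I, v in log_m.items()}
    total = sum(weights.values())
    return {I: w / total for I, w in weights.items()}

# Made-up log-likelihood draws for I = 1, 2, 3 (illustration only).
draws = {
    1: [-1050.0, -1052.0, -1048.0],
    2: [-1000.0, -1002.0, -998.0],
    3: [-1001.0, -1005.0, -997.0],
}
post = posterior_over_I(draws)
```

With these illustrative inputs, $I = 3$ has a similar mean deviance to $I = 2$ but a larger variance, so the quarter-variance penalty shifts posterior mass toward $I = 2$. The resulting probabilities play the role of the weights used when estimates are averaged over the candidate values of $I$.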

For each given value of $I$, the parameters of interest can be estimated by generating random observations from the full conditionals using an MCMC numerical integration scheme and then taking averages of the generated observations. Corresponding to each value of $I$, I also have the associated posterior probabilities $P(I \mid X)$ as discussed in Section 3.2.3. Therefore, setting $\beta = (\beta_{11}, \ldots, \beta_{1M})$ and using a model-averaging technique, any generic parameter $\beta$ is estimated by the posterior mean

$$E(\beta \mid X) = \sum_i E(\beta \mid X, I = i)\, \Pr(I = i \mid X) \tag{3-11}$$

with posterior variance

$$V(\beta \mid X) = \sum_i V(\beta \mid X, I = i)\, \Pr(I = i \mid X) + \sum_i [E(\beta \mid X, I = i)]^2\, \Pr(I = i \mid X) - \Big[ \sum_i E(\beta \mid X, I = i)\, \Pr(I = i \mid X) \Big]^2. \tag{3-12}$$

Thus the posterior variance estimates for the parameters of interest account for uncertainty in the estimation of $I$. The final point estimates are not byproducts of a single model with a fixed value of $I$, but are averaged over possible models with weights proportional to the posterior probabilities $P(I \mid X)$.

3.3.3 Computational Details

1. Estimation of association parameters

None of the conditional distributions of the parameters has a standard distributional form, and thus generating observations from the posterior distributions or calculating the posterior estimates is not automatic. I adopted a componentwise Metropolis-Hastings algorithm for each of the parameters. Let $\omega$ stand for a generic parameter, i.e., any of the $\beta_{0i}$, $\beta_{1m}$, $\theta_{iu}$, $q_i$ and $p_{lix}$ ($m = 1, 2$; $i = 1, \ldots, I$; for all $l$, $x$). Let $L(\omega \mid \cdot)$ denote the full likelihood as given in (3-10) as a function of $\omega$ given the data and all the other parameters. Let $\pi(\omega)$ be the

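prior distribution on $\omega$. In order to simulate observations from the full conditional distribution of $\omega$, namely $\pi(\omega \mid \cdot)$, I proceed as follows.

Step 1: Start with any reasonable initial value of $\omega$, say $\omega^0$. This is the current value of $\omega$.
Step 2: Generate a new value of $\omega$, say $\omega^*$, from a candidate density $g$.
Step 3: Replace $\omega^0$ by $\omega^*$ with probability $\min\Big\{1, \dfrac{\pi(\omega^* \mid \cdot)\, g(\omega^0)}{\pi(\omega^0 \mid \cdot)\, g(\omega^*)}\Big\}$. Retain the existing value $\omega^0$ otherwise.

Note that $\pi(\omega \mid \cdot) \propto L(\omega \mid \cdot)\, \pi(\omega)$. If the candidate density is the prior, $g = \pi$, then the prior term cancels with the identical candidate density term and the acceptance probability reduces to $\min\Big\{1, \dfrac{L(\omega^* \mid \cdot)}{L(\omega^0 \mid \cdot)}\Big\}$.

2. Inference of the number of subpopulations $I$

The following algorithm (Pritchard et al., 2000a) is used to sample from $\Pr(Z, P, Q \mid X)$. Starting with initial values $Z^{(0)}$, iterate the following steps for $k = 1, 2, \ldots$

Step 1. Sample $P^{(k)}$ and $Q^{(k)}$ from $\Pr(P, Q \mid X, Z^{(k-1)})$;
Step 2. Sample $Z^{(k)}$ from $\Pr(Z \mid X, P^{(k)}, Q^{(k)})$;
Step 3. Update $\alpha$ using a Metropolis-Hastings step, where I consider a uniform(0, 10) prior on $\alpha$.

Step 2 may be performed by simulating $z_j^{(c,l)}$ (the population of origin of allele copy $x_j^{(c,l)}$), independently for each $j$, $c$ and $l$, from

$$\Pr(z_j^{(c,l)} = i \mid X, P) = \frac{q_i^{(j)}\, \Pr(x_j^{(c,l)} \mid P, z_j^{(c,l)} = i)}{\sum_{i'=1}^{I} q_{i'}^{(j)}\, \Pr(x_j^{(c,l)} \mid P, z_j^{(c,l)} = i')}, \tag{3-13}$$

where $\Pr(x_j^{(c,l)} \mid P, z_j^{(c,l)} = i) = p_{l i x_j^{(c,l)}}$.

3.4 Simulation

To illustrate our approach, I consider a scenario similar to the one in Satten et al. (2001), with an admixture of European and American Indian ancestry in an Argentinean population. Sala et al. (1998, 1999) published allele frequency data on twelve short tandem repeat (STR) loci in Argentineans of European ancestry, as well as in three Argentinean American Indian aboriginal groups (Mapuche, Tehuelche,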

and Wichi) (Table 3-1). The Metropolitan population of Buenos Aires was studied and the population did not exhibit any significant difference from Hardy-Weinberg equilibrium. However, the STR allele frequency distributions are characterized by significant differences within and also between different populations. I assume that Argentinean Europeans constituted 70% of a hypothetical target population and that each American Indian group constituted 10%. I simulate a population such that all eleven additional mutually independent STR loci are in linkage equilibrium with the candidate gene for persons in the same subpopulation.

Simulated datasets are constructed by using reasonable true values of the parameters. Specifically, by using the allele frequencies from Sala et al. (1999), I generate data on the candidate gene and other marker loci in a population that comprises four subpopulations. As in Satten et al. (2001), I select allele 3 of locus D6S366 as the disease-causing allele, with frequencies 0.277, 0.341, 0.446 and 0.557 in European, Mapuche, Tehuelche, and Wichi, respectively. Consider a biallelic candidate gene, i.e., a candidate gene with two alleles $A$ (the disease-causing allele) and $a$ (the non-disease-causing allele). The candidate gene $G$ has 3 possible genotypes $g_0$, $g_1$ and $g_2$, corresponding to persons having zero ($aa$), one ($Aa$) and two ($AA$) copies of the disease-causing allele. If the frequency of the disease-causing allele in the $i$th subpopulation is

$$\theta_i = \Pr[G^{(c,l)} = A \mid Z = i] = 1 - \Pr[G^{(c,l)} = a \mid Z = i], \tag{3-14}$$

then by Hardy-Weinberg equilibrium, the probabilities of the genotypes of $G$ are as follows:

$$\Pr[G = g_0 \mid Z = i] = (1 - \theta_i)^2; \qquad \Pr[G = g_1 \mid Z = i] = 2(1 - \theta_i)\theta_i; \qquad \Pr[G = g_2 \mid Z = i] = \theta_i^2. \tag{3-15}$$
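
The genotype probabilities in (3-15) are simple to compute for the four subpopulations. The sketch below is illustrative only: it plugs the allele-3 frequencies of D6S366 quoted above into (3-15) and draws one genotype for a hypothetical subject. The function names and the use of Python's `random` module are assumptions made for the example, not part of the original simulation code.

```python
import random


def hwe_genotype_probs(theta):
    """Return [Pr(g0), Pr(g1), Pr(g2)] under Hardy-Weinberg equilibrium, as in (3-15)."""
    return [(1.0 - theta) ** 2, 2.0 * (1.0 - theta) * theta, theta ** 2]


def draw_genotype(theta, rng):
    """Draw the number of copies (0, 1 or 2) of the disease-causing allele."""
    return rng.choices([0, 1, 2], weights=hwe_genotype_probs(theta), k=1)[0]


# Frequencies of the disease-causing allele (allele 3 of D6S366) by subpopulation.
allele_freq = {"European": 0.277, "Mapuche": 0.341, "Tehuelche": 0.446, "Wichi": 0.557}
genotype_probs = {name: hwe_genotype_probs(theta) for name, theta in allele_freq.items()}

for name, probs in genotype_probs.items():
    print(name, [round(v, 4) for v in probs])
```

Calling `draw_genotype(allele_freq["Wichi"], random.Random(1))` would return one simulated genotype for a member of the Wichi subpopulation.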

Finally, the disease status data, which vary with changing frequencies of the disease-causing allele for each subpopulation, are generated. As stated in Satten et al. (2001), persons who were homozygous for the disease-causing allele had an increased risk of disease corresponding to a log-odds ratio of 1.0 (relative risk $= \exp(1.0) = 2.72$), and persons who were heterozygous for the disease-causing allele had no increase in risk. This implies, in our notation, $\beta_{11} = 0$ and $\beta_{12} = 1.0$. The log odds of the rare disease (which implies that the control population mimics the whole population, and $\Pr(G = g_m \mid D = 0, Z = i) \approx \Pr(G = g_m \mid Z = i)$) among persons with zero or one copy of the disease-causing allele was $-5$, $-4$, $-3$ and $-3$ in the European, Mapuche, Tehuelche, and Wichi populations, respectively. For the common disease with a higher prevalence rate, I assume that the log odds among persons with zero or one copy of the disease-causing allele was $-2$, $-1.5$, $-1$ and $-1$ in the European, Mapuche, Tehuelche, and Wichi populations, respectively.

The results I present are based on a set of diffuse and mutually independent priors. I use an $N(0, 9)$ prior on $\beta_{0i}$ and $\beta_{1m}$, a Beta(0.5, 0.5) prior on $\theta_i$, and a symmetric Dirichlet prior for the allele frequency parameters with all $\lambda$'s being 0.5. For $q_1, \ldots, q_I$, I choose a Dirichlet$(\alpha, \ldots, \alpha)$ prior, with a $U(0, 10)$ hyperprior on $\alpha$. For each scenario, I generated 100 different datasets and obtained the parameter estimates by computing the model-averaged posterior means for each simulated dataset.

In each replication of our simulation, I generated data for 125 (250) cases and 125 (250) controls from the above simulation strategy, followed by sampling the cases and controls from a larger random sample of subjects. For each replication, I ran multiple Markov chains, typically with 20000 to 30000 iterations. The posterior means calculated for each replication were based on every tenth observation of the last 5000 observations in each chain (to reduce autocorrelation), combined across chains. An estimate of the posterior variance was calculated based on the aggregate of the last 5000 values for each replication. I report average values for these quantities over

the 100 replications. I also calculated an estimate of the mean squared error (MSE) corresponding to the estimates of each of the parameters of interest (say $\beta$ in general) based on the 100 replications. I considered this MSE, i.e., the squared deviations of the estimates from the true parameter averaged over the 100 replications, as a measure of performance of our method:

$$\mathrm{MSE} = \frac{1}{100} \sum_{r=1}^{100} \big( \text{posterior mean of } \beta \text{ in the } r\text{-th replication} - \text{true value of } \beta \big)^2.$$

To examine the effect of the number of STR loci on the estimators, I analyzed the datasets with 250 subjects (125 cases and 125 controls) by (i) using all the additional loci and (ii) using only the first six additional loci. These two scenarios are labeled as X12 and X6 in Tables 3-2 and 3-4, respectively. By applying the methods stated in Section 3.2 (Pritchard et al., 2000a) and introducing a uniform prior for $I$ ($I \in \{1, 2, 3, 4\}$), for each simulated dataset I first obtain estimates of $P(I \mid X)$. For example, by (i), I obtain $P(I = 3 \mid X) = 0.2$ and $P(I = 4 \mid X) = 0.8$. Then the model-averaged estimate of $I$ is $0.2 \times 3 + 0.8 \times 4 = 3.8$. The estimates of the association parameters are computed following (3-11) and (3-12). For the same dataset, the estimate of $\beta_{12}$ is 1.09 for $I = 3$ and 1.02 for $I = 4$; thus the final model-averaged estimate of $\beta_{12}$ for that dataset is $1.09 \times 0.2 + 1.02 \times 0.8 = 1.034$.

The results in Table 3-2 are obtained by averaging these estimates over the 100 simulated datasets. They show that the posterior standard deviations of our model-averaged estimates are typically smaller than the standard errors furnished by Satten et al. (2001) (I include the relevant numbers from Tables 3-2 and 3-3 of Satten et al. (2001) directly in Table 3-3 of the current chapter). I realize that though our simulation settings are the same as those of Satten et al. (2001), the two sets of estimates may not be exactly comparable, as the two methods are not implemented on identical datasets, but still this might serve as a precursor for comparison purposes. Satten et al. (2001) do not provide the MSE for their estimates over the replications. As a result I cannot compare the two procedures

directly in terms of the MSE. As one might expect, when I increased the sample size to 500 (250 cases and 250 controls), adequate performance is achieved even with just the first six STR loci, and the overall pattern of the results remains the same.

I also include the naive analysis completely ignoring additional multilocus information (denoted as X0 in Tables 3-2 and 3-3). One can note that the estimation results are much inferior if one ignores the genotypic information at a series of additional unlinked marker loci.

To show that the methods are not limited to the assumptions that either the disease is rare or the genotypes $G$ are binary, I also analyzed a simulated dataset with 250 subjects (125 cases and 125 controls) and another with 500 subjects (250 cases and 250 controls) where the disease has a higher prevalence rate. The overall pattern of the results is fairly similar to the rare disease case. I note relatively smaller MSEs and posterior standard deviations for this common disease case as compared to the rare disease case. The results are presented in Table 3-4.

For analyzing the simulated data, I used the implicit prior belief that the source population may have 4 or fewer subpopulations, by putting a discrete uniform prior on $\{1, 2, 3, 4\}$ for $I$. However, I have also tried to put non-zero probability on values of $I$ greater than the true simulation value of 4, for instance, a discrete uniform prior on $\{1, \ldots, 8\}$. In this case, the estimates of the regression parameters $\beta_{1m}$ appear to change very little even when $I$ is estimated to be slightly greater than the true value used to generate the data (results are not provided). Pritchard et al. (2004a) note that for situations where several values of $I$ give similar estimates of $\log \Pr(X \mid I)$, it is often the case that the smallest of these is 'correct'. In the practical implementation, I adopt a model selection perspective and try to obtain the smallest value of $I$ that captures the major structure in the data.
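
The model-averaging step in (3-11) and (3-12) that underlies these simulation summaries can be sketched in a few lines of Python. The sketch reuses the worked example from this section (posterior probabilities 0.2 and 0.8 for $I = 3$ and $I = 4$, and estimates 1.09 and 1.02 of $\beta_{12}$). The within-model posterior variances 0.04 and 0.03 are hypothetical values chosen only to make the variance formula concrete; they do not come from the simulation output.

```python
def model_average(means, variances, weights):
    """Combine per-model posterior means and variances as in (3-11) and (3-12).

    means[i], variances[i]: posterior mean and variance given the i-th value of I;
    weights[i]: posterior probability of that value of I (weights sum to 1).
    """
    avg_mean = sum(w * m for m, w in zip(means, weights))
    within = sum(w * v for v, w in zip(variances, weights))
    second_moment = sum(w * m * m for m, w in zip(means, weights))
    avg_var = within + second_moment - avg_mean ** 2
    return avg_mean, avg_var


post_prob = [0.2, 0.8]            # P(I = 3 | X), P(I = 4 | X)
beta12_means = [1.09, 1.02]       # estimates of beta_12 given I = 3 and I = 4
beta12_vars = [0.04, 0.03]        # hypothetical within-model posterior variances

beta12_hat, beta12_var = model_average(beta12_means, beta12_vars, post_prob)
i_hat = sum(w * i for i, w in zip([3, 4], post_prob))
print(round(beta12_hat, 3), round(i_hat, 1))
```

The last two terms of (3-12) together equal the between-model variance of the conditional means, which is how uncertainty about $I$ enters the reported posterior standard deviations.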

3.5 Application to a Real Dataset

To illustrate our method, I apply our approach to explore the genetic association of obesity and the $\beta_2$AR candidate gene (for details of the study, please see Lin et al., 2005). The $\beta$-adrenergic receptors ($\beta$AR) are known to play an important role in cardiovascular function and in response to drugs. I analyze complete data on 144 men and women who participated in this study and ignore the observations with missing values. Each of the participating subjects was genotyped for SNP markers at codon 16 within the $\beta_2$AR gene, at codon 389 within the $\beta_1$AR gene and at codon 492 within the $\alpha_{1A}$ gene. The phenotypic information collected comprises the weight and height of individuals, from which the body mass index (BMI) of each subject can be calculated. I define "obese", i.e., $D = 1$, when BMI $\geq 30.0$, and $D = 0$ otherwise. This leads to 85 undiseased and 59 diseased subjects in the dataset I consider.

Previous studies have detected possible association between polymorphism in the $\beta_2$AR gene and obesity, the focus being particularly on codon 16 and codon 27 substitutions, but no association has been detected within the $\beta_1$AR gene or the $\alpha_{1A}$ gene (Johnson and Terra 2002, Lin et al. 2005, Takami et al. 1999). Therefore, I consider the $\beta_2$AR gene as the candidate gene, denoted by $G$, and the $\beta_1$AR gene and the $\alpha_{1A}$ gene as two other genes unrelated to the disease, denoted by $X = (X_1, X_2)$. Note that in this dataset, I only have the genotypic information regarding single polymorphisms in these three genes, which have biallelic genotypes, generally expressed as $x = 0, 1, 2$. So the expression in (3-4) is changed to $P(X \mid Z = i) = \prod_{l=1}^{2} p_{lix}$, where $p_{lix}$ is the proportion of persons in subpopulation $i$ having genotype $x$ ($x = 0, 1, 2$) corresponding to gene $l$. I analyzed the data by considering genotypic information on all three genes (denoted by "X2+G") and on only the candidate gene (denoted by "X0+G").

Since in the real data I do not know the true value of $I$, I should try to estimate the smallest value of $I$ that captures the major substructure in the data, if any. To this end, I

introduce a discrete uniform prior on $\{1, 2, \ldots, 15\}$ for $I$. I consider $(p_{li1}, p_{li2}, \ldots, p_{liI}) \sim \mathrm{Dirichlet}(0.5, 0.5, \ldots, 0.5)$, and for $q_1, \ldots, q_I$, I choose a Dirichlet$(\alpha, \ldots, \alpha)$ prior with a uniform hyperprior on $\alpha$ with range from 0 to 10. By applying the methods stated in Section 3.2, I first obtain inference on $I$. The principal findings are that with the inclusion of the two other genes, I detect some evidence of substructure, with an estimate of $I$ of $\hat{I} = 3$ and $P(I = 3 \mid X) = 1$, whereas without these two genes and by only using $G$, I obtain $P(I = 1 \mid X) = 1$, implying $\hat{I} = 1$, i.e., no population substructure can be detected in the source population. In fact, the data came from a North American population with a diverse ethnic composition of blacks, whites and others, so one could expect some latent population substructure in these data.

The results of our analysis are presented in Table 3-5. In all the methods of analysis, the genetic factor does not appear to be a statistically significant risk factor. The results suggest that the codon 16 (Arg16Gly) polymorphism of the $\beta_2$AR gene is not a major contributing factor to obesity for this studied population. In fact, in Swedish Caucasians, the Gln27Glu polymorphism at codon 27 of the $\beta_2$AR gene was shown to be associated with obesity, but no such association was shown for the Arg16Gly polymorphism at codon 16. Neither the Gln27Glu nor the Arg16Gly polymorphism of the $\beta_2$AR gene was found to be a major contributing factor to obesity in Japanese men (Hayakawa et al. 2000). In the ordinary logistic regression model, with $G$ as a categorical factor, I also find $G$ to be insignificant (P-values 0.8591 and 0.1571 corresponding to $G = 1$ and 2, respectively). Even after accounting for information in the other genes and population substructure, the effect of the candidate gene remains insignificant. Notice that the Bayesian HPD intervals are wider than the confidence intervals from the ordinary logistic model, due to the additional layer of uncertainty on $I$.

3.6 Discussion

In this chapter, I present an alternative Bayesian model for accounting for population substructure in genetic association studies. As compared to previous approaches, our model is advantageous in the following respects. First, it can estimate the number of subpopulations $I$ that comprise the overall population. Although Satten et al. (2001) can also provide such an estimate, their approach is based on a grid procedure in which multiple different values of $I$ are fitted and the optimal one is then determined in terms of the minimum AIC. On the other hand, Pritchard et al. (2000b) estimated substructure and then conducted tests based on the imputed substructure. Based on marker and candidate gene information, our model estimates the posterior probabilities of $I$, which are then used in forming the final estimates of the relative risk parameters through model averaging. An additional advantage is that, unlike Satten et al.'s (2001) approach, our model does not rely on the rare disease assumption or on the collapsing of multiple genotypes into binary genotypes, and thus offers more power to study the genetic architecture of any type of disease.

A new feature of the Bayesian analysis is the use of model averaging to estimate the regression coefficients. Rather than relying on one particular model with a fixed number of strata $I$, I have put a prior on $I$, and have estimated the regression parameters as the weighted average of their estimates for different values of $I$. The weights are proportional to the posterior probabilities of the different values of $I$. Thus I embed the substructure estimation together with inference on the association parameters in a unified Bayesian framework. The standard error of the relative risk estimates does incorporate the uncertainty in the estimation of $I$, as reflected in (3-12). This is unlike the method proposed in Pritchard et al. (2000b), where the substructure is estimated first and tests are conducted based on the imputed substructure. Table 3-2 shows that our methods are comparable to those of Satten et al. (2001); however, since our set-up is different from that of Pritchard et al. (2000b),

it is hard to compare the two methods directly in a numerical sense. In principle, I do believe that combining inferences on the substructure and association modeling will lend one more power in detecting association.

It should be pointed out that fewer additional markers are needed when the sample size is large. When additional marker loci are involved, the number of nuisance parameters (the allele frequencies of those loci for each subpopulation) in the model would increase, requiring more data to estimate them properly.

There remains the problem of handling marker loci in linkage disequilibrium with the candidate gene in our framework. According to Falush et al. (2003), there are three sources of linkage disequilibrium (LD): mixture LD, admixture LD and background LD. The mixture LD arises from variation in individuals' ancestry and it can be measured by unlinked markers. The admixture LD occurs because of the correlation in ancestry among an extended genomic region. The background LD decays on a short scale and, therefore, occurs within a fine chromosomal structure. Pritchard et al. (2000a) modeled the mixture LD for association studies. In their "linkage" model, Falush et al. (2003) incorporated the "admixture LD" into the inference of population structure. The incorporation of the background LD is an interesting open question.

In summary, I have derived flexible Bayesian estimation techniques for disease-gene association in case-control studies by accounting for population structure. First, I applied Pritchard et al.'s (2000a) methods to infer population structure (i.e., estimating $P(I \mid X)$ and $I$) by using the genotypes of sampled individuals at a series of unlinked markers. Second, I propose a latent variable approach to estimate the association parameters, and account for population substructure using additional marker loci information as in Satten et al. (2001). The final results are calculated by the model averaging technique as described in (3-11) and (3-12), which combines inferences from the above two steps. Estimation results based on a simulated admixed

population mimicking the results presented in Sala et al. (1998) show that the estimates of the relative risk parameters using additional multilocus genetic information are superior to those obtained when such information is not exploited. I also apply our method to a real dataset on obesity. This chapter illustrates how the modeling tool of Bayesian model averaging can be effectively used to conduct posterior inference in an interesting application in human genetics.

Table 3-1: Allele frequencies for twelve STR loci in the four Argentinean subpopulations.

Locus     Argentinean Europeans   Mapuche   Tehuelche   Wichi
D6S366    0.082                   0.091     0.143       0
          0.204                   0.114     0.071       0
          0.277                   0.341     0.446       0.557
          0.119                   0.136     0.036       0.086
          0.091                   0.125     0.036       0.029
          0.183                   0.159     0.143       0.200
          0.028                   0.011     0.018       0.071
          0.015                   0.023     0.107       0.057
FABP      0.589                   0.683     0.732       0.485
          0.110                   0.058     0.107       0.162
          0.300                   0.260     0.161       0.353
CSF1PO    0.330                   0.266     0.339       0.226
          0.313                   0.282     0.232       0.194
          0.298                   0.367     0.411       0.581
          0.059                   0.085     0.018       0
F13A      0.151                   0.222     0.357       0.173
          0.060                   0.122     0.125       0.077
          0.202                   0.122     0.054       0.346
          0.209                   0.178     0.143       0.115
          0.325                   0.344     0.304       0.288
          0.053                   0.011     0.017       0
FESFPS    0.260                   0.170     0.143       0.257
          0.420                   0.500     0.714       0.543
          0.247                   0.284     0.107       0.043
          0.073                   0.045     0.036       0.157
THO1      0.233                   0.526     0.286       0.132
          0.250                   0.298     0.429       0.721
          0.105                   0.009     0.018       0
          0.185                   0.026     0.089       0.015
          0.226                   0.140     0.179       0.132
HPRTB     0.032                   0         0           0
          0.179                   0.032     0.091       0
          0.317                   0.323     0.227       0.357
          0.285                   0.403     0.591       0.167
          0.137                   0.242     0.091       0.357
          0.050                   0         0           0.119
VWA       0.063                   0.0096    0.036       0.014
          0.099                   0.077     0.054       0.014
          0.294                   0.577     0.429       0.514
          0.297                   0.125     0.214       0.343
          0.246                   0.212     0.268       0.114
D13S317   0.090                   0.020     0           0
          0.160                   0.240     0.15        0.464
          0.060                   0.070     0.05        0.179
          0.290                   0.120     0.15        0.089
          0.250                   0.260     0.3         0.089
          0.100                   0.180     0.225       0.179
          0.040                   0.110     0.125       0
D7S820    0.156                   0.070     0.050       0
          0.115                   0.050     0.050       0.070
          0.276                   0.220     0.175       0.125
          0.245                   0.420     0.525       0.450
          0.159                   0.210     0.200       0.250
          0.046                   0.030     0           0.105
D16S539   0.156                   0.110     0.225       0.125
          0.100                   0.130     0.075       0.232
          0.294                   0.240     0.100       0.321
          0.252                   0.370     0.550       0.250
          0.195                   0.150     0.050       0.071
RENA4     0.772                   0.728     0.881       0.690
          0.074                   0.229     0.023       0
          0.153                   0.041     0.095       0.310

Cited from Sala et al. (1998) and Satten et al. (2001).

Table 3-2: The results of simulated rare-disease data with marker loci in linkage equilibrium with the candidate gene D6S366. The ratio of the sample sizes of cases to controls is 125/125 and 250/250. X12 and X6 indicate that the parameters were estimated by using the twelve and the first six additional marker loci, respectively. X0 is the analysis without using any additional marker loci. Mean and posterior standard deviation refer to the average of the Bayes estimates and posterior standard deviations obtained in 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.

Sample size   Model   Statistic         beta_11    beta_12    I
                      True value         0.0000     1.0000    4
125/125       X12     Mean              -0.0475     1.1093    3.8178
                      MSE                0.1497     0.0765    0.1802
                      Post. std. dev.    0.3126     0.2638    0.3854
              X6      Mean              -0.1095     1.1028    3.6403
                      MSE                0.2005     0.0986    0.3540
                      Post. std. dev.    0.3277     0.3127    0.4763
              X0      Mean              -0.3380     0.8855    4.0000
                      MSE                1.2277     0.4982
                      Post. std. dev.    1.5982     1.0677
250/250       X12     Mean               0.0005     1.0966    3.7873
                      MSE                0.0546     0.0551    0.2107
                      Post. std. dev.    0.2704     0.1592    0.4089
              X6      Mean               0.0051     1.1035    3.5415
                      MSE                0.0631     0.0582    0.4572
                      Post. std. dev.    0.3127     0.1952    0.4994
              X0      Mean              -0.2766     0.9489    4.0000
                      MSE                1.2603     0.4330
                      Post. std. dev.    1.4152     0.9236

Table 3-3: The results of simulated rare-disease data with marker loci in linkage equilibrium with the candidate gene D6S366, as analyzed by Satten et al. (2001). 125/125 and 250/250 denote the ratio of the sample sizes of cases to controls. X12 and X6 indicate that the parameters were estimated by using the twelve and the first six of the additional marker loci, respectively. Mean and standard error refer to the average of the estimates and standard errors obtained in 500 replications.

Sample size   Model             Statistic    beta_11   beta_12   I
                                True value   0.000     1.000     4
125/125       X12               Mean         0.061     1.006     3.53
                                Std. err.    0.293     0.453     0.76
              X6                Mean         0.023     0.883     3.32
                                Std. err.    0.865     1.718     0.69
              Crude analysis*   Mean         0.366     1.760     1.00
                                Std. err.    0.285     0.370
250/250       X6                Mean         0.023     0.962     3.37
                                Std. err.    0.226     0.394     0.61

* Ignores stratification and analyzes the data without additional marker loci.

Table 3-4: The results of simulated common-disease data with marker loci in linkage equilibrium with the candidate gene D6S366. The ratio of the sample sizes of cases to controls is 125/125 and 250/250. X12 and X6 indicate that the parameters were estimated by using the twelve and the first six additional marker loci, respectively. X0 is the analysis without using any additional marker loci. Mean and posterior standard deviation refer to the average of the Bayes estimates and posterior standard deviations obtained in 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.

Sample size   Model   Statistic         beta_11    beta_12    I
                      True value         0.0000     1.0000    4
125/125       X12     Mean              -0.0062     1.1116    3.8492
                      MSE                0.1106     0.1005    0.1456
                      Post. std. dev.    0.3152     0.1607    0.3523
              X6      Mean               0.0017     1.1299    3.6279
                      MSE                0.1173     0.1371    0.3634
                      Post. std. dev.    0.3488     0.2766    0.4766
250/250       X12     Mean               0.0023     1.0928    3.9331
                      MSE                0.0600     0.0551    0.0461
                      Post. std. dev.    0.2165     0.1806    0.2412
              X6      Mean               0.0191     1.1051    3.6228
                      MSE                0.0408     0.0470    0.3748
                      Post. std. dev.    0.2627     0.1991    0.4846

Table 3-5: The results of the real data analysis, with the posterior mean (Estimate), posterior standard deviation and 95% highest posterior density (HPD) interval for the Bayesian models, and the MLE and confidence interval (CI) for the ordinary logistic regression model.

Model                          Statistic         beta_11              beta_12              I
X2+G                           Estimate          -0.0895              0.7165               3*
                               Post. std. dev.   0.3997               0.5201
                               HPD               (-0.8619, 0.6831)    (-0.2996, 1.7259)
X0+G                           Estimate          -0.1206              0.7433               1*
                               Post. std. dev.   0.4515               0.5602
                               HPD               (-1.0028, 0.7865)    (-0.3339, 1.8303)
Ordinary logistic regression   Estimate          -0.0668              0.7143
with only G as covariate       Std. err.         0.3765               0.5048
                               CI                (-0.8047, 0.6711)    (-0.2751, 1.7037)

* All of the posterior probability concentrated on a single value of I, thus we are unable to obtain estimates of posterior variance.

CHAPTER 4
SEMIPARAMETRIC BAYESIAN ANALYSIS OF CASE-CONTROL DATA UNDER GENE-ENVIRONMENT INDEPENDENCE AND POPULATION STRATIFICATION

4.1 Introduction

Except for some rare diseases, such as Huntington or Tay-Sachs disease, which may be the result of a deficiency of a single gene product, most common human diseases have a multifactorial etiology involving a complex interplay of many genetic and environmental factors. By identifying and characterizing such complicated gene-environment interactions, one has more opportunities to study the etiology, diagnosis, prognosis and treatment of complex diseases. The case-control study design, where sampling is conditional on the presence or absence of disease, is a powerful epidemiologic tool for studying potential risk factors of rare diseases. It has been established that prospective logistic regression analysis of case-control data is "efficient" in the modern semiparametric sense with respect to the underlying covariate density model (Breslow et al., 2000). A special aspect of the gene-environment association problem is that it may often be reasonable to assume that a subject's genetic susceptibility is independent of the environmental exposure. Consequently, one may be able to obtain more efficient estimation techniques than traditional logistic regression by exploiting the additional gene-environment independence restriction instead of an unconstrained covariate density model.

Piegorsch et al. (1994) first observed that one can estimate multiplicative gene-environment interactions in logistic models with data from cases alone, provided that the environmental factor $E$ and the genetic factor $G$ are independent in the population and the disease is rare. The interaction parameter is obtained as the odds

ratio between $G$ and $E$ among cases only. They also noted that the estimate of the G-E interaction parameter from case-only data is more efficient than its counterpart obtained from case-control data using logistic regression. However, methods that use G-E independence produce severely biased estimates if the assumption is violated (Schmidt and Schaid, 1999; Albert et al., 2001).

Non-independence is less likely to occur when the environmental exposure is external (pollution, pesticide or radioactive substance) or a randomized treatment in a clinical trial. One has to be much more cautious with the independence assumption when considering behavioral risk factors and metabolic polymorphisms which could alter an individual's behavior. Gatto et al. (2004) discuss several such potential sources of non-independence. In fact, genetic susceptibility factors and environmental exposures, though unlikely to be causally related at an individual level, may be correlated at a population level due to their dependence on other variables that stratify the population, such as age, ethnicity, family history and the like. For example, a woman with a strong family history of breast cancer is more likely to carry a BRCA1/2 mutation (BRCA1 and BRCA2 being two major genes identified for breast and ovarian cancer) and, knowing her family history, less likely to use post-menopausal hormones. This may result in a negative association between BRCA1/2 mutation and hormone use. In such instances, G-E independence does not hold marginally, but may hold when conditioned on the stratification variables (for instance, family history). Modeling stratification effects can thus be viewed as a possible remedy to guard against resultant bias due to violation of the G-E independence assumption. One of the major goals of the current chapter is to develop techniques to model stratification effects in a flexible, data-adaptive way in an estimation framework which exploits conditional G-E independence.

The use of G-E independence through case-only studies has mainly been for estimating the gene-environment interaction parameter. Khoury and Flanders (1996)

noted that neither the genetic nor the environmental exposure main effect can be estimated with case data only. Umbach and Weinberg (1997) showed that with data available on both cases and controls, one can estimate the main effects and interaction by fitting a suitably constrained log-linear model under a rare disease assumption. In a population-based case-control study of ovarian cancer among Jewish women in Israel, Modan et al. (2001) argued that under gene-environment independence and the rare disease assumption, the disease odds ratio associated with $E$ among subjects with genotype $G = g$ can be estimated by a logistic regression analysis that compares $P(E \mid D = 0)$ with $P(E \mid D = 1, G = g)$. However, the method proposed in Modan et al. (2001) also does not allow for the estimation of all main effects of interest. Most of the above methods consider very simple settings, and it is not immediate how to exploit G-E independence in the presence of population stratification as a direct extension of these methods.

Chatterjee and Carroll (2005) (referred to as CC in the rest of the text) propose a semiparametric maximum likelihood method of estimation of all the logistic regression parameters. They exploit the G-E independence assumption and use data from both cases and controls. Their method addresses many of the limitations of the existing methods as discussed above. CC derive a robust profile-likelihood based estimation technique which does not require the rare disease assumption. They also consider the issue of population stratification and propose a method for when the G-E independence assumption only holds conditional on the set of stratification variables $S$. CC consider a logistic disease probability model for $P(D \mid G, E, S)$. They proceed to work with the joint retrospective likelihood of the form $P(G, E, S \mid D)$, factorized as

$$P(G, E, S \mid D) = \frac{P(D \mid G, E, S)\, P(G \mid E, S)\, P(E, S)}{\sum_{G, E, S} P(D \mid G, E, S)\, P(G \mid E, S)\, P(E, S)}.$$

Under the assumption of G-E independence conditional on $S$, the second factor on the right-hand side reduces to $P(G \mid E, S) = P(G \mid S)$, and thus it remains to model

$P(E, S)$ and $P(G \mid S)$. CC leave the joint distribution of the environmental exposure and the stratification variables, $P(E, S)$, to be fully non-parametric. However, they model $P(G \mid S)$ in a parametric way, by assuming a logistic regression model with $S$ as covariate.

As we will note, the parametric logistic model for $P(G \mid S)$ is often inadequate, especially for a genetic mutation which is rarely detected in healthy controls but commonly prevalent in the case population. In such circumstances, the estimation, especially of the main effect due to $G$, suffers in the method proposed by CC. To overcome this problem, I use a factorization of the partially retrospective likelihood $P(G, E \mid D, S)$ that allows us to model the genotype frequencies separately in the case and the control population. Moreover, for genetic mutations like BRCA1/2, there are several genetic risk models (Antoniou et al., 2004) as well as empirical data (Risch et al., 2001; Couch et al., 1997) which predict population mutation frequencies after adjusting for covariates like family history and ancestry. A flexible Bayesian model can incorporate this accumulated scientific evidence in the form of a prior distribution assigned to $P(G \mid S)$ and lead to more accurate estimation than a logistic model for carrier probabilities. To elicit this advantage of the Bayesian paradigm while estimating all the parameters in the G-E logistic regression model, and not just the G-E interaction, remains another primary goal of this chapter.

The dataset I use is a replica of the one that CC use, based on a case-control study of ovarian cancer patients in Israel (Modan et al., 2001). I consider presence of a BRCA1/2 mutation as the genetic risk factor, and number of years of oral contraceptive (OC) use and parity as the environmental exposures. The stratification variables I consider are age group, ethnicity, personal history of breast cancer (PHB) and family history of breast and ovarian cancer (FHBO). I model the control distribution of the continuous environmental exposures conditional on $S$ as a Dirichlet process mixture of normals (DPM). The DPM model is appealing in this context as it provides a natural measure of the degree of stratification and is model-robust. I also present a

parametric Bayesian alternative for comparison purposes.

An extensive simulation study providing an in-depth comparison of the proposed Bayesian methods with the powerful estimation techniques provided by CC, the case-only method and ordinary logistic regression is a very important feature of this chapter. The simulation explores several scenarios, with changing distributions for $G$ and $E$, as well as violation of the G-E independence assumption even when conditioned on observable confounders. It appears that under G-E independence, the proposed semiparametric Bayesian method has a real advantage over the competing methods in either of the following situations: (i) the individual genotype frequencies in each stratum do not follow the logistic (multiplicative odds) model in terms of the stratification variables; (ii) the genetic mutation is rare in the control population and is commonly prevalent in the case population. The gain is significant when the number of strata defined by $S$ is relatively large. When the G-E independence assumption (even when conditional on $S$) fails, all the methods which use this assumption perform poorly, least so the Bayesian semiparametric method, which is more robust to model changes.

The rest of this chapter is organized as follows. In Section 4.2 I present the model, likelihood, priors and posteriors. Section 4.3 contains the analysis of the Israeli ovarian cancer data. Section 4.4 presents the details of our simulation study and the results. Section 4.5 contains a concluding discussion, while proofs and computational details are relegated to Appendix B.

4.2 Model, Likelihood, Priors and Posteriors

Consider a case-control study with $n$ subjects, $n_1$ cases and $n_0$ controls. Let $D$ be the binary disease variable, i.e., $D_j = 1$ if the $j$th subject is a case, and $D_j = 0$ if the subject is a control. The genetic risk factor $G$ is essentially the genotype at a single locus within a candidate gene. I will consider $G$ as a categorical variable with $M + 1$ levels, namely $g_0, \ldots, g_M$. In addition, the data are assumed to be stratified based on some other covariates, say $S$. I consider the following logistic regression

function to model the disease probability in terms of $G$, $E$ and $S$:

$$P(D = 1 \mid G, E, S) = H\Big\{ \beta_0(S) + \sum_{m=0}^{M} I(G = g_m)\, \beta_{1m} + \beta_2 E + E \sum_{m=0}^{M} \beta_{3m}\, I(G = g_m) \Big\}, \tag{4-1}$$

where $H(u) = \{1 + \exp(-u)\}^{-1}$. The intercepts $\beta_0(S)$ capture stratification effects due to the covariates $S$ on the risk of disease. Let $\beta_1 = (\beta_{10}, \ldots, \beta_{1M})$, $\beta_2$, and $\beta_3 = (\beta_{30}, \ldots, \beta_{3M})$ represent the main effect of the genetic factor, the main effect of the environmental factor, and their interaction effect, respectively. For parameter identifiability, I set $\beta_{10} = 0$ and $\beta_{30} = 0$. For simplicity, I present my model with only one continuous environmental exposure. Extension to multiple continuous exposures $E$ is straightforward, and one such analysis is presented in Section 4.3. Extension of the methodology when $E$ is a set of categorical exposures or a mixed set of continuous and categorical exposures is indicated later in this section.

As I continue to compare and contrast our methods with CC and traditional logistic regression, I would first like to point out that each method is based on a different likelihood: the CC method uses a fully retrospective likelihood, $P(G, E, S \mid D)$; the traditional logistic model uses a fully prospective likelihood, $P(D \mid G, E, S)$; whereas our method uses the following partially retrospective likelihood $P(G, E \mid D, S)$, factorized as

$$L_R = \prod_{j=1}^{n} P(G_j, E_j \mid S_j, D_j) = \prod_{j=1}^{n} P(G_j \mid E_j, S_j, D_j)\, P(E_j \mid S_j, D_j). \tag{4-2}$$

As illustrated in Prentice and Pyke (1979) and discussed again in Roeder et al. (1996) and Muller and Roeder (1997), the form of the retrospective likelihood considered here is compatible with the logistic form of the prospective likelihood. Evaluation of the likelihood function (4-2) requires the conditional distribution of $[G \mid E, S, D]$ and the conditional distribution of $[E \mid S, D]$. I will make the following assumption:

Assumption 1: Conditional on $S$, $G$ and $E$ are independent in the control population, i.e., $P(G \mid D = 0, E, S) = P(G \mid D = 0, S)$.
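
As a concrete reading of the disease risk model (4-1), the short Python sketch below evaluates $P(D = 1 \mid G = g_m, E, S)$ for one stratum. It is illustrative only: the function name, the argument layout (one intercept value per call, lists `beta1` and `beta3` indexed by genotype with the identifiability constraint `beta1[0] = beta3[0] = 0`) and the numeric values are assumptions made for the example.

```python
import math


def disease_prob(beta0_s, beta1, beta2, beta3, m, e):
    """P(D = 1 | G = g_m, E = e, S) under the logistic model (4-1).

    beta0_s: stratum-specific intercept for the subject's stratum;
    beta1[m], beta3[m]: genetic main effect and G-E interaction for genotype g_m,
    with beta1[0] = beta3[0] = 0 for identifiability;
    beta2: main effect of the (single, continuous) environmental exposure e.
    """
    eta = beta0_s + beta1[m] + beta2 * e + beta3[m] * e
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical parameter values for a binary genotype (M = 1).
beta1 = [0.0, 0.5]
beta3 = [0.0, 0.3]
risk_noncarrier = disease_prob(-1.0, beta1, 0.2, beta3, m=0, e=2.0)
risk_carrier = disease_prob(-1.0, beta1, 0.2, beta3, m=1, e=2.0)
print(round(risk_noncarrier, 4), round(risk_carrier, 4))
```

For the baseline genotype the model reduces to $H\{\beta_0(S) + \beta_2 E\}$, which is why only $\beta_{1m}$ and $\beta_{3m}$ with $m \geq 1$ carry genetic information.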

When the disease is rare in each stratum, and the control population mimics the entire population, the usual G-E independence assumption in the source population, i.e., $P(G \mid E, S) = P(G \mid S)$, is approximately equivalent to Assumption 1. The two assumptions of G-E independence in the source population and rare disease are made by Piegorsch et al. (1994), Umbach and Weinberg (1997) and Modan et al. (2001), while CC do not need the rare disease assumption. Our analysis is exact under Assumption 1, which may hold even when the disease is not rare. As pointed out in Schmidt and Schaid (1999), the rare disease assumption is quite subtle and may not hold, for example in situations where the disease risk is much higher for the carriers of a particular gene mutation or for certain strata of the population. In the dataset I consider, the risk of ovarian cancer is known to be higher for BRCA1/2 carriers and for subjects with a family history of breast or ovarian cancer. Fortunately, the bias due to the rare disease assumption has less impact when the overall disease prevalence $P(D = 1)$ is small, even with highly penetrant genes (Schmidt and Schaid, 1999).

I do recognize that directly verifying Assumption 1 empirically could be quite difficult based on the given study at hand, as tests of independence will have little power. Many researchers have considered this issue of verifying G-E independence in the control population in the context of using this as a screening tool to validate the use of case-only analysis (Albert et al., 2001). Sensitivity analysis shows that the G-E association pattern in controls reflects the G-E association in the source population when the baseline disease risk is less than 0.1% (Gatto et al., 2004). To address this issue, in the simulations, I do consider various departures from Assumption 1, and the performance of all the methods under violation of this assumption. I advocate that when substantial uncertainty remains about the validity of the independence assumption, statistically significant results based on the proposed methods should be treated as precursors for high-priority investigations in future epidemiologic studies.


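Assuming that the first $n_0$ observations are controls and the next $n - n_0$ observations are cases, under Assumption 1, the retrospective likelihood in (4-2) reduces to

\[ L_R = \prod_{j=1}^{n_0} P(G_j \mid S_j, D_j=0)\, P(E_j \mid S_j, D_j=0) \times \prod_{j=n_0+1}^{n} P(G_j \mid E_j, S_j, D_j=1)\, P(E_j \mid S_j, D_j=1). \]

Consequently, to evaluate the likelihood contributed from control data I will need to specify probability models for $P(G \mid S, D=0)$ and $P(E \mid S, D=0)$. Following the technique first suggested by Satten and Kupper (1993), I present the following Lemmas, which will then furnish expressions for $P(G \mid S,E,D=1)$ and $P(E \mid S,D=1)$, once having the control distributions and the prospective model as in (4-1).

Lemma 1:
\[ \frac{P(G=g_m \mid E,S,D=1)}{P(G=g_m \mid E,S,D=0)} = \frac{P(D=1 \mid G=g_m,E,S)\,/\,P(D=0 \mid G=g_m,E,S)}{P(D=1 \mid E,S)\,/\,P(D=0 \mid E,S)}. \]

Lemma 2:
\[ \frac{P(D=1 \mid E,S)}{P(D=0 \mid E,S)} = \sum_{m=0}^{M} \frac{P(D=1 \mid G=g_m,E,S)}{P(D=0 \mid G=g_m,E,S)}\, P(G=g_m \mid D=0,E,S). \]

Lemma 3:
\[ \frac{P(E \mid S,D=1)}{P(E \mid S,D=0)} = \frac{P(D=1 \mid E,S)\,/\,P(D=0 \mid E,S)}{\displaystyle\int \frac{P(D=1 \mid E,S)}{P(D=0 \mid E,S)}\, P(E \mid S,D=0)\, dE}. \]

The proofs of the Lemmas are collected in Appendix B.

Remark 1: With the likelihood conditional on $S$, I do not intend to estimate the relative risks due to the stratification variables $S$ and focus only on the parameter of interest $\beta = (\beta_1, \beta_2, \beta_3)$. As I proceed, I note that under our formulation, I would tacitly avoid direct estimation of the stratum specific intercept parameters $\beta_0(S)$ which appear in the disease risk model (4-1).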
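Lemmas 1 and 2 can be combined into a small computation. The following sketch assumes the logistic model (4-1) and Assumption 1, so that the control distribution of $G$ does not depend on $E$; the function name `case_genotype_dist` and all numerical values are hypothetical. It obtains the case distribution of $G$ from the control distribution and shows, in line with Remark 1, that the stratum intercept drops out.

```python
import math

def case_genotype_dist(control_probs, e, beta0, beta1, beta2, beta3):
    """P(G=g_m | E=e, S, D=1) for m = 0..M, via Lemmas 1 and 2.

    control_probs : P(G=g_m | D=0, S), free of E under Assumption 1
    beta1, beta3  : sequences of length M+1 with first entry 0
    """
    # Disease odds P(D=1 | g_m, e, S) / P(D=0 | g_m, e, S) under the logistic model.
    odds = [math.exp(beta0 + beta1[m] + beta2 * e + beta3[m] * e)
            for m in range(len(control_probs))]
    # Lemma 2: marginal disease odds given (E, S).
    marginal_odds = sum(o * p for o, p in zip(odds, control_probs))
    # Lemma 1: rescale the control probabilities by the odds ratio.
    return [p * o / marginal_odds for o, p in zip(odds, control_probs)]

# Hypothetical inputs: 1% carriers among controls, exposure e = 5.
control = [0.99, 0.01]
kwargs = dict(e=5.0, beta1=[0.0, 3.0], beta2=-0.07, beta3=[0.0, 0.12])

dist_a = case_genotype_dist(control, beta0=-7.0, **kwargs)
dist_b = case_genotype_dist(control, beta0=-2.0, **kwargs)
print(dist_a, dist_b)  # the two lists agree up to rounding error
```

The intercept multiplies every genotype-specific odds by the same factor, so it cancels between numerator and denominator; this is why the sketch returns the same distribution for both intercepts.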


Before describing the estimation theory, I first would like to address the identifiability of the parameters in the prospective model (4-1) and the retrospective likelihood $L_R$. As stated in Prentice and Pyke (1979), if there are no assumptions made on the covariate distribution $H(g,e \mid s) = P(G=g, E=e \mid S=s)$, neither $H(\cdot,\cdot \mid s)$ nor $\beta_0(S)$ is identifiable. But $\beta$ is always identifiable under any choice of $H$. Following Lemma 1 of Roeder et al. (1996) it can be easily shown that under Assumption 1 on the covariate density, $\beta$ remains identifiable in our likelihood $L_R$.

Remark 2: I would like to point out that unlike the Prentice-Pyke result for the general nonparametric covariate density case, with an additional independence restriction on $H(\cdot,\cdot \mid s)$ in the source population (not just in the control population as stated in Assumption 1), Lemma 1 of CC proves that both the intercept and the covariate distributions are identifiable given $S=s$. For a rare disease, Assumption 1 is approximately equivalent to independence in the source population. Thus, in the rare disease case, with our formulation, by Lemma 1 of CC, I do have identifiability of the entire likelihood, not just of $\beta$.

I consider the stratification variables $S$ as a vector of $q \times 1$ categorical covariates, with the $k$th variable having $r_k$ categories or levels. Therefore, the level combinations of $S$ define $I = \prod_{k=1}^{q} r_k$ possible strata. For instance, in the Israeli ovarian cancer data I consider $q=4$ stratification variables: age group, ethnicity, PHB, FHBO, the first three having two categories each and FHBO having three categories. Therefore $S$ defines $I = 2 \times 2 \times 2 \times 3 = 24$ possible strata. For ease of notation, I will introduce $Z$, a single index variable with $I$ possible values, each value representing a distinct stratum. So for subject $j$, $Z_j$ can take exactly one of the values $1, \ldots, I$, completely determined by the observed values of the stratification variables for subject $j$, namely $S_j$. I can now rewrite the likelihood $L_R$ after replacing $S_j$ by the stratum membership


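indicator of subject $j$, namely $Z_j$:

\[ L_R = \prod_{j=1}^{n_0} \big[ P(G_j \mid Z_j, D_j=0)\, P(E_j \mid Z_j, D_j=0) \big] \prod_{j=n_0+1}^{n} \big[ P(G_j \mid E_j, Z_j, D_j=1)\, P(E_j \mid Z_j, D_j=1) \big]. \tag{4-3} \]

I consider the following model for the control distribution of the genetic factor in stratum $i$,

\[ \log \frac{P(G=g_m \mid Z=i, D=0)}{P(G=g_0 \mid Z=i, D=0)} = \gamma_{im}, \qquad m=1,\ldots,M. \tag{4-4} \]

Note that $\gamma_{i0}=0$. The above model does not assume any stringent parametric form for $P(G \mid D=0,S)$ in terms of $S$ and simply treats the probabilities in each stratum to be the model parameters, allowing complete distributional flexibility.

Result 1: Using (4-1), (4-4) and Lemma 1, I obtain the case distribution of $G$ as:

\[ P(G=g_m \mid E, Z=i, D=1) = \frac{\exp\{\beta_{1m} + \beta_{3m}E + \gamma_{im}\}}{1 + \sum_{k=1}^{M} \exp\{\beta_{1k} + \beta_{3k}E + \gamma_{ik}\}}, \qquad m=1,\ldots,M. \tag{4-5} \]

Proof of Result 1 is presented in Appendix B. Note that although in the control population, by virtue of the independence assumption, $P(G \mid E,D=0,Z=i) = P(G \mid D=0,Z=i)$, in the case population $P(G \mid E,D=1,Z=i)$ does depend on $E$.

Due to the high dimensional nature of the stratification variables $S$, it is often hard to model the effect of $S$ on the distribution of the exposure variable $E$ explicitly. I consider a flexible nonparametric Bayesian approach to model the distribution $[E \mid D=0,Z=i]$ which allows for possible stratification effects on the distribution of $E$ and does so in a data adaptive way. I consider the case when $E$ is continuous, as in the data example. The Dirichlet process mixture model (DPM) with a normal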


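kernel can be expressed in the following hierarchical structure

\[ E \mid (D=0, Z=i) \sim N(\mu_i, \sigma_i^2), \qquad \theta_i = (\mu_i, \sigma_i^2) \mid P \sim P, \qquad P \sim DP(\alpha P_0), \tag{4-6} \]

where $P$, serving as a prior on the $\theta_i$, $i=1,\ldots,I$, is itself a random probability measure. I assume that $P$ is a realization of a Dirichlet process (DP) with scalar precision parameter $\alpha > 0$ and base measure or base prior $E[P] = P_0$, which is a bivariate CDF on $\mathbb{R} \times \mathbb{R}^{+}$. A property of the DP prior is that the random probability measure $P$ is almost surely discrete, leading to the following properties which reinterpret the DPM model structure (see Antoniak, 1974 and Sethuraman, 1994 for details):

1. Any realization of $\theta_1, \ldots, \theta_I$ generated from $P$ lies in a set of $K \le I$ distinct values, denoted by $\omega = \{\omega_1, \ldots, \omega_K\}$;
2. $\omega_l$, $l=1,\ldots,K$, are a random sample from the base prior $P_0$;
3. $K \le I$ is drawn from an implicitly determined prior distribution depending on the precision parameter $\alpha$ and $I$;
4. Given $K \le I$, the $I$ values are selected from the set $\omega$ according to a uniform multinomial distribution.

The above discussion is conditional on $\alpha$ and the hyperparameters which determine $P_0$. With this hierarchical mixture prior structure for the control distribution of $E$ and the prospective logistic model (4-1), it now remains to investigate the nature of the case-distribution of $E$. The following result provides an answer.

Result 2: Assume that the $\theta_i$ take values $\omega_l$ from the set $\omega$ as described in 1. Then

\[ E \mid (Z=i, D=1, \theta_i=\omega_l) \sim \sum_{m=0}^{M} p_{ilm}\, \phi(E; \omega^{*}_{lm}), \tag{4-7} \]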


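where $\phi(\cdot\,; \omega)$ denotes the normal density with parameter vector $\omega$, $\omega_l = (\mu_l, \sigma_l^2)$, say, and $\omega^{*}_{lm} = (\mu_l + \beta_2\sigma_l^2 + \beta_{3m}\sigma_l^2,\ \sigma_l^2)$ and

\[ p_{ilm} = \frac{\exp\{\beta_{1m} + (\mu_l + \beta_2\sigma_l^2 + \beta_{3m}\sigma_l^2)^2/(2\sigma_l^2) + \gamma_{im}\}}{\sum_{k=0}^{M} \exp\{\beta_{1k} + (\mu_l + \beta_2\sigma_l^2 + \beta_{3k}\sigma_l^2)^2/(2\sigma_l^2) + \gamma_{ik}\}}. \]

Hence, the distribution of $E$ among the case population conditional on all other parameters is again a DP mixture, not with a normal kernel but with a mixture kernel given by (4-7). The exact expression of the likelihood (4-3) and the proof of Result 2 are deferred to Appendix B. I will refer to this model for $E$ as EDPM for future references.

Prior Structure: The likelihood (4-3) involves the association parameters $\beta_1$, $\beta_2$, $\beta_3$, and $\gamma_{i1}, \ldots, \gamma_{iM}$, and $\theta_i = (\mu_i, \sigma_i^2)$, $i=1,\ldots,I$. I use independent normal priors for all the association parameters and also on the $\gamma_{im}$'s, $m=1,\ldots,M$. I will note in the real data example (with only two possible values of $G$, so that $m=0,1$) that if we know a priori that the mutation is rare in the control population, and have an established genetic risk model for $P(G \mid S)$, we should select an informative prior on $\gamma_{i1}$, so that the effective range of the carrier probabilities in the control/case population for each stratum reflects the scientific guesses for these values.

It now remains to describe the hierarchical prior structure involved in the DPM model. Note that the mean of the random probability measure $P$ is $P_0$, which is a bivariate distribution, and I consider the following standard normal-inverse Gamma structure, namely, under $P_0$, $\mu_i \mid \sigma_i^2 \sim N(m_0, \tau\sigma_i^2)$, $\sigma_i^2 \sim IG(s/2, S/2)$. For computation, I used a $N(\mu_{m_0}, \sigma^2_{m_0})$ prior on $m_0$, which adds an extra layer of uncertainty in $P_0$. I use an Inverse Gamma $IG(a/2, b/2)$ prior on $\tau$. Lastly, following Escobar and West (1995), I assume a Gamma$(a, b)$ prior on the precision parameter $\alpha$. I choose the prior parameters $a, b$ in such a way that the mean of the prior distribution of $K$ is reasonably large compared to $I$ and the variance is modest. Choosing such a "further" prior is suggested in West et al. (1994). None of the full conditional distributions follows a standard distributional form and posterior inference is made by using the MCMC numerical integration technique.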
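The mixture representation in Result 2 can be checked numerically. The sketch below uses hypothetical parameter values and helper names (`normal_pdf`, `case_mixture`, `case_density`, `tilted_density`) that are assumptions introduced here. It computes the weights $p_{ilm}$ and the shifted means for one stratum, then compares the resulting mixture density with a direct calculation that multiplies the control density of $E$ by the disease odds and normalises.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def case_mixture(mu, var, beta1, beta2, beta3, gamma):
    """Weights p_ilm and shifted means of the case distribution in Result 2.

    mu, var             : (mu_l, sigma_l^2) of the control kernel for the stratum
    beta1, beta3, gamma : sequences of length M+1 with first entry 0
    Returns (weights, means); every component keeps variance `var`.
    """
    means = [mu + (beta2 + b3) * var for b3 in beta3]
    log_w = [b1 + m ** 2 / (2.0 * var) + g
             for b1, m, g in zip(beta1, means, gamma)]
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w], means

def case_density(e, mu, var, beta1, beta2, beta3, gamma):
    """Mixture density of E among cases, as in (4-7)."""
    weights, means = case_mixture(mu, var, beta1, beta2, beta3, gamma)
    return sum(p * normal_pdf(e, m, var) for p, m in zip(weights, means))

def tilted_density(e, mu, var, beta1, beta2, beta3, gamma):
    """Direct route: control density times disease odds, normalised analytically."""
    slopes = [beta2 + b3 for b3 in beta3]
    num = sum(math.exp(b1 + g + c * e) for b1, g, c in zip(beta1, gamma, slopes))
    norm = sum(math.exp(b1 + g + c * mu + 0.5 * c * c * var)
               for b1, g, c in zip(beta1, gamma, slopes))
    return normal_pdf(e, mu, var) * num / norm

# Hypothetical values for a single stratum with binary G.
args = dict(mu=2.0, var=4.0, beta1=[0.0, 3.0], beta2=-0.07,
            beta3=[0.0, 0.12], gamma=[0.0, -3.0])
weights, means = case_mixture(**args)
print(weights, means)
```

The two routes agree because multiplying a normal density by $\exp(cE)$ shifts its mean by $c\sigma^2$ and leaves the variance unchanged; the factor $\exp\{-\mu_l^2/(2\sigma_l^2)\}$ is common to all components and cancels from the weights.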


Conditional on $\theta_i$, drawing random numbers from the respective conditional distributions is a straightforward application of the Metropolis-Hastings algorithm. To update $\theta_i$, I use the "no gaps" algorithm prescribed by MacEachern and Muller (1998). I describe the computational details of the algorithm in Appendix B.

Remark 3: An interesting feature of the EDPM model is that it selects $K$, the number of distinct values in $I$ realizations from $P$ (or the cardinality of the set $\omega$), in a data adaptive way depending on the extent of stratification on the distribution of the environmental exposure. In the presence of strong stratification effects, all of the $\omega_l$ could be distinct, i.e. $K=I$; in the complete absence of stratification effects, $K=1$. Typically $K$ will lie somewhere in between. The posterior mode of $K$ thus serves as an indicator of the degree of stratification effects on the control distribution of $E$.

In the above discussion, I assumed $S$ to be a set of categorical stratification variables, which is most often the case. If any of the stratification variables is continuous, I recommend categorizing them for implementing the EDPM model.

Remark 4: Since I assume the distribution of $E$ to be a Dirichlet mixture of normals, this model applies only to continuous environmental exposures. The model can be easily extended to multiple continuous exposures, simply by taking a DPM model with multivariate normal kernel as used by Muller and Roeder (1997). I illustrate this multivariate extension in one segment of the real data analysis. For categorical exposures, the models could be adapted as shown in Seaman and Richardson (2001) by using a Dirichlet distribution as prior on the probabilities for each category. For a mixture of discrete and continuous environmental exposures one could either categorize the continuous exposure into classes or adapt the Bayesian bootstrap ideas as described in Gustafson et al. (2002). The main theme is common between all three methods: trying to model the distribution of the environmental exposure in a nonparametric way to guard against violations of model assumptions.
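The dependence of $K$ on the precision parameter, noted in Remark 3 and in the choice of the prior on $\alpha$, can be explored with a short sketch. It relies on the standard Polya-urn representation of the Dirichlet process, under which the prior mean of $K$ for $I$ draws is $\sum_{i=1}^{I} \alpha/(\alpha+i-1)$. The function names and the $\alpha$ values below are hypothetical, and this is a prior simulation only, not the "no gaps" sampler used for posterior inference.

```python
import random

def expected_num_clusters(alpha, n):
    """Prior mean of K for n draws from a DP with precision alpha:
    E[K] = sum_{i=1}^{n} alpha / (alpha + i - 1)."""
    return sum(alpha / (alpha + i) for i in range(n))

def simulate_num_clusters(alpha, n, rng):
    """One prior draw of K via the Polya-urn (Chinese restaurant) scheme."""
    counts = []  # sizes of the clusters created so far
    for i in range(n):
        # The (i+1)-th value is a new draw from P0 with probability alpha / (alpha + i).
        if rng.random() < alpha / (alpha + i):
            counts.append(1)
        else:
            # Otherwise it copies an existing value, chosen proportionally to cluster size.
            j = rng.choices(range(len(counts)), weights=counts)[0]
            counts[j] += 1
    return len(counts)

rng = random.Random(0)
n_strata = 24  # number of strata I in the ovarian cancer example
for alpha in (0.5, 2.0, 10.0):
    draws = [simulate_num_clusters(alpha, n_strata, rng) for _ in range(2000)]
    print(alpha, round(expected_num_clusters(alpha, n_strata), 2), sum(draws) / len(draws))
```

Small values of $\alpha$ concentrate the prior near $K=1$ (no stratification effect on $E$), while large values push $K$ towards $I$, which is why the prior on $\alpha$ governs how many distinct stratum-level distributions are favoured a priori.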


Remark 5: Note that, as indicated in Remark 1, via the above formulation, the nuisance parameter $\beta_0(S)$ does not present itself in the case distributions of $G$ and $E$. $\beta_0(S)$ appears as a common term in both the numerator and denominator of (4-5) and (4-7) and thus gets canceled in the ratio. Hence, the retrospective likelihood does not involve $\beta_0(S)$.

Remark 6: One could naturally think of the following parametric logistic model for modeling the distribution of $G$, instead of using the more flexible model as given in (4-4):

\[ \log \frac{P(G_j=g_m \mid D_j=0, S_j)}{P(G_j=g_0 \mid D_j=0, S_j)} = \eta_0 + \eta_m^{T} S_j, \qquad m=1,\ldots,M, \tag{4-8} \]

where $\eta_m$ is a vector of regression parameters capturing the effect of stratification variables on the incidence of the genetic susceptibility factor in the control population. CC assume a similar logistic model for $P(G \mid S)$ for their real data analysis, though they recognize that it is hard to predict BRCA1/2 carrier probabilities using this logistic structure. Indeed, when I based my inference using the model in (4-8) with normal priors on $\eta_0$ and $\eta_m$, the estimates of the parameters of interest were less accurate when compared to the ones using (4-4). Thus, for the sake of brevity I only include results where I used model (4-4) for carrier probabilities.

4.3 The Israeli Ovarian Cancer Data

In this section, I apply the proposed methodology to the data from a population-based case-control study on all ovarian cancer patients identified in Israel between March 1, 1994 and June 30, 1999 (Modan et al., 2001). Blood samples were collected from the cases and the controls in order to test for the presence of mutation in the two major breast and ovarian cancer susceptibility genes BRCA1 and BRCA2. In addition, the subjects were interviewed to collect data on reproductive/gynecological history such as parity, number of years of OC use and gynecological surgery. The main goal of the study was to examine the interplay of the BRCA1/2 genes and known reproductive/gynecological risk factors of ovarian cancer. Since the actual


data had confidentiality issues, a replica was generated by replacing only the original genetic susceptibility factor by a simulated binary genetic risk factor, retaining all the features as in the original dataset. The dataset I used contained 832 cases and 747 controls. This is a real example where OC use and BRCA1/2 mutation may appear to be correlated simply because both could be related to the stratification variables $S$ like age and family history, and it is more realistic to assume independence between these two genetic and gynecological risk factors conditional on $S$. However, it is hard to verify Assumption 1 based on this single dataset as only 7 out of the 747 controls were BRCA1/2 carriers. I ran a logistic regression of $G$ on the exposures of interest $E$ in the controls in each stratum, and though the tests of association were not significant, the sparsity of the data makes the results of these tests for association unstable and less reliable. However, Modan et al. (2001, page 236) and Chatterjee and Carroll (2005) both indicate that it is reasonable to assume that carrier status is independent of the exposures under consideration, namely parity and number of years of OC use, and I also employ this assumption in the analysis.

It is known that the risk of ovarian cancer is higher for certain strata (for example, for the subgroup with family history of both breast and ovarian cancer) as well as for BRCA1/2 carriers, so the rare disease assumption may not hold for all levels of the genetic factor or for certain subgroups. However, Modan et al. (2001) reported only 1326 cases of epithelial ovarian cancer during the five-year study period with a baseline population of approximately 1.5 million, suggesting an empirical estimate of disease prevalence $P(D=1) = 8.7 \times 10^{-4}$, and hence that the odds-ratio estimates obtained through the analysis under Assumption 1 will provide adequate approximations to the ones obtained via exact analysis using G-E independence in the source population.

All analyses are carried out conditional on four stratification variables: $S$ = (age group (= 0 if age < 50 years and = 1 if age $\ge$ 50 years), ethnicity (= 1 for Ashkenazi


Jews and 0 otherwise), presence or absence of a personal history of breast cancer (PHB = 1 if present and 0 if absent) and a family history of breast or ovarian cancer (FHBO = 0 if no history, 1 if one breast cancer case in family and 2 if ovarian cancer or two or more breast cancer cases in the family)). So the total number of strata defined by the level combinations of $S$ is $I=24$.

I analyze the data using the EDPM method as described in the previous section. For modeling the distribution of the genetic factor, I use (4-4). The genetic factor $G$ is binary with $G=0$ for absence of any BRCA1/2 mutation and $G=1$ for carrying at least one BRCA1/2 mutation. It is well known that BRCA1/2 mutations are very rare among ovarian cancer controls, and as Modan et al. (2001) pointed out, traditional logistic regression analysis would yield imprecise estimates of parameters of interest. Compounding the sparsity is the fact that I do have a relatively large number of strata defined by $S$ and as a result, estimation of genotype frequencies individually in each stratum would be imprecise in a classical set-up. CC adopt a parametric logistic model for $P(G \mid S)$ to circumvent this problem, which is also not satisfactory. In a Bayesian paradigm I effectively use the prior knowledge on BRCA1/2 carrier probabilities in different age groups, ancestry and with varying levels of family history based on genetic algorithms (BRCAPRO: Parmigiani et al., 1998; BOADICEA: Antoniou et al., 2004) and empirical data (Couch et al., 1997; Risch et al., 2001). I allow uncertainty in these predictions by allowing the informative prior on $\gamma_{i1}$ to vary around the scientific guesses and in this process relax the stringent logistic assumption. The effective range of prior probabilities for $P(G=1 \mid S, D=0)$ typically varied from $10^{-1}$ to $10^{-4}$ across different strata.

I present two analyses, the first with OC use as the only environmental exposure $E$ as a direct illustration of the methods formulated in Section 4.2. With a binary $G$, there are three parameters of interest involved in the disease risk model (4-1):


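$\beta_{11} = \beta_1$, $\beta_2$, and $\beta_{31} = \beta_3$:

\[ \operatorname{logit} P(D=1 \mid G,E,S) = \beta_0(S) + \beta_1 I[G=1] + \beta_2\, OC + \beta_3\, I[G=1] \times OC. \]

For each of $\beta_1$, $\beta_2$, $\beta_3$, I use a $N(\,\cdot\,, 16)$ prior. Since scientific theory suggests a high positive value of $\beta_1$, one could also select a sharper prior for $\beta_1$. For the EDPM model as described in (4-6), under the base-measure $P_0$, I assume that the variance component $\sigma^2 \sim IG(\,\cdot\,, 1)$ and $\mu \mid \sigma^2 \sim N(m_0, \tau\sigma^2)$. The exposure variable, number of years of OC use, typically ranges from 0 to 20 years. I chose a diffuse prior on $m_0$, namely, $m_0 \sim N(\,\cdot\,, 9)$. I use an $IG(\,\cdot\,, 1)$ prior on $\tau$. Choosing priors for $\alpha$ is a challenging task as $\alpha$ has the dual role of capturing the degree of faith in the base measure, as well as determining the number of distinct values of $\theta$. As prescribed by Escobar and West (1995) I choose a Gamma prior on $\alpha$ which allows for prior probabilities for larger values of $K \le I = 24$. I experimented with various choices of the shape and scale parameters of the Gamma prior, and the results are presented for a Gamma$(\,\cdot\,, 1)$ prior on $\alpha$. A detailed algorithm for resampling from the full conditionals is collected in the appendix.

For comparison purposes, I also analyzed this data with a parametric model, largely targeted towards this dataset. As the data contain 832 cases and 747 controls of which 678 cases and 586 controls did not use oral contraceptives at all, I used a zero-inflated model (EZIM) for the control distribution of OC use. For individual $j$, I consider $p_j$ as the probability of non-exposure ($E_j=0$), and with probability $(1-p_j)$, the exposure values follow $N(\mu_j, \sigma^2)$, where $\mu_j = \delta_0 + \delta_1^{T} S_j$. The mixing probabilities are also modeled through the four observed stratification factors, $\operatorname{logit}(p_j) = \kappa_0 + \kappa_1^{T} S_j$. The case distributions can be obtained as mixture distributions via Lemmas 1-3. For the EZIM model, I consider mutually independent $N(\,\cdot\,, 16)$ priors for the regression parameters $\beta_1$, $\beta_2$ and $\beta_3$, as well as on $\delta_0$, $\kappa_0$ and each component of $\delta_1$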


and $\kappa_1$. For the scale parameter $\sigma^2$, I use an $IG(\,\cdot\,, 1)$ prior. Posterior inference is again based on MCMC samples from the full conditional distribution of the parameters. I analyzed this data through the method proposed by CC and the case-only method after adjusting for the covariates $S$. The case-only method only furnishes an estimate of the BRCA1/2*OC interaction parameter $\beta_3$.

The results are presented in Table 4-1. There is little in the way of differences for estimation of $\beta_2$ and $\beta_3$ by all the four methods which use G-E independence. But for estimating the main effect of $G$ as measured by $\beta_1$, the Bayesian methods have much smaller posterior standard deviation and narrower HPD interval compared to the standard error and the CI for the estimate of $\beta_1$ in the CC method. The results indicate that the standard logistic assumption is less likely to hold for $P(G \mid S)$ in this dataset, and that the more flexible model for $G$ as given in (4-4), boosted with the scientifically validated priors, adapts itself more naturally to the features of the data. Interestingly, the non-parametric EDPM model for OC use performs quite comparably with the parametric zero-inflated model which is designed specifically to capture the distribution of OC use. I also analyzed the data by ordinary logistic regression analysis, which does not exploit G-E independence in any manner. The wider confidence intervals, especially for the interaction parameter, indicate that any method using G-E independence is able to estimate the interaction parameter more precisely. Whereas all the other four methods declare G-E interaction to be statistically significant, the ordinary logistic model cannot detect significance.

In summarizing the results, I first observe that for women who never used OC ($E=0$), there is an almost astronomic increase in risk of ovarian cancer for a BRCA1/2 mutation carrier. The estimated odds ratio by the EDPM method is $\exp(3.75) = 42.52$. On the other hand, among non-carriers, longer use of OC is related to a decrease in disease risk with associated odds ratio $\exp(-0.0748) \approx 0.93$. However, the estimate


of the interaction parameter $\beta_3$ suggests that among BRCA1/2 carriers, the risk of ovarian cancer increases slightly with OC use, with an odds ratio $\exp(-0.0748) \times \exp(0.1091) = 1.03$. The precision estimates and the credible intervals all indicate that the main effect of BRCA1/2 and the BRCA-OC interaction are statistically significant whereas the main effect of OC use is only marginally significant. Figures 4-1 and 4-2 present plots of the posterior distribution for $\beta_1$, $\beta_2$, $\beta_3$ and also for $\alpha$ and $K$ for the EDPM method. To explore the degree of stratification, I also present a plot of $\mathrm{var}(\mu_i)$ and $\mathrm{var}(\sigma_i)$ in the EDPM model ($i=1,\ldots,24$). I notice that both the $\mu_i$'s and the $\sigma_i$'s do vary across strata, though not to the same degree. The posterior mode of $K$ is at 5 with posterior mean around 5.76, suggesting that though there are 24 possible strata, not all of them have distinct effects on the distribution of number of years of OC use.

I present another analysis with OC and parity both considered as environmental exposures. I omit the details corresponding to this analysis and only collect the results in Table 4-2. I note that for women with parity = 0 and OC = 0, BRCA1/2 mutation is associated with a huge increase in risk of ovarian cancer. Among BRCA1/2 non-carriers, higher parity is associated with decreased risk of ovarian cancer. The parity*BRCA1/2 interaction estimate suggests that the decrease in risk of ovarian cancer associated with increased parity is modestly larger for carriers than for non-carriers, but this difference is not statistically significant.

Since for a real dataset the true state of the parameters is unknown and it is not really possible to compare the methods, I conduct an extensive simulation study to assess the performances of the methods over a range of different scenarios and provide recommendations for the practitioner.

4.4 Simulation

In order to simulate a dataset for comparing the Bayesian methods along with the method proposed by CC, case-only analysis and ordinary logistic regression, I used


the ovarian cancer data as a prototype to elicit realistic true values of the parameters. I set the true values close to the results I obtained in the analysis of real data by the EDPM method in Table 4-1, i.e., $\beta_1 = 3$, $\beta_2 = -0.07$, and $\beta_3 = 0.12$. I generated 1500 observations following the scheme below:

1. I started with generating $S$ = (age group, ethnicity, PHB, FHBO) from a multinomial distribution, where the stratum probabilities are consistent with the real study.
2. Given $S$, I generated a binary variable $D$ representing the disease status, with probabilities $P(D=1 \mid S)$ in agreement with the ovarian cancer study, the marginal disease probability in the generated population being around 0.1%. I also experimented with several other choices of $P(D=1)$, for which the results are not included.
3. I generated the environmental exposure $E$ from two distributions: (i) A zero-inflated model exactly mimicking the exposure OC use as in the real dataset. The true values of all associated parameters were chosen as the estimates obtained from the real data when analyzed by the EZIM model. (ii) Mixture of two normal distributions: To deviate from the exact pattern of real data and to put the nonparametric and parametric methods to test, I considered the case when $[E \mid D=0, Z=i]$ comes from the following mixture: $0.5\,N(\,\cdot\,, 1) + 0.5\,N(\,\cdot\,, 1)$.
4. Finally, I generated a binary variable $G$ standing for BRCA1/2 mutation status using the probability structure $P(G \mid D,E,Z)$ as given in (4-4) and (4-5). I select the true values for $\gamma_{i1}$ in such a way that $\Pr(G=1 \mid D=0) \approx 3.3\%$ and $\Pr(G=1 \mid D=0) \approx 46.9\%$ to represent the two situations with a moderately rare and a common genetic mutation respectively. I also provide one set of simulations when $G$ was generated from the parametric logistic regression model as in (4-8) (Table 4-4).
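Step 3(i) of the scheme above can be sketched as follows. This is only an illustration of a zero-inflated normal draw: the function name `draw_zero_inflated`, the seed, and the values of `mu` and `sigma` are assumptions made here, and the share of zeros is set near the 586 of 747 controls reported as never-users rather than to the EZIM estimates actually used in the study, which are not reproduced in this section. Stratum-specific parameters are also ignored.

```python
import random

def draw_zero_inflated(p_zero, mu, sigma, rng):
    """One exposure value: 0 with probability p_zero, otherwise N(mu, sigma^2)."""
    if rng.random() < p_zero:
        return 0.0
    return rng.gauss(mu, sigma)

rng = random.Random(2024)

# Hypothetical settings: p_zero mirrors the share of controls with no OC use;
# mu and sigma are made-up values, not EZIM estimates.
p_zero, mu, sigma = 586 / 747, 5.0, 2.0

sample = [draw_zero_inflated(p_zero, mu, sigma, rng) for _ in range(20000)]
share_zero = sum(1 for x in sample if x == 0.0) / len(sample)
print(round(share_zero, 3))
```

In a fuller version, `p_zero` and `mu` would vary with the stratum through the logistic and linear predictors of the EZIM model.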


Apart from the above set-up which assumes G-E independence, to test the robustness of our model under violations of this assumption, I simulate $G$ using the model

\[ \log \frac{\Pr(G=1 \mid Z=i, E, D=0)}{\Pr(G=0 \mid Z=i, E, D=0)} = \gamma_{i1} + \theta_E E. \tag{4-9} \]

To vary the degree of dependence I consider two choices $\theta_E = 0.1$ and $\theta_E = 0.25$, that is, the odds of having $G=1$ with one unit increase in $E$ increases by a factor of 1.105 and 1.284 respectively. Results for only $\theta_E = 0.25$ are included in the text (Tables 4-3, 4-4 and 4-5). The strategies I followed for choosing priors for the Bayesian methods in the simulation study are essentially the same as discussed in the real data analysis. I replicated the simulation 100 times and calculated MSE based on these 100 estimates. The results are given in Tables 4-3, 4-4 and 4-5.

The simulation results are fairly clear. If interest lies in estimating the main effect of the genetic factor $\beta_1$, the Bayesian EDPM model performs the best for any choice of distributions of $G$ and $E$. The fully parametric Bayesian EZIM model suffers when $E$ is originated from any other model, for example the mixture of two normal distributions (Tables 4-4 and 4-5). When the parametric logistic assumption for $P(G \mid S)$ does not hold, there is a clear dominance of the Bayesian methods over the CC method for estimating $\beta_1$. Even when the data is generated from an exactly logistic model for $P(G \mid S)$ (Table 4-4), the Bayesian methods perform quite comparably with the CC method. The efficiency gain for estimating $\beta_1$ in Bayesian methods is larger when the genetic mutation rarely occurs in the control population (Tables 4-3 and 4-5), which could be due to the flexibility of the likelihood in modeling the control distributions separately in the Bayesian methods, whereas CC model the marginal distribution of $G \mid S$. If interest lies in estimating the main effect of $E$, both the CC method and the EDPM method are comparable, with the CC method having a slight edge in some cases. One may note that the MSE corresponding to


$\beta_2$ for the EDPM model is often larger than for the other methods, as with the DPM structure I am adding another level of model uncertainty. Indeed, the advantage of the DPM is not in terms of gain in efficiency for estimating $\beta_2$ across all scenarios, but because of its robustness. One may note that instead of modeling $P(E \mid S)$, CC model $P(E,S)$ nonparametrically. Their profile likelihood technique works extremely well across many different data generating mechanisms for $E$. On a minor note, in a small proportion of times, there does appear to be a problem with the convergence of their estimation algorithm which appears to be related to the choice of starting values. I excluded those runs when presenting the final tables for our simulation. For estimating the G-E interaction $\beta_3$, one could choose either the case-only, EDPM or the CC method. When simultaneous estimation of all three parameters is considered, and Assumption 1 is fairly reasonable, the EDPM model appears to be a superior choice.

Under violation of the independence assumption, the performance of all the methods worsens (Tables 4-3, 4-4 and 4-5), but the loss of efficiency appears to be the least for the EDPM model. The ordinary logistic regression model, which is least efficient under G-E independence, especially for the interaction parameter, does not lose much efficiency under violation of G-E independence as it does not impose any restrictions on the G-E distribution.

4.5 Discussion

Epidemiologists have long grappled with the issue of how to measure interaction in a biologically meaningful way and there is still no consensus in the literature (Botto and Khoury, 2001). One must recall that the statistical interaction parameter $\beta_3$ as in this chapter has a very specialized meaning which is related to the general notion of "interaction" in the scientific community only in a vague way (Cox, 1984). No "statistical interaction" in our model means constant multiplicative effect of genotype on the disease odds across all levels of the environmental exposure. A biologist might define "interaction" in a broader mechanistic sense that interaction


exists if the genetic factor and environmental exposure work on the same pathway (Brennan, 2002; Clayton and McKeigue, 2001). Assessing the joint effects of genetic and environmental factors within strata defined by other variables may provide useful insight into disease etiology and help to determine effective public health intervention strategies. The article by CC is thus a major breakthrough which emphasizes that case-control studies of "gene-environment interaction" go well beyond estimating the statistical interaction parameter $\beta_3$ and any design or analysis strategy should allow one to estimate other different parameters of interest, and should not only be targeted towards estimation of $\beta_3$. However, as emphasized throughout this chapter, I recommend extremely cautious use of the independence assumption. Scientific and empirical validation of this assumption is of utmost importance while using the proposed methods.

To conclude, I would like to highlight some of the new features of the chapter. In this chapter I proposed a fully flexible, robust Bayesian semiparametric model for estimating not only the interaction parameter, but the main effects under gene-environment independence in a stratified control population. The method outperforms the existing methods in many instances and performs comparably in others. With genetic mutation which has unequal frequencies in case and control population, the ability to model them separately through the proposed likelihood has a natural justification. When the G-E independence assumption does not hold, the method performs better when compared to other contenders. This chapter not only addresses an important problem in modern epidemiology, it also introduces some interesting statistical techniques especially for handling the high-dimensional stratum effects on the genetic and environmental exposure distribution in a data adaptive way. The use of the DPM model as illustrated in Result 2 in conjunction with transition from control to case distribution is a nice application of the theory on DP. Using prior


biological information on the frequencies of the genetic mutation reiterates the fundamental advantage of following a Bayesian paradigm. The simulation study is an additional asset of this chapter, comparing the Bayesian methods with the commonly used frequentist methods and the recently proposed method by CC.

How to handle misclassification of $G$ and measurement error in $E$ will be discussed in the next chapter. The ascertainment bias due to different control selection mechanisms in the above framework remains a topic for future research.


Table 4-1: Analysis of Israeli ovarian cancer data by all five methods, considering OC use as the only environmental exposure, with 95% HPD and confidence intervals

Model             Quantity     $\beta_1$          $\beta_2$            $\beta_3$           $\alpha$            $K$
EZIM              Estimate     3.7832             -0.0527              0.0910
                  post. stdev  0.1317             0.0243               0.0326
                  HPD          (3.4641, 3.9764)   (-0.1265, -0.0140)   (0.0270, 0.1482)
EDPM              Estimate     3.7537             -0.0748              0.1091              14.7455             5.7630
                  post. stdev  0.1294             0.0303               0.0352              5.8361              1.8836
                  HPD          (3.4358, 3.9310)   (-0.1409, -0.0151)   (0.0364, 0.1791)    (5.8913, 28.5666)   (?, 10)
CC                Estimate     3.6323             -0.0624              0.1110
                  std. error   0.3999             0.0266               0.0341
                  CI           (2.8485, 4.4161)   (-0.1145, -0.0103)   (0.0442, 0.1778)
Ordinary Logistic Estimate     3.7710             -0.0642              0.0476
                  std. error   0.4407             0.0268               0.0999
                  CI           (2.9072, 4.6348)   (-0.1167, -0.0117)   (-0.1482, 0.2434)
Case-Only         Estimate                                             0.0924
                  std. error                                           0.0329
                  CI                                                   (0.0279, 0.1569)
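The odds ratios quoted in Section 4.3 and the frequentist intervals in Table 4-1 can be recomputed from the tabulated values. The sketch below assumes that the CC intervals are Wald intervals of the form estimate ± 1.96 × standard error, which is consistent with the tabulated numbers but is not stated explicitly in the text; the variable names and the helper `wald_ci` are illustrative.

```python
import math

# EDPM point estimates from Table 4-1.
beta1, beta2, beta3 = 3.7537, -0.0748, 0.1091

# Carrier versus non-carrier odds ratio among never-users of OC (E = 0);
# the text evaluates this at the rounded value 3.75.
or_carrier_no_oc = math.exp(3.75)
# Odds ratio per additional year of OC use among non-carriers.
or_oc_noncarrier = math.exp(beta2)
# Odds ratio per additional year of OC use among carriers.
or_oc_carrier = math.exp(beta2 + beta3)

def wald_ci(estimate, std_error, z=1.96):
    """Approximate 95% confidence interval: estimate +/- z * std_error."""
    return (estimate - z * std_error, estimate + z * std_error)

# CC row for beta_1 in Table 4-1: estimate 3.6323, standard error 0.3999.
cc_low, cc_high = wald_ci(3.6323, 0.3999)

print(round(or_carrier_no_oc, 2), round(or_oc_noncarrier, 3), round(or_oc_carrier, 3))
print(round(cc_low, 4), round(cc_high, 4))
```

The interval for the CC estimate of $\beta_1$ is roughly three times as wide as the corresponding EDPM HPD interval, which is the precision gain discussed in the text.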


Table 4-2: Analysis of Israeli ovarian cancer data by all five methods, considering both OC use and parity as environmental exposures, with 95% HPD and confidence intervals

Model             Quantity     G                  OC                   Parity               OC*G                Parity*G
EZIM              Estimate     3.7877             -0.0829              -0.0369              0.1566              -0.0781
                  post. stdev  0.1573             0.0272               0.0304               0.0366              0.0427
                  HPD          (3.4937, 4.0909)   (-0.1491, -0.0379)   (-0.0864, 0.0119)    (0.0765, 0.2230)    (-0.1686, 0.0026)
EDPM              Estimate     3.8808             -0.0631              -0.0404              0.1360              -0.1072
                  post. stdev  0.1566             0.0202               0.0311               0.0331              0.0501
                  HPD          (3.5748, 4.1713)   (-0.1034, -0.0224)   (-0.0947, 0.0260)    (0.0823, 0.2123)    (-0.2207, 0.0031)
CC                Estimate     3.8961             -0.0620              -0.0599              0.1128              -0.1041
                  std. error   0.4297             0.0267               0.0320               0.0344              0.0599
                  CI           (3.0539, 4.7383)   (-0.1143, -0.0097)   (-0.1227, 0.0029)    (0.0454, 0.1802)    (-0.2214, 0.0133)
Ordinary Logistic Estimate     4.7321             -0.0582              -0.0388              0.0292              -0.3869
                  std. error   0.7411             0.0263               0.0317               0.1080              0.1481
                  CI           (3.2795, 6.1847)   (-0.1097, -0.0067)   (-0.1009, 0.0233)    (-0.1825, 0.2409)   (-0.6772, -0.0966)
Case-Only         Estimate                                                                  0.0931              -0.0565
                  std. error                                                                0.0331              0.0591
                  CI                                                                        (0.0283, 0.1579)    (-0.1724, 0.0594)


Table 4-3: Simulation scenarios: $E$ is Zero-Inflated; $G$: rare or common; G-E independence assumption holds ($\theta_E = 0$) or does not hold ($\theta_E = 0.25$). Mean denotes the mean estimate based on 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.

G        $\theta_E$   Model       Quantity   $\beta_1$   $\beta_2$   $\beta_3$
                                  True       3.0000      -0.0700     0.1200
Rare     0            EZIM        Mean       2.9777      -0.0623     0.1132
                                  MSE        0.0196      0.0006      0.0015
                      EDPM        Mean       2.9315      -0.0615     0.1140
                                  MSE        0.0242      0.0011      0.0016
                      CC          Mean       2.9013      -0.0630     0.1152
                                  MSE        0.2633      0.0006      0.0015
                      Ordinary    Mean       2.9053      -0.0632     0.1877
                                  MSE        0.3743      0.0006      0.0619
                      Case-Only   Mean                               0.1196
                                  MSE                                0.0017
Common   0            EZIM        Mean       2.9838      -0.0811     0.1282
                                  MSE        0.0099      0.0016      0.0014
                      EDPM        Mean       2.9722      -0.0834     0.1260
                                  MSE        0.0108      0.0018      0.0016
                      CC          Mean       2.7742      -0.0787     0.1239
                                  MSE        0.0805      0.0015      0.0012
                      Ordinary    Mean       2.8179      -0.0775     0.1214
                                  MSE        0.0663      0.0016      0.0026
                      Case-Only   Mean                               0.1296
                                  MSE                                0.0014
Common   0.25         EZIM        Mean       2.8643      -0.2752     0.3368
                                  MSE        0.0313      0.0489      0.0536
                      EDPM        Mean       2.8960      -0.1465     0.2119
                                  MSE        0.0218      0.0064      0.0090
                      CC          Mean       2.4190      -0.3116     0.3723
                                  MSE        0.3580      0.0645      0.0695
                      Ordinary    Mean       2.8006      -0.2586     0.1786
                                  MSE        0.0637      0.0426      0.0109
                      Case-Only   Mean                               0.3950
                                  MSE                                0.0822
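The Mean and MSE summaries reported in the simulation tables can be computed as in the sketch below. It assumes that MSE is the average squared deviation of the replicate estimates from the true value, as the caption suggests; the replicate values and the function name `mean_and_mse` are hypothetical and are not output of the study.

```python
def mean_and_mse(estimates, true_value):
    """Mean estimate and mean squared error over simulation replicates."""
    n = len(estimates)
    mean = sum(estimates) / n
    mse = sum((est - true_value) ** 2 for est in estimates) / n
    return mean, mse

# Hypothetical replicate estimates of beta_1 (true value 3.0).
true_beta1 = 3.0
replicates = [2.91, 3.08, 2.97, 3.15, 2.84]

mean, mse = mean_and_mse(replicates, true_beta1)

# The MSE splits into squared bias plus the (population) variance of the estimates.
bias_sq = (mean - true_beta1) ** 2
variance = sum((est - mean) ** 2 for est in replicates) / len(replicates)
print(mean, mse, bias_sq + variance)
```

Read this way, the CC entry for $\beta_1$ under $\theta_E = 0.25$ in Table 4-3 has squared bias $(2.4190 - 3)^2 \approx 0.338$ out of a reported MSE of 0.3580, so most of its error under violation of independence would be bias rather than variance.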


Table 4-4: Simulation scenarios: $E$: Mixture of two normals; $G$: with parametric logistic in terms of $S$ as in (4-8) or commonly prevalent as in (4-4); G-E independence holds ($\theta_E = 0$) or does not hold ($\theta_E = 0.25$). Mean denotes the mean estimate based on 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.

G model              $\theta_E$   Model       Quantity   $\beta_1$   $\beta_2$   $\beta_3$
                                              True       3.0000      -0.0700     0.1200
Generated by (4-8)   0            EZIM        Mean       3.0000      -0.0447     0.1323
                                              MSE        0.0685      0.0798      0.0092
                                  EDPM        Mean       2.9880      -0.0710     0.1290
                                              MSE        0.0588      0.0022      0.0022
                                  CC          Mean       3.0185      -0.0782     0.1284
                                              MSE        0.0563      0.0015      0.0019
                                  Ordinary    Mean       3.0157      -0.0793     0.1323
                                              MSE        0.1282      0.0017      0.0080
                                  Case-Only   Mean                               0.1289
                                              MSE                                0.0018
Generated by (4-4)   0            EZIM        Mean       2.8213      -0.1227     0.1352
                                              MSE        0.0363      0.0886      0.0076
                                  EDPM        Mean       2.9870      -0.0695     0.1189
                                              MSE        0.0303      0.0023      0.0021
                                  CC          Mean       2.7542      -0.0713     0.1249
                                              MSE        0.1098      0.0017      0.0023
                                  Ordinary    Mean       2.7683      -0.0722     0.1328
                                              MSE        0.1856      0.0019      0.0087
                                  Case-Only   Mean                               0.1251
                                              MSE                                0.0023
Generated by (4-4)   0.25         EZIM        Mean       2.8581      -0.3114     0.3778
                                              MSE        0.1568      0.2127      0.1886
                                  EDPM        Mean       2.8858      -0.2692     0.3473
                                              MSE        0.0449      0.0442      0.0549
                                  CC          Mean       1.9287      -0.2984     0.3682
                                              MSE        0.8899      0.0563      0.0652
                                  Ordinary    Mean       2.7678      -0.2509     0.1519
                                              MSE        0.1368      0.0374      0.0076
                                  Case-Only   Mean                               0.3212
                                              MSE                                0.0885


Table 4-5: Simulation scenarios: $E$: Mixture of two normals; $G$: rarely prevalent; G-E independence holds ($\theta_E = 0$) or does not hold ($\theta_E = 0.25$). Mean denotes the mean estimate based on 100 replications, whereas MSE is the estimated mean squared error based on 100 replications.

G      $\theta_E$   Model ($E \mid D=0, Z$)   Quantity     $\beta_1$   $\beta_2$   $\beta_3$
                                              True value   3.0000      -0.0700     0.1200
Rare   0            EZIM                      Mean         2.9169      -0.0437     0.1587
                                              MSE          0.0822      0.0311      0.0209
                    EDPM                      Mean         2.9294      -0.0674     0.1296
                                              MSE          0.0670      0.0016      0.0039
                    CC                        Mean         2.8525      -0.0725     0.1335
                                              MSE          0.2223      0.0009      0.0047
                    Ordinary                  Mean         2.8732      -0.0715     0.1450
                                              MSE          0.8280      0.0010      0.0605
                    Case-Only                 Mean                                 0.1340
                                              MSE                                  0.0045
Rare   0.25         EZIM                      Mean         3.0717      -0.2041     0.3076
                                              MSE          0.1688      0.1934      0.0481
                    EDPM                      Mean         3.1533      -0.1324     0.2990
                                              MSE          0.0965      0.0056      0.0341
                    CC                        Mean         2.0735      -0.1444     0.3143
                                              MSE          0.8476      0.0066      0.0398
                    Ordinary                  Mean         3.1541      -0.1393     0.0452
                                              MSE          0.7112      0.0059      0.0432
                    Case-Only                 Mean                                 0.3505
                                              MSE                                  0.0559


Figure 4-1: Real data analyzed with EDPM model by considering OC use as an environmental exposure: Histogram of last 5000 MCMC values for the main effects and interaction parameter with overlaid smoothed kernel density.


Figure 4-2: Details of DPM model by considering OC use as an environmental exposure: Histogram corresponding to approximate posterior distribution of $\alpha$ and $K$ in the DPM model. Also plotted are histograms of variances of the $\mu_i$'s and $\sigma_i$ ($i=1,\ldots,24$), calculated for each of the last 5000 MCMC runs.


CHAPTER 5
ACCOUNTING FOR ERROR DUE TO MISCLASSIFICATION OF EXPOSURES IN CASE-CONTROL STUDIES OF GENE-ENVIRONMENT INTERACTION

5.1 Introduction

Measurement error in exposure assessment is one of the major sources of bias in epidemiological studies. When ignored, these errors bias point and interval estimates of effect, and invalidate p-values of hypothesis tests. Often, although not always, the bias is towards the null value, underestimating the true exposure-disease relationship, and there can be a substantial loss of power in hypothesis tests. The pervasiveness and extensiveness of these exposure measurement and misclassification errors in epidemiologic research may explain much of the inconsistent and inconclusive results currently reported in the literature.

Bashir and Duffy (1997) provided a general review of epidemiological methods for dealing specifically with measurement error and misclassification. Greenland and Kleinbaum (1983) proposed a simple two-stage procedure for estimating the odds ratio in matched pairs, with a corrected variance estimator developed later by Greenland (1989). Rice and Holmans (2003) obtained analytic formulae for estimates of genotypic relative risks in terms of the genotyping error probability in analysis of unmatched case-control studies with a single binary genetic factor as exposure. Later, Rice (2003) proposed a full likelihood-based approach to obtain estimates and confidence intervals for the parameters of interest in the presence of misclassification of a binary exposure in matched case-control studies. However, much of the discussion on the effects of misclassification of exposure has focused on the impact on the relative risk and/or sample size in studies of a single factor. In contrast, less attention

PAGE 101

87 hasbeengiventotheinuenceofmisclassicationontheassessmentofinteractionsbetweentwoormorefactors.AsalreadyindicatedinChapter 4 ,oneofthemajorgoalsinmanyrecentepidemi-ologicalstudieshasbeentoinvestigatetheeectofgenesonadisease,incombinationwithenvironmentalexposures.Incase-controlstudiesofgene-environmentassocia-tionwithdisease,whengeneticandenvironmentalexposurescanbeassumedtobeindependentintheunderlyingpopulation,onemayexploittheindependenceinordertoderivemoreecientestimationtechniquesthanthetraditionallogisticregressionanalysisChatterjeeandCarroll,2005.Garcia-Closasetal.998showedthatun-derasetofconditionsoftensatisedinstudiesofgene-environmentinteractions,bothdierentialandnondierentialmisclassicationofbinaryenvironmentalfactorsbiasamultiplicativeinteractioneecttowardthenullvalue.Garcia-Closasetal.999proposedasimpleapproachtoassesstheimpactofmisclassicationonbiasintheestimationofmultiplicativeoradditiveinteractionsandonsamplesizerequirements.Theypointedoutthatundermisclassicationofexposures,increasedsamplesizeisneededtoattainthesamepowertodetecttheattenuatedinteraction.ThefocusofGarcia-Closasetal.999wasprimarilyonthestudydesignissuesundermisclas-sication,anddidnotproposecorrectedestimatesoftheparametersofinterest,orinferentialadjustments,ifinfact,misclassicationispresentinthedata.Thecurrentchapterdescribesarelativelysimpleapproachtoadjustthees-timationoftheparametersofinterestincase-controlstudiesofgene-environmentinteractioninthepresenceofmisclassication.TheproposedmethodexploitsG-Eindependenceassumption,andobtainscorrectedparameterestimatesofallthepa-rametersofinterest,andnotjusttheinteractionoddsratio.Iconsideraunmatchedcase-controlset-up,adaptandextendtheworkofRiceandHolmans003tothe

PAGE 102

88 situationwhenonehasabinarygeneticriskfactorG,abinaryenvironmentalex-posureE,andbotharepotentiallysubjecttomisclassication,whereIusethetraditionalformulationintermsofoddsratiosina24table.InSection 5.2 ,Ifocusonanunmatchedcase-controlsamplingdesign.Iformu-latethemaximumlikelihoodestimationMLEproblemundertheG-Eindependenceassumptionandobtainparameterestimatesunderthisadditionalrestriction.Irstmakeararediseaseassumption,inwhichcase,Icanobtainclosed-formexpressionforparameterestimates.Ipointoutthattheestimateofthegene-environmentin-teractionparameterobtainedbythisapproach,asexpected,isexactlyidenticaltotheestimateobtainedbythepopularcase-onlyapproachPiergorschetal.,1994.However,withknowledgeofthemarginalprevalenceofthediseaseinthepopulationPD=1,Icanrelaxtherarediseaseassumption.Inthelattersituation,IcanalsoobtainconstrainedMLEs.Thoughthecorrespondingscoreequationsdonothaveexplicitclosedformsolutions,numericalevaluationisextremelystraightforward.Afterthispreliminaryformulationwithaperfectlymeasureddataset,Idelveintotheissueofadjustmentsforthepresenceofmisclassication.Inthepresenceofmisclassication,IadjusttheMLEsbasedonthesensitivityandspecicityofthemeasuringinstrumentsforgeneticandenvironmentalfactors.Teststatisticsandcondenceintervalsareformulatedasinanystandardlikelihood-basedinferenceusingtheasymptoticdistributionoftheMLE,oncetheadjustmentsaremade.Infact,asmisclassicationerrorratesgotozero,theestimatesreducetothestandardMLEsfordatarecordedwithoutmisclassication.Ialsoprovidesamplesizecalculationandpowercomparisonsfortheproposedmethods.IassumethroughoutthatIhavebeengivenxedvaluesofsensitivityandspecicityofthemeasurementdevicesandmakenoattempttoestimatetheminthischapter.RiceandHolmans03proposedtouseanexternalvalidationstudytoestimatethesensitivityandspecicityparameterswhichcouldalsobeadaptedinthiscase.Itmaybenotedthatincontrasttothe

PAGE 103

89 centralthemeofthisdissertation,IadoptpurelyclassicalmaximumlikelihoodbasedtechniquesinthisparticularchapterinsteadofaBayesianroute,asthemaximumlikelihoodcomputationsareveryecientandcompactinthisparticularunmatchedframeworkwithtwobinaryexposures.Manygeneticandenvironmentalfactorsinepidemiologicalstudiesareliabletomismeasurement,leadingtoincorrectndings.EortstoimprovetheaccuracyinmeasuringbothgeneticandenvironmentalfactorsarecriticalforthevalidassessmentofG-Einteractionsaswellastheirmaineectsincase-controlstudies.Simulationstudiesshowthatmycorrectedinferencegainsubstantiallyintermsofeciency,andreducebiaswhencomparedtotheunadjustedones.Also,myproposedadjustedapproachesincreasethepowerfortestingtheinteractioneect,andthusdecreasethesamplesizerequiredtoattainadesignatedpowerlevel.Therestofthechapterisorganizedasfollows.InSection 5.2 Idiscussunmatchedcase-controlstudies,whereIrstpresentthemodels,likelihoodsandassumptionsintheabsenceofmisclassicationanddiscussmaximumlikelihoodestimation.Ithenintroduceadjustmentsforthetrueparametersinthepresenceofmisclassication.Ipointouttheequivalenceofcase-onlymethodandmymethodintermsofestimatingtheinteractionparameter,andthesubsequentadjustmentduetothemisclassiedexposure.Section 5.3 containssupportingnumericalevidenceincludingsimulationstudieswithsamplesizedeterminationandpowercalculationforunmatchedcase-controlstudies.Section 5.4 containsconcludingdiscussion.SomeproofsanddetailedcalculationsarerelegatedtoAppendix C .5.2UnmatchedCase-ControlStudiesofGeneEnvironmentInteractionIconsiderunmatchedcase-controlstudieswithabinarygeneticfactorGandabinaryenvironmentalexposureEwhichtakevalues1forsusceptibleexposedinthecaseofEand0fornonsusceptibleunexposedinthecaseofEsubjects.LetDdenotethediseasestatus,whereD=1denotesaected,andD=0denotes

PAGE 104

90 Table5{1:Dataforaunmatchedcase-controlstudywithabinarygeneticfactorandabinaryenvironmentalexposure. G=0G=1 E=0E=1E=0E=1total j1234 D=0r01r02r03r04n0D=1r11r12r13r14n1 unaectedindividuals.UsingthesamenotationasinGarcia-Closasetal.999,theoddsratioORegmeasurestheassociationbetweendiseaseandtheenvironmentalandgeneticfactors.Relativetosubjectsnotexposedtotheenvironmentalorgeneticfactor,Idenethefollowingoddsratios:OR10denotestheoddsratiofornonsuscep-tiblesubjectsexposedtotheenvironmentalfactor;OR01denotestheoddsratioforsusceptiblesubjectsnotexposedtotheenvironmentalfactor;andOR11denotestheoddsratioforsusceptiblesubjectsexposedtotheenvironmentalfactor.Therefore,=OR11=OR10OR01isthemultiplicativeinteractionparameter.5.2.1MaximumLikelihoodEstimationunderG-EIndependenceAssump-tionTable 5{1 showsageneralformofdataunderanunmatchedcase-controldesign.Intheabsenceofmisclassication,Icanassumethatdatafromcontrolpopulationandcasepopulation,eachformsamultinomialdistribution,namely,r0Mnn0;p0r1Mnn1;p1; {1 wheren0andn1arexed,andr0=r01;r02;r03;r04r1=r11;r12;r13;r14p0=p01;p02;p03;p04=1)]TJ/F22 11.955 Tf 11.955 0 Td[(p01)]TJ/F22 11.955 Tf 11.955 0 Td[(p02)]TJ/F22 11.955 Tf 11.956 0 Td[(p03p1=p11;p12;p13;p14=1)]TJ/F22 11.955 Tf 11.955 0 Td[(p11)]TJ/F22 11.955 Tf 11.955 0 Td[(p12)]TJ/F22 11.955 Tf 11.955 0 Td[(p13:


By the definition of the odds ratio,

\[ OR_{10} = \frac{p_{01} p_{12}}{p_{02} p_{11}}, \qquad OR_{01} = \frac{p_{01} p_{13}}{p_{03} p_{11}}, \qquad OR_{11} = \frac{p_{01} p_{14}}{p_{04} p_{11}}, \qquad \theta = \frac{p_{02} p_{03} p_{11} p_{14}}{p_{01} p_{04} p_{12} p_{13}}. \]

Thus, I obtain the case distribution as

\[ p_{11} = \frac{p_{01}}{p}, \qquad p_{12} = \frac{p_{02}\, OR_{10}}{p}, \qquad p_{13} = \frac{p_{03}\, OR_{01}}{p}, \qquad p_{14} = \frac{p_{04}\, OR_{10}\, OR_{01}\, \theta}{p}, \tag{5-2} \]

where $p = p_{01} + p_{02}\, OR_{10} + p_{03}\, OR_{01} + p_{04}\, OR_{10}\, OR_{01}\, \theta$. The corresponding multinomial likelihood, when parameterized in terms of the control probabilities and the odds ratios, is

\[ L_1 = L(OR_{10}, OR_{01}, \theta, p_{01}, p_{02}, p_{03} \mid r_0, r_1) = \prod_{d=0}^{1} \prod_{j=1}^{4} p_{dj}^{\,r_{dj}}. \tag{5-3} \]

Note that the parametrization in terms of $p_{01}$, $p_{02}$ and $p_{03}$ imposes no restrictions other than that they lead to valid probability distributions (they are all positive and sum to less than 1). Similarly, the odds ratios are just expected to be positive. I can always fit the multinomial model to these data and obtain the maximum likelihood estimates (MLEs) of the parameters of interest and their estimated asymptotic variances (AVAR) as in Table 5-2 under the column of the traditional model. The MLEs of the cell probabilities are simply given by $\hat{p}_{dj} = r_{dj}/n_d$, $d = 0, 1$, $j = 1, \ldots, 4$.

Now let me describe how the estimation changes with the additional assumption of G-E independence holding in the source population. I first investigate the estimates under a rare disease assumption, an assumption which is routinely made in epidemiological studies. The assumption of G-E independence in the source population, $P(G, E) = P(G) P(E)$, in conjunction with the rare disease assumption, implies that G-E independence holds in the control population, i.e., $P(G, E \mid D = 0) = P(G \mid D = 0)\, P(E \mid D = 0)$. This adds an additional restriction on $p_{01}$, $p_{02}$ and $p_{03}$:

\[ p_{01}(1 - p_{01} - p_{02} - p_{03}) = p_{02}\, p_{03}. \tag{5-4} \]

With this additional restriction, maximizing the likelihood (5-3) will not provide the same estimates as in the previous model. The MLEs and their AVAR in this restricted parameter space are presented in Table 5-2 under the column of G-E independence and rare disease. The constrained ML equations and their solutions which lead to this column in Table 5-2 are presented in Appendix C.1.

Table 5-2: In the absence of misclassification, the MLEs of the odds ratios and their estimated asymptotic variances in terms of observed counts $r_{dj}$ for both the traditional model and the model under G-E independence and rare disease.

Parameter    Quantity   Traditional model                                                       G-E independence and rare disease
$OR_{10}$    MLE        $\widehat{OR}_{10} = r_{01} r_{12} / (r_{02} r_{11})$                    $\widehat{OR}^{IR}_{10} = r_{12}(r_{01}+r_{03}) / \{r_{11}(r_{02}+r_{04})\}$
             AVAR       $\widehat{OR}_{10}^{2}\,(1/r_{01} + 1/r_{02} + 1/r_{11} + 1/r_{12})$      $(\widehat{OR}^{IR}_{10})^{2}\,\{1/(r_{01}+r_{03}) + 1/(r_{02}+r_{04}) + 1/r_{11} + 1/r_{12}\}$
$OR_{01}$    MLE        $\widehat{OR}_{01} = r_{01} r_{13} / (r_{03} r_{11})$                    $\widehat{OR}^{IR}_{01} = r_{13}(r_{01}+r_{02}) / \{r_{11}(r_{03}+r_{04})\}$
             AVAR       $\widehat{OR}_{01}^{2}\,(1/r_{01} + 1/r_{03} + 1/r_{11} + 1/r_{13})$      $(\widehat{OR}^{IR}_{01})^{2}\,\{1/(r_{01}+r_{02}) + 1/(r_{03}+r_{04}) + 1/r_{11} + 1/r_{13}\}$
$\theta$     MLE        $\hat{\theta} = r_{02} r_{03} r_{11} r_{14} / (r_{01} r_{04} r_{12} r_{13})$   $\hat{\theta}^{IR} = r_{11} r_{14} / (r_{12} r_{13})$
             AVAR       $\hat{\theta}^{2} \sum_{d=0}^{1} \sum_{j=1}^{4} 1/r_{dj}$                 $(\hat{\theta}^{IR})^{2} \sum_{j=1}^{4} 1/r_{1j}$

If the disease prevalence $P(D = 1) = \pi$ in the source population is known, I can relax the rare disease assumption by expressing G-E independence as follows:

\[ P(G = g)\, P(E = e) = P(G = g, E = e) = P(G = g, E = e \mid D = 0)\, P(D = 0) + P(G = g, E = e \mid D = 1)\, P(D = 1), \tag{5-5} \]

where $g, e = 0, 1$. Therefore, instead of the restriction in (5-4), we have the following restriction on $p_{01}$, $p_{02}$ and $p_{03}$:

\[ f = (1 - \pi)\, p_{04} + \pi\, \frac{OR_{10}\, OR_{01}\, \theta\, p_{04}}{p} - \left[ (1 - \pi)(p_{02} + p_{04}) + \pi\, \frac{OR_{10}\, p_{02} + OR_{10}\, OR_{01}\, \theta\, p_{04}}{p} \right] \left[ (1 - \pi)(p_{03} + p_{04}) + \pi\, \frac{OR_{01}\, p_{03} + OR_{10}\, OR_{01}\, \theta\, p_{04}}{p} \right] = 0. \tag{5-6} \]

The details of obtaining (5-6) are relegated to Appendix C.2. With this additional restriction, maximizing the likelihood (5-3) will not provide the same estimates as under the rare disease assumption. In fact, the solutions to the ML equations cannot be written in closed form. However, we can obtain these restricted MLEs by the usual Newton-Raphson algorithm, and obtain the estimated asymptotic variance-covariance matrix as the inverse of the observed information matrix. The observed information matrix is constructed by taking the negative of the second derivative of the log-likelihood with respect to the parameters, and evaluating it at the MLEs of the parameters.

Once we obtain the MLEs of the parameters of interest, in this case the odds ratios, one would like to conduct tests of association or interaction as well as construct large-sample confidence intervals for the parameters of interest. Because of the skewness in the sampling distribution of the estimated odds ratios, statistical inference for a parameter of interest (denoted by a generic symbol $\psi$) uses an alternative but equivalent measure: its natural logarithm, $\log(\hat{\psi})$, which has a less skewed sampling distribution that is closer to normal. By a simple use of the delta method, the large-sample distribution of $\log(\hat{\psi})$ is approximately normal, i.e.,

\[ \log(\hat{\psi}) \sim N\big(\log(\psi),\ \mathrm{AVAR}(\log \hat{\psi})\big), \]

where $\hat{\psi}$ is the MLE of $\psi$, and the estimated variance of $\log(\hat{\psi})$ is given by $\mathrm{AVAR}(\log \hat{\psi}) = \hat{\psi}^{-2}\, \widehat{\mathrm{Var}}(\hat{\psi})$. Z-tests and confidence intervals for the log-scale parameters are constructed based on the above large-sample distribution.
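The closed-form entries of Table 5-2 and the log-scale intervals described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the dictionary layout and the default `z = 1.96` (a nominal 95% interval) are assumptions made here, not part of the original analysis.

```python
import math


def _summarize(estimate, inv_count_sum, z=1.96):
    """Return (estimate, AVAR, Wald interval built on the log scale)."""
    avar = estimate ** 2 * inv_count_sum   # AVAR column of Table 5-2
    se_log = math.sqrt(inv_count_sum)      # delta method: AVAR(log) = AVAR / estimate^2
    half_width = z * se_log
    interval = (estimate * math.exp(-half_width), estimate * math.exp(half_width))
    return estimate, avar, interval


def odds_ratio_mles(r0, r1, z=1.96):
    """Table 5-2 estimates from control counts r0 and case counts r1.

    Both arguments are ordered as j = 1..4 in Table 5-1:
    (G=0,E=0), (G=0,E=1), (G=1,E=0), (G=1,E=1). All counts must be positive.
    """
    r01, r02, r03, r04 = r0
    r11, r12, r13, r14 = r1
    traditional = {
        "OR10": _summarize(r01 * r12 / (r02 * r11),
                           1 / r01 + 1 / r02 + 1 / r11 + 1 / r12, z),
        "OR01": _summarize(r01 * r13 / (r03 * r11),
                           1 / r01 + 1 / r03 + 1 / r11 + 1 / r13, z),
        "theta": _summarize(r02 * r03 * r11 * r14 / (r01 * r04 * r12 * r13),
                            sum(1 / r for r in (*r0, *r1)), z),
    }
    # G-E independence and rare disease: control cells are replaced by margins.
    independence_rare = {
        "OR10": _summarize(r12 * (r01 + r03) / (r11 * (r02 + r04)),
                           1 / (r01 + r03) + 1 / (r02 + r04) + 1 / r11 + 1 / r12, z),
        "OR01": _summarize(r13 * (r01 + r02) / (r11 * (r03 + r04)),
                           1 / (r01 + r02) + 1 / (r03 + r04) + 1 / r11 + 1 / r13, z),
        "theta": _summarize(r11 * r14 / (r12 * r13),
                            sum(1 / r for r in r1), z),
    }
    return {"traditional": traditional, "independence_rare": independence_rare}
```

For example, `odds_ratio_mles((400, 400, 100, 100), (200, 400, 100, 300))` uses hypothetical counts in which G and E are exactly independent among controls, so both models return the same point estimates while the independence model reports the smaller AVAR for the interaction.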


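Remark 1: It is well known that in a multinomial set-up, the expected cell counts are given by $E_{p_d}[r_d] = n_d\, p_d$, where $E_{p_d}[r_d]$ represents the row vector of expected cell counts corresponding to $D = d$, $d = 0, 1$, and $p_d$ denotes the probability vector. Then the vector of estimated expected cell frequencies, denoted by $\tilde{r}_d$, is given by $E_{p_d}[r_d] \big|_{p_d = \hat{p}_d} = n_d\, \hat{p}_d$, i.e., the expected frequencies evaluated at the MLEs of the model parameters. For example, for the usual multinomial model, without any restrictions on the exposure space, the vector of estimated expected cell frequencies matches exactly with the observed frequencies, that is, $\tilde{r}_d = r_d$, as $\hat{p}_d = r_d / n_d$, where $r_d$ is the vector of observed frequencies. Under the independence and rare disease assumptions, which I denote by the superscript IR below to distinguish from the other models, from Appendix C.1, I note that the MLEs for $p_0$ and $p_1$ are

\[ \hat{p}^{IR}_{01} = \frac{(r_{01}+r_{03})(r_{01}+r_{02})}{n_0^2}, \quad \hat{p}^{IR}_{02} = \frac{(r_{01}+r_{02})(r_{02}+r_{04})}{n_0^2}, \quad \hat{p}^{IR}_{03} = \frac{(r_{01}+r_{03})(r_{03}+r_{04})}{n_0^2}, \quad \hat{p}^{IR}_{04} = \frac{(r_{02}+r_{04})(r_{03}+r_{04})}{n_0^2}, \]
\[ \hat{p}^{IR}_{1j} = \frac{r_{1j}}{n_1}, \quad j = 1, 2, 3, 4, \tag{5-7} \]

and thus the estimated expected frequencies are obtained simply by $\tilde{r}^{IR}_d = n_d\, \hat{p}^{IR}_d$.

Remark 2: One can obtain the estimates of $OR_{10}$, $OR_{01}$ and the interaction effect $\theta$ as well as the cell probabilities in the control population $p_0$. Define the genetic and environmental marginal odds ratios $OR_G$ and $OR_E$ as follows:

\[ OR_E = \frac{P(D=1 \mid E=1)\, P(D=0 \mid E=0)}{P(D=0 \mid E=1)\, P(D=1 \mid E=0)}, \qquad OR_G = \frac{P(D=1 \mid G=1)\, P(D=0 \mid G=0)}{P(D=0 \mid G=1)\, P(D=1 \mid G=0)}. \]

Thus one can also estimate $OR_G$ and $OR_E$ by using the following identities:

\[ OR_E = \left\{ \frac{p_{01} + p_{03}}{p_{02} + p_{04}} \right\} \left\{ \frac{p_{02}\, OR_{10} + p_{04}\, OR_{10}\, OR_{01}\, \theta}{p_{01} + p_{03}\, OR_{01}} \right\}, \qquad OR_G = \left\{ \frac{p_{01} + p_{02}}{p_{03} + p_{04}} \right\} \left\{ \frac{p_{03}\, OR_{01} + p_{04}\, OR_{10}\, OR_{01}\, \theta}{p_{01} + p_{02}\, OR_{10}} \right\}. \]

Under the G-E independence and rare disease assumptions, we have $p_{01}\, p_{04} = p_{02}\, p_{03}$; furthermore, $P(E \mid D = 0) \approx P(E)$ and $P(G \mid D = 0) \approx P(G)$, thus one can estimate $OR_G$ and $OR_E$ by using

\[ OR_E = \frac{\{1 - P(G=1)\}\, OR_{10} + P(G=1)\, OR_{10}\, OR_{01}\, \theta}{\{1 - P(G=1)\} + P(G=1)\, OR_{01}}, \qquad OR_G = \frac{\{1 - P(E=1)\}\, OR_{01} + P(E=1)\, OR_{10}\, OR_{01}\, \theta}{\{1 - P(E=1)\} + P(E=1)\, OR_{10}}. \]

Note that $P(E = 1) \approx p_{02} + p_{04}$ and $P(G = 1) \approx p_{03} + p_{04}$.

5.2.2 Maximum Likelihood Estimation in the Presence of Misclassification

Now, I introduce the effects of misclassification in the present framework. Our model for misclassified data is based on the assumption that some perfectly classified "true" case-control data exist, where the "true" parameters of interest and the true underlying cell probabilities follow the same distribution as $p_d$ discussed above. Following the "star" notation of Rice (2003), I let the superscript asterisk (*) denote the true parameters for the true data model as well as the true variables. Let $sp_{dG}$ ($se_{dG}$) and $sp_{dE}$ ($se_{dE}$) denote the specificity (sensitivity) of G and E with disease status $d$, respectively, where sensitivity = P(observed exposed | truly exposed) and specificity = P(observed unexposed | truly unexposed), so that $se_{dG} = P(G = 1 \mid G^* = 1, D = d)$, $se_{dE} = P(E = 1 \mid E^* = 1, D = d)$, $sp_{dG} = P(G = 0 \mid G^* = 0, D = d)$ and $sp_{dE} = P(E = 0 \mid E^* = 0, D = d)$. Applying a classical error structure, all subjects are assumed to have the same probability of the observed exposure, conditional on their case/control status and true exposure. I then have the following two results.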


Result 1: Assuming that, given the disease status $d = 0, 1$ and the true exposure statuses of G and E, the observed exposure statuses of G and E are independent, then

\[ \begin{pmatrix} p_{d1} & p_{d2} \\ p_{d3} & p_{d4} \end{pmatrix} = A \begin{pmatrix} p^*_{d1} & p^*_{d2} \\ p^*_{d3} & p^*_{d4} \end{pmatrix} B, \tag{5-8} \]

where

\[ A = \begin{pmatrix} sp_{dG} & 1 - se_{dG} \\ 1 - sp_{dG} & se_{dG} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} sp_{dE} & 1 - sp_{dE} \\ 1 - se_{dE} & se_{dE} \end{pmatrix}. \]

Proof:

\[ \begin{aligned} P(G, E \mid D = d) &= \sum_{g=0}^{1} \sum_{e=0}^{1} P(G, E \mid D = d, G^* = g, E^* = e)\, P(G^* = g, E^* = e \mid D = d) \\ &= \sum_{g=0}^{1} \sum_{e=0}^{1} P(G \mid D = d, G^* = g, E^* = e)\, P(E \mid D = d, G^* = g, E^* = e)\, P(G^* = g, E^* = e \mid D = d) \\ &= \sum_{g=0}^{1} \sum_{e=0}^{1} P(G \mid D = d, G^* = g)\, P(E \mid D = d, E^* = e)\, P(G^* = g, E^* = e \mid D = d). \end{aligned} \]

Note that $p_{dj}$, as defined in Table 5-1, denotes the cell probability of the $j$th $(G, E)$ configuration given disease status $d$, where the values of $j$ are as stated in Table 5-1. Result 1 holds for all three situations discussed in the previous section. Therefore, if the observed data come from a common multinomial distribution with cell probabilities $p_{dj}$, then I can write down the likelihood (5-3) in terms of the true, "starred" parameters. I simply write the $p_{dj}$'s as a linear function of the true parameters $p^*_{dj}$ as defined by Result 1 and maximize the following multinomial likelihood in terms of the underlying true or starred parameters:

\[ L_2 = L(OR^*_{10}, OR^*_{01}, \theta^*, p^*_{01}, p^*_{02}, p^*_{03} \mid r_0, r_1) = \prod_{d=0}^{1} \prod_{j=1}^{4} \{ p_{dj}(p^*_d) \}^{r_{dj}}, \tag{5-9} \]

where $p_{dj}(p^*_d)$ denotes the linear transformation defined in (5-8); essentially I am replacing the $p_{dj}$ in the original likelihood by a function of the underlying true parameters as described in Result 1. Thus, by maximizing the likelihood (5-9), which now includes the effect of misclassification through the linear transformation on the parameters with the correction matrices A and B, I can obtain the MLEs of the starred parameters, denoted by $\hat{p}^*_d$. As indicated in Remark 1, the vector of estimated expected cell counts under the multinomial model is given by $\tilde{r}_d = n_d\, \hat{p}_d$. Thus for the estimation with the starred parameters, the vector of estimated expected cell counts under the true data model is $r^*_d = n_d\, \hat{p}^*_d$. Note that by the invariance property of the MLE, Result 1 holds when the parameters $p^*_d$ and $p_d$, $d = 0, 1$, are replaced with their MLEs under the perfectly classified data model and the misclassified data model, respectively. Thus by inverting Result 1 as in (5-8), replacing the parameters with the MLEs, I have

\[ \begin{pmatrix} \hat{p}^*_{d1} & \hat{p}^*_{d2} \\ \hat{p}^*_{d3} & \hat{p}^*_{d4} \end{pmatrix} = A^{-1} \begin{pmatrix} \hat{p}_{d1} & \hat{p}_{d2} \\ \hat{p}_{d3} & \hat{p}_{d4} \end{pmatrix} B^{-1} = \frac{1}{n_d}\, A^{-1} \begin{pmatrix} \tilde{r}_{d1} & \tilde{r}_{d2} \\ \tilde{r}_{d3} & \tilde{r}_{d4} \end{pmatrix} B^{-1}. \]

This immediately leads to the following relationship between the estimated expected cell counts for the true data and the misclassified data.

Result 2:

\[ \begin{pmatrix} r^*_{d1} & r^*_{d2} \\ r^*_{d3} & r^*_{d4} \end{pmatrix} = A^{-1} \begin{pmatrix} \tilde{r}_{d1} & \tilde{r}_{d2} \\ \tilde{r}_{d3} & \tilde{r}_{d4} \end{pmatrix} B^{-1}. \tag{5-10} \]

In fact, the result is true for the vector of expected cell counts involving the unknown parameters, not only the estimates, as is obvious from the above discussion.

Thus, for the traditional multinomial model and the model under the G-E independence and rare disease assumptions, the MLEs of the true (starred) parameters of interest have closed-form expressions in terms of the estimated starred expected cell counts $r^*_d$, which are shown in Table 5-3. To obtain $r^*_d$, I simply obtain the MLEs $\hat{p}^*_d$ under the different models and multiply by $n_d$. Note that the MLEs $\hat{p}^*_d$ are also easily obtained by using the transformation in Result 1 and the ML estimation of $p_d$ as discussed in Section 5.2.1 under different model assumptions. The MLEs $\hat{p}^*_d$ turn out to be different functions of the observed cell counts $r_d$ and the sensitivity and specificity parameters, the form of the function depending on the model assumptions. Therefore, $r^*_d$ under different assumptions or constraints on the parameters might be different (I use the superscript IR under the G-E independence and rare disease assumptions to distinguish them from the other models), as the MLEs $\hat{p}_d$ and $\hat{p}^*_d$ differ across the models with different assumptions (I refer to the discussion comparing the usual multinomial model and the model with rare disease and G-E independence in Section 5.2.1). This simply means that I can apply the corrected counts $r^*_d$, instead of the observed counts $r_d$, to the estimates obtained in Table 5-2, and that will lead to exactly the same estimates as described in Table 5-3.

Table 5-3: In the presence of misclassification, the MLEs of the true odds ratios in terms of the estimated starred expected counts $r^*_{dj}$ for the traditional model (Model 1) and $r^{*IR}_{dj}$ for the model under the G-E independence and rare disease assumptions (Model 2).

Parameter        Quantity   Model 1                                                                      Model 2
$OR^*_{10}$      MLE        $r^*_{01} r^*_{12} / (r^*_{02} r^*_{11})$                                     $r^{*IR}_{01} r^{*IR}_{12} / (r^{*IR}_{02} r^{*IR}_{11})$
$OR^*_{01}$      MLE        $r^*_{01} r^*_{13} / (r^*_{03} r^*_{11})$                                     $r^{*IR}_{01} r^{*IR}_{13} / (r^{*IR}_{03} r^{*IR}_{11})$
$\theta^*$       MLE        $r^*_{02} r^*_{03} r^*_{11} r^*_{14} / (r^*_{01} r^*_{04} r^*_{12} r^*_{13})$   $r^{*IR}_{11} r^{*IR}_{14} / (r^{*IR}_{12} r^{*IR}_{13})$

I emphasize that the estimators in Table 5-3 are only strictly valid as MLEs when they lie within the constrained parameter space. When the positivity constraints on the $OR^*_{eg}$'s or the probability constraints on the $p^*_d$'s are violated (e.g., when very small values of sensitivity or specificity are used, corresponding to huge misclassification rates), the constrained MLEs would be on the boundary of the parameter space, and we should maximize the likelihood (5-9) with respect to the true parameters subject to the constraints, instead of transforming the observed MLE. However, as such estimates are indicative of extreme misclassification, or too small a sample, we might well treat these estimates with some caution.

We can also see the behavior of the estimators from Table 5-3 as the misclassification error rates go to 0 by Taylor series expansions of these estimators. Define the errors as $\varepsilon_{pG} = 1 - sp_{1G}$, $\varepsilon_{eG} = 1 - se_{1G}$, $\varepsilon_{pE} = 1 - sp_{1E}$ and $\varepsilon_{eE} = 1 - se_{1E}$. Expanding the log-scale estimator $\log(\hat{\theta}^{*IR})$ of the interaction parameter around $\varepsilon_{pG} = \varepsilon_{eG} = \varepsilon_{pE} = \varepsilon_{eE} = 0$, we see that

\[ \log(\hat{\theta}^{*IR}) = \log \frac{r^{*IR}_{11}\, r^{*IR}_{14}}{r^{*IR}_{12}\, r^{*IR}_{13}} = \log \frac{r_{11} r_{14}}{r_{12} r_{13}} + \frac{(r_{11} r_{14} - r_{12} r_{13})\,(r_{11} r_{12}\, \varepsilon_{pG} + r_{11} r_{13}\, \varepsilon_{pE} + r_{13} r_{14}\, \varepsilon_{eG} + r_{12} r_{14}\, \varepsilon_{eE})}{r_{11}\, r_{12}\, r_{13}\, r_{14}} + \text{higher order terms}. \tag{5-11} \]

To a first-order approximation, the estimator reduces to the usual perfect-data estimator in Table 5-2 as the errors reduce to 0. The first-order terms suggest that using a good approximation to the errors may give better estimates than simply ignoring misclassification, i.e., setting the errors equal to 0. Construction of the confidence intervals follows in exactly the same way as for the perfectly classified data, with the standard error estimates obtained from the inverse of the information matrix of $L_2$ evaluated at the MLEs.

5.2.3 Case-only Method with Possible Misclassification

As discussed in detail in Chapter 4, the case-only method (Piegorsch et al., 1994) is a popular method to estimate the multiplicative gene-environment interaction parameter, where under the rare disease and gene-environment independence assumptions, the odds ratio of G for exposed versus unexposed subjects among the cases only provides an efficient estimate of the interaction parameter. The data used are as shown in the second row of Table 5-1, ignoring the control data in the first row.
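The count correction of Result 2 and the case-only odds ratio just described can be sketched together in Python. This is a minimal sketch under the stated model: the helper names are hypothetical, the 2 × 2 inverse is written out by hand to keep the block dependency-free, and no attempt is made to handle corrected counts that fall outside the parameter space.

```python
def _inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("misclassification matrix is singular")
    return ((d / det, -b / det), (-c / det, a / det))


def _matmul_2x2(x, y):
    """Product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def corrected_counts(r, se_g, sp_g, se_e, sp_e):
    """Result 2: R* = A^{-1} R B^{-1}, with r ordered as j = 1..4 in Table 5-1."""
    a_matrix = ((sp_g, 1 - se_g), (1 - sp_g, se_g))   # rows/columns index G
    b_matrix = ((sp_e, 1 - sp_e), (1 - se_e, se_e))   # rows/columns index E
    observed = ((r[0], r[1]), (r[2], r[3]))
    star = _matmul_2x2(_matmul_2x2(_inverse_2x2(a_matrix), observed),
                       _inverse_2x2(b_matrix))
    return (star[0][0], star[0][1], star[1][0], star[1][1])


def case_only_theta(r1):
    """Case-only interaction odds ratio r11 r14 / (r12 r13)."""
    r11, r12, r13, r14 = r1
    return r11 * r14 / (r12 * r13)
```

A typical use would be `case_only_theta(corrected_counts(r1, se_g, sp_g, se_e, sp_e))` for the adjusted estimate, compared with `case_only_theta(r1)` for the unadjusted one; with perfect sensitivity and specificity both matrices are the identity and the two coincide.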


In the absence of misclassification, the data from the case population follow a multinomial distribution,

\[ r_1 \sim \mathrm{Mn}(n_1, p_1), \]

where $n_1$ is fixed. The interaction parameter, here denoted as $\theta_{CO}$, is obtained as the odds ratio between G and E among the case population, i.e.,

\[ \theta_{CO} = \frac{p_{14} / p_{12}}{p_{13} / p_{11}}. \]

Together with $\sum_{j=1}^{4} p_{1j} = 1$, I have the following restrictions for $p_1$:

\[ p_{13} = \frac{p_{11}(1 - p_{11} - p_{12})}{p_{11} + p_{12}\, \theta_{CO}}, \qquad p_{14} = \frac{p_{12}\, \theta_{CO}\, (1 - p_{11} - p_{12})}{p_{11} + p_{12}\, \theta_{CO}}. \]

The corresponding likelihood for the case-only method is thus

\[ L_{CO} = L(\theta_{CO}, p_{11}, p_{12} \mid r_1) = \prod_{j=1}^{4} p_{1j}^{\,r_{1j}}, \]

and the MLE of the interaction parameter $\theta_{CO}$ is $\hat{\theta}_{CO} = r_{11} r_{14} / (r_{12} r_{13})$ with variance $\hat{\theta}^2_{CO} \sum_{j=1}^{4} 1/r_{1j}$.

Both Results 1 and 2 hold for $d = 1$ as well; therefore, estimating the true parameters in the presence of misclassification is straightforward by writing the likelihood in terms of the true parameters,

\[ L^*_{CO} = L(\theta^*_{CO}, p^*_{11}, p^*_{12} \mid r_1) = \prod_{j=1}^{4} \{ p_{1j}(p^*_1) \}^{r_{1j}}. \]

Note that $\hat{p}_{1j} = r_{1j}/n_1$, so the MLE of the "true" parameter $\theta^*_{CO}$ in terms of $r^{*CO}_1$ (the superscript CO is used to distinguish from the other models) is

\[ \hat{\theta}^*_{CO} = \frac{r^{*CO}_{11}\, r^{*CO}_{14}}{r^{*CO}_{12}\, r^{*CO}_{13}}, \]

and $r^{*CO}_{1j}$ can be obtained following Result 2, with $\tilde{r}_{1j} = n_1\, \hat{p}_{1j} = r_{1j}$. The variance estimators can again be obtained from the inverse of the information matrix of $L^*_{CO}$ evaluated at the MLEs or by the technique stated in Appendix C.1.

Remark 3: Note that the MLE of the interaction parameter and its variance obtained by the case-only method are exactly the same as those obtained in Section 5.2.1, where I also assume G-E independence and rare disease, but use both case and control data. This is true whether in the absence of misclassification, or in the presence of misclassification (unadjusted or adjusted). This establishes yet another proof of the fact that under the G-E independence and rare disease assumptions, the interaction odds ratio is exactly equal to the odds ratio of E on G for cases alone. The details of this simple equivalence result appear in Appendix C.3.

Remark 3 shows that our model with the G-E independence and rare disease assumptions can also obtain a highly efficient estimate of the interaction parameter, as in the case-only method. Moreover, our model is also able to estimate the main effects of genetic and environmental factors, which the case-only method cannot estimate. As Clayton and McKeigue (2001) noted, studies of gene-environment association with disease need to go beyond the estimation of the statistical interaction parameter, and our approach can estimate auxiliary parameters of interest without compromising the efficiency of the estimate of the interaction parameter. I only point this out as a side observation, as the case-only method does not have any advantage compared to the proposed method.

5.3 Simulation Studies

In this section, I present numerical evidence in the form of simulation studies to illustrate the advantage of the proposed methods in unmatched case-control studies. Generally, I assume that the genetic variant of interest is a bi-allelic locus with the wild and variant type alleles. I consider a dominant model for the effect of the gene variant. I also assume a binary environmental exposure and consider it as a commonly prevalent exposure. Specifically, I follow a simulation design similar to that of Chatterjee et al. (2005).

I first generate the parental genotype data for each individual. Given the genotypes of the parents, I generate the genotype for one offspring based on a standard Mendelian mode of inheritance. I independently generate the environmental exposure for this offspring based on the marginal probability of exposure $P(E = 1)$ for the underlying population. Given the information on genetic and environmental factors, I generate the disease outcome for each individual, independent of the others, using the logistic regression model

\[ \log \left\{ \frac{P(D = 1 \mid G, E)}{P(D = 0 \mid G, E)} \right\} = \beta_0 + \beta_E E + \beta_G G + \beta_{GE}\, G E. \tag{5-12} \]

I choose the main effect parameters $\beta_E = \log(OR_{10}) = \log(2)$ and $\beta_G = \log(OR_{01}) = \log(2)$ and consider a multiplicative interaction between G and E, fixing $\beta_{GE} = \log(\theta) = \log(2)$. I select the value of $\beta_0$ so that the marginal probability of the disease in the population is $P(D = 1) \approx 0.01$ (in fact $P(D = 1 \mid G = 0, E = 0) \approx 0.001$). Following this scheme, I first generate data for a large number of random samples, which I treat as the underlying population; I then select 1000 diseased individuals and 1000 non-diseased individuals.

I only retain the appropriate disease, genotype and environmental exposure information and discard the rest of the information. Following the definition of sensitivity and specificity, I randomly misclassify the genotype and environmental exposure information, independent of one another, but keep the disease information unchanged. In the simulation, I let $sp_{0G} = sp_{1G} = sp_{0E} = sp_{1E} = 1$ and in the sampled data set consider the following settings: (1) $se_{0G} = se_{1G} = 0.95$ and $se_{0E} = se_{1E} = 0.9$; (2) $se_{0G} = se_{1G} = 0.9$ and $se_{0E} = se_{1E} = 0.8$.
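The data-generating and misclassification steps above can be sketched as follows. This is a simplified illustration, not the original simulation code: the carrier status G is drawn directly as a Bernoulli variable with an assumed carrier probability `p_g` instead of going through parental genotypes and Mendelian inheritance, subjects are drawn one at a time until the case and control quotas are filled, and all function names and default values are assumptions made for this sketch.

```python
import math
import random


def simulate_case_control(n_cases, n_controls, beta0, p_g=0.2, p_e=0.5,
                          or10=2.0, or01=2.0, theta=2.0, seed=0):
    """Draw (G, E) pairs for controls and cases under the logistic model (5-12)."""
    rng = random.Random(seed)
    b_e, b_g, b_ge = math.log(or10), math.log(or01), math.log(theta)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        g = int(rng.random() < p_g)          # simplification: direct Bernoulli draw
        e = int(rng.random() < p_e)          # E generated independently of G
        logit = beta0 + b_e * e + b_g * g + b_ge * g * e
        d = int(rng.random() < 1.0 / (1.0 + math.exp(-logit)))
        group, quota = (cases, n_cases) if d else (controls, n_controls)
        if len(group) < quota:
            group.append((g, e))
    return controls, cases


def misclassify(value, se, sp, rng):
    """Observed binary exposure given the true one, sensitivity and specificity."""
    if value == 1:
        return 1 if rng.random() < se else 0
    return 0 if rng.random() < sp else 1


def tabulate(subjects):
    """Counts ordered as j = 1..4 in Table 5-1, i.e. index 2*G + E."""
    counts = [0, 0, 0, 0]
    for g, e in subjects:
        counts[2 * g + e] += 1
    return tuple(counts)
```

Applying `misclassify` separately to G (with the genetic sensitivity and specificity) and to E (with the environmental ones) for every sampled subject, and then calling `tabulate`, gives observed count vectors of the kind analyzed in the tables of this section.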


For each scenario, I simulate 500 data sets and analyze the data by implementing the adjusted formulation. To illustrate the efficiency of the proposed method, I compare the results both in the absence and in the presence of misclassification (unadjusted and adjusted). To exploit the G-E independence assumption, I also apply the formulation under all three model assumptions as discussed in Section 5.2. Tables 5-4, 5-5, 5-6 and 5-7 show the results of analyzing unmatched case-control data with different sample sizes (1000/1000 and 750/750) for different set-ups of misclassification errors.

To summarize, in the presence of misclassification, the estimates without adjustment show high bias and have significantly large mean squared errors (MSE), but their standard errors are not necessarily bigger than those of the estimates in the absence of misclassification. In fact, I notice that the estimates of $OR_{10}$ and $\theta$ without adjustment are biased towards the null. The adjusted estimates, which are obtained through the proposed formulation, are quite close to the true parameters, although with relatively large standard errors, and the power to detect the interaction is also a significant improvement over the unadjusted analysis.

Under each scenario, I notice that the estimators obtained by the traditional model suffer more from misclassification than those obtained under the G-E independence assumption. Even in the absence of misclassification, the models under the independence assumption provide more precise estimates, i.e., smaller standard errors and MSE. As I expected, a larger sample size improves the power of testing the interaction effect as well as the precision of the parameter estimates.

I also present the sample size calculations to achieve the designed power, following the approach described by Lubin and Gail (1990) and discussed by Garcia-Closas and Lubin (1999). These calculations are performed using the program POWER. Tables 5-8 and 5-9 present the impact of reducing the sensitivity/specificity of the environmental factor assessment from 1.0 to 0.90 and 0.80, both in the absence and in the presence of reduced sensitivity/specificity in the assessment of the genetic factor from 1.0 to 0.95 and 0.90. In Figure 5-1, I explore in more detail the effects of misclassification on sample size. The solid lines in Figure 5-1 represent the sample size required to detect the specified 2-fold interaction in the absence of misclassification as a function of the true prevalence of the environmental factor for 0.2 prevalence of the genetic factor. The other lines in Figure 5-1 illustrate the impact of misclassification of the environmental factor on sample size for selected values of sensitivity and specificity of exposure assessment.

5.4 Conclusion

I describe a relatively simple analytic formulation for accounting for misclassification of exposures in studies of gene-environment interaction based on the sensitivity and specificity of the measurement instrument for genetic and environmental factors in unmatched case-control studies. As illustrated in the simulations, even with relatively small degrees of error (i.e., sensitivity or specificity quite close to 1), the estimates of the parameters of interest have relatively large biases. The corrected estimates reduce the biases and are found to be closer to the true parameters, although the standard errors are slightly large when the sample size is relatively small.

I also consider different model assumptions to exploit the G-E independence assumption. According to the results of the simulation, I suggest using the formulation under the G-E independence assumption if the assumption holds in the source population, since the traditional model suffers more from misclassification. The cautions about this independence assumption discussed in Chapter 4 should be exercised while using these methods.

Improvements in the accuracy of exposure assessment for both the environmental and genetic factors can greatly reduce sample size requirements to study interactions and are critical for accurate assessment of gene-environment interactions in case-control studies. How to exploit the G-E independence assumption for a matched case-control study and adjust for misclassification error is part of my future work. Rice (2003) has proposed an ingenious full-likelihood based method of adjusting for misclassification in any $n_1 : n_2$ matched case-control study with a single binary exposure, which uses the mixing distributions characterized in Rice (2004). I would consider the novel conditional likelihood proposed by Chatterjee et al. (2005) for matched case-control studies, with which one can exploit the G-E independence assumption. Then I will attempt to establish equivalence of this new conditional likelihood with the integrated full-likelihood for a special class of invariant marginal distributions in the spirit of Rice (2004), based on which I then could consider a full-likelihood based approach to account for misclassification errors. The interpretation of the novel conditional likelihood in terms of a random effects or Bayesian viewpoint would open up a new strategy for computing the point and interval estimates and make it easier to adjust for error due to misclassification. How to handle continuous exposures and non-ignorable missingness in exposure remains a topic of future research.


Table 5-4: Results for unmatched case-control data (750/750), where specificity for both the genetic and environmental factors = 1.0, $se_{0G} = se_{1G} = 0.95$ and $se_{0E} = se_{1E} = 0.9$; $P(D=1) \approx 0.01$, $P(E=1) \approx 0.5$ and $P(G=1) \approx 0.2$.

Assumptions              Misclassification   Statistic   OR_10    OR_01    theta    Power (H0: theta = 1)
True value                                               2.0000   2.0000   2.0000
No                       No                  MLE         2.0079   2.0206   2.0048   0.796
                                             s.e.        0.2729   0.3857   0.4893
                                             MSE         0.0743   0.1467   0.2509
                         Yes & unadjusted    MLE         1.8761   2.2789   1.7075   0.588
                                             s.e.        0.2459   0.4008   0.4084
                                             MSE         0.0727   0.2476   0.2475
                         Yes & adjusted      MLE         2.0077   2.0262   2.0453   0.817
                                             s.e.        0.2732   0.3871   0.5000
                                             MSE         0.0957   0.2197   0.3517
G-E independence         No                  MLE         2.0112   2.0211   1.9488   0.763
and rare disease                             s.e.        0.2617   0.3424   0.3296
                                             MSE         0.0656   0.1178   0.1128
                         Yes & unadjusted    MLE         1.8787   2.2755   1.6640   0.547
                                             s.e.        0.2355   0.3550   0.2614
                                             MSE         0.0675   0.2056   0.1803
                         Yes & adjusted      MLE         2.0109   2.0194   1.9866   0.786
                                             s.e.        0.3032   0.3991   0.4270
                                             MSE         0.0882   0.1678   0.1806
G-E independence         No                  MLE         2.0020   2.0005   1.9900   0.788
and P(D=1) known                             s.e.        0.2604   0.3388   0.3402
                                             MSE         0.0649   0.1150   0.1175
                         Yes & unadjusted    MLE         1.8713   2.2567   1.6957   0.577
                                             s.e.        0.2345   0.3521   0.2699
                                             MSE         0.0690   0.1936   0.1643
                         Yes & adjusted      MLE         1.9998   1.9951   2.0371   0.813
                                             s.e.        0.3016   0.3950   0.4432
                                             MSE         0.0872   0.1641   0.1954


Table 5-5: Results for unmatched case-control data (1000/1000), where specificity for both the genetic and environmental factors = 1.0, $se_{0G} = se_{1G} = 0.95$ and $se_{0E} = se_{1E} = 0.9$; $P(D=1) \approx 0.01$, $P(E=1) \approx 0.5$ and $P(G=1) \approx 0.2$.

Assumptions              Misclassification   Statistic   OR_10    OR_01    theta    Power (H0: theta = 1)
True value                                               2.0000   2.0000   2.0000
No                       No                  MLE         1.9912   2.0055   1.9828   0.887
                                             s.e.        0.2340   0.3306   0.4181
                                             MSE         0.0539   0.1271   0.1931
                         Yes & unadjusted    MLE         1.8717   2.2592   1.6897   0.696
                                             s.e.        0.2122   0.3435   0.3495
                                             MSE         0.0598   0.1836   0.2134
                         Yes & adjusted      MLE         2.0042   2.0105   2.0088   0.898
                                             s.e.        0.2360   0.3318   0.4243
                                             MSE         0.0725   0.1463   0.2523
G-E independence         No                  MLE         1.9953   2.0070   1.9329   0.864
and rare disease                             s.e.        0.2244   0.2937   0.2823
                                             MSE         0.0480   0.0959   0.0923
                         Yes & unadjusted    MLE         1.8797   2.2712   1.6308   0.636
                                             s.e.        0.2038   0.3067   0.2215
                                             MSE         0.0551   0.1578   0.1841
                         Yes & adjusted      MLE         2.0141   2.0212   1.9295   0.862
                                             s.e.        0.2628   0.3454   0.3573
                                             MSE         0.0685   0.1069   0.1307
G-E independence         No                  MLE         1.9863   1.9876   1.9728   0.883
and P(D=1) known                             s.e.        0.2234   0.2908   0.2913
                                             MSE         0.0477   0.0944   0.0943
                         Yes & unadjusted    MLE         1.8724   2.2527   1.6613   0.667
                                             s.e.        0.2030   0.3042   0.2287
                                             MSE         0.0566   0.1467   0.1657
                         Yes & adjusted      MLE         2.0041   1.9995   1.9765   0.884
                                             s.e.        0.2616   0.3423   0.3705
                                             MSE         0.0677   0.1062   0.1365


Table 5-6: Results for unmatched case-control data (750/750), where specificity for both the genetic and environmental factors = 1.0, $se_{0G} = se_{1G} = 0.9$ and $se_{0E} = se_{1E} = 0.8$; $P(D=1) \approx 0.01$, $P(E=1) \approx 0.5$ and $P(G=1) \approx 0.2$.

Assumptions              Misclassification   Statistic   OR_10    OR_01    theta    Power (H0: log theta = 0)
True value                                               2.0000   2.0000   2.0000
No                       No                  MLE         2.0079   2.0206   2.0048   0.796
                                             s.e.        0.2729   0.3857   0.4893
                                             MSE         0.0743   0.1467   0.2509
                         Yes & unadjusted    MLE         1.7950   2.4692   1.5001   0.386
                                             s.e.        0.2305   0.4084   0.3613
                                             MSE         0.0954   0.3965   0.4081
                         Yes & adjusted      MLE         2.0462   2.1126   2.0163   0.801
                                             s.e.        0.2788   0.4021   0.4943
                                             MSE         0.1405   0.3561   0.6661
G-E independence         No                  MLE         2.0112   2.0211   1.9488   0.763
and rare disease                             s.e.        0.2617   0.3424   0.3296
                                             MSE         0.0656   0.1178   0.1128
                         Yes & unadjusted    MLE         1.7949   2.4612   1.4584   0.343
                                             s.e.        0.2203   0.3631   0.2203
                                             MSE         0.0891   0.3460   0.3486
                         Yes & adjusted      MLE         2.0456   2.0927   1.9369   0.755
                                             s.e.        0.3581   0.4797   0.5194
                                             MSE         0.1253   0.2646   0.3227
G-E independence         No                  MLE         2.0020   2.0005   1.9900   0.788
and P(D=1) known                             s.e.        0.2604   0.3388   0.3402
                                             MSE         0.0649   0.1150   0.1175
                         Yes & unadjusted    MLE         1.7890   2.4454   1.4830   0.369
                                             s.e.        0.2195   0.3608   0.2275
                                             MSE         0.0912   0.3303   0.3265
                         Yes & adjusted      MLE         2.0328   2.0646   1.9934   0.789
                                             s.e.        0.3560   0.4748   0.5418
                                             MSE         0.1232   0.2564   0.3492


Table 5-7: Results for unmatched case-control data (1000/1000), where specificity for both the genetic and environmental factors = 1.0, $se_{0G} = se_{1G} = 0.9$ and $se_{0E} = se_{1E} = 0.8$; $P(D=1) \approx 0.01$, $P(E=1) \approx 0.5$ and $P(G=1) \approx 0.2$.

Assumptions              Misclassification   Statistic   OR_10    OR_01    theta    Power (H0: log theta = 0)
True value                                               2.0000   2.0000   2.0000
No                       No                  MLE         1.9912   2.0055   1.9828   0.887
                                             s.e.        0.2340   0.3306   0.4181
                                             MSE         0.0539   0.1271   0.1931
                         Yes & unadjusted    MLE         1.7887   2.4583   1.4780   0.460
                                             s.e.        0.1988   0.3524   0.3074
                                             MSE         0.0830   0.3674   0.3774
                         Yes & adjusted      MLE         2.0307   2.0873   1.9669   0.880
                                             s.e.        0.2391   0.3445   0.4161
                                             MSE         0.1053   0.2834   0.4169
G-E independence         No                  MLE         1.9953   2.0070   1.9329   0.864
and rare disease                             s.e.        0.2244   0.2937   0.2823
                                             MSE         0.0480   0.0959   0.0923
                         Yes & unadjusted    MLE         1.7830   2.4338   1.4744   0.455
                                             s.e.        0.1894   0.3111   0.1928
                                             MSE         0.0828   0.3014   0.3155
                         Yes & adjusted      MLE         2.0218   2.0470   1.9525   0.873
                                             s.e.        0.3054   0.4074   0.4493
                                             MSE         0.0985   0.1968   0.2135
G-E independence         No                  MLE         1.9863   1.9876   1.9728   0.883
and P(D=1) known                             s.e.        0.2234   0.2908   0.2913
                                             MSE         0.0477   0.0944   0.0943
                         Yes & unadjusted    MLE         1.7770   2.4179   1.4990   0.487
                                             s.e.        0.1887   0.3091   0.1991
                                             MSE         0.0852   0.2863   0.2927
                         Yes & adjusted      MLE         2.0092   2.0195   2.0086   0.898
                                             s.e.        0.3037   0.4033   0.4686
                                             MSE         0.0971   0.1916   0.2302


Table 5-8: Minimum number of cases (case:control ratio = 1) required to detect a 2-fold multiplicative interaction (OR10 = OR01 = 2 and OR11 = 8) with 80% power for different levels of sensitivities and specificities of the environmental and genetic factors, where P(E=1) = 0.5 and P(G=1) = 0.2.

se_dE   se_dG   sp_dE   sp_dG   No. of cases
1       1       1       1       763
0.9     1       1       1       1178
1       0.95    1       1       847
0.9     0.95    1       1       1314
0.8     0.9     1       1       2077
1       1       0.9     1       857
1       1       1       0.95    802
1       1       0.9     0.95    902
1       1       0.8     0.9     1096
0.9     0.95    0.9     0.95    1639
0.8     0.9     0.9     0.95    2830

Table 5-9: Minimum number of cases (case:control ratio = 1) required to detect a 3-fold multiplicative interaction (OR10 = 1.3, OR01 = 7 and OR11 = 3) with 80% power for different levels of sensitivities and specificities of the environmental and genetic factors, where P(E=1) = 0.2 and P(G=1) = 0.01.

se_dE   se_dG   sp_dE   sp_dG   No. of cases
1       1       1       1       3771
0.9     1       1       1       4563
1       0.95    1       1       3828
0.9     0.95    1       1       4478
0.8     0.9     1       1       5728
1       1       0.9     1       4543
1       1       1       0.95    1793
1       1       0.9     0.95    2235
1       1       0.8     0.9     3242
0.9     0.95    0.9     0.95    2944
0.8     0.9     0.9     0.95    3869


Figure 5-1: Minimum number of cases (case:control ratio = 1) required to detect a 2-fold interaction (OR10 = 2, OR01 = 2, and OR11 = 8) with 80% power as a function of the true prevalence of the environmental factor, P(E=1), for the prevalence of the genetic factor being 0.2, and for selected values of sensitivity and specificity of the exposure assessment.


CHAPTER 6
FUTURE WORK AND CONCLUSION

I plan to extend the work related to this dissertation in two principal directions. The first one is for estimating covariate effects in family-based case-control studies.

Genetic epidemiology is a relatively new field which applies the conventional epidemiologic designs and methods to explore the role genetic factors play in determining a disease. Both theoretical and empirical studies have shown that traditional linkage studies may be inferior in power compared to studies directly utilizing allele status. Population-based case-control association studies, the usual alternative, are subject to bias due to population stratification. As a compromise between linkage studies and population-based case-control studies, family-based association designs have received great attention recently due to their potentially higher power to identify complex disease genes and their robustness in the presence of population substructure (Zhao 2000).

A common phenomenon in genetic epidemiologic research is that sampled families are not representative of the targeted population, as they are ascertained through probands with known phenotypic values. It is well known in the literature that statistical inference without proper ascertainment corrections would lead to biased estimation of key parameters. One simple remedy is to condition on the observed phenotypic values of the probands (case or control). In family-based case-control studies, a natural approach to account for the family effect will be to conduct a matched case-control analysis with controls selected from the same family, and to use conditional logistic regression conditional on the number of cases in the family (there could be more than one person affected by the disease in a family). It has become increasingly clear that both genetic and environmental factors contribute to the aetiology of many


common diseases. If environmental exposures or other covariates are also important, they should be incorporated into these genetic analyses to control for confounding and increase statistical power.

Diggle et al. (1994, Ch. 9) present generalized linear mixed models for exponential families with canonical links. Witte et al. (1999) used a fixed effects logistic model in which the familial genetic effects were measured, rather than random effects, to analyze family studies. Pfeiffer et al. (2001) proposed the following two-level mixed effects model to account for common familial effects and for different genetic correlations among family members. Considering a binary disease variable $D_{ij}$ for the $j$th member of the $i$th family and a covariate variable $X_{ij}$, for $j = 1, \ldots, n_i$ and $i = 1, \ldots, N$, let

\[
\log\left\{\frac{p_{ij}}{1 - p_{ij}}\right\}
= \log\left\{\frac{\Pr(D_{ij} = 1 \mid a_i, g_{ij}, X_{ij})}{\Pr(D_{ij} = 0 \mid a_i, g_{ij}, X_{ij})}\right\}
= \mu + a_i + g_{ij} + \beta X_{ij}. \tag{6-1}
\]

Here $a_i$ denotes the random familial effect, which affects all family members equally, and $g_{ij}$ denotes an individual random genetic effect. They based their analysis on the marginal likelihood after integrating with respect to the joint random effects distribution. They adjusted for ascertainment by conditioning on the number of cases, $k_i \ge 2$, in the family and performed conditional maximum likelihood analysis based on this mixed effects model. The conditional distribution for family $i$ can be written as

\[
\Pr\Big(D_{i1}, \ldots, D_{in_i} \,\Big|\, X_{i1}, \ldots, X_{in_i}, \sum_{j=1}^{n_i} D_{ij} = k_i\Big)
= \frac{\Pr\big(D_{i1}, \ldots, D_{in_i}, \sum_{j=1}^{n_i} D_{ij} = k_i \,\big|\, X_{i1}, \ldots, X_{in_i}\big)}
       {\Pr\big(\sum_{j=1}^{n_i} D_{ij} = k_i \,\big|\, X_{i1}, \ldots, X_{in_i}\big)}. \tag{6-2}
\]

This approach allows one to estimate environmental effects while accounting for varying genetic correlations among family members. It took into account unmeasured familial and genetic effects that induce correlated responses and yielded


consistent estimators of covariate effects when the covariate has no effect on disease status, even with a misspecified random effects distribution.

The approach presented in Pfeiffer et al. (2001) has flexibility but comes with the drawback of computational complexity, as they have to approximate the integrated likelihood on a dense grid of points. Although the Monte Carlo approach worked well in their examples, larger Monte Carlo samples or other methods may be needed for larger pedigrees. I want to propose a full Bayesian alternative that builds in a hierarchical pedigree structure and assumes priors on the random effects, which offers a more unified and computationally appealing alternative.

As Pfeiffer et al. (2001) pointed out, more fundamentally, one often does not know the precise nature of the genetic influences and hence the distribution of the familial individual genetic effects. The estimated random effect for each individual family will change when the distributions of the random effects change. This point is important because there are many applications in which an estimate of the random effect itself is desired. For example, when the true distribution of the random effects is a mixture of normals, assuming that it is a single unmixed normal can lead to poor estimates of the random effects. In such applications, unbiased estimation of the random effects is crucial and the assumption of normality may introduce bias. From a Bayesian perspective, inferential interest focuses on the posterior distribution of the fixed effects as well as the random effects. Allowing distributions other than the normal for the random effects may more accurately model our prior beliefs. It is also important to accurately model the distribution of the random effects when prediction for a future observation from a given subject is desired. I will provide a general framework for Bayesian analysis for the random effects model where a nonparametric Dirichlet Process (DP) prior is specified for the random effects.

Pfeiffer et al. (2001) modeled the covariance matrix of the familial genetic random effects as a function of the degree of kinship between members in each family


by assuming no dominance component of the genetic variance (Fisher 1918). Their inference about the random effects was reduced to only a scalar common variance $\sigma_g^2$. I would instead consider a more complicated covariance structure with the parental variance, the siblings variance, the interclass parent-sib correlation and the intraclass sib-sib correlation, as usually used for multivariate familial data (Srivastava 1984, Srivastava et al. 1988). Therefore, I would make inference on covariance parameters related to the family data. This parametric covariance has a flexible degree of kinship and can also be easily extended to families of larger pedigrees.

I will extend my work in Chapter 5 to the matched case-control set-up, in particular to family-based case-control studies. Conditional logistic regression, the traditional method of analysis of family-based case-control data, fails to exploit the assumption that genetic susceptibility and environmental exposures are distributed independently of each other within families in the source population, and hence the traditional method can be inefficient. Alternatively, Chatterjee et al. (2005) proposed novel methods for the analysis of family-based case-control studies under the G-E independence assumption within families in the source population. The assumption of G-E independence within families in the source population is weaker than our assumption in Chapter 4, as it is not affected by spurious association between genotype and exposure status that may be created due to population substructure (Umbach and Weinberg, 2000). This approach leads to simple and yet highly efficient methods of estimating interaction and various other risk parameters of scientific interest.

Chatterjee et al. (2005) leave the question of misclassification errors unaddressed, while they develop the classical asymptotic efficiency theory. Rice (2003) has proposed an ingenious full-likelihood based method of adjusting for misclassification in any $n_1 : n_2$ matched case-control study with a single binary exposure, which uses the mixing distributions characterized in Rice (2004). I would first extend the result


of Rice (2004) for data with no misclassification and characterize a class of random effects mixing distributions for the nuisance parameters involved in the full likelihood such that, when they are integrated out, the integrated likelihood is identical to the novel conditional likelihood proposed by Chatterjee et al. (2005) under a rare disease assumption. The established equivalence is a significant and exciting finding in itself, which renders a Bayesian interpretation to the novel conditioning paradigm proposed by Chatterjee et al. (2005), exactly as Rice (2004) provides a Bayesian interpretation to the traditional conditional likelihood. The estimation of parameters can then be carried out using MCMC techniques as suggested by Rice (2004). Characterizing the class of random effects distributions will provide a method to adjust for misclassification error based on the full likelihood, which will be designed to include misclassification effects. I would obtain estimates and confidence intervals for the misclassified case which reduce back to the ones obtained via Chatterjee et al.'s conditional likelihood as misclassification error rates go to zero.

To conclude, in this dissertation I address the important foundational issue of equivalence of prospective and retrospective analysis in a Bayesian framework. I consider some new problems in the domain of case-control methodology which are emerging with modern advances in genetic technology. To popularize the use of Bayesian methods in this area, one needs wider dissemination of user-friendly software codes. A broader goal of my future work will be to make these codes available online in a usable form.


APPENDIX A
APPENDIX TO CHAPTER 3

This is an explanation of the approximation in equation (3-9). From equation (3-7), by the strong law of large numbers,

\[ \frac{1}{P(X \mid I)} \xrightarrow{a.s.} E\left(\frac{1}{Y}\right), \]

where $Y = P(X \mid Z, P, Q)$. This implies that

\[ -2 \log P(X \mid I) \xrightarrow{a.s.} 2 \log E\left(\frac{1}{Y}\right). \tag{A-1} \]

Let $W = -2 \log Y = [D(V) \mid X]$; then

\[ E\left(\frac{1}{Y}\right) = E_Y\{\exp(-\log Y)\} = M_W\left(\tfrac{1}{2}\right), \tag{A-2} \]

where $M_W(t)$ denotes the moment generating function of the distribution of $W$. By assuming that the deviance function $[D(V) \mid X]$ is normal, i.e., $W \sim N(\mu, \sigma^2)$, by (A-2), I have

\[ E\left(\frac{1}{Y}\right) = \exp\left(\mu/2 + \sigma^2/8\right). \]

Hence by (A-1), and the fact that $\hat\mu$ and $\hat\sigma^2$ are consistent estimates of $\mu$ and $\sigma^2$, I have the approximation in (3-9).

Remark: Suppose I assume, instead of normality of the deviance function, that $[D(V) \mid X] = W = -2 \log Y \sim \text{Gamma}(\mu^2/\sigma^2, \sigma^2/\mu)$, where $\mu$ and $\sigma^2$ are the mean and variance of $W$, and Gamma$(a, b)$ denotes a Gamma distribution with shape parameter $a$ and scale parameter $b$. Then, by steps similar to those above, one obtains an analogue of (3-9) in the Gamma case as

\[ -2 \log P(X \mid I) \approx -2\,\frac{\hat\mu^2}{\hat\sigma^2}\,\log\left(1 - \frac{\hat\sigma^2}{2\hat\mu}\right), \quad \text{for } \frac{\hat\sigma^2}{\hat\mu} < 2. \]
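The two closed forms above can be compared numerically. The following is a minimal sketch, not code from this dissertation: the function names are hypothetical, and the inputs `mu` and `sigma2` stand for the estimated mean and variance of the deviance draws $W = -2\log Y$. The normal case returns $2\log E(1/Y) = \mu + \sigma^2/4$, which follows from $E(1/Y) = \exp(\mu/2 + \sigma^2/8)$, and the Gamma case returns the expression in the Remark.

```python
import math


def neg2_log_marginal_normal(mu, sigma2):
    """Value of 2*log E(1/Y) when W = -2 log Y is N(mu, sigma2).

    From E(1/Y) = exp(mu/2 + sigma2/8), this equals mu + sigma2/4.
    """
    return mu + sigma2 / 4.0


def neg2_log_marginal_gamma(mu, sigma2):
    """Gamma analogue: W ~ Gamma(shape=mu^2/sigma2, scale=sigma2/mu).

    Uses M_W(1/2) = (1 - scale/2)^(-shape), which requires sigma2/mu < 2.
    """
    if mu <= 0 or sigma2 <= 0:
        raise ValueError("mu and sigma2 must be positive")
    if sigma2 / mu >= 2.0:
        raise ValueError("the Gamma approximation requires sigma2 / mu < 2")
    shape = mu * mu / sigma2
    scale = sigma2 / mu
    return -2.0 * shape * math.log(1.0 - scale / 2.0)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the dissertation.
    print(neg2_log_marginal_normal(100.0, 4.0))
    print(neg2_log_marginal_gamma(100.0, 4.0))
```

When $\sigma^2/\mu$ is small, a Taylor expansion of the logarithm shows that the Gamma form equals $\mu + \sigma^2/4$ plus higher-order positive terms, so the two approximations should nearly agree in that regime.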


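APPENDIX B
APPENDIX TO CHAPTER 4

B.1 Proof of Lemmas and Results

Proof of Lemma 1.

\[
\frac{P(G = g_m \mid E, S, D = 1)}{P(G = g_m \mid E, S, D = 0)}
= \frac{P(D = 1 \mid G = g_m, E, S)\,P(G = g_m \mid E, S)/P(D = 1 \mid E, S)}
       {P(D = 0 \mid G = g_m, E, S)\,P(G = g_m \mid E, S)/P(D = 0 \mid E, S)}
= \frac{P(D = 1 \mid G = g_m, E, S)/P(D = 0 \mid G = g_m, E, S)}
       {P(D = 1 \mid E, S)/P(D = 0 \mid E, S)}.
\]

Proof of Lemma 2. I use the following identity and apply Lemma 1:

\[
1 = \sum_{m=0}^{M} P(G = g_m \mid D = 1, E, S)
  = \sum_{m=0}^{M} \frac{P(G = g_m \mid D = 1, E, S)}{P(G = g_m \mid D = 0, E, S)}\,P(G = g_m \mid D = 0, E, S).
\]

Proof of Lemma 3. I begin with the identity

\[
\frac{p(E \mid S, D = 1)}{p(E \mid S, D = 0)}
= \frac{P(D = 1 \mid E, S)/P(D = 0 \mid E, S)}{P(D = 1 \mid S)/P(D = 0 \mid S)}.
\]

Then I observe that

\[
P(D = 1 \mid S) = \int P(D = 1 \mid E, S)\,p(E \mid S)\,dE
= \int \frac{P(D = 1 \mid E, S)}{P(D = 0 \mid E, S)}\,P(D = 0 \mid E, S)\,p(E \mid S)\,dE
= \int \frac{P(D = 1 \mid E, S)}{P(D = 0 \mid E, S)}\,p(E \mid D = 0, S)\,P(D = 0 \mid S)\,dE.
\]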
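Lemmas 1 and 2 can be checked numerically on a toy distribution. The sketch below is illustrative only: the function name and the input probabilities are assumptions, not quantities from Chapter 4. It fixes a stratum $(E, S)$, takes $P(G = g_m \mid E, S)$ and $P(D = 1 \mid G = g_m, E, S)$ as inputs, and verifies that the genotype odds ratio equals the ratio of disease odds (Lemma 1), that the weighted sum in Lemma 2 equals one, and that the marginal disease odds are the control-weighted mixture of genotype-specific odds.

```python
def lemma_check(p_g, p_d_given_g):
    """Evaluate both sides of Lemmas 1 and 2 for one fixed (E, S) stratum.

    p_g[m]         = P(G = g_m | E, S)
    p_d_given_g[m] = P(D = 1 | G = g_m, E, S)
    """
    # Marginal disease probability and odds within the stratum
    p_d1 = sum(pg * pd for pg, pd in zip(p_g, p_d_given_g))
    p_d0 = 1.0 - p_d1
    marginal_odds = p_d1 / p_d0

    # Genotype distributions among cases and controls (Bayes' rule)
    p_g_d1 = [pg * pd / p_d1 for pg, pd in zip(p_g, p_d_given_g)]
    p_g_d0 = [pg * (1.0 - pd) / p_d0 for pg, pd in zip(p_g, p_d_given_g)]

    # Lemma 1: P(G|D=1)/P(G|D=0) = odds(D | G) / odds(D)
    lemma1_lhs = [a / b for a, b in zip(p_g_d1, p_g_d0)]
    lemma1_rhs = [(pd / (1.0 - pd)) / marginal_odds for pd in p_d_given_g]

    # Lemma 2: the ratios, weighted by the control distribution, sum to 1
    lemma2_sum = sum(r * q for r, q in zip(lemma1_lhs, p_g_d0))

    # Consequence used in Result 1: marginal odds as a mixture over controls
    mixture_odds = sum((pd / (1.0 - pd)) * q for pd, q in zip(p_d_given_g, p_g_d0))

    return lemma1_lhs, lemma1_rhs, lemma2_sum, marginal_odds, mixture_odds


if __name__ == "__main__":
    # Hypothetical three-genotype example
    print(lemma_check([0.6, 0.3, 0.1], [0.01, 0.02, 0.05]))
```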


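Proof of Result 1. By Lemma 2 and (4-1) one gets

\[
\frac{P(D = 1 \mid E, Z = i)}{P(D = 0 \mid E, Z = i)}
= \sum_{m=0}^{M} \exp(\beta_{0i} + \beta_{1m} + \beta_2 E + \beta_{3m} E)\,\frac{\exp(\gamma_{im})}{\sum_{k=0}^{M} \exp(\gamma_{ik})}
= \frac{\exp(\beta_{0i} + \beta_2 E)\,\{1 + \sum_{k=1}^{M} \exp(\beta_{1k} + \beta_{3k} E + \gamma_{ik})\}}{1 + \sum_{k=1}^{M} \exp(\gamma_{ik})}. \tag{B-1}
\]

Substituting model (4-1) and (B-1) into Lemma 1, I get

\[
\frac{P(G = g_m \mid D = 1, E, Z = i)}{P(G = g_m \mid D = 0, E, Z = i)}
= \frac{\exp(\beta_{0i} + \beta_{1m} + \beta_2 E + \beta_{3m} E)}
       {\exp(\beta_{0i} + \beta_2 E)\,\{1 + \sum_{k=1}^{M} \exp(\beta_{1k} + \beta_{3k} E + \gamma_{ik})\} \big/ \{1 + \sum_{k=1}^{M} \exp(\gamma_{ik})\}}
= \frac{\exp(\beta_{1m} + \beta_{3m} E)\,\{1 + \sum_{k=1}^{M} \exp(\gamma_{ik})\}}{1 + \sum_{k=1}^{M} \exp(\beta_{1k} + \beta_{3k} E + \gamma_{ik})}.
\]

Now by (4-4), I get the result.

Proof of Result 2. Note that by (4-5), (4-6) and Lemma 2,

\[
\frac{P(D = 1 \mid E, Z = i)}{P(D = 0 \mid E, Z = i)}\,p(E \mid D = 0, Z = i, \theta_i = \omega_l)
= \frac{\exp\{\beta_{0i} + \beta_2 E\}\,\{\sum_{m=0}^{M} \exp(\beta_{1m} + \beta_{3m} E + \gamma_{im})\}}{\sum_{k=0}^{M} \exp(\gamma_{ik})}
  \cdot \frac{1}{\sqrt{2\pi\sigma_l^2}} \exp\left\{-\frac{(E - \mu_l)^2}{2\sigma_l^2}\right\}
\]
\[
= \frac{\exp(\beta_{0i})}{\sum_{k=0}^{M} \exp(\gamma_{ik})} \sum_{m=0}^{M}
  \left[\exp\left\{\beta_{1m} + \gamma_{im} + \frac{(\mu_l + \beta_2\sigma_l^2 + \beta_{3m}\sigma_l^2)^2 - \mu_l^2}{2\sigma_l^2}\right\} \phi(E; \omega_{lm})\right],
\]

where $\omega_{lm} = (\mu_l + \beta_2\sigma_l^2 + \beta_{3m}\sigma_l^2,\ \sigma_l^2)$ and $\phi(E; \omega_{lm})$ denotes the normal density with these mean and variance parameters. Now, by Lemma 3 and the above two results, I get

\[
p(E \mid D = 1, Z = i, \theta_i = \omega_l)
= \sum_{m=0}^{M} \underbrace{\frac{\exp\{\beta_{1m} + \gamma_{im} + (\mu_l + \beta_2\sigma_l^2 + \beta_{3m}\sigma_l^2)^2/(2\sigma_l^2)\}}
                                 {\sum_{k=0}^{M} \exp\{\beta_{1k} + \gamma_{ik} + (\mu_l + \beta_2\sigma_l^2 + \beta_{3k}\sigma_l^2)^2/(2\sigma_l^2)\}}}_{p_{ilm}}\ \phi(E; \omega_{lm}).
\]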


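B.2 Likelihood for the EDPM Model

The components in the retrospective likelihood (4-3) are as follows:

\[
P(G_j = g_m \mid Z_j = i, D_j = 0)\,P(E_j \mid Z_j = i, D_j = 0)
= \frac{\exp\{\gamma_{im} - (E_j - \mu_l)^2/(2\sigma_l^2)\}}{\sqrt{2\pi\sigma_l^2}\,\sum_{k=0}^{M} \exp(\gamma_{ik})},
\]
\[
P(G_j = g_m \mid E_j, Z_j = i, D_j = 1)\,P(E_j \mid Z_j = i, D_j = 1)
= \frac{\exp\{\beta_{1m} + \beta_2 E_j + \beta_{3m} E_j + \gamma_{im} - (E_j^2 - 2\mu_l E_j)/(2\sigma_l^2)\}}
       {\sqrt{2\pi\sigma_l^2}\,\sum_{k=0}^{M} \exp\{\beta_{1k} + \gamma_{ik} + (\mu_l + \beta_2\sigma_l^2 + \beta_{3k}\sigma_l^2)^2/(2\sigma_l^2)\}}.
\]

B.3 Computational Details

For the Bayesian methods I need to simulate random numbers from the full conditional distributions of the parameters given the data. When the conditional distributions do not have a standard form, I use the independence sampler Metropolis-Hastings algorithm to generate random numbers from the respective conditionals. Under the EDPM model, a cycle of the Gibbs sampler consists of the following steps.

Step 1. Draw $\beta_{1m}$, $\beta_{3m}$, $\gamma_{im}$, $m = 1, \ldots, M$, $i = 1, \ldots, I$, and $\beta_2$ following the usual Metropolis-Hastings algorithm.

Step 2. Draw observations from the posterior of the DPM, following the no gaps algorithm proposed by MacEachern and Muller (1998), as follows. Let $\omega = (\omega_1, \ldots, \omega_K)$ denote the set of distinct $\theta_i$'s, where $K \le I$ is the number of distinct elements in the vector $\theta = (\theta_1, \ldots, \theta_I)$. Let $s = (s_1, \ldots, s_I)$ denote the vector of configuration indicators defined by $s_i = k$ if and only if $\theta_i = \omega_k$, $i = 1, \ldots, I$. In this connection I use the term "cluster", where the $k$th cluster is defined as $I_k = \{i : s_i = k\}$, and define $n_k$ as the size of the $k$th cluster, so that $\sum_{k=1}^{K} n_k = I$. Now it is obvious that $\omega$ and $s$ uniquely determine $\theta$. However, to determine $\omega$ and $s$ uniquely from $\theta$ I need to redefine $\omega$ as follows. Define $\omega_1 = \theta_1$ and, for $j \ge 2$, $\omega_j = \theta_l$, where $l = \min\{r : \theta_r \ne \omega_1, \ldots, \theta_r \ne \omega_{j-1}\}$.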


Now, instead of simulating $\theta = (\theta_1, \ldots, \theta_I)$ directly, I simulate $\omega = (\omega_1, \ldots, \omega_K)$ and $s = (s_1, \ldots, s_I)$, as they are in 1-1 relation. I use "$-i$" to denote the situation when observation $i$ is removed. For example, $\theta_{-i} = (\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_I)$, and $K_{-i}$ denotes the number of clusters formed by $\theta_{-i}$. Likewise I define $\omega_{-i}$ and $n_{-i,k}$ as the distinct components in $\theta_{-i}$ and the cluster sizes after removing $\theta_i$. Note that $\sum_{r=1}^{K} n_{-i,r} = I - 1$.

1. Update $s_i$ by drawing from $[s_i \mid s_{-i}, \omega, \text{Data}]$ for every $i = 1, \ldots, I$:

\[
p(s_i = l \mid s_{-i}, \omega, \text{Data}) \propto p(s_i = l \mid s_{-i}, \omega) \prod_{j : Z_j = i} \Pr(E_j \mid Z_j, D_j, \theta_i = \omega_l),
\]

where $p(s_i = l \mid s_{-i}, \omega)$ is defined below. When $n_{s_i} > 1$,

\[
p(s_i = l \mid s_{-i}, \omega) =
\begin{cases}
c\,n_{-i,l} & \text{for } l = 1, 2, \ldots, K_{-i}, \\
c\,\alpha/(K_{-i} + 1) & \text{for } l = K_{-i} + 1.
\end{cases} \tag{B-2}
\]

Note that if $s_i$ happens to be $K_{-i} + 1$, then $\omega_{K_{-i}+1}$ is simply a random draw from $P_0$. If $n_{s_i} = 1$, then $K_{-i} = K - 1$; with probability $(K - 1)/K$ leave $s_i$ unchanged, i.e., $\theta_i = \omega_{s_i}$; otherwise relabel the clusters such that $s_i = K$ and then resample $s_i$ with the probabilities in (B-2). Now if the new $s_i$ happens to be $K_{-i} + 1 = K$, then the preceding relabeling keeps the previous value of $\theta_i$ as $\omega_K$ and nothing is changed except a possible relabeling of $\omega$ and hence of $s$. If the new $s_i \le K_{-i}$, the last element of $\omega$ after relabeling is discarded.

2.
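Once the configuration indicators and the associated clusters are determined, I move on to update the $\omega$'s. The full conditional distribution of $\omega_l$ is

\[
[\omega_l \mid s, \beta_1, \beta_2, \beta_3, \gamma] \propto dP_0(\omega_l) \prod_{\{i : s_i = l\}} \prod_{j : Z_j = i} \Pr(E_j \mid Z_j, D_j, \theta_i = \omega_l), \tag{B-3}
\]

which is not in standard form; therefore I again use the Metropolis-Hastings algorithm to update the $\omega_l$'s. Drawing a random number from the bivariate distribution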


$P_0$ is equivalent to first drawing $\sigma^2$ from IG$(s/2, S/2)$ and then, conditional on $\sigma^2$, $\tau$, and $m_0$, drawing $\mu$ from $N(m_0, \tau\sigma^2)$.

Step 3. I update the hyperparameters as follows.

(a) Conditional on $\omega = (\omega_1, \ldots, \omega_K)$ and $\tau$, $m_0$ is conditionally independent of the data and all other parameters, and follows a normal distribution with

\[
E(m_0 \mid \tau, \omega) = (1 - x)\,\mu_{m_0} + x V \sum_{k=1}^{K} \mu_k/\sigma_k^2
\quad \text{and} \quad
V(m_0 \mid \tau, \omega) = x \tau V,
\]

where $x = \sigma_{m_0}^2/(\sigma_{m_0}^2 + \tau V)$ and $V = \big(\sum_{k=1}^{K} \sigma_k^{-2}\big)^{-1}$.

(b) Conditional on $\omega$ and $m_0$, the distribution of $\tau$ is free of the other parameters and the data, and the full conditional distribution $\tau \mid \cdot$ is

\[
\text{IG}\left(\frac{1}{2}(a + K),\ \frac{1}{2}\Big(b + \sum_{k=1}^{K} \frac{(\mu_k - m_0)^2}{\sigma_k^2}\Big)\right).
\]

Step 4. I update the value of $\alpha$ in the following two steps.

(i) Sample $\eta$ from $p(\eta \mid \alpha, K) \propto \eta^{\alpha}(1 - \eta)^{I-1}$;

(ii) Sample $\alpha$ from $\pi\,\text{Gamma}(a + K, b - \log\eta) + (1 - \pi)\,\text{Gamma}(a + K - 1, b - \log\eta)$, where $\pi/(1 - \pi) = (a + K - 1)/\{I(b - \log\eta)\}$.

Convergence of the chain was assessed by computing the Gelman and Rubin (1992) (GR) diagnostic. The posterior means are taken as the parameter estimates. To reduce autocorrelation among observations, I ran multiple chains and took every 5th observation from the respective MCMC chain after the burn-in period of 30000 runs, and calculated the posterior mean, standard deviation and 95% HPD region based on observations from the last 10000 MCMC runs of each chain.
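Step 4 has the form of the auxiliary-variable update for the Dirichlet process precision described by Escobar and West (listed in the references). The following is a minimal sketch of one such update using only the Python standard library. It is not the code used for the analyses reported here: the function name `update_alpha` is hypothetical, and it assumes that the second argument of the Gamma distributions in Step 4 is a rate, as in Escobar and West's parameterization, so the rate is inverted before calling `gammavariate`, which expects a scale.

```python
import math
import random


def update_alpha(alpha, K, I, a, b, rng):
    """One auxiliary-variable update of the DP precision alpha (Step 4).

    K = current number of clusters, I = number of strata,
    (a, b) = Gamma prior hyperparameters for alpha (b treated as a rate;
    this is an assumption about the intended parameterization).
    """
    # (i) eta | alpha, K has density proportional to eta^alpha (1-eta)^(I-1),
    #     i.e. a Beta(alpha + 1, I) distribution
    eta = rng.betavariate(alpha + 1.0, float(I))
    eta = max(eta, 1e-300)  # guard against log(0) in degenerate draws

    # (ii) alpha | eta, K is a two-component Gamma mixture with
    #      mixing odds (a + K - 1) / {I (b - log eta)}
    rate = b - math.log(eta)
    odds = (a + K - 1.0) / (I * rate)
    weight = odds / (1.0 + odds)
    shape = a + K if rng.random() < weight else a + K - 1.0

    # random.gammavariate takes (shape, scale), so pass 1/rate as the scale
    return rng.gammavariate(shape, 1.0 / rate)


if __name__ == "__main__":
    rng = random.Random(2006)
    alpha = 1.0
    for _ in range(5):
        # Hypothetical values of K, I, a, b, for illustration only
        alpha = update_alpha(alpha, K=5, I=50, a=2.0, b=1.0, rng=rng)
        print(alpha)
```

In a full sampler this update would be called once per Gibbs cycle with the current number of clusters $K$ produced by Step 2.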


APPENDIX C
APPENDIX TO CHAPTER 5

Note: All the parameters are defined the same as in the text, except those defined separately here. Let $P(G = 1) = q_G$ and $P(E = 1) = q_E$.

C.1 The Constrained ML Equations under G-E Independence and Rare Disease Assumptions in Unmatched Case-Control Studies

These constraint equations are obtained by differentiating the logarithm of the likelihood (5-3) with respect to the corresponding parameters:

\[ p^2(p_{04} r_{01} - p_{01} r_{04}) + p\,p_{01} p_{04}\,(r_{11} - OR_{10} OR_{01} \psi\, r_{14}) = p_{01} p_{04}\,(1 - OR_{10} OR_{01} \psi)\,q \]
\[ p^2(p_{04} r_{02} - p_{02} r_{04}) + p\,p_{02} p_{04}\,OR_{10}\,(r_{12} - OR_{01} \psi\, r_{14}) = p_{02} p_{04}\,(OR_{10} - OR_{10} OR_{01} \psi)\,q \]
\[ p^2(p_{04} r_{03} - p_{03} r_{04}) + p\,p_{03} p_{04}\,OR_{01}\,(r_{13} - OR_{10} \psi\, r_{14}) = p_{03} p_{04}\,(OR_{01} - OR_{10} OR_{01} \psi)\,q \]
\[ \frac{r_{12} + r_{14}}{OR_{10}} - (p_{02} + p_{04}\,OR_{01} \psi)\,\frac{n_1}{p} = 0 \tag{A.1} \]
\[ \frac{r_{13} + r_{14}}{OR_{01}} - (p_{03} + p_{04}\,OR_{10} \psi)\,\frac{n_1}{p} = 0 \tag{A.2} \]
\[ \frac{r_{14}}{\psi} - p_{04}\,OR_{10} OR_{01}\,\frac{n_1}{p} = 0, \tag{A.3} \]

where $q = p_{01} r_{11} + p_{02}\,OR_{10}\,r_{12} + p_{03}\,OR_{01}\,r_{13} + p_{04}\,OR_{10} OR_{01} \psi\, r_{14}$. Recall that $p_{04} = 1 - p_{01} - p_{02} - p_{03}$ and $p = p_{01} + p_{02}\,OR_{10} + p_{03}\,OR_{01} + p_{04}\,OR_{10} OR_{01} \psi$. The restricted MLEs are the solutions to the above equations subject to the restriction $p_{01} p_{04} = p_{02} p_{03}$. In what follows I show how to obtain them.

(1) Plugging in (A.1), (A.2) and (A.3), I have

\[
\hat p = \hat p_{01} + \widehat{OR}_{10}(\hat p_{02} + \hat p_{04}\,\widehat{OR}_{01}\hat\psi) + \widehat{OR}_{01}(\hat p_{03} + \hat p_{04}\,\widehat{OR}_{10}\hat\psi) - \hat p_{04}\,\widehat{OR}_{10}\widehat{OR}_{01}\hat\psi
= \hat p_{01} + \frac{r_{12} + r_{14}}{n_1}\hat p + \frac{r_{13} + r_{14}}{n_1}\hat p - \frac{r_{14}}{n_1}\hat p
= \hat p_{01} + \frac{n_1 - r_{11}}{n_1}\hat p;
\]


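thus $\frac{r_{11}}{n_1}\hat p = \hat p_{01}$ and

\[ \hat p_{11} = \frac{\hat p_{01}}{\hat p} = \frac{r_{11}}{n_1}. \]

Also by (A.1), (A.2) and (A.3), I can obtain

\[ \hat p_{1j} = \frac{r_{1j}}{n_1}, \quad j = 2, 3, 4. \]

(2) Thus I can write the profile likelihood as

\[ L_p(p_0; \hat p_1) = \prod_{j=1}^{4} p_{0j}^{r_{0j}} \prod_{j=1}^{4} \hat p_{1j}^{r_{1j}} \propto \prod_{j=1}^{4} p_{0j}^{r_{0j}}. \]

By the G-E independence and rare disease assumptions, I have

\[ p_{01} p_{04} = p_{01}(1 - p_{01} - p_{02} - p_{03}) = p_{02} p_{03}; \]

i.e.,

\[ p_{01} = p_{01}^2 + p_{01} p_{02} + p_{01} p_{03} + p_{02} p_{03} = (p_{01} + p_{02})(p_{01} + p_{03}); \]

similarly, I have

\[
\begin{aligned}
p_{01} &= (p_{01} + p_{02})(p_{01} + p_{03}) \\
p_{02} &= (p_{01} + p_{02})(p_{02} + p_{04}) \\
p_{03} &= (p_{01} + p_{03})(p_{03} + p_{04}).
\end{aligned} \tag{A.4}
\]

So writing

\[ L_p(p_0; \hat p_1) \propto (p_{01} + p_{02})^{r_{01} + r_{02}} (p_{01} + p_{03})^{r_{01} + r_{03}} (p_{02} + p_{04})^{r_{02} + r_{04}} (p_{03} + p_{04})^{r_{03} + r_{04}}, \]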


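I have

\[
\begin{aligned}
\hat p^{IR}_{01} + \hat p^{IR}_{02} &= \frac{r_{01} + r_{02}}{n_0}, &
\hat p^{IR}_{01} + \hat p^{IR}_{03} &= \frac{r_{01} + r_{03}}{n_0}, \\
\hat p^{IR}_{02} + \hat p^{IR}_{04} &= \frac{r_{02} + r_{04}}{n_0}, &
\hat p^{IR}_{03} + \hat p^{IR}_{04} &= \frac{r_{03} + r_{04}}{n_0}.
\end{aligned}
\]

Plugging into (A.4), I have (5-7).

The estimated asymptotic variance-covariance matrix can be obtained as the inverse of the observed information matrix. The observed information matrix is constructed by taking the second derivatives of the log-likelihood with respect to the parameters and evaluating them at the MLEs of the parameters, which are the solutions to the above equations. Here I state how I use the delta method, along with the properties of a multinomial distribution and a binomial distribution, to obtain the estimated asymptotic variance of the odds ratios. First I consider $OR_{10}$, whose MLE is

\[
\widehat{OR}^{IR}_{10} = \frac{r_{12}}{r_{11}} \cdot \frac{r_{01} + r_{03}}{r_{02} + r_{04}} = \frac{\hat p_{12}}{\hat p_{11}} \cdot \frac{\hat p_{01} + \hat p_{03}}{\hat p_{02} + \hat p_{04}},
\]

where $\hat p_{dj} = r_{dj}/n_d$ ($d = 0, 1$ and $j = 1, 2, 3, 4$). Let $\hat\pi = \hat p_{01} + \hat p_{03}$, the estimate of $\pi = p_{01} + p_{03}$. I artificially build a distribution $P_A$ which includes independent distributions $P_m$ and $P_b$ to satisfy this particular odds ratio estimate:

\[ P_A = P_m P_b \propto p_{11}^{r_{11}}\,p_{12}^{r_{12}}\,(1 - p_{11} - p_{12})^{r_{13} + r_{14}}\ \pi^{r_{01} + r_{03}} (1 - \pi)^{r_{02} + r_{04}}. \]

Note that for the multinomial distribution $P_m \propto p_{11}^{r_{11}}\,p_{12}^{r_{12}}\,(1 - p_{11} - p_{12})^{r_{13} + r_{14}}$,

\[
\begin{pmatrix} \hat p_{12} \\ \hat p_{11} \end{pmatrix} \sim AN\left( \begin{pmatrix} p_{12} \\ p_{11} \end{pmatrix}, \Sigma \right),
\quad \text{with} \quad
\Sigma = \frac{1}{n_1} \begin{pmatrix} p_{12}(1 - p_{12}) & -p_{11} p_{12} \\ -p_{11} p_{12} & p_{11}(1 - p_{11}) \end{pmatrix}.
\]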


Let $g(x, y) = \log x - \log y$; then

\[ \frac{\partial g(x, y)}{\partial x} = \frac{1}{x} \quad \text{and} \quad \frac{\partial g(x, y)}{\partial y} = -\frac{1}{y}; \]

thus by the delta method,

\[
\log\frac{\hat p_{12}}{\hat p_{11}} \sim AN\left( \log\frac{p_{12}}{p_{11}},\ \Big(\frac{1}{p_{12}}, -\frac{1}{p_{11}}\Big)\,\Sigma\,\Big(\frac{1}{p_{12}}, -\frac{1}{p_{11}}\Big)^T \right),
\]

so the estimated asymptotic variance of $\log(\hat p_{12}/\hat p_{11})$ is

\[ AVAR\left(\log\frac{\hat p_{12}}{\hat p_{11}}\right) = \frac{1}{n_1}\left(\frac{1}{\hat p_{11}} + \frac{1}{\hat p_{12}}\right) = \frac{1}{r_{11}} + \frac{1}{r_{12}}. \]

Similarly, for the binomial distribution $P_b \propto \pi^{r_{01} + r_{03}} (1 - \pi)^{r_{02} + r_{04}}$,

\[ \hat\pi \sim AN\left(\pi,\ \frac{1}{n_0}\,\pi(1 - \pi)\right). \]

Let $g(x) = \log x - \log(1 - x)$; then

\[ \frac{dg(x)}{dx} = \frac{1}{x} + \frac{1}{1 - x} = \frac{1}{x(1 - x)}; \]

thus by the delta method,

\[ \log\frac{\hat\pi}{1 - \hat\pi} \sim AN\left( \log\frac{\pi}{1 - \pi},\ \frac{1}{n_0}\Big(\frac{1}{\pi} + \frac{1}{1 - \pi}\Big) \right), \]

so the estimated asymptotic variance of $\log\{\hat\pi/(1 - \hat\pi)\}$ is

\[ AVAR\left(\log\frac{\hat\pi}{1 - \hat\pi}\right) = \frac{1}{n_0}\left(\frac{1}{\hat\pi} + \frac{1}{1 - \hat\pi}\right) = \frac{1}{r_{01} + r_{03}} + \frac{1}{r_{02} + r_{04}}. \]

Since

\[ \log\widehat{OR}^{IR}_{10} = \log\hat p_{12} - \log\hat p_{11} + \log\hat\pi - \log(1 - \hat\pi); \]


then

\[
AVAR\big(\log\widehat{OR}^{IR}_{10}\big) = AVAR\left(\log\frac{\hat p_{12}}{\hat p_{11}}\right) + AVAR\left(\log\frac{\hat\pi}{1 - \hat\pi}\right)
= \frac{1}{r_{11}} + \frac{1}{r_{12}} + \frac{1}{r_{01} + r_{03}} + \frac{1}{r_{02} + r_{04}}.
\]

Calculating the estimated asymptotic variances of $\widehat{OR}^{IR}_{01}$ and $\hat\psi^{IR}$ follows the same ideas.

C.2 Obtain Restriction (5-6)

Following (5-5), I have

\[ (1 - q_G)(1 - q_E) = \{1 - P(D = 1)\}\,p_{01} + P(D = 1)\,p_{01}/p \tag{A.5} \]
\[ (1 - q_G)\,q_E = \{1 - P(D = 1)\}\,p_{02} + P(D = 1)\,OR_{10}\,p_{02}/p \tag{A.6} \]
\[ q_G\,(1 - q_E) = \{1 - P(D = 1)\}\,p_{03} + P(D = 1)\,OR_{01}\,p_{03}/p \tag{A.7} \]
\[ q_G\,q_E = \{1 - P(D = 1)\}\,p_{04} + P(D = 1)\,OR_{10} OR_{01} \psi\,p_{04}/p. \tag{A.8} \]

Note that by (A.6) and (A.8),

\[ q_E = \{1 - P(D = 1)\}\,p_{02} + P(D = 1)\,OR_{10}\,p_{02}/p + \{1 - P(D = 1)\}\,p_{04} + P(D = 1)\,OR_{10} OR_{01} \psi\,p_{04}/p \tag{A.9} \]

and by (A.7) and (A.8),

\[ q_G = \{1 - P(D = 1)\}\,p_{03} + P(D = 1)\,OR_{01}\,p_{03}/p + \{1 - P(D = 1)\}\,p_{04} + P(D = 1)\,OR_{10} OR_{01} \psi\,p_{04}/p; \tag{A.10} \]

thus, since $q_G q_E$ in (A.8) is the product of (A.9) and (A.10), I have

\[
\{1 - P(D = 1)\}\,p_{04} + P(D = 1)\,OR_{10} OR_{01} \psi\,p_{04}/p
= \big[\{1 - P(D = 1)\}\,p_{02} + P(D = 1)\,OR_{10}\,p_{02}/p + \{1 - P(D = 1)\}\,p_{04} + P(D = 1)\,OR_{10} OR_{01} \psi\,p_{04}/p\big]
\times \big[\{1 - P(D = 1)\}\,p_{03} + P(D = 1)\,OR_{01}\,p_{03}/p + \{1 - P(D = 1)\}\,p_{04} + P(D = 1)\,OR_{10} OR_{01} \psi\,p_{04}/p\big],
\]

which is equivalent to (5-6).
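The closed forms for $\widehat{OR}^{IR}_{10}$ and its estimated asymptotic variance in C.1 can be evaluated directly from the cell counts. The sketch below is illustrative: the function name and the example counts are hypothetical, cells are ordered $j = 1, \ldots, 4$ as in the text, and the Wald-type interval on the log scale (with `z = 1.96`) is an added convenience rather than something stated in the appendix.

```python
import math


def or10_independence_rare(controls, cases, z=1.96):
    """OR10 estimate under G-E independence and rare disease, with its
    delta-method variance on the log scale.

    controls = (r01, r02, r03, r04), cases = (r11, r12, r13, r14).
    Returns (estimate, AVAR of log estimate, Wald interval for OR10).
    """
    r01, r02, r03, r04 = controls
    r11, r12, r13, r14 = cases

    # Point estimate: (r12 / r11) * (r01 + r03) / (r02 + r04)
    or10 = (r12 / r11) * (r01 + r03) / (r02 + r04)

    # AVAR(log OR10) = 1/r11 + 1/r12 + 1/(r01 + r03) + 1/(r02 + r04)
    avar_log = 1.0 / r11 + 1.0 / r12 + 1.0 / (r01 + r03) + 1.0 / (r02 + r04)

    # Wald interval built on the log scale (an assumption of this sketch)
    half_width = z * math.sqrt(avar_log)
    interval = (or10 * math.exp(-half_width), or10 * math.exp(half_width))
    return or10, avar_log, interval


if __name__ == "__main__":
    # Hypothetical cell counts, for illustration only
    print(or10_independence_rare((400, 100, 400, 100), (100, 50, 300, 150)))
```

Note that the estimate uses only the column margins of the control table, which is where the G-E independence assumption enters.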


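C.3 Proof of Remark 3

In fact, by the results in Table 5-2, I can easily see that Remark 3 holds. However, here I use a slightly different way to see the role of the independence and rare disease assumptions in the proof. I only consider the case in the absence of misclassification; for the case of misclassification, the proof is straightforward.

To obtain the MLEs of the parameters, I differentiate the logarithm of the likelihood (5-3) with respect to the corresponding parameters and obtain the equations as in C.1. Thus I have

\[ OR_{10} = \frac{r_{12}\,p}{n_1\,p_{02}}, \qquad OR_{01} = \frac{r_{13}\,p}{n_1\,p_{03}}, \qquad \psi = \frac{r_{14}\,n_1\,p_{02}\,p_{03}}{r_{12}\,r_{13}\,p_{04}\,p}. \tag{A.11} \]

Also note that under the G-E independence and rare disease assumptions, I have an additional restriction as in (5-4). So together with the definition of $p_{1j}$ ($j = 1, 2, 3, 4$) in (5-2), and $\hat p_{11} n_1 = r_{11}$, I have

\[
\hat\psi_{IR} = \frac{r_{14}\,n_1\,\hat p_{02}\,\hat p_{03}}{r_{12}\,r_{13}\,\hat p_{04}\,\hat p}
\overset{(5\text{-}4)}{=} \frac{r_{14}\,n_1\,\hat p_{01}}{r_{12}\,r_{13}\,\hat p}
\overset{(5\text{-}2)}{=} \frac{r_{14}\,n_1\,\hat p_{11}}{r_{12}\,r_{13}}
\overset{\hat p_{11} n_1 = r_{11}}{=} \frac{r_{14}\,r_{11}}{r_{12}\,r_{13}} = \hat\psi_{CO}.
\]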
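The chain of equalities in the proof can be traced numerically. The sketch below is illustrative, with hypothetical function names and counts: it builds the product-form control probabilities implied by (A.4), recovers $\hat p$ from $\hat p_{11} = \hat p_{01}/\hat p = r_{11}/n_1$, evaluates the expression for $\hat\psi_{IR}$ in (A.11), and compares it with the case-only estimator $r_{14} r_{11}/(r_{12} r_{13})$.

```python
def restricted_control_probs(controls):
    """Control cell probabilities under the restriction p01*p04 = p02*p03,
    written in the product form of (A.4)."""
    r01, r02, r03, r04 = controls
    n0 = r01 + r02 + r03 + r04
    a = (r01 + r02) / n0  # estimate of p01 + p02
    b = (r01 + r03) / n0  # estimate of p01 + p03
    return a * b, a * (1.0 - b), (1.0 - a) * b, (1.0 - a) * (1.0 - b)


def interaction_independence_rare(controls, cases):
    """Interaction estimate from (A.11) evaluated at the restricted MLEs."""
    p01, p02, p03, p04 = restricted_control_probs(controls)
    r11, r12, r13, r14 = cases
    n1 = r11 + r12 + r13 + r14
    p_hat = n1 * p01 / r11  # from p11_hat = p01_hat / p_hat = r11 / n1
    return r14 * n1 * p02 * p03 / (r12 * r13 * p04 * p_hat)


def interaction_case_only(cases):
    """Case-only estimate r14 * r11 / (r12 * r13)."""
    r11, r12, r13, r14 = cases
    return r14 * r11 / (r12 * r13)


if __name__ == "__main__":
    # Hypothetical cell counts, for illustration only
    controls = (380, 120, 410, 90)
    cases = (100, 60, 280, 160)
    print(interaction_independence_rare(controls, cases))
    print(interaction_case_only(cases))
```

The control counts affect the intermediate quantities but cancel in the final ratio, which is the content of Remark 3.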


REFERENCES

Agresti, A. Categorical Data Analysis, Second Edition. New York: Wiley.

Albert, P.S., Ratnasinghe, D., Tangrea, J., and Wacholder, S. Limitations of the case-only design for identifying gene-environment interaction. American Journal of Epidemiology, 154:687-693.

Anderson, J.A. Separate sample logistic discrimination. Biometrika, 59:19-35.

Antoniak, C.E. Mixtures of Dirichlet processes with applications to nonparametric problems. The Annals of Statistics, 2:1152-1174.

Antoniou, A.C., Pharoah, P.P., Smith, P., and Easton, D.F. (2004). The BOADICEA model of genetic susceptibility to breast and ovarian cancer. British Journal of Cancer, 91:1580-1590.

Ashby, D., Hutton, J.L., and McGee, M.A. Simple Bayesian analyses for case-controlled studies in cancer epidemiology. The Statistician, 42:385-389.

Baker, S.G. The multinomial Poisson transformation. The Statistician, 43:495-504.

Bashir, S.A., and Duffy, S.W. The correction of risk estimates for measurement error. Annals of Epidemiology, 7:156-164.

Botto, L.D., and Khoury, M.J. Commentary: Facing the challenge of gene-environment interaction: the two by four table and beyond. American Journal of Epidemiology, 153:1016-1020.

Brennan, P. Gene-environment interaction and aetiology of cancer: what does it mean and how can we measure it? Carcinogenesis, 23:381-387.

Breslow, N.E. (1996). Statistics in epidemiology: The case-control study. Journal of the American Statistical Association, 91:14-28.

Breslow, N.E., and Powers, W. Are there two logistic regressions for retrospective studies? Biometrics, 34:100-105.


Breslow, N.E., Robins, J.M., and Wellner, J.A. On the semi-parametric efficiency of logistic regression under case-control sampling. Bernoulli, 6:447-455.

Carroll, R.J., Wang, S., and Wang, C.Y. Prospective analysis of logistic case-control studies. Journal of the American Statistical Association, 90:157-169.

Chatterjee, N., and Carroll, R. Semiparametric maximum likelihood estimation exploiting gene-environment independence in case-control studies. Biometrika, 92:399-418.

Chatterjee, N., Kalaylioglu, Z., and Carroll, R. Exploiting gene-environment independence in family-based case-control studies: Increased power for detecting associations, interactions and joint effects. Genetic Epidemiology, 28:138-156.

Clayton, D., and McKeigue, P.M. Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet, 358:1356-1360.

Committee on DNA Forensic Science: An Update. The evaluation of forensic DNA evidence. National Academy Press, Washington DC.

Cornfield, J. A method of estimating comparative rates from clinical data: applications to cancer of the lung, breast, and cervix. Journal of the National Cancer Institute, 11:1269-1275.

Cornfield, J. A statistical problem arising from retrospective studies. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, ed. J. Neyman, Berkeley, CA: University of California Press, pp. 135-146.

Cornfield, J., Gordon, T., and Smith, W.W. (1961). Quantal response curves for experimentally uncontrolled variables. Bulletin of the International Statistical Institute, 38:97-115.

Couch, F.J., DeShano, M.L., Blackwood, M.A., Calzone, K., Stopfer, J., Campeau, L., Ganguly, A., Rebbeck, T., Weber, B.L., Jablon, L., Cobleigh, M.A., Hoskins, K., and Garber, J.E. BRCA1 mutations in women attending clinics that evaluate the risk of breast cancer. The New England Journal of Medicine, 336:1409-1415.

Cox, D.R. Interaction (with discussion). International Statistical Review, 52:1-32.

Dawid, A.P. Conditional independence in statistical theory. Journal


of the Royal Statistical Society, Series B, 41:1-31.

Day, N.E., and Kerridge, D.F. A general maximum likelihood discriminant. Biometrics, 23:313-323.

Diggle, P., Heagerty, P., Liang, K., and Zeger, S.L. Analysis of Longitudinal Data. New York: Oxford University Press.

Diggle, P.J., Morris, S.E., and Wakefield, J.C. (2000). Point-source modeling using matched case-control data. Biostatistics, 1:89-105.

Escobar, M.D., and West, M. Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association, 90:577-588.

Falush, D., Stephens, M., and Pritchard, J.K. Inference of Population Structure Using Multilocus Genotype Data: Linked Loci and Correlated Allele Frequencies. Genetics, 164:1567-1587.

Fisher, R.A. The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh, 52:399-433.

Freedman, M.L., Reich, D., Penney, K.L., McDonald, G.J., Mignault, A.A., Patterson, N., Gabriel, S.B., Topol, E.J., Smoller, J.W., Pato, C.N., Pato, M.T., Petryshen, T.L., Kolonel, L.N., Lander, E.S., Sklar, P., Henderson, B., Hirschhorn, J.N., and Altshuler, D. Assessing the impact of population stratification on genetic association studies. Nature Genetics, 36:388-393.

Garcia-Closas, M., Rothman, N., and Lubin, J. Misclassification in case-control studies of gene-environment interactions: assessment of bias and sample size. Cancer Epidemiology, Biomarkers & Prevention, 8:1043-1050.

Garcia-Closas, M., Thompson, W.D., and Robins, J.M. Differential misclassification and the assessment of gene-environment interactions in case-control studies. American Journal of Epidemiology, 147:426-433.

Gatto, N.M., Campbell, U.B., Rundle, A.G., and Ahsan, H. Further development of the case-only design for assessing gene-environment interaction: evaluation of and adjustment for bias. International Journal of Epidemiology, 33:1014-1024.

Gelman, A., and Rubin, D.B. Inference from iterative simulation using multiple sequences. Statistical Science, 7:457-472.

Ghosh, M., and Chen, M-H. Bayesian inference for matched case-control studies. Sankhya, Series B, Pt. 2, 64:107-127.


Ghosh, M., Ghosh, A., Chen, M-H., and Agresti, A. Noninformative priors for one-parameter item response models. Journal of Statistical Planning and Inference, 88:99-115.

Greenland, S. On correcting for misclassification in twin studies and other matched pair studies. Statistics in Medicine, 8:825-829.

Greenland, S., and Kleinbaum, D.G. Correcting for misclassification in two-way tables and matched-pair studies. International Journal of Epidemiology, 12:93-97.

Gustafson, P., Le, N.D., and Vallee, M. A Bayesian approach to case-control studies with errors in covariables. Biostatistics, 3:229-243.

Hayakawa, T., Nagai, Y., Kahara, T., Yamashita, H., Takamura, T., Abe, T., Nomura, G., and Kobayashi, K. (2000). Gln27Glu and Arg16Gly polymorphisms of the beta2-adrenergic receptor gene are not associated with obesity in Japanese men. Metabolism, 49:1215-1218.

Hoggart, C.J., Parra, E.J., Shriver, M.D., Bonilla, C., Kittles, R.A., Clayton, D.G., and McKeigue, P.M. Control of confounding of genetic associations in stratified populations. American Journal of Human Genetics, 72:1492-1504.

Hoggart, C.J., Shriver, M.D., Kittles, R.A., Clayton, D.G., and McKeigue, P.M. Design and analysis of admixture mapping studies. American Journal of Human Genetics, 74:965-978.

Johnson, J.A., and Terra, S.G. β-Adrenergic receptor polymorphisms: cardiovascular disease associations and pharmacogenetics. Pharmaceutical Research, 19:1779-1787.

Khoury, M.J., and Flanders, W.D. Nontraditional epidemiologic approaches in the analysis of gene-environment interaction: case-control studies with no controls. American Journal of Epidemiology, 144:207-213.

Knowler, W.C., Williams, R.C., Pettitt, D.J., and Steinberg, A.G. Gm3;5,13,14 and type 2 diabetes mellitus: an association in American Indians with genetic admixture. American Journal of Human Genetics, 43:520-526.

Little, R.J.A., and Rubin, D.B. Statistical Analysis with Missing Data, Second Edition. New York: Wiley.

Lander, E.S., and Schork, N.J. Genetic dissection of complex traits. Science, 265:2037-2048.


Lin, M., Aquilante, C., Johnson, J.A., and Wu, R. Sequencing drug response with HapMap. The Pharmacogenomics Journal, 5:149-156.

Lindley, D.V. The Bayesian analysis of contingency tables. The Annals of Mathematical Statistics, 35:1622-1643.

Lubin, J.H., and Gail, M. On power and sample size for studying features of the relative odds of disease. American Journal of Epidemiology, 131:552-566.

Lynch, M., and Walsh, B. Genetics and Analysis of Quantitative Traits. Sinauer, Sunderland, MA.

MacEachern, S.N., and Muller, P. Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7:223-238.

Madigan, D., and Raftery, A.E. Model selection and model uncertainty in graphical models using Occam's Window. Journal of the American Statistical Association, 89:1535-1546.

Mantel, N. Synthetic retrospective studies and related topics. Biometrics, 29:479-486.

Mantel, N., and Haenszel, W. Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22:719-748.

Marchini, J., Cardon, L.R., Phillips, M.S., and Donnelly, P. The effects of human population structure on large genetic association studies. Nature Genetics, 36:512-517.

Marshall, R.J. Bayesian analysis of case-control studies. Statistics in Medicine, 7:1223-1230.

McKeigue, P.M. Prospects for admixture mapping of complex traits. American Journal of Human Genetics, 76:1-7.

McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12:153-157.

Modan, M.D., Hartge, P., Hirsh-Yechezkel, G., Chetrit, A., Lubin, F., Beller, U., Ben-Baruch, G., Fishman, A., Menczer, J., Struewing, J.P., Tucker, M.A., Ebbers, S.M., Friedman, E., Piura, B., and Wacholder, S. Parity, oral contraceptives and the risk of ovarian cancer among carriers and noncarriers of a


BRCA1 or BRCA2 mutation. New England Journal of Medicine, 345:235-240.

Morton, N.E., and Collins, A. Tests and estimates of allelic association in complex inheritance. Proceedings of the National Academy of Sciences, 95:11389-11393.

Muller, P., Parmigiani, G., Schildkraut, J., and Tardella, L. A Bayesian hierarchical approach for combining case-control and prospective studies. Biometrics, 55:858-866.

Muller, P., and Roeder, K. A Bayesian semiparametric model for case-control studies with errors in variables. Biometrika, 84:523-537.

Nurminen, M., and Mutanen, P. Exact Bayesian analysis of two proportions. Scandinavian Journal of Statistics, 14:67-77.

Parmigiani, G., Berry, D.A., and Aguilar, O. Determining Carrier Probabilities for Breast Cancer Susceptibility Genes BRCA1 and BRCA2. American Journal of Human Genetics, 62:145-158.

Patterson, N., Hattangadi, N., Lane, B., Lohmueller, K.E., Hafler, D.A., Oksenberg, J.R., Hauser, S.L., Smith, M.W., O'Brien, S.J., Altshuler, D., Daly, M.J., and Reich, D. Methods for High-Density Admixture Mapping of Disease Genes. American Journal of Human Genetics, 74:979-1000.

Pfeiffer, R., Gail, M.H., and Pee, D. Inference for covariates that accounts for ascertainment and random genetic effects in family studies. Biometrika, 88:933-948.

Phillips, A., and Holland, P.W. Estimators of the variance of the Mantel-Haenszel log-odds-ratio estimate. Biometrics, 43:425-431.

Piegorsch, W.W., Weinberg, C.R., and Taylor, J.A. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population based case-control studies. Statistics in Medicine, 13:153-162.

Prentice, R.L., and Pyke, R. (1979). Logistic disease incidence models and case-control studies. Biometrika, 66:403-411.

Pritchard, J.K., Stephens, M., and Donnelly, P. (2000a). Inference of population structure using multilocus genotype data. Genetics, 155:945-959.

Pritchard, J.K., Stephens, M., Rosenberg, N.A., and Donnelly, P. (2000b). Association mapping in structured populations. American Journal of Human Genetics, 67:170-181.

PAGE 149

135 Risch,H.A.,McLaughlinJ.R.,Cole,D.E.C.,Rosen,B.,Bradley,L.,Kwan,E.,Jack,E.,Vesprini,D.J.,Kuperstein,G.,Abrahamson,J.L.A.,Fan,I.,Wong,B.,andNarod,S.A..PrevalenceandPenetranceofGermlineBRCA1andBRCA2MutationsinaPopulationSeriesof649WomenwithOvarianCancer.AmericanJournalofHumanGenetics,68:700{710.Rice,K..Full-likelihoodapproachestomisclassicationofabinaryex-posureinmatchedcase-controlstudies.StatisticsinMedicine,22:3177{3194.Rice,K.M..EquivalencebetweenconditionalandmixtureapproachestotheRaschmodelandmatchedcase-controlstudies,withapplications.JournaloftheAmericanStatisticalAssociation,99:510{522.Rice,K.,andHolmans,P..Allowingforgenotypingerrorinanalysisofunmatchedcase-controlstudies.AnnalsofHumanGenetics,67:165{174.Risch,N.,andMerikangas,K..Thefutureofgeneticstudiesofcom-plexdiseases.Science,273:1516{517.Robins,J.,Breslow,N.,andGreenland,S..EstimatorsoftheMantel-Haenszelvarianceconsistentinbothsparsedataandlarge-stratalimitingmodels.Biometrics,42:311{323.Roeder,K.,Carroll,R.J.,andLindsay,B.G.1996.Semiparameticmixtureapproachtocase-controlstudieswitherrorsincovariates.JournaloftheAmericanStatisticalAssociation,91:722{732.Sala,A.,Penacino,G.,Carnese,R.,andCorach,D..ReferencedatabaseofhypervariablegeneticmarkersofArgentina:applicationformolecularanthropologyandforensiccasework.Electrophoresis,20:1733{739.Sala,A.,Penacino,G.,andCorach,D.1998.Comparisonofallelefrequen-ciesofeightLocifromArgentineanAmerindianandEuropeanpopulations.HumanBiology,70:937{947.Satten,G.A.,Flanders,W.D.,andYang,Q..Accountingforun-measuredpopulationsubstructureincase-controlstudiesofgeneticassociationusinganovellatent-classmodel.AmericanJournalofHumanGenetics,68:466{477.Satten,G.,andKupper,L..Inferenceaboutexposure-diseaseassocia-tionsusingprobability-of-exposureinformation.JournaloftheAmericanStatisticalAssociation,88:200{208.Schmidt,S.,andSchaid,D.J..Potentialmisinterpretationofthecase-onlystudytoassessgene-environmentinteraction.AmericanJournalofEpidemiology,

PAGE 150

136 150:878{885.Seaman,S.R.,andRichardson,S..Bayesiananalysisofcase-controlstudieswithcategoricalcovariates.Biometrika,88:1073{1088.Seaman,S.R.,andRichardson,S.2004.Equivalenceofprospectiveandret-rospectivemodelsintheBayesiananalysisofcase-controlstudies.Biometrika,91:15{25.Seigel,D.G.,andGreenhouse,S.W..Multiplerelativeriskfunctionsincase-controlstudies.AmericanJournalofEpidemiology,97:324{331.Sethuraman,J..AconstructivedenitionofDirichletpriors.StatisticaSinica,4:639{650.Sinha,S.Mukherjee,B.,andGhosh,M..Bayesiananalysisofmatchedcase-controlstudieswithmultiplediseasestates.Biometrics,60:41{49.Sinha,S.Mukherjee,B.,Ghosh,M.,Mallick,B.,K,andCarroll,R.Ja.Bayesiansemiparametricmodelingformatchedcase-controlstudieswithmultiplediseasestates.JournaloftheAmericanStatisticalAssociation,100:591{601.Sinha,S.Mukherjee,B.,andGhosh,M.b.Modelingassociationamongmultivariateexposuresinmatchedcase-controlstudies.Preprint.Spielman,R.S.,andEwens,W.J.Asibshiptestforlinkageinthepresenceofassociation:thesibtransmission/disequilibriumtest.AmericanJournalofHumanGenetics,62:450{458.Spielman,R.S.,McGinnis,R.E.,andEwens,W.J.Transmissiontestforlinkagedisequilibrium:theinsulingeneregionandinsulin-dependentdiabetesmellitusIDDM.AmericanJournalofHumanGenetics,52:506{516.Srivastava,M..Estimationofinterclasscorrelationsinfamilialdata.Biometrika,71:177{185.Srivastava,M.,Keen,K.J.,andKatapa,R.S..Estimationofinter-classandintraclasscorrelationsinmultivariatefamilialdata.Biometrics,44:141{150.Sullivan,P.F.,Eaves,L.J.,Kendler,K.S.,andNeale,M.C..Ge-neticcase-controlassociationstudiesinneuropsychiatry.ArchivesofGeneralPsychiatry,58:1015{024.Takami,S.,Wong,Z.Y.H.,Stebbing,M.,andHarrap,S.B.Linkage

PAGE 151

137 analysisofglucocorticoidandb2-adrenergicreceptorgeneswithbloodpressureandbodymassindexAmericanJournalofPhysiology,HeartandCirculatoryPhysiology,276:1379{1384.Umbach,D.M.,andWeinberg,C.R..Designingandanalyzingcase-controlstudiestoexploitindependenceofgenotypeandexposure.StatisticsinMedicine,16:1731{43.West,M.,Muller,P.,andEscobar,M.D.,Hierarchicalpriorsandmixturemodels,withapplicationinregressionanddensityestimation,inAspectsofUncertainty.ATributetoD.V.Lindley,A.F.M.SmithandP.Freeman,eds.,pp363{386,Wiley:NewYork.Witte,J.S.,Gauderman,J.,andThomas,D.C..Asymptoticbiasandeciencyincase-controlstudiesofcandidatesgenesandgene-environmentinterac-tions:basicfamilydesign.AmericanJournalofEpidemiology,149:693{705.Zelen,M.,andParker,R.A..Case-controlstudiesandBayesianinfer-ence.StatisticsinMedicine,5:261{269.Zhao,H..Family-basedassociationstudies.StatisticalMethodsinMed-icalResearch,9:563{587.


BIOGRAPHICAL SKETCH

Li Zhang was born on June 19, 1976, in Jiangsu, China. She received her B.E. in civil engineering from Southeast University, Nanjing, China, in 1999. She joined the graduate program in statistics at the University of Florida in January 2001. Her research interests include case-control studies, genetic epidemiology and Bayesian methods. She started to work on her dissertation in the spring of 2004.


Permanent Link: http://ufdc.ufl.edu/UFE0015583/00001

Material Information

Title: Bayesian Methods in Case-Control Studies with Applications in Genetic Epidemiology
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0015583:00001





Full Text










BAYESIAN METHODS IN CASE-CONTROL STUDIES WITH APPLICATIONS
IN GENETIC EPIDEMIOLOGY
















By
LI ZHANG


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2006































Copyright 2006

by
Li Zhang


















To my husband, Xin, and my parents.















ACKNOWLEDGMENTS

First of all, I would like to express my sincere gratitude to both of my advisors

(Professor Malay Ghosh and Professor Bhramar Mukherjee) for their immense help

at every stage of my research. I remain grateful for their constant encouragement,

and mental support throughout the hardship of my graduate study at the University

of Florida. Without their patience, guidance and encouragement, none of this work

would have been possible. As mentors, their wisdom, kindness and enthusiasm bene-

fitted me greatly in both my research work and life. Their valuable insights and ideas

directly and significantly contributed to the work in this dissertation.

I would also like to give special thanks to Professor Rongling Wu for many

fruitful discussions, great help and providing the dataset analyzed in Chapter 3 of this

dissertation. I also extend my gratitude to Professor Michael Daniels and Professor

Paul Duncan for serving on my committee. I appreciate their constructive suggestions

and precious time. I thank all the other professors in our department for their help

throughout my graduate study.

I would like to convey my appreciation to Dr. Nilanjan Chatterjee (who is a

Senior Investigator at the National Cancer Institute) for being my mentor during my

training fellowship at the National Cancer Institute and for providing us a wonderful

dataset which directly motivated the work in Chapter 4 of this dissertation. I would

like to take this opportunity to thank my fellow graduate students in the Department

of Statistics at the University of Florida. In particular, I thank Dr. Samiran Sinha

(currently on the faculty at Texas A&M University) for many helpful discussions and

for his contribution to the work in Chapter 4 of this dissertation. I thank my friend










Dr. Yan Gong (who is a faculty member at the University of Florida) for sharing her

expertise in genetics.

I thank the College of Liberal Arts and Sciences at the University of Florida

for awarding me the Keene Dissertation Fellowship Award, which provided a wonderful

opportunity for me to focus solely on my research during the last stage of my

dissertation.

Last but not least, my sincere thanks go to my family for their endless love,

continuous support and encouragement during my life. This work is dedicated to all

of them.


















TABLE OF CONTENTS

                                                                        page

ACKNOWLEDGMENTS ..... iv

LIST OF TABLES ..... ix

LIST OF FIGURES ..... xii

ABSTRACT ..... xiii

CHAPTER

1 OVERVIEW ..... 1

  1.1 Introduction: The Frequentist Development in Case-Control Studies ..... 1
      1.1.1 The Mantel-Haenszel Era ..... 2
      1.1.2 Logistic Regression in Case-Control Studies ..... 5
      1.1.3 Equivalence of Prospective and Retrospective Models in Case-Control Studies ..... 6
      1.1.4 Matched Case-Control Studies ..... 10
  1.2 Bayesian Analysis of Case-Control Studies ..... 12
  1.3 Topics of This Dissertation ..... 16

2 EQUIVALENCE OF POSTERIORS IN THE BAYESIAN ANALYSIS OF THE MULTINOMIAL-POISSON TRANSFORMATION ..... 19

  2.1 Introduction ..... 19
  2.2 A General Result on Posterior Equivalence ..... 20
  2.3 Stratified Case-Control Studies with Missing Exposures ..... 22
  2.4 Discussion ..... 28

3 BAYESIAN MODELING FOR GENETIC ASSOCIATION IN CASE-CONTROL STUDIES: ACCOUNTING FOR UNKNOWN POPULATION SUBSTRUCTURE ..... 29

  3.1 Introduction ..... 29
  3.2 Model and Notation ..... 33
      3.2.1 Statistical Model ..... 33
      3.2.2 Genetic Model ..... 34
      3.2.3 Inference on I for The Model with Admixture ..... 35
  3.3 Likelihood and Priors ..... 37
      3.3.1 Likelihood ..... 37
      3.3.2 Priors and Posteriors ..... 38
      3.3.3 Computational Details ..... 39
  3.4 Simulation ..... 40
  3.5 Application to A Real Dataset ..... 45
  3.6 Discussion ..... 47

4 SEMIPARAMETRIC BAYESIAN ANALYSIS OF CASE-CONTROL DATA UNDER GENE-ENVIRONMENT INDEPENDENCE AND POPULATION STRATIFICATION ..... 55

  4.1 Introduction ..... 55
  4.2 Model, Likelihood, Priors and Posteriors ..... 59
  4.3 The Israeli Ovarian Cancer Data ..... 68
  4.4 Simulation ..... 73
  4.5 Discussion ..... 76

5 ACCOUNTING FOR ERROR DUE TO MISCLASSIFICATION OF EXPOSURES IN CASE-CONTROL STUDIES OF GENE-ENVIRONMENT INTERACTION ..... 86

  5.1 Introduction ..... 86
  5.2 Unmatched Case-Control Studies of Gene Environment Interaction ..... 89
      5.2.1 Maximum Likelihood Estimation under G-E Independence Assumption ..... 90
      5.2.2 Maximum Likelihood Estimation in The Presence of Misclassification ..... 95
      5.2.3 Case-only Method with Possible Misclassification ..... 99
  5.3 Simulation Studies ..... 101
  5.4 Conclusion ..... 104

6 FUTURE WORK AND CONCLUSION ..... 112

APPENDIX

A APPENDIX TO CHAPTER 3 ..... 117

B APPENDIX TO CHAPTER 4 ..... 118

  B.1 Proof of Lemmas and Results ..... 118
  B.2 Likelihood for The EDPM Model ..... 120
  B.3 Computational Details ..... 120

C APPENDIX TO CHAPTER 5 ..... 123

  C.1 The Constrained ML Equations under G-E Independence and Rare Disease Assumptions in Unmatched Case-Control Studies ..... 123
  C.2 Obtain Restriction (5-6) ..... 127
  C.3 Proof of REMARK 3 ..... 128

REFERENCES ..... 129

BIOGRAPHICAL SKETCH ..... 138
















LIST OF TABLES
Table                                                                   page

1-1 Case-control data with a binary exposure variable ..... 2

1-2 Series of 2 x 2 tables for stratified case-control data ..... 3

1-3 Matched case-control data with a binary exposure variable ..... 11

3-1 Allele frequencies for twelve STR loci in the four Argentinean subpopulations. ..... 50

3-2 The results of simulated rare-disease data with marker loci in linkage equilibrium
    with the candidate gene D6S366. Ratio of the sample sizes of cases to controls is
    125/125 and 250/250. X12 and X6 represent that the parameters were estimated by
    using the twelve and the first six additional marker loci, respectively. X0 is the
    analysis without using any additional marker loci. Mean and posterior standard
    deviation refer to the average of the Bayes estimates and posterior standard
    deviations obtained in 100 replications, whereas MSE is the estimated mean squared
    error based on 100 replications. ..... 51

3-3 The results of simulated rare-disease data with marker loci in linkage equilibrium
    with the candidate gene D6S366 which are analyzed by Satten et al. (2001). 125/125
    and 250/250 denote ratio of the sample sizes of cases to controls. X12 and X6
    represent that the parameters were estimated by using the twelve and the first six
    of the additional marker loci, respectively. Mean and standard error refer to the
    average of the estimates and standard errors obtained in 500 replications. ..... 52

3-4 The results of simulated common-disease data with marker loci in linkage
    equilibrium with the candidate gene D6S366. Ratio of the sample sizes of cases to
    controls is 125/125 and 250/250. X12 and X6 represent that the parameters were
    estimated by using the twelve and the first six additional marker loci,
    respectively. X0 is the analysis without using any additional marker loci. Mean
    and posterior standard deviation refer to the average of the Bayes estimates and
    posterior standard deviations obtained in 100 replications, whereas MSE is the
    estimated mean squared error based on 100 replications. ..... 53










3-5 The results of real data analysis with the posterior mean (Estimate), posterior
    standard deviation and 95% highest posterior density (HPD) interval (MLE and
    confidence interval (CI) for the ordinary logistic regression model). ..... 54

4-1 Analysis of Israeli ovarian cancer data by all five methods, considering OC use
    as the only environmental exposure, with 95% HPD and confidence intervals ..... 79

4-2 Analysis of Israeli ovarian cancer data by all five methods, considering both
    OC use and parity as environmental exposures, with 95% HPD and confidence
    intervals ..... 80

4-3 Simulation scenarios: E is zero-inflated; G: rare or common; G-E independence
    assumption holds (yE = 0) or does not hold (yE = 0.25). Mean denotes the mean
    estimate based on 100 replications, whereas MSE is the estimated mean squared
    error based on 100 replications. ..... 81

4-4 Simulation scenarios: E: mixture of two normals; G: with parametric logistic
    in terms of S as in (4-8) or commonly prevalent as in (4-4); G-E independence
    holds (yE = 0) or does not hold (yE = 0.25). Mean denotes the mean estimate
    based on 100 replications, whereas MSE is the estimated mean squared error
    based on 100 replications. ..... 82

4-5 Simulation scenarios: E: mixture of two normals; G: rarely prevalent; G-E
    independence holds (yE = 0) or does not hold (yE = 0.25). Mean denotes the
    mean estimate based on 100 replications, whereas MSE is the estimated mean
    squared error based on 100 replications. ..... 83

5-1 Data for an unmatched case-control study with a binary genetic factor and
    a binary environmental exposure. ..... 90

5-2 In the absence of misclassification, the MLEs of the odds ratios and their
    estimated asymptotic variances in terms of observed counts rdj for both the
    traditional model and the model under G-E independence and rare disease. ..... 92

5-3 In the presence of misclassification, the MLEs of the true odds ratios in
    terms of the estimated starred expected counts for the traditional model
    (Model 1) and for the model under G-E independence and rare disease
    assumptions (Model 2). ..... 99

5-4 Results of unmatched case-control data (750/750), where specificity for
    both genetic and environmental factors = 1.0, se0G = se1G = 0.95 and
    se0E = se1E = 0.9. P(D = 1) ≈ 0.01, P(E = 1) ≈ 0.5 and P(G = 1) ≈ 0.2 ..... 106

5-5 Results of unmatched case-control data (1000/1000), where specificity for
    both genetic and environmental factors = 1.0, se0G = se1G = 0.95 and
    se0E = se1E = 0.9. P(D = 1) ≈ 0.01, P(E = 1) ≈ 0.5 and P(G = 1) ≈ 0.2 ..... 107










5-6 Results of unmatched case-control data (750/750), where specificity for
    both genetic and environmental factors = 1.0, se0G = se1G = 0.9 and
    se0E = se1E = 0.8. P(D = 1) ≈ 0.01, P(E = 1) ≈ 0.5 and P(G = 1) ≈ 0.2 ..... 108

5-7 Results of unmatched case-control data (1000/1000), where specificity for
    both genetic and environmental factors = 1.0, se0G = se1G = 0.9 and
    se0E = se1E = 0.8. P(D = 1) ≈ 0.01, P(E = 1) ≈ 0.5 and P(G = 1) ≈ 0.2 ..... 109

5-8 Minimum number of cases (case:control ratio = 1) required to detect a 2-fold
    multiplicative interaction (OR10 = OR01 = 2 and OR11 = 8) with >II' power
    for different levels of sensitivities and specificities of the environmental
    and genetic factors, where P(E = 1) = 0.5 and P(G = 1) = 0.2. ..... 110

5-9 Minimum number of cases (case:control ratio = 1) required to detect a 3-fold
    multiplicative interaction (OR10 = 1.3, OR01 = 7 and OR11 = 3) with NI I' power
    for different levels of sensitivities and specificities of the environmental
    and genetic factors, where P(E = 1) = 0.2 and P(G = 1) = 0.01. ..... 110

















LIST OF FIGURES
Figure                                                                  page

4-1 Real data analyzed with EDPM model by considering OC use as an environmental
    exposure: Histogram of last 5000 MCMC values for the main effects and
    interaction parameter with overlaid smoothed kernel density. ..... 84

4-2 Details of DPM model by considering OC use as an environmental exposure:
    Histogram corresponding to approximate posterior distribution of α and K in
    the DPM model. Also plotted are histograms of variances of the pei's and oa
    i = 1, 24, calculated for each of the last 5000 MCMC runs. ..... 85

5-1 Minimum number of cases (case:control ratio = 1) required to detect a 2-fold
    interaction (OR10 = 2, OR01 = 2, and OR11 = 8) with H I' power as a function
    of the true prevalence of the environmental factor, P(E = 1), for the
    prevalence of the genetic factor being 0.2, and for selected values of
    sensitivity and specificity of the exposure assessment. ..... 111















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

BAYESIAN METHODS IN CASE-CONTROL STUDIES WITH APPLICATIONS
IN GENETIC EPIDEMIOLOGY

By

Li Zhang

August 2006

Chair: Malay Ghosh
Cochair: Bhramar Mukherjee
Major Department: Statistics

The fundamental idea behind case-control studies is to compare selected persons

having a disease (the cases) with those not having the disease (the controls) by

assessing to what extent they have been exposed to the disease's possible risk factors.

The natural likelihood to use for a case-control study is a retrospective likelihood,

i.e. a likelihood based on the probability of exposure given disease status. I prove

the equivalence of posterior inference for the log odds ratio parameters based on

prospective and retrospective likelihood in stratified case-control studies in which

some of the exposure variables could be missing completely at random.

My dissertation also addresses three problems in the domain of genetic epidemiol-

ogy to explore a variety of disease-gene association and gene-environment interaction.

First, I consider the problem of detecting association between a disease and

a candidate gene in the presence of population admixture. I propose a two-stage

parametric Bayesian approach implemented via Markov chain Monte Carlo (MCMC)

numerical integration technique, which first estimates the posterior probability of dif-

ferent unknown population substructures and then integrates this information into










a disease-gene association model through the technique of Bayesian model averaging.

Thus, the uncertainty in estimating the population substructure is taken into

account while providing credible intervals for parameters in the disease-gene associa-

tion model.

Second, I present a Bayesian semiparametric approach to model the effect of

stratification variables under the assumption of gene-environment independence in the

control population conditional on some other covariates to study the gene-environment

interaction. I take account of stratum heterogeneity in the exposure distribution by

adopting a Dirichlet process mixture (DPM) of normals prior for the distribution of

the environmental exposure and a flexible model for the distribution of the genetic

factor. I illustrate the methods by applying them to an Israeli ovarian cancer study

to investigate the effect of BRCA1/2 mutations, oral contraceptive use and parity in

the development of ovarian cancer.

Third, I consider analysis of unmatched case-control studies in which binary

exposures are potentially misclassified. I describe a relatively simple approach to adjust

the estimation of the parameters of interest in gene-environment association studies in

the presence of misclassification by exploiting the G-E independence assumption.

Concluding remarks and directions for future work are included at the end.















CHAPTER 1
OVERVIEW

1.1 Introduction: The Frequentist Development in Case-Control Studies

The goal of an epidemiologic study is to find the causes of a disease and to

assess the degree of association between the disease and its potential risk factors.

Case-control studies are perhaps the most dominant form of analytical research in

epidemiology, especially in cancer epidemiology. The fundamental idea behind such

investigations is to compare selected persons having a disease (the cases) with those

not having the disease (the controls) by assessing to what extent they have been

exposed to the disease's possible risk factors. The ultimate goal often is to evaluate

the hypothesis that one or more of the exposure variables is a cause of the disease.

There are several popular study designs to ascertain disease-exposure association. A

case-control study is retrospective in the sense that separate random samples from

case and control populations are collected first and then exposure information is

ascertained for the selected subjects. In such a study design, one collects exposure

information conditional on the disease status of the subject. A cohort study, on the

other hand, is prospective in nature as an initially healthy cohort is followed over time

to assess the disease incidence rate and possible disease-exposure association.

Case-control study designs became popular in the 1920's. Initially, there were

doubts regarding the validity of using case-control data to extract information on

the relative risks of the disease, i.e., the odds of the occurrence of a disease for

those exposed relative to those unexposed. Cornfield (1951) demonstrated that the

exposure odds ratio for cases versus controls equals the disease odds ratio for exposed

versus unexposed, and that the latter in turn approximates the ratio of disease rates

or the relative risk of the disease provided that the disease is rare. To understand









Table 1-1: Case-control data with a binary exposure variable

Disease Status    Exposed    Non-Exposed    Total
Case              n_11       n_10           n_1
Control           n_01       n_00           n_0
Total             e_1        e_0            N


this issue in the simplest setting, consider a case-control study with a single binary

exposure variable X (X = 1 exposed, and X = 0 unexposed) and let D denote the

disease status (D = 1 for cases, D = 0 for controls). Table 1-1 presents the data

layout and cell frequencies for each disease-exposure combination. One may note that

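$$ \frac{P(X = 1 \mid D = 1)\, P(X = 0 \mid D = 0)}{P(X = 1 \mid D = 0)\, P(X = 0 \mid D = 1)} \quad \text{(the exposure odds ratio)} $$
$$ = \frac{P(D = 1 \mid X = 1)\, P(D = 0 \mid X = 0)}{P(D = 0 \mid X = 1)\, P(D = 1 \mid X = 0)} \quad \text{(the disease odds ratio)} $$
$$ \approx \frac{P(D = 1 \mid X = 1)}{P(D = 1 \mid X = 0)} \quad \text{(relative risk)}. \tag{1-1} $$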

The approximation holds for a rare disease, as $P(D = 0 \mid X = 1) \approx P(D = 0 \mid X = 0) \approx 1$.
So the disease odds ratio, $\psi = \exp(\beta)$ (where $\beta$ denotes the log-odds ratio
parameter), is the same as the exposure odds ratio, which approximates the relative
risk of the disease for a rare disease. Therefore, an odds ratio of 1 implies that there
is no association between the disease and the exposure, whereas an odds ratio other
than 1 implies that exposure is either synergistic or antagonistic with the disease.
Also, one estimates $\psi$ by $\hat{\psi} = n_{11} n_{00} / (n_{10} n_{01})$ and $\beta = \log(\psi)$ by $\hat{\beta} = \log(\hat{\psi})$.
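
As a rough illustration of the identity (1-1) and of the cross-product estimate, the following Python sketch computes the estimated odds ratio from a 2 x 2 table and compares the exposure odds ratio, the disease odds ratio and the relative risk in a hypothetical population. The function names and all numbers are illustrative assumptions and are not taken from the dissertation.

```python
def odds_ratio(n11, n10, n01, n00):
    """Cross-product estimate n11*n00 / (n10*n01), using the layout of Table 1-1:
    n11 = exposed cases, n10 = unexposed cases,
    n01 = exposed controls, n00 = unexposed controls."""
    return (n11 * n00) / (n10 * n01)


def population_measures(p_exposed, risk_exposed, risk_unexposed):
    """Return (exposure odds ratio, disease odds ratio, relative risk) for a
    hypothetical population with P(X=1) = p_exposed and P(D=1|X=x) given."""
    # Joint probabilities P(D = d, X = x)
    p11 = p_exposed * risk_exposed                  # D=1, X=1
    p10 = (1 - p_exposed) * risk_unexposed          # D=1, X=0
    p01 = p_exposed * (1 - risk_exposed)            # D=0, X=1
    p00 = (1 - p_exposed) * (1 - risk_unexposed)    # D=0, X=0
    # Odds of exposure among cases divided by odds of exposure among controls
    exposure_or = (p11 / p10) / (p01 / p00)
    # Odds of disease among exposed divided by odds of disease among unexposed
    disease_or = (risk_exposed / (1 - risk_exposed)) / (
        risk_unexposed / (1 - risk_unexposed)
    )
    relative_risk = risk_exposed / risk_unexposed
    return exposure_or, disease_or, relative_risk


if __name__ == "__main__":
    print(odds_ratio(30, 70, 10, 90))
    print(population_measures(0.3, 0.002, 0.001))  # rare disease
    print(population_measures(0.3, 0.4, 0.2))      # common disease
```

In this sketch the two odds ratios agree for any inputs, while the relative risk is close to them only when the disease risks are small, which is the rare-disease condition stated above.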
1.1.1 The Mantel-Haenszel Era

It is well known that, for a large sample, $\hat{\beta} - \beta$ has an asymptotic normal distribution

with mean 0 and variance $(1/n_{11} + 1/n_{10} + 1/n_{01} + 1/n_{00})$ (Agresti, 2001).

For a small sample size, exact inference is based on a noncentral hypergeometric









Table 1-2: Series of 2 x 2 tables for stratified case-control data

Disease Status    Exposed    Non-Exposed    Total
Case              n_11i      n_10i          n_1i
Control           n_01i      n_00i          n_0i
Total             e_1i       e_0i           N_i


distribution,


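$$ \Pr(n_{11} \mid n_1, n_0, e_1, e_0; \psi) = \frac{\binom{n_1}{n_{11}} \binom{n_0}{e_1 - n_{11}} \psi^{n_{11}}}{\sum_{u} \binom{n_1}{u} \binom{n_0}{e_1 - u} \psi^{u}}, \tag{1-2} $$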





which is the conditional distribution of paired binomial data given the marginal totals

(the marginal totals are considered as approximately ancillary in the sense that they

do not contain any information about the parameter of interest $\psi$). One can use

Fisher's exact test to test $H_0: \psi = \psi_0$ against $H_1: \psi > \psi_0$ by calculating the upper

tail probability under the distribution shown in (1-2),

$$ p_u = \sum_{u \ge n_{11}} \Pr(u \mid n_1, n_0, e_1, e_0; \psi_0). \tag{1-3} $$

Similarly, to test $H_0$ against $H_1: \psi < \psi_0$, one should calculate the corresponding lower

tail probability.
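
The exact calculation in (1-2) and (1-3) can be sketched in a few lines of Python. This is a minimal illustration under the notation of Table 1-1; the function names and the small example table are assumptions made here, not part of the original text.

```python
from math import comb


def noncentral_hypergeom_pmf(n1, n0, e1, psi):
    """Return {u: Pr(n11 = u | n1, n0, e1; psi)} as in (1-2).

    n1, n0 are the numbers of cases and controls, e1 is the total number
    exposed, and psi is the odds ratio."""
    lo = max(0, e1 - n0)   # smallest feasible number of exposed cases
    hi = min(n1, e1)       # largest feasible number of exposed cases
    weights = {
        u: comb(n1, u) * comb(n0, e1 - u) * psi ** u for u in range(lo, hi + 1)
    }
    total = sum(weights.values())
    return {u: w / total for u, w in weights.items()}


def upper_tail_pvalue(n11, n1, n0, e1, psi0=1.0):
    """Upper tail probability (1-3) for testing psi = psi0 against psi > psi0."""
    pmf = noncentral_hypergeom_pmf(n1, n0, e1, psi0)
    return sum(p for u, p in pmf.items() if u >= n11)


if __name__ == "__main__":
    # Hypothetical table: 3 of 4 cases exposed, 1 of 4 controls exposed.
    print(upper_tail_pvalue(n11=3, n1=4, n0=4, e1=4, psi0=1.0))
```

With psi0 = 1 the weights reduce to the ordinary (central) hypergeometric distribution, so the function returns the usual one-sided Fisher exact p-value.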

Mantel and Haenszel (1959) proposed an alternative to Fisher's exact test. As-

suming a common odds ratio across a series of 2 x 2 tables, they proposed an estimator

for the common odds ratio. Specifically, suppose one has I such tables and the i-th

table is represented by the data layout in Table 1-2. The Mantel-Haenszel (MH)

odds ratio estimate is given by

$$ \hat{\psi}_{MH} = \exp(\hat{\beta}_{MH}) = \frac{\sum_i n_{11i} n_{00i} / N_i}{\sum_i n_{01i} n_{10i} / N_i}. \tag{1-4} $$









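To test for homogeneity of odds ratios across the tables, i.e., $H_0: \psi_1 = \psi_2 = \cdots = \psi_I$,
the MH test statistic is

$$ \chi^2 = \sum_{i=1}^{I} \frac{\{n_{11i} - E(n_{11i}; \hat{\psi}_{MH})\}^2}{\mathrm{Var}(n_{11i}; \hat{\psi}_{MH})}, \tag{1-5} $$

which has an approximate $\chi^2$ distribution with $I - 1$ degrees of freedom.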

Mantel and Haenszel presented no variance formula for their estimator and

referred to the work by Cornfield (1956) for calculation of the interval estimates.

Robins, Breslow and Greenland (1986) and Phillips and Holland (1987) independently
proposed variance estimators of the MH estimator covering the two different
types of asymptotic structure: 1) a small number of tables with large frequencies, and
2) a large number of tables with small frequencies. The main idea is the following.
First, $E(R_i) = \psi_i E(S_i)$, where $R_i = n_{11i} n_{00i} / N_i$, $S_i = n_{01i} n_{10i} / N_i$, and $\psi_i$ denotes
the true odds ratio in table $i$. Thus $\hat{\psi}_{MH}$ is the solution of the unbiased estimating
equation $R - \psi S = 0$, with $R = \sum_i R_i$ and $S = \sum_i S_i$, assuming a common value
$\psi$ for the $\psi_i$. Second, under paired binomial sampling, the variances of the individual
contributions to this estimating equation satisfy

$$ N_i^2\, \mathrm{Var}(R_i - \psi S_i) = \tfrac{1}{2} E\{(n_{11i} n_{00i} + \psi n_{01i} n_{10i})(n_{11i} + n_{00i} + \psi (n_{01i} + n_{10i}))\}. \tag{1-6} $$

Now, with a one-step Taylor expansion,

$$ \hat{\beta}_{MH} = \log(R/S) = \log \psi + \frac{R - \psi S}{E(R)} + O_p\!\left(\frac{\mathrm{Var}(R)}{E^2(R)} + \frac{\mathrm{Var}(S)}{E^2(S)}\right). \tag{1-7} $$

The last two equations together yield

$$ \mathrm{Var}(\hat{\beta}_{MH}) \approx \frac{\mathrm{Var}(R - \psi S)}{E^2(R)} = \frac{\sum_i \mathrm{Var}(R_i - \psi S_i)}{E^2(R)}. \tag{1-8} $$
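
To make the estimator and its variance concrete, the sketch below computes the MH odds ratio and the variance of its logarithm in the closed form commonly attributed to Robins, Breslow and Greenland, which corresponds to plugging observed counts and the MH estimate into the variance expression above. The function name and the example tables are hypothetical.

```python
def mantel_haenszel(tables):
    """Mantel-Haenszel odds ratio and estimated variance of its logarithm.

    `tables` is a list of strata, each given as (n11, n10, n01, n00) in the
    layout of Table 1-2. Returns (psi_mh, var_log_psi_mh)."""
    R = S = 0.0
    sum_PR = sum_PS_QR = sum_QS = 0.0
    for n11, n10, n01, n00 in tables:
        N = n11 + n10 + n01 + n00
        Ri = n11 * n00 / N          # contribution to the numerator R
        Si = n01 * n10 / N          # contribution to the denominator S
        Pi = (n11 + n00) / N        # proportion on the main diagonal
        Qi = (n10 + n01) / N        # proportion off the main diagonal
        R += Ri
        S += Si
        sum_PR += Pi * Ri
        sum_PS_QR += Pi * Si + Qi * Ri
        sum_QS += Qi * Si
    psi_mh = R / S
    var_log = (
        sum_PR / (2 * R * R) + sum_PS_QR / (2 * R * S) + sum_QS / (2 * S * S)
    )
    return psi_mh, var_log


if __name__ == "__main__":
    # Two hypothetical strata
    print(mantel_haenszel([(10, 20, 5, 40), (8, 12, 6, 30)]))
```

For a single table the variance returned here reduces to 1/n11 + 1/n10 + 1/n01 + 1/n00, the large-sample variance quoted earlier for the unstratified log odds ratio.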

However, the MH methods concern the effects of a single binary risk factor.

One may extend the methods to a single categorical exposure and then to multiple

categorical exposures only by considering each factor at a time after stratification









with respect to levels of the other factors. Continuous exposures cannot be handled

in this framework unless one categorizes them.

1.1.2 Logistic Regression in Case-Control Studies

Methods to evaluate simultaneous effects of multiple quantitative risk factors
started being developed in the 1960's. Cornfield et al. (1961) noted that if the
multivariate distribution of exposure $X$ among persons with and without disease
$D$ were normal with separate means but a common covariance matrix, then the
probability of developing disease for an individual with values $X = x$ was given by
the logistic response curve

$$ \Pr(D = 1 \mid X = x) = \frac{\exp(\alpha + \beta^\top x)}{1 + \exp(\alpha + \beta^\top x)}. \tag{1-9} $$

Day and Kerridge (1967) confirmed that logistic regression was efficient in a semiparametric
sense. They noted that the full joint likelihood with exposure variables having
an arbitrary distribution $p(x)$ can be written as $p(D, x) = \Pr(D \mid x)\, p(x)$, and
the two factors in the likelihood could be maximized separately, leading to
semiparametric efficiency of the logistic model.

A key feature of the logistic model for case-control studies is that the regression
coefficients $\beta$ have a nice risk interpretation (Seigel and Greenhouse 1973) in the
following sense:

$$ \frac{\Pr(D = 1 \mid X = x_1)\, \Pr(D = 0 \mid X = x_0)}{\Pr(D = 0 \mid X = x_1)\, \Pr(D = 1 \mid X = x_0)} = \exp\{\beta^\top (x_1 - x_0)\}. \tag{1-10} $$

Thus $\beta^\top (x_1 - x_0)$ represents the log relative risk for a subject with exposure $x_1$
versus one with exposure $x_0$. But the natural likelihood for case-control sampling is
the retrospective likelihood, and is of the form $p(X \mid D)$ rather than $\Pr(D \mid X)$, which
is the form of a prospective likelihood obtained from a cohort study. As Mantel and

Haenszel (1959) stated in their seminal paper: "a primary goal is to reach the same









conclusions in a retrospective study as would have been obtained from a prospective

study, if one had been done."
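
The risk interpretation in (1-10) can be checked numerically. The sketch below evaluates the logistic response curve (1-9) at two exposure vectors and confirms that the resulting odds ratio depends only on the slope coefficients and the difference in exposures. The parameter values are arbitrary assumptions chosen for illustration.

```python
from math import exp


def logistic_prob(alpha, beta, x):
    """Pr(D = 1 | X = x) under the logistic model (1-9); beta and x are sequences."""
    eta = alpha + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + exp(-eta))


def odds_ratio_between(alpha, beta, x1, x0):
    """Disease odds at exposure x1 divided by disease odds at exposure x0."""
    p1 = logistic_prob(alpha, beta, x1)
    p0 = logistic_prob(alpha, beta, x0)
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


if __name__ == "__main__":
    beta = [0.7, -0.2]
    x1, x0 = [1.0, 3.0], [0.0, 1.0]
    # Both lines print the same value, exp(0.7 * 1 - 0.2 * 2), whatever alpha is.
    print(odds_ratio_between(-4.0, beta, x1, x0))
    print(odds_ratio_between(-1.0, beta, x1, x0))
```

The intercept cancels from the ratio, which is why the odds ratio parameters remain meaningful under case-control sampling even though the intercept does not.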

Prospective logistic regression analysis is indeed more convenient than fitting

retrospective models. In a retrospective formulation, modeling the distribution of

the exposure may pose certain challenges, especially when the exposure is high dimensional

or a mixture of discrete and continuous variables. But the use of the

prospective model in analyzing case-control data needed more theoretical validation

which was provided by Anderson (1972) and Prentice and Pyke (1979). I will discuss

this issue in greater detail in the next section.

1.1.3 Equivalence of Prospective and Retrospective Models in Case-Control
Studies

As stated in (1-10), the prospective logistic regression model may be used to
induce a retrospective model, which also turns out to be of a logistic form (Prentice
and Pyke, 1979). Beginning with (1-10) and defining

$$ \alpha = \log\left\{\frac{\Pr(D = 1 \mid x_0)}{\Pr(D = 0 \mid x_0)}\right\} - \beta^\top x_0, \tag{1-11} $$

one can recover (1-9). Similarly, the odds ratio representation (1-10) allows one to
calculate

$$ p(X = x \mid D = d) = \frac{\exp\{q^*(x) + d\, \beta^\top x\}}{\int \exp\{q^*(u) + d\, \beta^\top u\}\, du}, \quad d = 0, 1, \tag{1-12} $$

where $q^* = q^*(x) = \log\{\Pr(X = x \mid D = 0) / \Pr(X = x_0 \mid D = 0)\}$ for all $x$. Furthermore,
if $X$ has $K$ distinct values, the integration becomes summation over all $K$
distinct values. The prospective model (1-9) and the retrospective model (1-12) are
precisely equivalent provided that $\alpha$ in (1-9) and $q^*$ in (1-12) are unrestricted.
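
For a discrete scalar exposure, the equivalence between the prospective and retrospective forms can be verified directly. The sketch below first derives the exposure distributions among controls and cases from a prospective logistic model by Bayes' rule, and then rebuilds them from the retrospective form (1-12) with the integral replaced by a sum. The exposure values, marginal probabilities and parameters are hypothetical.

```python
from math import exp, log


def retrospective_from_prospective(xs, px, alpha, beta):
    """Given exposure values xs with marginal probabilities px and a logistic
    model for Pr(D=1|x), return (p(x|D=0), p(x|D=1)) obtained by Bayes' rule."""
    pd1 = [1.0 / (1.0 + exp(-(alpha + beta * x))) for x in xs]
    joint1 = [p * q for p, q in zip(pd1, px)]        # P(D=1, X=x)
    joint0 = [(1 - p) * q for p, q in zip(pd1, px)]  # P(D=0, X=x)
    s1, s0 = sum(joint1), sum(joint0)
    return [j / s0 for j in joint0], [j / s1 for j in joint1]


def retrospective_model(xs, p_x_given_d0, beta, d):
    """Retrospective form (1-12) for discrete X: weights exp{q*(x) + d*beta*x},
    normalised by a sum. The reference exposure x_0 is taken to be xs[0]."""
    ref = p_x_given_d0[0]
    weights = [
        exp(log(p / ref) + d * beta * x) for p, x in zip(p_x_given_d0, xs)
    ]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    px = [0.4, 0.3, 0.2, 0.1]
    p0, p1 = retrospective_from_prospective(xs, px, alpha=-2.0, beta=0.8)
    print(p1)
    print(retrospective_model(xs, p0, beta=0.8, d=1))  # matches the line above
```

Only the slope enters the retrospective form; the intercept is absorbed into the normalising constant, in line with the statement that the two models are equivalent when the nuisance components are unrestricted.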

Anderson (1972) provides a deeper look into the proposition of retrospective data

being analyzed by a prospective model. Suppose a discrete exposure variable X takes

$K$ distinct values $x_1, \ldots, x_K$. There are $n = n_0 + n_1$ samples with $n_0$ controls and $n_1$

cases. Let $n_{0k}$ and $n_{1k}$ denote the number of controls and cases observed, respectively,

with $X = x_k$. Denote $p_{1k} = 1 - p_{0k} = \Pr(D = 1 \mid X = x_k)$, which is specified by the

logistic model (1-9), and the marginal probabilities corresponding to the exposure

are given by $q_k = \Pr(X = x_k)$. Assuming the marginal disease probabilities $\Pr(D = d) = \pi_d$

are known and by using $\Pr(X \mid D) = \Pr(D \mid X)\Pr(X)/\Pr(D)$, the case-control

likelihood is proportional to

$$\prod_{d=0}^{1}\prod_{k=1}^{K} p_{dk}^{\,n_{dk}} \times \prod_{k=1}^{K} q_k^{\,n_{+k}} = L_1 \times L_2, \tag{1-13}$$

where $n_{+k} = n_{0k} + n_{1k}$. But the parameters are constrained by fixed marginal probabilities

of disease: $\sum_k p_{dk}\,q_k = \pi_d$, for $d = 0, 1$. Anderson (1972) discovered that

estimates and covariance matrix for the coefficients $\beta$ were identical to those of ordinary

logistic regression involving maximization of $L_1$ alone.
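
As a small numerical illustration of this result, the sketch below fits an ordinary (prospective) logistic regression to a case-control 2x2 table by Newton-Raphson and checks that the fitted slope equals the log of the sample odds ratio. The counts and the function name `fit_logistic_binary` are hypothetical and chosen only for illustration; a single binary exposure is assumed.

```python
import math

def fit_logistic_binary(cases, totals, iterations=25):
    """Newton-Raphson fit of logit P(D=1 | x) = a + b*x for x in {0, 1}.

    cases[x]  : number of cases with exposure x
    totals[x] : number of subjects (cases + controls) with exposure x
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x in (0, 1):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            resid = cases[x] - totals[x] * p      # score contribution
            w = totals[x] * p * (1.0 - p)         # information weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: (a, b) += information^{-1} * score
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical case-control counts: 70 unexposed / 30 exposed cases,
# 85 unexposed / 15 exposed controls.
intercept, slope = fit_logistic_binary(cases=[70, 30], totals=[155, 45])
print(slope, math.log((30 * 85) / (15 * 70)))
```

The intercept of such a fit reflects the case:control sampling ratio rather than the population disease risk, which is why only the slope is compared with the odds ratio here.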

Prentice and Pyke (1979) extended Anderson's (1972) results on logistic discrim-

ination and generalized the findings of Breslow and Powers (1978) on the equivalence

of odds ratio estimators when both prospective and retrospective logistic models are

applied to case-control studies. They started from another factorization of the likeli-

hood.

Again, let us consider $n_0$ controls and $n_1$ cases but an arbitrary exposure variable

$x$. The retrospective likelihood function is

$$L = \prod_{j:\,\text{cases}} P(x_j \mid D = 1) \prod_{j:\,\text{controls}} P(x_j \mid D = 0). \tag{1-14}$$

Denote $S$ as a sampling indicator ($S = 1$ if an individual is selected in the case-control

sample; $S = 0$ otherwise). Because, conditional on disease status, sampling is

independent of exposure, by Bayes' theorem,

$$P(x \mid D = d) = P(x \mid D = d, S = 1) = \frac{P(D = d \mid x, S = 1)\,P(x \mid S = 1)}{P(D = d \mid S = 1)} = \frac{n}{n_d}\,P(D = d \mid x, S = 1)\,P(x \mid S = 1). \tag{1-15}$$


As in Mantel (1973), we can obtain

$$P(D = 1 \mid x, S = 1) = \frac{P(S = 1 \mid D = 1)\,P(D = 1 \mid x)}{\sum_{d=0}^{1} P(S = 1 \mid D = d)\,P(D = d \mid x)}, \tag{1-16}$$

by the fact that sampling is independent of exposure within cases and controls. This

is the conditional probability that an individual is a case, given exposure $x$ and given

that the individual was sampled for the study.

Since $P(D = 1 \mid S = 1)/P(D = 0 \mid S = 1) = n_1/n_0$, inserting (1-9) into (1-16),

one obtains

$$P(D = 1 \mid x, S = 1) = \frac{\exp(\delta + \beta^T x)}{1 + \exp(\delta + \beta^T x)}, \tag{1-17}$$

where $\delta = \alpha + \log\{n_1\pi_0/(n_0\pi_1)\}$.
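
The intercept shift in (1-17) can be checked numerically. The sketch below assumes a scalar exposure, a hypothetical source population of size `N` and hypothetical parameter values; the helper names are illustrative only. It applies the Bayes-rule expression (1-16) with sampling fractions $P(S = 1 \mid D = d) = n_d/(N\pi_d)$ and compares the result with a logistic model whose intercept is $\alpha + \log\{n_1\pi_0/(n_0\pi_1)\}$.

```python
import math

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

def prob_case_given_sampled(x, alpha, beta, f1, f0):
    """P(D=1 | x, S=1) as in (1-16); f_d = P(S=1 | D=d) are sampling fractions."""
    p1 = expit(alpha + beta * x)              # prospective model P(D=1 | x)
    return f1 * p1 / (f1 * p1 + f0 * (1.0 - p1))

def shifted_intercept(alpha, n1, n0, pi1):
    """delta = alpha + log{n1 * pi0 / (n0 * pi1)}, with pi0 = 1 - pi1."""
    return alpha + math.log(n1 * (1.0 - pi1) / (n0 * pi1))

# Hypothetical values, for illustration only.
alpha, beta = -4.0, 0.7
N, pi1, n1, n0 = 100000, 0.01, 200, 200
f1 = n1 / (N * pi1)            # fraction of diseased subjects sampled
f0 = n0 / (N * (1.0 - pi1))    # fraction of non-diseased subjects sampled
delta = shifted_intercept(alpha, n1, n0, pi1)
for x in (-1.0, 0.0, 2.5):
    print(prob_case_given_sampled(x, alpha, beta, f1, f0), expit(delta + beta * x))
```

Only the intercept changes under case-control sampling; the slope $\beta$ is the same in both columns of output.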

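Now substituting (1-15) into (1-14), one obtains

$$L \propto L_1 \times L_2, \tag{1-18}$$

where

$$L_1 = \prod_{d=0}^{1}\prod_{j=1}^{n_d} P(D = d \mid x_{dj}, S = 1), \qquad L_2 = \prod_{d=0}^{1}\prod_{j=1}^{n_d} P(x_{dj} \mid S = 1), \tag{1-19}$$

and $x_{dj}$ denotes the exposure of the $j$th sampled subject with $D = d$. Note that the parameters $\delta$,

$\beta$ and $q(x)\,(= P(x \mid S = 1))$ are restricted by

$$\frac{n_d}{n} = \int P(x \mid S = 1)\,P(D = d \mid x, S = 1)\,dx, \qquad d = 0, 1.$$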










Prentice and Pyke (1979) demonstrated that the solution to the unconstrained

maximization problem, with $(\delta, \beta)$ from the ordinary logistic regression coefficients

based on $L_1$ and $q(x) = s/n$ (which is assigned to any value of $x$ that is observed with

multiplicity $s$), the sample $X$ distribution, actually satisfied the constraints and thus

yielded the desired estimates. They further showed that the estimating equations

derived from $L_1$ were unbiased and, using estimating equation theory, confirmed

that the usual covariance matrix for $\beta$ remained valid under case-control sampling.

Because the intercept $\delta$ was a free parameter, it did not matter that the $\pi_d$'s were

unknown.

Carroll, Wang and Wang (1995) extended the Prentice and Pyke (1979) results

to validate fitting of prospective logistic regression models to case-control data

in the presence of measurement error and partial missingness in exposure values.

They showed that, in general, using prospectively derived standard errors is at worst

asymptotically conservative; in addition, they derived a simple sufficient condition

guaranteeing that prospective standard errors are asymptotically correct.

Roeder, Carroll and Lindsay (1996) extended the Prentice and Pyke (1979) results

to the case where covariates are measured with error. They proved that the

prospective and retrospective models generate the same profile likelihood for the log

odds ratio. By using a mixture model, the relationship between the true covariate

$X$ and the response $D$ can be modeled appropriately for both complete and reduced

data. The likelihood depends on the marginal distribution of X and the measure-

ment error density [W|X, D]. The latter is modeled parametrically based on the

validation sample. The marginal distribution of the true covariate is modeled using

a nonparametric mixture distribution.

Seaman and Richardson (2004) presented an alternative proof of equality of the

two profile likelihoods in the absence of measurement error, where they applied the

multinomial-Poisson (MP) transformation. Furthermore, they proved that a Bayesian

analysis which uses the prospective likelihood and assumes a uniform prior distribution

for the log odds that an individual with baseline exposure is diseased, is exactly

equivalent to an analysis that uses the retrospective likelihood and assumes a Dirichlet

prior distribution for the exposure probabilities in the control group. This means that

Bayesian analysis of case-control studies may, like the classical frequentist analysis,

be carried out using a prospective model, thus significantly reducing its complexity.

Seaman and Richardson (2004), like Prentice and Pyke (1979), considered unmatched

case-control problems. They left open the question of similar equivalence

results in the context of matched case-control problems and also for situations with

missing data. In my dissertation I address the problem of extending the equivalence

results to stratified case-control studies in which some of the exposure variables could

be missing completely at random.

1.1.4 Matched Case-Control Studies

So far I have concentrated on unmatched case-control study designs, but my dis-

sertation will involve some matched case-control settings as well, so I briefly review

the matched study design. Matching is often implemented as a design strategy to

eliminate effects due to confounding. In a matched case-control study, controls are

matched with a case (or several cases) on the basis of some matching factors (confounding

variables) such as age, gender, region, ethnicity, etc. There are two types

of matching commonly used. One is frequency matching, in which the number of

controls is selected according to the number of cases in broad homogeneous strata

defined by the values of matching factors to maintain a specific case:control ratio

in each stratum. The other is individual matching, in which controls are selected

individually corresponding to each selected case by matching with respect to certain

factors.

The simplest situation of matched data arises when one case is matched with

one control, and they are categorized on the basis of a binary exposure. Suppose one









Table 1-3: Matched case-control data with a binary exposure variable

Disease Status   Exposed    Non-Exposed
Case             $m_{11}$   $m_{10}$
Control          $m_{01}$   $m_{00}$


has $m_{11}$, $m_{10}$, $m_{01}$ and $m_{00}$ matched pairs under different levels of $D$ and $X$ as shown

in Table 1-3. Let $\pi$ be the conditional probability of observing a matched pair with

an exposed case and unexposed control given a discordant pair:

$$\pi = \frac{P(X = 1 \mid D = 1)\,P(X = 0 \mid D = 0)}{P(X = 1 \mid D = 1)\,P(X = 0 \mid D = 0) + P(X = 0 \mid D = 1)\,P(X = 1 \mid D = 0)} = \frac{\psi}{\psi + 1}. \tag{1-20}$$

Note that $m_{10} \mid m_{10} + m_{01} \sim \text{Bin}(m_{10} + m_{01}, \pi)$. So the Mantel-Haenszel estimator of

the common odds ratio parameter, the MLE of $\psi$, is $m_{10}/m_{01}$. Note that when $\psi = 1$,

$\pi = 1/2$. Hence the test statistic to test $H_0: \psi = 1$ is

$$\chi^2 = \frac{\{m_{10} - E_{H_0}(m_{10} \mid m_{10} + m_{01})\}^2}{\text{Var}_{H_0}(m_{10} \mid m_{10} + m_{01})}, \tag{1-21}$$

which is known as McNemar's (1947) test. One of the potential problems with this

estimator and this test is that it uses only the discordant pairs of observations and

discards the information contained in the concordant set.
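
A minimal sketch of this calculation follows, assuming the discordant-pair counts are available as two integers. The function name `mcnemar` and the example counts are hypothetical. Under $H_0$ the count of exposed-case discordant pairs is binomial with probability 1/2, which gives the mean and variance used below.

```python
def mcnemar(m10, m01):
    """Return (odds-ratio estimate, chi-square statistic) from discordant pairs.

    m10 : pairs with an exposed case and an unexposed control
    m01 : pairs with an unexposed case and an exposed control
    """
    total = m10 + m01
    odds_ratio = m10 / m01
    expected = total / 2.0      # E(m10 | m10 + m01) under H0
    variance = total / 4.0      # Var(m10 | m10 + m01) under H0
    chi_square = (m10 - expected) ** 2 / variance
    return odds_ratio, chi_square

# Hypothetical counts: 30 and 10 discordant pairs.
print(mcnemar(30, 10))
```

Concordant pairs do not enter either quantity, which is the limitation noted in the text.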

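In the case of 1:$M$ matching, the Mantel-Haenszel estimator of common odds

ratio is

$$\hat{\psi}_{MH} = \frac{\sum_{r=1}^{M} (M - r + 1)\,m_{1,r-1}}{\sum_{r=1}^{M} r\,m_{0,r}}, \tag{1-22}$$

where $m_{1,r}$ is the number of matched sets where the case and $r$ controls are exposed,

and $m_{0,r}$ is the number of matched sets where the case is unexposed but $r$ controls

are exposed. The test statistic for testing $H_0: \psi = 1$ is

$$\chi^2 = \frac{\left[\sum_{r=1}^{M} \{m_{1,r-1} - r\,t_r/(M + 1)\}\right]^2}{\sum_{r=1}^{M} r\,t_r\,(M - r + 1)/(M + 1)^2}, \tag{1-23}$$

where $t_r = m_{1,r-1} + m_{0,r}$.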
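
The sketch below computes a Mantel-Haenszel estimate and score-type test for 1:$M$ matched sets. It assumes the standard textbook form of these quantities, in which a set with $r$ exposed members out of $M + 1$ contributes weight $(M - r + 1)$ when the case is exposed and $r$ when the case is unexposed; no continuity correction is applied. The function name, argument layout and counts are hypothetical.

```python
def mantel_haenszel_1_to_m(m1, m0):
    """Mantel-Haenszel odds ratio and chi-square for 1:M matched sets.

    m1[r] : sets in which the case and r controls are exposed (r = 0..M)
    m0[r] : sets in which the case is unexposed and r controls are exposed
    """
    M = len(m1) - 1
    numerator = sum((M - r + 1) * m1[r - 1] for r in range(1, M + 1))
    denominator = sum(r * m0[r] for r in range(1, M + 1))
    score = 0.0
    variance = 0.0
    for r in range(1, M + 1):
        t_r = m1[r - 1] + m0[r]          # sets with exactly r exposed members
        score += m1[r - 1] - r * t_r / (M + 1)
        variance += t_r * r * (M - r + 1) / (M + 1) ** 2
    return numerator / denominator, score ** 2 / variance

# With M = 1 the result should agree with the matched-pair analysis:
# m1 = [m10, m11], m0 = [m00, m01].
print(mantel_haenszel_1_to_m(m1=[30, 5], m0=[40, 10]))
```

For $M = 1$ the estimate reduces to the ratio of discordant pairs and the statistic to McNemar's test, which gives a quick sanity check on the indexing.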









Let us now focus on logistic regression models in matched case-control studies.

In the simplest setting, the data consist of $I$ strata and there are $M_i$ controls matched

with a case, for stratum $S_i$, $i = 1, \ldots, I$. As before, one assumes a prospective logistic

incidence model for disease

$$P(D = 1 \mid x, S_i) = \frac{\exp\{\alpha_i + \beta^T(x - x_0)\}}{1 + \exp\{\alpha_i + \beta^T(x - x_0)\}}, \tag{1-24}$$

where the $\alpha_i$'s are stratum-specific intercept terms. Without loss of generality, assuming

that the first subject in each stratum is a case and the rest of the subjects are controls,

conditioning on the sufficient statistics $\sum_{j=1}^{M_i+1} D_{ij}$ for $\alpha_i$, one obtains the conditional

likelihood

$$L_c = \prod_{i=1}^{I} P\Big(D_{i1} = 1 \,\Big|\, \{x_{ij}\}_{j=1}^{M_i+1},\ S_i,\ \sum_{j=1}^{M_i+1} D_{ij} = 1\Big) = \prod_{i=1}^{I} \frac{\exp(\beta^T x_{i1})}{\sum_{j=1}^{M_i+1} \exp(\beta^T x_{ij})}. \tag{1-25}$$

This method is known as conditional logistic regression (CLR). Breslow (1996) illus-

trated that unmatched analysis of matched data based on unconditional full likeli-

hood led to biased and inconsistent estimates of the relative risk parameters. The

difference between unconditional and conditional analysis depends on the degree of

association between the exposure and the matching variables. It is indeed important

to incorporate the matched study design into any model proposed for matched data.
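
To make the conditional likelihood concrete, the following sketch evaluates its logarithm for a scalar exposure, with each matched set stored as a list whose first element belongs to the case. The data layout, the function name and the example sets are assumptions made for illustration. For 1:1 matching with a binary exposure the maximizer is the log ratio of discordant pairs, which the grid search below recovers approximately.

```python
import math

def conditional_log_likelihood(beta, matched_sets):
    """Log conditional likelihood; each set lists exposures with the case first."""
    total = 0.0
    for exposures in matched_sets:
        denominator = sum(math.exp(beta * x) for x in exposures)
        total += beta * exposures[0] - math.log(denominator)
    return total

# Hypothetical 1:1 matched data: 30 pairs (case exposed, control unexposed),
# 10 pairs (case unexposed, control exposed), plus some concordant pairs.
sets = [[1, 0]] * 30 + [[0, 1]] * 10 + [[1, 1]] * 5 + [[0, 0]] * 40

# Crude grid search over beta in [0, 2]; a real analysis would use Newton steps.
grid = [k / 1000.0 for k in range(0, 2001)]
beta_hat = max(grid, key=lambda b: conditional_log_likelihood(b, sets))
print(beta_hat, math.log(30 / 10))
```

The stratum intercepts never appear in the function, reflecting that they are eliminated by the conditioning.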

1.2 Bayesian Analysis of Case-Control Studies

Since the methods I propose in my dissertation are mostly based on the Bayesian

paradigm, I will now present a brief account of the current state of the art in Bayesian

methods for case-control studies. In spite of the vast literature in the frequentist

domain, Bayesian methods for analyzing case-control data were first proposed only in the

1980's. With the arrival of Markov chain Monte Carlo (MCMC) techniques in the

1990's, it became possible to address more complex and unorthodox data scenarios

like missingness and measurement error in the context of a case-control study even

in a Bayesian framework.

Zelen and Parker (1986), Nurminen and Mutanen (1987), Marshall (1988), and

Ashby et al. (1993) developed Bayesian methods for analyzing case-control studies

with only a single binary exposure variable. All of them used versions of the following

model:

Let $\phi$ and $\gamma$ be the probabilities of exposure in control and case populations,

respectively. The retrospective likelihood is

$$l(\phi, \gamma) \propto \phi^{n_{01}}(1 - \phi)^{n_{00}}\,\gamma^{n_{11}}(1 - \gamma)^{n_{10}}, \tag{1-26}$$

where $n_{01}$ and $n_{00}$ are the number of exposed and unexposed observations in a control

population, whereas $n_{11}$ and $n_{10}$ denote the same for a case population.

Independent conjugate prior distributions for $\phi$ and $\gamma$ are assumed to be Beta($u_1$, $u_2$)

and Beta($v_1$, $v_2$), respectively. After reparametrization, one obtains the posterior distribution

of the log odds ratio parameter, $\beta = \log\{\gamma(1 - \phi)/(\phi(1 - \gamma))\}$, as

$$p(\beta \mid n_{11}, n_{10}, n_{01}, n_{00}) \propto \exp\{(n_{11} + v_1)\beta\} \int_0^1 \frac{\phi^{\,n_{11} + n_{01} + v_1 + u_1 - 1}\,(1 - \phi)^{\,n_{10} + n_{00} + v_2 + u_2 - 1}}{\{1 - \phi + \phi\exp(\beta)\}^{\,n_{11} + n_{10} + v_1 + v_2}}\,d\phi. \tag{1-27}$$

The posterior density of $\beta$ does not exist in closed form, but may be evaluated by

numerical integration.
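
As an alternative to numerical integration, the posterior of the log odds ratio under this model can be approximated by direct simulation, since the two exposure probabilities have independent Beta posteriors. The sketch below is a Monte Carlo illustration rather than the method used by the cited authors; the counts, the uniform Beta(1, 1) priors and the function name are hypothetical.

```python
import math
import random

def sample_log_odds_ratio(n11, n10, n01, n00, u=(1.0, 1.0), v=(1.0, 1.0),
                          draws=20000, seed=0):
    """Draw from the posterior of beta = log odds ratio under Beta priors."""
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        phi = rng.betavariate(n01 + u[0], n00 + u[1])     # exposure prob., controls
        gamma = rng.betavariate(n11 + v[0], n10 + v[1])   # exposure prob., cases
        samples.append(math.log(gamma * (1.0 - phi) / (phi * (1.0 - gamma))))
    return samples

# Hypothetical counts: cases 30 exposed / 70 unexposed, controls 15 / 85.
draws = sample_log_odds_ratio(n11=30, n10=70, n01=15, n00=85)
posterior_mean = sum(draws) / len(draws)
prob_positive = sum(d > 0 for d in draws) / len(draws)
print(posterior_mean, prob_positive)
```

Summaries such as the posterior mean or the probability that the log odds ratio exceeds zero then come from simple averages of the draws.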

Since interest often lies in the hypothesis $\beta = 0$, Zelen and Parker (1986) recommended

calculating the ratio of the two posterior probabilities $p(\beta)/p(0)$ at selected

deviates $\beta$. When $\beta$ is set at the posterior mode, a large value of this ratio will indicate

concentration of the posterior away from 0 and one would infer disease-exposure association.

However, the critical value suggested for this ratio is completely arbitrary.

They also provided a normal approximation to the posterior distribution of $\beta$ to avoid

numerical computation, and discussed the problem of choosing a prior distribution

based on some prior data on exposure information in a Bayesian framework.

Nurminen and Mutanen (1987) considered a more general parametrization in

terms of the odds ratio $\psi = \exp(\beta)$ which covers risk ratio and risk differences.

They provided a complicated exact formula for the cumulative distribution function

of this general comparative parameter, which can be related to Fisher's exact test for

comparing two proportions in sampling theory. The Bayesian point estimates were

considered as posterior median and mode, whereas inference was based on highest

posterior density interval for the comparative parameter of interest.

Marshall (1988) provided a closed-form expression for the moments of the posterior

distribution of the odds ratio. He mentioned that an approximation to the

exact posterior density of the odds ratio parameter can be obtained by power series

expansion of the hypergeometric functions involved in the expression for the density,

but acknowledged the problem of slow convergence in adopting this method. Instead

Marshall used Lindley's (1964) result for the approximate normality of log(odds ratio),

which works very well over a wide range of situations. In the absence of exposure

information, Marshall recommended using independent priors on the parameters. He

suggested that a perception about the value of the odds ratio should guide the choice

of prior parameters rather than attempting to exploit the exposure proportions as

suggested in Zelen-Parker. Inference again is based on posterior credible intervals.

Müller and Roeder (1997) proposed a semiparametric Bayesian approach to case-control

studies having continuous exposures with measurement error. They used a

Bayesian non-parametric model for the joint marginal distribution of the true exposure

(where available), the surrogate and the measurement error. Their methods

are intrinsically designed for continuous exposure. Müller et al. (1999) proposed a

hierarchical Bayesian approach for combining the data from a case-control study and

a prospective cohort study, and to estimate the absolute risk of the disease. They

modeled the retrospective distribution of the exposure variable given the disease status,

and accounted for parameter heterogeneity across studies by using a hierarchical

Bayesian approach.

Diggle, Morris and Wakefield (2000) presented the first Bayesian analysis for

individually matched case-control data (appropriate nuisance parameters are intro-

duced to represent the separate effect of matching in each matched set to recognize

the study design). They considered matched data when exposure of primary interest

is defined by the spatial location of an individual relative to a point or line source of

pollution.

Seaman and Richardson (2001) extended the binary exposure model of Zelen-Parker

to any number of categorical exposures, by simply replacing the binomial likelihoods

in (1-26) by a multinomial likelihood, and then adopting an MCMC strategy

with respect to a baseline category. They also adapted the Müller-Roeder approach

to the setting with categorical exposures and illustrated that under certain specific

choices of a discrete Dirichlet prior on the exposure distribution, Zelen-Parker and

Müller-Roeder approaches became approximately equivalent.

Ghosh and Chen (2002) developed general Bayesian inferential techniques for

matched case-control problems in the presence of one or more binary exposure variables.

Their model was more general than that of Zelen and Parker (1986), and

was based on an unconditional likelihood rather than a conditional likelihood, unlike

Diggle, Morris and Wakefield (2000). The general Bayesian methodology based on

the full likelihood that they proposed worked beyond the logit link. Their procedure

included not only the probit and the complementary log links but also some new

symmetric as well as skewed links. The propriety of posteriors was proved under a

very general class of priors that need not always be proper.










Sinha et
matched case-control studies with missing exposure. They assumed a Dirichlet pro-

cess prior with a mixing normal distribution on the distribution of the stratum effects

on the exposure distribution. The proposed method possessed certain attractive ro-

bustness properties under varying degrees of stratum heterogeneity in the exposure

distribution.

Sinha et
ease states. They further extended their methods to model multivariate exposure

with association and partial missingness (Sinha et
presented an ensemble of methods to handle unorthodox data scenarios in matched

case-control studies.

1.3 Topics of This Dissertation

A resurgence of interest has been recently expressed in genetic case-control stud-

ies (Risch and Merikangas, 1996; Morton and Collins, 1998; Sullivan et
explore a variety of disease-gene association and gene-environment interaction. The

Bayesian pathways have remained less explored in the case-control context mainly

because of the computing needs for implementing the models.

In genetic case-control studies, accounting for population substructure is a critical

issue in a population where admixture of several ancestries has taken place. A

systematic difference in ancestry in cases and controls can lead to false discovery of

association. In my dissertation, I propose a two-stage parametric Bayesian approach

which integrates the model uncertainty into a disease-gene association model through

the technique of Bayesian model averaging, where the analysis is not limited to binary

genotypes irrespective of whether or not the disease is rare.

Many human diseases result from the interplay of genetic factors and environmental

exposures. One may exploit the gene-environment independence in order to derive

more efficient estimation techniques than the traditional logistic regression analysis.










I provide Bayesian nonparametric methods to capture stratification effects on the

distribution of environmental exposures under the gene-environment independence

assumption in the control population. Also in a Bayesian paradigm I can effectively

use the prior knowledge while modeling the individual genotype frequencies in each

stratum and thus relax the stringent logistic assumption. My objective will be not

only to estimate the interaction effect parameter, but also to estimate the effects of

the genetic factor and environmental exposures as well.

Measurement error in exposure assessment is one of the major sources of bias

in epidemiological studies. When ignored, these errors bias our point and interval

estimates of effect, and invalidate p-values of hypothesis tests. Less attention has

been given to the influence of misclassification on the assessment of interactions between

two or more factors. Based on sensitivity and specificity of the genetic and

environmental factors, I describe a relatively simple approach to adjust the estimation

of the parameters of interest in gene-environment association studies in the presence

of misclassification while exploiting the G-E independence assumption.

The outline of the rest of my dissertation is as follows. In Chapter 2, I

present a general result which shows that the posterior inference for the parameters

from a multinomial likelihood is exactly equivalent to that from the corresponding

Poisson likelihood with an arbitrary proper prior for the parameters of interest and

independent uniform priors for the latent parameters. The result is then extended

to prove the equivalence of posterior inference for the odds ratio parameter based on

prospective and retrospective likelihood in stratified case-control studies where some

of the exposure variables could be missing completely at random.

In Chapter 3, I propose a parametric Bayesian approach to examine the association

between a candidate gene and the occurrence of a disease in the presence of

population admixture. Two unmatched case-control simulation studies based on an










admixed Argentinean population as described in Sala et al. (1998, 1999) are performed

to illustrate the methods and computing scheme. The method is also applied

to a real dataset coming from a genetic association study on obesity.

In Chapter 4, I provide a novel semiparametric Bayesian approach to model stratification

effects under the assumption of gene-environment independence in the control

population. I illustrate the methods by applying them to data from a population-based

case-control study on ovarian cancer conducted in Israel. Simulation studies

are conducted to compare our method with other popular choices. The results reflect

that the semiparametric Bayesian model allows incorporation of key scientific

evidence in the form of a prior and offers a flexible, robust alternative when standard

parametric model assumptions do not hold.

In Chapter 5, I derive an analytic formulation to obtain estimates and confidence

intervals for the misclassified case in an unmatched case-control set-up, which reduce

back to standard analytic forms as the error probabilities reduce to zero. I adapt

and extend the work of Rice and Holmans (2003) to the situation when one has a

binary genetic risk factor, a binary environmental exposure, and both are potentially

subject to misclassification. Concluding remarks and directions for future work are

stated in Chapter 6.















CHAPTER 2
EQUIVALENCE OF POSTERIORS IN THE BAYESIAN ANALYSIS OF THE
MULTINOMIAL-POISSON TRANSFORMATION

2.1 Introduction

Baker (1994) presented a general result which showed how maximum likelihood

estimation of parameters from a multinomial distribution could be carried out from a

corresponding Poisson likelihood by exploiting the multinomial-Poisson relationship.

Henceforth, this will be referred to as the multinomial-Poisson (MP) transformation.

Baker considered situations where the multinomial probabilities were ratios of functions

of parameters to the sum of these functions. The motivation was to simplify the

maximum likelihood computation as well as computation of the asymptotic variance-covariance

matrix of the maximum likelihood estimate (MLE). Baker's result unified a

large number of analyses involving log-linear models, capture-recapture models, pro-

portional hazards models with categorical covariates, generalized Rasch models, voter

plurality models, conditional logistic regression and two-stage case-control studies.

Baker's ideas were extended in the context of Bayesian analysis of case-control

studies by Seaman and Richardson (2004). The natural likelihood to use for a case-control

study is a retrospective likelihood, i.e., a likelihood based on the probability

of exposure given the disease status. Prentice and Pyke (1979) showed that, when a

logistic regression is assumed for the probability of a disease given certain exposures,

the maximum likelihood estimators and asymptotic covariance matrix of the log odds

ratios obtained from the retrospective likelihood are the same as those obtained from

the prospective likelihood, i.e., that based on the probability of a disease given expo-

sures. The objective of Seaman and Richardson (2004) was to verify a result similar

to Prentice and Pyke (1979) for the posterior distribution of the log odds ratios in










a Bayesian analysis. They proved that a Bayesian analysis that uses the prospective

likelihood, and assumes a uniform prior for the log odds that an individual with

baseline exposure is diseased, is equivalent to an analysis that uses the retrospective

likelihood and assumes a Dirichlet prior for the exposure probabilities in the control

group. Earlier, an approximate equivalence result was indicated by Gustafson et al.

(2002). Seaman and Richardson left open the question of similar equivalence for

stratified case-control data with missing exposure values.

In Section 2.2 of this chapter, first based on an MP transformation in a Bayesian

framework, I prove a general result which shows that the posterior inference for the

parameters of a multinomial likelihood is the same as that for the corresponding Pois-

son likelihood with arbitrary proper priors for the parameters of interest and uniform

priors for the latent parameters introduced in the Poisson likelihood. Propriety of

posteriors under the assumed priors follows as an immediate consequence. In Section

2.3, I extend the results of Seaman and Richardson (2004) to stratified case-control

problems where some of the exposure variables could be missing completely at ran-

dom. Stratified case-control problems without any missingness can be handled as

special cases. Individually matched case-control design is a special case of stratified

case-control design where the matched sets define the strata. Finally, some concluding

remarks are made in Section 2.4.

2.2 A General Result on Posterior Equivalence

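Let $\{Y_{ij};\ j \in J_i,\ i = 1, 2, \ldots, I\}$ denote a vector of discrete random variables

with a realization $\{y_{ij};\ j \in J_i,\ i = 1, 2, \ldots, I\}$. The subscript $i$ indexes levels of a

categorical covariate or a cross-classification of categorical covariates, and $J_i$ (indexed

by $j$) denotes the set of subjects in level $i$. I assume that the vector $\{Y_{ij};\ j \in J_i\}$

follows a multinomial distribution with parameters $\{g_{ij}(\beta)/G_i(\beta),\ \text{for } j \in J_i\}$, where

$g_{ij}(\beta)$ are some functions of $\beta$, $G_i(\beta) = \sum_{j \in J_i} g_{ij}(\beta)$, and $\beta = (\beta_1, \ldots, \beta_p)^T$. The

likelihood function is then proportional to

$$L_M(\beta) = \prod_{i=1}^{I}\prod_{j \in J_i} \left\{\frac{g_{ij}(\beta)}{G_i(\beta)}\right\}^{y_{ij}}. \tag{2-1}$$

Let $\phi = (\phi_1, \ldots, \phi_I)^T$ indicate a set of parameters. The MP transformation

of (2-1) as given by Baker (1994) is the corresponding Poisson likelihood proportional

to

$$L_P(\phi, \beta) = \prod_{i=1}^{I}\prod_{j \in J_i} \{g_{ij}(\beta)\exp\phi_i\}^{y_{ij}} \exp\{-g_{ij}(\beta)\exp\phi_i\}. \tag{2-2}$$

THEOREM 1. Suppose $\sum_{j \in J_i} y_{ij} \geq 1$ for all $i = 1, 2, \ldots, I$. Assume independent

improper priors $p(\phi_i) \propto 1$, for $i = 1, \ldots, I$, and a proper prior $p(\beta)$ for $\beta$ which

is independent of $\phi$. Then the posterior distribution for $\beta$ derived from $L_M(\beta)$ is

equivalent to that generated from $L_P(\phi, \beta)$.

Proof: Let $\alpha_i = \exp(\phi_i)$, $i = 1, \ldots, I$. Then $\alpha_i$ has the prior $p(\alpha_i) \propto \alpha_i^{-1}$. The

marginal posterior of $\beta$ from $L_P(\phi, \beta)$ is now given by

$$\begin{aligned}
\pi(\beta \mid y) &\propto p(\beta) \prod_{i=1}^{I} \int_0^\infty \alpha_i^{-1} \prod_{j \in J_i} \{\alpha_i g_{ij}(\beta)\}^{y_{ij}} \exp\{-\alpha_i g_{ij}(\beta)\}\,d\alpha_i \\
&= p(\beta) \prod_{i=1}^{I} \Big\{\prod_{j \in J_i} g_{ij}(\beta)^{y_{ij}}\Big\} \int_0^\infty \alpha_i^{\sum_{j \in J_i} y_{ij} - 1} \exp\{-\alpha_i G_i(\beta)\}\,d\alpha_i \\
&\propto p(\beta) \prod_{i=1}^{I}\prod_{j \in J_i} \left\{\frac{g_{ij}(\beta)}{G_i(\beta)}\right\}^{y_{ij}} \\
&= p(\beta)\,L_M(\beta),
\end{aligned}$$

which is obviously the same as the posterior distribution of $\beta$ generated from $L_M(\beta)$.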
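
The key integration step in the proof can be checked numerically. The sketch below is a rough illustration for a single level $i$ with the hypothetical choice $g_j(\beta) = \exp(\beta x_j)$ and scalar $\beta$: it integrates the Poisson likelihood times the prior $\alpha^{-1}$ over $\alpha$ by a midpoint rule and compares the result with the multinomial likelihood. Their ratio should be the same constant, $\Gamma(\sum_j y_j)$, at every value of $\beta$. All names, data and the truncation point of the integral are assumptions.

```python
import math

def multinomial_likelihood(beta, x, y):
    """Multinomial kernel with cell probabilities g_j / G, g_j = exp(beta * x_j)."""
    g = [math.exp(beta * xj) for xj in x]
    G = sum(g)
    return math.prod((gj / G) ** yj for gj, yj in zip(g, y))

def poisson_marginal(beta, x, y, upper=40.0, steps=20000):
    """Midpoint-rule integral over alpha of Poisson likelihood times 1/alpha."""
    g = [math.exp(beta * xj) for xj in x]
    h = upper / steps
    total = 0.0
    for s in range(steps):
        a = (s + 0.5) * h
        term = 1.0 / a                       # improper prior on alpha
        for gj, yj in zip(g, y):
            term *= (a * gj) ** yj * math.exp(-a * gj)
        total += term * h
    return total

x = [0.0, 1.0, 2.0]     # hypothetical covariate values
y = [3, 2, 1]           # hypothetical counts, total 6
for beta in (-0.5, 0.3):
    ratio = poisson_marginal(beta, x, y) / multinomial_likelihood(beta, x, y)
    print(beta, ratio, math.gamma(sum(y)))
```

Because the ratio does not vary with $\beta$, multiplying either likelihood by the same prior $p(\beta)$ gives the same normalized posterior.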

The following corollary establishes the propriety of the above posterior under very

mild conditions.

COROLLARY 1. If $\sum_{j \in J_i} y_{ij} \geq 1$ for all $i = 1, \ldots, I$, and $p(\beta)$ is proper, then

$\pi(\beta \mid y)$ is proper.










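Proof. Let $R$ denote the support of $\beta$. Then by Theorem 1,

$$\int_R p(\beta)\,L_M(\beta)\,d\beta \leq \int_R p(\beta)\,d\beta = 1 < \infty,$$

since $L_M(\beta) \leq 1$ as a product of multinomial probabilities.

Propriety of the posterior thus follows as an immediate consequence of the equivalence

of the two analyses.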

REMARK 1. If instead I use independent priors $p(\alpha_i) \propto \alpha_i^{a_i - 1}$ ($a_i > 0$ for all $i =$

$1, \ldots, I$), then the assumption $\sum_{j \in J_i} y_{ij} \geq 1$ can be dropped to establish propriety

of the resulting posterior for $\beta$. But this posterior will no longer be proportional to

$p(\beta)\,L_M(\beta)$, as $G_i(\beta)$ will then have the power $\sum_{j \in J_i} y_{ij} + a_i$ rather than $\sum_{j \in J_i} y_{ij}$.

Bayesian analogues of all the examples of Baker (1994) can now be handled from

this general theorem. For brevity, I omit these examples, and proceed to the next

section to show the equivalence of posteriors based on prospective and retrospective

likelihood in stratified case-control studies where some of the exposure variables

could be missing completely at random (Little and Rubin, 2002).

2.3 Stratified Case-Control Studies with Missing Exposures

In this section, I prove that a Bayesian analysis of stratified case-control data with

missing exposure that uses the prospective likelihood, and assumes a uniform prior for

the log odds that an individual with baseline exposure is diseased, is exactly equivalent

to an analysis that uses the retrospective likelihood and assumes a uniform prior

distribution for the exposure probabilities in the control group. My analysis handles

the case when some of the exposure variables are missing completely at random.

Suppose there are $I$ strata where each stratum has $s$ cases and $t$ controls in

a stratified case-control study. Let $S_i$ denote the $i$th stratum. Let $D_{ij}$ (= 1 or

0) correspond to the presence or absence of a disease for the $j$th individual in the $i$th

stratum, and let $X_{ij}$ denote the vector of discrete exposure variables for the $j$th

observed subject in the $i$th stratum. I assume that each $X_{ij}$ can take one of the $K$

possible values $\{x_1, \ldots, x_K\}$. Suppose now

$$P(D_{ij} = 1 \mid X_{ij} = x_k, S_i) = \frac{\alpha_i \exp(\beta^T x_k)}{1 + \alpha_i \exp(\beta^T x_k)}, \qquad P(X_{ij} = x_k \mid D_{ij} = 0, S_i) = \frac{\gamma_{ik}}{\sum_{l=1}^{K} \gamma_{il}}. \tag{2-3}$$

The probability that individual $j$ in stratum $i$ has exposure value $x_k$, given that the

individual is a member of the control population, is proportional to $\gamma_{ik}$. For each

exposure value $x_k$, these probabilities are assumed to be the same for all controls in

stratum $i$ and do not depend on $j$.

Using (2-3) I can obtain the distribution of the exposure in the case population

and write the prospective and retrospective models in the following form:

$$P(D_{ij} = d \mid X_{ij} = x_k, S_i) = \frac{\alpha_i^d \exp(d\beta^T x_k)}{\sum_{l=0}^{1} \alpha_i^l \exp(l\beta^T x_k)}, \qquad P(X_{ij} = x_k \mid D_{ij} = d, S_i) = \frac{\gamma_{ik}\exp(d\beta^T x_k)}{\sum_{l=1}^{K} \gamma_{il}\exp(d\beta^T x_l)}, \tag{2-4}$$

where $d = 0, 1$.

Let $\Delta_{ij}$ denote the missingness indicator for the $j$th subject in the $i$th stratum (0 indicating

missingness) with

$$P(\Delta_{ij} = 1 \mid S_i) = 1 - P(\Delta_{ij} = 0 \mid S_i) = \eta_i. \tag{2-5}$$

Let $\eta = (\eta_1, \ldots, \eta_I)^T$. With the missing completely at random assumption, $\eta_i$ does

not depend on the parameters $\gamma_{ik}$, $\alpha_i$ or $\beta$.

Let $y_{idk} = \sum_j \{I[X_{ij} = x_k]\,I[D_{ij} = d]\,I[\Delta_{ij} = 1]\}$, $d = 0, 1$, i.e., $y_{i0k}$ and $y_{i1k}$ are

the respective numbers of undiseased and diseased subjects having $X = x_k$ in the $i$th

stratum, and $I$ denotes the usual indicator function. Now, the prospective likelihood

is

$$L_P = \prod_{i=1}^{I}\prod_{j=1}^{s+t} \left[P(D_{ij} \mid X_{ij}, S_i)\right]^{\Delta_{ij}} = \prod_{i=1}^{I}\prod_{d=0}^{1}\prod_{k=1}^{K} \left[\frac{\alpha_i^d \exp(d\beta^T x_k)}{\sum_{l=0}^{1} \alpha_i^l \exp(l\beta^T x_k)}\right]^{y_{idk}}, \tag{2-6}$$

and the retrospective likelihood is

$$L_R = \prod_{i=1}^{I}\prod_{d=0}^{1}\prod_{k=1}^{K} \left[\frac{\gamma_{ik}\exp(d\beta^T x_k)}{\sum_{l=1}^{K} \gamma_{il}\exp(d\beta^T x_l)}\right]^{y_{idk}}. \tag{2-7}$$
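
For concreteness, the two likelihoods can be evaluated from a table of observed counts. The sketch below treats a single stratum and a scalar exposure, with `y[d][k]` holding the number of observed subjects with disease status `d` and exposure value `x[k]`; subjects with missing exposure are assumed to have been dropped from the counts already. The function names and data are hypothetical.

```python
import math

def log_prospective(alpha, beta, x, y):
    """Log prospective likelihood for one stratum from counts y[d][k]."""
    total = 0.0
    for k, xk in enumerate(x):
        denominator = 1.0 + alpha * math.exp(beta * xk)
        for d in (0, 1):
            total += y[d][k] * (d * (math.log(alpha) + beta * xk)
                                - math.log(denominator))
    return total

def log_retrospective(gamma, beta, x, y):
    """Log retrospective likelihood for one stratum from counts y[d][k]."""
    total = 0.0
    for d in (0, 1):
        denominator = sum(gl * math.exp(d * beta * xl) for gl, xl in zip(gamma, x))
        for k, xk in enumerate(x):
            total += y[d][k] * (math.log(gamma[k]) + d * beta * xk
                                - math.log(denominator))
    return total

x = [0.0, 1.0, 2.0]                # hypothetical exposure values
y = [[4, 3, 2], [1, 2, 5]]         # hypothetical counts: controls, then cases
print(log_prospective(0.8, 0.5, x, y))
print(log_retrospective([1.0, 2.0, 3.0], 0.5, x, y))
print(log_retrospective([10.0, 20.0, 30.0], 0.5, x, y))  # same value: scale-free
```

The last two lines illustrate that the retrospective likelihood depends on the $\gamma_{ik}$ only through their normalized values, which is what motivates the reparametrization used in the proof that follows.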

I now have the following equivalence theorem.

THEOREM 2. Suppose $\sum_{k=1}^{K} y_{i1k} \geq 1$ and $\sum_{k=1}^{K} y_{i0k} \geq 1$ for all $i = 1, \ldots, I$.

Assume mutually independent priors for the $\alpha_i$, $\gamma_{ik}$, $\eta$ and $\beta$, where $p(\alpha_i) \propto \alpha_i^{-1}$,

$p(\gamma_{ik}) \propto \gamma_{ik}^{-1}$, while $\eta$ and $\beta$ have proper priors $\pi_1(\eta)$ and $\pi_2(\beta)$. Then the posterior

distribution of $\beta$ derived from the prospective likelihood is equivalent

to that from the retrospective likelihood.

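Proof: Suppose that random variables $Y_{idk}$ are independently distributed as $Y_{idk} \sim$

Poisson($\lambda_{idk}$), where

$$\log\lambda_{idk} = \log\eta_i + \log\gamma_{ik} + d\,(\log\alpha_i + \beta^T x_k). \tag{2-8}$$

Then writing $\alpha = (\alpha_1, \ldots, \alpha_I)^T$ and $\gamma = (\gamma_{11}, \ldots, \gamma_{1K}, \ldots, \gamma_{I1}, \ldots, \gamma_{IK})^T$, the joint

prior is

$$\pi(\eta, \alpha, \gamma, \beta) \propto \pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1} \prod_{i=1}^{I}\prod_{k=1}^{K} \gamma_{ik}^{-1}.$$

The joint posterior is now given by

$$\begin{aligned}
\pi(\eta, \alpha, \gamma, \beta \mid y) &\propto \prod_{i=1}^{I}\prod_{d=0}^{1}\prod_{k=1}^{K} \lambda_{idk}^{y_{idk}} \exp(-\lambda_{idk})\ \pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1} \prod_{i=1}^{I}\prod_{k=1}^{K} \gamma_{ik}^{-1} \\
&= \prod_{i=1}^{I}\prod_{d=0}^{1}\prod_{k=1}^{K} \{\eta_i\gamma_{ik}\alpha_i^d \exp(d\beta^T x_k)\}^{y_{idk}} \exp\{-\eta_i\gamma_{ik}\alpha_i^d \exp(d\beta^T x_k)\} \\
&\qquad \times\ \pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1} \prod_{i=1}^{I}\prod_{k=1}^{K} \gamma_{ik}^{-1}.
\end{aligned} \tag{2-9}$$

First note that

$$\int_0^\infty \gamma_{ik}^{\,y_{i0k} + y_{i1k} - 1} \exp[-\eta_i\gamma_{ik}\{1 + \alpha_i \exp(\beta^T x_k)\}]\,d\gamma_{ik} \propto \eta_i^{-(y_{i0k} + y_{i1k})}\,\{1 + \alpha_i \exp(\beta^T x_k)\}^{-(y_{i0k} + y_{i1k})}.$$

Thus the joint posterior of $\eta$, $\alpha$, and $\beta$ is given by

$$\pi(\eta, \alpha, \beta \mid y) \propto \prod_{i=1}^{I}\prod_{d=0}^{1}\prod_{k=1}^{K} \left[\frac{\alpha_i^d \exp(d\beta^T x_k)}{\sum_{l=0}^{1} \alpha_i^l \exp(l\beta^T x_k)}\right]^{y_{idk}} \pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1} = L_P\,\pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1}.$$

Next integrating out $\eta$, the joint posterior of $\alpha$ and $\beta$ is

$$\pi(\alpha, \beta \mid y) \propto L_P\,\pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1}. \tag{2-10}$$

Let $\theta_{ik} = \gamma_{ik}/\sum_{l=1}^{K} \gamma_{il}$ and $\varphi_i = \sum_{l=1}^{K} \gamma_{il}$, thus $\gamma_{ik} = \varphi_i\theta_{ik}$. The Jacobian of this

transformation is

$$\left|\frac{\partial(\gamma_{i1}, \ldots, \gamma_{iK})}{\partial(\theta_{i1}, \ldots, \theta_{i,K-1}, \varphi_i)}\right| = \varphi_i^{K-1}.$$

Thus, the prior structure on $\gamma_{ik}$ implies the following prior structure for $\varphi = (\varphi_1, \ldots, \varphi_I)$

and $\theta = (\theta_{11}, \ldots, \theta_{1K}, \ldots, \theta_{I1}, \ldots, \theta_{IK})$:

$$\pi(\varphi, \theta) \propto \prod_{i=1}^{I} \varphi_i^{K-1} \prod_{i=1}^{I}\prod_{k=1}^{K} (\varphi_i\theta_{ik})^{-1} = \prod_{i=1}^{I} \varphi_i^{-1} \prod_{i=1}^{I}\prod_{k=1}^{K} \theta_{ik}^{-1}.$$

Now, the joint posterior given in (2-9) can be written as

$$\begin{aligned}
\pi(\eta, \alpha, \varphi, \theta, \beta \mid y) &\propto \prod_{i=1}^{I}\prod_{d=0}^{1}\prod_{k=1}^{K} \{\eta_i\varphi_i\theta_{ik}\alpha_i^d \exp(d\beta^T x_k)\}^{y_{idk}} \\
&\qquad \times \prod_{i=1}^{I} \exp\Big\{-\eta_i\varphi_i - \eta_i\varphi_i\alpha_i \sum_{k=1}^{K} \theta_{ik}\exp(\beta^T x_k)\Big\} \\
&\qquad \times\ \pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I} \alpha_i^{-1} \prod_{i=1}^{I} \varphi_i^{-1} \prod_{i=1}^{I}\prod_{k=1}^{K} \theta_{ik}^{-1}.
\end{aligned} \tag{2-11}$$

Again, note that

$$\int_0^\infty\!\!\int_0^\infty \exp\Big\{-\eta_i\varphi_i - \eta_i\varphi_i\alpha_i \sum_{k=1}^{K} \theta_{ik}\exp(\beta^T x_k)\Big\}\,(\eta_i\varphi_i)^{\sum_k (y_{i0k} + y_{i1k})}\,\alpha_i^{\sum_k y_{i1k} - 1}\,\varphi_i^{-1}\,d\alpha_i\,d\varphi_i \propto \Big\{\sum_{l=1}^{K} \theta_{il}\exp(\beta^T x_l)\Big\}^{-\sum_k y_{i1k}}.$$

Thus, using $\sum_{l=1}^{K} \theta_{il} = 1$,

$$\pi(\eta, \theta, \beta \mid y) \propto \prod_{i=1}^{I}\prod_{d=0}^{1}\prod_{k=1}^{K} \left[\frac{\theta_{ik}\exp(d\beta^T x_k)}{\sum_{l=1}^{K} \theta_{il}\exp(d\beta^T x_l)}\right]^{y_{idk}} \pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I}\prod_{k=1}^{K} \theta_{ik}^{-1} = L_R\,\pi_1(\eta)\,\pi_2(\beta) \prod_{i=1}^{I}\prod_{k=1}^{K} \theta_{ik}^{-1},$$

where $L_R$ is as given in (2-7). Then integrating with respect to $\eta$, I have

$$\pi(\theta, \beta \mid y) \propto L_R\,\pi_2(\beta) \prod_{i=1}^{I}\prod_{k=1}^{K} \theta_{ik}^{-1}. \tag{2-12}$$

Since the order of integration of the joint posterior does not matter as long as the

posterior is proper, comparing (2-10) and (2-12), it follows that after integrating the

nuisance parameters, $\alpha$ or $\theta$, the posterior for $\beta$ generated from $L_P$ or $L_R$ remains

the same.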









REMARK 2. Theorem 2 indicates that the marginal posterior distributions of $\beta$

from either (2-6) or (2-7) are the same. Thus, in the presence of exposures missing

completely at random, one may fit either the prospective or the retrospective model

to stratified case-control data.
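
The equivalence can be illustrated numerically in the simplest setting: one stratum, a single binary exposure and no missing values. The sketch below integrates the prospective likelihood against the prior $\alpha^{-1}$ and the retrospective likelihood against the prior $\{\theta(1 - \theta)\}^{-1}$, where $\theta$ is the exposure probability among controls, using a midpoint rule after mapping the $\alpha$-integral to the unit interval. The ratio of the two marginals should not depend on $\beta$. Function names and counts are hypothetical, and this is a check under the stated simplifications rather than a general implementation.

```python
import math

def _midpoint(f, steps=20000):
    """Midpoint-rule integral of f over (0, 1)."""
    h = 1.0 / steps
    return sum(f((s + 0.5) * h) for s in range(steps)) * h

def prospective_marginal(beta, y):
    """Integral of L_P / alpha over alpha, using alpha = t / (1 - t)."""
    c = math.exp(beta)
    y0p, y1p = sum(y[0]), sum(y[1])        # numbers of controls and cases
    n1 = y[0][1] + y[1][1]                 # number of exposed subjects
    f = lambda t: t ** (y1p - 1) * (1 - t) ** (y0p - 1) * (1 - t + c * t) ** (-n1)
    return c ** y[1][1] * _midpoint(f)

def retrospective_marginal(beta, y):
    """Integral of L_R / {theta (1 - theta)} over theta = P(X = 1 | D = 0)."""
    c = math.exp(beta)
    y1p = sum(y[1])
    n0 = y[0][0] + y[1][0]                 # number of unexposed subjects
    n1 = y[0][1] + y[1][1]
    f = lambda t: t ** (n1 - 1) * (1 - t) ** (n0 - 1) * (1 - t + c * t) ** (-y1p)
    return c ** y[1][1] * _midpoint(f)

# Hypothetical counts y[d][x]: controls (6 unexposed, 2 exposed),
# cases (3 unexposed, 4 exposed).
y = [[6, 2], [3, 4]]
for beta in (-1.0, 0.5, 2.0):
    print(beta, prospective_marginal(beta, y) / retrospective_marginal(beta, y))
```

A constant ratio means the two unnormalized posteriors for $\beta$ differ only by a factor free of $\beta$, so they normalize to the same distribution under any common prior.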

REMARK 3. A stratified case-control study without missing exposures is a special
case, where $P(A_{ij} = 1 \mid S_i) = 1 - P(A_{ij} = 0 \mid S_i) = 1$. A 1:M individually matched
case-control study is a special case of a stratified case-control study with s = 1 and
t = M, where M is a positive integer and the strata are defined as the matched sets.
Note that we could very well assume that there are $s_i$ cases and $t_i$ controls in each
stratum, and the proof will still carry through.

REMARK 4. It is interesting to note that the posterior $\pi(\eta, \theta, \beta \mid y)$ is non-identifiable
in $\eta$, in the sense of Dawid (1979), since $\pi(\eta \mid \theta, \beta, y) = \pi_1(\eta)$, which does
not depend on y. This, however, does not impede the propriety of the joint posterior,
as shown in the following theorem. For a general result relating non-identifiability
to propriety or impropriety of posteriors, I refer to Ghosh et al. (2000).

The next theorem proves the propriety of the posterior under the assumed model



THEOREM 3. Assume (i) C~,Yidk > l and (ii) E[exp{(2d-1)(PT zk~d

$< \infty$ for d = 0 and d = 1. Here E denotes expectation with respect to the prior distribution
on $\beta$, namely $\pi_2(\beta)$. Then $\pi(\eta, \alpha, \gamma, \beta \mid y)$ is proper.

Proof. It suffices to show that $\pi(\alpha, \beta \mid y)$ is proper. This amounts to showing

Sf=, lI(0)x(0)d < 00, where


le(#) = a exp(d# zk ~ = ,""
-" k=1 d=0 l=0@ep1 k










Let wi = exp(asi), i = 1, I. Then


/" K 1 wiOk exp (wi + 0zk) il
ri(P> k=1 1 + exp(wi + p zk) X(1 + exp(wi + p zk) iew



k=1 k=1


0K K
+ exppT~ z y~ilkd i x-pp(0 zJk) ilk
k=1 k=1



where the final step follows from assumption (i). Hence, by the inequality
$(a+b)^{1+\delta} \le 2^{\delta}(a^{1+\delta} + b^{1+\delta})$ for $\delta > 0$,


Ir'(P) < 2- expI(-0 zki~ik) + exp(PT zkik
i= 1 i= 1 k= 1 i= 1 k= 1
K K
=21- ex(~-wlB CzkRak) + exp(P' zkRnk)
k=1 k=1

The proof is now completed by assumption (ii), which essentially requires the finiteness
of the moment generating function corresponding to the prior distribution $\pi_2(\cdot)$.

2.4 Discussion

As is well known, the MP transformation can simplify maximization of a multinomial
likelihood by considering a Poisson likelihood with additional parameters.
By introducing specific priors on the latent parameters, and arbitrary priors on the
parameters of interest in the Poisson likelihood, I show that the marginal posterior
distribution of the parameters of interest is exactly the same as that generated from
the multinomial likelihood. However, the MP transformation requires categorical covariates.
If some of them are continuous, the current practice is either to discretize
them or to follow the Bayesian bootstrap as proposed in Gustafson et al. (2002). An
important open question is the extension of the present results to continuous exposures.















CHAPTER 3
BAYESIAN MODELING FOR GENETIC ASSOCIATION IN CASE-CONTROL
STUDIES: ACCOUNTING FOR UNKNOWN POPULATION SUBSTRUCTURE

3.1 Introduction

The evaluation of the association between molecular markers and disease status
can be used to study the genetic basis of common human diseases (Risch and Merikangas,
1996; Morton and Collins, 1998; Sullivan et
such so-called association studies arises from the dependence of allele frequencies at

marker loci upon those of disease variants, that is, the linkage disequilibria between

alleles from different genetic loci. A significant association detected between a marker

and the disease can be considered as evidence for close physical linkage between the
marker and a disease locus, given that the linkage disequilibrium between any two
genes always decays exponentially with their genetic distance in a random-mating
idealized population (Lynch and Walsh, 1998).

In practice, however, there rarely exists an idealized population as a result of the

action of various evolutionary forces (Lynch and Walsh, 1998). Evolutionary forces,

such as population structure and population admixture, operating on a population

can result in spurious associations between a phenotype and markers that are not

linked to any causative loci. The presence of spurious association suggests that the
detected statistical association does not necessarily imply physical linkage between
the disease phenotype and arbitrary markers that have no physical linkage to

causative loci (Lander and Schork, 1994). A classic example of spurious association

caused by population substructure is presented in Knowler et
study, based on a sample of Native Americans of the Pima and Papago tribes, a very

strong negative association between the Gm haplotype Gm3;5,13,14 and type 2 or










non-insulin-dependent diabetes mellitus was detected. One might conclude from this
observation that the absence of this haplotype, or the presence of a closely linked gene,
is a causal risk factor for the disease. However, Gm3;5,13,14 is a marker for Caucasian

admixture, and it is most likely that the presence of Caucasian alleles and decrease in

Indian alleles led to lower susceptibility to type 2 diabetes, rather than the direct ac-

tion of the haplotype or of a closely linked locus. This study demonstrates the effects

of confounding due to population substructure, and the importance of considering

genetic admixture while investigating the association between a disease and genetic

markers.

In order to overcome the problem of spurious associations, many different genetic

strategies have been proposed. Spielman et
librium test (TDT) to measure the association between a candidate gene and disease
status by incorporating the genotypes of parents of affected individuals. This test
has been instrumental in genetic association studies of human diseases (Spielman and
Ewens, 1998), but it is often limited because of difficulties with DNA sampling. For

this reason, a simple case-control design that uses affected individuals and unrelated

controls has recently received increased attention (Freedman et
et
tion of spurious associations in case-control studies of disease-gene association. For a

comprehensive recent review of admixture mapping for complex traits, see McKeigue

(2005).

Pritchard and colleagues used multilocus genotype data to estimate population

substructure. They proposed a model-based clustering method to identify the pop-

ulation structure by genotyping samples at additional unlinked markers (Pritchard

et
ferent markers (Falush et









written to implement their algorithms that consider both linked and unlinked mark-

ers. Pritchard et
population structure is inferred by employing the method of Pritchard et
and then the tests of association within subpopulations are conducted conditional on

the imputed substructure. However, this method does not develop a model for the

probability of disease incidence and cannot be generalized easily to provide estimates

of the odds ratio corresponding to the genetic risk factor. Hoggart et
2004) developed a combination of Bayesian and classical approaches for association
studies based on the admixture between populations with different ancestries. Apart
from STRUCTURE, two other software packages that employ Bayesian ideas for statistical
modeling of genetic data from admixed populations are ADMIXMAP (Hoggart et
2003, 2004) and ANCESTRYMAP (Patterson et
Different from the above treatments, Satten et
class analysis to study the association between the disease and the candidate genes

based on a series of additional markers that are in linkage equilibrium with each

other and with the candidate genes within subpopulations. Based on the Akaike

information criterion (AIC), their method can estimate the number of subpopulations.

But by either assuming the disease to be rare, or collapsing multiple genotypes into
various binary genotypes, their method has not fully capitalized on the information
about the multiple-genotype inheritance of the candidate gene.

In this chapter, I provide an alternative parametric Bayesian model for inferring
on disease-gene association after accounting for population substructure. As in

Satten et
rameters, while I account for the population substructure in a way similar to that of

Pritchard et
require the rare disease assumption or analyzing multicategory genotypes by several

analyses using various possible binary genotypes of the candidate gene. Our model










can also handle multi-allelic genotypes of the candidate genes, extending earlier
approaches for the genotype analysis of only biallelic loci. The computational strategy
followed in Satten et
parameters in the model, combined with a parametric bootstrap strategy to obtain

standard error estimates. The MCMC strategy designed in this chapter simplifies

the computational complexity, with posterior standard deviation estimates and cred-

ible intervals being obtained from the random observations generated from the full

conditional distributions of the parameters.

I should emphasize that in our Bayesian analysis, inference on the disease-gene

association is not carried out on the basis of the particular imputed structure as

done in Pritchard et
example, Madigan and Raftery (1994)), the association parameters are estimated by

incorporating the uncertainty in estimating the substructure. In particular, instead

of assuming the number of subpopulations I to be fixed, I put a prior on I and obtain

the posterior distribution of I. For each possible value of I with positive posterior

probability, I then estimate the association parameters in the disease-gene risk model.

Finally I take the weighted average of these estimates, the weights being proportional

to the posterior probabilities of the different values of I. The explicit model averaging
formulas are given in Section 3.3.2. Our analysis thus combines the substructure

estimation ideas of Pritchard et
class disease risk models of Satten et
work, through a more general unified Bayesian approach. This chapter presents a
novel two-stage model with a clustering algorithm for inferring cryptic population
structure, followed by a logistic model for disease incidence, tied together through
the technique of Bayesian model averaging.

The outline of the chapter is as follows. Section 3.2 states both the statistical
model and the genetic model, and briefly introduces the methods in Pritchard et al.









(2000a) to estimate the number of subpopulations. Section 3.3 derives the underly-

ing likelihood. I also introduce in this section the appropriate priors for the model

parameters and obtain their estimates based on the posteriors. The posteriors are
analytically intractable, so the Bayesian procedure is implemented by the MCMC
numerical integration technique. In Section 3.4, I state the simulation strategy and

provide results on simulated case-control studies under both a rare disease and a com-

mon disease assumption. The simulation studies are conducted in the same setting as

in Satten et al. (2001) and mimic an admixed Argentinean population as described

in Sala et al. (1998, 1999). Under the rare disease assumption, I compare our results

with those obtained in Satten et al. (2001). In Section 3.5, I apply our methods to

real data collected in a genetic association study with obesity as the disease outcome

and the $\beta_2$-adrenergic receptor ($\beta_2$AR) as the candidate gene under investigation.

Some concluding remarks are made in Section 3.6.

3.2 Model and Notation

3.2.1 Statistical Model

Let the binary variable D denote disease and let G be a (possibly vector-valued)
genetic risk factor. I assume that the overall population of size N comprises I
subpopulations, each having different frequencies of G and D. By the unmeasured
covariate Z, I indicate the subpopulation to which an individual belongs. Thus,
$D_j$ (= 1 or 0) corresponds to the presence or absence of a disease for the jth individual
with a genetic risk factor $G_j$, $j = 1, \ldots, N$.

I assume $G_j$ to be a univariate discrete random variable, taking M + 1 values
$g_0 (= 0), g_1, \ldots, g_M$. I assume that the prospective conditional logistic distribution
for the disease status is

$$\Pr(D_j = 1 \mid G_j = g_m, Z = i) = H(\beta_{0i} + \beta_{1m}), \quad m = 0, \ldots, M, \qquad (3\text{-}1)$$










where $H(u) = \{1 + \exp(-u)\}^{-1}$. Here $\beta_{0i}$ is a term representing the subpopulation
effect on the probability of disease for individuals belonging to a particular subpopulation
i, and $\beta_{1m}$ is the coefficient corresponding to the genetic exposure variable in
the above logistic regression model. For parameter identifiability, I set $\beta_{10} = 0$. The
method can immediately be extended to a vector-valued genetic risk factor $G_j$ for
individual j.
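As a minimal sketch of the disease model (3-1), the following Python fragment evaluates the disease probability for a given subpopulation and genotype category. The function names and the numerical values of the coefficients are illustrative assumptions, not values taken from this chapter.

```python
import math

def H(u):
    """Logistic function H(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def disease_prob(beta0, beta1, i, m):
    """Pr(D = 1 | G = g_m, Z = i) under model (3-1).

    beta0[i] is the subpopulation effect and beta1[m] the genotype effect,
    with beta1[0] = 0 for identifiability.
    """
    return H(beta0[i] + beta1[m])

# Hypothetical coefficients: two subpopulations, three genotype categories.
beta0 = [-2.0, -1.0]
beta1 = [0.0, 0.0, 1.0]

p_ref = disease_prob(beta0, beta1, 0, 0)   # reference genotype g_0
p_exp = disease_prob(beta0, beta1, 0, 2)   # genotype g_2
# Within a subpopulation the odds ratio of g_2 versus g_0 is exp(beta1[2]).
odds_ratio = (p_exp / (1.0 - p_exp)) / (p_ref / (1.0 - p_ref))
```

Because the subpopulation effect enters additively on the logit scale, the odds ratio for a genotype does not depend on the subpopulation index in this sketch.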

3.2.2 Genetic Model

Since different subpopulations may have different frequencies of other marker
genes, I use a latent-class approach to infer the population substructure by
using information on those additional marker loci. Consider $x_l^{(c)}$ as the allele at marker
l on chromosome c = 1, 2 (labeling of the two chromosomes in a given pair as 1 or 2
is arbitrary) and let $X = (x_1^{(1)}, x_1^{(2)}, \ldots, x_L^{(1)}, x_L^{(2)})$, where L is the number of marker loci
under consideration.

First, I assume that the genes at the additional marker loci are unrelated to
disease, that is,

$$\Pr(D_j = 1 \mid G_j, X_j, Z = i) = \Pr(D_j = 1 \mid G_j, Z = i). \qquad (3\text{-}2)$$


In the analysis that follows, I assume that Hardy-Weinberg equilibrium holds for each

subpopulation. Human populations rarely show much divergence from the Hardy-

Weinberg equilibrium once population substructure has been accounted for (Report

of Committee on DNA Forensic Science 1996, pp. 104 and references cited therein).

Further, by choosing additional marker loci on different chromosomes from the
chromosome where G is found, I first assume that the additional mutually independent
marker genes are in linkage equilibrium with the candidate gene G, so that

$$\Pr(G_j, X_j \mid Z = i) = \Pr(G_j \mid Z = i) \times \Pr(X_j \mid Z = i). \qquad (3\text{-}3)$$










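By Hardy-Weinberg equilibrium,

$$\Pr(X_j \mid Z = i) = \prod_{l=1}^{L} \prod_{c=1}^{2} p_{l i x_l^{(j,c)}}, \qquad (3\text{-}4)$$

where $x_l^{(j,c)}$ is the allele of subject j at marker l on chromosome c, and $p_{lix}$ is the
proportion of persons in subpopulation i having allele x at marker
locus l, L being the number of marker loci.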

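Suppose the candidate gene G has w alleles, e.g., $a_1, \ldots, a_w$, and the frequency
of the allele $a_u$ ($u = 1, \ldots, w$) in the ith subpopulation is

$$p_{iu} = \Pr[G^{(c)} = a_u \mid Z = i].$$

Then by Hardy-Weinberg equilibrium the probabilities of the genotypes of G, $a_u a_v$
($u, v = 1, \ldots, w$), are given by:

$$\Pr[G = a_u a_v \mid Z = i] = \begin{cases} p_{iu}^2, & u = v, \\ 2 p_{iu} p_{iv}, & u \neq v. \end{cases} \qquad (3\text{-}5)$$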
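The Hardy-Weinberg genotype probabilities in (3-5) can be sketched as follows for one subpopulation. The function name and the allele frequencies used in the example are hypothetical; genotypes are treated as unordered pairs of alleles.

```python
def hwe_genotype_probs(freqs):
    """Return {(u, v): Pr[G = a_u a_v | Z = i]} for unordered genotypes u <= v.

    freqs[u] is the frequency of allele a_u in the subpopulation.
    Homozygotes get p_u^2 and heterozygotes get 2 * p_u * p_v.
    """
    probs = {}
    w = len(freqs)
    for u in range(w):
        for v in range(u, w):
            if u == v:
                probs[(u, v)] = freqs[u] ** 2
            else:
                probs[(u, v)] = 2.0 * freqs[u] * freqs[v]
    return probs

# Hypothetical tri-allelic candidate gene.
probs = hwe_genotype_probs([0.5, 0.3, 0.2])
```

With w alleles there are w(w + 1)/2 unordered genotypes, and their probabilities sum to one.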



3.2.3 Inference on I for The Model with Admixture

I consider the situation where I have multilocus genotype data from individuals

sampled from a population with possibly unknown structure. Pritchard et al. (2000a)

used the genotypes of a sample of individuals to identify the presence of population

structure which is difficult to detect using visible characters, but may be significant

in genetic terms. As Pritchard et al. (2000a) pointed out, the problem of inferring on

the number of unknown populations, I, present in a data set is a very difficult task.

In a Bayesian paradigm, with a suitably chosen prior distribution on I, one can base
inference for I on the posterior distribution:

$$P(I \mid X) \propto P(X \mid I) P(I), \qquad (3\text{-}6)$$


where X denotes the vector of genotypes of the sampled individuals including the

candidate gene G. Let Z denote the unknown population of origin of the individuals,









P denote the unknown allele frequency vector in all populations, and Q denote the
vector of admixture proportions for each individual. The harmonic mean estimator
is one of the simplest ways of estimating $P(X \mid I)$:

$$\frac{1}{P(X \mid I)} = \int \frac{P(Z, P, Q \mid X, I)}{P(X \mid Z, P, Q, I)}\, dZ\, dP\, dQ \approx \frac{1}{K} \sum_{k=1}^{K} \frac{1}{P(X \mid Z^{(k)}, P^{(k)}, Q^{(k)}, I)}. \qquad (3\text{-}7)$$
However, this estimator is notoriously unstable, often having infinite variance, and
thus poses severe computational challenges. Pritchard et al. (2000a) described an
alternative approach, which is more ad hoc but effective, based on the
Bayesian deviance function

$$D(Z, P, Q) = -2 \log P(X \mid Z, P, Q). \qquad (3\text{-}8)$$

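Let $k = 1, 2, \ldots, K$ denote the k-th iteration in the Markov chain. One estimates
the conditional mean and variance of the deviance function D given X as follows:

$$E(D(Z, P, Q) \mid X) \approx \frac{1}{K} \sum_{k=1}^{K} -2 \log P(X \mid Z^{(k)}, P^{(k)}, Q^{(k)}) = \hat{\mu},$$

$$\mathrm{Var}(D(Z, P, Q) \mid X) \approx \frac{1}{K} \sum_{k=1}^{K} \left( -2 \log P(X \mid Z^{(k)}, P^{(k)}, Q^{(k)}) - \hat{\mu} \right)^2 = \hat{\sigma}^2.$$

Then, assuming that the deviance is approximately normally distributed given X,

$$-2 \log P(X \mid I) \approx \hat{\mu} + \hat{\sigma}^2 / 4. \qquad (3\text{-}9)$$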

An analytical explanation of this approximation is provided in Appendix A. An
alternative interpretation of this method is that model selection is based on penalizing
the mean of the Bayesian deviance by a quarter of its variance. Pritchard et al.
(2000a) pointed out that replacing the assumption of normality with the assumption
of the Bayesian deviance function being distributed as a Gamma random variable










may be asymptotically more justifiable, but makes little or no difference in terms of

estimation accuracy in practical applications.

One may use (3-9) to estimate P(X|I) for each I and then substitute the es-

timate into (3-6) to obtain approximate estimates of P(I|X) (see Pritchard et al.

2000a, for a detailed algorithm). One would then impute the estimated substructure

while conducting tests for disease-gene association. I will essentially follow the same

technique for estimating P(I|X) and embed the derived information into a disease

risk model as described in the following section.
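The two steps just described, estimating $\log P(X \mid I)$ from the deviance and converting the estimates into posterior probabilities for I, can be sketched as below. This is an illustration under stated assumptions rather than the STRUCTURE implementation: the function names are hypothetical, the log-likelihood draws are assumed to be available from the sampler, and a discrete uniform prior on the candidate values of I is used by default.

```python
import math

def log_marginal_from_deviance(loglik_draws):
    """Approximate log P(X | I) as -(mu + sigma^2 / 4) / 2, following (3-9).

    mu and sigma^2 are the sample mean and variance of the deviance
    D = -2 log P(X | Z, P, Q) over the MCMC draws.
    """
    K = len(loglik_draws)
    dev = [-2.0 * ll for ll in loglik_draws]
    mu = sum(dev) / K
    var = sum((d - mu) ** 2 for d in dev) / K
    return -0.5 * (mu + var / 4.0)

def posterior_over_I(log_marginals, prior=None):
    """Normalize P(X | I) P(I) over the candidate values of I, as in (3-6).

    log_marginals maps each candidate I to an estimate of log P(X | I).
    """
    keys = sorted(log_marginals)
    if prior is None:
        prior = {I: 1.0 / len(keys) for I in keys}   # discrete uniform prior
    logs = {I: log_marginals[I] + math.log(prior[I]) for I in keys}
    m = max(logs.values())                            # stabilize the exponentials
    weights = {I: math.exp(v - m) for I, v in logs.items()}
    total = sum(weights.values())
    return {I: w / total for I, w in weights.items()}
```

Only differences between the estimated log marginal likelihoods matter for the resulting posterior probabilities, which is why the maximum is subtracted before exponentiating.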

3.3 Likelihood and Priors

In this section, I derive the likelihood function, state the prior distributions and

derive the posteriors. The key aspect of the modeling is in how I develop algorithms

for estimating the model parameters and at the same time account for the population

structure in our framework.

3.3.1 Likelihood

Because different subpopulations may have different frequencies of other marker
genes, I make inference based on the marginal joint distribution of D, G and X,
summing over all possible values of Z, the latent variate. Let $\Pr(Z = i) = q_i$, which
is the proportion of persons in subpopulation i. Note that for subject j, $G_j$ takes one
of the values $g_m$, $m = 0, 1, \ldots, M$. By (3-3) and (3-4), for given I, the full likelihood
$L_I$ is factorized as follows:

$$L_I = \prod_{j=1}^{N} \Pr(D_j, G_j, X_j) = \prod_{j=1}^{N} \sum_{i=1}^{I} \left[ q_i \times \prod_{l=1}^{L} \prod_{c=1}^{2} p_{l i x_l^{(j,c)}} \times \Pr(G_j = g_m \mid Z = i) \times \frac{\exp\{D_j (\beta_{0i} + \beta_{1m})\}}{1 + \exp(\beta_{0i} + \beta_{1m})} \right], \qquad (3\text{-}10)$$










where $\Pr(G_j \mid Z = i)$ is a function of $p_{iu}$ ($u = 1, \ldots, w$) as described in (3-5), and L
is the number of marker loci which are in linkage equilibrium with G.
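The structure of the likelihood (3-10) can be sketched for a single subject as a mixture over the latent subpopulation. The data layout below (nested lists indexed by locus, subpopulation and allele) and all function names are assumptions made for illustration only.

```python
import math

def subject_likelihood(D, m, markers, q, p, geno_prob, beta0, beta1):
    """Contribution of one subject to the likelihood (3-10).

    D         : disease status (0 or 1)
    m         : index of the candidate-gene genotype g_m
    markers   : list over loci of (allele on chromosome 1, allele on chromosome 2)
    q[i]      : proportion of subpopulation i
    p[l][i][x]: frequency of allele x at locus l in subpopulation i
    geno_prob[i][m]: Pr(G = g_m | Z = i), e.g. from Hardy-Weinberg equilibrium
    beta0[i], beta1[m]: coefficients of the logistic disease model (3-1)
    """
    total = 0.0
    for i in range(len(q)):
        marker_term = 1.0
        for l, alleles in enumerate(markers):
            for x in alleles:
                marker_term *= p[l][i][x]
        eta = beta0[i] + beta1[m]
        disease_term = math.exp(D * eta) / (1.0 + math.exp(eta))
        total += q[i] * marker_term * geno_prob[i][m] * disease_term
    return total

def log_likelihood(data, q, p, geno_prob, beta0, beta1):
    """Sum of log contributions over subjects; data holds (D, m, markers) tuples."""
    return sum(
        math.log(subject_likelihood(D, m, markers, q, p, geno_prob, beta0, beta1))
        for D, m, markers in data
    )
```

A quick sanity check on such a function is that, for fixed parameters, the contributions sum to one over all possible values of (D, G, X) for a subject.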

I use a marginal likelihood rather than a conditional likelihood approach. The
likelihood involves the parameters of interest $\beta_{1m}$ ($m = 1, \ldots, M$) and the nuisance
parameters $\beta_{0i}$, $p_{iu}$, $q_i$ and $p_{lix}$ ($i = 1, \ldots, I$; $\forall\, l$ and $\forall\, x$), whose number grows in direct
proportion to the number of subpopulations. This gives rise to the well known
Neyman-Scott phenomenon, where MLEs turn out to be inconsistent if I grows with
sample size. Typically I deal with I between 1 and 7, and handling nuisance
parameters is not a difficult issue in such scenarios. However, the marginal model
does contain a large number of parameters, and I carry out Bayesian inference by
introducing appropriate prior distributions for these parameters.

3.3.2 Priors and Posteriors

The main problem is to estimate the regression parameters $\beta_{1m}$, $m = 1, \ldots, M$.
I consider the following mutually independent normal priors:

$$\beta_{0i} \sim \mathrm{Normal}(\mu_{\beta_{0i}}, \sigma^2_{\beta_{0i}}), \quad i = 1, \ldots, I,$$

$$\beta_{1m} \sim \mathrm{Normal}(\mu_{\beta_{1m}}, \sigma^2_{\beta_{1m}}), \quad m = 1, \ldots, M.$$

When inferring the number of subpopulations I, I consider a discrete uniform prior
on the domain of I. The priors for P and Q are, correspondingly, the following:

$$(q_1, \ldots, q_I) \sim \mathrm{Dirichlet}(\alpha, \ldots, \alpha);$$

$$p_{iu} \sim \mathrm{Beta}(a_i, b_i);$$

$$(p_{li1}, p_{li2}, \ldots, p_{liX_l}) \sim \mathrm{Dirichlet}(\lambda_{li1}, \lambda_{li2}, \ldots, \lambda_{liX_l}).$$

With the above model and prior specifications, one can obtain the full conditional
distributions for the parameters $\beta_{0i}$, $\beta_{1m}$, $p_{iu}$, $q_i$ and $p_{lix}$. None of the conditionals
has a standard distributional form.









For each given value of I, the parameters of interest can be estimated by generating
random observations from the full conditionals using an MCMC numerical
integration scheme and then taking averages of the generated observations. Corresponding
to each value of I, I also have associated posterior probabilities $P(I \mid X)$ as
discussed in Section 3.2.3. Therefore, by setting $\theta = (\beta_{11}, \ldots, \beta_{1M})$ and using a
model-averaging technique, any generic parameter $\theta$ is estimated by the posterior mean

$$E(\theta \mid X) = \sum_{i} E(\theta \mid X, I = i) \Pr(I = i \mid X) \qquad (3\text{-}11)$$

with posterior variance

$$V(\theta \mid X) = \sum_{i} V(\theta \mid X, I = i) \Pr(I = i \mid X) + \sum_{i} [E(\theta \mid X, I = i)]^2 \Pr(I = i \mid X) - \Big[ \sum_{i} E(\theta \mid X, I = i) \Pr(I = i \mid X) \Big]^2. \qquad (3\text{-}12)$$

Thus the posterior variance estimates for the parameters of interest account for uncertainty
in the estimation of I. The final point estimates are not byproducts of a
single model with a fixed value of I, but are averaged over possible models with weights
proportional to the posterior probabilities $P(I \mid X)$.
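The model-averaging formulas (3-11) and (3-12) amount to the laws of total expectation and total variance over I, and can be sketched as follows. The posterior probabilities and conditional means below are the ones quoted in the worked example of Section 3.4; the conditional variances and the function name are hypothetical values chosen only to make the example complete.

```python
def model_average(post_prob, cond_mean, cond_var):
    """Model-averaged posterior mean (3-11) and variance (3-12).

    post_prob[i] = Pr(I = i | X)
    cond_mean[i] = E(theta | X, I = i)
    cond_var[i]  = V(theta | X, I = i)
    """
    mean = sum(post_prob[i] * cond_mean[i] for i in post_prob)
    second_moment = sum(
        post_prob[i] * (cond_var[i] + cond_mean[i] ** 2) for i in post_prob
    )
    return mean, second_moment - mean ** 2

post_prob = {3: 0.2, 4: 0.8}
cond_mean = {3: 1.09, 4: 1.02}
cond_var = {3: 0.25, 4: 0.16}   # hypothetical within-model posterior variances

mean, var = model_average(post_prob, cond_mean, cond_var)
```

The variance splits into the average within-model variance plus the spread of the conditional means across the values of I, which is how uncertainty about I enters the final interval estimates.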
3.3.3 Computational Details

1. Estimation of association parameters

None of the conditional distributions of the parameters has a standard distributional
form, and thus generating observations from the posterior distributions or calculating
the posterior estimates is not automatic. I adopted a componentwise Metropolis-Hastings
algorithm for each of the parameters.

Let $\eta$ stand for a generic parameter, i.e., any of the $\beta_{0i}$, $\beta_{1m}$, $p_{iu}$, $q_i$ and $p_{lix}$
($m = 1, 2$; $i = 1, \ldots, I$; $\forall\, l, x$). Let $L(\eta \mid \cdot)$ denote the full likelihood as given in
(3-10) as a function of $\eta$ given the data and all the other parameters. Let $\pi(\eta)$ be the
prior distribution on $\eta$. In order to simulate observations from the full conditional
distribution of $\eta$, namely $\pi(\eta \mid \cdot)$, I proceed as follows.

Step 1: Start with any reasonable initial value of $\eta$, say $\eta_0$. This is the current value
of $\eta$.

Step 2: Generate a new value of $\eta$, say $\eta^*$, from a candidate density $g(\eta)$.

Step 3: Replace $\eta_0$ by $\eta^*$ with probability
$$\min\left\{1, \frac{\pi(\eta^* \mid \cdot)\, g(\eta_0)}{\pi(\eta_0 \mid \cdot)\, g(\eta^*)}\right\}.$$
Retain the existing value of $\eta_0$ otherwise. Note that $\pi(\eta \mid \cdot) \propto \pi(\eta) L(\eta \mid \cdot)$. If the candidate density
$g(\eta) = \pi(\eta)$, then the acceptance probability reduces to (after cancelation of the
prior term with the identical candidate density term) $\min\{1, L(\eta^* \mid \cdot) / L(\eta_0 \mid \cdot)\}$.
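Steps 1 to 3 can be sketched for a single scalar parameter when the candidate density is taken to be the prior, so that the acceptance probability is the likelihood ratio. The toy target used to exercise the sampler (a N(0, 9) prior with a single unit-variance normal observation y = 1) is an assumption for illustration and is not the likelihood (3-10); the function name is hypothetical.

```python
import math
import random

def mh_prior_proposal(log_lik, sample_prior, eta0, n_iter, rng):
    """Metropolis-Hastings with the prior as the candidate density.

    Candidates are drawn from the prior, so the prior and candidate terms
    cancel and a candidate is accepted with probability
    min(1, L(candidate) / L(current)).
    """
    draws = []
    current, current_ll = eta0, log_lik(eta0)
    for _ in range(n_iter):
        cand = sample_prior(rng)                 # Step 2: candidate from g = prior
        cand_ll = log_lik(cand)
        log_ratio = cand_ll - current_ll         # Step 3: log likelihood ratio
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_ll = cand, cand_ll
        draws.append(current)
    return draws

# Toy check: the exact posterior here is N(0.9, 0.9).
rng = random.Random(0)
log_lik = lambda eta: -0.5 * (1.0 - eta) ** 2
sample_prior = lambda r: r.gauss(0.0, 3.0)
draws = mh_prior_proposal(log_lik, sample_prior, 0.0, 20000, rng)
kept = draws[2000:]                               # discard burn-in
post_mean = sum(kept) / len(kept)
```

Using the prior as the candidate is simple but can mix slowly when the prior is much more diffuse than the full conditional, since most candidates are then rejected.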
2. Inference of the number of subpopulations I

The following algorithm (Pritchard et al., 2000a) is used to sample from $\Pr(Z, P, Q \mid X)$.
Starting with initial values $Z^{(0)}$, iterate the following steps for $k = 1, 2, \ldots$:

Step 1. Sample $P^{(k)}$ and $Q^{(k)}$ from $\Pr(P, Q \mid X, Z^{(k-1)})$.

Step 2. Sample $Z^{(k)}$ from $\Pr(Z \mid X, P^{(k)}, Q^{(k)})$.

Step 3. Update $\alpha$ using a Metropolis-Hastings step (where I consider a uniform(0, 10)
prior on $\alpha$).

Step 2 may be performed by simulating $z_l^{(j,c)}$ (the population of origin of allele copy $x_l^{(j,c)}$)
independently for each j, c and l from

$$\Pr(z_l^{(j,c)} = i \mid X, P, Q) = \frac{q_i \Pr(x_l^{(j,c)} \mid P, z_l^{(j,c)} = i)}{\sum_{i'=1}^{I} q_{i'} \Pr(x_l^{(j,c)} \mid P, z_l^{(j,c)} = i')}, \qquad (3\text{-}13)$$

where $\Pr(x_l^{(j,c)} \mid P, z_l^{(j,c)} = i) = p_{l i x_l^{(j,c)}}$.
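Step 2 of the algorithm, the draw of the population of origin of one allele copy, can be sketched as follows. The function names and the array layout are hypothetical, and `q` is assumed to hold the admixture proportions that apply to the individual carrying the allele copy.

```python
import random

def origin_probs(x, l, q, p):
    """Pr(z = i | x, P, Q) for one allele copy, as in (3-13).

    x         : observed allele at locus l
    q[i]      : admixture proportion for subpopulation i
    p[l][i][x]: frequency of allele x at locus l in subpopulation i
    """
    weights = [q[i] * p[l][i][x] for i in range(len(q))]
    total = sum(weights)
    return [w / total for w in weights]

def sample_origin(x, l, q, p, rng):
    """Draw the population of origin of one allele copy."""
    probs = origin_probs(x, l, q, p)
    return rng.choices(range(len(q)), weights=probs, k=1)[0]

# Hypothetical two-population, one-locus example.
q = [0.5, 0.5]
p = [[[0.9, 0.1], [0.3, 0.7]]]
probs = origin_probs(0, 0, q, p)
z = sample_origin(0, 0, q, p, random.Random(0))
```

Because the draws are independent across j, c and l given P and Q, this step can be applied to every allele copy in turn within one sweep of the sampler.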

3.4 Simulation

To illustrate our approach, I consider a scenario similar to the one in Satten
et al. (2001), with an admixture of European and American Indian ancestry in an Argentinean
population. Sala et al. (1998, 1999) published allele frequency data on
twelve short tandem repeat (STR) loci in Argentineans of European ancestry, as well
as in three Argentinean American Indian aboriginal groups (Mapuche, Tehuelche,
and Wichi) (Table 3-1). The metropolitan population of Buenos Aires was studied,
and the population did not exhibit any significant departure from Hardy-Weinberg
equilibrium. However, the STR allele frequency distributions are characterized by
significant differences within and also between different populations. I assume that
Argentinean Europeans constituted 70% of a hypothetical target population and that
each American Indian group constituted 10%.

I simulate a population such that all eleven additional mutually independent
STR loci are in linkage equilibrium with the candidate gene for persons in the same
subpopulation. Simulated data sets are constructed by using reasonable true values
of the parameters. Specifically, by using the allele frequencies from Sala et al.
(1999), I generate data on the candidate gene and other marker loci in a population
that comprises four subpopulations. As in Satten et al. (2001), I select allele 3 of
locus D6S366 as the disease-causing allele, with frequencies 0.277, 0.341, 0.446 and
0.557 in European, Mapuche, Tehuelche, and Wichi, respectively. Consider a biallelic
candidate gene, i.e., a candidate gene with two alleles A (the disease-causing allele)
and a (the non-disease-causing allele). The candidate gene G has 3 possible genotypes
$g_0$, $g_1$ and $g_2$, corresponding to persons having zero (aa), one (Aa) and two (AA)
copies of the disease-causing allele. If the frequency of the disease-causing allele in the
ith subpopulation is

$$p_i = \Pr[G^{(c)} = A \mid Z = i] = 1 - \Pr[G^{(c)} = a \mid Z = i], \qquad (3\text{-}14)$$

then by Hardy-Weinberg equilibrium, the probabilities of the genotypes of G are as
follows:

$$\Pr[G = g_0 \mid Z = i] = (1 - p_i)^2,$$

$$\Pr[G = g_1 \mid Z = i] = 2(1 - p_i) p_i,$$

$$\Pr[G = g_2 \mid Z = i] = p_i^2. \qquad (3\text{-}15)$$










Finally, the disease status data that vary with changing frequencies of the disease-causing
allele for each subpopulation are generated. As stated in Satten et al. (2001),
persons who were homozygous for the disease-causing allele had an increased risk of
disease corresponding to a log-odds ratio of 1.0 (relative risk = exp(1.0) = 2.72),
and persons who were heterozygous for the disease-causing allele had no increase in
risk. This implies, in our notation, $\beta_{11} = 0$ and $\beta_{12} = 1.0$. The log odds of the
rare disease (which implies that the control population mimics the whole population,
and $\Pr(G = g_m \mid D = 0, Z = i) \approx \Pr(G = g_m \mid Z = i)$) among persons with zero
or one copy of the disease-causing allele was -5, -4, -3 and -3 in the European,
Mapuche, Tehuelche, and Wichi populations, respectively. For the common disease
with a higher prevalence rate, I assume that the log odds among persons with zero
or one copy of the disease-causing allele was -2, -1.5, -1 and -1 in the European,
Mapuche, Tehuelche, and Wichi populations, respectively.
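A minimal sketch of the data-generating scheme for the candidate gene and disease status, using the mixing proportions, allele frequencies and log odds quoted above. The additional STR marker loci are omitted to keep the example short, and the function names, the random seed and the way cases and controls are collected from the stream of simulated subjects are assumptions rather than the exact procedure used for the reported results.

```python
import math
import random

MIX = [0.7, 0.1, 0.1, 0.1]                    # European, Mapuche, Tehuelche, Wichi
ALLELE_FREQ = [0.277, 0.341, 0.446, 0.557]    # disease-causing allele frequency
BETA0_RARE = [-5.0, -4.0, -3.0, -3.0]         # log odds, rare disease
BETA0_COMMON = [-2.0, -1.5, -1.0, -1.0]       # log odds, common disease
BETA1 = [0.0, 0.0, 1.0]                       # effect of 0, 1, 2 copies of A

def simulate_subject(rng, beta0):
    """Draw (subpopulation, genotype index, disease status) for one person."""
    i = rng.choices(range(len(MIX)), weights=MIX, k=1)[0]
    # Two independent allele draws give Hardy-Weinberg genotype frequencies.
    m = sum(1 for _ in range(2) if rng.random() < ALLELE_FREQ[i])
    prob = 1.0 / (1.0 + math.exp(-(beta0[i] + BETA1[m])))
    d = 1 if rng.random() < prob else 0
    return i, m, d

def sample_case_control(n_cases, n_controls, rng, beta0):
    """Simulate subjects until the required numbers of cases and controls are met."""
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        i, m, d = simulate_subject(rng, beta0)
        if d == 1 and len(cases) < n_cases:
            cases.append((i, m, d))
        elif d == 0 and len(controls) < n_controls:
            controls.append((i, m, d))
    return cases, controls

rng = random.Random(1)
cases, controls = sample_case_control(125, 125, rng, BETA0_RARE)
```

Passing `BETA0_COMMON` instead of `BETA0_RARE` gives the common-disease scenario with the same genotype model.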

The results I present are based on a set of diffuse and mutually independent
priors. I use a N(0, 9) prior on $\beta_{0i}$ and $\beta_{1m}$, Beta(0.5, 0.5) on $p_i$, and a symmetric Dirichlet
prior for the allele frequency parameters with all $\lambda$'s being 0.5. For $(q_1, \ldots, q_I)$, I
choose a Dirichlet($\alpha, \ldots, \alpha$) prior, with a U(0, 10) hyperprior on $\alpha$.

For each scenario, I generated 100 different data sets and obtained the parameter
estimates by computing the model averaged posterior means for each simulated data
set. In each replication of our simulation, I generated data for 125 (250) cases and
125 (250) controls using the above simulation strategy, sampling the cases
and controls from a larger random sample of subjects. For each replication, I ran
multiple Markov chains, typically with 20000 to 30000 iterations. The posterior means
calculated for each replication were based on every tenth observation of the last
5000 observations in each chain, combined together, to reduce autocorrelation. An
estimate of the posterior variance was calculated based on the aggregate of the last
5000 values for each replication. I report average values for these quantities over
the 100 replications. I also calculated an estimate of the mean squared error (MSE)
corresponding to the estimates of each of the parameters of interest (say, $\theta$ in general)
based on the 100 replications. I considered this MSE, i.e., the squared deviations
of the estimates from the true parameter, averaged over the 100 replications, as a
measure of performance of our method:

$$\mathrm{MSE} = \frac{1}{100} \sum_{r=1}^{100} (\text{posterior mean of } \theta \text{ in } r\text{-th replication} - \text{true value of } \theta)^2.$$
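The thinning and the MSE summary described above can be sketched as follows. The function names are hypothetical, and each chain is assumed to be stored as a plain list of sampled values for one parameter.

```python
def thinned_posterior_mean(chains, keep_last=5000, thin=10):
    """Pool every `thin`-th value of the last `keep_last` draws of each chain."""
    pooled = []
    for chain in chains:
        tail = chain[-keep_last:]
        pooled.extend(tail[::thin])
    return sum(pooled) / len(pooled)

def mse(estimates, true_value):
    """Average squared deviation of the per-replication estimates from the truth."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)
```

In the setting of this section, `estimates` would hold the 100 model averaged posterior means of a parameter and `true_value` the value used to generate the data.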

To examine the effect of the number of STR loci on the estimators, I analyzed the
datasets with 250 subjects (125 cases and 125 controls) by (i) using all the additional
loci and (ii) using only the first six additional loci. These two scenarios are labeled as X12
and X6 in Tables 3-2 and 3-4, respectively. By applying the methods stated in Section
3.2 (Pritchard et
for each simulated dataset, first I obtain estimates of $P(I \mid X)$. For example, by (i),
I obtain $P(I = 3 \mid X) = 0.2$ and $P(I = 4 \mid X) = 0.8$. Then the model averaged
estimate of I is $0.2 \times 3 + 0.8 \times 4 = 3.8$. The estimates of the association parameters
are computed following (3-11) and (3-12). For the same dataset, the estimate of $\beta_{12}$
is 1.09 for I = 3 and 1.02 for I = 4; thus the final model averaged estimate of $\beta_{12}$ for
that dataset is $1.09 \times 0.2 + 1.02 \times 0.8 = 1.034$. The results in Table 3-2 are obtained
by averaging these estimates over the 100 simulated datasets, which shows that the
posterior standard deviations of our model averaged estimates are typically smaller
than the standard errors furnished by Satten et
numbers from Tables 3-2 and 3-3 of Satten et
current chapter). I realize that though our simulation settings are the same as of

Satten et
two methods are not implemented on identical datasets, but still this might serve as

a precursor for comparison purposes. Satten et
their estimates over the replications. As a result I cannot compare the two procedures










directly in terms of the MSE. As one might expect, when I increased the sample size
to 500 (250 cases and 250 controls), adequate performance is achieved even with just
the first six STR loci, and the overall pattern of the results remains the same.

I also include the naive analysis that completely ignores the additional multilocus information
(denoted as X0 in Tables 3-2 and 3-3). One can note that the estimation
results are much inferior if one ignores the genotypic information at a series of additional
unlinked marker loci.

To show that the methods are not limited to the assumptions that either the
disease is rare or the genotypes G are binary, I also analyzed a simulated dataset
with 250 subjects (125 cases and 125 controls) and another with 500 subjects (250
cases and 250 controls) where the disease has a higher prevalence rate. The overall
pattern of the results is fairly similar to the rare disease case. I note relatively
smaller MSEs and posterior standard deviations for this common disease case as
compared to the rare disease case. The results are presented in Table 3-4.

For analyzing the simulated data, I used the implicit prior belief that the source
population may have four or fewer subpopulations, by putting a discrete uniform prior on
1, 2, 3, 4 for I. However, I have also tried putting non-zero probability on values of
I greater than the true simulation value of 4, for instance, a discrete uniform prior
on $1, \ldots, 8$. In this case, the estimates of the regression parameters $\beta_{1m}$ appear to
change very little even when I is estimated to be slightly greater than the true value
used to generate the data (results are not provided). Pritchard et al. (2004a) note
that for situations where several values of I give similar estimates of $\log \Pr(X \mid I)$, it is
often the case that the smallest of these is 'correct'. In the practical implementation,
I adopt a model selection perspective and try to obtain the smallest value of I that
captures the major structure in the data.









3.5 Application to A Real Dataset

To illustrate our method, I apply our approach to explore the genetic association
of obesity and the $\beta_2$AR candidate gene (for details of the study, please see Lin et
al., 2005). The $\beta$-adrenergic receptors ($\beta$AR) are known to play an important role
in cardiovascular function and in response to drugs. I analyze complete data on 144
men and women who participated in this study and ignore the observations with
missing values. Each of the participating subjects was genotyped for SNP markers at
codon 16 within the $\beta_2$AR gene, at codon 389 within the $\beta_1$AR gene and at codon 492
within the $\alpha_{1A}$ gene. The phenotypic information collected consists of the weight and height of
individuals, from which the body mass index (BMI) of each subject can be calculated.
I define "obese", i.e., D = 1, when BMI > 30.0, and D = 0 otherwise. This leads to
85 undiseased and 59 diseased subjects in the dataset I consider.

Previous studies have detected a possible association between polymorphisms in
the β2AR gene and obesity, the focus being particularly on the codon 16 and codon
27 substitutions, but no association has been detected with the β1AR gene or the α1A
gene (Johnson and Terra 2002, Lin et al. 2005, Takami et al. 1999). Therefore, I
consider the β2AR gene as the candidate gene, denoted by G, and the β1AR gene
and the α1A gene as two other genes unrelated to the disease, denoted by X =
(X1, X2). Note that in this dataset I only have genotypic information on
single polymorphisms in these three genes, which have biallelic genotypes, generally
expressed as x = 0, 1, 2. So the expression in (3-4) becomes
$P(X \mid Z = i) = \prod_{l=1}^{2} p_{i l x_l}$, where $p_{ilx}$ is the proportion of
persons in subpopulation i having genotype x (x = 0, 1, 2) at gene l.
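As a minimal sketch of the product form described above, the probability of a subject's marker genotypes within one subpopulation is the product of the per-gene genotype proportions. The function name and the frequency values below are hypothetical and are not taken from the obesity data.

```python
def marker_likelihood(genotypes, freqs_i):
    """P(X | Z = i) as a product over marker genes.

    genotypes: list of genotype codes x_l in {0, 1, 2}, one per gene l
    freqs_i:   freqs_i[l][x] is the proportion of subpopulation i
               carrying genotype x at gene l
    """
    prob = 1.0
    for x, p_l in zip(genotypes, freqs_i):
        prob *= p_l[x]
    return prob


# Hypothetical genotype proportions for two marker genes in one subpopulation
freqs = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]]
p_example = marker_likelihood([1, 2], freqs)

# Summing over all nine genotype combinations recovers total probability 1
total = sum(marker_likelihood([a, b], freqs) for a in range(3) for b in range(3))
```

The product form treats the marker genes as independent within a subpopulation, which is how the expression in the text is written.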

I analyzed the data by considering genotypic information on all three genes
(denoted by "X2+G") and on only the candidate gene (denoted by "X0+G"). Since
in the real data I do not know the true value of I, I should try to estimate the smallest
value of I that captures the major substructure in the data, if any. To this end, I










introduce a discrete uniform prior on {1, 2, ..., 15} for I. I consider
$(p_{il0}, p_{il1}, p_{il2}) \sim$ Dirichlet(0.5, 0.5, 0.5), and for $(q_1, \ldots, q_I)$
I choose a Dirichlet($\alpha, \ldots, \alpha$) prior with a uniform hyperprior on $\alpha$
ranging from 0 to 10. By applying the methods
stated in Section 3.2, I first obtain inference on I. The principal findings are that
with the inclusion of the two other genes, I detect some evidence of substructure,
with an estimate $\hat{I} = 3$ and P(I = 3|X) = 1, whereas without these two
genes and using only G, I obtain P(I = 1|X) = 1, implying $\hat{I} = 1$, i.e., no
population substructure can be detected in the source population. In fact, the data
came from a North American population with a diverse ethnic composition of blacks,
whites and others, so one could expect some latent population substructure in these
data. The results of our analysis are presented in Table 3-5. In all the methods
of analysis, the genetic factor does not appear to be a statistically significant risk
factor. The results suggest that the codon 16 (Arg16Gly) polymorphism of the β2AR
gene is not an important contributing factor to obesity for the studied population. In fact,
in Swedish Caucasians, the Gln27Glu polymorphism at codon 27 of the β2AR gene was
shown to be associated with obesity, but no such association was shown for the Arg16Gly
polymorphism at codon 16. Neither the Gln27Glu nor the Arg16Gly polymorphism of
the β2AR gene was found to be an important contributing factor to obesity in Japanese
men (Hayakawa et al. 2000). In the ordinary logistic regression model, with G
as a categorical factor, I also find G to be non-significant (P-values 0.8591 and 0.1571
corresponding to G = 1 and 2, respectively). Even after accounting for information in
the other genes and population substructure, the effect of the candidate gene remains
non-significant. Notice that the Bayesian HPD intervals are wider than the intervals from
the ordinary logistic model due to the additional layer of uncertainty on I.










3.6 Discussion

In this chapter, I present an alternative Bayesian model for accounting for population
substructure in genetic association studies. As compared to previous approaches, our
model is advantageous in the following respects. First, it can
estimate the number of subpopulations (I) that comprise the overall population.
Although Satten et
on the grid procedure in which multiple different I's are fitted and the optimal one

is then determined in terms of the minimum AIC. On the other hand, Pritchard et


substructure. Based on marker and candidate gene information, our model estimates
the posterior probabilities of I, which are then used in forming the final estimates of the
relative risk parameters through model averaging. An additional advantage is that,
unlike Satten et al.'s (2001) approach, our model does not rely on the assumption of
a rare disease or on the collapsing of multiple genotypes into binary genotypes, and thus
offers more power to study the genetic architecture of any type of disease.

A new feature of the Bayesian analysis is the use of model averaging to estimate
the regression coefficients. Rather than relying on one particular model with a
fixed number of strata I, I have put a prior on I, and have estimated the regression
parameters as the weighted average of their estimates for different values of I. The
weights are proportional to the posterior probabilities of the different values of I.
Thus I embed the substructure estimation together with inference on the association
parameters in a unified Bayesian framework. The standard error of the relative
risk estimates incorporates the uncertainty in the estimation of I, as reflected in
(3-14). This is unlike the method proposed in Pritchard et
substructure is estimated first and tests are conducted based on the imputed
substructure. Table 3-2 shows that our methods are comparable to those of Satten et al.
(2001); however, since our set-up is different from that of Pritchard et









it is hard to compare the two methods directly in a numerical sense. In principle, I do

believe that combining inferences of the substructure and association modeling will

lend one more power in detecting association.
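The model averaging step described in this discussion can be sketched as follows. Equations (3-11), (3-12) and (3-14) are not reproduced in this part of the text, so the code uses the generic form: a posterior-probability-weighted mean, with a variance obtained from the law of total variance (within-model variance plus between-model spread). The function name and the numbers are hypothetical.

```python
def model_average(post_probs, estimates, variances):
    """Average estimates over candidate values of I.

    post_probs[k]: posterior probability of the k-th candidate value of I
    estimates[k]:  estimate of a regression parameter given that value
    variances[k]:  posterior variance of the parameter given that value
    """
    total = sum(post_probs)
    weights = [p / total for p in post_probs]  # normalise, in case they do not sum to 1
    mean = sum(w * e for w, e in zip(weights, estimates))
    # Law of total variance: within-model variance + between-model variance
    var = sum(w * (v + (e - mean) ** 2)
              for w, e, v in zip(weights, estimates, variances))
    return mean, var


# Hypothetical values for three candidate numbers of subpopulations
avg_mean, avg_var = model_average([0.2, 0.5, 0.3], [1.0, 1.2, 0.9], [0.04, 0.05, 0.06])

# When all posterior mass sits on one value of I, the between-model term vanishes
one_mean, one_var = model_average([0.0, 0.0, 1.0], [1.0, 1.2, 0.9], [0.04, 0.05, 0.06])
```

The degenerate case mirrors the real-data analysis, where the posterior for I concentrated on a single value and averaging reduces to a single model.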

It should be pointed out that fewer additional markers are needed when the

sample size is large. When additional marker loci are involved, the number of nuisance

parameters (the allele frequencies of those loci for each subpopulation) in the model

would increase, requiring more data to estimate them properly.

There remains the problem of handling marker loci in linkage disequilibrium
with the candidate gene in our framework. According to Falush et al. (2003), there
are three sources of linkage disequilibrium (LD): mixture LD, admixture LD and
background LD. Mixture LD arises from variation in individuals' ancestry and can
be measured by unlinked markers. Admixture LD occurs because of the correlation
in ancestry across an extended genomic region. Background LD decays on
a short scale and, therefore, occurs within a fine chromosomal structure. Pritchard
et al. (2000a) modeled the mixture LD for association studies. In their "linkage"
model, Falush et al. (2003) incorporated the "admixture LD" into the inference of
population structure. The incorporation of the background LD is an interesting open
question.

In summary, I have derived flexible Bayesian estimation techniques for disease-gene
association in case-control studies by accounting for population structure. First,
I applied Pritchard et al.'s (2000a) methods to infer population structure (i.e.,
estimating P(I|X) and I) by using the genotypes of sampled individuals at a series of
unlinked markers. Second, I propose a latent variable approach to estimate the
association parameters, and account for population substructure using additional marker
loci information as in Satten et al. (2001). The final results are calculated by the
model averaging technique (as described in (3-11) and (3-12)), which combines
inferences from the above two steps. Estimation results based on a simulated admixed










population (mimicking the results presented in Sala et al. (1998)) show that the
estimates of the relative risk parameters using additional multilocus genetic information
are superior to those when such information is not exploited. I also apply our
method to a real dataset on obesity. This chapter illustrates how the modeling tool
of Bayesian model averaging can be effectively used to conduct posterior inference in
an interesting application in human genetics.












Table 3-1: Allele frequencies for twelve STR loci in the four Argentinean subpopulations.

Locus Argentinian Europeans Mapuche Tehuelche Wichi
D6S366 0.082 0.091 0.143 0
0.204 0.114 0.071 0
0.277 0.341 0.446 0.557
0.119 0.136 0.036 0.086
0.091 0.125 0.036 0.029
0.183 0.159 0.143 0.200
0.028 0.011 0.018 0.071
0.015 0.023 0.107 0.057
FABP 0.589 0.683 0.732 0.485
0.110 0.058 0.107 0.162
0.300 0.260 0.161 0.353
CSF1PO 0.330 0.266 0.339 0.226
0.313 0.282 0.232 0.194
0.298 0.367 0.411 0.581
0.059 0.085 0.018 0
F13A 0.151 0.222 0.357 0.173
0.060 0.122 0.125 0.077
0.202 0.122 0.054 0.346
0.209 0.178 0.143 0.115
0.325 0.344 0.304 0.288
0.053 0.011 0.017 0
FESFPS 0.260 0.170 0.143 0.257
0.420 0.500 0.714 0.543
0.247 0.284 0.107 0.043
0.073 0.045 0.036 0.157
TH01 0.233 0.526 0.286 0.132
0.250 0.298 0.429 0.721
0.105 0.009 0.018 0
0.185 0.026 0.089 0.015
0.226 0.140 0.179 0.132
HPRTB 0.032 0 0 0
0.179 0.032 0.091 0
0.317 0.323 0.227 0.357
0.285 0.403 0.591 0.167
0.137 0.242 0.091 0.357
0.050 0 0 0.119
VWA 0.063 0.0096 0.036 0.014
0.099 0.077 0.054 0.014
0.294 0.577 0.429 0.514
0.297 0.125 0.214 0.343
0.246 0.212 0.268 0.114
D13S317 0.090 0.020 0 0
0.160 0.240 0.15 0.464
0.060 0.070 0.05 0.179
0.290 0.120 0.15 0.089
0.250 0.260 0.3 0.089
0.100 0.180 0.225 0.179
0.040 0.110 0.125 0
D7S820 0.156 0.070 0.050 0
0.115 0.050 0.050 0.070
0.276 0.220 0.175 0.125
0.245 0.420 0.525 0.450
0.159 0.210 0.200 0.250
0.046 0.030 0 0.105
D16S539 0.156 0.110 0.225 0.125
0.100 0.130 0.075 0.232
0.294 0.240 0.100 0.321
0.252 0.370 0.550 0.250
0.195 0.150 0.050 0.071
RENA4 0.772 0.728 0.881 0.690
0.074 0.229 0.023 0
0.153 0.041 0.095 0.310

Cited from Sala et al. (1998) and Satten et al. (2001).



















Table 3-2: The results of simulated rare-disease data with marker loci in linkage
equilibrium with the candidate gene D6S366. Ratio of the sample sizes of cases to
controls is 125/125 and 250/250. X12 and X6 represent that the parameters were
estimated by using the twelve and the first six additional marker loci, respectively.
X0 is the analysis without using any additional marker loci. Mean and posterior
standard deviation refer to the average of the Bayes estimates and posterior standard
deviations obtained in 100 replications, whereas MSE is the estimated mean squared
error based on 100 replications.

Sample size  Model  Statistic         $\beta_{11}$  $\beta_{12}$  $I$
             True value                0.0000        1.0000        4
125/125      X12    Mean              -0.0475        1.1093        3.8178
                    MSE                0.1497        0.0765        0.1802
                    Post. std. dev.    0.3126                      0.3854
             X6     Mean              -0.1095        1.1028        3.6403
                    MSE                0.2005        0.0986        0.3540
                    Post. std. dev.    0.3277        0.3127        0.4763
             X0     Mean              -0.3380        0.8855        4.0000
                    MSE                1.2277        0.4982
                    Post. std. dev.    1.5982        1.0677

250/250      X12    Mean               0.0005        1.0966        3.7873
                    MSE                0.0546        0.0551        0.2107
                    Post. std. dev.    0.2704        0.1592        0.4089
             X6     Mean               0.0051        1.1035        3.5415
                    MSE                0.0631        0.0582        0.4572
                    Post. std. dev.    0.3127        0.1952        0.4994
             X0     Mean              -0.2766        0.9489        4.0000
                    MSE                1.2603        0.4330
                    Post. std. dev.    1.4152        0.9236























Table 3-3: The results of simulated rare-disease data with marker loci in linkage
equilibrium with the candidate gene D6S366 which are analyzed by Satten et al.
(2001). 125/125 and 250/250 denote ratio of the sample sizes of cases to controls.
X12 and X6 represent that the parameters were estimated by using the twelve and
the first six of the additional marker loci, respectively. Mean and standard error refer
to the average of the estimates and standard errors obtained in 500 replications.

Sample size  Model            Statistic   $\beta_{11}$  $\beta_{12}$  $I$
             True value                   0.000         1.000         4
125/125      X12              Mean        0.061         1.006         3.53
                              Std. err.   0.293         0.453         0.76
             X6               Mean        0.023         0.883         3.32
                              Std. err.   0.865         1.718         0.69
             Crude analysis*  Mean        0.366         1.760         1.00
                              Std. err.   0.285         0.370

250/250      X6               Mean        0.023         0.962         3.37
                              Std. err.   0.226         0.394         0.61
* Ignore stratification and analyze data without additional marker loci.




































Table 3-4: The results of simulated common-disease data with marker loci in linkage
equilibrium with the candidate gene D6S366. Ratio of the sample sizes of cases to
controls is 125/125 and 250/250. X12 and X6 represent that the parameters were
estimated by using the twelve and the first six additional marker loci, respectively.
X0 is the analysis without using any additional marker loci. Mean and posterior
standard deviation refer to the average of the Bayes estimates and posterior standard
deviations obtained in 100 replications, whereas MSE is the estimated mean squared
error based on 100 replications.

Sample size  Model  Statistic         $\beta_{11}$  $\beta_{12}$  $I$
             True value                0.0000        1.0000        4
125/125      X12    Mean              -0.0062        1.1116        3.8492
                    MSE                0.1106        0.1005        0.1456
                    Post. std. dev.    0.3152        0.1607        0.3523
             X6     Mean               0.0017        1.1299        3.6279
                    MSE                0.1173        0.1371        0.3634
                    Post. std. dev.    0.3488        0.2766        0.4766

250/250      X12    Mean               0.0023        1.0928        3.9331
                    MSE                0.0600        0.0551        0.0461
                    Post. std. dev.    0.2165        0.1806        0.2412
             X6     Mean               0.0191        1.1051        3.6228
                    MSE                0.0408        0.0470        0.3748
                    Post. std. dev.    0.2627        0.1991        0.4846


























Table 3-5: The results of real data analysis with the posterior mean (Estimate),
posterior standard deviation and 95% highest posterior density (HPD) interval
(and confidence interval (CI) for the ordinary logistic regression model).

Model                         Statistic         $\beta_{11}$        $\beta_{12}$
X2+G                          Estimate          -0.0895             0.7165
                              Post. std. dev.    0.3997             0.5201
                              HPD               (-0.8619, 0.6831)   (-0.2996, [illegible])
X0+G                          Estimate          -0.1206             0.7433
                              Post. std. dev.    0.4515             0.5602
                              HPD               (-1.0028, 0.7865)   (-0.3339, 1.8303)

Ordinary logistic regression  Estimate          -0.0668             0.7143
with only G as covariate      Std. err.          0.3765             0.5048
                              CI                (-0.8047, 0.6711)   (-0.2751, 1.7037)
*: All of the posterior probability is concentrated on a single value of I, thus we are unable to
obtain estimates of posterior variance.















CHAPTER 4
SEMIPARAMETRIC BAYESIAN ANALYSIS OF CASE-CONTROL DATA
UNDER GENE-ENVIRONMENT INDEPENDENCE AND POPULATION
STRATIFICATION

4.1 Introduction

Except for some rare diseases, such as Huntington's or Tay-Sachs disease, which
may be the result of a deficiency of a single gene product, most common human
diseases have a multifactorial etiology involving a complex interplay of many genetic
and environmental factors. By identifying and characterizing such complicated
gene-environment interactions, one has more opportunities to study the etiology, diagnosis,
prognosis and treatment of complex diseases.

The case-control study design, where sampling is conditional on the presence or
absence of disease, is a powerful epidemiologic tool for studying potential risk factors
of rare diseases. It has been established that prospective logistic regression analysis
of case-control data is "efficient" in the modern semiparametric sense with respect
to the underlying covariate density model (Breslow et
of the gene-environment association problem is that it may often be reasonable to
assume that a subject's genetic susceptibility is independent of the environmental
exposure. Consequently, one may be able to obtain more efficient estimation techniques
than the traditional logistic regression, by exploiting the additional gene-environment
independence restriction instead of an unconstrained covariate density model.

Piegorsch et
environment interactions in logistic models with data from cases alone, provided

that the environmental factor (E) and the genetic factor (G) are independent in the

population and the disease is rare. The interaction parameter is obtained as the odds










ratio between G and E among cases only. They also noted that the estimate of the

G-E interaction parameter from case-only data is more efficient than its counterpart

obtained from case-control data using logistic regression.
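The case-only estimate described above can be illustrated for a binary genetic factor and a binary exposure: the interaction is estimated by the G-E odds ratio among cases. This is a sketch with hypothetical counts and a hypothetical function name. The standard error uses the usual large-sample (Woolf) formula for a log odds ratio, which is a standard result and not something stated in the text.

```python
import math


def case_only_interaction(n00, n01, n10, n11):
    """Case-only estimate of the G-E interaction on the log odds scale.

    n_ge is the number of cases with carrier status g and exposure e (0 or 1).
    Valid only under G-E independence in the population and a rare disease.
    """
    log_or = math.log((n11 * n00) / (n10 * n01))
    # Woolf's large-sample standard error for a log odds ratio
    se = math.sqrt(1.0 / n00 + 1.0 / n01 + 1.0 / n10 + 1.0 / n11)
    return log_or, se


# Hypothetical counts among cases only
log_or, se = case_only_interaction(n00=40, n01=20, n10=10, n11=20)
```

No controls enter the calculation, which is why the main effects of G and E cannot be recovered from it.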

However, methods that use G-E independence produce severely biased estimates

if the assumption is violated (Schmidt and Schaid, 1999; Albert et
independence is less likely to occur when the environmental exposure is external

(pollution, pesticide or radio-active substance) or a randomized treatment in a clini-

cal trial. One has to be much more cautious with the independence assumption when

considering behavioral risk factors and metabolic polymorphisms which could alter

an individual's behavior. Gatto et
of non-independence. In fact, genetic susceptibility factors and environmental
exposures, though unlikely to be causally related at an individual level, may be correlated
at a population level due to their dependence on other variables that stratify the
population, such as age, ethnicity, family history and the like. For example, a woman with
a strong family history of breast cancer is more likely to carry a BRCA1/2 (two major
genes identified for breast and ovarian cancer) mutation and, knowing her family
history, less likely to use post-menopausal hormones. This may result in a negative
association between BRCA1/2 mutation and hormone use. In such instances, G-E
independence does not hold marginally, but may hold when conditioned on the
stratification variables (for instance, family history). Modeling stratification effects can
thus be viewed as a possible remedy to guard against the resultant bias due to violation
of the G-E independence assumption. One of the major goals of the current chapter
is to develop techniques to model stratification effects in a flexible, data-adaptive way
in an estimation framework which exploits conditional G-E independence.
The use of G-E independence through case-only studies has mainly been for
estimating the gene-environment interaction parameter. Khoury and Flanders (1996)









noted that neither the genetic nor the environmental exposure main effect can be
estimated with case data only. Umbach and Weinberg (1997) showed that with data
available on both cases and controls, one can estimate the main effects and interaction
by fitting a suitably constrained log-linear model under a rare disease assumption. In
a population-based case-control study of ovarian cancer among Jewish women in Israel,
Modan et
disease assumption, the disease odds ratio associated with E among subjects with
genotype G = g can be estimated by a logistic regression analysis that compares
P(E|D = 0) with P(E|D = 1, G = g). However, the method proposed in Modan
et
Most of the above methods consider very simple settings and it is not immediately clear how
to exploit G-E independence in the presence of population stratification as a direct
extension of these methods.

Chatterjee and Carroll (2005) (referred to as CC in the rest of the text) propose a
semiparametric maximum likelihood method of estimation of all the logistic regression
parameters. They exploit the G-E independence assumption and use data from both

cases and controls. Their method addresses many of the limitations of the existing

methods as discussed above. CC derive a robust profile-likelihood based estimation

technique which does not require the rare disease assumption. They also consider

the issue of population stratification and propose a method when G-E independence

assumption only holds conditional on the set of stratification variables (S). CC

consider a logistic disease probability model for P(D|G, E, S). They proceed to work

with the joint retrospective likelihood of the form P(G, E, S|D), factorized as,

$$P(G, E, S \mid D) = \frac{P(D \mid G, E, S)\, P(G \mid E, S)\, P(E, S)}{\sum_{G, E, S} P(D \mid G, E, S)\, P(G \mid E, S)\, P(E, S)}.$$

Under the assumption of G-E independence conditional on S, the second factor on

the right hand side reduces to P(G|E, S) = P(G|S) and thus it remains to model










P(E, S) and P(G|S). CC leave the joint distribution of the environmental exposure

and the stratification variables, P(E, S) to be fully non-parametric. However, they

model P(G|S) in a parametric way, by assuming a logistic regression model with S

as covariate. As we will note, the parametric logistic model for the P(G|S) is often

inadequate, especially for a genetic mutation which is rarely detected in healthy

controls but commonly prevalent in the case population. In such circumstances, the

estimation, especially of the main effect due to G, suffers in the method proposed

by CC. To overcome this problem, I use a factorization of the partially retrospective

likelihood P(G, E|D, S) that allows us to model the genotype frequencies separately

in the case and the control population. Moreover, for genetic mutations like the

BRCA1/2, there are several genetic risk models (Antoniou et
empirical data (Risch et
mutation frequencies after adjusting for covariates like family history and ancestry.

A flexible Bayesian model can incorporate this accumulated scientific evidence in the
form of a prior distribution assigned to P(G|S) and lead to more accurate estimation
than a logistic model for carrier probabilities. To elicit this advantage of the Bayesian
paradigm while estimating all the parameters in the G-E logistic regression model,
and not just the G-E interaction, remains another primary goal of this chapter. The
dataset I use is a replica of the one that CC use, based on a case-control study on
ovarian cancer patients in Israel (Modan et
of BRCA1/2 as the genetic risk factor and number of years of oral contraceptive

(OC) use and parity as the environmental exposures. The stratification variables I

consider are age group, ethnicity, personal history of breast cancer (PHB) and family

history of breast and ovarian cancer (FHBO). I model the control distribution of the

continuous environmental exposures conditional on S as a Dirichlet process mixture

of normals (DPM). The DPM model is appealing in this context as it provides a

natural measure of the degree of stratification and is model-robust. I also present a










parametric Bayesian alternative for comparison purposes. An extensive simulation
study providing an in-depth comparison of the proposed Bayesian methods with the
powerful estimation techniques provided by CC, the case-only method and ordinary
logistic regression is a very important feature of this chapter. The simulation explores
several scenarios, with changing distributions for G and E as well as under violation of
the G-E independence assumption even when conditioned on observable confounders.
It appears that under G-E independence, the proposed semiparametric Bayesian
method has a real advantage over the competing methods under either of the following
situations: (i) the individual genotype frequencies in each stratum do not follow the
logistic multiplicative odds model in terms of stratification variables, (ii) the genetic
mutation is rare in the control population and is commonly prevalent in the case
population. The gain is significant when the number of strata defined by S is relatively
large. When the G-E independence assumption fails even conditional on S, all
the methods which use this assumption perform poorly, least so the Bayesian
semiparametric method, which is more robust to model changes.

The rest of this chapter is organized as follows. In Section 4.2 I present the

model, likelihood, priors and posteriors. Section 4.3 contains analysis of the Israeli

ovarian cancer data. Section 4.4 presents the details of our simulation study and the

results. Section 4.5 contains concluding discussion, while proofs and computational

details are relegated to Appendix B.

4.2 Model, Likelihood, Priors and Posteriors

Consider a case-control study with n subjects: $n_1$ cases and $n_0$ controls. Let D
be the binary disease variable, i.e., $D_j = 1$ if the jth subject is a case, and $D_j = 0$
if the subject is a control. The genetic risk factor G is essentially the genotype at a
single locus within a candidate gene. I will consider G as a categorical variable with
M + 1 levels, namely $g_0, \ldots, g_M$. In addition, the data are assumed to be stratified
based on some other covariates, say S. I consider the following logistic regression









function to model the disease probability in terms of G, E and S,

$$P(D = 1 \mid G, E, S) = H\Big\{\beta_0(S) + \sum_{m=0}^{M} \beta_{1m} I(G = g_m) + \beta_2 E + \sum_{m=0}^{M} \beta_{3m} E\, I(G = g_m)\Big\},$$ (4-1)

where $H(u) = \{1 + \exp(-u)\}^{-1}$. The intercepts $\beta_0(S)$ capture stratification effects
due to the covariates S on the risk of disease. Let $\beta_1 = (\beta_{10}, \ldots, \beta_{1M})$, $\beta_2$, and
$\beta_3 = (\beta_{30}, \ldots, \beta_{3M})$ represent the main effect of the genetic factor, the main effect
of the environmental factor, and their interaction effect, respectively. For parameter
identifiability, I set $\beta_{10} = 0$ and $\beta_{30} = 0$. For simplicity, I present my model with only

one continuous environmental exposure. Extension to multiple continuous exposures

E is straightforward and one such analysis is presented in Section 4.3. Extension of

the methodology when E is a set of categorical exposures or a mixed set of continuous

and categorical exposures is indicated later in this section.
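The prospective model (4-1) with a single continuous exposure can be sketched as a small function. The argument names are hypothetical, the genotype is passed as an index m in 0, ..., M, and the coefficient values in the example are made up for illustration.

```python
import math


def disease_prob(g, e, beta0_s, beta1, beta2, beta3):
    """P(D = 1 | G = g_m, E = e, S) under the logistic model (4-1).

    g:        genotype index m in 0..M
    e:        value of the continuous exposure
    beta0_s:  stratum-specific intercept beta_0(S)
    beta1:    genetic main effects, with beta1[0] = 0 for identifiability
    beta2:    main effect of the exposure
    beta3:    interaction effects, with beta3[0] = 0 for identifiability
    """
    u = beta0_s + beta1[g] + beta2 * e + beta3[g] * e
    return 1.0 / (1.0 + math.exp(-u))


# Hypothetical coefficients for a binary genotype (M = 1)
p_ref = disease_prob(0, 0.0, 0.0, [0.0, 1.0], 0.5, [0.0, 0.2])
p_car = disease_prob(1, 2.0, -1.0, [0.0, 1.0], 0.5, [0.0, 0.2])
```

Because the indicator sums in (4-1) select exactly one genotype level, indexing the coefficient vectors by m is equivalent to evaluating the sums.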

As I continue to compare and contrast our methods with CC and traditional logistic
regression, I would first like to point out that each method is based on a different
likelihood: the CC method uses a fully retrospective likelihood, P(G, E, S|D), the
traditional logistic model uses a fully prospective likelihood, P(D|G, E, S), whereas our
method uses the following partially retrospective likelihood P(G, E|D, S), factorized as

$$L_R = \prod_{j=1}^{n} P(G_j, E_j \mid S_j, D_j) = \prod_{j=1}^{n} \left[ P(G_j \mid E_j, S_j, D_j)\, P(E_j \mid S_j, D_j) \right].$$ (4-2)

As illustrated in Prentice and Pyke (1979) and discussed again in Roeder et al. (1996)

and Miller and Roeder (1997), the form of the retrospective likelihood considered here

is compatible with the logistic form of the prospective likelihood. Evaluation of the

likelihood function (4-2) requires the conditional distribution of [G|E, S, D] and the

conditional distribution of [E|S, D]. I will make the following assumption:

Assumption 1: Conditional on S, G and E are independent in the control

population, i.e., P(G|D = 0, E, S) = P(G|D = 0, S).










When the disease is rare in each stratum, and the control population mimics

the entire population, the usual G-E independence assumption in source population,

i.e., P(G|E, S) = P(G|S) is approximately equivalent to Assumption 1. The two

assumptions of G-E independence in source population and rare disease are made

by Piegorsch et
while CC do not need the rare disease assumption. Our analysis is exact under

Assumption 1 which may hold even when the disease is not rare. As pointed out in

Schmidt and Schaid (1999), the rare disease assumption is quite subtle and may not

hold, for example in situations where the disease risk is much higher for the carriers

of a particular gene mutation or for certain strata of the population. In the dataset I

consider, the risk of ovarian cancer is known to be higher for BRCA1/2 carriers and

for subjects with family history of breast or ovarian cancer. Fortunately, the bias due

to the rare disease assumption has less impact when the overall disease prevalence

P(D = 1) is small, even with highly penetrant genes (Schmidt and Schaid, 1999).

I do recognize that directly verifying Assumption 1 empirically could be quite

difficult based on the given study at hand, as tests of independence will have little

power. Many researchers have considered this issue of verifying G-E independence

in control population in the context of using this as a screening tool to validate the

use of case-only analysis (Albert et
E association pattern in controls reflect G-E association in source population when

baseline disease risk is less than 0.1 (Gatto et
in the simulations, I do consider various departures from Assumption 1, and the

performance of all the methods under violation of this assumption. I advocate that

when substantial uncertainty remains on the validity of the independence assumption,

statistically significant results based on the proposed methods should be treated as

precursors for high priority investigations for future epidemiologic studies.









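Assuming that the first $n_0$ observations are controls and the next $n - n_0$
observations are cases, under Assumption 1, the retrospective likelihood in (4-2) reduces
to

$$L_R = \prod_{j=1}^{n_0} \left[ P(G_j \mid S_j, D_j = 0)\, P(E_j \mid S_j, D_j = 0) \right] \times \prod_{j=n_0+1}^{n} \left[ P(G_j \mid E_j, S_j, D_j = 1)\, P(E_j \mid S_j, D_j = 1) \right].$$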
j=no+1

Consequently, to evaluate the likelihood contributed from control data I will need
to specify probability models for P(G|S, D = 0) and P(E|S, D = 0). Following
the technique first suggested by Satten and Kupper (1993), I present the following
Lemmas, which furnish expressions for P(G|S, E, D = 1) and P(E|S, D = 1)
once the control distributions and the prospective model in (4-1) are given.
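Lemma 1:

$$\frac{P(G = g_m \mid E, S, D = 1)}{P(G = g_m \mid E, S, D = 0)} = \frac{P(D = 1 \mid G = g_m, E, S) / P(D = 0 \mid G = g_m, E, S)}{P(D = 1 \mid E, S) / P(D = 0 \mid E, S)}.$$

Lemma 2:

$$\frac{P(D = 1 \mid E, S)}{P(D = 0 \mid E, S)} = \sum_{m=0}^{M} \frac{P(D = 1 \mid G = g_m, E, S)}{P(D = 0 \mid G = g_m, E, S)}\, P(G = g_m \mid D = 0, E, S).$$

Lemma 3:

$$\frac{P(E \mid S, D = 1)}{P(E \mid S, D = 0)} = \frac{P(D = 1 \mid E, S) / P(D = 0 \mid E, S)}{\int \left\{ P(D = 1 \mid E, S) / P(D = 0 \mid E, S) \right\} P(E \mid S, D = 0)\, dE}.$$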

The proofs of the Lemmas are collected in Appendix B.
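Lemmas 1 and 2 can be checked numerically for a fixed value of (E, S) by building the joint distribution of G and D from a genotype distribution and genotype-specific disease risks. The sketch below is only a sanity check with hypothetical numbers and a hypothetical function name; it is not a substitute for the proofs in Appendix B.

```python
def check_lemmas(p_g, p_d_given_g):
    """Return (lhs, rhs) pairs for Lemma 1 (one per genotype) and Lemma 2.

    p_g[m]:         P(G = g_m | E, S) at a fixed (E, S)
    p_d_given_g[m]: P(D = 1 | G = g_m, E, S) at the same (E, S)
    """
    p_d1 = sum(pg * pd for pg, pd in zip(p_g, p_d_given_g))  # P(D = 1 | E, S)
    g_case = [pg * pd / p_d1 for pg, pd in zip(p_g, p_d_given_g)]
    g_ctrl = [pg * (1.0 - pd) / (1.0 - p_d1) for pg, pd in zip(p_g, p_d_given_g)]
    odds_g = [pd / (1.0 - pd) for pd in p_d_given_g]  # genotype-specific disease odds
    odds = p_d1 / (1.0 - p_d1)                        # marginal disease odds
    lemma1 = [(gc / g0, og / odds) for gc, g0, og in zip(g_case, g_ctrl, odds_g)]
    lemma2 = (odds, sum(og * g0 for og, g0 in zip(odds_g, g_ctrl)))
    return lemma1, lemma2


# Hypothetical genotype distribution and disease risks for three genotypes
lemma1_pairs, lemma2_pair = check_lemmas([0.7, 0.2, 0.1], [0.1, 0.3, 0.6])
```

Each pair holds the left-hand and right-hand sides of the corresponding identity, so agreement within rounding error is what one expects.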

Remark 1: With the likelihood conditional on S, I do not intend to estimate
the relative risks due to the stratification variables S and focus only on the parameter
of interest $\beta = (\beta_1, \beta_2, \beta_3)$. As I proceed, I note that under our formulation, I
tacitly avoid direct estimation of the stratum-specific intercept parameters $\beta_0(S)$
which appear in the disease risk model (4-1).









Before describing the estimation theory, I first would like to address the
identifiability of the parameters in the prospective model (4-1) and the retrospective
likelihood $L_R$. As stated in Prentice and Pyke (1979), if no assumptions are
made on the covariate distribution $F(g, e \mid s) = P(G = g, E = e \mid S = s)$, neither
$F(\cdot, \cdot \mid s)$ nor $\beta_0(S)$ is identifiable. But $\beta$ is always identifiable under any choice of
$F$. Following Lemma 1 of Roeder et al. (1996), it can be easily shown that under
Assumption 1 on the covariate density, $\beta$ remains identifiable in our likelihood $L_R$.

Remark 2: I would like to point out that, unlike the Prentice-Pyke result for the
general nonparametric covariate density case, with an additional independence
restriction on $F(\cdot, \cdot \mid s)$ in the source population (not just in the control population as stated
in Assumption 1), Lemma 1 of CC proves that both the intercept and the covariate
distributions are identifiable given S = s. For a rare disease, Assumption 1 is
approximately equivalent to independence in the source population. Thus, in the rare
disease case, with our formulation, by Lemma 1 of CC, I do have identifiability of the
entire likelihood, not just of $\beta$.

I consider the stratification variables S as a vector of $q \ge 1$ categorical covariates,
with the kth variable having $T_k$ categories or levels. Therefore, the level combinations
of S define $I = \prod_{k=1}^{q} T_k$ possible strata. For instance, in the Israeli ovarian cancer
data I consider q = 4 stratification variables: (age group, ethnicity, PHB, FHBO), the
first three having two categories each and FHBO having three categories. Therefore
S defines I = 2 x 2 x 2 x 3 = 24 possible strata. For ease of notation, I will introduce
Z, a single index variable with I possible values, each value representing a distinct
stratum. So for subject j, $Z_j$ can take exactly one of the values 1, ..., I, completely
determined by the observed values of the stratification variables for subject j, namely
$S_j$. I can now rewrite the likelihood $L_R$ after replacing $S_j$ by the stratum membership










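indicator of subject j, namely $Z_j$:

$$L_R = \prod_{j=1}^{n_0} P(G_j \mid Z_j, D_j = 0)\, P(E_j \mid Z_j, D_j = 0) \times \prod_{j=n_0+1}^{n} P(G_j \mid E_j, Z_j, D_j = 1)\, P(E_j \mid Z_j, D_j = 1).$$ (4-3)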
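The mapping from a subject's stratification vector to the single stratum index Z can be sketched with a mixed-radix encoding. The coding of each variable's levels as 0, 1, ... and the ordering of the strata are assumptions made for illustration; the text only requires that each level combination receive a distinct value in 1, ..., I.

```python
from itertools import product


def stratum_index(levels, sizes):
    """Map level codes of q categorical variables to a stratum index in 1..I.

    levels[k]: level of the k-th stratification variable, coded 0..sizes[k]-1
    sizes[k]:  number of categories T_k of the k-th variable
    """
    z = 0
    for level, size in zip(levels, sizes):
        if not 0 <= level < size:
            raise ValueError("level out of range for its variable")
        z = z * size + level  # mixed-radix positional encoding
    return z + 1


# Age group, ethnicity, PHB (two levels each) and FHBO (three levels)
SIZES = (2, 2, 2, 3)
all_z = sorted(stratum_index(s, SIZES)
               for s in product(*(range(t) for t in SIZES)))
```

Enumerating every level combination yields each of the 24 indices exactly once, matching I = 2 x 2 x 2 x 3.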

I consider the following model for the control distribution of the genetic factor in
stratum i,

$$\log \frac{P(G = g_m \mid Z = i, D = 0)}{P(G = g_0 \mid Z = i, D = 0)} = \gamma_{im}, \quad m = 1, \ldots, M.$$ (4-4)

Note that $\gamma_{i0} = 0$. The above model does not assume any stringent parametric form
for P(G|D = 0, S) in terms of S and simply treats the probabilities in each stratum
as the model parameters, allowing complete distributional flexibility.

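Result 1: Using (4-1), (4-4) and Lemma 1, I obtain the case distribution of G
as:

$$P(G = g_m \mid E, Z = i, D = 1) = \frac{\exp\{\beta_{1m} + \beta_{3m} E + \gamma_{im}\}}{\sum_{k=0}^{M} \exp\{\beta_{1k} + \beta_{3k} E + \gamma_{ik}\}}, \quad m = 1, \ldots, M.$$ (4-5)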


Proof of Result 1 is presented in Appendix B. Note that although in the control

population by virtue of the independence assumption, P(G|E, D = 0, Z = i)=

P(G|D = 0, Z = i), in the case population P(G|E, D = 1, Z = i) does depend on E.
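Result 1 says that the genotype distribution among cases is a multinomial-logit (softmax) function of the genetic main effects, the interaction effects times E, and the stratum-specific control log-odds from (4-4). A minimal sketch follows, with hypothetical argument names and made-up values; the vectors are indexed by genotype level 0, ..., M with the reference entries equal to zero.

```python
import math


def case_genotype_probs(e, beta1, beta3, log_odds_i):
    """P(G = g_m | E = e, Z = i, D = 1) for m = 0..M, following (4-5).

    beta1, beta3: genetic main and interaction effects (entry 0 equals 0)
    log_odds_i:   control log-odds of g_m versus g_0 in stratum i (entry 0 equals 0)
    """
    scores = [b1 + b3 * e + lo for b1, b3, lo in zip(beta1, beta3, log_odds_i)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# With no genetic effect, cases share the control distribution whatever E is
null_probs = case_genotype_probs(5.0, [0.0, 0.0], [0.0, 0.0], [0.0, math.log(1.0 / 3.0)])

# With a nonzero interaction, the carrier probability among cases changes with E
low_e = case_genotype_probs(0.0, [0.0, 1.0], [0.0, 0.5], [0.0, math.log(1.0 / 3.0)])
high_e = case_genotype_probs(4.0, [0.0, 1.0], [0.0, 0.5], [0.0, math.log(1.0 / 3.0)])
```

The second example shows the point made in the text: G and E are independent among controls by assumption, but not among cases.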

Due to the high dimensional nature of the stratification variables S, it is often hard
to model the effect of S on the distribution of the exposure variable (E) explicitly.
I consider a flexible nonparametric Bayesian approach to model the distribution
[E|D = 0, Z = i] which allows for possible stratification effects on the distribution
of E and does so in a data-adaptive way. I consider the case when E is continuous,
as in the data example. The Dirichlet process mixture model (DPM) with a normal










kernel can be expressed in the following hierarchical structure


[E|D = 0, Z=] i] NOps, of )



P ~ DP(a~Po), (4-6)


where P, serving as a prior on the $\theta_i$, i = 1, ..., I, is itself a random probability
measure. I assume that P is a realization of a Dirichlet process (DP) with scalar precision
parameter $\alpha > 0$ and base measure (or base prior) $E[P] = P_0$, which is a bivariate
CDF on $R \times R^{+}$. A property of the DP prior is that the random probability measure
P is almost surely discrete, leading to the following properties which reinterpret the
DPM model structure (see Antoniak, 1974 and Sethuraman, 1994 for details):

1. Any realization of $\theta_i$, i = 1, ..., I, generated from P lies in a set of
$K (\le I)$ distinct values, denoted by $\omega = \{\omega_1, \ldots, \omega_K\}$;

2. $\omega_l$ (l = 1, ..., K) are a random sample from the base prior $P_0$;

3. $K (\le I)$ is drawn from an implicitly determined prior distribution depending on
the precision parameter $\alpha$ and I;

4. Given $K \le I$, the I values are selected from the set $\omega$ according to a uniform
multinomial distribution.

The above discussion is conditional on $\alpha$ and the hyperparameters which determine
$P_0$.
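The clustering implied by the almost sure discreteness of P can be illustrated with the
standard Polya urn (Blackwell-MacQueen) representation of draws from a DP with P integrated
out. This is a generic sketch rather than the sampler of Appendix B; the function name and
the seed are assumptions made for illustration.

```python
import random

def polya_urn_labels(n, alpha, rng):
    """Cluster labels of n successive draws theta_1..theta_n from P ~ DP(alpha * P0)."""
    labels, counts = [], []
    for i in range(n):
        # A fresh value from P0 has probability alpha / (alpha + i); otherwise an
        # existing value is copied with probability proportional to its count.
        u = rng.random() * (alpha + i)
        if u < alpha or not counts:
            counts.append(1)
            labels.append(len(counts) - 1)
        else:
            u -= alpha
            k = 0
            while k < len(counts) - 1 and u >= counts[k]:
                u -= counts[k]
                k += 1
            counts[k] += 1
            labels.append(k)
    return labels

labels = polya_urn_labels(24, 4.0, random.Random(1))  # one label per stratum
K = len(set(labels))                                   # number of distinct values
```

Strata sharing a label share the same $(\mu, \sigma^2)$, so K counts the distinct exposure
distributions among the I = 24 strata in one prior draw.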

With this hierarchical mixture prior structure for the control distribution of E
and the prospective logistic model (4-1), it now remains to investigate the nature of
the case distribution of E. The following result provides an answer.

Result 2: Assume that the $\theta_i$ take values $\omega_l = (\mu_l, \sigma_l^2)$ from the
set $\omega$ as described in 1. Then

$$[E \mid Z = i, D = 1, \theta_i = \omega_l] = \sum_{m=0}^{M} p_{ilm}\, N(E; \omega^{*}_{lm}), \qquad (4\text{-}7)$$

where $\omega^{*}_{lm} = (\mu_l + (\beta_2 + \beta_{3m})\sigma_l^2,\ \sigma_l^2)$ and

$$p_{ilm} = \frac{\exp\{\beta_{1m} + (\mu_l + (\beta_2 + \beta_{3m})\sigma_l^2)^2/(2\sigma_l^2) + \gamma_{im}\}}{\sum_{k=0}^{M} \exp\{\beta_{1k} + (\mu_l + (\beta_2 + \beta_{3k})\sigma_l^2)^2/(2\sigma_l^2) + \gamma_{ik}\}},$$

with $\beta_{10} = \beta_{30} = \gamma_{i0} = 0$ for the reference category. Hence, the
distribution of E among the case population, conditional on all other parameters, is again
a DP mixture, not with a normal kernel but with the mixture kernel given by (4-7).

The exact expression of the likelihood (4-3) and the proof of Result 2 are deferred to
Appendix B. I will refer to this model for E as EDPM in what follows.
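Result 2 can be checked numerically: tilting a normal control density by the logistic odds
$\exp\{\beta_{1m} + (\beta_2 + \beta_{3m})E + \gamma_{im}\}$ and summing over m should be
proportional to the normal mixture in (4-7). The sketch below does this for a binary G; all
parameter values and function names are hypothetical and serve only as an illustration.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def case_mixture(beta1, beta2, beta3, gamma_i, mu, var):
    """Weights p_ilm and shifted component means of the case kernel in (4-7)."""
    means = [mu + (beta2 + b3) * var for b3 in beta3]
    logw = [b1 + m ** 2 / (2 * var) + g for b1, m, g in zip(beta1, means, gamma_i)]
    top = max(logw)
    w = [math.exp(v - top) for v in logw]
    total = sum(w)
    return [x / total for x in w], means

def mixture_pdf(e, weights, means, var):
    return sum(w * normal_pdf(e, m, var) for w, m in zip(weights, means))

def tilted_pdf(e, beta1, beta2, beta3, gamma_i, mu, var):
    """Unnormalised case density: control density times the summed disease odds."""
    odds = sum(math.exp(b1 + (beta2 + b3) * e + g)
               for b1, b3, g in zip(beta1, beta3, gamma_i))
    return odds * normal_pdf(e, mu, var)

# Hypothetical parameter values for a binary genetic factor.
b1, b2, b3, g = [0.0, 3.75], -0.075, [0.0, 0.11], [0.0, -5.0]
mu, var = 3.0, 4.0
weights, means = case_mixture(b1, b2, b3, g, mu, var)
ratios = [tilted_pdf(e, b1, b2, b3, g, mu, var) / mixture_pdf(e, weights, means, var)
          for e in (0.0, 2.5, 7.0)]
```

A constant ratio across values of E indicates that the two expressions differ only by a
normalising constant, which is the content of Result 2 for a single stratum and cluster.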

Prior Structure: The likelihood (4-3) involves the association parameters $\beta_1$,
$\beta_2$, $\beta_3$, the $\gamma_{i1}, \ldots, \gamma_{iM}$, and
$\theta_i = (\mu_i, \sigma_i^2)$, i = 1, ..., I. I use independent normal priors for all
the association parameters and also on the $\gamma_{im}$'s, m = 1, ..., M. I will
note in the real data example (with only two possible values of G, so that m = 0, 1)
that if we know a priori that the mutation is rare in the control population, and
have an established genetic risk model for P(G|S), we should select an informative
prior on $\gamma_{i1}$, so that the effective range of the carrier probabilities in the
control/case population for each stratum reflects the scientific guesses for these values.

It now remains to describe the hierarchical prior structure involved in the DPM
model. Note that the mean of the random probability measure P is $P_0$, which is a
bivariate distribution, and I consider the following standard normal-inverse Gamma
structure: under $P_0$, $\mu_i \mid \sigma_i^2 \sim N(m_0, \tau\sigma_i^2)$ and
$\sigma_i^2 \sim IG(s/2, S/2)$. For computation, I used a normal prior on $m_0$, which adds
an extra layer of uncertainty in $P_0$. I use an Inverse Gamma $IG(a_\tau/2, b_\tau/2)$
prior on $\tau$. Lastly, following Escobar and West (1995), I assume a
Gamma$(a_\alpha, b_\alpha)$ prior on the precision parameter $\alpha$. I choose the prior
parameters $(a_\alpha, b_\alpha)$ in such a way that the mean of the prior distribution of
K is reasonably large (compared to I) and the variance is modest. Choosing such a
prior is suggested in West et al. (1994).
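To see how the precision parameter drives the prior on K, one can use the standard result
that, conditional on $\alpha$, the expected number of distinct values among I draws from a
DP is $\sum_{i=1}^{I} \alpha/(\alpha + i - 1)$. The sketch below tabulates this for a few
illustrative values of $\alpha$ (the values themselves are assumptions, not taken from the
analysis); under a Gamma prior on $\alpha$ the prior mean of K is the average of this
quantity over that prior.

```python
def expected_clusters(alpha, n):
    """E[K | alpha, n] = sum_{i=1}^{n} alpha / (alpha + i - 1) under a DP prior."""
    return sum(alpha / (alpha + i) for i in range(n))

I_STRATA = 24  # number of strata in the ovarian cancer example
table = {a: expected_clusters(a, I_STRATA) for a in (0.5, 1.0, 4.0, 15.0)}
```

Larger values of $\alpha$ push the expected K towards I, which is the sense in which the
Gamma prior parameters can be tuned to make the prior mean of K reasonably large.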

None of the full conditional distributions follows a standard distributional form,
and posterior inference is made by using the MCMC numerical integration technique.
Conditional on the $\theta_i$, drawing random numbers from the respective conditional
distributions is a straightforward application of the Metropolis-Hastings algorithm. To
update the $\theta_i$, I use the no-gaps algorithm prescribed by MacEachern and Müller
(1998). I describe the computational details of the algorithm in Appendix B.

Remark 3: An interesting feature of the EDPM model is that it selects K, the
number of distinct values in I realizations from P (the cardinality of the set $\omega$),
in a data-adaptive way depending on the extent of stratification effects on the
distribution of the environmental exposure. In the presence of strong stratification
effects, all of the $\omega_l$ could be distinct, i.e. K = I; in the complete absence of
stratification effects, K = 1. Typically K will lie somewhere in between. The posterior
mode of K thus serves as an indicator of the degree of stratification effects on the
control distribution of E. In the above discussion, I assumed S to be a set of categorical
stratification variables, which is most often the case. If any of the stratification
variables is continuous, I recommend categorizing it for implementing the EDPM model.

Remark 4: Since I assume the distribution of E to be a Dirichlet process mixture of
normals, this model applies only to continuous environmental exposures. The model can
be easily extended to multiple continuous exposures, simply by taking a DPM model
with a multivariate normal kernel (as used by Müller and Roeder (1997)). I illustrate
this multivariate extension in one segment of the real data analysis. For categorical
exposures, the models could be adapted as shown in Seaman and Richardson (2001)
by using a Dirichlet distribution as prior on the probabilities for each category. For
a mixture of discrete and continuous environmental exposures one could either
categorize the continuous exposure into classes or adapt the Bayesian bootstrap ideas
as described in Gustafson et al. (2002). The main theme is common to all
three methods: modeling the distribution of the environmental exposure in a
nonparametric way to guard against violations of model assumptions.









Remark 5: Note that, as indicated in Remark 1, via the above formulation, the
nuisance parameter $\beta_0(S)$ does not appear in the case distributions of G and E
given in (4-5) and (4-7), respectively. $\beta_0(S)$ appears as a common term in both the
numerator and denominator of (4-5) and (4-7) and thus cancels in the ratio. Hence, the
retrospective likelihood does not involve $\beta_0(S)$.

Remark 6: One could naturally think of the following parametric logistic model
for modeling the distribution of G, instead of using the more flexible model given
in (4-4):

$$\log \frac{P(G_j = g_m \mid D_j = 0, S_j)}{P(G_j = g_0 \mid D_j = 0, S_j)} = \nu_{0m} + \nu_m^{\top} S_j, \qquad m = 1, \ldots, M, \qquad (4\text{-}8)$$

where $\nu_m$ is a vector of regression parameters capturing the effect of stratification
variables on the incidence of the genetic susceptibility factor in the control population.
CC assume a similar logistic model for P(G|S) for their real data analysis, though
they recognize that it is hard to predict BRCA1/2 carrier probabilities using this
logistic structure. Indeed, when I based my inference on the model in (4-8) with
normal priors on $\nu_{0m}$ and $\nu_m$, the estimates of the parameters of interest
$\beta$ were less accurate than the ones using (4-4). Thus, for the sake of brevity I only
include results where I used model (4-4) for carrier probabilities.

4.3 The Israeli Ovarian Cancer Data

In this section, I apply the proposed methodology to the data from a population-based
case-control study on all ovarian cancer patients identified in Israel between
March 1, 1994 and June 30, 1999 (Modan et al., 2001). Blood samples were collected
from the cases and the controls in order to test for the presence of mutations in the
two major breast and ovarian cancer susceptibility genes BRCA1 and BRCA2. In
addition, the subjects were interviewed to collect data on reproductive/gynecological
history such as parity, number of years of OC use and gynecological surgery. The
main goal of the study was to examine the interplay of the BRCA1/2 genes and
known reproductive/gynecological risk factors of ovarian cancer. Since the actual
data had confidentiality issues, a replica was generated by replacing only the original
genetic susceptibility factor by a simulated binary genetic risk factor, retaining all
the features as in the original dataset. The dataset I used contained 832 cases and
747 controls.

This is a real example where OC use and BRCA1/2 mutation may appear to
be correlated simply because both could be related to the stratification variables S,
such as age and family history, and it is more realistic to assume independence between
these genetic and gynecological risk factors conditional on S. However, it is hard
to verify Assumption 1 based on this single dataset, as only 7 out of the 747 controls
were BRCA1/2 carriers. I ran a logistic regression of G on the exposures of interest E
in the controls in each stratum, and though the tests of association were insignificant,
the sparsity of the data makes the results of these tests unstable and
less reliable. However, Modan et al. [...] (2005) both indicate that it is reasonable to
assume that carrier status is independent of the exposures under consideration, namely
parity and number of years of OC use, and I also employ this assumption in the analysis.

It is known that the risk of ovarian cancer is higher for certain strata (for example
for the subgroup with family history of both breast and ovarian cancer) as well as for
BRCA1/2 carriers, so the rare disease assumption may not hold for all levels of the
genetic factor or for certain subgroups. However, Modan et al. [...] 1326 cases of
epithelial ovarian cancer during the five-year study period with a baseline
population of approximately 1.5 million. This gives an empirical estimate of disease
prevalence $P(D = 1) = 8.7 \times 10^{-4}$, which suggests that the odds-ratio estimates
obtained through the analysis under Assumption 1 will provide adequate approximations to
the ones obtained via exact analysis using G-E independence in the source population.

All analyses are carried out conditional on four stratification variables: S= (Age

group (=0 if age < 50 years and =1 if age > 50 years), ethnicity (=1 for Ashkenazi









Jews and 0 otherwise), presence or absence of a personal history of breast cancer

PHB (= 1 if present and 0 if absent) and a family history of breast or ovarian cancer

FHBO (= 0 if no history, 1 if one breast cancer case in family and 2 if ovarian cancer

or two or more breast cancer cases in the family)). So the total number of strata

defined by the level combinations of S is I = 24.

I analyze the data using the EDPM method as described in the previous section.
For modeling the distribution of the genetic factor, I use (4-4). The genetic factor G
is binary, with G = 0 for absence of any BRCA1/2 mutation and G = 1 for carrying
at least one BRCA1/2 mutation. It is well known that BRCA1/2 mutations are
very rare among ovarian cancer controls, and as Modan et al. (2001) pointed out,
traditional logistic regression analysis would yield imprecise estimates of the parameters
of interest. Compounding the sparsity is the fact that there is a relatively large
number of strata defined by S and, as a result, estimation of genotype frequencies
individually in each stratum would be imprecise in a classical set-up. CC adopt
a parametric logistic model for P(G|S) to circumvent this problem, which is also
not satisfactory. In a Bayesian paradigm I effectively use the prior knowledge on
BRCA1/2 carrier probabilities in different age groups, ancestry and with varying
levels of family history based on genetic algorithms (BRCAPRO: Parmigiani et al.,
1998; BOADICEA: Antoniou et al., 2004) and empirical data (Couch et al., 1997;
Risch et al., 2001). I allow uncertainty in these predictions by allowing the informative
prior on $\gamma_{i1}$ to vary around the scientific guesses and in this process relax the
stringent logistic assumption. The effective range of prior probabilities for
P(G = 1|S, D = 0) typically varied from $10^{-1}$ to $10^{-4}$ across different strata.

I present two analyses, the first with OC use as the only environmental exposure
(E), as a direct illustration of the methods formulated in Section 4.2. With a binary
G, there are three parameters of interest involved in the disease risk model (4-1),
$\beta_{11} = \beta_1$, $\beta_2$, and $\beta_{31} = \beta_3$:

$$\text{logit}\, P(D = 1 \mid G, E, S) = \beta_0(S) + \beta_1 I[G = 1] + \beta_2\, OC + \beta_3 I[G = 1] \cdot OC.$$

For each of $\beta_1$, $\beta_2$, $\beta_3$, I use a N(0, 16) prior. Since scientific theory
suggests a high positive value of $\beta_1$, one could also select a sharper prior for
$\beta_1$. For the EDPM model as described in (4-6), under the base measure $P_0$, I assume
that the variance component $\sigma^2 \sim IG(2, 1)$ and
$\mu \mid \sigma^2 \sim N(m_0, \tau\sigma^2)$. The exposure variable, number
of years of OC use, typically ranges from 0 to 20 years. I chose a diffuse prior on
$m_0$, namely $m_0 \sim N(3, 9)$. I use an IG(3, 1) prior on $\tau$. Choosing priors for
$\alpha$ is a challenging task, as $\alpha$ has the dual role of capturing the degree of
faith in the base measure as well as determining the number of distinct values of $\theta$.
As prescribed by Escobar and West (1995), I choose a Gamma prior on $\alpha$ which places
prior probability on larger values of $K \le I = 24$. I experimented with various choices of
the shape and scale parameters of the Gamma prior, and the results are presented for a
Gamma(4, 1) prior on $\alpha$. The detailed algorithm for resampling from the full
conditionals is collected in the appendix.

For comparison purposes, I also analyzed these data with a parametric model,
largely targeted towards this dataset. As the data contain 832 cases and 747 controls,
of which 678 cases and 586 controls did not use oral contraceptives at all, I used a
zero-inflated model (EZIM) for the control distribution of OC use. For individual j, I
consider $p_j$ as the probability of non-exposure ($E_j = 0$), and with probability
$(1 - p_j)$ the exposure values follow $N(\mu_j, \sigma^2)$, where
$\mu_j = \delta_0 + \delta_1^{\top} S_j$. The mixing probabilities
are also modeled through the four observed stratification factors,
$\text{logit}(p_j) = \eta_0 + \eta_1^{\top} S_j$. The case distributions can be obtained as
mixture distributions via Lemmas 1-3. For the EZIM model, I consider mutually independent
N(0, 16) priors for the regression parameters $\beta_1$, $\beta_2$ and $\beta_3$, as well as
on $\delta_0$, $\eta_0$ and each component of $\delta_1$ and $\eta_1$. For the scale
parameter $\sigma^2$, I use an IG(2, 1) prior. Posterior inference is again
based on MCMC samples from the full conditional distributions of the parameters.
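The zero-inflated control model for OC use can be sketched as a two-part density: a point
mass at zero with probability $p_j$ and a normal density otherwise. The function below is a
minimal illustration under that description; its name, argument layout and the covariate
coding are assumptions, and it does not reproduce the case distribution obtained via
Lemmas 1-3.

```python
import math

def ezim_control_logpdf(e, s, eta0, eta1, delta0, delta1, sigma2):
    """Log density of exposure e for a control with covariate vector s.

    The density is taken with respect to a point mass at 0 plus Lebesgue measure.
    """
    lin_p = eta0 + sum(a * b for a, b in zip(eta1, s))
    p_zero = 1.0 / (1.0 + math.exp(-lin_p))  # P(E = 0), logit-linear in s
    if e == 0:
        return math.log(p_zero)
    mu = delta0 + sum(a * b for a, b in zip(delta1, s))  # mean of the exposed part
    log_normal = -0.5 * math.log(2 * math.pi * sigma2) - (e - mu) ** 2 / (2 * sigma2)
    return math.log1p(-p_zero) + log_normal

# Hypothetical covariates and coefficients, for illustration only.
s_j = (1, 0, 1, 2)
lp_zero = ezim_control_logpdf(0, s_j, 0.0, (0, 0, 0, 0), 1.0, (0.5, 0, 0, 0.25), 4.0)
lp_pos = ezim_control_logpdf(2.0, s_j, 0.0, (0, 0, 0, 0), 1.0, (0.5, 0, 0, 0.25), 4.0)
```

Summing this log density over controls gives the exposure part of the control likelihood
under the parametric alternative to the DPM.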

I analyzed these data through the method proposed by CC and the case-only
method after adjusting for the covariates S. The case-only method only furnishes
an estimate of the BRCA1/2*OC interaction parameter $\beta_3$. The results are presented
in Table 4-1. There is little difference in the estimates of $\beta_2$ and $\beta_3$
among the four methods which use G-E independence. But for estimating the main
effect of G as measured by $\beta_1$, the Bayesian methods have a much smaller posterior
standard deviation and narrower HPD interval (posterior standard deviations of about 0.13)
compared to the standard error (0.40) and the CI for the estimate of $\beta_1$ in the CC
method. The results indicate that the standard logistic assumption is less likely to hold
for P(G|S) in this dataset, and that the more flexible model for G as given in (4-4),
boosted with the scientifically validated priors, adapts itself more naturally to the
features of the data. Interestingly, the nonparametric EDPM model for OC use performs quite
comparably with the parametric zero-inflated model, which is designed specifically to
capture the distribution of OC use.

I also analyzed the data by ordinary logistic regression analysis, which does not
exploit G-E independence in any manner. The wider confidence intervals, especially
for the interaction parameter, indicate that any method using G-E independence is
able to estimate the interaction parameter more precisely. Whereas all the other four
methods declare G-E interaction to be statistically significant, the ordinary logistic
model cannot detect significance.

In summarizing the results, I first observe that for women who never used OC
(E = 0), there is an almost astronomic increase in risk of ovarian cancer for a BRCA1/2
mutation carrier. The estimated odds ratio by the EDPM method is exp(3.75) =
42.52. On the other hand, among non-carriers, longer use of OC is related to a decrease
in disease risk, with associated odds ratio exp(-0.0748) = 0.93 per year of OC use.
However, the estimate of the interaction parameter $\beta_3$ suggests that among BRCA1/2
carriers, the risk of ovarian cancer increases slightly with OC use, with an odds ratio
exp(-0.0748) x exp(0.1091) = 1.03. The precision estimates and the credible intervals all
indicate that the main effect of BRCA1/2 and the BRCA-OC interaction are statistically
significant whereas the main effect of OC use is only marginally significant.
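The odds ratios quoted above follow directly from exponentiating the EDPM estimates in
Table 4-1. The short sketch below reproduces the arithmetic; the variable names are
illustrative only.

```python
import math

# EDPM posterior estimates from Table 4-1.
beta1, beta2, beta3 = 3.7537, -0.0748, 0.1091

or_carrier_no_oc = math.exp(beta1)       # carrier vs non-carrier among never-users (E = 0)
or_oc_noncarrier = math.exp(beta2)       # per year of OC use, non-carriers
or_oc_carrier = math.exp(beta2 + beta3)  # per year of OC use, carriers
```

The carrier odds ratio per year of OC use combines the main effect of OC and the
interaction on the log scale, which is why it exceeds one even though $\beta_2$ is negative.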

Figures 4-1 and 4-2 present plots of the posterior distributions for $\beta_1$, $\beta_2$,
$\beta_3$ and also for $\alpha$ and K for the EDPM method. To explore the degree of
stratification, I also present plots of var($\mu_i$) and var($\sigma_i^2$) in the EDPM model
(i = 1, ..., 24). I notice that the $\mu_i$'s and $\sigma_i^2$'s do vary across strata, the
variability in the $\sigma_i^2$'s being greater. The posterior mode of K is at 5 with
posterior mean around 5.76, suggesting that though there are 24 possible strata, not all of
them have distinct effects on the distribution of number of years of OC use.

I present another analysis with OC and parity both considered as environmental
exposures. I omit the details corresponding to this analysis and only collect the
results in Table 4-2. I note that for women with parity = 0 and OC = 0, BRCA1/2
mutation is associated with a huge increase in risk of ovarian cancer. Among BRCA1/2
non-carriers, higher parity is associated with decreased risk of ovarian cancer. The
parity*BRCA1/2 interaction estimate suggests that the decrease in risk of ovarian
cancer associated with increased parity is modestly larger for carriers than for
non-carriers, but this difference is not statistically significant.

Since for a real dataset, the true state of the parameters is unknown and it is

not really possible to compare the methods, I conduct an extensive simulation study

to assess the performances of the methods over a range of different scenarios and

provide recommendations for the practitioner.

4.4 Simulation

In order to simulate a dataset for comparing the Bayesian methods along with the
method proposed by CC, case-only analysis and ordinary logistic regression, I used
the ovarian cancer data as a prototype to elicit realistic true values of the parameters.
I set the true values close to the results I obtained in the analysis of real data by the
EDPM method in Table 4-1, i.e., $\beta_1 = 3$, $\beta_2 = -0.07$, and $\beta_3 = 0.12$.
I generated 1500 observations following the scheme below:

1. I started by generating S = (Age group, Ethnicity, PHB, FHBO) from
a multinomial distribution, where the stratum probabilities are consistent with the
real study.

2. Given S, I generated a binary variable D representing the disease status, with
probabilities P(D = 1|S) in agreement with the ovarian cancer study, the marginal
disease probability in the generated population being around 0.1. I also experimented
with several other choices of P(D = 1), for which the results are not included.

3. I generated the environmental exposure E from two distributions:

(i) A zero-inflated model exactly mimicking the exposure OC use as in the real
dataset. The true values of all associated parameters were chosen as the estimates
obtained from the real data when analyzed by the EZIM model.

(ii) Mixture of two normal distributions: To deviate from the exact pattern of real
data and to put the nonparametric and parametric methods to test, I considered the
case when [E|D = 0, Z = i] comes from the following mixture: 0.5 x N(2, 1) + 0.5 x [...]

4. Finally, I generated a binary variable G standing for BRCA1/2 mutation
status using the probability structure P(G|D, E, Z) as given in (4-4) and (4-5). I
select the true values for $\gamma_{i1}$ in such a way that Pr(G = 1|D = 0) is
approximately 3% and approximately 46%, to represent the two situations with a moderately
rare and a common genetic mutation respectively. I also provide one set of simulations
when G was generated from the parametric logistic regression model as in (4-8) (Table
4-4).










Apart from the above set-up, which assumes G-E independence, to test the robustness
of our model under violations of this assumption, I simulate G using the
model

$$\log\left\{\frac{\Pr(G = 1 \mid Z = i, E, D = 0)}{\Pr(G = 0 \mid Z = i, E, D = 0)}\right\} = \gamma_{i1} + \gamma_E E. \qquad (4\text{-}9)$$

To vary the degree of dependence I consider two choices, $\gamma_E = 0.1$ and
$\gamma_E = 0.25$; that is, the odds of having G = 1 increase by a factor of 1.105
and 1.284 respectively with a one-unit increase in E. Results for only $\gamma_E = 0.25$
are included in the text (Tables 4-3, 4-4 and 4-5). The strategies I followed for choosing
priors for the Bayesian methods in the simulation study are essentially the same as
discussed in the real data analysis. I replicated the simulation 100 times and calculated
MSE based on these 100 estimates. The results are given in Tables 4-3, 4-4 and 4-5.
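The odds multipliers quoted for the two choices of $\gamma_E$, and the MSE summary used in
the tables, amount to the following arithmetic. This is a sketch of the standard
definitions; the helper name `mse` is an assumption.

```python
import math

def mse(estimates, truth):
    """Mean squared error of replicate estimates around the true parameter value."""
    return sum((est - truth) ** 2 for est in estimates) / len(estimates)

# Multiplicative change in the odds of G = 1 per unit increase in E under (4-9).
odds_factors = {g: math.exp(g) for g in (0.1, 0.25)}
```

In the simulation, `mse` would be applied to the 100 replicate estimates of each parameter,
with the true values $\beta_1 = 3$, $\beta_2 = -0.07$ and $\beta_3 = 0.12$.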

The simulation results are fairly clear. If interest lies in estimating the main
effect of the genetic factor $\beta_1$, the Bayesian EDPM model has the smallest or close to
the smallest MSE for every choice of distributions of G and E considered. The fully
parametric Bayesian EZIM model suffers when E originates from any other model, for example
the mixture of two normal distributions (Tables 4-4 and 4-5). When the parametric logistic
assumption for P(G|S) does not hold, there is a clear dominance of the Bayesian methods over
the CC method for estimating $\beta_1$. Even when the data are generated from an exactly
logistic model for P(G|S) (Table 4-4), the Bayesian methods perform quite comparably
with the CC method. The efficiency gain (for estimating $\beta_1$) in the Bayesian methods
is larger when the genetic mutation rarely occurs in the control population (Tables
4-3 and 4-5), which could be due to the flexibility of the likelihood in modeling
the control distributions separately in the Bayesian methods, whereas CC model
the marginal distribution of G|S. If interest lies in estimating the main effect of
E, the CC method and the EDPM method are comparable, with the CC method
having a slight edge in some cases. One may note that the MSE corresponding to
$\beta_2$ for the EDPM model is often larger than for the other methods, as with the DPM
structure I am adding another level of model uncertainty. Indeed, the advantage of
the DPM is not a gain in efficiency for estimating $\beta_2$ across all scenarios,
but its robustness. One may note that instead of modeling P(E|S), CC
model P(E, S) nonparametrically. Their profile likelihood technique works extremely
well across many different data generating mechanisms for E. On a minor note, in a
small proportion of runs, there does appear to be a problem with the convergence
of their estimation algorithm, which appears to be related to the choice of starting
values. I excluded those runs when presenting the final tables for the simulation. For
estimating the G-E interaction $\beta_3$, one could choose either the case-only, EDPM or
CC method. When simultaneous estimation of all three parameters is considered, and
Assumption 1 is fairly reasonable, the EDPM model appears to be a superior choice.

Under violation of the independence assumption, the performance of all the methods
worsens (Tables 4-3, 4-4 and 4-5), but the loss of efficiency appears to be the least
for the EDPM model. The ordinary logistic regression model, which is least efficient
under G-E independence, especially for the interaction parameter, does not lose much
efficiency under violation of G-E independence as it does not impose any restrictions
on the G-E distribution.

4.5 Discussion

Epidemiologists have long grappled with the issue of how to measure interaction
in a biologically meaningful way and there is still no consensus in the literature
(Botto and Khoury, 2001). One must recall that the statistical interaction parameter
$\beta_3$ as in this chapter has a very specialized meaning which is related to the general
notion of "interaction" in the scientific community only in a vague way (Cox,
1984). "No statistical interaction" in our model means a constant multiplicative effect
of genotype on the disease odds across all levels of the environmental exposure. A
biologist might define "interaction" in a broader mechanistic sense, that interaction
exists if the genetic factor and environmental exposure work on the same pathway
(Brennan, 2002; Clayton and McKeigue, 2001). Assessing the joint effects of genetic
and environmental factors within strata defined by other variables may provide useful
insight into disease etiology and help to determine effective public health intervention
strategies. The article by CC is thus a major breakthrough which emphasizes that
case-control studies of gene-environment "interaction" go well beyond estimating the
statistical interaction parameter $\beta_3$, and that any design or analysis strategy should
allow one to estimate other parameters of interest rather than being
targeted only towards estimation of $\beta_3$. However, as emphasized throughout this
chapter, I recommend extremely cautious use of the independence assumption. Scientific
and empirical validation of this assumption is of utmost importance while using the
proposed methods.

To conclude, I would like to highlight some of the new features of the chapter.
In this chapter I proposed a fully flexible, robust Bayesian semiparametric model
for estimating not only the interaction parameter, but also the main effects under
gene-environment independence in a stratified control population. The method outperforms
the existing methods in many instances and performs comparably in others.
With a genetic mutation which has unequal frequencies in the case and control populations,
the ability to model them separately through the proposed likelihood has a natural
justification. When the G-E independence assumption does not hold, the method
performs better than the other contenders. This chapter not only addresses
an important problem in modern epidemiology, it also introduces some interesting
statistical techniques, especially for handling the high-dimensional stratum effects on
the genetic and environmental exposure distributions in a data-adaptive way. The
use of the DPM model as illustrated in Result 2, in conjunction with the transition from
control to case distribution, is a nice application of the theory on DP. Using prior
biological information on the frequencies of the genetic mutation reiterates the
fundamental advantage of following a Bayesian paradigm. The simulation study is an
additional asset of this chapter, comparing the Bayesian methods with the commonly
used frequentist methods and the recently proposed method by CC.

How to handle misclassification of G and measurement error in E will be discussed
in the next chapter. Ascertainment bias due to different control selection
mechanisms in the above framework remains a topic for future research.



















Table 4-1: Analysis of Israeli ovarian cancer data by all five methods, considering OC use as the only environmental exposure, with
95% HPD and confidence intervals

Model              Statistic    $\beta_1$            $\beta_2$            $\beta_3$            $\alpha$             K
EZIM               Estimate     3.7832               -0.0527              0.0910
                   post. stdev  0.1317               0.0243               0.0326
                   HPD          (3.4641, 3.9764)     (-0.1265, -0.0140)   (0.0270, 0.1482)
EDPM               Estimate     3.7537               -0.0748              0.1091               14.7455              5.7630
                   post. stdev  0.1294               0.0303               0.0352               5.83631              1.88363
                   HPD          (3.4358, 3.9310)     (-0.1409, -0.0151)   (0.0364, 0.1791)     (5.8913, 28.5666)    (2, 10)
CC                 Estimate     3.6323               -0.0624              0.1110
                   std. error   0.3999               0.0266               0.0341
                   CI           (2.8485, 4.4161)     (-0.1145, -0.0103)   (0.0442, 0.1778)
Ordinary Logistic  Estimate     3.7710               -0.0642              0.0476
                   std. error   0.4407               0.0268               0.0999
                   CI           (2.9072, 4.6348)     (-0.1167, -0.0117)   (-0.1482, 0.2434)
Case-Only          Estimate                                               0.0924
                   std. error                                             0.0329
                   CI                                                     (0.0279, 0.1569)



















Table 4-2: Analysis of Israeli ovarian cancer data by all five methods, considering both OC use and parity as environmental exposures,
with 95% HPD and confidence intervals

Model              Statistic    $\beta_G$            $\beta_{OC}$         $\beta_{Parity}$     $\beta_{OC*G}$       $\beta_{Parity*G}$
EZIM               Estimate     3.7877               -0.0829              -0.0369              0.1566               -0.0781
                   post. stdev  0.1573               0.0272               0.0304               0.0366               0.0427
                   HPD          (3.4937, 4.0909)     (-0.1491, -0.0379)   (-0.0864, 0.0119)    (0.0765, 0.2230)     (-0.1686, 0.0026)
EDPM               Estimate     3.8808               -0.0631              -0.0404              0.1360               -0.1072
                   post. stdev  0.1566               0.0202               0.0311               0.0331               0.0501
                   HPD          (3.5748, 4.1713)     (-0.1034, -0.0224)   (-0.0947, 0.0260)    (0.0823, 0.2123)     (-0.2207, 0.0031)
CC                 Estimate     3.8961               -0.0620              -0.0599              0.1128               -0.1041
                   std. error   0.4297               0.0267               0.0320               0.0344               0.0599
                   CI           (3.0539, 4.7383)     (-0.1143, -0.0097)   (-0.1227, 0.0029)    (0.0454, 0.1802)     (-0.2214, 0.0133)
Ordinary Logistic  Estimate     4.7321               -0.0582              -0.0388              0.0292               -0.3869
                   std. error   0.7411               0.0263               0.0317               0.1080               0.1481
                   CI           (3.2795, 6.1847)     (-0.1097, -0.0067)   (-0.1009, 0.0233)    (-0.1825, 0.2409)    (-0.6772, -0.0966)
Case-Only          Estimate                                                                    0.0931               -0.0565
                   std. error                                                                  0.0331               0.0591
                   CI                                                                          (0.0283, 0.1579)     (-0.1724, 0.0594)













Table 4-3: Simulation scenarios: E is zero-inflated; G: rare or common; G-E independence
assumption holds ($\gamma_E = 0$) or does not hold ($\gamma_E = 0.25$). Mean denotes the mean estimate
based on 100 replications, whereas MSE is the estimated mean squared error based on 100
replications.

G        $\gamma_E$  Model       Stat   $\beta_1$  $\beta_2$  $\beta_3$
                     True               3.0000     -0.0700    0.1200
Rare     0           EZIM        Mean   2.9777     -0.0623    0.1132
                                 MSE    0.0196     0.0006     0.0015
                     EDPM        Mean   2.9315     -0.0615    0.1140
                                 MSE    0.0242     0.0011     0.0016
                     CC          Mean   2.9013     -0.0630    0.1152
                                 MSE    0.2633     0.0006     0.0015
                     Ordinary    Mean   2.9053     -0.0632    0.1877
                                 MSE    0.3743     0.0006     0.0619
                     Case-Only   Mean                         0.1196
                                 MSE                          0.0017

Common   0           EZIM        Mean   2.9838     -0.0811    0.1282
                                 MSE    0.0099     0.0016     0.0014
                     EDPM        Mean   2.9722     -0.0834    0.1260
                                 MSE    0.0108     0.0018     0.0016
                     CC          Mean   2.7742     -0.0787    0.1239
                                 MSE    0.0805     0.0015     0.0012
                     Ordinary    Mean   2.8179     -0.0775    0.1214
                                 MSE    0.0663     0.0016     0.0026
                     Case-Only   Mean                         0.1296
                                 MSE                          0.0014

Common   0.25        EZIM        Mean   2.8643     -0.2752    0.3368
                                 MSE    0.0313     0.0489     0.0536
                     EDPM        Mean   2.8960     -0.1465    0.2119
                                 MSE    0.0218     0.0064     0.0090
                     CC          Mean   2.4190     -0.3116    0.3723
                                 MSE    0.3580     0.0645     0.0695
                     Ordinary    Mean   2.8006     -0.2586    0.1786
                                 MSE    0.0637     0.0426     0.0109
                     Case-Only   Mean                         0.3950
                                 MSE                          0.0822













Table 4-4: Simulation scenarios: E: Mixture of two normals; G: with parametric logistic
in terms of S as in (4-8) or commonly prevalent as in (4-4); G-E independence holds
($\gamma_E = 0$) or does not hold ($\gamma_E = 0.25$). Mean denotes the mean estimate based on 100
replications, whereas MSE is the estimated mean squared error based on 100 replications.

G model             $\gamma_E$  Model       Stat   $\beta_1$  $\beta_2$  $\beta_3$
                                True               3.0000     -0.0700    0.1200
Generated by (4-8)  0           EZIM        Mean   3.0000     -0.0447    0.1323
                                            MSE    0.0685     0.0798     0.0092
                                EDPM        Mean   2.9880     -0.0710    0.1290
                                            MSE    0.0588     0.0022     0.0022
                                CC          Mean   3.0185     -0.0782    0.1284
                                            MSE    0.0563     0.0015     0.0019
                                Ordinary    Mean   3.0157     -0.0793    0.1323
                                            MSE    0.1282     0.0017     0.0080
                                Case-Only   Mean                         0.1289
                                            MSE                          0.0018

Generated by (4-4)  0           EZIM        Mean   2.8213     -0.1227    0.1352
                                            MSE    0.0363     0.0886     0.0076
                                EDPM        Mean   2.9870     -0.0695    0.1189
                                            MSE    0.0303     0.0023     0.0021
                                CC          Mean   2.7542     -0.0713    0.1249
                                            MSE    0.1098     0.0017     0.0023
                                Ordinary    Mean   2.7683     -0.0722    0.1328
                                            MSE    0.1856     0.0019     0.0087
                                Case-Only   Mean                         0.1251
                                            MSE                          0.0023

Generated by (4-4)  0.25        EZIM        Mean   2.8581     -0.3114    0.3778
                                            MSE    0.1568     0.2127     0.1886
                                EDPM        Mean   2.8858     -0.2692    0.3473
                                            MSE    0.0449     0.0442     0.0549
                                CC          Mean   1.9287     -0.2984    0.3682
                                            MSE    0.8899     0.0563     0.0652
                                Ordinary    Mean   2.7678     -0.2509    0.1519
                                            MSE    0.1368     0.0374     0.0076
                                Case-Only   Mean                         0.3212
                                            MSE                          0.0885



















Table 4-5: Simulation scenarios: E: Mixture of two normals; G: rarely prevalent; G-E
independence holds (γ_E = 0) or does not hold (γ_E = 0.25). Mean denotes the mean estimate
based on 100 replications, whereas MSE is the estimated mean squared error based on 100
replications.


G       γ_E   E|D = 0, Z   Stat   β1        β2        β3
True value                        3.0000   -0.0700    0.1200

Rare    0     EZIM         Mean   2.9169   -0.0437    0.1587
                           MSE    0.0822    0.0311    0.0209
              EDPM         Mean   2.9294   -0.0674    0.1296
                           MSE    0.0670    0.0016    0.0039
              CC           Mean   2.8525   -0.0725    0.1335
                           MSE    0.2223    0.0009    0.0047
              Ordinary     Mean   2.8732   -0.0715    0.1450
                           MSE    0.8280    0.0010    0.0605
              Case-Only    Mean                       0.1340
                           MSE                        0.0045

Rare    0.25  EZIM         Mean   3.0717   -0.2041    0.3076
                           MSE    0.1688    0.1934    0.0481
              EDPM         Mean   3.1533   -0.1324    0.2990
                           MSE    0.0965    0.0056    0.0341
              CC           Mean   2.0735   -0.1444    0.3143
                           MSE    0.8476    0.0066    0.0398
              Ordinary     Mean   3.1541   -0.1393    0.0452
                           MSE    0.7112    0.0059    0.0432
              Case-Only    Mean                       0.3505
                           MSE                        0.0559
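
The Mean and MSE entries reported in the simulation tables can be illustrated with a short sketch. It assumes the usual definitions, namely that Mean is the average of the replicate estimates and MSE is the average squared deviation of the replicate estimates from the true parameter value; the text does not spell out the formula, so this is an assumption. The function name summarize and the simulated estimates are hypothetical and are not the simulation output behind the tables.

```python
import random
from statistics import fmean


def summarize(estimates, true_value):
    """Return (mean, mse) of replicate estimates around the true value."""
    mean = fmean(estimates)
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return mean, mse


# Hypothetical replicate estimates of the interaction parameter (true value 0.12),
# drawn from a normal distribution purely for illustration.
rng = random.Random(0)
true_beta3 = 0.12
estimates = [true_beta3 + rng.gauss(0.0, 0.04) for _ in range(100)]

mean, mse = summarize(estimates, true_beta3)

# The MSE splits into squared bias plus the variance of the estimates
# (variance taken with divisor R, the number of replications).
bias = mean - true_beta3
variance = fmean([(e - mean) ** 2 for e in estimates])
print(f"mean={mean:.4f}  mse={mse:.4f}  bias^2+var={bias ** 2 + variance:.4f}")
```

Under these definitions a method can show a small MSE either because it is nearly unbiased or because its estimates vary little, which is why the tables report the Mean alongside the MSE.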





Figure 4-1: Real data analyzed with the EDPM model by considering OC use as an
environmental exposure: Histogram of the last 5000 MCMC values for the main effects
and interaction parameter with overlaid smoothed kernel density.
[Figure panels: a) an approximate posterior distribution of β1; b) an approximate
posterior distribution of β2; c) an approximate posterior distribution of β3.]























































Figure 4-2: Details of the DPM model by considering OC use as an environmental
exposure: Histograms corresponding to the approximate posterior distributions of α and K
in the DPM model. Also plotted are histograms of the variances of the μ_i's and σ_i²'s,
i = 1, ..., 24, calculated for each of the last 5000 MCMC runs.
[Figure panels: a) an approximate posterior distribution of α; b) an approximate
posterior distribution of K; c) variability of μ_i; d) variability of σ_i².]















CHAPTER 5
ACCOUNTING FOR ERROR DUE TO MISCLASSIFICATION OF EXPOSURES
IN CASE-CONTROL STUDIES OF GENE-ENVIRONMENT INTERACTION

5.1 Introduction

Measurement error in exposure assessment is one of the major sources of bias in
epidemiological studies. When ignored, these errors bias point and interval estimates
of effect and invalidate p-values of hypothesis tests. Often, although not always, the
bias is towards the null value, underestimating the true exposure-disease relationship,
and there can be a substantial loss of power in hypothesis tests. The pervasiveness
and extent of these exposure measurement and misclassification errors in
epidemiologic research may explain many of the inconsistent and inconclusive results
currently reported in the literature.
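
The bias towards the null can be seen in a small worked example. The sketch below assumes a single binary exposure that is misclassified nondifferentially, that is, with the same sensitivity and specificity in cases and controls. The exposure prevalences, the sensitivity and the specificity are hypothetical values chosen for illustration and do not come from any study discussed here.

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def observed_prevalence(p, sensitivity, specificity):
    """Probability of being classified as exposed when the true exposure
    prevalence is p: true positives plus false positives."""
    return sensitivity * p + (1.0 - specificity) * (1.0 - p)


def odds_ratio(p_cases, p_controls):
    """Exposure odds ratio comparing cases with controls."""
    return odds(p_cases) / odds(p_controls)


# Hypothetical true exposure prevalences in cases and in controls.
p_cases, p_controls = 0.50, 0.25
# Hypothetical classification accuracy, identical in cases and controls.
sensitivity, specificity = 0.80, 0.90

true_or = odds_ratio(p_cases, p_controls)
observed_or = odds_ratio(
    observed_prevalence(p_cases, sensitivity, specificity),
    observed_prevalence(p_controls, sensitivity, specificity),
)
print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

With these inputs the odds ratio computed from the misclassified exposure lies between 1 and the true odds ratio, which is the attenuation described above. When the error rates differ between cases and controls, the bias can go in either direction, which is one reason the attenuation holds often but not always.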

Bashir and Duffy (1997) provided a general review of epidemiological methods
for dealing specifically with measurement error and misclassification. Greenland and
Kleinbaum (1983) proposed a simple two-stage procedure for estimating the odds
ratio in matched pairs, with a corrected variance estimator developed later by
Greenland (1989). Rice and Holmans (2003) obtained analytic formulae for estimates of
genotypic relative risks in terms of the genotyping error probability in the analysis of
unmatched case-control studies with a single binary genetic factor as exposure. Later,
Rice (2003) proposed a full likelihood-based approach to obtain estimates and
confidence intervals for the parameters of interest in the presence of misclassification of
a binary exposure in matched case-control studies. However, much of the discussion
of the effects of exposure misclassification has focused on the impact on the
relative risk and/or sample size in studies of a single factor. In contrast, less attention