MOLECULAR MECHANISMS UNDERLYING DEPRESSION: A MULTI-OMICS APPROACH

By

YUN ZHU

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2019
© 2019 Yun Zhu
To my Family
ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Jinying Zhao, for her incredible guidance and dedication, and for mentoring me on what it takes to become a researcher. She has inspired, guided, and challenged me every step of the way, and by doing so, motivated me to give my best. Her knowledge, expertise, and advice have been an inspiration for me, not only in my career as a scientist but also in becoming a better person every day. I am grateful to her for taking a chance on me and giving me the opportunity to be part of her group. It was through working with her that I gained the knowledge and tools to be successful in my future career, and I will always treasure that.

I would also like to express my thanks to my committee members, Dr. Catherine Striley, Dr. Mattia Prosperi, and Dr. Matthew Gurka, for their support and advice during my Ph.D. program. I could not have done this without their assistance.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER

1 INTRODUCTION
   Epidemiology of Major Depressive Disorder
      Disease Burden
      Risk Factors
         Genetic risk factors
         Findings from candidate gene-based studies
         Findings from GWAS
         Environmental factors
      Gene x Environment Interactions
         Epigenetics
         Metabolomics
         Gut microbiome
      Late-life depression
   Aims of Current Study
      Study Description
      The Mood Methylation Study Monozygotic Twin Pairs Discordant on MDD (Paper 1 & 3)
         Available clinical phenotypes
         Available multi-omics data
      The ROS/MAP study (Paper 2)
         Available clinical phenotypes
         Available multi-omics data

2 GENOME-WIDE PROFILING OF DNA METHYLOME AND TRANSCRIPTOME IN PERIPHERAL BLOOD MONOCYTES FOR MAJOR DEPRESSION: A MONOZYGOTIC DISCORDANT TWIN STUDY
   Methods
      Twin Pairs
      MDD Diagnosis
      Inclusion/Exclusion Criteria
      Other Measures
      Monocyte Isolation and DNA/RNA Extraction
      DNA Methylation Profiling
      Methylation Data Pre-processing and QC
      Transcriptome Profiling by RNA-seq
      Replication in Brain
      Statistical Analysis
         Identifying differentially methylated regions (DMRs) associated with MDD
         Differentially expressed genes (DEGs) associated with MDD
         Integrated methylome and transcriptome analysis
         Methods used for the replication in the brain
         Co-methylation and co-expression networks
         Functional enrichment analysis
         Sensitivity analysis
         Control for multiple comparisons
   Results
      DMRs Associated with a Lifetime History of MDD
      DEGs Associated with MDD
      Replication in Brain
      Genome-wide Integration of DNA Methylome and Transcriptome in Blood Monocytes
      Differential Network Analysis
      Functional Enrichment Analysis
      Sensitivity Analysis
   Discussion

3 GENOME-WIDE PROFILING OF DNA METHYLOME FOR LATE-LIFE DEPRESSIVE SYMPTOMS
   Methods
      Study Participants
         Clinical Evaluation
         Late-Life Depressive Symptoms Assessment
         DNA Methylation Data Assessment
         Gene Expression by RNA-seq
      Statistical Analysis
         Identifying differentially methylated probes (DMPs) associated with late-life depressive symptoms
         Identifying differentially methylated regions (DMRs) associated with late-life depressive symptoms
         Correlation between differentially methylated probes and their cis expression
         Functional enrichment analysis
         Co-methylation networks
         Sensitivity analysis
         Multiple testing
   Results
      Late-Life Depressive Symptoms Associated DMPs
      Late-Life Depressive Symptoms Associated DMRs
      Correlation between Methylation Level of DMR Genes and cis Gene Expression
      Functional Enrichment Analysis
      Sensitivity Analysis
   Discussion

4 GUT MICROBIOME AND BLOOD METABOLOME PROFILES FOR MAJOR DEPRESSION: FINDINGS FROM A MONOZYGOTIC DISCORDANT TWIN STUDY
   Method
      Study Population
      Inclusion/Exclusion Criteria
      Other Measures
      Gut Microbiome Profiling by 16S rRNA Sequencing
         Stool sample collection
         16S rRNA sequencing
         Quality control
         OTU annotation
      Plasma Metabolomics Analysis by LC-MS
      Statistical Analysis
         Gut microbiome diversity
         Identifying MDD-associated gut microbiota
         Gut microbiota discriminating individuals with MDD from control
         Relationship between gut- and diet-related plasma metabolites and MDD
         Relationship between gut-derived metabolites and gut microbiome
         Functional analysis
         Sensitivity analysis
   Results
      Microbiome Composition in Depressed and Non-depressed Participants
      Microbiome Diversity Associated with MDD
      Taxa Associated with MDD
      Separation of Depressed and Non-depressed Twins using Microbiome Profiles
      Relationship Between Gut and Gut-Derived Metabolites
      Functional Analysis
      Sensitivity Analysis
   Discussion

5 SUMMARY
   Findings from EWAS in Young MZ Twins
   Findings from EWAS in Elderly Individuals
   Findings from Microbiome Analysis in Young MZ Twins
   Limitations
   Strengths
   Conclusion and Future Direction

LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES

Table
2-1 QC information for RNA-seq
2-2 Clinical characteristics of twin pairs participating in the MMS
2-3 Significant DMRs associated with lifetime history of MDD in MZ discordant twin pairs
2-4 Significant DEGs associated with lifetime history of MDD in MZ discordant twin pairs
2-5 Significant CpG probes replicated in the brain
2-6 Replication of differentially expressed genes (DEGs) in the brain
2-7 List of significant correlation pairs between DNA methylation and cis-acting gene expression in peripheral blood monocytes
2-8 Co-methylation modules along with hub genes and biological pathways
2-9 Co-expression modules along with hub genes and biological pathways
2-10 Pathway enrichment for DMRs with nominal associations with MDD
2-11 Results for sensitivity analysis of the identified DMRs
2-12 Results for sensitivity analysis of the identified DEGs
3-1 Characteristics of study participants
3-2 CpGs associated with late-life depression symptoms
3-3 DMRs associated with late-life depression symptoms
3-4 Cis correlation between DMR genes and gene expression
3-5 Pathway enrichment for putative DMRs with nominal associations with MDD (P<0.001)
3-6 Co-methylation modules
4-1 Clinical characteristics of twin pairs participating in the MMS (N=74)
4-2 Genus-level OTUs associated with MDD
4-3 Gut-derived metabolites associated with MDD
4-4 Gut metabolic pathways enriched with differential microbiomes
4-5 Sensitivity analysis of genus-level OTUs associated with MDD
4-6 Sensitivity analysis of gut-derived metabolites associated with MDD
LIST OF FIGURES

Figure
2-1 Manhattan plot displaying the DMRs associated with MDD in monozygotic discordant twin pairs (N=79 pairs).
2-2 Genomic distribution and CpG contents of identified DMRs.
2-3 Manhattan plot displaying the DEGs associated with MDD in monozygotic discordant twin pairs (N=79 pairs).
2-4 Circos plot showing the genome-wide relationship between DNA methylation and gene expression in peripheral blood monocytes in relation to major depression.
2-5 Genome-wide partial correlation patterns between DNA methylation and cis expression (±5 kb).
2-6 The largest co-methylation module associated with MDD in depressed twins in comparison to their non-depressed co-twins.
2-7 The largest co-expression module for the identified DEGs.
2-8 Tissue/cell type enrichment of the identified DMRs.
3-1 Manhattan plot of EWAS for late-life depression symptoms.
3-2 Genomic distribution and CpG contents of identified DMRs.
3-3 The largest co-methylation modules for the identified DMGs.
4-1 Overall gut microbial composition at the class level. The relative abundance of gut taxa is shown as the height of the bar for depressed twins (red) and non-depressed twins (blue).
4-2 Microbiome diversity among depressed twins and non-depressed twins.
4-3 Hierarchical clustering for the genus-level abundance of MDD-related gut microbiome. Depressed participants (blue labels) were clustered together.
4-4 t-SNE plot showing that depressed twins can be separated from their non-depressed co-twins based on their gut microbiome composition.
4-5 Correlation patterns between metabolites and microbiome taxa associated with MDD.
LIST OF ABBREVIATIONS

BDNF      Brain-derived neurotrophic factor
CESD      Center for Epidemiologic Studies Depression Scale
CNS       Central nervous system
DEG       Differentially expressed gene
DMG       Differentially methylated gene
DMP       Differentially methylated probe
DMR       Differentially methylated region
DNA       Deoxyribonucleic acid
ENS       Enteric nervous system
EOD       Early-onset depression
FC        Fold change
GBA       Gut-brain axis
GWAS      Genome-wide association study
MAP       Memory and Aging Project
MDD       Major depressive disorder
MMS       Mood Methylation Study
NIMH      National Institute of Mental Health
OTU       Operational taxonomic unit
PC        Phosphatidylcholine
PTSD      Post-traumatic stress disorder
RADC      University
RNA       Ribonucleic acid
RNA-seq   RNA sequencing
ROS       Religious Orders Study
SM        Sphingomyelin
WHO       World Health Organization
WGBS      Genome
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

MOLECULAR MECHANISMS UNDERLYING DEPRESSION: A MULTI-OMICS APPROACH

By

Yun Zhu

May 2019

Chair: Jinying Zhao
Major: Epidemiology

Major depressive disorder (MDD) is a debilitating mental disorder that affects more than 350 million people worldwide. The prevalence of MDD varies by gender, age, and racial/ethnic group. Twin studies suggest that the heritability of MDD is as high as 50%. As such, substantial effort has been made to identify genetic factors underlying MDD. However, all genetic variants identified so far explain only a small fraction of MDD variability, suggesting that environmental factors are also important contributors to depression pathogenesis. To date, the molecular mechanisms underlying MDD remain elusive.

Leveraging the rich clinical and multi-omics data collected in two well-characterized, community-based population cohorts, the Mood Methylation Study (MMS) and the Religious Orders Study and Memory and Aging Project (ROSMAP), I conducted integrated multi-omics data analyses to understand the molecular mechanisms of depression in younger and older adults. In a well-matched monozygotic (MZ) twin sample including 79 MZ twin pairs discordant on MDD (mean age 38.2 years), I conducted integrated DNA methylation and gene expression analysis to identify differentially methylated regions (DMRs) and differentially expressed genes (DEGs) associated with lifetime history of MDD in younger adults. The identified genes were enriched in pathways related to neuronal function and stress responses, and were overrepresented in MDD GWAS loci. Because the etiology of depression may differ between younger and older age groups, I conducted an epigenome-wide association study (EWAS) to identify differentially methylated genes associated with late-life depressive symptoms. While several identified genes overlapped with differentially methylated genes in younger adults, most genes were specific to late-life depressive symptoms. These findings support the hypothesis that the molecular mechanisms underlying depression in younger and older individuals could be different. Using gut microbiome data collected in 37 MZ twin pairs discordant on MDD, I found that microbiome-related metabolic pathways in plasma, such as sphingolipid metabolism, D-glutamine and D-glutamate metabolism, and histidine metabolism, were significantly associated with lifetime history of MDD.

In summary, this dissertation identified key genes and molecular pathways associated with MDD. These findings enhance our understanding of depression and are likely to provide novel biomarkers for depression and related disorders.
CHAPTER 1
INTRODUCTION

Major depressive disorder (MDD) is a debilitating mental disorder characterized by at least one discrete depressive episode lasting at least two weeks and involving significant changes in mood, interests, cognitive function, and vegetative symptoms, such as disturbed sleep or appetite. MDD is one of the major contributors to the global burden of disease [2]. Despite significant efforts, the pathogenesis of depression remains an enigma. Recent advances in molecular technology have generated multi-level omics data that allow for a better understanding of the molecular mechanisms underlying MDD. In this dissertation, I utilized the multi-omics data collected in two well-characterized, community-based population cohorts to understand the molecular mechanisms of depression and to identify biomarkers for it.

Epidemiology of Major Depressive Disorder

Disease Burden

MDD affects more than 16.1 million American adults. Globally, over 350 million people of all ages suffer from depression, and about 10% to 15% of the general population will experience clinical depression during their lifetime [6,7]. A recent analysis including more than one million participants worldwide showed that the 12-month and lifetime prevalence of MDD are 7.2% and 10.8%, respectively. These estimates vary significantly across racial/ethnic groups. According to the National Institute of Mental Health (NIMH), the 12-month prevalence of major depressive episodes in the U.S. was highest among adults reporting two or more races (10.5%), followed by American Indians/Alaskan Natives (8.7%) and Whites (7.4%). Asians have a relatively lower 12-month prevalence of MDD (3.9%). Besides racial differences, MDD also shows a clear gender difference. Women have twice the lifetime rate of depression, and more anxiety disorders, than men [5,9,10]. Moreover, the prevalence of MDD varies greatly across age groups. According to NIMH, the 12-month prevalence was highest among individuals aged 18-25 (10.9%), followed by individuals aged 26-49 (7.5%), adults aged 50-64 (5.3%), and elderly adults aged 65 or more (4.5%) [7,11].

According to the American Academy of Suicidology, the risk of suicide is about 20 times greater among people with major depression, and about two-thirds of people who commit suicide are depressed at the time of their death. Moreover, depressed individuals suffer from higher rates of many chronic diseases, such as diabetes, cardiovascular disease, and dementia. Overall, MDD increases the mortality risk by 60-80%, accounting for 10% of all-cause mortality worldwide [16,17].

Risk Factors

The etiology of depression is highly complex, involving genes, environment, and gene-environment interactions [18,19]. Twin studies suggest a heritability of 40% to 50% [4,20-23], and family studies indicate a 2- to 3-fold increase in lifetime risk of MDD among first-degree relatives. Although substantial effort has been made to identify genetic factors associated with MDD, the identified genetic variants together explain only a small fraction of the heritability [18,25], suggesting that environmental factors, especially life stress [26-28], also contribute to MDD pathology. A deeper understanding of its mechanisms is key to developing effective and targeted strategies against this debilitating disorder.
Genetic risk factors

Twin and family studies have shown that genetic factors play an important role in MDD pathogenesis [18,21,29-31]. The heritability of MDD was estimated to be as high as 40-50% [4,20-23]. As such, substantial efforts have been devoted to identifying susceptibility genes associated with MDD. Below I summarize the major findings from candidate gene-based studies and genome-wide association studies (GWAS).

Findings from candidate gene-based studies

Many earlier genetic studies on depression employed a candidate gene-based approach [32-37]. This approach focuses on biological pathways known to be involved in depression and stress responses, such as the HPA axis [38,39], and has identified genetic polymorphisms in key candidate genes associated with depression. Of these, the serotonin transporter gene, SLC6A4, is the most well-established candidate gene for MDD [40,41]. This gene has two alleles (S, L), with the S allele being related to lower expression of serotonin transporter proteins. Subjects carrying two copies of the short allele (genotype SS) have a 1.3-fold higher risk of developing MDD compared to those carrying one or zero copies of this allele [42,43]. Interestingly, the S allele interacts with early-life stress to increase the risk of MDD. Another candidate gene is the brain-derived neurotrophic factor (BDNF) gene, which is involved in activity-dependent neuronal plasticity. Decreased activity of BDNF negatively influences mood and recovery from depressed mood [45,46]. Genetic polymorphisms in the BDNF gene have been associated with MDD [47-49]. Other candidate genes for MDD include HTR2A, TPH2, GABA, etc. [50-55]. Although the candidate gene-based approach has been very successful in identifying genetic variants associated with MDD, this method is hypothesis-dependent and cannot discover new genes associated with human complex
diseases. As such, hypothesis-free methods such as GWAS are needed to search for susceptibility genes at a genome-wide scale.

Findings from GWAS

With the rapid advance of SNP genotyping technologies, the genome-wide association study (GWAS) has become popular in genetic studies of human complex diseases. This hypothesis-free approach generates genotype data for millions of variants across the genome and has been very successful in identifying genetic variants associated with human complex diseases [56-63]. However, because the effect of a single variant is generally small, a large sample size is required to achieve adequate statistical power for GWAS of a complex disease or trait [64,65]. Despite successes in similarly sized GWAS of schizophrenia and bipolar disorder (BPD), GWAS for MDD has proven to be difficult. A recent large-scale meta-analysis including over 9,000 MDD patients failed to identify any replicable associations. Another meta-analysis comprising 34,549 individuals from 17 population-based studies also failed to identify SNPs at the genome-wide significance level. More recently, a GWAS meta-analysis based on the Psychiatric Genomics Consortium identified 44 loci in a combined sample of 135,458 cases and 344,901 controls across 7 large MDD cohorts of European ancestry. The difficulty in identifying susceptibility genes for MDD could be largely attributable to the high heterogeneity in clinical phenotypes, depression subtypes, and environmental factors.

Environmental factors

Although the heritability of MDD is estimated at 40% to 50% [20,21,70], the genetic basis of MDD remains elusive. Twin studies indicated that environmental factors
account for over 60% of the inter-individual variability in MDD [21,71]. Epidemiological studies have identified many environmental risk factors for depression, including low socioeconomic status (SES) [72-74], negative family relationships [24,72,75], childhood maltreatment [26,76-78], and stressful life events [79-83]. Of these, exposure to early-life adversity may represent one of the most important environmental factors implicated in MDD. For instance, it has been shown that children exposed to sexual abuse have a 1.37-fold higher risk of MDD later in life. Thus, it is of great interest to examine how early-life adversities become biologically embedded ("get under the skin") and translate into depression risk later in life.

Gene x Environment Interactions

Gene x environment interactions, whereby genetic effects are moderated by specific environmental factors, have long been postulated to play an important role in depression. During periods of heightened neural plasticity throughout development, brain regions involved in the regulation of the stress response appear to be affected by stressful events. Such experience-dependent plasticity may alter neural circuits and produce maladaptive responsiveness to the environment, leading to an increased risk for depression. Moreover, early-life stress was found to interact with several genes, including SLC6A4, DAT1, and FKBP5, to increase the risk of MDD later in life. Several mechanisms have been postulated to underlie the gene x environment interactions involved in depression and other neuropsychiatric diseases, including DNA methylation, miRNAs, metabolomics, and gut microbiota [91,92]. Deciphering these molecular mechanisms is key to understanding depression pathogenesis and identifying novel biomarkers and druggable targets for effective interventions.
Epigenetics

Epigenetics refers to changes in gene expression that occur without altering the DNA sequence. Epigenetic mechanisms, especially DNA methylation, provide a mechanism through which environmental factors influence the genome [93,94], and could also explain many clinical features of depression, such as correlation with adverse life events, discordance in identical twin pairs, sex differences, and the phenomena of remission and, in many individuals, subsequent relapse. Investigating epigenetic factors may thus allow an integrated view of how genetic and environmental factors alter risk of disease.

Several studies have shown that early-life adversity leaves long-lasting marks in the epigenome [26,96,97]. Epigenetic modifications and resulting changes in gene expression have been reported in rodent models after exposure to stress [98-101]. In addition, early-life trauma modifies DNA methylation in the promoter regions of the NR3C1, SLC6A4, and BDNF genes in human brain and peripheral blood [42,102-104]. These findings indicate that stressful environmental factors may contribute to depression by altering DNA methylation and other epigenetic mechanisms.

Several studies have reported associations of altered DNA methylation with MDD [105-107]. Given the close relationship between the BDNF gene and MDD, several studies have investigated DNA methylation of the BDNF pathway in depression. Keller and colleagues found that DNA methylation of the promoter region of the BDNF gene was significantly higher in depressed subjects who committed suicide than in non-depressed subjects who died in accidents. In a recent EWAS including 7,948 middle-aged individuals, three CpG sites in the CDC42BPB gene and the ARHGEF3
gene were associated with major depression. While existing studies clearly suggest that aberrant DNA methylation is associated with MDD, results are mixed and no genes have been conclusively identified so far.

Metabolomics

Metabolites, the final products of interactions between genetic and environmental factors, may serve as biomarkers for human complex diseases including depression [110-112]. Recent advances in technology have led to the emergence of metabolomics, an innovative high-throughput bioanalytical method that aims to identify and quantify thousands of molecules present in a biological sample simultaneously. It has been increasingly used as a versatile tool for the discovery of molecular biomarkers in many diseases, including MDD. To date, several metabolomics studies have been conducted to identify metabolites associated with MDD using unrelated individuals [90,111,114-116]. These findings suggest that there are significant changes in human metabolomic profiles related to MDD, and that metabolomics, as a hypothesis-free approach, is a powerful tool for discovering novel molecules involved in the pathophysiology of MDD.

Gut microbiome

Cumulative evidence suggests that the gut-brain axis (GBA) plays an important role in maintaining the homeostasis of the central nervous system (CNS) [117,118]. The GBA is a dynamic matrix of tissues and organs, including the brain, glands, gut, immune cells, and gastrointestinal microbiota, that communicate in a complex multidirectional manner to maintain homeostasis. Dysbiosis of the gut microbiome can lead to a broad spectrum of physiological and behavioral effects, including hypothalamic-pituitary-adrenal (HPA) axis activation, altered activity of neurotransmitter systems, and immune
functions. Indeed, animal studies have shown that microbiome abundances are associated with MDD. For example, mice exposed to stress showed increased levels of Alistipes, Odoribacter, Roseburia, and Clostridium, but decreased levels of Parabacteroides, Coprococcus, and Dorea [120-122]. In addition, germ-free (GF) mice that received fecal microbiota transplantation from patients with depression showed increased depression-like behaviors compared to GF mice colonized with microbiota derived from non-depressed individuals [91,123]. Microbiome composition has also been associated with depressive symptoms in mouse models. Antibiotic treatment induced substantial shifts in the mouse gut microbiota, which resulted in depressive behaviors [125-127]. Moreover, probiotics have the potential to reduce stress and stress-related disorders such as depression [127,128].

In humans, previous studies have shown that the composition of the gut microbiome in individuals with MDD was significantly different from that in healthy controls [91,129]. Compared to non-depressed individuals, patients with depression exhibited increased levels of Parabacteroides, Paraprevotella, Anaerofilum, Blautia, and Gelria and decreased levels of Ruminococcus, Faecalibacterium, and Bacteroides [91,123,129,130].

Late life depression

Late-life depression is a heterogeneous mood disorder defined as a major depressive episode occurring in an older adult (65 years or older). Most previous studies on depression have focused on young or middle-aged individuals [43,69,131-133]. However, the etiology of depression in older individuals (>65 years) may be different from that of depression in younger individuals. With the aging of the population worldwide, late-life depressive symptoms have become a focus of depression research. A recent meta-analysis of adults aged 50 years or older found that the
prevalence of depressive symptoms is near 20%. However, little research has been done to decipher the molecular mechanisms underlying late-life depressive symptoms.

Over time, the relative contribution of genes to a phenotype decreases [136,137]. This decrease may be due to the accumulation of environmental insults, which tends to increase the total phenotypic variance, resulting in lower heritability over time. In the Framingham Study, researchers found that age-stratified heritability estimates for complex traits decreased with aging. As with other chronic diseases, environmental factors play a more important role than genetic effects in MDD among elders. The heritability of MDD in older populations (age 60) decreased to 18%, compared to 30-50% in younger populations, suggesting that cumulative exposure to the environment and gene-environment interactions may dilute genetic effects with aging. Furthermore, the heterogeneity in depression also increases with aging and the accompanying chronic medical conditions. Thus, it is of particular importance to understand the mechanisms underlying late-life depressive symptoms. The epigenome is at the interface between the genome and the environment, so studying it offer clues about different ways in only 1 loci that was previously known in younger individuals with depression. Further research is necessary to assess the role of epigenetics in late-life depression.

Aims of Current Study

The etiology of MDD is highly complex, involving both genes and the environment as well as their interactions. Despite substantial progress over the past decades, our
understanding of its pathophysiology remains incomplete. Because recognizing and confirming MDD symptoms relies heavily on skilled psychiatrists, it is critical to identify reliable biomarkers for disease diagnosis through a thorough understanding of its molecular mechanisms. The central aim of this dissertation is to identify the potential molecular mechanisms underlying depression using a multi-omics approach in two well-characterized population-based cohorts. The specific aims of each study are to:

1. Chapter 2: Identify epigenetic and transcriptomic factors associated with MDD in younger adults (mean age 38). I hypothesized that altered epigenome and transcriptome profiles are associated with depression in younger adults. This analysis uses DNA methylation data (Illumina MethylationEPIC BeadChips) and gene expression data (RNA-seq) generated in peripheral blood monocytes from 79 MZ twin pairs discordant on MDD. I conducted an EWAS to identify differentially methylated regions associated with lifetime history of MDD, followed by a genome-wide gene expression analysis to identify differentially expressed genes associated with MDD. Lifetime history of MDD was used as the major outcome in the analysis because the study was designed to recruit twins discordant on lifetime history of MDD.

2. Chapter 3: Identify epigenetic factors associated with depressive symptoms in elderly adults (mean age 88). I hypothesized that altered epigenome profiles are associated with depression in elderly adults, and that the altered profiles may differ from those in younger individuals. This analysis uses DNA methylation data (Illumina Methylation450K BeadChips) and gene expression data (RNA-seq) generated in
postmortem brain tissue from 708 individuals. I conducted an EWAS to identify differentially methylated regions associated with late-life depression. Because age at onset is hard to assess in ROS/MAP, participants with chronic depression cannot be distinguished from those with late-onset depression. Here I used depressive symptoms during follow-up (9 years) as the major outcome.

3. Chapter 4: Identify gut microbiota profiles associated with MDD in younger adults (mean age 38). I hypothesized that dysbiosis of the gut microbiota plays a role in depression. This analysis uses 16S rRNA gene sequencing data generated from fecal samples and metabolomics data generated from host plasma in 37 MZ twin pairs discordant on MDD. I compared the diversity of the gut microbiome in depressed and non-depressed twins, and examined the specific taxa associated with depression. The relationship between the abundance of gut taxa and gut-derived metabolites was also assessed. As in aim 1, lifetime history of depression was used as the major outcome.

Study Description

The Mood and Methylation Study: Monozygotic Twin Pairs Discordant on MDD (Papers 1 & 3)

The current analysis included 79 MZ twin pairs discordant for lifetime history of MDD. Twins included in the present analysis were enrolled through the Mood and Methylation Study (MMS), an observational study designed to identify functional epigenetic determinants of MDD using a co-twin control design. Detailed information on the larger MMS study, including study design, twin recruitment, clinical examination, and sample collection, is currently being considered for publication elsewhere. For the present study, all twins were members of the Washington State Twin Registry (WSTR), a community-based twin registry consisting of over 9,000 twin pairs. All
participants provided informed consent for the study procedures, and all procedures were approved by the University of Washington Institutional Review Board. Zygosity was determined using SNP control probes (n = 59) located on the Illumina MethylationEPIC BeadChip. Only complete twin pairs were eligible for this study. Because the demographics of the WSTR are approximately 85% Caucasian (which reflects Washington State generally), examining race as a moderator variable was not possible. Therefore, all twin pairs recruited for the study were Caucasian. In addition to pairwise lifetime MDD discordance, the specific inclusion criteria were: (1) monozygosity determined by DNA analysis; (2) age 18 years and above; (3) reared together; (4) BDI-II score <13 for the non-depressed twin; and (5) willingness to provide blood samples. Twins who did not meet these criteria were excluded from the study. The primary exclusionary conditions for both twins in the pair included schizophrenia, bipolar disorder I and II, current substance use disorder, cancer within the previous 5 years, autoimmune disorders, uncontrolled endocrine disorders, and uncontrolled sleep apnea.

Available clinical phenotypes

Lifetime and current major depression were determined using the relevant sections of the Structured Clinical Interview, Research Version, for the DSM-IV (SCID-RV), which was administered by a clinical psychologist. The SCID-RV is a semi-structured interview guide for making Axis I DSM-IV diagnoses and was the most up-to-date version at the start of the study. There are no substantive differences in the criteria for MDD between DSM-IV and DSM-5. Twins participating in this study completed the Mood Episodes, Psychotic and Associated Symptoms, Psychotic Disorders, Mood Disorders, Substance Use Disorders, and Anxiety Disorders modules. Any questions or
concerns about the depression phenotype or exclusionary psychiatric conditions were referred to the senior psychiatrist for evaluation and final decision making. The same inclusion and exclusion criteria (see below) and review processes were applied to both twins in a pair. Interviewers were blinded to the clinical information about the co-twin. A discordant pair was defined as a twin pair in which one twin met the criteria for MDD but his/her co-twin did not.

After signing the informed consent, each participant was given a physical exam and anthropometric measurements were taken. They were also asked to complete standard questionnaires on sociodemographic factors, lifestyle, use of psychiatric medications, and history of diseases. Information on the severity of current depressive symptoms (assessed by the Beck Depression Inventory II and the Quick Inventory of Depressive Symptomatology) and other psychometric measures was also collected.

Available multi-omics data

Because different cell types may present different epigenetic or gene expression profiles, it is crucial to isolate specific cell types for epigenetic studies. In this study, monocytes were isolated from fresh peripheral blood, and monocyte purity was the sample. Genome-wide DNA methylation analysis was performed using the Infinium HumanMethylationEPIC BeadChip (Illumina Inc., San Diego, CA) as previously described. Gene expression level was quantified by paired-end RNA-seq (100 PE) in blood monocytes of the same twins. Relative abundance of fasting plasma metabolites was measured using a mixed targeted and untargeted high-resolution LC-MS approach. The raw peak data were normalized using the total ion chromatogram (TIC) of the known compounds to avoid using potential non-biological artifacts for the
biological normalizations. Further batch effects were corrected using the QC sample in each batch. Gut microbiome data were measured in self-collected stool samples using 16S rDNA sequencing at CMMR. 16Sv4 rDNA gene sequences were clustered into Operational Taxonomic Units (OTUs) at a similarity cutoff value of 97%. A custom script constructed a rarefied OTU table from the output files generated in the previous two steps for downstream analyses of alpha diversity, beta diversity, and phylogenetic trends.

The ROS/MAP study (Paper 2)

The study included deceased participants from two ongoing, prospective studies of brain aging and dementia in older individuals:

Religious Orders Study (ROS): Initiated in 1994, ROS enrolled older Catholic priests, nuns, and brothers from across the USA who were free of known dementia at the time of enrollment. Participants agreed to annual clinical evaluations, including standardized neurological examination and neuropsychological testing, and signed both an informed consent and an Anatomic Gift Act donating their brains at the time of death. Both the follow-up rate of survivors and the autopsy rate exceed 90% (autopsies of deaths).

Rush Memory and Aging Project (MAP): Established in 1997, MAP consists of older men and women from across the Chicagoland area without known dementia at enrollment. Participants agreed to annual clinical evaluations and signed both an informed consent and an Anatomic Gift Act donating their brains, spinal cords, and selected nerves and muscles at the time of death. The follow-up rate for survivors exceeds 90% and the autopsy rate for deceased subjects exceeds 80% (autopsies of deaths).
Available clinical phenotypes

Depression symptoms were measured using the self-reported Center for Epidemiologic Studies Depression Scale (CESD). The CESD measures symptoms defined by DSM-IV for a major depressive episode and thus served as a screening test for depression. Each participant was given an annual physical exam and anthropometric measurements were taken. They were also asked to complete standard questionnaires on sociodemographic factors, lifestyle, use of psychiatric medications, and history of diseases at annual follow-up.

Available multi-omics data

DNA methylation data were measured using the Infinium HumanMethylation450 BeadChip in brain tissue from the dorsolateral prefrontal region. Data pre-processing, QC, and normalization were performed using methods previously described. The final data included a total of 441,930 probes measured in 706 subjects. Gene expression data were measured in the same brain region using paired-end RNA sequencing.
CHAPTER 2
GENOME-WIDE PROFILING OF DNA METHYLOME AND TRANSCRIPTOME IN PERIPHERAL BLOOD MONOCYTES FOR MAJOR DEPRESSION: A MONOZYGOTIC DISCORDANT TWIN STUDY

Major depressive disorder (MDD) affects over 350 million people worldwide and is projected to be the leading cause of disease burden by 2030. Although the etiology of MDD involves genetic and environmental factors, the precise underlying mechanisms remain poorly understood. Epigenetic processes such as DNA methylation provide a mechanism through which environmental factors influence the genome [93,147], and could also explain many clinical features of depression, such as close correlation with adverse life events, discordance in identical twin pairs, sex differences, and the phenomena of remission and, in many individuals, subsequent relapse [148,149]. Indeed, a growing body of evidence has implicated altered DNA methylation in MDD [88,150] and other psychiatric disorders such as schizophrenia and bipolar disorder. However, despite the current promise of epigenetic research, the specific genes and biological pathways underlying MDD remain unclear.

DNA methylation is influenced by genetic, early-life social environment, and behavioral factors [97,153], and is tissue and cell-type specific. Thus, establishing the relationship between DNA methylation and mental illnesses such as MDD requires control of these potential confounding variables. Monozygotic (MZ) discordant twin pairs provide a powerful tool to examine the role of epigenetic mechanisms in depression because they are matched on genotype, age, and sex. Moreover, identical twins reared together share an early-life environment, including pre- and perinatal conditions, socioeconomic status, and access to quality education, all of which may
contribute to the risk of depression later in life [155-158].

At the time of this writing, several epigenome-wide association studies (EWAS) have been conducted to identify differentially methylated genes/regions associated with MDD using either unrelated individuals [88,159] or small numbers of identical twin pairs [147,159,160]. However, almost all previous studies utilized heterogeneous cell types from blood, buccal, or postmortem brain tissue samples. As different cell types have distinct DNA methylation profiles, the use of homogeneous cell types such as purified monocytes should minimize confounding by cellular heterogeneity in epigenetic research [103,154]. Moreover, because monocytes are key innate immune cells involved in inflammation, a mechanism known to be implicated in MDD, identifying methylation changes in circulating monocytes is likely to provide mechanistic insight into disease pathogenesis [161,162]. Further, the potential functional consequences of epigenetically altered genes on gene expression have not been adequately evaluated in previous studies.

Here we report findings from integrated genome-wide profiling of the DNA methylome and transcriptome using DNA and RNA isolated from purified blood monocytes in 79 MZ twin pairs discordant for lifetime history of MDD, followed by replication in brain tissue samples and network analysis. Our goal is to identify key candidate genes and novel pathways associated with MDD.

Methods

Twin Pairs

This study included 79 MZ twin pairs discordant for lifetime history of MDD. Twins were enrolled from the community-based Washington State Twin Registry (WSTR; formerly the University of Washington Twin Registry), as part of an ongoing effort to identify twin pairs within the Registry in which one or both members have a
history of MDD. Detailed methods for the construction of the registry and enrollment of twin pairs have been described previously. Briefly, recruitment into the WSTR itself relies on the initial identification of twins from the Washington State Department of Licensing (DOL), which asks all driver license and state ID applicants University, the names and contact information of all twin respondents are then sent to the WSTR. Twins are contacted with information about the Registry and an enrollment survey. After both members of the pair complete the survey, the pair is enrolled in the Registry. Both the WSTR and the procedures for the present study were approved by the University of Washington IRB. All participants provided informed consent. Zygosity of the twin pairs included in the present study was confirmed using the 59 polymorphic SNPs on the EPIC array.

MDD Diagnosis

To identify twin pairs for the present study, we sent an introductory letter to WSTR members via postal mail and email that asked general questions about MDD. These letters were sent only to twin pairs in which one or both members were in Western Washington because of the need for an in-person study visit in Seattle. Interested twins were then interviewed by trained research staff using approved scripts to screen for likelihood of eligibility. After pre-screening, lifetime and current MDD diagnoses were determined using the Structured Clinical Interview for DSM-IV Research Version (SCID-4-RV). Interviews were administered via phone by a clinical psychologist (E.D.S.) who was blind to pre-screening clinical information about both the interviewee and his or her co-twin, and the final diagnosis was confirmed by consulting the senior psychiatrist (P.P.R.B). The final sample was drawn from a total of 693 clinical interviews (see below)
and all eligible and interested MZ pairs who could complete the in-person visit before the end of the enrollment period were enrolled. A discordant pair was defined as a twin pair in which one twin met the criteria for a lifetime history of MDD and his/her co-twin did not.

Inclusion/Exclusion Criteria

Only complete twin pairs were eligible for the present study. As over 85% of the twins in the WSTR are Caucasian (which reflects Washington State generally), all twin pairs included in the current analysis are Caucasian. Inclusion criteria were: (1) monozygosity determined by DNA analysis; (2) pairwise discordance on lifetime history of MDD; (3) age 18 or older; (4) reared together; (5) Beck Depression Inventory II (BDI-II) score < 13 for the non-depressed twin (0 II authors); and (6) willingness to provide blood samples. The primary exclusionary conditions included schizophrenia or other psychotic disorder, bipolar disorder, current substance use disorder, cancer within the past five years, autoimmune diseases, uncontrolled endocrine disorders, and uncontrolled sleep apnea.

Other Measures

Each twin was also asked to complete standard questionnaires regarding sociodemographic factors, lifestyle, early-life experience, use of psychiatric medications, and disease history. The severity of current depressive symptoms was assessed by the BDI-II and the Quick Inventory of Depressive Symptomatology (QIDS-SR-16) at the time of the blood draw [164,165]. Self-reported early-life stress was measured using the Adverse Childhood Experience (ACE) and Early Trauma Inventory (ETI) questionnaires. PTSD was not exclusionary in this study due to its high comorbidity with
MDD, but was diagnosed and recorded during the SCID-RV interviews. Participants reported their use of medications commonly prescribed for MDD along with the approximate duration of use (ranging from 2 weeks to >10 years). They were also asked to list any additional psychiatric medications they had taken. For the current analyses, medications were first categorized into types (i.e., antidepressants, benzodiazepines, mood stabilizers, and other (i.e., iable for each of these categories for each participant. A participant was defined as having a history of antidepressant use if he/she used any of the medications.

Monocyte Isolation and DNA/RNA Extraction

Monocytes were isolated from fresh peripheral blood (collected into EDTA vacutainer tubes) using the Monocyte Isolation Kit II from Miltenyi Biotec (Auburn, CA). DNA/RNA was isolated from monocytes using the AllPrep DNA/RNA/miRNA Universal Kit (Qiagen, CA) according to RNA was quantified using PicoGreen fluorometry. RNA integrity was assessed by capillary electrophoresis (e.g., Agilent BioAnalyzer 2100). All samples had a RIN > 7.5 in our analysis.

DNA Methylation Profiling

The Infinium HumanMethylationEPIC BeadChip (Illumina Inc., CA) was used for DNA methylation profiling. Genomic DNA (500 ng) was bisulfite converted using the EZ-96 DNA methylation kit (Zymo Research, CA). Modified DNA was then amplified, fragmented, and hybridized, followed by fluorescent staining and scanning on a HiScan scanner (Illumina Inc.). The methylation level of each probe was represented To ensure the accuracy of the methylation assay, each batch included 2 CEPH DNA samples (Coriell
Institute, NJ) as positive controls. Twin pairs were hybridized on the same chip to minimize batch effects.

Methylation Data Pre-processing and QC

We first removed the following: (1) probes with a detection p-value > 0.05 in more than 20% of the samples; (2) probes with raw signal intensities more than three SD above or below the mean; and (3) probes located on sex chromosomes (both X and Y chromosomes). Probes passing initial QC were then annotated to the UCSC reference genome (GRCh37/hg19), and those mapped to multiple locations or overlapping with known SNPs were further removed. The final analyses included 813,382 autosomal probes. All samples passed QC procedures. Prior to analysis, DNA methylation data were normalized with functional normalization using the R package minfi. This method corrects for the bias of different probe types and batch effects via an unsupervised approach.

Transcriptome Profiling by RNA-seq

Monocyte gene expression was quantified by paired-end RNA-seq (50 PE). Using sequence-specific Ribo-Zero capture probes, total RNA (500 ng) was globin and rRNA depleted, fragmented, and reverse transcribed. TruSeq libraries were quantified using PicoGreen fluorometry and sequenced on a HiSeq 2500. Sequencing reads were aligned to the UCSC reference genome (GRCh37/hg19) by Bowtie 2. Expression level was quantified by RSEM v 1.12. Genes with expression below detectable levels (ln (TPM+1) > 1) in at least 5% of samples were discarded. A total of 10,329 genes in all samples were included in the statistical analysis. At least 20 million reads per sample were captured in the RNA-seq analysis (Table 2-1).
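The first probe-level filter in the methylation QC above (detection p-value > 0.05 in more than 20% of samples) can be illustrated with a minimal Python sketch. This is not the pipeline used in this study, which relied on R packages such as minfi; the function name `failed_probe_mask`, the array layout (probes in rows, samples in columns), and the toy values are assumptions made only for illustration.

```python
import numpy as np

def failed_probe_mask(detection_p, p_cutoff=0.05, max_fail_fraction=0.20):
    """Flag probes whose detection p-value exceeds p_cutoff in more than
    max_fail_fraction of samples.

    detection_p: array of shape (n_probes, n_samples); hypothetical layout.
    Returns a boolean array of length n_probes (True = remove the probe).
    """
    detp = np.asarray(detection_p, dtype=float)
    # Fraction of samples in which each probe fails detection
    fail_fraction = (detp > p_cutoff).mean(axis=1)
    return fail_fraction > max_fail_fraction

# Toy detection p-values for 3 probes measured in 5 samples
detp = np.array([
    [0.001, 0.002, 0.001, 0.003, 0.001],  # detected in every sample
    [0.200, 0.001, 0.001, 0.001, 0.001],  # fails in 1 of 5 samples (20%)
    [0.200, 0.300, 0.001, 0.001, 0.001],  # fails in 2 of 5 samples (40%)
])
remove = failed_probe_mask(detp)
kept = detp[~remove]
```

The comparison is strict, so a probe failing in exactly 20% of samples is kept, matching the "more than 20%" wording of the criterion.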
Replication in Brain

To replicate the putative DMRs identified in blood monocytes, we downloaded brain DNA methylation data (HumanMethylation450K BeadChip) from the publicly available GEO database (GSE41826). The brain tissue was collected by the NICHD Brain Bank of Developmental Disorders, and DNA methylation data were generated in sorted neuronal nuclei from 58 postmortem brain tissue samples, including 29 individuals with MDD and 29 controls (51.7% female, mean age 32.6 ± 16.0 years). Demographic information on the brain donors was described previously. To replicate the putative DEGs identified in blood monocytes, we downloaded brain gene expression data (RNA-seq) from the publicly available GEO database (GSE101521). The expression data were generated in 59 postmortem brain tissue samples (29% female, mean age 49.3 ± 20.3 years) collected by the Division of Molecular Imaging and Neuropathology at the New York State Psychiatric Institute and Columbia University. Detailed information on the brain samples was described previously. Depression was diagnosed by the Structured Clinical Interview for DSM-IV (SCID-IV) in both datasets.

Statistical Analysis

Identifying differentially methylated regions (DMRs) associated with MDD

As DNA methylation at adjacent probes can be functionally and spatially correlated, identifying genomic regions containing biologically relevant probes should be preferable to single-CpG analysis. To achieve this, we employed a mixed model explicitly designed for discordant twin pairs:
i represents the regression coefficient for the i-th fixed effect variable (e.g., age, sex), and y j denotes the random effect of the j-th covariate (e.g., between depressed and non-depressed twins within a pair. By testing the null test hypomethylation) at a specific CpG site was associated with MDD or not. The effect of environmental factors on DNA methylation was assessed by cause indicates that it causes hypomethylation at the CpG site being tested in the depressed twins). This model allows for establishing a link between environmental exposure, epigenetic alteration, and depression.

In this analysis, we adjusted for covariates including twin age, sex, BMI, smoking (pack-years), alcohol consumption, family income, and education level as fixed effects, and batch was included as a random effect. These variables were selected based on prior knowledge from the scientific literature. Age, gender, and BMI are known to be associated with both methylation level and depression [9,174]. Although lifestyle (smoking, drinking, etc.) and socioeconomic status may not directly influence methylation level and depression, they serve as surrogate variables for a complex set of poorly understood factors that seem to carry a higher risk of depression [175,176]. Moreover, recent evidence suggests that lifestyle and socioeconomic status influence methylation level [42,177]. Thus, I adjusted for lifestyle measures (smoking and drinking) and socioeconomic status (education and income) in the model.
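The intuition behind a discordant co-twin design can be illustrated with a much simpler calculation than the mixed model used in this chapter. The Python sketch below is a deliberately reduced stand-in, not the model described above: it takes the within-pair methylation difference at a single CpG site and computes a paired t statistic, with no covariates and no random batch effect. The function name `within_pair_effect` and the simulated values (an effect of 0.02 and a noise SD of 0.01) are assumptions chosen only for illustration.

```python
import numpy as np

def within_pair_effect(meth_depressed, meth_cotwin):
    """Mean within-pair methylation difference and its paired t statistic.

    Both inputs are 1-D arrays ordered so that index k refers to the same
    twin pair. Factors shared within a pair cancel in the difference.
    """
    diff = np.asarray(meth_depressed, float) - np.asarray(meth_cotwin, float)
    mean_diff = diff.mean()
    std_err = diff.std(ddof=1) / np.sqrt(diff.size)
    return mean_diff, mean_diff / std_err

# Simulated beta values at one CpG site for 79 discordant pairs (toy data)
rng = np.random.default_rng(0)
shared = rng.normal(0.5, 0.1, size=79)             # pair-level shared component
cotwin = shared + rng.normal(0.0, 0.01, size=79)   # non-depressed twin
depressed = shared + 0.02 + rng.normal(0.0, 0.01, size=79)

mean_diff, t_stat = within_pair_effect(depressed, cotwin)
```

Because the shared component is subtracted out, its large between-pair variance does not inflate the standard error, which is the main appeal of the co-twin control design.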
The region-based analysis was performed using the program DMRcate, which identifies genomic regions harboring CpG sites and accounts for correlations between adjacent probes five correlated probes (peak probe p < 0.01 and correlation DMR was defined as a region with q < 0.05 after correcting for the total number of regions. Putative DMRs were ranked and annotated to genomic features based on the UCSC database (GRCh37/hg19). Genomic features of DMRs were compared to the null distribution of CpG probes included in the MethylationEPIC array.

Differentially expressed genes (DEGs) associated with MDD

Using the statistical model described above, we identified DEGs associated with a lifetime history of MDD, adjusting for the same covariates.

Integrated methylome and transcriptome analysis

To examine the impact of DNA methylation on gene expression, we calculated partial correlation coefficients (corrected for twin age, sex, BMI, smoking, alcohol, family income, and education) between DNA methylation and cis-acting gene expression for each probe. Here cis-acting was defined as the correlation between DNA methylation of a putative gene and its expression (±5 kb of a tested probe). We used a conservative threshold to determine significant negative (partial correlation < -0.84) or positive (partial correlation > 0.84) correlation. This threshold was obtained by randomly permuting the DNA methylation and gene expression datasets for all twins. The cutoff was based on the fifth percentile of the empirical distribution of partial correlation coefficients, assuming no correlation between methylation and gene expression across all participants (null hypothesis). To further examine the role of DNA methylation in gene
regulation, we also conducted gene overlapping analysis using the GeneOverlap software in R.

Methods used for the replication in the brain

For each of the 39 DMRs identified in blood monocytes, we tested the association of each probe within each region with depression (y/n) using logistic regression, adjusting for age and sex. A similar analysis was conducted to test the association of each putative DEG with depression, adjusting for age, sex, RIN, and brain pH. Multiple testing was corrected for 39 DMRs or 30 DEGs using the false discovery rate (FDR).

Co-methylation and co-expression networks

To examine the correlation patterns among putative DMRs and identify genes that are co-methylated, we performed network analyses using Weighted Gene Co-expression Network Analysis (WGCNA). This analysis included 322 DMRs showing nominal association (raw p < 0.001) with MDD. A similar analysis was used to identify co-expression networks (326 genes with raw p < 0.001 were included). Networks were constructed separately in depressed twins and their non-depressed co-twins. Differential methylation or expression networks were identified by comparing the two groups. Network visualization was done using Cytoscape.

Functional enrichment analysis

To explore the potential functional relevance of the identified differentially methylated (DM) or differentially expressed (DE) genes, we conducted functional enrichment analysis using the program DEPICT. We first tested whether the putative genes (p < 0.001) are enriched in GWAS loci for major depression. A total of 746 SNPs previously associated with major depression in GWAS were downloaded from
the GWAS Catalog for this analysis. We then tested whether the identified genes are enriched in gene sets related to antidepressants by extracting 283 drug target genes from the Open Targets database.

Sensitivity analysis

To examine whether adverse childhood experience (ACE) modulates the association between DNA methylation and MDD, we further adjusted for the total number of ACEs in the above-described statistical model. Similarly, we tested the influence of antidepressant use (y/n) or history of PTSD (y/n) on the relationship between DNA methylation and MDD.

Control for multiple comparisons

In the above-described analyses, we adjusted for multiple testing using the false discovery rate (FDR), and an FDR-adjusted P (i.e., q value) < 0.05 was considered statistically significant.

Results

Table 2-2 shows the characteristics of the twins (mean age 38.2 ± 15.6 years, 68.4% female). Except for current BDI-II score, history of PTSD, and use of antidepressants, depressed twins did not differ significantly from their non-depressed co-twins.

DMRs Associated with a Lifetime History of MDD

We identified 39 DMRs (annotated to 36 unique genes) significantly associated with MDD at q < 0.05 (Table 2-3). Of these, 33 DMRs are hypermethylated and six are hypomethylated in relation to MDD. Figure 2-1 shows a Manhattan plot for these DMRs. The genomic distribution of these DMRs is shown in Figure 2-2, which indicates that the identified DMRs are enriched in the first exon and promoter regions but depleted
in intergenic regions. In relation to CpG context, the identified DMRs are largely located within CGIs.

DEGs Associated with MDD

We identified 30 DEGs (14 upregulated, 16 downregulated) associated with a lifetime history of MDD at q < 0.05 (Table 2-4). A Manhattan plot of these putative DEGs is shown in Figure 2-3.

Replication in Brain

Of the 39 DMRs identified in blood, 14 regions contain at least one CpG probe showing significant association with MDD (q < 0.05) in the brain after adjustment for covariates and the total number of probes in each region (Table 2-5). Of these, 10 DMRs are in the same direction as in blood, whereas four are in the opposite direction. Of the 30 DEGs identified in blood monocytes, two genes (NDUFA8, GUSBP9) were also significantly associated with MDD (q < 0.05) in the brain (Table 2-6). While the expression level of the NDUFA8 gene was lower in individuals with MDD than in controls (i.e., downregulated) in both blood and brain, the GUSBP9 gene was in the opposite direction (i.e., upregulated in blood but downregulated in the brain between cases and controls).

Genome-wide Integration of DNA Methylome and Transcriptome in Blood Monocytes

Figure 2-4 displays the genome-wide correlation patterns between DNA methylation and cis-acting gene expression in peripheral blood monocytes. It shows that DNA methylation was mainly negatively correlated with gene expression (74% of the correlation pairs), but positive correlations were also observed. Interestingly, we
found that the correlation patterns between DNA methylation and gene expression vary by genomic location, with genes located about 1 kb upstream (negative) or downstream (positive) of the transcription start site (TSS) showing stronger correlation, whereas those located in between showed weak or no correlation (Figure 2-5). These correlation pairs (Table 2-7) involve 140 distinct methylated loci (from 64 unique genes) and 60 corresponding expression regions (representing 57 unique genes).

Differential Network Analysis

Our co-methylation network analysis identified three differential co-methylation modules containing at least 50 genes. Table 2-8 lists the co-methylation modules along with hub genes (genes with the highest module membership) and biological pathways in each module. A total of 304 genes (94.4% of the 322 genes showing nominal association with MDD) were assigned to at least one module. The largest module (Figure 2-6) comprises 167 genes involved in four biological processes. The network connectivity (measured by node degree) of two pathways in this module differs significantly between depressed and non-depressed twins. For example, the node connectivity for a pathway depressed twins was significantly higher than that in non-depressed twins (3.8 vs. 2.6, p value = 7.15×10⁻⁵ compared to their non-depressed co-twins (2.8 vs. 4.2, p value = 2.21×10⁻⁵). Our co-expression network analysis identified five modules containing 292 genes (Table 2-9). The largest module comprises 94 genes involved in eight biological a significant difference between depressed and non-depressed twins. (Figure 2-7)
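The correlation pairs reported at the start of this section were selected with a permutation-derived cutoff (described in the Methods). The idea behind such a cutoff can be sketched as follows, under two stated simplifications: plain Pearson correlation is used instead of covariate-adjusted partial correlation, and the cutoffs are read from the 5th and 95th percentiles of the permuted values. The function names, the number of permutations, and the data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_cutoffs(methylation, expression, n_perm=2000, tail=0.05, seed=0):
    """Empirical null cutoffs obtained by shuffling expression across samples.

    Shuffling breaks any real methylation-expression link, so the resulting
    correlations approximate the distribution expected under no association.
    Returns the (lower, upper) percentile cutoffs of that distribution.
    """
    rng = random.Random(seed)
    shuffled = list(expression)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(pearson(methylation, shuffled))
    null.sort()
    lower = null[int(tail * (n_perm - 1))]
    upper = null[int((1 - tail) * (n_perm - 1))]
    return lower, upper


# Hypothetical methylation (beta) and expression (log scale) values, 12 samples.
methylation = [0.10, 0.15, 0.22, 0.30, 0.35, 0.41, 0.48, 0.55, 0.63, 0.70, 0.78, 0.85]
expression = [9.8, 9.1, 9.4, 8.2, 8.5, 7.6, 7.9, 6.8, 6.1, 6.4, 5.2, 5.0]

observed = pearson(methylation, expression)
lower, upper = permutation_cutoffs(methylation, expression)
print(f"observed r = {observed:.2f}; null cutoffs = ({lower:.2f}, {upper:.2f})")
print("negative pair" if observed < lower else
      "positive pair" if observed > upper else "not significant")
```

An observed correlation is retained only if it falls outside the range that shuffled data produce, which is why a cutoff derived this way tends to be conservative when the sample is small.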
Functional Enrichment Analysis

The identified differentially methylated genes are significantly enriched in pathways related to the stress-activated protein kinase signaling cascade, neuron apoptotic process, negative regulation of insulin receptor signaling, mTOR signaling, and nerve growth factor receptor signaling (Table 2-10). The differentially expressed genes are enriched in biolo Gene overlapping analysis revealed that the putative differentially methylated genes are 2.44 times more likely to be differentially expressed (P = 1.1×10⁻⁴) and are significantly overrepresented in previous GWAS loci for major depression (2.32 times, P = 2.4×10⁻⁴). Moreover, these differentially methylated genes are significantly enriched in drug targets related to antidepressants (2.83 times, P = 7.6×10⁻⁵) and are highly expressed in tissues/cell types related to the nervous, endocrine, and urogenital systems (Figure 2-8). Together, these results suggest a potential functional role of altered DNA methylation in MDD pathogenesis.

Sensitivity Analysis

Additional adjustment for adverse childhood experience (ACE) slightly attenuated the association between DNA methylation and MDD, but the results remained mostly unchanged (all region p < 0.01). After further correction for the use of antidepressants or PTSD, the association of one gene (MAFF) with MDD disappeared, but the other genes remained statistically significant. Results of the sensitivity analysis are shown in Table 2-11. Additional adjustment for ACE, PTSD, or use of antidepressants did not appear to have a significant impact on the association between gene expression and MDD (Table 2-12).
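The "times more likely" figures from the gene overlapping analysis are fold enrichments of an observed overlap over the overlap expected by chance. A minimal sketch of this kind of test is given below, assuming a one-sided hypergeometric (Fisher's exact) test on two gene lists drawn from a common background. The function name and all counts are hypothetical; they are not the gene lists or background used in this study.

```python
from math import comb


def overlap_enrichment(n_a, n_b, n_overlap, n_background):
    """Fold enrichment and one-sided hypergeometric P for a gene-list overlap.

    n_a, n_b     : sizes of the two gene lists
    n_overlap    : number of genes present in both lists
    n_background : number of genes that could have appeared in either list
    Returns (fold enrichment, P(overlap >= n_overlap) under random sampling).
    """
    if not 0 <= n_overlap <= min(n_a, n_b) or max(n_a, n_b) > n_background:
        raise ValueError("inconsistent gene counts")
    expected = n_a * n_b / n_background
    fold = n_overlap / expected
    total = comb(n_background, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_background - n_a, n_b - k)
        for k in range(n_overlap, min(n_a, n_b) + 1)
    )
    return fold, tail / total


# Hypothetical example: 300 methylated genes, 200 expressed genes,
# 15 genes in common, 10,000 genes in the background.
fold, p_value = overlap_enrichment(300, 200, 15, 10_000)
print(f"fold enrichment = {fold:.2f}, P = {p_value:.2e}")
```

The choice of background matters: restricting it to genes actually measured on both platforms makes the expected overlap larger and the test more conservative than using all annotated genes.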
Discussion

Using a monozygotic discordant co-twin control design, we identified 39 differentially methylated regions (33 hypermethylated, six hypomethylated) and 30 differentially expressed genes (14 upregulated, 16 downregulated) associated with lifetime history of MDD, after accounting for clinical covariates and multiple testing. These differentially methylated or expressed genes are significantly enriched in biological processes related to neuronal function, stress response, insulin regulation, mTOR signaling, and cytokine secretion, suggesting potential relevance to MDD pathogenesis. Moreover, the identified DMR genes are overrepresented in GWAS loci and drug targets related to antidepressants. Integrated DNA methylome and transcriptome analysis revealed that DNA methylation was both negatively and positively correlated with gene expression in peripheral blood monocytes. To the best of our knowledge, this is the first integrated DNA methylome and transcriptome analysis in purified blood monocytes for lifetime history of MDD using a relatively large number of MZ discordant pairs from a community-based population cohort.

Of the 36 annotated DMR genes, the PRSS21 gene showed the most significant association with MDD. The methylation level of this gene in depressed twins is on average 1.13-fold that in their non-depressed co-twins. This gene (also known as testisin) encodes a glycosylphosphatidylinositol (GPI)-linked serine protease, which is a member of the trypsin family of serine proteases. A growing body of evidence demonstrates that the brain can co-opt the activities of these serine proteases and their receptors/inhibitors to regulate various processes including synaptic activity, learning, and social behavior [185,186]. Aberrant activity of these molecules may contribute to disease, traumatic
brain injury, and stroke [185,186]. Another significantly hypermethylated gene is HSPB11, which encodes a member of the heat shock protein (HSP) family; HSPs are produced in response to stressful conditions [187,188]. Although the exact mechanisms behind the association of HSPB11 hypermethylation with MDD are unknown, HSPs are involved in protein misfolding and aggregation, which have been implicated in neurodegenerative and neuropsychiatric disorders. Other top-ranked DMR genes, such as AAK1, SORBS2, and GAREM2, are also abundantly expressed in the brain and may affect MDD susceptibility through a variety of biological processes [87,190-193]. For example, the AAK1 gene is involved in intracellular vesicle trafficking, a mechanism that is essential for neurotransmitter release and recycling of synaptic vesicle proteins [193,194]. Inhibition of AAK1 activity may offer a novel therapeutic approach for treating neuropsychiatric and neurodegenerative disorders [193,195]. The SORBS2 gene encodes the Arg protein tyrosine kinase binding protein 2 (ArgBP2), downregulation of which was previously associated with mood disorders. The GAREM2 gene encodes an adapter protein that regulates MAPK/ERK signaling, which is involved in neuronal plasticity and resilience in psychiatric disorders. Further, the identified DMR genes are enriched in neuron apoptosis, nerve growth factor, stress-activated protein kinase signaling, insulin receptor regulation, and mTOR signaling, suggesting potential relevance to MDD pathogenesis.

Of the differentially expressed genes associated with MDD, the peroxisomal trans-2-enoyl-CoA reductase (PECR) gene showed the strongest association. This gene is involved in mitochondrial energy production by catalyzing the reduction of enoyl-CoAs to acyl-CoAs, and genetic polymorphisms in this gene were associated with
alcohol dependence [202,203]. The differentially expressed genes are enriched in the regulation of cytokine secretion and stress responses, lending further support for the critical roles of inflammation and stress in major depression. Previous studies have demonstrated that zinc plays an essential role in synaptic activity and neuronal plasticity and that zinc deficiency was associated with behavioral impairments, neurodegenerative disorders, and mood disorders including depression. In line with these findings, we found that several zinc family genes are differentially methylated (e.g., SLC30A3, ZNF212, ZBTB45, SWSAP1, WT1, TRIM39) or differentially expressed (e.g., ZNF200, ZNF101, ZNF493, ZNF816, ZNF487, ZNF772) between depressed twins and their non-depressed co-twins. These findings provide further support for a potentially important role of zinc dysregulation in depression and suggest that DNA methylation may modulate the effect of zinc function on depression susceptibility. Together, our results may reveal novel molecular pathways underlying MDD pathogenesis.

Although DNA methylation is generally believed to cause gene silencing, a global analysis of the extent and pattern of the relationship between DNA methylation and gene expression in human blood monocytes is still lacking. Here we demonstrated that monocyte DNA methylation can be both positively and negatively correlated with gene expression, although there appears to be a trend toward predominantly negative correlations in promoter regions. These results are in agreement with previous studies reporting both positive and negative correlations between DNA methylation and gene expression in blood and brain [157,158]. Interestingly, we found that the relationship between DNA methylation and cis-acting gene expression appears to vary by genomic location, with methylation
of putative genes located upstream of the TSS showing predominantly negative correlations, whereas those located downstream of the TSS showed mostly positive correlations with gene expression. While the negative correlation may result from interference with transcription factor binding or recruitment of repressors such as histone deacetylases [206-208], the positive correlation between DNA methylation and gene expression may be attributed to the high level of DNA methylation in the gene body of highly transcribed genes. In addition, it has been shown that DNA methylation of a promoter or an enhancer can activate transcription of a target gene and is therefore positively correlated with gene expression [42,207,209]. Together, our integrated DNA gene regulation and identified key candidate genes whose roles in depression are modulated by epigenetic changes.

MDD is a highly heterogeneous disorder involving the joint or interactive effects of many genes in multiple pathways. Traditional methods that model the effect of a single gene cannot capture the complicated biological pathways implicated in MDD. Using a network bas methylated or co-expressed modules containing coordinated genes across different genomic loci. These findings support the hypothesis that altered DNA methylation is interdependent and that the blood monocyte methylome and transcriptome comprise a complex network of interacting processes.

Our study has several limitations. First, in spite of using 79 monozygotic discordant twin pairs, our study is still underpowered, and thus we might have missed important disease-related genes, especially those with small individual effect sizes. As
such, the current analysis focused on region-based rather than single-probe analysis. Our results should be considered a proof of concept rather than conclusive. Second, although we used purified blood monocytes for DNA methylation and gene expression profiling, our sample still includes other cell types such as macrophages or dendritic sample has nearly 97% purity, and we believe confounding by cellular heterogeneity should not be a major concern for our study. Moreover, given the cell-type-specific nature of DNA methylation, it is unclear to what extent our results derived from peripheral blood reflect methylation changes in the brain. However, monocytes are one of the key components of the innate immune system, dysfunction of which has been implicated in neuropsychiatric disorders including depression, and accumulating evidence indicates that epimutations may not be limited to the affected organ (e.g., brain) but can also be detected in peripheral blood [102,212,213]. Further, many of the identified DMR genes are abundantly expressed, and some could be replicated in the brain. These putative genes detected in readily accessible tissues such as blood are suitable as biomarkers. Third, the twin participants were evaluated for lifetime history of MDD, and some non-depressed co-twins might ultimately develop an episode of MDD. The mean age of our sample was 38, which suggests that a large portion of the participants had already passed the peak risk period of young adulthood. In addition, the current depression severity scores were higher on average for the depressed twins than for their non-depressed co-twins, even though only a very small number of twins were currently depressed at the time of the study visit. Thus, the results are promising even if some regions ultimately
do not replicate or are found to be related to earlier onset of MDD. Other potential limitations include the limited genome coverage of the EPIC array used in our study, the inability to establish causality between DNA methylation and MDD pathology, and the uncertain generalizability of our findings to other racial/ethnic groups.

Our study has several strengths. First, as monozygotic twin pairs share almost identical genotypes, age, sex, and early familial environment (e.g., the in utero environment) as well as many unknown or unmeasured factors, the use of a monozygotic discordant co-twin control design minimizes or eliminates potential confounding by these factors. Second, we profiled the DNA methylome and transcriptome in purified blood monocytes, which minimizes confounding by cellular heterogeneity. Moreover, we used the Structured Clinical Interview for DSM-IV for the depression diagnosis.

In summary, our results demonstrated a critical role of altered DNA methylation and gene expression in MDD and identified key candidate genes and pathways underlying MDD pathogenesis. If validated, the newly identified genes and pathways may serve as novel therapeutic targets for MDD and related disorders.
Table 2-1. QC information for RNA-seq

Study ID  RIN   Total Reads  Aligned Reads  % Aligned Reads  QC30
1000      9.04  35,589,699   29,603,511     83.18%           1
1017      8.87  32,382,943   29,332,470     90.58%           0.97
1027      8.11  30,947,356   27,781,441     89.77%           0.99
1044      8.51  33,105,942   26,239,769     79.26%           1
1046      7.83  26,766,472   24,135,328     90.17%           1
1052      7.66  30,252,083   27,157,295     89.77%           1
1065      8.53  37,706,116   30,858,685     81.84%           1
1066      8.61  28,348,653   23,841,217     84.10%           1
1072      7.99  22,842,796   17,943,016     78.55%           1
1089      8.25  35,065,192   31,614,777     90.16%           0.99
1104      9.25  31,837,624   28,338,669     89.01%           1
1106      8.78  21,712,939   19,435,252     89.51%           1
1111      8.66  32,172,418   28,890,831     89.80%           1
1119      8.1   29,465,971   26,380,884     89.53%           1
1121      8     26,577,598   24,289,267     91.39%           1
1144      8.77  21,382,699   19,208,078     89.83%           1
1149      8.83  26,897,611   21,797,824     81.04%           1
1168      8.92  20,482,031   18,724,673     91.42%           0.99
1173      9.09  35,071,738   31,883,717     90.91%           1
1175      8.65  36,628,436   33,661,532     91.90%           0.95
1204      8.04  27,863,618   25,004,810     89.74%           1
1211      9.19  35,641,976   31,404,145     88.11%           1
1217      8.45  34,509,804   31,362,510     90.88%           1
1218      8.3   34,951,001   29,002,341     82.98%           1
1243      8.48  37,845,052   34,298,970     90.63%           1
1247      8.13  23,422,113   21,276,648     90.84%           1
1254      8.83  38,833,918   35,039,844     90.23%           1
1294      9.22  31,433,349   26,756,067     85.12%           1
1328      7.73  33,280,602   30,544,937     91.78%           1
1337      7.95  28,382,694   25,572,807     90.10%           1
1343      7.9   22,649,348   18,108,154     79.95%           1
1358      7.83  24,287,675   22,016,777     90.65%           1
1362      8.56  26,686,979   22,385,038     83.88%           0.98
1364      8.48  35,060,443   31,375,591     89.49%           1
1425      8.71  29,824,176   26,400,361     88.52%           1
Table 2-1. Continued

Study ID  RIN   Total Reads  Aligned Reads  % Aligned Reads  QC30
1493      8.45  25,699,324   23,188,500     90.23%           1
1495      8.05  23,430,905   21,521,286     91.85%           1
1504      8.96  39,062,764   35,804,930     91.66%           1
1514      8     26,171,523   20,769,721     79.36%           1
1539      8.29  31,411,345   24,745,858     78.78%           1
1550      8.54  35,474,123   32,331,115     91.14%           0.85
1558      8.01  38,053,227   34,875,783     91.65%           1
1584      8.4   35,221,926   29,434,964     83.57%           1
1608      7.83  29,190,726   24,108,621     82.59%           1
1612      8.69  32,953,640   30,086,674     91.30%           1
1614      8.44  36,443,429   32,904,772     90.29%           1
1618      8.69  31,172,529   28,572,740     91.66%           1
1620      8.49  29,733,423   26,914,694     90.52%           1
1648      8.37  20,877,732   18,163,627     87.00%           1
1673      8.85  27,944,571   25,038,336     89.60%           1
1675      9.24  24,603,471   22,305,507     90.66%           1
1684      8.38  35,410,401   29,886,378     84.40%           1
1693      8.5   35,416,628   28,212,886     79.66%           1
1708      9.23  25,293,191   22,682,934     89.68%           1
1731      8.8   29,078,886   25,327,710     87.10%           1
1749      8.24  30,839,425   27,712,307     89.86%           0.97
1760      8.36  20,652,784   15,644,484     75.75%           0.99
1769      8.27  36,244,838   32,518,869     89.72%           1
1775      8.79  24,237,971   22,119,572     91.26%           0.94
1789      8.39  28,164,424   23,959,476     85.07%           1
1804      8.05  34,223,863   30,849,390     90.14%           1
1809      7.85  35,195,026   32,337,190     91.88%           1
1812      8.03  26,781,765   24,274,992     90.64%           1
1828      7.8   27,718,019   25,373,074     91.54%           1
1865      9.33  35,455,344   31,895,628     89.96%           0.99
1866      8.14  25,929,365   21,617,312     83.37%           0.97
1868      8.76  39,768,598   36,257,031     91.17%           1
1890      8.9   24,566,805   20,930,918     85.20%           0.99
1899      8.41  27,859,795   25,452,709     91.36%           1
1916      8.63  24,903,400   22,340,840     89.71%           0.99
Table 2-1. Continued

Study ID  RIN   Total Reads  Aligned Reads  % Aligned Reads  QC30
1936      8.78  25,255,562   19,727,119     78.11%           1
1949      8.28  33,886,018   28,260,939     83.40%           0.98
1957      8.51  25,675,377   23,554,591     91.74%           0.99
1984      8.35  25,268,650   22,185,875     87.80%           0.99
2003      8     35,417,930   31,475,914     88.87%           1
2010      8.07  32,334,249   29,375,665     90.85%           1
2015      8.66  36,831,023   32,930,618     89.41%           1
2023      8.9   31,384,449   28,591,233     91.10%           1
2024      8.39  37,903,559   31,168,096     82.23%           1
2027      8.51  38,119,049   34,070,806     89.38%           1
2029      8.31  31,020,523   27,937,083     90.06%           1
2041      8.48  22,498,851   20,566,199     91.41%           1
2054      8.39  27,554,973   25,339,553     91.96%           0.99
2056      8.79  29,148,975   24,397,692     83.70%           1
2081      8.71  37,334,843   34,176,315     91.54%           1
2092      9.07  25,460,761   22,886,678     89.89%           1
2101      8.37  20,805,415   19,084,807     91.73%           1
2103      8.54  38,185,503   34,569,336     90.53%           1
2115      8.12  22,189,559   20,081,551     90.50%           1
2121      8.53  23,719,803   19,001,934     80.11%           1
2126      9.11  39,547,924   36,162,621     91.44%           1
2134      8.59  20,260,935   16,615,993     82.01%           1
2164      8.58  38,991,673   34,944,338     89.62%           1
2171      8.14  38,287,025   34,707,189     90.65%           1
2221      8.26  34,607,618   31,001,504     89.58%           1
2222      8.67  22,121,808   18,697,352     84.52%           1
2228      9.2   34,829,549   31,168,964     89.49%           1
2232      9.1   32,957,353   29,605,590     89.83%           1
2313      8.74  20,278,845   18,601,785     91.73%           0.93
2318      8.2   28,659,618   24,830,693     86.64%           1
2328      8.34  37,738,438   33,983,464     90.05%           0.99
2329      7.71  26,323,027   23,903,941     90.81%           1
2336      8.91  21,412,771   19,399,970     90.60%           1
2343      8.56  39,536,831   36,322,487     91.87%           1
2363      8.07  27,783,963   24,027,572     86.48%           0.99
Table 2-1. Continued

Study ID  RIN   Total Reads  Aligned Reads  % Aligned Reads  QC30
2393      9.1   23,714,578   21,456,950     90.48%           1
2402      9.12  27,141,586   24,560,422     90.49%           0.97
2409      8.63  32,154,186   29,353,556     91.29%           1
2418      8.42  35,131,538   32,208,594     91.68%           1
2431      9.22  32,888,071   29,645,307     90.14%           0.98
2434      9.12  20,892,469   18,799,043     89.98%           0.98
2443      7.98  27,390,142   24,719,604     90.25%           1
2447      8.74  34,219,049   30,715,018     89.76%           1
2464      8.75  27,752,462   25,465,659     91.76%           1
2467      8.87  35,019,996   31,595,041     90.22%           1
2471      8.3   33,494,498   30,265,628     90.36%           1
2472      8.32  39,555,241   36,121,846     91.32%           1
2475      8.29  26,335,587   23,733,631     90.12%           1
2481      9.19  38,848,593   35,631,929     91.72%           0.99
2486      7.77  24,674,202   22,088,346     89.52%           1
2494      8.37  26,812,453   24,251,864     90.45%           1
2500      8.32  27,853,656   25,569,657     91.80%           0.98
2523      8.7   32,643,033   27,002,317     82.72%           1
2553      9.24  23,097,259   21,048,532     91.13%           1
2565      8.66  37,241,148   31,368,219     84.23%           1
2566      8.55  24,294,754   21,906,580     90.17%           1
2576      9.02  21,482,654   19,353,723     90.09%           1
2583      8.62  30,857,293   27,953,622     90.59%           1
2597      9.09  38,026,805   34,110,044     89.70%           1
2609      8.57  29,107,244   26,566,182     91.27%           0.99
2649      7.87  35,923,273   32,280,653     89.86%           1
2660      8.39  28,409,045   25,409,050     89.44%           1
2683      7.7   25,434,116   22,893,248     90.01%           1
2689      8.94  27,384,890   21,765,510     79.48%           1
2692      8.62  23,381,259   20,215,436     86.46%           1
2725      8.84  21,767,419   19,649,449     90.27%           1
2726      8.49  39,302,936   35,604,530     90.59%           1
2753      7.6   25,573,023   23,314,925     91.17%           1
2755      8.51  26,138,262   23,712,632     90.72%           1
Table 2-1. Continued

Study ID  RIN   Total Reads  Aligned Reads  % Aligned Reads  QC30
2767      8.37  38,610,914   33,274,885     86.18%           1
2789      9.32  23,328,093   20,803,993     89.18%           1
2793      8.4   32,678,548   29,302,854     89.67%           1
2802      8.49  31,929,122   29,253,462     91.62%           1
2814      8.67  23,825,418   21,685,895     91.02%           1
2820      8.45  26,937,319   24,133,144     89.59%           1
2836      8.65  29,164,310   26,536,605     90.99%           0.98
2851      7.89  28,936,821   25,921,604     89.58%           0.71
2862      8.2   26,521,555   23,922,442     90.20%           1
2881      8.76  31,890,882   28,548,717     89.52%           1
2886      9     21,662,684   19,901,508     91.87%           1
2894      8.14  36,418,563   33,395,822     91.70%           1
2934      8.8   20,907,671   19,218,331     91.92%           1
2941      7.98  27,271,836   24,610,105     90.24%           1
2942      8.15  22,403,323   20,599,856     91.95%           1
2948      8.57  36,920,527   33,645,677     91.13%           1
2950      8.52  24,733,017   22,136,050     89.50%           0.99
2981      9.46  31,140,208   28,076,012     90.16%           0.99
2997      8.46  36,163,644   30,540,197     84.45%           1
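The "% Aligned Reads" column in Table 2-1 is the number of aligned reads divided by the total number of reads. The short sketch below recomputes it for three rows of the table and flags samples under a cutoff. The 80% cutoff and the function name are hypothetical illustrations; the study does not state an exclusion threshold here.

```python
def alignment_rate(total_reads, aligned_reads):
    """Return the percentage of reads that aligned to the reference."""
    if total_reads <= 0 or not 0 <= aligned_reads <= total_reads:
        raise ValueError("read counts are inconsistent")
    return 100.0 * aligned_reads / total_reads


# (study ID, total reads, aligned reads) copied from Table 2-1.
samples = [
    ("1000", 35_589_699, 29_603_511),
    ("1017", 32_382_943, 29_332_470),
    ("1760", 20_652_784, 15_644_484),
]
MIN_RATE = 80.0  # hypothetical cutoff, not taken from the study

rates = {sid: alignment_rate(total, aligned) for sid, total, aligned in samples}
flagged = [sid for sid, rate in rates.items() if rate < MIN_RATE]

for sid, rate in rates.items():
    print(f"{sid}: {rate:.2f}% aligned")
print("below cutoff:", flagged)
```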
Table 2-2. Clinical characteristics of twin pairs participating in the MMS

Variable                                  Non-depressed co-twin (N=79)  Depressed twin (N=79)  P valueᵃ
Age, mean (SD), years                     38.2 (15.6)                   38.2 (15.6)
Female, No. (%)                           54 (68.4)                     54 (68.4)
Body mass index, mean (SD), kg/m²         26.9 (6.5)                    27.0 (6.9)             0.821
Smoking, mean (SD), pack-years            2.3 (9.4)                     2.4 (8.3)              0.839
AUDIT-C score, mean (SD)ᵇ                 3.6 (2.6)                     3.4 (2.8)              0.703
Education below high school, No. (%)      6 (7.6)                       4 (5.1)                0.434
Family income less than $20,000, No. (%)  6 (7.6)                       7 (8.9)                0.537
BDI-II score, mean (SD)                   3.1 (5.1)                     6.6 (6.5)              0.003
Exposure to ACE, No. (%)                  8 (10.1)                      9 (11.4)               0.664
History of PTSD, No. (%)                  1 (1.3)                       6 (7.6)                0.014
Use of antidepressants, No. (%)           12 (15.2)                     42 (53.2)              0.006

Abbreviations: BDI-II, Beck Depression Inventory-II; ACE, adverse childhood experiences; PTSD, post-traumatic stress disorder.
ᵃ P values were calculated using the paired t test.
ᵇ Score based on the Alcohol Use Disorders Identification Test (AUDIT-C).
Table 2-3. Significant DMRs associated with lifetime history of MDD in MZ discordant twin pairs

Chr  Start (bp)   End (bp)     Size (bp)  Nearest gene  # of probes  Peak Pᵃ    Region Pᵇ  Mean FCᶜ  Mean difference (%)ᵈ  Peak difference (%)ᵉ
16   2,866,834    2,868,001    1,168      PRSS21        10           6.28×10⁻⁴  1.16×10⁻⁹  1.13      4.06                  14.28
5    43,037,123   43,037,666   544                      7            2.99×10⁻⁴  2.96×10⁻⁹  1.15      0.46                  1.26
1    54,411,017   54,412,009   993        HSPB11        18           1.90×10⁻³  3.69×10⁻⁷  1.05      0.72                  4.02
2    69,870,526   69,871,424   899        AAK1          8            5.70×10⁻⁴  4.32×10⁻⁷  1.05      2.48                  8.45
4    186,732,926  186,733,331  406        SORBS2        8            3.53×10⁻³  4.89×10⁻⁷  1.14      6.98                  13.36
2    26,395,359   26,395,859   501        GAREML        8            5.55×10⁻³  7.20×10⁻⁷  1.08      0.29                  0.81
5    140,800,398  140,800,983  586        PCDHGA11      10           1.85×10⁻³  1.04×10⁻⁶  1.11      8.02                  11.54
1    153,940,616  153,941,285  670        CREB3L4       6            2.76×10⁻³  1.16×10⁻⁶  1.12      11.12                 15.53
5    43,602,380   43,603,353   974        NNT           17           2.76×10⁻³  1.33×10⁻⁶  1.06      1.03                  6.84
5    8,457,538    8,458,392    855        RP11          9            4.84×10⁻³  1.63×10⁻⁶  0.89      3.24                  8.65
22   38,598,577   38,599,166   590        MAFF          9            5.79×10⁻³  3.06×10⁻⁶  1.05      4.46                  7.30
12   64,173,610   64,174,367   758        TMEM5         9            6.05×10⁻³  3.32×10⁻⁶  1.04      1.73                  4.48
19   11,484,448   11,485,452   1,005      SWSAP1        14           8.08×10⁻⁵  3.94×10⁻⁶  1.05      0.56                  1.61
2    27,485,922   27,486,460   539        SLC30A3       8            3.43×10⁻³  4.49×10⁻⁶  1.07      1.52                  6.90
2    101,034,246  101,034,295  50         CHST10        6            5.93×10⁻³  5.48×10⁻⁶  1.10      7.51                  8.30
11   85,779,252   85,780,378   1,127      PICALM        10           1.85×10⁻³  5.72×10⁻⁶  1.06      5.28                  13.31
13   114,814,024  114,814,401  378                      5            7.42×10⁻³  6.81×10⁻⁶  1.09      7.08                  13.05
11   32,454,216   32,455,025   810        WT1           8            2.18×10⁻³  8.16×10⁻⁶  1.06      2.34                  6.02
16   70,557,411   70,557,707   297        SF3B3         10           7.70×10⁻³  8.65×10⁻⁶  1.06      3.32                  8.43
7    90,224,158   90,225,380   1,223      CDK14         11           9.78×10⁻⁴  9.23×10⁻⁶  0.99      1.06                  3.14
11   87,908,134   87,908,805   672        RAB38         7            6.00×10⁻³  9.72×10⁻⁶  1.09      1.26                  5.74
12   122,019,031  122,019,117  87         KDM2B         5            8.20×10⁻³  1.37×10⁻⁵  0.91      8.86                  11.79
16   85,096,632   85,097,151   520        KIAA0513      5            6.05×10⁻³  1.73×10⁻⁵  0.94      3.88                  7.55
Table 2-3. Continued

Chr  Start (bp)   End (bp)     Size (bp)  Nearest gene  # of probes  Peak Pᵃ    Region Pᵇ  Mean FCᶜ  Mean difference (%)ᵈ  Peak difference (%)ᵉ
2    65,594,021   65,595,186   1,166      SPRED2        6            9.76×10⁻⁵  1.81×10⁻⁵  1.09      2.29                  5.55
9    98,079,646   98,080,622   977        FANCC         10           2.64×10⁻⁴  2.02×10⁻⁵  1.07      6.64                  11.76
10   90,611,604   90,612,228   625        ANKRD22       7            2.33×10⁻³  2.13×10⁻⁵  1.07      5.22                  8.13
5    140,777,344  140,777,655  312                      9            4.87×10⁻³  2.14×10⁻⁵  1.11      8.87                  13.05
1    178,994,834  178,995,133  300        FAM20B        8            2.76×10⁻³  2.31×10⁻⁵  1.06      5.27                  9.33
7    148,936,572  148,937,410  839        ZNF212        9            3.03×10⁻³  2.40×10⁻⁵  1.06      1.46                  4.08
16   68,118,822   68,119,261   440        NFATC3        9            7.94×10⁻³  2.64×10⁻⁵  1.05      4.97                  8.83
19   59,030,662   59,031,081   420        ZBTB45        7            6.62×10⁻³  3.75×10⁻⁵  1.02      0.97                  2.52
16   87,351,006   87,351,824   819        C16orf95      10           1.63×10⁻³  4.00×10⁻⁵  1.05      5.03                  13.48
17   58,499,300   58,500,186   887        C17orf64      9            9.07×10⁻⁴  4.27×10⁻⁵  0.93      5.43                  9.35
19   19,739,060   19,739,414   355        LPAR2         8            6.05×10⁻³  4.65×10⁻⁵  1.04      1.81                  4.80
3    179,280,056  179,280,746  691        ACTL6A        9            3.97×10⁻³  4.75×10⁻⁵  1.02      1.48                  4.75
1    70,876,598   70,877,381   784        CTH           9            5.78×10⁻³  5.33×10⁻⁵  1.06      1.23                  3.90
7    78,400,383   78,400,769   387        MAGI2         5            6.61×10⁻³  5.39×10⁻⁵  1.09      8.22                  12.42
6    30,297,174   30,297,941   768        TRIM39        10           2.76×10⁻³  5.57×10⁻⁵  1.09      8.93                  14.69
15   69,222,400   69,223,018   619        NOX5          7            6.22×10⁻³  5.79×10⁻⁵  0.88      4.11                  10.37

ᵃ Adjusted for twin age, sex, BMI, smoking, alcohol consumption, education, and family income.
ᵇ Adjusted for a total of 6,858 regions.
ᶜ Mean fold change (FC) in DNA methylation level across all CpG probes in a region. FC > 1 represents hypermethylation, whereas FC < 1 represents hypomethylation (depressed twin vs. non-depressed co-twin).
ᵈ Mean methylation difference across all probes in the region between depressed twins and their non-depressed co-twins.
ᵉ Methylation difference of the peak probe in the region between depressed twins and their non-depressed co-twins.
Table 2-4. Significant DEGs associated with lifetime history of MDD in MZ discordant twin pairs

Chr  Start (bp)   End (bp)     Size (bp)  Nearest gene  FCᵃ    Pᵇ         qᶜ
2    215,996,329  216,082,955  86,626     PECR          1.49   7.00×10⁻⁸  7.10×10⁻³
1    185,292,384  185,294,372  1,988      AL356273.3    1.32   5.12×10⁻⁷  2.59×10⁻²
1    44,800,225   44,805,990   5,765      PLK3          2.48   9.35×10⁻⁷  2.89×10⁻²
5    71,197,646   71,208,130   10,484     GUSBP9        1.65   1.29×10⁻⁶  2.89×10⁻²
8    63,015,079   63,039,171   24,092     GGH           1.27   1.78×10⁻⁶  2.89×10⁻²
6    11,538,278   11,583,524   45,246     TMEM170B      8.02   2.47×10⁻⁶  2.89×10⁻²
11   64,223,799   64,226,254   2,455      TRPT1         0.75   2.90×10⁻⁶  2.89×10⁻²
16   19,701,934   19,718,235   16,301     KNOP1         0.78   3.36×10⁻⁶  2.89×10⁻²
16   3,222,325    3,236,221    13,896     ZNF200        0.82   3.39×10⁻⁶  2.89×10⁻²
6    30,617,709   30,626,395   8,686      MRPS18B       0.57   3.48×10⁻⁶  2.89×10⁻²
7    66,682,164   66,811,464   129,300    RABGEF1       1.46   3.65×10⁻⁶  2.89×10⁻²
22   49,900,229   49,918,458   18,229     ALG12         1.19   3.71×10⁻⁶  2.89×10⁻²
12   52,076,841   52,082,084   5,243      AC025259.1    0.69   5.78×10⁻⁶  3.62×10⁻²
19   19,668,796   19,683,509   14,713     ZNF101        0.61   6.16×10⁻⁶  3.62×10⁻²
1    151,156,629  151,159,749  3,120      TNFAIP8L2     0.16   6.35×10⁻⁶  3.62×10⁻²
8    33,473,386   33,513,601   40,215     TTI2          0.80   6.79×10⁻⁶  3.62×10⁻²
13   41,457,559   41,470,882   13,323     RGCC          4.27   7.44×10⁻⁶  3.74×10⁻²
9    122,144,058  122,159,819  15,761     NDUFA8        0.59   7.73×10⁻⁶  3.74×10⁻²
11   93,741,591   93,764,749   23,158     C11orf54      0.59   8.66×10⁻⁶  3.99×10⁻²
16   85,690,084   85,751,129   61,045     C16orf74      1.62   9.17×10⁻⁶  4.04×10⁻²
7    80,742,538   80,922,359   179,821    SEMA3C        1.81   1.03×10⁻⁵  4.10×10⁻²
19   21,397,119   21,427,573   30,454     ZNF493        1.65   1.05×10⁻⁵  4.10×10⁻²
11   4,384,897    4,393,696    8,799      TRIM21        0.28   1.18×10⁻⁵  4.30×10⁻²
19   52,949,379   52,962,911   13,532     ZNF816        0.72   1.21×10⁻⁵  4.30×10⁻²
10   43,436,841   43,483,179   46,338     ZNF487        1.44   1.23×10⁻⁵  4.30×10⁻²
19   57,466,663   57,477,570   10,907     ZNF772        0.84   1.27×10⁻⁵  4.30×10⁻²
22   42,509,968   42,519,802   9,834      RRP7A         0.43   1.34×10⁻⁵  4.31×10⁻²
21   36,069,941   36,073,166   3,225      CBR1          0.35   1.36×10⁻⁵  4.31×10⁻²
2    98,619,106   98,731,126   112,020    MGAT4A        2.80   1.60×10⁻⁵  4.91×10⁻²
1    145,911,350  145,918,837  7,487      PEX11B        0.67   1.66×10⁻⁵  4.94×10⁻²

ᵃ Fold change (FC) in gene expression level between depressed twins and their non-depressed co-twins.
ᵇ Adjusted for twin age, sex, BMI, smoking, alcohol consumption, education, and family income.
ᶜ Adjusted for a total of 10,329 genes.
Table 2-5. Significant CpG probes replicated in the brain
Probe       Gene      Chr  Position     FC    P a        q          Direction with blood
cg09895920  CREB3L4   1    153,941,186  1.27  1.07×10⁻²  3.21×10⁻²  Y
cg24563094  GAREML    2    26,395,458   1.14  1.44×10⁻³  7.23×10⁻³  Y
cg22470850  GAREML    2    26,395,824   1.13  1.77×10⁻³  8.85×10⁻³  Y
cg19283506  CHST10    2    101,034,270  1.14  2.87×10⁻³  3.44×10⁻²  Y
cg03902565  NNT       5    43,603,176   1.12  2.81×10⁻³  2.25×10⁻²  Y
cg03249630  ANKRD22   10   90,611,782   1.18  4.26×10⁻³  2.98×10⁻²  Y
cg01561719  ANKRD22   10   90,611,855   1.23  4.46×10⁻⁴  3.12×10⁻³  Y
cg03818395  ANKRD22   10   90,612,228   1.10  7.10×10⁻³  4.26×10⁻²  Y
cg18395636  RAB38     11   87,908,785   1.20  1.31×10⁻³  1.05×10⁻²  Y
cg16837338  KDM2B     12   122,018,770  1.23  9.67×10⁻⁵  1.16×10⁻³  N
cg26509318  KDM2B     12   122,019,760  1.24  3.40×10⁻³  1.70×10⁻²  N
cg11549417  RASA3     13   114,814,643  1.23  1.35×10⁻³  9.45×10⁻³  Y
cg04202511  NFATC3    16   68,117,991   1.17  2.45×10⁻³  2.94×10⁻²  Y
cg17125623  NFATC3    16   68,119,985   1.46  2.74×10⁻³  1.64×10⁻²  Y
cg15213605  SF3B3     16   70,557,485   0.79  5.25×10⁻⁴  5.25×10⁻³  N
cg07450021  KIAA0513  16   85,097,151   1.19  7.46×10⁻⁴  7.46×10⁻³  N
cg17628249  C17orf64  17   58,499,854   1.22  6.58×10⁻⁴  6.58×10⁻³  Y
cg02172058  C17orf64  17   58,499,911   1.24  2.63×10⁻⁵  1.58×10⁻⁴  Y
cg06697439  ZBTB45    19   59,031,463   0.87  3.06×10⁻³  2.14×10⁻²  N
cg16007279  MAFF      22   38,598,948   1.15  4.43×10⁻⁴  3.10×10⁻³  Y
cg09035736  MAFF      22   38,599,166   1.21  4.44×10⁻⁴  1.78×10⁻³  Y
a Adjusted for age and gender.
Table 2-6. Replication of differentially expressed genes (DEGs) in the brain
Chr  Start (bp)   End (bp)     Nearest gene  FC a  P b        Direction with blood
5    71,197,646   71,208,130   GUSBP9        0.28  1.52×10⁻⁴  N
9    122,144,058  122,159,819  NDUFA8        0.03  4.22×10⁻⁴  Y
a Fold change (FC) in gene expression level between depressed twins and their non-depressed co-twins.
b Adjusted for age, sex, RIN, and brain pH.
62 Table 2 7. List of significant correlation pairs between DNA methylation and cis acting gene expression in peripheral blood monocytes Gene Chr DNA Methylation Expression Corre lation P a Probe Position (bp) Start (bp) End (bp) TAL1 1 cg06463365 47,697,733 47,700,160 47,703,358 0.88 1.40Ã—10 6 SLC25A25 9 cg07688412 130,830,096 130,834,219 130,837,486 0.87 1.70Ã—10 6 SH3GL3 15 cg27648738 84,115,811 84,117,052 84,121,779 0.96 3.20Ã—10 6 ZEB2 2 cg03424727 145,277,646 145,278,318 145,284,285 0.96 5.40Ã—10 6 SRI 7 cg06737937 87,849,496 87,851,947 87,856,642 0.93 6.30Ã—10 6 RPGRIP1L 16 cg26746331 53,737,506 53,738,259 53,745,820 0.85 7.10Ã—10 6 TAL1 1 cg01418261 47,697,663 47,700,160 47,703,358 0.95 7.20Ã—10 6 MAGI2 7 cg19591626 78,400,561 78,405,293 78,408,797 0.9 0 8.00Ã—10 6 NDRG4 16 cg17650822 58,497,795 58,500,176 58,508,125 0.96 8.40Ã—10 6 VPS37D 7 cg13662144 73,082,340 73,084,835 73,089,169 0.93 8.70Ã—10 6 TAL1 1 cg19918343 47,697,673 47,700,160 47,703,358 0.96 8.80Ã—10 6 NNT 5 cg08052882 43,602,666 43,604,225 43,611,261 0.95 1.03Ã—10 5 SPRED2 2 cg14480116 65,594,890 65,596,164 65,603,423 0.85 1.03Ã—10 5 11 cg07211140 32,455,025 32,455,005 32,461,579 0.93 1.13Ã—10 5 MAFF 22 cg07207286 38,598,880 38,601,665 38,607,349 0.96 1.22Ã—10 5 RAB1B 11 cg15615396 66,035,392 66,035,869 66,040,194 0.94 1.29Ã—10 5 KDM2B 12 cg15234492 122,019,076 122,021,012 122,025,297 0.86 1.34Ã—10 5 5 cg22464292 140,777,446 140,778,318 140,783,255 0.93 1.38Ã—10 5 N4BP2L2 13 cg17936564 33,113,331 33,114,268 33,121,145 0.88 1.46Ã—10 5 KLC2 11 cg03128921 66,035,086 66,035,869 66,040,194 0.93 1.46Ã—10 5
63 Table 2 7. Continued Gene Chr DNA Methylation Expression Corre lation P a Probe Position (bp) Start (bp) End (bp) ZNF212 7 cg12695158 148,936,883 148,937,822 148,943,315 0.95 1.58Ã—10 5 RPGRIP1L 16 cg26692097 53,738,201 53,738,259 53,745,820 0.91 1.66Ã—10 5 NRXN3 14 cg05468833 79,745,664 79,747,186 79,752,834 0.93 1.68Ã—10 5 LRRC45 17 cg11040439 79,980,929 79,981,680 79,985,887 0.87 1.72Ã—10 5 CREB3L4 1 cg01387743 153,940,674 153,942,761 153,949,543 0.95 1.73Ã—10 5 PCDHGA11 5 cg18118262 140,800,424 140,778,318 140,783,255 0.88 1.75Ã—10 5 RP11 5 cg18371052 8,457,721 8,460,078 8,466,246 0.9 1.79Ã—10 5 SH3GL3 15 cg22946150 84,116,107 84,117,052 84,121,779 0.91 1.80Ã—10 5 CTD 5 cg01817364 43,037,411 43,038,634 43,043,702 0.9 1.85Ã—10 5 UFC1 1 cg00939106 161,123,698 161,125,610 161,129,715 0.88 1.95Ã—10 5 MAGI2 7 cg16678001 78,400,383 78,405,293 78,408,797 0.85 2.11Ã—10 5 FAM20B 1 cg00562731 178,995,133 178,996,577 179,000,118 0.96 2.11Ã—10 5 NFATC3 16 cg07026259 68,119,185 68,119,586 68,124,112 0.92 2.28Ã—10 5 RP11 15 cg24750854 69,222,903 69,224,060 69,229,440 0.94 2.30Ã—10 5 HIST1H2BI 6 cg04704193 26,272,200 26,273,038 26,280,028 0.93 2.50Ã—10 5 ZBTB45 19 cg11457695 59,030,948 59,032,251 59,039,415 0.94 2.60Ã—10 5 ZFP64 20 cg20182785 50,722,303 50,723,020 50,730,004 0.91 2.76Ã—10 5 RP11 5 cg18394854 8,457,818 8,460,078 8,466,246 0.89 2.83Ã—10 5 SORBS2 4 cg12066473 186,733,331 186,734,615 186,742,456 0.92 2.88Ã—10 5 CTD 16 cg08346731 82,204,172 82,205,361 82,210,598 0.94 2.91Ã—10 5
64 Table 2 7. Continued Gene Chr DNA Methylation Expression Corre lation P a Probe Position (bp) Start (bp) End (bp) FRMD4B 3 cg18433615 69,435,504 69,436,020 69,441,438 0.86 2.93Ã—10 5 AC092431.1 2 cg22629907 69,871,140 65,596,164 65,603,423 0.95 2.95Ã—10 5 FANCC 9 cg10862471 98,079,646 98,084,022 98,090,465 0.87 3.10Ã—10 5 ACTL6A 3 cg03839554 179,280,332 179,281,338 179,284,744 0.86 3.12Ã—10 5 CTD 5 cg23810282 43,037,519 43,038,634 43,043,702 0.95 3.14Ã—10 5 RP11 5 cg24581226 8,457,970 8,460,078 8,466,246 0.94 3.15Ã—10 5 SLC25A11 17 cg03889382 4,842,765 4,844,332 4,851,923 0.85 3.26Ã—10 5 KIAA0513 16 cg06276064 85,096,632 85,097,705 85,101,296 0.86 3.49Ã—10 5 FAM59B 2 cg17129645 26,395,833 26,397,782 26,404,786 0.84 3.51Ã—10 5 N4BP2L2 13 cg21921456 33,113,032 33,114,268 33,121,145 0.91 3.58Ã—10 5 SWSAP1 19 cg08405405 11,485,325 11,487,244 11,492,308 0.96 3.65Ã—10 5 TAL1 1 cg11766986 47,697,550 47,700,160 47,703,358 0.91 3.67Ã—10 5 AC002456.2 7 cg25757472 90,224,583 90,227,038 90,230,173 0.87 3.69Ã—10 5 MYO1C 17 cg03079497 1,390,554 1,393,069 1,397,501 0.89 3.80Ã—10 5 6 cg19147015 30,297,941 30,299,839 30,303,837 0.74 3.88Ã—10 5 STRA13 17 cg15578811 79,981,292 79,981,680 79,985,887 0.84 3.94Ã—10 5 C17orf64 17 cg12131208 58,499,700 58,472,476 58,478,500 0.96 3.96Ã—10 5 STRA13 17 cg25953504 79,981,121 79,981,680 79,985,887 0.84 3.96Ã—10 5 AC019181.2 2 cg20557037 165,698,099 165,700,427 165,704,462 0.93 3.97Ã—10 5 N4BP2L2 13 cg11630632 33,113,343 33,114,268 33,121,145 0.86 4.01Ã—10 5
65 Table 2 7. Continued Gene Chr DNA Methylation Expression Corre lation P a Probe Position (bp) Start (bp) End (bp) RAB38 11 cg17108629 87,908,805 85,781,097 85,788,010 0.93 4.07Ã—10 5 NRXN3 14 cg09260207 79,746,520 79,747,186 79,752,834 0.96 4.21Ã—10 5 ZBTB45 19 cg26634707 59,030,662 59,032,251 59,039,415 0.9 4.28Ã—10 5 ZNF212 7 cg07704585 148,936,630 148,937,822 148,943,315 0.85 4.37Ã—10 5 NNT 5 cg08420334 43,603,343 43,604,225 43,611,261 0.87 4.38Ã—10 5 NNT 5 cg00452016 43,603,138 43,604,225 43,611,261 0.94 4.51Ã—10 5 STRA13 17 cg04875987 79,981,264 79,981,680 79,985,887 0.94 4.51Ã—10 5 FAM20B 1 cg06528214 178,995,107 178,996,577 179,000,118 0.88 4.89Ã—10 5 HSPB11 1 cg15513671 54,412,007 54,414,946 54,420,001 0.85 4.90Ã—10 5 C17orf64 17 cg06752482 58,499,816 58,472,476 58,478,500 0.94 4.91Ã—10 5 11 cg27409910 32,454,216 32,455,005 32,461,579 0.8 5.12Ã—10 5 HTRA4 8 cg21184369 38,831,148 38,832,480 38,837,576 0.89 5.24Ã—10 5 TMEM194A 12 cg21721432 57,472,784 57,474,160 57,477,980 0.86 5.25Ã—10 5 FRMD4B 3 cg19522075 69,435,780 69,436,020 69,441,438 0.84 5.40Ã—10 5 HSPA13 21 cg01662102 15,755,986 15,758,849 15,764,866 0.87 5.46Ã—10 5 FTO 16 cg18821731 53,737,871 53,738,259 53,745,820 0.91 5.48Ã—10 5 NFATC3 16 cg07981599 68,119,049 68,119,586 68,124,112 0.91 5.49Ã—10 5 11 cg25835307 87,908,134 85,781,097 85,788,010 0.89 5.66Ã—10 5 STRA13 17 cg25213539 79,981,084 79,981,680 79,985,887 0.91 5.81Ã—10 5 KLC2 11 cg15201417 66,034,922 66,035,869 66,040,194 0.96 5.83Ã—10 5
66 Table 2 7. Continued Gene Chr DNA Methylation Expression Corre lation P a Probe Position (bp) Start (bp) End (bp) PICALM 11 cg16633848 85,780,144 85,781,097 85,788,010 0.96 5.84Ã—10 5 VPS37D 7 cg24954661 73,082,001 73,084,835 73,089,169 0.91 5.85Ã—10 5 C16orf95 16 cg10067538 87,351,824 87,352,557 87,358,880 0.88 5.88Ã—10 5 SF3B3 16 cg07751125 70,557,411 70,558,784 70,564,104 0.92 5.94Ã—10 5 FAM20B 1 cg05383153 178,995,099 178,996,577 179,000,118 0.95 5.97Ã—10 5 FAM59B 2 cg24563094 26,395,458 26,397,782 26,404,786 0.84 6.02Ã—10 5 ZBTB45 19 cg14212467 59,030,979 59,032,251 59,039,415 0.91 6.09Ã—10 5 SLC25A11 17 cg11432441 4,842,610 4,844,332 4,851,923 0.93 6.11Ã—10 5 ZEB2 2 cg19101754 145,277,381 145,278,318 145,284,285 0.92 6.11Ã—10 5 USP32 17 cg18654231 58,469,739 58,472,476 58,478,500 0.88 6.28Ã—10 5 RAB1B 11 cg02520768 66,035,485 66,035,869 66,040,194 0.94 6.35Ã—10 5 CIRH1A 16 cg00615892 69,166,530 68,119,586 68,124,112 0.96 6.41Ã—10 5 RFC5 12 cg00670756 118,454,418 118,457,394 118,460,794 0.84 6.47Ã—10 5 RP11 5 cg17877220 8,458,089 8,460,078 8,466,246 0.92 6.54Ã—10 5 CTH 1 cg02917772 70,876,623 70,878,838 70,884,286 0.85 6.54Ã—10 5 AC002456.2 7 cg26735135 90,224,886 90,227,038 90,230,173 0.86 6.54Ã—10 5 SRI 7 cg14644787 87,849,494 87,851,947 87,856,642 0.84 6.60Ã—10 5 NNT 5 cg12656077 43,602,605 43,604,225 43,611,261 0.85 6.65Ã—10 5 ZBTB45 19 cg17364234 59,031,070 59,032,251 59,039,415 0.88 6.70Ã—10 5 CTD 5 cg04268624 43,037,285 43,038,634 43,043,702 0.87 6.85Ã—10 5
67 Table 2 7. Continued Gene Chr DNA Methylation Expression Corre lation P a Probe Position (bp) Start (bp) End (bp) NAIF1 9 cg16950519 130,829,748 130,834,219 130,837,486 0.94 6.86Ã—10 5 C16orf95 16 cg02223001 87,351,033 87,352,557 87,358,880 0.91 6.97Ã—10 5 LRRC45 17 cg04489846 79,980,949 79,981,680 79,985,887 0.9 6.99Ã—10 5 SPRED2 2 cg00376294 65,594,797 65,596,164 65,603,423 0.89 7.03Ã—10 5 PICALM 11 cg09030501 85,779,252 85,781,097 85,788,010 0.96 7.14Ã—10 5 TMEM5 12 cg06437928 64,173,769 64,175,881 64,180,729 0.84 7.17Ã—10 5 SF3B3 16 cg20435469 70,557,679 70,558,784 70,564,104 0.84 7.17Ã—10 5 13 cg23132774 114,814,171 114,815,864 114,819,100 0.74 7.21Ã—10 5 NRXN3 14 cg22908679 79,746,212 79,747,186 79,752,834 0.89 7.28Ã—10 5 PCDHGA11 5 cg26647197 140,800,398 140,778,318 140,783,255 0.93 7.33Ã—10 5 KLC2 11 cg14442997 66,035,267 66,035,869 66,040,194 0.84 7.39Ã—10 5 SLC30A3 2 cg10629682 27,486,061 27,488,436 27,495,664 0.91 7.53Ã—10 5 RAB1B 11 cg02351179 66,035,370 66,035,869 66,040,194 0.86 7.53Ã—10 5 CTH 1 cg03755098 70,876,598 70,878,838 70,884,286 0.91 7.65Ã—10 5 MAFF 22 cg16007279 38,598,948 38,601,665 38,607,349 0.95 7.76Ã—10 5 NRXN3 14 cg14335579 79,745,997 79,747,186 79,752,834 0.91 7.86Ã—10 5 NDRG4 16 cg08791131 58,497,801 58,500,176 58,508,125 0.94 8.03Ã—10 5 STRA13 17 cg17241816 79,981,086 79,981,680 79,985,887 0.93 8.09Ã—10 5 NDRG4 16 cg13031432 58,497,767 58,500,176 58,508,125 0.93 8.15Ã—10 5 TAL1 1 cg26939858 47,697,669 47,700,160 47,703,358 0.92 8.30Ã—10 5
68 Table 2 7. Continued Gene Chr DNA Methylation Expression Corre lation P a Probe Position (bp) Start (bp) End (bp) NNT 5 cg13102118 43,602,505 43,604,225 43,611,261 0.89 8.38Ã—10 5 RNF219 1 cg12548634 44,884,109 44,885,172 44,890,475 0.85 8.52Ã—10 5 TMEM5 12 cg05228379 64,173,617 64,175,881 64,180,729 0.92 8.75Ã—10 5 SPRED2 2 cg10831427 65,594,760 65,596,164 65,603,423 0.89 8.75Ã—10 5 ZWILCH 15 cg17722664 66,797,429 66,800,359 66,807,961 0.91 8.81Ã—10 5 PICALM 11 cg02920502 85,780,029 85,781,097 85,788,010 0.86 9.02Ã—10 5 LPAR2 19 cg02362385 19,739,192 19,740,069 19,747,480 0.94 9.09Ã—10 5 5 cg04553690 140,777,501 140,778,318 140,783,255 0.93 9.12Ã—10 5 11 cg17428011 123,173,101 123,175,220 123,179,990 0.84 9.18Ã—10 5 MYO1C 17 cg02317299 1,390,182 1,393,069 1,397,501 0.85 9.23Ã—10 5 FAM20B 1 cg22332891 178,995,082 178,996,577 179,000,118 0.96 9.24Ã—10 5 RP11 5 cg04828267 8,457,538 8,460,078 8,466,246 0.94 9.44Ã—10 5 RNF220 1 cg16547629 44,884,131 44,885,172 44,890,475 0.92 9.56Ã—10 5 AC019181.2 2 cg26373663 165,698,219 165,700,427 165,704,462 0.94 9.58Ã—10 5 NFATC3 16 cg09049717 68,119,261 68,119,586 68,124,112 0.91 9.60Ã—10 5 C16orf95 16 cg01367424 87,351,490 87,352,557 87,358,880 0.86 9.64Ã—10 5 KLC2 11 cg10498476 66,035,147 66,035,869 66,040,194 0.95 9.68Ã—10 5 CTH 1 cg00968021 70,876,888 70,878,838 70,884,286 0.93 9.74Ã—10 5 C17orf64 17 cg02172058 58,499,911 58,472,476 58,478,500 0.86 9.95Ã—10 5 KLC2 11 cg01280128 66,034,963 66,035,869 66,040,194 0.86 9.97Ã—10 5 a Calculated by permutation.
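The permutation-based P values reported in Table 2-7 can be illustrated with a minimal sketch. This is not the pipeline used in the study: it assumes a two-sided test on the Pearson correlation, shuffles the expression values to break the methylation-expression pairing, and uses hypothetical function names (`pearson`, `permutation_p`) and a default of 10,000 permutations chosen only for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p(x, y, n_perm=10000, seed=0):
    """Two-sided permutation P value for the correlation between x and y.

    x: methylation values of one probe across samples (assumed input)
    y: expression values of the paired gene across the same samples
    """
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

With this estimator the smallest attainable P value is 1/(n_perm + 1), so resolving values on the order of 10⁻⁶ would require at least a million permutations per pair.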
69 Table 2 8. Co methylation modules along with hub genes and biological pathways Module Size P a Top enriched pathway Hub gene b Genes in the module 1 167 9.22Ã—10 8 Neural nucleus development SLC34A1 SLC34A1, TTN, BNC2, VWA1, MAFF, TBC1D24, GPX8, RUSC2, MLEC, BCAN, HSPB11, SH2D4A, PIK3CA, PDZD3, DKC1, ARHGEF26, INPP5E, BNC1, ARID3B, NINJ2, PKMYT1, ZBTB5, KCNS3, RAD1, HLA DOA, GPLD1, ACSS2, EIF4G2, ADNP, FAT3, UACA, SIM2, CDON, RBP1, RSPO4, CDH11, PIGY, AAK1, MAPRE2, USP6NL, PANX2, ADGRE2, TMEM33, FANCC, NKD2, TMCO3, PHOX2B, FAM207A, SWSAP1, HTT, ZFAT, GLP1R, DIRC3, CCDC129, NEDD8, BCL2L11, STK32B, DENND3, SLC25A25, BDKRB2, AXIN2, ZNF384, GPR97, KLF9, E2F2, UFC1, CUL1, TBC1D31, RNF4, PAQR5, CNDP2, LTF, ATAD5, KLHL8, MR1, SLC30A3, VPS37D, FKBP5, NCOR2, ACTL6A, ATL3, MGMT, CAPZB, CTH, FAM150B, JADE1, MOB2, INPP4B, TPM3, TMEM87B, CATSPERB, GLDN, TOX, RPP30, NTM, FAM98C, MAP2K5, GARS, PCDHGA11, ZNF212, SGSM3, LHX3, NOX5, ALDH1A3, CASC5, CLMN, MIPOL1, FXYD5, PYGM, TJP1, ARHGEF7, IGFBP4, ELMO1, ZNF679, FAM19A5, STK11IP, C22orf13, RP11, ZNF672, THAD A, COL9A3, FTO, GDA, RFX1, PTRF, BOLA1, NPTX1, TTLL10, RAB3GAP1, SCN5A, SYDE1, PKD1L2, RNF8, SNX17, MAGI2, HNRNPA0, SORCS2, TRIB3, ADD1, NRXN3, CNIH3, BARX2, CDK14, CYB561, DGKQ, UNKL, AGAP1, MATN2, SHTN1, ZNF710, HMGXB3, GRIN1, TMEM5, BCL2L13, TM4SF20, VP S41, CHST10, PHF21B, PCYT2, HOXA9, OR2AG1, PRKD3, TESK2, KHSRP, EXOC4, RFC5, C9orf37
70 Table 2 8. Continued Module Size P a Top enriched pathway Hub gene Genes in the module 2 80 2.17Ã—10 4 Positive regulation of endothelial cell proliferation MYO3B MYO3B, TMEM194B, PPID, SETD1B, BRE, PGM1, MAPK10, C6orf48, RASGRP4, SNCAIP, RAB7A, GIN1, PRSS21, PROX1, CDH13, STRIP2, RAB38, UBE3B, GRWD1, SORBS2, DOK4, SEC1P, PFN1, UTP15, ATP8A1, TSPO2, CCDC17, ANKRD11, TRIM39, WNT7B, SHROOM3, SFMBT2, BAT3, MIA3, FBXO42 , GAREML, LAMA2, RDBP, EIF4E2, FAM161B, ABI2, DLK2, DDIT3, SP110, ELAC1, FOXI3, SORCS1, C17orf64, NCF1B, FRMD4B, ACKR2, STAC, IRGM, CHD4, SRSF10, IRAK3, JUND, GHR, KDM2B, MBD6, SYN3, KIRREL3, ADAD2, CELSR1, CREB3L4, TAL1, ENPP7, LRRC1, C19orf81, SF3B3, HAN D1, KIAA0513, PGGT1B, E2F3, ADD2, HADH, F12, FOXD3, ZBTB4, ACTR5 3 57 9.22Ã—10 8 Cellular response to cAMP CACNA1D CACNA1D, SNX24, ANKRD22, PICALM, FAM96B, CLVS1, CACNA1E, CCR6, C3orf75, ADAM32, NEMP1, CTDP1, GPR160, MAEA, ZNF585B, CCDC28A, MCF2L, ZNF618, RRN3, EDAR, PIK3CG, LOC286135, FAM20B, NNT, MADD, HMGB3, FUBP3, THSD1P, RAF1, HADHB, DRG2, CUTA, KCNQ1, SIDT2, LPAR2, RBM47, BHLHE40, ZBTB45, KREMEN1, R3HDM1, ARL15, MYOZ3, OSGIN2, ATE1, ALS2CL, CUX1, TBKBP1, CLIP2, FAM125B, SPRED2, ACAD9, KRI1, LI PE, BMP4, NFATC3, ZWILCH, GFI1 a P value for the association between MDD and the first three eigenvalues of a module b Hub gene: gene with the highest degree in the network
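Footnote b defines the hub gene as the gene with the highest degree in the network. A minimal sketch of that definition follows, assuming the module is available as an undirected edge list; the gene labels and edges are hypothetical placeholders, not edges from the modules above, and ties are broken alphabetically as an arbitrary choice.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct neighbors for every gene in an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {gene: len(nbrs) for gene, nbrs in neighbors.items()}


def hub_gene(edges):
    """Return the gene with the highest degree (alphabetical tie-break)."""
    degrees = node_degrees(edges)
    return max(sorted(degrees), key=lambda gene: degrees[gene])


# Hypothetical toy module, for illustration only.
toy_edges = [
    ("gene_A", "gene_B"),
    ("gene_A", "gene_C"),
    ("gene_A", "gene_D"),
    ("gene_B", "gene_C"),
]
toy_hub = hub_gene(toy_edges)
```

The same degree counts underlie the network connectivity comparisons reported for the module figures, where mean node degree is contrasted between depressed twins and their co-twins.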
71 Table 2 9. Co expression modules along with hub genes and biological pathways Module Size P a Top enriched pathway Hub gene Genes in the module 1 94 2.72Ã—10 5 positive regulation of cytokine secretion GPR34 GPR34, ZNF304, GTF2E1, SRSF5, MGAT4A, SCARNA13, AC025259.1, RTP4, EIF4EP1, IER3, SVIL AS1, ARHGAP24, DHRS4L2, CLEC5A, HPS5, ORM1, COL17A1, HIST1H3B, CALM2P2, PLK3, ADPRHL2, RN7SL752P, SNORD3B 2, HTRA1, CLEC1B, GIMAP8, AC087521.4, NEU1, AL032821.1, CCDC125, GABARAPL1, PMVK, PYGL, ZNF671, ZNF816, PGM1, TTN AS1, TCTN3, PECR, SLC11A1, AC100810.1, KLF7 IT1, NT 5C3B, AHR, UGCG, FAM43A, DGKD, SPSB2, SNORD46, ANKRD28, RABGEF1, AL583722.2, SNORA79, LINC01578, HYLS1, GPSM2, PSMD4, ZNF232, LTB4R, HIKESHI, RASGEF1B, GIMAP1 GIMAP5, DPCD, AKIRIN2, TTC9C, NFKBID, TRA2B, KBTBD11, TSPAN2, SNHG15, GIMAP1, LIN7A, RN7SL600P, A TP8A1, STAG3L3, RBBP5, MIR222HG, TRPT1, KNOP1, TMCO4, CIP2A, ATP1B1, ZKSCAN4, ST13, DNTTIP1, PAQR8, RPP25L, DDHD2, SNORA32, CALD1, RAB31, FADD, PAM, CD300A 2 62 3.32Ã—10 5 negative regulation of NF kappaB transcription factor activity MMAA MMAA, PLAC8, CD44 AS1, PIGA, MRPL27, TNFAIP8L2, AC245128.3, MR1, DHFR, CRTAM, AL627309.2, TTI2, SNORD3B 1, POLD1, MRPS26, AL132656.3, ZNF487, PFKFB2, SRGN, ZNF101, CYTIP, C3orf14, NUP50 AS1, TAF1B, GUSBP1, NDUFA8, ID1, TBC1D7, CYP4F3, RRP7A, AC092651.2, AC008993.1, YIP F4, TOMM7, ZNF223, MIR22HG, HSPBP1, CRIPT, KCNJ2, CBR1, PCNX2, DEFA3, HIST2H2AB, SIAH2, ARG1, RFK, ADHFE1, SEMA3C, ZNF627, OXSM, MIR181A1HG, AC020916.1, ABHD10, S100P, NAMPT, YOD1, ATP13A3, ZEB2 AS1, AC011472.2, PEX11B, CRYBB2P1, PIWIL4
72 Table 2 9 . Continued Module Size P a Top enriched pathway Hub gene Genes in the module 3 53 1.62Ã—10 5 regulation of response to stress RNF181 RNF181, MRPL50, ZNF200, UAP1L1, FAR2, NME2, SNORA71E, EIF2S1, WASH5P, AC010761.1, CASP6, AP001372.3, NTHL1, CMTR2, MXRA7, RBBP6, MRPS18B, PPP1R10, RFLNB, LINC01003, AL121761.1, AC012368.1, PPID, BUD13, DBR1, MINPP1, ISOC2, CCT7, AC008038.1, ERVK9 11, USP41, PPM1G, F2RL1, TFF3, ZNF493, HPS6, RAPGEF6, DNASE1L1, UQCRFS1P1, BLM, ARL11, TMEM170B, SOWAHD, MRFAP1L1, PGP, SERP INB10, H1F0, ANKRD36B, AC087385.1, YPEL5, CDKN2D, AC115618.3, NECAP2 4 46 9.40Ã—10 5 regulation of neuron death TCEANC TCEANC, CLTCL1, LYSMD2, TSPAN4, C16orf74, ANXA1, NDUFAF4, STAG3L5P, AL133342.1, AC139495.1, TNFSF10, ALG12, FIS1, LILRB4, TMCC1 AS1, FAM107B, NFKBIA, ETS2, BBS10, AC016876.2, RBL1, NAT1, SGK1, THUMPD3 AS1, CDCA7L, HP, GSTM4, VPS72, MFAP1, MRPL44, AL928970.1, PTGER2, MCL1, MRPL46, TEX2, YRDC, RGCC, GPANK1, HSPA1B, AC015967.2, FCER1A, DDX5, MYCT1, GUSBP3, MGAT5, AL158152.1 5 37 5.02Ã—10 5 regulation of response to stress PHF23 PHF23, SNORA72, IFT172, FIP1L1, PELI1, CD59, TRIM21, CSGALNACT1, SUCLA2, MTO1, HMGB3, FAM96B, NME4, HNRNPLL, ZNF772, GUSBP9, RMI1, ICA1, CLEC4E, AC037198.2, BORA, G0S2, AL133445.2, TAF8, RF02121, ITGB7, APIP, IGF2R, RN7SKP255, EPB41L2, MBOAT2, CAPN3, AC004492.1, C11orf54, MTERF3, MINCR, AC114878.1 a P value for the association between MDD and the first three eigenvalues of a module
Table 2-10. Pathway enrichment for DMRs with nominal associations with MDD
Term           Description                                                      P a        q b
GO:0031098     Stress-activated protein kinase signaling cascade                2.90×10⁻⁶  4.65×10⁻³
GO:0046627     Negative regulation of insulin receptor signaling pathway        3.29×10⁻⁵  2.63×10⁻²
GO:0070302     Regulation of stress-activated protein kinase signaling cascade  4.79×10⁻⁵  2.56×10⁻²
KEGG:hsa04150  mTOR signaling pathway                                           7.20×10⁻⁵  2.88×10⁻²
GO:0043524     Negative regulation of neuron apoptotic process                  7.72×10⁻⁵  2.47×10⁻²
GO:0048011     Nerve growth factor receptor signaling pathway                   1.49×10⁻⁴  3.96×10⁻²
a Obtained by permuting random loci matched by gene density using DEPICT.
b Adjusted for 1,627 GO terms/pathways.
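The q values in Table 2-10 are adjusted for 1,627 GO terms/pathways. The table does not state which correction was applied; the sketch below assumes a Benjamini-Hochberg false discovery rate adjustment, which is a common choice but an assumption here. The function name `bh_qvalues` and its optional `n_tests` argument are hypothetical.

```python
def bh_qvalues(pvals, n_tests=None):
    """Benjamini-Hochberg adjusted P values (q values).

    pvals:   list of raw P values
    n_tests: total number of tests performed; defaults to len(pvals).
             If only the smallest P values are supplied together with a
             larger n_tests, the result is an upper bound on the true q.
    """
    m = n_tests if n_tests is not None else len(pvals)
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    qvals = [0.0] * len(pvals)
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(len(order), 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        qvals[idx] = running_min
    return qvals
```

For a full analysis the function would be called on all 1,627 P values at once; supplying only the reported top terms with `n_tests=1627` gives conservative bounds rather than the exact adjusted values.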
74 Table 2 11. Result for sensitivity analysis of the identified DMRs Chr Start (bp) End (bp) Size (bp) Nearest gene # of pro bes Model 1 a Model 2 b Model 3 c Peak P Region P Â§ Peak P Region P Â§ Peak P Region P Â§ 16 2,866,834 2,868,001 1,168 PRSS21 10 2.67Ã—10 3 1.8Ã—10 4 9.28Ã—10 3 3.57Ã—10 4 7.48Ã—10 4 3.65Ã—10 4 5 43,037,123 43,037,666 544 7 5.25Ã—10 4 8.82Ã—10 4 5.86Ã—10 4 6.69Ã—10 4 4.45Ã—10 4 9.49Ã—10 4 1 54,411,017 54,412,009 993 HSPB11 18 3.84Ã—10 4 8.54Ã—10 4 7.93Ã—10 4 8.81Ã—10 4 8.48Ã—10 4 7.66Ã—10 4 2 69,870,526 69,871,424 899 AAK1 8 7.6 0 Ã—10 5 6.97Ã—10 4 1.95Ã—10 5 7.98Ã—10 4 4.83Ã—10 4 9.05Ã—10 4 4 186,732,926 186,733,331 406 SORBS2 8 8.31Ã—10 5 6.7Ã—10 5 6.79Ã—10 4 2.48Ã—10 5 7.75Ã—10 5 9.94Ã—10 5 2 26,395,359 26,395,859 501 GAREML 8 3.04Ã—10 5 1.09Ã—10 3 5.77Ã—10 4 1.78Ã—10 3 4.43Ã—10 4 3.12Ã—10 3 5 140,800,398 140,800,983 586 PCDHGA1 10 8.14Ã—10 3 1.18Ã—10 5 3.09Ã—10 3 2.66Ã—10 5 9.64Ã—10 5 6.77Ã—10 5 1 153,940,616 153,941,285 670 CREB3L4 6 9.9 0 Ã—10 5 1.49Ã—10 5 5.49Ã—10 5 6.5 0 Ã—10 5 7.24Ã—10 5 1.75Ã—10 5 5 43,602,380 43,603,353 974 NNT 17 5.34Ã—10 3 8.5Ã—10 4 2.81Ã—10 3 6.47Ã—10 4 3.53Ã—10 4 4.81Ã—10 4 5 8,457,538 8,458,392 855 RP11 9 3.61Ã—10 4 8.35Ã—10 5 2.91Ã—10 4 7.43Ã—10 4 3.63Ã—10 5 8.14Ã—10 4 22 38,598,577 38,599,166 590 MAFF 9 1.61Ã—10 3 6.81Ã—10 2 6.95Ã—10 3 9.41Ã—10 2 9.68Ã—10 2 9.1Ã—10 2 12 64,173,610 64,174,367 758 TMEM5 9 8.64Ã—10 5 3.57Ã—10 3 8.12Ã—10 5 1.87Ã—10 3 1.34Ã—10 3 3.86Ã—10 3 19 11,484,448 11,485,452 1,005 SWSAP1 14 1.24Ã—10 4 7.93Ã—10 5 5.49Ã—10 4 8.09Ã—10 5 6.94Ã—10 5 3.92Ã—10 5 2 27,485,922 27,486,460 539 SLC30A3 8 4.54Ã—10 4 3.18Ã—10 3 3.69Ã—10 4 2.84Ã—10 3 2.76Ã—10 3 6.11Ã—10 3 2 101,034,246 101,034,295 50 CHST10 6 6.04Ã—10 5 2.07Ã—10 4 1.93Ã—10 5 3.26Ã—10 4 5.92Ã—10 4 1.59Ã—10 4 11 85,779,252 85,780,378 1,127 PICALM 10 1.62Ã—10 4 2.55Ã—10 3 8.7Ã—10 4 4.38Ã—10 3 7.7Ã—10 3 9.74Ã—10 3 13 114,814,024 114,814,401 378 5 1.82Ã—10 4 1.72Ã—10 3 7.59Ã—10 4 2.17Ã—10 3 6.97Ã—10 3 5.26Ã—10 3 11 32,454,216 32,455,025 810 
WT1 8 2.22Ã—10 3 4.85Ã—10 3 9.84Ã—10 3 4.17Ã—10 3 5.52Ã—10 4 3.91Ã—10 3 16 70,557,411 70,557,707 297 SF3B3 10 4.94Ã—10 3 2.22Ã—10 3 9.51Ã—10 4 1.36Ã—10 3 7.87Ã—10 4 6.71Ã—10 3 7 90,224,158 90,225,380 1,223 CDK14 11 2.78Ã—10 3 7.21Ã—10 3 4.94Ã—10 3 7.43Ã—10 3 7.56Ã—10 3 4.79Ã—10 3 11 87,908,134 87,908,805 672 RAB38 7 1.41Ã—10 3 8.91Ã—10 3 8.45Ã—10 3 4.16Ã—10 3 8.38Ã—10 3 4.42Ã—10 3 12 122,019,031 122,019,117 87 KDM2B 5 3.57Ã—10 4 8.68Ã—10 4 0.57Ã—10 4 8.19Ã—10 4 4.53Ã—10 4 4.66Ã—10 4
75 Table 2 11. Continued Chr Start (bp) End (bp) Size (bp) Nearest gene # of pro bes Model 1 a Model 2 b Model 3 c Model 3 c Peak P Region P Â§ Peak P Region P Â§ Peak P Region P Â§ 16 85,096,632 85,097,151 520 KIAA0513 5 5.76Ã—10 3 9.9Ã—10 4 2.79Ã—10 3 4.75Ã—10 4 6.94Ã—10 4 6.01Ã—10 4 2 65,594,021 65,595,186 1,166 SPRED2 6 6.55Ã—10 4 8.52Ã—10 4 9.11Ã—10 4 4.54Ã—10 4 7.51Ã—10 4 5.27Ã—10 4 9 98,079,646 98,080,622 977 FANCC 10 5.13Ã—10 4 2.63Ã—10 4 4.14Ã—10 4 1.33Ã—10 4 3.18Ã—10 4 8.92Ã—10 4 10 90,611,604 90,612,228 625 ANKRD22 7 6.03Ã—10 5 2.43Ã—10 4 7.37Ã—10 5 8.77Ã—10 4 5.64Ã—10 4 4.16Ã—10 4 5 140,777,344 140,777,655 312 9 3.66Ã—10 5 6.09Ã—10 5 6.63Ã—10 4 2.83Ã—10 5 5.71Ã—10 5 6.44Ã—10 5 1 178,994,834 178,995,133 300 FAM20B 8 6.76Ã—10 5 4.21Ã—10 3 4.29Ã—10 4 1.47Ã—10 3 9.15Ã—10 4 6.94Ã—10 3 7 148,936,572 148,937,410 839 ZNF212 9 8.55Ã—10 3 8.28Ã—10 5 3.43Ã—10 3 9.16Ã—10 5 9.32Ã—10 5 5.67Ã—10 5 16 68,118,822 68,119,261 440 NFATC3 9 8.47Ã—10 5 8.64Ã—10 5 9.47Ã—10 5 2.35Ã—10 5 5.53Ã—10 5 9.62Ã—10 5 19 59,030,662 59,031,081 420 ZBTB45 7 9.99Ã—10 3 4.26Ã—10 4 2.59Ã—10 3 1.14Ã—10 4 6.27Ã—10 4 3.88Ã—10 4 16 87,351,006 87,351,824 819 C16orf95 10 5.25Ã—10 4 3.35Ã—10 5 6.60Ã—10 4 1.68Ã—10 4 2.73Ã—10 5 1.63Ã—10 4 17 58,499,300 58,500,186 887 C17orf64 9 1.56Ã—10 3 2.61Ã—10 2 5.45Ã—10 3 2.16Ã—10 2 4.66Ã—10 2 7.55Ã—10 2 19 19,739,060 19,739,414 355 LPAR2 8 8.2Ã—10 5 4.23Ã—10 3 6.52Ã—10 5 6.28Ã—10 3 9.91Ã—10 3 2.32Ã—10 3 3 179,280,056 179,280,746 691 ACTL6A 9 2.15Ã—10 4 4.21Ã—10 5 2.28Ã—10 4 8.02Ã—10 5 2.84Ã—10 5 6.43Ã—10 5 1 70,876,598 70,877,381 784 CTH 9 3.56Ã—10 4 6.05Ã—10 3 2.61Ã—10 4 1.55Ã—10 3 7.49Ã—10 3 7.72Ã—10 3 7 78,400,383 78,400,769 387 MAGI2 5 7.62Ã—10 5 7.52Ã—10 4 8.38Ã—10 5 3.58Ã—10 4 3.04Ã—10 4 5.03Ã—10 4 6 30,297,174 30,297,941 768 TRIM39 10 3.53Ã—10 4 7.11Ã—10 3 3.42Ã—10 4 1.02Ã—10 3 8.88Ã—10 3 7.51Ã—10 3 15 69,222,400 69,223,018 619 NOX5 7 6.41Ã—10 4 5.91Ã—10 3 6.25Ã—10 4 9.42Ã—10 3 4.86Ã—10 3 7.77Ã—10 3 a Model 1 further adjusted for childhood traumatic 
experience . b Model 2 further adjusted for antidepressant usage. c Model 3 further adjusted for the use of antidepressants. Â§ Calculated by 100,000 times permutation.
76 Table 2 12. Results for sensitivity analysis of the identified DEGs Chr Start (bp) End (bp) Nearest gene Model 1 a Model 2 b Model 3 c FC P FC P FC P 2 215,996,329 216,082,955 PECR 1.56 9.24Ã—10 7 1.51 8.85Ã—10 7 2.05 7.68Ã—10 7 1 185,292,384 185,294,372 AL356273.3 1.41 6.61Ã—10 6 1.38 9.14Ã—10 6 0.83 1.43Ã—10 7 1 44,800,225 44,805,990 PLK3 2.71 7.09Ã—10 6 2.35 5.57Ã—10 5 1.55 8.10Ã—10 5 5 71,197,646 71,208,130 GUSBP9 1.48 6.95Ã—10 6 1.57 1.81Ã—10 5 0.85 1.61Ã—10 6 8 63,015,079 63,039,171 GGH 1.24 5.75Ã—10 6 1.25 2.26Ã—10 5 1.12 8.03Ã—10 6 6 11,538,278 11,583,524 TMEM170B 7.98 3.69Ã—10 5 7.8 9.78Ã—10 5 11.04 3.74Ã—10 5 11 64,223,799 64,226,254 TRPT1 0.76 6.57Ã—10 5 0.74 7.35Ã—10 5 1.27 6.49Ã—10 5 16 19,701,934 19,718,235 KNOP1 0.80 1 7.12Ã—10 5 0.75 4.38Ã—10 4 0.42 3.37Ã—10 4 16 3,222,325 3,236,221 ZNF200 0.82 1.06Ã—10 5 0.86 8.75Ã—10 5 0.64 1.76Ã—10 5 6 30,617,709 30,626,395 MRPS18B 0.58 6.83Ã—10 4 0.59 8.34Ã—10 3 0.51 8.56Ã—10 4 7 66,682,164 66,811,464 RABGEF1 1.45 1.24Ã—10 6 1.39 4.02Ã—10 5 0.86 5.19Ã—10 6 22 49,900,229 49,918,458 ALG12 1.13 7.03Ã—10 5 1.24 3.18Ã—10 5 1.21 5.76Ã—10 5 12 52,076,841 52,082,084 AC025259.1 0.78 3.37Ã—10 5 0.81 7.21Ã—10 4 0.59 8.37Ã—10 5 19 19,668,796 19,683,509 ZNF101 0.59 1.76Ã—10 4 0.61 1.35Ã—10 3 0.48 8.26Ã—10 4 1 151,156,629 151,159,749 TNFAIP8L2 0.17 1.02Ã—10 5 0.19 7.52Ã—10 4 0.13 1.92Ã—10 5 8 33,473,386 33,513,601 TTI2 0.81 2.58Ã—10 5 0.85 3.45Ã—10 4 0.75 4.37Ã—10 5 13 41,457,559 41,470,882 RGCC 4.13 2.01Ã—10 5 3.98 9.68Ã—10 4 4.55 8.63Ã—10 5 9 122,144,058 122,159,819 NDUFA8 0.59 2.10Ã—10 5 0.56 5.70Ã—10 4 0.89 9.20Ã—10 5 11 93,741,591 93,764,749 C11orf54 0.58 2.93Ã—10 5 0.59 3.91Ã—10 4 0.41 8.05Ã—10 5 16 85,690,084 85,751,129 C16orf74 1.68 5.59Ã—10 4 1.58 1.59Ã—10 3 1.86 2.90Ã—10 4 7 80,742,538 80,922,359 SEMA3C 1.72 7.57Ã—10 5 1.85 4.87Ã—10 4 1.2 1.98Ã—10 5
77 Table 2 12. Continued Chr Start (bp) End (bp) Nearest gene Model 1 a Model 2 b Model 3 c FC P FC FC P FC 19 21,397,119 21,427,573 ZNF493 1.56 9.24Ã—10 7 1.51 8.85Ã—10 7 2.05 7.68Ã—10 7 11 4,384,897 4,393,696 TRIM21 1.41 60.6Ã—10 6 1.38 9.14Ã—10 6 0.83 1.43Ã—10 7 19 52,949,379 52,962,911 ZNF816 2.71 70.9Ã—10 6 2.35 5.57Ã—10 5 1.55 8.01Ã—10 5 10 43,436,841 43,483,179 ZNF487 1.48 6.95Ã—10 6 1.57 1.81Ã—10 5 0.85 1.61Ã—10 6 19 57,466,663 57,477,570 ZNF772 1.24 5.75Ã—10 6 1.25 2.26Ã—10 5 1.12 8.03Ã—10 6 22 42,509,968 42,519,802 RRP7A 7.98 3.69Ã—10 5 7.8 9.78Ã—10 5 11.04 3.74Ã—10 5 21 36,069,941 36,073,166 CBR1 0.76 6.57Ã—10 5 0.74 7.35Ã—10 5 1.27 6.49Ã—10 5 2 98,619,106 98,731,126 MGAT4A 0.8 70.1Ã—10 5 0.75 4.38Ã—10 4 0.42 3.37Ã—10 4 1 145,911,350 145,918,837 PEX11B 0.82 1.06Ã—10 5 0.86 8.75Ã—10 5 0.64 1.76Ã—10 5 a Model 1 further adjusted for childhood traumatic experience . b Model 2 further adjusted for antidepressant usage. c Model 3 further adjusted for the use of antidepressants.
Figure 2-1. Manhattan plot displaying the DMRs associated with MDD in monozygotic discordant twin pairs (N = 79 pairs). The P values (-log10) of each DMR are plotted against their respective positions on each chromosome. The genome-wide threshold (q < 0.05) is indicated with a red line.
Figure 2-2. Genomic distribution and CpG content of identified DMRs. (a) Genomic distribution of identified DMRs associated with MDD: 1st exon 12%, 5'UTR 15%, body 16%, TSS1500 24%, TSS200 27%, intergenic 6%. (b) CpG content of DMRs: island 61%, shore 27%, shelf 3%, open sea 9%.
Figure 2-3. Manhattan plot displaying the DEGs associated with MDD in monozygotic discordant twin pairs (N = 79 pairs). The P values (-log10) of each DEG are plotted against their respective positions on each chromosome. The genome-wide threshold (q < 0.05) is indicated with a blue line.
Figure 2-4. Circos plot showing the genome-wide relationship between DNA methylation and gene expression in peripheral blood monocytes in relation to major depression. The outermost ring displays chromosome numbers and bands. The second ring (green) shows differential methylation in depressed (red) and non-depressed (blue) twins. The third ring shows the Pearson correlation between DNA methylation and gene expression in 500-bp bins. The innermost circle (yellow) represents mRNA differential expression in depressed (red) and non-depressed (blue) twins. The height of the histogram bins indicates the level of DNA methylation or gene expression.
Figure 2-5. Genome-wide partial correlation patterns between DNA methylation and cis expression (±5 kb).
Figure 2-6. The largest co-methylation module associated with MDD in depressed twins in comparison to their non-depressed co-twins. The network connectivity (as measured by node degree) for the negative regulation of neuron apoptotic process (green) is significantly higher in depressed twins than in non-depressed co-twins (3.8 vs. 2.6, P = 7.15×10⁻⁵). In contrast, the network connectivity of the stress-activated protein kinase signaling cascade (purple) is significantly lower in depressed twins than in non-depressed co-twins (2.8 vs. 4.2, P = 2.21×10⁻⁵).
Figure 2-7. The largest co-expression module for the identified DEGs. The network connectivity (as measured by node degree) for the positive regulation of cytokine secretion (green) is significantly higher in depressed twins than in non-depressed co-twins (3.8 vs. 2.6, P = 7.15×10⁻⁵).
Figure 2-8. Tissue/cell type enrichment of the identified DMRs. The MDD-related DMRs are significantly enriched in the nervous system, endocrine system, and urogenital system. P values of the enrichment analysis were adjusted for a total of 209 tissue/cell types. The red line indicates q < 0.05.
CHAPTER 3
GENOME-WIDE PROFILING OF DNA METHYLOME FOR LATE-LIFE DEPRESSIVE SYMPTOMS

The rapid increase in the number of older adults worldwide has made late-life depressive symptoms a focus of depression research. A recent meta-analysis of older adults (aged 50 years and older) found that the prevalence of depression is nearly 20%. Late-life depressive symptoms often go undetected or undertreated. According to the DSM, Major Depressive Disorder (MDD) cannot be diagnosed when its symptoms are the direct physiological result of a medical condition. For example, depression may be underdiagnosed in the presence of conditions such as cancer, which can also cause weight loss, fatigue, poor appetite, and disrupted sleep. Identifying biomarkers for late-life depressive symptoms is therefore a particularly important challenge.

Unlike early-onset depression (EOD), depressive symptoms in late life are more vulnerable to environmental effects than to genetic effects. The heritability of MDD falls to 18% in elders, compared with 50% in youth, suggesting that cumulative environmental exposure and gene-environment interaction may dilute the genetic effects. The heterogeneity of depression also increases with age as the incidence of chronic disease and disability rises. Thus, it is of particular significance to identify new biomarkers that capture the complexity of gene-environment interactions in relation to late-life depressive symptoms.

DNA methylation, a molecular mechanism influenced by both genetic and social-environmental factors, may serve as a biomarker for the cumulative effect of the environment on genetics. Indeed, growing evidence suggests that altered DNA methylation plays an important role in depression [88,150]. However, most existing epigenome-wide association studies (EWASs) of depression have focused on younger persons, aged 20 to 50 [88,147,159,216]. A recent large meta-analysis of more than 11,000 middle-aged and elderly persons of European and African origin identified 3 CpG sites associated with depression, but the results were not conclusive. In this chapter, we first report findings from a genome-wide profiling of the DNA methylome in post-mortem brain tissue from an elderly study population, followed by findings from the analysis of the correlation between the differentially methylated genes and their cis expression.

Methods

Study Participants

This study included 708 deceased participants (mean age at death 87.8 years, 36.3% male) from the Religious Orders Study (ROS) and the Rush Memory and Aging Project (MAP). Both the ROS and the MAP are ongoing, prospective studies of aging and dementia in older individuals, as described previously [144,217]. In brief, the ROS enrolled older Catholic priests, nuns, and brothers from across the USA who were free of known dementia at the time of enrollment. The MAP consists of older men and women from across the Chicagoland area without known dementia at enrollment. All participants agreed to annual clinical evaluations and signed both an informed consent and an Anatomic Gift Act for donating their brains at the time of death. The follow-up rate among survivors exceeds 90% and the autopsy rate exceeds 80% in both studies. Both studies were approved by the Institutional Review Board of the Rush University Medical Center.
Clinical Evaluation

Each participant had a uniform structured evaluation that was repeated annually, with the examiners blinded to previously collected information [144,218]. The evaluation included a medical history, neurologic examination, cognitive function testing, and review of brain scans when available. We identified seven medical conditions present in more than 5% of the cohort at baseline: hypertension, diabetes, heart disease, cancer, thyroid disease, and head injury and loss of consciousness, which were classified on the basis of medical history.

Late-Life Depressive Symptoms Assessment

Depressive symptoms were assessed using a 10-item form of the Center for Epidemiologic Studies Depression Scale (CES-D) at each annual clinical evaluation. Participants were asked whether they had experienced each of 10 symptoms in the past week. The score indicated the number of symptoms experienced. The reliability of this 10-item form and its similarity to the original 20-item CES-D have been previously established. To capture the impact of DNA methylation on depressive symptoms, we used the median CES-D score across follow-ups as our primary outcome. Participants with four or more depressive symptoms were considered to have late-life depression [219,220].

DNA Methylation Data Assessment

In ROS and MAP, the dorsolateral prefrontal cortex (DLPFC) was selected for initial multi-omics data generation, as it had proven relevant to multiple common neuropathologies in the aging population. DNA methylation levels from the gray matter of the DLPFC were measured using the Illumina HumanMethylation450 BeadChip, and the measurements underwent QC processing as previously described (e.g.,
detection p < 0.01 for all samples) [221,222], yielding 708 participants with methylation measurements at 415,848 discrete CpG dinucleotide sites. Any missing methylation levels at quality-controlled CpG sites were imputed using a k-nearest-neighbor algorithm with k = 100. Given that different cell types differ in methylation profiles, we further estimated cell type composition using CETS.

Gene Expression by RNA-seq

RNA was extracted from the gray matter of the DLPFC, and next-generation RNA sequencing (RNA-Seq) was performed on the Illumina HiSeq for samples with RNA integrity scores > 5 and a quantity > 5 µg, as previously described [221,224]. We quantile-normalized the fragments per kilobase of transcript per million fragments mapped (FPKM) after correcting for batch effects with ComBat [224,225].

Statistical Analysis

Identifying differentially methylated probes (DMPs) associated with late-life depressive symptoms. To examine the association between methylation of each single CpG and late-life depressive symptoms, we fitted a linear regression model of the median CES-D score across follow-ups on methylation level, adjusted for age, sex, BMI, education, chronic disease history, global burden of AD pathology, cell type composition, and batch of the methylation experiment.

Identifying differentially methylated regions (DMRs) associated with late-life depressive symptoms. Region-based analysis was performed using DMRcate. A significant DMR was defined as a region with q < 0.05 after correcting for the total number of
regions. Putative DMRs were ranked and annotated to genomic features based on the UCSC database (GRCh37/hg19). Genomic features of DMRs were compared to the null distribution of CpG probes included in the Methylation450K array [226,227].

Correlation between differentially methylated probes and their cis expression. To examine the impact of DNA methylation on gene expression, we calculated partial correlation coefficients (adjusted for age, sex, BMI, education, chronic disease history, global burden of AD pathology, and cell type composition) between DNA methylation and cis-acting gene expression for each probe. Here, cis-acting was defined as the correlation between DNA methylation of a putative gene and its own expression (±5 kb of the tested probe).

Functional enrichment analysis. To identify the biological functions of the DMR genes, we performed enrichment analysis by mapping the identified genes to the GO database [228]. GO terms with more than 10 genes that had methylation levels measured were included in the enrichment analysis.

Co-methylation networks. We applied weighted correlation network analysis (WGCNA) to identify discrete groups of co-regulated genes (modules). Subsequently, all modules were tested for over-representation of DMPs associated with late-life depressive symptoms. To determine whether the overall methylation of these modules was significantly associated with late-life depressive symptoms, we compared module eigengene values between participants with and without late-life depressive symptoms (CES-D score of 4 or more versus 3 or less). Moreover, co-methylation networks were constructed separately for participants with and without late-life depressive symptoms.
The differential methylation network was identified by comparing the two groups. Network visualization was done using Cytoscape.

Sensitivity analysis. Depressive symptoms change over time. We therefore used a mixed model to examine the association between DNA methylation and longitudinal assessments of late-life depressive symptoms, adjusted for age, gender, BMI, education, disease history, global burden of AD pathology, cell type composition, and batch of the methylation array, with participant ID as a random effect.

Multiple testing. We corrected for multiple testing by using a false discovery rate (FDR) of q < 0.05.

Results

The characteristics of the study participants are shown in Table 3-1. The participants were predominantly female (63.7%), with a mean age at death of 87.8 years. There were no significant differences in age, gender, education level, or disease history between participants with depressive symptoms and those without.

Late-Life Depressive Symptoms Associated DMPs

Controlling the FDR at 0.05, we identified 74 DMPs (annotated to 59 unique genes) that were significantly associated with late-life depressive symptoms (Table 3-2). Among these DMPs, 53 appeared hypermethylated and the other 21 appeared hypomethylated. Figure 3-1 shows a Manhattan plot for these DMPs. The identified DMPs are enriched in exon and promoter regions but depleted in intergenic regions, and they largely reside within CpG islands (Figure 3-2).
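The FDR threshold used to declare DMPs can be illustrated with a short sketch. It assumes the Benjamini-Hochberg step-up procedure, which the text does not name explicitly, and the function name `benjamini_hochberg` and the four p-values are hypothetical values chosen for illustration, not study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is the minimum
    # of p * m / rank over its own rank and every larger rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values for four CpG sites (illustrative only).
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
# A site would be called a DMP when its q-value is below 0.05.
significant = [q < 0.05 for q in example_q]
```

In an analysis of this kind the function would be applied once to the full vector of per-CpG p-values, so that the number of tests m equals the number of CpG sites that passed quality control.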
Late-Life Depressive Symptoms Associated DMRs

We identified 46 DMRs with clusters of CpGs associated with late-life depressive symptoms. The identified DMRs were annotated to 32 unique genes within 5 kb flanking regions. Among them, 31 DMRs appeared hypermethylated and the other 15 hypomethylated (Table 3-3).

Correlation between Methylation Level of DMR Genes and cis Gene Expression

Fifteen DMR genes showed significant correlation with their cis expression (Table 3-4). Although methylation was largely negatively correlated with gene expression (67% of the correlation pairs), 5 genes showed a positive correlation.

Functional Enrichment Analysis

The identified differentially methylated genes were enriched in biological processes related to glial cell-derived neurotrophic factor secretion, neuron death in response to oxidative stress, neuron projection terminus, and dopaminergic neuron axon guidance (Table 3-5). By gene overlap analysis, we found that the putative differentially methylated genes were 1.82 times overrepresented in previous GWAS loci for major depression (P = 1.82 × 10⁻⁴).

Differential Network Analysis

Co-methylation network analysis identified 9 modules with 10 or more genes. Table 3-6 lists the co-methylation modules along with their hub genes (genes with the highest module membership) and the biological pathways enriched in each module. Three modules implicated in dopaminergic neuron axon guidance, the neuron death in response to oxidative stress pathway, and glial cell-derived neurotrophic factor secretion were significantly enriched for DMPs (Figure 3-3). The neuron death in response to oxidative
stress pathway showed a significant difference between participants with and without depressive symptoms.

Sensitivity Analysis

Using longitudinal measures of depressive symptoms as the outcome slightly attenuated the association between DNA methylation and late-life depressive symptoms, but the results remained qualitatively unchanged (all p < 0.01).

Discussion

Using brain samples from a well-characterized cohort of community-based older persons, we identified 74 DMPs and 46 DMRs associated with late-life depressive symptoms after accounting for clinical and pathologic covariates and multiple testing. The differentially methylated genes are significantly enriched in biological processes related to glial cells. Moreover, the identified differentially methylated genes are overrepresented in GWAS loci related to MDD. Correlation analysis revealed that DNA methylation levels of the identified DMR genes were largely negatively correlated with their cis gene expression. Network analysis showed that the differentially methylated genes are co-regulated and enriched in several biological processes related to depression. To our knowledge, this is the largest DNA methylome analysis focused on late-life depression in the human brain.

Of the 32 annotated DMR genes, the LDB2 gene showed the most significant association with late-life depressive symptoms. The methylation level of this gene in depressed participants was on average 1.19 times as high as that in non-depressed participants. In a recent GWAS of 50,000 independent individuals of European ancestry, a SNP located in the LDB2 gene was found to be significantly associated with depression. The gene expression level of LDB2 was found to play a key role in linking
maternal psychological stress to neurodevelopmental disorders. Another significantly hypermethylated gene, TBCD, encodes tubulin folding cofactor D, one of five tubulin-specific chaperones that play a pivotal role in microtubule assembly in all cells. Although the exact mechanism behind the association of TBCD hypermethylation with depression is unknown, TBCD is involved in impaired neuronal morphology, which has been implicated in neurodegenerative and neuropsychiatric disorders.

In concordance with a previously published meta-analysis of differentially methylated probes for depression in middle-aged and elderly populations, we identified probes in CDC42BPB marginally associated with late-life depressive symptoms. However, no signal was found in the ARHGEF3 gene. In a previous study, we found 39 DMRs associated with MDD in young twins. Of these DMRs, 5 (13%) overlapped with DMRs identified in the current study. There are several reasons for this discordance. First, the measurements of depressive symptoms differ between these studies. Second, as expected, although late-life depressive symptoms might share some DMR genes with EOD, they have different disease mechanisms. Third, we measured methylation levels in brain tissue in this study.

Our network analysis identified 9 co-regulated modules, of which 3 are associated with late-life depressive symptoms. The hub genes of these networks highlight the potential role of PER3 and CLOCK in the pathology of late-life depressive symptoms. These circadian genes regulate the synchronization of independent circadian clocks throughout the body to appropriate phases. Abnormalities in circadian rhythms may underlie the development of mood disorders [234,235], but the biological
mechanisms behind the observed associations remain to be determined. It is possible that the disrupted clock from which nearly all depressed participants suffered leads to the association.

This study has several limitations. First, depressive symptoms were measured using a short form of the CES-D. Although validated against the standard form of the CES-D, the short form may lack some sensitivity. Second, the study focused on one brain region (prefrontal cortex), but it is possible that DNA methylation varies across brain regions. Moreover, the participants included in the current study are highly educated European Caucasians, and the results may not generalize to other ethnic groups or population settings.

Our study has several strengths. First, we measured methylation levels in postmortem brain tissue, which directly reflects pathological changes in the nervous system. Second, this study focused on elderly people, in whom environmental effects may dominate disease susceptibility. The methylation profiles of elderly people may reflect the complex interaction between genetics and environment and thus provide insights into the underlying mechanisms of depression. To our knowledge, this is the largest study examining the association between methylation level and late-life depressive symptoms in postmortem human brain samples. Moreover, we conducted a comprehensive statistical analysis focused not only on single CpGs but also on functional regions associated with late-life depressive symptoms. If validated, the newly identified genes and pathways may serve as novel therapeutic targets for late-life depressive symptoms and related disorders.
Table 3-1. Characteristics of study participants

Characteristic                       3 or fewer depressive     4 or more depressive      Pᶜ
                                     symptoms (N=592)ᵃ         symptoms (N=116)ᵇ
Age at death, mean (SD), years       87.4 (6.3)                88.1 (6.7)                0.847
Male, N (%)                          230 (36.9%)               40 (34.4%)                0.284
BMI, mean (SD), kg/m²                29.3 (6.3)                28.4 (5.4)                0.482
Education, mean (SD), years          16.5 (3.6)                15.9 (3.4)                0.723
Chronic disease historyᵈ, N (%)      532 (85.2%)               105 (90.5%)               0.146
Alzheimer's disease, N (%)           204 (32.7%)               28 (24.2%)                0.014

ᵃ Participants with CES-D score less than or equal to 3.
ᵇ Participants with CES-D score greater than 3.
ᶜ P value calculated by t test.
ᵈ One or more diseases in the following categories: CVD, CHD, hypertension, cancer, stroke, kidney disease, and head injury.
Table 3-2. CpGs associated with late-life depressive symptoms

Gene     CpG         Chr  Position        logFCᵃ  Pᵇ             qᶜ
LDB2     cg21734996  4    16,760,829      0.1999  2.11 × 10⁻¹³   3.15 × 10⁻⁸
MAD1L1   cg05533001  7    2,019,608       0.1492  3.15 × 10⁻¹³   4.59 × 10⁻⁸
-        cg10340048  8    74,258,684      0.1715  5.39 × 10⁻¹³   7.65 × 10⁻⁸
RASA3    cg21186053  13   1,152,956,738   0.0359  1.12 × 10⁻¹²   1.56 × 10⁻⁷
NRD1     cg12301169  1    52,344,471      0.0793  2.03 × 10⁻¹²   2.74 × 10⁻⁷
MAML2    cg16509158  11   95,894,040      0.1027  3.92 × 10⁻¹²   5.18 × 10⁻⁷
OPRM1    cg12466324  6    1,542,947,869   0.1189  5.44 × 10⁻¹²   7.03 × 10⁻⁷
TBCD     cg17096374  17   80,882,296      0.1843  6.73 × 10⁻¹²   8.49 × 10⁻⁷
DLGAP2   cg19107264  8    1,581,598       0.2006  1.14 × 10⁻¹¹   1.41 × 10⁻⁶
ERC2     cg13491490  3    55,646,231      0.1400  1.43 × 10⁻¹¹   1.73 × 10⁻⁶
HELB     cg06239064  12   66,697,621      0.1487  1.51 × 10⁻¹¹   1.79 × 10⁻⁶
LGALS12  cg16004377  11   63,272,909      0.1593  2.40 × 10⁻¹¹   2.78 × 10⁻⁶
-        cg07010222  11   134,837,624     0.2104  3.86 × 10⁻¹¹   4.39 × 10⁻⁶
REEP6    cg13970591  19   1,491,289       0.0501  6.21 × 10⁻¹¹   6.92 × 10⁻⁶
OBSCN    cg23586423  1    229,395,787     0.2026  1.21 × 10⁻¹⁰   1.32 × 10⁻⁵
-        cg15758240  7    12,839,582,374  0.1860  1.88 × 10⁻¹⁰   2.02 × 10⁻⁵
VWA1     cg13897675  1    1,374,310       0.1468  2.07 × 10⁻¹⁰   2.18 × 10⁻⁵
DYNC1I2  cg21472700  2    1,734,882,673   0.0370  2.96 × 10⁻¹⁰   3.05 × 10⁻⁵
-        cg03725444  2    53,486,933      0.2004  3.18 × 10⁻¹⁰   3.19 × 10⁻⁵
-        cg21648324  11   12,692,986      0.1690  3.20 × 10⁻¹⁰   3.19 × 10⁻⁵
PER3     cg10059324  1    7,884,824       0.1969  8.05 × 10⁻¹⁰   7.89 × 10⁻⁵
SPACA1   cg04367614  6    88,776,150      0.1619  1.01 × 10⁻⁹    9.77 × 10⁻⁵
ZNF433   cg15079762  19   12,146,404      0.0444  1.55 × 10⁻⁹    1.47 × 10⁻⁴
-        cg02660564  19   42,749,039      0.0583  2.99 × 10⁻⁹    2.79 × 10⁻⁴
PITX1    cg19802165  5    134,753,218     0.0428  3.46 × 10⁻⁹    3.17 × 10⁻⁴
SSTR3    cg11069317  22   37,603,654      0.2074  3.82 × 10⁻⁹    3.44 × 10⁻⁴
PABPC1   cg04504715  8    102,867,432     0.1330  3.89 × 10⁻⁹    3.46 × 10⁻⁴
ITGA11   cg20707064  15   68,724,948      0.1277  4.33 × 10⁻⁹    3.79 × 10⁻⁴
-        cg18703066  2    105,974,357     0.0342  4.74 × 10⁻⁹    4.08 × 10⁻⁴
PTPRN2   cg06963346  7    157,563,075     0.0318  5.47 × 10⁻⁹    4.64 × 10⁻⁴

ᵃ Log fold change.
ᵇ Adjusted for age, gender, BMI, education, disease history, global burden of AD pathology, cell type composition, and batch of experiment.
ᶜ Adjusted for a total of 415,848 CpGs.
Table 3-3. DMRs associated with late-life depressive symptoms

Gene     Chr  Position        # of CpGs  logFCᵃ  Pᵇ             qᶜ
-        -    8,248,503       9          0.0291  1.14 × 10⁻¹³   3.82 × 10⁻⁸
-        -    8,609,878       11         0.0055  2.45 × 10⁻¹³   7.72 × 10⁻⁸
-        -    65,784,265      7          0.0235  4.84 × 10⁻¹³   1.45 × 10⁻⁷
-        -    33,079,692      9          0.0218  1.49 × 10⁻¹²   4.24 × 10⁻⁷
-        -    14,308,183      14         0.0235  1.16 × 10⁻¹¹   3.09 × 10⁻⁶
-        -    32,803,210      5          0.0259  1.20 × 10⁻¹¹   3.09 × 10⁻⁶
-        -    30,804,997      10         0.0234  8.14 × 10⁻¹¹   2.01 × 10⁻⁵
-        -    159,586,355     11         0.0281  1.35 × 10⁻¹⁰   3.21 × 10⁻⁵
-        -    140,836,274     12         0.0567  3.54 × 10⁻¹⁰   8.04 × 10⁻⁵
-        -    48,069,745      5          0.0199  5.25 × 10⁻¹⁰   1.15 × 10⁻⁴
-        -    71,401,960      5          0.0200  5.74 × 10⁻¹⁰   1.21 × 10⁻⁴
-        -    124,385,623     9          0.0262  8.06 × 10⁻¹⁰   1.58 × 10⁻⁴
-        -    48,403,011      9          0.0205  7.99 × 10⁻¹⁰   1.58 × 10⁻⁴
-        -    30,795,574      8          0.0032  4.47 × 10⁻⁹    8.46 × 10⁻⁴
-        -    23,303,619      10         0.0189  5.32 × 10⁻⁹    9.76 × 10⁻⁴
LDB2     4    16,760,829      11         0.0291  1.14 × 10⁻⁸    1.23 × 10⁻³
MAD1L1   7    2,019,608       13         0.0055  2.45 × 10⁻⁸    2.40 × 10⁻³
-        8    74,258,684      6          0.0235  4.84 × 10⁻⁸    3.30 × 10⁻³
RASA3    13   11,583,755      6          0.0218  1.49 × 10⁻⁷    3.58 × 10⁻³
NRD1     1    52,344,471      12         0.0235  1.16 × 10⁻⁷    4.38 × 10⁻³
MAML2    11   95,894,040      11         0.0259  1.20 × 10⁻⁷    6.86 × 10⁻³
OPRM1    6    1,549,038,467   8          0.0234  8.14 × 10⁻⁷    8.37 × 10⁻³
TBCD     17   80,882,296      5          0.0281  1.35 × 10⁻⁷    9.72 × 10⁻³

ᵃ Mean log fold change in the region.
ᵇ Adjusted for age, gender, BMI, education, disease history, global burden of AD pathology, cell type composition, and batch of experiment.
ᶜ Adjusted for a total of 95,848 regions.
Table 3-4. Cis correlation between DMR genes and gene expression

Gene      Chr  CpG         CpG position (bp)  Expression start (bp)  Expression end (bp)  Correlation  Pᵃ
LMO1      11   cg20775162  47,697,733         47,696,229             47,700,223           0.88         1.40 × 10⁻⁶
RAB12     18   cg03327386  130,830,096        130,828,513            130,831,937          0.87         1.70 × 10⁻⁶
CATSPER1  11   cg07538986  84,115,811         84,113,598             84,117,389           0.92         3.20 × 10⁻⁶
HLA-DPB2  6    cg25438492  145,277,646        145,275,533            145,280,372          0.46         5.40 × 10⁻⁶
TRIO      5    cg06800293  87,849,496         87,847,809             87,851,213           0.93         6.30 × 10⁻⁶
TAP2      6    cg20224850  53,737,506         53,734,815             53,739,177           0.85         7.10 × 10⁻⁶
POFUT1    20   cg00901574  47,697,663         47,694,727             47,698,467           0.91         7.20 × 10⁻⁶
VIPR2     7    cg24182468  78,400,561         78,397,586             78,400,923           0.93         8.00 × 10⁻⁶
NPDC1     9    cg14642467  58,497,795         58,496,097             58,499,626           0.90         8.40 × 10⁻⁶
SUNC1     7    cg12302812  73,082,340         73,081,263             73,084,532           0.93         8.70 × 10⁻⁶
DH10      12   cg18118262  43,602,666         43,601,613             43,604,833           0.85         1.03 × 10⁻⁵
POFUT1    20   cg22946150  32,455,025         32,452,342             32,455,564           0.93         1.13 × 10⁻⁵
MRPL52    14   cg01817364  38,598,880         38,597,572             38,600,969           0.96         1.22 × 10⁻⁵
LDB2      4    cg01387743  66,035,392         66,033,770             66,038,060           0.94         1.29 × 10⁻⁵
MAD1L1    7    cg18118262  122,019,076        122,017,812            122,021,001          0.86         1.34 × 10⁻⁵
RASA3     13   cg22946150  33,113,331         33,111,485             33,115,151           0.88         1.46 × 10⁻⁵

ᵃ Calculated by Pearson correlation.
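The Methods describe the methylation-expression correlations as partial correlations adjusted for covariates. The sketch below shows one way such a covariate-adjusted Pearson correlation can be computed, using the standard recursive first-order formula. It is an illustration under stated assumptions rather than the software actually used: the function names and the toy vectors `age`, `methylation`, and `expression` are hypothetical, and the recursion is practical only for a handful of covariates.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, covariates):
    """Correlation of x and y after removing the linear effect of covariates.

    Uses the recursion
        r_xy.Z = (r_xy.R - r_xz.R * r_yz.R) / sqrt((1 - r_xz.R^2)(1 - r_yz.R^2))
    where Z = R plus one further covariate z. It is undefined (division by
    zero) when x or y is perfectly explained by a covariate.
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): both variables track age plus independent noise.
age = [0, 1, 2, 3, 4, 5, 6, 7]
methylation = [1, 0, 1, 4, 5, 4, 5, 8]
expression = [1, 0, 1, 4, 3, 6, 7, 6]

raw_r = pearson(methylation, expression)
adjusted_r = partial_corr(methylation, expression, [age])
```

In this toy example the unadjusted correlation is driven entirely by the shared dependence on age, so adjusting for age removes it. For the full covariate set listed in the Methods, regressing both variables on the covariates and correlating the residuals gives the same quantity more efficiently.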
Table 3-5. Pathway enrichment for putative DMRs with nominal associations with MDD (P < 0.001)

Term        Description                                                                             P             q
GO:0031098  Glial cell derived neurotrophic factor secretion                                        1.57 × 10⁻⁸   6.29 × 10⁻⁵
GO:0036475  Neuron death in response to oxidative stress                                            1.30 × 10⁻⁷   5.21 × 10⁻⁴
GO:0061416  Regulation of transcription from RNA polymerase II promoter in response to salt stress  2.78 × 10⁻⁷   1.11 × 10⁻³
GO:0044306  Neuron projection terminus                                                              8.17 × 10⁻⁷   3.27 × 10⁻³
GO:0036519  Chemorepulsion of serotonergic neuron axon                                              2.18 × 10⁻⁶   8.71 × 10⁻³
GO:0036514  Dopaminergic neuron axon guidance                                                       6.24 × 10⁻⁶   2.50 × 10⁻²
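Enrichment P values of this kind are commonly obtained from a one-sided hypergeometric (over-representation) test. The text does not state which test was used, so the sketch below is an assumption for illustration only; the function names and all counts in the example are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_term, n_selected, n_overlap):
    """Upper-tail probability P(X >= n_overlap) under the hypergeometric model.

    n_background: genes with methylation measured (the universe)
    n_in_term:    background genes annotated to the GO term
    n_selected:   differentially methylated genes
    n_overlap:    differentially methylated genes annotated to the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_term, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def fold_enrichment(n_background, n_in_term, n_selected, n_overlap):
    """Observed overlap fraction divided by the fraction expected by chance."""
    return (n_overlap / n_selected) / (n_in_term / n_background)


# Hypothetical counts: 1,000 background genes, 50 in the term,
# 32 selected genes, 5 of which fall in the term.
example_p = hypergeom_enrichment_p(1000, 50, 32, 5)
example_fold = fold_enrichment(1000, 50, 32, 5)
```

The same two quantities can describe the overlap with GWAS loci reported in the Results, where the fold value plays the role of the stated 1.82 times overrepresentation.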
Table 3-6. Co-methylation modules

Module  Module size  Hub gene  Top enriched GO term in module                                  Pᵃ
1       312          PER3      Circadian sleep/wake cycle process                              1.74 × 10⁻²
2       278          -         Neuron death in response to oxidative stress pathway            3.32 × 10⁻⁴
3       244          CLOCK     Circadian sleep/wake cycle process                              2.21 × 10⁻²
4       201          SLC27A3   Negative regulation of activation induced cell death of T cells 4.96 × 10⁻⁴
5       179          RNPC3     Glial cell derived neurotrophic factor secretion                5.26 × 10⁻⁴
6       143          DAB1      Dopaminergic neuron axon guidance                               1.36 × 10⁻¹
7       118          STX12     Golgi membrane                                                  4.87 × 10⁻¹
8       64           VWA1      Extracellular matrix structural constituent                     7.64 × 10⁻¹
9       49           NRD1      Regulation of death inducing signaling complex assembly         2.35 × 10⁻¹

ᵃ P value for the association between the first 3 principal components (PCs) of the module and late-life depressive symptoms, adjusted for age, gender, BMI, education, disease history, global burden of AD pathology, cell type composition, and batch of experiment.
Figure 3-1. Manhattan plot of the EWAS for late-life depressive symptoms.
Figure 3-2. Genomic distribution and CpG content of identified DMRs. (a) Genomic distribution of identified DMRs associated with late-life depressive symptoms: intergenic 7%, TSS2000 19%, TSS1500 28%, exon 13%, 5'UTR 17%, gene body 16%. (b) CpG content of identified DMRs associated with late-life depressive symptoms: island 63%, shore 26%, open sea 8%, shelf 3%.
Figure 3-3. The largest co-methylation modules for the identified DMGs.
CHAPTER 4
GUT MICROBIOME AND BLOOD METABOLOME PROFILES FOR MAJOR DEPRESSION: FINDINGS FROM A MONOZYGOTIC DISCORDANT TWIN STUDY

Major depressive disorder (MDD) is a neuropsychiatric disorder that affected ~350 million people worldwide as of 1997-2013 [1]. The World Health Organization has projected unipolar major depression to be the leading cause of disease burden by the year 2030 [2,6]. Family studies have shown that genetic and environmental factors interact to contribute to MDD pathology [18,95,237]. However, identifying specific MDD-related risk factors has proven more difficult and less fruitful than comparable efforts for other mental disorders. Many factors underlying MDD remain unknown.

Accumulating evidence indicates that interrelationships within the gut-brain axis (GBA) play an important role in maintaining homeostasis of the central nervous system (CNS) [117,118]. The GBA is a dynamic neurohumoral communication system involving the brain, glands, gut, immune cells, and gastrointestinal microbiota. These organs and tissues communicate with each other to maintain homeostasis. Gut dysbiosis can lead to a broad spectrum of physiological and behavioral effects, including hypothalamic-pituitary-adrenal (HPA) axis activation and altered activity of neurotransmitter systems and immune function.

Animal studies have shown that microbial abundances are associated with MDD. For example, mice exposed to stressors show increased levels of Alistipes, Odoribacter, Roseburia, and Clostridium and decreased levels of Parabacteroides, Coprococcus, and Dorea [120-122]. Furthermore, germ-free (GF) mice that received fecal microbiota transplants from depressed individuals showed increased depression-like behaviors compared to GF mice colonized with
microbiota derived from non-depressed mice [91,123]. Microbiome composition is also associated with depressive symptoms in mouse models. Antibiotic treatment can induce substantial shifts in the mouse gut microbial community, which are associated with increased levels of depressive behaviors [125-127]. Probiotics have the potential to reduce stress and stress-related disorders such as depression [127,128]. Human studies have found that the gut microbial composition of individuals with MDD differs significantly from that of healthy controls [91,129]. Depressed participants have elevated abundances of Parabacteroides, Paraprevotella, Anaerofilum, Blautia, and Gelria, whereas the abundances of Ruminococcus, Faecalibacterium, and Bacteroides are decreased [91,123,129,130].

While the aforementioned studies have demonstrated the potentially important role of gut dysbiosis in MDD pathogenesis, their results are quite diverse. This discrepancy is probably due to confounding by many latent uncontrolled factors, e.g., genetics, in utero environment, and early familial environment, all of which are proven contributors to both the gut microbiome and depression [238-240]. Furthermore, the host and its gut microbiota co-produce a large number of metabolites during the metabolism of food, many of which play critical roles in MDD etiology [90,110,111,115]. Therefore, it is of particular importance to investigate the MDD-related gut microbiome and gut-derived metabolites simultaneously while controlling for important confounders. A well-matched monozygotic (MZ) discordant co-twin control design represents an ideal natural experiment for studying the role of the gut microbiome in complex human diseases such as MDD.

In this chapter, I report findings from MZ twin pairs discordant for lifetime history of MDD. Using
metagenomic inference, we identified multiple significant metabolic pathways associated with MDD-related microbial communities.

Method

Study Population

The current analysis includes 37 monozygotic twin pairs discordant for MDD. To identify potentially eligible twin pairs, introductory letters that asked general questions about MDD were sent to Washington State Twin Registry (WSTR) members via postal mail and email. These letters were sent only to twin pairs with at least one twin living in Western Washington because of the need for an in-person study visit in Seattle. Interested twins were then interviewed by trained research staff using approved scripts to screen for likelihood of eligibility. After the pre-screening, lifetime and current MDD diagnoses were determined using the Structured Clinical Interview for DSM-IV Research Version (SCID-4-RV). Interviews were administered via phone by a clinical psychologist (E.D.S.) who was blind to pre-screening clinical information about both the interviewees and their co-twins. Final diagnoses were confirmed in consultation with the senior psychiatrist (P.P.R.B). A discordant pair was defined as a twin pair in which one twin met the criteria for lifetime history of MDD and the other did not.

Inclusion/Exclusion Criteria

Only complete twin pairs were eligible for the present study. As over 85% of the twins in the WSTR are Caucasian (which reflects Washington State generally), all twin pairs included in the current analysis are Caucasian. Inclusion criteria were: (1) monozygosity determined by DNA analysis, (2) pairwise discordance for lifetime history of MDD, (3) age 18 or older, (4) reared together, (5) Beck Depression Inventory II (BDI-II) [15] score <13 for the non-depressed twin (0
II authors), and (6) willingness to provide blood samples. The primary exclusionary conditions included schizophrenia or other psychotic disorder, bipolar disorder, current substance use disorder, cancer within the past 5 years, autoimmune disorders, uncontrolled endocrine disorders, and uncontrolled sleep apnea.

Other Measures

Each twin was asked to complete standard questionnaires regarding sociodemographic factors, lifestyle, early life experience, use of psychiatric medications, and disease history. Severity of each depressive symptom was assessed by the BDI-II and the Quick Inventory of Depressive Symptomatology (QIDS-SR16) at the time of the blood draw [165]. Self-reported early life stress was measured using the Adverse Childhood Experience (ACE) and Early Trauma Inventory (ETI) questionnaires. PTSD was not exclusionary in this study due to its high comorbidity with MDD but was diagnosed and recorded during the SCID-RV interview. Participants reported their use of commonly prescribed psychiatric medications and the approximate duration of use (ranging from 2 weeks to over 10 years) and were encouraged to list any additional psychiatric medications they had taken. In our data analysis, medications on the list and those added by participants were categorized into four categories: antidepressants, benzodiazepines, mood stabilizers, and other (e.g., Ritalin), each of which was dichotomized into a dummy variable. A participant was defined as a user of antidepressants if he or she used any medication in any of the four categories. Dietary intake for each participant was self-reported using the Food Frequency Questionnaire (FFQ).
Gut Microbiome Profiling by 16S rRNA Sequencing

Stool sample collection. Candidate twins were asked to collect their stool samples at home using provided self-collection kits. Each kit consists of a collection bowl that sits on the toilet and two collection tubes that have a scoop built onto the cap. The kits are designed for consistent volumes of stool and preservation of microbial DNA. Each collection tube contained 5 ml of RNAlater stabilization solution (Ambion, Thermo Fisher) to stabilize the sample, along with small glass beads to aid in mixing the solution into the sample. Participants were instructed to invert the tube multiple times to homogenize and liquefy the sample after the collection was completed and the tube was capped. After enclosing samples in the outer storage tubes and biohazard bags, participants froze the samples and brought them on ice when they attended the in-person visits. Alternatively, some participants collected their samples and brought them directly to the lab on the mornings of their in-person visits, and others shipped their samples overnight after collection. This collection process had been proven effective by earlier clinical studies.

16S rRNA sequencing. Stool samples (~5 g) were delivered to the Alkek Center for Metagenomics and Microbiome Research (CMMR) at Baylor College of Medicine in Houston, Texas, for microbiome analysis. Samples were arrayed in boxes and shipped overnight on dry ice with an accompanying sample manifest that included de-identified sample IDs and box positions. Upon delivery, samples were reconciled with the provided manifest and l further processing. Samples were defrosted at room temperature to re
extraction deep well plate. DNA extraction was carried out using the Hamilton STARlet platform following the MoBio Soil DNA extraction protocol V1.4 (MO BIO Laboratories, CA). The V4 hypervariable region of bacterial 16S rDNA was amplified from the extracted DNA using primers 515F and 806R containing Illumina adapters and a single-end barcode, allowing pooling and direct sequencing of PCR products, with the Nextera XT DNA Library Prep Kit V1.2 (Illumina, CA). Quantified amplicons were normalized and pooled using a DNA mass of 100 ng per sample. The resulting amplicon pool was cleaned using the ChargeSwitch PCR Clean-up Kit (Invitrogen). Then, the amplicon pool was sequenced using 2 x 250 bp paired-end sequencing on the Illumina MiSeq platform V1.8 (Illumina, CA). The resulting sequences were demultiplexed based on the unique molecular barcodes for each participant. The reads were merged using QIIME 2, allowing zero mismatches and a minimum overlap of 50 bases.

Quality control. The analytic pipeline for 16S rDNA analysis leverages custom analytic packages and pipelines developed at the CMMR. It provides summary statistics and quality control measurements for each sequencing run, as well as multi-run reports. It also provides data merging capabilities for validating built-in controls (known and blank) and characterizing microbial communities across large numbers of samples.

OTU annotation. 16Sv4 rDNA gene sequences were clustered into Operational Taxonomic Units (OTUs) at a similarity cutoff of 97% using the UPARSE algorithm. OTUs were mapped to an optimized version of the SILVA database containing only the 16Sv4 region to determine taxonomies. Abundances were recovered by mapping the demultiplexed reads to the UPARSE OTUs. A custom script constructed a rarefied OTU
table from the output files generated in the previous two steps for downstream analyses of alpha diversity, beta diversity, and phylogenetic trends.

Plasma Metabolomics Analysis by LC-MS

Relative abundance of fasting plasma metabolites was measured using a mixed targeted and untargeted high resolution LC isotopic standard mix, placed on ice for 30 min, and centrifuged for 10 min (16,100 × g at 4 °C) to remove protein. The supernatant was then removed and placed into of supernatant with a 10 min formic acid/acetonitrile gradient at a flow rate of 0.35 mL/min for the initial 6 min and 0.5 mL/min for the remaining 4 min on a Thermo LTQ Velos Orbitrap mass spectrometer (Thermo Fisher, San Diego, CA). The mass spectrometer was set to collect metabolic profiles from mass/charge ratio (m/z) 85 to 2000 in positive ionization mode. Quality control was performed based on these internal standards to evaluate mass accuracy in ppm, reproducibility of detection of internal standards, and total ion intensity across all samples. Raw data were processed in an untargeted (qualitative) chromatograms. Peak features were then imported into Mass Profiler Professional for peak alignment to determine which peaks were present in at least 30% of the chromatograms in which these peaks were positively detected. These peaks were then collated and constrained into a MassHunter quantification method at the accurate-mass precursor ion level, using the MS/MS information and the LipidBlast library to identify lipids, with manual confirmation of adduct ions and spectral scoring accuracy. MassHunter back-fills quantifications for the peaks missed in the primary peak finding process,
hence yields data sets without missing values. The raw peak data were then normalized using the total ion chromatogram (TIC) of the known compounds, in order to avoid using potential non-biological artifacts for the biological normalization. Remaining batch effects were corrected using the QC sample in each batch.

Statistical Analysis

Gut microbiome diversity

[...] richness score; (ii) Shannon index; (iii) Simpson index; and (iv) inverse Simpson score. [...] (β-diversity), measured using the weighted UniFrac score, was used to determine whether gut microbiota composition within depressed individuals differs from that of their non-depressed co-twins. β-diversity represents the variation in species composition between samples and is often used as a similarity measurement. To test whether microbiome diversity differs between depressed twins and their non-depressed co-twins, we compared the within-sample phylogenetic diversity between the two groups using paired t-tests.

Identifying MDD-associated gut microbiota

To measure the association for each taxon and species, we first calculated the Variable Importance in the Projection (VIP) score using random forest to select informative species. The final analysis included 94 OTUs with a VIP score > 1 that were shared by more than 50% of samples. Since monozygotic twins were matched on age, gender, and early familial environment, we used the twin-specific paired Wilcoxon rank-sum test to identify OTUs significantly associated with MDD. Given the small sample, the Wilcoxon rank-sum test was used because the gut microbiota data significantly violated normality.

Gut microbiota discriminating individuals with MDD from controls

To test whether the gut microbiota can separate individuals with MDD from their non-depressed co-twins, we performed separation analysis using t-Distributed Stochastic Neighbor Embedding (t-SNE). t-SNE utilizes Barnes-Hut approximations for dimension reduction and accommodates high-dimensional datasets regardless of their distribution. The t-SNE loadings for the depressed and non-depressed groups were compared using a paired t-test.

Relationship between gut- and diet-related plasma metabolites and MDD

Several classes of plasma metabolites are derived from diet through the action of gut microbes. To examine the relationship between these gut- and food-derived metabolites and MDD, we tested the association between MDD and previously identified gut- and food-related metabolites in plasma samples using the Wilcoxon rank-sum test, as described above.

Relationship between gut-derived metabolites and gut microbiome

Pairwise Spearman correlation was used to evaluate the relationship between gut-derived metabolites and gut OTUs. Hierarchical clustering was applied to the correlation patterns. The correlations between MDD-related metabolites and each OTU were plotted in a heatmap.

Functional analysis

Using the existing human gut microbiome pathway database in phylogenetic investigation of communities by reconstruction of unobserved states (PICRUSt),
we performed taxonomic and functional annotations on the identified gut microbiota. PICRUSt utilizes an extended ancestral state reconstruction algorithm to predict the gene families or pathways in microbiome data. The composite score of each identified pathway was compared between depressed and non-depressed twins using a paired t-test.

Sensitivity analysis

Although monozygotic twins were matched on most covariates, they could still show some discordance that might influence the association tests. To examine whether covariates influenced our results, we performed a sensitivity analysis using zero-inflated negative binomial (ZINB) mixed regression [250], controlling for age, gender, education level, income level, early life stress, dietary intake, and use of antidepressants as fixed effects and twin pair as a random effect.

Results

The characteristics of the twins (mean age 38.2 ± 15.6 years, 68.4% female) are shown in Table 4-1. Except for depressive symptoms, there were no significant differences between depressed twins and their co-twins.

Microbiome Composition in Depressed and Non-depressed Participants

The total number of sequencing reads was 1,730,461, with an average of 22,769 reads per sample. One pair of twins was removed due to low sequence reads, leaving 74 samples. Sequences were then rarefied to 13,843 reads to reduce the effect of sequencing depth. The overall composition is shown in Figure 4-1. In total, 94 OTUs at the genus level with a 97% threshold of pairwise identity were observed in our samples. Consistent with
previous studies, Clostridiales, a taxonomic order within the phylum Firmicutes, demonstrated the highest abundance.

Microbiome Diversity Associated with MDD

In terms of within-sample (α) phylogenetic diversity, the depressed twins showed lower gut microbiome diversity, as measured by the Simpson index and Shannon index, than did their non-depressed co-twins (Figure 4-2). Although the [...] score and inverse Simpson index did not differ significantly between depressed and non-depressed twins, similar trends were observed (Figure 4-2b & c). The degrees of microbial phylogenetic similarity (β-diversity) within the depressed and non-depressed groups were significantly higher than that between the groups (Figure 4-2d). This result suggests that the depressed and non-depressed groups have significantly different microbiome profiles.

Taxa Associated with MDD

Among the 94 OTUs at the genus level, we identified 9 whose relative abundances showed significant associations with MDD (Table 4-2). The depressed twins had elevated levels of Alistipes, Odoribacter, and Butyricimonas as well as reduced levels of Ruminococcaceae, Dorea, and Lachnospiraceae (Figure 4-3).

Separation of Depressed and Non-depressed Twins using Microbiome Profiles

Hierarchical clusters in the heatmap indicated that the individuals with MDD had similar microbiome composition patterns (Figure 4-3). In terms of t-SNE loadings, most depressed twins clustered together and were significantly (P < 0.01) separated from their non-depressed co-twins (Figure 4-4).
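
As an illustration of the within-sample diversity indices reported above, the following minimal Python sketch computes the Shannon, Simpson, and inverse Simpson indices for a single sample. It is not the pipeline used in this study. It assumes the natural-log form of the Shannon index and the 1 - sum(p^2) form of the Simpson index, which may differ from the conventions of the original software. The function name alpha_diversity and the read counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Return Shannon, Simpson (1 - sum p^2) and inverse Simpson for one sample."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    # Relative abundance of each observed OTU (zero counts are skipped).
    props = [c / total for c in counts if c > 0]
    shannon = -sum(p * math.log(p) for p in props)
    dominance = sum(p * p for p in props)
    return {
        "shannon": shannon,
        "simpson": 1.0 - dominance,
        "inverse_simpson": 1.0 / dominance,
    }


# Hypothetical OTU read counts for one twin pair (not taken from the study data).
depressed = [70, 20, 5, 5]
co_twin = [25, 25, 25, 25]
print(alpha_diversity(depressed))
print(alpha_diversity(co_twin))
```

In this toy pair, the sample dominated by one OTU scores lower on all three indices than the evenly distributed sample, which is the direction of the difference described for the depressed twins.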
Relationship Between Gut and Gut-Derived Metabolites

Among 43 gut-derived metabolites in our data, 4 metabolites, including N-acetylglycine, 2,3-dihydroxybutanoic acid, trimethylamine N-oxide (TMAO), and 3-hydroxybutyric acid, were significantly associated with MDD (Table 4-3). The relative abundance of TMAO was also significantly associated with microbiome diversity. Figure 4-5 shows the correlation patterns between metabolites and microbiome taxa. The OTUs from Clostridiales were positively correlated with TMAO, glutamine, and polyunsaturated fatty acids, whereas Blautia was negatively correlated with N-acetylglycine. These findings provide further insight into metabolically important relationships.

Functional Analysis

Four gut metabolic pathways were enriched with the microbiome sequences. Disease-based pathways such as sphingolipid metabolism, D-glutamine and D-glutamate metabolism, and histidine metabolism differed significantly between groups (Table 4-4).

Sensitivity Analysis

Sensitivity analysis further adjusting for age, gender, education level, income level, and use of antidepressants did not significantly change the association between microbiome OTUs and depression (Table 4-5) or the association between gut-derived metabolites in plasma and depression (Table 4-6).

Discussion

In this article, we investigated the association between gut microbiota and lifetime history of MDD using 37 pairs of monozygotic twins discordant for MDD. This is the largest matched monozygotic twin study on the relationship between gut
microbiome, metabolites, and depression. It provided a unique opportunity to jointly leverage metabolomics and microbiome data to decipher MDD mechanisms while properly controlling for the confounding effects of many latent factors. We identified 9 OTUs and four gut-derived metabolites that were significantly associated with MDD. Strong correlations were identified between gut-derived metabolites and gut OTUs, suggesting potential crosstalk between host metabolomics and metagenomics. Moreover, we identified MDD-associated metabolic pathways, including sphingolipid metabolism, D-glutamine and D-glutamate metabolism, and histidine metabolism. Together, our results suggest a potentially important role of the gut microbiome in MDD.

In this twin study, we observed significantly reduced microbial diversity in depressed twins. High microbial diversity is the hallmark of a healthy microbiome. A recent study in MDD showed microbial community differences between individuals with MDD and healthy controls, but no significant reduction in diversity was identified. Reduced microbial diversity has also been observed in several other mental disorders, such as schizophrenia and bipolar disorder [252,253]. Of note, the diversity of gut bacteria can be influenced by several factors, such as age, gender, diet, and drug use. After sensitivity analysis further controlling for such covariates, the finding remained statistically significant.

Among the 9 OTUs, Lachnospiraceae showed the most significant association with MDD. On average, the relative abundance of this OTU in depressed twins was 0.85 times that in their non-depressed co-twins. Several animal studies reported that the abundance of Lachnospiraceae in mice correlated with behavioral changes induced by stress [120,124]. Lachnospiraceae is one of the most abundant
families of Firmicutes and is most often associated with the production of short-chain fatty acids (SCFA) from complex carbohydrates. A decrease in these bacteria often correlates with a decline in SCFA production, which in turn causes disease.

We found that Alistipes was 1.22 times as abundant in depressed twins as in non-depressed twins, which is consistent with a previous study on Alistipes levels in individuals with MDD. In our study, the genus Alistipes was one of the OTUs that showed the highest increase in the depressed twins. Alistipes has been significantly associated with inflammation. As such, it is potentially linked to depression via inflammatory pathways.

In our discriminant analysis, most of the depressed twins clustered together in terms of the t-SNE loadings. In accordance with previous studies, this finding suggests that MDD participants share a common base microbial profile [91,256,257].

Among the four MDD-associated gut metabolites identified in our study, TMAO was previously associated with cardiovascular disease. This metabolite is an amine oxide generated from trimethylamine by gut microbial metabolism. It protects against the protein-destabilizing effects of urea and thus influences cholesterol and sterol metabolism. Biological pathways related to TMAO could serve as a potential mechanistic link between chronic disease and depression.

The four significant metabolites are involved in three microbiome metabolic pathways: sphingolipid metabolism, D-glutamine and D-glutamate metabolism, and histidine metabolism. Sphingolipids and their by-product ceramide are well known to be associated with MDD. Sphingolipids have been suggested as mediators in the crosstalk between microbiota and disease [262,263]. Our integrated analysis of microbiome and metabolomics pathways supports this assumption.
Summarizing our results, we conclude that the microbiota composition may play an important role in MDD. Our study provides a novel understanding of the mechanisms underlying MDD from the perspective of the gut-brain axis.

We acknowledge that our study has several limitations. First, all participants were Caucasian. Additional population-based cohort studies are necessary to determine whether there are trans-ethnic differences in fecal microbiota composition. Second, our study involved only 37 pairs of monozygotic twins. The small sample size clearly limits our statistical power for identifying MDD-associated OTUs and metabolites. Third, this study used 16S rRNA sequencing, which can classify bacteria mostly down to the genus level rather than the species or strain level. There could be major differences even between two strains of the same species. New technologies for full metagenomics may provide a better representation of microbial identity and metabolic activity. Fourth, as this was a cross-sectional study, we were not able to establish a cause-effect relationship between gut microbiome composition, abundance, and depression. A follow-up animal study using fecal transplantation may help with causal inference.
Table 4-1. Clinical characteristics of twin pairs participating in the MMS (N=74)

Variable                                  Non-depressed co-twin (N=37)  Depressed twin (N=37)  P value a
Age, mean (SD), years                     38.4 (15.2)                   38.4 (15.2)
Female, No. (%)                           24 (64.2)                     24 (64.2)
Body mass index, mean (SD), kg/m2         27.1 (5.9)                    27.2 (6.1)             0.78
Smoking, mean (SD), pack/year             2.2 (5.4)                     2.3 (5.3)              0.68
AUDIT-C score, mean (SD) b                3.5 (2.4)                     3.3 (2.6)              0.64
Education below high school, No. (%)      4 (10.8)                      3 (8.1)                0.76
Family income less than $20,000, No. (%)  4 (10.8)                      4 (10.8)               0.99
BDI-II score, mean (SD)                   3.2 (4.9)                     6.8 (5.4)              0.01
Exposure to ACE, No. (%)                  4 (10.8)                      5 (13.5)               0.83
History of PTSD, No. (%)                  1 (2.7)                       3 (8.2)                0.01
Use of antidepressants, No. (%)           6 (16.2)                      20 (54.1)              0.01

Abbreviations: BDI-II, Beck Depression Inventory-II; ACE, adverse childhood experiences; PTSD, post-traumatic stress disorder.
a P values were calculated using the paired t-test.
b Score based on the Alcohol Use Disorders Identification Test-Consumption (AUDIT-C).
Table 4-2. Genus-level OTUs associated with MDD

Name                        FC    P value a  q b
Lachnospiraceae (HFOSpec6)  0.89  0.0004     0.0198
Ruminococcaceae (Unc05eti)  1.08  0.0005     0.0198
Butyricimonas               1.22  0.0006     0.0198
Alistipes                   1.17  0.0044     0.0658
Ruminococcaceae (Unc01e6u)  0.84  0.0045     0.0840
Dorea                       0.89  0.0057     0.0898
Odoribacter                 1.34  0.0064     0.0868
Lachnospiraceae (Unc057b2)  0.94  0.0081     0.0948
Ruminococcaceae (Ru2Call2)  0.92  0.0095     0.0992

Abbreviations: FC, fold change.
a P values were calculated using the Wilcoxon model.
b FDR-adjusted p value accounting for 94 OTUs.
Table 4-3. Gut-derived metabolites associated with MDD

Metabolite                  m/z     RT    P a     q b     Effect size c
N-acetylglycine             174.92  3.56  0.0004  0.0184  0.2913
Trimethylamine N-oxide      76.08   5.51  0.0007  0.0161  0.2645
2,3-dihydroxybutanoic acid  292.94  3.85  0.0030  0.0460  0.2254
Histidine                   154.49  6.64  0.0097  0.1116  0.1698

Abbreviations: m/z, mass/charge ratio; RT, retention time.
a Raw p value from the Wilcoxon model.
b FDR-adjusted p for 46 metabolites.
c Median differences between depressed twins and their non-depressed co-twins.
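
The effect size in Table 4-3 is a median paired difference, and the p values come from a paired Wilcoxon test. The following minimal Python sketch shows how such quantities can be computed for one metabolite. It rests on assumptions: it uses the signed-rank form of the paired Wilcoxon test with a large-sample normal approximation (no tie or continuity correction), which may not match the exact procedure or software used in this study. The function name paired_signed_rank and the abundance values are hypothetical.

```python
import math
from statistics import median


def paired_signed_rank(case, control):
    """Median paired difference, signed-rank sums, and a normal-approximation p."""
    diffs = [a - b for a, b in zip(case, control)]
    nonzero = [d for d in diffs if d != 0]  # zero differences carry no sign
    n = len(nonzero)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    # Large-sample approximation without tie or continuity correction.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return median(diffs), w_plus, w_minus, p_two_sided


# Hypothetical normalized abundances of one metabolite in four twin pairs.
depressed_twin = [5.0, 7.0, 9.0, 11.0]
co_twin_values = [4.0, 5.0, 6.0, 7.0]
print(paired_signed_rank(depressed_twin, co_twin_values))
```

With only a handful of pairs the normal approximation is rough; an exact test would be preferable at such sample sizes.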
Table 4-4. Gut metabolic pathways enriched with differential microbiomes

Pathway                                 # of matches a  P             q b
Sphingolipid metabolism                 8               4.38 x 10^-5  0.0182
D-glutamine and D-glutamate metabolism  11              2.28 x 10^-4  0.0474
Histidine metabolism                    6               7.13 x 10^-4  0.0989

Abbreviations: m/z, mass/charge ratio; RT, retention time.
a Matched terms and gut taxa in each pathway.
b Adjusted for 416 gut metabolic pathways with 3 or more matches.
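
The q column of Table 4-4 can be illustrated with a short Python sketch of the Benjamini-Hochberg step-up adjustment. The source states only that the values were adjusted for 416 pathways, so the choice of this particular procedure is an assumption, as is treating the three listed p values as the three smallest of the 416. The function name bh_adjust and its n_tests parameter are hypothetical.

```python
def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg step-up adjustment.

    n_tests lets the caller state the full number of hypotheses when only
    the smallest p-values are available (an assumption of this sketch).
    """
    m = len(pvalues) if n_tests is None else n_tests
    if m < len(pvalues):
        raise ValueError("n_tests cannot be smaller than the number of p-values")
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down so each q is capped by the ones above it.
    for rank in range(len(order), 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Raw p-values of the three pathways in Table 4-4, with 416 pathways tested.
table_p = [4.38e-5, 2.28e-4, 7.13e-4]
print([round(q, 4) for q in bh_adjust(table_p, n_tests=416)])
```

Under these assumptions the rank-scaled values p x 416 / rank round to 0.0182, 0.0474, and 0.0989, which agrees with the q column as printed. The unreported p values of the remaining pathways could only lower these values through the monotonicity step, never raise them.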
Table 4-5. Sensitivity analysis of genus-level OTUs associated with MDD

Name                        Effect size b  P a
Lachnospiraceae (HFOSpec6)  0.380          0.031
Ruminococcaceae (Unc05eti)  0.798          0.001
Butyricimonas               0.090          0.010
Alistipes                   0.189          0.042
Ruminococcaceae (Unc01e6u)  0.385          0.091
Dorea                       0.737          0.038
Odoribacter                 0.030          0.011
Lachnospiraceae (Unc057b2)  0.548          0.046
Ruminococcaceae (Ru2Call2)  0.183          0.035

a P values were calculated under a mixed zero-inflated model, treating twin pair as a random effect and adjusting for age, gender, education level, income level, and use of antidepressants.
b Effect size for a 1 SD change in abundance of microbiome taxa.
Table 4-6. Sensitivity analysis of gut-derived metabolites associated with MDD

Metabolite                  m/z     RT    P a     Effect size b
N-acetylglycine             174.92  3.56  0.0087  0.1422
Trimethylamine N-oxide      76.08   5.51  0.0192  0.0478
2,3-dihydroxybutanoic acid  292.94  3.85  0.0275  0.0523
Histidine                   154.49  6.64  0.0386  0.1168

Abbreviations: m/z, mass/charge ratio; RT, retention time.
a P values were calculated under a mixed model, treating twin pair as a random effect and adjusting for age, gender, education level, income level, and use of antidepressants.
b Effect size for a 1 SD change in metabolite concentration level.
Figure 4-1. Overall gut microbial composition at the class level. The relative abundance of gut taxa is shown as bar height for depressed twins (red) and non-depressed twins (blue).
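
The composition shown in Figure 4-1 relies on read counts rarefied to a common depth (13,843 reads in this study) and then expressed as relative abundances. The following Python sketch illustrates both steps for one sample. It assumes rarefaction by random subsampling of reads without replacement, which is the usual meaning of the term but is not confirmed as the exact method of the custom script used here. The function names, the seed, and the class-level counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: reads} mapping without replacement to a fixed depth."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # One entry per read, so every read is equally likely to be kept.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    kept = random.Random(seed).sample(pool, depth)
    return Counter(kept)


def relative_abundance(counts):
    """Convert read counts to proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical class-level read counts for one sample (not study data).
sample = {"Clostridia": 9000, "Bacteroidia": 6000, "Bacilli": 1500}
rarefied = rarefy(sample, depth=13843)
print(relative_abundance(rarefied))
```

Samples with fewer reads than the chosen depth cannot be rarefied and raise an error in this sketch, which mirrors the removal of low-read samples before analysis.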
Figure 4-2. Microbiome diversity among depressed twins and non-depressed twins. Depressed twins show lower within-sample microbial phylogenetic (α) diversity than do the non-depressed twins (a-c). The microbial phylogenetic (β-diversity) analysis shows that the distance, measured as the weighted UniFrac score, is lower within each depressed or non-depressed group than between groups (d). Panels: a, b, c, d. Groups: Depressed, Non-depressed.
Figure 4-3. Hierarchical clustering of the genus-level abundance of MDD-related gut microbiota. Depressed participants (blue labels) clustered together.
Figure 4-4. t-SNE plot showing that depressed twins can be separated from their non-depressed co-twins based on their gut microbiome composition (paired t-test p < 0.01 for both loadings).
Figure 4-5. Correlation patterns between metabolites and microbiome taxa associated with MDD. Scale: correlation r, from -0.8 to 0.8.
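
The heatmap in Figure 4-5 is built from pairwise Spearman correlations between metabolites and OTUs. The following Python sketch shows one way such a correlation matrix can be assembled: Spearman's rho is computed as the Pearson correlation of average ranks. It is an illustration only and does not reproduce the study's software or its hierarchical clustering step. The function names, the OTU labels, and all abundance values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Hypothetical abundances across six samples (not study data).
metabolites = {"TMAO": [1.2, 0.8, 2.5, 3.1, 0.4, 1.9]}
otus = {"OTU_a": [10, 7, 30, 41, 2, 22], "OTU_b": [50, 61, 12, 9, 80, 20]}
matrix = {
    m: {o: spearman(mv, ov) for o, ov in otus.items()}
    for m, mv in metabolites.items()
}
print(matrix)
```

Because the correlation is rank-based, it is insensitive to the skewed, non-normal abundance distributions that motivated the nonparametric tests used elsewhere in this chapter.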
CHAPTER 5
SUMMARY

MDD is a highly heterogeneous disorder. Its etiology is highly complex, involving both genes and the environment as well as their interactions. Despite substantial progress over the past decades, our understanding of its pathophysiology remains incomplete. In this dissertation, I identified novel genes and biological pathways associated with depression using an integrated omics approach.

Findings from EWAS in Young MZ Twins

In a genome-wide DNA methylation and expression analysis of monozygotic twin pairs discordant for MDD, I identified 39 differentially methylated regions (DMRs) and 30 differentially expressed genes (DEGs) associated with lifetime history of MDD. The identified genes are significantly enriched in biological processes related to neuronal function and stress responses, and are overrepresented in GWAS loci associated with depression.

Although it is generally believed that DNA methylation causes gene silencing, the relationship between DNA methylation and gene expression in peripheral blood monocytes may be more complicated than previously anticipated. To date, a global analysis of the extent and pattern of the correlation of DNA methylation with gene expression in human blood monocytes has been lacking. Here I demonstrated that monocyte DNA methylation can be both positively and negatively correlated with gene expression, although the correlations appear to be predominantly negative in promoter regions. These results are in agreement with previous studies reporting both positive and negative correlations between DNA methylation and gene expression in blood and brain [157,209,265]. Interestingly, I found that the relationship between DNA methylation and
cis-acting gene expression appears to vary from one genomic location to another, with methylation of putative genes located upstream of the TSS showing predominantly negative correlations, whereas those located downstream of the TSS show largely positive correlations with gene expression. Possible mechanisms for the negative correlations between DNA methylation and gene expression include interference with the binding of transcription factors or the recruitment of repressors such as histone deacetylases [206,207].

Integrated DNA methylome and transcriptome analysis revealed that DNA methylation was both negatively and positively correlated with gene expression in peripheral blood monocytes. Taken together, the integrated DNA methylome and transcriptome analysis reinforced the importance of DNA methylation in gene regulation and identified key candidate genes whose roles in depression are modulated by epigenetic changes. To my knowledge, this is the first integrated genome-wide profiling of the DNA methylome and transcriptome for lifetime history of MDD in purified blood monocytes from MZ discordant twin pairs recruited from a community-based population study.

Findings from EWAS in Elderly Individuals

In an epigenome-wide association analysis of DNA methylation data generated in postmortem brain samples from elderly individuals, I identified 74 DMPs and 46 DMRs associated with late-life depressive symptoms at the epigenome-wide significance level, after accounting for clinical and pathologic covariates and multiple testing. The differentially methylated genes were significantly enriched in biological processes related to glial cells. Moreover, the identified differentially methylated genes were overrepresented in GWAS loci related to depressive symptoms. Correlation analysis revealed that DNA methylation levels of the identified DMR genes were largely negatively correlated with
their cis-gene expression. Network analysis showed that the differentially methylated genes were co-regulated and enriched in several biological processes related to depression.

Depression is a complex and heterogeneous disorder involving the functions of many genes that jointly or interactively contribute to disease etiology. Traditional methods that model the effect of a single gene cannot effectively capture the complicated biological pathways implicated in depression. Using a network-based approach, I [...] methylated modules containing coordinated genes across different genomic loci. These findings support the hypothesis that altered DNA methylation is interdependent and that the brain methylome comprises a complex network of interacting processes. To my knowledge, this is the first genome-wide profiling of the DNA methylome for late-life depression in postmortem brain tissues from elderly individuals recruited from a community-based population study.

Findings from Microbiome Analysis in Young MZ Twins

Gut microbiome analysis found that depressed twins have significantly higher levels of Alistipes, Odoribacter, and Butyricimonas and lower levels of Ruminococcaceae, Dorea, and Lachnospiraceae compared to their non-depressed co-twins. Integrated gut microbiome and host metabolome analysis revealed a strong correlation between gut taxa and gut-derived metabolites. Furthermore, microbiome metabolic pathways such as sphingolipid metabolism, D-glutamine and D-glutamate metabolism, and histidine metabolism were significantly associated with MDD. To my knowledge, this is the largest microbiome analysis of lifetime history of MDD in MZ discordant twin pairs recruited from a community-based population study.
Limitations

This study has several limitations. These relate to the limited sample size, cell type heterogeneity, lack of assessment of variables, and the observational study design.

Limited sample size: Due to the finite sample, the MMS study may be underpowered. Thus, I might have missed important disease-related genes, compounds, and/or microbial taxa with small effect sizes [266,267]. However, studies using a twin design usually have higher power than those using unrelated individuals with the same sample size [64,268]. According to power analysis, I should have 80% power to detect a 1.1-fold change in DNA methylation at the genome-wide level. Furthermore, the current analysis used statistical tools such as region-based and network-based approaches rather than single tests to reduce the multiple-testing burden and thus increase power. Therefore, although the sample size is limited, I should still capture most of the important genes.

Cell type heterogeneity: Molecular profiles differ among cell types [269,270]. In the MMS study, purified blood monocytes were used for DNA methylation and gene expression profiling; the sample still includes other cell types such as macrophages or [...] sample had nearly 97% purity. In the ROSMAP study, DNA methylation was measured using mixed brain cells. To avoid the bias introduced by the cell mixture, I further adjusted for cell type composition in the model. Thus, confounding by cellular heterogeneity should not be a major concern for the MMS study.

Depression measurement: The twin participants in the MMS study were evaluated for lifetime history of MDD, and some non-depressed co-twins might ultimately develop an episode of MDD. The mean age of our sample was 38, which suggests that a large portion of the participants had already passed the peak risk period of young adulthood
. In addition, the current depression severity scores were higher on average for the depressed twins than for their non-depressed co-twins, even though only a very small number of twins were currently depressed at the time of the study visit. Thus, the results are promising even if some regions have not been replicated or are found to be related to earlier onset of MDD.

No information on previous depressive episodes or age of onset was available in ROSMAP. As such, I was not able to distinguish participants with late-onset depression from participants with chronic depression. However, since over 95% of the participants showing late-life depressive symptoms had multiple depressive episodes during follow-up, the results should represent the differentially methylated genes related to late-life depressive symptoms.

Limited microarray coverage: The methylation profiles were assessed using microarrays with limited genome coverage. The EPIC array covers only about 3% of the human genome, and it does not differentiate between cytosine methylation and hydroxymethylation. However, single-base resolution techniques, e.g., genome-wide bisulfite sequencing (WGBS), remain too expensive to be applied to a large number of samples. The EPIC array covers 99% of RefSeq genes with multiple probes per gene, 96% of CpG islands from the UCSC database, CpG island shores, and additional content selected from whole-genome bisulfite sequencing data. Data from the EPIC array are highly correlated with data generated using sequencing methods. As such, the EPIC platform remains a central tool in epigenetic research while the cost and complexity of bioinformatic analysis still prohibit the large-scale use of WGBS.
Undetermined causality: As in all other observational studies, the correlation between DNA methylation and gene expression observed in this study alone cannot establish causality [57,209]. Lacking DNA data, I was not able to infer the causal role of DNA methylation in MDD using Mendelian randomization methods in this dissertation. Functional studies are necessary to ultimately determine the causality between epigenetic changes and MDD.

Generalizability: The participants included in the current study are highly educated European Caucasians. Currently, it is unclear to what extent the results reported herein can be generalized to other ethnic groups, such as African Americans and Hispanic Americans, among others.

Strengths

Monozygotic twin design: The monozygotic twin pairs were matched on most confounders, especially genetic background. Since molecular biomarkers are generally influenced by genetics, it is important to control for genetic background. To control for genetic effects, most current studies adjust for the major principal components (PCs) calculated from whole-genome data. However, such PCs mainly capture global patterns, i.e., population structure. Adjusting for such PCs would not effectively account for the confounding effects of individual genetic variants on the association between local DNA methylation and a target disease trait. The monozygotic twin design is the gold standard for solving this issue, since the twins' DNA is perfectly matched. The use of a monozygotic discordant co-twin control design minimizes potential confounding by these factors. Therefore, an MZ discordant co-twin control design can have notable power to capture local DNA methylation signals, even in a relatively small sample.
Integration of multiple omics data: By utilizing multiple layers of molecular profiles, I performed an integrative analysis using a network-based approach. The development of MDD is a complex biological process that involves many independent and interrelated molecular pathways. An integrative approach that incorporates information from multi-layer molecular events is appropriate for unraveling these complex biological pathways. Integrative multi-omics analysis can generate new knowledge and novel mechanistic insight into the pathogenesis of MDD that cannot be attained by separate single-layer analyses. Traditional methods that model the effect of a single gene cannot effectively capture the complicated biological pathways implicated in MDD. Using a network-based approach, I [...] methylated modules containing coordinated genes across different genomic loci. Findings in this dissertation support the hypothesis that altered molecular levels are interdependent and that multi-omics profiles comprise a complex network of interacting processes.

Different methylation profiles in younger and elderly individuals: Late-life depressive symptoms may be under the regulation of molecular pathways other than those previously identified in younger age groups. In this dissertation, I examined the altered methylation profiles associated with depression in both younger and elderly individuals. According to the results obtained so far, although the two studies share some common genes, such as PECR and ZNF487, most identified genes are specific to particular age groups. This finding highlights the importance and utility of the twin design in identifying DMRs in different age groups.
Molecular profiles measured in both blood and brain tissues: Most existing studies use only molecular profiles measured in blood. Although correlated, blood DNA methylation levels do not necessarily reflect DNA methylation levels in the brain. In this dissertation, I reported the findings from analyzing peripheral blood first, followed by replication in human brain tissues. This approach not only ensures that my findings in peripheral blood can be replicated, but also reflects the molecular profiles in the brain, which would directly influence disease susceptibility.

Comprehensive neuropathology measurement in elderly individuals: Individuals with late-life depressive symptoms often have cognitive deficits, including impairments of episodic memory, speed of information processing, and executive functioning, which are first signs of dementia [272-275]. It is important to distinguish individuals with late-life depressive symptoms from those who have dementia. Neuropsychological assessment is generally considered the gold standard for differentiating between depression and the early stages of dementia. However, most previous studies do not have solid neuropsychological measurements. ROSMAP has collected a comprehensive measurement of age-related [...]. By controlling for neuropathology, it is feasible to identify differentially methylated genes that are independent of neuropsychology.

Conclusion and Future Direction

The findings reported in this dissertation support the important role of DNA methylation and gut microbiota dysbiosis in the molecular mechanisms underlying depression. Once validated, these newly discovered genes may serve as novel biomarkers and/or therapeutic targets.
To the best of my knowledge, this dissertation included the first integrated DNA methylome and transcriptome analysis in purified blood monocytes for lifetime history of MDD using a considerable number of MZ discordant pairs from a community-based population cohort.

The research presented in this dissertation has raised several new lines of research. First, the identified biomarkers need to be replicated in another twin or population-based cohort to validate their generalizability. Replication is important to help ensure that a novel methylation-disease association represents a credible causal gene. Given the increasing availability of public EWAS and whole [...] measurements.

Next, due to the limited sample size, integration of 3 or 4 layers of omics data would be underpowered and was not performed here. The etiology of depression is far more complex and is not centered on a single molecular mechanism. Full mechanistic insight will require incorporating multiple omics data at multiple time points. It is necessary to expand the study by recruiting more participants, which could help to capture false-negative genes with smaller effect sizes. The increased power will also enable the integrative analysis of more omics layers.

Third, limited by computational resources, the analysis integrating genome-wide methylation data and genome-wide expression data has not been finished in ROSMAP. I will continue the analysis and unravel the correlation pattern between DNA methylation and gene expression in the brains of depressed participants. Moreover, since DNA methylation and gene expression are tissue specific, they will also vary among
different brain regions. ROSMAP has collected brain tissue from multiple brain regions. Identifying the brain regions, cell types, and genes that are most differentially methylated and/or expressed remains a crucial line of research.

Finally, association analysis based on observational data alone cannot establish causal mechanisms. Mendelian randomization methods, which use genetic variants robustly associated with a modifiable exposure as instrumental variables to examine the causal effect of that exposure on an outcome, could serve as an instructive avenue for establishing causality. available in these two cohorts to test the direction and strength of each identified association. Ultimately, the strong causal effects identified will be further validated using animal models.
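As a rough illustration of the Mendelian randomization idea mentioned above, the following Python sketch computes per-variant Wald ratios and a fixed-effect inverse-variance weighted (IVW) estimate from two-sample summary statistics. It is only a minimal sketch under stated assumptions: the function names and the three-variant input values are hypothetical and are not results from this dissertation or from either cohort, and the IVW formula shown assumes independent, valid instruments and ignores uncertainty in the variant-exposure estimates.

```python
import math


def wald_ratios(beta_exposure, beta_outcome):
    """Per-variant causal estimates: outcome effect divided by exposure effect."""
    return [by / bx for bx, by in zip(beta_exposure, beta_outcome)]


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Each variant's Wald ratio is weighted by beta_exposure**2 / se_outcome**2,
    which is the inverse of its approximate variance.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    total_weight = sum(weights)
    return numerator / total_weight, math.sqrt(1.0 / total_weight)


# Hypothetical summary statistics for three independent variants
# (illustrative values only; e.g. variant -> methylation and variant -> depression).
beta_x = [0.10, 0.20, 0.15]    # variant-exposure effects
beta_y = [0.05, 0.10, 0.075]   # variant-outcome effects
se_y = [0.02, 0.02, 0.02]      # standard errors of the variant-outcome effects

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)
z_score = estimate / std_error
print(f"IVW estimate = {estimate:.3f}, SE = {std_error:.3f}, z = {z_score:.2f}")
```

In practice, an analysis of this kind would also need sensitivity checks for pleiotropy and weak instruments, which this sketch deliberately leaves out.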
LIST OF REFERENCES

1. American Psychiatric Association (2013) Diagnostic and Statistical Manual of Mental Disorders. 5th ed. Washington, DC: American Psychiatric Association. doi:10.1176/appi.books.9780890425596.
2. Global Burden of Disease Study 2013 Collaborators (2015) Global, regional, and national incidence, prevalence, and years lived with disability for 301 acute and chronic diseases and injuries in 188 countries, 1990-2013: a systematic analysis for the Global Burden of Disease Study 2013. Lancet 386: 743-800. doi:10.1016/S0140-6736(15)60692-4.
3. Richards D (2011) Prevalence and clinical course of depression: a review. Clin Psychol Rev 31: 1117-1125. doi:10.1016/j.cpr.2011.07.004.
4. Krishnan V, Nestler EJ (2010) Linking molecules to mood: new insight into the biology of depression. Am J Psychiatry 167: 1305-1320. doi:10.1176/appi.ajp.2009.10030434.
5. Hasin DS, Sarvet AL, Meyers JL, Saha TD, Ruan WJ, et al. (2018) Epidemiology of Adult DSM-5 Major Depressive Disorder and Its Specifiers in the United States. JAMA Psychiatry 75: 336-346. doi:10.1001/jamapsychiatry.2017.4602.
6. Lépine J-P, Briley M (2011) The increasing burden of depression. Neuropsychiatr Dis Treat 7: 3-7. doi:10.2147/NDT.S19617.
7. Lim GY, Tam WW, Lu Y, Ho CS, Zhang MW, et al. (2018) Prevalence of Depression in the Community from 30 Countries between 1994 and 2014. Sci Rep 8: 2861. doi:10.1038/s41598-018-21243-x.
8. Bose J, Ahrnsbrak A, Hedden S, Lipari R, Park-Lee E (2016) Key Substance Use and Mental Health Indicators in the United States: Results from the 2015 National Survey on Drug Use and Health. Tice P, editor. Rockville, MD: Substance Abuse and Mental Health Services Administration (SAMHSA).
9. Van de Velde S, Bracke P, Levecque K (2010) Gender differences in depression in 23 European countries. Cross-national variation in the gender gap in depression. Soc Sci Med 71: 305-313. doi:10.1016/j.socscimed.2010.03.035.
10. Kessler RC, Bromet EJ (2013) The epidemiology of depression across cultures. Annu Rev Public Health 34: 119-138. doi:10.1146/annurev-publhealth-031912-114409.
11. Kessler RC, Birnbaum HG, Shahly V, Bromet E, Hwang I, et al. (2010) Age differences in the prevalence and co-morbidity of DSM-IV major depressive episodes: results from the WHO World Mental Health Survey Initiative. Depress Anxiety 27: 351-364. doi:10.1002/da.20634.
12. American Academy of Suicidology (2014) Depression and Suicide Risk. American Academy of Suicidology.
13. Mezuk B, Eaton WW, Albrecht S, Golden SH (2008) Depression and type 2 diabetes over the lifespan: a meta-analysis. Diabetes Care 31: 2383-2390. doi:10.2337/dc08-0985.
14. Mulle JG, Vaccarino V (2013) Cardiovascular disease, psychosocial factors, and genetics: the case of depression. Prog Cardiovasc Dis 55: 557-562. doi:10.1016/j.pcad.2013.03.005.
15. Donnell HF, Fell PJ, et al. (1987) Depression, dementia and disability in the elderly. Br J Psychiatry 150: 482-493. doi:10.1192/bjp.150.4.482.
16. Walker ER, McGee RE, Druss BG (2015) Mortality in mental disorders and global disease burden implications: a systematic review and meta-analysis. JAMA Psychiatry 72: 334-341. doi:10.1001/jamapsychiatry.2014.2502.
17. Cuijpers P, Vogelzangs N, Twisk J, Kleiboer A, Li J, et al. (2014) Comprehensive meta-analysis of excess mortality in depression in the general community versus patients with specific illnesses. Am J Psychiatry 171: 453-462. doi:10.1176/appi.ajp.2013.13030325.
18. Lohoff FW (2010) Overview of the genetics of major depressive disorder. Curr Psychiatry Rep 12: 539-546. doi:10.1007/s11920-010-0150-6.
19. Otte C, Gold SM, Penninx BW, Pariante CM, Etkin A, et al. (2016) Major depressive disorder. Nat Rev Dis Primers 2: 16065. doi:10.1038/nrdp.2016.65.
20. McGuffin P, Katz R, Watkins S, Rutherford J (1996) A hospital-based twin register of the heritability of DSM-IV unipolar depression. Arch Gen Psychiatry 53: 129-136. doi:10.1001/archpsyc.1996.01830020047006.
21. Fernandez-Pujals AM, Adams MJ, Thomson P, McKechanie AG, Blackwood DHR, et al. (2015) Epidemiology and heritability of major depressive disorder, stratified by age of onset, sex, and illness course in Generation Scotland: Scottish Family Health Study (GS:SFHS). PLoS One 10: e0142197. doi:10.1371/journal.pone.0142197.
22. Gatz M, Pedersen NL, Plomin R, Nesselroade JR, McClearn GE (1992) Importance of shared genes and shared environments for symptoms of depression in older adults. J Abnorm Psychol 101: 701-708. doi:10.1037/0021-843X.101.4.701.
23. Cross-Disorder Group of the Psychiatric Genomics Consortium, Lee SH, Ripke S, Neale BM, Faraone SV, et al. (2013) Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs. Nat Genet 45: 984-994. doi:10.1038/ng.2711.
24. Weissman MM, Wickramaratne P, Nomura Y, Warner V, Verdeli H, et al. (2005) Families at high and low risk for depression: a 3-generation study. Arch Gen Psychiatry 62: 29-36. doi:10.1001/archpsyc.62.1.29.
25. Major Depressive Disorder Working Group of the PGC, Wray NR, Sullivan PF (2017) Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. BioRxiv. doi:10.1101/167577.
26. Maniam J, Antoniadis C, Morris MJ (2014) Early-Life Stress, HPA Axis Adaptation, and Mechanisms Contributing to Later Health Outcomes. Front Endocrinol (Lausanne) 5: 73. doi:10.3389/fendo.2014.00073.
27. Bifulco A, Brown GW, Adler Z (1991) Early sexual abuse and clinical depression in adult life. Br J Psychiatry 159: 115-122. doi:10.1192/bjp.159.1.115.
28. Phillips NK, Hammen CL, Brennan PA, Najman JM, Bor W (2005) Early adversity and the prospective prediction of depressive and anxiety disorders in adolescents. J Abnorm Child Psychol 33: 13-24. doi:10.1007/s10802-005-0930-3.
29. Kendler KS (2001) Twin studies of psychiatric illness. Arch Gen Psychiatry 58: 1005. doi:10.1001/archpsyc.58.11.1005.
30. Rice F, Harold G, Thapar A (2002) The genetic aetiology of childhood depression: a review. J Child Psychol Psychiatry 43: 65-79. doi:10.1111/1469-7610.00004.
31. Shih RA, Belmonte PL, Zandi PP (2004) A review of the evidence from family, twin and adoption studies for a genetic contribution to adult psychiatric disorders. Int Rev Psychiatry 16: 260-283. doi:10.1080/09540260400014401.
32. Duncan LE, Keller MC (2011) A critical review of the first 10 years of candidate gene-by-environment interaction research in psychiatry. Am J Psychiatry 168: 1041-1049. doi:10.1176/appi.ajp.2011.11020191.
33. Caspi A, Sugden K, Moffitt TE, Taylor A, Craig IW, et al. (2003) Influence of life stress on depression: moderation by a polymorphism in the 5-HTT gene. Science 301: 386-389. doi:10.1126/science.1083968.
34. Ogilvie AD, Battersby S, Bubb VJ, Fink G, Harmar AJ, et al. (1996) Polymorphism in serotonin transporter gene associated with susceptibility to major depression. Lancet 347: 731-733. doi:10.1016/S0140-6736(96)90079-3.
35. Verbeek EC, Bevova MR, Bochdanovits Z, Rizzu P, Bakker IMC, et al. (2013) Resequencing three candidate genes for major depressive disorder in a Dutch cohort. PLoS One 8: e79921. doi:10.1371/journal.pone.0079921.
36. Kao C-F, Fang Y-S, Zhao Z, Kuo P-H (2011) Prioritization and evaluation of depression candidate genes by combining multidimensional data resources. PLoS One 6: e18696. doi:10.1371/journal.pone.0018696.
37. Utge S, Soronen P, Partonen T, Loukola A, Kronholm E, et al. (2010) A population-based association study of candidate genes for depression and sleep disturbance. Am J Med Genet B, Neuropsychiatr Genet 153B: 468-476. doi:10.1002/ajmg.b.31002.
38. Varghese FP, Brown ES (2001) The Hypothalamic-Pituitary-Adrenal Axis in Major Depressive Disorder: A Brief Primer for Primary Care Physicians. Prim Care Companion J Clin Psychiatry 3: 151-155.
39. Penninx BWJH, Milaneschi Y, Lamers F, Vogelzangs N (2013) Understanding the somatic consequences of depression: biological mechanisms and the role of depression symptom profile. BMC Med 11: 129. doi:10.1186/1741-7015-11-129.
40. Philibert RA, Sandhu H, Hollenbeck N, Gunter T, Adams W, et al. (2008) The relationship of 5HTT (SLC6A4) methylation and genotype on mRNA expression and liability to major depression and alcohol dependence in subjects from the Iowa Adoption Studies. Am J Med Genet B, Neuropsychiatr Genet 147B: 543-549. doi:10.1002/ajmg.b.30657.
41. Kang H-J, Kim J-M, Stewart R, Kim S-Y, Bae K-Y, et al. (2013) Association of SLC6A4 methylation with early adversity, characteristics and outcomes in depression. Prog Neuropsychopharmacol Biol Psychiatry 44: 23-28. doi:10.1016/j.pnpbp.2013.01.006.
42. Duman EA, Canli T (2015) Influence of life stress, 5-HTTLPR genotype, and SLC6A4 methylation on gene expression and stress response in healthy Caucasian males. Biol Mood Anxiety Disord 5: 2. doi:10.1186/s13587-015-0017-x.
43. Lam D, Ancelin M-L, Ritchie K, Freak-Poli R, Saffery R, et al. (2018) Genotype-dependent associations between serotonin transporter gene (SLC6A4) DNA methylation and late-life depression. BMC Psychiatry 18: 282. doi:10.1186/s12888-018-1850-4.
44. Malcangio M, Lessmann V (2003) A common thread for pain and memory synapses? Brain-derived neurotrophic factor and trkB receptors. Trends Pharmacol Sci 24: 116-121. doi:10.1016/S0165-6147(03)00025-7.
45. Keller MB, Lavori PW, Mueller TI, Endicott J, Coryell W, et al. (1992) Time to recovery, chronicity, and levels of psychopathology in major depression. A 5-year prospective follow-up of 431 subjects. Arch Gen Psychiatry 49: 809-816.
46. Keita GP (2007) Psychosocial and cultural contributions to depression in women: considerations for women midlife and beyond. J Manag Care Pharm 13: S12-5.
47. Gatt JM, Nemeroff CB, Dobson-Stone C, Paul RH, Bryant RA, et al. (2009) Interactions between BDNF Val66Met polymorphism and early life stress predict brain and arousal pathways to syndromal depression and anxiety. Mol Psychiatry 14: 681-695. doi:10.1038/mp.2008.143.
48. Kim J-M, Stewart R, Kim S-W, Yang S-J, Shin I-S, et al. (2007) Interactions between life stressors and susceptibility genes (5-HTTLPR and BDNF) on depression in Korean elders. Biol Psychiatry 62: 423-428. doi:10.1016/j.biopsych.2006.11.020.
49. Schumacher J, Jamra RA, Becker T, Ohlraun S, Klopp N, et al. (2005) Evidence for a relationship between genetic variants at the brain-derived neurotrophic factor (BDNF) locus and major depression. Biol Psychiatry 58: 307-314. doi:10.1016/j.biopsych.2005.04.006.
50. Kalueff AV, Nutt DJ (2007) Role of GABA in anxiety and depression. Depress Anxiety 24: 495-517. doi:10.1002/da.20262.
51. Luscher B, Shen Q, Sahir N (2011) The GABAergic deficit hypothesis of major depressive disorder. Mol Psychiatry 16: 383-406. doi:10.1038/mp.2010.120.
52. Gao J, Pan Z, Jiao Z, Li F, Zhao G, et al. (2012) TPH2 gene polymorphisms and major depression - a meta-analysis. PLoS One 7: e36721. doi:10.1371/journal.pone.0036721.
53. Du J, Zhang Z, Li W, He L, Xu J, et al. (2016) Association study of the TPH2 Gene with Major Depressive Disorder in the Han Chinese Population. The European Journal of Psychiatry.
54. Lohoff FW, Aquino TD, Narasimhan S, Multani PK, Etemad B, et al. (2013) Serotonin receptor 2A (HTR2A) gene polymorphism predicts treatment response to venlafaxine XR in generalized anxiety disorder. Pharmacogenomics J 13: 21-26. doi:10.1038/tpj.2011.47.
55. Fabbri C, Marsano A, Serretti A (2013) Genetics of serotonin receptors and depression: state of the art. Curr Drug Targets 14: 531-548.
56. Visscher PM, Wray NR, Zhang Q, Sklar P, McCarthy MI, et al. (2017) 10 years of GWAS discovery: biology, function, and translation. Am J Hum Genet 101: 5-22. doi:10.1016/j.ajhg.2017.06.005.
57. Zhang D, Cheng L, Badner JA, Chen C, Chen Q, et al. (2010) Genetic control of individual differences in gene-specific methylation in human brain. Am J Hum Genet 86: 411-419. doi:10.1016/j.ajhg.2010.02.005.
58. Flanagan JM (2015) Epigenome-wide association studies (EWAS): past, present, and future. Methods Mol Biol 1238: 51-63. doi:10.1007/978-1-4939-1804-1_3.
59. Chang CQ, Yesupriya A, Rowell JL, Pimentel CB, Clyne M, et al. (2014) A systematic review of cancer GWAS and candidate gene meta-analyses reveals limited overlap but similar effect sizes. Eur J Hum Genet 22: 402-408. doi:10.1038/ejhg.2013.161.
60. Xu Z, Taylor JA (2009) SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies. Nucleic Acids Res 37: W600-5. doi:10.1093/nar/gkp290.
61. Amos W, Driscoll E, Hoffman JI (2011) Candidate genes versus genome-wide associations: which are better for detecting genetic susceptibility to infectious disease? Proc Biol Sci 278: 1183-1188. doi:10.1098/rspb.2010.1920.
62. Eric R. Braverman KB (2014) Genome-Wide Sequencing Compared to Candidate Gene Association Studies for Predisposition to Substance Abuse a Subset of Reward Deficiency Syndrome (RDS): Are we throwing the Baby Out with the Bathwater? Epidemiol 04. doi:10.4172/2161-1165.1000158.
63. Wilkening S, Chen B, Bermejo JL, Canzian F (2009) Is there still a need for candidate gene approaches in the era of genome-wide association studies? Genomics 93: 415-419. doi:10.1016/j.ygeno.2008.12.011.
64. Tsai P-C, Bell JT (2015) Power and sample size estimation for epigenome-wide association scans to detect differential DNA methylation. Int J Epidemiol 44: 1429-1441. doi:10.1093/ije/dyv041.
65. Auer PL, Lettre G (2015) Rare variant association studies: considerations, challenges and opportunities. Genome Med 7: 16. doi:10.1186/s13073-015-0138-2.
66. Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium (2011) Genome-wide association study identifies five new schizophrenia loci. Nat Genet 43: 969-976. doi:10.1038/ng.940.
67. Psychiatric GWAS Consortium Bipolar Disorder Working Group (2011) Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nat Genet 43: 977-983. doi:10.1038/ng.943.
68. Major Depressive Disorder Working Group of the Psychiatric GWAS Consortium, Ripke S, Wray NR, Lewis CM, Hamilton SP, et al. (2013) A mega-analysis of genome-wide association studies for major depressive disorder. Mol Psychiatry 18: 497-511. doi:10.1038/mp.2012.21.
69. Hek K, Demirkan A, Lahti J, Terracciano A, Teumer A, et al. (2013) A genome-wide association study of depressive symptoms. Biol Psychiatry 73: 667-678. doi:10.1016/j.biopsych.2012.09.033.
70. Kendler KS, Gatz M, Gardner CO, Pedersen NL (2006) A Swedish national twin study of lifetime major depression. Am J Psychiatry 163: 109-114. doi:10.1176/appi.ajp.163.1.109.
71. Kaprio J (2012) Twins and the mystery of missing heritability: the contribution of gene-environment interactions. J Intern Med 272: 440-448. doi:10.1111/j.1365-2796.2012.02587.x.
72. Bromet E, Andrade LH, Hwang I, Sampson NA, Alonso J, et al. (2011) Cross-national epidemiology of DSM-IV major depressive episode. BMC Med 9: 90. doi:10.1186/1741-7015-9-90.
73. Senn TE, Carey MP, Vanable PA (2010) The intersection of violence, substance use, depression, and STDs: testing of a syndemic pattern among patients attending an urban STD clinic. J Natl Med Assoc 102: 614-620. doi:10.1016/S0027-9684(15)30639-8.
74. Hasin DS, Goodwin RD, Stinson FS, Grant BF (2005) Epidemiology of major depressive disorder: results from the National Epidemiologic Survey on Alcoholism and Related Conditions. Arch Gen Psychiatry 62: 1097-1106. doi:10.1001/archpsyc.62.10.1097.
75. Thapar A, Collishaw S, Pine DS, Thapar AK (2012) Depression in adolescence. Lancet 379: 1056-1067. doi:10.1016/S0140-6736(11)60871-4.
76. Felitti VJ, Anda RF, Nordenberg D, Williamson DF, Spitz AM, et al. (1998) Relationship of Childhood Abuse and Household Dysfunction to Many of the Leading Causes of Death in Adults. Am J Prev Med 14: 245-258. doi:10.1016/S0749-3797(98)00017-8.
77. Chapman DP, Whitfield CL, Felitti VJ, Dube SR, Edwards VJ, et al. (2004) Adverse childhood experiences and the risk of depressive disorders in adulthood. J Affect Disord 82: 217-225. doi:10.1016/j.jad.2003.12.013.
78. Slavich GM, Irwin MR (2014) From stress to inflammation and major depressive disorder: a social signal transduction theory of depression. Psychol Bull 140: 774-815. doi:10.1037/a0035302.
79. Feingold D, Weiser M, Rehm J, Lev-Ran S (2015) The association between cannabis use and mood disorders: A longitudinal study. J Affect Disord 172: 211-218. doi:10.1016/j.jad.2014.10.006.
80. Cummings CM, Caporino NE, Kendall PC (2014) Comorbidity of anxiety and depression in children and adolescents: 20 years after. Psychol Bull 140: 816-845. doi:10.1037/a0034733.
81. Gullander M, Hogh A, Hansen ÅM, Persson R, Rugulies R, et al. (2014) Exposure to workplace bullying and risk of depression. J Occup Environ Med 56: 1258-1265. doi:10.1097/JOM.0000000000000339.
82. Gámez-Guadix M, Orue I, Smith PK, Calvete E (2013) Longitudinal and reciprocal relations of cyberbullying with depression, substance use, and problematic internet use among adolescents. J Adolesc Health 53: 446-452. doi:10.1016/j.jadohealth.2013.03.030.
83. Lin Y-L, Wang S (2014) Prenatal lipopolysaccharide exposure increases depression-like behaviors and reduces hippocampal neurogenesis in adult rats. Behav Brain Res 259: 24-34. doi:10.1016/j.bbr.2013.10.034.
84. Kendler KS, Kuhn JW, Prescott CA (2004) Childhood sexual abuse, stressful life events and risk for major depression in women. Psychol Med 34: 1475-1482. doi:10.1017/S003329170400265X.
85. Ober C, Vercelli D (2011) Gene-environment interactions in human disease: nuisance or opportunity? Trends Genet 27: 107-115. doi:10.1016/j.tig.2010.12.004.
86. Hornung OP, Heim CM (2014) Gene-environment interactions and intermediate phenotypes: early trauma and depression. Front Endocrinol (Lausanne) 5: 14. doi:10.3389/fendo.2014.00014.
87. Lopizzo N, Bocchio Chiavetto L, Cattane N, Plazzotta G, Tarazi FI, et al. (2015) Gene-environment interaction in major depression: focus on experience-dependent biological systems. Front Psychiatry 6: 68. doi:10.3389/fpsyt.2015.00068.
88. Story Jovanova O, Nedeljkovic I, Spieler D, Walker RM, Liu C, et al. (2018) DNA Methylation Signatures of Depressive Symptoms in Middle-aged and Elderly Persons: Meta-analysis of Multiethnic Epigenome-wide Studies. JAMA Psychiatry 75: 949-959. doi:10.1001/jamapsychiatry.2018.1725.
89. MicroRNAs as biomarkers for major depression: a role for let-7b and let-7c. Transl Psychiatry 6: e862. doi:10.1038/tp.2016.131.
90. Kawamura N, Shinoda K, Sato H, Sasaki K, Suzuki M, et al. (2018) Plasma metabolome analysis of patients with major depressive disorder. Psychiatry Clin Neurosci 72: 349-361. doi:10.1111/pcn.12638.
91. Zheng P, Zeng B, Zhou C, Liu M, Fang Z, et al. (2016) Gut microbiome remodeling induces depressive-like behaviors through a pathway mediated by the 796. doi:10.1038/mp.2016.44.
92. Dash S, Clarke G, Berk M, Jacka FN (2015) The gut microbiome and diet in psychiatry: focus on depression. Curr Opin Psychiatry 28: 1-6. doi:10.1097/YCO.0000000000000117.
93. Jaenisch R, Bird A (2003) Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat Genet 33 Suppl: 245-254. doi:10.1038/ng1089.
94. Trerotola M, Relli V, Simeone P, Alberti S (2015) Epigenetic inheritance and the missing heritability. Hum Genomics 9: 17. doi:10.1186/s40246-015-0041-3.
95. Malki K, Koritskaya E, Harris F, Bryson K, Herbster M, et al. (2016) Epigenetic differences in monozygotic twins discordant for major depressive disorder. Transl Psychiatry 6: e839. doi:10.1038/tp.2016.101.
96. Wankerl M, Miller R, Kirschbaum C, Hennig J, Stalder T, et al. (2014) Effects of genetic and early environmental risk factors for depression on serotonin transporter expression and methylation profiles. Transl Psychiatry 4: e402. doi:10.1038/tp.2014.37.
97. Szyf M (2011) DNA methylation, the early-life social environment and behavioral disorders. J Neurodev Disord 3: 238-249. doi:10.1007/s11689-011-9079-2.
98. Wilkinson MB, Xiao G, Kumar A, LaPlant Q, Renthal W, et al. (2009) Imipramine treatment and resiliency exhibit similar chromatin regulation in the mouse nucleus accumbens in depression models. J Neurosci 29: 7820-7832. doi:10.1523/JNEUROSCI.0932-09.2009.
99. Uchida S, Hara K, Kobayashi A, Otsuki K, Yamagata H, et al. (2011) Epigenetic status of Gdnf in the ventral striatum determines susceptibility and adaptation to daily stressful events. Neuron 69: 359-372. doi:10.1016/j.neuron.2010.12.023.
100. Sterrenburg L, Gaszner B, Boerrigter J, Santbergen L, Bramini M, et al. (2011) Chronic stress induces sex-specific alterations in methylation and expression of corticotropin-releasing factor gene in the rat. PLoS One 6: e28128. doi:10.1371/journal.pone.0028128.
101. Fuchikami M, Morinobu S, Kurata A, Yamamoto S, Yamawaki S (2009) Single immobilization stress differentially alters the expression profile of transcripts of the brain-derived neurotrophic factor (BDNF) gene and histone acetylation at its promoters in the rat hippocampus. Int J Neuropsychopharmacol 12: 73-82. doi:10.1017/S1461145708008997.
102. Walton E, Hass J, Liu J, Roffman JL, Bernardoni F, et al. (2016) Correspondence of DNA methylation between blood and brain tissue and its application to schizophrenia research. Schizophr Bull 42: 406-414. doi:10.1093/schbul/sbv074.
103. Chen Y, Breeze CE, Zhen S, Beck S, Teschendorff AE (2016) Tissue-independent and tissue-specific patterns of DNA methylation alteration in cancer. Epigenetics Chromatin 9: 10. doi:10.1186/s13072-016-0058-4.
104. Fuchikami M, Morinobu S, Segawa M, Okamoto Y, Yamawaki S, et al. (2011) DNA methylation profiles of the brain-derived neurotrophic factor (BDNF) gene as a potent diagnostic biomarker in major depression. PLoS One 6: e23881. doi:10.1371/journal.pone.0023881.
105. Ladd-Acosta C, Hansen KD, Briem E, Fallin MD, Kaufmann WE, et al. (2014) Common DNA methylation alterations in multiple brain regions in autism. Mol Psychiatry 19: 862-871. doi:10.1038/mp.2013.114.
106. Houtepen LC, Vinkers CH, Carrillo-Roa T, Hiemstra M, van Lier PA, et al. (2016) Genome-wide DNA methylation levels and altered cortisol stress reactivity following childhood trauma in humans. Nat Commun 7: 10967. doi:10.1038/ncomms10967.
107. Saavedra K, Molina-Márquez AM, Saavedra N, Zambrano T, Salazar LA (2016) Epigenetic modifications of major depressive disorder. Int J Mol Sci 17. doi:10.3390/ijms17081279.
108. Dalton VS, Kolshus E, McLoughlin DM (2014) Epigenetics and depression: return of the repressed. J Affect Disord 155: 1-12. doi:10.1016/j.jad.2013.10.028.
109. Keller S, Sarchiapone M, Zarrilli F, Tomaiuolo R, Carli V, et al. (2011) TrkB gene expression and DNA methylation state in Wernicke area does not associate with suicidal behavior. J Affect Disord 135: 400-404. doi:10.1016/j.jad.2011.07.003.
110. Liu Y, Yieh L, Yang T, Drinkenburg W, Peeters P, et al. (2016) Metabolomic biosignature differentiates melancholic depressive patients from healthy controls. BMC Genomics 17: 669. doi:10.1186/s12864-016-2953-2.
111. Hashimoto K (2018) Metabolomics of major depressive disorder and bipolar disorder: overview and future perspective. Adv Clin Chem 84: 81-99. doi:10.1016/bs.acc.2017.12.005.
112. Zheng H, Zheng P, Zhao L, Jia J, Tang S, et al. (2017) Predictive diagnosis of major depression using NMR-based metabolomics and least-squares support vector machine. Clin Chim Acta 464: 223-227. doi:10.1016/j.cca.2016.11.039.
113. Cassol E, Misra V, Morgello S, Kirk GD, Mehta SH, et al. (2015) Altered Monoamine and Acylcarnitine Metabolites in HIV-Positive and HIV-Negative Subjects With Depression. J Acquir Immune Defic Syndr 69: 18-28. doi:10.1097/QAI.0000000000000551.
114. Pan J-X, Xia J-J, Deng F-L, Liang W-W, Wu J, et al. (2018) Diagnosis of major depressive disorder based on changes in multiple plasma neurotransmitters: a targeted metabolomics study. Transl Psychiatry 8: 130. doi:10.1038/s41398-018-0183-x.
115. Shao W, Chen J, Fan S, Lei Y, Xu H, et al. (2015) Combined metabolomics and proteomics analysis of major depression in an animal model: perturbed energy metabolism in the chronic mild stressed rat cerebellum. OMICS 19: 383-392. doi:10.1089/omi.2014.0164.
116. Shah SH, Hauser ER, Bain JR, Muehlbauer MJ, Haynes C, et al. (2009) High heritability of metabolomic profiles in families burdened with premature cardiovascular disease. Mol Syst Biol 5: 258. doi:10.1038/msb.2009.11.
117. Carabotti M, Scirocco A, Maselli MA, Severi C (2015) The gut-brain axis: interactions between enteric microbiota, central and enteric nervous systems. Ann Gastroenterol 28: 203-209.
118. Evrensel A, Ceylan ME (2015) The Gut-Brain Axis: The Missing Link in Depression. Clin Psychopharmacol Neurosci 13: 239-244. doi:10.9758/cpn.2015.13.3.239.
119. Rea K, Dinan TG, Cryan JF (2016) The microbiome: A key regulator of stress and neuroinflammation. Neurobiol Stress 4: 23-33. doi:10.1016/j.ynstr.2016.03.001.
120. Bangsgaard Bendtsen KM, Krych L, Sørensen DB, Pang W, Nielsen DS, et al. (2012) Gut microbiota composition is correlated to grid floor induced stress and behavior in the BALB/c mouse. PLoS One 7: e46231. doi:10.1371/journal.pone.0046231.
121. Bailey MT, Dowd SE, Parry NMA, Galley JD, Schauer DB, et al. (2010) Stressor exposure disrupts commensal microbial populations in the intestines and leads to increased colonization by Citrobacter rodentium. Infect Immun 78: 1509-1519. doi:10.1128/IAI.00862-09.
122. Coplan JD, Andrews MW, Rosenblum LA, Owens MJ, Friedman S, et al. (1996) Persistent elevations of cerebrospinal fluid concentrations of corticotropin-releasing factor in adult nonhuman primates exposed to early-life stressors: implications for the pathophysiology of mood and anxiety disorders. Proc Natl Acad Sci USA 93: 1619-1623.
123. rien C, Patterson E, El Aidy S, et al. (2016) Transferring the blues: Depression-associated gut microbiota induces neurobehavioural changes in the rat. J Psychiatr Res 82: 109-118. doi:10.1016/j.jpsychires.2016.07.019.
124. Wong ML, Inserra A, Lewis MD, Mastronardi CA, Leong L, et al. (2016) Inflammasome signaling affects anxiety and depressive-like behavior and gut microbiome composition. Mol Psychiatry 21: 797-805. doi:10.1038/mp.2016.46.
125. Theriot CM, Koenigsknecht MJ, Carlson PE, Hatton GE, Nelson AM, et al. (2014) Antibiotic-induced shifts in the mouse gut microbiome and metabolome increase susceptibility to Clostridium difficile infection. Nat Commun 5: 3114. doi:10.1038/ncomms4114.
126. Guida F, Turco F, Iannotta M, De Gregorio D, Palumbo I, et al. (2018) Antibiotic-induced microbiota perturbation causes gut endocannabinoidome changes, hippocampal neuroglial reorganization and depression in mice. Brain Behav Immun 67: 230-245. doi:10.1016/j.bbi.2017.09.001.
127. Marin IA, Goertz JE, Ren T, Rich SS, Onengut-Gumuscu S, et al. (2017) Microbiota alteration is associated with the development of stress-induced despair behavior. Sci Rep 7: 43859. doi:10.1038/srep43859.
128. Fond G, Boukouaci W, Chevalier G, Regnault A, Eberl G, et al. (2015) The review. Pathol Biol 63: 35-42. doi:10.1016/j.patbio.2014.10.003.
129. Jiang H, Ling Z, Zhang Y, Mao H, Ma Z, et al. (2015) Altered fecal microbiota composition in patients with major depressive disorder. Brain Behav Immun 48: 186-194. doi:10.1016/j.bbi.2015.03.016.
130. Bailey MT, Dowd SE, Galley JD, Hufnagle AR, Allen RG, et al. (2011) Exposure to a social stressor alters the structure of the intestinal microbiota: implications for stressor-induced immunomodulation. Brain Behav Immun 25: 397-407. doi:10.1016/j.bbi.2010.10.023.
131. Park S-C, Hahn S-W, Hwang T-Y, Kim J-M, Jun T-Y, et al. (2014) Does age at onset of first major depressive episode indicate the subtype of major depressive disorder?: the clinical research center for depression study. Yonsei Med J 55: 1712-1720. doi:10.3349/ymj.2014.55.6.1712.
132. Lamers F, de Jonge P, Nolen WA, Smit JH, Zitman FG, et al. (2010) Identifying depressive subtypes in a large cohort study: results from the Netherlands Study of Depression and Anxiety (NESDA). J Clin Psychiatry 71: 1582-1589. doi:10.4088/JCP.09m05398blu.
133. Malki K, Keers R, Tosto MG, Lourdusamy A, Carboni L, et al. (2014) The endogenous and reactive depression subtypes revisited: integrative animal and human studies implicate multiple distinct molecular mechanisms underlying major depressive disorder. BMC Med 12: 73. doi:10.1186/1741-7015-12-73.
134. Aziz R, Steffens DC (2013) What are the causes of late-life depression? Psychiatr Clin North Am 36: 497-516. doi:10.1016/j.psc.2013.08.001.
135. Volkert J, Schulz H, Härter M, Wlodarczyk O, Andreas S (2013) The prevalence of mental disorders in older people in Western countries - a meta-analysis. Ageing Res Rev 12: 339-353. doi:10.1016/j.arr.2012.09.004.
136. Li Y, Tollefsbol TO (2016) Age-related epigenetic drift and phenotypic plasticity loss: implications in prevention of age-related human diseases. Epigenomics 8: 1637-1651. doi:10.2217/epi-2016-0078.
137. Viñuela A, Brown AA, Buil A, Tsai P-C, Davies MN, et al. (2018) Age-dependent changes in mean and variance of gene expression across tissues in a twin cohort. Hum Mol Genet 27: 732-741. doi:10.1093/hmg/ddx424.
138. Zuk O, Hechter E, Sunyaev SR, Lander ES (2012) The mystery of missing heritability: Genetic interactions create phantom heritability. Proc Natl Acad Sci USA 109: 1193-1198. doi:10.1073/pnas.1119675109.
139. Brown WM, Beck SR, Lange EM, Davis CC, Kay CM, et al. (2003) Age-stratified heritability estimation in the Framingham Heart Study families. BMC Genet 4 Suppl 1: S32. doi:10.1186/1471-2156-4-S1-S32.
140. Fiske A, Wetherell JL, Gatz M (2009) Depression in older adults. Annu Rev Clin Psychol 5: 363-389. doi:10.1146/annurev.clinpsy.032408.153621.
141. Korten NCM, Penninx BWJH, Kok RM, Stek ML, Oude Voshaar RC, et al. (2014) Heterogeneity of late-life depression: relationship with cognitive functioning. Int Psychogeriatr 26: 953-963. doi:10.1017/S1041610214000155.
142. Afari N, Noonan C, Goldberg J, Edwards K, Gadepalli K, et al. (2006) University of Washington Twin Registry: construction and characteristics of a community-based twin registry. Twin Res Hum Genet 9: 1023-1029. doi:10.1375/183242706779462543.
143. Allali I, Arnold JW, Roach J, Cadenas MB, Butz N, et al. (2017) A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome. BMC Microbiol 17: 194. doi:10.1186/s12866-017-1101-8.
144. Bennett DA, Wilson RS, Arvanitakis Z, Boyle PA, de Toledo-Morrell L, et al. (2013) Selected findings from the Religious Orders Study and Rush Memory and Aging Project. J Alzheimers Dis 33 Suppl 1: S397-403. doi:10.3233/JAD-2012-129007.
145. World Health Organization (2013) Mental Health and Older Adults.
146. Sullivan PF, Neale MC, Kendler KS (2000) Genetic epidemiology of major depression: review and meta-analysis. Am J Psychiatry 157: 1552-1562. doi:10.1176/appi.ajp.157.10.1552.
147. Córdova-Palomera A, Fatjó-Vilas M, Gastó C, Navarro V, Krebs MO, et al. (2015) Genome-wide methylation study on depression: differential methylation and variable methylation in monozygotic twins. Transl Psychiatry 5: e557. doi:10.1038/tp.2015.49.
148. Mill J, Dempster E, Caspi A, Williams B, Moffitt T, et al. (2006) Evidence for monozygotic twin (MZ) discordance in methylation level at two CpG sites in the promoter region of the catechol-O-methyltransferase (COMT) gene. Am J Med Genet B, Neuropsychiatr Genet 141B: 421-425. doi:10.1002/ajmg.b.30316.
149. Dempster EL, Wong CCY, Lester KJ, Burrage J, Gregory AM, et al. (2014) Genome-wide methylomic analysis of monozygotic twins discordant for adolescent depression. Biol Psychiatry 76: 977-983. doi:10.1016/j.biopsych.2014.04.013.
150. Mill J, Petronis A (2007) Molecular studies of major depressive disorder: the epigenetic perspective. Mol Psychiatry 12: 799-814. doi:10.1038/sj.mp.4001992.
154 151. Mill J, Tang T, Kaminsky Z, Khare T, Yazdanpanah S, et al. (2008) Epigenomic profiling reveals DNA methylation changes associated with major psychosis. Am J Hum Genet 82: 696 711. doi:10.1016/j.ajhg.2008.01.008. 152. Chong S, Whitelaw E (2004) Epigenetic germline inheritance. Curr Opin Genet Dev 14: 692 696. doi:10.1016/j.gde.2004.09.001. 153. Hoffmann A, Sportelli V, Ziller M, Spengler D (2017) Epigenomics of major depressive disorders and schizophrenia: early life decides. Int J Mol Sci 18. doi:10.3390/ijms18081711. 154. Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, et al. (2012) DNA methylation arrays as surrogate m easures of cell mixture distribution. BMC Bioinformatics 13: 86. doi:10.1186/1471 2105 13 86. 155. Bergvall N, Iliadou A, Johansson S, de Faire U, Kramer MS, et al. (2007) Genetic and shared environmental factors do not confound the association between bi rth weight and hypertension: a study among Swedish twins. Circulation 115: 2931 2938. doi:10.1161/CIRCULATIONAHA.106.674812. 156. Bouchard TJ, Lykken DT, McGue M, Segal NL, Tellegen A (1990) Sources of human psychological differences: the Minnesota Study of Twins Reared Apart. Science 250: 223 228. doi:10.1126/science.2218526. 157. Bell JT, Pai AA, Pickrell JK, Gaffney DJ, Pique Regi R, et al. (2011) DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines. Genome Biol 12: R10. doi:10.1186/gb 2011 12 1 r10. 158. Bell JT, Spector TD (2011) A twin approach to unraveling epigenetics. Trends Genet 27: 116 125. doi:10.1016/j.tig.2010.12.005. 159. Kuan PF, Waszczuk MA, Kotov R, Marsit CJ, Guffanti G, et al. (2017) An e pigenome wide DNA methylation study of PTSD and depression in World Trade Center responders. Transl Psychiatry 7: e1158. doi:10.1038/tp.2017.130. 160. Davies MN, Krause L, Bell JT, Gao F, Ward KJ, et al. (2014) Hypermethylation in the ZBTB20 gene is assoc iated with major depressive disorder. Genome Biol 15: R56. 
doi:10.1186/gb 2014 15 4 r56. 161. Miller AH, Raison CL (2016) The role of inflammation in depression: from evolutionary imperative to modern treatment target. Nat Rev Immunol 16: 22 34. doi:10.10 38/nri.2015.5. 162. Bailey P, Chang DK, Nones K, Johns AL, Patch A M, et al. (2016) Genomic analyses identify molecular subtypes of pancreatic cancer. Nature 531: 47 52. doi:10.1038/nature16965.
155 163. Strachan E, Hunt C, Afari N, Duncan G, Noonan C, et al. (2013) University of Washington Twin Registry: poised for the next generation of twin research. Twin Res Hum Genet 16: 455 462. doi:10.1017/thg.2012.124. 164. Beck AT, Steer RA, Brown GK (1996) BDI II. Beck Depression Inventory: Second Edition. San An tonio, TX: The Psychological Corporation. 165. Rush AJ, Trivedi MH, Ibrahim HM, Carmody TJ, Arnow B, et al. (2003) The 16 Item Quick Inventory of Depressive Symptomatology (QIDS), clinician rating (QIDS C), and self report (QIDS SR): a psychometric evalua tion in patients with chronic major depression. Biol Psychiatry 54: 573 583. 166. Bremner JD, Bolus R, Mayer EA (2007) Psychometric properties of the Early Trauma Inventory Self Report. J Nerv Ment Dis 195: 211 218. doi:10.1097/01.nmd.0000243824.84651.6c. 167. Moran S, Arribas C, Esteller M (2016) Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics 8: 389 399. doi:10.2217/epi.15.114. 168. Aryee MJ, Jaffe AE, Corrada Bravo H, Lad d Acosta C, Feinberg AP, et al. (2014) Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30: 1363 1369. doi:10.1093/bioinformatics/btu049. 169. Langmead B, Salzberg SL (2012) Fast gapped read alignment with Bowtie 2. Nat Methods 9: 357 359. doi:10.1038/nmeth.1923. 170. Li B, Dewey CN (2011) RSEM: accurate transcript quantification from RNA Seq data with or without a reference genome. BMC Bioinformatics 12: 323. doi:10.1186/147 1 2105 12 323. 171. Guintivano J, Aryee MJ, Kaminsky ZA (2013) A cell epigenotype specific model for the correction of brain cellular heterogeneity bias and its application to age, brain region and major depression. Epigenetics 8: 290 302. doi:10.4161/epi .23924. 172. Pantazatos SP, Huang YY, Rosoklija GB, Dwork AJ, Arango V, et al. 
(2017) Whole transcriptome brain expression and exon usage profiling in major depression and suicide: evidence for altered glial, endothelial and ATPase activity. Mol Psychiatr y 22: 760 773. doi:10.1038/mp.2016.130. 173. Tan Q (2013) Epigenetic epidemiology of complex diseases using twins. Med Epigenet 1: 46 51. doi:10.1159/000354285. 174. Boks MP, Derks EM, Weisenberger DJ, Strengman E, Janson E, et al. (2009) The relationshi p of DNA methylation with age, gender and genotype in twins and healthy controls. PLoS One 4: e6767. doi:10.1371/journal.pone.0006767.
156 175. Freeman A, Tyrovolas S, Koyanagi A, Chatterji S, Leonardi M, et al. (2016) The role of socio economic status in dep ression: results from the COURAGE (aging survey in Europe). BMC Public Health 16: 1098. doi:10.1186/s12889 016 3638 0. 176. for depression. BMC Psychiatry 14: 107. doi:10.1186/ 1471 244X 14 107. 177. AlegrÃa Torres JA, Baccarelli A, Bollati V (2011) Epigenetics and lifestyle. Epigenomics 3: 267 277. doi:10.2217/epi.11.22. 178. Peters TJ, Buckley MJ, Statham AL, Pidsley R, Samaras K, et al. (2015) De novo identification of diffe rentially methylated regions in the human genome. Epigenetics Chromatin 8: 6. doi:10.1186/1756 8935 8 6. 179. Li Shen;, Mount Sinai (2013) GeneOverlap: Test and visualize gene overlaps. Computer software. 180. Langfelder P, Horvath S (2008) WGCNA: an R p ackage for weighted correlation network analysis. BMC Bioinformatics 9: 559. doi:10.1186/1471 2105 9 559. 181. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interac tion networks. Genome Res 13: 2498 2504. doi:10.1101/gr.1239303. 182. Pers TH, Karjalainen JM, Chan Y, Westra H J, Wood AR, et al. (2015) Biological interpretation of genome wide association studies using predicted gene functions. Nat Commun 6: 5890. doi:10.1038/ncomms6890. 183. MacArthur J, Bowler E, Cerezo M, Gil L, Hall P, et al. (2017) The new NHGRI EBI Catalog of published genome wide association studies (GWAS Catalog). Nucleic Acids Res 45: D896 D901. doi:10.1093/nar/gkw1133. 184. Koscielny G, An P, Carvalho Silva D, Cham JA, Fumis L, et al. (2017) Open Targets: a platform for therapeutic target identification and validation. Nucleic Acids Res 45: D985 D994. doi:10.1093/nar/gkw1055. 185. Almonte AG, Sweatt JD (2011) Serine proteases, serine pro tease inhibitors, and protease activated receptors: roles in synaptic function and behavior. Brain Res 1407: 107 122. doi:10.1016/j.brainres.2011.06.042. 186. 
Reumann R, Vierk R, Zhou L, Gries F, Kraus V, et al. (2017) The serine protease inhibitor neuros erpin is required for normal synaptic plasticity and regulates learning and social behavior. Learn Mem 24: 650 659. doi:10.1101/lm.045864.117. 187. Feder ME, Hofmann GE (1999) Heat shock proteins, molecular chaperones, and the stress response: evolutionar y and ecological physiology. Annu Rev Physiol 61: 243 282. doi:10.1146/annurev.physiol.61.1.243.
157 188. Penke B, BogÃ¡r F, Crul T, SÃ¡ntha M, TÃ³th ME, et al. (2018) Heat Shock Proteins and Autophagy Pathways in Neuroprotection: from Molecular Bases to Pharmac ological Interventions. Int J Mol Sci 19. doi:10.3390/ijms19010325. 189. Polajnar M, Zerovnik E (2014) Impaired autophagy: a link between neurodegenerative and neuropsychiatric diseases. J Cell Mol Med 18: 1705 1711. doi:10.1111/jcmm.12349. 190. Taniguch i T, Tanaka S, Ishii A, Watanabe M, Fujitani N, et al. (2013) A brain specific Grb2 associated regulator of extracellular signal regulated kinase (Erk)/mitogen activated protein kinase (MAPK) (GAREM) subtype, GAREM2, contributes to neurite outgrowth of neu roblastoma cells by regulating Erk signaling. J Biol Chem 288: 29934 29942. doi:10.1074/jbc.M113.492520. 191. Subaran RL, Odgerel Z, Swaminathan R, Glatt CE, Weissman MM (2016) Novel variants in ZNF34 and other brain expressed transcription factors are sh ared among early onset MDD relatives. Am J Med Genet B, Neuropsychiatr Genet 171B: 333 341. doi:10.1002/ajmg.b.32408. 192. Jansen R, Penninx BWJH, Madar V, Xia K, Milaneschi Y, et al. (2016) Gene expression in major depressive disorder. Mol Psychiatry 21: 339 347. doi:10.1038/mp.2015.57. 193. Kostich W, Hamman BD, Li Y W, Naidu S, Dandapani K, et al. (2016) Inhibition of AAK1 kinase as a novel therapeutic approach to treat neuropathic pain. J Pharmacol Exp Ther 358: 371 386. doi:10.1124/jpet.116.235333. 1 94. Nakazawa T, Hashimoto R, Sakoori K, Sugaya Y, Tanimura A, et al. (2016) Emerging roles of ARHGAP33 in intracellular trafficking of TrkB and pathophysiology of neuropsychiatric disorders. Nat Commun 7: 10594. doi:10.1038/ncomms10594. 195. Kuai L, Ong S E, Madison JM, Wang X, Duvall JR, et al. (2011) AAK1 identified as an inhibitor of neuregulin 1/ErbB4 dependent neurotrophic factor signaling using integrative chemical genomics and proteomics. Chem Biol 18: 891 906. doi:10.1016/j.chembiol.2011.03.017. 1 96. 
Lee S E, Chang S (2016) nArgBP2 as a hub molecule in the etiology of various neuropsychiatric disorders. BMB Rep 49: 457 458. doi:10.5483/BMBRep.2016.49.9.138. 197. Jarskog LF, Glantz LA, Gilmore JH, Lieberman JA (2005) Apoptotic mechanisms in the pa thophysiology of schizophrenia. Prog Neuropsychopharmacol Biol Psychiatry 29: 846 858. doi:10.1016/j.pnpbp.2005.03.010. 198. Rao S, MartÃnez Cengotitabengoa M, Yao Y, Guo Z, Xu Q, et al. (2017) Peripheral blood nerve growth factor levels in major psychiat ric disorders. J Psychiatr Res 86: 39 45. doi:10.1016/j.jpsychires.2016.11.012.
158 199. Uebi T, Itoh Y, Hatano O, Kumagai A, Sanosaka M, et al. (2012) Involvement of SIK3 in glucose and lipid homeostasis in mice. PLoS One 7: e37803. doi:10.1371/journal.pone. 0037803. 200. Pearson S, Schmidt M, Patton G, Dwyer T, Blizzard L, et al. (2010) Depression and insulin resistance: cross sectional associations in young adults. Diabetes Care 33: 1128 1133. doi:10.2337/dc09 1940. 201. IgnÃ¡cio ZM, RÃ©us GZ, Arent CO, Abel aira HM, Pitcher MR, et al. (2016) New perspectives on the involvement of mTOR in depression as well as in the action of antidepressant drugs. Br J Clin Pharmacol 82: 1280 1290. doi:10.1111/bcp.12845. 202. Treutlein J, Cichon S, Ridinger M, Wodarz N, Soyk a M, et al. (2009) Genome wide association study of alcohol dependence. Arch Gen Psychiatry 66: 773 784. doi:10.1001/archgenpsychiatry.2009.83. 203. Treutlein J, Rietschel M (2011) Genome wide association studies of alcohol dependence and substance use di sorders. Curr Psychiatry Rep 13: 147 155. doi:10.1007/s11920 011 0176 4. 204. Hagmeyer S, Haderspeck JC, Grabrucker AM (2014) Behavioral impairments in animal models for zinc deficiency. Front Behav Neurosci 8: 443. doi:10.3389/fnbeh.2014.00443. 205. Dob Zaleska M, Cui R, et al. (2017) Zinc in the monoaminergic theory of depression: its relationship to neural plasticity. Neural Plast 2017: 3682752. doi:10.1155/2017/3682752. 206. Grayson DR, Kundakovic M, Sharma RP (20 10) Is there a future for histone deacetylase inhibitors in the pharmacotherapy of psychiatric disorders? Mol Pharmacol 77: 126 135. doi:10.1124/mol.109.061333. 207. Bahar Halpern K, Vana T, Walker MD (2014) Paradoxical role of DNA methylation in activati on of FoxA2 gene expression during endoderm development. J Biol Chem 289: 23882 23892. doi:10.1074/jbc.M114.573469. 208. Ikegami K, Ohgane J, Tanaka S, Yagi S, Shiota K (2009) Interplay between DNA methylation, histone modification and chromatin remodelin g in stem cells and during development. 
Int J Dev Biol 53: 203 214. doi:10.1387/ijdb.082741ki. 209. van Eijk KR, de Jong S, Boks MPM, Langeveld T, Colas F, et al. (2012) Genetic analysis of DNA methylation and gene expression levels in whole blood of healthy human subjects. BMC Genomics 13: 636. doi:10.1186/1471 2164 13 636. 210. Mostafavi S, Battle A, Zhu X, Potash JB, Weissman MM, et al. (2014) Type I interferon signaling genes in recurrent major depression: increased expression detected by whole bl ood RNA sequencing. Mol Psychiatry 19: 1267 1274. doi:10.1038/mp.2013.161.
159 211. Beumer W, Gibney SM, Drexhage RC, Pont Lezica L, Doorduin J, et al. (2012) The immune theory of psychiatric diseases: a key role for activated microglia and circulating monocytes. J Leukoc Biol 92: 959 975. doi:10.1189/jlb.0212100. 212. Teschendorff AE, Menon U, Gentry Maharaj A, Ramus SJ, Gayther SA, et al. (2009) An epigenetic signature in peripheral blood predicts active ovarian cancer. PLoS One 4: e8274. doi:10.1371/ journal.pone.0008274. 213. Menke A, Binder EB (2014) Epigenetic alterations in depression and antidepressant treatment. Dialogues Clin Neurosci 16: 395 404. 214. Weissman MM (1996) Cross National Epidemiology of Major Depression and Bipolar Disorder. JAM A 276: 293. doi:10.1001/jama.1996.03540040037030. 215. Klap R, Unroe KT, UnÃ¼tzer J (2003) Caring for mental illness in the United States: a focus on older adults. Am J Geriatr Psychiatry 11: 517 524. 216. Chen D, Meng L, Pei F, Zheng Y, Leng J (2017) A r eview of DNA methylation in depression. J Clin Neurosci 43: 39 46. doi:10.1016/j.jocn.2017.05.022. 217. Bennett DA, Schneider JA, Arvanitakis Z, Wilson RS (2012) Overview and findings from the religious orders study. Curr Alzheimer Res 9: 628 645. doi:10. 2174/156720512801322573. 218. Bennett DA, Shannon KM, Beckett LA, Goetz CG, Wilson RS (1997) Metric Disease Rating Scale. Neurology 49: 1580 1587. doi:10.1212/WNL.49.6 .1580. 219. Kohout FJ, Berkman LF, Evans DA, Cornoni Huntley J (1993) Two shorter forms of the CES D (Center for Epidemiological Studies Depression) depression symptoms index. J Aging Health 5: 179 193. doi:10.1177/089826439300500202. 220. Bennett DA, Wi lson RS, Schneider JA, Bienias JL, Arnold SE (2004) Cerebral infarctions and the relationship of depression symptoms to level of cognitive functioning in older persons. Am J Geriatr Psychiatry 12: 211 219. doi:10.1097/00019442 200403000 00012. 221. Bennet t DA, Yu L, De Jager PL (2014) Building a pipeline to discover and Biochem Pharmacol 88: 617 630. 
doi:10.1016/j.bcp.2014.01.037. 222. De Jager PL, Srivastava G, Lunnon K, Burge ss J, Schalkwyk LC, et al. (2014) RHBDF2 and other loci. Nat Neurosci 17: 1156 1163. doi:10.1038/nn.3786. 223. McGregor K, Bernatsky S, Colmegna I, Hudson M, Pastinen T, et al. (2016) An evaluation of methods correcting for cell type heterogeneity in DNA methylation studies. Genome Biol 17: 84. doi:10.1186/s13059 016 0935 y.
160 224. Chan G, White CC, Winn PA, Cimpean M, Replogle JM, et al. (2015) CD33 modulates TREM2: convergence of Alzheimer loci. Nat Neurosci 18: 1556 1558. doi:10.1038/nn.4126. 225. Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8: 118 127. doi:10.1093/biostatistics/kxj037. 226. Dedeurwaerder S, Defrance M, Calonne E, Denis H, Sotiriou C, et al. (2011) Evaluation of the Infinium Methylation 450K technology. Epigenomics 3: 771 784. doi:10.2217/epi.11.105. 227. Bibikova M, Barnes B, Tsan C, Ho V, Klotzle B, et al. (2011) High density DNA methylation array with single CpG site resolution. Genomics 98: 288 295. doi:10.1016/j.ygeno.2011.07.007. 228. Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, et al. (2004) The Gene Ontology (GO) database and informatics resource. Nuclei c Acids Res 32: D258 61. doi:10.1093/nar/gkh036. 229. Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I (2001) Controlling the false discovery rate in behavior genetics research. Behav Brain Res 125: 279 284. doi:10.1016/S0166 4328(01)00297 2. 230. Breen MS, Wingo AP, Koen N, Donald KA, Nicol M, et al. (2018) Gene expression in cord blood links genetic risk for neurodevelopmental disorders with maternal psychological distress and adverse childhood outcomes. Brain Behav Immun 73: 320 330. doi:10.1016/j.bbi .2018.05.016. 231. Tian G, Lewis SA, Feierbach B, Stearns T, Rommelaere H, et al. (1997) Tubulin subunits exist in an activated conformational state generated and maintained by protein cofactors. J Cell Biol 138: 821 832. 232. Miyake N, Fukai R, Ohba C, Chihara T, Miura M, et al. (2016) Biallelic TBCD Mutations Cause Early Onset Neurodegenerative Encephalopathy. Am J Hum Genet 99: 950 961. doi:10.1016/j.ajhg.2016.08.005. 233. Buhr ED, Takahashi JS (2013) Molecular components of the Mammalian circadian cl ock. Handb Exp Pharmacol: 3 27. doi:10.1007/978 3 642 25950 0_1. 234. 
McCarthy MJ, Le Roux MJ, Wei H, Beesley S, Kelsoe JR, et al. (2016) Calcium circadian rhythms. Neuroph armacology 101: 439 448. doi:10.1016/j.neuropharm.2015.10.017. 235. Liu C, Chung M (2015) Genetics and epigenetics of circadian rhythms and their potential roles in neuropsychiatric disorders. Neurosci Bull 31: 141 159. doi:10.1007/s12264 014 1495 3.
161 236. Feng J, Fan G (2009) The role of DNA methylation in the central nervous system and neuropsychiatric disorders. Int Rev Neurobiol 89: 67 84. doi:10.1016/S0074 7742(09)89004 1. 237. Byrne EM, Carrillo Roa T, Henders AK, Bowdler L, McRae AF, et al. (2013) Monozygotic twins affected with major depressive disorder have greater variance in methylation than their unaffected co twin. Transl Psychiatry 3: e269. doi:10.1038/tp.2013.45. 238. co mmon clinical presentation. Innov Clin Neurosci 8: 38 42. 239. Kales HC, Valenstein M (2002) Complexity in late life depression: impact of confounding factors on diagnosis, treatment, and outcomes. J Geriatr Psychiatry Neurol 15: 147 155. doi:10.1177/0891 98870201500306. 240. Strakowski SM (2012) The complexities of depression. Curr Psychiatry Rep 14: 608 609. doi:10.1007/s11920 012 0330 7. 241. First MB, Spitzer RL, Williams JBW, Gibbon M (1995) Structured Clinical Interview for DSM IV Pationet Edition (SCID P). Washington, D. C.: American Psychiatric Press. 242. Bremner JD, Vermetten E, Mazure CM (2000) Development and preliminary psychometric properties of an instrument for the measurement of childhood trauma: the Early Trauma Inventory. Depress Anxie ty 12: 1 12. doi:10.1002/1520 6394(2000)12:1<1::AID DA1>3.0.CO;2 W. 243. Abrahamson M, Hooker E, Ajami NJ, Petrosino JF, Orwoll ES (2017) Successful collection of stool samples for microbiome analyses from a large community based population of elderly men . Contemp Clin Trials Commun 7: 158 162. doi:10.1016/j.conctc.2017.07.002. 244. Frolkis A, Knox C, Lim E, Jewison T, Law V, et al. (2010) SMPDB: the small molecule pathway database. Nucleic Acids Res 38: D480 7. doi:10.1093/nar/gkp1002. 245. Washburne AD , Morton JT, Sanders J, McDonald D, Zhu Q, et al. (2018) Methods for phylogenetic analysis of microbiome data. Nat Microbiol 3: 652 661. doi:10.1038/s41564 018 0156 0. 246. 
Strobl C, Boulesteix A L, Zeileis A, Hothorn T (2007) Bias in random forest variab le importance measures: illustrations, sources and a solution. BMC Bioinformatics 8: 25. doi:10.1186/1471 2105 8 25. 247. van der Maaten L (2014) Accelerating t SNE using Tree Based Algorithms. Journal of Machine Learning Research.
162 248. Holmes E, Li JV, Athanasiou T, Ashrafian H, Nicholson JK (2011) Understanding the role of gut microbiome host metabolic signal disruption in health and disease. Trends Microbiol 19: 349 359. doi:10.1016/j.tim.2011.05.006. 249. Langille MGI, Zaneveld J, Caporaso JG, McDona ld D, Knights D, et al. (2013) Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol 31: 814 821. doi:10.1038/nbt.2676. 250. Yau KKW, Wang K, Lee AH (2003) Zero Inflated Negative Binomial Mixed Regre ssion Modeling of Over Dispersed Count Data with Extra Zeros. Biom J 45: 437 452. doi:10.1002/bimj.200390024. 251. Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez Bello MG, et al. (2012) Human gut microbiome viewed across age and geography. Nature 48 6: 222 227. doi:10.1038/nature11053. 252. Castro Nallar E, Bendall ML, PÃ©rez Losada M, Sabuncyan S, Severance EG, et al. (2015) Composition, taxonomy and functional diversity of the oropharynx microbiome in individuals with schizophrenia and controls. Pee rJ 3: e1140. doi:10.7717/peerj.1140. 253. Evans SJ, Bassis CM, Hein R, Assari S, Flowers SA, et al. (2017) The gut microbiome composition associates with bipolar disorder and illness severity. J Psychiatr Res 87: 23 29. doi:10.1016/j.jpsychires.2016.12.00 7. 254. Lozupone CA, Stombaugh JI, Gordon JI, Jansson JK, Knight R (2012) Diversity, stability and resilience of the human gut microbiota. Nature 489: 220 230. doi:10.1038/nature11550. 255. Duncan SH, Louis P, Flint HJ (2007) Cultivable bacterial diversity from the human colon. Lett Appl Microbiol 44: 343 350. doi:10.1111/j.1472 765X.2007.02129.x. 256. Jiang W, Wu N, Wang X, Chi Y, Zhang Y, et al. (2015) Dysbiosis gut microbiota associated with inflammation and impaired mucosal immune function in intestine of humans with non alcoholic fatty liver disease. Sci Rep 5: 8096. doi:10.1038/srep08096. 257. Jangi S, Gandhi R, Cox LM, Li N, von Glehn F, et al. 
(2016) Alterations of the human gut microbiome in multiple sclerosis. Nat Commun 7: 12015. doi:10 .1038/ncomms12015. 258. Velasquez MT, Ramezani A, Manal A, Raj DS (2016) Trimethylamine N Oxide: The Good, the Bad and the Unknown. Toxins (Basel) 8. doi:10.3390/toxins8110326. 259. Yancey PH (2005) Organic osmolytes as compatible, metabolic and countera cting cytoprotectants in high osmolarity and other stresses. J Exp Biol 208: 2819 2830. doi:10.1242/jeb.01730. 260. Ottiger M, Nickler M, Steuer C, Bernasconi L, Huber A, et al. (2018) Gut, microbiota dependent trimethylamine N oxide is associated with lo ng term all cause
163 mortality in patients with exacerbated chronic obstructive pulmonary disease. Nutrition 45: 135 141.e1. doi:10.1016/j.nut.2017.07.001. 261. Jernigan PL, Hoehn RS, GrassmÃ© H, Edwards MJ, MÃ¼ller CP, et al. (2015) Sphingolipids in major dep ression. Neurosignals 23: 49 58. doi:10.1159/000442603. 262. Bryan P F, Karla C, Edgar Alejandro M T, Sara Elva E P, Gemma F, et al. (2016) Sphingolipids as Mediators in the Crosstalk between Microbiota and Intestinal Cells: Implications for Inflammatory Bowel Disease. Mediators Inflamm 2016: 9890141. doi:10.1155/2016/9890141. 263. Huang F C (2017) The role of sphingolipids on innate immunity to intestinal salmonella infection. Int J Mol Sci 18. doi:10.3390/ijms18081720. 264. Gilbert JA (2015) Our unique microbial identity. Genome Biol 16: 97. doi:10.1186/s13059 015 0664 7. 265. Rakyan VK, Hildmann T, Novik KL, Lewin J, Tost J, et al. (2004) DNA methylation profiling of the human major histocompatibility complex: a pilot study for the human epigenome pro ject. PLoS Biol 2: e405. doi:10.1371/journal.pbio.0020405. 266. Dahmen G, Rochon J, Konig IR, Ziegler A (2004) Sample size calculations for controlled clinical trials using generalized estimating equations (GEE). Methods Inf Med 43: 451 456. 267. Saffari A, Silver MJ, Zavattari P, Moi L, Columbano A, et al. (2018) Estimation of a significance threshold for epigenome wide association studies. Genet Epidemiol 42: 20 33. doi:10.1002/gepi.22086. 268. Verhulst B (2017) A power calculator for the classical twi n design. Behav Genet 47: 255 261. doi:10.1007/s10519 016 9828 9. 269. Titus AJ, Gallimore RM, Salas LA, Christensen BC (2017) Cell type deconvolution from DNA methylation: a review of recent applications. Hum Mol Genet 26: R216 R224. doi:10.1093/hmg/ddx275. 270. Holbrook JD, Huang R C, Barton SJ, Saffery R, Lillycrop KA (2017) Is cellular heterogeneity merely a confounder to be removed from epigenome wide association studies? Epigenomics 9: 1143 1150. 
doi:10.2217/epi 2017 0032. 271. Pidsley R, Zotenko E, Peters TJ, Lawrence MG, Risbridger GP, et al. (2016) Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole genome DNA methylation profiling. Genome Biol 17: 208. doi:10.1186/s13059 016 1066 1. 272. Nebes R D, Butters MA, Mulsant BH, Pollock BG, Zmuda MD, et al. (2000) Decreased working memory and processing speed mediate cognitive impairment in geriatric depression. Psychol Med 30: 679 691.
164 273. Elderkin Thompson V, Kumar A, Bilker WB, Dunkin JJ, Mintz J, e t al. (2003) Neuropsychological deficits among patients with late onset minor and major depression. Arch Clin Neuropsychol 18: 529 549. 274. Butters MA, Whyte EM, Nebes RD, Begley AE, Dew MA, et al. (2004) The nature and determinants of neuropsychological functioning in late life depression. Arch Gen Psychiatry 61: 587 595. doi:10.1001/archpsyc.61.6.587. 275. Baudic S, Tzortzis C, Barba GD, Traykov L (2004) Executive deficits in elderly patients with major unipolar depression. J Geriatr Psychiatry Neurol 17: 195 201. doi:10.1177/0891988704269823. 276. Wright SL, Persad C (2007) Distinguishing between depression and dementia in older persons: neuropsychological and neuropathological correlates. J Geriatr Psychiatry Neurol 20: 189 198. doi:10.1177/089198870 7308801. 277. Hasin Y, Seldin M, Lusis A (2017) Multi omics approaches to disease. Genome Biol 18: 83. doi:10.1186/s13059 017 1215 1. 278. Opgen Rhein R, Strimmer K (2007) From correlation to causation networks: a simple approximate learning algorithm an d its application to high dimensional plant gene expression data. BMC Syst Biol 1: 37. doi:10.1186/1752 0509 1 37.
BIOGRAPHICAL SKETCH
Yun Zhu earned a degree in physics at Fudan University. During his graduate training, he became interested in research. A seminar he attended in his senior year on identifying risk factors for cardiovascular disease gave him a better understanding of epidemiology. He was fascinated by the complexity of the field and by the increasingly crucial role that biology, medicine, and biostatistics play in disease prevention and health promotion. After graduation, he worked as a computer engineer at the Shanghai branch of Intel Corporation. Thanks to this opportunity, he acquired a great deal of knowledge and statistical techniques in computer programming and data processing, which are also crucial and valuable in analyzing complicated datasets in epidemiology and public health. In 2011, he met his advisor and mentor, Dr. Jinying Zhao, at the University of Oklahoma and worked as a research assistant for her. He pursued his master's degree in epidemiology at Tulane University in 2015. After that, he started his Ph.D. study in epidemiology at Tulane University. Because his Ph.D. mentor, Dr. Jinying Zhao, moved from Tulane University to the University of Florida, he transferred to the University of Florida in 2016. He completed his work in 2019 and identified multiple molecular markers for depression.