PAGE 4
I realized that I have been in school for most of my life and this dissertation is the culmination of my formal education, but certainly not the end of learning. I would like to thank everyone who has contributed to the accumulation of my knowledge and honing of my skills, those who have helped me in all aspects of my career, and all who have affected my life. In particular, thanks to:

The people in my early years of education: Mabel Nakasas, my first tutor, for her tireless effort even when I was daydreaming or falling asleep while she was teaching me Math; Rico Santos, who gave us the opportunity to better our Math skills; Mr. and Mrs. Yeban for all their support and mentorship; Dr. Aurello Ramos, Jr. for giving me a job at LSC; Dr. Augusto Hermosilla for all his precious pieces of advice regarding my career; all my colleagues at the Ateneo de Manila University Math Department for all their friendship and support.

All my recommenders: Dr. Jose Marasigan, Dr. Reginald Marcelo, and Dr. Gerry Salas (for initial admission to graduate school); Dr. Stephen Agard, Dr. Dennis Cook, Dr. Chris Bingham, and Dr. John Baxter (for admission to the Ph.D. program in Statistics at the University of Florida); Dr. Rongling Wu, Dr. James Hobert, Dr. Mark Yang, and Dr. Wendy London (for job applications).

Dr. Wendy London for giving me the opportunity to work at COG and learn about children's cancer, and my COG colleagues Patrick McGrady, Chenguang Wang, and Stephen Linda.

My colleagues, officemates and friends in the Statistics Department at UF: Aixin Tan, who has helped me a lot in my statistics career, Song Wu for all his help in statistical
PAGE 6
page
ACKNOWLEDGMENTS .... 4
LIST OF TABLES .... 8
LIST OF FIGURES .... 10
ABSTRACT .... 11
CHAPTER
1 INTRODUCTION .... 13
  1.1 Basic Genetics and QTL Mapping .... 14
    1.1.1 Terminology .... 14
    1.1.2 Experimental Crosses .... 15
    1.1.3 Linkage and Markers .... 16
    1.1.4 Interval Mapping .... 17
  1.2 Functional Mapping of QTL .... 20
    1.2.1 Model Formulation .... 20
    1.2.2 Parameter Estimation via the EM Algorithm .... 23
    1.2.3 Hypothesis Tests .... 25
  1.3 Other QTL Mapping Models .... 26
  1.4 Goals .... 28
2 NONPARAMETRIC COVARIANCE ESTIMATION IN FUNCTIONAL MAPPING OF QTL .... 30
  2.1 Introduction .... 30
  2.2 Covariance Estimation .... 31
    2.2.1 Modified Cholesky Decomposition and Regression Interpretation .... 31
    2.2.2 Regularized Covariance Estimators .... 33
    2.2.3 Ridge Regression and LASSO .... 35
    2.2.4 Penalized Likelihood .... 38
  2.3 Covariance Estimation in Functional Mapping .... 41
    2.3.1 Computing the Penalized Likelihood Estimates .... 41
    2.3.2 From EM to ECM Algorithm .... 44
    2.3.3 Selection of Tuning Parameter .... 46
  2.4 Numerical Results .... 46
    2.4.1 Simulations .... 46
    2.4.2 Real Data Analysis .... 56
  2.5 Summary and Discussion .... 62
PAGE 7
3 .... 64
  3.1 Introduction .... 64
  3.2 Functional Mapping of Reaction Norms to Multiple Environmental Signals .... 66
    3.2.1 Likelihood .... 68
    3.2.2 Mean and Covariance Models .... 69
    3.2.3 Hypothesis Tests .... 70
  3.3 Spatio-temporal Covariance Functions .... 71
    3.3.1 Introduction .... 71
    3.3.2 Basic Ideas, Notation, and Assumptions .... 72
    3.3.3 Separable Covariance Structures .... 73
    3.3.4 Nonseparable Covariance Structures .... 75
      3.3.4.1 Spectral method by Cressie and Huang (1999) .... 75
      3.3.4.2 Monotone function method by Gneiting (2002) .... 77
  3.4 Simulations .... 78
  3.5 Summary and Discussion .... 90
4 CONCLUDING REMARKS .... 93
    4.0.1 Summary .... 93
    4.0.2 Future Directions .... 94
APPENDIX
A DERIVATION OF EM ALGORITHM FORMULAS .... 97
B DERIVATION OF EQUATION 2-9 .... 99
C MINIMIZATION OF 2-33 .... 100
D DEFINITION OF KRONECKER PRODUCT .... 102
E DERIVATION OF EQUATION 3-20 .... 103
REFERENCES .... 104
BIOGRAPHICAL SKETCH .... 113
PAGE 8
Table / page
1-1 Conditional genotype probability in a backcross .... 19
1-2 Conditional genotype probability in an F2 .... 19
2-1 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for three QTL genotypes in an F2 population under different sample sizes (n) based on 100 simulation replicates (NP, Normal Data). .... 52
2-2 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for three QTL genotypes in an F2 population under different sample sizes (n) based on 100 simulation replicates (AR(1), Normal Data). .... 53
2-3 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for three QTL genotypes in an F2 population under different sample sizes (n) based on 100 simulation replicates (NP, Data from t-distribution). .... 54
2-4 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for three QTL genotypes in an F2 population under different sample sizes (n) based on 100 simulation replicates (AR(1), Data from t-distribution). .... 55
2-5 Available markers and phenotype data of a linkage map in an F2 population of mice (data from Vaughn et al., 1999). .... 59
3-1 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population under different sample sizes (n) based on 100 simulation replicates (Nonseparable Model). .... 81
3-2 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population under different sample sizes (n) based on 100 simulation replicates (Nonseparable Model). .... 82
3-3 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population under different sample sizes (n) based on 100 simulation replicates (NP). .... 83
PAGE 9
3-4 .... 84
3-5 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population under different sample sizes (n) based on 100 simulation replicates ($C_1$ with $n = 400$ and $\sigma^2 = 2, 4$). .... 88
3-6 Averaged QTL position, mean curve parameters, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population under different sample sizes (n) based on 100 simulation replicates ($C_1$ with $n = 400$, increased irradiance and temperature levels, and $\sigma^2 = 1, 2$). .... 89
PAGE 10
Figure / page
1-1 Experimental crosses from pure inbred line parents P1 and P2 .... 15
1-2 Crossing-over .... 17
1-3 Weights of mice measured every week for 10 weeks .... 22
1-4 Hypothetical plot of LR vs. linkage map .... 27
2-1 Penalized likelihood in curve estimation .... 40
2-2 Log-likelihood ratio (LR) plots based on simulated data under three different covariance structures .... 49
2-3 The profile of the log-likelihood ratios (LR) between the full model (there is a QTL) and reduced (there is no QTL) model for body mass growth trajectories across the genome in a mouse F2 population .... 58
2-4 Log-likelihood ratio (LR) plots for chromosomes 6 and 7 of the mice data .... 60
2-5 Three growth curves each presenting a genotype at each of seven QTLs detected on mouse chromosomes 1, 4, 6, 7, 10, 11, and 15 for growth trajectories of mice in an F2 population. .... 61
3-1 Reaction norm surface of photosynthetic rate as a function of irradiance and temperature .... 68
3-2 Boxplots of the values of the log-likelihood under the alternative model, $H_1$ .... 85
3-3 Covariance plots .... 86
3-4 Contour plots .... 87
4-1 Formation of a phenotype by a landscape .... 95
PAGE 11
One of the fundamental objectives in agricultural, biological and biomedical research is the identification of genes that control the developmental pattern of complex traits, their responses to the environment, and the way these genes interact in a coordinated manner to determine the final expression of the trait. More recently, a new statistical framework, called functional mapping, has been developed to identify and map quantitative trait loci (QTLs) that determine developmental trajectories by integrating biologically meaningful mathematical models of trait progression into a mixture model for unknown QTL genotypes. Functional mapping has emerged as a powerful statistical tool for mapping QTLs controlling the responsiveness (reaction norm) of a trait to developmental and environmental signals.

From a statistical perspective, functional mapping designed to study the genetic regulation and network of quantitative variation in dynamic complex traits is virtually a joint mean-covariance likelihood model. Appropriate choices of the model for the mean and covariance structures are of critical importance to statistical inference about QTL locations and actions/interactions. While a battery of statistical and mathematical models has been proposed for mean vector modeling, the analysis of covariance structure has been mostly limited to parametric structures like the autoregressive model of order one (AR(1)) or the structured antedependence (SAD) model. In functional mapping of reaction norms that respond to two environmental signals, a model, expressed as a Kronecker product of two AR(1)
PAGE 12
Our study proposes a nonparametric covariance estimator in functional mapping of quantitative trait loci. We adopt Huang et al.'s (2006) approach of invoking the modified Cholesky decomposition and converting the problem into modeling a sequence of regressions of responses. A regularized positive-definite covariance estimator is obtained using a normal penalized likelihood with an $L_2$ penalty. This approach is embedded within the mixture likelihood framework of functional mapping by using a reparameterized version of the derivative of the log-likelihood. We extend the idea of functional mapping to model the covariance structure of interaction effects between the two environmental signals in a non-separable way. The extended model allows the quantitative test of several fundamental biological questions. Is there a pleiotropic QTL that regulates genotypic responses to different environmental signals? What is the difference in the timing and duration of QTL expression between environment-specific responsiveness? How is an environment-dependent QTL regulated by a development-related QTL? We performed various simulation studies to reveal the statistical properties of the new models and demonstrate the advantages of the proposed estimator. By analyzing real examples in genetic studies, we illustrated the utilization and usefulness of the methodology. The new methods will provide a useful tool for genome-wide scanning for the existence, distribution and interactions of QTLs underlying a dynamic trait important to agriculture, biology and health sciences.
PAGE 13
A number of biological traits are quantitatively inherited. Examples of such traits include the height of trees, the weight or body mass of animals, the yield of agricultural crops, or even disease progression and drug response. Genetic mapping of quantitative traits and subsequent cloning of the underlying genes have become a considerable focus in agricultural, biological, and biomedical research. Since the publication of the seminal mapping paper by Lander and Botstein (1989), there has been a large amount of literature concerning the development of statistical methods for mapping complex traits (reviewed in Jansen, 2000; Hoeschele, 2000; Wu et al., 2007b). Although the idea of associating a continuously varying phenotype with a discrete trait (marker) dates back to the work of Sax (1923), it was Lander and Botstein (1989) who first established an explicit principle for linkage analysis. They also provided a tractable statistical algorithm for dissecting a quantitative trait into its individual genetic locus components, referred to as quantitative trait loci (QTLs).

The success of Lander and Botstein in developing a powerful method for linkage analysis of a complex trait has roots in two different developments. First, the rapid development of molecular technologies in the middle 1980s led to the generation of a virtually unlimited number of markers that specify the genome structure and organization of any organism (Drayna et al., 1984). Second, almost simultaneously, improved statistical and computational techniques, such as the EM algorithm (Dempster et al., 1977), made it possible to tackle complex genetic and genomic problems.

Lander and Botstein's (1989) model for interval mapping of QTLs is regarded as appropriate for an ideal (simplified) situation, in which the segregation patterns of all markers can be predicted on the basis of the Mendelian laws of inheritance and a trait under study is strictly controlled by one QTL on a chromosome. This work was extended and improved by many researchers (Jansen and Stam, 1994; Zeng, 1994; Haley et al., 1994;
PAGE 14
This chapter provides an overview of basic genetic concepts related to QTL mapping of complex traits. Fundamental procedures for functional mapping (Ma et al., 2002) will be emphasized. Functional mapping is a statistical and genetic model for mapping QTLs that underlie a complex dynamic trait. This chapter is organized as follows: Section 1.1 introduces basic genetic concepts and describes how QTL mapping, via interval mapping, is done using the idea of linkage in structured populations called experimental crosses. Section 1.2 introduces the functional mapping model. Section 1.3 describes a few other QTL mapping methods and, finally, Section 1.4 states the main goals of this dissertation and gives the outline of the rest of the chapters.

1.1.1 Terminology
PAGE 15
Figure 1-1. Experimental crosses from pure inbred line parents P1 and P2: F1 × P1 or F1 × P2 produces a backcross while F1 × F1 produces an intercross or F2. (Adapted from Broman, 1997).

'A' and 'a', the possible genotypes are AA, Aa and aa. The genotype determines the trait or phenotype. Variation due to a QTL results from phenotypes determined by different genotypes. However, because environmental factors also contribute to the total phenotypic variation, it is difficult to infer an offspring's genotype from its phenotype.

(Figure 1-1). Each parent contributes a chromosome strand to create an offspring called the first filial or F1, which has genotype Aa. If the F1 is mated to one of its parents, say P2, the offspring is called a backcross, with genotype either Aa or aa. During meiosis (the production of sex cells or gametes), each parental strand replicates and exchanges genetic material with other strands. This exchange is known as crossing-over.
PAGE 16
Figure 1-2. The chromosome strands from P1 and P2 pair up and then replicate to form a tetrad. Crossing-over, as illustrated here, occurs at one point between the inner strands and involves an exchange of genetic material. The end result is four chromosomes that either resemble the parental strands (nonrecombinant, NR) or not (recombinant, R). If crossing-over does occur, it can also do so more than once and include the other two outer strands. In general, recombinant strands are formed when there is an odd number of crossover points.

Genes on the same chromosome have an association called linkage. The tendency during meiosis is for the genes to remain on the same strand. This means that there will be more nonrecombinant than recombinant chromosomes. If $r$ is the recombination fraction, or the proportion of recombinant chromosomes, then the proportion of nonrecombinant chromosomes is $1 - r$. Because of linkage, it is generally true that $r < 1/2$. The value of $r$ depends on the distance between gene loci. Genes that are far apart usually have high values of $r$ because the large portion of chromosome in between allows a better chance for crossing-over to occur. If two genes are very close to each other, there is a high probability that no crossing-over will occur and they will end up on the same chromosome.

Linkage provides a way of locating a QTL in a chromosome by using known or identified genes called genetic or molecular markers. Markers do not affect the QTL's phenotype directly and as such they are said to be phenotypically neutral. But they
PAGE 17
Figure 1-2. Crossing-over: (1) Parental chromosome strands align. (2) Each strand replicates to form a tetrad; crossing-over starts between inner strands. (3) Recombinant (R) or nonrecombinant (NR) gametes.

may affect other visible phenotypes such as eye color, making it possible to distinguish their genotypes. If a marker is closely linked with a QTL, then both of their alleles could possibly end up on the same chromosome in a backcross or F2 offspring. The resulting marker genotype, which can be identified, is informative in predicting the QTL. Thus, a prerequisite for QTL mapping is the construction of a linkage map of markers that spans an entire chromosome or the set of all chromosomes in an organism (genome). The more markers there are, the greater the chance of QTL detection. Some of the popularly used markers include restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), and single nucleotide polymorphisms (SNPs).
PAGE 18
A unit map distance, expressed in centiMorgans (cM), between two loci is defined as the expected number of crossovers between the loci in 100 meiotic products. Assuming that crossovers occur at random and are independent of each other, the Haldane map function (Haldane, 1919; Wu et al., 2007c) can be used to relate a distance of $d$ cM to the recombination frequency $r$ in the following way:

$$d = -\frac{100\,\ln(1 - 2r)}{2} \quad \text{or} \quad r = \frac{1}{2}\left(1 - e^{-2d/100}\right). \qquad (1\text{-}1)$$

The distances between markers across the genome are known and usually expressed in cM. However, QTL mapping models utilize probabilities which are expressed in terms of $r$. Thus, when the marker distances of a linkage map are given in cM, they are converted to $r$ using Eq. 1-1.

For simplicity, we assume a backcross population with two possible genotypes at a locus denoted by 1 (for Aa) or 0 (for aa). Consider an interval on a chromosome with two linked markers, M and N, as endpoints and let $r$ be the recombination fraction between them. We refer to M as the left marker and N as the right marker. Suppose there exists a QTL, Q, within the marker interval. Let $r_1$ be the recombination fraction between M and Q and $r_2$ between Q and N. It is easy to show using Eq. 1-1 that $r = r_1 + r_2 - 2 r_1 r_2$. The QTL genotypes are unknown but their conditional probabilities given the marker genotypes can be derived. These conditional probabilities are shown in Table 1-1. As an illustration, if an offspring has genotype 1 (Mm) at marker M and 0 (nn) at marker N, then the marker interval genotype is 10. The conditional probability that a QTL has genotype 1 (Qq) is
PAGE 19
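Table 1-1. Conditional genotype probability in a backcross

Marker Interval Genotype    QTL Genotype 1    QTL Genotype 0
11    $\frac{(1-r_1)(1-r_2)}{1-r}$    $\frac{r_1 r_2}{1-r}$
10    $\frac{(1-r_1) r_2}{r}$    $\frac{r_1 (1-r_2)}{r}$
01    $\frac{r_1 (1-r_2)}{r}$    $\frac{(1-r_1) r_2}{r}$
00    $\frac{r_1 r_2}{1-r}$    $\frac{(1-r_1)(1-r_2)}{1-r}$

Table 1-2. Conditional genotype probability in an F2

Marker Interval Genotype    QTL Genotype 2    QTL Genotype 1    QTL Genotype 0
22    $\frac{(1-r_1)^2(1-r_2)^2}{(1-r)^2}$    $\frac{2 r_1(1-r_1) r_2(1-r_2)}{(1-r)^2}$    $\frac{r_1^2 r_2^2}{(1-r)^2}$
21    $\frac{(1-r_1)^2 r_2(1-r_2)}{r(1-r)}$    $\frac{r_1(1-r_1)(1-2r_2+2r_2^2)}{r(1-r)}$    $\frac{r_1^2 r_2(1-r_2)}{r(1-r)}$
20    $\frac{(1-r_1)^2 r_2^2}{r^2}$    $\frac{2 r_1(1-r_1) r_2(1-r_2)}{r^2}$    $\frac{r_1^2 (1-r_2)^2}{r^2}$
12    $\frac{r_1(1-r_1)(1-r_2)^2}{r(1-r)}$    $\frac{(1-2r_1+2r_1^2) r_2(1-r_2)}{r(1-r)}$    $\frac{r_1(1-r_1) r_2^2}{r(1-r)}$
11    $\frac{2 r_1(1-r_1) r_2(1-r_2)}{1-2r+2r^2}$    $\frac{(1-2r_1+2r_1^2)(1-2r_2+2r_2^2)}{1-2r+2r^2}$    $\frac{2 r_1(1-r_1) r_2(1-r_2)}{1-2r+2r^2}$
10    $\frac{r_1(1-r_1) r_2^2}{r(1-r)}$    $\frac{(1-2r_1+2r_1^2) r_2(1-r_2)}{r(1-r)}$    $\frac{r_1(1-r_1)(1-r_2)^2}{r(1-r)}$
02    $\frac{r_1^2 (1-r_2)^2}{r^2}$    $\frac{2 r_1(1-r_1) r_2(1-r_2)}{r^2}$    $\frac{(1-r_1)^2 r_2^2}{r^2}$
01    $\frac{r_1^2 r_2(1-r_2)}{r(1-r)}$    $\frac{r_1(1-r_1)(1-2r_2+2r_2^2)}{r(1-r)}$    $\frac{(1-r_1)^2 r_2(1-r_2)}{r(1-r)}$
00    $\frac{r_1^2 r_2^2}{(1-r)^2}$    $\frac{2 r_1(1-r_1) r_2(1-r_2)}{(1-r)^2}$    $\frac{(1-r_1)^2(1-r_2)^2}{(1-r)^2}$

Tables 1-1 and 1-2 were obtained using a three-point analysis of genes (see, for example, chapter 4 of Wu et al., 2007c). A QTL is usually searched or scanned at consecutive equidistant points or intervals in the genome. For example, a given chromosome is searched starting at the leftmost marker to the opposite end at every 2 or 4 cM. At each search point, the numerical values of the conditional probabilities of a QTL can be calculated using Tables 1-1 and 1-2. These conditional probabilities form the weights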
PAGE 20
(Section 1.2.1). Notice that for a given marker interval in either table, $\sum_{k=1}^{J} p_{k|i} = 1$, where $J$ is the number of genotypes ($J = 2, 3$ for a backcross and intercross, respectively). This means that all entries for a given row in either table add up to 1.

A complete interval mapping model involves the phenotype data aside from the markers. We shall see in the next section (1.2.1) how functional mapping, which is based on interval mapping, incorporates phenotype data.

1.2.1 Model Formulation

where the mean genotype value $g_k$ and covariance $\Sigma$ are specified, and $k = 1, \ldots, J$.

The likelihood function can be represented by a multivariate mixture model

$$L(\Omega) = \prod_{i=1}^{n} \left[ \sum_{k=1}^{J} p_{k|i}\, f_k(y_i \mid \Omega) \right] \qquad (1\text{-}3)$$

where $\Omega$ is the parameter vector which we will specify shortly. $p_{k|i}$ is the probability of a QTL genotype given the genotypes of two flanking markers (Section 1.1.4). As stated earlier, a QTL is searched at different points throughout the genome. At any given point, the numerical value of $p_{k|i}$ can be computed based on Tables 1-1 and 1-2. Thus, for a given search position, the log-likelihood, $\log L(\Omega)$, can be maximized to obtain the maximum likelihood estimates (MLEs) of the mean, $\mu$, and covariance, $\Sigma$, parameters. Therefore, $\Omega = (\mu, \Sigma)$.
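The conversion from map distance to recombination fraction and the backcross weights $p_{k|i}$ can be sketched in a few lines of Python. This is a minimal illustration, not code from the dissertation: the function names `haldane_r`, `haldane_d` and `backcross_qtl_probs` are hypothetical, and the probabilities assume the usual no-interference three-point result $r = r_1 + r_2 - 2 r_1 r_2$ stated in the text.

```python
import math


def haldane_r(d_cm):
    """Recombination fraction for a map distance of d_cm centiMorgans (Eq. 1-1)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def haldane_d(r):
    """Inverse of Eq. 1-1: map distance in cM for a recombination fraction r < 1/2."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def backcross_qtl_probs(d_left_cm, d_right_cm):
    """Conditional QTL genotype probabilities in a backcross.

    d_left_cm is the distance from the left marker M to the search point Q and
    d_right_cm the distance from Q to the right marker N.  Returns a dict that
    maps each marker-interval genotype to (P(QTL = 1), P(QTL = 0)).
    """
    r1 = haldane_r(d_left_cm)
    r2 = haldane_r(d_right_cm)
    r = r1 + r2 - 2.0 * r1 * r2  # recombination fraction between M and N
    p_qq = {
        "11": (1.0 - r1) * (1.0 - r2) / (1.0 - r),
        "10": (1.0 - r1) * r2 / r,
        "01": r1 * (1.0 - r2) / r,
        "00": r1 * r2 / (1.0 - r),
    }
    return {genotype: (p, 1.0 - p) for genotype, p in p_qq.items()}


# Search point 4 cM to the right of M in a 20 cM marker interval.
probs = backcross_qtl_probs(4.0, 16.0)
```

Scanning a chromosome then amounts to calling `backcross_qtl_probs` at every search point (for example every 2 cM) and using the row that matches each progeny's marker genotype as its mixture weights.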
PAGE 21
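(Figure 1-3) can be modeled by a logistic function defined by

$$g(t) = \frac{a}{1 + b e^{-rt}} \qquad (1\text{-}4)$$

(Niklas, 1994; West et al., 2001; Zhao et al., 2004). This model has a few desirable descriptive properties. The curve starts with an exponential phase and reaches an inflection point where the maximum rate of growth occurs. Then growth continues asymptotically towards the value $a$. The value at the onset of growth is $a/(1+b)$ at $t = 0$, while the value at the point of inflection is $a/2$ at $t = \log b / r$. These properties can be used to derive hypothesis tests (Ma et al., 2002; Zhao et al., 2004). Other parametric mean models include the sigmoid Emax equation, which relates drug concentration and drug effect (Lin et al., 2007), the Richards and Gompertzian curves for time-dependent tumor growth (Li et al., 2006), and the biexponential model for HIV-I dynamics (Wang et al., 2006). In the absence of structural forms, semiparametric (Cui et al., 2006; Wu et al., 2007d; Yang et al., 2007) or nonparametric (Zhao, W., 2005a; Yang, J., 2006; Yang et al., 2007) approaches can be used to model the mean.

The covariance, $\Sigma$, is assumed to be the same for each genotype group $k$. The usual parametric model is the autoregressive model of order one, AR(1),

$$\Sigma = \sigma^2 \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{m-1} \\ \rho & 1 & \rho & \cdots & \rho^{m-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{m-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{m-1} & \rho^{m-2} & \cdots & \cdots & 1 \end{bmatrix}, \qquad (1\text{-}5)$$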
PAGE 22
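Figure 1-3. Weights of mice measured every week for 10 weeks. Data from the study of Vaughn et al. (1999).

which is popular in the longitudinal data literature (Diggle et al., 2002; Verbeke and Molenberghs, 2005). This model assumes variance stationarity (equal residual variance $\sigma^2$ at each time point) and covariance stationarity (proportionally decreasing covariances between time points). Explicit forms for the inverse and determinant are easily obtained by matrix algebra:

$$\Sigma^{-1} = \frac{1}{\sigma^2 (1 - \rho^2)} \begin{bmatrix} 1 & -\rho & 0 & \cdots & 0 \\ -\rho & 1 + \rho^2 & -\rho & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & -\rho & 1 + \rho^2 & -\rho \\ 0 & \cdots & 0 & -\rho & 1 \end{bmatrix} \qquad (1\text{-}6)$$

and

$$|\Sigma| = \sigma^{2m} (1 - \rho^2)^{m-1}. \qquad (1\text{-}7)$$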
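The closed forms for the AR(1) inverse and determinant can be checked numerically. The following is a small sketch in plain Python; the helper names are hypothetical and the inverse used is the standard tridiagonal form for an AR(1) matrix with $m \ge 2$ time points, which is what makes the model cheap to evaluate inside a likelihood.

```python
def ar1_cov(m, sigma2, rho):
    """m x m AR(1) covariance matrix with entries sigma2 * rho**|i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(m)] for i in range(m)]


def ar1_inverse(m, sigma2, rho):
    """Closed-form tridiagonal inverse of the AR(1) covariance (m >= 2)."""
    c = 1.0 / (sigma2 * (1.0 - rho ** 2))
    inv = [[0.0] * m for _ in range(m)]
    for i in range(m):
        # Corner entries are 1, interior diagonal entries are 1 + rho^2.
        inv[i][i] = c * (1.0 if i in (0, m - 1) else 1.0 + rho ** 2)
        if i + 1 < m:
            inv[i][i + 1] = inv[i + 1][i] = -c * rho
    return inv


def ar1_det(m, sigma2, rho):
    """Closed-form determinant: sigma^(2m) * (1 - rho^2)^(m - 1)."""
    return sigma2 ** m * (1.0 - rho ** 2) ** (m - 1)


def matmul(a, b):
    """Plain-Python matrix product, used only for the check below."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Multiplying the covariance by its closed-form inverse should give the identity.
identity_check = matmul(ar1_cov(5, 2.0, 0.6), ar1_inverse(5, 2.0, 0.6))
```

No matrix factorization is needed: both quantities required by the multivariate normal density are available directly from $\sigma^2$ and $\rho$.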
PAGE 23
Thus, an AR(1) model is computationally efficient. Furthermore, when using the ECM algorithm, it is possible to obtain CM-step iteration solutions for the parameters in logistic (Ma et al., 2002) or rational function (Yap et al., 2007) mean models. Despite the advantages of an AR(1) model, the assumptions of variance and covariance stationarity may not always hold, especially for real data. This is evident from Figure 1-3, where the data appear to 'fan out' across time instead of being stationary. Wu et al. (2004b) tried to resolve this problem by applying a transform-both-sides (Carroll and Ruppert, 1984) log-transformation on the data to achieve approximate stationarity. The AR(1) can then be used on the transformed data (Zhao et al., 2004). However, such a transformation may not produce a stationary covariance, so that an AR(1) is still not an appropriate model. To get rid of stationarity issues, Zhao et al. (2005) proposed using a structured antedependence (SAD) model (Zimmerman and Núñez-Antón, 2001), which can handle non-stationary data and is more robust and less data-dependent than AR(1). The elements of an SAD covariance structure of order 1 are given by

$$\operatorname{var}(y_{ij}) = \frac{1 - \phi^{2j}}{1 - \phi^2}\,\nu^2, \qquad \operatorname{cov}(y_{ij_1}, y_{ij_2}) = \phi^{\,j_2 - j_1}\,\frac{1 - \phi^{2 j_1}}{1 - \phi^2}\,\nu^2, \quad j_2 \ge j_1, \qquad (1\text{-}8)$$

where $\phi$ is the generalized autoregressive parameter, $\nu^2$ is the innovation variance, and $j = 1, \ldots, m$.

Eq. 1-8 shows that the variances are not constant and the covariances are not only dependent on $|j_2 - j_1|$ but also on the reference point $j_1$. However, Zhao et al. still recommend modeling data by SAD in conjunction with AR(1).

The log-likelihood corresponding to Eq. 1-3 can be written as

$$\log L(\Omega) = \sum_{i=1}^{n} \log \left[ \sum_{k=1}^{J} p_{k|i}\, f_k(y_i \mid \Omega) \right] \qquad (1\text{-}9)$$
PAGE 24
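$$\frac{\partial}{\partial \theta} \log L(\Omega) = 0 \qquad (1\text{-}10)$$

where $\theta \in \Omega$. Note that for logistic mean, $\mu = \{a_k, b_k, r_k \mid k = 1, \ldots, J\}$, and AR(1) covariance, $\Sigma = \{\sigma^2, \rho\}$, models, $\Omega = (\mu, \Sigma)$. However, the left side of Eq. 1-10 can be reparameterized as

$$\frac{\partial}{\partial \theta} \log L(\Omega) = \sum_{i=1}^{n} \sum_{k=1}^{J} \frac{p_{k|i}\, \frac{\partial}{\partial \theta} f_k(y_i \mid \Omega)}{\sum_{k'=1}^{J} p_{k'|i}\, f_{k'}(y_i \mid \Omega)} = \sum_{i=1}^{n} \sum_{k=1}^{J} \frac{p_{k|i}\, f_k(y_i \mid \Omega)}{\sum_{k'=1}^{J} p_{k'|i}\, f_{k'}(y_i \mid \Omega)}\, \frac{\partial}{\partial \theta} \log f_k(y_i \mid \Omega) = \sum_{i=1}^{n} \sum_{k=1}^{J} P_{k|i}\, \frac{\partial}{\partial \theta} \log f_k(y_i \mid \Omega) \qquad (1\text{-}11)$$

where

$$P_{k|i} = \frac{p_{k|i}\, f_k(y_i \mid \Omega)}{\sum_{k'=1}^{J} p_{k'|i}\, f_{k'}(y_i \mid \Omega)} \qquad (1\text{-}12)$$

is interpreted as the posterior probability that progeny $i$ has QTL genotype $k$ (McLachlan and Peel, 2000; Ma et al., 2002). Let $P = \{P_{k|i};\ k = 1, \ldots, J;\ i = 1, \ldots, n\}$. The Expectation-Maximization (EM) algorithm (Dempster et al., 1977) at the $(j+1)$th iteration proceeds as follows:

1. The current value of $\Omega$ is $\Omega^{(j)}$.

2. E-step: Update $P^{(j)}$ using

$$P^{(j)}_{k|i} = \frac{p_{k|i}\, f_k(y_i \mid \Omega^{(j)})}{\sum_{k'=1}^{J} p_{k'|i}\, f_{k'}(y_i \mid \Omega^{(j)})}. \qquad (1\text{-}13)$$

3. M-step: With $P^{(j)}$, solve

$$\sum_{i=1}^{n} \sum_{k=1}^{J} P^{(j)}_{k|i}\, \frac{\partial}{\partial \theta} \log f_k(y_i \mid \Omega) = 0 \qquad (1\text{-}14)$$

to get $\Omega^{(j+1)}$.

4. Repeat until some convergence criterion is met.

The values at convergence are the MLEs of $\Omega$.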
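The four steps above can be sketched for a deliberately simplified case: a single measurement per progeny ($m = 1$), a separate mean for each QTL genotype, and a common variance. This is an illustrative assumption rather than the dissertation's logistic-mean, AR(1) model, and all function names and the toy data are hypothetical. The genotype weights `prior[i][k]` play the role of the known $p_{k|i}$ and are not updated.

```python
import math


def normal_pdf(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(y, prior, means, var):
    """Posterior probability that progeny i carries QTL genotype k."""
    post = []
    for yi, pi in zip(y, prior):
        w = [p * normal_pdf(yi, mu, var) for p, mu in zip(pi, means)]
        total = sum(w)
        post.append([wk / total for wk in w])
    return post


def m_step(y, post, n_groups):
    """Posterior-weighted means and pooled variance (closed form in this toy case)."""
    means = []
    for k in range(n_groups):
        weight = sum(p[k] for p in post)
        means.append(sum(p[k] * yi for p, yi in zip(post, y)) / weight)
    var = sum(
        p[k] * (yi - means[k]) ** 2
        for p, yi in zip(post, y)
        for k in range(n_groups)
    ) / len(y)
    return means, var


def log_lik(y, prior, means, var):
    """Mixture log-likelihood with fixed genotype weights."""
    return sum(
        math.log(sum(p * normal_pdf(yi, mu, var) for p, mu in zip(pi, means)))
        for yi, pi in zip(y, prior)
    )


def em(y, prior, means, var, tol=1e-8, max_iter=500):
    history = [log_lik(y, prior, means, var)]
    for _ in range(max_iter):
        post = e_step(y, prior, means, var)           # E-step
        means, var = m_step(y, post, len(means))      # M-step
        history.append(log_lik(y, prior, means, var))
        if history[-1] - history[-2] < tol:           # convergence criterion
            break
    return means, var, history


# Toy backcross-like data: two genotype groups with known conditional weights.
y = [1.0, 1.2, 0.8, 3.1, 2.9, 3.3]
prior = [[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3
fit_means, fit_var, history = em(y, prior, means=[0.5, 2.5], var=1.0)
```

In the full functional mapping model the M-step has no such closed form for every parameter, which is why the text turns to conditional maximization (ECM) or direct numerical optimizers.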
PAGE 25
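Appendix A shows the derivation of Eqs. 1-13 and 1-14 based on a missing data argument. This derivation can also be used to show the extension from a maximum likelihood to a maximum penalized likelihood algorithm (Section 2.3.2). For a more thorough treatment of the EM algorithm, the reader is referred to McLachlan and Peel (2000).

Although the EM algorithm provides efficient computation of the model parameters in functional mapping, other methods can be used as well, such as Newton-Raphson or the Nelder-Mead simplex algorithm (Nelder and Mead, 1965; Zhao et al., 2004), which is a direct nonlinear optimization procedure. These methods are particularly useful when no closed-form formulas for the parameter estimates can be obtained.

The existence of a QTL is tested with the hypotheses

$$H_0: (a_k, b_k, r_k) \equiv (a, b, r) \text{ for all } k = 1, \ldots, J \quad \text{vs.} \quad H_1: \text{not } H_0,$$

where $H_0$ is the reduced (or null) model, so that only a single logistic curve can fit the phenotype data, and $H_1$ is the full (or alternative) model, in which case there exists more than one logistic curve that fits the phenotype data due to the existence of a QTL. Note that the likelihood function corresponding to the null model is given simply by

$$L_0(\Omega) = \prod_{i=1}^{n} f(y_i \mid \Omega)$$

where $\Omega = (\mu, \Sigma)$ and $\mu = \{a, b, r\}$ in the case of a logistic mean model. A number of other important hypotheses can be tested, as outlined in Wu et al. (2004a).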
PAGE 26
The log-likelihood ratio (LR) test statistic,

$$LR = -2 \log \frac{L_0(\tilde{\Omega})}{L_1(\hat{\Omega})},$$

is plotted over the entire linkage map, where $\tilde{\Omega}$ and $\hat{\Omega}$ denote the MLEs under $H_0$ and $H_1$, respectively. The peak of the LR plot, which we shall from here on refer to as maxLR, would suggest a putative QTL because this corresponds to the position where $H_1$ is most likely over $H_0$. The distribution of LR is difficult to determine because of two major issues: the unidentifiability of the QTL position under $H_0$ and a multiple testing problem that arises because tests across the genome are not mutually independent (Wu et al., 2007c). However, a nonparametric method called permutation tests by Doerge and Churchill (1996) can be used to find an approximate distribution and a significance threshold for the existence of a QTL. In permutation tests, the functional mapping model is applied to several random permutations of the phenotype data on the markers, and a threshold is determined from the set of maxLR values obtained from each permutation test run. The idea here is to disassociate the markers and phenotypes so that repeated application of the model on permuted data will produce an approximate empirical null distribution.

Figure 1-4 shows a hypothetical plot of the log-likelihood ratio test statistic over a linkage map on a chromosome. The markers are spaced out at {0, 32, 46, 58, 68, 82, 100, 112} cM and the QTL search was done at 2 cM intervals from the leftmost marker at 0. The two horizontal lines are thresholds based on permutation tests. The solid red line that crosses the LR plot suggests a significant QTL while the broken green line does not.
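The permutation procedure described above can be sketched generically. In this hedged example, `permutation_threshold` is a hypothetical helper that accepts any function returning maxLR for a data set, and `toy_max_lr` is a stand-in (a two-group normal likelihood ratio evaluated only at the markers), not the functional mapping likelihood. The simulated data are purely illustrative.

```python
import math
import random


def toy_max_lr(y, markers):
    """Stand-in for a genome scan: largest normal-theory LR over all markers."""
    n = len(y)
    grand_mean = sum(y) / n
    rss0 = sum((v - grand_mean) ** 2 for v in y)  # one common mean (H0)
    best = 0.0
    for genotypes in markers:
        rss1 = 0.0  # separate mean per marker genotype (H1)
        for g in set(genotypes):
            group = [v for v, gi in zip(y, genotypes) if gi == g]
            mu = sum(group) / len(group)
            rss1 += sum((v - mu) ** 2 for v in group)
        best = max(best, n * math.log(rss0 / rss1))
    return best


def permutation_threshold(phenotypes, markers, max_lr, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of maxLR when phenotypes are shuffled."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the marker-phenotype association
        null_max.append(max_lr(shuffled, markers))
    null_max.sort()
    index = min(n_perm - 1, int(math.ceil((1.0 - alpha) * n_perm)) - 1)
    return null_max[index]


# Simulated example: 40 progeny, 2 markers, the first marker affects the trait.
rng = random.Random(1)
markers = [[rng.randint(0, 1) for _ in range(40)] for _ in range(2)]
phenotypes = [3.0 * g + rng.gauss(0.0, 1.0) for g in markers[0]]
observed = toy_max_lr(phenotypes, markers)
threshold = permutation_threshold(phenotypes, markers, toy_max_lr, n_perm=200)
```

A scan whose observed maxLR exceeds `threshold` corresponds to the case where the threshold line crosses the LR plot. Because each permutation refits the model across the genome, the cost grows linearly with `n_perm`.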
PAGE 27
Figure 1-4. Hypothetical plot of LR vs. linkage map. The latter consists of markers spaced out at {0, 32, 46, 58, 68, 82, 100, 112} along the chromosome represented by the x-axis. QTL search was at 2 cM intervals. The location corresponding to the peak of the plot, maxLR, suggests a QTL position. A threshold that crosses the plot (red horizontal line) indicates a significant QTL; if not (green horizontal broken line), the QTL is not significant.

mapping (CIM; Zeng, 1994) were proposed to address this issue. To increase the power of interval mapping, both methods use a subset of marker loci beyond the marker interval under consideration as covariates in a partial regression analysis. The effects of the subset of markers are used to estimate the effects of other QTLs. The problem with these methods, though, is how to select the appropriate markers to include in the model.

Multiple interval mapping (MIM; Kao et al., 1999) uses multiple marker intervals simultaneously to identify multiple QTLs. This model allows estimation of main and epistatic (interaction) effects among all detected QTLs. However, an issue with this method is model comparison and searching through models (Broman, 2001).
PAGE 28
(Section 1.2) provides a useful framework for genetic mapping through mean and covariance modeling of multi- or longitudinal traits. Because it requires a small number of model parameters to estimate, it is computationally efficient and can be used on data that have limited sample sizes. Functional mapping has shown potential as a powerful statistical method in QTL mapping.

Although parametric models such as AR(1) and SAD are suitable covariance structures for the likelihood-based functional mapping, severe bias could be introduced in the estimation process if the underlying data structure is significantly different. Specifically, a biased covariance estimate can affect the estimates for QTL location, QTL effects (the estimated mean model), and even the value of maxLR, which is needed in permutation tests for significance. Thus, there is a need for a robust estimator that can provide more accurate and precise results. In this regard, we propose a nonparametric approach. A nonparametric covariance estimator was proposed by Huang et al. (2006) for the null model (Section 1.2.3). These authors used a penalized likelihood procedure in solving a set of regression equations obtained from the modified Cholesky decomposition of the covariance matrix. Their estimator is regularized and guaranteed to
PAGE 29
The rest of the chapters are organized as follows:

In Chapter 2, we describe the modified Cholesky decomposition approach to covariance estimation and Huang et al.'s nonparametric procedure. We provide some discussion on ridge regression and LASSO techniques for solving a regression, and on penalized likelihood, because these are the main concepts behind their method. Then we show the extension to the mixture likelihood case and apply our model to simulated and real longitudinal data.

In Chapter 3, we extend the use of our proposed estimator to functional mapping of reaction norms with two environmental signals. We consider photosynthetic rate as the reaction norm and irradiance and temperature as the two environmental signals. The previously proposed covariance model was parametric and separable. In situations where the underlying data structure is nonseparable, our nonparametric estimator is shown to be more reliable based on the simulation results.

In Chapter 4, we conclude this dissertation.
PAGE 30
The main difficulties associated with covariance estimation are the number of parameters that need to be estimated (which grows quadratically with dimension) and the positive-definite constraint. Pourahmadi (1999) discovered that the latter can be taken care of if one uses the modified Cholesky decomposition. A few published studies that followed Pourahmadi's suggestion proposed ways of regularization to provide an efficient covariance estimator (Wu and Pourahmadi, 2003; Huang et al., 2006, 2007b; Levina et al., 2008). In this chapter, we adopt the approach proposed by Huang et al. (2006), which uses a penalized likelihood procedure, and extend it to the mixture likelihood framework of functional mapping. Such an extension is possible if the posterior probability reparameterization of the derivative of the log-likelihood function, Eq. 1-11, is used. The new approach is a nonparametric covariance estimator in functional mapping. The term nonparametric, which refers to distribution-free methods, may not be exactly appropriate. As we shall see in this chapter, the estimator is really for unstructured covariances and
PAGE 31
This chapter is organized as follows: In Section 2.2, we discuss the modified Cholesky decomposition and its regression interpretation, review the methods proposed in the literature for regularized covariance estimators, and discuss ridge regression (Hoerl and Kennard, 1970), the least absolute shrinkage and selection operator or LASSO (Tibshirani, 1996), and penalized likelihood. In Section 2.3, we describe how Huang et al.'s approach can be extended to functional mapping. Section 2.4 is devoted to simulations and an analysis of real data from an intercross progeny of mice using our proposed methodology. The last section (2.5) summarizes the chapter and provides some discussion.

2.2.1 Modified Cholesky Decomposition and Regression Interpretation

where

$$T = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ -\phi_{21} & 1 & 0 & \cdots & 0 \\ -\phi_{31} & -\phi_{32} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ -\phi_{m1} & -\phi_{m2} & \cdots & -\phi_{m,m-1} & 1 \end{bmatrix};$$
PAGE 32
where $-\phi_{tj}$ is the $(t, j)$th entry (for $j < t$)
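The decomposition can be illustrated numerically. The sketch below assumes the form $T \Sigma T' = D$, with $T$ unit lower triangular (below-diagonal entries $-\phi_{tj}$) and $D$ diagonal, which is consistent with the estimator $\hat{\Sigma} = \hat{T}^{-1} \hat{D} (\hat{T}^{-1})'$ used later in this section. The helper names are hypothetical, and the route via an LDL' factorization is one convenient way to obtain $T$ and $D$, not necessarily the one used in the dissertation.

```python
def ldl(sigma):
    """Factor sigma = C diag(d) C' with C unit lower triangular."""
    m = len(sigma)
    c = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    d = [0.0] * m
    for j in range(m):
        d[j] = sigma[j][j] - sum(c[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, m):
            partial = sum(c[i][k] * c[j][k] * d[k] for k in range(j))
            c[i][j] = (sigma[i][j] - partial) / d[j]
    return c, d


def unit_lower_inverse(c):
    """Invert a unit lower-triangular matrix by forward substitution."""
    m = len(c)
    t = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for i in range(m):
        for j in range(i):
            t[i][j] = -sum(c[i][k] * t[k][j] for k in range(j, i))
    return t


def modified_cholesky(sigma):
    """Return (T, d) with T sigma T' = diag(d); T[t][j] = -phi_tj for j < t."""
    c, d = ldl(sigma)
    return unit_lower_inverse(c), d


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


# AR(1) example with sigma^2 = 1 and rho = 0.5: each response regresses only on
# its immediate predecessor, so T carries -0.5 on the first subdiagonal.
sigma = [[0.5 ** abs(i - j) for j in range(4)] for i in range(4)]
T, innovation_vars = modified_cholesky(sigma)
diagonalized = matmul(matmul(T, sigma), transpose(T))
```

For this example `innovation_vars` comes out as 1 followed by three values of $1 - \rho^2 = 0.75$, which matches the regression interpretation: the diagonal of $D$ holds the prediction error variances of each response given its predecessors.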
PAGE 33
Pourahmadi (1999) recognized that the GARPs and the logarithm of the IVs are unconstrained parameters and hence can be modeled in terms of covariates. His approach was to estimate the Cholesky components $T$ and $D$, instead of estimating $\Sigma$ directly. That is, find estimates $\hat{T}$ and $\hat{D}$ of $T$ and $D$, respectively, so that an estimator of $\Sigma$ is $\hat{\Sigma} = \hat{T}^{-1} \hat{D} (\hat{T}^{-1})'$, which is positive-definite. He did this by suggesting a link function $g(\cdot)$ for $\Sigma$ defined by

$$g(\Sigma) = 2I - T - T' + \log D$$

(Wu and Pourahmadi, 2003), where $\log D$ means the matrix $D$ with the logarithm taken on each diagonal element and $I$ is the identity matrix. This formulation is analogous to a link function for the mean in generalized linear model theory (McCullagh and Nelder, 1989).

Although the Cholesky components still have the same number of parameters to estimate as any unstructured covariance matrix, this number can be reduced considerably by using covariates and modeling the entries of $T$ and $D$ either parametrically, nonparametrically, or in a Bayesian way (Pourahmadi, 1999; Daniels and Pourahmadi, 2002; Wu and Pourahmadi, 2003). Pourahmadi (1999) illustrated the parametric approach by using time lags as covariates for the entries of $T$ and $D$ in analyzing the cattle data (Kenward, 1987). Wu and Pourahmadi (2003) and Huang et al. (2007b) each used a nonparametric approach by capitalizing on the regression representation, Eq. 2-2. Intuitively, for longitudinal data, terms far away in the regression are expected to be small. That is, the lag $j$ regression coefficient $\phi_{t,t-j}$ is expected to be small for a fixed $t$ and large $j$. This means that for a given row of the $T$ matrix, the terms are expected
PAGE 34
Although Huang et al.'s nonparametric approach of smoothing the first few subdiagonals of the $T$ matrix and setting the rest to zero produces a statistically efficient estimator of $\Sigma$, it may not be adequate in cases when the diagonals are not smooth or when there may be small but nonzero elements in $T$. An alternative approach is to use penalized likelihood as proposed by Huang et al. (2006). By imposing roughness penalties on the negative log-likelihood function, this procedure essentially provides a solution to the sequence of regression equations, Eq. 2-2. The class of $L_p$ penalties with $p = 1, 2$ is considered under the general framework of penalized likelihood for regression models (Fan and Li, 2001). The $L_1$ and $L_2$ penalties allow shrinkage on the GARPs in a similar fashion as LASSO and ridge regression, respectively. Moreover, the $L_1$ penalty can shrink some of the GARPs toward zero and thus provide a selection scheme for regression coefficients. This is, however, different from banding $T$, where $\phi_{t,t-j}$ is set to zero for all $j > m_0$. With the $L_1$ penalty, the zeroes can be irregularly placed in any given row of $T$. Levina et al. (2008) proposed a similar penalized likelihood procedure called adaptive banding. Their method uses a nested lasso penalty which sets $\phi_{t,t-j}$ to zero for all $j > k$, where $k$ may
PAGE 35
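We adopt the $L_2$ penalty approach of Huang et al. (2006) and propose an extension of this method to covariance estimation in the mixture likelihood framework of functional mapping. Such an extension is possible by capitalizing on the posterior probability representation of the mixture log-likelihood used in the implementation of the EM algorithm (Section 2.3.1). Estimation is then carried out by using the ECM algorithm (Meng and Rubin, 1993) with two CM-steps (Section 2.3.2).

Assume the linear regression model, $y = X\beta + \varepsilon$, where $y = (y_1, y_2, \ldots, y_n)'$ is the vector of responses,

$$X = \begin{bmatrix} x_{11} & x_{21} & \cdots & x_{p1} \\ x_{12} & x_{22} & \cdots & x_{p2} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1n} & x_{2n} & \cdots & x_{pn} \end{bmatrix}$$

is the $n \times p$ design matrix, $\beta$ is the vector of regression coefficients and $\varepsilon$ is the vector of errors. The ordinary least squares (OLS) estimator of $\beta$ is

$$\hat{\beta} = (X'X)^{-1}X'y \tag{2-6}$$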
PAGE 36
and gives the minimum variance among unbiased linear estimators of $\beta$. A drawback of the OLS estimator 2-6 is that it is not unique when the design matrix $X$ is less than full rank, i.e. $\operatorname{rank}(X) < p$
PAGE 37
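2-9 and 2-10, small eigenvalues can cause the expected value and variance of the squared distance from $\hat{\beta}$ to $\beta$ to be large or, as shown by Eq. 2-11, the regression coefficients themselves to be too large in magnitude. Similarly, the variance of $\hat{\beta}$ can also be inflated since $\operatorname{var}(\hat{\beta}) = \sigma^2(X'X)^{-1}$. As a result, the mean squared error (MSE) also becomes inflated and predictions based on the OLS estimator (2-6) are not very reliable.

One way of resolving multicollinearity is through ridge regression. The idea of ridge regression is to make $X'X$ better conditioned by replacing it with $X'X + \lambda I$, where $\lambda$ is some positive number and $I$ is the identity matrix. The resulting estimator is

$$\hat{\beta}_r = (X'X + \lambda I)^{-1}X'y \tag{2-12}$$

which is essentially a version of $\hat{\beta}$ shrunk toward 0. The matrix $X'X + \lambda I$ has larger eigenvalues $\lambda_i + \lambda$, for $i = 1, \ldots, p$, where $\lambda_i$ denotes the $i$th eigenvalue of $X'X$, and $\hat{\beta}_r$ therefore has smaller prediction variances. It should be noted, however, that there is a trade-off because, unlike $\hat{\beta}$, $\hat{\beta}_r$ is not unbiased. For a suitable $\lambda$, however, we still get a lower MSE and better overall prediction. Hoerl and Kennard (1970) provide ways of selecting $\lambda$.

LASSO takes a similar approach to ridge regression in reducing variance by sacrificing bias. LASSO also shrinks the regression coefficients towards 0 but goes further by allowing some of them to be exactly 0. This is desirable in the sense that the resulting model is parsimonious and has better interpretation, because only the coefficients with strong effects are included in the model. The LASSO estimate of $\beta$, which we shall denote as $\hat{\beta}_l$, can be obtained by minimizing

$$(y - X\beta)'(y - X\beta) \quad \text{subject to} \quad \sum_{j=1}^{p}|\beta_j| \le t \tag{2-13}$$

where $t \ge 0$ is a tuning parameter. This is a quadratic programming problem with linear inequality constraints and Tibshirani (1996) provides efficient and stable algorithms to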
PAGE 38
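2-13 is equivalent to the penalized residual sum of squares

$$(y - X\beta)'(y - X\beta) + \lambda\sum_{j=1}^{p}|\beta_j| \tag{2-14}$$

(Gill et al., 1981; Tibshirani, 1996) where $\lambda$ is a tuning parameter.

Ridge regression can also be expressed as a constrained optimization problem, namely as a minimization of

$$(y - X\beta)'(y - X\beta) \quad \text{subject to} \quad \sum_{j=1}^{p}\beta_j^2 \le t \tag{2-15}$$

where $t \ge 0$ is a tuning parameter or, equivalently, a minimization of

$$(y - X\beta)'(y - X\beta) + \lambda\sum_{j=1}^{p}\beta_j^2 \tag{2-16}$$

(Tibshirani, 1996) where $\lambda$ is a tuning parameter. Therefore, 2-16 also leads to 2-12, with the tuning parameter of 2-16 playing the role of $\lambda$ in 2-12.

In general, a penalized likelihood criterion has the form

$$L(\theta) + \lambda\, p(\theta) \tag{2-17}$$

where $p(\cdot)$ is called the penalty function and $\lambda > 0$ is a tuning parameter. The idea behind penalized likelihood is, in some sense, similar to the decomposition of the mean squared error

$$\text{mse} = \text{bias}^2 + \text{variance} \tag{2-18}$$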
PAGE 39
where $y_i$ is the response in a regression on $x_i$, $\varepsilon_i$ is i.i.d. $N(0, \sigma^2)$ and $f$ is a function to be estimated. Here, $\theta = f$ and $L(f) = \sum_{i=1}^{n}(y_i - f(x_i))^2$, the residual sum of squares, without the constants. A completely unbiased estimate of $f$ is a curve, $\hat{f}$, that fits all the $y_i$'s exactly. However, this curve will have a high variance because of its rapid local variation. We say that $\hat{f}$ in this case is "rough", so that if we want to control the "roughness" aspect of $f$, we can use it as a penalty, $p(f)$, in 2-17. Towards the other extreme, another estimate of $f$ can be very smooth but severely biased. Figure 2-1 shows three different estimates of $f$ with varying degrees of roughness. The "in between" curve seems to be the best estimate because it has a good balance between fit and roughness. The tuning parameter $\lambda$ controls the relative importance of these two.

A popular choice for assessing roughness is the integrated squared second derivative (ISSD)

$$p_{ISSD}(f) = \int \left[f''(x)\right]^2 dx$$

which measures the curvature of a function (Ramsay and Silverman, 1997; Green, 1999). Highly variable functions will have high values of $p_{ISSD}$. A linear function has $p_{ISSD} = 0$ and therefore has the least curvature.

Another example of penalized likelihood is ridge regression, which we have seen in Section 2.2.3. In ridge regression, 2-16 has the same form as 2-17 with $\theta = \beta = (\beta_1, \ldots, \beta_p)'$. The aspect of $\beta$ that is being controlled is the size of its elements (the regression coefficients), quantified by $p_r(\beta) = \beta'\beta = \sum_{j=1}^{p}\beta_j^2$. LASSO also controls the size of the regression coefficients but uses the penalty $p_l(\beta) = \sum_{j=1}^{p}|\beta_j|$ instead. Ridge regression and LASSO are special cases of a family of penalized regressions called bridge regression, which imposes a penalty function of the form $\sum_{j=1}^{p}|\beta_j|^{\gamma}$, $\gamma \ge 1$ (Fu, 1998).
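The contrast between the OLS estimator 2-6 and the ridge estimator 2-12, together with the ridge and LASSO penalties $p_r$ and $p_l$, can be illustrated numerically. The following is a minimal sketch only, assuming NumPy; the simulated design with two nearly collinear columns, the seed, the value $\lambda = 1$ and the function names are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def ols(X, y):
    """OLS estimator (X'X)^{-1} X'y, as in Eq. 2-6."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, lam):
    """Ridge estimator (X'X + lam*I)^{-1} X'y, as in Eq. 2-12."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def ridge_penalty(beta):
    """p_r(beta) = sum of squared coefficients."""
    return float(np.sum(beta ** 2))

def lasso_penalty(beta):
    """p_l(beta) = sum of absolute coefficients."""
    return float(np.sum(np.abs(beta)))

# Hypothetical data: two nearly collinear predictors (an assumption for illustration).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

beta_ols = ols(X, y)
beta_ridge = ridge(X, y, lam=1.0)

print("OLS:  ", beta_ols, "p_r =", ridge_penalty(beta_ols), "p_l =", lasso_penalty(beta_ols))
print("ridge:", beta_ridge, "p_r =", ridge_penalty(beta_ridge), "p_l =", lasso_penalty(beta_ridge))
```

With collinear columns the OLS coefficients tend to be large and unstable, whereas the ridge coefficients are pulled toward zero; setting `lam=0.0` recovers the OLS solution.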
PAGE 40
Figure 2-1. Penalized likelihood in curve estimation. Depending on the penalty, the estimated curve can be rough, smooth or in between. The in-between curve illustrates a balance between bias and variance.

Penalized likelihood is also used in model selection. Suppose we have model $M_i$ with parameter vector $\theta_i$ and likelihood function $L_i(\theta_i)$. Let $\Lambda = L_1(\hat{\theta}_1)/L_2(\hat{\theta}_2)$ where $\hat{\theta}_i = \arg\max L_i(\theta_i)$, $i = 1, 2$. Suppose model $M_2$ is nested within model $M_1$. Then by the likelihood ratio test, the simpler model $M_2$ is rejected if $2\log\Lambda$ exceeds a certain percentile of the $\chi^2_{p_1 - p_2}$ distribution, where $p_i$ is the number of parameters in $M_i$. However, if the amount of data is large, the more complex model $M_1$ tends to be selected even when the simpler model $M_2$ is true (Lindley, 1957; Green, 1999). Various approaches that utilize penalized likelihood have been developed to alleviate this problem. One such approach is the Bayesian Information Criterion (BIC; Schwarz, 1978) which, for model $M_i$, is defined as
PAGE 41
which shows a penalized form of the likelihood ratio test statistic. Other approaches such as Akaike's Information Criterion (AIC; Akaike, 1974) and Mallows' $C_p$ (Mallows, 1973) use different forms of the penalties. The main idea behind these criteria is to penalize more complex models, based on the number of parameters in each of them, in order to favor the simpler ones.

Recall that LASSO shrinks the regression coefficients towards zero and even allows them to be exactly zero. Therefore, in addition to controlling the size of the regression coefficients through $p_l(\beta) = \sum_{j=1}^{p}|\beta_j|$, LASSO also implements model selection by providing simpler or more parsimonious regression models.

For more about penalized likelihood see Ramsay and Silverman (1997) and Green (1999), and for asymptotic theory, Cox et al. (1990).

2.3.1 Computing the Penalized Likelihood Estimates

(2-25)

where $y_i^k = y_i - g_k$, $k = 1, \ldots, J$.
PAGE 42
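$$\log L(\Theta) = \sum_{i=1}^{n}\log\left[\sum_{k=1}^{J} p_{k|i}\, f_k(y_i|\Theta)\right] \tag{2-26}$$

and that

$$\frac{\partial}{\partial\theta}\log L(\Theta) = \sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\,\frac{\partial}{\partial\theta}\log f_k(y_i|\Theta) \tag{2-27}$$

where

$$P_{k|i} = \frac{p_{k|i}\, f_k(y_i|\Theta)}{\sum_{k'=1}^{J} p_{k'|i}\, f_{k'}(y_i|\Theta)} \tag{2-28}$$

and $\theta \in \Theta = (\mu, \Sigma)$.

It follows from $T\Sigma T' = D$, Eq. 2-1, that $\Sigma^{-1} = T'D^{-1}T$ and $|\Sigma| = |D|$. Therefore, if the mean parameters $\mu$ are given,

$$\begin{aligned}
\frac{\partial}{\partial\theta}\log L(\Theta)
&= \sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\,\frac{\partial}{\partial\theta}\left[-\frac{m}{2}\log 2\pi - \frac{1}{2}\log|\Sigma| - \frac{1}{2}\,y_i^{k\prime}\Sigma^{-1}y_i^{k}\right] \\
&= -\frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\,\frac{\partial}{\partial\theta}\left[\log|\Sigma| + y_i^{k\prime}\Sigma^{-1}y_i^{k}\right] \\
&= -\frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\,\frac{\partial}{\partial\theta}\left[\sum_{t=1}^{m}\log\sigma_t^2 + \sum_{t=1}^{m}\frac{(\epsilon_{it}^{k})^2}{\sigma_t^2}\right]
\end{aligned} \tag{2-29}$$

where $\epsilon_{i1}^{k} = y_{i1}^{k}$ and $\epsilon_{it}^{k} = y_{it}^{k} - \sum_{j=1}^{t-1}\phi_{tj}\,y_{ij}^{k}$ for $t = 2, \ldots, m$. It is implicitly assumed, therefore, that $\sigma_t^2 = \operatorname{var}(\epsilon_t^{k})$ for $k = 1, \ldots, J$. Note that if $\epsilon^{k} = (\epsilon_1^{k}, \ldots, \epsilon_m^{k})'$ and $y^{k} = (y_1^{k}, \ldots, y_m^{k})'$ then $\epsilon^{k} = Ty^{k}$, so that $\operatorname{var}(\epsilon^{k}) = T\Sigma T' = D$.

Define the penalized negative log-likelihood as

$$-2\log L(\Theta) + \lambda\, p(\{\phi_{tj}\}). \tag{2-30}$$

Assuming the $L_2$ penalty $p(\{\phi_{tj}\}) = \sum_{t=2}^{m}\sum_{j=1}^{t-1}\phi_{tj}^2$, we have

$$-2\log L(\Theta) + \lambda\sum_{t=2}^{m}\sum_{j=1}^{t-1}\phi_{tj}^2. \tag{2-31}$$

Our problem is immediately solved if 2-31 can be expressed in the same form as 2-16, where the $\phi_{tj}$'s correspond to the $\beta_j$'s. However, the first term on the right side of 2-31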
PAGE 43
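2-27. Thus, by taking the derivative of 2-31 and using 2-29, we get

$$\begin{aligned}
\frac{\partial}{\partial\theta}\left[-2\log L(\Theta) + \lambda\, p(\{\phi_{tj}\})\right]
&= -2\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\,\frac{\partial}{\partial\theta}\log f_k(y_i|\Theta) + \lambda\,\frac{\partial}{\partial\theta}\sum_{t=2}^{m}\sum_{j=1}^{t-1}\phi_{tj}^2 \\
&= \sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\,\frac{\partial}{\partial\theta}\left[\sum_{t=1}^{m}\log\sigma_t^2 + \sum_{t=1}^{m}\frac{(\epsilon_{it}^{k})^2}{\sigma_t^2}\right] + \lambda\,\frac{\partial}{\partial\theta}\sum_{t=2}^{m}\sum_{j=1}^{t-1}\phi_{tj}^2 \\
&= \frac{\partial}{\partial\theta}\left\{\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left[\sum_{t=1}^{m}\log\sigma_t^2 + \sum_{t=1}^{m}\frac{(\epsilon_{it}^{k})^2}{\sigma_t^2}\right] + \lambda\sum_{t=2}^{m}\sum_{j=1}^{t-1}\phi_{tj}^2\right\} \\
&= \frac{\partial}{\partial\theta}\left\{\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left[\log\sigma_1^2 + \frac{(\epsilon_{i1}^{k})^2}{\sigma_1^2}\right]\right\}
 + \frac{\partial}{\partial\theta}\sum_{t=2}^{m}\left\{\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left[\log\sigma_t^2 + \frac{(\epsilon_{it}^{k})^2}{\sigma_t^2}\right] + \lambda\sum_{j=1}^{t-1}\phi_{tj}^2\right\}.
\end{aligned}$$

Each term of the sum over $t$ has the same form as 2-16 when written in terms of $\phi_{tj}$, because $\epsilon_{it}^{k} = y_{it}^{k} - \sum_{j=1}^{t-1}\phi_{tj}\,y_{ij}^{k}$ for $t = 2, \ldots, m$. Thus, we need to minimize

$$\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left[\log\sigma_1^2 + \frac{(\epsilon_{i1}^{k})^2}{\sigma_1^2}\right] \tag{2-32}$$

and

$$\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left[\log\sigma_t^2 + \frac{(\epsilon_{it}^{k})^2}{\sigma_t^2}\right] + \lambda\sum_{j=1}^{t-1}\phi_{tj}^2 \tag{2-33}$$

for each $t = 2, \ldots, m$.

The minimizer of 2-32 can be obtained by solving

$$\frac{\partial}{\partial\sigma_1^2}\left\{\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left[\log\sigma_1^2 + \frac{(\epsilon_{i1}^{k})^2}{\sigma_1^2}\right]\right\} = 0$$

which yields

$$\hat{\sigma}_1^2 = \frac{\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\,(y_{i1}^{k})^2}{\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}}. \tag{2-34}$$

For $t = 2, \ldots, m$, 2-33 can be minimized by alternating minimization over $\sigma_t^2$ and $\phi_{tj}$, $j = 1, 2, \ldots, t-1$ (see Appendix C). The solutions are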
PAGE 44
and

where $\phi_{t(t)} = (\phi_{t1}, \phi_{t2}, \ldots, \phi_{t,t-1})'$ and $H_t$, $I_t$ and $g_t$ are given in Appendix C. Notice the similarity of 2-36 to 2-12 and that in formulas 2-34, 2-35 and 2-36, the posterior probabilities, the $P_{k|i}$'s, are the weights for the genotype groups, $k = 1, \ldots, J$. A nonparametric covariance estimate, $\hat{\Sigma}_{NP}$, can therefore be obtained through $\hat{\Sigma}_{NP} = \hat{T}^{-1}\hat{D}(\hat{T}^{-1})'$, where the elements of $\hat{D}$ are given by 2-34 and 2-35, and the elements of $\hat{T}$ are given by 2-36.

The preceding calculations were based on the $L_2$ penalty, $p(\{\phi_{tj}\}) = \sum_{t=2}^{m}\sum_{j=1}^{t-1}\phi_{tj}^2$. If the $L_1$ penalty, $p(\{\phi_{tj}\}) = \sum_{t=2}^{m}\sum_{j=1}^{t-1}|\phi_{tj}|$, is used instead, closed form solutions like 2-35 and 2-36 cannot be obtained and an iterative algorithm is needed. This is carried out by using an iterative local quadratic approximation of $\sum_{j=1}^{t-1}|\phi_{tj}|$ (Fan and Li, 2001; Ojelund et al., 2001). The reader is referred to Huang et al. (2006) for additional details.

1.2.2), a penalty term, $\lambda p(\Theta)$, on the model parameters can be introduced to the complete data log-likelihood, A-1, to get the penalized complete data log-likelihood

$$\log L_c^{P}(\Theta) = \sum_{i=1}^{n}\sum_{k=1}^{J} x_{ik}\left[\log p_{k|i} + \log f_k(y_i|\Theta)\right] + \lambda\, p(\Theta). \tag{2-37}$$

Clearly, taking the conditional expectation of 2-37 does not affect the penalty term because the expectation is taken with respect to the missing variable $x$. Thus, at the E-step, we have
PAGE 45
$$\frac{\partial}{\partial\theta}Q^{P}(\Theta|\Theta^{(j)}) = \sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}^{(j)}\,\frac{\partial}{\partial\theta}\log f_k(y_i|\Theta) + \lambda\,\frac{\partial}{\partial\theta}p(\Theta) = 0 \tag{2-39}$$

to get $\Theta^{(j+1)}$, where $\theta \in \Theta$.

The formulas derived for $\Sigma$ in the preceding section cannot be directly used in the EM algorithm because we assumed that $\mu$ was given. We instead use a variant of the EM algorithm called the Expectation and Conditional Maximization (ECM) algorithm (Meng and Rubin, 1993), which partitions the parameter set $\Theta$ according to mean and covariance parameters, $\mu$ and $\Sigma$, respectively. The ECM algorithm differs from EM in that the M-step involves a conditional optimization with respect to each partition of $\Theta$. More precisely, for the $(j+1)$th iteration, the ECM algorithm proceeds as follows:

1. Initialize $\Theta^{(j)} = (\mu^{(j)}, \Sigma^{(j)})$.
2.
3. 2-34, 2-35 and 2-36 (Section 2.3.1).
4. Repeat steps (2)-(3) until some convergence criterion is met.

Unless a structure, such as AR(1), is imposed on the covariance matrix, it is difficult to find closed form CM-step solutions for the mean parameters in functional mapping. Hence, estimation in this case is carried out by using the Nelder-Mead simplex algorithm (Nelder and Mead, 1965; Zhao et al., 2004), which is readily available in popular software. See, for example, the fminsearch built-in function in Matlab or optim in R.

In a backcross population design, Ma et al. (2002) provide closed form iteration formulas for
PAGE 46
For a $K$-fold cross-validation, let $Z$ denote the full dataset. $Z$ is randomly split into $K$ subsets of about the same size. Each subset, say $Z_s$ ($s = 1, \ldots, K$), is used to validate the log-likelihood based on the parameters estimated using the data $Z \setminus Z_s$. The value of $\lambda$ that maximizes the average of all cross-validated log-likelihoods is used to select an estimate for $\Sigma$.

The cross-validated log-likelihood criterion is given by

$$C(\lambda) = \frac{1}{K}\sum_{s=1}^{K}\log L_s(\hat{\Theta}_s)$$

where $\hat{\Theta}_s$ is the estimate of $\Theta$ based on the dataset $Z \setminus Z_s$ and $L_s$ is the likelihood based on $Z_s$. $\lambda = \hat{\lambda}$ is chosen to maximize $C(\lambda)$.

Note that there really are two sets of tuning parameters in our setting: one under the null model and another under the alternative. However, because the log-likelihood under the null model is constant throughout a marker interval, we shall assume that the corresponding tuning parameter has been estimated accordingly, and in the succeeding sections we simply refer to the tuning parameters as the ones for the alternative model.

2.4.1 Simulations

In this section, the performance of the nonparametric estimator, $\Sigma_{NP}$ (Section 2.3.1), is assessed and compared to an AR(1)-structured estimator, $\Sigma_{AR(1)}$ (Section 1.2.1). We investigate data generated from both multivariate normal and t-distributions. We begin with the former.
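The selection of the tuning parameter by $K$-fold cross-validated log-likelihood (Section 2.3.3) can be sketched in code. This is a simplified illustration under stated assumptions, not the procedure used in the simulations: it treats a single zero-mean Gaussian group rather than the QTL mixture (so all posterior weights equal one), estimates the covariance by an $L_2$-penalized modified Cholesky regression with alternating updates of $\phi_{tj}$ and $\sigma_t^2$, and uses NumPy. The function names, the simulated AR(1) data and the grid of $\lambda$ values are hypothetical.

```python
import numpy as np

def penalized_cholesky_cov(Y, lam, n_iter=25):
    """L2-penalized modified Cholesky estimate of Sigma from zero-mean rows of Y (n x m)."""
    n, m = Y.shape
    T = np.eye(m)                       # unit lower triangular, entries -phi_tj
    d = np.empty(m)                     # innovation variances sigma_t^2
    d[0] = np.mean(Y[:, 0] ** 2)
    for t in range(1, m):
        Z, y = Y[:, :t], Y[:, t]        # predecessors and current response
        s2 = np.mean(y ** 2)
        for _ in range(n_iter):         # alternate between phi_t and sigma_t^2
            phi = np.linalg.solve(Z.T @ Z + lam * s2 * np.eye(t), Z.T @ y)
            s2 = np.mean((y - Z @ phi) ** 2)
        T[t, :t] = -phi
        d[t] = s2
    T_inv = np.linalg.inv(T)
    return T_inv @ np.diag(d) @ T_inv.T  # Sigma = T^{-1} D (T^{-1})'

def gaussian_loglik(Y, Sigma):
    """Zero-mean multivariate normal log-likelihood of the rows of Y."""
    n, m = Y.shape
    _, logdet = np.linalg.slogdet(Sigma)
    quad = np.sum(Y * np.linalg.solve(Sigma, Y.T).T)
    return -0.5 * (n * m * np.log(2.0 * np.pi) + n * logdet + quad)

def cv_loglik(Y, lam, K=5, seed=0):
    """Average held-out log-likelihood over K folds for a given lambda."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(Y.shape[0]), K)
    scores = []
    for held_out in folds:
        train = np.ones(Y.shape[0], dtype=bool)
        train[held_out] = False
        Sigma_hat = penalized_cholesky_cov(Y[train], lam)
        scores.append(gaussian_loglik(Y[held_out], Sigma_hat))
    return float(np.mean(scores))

# Hypothetical data: n = 120 curves with m = 6 time points and AR(1) correlation.
rng = np.random.default_rng(42)
m, n = 6, 120
true_Sigma = 0.7 ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
Y = rng.multivariate_normal(np.zeros(m), true_Sigma, size=n)

grid = [0.0, 0.1, 1.0, 10.0, 100.0]     # candidate tuning parameters (assumed)
cv_scores = [cv_loglik(Y, lam) for lam in grid]
best_lam = grid[int(np.argmax(cv_scores))]
Sigma_hat = penalized_cholesky_cov(Y, best_lam)
print("selected lambda:", best_lam)
```

The same fold assignment is reused for every candidate $\lambda$ so that the cross-validated scores are directly comparable. In the mixture setting, the sums inside `penalized_cholesky_cov` would be weighted by the posterior probabilities $P_{k|i}$.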
PAGE 47
1.1.2) for QTL mapping, we randomly generated 6 markers equally spaced on a chromosome 100 cM long with 1 QTL between the second and third markers, 12 cM from the second marker (or 32 cM from the leftmost marker in the chromosome). Each phenotype associated with the simulated QTL had $m = 10$ measurements and was sampled from a multivariate normal distribution, using logistic curves as genotype means under 3 different covariance structures. The mean parameters were $a_1 = 30$, $a_2 = 28.5$, $a_3 = 27.5$, $b_1 = b_2 = b_3 = 5$, and $r_1 = r_2 = r_3 = 0.5$, and the covariance structures were as follows:
PAGE 48
2-2, the broken line LR plot is the result of our procedure while the solid one is based on individual $\lambda_c$'s that have each been separately cross-validated. For $n = 400$, these two plots are indistinguishable. The reason for this is that the cross-validated $\lambda$'s at each search point within a marker interval are not that different from one another. Thus, using one $\lambda$ for each marker interval (the one that produces the maximum LR) will not significantly alter the general shape of the LR plot. The two dotted line plots were based on $\lambda_c$, for all $c = 1, 2, \ldots, 26$, set to two different arbitrary values of $\lambda$.

To evaluate the estimate $\hat{\Sigma}_l$ ($l = 1, 2, 3$) of the true covariance structure $\Sigma_l$, a number of criteria can be used. Among them are the matrix norm losses $\|\hat{\Sigma}_l - \Sigma_l\|$
PAGE 49
Figure 2-2. Log-likelihood ratio (LR) plots based on simulated data under three different covariance structures. The solid line plot is based on cross-validated (CV) tuning parameters at each search point (individual $\lambda$'s). The broken line plot is based on cross-validated tuning parameters (max $\lambda$'s) corresponding to the maximum LR in each marker interval. The dotted line plots are based on two different arbitrary tuning parameter values, each assumed at all search points.
PAGE 50
where $I$ is the identity matrix. These losses are all nonnegative and equal zero when $\hat{\Sigma}_l = \Sigma_l$. There is no agreement as to which of these norms is appropriate for a particular situation, but any of them may be used and the results would qualitatively be the same (Levina et al., 2008). Here, we use $L_E$ and $L_Q$, which were also used by Wu and Pourahmadi (2003), Huang et al. (2006, 2007b), and Levina et al. (2008). The corresponding risk functions are defined by

$$R_E(\Sigma_l, \hat{\Sigma}_l) = E\left[L_E(\Sigma_l, \hat{\Sigma}_l)\right]$$

and

$$R_Q(\Sigma_l, \hat{\Sigma}_l) = E\left[L_Q(\Sigma_l, \hat{\Sigma}_l)\right].$$

100 simulation runs were carried out and the averages over all runs of the estimated QTL location, logistic mean parameter estimates, max LR, and entropy and quadratic losses, including the respective Monte Carlo standard errors (SE), were recorded. The results are shown in Tables 2-1 and 2-2. For $\Sigma_1$, $\Sigma_{AR(1)}$ does well as expected, but $\Sigma_{NP}$ also does a good job. Both provide better precision with increased sample size. The max LR values are comparable: 38.52 and 112.03 from Table 2-1 versus 37.78 and 128.21 from Table 2-2, respectively, are not too different from each other.

For $\Sigma_2$ and $\Sigma_3$, $\Sigma_{NP}$ does better than $\Sigma_{AR(1)}$. $\Sigma_{AR(1)}$ shows high values for both averaged losses, which translates to significantly biased estimates of QTL location and poor mean parameter estimates, particularly for $\Sigma_3$ at the second and third genotype groups. Increased sample size does not help and even makes the mean parameter estimates worse in the case of $\Sigma_3$. Values of max LR for $\Sigma_{NP}$ and $\Sigma_{AR(1)}$ are very different in these
PAGE 51
To assess the robustness of our proposed nonparametric estimator, we modeled simulated data from a t-distribution with 5 degrees of freedom. That is, samples were taken from

$$y = g_k + \frac{X}{\sqrt{Z/\nu}}$$

where $X \sim N(0, \Sigma)$, $Z \sim \chi^2(\nu)$ and $g_k$ is the logistic mean for genotype $k = 1, 2, 3$. The results are presented in Tables 2-3 and 2-4. We excluded the column for max LR because it is not appropriate in this scenario. The results show that, despite inflated average losses, $\Sigma_{NP}$ still outperforms $\Sigma_{AR(1)}$. Notice that the quadratic loss is severely inflated because of the fat tails of the t-distribution. It may not be a reliable measure of performance but we present the results here for illustration.
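The sampling scheme for the heavy-tailed phenotypes can be sketched as follows: a multivariate normal vector is divided by the square root of an independent scaled chi-square variable and shifted by the genotype mean. This is a minimal illustration assuming NumPy; the logistic form $a/(1 + b e^{-rt})$, the measurement times $1, \ldots, 10$, the AR(1) correlation matrix with $\rho = 0.6$ and the function names are assumptions made for the example, while the mean parameters $(30, 5, 0.5)$ and the 5 degrees of freedom are those quoted in the text.

```python
import numpy as np

def logistic_mean(t, a, b, r):
    """Logistic growth curve a / (1 + b * exp(-r t)); this functional form is assumed."""
    return a / (1.0 + b * np.exp(-r * t))

def sample_mvt(mean, Sigma, df, size, rng):
    """Draw rows y = mean + X / sqrt(Z / df) with X ~ N(0, Sigma), Z ~ chi-square(df)."""
    m = len(mean)
    X = rng.multivariate_normal(np.zeros(m), Sigma, size=size)
    Z = rng.chisquare(df, size=size)
    return mean + X / np.sqrt(Z / df)[:, None]

rng = np.random.default_rng(1)
times = np.arange(1, 11)                       # m = 10 measurements (assumed times)
g1 = logistic_mean(times, a=30.0, b=5.0, r=0.5)
Sigma = 0.6 ** np.abs(np.subtract.outer(times, times))   # illustrative AR(1) correlation

Y = sample_mvt(g1, Sigma, df=5, size=20000, rng=rng)
print(Y.shape)
```

For $\nu > 2$ the resulting vectors have mean $g_k$ and covariance $\frac{\nu}{\nu - 2}\Sigma$, which is one reason the losses computed against $\Sigma$ are inflated in this scenario.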
PAGE 52
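                            QTL genotype 1              QTL genotype 2              QTL genotype 3
Covariance   n   QTL Location   $\hat{a}_1$   $\hat{b}_1$   $\hat{r}_1$   $\hat{a}_2$   $\hat{b}_2$   $\hat{r}_2$   $\hat{a}_3$   $\hat{b}_3$   $\hat{r}_3$   max LR   $L_E$   $L_Q$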
PAGE 53
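                            QTL genotype 1              QTL genotype 2              QTL genotype 3
Covariance   n   QTL Location   $\hat{a}_1$   $\hat{b}_1$   $\hat{r}_1$   $\hat{a}_2$   $\hat{b}_2$   $\hat{r}_2$   $\hat{a}_3$   $\hat{b}_3$   $\hat{r}_3$   max LR   $L_E$   $L_Q$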
PAGE 54
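                            QTL genotype 1              QTL genotype 2              QTL genotype 3
Covariance   n   QTL Location   $\hat{a}_1$   $\hat{b}_1$   $\hat{r}_1$   $\hat{a}_2$   $\hat{b}_2$   $\hat{r}_2$   $\hat{a}_3$   $\hat{b}_3$   $\hat{r}_3$   $L_E$   $L_Q$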
PAGE 55
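                            QTL genotype 1              QTL genotype 2              QTL genotype 3
Covariance   n   QTL Location   $\hat{a}_1$   $\hat{b}_1$   $\hat{r}_1$   $\hat{a}_2$   $\hat{b}_2$   $\hat{r}_2$   $\hat{a}_3$   $\hat{b}_3$   $\hat{r}_3$   $L_E$   $L_Q$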
PAGE 56
1.1.2) population of 259 male and 243 female progeny with 96 markers in a total of 19 chromosomes. The mice were measured for their body mass at 10 weekly intervals starting at age 7 days. Corrections were made for the effects due to dam, litter size at birth, parity, and sex (Cheverud et al., 1996; Kramer et al., 1998). A plot of the weight data is shown in Figure 1-3.

Functional mapping was first used to analyze these data by Zhao et al. (2004), who investigated QTL $\times$ sex interaction. The authors used a logistic curve (Eq. 1-4) to model the genotype means and employed the transform-both-sides (TBS; Section 1.2.1) technique for variance stabilization in order to utilize an AR(1) structure. Their method identified 4 of 19 chromosomes that each had significant QTLs and they concluded that there were sex differences in body mass growth in mice. Zhao et al. (2005) applied an SAD covariance structure in functional mapping and found 3 QTLs. Liu and Wu (2007) likewise analyzed the same data using a Bayesian approach in functional mapping and detected only 3 significant QTLs.

Here, we applied our proposed nonparametric estimator, $\Sigma_{NP}$, in a genome-wide scan for growth QTL without regard to sex. We scanned the linkage map at intervals of 4 cM. Figure 2-3 shows the LR plots for all 19 chromosomes. They were obtained using $\lambda$'s that were cross-validated at each search point. We conducted a permutation test (Doerge and Churchill, 1996; Section 1.2.3) to identify significant QTLs. For every permutation run, we calculated max LR$_e$ for chromosome $e = 1, \ldots, 19$ using the same general procedure as in the simulations (Section 2.4.1). In this mice data set, however, some markers were either missing or not genotyped and we used only the available markers (Table 2-5). Thus, every marker interval had a different set of available phenotype data. But we believe this did not affect the results because of the large sample size of the available data. We looked at chromosomes 6 and 7 and found this to be the case. Figure 2-4 shows LR plots based
PAGE 57
2-3 correspond to 95% (broken) and 99% (solid) thresholds based on 100 permutation test runs. There were 9 chromosomes with significant QTLs (1, 4, 6, 7, 9, 10, 11, 14 and 15) based on the 95% threshold but only 7 under 99% (1, 4, 6, 7, 10, 11 and 15). The two chromosomes that did not make the 99% threshold (9 and 14) barely made the 95%. For this mice data set, we recommend using the 99% threshold because there were only 100 permutation test runs. Zhao et al. (2004) identified QTLs in chromosomes 6, 7, 11 and 15, and Zhao et al. (2005) and Liu and Wu (2007) found QTLs in chromosomes 6, 7 and 10. These were all at the 95% threshold. Our findings verified the results of these previous studies that made use of the functional mapping method and even detected more QTLs. Although there is a discrepancy between our results and the others, one cannot conclude that the additional QTLs detected by our proposed model are nonexistent. In fact, Vaughn et al. (1999) identified 17 QTLs using simple interval mapping, although most of them are suggestive.

The estimated genotype mean curves for the detected QTLs are shown in Figure ??. The three genotypes at a QTL have different growth curves, indicating the temporal genetic effects of this QTL on growth processes for mouse body mass. Some QTLs, like those on chromosomes 6, 7 and 10, act in an additive manner because the heterozygote (Qq, broken curves) is intermediate between the two homozygotes (QQ, solid curves and qq, dotted curves). Some QTLs, such as the one on chromosome 11, operate in a dominant way since the heterozygote is very close to one of the homozygotes.
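The way the 95% and 99% thresholds are obtained from permutation runs can be sketched in a few lines. This is an illustration only, assuming NumPy and a deliberately simplified setting: a univariate phenotype, a hypothetical single-marker scan statistic $-n\log(1 - r^2)$ in place of the functional mapping LR, and simulated marker data. The names `max_lr` and `permutation_thresholds` are hypothetical.

```python
import numpy as np

def max_lr(y, markers):
    """Hypothetical genome scan: maximum over markers of -n * log(1 - r^2)."""
    n = len(y)
    yc = y - y.mean()
    Mc = markers - markers.mean(axis=0)
    r = (Mc.T @ yc) / (np.linalg.norm(Mc, axis=0) * np.linalg.norm(yc))
    return float(np.max(-n * np.log(1.0 - r ** 2)))

def permutation_thresholds(y, markers, n_perm=100, levels=(0.95, 0.99), seed=0):
    """Shuffle phenotypes against markers and take quantiles of the null max LR."""
    rng = np.random.default_rng(seed)
    null_max = np.array([max_lr(rng.permutation(y), markers) for _ in range(n_perm)])
    return np.quantile(null_max, levels), null_max

# Simulated example (assumed): 200 progeny, 10 markers coded 0/1/2, one causal marker.
rng = np.random.default_rng(7)
markers = rng.integers(0, 3, size=(200, 10)).astype(float)
y = 0.8 * markers[:, 3] + rng.normal(size=200)

observed = max_lr(y, markers)
thresholds, null_max = permutation_thresholds(y, markers)
print("observed max LR:", observed)
print("95% and 99% thresholds:", thresholds)
```

With only 100 permutations the 99% quantile is determined by the two largest null values, so it is estimated far less precisely than the 95% quantile; more permutation runs would stabilize it.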
PAGE 58
The genomic position corresponding to the peak of the curve is the optimal likelihood estimate of the QTL location, indicated by vertical broken lines. The ticks on the x-axis indicate the positions of markers on the chromosome. The map distances (in centiMorgans) between two markers are calculated using the Haldane mapping function. The thresholds for claiming the genome-wide existence of a QTL are shown by horizontal lines.
PAGE 59
Table 2-5. Available markers and phenotype data of a linkage map in an F2 population of mice (data from Vaughn et al., 1999).

                      Marker Intervals
Chromosome     1     2     3     4     5     6     7     8
     1        378   433   483   467   450   440   466
     2        414   404   453   465   430
     3        477   491   489   476   475
     4        461   475   481   481   491
     5        441   439   449   381   385
     6        467   483   485   481
     7        407   424   459   452   378   372   428   415
     8        395   453   472
     9        498   496   498
    10        401   406   481   490   497
    11        431   451   468   464   446
    12        497   489   483   488
    13        450   443   466
    14        443   475   495
    15        491   494   468
    16        498
    17        371   394
    18        487   479   420
    19        445   468   468
PAGE 60
Figure 2-4. Log-likelihood ratio (LR) plots for chromosomes 6 and 7 of the mice data. The solid line plot is based on cross-validated (CV) tuning parameters at each search point (individual $\lambda$'s). The broken line plot is based on cross-validated tuning parameters (max $\lambda$'s) corresponding to the maximum LR in each marker interval. The dotted line plots are based on two different arbitrary tuning parameter values, each assumed at all search points. Slight differences between the solid and broken line plots may be due to different sample sizes among marker intervals (see Table 2-5).
PAGE 61
PAGE 62
In this chapter, we adopted Huang et al.'s $L_2$ penalty approach in functional mapping. This penalty works best when the true $T$ matrix has many small elements. Using the $L_1$ penalty gives a better estimator when some of the elements of $T$ are actually zero. However, we believe that the differences in results between the two penalties will not be significant unless the dimension is very large. Nonetheless, the $L_1$ penalty can be easily incorporated into our scheme. We have shown how to integrate Huang et al.'s procedure into the mixture likelihood framework of functional mapping. The key was to utilize the posterior probability representation of the derivative of the log-likelihood, 2-27, and apply an $L_2$ penalty to the negative log-likelihood. Estimation was then carried out using the ECM algorithm (Section 2.3.2) with two CM-steps, based on a partition of the mean and covariance parameters. Our simulations have shown better accuracy and precision in estimates of QTL location, genotype mean parameters, and max LR values by $\Sigma_{NP}$ compared to $\Sigma_{AR(1)}$. The max LR values are important because the complete LR plot provides the amount of evidence for the existence of a QTL. LR values noticeably
PAGE 63
2-3) seemed to have the strongest evidence for QTL existence. The LR plots are also used in permutation tests (Section 1.2.3) to find a significance threshold. More precise estimates of the covariance structure mean better estimates of the peak of the LR plot and therefore more reliable permutation test results.

With regard to the utilization of our proposed model, we suggest a preliminary analysis of the data by checking variance and covariance stationarity. If both conditions are satisfied then AR(1) may be an appropriate model. If covariance stationarity is not an issue (that is, only the variances are nonstationary) then a TBS method (Section 1.2.1) coupled with AR(1) is applicable. If no stationarity is detected then an SAD (Section 1.2.1) or $\Sigma_{NP}$ may be more useful. Although we did not assess the comparative performance of these two models, we think that SAD becomes more computationally intensive if the data exhibit long-term dependence, in which case $\Sigma_{NP}$ may be more appropriate. $\Sigma_{NP}$ should also be considered if other parametric structures are suspect.
PAGE 64
1.2) which addresses the latter difficulty by using a biologically relevant mathematical function to model reaction norms. The authors considered a parametric model of photosynthetic rate as a function of light irradiance and temperature and studied the genetic mechanism of this process. They showed, through extensive simulations, that in a backcross population with one or two QTLs, their method accurately and precisely estimated the QTL location(s) and the parameters of the mean model. However, they assumed the covariance matrix to be a Kronecker product of two AR(1) structures, each modeling a reaction norm due to one environmental factor. This type of covariance model is said to be separable. Although computationally attractive, such a model only captures separate reaction norm effects and fails to incorporate interactions. A more general approach is therefore needed.
PAGE 65
In this chapter, we show through simulations that, in functional mapping of reaction norms to two environmental signals, (1) nonseparable structures can be utilized as covariance models and used to generate data of processes that exhibit interactions, (2) the separable model proposed by Wu et al. (2007), which we shall call $\Sigma_{AR(1)}$, may not be appropriate for such data, and (3) the nonparametric covariance estimator, $\Sigma_{NP}$, developed in Chapter 2, is a more reliable covariance model than $\Sigma_{AR(1)}$. By utility in (1), we mean that a nonseparable model can analyze data generated by the same nonseparable model. With regard to (2), our results are surprising because, for some variance of the process or a certain number of levels in the environmental signals, the estimated QTL location and mean model parameters are generally robust to a biased separable covariance estimate, $\hat{\Sigma}_{AR(1)}$, of a nonseparable underlying structure. That is, if the covariance of data generated from a nonseparable structure is estimated by the separable model, $\Sigma_{AR(1)}$, the estimate is biased, as expected, but the QTL location and mean model parameters are still accurately and precisely estimated. However, the estimated max LR (Section
PAGE 66
) is not accurate because the true underlying covariance structure and the (biased) estimate, $\hat{\Sigma}_{AR(1)}$, produce different log-likelihood values. Recall that max LR is important because it is used in permutation tests to assess the significance of QTL existence. But when both the variance and the number of levels in the environmental signals are increased, the estimated QTL location is severely biased while the mean parameters are only mildly affected. $\Sigma_{NP}$ provides consistently better results than $\Sigma_{AR(1)}$. Of course, if nonseparable covariance models themselves are used to analyze data that exhibit interactions, the results are expected to be much better. However, in reality, the underlying structure of the data is unknown and it is very difficult to identify an appropriate nonseparable model to use in this case. Modelers often employ strategies that are mainly ad hoc or specific to a problem. Unfortunately, there are no general guidelines available for approaching these types of problems. We will, however, use nonseparable covariance models to generate simulated data with interactions and use them to compare $\Sigma_{NP}$ and $\Sigma_{AR(1)}$.

This chapter is organized as follows: In Section 3.2, we describe the functional mapping model proposed by Wu et al. (2007a) for reaction norms. In Section 3.3, we discuss separable and nonseparable models used in spatio-temporal modeling. In Section 3.4, we present a simulation study using some nonseparable structures introduced in Section 3.3 and then conclude with a summary and discussion in Section 3.5. In this chapter, we use the terms covariance matrix, structure and function interchangeably. They all refer to the same thing.
PAGE 67
An example of a reaction norm that illustrates a surface landscape is photosynthesis (Wu et al., 2007a), which is the process by which light energy is converted to chemical energy by plants and other living organisms. It is an important but complex process because it involves several factors such as the age of a leaf (where photosynthesis takes place in most plants), the concentration of carbon dioxide in the environment, temperature, light irradiance, available nutrients and water in the soil, etc. A mathematical expression for the rate of single-leaf photosynthesis, $P$, without photorespiration is

$$P = \frac{\alpha I + P_m - \sqrt{(\alpha I + P_m)^2 - 4\theta\alpha I P_m}}{2\theta} \tag{3-1}$$

(Thornley and Johnson, 1990), where $\theta \in (0, 1)$ is a dimensionless parameter, $\alpha$ is the photochemical efficiency, $I$ is the irradiance, and $P_m$ is the asymptotic photosynthetic rate at a saturating irradiance. $P_m$ is a linear function of the temperature, $T$,

$$P_m = P_m(20)\,\frac{T - T^*}{20 - T^*} \tag{3-2}$$

where $P_m(20)$ is the value of $P_m$ at the reference temperature of 20°C and $T^*$ is the temperature at which photosynthesis stops. $T^*$ is chosen over a range of temperatures, such as 5°C-25°C, to provide a good fit to observed data.

Wu et al. (2007a) studied the reaction norm of photosynthetic rate, defined by Eqs. 3-1 and 3-2, as a function of irradiance ($I$) and temperature ($T$). That is, the authors considered $P = P(I, T)$. Here, we assume that $T^* = 5$ so that the reaction norm model parameters are $(\alpha, P_m(20), \theta)$. The surface landscape that describes the reaction norm of $P(I, T)$, with parameters $(\alpha, P_m(20), \theta) = (0.02, 1, 0.9)$, is shown in Figure 3-1. As stated earlier, each reaction norm surface corresponds to a specific genetic effect. Thus, if a QTL
PAGE 68
Figure 3-1. Reaction norm surface of photosynthetic rate as a function of irradiance and temperature. The model is based on Eqs. 3-1 and 3-2 with parameters $(\alpha, P_m(20), \theta) = (0.02, 1, 0.9)$. Adapted from Wu et al. (2007a).

1.1.2) with one QTL. Extensions to more complicated designs and the two-QTL case, as in Wu et al. (2007a), are straightforward. Assume a backcross plant population of size $n$ with a single QTL affecting the phenotypic trait of photosynthetic rate. The photosynthetic rate for each progeny $i$ ($= 1, \ldots, n$) is measured at different irradiance ($s = 1, \ldots, S$) and temperature ($t = 1, \ldots, T$) levels. This choice of variables is adopted for consistency in later discussions, as we will be working with spatio-temporal covariance models. The set of phenotype measurements or observations can be written in vector form as

$$y_i = [\underbrace{y_i(1,1), \ldots, y_i(1,T)}_{\text{irradiance } 1}, \ldots, \underbrace{y_i(S,1), \ldots, y_i(S,T)}_{\text{irradiance } S}]'. \tag{3-3}$$
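The reaction norm surface and the stacking order of the observation vector can be made concrete with a short numerical sketch. It assumes the non-rectangular hyperbola form of Thornley and Johnson (1990) for the photosynthetic rate and a linear dependence of $P_m$ on temperature that vanishes at $T^* = 5$; the function names are hypothetical. The parameter values $(0.02, 1, 0.9)$ and the irradiance and temperature levels are those quoted in this chapter.

```python
import numpy as np

def pm_of_temperature(temp, pm20, t_star=5.0):
    """Asymptotic rate P_m, linear in temperature, equal to pm20 at 20 degrees and 0 at t_star."""
    return pm20 * (temp - t_star) / (20.0 - t_star)

def photosynthetic_rate(irr, temp, alpha, pm20, theta, t_star=5.0):
    """Non-rectangular hyperbola for single-leaf photosynthetic rate P(I, T)."""
    pm = pm_of_temperature(temp, pm20, t_star)
    s = alpha * irr + pm
    return (s - np.sqrt(s ** 2 - 4.0 * theta * alpha * irr * pm)) / (2.0 * theta)

irradiance = np.array([0.0, 50.0, 100.0, 200.0, 300.0])   # S = 5 levels
temperature = np.array([15.0, 20.0, 25.0, 30.0])          # T = 4 levels

# Rows index irradiance, columns index temperature.
I, T = np.meshgrid(irradiance, temperature, indexing="ij")
surface = photosynthetic_rate(I, T, alpha=0.02, pm20=1.0, theta=0.9)

# Stack in irradiance-major order: all temperatures at irradiance 1, then irradiance 2, ...
mean_vector = surface.ravel()
print(surface.round(3))
print(mean_vector.shape)
```

Row-major flattening of the irradiance-by-temperature grid reproduces the ordering used for the phenotype vector, so the same indexing can be reused when building an $ST \times ST$ covariance matrix.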
PAGE 69
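1.1). Because we assume a backcross design, the QTL has two possible genotypes (as do the markers), which shall be indexed by $k = 1, 2$. The likelihood function based on the phenotype and marker data can be formulated as

$$L(\Theta) = \prod_{i=1}^{n}\left[\sum_{k=1}^{2} p_{k|i}\, f_k(y_i|\Theta)\right] \tag{3-4}$$

where $p_{k|i}$ is the conditional probability of a QTL genotype given the genotype of a marker interval for progeny $i$ (Section 1.1.4). We assume a multivariate normal density for the phenotype vector $y_i$ with genotype-specific means

$$\mu_k = [\underbrace{\mu_k(1,1), \ldots, \mu_k(1,T)}_{\text{irradiance } 1}, \ldots, \underbrace{\mu_k(S,1), \ldots, \mu_k(S,T)}_{\text{irradiance } S}]' \tag{3-5}$$

and covariance matrix $\Sigma = \operatorname{cov}(y_i)$.

3-5 can be modeled using Eqs. 3-1 and 3-2 as

$$\mu_k(s,t) = \frac{\alpha_k s + P_{mk} - \sqrt{(\alpha_k s + P_{mk})^2 - 4\theta_k\alpha_k s P_{mk}}}{2\theta_k} \tag{3-6}$$

where

$$P_{mk} = P_{mk}(20)\,\frac{t - T^*}{20 - T^*} \tag{3-7}$$

and $k = 1, 2$.

Wu et al. (2007a) used a separable structure (Mitchell et al., 2005) for the $ST \times ST$ covariance matrix as

$$\Sigma_{AR(1)} = \Sigma_1 \otimes \Sigma_2 \tag{3-8}$$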
PAGE 70
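D). Note that $\Sigma_1$ and $\Sigma_2$ are unique only up to multiples of a constant because, for any $|c| > 0$, $c\Sigma_1 \otimes (1/c)\Sigma_2 = \Sigma_1 \otimes \Sigma_2$. Each of $\Sigma_1$ and $\Sigma_2$ is modeled using an AR(1) structure with a common error variance, $\sigma^2$, and correlation parameters $\rho_1$ and $\rho_2$:

$$\Sigma_1 = \sigma^2\begin{bmatrix} 1 & \rho_1 & \cdots & \rho_1^{S-1} \\ \rho_1 & 1 & \cdots & \rho_1^{S-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_1^{S-1} & \rho_1^{S-2} & \cdots & 1 \end{bmatrix}, \qquad
\Sigma_2 = \sigma^2\begin{bmatrix} 1 & \rho_2 & \cdots & \rho_2^{T-1} \\ \rho_2 & 1 & \cdots & \rho_2^{T-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_2^{T-1} & \rho_2^{T-2} & \cdots & 1 \end{bmatrix} \tag{3-9}$$

Separable covariance structures, however, cannot model interaction effects of each reaction norm to temperature and irradiance. Thus, there is a need for a more general model for this purpose.

Note that with 3-6, 3-7, 3-8 and 3-9, $\Theta = \{\alpha_1, P_{m1}(20), \theta_1, \alpha_2, P_{m2}(20), \theta_2, \sigma^2, \rho_1, \rho_2\}$ in 3-4. These model parameters may be estimated using the ECM algorithm but closed form solutions at the CM-step could be very complicated. A more convenient alternative is the Nelder-Mead simplex algorithm (Section 2.3.2).

This means that if the reaction norm curves are distinct (in terms of their respective estimated parameters), then a QTL possibly exists. Of course, a slight difference in parameter estimates does not automatically mean a QTL exists. But the significance of the results can be tested by doing permutation tests using the log-likelihood ratio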
PAGE 71
1.2.3). A procedure described in Wu et al. (2004a) can be used to test the additive effects of a QTL. Other hypotheses can be formulated and tested, such as the genetic control of the reaction norm to each environmental factor, interaction effects between environmental factors on the phenotype, and the marginal slope of the reaction norm with respect to each environmental factor or the gradient of the reaction norm itself. The reader is referred to Wu et al. (2007a) for more details.

3.3.1 Introduction
PAGE 72
The construction of valid (positive-definite) nonseparable covariance models has made great strides in recent years. Schabenberger and Gotway (2005) describe four main approaches: (1) Gneiting's (2002) monotone function, (2) Cressie and Huang's (1999) spectral method, (3) mixture (Ma, 2007), and (4) Jones and Zhang's (1997) partial differential equation. (1) and (2) utilize mainly statistical principles whereas (3) and (4) are mostly mathematical in nature. We shall discuss (1) and (2) in Section 3.3.4 and use examples derived from these approaches in the simulations (Section 3.4).

where observations are collected at $N$ spatio-temporal coordinates $(s_1, t_1), (s_2, t_2), \ldots, (s_N, t_N)$ and $d \in \mathbb{Z}^+$. The data are only a partial realization of the process because, for practical reasons, the process cannot be observed at each coordinate. Gneiting (2002) notes that, mathematically, the space-time domain $\mathbb{R}^d \times \mathbb{R}$ and the purely spatial domain $\mathbb{R}^{d+1}$ are equivalent. This means that the space-time covariance functions in $\mathbb{R}^d \times \mathbb{R}$ and spatial covariance functions in $\mathbb{R}^{d+1}$ belong to the same class. However, the notation $\mathbb{R}^d \times \mathbb{R}$ is used to highlight the distinction between the respective domains. In this study, we will only be concerned with the case $d = 1$ so that, from here on, we will use $\mathbb{R}$ instead of $\mathbb{R}^d$ for the spatial domain. Aside from those mentioned in the introduction (Section 3.1), $Y$ may also represent ozone levels, disease incidence, ocean current patterns, water temperatures, etc. In our study, $Y$ represents photosynthetic rate.

If $\operatorname{var}(Y(s,t)) < \infty$ for all $(s,t) \in \mathbb{R} \times \mathbb{R}$, then the mean $E[Y(s,t)]$ and covariance $\operatorname{cov}(Y(s,t), Y(s+u,t+v))$, where $u$ and $v$ are spatial and temporal lags, respectively, both exist. We assume that the covariance is stationary in space and time so that for some
PAGE 73
$$\operatorname{cov}(Y(s,t), Y(s+u,t+v)) = C(u,v). \tag{3-11}$$

This means that the covariance function, $C$, depends only on the lags and not on the values of the coordinates themselves. Stationarity is often assumed to allow estimation of the covariance function from the data (Cressie and Huang, 1999). We also assume that the covariance function is isotropic, which means that it depends only on the absolute lags and not on the direction or orientation of the coordinates relative to each other:

$$\operatorname{cov}(Y(s,t), Y(s+u,t+v)) = C(|u|,|v|). \tag{3-12}$$

Stationary and isotropic covariance functions are said to be translation- and rotation-invariant (about the origin) (Waller and Gotway, 2004). Note that $C(u,0)$ and $C(0,v)$ correspond to purely spatial and purely temporal covariance functions, respectively.

To be a valid covariance function, $C$ must be positive definite. This means that for any $(s_1,t_1), \ldots, (s_k,t_k) \in \mathbb{R} \times \mathbb{R}$, any real coefficients $a_1, \ldots, a_k$, and any positive integer $k$,

$$\sum_{i=1}^{k}\sum_{j=1}^{k} a_i a_j\, C(s_i - s_j, t_i - t_j) \ge 0. \tag{3-13}$$

Note that, based on Eq. 3-13, $C$ should really be called nonnegative-definite. However, this is the way it is defined in the literature and we will adhere to this convention.

In spatio-temporal analysis, the ultimate goal is optimal prediction (or kriging) of an unobserved part of the random process using an appropriate covariance function model. In this study, we utilize a nonseparable covariance to calculate the mixture likelihood associated with functional mapping.
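The positive-definiteness condition can be checked numerically for a candidate covariance function by building the covariance matrix over a set of coordinates and inspecting its eigenvalues. The sketch below assumes NumPy and uses, purely as an illustrative example, the stationary and isotropic function $\sigma^2\exp(-a|u| - b|v|)$; the coordinates, parameter values and function names are assumptions made for the example.

```python
import numpy as np

def exp_cov(u, v, sigma2=1.0, a=0.5, b=0.2):
    """Illustrative stationary, isotropic covariance sigma2 * exp(-a|u| - b|v|) (assumed)."""
    return sigma2 * np.exp(-a * np.abs(u) - b * np.abs(v))

def covariance_matrix(cov_fn, s, t):
    """Covariance matrix over all (s, t) pairs, ordered with s outermost and t innermost."""
    S, T = np.meshgrid(s, t, indexing="ij")
    s_all, t_all = S.ravel(), T.ravel()
    U = s_all[:, None] - s_all[None, :]      # pairwise spatial lags
    V = t_all[:, None] - t_all[None, :]      # pairwise temporal lags
    return cov_fn(U, V)

s = np.array([0.0, 1.0, 2.0, 4.0])           # unequally spaced "spatial" coordinates
t = np.array([0.0, 1.0, 3.0])                # unequally spaced "temporal" coordinates
Sigma = covariance_matrix(exp_cov, s, t)

min_eig = np.linalg.eigvalsh(Sigma).min()
a_vec = np.random.default_rng(0).normal(size=Sigma.shape[0])
quad_form = float(a_vec @ Sigma @ a_vec)     # the double sum in the definition
print("smallest eigenvalue:", min_eig, "quadratic form:", quad_form)
```

A negative smallest eigenvalue (beyond rounding error) would show that the candidate function is not a valid covariance on the chosen coordinates; a nonnegative one is only a necessary check, since validity must hold for every finite set of coordinates.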
PAGE 74
where $\sigma^2 = C(0,0)$ is the variance of the process.

With representation 3-14, separable models have an advantage. For example, models for $C(u,v|\theta)$ can be easily constructed by selecting suitable and readily available choices for each of $C_1(u|\theta_1)$ and $C_2(v|\theta_2)$. Because the product of two positive-definite functions is itself positive-definite, $C(u,v|\theta)$ is guaranteed to be positive-definite whenever $C_1$ and $C_2$ are. An example is

$$C(u,v\,|\,a,b) = C_1(u|a)\,C_2(v|b) = \exp(-a|u| - b|v|)$$

where $C_1(u|a) = \exp(-a|u|)$ and $C_2(v|b) = \exp(-b|v|)$. Notice that for any given spatial lags $u_1$ and $u_2$, $C(u_1,v|a,b)$ and $C(u_2,v|a,b)$ are proportional to each other. This means that the plots of the temporal covariances have the same shapes at these spatial lags. This property is important in the spectral construction of valid nonseparable models proposed by Cressie and Huang (1999) (Section 3.3.4.1). For separable models, the processes in the spatial and temporal domains do not act on each other and hence the selection of an appropriate model for $C(u,v|\theta)$ can be facilitated by doing separate (conditional) exploratory data analyses of spatial and temporal patterns.

A more general definition of separability is as a Kronecker product, as in Eq. 3-8. From Eq. 3-8, with $\Sigma_1$ of dimension $S \times S$ and $\Sigma_2$ of dimension $T \times T$, it can be shown that $\Sigma_{AR(1)}^{-1} = \Sigma_1^{-1} \otimes \Sigma_2^{-1}$ and $|\Sigma_{AR(1)}| = |\Sigma_1|^{T}\,|\Sigma_2|^{S}$, where $|\cdot|$ denotes the determinant of a matrix. Thus, another advantage of separable models is computational efficiency, particularly in likelihood models where the inverse and determinant of the covariance matrix are calculated. For a large covariance matrix of dimension $UV$, its inverse can be calculated from the inverses of its Kronecker component
PAGE 75
3-14 as

where $u = 1, \ldots, U$, $v = 1, \ldots, V$. This model assumes equidistant or regularly spaced coordinates. Thus, two consecutive or closest-neighbor coordinates will have the same correlation as any other such pair even if their respective distances are different. A more appropriate model might be

where $a$ and $b$ are scale parameters. However, this model is more complex than $\Sigma_{AR(1)}$ in the sense that it has more parameters (5 vs 3) to estimate. The question of which model is better leads to a model selection issue.
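The computational advantage of a separable (Kronecker product) covariance can be verified numerically. The sketch below, assuming NumPy, builds two AR(1) matrices with a common variance, forms their Kronecker product, and checks the standard identities $(\Sigma_1 \otimes \Sigma_2)^{-1} = \Sigma_1^{-1} \otimes \Sigma_2^{-1}$ and $\log|\Sigma_1 \otimes \Sigma_2| = T\log|\Sigma_1| + S\log|\Sigma_2|$ for $\Sigma_1$ of size $S \times S$ and $\Sigma_2$ of size $T \times T$. The dimensions, correlation values and the helper name `ar1` are illustrative assumptions.

```python
import numpy as np

def ar1(n, rho, sigma2=1.0):
    """AR(1) covariance matrix with entries sigma2 * rho^|i - j|."""
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

S, T = 5, 4                                   # numbers of irradiance and temperature levels
Sigma1 = ar1(S, rho=0.7, sigma2=1.5)
Sigma2 = ar1(T, rho=0.4, sigma2=1.5)
Sigma = np.kron(Sigma1, Sigma2)               # separable ST x ST covariance

# Inverse: one 20 x 20 inversion versus a 5 x 5 and a 4 x 4 inversion.
inv_direct = np.linalg.inv(Sigma)
inv_kron = np.kron(np.linalg.inv(Sigma1), np.linalg.inv(Sigma2))

# Log-determinant: note the exponents T and S on the component determinants.
logdet_direct = np.linalg.slogdet(Sigma)[1]
logdet_kron = T * np.linalg.slogdet(Sigma1)[1] + S * np.linalg.slogdet(Sigma2)[1]

print(np.allclose(inv_direct, inv_kron), np.isclose(logdet_direct, logdet_kron))
```

For large $S$ and $T$ the saving is substantial, because the cost of inverting an $ST \times ST$ matrix grows with $(ST)^3$ while the component inversions cost only on the order of $S^3 + T^3$.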
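E) that Eq. 3-19 can be expressed as

\[ C(u,v) = \int e^{iu\omega}\,\rho(\omega;v)\,k(\omega)\,d\omega. \tag{3-20} \]

Equation 3-20 can be used to find valid covariance functions by selecting appropriate forms for $\rho(\omega;v)$ and $k(\omega)$. To get nonseparable structures, $\rho(\omega;v)$ must not be independent of $\omega$. Otherwise, $C(u,v)$ will be separable.

Cressie and Huang gave seven examples of valid nonseparable covariance functions constructed from certain choices for $\rho(\omega;v)$ and $k(\omega)$ and using equation 3-20. We present three of them here and use the first two in the simulations.

where $a, b \ge 0$ are the scaling parameters of time and space, respectively, and $\sigma^2 = C(0,0)$.

\[ C(u,v) = \frac{\sigma^2\,(a|v|+1)}{(a|v|+1)^2 + b^2|u|^2}, \tag{3-22} \]

where $a, b \ge 0$ are the scaling parameters of time and space, respectively, and $\sigma^2 = C(0,0)$.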
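Whether a covariance function is separable can be probed numerically: for a separable model, $C(u,v)\,C(0,0) = C(u,0)\,C(0,v)$ at every pair of lags. The sketch below applies this check to a Cauchy-type covariance in the style of Eq. 3-22. The function names, the exact numerator $\sigma^2(a|v|+1)$, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def c2(u, v, sigma2=1.0, a=0.5, b=0.01):
    """Cauchy-type nonseparable covariance in the style of Eq. 3-22.

    u is the spatial (irradiance) lag and v the temporal (temperature) lag.
    The numerator sigma2 * (a|v| + 1) is an assumption of this sketch.
    """
    t = a * np.abs(v) + 1.0
    return sigma2 * t / (t ** 2 + (b * np.abs(u)) ** 2)

def separable_cov(u, v, sigma2=1.0, a=0.5, b=0.01):
    """Hypothetical separable comparison model: product of exponentials."""
    return sigma2 * np.exp(-b * np.abs(u)) * np.exp(-a * np.abs(v))

def separability_gap(cov_fn, u, v):
    """C(u, v) C(0, 0) - C(u, 0) C(0, v); identically zero for separable models."""
    return cov_fn(u, v) * cov_fn(0.0, 0.0) - cov_fn(u, 0.0) * cov_fn(0.0, v)

gap_nonsep = separability_gap(c2, 100.0, 5.0)
gap_sep = separability_gap(separable_cov, 100.0, 5.0)

# The nonseparable model should still give a nonnegative definite matrix.
S, T = (g.ravel() for g in np.meshgrid([0.0, 50.0, 100.0, 200.0, 300.0],
                                       [15.0, 20.0, 25.0, 30.0], indexing="ij"))
Sigma = c2(S[:, None] - S[None, :], T[:, None] - T[None, :])
min_eig = np.linalg.eigvalsh(Sigma).min()
```

A nonzero gap at any pair of lags rules out separability, whereas a zero gap at a few lags is only suggestive.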
3-23 reduces to a separable model.

3-19 or 3-20. Gneiting (2002) developed an approach that does not rely on Fourier transform pairs and avoids this kind of limitation.

Let $\varphi(x)$ and $\psi(x)$ be functions with nonnegative domains. Suppose $\varphi(x)$ is completely monotone and $\psi(x)$ is positive with a completely monotone derivative. Then

where $\sigma^2 = C(0,0) > 0$, is a valid nonseparable spatio-temporal covariance model. $\varphi(x)$ and $\psi(x)$ can be associated with spatial and temporal structures, respectively, and Gneiting (2002) provides a list of functions that can be used for each. For example, using

Multiplying 3-25 by the purely temporal covariance function $(a|v|^{2\alpha}+1)^{-\delta}$, $v \in \mathbb{R}$, with $\delta \ge 0$, produces
3-26 is

where $\tau \ge 1/2$ replaces $\delta + \beta/2$. $\beta$ is a space-time interaction parameter which implies a separable structure when $\beta$ is 0 and a nonseparable structure otherwise. Increasing values of $\beta$ indicate strengthening spatio-temporal interaction.

3-21, 3-22 (Examples 1 and 2 of Section 3.3.4.1) and 3-27, denoted as follows:

\[ C_2(u,v) = \frac{\sigma^2\,(a|v|+1)}{(a|v|+1)^2 + b^2|u|^2}, \qquad a, b \ge 0,\; \sigma^2 > 0, \tag{3-29} \]

To simplify our analysis, we assume for $C_3(u,v)$ that $\alpha = 1/2$, $\gamma = 1/2$, and $\tau = 1$ so that

We then generate data using these nonseparable structures to simulate interaction effects between the two environmental signals in functional mapping of a reaction norm. Simulations using separable and nonseparable covariance structures for spatio-temporal processes were studied by Huang et al. (2007a). The generated data are analyzed using the nonparametric estimator, NP, developed in Chapter 2 and AR(1) to assess their performance. We also want to test whether the separable model, AR(1), can be used to
2.4.1).

Using a backcross design (2 genotype groups; Section 1.1.2) for the QTL mapping population, we randomly generated 6 markers equally spaced on a chromosome 100 cM long. One QTL was simulated between the fourth and fifth markers, 12 cM from the fourth marker (or 72 cM from the leftmost marker of the chromosome). The QTL had two possible genotypes which determined two distinct mean photosynthetic rate reaction norm surfaces defined by Eqs. 3-1 and 3-2 (see Figure 3-1). The surface parameters for each genotype group were $(\alpha_1, P_{m1}(20), \theta_1) = (0.02, 2, 0.9)$ and $(\alpha_2, P_{m2}(20), \theta_2) = (0.01, 1.5, 0.9)$. Phenotype observations were obtained by sampling from a multivariate normal distribution with mean surface based on irradiance and temperature levels of $\{0, 50, 100, 200, 300\}$ and $\{15, 20, 25, 30\}$, respectively, and covariance matrix $C_l(u,v)$, $l = 1, 2, 3$.

The functional mapping model was applied to the marker and phenotype data with $n = 200, 400$ samples. The surface defined by Eqs. 3-1 and 3-2 was used as the mean model and $C_l(u,v)$ as the covariance model to analyze the data generated using $C_l(u,v)$. That is, we fitted the data with the same mean and covariance models that were used to generate them. One hundred simulation runs were carried out, and the averages over all runs of the estimated QTL location, mean parameter estimates, maxLR, and entropy and quadratic losses (see Section 2.4.1), together with the respective Monte Carlo standard errors (SE), were recorded. The results are shown in Tables 3-1 and 3-2. Table 3-2 also includes the results for AR(1). Both tables show accurate and precise estimates of QTL location, mean surface and covariance parameters.

Next, NP and AR(1) were used to analyze the data generated by each of $C_l(u,v)$, $l = 1, 2, 3$. Tables 3-3 and 3-4 show the results of these respective analyses. The results for NP are very good. However, those for AR(1) are somewhat unexpected. Apparently, the estimated QTL location and mean parameters are accurate and precise. This would imply
3-1 and 3-2, which should be (almost) the true values. The average losses, however, are inflated for $C_1$ and $C_2$. Upon close inspection, it turns out that it is misleading to look at maxLR in this situation. What should be considered are the log-likelihood values under the null and alternative models from which maxLR is derived. Figure 3-2 provides boxplots of the log-likelihood values under the alternative model based on the 100 simulation runs. These plots reveal clearly biased estimates of $C_1$ and $C_2$ by AR(1), and the degrees of bias are consistent with the average losses. The results for the null model are very similar but are not presented here. We also provide the covariance and corresponding contour plots of $C_l(u,v)$, $l = 1, 2, 3$, and the AR(1) estimates of these in Figures 3-3 and 3-4.

We conducted further simulations under $C_1$ with $n = 400$, the case where AR(1) performed the worst. We considered two scenarios: increased variance ($\sigma^2 = 2, 4$) and an increased number of irradiance ($\{0, 50, 100, 150, 200, 250, 300\}$) and temperature ($\{15, 18, 21, 24, 27, 30\}$) levels. The results are shown in Tables 3-5 and 3-6, respectively. The results show that under these two scenarios, the estimate of the QTL location is severely biased if one uses AR(1). This is not the case for NP.
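The data-generating step and the loss calculations used to judge a covariance estimate can be sketched as follows. This is an illustration only. The covariance `c2` follows the style of Eq. 3-22 with an assumed numerator, the mean surface is replaced by a zero vector because Eqs. 3-1 and 3-2 are not reproduced here, and the entropy and quadratic losses are written in their commonly used forms, $L_E = \mathrm{tr}(\Sigma^{-1}\hat\Sigma) - \log|\Sigma^{-1}\hat\Sigma| - m$ and $L_Q = \mathrm{tr}(\Sigma^{-1}\hat\Sigma - I)^2$, which are assumed to match Section 2.4.1. The sample covariance stands in for whichever estimator is being assessed.

```python
import numpy as np

def c2(u, v, sigma2=1.0, a=0.5, b=0.01):
    """Nonseparable covariance in the style of Eq. 3-22 (numerator assumed)."""
    t = a * np.abs(v) + 1.0
    return sigma2 * t / (t ** 2 + (b * np.abs(u)) ** 2)

def grid_covariance(cov_fn, s, t):
    """Covariance matrix over the irradiance-by-temperature grid."""
    S, T = (g.ravel() for g in np.meshgrid(s, t, indexing="ij"))
    return cov_fn(S[:, None] - S[None, :], T[:, None] - T[None, :])

def entropy_loss(sigma, sigma_hat):
    """tr(S^-1 S_hat) - log|S^-1 S_hat| - m (assumed form of L_E)."""
    A = np.linalg.solve(sigma, sigma_hat)
    _, logdet = np.linalg.slogdet(A)
    return np.trace(A) - logdet - sigma.shape[0]

def quadratic_loss(sigma, sigma_hat):
    """tr[(S^-1 S_hat - I)^2] (assumed form of L_Q)."""
    A = np.linalg.solve(sigma, sigma_hat) - np.eye(sigma.shape[0])
    return np.trace(A @ A)

irradiance = np.array([0.0, 50.0, 100.0, 200.0, 300.0])
temperature = np.array([15.0, 20.0, 25.0, 30.0])
Sigma = grid_covariance(c2, irradiance, temperature)

rng = np.random.default_rng(2024)
n = 400
mean = np.zeros(Sigma.shape[0])          # placeholder for the mean surface
Y = rng.multivariate_normal(mean, Sigma, size=n)

Sigma_hat = np.cov(Y, rowvar=False)      # stand-in covariance estimate
loss_e = entropy_loss(Sigma, Sigma_hat)
loss_q = quadratic_loss(Sigma, Sigma_hat)
```

Both losses are zero when the estimate equals the true covariance and grow as the estimate departs from it, which is why they complement maxLR when the fitted covariance model is misspecified.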
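Covariance | n | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | maxLR | $L_E$ | $L_Q$ | $\sigma^2$ | $a$ | $b$
$C_1$ | 200 | 71.96 | 0.02 | 2.00 | 0.90 | 0.01 | 1.54 | 0.88 | 131.46 | 0.02 | 0.19 | 1.00 | 0.50 | 0.01
 | | (0.32) | (0.00) | (0.01) | (0.00) | (0.00) | (0.02) | (0.01) | (2.31) | (0.00) | (0.04) | (0.00) | (0.00) | (0.00)
 | 400 | 72.00 | 0.02 | 2.01 | 0.90 | 0.01 | 1.52 | 0.89 | 262.11 | 0.01 | 1.13 | 1.00 | 0.50 | 0.01
 | | (0.20) | (0.00) | (0.01) | (0.00) | (0.00) | (0.01) | (0.01) | (3.00) | (0.00) | (0.02) | (0.00) | (0.00) | (0.00)
Columns 4-6 refer to QTL genotype 1 and columns 7-9 to QTL genotype 2. Monte Carlo standard errors are in parentheses.

Covariance | n | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | maxLR | $L_E$ | $L_Q$ | $\sigma^2$ | $\rho_1$ | $\rho_2$

Covariance | n | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | maxLR | $L_E$ | $L_Q$ | $\sigma^2$ | $a$ | $b$ | $c$
$C_3$ | 200 | 71.96 | 0.02 | 2.01 | 0.89 | 0.01 | 1.55 | 0.87 | 126.80 | 0.02 | 0.19 | 1.00 | 1.03 | 0.01 | 0.62
 | | (0.34) | (0.00) | (0.01) | (0.01) | (0.00) | (0.02) | (0.01) | (2.20) | (0.00) | (0.04) | (0.00) | (0.02) | (0.00) | (0.02)
 | 400 | 71.92 | 0.02 | 2.01 | 0.90 | 0.01 | 1.52 | 0.89 | 253.38 | 0.01 | 0.13 | 1.00 | 1.01 | 0.01 | 0.61
 | | (0.20) | (0.00) | (0.01) | (0.00) | (0.00) | (0.01) | (0.01) | (2.83) | (0.00) | (0.02) | (0.00) | (0.01) | (0.00) | (0.02)
Columns 4-6 refer to QTL genotype 1 and columns 7-9 to QTL genotype 2. Monte Carlo standard errors are in parentheses.

Covariance | n | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | maxLR | $L_E$ | $L_Q$ | $\sigma^2$ | $\rho_1$ | $\rho_2$

Covariance | n | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | maxLR | $L_E$ | $L_Q$

Covariance | n | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | maxLR | $L_E$ | $L_Q$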
Figure 3-2. Boxplots of the values of the log-likelihood under the alternative model, $H_1$. Significantly biased estimates by AR(1) are apparent for $C_1$.

Figure 3-3. Covariance plots. Plots of $C_l$, $l = 1, 2, 3$, versus irradiance ($|u|$) and temperature ($|v|$) lags are in the left column. In the right column are the estimates of $C_l$ by AR(1).

Figure 3-4. Contour plots. Contour plots of $C_l$, $l = 1, 2, 3$, are in the left column. In the right column are the contour plots of the estimates of $C_l$ by AR(1).
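Covariance | $\sigma^2$ | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | log-likelihood $H_0$ | log-likelihood $H_1$ | maxLR | $L_E$ | $L_Q$

Covariance | $\sigma^2$ | QTL location | $\hat\alpha_1$ | $\hat P_{m1}(20)$ | $\hat\theta_1$ | $\hat\alpha_2$ | $\hat P_{m2}(20)$ | $\hat\theta_2$ | log-likelihood $H_0$ | log-likelihood $H_1$ | maxLR | $L_E$ | $L_Q$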
A few issues need to be discussed. First, NP was developed in Chapter 2 based on a sequence of regressions obtained from the modified Cholesky decomposition of the covariance matrix of a one-dimensional (longitudinal) vector which has an ordering of variables. In this chapter, the phenotype vector consists of observations based on two levels of irradiance and temperature measurements, i.e.

\[ y_i = [\underbrace{y_i(1,1),\dots,y_i(1,T)}_{\text{irradiance } 1},\;\dots,\;\underbrace{y_i(S,1),\dots,y_i(S,T)}_{\text{irradiance } S}]'. \tag{3-32} \]

While the order of the variables in this vector is predefined, there is no natural ordering like in longitudinal data. Instead of NP, a more appropriate method might be to adopt the sparse permutation invariant covariance estimator (SPICE) proposed by Rothman et al. (2008), which is invariant to variable permutations. SPICE is derived by decomposing the covariance matrix as

\[ \Sigma = C'C \tag{3-33} \]

where $C = [c_{tj}]$ is a lower triangular matrix. In terms of the components of the sequence of regression equations,

where $\{\phi_{tj}\}$ and $\{\sigma^2_{tt}\}$ are the GARPs and IVs (Section 2.2.1). However, our simulation results suggest that NP can still be directly applied to observations that have no variable ordering such as 3-32. Furthermore, Rothman et al. stated that, under variable
A second issue pertains to using nonseparable models in functional mapping, where the simulations in this chapter showed very good results. This might be a good idea if the model closely reflects the structure of the data. Unfortunately, this is not often the case. In fact, it is not even known whether the data exhibit interactions or not. Before deciding on what model to use, spatio-temporal modelers utilize tests for separability (Mitchell et al., 2005; Fuentes, 2005). If separable models are appropriate, there is a wealth of options. Otherwise, it is difficult to choose from a number of complex models because there are as yet no general guidelines that can help one decide on a specific nonseparable model. The model $C_3$ that was used in the simulations (Section 3.4) has an easy-to-interpret interaction parameter $\beta \in [0,1]$. However, despite an interaction "strength" of $\beta = 0.6$, the separable model, AR(1), estimated the covariance of the data generated by $C_3$ quite well. Thus, the gain from using a nonseparable model instead of a separable one may not be worth the added complexity. Another option is to use separable approximations to nonseparable covariances (Genton, 2007). The nonseparable covariances that we considered were assumed to be stationary and isotropic (Section 3.3.2). These two assumptions may not always hold for real data. Although not specifically addressed here, using NP may work for data that do not satisfy these assumptions.

In this chapter, we only considered two environmental signals with interactions: irradiance and temperature. However, the reaction norm of photosynthetic rate is a very complex process because there are more environmental signals at play than these two. The spatial domain of spatio-temporal nonseparable covariance models can be extended to more than one dimension. For example, a two-dimensional spatial domain models an area on a flat surface while a three-dimensional domain models space. However, this extension cannot be used to increase the number of signals unless the signals have the same unit of measurement or one assumes separability or no interaction. Thus,
The analyses conducted in this chapter were all based on simulated data, which makes our proposed model theoretical and not (yet) practical. However, we hope that our theoretical framework can either stimulate and motivate researchers to conduct experiments and studies that produce data our model can analyze, or at least lead research in a direction that we consider theoretically possible.
2-34, 2-35 and 2-36 based on an $L_2$ penalty. This provided not only an intuitive interpretation but, more importantly, a way to extend a null model covariance to an alternative (or mixture) model covariance. Thus, the nested LASSO (Levina et al., 2008) and SPICE (Rothman et al., 2008) models can potentially be implemented by our method to produce other regularized estimators.

We considered two main applications of functional mapping: traits measured across time, or longitudinal traits (Chapter 2), and reaction norms to two environmental signals (Chapter 3). For longitudinal traits, simulations showed that our estimator can more precisely estimate the QTL location and its effects compared to AR(1). For reaction norms, simulations again showed our estimator to be more reliable than the estimator proposed by Wu et al. (2007a) in the presence of interaction effects between the two environmental signals. The interaction effects were simulated using nonseparable covariance structures. In both applications, the nonparametric estimator is more flexible because it does not assume a parametric or structural form and is therefore suited to analyzing data with varying structures. Therefore, the nonparametric estimator can be used as an alternative to, or a guide for, parametric modeling of the covariance in the practical deployment of functional mapping.
1.3). Composite functional mapping allows modeling of marker effects beyond the interval considered by using a partial regression analysis. This significantly improves the accuracy and precision of functional mapping in multiple QTL detection. However, composite functional mapping assumes an AR(1) covariance structure. It would be advantageous to incorporate our proposed nonparametric estimator into this method to further improve its power.

The development of complex traits is the consequence of interactions among a multitude of genetic and environmental factors that each trigger an impact on every step of trait development. This process is inherently complicated, but can be illustrated by a landscape of phenotype formed by genetic and environmental variables (Rice 2002; Wolf 2002). The surface of such a phenotype landscape defines the phenotype determined by a particular combination of underlying genetic (such as additive, dominant or epistatic) and environmental factors (such as temperature, light or moisture) that interact with each other through developmental pathways. The number of underlying factors contributing to phenotypic variation defines the number of dimensions of the landscape space. In theory, the number of underlying factors can be unlimited, implying that a landscape can exist in very-high-dimensional space (i.e., hyperspace) (Wolf 2002). Figure 4-1 shows a hypothetical landscape, where the phenotype of an individual is determined by the values of two underlying factors. By characterizing the topographical features of such a landscape, a fundamental question of how each underlying factor contributes to the expression of a particular trait, individually or through an interactive web, can be addressed. These features typically include "slope", "curvature", "peak-valley" and "ridge". The description of the topography of a three-dimensional landscape (Fig. 4-1) is most intuitive, but
Figure 4-1. Formation of a phenotype by a landscape. The phenotypic formation is a function of the value of underlying factors 1 and 2 ($u_1$ and $u_2$) that interact during trait development. Two shaded ovals represent two different areas on the surface, one being steeper (pointing to Inset A) and the second being flatter (pointing to Inset B). The steeper one is associated with a dramatic change in phenotypic expression contributed by a small change in the underlying factors (indicated by the distribution in Inset A), whereas the flatter one is associated with a different pattern in which dramatic changes in the underlying factors only lead to a minor change in phenotypic expression (indicated by the distribution in Inset B). Adapted from Wolf (2002).

Because biological traits are derived from developmental processes and physiological regulatory mechanisms, complex multivariate systems that undergo such processes should
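Let $x = (x_1', \dots, x_n')'$, where $x_i = (x_{i1}, \dots, x_{iJ})'$, $i = 1, \dots, n$, is a vector that indicates the genotype group to which $y_i = (y_{i1}, \dots, y_{im})'$ belongs. We assume that the $x_i$'s are independent and identically distributed (i.i.d.) realized values from a multinomial$(1, p_i)$ distribution, where $p_i = (p_{1|i}, \dots, p_{J|i})'$. Thus, $x_{ik} = 1$ or $0$, depending on whether or not $y_i$ belongs to genotype group $k = 1, \dots, J$. In reality, $x$ is unknown (or missing) so that $y = (y_1', \dots, y_n')'$ can be viewed as incomplete data. The complete data is $(x', y')'$ with log-likelihood

\[ \log L_c(\Omega) = \log \prod_{i=1}^{n}\prod_{k=1}^{J} \left[p_{k|i}\, f_k(y_i \mid \Omega)\right]^{x_{ik}} = \sum_{i=1}^{n}\sum_{k=1}^{J} x_{ik}\left[\log p_{k|i} + \log f_k(y_i \mid \Omega)\right]. \]

The EM algorithm at the $(j+1)$th iteration proceeds as follows:

1. The current value of $\Omega$ is $\Omega^{(j)}$.

2.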
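\[ P(X_{ik}=1 \mid y_i, \Omega^{(j)}) = \frac{P(y_i, X_{ik}=1 \mid \Omega^{(j)})}{P(y_i \mid \Omega^{(j)})} = \frac{P(y_i \mid X_{ik}=1, \Omega^{(j)})\,P(X_{ik}=1 \mid \Omega^{(j)})}{\sum_{k'=1}^{J} P(y_i \mid X_{ik'}=1, \Omega^{(j)})\,P(X_{ik'}=1 \mid \Omega^{(j)})} \]

so that A-2 becomes

\[ Q(\Omega \mid \Omega^{(j)}) = \sum_{i=1}^{n}\sum_{k=1}^{J} P^{(j)}_{k|i}\left[\log p_{k|i} + \log f_k(y_i \mid \Omega)\right]. \]

Therefore, this step involves updating $P^{(j)}_{k|i}$ using $\Omega^{(j)}$ as in 1-13.

3.

\[ \frac{\partial}{\partial\Omega} Q(\Omega \mid \Omega^{(j)}) = \sum_{i=1}^{n}\sum_{k=1}^{J} P^{(j)}_{k|i}\, \frac{\partial}{\partial\Omega} \log f_k(y_i \mid \Omega) = 0 \tag{A-5} \]

to get $\Omega^{(j+1)}$.

4. Repeat until some convergence criterion is met.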
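The four steps above can be sketched for a simplified mixture in which only the genotype-specific mean vectors are unknown. This is an illustrative toy setup, not the functional mapping model itself: the data, the known identity covariance, the individual-specific prior probabilities `prior` (playing the role of $p_{k|i}$), and all function names are assumptions. The E-step computes the posterior probabilities $P_{k|i}$ by Bayes' rule, and the M-step solves the score equation, which for unstructured means gives posterior-weighted averages.

```python
import numpy as np

def log_mvn_pdf(y, mean, cov):
    """Log density of N(mean, cov) at each row of y."""
    m = y.shape[1]
    resid = y - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", resid, np.linalg.solve(cov, resid.T).T)
    return -0.5 * (m * np.log(2.0 * np.pi) + logdet + maha)

def e_step(y, prior, means, cov):
    """Posterior probabilities P_{k|i} and the observed-data log-likelihood."""
    log_f = np.column_stack([log_mvn_pdf(y, mu, cov) for mu in means])
    log_joint = np.log(prior) + log_f
    log_marginal = np.logaddexp.reduce(log_joint, axis=1)
    return np.exp(log_joint - log_marginal[:, None]), log_marginal.sum()

def m_step_means(y, post):
    """Posterior-weighted means, the solution of the score equation here."""
    return (post.T @ y) / post.sum(axis=0)[:, None]

# Toy data: two genotype groups, three measurements per individual.
rng = np.random.default_rng(7)
n, m = 300, 3
true_means = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]])
cov = np.eye(m)                                  # treated as known in this sketch
p1 = rng.uniform(0.1, 0.9, size=n)               # hypothetical p_{1|i}
prior = np.column_stack([p1, 1.0 - p1])
group = (rng.uniform(size=n) > p1).astype(int)   # unobserved genotype labels
y = true_means[group] + rng.normal(size=(n, m))

means = np.array([[0.5, 0.5, 0.5], [1.5, 1.5, 1.5]])  # step 1: current value
loglik_path = []
for _ in range(200):
    post, loglik = e_step(y, prior, means, cov)        # step 2: E-step
    loglik_path.append(loglik)
    if len(loglik_path) > 1 and loglik_path[-1] - loglik_path[-2] < 1e-8:
        break                                          # step 4: convergence
    means = m_step_means(y, post)                      # step 3: M-step
```

Monitoring the observed-data log-likelihood, which should never decrease across iterations, is one common choice of convergence criterion.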
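Suppose $X'X$ is in correlation form. Then the eigenvalue decomposition of $X'X$ is $X'X = V\,\mathrm{diag}(\lambda_1, \dots, \lambda_k)\,V'$, so that

\[ (X'X)^{-1} = V \begin{bmatrix} 1/\lambda_1 & 0 & \cdots & 0 \\ 0 & 1/\lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\lambda_k \end{bmatrix} V'. \]

2-9 then follows.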
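The eigenvalue form of the inverse can be verified numerically. In this sketch the design matrix is simulated with two nearly collinear columns, and the ridge constant `k` is an arbitrary illustrative value. The block puts $X'X$ in correlation form, rebuilds $(X'X)^{-1}$ from the eigenvalues, and also shows the standard ridge counterpart, in which each $1/\lambda_j$ is replaced by $1/(\lambda_j + k)$.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X[:, 3] = X[:, 2] + 0.1 * rng.normal(size=50)   # induce near-collinearity

# Correlation form: center each column and scale it to unit length.
Xc = X - X.mean(axis=0)
Xs = Xc / np.sqrt((Xc ** 2).sum(axis=0))
R = Xs.T @ Xs                                   # X'X in correlation form

lam, V = np.linalg.eigh(R)                      # R = V diag(lam) V'
R_inv = V @ np.diag(1.0 / lam) @ V.T            # inverse from the eigenvalues

k = 0.1                                         # illustrative ridge constant
ridge_inv = V @ np.diag(1.0 / (lam + k)) @ V.T  # (R + k I)^{-1}
```

Small eigenvalues inflate $1/\lambda_j$ and hence the variance of least squares estimates; adding $k$ bounds each term by $1/k$.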
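For fixed $\phi_{tj}$, $j = 1, 2, \dots, t-1$, 2-33 is minimized with respect to $\sigma_t^2$ by solving

\[ \frac{\partial}{\partial\sigma_t^2}\left[\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left(\log\sigma_t^2 + \frac{\varepsilon_{kit}^2}{\sigma_t^2}\right)\right] = 0, \]

yielding

\[ \sigma_t^2 = \frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{J} P_{k|i}\left(y_{kit} - \sum_{j=1}^{t-1} y_{kij}\,\phi_{tj}\right)^2, \]

since $\varepsilon_{kit} = y_{kit} - \sum_{j=1}^{t-1} y_{kij}\,\phi_{tj}$.

For fixed $\sigma_t^2$, 2-33 is minimized with respect to $\phi_{tj}$ by the minimizer of

Let $\phi_{t(t)} = (\phi_{t1}, \phi_{t2}, \dots, \phi_{t,t-1})'$ and $y_{ki(t)} = (y_{ki1}, y_{ki2}, \dots, y_{ki,t-1})'$. The first term of C-2 is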
C-2 becomes

for fixed $\sigma_t^2$.
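The regression coefficients $\phi_{tj}$ and innovation variances $\sigma_t^2$ that appear in this appendix come from the modified Cholesky decomposition of a covariance matrix (Section 2.2.1). The following sketch, which uses an illustrative AR(1) matrix and a hypothetical helper name, computes the unit lower triangular $T$ and diagonal $D$ with $T \Sigma T' = D$, and checks one row against the regression of the last variable on its predecessors. No penalty or mixture weights are involved here.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular and T @ sigma @ T.T = D."""
    L = np.linalg.cholesky(sigma)        # sigma = L L'
    d = np.diag(L)
    T = np.diag(d) @ np.linalg.inv(L)    # inverse of the unit lower-triangular factor
    D = np.diag(d ** 2)                  # innovation variances on the diagonal
    return T, D

idx = np.arange(5)
Sigma = 0.7 ** np.abs(idx[:, None] - idx[None, :])   # illustrative AR(1) covariance

T, D = modified_cholesky(Sigma)
phi = -np.tril(T, k=-1)                  # phi[t, j]: coefficient of y_j when predicting y_t

# Cross-check the last row against the population regression on predecessors.
t = 4
coef = np.linalg.solve(Sigma[:t, :t], Sigma[:t, t])
resid_var = Sigma[t, t] - Sigma[t, :t] @ coef
```

For this AR(1) example the last row of `phi` has a single nonzero entry, 0.7, on the immediately preceding variable, and the innovation variance is $1 - 0.7^2 = 0.51$.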
If
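By Fourier transformation,

\[ g(\omega,\tau) = (2\pi)^{-2}\iint e^{-i(u\omega + v\tau)}\, C(u,v)\,du\,dv = \frac{1}{2\pi}\int e^{-iv\tau}\left[\frac{1}{2\pi}\int e^{-iu\omega}\, C(u,v)\,du\right]dv = \frac{1}{2\pi}\int e^{-iv\tau}\, h(\omega;v)\,dv \tag{E-1} \]

where

\[ h(\omega;v) = \frac{1}{2\pi}\int e^{-iu\omega}\, C(u,v)\,du = \int e^{iv\tau}\, g(\omega,\tau)\,d\tau \tag{E-2} \]

is the inverse Fourier transform of $g$ in $\tau$, or the spatial spectral density for temporal lag $v$. Using E-2, 3-19 becomes

\[ C(u,v) = \int e^{iu\omega}\, h(\omega;v)\,d\omega. \tag{E-3} \]

Let

\[ h(\omega;v) = \rho(\omega;v)\,k(\omega) \tag{E-4} \]

where $\rho(\omega;v)$ is a valid continuous autocorrelation function in $v$ for each $\omega$ and $k(\omega) > 0$. If $\int \rho(\omega;v)\,dv < \infty$ and $\int k(\omega)\,d\omega < \infty$, then in terms of E-4, E-3 becomes 3-20.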
[1] Andersson, L., Haley, C.S., Ellegren, H., Knott, S.A., Johansson, M., Andersson, K., Andersson-Eklund, L., Edfors-Lilja, I., Fredholm, M., Hansson, I., Hakansson, J. and Lundstrom, K. (1994). "Genetic mapping of quantitative trait loci for growth and fatness in pigs", Science 263, 1771-1774.
[2] Angilletta, Jr., M.J. and Sears, M.W. (2004). "Evolution of thermal reaction norms for growth rate and body size in ectotherms: an introduction to the symposium", Integr. Comp. Biol. 44, 401-402.
[3] Akaike, H. (1974). "A new look at the statistical model identification", IEEE Transactions on Automatic Control 19(6), 716-723.
[4] Banerjee, O., d'Aspremont, A. and El Ghaoui, L. (2006). "Sparse covariance selection via robust maximum likelihood estimation", Proceedings of ICML.
[5] Bickel, P. and Levina, E. (2008). "Regularized estimation of large covariance matrices", Ann. Statist. 36(1), 199-227.
[6] Bochner, S. (1955). Harmonic Analysis and the Theory of Probability, University of California Press, Berkeley and Los Angeles.
[7] Broman, K. (1997). Identifying quantitative trait loci in experimental crosses, Ph.D. Dissertation, Department of Statistics, University of California, Berkeley.
[8] Broman, K. (2001). "Review of statistical methods for QTL mapping in experimental crosses", Lab Animal 30, no. 7, 44-52.
[9] Carlborg, O., Andersson, L. and Kinghorn, B. (2000). "The use of a genetic algorithm for simultaneous mapping of multiple interacting quantitative trait loci", Genetics 155, 2003-2010.
[10] Carroll, R.J. and Ruppert, D. (1984). "Power transformations when fitting theoretical models to data", J. Am. Statist. Assoc. 79, 321-328.
[11] Cox, D.D. and O'Sullivan, F. (1990). "Asymptotic analysis of penalized likelihood and related estimators", Annals of Statistics 18, 1676-1695.
[12] Cressie, N. and Huang, H-C. (1999). "Classes of nonseparable, spatio-temporal stationary covariance functions", J. Am. Statist. Assoc. 94, no. 448, 1330-1340.
[13] Cui, H.J., Zhu, J. and Wu, R. (2006). "Functional mapping for genetic control of programmed cell death", Physiol. Genom. 25, 458-469.
[14] Daniels, M.J. and Pourahmadi, M. (2002). "Bayesian analysis of covariance matrices and dynamic models for longitudinal data", Biometrika 89, 553-566.
[15] de Boor, C. (2001). A Practical Guide to Splines, Revised ed., Springer, New York.
[16] Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). "Maximum likelihood from incomplete data via the EM algorithm", J. Roy. Statist. Soc. B 39, 1-38.
[17] Diggle, P.J., Heagerty, P., Liang, K.Y. and Zeger, S.L. (2002). Analysis of Longitudinal Data, Oxford University Press, UK.
[18] Doerge, R.W. (2002). "Mapping and analysis of quantitative trait loci in experimental populations", Nat. Rev. Genet. 3, 43-52.
[19] Doerge, R.W. and Churchill, G.A. (1996). "Permutation tests for multiple loci affecting a quantitative character", Genetics 142, 285-294.
[20] Drayna, D., Davies, K., Hartley, D., Mandel, J.L., Camerino, G., Williamson, R. and White, R. (1984). "Genetic mapping of the human X-chromosome by using restriction fragment length polymorphisms", Proc. Natl. Acad. Sci. USA 81, 2836-2839.
[21] Eilers, P.H.C. and Marx, B.D. (1996). "Flexible smoothing with B-splines and penalties", Statistical Science 11, no. 2, 89-121.
[22] Fan, J. and Li, R. (2001). "Variable selection via nonconcave penalized likelihood and its oracle properties", J. Am. Statist. Assoc. 96, 1348-1360.
[23] Fu, W. (1998). "Penalized regressions: The bridge versus the lasso", J. Comput. Graph. Statist. 7, 397-416.
[24] Fuentes, M. (2005). "Testing separability of spatial-temporal covariance functions", Journal of Statistical Planning and Inference 136, no. 2, 447-466.
[25] Furrer, R. and Bengtsson, T. (2007). "Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants", Journal of Multivariate Analysis 98, no. 2, 227-255.
[26] Genton, M. (2007). "Separable approximations of space-time covariance matrices", Environmetrics 18, 681-695.
[27] Gill, P., Murray, W. and Wright, M. (1981). Practical Optimization, Academic Press, New York.
[28] Gneiting, T. (2002). "Nonseparable, stationary covariance functions for space-time data", J. Am. Statist. Assoc. 97, no. 458, 590-600.
[29] Gneiting, T., Genton, M. and Guttorp, P. (2006). "Geostatistical space-time models, stationarity, separability and full symmetry", Statistical Methods for Spatio-temporal Systems (Monographs on Statistics and Applied Probability), B. Finkenstadt, L. Held and V. Isham, editors, Chapman & Hall/CRC.
[30] Green, P. (1990). "On use of the EM algorithm for penalized likelihood estimation", J. Roy. Statist. Soc. B 52, 443-452.
[31] Green, P. (1999). "Penalized likelihood", Encyclopedia of Statistical Sciences 3, 578-586.
[32] Griffiths, A.J., Wessler, S.R., Lewontin, R.C., Gelbart, W.G., Suzuki, D.T. and Miller, J.H. (2005). Introduction to Genetic Analysis, W.H. Freeman and Company, New York.
[33] Haldane, J.B.S. (1919). "The combination of linkage values and the calculation of distance between the loci of linked factors", Journal of Genetics 8, 299-309.
[34] Haley, C.S., Knott, S.A. and Elsen, J.M. (1994). "Genetic mapping of quantitative trait loci in crosses between outbred lines using least squares", Genetics 136, 1195-1207.
[35] Hoerl, A. and Kennard, R. (1970). "Ridge regression: biased estimation for nonorthogonal problems", Technometrics 12, 55-67.
[36] Hoeschele, I. (2000). "Mapping quantitative trait loci in outbred pedigrees". In: Handbook of Statistical Genetics, edited by D.J. Balding, M. Bishop and C. Cannings. Wiley, New York. 567-597.
[37] Huang, H.C., Martinez, F., Mateu, J. and Montes, F. (2007a). "Model comparison and selection for stationary space-time models", Comp. Statistics and Data Analysis 51, 4577-4596.
[38] Huang, J., Liu, L. and Liu, N. (2007b). "Estimation of large covariance matrices of longitudinal data with basis function approximations", J. Comput. Graph. Statist. 16, 189-209.
[39] Huang, J., Liu, N., Pourahmadi, M. and Liu, L. (2006). "Covariance selection and estimation via penalised normal likelihood", Biometrika 93, 85-98.
[40] Ibanez, M.V. and Simo, A. (2007). "Spatio-temporal modeling of perimetric test data", Statistical Methods in Medical Research 16, no. 6, 497-522.
[41] Jansen, R.C. (2000). "Quantitative trait loci in inbred lines". In: Handbook of Statistical Genetics, edited by D.J. Balding, M. Bishop and C. Cannings. Wiley, New York. 567-597.
[42] Jansen, R.C. and Stam, P. (1994). "High resolution of quantitative traits into multiple loci via interval mapping", Genetics 136, 1447-1455.
[43] Jones, R.H. and Zhang, Y. (1997). "Models for continuous stationary space-time processes", In Modelling Longitudinal and Spatially Correlated Data, Lecture Notes in Statistics 122, Springer, New York, 289-298.
[44] Kao, C.H., Zeng, Z-B. and Teasdale, R.D. (1999). "Multiple interval mapping for quantitative trait loci", Genetics 152, 1203-1216.
[45] Knott, S.A., Neale, D.B., Sewell, M.M. and Haley, C.S. (1997). "Multiple marker mapping of quantitative trait loci in an outbred pedigree of loblolly pine", Theor. Appl. Genet. 94, 810-820.
[46] Kramer, M.G., Vaughn, T.T., Pletscher, L.S., King-Ellison, K., Erikson, C. and Cheverud, J.M. (1998). "Genetic variation in body weight growth and composition in the intercross of large (LG/J) and small (SM/J) inbred strains of mice", Genetics and Molecular Biology 21, 211-218.
[47] Kenward, M.G. (1987). "A method for comparing profiles of repeated measurements", Appl. Statist. 36, 296-308.
[48] Kingsolver, J.G., Izem, R. and Ragland, G.J. (2004). "Plasticity of size and growth in fluctuating thermal environments: comparing reaction norms and performance curves", Integr. Comp. Biol. 44, 450-460.
[49] Kirkpatrick, M. and Heckman, N. (1989). "A quantitative genetic model for growth, shape, reaction norms, and other infinite-dimensional characters", J. Math. Biol. 27, 429-450.
[50] Kolovos, A., Christakos, G., Hristopulos, D.T. and Serre, M.L. (2004). "Methods for generating non-separable spatiotemporal covariance models with potential environmental applications", Advances in Water Resources 27, 815-830.
[51] Krishnaiah, P. (1985). Multivariate Analysis, Elsevier Science Publishers B.V., New York.
[52] Lander, E.S. and Botstein, D. (1989). "Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps", Genetics 121, 185-199.
[53] Ledoit, O. and Wolf, M. (2003). "A well-conditioned estimator for large-dimensional covariance matrices", Journal of Multivariate Analysis 88, 365-411.
[54] Levina, E., Rothman, A. and Zhu, J. (2008). "Sparse estimation of large covariance matrices via a nested lasso penalty", Ann. Appl. Statist. 2, no. 1, 245-263.
[55] Li, H.Y., Kim, B-R. and Wu, R.L. (2006). "Identification of quantitative trait nucleotides that regulate cancer growth: A simulation approach", J. Theor. Biol. 242, 426-439.
[56] Li, Y., Wang, N., Hong, M., Turner, N., Lupton, J. and Carroll, R. (2007). "Nonparametric estimation of correlation functions in longitudinal and spatial data, with applications to colon carcinogenesis experiments", Annals of Statistics 35, no. 4, 1608-1643.
[57] Lin, M., Li, H.Y., Hou, W., Johnson, J.A. and Wu, R.L. (2007). "Modeling sequence-sequence interactions for drug response", Bioinformatics 23, no. 10, 1251-1257.
[58] Lin, M., Lou, X-Y., Chang, M. and Wu, R.L. (2003). "A general statistical framework for mapping quantitative trait loci in non-model systems: Issue for characterizing linkage phases", Genetics 165, 901-913.
[59] Lindley, D.V. (1957). "A statistical paradox", Biometrika 44, 187-192.
[60] Liu, T., Liu, X-L., Chen, Y.M. and Wu, R.L. (2007). "A unifying differential equation model for functional genetic mapping of circadian rhythms", Theor. Biol. Medical Model. 4, 5.
[61] Liu, T. and Wu, R.L. (2007). "A general Bayesian framework for functional mapping of dynamic complex traits", Genetics (tentatively accepted 2007).
[62] Liu, T., Zhao, W., Tian, L. and Wu, R.L. (2005). "An algorithm for molecular dissection of tumor progression", J. Math. Biol. 50, 336-354.
[63] Long, F., Chen, Y.Q., Cheverud, J.M. and Wu, R.L. (2006). "Genetic mapping of allometric scaling laws", Genet. Res. 87, 207-216.
[64] Lynch, M. and Walsh, B. (1998). Genetics and Analysis of Quantitative Traits, Sinauer, Sunderland, MA.
[65] Ma, C. (2007). "Recent developments on the construction of spatial-temporal covariance models", Stoch Environ Res Risk Assess, Springer-Verlag, 22, s39-s47.
[66] Ma, C., Casella, G. and Wu, R.L. (2002). "Functional mapping of quantitative trait loci underlying the character process: A theoretical framework", Genetics 161, 1751-1762.
[67] Madsen, H. and Thyregod, P. (2001). "Calibration with absolute shrinkage", J. Chemomet. 15, 497-509.
[68] Mallows, C.L. (1973). "Some Comments on Cp", Technometrics 15, 661-675.
[69] Matérn, B. (1986). Spatial Variation, Springer, New York, 2nd ed.
[70] McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models, Chapman and Hall, London.
[71] McLachlan, G. and Peel, D. (2000). Finite Mixture Models, John Wiley and Sons, Inc., New York.
[72] Meng, X-L. and Rubin, D. (1993). "Maximum likelihood estimation via the ECM algorithm: A general framework", Biometrika 80, 267-278.
[73] Mitchell, M.W., Genton, M.G. and Gumpertz, M.L. (2005). "Testing for separability of space-time covariances", Environmetrics 16, 819-831.
[74] Molenberghs, G. and Verbeke, G. (2005). Models for Discrete Longitudinal Data, Springer Science+Business Media, Inc., New York.
[75] Myers, R. (1990). Classical and Modern Regression with Applications, PWS-Kent Publishing Company, Boston.
[76] Nelder, J.A. and Mead, R. (1965). "A simplex method for function minimization", Comput. J. 7, 308-313.
[77] Newton, H.J. (1988). TIMESLAB: A Time Series Analysis Laboratory, Wadsworth & Brooks/Cole, Pacific Grove, CA.
[78] Niklas, K.J. (1994). Plant Allometry: The Scaling of Form and Process, University of Chicago, Chicago.
[79] Nychka, D., Wikle, C. and Royle, A. (2002). "Multiresolution models for nonstationary spatial covariance functions", Statistical Modeling 2, 315-331.
[80] Ojelund, H., Madsen, H. and Thyregod, P. (2001). "Calibration with absolute shrinkage", J. Chemomet. 15, 497-509.
[81] Pan, J.X. and Mackenzie, G. (2003). "On modelling mean-covariance in longitudinal studies", Biometrika 90, 239-244.
[82] Porcu, E. and Mateu, J. (2006). "Nonseparable stationary anisotropic space-time covariance functions", Stoch Environ Res. Risk Assess 21, 113-122.
[83] Porcu, E., Mateu, J., Zini, A. and Pini, R. (2007). "Modelling spatio-temporal data: A new variogram and covariance structure proposal", Statistics and Probability Letters 77, 83-89.
[84] Pourahmadi, M. (1999). "Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterization", Biometrika 86, 677-690.
[85] Pourahmadi, M. (2000). "Maximum likelihood estimation of generalised linear models for multivariate normal covariance matrix", Biometrika 87, 425-435.
[86] Ramsay, J.O. and Silverman, B.W. (1997). Functional Data Analysis, Springer-Verlag, New York.
[87] Rothman, A., Bickel, P., Levina, E. and Zhu, J. (2007). "Sparse permutation invariant covariance estimation", Dept. of Statistics, Univ. of Michigan (Technical Report no. 467).
[88] Sampson, P. and Guttorp, P. (1992). "Nonparametric estimation of nonstationary spatial covariance structure", J. Am. Statist. Assoc. 87, 108-119.
[89] Satagopan, J.M., Yandell, B.S., Newton, M.A. and Osborn, T.C. (1996). "A Bayesian approach to detect quantitative trait loci using Markov chain Monte Carlo", Genetics 144, 805-816.
[90] Sax, K. (1923). "The association of size difference with seed-coat pattern and pigmentation in Phaseolus vulgaris", Genetics 8, 552-560.
[91] Schabenberger, O. and Gotway, C. (2005). Statistical Methods for Spatial Data Analysis, Chapman and Hall/CRC, Boca Raton.
[92] Schwarz, G. (1978). "Estimating the dimension of a model", Annals of Statistics 6(2), 461-464.
[93] Sillanpaa, M.J. and Arjas, E. (1999). "Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data", Genetics 151, 1605-1619.
[94] Smith, M. and Kohn, R. (2002). "Parsimonious covariance matrix estimation for longitudinal data", J. Am. Statist. Assoc. 97, no. 460, 1141-1153.
[95] Stein, M. (2005). "Space-time covariance functions", J. Am. Statist. Assoc. 100, no. 469, 310-321.
[96] Stratton, D. (1998). "Reaction norm functions and QTL-environment interactions for flowering time in Arabidopsis thaliana", Heredity 81, 144-155.
[97] Stroud, J. (2001). "Dynamic models for spatiotemporal data", J. R. Statist. Soc. B 63, 673-698.
[98] Tibshirani, R. (1996). "Regression shrinkage and selection via the Lasso", J. Roy. Statist. Soc. B 58, 267-288.
[99] Vaughn, T., Pletscher, S., Peripato, A., King-Ellison, K., Adams, E., Erikson, C. and Cheverud, J. (1999). "Mapping of quantitative trait loci for murine growth: A closer look at genetic architecture", Genet. Res. 74, 313-322.
[100] Thornley, J.H.M. and Johnson, I.R. (1990). Plant and Crop Modelling: A Mathematical Approach to Plant and Crop Physiology, Clarendon Press, Oxford.
[101] Waller, L. and Gotway, C. (2004). Applied Spatial Statistics for Public Health Data, Wiley-Interscience, Hoboken, N.J.
[102] Wang, Z., Hou, W. and Wu, R.L. (2006). "A statistical model to analyse quantitative trait locus interactions for HIV dynamics from the virus and human genomes", Statist. Med. 25, 495-511.
[103] Wang, Z. and Wu, R.L. (2004). "A statistical model for high-resolution mapping of quantitative trait loci determining HIV dynamics", Statist. Med. 23, 3033-3051.
[104] Weiss, R. (2005). Modeling Longitudinal Data, Springer-Verlag, New York.
[105] West, G.B., Brown, J.H. and Enquist, B.J. (2001). "A general model for ontogenetic growth", Nature 413, 628-631.
[106] Wolf, J.B. (2002). "The geometry of phenotypic evolution in developmental hyperspace", Proceedings of the National Academy of Sciences of the USA 99, 15849-15851.
[107] Wong, F., Carter, C.K. and Kohn, R. (2003). "Efficient estimation of covariance selection models", Biometrika 90, 809-830.
[108] Wu, J., Zeng, Y., Huang, J., Hou, W., Zhu, J. and Wu, R.L. (2007a). "Functional mapping of reaction norms to multiple environmental signals", Genet. Res. Camb. 89, 27-38.
[109] Wu, R.L., Hou, W., Cui, Y., Li, H.Y., Wu, S., Ma, C-X. and Zeng, Y. (2007b). "Modeling the genetic architecture of complex traits with molecular markers", Recent Patents on Nanotechnology 1, 41-49.
[110] Wu, R.L., Ma, C-X. and Casella, G. (2007c). Statistical Genetics of Quantitative Traits: Linkage, Maps, and QTL, Springer-Verlag, New York.
[111] Wu, R.L., Ma, C-X., Lin, M. and Casella, G. (2004a). "A general framework for analyzing the genetic architecture of developmental characteristics", Genetics 166, 1541-1551.
[112] Wu, R.L., Ma, C-X., Lin, M., Wang, Z. and Casella, G. (2004b). "Functional mapping of quantitative trait loci underlying growth trajectories using a transform-both-sides logistic model", Biometrics 60, 729-738.
[113] Wu, R.L., Ma, C-X., Littell, R. and Casella, G. (2002). "A statistical model for the genetic origin of allometric scaling laws in biology", J. Theor. Biol. 217, 275-287.
[114] Wu, W.B. and Pourahmadi, M. (2003). "Nonparametric estimation of large covariance matrices of longitudinal data", Biometrika 90, 831-844.
[115] Wu, S., Yang, J. and Wu, R.L. (2007d). "Semiparametric functional mapping of quantitative trait loci governing long-term HIV dynamics", Bioinformatics 23, 569-576.
[116] Xu, S.Z. (1996). "Mapping quantitative trait loci using four-way crosses", Genet. Res. 68, 175-181.
[117] Xu, S.Z. and Yi, N.J. (2000). "Mixed model analysis of quantitative trait loci", Proc. Natl. Acad. Sci. USA 97, 14542-14547.
[118] Yang, J. (2006). Nonparametric functional mapping of quantitative trait loci, Ph.D. Dissertation, Department of Statistics, University of Florida.
[119] Yang, R.Q., Gao, H.J., Wang, X., Zheng, Z-B. and Wu, R.L. (2007). "A semiparametric model for composite functional mapping of dynamic quantitative traits", Genetics 177, 1859-1870.
[120] Yap, J.S., Wang, C.G. and Wu, R.L. (2007). "A simulation approach for functional mapping of quantitative trait loci that regulate thermal performance curves", PLoS ONE 2(6), e554.
[121] Yuan, M. and Lin, Y. (2007). "Model selection and estimation in the Gaussian graphical model", Biometrika 94(1), 19-35.
[122] Zeng, Z-B. (1994). "Precision mapping of quantitative trait loci", Genetics 136, 1457-1468.
[123] Zhao, W. (2005a). Statistical modelling for functional mapping of longitudinal and multiple longitudinal traits: structured antedependence model and wavelet dimensionality reduction, Ph.D. Dissertation, Department of Statistics, University of Florida.
[124] Zhao, W., Chen, Y., Casella, G., Cheverud, J.M. and Wu, R.L. (2005b). "A non-stationary model for functional mapping of complex traits", Bioinformatics 21, 2469-2477.
[125] Zhao, W., Ma, C-X., Cheverud, J.M. and Wu, R.L. (2004). "A unifying statistical model for QTL mapping of genotype-sex interaction for developmental trajectories", Physiol. Genomics 19, 218-227.
[126] Zimmerman, D. and Núñez-Antón, V. (2001). "Parametric modeling of growth curve data: An overview (with discussions)", Test 10, 1-73.
John Stephen F. Yap was born in the town of Tagoloan, Misamis Oriental, Philippines to Rhoda and Lizardo Yap. He has an older brother, Mark. John earned a B.S. in mathematics from Ateneo de Manila University in the Philippines and upon graduation worked as an actuarial assistant at Watson Wyatt. He also spent a year as an assistant instructor in the Mathematics Department of Ateneo de Manila University. John obtained an M.S. in mathematics with emphasis in actuarial science from the University of Minnesota in Minneapolis and a Ph.D. in statistics from the University of Florida in Gainesville. He will work at the Food and Drug Administration as a mathematical statistician.