Bayesian Methods for Modeling Dependence Structures in Longitudinal Data

CHAPTER 1
INTRODUCTION

When working with longitudinal (or time-ordered) data, specifying and/or modeling the dependence structure is of prime importance. In most situations the methods of linear and generalized linear models can be adapted to describe the mean structure. As longitudinal data consist of multiple measurements within an experimental unit, methods to handle the dependence across the measurements become necessary.

Here, we distinguish longitudinal data analysis from the related field of multivariate analysis (see, e.g., Anderson 1984). We define longitudinal data to be a multivariate response for an experimental unit (patient or case) where the responses are measurements of the same outcome at different time points, e.g., whether a patient smokes during a given week over the course of 8 weeks, or the number of depression symptoms in a week over 16 weeks. Data of this type have a clear ordering in time, whereas general multivariate data may not. One should note that the methods we devise here, and most in the longitudinal data literature, make use of assumptions and intuition that are only reasonable when the measurements follow a fixed ordering and may not be appropriate in more general multivariate situations.

It is not uncommon for statisticians to mistakenly assume that estimation of the covariance parameters plays a secondary role to the mean estimation. In fact, they should generally be treated jointly. In situations with complete data (i.e., all patients are observed at all observation times) and multivariate normality, the mean and covariance parameters are orthogonal in the sense of Cox & Reid (1987), and the estimates of the mean parameters will be consistent under misspecification of the dependence structure. However, when one analyzes real-world longitudinal data such as that from a clinical trial, the data will usually exhibit some amount of missingness. If there is missingness in the data, there is no longer orthogonality between mean and dependence parameters, even at the true value of the covariance matrix (Little & Rubin 2002). Hence, for the
posterior distribution of the mean parameters to be consistent, the dependence structure must be correctly specified. So, even in the missing at random case (MAR; Daniels & Hogan 2008), where missingness depends only on the observed values and not on the unobserved data, it is no longer appropriate to treat the covariance structure as a nuisance parameter. In this case biased mean estimates can result if we do not use the correct model for the dependence (Daniels & Hogan 2008, Section 6.2).

To further motivate the necessity of methodology for improved covariance estimation, note that even in the complete data case where the mean and dependence are asymptotically independent under normality, efficiency gains may be possible for small or moderately sized data sets. Through four simulation examples with relatively small sample sizes, Cripps et al. (2005) demonstrated improvements in estimating regression coefficients, fitted values, and the predictive density using the Wong et al. (2003) covariance selection prior over a more dispersed covariance prior choice.

In this dissertation we will develop Bayesian methods to model longitudinal dependence structures. These depend on the specification of an appropriate prior distribution for the covariance matrix or its parameters. A key consideration in developing these priors is the ability to incorporate them as part of a Markov chain Monte Carlo (MCMC) scheme to obtain posterior inference. We also want the priors to be structured such that they are centered at intuitive prior beliefs specific to longitudinal data, such as decreasing dependence across time or positive (rather than negative) correlation. Additionally, we desire priors that promote sparse (or lower-dimensional) parameterizations of the dependence structure.

Throughout we consider two important situations in longitudinal data. First, in Chapter 2, we look at situations where the dependence structure is constrained to be a correlation matrix for identifiability. This complication is encountered in multivariate probit models (Chib & Greenberg 1998), Gaussian copula regression (Pitt et al. 2006), and certain latent variable models (Daniels & Normand 2006), among others. The other
problem we consider, in Chapters 3 and 4, is that of jointly estimating multiple covariance matrices. Often longitudinal data may be viewed as composed of several groups, each of which may need its own covariance structure. It is desirable to develop methodology to jointly estimate these covariance matrices, allowing sharing of information. We now review the important contributions to the literature for each of these problems.

1.1 Literature Review for Correlation Estimation and Prior Distributions

First, we explore the problem of modeling a correlation matrix. Consider a mean-zero, $J$-dimensional random vector $Y$ with correlation matrix $R$. There are two main constraints on the $J \times J$ matrix $R$: the diagonal elements must all equal one, and $R$ must be positive definite. Let $\mathcal{R}_J$ denote the set of all real matrices that satisfy these requirements. Positive definiteness generally provides the greatest difficulty in analysis, as the set of values for a particular element $\rho_{ij}$ that satisfy the positive-definite constraint depends on the choice of the remaining elements of $R$. Additionally, because the number of parameters in $R$ is quadratic in the dimension $J$, methods to find a parsimonious or lower-dimensional structure can be beneficial.

One of the earliest attempts to find lower-dimensional structure for the more general problem of estimating a covariance matrix is the idea of covariance selection (Dempster 1972). By setting some of the off-diagonal elements of the concentration matrix $\Omega = \Sigma^{-1}$ to zero, a more parsimonious choice for the covariance matrix of $Y$ is achieved. A zero in the $(i,j)$-th position of $\Omega$ implies zero correlation (and further, independence under multivariate normality) between $Y_i$ and $Y_j$, conditional on the remaining components of $Y$. This property, along with its relation to graphical model theory (Lauritzen 1996), has led to the use of covariance selection as a standard part of analysis in multivariate problems (for instance, Rothman et al. 2008; Wong et al. 2003; Yuan & Lin 2007). However, one must be cautious when using such selection methods, as not all produce positive definite estimators. For instance, the estimator obtained by thresholding the sample correlation matrix will not necessarily be positive definite (Bickel & Levina 2008).
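
As a small numerical illustration of this caution, the following Python sketch hard-thresholds a made-up 3 x 3 correlation matrix and checks positive definiteness by attempting a Cholesky factorization. The matrix entries, the cutoff of 0.5, and the helper names `is_positive_definite` and `hard_threshold` are illustrative assumptions and are not taken from Bickel & Levina (2008).

```python
import math


def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    # A non-positive pivot rules out positive definiteness.
                    return False
                lower[i][i] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


def hard_threshold(matrix, cutoff):
    """Zero every off-diagonal entry whose absolute value is below the cutoff."""
    n = len(matrix)
    return [
        [matrix[i][j] if i == j or abs(matrix[i][j]) >= cutoff else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Illustrative (made-up) correlation matrix; it is positive definite.
corr = [
    [1.00, 0.75, 0.40],
    [0.75, 1.00, 0.75],
    [0.40, 0.75, 1.00],
]

# Thresholding removes only the lag-2 entry of 0.40.
thresholded = hard_threshold(corr, cutoff=0.5)

print("original positive definite:", is_positive_definite(corr))
print("thresholded positive definite:", is_positive_definite(thresholded))
```

In this example the thresholded matrix has determinant $1 - 2(0.75)^2 = -0.125$, so it cannot be positive definite even though the matrix it was derived from is a valid correlation matrix.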

Even in situations where the focus is on a general covariance matrix, model specification may depend on the correlation structure through the so-called separation strategy (Barnard et al. 2000). The separation strategy involves reparameterizing $\Sigma$ by $\Sigma = SRS$, with $S$ a diagonal matrix containing the marginal standard deviations of $Y$ and $R$ the correlation matrix. Separation can also be performed on the concentration matrix, $\Omega = TCT$, so that $T$ is diagonal and $C \in \mathcal{R}_J$. The diagonal elements of $T$ give the partial standard deviations, while the elements $c_{ij}$ of $C$ are the (full) partial correlations. The covariance selection problem for $\Omega$ is equivalent to choosing elements of the partial correlation matrix $C$ to be null. Several authors have constructed priors to efficiently estimate $\Sigma$ by allowing $C$ to be a sparse matrix (Carter et al. 2011; Wong et al. 2003).

In many cases the full partial correlation matrix may not be convenient to use. When the covariance matrix is fixed to be a correlation matrix, such as for the multivariate probit model, the elements $T$ and $C$ of the concentration matrix are constrained to maintain a unit diagonal for $\Sigma$. This is easy to see since $\Sigma = R$ and $C$ each have $J(J-1)/2$ parameters, but $T$ adds an additional $J$ parameters. Additionally, interpretation of parameters in the full partial correlation matrix can be challenging, particularly for longitudinal settings, as the partial correlations are defined conditional on future values. For example, $c_{12}$ gives the correlation between $Y_1$ and $Y_2$ conditional on the future measurements $Y_3, \ldots, Y_J$. A further issue with Bayesian methods that promote sparsity in $C$ is calculating the volume of the space of correlation matrices with a fixed zero pattern; see Section 2.4.2 for details.

Previous Bayesian solutions are concerned with choices of an appropriate prior distribution $p(R)$ on $\mathcal{R}_J$. Commonly used priors include placing equal weight on all elements of $\mathcal{R}_J$ (Barnard et al. 2000) and Jeffreys' prior $p(R) \propto |R|^{-(J+1)/2}$. In these cases the sampling steps for $R$ can sometimes benefit from parameter expansion techniques (Liu 2001; Liu & Daniels 2006; Zhang et al. 2006). Liechty et al. (2004) also develop a correlation matrix prior by specifying each element $\rho_{ij}$ of $R$ as an
independent normal subject to $R \in \mathcal{R}_J$. Pitt et al. (2006) extend the covariance selection prior (Wong et al. 2003) to the correlation matrix case by fixing the elements of $T$ to be constrained by $C$ so that $T$ is the diagonal matrix such that $R = (TCT)^{-1}$ has unit diagonal.

The difficulty of jointly dealing with the positive-definite and unit diagonal constraints of a correlation matrix has led some researchers to consider priors for $R$ based on the partial autocorrelations (PACs). The PAC between $Y_i$ and $Y_j$ ($i$

is one such method (Carter et al. 2011; Pitt et al. 2006; Wong et al. 2003). Other non-Bayesian techniques that encourage sparse versions of the dependence structure include banding the sample covariance or concentration matrix (Bickel & Levina 2008), using a lasso-type penalty on the elements of $\Omega$ (Mazumder & Hastie 2012; Meinshausen & Bühlmann 2006) or the elements of the Cholesky decomposition of $\Omega$ (Rothman et al. 2008), and banding the Cholesky decomposition of $\Sigma$ (Rothman et al. 2010). Again, one must be cautious when using selection methods, as not all produce positive definite estimators.

As previously mentioned, the ideas from covariance selection are connected to those of graphical methods. Bayesian methods for graphical models generally consist of fixing the zero structure of $\Omega$ (or $\Sigma$) to represent a particular graph $G$. A prior for $\Omega$ ($\Sigma$) is then constructed to range over the space of concentration (covariance) matrices with the appropriate zero structure. There are many such priors with varying degrees of flexibility (Dawid & Lauritzen 1993; Khare & Rajaratnam 2011; Letac & Massam 2007; Rajaratnam et al. 2008). Giudici & Green (1999) propose a hierarchical prior where the graph $G$ is assumed random over the space of decomposable graphs. Most graphical methods do not incorporate the ordering of the responses, and consequently, $Y_1$ and $Y_2$ are as likely to be uncorrelated as are $Y_1$ and $Y_J$. This is undesirable for longitudinal data, and so we do not further consider the graphical model methodology.

Bayesian factor models provide another option to develop lower-dimensional specifications of $\Sigma$. The most common of these models factor the covariance matrix as $\Sigma = \Lambda\Lambda' + D$, where $D$ is diagonal (sometimes of the form $\sigma^2 I$) and $\Lambda$ is $J \times k$ with $k$

exhibit identifiability problems when used as part of an MCMC analysis (Lopes & West 2004).

A parameterization based on the Cholesky decomposition of $\Sigma$ has been proposed by Pourahmadi (1999, 2000). In this parameterization the covariance matrix depends on two sets of parameters: $\Phi$, the generalized autoregressive parameters (GARPs), and $\Gamma$, the set of innovation variances (IVs), such that

$$
\Sigma(\Phi,\Gamma)^{-1} = T(\Phi)\, D(\Gamma)\, T(\Phi)' =
\begin{bmatrix}
1 & -\phi_{12} & -\phi_{13} & \cdots \\
  & 1 & -\phi_{23} & \cdots \\
  &   & 1 & \cdots \\
  &   &   & \ddots
\end{bmatrix}
\begin{bmatrix}
1/\gamma_1 & & & \\
 & 1/\gamma_2 & & \\
 & & \ddots & \\
 & & & 1/\gamma_J
\end{bmatrix}
\begin{bmatrix}
1 & & & \\
-\phi_{12} & 1 & & \\
-\phi_{13} & -\phi_{23} & 1 & \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}.
$$

The $T(\Phi)$ matrix is upper-triangular with ones on its diagonal. Note that there are $J$ parameters for each $\Gamma = (\gamma_1, \ldots, \gamma_J)$ and $K = J(J-1)/2$ parameters associated with each $\Phi = (\phi_{12}, \ldots, \phi_{J-1,J})$. One of the key motivations behind the use of the modified Cholesky decomposition is that the only constraint on $\Phi$ and $\Gamma$ needed to guarantee that $\Sigma(\Phi,\Gamma)$ is positive definite is that $\gamma_j > 0$ for all $j = 1, \ldots, J$. Additionally, the GARPs and IVs are interpreted as parameters from sequential regressions; i.e., $E\{Y_j \mid y_1, \ldots, y_{j-1}\} = \phi_{1j} y_1 + \cdots + \phi_{j-1,j} y_{j-1}$ and $\mathrm{Var}\{Y_j \mid y_1, \ldots, y_{j-1}\} = \gamma_j$ (assuming without loss of generality that $Y$ has mean zero). Again, the dependence on the ordering of the components of $Y$ is clear. The interpretability of the GARPs relies on an assumed order of the $J$ components of $Y$, which is natural in the case of longitudinal measurements.

Priors for $\Sigma$ can then be formed by specifying priors on $\Phi$ and $\Gamma$. Daniels & Pourahmadi (2002) develop priors that exploit conjugacy for the GARPs and IVs. Smith & Kohn (2002) form parsimonious priors that stochastically set elements of $\Phi$ to zero. Note that $\phi_{jk} = 0$ ($j$

$Y_1, \ldots, Y_{j-1}, Y_{j+1}, \ldots, Y_{k-1}$, so that the sparsity is interpretable as an independence relationship. Other Bayesian methods for covariance priors include specifying the prior in terms of the matrix logarithm of $\Sigma$ so that it is unconstrained (Leonard & Hsu 1992), priors based on the spectral decomposition (Daniels & Kass 1999), a flat prior on $\Sigma$ or Jeffreys' prior $p(\Sigma) \propto |\Sigma|^{-(J+1)/2}$, and the reference prior (Yang & Berger 1994).

1.3 Literature Review for Simultaneous Covariance Estimation and Prior Distributions

In Section 1.2 we described techniques to model a single covariance matrix. A related challenge is simultaneously estimating multiple covariance matrices. Frequently in longitudinal problems the data are composed of several groups, such as differing treatments in a clinical trial. In many cases, particularly if one does not have many observations per group, one assumes that the covariance (or correlation) structure is constant across all groups. However, this assumption, if it fails to hold, can have a dramatic effect on the inference of mean effects, even leading to bias if data are incomplete (Daniels & Hogan 2008). Conversely, if one specifies each of the covariance matrices without regard to the other groups, this can lead to a loss of information. So it is important to find methods that strike a middle ground between these two extremes by sharing information about the dependence across groups.

Let $\Sigma_m$ denote the covariance matrix for group $m$ ($m = 1, \ldots, M$), and consider the collection $\{\Sigma_1, \ldots, \Sigma_M\}$ of covariance matrices. Many authors have developed frequentist estimators for this collection by inducing commonality among some feature of the $\Sigma_m$. Boik (2002, 2003) proposed models to induce structure by imposing commonality on some (or all) of the principal components of the covariance or correlation matrix. Others have used the variance-correlation decomposition for estimation by imposing structures such as proportionality of all $\Sigma_m$ or commonality among the correlation matrices (Manly & Rayner 1987). Pourahmadi et al. (2007)
developed estimation and testing procedures for equality among the GARPs and subsets of the GARPs. In a clustering context, McNicholas & Murphy (2010) advocate a similar covariance estimation procedure that includes banding the $T(\Phi)$ matrices. Daniels (2006) considered a Bayesian perspective by introducing priors for the GARPs and IVs, as well as the principal components of the covariance matrices, that induce pooling across groups. Unfortunately, it is computationally challenging to select among all the possible models within these classes. Hoff (2009) proposed a hierarchical eigenmodel for the $\Sigma_m$ that pools the eigenvectors of each group toward a common structure.

Guo et al. (2011) consider an automated approach using the lasso to estimate sparse graphical models by selecting sets of edges common to all groups, as well as group-specific edges. In the longitudinal data setting we wish to find more covariance structure than just common zeros across all groups. We want to consider models that allow subsets of the model parameters to be equal across (a subset of the) groups at non-zero values; the Guo et al. estimators do not accommodate this goal. Danaher et al. (2012) extend this work to allow the elements of $\Omega_m$ to be equal to a single non-zero value, zero, or distinct across groups. But this still does not allow for a structure containing subsets of groups. We additionally note that it is not clear how one could easily adapt either penalty term into a Bayesian prior on the set of covariance matrices for our setting.

Other techniques have been proposed that model the covariance matrix as a function of one or more continuous covariates. A number of models of this flavor have been developed that are specified through regressions on the GARPs and/or IVs (Daniels 2006; Pourahmadi 1999, 2000; Pourahmadi & Daniels 2002). Chiu et al. (1996) devise such a model by regressing onto $\log(\Sigma_m)$. Other regression frameworks include models on the PACs and marginal variances (Wang & Daniels 2013a), regressing within a factor model (Fox & Dunson 2011), and a model based on factorizing $\Sigma_m$ into a baseline matrix $B$ plus a rank-one term that is quadratic in the covariate vector $x_m$ (Hoff & Niu 2012). Additional methods treat
the covariance matrices as realizations of a stochastic volatility process (Lopes et al. 2011; Philipov & Glickman 2006a,b). However, covariance regression models are often plagued by the difficulty of interpreting the regression parameters.

To form our priors on the $\Sigma_m$ we will make use of the modified Cholesky parameterization because of the unrestrictedness of the parameters, the interpretability for longitudinal data, and the computational advantages via conjugacy (Daniels & Pourahmadi 2002). Our goal is to develop priors for the set of GARPs and IVs in such a way that we borrow strength across the $M$ groups. Additionally, we want to share information across $\Gamma_m$ and $\Phi_m$ values ($\Sigma_m = \Sigma(\Phi_m, \Gamma_m)$, $m = 1, \ldots, M$), particularly those GARPs of a common lag. Another consideration for prior development is to encourage sparsity of the elements of $T(\Phi_m)$, that is, for $T(\Phi_m)$ to contain few non-zero elements. Because each GARP represents a conditional dependency, setting $\phi_{m;jk}$ to zero establishes a conditional independence relationship between a pair of components of $Y$. It is necessary to consider priors that allow the data to inform the balance between these two goals: pooling across groups and introducing sparsity. Above all, we seek to accomplish this in an automated, stochastic fashion.

We propose two solutions to this problem. In Chapter 3 we develop a nonparametric prior based on the matrix stick-breaking process (Dunson et al. 2008). This prior allows for clustering of the GARPs simultaneously across groups $m$ and across the $\phi_{ij}$'s of a common lag $j - i$, while allowing some GARPs to be identically zero (implying a conditional independence). The second solution to the simultaneous estimation problem, presented in Chapter 4, considers clustering based on the $t$-th column of $T(\Phi_m)$ and $\gamma_{m;t}$, which defines collections of the groups $1, \ldots, M$ that have the same dependence parameters for the distribution of $Y_t$ given $Y_1, \ldots, Y_{t-1}$. The equality relationships in this covariance partition prior are specified directly through a Markov chain on the sequence of partitions of $1, \ldots, M$.
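
To make the sequential-regression reading of the GARPs and IVs concrete, the following Python sketch recovers them from a given covariance matrix by orthogonalizing the components of Y in their time order. It is only an illustration under stated assumptions: the function name `modified_cholesky`, the Gram-Schmidt style of computation, and the AR(1) test matrix are hypothetical choices made for exposition, not the estimation procedure developed in later chapters.

```python
def modified_cholesky(sigma):
    """Return (garps, ivs) for a covariance matrix given as a list of lists.

    garps[i][j] (i < j) is the coefficient of Y_i in the regression of Y_j
    on its predecessors; ivs[j] is the corresponding innovation variance.
    """
    size = len(sigma)
    # rows[j] holds coefficients a_j such that eps_j = sum_k a_j[k] * Y_k
    # is the innovation of Y_j (uncorrelated with all earlier innovations).
    rows = []
    ivs = []
    for j in range(size):
        coefs = [0.0] * size
        coefs[j] = 1.0
        for k in range(j):
            # Cov(Y_j, eps_k), computed from sigma and the stored coefficients.
            cov = sum(sigma[j][m] * rows[k][m] for m in range(size))
            weight = cov / ivs[k]
            for m in range(size):
                coefs[m] -= weight * rows[k][m]
        # Var(eps_j) = a_j' sigma a_j is the j-th innovation variance.
        variance = sum(
            coefs[m] * sigma[m][n] * coefs[n]
            for m in range(size)
            for n in range(size)
        )
        rows.append(coefs)
        ivs.append(variance)
    # eps_j = Y_j - sum_{i<j} phi_ij Y_i, so phi_ij = -rows[j][i].
    garps = [
        [-rows[j][i] if i < j else 0.0 for j in range(size)]
        for i in range(size)
    ]
    return garps, ivs


# Illustrative AR(1) covariance matrix with correlation 0.7 and unit variances.
ar1 = [[0.7 ** abs(i - j) for j in range(4)] for i in range(4)]
garps, ivs = modified_cholesky(ar1)

for row in garps:
    print([round(value, 4) for value in row])
print([round(value, 4) for value in ivs])
```

For an AR(1) structure only the lag-1 GARPs are non-zero, which is the kind of sparsity in $T(\Phi)$ that the priors discussed above are meant to encourage.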

CHAPTER 2
SPARSE PRIOR DISTRIBUTIONS FOR CORRELATION MATRICES THROUGH THE PARTIAL AUTOCORRELATIONS

2.1 Bayesian Correlation Estimation

Determining the structure of an unknown $J \times J$ covariance matrix $\Sigma$ is a long-standing statistical challenge. A key difficulty in dealing with the covariance matrix is the positive definiteness constraint. This is because the set of values for a particular element $\sigma_{ij}$ that yield a positive definite $\Sigma$ depends on the choice of the remaining elements of $\Sigma$. Additionally, because the number of parameters in $\Sigma$ is quadratic in the dimension $J$, methods to find a parsimonious (lower-dimensional) structure can be beneficial.

One of the earliest attempts in this direction is the idea of covariance selection (Dempster 1972). By setting some of the off-diagonal elements of the concentration matrix $\Omega = \Sigma^{-1}$ to zero, a more parsimonious choice for the covariance matrix of the random vector $Y$ is achieved. A zero in the $(i,j)$-th position of $\Omega$ implies zero correlation (and further, independence under multivariate normality) between $Y_i$ and $Y_j$, conditional on the remaining components of $Y$. This property, along with its relation to graphical model theory (e.g., Lauritzen 1996), has led to the use of covariance selection as a standard part of analysis in multivariate problems (Rothman et al. 2008; Wong et al. 2003; Yuan & Lin 2007). However, one should be cautious when using such selection methods, as not all produce positive definite estimators. For instance, the estimator obtained by thresholding the sample covariance (concentration) matrix will not generally be positive definite, and adjustments are needed (Bickel & Levina 2008).

Model specification for $\Sigma$ may depend on a correlation structure through the so-called separation strategy (Barnard et al. 2000). The separation strategy involves reparameterizing $\Sigma$ by $\Sigma = SRS$, with $S$ a diagonal matrix containing the marginal standard deviations of $Y$ and $R$ the correlation matrix. Let $\mathcal{R}_J$ denote the set of valid correlation matrices, that is, the collection of $J \times J$ positive definite matrices with unit
diagonal. Separation can also be performed on the concentration matrix, $\Omega = TCT$, so that $T$ is diagonal and $C \in \mathcal{R}_J$. The diagonal elements of $T$ give the partial standard deviations, while the elements $c_{ij}$ of $C$ are the (full) partial correlations. The covariance selection problem is equivalent to choosing elements of the partial correlation matrix $C$ to be null. Several authors have constructed priors to estimate $\Sigma$ by allowing $C$ to be a sparse matrix (Carter et al. 2011; Wong et al. 2003).

In many cases the full partial correlation matrix may not be convenient to use. In cases where the covariance matrix is fixed to be a correlation matrix, such as the multivariate probit case, the elements $T$ and $C$ of the concentration matrix are constrained to maintain a unit diagonal for $\Sigma$ (Pitt et al. 2006). Additionally, interpretation of parameters in the partial correlation matrix can be challenging, particularly for longitudinal settings, as the partial correlations are defined conditional on future values. For example, $c_{12}$ gives the correlation between $Y_1$ and $Y_2$ conditional on the future measurements $Y_3, \ldots, Y_J$. An additional issue with Bayesian methods that promote sparsity in $C$ is calculating the volume of the space of correlation matrices with a fixed zero pattern; see Section 2.4.2 for details.

In addition to the role $R$ plays in the separation strategy, in some data models the covariance matrix is constrained to be a correlation matrix for identifiability. This is the case for the multivariate probit model (Chib & Greenberg 1998), Gaussian copula regression (Pitt et al. 2006), and certain latent variable models (e.g., Daniels & Normand 2006), among others. Thus, it is necessary to make use of methods specific for estimating and/or modeling a correlation matrix.

We consider this problem of correlation matrix estimation in a Bayesian context where we are concerned with choices of an appropriate prior distribution $p(R)$ on $\mathcal{R}_J$. Commonly used priors include a uniform prior over $\mathcal{R}_J$ (Barnard et al. 2000) and Jeffreys' prior $p(R) \propto |R|^{-(J+1)/2}$. In these cases the sampling steps for $R$ can sometimes benefit from parameter expansion techniques (Liu 2001; Liu & Daniels
2006; Zhang et al. 2006). Liechty et al. (2004) develop a correlation matrix prior by specifying each element $\rho_{ij}$ of $R$ as an independent normal subject to $R \in \mathcal{R}_J$. Pitt et al. (2006) extend the covariance selection prior (Wong et al. 2003) to the correlation matrix case by fixing the elements of $T$ to be constrained by $C$ so that $T$ is the diagonal matrix such that $R = (TCT)^{-1}$ has unit diagonal.

The difficulty of jointly dealing with the positive definite and unit diagonal constraints of a correlation matrix has led some researchers to consider priors for $R$ based on the partial autocorrelations (PACs) in settings where the data are ordered. PACs suggest a practical alternative by avoiding the complication of the positive definite constraint, while providing easily interpretable parameters (Joe 2006). Kurowicka & Cooke (2003, 2006) frame the PAC idea in terms of a vine graphical model. Daniels & Pourahmadi (2009) construct a flexible prior on $R$ through independent shifted Beta priors on the PACs. Wang & Daniels (2013a) construct underlying regressions for the PACs, as well as a triangular prior which shifts the prior weight to a more intuitive choice in the case of longitudinal data.

Instead of setting partial correlations from $C$ to zero to incorporate sparsity, our goal is to encourage parsimony through the PACs. As the PACs are unconstrained, selection does not lead to the computational issues associated with finding the normalizing constant for a sparse $C$. We introduce and compare priors for both selection and shrinkage of the PACs that extend previous work on sensible default choices (Daniels & Pourahmadi 2009).

The layout of this chapter is as follows. In the next section we will review the relevant details of the partial autocorrelation parameterization. Section 2.3 proposes a prior for $R$ induced by shrinkage priors on the PACs. Section 2.4 introduces the selection prior for the PACs. Simulation results showing the performance of the priors appear in Section 2.5. In Section 2.6 the proposed PAC priors are applied to a data set from a smoking cessation clinical trial. Section 2.7 concludes the chapter with a brief discussion.
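
The sense in which the PACs are unconstrained can be sketched numerically. The Python code below maps a set of PACs in (-1, 1) to a correlation matrix using the standard recursion associated with Joe (2006): a lag-1 correlation equals its PAC, and a correlation at a longer lag is built from its PAC together with the correlations already computed at shorter lags. The function names, the small Gaussian-elimination solver, and the 4 x 4 input values are illustrative assumptions, not code from the original analysis.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def pacs_to_correlation(pac):
    """Map PACs (upper triangle of `pac`) to the implied correlation matrix."""
    size = len(pac)
    corr = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for lag in range(1, size):
        for i in range(size - lag):
            j = i + lag
            if lag == 1:
                rho = pac[i][j]  # lag-1 PACs are ordinary correlations
            else:
                mid = range(i + 1, j)  # indices of the intervening variables
                r2 = [[corr[a][b] for b in mid] for a in mid]
                r1 = [corr[i][a] for a in mid]
                r3 = [corr[a][j] for a in mid]
                w1 = solve(r2, r1)
                w3 = solve(r2, r3)
                scale = math.sqrt((1.0 - dot(r1, w1)) * (1.0 - dot(r3, w3)))
                rho = dot(r1, w3) + pac[i][j] * scale
            corr[i][j] = corr[j][i] = rho
    return corr


# Illustrative PAC values (upper triangle only); any values in (-1, 1) work.
pac_example = [
    [1.0, 0.9, 0.3, 0.0],
    [0.0, 1.0, 0.8, 0.4],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.0, 1.0],
]
corr_example = pacs_to_correlation(pac_example)
for row in corr_example:
    print([round(value, 2) for value in row])
```

Because each PAC can be chosen freely in (-1, 1) without reference to the others, setting one of them to zero never invalidates the resulting correlation matrix, which is the property the selection and shrinkage priors rely on.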

2.2 Partial Autocorrelations

For a general random vector $Y = (Y_1, \ldots, Y_J)'$ the partial autocorrelation between $Y_i$ and $Y_j$ ($i$

for $j - i > 1$. As the relationship between $R$ and $\Pi$ is one-to-one, the Jacobian for the transformation from $R$ to $\Pi$ can be computed easily. The determinant of the Jacobian is given by $|J(\Pi)| = \prod_{i}$

which has a contribution from $\pi_{ij}$ of $(1 - \pi_{ij}^2)^{[J-1-(j-i)]/2}$. Note that $p_{fR}(\Pi)$ is the product of independent SBeta($\alpha_{ij}, \beta_{ij}$) distributions for each $\pi_{ij}$, where $\alpha_{ij} = \beta_{ij} = 1 + [J - 1 - (j - i)]/2$. This provides an unconstrained representation of the flat-$R$ prior.

In longitudinal/ordered data contexts, we expect the PACs to be negligible for elements that have large lags. We exploit this concept via two types of priors. First, we introduce priors that shrink PACs toward zero with the aggressiveness of the shrinkage depending on the lag. Next, we propose, in the spirit of Wong et al. (2003), a selection prior that will stochastically choose PACs to be set to zero.

2.3 Partial Autocorrelation Shrinkage Priors

2.3.1 Specification of the Shrinkage Prior

Using the PAC framework, we form priors that will shrink the PACs $\pi_{ij}$ toward zero. It has long been known that shrinkage estimators can produce greatly improved estimation (James & Stein 1961). As previously noted, $\pi_{ij} = 0$ implies that $Y_i$ and $Y_j$ are uncorrelated given the intervening variables $(Y_{i+1}, \ldots, Y_{j-1})$. In the case where $Y$ has a multivariate normal distribution, this implies independence between $Y_i$ and $Y_j$, given $(Y_{i+1}, \ldots, Y_{j-1})$. We anticipate that variables farther apart in time (and conditional on more intermediate variables) are more likely to be uncorrelated, so we will more aggressively shrink $\pi_{ij}$ for larger values of the lag $j - i$.

We let each $\pi_{ij} \sim \mathrm{SBeta}(\alpha_{ij}, \beta_{ij})$ independently. As we wish to shrink toward zero, we want $E\{\pi_{ij}\} = 0$, so we fix $\alpha_{ij} = \beta_{ij}$. It is easily shown that $\mathrm{Var}\{\pi_{ij}\} = 4\alpha_{ij}\beta_{ij} / [(\alpha_{ij} + \beta_{ij})^2 (\alpha_{ij} + \beta_{ij} + 1)]$, which we denote by $\xi_{ij}$. We recover the SBeta shape parameters by $\alpha_{ij} = \beta_{ij} = (\xi_{ij}^{-1} - 1)/2$. Hence, the distribution of $\pi_{ij}$ is determined by its variance $\xi_{ij}$. Rather than specifying these $J(J-1)/2$ different variances, we parameterize them through

$$\mathrm{Var}\{\pi_{ij}\} = \xi_{ij} = \epsilon_0 |j - i|^{-\gamma}, \qquad (2)$$
where $\epsilon_0 \in (0,1)$ and $\gamma > 0$. Clearly, $\xi_{ij}$ is decreasing in lag so that higher lag terms will generally be closer to zero. We let the positive parameter $\gamma$ determine the rate at which $\xi_{ij}$ decreases in lag.

To fully specify the Bayesian set-up, we must introduce prior distributions on the two parameters, $\epsilon_0$ and $\gamma$. To specify these hyperpriors, we use a uniform (or possibly a more general beta) distribution for $\epsilon_0$ and a gamma distribution for $\gamma$. We require $\gamma > 0$, so $\xi_{ij} = \epsilon_0 |j - i|^{-\gamma}$ remains a decreasing function of lag. In the simulations and data analysis of Sections 2.5 and 2.6, we use $\gamma \sim \mathrm{Gamma}(5,5)$, so that $\gamma$ has a prior mean of 1 and prior variance of 1/5. We use a moderately informative prior to keep $\gamma$ from dominating the role of $\epsilon_0$ in $\xi_{ij} = \epsilon_0 |j - i|^{-\gamma}$. A large value of $\gamma$ will force all $\xi_{ij}$ of lag greater than one to be approximately zero, regardless of the value of $\epsilon_0$.

2.3.2 Sampling under the Shrinkage Prior

The utility of our prior depends on our ability to incorporate it into a Markov chain Monte Carlo (MCMC) scheme. For simplicity we assume that the data consist of $Y_1, \ldots, Y_N$, where each $Y_i$ is a $J$-dimensional normal vector with mean zero and covariance $R$, which is a correlation matrix so as to mimic the computations for the multivariate probit case. Let $L(\Pi \mid Y)$ denote the likelihood function for the data, parameterized by the PACs, $\Pi$. The MCMC chain we propose involves sequentially updating each of the $J(J-1)/2$ PACs, followed by updating the hyperparameters determining the variance of the SBeta distributions.

To sample a particular $\pi_{ij}$, we must draw the new value from the distribution proportional to $L(\pi_{ij}, \Pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$, where $p_{ij}(\pi_{ij})$ is the SBeta($\alpha_{ij}, \beta_{ij}$) density and $\Pi_{(-ij)}$ represents the set of PACs except $\pi_{ij}$. Due to the subtle role of $\pi_{ij}$ in the likelihood piece, there is no simple conjugate sampling step. In order to sample from $L(\pi_{ij}, \Pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$, we introduce an auxiliary variable $U_{ij}$ (Damien et al. 1999; Neal
2003), and note that we can rewrite the conditional distribution as $L(\pi_{ij}, \Pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij}) = \int_0^\infty I\{u_{ij}$

(2013a) by $\alpha = 2$, $\beta = 1$; alternatively, independent hyperpriors for $\alpha$, $\beta$ could be specified. The value of $\epsilon_{ij}$ gives the probability that $\pi_{ij}$ will be non-zero, i.e., will be drawn from the continuous component in the mixture distribution. Hence, the probability that $Y_i$ and $Y_j$ are uncorrelated, given the intervening variables, is $1 - \epsilon_{ij}$. As the values of the $\epsilon$'s decrease, the selection prior places more weight on the point-mass-at-zero component of the distribution (2), yielding more sparse choices for $\Pi$.

As with our parameterization of the variance $\xi_{ij}$ in Section 2.3.1, we make a structural choice of the form of $\epsilon_{ij}$ so that this probability depends on the lag value. We let

$$\epsilon_{ij} = \epsilon_0 |j - i|^{-\gamma}, \qquad (2)$$

similar to our choice of $\xi_{ij}$ in the shrinkage prior. This choice (2) specifies the continuous component probability to be a polynomial function of the lag. Because $\epsilon_{ij}$ is decreasing as the lag $j - i$ increases, $\mathrm{pr}(\pi_{ij} = 0)$ increases. Conceptually, this means that we anticipate that variables farther apart in time (and conditional on more intermediate variables) are more likely to be uncorrelated. As with the shrinkage prior, we choose hyperpriors of $\epsilon_0 \sim \mathrm{Unif}(0,1)$ and $\gamma \sim \mathrm{Gamma}(5,5)$.

2.4.2 Normalizing Constant for Priors on R

One of the key improvements of our selection prior over other sparse priors for $R$ is the simplicity of the normalizing constant, as mentioned in the introduction. Previous covariance priors with a sparse $C$ (Carter et al. 2011; Pitt et al. 2006; Wong et al. 2003) place a flat prior on the non-zero components $c_{ij}$ for a given pattern of zeros. However, the needed normalizing constant requires finding the volume of the subspace of $\mathcal{R}_J$ corresponding to the pattern of zeros in $C$. This turns out to be quite a difficult task and provides much of the challenge in the work of the three previously cited papers.
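
For comparison with those sparse priors on $C$, the lag-dependent selection prior of Section 2.4.1 can be simulated one PAC at a time, with no joint constraint to track. The following Python sketch draws one PAC matrix from a prior of this form. It assumes that SBeta($\alpha$, $\beta$) denotes a Beta($\alpha$, $\beta$) variate rescaled from (0, 1) to (-1, 1); the function names and the particular values of $\epsilon_0$ and $\gamma$ in the example are hypothetical and are held fixed here rather than given hyperpriors.

```python
import random


def selection_probability(i, j, eps0, gamma):
    """Probability eps_ij = eps0 * |j - i| ** (-gamma) that PAC (i, j) is non-zero."""
    return eps0 * abs(j - i) ** (-gamma)


def draw_shifted_beta(alpha, beta, rng):
    """Draw from SBeta(alpha, beta), assumed here to be 2 * Beta(alpha, beta) - 1."""
    return 2.0 * rng.betavariate(alpha, beta) - 1.0


def draw_selection_prior(size, eps0, gamma, alpha, beta, rng):
    """Draw the upper triangle of a PAC matrix from the point-mass mixture prior."""
    pac = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            # With probability eps_ij use the continuous component;
            # otherwise the PAC stays at the point mass at zero.
            if rng.random() < selection_probability(i, j, eps0, gamma):
                pac[i][j] = draw_shifted_beta(alpha, beta, rng)
    return pac


# Hypothetical settings: eps0 = 0.8, gamma = 1, triangular SBeta(2, 1) component.
rng = random.Random(2013)
example = draw_selection_prior(6, eps0=0.8, gamma=1.0, alpha=2.0, beta=1.0, rng=rng)
for row in example:
    print([round(value, 2) for value in row])
```

With these settings a lag-1 PAC is non-zero with probability 0.8 and a lag-4 PAC with probability 0.2, so repeated draws show zeros concentrating in the higher-lag positions.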

We are able to avoid this issue by specifying our selection prior in terms of the unrestricted PAC parameterization. As the value of any of the $\pi_{ij}$'s does not affect the support of the remaining PACs, the volume of $[-1,1]^{J(J-1)/2}$ corresponding to any configuration of $\Pi$ with $J_0$ ($\le J(J-1)/2$) non-zero elements is $2^{J_0}$, the volume of a $J_0$-dimensional hypercube. Because this constant does not depend on which elements are non-zero, we need not explicitly deal with it in the MCMC algorithm to be introduced in the next subsection. Further, we are able to exploit structure in the order of the PACs in selection (i.e., higher lag terms are more likely to be null), whereas in Pitt et al. (2006), the probability that $c_{ij}$ is zero is chosen to minimize the effort required to find the normalizing constant.

An additional benefit of performing selection on the partial autocorrelations as opposed to the partial correlations $C$ is that the zero patterns hold under marginalization of the beginning and/or ending time points. For instance, if we marginalize out the $J$th time point, the corresponding matrix of PACs is the original $\Pi$ after removing the last row and column. However, any zero elements in $C$ will not be preserved because $\mathrm{corr}(Y_1, Y_2 \mid Y_3, \ldots, Y_J) = 0$ does not generally imply that $\mathrm{corr}(Y_1, Y_2 \mid Y_3, \ldots, Y_{J-1}) = 0$.

2.4.3 Sampling under the Selection Prior

Sampling with the selection prior proceeds similarly to the shrinkage prior scheme, with the main difference being the introduction of the point mass in (2). As before we sequentially update each of the PACs by drawing the new value from the distribution proportional to $L(\pi_{ij}, \Pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$, where $p_{ij}(\pi_{ij})$ gives the density corresponding to the prior distribution in (2) (with respect to the appropriate mixture dominating measure). We cannot use the slice sampling step according to (2) but must write the distribution as $L(\pi_{ij}, \Pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij}) = \int_0^\infty I\{u_{ij}$

For the selection prior, we sample $U_{ij}$ uniformly over the interval from zero to $L(\pi_{ij}, \Pi_{(-ij)} \mid Y)$, using the current value of $\pi_{ij}$, and then draw $\pi_{ij}$ from $p_{ij}(\cdot)$, restricted to the slice set $\mathcal{P} = \{\pi : u_{ij}$

Thesamplingdistributionsof0anddependononlythroughthesetofindicatorvariablesij.Aswiththevarianceparametersoftheshrinkagepriors,weincorporateapairofslicesamplingstepstoupdatethehyperparameters. 2.5SimulationsTobetterunderstandthebehaviorofourproposedpriors,weconductedasimulationstudytoassessthe(frequentist)riskoftheirposteriorestimators.WeconsiderfourchoicesADforthetruecovariancematrixinthecaseofsix-dimensional(J=6)data.RAwillhaveanautoregressive(AR)structurewithAij=0.7jj)]TJ /F7 7.97 Tf 6.58 0 Td[(ij.ThecorrespondingAhasvaluesof0.7forthelag-1termsandzerofortheothers,asparseparameterization.ForthesecondcorrelationmatrixRBwechoosetheidentitymatrixsothatallofPACsarezerointhiscase.TheChasastructurethatdecaystozero.Forthelag-1termsCi,i+1=0.7,andfortheremainingterms,Cij=0.4j)]TJ /F7 7.97 Tf 6.59 0 Td[(i)]TJ /F9 7.97 Tf 6.59 0 Td[(1,j)]TJ /F5 11.955 Tf 11.96 0 Td[(i>1.NeitherCnorRChavezeroelements,butCijdecreasequicklyinlagj)]TJ /F5 11.955 Tf 12.04 0 Td[(i.Finally,weconsideracorrelationmatrixthatcomesfromasparseD,D=266666641.9.30000.901.8.4.100.800.801.6.200.620.670.601.8.30.580.630.580.801.70.460.500.450.690.70137777775,wheretheupper-triangularelementscorrespondtoDandthelower-triangularelementsdepictthemarginalcorrelationsfromRD.NotethatwhileDissomewhatsparse,RDhasonlynon-zeroelements.ForeachofthesefourchoicesofthetruedependencestructureandforsamplesizesofN=20,50,and200,wesimulate50datasets.Foreachdatasetaposteriorsamplefor(andhence,R)isobtainedbyrunninganMCMCchainfor5000iterations,afteraburn-inof1000.Weuseeverytenthiterationforinference,givingasampleof500valuesforeachdataset.Weconsidertheperformanceofboththeselectionandshrinkagepriorson.Fortheselectionprior,weperformanalyseswithSBeta(1,1) 35

(i.e., Unif($-1,1$)) and SBeta(2,1) (triangular prior) for the continuous component of the mixture distributions (2). In both the selection and shrinkage priors, the hyperpriors are $\epsilon_0 \sim \mathrm{Unif}(0,1)$ and $\gamma \sim \mathrm{Gamma}(5,5)$. The estimators from the shrinkage and selection priors are compared with the estimators resulting from the flat-R, flat-PAC, and triangular priors. Finally, we consider a naive shrinkage prior where $\gamma$ is fixed at zero in (2). Here, all PACs are equally shrunk with variance $\xi_{ij} = \epsilon_0$, independent of lag. We consider two loss functions in comparing the performance of the six prior choices: $L_1(\hat{R}, R) = \mathrm{tr}(\hat{R} R^{-1}) - \log|\hat{R} R^{-1}| - p$ and $L_2(\hat{\Pi}, \Pi) = \sum_{i<j} \cdots$

Figure 2-1. Boxplots of the observed loss using $L_1(\hat{R}_1, R)$ for the $J = 6$ cases. The prior distributions compared are (1) shrinkage, (2) selection (2,1), (3) selection (1,1), (4) flat-R, (5) flat-$\Pi$, (6) triangular, and (7) naive shrinkage.
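The simulation design specifies each true dependence structure through its PAC matrix, so the marginal correlation matrix has to be recovered from the PACs. The sketch below assumes the standard recursion for this mapping: a lag-1 correlation equals its PAC, and a higher-lag correlation is built from the PAC and the correlations with the intervening time points. The function names are illustrative, and the small linear solver is included only to keep the example free of external libraries.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def pac_to_corr(pac):
    """Map an upper-triangular PAC matrix to the marginal correlation matrix."""
    J = len(pac)
    R = [[1.0 if i == j else 0.0 for j in range(J)] for i in range(J)]
    for lag in range(1, J):
        for i in range(J - lag):
            j = i + lag
            if lag == 1:
                rho = pac[i][j]
            else:
                mid = list(range(i + 1, j))          # intervening time points
                R2 = [[R[a][b] for b in mid] for a in mid]
                r1 = [R[i][k] for k in mid]
                r3 = [R[j][k] for k in mid]
                w1, w3 = solve(R2, r1), solve(R2, r3)
                d1 = 1.0 - sum(x * y for x, y in zip(r1, w1))
                d3 = 1.0 - sum(x * y for x, y in zip(r3, w3))
                rho = sum(x * y for x, y in zip(r1, w3)) + pac[i][j] * math.sqrt(d1 * d3)
            R[i][j] = R[j][i] = rho
    return R


J = 6
# Scenario A: 0.7 on the lag-1 PACs, zero elsewhere.
pac_A = [[0.7 if j == i + 1 else 0.0 for j in range(J)] for i in range(J)]
R_A = pac_to_corr(pac_A)

# Scenario D: the sparse PAC matrix given in the text (0-based indices).
pac_D = [[0.0] * J for _ in range(J)]
entries = {(0, 1): 0.9, (0, 2): 0.3, (1, 2): 0.8, (1, 3): 0.4, (1, 4): 0.1,
           (2, 3): 0.6, (2, 4): 0.2, (3, 4): 0.8, (3, 5): 0.3, (4, 5): 0.7}
for (i, j), value in entries.items():
    pac_D[i][j] = value
R_D = pac_to_corr(pac_D)
```

Under this recursion, scenario A reproduces the AR(1) pattern $0.7^{|j-i|}$, and the lag-2 entries of `R_D` agree with the lower triangle printed for scenario D after rounding.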

Table 2-1. Risk estimates for simulation study with dimension $J = 6$. Correlation matrices: A, autoregressive structure; B, independence; C, non-zero decaying; D, sparse. Loss functions: $L_1(\hat{R}, R) = \mathrm{tr}(\hat{R} R^{-1}) - \log|\hat{R} R^{-1}| - p$; $L_2(\hat{\Pi}, \Pi) = \sum_{i<j} \cdots$

Table 2-2. $10 \times 10$ PAC matrix $\Pi_{D'}$ shown above the diagonal and its respective correlation matrix $R_{D'}$ shown below the diagonal.
$$\Pi_{D'} = \begin{pmatrix}
1 & .9 & .3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.90 & 1 & .8 & .4 & .1 & 0 & 0 & 0 & 0 & 0 \\
0.80 & 0.80 & 1 & .6 & .2 & 0 & 0 & 0 & 0 & 0 \\
0.62 & 0.67 & 0.60 & 1 & .8 & .3 & 0 & 0 & 0 & 0 \\
0.58 & 0.63 & 0.58 & 0.80 & 1 & .7 & 0 & 0 & 0 & 0 \\
0.46 & 0.50 & 0.45 & 0.69 & 0.70 & 1 & .8 & .4 & .1 & 0 \\
0.37 & 0.40 & 0.36 & 0.55 & 0.56 & 0.80 & 1 & .6 & .2 & 0 \\
0.31 & 0.34 & 0.30 & 0.46 & 0.47 & 0.67 & 0.60 & 1 & .8 & .3 \\
0.29 & 0.32 & 0.29 & 0.43 & 0.44 & 0.63 & 0.58 & 0.80 & 1 & .7 \\
0.23 & 0.25 & 0.23 & 0.34 & 0.35 & 0.50 & 0.45 & 0.69 & 0.70 & 1
\end{pmatrix}$$

estimated risk for flat-R is visibly worse than the others. Recall that $\pi^C_{ij}$ is decreasing in lag but is not equal to zero. In fact, the smallest element is $\pi^C_{16} = (0.4)^4 = 0.0256$, which may not be close enough to zero to be effectively zeroed out, explaining why the selection priors are less effective for $\Pi_C$ than in the other scenarios.

When we consider estimating the sparse correlation matrix D, the shrinkage and selection priors outperform the four other priors. From Table 2-1 we see that for loss function 1 and the $N = 20$ sample size the estimated risk decreases by 45 (25), 45 (24), and 39 (16) percent for the estimates from the shrinkage, selection (2,1), and selection (1,1) priors over the flat-R (flat-$\Pi$) priors. This is quite a substantial drop for the small sample size. For the other sample sizes we still observe a clear decrease over the flat priors. For $N = 50$ there is a drop of 32 (20), 26 (14), and 22 (9) percent for the sparse priors over the flat priors, and with $N = 200$ a decrease of 13 (9), 10 (7), and 7 (4) percent.

To investigate how our priors behave as $J$ increases, we repeat the analysis using the non-sparse decaying $R_C$ and a sparse $R_{D'}$ with the dimension of the matrix increased to $J = 10$. Again, $\pi^C_{i,i+1} = 0.7$ for the lag-1 terms and $\pi^C_{ij} = 0.4^{\,j-i-1}$ for all $j - i > 1$, and we expand the previous $R_D$ to the $10 \times 10$ $R_{D'}$ shown in Table 2-2. As before, the above-diagonal elements are from $\Pi_{D'}$ and the below-diagonal elements

Table 2-3. Risk estimates for simulation study with dimension $J = 10$. Correlation matrices: C, non-zero decaying; D', sparse. Loss functions: $L_1(\hat{R}, R) = \mathrm{tr}(\hat{R} R^{-1}) - \log|\hat{R} R^{-1}| - p$; $L_2(\hat{\Pi}, \Pi) = \sum_{i<j} \cdots$

missingness due to study dropout. As in previous analyses of these data (Daniels & Hogan 2008), we assume this missingness is ignorable. For patient $i = 1, \ldots, N$ ($N = 281$), we denote the vector of quit statuses by $Q_i = (Q_{i1}, \ldots, Q_{iJ})'$. We only consider the responses after patients are asked to quit, weeks 5 through 12 ($J = 8$). Here $Q_{it} = 1$ indicates a success (not smoking) for patient $i$ at time $t$ ($1 \le t \le J$, corresponding to week $t + 4$), $Q_{it} = -1$ indicates a failure (smoking during the week), and $Q_{it} = 0$ indicates that the observation is missing.

Following the usual conventions of the multivariate probit regression model (Chib & Greenberg 1998), we let $Y_i$ be the $J$-dimensional vector of latent variables corresponding to $Q_i$. Thus, $Q_{it} = 1$ implies that $Y_{it} \ge 0$, and $Q_{it} = -1$ gives $Y_{it} < 0$. When $Q_{it} = 0$, the sign of $Y_{it}$ represents the (unobserved) quit status for the week. We assume the latent variables follow a multivariate normal distribution $Y_i \sim N_J(\mu_i, R)$ for $i = 1, \ldots, N$, where $\mu_i = X_i \beta$, $X_i$ is a $J \times q$ matrix of covariates, and $\beta$ is a $q$-vector of regression coefficients. As the scale of $Y$ is unidentified, the covariance matrix of $Y$ is constrained to be a correlation matrix $R$. We consider two choices of $X_i$: `time-varying', which specifies a different $\mu_{it}$ for each time within each treatment group ($q = 2J$), and `time-constant', which gives the same value of $\mu_{it}$ across all times within treatment group ($q = 2$).

With the time-constant and time-varying choices of the mean structure, we consider the following priors for $R$: shrinkage, selection, flat-R, flat-$\Pi$, triangular, naive shrinkage, and an autoregressive (AR) prior. The AR prior assumes an AR(1) structure for $R$, that is, $\rho_{ij} = \rho^{|j-i|}$, with $\pi_{i,i+1} = \rho$ and $\pi_{ij} = 0$ if $|j - i| > 1$. We assume a Unif($-1,1$) distribution for $\rho$. As in the risk simulation, we consider the selection prior with both SBeta(1,1) and SBeta(2,1) for the continuous component. The remaining prior distributions to be specified are $\epsilon_0 \sim \mathrm{Unif}(0,1)$ and $\gamma \sim \mathrm{Gamma}(5,5)$, and the prior on the regression coefficients $\beta$ is flat.

To analyze the data we run an MCMC chain for 12,000 iterations after a burn-in of 3000, retaining every tenth observation. Convergence was assessed through graphical diagnostics and deemed adequate. There are three sets of parameters to sample in the MCMC chain: the regression coefficients, the correlation matrix, and the latent variables. The conditional for $\beta$ given $Y$ and $R$ is multivariate normal. Sampling the correlation matrix evolves as discussed in Sections 2.3.2 and 2.4.3 using the residuals $Y_i - \mu_i$. The latent variables $Y_i$, which are constrained by $Q_i$, are sampled according to the strategy of Liu et al. (2009, Proposition 1).

To compare the specification based on our prior choices, we make use of the deviance information criterion (DIC; Spiegelhalter et al. 2002). The DIC statistic can be viewed similarly to the Bayesian or Akaike information criterion, but DIC does not require the user to count the number of model parameters. This is key for Bayesian models that utilize shrinkage and/or sparsity priors, as it is not clear whether or how one should count a parameter that has been set to or shrunk toward zero. To that end, let
$$\mathrm{Dev} = -2\,\mathrm{loglik}(\hat{\beta}, \hat{R} \mid Q) = \sum_i -2\,\mathrm{loglik}(\hat{\beta}, \hat{R} \mid Q_i) \qquad (2)$$
be the deviance, or twice the negative log-likelihood, with the parameters $\hat{\beta}$ and $\hat{R}$. Here $\hat{\beta}$ is the posterior mean, and for the correlation estimate $\hat{R}$ we use the first of the estimators we considered in Section 2.5, $\hat{R} = S\, E\{R^{-1}\}^{-1} S$ with $S = [\mathrm{diag}(E\{R^{-1}\})]^{1/2}$. The complexity of the model is measured by the term $p_D$, sometimes called the effective number of parameters. This $p_D$ is calculated as
$$p_D = E\{-2\,\mathrm{loglik}(\beta, R \mid Q)\} - \mathrm{Dev}, \qquad (2)$$
where the expectation is over the posterior distribution of the parameters $(\beta, R)$. The DIC model comparison statistic is $\mathrm{DIC} = \mathrm{Dev} + 2 p_D$, the sum of terms measuring model fit and complexity. Smaller values of DIC are preferred.
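The arithmetic behind these three statistics is simple once the deviance has been evaluated at each retained MCMC draw and at the point estimates. The sketch below assumes those deviance values are already available; the numbers are made up for illustration and are not taken from the CTQ analysis.

```python
def dic_summary(deviance_draws, deviance_at_estimate):
    """Return (Dev, pD, DIC) from posterior draws of -2 * loglik."""
    mean_deviance = sum(deviance_draws) / len(deviance_draws)
    p_d = mean_deviance - deviance_at_estimate   # effective number of parameters
    dic = deviance_at_estimate + 2.0 * p_d       # model fit plus complexity penalty
    return deviance_at_estimate, p_d, dic


# Hypothetical deviance values, for illustration only.
draws = [1046.0, 1043.5, 1047.5, 1044.0, 1044.0]
dev, p_d, dic = dic_summary(draws, deviance_at_estimate=1031.0)
```

In practice the deviance at each draw would itself be an estimate, since the observed-data likelihood of the probit model is an integral over the latent variables.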

Table 2-4. Model comparison statistics for the CTQ data.

Mean Structure   Correlation Prior   Dev    pD   DIC
Time-constant    Shrinkage           1031   14   1060
Time-constant    Selection (2,1)     1042   12   1066
Time-constant    Selection (1,1)     1044   12   1068
Time-constant    Triangular          1029   20   1068
Time-constant    flat-Π              1029   20   1069
Time-constant    Naive shrinkage     1033   20   1074
Time-constant    AR                  1071    3   1078
Time-constant    flat-R              1043   21   1086
Time-varying     Shrinkage           1022   25   1071
Time-varying     Triangular          1017   30   1077
Time-varying     Selection (2,1)     1033   22   1077
Time-varying     Selection (1,1)     1036   22   1080
Time-varying     flat-Π              1019   30   1080
Time-varying     Naive shrinkage     1023   31   1085
Time-varying     AR                  1068   13   1093
Time-varying     flat-R              1034   31   1097

As Wang & Daniels (2011) point out, DIC should be calculated using the observed data, which in this case is the quit status responses $Q_i$, not the latent variables $Y_i$. Hence the log-likelihood for $Q_i$ at parameters $(\beta, R)$ is equal to
$$\mathrm{loglik}(\beta, R \mid Q_i) = \log \int_{(-\infty,\infty)^J} I\{Q_{it}\, y_t \ge 0 \ \forall t\}\, \phi(y \mid X_i \beta, R)\, dy, \qquad (2)$$
where $\phi(\cdot \mid \mu, \Sigma)$ is the $J$-dimensional multivariate normal density with mean $\mu$ and covariance matrix $\Sigma$. The integral in (2) is not tractable but can be estimated using importance sampling (Robert & Casella 2004, Section 3.3). See Appendix A for details about estimating the DIC.

The model fit (Dev), complexity ($p_D$), and comparison (DIC) statistics are in Table 2-4; DIC statistics were estimated with a standard error of approximately 0.5. We see that the models that use a mean structure that depends only on treatment and not time $t$ tend to have lower DIC values. The time-varying models are penalized in the $p_D$ term for having to estimate the additional 14 regression coefficients. Of the

correlation priors, the flat-R and AR priors perform much worse than the shrinkage, selection, triangular, and flat-PAC priors with the same mean structure. Additionally, the selection prior that uses the triangular form SBeta($\alpha = 2$, $\beta = 1$) tends to have a smaller DIC than the SBeta(1,1) prior.

From Table 2-4 we determine that the prior choice that best balances model fit with parsimony is clearly the model with time-constant mean structure and the shrinkage prior on the correlation matrix. Using this best fitting model, the posterior mean of $\beta$ is $(-0.504, -0.295)$, implying that the marginal probability (95% credible interval) of not smoking during a given study week is $\Phi(-0.504) = 0.307$ (0.24, 0.37) for the control group and $\Phi(-0.295) = 0.384$ (0.32, 0.45) for the exercise group, where $\Phi(\cdot)$ is the distribution function of the standard normal distribution. The test of the hypothesis that the control treatment is as effective as the exercise treatment (i.e., $H_0: \beta_1 \ge \beta_2$) has a posterior probability of 0.06, providing some evidence for the claim that exercise improves cessation results.

We now examine in more detail the effect the shrinkage prior has on modeling the correlation matrix. The posterior means (credible intervals) of the shrinkage parameters are $\hat{\epsilon}_0 = 0.406$ (0.25, 0.60) and $\hat{\gamma} = 2.44$ (1.6, 3.4). With a value of $\gamma$ greater than 1, the variance of $\pi_{ij}$ is decaying to zero fairly rapidly. The posterior mean of $\Pi$ is
$$\hat{\Pi} = \begin{pmatrix}
1.00 & 0.70 & 0.12 & 0.02 & 0.05 & 0.00 & 0.00 & -0.01 \\
0.71 & 1.00 & 0.83 & 0.16 & 0.09 & 0.02 & 0.01 & 0.00 \\
0.64 & 0.84 & 1.00 & 0.81 & 0.12 & 0.10 & 0.06 & 0.02 \\
0.56 & 0.74 & 0.82 & 1.00 & 0.78 & 0.24 & 0.09 & 0.03 \\
0.51 & 0.64 & 0.69 & 0.79 & 1.00 & 0.81 & 0.37 & 0.04 \\
0.48 & 0.61 & 0.66 & 0.74 & 0.83 & 1.00 & 0.88 & 0.21 \\
0.48 & 0.61 & 0.67 & 0.74 & 0.83 & 0.89 & 1.00 & 0.78 \\
0.40 & 0.52 & 0.57 & 0.63 & 0.70 & 0.77 & 0.80 & 1.00
\end{pmatrix},$$
with the lower diagonal values giving the elements of $\hat{R}$. We see that the PACs are far from zero in only the first two lags and the remaining $\pi$'s are close to zero. This is because these partial autocorrelations have been shrunk almost to zero in most iterations.
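As a quick check of the reported marginal quit probabilities, the standard normal distribution function can be evaluated at the two posterior means of the regression coefficients. The sketch uses only the error function from the standard library; the dictionary keys are illustrative labels.

```python
import math


def std_normal_cdf(x):
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Posterior means of the time-constant regression coefficients from the text.
beta_hat = {"control": -0.504, "exercise": -0.295}
quit_prob = {group: std_normal_cdf(b) for group, b in beta_hat.items()}
```

The two values agree with the reported 0.307 and 0.384 to three decimal places.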

2.7 Discussion

In this paper we have introduced two new priors for correlation matrices, a shrinkage prior and a selection prior. These priors choose a sparse parameterization of the correlation matrix through the set of PACs. In the selection context, by stochastically selecting the elements of $\Pi$ to zero out, our model finds interpretable independence relationships for normal data and avoids the need for complex model selection of the dependence structure. A key improvement of the selection prior over existing methods for sparse correlation matrices is that our approach avoids the complex normalizing constants seen in previous work. Additionally, in settings with time-ordered data, the partial autocorrelations are more interpretable than the full partial correlations, as they do not involve conditioning on future values.

While the examples we have considered here involve situations where the covariance matrix was constrained (as in the data example) or known (as in the simulations) to be a correlation matrix, the extension to arbitrary $\Sigma$ is simple. Returning to the separation strategy $\Sigma = SRS$ (Barnard et al. 2000), a prior for $\Sigma$ can be formed by placing independent priors on $S$ and $R$, i.e., $p(\Sigma) = p(R)\,p(S)$. Using one of the proposed priors for $p(R)$, sensible choices of $p(S)$ include an independent inverse gamma for each of the $\sigma_{jj}$ or a flat prior on $\{S = \mathrm{diag}(\sigma_{11}, \ldots, \sigma_{JJ}) : \sigma_{jj} > 0\}$. This leads to a prior on $\Sigma$ with sparse PACs.

The simulations and data we have considered here deal with $Y$ of low or moderate dimension. We provide a few comments regarding the scalability of our approach for data with larger $J$. As we believe that PACs of larger lag play a progressively smaller role in describing the (temporal) dependence, it may be reasonable to specify a maximum allowable lag for non-zero PACs. That is, we choose some $k$ such that $\pi_{ij} = 0$ for all $j - i > k$ and sample $\pi_{ij}$ ($j - i \le k$) from either our shrinkage or selection prior. Banding the $\Pi$ matrix is related to the idea of banding the covariance matrix (Bickel & Levina 2008), concentration matrix (Rothman et al. 2008), or the Cholesky decomposition

of $\Sigma^{-1}$ (Rothman et al. 2010). Banding $\Pi$ has also been studied by Wang & Daniels (2013b). In addition to reducing the number of parameters that must be sampled, other matrix computations will be faster by using properties of banded matrices.

Related to this, modifications to the shrinkage prior may be needed for larger dimension $J$. Recall that the variance of $\pi_{ij}$ is $\xi_{ij} = \epsilon_0 |j - i|^{-\gamma}$. For large lags, this can be very close to zero, leading to numerical instability; recall the parameters of the SBeta distribution are inversely related to $\xi_{ij}$ through $\alpha_{ij} = \beta_{ij} = (\xi_{ij}^{-1} - 1)/2$. Replacing (2) with $\xi_{ij} = \epsilon_0 \min\{|j - i|, k\}^{-\gamma}$ or $\xi_{ij} = \epsilon_0 + \epsilon_1 |j - i|^{-\gamma}$ to bound the variances away from zero, or banding $\Pi$ after the first $k$ lags, provide two possibilities to avoid such numerical issues.

Further, we have parametrized the variance component and the selection probability in similar ways in our two sparse priors. The quantity is of the form $\epsilon_0 |j - i|^{-\gamma}$ for both $\xi_{ij}$ in (2) and $\epsilon_{ij}$ in (2), but other parameterizations are possible. We have considered some simulations (not included) allowing the variance/selection probability to be unique for each lag, i.e., $\xi_{ij} = \epsilon_{|j-i|}$. A prior needs to be specified for each of these $J - 1$ $\epsilon$'s, ideally decreasing in lag. Alternatively, one could use $\epsilon_0 / |j - i|$, which can be viewed as a special case where the prior on $\gamma$ is degenerate at 1. In our experience results were not very sensitive to the choice of the parameterization, and posterior estimates of $\Pi$ and $R$ were similar.

In addition, we have focused our discussion on the correlation estimation problem in the context of analysis with multivariate normal data. We note that these priors are additionally applicable in the context of estimating a constrained scale matrix for the multivariate Student $t$-distribution. Consider the random variable $Y \sim t_J(\mu, R, \nu)$. That is, $Y$ follows a $J$-dimensional $t$-distribution with location (mean) vector $\mu$, scale matrix $R$ (constrained to be a correlation matrix), and $\nu$ degrees of freedom (either fixed or random). Using the gamma-mixture-of-normals technique (Albert & Chib 1993), we rewrite the distribution of $Y$ to be $Y \mid \tau \sim N_J(\mu, \tau^{-1} R)$ and $\tau \sim \mathrm{Gamma}(\nu/2, \nu/2)$.
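The numerical concern about the shrinkage variance at large lags can be made concrete with a short calculation. The sketch below evaluates the prior variance of a PAC as a function of lag, with an optional cap on the lag as in the first proposed modification, and the implied common SBeta shape parameter. The function names are illustrative; the values of `eps0` and `gamma` are the posterior means reported for the CTQ data, used here only as plausible inputs.

```python
def shrinkage_variance(lag, eps0, gamma, max_lag=None):
    """xi = eps0 * lag^(-gamma); with max_lag = k the lag is capped at k."""
    effective_lag = lag if max_lag is None else min(lag, max_lag)
    return eps0 * effective_lag ** (-gamma)


def sbeta_shape(xi):
    """Common shape alpha = beta of the SBeta prior with variance xi."""
    return (1.0 / xi - 1.0) / 2.0


eps0, gamma = 0.406, 2.44
for lag in (1, 2, 4, 7):
    xi = shrinkage_variance(lag, eps0, gamma)
    capped = shrinkage_variance(lag, eps0, gamma, max_lag=3)
    print(lag, round(xi, 5), round(sbeta_shape(xi), 1), round(sbeta_shape(capped), 1))
```

A symmetric SBeta on $(-1, 1)$ with shape $\alpha = \beta$ has variance $1/(2\alpha + 1)$, so the shape grows without bound as the variance shrinks; capping the lag keeps the shape finite at a level chosen by the analyst.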

Sampling for $R$ as part of an MCMC chain follows as in Sections 2.3.2 and 2.4.3 using $Y^\star = \sqrt{\tau}\,(Y - \mu)$ as the data. However, one should note that a zero PAC $\pi_{ij}$ implies that $Y_i$ and $Y_j$ are uncorrelated given $Y_{i+1}, \ldots, Y_{j-1}$, but this is not equivalent to conditional independence as in the normal case.

CHAPTER 3
A NONPARAMETRIC PRIOR FOR SIMULTANEOUS COVARIANCE ESTIMATION

3.1 Simultaneous Covariance Estimation

When working with longitudinal data, specifying the model for the dependence structure is a major consideration. Often the data are composed of several groups, such as differing treatments in a clinical trial. In many cases, particularly if one does not have many observations per group, one assumes that the covariance or correlation structure is constant across all groups. However, this assumption, if it fails to hold, can have a dramatic effect on the inference for mean effects, even sometimes leading to bias. Conversely, if one specifies each of the covariance matrices without regard to the other groups, this can lead to a loss of information. Dealing with these competing models for the covariance structure is a concern in many statistical applications, such as classification and model-based clustering. Therefore, it is desirable to develop methods to simultaneously estimate the set of covariance matrices that will borrow information across groups in a coherent, automated manner, allowing for structural zeros, commonality across subsets of the groups, and appropriate equality of parameters within a group.

When the data are fully observed under multivariate normality, the mean and covariance parameters are orthogonal in the sense of Cox & Reid (1987), and the mean parameters will be consistent under misspecification of the covariance structure. However, if there is missingness, as is often the case for longitudinal data, there is no longer orthogonality, even at the true value of the covariance matrix (Little & Rubin 2002). Hence, for the posterior distribution of the mean parameters to be consistent, the dependence structure must be correctly specified, and it is not appropriate to treat the covariance matrix as a nuisance parameter (Daniels & Hogan 2008, Section 6.2). Further, Cripps et al. (2005) demonstrate efficiency gains for the regression parameters for fully observed data by using parsimonious models for the covariance matrix.

Assume that we have $M$ groups of normally distributed longitudinal data with $n_m$ responses of dimension $p$, $Y_{mi}$, for the $m$th group. We assume without loss of generality that the mean vector for each group is zero. The distribution of the $Y_{mi}$ is
$$Y_{mi} \mid \Sigma_m \sim N_p(0, \Sigma_m) \quad (i = 1, \ldots, n_m;\ m = 1, \ldots, M),$$
with the covariance matrix $\Sigma_m = \Sigma(\phi_m, \Gamma_m)$ parameterized by the generalized autoregressive parameters $\phi_m$ and innovation variances $\Gamma_m$, as described by Pourahmadi (1999, 2000). For brevity, we sometimes refer to the generalized autoregressive parameters as the autoregressive parameters. We also refer to this as the modified Cholesky parameterization, since the parameters are derived by performing a Cholesky decomposition on $\Sigma_m$,
$$\Sigma(\phi_m, \Gamma_m)^{-1} = T(\phi_m)\, D(\Gamma_m)\, T(\phi_m)^\top.$$
Here, $\Gamma_m = (\gamma_{m1}, \ldots, \gamma_{mp})$, and $D(\Gamma_m)$ is a $p \times p$ diagonal matrix with $(j,j)$-element $(\gamma_{mj})^{-1}$. The $T(\phi_m)$ matrix is upper-triangular with ones on the main diagonal, and the above-diagonal elements are given by the negatives of $\phi_m$. The elements of $\phi_m = (\phi_{m1}, \ldots, \phi_{mJ})$ are indexed by $j = 1, \ldots, J$ ($J = p(p-1)/2$) corresponding to location $(j_1, j_2)$ in $T(\phi_m)$ ($1 \le j_1 < j_2 \le p$)
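To illustrate the modified Cholesky parameterization, the sketch below assembles the inverse covariance matrix from a set of generalized autoregressive parameters and innovation variances, following the factorization of the inverse covariance as $T D T^\top$. The dictionary layout for `phi` (keyed by the 0-based location in $T$) and the function name are illustrative choices. The worked case is a three-dimensional AR(1) process with unit marginal variances, for which the lag-1 autoregressive parameters equal the correlation and the lag-2 parameter is zero.

```python
def precision_from_cholesky(phi, gamma):
    """Build the inverse covariance T D T^T from phi and innovation variances.

    phi[(j1, j2)] is the autoregressive parameter at location (j1, j2)
    of T (0-based, j1 < j2); gamma[j] is the j-th innovation variance.
    """
    p = len(gamma)
    T = [[1.0 if r == c else 0.0 for c in range(p)] for r in range(p)]
    for (j1, j2), value in phi.items():
        T[j1][j2] = -value                      # above-diagonal entries are -phi
    d = [1.0 / g for g in gamma]                # D has (j, j) element 1 / gamma_j
    return [[sum(T[r][k] * d[k] * T[c][k] for k in range(p)) for c in range(p)]
            for r in range(p)]


rho = 0.5
phi = {(0, 1): rho, (1, 2): rho, (0, 2): 0.0}
gamma = [1.0, 1.0 - rho ** 2, 1.0 - rho ** 2]
precision = precision_from_cholesky(phi, gamma)
```

The zero lag-2 parameter shows up as a zero in the corner of the resulting tridiagonal matrix, which is the conditional independence of the first and third measurements given the second.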

to estimate sparse graphical models by selecting sets of edges common to all groups, as well as group-specific edges. However, their method shares little information about non-zero parameters across the groups. Pourahmadi et al. (2007) developed estimation and testing procedures for equality among subsets of the $\phi_{mj}$'s. In a clustering context, models assuming $T(\phi_m)$ and/or $D(\Gamma_m)$ to be either constant or distinct across all groups were developed by McNicholas & Murphy (2010). Daniels (2006) considered a Bayesian perspective by introducing priors for the parameters of the Cholesky decomposition, as well as the principal components of the covariance matrices, that induce pooling across groups. Unfortunately, it is computationally challenging to select among all the possible models within these classes. Hoff (2009) also considers a model that shrinks toward a common eigenvector structure, allowing the extent of the pooling to vary across each principal axis. Other methods have been proposed that model the covariance matrix as a regression function of a continuous covariate (Chiu et al. 1996; Daniels 2006; Fox & Dunson 2011; Hoff & Niu 2012). However, covariance regression models are often plagued by the difficulty of interpreting the regression parameters.

In this chapter we focus solely on the modified Cholesky parameterization because of the unrestrictedness of the parameters, the interpretability for longitudinal data, and the computational advantages via conjugacy (Daniels & Pourahmadi 2002). Our goal is to develop a prior for the sets $\Phi = \{\phi_1, \ldots, \phi_M\}$ and $\Gamma = \{\Gamma_1, \ldots, \Gamma_M\}$ in such a way that we borrow strength across the $M$ groups. Additionally, we want to share information across $\Gamma_m$ and $\phi_m$ values, particularly those autoregressive parameters of a common lag. Another consideration for prior development is to encourage sparsity of the elements of $T(\phi_m)$. Because each $\phi_{mj}$ represents a conditional dependency, setting $\phi_{mj}$ to zero establishes a conditional independence relationship between a pair of components of $Y$. It is necessary to consider priors that allow the data to inform the balance between these two goals: pooling across groups and introducing sparsity. Above all, we seek to accomplish this in an automated, stochastic fashion. To form

$X_{jH} = 1$ for all $j$, so that the stick-breaking weights sum to one, guaranteeing $F_{mj}$ is a valid distribution. The matrix stick-breaking process is then defined using the above specification as $H \to \infty$, and the authors refer to the finite $H$ case as the truncation approximation to the matrix stick-breaking process. We can consider the adequacy of this approximation using a method similar to that employed by Ishwaran & James (2001). Dunson et al. (2008) show that for a set $\{\pi_{mjh}\}$ drawn from the full process,
$$E\left(\sum_{h=H}^{\infty} \pi_{mjh}\right) = \left\{1 - \frac{1}{(1+\alpha)(1+\beta)}\right\}^{H-1}. \qquad (3)$$
We may choose the number of clusters $H$ such that this expected approximation error (3) is arbitrarily small, so the effect of the approximation is negligible.

Because the probability measures $F_{mj}$ and $F_{m'j}$ for two groups $m$ and $m'$ share the same set of atoms $\{\theta_{j1}, \ldots, \theta_{jH}\}$, there is a positive probability that $\phi_{mj}$ will equal $\phi_{m'j}$. This occurs when $\phi_{mj}$ and $\phi_{m'j}$ are drawn from the same cluster, that is, if $\phi_{mj} = \phi_{m'j} = \theta_{jh}$ for some $h$ in $1, \ldots, H$. The probability of this occurring is a known function of the stick-breaking parameters $\alpha$ and $\beta$.

3.3 Covariance Grouping Priors

3.3.1 Lag-Block Grouping Prior for $\Phi$

We now propose priors to use for simultaneous covariance estimation based on the matrix stick-breaking process. These priors are referred to as grouping priors because they induce grouping among the values of the various parameters. To this end, we independently place priors on $\Phi$ and $\Gamma$, with the prior on $\Sigma_m$ induced by the mapping $\Sigma_m = \Sigma(\phi_m, \Gamma_m)$. Because $\Phi$ and $\Gamma$ are orthogonal parameters (Pourahmadi 2007), it is sensible to choose independent priors.
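The truncation rule can be sketched numerically. The first two functions below evaluate the expected approximation error and search for the smallest number of clusters that brings it under a threshold. The third draws one set of truncated weights; it assumes the beta sticks are Beta(1, alpha) for the group sticks and Beta(1, beta) for the parameter sticks, which is consistent with the expectation formula, and it sets the final stick of both kinds to one so that each set of weights sums to one. All function names are illustrative.

```python
import random


def expected_truncation_error(H, alpha, beta):
    """Expected mass the full process places beyond the first H - 1 clusters."""
    return (1.0 - 1.0 / ((1.0 + alpha) * (1.0 + beta))) ** (H - 1)


def smallest_H(alpha, beta, threshold=0.01):
    """Smallest H whose expected truncation error falls below the threshold."""
    H = 1
    while expected_truncation_error(H, alpha, beta) >= threshold:
        H += 1
    return H


def truncated_weights(M, J, H, alpha, beta, rng):
    """weights[m][j][h] = U_mh X_jh prod_{l<h} (1 - U_ml X_jl)."""
    # Assumption: the final U and X are both fixed at one (truncation).
    U = [[rng.betavariate(1.0, alpha) for _ in range(H - 1)] + [1.0] for _ in range(M)]
    X = [[rng.betavariate(1.0, beta) for _ in range(H - 1)] + [1.0] for _ in range(J)]
    weights = []
    for m in range(M):
        rows = []
        for j in range(J):
            remaining, w = 1.0, []
            for h in range(H):
                stick = U[m][h] * X[j][h]
                w.append(stick * remaining)
                remaining *= 1.0 - stick
            rows.append(w)
        weights.append(rows)
    return weights


H = smallest_H(alpha=1.0, beta=1.0, threshold=0.01)
weights = truncated_weights(M=3, J=4, H=H, alpha=1.0, beta=1.0, rng=random.Random(7))
```

Larger values of either stick-breaking parameter make each stick shorter on average, so more clusters are needed before the leftover mass becomes negligible.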

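The prior for $\Phi$, referred to as the lag-block grouping prior, is defined as follows.
$$\phi_{mj} \sim F_{mj}, \quad F_{mj}(\cdot) = \sum_{h=1}^{H_\phi} \pi_{mjh}\, \delta_{\theta_{q(j)h}}(\cdot) \quad (m = 1, \ldots, M;\ j = 1, \ldots, J), \qquad (3)$$
$$\theta_{qh} \sim \epsilon_q \delta_0 + (1 - \epsilon_q)\, N(0, \sigma^2) \quad (q = 1, \ldots, p - 1;\ h = 1, \ldots, H_\phi), \qquad (3)$$
$$\pi_{mjh} = U_{mh} X_{jh} \prod_{l<h} (1 - U_{ml} X_{jl})$$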

We form the probabilities $\{\pi_{mjh}\}$ as in Dunson et al. (2008). The $\alpha_\phi$ and $\beta_\phi$ stick-breaking parameters serve the same role as $\alpha$ and $\beta$ before. We subscript them with $\phi$ to distinguish the stick-breaking parameters for the prior on $\Phi$ from the parameters to be defined for the prior on $\Gamma$. The $U_{mh}$ and $X_{jh}$ parameters follow the same interpretation as in the matrix stick-breaking process, but while we share candidates across autoregressive parameters of the same lag, each parameter has its own values $X_{jh}$.

A key distinction between our prior and the original process of Dunson et al. (2008) is the use of the same set of candidate values for different parameters. This has important consequences for the theoretical properties of our priors. In particular, for $\phi_{mj}$ and $\phi_{mj'}$ with $j \ne j'$ and $q(j) = q(j')$, i.e., different autoregressive parameters for a common group and lag, their distributions $F_{mj}$ and $F_{mj'}$ are positively correlated, whereas under the original specification they would be uncorrelated. This implication is quite attractive for longitudinal data as it follows common intuition. For example, it may be reasonable to consider the regression effect of $Y_t$ onto $Y_{t-1}$ to be the same for different values of $t$. We discuss these properties further in Section 3.4.

3.3.2 Correlated-Lognormal Grouping Prior for $\Gamma$

We now define the prior for the innovation variances $\Gamma$ as follows.
$$\gamma_{mj} \sim G_{mj}, \quad G_{mj}(\cdot) = \sum_{h=1}^{H_\gamma} \pi^{\gamma}_{mjh}\, \delta_{\nu_{jh}}(\cdot) \quad (m = 1, \ldots, M;\ j = 1, \ldots, p), \qquad (3)$$
$$\nu_{jh} = \exp(\omega_{jh}) \quad (j = 1, \ldots, p;\ h = 1, \ldots, H_\gamma), \quad \omega_h = (\omega_{1h}, \ldots, \omega_{ph})^\top \sim N_p\{\psi 1_p, \tau R(\rho)\} \quad (h = 1, \ldots, H_\gamma), \qquad (3)$$
$$\pi^{\gamma}_{mjh} = W_{mh} Z_{jh} \prod_{l<h} (1 - W_{ml} Z_{jl})$$

We draw the innovation variance $\gamma_{mj}$ from the stick-breaking measure $G_{mj}$, where the candidate atoms are drawn by exponentiating a multivariate normal variable $\omega_h$. The probability $\pi^{\gamma}_{mjh}$ of each of the atoms is formed using the stick-breaking method on the product of $W$ and $Z$. These beta random variables depend on the parameters $\alpha_\gamma$ and $\beta_\gamma$. The candidates $\nu_{jh}$ are drawn in a correlated fashion, unlike the original matrix stick-breaking process, and marginally follow a lognormal distribution, providing the name of this prior.

We introduce the intermediate variable $\omega_h$ in (3), which is a $p$-dimensional normally distributed random vector with mean vector $\psi 1_p$ and covariance matrix $\tau R(\rho)$. Here, $\psi$ and $\tau$ are scalar quantities, $\tau > 0$, and $R(\rho)$ is the correlation matrix corresponding to an autoregressive function of order 1. The $(i,j)$ component of $R(\rho)$ is $\rho^{|i-j|}$. This choice is motivated by the fact that one sometimes considers the innovation variances as realized values of some unknown smooth function of time. Similar to the lag-block prior, we obtain the atoms $\nu_{jh}$ for the random measure $G_{mj}$ in a dependent way, while leaving the construction of the probability weights $\pi^{\gamma}_{mjh}$ unchanged. In the special case where $\rho = 0$, the components of the $\omega_h$ vector are independent. Consequently, the innovation variance candidates $\nu_{jh}$ are distributed according to the lognormal($\psi$, $\tau$) distribution, and this special case follows the matrix stick-breaking process framework.

In addition to the grouping priors that we have defined here, there are other possibilities to form similar priors on the set $\{\Sigma_1, \ldots, \Sigma_M\}$ using the matrix stick-breaking process framework. We explore some of these in Appendix B.

3.4 Theoretical Properties

3.4.1 Generalized Autoregressive Parameter Properties

We now explore some of the theoretical properties of the proposed grouping priors in the case where $H_\phi, H_\gamma \to \infty$. Recall that the matrix stick-breaking process is formally

defined to be the limiting distribution as the number of clusters approaches infinity, and the finite number of clusters case, while necessary for implementation, is viewed as an approximation. Our grouping priors follow in the same way. The following properties, (3) through (3), are derived for these limiting distributions, and we ensure that the number of clusters is chosen large enough that these properties may be considered to hold approximately. The initial properties mirror Propositions 1, 2, and 4 of Dunson et al. (2008). Partial derivations of (3) through (3) are provided in Appendix C.

First, we consider the behavior of $\Phi$ from the lag-block grouping prior. For the following calculations, we assume that the $\epsilon_q$'s and all hyperparameters are fixed. Additionally, for ease of notation, we ignore the subscript on $\epsilon_q$, writing $\epsilon$, when it is clear from context, and let $N_\sigma(\cdot)$ denote the probability measure for the Normal($0, \sigma^2$) distribution. Define $F_0(\cdot) = \epsilon\, \delta_0(\cdot) + (1 - \epsilon)\, N_\sigma(\cdot)$, the probability measure for the mixture distribution of the $\theta_{qh}$'s. For all sets $A$ in the Borel field of the real line $B(\mathbb{R})$,
$$E\{F_{mj}(A)\} = F_0(A), \qquad \mathrm{Var}\{F_{mj}(A)\} = \frac{2}{(2 + \alpha_\phi)(2 + \beta_\phi) - 2}\, F_0(A)\{1 - F_0(A)\}. \qquad (3)$$
This unbiasedness property shows that it is appropriate to refer to the zero-normal mixture as the base distribution for $\phi$. The form of the variance shows that $\alpha_\phi$ and $\beta_\phi$ control the extent to which the random measure $F_{mj}$ differs from the base distribution. As either $\alpha_\phi$ or $\beta_\phi$ approaches infinity, the distribution of $\phi_{mj}$ collapses to the parametric base; small values of $\alpha_\phi$ and $\beta_\phi$ allow for a more flexible prior.

For two different groups $m \ne m'$,
$$\mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\} = \frac{\alpha_\phi \beta_\phi / 2 + \alpha_\phi + \beta_\phi + 1}{\alpha_\phi \beta_\phi + 2\alpha_\phi + \beta_\phi + 1}. \qquad (3)$$
Because this correlation between the amount of mass the distribution functions assign to the set $A$ does not depend on the choice of $A$, it may be used as a simple univariate measure of the degree to which information is shared across groups. Simple algebra

shows that $1/2 \le \mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\} \le 1$. In particular, $\mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\}$ approaches $1/2$ as $\alpha_\phi$ approaches infinity and approaches 1 as $\alpha_\phi \to 0$.

For groups $m \ne m'$, the probability of matching for the $j$th autoregressive parameter is
$$\mathrm{pr}(\phi_{mj} = \phi_{m'j}) = \epsilon^2 + \frac{1 - \epsilon^2}{(1 + \alpha_\phi)(2 + \beta_\phi) - 1}. \qquad (3)$$
The presence of the zero point mass in $F_0$ causes our properties to differ from those derived in Dunson et al. (2008). As either $\alpha_\phi$ or $\beta_\phi$ approaches infinity, this probability approaches $\epsilon^2$, the probability that both $\phi_{mj}$ and $\phi_{m'j}$ are zero if drawn independently from the parametric base distribution. The right-hand side of (3) is increasing in $\epsilon$, as larger values of $\epsilon$ indicate that both terms are more likely to be zero whether or not they come from the same cluster. Additionally, (3) increases when either $\alpha_\phi$ or $\beta_\phi$ decreases, coinciding with the increase in (3).

Considering two different autoregressive parameters $j \ne j'$ of the same lag $q = q(j) = q(j')$ from the same group $m$, we have that
$$\mathrm{corr}\{F_{mj}(A), F_{mj'}(A)\} = \frac{\alpha_\phi \beta_\phi / 2 + \alpha_\phi + \beta_\phi + 1}{\alpha_\phi \beta_\phi + \alpha_\phi + 2\beta_\phi + 1}, \qquad \mathrm{pr}(\phi_{mj} = \phi_{mj'}) = \epsilon_q^2 + \frac{1 - \epsilon_q^2}{(2 + \alpha_\phi)(1 + \beta_\phi) - 1}. \qquad (3)$$
The grouping prior has imposed a correlation structure on the distribution functions of the $\phi$'s of a common lag, allowing us to borrow strength in the estimation of the dependence parameters from the same lag. This correlation is the same as (3) for the earlier $m \ne m'$ case with the roles of $\alpha_\phi$ and $\beta_\phi$ reversed. Likewise, the probability of matching across parameters of common lag (3) is also equivalent to the probability of matching across group for common parameter (3) with $\alpha_\phi$ and $\beta_\phi$ exchanged. This is a key distinction from the process of Dunson et al. (2008), where the correlation and matching probabilities would be zero.
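The correlation and matching probabilities above are closed-form functions of the two stick-breaking parameters and the zero-mass probability, so their behavior is easy to explore numerically. In the sketch below, `a` and `b` stand for the stick-breaking parameters attached to the group sticks and the parameter sticks respectively, and `eps` is the prior probability of a zero atom; these argument names, like the function names, are illustrative.

```python
def corr_across_groups(a, b):
    """Correlation of the random measures for one parameter in two groups."""
    return (a * b / 2.0 + a + b + 1.0) / (a * b + 2.0 * a + b + 1.0)


def corr_within_lag(a, b):
    """Correlation for two same-lag parameters within one group."""
    return (a * b / 2.0 + a + b + 1.0) / (a * b + a + 2.0 * b + 1.0)


def match_across_groups(a, b, eps):
    """Probability that one parameter takes the same value in two groups."""
    return eps ** 2 + (1.0 - eps ** 2) / ((1.0 + a) * (2.0 + b) - 1.0)


def match_within_lag(a, b, eps):
    """Probability that two same-lag parameters in one group are equal."""
    return eps ** 2 + (1.0 - eps ** 2) / ((2.0 + a) * (1.0 + b) - 1.0)


grid = [0.1, 0.5, 1.0, 2.0, 10.0]
table = [(a, b, corr_across_groups(a, b), match_across_groups(a, b, 0.3))
         for a in grid for b in grid]
```

Swapping the two stick-breaking arguments turns each across-group quantity into its within-lag counterpart, which mirrors the exchange of roles noted in the text.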

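For different groups $m \ne m'$ and different autoregressive parameters $j \ne j'$ of the same lag,
$$\mathrm{corr}\{F_{mj}(A), F_{m'j'}(A)\} = \frac{\alpha_\phi \beta_\phi / 2 + \alpha_\phi + \beta_\phi + 1}{2\alpha_\phi \beta_\phi + 2\alpha_\phi + 2\beta_\phi + 1}, \qquad \mathrm{pr}(\phi_{mj} = \phi_{m'j'}) = \epsilon_q^2 + \frac{1 - \epsilon_q^2}{2(1 + \alpha_\phi)(1 + \beta_\phi) - 1}. \qquad (3)$$
Some algebra shows that this correlation is less than both (3) and (3). Likewise, $\mathrm{pr}(\phi_{mj} = \phi_{m'j'})$ is smaller than (3) and (3). That is, the correlations of the distribution functions and the probability of matching across both group and autoregressive parameter are strictly smaller than the correlation and matching probability across just one. If $\alpha_\phi > \beta_\phi$, then $\mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\} < \mathrm{corr}\{F_{mj}(A), F_{mj'}(A)\}$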

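3.4.2 Innovation Variance Properties

We now explore the behavior of the innovation variances and their distributions $G_{mj}$. Let $\mathbb{R}^+$ denote the positive real line, $\log A$ be the set $\{\log x : x \in A\}$ for any $A \in B(\mathbb{R}^+)$, and $N_{\psi,\tau}(\cdot)$ the probability function for the $N(\psi, \tau)$ distribution, assuming the hyperparameters $\psi$, $\tau$ are fixed. Properties (3) through (3) hold as in the autoregressive parameter case with $F_0(A)$ replaced by $N_{\psi,\tau}(\log A)$ and $\epsilon$ set to 0.

For innovation variances of the same group and different times $j \ne j'$,
$$\mathrm{corr}\{G_{mj}(A), G_{mj'}(A)\} = \frac{\alpha_\gamma \beta_\gamma / 2 + \alpha_\gamma + \beta_\gamma + 1}{\alpha_\gamma \beta_\gamma + \alpha_\gamma + 2\beta_\gamma + 1}\ \mathrm{corr}\{\delta_{\omega_{j1}}(\log A), \delta_{\omega_{j'1}}(\log A)\}, \qquad (3)$$
and for both different groups $m \ne m'$ and different times $j \ne j'$,
$$\mathrm{corr}\{G_{mj}(A), G_{m'j'}(A)\} = \frac{\alpha_\gamma \beta_\gamma / 2 + \alpha_\gamma + \beta_\gamma + 1}{2\alpha_\gamma \beta_\gamma + 2\alpha_\gamma + 2\beta_\gamma + 1}\ \mathrm{corr}\{\delta_{\omega_{j1}}(\log A), \delta_{\omega_{j'1}}(\log A)\}. \qquad (3)$$
The correlation of these distributions now depends on the choice of Borel set $A$. However, they are the products of a term that depends solely on the stick-breaking parameters $\alpha_\gamma$ and $\beta_\gamma$ and a term that depends only on $A$ and the distribution of $(\omega_{j1}, \omega_{j'1}) \sim N_2\{\psi 1_2, \tau R(\rho)\}$, where $R(\rho)$ is the $2 \times 2$ correlation matrix with off-diagonal elements $\rho^{|j-j'|}$. The higher correlations for neighboring terms imply a smoothing of the variances as a function of $j$ for $\rho > 0$. We observe that the leading term gives the same correlation structure as in (3). Additionally, with the choice of $\rho = 0$, the term depending on $A$ is zero, and the distributions are uncorrelated as in Dunson et al. (2008).

For $j \ne j'$ and $1 \le m, m' \le M$, $\mathrm{pr}(\gamma_{mj} = \gamma_{m'j'}) = 0$, that is, there is no matching of the innovation variances across time points. This is a consequence of the fact that two points drawn from a correlated normal distribution with $|\rho| < 1$ will be equal with probability zero.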
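The correlated candidate atoms for the innovation variances can be simulated directly, which makes the smoothing effect of the autoregressive correlation visible. The sketch below assumes the log-atoms for one cluster form a normal vector with common mean, common variance, and AR(1) correlation, and generates them sequentially; `psi`, `tau`, and `rho` are the names used here for the mean, variance, and correlation, and the function name is illustrative.

```python
import math
import random


def lognormal_atoms(p, H, psi, tau, rho, rng):
    """Return atoms[h][j] = exp(omega_jh) with AR(1)-correlated omega_h."""
    atoms = []
    for _ in range(H):
        # First component from the stationary N(psi, tau) distribution.
        omega = [psi + math.sqrt(tau) * rng.gauss(0.0, 1.0)]
        for _ in range(1, p):
            # Conditional draw keeps variance tau and correlation rho at lag 1.
            mean = psi + rho * (omega[-1] - psi)
            sd = math.sqrt(tau * (1.0 - rho ** 2))
            omega.append(mean + sd * rng.gauss(0.0, 1.0))
        atoms.append([math.exp(w) for w in omega])
    return atoms


rng = random.Random(11)
atoms = lognormal_atoms(p=2, H=4000, psi=0.0, tau=1.0, rho=0.75, rng=rng)

# Empirical correlation of the two log-atoms across the simulated clusters.
x = [math.log(a[0]) for a in atoms]
y = [math.log(a[1]) for a in atoms]
mx, my = sum(x) / len(x), sum(y) / len(y)
sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
sxx = sum((u - mx) ** 2 for u in x)
syy = sum((v - my) ** 2 for v in y)
empirical_corr = sxy / math.sqrt(sxx * syy)
```

Every atom is strictly positive, and because the log-atoms are continuous, two atoms at different time points coincide with probability zero, in line with the no-matching property stated above.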

3.5 Computational Considerations

Recall that equation (3) provided us with the expected approximation error, which we employ to choose the number of clusters necessary for the matrix stick-breaking process truncation. This formula continues to hold for the proposed grouping priors, since the stick-breaking weights are formed using the same framework. Hence, if the values of $\alpha$, $\beta$ for either the lag-block or correlated-lognormal prior are assumed known, then we choose the number of clusters $H$ such that (3) is less than some threshold, such as 0.01. As we generally do not have any knowledge or prior belief about these stick-breaking parameters, it will often be inappropriate to prespecify values, so we follow the suggestion of Dunson et al. (2008) and specify independent Gamma(1,1) priors for $\alpha$ and $\beta$. Analyses using Gamma(10,10) and Gamma(0.1,0.1) indicate little effect of this prior choice on the estimates of $\Sigma_m$. To choose the value of $H$ when using a prior for the stick-breaking parameters, we run a preliminary Markov chain for approximately 10% of the desired chain length and use the posterior means to test whether (3) is below our threshold. If so, we fix this value of $H$ for the remaining computation.

One of the nice properties of the matrix stick-breaking process is that introducing appropriate latent variables leads to a computational algorithm that generally samples from well-known conjugate distributions (Dunson et al. 2008). Because a normal prior for the generalized autoregressive parameters provides conjugacy, the sampling for $\phi$ is from a relatively easy to sample zero-normal mixture. With the lognormal distribution, conjugacy for the innovation variances is not obtained, since inverse gamma is the conjugate distribution for $\gamma$, but we can sample efficiently by incorporating a slice sampling step (Neal 2003). We appropriately modify the algorithm of Dunson et al. (2008) for posterior sampling from our grouping priors and discuss further computational challenges in Appendix D.

One issue in the sampling algorithm is the behavior of sampling the correlation $\rho$. In our experience when considering priors with random $\rho$, $\rho$ did not seem to be well informed by the data. Hence, we opt to treat $\rho$ as a tuning parameter. We recommend specifying a default value such as $\rho = 0.75$, possibly trying a few other choices and selecting the value based on some model selection criterion. As shown in the depression data study in Section 3.7, the three choices of $\rho = 0.5$, 0.75, and 0.9 lead to similar model fits as measured by the deviance. Based on our simulation studies, it appears that the correlated-lognormal prior is fairly robust to the choice of $\rho$.

3.6 Risk Simulation

We now examine the operating characteristics of the proposed grouping priors via a risk simulation designed to mimic the analysis of a typical longitudinal data scenario. We incorporate a non-zero mean, and the simulated data will suffer from ignorable dropout. There are $M = 8$ groups, each with $n_m = 50$ measurements of dimension $p = 6$. Let $D_i$ denote the time $t = 2, \ldots, p + 1$ of dropout for subject $i$, where $D_i = p + 1$ indicates a subject who completes the study. Dropout is induced according to the model
$$\mathrm{logit}\{\mathrm{pr}(D_i = t + 1 \mid D_i > t, y_{it}, m)\} = \kappa_{0t} + \kappa_{1t}\, y_{it} + \kappa_{2m} \quad (t = 1, \ldots, p - 1). \qquad (3)$$
This missing data mechanism is missing at random because the dropout time depends only on observed values.

The mean, covariance, and dropout parameters for the simulation, as well as the probabilities of missingness at time $t$ for each group, are provided in Appendix E. The choices of $\Phi$ and $\Gamma$ do not have any equalities across groups but some within lag. However, with the small sample sizes, it will generally still be advantageous to share information across the eight groups. Also, there is a moderate amount of sparsity in $\Phi$, as is typical for ordered data. All groups have a mean of zero at time 1, and the mean functions increase at differing rates to the final time $t = 6$. The dropout rates vary across groups, with most groups losing 35 to 50% of their subjects by $t = 6$. Groups 3 and
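The dropout mechanism of the risk simulation can be sketched as a sequential draw: at each observed time the subject drops out before the next visit with a probability given by a logistic model in the current response and a group effect. The coefficient arguments `k0`, `k1`, and `group_effect` are placeholder names for the intercepts, slopes, and group term of the logit model, and the numerical values below are hypothetical, not the simulation settings of Appendix E.

```python
import math
import random


def simulate_dropout_time(y, group_effect, k0, k1, rng):
    """Return D in {2, ..., p + 1}; D = p + 1 means the subject completes."""
    p = len(y)
    for t in range(1, p):                        # t = 1, ..., p - 1
        logit = k0[t - 1] + k1[t - 1] * y[t - 1] + group_effect
        prob = 1.0 / (1.0 + math.exp(-logit))    # pr(D = t + 1 | D > t, y_t)
        if rng.random() < prob:
            return t + 1                         # first unobserved time point
    return p + 1


rng = random.Random(5)
y = [0.0, 0.4, 0.9, 1.1, 1.6, 2.0]               # hypothetical complete response
k0 = [-2.0] * 5                                  # hypothetical intercepts
k1 = [0.3] * 5                                   # hypothetical slopes
times = [simulate_dropout_time(y, -0.2, k0, k1, rng) for _ in range(1000)]
completion_rate = sum(d == len(y) + 1 for d in times) / len(times)
```

Because the dropout probability at each step uses only the response observed at that time, the mechanism depends on observed values alone, which is what makes it missing at random.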


Under this data model, we generate 50 data sets and run our Markov chain Monte Carlo algorithm on each data set with each prior for 50,000 iterations, keeping every tenth iteration, using a burn-in of 10,000. We place the following priors on the hyperparameters when they appear in the prior specification: independent Unif(0,1) priors for the parameters indexed by the lag $q$; independent Gamma(1,1) priors for the six positive hyperparameters; InvGamma(0.1,0.1) priors for the two variance hyperparameters; and a $N(0, c^2)$ prior, with $c^2 = 1000$, for the remaining location hyperparameter. We fix the value of $\rho$ to be 0.75. We assume a flat prior on the group-specific mean vectors $\mu_m$. To handle incomplete data, we use data augmentation to sample the missing data values from normal distributions conditional on the observed data.

We measure the performance of our proposed priors by estimating the risk associated with the Bayes estimators under two common loss functions (Yang & Berger 1994),

$$L_1(\Sigma_m, \hat\Sigma_{m1}) = \mathrm{tr}(\Sigma_m^{-1}\hat\Sigma_{m1}) - \log|\Sigma_m^{-1}\hat\Sigma_{m1}| - p \quad\text{and}\quad L_2(\Sigma_m, \hat\Sigma_{m2}) = \mathrm{tr}\{(\Sigma_m^{-1}\hat\Sigma_{m2} - I)^2\}.$$

Since these losses are defined in terms of a single covariance matrix, we consider the loss for estimating the set of covariance matrices to be the average across groups of the losses from the individual covariance matrices. To study the ability to recover the mean function with differing priors on $\Sigma$, we use the posterior mean of $\mu_m$ and the loss function $L_\mu(\hat\mu_m, \mu_m) = (\hat\mu_m - \mu_m)^\top \Sigma_m^{-1} (\hat\mu_m - \mu_m)$, taking an average across groups standardized by the true covariance matrix $\Sigma_m$.

The estimated risks associated with estimating the covariance matrices for loss functions $L_1$ and $L_2$ are shown in Table 3-1. The grouping prior shows a risk improvement of 30 and 25% over the naive Bayes 1 prior and 52 and 41% over the group-specific flat prior. While the naive Bayes 1 prior accommodates sparsity, it does not promote any equality relationships in the autoregressive parameters across groups as in the true parameter values, unlike the grouping prior. The final column of Table 3-1 displays the risk in mean estimation, showing a clear improvement under the grouping prior. The lag-block/correlated-lognormal prior produces a risk 14% smaller than the naive Bayes 1 prior and 29% smaller than the


group-specific estimator. The risk associated with the common-flat prior is more than five times that associated with the grouping priors. This corresponds with our observations in the introduction that by considering a more accurate structure on the dependence, we estimate the mean function more efficiently and with less bias.

Table 3-1. Estimated risks for each choice of covariance prior from the simulation in Section 3.6. The estimated risk is calculated as the average loss using loss functions $L_1(\Sigma_m, \hat\Sigma_{m1}) = \mathrm{tr}(\Sigma_m^{-1}\hat\Sigma_{m1}) - \log|\Sigma_m^{-1}\hat\Sigma_{m1}| - p$, $L_2(\Sigma_m, \hat\Sigma_{m2}) = \mathrm{tr}\{(\Sigma_m^{-1}\hat\Sigma_{m2} - I)^2\}$, and $L_\mu(\hat\mu_m, \mu_m) = (\hat\mu_m - \mu_m)^\top \Sigma_m^{-1} (\hat\mu_m - \mu_m)$.

                                     Estimated Risk
Prior                          L_1        L_2        L_mu
Covariance Grouping Prior      0.425      0.742      0.175
Naive Bayes 1                  0.605      0.987      0.203
Naive Bayes 2                  0.630      1.010      0.210
Group-specific flat*           0.892      1.255      0.248
Common-flat                    8.105      84.339     0.925

* The group-specific flat prior is only over 49 data sets because the Markov chain failed to converge for one data set.

Additional risk simulations using the grouping prior under simpler data models with fully observed data have been performed, some of which are included in Appendix E. These continued to show that our prior outperforms the naive competitors under many different types of covariance matrix specifications, such as situations with no sparsity and dissimilar covariance matrices across groups, and under increasing $n_m$, $M$, and $p$. In particular, we believe that as the number of groups $M$ and the dimension of the covariance matrix $p$ increase, the grouping estimators for $\Sigma$ will outperform the naive Bayes estimators and the margin by which they do so increases. The flat priors continued to perform poorly compared to the grouping and naive choices.

3.7 Data Example

We now demonstrate the use of the grouping priors in the fitting of a longitudinal data set from a depression study. The data, originally presented by Thase et al. (1997)


Figure 3-1. The posterior probabilities of matching for the innovation variances at times 1, 9, and 15. The sizes of the boxes are proportional to $\mathrm{pr}(\gamma_{mj} = \gamma_{m'j} \mid y_{obs})$.

We also consider how the choice of covariance prior affects mean estimation. We show the treatment effect, calculated as the difference in mean value between baseline and week 16, and 95% credible intervals for the first two groups in Table 3-2. In group $m = 2$ we see that there are clear differences for this effect across the different priors. The treatment effect under the grouping priors is estimated to be around 9.5 points, but it is 10.2 for the common-flat and 6.9 for the group-specific flat prior. Even between the two naive priors, the estimated treatment effects differ by 0.6. For group 1, as well as groups 3 and 4, we do not observe much difference in the mean effect, except for some deviation with the common-flat prior, although the credible interval is narrower for the grouping priors than for the flat versions. These two groups demonstrate the bias and efficiency issues relevant to covariance matrix estimation with missing data as discussed in the introduction. We note that the differences do not rise to the level of statistical significance here, but they are large enough to be of practical importance.

Figures 3-1 and 3-2 show the grouping nature of the proposed priors. Figure 3-1 shows the posterior probabilities $\mathrm{pr}(\gamma_{mj} = \gamma_{m'j})$ for each $m, m'$ combination at times $j = 1, 9, 15$. These times were chosen as representative of the overall patterns in the data. For $j = 1$ and most of the undisplayed times, there is substantial matching for groups 1 and 2, the low initial severity groups, as well as for groups 3 and 4, the high


initial severity groups, with less matching across the pairs. The variances at $j = 9$ and $15$ show a stronger propensity to match across all groups.

Figure 3-2. The posterior probabilities of matching for the generalized autoregressive parameters. Panel (a) contains the matching for the first four lag-1 terms, and (b) displays the first four lag-4 terms. The sizes of the gray boxes are proportional to $\mathrm{pr}(\phi_{mj} = \phi_{m'j'} \mid y_{obs})$. The black boxes overlaying the diagonal are proportional to the posterior of $\mathrm{pr}(\phi_{mj} = 0)$. The axes indicate group $m$ (top line) and autoregressive parameter $j$ of lag-$q$ (bottom line).

Figure 3-2 gives the posterior probabilities of matching for the lag-1 and lag-4 autoregressive parameters. We show only the first four of each due to space limitations. The black boxes that overlay the $y = x$ diagonal are proportional to the posterior of $\mathrm{pr}(\phi_{mj} = 0)$. We see that the lag-1 terms are rarely set to zero, while the lag-4 terms for groups 2 and 3 are frequently zeroed out. Due to the nature of the grouping prior, there is a positive probability of equality across $\phi_{mj}$'s of a common lag. In Figure 3-2(a) we note the pairwise probability of equality is very high for all combinations of the first autoregressive parameter of all groups, i.e. the regression of time 2 onto time 1, and the other lag-1, group 2 terms. One would be unlikely to learn of this


relationship or to consider a model with equality across all, or even a large subset, of these parameters using other approaches. Considering Figure 3-2(b), there are larger matching probabilities for the lag-4 parameters, much of which is due to matching with both parameters set to zero. However, the matching is not always due to equality at zero, as can be seen from the large probabilities of matching across the group 1's. There is similar behavior for the group 4 generalized autoregressive parameters.

3.8 Discussion

We have developed a prior on the set of $M$ covariance matrices that simultaneously exploits sparsity and matching of dependence parameters across groups. The model space containing all combinations where each autoregressive parameter/variance is constant across all possible subsets of the groups has $B_M^{p(p+1)/2}$ models, where $B_M$ is the $M$th Bell number (Stanley 1997, p. 33). In fact, the grouping priors consider a space that is even larger, since we allow matching across autoregressive parameters of a common lag. With this many models we have little hope of finding the most appropriate one. Our grouping priors avoid this problem by stochastically considering the possibility of each of these models in a single analysis and accounting for uncertainty appropriately. It is our belief that running a Markov chain with one of these grouping priors is a necessary alternative to the unreasonable time and energy required to fit and compare this extremely large class of models.
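To give a sense of how large this model space is, the count $B_M^{p(p+1)/2}$ can be evaluated directly. The following is a minimal sketch and not part of the original analysis: the function names `bell_number` and `model_space_size` are hypothetical, and the values $M = 8$ and $p = 6$ are borrowed from the risk simulation of Section 3.6 purely for illustration.

```python
from math import comb


def bell_number(m: int) -> int:
    """Return the m-th Bell number using B_{n+1} = sum_k C(n, k) * B_k."""
    bell = [1]  # B_0 = 1: the empty set has exactly one partition
    for n in range(m):
        bell.append(sum(comb(n, k) * bell[k] for k in range(n + 1)))
    return bell[m]


def model_space_size(num_groups: int, dim: int) -> int:
    """Count B_M^{p(p+1)/2}: one partition of the groups per dependence parameter."""
    # p(p-1)/2 autoregressive parameters plus p innovation variances
    num_parameters = dim * (dim + 1) // 2
    return bell_number(num_groups) ** num_parameters


if __name__ == "__main__":
    # Illustrative values: M = 8 groups and p = 6 time points.
    print(bell_number(8))
    print(f"{model_space_size(8, 6):.3e}")
```

Exact integer arithmetic is used so that the count does not overflow; the value is only converted to floating point for display.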


CHAPTER 4
COVARIANCE PARTITION PRIORS: A BAYESIAN APPROACH TO SIMULTANEOUS COVARIANCE ESTIMATION

4.1 Simultaneous Covariance Estimation and a Drawback of the Covariance Grouping Priors

When modeling longitudinal data, estimation of the covariance matrices is of prime importance. In many cases, the data may consist of multiple groups defined by differences in treatments and/or baseline covariates. An important consideration for the analyst is whether, and how, the dependence structure varies across these groups. Often, one performs inference under the assumption that the covariance structures are equal across these groups, but if this assumption is incorrect, the resulting inferences may be invalidated, even for inferences on the mean model if there is missingness (Daniels & Hogan 2008). Conversely, modeling the dependence structures independently without regard to the other groups can lead to inefficiency if the group sample sizes are small or the dimension is large. Our goal is to develop methodology that will allow for the sharing of information across groups to improve estimation efficiency.

Consider $M$ groups containing $n_m$ observations, $Y_{mi}$ ($i = 1, \ldots, n_m$; $m = 1, \ldots, M$), of $T$-dimensional, multivariate normal data. Without loss of generality, we assume $Y_{mi}$ has mean zero; otherwise, we let $Y_{mi}$ represent the residual after centering. Each group $m$ has its own covariance matrix $\Sigma_m$, and we let $\Sigma = \{\Sigma_1, \ldots, \Sigma_M\}$ denote the collection of covariance matrices. We parameterize each $\Sigma_m$ through the modified Cholesky decomposition (Pourahmadi 1999, 2000), so $\Sigma_m^{-1} = T_m D_m T_m^\top$ with $D_m$ a diagonal matrix with positive entries and $T_m$ an upper-triangular matrix with unit diagonals. The parameters of this Cholesky parameterization are interpretable by considering the sequential distributions of $Y_{mi}$,

$$f(Y_{mi1}, \ldots, Y_{miT}) = f(Y_{mi1})\, f(Y_{mi2} \mid Y_{mi1}) \cdots f(Y_{miT} \mid Y_{mi1}, \ldots, Y_{mi,T-1}). \qquad (4)$$


Under multivariate normality each of these $T$ sequential distributions, $f(Y_{mit} \mid Y_{mi1}, \ldots, Y_{mi,t-1})$, is a normal distribution with mean $\sum_{j=1}^{t-1} \phi_{m;jt} Y_{mij}$ and variance $\gamma_{mt}$. Let $\phi_{mt} = (\phi_{m;1t}, \ldots, \phi_{m;t-1,t})^\top$ be the vector of regression coefficients from the sequential regression for the $t$th response. From the Cholesky decomposition, the unconstrained elements of the $t$th column of $T_m$ are $-\phi_{mt}$, and the $(t,t)$ element of $D_m$ is $1/\gamma_{mt}$. Hence, these sequential distributions fully and uniquely determine $\Sigma_m$. Further, we need not worry about the positive definite constraint on $\Sigma_m$ as long as $\gamma_{mt} > 0$ for all $t$, and the normal and inverse gamma distributions provide conditionally conjugate priors for $\phi_{mt}$ and $\gamma_{mt}$, respectively (Daniels & Pourahmadi 2002). We refer to the $\phi_{m;jt}$'s as the generalized autoregressive parameters (GARPs) and the $\gamma_{mt}$'s as the innovation variances (IVs).

There has been a substantial amount of research on the simultaneous covariance estimation problem, but relatively little focus on the longitudinal data scenario. Methods have been suggested that impose commonalities on a particular feature of the $\Sigma_m$'s: equality across subsets of the principal components of $\Sigma$ (Boik 2002) or its correlation matrix (Boik 2003); equality across correlation matrices or the volumes of $\Sigma$ (Manly & Rayner 1987); equality across all $T_m$ and/or all $D_m$ (McNicholas & Murphy 2010); equality among arbitrary subsets of $T_m$ and $D_m$ (Pourahmadi et al. 2007). The latter is in the spirit of our method, but there the subsets must be provided by the user. This presents a significant computational challenge because there are far too many possible subsets to find the best configuration without an automated method. Daniels (2006) and Hoff (2009) propose shrinkage priors on the Cholesky terms and the eigenvectors, respectively. Guo et al. (2011) and Danaher et al. (2012) propose penalized likelihood models that induce a sparsity structure common across groups. The Danaher et al. (2012) technique also allows equality across groups at a single non-zero value. This is unsatisfactory in our case, as we expect that there may be subsets of the groups that should share parameter values at distinct non-zero values; Guo et al. (2011)


provides no information across groups beyond a common graphical structure. Other authors have modeled the covariance matrices through regressions on a continuous covariate (Chiu et al. 1996; Daniels 2006; Fox & Dunson 2011; Hoff & Niu 2012), but the regression parameters in these models often lack interpretation.

In Chapter 3 (see also Gaskins & Daniels 2013), we proposed a nonparametric prior for the Cholesky terms based on the matrix stick-breaking process (Dunson et al. 2008). This methodology proposes a sparse prior that allows for matching of the $\phi_{m;jt}$ GARPs ($m = 1, \ldots, M$; $j = 1, \ldots, t-1$; $t = 2, \ldots, T$) across all $m$ and all $j, t$ within a fixed value of $t - j$. The value $t - j$ is called the lag and represents the time distance between the response $Y_t$ and the regressor $Y_j$. The IVs $\gamma_{mt}$ may also match across groups. This prior considers an enormous model space and requires many latent variables for an efficient sampling scheme. In this chapter we simplify the set of models under consideration while still considering a rich set of sparse models with common structures across groups. This leads to faster computations than the method of Chapter 3 and, hence, will better accommodate data with larger dimension $T$.

In Section 4.2 we introduce our approach based on a covariance partition prior. To share information across groups, this prior specifies partitions of the groups such that those groups contained in the same set of the partition will have common values for some subset of the dependence parameters. We define a partition corresponding to each of the sequential distributions in (4) and consider the sequence of partitions to behave as a Markov chain. Section 4.3 contains details of the computational algorithm needed to generate a posterior sample for the proposed prior. Performance of the covariance partition prior is studied in Sections 4.4 and 4.5 through a simulation study and the analysis of data from a depression study. A brief discussion in Section 4.6 concludes the chapter.
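The modified Cholesky parameterization used throughout this chapter can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the code used for the analyses: the function name `modified_cholesky` and the AR(1)-type test matrix are hypothetical. It obtains the GARPs and IVs of a single covariance matrix by regressing each response on its predecessors, then fills the unconstrained entries of column $t$ of $T$ with the negated GARPs and sets the diagonal of $D$ to the reciprocal IVs.

```python
import numpy as np


def modified_cholesky(sigma):
    """Return (T, D, garps, ivs) such that inv(sigma) equals T @ D @ T.T.

    garps[t] holds the regression coefficients of response t on responses
    0..t-1 (0-based), and ivs[t] is the corresponding residual variance.
    """
    sigma = np.asarray(sigma, dtype=float)
    dim = sigma.shape[0]
    T = np.eye(dim)          # upper triangular with unit diagonal
    ivs = np.empty(dim)
    garps = []
    for t in range(dim):
        if t == 0:
            phi = np.empty(0)            # no predecessors, hence no GARPs
            ivs[t] = sigma[0, 0]
        else:
            s11 = sigma[:t, :t]          # covariance of the past responses
            s12 = sigma[:t, t]           # covariance of past with response t
            phi = np.linalg.solve(s11, s12)
            ivs[t] = sigma[t, t] - s12 @ phi
            T[:t, t] = -phi              # negated GARPs fill column t
        garps.append(phi)
    D = np.diag(1.0 / ivs)
    return T, D, garps, ivs


if __name__ == "__main__":
    # Hypothetical AR(1)-type covariance with correlation 0.5.
    idx = np.arange(4)
    sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
    T, D, garps, ivs = modified_cholesky(sigma)
    print(np.allclose(T @ D @ T.T, np.linalg.inv(sigma)))
    print(garps[3], ivs)
```

For an AR(1)-type matrix only the lag-1 GARP is non-zero, which is the kind of sparsity in $T_m$ that the priors of this chapter are designed to exploit.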


4.2 Covariance Partition Prior

4.2.1 Prior on the Sequence of Partitions

First, it is necessary to define some notation. Let $\mathcal{M} = \{1, \ldots, M\}$ denote the collection of all groups. We are interested in partitioning the groups into sets that share a similar dependence structure. Let $P$ denote a partition of $\mathcal{M}$, and consider the collection of all possible $P$. The cardinality of this collection is $B_M$, the $M$th Bell number (Stanley 1997, p. 33). $B_M$ is equal to the sum of the Stirling numbers of the second kind. Any partition $P$ can be written as the collection of its sets $P = \{S_1, \ldots, S_d\}$, where each $S_i$ is non-empty, $d$ is the degree (the number of distinct sets) of $P$, $S_i \cap S_j = \emptyset$ for all $1 \le i$

clustering of a Dirichlet process, in which a set $S_i$ contributes $\alpha\,(n_i!)$, where $\alpha$ is the concentration parameter and $n_i$ the degree of set $S_i$. However, our focus is on a joint prior $\pi(P_1, \ldots, P_T)$ for $T$ time-ordered partitions. We desire a prior that will encourage similar structures across $t$. If groups $m_1$ and $m_2$ are in the same set in $P_t$, and therefore share the same GARPs and IV for response $t$, it should be more likely that they are in the same set of $P_{t+1}$. To that end, we consider the sequence of partitions $\{P_1, \ldots, P_T\}$ to be a Markov process on the state space of all partitions of $\mathcal{M}$. By the Markov property and time-invariance, we let the distribution of $P_t$ given all previous partitions depend only on the most recent partition, that is, $\mathrm{pr}(P_t \mid P_1, \ldots, P_{t-1}) = \mathrm{pr}(P_t \mid P_{t-1}) = \mathrm{pr}(P_2 \mid P_1)$ for all $t$. Hence, the full conditional distribution of $P_t$ given all the other partitions depends only on the adjacent partitions $P_{t-1}$ and $P_{t+1}$.

To specify the transition probability $\mathrm{pr}(P_2 \mid P_1)$, we employ a commonly used metric on the space of partitions defined by $d(P_1, P_2) = 2|P_1 \cap P_2| - |P_1| - |P_2|$, where $|P|$ gives the degree of a partition $P$ and the intersection partition is $P_1 \cap P_2 = \{S : S \ne \emptyset;\ S = S_1 \cap S_2 \text{ for some sets } S_1 \in P_1 \text{ and } S_2 \in P_2\}$ (Day 1981). Note that for two groups $m_1 \ne m_2$ to be in the same set of the partition $P_1 \cap P_2$, they must have been in the same set in both $P_1$ and $P_2$. The distance $d(P_1, P_2)$ is interpreted as the minimum number of merges and splits of the sets of $P_1$ needed to obtain $P_2$. Other metrics on the space of partitions can be found in Arabie & Boorman (1973) or Denœud & Guenoche (2006).

Using the $d(\cdot,\cdot)$ metric, we define the closeness between the two partitions by $c_q(P_1, P_2) = [1 + \{d(P_1, P_2)\}^q]^{-1}$, where $q$ is a non-negative constant determining the relative strength of the distance. Note that for all finite $q > 0$, $c_q(P, P) = 1$ for all $P$, $c_q(P_1, P_2) \in (0, 1)$ for $P_1 \ne P_2$, and $c_q(\cdot,\cdot)$ is decreasing in $d(\cdot,\cdot)$. Define $a_q(P) = B_M^{-1} \sum_{P'} c_q(P, P')$ to be the attractiveness of the partition $P$, given by the average closeness of $P$ over all partitions $P'$. The transition probability from $P_{t-1}$ to $P_t$ ($t > 1$)


is proportional to the closeness of the partitions and is given by

$$\mathrm{pr}(P_t \mid P_{t-1}) = \frac{c_q(P_{t-1}, P_t)}{a_q(P_{t-1})\, B_M}. \qquad (4)$$

It is easy to verify that the stationary probability of $P$ is proportional to its attractiveness $a_q(P)$. Hence, we choose the distribution for the initial partition $P_1$ to be the set of stationary probabilities $\mathrm{pr}(P_1) = a_q(P_1)/(A_q B_M)$, where $A_q = B_M^{-1} \sum_{P} a_q(P)$ is the average attractiveness. Because we are starting a Markov chain at its stationary distribution, the marginal probability for partition $P_t$ is

$$\mathrm{pr}(P_t) = \frac{a_q(P_t)}{A_q B_M} \quad (t = 1, \ldots, T). \qquad (4)$$

Combining the transition probabilities with the distribution of the initial partition, the distribution of the entire partition process is given by

$$\pi(P_1, \ldots, P_T) = B_M^{-T}\, \frac{a_q(P_1)}{A_q} \prod_{t=2}^{T} \frac{c_q(P_{t-1}, P_t)}{a_q(P_{t-1})}. \qquad (4)$$

Our partition prior clearly depends on the value of $q$. When $q = 0$, $c_0(P_1, P_2) = 1/2$ for all $P_1, P_2$ under the usual convention $0^0 = 1$. Hence, $a_0(P) = 1/2$ for all $P$, and $\mathrm{pr}(P_t \mid P_{t-1}) = \mathrm{pr}(P_t) = 1/B_M$. The previous partition provides no information about the new partition, and all partitions are equally likely. We call this case the independent-uniforms prior, because each partition is independent of the others and follows a uniform distribution over the space of partitions. Because $0^q$ is discontinuous at $q = 0$, $c_q(P_1, P_2) \to 0.5 + 0.5\, I(P_1 = P_2)$ as $q$ approaches zero. In this limit, $\mathrm{pr}(P_t \mid P_{t-1}) \to \{1 + I(P_t = P_{t-1})\}/(B_M + 1)$. The probability of moving to any different partition is $1/(B_M + 1)$, while remaining at the same partition is twice as likely. Since $B_M$ is large, there is little practical difference between the results at $q = 0$ and $q \to 0$.

Conversely, as $q$ gets larger, $c_q(P_1, P_2) \approx 0$ if $d(P_1, P_2) > 1$. Hence, moves from $P_{t-1}$ to $P_t$ that require more than one merge or split are increasingly unlikely. For large $q$ this places a highly restrictive structure on the partition process. Hence, we only allow


values of $q$ between 0 and 10 so that $\mathrm{pr}(P_t \mid P_{t-1})$ is bounded away from zero for all $q$ and all $t$, $P_{t-1}$, $P_t$. We note that $\sum_{P} |\mathrm{pr}(P \mid q = 10) - \mathrm{pr}(P \mid q = \infty)| < 0.001$, and so there is little difference between the marginal probabilities for $q$ greater than 10.

Figure 4-1. Marginal probabilities for three choices of $q$ with $M = 8$ groups. These probabilities are scaled by $B_M$ and can be viewed as the ratio of marginal probabilities under $q$ and under the independent-uniforms ($q = 0$) case, i.e. $\mathrm{pr}(P \mid q)/\mathrm{pr}(P \mid q = 0)$. Partitions are ordered by increasing degree on the x-axis with no informative ordering between partitions of common degree; partitions of odd degree are in black and those of even degree are in gray.

To better see the effect the choice of $q$ has on the marginal probabilities $\mathrm{pr}(P)$ of (4), we plot the marginal probabilities for three choices of $q$ for $M = 8$ groups in Figure 4-1. These probabilities are scaled by $B_M = 4140$ and can be viewed as the ratio of marginal probabilities under the particular $q$ to the probability under the independent-uniforms $q = 0$ case, i.e. $\mathrm{pr}(P \mid q)/\mathrm{pr}(P \mid q = 0)$. For each $q$ the highest probability partition is $P_{pool} = \{\mathcal{M}\}$, the partition that pools all data into a single group. For smaller $q$ all marginal probabilities are relatively close, but as $q$ increases, the disparity across $P$ increases. For $q = 5$, $\max_{P, P'}\{\mathrm{pr}(P)/\mathrm{pr}(P')\} = 11.4$, but since the support is large, this difference is not so great as to leave any partitions with negligible


support. This disparity does not increase substantially after $q = 5$, as each $c_q(P_1, P_2)$ is approximately either 1, 1/2, or 0 for $q > 5$.

As it is not clear what an optimum value of $q$ would be, we let $q$ be a random variable to be sampled as part of the analysis. Because our sampling scheme will require the values of $A_q$ and $a_q(P_1), \ldots, a_q(P_T)$ for each $q$, we specify a finite support to minimize the computational complexity. To that end, we choose the sequence $\mathcal{Q}$ from 0 to 10 with steps of 0.025 ($|\mathcal{Q}| = 401$) to be the support of $q$. The prior for $q$ is uniform over the set $\mathcal{Q}$.

4.2.2 Prior on the Cholesky Parameters

Given a particular partition $P_t$, groups in a common set will share common values for the dependence parameters associated with the sequential distribution $f(Y_{mit} \mid Y_{mi1}, \ldots, Y_{mi,t-1})$ from (4). These parameters are $(\phi_{mt}, \gamma_{mt})$, where $\phi_{mt}$ is empty when $t = 1$. To that end, for each set $S_{it} \in P_t = \{S_{1t}, \ldots, S_{d_t t}\}$, we associate the parameters $(\phi^\star_{it}, \gamma^\star_{it})$, so that $(\phi^\star_{it}, \gamma^\star_{it}) = (\phi_{mt}, \gamma_{mt})$ for all $m \in S_{it}$. As $(\phi_{mt}, \gamma_{mt})$ are determined by $\{(\phi^\star_{it}, \gamma^\star_{it})\}_{i=1}^{d_t}$ and $P_t$, we now specify the prior on $(\phi^\star_{it}, \gamma^\star_{it})$ conditional on the partition $P_t$.

In addition to the matching across groups that is induced by our partitions, we develop our prior to induce sparsity in the $T_m$ matrices. Under multivariate normality $\phi_{m;jt} = 0$ implies $Y_{mij}$ and $Y_{mit}$ are independent given the $Y_{mik}$'s with $k$

The prior for $(\phi^\star_{it}, \gamma^\star_{it})$ conditional on the partition $P_t$ is as follows.

$$\mathrm{pr}(\kappa_{it} = k \mid P_t) = \omega_{kt} = \frac{\exp\{-\psi_1 I(k = 0) - \psi_2 k\}}{\sum_{l=0}^{t-1} \exp\{-\psi_1 I(l = 0) - \psi_2 l\}} \quad (k = 0, \ldots, t-1), \qquad (4)$$

$$\gamma^\star_{it} \mid P_t \sim \mathrm{InvGamma}(\lambda_1, \lambda_2), \qquad (4)$$

$$\tilde\phi_{it} \mid \kappa_{it}, \gamma^\star_{it}, P_t \sim N_{\kappa_{it}}(0_{\kappa_{it}},\ \gamma^\star_{it}\, \sigma^2 I_{\kappa_{it}}). \qquad (4)$$

$\kappa_{it}$ determines on how many of the previous measurements the response $Y_{mit}$ is dependent (for $m \in S_{it}$). We let $\tilde\phi_{it}$ give the $\kappa_{it}$ non-zero GARPs, and so $\phi^\star_{it} = (0^\top_{t-1-\kappa_{it}}, \tilde\phi^\top_{it})^\top$ gives the full vector of GARPs corresponding to the negatives of the unconstrained elements of the $t$th column of $T_m$. For conditional conjugacy we use a normal prior for $\tilde\phi_{it}$ and the inverse gamma distribution for the innovation variances. When $t = 1$ there are no GARPs, and only the inverse gamma prior on $\gamma^\star_{it}$ is needed.

For a fully Bayesian analysis, we need hyperpriors for the hyperparameters $\sigma^2$, $\psi_1$, $\psi_2$, $\lambda_1$, and $\lambda_2$. We specify $\sigma^2 \sim \mathrm{InvGamma}(0.1, 0.1)$ and independent Gamma(1,1) priors for the innovation variance parameters $\lambda_1, \lambda_2$. To define the hyperpriors for $\psi_1$ and $\psi_2$, we first explore the interpretation of these parameters. Algebra shows that

$$\psi_2 = \log\frac{\mathrm{pr}(\kappa_{it} = k)}{\mathrm{pr}(\kappa_{it} = k+1)} \quad (k = 1, \ldots, t-2;\ t = 3, \ldots, T),$$

providing an interpretation of $\psi_2$ as the log odds of $Y_{mit}$ depending on one fewer past response. To encourage sparsity and smaller values of $\kappa_{it}$, we choose an exponential distribution with rate $\log(2)$ as the prior distribution for $\psi_2$. This guarantees that this log odds is strictly positive with a prior mean of $\log(2)$. Further,

$$\psi_1 - \psi_2 k = \log\frac{\mathrm{pr}(\kappa_{it} = k)}{\mathrm{pr}(\kappa_{it} = 0)} \quad (k = 1, \ldots, t-1;\ t = 2, \ldots, T).$$

We choose a prior of $\psi_1 \mid \psi_2 \sim \mathrm{Unif}(\psi_2, (T-1)\psi_2)$. The left endpoint represents the case that $\mathrm{pr}(\kappa_{it} = 1) = \mathrm{pr}(\kappa_{it} = 0)$, i.e. that the response depending on only the most recent measurement is as likely as the response being independent of all past measurements, and the right endpoint is equivalent to $\mathrm{pr}(\kappa_{iT} = T-1) = \mathrm{pr}(\kappa_{iT} = 0)$, the final response


depending on the full history is as likely as independence from the history. Further, for any $k = 1, \ldots, T-1$, $\psi_2 k \in [\psi_2, (T-1)\psi_2]$, so it is equally likely that $\psi_1 = \psi_2 k$ for any $k$. That is, it is as likely that $Y_{mit}$ depends on the $k$ previous responses as the case that $Y_{mit}$ is independent of its history, for each choice of $k$. Simulations have demonstrated robustness to these hyperprior choices.

4.3 Sampling Algorithm

Posterior inference using the covariance partition priors will rely on a posterior sample generated from a Markov chain Monte Carlo (MCMC) algorithm. In this section we describe the Gibbs sampler necessary to obtain a posterior sample. Let $H = (q, \sigma^2, \psi_1, \psi_2, \lambda_1, \lambda_2)$ denote the set of model hyperparameters, and $C_t = \{(\phi_{mt}, \gamma_{mt})\}_{m=1}^{M}$ denote the set of Cholesky parameters for the $t$th sequential distributions of all $M$ groups. Further, $C_{(-t)}$ represents the set of Cholesky parameters $C_1, \ldots, C_T$ excluding $C_t$; $P_{(-t)}$ is defined similarly. The sampling algorithm consists of steps of the form $[P_t, C_t \mid P_{(-t)}, C_{(-t)}, H, \mathcal{D}]$ that jointly update the partition and Cholesky parameters associated with the $t$th sequential distribution given the data $\mathcal{D}$. We perform this step in two parts by factoring

$$[P_t, C_t \mid P_{(-t)}, C_{(-t)}, H, \mathcal{D}] = [P_t \mid P_{(-t)}, C_{(-t)}, H, \mathcal{D}]\ [C_t \mid P_t, P_{(-t)}, C_{(-t)}, H, \mathcal{D}].$$

Finally, we update the hyperparameters $H$ given $P_1, \ldots, P_T, C_1, \ldots, C_T, \mathcal{D}$.

Before beginning the sampling scheme, there are some computations that should be performed and stored first. Updating the value of $q$ requires knowledge of $A_q$ and $a_q(\cdot)$ for each $P_t$. We choose to compute the full set of attractivenesses for each $q \in \mathcal{Q}$ and all partitions $P$ first. When we update $q$ in the MCMC scheme, we look up the needed values.

First, we describe the $[P_t \mid P_{(-t)}, C_{(-t)}, H, \mathcal{D}]$ step that samples the partition $P_t$ given the remaining partitions, marginalized over the GARPs and IVs $C_t$. It can be shown that $[P_t \mid P_{(-t)}, C_{(-t)}, H, \mathcal{D}]$ depends only on the other partitions and $q$,


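$[P_t \mid P_{(-t)}, C_{(-t)}, H, \mathcal{D}] = [P_t \mid P_{(-t)}, q, \mathcal{D}]$; we sample from this distribution with a reversible jump MCMC step (Green 1995). Let $P_t$ denote the current value of the partition. We propose a candidate partition $P^\star_t = \{S^\star_{1t}, \ldots, S^\star_{d^\star_t t}\}$ as follows. Uniformly select $m$ from $\mathcal{M}$, and let $S' \in P_t$ denote the set of the partition that contains group $m$. The choice of the candidate $P^\star_t$ depends on the form of $S'$.

If $S' = \mathcal{M}$, then $P^\star_t = \{\mathcal{M} \setminus \{m\}, \{m\}\}$. That is, the candidate partition splits group $m$ from $\mathcal{M}$.

If $S' = \{m\}$, then uniformly sample $S''$ from the remaining sets in $P_t$. The candidate $P^\star_t$ is $\{S' \cup S''\} \cup (P_t \setminus \{S', S''\})$. That is, we merge the singleton set $S'$ with the set $S''$.

If $S'$ is neither a singleton nor $\mathcal{M}$, then we uniformly sample $S''$ from $(P_t \setminus \{S'\}) \cup \{\emptyset\}$. The candidate partition $P^\star_t$ is given by $\{S' \setminus \{m\}\} \cup \{S'' \cup \{m\}\} \cup (P_t \setminus \{S', S''\})$. That is, we move the group $m$ from set $S'$ to the (possibly empty) set $S''$.

Given $m$, the set $S''$ is drawn uniformly from $d_t - I(\{m\} \in P_t)$ options, so the proposal probability is inversely proportional to this count. It can be shown that the corresponding ratio of proposal probabilities entering the acceptance probability is

$$\frac{\mathrm{pr}(P^\star_t \to P_t)}{\mathrm{pr}(P_t \to P^\star_t)} = \frac{d_t - I(\{m\} \in P_t)}{d^\star_t - I(\{m\} \in P^\star_t)}.$$

By construction, $P_t$ and $P^\star_t$ differ only in the location of group $m$. Because we are dealing with partitions on $\mathcal{M}$, a relatively small space, we are able to traverse the high probability regions of the partition space by making these single step moves. For problems where the model space is larger, it may be necessary to use split-merge moves (e.g. Kim et al. 2006; Richardson & Green 1997) or a mixture of single step and split-merge moves (as in Booth et al. 2008). However, the posterior samples from our simulations and data analysis do not indicate that this is needed in our setting.

Computing the probability of accepting the move to $P^\star_t$ requires the likelihood implied by $P_t$ and $P^\star_t$ conditional on the other partitions,

$$\mathrm{lik}(P_t \mid P_{(-t)}, \text{data}) \propto \mathrm{pr}(P_t \mid P_{t-1})\, \mathrm{pr}(P_{t+1} \mid P_t) \prod_{i=1}^{d_t} \mathrm{marg}(Y_{mi}, m \in S_{it}),$$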


where $\mathrm{marg}(Y_{mi}, m \in S_{it})$ is the likelihood of the data from groups in set $S_{it}$ after marginalizing out the GARPs and IVs,

$$\mathrm{marg}(Y_{mi}, m \in S_{it}) = \sum_{\kappa=0}^{t-1} \int_{\mathbb{R}^\kappa} \int_{\mathbb{R}^+} \Big\{ \prod_{m \in S_{it}} \prod_{i=1}^{n_m} f(Y_{mit} \mid Y_{mi1}, \ldots, Y_{mi,t-1}; \tilde\phi, \gamma) \Big\}\, \pi(\tilde\phi \mid \kappa, \gamma)\, \pi(\gamma)\, \omega_{\kappa t}\, d\gamma\, d\tilde\phi$$

$$= \sum_{\kappa=0}^{t-1} \omega_{\kappa t}\, (2\pi)^{-n/2}\, \sigma^{-\kappa}\, |\sigma^{-2} I_\kappa + X_\kappa^\top X_\kappa|^{-1/2}\, \lambda_2^{\lambda_1}\, \Gamma(\lambda_1)^{-1}\, \Gamma(\lambda_1 + n/2) \Big[ \lambda_2 + \tfrac{1}{2}\, y^\top \{ I_n - X_\kappa (\sigma^{-2} I_\kappa + X_\kappa^\top X_\kappa)^{-1} X_\kappa^\top \}\, y \Big]^{-(\lambda_1 + n/2)}, \qquad (4)$$

where $n = \sum_{m \in S_{it}} n_m$, $y$ is the $n$-vector containing $Y_{mit}$ ($i = 1, \ldots, n_m$, $m \in S_{it}$), and $X_\kappa$ is the $n \times \kappa$ matrix with rows $(Y_{mi,t-\kappa}, \ldots, Y_{mi,t-1})$. For $\kappa = 0$, the $X_0$ matrix is empty, and the summand in (4) reduces to $\omega_{0t}\, (2\pi)^{-n/2}\, \lambda_2^{\lambda_1}\, \Gamma(\lambda_1)^{-1}\, \Gamma(\lambda_1 + n/2)\, [\lambda_2 + \tfrac{1}{2} y^\top y]^{-(\lambda_1 + n/2)}$.

The reversible jump MCMC step accepts the move from $P_t$ to $P^\star_t$ if

$$U \le \min\left(1,\ \frac{\mathrm{pr}(P^\star_t \mid P_{t-1})\, \mathrm{pr}(P_{t+1} \mid P^\star_t) \prod_{i=1}^{d^\star_t} \mathrm{marg}(Y_{mi}, m \in S^\star_{it})}{\mathrm{pr}(P_t \mid P_{t-1})\, \mathrm{pr}(P_{t+1} \mid P_t) \prod_{i=1}^{d_t} \mathrm{marg}(Y_{mi}, m \in S_{it})} \cdot \frac{\mathrm{pr}(P^\star_t \to P_t)}{\mathrm{pr}(P_t \to P^\star_t)} \right)$$

for $U \sim \mathrm{Unif}(0,1)$. The values $\mathrm{pr}(P^\star_t \mid P_{t-1})$, $\mathrm{pr}(P_{t+1} \mid P^\star_t)$, $\mathrm{pr}(P_t \mid P_{t-1})$, $\mathrm{pr}(P_{t+1} \mid P_t)$ are given by the transition probability of Section 4.2.1 (for the current value of $q$).

For the $[C_t \mid P_t, P_{(-t)}, C_{(-t)}, H, \mathcal{D}]$ step, we can express this distribution as $\prod_{i=1}^{d_t} [(\phi^\star_{it}, \gamma^\star_{it}, \kappa_{it}) \mid P_t, H, \mathcal{D}]$ and sample each $(\phi^\star_{it}, \gamma^\star_{it}, \kappa_{it})$ exactly from its conditional distribution. The sampling distributions are

$$\mathrm{pr}(\kappa_{it} = k \mid S_{it}, H, \mathcal{D}) \propto \omega_{kt}\, \sigma^{-k}\, |\sigma^{-2} I_k + X_k^\top X_k|^{-1/2} \Big[ \lambda_2 + \tfrac{1}{2}\, y^\top \{ I_n - X_k (\sigma^{-2} I_k + X_k^\top X_k)^{-1} X_k^\top \}\, y \Big]^{-(\lambda_1 + n/2)}, \qquad (4)$$

$$\gamma^\star_{it} \mid S_{it}, \kappa_{it}, H, \mathcal{D} \sim \mathrm{InvGamma}\Big( \lambda_1 + \frac{n}{2},\ \lambda_2 + \tfrac{1}{2}\, y^\top \{ I_n - X_\kappa (\sigma^{-2} I_\kappa + X_\kappa^\top X_\kappa)^{-1} X_\kappa^\top \}\, y \Big),$$

$$\tilde\phi_{it} \mid S_{it}, \kappa_{it}, \gamma^\star_{it}, H, \mathcal{D} \sim N_{\kappa_{it}}\Big( (\sigma^{-2} I_\kappa + X_\kappa^\top X_\kappa)^{-1} X_\kappa^\top y,\ \gamma^\star_{it}\, (\sigma^{-2} I_\kappa + X_\kappa^\top X_\kappa)^{-1} \Big),$$

where $\kappa = \kappa_{it}$ in the last two distributions.
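The exact draw of the number of non-zero GARPs, the innovation variance, and the GARPs for a single set of the partition can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used for the analyses: the function names are hypothetical, the argument names `sigma2`, `lam1`, `lam2`, `psi1`, and `psi2` are our labels for the hyperparameters, and `X_past` is assumed to hold the $t-1$ earlier responses in time order (oldest first) for the pooled subjects in the set. Probabilities are computed on the log scale for numerical stability.

```python
import numpy as np


def sparsity_prior(t, psi1, psi2):
    """Prior probabilities omega_{kt} for k = 0, ..., t-1 non-zero GARPs."""
    k = np.arange(t)
    log_w = -psi1 * (k == 0) - psi2 * k
    w = np.exp(log_w - log_w.max())
    return w / w.sum()


def kappa_posterior(y, X_past, omega, sigma2, lam1, lam2):
    """Posterior probabilities of kappa, plus the pieces needed for later draws."""
    n, t_minus_1 = X_past.shape
    log_p = np.empty(t_minus_1 + 1)
    fits = []
    for k in range(t_minus_1 + 1):
        X_k = X_past[:, t_minus_1 - k:]       # the k most recent past responses
        if k == 0:
            prec, mean, logdet, quad = np.empty((0, 0)), np.empty(0), 0.0, 0.0
        else:
            prec = np.eye(k) / sigma2 + X_k.T @ X_k
            mean = np.linalg.solve(prec, X_k.T @ y)
            logdet = np.linalg.slogdet(prec)[1]
            quad = (X_k.T @ y) @ mean         # y' X (prec)^{-1} X' y
        scale = lam2 + 0.5 * (y @ y - quad)
        log_p[k] = (np.log(omega[k]) - 0.5 * k * np.log(sigma2)
                    - 0.5 * logdet - (lam1 + n / 2) * np.log(scale))
        fits.append((prec, mean, scale))
    probs = np.exp(log_p - log_p.max())
    return probs / probs.sum(), fits


def draw_cholesky_block(y, X_past, omega, sigma2, lam1, lam2, rng):
    """Draw (kappa, gamma_star, phi_star) for one set of the partition."""
    probs, fits = kappa_posterior(y, X_past, omega, sigma2, lam1, lam2)
    k = int(rng.choice(len(probs), p=probs))
    prec, mean, scale = fits[k]
    # InvGamma(a, b) draw: reciprocal of a Gamma(shape=a, rate=b) variate.
    gamma_star = 1.0 / rng.gamma(shape=lam1 + len(y) / 2, scale=1.0 / scale)
    if k == 0:
        phi_tilde = np.empty(0)
    else:
        phi_tilde = rng.multivariate_normal(mean, gamma_star * np.linalg.inv(prec))
    # Pad with leading zeros for the past responses that are excluded.
    phi_star = np.concatenate([np.zeros(X_past.shape[1] - k), phi_tilde])
    return k, gamma_star, phi_star
```

All $t$ candidate values of $\kappa_{it}$ are evaluated in a single pass, and the fitted precision, mean, and scale for the selected value are reused for the two subsequent draws.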


We then set $\phi^\star_{it} = (0^\top_{t-1-\kappa_{it}}, \tilde\phi^\top_{it})^\top$ and $(\phi_{mt}, \gamma_{mt}) = (\phi^\star_{it}, \gamma^\star_{it})$ for $m \in S_{it}$.

Both the marginal likelihood and the sampling distribution of $\kappa_{it}$ require the calculation of $t$ terms. For data with large $T$, this may adversely affect computational speed. In such cases we recommend modifying the prior on $\kappa_{it}$ to $\mathrm{pr}(\kappa_{it} = k) \propto \exp\{-\psi_1 I(k = 0) - \psi_2 k\}\, I(k \le k_0)$, where $k_0$ is a fixed value representing the maximum number of previous measurements on which $Y_{mit}$ can depend. Hence, these two quantities will only require the calculation of $\min\{t, k_0 + 1\}$ terms. This is related to the idea of banding the Cholesky matrix (Rothman et al. 2010; Wu & Pourahmadi 2003), which fixes all $\phi_{m;jt}$, $t - j > k_0$, to be zero and all $\phi_{m;jt}$, $t - j \le k_0$, to be non-zero. The addition of the $I(k \le k_0)$ term to $\mathrm{pr}(\kappa_{it} = k)$ in our model provides a more flexible structure than banding, as it allows some $\phi_{m;jt} = 0$, $t - j \le k_0$.

We now update the hyperparameters $H$. For $\sigma^2$ and $\lambda_2$, conjugacy gives the conditionals

$$\sigma^2 \mid \cdot \sim \mathrm{InvGamma}\Big( 0.1 + \tfrac{1}{2} \sum_{t=2}^{T} \sum_{i=1}^{d_t} \kappa_{it},\ 0.1 + \tfrac{1}{2} \sum_{t=2}^{T} \sum_{i=1}^{d_t} \phi^{\star\top}_{it} \phi^\star_{it} / \gamma^\star_{it} \Big),$$

$$\lambda_2 \mid \cdot \sim \mathrm{Gamma}\Big( 1 + \lambda_1 \sum_{t=1}^{T} d_t,\ 1 + \sum_{t=1}^{T} \sum_{i=1}^{d_t} (\gamma^\star_{it})^{-1} \Big).$$

The conditional distribution of $\lambda_1$ is available in closed form,

$$\pi(\lambda_1 \mid \cdot) \propto \Gamma(\lambda_1)^{-\sum_{t=1}^{T} d_t}\ \lambda_2^{\lambda_1 \sum_{t=1}^{T} d_t}\ \exp\Big\{ -\lambda_1 \Big( 1 + \sum_{t=1}^{T} \sum_{i=1}^{d_t} \log(\gamma^\star_{it}) \Big) \Big\},$$

which can be sampled by slice sampling (Neal 2003). The conditional for $(\psi_1, \psi_2)$ does not have a standard form, but depends solely on the set of $\kappa_{it}$'s. Letting $\pi(\psi_1, \psi_2)$ denote the hyperprior, we update $(\psi_1, \psi_2)$ by slice sampling from the conditional distribution

$$\pi(\psi_1, \psi_2 \mid \cdot) \propto \pi(\psi_1, \psi_2) \prod_{t=2}^{T} \prod_{i=1}^{d_t} \frac{\exp\{-\psi_1 I(\kappa_{it} = 0) - \psi_2 \kappa_{it}\}}{\sum_{l=0}^{t-1} \exp\{-\psi_1 I(l = 0) - \psi_2 l\}}.$$

Sampling $q$ requires a Metropolis-Hastings step. We create a proposal set around the current value $q$ by $\mathcal{Q}^\star = \{q \pm 0.025 n : n = 1, \ldots, 50\}$, and then we draw $q^\star$


uniformly from $\mathcal{Q}^\star$. We accept the move to $q^\star$ if

$$U \le \min\left(1,\ I(q^\star \in \mathcal{Q})\, \frac{A_q}{A_{q^\star}}\, \frac{a_{q^\star}(P_1)}{a_q(P_1)} \prod_{t=2}^{T} \frac{c_{q^\star}(P_{t-1}, P_t)}{c_q(P_{t-1}, P_t)}\, \frac{a_q(P_{t-1})}{a_{q^\star}(P_{t-1})} \right)$$

for $U \sim \mathrm{Unif}(0,1)$. Otherwise, we retain $q$.

4.4 Simulation Study

We examine the performance of our priors using a simulation study. We simulate data consisting of $M = 8$ groups with sample sizes of $n_1 = \cdots = n_5 = 60$, $n_6 = \cdots = n_8 = 30$. Because there is less data for the final three groups, we expect that leveraging information across groups will aid in the estimation, especially for these smaller groups. Observations are measured at $T = 10$ time points. The true partitions chosen are $P_1 = \{\mathcal{M}\}$, $P_2 = \cdots = P_5 = \{\{1,2,3,4\}, \{5,6,7,8\}\}$, $P_6 = \cdots = P_8 = \{\{1,2,3,4\}, \{5,6\}, \{7,8\}\}$, and $P_9 = P_{10} = \{\{1,2\}, \{3,4\}, \{5,6\}, \{7,8\}\}$. Given the partitions, the true values of $T_m$ and $D_m$ are specified to represent a longitudinal data scenario with serial correlation. The true covariance matrices have a sparse structure by specifying $\kappa_{it}$ to be between 1 and 3, most often 2. Exact details are provided in Appendix F.

Under this data distribution, we generate 50 data sets and run an MCMC chain for each using the algorithm of Section 4.3. After a burn-in of 3000 iterations, the chain runs for 9000 iterations, and we retain every tenth for posterior inference. To measure the performance of the resulting estimators, we use the loss functions

$$L_{all}(\Sigma, \hat\Sigma) = L_{all}(\{\Sigma_1, \ldots, \Sigma_M\}, \{\hat\Sigma_1, \ldots, \hat\Sigma_M\}) = \sum_{m=1}^{M} \frac{n_m}{N}\, L(\Sigma_m, \hat\Sigma_m),$$

$$L(\Sigma_m, \hat\Sigma_m) = \mathrm{tr}(\Sigma_m^{-1} \hat\Sigma_m) - \log|\Sigma_m^{-1} \hat\Sigma_m| - T,$$

where $N = \sum_m n_m$, and $\hat\Sigma_m = \{E_{post}(T_m D_m T_m^\top)\}^{-1}$ is the Bayes estimator for group $m$ (Yang & Berger 1994). $L(\Sigma_m, \hat\Sigma_m)$ is the standard log-likelihood loss function for a single covariance estimator, and $L_{all}(\Sigma, \hat\Sigma)$ measures the loss in estimating the collection of covariance matrices by taking a weighted average with weights proportional to group


size. The estimated risk of the covariance partition prior method is the average loss over the 50 data sets.

We compare the covariance partition priors with several commonly used methods. Let $P_{ind} = \{\{1\}, \ldots, \{M\}\}$ represent the partition where each group is in a singleton set, corresponding to the case where no information is shared across groups, and recall $P_{pool} = \{\mathcal{M}\}$ pools the data into a single group. We consider a total of eight prior structures. The distribution on the partitions is one of four choices: the Markov chain prior of Section 4.2.1 with the prior for $q$ uniform on $\mathcal{Q}$, called the 'covariance partition prior'; fixing $q = 0$, the 'independent-uniforms prior'; $P_t$ fixed as $P_{ind}$ for all $t$; $P_t$ fixed as $P_{pool}$ for all $t$. With each of these four structures on the partitions, we consider two priors on the Cholesky parameters: sparsity induced according to the prior of Section 4.2.2, and a non-sparse choice obtained by fixing $\kappa_{it} = t - 1$.

Table 4-1. Risk estimates from simulation study.

Label in                                                   Risk under                 Risk under
Fig. 4-2   Partition Prior            Cholesky Prior       L_all(Sigma, Sigma-hat)    L(Sigma_8, Sigma-hat_8)
1          cov. partition prior       sparsity             0.304                      0.355
2          indep.-uniforms prior      sparsity             0.347                      0.450
3          indep.-uniforms prior      non-sparse           0.463                      0.535
4          cov. partition prior       non-sparse           0.466                      0.512
5          P_t = P_ind                sparsity             0.515                      0.691
6          P_t = P_pool               sparsity             0.769                      1.158
7          P_t = P_ind                non-sparse           0.780                      1.059
8          P_t = P_pool               non-sparse           0.820                      1.187

Table 4-1 and Figure 4-2 contain the estimated risk of estimating the collection of covariance matrices and the risk for estimating the final group under each of these prior specifications. It is clear that estimates based on partition priors have lower risk. Neither of the $P_{pool}$ priors works well because they produce inconsistent estimators. Further, the $P_{ind}$ prior without sparsity has too many parameters to estimate efficiently; adding sparsity alleviates this somewhat. The covariance partition prior with sparsity shows the best performance; the risk of the $P_{ind}$ non-sparse choice is more than 2.5 times as large. The risk of


Table 4-2. Model selection statistics for depression study.

Partition prior           Cholesky prior    Dev       $p_D$    DIC
cov. partition prior      non-sparse        39,464    711      40,887
cov. partition prior      sparsity          39,514    715      40,945
$P_t = P_{\text{pool}}$   non-sparse        40,258    345      40,949
$P_t = P_{\text{pool}}$   sparsity          40,285    359      41,004
indep.-uniforms prior     non-sparse        39,348    836      41,020
indep.-uniforms prior     sparsity          39,435    831      41,096
$P_t = P_{\text{ind}}$    non-sparse        39,051    1164     41,381
$P_t = P_{\text{ind}}$    sparsity          39,183    1159     41,501

missing values given the observed measurements and current parameter values. Each group is assumed to have a unique, unstructured mean vector, which is given a flat prior. An MCMC analysis is run for each of the eight prior specifications from the risk simulation. Again, a single chain is run for 9000 iterations after a burn-in of 3000, and the values from every tenth iteration are retained for posterior inference. With the increased dimension of this data ($T = 17$), the sparse covariance partition prior can sample around 4.5 iterations in the time necessary for the covariance grouping priors method (Gaskins & Daniels 2013) to sample one. The partition prior with the non-sparse Cholesky parameterization samples about 4 times faster than the sparse version.

To determine the model that best fits the data, we use the deviance information criterion (DIC; Spiegelhalter et al. 2002). The DIC is calculated as $\mathrm{DIC} = \mathrm{Dev} + 2p_D$, where Dev is the deviance evaluated at the posterior expectation of the parameter values and $p_D$ is a model complexity term. This $p_D$ is computed as the difference between the expected deviance and Dev, and the term is often interpreted as the effective number of model parameters. We use the observed data likelihood to compute the deviances (Wang & Daniels 2011). Models with smaller DIC are considered to better balance the model fit and complexity. Table 4-2 contains the model comparison statistics.
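The DIC arithmetic described above can be sketched in a few lines. This is only an illustration: the function name `dic_summary` and the toy deviance values are assumptions, not output from the depression study, and the deviance draws are taken to have been computed elsewhere from the observed data likelihood.

```python
import numpy as np

def dic_summary(deviance_draws, deviance_at_posterior_mean):
    """Return (Dev, pD, DIC) using DIC = Dev + 2 * pD."""
    dev = float(deviance_at_posterior_mean)
    # pD: posterior average of the deviance minus the deviance at the posterior mean
    p_d = float(np.mean(deviance_draws)) - dev
    return dev, p_d, dev + 2.0 * p_d

# Toy deviance draws (hypothetical values, chosen only to make the arithmetic visible)
draws = np.array([102.0, 98.0, 101.0, 99.0])
dev, p_d, dic = dic_summary(draws, 96.0)
```

Here the average deviance is 100, so the complexity term is 4 and the DIC is 96 + 2 * 4 = 104.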

The covariance partition prior with the non-sparse Cholesky structure provides the overall best model fit. The $P_{\text{pool}}$ models perform second best by utilizing fewer parameters than the other models but at a cost of a poorer model fit Dev. The independent-uniforms prior shows similar model fit to the covariance partition prior but uses about 120 more parameters, as the degree of the partitions $P_1,\ldots,P_T$ tends to be larger with the independent-uniforms prior than under the prior with random $q$. The $P_{\text{ind}}$ models perform worst due to the large number of parameters required.

We note that for this particular data set our specialized sparsity structure does not provide an improvement in model fit. This is clearly seen since, for each of the four partition priors, the DIC under the sparse prior is larger than under the non-sparse prior. Additionally, we have comparable values of $p_D$ under both Cholesky choices, indicating that for most iterations the $T_m$ matrices are full or close to full. Further inspection of the parameter estimates indicates that restricting sparsity to the form
\[
f(Y_{mit} \mid Y_{mi1},\ldots,Y_{mi,t-1}) = f(Y_{mit} \mid Y_{mi,t-k},\ldots,Y_{mi,t-1})
\]
is not well supported by this data. It is possible to exploit sparsity in the columns of $T_m$ (as was shown in Section 3.7), but the appropriate sparse structure requires zeros spread throughout the columns instead of just in the first $t-k-1$ positions.

We conclude the discussion of the depression data by examining the structure that the covariance partition prior induces for this data under the best fitting model, the covariance partition prior with the non-sparse Cholesky structure. The posterior mean of $q$, the parameter determining the smoothness of the stochastic process of partitions, is 8.09 with a 95% credible interval of (5.53, 9.95). Figure 4-3 shows the clustering properties of our prior at baseline through week 11; the remaining, undisplayed weeks behave similarly to weeks 8 onward. Each panel depicts the pairwise probability that $m_1$ and $m_2$ are grouped together at time $t$ (week $t-1$), that is, the probability that there exists a set $S \in P_t$ with $\{m_1, m_2\} \subseteq S$.

Figure 4-3. The posterior probability that $m_1$ and $m_2$ are in the same set of the partition $P_t$ for baseline through week 11 ($t = 1,\ldots,12$).

We note that at baseline and the next two weeks groups are split by initial severity with a low severity set $\{1,2,5,6\}$ and a high severity set $\{3,4,7,8\}$. From weeks 2 through 7 there is less stability in the partitions, but from week 8 on, there are two strong clusters: one contains groups 4 (drug, high severity, male), 5 (no drug, low severity, female), and 6 (drug, low severity, female), and the other contains groups 2 (drug, low severity, male), 7 (no drug, high severity, female), and 8 (drug, high severity, female). The partition locations of groups 1 and 3 (no drug, low/high severity, male) are less stable as $t$ varies. While our model makes use of the partitions primarily as a tool for sharing information across groups, the matching structure that our model captures may be


CHAPTER 5
CONCLUSIONS AND FUTURE WORK

In this dissertation we have developed three novel methods to estimate the dependence structures in longitudinal data. In Chapter 2 we introduced Bayesian prior distributions for the correlation matrix $R$ that induce sparsity through the partial autocorrelations (Daniels & Pourahmadi 2009). By allowing shrinkage or selection on the PACs, we improve estimation by working in a lower-dimensional space. In the next two chapters, we considered the problem of simultaneous covariance estimation. Chapter 3 introduces a non-parametric model through the matrix stick-breaking process (Dunson et al. 2008) that allows equality of parameters across groups and across GARPs of a common lag. In Chapter 4 we propose a method relying on partitioning the group labels at each time point, so that the sequential distributions coincide for groups in the same set of the partition. Both prior distributions are formed in terms of the modified Cholesky parameterization (Pourahmadi 1999) and favor sparse choices of the $T$ matrix.

In terms of future research directions, the PAC parameterization of $R$ is quite intuitive but has yet to attract much research for longitudinal data. In particular, algorithms for more efficient computation of the posterior are needed, especially for correlation matrices of larger dimension. Also, models that seek structure in the PACs beyond sparsity can be developed, such as shrinking or clustering PACs within lag. Additional models that impose sparsity through the marginal or partial correlations, beyond Pitt et al. (2006), could be developed.

Regarding the simultaneous covariance estimation problem, there are a number of extensions and future directions to explore. Possible extensions of the grouping and covariance partition models include situations where the dimension of $\Sigma_m$ changes across groups or where the time between measurements differs across groups. Often longitudinal data are not measured at fixed times, and so one should consider what kind of methodology is needed to allow sharing of information across groups for these situations. In both Chapters 3 and 4 we consider the groups as exchangeable, but depending on context, there may be an informative ordering to the groups, such as increasing levels of treatment, that should be exploited. Rather than relying on groupings of the GARP and IV parameters that are equal, models could be developed that rely on targeted shrinkage. Further, the idea of using a joint distribution on partitions $(P_1,\ldots,P_T)$ is potentially applicable in a variety of contexts outside of covariance estimation when simultaneous clustering is required.

APPENDIX A
CALCULATING THE DIC STATISTIC FOR CTQ DATA

We describe in this appendix the details involved in approximating the DIC term used for model comparison of the CTQ I data, and in particular, the estimation of the integral in (2) introduced in Section 2.6. First, we introduce notation. Let $\theta = (\beta, R)$ be the set of parameters, $\hat\theta = (\hat\beta, \hat R)$ the set of the posterior estimates, and $\theta^g$ the value of $\theta$ at the $g$-th iteration of the Markov chain ($g = 1,\ldots,G$). The function $I_i(Y) = I\{Q_{it}Y_t \ge 0\ \forall t\}$ indicates whether $Y$ is a set of latent variables whose signs agree with $Q_i$. Define $p_i(\theta)$ to be the probability of observing $Q_i$ under the parameters $\theta$. As in (2), this is
\[
p_i(\theta) = p_i(\beta, R) = \int_{(-\infty,\infty)^J} I_i(y)\,\phi(y \mid \theta)\,dy,
\]
where $\phi(\cdot \mid \theta)$ is the multivariate normal density with mean $X_i\beta$ and covariance matrix $R$ when $\theta = (\beta, R)$. Hence, $\mathrm{loglik}(\theta \mid Q_i) = \log p_i(\theta)$. As previously noted, this integral is intractable.

Using the definitions of the deviance and complexity parameter (equations (2) and (2)) and the new notation, we can write DIC as the sum of the contributions $\mathrm{DIC}_i$ for each patient,
\[
\mathrm{DIC} = \sum_i \mathrm{DIC}_i = \sum_i \left[\,2\log p_i(\hat\theta) - 4E\{\log p_i(\theta)\}\,\right].
\]
As observations $Q_i$ are independent, it suffices to consider the per-patient contribution $\mathrm{DIC}_i$. Note that the expectation in the final term is with respect to the posterior distribution of the parameters $\theta$ and will be estimated by its average over the values from the posterior sample $\theta^1,\ldots,\theta^G$. Additionally, we will have to approximate $p_i(\theta)$ with some estimate $\hat p_i(\theta)$. So,
\[
\widehat{\mathrm{DIC}}_i = 2\log \hat p_i(\hat\theta) - 4G^{-1}\sum_{g=1}^{G}\log \hat p_i(\theta^g).
\]

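Note that calculation of DIC will require $(G+1)$ estimates of the integral $p_i(\theta)$ for each $i = 1,\ldots,N$. To evaluate this integral we use importance sampling (Robert & Casella 2004, Section 3.3). We take as our sampling density $t(\cdot \mid \tilde\theta)$, the multivariate $t$-distribution with 5 degrees of freedom, location parameter $X_i\hat\beta$, and scale matrix $k\hat R$ for some constant $k > 1$. We define $\tilde\theta = (\hat\beta, k\hat R)$ to be the set of parameters for the sampling distribution. We choose the $t$-distribution so that $t(\cdot \mid \tilde\theta)$ will have heavier tails than $\phi(\cdot \mid \theta^g)$ for $g = 1,\ldots,G$ and $\phi(\cdot \mid \hat\theta)$. This also motivates the choice to use a scale matrix that is an inflated version of $\hat R$. Note that we can write $p_i(\theta)$ as
\[
p_i(\theta) = \int_{(-\infty,\infty)^J} I_i(y)\,\phi(y \mid \theta)\,dy = \int_{(-\infty,\infty)^J} I_i(z)\,\frac{\phi(z \mid \theta)}{t(z \mid \tilde\theta)}\,t(z \mid \tilde\theta)\,dz.
\]
To estimate this, draw $Z_1,\ldots,Z_{H_i}$ independently from $t(\cdot \mid \tilde\theta)$, and
\[
\hat p_i(\theta) = H_i^{-1}\sum_{h=1}^{H_i} I_i(Z_h)\,\frac{\phi(Z_h \mid \theta)}{t(Z_h \mid \tilde\theta)}
\]
is an unbiased and consistent estimator of $p_i(\theta)$.

Evaluation of $\widehat{\mathrm{DIC}}_i$ involves, for each $g = 1,\ldots,G$, simulating a data set $Z = \{Z_h\}_{h=1}^{H_i}$ and calculating $\hat p_i(\theta^g)$, followed by drawing a final $Z$ to estimate $\hat p_i(\hat\theta)$. Drawing $G+1$ independent data sets turns out to be computationally slow. Instead, we will draw a single sample $Z$ to use to calculate all of $\hat p_i(\theta^1),\ldots,\hat p_i(\theta^G),\hat p_i(\hat\theta)$. It is clear that these estimates remain unbiased and consistent. What remains is to consider what effect this will have on the variability of the individual contributions to the DIC, $\widehat{\mathrm{DIC}}_i$. First we derive the variance of $\widehat{\mathrm{DIC}}_i$ in the situation where we draw a new data set $Z$ for each $\hat p_i(\theta^g)$. In this case,
\[
\mathrm{Var}\{\widehat{\mathrm{DIC}}_i\} = 4\,\mathrm{Var}\{\log \hat p_i(\hat\theta)\} + 16G^{-2}\sum_g \mathrm{Var}\{\log \hat p_i(\theta^g)\}, \qquad \text{(A)}
\]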

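where the expectation (in the variance) is with respect to the sampling distribution of $Z$. For any $\theta^g$ or $\hat\theta$, this variance is
\[
\begin{aligned}
\mathrm{Var}\{\log \hat p_i(\theta)\} &\approx p_i(\theta)^{-2}\,\mathrm{Var}\{\hat p_i(\theta)\} \\
&= p_i(\theta)^{-2} H_i^{-1}\,\mathrm{Var}\left\{I_i(Z_1)\frac{\phi(Z_1 \mid \theta)}{t(Z_1 \mid \tilde\theta)}\right\} \\
&= p_i(\theta)^{-2} H_i^{-1}\left[E\left\{I_i(Z_1)\frac{\phi(Z_1 \mid \theta)^2}{t(Z_1 \mid \tilde\theta)^2}\right\} - p_i(\theta)^2\right],
\end{aligned}
\]
where the approximation in the first line is due to the delta method. This quantity can be consistently estimated by
\[
\widehat{\mathrm{Var}}\{\log \hat p_i(\theta)\} = \hat p_i(\theta)^{-2} H_i^{-1}\left[H_i^{-1}\sum_{h=1}^{H_i} I_i(Z_h)\frac{\phi(Z_h \mid \theta)^2}{t(Z_h \mid \tilde\theta)^2} - \hat p_i(\theta)^2\right]. \qquad \text{(A)}
\]
To calculate the variance of $\widehat{\mathrm{DIC}}_i$ under our sampling scheme with a single sample $Z$, note
\[
\begin{aligned}
\mathrm{Var}\{\widehat{\mathrm{DIC}}_i\} ={}& 4\,\mathrm{Var}\{\log \hat p_i(\hat\theta)\} + 16G^{-2}\sum_g \mathrm{Var}\{\log \hat p_i(\theta^g)\} \\
&+ 16G^{-2}\sum_g\sum_{g' \ne g}\mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\theta^{g'})\} \qquad \text{(A)} \\
&- 16G^{-1}\sum_g \mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\hat\theta)\}.
\end{aligned}
\]
The quantities on the second and third lines of (A) represent the additional terms due to sharing the data set $Z$ across calculations of $\hat p_i(\cdot)$. Define $\mathrm{COV}_i$ to be the sum of these covariance terms. We may write $\mathrm{COV}_i$ as
\[
\mathrm{COV}_i = 16G^{-2}\sum_g\sum_{g' \ne g}\left[\mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\theta^{g'})\} - \frac{G}{G-1}\,\mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\hat\theta)\}\right].
\]
From the delta method we have
\[
\mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\theta)\} \approx \mathrm{Cov}\left\{\frac{\hat p_i(\theta^g)}{p_i(\theta^g)}, \frac{\hat p_i(\theta)}{p_i(\theta)}\right\},
\]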

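and so
\[
\begin{aligned}
\mathrm{COV}_i &\approx 16G^{-2}\sum_g\sum_{g' \ne g}\left[\mathrm{Cov}\left\{\frac{\hat p_i(\theta^g)}{p_i(\theta^g)}, \frac{\hat p_i(\theta^{g'})}{p_i(\theta^{g'})}\right\} - \frac{G}{G-1}\,\mathrm{Cov}\left\{\frac{\hat p_i(\theta^g)}{p_i(\theta^g)}, \frac{\hat p_i(\hat\theta)}{p_i(\hat\theta)}\right\}\right] \\
&= 16G^{-2}\sum_g\sum_{g' \ne g}\mathrm{Cov}\left\{\frac{\hat p_i(\theta^g)}{p_i(\theta^g)},\ \frac{\hat p_i(\theta^{g'})}{p_i(\theta^{g'})} - \frac{G}{G-1}\frac{\hat p_i(\hat\theta)}{p_i(\hat\theta)}\right\} \\
&= 16G^{-2}H_i^{-1}\sum_g\sum_{g' \ne g}\mathrm{Cov}\left\{\frac{I_i(Z_1)\,\phi(Z_1 \mid \theta^g)}{p_i(\theta^g)\,t(Z_1 \mid \tilde\theta)},\ I_i(Z_1)\left(\frac{\phi(Z_1 \mid \theta^{g'})}{p_i(\theta^{g'})\,t(Z_1 \mid \tilde\theta)} - \frac{G}{G-1}\frac{\phi(Z_1 \mid \hat\theta)}{p_i(\hat\theta)\,t(Z_1 \mid \tilde\theta)}\right)\right\} \\
&= 16G^{-2}H_i^{-1}\sum_g\sum_{g' \ne g}\left[\frac{G}{G-1} - 1 + E\left\{\frac{I_i(Z_1)\,\phi(Z_1 \mid \theta^g)}{p_i(\theta^g)\,t(Z_1 \mid \tilde\theta)}\left(\frac{\phi(Z_1 \mid \theta^{g'})}{p_i(\theta^{g'})\,t(Z_1 \mid \tilde\theta)} - \frac{G}{G-1}\frac{\phi(Z_1 \mid \hat\theta)}{p_i(\hat\theta)\,t(Z_1 \mid \tilde\theta)}\right)\right\}\right]. \qquad \text{(A)}
\end{aligned}
\]
The constant in the last line arises because each importance ratio $I_i(Z_1)\phi(Z_1 \mid \theta)/\{p_i(\theta)\,t(Z_1 \mid \tilde\theta)\}$ has expectation one, so the product of the two expectations in the covariance is $1 - G/(G-1)$.

As long as this quantity (A) is small (relative to the independence variance estimator (A)), we may save computational time by only sampling one data set $Z$ without sacrificing precision. In the analysis of the CTQ data, estimation of $\mathrm{COV}_i$ with a representative sample of observations showed $\mathrm{COV}_i$ to be small relative to the independence variance estimator (A). In fact, this term is often negative, indicating that using the common data set $Z$ may improve estimation for some observations $i$. In the representative sample we considered, the addition of the $\mathrm{COV}_i$ term tended to lead to changes in the standard error of $\widehat{\mathrm{DIC}}_i$ ranging from a decrease of 5% to an increase of 25%. As it is computationally infeasible to compute $\mathrm{COV}_i$ for all observations (a nested loop over $g'$ inside a loop over $g$), we estimate the standard error of the DIC estimate using the independence estimator obtained from (A) and (A). The Dev, $p_D$, and DIC estimates in Table 2-4 are computed in this way.

The importance sampling size $H_i$ is chosen by first drawing 200,000 values of $Z_h \sim t(\cdot \mid \tilde\theta)$, where the variance scaling factor $k$ is 1.52. This choice of $k$ is made with consideration to the dimension $J$ of $Z$ (increasing $J$ should correspond to increasing $k$), how far from the origin $X_i\hat\beta$ tends to be, and how likely $I_i(Z)$ is to be one, among other considerations; ultimately, trial-and-error with small choices of $H_i$ led us to conclude that $k = 1.52$ works reasonably well. Having drawn 200,000 values of $Z_h$, if $\sum_h I_i(Z_h) \ge 2000$ (i.e., at least 2000 of the $Z_h$'s have signs appropriate for a latent variable of $Q_i$), then $H_i = 200{,}000$ and this set $\{Z_h\}_{h=1}^{200{,}000}$ is the importance sample $Z$. If not, we continue to draw additional sets of 200,000 $Z_h$'s to append to the data set until $\sum_h I_i(Z_h) \ge 2000$. This implies that we have larger samples for those patients $i$ with small values of $p_i(\theta)$. This helps to control the variance of $\widehat{\mathrm{DIC}}_i$ since the term is preceded by $p_i(\theta)^{-2}$ (see equation (A)). With this scheme, we estimate the standard errors for our DIC estimates in Table 2-4 to be around 0.5.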
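The importance sampling scheme of this appendix can be sketched as follows. This is a hedged illustration under stated assumptions, not the code used for the CTQ analysis: the function names are hypothetical, the sign pattern is encoded as a vector of plus or minus ones, and the batch size and hit threshold are much smaller defaults than the 200,000 and 2000 quoted above so that the example runs quickly.

```python
import math
import numpy as np

def _maha_logdet(z, loc, mat):
    """Squared Mahalanobis distances of the rows of z, and log|mat|."""
    chol = np.linalg.cholesky(mat)
    a = np.linalg.solve(chol, (z - loc).T)
    return np.sum(a * a, axis=0), 2.0 * np.sum(np.log(np.diag(chol)))

def mvn_logpdf(z, loc, cov):
    """Log density of N(loc, cov) at each row of z."""
    maha, logdet = _maha_logdet(z, loc, cov)
    return -0.5 * (loc.shape[0] * math.log(2.0 * math.pi) + logdet + maha)

def mvt_logpdf(z, loc, scale, df):
    """Log density of a multivariate t (location loc, scale matrix scale)."""
    d = loc.shape[0]
    maha, logdet = _maha_logdet(z, loc, scale)
    const = (math.lgamma((df + d) / 2.0) - math.lgamma(df / 2.0)
             - 0.5 * d * math.log(df * math.pi) - 0.5 * logdet)
    return const - 0.5 * (df + d) * np.log1p(maha / df)

def draw_mvt(rng, n, loc, scale, df):
    """n draws from the multivariate t as scaled normals over sqrt(chi2/df)."""
    normals = rng.standard_normal((n, loc.shape[0])) @ np.linalg.cholesky(scale).T
    return loc + normals / np.sqrt(rng.chisquare(df, size=n) / df)[:, None]

def orthant_prob(signs, mean, corr, prop_loc, prop_scale, rng,
                 df=5, batch=20_000, min_hits=200):
    """Estimate P(signs * Y >= 0 for all coordinates) for Y ~ N(mean, corr)."""
    draws = np.empty((0, mean.shape[0]))
    hits = np.zeros(0, dtype=bool)
    # Keep appending batches until enough draws fall in the required orthant
    while min_hits > hits.sum():
        draws = np.vstack([draws, draw_mvt(rng, batch, prop_loc, prop_scale, df)])
        hits = np.all(signs * draws >= 0, axis=1)
    # Importance weights: target normal density over the t proposal density
    logw = (mvn_logpdf(draws[hits], mean, corr)
            - mvt_logpdf(draws[hits], prop_loc, prop_scale, df))
    return np.exp(logw).sum() / draws.shape[0]

rng = np.random.default_rng(0)
est = orthant_prob(np.array([1.0, 1.0]), np.zeros(2), np.eye(2),
                   np.zeros(2), 2.25 * np.eye(2), rng)
```

For two independent standard normal coordinates the positive orthant has probability 0.25, so `est` should land close to that value. Reusing the same `draws` for every posterior iterate corresponds to the single-sample scheme discussed above.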

APPENDIXBADDITIONALCOVARIANCEGROUPINGPRIORSANDTHEIRPROPERTIES B.1SparsityGroupingPriorforWeintroducethreeadditionalgroupingpriorspecicationstotheoneintroducedinSection 3.3 ,twofortheautoregressiveparametersandonefortheinnovationvariances.Thesethreepriorsfollowcloselytothematrixstick-breakingprocess( Dunsonetal. 2008 ).Herewerelyonfewerlongitudinalassumptionsintheconstructionofthesemodels,andsotheymaybeviewedasamiddlegroundbetweenthelag-blockandcorrelated-lognormalpriorsandthenaiveBayespriorsusedinSections 3.6 and 3.7 .Thesparsitygroupingpriorisdenedbyreplacingequations( 3 )and( 3 )fromthelag-blockgropingpriorwith mjFmj()=HXh=1mjhjh()(m=1,...,M;j=1,...,J),jhq(j)0+(1)]TJ /F11 11.955 Tf 11.95 0 Td[(q(j))N(0,2)(j=1,...,J;h=1,...,H). (B) Thekeydistinctionisthatin( B )wesamplecandidatesjhforeachautoregressiveparameterj.Previouslywehadasetofcandidatesqhforallautoregressiveparameterswithinlag.Becausethecandidatesaredistinctacrosslags,thispriorwillnolongerallowclusteringofautoregressiveparametersacrosslags,exceptforcommonzerovalues.Asbefore,weuseazero-normalmixtureforthebasedistributionsofthecandidatestoencouragesparsity.Thissparsitygroupingpriorforisdenedsimilarlytothematrixstick-breakingprocess,withthekeydifferencebeingthat Dunsonetal. ( 2008 )specifythatthebasedistributionofthejh'sbenonatomic.Thisisnotthecasewithourpriorsinceweuseadistributionforthecandidatesthatcontainapointmassatzero.Thisdoesnotleadtoaproblem,butitdoesaltersomeofthetheoreticproperties.Whensamplingfromthesparsitygroupingprior,properties( 3 )( 3 )ofthelag-blockpriorcontinuetohold.Thismeansthatthedistribution'scorrelation, 99

corrfFmj(A),Fm0j(A)g,andclusteringproperties,pr(mj=m0j),acrossgroupswithinthesameautoregressiveparametercontinuetohold.However,properties( 3 )and( 3 )whichconsiderbehavioracrossautoregressiveparametersj6=j0withinthesamelagchange.ItisnowthecasethatcorrfFmj(A),Fmj0(A)g=corrfFmj(A),Fm0j0(A)g=0.Additionally,pr(mj=mj0)=pr(mj=m0j0)=2q,theprobabilitythatbothareindependentlysettozero.Hence,thispriorprovidesaless-richgroupingstructurethanthelag-blockprior.Recallthatinthespecialcase!0and!1,eachofthemodelparametersissampledindependentlyfromthecandidatebasedistribution.Thatis,mjisdistributedasequation( 3 ),whichisthenaiveBayesprior1thatweusedfortherisksimulationanddataanalysis.Hence,thesparsitygroupingpriorsubsumesthisnaivemodelasaspecialcase. B.2Non-SparseGroupingPriorforThenon-sparsegroupingpriorisaslightsimplicationofthesparsitygroupingthatnolongerencouragessparsity.Toformthisprior,weusethesparsitygroupingpriorandreplaceequation( B )withjhN(0,2).Havingremovedthezeropointmassfromthe-level,mjisalmostsurelynon-zero.Thus,wenolongerallowforconditionalindependencerelationshipsorgainsparsityintheT()matrix,butbyxingthemeanatzeroforthedistributionofthecandidates,westillencourageshrinkagetowardindependence.Thispriorfollowsexactlytheframeworkof Dunsonetal. ( 2008 ).Weadditionallypointoutthatthenon-sparsegroupingpriorisequivalenttothesparsitygroupingpriorifwexq=0forallq.Hence,therespectivepropertiesforthispriorareobtainedbysubstituting=0,andconsequently,()=(),intothepropertiesofthesparsitygroupingprior.Becausethenon-sparsepriorisamatrixstick-breakingprocesspriorwithanon-atomicbasedistribution,thepropertiesmay 100

alternativelybetakenfromPropositions1,2,and4in Dunsonetal. ( 2008 ).Additionally,thenaiveBayes2prioristhespecialcaseofthisnon-sparsegroupingpriorwhen!0and!1. B.3InvGammaGroupingPriorfor)]TJ /F1 11.955 Tf -308.03 -24.53 Td[(Weconstructanadditionalpriorfortheinnovationvariancesdifferingfromthecorrelated-lognormalpriorintwokeyways.First,wedonotstrivetotreattheinnovationvariancesascomingfromasmoothfunction.Inthepreviouspriorthevariancecandidatesjhweresampledfromacorrelated-lognormaldistribution,butwenowtreatthecandidatesindependentlyacrossj.Secondly,becausewechooseindependentcandidates,wechoosetheconjugateinversegammaforthebasedistributioninsteadofthelognormal.TheInvGammagroupingpriorisdenedbyreplacinglines( 3 )and( 3 )with mjGmj()=HXh=1mjhjh()(m=1,...,M;j=1,...,p),jhInvGamma(1,2),(j=1,...,p;h=1,...,H). (B) Wenolongerrequireany!hin( 3 )becausewedonotdesireintoinduceacorrelationamongthe's.Aswiththenon-sparsepriorthisisaspecialcaseofthematrixstick-breakingprocess.Wecomparethispriorwiththecorrelated-lognormalpriorinthespecialcasewhere=0.Recallthatif=0,thenthecandidatesjhareindependentvariablesfromthelognormal( ,)distribution,andthecorrelated-lognormalpriorgivesamatrixstick-breakingprocess.Betweenthesetwopriors,theInvGammagroupingpriorandthecorrelated-lognormalgroupingpriorwith=0,werecommendusingtheInvGammabecauseoftheconjugacythatisobtained.Thebenetofusingthenormal(equivalently,lognormal)distributionfor!()isthatinducingthecorrelationinsideaclusterisstraightforward,againthatoutweighsthelossofconjugacy.Forsituationswhereitisreasonabletoconsidertheinnovationvariancesasfollowingasmoothprogress,such 101

asmostlongitudinaldatamodels,werecommendthecorrelated-lognormalpriorwithanon-zerochoiceof.Toobtainthetheoreticalpropertiesofthisprior,letI\()denotetheprobabilityfunctionoftheInvGamma(1,2)distribution,withxedvaluesforthehyperparameters.BecausetheInvGammagroupingpriorisamatrixstick-breakingprocess,themostrelevantpropertiesfollowimmediatelyfrom Dunsonetal. ( 2008 ).ForA2B(R+)theunbiasedandvariancepropertiesaregivenbyEfGmj(A)g=I\(A),VarfGmj(A)g=2 (2+)(2+))]TJ /F4 11.955 Tf 11.96 0 Td[(2I\(A)f1)]TJ /F1 11.955 Tf 11.95 0 Td[(I\(A)g.Thebehavioracrossgroupsm6=m0withcommontimejasin( 3 )and( 3 )continuetoholdwith=0.ThekeydifferenceisthatcorrfGmj(A),Gmj0(A)g=corrfGmj(A),Gm0j0(A)g=0,contrastedwith( 3 )and( 3 ).Thatis,thedistributionsfortimesjandj0areuncorrelated.Itremainstruethatpr(mj=mj0)=0.As!0and!1,thispriorconvergestothenaivepriorusedfortheinnovationvariancesinSections 3.6 and 3.7 ofthearticle. B.4FurtherGroupingPriorExtensionsInadditiontothegroupingpriorspreviouslydenedinChapter 3 andhere,thereareanumberofothernaturalextensionsandpossiblevariationsthatonecouldconstruct.Forinstance,onecouldallowfordifferingvaluesof2,thevarianceoftheautoregressiveparametercandidates,thatdependonthelagvalueoftheassociatedautoregressiveparameter.Thismightbebenecialinasituationwherepislargeandonebelievesthatthe'saftertherstfewlagsvarymoretightlyaroundzero.Additionally,wecanremovethesparsityfromthelag-blockgroupingpriorbydeletingthepointmassatzerofromthedistributionofthe's.Asanotherchoice,insteadofspecifyingtheand)]TJ /F1 11.955 Tf 10.1 0 Td[(asseparateblockswithdifferentvaluesofthestick-breakingparametersand,onecoulddrawbothsetsofparameterswiththesamevaluesofand.Insteadofspecifyingthatthecandidate'sarezeroaccording 102

totheprobability,anotherextensionistomodifythegroupingpriorbyintroducingazero-thclusterwherej0=0forallj.Theselectionofmjwouldthenfollowbypr(mj=0)=pr(mj=j0)=q(j)andforh=1,...,H,pr(mj=jh)=(1)]TJ /F11 11.955 Tf 12.28 0 Td[(q(j))mjh.Thepropertieswehaveconsideredareeasilyobtainedandcomparedtothoseobtainedinthesparsitygroupingcase.Whiletheseorothersmaybemorenaturalincertaincontexts,webelievethatthelag-blockandcorrelated-lognormalgroupingpriorsarethemostapplicablechoicesforgenerallongitudinaldata. 103
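The grouping priors in this appendix all build on matrix stick-breaking weights. A minimal sketch of how such weights can be computed is given below, assuming the usual construction in which the weight for group $m$, parameter $j$, and candidate $h$ is $U_{mh}X_{jh}\prod_{l<h}(1-U_{ml}X_{jl})$; the array names `U` and `X`, the Beta(1, 2) draws, and the function name are illustrative choices, not taken from the text.

```python
import numpy as np

def msbp_weights(U, X):
    """weights[m, j, h] = U[m, h] * X[j, h] * prod over earlier l of (1 - U[m, l] * X[j, l])."""
    V = U[:, None, :] * X[None, :, :]             # (M, J, H) stick proportions
    remaining = np.cumprod(1.0 - V, axis=2)       # mass left after each break
    # Shift by one position so that candidate h only sees the breaks before it
    remaining = np.concatenate(
        [np.ones_like(V[..., :1]), remaining[..., :-1]], axis=2)
    return V * remaining

rng = np.random.default_rng(0)
U = rng.beta(1.0, 2.0, size=(3, 5))   # group-level sticks (hypothetical values)
X = rng.beta(1.0, 2.0, size=(4, 5))   # parameter-level sticks (hypothetical values)
U[:, -1] = 1.0                        # truncation: the last stick takes all remaining mass
X[:, -1] = 1.0
weights = msbp_weights(U, X)
```

With the final sticks fixed at one, the weights for each (group, parameter) pair sum to one across the candidates.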

APPENDIXCDERIVATIONOFCOVARIANCEGROUPINGPRIORPROPERTIES C.1ProofsforGeneralizedAutoregressiveParameterPropertiesWeprovidepartialproofstothepropertiesofthelag-blockgroupingpriornotedinSection 3.4 ofthearticle.Theproofsforproperties( 3 )and( 3 )canbefoundintheappendixto Dunsonetal. ( 2008 ).Duetothenon-atomicnatureofthebasedistribution( 3 ),wearenotabletodirectlyapplytheirconclusiontoourpriortoprove( 3 ).Detailsfollow.pr(mj=m0j)=pr(mj=m0j6=0)+pr(mj=m0j=0)=E(Xhmjhm0jhjh(Rn0))+E(Xhmjhjh(0)Xim0jiji(0))=E(Xhmjhm0jhjh(Rn0))+E(Xhmjhm0jhjh(0))+2E(Xh1Xi=h+1mjhm0jijh(0)ji(0))=E(Xhmjhm0jhjh(R))+2E(Xh1Xi=h+1mjhm0jijh(0)ji(0))=E(Xhmjhm0jh)+22E(Xh1Xi=h+1mjhm0ji)=(I)+22(II),whereexpressions(I)and(II)arecalculatedbelowandRdenotestherealline. 104

(I)=E(XhUmhUm0hX2jhYl

Using(I)and(II),wehavepr(mj=m0j)=(I)+22(II)=2+1)]TJ /F11 11.955 Tf 11.96 0 Td[(2 (1+)(2+))]TJ /F4 11.955 Tf 11.95 0 Td[(1.Toestablishthecorrelationpropertyin( 3 ),letq=q(j)=q(j0).FirstnoteEfFmj(A)Fmj0(A)g=E(Xhmjhmj0hqh(A))+2E(Xh1Xi=h+1mjhmj0iqh(A)q0i(A))=(A)E(Xhmjhmj0h)+2(A)2E(Xh1Xi=h+1mjhmj0i)=(A)(III)+2(A)2(IV),whereformulas(III)and(IV)aredenedbelow.(III)=E(XhU2mhXjhXj0hYl

(IV)=E"Xh1Xi=h+1UmhXjh(1)]TJ /F5 11.955 Tf 11.96 0 Td[(UmhXj0h)(Yl

Toprovethematchingprobabilityof( 3 ),notepr(mj=mj0)=pr(mj=mj06=0)+pr(mj=mj0=0)=E(Xhmjhmj0hqh(Rn0))+E(Xhmjhqh(0)Ximj0iqi(0))=E(Xhmjhmj0hqh(R))+2E(Xh1Xi=h+1mjhmj0iqh(0)qi(0))=(III)+22(IV)=2q+1)]TJ /F11 11.955 Tf 11.95 0 Td[(2q (2+)(1+))]TJ /F4 11.955 Tf 11.95 0 Td[(1.Toobtain( 3 ),wehaveEfFmj(A)Fm0j0(A)g=E(Xhmjhm0j0hqh(A))+2E(Xh1Xi=h+1mjhm0j0iqh(A)qi(A))=(A)E(Xhmjhm0j0h)+2(A)2E(Xh1Xi=h+1mjhm0j0i)=(A)(V)+2(A)2(VI),where(V)=E(XhUmhUm0hXjhXj0hYl

and(VI)=E"Xh1Xi=h+1UmhXjh(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Um0hXj0h)(Yl

Toseethatthedistributionsareuncorrelated,assumem=m0.ThenEfFmj(A)Fmj0(A)g=E(Xhmjhmj0hqh(A)q0h(A))+2E(Xh1Xi=h+1mjhmj0iqh(A)q0i(A))=(A)2(III)+2(A)2(IV)=(A)2.Whenm6=m0,theproofproceedssimilarlybutrequiresexpressions(V)and(VI)insteadof(III)and(IV). C.2ProofsforInnovationVariancePropertiesToestablishthepropertiesforthecorrelated-lognormalprior,notethatforaxedcommonvalueofand,thedistributionsofUmhandWmh,aswellasXjhandZjh,arethesame.Hence,thesetfmjhgwillbedistributedthesameasthesetfmjhg,andwemayusetheexpressions(I)(VI)toobtainexpectationsoftheinnovationvariancestick-breakingweights.Toprove( 3 ),noticeEfGmj(A)Gmj0(A)g=E(Xhmjhmj0hjh(A)j0h(A))+2E(Xh1Xi=h+1mjhmj0ijh(A)j0i(A))=Enj1(A)j01(A)oE(Xhmjhmj0h)+2fEj1(A)gfEj01(A)gE(Xh1Xi=h+1mjhmj0i)=Enj1(A)j01(A)o(III)+2fEj1(A)gfEj01(A)g(IV)=1 (2+)(1+))]TJ /F4 11.955 Tf 11.95 0 Td[(1hEn!j1(logA)!j01(logA)ofE!j1(logA)gfE!j01(logA)gi+fE!j1(logA)gfE!j01(logA)g=1 (2+)(1+))]TJ /F4 11.955 Tf 11.95 0 Td[(1covn!j1(logA),!j01(logA)o+(logA)2. 110

Applyingvarf!j1(logA)g=(logA)f1)]TJ /F4 11.955 Tf 12.36 0 Td[((logA)gandastheformulasforEfGmj(A)gandvarfGmj(A)ggivesthenalresult.Theproofof( 3 )followsthesameasabove,exceptoneusesexpressions(V)and(VI)inplaceof(III)and(IV).Finally,pr(mj=m0j0)=0followsfromtheobservationthat!jh6=!j0h0almostsurelyasaconsequenceofthemultivariatenormaldistributionwithanon-degeneratecorrelation.Proofsofthepropertiesofthesparsity,non-sparse,andInvGammagroupingpriorsdenedinAppendix B areexcluded.Thesecanbederivedfollowingstepssimilartothoseaboveorintheappendixto Dunsonetal. ( 2008 ). 111

APPENDIXDDETAILSOFMCMCALGORITHMFORCOVARIANCEGROUPINGPRIORS D.1PreliminariesAsmentionedinSection 3.5 ,weintroduceseverallatentvariablestofacilitatesamplingfromthedistributionsFmj()andGmj()inequations( 3 )and( 3 ),followingthealgorithmof Dunsonetal. ( 2008 ).Tothisend,considerthefollowingfoursetsofbinarydummyvariables:umjhBern(Umh)(m=1,...,M;j=1,...,J;h=1,...,H),xmjhBern(Xjh)(m=1,...,M;j=1,...,J;h=1,...,H),wmjhBern(Wmh)(m=1,...,M;j=1,...,p;h=1,...,H),zmjhBern(Zjh)(m=1,...,M;j=1,...,p;h=1,...,H).NowdeneRmj=minfh:1=umjh=xmjhgandAmj=minfh:1=wmjh=zmjhg.Byconstruction,pr(Rmj=h)=mjhandpr(Amj=h)=mjh.WeletRmjdesignatewhichjhoutoftheHcandidateswechooseasmj,andlikewise,Amjgivesthejhtoselectasmj.Hence,isdeterminedbyfRmjgandfjhgand)]TJ /F1 11.955 Tf 10.1 0 Td[(byfAmjgandfjhg.Thus,aftersamplingthevaluesoffRmjg,fjhg,fAmjg,andfjhg,thevaluesofand)]TJ /F1 11.955 Tf 10.1 0 Td[(aredetermined.NowwecalculatetheconditionaldistributionsthatwewillneedforourGibbssamplerforeachofthegroupingpriors.Notationally,wedenotetheconditionaldistributionforarandomvariable,sayC,conditionalontheremainingrandomvariablesbyCj)]TJ /F1 11.955 Tf 15.94 0 Td[(. D.2SamplingStepsforLag-BlockGroupingPriorSamplingfromthelag-blockgroupingpriorproceedsinfoursteps:theautoregressiveparametersthoughtheparametercandidatesandthedummyvariablesR,u,x;thecandidateprobabilitiesmjhbysamplingtheUmhandXjh;thestick-breakingparameters 112

,;thehyperparametersofthebasedistribution( 3 ).Therststepproceedsintwosubsteps.Algorithmdetailsfollow.Step1a:Parametercandidates.Itisimportanttorecallthedenitionoftheautoregressiveparametersasconditionalregressioncoefcients.Forinstance,therstparameterm1istheregressioncoefcientforymi1ontoymi2withinnovationvariancem1.Likewise,m2andm3arethecoefcientsofymi1andymi2formodelingymi3withvariancem2.Weletxmijdenotethecomponentofymithatcorrespondstothejthautoregressiveparameterregressor,e.g.xmij=ymi1forj=1,2andxmi3=ymi2.Similarly,weletmjdenotetherelevantinnovationvariance.Thatis,m1=m1,andforj=2and3,mj=m2.Finally,wedeneemijtobetheresidualfortheregressionequation,excludingthecontributionofxmij.Thatis,forj=1,emi1=ymi2,forj=2,emi2=ymi3)]TJ /F11 11.955 Tf 13.09 0 Td[(m3ymi2,andforj=3,emi3=ymi3)]TJ /F11 11.955 Tf 13.09 0 Td[(m2ymi1.The-variablesaredenedinthenaturalwaysothatemijN(mjxmij,mj)foreachj.Havingestablishedthenecessarynotation,weseethatthecontributiontothedatalikelihoodofmjisproportionaltoexp()]TJ /F4 11.955 Tf 18.65 8.09 Td[(1 2mjnmXi=1)]TJ /F5 11.955 Tf 5.48 -9.68 Td[(emij)]TJ /F11 11.955 Tf 11.96 0 Td[(mjxmij2).However,wedonotdrawthemj'sbutqh.Forh=1,...,Handxedq=1,...,p)]TJ /F4 11.955 Tf 9.59 0 Td[(1,letPqhdenotesthesetof(m,j)suchthatq(j)=qandRmj=h,whichisthesetofgroupandautoregressiveparameterpairsthatcontributetothelikelihoodofqh.Thus,thecontributionofqhis exp8<:X(m,j)2PqhnmXi=1)]TJ /F4 11.955 Tf 9.3 0 Td[(1 2mj(emij)]TJ /F11 11.955 Tf 11.95 0 Td[(qhxmij)29=;. (D) 113

Thissummationover(m,j)2Pqhmeansthatweareonlyincludingthe(m,j)pairssuchthatqh=mj.Fromthisobservation,wehavethattheconditionaldistributionofqhis (qhj)]TJ /F4 11.955 Tf 12.62 0 Td[()/exp8<:X(m,j)2PqhnmXi=1)]TJ /F4 11.955 Tf 9.3 0 Td[(1 2mj(emi)]TJ /F11 11.955 Tf 11.96 0 Td[(jhxmi)29=;q0(qh)+(1)]TJ /F11 11.955 Tf 11.96 0 Td[(q)(22))]TJ /F14 5.978 Tf 7.79 3.26 Td[(1 2exp)]TJ /F11 11.955 Tf 12.37 9.1 Td[(2qh 22/q0(qh)+(1)]TJ /F11 11.955 Tf 11.95 0 Td[(q) exp()2 22N(,2), (D) where =2X(m,j)2PqhnmXi=1emijxmij mj,2=8<:1 2+X(m,j)2PqhnmXi=1(xmij)2 mj9=;)]TJ /F9 7.97 Tf 6.59 0 Td[(1.(D)Thus,tosamplefromthisconditional( D ),wesetqhtozerowithprobabilityq q+(1)]TJ /F11 11.955 Tf 11.96 0 Td[(q) expn()2 22o,andotherwise,drawqhfromN(,2).IfPqhisempty,thentheconditionalisq0+(1)]TJ /F11 11.955 Tf 11.96 0 Td[(q)N(0,2),theoriginalpriorforqhgivenby( 3 ).Step1b:Dummyvariables.Havingupdatedthecandidatesvaluesfqhgh,wenowsamplethevariablesthatdeterminewhichcandidatewechoosefortheautoregressiveparameter.First,form=1,...,Mandjsuchthatq=q(j),wesampleRmjaftermarginalizingoutfumjh,xmjhgHh=1.Theconditionalprobabilitydistributionisgivenby pr(Rmj=hj)-222(nfumjh,xmjhgh)/mjhexp()]TJ /F4 11.955 Tf 18.65 8.09 Td[(1 2mjnmXi=1(emij)]TJ /F11 11.955 Tf 11.96 0 Td[(q(j)hxmij)2).(D)Hence,wedrawRmjfromthemultinomialdistributionwithprobabilitiesfrom( D ),normalizedtosumtoone.GiventhevalueofRmj,wedrawthesetfumjh,xmjhghtorequirethatRmjistherstoccasionwherebothumjhandxmjhareone.Forh>Rmj,drawumjhBern(Umh)andxmjhBern(Xjh),andwhenh=Rmj,1=umjh=xmjh.Forh

thenwejointlydrawumjhandxmjhinaccordancetothefollowingprobabilities:pr(umjh=0,xmjh=0)=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Umh)(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Xjh)=(1)]TJ /F5 11.955 Tf 11.95 0 Td[(UmhXjh),pr(umjh=1,xmjh=0)=Umh(1)]TJ /F5 11.955 Tf 11.95 0 Td[(Xjh)=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(UmhXjh),pr(umjh=0,xmjh=1)=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Umh)Xjh=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(UmhXjh).Steps1aand1bshouldbeperformconsecutivelywiththesamevalueofq.Step2:Candidateprobabilities.ToupdatetheprobabilitiesmjhwemustsamplenewvaluesforthecomponentsUmhandXjh.Tothatend,giventhevaluesoftheumjh'sandtheothervariables,theconditionalforUmhforh

and.Thentheconditionalforisj)-277(Gamma0@M(H)]TJ /F4 11.955 Tf 11.96 0 Td[(1)+1,1)]TJ /F7 7.97 Tf 16.95 14.95 Td[(MXm=1H)]TJ /F9 7.97 Tf 6.58 0 Td[(1Xh=1log(1)]TJ /F5 11.955 Tf 11.95 0 Td[(Umh)1A.Likewise,j)-278(Gamma0@J(H)]TJ /F4 11.955 Tf 11.96 0 Td[(1)+1,1)]TJ /F7 7.97 Tf 18.25 14.94 Td[(JXj=1H)]TJ /F9 7.97 Tf 6.58 0 Td[(1Xh=1log(1)]TJ /F5 11.955 Tf 11.95 0 Td[(Xjh)1A.WecanchooseadifferentGamma(a,b)priorinsteadofGamma(1,1),andwewillmaintainthegamma-gammaconjugacy.Step4:Basedistributionhyperparameters.Thebasedistributionforqhin( 3 )dependsontwosetsofhyperparameters:thevarianceofthecontinuouspiece2andthesparsityprobabilitiesq.WechoosetheInvGamma(a,b)familyofdistributionsfortheprioron2,sothatwewillhaveconjugacy.Thisyieldsthefollowingconditionaldistribution,2j)-278(InvGamma a+1 2Xq,hf1)]TJ /F11 11.955 Tf 11.95 0 Td[(0(qh)g,b+1 2Xq,h2qh!.Onemustnowspecifythevaluesofa,b.WerecommendInvGamma(0.1,0.1),sothatourpriorapproximatesthecommonly-usedimproperprior(2)/)]TJ /F9 7.97 Tf 6.59 0 Td[(2.ByplacingaBeta(q,q)prioronq,theconditionalforqisqj)-278(Beta0@q+HXh=10(qh),q+HXh=1f1)]TJ /F11 11.955 Tf 11.95 0 Td[(0(qh)g1A.Itisnecessarytospecifythevaluesofqandq.Werecommendusingq=q=1forallq,whichgivesaUnif(0,1)priorforeachq.Alternatively,onecouldchoosethevaluesofqandqtomoreaggressivelyshrinkqforlowerlagstowardzeroandqforhigherlagstowardone. 116
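The conjugate draws in Steps 3 and 4 can be sketched as below. This is a sketch under assumptions: the stick variables are taken to be Beta(1, alpha) with the last stick fixed at one, exact zeros in the candidate array mark the point mass, and all function and argument names are hypothetical.

```python
import numpy as np

def draw_alpha(U, rng, a=1.0, b=1.0):
    """Gamma draw for a stick-breaking parameter given sticks U[m, h] ~ Beta(1, alpha).

    Only the first H - 1 columns enter; with a = b = 1 the shape is M(H - 1) + 1.
    """
    sticks = U[:, :-1]
    shape = a + sticks.size
    rate = b - np.log1p(-sticks).sum()
    return rng.gamma(shape, 1.0 / rate)       # NumPy parameterizes by scale = 1 / rate

def draw_sigma2(theta, rng, a=0.1, b=0.1):
    """Inverse-gamma draw for the variance of the nonzero candidates."""
    nonzero = theta != 0.0
    shape = a + 0.5 * nonzero.sum()
    rate = b + 0.5 * np.sum(theta[nonzero] ** 2)
    return 1.0 / rng.gamma(shape, 1.0 / rate)

def draw_epsilon(theta_q, rng, c=1.0, d=1.0):
    """Beta draw for the zero probability given the candidates of one lag."""
    zeros = int(np.sum(theta_q == 0.0))
    return rng.beta(c + zeros, d + theta_q.size - zeros)

rng = np.random.default_rng(0)
U = np.full((2, 3), 0.5)                      # toy sticks: M = 2 groups, H = 3 candidates
alpha_draws = np.array([draw_alpha(U, rng) for _ in range(20000)])
theta = np.array([0.0, 1.0, -2.0, 0.0])       # toy candidates for one lag
sigma2 = draw_sigma2(theta, rng)
eps = draw_epsilon(theta, rng)
```

The same `draw_alpha` function serves for the second stick-breaking parameter by passing the parameter-level sticks in place of `U`.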

D.3SamplingStepsforCorrelated-LognormalGroupingPriorThesamplingfortheinnovationvariancepriorfollowssimilarlytothepreviousprior.However,wenowlongerhavetoperformthecandidatevalueanddummyvariablesamplingstepconcurrentlyasforthelag-blockprior.Step1:Parametercandidates.Let~emijbetheresidualobtainedfromthedifferenceofymijandthepreviouscomponentsofymimultipliedbytheappropriateautoregressiveparameters.Forinstance,whenj=1,~emi1=ymi1,andforj=2,~emi2=ymi2)]TJ /F11 11.955 Tf -421.13 -23.91 Td[(m1ymi1,andsoon.Notethatthisisadifferentdenitionofthese~e-residualsfromthee-residualsusedintheautoregressiveparametersamplingsteps.Foreachvalueofj,thisyields~emijN(0,mj).Thecontributiontothelikelihoodofjhfromthedataisproportionalto )]TJ /F14 5.978 Tf 7.79 3.26 Td[(1 2Pmnmh(Amj)jhexp()]TJ /F4 11.955 Tf 16.99 8.09 Td[(1 2jhXmnmXi=1(~emij)2h(Amj)).(D)Insteadofconsideringtheconditionalforjh,weinsteadchoosetolookintermsof!jh=logjh.Foreachsamplingset,wepartition!hinto(!hA,!hB)sothat!hAcontainsthecollectionof!jhsuchthatAmj=hforatleastonem.Thisdivides!hintothe!hB,whichcanbedrawneasilythroughaconjugatedistribution,andthe!hA,whichrequireamoreadvancedsamplingmethod.Tosample!hBgiventheremainingvariables,weletadenotethelengthof!hAandb=p)]TJ /F5 11.955 Tf 10.17 0 Td[(adenotethelengthof!hB.DeneRAAtobethesubmatrixofR()correspondingtotheelementsof!hA,RBBcorrespondingtotheelementsof!hB,andRBAcontaintheelementsoftherowsof!hBandcolumnsof!hA.Then,usingstandardmultivariatenormalresults,!hBj!hA,)-277(Nb)]TJ /F11 11.955 Tf 5.48 -9.68 Td[( 1b+RBAR)]TJ /F9 7.97 Tf 6.59 0 Td[(1AA(!hA)]TJ /F11 11.955 Tf 11.95 0 Td[( 1a),(RBB)]TJ /F5 11.955 Tf 11.96 0 Td[(RBAR)]TJ /F9 7.97 Tf 6.59 0 Td[(1AAR0BA).Jointlydrawingthevector!hBleadstobettermixingthandrawingeachcomponentseparately. 117
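The joint draw of the unassigned block given the assigned block is a standard conditional multivariate normal draw. The following sketch shows the generic computation; it assumes a full mean vector and covariance matrix are supplied (any scale factor multiplying the correlation matrix is taken to be folded into `cov`), and the function and argument names are hypothetical.

```python
import numpy as np

def draw_conditional_mvn(mean, cov, idx_a, x_a, rng):
    """Draw the coordinates not in idx_a given that coordinates idx_a equal x_a."""
    idx_a = np.asarray(idx_a)
    idx_b = np.setdiff1d(np.arange(mean.shape[0]), idx_a)
    cov_aa = cov[np.ix_(idx_a, idx_a)]
    cov_ba = cov[np.ix_(idx_b, idx_a)]
    cov_bb = cov[np.ix_(idx_b, idx_b)]
    # cov_ba @ inv(cov_aa), computed with a solve (cov_aa is symmetric)
    gain = np.linalg.solve(cov_aa, cov_ba.T).T
    cond_mean = mean[idx_b] + gain @ (x_a - mean[idx_a])
    cond_cov = cov_bb - gain @ cov_ba.T
    return cond_mean, cond_cov, rng.multivariate_normal(cond_mean, cond_cov)

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
cond_mean, cond_cov, draw = draw_conditional_mvn(
    np.zeros(2), cov, [0], np.array([2.0]), rng)
```

In this toy case the conditional mean is 0.5 * 2 = 1 and the conditional variance is 1 - 0.25 = 0.75. The entries of `x_a` must be listed in the same order as `idx_a`.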

Tosample!hA,wecyclethroughthecomponents!hof!hAfor=1,...,a.Werecognizethatthecontributiontotheconditionalof!hfromthepriorisexp)]TJ /F4 11.955 Tf 17.39 8.09 Td[(1 2(!h)]TJ /F11 11.955 Tf 11.96 0 Td[( )2,where = +R,()]TJ /F12 7.97 Tf 6.59 0 Td[()R)]TJ /F9 7.97 Tf 6.59 0 Td[(1()]TJ /F12 7.97 Tf 6.59 0 Td[(),()]TJ /F12 7.97 Tf 6.59 0 Td[()(!h()]TJ /F12 7.97 Tf 6.58 0 Td[())]TJ /F11 11.955 Tf 11.96 0 Td[( 1p)]TJ /F9 7.97 Tf 6.59 0 Td[(1),=1)]TJ /F5 11.955 Tf 11.95 0 Td[(R,()]TJ /F12 7.97 Tf 6.59 0 Td[()R)]TJ /F9 7.97 Tf 6.58 0 Td[(1()]TJ /F12 7.97 Tf 6.59 0 Td[(),()]TJ /F12 7.97 Tf 6.58 0 Td[()R0,()]TJ /F12 7.97 Tf 6.58 0 Td[(),!h()]TJ /F12 7.97 Tf 6.59 0 Td[()isthe!hvectorafterremoving!h,R()]TJ /F12 7.97 Tf 6.58 0 Td[(),()]TJ /F12 7.97 Tf 6.59 0 Td[()istheR()matrixformedbyremovingtherowandcolumncorrespondingto,andR,()]TJ /F12 7.97 Tf 6.59 0 Td[()isthevectordenedbytakingtherowofR()andremovingthecomponent.Wecanequivalentlyviewthisash=exp(!h)lognormal( ,),andcalculatetheconditionaldistributionintermsofh.Thisgives(hjh()]TJ /F12 7.97 Tf 6.59 0 Td[(),)]TJ /F4 11.955 Tf 9.3 0 Td[()/)]TJ /F14 5.978 Tf 7.78 3.26 Td[(1 2Pmnmh(Amj))]TJ /F9 7.97 Tf 6.59 0 Td[(1hexp()]TJ /F4 11.955 Tf 18.63 8.09 Td[(1 2hXmnmXi=1(~emij)2h(Amj))]TJ /F4 11.955 Tf 20.05 8.09 Td[(1 2(logh)]TJ /F11 11.955 Tf 11.95 0 Td[( )2).Samplingfromthisdistributionrequiresanapproximatesamplingstep.Werecommendslicesampling( Neal 2003 ),althoughanalternativesamplingstrategycouldbeused.Step2:Dummyvariables.TodrawAmjwewillproceedsimilarlytothepreviousStep1bbylookingattheconditionalmarginallyoverfwmjh,zmjhgh. pr(Amj=hjnfwmjh,zmjhgh)/mjh)]TJ /F14 5.978 Tf 7.78 3.26 Td[(1 2nmjhexp )]TJ /F4 11.955 Tf 17 8.09 Td[(1 2jhnmXi=1~e2mi!. (D) Hence,wedrawAmjfromthemultinomialdistributionwithprobabilitiesfrom( D ),normalizedtosumtoone.Asbefore,wesimulatethesetswmjhandzmjhconditionalonAmjbeingtherstoccasionwherebothwmjhandzmjhareone.Forh>Amj,drawwmjhBern(Wmh)andzmjhBern(Zjh),andwhenh=Amj,1=wmjh=zmjh.For 118

PAGE 119

h
PAGE 120

In the simulation and data example, we use $a = b = 0.1$, $c^2 = 1000$. As mentioned in Section 3.5, it has been our experience that sampling $\rho$ leads to instability, and we generally recommend fixing it.

D.4 Sampling Steps for Sparsity Grouping Prior

The sparsity grouping prior introduced in Appendix B.1 follows a similar structure to the lag-block prior, except that it uses a new set of autoregressive parameter candidates for each $j$. Consequently, the algorithm required for sampling from this prior is quite similar. We describe only the steps that differ from the procedure introduced in Appendix D.2.

Step 1a: Parameter candidates. Using the same notation as before, we note that the contribution from the data about $\phi_{jh}$ is
\[
\exp\left\{ \sum_{m: R_{mj} = h} -\frac{1}{2\gamma_{mj}} \sum_{i=1}^{n_m} (e_{mij} - \phi_{jh} x_{mij})^2 \right\},
\]
which we compare to (D). Incorporating the zero-normal mixture prior, the posterior conditional distribution will also be a zero-normal mixture. We set $\phi_{jh}$ to zero with probability
\[
\frac{\epsilon_{q(j)}}{\epsilon_{q(j)} + (1 - \epsilon_{q(j)}) \, \dfrac{\hat{\sigma}}{\sigma} \exp\left\{ \dfrac{\hat{\phi}^2}{2\hat{\sigma}^2} \right\}},
\]
and sample from a $N(\hat{\phi}, \hat{\sigma}^2)$ otherwise. The mean and variance for the continuous component are given by
\[
\hat{\phi} = \hat{\sigma}^2 \sum_{m: R_{mj} = h} \sum_{i=1}^{n_m} \frac{e_{mij} x_{mij}}{\gamma_{mj}}, \qquad
\hat{\sigma}^2 = \left\{ \frac{1}{\sigma^2} + \sum_{m: R_{mj} = h} \sum_{i=1}^{n_m} \frac{(x_{mij})^2}{\gamma_{mj}} \right\}^{-1}. \qquad \text{(D)}
\]

Step 1b: Dummy variables. The step is the same as Step 1b for the lag-block prior except that (D) becomes
\[
\text{pr}(R_{mj} = h \mid - \setminus \{u_{mjh}, x_{mjh}\}_h) \propto \pi_{mjh} \exp\left\{ -\frac{1}{2\gamma_{mj}} \sum_{i=1}^{n_m} (e_{mij} - \phi_{jh} x_{mij})^2 \right\}.
\]
Now we cycle Steps 1a and 1b over $j$ instead of $q$.
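Step 1a of the sparsity grouping prior, the draw of a candidate from its zero-normal mixture conditional, can be sketched in Python. This is a hedged illustration rather than the code behind the reported simulations: the function name `draw_phi_candidate` and its argument names are hypothetical, the inputs are assumed to be already pooled over the groups with $R_{mj} = h$, and the zero probability is computed on the log scale purely for numerical stability.

```python
import numpy as np


def draw_phi_candidate(e, x, gamma, sigma2, eps, rng):
    """One draw of a candidate from its zero-normal mixture conditional.

    e, x   : residuals and regressors pooled over the groups allocated to this candidate
    gamma  : innovation variance attached to each pooled observation
    sigma2 : variance of the normal (slab) component of the prior
    eps    : prior probability that the candidate is exactly zero, 0 < eps < 1
    Returns the draw and the conditional probability of an exact zero.
    """
    e, x, gamma = (np.asarray(v, dtype=float) for v in (e, x, gamma))
    var_hat = 1.0 / (1.0 / sigma2 + np.sum(x**2 / gamma))
    mean_hat = var_hat * np.sum(e * x / gamma)
    # log of (1 - eps) * (sd_hat / sd) * exp(mean_hat^2 / (2 var_hat))
    log_slab = (np.log1p(-eps) + 0.5 * np.log(var_hat / sigma2)
                + mean_hat**2 / (2.0 * var_hat))
    p_zero = 1.0 / (1.0 + np.exp(log_slab - np.log(eps)))
    if rng.uniform() < p_zero:
        return 0.0, p_zero
    return rng.normal(mean_hat, np.sqrt(var_hat)), p_zero


rng = np.random.default_rng(1)
# With no allocated observations the conditional collapses to the prior.
_, p_zero_prior = draw_phi_candidate([], [], [], sigma2=1.0, eps=0.3, rng=rng)
# With a strong signal the point mass at zero is essentially ruled out.
x_obs = np.ones(50)
phi_draw, p_zero_signal = draw_phi_candidate(2.0 * x_obs, x_obs, np.ones(50), 1.0, 0.5, rng)
```

With no observations allocated to the candidate, the zero probability equals the prior value `eps`, which is a convenient check that the mixture weights are assembled correctly.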

PAGE 121

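Step 4: Base distribution hyperparameters. The sampling distributions for the hyperparameters are given by
\[
\sigma^2 \mid - \sim \text{InvGamma}\left( a + \frac{1}{2} \sum_{j,h} \{1 - \delta_0(\phi_{jh})\}, \; b + \frac{1}{2} \sum_{j,h} \phi_{jh}^2 \right)
\]
and
\[
\epsilon_q \mid - \sim \text{Beta}\left( \alpha_q + \sum_{j: q(j) = q} \sum_{h=1}^{H} \delta_0(\phi_{jh}), \; \beta_q + \sum_{j: q(j) = q} \sum_{h=1}^{H} \{1 - \delta_0(\phi_{jh})\} \right),
\]
for prior distributions of $\sigma^2 \sim \text{InvGamma}(a, b)$ and $\epsilon_q \sim \text{Beta}(\alpha_q, \beta_q)$. Note that the summations in the distribution for $\epsilon_q$ run over the autoregressive terms that correspond to lag $q$.

D.5 Sampling Steps for Non-Sparse Grouping Prior

Sampling under this prior proceeds as under the lag-block and sparsity priors. We describe the steps that differ from those of the lag-block prior.

Step 1a: Parameter candidates. Because this prior does not impose sparsity, the conditional distribution for $\phi_{jh}$ is $N(\hat{\phi}, \hat{\sigma}^2)$, where $\hat{\phi}$ and $\hat{\sigma}^2$ are given by equation (D).

Step 4: Base distribution hyperparameters. The only hyperparameter for the autoregressive candidates is $\sigma^2$. With a prior distribution of $\text{InvGamma}(a, b)$, the sampling distribution is
\[
\sigma^2 \mid - \sim \text{InvGamma}\left( a + \frac{1}{2} JH, \; b + \frac{1}{2} \sum_{j,h} \phi_{jh}^2 \right).
\]

D.6 Sampling Steps for InvGamma Grouping Prior

The sampling algorithm proceeds as for the correlated-lognormal prior. We describe only those steps that require modification from the algorithm in Appendix D.3.

Step 1: Parameter candidates. Using the likelihood contribution of $\gamma_{jh}$ from (D) and a prior distribution of $\text{InvGamma}(\lambda_1, \lambda_2)$, we have that the conditional sampling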

PAGE 122

distribution is
\[
\gamma_{jh} \mid - \sim \text{InvGamma}\left( \lambda_1 + \frac{1}{2} \sum_{m: A_{mj} = h} n_m, \; \lambda_2 + \frac{1}{2} \sum_{m: A_{mj} = h} \sum_{i=1}^{n_m} \tilde{e}_{mij}^2 \right).
\]

Step 5: Base distribution hyperparameters. The hyperparameters associated with this innovation variance grouping are $\lambda_1$ and $\lambda_2$ in (B). We place independent Gamma(1, 1) priors on each. The conditional for $\lambda_2$ is
\[
\lambda_2 \mid - \sim \text{Gamma}\left( \lambda_1 pH + 1, \; 1 + \sum_{j,h} \gamma_{jh}^{-1} \right).
\]
The conditional for $\lambda_1$ is
\[
\pi(\lambda_1 \mid -) \propto \Gamma(\lambda_1)^{-pH} \lambda_2^{\lambda_1 pH} \exp\left[ -\lambda_1 \left( 1 + \sum_{j,h} \log(\gamma_{jh}) \right) \right],
\]
but this is not a standard distribution to sample. It therefore becomes necessary to implement an alternative sampling method, and we choose to introduce a Metropolis-in-Gibbs step to approximately simulate from this conditional. Draw the candidate value $\lambda_1^*$ to replace the current value $\lambda_1$ from the $N(\lambda_1, \tau^2)$ distribution, and accept the move to $\lambda_1^*$ with probability $\min\{1, \tilde{\alpha}\}$, where
\[
\tilde{\alpha} = I(\lambda_1^* > 0) \left[ \exp\left\{ \log \frac{\Gamma(\lambda_1)}{\Gamma(\lambda_1^*)} + \frac{1}{pH} (\lambda_1 - \lambda_1^*) \left( 1 + \sum_{j,h} \log(\gamma_{jh}) - pH \log(\lambda_2) \right) \right\} \right]^{pH}.
\]
It is necessary to prespecify a candidate variance $\tau^2$ such that the acceptance rate is 20 to 40% (Gelman et al. 1996).

D.7 Final Comments about Computational Algorithm

We finally note that one can view our grouping priors in a hierarchical fashion with multiple levels. As is often the case in hierarchical models, there may be little information about the parameters in the lowest levels. We have often found this to be the case for the grouping priors, resulting in poor mixing for some of the lower-level model parameters. While the values of the autoregressive parameters and

PAGE 123

PAGE 124

APPENDIX E
ADDITIONAL RISK SIMULATIONS FOR COVARIANCE GROUPING PRIORS

E.1 Additional Details for Risk Simulation of Section 3.6

Mean, covariance, and dropout model (3) parameters are displayed in Table E-1. Recall that the dropout model (3) is given by
\[
\text{logit}\{\text{pr}(D_i = t+1 \mid D_i > t, y_{it}, m)\} = \eta_{0t} + \eta_{1t} y_{it} + \eta_{2m}.
\]
The dropout probabilities that this model induces are given in Table E-2. Recall that the $T(\phi_m)$ upper-triangular matrix is defined through the $J$-dimensional vector $\phi_m$ by
\[
T(\phi_m) = \begin{bmatrix}
1 & -\phi_{m1} & -\phi_{m2} & -\phi_{m4} & \cdots \\
  & 1 & -\phi_{m3} & -\phi_{m5} & \cdots \\
  &   & 1 & -\phi_{m6} & \cdots \\
  &   &   & 1 & \cdots \\
  &   &   &   & \ddots
\end{bmatrix}.
\]

Recall that in Section 3.6, we analyzed the risk under 5 different prior choices: the two naive priors, the two flat priors, and the grouping prior formed by the lag-block and the correlated-lognormal grouping priors. We extend the risk simulations from Table 3-1 by adding four additional grouping priors based on the new priors introduced in Appendix B. Each of these new priors combines an autoregressive parameter grouping prior with an innovation variance grouping prior. The four additional priors are lag-block/InvGamma, sparsity/correlated-lognormal, sparsity/InvGamma, and non-sparse/InvGamma. Specifications for hyperpriors and other simulation choices are the same as in Section 3.6.

Table E-3 contains the covariance and mean risk for this extended simulation. All five of the grouping priors beat the naive and flat priors. The lag-block/correlated-lognormal grouping prior is the most effective of our grouping priors for this analysis. The ability to borrow strength across groups improves the estimation such that even the

PAGE 125

Table E-1. Parameter values for risk simulation of Section 3.6.

$\mu_1 = (0, 1.9, 5.2, 9.9, 16.0, 23.5)$
$\mu_2 = (0, 1.8, 4.8, 9.0, 14.4, 21.0)$
$\mu_3 = (0, 1.8, 5.6, 11.4, 19.2, 29.0)$
$\mu_4 = (0, 2.0, 5.0, 9.0, 14.0, 20.0)$
$\mu_5 = (0, 2.0, 5.2, 9.6, 15.2, 22.0)$
$\mu_6 = (0, 3.0, 6.0, 9.0, 12.0, 15.0)$
$\mu_7 = (0, 1.8, 4.8, 9.0, 14.4, 21.0)$
$\mu_8 = (0, 2.8, 7.2, 13.2, 20.8, 30.0)$

$\phi_1 = (0.7, 0.2, 0.7, 0, 0.2, 0.7, 0, 0, 0.2, 0.7, 0, 0, 0, 0.2, 0.7)$
$\phi_2 = (0.6, 0.1, 0.6, 0.1, 0.1, 0.6, -0.1, 0.1, 0.1, 0.6, -0.1, -0.1, 0.1, 0.1, 0.6)$
$\phi_3 = (0.4, 0.3, 0.4, -0.2, 0.3, 0.4, 0, -0.2, 0.3, 0.4, -0.2, 0, -0.2, 0.3, 0.4)$
$\phi_4 = (0.3, 0, 0.3, -0.1, 0, 0.3, 0, -0.1, 0, 0.3, 0, 0, -0.1, 0, 0.3)$
$\phi_5 = (1, -0.5, 1, 0.2, -0.5, 1, 0, 0.2, -0.5, 1, 0, 0, 0.2, -0.5, 1)$
$\phi_6 = (0.8, -0.4, 0.8, 0.3, -0.4, 0.8, 0, 0.3, -0.4, 0.8, 0, 0, 0.3, -0.4, 0.8)$
$\phi_7 = (0.9, -0.2, 1, -0.2, -0.2, 1, -0.2, -0.2, -0.2, 1, -0.2, -0.2, -0.2, -0.2, 1)$
$\phi_8 = (-0.9, 0.1, -0.9, 0, 0.1, -1, 0.2, 0, 0.1, -0.8, -0.2, 0.2, 0, 0.1, -0.8)$

$\Gamma_1 = (1, 1, 1, 1, 1, 1)$
$\Gamma_2 = (1.5, 1.5, 1.5, 1.5, 1.5, 1.5)$
$\Gamma_3 = (3.4, 3.1, 2.8, 2.5, 2.2, 1.8)$
$\Gamma_4 = (3, 3, 2, 2, 2, 1)$
$\Gamma_5 = (3.5, 3.2, 2.9, 3.5, 3.2, 2.9)$
$\Gamma_6 = (5, 3.7, 3, 3, 2, 2)$
$\Gamma_7 = (2, 1.8, 1.6, 1.4, 1.2, 1)$
$\Gamma_8 = (3.3, 3, 2.7, 2.4, 2.2, 1.9)$

$\eta_0 = (-2.5, -3.5, -9, -13, -20)$
$\eta_1 = (0.4, 0.5, 0.8, 1.0, 1.2)$
$\eta_2 = (0, 0.2, -2, 0, 0, 0, 0.1, -4)$

Table E-2. Probability $Y_{it}$ is missing by group $m$ for risk simulation of Section 3.6.

m    t=2     t=3     t=4     t=5     t=6
1    0.080   0.156   0.167   0.239   0.498
2    0.100   0.188   0.199   0.241   0.356
3    0.014   0.029   0.034   0.112   0.670
4    0.090   0.180   0.190   0.225   0.303
5    0.093   0.199   0.224   0.323   0.496
6    0.100   0.251   0.282   0.333   0.348
7    0.094   0.183   0.197   0.251   0.374
8    0.002   0.007   0.014   0.172   0.726
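To make the parameterization in Table E-1 concrete, the following Python sketch assembles a covariance matrix from one group's autoregressive parameters and innovation variances. It is an illustration under assumptions rather than the simulation code itself: the function names `build_T` and `covariance_from_cholesky` are hypothetical, and the mapping relies on the residual definition of Appendix D.3, under which the components of $T'y$ are independent with variances given by the innovation variances, so that $\Sigma^{-1} = T D^{-1} T'$ with $D$ the diagonal matrix of innovation variances.

```python
import numpy as np


def build_T(phi, p):
    """Unit upper-triangular T(phi), filled column by column above the diagonal.

    Column t holds the negated autoregressive parameters for regressing the
    t-th measurement on the earlier ones, in the order phi_1; phi_2, phi_3; ...
    """
    T = np.eye(p)
    k = 0
    for col in range(1, p):
        for row in range(col):
            T[row, col] = -phi[k]
            k += 1
    return T


def covariance_from_cholesky(phi, gamma):
    """Covariance matrix implied by autoregressive parameters and innovation variances."""
    gamma = np.asarray(gamma, dtype=float)
    T = build_T(np.asarray(phi, dtype=float), gamma.size)
    # T' Sigma T = diag(gamma), hence Sigma = T'^{-1} diag(gamma) T^{-1}.
    T_inv = np.linalg.inv(T)
    return T_inv.T @ np.diag(gamma) @ T_inv


# Group 1 of Table E-1 (p = 6, so J = 15 autoregressive parameters).
phi_1 = [0.7, 0.2, 0.7, 0, 0.2, 0.7, 0, 0, 0.2, 0.7, 0, 0, 0, 0.2, 0.7]
gamma_1 = [1, 1, 1, 1, 1, 1]
Sigma_1 = covariance_from_cholesky(phi_1, gamma_1)
```

For this group the first two measurements satisfy $y_2 = 0.7 y_1 + \tilde{e}_2$ with unit innovation variances, so the sketch should return a covariance of 0.7 between them and a variance of 1.49 for the second.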

PAGE 126

Table E-3. Estimated risks for each choice of covariance prior from the extension of the Section 3.6 simulation. The estimated risk is calculated as the average loss using loss functions $L_1(\Sigma_m, \hat{\Sigma}_{m1}) = \text{tr}(\Sigma_m^{-1} \hat{\Sigma}_{m1}) - \log|\Sigma_m^{-1} \hat{\Sigma}_{m1}| - p$, $L_2(\Sigma_m, \hat{\Sigma}_{m2}) = \text{tr}\{(\Sigma_m^{-1} \hat{\Sigma}_{m2} - I)^2\}$, and $L_\mu(\hat{\mu}_m, \mu_m) = (\hat{\mu}_m - \mu_m)^\top \Sigma_m^{-1} (\hat{\mu}_m - \mu_m)$.

Covariance Prior                           Estimated Risk
Prior on $\phi$       Prior on $\Gamma$    $L_1$     $L_2$     $L_\mu$
Lag-block             Corr-lognormal       0.425     0.742     0.175
Lag-block             InvGamma             0.437     0.759     0.174
Sparsity              Corr-lognormal       0.553     0.915     0.196
Sparsity              InvGamma             0.565     0.934     0.196
Non-sparse            InvGamma             0.551     0.912     0.200
Naive Bayes 1                              0.605     0.987     0.203
Naive Bayes 2                              0.630     1.010     0.210
Group-specific flat*                       0.892     1.255     0.248
Common-flat                                8.105     84.339    0.925

*The group-specific flat prior is only over 49 data sets because the Markov chain failed to converge for one data set.

non-sparse grouping prior, which does not allow the correct independence relationships, beats the first naive prior, which correctly incorporates the potential independence. The lag-block/correlated-lognormal prior continues to beat the remainder of the grouping priors, with a risk improvement of 30 and 25% over the naive 1 prior and 52 and 41% over the group-specific flat prior. In terms of the risk associated with mean estimation, we note that all five grouping priors also dominate the naive and flat priors. The two lag-block priors beat the first naive prior by 14%. The grouping priors with sparse and non-sparse priors for $\phi$ do only slightly better than the naive choices but are still clearly superior to the flat priors.

E.2 Risk Simulation 2

We now describe three somewhat simpler risk simulations to further demonstrate that our grouping priors perform well in a variety of situations. Consider $M = 5$ groups of $p = 4$ dimensional normally distributed mean-zero random variables. The five

PAGE 127

Table E-4. Risk estimates for simulation 2.

Covariance Prior                           Estimated Risk
Prior on $\phi$       Prior on $\Gamma$    $L_1$     $L_2$
Lag-block             Corr-lognormal       0.247     0.429
Lag-block             InvGamma             0.257     0.448
Sparsity              Corr-lognormal       0.270     0.462
Sparsity              InvGamma             0.281     0.480
Naive Bayes 1                              0.291     0.488
Non-sparse            InvGamma             0.292     0.493
Naive Bayes 2                              0.322     0.530
Group-specific flat                        0.463     0.700
Common-flat                                1.560     6.623

covariance matrices are defined by the following specification of the autoregressive and innovation variance parameters:

$\phi_1 = (0.7, 0.2, 0.7, 0, 0.2, 0.7)$, $\Gamma_1 = (1, 1, 1, 1)$,
$\phi_2 = (0.7, 0, 0.3, 0, 0, 0.7)$, $\Gamma_2 = (2, 2, 2, 2)$,
$\phi_3 = (0.3, 0, 0.3, 0, 0, 0.3)$, $\Gamma_3 = (2, 2, 1, 1)$,
$\phi_4 = (0.7, 0.2, 0.7, 0.1, 0.2, 0.7)$, $\Gamma_4 = (5, 5, 5, 5)$,
$\phi_5 = (0.7, 0, 0.7, 0, 0, 0.3)$, $\Gamma_5 = (1, 1, 2, 2)$.

We use sample sizes of $n_1 = \cdots = n_4 = 30$, $n_5 = 15$. For this specification many of the parameters across groups are equal, and many of the higher lag autoregressive terms are zero. Additionally, with the smaller sample size for the fifth group, the grouping priors should improve estimation of $\Sigma_5$ by sharing information across similar groups. With groups of different sizes, we measure the loss in estimating the collection as the weighted average of the individual losses, $L_k(\Sigma_m, \hat{\Sigma}_{mk})$ $(k = 1, 2)$, with weights proportional to the sample size $n_m$. Using the same set-up as the previous simulation produces the risk estimates given in Table E-4. The prior composed of the lag-block structure on the autoregressive parameters and the correlated-lognormal specification for the variances has the best risk estimates of the collection. Comparing the lag-block/InvGamma and sparsity/correlated-lognormal priors to the sparsity/InvGamma grouping prior, the modification on either the autoregressive

PAGE 128

or the innovation variances produces improved risk performance. The lag-block/correlated-lognormal analysis produces risk estimates that are 15% and 12% lower than the naive 1 prior. It is natural to compare the sparse naive prior to the sparsity/InvGamma because the first is a limiting case of the latter. Likewise, we compare the naive 2 prior with grouping/InvGamma. For both loss functions, the sparsity/InvGamma beats naive 1 and grouping/InvGamma beats naive 2, indicating that the borrowing of information across groups induced by the grouping priors improves the estimation. We also see that the sparsity prior performs better than the non-sparse grouping prior, but this is to be expected since we know that there are autoregressive parameters that are equal to zero in the true model. Comparatively, the estimators from the flat priors perform very poorly; the risks for the grouping priors are at least 37% smaller than that of the group-specific estimator for $L_1$ and at least 30% smaller for $L_2$.

E.3 Risk Simulation 3

We perform another risk simulation similar to the previous one, again with five groups and $p = 4$. We fix the mean to zero, and the parameters of the true covariance matrices are given by

$\phi_1 = (1, 0.5, 1, 0.5, 0.5, 1)$, $\Gamma_1 = (2, 2, 2, 2)$,
$\phi_2 = (1, 0.5, 1, 0.5, 0.5, 1)$, $\Gamma_2 = (4, 4, 4, 4)$,
$\phi_3 = (1, -0.5, 1, -0.5, -0.5, 1)$, $\Gamma_3 = (2, 2, 2, 2)$,
$\phi_4 = (1, -0.5, 1, -0.5, -0.5, 1)$, $\Gamma_4 = (4, 4, 4, 4)$,
$\phi_5 = (2, -1.0, 2, -0.5, -1.0, 1)$, $\Gamma_5 = (2, 2, 1, 1)$.

Again we use the sample sizes $n_1 = \cdots = n_4 = 30$, $n_5 = 15$. There should be a large amount of clustering in this case, since there is a great deal of commonality among autoregressive parameters and innovation variances for different samples. These covariance matrices also do not have any conditional independence relationships to exploit since each of the $\phi$'s is nonzero. Risk estimates are shown in Table E-5. As in the previous risk simulation, the lag-block/correlated-lognormal prior produces the best risk at 15% and 20% lower than

PAGE 129

PAGE 130

Table E-6. Parameter values for simulation 4.

$\phi_1 = (0.7, 0.2, 0.7, 0, 0.2, 0.7, 0, 0, 0.2, 0.7, 0, 0, 0, 0.2, 0.7)$
$\phi_2 = (0.7, 0.2, 0.7, 0.1, 0.2, 0.7, 0, 0.1, 0.2, 0.7, 0, 0, 0.1, 0.2, 0.7)$
$\phi_3 = (0.3, 0, 0.3, 0, 0, 0.3, 0, 0, 0, 0.3, 0, 0, 0, 0, 0.3)$
$\phi_4 = (0.3, 0, 0.3, -0.1, 0, 0.3, 0, -0.1, 0, 0.3, 0, 0, -0.1, 0, 0.3)$
$\phi_5 = (1, -0.5, 1, 0, -0.5, 1, 0, 0, -0.5, 1, 0, 0, 0, -0.5, 1)$
$\phi_6 = (1, -0.5, 1, 0.3, -0.5, 1, 0, 0.3, -0.5, 1, 0, 0, 0.3, -0.5, 1)$
$\phi_7 = (1, -0.2, 1, -0.2, -0.2, 1, -0.2, -0.2, -0.2, 1, -0.2, -0.2, -0.2, -0.2, 1)$
$\phi_8 = (1, -0.2, 1, -0.2, -0.2, 1, -0.2, -0.2, -0.2, 1, -0.2, -0.2, -0.2, -0.2, 1)$

$\Gamma_1 = (1, 1, 1, 1, 1, 1)$
$\Gamma_2 = (1, 1, 1, 1, 1, 1)$
$\Gamma_3 = (3.4, 3.1, 2.8, 2.5, 2.2, 1.8)$
$\Gamma_4 = (3, 3, 2, 2, 2, 1)$
$\Gamma_5 = (5, 3, 3, 4, 4, 4)$
$\Gamma_6 = (5, 5, 3, 3, 2, 2)$
$\Gamma_7 = (2, 1.8, 1.6, 1.4, 1.2, 1)$
$\Gamma_8 = (3.4, 3.1, 2.8, 2.5, 2.2, 1.8)$

Table E-7. Risk estimates for simulation 4.

Covariance Prior                           Estimated Risk
Prior on $\phi$       Prior on $\Gamma$    $L_1$     $L_2$
Lag-block             Corr-lognormal       0.468     0.781
Lag-block             InvGamma             0.492     0.816
Sparsity              Corr-lognormal       0.556     0.904
Sparsity              InvGamma             0.583     0.939
Non-sparse            InvGamma             0.602     0.963
Naive Bayes 1                              0.664     1.013
Naive Bayes 2                              0.761     1.121
Group-specific flat                        1.300     1.584
Common-flat                                3.036     14.149

as in the risk simulation of the article. Here, we allow for $M = 8$ groups and consider $6 \times 6$ covariance matrices, defined by the dependence parameters in Table E-6. We assume a mean of zero and fully observe all data. This choice of parameters incorporates commonality both within lag and across groups, and possesses many conditional independence relationships among the higher lag terms. We choose a sample size of thirty for the first five groups and fifteen for the final three groups, and thirty clusters for the grouping priors. The estimated risk associated with estimating the covariance matrices for each of the two loss functions is shown in Table E-7

PAGE 131

PAGE 132

Table E-8. Model fit statistics and treatment effects for the first two groups for the depression data using each of the priors.

Covariance Prior                                   Model Fit                     Treatment Effect
Prior on $\phi$   Prior on $\Gamma$                Dev      $p_D$   DIC          Group 1               Group 2
Lag-block         Corr-logN ($\rho = 0.90$)        39,006   342     39,690       9.23 (7.03, 11.48)    9.51 (6.85, 12.19)
Lag-block         InvGamma                         38,999   350     39,698       9.22 (6.98, 11.45)    9.39 (6.73, 12.13)
Lag-block         Corr-logN ($\rho = 0.75$)        39,006   347     39,700       9.22 (6.99, 11.42)    9.56 (6.85, 12.27)
Lag-block         Corr-logN ($\rho = 0.50$)        39,003   349     39,700       9.24 (7.00, 11.53)    9.41 (6.74, 12.14)
Sparsity          Corr-logN ($\rho = 0.90$)        38,887   464     39,816       9.25 (7.02, 11.49)    8.78 (6.33, 11.24)
Sparsity          Corr-logN ($\rho = 0.75$)        38,887   466     39,819       9.23 (6.96, 11.42)    8.82 (6.39, 11.27)
Sparsity          Corr-logN ($\rho = 0.50$)        38,883   472     39,827       9.25 (7.01, 11.51)    8.68 (6.23, 11.20)
Sparsity          InvGamma                         38,884   475     39,834       9.25 (7.02, 11.53)    8.64 (6.16, 11.15)
Naive Bayes 1                                      38,875   481     39,837       9.25 (7.11, 11.46)    8.56 (6.16, 10.99)
Non-sparse        InvGamma                         38,818   529     39,876       9.29 (7.01, 11.59)    8.81 (6.21, 11.52)
Naive Bayes 2                                      38,765   563     39,890       9.24 (7.02, 11.49)    7.99 (5.59, 10.46)
Common-flat                                        39,907   220     40,347       9.44 (6.21, 12.53)    10.17 (7.02, 13.24)
Group-specific flat                                39,178   1021    41,219       9.20 (6.44, 12.08)    6.93 (4.22, 9.77)
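The model fit columns of Table E-8 can be cross-checked with a few lines of Python. The sketch assumes that Dev is the deviance evaluated at the posterior mean, in which case the standard relation is DIC = Dev + 2 $p_D$; that reading is an assumption made here, and the dictionary keys are labels chosen only for this illustration. The tabulated values are rounded, so agreement is expected only to within one unit.

```python
# (Dev, pD, DIC) for each row of Table E-8, in table order.
rows = {
    "lag-block / corr-logN 0.90": (39006, 342, 39690),
    "lag-block / InvGamma": (38999, 350, 39698),
    "lag-block / corr-logN 0.75": (39006, 347, 39700),
    "lag-block / corr-logN 0.50": (39003, 349, 39700),
    "sparsity / corr-logN 0.90": (38887, 464, 39816),
    "sparsity / corr-logN 0.75": (38887, 466, 39819),
    "sparsity / corr-logN 0.50": (38883, 472, 39827),
    "sparsity / InvGamma": (38884, 475, 39834),
    "naive Bayes 1": (38875, 481, 39837),
    "non-sparse / InvGamma": (38818, 529, 39876),
    "naive Bayes 2": (38765, 563, 39890),
    "common-flat": (39907, 220, 40347),
    "group-specific flat": (39178, 1021, 41219),
}

# Difference between the tabulated DIC and Dev + 2 * pD for every prior.
gaps = {name: dic - (dev + 2 * p_d) for name, (dev, p_d, dic) in rows.items()}
max_gap = max(abs(g) for g in gaps.values())

# Prior with the smallest tabulated DIC.
best = min(rows, key=lambda name: rows[name][2])
```

Under this reading, the ordering of the rows by DIC reflects the trade-off between fit and effective number of parameters: the flat priors carry either a much larger deviance (common-flat) or a much larger $p_D$ (group-specific flat) than the grouping priors.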

PAGE 133

PAGE 134

APPENDIX F
MODEL PARAMETERS FOR RISK SIMULATION OF CHAPTER 4

We describe the covariance matrices used to create the data in the risk simulation of Section 4.4. For each group $m$, we provide the $T_m$ Cholesky matrix where, instead of denoting 1 on the diagonal, we include the IV $\gamma_{mt}$ in bold. Empty upper-triangular elements are zero. Recall that the true partition structure is $P_1 = \{M\}$, $P_2 = \cdots = P_5 = \{\{1,2,3,4\},\{5,6,7,8\}\}$, $P_6 = \cdots = P_8 = \{\{1,2,3,4\},\{5,6\},\{7,8\}\}$, and $P_9 = P_{10} = \{\{1,2\},\{3,4\},\{5,6\},\{7,8\}\}$. Groups are in the same set of the partition if $\gamma_{mt}$ and the $t$th column of the $T_m$ matrix are equal.

Groups $m = 1, 2$:
\[
\begin{bmatrix}
\mathbf{1.0} & -.7 & -.2 & & & & & & & \\
& \mathbf{1.0} & -.7 & -.2 & & & & & & \\
& & \mathbf{1.0} & -.7 & -.2 & & & & & \\
& & & \mathbf{1.0} & -.7 & -.2 & & & & \\
& & & & \mathbf{1.0} & -.7 & -.2 & & & \\
& & & & & \mathbf{1.0} & -.7 & -.2 & & \\
& & & & & & \mathbf{1.0} & -.7 & -.2 & \\
& & & & & & & \mathbf{1.0} & -.7 & -.2 \\
& & & & & & & & \mathbf{1.0} & -.7 \\
& & & & & & & & & \mathbf{1.0}
\end{bmatrix}
\]

Groups $m = 3, 4$:
\[
\begin{bmatrix}
\mathbf{1.0} & -.7 & -.2 & & & & & & & \\
& \mathbf{1.0} & -.7 & -.2 & & & & & & \\
& & \mathbf{1.0} & -.7 & -.2 & & & & & \\
& & & \mathbf{1.0} & -.7 & -.3 & & & & \\
& & & & \mathbf{1.0} & -.8 & -.3 & & & \\
& & & & & \mathbf{0.7} & -.8 & -.3 & & \\
& & & & & & \mathbf{0.7} & -.8 & -.3 & \\
& & & & & & & \mathbf{0.7} & -.8 & -.3 \\
& & & & & & & & \mathbf{0.7} & -.8 \\
& & & & & & & & & \mathbf{0.7}
\end{bmatrix}
\]

PAGE 135

Groups $m = 5, 6$:
\[
\begin{bmatrix}
\mathbf{1.0} & -.5 & -.2 & & & & & & & \\
& \mathbf{1.0} & -.5 & -.2 & & & & & & \\
& & \mathbf{0.8} & -.5 & -.2 & & & & & \\
& & & \mathbf{0.8} & -.5 & -.2 & & & & \\
& & & & \mathbf{0.8} & -.5 & -.2 & & & \\
& & & & & \mathbf{0.8} & -.5 & -.2 & & \\
& & & & & & \mathbf{0.8} & -.5 & -.2 & \\
& & & & & & & \mathbf{0.8} & -.5 & -.2 \\
& & & & & & & & \mathbf{0.8} & -.5 \\
& & & & & & & & & \mathbf{0.8}
\end{bmatrix}
\]

Groups $m = 7, 8$:
\[
\begin{bmatrix}
\mathbf{1.0} & -.5 & -.2 & & & & & & & \\
& \mathbf{1.0} & -.5 & -.2 & & & & & & \\
& & \mathbf{0.8} & -.5 & -.2 & & & & & \\
& & & \mathbf{0.8} & -.5 & -.2 & & & & \\
& & & & \mathbf{0.8} & -.5 & -.2 & & & \\
& & & & & \mathbf{0.8} & -.5 & -.2 & .2 & \\
& & & & & & \mathbf{0.8} & -.5 & -.2 & .2 \\
& & & & & & & \mathbf{0.8} & -.7 & -.2 \\
& & & & & & & & \mathbf{1.2} & -.7 \\
& & & & & & & & & \mathbf{1.2}
\end{bmatrix}
\]

PAGE 136

REFERENCES

ALBERT, J. H. & CHIB, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association 88, 669.

ANDERSON, T. (1984). An Introduction to Multivariate Statistical Analysis, 2nd Edition. Wiley.

ARABIE, P. & BOORMAN, S. A. (1973). Multidimensional scaling of measures of distance between partitions. Journal of Mathematical Psychology 10, 148.

BARNARD, J., MCCULLOCH, R. & MENG, X.-L. (2000). Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage. Statistica Sinica 10, 1281.

BICKEL, P. & LEVINA, E. (2008). Regularized estimation of large covariance matrices. The Annals of Statistics 36, 199.

BOIK, R. J. (2002). Spectral models for covariance matrices. Biometrika 89, 159.

BOIK, R. J. (2003). Principal component models for correlation matrices. Biometrika 90, 679.

BOOTH, J. G., CASELLA, G. & HOBERT, J. P. (2008). Clustering using objective functions and stochastic search. Journal of the Royal Statistical Society, Series B 70, 119.

CAI, B. & DUNSON, D. B. (2006). Bayesian covariance selection in generalized linear mixed models. Biometrics 62, 446.

CARTER, C. K., WONG, F. & KOHN, R. (2011). Constructing priors based on model size for nondecomposable Gaussian graphical models: A simulation based approach. Journal of Multivariate Analysis 102, 871.

CHEN, Z. & DUNSON, D. B. (2003). Random effects selection in linear mixed models. Biometrics 59, 762.

CHIB, S. & GREENBERG, E. (1998). Analysis of multivariate probit models. Biometrika 85, 347.

CHIU, T. Y. M., LEONARD, T. & TSUI, K.-W. (1996). The matrix-logarithmic covariance model. Journal of the American Statistical Association 91, 198.

COX, D. R. & REID, N. (1987). Parameter orthogonality and approximate conditional inference (with discussion). Journal of the Royal Statistical Society, Series B 49, 1.

CRIPPS, E., CARTER, C. K. & KOHN, R. (2005). Variable selection and covariance selection in multivariate regression models. In Handbook of Statistics, vol. 25.

PAGE 137

CROWLEY, E. M. (1997). Product partition models for normal means. Journal of the American Statistical Association 92, 192.

DAMIEN, P., WAKEFIELD, J. & WALKER, S. (1999). Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. Journal of the Royal Statistical Society, Series B 61, 331.

DANAHER, P., WANG, P. & WITTEN, D. M. (2012). The joint graphical lasso for inverse covariance estimation across multiple classes. ArXiv preprint arXiv:1111.0324.

DANIELS, M. J. (2006). Bayesian modelling of several covariance matrices and some results on the propriety of the posterior for linear regression with correlated and/or heterogeneous errors. Journal of Multivariate Analysis 97, 1185.

DANIELS, M. J. & HOGAN, J. W. (2008). Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis. Chapman & Hall.

DANIELS, M. J. & KASS, R. E. (1999). Nonconjugate Bayesian estimation of covariance matrices and its use in hierarchical models. Journal of the American Statistical Association 94, 1254.

DANIELS, M. J. & NORMAND, S.-L. (2006). Longitudinal profiling of health care units based on mixed multivariate patient outcomes. Biostatistics 7, 1.

DANIELS, M. J. & POURAHMADI, M. (2002). Bayesian analysis of covariance matrices and dynamic models for longitudinal data. Biometrika 89, 553.

DANIELS, M. J. & POURAHMADI, M. (2009). Modeling covariance matrices via partial autocorrelations. Journal of Multivariate Analysis 100, 2352-2363.

DAWID, A. P. & LAURITZEN, S. L. (1993). Hyper Markov laws in the statistical analysis of decomposable graphical models. The Annals of Statistics 21, 1272.

DAY, W. H. E. (1981). The complexity of computing metric distances between partitions. Mathematical Social Sciences 1, 269.

DEMPSTER, A. P. (1972). Covariance selection. Biometrics 28, 157.

DENOEUD, L. & GUENOCHE, A. (2006). Comparison of distance indices between partitions. In Data Science and Classification, V. Batagelj, H.-H. Bock, A. Ferligoj & A. Ziberna, eds., Studies in Classification, Data Analysis, and Knowledge Organization. Springer Berlin Heidelberg, pp. 21.

DUNSON, D. B., XUE, Y. & CARIN, L. (2008). The matrix stick-breaking process: Flexible Bayes meta-analysis. Journal of the American Statistical Association 103, 317.

FOX, E. & DUNSON, D. B. (2011). Bayesian nonparametric covariance regression. ArXiv preprint arXiv:1101.2017.

PAGE 138

PAGE 139

KUROWICKA, D. & COOKE, R. (2006). Completion problem with partial correlation vines. Linear Algebra and its Applications 418, 188.

LAURITZEN, S. L. (1996). Graphical Models. Clarendon Press.

LEONARD, T. & HSU, J. S. J. (1992). Bayesian inference for a covariance matrix. The Annals of Statistics 20, 1669.

LETAC, G. & MASSAM, H. (2007). Wishart distributions for decomposable graphs. The Annals of Statistics 35, 1278.

LIECHTY, J., LIECHTY, M. & MÜLLER, P. (2004). Bayesian correlation estimation. Biometrika 91, 1.

LITTLE, R. J. A. & RUBIN, D. B. (2002). Statistical Analysis with Missing Data. New York: John Wiley & Sons.

LIU, C. (2001). Comment on "The art of data augmentation" by D. A. van Dyk and X.-L. Meng. Journal of Computational and Graphical Statistics 10, 75.

LIU, X. & DANIELS, M. J. (2006). A new algorithm for simulating a correlation matrix based on parameter expansion and re-parameterization. Journal of Computational and Graphical Statistics 15, 897.

LIU, X., DANIELS, M. J. & MARCUS, B. (2009). Joint models for the association of longitudinal binary and continuous processes with application to a smoking cessation trial. Journal of the American Statistical Association 104, 429.

LOPES, H. F., MCCULLOCH, R. E. & TSAY, R. S. (2011). Cholesky stochastic volatility. Submitted.

LOPES, H. F. & WEST, M. (2004). Bayesian model assessment in factor analysis. Statistica Sinica 14, 41.

MANLY, B. F. J. & RAYNER, J. C. W. (1987). The comparison of sample covariance matrices using likelihood ratio tests. Biometrika 74, 841.

MARCUS, B., ALBRECHT, A., KING, T., PARISI, A., PINTO, B., ROBERTS, M., NIAURA, R. & ABRAMS, D. (1999). The efficacy of exercise as an aid for smoking cessation in women: A randomized controlled trial. Archives of Internal Medicine 159, 1229.

MAZUMDER, R. & HASTIE, T. (2012). Exact covariance thresholding into connected components for large-scale graphical lasso. Journal of Machine Learning Research 13, 781.

MCNICHOLAS, P. D. & MURPHY, T. B. (2010). Model-based clustering of longitudinal data. The Canadian Journal of Statistics 38, 153.

MEINSHAUSEN, N. & BÜHLMANN, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics 34, 1436.

PAGE 140