﻿

# Bayesian Methods for Modeling Dependence Structures in Longitudinal Data

## Material Information

Title:
Bayesian Methods for Modeling Dependence Structures in Longitudinal Data
Physical Description:
1 online resource (142 p.)
Language:
English
Creator:
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

## Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Statistics
Committee Chair:
Daniels, Michael Joseph
Committee Members:
Doss, John
Khare, Kshitij
Manini, Todd M

## Subjects

Subjects / Keywords:
bayesian -- correlation -- covariance -- longitudinal -- sparsity
Statistics -- Dissertations, Academic -- UF
Genre:
Statistics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

## Notes

Abstract:
In the modeling of longitudinal data from several groups, appropriate handling of the dependence structure is of central importance. In this dissertation we consider two important situations where estimation of the dependence structure is particularly important. We develop Bayesian prior distributions for a correlation matrix that favor parsimonious models. We additionally introduce two new estimation methods for the situation where data are composed of multiple groups, each of which requires its own covariance matrix.

Modeling a correlation matrix can be a difficult statistical task due to both the positive definite and the unit diagonal constraints. Because the number of parameters increases quadratically in the dimension, it is often useful to consider a sparse parameterization. In Chapter 2, we introduce prior distributions on the set of correlation matrices through the partial autocorrelations (PACs), each of which varies independently over [-1, 1]. The first of the two proposed priors shrinks each of the PACs toward zero with increasingly aggressive shrinkage in lag. The second prior (a selection prior) is a mixture of a zero point mass and a continuous component for each PAC, allowing for a sparse representation. The structure implied under our priors is readily interpretable because each zero PAC implies a conditional independence relationship in the distribution of the data. For ordered data, selection priors on the PACs provide a computationally attractive alternative to selection on the elements of the correlation matrix or its inverse. These priors allow for data-dependent shrinkage/selection under an intuitive parameterization in an unconstrained setting. The proposed priors are compared to standard methods through a simulation study and a multivariate probit data example.

In Chapter 3, we focus on the challenge of estimating multiple covariance matrices. Standard methods include specifying a single covariance matrix for all groups or independently estimating the covariance matrix for each group without regard to the others, but when these model assumptions are incorrect, these techniques can lead to biased mean effects or loss of efficiency, respectively. Thus, it is desirable to develop methods to simultaneously estimate the covariance matrix for each group that will borrow strength across groups in a way that is ultimately informed by the data. In addition, for several groups with covariance matrices of even medium dimension, it is difficult to manually select a single best parametric model among the huge number of possibilities given by incorporating structural zeros and/or commonality of individual parameters across groups. In this chapter we develop a family of nonparametric priors using the matrix stick-breaking process of Dunson et al. (2008) that seeks to accomplish this task by parameterizing the covariance matrices in terms of the parameters of their modified Cholesky decomposition (Pourahmadi, 1999). We establish some theoretical properties of these priors, examine their effectiveness via a simulation study, and illustrate the priors using data from a longitudinal study of a depression treatment.

Chapter 4 proposes a second method to handle the situation of simultaneous covariance estimation. We introduce a covariance partition prior which provides a partition of the groups at each measurement time. Groups in a common set of the partition share dependence parameters for the distribution of the current measurement given the preceding ones, and the sequence of partitions is modeled as a Markov chain to encourage them to vary smoothly across measurement times. This approach additionally encourages a lower-dimensional structure of the covariance matrices by using a sparse Cholesky structure. We demonstrate the performance of our model through a simulation study and analysis of the depression study data.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
Thesis:
Thesis (Ph.D.)--University of Florida, 2013.
Local:
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2015-08-31

## Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2013
System ID:
UFE0045709:00001

## Full Text

I dedicate this dissertation to my family and friends, whose consistent support has been invaluable to me on this journey.

### TABLE OF CONTENTS

- ACKNOWLEDGMENTS
- LIST OF TABLES
- LIST OF FIGURES
- ABSTRACT
- CHAPTER 1: INTRODUCTION
  - 1.1 Literature Review for Correlation Estimation and Prior Distributions
  - 1.2 Literature Review for Covariance Estimation and Prior Distributions
  - 1.3 Literature Review for Simultaneous Covariance Estimation and Prior Distributions
- CHAPTER 2: SPARSE PRIOR DISTRIBUTIONS FOR CORRELATION MATRICES THROUGH THE PARTIAL AUTOCORRELATIONS
  - 2.1 Bayesian Correlation Estimation
  - 2.2 Partial Autocorrelations
  - 2.3 Partial Autocorrelation Shrinkage Priors
    - 2.3.1 Specification of the Shrinkage Prior
    - 2.3.2 Sampling under the Shrinkage Prior
  - 2.4 Partial Autocorrelation Selection Priors
    - 2.4.1 Specification of the Selection Prior
    - 2.4.2 Normalizing Constant for Priors on $R$
    - 2.4.3 Sampling under the Selection Prior
  - 2.5 Simulations
  - 2.6 Data Analysis
  - 2.7 Discussion
- CHAPTER 3: A NONPARAMETRIC PRIOR FOR SIMULTANEOUS COVARIANCE ESTIMATION
  - 3.1 Simultaneous Covariance Estimation
  - 3.2 The Matrix Stick-Breaking Process
  - 3.3 Covariance Grouping Priors
    - 3.3.1 Lag-Block Grouping Prior for $\Phi$
    - 3.3.2 Correlated-Lognormal Grouping Prior for $\Gamma$
  - 3.4 Theoretical Properties
    - 3.4.1 Generalized Autoregressive Parameter Properties
    - 3.4.2 Innovation Variance Properties
  - 3.5 Computational Considerations
  - 3.6 Risk Simulation
  - 3.7 Data Example
  - 3.8 Discussion
- CHAPTER 4: COVARIANCE PARTITION PRIORS: A BAYESIAN APPROACH TO SIMULTANEOUS COVARIANCE ESTIMATION
  - 4.1 Simultaneous Covariance Estimation and a Drawback of the Covariance Grouping Priors
  - 4.2 Covariance Partition Prior
    - 4.2.1 Prior on the Sequence of Partitions
    - 4.2.2 Prior on the Cholesky Parameters
  - 4.3 Sampling Algorithm
  - 4.4 Simulation Study
  - 4.5 Depression Data Example
  - 4.6 Discussion
- CHAPTER 5: CONCLUSIONS AND FUTURE WORK
- APPENDIX A: CALCULATING THE DIC STATISTIC FOR CTQ DATA
- APPENDIX B: ADDITIONAL COVARIANCE GROUPING PRIORS AND THEIR PROPERTIES
  - B.1 Sparsity Grouping Prior for $\Phi$
  - B.2 Non-Sparse Grouping Prior for $\Phi$
  - B.3 InvGamma Grouping Prior for $\Gamma$
  - B.4 Further Grouping Prior Extensions
- APPENDIX C: DERIVATION OF COVARIANCE GROUPING PRIOR PROPERTIES
  - C.1 Proofs for Generalized Autoregressive Parameter Properties
  - C.2 Proofs for Innovation Variance Properties
- APPENDIX D: DETAILS OF MCMC ALGORITHM FOR COVARIANCE GROUPING PRIORS
  - D.1 Preliminaries
  - D.2 Sampling Steps for Lag-Block Grouping Prior
  - D.3 Sampling Steps for Correlated-Lognormal Grouping Prior
  - D.4 Sampling Steps for Sparsity Grouping Prior
  - D.5 Sampling Steps for Non-Sparse Grouping Prior
  - D.6 Sampling Steps for InvGamma Grouping Prior
  - D.7 Final Comments about Computational Algorithm
- APPENDIX E: ADDITIONAL RISK SIMULATIONS FOR COVARIANCE GROUPING PRIORS
  - E.1 Additional Details for Risk Simulation of Section 3.6
  - E.2 Risk Simulation 2
  - E.3 Risk Simulation 3
  - E.4 Risk Simulation 4
  - E.5 Extended Analysis of Depression Study Data
- APPENDIX F: MODEL PARAMETERS FOR RISK SIMULATION OF CHAPTER 4
- REFERENCES
- BIOGRAPHICAL SKETCH

### LIST OF TABLES

- 2-1 Risk estimates for simulation study with dimension $J = 6$.
- 2-2 Specification of D0.
- 2-3 Risk estimates for simulation study with dimension $J = 10$.
- 2-4 Model comparison statistics for the CTQ data.
- 3-1 Risk estimates from simulation study.
- 3-2 Model fit statistics and treatment effects for the depression data.
- 4-1 Risk estimates from simulation study.
- 4-2 Model selection statistics for depression study.
- E-1 Parameter values for risk simulation of Section 3.6.
- E-2 Probability $Y_{it}$ is missing by group $m$ for risk simulation of Section 3.6.
- E-3 Expanded risk estimates for Section 3.6 simulation study.
- E-4 Risk estimates for simulation 2.
- E-5 Risk estimates for simulation 3.
- E-6 Parameter values for simulation 4.
- E-7 Risk estimates for simulation 4.
- E-8 Expanded model fit statistics and treatment effects for the depression data.

### LIST OF FIGURES

- 2-1 Boxplots of the observed loss from simulation study with $J = 6$.
- 2-2 Boxplots of the observed loss from simulation study with $J = 10$.
- 3-1 Posterior probabilities of matching for the innovation variances.
- 3-2 Posterior probabilities of matching for the generalized autoregressive parameters.
- 4-1 Marginal probabilities for three choices of $q$ with $M = 8$ groups.
- 4-2 Boxplots of loss from simulation study.
- 4-3 Posterior probability that $m_1$ and $m_2$ are in the same set of the partition $P_t$.


### CHAPTER 1: INTRODUCTION

When working with longitudinal (or time-ordered) data, specifying and/or modeling of the dependence structure is of prime importance. In most situations the methods of linear and generalized linear models can be adapted to describe the mean structure. As longitudinal data consist of multiple measurements within an experimental unit, methods to handle the dependence across the measurements become necessary. Here, we distinguish longitudinal data analysis from the related field of multivariate analysis (see e.g., Anderson 1984). We define longitudinal data to be a multivariate response for an experimental unit (patient or case) where the responses are measurements of the same outcome at different time points, e.g., whether a patient smokes during a given week over the course of 8 weeks or number of depression symptoms in a week over 16 weeks. Data of this type have a clear ordering in time, whereas general multivariate data may not. One should note that the methods we devise here and most in the longitudinal data literature make use of assumptions and intuition that are only reasonable when the measurements follow a fixed ordering and may not be appropriate in more general multivariate situations.

It is not uncommon for statisticians to mistakenly assume that estimation of the covariance parameters performs a secondary role to the mean estimation. In fact, they should generally be treated jointly. In situations with complete data (i.e., all patients are observed at all observation times) and multivariate normality, the mean and covariance parameters are orthogonal in the sense of Cox & Reid (1987), and the estimates of the mean parameters will be consistent under misspecification of the dependence structure. However, when one analyzes real-world longitudinal data such as that from a clinical trial, the data will usually exhibit some amount of missingness. If there is missingness in the data, there is no longer orthogonality between mean and dependence parameters, even at the true value of the covariance matrix (Little & Rubin 2002). Hence, for the posterior distribution of the mean parameters to be consistent, the dependence structure must be correctly specified. So, even in the missing at random case (MAR; Daniels & Hogan 2008), where missingness depends only on the observed values and not on the unobserved data, it is no longer appropriate to treat the covariance structure as a nuisance parameter. In this case biased mean estimates can result if we do not use the correct model for the dependence (Daniels & Hogan 2008, Section 6.2).

To further motivate the necessity of methodology for improved covariance estimation, note that even in the complete data case where the mean and dependence are asymptotically independent under normality, efficiency gains may be possible for small or moderately sized data sets. Through four simulation examples with relatively small sample sizes, Cripps et al. (2005) demonstrated improvements in estimating regression coefficients, fitted values, and the predictive density using the Wong et al. (2003) covariance selection prior over a more dispersed covariance prior choice.

In this dissertation we will develop Bayesian methods to model longitudinal dependence structures. These depend on the specification of an appropriate prior distribution for the covariance matrix or its parameters. A key consideration in developing these priors is the ability to incorporate them as part of a Markov chain Monte Carlo (MCMC) scheme to obtain posterior inference. We also want the priors to be structured such that they are centered at intuitive prior beliefs specific to longitudinal data, such as decreasing dependence across time or positive (rather than negative) correlation. Additionally, we desire priors that promote sparse (or lower-dimensional) parameterizations of the dependence structure.

Throughout we consider two important situations in longitudinal data. First, in Chapter 2, we look at situations where the dependence structure is constrained to be a correlation matrix for identifiability. This complication is encountered in multivariate probit models (Chib & Greenberg 1998), Gaussian copula regression (Pitt et al. 2006), and certain latent variable models (Daniels & Normand 2006), among others. The other problem, which we consider in Chapters 3 and 4, is that of jointly estimating multiple covariance matrices. Often longitudinal data may be viewed as composed of several groups, each of which may need its own covariance structure. It is desirable to develop methodology to jointly estimate these covariance matrices, allowing sharing of information. We now review the important contributions to the literature for each of these problems.

#### 1.1 Literature Review for Correlation Estimation and Prior Distributions

First, we explore the problem of modeling a correlation matrix. Consider a mean-zero, $J$-dimensional random vector $Y$ with correlation matrix $R$. There are two main constraints on the $J \times J$ matrix $R$: the diagonal elements must all equal one and $R$ is positive-definite. Let $\mathcal{R}_J$ denote the set of all real matrices that satisfy these requirements. Positive-definiteness generally provides the greatest difficulty in analysis, as the set of values for a particular element $\rho_{ij}$ that satisfy the positive-definite constraint depends on the choice of the remaining elements of $R$. Additionally, because the number of parameters in $R$ is quadratic in the dimension $J$, methods to find a parsimonious or lower-dimensional structure can be beneficial.

One of the earliest attempts to find lower-dimensional structure for the more general problem of estimating a covariance matrix is the idea of covariance selection (Dempster 1972). By setting some of the off-diagonal elements of the concentration matrix $\Omega = \Sigma^{-1}$ to zero, a more parsimonious choice for the covariance matrix of $Y$ is achieved. A zero in the $(i,j)$-th position of $\Omega$ implies zero correlation (and further, independence under multivariate normality) between $Y_i$ and $Y_j$, conditional on the remaining components of $Y$. This property, along with its relation to graphical model theory (Lauritzen 1996), has led to the use of covariance selection as a standard part of analysis in multivariate problems (for instance, Rothman et al. 2008; Wong et al. 2003; Yuan & Lin 2007). However, one must be cautious when using such selection methods as not all produce positive definite estimators. For instance, thresholding the sample correlation matrix will not necessarily yield a positive definite estimate (Bickel & Levina 2008).
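As a small illustration of the caution about thresholding, the sketch below hard-thresholds a valid correlation matrix and checks positive definiteness with Sylvester's criterion (all leading principal minors positive). It uses only plain Python; the 3 x 3 matrix, the cutoff of 0.8, and the helper names are illustrative assumptions and do not come from the dissertation.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for tiny matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def is_positive_definite(m):
    """Sylvester's criterion: every leading principal minor is strictly positive."""
    return all(det([row[:k] for row in m[:k]]) > 0 for k in range(1, len(m) + 1))


def hard_threshold(r, cutoff):
    """Zero every off-diagonal entry whose absolute value is below the cutoff."""
    n = len(r)
    return [[r[i][j] if i == j or abs(r[i][j]) >= cutoff else 0.0
             for j in range(n)]
            for i in range(n)]


# Hypothetical correlation matrix for J = 3 time points (illustrative values only).
R = [[1.0, 0.9, 0.7],
     [0.9, 1.0, 0.9],
     [0.7, 0.9, 1.0]]

R_thresholded = hard_threshold(R, cutoff=0.8)

print(is_positive_definite(R))              # True: determinant is about 0.024
print(is_positive_definite(R_thresholded))  # False: determinant is about -0.62
```

Zeroing the lag-two correlation while keeping both lag-one correlations at 0.9 asks for a dependence pattern that no valid correlation matrix can have, which is why the thresholded matrix fails the check.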

Even in situations where the focus is on a general covariance matrix, model specification may depend on the correlation structure through the so-called separation strategy (Barnard et al. 2000). The separation strategy involves reparameterizing $\Sigma$ by $\Sigma = SRS$, with $S$ a diagonal matrix containing the marginal standard deviations of $Y$ and $R$ the correlation matrix. Separation can also be performed on the concentration matrix, $\Omega = TCT$, so that $T$ is diagonal and $C \in \mathcal{R}_J$. The diagonal elements of $T$ give the partial standard deviations, while the elements $c_{ij}$ of $C$ are the (full) partial correlations. The covariance selection problem for $\Omega$ is equivalent to choosing elements of the partial correlation matrix $C$ to be null. Several authors have constructed priors to efficiently estimate $\Sigma$ by allowing $C$ to be a sparse matrix (Carter et al. 2011; Wong et al. 2003).

In many cases the full partial correlation matrix may not be convenient to use. When the covariance matrix is fixed to a correlation matrix, such as for the multivariate probit model, the elements of the concentration matrix $T$ and $C$ are constrained to maintain a unit diagonal for $\Sigma$. This is easy to see since $\Sigma = R$ and $C$ each have $J(J-1)/2$ parameters but $T$ adds an additional $J$ parameters. Additionally, interpretation of parameters in the full partial correlation matrix can be challenging, particularly for longitudinal settings, as the partial correlations are defined conditional on future values. For example, $c_{12}$ gives the correlation between $Y_1$ and $Y_2$ conditional on the future measurements $Y_3, \ldots, Y_J$. A further issue with Bayesian methods that promote sparsity in $C$ is calculating the volume of the space of correlation matrices with a fixed zero pattern; see Section 2.4.2 for details.

Previous Bayesian solutions are concerned with choices of an appropriate prior distribution $p(R)$ on $\mathcal{R}_J$. Commonly used priors include placing equal weight on all elements of $\mathcal{R}_J$ (Barnard et al. 2000) and Jeffreys' prior $p(R) \propto |R|^{-(J+1)/2}$. In these cases the sampling steps for $R$ can sometimes benefit from parameter expansion techniques (Liu 2001; Liu & Daniels 2006; Zhang et al. 2006). Liechty et al. (2004) also develop a correlation matrix prior by specifying each element $\rho_{ij}$ of $R$ as an independent normal subject to $R \in \mathcal{R}_J$. Pitt et al. (2006) extend the covariance selection prior (Wong et al. 2003) to the correlation matrix case by fixing the elements of $T$ to be constrained by $C$ so that $T$ is the diagonal matrix such that $R = (TCT)^{-1}$ has unit diagonal.

The difficulty of jointly dealing with the positive-definite and unit diagonal constraints of a correlation matrix has led some researchers to consider priors for $R$ based on the partial autocorrelations (PACs). The PAC between $Y_i$ and $Y_j$ ($i$

is one such method (Carter et al. 2011; Pitt et al. 2006; Wong et al. 2003). Other non-Bayesian techniques that encourage sparse versions of the dependence structure include banding the sample covariance or concentration matrix (Bickel & Levina 2008), using a lasso-type penalty on the elements of $\Omega$ (Mazumder & Hastie 2012; Meinshausen & Bühlmann 2006) or the elements of the Cholesky decomposition of $\Omega$ (Rothman et al. 2008), and banding the Cholesky decomposition of $\Sigma$ (Rothman et al. 2010). Again, one must be cautious when using selection methods as not all produce positive definite estimators.

As previously mentioned, the ideas from covariance selection are connected to those of graphical methods. Bayesian methods for graphical models generally consist of fixing the zero structure of $\Omega$ (or $\Sigma$) to represent a particular graph $G$. A prior for $\Omega$ ($\Sigma$) is then constructed to range over the space of concentration (covariance) matrices with the appropriate zero structure. There are many such priors with varying degrees of flexibility (Dawid & Lauritzen 1993; Khare & Rajaratnam 2011; Letac & Massam 2007; Rajaratnam et al. 2008). Giudici & Green (1999) propose a hierarchical prior where the graph $G$ is assumed random over the space of decomposable graphs. Most graphical methods do not incorporate the ordering of the responses, and consequently, $Y_1$ and $Y_2$ are as likely to be uncorrelated as are $Y_1$ and $Y_J$. This is undesirable for longitudinal data, and so we do not consider the graphical model methodology further.

Bayesian factor models provide another option to develop lower-dimensional specifications of $\Sigma$. The most common of these models factor the covariance matrix as $\Sigma = \Lambda\Lambda' + D$, where $D$ is diagonal (sometimes of the form $\sigma^2 I$) and $\Lambda$ is $J \times k$ with $k$

exhibit identifiability problems when used as part of an MCMC analysis (Lopes & West 2004).

A parameterization based on the Cholesky decomposition of $\Sigma$ has been proposed by Pourahmadi (1999, 2000). In this parameterization the covariance matrix depends on two sets of parameters: $\Phi$, the generalized autoregressive parameters (GARPs), and $\Gamma$, the set of innovation variances (IVs), such that

$$
\Sigma(\Phi,\Gamma)^{-1} = T(\Phi)\,D(\Gamma)\,T(\Phi)' =
\begin{bmatrix}
1 & -\phi_{12} & -\phi_{13} & \cdots \\
  & 1 & -\phi_{23} & \cdots \\
  &   & 1 & \cdots \\
  &   &   & \ddots
\end{bmatrix}
\begin{bmatrix}
\frac{1}{\gamma_1} & & & \\
 & \frac{1}{\gamma_2} & & \\
 & & \ddots & \\
 & & & \frac{1}{\gamma_J}
\end{bmatrix}
\begin{bmatrix}
1 & & & \\
-\phi_{12} & 1 & & \\
-\phi_{13} & -\phi_{23} & 1 & \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}.
$$

The $T(\Phi)$ matrix is upper-triangular with ones on its diagonal. Note that there are $J$ parameters for each $\Gamma = (\gamma_1, \ldots, \gamma_J)$ and $K = J(J-1)/2$ parameters associated with each $\Phi = (\phi_{12}, \ldots, \phi_{J-1,J})$. One of the key motivations behind the use of the modified Cholesky decomposition is that the only constraint on $\Phi$ and $\Gamma$ needed to guarantee that $\Sigma(\Phi,\Gamma)$ is positive definite is that $\gamma_j > 0$ for all $j = 1, \ldots, J$. Additionally, the GARPs and IVs are interpreted as parameters from sequential regressions; i.e., $E\{Y_j \mid y_1, \ldots, y_{j-1}\} = \phi_{1j} y_1 + \cdots + \phi_{j-1,j} y_{j-1}$ and $\mathrm{Var}\{Y_j \mid y_1, \ldots, y_{j-1}\} = \gamma_j$ (assuming without loss of generality that $Y$ has mean zero). Again, the dependence on the ordering of the components of $Y$ is clear. The interpretability of the GARPs relies on an assumed order of the $J$ components of $Y$, which is natural in the case of longitudinal measurements.

Priors for $\Sigma$ can then be formed by specifying priors on $\Phi$ and $\Gamma$. Daniels & Pourahmadi (2002) develop priors that exploit conjugacy for the GARPs and IVs. Smith & Kohn (2002) form parsimonious priors that stochastically set elements of $\Phi$ to zero. Note that $\phi_{jk} = 0$ ($j$
PAGE 20

Y1,...,Yj)]TJ /F9 7.97 Tf 6.58 0 Td[(1,Yj+1,...,Yk)]TJ /F9 7.97 Tf 6.58 0 Td[(1,sothatthesparsityisinterpretableasanindependencerelationship.OtherBayesianmethodsforcovariancepriorsincludespecifyingthepriorintermsofthematrixlogarithmofsothatitisunconstrained( Leonard&Hsu 1992 ),priorsbasedonthespectraldecomposition( Daniels&Kass 1999 ),aatprioronortheJeffrey'spriorp()/jj)]TJ /F9 7.97 Tf 6.59 0 Td[((J+1)=2,andthereferenceprior( Yang&Berger 1994 ). 1.3LiteratureReviewforSimultaneousCovarianceEstimationandPriorDistributionsInSection 1.2 wedescribedtechniquestomodelasinglecovariancematrix.Arelatedchallengeissimultaneouslyestimatingmultiplecovariancematrices.Frequentlyinlongitudinalproblemsdataiscomposedofseveralgroups,suchasdifferingtreatmentsinaclinicaltrial.Inmanycases,particularlyifonedoesnothavemanyobservationspergroup,oneassumesthatthecovariance(orcorrelation)structureisconstantacrossallgroups.However,thisassumption,ifitfailstohold,canhaveadramaticeffectontheinferenceofmeaneffects,evenleadingtobiasifdataareincomplete( Daniels&Hogan 2008 ).Conversely,ifonespecieseachofthecovariancematriceswithoutregardtotheothergroups,thiscanleadtoalossofinformation.Soitisimportanttondmethodsthatcanndamiddlegroundbetweenthesetwoextremesbysharinginformationaboutthedependenceacrossgroups.Letmdenotethecovariancematrixforgroupm(m=1,...,M)and=f1,...,Mgbethecollectionofcovariancematrices.Manyauthorshavedevelopedfrequentistestimatorsforthiscollectionbyinducingcommonalityamongsomefeatureofthem. Boik ( 2002 2003 )proposedmodelstoinducestructurebyimposingcommonalityonsome(orall)oftheprincipalcomponentsofthecovarianceorcorrelationmatrix.Othershaveusedthevariance-correlationdecompositionforestimationbyimposingstructuressuchasproportionalityofallmorcommonalityamongthecorrelationmatrices( Manly&Rayner 1987 ). Pourahmadietal. ( 2007 ) 20

PAGE 21

developedestimationandtestingproceduresforequalityamongtheGARPsandsubsetsoftheGARPs.Inaclusteringcontext McNicholas&Murphy ( 2010 )advocateasimilarcovarianceestimationprocedurethatincludesbandingtheT()matrices. Daniels ( 2006 )consideredaBayesianperspectivebyintroducingpriorsfortheGARPsandIVs,aswellastheprincipalcomponentsofthecovariancematrices,thatinducepoolingacrossgroups.Unfortunately,itiscomputationallychallengingtoselectamongallthepossiblemodelswithintheseclasses. Hoff ( 2009 )proposedahierarchicaleigenmodelforthatpoolstheeigenvectorsofeachgrouptowardacommonstructure. Guoetal. ( 2011 )consideranautomatedapproachusingthelassotoestimatesparsegraphicalmodelsbyselectingsetsofedgescommontoallgroups,aswellasgroup-specicedges.Inthelongitudinaldatasettingwewishtondmorecovariancestructurethanjustcommonzerosacrossallgroups.Wewanttoconsidermodelsthatallowsubsetsofthemodelparameterstobeequalacross(asubsetofthe)groupsatnon-zerovalues;theGuoetal.estimatorsdonotaccommodatethisgoal. Danaheretal. ( 2012 )extendthisworktoallowtheelementsofmtobeequaltoasinglenon-zerovalue,zero,ordistinctacrossgroups.Butthisstilldoesnotallowforastructurecontainingsubsetsofgroups.WeadditionallynotethatitisnotclearhowonecouldeasilyadapteitherpenaltytermintoaBayesianprioronthesetcovariancematricesforoursetting.Othertechniqueshavebeenproposedthatmodelthecovariancematrixasafunctionofoneofmorecontinuouscovariates.AnumberofmodelsofthisavorhavebeendevelopedthatarespeciedthroughregressionsontheGARPsand/orIVs( Daniels 2006 ; Pourahmadi 1999 2000 ; Pourahmadi&Daniels 2002 ). Chiuetal. ( 1996 )devisesuchamodelbyregressingontolog(m).OtherregressionframeworksincludemodelsonthePACsandmarginalvariances( Wang&Daniels 2013a ),regressingwithinafactormodel( Fox&Dunson 2011 ),andmodelbasedonam=B+xmx0m0factorization( Hoff&Niu 2012 ).Additionalmethodstreat 21

the covariance matrices as realizations of a stochastic volatility process (Lopes et al. 2011; Philipov & Glickman 2006a, b). However, covariance regression models are often plagued by the difficulty of interpreting the regression parameters.

To form our priors on $\Sigma$ we will make use of the modified Cholesky parameterization because of the unrestrictedness of the parameters, the interpretability for longitudinal data, and the computational advantages via conjugacy (Daniels & Pourahmadi 2002). Our goal is to develop priors for the set of GARPs and IVs in such a way that we borrow strength across the $M$ groups. Additionally, we want to share information across $\Gamma_m$ and $\phi_m$ values ($\Sigma_m = \Sigma(\phi_m, \Gamma_m)$, $m = 1,\ldots,M$), particularly those GARPs of a common lag. Another consideration for prior development is to encourage sparsity of the elements of $T(\phi_m)$, that is, containing few non-zero elements. Because each GARP represents a conditional dependency, setting $\phi_{m;jk}$ to zero establishes a conditional independence relationship between a pair of components of $Y$. It is necessary to consider priors that allow the data to inform the balance between these two goals: pooling across groups and introducing sparsity. Above all, we seek to accomplish this in an automated, stochastic fashion.

We propose two solutions to this problem. In Chapter 3 we develop a nonparametric prior based on the matrix stick-breaking process (Dunson et al. 2008). This prior allows for clustering of the GARPs simultaneously across groups $m$ and $\phi_{ij}$'s of a common lag $j - i$, while allowing some GARPs to be identically zero (implying a conditional independence). The second solution to the simultaneous estimation problem, presented in Chapter 4, considers clustering based on the $t$-th column of $T(\phi_m)$ and $\gamma_{m;t}$, which defines collections of the groups $1,\ldots,M$ that have the same dependence parameters for the distribution of $Y_t$ given $Y_1,\ldots,Y_{t-1}$. The equality relationships in this covariance partition prior are specified directly through a Markov chain on the sequence of partitions of $1,\ldots,M$.


CHAPTER 2
SPARSE PRIOR DISTRIBUTIONS FOR CORRELATION MATRICES THROUGH THE PARTIAL AUTOCORRELATIONS

2.1 Bayesian Correlation Estimation

Determining the structure of an unknown $J \times J$ covariance matrix $\Sigma$ is a long standing statistical challenge. A key difficulty in dealing with the covariance matrix is the positive definiteness constraint. This is because the set of values for a particular element $\sigma_{ij}$ that yield a positive definite $\Sigma$ depends on the choice of the remaining elements of $\Sigma$. Additionally, because the number of parameters in $\Sigma$ is quadratic in the dimension $J$, methods to find a parsimonious (lower-dimensional) structure can be beneficial.

One of the earliest attempts in this direction is the idea of covariance selection (Dempster 1972). By setting some of the off-diagonal elements of the concentration matrix $\Omega = \Sigma^{-1}$ to zero, a more parsimonious choice for the covariance matrix of the random vector $Y$ is achieved. A zero in the $(i,j)$-th position of $\Omega$ implies zero correlation (and further, independence under multivariate normality) between $Y_i$ and $Y_j$, conditional on the remaining components of $Y$. This property, along with its relation to graphical model theory (e.g., Lauritzen 1996), has led to the use of covariance selection as a standard part of analysis in multivariate problems (Rothman et al. 2008; Wong et al. 2003; Yuan & Lin 2007). However, one should be cautious when using such selection methods as not all produce positive definite estimators. For instance, thresholding the sample covariance (concentration) matrix will not generally yield a positive definite estimator, and adjustments are needed (Bickel & Levina 2008).

Model specification for $\Sigma$ may depend on a correlation structure through the so-called separation strategy (Barnard et al. 2000). The separation strategy involves reparameterizing $\Sigma$ by $\Sigma = SRS$, with $S$ a diagonal matrix containing the marginal standard deviations of $Y$ and $R$ the correlation matrix. Let $\mathcal{R}_J$ denote the set of valid correlation matrices, that is, the collection of $J \times J$ positive definite matrices with unit

diagonal. Separation can also be performed on the concentration matrix, $\Omega = TCT$, so that $T$ is diagonal and $C \in \mathcal{R}_J$. The diagonal elements of $T$ give the partial standard deviations, while the elements $c_{ij}$ of $C$ are the (full) partial correlations. The covariance selection problem is equivalent to choosing elements of the partial correlation matrix $C$ to be null. Several authors have constructed priors to estimate $\Sigma$ by allowing $C$ to be a sparse matrix (Carter et al. 2011; Wong et al. 2003).

In many cases the full partial correlation matrix may not be convenient to use. In cases where the covariance matrix is fixed to be a correlation matrix, such as the multivariate probit case, the elements of the concentration matrix $T$ and $C$ are constrained to maintain a unit diagonal for $\Sigma$ (Pitt et al. 2006). Additionally, interpretation of parameters in the partial correlation matrix can be challenging, particularly for longitudinal settings, as the partial correlations are defined conditional on future values. For example, $c_{12}$ gives the correlation between $Y_1$ and $Y_2$ conditional on the future measurements $Y_3,\ldots,Y_J$. An additional issue with Bayesian methods that promote sparsity in $C$ is calculating the volume of the space of correlation matrices with a fixed zero pattern; see Section 2.4.2 for details.

In addition to the role $R$ plays in the separation strategy, in some data models the covariance matrix is constrained to be a correlation matrix for identifiability. This is the case for the multivariate probit model (Chib & Greenberg 1998), Gaussian copula regression (Pitt et al. 2006), and certain latent variable models (e.g. Daniels & Normand 2006), among others. Thus, it is necessary to make use of methods specific for estimating and/or modeling a correlation matrix.

We consider this problem of correlation matrix estimation in a Bayesian context where we are concerned with choices of an appropriate prior distribution $p(R)$ on $\mathcal{R}_J$. Commonly used priors include a uniform prior over $\mathcal{R}_J$ (Barnard et al. 2000) and the Jeffreys prior $p(R) \propto |R|^{-(J+1)/2}$. In these cases the sampling steps for $R$ can sometimes benefit from parameter expansion techniques (Liu 2001; Liu & Daniels

2006; Zhang et al. 2006). Liechty et al. (2004) develop a correlation matrix prior by specifying each element $\rho_{ij}$ of $R$ as an independent normal subject to $R \in \mathcal{R}_J$. Pitt et al. (2006) extend the covariance selection prior (Wong et al. 2003) to the correlation matrix case by fixing the elements of $T$ to be constrained by $C$ so that $T$ is the diagonal matrix such that $R = (TCT)^{-1}$ has unit diagonal.

The difficulty of jointly dealing with the positive definite and unit diagonal constraints of a correlation matrix has led some researchers to consider priors for $R$ based on the partial autocorrelations (PACs) in settings where the data are ordered. PACs suggest a practical alternative by avoiding the complication of the positive definite constraint, while providing easily interpretable parameters (Joe 2006). Kurowicka & Cooke (2003, 2006) frame the PAC idea in terms of a vine graphical model. Daniels & Pourahmadi (2009) construct a flexible prior on $R$ through independent shifted Beta priors on the PACs. Wang & Daniels (2013a) construct underlying regressions for the PACs, as well as a triangular prior which shifts the prior weight to a more intuitive choice in the case of longitudinal data.

Instead of setting partial correlations from $C$ to zero to incorporate sparsity, our goal is to encourage parsimony through the PACs. As the PACs are unconstrained, selection does not lead to the computational issues associated with finding the normalizing constant for a sparse $C$. We introduce and compare priors for both selection and shrinkage of the PACs that extend previous work on sensible default choices (Daniels & Pourahmadi 2009).

The layout of this chapter is as follows. In the next section we will review the relevant details of the partial autocorrelation parameterization. Section 2.3 proposes a prior for $R$ induced by shrinkage priors on the PACs. Section 2.4 introduces the selection prior for the PACs. Simulation results showing the performance of the priors appear in Section 2.5. In Section 2.6 the proposed PAC priors are applied to a data set from a smoking cessation clinical trial. Section 2.7 concludes the chapter with a brief discussion.
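The separation strategy $\Sigma = SRS$ described in this section can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `separate` and the example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def separate(sigma):
    """Split a covariance matrix into (S, R) such that sigma = S @ R @ S.

    S is diagonal with the marginal standard deviations; R is the
    correlation matrix (positive definite with unit diagonal).
    """
    sd = np.sqrt(np.diag(sigma))      # marginal standard deviations
    S = np.diag(sd)
    R = sigma / np.outer(sd, sd)      # divide entry (i, j) by sd_i * sd_j
    return S, R

# Hypothetical 3 x 3 covariance matrix used only for illustration.
sigma = np.array([[4.0, 1.2, 0.4],
                  [1.2, 1.0, 0.3],
                  [0.4, 0.3, 0.25]])
S, R = separate(sigma)
print(np.round(R, 3))
```

Placing independent priors on the two factors returned here is what allows a prior on $R$ alone to be studied in the remainder of the chapter.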

2.2 Partial Autocorrelations

For a general random vector $Y = (Y_1,\ldots,Y_J)'$ the partial autocorrelation between $Y_i$ and $Y_j$ (i

for $j - i > 1$. As the relationship between $R$ and $\Pi$ is one-to-one, the Jacobian for the transformation from $R$ to $\Pi$ can be computed easily. The determinant of the Jacobian is given by |J(Π)| = ∏_{i

which has a contribution from $\pi_{ij}$ of $(1 - \pi_{ij}^2)^{[J-1-(j-i)]/2}$. Note that $p_{fR}(\Pi)$ is the product of independent SBeta($\alpha_{ij}, \beta_{ij}$) distributions for each $\pi_{ij}$, where $\alpha_{ij} = \beta_{ij} = 1 + [J - 1 - (j - i)]/2$. This provides an unconstrained representation of the flat-$R$ prior.

In longitudinal/ordered data contexts, we expect the PACs to be negligible for elements that have large lags. We exploit this concept via two types of priors. First, we introduce priors that shrink PACs toward zero with the aggressiveness of the shrinkage depending on the lag. Next, we propose, in the spirit of Wong et al. (2003), a selection prior that will stochastically choose PACs to be set to zero.

2.3 Partial Autocorrelation Shrinkage Priors

2.3.1 Specification of the Shrinkage Prior

Using the PAC framework, we form priors that will shrink the PAC $\pi_{ij}$ toward zero. It has long been known that shrinkage estimators can produce greatly improved estimation (James & Stein 1961). As previously noted, $\pi_{ij} = 0$ implies that $Y_i$ and $Y_j$ are uncorrelated given the intervening variables $(Y_{i+1},\ldots,Y_{j-1})$. In the case where $Y$ has a multivariate normal distribution, this implies independence between $Y_i$ and $Y_j$, given $(Y_{i+1},\ldots,Y_{j-1})$. We anticipate that variables farther apart in time (and conditional on more intermediate variables) are more likely to be uncorrelated, so we will more aggressively shrink $\pi_{ij}$ for larger values of the lag $j - i$.

We let each $\pi_{ij} \sim$ SBeta($\alpha_{ij}, \beta_{ij}$) independently. As we wish to shrink toward zero, we want $E\{\pi_{ij}\} = 0$, so we fix $\alpha_{ij} = \beta_{ij}$. It is easily shown that $\mathrm{Var}\{\pi_{ij}\} = 4\alpha_{ij}\beta_{ij} / [(\alpha_{ij} + \beta_{ij})^2 (\alpha_{ij} + \beta_{ij} + 1)]$, which we denote by $\xi_{ij}$. We recover the SBeta shape parameters by $\alpha_{ij} = \beta_{ij} = (\xi_{ij}^{-1} - 1)/2$. Hence, the distribution of $\pi_{ij}$ is determined by its variance $\xi_{ij}$. Rather than specifying these $J(J-1)/2$ different variances, we parameterize them through

$$\mathrm{Var}\{\pi_{ij}\} = \xi_{ij} = \epsilon_0 |j - i|^{-\gamma}, \qquad (2)$$

where $\epsilon_0 \in (0,1)$ and $\gamma > 0$. Clearly, $\xi_{ij}$ is decreasing in lag so that higher lag terms will generally be closer to zero. We let the positive $\gamma$ parameter determine the rate that $\xi_{ij}$ decreases in lag. To fully specify the Bayesian set-up, we must introduce prior distributions on the two parameters, $\epsilon_0$ and $\gamma$. To specify these hyperpriors, we use a uniform (or possibly a more general beta) for $\epsilon_0$ and a gamma distribution for $\gamma$. We require $\gamma > 0$, so $\xi_{ij} = \epsilon_0 |j - i|^{-\gamma}$ remains a decreasing function of lag. In the simulations and data analysis of Sections 2.5 and 2.6, we use $\gamma \sim$ Gamma(5, 5), so that $\gamma$ has a prior mean of 1 and prior variance of 1/5. We use a moderately informative prior to keep $\gamma$ from dominating the role of $\epsilon_0$ in $\xi_{ij} = \epsilon_0 |j - i|^{-\gamma}$. A large value of $\gamma$ will force all $\xi_{ij}$ of lag greater than one to be approximately zero, regardless of the value of $\epsilon_0$.

2.3.2 Sampling under the Shrinkage Prior

The utility of our prior depends on our ability to incorporate it into a Markov chain Monte Carlo (MCMC) scheme. For simplicity we assume that the data consist of $Y_1,\ldots,Y_N$, where each $Y_i$ is a $J$-dimensional normal vector with mean zero and covariance $R$, which is a correlation matrix so as to mimic the computations for the multivariate probit case. Let $L(\Pi \mid Y)$ denote the likelihood function for the data, parameterized by the PACs, $\Pi$.

The MCMC chain we propose involves sequentially updating each of the $J(J-1)/2$ PACs, followed by updating the hyperparameters determining the variance of the SBeta distributions. To sample a particular $\pi_{ij}$, we must draw the new value from the distribution proportional to $L(\pi_{ij}, \pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$, where $p_{ij}(\pi_{ij})$ is the SBeta($\alpha_{ij}, \beta_{ij}$) density and $\pi_{(-ij)}$ represents the set of PACs except $\pi_{ij}$. Due to the subtle role of $\pi_{ij}$ in the likelihood piece, there is no simple conjugate sampling step. In order to sample from $L(\pi_{ij}, \pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$, we introduce an auxiliary variable $U_{ij}$ (Damien et al. 1999; Neal

2003), and note that we can rewrite the conditional distribution as $L(\pi_{ij}, \pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$ = ∫_0^∞ I{u_ij

(2013a) by $\alpha = 2$, $\beta = 1$; alternatively, independent hyperpriors for $\alpha$, $\beta$ could be specified. The value of $\epsilon_{ij}$ gives the probability that $\pi_{ij}$ will be non-zero, i.e. will be drawn from the continuous component in the mixture distribution. Hence, the probability that $Y_i$ and $Y_j$ are uncorrelated, given the interceding variables, is $1 - \epsilon_{ij}$. As the values of the $\epsilon$'s decrease, the selection prior places more weight on the point-mass 0 component of the distribution (2), yielding more sparse choices for $\Pi$.

As with our parameterization of the variance $\xi_{ij}$ in Section 2.3.1, we make a structural choice of the form of $\epsilon_{ij}$ so that this probability depends on the lag value. We let

$$\epsilon_{ij} = \epsilon_0 |j - i|^{-\gamma}, \qquad (2)$$

similar to our choice of $\xi_{ij}$ in the shrinkage prior. This choice (2) specifies the continuous component probability to be a polynomial function of the lag. Because $\epsilon_{ij}$ is decreasing as the lag $j - i$ increases, $\mathrm{pr}(\pi_{ij} = 0)$ increases. Conceptually, this means that we anticipate that variables farther apart in time (and conditional on more intermediate variables) are more likely to be uncorrelated. As with the shrinkage prior, we choose hyperpriors of $\epsilon_0 \sim$ Unif(0, 1) and $\gamma \sim$ Gamma(5, 5).

2.4.2 Normalizing Constant for Priors on R

One of the key improvements of our selection prior over other sparse priors for $R$ is the simplicity of the normalizing constant, as mentioned in the introduction. Previous covariance priors with a sparse $C$ (Carter et al. 2011; Pitt et al. 2006; Wong et al. 2003) place a flat prior on the non-zero components $c_{ij}$ for a given pattern of zeros. However, the needed normalizing constant requires finding the volume of the subspace of $\mathcal{R}_J$ corresponding to the pattern of zeros in $C$. This turns out to be a quite difficult task and provides much of the challenge in the work of the three previously cited papers.
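As a brief illustration of the lag-dependent selection probabilities of Section 2.4.1, the sketch below draws a PAC matrix from a selection prior of this type: each $\pi_{ij}$ is non-zero with probability $\epsilon_{ij} = \epsilon_0 |j - i|^{-\gamma}$ and, when non-zero, is drawn from an SBeta($\alpha, \beta$) distribution on $(-1, 1)$. This is a sketch under stated assumptions rather than the sampler used in this chapter: it assumes NumPy, takes SBeta to be a Beta variate rescaled to $(-1, 1)$, reads Gamma(5, 5) as shape 5 and rate 5 (matching the stated prior mean 1 and variance 1/5), and uses hypothetical function names.

```python
import numpy as np

def selection_probs(J, eps0, gamma):
    """Return a J x J array with eps_ij = eps0 * |j - i|**(-gamma) above the diagonal."""
    idx = np.arange(J)
    lag = np.abs(np.subtract.outer(idx, idx))
    eps = np.zeros((J, J))
    upper = np.triu_indices(J, k=1)
    eps[upper] = eps0 * lag[upper] ** (-float(gamma))
    return eps

def draw_pacs(J, eps0, gamma, alpha=2.0, beta=1.0, rng=None):
    """Draw PACs from the point-mass / SBeta mixture (prior draw, no data involved)."""
    rng = np.random.default_rng() if rng is None else rng
    eps = selection_probs(J, eps0, gamma)
    upper = np.triu_indices(J, k=1)
    n = len(upper[0])
    nonzero = rng.random(n) < eps[upper]                  # continuous component chosen?
    values = 2.0 * rng.beta(alpha, beta, size=n) - 1.0    # SBeta: Beta rescaled to (-1, 1)
    pacs = np.zeros((J, J))
    pacs[upper] = np.where(nonzero, values, 0.0)
    return pacs

rng = np.random.default_rng(1)
eps0 = rng.uniform(0.0, 1.0)                  # eps0 ~ Unif(0, 1)
gamma = rng.gamma(shape=5.0, scale=1.0 / 5.0) # gamma ~ Gamma(5, rate 5)
print(np.round(draw_pacs(6, eps0, gamma, rng=rng), 2))
```

Because every PAC ranges freely over $(-1, 1)$, any zero pattern produced this way corresponds to a valid correlation matrix, which is the property the normalizing-constant discussion relies on.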

We are able to avoid this issue by specifying our selection prior in terms of the unrestricted PAC parameterization. As the value of any of the $\pi_{ij}$'s does not affect the support of the remaining PACs, the volume of $[-1,1]^{J(J-1)/2}$ corresponding to any configuration of $\Pi$ with $J_0$ ($\le J(J-1)/2$) non-zero elements is $2^{J_0}$, the volume of a $J_0$-dimensional hypercube. Because this constant does not depend on which elements are non-zero, we need not explicitly deal with it in the MCMC algorithm to be introduced in the next subsection. Further, we are able to exploit structure in the order of the PACs in selection (i.e. higher lag terms are more likely to be null), whereas in Pitt et al. (2006), the probability that $c_{ij}$ is zero is chosen to minimize the effort required to find the normalizing constant.

An additional benefit of performing selection on the partial autocorrelations as opposed to the partial correlations $C$ is that the zero patterns hold under marginalizations of the beginning and/or ending time points. For instance, if we marginalize out the $J$th time point, the corresponding matrix of PACs is the original $\Pi$ after removing the last row and column. However, any zero elements in $C$ will not be preserved because $\mathrm{corr}(Y_1, Y_2 \mid Y_3,\ldots,Y_J) = 0$ does not generally imply that $\mathrm{corr}(Y_1, Y_2 \mid Y_3,\ldots,Y_{J-1}) = 0$.

2.4.3 Sampling under the Selection Prior

Sampling with the selection prior proceeds similarly to the shrinkage prior scheme, with the main difference being the introduction of the point mass in (2). As before we sequentially update each of the PACs, by drawing the new value from the distribution proportional to $L(\pi_{ij}, \pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$, where $p_{ij}(\pi_{ij})$ gives the density corresponding to the prior distribution in (2) (with respect to the appropriate mixture dominating measure). We cannot use the slice sampling step according to (2) but must write the distribution as $L(\pi_{ij}, \pi_{(-ij)} \mid Y)\, p_{ij}(\pi_{ij})$ = ∫_0^∞ I{u_ij

For the selection prior, we sample $U_{ij}$ uniformly over the interval from zero to $L(\pi_{ij}, \pi_{(-ij)} \mid Y)$, using the current value of $\pi_{ij}$, and then draw $\pi_{ij}$ from $p_{ij}(\cdot)$, restricted to the slice set P = {π : u_ij

The sampling distributions of $\epsilon_0$ and $\gamma$ depend on $\Pi$ only through the set of indicator variables $\zeta_{ij}$. As with the variance parameters of the shrinkage priors, we incorporate a pair of slice sampling steps to update the hyperparameters.

2.5 Simulations

To better understand the behavior of our proposed priors, we conducted a simulation study to assess the (frequentist) risk of their posterior estimators. We consider four choices, A through D, for the true covariance matrix in the case of six-dimensional ($J = 6$) data. $R_A$ will have an autoregressive (AR) structure with $\rho^A_{ij} = 0.7^{|j-i|}$. The corresponding $\Pi_A$ has values of 0.7 for the lag-1 terms and zero for the others, a sparse parameterization. For the second correlation matrix $R_B$ we choose the identity matrix so that all of the PACs are zero in this case. The $\Pi_C$ has a structure that decays to zero. For the lag-1 terms $\pi^C_{i,i+1} = 0.7$, and for the remaining terms, $\pi^C_{ij} = 0.4^{j-i-1}$, $j - i > 1$. Neither $\Pi_C$ nor $R_C$ has zero elements, but $\pi^C_{ij}$ decreases quickly in lag $j - i$. Finally, we consider a correlation matrix that comes from a sparse $\Pi_D$,

$$\Pi_D = \begin{pmatrix}
1 & .9 & .3 & 0 & 0 & 0 \\
0.90 & 1 & .8 & .4 & .1 & 0 \\
0.80 & 0.80 & 1 & .6 & .2 & 0 \\
0.62 & 0.67 & 0.60 & 1 & .8 & .3 \\
0.58 & 0.63 & 0.58 & 0.80 & 1 & .7 \\
0.46 & 0.50 & 0.45 & 0.69 & 0.70 & 1
\end{pmatrix},$$

where the upper-triangular elements correspond to $\Pi_D$ and the lower-triangular elements depict the marginal correlations from $R_D$. Note that while $\Pi_D$ is somewhat sparse, $R_D$ has only non-zero elements.

For each of these four choices of the true dependence structure and for sample sizes of $N = 20$, 50, and 200, we simulate 50 datasets. For each dataset a posterior sample for $\Pi$ (and hence, $R$) is obtained by running an MCMC chain for 5000 iterations, after a burn-in of 1000. We use every tenth iteration for inference, giving a sample of 500 values for each dataset. We consider the performance of both the selection and shrinkage priors on $\Pi$. For the selection prior, we perform analyses with SBeta(1, 1)

(i.e., Unif($-1$, 1)) and SBeta(2, 1) (triangular prior) for the continuous component of the mixture distributions (2). In both the selection and shrinkage priors, the hyperpriors are $\epsilon_0 \sim$ Unif(0, 1) and $\gamma \sim$ Gamma(5, 5). The estimators from the shrinkage and selection priors are compared with the estimators resulting from the flat-$R$, flat-PAC, and triangular priors. Finally, we consider a naive shrinkage prior where $\gamma$ is fixed at zero in (2). Here, all PACs are equally shrunk with variance $\xi_{ij} = \epsilon_0$ independent of lag.

We consider two loss functions in comparing the performance of the six prior choices: $L_1(\hat{R}, R) = \mathrm{tr}(\hat{R}R^{-1}) - \log|\hat{R}R^{-1}| - p$ and L2(Π̂, Π) = ∑_{i

Figure 2-1. Boxplots of the observed loss using $L_1(\hat{R}_1, R)$ for the $J = 6$ cases. The prior distributions compared are (1) shrinkage, (2) selection (2,1), (3) selection (1,1), (4) flat-$R$, (5) flat-$\Pi$, (6) triangular, and (7) naive shrinkage.
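For readers who want to reproduce this kind of comparison, the first loss function, $L_1(\hat{R}, R) = \mathrm{tr}(\hat{R}R^{-1}) - \log|\hat{R}R^{-1}| - p$, is straightforward to evaluate. The sketch below assumes NumPy; the function name `loss_l1` and the example matrices are illustrative assumptions, not the code used to produce the figure.

```python
import numpy as np

def loss_l1(R_hat, R):
    """Evaluate L1(R_hat, R) = tr(R_hat R^{-1}) - log|R_hat R^{-1}| - p."""
    p = R.shape[0]
    A = R_hat @ np.linalg.inv(R)
    _, logdet = np.linalg.slogdet(A)   # log of |det(A)|, numerically stable
    return np.trace(A) - logdet - p

# Illustration with an AR(1)-type truth (rho = 0.7) and the identity as a crude estimate.
idx = np.arange(6)
R_true = 0.7 ** np.abs(np.subtract.outer(idx, idx))
print(loss_l1(R_true, R_true))     # zero up to rounding error when the estimate is exact
print(loss_l1(np.eye(6), R_true))  # positive for a misspecified estimate
```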

Table 2-1. Risk estimates for simulation study with dimension $J = 6$. Correlation matrices: A autoregressive structure; B independence; C non-zero decaying; D sparse. Loss functions: $L_1(\hat{R}, R) = \mathrm{tr}(\hat{R}R^{-1}) - \log|\hat{R}R^{-1}| - p$; L2(Π̂, Π) = ∑_{i

Table 2-2. $10 \times 10$ PAC matrix $\Pi_{D'}$ shown above the diagonal and its respective correlation matrix $R_{D'}$ shown below the diagonal.

$$\Pi_{D'} = \begin{pmatrix}
1 & .9 & .3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.90 & 1 & .8 & .4 & .1 & 0 & 0 & 0 & 0 & 0 \\
0.80 & 0.80 & 1 & .6 & .2 & 0 & 0 & 0 & 0 & 0 \\
0.62 & 0.67 & 0.60 & 1 & .8 & .3 & 0 & 0 & 0 & 0 \\
0.58 & 0.63 & 0.58 & 0.80 & 1 & .7 & 0 & 0 & 0 & 0 \\
0.46 & 0.50 & 0.45 & 0.69 & 0.70 & 1 & .8 & .4 & .1 & 0 \\
0.37 & 0.40 & 0.36 & 0.55 & 0.56 & 0.80 & 1 & .6 & .2 & 0 \\
0.31 & 0.34 & 0.30 & 0.46 & 0.47 & 0.67 & 0.60 & 1 & .8 & .3 \\
0.29 & 0.32 & 0.29 & 0.43 & 0.44 & 0.63 & 0.58 & 0.80 & 1 & .7 \\
0.23 & 0.25 & 0.23 & 0.34 & 0.35 & 0.50 & 0.45 & 0.69 & 0.70 & 1
\end{pmatrix}$$

estimated risk for flat-$R$ is visibly worse than the others. Recall that $\pi^C_{ij}$ is decreasing in lag but is not equal to zero. In fact, the smallest element is $\pi^C_{16} = (0.4)^4 = 0.0256$, which may not be close enough to zero to be effectively zeroed out, explaining why the selection priors are less effective for C than in the other scenarios.

When we consider estimating the sparse correlation matrix D, the shrinkage and selection priors outperform the four other priors. From Table 2-1 we see that for loss function 1 and the $N = 20$ sample size the estimated risk decreases by 45 (25), 45 (24), and 39 (16) percent for the estimates from the shrinkage, selection (2,1), and selection (1,1) priors over the flat-$R$ (flat-$\Pi$) priors. This is quite a substantial drop for the small sample size. For the other sample sizes we still observe a clear decrease over the flat priors. For $N = 50$ there is a drop of 32 (20), 26 (14), and 22 (9) percent for the sparse priors over the flat priors, and with $N = 200$ a decrease of 13 (9), 10 (7), and 7 (4) percent.

To investigate how our priors behave as $J$ increases, we repeat the analysis using the non-sparse decaying $R_C$ and a sparse $R_{D'}$ with the dimension of the matrix increased to $J = 10$. Again, $\pi^C_{i,i+1} = 0.7$ for the lag-1 terms and $\pi^C_{ij} = 0.4^{j-i-1}$ for all $j - i > 1$, and we expand the previous $R_D$ to the $10 \times 10$ $R_{D'}$ shown in Table 2-2. As before the above diagonal elements are from $\Pi_{D'}$ and the below diagonal elements

Table 2-3. Risk estimates for simulation study with dimension $J = 10$. Correlation matrices: C non-zero decaying; D' sparse. Loss functions: $L_1(\hat{R}, R) = \mathrm{tr}(\hat{R}R^{-1}) - \log|\hat{R}R^{-1}| - p$; L2(Π̂, Π) = ∑_{i

missingness due to study dropout. As in previous analyses of these data (Daniels & Hogan 2008), we assume this missingness is ignorable. For patient $i = 1,\ldots,N$ ($N = 281$), we denote the vector of quit statuses by $Q_i = (Q_{i1},\ldots,Q_{iJ})'$. We only consider the responses after patients are asked to quit, weeks 5 through 12 ($J = 8$). Here $Q_{it} = 1$ indicates a success (not smoking) for patient $i$ at time $t$ ($1 \le t \le J$, corresponding to week $t + 4$), $Q_{it} = -1$ a failure (smoking during the week), and $Q_{it} = 0$ a missing observation.

Following the usual conventions of the multivariate probit regression model (Chib & Greenberg 1998), we let $Y_i$ be the $J$-dimensional vector of latent variables corresponding to $Q_i$. Thus, $Q_{it} = 1$ implies that $Y_{it} \ge 0$, and $Q_{it} = -1$ gives $Y_{it} < 0$. When $Q_{it} = 0$, the sign of $Y_{it}$ represents the (unobserved) quit status for the week. We assume the latent variables follow a multivariate normal distribution $Y_i \sim N_J(\mu_i, R)$ for $i = 1,\ldots,N$, where $\mu_i = X_i\beta$, $X_i$ is a $J \times q$ matrix of covariates, and $\beta$ is a $q$-vector of regression coefficients. As the scale of $Y$ is unidentified, the covariance matrix of $Y$ is constrained to be a correlation matrix $R$. We consider two choices of $X_i$: `time-varying', which specifies a different $\mu_{it}$ for each time within each treatment group ($q = 2J$), and `time-constant', which gives the same value of $\mu_{it}$ across all times within treatment group ($q = 2$).

With the time-constant and time-varying choices of the mean structure, we consider the following priors for $R$: shrinkage, selection, flat-$R$, flat-$\Pi$, triangular, naive shrinkage, and an autoregressive (AR) prior. The AR prior assumes an AR(1) structure for $R$, that is, $\rho_{ij} = \rho^{|j-i|}$ and $\pi_{i,i+1} = \rho$ and $\pi_{ij} = 0$ if $|j - i| > 1$. We assume a Unif($-1$, 1) distribution for $\rho$. As in the risk simulation, we consider the selection prior with both SBeta(1, 1) and with SBeta(2, 1) for the continuous component. The remaining prior distributions to be specified are $\epsilon_0 \sim$ Unif(0, 1), $\gamma \sim$ Gamma(5, 5), and the prior on the regression coefficients $\beta$ is flat.
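The statement that an AR(1) correlation matrix has PACs equal to $\rho$ at lag one and zero at higher lags can be checked numerically. The sketch below assumes NumPy and uses the standard fact that the partial correlation between the first and last variable of a block, given the variables in between, equals $-\omega_{1k}/\sqrt{\omega_{11}\omega_{kk}}$, where $\omega$ denotes the inverse of that block of $R$. The function names are hypothetical, and this brute-force computation is not the recursion used in this chapter.

```python
import numpy as np

def ar1_correlation(J, rho):
    """AR(1) correlation matrix with entries rho**|j - i|."""
    idx = np.arange(J)
    return rho ** np.abs(np.subtract.outer(idx, idx))

def partial_autocorrelations(R):
    """PAC pi_ij: correlation of Y_i and Y_j given the intervening Y_{i+1},...,Y_{j-1}."""
    J = R.shape[0]
    pac = np.zeros((J, J))
    for i in range(J):
        for j in range(i + 1, J):
            block = R[i:j + 1, i:j + 1]   # variables i, i+1, ..., j
            prec = np.linalg.inv(block)
            pac[i, j] = -prec[0, -1] / np.sqrt(prec[0, 0] * prec[-1, -1])
    return pac

R_ar = ar1_correlation(8, 0.7)
pac_ar = partial_autocorrelations(R_ar)
print(np.round(pac_ar, 3))
```

The printed matrix should show 0.7 on the first superdiagonal and values that are zero up to rounding error elsewhere above the diagonal.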

To analyze the data we run an MCMC chain for 12,000 iterations after a burn-in of 3000, retaining every tenth observation. Convergence was assessed through graphical diagnostics and deemed adequate. There are three sets of parameters to sample in the MCMC chain: the regression coefficients, the correlation matrix, and the latent variables. The conditional for $\beta$ given $Y$ and $R$ is multivariate normal. Sampling the correlation matrix evolves as discussed in Sections 2.3.2 and 2.4.3 using the residuals $Y_i - \mu_i$. The latent variables $Y_i$, which are constrained by $Q_i$, are sampled according to the strategy of Liu et al. (2009, Proposition 1).

To compare the specification based on our prior choices, we make use of the deviance information criterion (DIC; Spiegelhalter et al. 2002). The DIC statistic can be viewed similarly to the Bayesian or Akaike information criterion, but DIC does not require the user to count the number of model parameters. This is key for Bayesian models that utilize shrinkage and/or sparsity priors as it is not clear whether or how one should count a parameter that has been set to or shrunk toward zero. To that end, let

$$\mathrm{Dev} = -2\,\mathrm{loglik}(\hat\beta, \hat{R} \mid Q) = \sum_i -2\,\mathrm{loglik}(\hat\beta, \hat{R} \mid Q_i) \qquad (2)$$

be the deviance or twice the negative log-likelihood with the parameters $\hat\beta$ and $\hat{R}$. Here $\hat\beta$ is the posterior mean, and for the correlation estimate $\hat{R}$, we use the first of the estimators we considered in Section 2.5, $\hat{R} = S\,E\{R^{-1}\}^{-1} S$ with $S = [\mathrm{diag}(E\{R^{-1}\})]^{1/2}$. The complexity of the model is measured by the term $p_D$, sometimes called the effective number of parameters. This $p_D$ is calculated as

$$p_D = E\{-2\,\mathrm{loglik}(\beta, R \mid Q)\} - \mathrm{Dev}, \qquad (2)$$

where the expectation is over the posterior distribution of the parameters $(\beta, R)$. The DIC model comparison statistic is $\mathrm{DIC} = \mathrm{Dev} + 2p_D$, the sum of terms measuring model fit and complexity. Smaller values of DIC are preferred.
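The three quantities just defined combine by simple arithmetic once log-likelihood values are available. The sketch below assumes NumPy and that the observed-data log-likelihood has already been evaluated at each retained posterior draw and at the plug-in estimate $(\hat\beta, \hat{R})$; the function name `dic_summary` and the numbers in the example are hypothetical and are not taken from the analysis.

```python
import numpy as np

def dic_summary(loglik_draws, loglik_at_estimate):
    """Return (Dev, pD, DIC) from posterior log-likelihood values.

    loglik_draws       : log-likelihood evaluated at each posterior draw
    loglik_at_estimate : log-likelihood at the plug-in parameter estimate
    """
    dev = -2.0 * loglik_at_estimate                       # Dev: fit at the estimate
    p_d = np.mean(-2.0 * np.asarray(loglik_draws)) - dev  # pD: posterior mean deviance minus Dev
    dic = dev + 2.0 * p_d                                 # DIC = Dev + 2 pD
    return dev, p_d, dic

# Hypothetical log-likelihood values, for illustration only.
dev, p_d, dic = dic_summary([-520.0, -523.0, -524.5], -515.5)
print(dev, p_d, dic)
```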

Table 2-4. Model comparison statistics for the CTQ data.

Mean Structure   Correlation Prior   Dev    pD   DIC
Time-constant    Shrinkage           1031   14   1060
Time-constant    Selection(2,1)      1042   12   1066
Time-constant    Selection(1,1)      1044   12   1068
Time-constant    Triangular          1029   20   1068
Time-constant    flat-Π              1029   20   1069
Time-constant    Naive shrinkage     1033   20   1074
Time-constant    AR                  1071    3   1078
Time-constant    flat-R              1043   21   1086
Time-varying     Shrinkage           1022   25   1071
Time-varying     Triangular          1017   30   1077
Time-varying     Selection(2,1)      1033   22   1077
Time-varying     Selection(1,1)      1036   22   1080
Time-varying     flat-Π              1019   30   1080
Time-varying     Naive shrinkage     1023   31   1085
Time-varying     AR                  1068   13   1093
Time-varying     flat-R              1034   31   1097

As Wang & Daniels (2011) point out, DIC should be calculated using the observed data, which in this case are the quit status responses $Q_i$, not the latent variables $Y_i$. Hence the log-likelihood for $Q_i$ at parameters $(\beta, R)$ is equal to

$$\mathrm{loglik}(\beta, R \mid Q_i) = \log \int_{(-\infty,\infty)^J} I\{Q_{it}\, y_t \ge 0\ \forall t\}\, \phi(y \mid X_i\beta, R)\, dy, \qquad (2)$$

where $\phi(\cdot \mid \mu, \Sigma)$ is the $J$-dimensional multivariate normal density with mean $\mu$ and covariance matrix $\Sigma$. The integral in (2) is not tractable but can be estimated using importance sampling (Robert & Casella 2004, Section 3.3). See Appendix A for details about estimating the DIC.

The model fit (Dev), complexity ($p_D$), and comparison (DIC) statistics are in Table 2-4; DIC statistics were estimated with a standard error of approximately 0.5. We see that the models that use a mean structure that depends only on treatment and not time $t$ tend to have lower DIC values. The time-varying models are penalized in the $p_D$ term for having to estimate the additional 14 regression coefficients. Of the

correlation priors the flat-$R$ and AR priors perform much worse than the shrinkage, selection, triangular, and flat-PAC priors with the same mean structure. Additionally, the selection prior that uses the triangular form SBeta($\alpha = 2$, $\beta = 1$) tends to have a smaller DIC than the SBeta(1, 1) prior. From Table 2-4 we determine that the prior choice that best balances model fit with parsimony is clearly the model with time-constant mean structure and the shrinkage prior on the correlation matrix.

Using this best fitting model, the posterior mean of $\beta$ is $(-0.504, -0.295)$, implying that the marginal probability (95% credible interval) of not smoking during a given study week is $\Phi(-0.504) = 0.307$ (0.24, 0.37) for the control group and $\Phi(-0.295) = 0.384$ (0.32, 0.45) for the exercise group, where $\Phi(\cdot)$ is the distribution function of the standard normal distribution. The test of the hypothesis that the control treatment is as effective as the exercise treatment (i.e., $H_0: \beta_1 \ge \beta_2$) has a posterior probability of 0.06, providing some evidence for the claim that exercise improves cessation results.

We now examine in more detail the effect the shrinkage prior has on modeling the correlation matrix. The posterior means (credible intervals) of the shrinkage parameters are $\hat\epsilon_0 = 0.406$ (0.25, 0.60) and $\hat\gamma = 2.44$ (1.6, 3.4). With a value of $\gamma$ greater than 1, the variance of $\pi_{ij}$ is decaying to zero fairly rapidly. The posterior mean of $\Pi$ is

$$\hat\Pi = \begin{pmatrix}
1.00 & 0.70 & 0.12 & 0.02 & 0.05 & 0.00 & 0.00 & -0.01 \\
0.71 & 1.00 & 0.83 & 0.16 & 0.09 & 0.02 & 0.01 & 0.00 \\
0.64 & 0.84 & 1.00 & 0.81 & 0.12 & 0.10 & 0.06 & 0.02 \\
0.56 & 0.74 & 0.82 & 1.00 & 0.78 & 0.24 & 0.09 & 0.03 \\
0.51 & 0.64 & 0.69 & 0.79 & 1.00 & 0.81 & 0.37 & 0.04 \\
0.48 & 0.61 & 0.66 & 0.74 & 0.83 & 1.00 & 0.88 & 0.21 \\
0.48 & 0.61 & 0.67 & 0.74 & 0.83 & 0.89 & 1.00 & 0.78 \\
0.40 & 0.52 & 0.57 & 0.63 & 0.70 & 0.77 & 0.80 & 1.00
\end{pmatrix},$$

with the lower diagonal values giving the elements of $\hat{R}$. We see that the PACs are far from zero in only the first two lags and the remaining $\pi$'s are close to zero. This is because these partial autocorrelations have been shrunk almost to zero in most iterations.
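The reported numbers can be checked with a few lines of arithmetic. The sketch below uses only the Python standard library; it evaluates $\Phi$ at the two posterior means and then, purely as an illustration, plugs the posterior means $\hat\epsilon_0 = 0.406$ and $\hat\gamma = 2.44$ into $\xi_{ij} = \epsilon_0 |j - i|^{-\gamma}$ and the SBeta shape formula $\alpha_{ij} = \beta_{ij} = (\xi_{ij}^{-1} - 1)/2$ from Section 2.3.1. The function names are hypothetical, and plugging in posterior means is an assumption made for illustration, not a summary computed in the analysis.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function, written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def shrinkage_variance(lag, eps0, gamma):
    """Prior variance xi_ij = eps0 * lag**(-gamma) of a PAC at the given lag."""
    return eps0 * lag ** (-gamma)

def sbeta_shape(xi):
    """Common SBeta shape alpha = beta = (1/xi - 1)/2 giving variance xi."""
    return (1.0 / xi - 1.0) / 2.0

print(round(std_normal_cdf(-0.504), 3))   # control group
print(round(std_normal_cdf(-0.295), 3))   # exercise group

for lag in range(1, 8):
    xi = shrinkage_variance(lag, 0.406, 2.44)
    print(lag, round(xi, 4), round(sbeta_shape(xi), 2))
```

The rapidly shrinking variances at lags three and above are consistent with the near-zero higher-lag entries in the posterior mean of $\Pi$.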

2.7 Discussion

In this chapter we have introduced two new priors for correlation matrices, a shrinkage prior and a selection prior. These priors choose a sparse parameterization of the correlation matrix through the set of PACs. In the selection context, by stochastically selecting the elements of $\Pi$ to zero out, our model finds interpretable independence relationships for normal data and avoids the need for complex model selection of the dependence structure. A key improvement of the selection prior over existing methods for sparse correlation matrices is that our approach avoids the complex normalizing constants seen in previous work. Additionally, in settings with time-ordered data, the partial autocorrelations are more interpretable than the full partial correlations, as they do not involve conditioning on future values.

While the examples we have considered here involve situations where the covariance matrix was constrained (as in the data example) or known (as in the simulations) to be a correlation matrix, the extension to arbitrary $\Sigma$ is simple. Returning to the separation strategy $\Sigma = SRS$ (Barnard et al. 2000), a prior for $\Sigma$ can be formed by placing independent priors on $S$ and $R$, i.e. $p(\Sigma) = p(R)\,p(S)$. Using one of the proposed priors for $p(R)$, sensible choices of $p(S)$ include an independent inverse gamma for each of the $\sigma_{jj}$ or a flat prior on $\{S = \mathrm{diag}(\sigma_{11},\ldots,\sigma_{JJ}) : \sigma_{jj} > 0\}$. This leads to a prior on $\Sigma$ with sparse PACs.

The simulations and data we have considered here deal with $Y$ of low or moderate dimension. We provide a few comments regarding the scalability of our approach for data with larger $J$. As we believe that PACs of larger lag play a progressively smaller role in describing the (temporal) dependence, it may be reasonable to specify a maximum allowable lag for non-zero PACs. That is, we choose some $k$ such that $\pi_{ij} = 0$ for all $j - i > k$ and sample $\pi_{ij}$ ($j - i \le k$) from either our shrinkage or selection prior. Banding the $\Pi$ matrix is related to the idea of banding the covariance matrix (Bickel & Levina 2008), concentration matrix (Rothman et al. 2008), or the Cholesky decomposition

of $\Sigma^{-1}$ (Rothman et al. 2010). Banding $\Pi$ has also been studied by Wang & Daniels (2013b). In addition to reducing the number of parameters that must be sampled, other matrix computations will be faster by using properties of banded matrices.

Related to this, modifications to the shrinkage prior may be needed for larger dimension $J$. Recall that the variance of $\pi_{ij}$ is $\xi_{ij} = \epsilon_0 |j - i|^{-\gamma}$. For large lags, this can be very close to zero, leading to numerical instability; recall that the parameters of the SBeta distribution are inversely related to $\xi_{ij}$ through $\alpha_{ij} = \beta_{ij} = (\xi_{ij}^{-1} - 1)/2$. Replacing (2) with $\xi_{ij} = \epsilon_0 \min\{|j - i|, k\}^{-\gamma}$ or $\xi_{ij} = \epsilon_0 + \epsilon_1 |j - i|^{-\gamma}$ to bound the variances away from zero, or banding $\Pi$ after the first $k$ lags, provide two possibilities to avoid such numerical issues.

Further, we have parametrized the variance component and the selection probability in similar ways in our two sparse priors. The quantity is of the form $\epsilon_0 |j - i|^{-\gamma}$ for both $\xi_{ij}$ in (2) and $\epsilon_{ij}$ in (2), but other parameterizations are possible. We have considered some simulations (not included) allowing the variance/selection probability to be unique for each lag, i.e. $\xi_{ij} = \epsilon_{|j-i|}$. A prior needs to be specified for each of these $J - 1$ $\epsilon$'s, ideally decreasing in lag. Alternatively, one could use $\epsilon_0 / |j - i|$, which can be viewed as a special case where the prior on $\gamma$ is degenerate at 1. In our experience results were not very sensitive to the choice of the parameterization, and posterior estimates of $\Pi$ and $R$ were similar.

In addition, we have focused our discussion on the correlation estimation problem in the context of analysis with multivariate normal data. We note that these priors are additionally applicable in the context of estimating a constrained scale matrix for the multivariate Student $t$-distribution. Consider the random variable $Y \sim t_J(\mu, R, \nu)$. That is, $Y$ follows a $J$-dimensional $t$-distribution with location (mean) vector $\mu$, scale matrix $R$ (constrained to be a correlation matrix), and $\nu$ degrees of freedom (either fixed or random). Using the gamma-mixture-of-normals technique (Albert & Chib 1993), we rewrite the distribution of $Y$ to be $Y \mid \tau \sim N_J(\mu, \tau^{-1}R)$ and $\tau \sim$ Gamma($\nu/2$, $\nu/2$).

Sampling for $R$ as part of an MCMC chain follows as in Sections 2.3.2 and 2.4.3 using $Y^{\star} = \sqrt{\tau}\,(Y - \mu)$ as the data. However, one should note that a zero PAC $\pi_{ij}$ implies that $Y_i$ and $Y_j$ are uncorrelated given $Y_{i+1},\ldots,Y_{j-1}$, but this is not equivalent to conditional independence as in the normal case.
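The gamma-mixture representation and the rescaling $Y^{\star} = \sqrt{\tau}(Y - \mu)$ can be illustrated by simulation. The sketch below assumes NumPy and reads Gamma($\nu/2$, $\nu/2$) as shape $\nu/2$ and rate $\nu/2$, so that $\tau$ has mean one; the function name, the seed, and the example values of $\mu$, $R$, and $\nu$ are hypothetical. In an actual MCMC scheme $\tau$ would be sampled from its full conditional rather than known, which this sketch does not attempt.

```python
import numpy as np

def sample_t_via_mixture(n, mu, R, nu, rng):
    """Draw n vectors from t_J(mu, R, nu) as a gamma scale mixture of normals."""
    J = len(mu)
    # tau ~ Gamma(shape = nu/2, rate = nu/2); NumPy parameterizes by scale = 1/rate.
    tau = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    z = rng.multivariate_normal(np.zeros(J), R, size=n)   # N_J(0, R) draws
    y = mu + z / np.sqrt(tau)[:, None]                    # Y | tau ~ N_J(mu, R / tau)
    return y, tau

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 2.0])
R = np.array([[1.0, 0.7, 0.49],
              [0.7, 1.0, 0.7],
              [0.49, 0.7, 1.0]])
y, tau = sample_t_via_mixture(20000, mu, R, nu=5.0, rng=rng)

# Rescaled data: conditionally on tau these are N_J(0, R) vectors.
y_star = np.sqrt(tau)[:, None] * (y - mu)
print(np.round(np.corrcoef(y_star.T), 2))
```

The sample correlation of the rescaled vectors should be close to $R$, which is why the normal-data sampling steps can be reused on $Y^{\star}$.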

CHAPTER 3
A NONPARAMETRIC PRIOR FOR SIMULTANEOUS COVARIANCE ESTIMATION

3.1 Simultaneous Covariance Estimation

When working with longitudinal data, specifying the model for the dependence structure is a major consideration. Often the data are composed of several groups, such as differing treatments in a clinical trial. In many cases, particularly if one does not have many observations per group, one assumes that the covariance or correlation structure is constant across all groups. However, this assumption, if it fails to hold, can have a dramatic effect on the inference for mean effects, even sometimes leading to bias. Conversely, if one specifies each of the covariance matrices without regard to the other groups, this can lead to a loss of information. Dealing with these competing models for the covariance structure is a concern in many statistical applications, such as classification and model-based clustering. Therefore, it is desirable to develop methods to simultaneously estimate the set of covariance matrices that will borrow information across groups in a coherent, automated manner allowing for structural zeros, commonality across subsets of the groups, and appropriate equality of parameters within a group.

When the data are fully observed under multivariate normality, the mean and covariance parameters are orthogonal in the sense of Cox & Reid (1987), and the mean parameters will be consistent under misspecification of the covariance structure. However, if there is missingness, as is often the case for longitudinal data, there is no longer orthogonality, even at the true value of the covariance matrix (Little & Rubin 2002). Hence, for the posterior distribution of the mean parameters to be consistent, the dependence structure must be correctly specified, and it is not appropriate to treat the covariance matrix as a nuisance parameter (Daniels & Hogan 2008, Section 6.2). Further, Cripps et al. (2005) demonstrate efficiency gains for the regression parameters for fully observed data by using parsimonious models for the covariance matrix.

PAGE 50

Assume that we have $M$ groups of normally distributed longitudinal data with $n_m$ responses of dimension $p$, $Y_{mi}$ for the $m$th group. We assume without loss of generality that the mean vector for each group is zero. The distribution of the $Y_{mi}$ is $Y_{mi} \mid \Sigma_m \sim N_p(0, \Sigma_m)$ $(i = 1,\ldots,n_m;\ m = 1,\ldots,M)$, with the covariance matrix $\Sigma_m = \Sigma(\Phi_m, \Gamma_m)$ parameterized by the generalized autoregressive parameters, $\Phi_m$, and innovation variances, $\Gamma_m$, as described by Pourahmadi (1999, 2000). For brevity, we sometimes refer to the generalized autoregressive parameters as the autoregressive parameters. We also refer to this as the modified Cholesky parameterization, since the parameters are derived by performing a Cholesky decomposition on $\Sigma_m$, $\Sigma(\Phi_m, \Gamma_m)^{-1} = T(\Phi_m)\, D(\Gamma_m)\, T(\Phi_m)^{\top}$. Here, $\Gamma_m = (\gamma_{m1},\ldots,\gamma_{mp})$, and $D(\Gamma_m)$ is a $p \times p$ diagonal matrix with $(j,j)$-element $(\gamma_{mj})^{-1}$. The $T(\Phi_m)$ matrix is upper-triangular with ones on the main diagonal, and the above-diagonal elements are given by the negatives of $\Phi_m$. The elements of $\Phi_m = (\phi_{m1},\ldots,\phi_{mJ})$ are indexed by $j = 1,\ldots,J$ $(J = p(p-1)/2)$ corresponding to location $(j_1, j_2)$ in $T(\Phi_m)$ (1
PAGE 51

to estimate sparse graphical models by selecting sets of edges common to all groups, as well as group-specific edges. However, their method shares little information about non-zero parameters across the groups. Pourahmadi et al. (2007) developed estimation and testing procedures for equality among subsets of the $\phi_{mj}$'s. In a clustering context, models assuming $T(\Phi_m)$ and/or $D(\Gamma_m)$ to be either constant or distinct across all groups were developed by McNicholas & Murphy (2010). Daniels (2006) considered a Bayesian perspective by introducing priors for the parameters of the Cholesky decomposition, as well as the principal components of the covariance matrices, that induce pooling across groups. Unfortunately, it is computationally challenging to select among all the possible models within these classes. Hoff (2009) also considers a model that shrinks toward a common eigenvector structure, allowing the extent of the pooling to vary across each principal axis. Other methods have been proposed that model the covariance matrix as a regression function of a continuous covariate (Chiu et al. 1996; Daniels 2006; Fox & Dunson 2011; Hoff & Niu 2012). However, covariance regression models are often plagued by the difficulty of interpreting the regression parameters.

In this chapter we focus solely on the modified Cholesky parameterization because of the unrestrictedness of the parameters, the interpretability for longitudinal data, and the computational advantages via conjugacy (Daniels & Pourahmadi 2002). Our goal is to develop a prior for the sets $\Phi = \{\Phi_1,\ldots,\Phi_M\}$ and $\Gamma = \{\Gamma_1,\ldots,\Gamma_M\}$ in such a way that we borrow strength across the $M$ groups. Additionally, we want to share information across $\Gamma_m$ and $\Phi_m$ values, particularly those autoregressive parameters of a common lag. Another consideration for prior development is to encourage sparsity of the elements of $T(\Phi_m)$. Because each $\phi_{mj}$ represents a conditional dependency, setting $\phi_{mj}$ to zero establishes a conditional independence relationship between a pair of components of $Y$. It is necessary to consider priors that allow the data to inform the balance between these two goals: pooling across groups and introducing sparsity. Above all, we seek to accomplish this in an automated, stochastic fashion. To form

PAGE 52

PAGE 53

$X_{jH} = 1$ for all $j$, so that the stick-breaking weights sum to one, guaranteeing $F_{mj}$ is a valid distribution. The matrix stick-breaking process is then defined using the above specification as $H \to \infty$, and the authors refer to the finite $H$ case as the truncation approximation to the matrix stick-breaking process. We can consider the adequacy of this approximation using a method similar to that employed by Ishwaran & James (2001). Dunson et al. (2008) show that for a set $\{\pi_{mjh}\}$ drawn from the full process,

$$E\left(\sum_{h=H}^{\infty} \pi_{mjh}\right) = \left\{1 - \frac{1}{(1+\alpha)(1+\beta)}\right\}^{H-1}. \qquad (3)$$

We may choose the number of clusters $H$ such that this expected approximation error (3) is arbitrarily small, so the effect of the approximation is negligible.

Because the probability measures $F_{mj}$ and $F_{m'j}$ for two groups $m$ and $m'$ share the same set of atoms $\{\theta_{j1},\ldots,\theta_{jH}\}$, there is a positive probability that $\phi_{mj}$ will equal $\phi_{m'j}$. This occurs when $\phi_{mj}$ and $\phi_{m'j}$ are drawn from the same cluster, that is, if $\phi_{mj} = \phi_{m'j} = \theta_{jh}$ for some $h$ in $1,\ldots,H$. The probability of this occurring is a known function of the stick-breaking parameters $\alpha$ and $\beta$.

3.3 Covariance Grouping Priors

3.3.1 Lag-Block Grouping Prior for $\Phi$

We now propose priors to use for simultaneous covariance estimation based on the matrix stick-breaking process. These priors are referred to as grouping priors because they induce grouping among the values of the various parameters. To this end, we independently place priors on $\Phi$ and $\Gamma$ with the prior on $\Sigma$ induced by the mapping $\Sigma_m = \Sigma(\Phi_m, \Gamma_m)$. Because $\Phi$ and $\Gamma$ are orthogonal parameters (Pourahmadi 2007), it is sensible to choose independent priors.
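The truncated stick-breaking construction described above can be illustrated with a small simulation. This is a minimal sketch, not the dissertation's sampler: it assumes the usual matrix stick-breaking form in which the weight for cluster h is the product of a group stick and a parameter stick times the leftover mass, with `U ~ Beta(1, alpha)` and `X ~ Beta(1, beta)`. The function and variable names are hypothetical.

```python
import random


def msbp_weights(n_groups, n_params, H, alpha, beta, rng):
    """Draw truncated matrix stick-breaking weights pi[m][j][h]."""
    # Assumed stick distributions: U_mh ~ Beta(1, alpha), X_jh ~ Beta(1, beta).
    # The final stick (h = H) is set to one so each weight vector sums to one.
    U = [[rng.betavariate(1.0, alpha) for _ in range(H - 1)] + [1.0]
         for _ in range(n_groups)]
    X = [[rng.betavariate(1.0, beta) for _ in range(H - 1)] + [1.0]
         for _ in range(n_params)]
    pi = [[[0.0] * H for _ in range(n_params)] for _ in range(n_groups)]
    for m in range(n_groups):
        for j in range(n_params):
            remaining = 1.0  # running product of (1 - U_ml * X_jl) for l < h
            for h in range(H):
                stick = U[m][h] * X[j][h]
                pi[m][j][h] = stick * remaining
                remaining *= 1.0 - stick
    return pi


rng = random.Random(2013)
weights = msbp_weights(n_groups=3, n_params=4, H=25, alpha=1.0, beta=1.0, rng=rng)
row_sums = [sum(weights[m][j]) for m in range(3) for j in range(4)]
```

Each entry of `row_sums` should equal one up to floating-point error, which is the property that makes every truncated random measure a valid distribution.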

PAGE 54

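The prior for $\Phi$, referred to as the lag-block grouping prior, is defined as follows.

$$\phi_{mj} \sim F_{mj}(\cdot) = \sum_{h=1}^{H_\phi} \pi_{mjh}\, \delta_{\theta_{q(j)h}}(\cdot) \quad (m = 1,\ldots,M;\ j = 1,\ldots,J), \qquad (3)$$

$$\theta_{qh} \sim \epsilon_q\, \delta_0 + (1 - \epsilon_q)\, N(0, \sigma^2) \quad (q = 1,\ldots,p-1;\ h = 1,\ldots,H_\phi), \qquad (3)$$

$$\pi_{mjh} = U_{mh} X_{jh} \prod_{l}$$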
PAGE 55

We form the probabilities $\{\pi_{mjh}\}$ as in Dunson et al. (2008). The $\alpha_\phi$ and $\beta_\phi$ stick-breaking parameters serve the same role as $\alpha$ and $\beta$ before. We subscript them with $\phi$ to distinguish the stick-breaking parameters for the prior on $\Phi$ from the parameters to be defined for the prior on $\Gamma$. The $U_{mh}$ and $X_{jh}$ parameters follow the same interpretation as in the matrix stick-breaking process, but while we share candidates across autoregressive parameters of the same lag, each parameter has its own values $X_{jh}$.

A key distinction between our prior and the original process of Dunson et al. (2008) is the use of the same set of candidate values for different parameters. This has important consequences for the theoretical properties of our priors. In particular, for $\phi_{mj}$ and $\phi_{mj'}$ with $j \neq j'$ and $q(j) = q(j')$, i.e. different autoregressive parameters for a common group and lag, their distributions $F_{mj}$ and $F_{mj'}$ are positively correlated, whereas under the original specification they would be uncorrelated. This implication is quite attractive for longitudinal data as it follows common intuition. For example, it may be reasonable to assume that the regression effect of $Y_t$ on $Y_{t-1}$ is the same for different values of $t$. We discuss these properties further in Section 3.4.

3.3.2 Correlated-Lognormal Grouping Prior for $\Gamma$

We now define the prior for the innovation variances $\Gamma$ as follows.

$$\gamma_{mj} \sim G_{mj}(\cdot) = \sum_{h=1}^{H_\gamma} \varpi_{mjh}\, \delta_{\xi_{jh}}(\cdot) \quad (m = 1,\ldots,M;\ j = 1,\ldots,p), \qquad (3)$$

$$\xi_{jh} = \exp(\omega_{jh}) \quad (j = 1,\ldots,p;\ h = 1,\ldots,H_\gamma), \qquad \omega_h = (\omega_{1h},\ldots,\omega_{ph})^{\top} \sim N_p\{\psi 1_p, \tau R(\rho)\} \quad (h = 1,\ldots,H_\gamma), \qquad (3)$$

$$\varpi_{mjh} = W_{mh} Z_{jh} \prod_{l}$$
PAGE 56

We draw the innovation variance $\gamma_{mj}$ from the stick-breaking measure $G_{mj}$, where the candidate atoms are drawn by exponentiating a multivariate normal variable $\omega_h$. The probability $\varpi_{mjh}$ of each of the atoms is formed using the stick-breaking method on the product of $W$ and $Z$. These beta random variables depend on the parameters $\alpha_\gamma$ and $\beta_\gamma$. The candidates $\xi_{jh}$ are drawn in a correlated fashion, unlike the original matrix stick-breaking process, and marginally follow a lognormal distribution, providing the name of this prior.

We introduce the intermediate variable $\omega_h$ in (3), which is a $p$-dimensional normally distributed random vector with mean vector $\psi 1_p$ and covariance matrix $\tau R(\rho)$. Here, $\psi$ and $\tau$ are scalar quantities, $\tau > 0$, and $R(\rho)$ is the correlation matrix corresponding to an autoregressive function of order 1. The $(i,j)$ component of $R(\rho)$ is $\rho^{|i-j|}$. This choice is motivated by the fact that one sometimes considers the innovation variances as realized values of some unknown smooth function of time. Similar to the lag-block prior, we obtain the atoms $\xi_{jh}$ for the random measure $G_{mj}$ in a dependent way, while leaving the construction of the probability weights $\varpi_{mjh}$ unchanged.

In the special case where $\rho = 0$, the components of the $\omega_h$ vector are independent. Consequently, the innovation variance candidates $\xi_{jh}$ are distributed according to the lognormal$(\psi, \tau)$ distribution, and this special case follows the matrix stick-breaking process framework.

In addition to the grouping priors that we have defined here, there are other possibilities to form similar priors on the set $\{\Sigma_1,\ldots,\Sigma_M\}$ using the matrix stick-breaking process framework. We explore some of these in Appendix B.

3.4 Theoretical Properties

3.4.1 Generalized Autoregressive Parameter Properties

We now explore some of the theoretical properties of the proposed grouping priors in the case where $H_\phi, H_\gamma \to \infty$. Recall that the matrix stick-breaking process is formally

PAGE 57

defined to be the limiting distribution as the number of clusters approaches infinity, and the finite number of clusters case, while necessary for implementation, is viewed as an approximation. Our grouping priors follow in the same way. The following properties, (3)-(3), are derived for these limiting distributions, and we ensure that the number of clusters is chosen large enough that these properties may be considered to hold approximately. The initial properties mirror Propositions 1, 2, and 4 of Dunson et al. (2008). Partial derivations of (3)-(3) are provided in Appendix C.

First, we consider the behavior of $\Phi$ from the lag-block grouping prior. For the following calculations, we assume that the $\epsilon_q$'s and all hyperparameters are fixed. Additionally, for ease of notation, we ignore the subscript on $\epsilon_q$, writing $\epsilon$, when it is clear from context, and let $\Lambda(\cdot)$ denote the probability measure for the Normal$(0, \sigma^2)$ distribution. Define $\Omega(\cdot) = \epsilon\,\delta_0(\cdot) + (1 - \epsilon)\,\Lambda(\cdot)$, the probability measure for the mixture distribution of the $\theta_{qh}$'s. For all sets $A$ in the Borel field of the real line $\mathcal{B}(\mathbb{R})$,

$$E\{F_{mj}(A)\} = \Omega(A), \qquad \mathrm{Var}\{F_{mj}(A)\} = \frac{2}{(2+\alpha_\phi)(2+\beta_\phi) - 2}\, \Omega(A)\{1 - \Omega(A)\}. \qquad (3)$$

This unbiasedness property shows that it is appropriate to refer to the zero-normal mixture as the base distribution for $\phi_{mj}$. The form of the variance shows that $\alpha_\phi$ and $\beta_\phi$ control the extent to which the random measure $F_{mj}$ differs from the base distribution. As either $\alpha_\phi$ or $\beta_\phi$ approaches infinity, the distribution of $\phi_{mj}$ collapses to the parametric base; small values of $\alpha_\phi$ and $\beta_\phi$ allow for a more flexible prior.

For two different groups $m \neq m'$,

$$\mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\} = \frac{\alpha_\phi + \alpha_\phi\beta_\phi/2 + \beta_\phi + 1}{2\alpha_\phi + \alpha_\phi\beta_\phi + \beta_\phi + 1}. \qquad (3)$$

Because this correlation between the amount of mass the distribution functions assign to the set $A$ does not depend on the choice of $A$, it may be used as a simple univariate measure of the degree to which information is shared across groups. Simple algebra

PAGE 58

shows that $1/2 \le \mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\} \le 1$. In particular, $\mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\}$ approaches $1/2$ as either $\alpha_\phi$ or $\beta_\phi$ approaches infinity and approaches 1 as $\alpha_\phi \to 0$.

For groups $m \neq m'$, the probability of matching for the $j$th autoregressive parameter is

$$\mathrm{pr}(\phi_{mj} = \phi_{m'j}) = \epsilon^2 + \frac{1 - \epsilon^2}{(1+\alpha_\phi)(2+\beta_\phi) - 1}. \qquad (3)$$

The presence of the zero point mass in $\Omega$ causes our properties to differ from those derived in Dunson et al. (2008). As either $\alpha_\phi$ or $\beta_\phi$ approaches infinity, this probability approaches $\epsilon^2$, the probability that both $\phi_{mj}$ and $\phi_{m'j}$ are zero if drawn independently from the parametric base distribution. The right hand side of (3) is increasing in $\epsilon$, as larger values of $\epsilon$ indicate that both terms are more likely to be zero whether or not they come from the same cluster. Additionally, (3) increases when either $\alpha_\phi$ or $\beta_\phi$ decreases, coinciding with the increase in (3).

Considering two different autoregressive parameters $j \neq j'$ of the same lag $q = q(j) = q(j')$ from the same group $m$, we have that

$$\mathrm{corr}\{F_{mj}(A), F_{mj'}(A)\} = \frac{\beta_\phi + \alpha_\phi\beta_\phi/2 + \alpha_\phi + 1}{2\beta_\phi + \alpha_\phi\beta_\phi + \alpha_\phi + 1}, \qquad \mathrm{pr}(\phi_{mj} = \phi_{mj'}) = \epsilon_q^2 + \frac{1 - \epsilon_q^2}{(2+\alpha_\phi)(1+\beta_\phi) - 1}. \qquad (3)$$

The grouping prior has imposed a correlation structure on the distribution functions of the $\phi$'s of a common lag, allowing us to borrow strength in the estimation of the dependence parameters from the same lag. This correlation is the same as (3) for the earlier $m \neq m'$ case with the roles of $\alpha_\phi$ and $\beta_\phi$ reversed. Likewise, the probability of matching across parameters of common lag (3) is also equivalent to the probability of matching across groups for a common parameter (3) with $\alpha_\phi$ and $\beta_\phi$ exchanged. This is a key distinction from the process of Dunson et al. (2008), where the correlation and matching probabilities would be zero.
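The closed-form correlations and matching probabilities above are easy to evaluate numerically. The sketch below codes them directly for the lag-block prior. The function names are hypothetical, and `alpha`, `beta`, and `eps` stand for the two stick-breaking parameters and the zero point-mass probability, whose symbols are assumed here because the Greek letters did not survive extraction.

```python
def corr_across_groups(alpha, beta):
    """corr{F_mj(A), F_m'j(A)}: two groups, one common parameter j."""
    return (alpha + alpha * beta / 2 + beta + 1) / (2 * alpha + alpha * beta + beta + 1)


def corr_within_lag(alpha, beta):
    """corr{F_mj(A), F_mj'(A)}: one group, two parameters of a common lag.

    Same expression with the roles of alpha and beta exchanged.
    """
    return corr_across_groups(beta, alpha)


def match_prob_across_groups(alpha, beta, eps):
    """pr(phi_mj = phi_m'j): both zero, or drawn from the same cluster."""
    return eps ** 2 + (1 - eps ** 2) / ((1 + alpha) * (2 + beta) - 1)


def match_prob_within_lag(alpha, beta, eps):
    """pr(phi_mj = phi_mj') for a common lag, with alpha and beta exchanged."""
    return match_prob_across_groups(beta, alpha, eps)


example_corr = corr_across_groups(1.0, 1.0)
example_match = match_prob_across_groups(1.0, 1.0, 0.5)
```

Evaluating these on a grid of `alpha` and `beta` values is a quick way to see how strongly a given hyperparameter setting pools information across groups versus across parameters of a common lag.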

PAGE 59

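For different groups $m \neq m'$ and different autoregressive parameters $j \neq j'$ of the same lag,

$$\mathrm{corr}\{F_{mj}(A), F_{m'j'}(A)\} = \frac{\alpha_\phi\beta_\phi/2 + \alpha_\phi + \beta_\phi + 1}{2\alpha_\phi\beta_\phi + 2\alpha_\phi + 2\beta_\phi + 1}, \qquad \mathrm{pr}(\phi_{mj} = \phi_{m'j'}) = \epsilon_q^2 + \frac{1 - \epsilon_q^2}{2(1+\alpha_\phi)(1+\beta_\phi) - 1}. \qquad (3)$$

Some algebra shows that this correlation is less than both (3) and (3). Likewise, $\mathrm{pr}(\phi_{mj} = \phi_{m'j'})$ is smaller than (3) and (3). That is, the correlations of the distribution functions and the probability of matching across both group and autoregressive parameter are strictly smaller than the correlation and matching probability across just one. If $\alpha_\phi > \beta_\phi$, then $\mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\}$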
PAGE 60

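3.4.2 Innovation Variance Properties

We now explore the behavior of the innovation variances and their distributions $G_{mj}$. Let $\mathbb{R}^{+}$ denote the positive real line, $\log A$ be the set $\{\log x : x \in A\}$ for any $A \in \mathcal{B}(\mathbb{R}^{+})$, and $\Lambda_\gamma(\cdot)$ the probability function for the $N(\psi, \tau)$ distribution, assuming the hyperparameters $\psi$, $\tau$ are fixed. Properties (3)-(3) hold as in the autoregressive parameter case with $\Omega(A)$ replaced by $\Lambda_\gamma(\log A)$ and $\epsilon$ set to 0. For innovation variances of the same group and different times $j \neq j'$,

$$\mathrm{corr}\{G_{mj}(A), G_{mj'}(A)\} = \frac{\beta_\gamma + \alpha_\gamma\beta_\gamma/2 + \alpha_\gamma + 1}{2\beta_\gamma + \alpha_\gamma\beta_\gamma + \alpha_\gamma + 1}\; \mathrm{corr}\{\delta_{\omega_{j1}}(\log A), \delta_{\omega_{j'1}}(\log A)\}, \qquad (3)$$

and for both different groups $m \neq m'$ and different times $j \neq j'$,

$$\mathrm{corr}\{G_{mj}(A), G_{m'j'}(A)\} = \frac{\alpha_\gamma\beta_\gamma/2 + \alpha_\gamma + \beta_\gamma + 1}{2\alpha_\gamma\beta_\gamma + 2\alpha_\gamma + 2\beta_\gamma + 1}\; \mathrm{corr}\{\delta_{\omega_{j1}}(\log A), \delta_{\omega_{j'1}}(\log A)\}. \qquad (3)$$

The correlation of these distributions now depends on the choice of Borel set $A$. However, they are the products of a term that depends solely on the stick-breaking parameters $\alpha_\gamma$ and $\beta_\gamma$ and a term that depends only on $A$ and the distribution of $(\omega_{j1}, \omega_{j'1}) \sim N_2\{\psi 1_2, \tau R(\rho)\}$, where $R(\rho)$ is the $2 \times 2$ correlation matrix with off-diagonal elements $\rho^{|j-j'|}$. The higher correlations for neighboring terms imply a smoothing of the variances as a function of $j$ for $\rho > 0$. We observe that the leading term gives the same correlation structure as in (3). Additionally, with the choice of $\rho = 0$, the term depending on $A$ is zero, and the distributions are uncorrelated as in Dunson et al. (2008).

For $j \neq j'$ and $1 \le m, m' \le M$, $\mathrm{pr}(\gamma_{mj} = \gamma_{m'j'}) = 0$, that is, there is no matching of the innovation variances across time points. This is a consequence of the fact that two points drawn from a correlated normal distribution with $|\rho| < 1$ will be equal with probability zero.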
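The order-1 autoregressive correlation structure that links the innovation variance candidates across time can be sketched as follows. This is an illustration only: the names `psi`, `tau`, and `rho` are assumed labels for the mean, variance scale, and correlation of the normal vector, and the draw uses a stationary AR(1) recursion rather than a general multivariate normal sampler.

```python
import math
import random


def ar1_correlation(p, rho):
    """Correlation matrix R(rho) with (i, j) entry rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]


def draw_lognormal_atoms(p, psi, tau, rho, rng):
    """Draw one candidate vector exp(omega), omega ~ N_p(psi * 1_p, tau * R(rho))."""
    sd = math.sqrt(tau)
    # The stationary AR(1) recursion reproduces covariance tau * rho ** |i - j|.
    omega = [psi + sd * rng.gauss(0.0, 1.0)]
    for _ in range(1, p):
        innovation = sd * math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        omega.append(psi + rho * (omega[-1] - psi) + innovation)
    return [math.exp(w) for w in omega]


R = ar1_correlation(4, 0.75)
atoms = draw_lognormal_atoms(p=6, psi=0.0, tau=1.0, rho=0.75, rng=random.Random(1))
```

With `rho = 0` the recursion reduces to independent lognormal draws, which corresponds to the special case that falls back to the original matrix stick-breaking framework.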

PAGE 61

3.5 Computational Considerations

Recall that equation (3) provided us with the expected approximation error, which we employ to choose the number of clusters necessary for the matrix stick-breaking process truncation. This formula continues to hold for the proposed grouping priors, since the stick-breaking weights are formed using the same framework. Hence, if the values of $\alpha$, $\beta$ for either the lag-block or correlated-lognormal prior are assumed known, then we choose the number of clusters $H$ such that (3) is less than some threshold, such as 0.01. As we generally do not have any knowledge or prior belief about these stick-breaking parameters, it will often be inappropriate to prespecify values, so we follow the suggestion of Dunson et al. (2008) and specify independent Gamma(1,1) priors for $\alpha$ and $\beta$. Analyses using Gamma(10,10) and Gamma(0.1,0.1) indicate little effect of this prior choice on the resulting estimates. To choose the value of $H$ when using a prior for the stick-breaking parameters, we run a preliminary Markov chain for approximately 10% of the desired chain length and use the posterior means to test whether (3) is below our threshold. If so, we fix this value of $H$ for the remaining computation.

One of the nice properties of the matrix stick-breaking process is that introducing appropriate latent variables leads to a computational algorithm that generally samples from well-known conjugate distributions (Dunson et al. 2008). Because a normal prior for the generalized autoregressive parameters provides conjugacy, these parameters are sampled from a zero-normal mixture that is relatively easy to sample. With the lognormal distribution, conjugacy for the innovation variances is not obtained, since the inverse gamma is the conjugate distribution for them, but we can sample efficiently by incorporating a slice sampling step (Neal 2003). We appropriately modify the algorithm of Dunson et al. (2008) for posterior sampling from our grouping priors and discuss further computational challenges in Appendix D.
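The rule for choosing the truncation level can be sketched in a few lines. The code assumes the expected truncation error has the form {1 - 1/((1 + alpha)(1 + beta))}^(H - 1), which follows from the stick-breaking construction with Beta(1, alpha) and Beta(1, beta) sticks. The function names and the default threshold argument are illustrative.

```python
def expected_truncation_error(H, alpha, beta):
    """Expected prior mass left beyond the first H - 1 clusters."""
    return (1.0 - 1.0 / ((1.0 + alpha) * (1.0 + beta))) ** (H - 1)


def smallest_adequate_H(alpha, beta, threshold=0.01):
    """Smallest H whose expected truncation error falls below the threshold."""
    H = 1
    while expected_truncation_error(H, alpha, beta) >= threshold:
        H += 1
    return H


H_unit_sticks = smallest_adequate_H(alpha=1.0, beta=1.0)
```

In practice `alpha` and `beta` would be replaced by posterior means from the preliminary chain, and larger values of either parameter call for a larger `H`.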

PAGE 62

One issue in the sampling algorithm is the behavior of sampling the correlation $\rho$. In our experience when considering priors with random $\rho$, $\rho$ did not seem to be well informed by the data. Hence, we opt to treat $\rho$ as a tuning parameter. We recommend specifying a default value such as $\rho = 0.75$, possibly trying a few other choices and selecting the value based on some model selection criterion. As shown in the depression data study in Section 3.7, the three choices of $\rho = 0.5$, $0.75$, and $0.9$ lead to similar model fits as measured by the deviance. Based on our simulation studies, it appears that the correlated-lognormal prior is fairly robust to the choice of $\rho$.

3.6 Risk Simulation

We now examine the operating characteristics of the proposed grouping priors via a risk simulation designed to mimic the analysis of a typical longitudinal data scenario. We incorporate a non-zero mean, and the simulated data will suffer from ignorable dropout. There are $M = 8$ groups each with $n_m = 50$ measurements of dimension $p = 6$. Let $D_i$ denote the time $t = 2,\ldots,p+1$ of dropout for subject $i$, where $D_i = p+1$ indicates a subject who completes the study. Dropout is induced according to the model

$$\mathrm{logit}\{\mathrm{pr}(D_i = t+1 \mid D_i > t, y_{it}, m)\} = \eta_{0t} + \eta_{1t}\, y_{it} + \eta_{2m} \quad (t = 1,\ldots,p-1). \qquad (3)$$

This missing data mechanism is missing at random because the dropout time depends only on observed values. The mean, covariance, and dropout parameters for the simulation, as well as the probabilities of missingness at time $t$ for each group, are provided in Appendix E. The choices of $\Phi$ and $\Gamma$ do not have any equalities across groups but some within lag. However, with the small sample sizes, it will generally still be advantageous to share information across the eight groups. Also, there is a moderate amount of sparsity in $\Phi$, as is typical for ordered data. All groups have a mean of zero at time 1, and the mean functions increase at differing rates to the final time $t = 6$. The dropout rates vary across groups with most groups losing 35 to 50% of their subjects by $t = 6$. Groups 3 and

PAGE 63

PAGE 64

Under this data model, we generate 50 data sets and run our Markov chain Monte Carlo algorithm on each data set with each prior for 50,000 iterations keeping every tenth iteration, using a burn-in of 10,000. We place the following priors on the hyperparameters when appearing in the prior specification: $\epsilon_q$, independent Unif(0,1); $\alpha_\phi$, $\beta_\phi$, $\alpha_\gamma$, $\beta_\gamma$, and the two hyperparameters subscripted 1 and 2, independent Gamma(1,1); $\sigma^2 \sim$ InvGamma(0.1, 0.1); $\tau \sim$ InvGamma(0.1, 0.1); $\psi \sim N(0, c^2)$, with $c^2 = 1000$. We fix the value of $\rho$ to be 0.75. We assume a flat prior on the group-specific mean vectors $\mu_m$. To handle incomplete data, we use data augmentation to sample the missing data values from normal distributions conditional on the observed data.

We measure the performance of our proposed priors by estimating the risk associated with the Bayes estimators under two common loss functions (Yang & Berger 1994), $L_1(\Sigma_m, \hat\Sigma_{m1}) = \mathrm{tr}(\Sigma_m^{-1}\hat\Sigma_{m1}) - \log|\Sigma_m^{-1}\hat\Sigma_{m1}| - p$ and $L_2(\Sigma_m, \hat\Sigma_{m2}) = \mathrm{tr}\{(\Sigma_m^{-1}\hat\Sigma_{m2} - I)^2\}$. Since these losses are defined in terms of a single covariance matrix, we consider the loss for estimating the set of covariance matrices to be the average across groups of the losses from the individual covariance matrices. To study the ability to recover the mean function with differing priors on $\Sigma$, we use the posterior mean of $\mu_m$ and the loss function $L_\mu(\hat\mu_m, \mu_m) = (\hat\mu_m - \mu_m)^{\top} \Sigma_m^{-1} (\hat\mu_m - \mu_m)$, taking an average across groups standardized by the true covariance matrix $\Sigma_m$.

The estimated risks associated with estimating the covariance matrices for loss functions $L_1$ and $L_2$ are shown in Table 3-1. The grouping prior shows a risk improvement of 30 and 25% over the naive Bayes 1 prior and 52 and 41% over the group-specific flat prior. While the naive Bayes 1 prior accommodates sparsity, it does not promote any equality relationships in the autoregressive parameters across groups as in the true parameter values, unlike the grouping prior.

The final column of Table 3-1 displays the risk in mean estimation, showing a clear improvement under the grouping prior. The lag-block/correlated-lognormal prior produces a risk 14% smaller than the naive Bayes 1 prior and 29% smaller than the

PAGE 65

Table 3-1. Estimated risks for each choice of covariance prior from the simulation in Section 3.6. The estimated risk is calculated as the average loss using loss functions $L_1(\Sigma_m, \hat\Sigma_{m1}) = \mathrm{tr}(\Sigma_m^{-1}\hat\Sigma_{m1}) - \log|\Sigma_m^{-1}\hat\Sigma_{m1}| - p$, $L_2(\Sigma_m, \hat\Sigma_{m2}) = \mathrm{tr}\{(\Sigma_m^{-1}\hat\Sigma_{m2} - I)^2\}$, and $L_\mu(\hat\mu_m, \mu_m) = (\hat\mu_m - \mu_m)^{\top}\Sigma_m^{-1}(\hat\mu_m - \mu_m)$.

Prior                          Estimated Risk
                               L_1       L_2       L_mu
Covariance Grouping Prior      0.425     0.742     0.175
Naive Bayes 1                  0.605     0.987     0.203
Naive Bayes 2                  0.630     1.010     0.210
Group-specific flat*           0.892     1.255     0.248
Common-flat                    8.105     84.339    0.925

*The group-specific flat prior is only over 49 data sets because the Markov chain failed to converge for one data set.

group-specific estimator. The risk associated with the common-$\Sigma$ prior is roughly five times that associated with the grouping priors. This corresponds with our observations in the introduction that by considering a more accurate structure on the dependence, we estimate the mean function more efficiently and with less bias.

Additional risk simulations using the grouping prior under simpler data models with fully observed data have been performed, some of which are included in Appendix E. These continued to show that our prior outperforms the naive competitors under many different types of covariance matrix specifications, such as situations with no sparsity and dissimilar covariance matrices across groups, and under increasing $n_m$, $M$, and $p$. In particular, we believe that as the number of groups $M$ and the dimension of the covariance matrix $p$ increase, the grouping estimators for $\Sigma$ will outperform the naive Bayes estimators and the margin by which they do so increases. The flat priors continued to perform poorly compared to the grouping and naive choices.

3.7 Data Example

We now demonstrate the use of the grouping priors in the fitting of a longitudinal data set from a depression study. The data, originally presented by Thase et al. (1997)

PAGE 66

PAGE 67

PAGE 68

Figure 3-1. The posterior probabilities of matching for the innovation variances at times 1, 9, and 15. The sizes of the boxes are proportional to $\mathrm{pr}(\gamma_{mj} = \gamma_{m'j} \mid y_{obs})$.

We also consider how the choice of covariance prior affects mean estimation. We show the treatment effect, calculated as the difference in mean value between baseline and week 16, and 95% credible intervals for the first two groups in Table 3-2. In group $m = 2$ we see that there are clear differences for this effect across the different priors. The treatment effect under the grouping priors is estimated to be around 9.5 points, but it is 10.2 for the common-flat and 6.9 for the group-specific flat prior. Even between the two naive priors, the estimated treatment effects differ by 0.6. For group 1, as well as groups 3 and 4, we do not observe much difference in the mean effect, except for some deviation with the common-$\Sigma$ prior, although the credible interval is narrower for the grouping priors than for the flat versions. These two groups demonstrate the bias and efficiency issues relevant to covariance matrix estimation with missing data as discussed in the introduction. We note that the differences do not rise to the level of statistical significance here, but they are large enough to be of practical importance.

Figures 3-1 and 3-2 show the grouping nature of the proposed priors. Figure 3-1 shows the posterior probabilities $\mathrm{pr}(\gamma_{mj} = \gamma_{m'j})$ for each $m, m'$ combination at times $j = 1, 9, 15$. These times were chosen as representative of the overall patterns in the data. For $j = 1$ and most of the undisplayed times, there is substantial matching for groups 1 and 2, the low initial severity groups, as well as for groups 3 and 4, the high

PAGE 69

Figure 3-2. The posterior probabilities of matching for the generalized autoregressive parameters. Panel (a) contains the matching for the first four lag-1 terms, and (b) displays the first four lag-4 terms. The sizes of the gray boxes are proportional to $\mathrm{pr}(\phi_{mj} = \phi_{m'j'} \mid y_{obs})$. The black boxes overlaying the diagonal are proportional to the posterior of $\mathrm{pr}(\phi_{mj} = 0)$. The axes indicate group $m$ (top line) and autoregressive parameter $j$ of lag-$q$ (bottom line).

initial severity groups, with less matching across the pairs. The variances at $j = 9$ and $15$ show a stronger propensity to match across all groups.

Figure 3-2 gives the posterior probabilities of matching for the lag-1 and lag-4 autoregressive parameters. We show only the first four of each due to space limitations. The black boxes that overlay the $y = x$ diagonal are proportional to the posterior of $\mathrm{pr}(\phi_{mj} = 0)$. We see that the lag-1 terms are rarely set to zero, while the lag-4 terms for groups 2 and 3 are frequently zeroed out. Due to the nature of the grouping prior, there is a positive probability of equality across $\phi_{mj}$'s of a common lag. In Figure 3-2(a) we note the pairwise probability of equality is very high for all combinations of the first autoregressive parameter of all groups, i.e. the regression of time 2 onto time 1, and the other lag-1, group 2 terms. One would be unlikely to learn of this

PAGE 70

relationship or to consider a model with equality across all, or even a large subset, of these parameters using other approaches. Considering Figure 3-2(b), there are larger matching probabilities for the lag-4 parameters, much of which is due to matching with both parameters set to zero. However, the matching is not always due to equality at zero, as can be seen from the large probabilities of matching across the group 1's. There is similar behavior for the group 4 generalized autoregressive parameters.

3.8 Discussion

We have developed a prior on the set of $M$ covariance matrices that simultaneously exploits sparsity and matching of dependence parameters across groups. The model space containing all combinations where each autoregressive parameter/variance is constant across all possible subsets of the groups has $B_M^{p(p+1)/2}$ models, where $B_M$ is the $M$th Bell number (Stanley 1997, p. 33). In fact, the grouping priors consider a space that is even larger since we allow matching across autoregressive parameters of a common lag. With this many models we have little hope of finding the most appropriate one. Our grouping priors avoid this problem by stochastically considering the possibility of each of these models in a single analysis and accounting for uncertainty appropriately. It is our belief that running a Markov chain with one of these grouping priors is a necessary alternative to the unreasonable time and energy required to fit and compare this extremely large class of models.
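To give a sense of the size of this model space, the Bell numbers can be computed with the standard Bell triangle recurrence. The sketch below is illustrative; the function name is hypothetical, and the values M = 8 and p = 6 are borrowed from the risk simulation of Section 3.6 purely as an example.

```python
def bell_number(n):
    """Return the n-th Bell number using the Bell triangle."""
    row = [1]  # row 0 of the triangle; its first entry is B_0
    for _ in range(n):
        new_row = [row[-1]]  # each row starts with the last entry of the previous one
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


M, p = 8, 6
n_partitions = bell_number(M)                    # partitions of the M groups
n_models = n_partitions ** (p * (p + 1) // 2)    # one partition per dependence parameter
```

Even for these modest dimensions `n_models` is far too large for exhaustive model comparison, which is the motivation for exploring the space stochastically.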

PAGE 71

CHAPTER 4
COVARIANCE PARTITION PRIORS: A BAYESIAN APPROACH TO SIMULTANEOUS COVARIANCE ESTIMATION

4.1 Simultaneous Covariance Estimation and a Drawback of the Covariance Grouping Priors

When modeling longitudinal data, estimation of the covariance matrices is of prime importance. In many cases, the data may consist of multiple groups defined by differences in treatments and/or baseline covariates. An important consideration for the analyst is whether, and how, the dependence structure varies across these groups. Often, one performs inference under the assumption that the covariance structures are equal across these groups, but if this assumption is incorrect, the resulting inferences may be invalidated, even for inferences on the mean model if there is missingness (Daniels & Hogan 2008). Conversely, modeling the dependence structures independently without regard to the other groups can lead to inefficiency if the group sample sizes are small or the dimension is large. Our goal is to develop methodology that will allow for the sharing of information across groups to improve estimation efficiency.

Consider $M$ groups containing $n_m$ observations, $Y_{mi}$ $(i = 1,\ldots,n_m;\ m = 1,\ldots,M)$, of $T$-dimensional, multivariate normal data. Without loss of generality, we assume $Y_{mi}$ has mean zero; otherwise, we let $Y_{mi}$ represent the residual after centering. Each group $m$ has its own covariance matrix $\Sigma_m$, and we let $\Sigma = \{\Sigma_1,\ldots,\Sigma_M\}$ denote the collection of covariance matrices. We parameterize each $\Sigma_m$ through the modified Cholesky decomposition (Pourahmadi 1999, 2000), so $\Sigma_m^{-1} = T_m D_m T_m^{\top}$ with $D_m$ a diagonal matrix with positive entries and $T_m$ an upper-triangular matrix with unit diagonals. The parameters of this Cholesky parameterization are interpretable by considering the sequential distributions of $Y_{mi}$,

$$f(Y_{mi1},\ldots,Y_{miT}) = f(Y_{mi1})\, f(Y_{mi2} \mid Y_{mi1}) \cdots f(Y_{miT} \mid Y_{mi1},\ldots,Y_{mi,T-1}). \qquad (4)$$
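The link between the sequential factorization and the decomposition of the inverse covariance matrix can be checked numerically. The sketch below builds the upper-triangular factor and the diagonal factor from assumed sequential regression coefficients `phi` and innovation variances `gamma` (illustrative names and values), reconstructs the covariance matrix implied by the same sequential regressions, and multiplies the two together.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def cholesky_factors(phi, gamma):
    """Build unit upper-triangular T and diagonal D from sequential regressions.

    phi[t][j] is the coefficient of response j in the regression for response t
    (j < t); gamma[t] is the corresponding innovation variance.
    """
    n = len(gamma)
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for t in range(1, n):
        for j in range(t):
            T[j][t] = -phi[t][j]  # column t holds the negated coefficients
    D = [[1.0 / gamma[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return T, D


def covariance_from_sequential(phi, gamma):
    """Covariance matrix implied by the same sequential regressions."""
    n = len(gamma)
    S = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for k in range(t):
            S[t][k] = S[k][t] = sum(phi[t][j] * S[j][k] for j in range(t))
        S[t][t] = gamma[t] + sum(phi[t][j] * S[j][t] for j in range(t))
    return S


phi = [[], [0.5], [0.2, 0.4]]   # illustrative values only
gamma = [1.0, 0.5, 0.25]
T, D = cholesky_factors(phi, gamma)
T_transpose = [list(col) for col in zip(*T)]
precision = matmul(matmul(T, D), T_transpose)
sigma = covariance_from_sequential(phi, gamma)
identity_check = matmul(precision, sigma)
```

If the construction is right, `identity_check` is the identity matrix up to rounding. Any real-valued `phi` and positive `gamma` yield a valid covariance matrix, which is what makes this parameterization convenient for prior specification.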

PAGE 72

Under multivariate normality each of these $T$ sequential distributions, $f(Y_{mit} \mid Y_{mi1},\ldots,Y_{mi,t-1})$, is a normal distribution with mean $\sum_{j=1}^{t-1} \phi_{m;jt} Y_{mij}$ and variance $\gamma_{mt}$. Let $\phi_{mt} = (\phi_{m;1t},\ldots,\phi_{m;t-1,t})^{\top}$ be the vector of regression coefficients from the sequential regression for the $t$th response. From the Cholesky decomposition, the unconstrained elements of the $t$th column of $T_m$ are $-\phi_{mt}$, and the $(t,t)$ element of $D_m$ is $1/\gamma_{mt}$. Hence, these sequential distributions fully and uniquely determine $\Sigma_m$. Further, we need not worry about the positive definite constraint on $\Sigma_m$ as long as $\gamma_{mt} > 0$ for all $t$, and the normal and inverse gamma distributions provide conditionally conjugate priors for $\phi_{mt}$ and $\gamma_{mt}$, respectively (Daniels & Pourahmadi 2002). We denote the $\phi_{m;jt}$'s as the generalized autoregressive parameters (GARPs) and the $\gamma_{mt}$'s as the innovation variances (IVs).

There has been a substantial amount of research on the simultaneous covariance estimation problem, but relatively little focus on the longitudinal data scenario. Methods have been suggested by imposing commonalities on a particular feature of the $\Sigma_m$'s: equality across subsets of the principal components of $\Sigma_m$ (Boik 2002) or its correlation matrix (Boik 2003); equality across correlation matrices or the volumes of $\Sigma_m$ (Manly & Rayner 1987); equality across all $T_m$ and/or all $D_m$ (McNicholas & Murphy 2010); equality among arbitrary subsets of $T_m$ and $D_m$ (Pourahmadi et al. 2007). The latter is in the spirit of our method, but there the subsets must be provided by the user. This presents a significant computational challenge because there are far too many possible subsets to find the best configuration without an automated method. Daniels (2006) and Hoff (2009) propose shrinkage priors on the Cholesky terms and the eigenvectors, respectively. Guo et al. (2011) and Danaher et al. (2012) propose penalized likelihood models that induce a sparsity structure common across groups. The Danaher et al. (2012) technique also allows equality across groups at a single non-zero value. This is unsatisfactory in our case, as we expect that there may be subsets of the groups that should share parameter values at distinct non-zero values; Guo et al. (2011)

PAGE 73

provides no information across groups beyond a common graphical structure. Other authors have modeled the covariance matrices through regressions of a continuous covariate (Chiu et al. 1996; Daniels 2006; Fox & Dunson 2011; Hoff & Niu 2012), but the regression parameters in these models often lack interpretation.

In Chapter 3 (see also Gaskins & Daniels 2013), we proposed a nonparametric prior for the Cholesky terms based on the matrix stick-breaking process (Dunson et al. 2008). This methodology proposes a sparse prior that allows for matching of the $\phi_{m;jt}$ GARPs $(m = 1,\ldots,M;\ j = 1,\ldots,t-1;\ t = 2,\ldots,T)$ across all $m$ and all $j, t$ within a fixed value of $t - j$. The value $t - j$ is called the lag and represents the time distance between the response $Y_t$ and the regressor $Y_j$. The IVs $\gamma_{mt}$ may also match across groups. This prior considers an enormous model space and requires many latent variables for an efficient sampling scheme. In this chapter we simplify the set of models under consideration while still considering a rich set of sparse models with common structures across groups. This leads to faster computations than the method of Chapter 3, and hence will better accommodate data with larger dimension $T$.

In Section 4.2 we introduce our approach based on a covariance partition prior. To share information across groups, this prior specifies partitions of the groups such that those groups contained in the same set of the partition will have common values for some subset of the dependence parameters. We define a partition corresponding to each of the sequential distributions in (4) and consider the sequence of partitions to behave as a Markov chain. Section 4.3 contains details of the computational algorithm needed to generate a posterior sample for the proposed prior. Performance of the covariance partition prior is studied in Sections 4.4 and 4.5 through a simulation study and the analysis of data from a depression study. A brief discussion in Section 4.6 concludes the chapter.

PAGE 74

4.2 Covariance Partition Prior

4.2.1 Prior on the Sequence of Partitions

First, it is necessary to define some notation. Let $\mathcal{M} = \{1,\ldots,M\}$ denote the collection of all groups. We are interested in partitioning the groups into sets that share a similar dependence structure. Let $P$ denote a partition of $\mathcal{M}$ and $\mathcal{P}$ the collection of all possible $P$. The cardinality of $\mathcal{P}$ is $B_M$, the $M$th Bell number (Stanley 1997, p. 33). $B_M$ is equal to the sum of the Stirling numbers of the second kind. Any partition $P$ can be written as the collection of its sets $P = \{S_1,\ldots,S_d\}$, where each $S_i$ is non-empty, $d$ is the degree (the number of distinct sets) of $P$, $S_i \cap S_j = \emptyset$ for all $1 \le i$
PAGE 75

clustering of a Dirichlet process $\kappa(S_i) = \alpha\,(n_i!)$, where $\alpha$ is the concentration parameter and $n_i$ the degree of set $S_i$. However, our focus is on a joint prior $\pi(P_1,\ldots,P_T)$ for $T$ time-ordered partitions. We desire a prior that will encourage similar structures across $t$. If groups $m_1$ and $m_2$ are in the same set in $P_t$, and therefore share the same GARPs and IV for response $t$, it should be more likely that they are in the same set of $P_{t+1}$. To that end, we consider the sequence of partitions $\{P_1,\ldots,P_T\}$ to be a Markov process on the state space $\mathcal{P}$. By the Markov property and time-invariance, we let the distribution of $P_t$ given all previous partitions depend only on the most recent partition, that is, $\mathrm{pr}(P_t \mid P_1,\ldots,P_{t-1}) = \mathrm{pr}(P_t \mid P_{t-1}) = \mathrm{pr}(P_2 \mid P_1)$ for all $t$. Hence, the full conditional distribution of $P_t$ given all the other partitions depends only on the adjacent partitions $P_{t-1}$ and $P_{t+1}$.

To specify the transition probability $\mathrm{pr}(P_2 \mid P_1)$, we employ a commonly used metric on $\mathcal{P}$ defined by $d(P_1, P_2) = 2|P_1 \cap P_2| - |P_1| - |P_2|$, where $|P|$ gives the degree of a partition $P$ and the intersection partition is $P_1 \cap P_2 = \{S : S \neq \emptyset;\ S = S_1 \cap S_2 \text{ for some sets } S_1 \in P_1 \text{ and } S_2 \in P_2\}$ (Day 1981). Note that for two groups $m_1 \neq m_2$ to be in the same set of the partition $P_1 \cap P_2$, they must have been in the same set in both $P_1$ and $P_2$. The distance $d(P_1, P_2)$ is interpreted as the minimum number of merges and splits of the sets of $P_1$ needed to obtain $P_2$. Other metrics for $\mathcal{P}$ can be found in Arabie & Boorman (1973) or Denœud & Guénoche (2006).

Using the $d(\cdot,\cdot)$ metric, we define the closeness between the two partitions by $c_q(P_1, P_2) = [1 + \{d(P_1, P_2)\}^q]^{-1}$, where $q$ is a non-negative constant determining the relative strength of the distance. Note for all finite $q > 0$ that $c_q(P, P) = 1$ for all $P$, $c_q(P_1, P_2) \in (0, 1)$ for $P_1 \neq P_2$, and $c_q(\cdot,\cdot)$ is decreasing in $d(\cdot,\cdot)$. Define $a_q(P) = B_M^{-1} \sum_{P' \in \mathcal{P}} c_q(P, P')$ to be the attractiveness of the partition $P$, given by the average closeness of $P$ over $\mathcal{P}$. The transition probability from $P_{t-1}$ to $P_t$ $(t > 1)$

is proportional to the closeness of the partitions and is given by
$$\mathrm{pr}(P_t \mid P_{t-1}) = \frac{c_q(P_{t-1}, P_t)}{a_q(P_{t-1})\, B_M}.$$
It is easy to verify that the stationary probability of $P$ is proportional to its attractiveness $a_q(P)$. Hence, we choose the distribution for the initial partition $P_1$ to be the set of stationary probabilities $\mathrm{pr}(P_1) = a_q(P_1) / (A_q B_M)$, where $A_q = B_M^{-1} \sum_{P \in \mathcal{P}} a_q(P)$ is the average attractiveness. Because we are starting a Markov chain at its stationary distribution, the marginal probability for partition $P_t$ is
$$\mathrm{pr}(P_t) = \frac{a_q(P_t)}{A_q B_M} \qquad (P_t \in \mathcal{P},\ t = 1, \ldots, T).$$
Combining the transition probabilities with this initial distribution, the distribution of the entire partition process is given by
$$\pi(P_1, \ldots, P_T) = B_M^{-T}\, \frac{a_q(P_1)}{A_q} \prod_{t=2}^{T} \frac{c_q(P_{t-1}, P_t)}{a_q(P_{t-1})}.$$

Our partition prior clearly depends on the value of $q$. When $q = 0$, $c_0(P_1, P_2) = 1/2$ for all $P_1, P_2$ under the usual convention $0^0 = 1$. Hence, $a_0(P) = 1/2$ for all $P$, and $\mathrm{pr}(P_t \mid P_{t-1}) = \mathrm{pr}(P_t) = 1/B_M$. The previous partition provides no information about the new partition, and all partitions are equally likely. We call this case the independent-uniforms prior, because each partition is independent of the others and follows a uniform distribution over $\mathcal{P}$. Because $0^q$ is discontinuous at $q = 0$, $c_q(P_1, P_2) \to 0.5 + 0.5\, I(P_1 = P_2)$ as $q$ approaches zero. In this limit, $\mathrm{pr}(P_t \mid P_{t-1}) \to \{1 + I(P_t = P_{t-1})\} / (B_M + 1)$. The probability of moving to any particular different partition is $1/(B_M + 1)$, and remaining at the same partition is twice as likely. Since $B_M$ is large, there is little practical difference between the results at $q = 0$ and $q \to 0$. Conversely, as $q$ gets larger, $c_q(P_1, P_2) \approx 0$ if $d(P_1, P_2) > 1$. Hence, moves from $P_{t-1}$ to $P_t$ that require more than one merge or split are increasingly unlikely. For large $q$ this places a highly restrictive structure on the partition process. Hence, we only allow

Figure 4-1. Marginal probabilities for three choices of $q$ with $M = 8$ groups. These probabilities are scaled by $B_M$ and can be viewed as the ratio of marginal probabilities under $q$ and under the independent-uniforms ($q = 0$) case, i.e. $\mathrm{pr}(P \mid q) / \mathrm{pr}(P \mid q = 0)$. Partitions are ordered by increasing degree on the x-axis with no informative ordering between partitions of common degree; partitions of odd degree are in black and those of even degree are in gray.

values of $q$ between 0 and 10 so that $\mathrm{pr}(P_t \mid P_{t-1})$ is bounded away from zero for all $q$ and all $t$, $P_{t-1}$, $P_t$. We note that $\sum_{P} |\mathrm{pr}(P \mid q = 10) - \mathrm{pr}(P \mid q = \infty)| \le 0.001$, and so there is little difference between the marginal probabilities for $q$ greater than 10.

To better see the effect the choice of $q$ has on the marginal probabilities $\mathrm{pr}(P)$ of the partition prior, we plot the marginal probabilities for three choices of $q$ for $M = 8$ groups in Figure 4-1. These probabilities are scaled by $B_M = 4140$ and can be viewed as the ratio of marginal probabilities under the particular $q$ to the probability under the independent-uniforms $q = 0$ case, i.e. $\mathrm{pr}(P \mid q) / \mathrm{pr}(P \mid q = 0)$. For each $q$ the highest probability partition is $P_{pool} = \{\mathcal{M}\}$, the partition that pools all data into a single group. For smaller $q$ all marginal probabilities are relatively close, but as $q$ increases, the disparity across $P$ increases. For $q = 5$, $\max_{P, P'} \{\mathrm{pr}(P) / \mathrm{pr}(P')\} = 11.4$, but since the support is large, this difference is not so great as to leave any partitions with negligible

support. This disparity does not increase substantially after $q = 5$ as each $c_q(P_1, P_2)$ is approximately either 1, 1/2, or 0 for $q > 5$.

As it is not clear what an optimum value of $q$ would be, we let $q$ be a random variable to be sampled as part of the analysis. Because our sampling scheme will require the values of $A_q$ and $a_q(P_1), \ldots, a_q(P_T)$ for each $q$, we specify a finite support to minimize the computational complexity. To that end, we choose the sequence $Q$ from 0 to 10 with steps of 0.025 ($|Q| = 401$) to be the support of $q$. The prior for $q$ is uniform over the set $Q$.

4.2.2 Prior on the Cholesky Parameters

Given a particular partition $P_t$, groups in a common set will share common values for the dependence parameters associated with the sequential distribution $f(Y_{mit} \mid Y_{mi1}, \ldots, Y_{mi,t-1})$ from (4). These parameters are $(\phi_{mt}, \gamma_{mt})$, where $\phi_{mt}$ is empty when $t = 1$. To that end, for each set $S_{it} \in P_t = \{S_{1t}, \ldots, S_{d_t t}\}$, we associate the parameters $(\phi^\star_{it}, \gamma^\star_{it})$, so that $(\phi^\star_{it}, \gamma^\star_{it}) = (\phi_{mt}, \gamma_{mt})$ for all $m \in S_{it}$. As $(\phi_{mt}, \gamma_{mt})$ are determined by $\{(\phi^\star_{it}, \gamma^\star_{it})\}_{i=1}^{d_t}$ and $P_t$, we now specify the prior on $(\phi^\star_{it}, \gamma^\star_{it})$ conditional on the partition $P_t$.

In addition to the matching across groups that is induced by our partitions, we develop our prior to induce sparsity in the $T_m$ matrices. Under multivariate normality $\phi_{m;jt} = 0$ implies $Y_{mij}$ and $Y_{mit}$ are independent given the $Y_{mik}$'s with $k < t$, $k \ne j$.

The prior for $(\phi^\star_{it}, \gamma^\star_{it})$ conditional on the partition $P_t$ is as follows.
$$\mathrm{pr}(\kappa_{it} = k \mid P_t) = \epsilon_{kt} = \frac{\exp\{-\psi_1 I(k = 0) - \psi_2 k\}}{\sum_{l=0}^{t-1} \exp\{-\psi_1 I(l = 0) - \psi_2 l\}} \qquad (k = 0, \ldots, t-1),$$
$$\gamma^\star_{it} \mid P_t \sim \mathrm{InvGamma}(\lambda_1, \lambda_2),$$
$$\tilde\phi_{it} \mid \kappa_{it}, \gamma^\star_{it}, P_t \sim N_{\kappa_{it}}\!\left(0_{\kappa_{it}},\ \gamma^\star_{it}\, \xi^2 I_{\kappa_{it}}\right).$$
$\kappa_{it}$ determines on how many of the previous measurements the response $Y_{mit}$ is dependent (for $m \in S_{it}$). We let $\tilde\phi_{it}$ give the $\kappa_{it}$ non-zero GARPs, and so $\phi^\star_{it} = (0^\top_{t-1-\kappa_{it}}, \tilde\phi^\top_{it})^\top$ gives the full vector of GARPs corresponding to the negatives of the unconstrained elements of the $t$th column of $T_m$. For conditional conjugacy we use a normal prior for $\tilde\phi_{it}$ and the inverse gamma distribution for the innovation variances. When $t = 1$ there are no GARPs, and only the inverse gamma prior on $\gamma^\star_{it}$ is needed.

For a fully Bayesian analysis, we need hyperpriors for the hyperparameters $\xi^2$, $\lambda_1$, $\lambda_2$, $\psi_1$, and $\psi_2$. We specify $\xi^2 \sim \mathrm{InvGamma}(0.1, 0.1)$ and independent $\mathrm{Gamma}(1, 1)$ priors for the innovation variance parameters $\lambda_1, \lambda_2$. To define the hyperpriors for $\psi_1$ and $\psi_2$, we first explore the interpretation of these parameters. Algebra shows that
$$\psi_2 = \log \frac{\mathrm{pr}(\kappa_{it} = k)}{\mathrm{pr}(\kappa_{it} = k + 1)} \qquad (k = 1, \ldots, t-2;\ t = 3, \ldots, T),$$
providing an interpretation of $\psi_2$ as the log odds of $Y_{mit}$ depending on one fewer past response. To encourage sparsity and smaller values of $\kappa_{it}$, we choose an exponential distribution with rate $\log(2)$ as the prior distribution for $\psi_2$. This guarantees that this log odds is strictly positive with a prior mean of $\log(2)$. Further,
$$\psi_1 - \psi_2 k = \log \frac{\mathrm{pr}(\kappa_{it} = k)}{\mathrm{pr}(\kappa_{it} = 0)} \qquad (k = 1, \ldots, t-1;\ t = 2, \ldots, T).$$
We choose a prior of $\psi_1 \mid \psi_2 \sim \mathrm{Unif}(\psi_2, (T-1)\psi_2)$. The left endpoint represents the case that $\mathrm{pr}(\kappa_{it} = 1) = \mathrm{pr}(\kappa_{it} = 0)$, i.e. the response depending on only the most recent measurement is as likely as the response being independent of all past measurements, and the right endpoint is equivalent to $\mathrm{pr}(\kappa_{iT} = T-1) = \mathrm{pr}(\kappa_{iT} = 0)$, the final response

depending on the full history is as likely as independence from the history. Further, for any $k = 1, \ldots, T-1$, $\psi_2 k \in [\psi_2, (T-1)\psi_2]$ so it is equally likely that $\psi_1 = \psi_2 k$ for any $k$. That is, it is as likely that $Y_{mit}$ depends on the $k$ previous responses as the case that $Y_{mit}$ is independent of its history, for each choice of $k$. Simulations have demonstrated robustness to these hyperprior choices.

4.3 Sampling Algorithm

Posterior inference using the covariance partition priors will rely on a posterior sample generated from a Markov chain Monte Carlo (MCMC) algorithm. In this section we describe the Gibbs sampler necessary to obtain a posterior sample. Let $H = (q, \xi^2, \lambda_1, \lambda_2, \psi_1, \psi_2)$ denote the set of model hyperparameters, and $C_t = \{(\phi_{mt}, \gamma_{mt})\}_{m=1}^{M}$ denote the set of Cholesky parameters for the $t$th sequential distributions of all $M$ groups. Further, $C_{(-t)}$ represents the set of Cholesky parameters $C_1, \ldots, C_T$ excluding $C_t$; $P_{(-t)}$ is defined similarly. The sampling algorithm consists of steps of the form $[P_t, C_t \mid P_{(-t)}, C_{(-t)}, H, D]$ that jointly update the partition and Cholesky parameters associated with the $t$th sequential distribution given the data $D$. We perform this step in two parts by factoring
$$[P_t, C_t \mid P_{(-t)}, C_{(-t)}, H, D] = [P_t \mid P_{(-t)}, C_{(-t)}, H, D]\ [C_t \mid P_t, P_{(-t)}, C_{(-t)}, H, D].$$
Finally, we update the hyperparameters $H$ given $P_1, \ldots, P_T, C_1, \ldots, C_T, D$.

Before beginning the sampling scheme, there are some computations that should be performed and stored first. Updating the value of $q$ requires knowledge of $A_q$ and $a_q(\cdot)$ for each $P_t$. We choose to compute the full set of attractivenesses for each $q \in Q$ and all $P \in \mathcal{P}$ first. When we update $q$ in the MCMC scheme, we look up the needed values.

First, we describe the $[P_t \mid P_{(-t)}, C_{(-t)}, H, D]$ step that samples the partition $P_t$ given the remaining partitions, marginalized over the GARPs and IVs $C_t$. It can be shown that $[P_t \mid P_{(-t)}, C_{(-t)}, H, D]$ depends only on the other partitions and $q$,

$[P_t \mid P_{(-t)}, C_{(-t)}, H, D] = [P_t \mid P_{(-t)}, q, D]$; we sample from this distribution with a reversible jump MCMC step (Green 1995). Let $P_t$ denote the current value of the partition. We propose a candidate partition $P^\star_t = \{S^\star_{1t}, \ldots, S^\star_{d^\star_t t}\}$ as follows. Uniformly select $m$ from $\mathcal{M}$, and let $S' \in P_t$ denote the set of the partition that contains group $m$. The choice of the candidate $P^\star_t$ depends on the form of $S'$.

- If $S' = \mathcal{M}$, then $P^\star_t = \{\mathcal{M} \setminus \{m\}, \{m\}\}$. That is, the candidate partition splits group $m$ from $\mathcal{M}$.
- If $S' = \{m\}$, then uniformly sample $S''$ from the remaining sets in $P_t$. The candidate $P^\star_t$ is $\{S' \cup S''\} \cup (P_t \setminus \{S', S''\})$. That is, we merge the singleton set $S'$ with the set $S''$.
- If $S'$ is neither a singleton nor $\mathcal{M}$, then we uniformly sample $S''$ from $(P_t \setminus \{S'\}) \cup \{\emptyset\}$. The candidate partition $P^\star_t$ is given by $\{S' \setminus \{m\}\} \cup \{S'' \cup \{m\}\} \cup (P_t \setminus \{S', S''\})$. That is, we move the group $m$ from set $S'$ to the (possibly empty) set $S''$.

It can be shown that the corresponding ratio of transition probabilities is
$$\frac{\mathrm{pr}(P_t \to P^\star_t)}{\mathrm{pr}(P^\star_t \to P_t)} = \frac{d_t - I(\{m\} \in P_t)}{d^\star_t - I(\{m\} \in P^\star_t)}.$$
By construction, $P_t$ and $P^\star_t$ differ only in the location of group $m$. Because we are dealing with partitions on $\mathcal{M}$, a relatively small space, we are able to traverse the high probability regions of $\mathcal{P}$ by making these single step moves. For problems where the model space is larger, it may be necessary to use split-merge moves (e.g. Kim et al. 2006; Richardson & Green 1997) or a mixture of single step and split-merge moves (as in Booth et al. 2008). However, the posterior samples from our simulations and data analysis do not indicate that this is needed in our setting.

Computing the probability of accepting the move to $P^\star_t$ requires the likelihood implied by $P_t$ and $P^\star_t$ conditional on the other partitions,
$$\mathrm{lik}(P_t \mid P_{(-t)}, \text{data}) \propto \mathrm{pr}(P_t \mid P_{t-1})\, \mathrm{pr}(P_{t+1} \mid P_t) \prod_{i=1}^{d_t} \mathrm{marg}(Y_{mi}, m \in S_{it}),$$

where $\mathrm{marg}(Y_{mi}, m \in S_{it})$ is the likelihood of the data from groups in set $S_{it}$ after marginalizing out the GARPs and IVs,
$$\begin{aligned}
\mathrm{marg}(Y_{mi}, m \in S_{it}) &= \sum_{\kappa=0}^{t-1} \int_{\mathbb{R}^{\kappa}} \int_{\mathbb{R}^{+}} \Big\{ \prod_{m \in S_{it}} \prod_{i=1}^{n_m} f(Y_{mit} \mid Y_{mi1}, \ldots, Y_{mi,t-1}; \phi, \gamma) \Big\}\, \pi(\tilde\phi \mid \kappa, \gamma)\, \pi(\gamma)\, \epsilon_{\kappa t}\, d\tilde\phi\, d\gamma \\
&= \sum_{\kappa=0}^{t-1} \epsilon_{\kappa t}\, (2\pi)^{-n/2}\, \xi^{-\kappa}\, \left| \xi^{-2} I_\kappa + X_\kappa^\top X_\kappa \right|^{-1/2} \lambda_2^{\lambda_1}\, \Gamma(\lambda_1)^{-1}\, \Gamma(\lambda_1 + n/2) \\
&\qquad \times \Big[ \lambda_2 + \tfrac{1}{2}\, y^\top \big\{ I_n - X_\kappa \big( \xi^{-2} I_\kappa + X_\kappa^\top X_\kappa \big)^{-1} X_\kappa^\top \big\}\, y \Big]^{-(\lambda_1 + n/2)},
\end{aligned}$$
where $n = \sum_{m \in S_{it}} n_m$, $y$ is the $n$-vector containing $Y_{mit}$ ($i = 1, \ldots, n_m$, $m \in S_{it}$), and $X_\kappa$ is the $n \times \kappa$ matrix with rows $(Y_{mi,t-\kappa}, \ldots, Y_{mi,t-1})$. For $\kappa = 0$, the $X_0$ matrix is empty, and the summand above reduces to
$$\epsilon_{0t}\, (2\pi)^{-n/2}\, \lambda_2^{\lambda_1}\, \Gamma(\lambda_1)^{-1}\, \Gamma(\lambda_1 + n/2)\, \Big[ \lambda_2 + \tfrac{1}{2}\, y^\top y \Big]^{-(\lambda_1 + n/2)}.$$
The reversible jump MCMC step accepts the move from $P_t$ to $P^\star_t$ if
$$U \le \min\left( 1,\ \frac{\mathrm{pr}(P^\star_t \mid P_{t-1})\, \mathrm{pr}(P_{t+1} \mid P^\star_t) \prod_{i=1}^{d^\star_t} \mathrm{marg}(Y_{mi}, m \in S^\star_{it})}{\mathrm{pr}(P_t \mid P_{t-1})\, \mathrm{pr}(P_{t+1} \mid P_t) \prod_{i=1}^{d_t} \mathrm{marg}(Y_{mi}, m \in S_{it})} \cdot \frac{\mathrm{pr}(P_t \to P^\star_t)}{\mathrm{pr}(P^\star_t \to P_t)} \right)$$
for $U \sim \mathrm{Unif}(0, 1)$. The values $\mathrm{pr}(P^\star_t \mid P_{t-1})$, $\mathrm{pr}(P_{t+1} \mid P^\star_t)$, $\mathrm{pr}(P_t \mid P_{t-1})$, $\mathrm{pr}(P_{t+1} \mid P_t)$ are given by the transition probability of Section 4.2.1 (for the current value of $q$).

For the $[C_t \mid P_t, P_{(-t)}, C_{(-t)}, H, D]$ step, we can express this distribution as $\prod_{i=1}^{d_t} [(\phi^\star_{it}, \gamma^\star_{it}, \kappa_{it}) \mid P_t, H, D]$ and sample each $(\phi^\star_{it}, \gamma^\star_{it}, \kappa_{it})$ exactly from its conditional distribution. The sampling distributions are
$$\mathrm{pr}(\kappa_{it} = k \mid S_{it}, H, D) \propto \epsilon_{kt}\, \xi^{-k}\, \left| \xi^{-2} I_k + X_k^\top X_k \right|^{-1/2} \Big[ \lambda_2 + \tfrac{1}{2}\, y^\top \big\{ I_n - X_k \big( \xi^{-2} I_k + X_k^\top X_k \big)^{-1} X_k^\top \big\}\, y \Big]^{-(\lambda_1 + n/2)},$$
$$\gamma^\star_{it} \mid S_{it}, \kappa_{it}, H, D \sim \mathrm{InvGamma}\Big( \lambda_1 + \tfrac{n}{2},\ \lambda_2 + \tfrac{1}{2}\, y^\top \big\{ I_n - X_\kappa \big( \xi^{-2} I_\kappa + X_\kappa^\top X_\kappa \big)^{-1} X_\kappa^\top \big\}\, y \Big),$$
$$\tilde\phi_{it} \mid S_{it}, \kappa_{it}, \gamma^\star_{it}, H, D \sim N_{\kappa_{it}}\Big( \big( \xi^{-2} I_\kappa + X_\kappa^\top X_\kappa \big)^{-1} X_\kappa^\top y,\ \gamma^\star_{it} \big( \xi^{-2} I_\kappa + X_\kappa^\top X_\kappa \big)^{-1} \Big).$$
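The exact draw of the sparsity level, innovation variance, and non-zero GARPs for a single set can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `draw_set_parameters`, its argument names, and the use of NumPy are all hypothetical, the responses are assumed to be centered, and the columns of `X_full` are assumed to be ordered from the oldest to the most recent past response.

```python
import numpy as np

def draw_set_parameters(y, X_full, xi2, lam1, lam2, psi1, psi2, rng):
    """Draw (kappa, gamma_star, phi_star) for one set (hypothetical helper).

    y      : (n,) responses at time t, stacked over subjects in the set
    X_full : (n, t-1) earlier responses, columns ordered oldest to newest
    """
    n, t_minus_1 = X_full.shape
    log_w = np.empty(t_minus_1 + 1)
    fits = []
    for k in range(t_minus_1 + 1):
        Xk = X_full[:, t_minus_1 - k:]             # the k most recent responses
        A = np.eye(k) / xi2 + Xk.T @ Xk            # xi^{-2} I_k + X_k' X_k
        b = Xk.T @ y
        mean = np.linalg.solve(A, b) if k > 0 else np.zeros(0)
        rate = lam2 + 0.5 * (y @ y - b @ mean)     # lam2 + y'{I - X A^{-1} X'}y / 2
        logdet = np.linalg.slogdet(A)[1] if k > 0 else 0.0
        log_w[k] = (-psi1 * (k == 0) - psi2 * k    # prior on kappa, up to a constant
                    - 0.5 * k * np.log(xi2) - 0.5 * logdet
                    - (lam1 + n / 2) * np.log(rate))
        fits.append((A, mean, rate))
    w = np.exp(log_w - log_w.max())                # stabilised weights
    kappa = int(rng.choice(t_minus_1 + 1, p=w / w.sum()))
    A, mean, rate = fits[kappa]
    # Inverse gamma draw: reciprocal of a gamma draw with the same shape.
    gamma_star = 1.0 / rng.gamma(lam1 + n / 2, 1.0 / rate)
    phi_tilde = mean.copy()
    if kappa > 0:
        L = np.linalg.cholesky(A)                  # A = L L'
        z = rng.standard_normal(kappa)
        phi_tilde = mean + np.sqrt(gamma_star) * np.linalg.solve(L.T, z)
    # Pad with leading zeros so the vector has length t-1.
    phi_star = np.concatenate([np.zeros(t_minus_1 - kappa), phi_tilde])
    return kappa, gamma_star, phi_star
```

The weights are accumulated on the log scale because the bracketed term is raised to a power of order $n$, which would underflow quickly if computed directly.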

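We then set $\phi^\star_{it} = (0^\top_{t-1-\kappa_{it}}, \tilde\phi^\top_{it})^\top$ and $(\phi_{mt}, \gamma_{mt}) = (\phi^\star_{it}, \gamma^\star_{it})$ for $m \in S_{it}$. Both the marginal likelihood $\mathrm{marg}(\cdot)$ and the conditional distribution of $\kappa_{it}$ require the calculation of $t$ terms. For data with large $T$, this may adversely affect computational speed. In such cases we recommend modifying the prior on $\kappa_{it}$ to $\mathrm{pr}(\kappa_{it} = k) \propto \exp\{-\psi_1 I(k = 0) - \psi_2 k\}\, I(k \le k_0)$, where $k_0$ is a fixed value representing the maximum number of previous measurements on which $Y_{mit}$ can depend. Hence, these two quantities will only require the calculation of $\min\{t, k_0 + 1\}$ terms. This is related to the idea of banding the Cholesky matrix (Rothman et al. 2010; Wu & Pourahmadi 2003), which fixes all $\phi_{m;jt}$, $t - j > k_0$, to be zero and all $\phi_{m;jt}$, $t - j \le k_0$, to be non-zero. The addition of the $I(k \le k_0)$ term to $\mathrm{pr}(\kappa_{it} = k)$ in our model provides a more flexible structure than banding as it allows some $\phi_{m;jt} = 0$, $t - j \le k_0$.

We now update the hyperparameters $H$. For $\xi^2$ and $\lambda_2$, conjugacy gives the conditionals of
$$\xi^2 \mid \cdot \sim \mathrm{InvGamma}\Big( 0.1 + \tfrac{1}{2} \sum_{t=2}^{T} \sum_{i=1}^{d_t} \kappa_{it},\ 0.1 + \tfrac{1}{2} \sum_{t=2}^{T} \sum_{i=1}^{d_t} \frac{\phi^{\star\top}_{it} \phi^\star_{it}}{\gamma^\star_{it}} \Big),$$
$$\lambda_2 \mid \cdot \sim \mathrm{Gamma}\Big( 1 + \lambda_1 \sum_{t=1}^{T} d_t,\ 1 + \sum_{t=1}^{T} \sum_{i=1}^{d_t} (\gamma^\star_{it})^{-1} \Big).$$
The conditional distribution of $\lambda_1$ is available in closed form up to a normalizing constant,
$$\pi(\lambda_1 \mid \cdot) \propto \Gamma(\lambda_1)^{-\sum_{t=1}^{T} d_t}\ \lambda_2^{\lambda_1 \sum_{t=1}^{T} d_t}\ \exp\Big\{ -\lambda_1 \Big( 1 + \sum_{t=1}^{T} \sum_{i=1}^{d_t} \log(\gamma^\star_{it}) \Big) \Big\},$$
which can be sampled by slice sampling (Neal 2003). The conditional for $(\psi_1, \psi_2)$ does not have a standard form, but depends solely on the set of $\kappa_{it}$'s. Letting $\pi(\psi_1, \psi_2)$ denote the hyperprior, we update $(\psi_1, \psi_2)$ by slice sampling from the conditional distribution
$$\pi(\psi_1, \psi_2 \mid \cdot) \propto \pi(\psi_1, \psi_2) \prod_{t=2}^{T} \prod_{i=1}^{d_t} \frac{\exp\{-\psi_1 I(\kappa_{it} = 0) - \psi_2 \kappa_{it}\}}{\sum_{l=0}^{t-1} \exp\{-\psi_1 I(l = 0) - \psi_2 l\}}.$$
Sampling $q$ requires a Metropolis-Hastings step. We create a proposal set around the current value $q$ by $Q^\star = \{q \pm 0.025\, n : n = 1, \ldots, 50\}$, and then we draw $q^\star$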

uniformly from $Q^\star$. We accept the move to $q^\star$ if
$$U \le \min\left( 1,\ I(q^\star \in Q)\, \frac{A_q}{A_{q^\star}}\, \frac{a_{q^\star}(P_1)}{a_q(P_1)} \prod_{t=2}^{T} \frac{c_{q^\star}(P_{t-1}, P_t)}{c_q(P_{t-1}, P_t)}\, \frac{a_q(P_{t-1})}{a_{q^\star}(P_{t-1})} \right)$$
for $U \sim \mathrm{Unif}(0, 1)$. Otherwise, we retain $q$.

4.4 Simulation Study

We examine the performance of our priors using a simulation study. We simulate data consisting of $M = 8$ groups with sample sizes of $n_1 = \cdots = n_5 = 60$, $n_6 = \cdots = n_8 = 30$. Because there is less data for the final three groups, we expect that leveraging information across groups will aid in the estimation especially for these smaller groups. Observations are measured at $T = 10$ time points. The true partitions chosen are $P_1 = \{\mathcal{M}\}$, $P_2 = \cdots = P_5 = \{\{1,2,3,4\}, \{5,6,7,8\}\}$, $P_6 = \cdots = P_8 = \{\{1,2,3,4\}, \{5,6\}, \{7,8\}\}$, and $P_9 = P_{10} = \{\{1,2\}, \{3,4\}, \{5,6\}, \{7,8\}\}$. Given the partitions, the true values of $T_m$ and $D_m$ are specified to represent a longitudinal data scenario with serial correlation. The true covariance matrices have a sparse structure by specifying $\kappa_{it}$ to be between 1 and 3, most often 2. Exact details are provided in Appendix F.

Under this data distribution, we generate 50 data sets and run an MCMC chain for each using the algorithm of Section 4.3. After a burn-in of 3000 iterations, the chain runs for 9000 iterations, and we retain every tenth for posterior inference. To measure the performance of the resulting estimators, we use the loss functions
$$L_{all}(\Sigma, \hat\Sigma) = L_{all}(\{\Sigma_1, \ldots, \Sigma_M\}, \{\hat\Sigma_1, \ldots, \hat\Sigma_M\}) = \sum_{m=1}^{M} \frac{n_m}{N}\, L(\Sigma_m, \hat\Sigma_m),$$
$$L(\Sigma_m, \hat\Sigma_m) = \mathrm{tr}(\Sigma_m^{-1} \hat\Sigma_m) - \log |\Sigma_m^{-1} \hat\Sigma_m| - T,$$
where $N = \sum_m n_m$, and $\hat\Sigma_m = E_{post}(T_m D_m T_m^\top)^{-1}$ is the Bayes estimator for group $m$ (Yang & Berger 1994). $L(\Sigma_m, \hat\Sigma_m)$ is the standard log-likelihood loss function for a single covariance estimator, and $L_{all}(\Sigma, \hat\Sigma)$ measures the loss to estimating the collection of covariance matrices by taking a weighted average with weights proportional to group

Table 4-1. Risk estimates from simulation study.

| Label in Fig. 4-2 | Partition Prior | Cholesky Prior | Risk under $L_{all}(\Sigma, \hat\Sigma)$ | Risk under $L(\Sigma_8, \hat\Sigma_8)$ |
|---|---|---|---|---|
| 1 | cov. partition prior | sparsity | 0.304 | 0.355 |
| 2 | indep.-uniforms prior | sparsity | 0.347 | 0.450 |
| 3 | indep.-uniforms prior | non-sparse | 0.463 | 0.535 |
| 4 | cov. partition prior | non-sparse | 0.466 | 0.512 |
| 5 | $P_t = P_{ind}$ | sparsity | 0.515 | 0.691 |
| 6 | $P_t = P_{pool}$ | sparsity | 0.769 | 1.158 |
| 7 | $P_t = P_{ind}$ | non-sparse | 0.780 | 1.059 |
| 8 | $P_t = P_{pool}$ | non-sparse | 0.820 | 1.187 |

size. The estimated risk of the covariance partition prior method is the average loss over the 50 data sets.

We compare the covariance partition priors with several commonly used methods. Let $P_{ind} = \{\{1\}, \ldots, \{M\}\}$ represent the partition where each group is in a singleton set, corresponding to the case where no information is shared across groups, and recall $P_{pool} = \{\mathcal{M}\}$ pools the data into a single group. We consider a total of eight prior structures. The distribution on the partitions is one of four choices: the joint partition prior of Section 4.2.1 with the prior for $q$ uniform on $Q$, called the 'covariance partition prior'; fixing $q = 0$, the 'independent-uniforms prior'; $P_t$ fixed as $P_{ind}$ for all $t$; $P_t$ fixed as $P_{pool}$ for all $t$. With each of these four structures on the partitions, we consider two priors on the Cholesky parameters: sparsity is induced according to the prior of Section 4.2.2, or a non-sparse choice is made by fixing $\kappa_{it} = t - 1$. Table 4-1 and Figure 4-2 contain the estimated risk of estimating the collection of covariance matrices and the risk for estimating the final group under each of these prior specifications.

It is clear that estimates based on partition priors have lower risk. Neither of the $P_{pool}$ priors works well because they produce inconsistent estimators. Further, the $P_{ind}$ prior without sparsity has too many parameters to estimate efficiently; adding sparsity alleviates this somewhat. The covariance partition prior with sparsity shows the best performance; the risk of the $P_{ind}$ non-sparse choice is 2.5 times larger. The risk of

Table 4-2. Model selection statistics for depression study.

| Partition Prior | Cholesky Prior | Dev | pD | DIC |
|---|---|---|---|---|
| cov. partition prior | non-sparse | 39,464 | 711 | 40,887 |
| cov. partition prior | sparsity | 39,514 | 715 | 40,945 |
| $P_t = P_{pool}$ | non-sparse | 40,258 | 345 | 40,949 |
| $P_t = P_{pool}$ | sparsity | 40,285 | 359 | 41,004 |
| indep.-uniforms prior | non-sparse | 39,348 | 836 | 41,020 |
| indep.-uniforms prior | sparsity | 39,435 | 831 | 41,096 |
| $P_t = P_{ind}$ | non-sparse | 39,051 | 1164 | 41,381 |
| $P_t = P_{ind}$ | sparsity | 39,183 | 1159 | 41,501 |

missing values given the observed measurements and current parameter values. Each group is assumed to have a unique, unstructured mean vector, which is given a flat prior.

An MCMC analysis is run for each of the eight prior specifications from the risk simulation. Again, a single chain is run for 9000 iterations after a burn-in of 3000, and the values from every tenth iteration are retained for posterior inference. With the increased dimension of this data ($T = 17$), the sparse covariance partition prior can sample around 4.5 iterations in the time necessary for the covariance grouping priors (Gaskins & Daniels 2013) method to sample one. The partition prior with the non-sparse Cholesky parameterization samples about 4 times faster than the sparse version.

To determine the model that best fits the data, we use the deviance information criterion (DIC; Spiegelhalter et al. 2002). The DIC is calculated as DIC = Dev + 2pD, where Dev is the deviance evaluated at the posterior expectation of the parameter values and pD is a model complexity term. This pD is computed as the difference between the expected deviance and Dev, and the term is often interpreted as the effective number of model parameters. We use the observed data likelihood to compute the deviances (Wang & Daniels 2011). Models with smaller DIC are considered to better balance the model fit and complexity. Table 4-2 contains the model comparison statistics.
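The relation DIC = Dev + 2pD can be illustrated with a short sketch. The helper `dic_from_deviances` and its argument names are hypothetical and are not taken from the dissertation; the dictionary simply restates the rows of Table 4-2, and the loop checks that each reported DIC agrees with Dev + 2pD up to the rounding of the published values.

```python
def dic_from_deviances(deviance_draws, deviance_at_mean):
    """Return (Dev, pD, DIC) from posterior deviance draws (hypothetical helper).

    deviance_draws   : deviance evaluated at each retained MCMC draw
    deviance_at_mean : deviance evaluated at the posterior mean of the parameters
    """
    expected_deviance = sum(deviance_draws) / len(deviance_draws)
    p_d = expected_deviance - deviance_at_mean      # complexity term
    return deviance_at_mean, p_d, deviance_at_mean + 2 * p_d

# Rows of Table 4-2: (Dev, pD, reported DIC).
table_4_2 = {
    ("cov. partition", "non-sparse"): (39464, 711, 40887),
    ("cov. partition", "sparsity"): (39514, 715, 40945),
    ("P_pool", "non-sparse"): (40258, 345, 40949),
    ("P_pool", "sparsity"): (40285, 359, 41004),
    ("indep.-uniforms", "non-sparse"): (39348, 836, 41020),
    ("indep.-uniforms", "sparsity"): (39435, 831, 41096),
    ("P_ind", "non-sparse"): (39051, 1164, 41381),
    ("P_ind", "sparsity"): (39183, 1159, 41501),
}

# Largest gap between Dev + 2 pD and the reported DIC; the table entries are
# rounded, so a gap of a unit or two is expected.
max_gap = max(abs(dev + 2 * p_d - dic) for dev, p_d, dic in table_4_2.values())
```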

The covariance partition prior with the non-sparse Cholesky structure provides the overall best model fit. The $P_{pool}$ models perform second best by utilizing fewer parameters than the other models but at a cost of a poorer model fit Dev. The independent-uniforms prior shows similar model fit to the covariance partition prior but uses about 120 more parameters, as the degree of the partitions $P_1, \ldots, P_T$ tends to be larger with the independent-uniforms prior than under the prior with random $q$. The $P_{ind}$ prior performs worst due to the large number of parameters required.

We note that for this particular data set our specialized sparsity structure does not provide an improvement in model fit. This is clearly seen since the DIC values are larger than under the non-sparse prior for each of the four partition priors. Additionally, we have comparable values of pD under both Cholesky choices, indicating that for most iterations the $T_m$ matrices are full or close to full. Further inspection of the parameter estimates indicates that restricting sparsity to the form $f(Y_{mit} \mid Y_{mi1}, \ldots, Y_{mi,t-1}) = f(Y_{mit} \mid Y_{mi,t-k}, \ldots, Y_{mi,t-1})$ is not well supported by this data. It is possible to exploit sparsity in the columns of $T_m$ (as was shown in Section 3.7), but the appropriate sparse structure requires zeros spread throughout the columns instead of just in the first $t - k - 1$ positions.

We conclude the discussion of the depression data by examining the structure that the covariance partition prior induces for this data under the best fitting model, the covariance partition prior with the non-sparse Cholesky structure. The posterior mean of $q$, the parameter determining the smoothness of the stochastic process of partitions, is 8.09 with a 95% credible interval of (5.53, 9.95). Figure 4-3 shows the clustering properties of our prior at baseline through week 11; the remaining, undisplayed weeks behave similarly to week 8. Each panel depicts the pairwise probability that $m_1$ and $m_2$ are grouped together at time $t$ (week $t - 1$), that is, the probability that there exists a set $S \in P_t$ with $\{m_1, m_2\} \subseteq S$. We note that at baseline and the next two weeks groups are split by initial severity with a low severity set $\{1, 2, 5, 6\}$ and a high severity set

Figure 4-3. The posterior probability that $m_1$ and $m_2$ are in the same set of the partition $P_t$ for baseline through week 11 ($t = 1, \ldots, 12$).

$\{3, 4, 7, 8\}$. From weeks 2 through 7 there is less stability in the partitions, but from week 8 on, there are two strong clusters: one contains groups 4 (drug, high severity, male), 5 (no drug, low severity, female), and 6 (drug, low severity, female) and the other contains groups 2 (drug, low severity, male), 7 (no drug, high severity, female), and 8 (drug, high severity, female). The partition locations of groups 1 and 3 (no drug, low/high severity, male) are less stable as $t$ varies. While our model makes use of the partitions primarily as a tool for sharing information across groups, the matching structure that our model captures may be

CHAPTER 5
CONCLUSIONS AND FUTURE WORK

In this dissertation we have developed three novel methods to estimate the dependence structures in longitudinal data. In Chapter 2 we introduced Bayesian prior distributions for the correlation matrix $R$ that induce sparsity through the partial autocorrelations (Daniels & Pourahmadi 2009). By allowing shrinkage or selection on the PACs, we improve estimation by working in a lower-dimensional space. In the next two chapters, we considered the problem of simultaneous covariance estimation. Chapter 3 introduces a non-parametric model through the matrix stick-breaking process (Dunson et al. 2008) that allows equality of parameters across groups and across GARPs of a common lag. In Chapter 4 we propose a method relying on partitioning the group labels at each time point, so that the sequential distributions coincide for groups in the same set of the partition. Both prior distributions are formed in terms of the modified Cholesky parameterization (Pourahmadi 1999) and favor sparse choices of the $T(\phi)$ matrix.

In terms of future research directions, the PAC parameterization of $R$ is quite intuitive but has yet to attract a lot of research for longitudinal data. In particular, algorithms for more efficient computations of the posterior are needed, especially for correlation matrices of larger dimension. Also, models that seek structure in the PACs beyond sparsity can be developed, such as shrinking or clustering PACs within lag. Additional models that impose sparsity through the marginal or partial correlations, beyond Pitt et al. (2006), could be developed.

Regarding the simultaneous covariance estimation problem, there are a number of extensions and future directions to explore. Possible extensions of the grouping and covariance partition models include situations where the dimension of $\Sigma_m$ changes across groups or where the time between measurements differs across groups. Often longitudinal data are not measured at fixed times, and so one should consider what

kind of methodology is needed to allow sharing of information across groups for these situations. In both Chapters 3 and 4 we consider the groups as exchangeable, but depending on context, there may be an informative ordering to the groups, such as increasing levels of treatment, that should be exploited. Rather than relying on groupings of the GARP and IV parameters that are equal, models could be developed that rely on targeted shrinkage. Further, the idea of using a joint distribution on partitions $(P_1, \ldots, P_T)$ is potentially applicable in a variety of contexts outside of covariance estimation when simultaneous clustering is required.

APPENDIX A
CALCULATING THE DIC STATISTIC FOR CTQ DATA

We describe in this appendix the details involved in approximating the DIC term used for model comparison of the CTQ I data, and in particular, the estimation of the integral in (2) introduced in Section 2.6. First, we introduce notation. Let $\theta = (\beta, R)$ be the set of parameters, $\hat\theta = (\hat\beta, \hat R)$ the set of the posterior estimates, and $\theta^g$ the value of $\theta$ at the $g$-th iteration of the Markov chain ($g = 1, \ldots, G$). The function $I_i(Y) = I\{Q_{it} Y_t \ge 0\ \forall t\}$ indicates whether $Y$ is a set of latent variables whose signs agree with $Q_i$. Define $p_i(\theta)$ to be the probability of observing $Q_i$ under the parameters $\theta$. As in (2), this is
$$p_i(\theta) = p_i(\beta, R) = \int_{(-\infty, \infty)^J} I_i(y)\, \phi(y \mid \theta)\, dy,$$
where $\phi(\cdot \mid \theta)$ is the multivariate normal density with mean $X_i \beta$ and covariance matrix $R$ when $\theta = (\beta, R)$. Hence, $\mathrm{loglik}(\theta \mid Q_i) = \log p_i(\theta)$. As previously noted, this integral is intractable.

Using the definitions of the deviance and complexity parameter (equations (2) and (2)) and the new notation, we can write DIC as the sum of the contributions $\mathrm{DIC}_i$ for each patient,
$$\mathrm{DIC} = \sum_i \mathrm{DIC}_i = \sum_i \left[ 2 \log p_i(\hat\theta) - 4 E\{\log p_i(\theta)\} \right].$$
As observations $Q_i$ are independent, it suffices to consider the per patient contribution $\mathrm{DIC}_i$. Note that the expectation in the final term is with respect to the posterior distribution of the parameters $\theta$ and will be estimated by its average over the values from the posterior sample $\theta^1, \ldots, \theta^G$. Additionally, we will have to approximate $p_i(\theta)$ with some estimate $\hat p_i(\theta)$. So,
$$\widehat{\mathrm{DIC}}_i = 2 \log \hat p_i(\hat\theta) - 4 G^{-1} \sum_{g=1}^{G} \log \hat p_i(\theta^g).$$
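The per patient form of the DIC follows in two lines from the usual definitions, and the step is worth writing out. The sketch below assumes that the deviance and complexity term take the standard form described for the DIC in Chapter 4 (Dev is the deviance at the posterior estimate, pD is the expected deviance minus Dev); the symbols $\mathrm{Dev}_i$ and $p_{D,i}$ are introduced here only for illustration.

```latex
\begin{aligned}
\mathrm{Dev}_i &= -2 \log p_i(\hat\theta), \qquad
p_{D,i} = E\{-2 \log p_i(\theta)\} - \mathrm{Dev}_i
        = -2 E\{\log p_i(\theta)\} + 2 \log p_i(\hat\theta), \\
\mathrm{DIC}_i &= \mathrm{Dev}_i + 2\, p_{D,i}
        = -2 \log p_i(\hat\theta) - 4 E\{\log p_i(\theta)\} + 4 \log p_i(\hat\theta)
        = 2 \log p_i(\hat\theta) - 4 E\{\log p_i(\theta)\}.
\end{aligned}
```

Replacing the posterior expectation by the average over $\theta^1, \ldots, \theta^G$ and $p_i$ by $\hat p_i$ gives the estimator $\widehat{\mathrm{DIC}}_i$.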

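Note that calculation of DIC will require $(G + 1)$ estimates of the integral $p_i(\theta)$ for each $i = 1, \ldots, N$. To evaluate this integral we use importance sampling (Robert & Casella 2004, Section 3.3). We take as our sampling density $t(\cdot \mid \tilde\theta)$, the multivariate $t$-distribution with 5 degrees of freedom, location parameter $X_i \hat\beta$, and scale matrix $k \hat R$ for some constant $k > 1$. We define $\tilde\theta = (\hat\beta, k \hat R)$ to be the set of parameters for the sampling distribution. We choose the $t$-distribution so that $t(\cdot \mid \tilde\theta)$ will have heavier tails than $\phi(\cdot \mid \theta^g)$ for $g = 1, \ldots, G$ and $\hat\theta$. This also motivates the choice to use a scale matrix that is an inflated version of $\hat R$. Note that we can write $p_i(\theta)$ as
$$p_i(\theta) = \int_{(-\infty, \infty)^J} I_i(y)\, \phi(y \mid \theta)\, dy = \int_{(-\infty, \infty)^J} I_i(z)\, \frac{\phi(z \mid \theta)}{t(z \mid \tilde\theta)}\, t(z \mid \tilde\theta)\, dz.$$
To estimate this, draw $Z_1, \ldots, Z_{H_i}$ independently from $t(\cdot \mid \tilde\theta)$, and
$$\hat p_i(\theta) = H_i^{-1} \sum_{h=1}^{H_i} I_i(Z_h)\, \frac{\phi(Z_h \mid \theta)}{t(Z_h \mid \tilde\theta)}$$
is an unbiased and consistent estimator of $p_i(\theta)$.

Evaluation of $\widehat{\mathrm{DIC}}_i$ involves, for each $g = 1, \ldots, G$, simulating a data set $Z = \{Z_h\}_{h=1}^{H_i}$ and calculating $\hat p_i(\theta^g)$, followed by drawing a final $Z$ to estimate $\hat p_i(\hat\theta)$. Drawing $G + 1$ independent data sets turns out to be computationally slow. Instead, we will draw a single sample $Z$ to use to calculate all $\hat p_i(\theta^1), \ldots, \hat p_i(\theta^G), \hat p_i(\hat\theta)$. It is clear that these estimates remain unbiased and consistent. What remains is to consider what effect this will have on the variability of the individual contributions to the DIC, $\widehat{\mathrm{DIC}}_i$.

First we derive the variance of $\widehat{\mathrm{DIC}}_i$ in the situation where we draw a new data set $Z$ for each $\hat p_i(\theta^g)$. In this case,
$$\mathrm{Var}\{\widehat{\mathrm{DIC}}_i\} = 4\, \mathrm{Var}\{\log \hat p_i(\hat\theta)\} + 16\, G^{-2} \sum_g \mathrm{Var}\{\log \hat p_i(\theta^g)\},$$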

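where the expectation (in the variance) is with respect to the sampling distribution of $Z$. For any $\theta^g$ or $\hat\theta$, this variance is
$$\begin{aligned}
\mathrm{Var}\{\log \hat p_i(\theta)\} &\approx p_i(\theta)^{-2}\, \mathrm{Var}\{\hat p_i(\theta)\} \\
&= p_i(\theta)^{-2} H_i^{-1}\, \mathrm{Var}\left\{ I_i(Z_1)\, \frac{\phi(Z_1 \mid \theta)}{t(Z_1 \mid \tilde\theta)} \right\} \\
&= p_i(\theta)^{-2} H_i^{-1} \left[ E\left\{ I_i(Z_1)\, \frac{\phi(Z_1 \mid \theta)^2}{t(Z_1 \mid \tilde\theta)^2} \right\} - p_i(\theta)^2 \right],
\end{aligned}$$
where the approximation in the first line is due to the delta method. This quantity can be consistently estimated by
$$\widehat{\mathrm{Var}}\{\log \hat p_i(\theta)\} = \hat p_i(\theta)^{-2} H_i^{-1} \left[ H_i^{-1} \sum_{h=1}^{H_i} I_i(Z_h)\, \frac{\phi(Z_h \mid \theta)^2}{t(Z_h \mid \tilde\theta)^2} - \hat p_i(\theta)^2 \right].$$
To calculate the variance of $\widehat{\mathrm{DIC}}_i$ under our sampling scheme with a single sample $Z$, note
$$\begin{aligned}
\mathrm{Var}\{\widehat{\mathrm{DIC}}_i\} &= 4\, \mathrm{Var}\{\log \hat p_i(\hat\theta)\} + 16\, G^{-2} \sum_g \mathrm{Var}\{\log \hat p_i(\theta^g)\} \\
&\quad + 16\, G^{-2} \sum_g \sum_{g' \ne g} \mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\theta^{g'})\} \\
&\quad - 16\, G^{-1} \sum_g \mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\hat\theta)\}.
\end{aligned}$$
The quantities on the second and third lines of this expression represent the additional terms due to sharing the data set $Z$ across calculations of $\hat p_i(\theta)$. Define $\mathrm{COV}_i$ to be the sum of these covariance terms. We may write $\mathrm{COV}_i$ as
$$\mathrm{COV}_i = 16\, G^{-2} \sum_g \sum_{g' \ne g} \left[ \mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\theta^{g'})\} - \frac{G}{G-1}\, \mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\hat\theta)\} \right].$$
From the delta method we have
$$\mathrm{Cov}\{\log \hat p_i(\theta^g), \log \hat p_i(\theta)\} \approx \mathrm{Cov}\left\{ \frac{\hat p_i(\theta^g)}{p_i(\theta^g)}, \frac{\hat p_i(\theta)}{p_i(\theta)} \right\},$$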

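and so
$$\begin{aligned}
\mathrm{COV}_i &\approx 16\, G^{-2} \sum_g \sum_{g' \ne g} \left[ \mathrm{Cov}\left\{ \frac{\hat p_i(\theta^g)}{p_i(\theta^g)}, \frac{\hat p_i(\theta^{g'})}{p_i(\theta^{g'})} \right\} - \frac{G}{G-1}\, \mathrm{Cov}\left\{ \frac{\hat p_i(\theta^g)}{p_i(\theta^g)}, \frac{\hat p_i(\hat\theta)}{p_i(\hat\theta)} \right\} \right] \\
&= 16\, G^{-2} \sum_g \sum_{g' \ne g} \mathrm{Cov}\left\{ \frac{\hat p_i(\theta^g)}{p_i(\theta^g)},\ \frac{\hat p_i(\theta^{g'})}{p_i(\theta^{g'})} - \frac{G}{G-1}\, \frac{\hat p_i(\hat\theta)}{p_i(\hat\theta)} \right\} \\
&= 16\, G^{-2} H_i^{-1} \sum_g \sum_{g' \ne g} \mathrm{Cov}\left\{ I_i(Z_1)\, \frac{\phi(Z_1 \mid \theta^g)}{p_i(\theta^g)\, t(Z_1 \mid \tilde\theta)},\ I_i(Z_1) \left( \frac{\phi(Z_1 \mid \theta^{g'})}{p_i(\theta^{g'})\, t(Z_1 \mid \tilde\theta)} - \frac{G}{G-1}\, \frac{\phi(Z_1 \mid \hat\theta)}{p_i(\hat\theta)\, t(Z_1 \mid \tilde\theta)} \right) \right\} \\
&= 16\, G^{-2} H_i^{-1} \sum_g \sum_{g' \ne g} \left[ \frac{G}{G-1} - 1 + E\left\{ I_i(Z_1)\, \frac{\phi(Z_1 \mid \theta^g)}{p_i(\theta^g)\, t(Z_1 \mid \tilde\theta)} \left( \frac{\phi(Z_1 \mid \theta^{g'})}{p_i(\theta^{g'})\, t(Z_1 \mid \tilde\theta)} - \frac{G}{G-1}\, \frac{\phi(Z_1 \mid \hat\theta)}{p_i(\hat\theta)\, t(Z_1 \mid \tilde\theta)} \right) \right\} \right].
\end{aligned}$$
As long as this quantity is small (relative to the independence variance estimator), we may save computational time by only sampling one data set $Z$ without sacrificing precision.

In the situation of the analysis of the CTQ data, estimation of $\mathrm{COV}_i$ with a representative sample of observations showed $\mathrm{COV}_i$ to be small relative to the independence variance estimator. In fact, this term is often negative, indicating that using the common data set $Z$ may improve estimation for some observations $i$. In the representative sample we considered, the addition of the $\mathrm{COV}_i$ term tended to lead to changes in the standard error of $\widehat{\mathrm{DIC}}_i$ ranging from a decrease of 5% to an increase of 25%. As it is computationally infeasible to compute $\mathrm{COV}_i$ for all observations (a nested loop over $g'$ inside a loop over $g$), we estimate the standard error of the DIC estimate using the independence estimator obtained from the variance decomposition for independent data sets and the estimator $\widehat{\mathrm{Var}}\{\log \hat p_i(\theta)\}$. The Dev, pD, and DIC estimates in Table 2-4 are computed in this way.

The importance sampling size $H_i$ is chosen by first drawing 200,000 values of $Z_h \sim t(\cdot \mid \tilde\theta)$, where variance scaling factor $k$ is 1.52. This choice of $k$ is made with consideration to the dimension $J$ of $Z$ (increasing $J$ should correspond to increasing $k$), how far from the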

origin $X_i \hat\beta$ tends to be, how likely $I_i(Z)$ is to be one, among other considerations; ultimately, trial-and-error with small choices of $H_i$ led us to conclude that $k = 1.52$ works reasonably well. Having drawn 200,000 values of $Z_h$, if $\sum_h I_i(Z_h) \ge 2000$ (i.e., at least 2000 of the $Z_h$'s have signs appropriate for a latent variable of $Q_i$), then $H_i = 200{,}000$ and this set $\{Z_h\}_{h=1}^{200{,}000}$ is the importance sample $Z$. If not, we continue to draw additional sets of 200,000 $Z_h$'s to append to the data set until $\sum_h I_i(Z_h) \ge 2000$. This implies that we have larger samples for those patients $i$ with small values of $p_i(\theta)$. This helps to control the variance of $\mathrm{DIC}_i$ since the term is preceded by $p_i(\theta)^{-2}$ (see the expression for $\mathrm{Var}\{\log \hat p_i(\theta)\}$). With this scheme, we estimate the standard errors for our DIC estimates in Table 2-4 to be around 0.5.
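The importance sampling estimator with a single shared sample can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the CTQ analysis: all function names are hypothetical, NumPy is assumed, and the inflation factor, dimension, and sample size in the usage lines are illustrative values only. The bivariate test case is chosen because its orthant probability is known exactly ($1/4 + \arcsin(0.5)/(2\pi) = 1/3$).

```python
import numpy as np
from math import lgamma, log, pi

def log_mvn_pdf(Z, mean, cov):
    """Log multivariate normal density at each row of Z."""
    J = Z.shape[1]
    diff = Z - mean
    d2 = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    return -0.5 * (J * log(2 * pi) + np.linalg.slogdet(cov)[1] + d2)

def log_mvt_pdf(Z, loc, scale, df):
    """Log multivariate t density at each row of Z."""
    J = Z.shape[1]
    diff = Z - loc
    d2 = np.sum(diff * np.linalg.solve(scale, diff.T).T, axis=1)
    const = (lgamma((df + J) / 2) - lgamma(df / 2)
             - 0.5 * J * log(df * pi) - 0.5 * np.linalg.slogdet(scale)[1])
    return const - 0.5 * (df + J) * np.log1p(d2 / df)

def draw_mvt(rng, loc, scale, df, size):
    """Draw rows from a multivariate t as a normal / chi-square mixture."""
    L = np.linalg.cholesky(scale)
    normals = rng.standard_normal((size, len(loc))) @ L.T
    chi2 = rng.chisquare(df, size=size)
    return loc + normals / np.sqrt(chi2 / df)[:, None]

def p_hat(Z, log_t, signs, mean, cov):
    """Importance sampling estimate of the probability that the signs of a
    N(mean, cov) vector agree with `signs`, reusing the proposal draws Z."""
    agree = np.all(signs * Z >= 0, axis=1)
    weights = np.exp(log_mvn_pdf(Z, mean, cov) - log_t)
    return float(np.mean(agree * weights))

# Illustrative usage: one shared sample Z serves several parameter values.
rng = np.random.default_rng(0)
R = np.array([[1.0, 0.5], [0.5, 1.0]])
loc = np.zeros(2)
scale = 2.25 * R                      # inflated scale matrix (illustrative k)
Z = draw_mvt(rng, loc, scale, 5, 200_000)
log_t = log_mvt_pdf(Z, loc, scale, 5)
signs = np.array([1, 1])
estimate_corr = p_hat(Z, log_t, signs, loc, R)           # exact value is 1/3
estimate_indep = p_hat(Z, log_t, signs, loc, np.eye(2))  # exact value is 1/4
```

Because the proposal density is evaluated once and stored in `log_t`, each additional parameter value costs only one normal density evaluation over the same draws, which is the saving described above.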

APPENDIX B
ADDITIONAL COVARIANCE GROUPING PRIORS AND THEIR PROPERTIES

B.1 Sparsity Grouping Prior for $\Phi$

We introduce three additional grouping prior specifications to the one introduced in Section 3.3, two for the autoregressive parameters and one for the innovation variances. These three priors follow closely the matrix stick-breaking process (Dunson et al. 2008). Here we rely on fewer longitudinal assumptions in the construction of these models, and so they may be viewed as a middle ground between the lag-block and correlated-lognormal priors and the naive Bayes priors used in Sections 3.6 and 3.7.

The sparsity grouping prior is defined by replacing equations (3) and (3) from the lag-block grouping prior with
$$\phi_{mj} \sim F_{mj}(\cdot) = \sum_{h=1}^{H} \pi_{mjh}\, \delta_{\theta_{jh}}(\cdot) \qquad (m = 1, \ldots, M;\ j = 1, \ldots, J),$$
$$\theta_{jh} \sim \epsilon_{q(j)}\, \delta_0 + (1 - \epsilon_{q(j)})\, N(0, \sigma^2) \qquad (j = 1, \ldots, J;\ h = 1, \ldots, H).$$
The key distinction is that in this specification we sample candidates $\theta_{jh}$ for each autoregressive parameter $j$. Previously we had a set of candidates $\theta_{qh}$ for all autoregressive parameters within lag. Because the candidates are distinct across lags, this prior will no longer allow clustering of autoregressive parameters across lags, except for common zero values. As before, we use a zero-normal mixture for the base distributions of the candidates to encourage sparsity.

This sparsity grouping prior for $\Phi$ is defined similarly to the matrix stick-breaking process, with the key difference being that Dunson et al. (2008) specify that the base distribution of the $\theta_{jh}$'s be nonatomic. This is not the case with our prior since we use a distribution for the candidates that contains a point mass at zero. This does not lead to a problem, but it does alter some of the theoretic properties.

When sampling from the sparsity grouping prior, properties (3)-(3) of the lag-block prior continue to hold. This means that the distribution's correlation,

$\mathrm{corr}\{F_{mj}(A), F_{m'j}(A)\}$, and clustering properties, $\mathrm{pr}(\phi_{mj} = \phi_{m'j})$, across groups within the same autoregressive parameter continue to hold. However, properties (3) and (3), which consider behavior across autoregressive parameters $j \ne j'$ within the same lag, change. It is now the case that $\mathrm{corr}\{F_{mj}(A), F_{mj'}(A)\} = \mathrm{corr}\{F_{mj}(A), F_{m'j'}(A)\} = 0$. Additionally, $\mathrm{pr}(\phi_{mj} = \phi_{mj'}) = \mathrm{pr}(\phi_{mj} = \phi_{m'j'}) = \epsilon_q^2$, the probability that both are independently set to zero. Hence, this prior provides a less rich grouping structure than the lag-block prior.

Recall that in the special case $\alpha \to 0$ and $\beta \to \infty$, each of the model parameters is sampled independently from the candidate base distribution. That is, $\phi_{mj}$ is distributed as equation (3), which is the naive Bayes 1 prior that we used for the risk simulation and data analysis. Hence, the sparsity grouping prior subsumes this naive model as a special case.

B.2 Non-Sparse Grouping Prior for $\Phi$

The non-sparse grouping prior is a slight simplification of the sparsity grouping prior that no longer encourages sparsity. To form this prior, we use the sparsity grouping prior and replace equation (B) with $\theta_{jh} \sim N(0, \sigma^2)$. Having removed the zero point mass from the $\theta$-level, $\phi_{mj}$ is almost surely non-zero. Thus, we no longer allow for conditional independence relationships or gain sparsity in the $T(\phi)$ matrix, but by fixing the mean at zero for the distribution of the candidates, we still encourage shrinkage toward independence. This prior follows exactly the framework of Dunson et al. (2008).

We additionally point out that the non-sparse grouping prior is equivalent to the sparsity grouping prior if we fix $\epsilon_q = 0$ for all $q$. Hence, the respective properties for this prior are obtained by substituting $\epsilon = 0$, so that the zero-normal mixture base measure reduces to its normal component, into the properties of the sparsity grouping prior. Because the non-sparse prior is a matrix stick-breaking process prior with a non-atomic base distribution, the properties may

alternatively be taken from Propositions 1, 2, and 4 in Dunson et al. (2008). Additionally, the naive Bayes 2 prior is the special case of this non-sparse grouping prior when $\alpha \to 0$ and $\beta \to \infty$.

B.3 InvGamma Grouping Prior for $\Gamma$

We construct an additional prior for the innovation variances differing from the correlated-lognormal prior in two key ways. First, we do not strive to treat the innovation variances as coming from a smooth function. In the previous prior the variance candidates $\xi_{jh}$ were sampled from a correlated-lognormal distribution, but we now treat the candidates independently across $j$. Second, because we choose independent candidates, we choose the conjugate inverse gamma for the base distribution instead of the lognormal. The InvGamma grouping prior is defined by replacing lines (3) and (3) with

$$\gamma_{mj} \sim G_{mj}(\cdot) = \sum_{h=1}^{H} \pi^{*}_{mjh}\,\delta_{\xi_{jh}}(\cdot) \quad (m = 1,\dots,M;\ j = 1,\dots,p),$$
$$\xi_{jh} \sim \mathrm{InvGamma}(\lambda_1, \lambda_2) \quad (j = 1,\dots,p;\ h = 1,\dots,H). \qquad \text{(B)}$$

We no longer require any $\omega_h$ in (3) because we do not wish to induce a correlation among the $\xi$'s. As with the non-sparse prior, this is a special case of the matrix stick-breaking process.

We compare this prior with the correlated-lognormal prior in the special case where $\rho = 0$. Recall that if $\rho = 0$, then the candidates $\xi_{jh}$ are independent variables from the lognormal$(\psi, \tau)$ distribution, and the correlated-lognormal prior gives a matrix stick-breaking process. Between these two priors, the InvGamma grouping prior and the correlated-lognormal grouping prior with $\rho = 0$, we recommend using the InvGamma because of the conjugacy that is obtained. The benefit of using the normal (equivalently, lognormal) distribution for $\omega$ ($\xi$) is that inducing the correlation inside a cluster is straightforward, a gain that outweighs the loss of conjugacy. For situations where it is reasonable to consider the innovation variances as following a smooth progression, such

as most longitudinal data models, we recommend the correlated-lognormal prior with a non-zero choice of $\rho$.

To obtain the theoretical properties of this prior, let $\mathrm{I\Gamma}(\cdot)$ denote the probability function of the InvGamma$(\lambda_1, \lambda_2)$ distribution, with fixed values for the hyperparameters. Because the InvGamma grouping prior is a matrix stick-breaking process, the most relevant properties follow immediately from Dunson et al. (2008). For $A \in \mathcal{B}(\mathbb{R}^{+})$ the unbiasedness and variance properties are given by

$$E\{G_{mj}(A)\} = \mathrm{I\Gamma}(A), \qquad \mathrm{Var}\{G_{mj}(A)\} = \frac{2}{(2+\alpha)(2+\beta) - 2}\,\mathrm{I\Gamma}(A)\{1 - \mathrm{I\Gamma}(A)\},$$

where $\alpha$ and $\beta$ are the stick-breaking parameters of the innovation variance block. The behavior across groups $m \ne m'$ with common time $j$ as in (3) and (3) continues to hold with $\rho = 0$. The key difference is that $\mathrm{corr}\{G_{mj}(A), G_{mj'}(A)\} = \mathrm{corr}\{G_{mj}(A), G_{m'j'}(A)\} = 0$, contrasted with (3) and (3). That is, the distributions for times $j$ and $j'$ are uncorrelated. It remains true that $\mathrm{pr}(\gamma_{mj} = \gamma_{mj'}) = 0$. As $\alpha \to 0$ and $\beta \to \infty$, this prior converges to the naive prior used for the innovation variances in Sections 3.6 and 3.7 of the article.

B.4 Further Grouping Prior Extensions

In addition to the grouping priors previously defined in Chapter 3 and here, there are a number of other natural extensions and possible variations that one could construct. For instance, one could allow for differing values of $\sigma^2$, the variance of the autoregressive parameter candidates, that depend on the lag value of the associated autoregressive parameter. This might be beneficial in a situation where $p$ is large and one believes that the $\phi$'s after the first few lags vary more tightly around zero. Additionally, we can remove the sparsity from the lag-block grouping prior by deleting the point mass at zero from the distribution of the $\theta$'s. As another choice, instead of specifying the $\Phi$ and $\Gamma$ as separate blocks with different values of the stick-breaking parameters $\alpha$ and $\beta$, one could draw both sets of parameters with the same values of $\alpha$ and $\beta$. Instead of specifying that the candidate $\theta$'s are zero according

to the probability $\epsilon$, another extension is to modify the grouping prior by introducing a zero-th cluster where $\theta_{j0} = 0$ for all $j$. The selection of $\phi_{mj}$ would then follow by $\mathrm{pr}(\phi_{mj} = 0) = \mathrm{pr}(\phi_{mj} = \theta_{j0}) = \epsilon_{q(j)}$ and, for $h = 1,\dots,H$, $\mathrm{pr}(\phi_{mj} = \theta_{jh}) = (1 - \epsilon_{q(j)})\,\pi_{mjh}$. The properties we have considered are easily obtained and compared to those obtained in the sparsity grouping case.

While these or others may be more natural in certain contexts, we believe that the lag-block and correlated-lognormal grouping priors are the most applicable choices for general longitudinal data.
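To make the construction of the sparsity grouping prior in Section B.1 concrete, the sketch below draws one realization of the autoregressive parameters from a truncated version of the prior. It assumes the matrix stick-breaking form of Dunson et al. (2008), with $U_{mh} \sim \mathrm{Beta}(1, \alpha)$, $X_{jh} \sim \mathrm{Beta}(1, \beta)$ and weights $\pi_{mjh} = U_{mh}X_{jh}\prod_{l<h}(1 - U_{ml}X_{jl})$, and it closes the truncation by setting the last sticks to one. The function names and the truncation rule are illustrative assumptions rather than the dissertation's code.

```python
import numpy as np


def msbp_weights(U, X):
    """Matrix stick-breaking weights pi[m, j, h] from U (M x H) and X (J x H)."""
    V = U[:, None, :] * X[None, :, :]                      # V[m, j, h] = U[m, h] * X[j, h]
    remaining = np.cumprod(1.0 - V, axis=2)                # stick left after each break
    remaining = np.concatenate(
        [np.ones_like(V[..., :1]), remaining[..., :-1]], axis=2)
    return V * remaining


def sample_sparsity_prior(M, J, H, alpha, beta, eps, sigma2, rng):
    """One draw of phi (M x J) under a truncated sparsity grouping prior."""
    U = rng.beta(1.0, alpha, size=(M, H))
    X = rng.beta(1.0, beta, size=(J, H))
    U[:, -1] = 1.0                                         # assumed truncation: last stick
    X[:, -1] = 1.0                                         # takes all remaining mass
    pi = msbp_weights(U, X)

    # Zero-normal mixture candidates, one set of H candidates per parameter j.
    theta = rng.normal(0.0, np.sqrt(sigma2), size=(J, H))
    eps = np.asarray(eps, dtype=float).reshape(-1, 1)      # scalar or one value per j
    theta[rng.random((J, H)) < eps] = 0.0

    # Pick candidate index R[m, j] with probabilities pi[m, j, :].
    cum = np.cumsum(pi, axis=2)
    R = (rng.random((M, J, 1)) > cum).sum(axis=2)
    R = np.minimum(R, H - 1)                               # guard against rounding in cum
    phi = theta[np.arange(J)[None, :], R]
    return phi, pi, theta


phi, pi, theta = sample_sparsity_prior(
    M=5, J=6, H=20, alpha=1.0, beta=1.0, eps=0.5, sigma2=1.0,
    rng=np.random.default_rng(1))
```

Groups that select the same candidate index for a given $j$ share an identical value of $\phi_{mj}$, which is how the prior induces clustering across groups, while exact zeros arise from the point mass in the candidate distribution.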

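APPENDIX C
DERIVATION OF COVARIANCE GROUPING PRIOR PROPERTIES

C.1 Proofs for Generalized Autoregressive Parameter Properties

We provide partial proofs of the properties of the lag-block grouping prior noted in Section 3.4 of the article. The proofs for properties (3) and (3) can be found in the appendix to Dunson et al. (2008). Because our base distribution (3) contains a point mass at zero and is therefore not non-atomic, we are not able to directly apply their conclusion to our prior to prove (3). Details follow.

$$\begin{aligned}
\mathrm{pr}(\phi_{mj} = \phi_{m'j})
&= \mathrm{pr}(\phi_{mj} = \phi_{m'j} \ne 0) + \mathrm{pr}(\phi_{mj} = \phi_{m'j} = 0) \\
&= E\Big\{\sum_h \pi_{mjh}\pi_{m'jh}\,\delta_{\theta_{jh}}(\mathbb{R}\setminus 0)\Big\}
 + E\Big\{\sum_h \pi_{mjh}\,\delta_{\theta_{jh}}(0) \sum_i \pi_{m'ji}\,\delta_{\theta_{ji}}(0)\Big\} \\
&= E\Big\{\sum_h \pi_{mjh}\pi_{m'jh}\,\delta_{\theta_{jh}}(\mathbb{R}\setminus 0)\Big\}
 + E\Big\{\sum_h \pi_{mjh}\pi_{m'jh}\,\delta_{\theta_{jh}}(0)\Big\}
 + 2E\Big\{\sum_h \sum_{i=h+1}^{\infty} \pi_{mjh}\pi_{m'ji}\,\delta_{\theta_{jh}}(0)\,\delta_{\theta_{ji}}(0)\Big\} \\
&= E\Big\{\sum_h \pi_{mjh}\pi_{m'jh}\,\delta_{\theta_{jh}}(\mathbb{R})\Big\}
 + 2E\Big\{\sum_h \sum_{i=h+1}^{\infty} \pi_{mjh}\pi_{m'ji}\,\delta_{\theta_{jh}}(0)\,\delta_{\theta_{ji}}(0)\Big\} \\
&= E\Big\{\sum_h \pi_{mjh}\pi_{m'jh}\Big\}
 + 2\epsilon^2 E\Big\{\sum_h \sum_{i=h+1}^{\infty} \pi_{mjh}\pi_{m'ji}\Big\} \\
&= \text{(I)} + 2\epsilon^2\,\text{(II)},
\end{aligned}$$

where expressions (I) and (II) are calculated below and $\mathbb{R}$ denotes the real line.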

(I)=E(XhUmhUm0hX2jhYl

Using(I)and(II),wehavepr(mj=m0j)=(I)+22(II)=2+1)]TJ /F11 11.955 Tf 11.96 0 Td[(2 (1+)(2+))]TJ /F4 11.955 Tf 11.95 0 Td[(1.Toestablishthecorrelationpropertyin( 3 ),letq=q(j)=q(j0).FirstnoteEfFmj(A)Fmj0(A)g=E(Xhmjhmj0hqh(A))+2E(Xh1Xi=h+1mjhmj0iqh(A)q0i(A))=(A)E(Xhmjhmj0h)+2(A)2E(Xh1Xi=h+1mjhmj0i)=(A)(III)+2(A)2(IV),whereformulas(III)and(IV)aredenedbelow.(III)=E(XhU2mhXjhXj0hYl

(IV)=E"Xh1Xi=h+1UmhXjh(1)]TJ /F5 11.955 Tf 11.96 0 Td[(UmhXj0h)(Yl

Toprovethematchingprobabilityof( 3 ),notepr(mj=mj0)=pr(mj=mj06=0)+pr(mj=mj0=0)=E(Xhmjhmj0hqh(Rn0))+E(Xhmjhqh(0)Ximj0iqi(0))=E(Xhmjhmj0hqh(R))+2E(Xh1Xi=h+1mjhmj0iqh(0)qi(0))=(III)+22(IV)=2q+1)]TJ /F11 11.955 Tf 11.95 0 Td[(2q (2+)(1+))]TJ /F4 11.955 Tf 11.95 0 Td[(1.Toobtain( 3 ),wehaveEfFmj(A)Fm0j0(A)g=E(Xhmjhm0j0hqh(A))+2E(Xh1Xi=h+1mjhm0j0iqh(A)qi(A))=(A)E(Xhmjhm0j0h)+2(A)2E(Xh1Xi=h+1mjhm0j0i)=(A)(V)+2(A)2(VI),where(V)=E(XhUmhUm0hXjhXj0hYl

and(VI)=E"Xh1Xi=h+1UmhXjh(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Um0hXj0h)(Yl

Toseethatthedistributionsareuncorrelated,assumem=m0.ThenEfFmj(A)Fmj0(A)g=E(Xhmjhmj0hqh(A)q0h(A))+2E(Xh1Xi=h+1mjhmj0iqh(A)q0i(A))=(A)2(III)+2(A)2(IV)=(A)2.Whenm6=m0,theproofproceedssimilarlybutrequiresexpressions(V)and(VI)insteadof(III)and(IV). C.2ProofsforInnovationVariancePropertiesToestablishthepropertiesforthecorrelated-lognormalprior,notethatforaxedcommonvalueofand,thedistributionsofUmhandWmh,aswellasXjhandZjh,arethesame.Hence,thesetfmjhgwillbedistributedthesameasthesetfmjhg,andwemayusetheexpressions(I)(VI)toobtainexpectationsoftheinnovationvariancestick-breakingweights.Toprove( 3 ),noticeEfGmj(A)Gmj0(A)g=E(Xhmjhmj0hjh(A)j0h(A))+2E(Xh1Xi=h+1mjhmj0ijh(A)j0i(A))=Enj1(A)j01(A)oE(Xhmjhmj0h)+2fEj1(A)gfEj01(A)gE(Xh1Xi=h+1mjhmj0i)=Enj1(A)j01(A)o(III)+2fEj1(A)gfEj01(A)g(IV)=1 (2+)(1+))]TJ /F4 11.955 Tf 11.95 0 Td[(1hEn!j1(logA)!j01(logA)ofE!j1(logA)gfE!j01(logA)gi+fE!j1(logA)gfE!j01(logA)g=1 (2+)(1+))]TJ /F4 11.955 Tf 11.95 0 Td[(1covn!j1(logA),!j01(logA)o+(logA)2. 110

Applying the Bernoulli-type variance of $\delta_{\omega_{j1}}(\log A)$, namely the base-measure probability of $\log A$ multiplied by one minus that probability, together with the formulas for $E\{G_{mj}(A)\}$ and $\mathrm{var}\{G_{mj}(A)\}$ gives the final result. The proof of (3) follows the same steps as above, except that one uses expressions (V) and (VI) in place of (III) and (IV). Finally, $\mathrm{pr}(\gamma_{mj} = \gamma_{m'j'}) = 0$ follows from the observation that $\omega_{jh} \ne \omega_{j'h'}$ almost surely as a consequence of the multivariate normal distribution with a non-degenerate correlation.

Proofs of the properties of the sparsity, non-sparse, and InvGamma grouping priors defined in Appendix B are excluded. These can be derived following steps similar to those above or in the appendix to Dunson et al. (2008).
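As a numerical sanity check on the matching probability derived in Section C.1, the closed form $\epsilon^2 + (1-\epsilon^2)/\{(1+\alpha)(2+\beta)-1\}$ can be compared with a Monte Carlo average over stick-breaking weights. The sketch below is an illustration only: it assumes $U_{mh} \sim \mathrm{Beta}(1,\alpha)$ and $X_{jh} \sim \mathrm{Beta}(1,\beta)$ as in Dunson et al. (2008), approximates the infinite sum by a truncation at `H` sticks, and uses function names introduced here.

```python
import numpy as np


def match_prob_formula(alpha, beta, eps):
    """Closed-form pr(phi_mj = phi_m'j) for two groups sharing parameter j."""
    same_candidate = 1.0 / ((1.0 + alpha) * (2.0 + beta) - 1.0)
    return eps**2 + (1.0 - eps**2) * same_candidate


def match_prob_mc(alpha, beta, eps, H=60, n_rep=20_000, seed=0):
    """Monte Carlo estimate of the same probability, averaging over the weights."""
    rng = np.random.default_rng(seed)
    U = rng.beta(1.0, alpha, size=(n_rep, 2, H))   # two groups m and m'
    X = rng.beta(1.0, beta, size=(n_rep, 1, H))    # one shared parameter j
    V = U * X
    stick = np.cumprod(1.0 - V, axis=2)
    stick = np.concatenate([np.ones((n_rep, 2, 1)), stick[..., :-1]], axis=2)
    pi = V * stick
    same = (pi[:, 0] * pi[:, 1]).sum(axis=1)       # both groups pick the same candidate
    cross = pi[:, 0].sum(axis=1) * pi[:, 1].sum(axis=1) - same
    # Same candidate: always a match. Different candidates: match only if both are zero.
    return float(np.mean(same + eps**2 * cross))


formula = match_prob_formula(alpha=1.0, beta=1.0, eps=0.5)
estimate = match_prob_mc(alpha=1.0, beta=1.0, eps=0.5)
```

With $\alpha = \beta = 1$ and $\epsilon = 0.5$ the closed form equals $0.25 + 0.75 \times 0.2 = 0.4$, and the simulated value should agree up to Monte Carlo error.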

APPENDIX D
DETAILS OF MCMC ALGORITHM FOR COVARIANCE GROUPING PRIORS

D.1 Preliminaries

As mentioned in Section 3.5, we introduce several latent variables to facilitate sampling from the distributions $F_{mj}(\cdot)$ and $G_{mj}(\cdot)$ in equations (3) and (3), following the algorithm of Dunson et al. (2008). To this end, consider the following four sets of binary dummy variables:

$$\begin{aligned}
u_{mjh} &\sim \mathrm{Bern}(U_{mh}) && (m = 1,\dots,M;\ j = 1,\dots,J;\ h = 1,\dots,H),\\
x_{mjh} &\sim \mathrm{Bern}(X_{jh}) && (m = 1,\dots,M;\ j = 1,\dots,J;\ h = 1,\dots,H),\\
w_{mjh} &\sim \mathrm{Bern}(W_{mh}) && (m = 1,\dots,M;\ j = 1,\dots,p;\ h = 1,\dots,H),\\
z_{mjh} &\sim \mathrm{Bern}(Z_{jh}) && (m = 1,\dots,M;\ j = 1,\dots,p;\ h = 1,\dots,H).
\end{aligned}$$

Now define $R_{mj} = \min\{h : 1 = u_{mjh} = x_{mjh}\}$ and $A_{mj} = \min\{h : 1 = w_{mjh} = z_{mjh}\}$. By construction, $\mathrm{pr}(R_{mj} = h) = \pi_{mjh}$ and $\mathrm{pr}(A_{mj} = h) = \pi^{*}_{mjh}$. We let $R_{mj}$ designate which $\theta_{jh}$ out of the $H$ candidates we choose as $\phi_{mj}$, and likewise, $A_{mj}$ gives the $\xi_{jh}$ to select as $\gamma_{mj}$. Hence, $\Phi$ is determined by $\{R_{mj}\}$ and $\{\theta_{jh}\}$ and $\Gamma$ by $\{A_{mj}\}$ and $\{\xi_{jh}\}$. Thus, after sampling the values of $\{R_{mj}\}$, $\{\theta_{jh}\}$, $\{A_{mj}\}$, and $\{\xi_{jh}\}$, the values of $\Phi$ and $\Gamma$ are determined.

Now we calculate the conditional distributions that we will need for our Gibbs sampler for each of the grouping priors. Notationally, we denote the conditional distribution for a random variable, say $C$, conditional on the remaining random variables by $C \mid -$.

D.2 Sampling Steps for Lag-Block Grouping Prior

Sampling from the lag-block grouping prior proceeds in four steps: the autoregressive parameters through the parameter candidates and the dummy variables $R$, $u$, $x$; the candidate probabilities $\pi_{mjh}$ by sampling the $U_{mh}$ and $X_{jh}$; the stick-breaking parameters

$\alpha$, $\beta$; the hyperparameters of the base distribution (3). The first step proceeds in two substeps. Algorithm details follow.

Step 1a: Parameter candidates. It is important to recall the definition of the autoregressive parameters as conditional regression coefficients. For instance, the first parameter $\phi_{m1}$ is the regression coefficient for $y_{mi1}$ onto $y_{mi2}$ with innovation variance $\gamma_{m1}$. Likewise, $\phi_{m2}$ and $\phi_{m3}$ are the coefficients of $y_{mi1}$ and $y_{mi2}$ for modeling $y_{mi3}$ with variance $\gamma_{m2}$. We let $x_{mij}$ denote the component of $y_{mi}$ that corresponds to the $j$th autoregressive parameter regressor, e.g. $x_{mij} = y_{mi1}$ for $j = 1, 2$ and $x_{mi3} = y_{mi2}$. Similarly, we let $\tilde\gamma_{mj}$ denote the relevant innovation variance. That is, $\tilde\gamma_{m1} = \gamma_{m1}$, and for $j = 2$ and $3$, $\tilde\gamma_{mj} = \gamma_{m2}$. Finally, we define $e_{mij}$ to be the residual for the regression equation, excluding the contribution of $x_{mij}$. That is, for $j = 1$, $e_{mi1} = y_{mi2}$; for $j = 2$, $e_{mi2} = y_{mi3} - \phi_{m3}y_{mi2}$; and for $j = 3$, $e_{mi3} = y_{mi3} - \phi_{m2}y_{mi1}$. These variables are defined in the natural way so that $e_{mij} \sim N(\phi_{mj}x_{mij}, \tilde\gamma_{mj})$ for each $j$.

Having established the necessary notation, we see that the contribution to the data likelihood of $\phi_{mj}$ is proportional to

$$\exp\Big\{-\frac{1}{2\tilde\gamma_{mj}} \sum_{i=1}^{n_m} (e_{mij} - \phi_{mj}x_{mij})^2\Big\}.$$

However, we do not draw the $\phi_{mj}$'s but the $\theta_{qh}$'s. For $h = 1,\dots,H$ and fixed $q = 1,\dots,p-1$, let $P_{qh}$ denote the set of $(m, j)$ such that $q(j) = q$ and $R_{mj} = h$, which is the set of group and autoregressive parameter pairs that contribute to the likelihood of $\theta_{qh}$. Thus, the contribution of $\theta_{qh}$ is

$$\exp\Big\{\sum_{(m,j) \in P_{qh}} \sum_{i=1}^{n_m} -\frac{1}{2\tilde\gamma_{mj}} (e_{mij} - \theta_{qh}x_{mij})^2\Big\}. \qquad \text{(D)}$$

Thissummationover(m,j)2Pqhmeansthatweareonlyincludingthe(m,j)pairssuchthatqh=mj.Fromthisobservation,wehavethattheconditionaldistributionofqhis (qhj)]TJ /F4 11.955 Tf 12.62 0 Td[()/exp8<:X(m,j)2PqhnmXi=1)]TJ /F4 11.955 Tf 9.3 0 Td[(1 2mj(emi)]TJ /F11 11.955 Tf 11.96 0 Td[(jhxmi)29=;q0(qh)+(1)]TJ /F11 11.955 Tf 11.96 0 Td[(q)(22))]TJ /F14 5.978 Tf 7.79 3.26 Td[(1 2exp)]TJ /F11 11.955 Tf 12.37 9.1 Td[(2qh 22/q0(qh)+(1)]TJ /F11 11.955 Tf 11.95 0 Td[(q) exp()2 22N(,2), (D) where =2X(m,j)2PqhnmXi=1emijxmij mj,2=8<:1 2+X(m,j)2PqhnmXi=1(xmij)2 mj9=;)]TJ /F9 7.97 Tf 6.59 0 Td[(1.(D)Thus,tosamplefromthisconditional( D ),wesetqhtozerowithprobabilityq q+(1)]TJ /F11 11.955 Tf 11.96 0 Td[(q) expn()2 22o,andotherwise,drawqhfromN(,2).IfPqhisempty,thentheconditionalisq0+(1)]TJ /F11 11.955 Tf 11.96 0 Td[(q)N(0,2),theoriginalpriorforqhgivenby( 3 ).Step1b:Dummyvariables.Havingupdatedthecandidatesvaluesfqhgh,wenowsamplethevariablesthatdeterminewhichcandidatewechoosefortheautoregressiveparameter.First,form=1,...,Mandjsuchthatq=q(j),wesampleRmjaftermarginalizingoutfumjh,xmjhgHh=1.Theconditionalprobabilitydistributionisgivenby pr(Rmj=hj)-222(nfumjh,xmjhgh)/mjhexp()]TJ /F4 11.955 Tf 18.65 8.09 Td[(1 2mjnmXi=1(emij)]TJ /F11 11.955 Tf 11.96 0 Td[(q(j)hxmij)2).(D)Hence,wedrawRmjfromthemultinomialdistributionwithprobabilitiesfrom( D ),normalizedtosumtoone.GiventhevalueofRmj,wedrawthesetfumjh,xmjhghtorequirethatRmjistherstoccasionwherebothumjhandxmjhareone.Forh>Rmj,drawumjhBern(Umh)andxmjhBern(Xjh),andwhenh=Rmj,1=umjh=xmjh.Forh

thenwejointlydrawumjhandxmjhinaccordancetothefollowingprobabilities:pr(umjh=0,xmjh=0)=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Umh)(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Xjh)=(1)]TJ /F5 11.955 Tf 11.95 0 Td[(UmhXjh),pr(umjh=1,xmjh=0)=Umh(1)]TJ /F5 11.955 Tf 11.95 0 Td[(Xjh)=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(UmhXjh),pr(umjh=0,xmjh=1)=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(Umh)Xjh=(1)]TJ /F5 11.955 Tf 11.96 0 Td[(UmhXjh).Steps1aand1bshouldbeperformconsecutivelywiththesamevalueofq.Step2:Candidateprobabilities.ToupdatetheprobabilitiesmjhwemustsamplenewvaluesforthecomponentsUmhandXjh.Tothatend,giventhevaluesoftheumjh'sandtheothervariables,theconditionalforUmhforh

and $\beta$. Then the conditional for $\alpha$ is

$$\alpha \mid - \sim \mathrm{Gamma}\Big(M(H-1) + 1,\ 1 - \sum_{m=1}^{M}\sum_{h=1}^{H-1} \log(1 - U_{mh})\Big).$$

Likewise,

$$\beta \mid - \sim \mathrm{Gamma}\Big(J(H-1) + 1,\ 1 - \sum_{j=1}^{J}\sum_{h=1}^{H-1} \log(1 - X_{jh})\Big).$$

We can choose a different Gamma$(a, b)$ prior instead of Gamma$(1, 1)$, and we will maintain the gamma-gamma conjugacy.

Step 4: Base distribution hyperparameters. The base distribution for $\theta_{qh}$ in (3) depends on two sets of hyperparameters: the variance of the continuous piece $\sigma^2$ and the sparsity probabilities $\epsilon_q$. We choose the InvGamma$(a, b)$ family of distributions for the prior on $\sigma^2$, so that we will have conjugacy. This yields the following conditional distribution,

$$\sigma^2 \mid - \sim \mathrm{InvGamma}\Big(a + \frac{1}{2}\sum_{q,h}\{1 - \delta_0(\theta_{qh})\},\ b + \frac{1}{2}\sum_{q,h}\theta_{qh}^2\Big).$$

One must now specify the values of $a$, $b$. We recommend InvGamma$(0.1, 0.1)$, so that our prior approximates the commonly used improper prior proportional to $\sigma^{-2}$.

By placing a Beta prior on $\epsilon_q$, the conditional for $\epsilon_q$ is again a Beta distribution: the first prior shape parameter is increased by $\sum_{h=1}^{H}\delta_0(\theta_{qh})$, the number of lag-$q$ candidates equal to zero, and the second by $\sum_{h=1}^{H}\{1 - \delta_0(\theta_{qh})\}$. It is necessary to specify the two prior shape parameters. We recommend setting both to 1 for all $q$, which gives a Unif$(0, 1)$ prior for each $\epsilon_q$. Alternatively, one could choose the shape parameters to more aggressively shrink $\epsilon_q$ for lower lags toward zero and $\epsilon_q$ for higher lags toward one.
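The conjugate updates in Steps 3 and 4 amount to three short draws. The following is a minimal sketch assuming Gamma(1, 1) priors on the stick-breaking parameters, an InvGamma(a, b) prior on the candidate variance, and a Beta prior on each sparsity probability; the function and argument names (`update_stick_parameter`, `a_eps`, `b_eps`, and so on) are hypothetical and introduced only for this illustration.

```python
import numpy as np


def update_stick_parameter(sticks, rng):
    """Gamma update for a stick-breaking parameter under a Gamma(1, 1) prior.

    sticks: array (n, H) of Beta(1, parameter) variables; the last column is the
    truncation stick (fixed at one) and is excluded from the update.
    """
    free = sticks[:, :-1]
    shape = free.size + 1                       # n * (H - 1) + 1
    rate = 1.0 - np.log1p(-free).sum()          # 1 - sum log(1 - stick)
    return rng.gamma(shape, 1.0 / rate)         # NumPy uses shape/scale


def update_sigma2(theta, a, b, rng):
    """Inverse-gamma update for the variance of the non-zero candidates."""
    shape = a + 0.5 * np.count_nonzero(theta)
    scale = b + 0.5 * np.sum(theta**2)
    return scale / rng.gamma(shape)             # scale / Gamma(shape, 1) ~ InvGamma(shape, scale)


def update_eps(theta_q, a_eps, b_eps, rng):
    """Beta update for the sparsity probability at one lag."""
    zeros = int(np.sum(theta_q == 0.0))
    return rng.beta(a_eps + zeros, b_eps + theta_q.size - zeros)


rng = np.random.default_rng(2)
U = rng.beta(1.0, 2.0, size=(8, 30))            # toy sticks for M = 8 groups, H = 30
U[:, -1] = 1.0
theta_q = np.array([0.0, 0.0, 0.4, -0.7, 0.0, 1.1])
alpha_new = update_stick_parameter(U, rng)
sigma2_new = update_sigma2(theta_q, a=0.1, b=0.1, rng=rng)
eps_new = update_eps(theta_q, a_eps=1.0, b_eps=1.0, rng=rng)
```

Note that NumPy's gamma sampler is parameterized by shape and scale, so the rate that appears in the conditional is inverted before the call.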

D.3 Sampling Steps for Correlated-Lognormal Grouping Prior

The sampling for the innovation variance prior follows similarly to the previous prior. However, we no longer have to perform the candidate value and dummy variable sampling steps concurrently as for the lag-block prior.

Step 1: Parameter candidates. Let $\tilde e_{mij}$ be the residual obtained from the difference of $y_{mij}$ and the previous components of $y_{mi}$ multiplied by the appropriate autoregressive parameters. For instance, when $j = 1$, $\tilde e_{mi1} = y_{mi1}$, and for $j = 2$, $\tilde e_{mi2} = y_{mi2} - \phi_{m1}y_{mi1}$, and so on. Note that this is a different definition of these $\tilde e$-residuals from the $e$-residuals used in the autoregressive parameter sampling steps. For each value of $j$, this yields $\tilde e_{mij} \sim N(0, \gamma_{mj})$. The contribution to the likelihood of $\xi_{jh}$ from the data is proportional to

$$\xi_{jh}^{-\frac{1}{2}\sum_m n_m \delta_h(A_{mj})} \exp\Big\{-\frac{1}{2\xi_{jh}} \sum_m \sum_{i=1}^{n_m} (\tilde e_{mij})^2\,\delta_h(A_{mj})\Big\}. \qquad \text{(D)}$$

Instead of considering the conditional for $\xi_{jh}$, we choose to work in terms of $\omega_{jh} = \log \xi_{jh}$. At each sampling step, we partition $\omega_h$ into $(\omega_{hA}, \omega_{hB})$ so that $\omega_{hA}$ contains the collection of $\omega_{jh}$ such that $A_{mj} = h$ for at least one $m$. This divides $\omega_h$ into the $\omega_{hB}$, which can be drawn easily through a conjugate distribution, and the $\omega_{hA}$, which require a more advanced sampling method.

To sample $\omega_{hB}$ given the remaining variables, we let $a$ denote the length of $\omega_{hA}$ and $b = p - a$ denote the length of $\omega_{hB}$. Define $R_{AA}$ to be the submatrix of $R(\rho)$ corresponding to the elements of $\omega_{hA}$, $R_{BB}$ the submatrix corresponding to the elements of $\omega_{hB}$, and $R_{BA}$ the submatrix containing the rows of $\omega_{hB}$ and the columns of $\omega_{hA}$. Then, using standard multivariate normal results,

$$\omega_{hB} \mid \omega_{hA}, - \sim N_b\big(\psi 1_b + R_{BA}R_{AA}^{-1}(\omega_{hA} - \psi 1_a),\ \tau(R_{BB} - R_{BA}R_{AA}^{-1}R_{BA}')\big).$$

Jointly drawing the vector $\omega_{hB}$ leads to better mixing than drawing each component separately.
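The joint draw of the unoccupied log-candidates relies only on the standard conditional distribution of a partitioned multivariate normal vector. A minimal sketch of that computation follows; the function name, the `scale` argument (standing in for the variance multiplier of the correlation matrix), and the AR(1) correlation matrix used in the example are assumptions made for illustration.

```python
import numpy as np


def conditional_normal_params(R, idx_A, omega_A, psi, scale=1.0):
    """Mean and covariance of omega_B | omega_A when omega ~ N(psi * 1, scale * R)."""
    p = R.shape[0]
    idx_A = np.asarray(idx_A)
    idx_B = np.setdiff1d(np.arange(p), idx_A)   # complement of the occupied set
    R_AA = R[np.ix_(idx_A, idx_A)]
    R_BA = R[np.ix_(idx_B, idx_A)]
    R_BB = R[np.ix_(idx_B, idx_B)]
    mean = psi + R_BA @ np.linalg.solve(R_AA, omega_A - psi)
    cov = scale * (R_BB - R_BA @ np.linalg.solve(R_AA, R_BA.T))
    return idx_B, mean, cov


# Example with an assumed AR(1)-type correlation matrix, rho = 0.5, p = 3.
rho, p = 0.5, 3
R = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
idx_B, mean_B, cov_B = conditional_normal_params(
    R, idx_A=[0], omega_A=np.array([2.0]), psi=1.0)

rng = np.random.default_rng(3)
omega_B = rng.multivariate_normal(mean_B, cov_B)   # one joint draw of the B block
```

Using `np.linalg.solve` avoids forming the inverse of the occupied-block correlation matrix explicitly.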

Tosample!hA,wecyclethroughthecomponents!hof!hAfor=1,...,a.Werecognizethatthecontributiontotheconditionalof!hfromthepriorisexp)]TJ /F4 11.955 Tf 17.39 8.09 Td[(1 2(!h)]TJ /F11 11.955 Tf 11.96 0 Td[( )2,where = +R,()]TJ /F12 7.97 Tf 6.59 0 Td[()R)]TJ /F9 7.97 Tf 6.59 0 Td[(1()]TJ /F12 7.97 Tf 6.59 0 Td[(),()]TJ /F12 7.97 Tf 6.59 0 Td[()(!h()]TJ /F12 7.97 Tf 6.58 0 Td[())]TJ /F11 11.955 Tf 11.96 0 Td[( 1p)]TJ /F9 7.97 Tf 6.59 0 Td[(1),=1)]TJ /F5 11.955 Tf 11.95 0 Td[(R,()]TJ /F12 7.97 Tf 6.59 0 Td[()R)]TJ /F9 7.97 Tf 6.58 0 Td[(1()]TJ /F12 7.97 Tf 6.59 0 Td[(),()]TJ /F12 7.97 Tf 6.58 0 Td[()R0,()]TJ /F12 7.97 Tf 6.58 0 Td[(),!h()]TJ /F12 7.97 Tf 6.59 0 Td[()isthe!hvectorafterremoving!h,R()]TJ /F12 7.97 Tf 6.58 0 Td[(),()]TJ /F12 7.97 Tf 6.59 0 Td[()istheR()matrixformedbyremovingtherowandcolumncorrespondingto,andR,()]TJ /F12 7.97 Tf 6.59 0 Td[()isthevectordenedbytakingtherowofR()andremovingthecomponent.Wecanequivalentlyviewthisash=exp(!h)lognormal( ,),andcalculatetheconditionaldistributionintermsofh.Thisgives(hjh()]TJ /F12 7.97 Tf 6.59 0 Td[(),)]TJ /F4 11.955 Tf 9.3 0 Td[()/)]TJ /F14 5.978 Tf 7.78 3.26 Td[(1 2Pmnmh(Amj))]TJ /F9 7.97 Tf 6.59 0 Td[(1hexp()]TJ /F4 11.955 Tf 18.63 8.09 Td[(1 2hXmnmXi=1(~emij)2h(Amj))]TJ /F4 11.955 Tf 20.05 8.09 Td[(1 2(logh)]TJ /F11 11.955 Tf 11.95 0 Td[( )2).Samplingfromthisdistributionrequiresanapproximatesamplingstep.Werecommendslicesampling( Neal 2003 ),althoughanalternativesamplingstrategycouldbeused.Step2:Dummyvariables.TodrawAmjwewillproceedsimilarlytothepreviousStep1bbylookingattheconditionalmarginallyoverfwmjh,zmjhgh. pr(Amj=hjnfwmjh,zmjhgh)/mjh)]TJ /F14 5.978 Tf 7.78 3.26 Td[(1 2nmjhexp )]TJ /F4 11.955 Tf 17 8.09 Td[(1 2jhnmXi=1~e2mi!. (D) Hence,wedrawAmjfromthemultinomialdistributionwithprobabilitiesfrom( D ),normalizedtosumtoone.Asbefore,wesimulatethesetswmjhandzmjhconditionalonAmjbeingtherstoccasionwherebothwmjhandzmjhareone.Forh>Amj,drawwmjhBern(Wmh)andzmjhBern(Zjh),andwhenh=Amj,1=wmjh=zmjh.For 118

h

In the simulation and data example, we use $a = b = 0.1$, $c^2 = 1000$. As mentioned in Section 3.5, it has been our experience that sampling the correlation parameter $\rho$ leads to instability, and we generally recommend fixing it.

D.4 Sampling Steps for Sparsity Grouping Prior

The sparsity grouping prior introduced in Appendix B.1 follows a similar structure to the lag-block prior except that it uses a new set of autoregressive parameter candidates for each $j$. Consequently, the algorithm required for sampling from this prior will be quite similar. We define the necessary steps where they differ from the procedure introduced in Appendix D.2.

Step 1a: Parameter candidates. Using the same notation as before, we note that the contribution from the data about $\theta_{jh}$ is

$$\exp\Big\{\sum_{m : R_{mj} = h} -\frac{1}{2\tilde\gamma_{mj}} \sum_{i=1}^{n_m} (e_{mij} - \theta_{jh}x_{mij})^2\Big\},$$

which we compare to (D). Incorporating the zero-normal mixture prior, the posterior conditional distribution will also be a zero-normal mixture. We set $\theta_{jh}$ to zero with probability

$$\frac{\epsilon_{q(j)}}{\epsilon_{q(j)} + (1 - \epsilon_{q(j)})\,\dfrac{\hat\sigma}{\sigma}\exp\Big\{\dfrac{\hat\mu^2}{2\hat\sigma^2}\Big\}},$$

and sample from a $N(\hat\mu, \hat\sigma^2)$ otherwise. The mean and variance for the continuous component are given by

$$\hat\mu = \hat\sigma^2 \sum_{m : R_{mj} = h} \sum_{i=1}^{n_m} \frac{e_{mij}x_{mij}}{\tilde\gamma_{mj}}, \qquad
\hat\sigma^2 = \Big\{\frac{1}{\sigma^2} + \sum_{m : R_{mj} = h} \sum_{i=1}^{n_m} \frac{(x_{mij})^2}{\tilde\gamma_{mj}}\Big\}^{-1}. \qquad \text{(D)}$$

Step 1b: Dummy variables. The step is the same as Step 1b for the lag-block prior except that (D) becomes

$$\mathrm{pr}(R_{mj} = h \mid - \setminus \{u_{mjh}, x_{mjh}\}_h) \propto \pi_{mjh} \exp\Big\{-\frac{1}{2\tilde\gamma_{mj}} \sum_{i=1}^{n_m} (e_{mij} - \theta_{jh}x_{mij})^2\Big\}.$$

Now we cycle Steps 1a and 1b over $j$ instead of $q$.
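The zero-normal mixture update in Step 1a can be sketched as a single function. The version below is an illustration under the following assumptions: the residuals, regressors and innovation variances for all observations currently assigned to the candidate have been stacked into flat arrays, and the names `draw_candidate`, `e`, `x`, and `gam` are hypothetical. The zero probability is evaluated on the log scale to avoid overflow when the data strongly favour a non-zero value.

```python
import numpy as np


def draw_candidate(e, x, gam, eps, sigma2, rng):
    """Draw one candidate from its zero-normal mixture conditional.

    e, x, gam: flat arrays of residuals, regressors and innovation variances
               for every observation assigned to this candidate (may be empty).
    eps:       prior probability of an exact zero, strictly between 0 and 1.
    sigma2:    prior variance of the normal component.
    Returns (theta, p_zero).
    """
    e, x, gam = (np.asarray(v, dtype=float) for v in (e, x, gam))
    var_hat = 1.0 / (1.0 / sigma2 + np.sum(x**2 / gam))
    mu_hat = var_hat * np.sum(e * x / gam)
    # log of (1 - eps) * (sigma_hat / sigma) * exp(mu_hat^2 / (2 sigma_hat^2))
    log_w = np.log1p(-eps) + 0.5 * np.log(var_hat / sigma2) + mu_hat**2 / (2.0 * var_hat)
    # p_zero = eps / (eps + exp(log_w)), computed stably
    p_zero = float(np.exp(-np.logaddexp(0.0, log_w - np.log(eps))))
    if rng.random() < p_zero:
        return 0.0, p_zero
    return float(rng.normal(mu_hat, np.sqrt(var_hat))), p_zero


rng = np.random.default_rng(4)
# No observations assigned: the conditional reduces to the prior.
_, p_empty = draw_candidate([], [], [], eps=0.3, sigma2=1.0, rng=rng)
# Strong evidence for a coefficient near 2: the zero component is effectively ruled out.
theta_strong, p_strong = draw_candidate(
    e=2.0 * np.ones(50), x=np.ones(50), gam=np.ones(50),
    eps=0.3, sigma2=1.0, rng=rng)
```

When no group currently selects the candidate, the sums are empty and the function returns a draw from the prior, which matches the behaviour described for an empty assignment set.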

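Step 4: Base distribution hyperparameters. The sampling distribution for the candidate variance is

$$\sigma^2 \mid - \sim \mathrm{InvGamma}\Big(a + \frac{1}{2}\sum_{j,h}\{1 - \delta_0(\theta_{jh})\},\ b + \frac{1}{2}\sum_{j,h}\theta_{jh}^2\Big),$$

and the conditional for $\epsilon_q$ is a Beta distribution whose first shape parameter is the prior value plus $\sum_{j : q(j) = q}\sum_{h=1}^{H}\delta_0(\theta_{jh})$ and whose second shape parameter is the prior value plus $\sum_{j : q(j) = q}\sum_{h=1}^{H}\{1 - \delta_0(\theta_{jh})\}$, for prior distributions $\sigma^2 \sim \mathrm{InvGamma}(a, b)$ and a Beta prior on $\epsilon_q$. Note that the summations in the distribution for $\epsilon_q$ run over the autoregressive terms that correspond to lag $q$.

D.5 Sampling Steps for Non-Sparse Grouping Prior

Sampling under this prior proceeds as under the lag-block and sparsity priors. We describe the steps that differ from the lag-block prior.

Step 1a: Parameter candidates. Because this prior does not impose sparsity, the conditional distribution for $\theta_{jh}$ is $N(\hat\mu, \hat\sigma^2)$ where $\hat\mu$ and $\hat\sigma^2$ are given by equation (D).

Step 4: Base distribution hyperparameters. The only hyperparameter for the autoregressive candidates is $\sigma^2$. With a prior distribution of InvGamma$(a, b)$, the sampling distribution is

$$\sigma^2 \mid - \sim \mathrm{InvGamma}\Big(a + \frac{1}{2}JH,\ b + \frac{1}{2}\sum_{j,h}\theta_{jh}^2\Big).$$

D.6 Sampling Steps for InvGamma Grouping Prior

The sampling algorithm proceeds as for the correlated-lognormal prior. We describe only those steps that require modification from the algorithm in Appendix D.3.

Step 1: Parameter candidates. Using the likelihood contribution of $\xi_{jh}$ from (D) and a prior distribution of InvGamma$(\lambda_1, \lambda_2)$, we have that the conditional sampling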

distribution is

$$\xi_{jh} \mid - \sim \mathrm{InvGamma}\Big(\lambda_1 + \frac{1}{2}\sum_{m : A_{mj} = h} n_m,\ \lambda_2 + \frac{1}{2}\sum_{m : A_{mj} = h}\sum_{i=1}^{n_m} \tilde e_{mij}^2\Big).$$

Step 5: Base distribution hyperparameters. The hyperparameters associated with this innovation variance grouping are $\lambda_1$ and $\lambda_2$ in (B). We place independent Gamma$(1, 1)$ priors on each. The conditional for $\lambda_2$ is

$$\lambda_2 \mid - \sim \mathrm{Gamma}\Big(\lambda_1 pH + 1,\ 1 + \sum_{j,h}\xi_{jh}^{-1}\Big).$$

The conditional for $\lambda_1$ is

$$p(\lambda_1 \mid -) \propto \Gamma(\lambda_1)^{-pH}\,\lambda_2^{\lambda_1 pH}\exp\Big[-\lambda_1\Big(1 + \sum_{j,h}\log \xi_{jh}\Big)\Big],$$

but this is not a standard distribution to sample. So, it becomes necessary to implement an alternative sampling method, and we choose to introduce a Metropolis-in-Gibbs step to approximately simulate from this conditional. Draw the candidate value $\lambda_1^{*}$ to replace the current value $\lambda_1$ from a normal distribution centered at $\lambda_1$ with a prespecified candidate variance, and accept the move to $\lambda_1^{*}$ with probability $\min\{1, \tilde a\}$, where

$$\tilde a = I(\lambda_1^{*} > 0)\left[\exp\left\{\log\frac{\Gamma(\lambda_1)}{\Gamma(\lambda_1^{*})} + \frac{1}{pH}(\lambda_1 - \lambda_1^{*})\Big(1 + \sum_{j,h}\log \xi_{jh} - pH\log \lambda_2\Big)\right\}\right]^{pH}.$$

It is necessary to prespecify the candidate variance such that the acceptance rate is 20 to 40% (Gelman et al. 1996).

D.7 Final Comments about Computational Algorithm

We finally note that one can view our grouping priors in a hierarchical fashion with multiple levels. As is often the case in hierarchical models, there may be little information about the parameters in the lowest levels. We have often found this to be the case for the grouping priors, resulting in poor mixing for some of the lower level model parameters. While the values of the autoregressive parameters and

APPENDIX E
ADDITIONAL RISK SIMULATIONS FOR COVARIANCE GROUPING PRIORS

E.1 Additional Details for Risk Simulation of Section 3.6

Mean, covariance, and dropout model (3) parameters are displayed in Table E-1. Recall that the dropout model (3) is given by

$$\mathrm{logit}\{\mathrm{pr}(D_i = t + 1 \mid D_i > t, y_{it}, m)\} = \eta_{0t} + \eta_{1t}\,y_{it} + \eta_{2m}.$$

The dropout probabilities that this model induces are given in Table E-2. Recall that the $T(\phi_m)$ upper-triangular matrix is defined through the $J$-dimensional vector $\phi_m$ by

$$T(\phi_m) = \begin{pmatrix}
1 & -\phi_{m1} & -\phi_{m2} & -\phi_{m4} & \cdots \\
  & 1 & -\phi_{m3} & -\phi_{m5} & \cdots \\
  &   & 1 & -\phi_{m6} & \cdots \\
  &   &   & 1 & \cdots \\
  &   &   &   & \ddots
\end{pmatrix}.$$

Recall that in Section 3.6, we analyzed the risk under 5 different prior choices: the two naive priors, the two flat priors, and the grouping prior formed by the lag-block and the correlated-lognormal grouping priors. We extend the risk simulations from Table 3-1 by adding four additional grouping priors based on the new priors introduced in Appendix B. These new priors are composed by combining an autoregressive parameter grouping prior with an innovation variance grouping prior. These four additional priors are lag-block/InvGamma, sparsity/correlated-lognormal, sparsity/InvGamma, and non-sparse/InvGamma. Specifications for hyperpriors and other simulation choices are the same as in Section 3.6.

Table E-3 contains the covariance and mean risk for this extended simulation. All five of the grouping priors beat the naive and flat priors. The lag-block/correlated-lognormal grouping prior is the most effective of our grouping priors for this analysis. The ability to borrow strength across groups improves the estimation such that even the

Table E-1. Parameter values for risk simulation of Section 3.6.

| Parameter | Value |
|---|---|
| $\mu_1$ | (0, 1.9, 5.2, 9.9, 16.0, 23.5) |
| $\mu_2$ | (0, 1.8, 4.8, 9.0, 14.4, 21.0) |
| $\mu_3$ | (0, 1.8, 5.6, 11.4, 19.2, 29.0) |
| $\mu_4$ | (0, 2.0, 5.0, 9.0, 14.0, 20.0) |
| $\mu_5$ | (0, 2.0, 5.2, 9.6, 15.2, 22.0) |
| $\mu_6$ | (0, 3.0, 6.0, 9.0, 12.0, 15.0) |
| $\mu_7$ | (0, 1.8, 4.8, 9.0, 14.4, 21.0) |
| $\mu_8$ | (0, 2.8, 7.2, 13.2, 20.8, 30.0) |
| $\phi_1$ | (0.7, 0.2, 0.7, 0, 0.2, 0.7, 0, 0, 0.2, 0.7, 0, 0, 0, 0.2, 0.7) |
| $\phi_2$ | (0.6, 0.1, 0.6, 0.1, 0.1, 0.6, -0.1, 0.1, 0.1, 0.6, -0.1, -0.1, 0.1, 0.1, 0.6) |
| $\phi_3$ | (0.4, 0.3, 0.4, -0.2, 0.3, 0.4, 0, -0.2, 0.3, 0.4, -0.2, 0, -0.2, 0.3, 0.4) |
| $\phi_4$ | (0.3, 0, 0.3, -0.1, 0, 0.3, 0, -0.1, 0, 0.3, 0, 0, -0.1, 0, 0.3) |
| $\phi_5$ | (1, -0.5, 1, 0.2, -0.5, 1, 0, 0.2, -0.5, 1, 0, 0, 0.2, -0.5, 1) |
| $\phi_6$ | (0.8, -0.4, 0.8, 0.3, -0.4, 0.8, 0, 0.3, -0.4, 0.8, 0, 0, 0.3, -0.4, 0.8) |
| $\phi_7$ | (0.9, -0.2, 1, -0.2, -0.2, 1, -0.2, -0.2, -0.2, 1, -0.2, -0.2, -0.2, -0.2, 1) |
| $\phi_8$ | (-0.9, 0.1, -0.9, 0, 0.1, -1, 0.2, 0, 0.1, -0.8, -0.2, 0.2, 0, 0.1, -0.8) |
| $\gamma_1$ | (1, 1, 1, 1, 1, 1) |
| $\gamma_2$ | (1.5, 1.5, 1.5, 1.5, 1.5, 1.5) |
| $\gamma_3$ | (3.4, 3.1, 2.8, 2.5, 2.2, 1.8) |
| $\gamma_4$ | (3, 3, 2, 2, 2, 1) |
| $\gamma_5$ | (3.5, 3.2, 2.9, 3.5, 3.2, 2.9) |
| $\gamma_6$ | (5, 3.7, 3, 3, 2, 2) |
| $\gamma_7$ | (2, 1.8, 1.6, 1.4, 1.2, 1) |
| $\gamma_8$ | (3.3, 3, 2.7, 2.4, 2.2, 1.9) |
| $\eta_0$ | (-2.5, -3.5, -9, -13, -20) |
| $\eta_1$ | (0.4, 0.5, 0.8, 1.0, 1.2) |
| $\eta_2$ | (0, 0.2, -2, 0, 0, 0, 0.1, -4) |

Table E-2. Probability $Y_{it}$ is missing by group $m$ for risk simulation of Section 3.6.

| $m$ | $t=2$ | $t=3$ | $t=4$ | $t=5$ | $t=6$ |
|---|---|---|---|---|---|
| 1 | 0.080 | 0.156 | 0.167 | 0.239 | 0.498 |
| 2 | 0.100 | 0.188 | 0.199 | 0.241 | 0.356 |
| 3 | 0.014 | 0.029 | 0.034 | 0.112 | 0.670 |
| 4 | 0.090 | 0.180 | 0.190 | 0.225 | 0.303 |
| 5 | 0.093 | 0.199 | 0.224 | 0.323 | 0.496 |
| 6 | 0.100 | 0.251 | 0.282 | 0.333 | 0.348 |
| 7 | 0.094 | 0.183 | 0.197 | 0.251 | 0.374 |
| 8 | 0.002 | 0.007 | 0.014 | 0.172 | 0.726 |
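The parameter vectors in Table E-1 determine each group's covariance matrix through the modified Cholesky decomposition. The sketch below shows one way to assemble it, assuming the column-wise ordering of the autoregressive parameters displayed in the $T(\phi_m)$ matrix of Section E.1 and assuming that the innovations are $T'y$, so that $T'\Sigma T = D(\gamma)$. The function names are hypothetical, and the example uses group 1 of Table E-1.

```python
import numpy as np


def cholesky_T(phi, p):
    """Unit upper-triangular T with -phi filled in column by column."""
    T = np.eye(p)
    j = 0
    for t in range(1, p):          # column t holds the regression of y_t on y_0..y_{t-1}
        for k in range(t):
            T[k, t] = -phi[j]
            j += 1
    return T


def covariance_from_cholesky(phi, gamma):
    """Sigma such that T' Sigma T = diag(gamma)."""
    gamma = np.asarray(gamma, dtype=float)
    T = cholesky_T(phi, gamma.size)
    T_inv = np.linalg.inv(T)
    return T_inv.T @ np.diag(gamma) @ T_inv


# Group 1 of Table E-1 (p = 6, J = 15).
phi_1 = [0.7, 0.2, 0.7, 0, 0.2, 0.7, 0, 0, 0.2, 0.7, 0, 0, 0, 0.2, 0.7]
gamma_1 = [1, 1, 1, 1, 1, 1]
T_1 = cholesky_T(phi_1, 6)
Sigma_1 = covariance_from_cholesky(phi_1, gamma_1)
```

Under this convention the first diagonal element of the covariance equals the first innovation variance, and the second equals $\phi_{m1}^2\gamma_{m1} + \gamma_{m2}$, which gives a quick check on the ordering.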

Table E-3. Estimated risks for each choice of covariance prior from the extension of the Section 3.6 simulation. The estimated risk is calculated as the average loss using loss functions $L_1(\Sigma_m, \hat\Sigma_{m1}) = \mathrm{tr}(\Sigma_m^{-1}\hat\Sigma_{m1}) - \log|\Sigma_m^{-1}\hat\Sigma_{m1}| - p$, $L_2(\Sigma_m, \hat\Sigma_{m2}) = \mathrm{tr}\{(\Sigma_m^{-1}\hat\Sigma_{m2} - I)^2\}$, and $L_\mu(\hat\mu_m, \mu_m) = (\hat\mu_m - \mu_m)^{\top}\Sigma_m^{-1}(\hat\mu_m - \mu_m)$.

| Prior on $\Phi$ | Prior on $\Gamma$ | $L_1$ | $L_2$ | $L_\mu$ |
|---|---|---|---|---|
| Lag-block | Corr-lognormal | 0.425 | 0.742 | 0.175 |
| Lag-block | InvGamma | 0.437 | 0.759 | 0.174 |
| Sparsity | Corr-lognormal | 0.553 | 0.915 | 0.196 |
| Sparsity | InvGamma | 0.565 | 0.934 | 0.196 |
| Non-sparse | InvGamma | 0.551 | 0.912 | 0.200 |
| Naive Bayes 1 | | 0.605 | 0.987 | 0.203 |
| Naive Bayes 2 | | 0.630 | 1.010 | 0.210 |
| Group-specific flat* | | 0.892 | 1.255 | 0.248 |
| Common-flat | | 8.105 | 84.339 | 0.925 |

*The group-specific flat prior is only over 49 data sets because the Markov chain failed to converge for one data set.

non-sparse grouping prior, which does not allow the correct independence relationships, beats the first naive prior, which correctly incorporates the potential independence. The lag-block/correlated-lognormal prior continues to beat the remainder of the grouping priors, with a risk improvement of 30 and 25% over the naive 1 prior and 52 and 41% over the group-specific flat prior.

In terms of the risk associated with mean estimation, we note that all five grouping priors also dominate the naive and flat priors. The two lag-block priors beat the first naive prior by 14%. The grouping priors with sparse and non-sparse priors for $\Phi$ do only slightly better than the naive choices but are still clearly superior to the flat priors.

E.2 Risk Simulation 2

We now describe three somewhat simpler risk simulations to further demonstrate that our grouping priors perform well in a variety of situations. Consider $M = 5$ groups of four-dimensional ($p = 4$) normally distributed mean-zero random variables. The five

Table E-4. Risk estimates for simulation 2.

| Prior on $\Phi$ | Prior on $\Gamma$ | $L_1$ | $L_2$ |
|---|---|---|---|
| Lag-block | Corr-lognormal | 0.247 | 0.429 |
| Lag-block | InvGamma | 0.257 | 0.448 |
| Sparsity | Corr-lognormal | 0.270 | 0.462 |
| Sparsity | InvGamma | 0.281 | 0.480 |
| Naive Bayes 1 | | 0.291 | 0.488 |
| Non-sparse | InvGamma | 0.292 | 0.493 |
| Naive Bayes 2 | | 0.322 | 0.530 |
| Group-specific flat | | 0.463 | 0.700 |
| Common-flat | | 1.560 | 6.623 |

covariance matrices are defined by the following specification of the autoregressive and innovation variance parameters:

| Group $m$ | $\phi_m$ | $\gamma_m$ |
|---|---|---|
| 1 | (0.7, 0.2, 0.7, 0, 0.2, 0.7) | (1, 1, 1, 1) |
| 2 | (0.7, 0, 0.3, 0, 0, 0.7) | (2, 2, 2, 2) |
| 3 | (0.3, 0, 0.3, 0, 0, 0.3) | (2, 2, 1, 1) |
| 4 | (0.7, 0.2, 0.7, 0.1, 0.2, 0.7) | (5, 5, 5, 5) |
| 5 | (0.7, 0, 0.7, 0, 0, 0.3) | (1, 1, 2, 2) |

We use sample sizes of $n_1 = \dots = n_4 = 30$, $n_5 = 15$. For this specification many of the parameters across groups are equal, and many of the higher lag autoregressive terms are zero. Additionally, with the smaller sample size for the fifth group, the grouping priors should improve estimation of $\Sigma_5$ by sharing information across similar groups. With groups of different sizes, we measure the loss in estimating the collection as the weighted average of the individual losses, $L_k(\Sigma_m, \hat\Sigma_{mk})$ $(k = 1, 2)$, with weights proportional to the sample size $n_m$.

Using the same set-up as the previous simulation produces the risk estimates given in Table E-4. The prior composed of the lag-block structure on the autoregressive parameters and the correlated-lognormal specification for the variances has the best risk estimates of the collection. Comparing the lag-block/InvGamma and sparsity/correlated-lognormal priors to the sparsity/InvGamma grouping prior, the modification on either the autoregressive

or the innovation variances produces improved risk performance. The lag-block/correlated-lognormal analysis produces risk estimates that are 15% and 12% lower than the naive 1 prior. It is natural to compare the sparse naive prior to the sparsity/InvGamma because the first is a limiting case of the latter. Likewise, we compare the naive 2 prior with the non-sparse/InvGamma grouping prior. For both loss functions, the sparsity/InvGamma beats naive 1 and the non-sparse/InvGamma beats naive 2, indicating that the borrowing of information across groups induced by the grouping priors improves the estimation. We also see that the sparsity prior performs better than the non-sparse grouping prior, but this is to be expected since we know that there are autoregressive parameters that are equal to zero in the true model. Comparatively, the estimators from the flat priors perform very poorly; the risks for the grouping priors are at least 37% smaller than the group-specific estimator for $L_1$ and 30% for $L_2$.

E.3 Risk Simulation 3

We perform another risk simulation similar to the previous one, again with five groups and $p = 4$. We fix the mean to zero and the parameters of the true covariance matrices are given by

| Group $m$ | $\phi_m$ | $\gamma_m$ |
|---|---|---|
| 1 | (1, 0.5, 1, 0.5, 0.5, 1) | (2, 2, 2, 2) |
| 2 | (1, 0.5, 1, 0.5, 0.5, 1) | (4, 4, 4, 4) |
| 3 | (1, -0.5, 1, -0.5, -0.5, 1) | (2, 2, 2, 2) |
| 4 | (1, -0.5, 1, -0.5, -0.5, 1) | (4, 4, 4, 4) |
| 5 | (2, -1.0, 2, -0.5, -1.0, 1) | (2, 2, 1, 1) |

Again we use the sample sizes $n_1 = \dots = n_4 = 30$, $n_5 = 15$. There should be a large amount of clustering in this case, since there is a great deal of commonality among autoregressive parameters and innovation variances for different samples. These covariance matrices also do not have any conditional independence relationships to exploit since each of the $\phi$'s is nonzero.

Risk estimates are shown in Table E-5. As in the previous risk simulation, the lag-block/correlated-lognormal prior produces the best risk at 15% and 20% lower than

Table E-6. Parameter values for simulation 4.

| Group $m$ | $\phi_m$ | $\gamma_m$ |
|---|---|---|
| 1 | (0.7, 0.2, 0.7, 0, 0.2, 0.7, 0, 0, 0.2, 0.7, 0, 0, 0, 0.2, 0.7) | (1, 1, 1, 1, 1, 1) |
| 2 | (0.7, 0.2, 0.7, 0.1, 0.2, 0.7, 0, 0.1, 0.2, 0.7, 0, 0, 0.1, 0.2, 0.7) | (1, 1, 1, 1, 1, 1) |
| 3 | (0.3, 0, 0.3, 0, 0, 0.3, 0, 0, 0, 0.3, 0, 0, 0, 0, 0.3) | (3.4, 3.1, 2.8, 2.5, 2.2, 1.8) |
| 4 | (0.3, 0, 0.3, -0.1, 0, 0.3, 0, -0.1, 0, 0.3, 0, 0, -0.1, 0, 0.3) | (3, 3, 2, 2, 2, 1) |
| 5 | (1, -0.5, 1, 0, -0.5, 1, 0, 0, -0.5, 1, 0, 0, 0, -0.5, 1) | (5, 3, 3, 4, 4, 4) |
| 6 | (1, -0.5, 1, 0.3, -0.5, 1, 0, 0.3, -0.5, 1, 0, 0, 0.3, -0.5, 1) | (5, 5, 3, 3, 2, 2) |
| 7 | (1, -0.2, 1, -0.2, -0.2, 1, -0.2, -0.2, -0.2, 1, -0.2, -0.2, -0.2, -0.2, 1) | (2, 1.8, 1.6, 1.4, 1.2, 1) |
| 8 | (1, -0.2, 1, -0.2, -0.2, 1, -0.2, -0.2, -0.2, 1, -0.2, -0.2, -0.2, -0.2, 1) | (3.4, 3.1, 2.8, 2.5, 2.2, 1.8) |

Table E-7. Risk estimates for simulation 4.

| Prior on $\Phi$ | Prior on $\Gamma$ | $L_1$ | $L_2$ |
|---|---|---|---|
| Lag-block | Corr-lognormal | 0.468 | 0.781 |
| Lag-block | InvGamma | 0.492 | 0.816 |
| Sparsity | Corr-lognormal | 0.556 | 0.904 |
| Sparsity | InvGamma | 0.583 | 0.939 |
| Non-sparse | InvGamma | 0.602 | 0.963 |
| Naive Bayes 1 | | 0.664 | 1.013 |
| Naive Bayes 2 | | 0.761 | 1.121 |
| Group-specific flat | | 1.300 | 1.584 |
| Common-flat | | 3.036 | 14.149 |

as in the risk simulation of the article. Here, we allow for $M = 8$ groups and consider $6 \times 6$ covariance matrices, defined by the dependence parameters in Table E-6. We assume a mean of zero and fully observe all data. This choice of dependence parameters incorporates commonality both within lag and across groups, as well as possessing many conditional independence relationships among the higher lag terms. We choose a sample size of thirty for the first five groups and fifteen for the final three groups, and thirty clusters for the grouping priors. The estimated risk associated with estimating the covariance matrices for each of the two loss functions is shown in Table E-7.
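The two covariance losses reported in these tables, together with the sample-size weighting introduced in Section E.2, can be computed as in the following sketch. The function names are assumptions made for this illustration; the formulas are the $L_1$ and $L_2$ losses given in the caption of Table E-3.

```python
import numpy as np


def loss_L1(sigma, sigma_hat):
    """L1 = tr(Sigma^-1 Sigma_hat) - log|Sigma^-1 Sigma_hat| - p."""
    A = np.linalg.solve(sigma, sigma_hat)          # Sigma^-1 Sigma_hat
    _, logdet = np.linalg.slogdet(A)
    return float(np.trace(A) - logdet - sigma.shape[0])


def loss_L2(sigma, sigma_hat):
    """L2 = tr{(Sigma^-1 Sigma_hat - I)^2}."""
    A = np.linalg.solve(sigma, sigma_hat) - np.eye(sigma.shape[0])
    return float(np.trace(A @ A))


def weighted_loss(loss, sigmas, sigma_hats, n):
    """Average of group losses with weights proportional to the sample sizes n."""
    w = np.asarray(n, dtype=float)
    w = w / w.sum()
    return float(sum(wm * loss(s, sh) for wm, s, sh in zip(w, sigmas, sigma_hats)))


# Toy check: true covariance I_2, estimate 2 * I_2.
I2 = np.eye(2)
l1 = loss_L1(I2, 2.0 * I2)                         # 4 - 2 log 2 - 2
l2 = loss_L2(I2, 2.0 * I2)                         # tr(I_2) = 2
# Two groups: one estimated perfectly, one not, with sample sizes 30 and 15.
wl = weighted_loss(loss_L2, [I2, I2], [I2, 2.0 * I2], n=[30, 15])
```

Both losses are zero when the estimate equals the truth, and the weighting lets larger groups contribute proportionally more to the overall risk.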


Table E-8. Model fit statistics and treatment effects for the first two groups for the depression data using each of the priors.

| Prior on $\phi$ | Prior on $\gamma$ | Dev | $p_D$ | DIC | Treatment effect, Group 1 | Treatment effect, Group 2 |
|---|---|---|---|---|---|---|
| Lag-block | Corr-logN(=0.90) | 39,006 | 342 | 39,690 | 9.23 (7.03, 11.48) | 9.51 (6.85, 12.19) |
| Lag-block | InvGamma | 38,999 | 350 | 39,698 | 9.22 (6.98, 11.45) | 9.39 (6.73, 12.13) |
| Lag-block | Corr-logN(=0.75) | 39,006 | 347 | 39,700 | 9.22 (6.99, 11.42) | 9.56 (6.85, 12.27) |
| Lag-block | Corr-logN(=0.50) | 39,003 | 349 | 39,700 | 9.24 (7.00, 11.53) | 9.41 (6.74, 12.14) |
| Sparsity | Corr-logN(=0.90) | 38,887 | 464 | 39,816 | 9.25 (7.02, 11.49) | 8.78 (6.33, 11.24) |
| Sparsity | Corr-logN(=0.75) | 38,887 | 466 | 39,819 | 9.23 (6.96, 11.42) | 8.82 (6.39, 11.27) |
| Sparsity | Corr-logN(=0.50) | 38,883 | 472 | 39,827 | 9.25 (7.01, 11.51) | 8.68 (6.23, 11.20) |
| Sparsity | InvGamma | 38,884 | 475 | 39,834 | 9.25 (7.02, 11.53) | 8.64 (6.16, 11.15) |
| Naive Bayes 1 | | 38,875 | 481 | 39,837 | 9.25 (7.11, 11.46) | 8.56 (6.16, 10.99) |
| Non-sparse | InvGamma | 38,818 | 529 | 39,876 | 9.29 (7.01, 11.59) | 8.81 (6.21, 11.52) |
| Naive Bayes 2 | | 38,765 | 563 | 39,890 | 9.24 (7.02, 11.49) | 7.99 (5.59, 10.46) |
| Common-flat | | 39,907 | 220 | 40,347 | 9.44 (6.21, 12.53) | 10.17 (7.02, 13.24) |
| Group-specific flat | | 39,178 | 1021 | 41,219 | 9.20 (6.44, 12.08) | 6.93 (4.22, 9.77) |
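As a reading aid for the model fit columns, the short sketch below checks how the three fit statistics relate. It rests on an observation from the tabulated numbers rather than on a definition given in this section: every row agrees, up to rounding, with DIC = Dev + 2 $p_D$, the usual form of the deviance information criterion when Dev is the deviance evaluated at the posterior mean. The row labels are shortened for illustration.

```python
# (label, Dev, pD, DIC) as tabulated for the depression data.
rows = [
    ("Lag-block / Corr-logN 0.90", 39006, 342, 39690),
    ("Lag-block / InvGamma", 38999, 350, 39698),
    ("Lag-block / Corr-logN 0.75", 39006, 347, 39700),
    ("Lag-block / Corr-logN 0.50", 39003, 349, 39700),
    ("Sparsity / Corr-logN 0.90", 38887, 464, 39816),
    ("Sparsity / Corr-logN 0.75", 38887, 466, 39819),
    ("Sparsity / Corr-logN 0.50", 38883, 472, 39827),
    ("Sparsity / InvGamma", 38884, 475, 39834),
    ("Naive Bayes 1", 38875, 481, 39837),
    ("Non-sparse / InvGamma", 38818, 529, 39876),
    ("Naive Bayes 2", 38765, 563, 39890),
    ("Common-flat", 39907, 220, 40347),
    ("Group-specific flat", 39178, 1021, 41219),
]


def dic_from(dev, p_d):
    """Assumed relationship: DIC = Dev + 2 * pD."""
    return dev + 2 * p_d


# Largest absolute gap between the assumed formula and the tabulated DIC.
max_gap = max(abs(dic_from(dev, p_d) - dic) for _, dev, p_d, dic in rows)

# Smaller DIC indicates better fit after penalizing model complexity.
best_label = min(rows, key=lambda r: r[3])[0]
print(max_gap, best_label)
```

Read this way, the flat priors are penalized either through a large deviance (common flat) or through a large effective number of parameters (group-specific flat).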


APPENDIX F
MODEL PARAMETERS FOR RISK SIMULATION OF CHAPTER 4

We describe the covariance matrices used to create the data used in the risk simulation of Section 4.4. For each group $m$, we provide the $T_m$ Cholesky matrix where, instead of denoting 1 on the diagonal, we include the IV $\gamma_{mt}$ in bold. Empty upper-triangular elements are zero. Recall that the true partition structure is $P_1 = \{M\}$, $P_2 = \cdots = P_5 = \{\{1,2,3,4\},\{5,6,7,8\}\}$, $P_6 = \cdots = P_8 = \{\{1,2,3,4\},\{5,6\},\{7,8\}\}$, and $P_9 = P_{10} = \{\{1,2\},\{3,4\},\{5,6\},\{7,8\}\}$. Groups are in the same set of the partition if $\gamma_{mt}$ and the $t$th column of the $T_m$ matrix are equal.

Groups $m = 1, 2$:

$$
\begin{bmatrix}
\mathbf{1.0} & -.7 & -.2 & & & & & & & \\
 & \mathbf{1.0} & -.7 & -.2 & & & & & & \\
 & & \mathbf{1.0} & -.7 & -.2 & & & & & \\
 & & & \mathbf{1.0} & -.7 & -.2 & & & & \\
 & & & & \mathbf{1.0} & -.7 & -.2 & & & \\
 & & & & & \mathbf{1.0} & -.7 & -.2 & & \\
 & & & & & & \mathbf{1.0} & -.7 & -.2 & \\
 & & & & & & & \mathbf{1.0} & -.7 & -.2 \\
 & & & & & & & & \mathbf{1.0} & -.7 \\
 & & & & & & & & & \mathbf{1.0}
\end{bmatrix}
$$

Groups $m = 3, 4$:

$$
\begin{bmatrix}
\mathbf{1.0} & -.7 & -.2 & & & & & & & \\
 & \mathbf{1.0} & -.7 & -.2 & & & & & & \\
 & & \mathbf{1.0} & -.7 & -.2 & & & & & \\
 & & & \mathbf{1.0} & -.7 & -.3 & & & & \\
 & & & & \mathbf{1.0} & -.8 & -.3 & & & \\
 & & & & & \mathbf{0.7} & -.8 & -.3 & & \\
 & & & & & & \mathbf{0.7} & -.8 & -.3 & \\
 & & & & & & & \mathbf{0.7} & -.8 & -.3 \\
 & & & & & & & & \mathbf{0.7} & -.8 \\
 & & & & & & & & & \mathbf{0.7}
\end{bmatrix}
$$


Groups $m = 5, 6$:

$$
\begin{bmatrix}
\mathbf{1.0} & -.5 & -.2 & & & & & & & \\
 & \mathbf{1.0} & -.5 & -.2 & & & & & & \\
 & & \mathbf{0.8} & -.5 & -.2 & & & & & \\
 & & & \mathbf{0.8} & -.5 & -.2 & & & & \\
 & & & & \mathbf{0.8} & -.5 & -.2 & & & \\
 & & & & & \mathbf{0.8} & -.5 & -.2 & & \\
 & & & & & & \mathbf{0.8} & -.5 & -.2 & \\
 & & & & & & & \mathbf{0.8} & -.5 & -.2 \\
 & & & & & & & & \mathbf{0.8} & -.5 \\
 & & & & & & & & & \mathbf{0.8}
\end{bmatrix}
$$

Groups $m = 7, 8$:

$$
\begin{bmatrix}
\mathbf{1.0} & -.5 & -.2 & & & & & & & \\
 & \mathbf{1.0} & -.5 & -.2 & & & & & & \\
 & & \mathbf{0.8} & -.5 & -.2 & & & & & \\
 & & & \mathbf{0.8} & -.5 & -.2 & & & & \\
 & & & & \mathbf{0.8} & -.5 & -.2 & & & \\
 & & & & & \mathbf{0.8} & -.5 & -.2 & .2 & \\
 & & & & & & \mathbf{0.8} & -.5 & -.2 & .2 \\
 & & & & & & & \mathbf{0.8} & -.7 & -.2 \\
 & & & & & & & & \mathbf{1.2} & -.7 \\
 & & & & & & & & & \mathbf{1.2}
\end{bmatrix}
$$
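The rule stated at the start of this appendix, that groups share a set of the partition at time $t$ when their $\gamma_{mt}$ and the $t$th column of $T_m$ coincide, can be sketched as follows. This is an illustrative sketch only: the function name `partition_at` is hypothetical, the three small matrices are made-up toy values rather than the simulation matrices above, and the IV is stored on the diagonal so that comparing whole columns also compares $\gamma_{mt}$.

```python
from collections import defaultdict


def partition_at(t, T_by_group):
    """Sets of group labels whose t-th column (0-based) is identical.

    Each matrix stores the innovation variance on its diagonal, so equal
    columns imply equal IVs as well as equal off-diagonal entries.
    Exact equality is used; real estimates would need a tolerance.
    """
    blocks = defaultdict(list)
    for m, T in sorted(T_by_group.items()):
        column = tuple(row[t] for row in T)
        blocks[column].append(m)
    return sorted(blocks.values())


# Hypothetical 3 x 3 example with three groups (toy values only).
toy = {
    1: [[1.0, -0.7, -0.2],
        [0.0, 1.0, -0.7],
        [0.0, 0.0, 1.0]],
    2: [[1.0, -0.7, -0.3],
        [0.0, 1.0, -0.8],
        [0.0, 0.0, 0.7]],
    3: [[1.0, -0.5, -0.2],
        [0.0, 1.0, -0.5],
        [0.0, 0.0, 0.8]],
}

for t in range(3):
    print(t + 1, partition_at(t, toy))
```

In the toy example all three groups agree at the first time point, groups 1 and 2 still agree at the second, and every group stands alone at the third, mirroring the way the partitions $P_t$ can change with $t$.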
