Linear Mixed Model Estimation with Dirichlet Process Random Effects


Material Information

Title:
Linear Mixed Model Estimation with Dirichlet Process Random Effects
Physical Description:
1 online resource (84 p.)
Language:
english
Creator:
Li, Chen
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

Thesis/Dissertation Information

Degree:
Doctorate (Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Statistics
Committee Chair:
Casella, George
Committee Members:
Ghosh, Malay
Young, Linda
Mai, Volker

Subjects

Subjects / Keywords:
blue -- dirichlet -- linear -- mixed -- models -- ols -- process
Statistics -- Dissertations, Academic -- UF
Genre:
Statistics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
The linear mixed model is very popular, and has proven useful in many areas of application. (See, for example, McCulloch and Searle (2001), Demidenko (2004), and Jiang (2007).) Usually people assume that the random effect is normally distributed. However, as this distribution is not observable, it is possible that the distribution of the random effect is non-normal (Burr and Doss (2005), Gill and Casella (2009), Kyung et al. (2009, 2010)). We assume that the random effect follows a Dirichlet process, as discussed in Burr and Doss (2005), Gill and Casella (2009), and Kyung et al. (2009, 2010). In this dissertation, we first consider the Dirichlet process as a model for classical random effects, and investigate their effect on frequentist estimation in the linear mixed model. We discuss the relationship between the BLUE (Best Linear Unbiased Estimator) and OLS (Ordinary Least Squares) in Dirichlet process mixed models, and also give conditions under which the BLUE coincides with the OLS estimator in the Dirichlet process mixed model. In addition, we investigate the model from the Bayesian view, discuss the properties of estimators under different model assumptions, compare the estimators under the frequentist model and different Bayesian models, and investigate minimaxity. Furthermore, we apply the linear mixed model with Dirichlet process random effects to a real data set and get satisfactory results.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Chen Li.
Thesis:
Thesis (Ph.D.)--University of Florida, 2012.
Local:
Adviser: Casella, George.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2013-02-28

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2012
System ID:
UFE0044412:00001




Full Text

LINEAR MIXED MODEL ESTIMATION WITH DIRICHLET PROCESS RANDOM EFFECTS

By

CHEN LI

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2012

© 2012 Chen Li

To my parents and my sister

ACKNOWLEDGMENTS

I would like to sincerely thank my advisor Dr. George Casella for his guidance, patience and help. I feel very lucky to have gotten to know him and to learn under him. I learned a lot from him, not only knowledge but also his work ethic and his attitude toward life. I would like to thank everyone on my supervisory committee, Dr. Malay Ghosh, Dr. Linda Young and Dr. Volker Mai, for their guidance and encouragement. Their suggestions and help made a big impact on this dissertation. I would like to thank the faculty at the Department of Statistics. They taught me a lot, both in and out of the classroom. I am very lucky to be a graduate student at the Department of Statistics. I would like to thank all my friends, both in the USA and in China, for their friendship and support. I thank my parents for their support and confidence in me. Without their support and encouragement, I would not have the courage and ability to pursue my dreams. I also want to thank my sister Yan, my brother-in-law Jian, my brother Yang and my nephew Ziyang for their consistent help and support.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION

2 POINT ESTIMATION AND INTERVAL ESTIMATION
  2.1 Gauss-Markov Theorem
  2.2 Equality of OLS and BLUE: Eigenvector Conditions
  2.3 Equality of OLS and BLUE: Matrix Conditions
  2.4 Some Examples
  2.5 Interval Estimation
  2.6 The Oneway Model
    2.6.1 Variance Comparisons
    2.6.2 Unimodality and Symmetry
    2.6.3 Limiting Values of m
    2.6.4 Relationship Among Densities of Ȳ with Different m
  2.7 The Estimation of σ², τ² and Covariance
    2.7.1 MINQUE for σ² and τ²
    2.7.2 MINQE for σ² and τ²
    2.7.3 Estimation of σ², τ² and Covariance by the Sample Covariance Matrix
    2.7.4 Further Discussion
    2.7.5 Simulation Study

3 SIMULATION STUDIES AND APPLICATION
  3.1 Simulation Studies
    3.1.1 Data Generation and Estimation of Parameters
    3.1.2 Simulation Results
  3.2 Application to a Real Data Set
    3.2.1 The Model and Estimation
    3.2.2 Simulation Results for the Models in Section 3.2.1
    3.2.3 Discussion of the Simulation Studies Results
    3.2.4 Results Using the Real Data Set

4 BAYESIAN ESTIMATION UNDER THE DIRICHLET MIXED MODEL
  4.1 Bayesian Estimators under Four Models
    4.1.1 Four Models and Corresponding Bayesian Estimators
    4.1.2 More General Cases
  4.2 The Oneway Model
    4.2.1 Estimators
    4.2.2 Comparison and Choice of the Parameter Based on MSE
  4.3 The MSE and Bayes Risks
    4.3.1 Oneway Model
    4.3.2 General Linear Mixed Model

5 MINIMAXITY AND ADMISSIBILITY
  5.1 Minimaxity and Admissibility of Estimators
  5.2 Admissibility of Confidence Intervals

6 CONCLUSIONS AND FUTURE WORK

APPENDIX

A PROOF OF THEOREM 2
B PROOF OF THEOREM 3
C EVALUATION OF EQUATION (2-4)
D PROOF OF THEOREM 14
E PROOF OF THEOREM 15
F PROOF OF THEOREM 16

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

2-1 Estimated cutoff points of Example 10, with α = 0.95, σ² = τ² = 1, and m = 3.
2-2 Estimated cutoff points (α = 0.975) under the Dirichlet model Y_ij = μ + ψ_i + ε_ij, 1 ≤ i ≤ 6, 1 ≤ j ≤ 6, for different values of m. σ² = τ² = 1.
2-3 Estimated σ² and τ² under the Dirichlet process oneway model Y_ij = μ + ψ_i + ε_ij, 1 ≤ i ≤ 7, 1 ≤ j ≤ 7. σ² = τ² = 1.
2-4 Estimated σ² and τ² under the Dirichlet process oneway model Y_ij = μ + ψ_i + ε_ij, 1 ≤ i ≤ 7, 1 ≤ j ≤ 7. σ² = τ² = 10.
2-5 Estimated cutoff points for the density of Ȳ with the estimators of σ² and τ² in Table 2-3 under the Dirichlet model Y_ij = μ + ψ_i + ε_ij, 1 ≤ i ≤ 7, 1 ≤ j ≤ 7. m = 3, μ = 0, α = 0.95. True σ² = τ² = 1.
3-1 The simulation results with σ² = τ² = 1. m = 1.
3-2 The simulation results with σ² = τ² = 5. m = 1.
3-3 The simulation results with σ² = τ² = 10. m = 1.
3-4 The data setups.
4-1 The MSEs with different prior variances. σ² = τ² = 1.
4-2 The MSEs with different prior variances. σ² = τ² = 5.

LIST OF FIGURES

2-1 The relationship between d and m, with r = 7, 12, 15, 20.
2-2 Densities of Ȳ corresponding to different values of m with σ = τ = 1.
4-1 Var(Ȳ | normal model) − Var(Ȳ | Dirichlet model) for small m, σ = 1, τ = .5.
4-2 The Bayes risks of the Bayesian estimators in Models 1-4 and the Bayes risk of the BLUE. m = 3.
4-3 The Bayes risks.
4-4 The MSEs under different models.
4-5 The Bayes risks. τ² = 5.
4-6 The MSEs. τ² = 5.
4-7 The MSEs. τ² = 5.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

LINEAR MIXED MODEL ESTIMATION WITH DIRICHLET PROCESS RANDOM EFFECTS

By

Chen Li

August 2012

Chair: George Casella
Major: Statistics

The linear mixed model is very popular, and has proven useful in many areas of application. (See, for example, McCulloch and Searle (2001), Demidenko (2004), and Jiang (2007).) Usually people assume that the random effect is normally distributed. However, as this distribution is not observable, it is possible that the distribution of the random effect is non-normal (Burr and Doss (2005), Gill and Casella (2009), Kyung et al. (2009, 2010)). We assume that the random effect follows a Dirichlet process, as discussed in Burr and Doss (2005), Gill and Casella (2009), and Kyung et al. (2009, 2010). In this dissertation, we first consider the Dirichlet process as a model for classical random effects, and investigate their effect on frequentist estimation in the linear mixed model. We discuss the relationship between the BLUE (Best Linear Unbiased Estimator) and OLS (Ordinary Least Squares) in Dirichlet process mixed models, and also give conditions under which the BLUE coincides with the OLS estimator in the Dirichlet process mixed model. In addition, we investigate the model from the Bayesian view, discuss the properties of estimators under different model assumptions, compare the estimators under the frequentist model and different Bayesian models, and investigate minimaxity. Furthermore, we apply the linear mixed model with Dirichlet process random effects to a real data set and get satisfactory results.

CHAPTER 1
INTRODUCTION

The linear mixed model is very popular, and has proven useful in many areas of application. (See, for example, McCulloch and Searle (2001), Demidenko (2004), and Jiang (2007).) It is typically written in the form

    Y = Xβ + Zψ + ε,                                                    (1-1)

where Y is an n × 1 vector of responses, X is an n × p known design matrix, β is a p × 1 vector of coefficients, Z is another known n × r matrix multiplying the r × 1 vector ψ, a vector of random effects, and ε ∼ N_n(0, σ²I_n) is the error. It is typical to assume that ψ is normally distributed. However, as this distribution is not observable, it is possible that the distribution of the random effect is non-normal (Burr and Doss (2005), Gill and Casella (2009), Kyung et al. (2009, 2010)). It has now become popular to change the distributional assumption on ψ to a Dirichlet process, as discussed in Burr and Doss (2005), Gill and Casella (2009), and Kyung et al. (2009, 2010).

The first use of Dirichlet processes as prior distributions was by Ferguson (1973); see also Antoniak (1974), who investigated the basic properties. At about the same time, Blackwell and MacQueen (1973) proved that the marginal distribution of the Dirichlet process was the same as the distribution of the nth step of a Polya urn process. There was not a great deal of work done on this topic in the following years, perhaps due to the difficulty of computing with Dirichlet process priors. The theory was advanced by the work of Korwar and Hollander (1973), Lo (1984), and Sethuraman (1994). However, not until the 1990s and, in particular, the Gibbs sampler, did work in this area take off. There we have contributions from Liu (1996), who furthered the theory, and work by Escobar and West (1995), MacEachern and Muller (1998), Neal (2000), and others, who used Gibbs sampling to do Bayesian computations. More recently, Kyung et al. (2009) investigated a variance reduction property of the Dirichlet process prior, and Kyung et al.

(2010) provided a new Gibbs sampler for the linear Dirichlet mixed model and discussed estimation of the precision parameter of the Dirichlet process.

Since the 1990s, the Bayesian approach has seen the most use of models with Dirichlet process priors. Here, however, we want to consider the Dirichlet process as a model for classical random effects, and to investigate their effect on frequentist estimation in the linear mixed model.

Many papers discuss the MLE of a mixture of normal densities. For example, Young and Coraluppi (1969) developed a stochastic approximation algorithm to estimate mixtures of normal densities with unknown means and unknown variance. Day (1969) provided a method of estimating a mixture of two normal distributions with the same unknown covariance matrix. Peters and Walker (1978) discussed an iterative procedure to get the MLE for a mixture of normal distributions. Xu and Jordan (1996) discussed the EM algorithm for finite Gaussian mixtures and discussed the advantages and disadvantages of EM for Gaussian mixture models. However, we cannot use the methods mentioned in the above papers to get the MLE in Dirichlet process mixed models. The above papers considered the density Σ_i π_i f_i(x|θ_i), where θ_i is a parameter; the proportion π_i is also a parameter, independent of θ_i, with Σ_i π_i = 1. However, in the Dirichlet process mixed model, the density considered is Σ_A P(A) f(x|A), where the proportion P(A) depends on the matrix A and f(x|A) also depends on the matrix A. Here A is an r × k matrix. The r × k matrix A is a binary matrix; each row is all zeros except for one entry, which is a 1, which depicts the cluster to which that observation is assigned. Of course, both k and A are unknown. We will discuss more details about the matrix A in the next chapter. The weights P(A) are correlated with the corresponding components f(x|A). Thus, the methods and results in these papers cannot be used here directly. We do not discuss the MLE here. We will consider other methods to estimate the fixed effects: the best linear unbiased estimator

(BLUE) and ordinary least squares (OLS) for the fixed effects, and the MINQUE/sample covariance matrix method for the variance components σ² and τ².

The Gauss-Markov Theorem, which finds the BLUE, is given by Zyskind and Martin (1969) for the linear model, and by Harville (1976) for the linear mixed model, where he also obtained the best linear unbiased predictor (BLUP) of the random effects. Robinson (1991) discussed BLUP and the estimation of random effects, and Afshartous and Wolf (2007) focused on the inference of random effects in multilevel and mixed effects models. Huang and Lu (2001) extended the Gauss-Markov theorem to include nonparametric mixed-effects models. Many papers have discussed the relationship between OLS and BLUE, with the first results obtained by Zyskind (1967). Puntanen and Styan (1989) discussed this relationship in a historical perspective.

By the Gauss-Markov Theorem, we can write the BLUE for the fixed effects in closed form. We give the formula for the corresponding variance-covariance matrix, which helps us get the covariance matrix directly. We are concerned with finding the best linear unbiased estimator (BLUE), and seeing when this coincides with the ordinary least squares (OLS) estimator. We provide conditions, called Eigenvector Conditions and Matrix Conditions respectively, under which there is equality between the OLS and BLUE. By these theorems, we can just use OLS as the BLUE in many cases, which avoids the difficulties and computational efforts of estimating the variance components σ², τ² and the precision parameter m. In addition, we find that the covariance is directly related to the precision parameter of the Dirichlet process, giving a new interpretation of this parameter. The monotonicity property of the correlation is also investigated. Furthermore, we provide a method to construct confidence intervals.

Another problem in the Dirichlet process mixed model is to estimate the parameters σ² and τ². In the Dirichlet process mixed model, the distribution of responses is a mixture of normal densities, not a single normal distribution, which might lead to some difficulty when we try to use some methods (for example, maximum likelihood) to

estimate the parameters. We will discuss three methods (MINQUE, MINQE, and the sample covariance matrix) to find the estimators for σ² and τ², and show a simulation study. These three methods do not need the response to follow a normal distribution. The simulation study shows that the estimators from the sample covariance matrix are very satisfactory. In addition, we can also get satisfactory estimation of the covariance by using the sample covariance matrix method.

In the situation when the variance components are unknown, Kackar and Harville (1981) discussed the construction of estimators with unknown variance components and showed that the estimated BLUE and BLUP remain unbiased when the estimators of the standard errors are even and translation-invariant. Other works include Kackar and Harville (1984), who gave a general approximation of mean squared errors when using estimated variances, and Das et al. (2004), who discussed mean squared errors of empirical predictors in general cases when using ML or REML to estimate the errors. We will show that the estimators for σ² and τ² by the sample covariance matrix satisfy the even and translation-invariant conditions. So the estimator of β (or ψ, or their linear combinations) with estimators of σ² and τ² from the sample covariance matrix is still unbiased. On the other hand, by Das et al. (2004) we know that the estimation by MINQUE also satisfies the even and translation-invariant conditions. Then the estimators of β (or ψ, or their linear combinations) with estimators of σ² and τ² from MINQUE are also unbiased. All the results mentioned above will be shown in detail in Chapter 2.

We have discussed the classical estimation under the Dirichlet process mixed model above. We will also compare the performance of the Dirichlet model with the performance of the classical normal model through some data analysis. We will consider both simulated data sets and a real data set. First, we will consider some simulation studies. Then we will move to applying the Dirichlet model to a real data set. We use both the Dirichlet model and the classical normal model to fit the simulated data and the real

data set, and compare the corresponding results. The results show that the Dirichlet process mixed model is robust and tends to give better results. All the numerical analysis results are listed in Chapter 3.

The way we used to get the above results is from the frequentist viewpoint. Another way to discuss the Dirichlet process mixed model is from the Bayesian viewpoint. We always put priors on β when using Bayesian methods. Different priors and different random effects might lead to different estimators, different MSEs and different Bayes risks. We can assume that the random effects follow a normal distribution. We can also assume that the random effects follow the Dirichlet process. We can put a normal distribution prior on β. We can also put the flat prior on β. We are interested in the answer to the question: which prior/model is better? Chapter 4 considers this question. In order to compare the priors and models, we will first give the four models. We can get the corresponding Bayesian estimators, show the corresponding MSEs and Bayes risks of these Bayesian estimators, and discuss which model is better. More details in the oneway model are also discussed.

Under the classical normal mixed model, we know the minimax estimators of the fixed effects in some special cases. We want to know if there are still some minimax estimators of the fixed effects under the Dirichlet process mixed model. We will discuss the minimaxity and admissibility of the estimators, and show the admissibility of confidence intervals under squared error loss. We will show that Ȳ is minimax in the Dirichlet process oneway model. This result also holds for the multivariate case. Chapter 5 will discuss these properties.

The dissertation is organized as follows. In Chapter 2 we will derive the BLUE and the BLUP, examine the BLUE-OLS relationship, and look at interval estimation. In Section 2.7 we will give some methods to estimate the covariance components σ² and τ², and provide a simulation study to compare the methods. In Chapter 3 we will show the performance of the Dirichlet process mixed model by fitting the simulated data

sets and a real data set. In Chapter 4 we will discuss the Dirichlet process mixed model from the Bayesian viewpoint. We will compare the models with different priors on β and different random effects to see which one is better. In Chapter 5 we will investigate the minimaxity and admissibility under the Dirichlet process mixed model in some special cases. At last, we will give a conclusion. There is a technical appendix at the end.
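The clustering behavior that drives this dissertation, random effects tied together by the Dirichlet process, with a binary matrix A recording the cluster assignments, can be illustrated with a small simulation. The sketch below is our own illustration (function names are ours, stdlib Python only; τ plays the role of the base-measure standard deviation), not code from the dissertation: it draws ψ_1, ..., ψ_r from DP(m, N(0, τ²)) via the Polya urn and then builds the matrix A.

```python
import random

def sample_dp_effects(r, m, tau, rng=None):
    """Draw psi_1, ..., psi_r from DP(m, N(0, tau^2)) via the
    Blackwell-MacQueen (Polya urn) scheme: a fresh N(0, tau^2) value
    with probability m/(i-1+m), otherwise a copy of an earlier value."""
    rng = rng or random.Random()
    psi = []
    for i in range(1, r + 1):
        if rng.random() < m / (i - 1 + m):
            psi.append(rng.gauss(0.0, tau))   # new cluster from the base measure
        else:
            psi.append(rng.choice(psi))       # tie to a previously drawn effect
    return psi

def cluster_matrix(psi):
    """The binary r x k matrix A: each row is all zeros except for a single 1
    marking the cluster (distinct value) to which that observation is assigned."""
    clusters = []
    for v in psi:
        if v not in clusters:
            clusters.append(v)
    return [[1 if v == c else 0 for c in clusters] for v in psi]
```

Larger m makes the fresh-draw branch more likely, so more clusters appear; the column sums of A are the cluster sizes r_1, ..., r_k used throughout Chapter 2.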

CHAPTER 2
POINT ESTIMATION AND INTERVAL ESTIMATION

Here we consider estimation of β in (1-1), where we assume that the random effects follow a Dirichlet process. The Gauss-Markov Theorem (Zyskind and Martin (1969); Harville (1976)) is applicable in this case, and can be used to find the BLUE of β. In Section 2.1, we give the BLUE of β and the BLUP of ψ. In Sections 2.2 and 2.3 we investigate conditions under which OLS is BLUE. In Section 2.4, we give some examples to show the equality between the OLS and the BLUE. In Section 2.5, we give a method to construct confidence intervals. Section 2.6 shows properties under the Dirichlet process oneway model. Section 2.7 discusses the methods to estimate the variance components σ² and τ².

Consider model (1-1), but now we allow the vector ψ to follow a Dirichlet process with a normal base measure and precision parameter m,

    ψ_i ∼ DP(m, N(0, τ²)).

Blackwell and MacQueen (1973) showed that if ψ_1, ψ_2, ... are i.i.d. from G ∼ DP(m, φ_0), the joint distribution of ψ is a product of the form

    ψ_i | ψ_1, ..., ψ_{i-1}, m ∼ [m/(i − 1 + m)] φ_0(ψ_i) + [1/(i − 1 + m)] Σ_{l=1}^{i-1} δ(ψ_i = ψ_l).

As discussed in Kyung et al. (2010), this expression tells us that there might be clusters, because the value of ψ_i can be equal to one of the previous values with positive probability. The implication of this representation is that the random effects from the Dirichlet process can have common values, and this led Kyung et al. (2010) to use a conditional representation of (1-1) of the form

    Y = Xβ + ZAη + ε,                                                   (2-1)

where ψ = Aη, A is an r × k matrix, η ∼ N_k(0, τ²I_k), and I_k is the k × k identity matrix. The r × k matrix A is a binary matrix; each row is all zeros except for one entry, which is a 1, which depicts the cluster to which that observation is assigned. Both k and A are

unknown, but we do know that if A has column sums {r_1, r_2, ..., r_k}, then the marginal distribution of A (Kyung et al. (2010)), under the DP random effects model, is

    P(A) = φ(r_1, r_2, ..., r_k) = [Γ(m)/Γ(m + r)] m^k Π_{j=1}^{k} Γ(r_j).          (2-2)

If A is known then (2-1) is a standard normal random effects linear mixed model, and we have E(Y|A) = Xβ, Var(Y|A) = σ²I_n + τ²ZAA'Z'. When A is unknown, it still remains that E(Y|A) = Xβ, but now we have V = Var(Y) = E[Var(Y|A)] + Var[E(Y|A)] = E[Var(Y|A)], as the second term on the right side is zero. It then follows that

    V = Var(Y) = σ²I_n + τ² Σ_A P(A) ZAA'Z' = σ²I_n + ZWZ',             (2-3)

where W = τ² Σ_A P(A) AA' = [w_ij]_{r×r}, with w_ij = τ²E(a_i'a_j) = I(i ≠ j) dτ² + I(i = j) τ², and, by Appendix C,

    d = Σ_{i=1}^{r-1} i m Γ(m + r − 1 − i) Γ(i) / Γ(m + r),             (2-4)

where I(·) is the indicator function. That is, W has diagonal entries τ² and off-diagonal entries dτ². Let V = [v_ij]_{n×n} and Z = [z_ij]. For i ≠ j, v_ij = Σ_k Σ_l z_ik w_kl z_jl, which might depend on d and Z. Thus the covariance of Y_i and Y_j might depend on Z, τ², r and m.

Example 1. We consider a model of the form

    Y_ij = x_ij'β + ψ_i + ε_ij,  1 ≤ i ≤ r, 1 ≤ j ≤ t,                  (2-5)

which is similar to (1-1), except here we might consider ψ_i to be a subject-specific random effect. If we let Y = [Y_11, ..., Y_1t, ..., Y_r1, ..., Y_rt]', 1_t = [1, ..., 1]'_{t×1}, and B the n × r block-diagonal matrix diag(1_t, ..., 1_t) with r diagonal blocks, where n = rt, then model (2-5) can be written

    Y = Xβ + BAη + ε,                                                   (2-6)

so Y|A ∼ N(Xβ, σ²I_n + τ²BAA'B'). The BLUE of β is given in (2-9), and has variance (X'V⁻¹X)⁻¹, but now we can evaluate V by Eq. (2-3), obtaining

    V = [ σ²I + τ²J    dτ²J        ...    dτ²J
          dτ²J         σ²I + τ²J   ...    dτ²J
          ...
          dτ²J         dτ²J        ...    σ²I + τ²J ],                  (2-7)

where I is the t × t identity matrix and J is a t × t matrix of ones. If i ≠ i', the correlation is

    Corr(Y_ij, Y_i'j') = dτ²/(σ² + τ²)
                       = τ² Σ_{i=1}^{r-1} i m Γ(m + r − 1 − i) Γ(i) / [Γ(m + r)(σ² + τ²)].   (2-8)

This last expression is quite interesting, as it relates the precision parameter m to the correlation in the observations, a relationship that was not apparent before. Although we are not completely sure of the behavior of this function, we expected that the correlation would be a decreasing function of m. This would make sense, as a bigger value of m implies more clusters in the process, leading to smaller correlations. This is not the case, however, as Figure 2-1 shows. We can establish that d is decreasing when m is either small or large, but the middle behavior is not clear. What we can establish

Figure 2-1. The relationship between d and m, with r = 7, 12, 15, 20.

about the behavior of d is summarized in the following theorem, whose proof is given in Appendix A.

Theorem 2. Let d be the same as before. Then d is decreasing in m for m ≥ √((r − 2)(r − 1)) or 0 ≤ m ≤ 2.

Proof. See Appendix A.

2.1 Gauss-Markov Theorem

We can now apply the Gauss-Markov Theorem, as in Harville (1976), to obtain the BLUE of β and the BLUP of ψ:

    β̃ = (X'V⁻¹X)⁻¹X'V⁻¹Y,    ψ̃ = C'V⁻¹(Y − Xβ̃),                      (2-9)

where C = Cov(Y, ψ) = τ² Σ_A P(A) ZAA' = ZW. It also follows from Harville (1976) that for predicting w = L'β + ψ, for some known matrix L' such that L'β is estimable, the BLUP of w is ŵ = L'β̃ + C'V⁻¹(Y − Xβ̃). To use (2-9) to calculate the BLUP requires

either knowledge of V, or the verification that the BLUE is equal to the OLS estimator, and we need to know C to use the BLUP. Unfortunately, we have neither in the general case.

There are a number of conditions under which these estimators are equal (e.g. Puntanen and Styan (1989); Zyskind (1967)), all looking at the relationship between Var(Y) and X. For example, when Var(Y) is nonsingular, one necessary and sufficient condition is that H Var(Y) = Var(Y) H, where H = X(X'X)⁻X', which also implies that H Var(Y) is symmetric. From Zyskind (1967) we know that another necessary and sufficient condition for OLS being BLUE is that a subset of r_X (r_X = rank(X)) eigenvectors of Var(Y) exists forming a basis of the column space of X.

Since W = dτ²J_r + τ²(1 − d)I_r, where J_r is an r × r matrix of ones, we can rewrite the matrix V as

    V = σ²I_n + ZWZ' = σ²I_n + dτ²ZJ_rZ' + τ²(1 − d)ZZ',                (2-10)

where the matrices ZJ_rZ' and ZZ' are free of the parameters m, σ and τ. By working with these matrices we will be able to deduce conditions for equality of OLS and BLUE that are free of unknown parameters.

2.2 Equality of OLS and BLUE: Eigenvector Conditions

We first derive conditions on the eigenvectors of ZZ' and ZJZ' that imply the equality of the OLS and BLUE. These conditions are not easy to verify, and may not be very useful in practice. However, they do help with the understanding of the structure of the problem, and give necessary and sufficient conditions in a special case. Let g_1 = [s, ..., s]' and g_2 = [−Σ_{i=1}^{r-1} l_i, l_1, ..., l_{r-1}]', where s and l_i, i = 1, ..., r − 1, are arbitrary real numbers.
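The quantities d and W in (2-4) and (2-10) are easy to compute numerically. The sketch below is our own illustration (the names `d_factor` and `w_matrix` are ours, and σ², τ² are the variance components as written above); it evaluates d with `math.gamma` and builds W, whose eigenstructure underlies the g_1, g_2 vectors just defined.

```python
from math import gamma

def d_factor(m, r):
    """d of Eq. (2-4): sum_{i=1}^{r-1} i*m*Gamma(m+r-1-i)*Gamma(i) / Gamma(m+r)."""
    return sum(i * m * gamma(m + r - 1 - i) * gamma(i)
               for i in range(1, r)) / gamma(m + r)

def w_matrix(m, r, tau2):
    """W = d*tau^2*J_r + (1-d)*tau^2*I_r: diagonal tau^2, off-diagonal d*tau^2."""
    d = d_factor(m, r)
    return [[tau2 if i == j else d * tau2 for j in range(r)] for i in range(r)]
```

As a check, W applied to the constant vector g_1 scales it by (r − 1)dτ² + τ² (every row of W sums to that value), and a contrast such as (−1, 1, 0, ..., 0) is scaled by τ²(1 − d), the two distinct nonzero eigenvalues of W.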

Since W has diagonal entries τ² and off-diagonal entries dτ², we know there are two distinct nonzero eigenvalues, λ_1 = (r − 1)dτ² + τ² (algebraic multiplicity 1) and λ_2 = τ² − dτ² (algebraic multiplicity r − 1), whose corresponding eigenvectors of W are g_1 and g_2 respectively.

Let E_1 = {g_1} ∪ {g_2}, E_2 = {Zg_1 : g_1 ≠ 0}, E_3 = {Zg_2 : g_2 ≠ 0}, and E_4 = {g : Z'g = 0, g ≠ 0}. We assume Z is of full column rank, i.e. rank(Z) = r. However, we also assume that Z'Zg_i = (a constant)g_i, i = 1, 2. Note that Z'Z = cI_r is a special case of this assumption.

By the form of Eq. (2-10), we know that a vector g is an eigenvector of V if and only if it is an eigenvector of ZWZ'. The following theorems and corollaries list the eigenvectors of ZWZ', i.e., list the eigenvectors of V. If we know all the eigenvectors of V, we can get a necessary and sufficient condition to guarantee the OLS being the BLUE.

Theorem 3. Consider the linear mixed model (1-1). Assume that Z satisfies the above assumptions. The OLS is the BLUE if and only if there are r_X (r_X = rank(X)) elements in the set ∪_{j=2}^{4} E_j forming a basis of the column space of X.

Proof. See Appendix B.

2.3 Equality of OLS and BLUE: Matrix Conditions

It is hard to list the forms of all the eigenvectors of V for a general Z, since we do not know the form of the matrix Z. Also it is hard to check if H Var(Y) is symmetric, since Var(Y) depends on the unknown parameters σ, τ, m. However, we can give some sufficient conditions to guarantee the OLS being the BLUE.

Theorem 4. Consider the model (2-1). Let H be the same as before. We have the following conclusions:

If HZJ_rZ' and HZZ' are two symmetric matrices, then the OLS is the BLUE. If HZJ_rZ' is symmetric and HZZ' is not symmetric (or if HZJ_rZ' is not symmetric and HZZ' is symmetric), then the OLS is not the BLUE.

Proof. From the covariance matrix expression (2-10), the conclusions are clear. If HZJ_rZ' and HZZ' are two symmetric matrices, then HV is symmetric. Thus, the OLS is the BLUE. If HZJ_rZ' is symmetric and HZZ' is not symmetric (or if HZJ_rZ' is not symmetric and HZZ' is symmetric), then HV is not symmetric. Thus, the OLS is not the BLUE.

This theorem gives us a sufficient condition for the equality of the OLS and the BLUE. The theorem also gives us a sufficient condition that the OLS is not the BLUE. However, when both HZJ_rZ' and HZZ' are not symmetric, there is no conclusion for the relationship between the OLS and the BLUE.

Corollary 5. If C(Z) ⊆ C(X), i.e., the column space of Z is contained in the column space of X, then ZZ'H and ZJZ'H are symmetric, where H is the same as before. Thus, the OLS is the BLUE.

Proof. Since C(Z) ⊆ C(X), there exists a matrix Q such that Z = XQ. Then we have ZZ'H = XQQ'X'X(X'X)⁻¹X' = XQQ'X', which is symmetric. ZJZ'H = XQJQ'X'X(X'X)⁻¹X' = XQJQ'X', which is also symmetric. By the discussion above, the OLS is the BLUE.
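Theorem 4 and Corollary 5 suggest a mechanical check: form H = X(X'X)⁻¹X' and test whether HZJ_rZ' and HZZ' are symmetric. The following is our own pure-Python sketch of that check (function names are ours; it assumes X'X is invertible, and in practice numpy would be used instead):

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(ra) + list(rb) for ra, rb in zip(A, B)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def hat_matrix(X):
    """H = X (X'X)^{-1} X' (assumes X'X invertible)."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    p = len(XtX)
    I = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    return matmul(matmul(X, solve(XtX, I)), Xt)

def is_symmetric(M, tol=1e-9):
    n = len(M)
    return all(abs(M[i][j] - M[j][i]) <= tol for i in range(n) for j in range(n))

def ols_is_blue_sufficient(X, Z):
    """Theorem 4's sufficient condition: H Z J_r Z' and H Z Z' both symmetric."""
    H = hat_matrix(X)
    r = len(Z[0])
    J = [[1.0] * r for _ in range(r)]
    Zt = transpose(Z)
    return (is_symmetric(matmul(H, matmul(matmul(Z, J), Zt)))
            and is_symmetric(matmul(H, matmul(Z, Zt))))
```

For a Corollary 5 design, where the columns of Z lie in the column space of X, the check returns True, since then HZ = Z and both products are automatically symmetric.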

2.4 Some Examples

Example 6. Consider a special case of Example 1, the randomized complete block design: Y_ij = μ + α_i + ψ_j + ε_ij, 1 ≤ i ≤ a, 1 ≤ j ≤ b, where ψ_j is the effect for being in block j. Assume ψ_j ∼ DP(m, N(0, τ²)). Then the model can be written as Y = Xθ + Zψ + ε, where X = B, Z' = [I_b, ..., I_b]_{b×n}, and θ = [θ_1, ..., θ_a]' = [μ + α_1, ..., μ + α_a]'. We can use the theorems discussed above, or use the results in Example 1, to check if the OLS is the BLUE. By straightforward calculation, we have Z'H = Z'X(X'X)⁻¹X' = (1/b)[1_b, ..., 1_b]_{b×n}, where 1_b is a b × 1 vector whose every element is 1. In addition, ZZ'H = J_n, where J_n is an n × n matrix whose every element is 1. Similarly, ZJZ'H = bJ_n. Thus, by the previous discussion we know that the OLS is the BLUE.

Example 7. Consider a model

    Y_ijk = x_i'β + α_i + ψ_j + ε_ijk,  1 ≤ i ≤ a, 1 ≤ j ≤ b, 1 ≤ k ≤ n_ij,      (2-11)

where α_i, ψ_j are random effects. Without loss of generality, assume a = b = n_ij = 2. Thus,

    Z' = [ 1 1 1 1 0 0 0 0
           0 0 0 0 1 1 1 1
           1 1 0 0 1 1 0 0
           0 0 1 1 0 0 1 1 ].

We can use the theorems to see if the OLS is the BLUE.

For example, assume

    X' = [ 1 −1  1 −1  0  0  2 −2
           1 −1  0  0  1 −1  2 −2
           0  0 −1 −1  1 −1  2 −2 ].

Then HZZ' and HZJZ' are symmetric, where H is the same as before. Then by Theorem 4, we know that the OLS is the BLUE. However, for some other X's, the OLS might not be the BLUE. For example, if

    X' = [ 1  1  1  1  1  1  1  1
           1 −1  1  1  1 −1  1 −1
           0  0  1  1  1 −1  1 −1 ].

For this X, HZJZ' is symmetric and HZZ' is not symmetric. Thus, the OLS is not the BLUE by the previous discussion.

Example 8. Consider Y = Xβ + Zψ + ε. Assume

    Z' = [ 1  1  1  1  1  1  1  1
          −1 −1 −1 −1  1  1  1  1
           1  1 −1 −1  1  1 −1 −1 ].

The Z matrix satisfies the condition Z'Z = cI. We can apply the theorem to check if the OLS is the BLUE. For example, assume

    X' = [ 1 −1  1 −1  0  0  2 −2
           1  1 −1 −1  3  3  1  1
           0  0  1  1  0  0  1  1 ].

By regular algebra calculation we find that the elements in ∪_{j=2}^{4} E_j do form a basis of the column space of X. Thus the OLS is the BLUE. However, if

    X' = [ 1 −1  1 −1  0  0  2 −2
           1  1 −1 −1  3  3  1  1
           1  1  1  1  0  0  1  1 ].

By regular algebra calculation, we know that the elements in ∪_{j=2}^{4} E_j do not form a basis of the column space of X. Thus the OLS is not the BLUE.
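The OLS-versus-BLUE comparisons in these examples can also be checked by brute force, computing the BLUE of (2-9) directly and comparing it with OLS. Below is our own pure-Python sketch (helper names are ours; numpy would normally be used), exercised on a balanced oneway layout where, as discussed above, the BLUE coincides with OLS and both equal the grand mean:

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(ra) + list(rb) for ra, rb in zip(A, B)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def blue(X, V, y):
    """Gauss-Markov BLUE (2-9): beta_tilde = (X'V^{-1}X)^{-1} X'V^{-1} y."""
    Xt = transpose(X)
    VinvX = solve(V, X)
    Vinvy = solve(V, [[v] for v in y])
    beta = solve(matmul(Xt, VinvX), matmul(Xt, Vinvy))
    return [row[0] for row in beta]
```

Calling `blue(X, I, y)` (V equal to the identity) gives OLS, so equality of the two calls for a given design is exactly the OLS = BLUE statement being illustrated.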

Example 9. Consider a balanced ANOVA model Y = Xβ + Bψ + ε when rank(X) = length(ψ). In this case, X and B have the same column space. Thus, we can just consider the model Y = Bβ + Bψ + ε. Since each column of B can be written as a linear combination of the eigenvectors ω_1 and ω_2, by the discussion of Example 1 we have that the OLS is the BLUE in the model Y = Bβ + Bψ + ε. In other words, the OLS is the BLUE in the model Y = Xβ + Bψ + ε.

2.5 Interval Estimation

In this section we show how to put confidence intervals on the fixed effects β_i in the general case of model (2-1). Let G = (X'V⁻¹X)⁻¹X'V⁻¹, so the BLUE for β is β̃ = GY. If we define e_1 = [1, 0, ..., 0]', e_2 = [0, 1, 0, ..., 0]', ..., e_p = [0, ..., 0, 1]', then the estimate for β_i is β̃_i = e_i'GY, i = 1, 2, ..., p. We want to find the b_i (i = 1, 2, ..., p) such that P(β̃_i ≤ b_i) = α, for 0 < α < 1, and we start with

    β̃_i | A ∼ N(β_i, e_i'G V_A G'e_i),    V_A = τ²ZAA'Z' + σ²I_n,

so

    α = P(β̃_i ≤ b_i) = Σ_A P(A) Φ((b_i − β_i)/√(e_i'G V_A G'e_i)).

It turns out that we can get easily computable upper and lower bounds on e_i'G V_A G'e_i, which allow us to either approximate b_i or use a bisection. It is straightforward to check that the matrix [(n − 1)I + J] − AA' is always nonnegative definite for every A, and thus, by the expression of the matrix V_A, we have

    σ²e_i'G G'e_i ≤ e_i'G V_A G'e_i ≤ e_i'G (τ²Z[(n − 1)I + J]Z' + σ²I_n) G'e_i.

This inequality gives us a lower bound and an upper bound for e_i'G V_A G'e_i, which can help form the bounding normal distributions. Now let Z_0 and Z_1 be the upper α cutoff points from the bounding normal distributions, so we have Z_0 ≤ b_i ≤ Z_1.
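The bounding-normal construction is straightforward to implement. The sketch below is our own (function names are ours; it uses the standard library's `NormalDist`): it computes Z_0 and Z_1 from the bounding variances, and finds b_i by bisection when the mixture Σ_A P(A)Φ(·) is available as a list of (weight, variance) pairs.

```python
from statistics import NormalDist

def bounding_cutoffs(beta_i, var_lo, var_hi, alpha=0.95):
    """Z_0 and Z_1: upper-alpha cutoffs of the two bounding normal distributions,
    using the lower- and upper-bound variances for e_i'G V_A G'e_i."""
    z0 = NormalDist(beta_i, var_lo ** 0.5).inv_cdf(alpha)
    z1 = NormalDist(beta_i, var_hi ** 0.5).inv_cdf(alpha)
    return z0, z1

def mixture_cutoff(beta_i, mix, alpha=0.95, tol=1e-10):
    """Bisection for b_i solving alpha = sum_A P(A) Phi((b_i - beta_i)/sd_A);
    mix is a list of (weight, variance) pairs with weights summing to 1."""
    def cdf(b):
        return sum(w * NormalDist(beta_i, v ** 0.5).cdf(b) for w, v in mix)
    # Z_0 <= b_i <= Z_1 brackets the root (for alpha >= 0.5)
    lo, hi = bounding_cutoffs(beta_i,
                              min(v for _, v in mix),
                              max(v for _, v in mix), alpha)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Enumerating all matrices A is infeasible in general, which is exactly why the conservative bounds Z_0 and Z_1, or an approximation such as (Z_0 + Z_1)/2, are useful.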


Table 2-1. Estimated cutoff points of Example 10, with α = 0.95, σ² = ψ² = 1, and m = 3. Iterations is the number of steps to convergence, Z₀ and Z₁ are the bounding normal cutoff points, and b₁ is the cutoff point.

            Iterations   z₀      z₁      b₁       P(β̃₁ ≤ b₁)
r=4, t=5        8       1.531   6.873   2.2614   0.9496
r=5, t=5        7       1.340   6.768   2.1460   0.9508
r=6, t=6        8       1.124   6.851   1.9521   0.9508
r=7, t=6        7       1.013   6.692   1.8559   0.9497
r=8, t=8        8       0.861   7.032   1.7050   0.9499

Now we can use these endpoints for a conservative confidence interval, or, in some cases, we can calculate (exactly or by Monte Carlo) the cdf of β̃. We give a small example.

Example 10. Consider the model Y = Xβ + Bψ + ε, where X = [x₁, x₂, x₃], x₁ = [1, …, 1]', x₂ = [1, 2, …, n]', x₃ = [1², 2², …, n²]'. For α = 0.95, σ² = ψ² = 1, and m = 3, we find b₁ such that α = P(β̃₁ ≤ b₁). Details are in Table 2-1. We see that the lower bound tends to be closer to the exact cutoff, but this is a function of the choice of m. In general we can use the conservative upper bound, or an approximation such as (Z₀ + Z₁)/2.

2.6 The Oneway Model

In this section we only consider the oneway model. In this special case we can investigate further properties of the estimator Ȳ, such as unimodality, symmetry, and the effect of the precision parameter m. The oneway model is Yᵢⱼ = μ + ψᵢ + εᵢⱼ, 1 ≤ i ≤ r, 1 ≤ j ≤ t, i.e.,

Ȳ | A ∼ N(μ, σ²_A),  σ²_A = (1/n²)( nσ² + ψ²t² ∑_ℓ r²_ℓ ),   (2)


Figure 2-2. Densities of Ȳ corresponding to different values of m, with σ = ψ = 1.

where we recall that the r_ℓ are the column sums of the matrix A. We denote the density of Ȳ by f_m(y), which can be represented as

f_m(y) = ∑_A f(y | A) P(A),   (2)

where f(y | A) is the normal density with mean μ and variance σ²_A, and P(A) is the marginal probability of the matrix A, as in (2). The subscript m is the precision parameter, which appears in the probability of A. Figure 2-2 is a plot of the pdf f_m(y) with n = 8, for different m with σ = ψ = 1. The figure shows that the density of Ȳ is symmetric and unimodal. It is also apparent that, in the tails, the densities with smaller m are above those with bigger m.

2.6.1 Variance Comparisons

By the previous result, we know that Ȳ is the BLUE under the Dirichlet process oneway model. Here we want to compare the variances of the BLUE Ȳ under the Dirichlet process oneway model and under the classical oneway model with normal random effects. We will see that Var(Ȳ) under the Dirichlet model is larger than that under the normal model.
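This variance comparison can be checked numerically from the conditional-variance formula σ²_A = (1/n²)(nσ² + ψ²t²∑_j r²_j). A sketch with illustrative parameter values and randomly generated clusterings:

```python
import numpy as np

rng = np.random.default_rng(0)
r, t = 5, 4
n = r * t
sig2, psi2 = 1.0, 2.0                      # illustrative values
B = np.kron(np.eye(r), np.ones((t, 1)))    # one-way incidence matrix

def var_ybar(A):
    # Var(Ybar | A) = 1'V1 / n^2 with V = sig2 I + psi2 B A A' B'.
    one = np.ones(n)
    V = sig2 * np.eye(n) + psi2 * B @ A @ A.T @ B.T
    return one @ V @ one / n**2

# Normal (A = I) case: every column sum r_j equals 1.
var_normal = (n * sig2 + psi2 * t**2 * r) / n**2

for _ in range(50):
    k = rng.integers(1, r + 1)             # number of clusters
    labels = rng.integers(0, k, size=r)
    A = np.zeros((r, k)); A[np.arange(r), labels] = 1.0
    col = A.sum(axis=0)                    # the r_j of the text
    closed_form = (n * sig2 + psi2 * t**2 * (col**2).sum()) / n**2
    assert np.isclose(var_ybar(A), closed_form)
    assert var_ybar(A) >= var_normal - 1e-12   # Dirichlet variance dominates
```

Every random clustering satisfies both the closed form and the dominance inequality, matching the claim that the Dirichlet-model variance is never smaller than the normal-model variance.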


The oneway model has the matrix form

Y = μ1 + BAη + ε,   (2)

where ε ∼ N(0, σ²I), ψ = Aη, η_{k×1} ∼ N(0, ψ²I_k), and

Var(Ȳ | A) = (1/n²) σ² 1'( I + (ψ²/σ²) BAA'B' )1.

Recall that the column sums of A are (r₁, r₂, …, r_k), and denote this vector by r. It is straightforward to verify that 1'B = t1', and then 1'BA = t r'. Recalling that ∑_j r_j = r and n = rt, we have

1'( I + (ψ²/σ²) BAA'B' )1 = n + (ψ²/σ²) t² r'r = n + (ψ²/σ²) t² ∑_{j=1}^{k} r_j².

Thus, the conditional variance under the Dirichlet model is

Var(Ȳ | A) = (1/n²) ( nσ² + ψ²t² ∑_{j=1}^{k} r_j² ).   (2)

Now ∑_{j=1}^{k} r_j² ≥ (∑_{j=1}^{k} r_j)²/k, and since k is necessarily no greater than r, we have ∑_{j=1}^{k} r_j² ≥ (∑_{j=1}^{k} r_j)²/r, so that

Var(Ȳ | A) ≥ (1/n²) ( nσ² + ψ²t² (∑_{j=1}^{k} r_j)²/r ) = (1/n²)( nσ² + ψ²t² r ) = Var(Ȳ | I),

where Var(Ȳ | I) is just the corresponding variance under the classical oneway model. Thus, every conditional variance of Ȳ under the Dirichlet model is at least as big as the variance in the normal model, so the unconditional variance of Ȳ under the Dirichlet model is also bigger than that under the normal model.

2.6.2 Unimodality and Symmetry

For every A and every real number y, f(μ + y | A) = f(μ − y | A), and P(A) ≥ 0.


Thus f_m(μ + y) = f_m(μ − y); that is, the marginal density is symmetric about the point μ. Also, it is easy to show that

1. if y₁ ≤ y₂ ≤ μ, then f(μ | A) ≥ f(y₂ | A) ≥ f(y₁ | A) for every A, and hence f_m(μ) ≥ f_m(y₂) ≥ f_m(y₁);
2. if μ ≤ y₁ ≤ y₂, then f(μ | A) ≥ f(y₁ | A) ≥ f(y₂ | A) for every A, and hence f_m(μ) ≥ f_m(y₁) ≥ f_m(y₂),

and thus the marginal density is unimodal around the point μ.

2.6.3 Limiting Values of m

Now we look at the limiting cases m = 0 and m → ∞. We will show that f_m(y) remains a proper density when m = 0 and as m → ∞.

Theorem 11. When m → 0 or m → ∞, the marginal probabilities π(r₁, r₂, …, r_k) in (2) degenerate to a single point. Specifically,

lim_{m→0} π(r₁, …, r_k) = 1 if k = 1, and 0 if k = 2, …, r,

and

lim_{m→∞} π(r₁, …, r_k) = 1 if k = r, and 0 if k = 1, …, r − 1.

It then follows from (2) that

Ȳ | m = 0 ∼ N( μ, σ²/n + ψ² ),  Ȳ | m = ∞ ∼ N( μ, (1/n)(σ² + ψ²t) ).

Proof. From (2) we can write

π(r₁, …, r_k) = m^{k−1} / [ (m + r − 1)(m + r − 2) ⋯ (m + 1) ] ∏_{j=1}^{k} Γ(r_j).

The denominator (m + r − 1)(m + r − 2) ⋯ (m + 1) is a polynomial in m of degree r − 1, and goes to (r − 1)! as m → 0; thus π(r₁, …, r_k) → 0 as m → 0 unless k = 1. When m → ∞, π(r₁, …, r_k) again goes to zero unless k = r, which makes the numerator a polynomial of degree r − 1. The densities of Ȳ follow by substituting into (2).
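The degeneracy in Theorem 11 can be seen numerically from the partition-weight formula in the proof, evaluated at very small and very large m (a sketch; the function name and the probe values of m are ours):

```python
from math import gamma, prod

def pi_weight(counts, m):
    # pi(r_1,...,r_k) = m^{k-1} prod_j Gamma(r_j) / [(m+r-1)...(m+1)],
    # as in the proof of Theorem 11.
    r = sum(counts)
    k = len(counts)
    denom = prod(m + j for j in range(1, r))   # (m+1)(m+2)...(m+r-1)
    return m**(k - 1) * prod(gamma(rj) for rj in counts) / denom

r = 5
# m -> 0: all mass on the single-cluster configuration (k = 1).
assert abs(pi_weight([r], 1e-9) - 1.0) < 1e-6
assert pi_weight([1] * r, 1e-9) < 1e-6
# m -> infinity: all mass on k = r singleton clusters.
assert abs(pi_weight([1] * r, 1e9) - 1.0) < 1e-6
assert pi_weight([r], 1e9) < 1e-6
```

Both limits behave exactly as the theorem states: one cluster as m → 0, all singleton clusters as m → ∞.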


When m = 0, all of the observations are in the same cluster, and the A matrix degenerates to (1, 1, …, 1)'. At m = ∞, each observation is in its own cluster, A = I, and the distribution of Ȳ is that of the classical normal random effects model.

2.6.4 Relationship Among Densities of Ȳ with Different m

In this section, we compare the tails of the densities of Ȳ with different precision parameter m and show that the tails of densities with smaller m are always above the tails of densities with larger m. Recall (2), and note that

σ₁² = (1/n)(σ² + ψ²t) ≤ σ²_A = (1/n²)( nσ² + ψ²t² ∑_ℓ r²_ℓ ) ≤ σ²/n + ψ² = σ₀²,   (2)

and so σ₀² is the largest variance. We can then establish the following theorem.

Theorem 12. If m₁ < m₂, then for sufficiently large |y| we have f_{m₁}(y) ≥ f_{m₂}(y); this comparison includes the limiting cases

m = 0 and m = ∞. In fact, for sufficiently large y we have f_∞(y) ≤ f_m(y) ≤ f_0(y), and the tails of any density are always between the tails of the densities with m = 0 and m = ∞. This gives us a method to find the cutoff points in the Dirichlet process oneway model. Since we have bounding cutoff points, we could use the cutoff corresponding to m = 0 as a conservative bound. Alternatively, we could use a bisection method if we had some idea of the value of m. We see in Table 2-2 that there is a relatively wide range of cutoff values, and that the conservative cutoff could be quite large.

Table 2-2. Estimated cutoff points (α = 0.975) under the Dirichlet model Yᵢⱼ = μ + ψᵢ + εᵢⱼ, 1 ≤ i ≤ 6, 1 ≤ j ≤ 6, for different values of m. σ² = ψ² = 1.

m                 0      0.1    0.5    1      20     ∞
Estimated cutoff  1.987  1.917  1.706  1.566  0.952  0.864

2.7 The Estimation of σ², ψ² and the Covariance

Another problem in the Dirichlet process mixed model (or the Dirichlet process oneway model) is the estimation of the parameters σ² and ψ². In the Dirichlet process mixed model, the distribution of the responses is a mixture of normal densities, not a single normal distribution, which might cause difficulty when we try to use some standard methods (for example, maximum likelihood) to estimate the parameters. The following sections discuss three methods (MINQUE, MINQE, and the sample covariance matrix) for finding estimators of σ² and ψ², and present a simulation study. These three methods do not require that the responses follow a normal distribution. The simulation study shows that the estimates from the sample covariance matrix are very satisfactory. In addition, the sample covariance matrix method also provides an estimate of the covariance.
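Before moving on, note that the two extreme entries of Table 2-2 can be reproduced directly from the limiting normal distributions of Theorem 11 (a sketch; only stdlib is used):

```python
from math import sqrt
from statistics import NormalDist

# Table 2-2 setting: r = t = 6, sigma^2 = psi^2 = 1, alpha = 0.975.
r = t = 6
n = r * t
sig2 = psi2 = 1.0
z = NormalDist().inv_cdf(0.975)

var_m0 = sig2 / n + psi2              # m = 0: a single cluster
var_minf = (sig2 + psi2 * t) / n      # m = infinity: A = I
b0, binf = z * sqrt(var_m0), z * sqrt(var_minf)
print(round(b0, 3), round(binf, 3))
```

These recover 1.987 (the conservative m = 0 cutoff) and 0.864 (the m = ∞ cutoff), bracketing all the intermediate values in the table.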


2.7.1 MINQUE for σ² and ψ²

As discussed in Searle et al. (2006), Rao (1979), Brown (1976), Rao (1977) and Chaubey (1980), minimum norm quadratic unbiased estimation (MINQUE) does not require the normality assumption. The Dirichlet mixed model Y | A ∼ N(Xβ, σ²Iₙ + ψ²ZAA'Z') can be written as

Y = Xβ + ε,   (2)

where Var(ε) = σ²Iₙ + ψ² ∑_A P(A) ZAA'Z' = σ²Iₙ + d ZJ_rZ' + (ψ² − d) ZZ'. Denote T₁ = Iₙ, T₂ = ZJ_rZ' (J_r the r × r matrix of ones), and T₃ = ZZ', and note that ψ² > d. Let θ = (σ², d, ψ² − d), S = (S_ij) with S_ij = tr(QT_iQT_j), and Q = I − X(X'X)⁻¹X'. By Rao (1977), Chaubey (1980) and Mathew (1984), for a given p, if λ = (λ₁, λ₂, λ₃) satisfies Sλ = p, a MINQUE of p'θ is Y'( ∑_i λ_i QT_iQ )Y. By letting p = (1, 0, 0), p = (0, 1, 0) and p = (0, 1, 1), we can get estimators of σ², d and ψ².

Let Y⁽¹⁾, Y⁽²⁾, …, Y⁽ᴺ⁾ be N vectors, independently and identically distributed as Y in model (2). Then model (2) becomes

Ỹ = X̃β + ε̃,   (2)

where Ỹ = [Y⁽¹⁾', Y⁽²⁾', …, Y⁽ᴺ⁾']', X̃ = [X', X', …, X']' and ε̃ = [ε⁽¹⁾', …, ε⁽ᴺ⁾']'. Let θ̂_i be the MINQUE of the variance components for model (2), with θ̂₁ = σ̂² and the remaining components yielding ψ̂². By Corollary 1 in Brown (1976), N^{1/2}(θ̂_i − θ_i) has a limiting normal distribution; thus the σ̂² and ψ̂² above have limiting normal distributions. However, the MINQUE can also be negative in some cases. Mathew (1984) and Pukelsheim (1981) (in their Theorems 1 and 2) discussed conditions that make


the MINQUE nonnegative definite. It is easy to show that, when we estimate σ² or ψ² by MINQUE, the Dirichlet process mixed model does not satisfy the conditions of Theorems 1 and 2 in Pukelsheim (1981); thus the MINQUE estimates might be negative in our model. When a MINQUE is negative, we can replace it with another, positive estimator, such as the MINQE (minimum norm quadratic estimator).

2.7.2 MINQE for σ² and ψ²

There is another estimator, called the MINQE (minimum norm quadratic estimator), discussed in many papers, such as Rao and Chaubey (1978), Rao (1977), Rao (1979) and Brown (1976). Define (α₁², α₂², α₃²) to be prior values for the variance components. Let

c_i = α_i⁴ p_i / n, i = 1, 2, 3;  V = α₁²T₁ + α₂²T₂ + α₃²T₃;  P = X(X'V⁻¹X)⁻¹X'V⁻¹;  R = I − P.

In MINQE, we can use the following estimator of p'θ:

Y' ( ∑_{i=1}^{3} c_i R'V⁻¹T_iV⁻¹R ) Y.

Thus, the estimators of σ², d and ψ² are

σ̂² = (α₁⁴/n) Y'R'V⁻¹T₁V⁻¹RY,  d̂ = (α₂⁴/n) Y'R'V⁻¹T₂V⁻¹RY,  ψ̂² = d̂ + (α₃⁴/n) Y'R'V⁻¹T₃V⁻¹RY;

σ̂² and ψ̂² are both positive.

2.7.3 Estimation of σ², ψ² and the Covariance by the Sample Covariance Matrix

In this part, we only consider a model of the form

Yᵢⱼ = xᵢ'β + ψᵢ + εᵢⱼ,  1 ≤ i ≤ r, 1 ≤ j ≤ t,   (2)

which has the covariance matrix (as shown before)

V = [ σ²I + ψ²J    dJ          ⋯    dJ
      dJ           σ²I + ψ²J   ⋯    dJ
      ⋮            ⋮           ⋱    ⋮
      dJ           dJ          ⋯    σ²I + ψ²J ],   (2)


where I is the t × t identity matrix and J is the t × t matrix of ones. We will give another method to estimate σ², ψ² and d, and discuss some of its properties. Given a sample consisting of h independent observations Y⁽¹⁾, Y⁽²⁾, …, Y⁽ʰ⁾ of n-dimensional random variables, an unbiased estimator of the covariance matrix Var(Y) = E(Y − E(Y))(Y − E(Y))' is the sample covariance matrix

V̂(Y⁽¹⁾, Y⁽²⁾, …, Y⁽ʰ⁾) = (1/(h − 1)) ∑_{i=1}^{h} (Y⁽ⁱ⁾ − Ȳ)(Y⁽ⁱ⁾ − Ȳ)'.   (2)

In fact, with the estimated V̂ we can get estimators of σ², ψ² and d. The variance-covariance matrix has the same structure as Eq. (2). We can use the estimated block matrices σ²I + ψ²J in the diagonal positions to estimate σ² and ψ²: the average of the diagonal elements in the blocks σ²I + ψ²J can be taken as an estimator of σ² + ψ², the average of the off-diagonal elements in the blocks σ²I + ψ²J can be taken as an estimator of ψ², and the difference of these two averages can be used to estimate σ². In addition, we can use the average of the elements in the off-diagonal blocks dJ to estimate d.

2.7.4 Further Discussion

As discussed in Robinson (1991), Harville (1977), Kackar and Harville (1984), Kackar and Harville (1981), and Das et al. (2004), when σ² and ψ² are unknown, we can still use an estimator of β (or ψ, or their linear combinations) with σ² and ψ² replaced by their corresponding estimators in the expression of the BLUE. Kackar and Harville (1981) showed that the estimated β̃ and ψ̃ are still unbiased when the estimators of σ² and ψ² are even and translation-invariant, i.e., when σ̂²(y) = σ̂²(−y), ψ̂²(y) = ψ̂²(−y), σ̂²(y + Xβ) = σ̂²(y), and ψ̂²(y + Xβ) = ψ̂²(y). It is clear that V̂(Y⁽¹⁾, Y⁽²⁾, …, Y⁽ʰ⁾) in Eq. (2) satisfies the even condition. Since V̂(Y⁽¹⁾, …, Y⁽ʰ⁾) = V̂(Y⁽¹⁾ + Xβ, Y⁽²⁾ + Xβ, …, Y⁽ʰ⁾ + Xβ) for every β, it also satisfies the translation-invariant condition. As discussed in the previous section, the estimates of σ² and ψ² by the sample covariance matrix can be written in the form


∑_{i,j} H_i V̂(Y⁽¹⁾, Y⁽²⁾, …, Y⁽ʰ⁾) G_j, where the H_i, G_j are matrices free of σ² and ψ². Thus, the estimators of σ² and ψ² by the sample covariance matrix also satisfy the even and translation-invariant conditions, and the estimator of β (or ψ, or their linear combinations) is still unbiased. On the other hand, by Das et al. (2004) we know that the estimators from MINQUE also satisfy the even and translation-invariant conditions, so the estimators of β (or ψ, or their linear combinations) are also unbiased.

2.7.5 Simulation Study

Example 13. In this example, consider the Dirichlet process oneway model

Yᵢⱼ = μ + ψᵢ + εᵢⱼ,  1 ≤ i ≤ 7, 1 ≤ j ≤ 7,   (2)

i.e., Y = μ1 + BAη + ε. We want to compare the performance of the methods for estimating σ² and ψ². We first assume that the true σ² = ψ² is 1 or 10, and simulate 1000 data sets for σ² = ψ² = 1 and 10 respectively. For given μ and ψ², we use the method discussed in Kyung et al. (2010) to generate the matrix A. At the (t+1)st step, we use the following to generate A:

q⁽ᵗ⁺¹⁾ = (q₁⁽ᵗ⁺¹⁾, …, q_r⁽ᵗ⁺¹⁾) ∼ Dirichlet(r₁⁽ᵗ⁾ + 1, …, r_k⁽ᵗ⁾ + 1, 1, …, 1);
every row aᵢ of A:  aᵢ ∼ Multinomial(1, q⁽ᵗ⁺¹⁾).

Thus we can generate the matrix A. Since, for given A, Y ∼ N(μ1, σ²Iₙ + ψ²BAA'B'), we can generate Y; here we assume μ = 0, and generate the corresponding data sets for different σ and ψ. We use four methods (MINQUE, MINQE, ANOVA, and the sample covariance matrix) to estimate σ² and ψ². When using MINQE, we use the prior values (1, 1) for (σ², ψ²) for all data sets. For every method, we calculate the mean of the 1000 corresponding estimates and the mean squared errors (MSE). The results are listed in Tables 2-3 and 2-4. In these tables, the estimators using the sample covariance always give the smallest MSE for σ̂² and ψ̂², no matter whether the true σ² and ψ² are big or small. The mean squared


errors for MINQUE and MINQE are almost the same, even though the true σ² and ψ² may be far from the prior value (1, 1). We also find that, on average, the estimates of σ² and ψ² by MINQUE, MINQE and the sample covariance matrix are almost the same as the true values. However, the estimators by MINQUE and MINQE have smaller bias but larger variance, while the estimators using the sample covariance matrix have small bias and smaller variance; the MSE of the estimators using the sample covariance matrix is much smaller than the others. The ANOVA estimators are not satisfactory. In Table 2-5, we calculate the cutoff points and corresponding coverage probabilities using the results in Table 2-3. Obviously, the sample covariance matrix method gives the best results.

Table 2-3. Estimated σ² and ψ² under the Dirichlet process oneway model Yᵢⱼ = μ + ψᵢ + εᵢⱼ, 1 ≤ i ≤ 7, 1 ≤ j ≤ 7. σ² = ψ² = 1.

method                      mean of σ̂²   mean of ψ̂²   MSE of σ̂²   MSE of ψ̂²
MINQUE                      1.003        1.007        0.08        1.15
MINQE                       1.003        1.007        0.08        1.15
ANOVA                       1.74         0.14         1.27        1.33
Sample covariance matrix    1.003        1.004        0.005       0.003

Table 2-4. Estimated σ² and ψ² under the Dirichlet process oneway model Yᵢⱼ = μ + ψᵢ + εᵢⱼ, 1 ≤ i ≤ 7, 1 ≤ j ≤ 7. σ² = ψ² = 10.

method                      mean of σ̂²   mean of ψ̂²   MSE of σ̂²   MSE of ψ̂²
MINQUE                      9.97         10.10        8.00        143.43
MINQE                       9.97         10.09        7.99        143.49
ANOVA                       17.46        -1.20        146.63      135.66
Sample covariance matrix    10.48        9.97         0.69        0.86

Table 2-5. Estimated cutoff points for the density of Ȳ with the estimators of σ² and ψ² in Table 2-3, under the Dirichlet model Yᵢⱼ = μ + ψᵢ + εᵢⱼ, 1 ≤ i ≤ 7, 1 ≤ j ≤ 7. m = 3, μ = 0, α = 0.95. True σ² = ψ² = 1.

method    estimated cutoff    P(Ȳ ≤ cutoff)

CHAPTER 3
SIMULATION STUDIES AND APPLICATION

We discussed classical estimation under the Dirichlet model in the previous sections. In this part, we compare the performance of the Dirichlet process mixed model with that of the classical normal mixed model. We use both the Dirichlet model and the classical normal model to fit some simulated data sets and a real data set, and compare the corresponding results.

3.1 Simulation Studies

First, we do some simulation studies to investigate the performance of the Dirichlet linear mixed model. We generate the data files from two models, the linear mixed model with Dirichlet process random effects and the linear mixed model with normal random effects, then use both the Dirichlet model and the normal model to fit the simulated data sets and compare the results of the Dirichlet linear mixed model with those of the classical normal linear mixed model.

3.1.1 Data Generation and Estimation of Parameters

We generate the data from two models.

Data Origin 1. The data are generated from the classical normal mixed model Y = Xβ + Bψ + ε, where ψ ∼ N(0, ψ²I_r) and ε ∼ N(0, σ²Iₙ), with r = 5 and n = 25.

Data Origin 2. The data are generated from the Dirichlet mixed model Y = Xβ + Bψ + ε, where ψ ∼ DP(m, N(0, ψ²)) and ε ∼ N(0, σ²Iₙ). For given m and ψ², we use the method discussed in Kyung et al. (2010) to generate the matrix A:

q⁽ᵗ⁺¹⁾ = (q₁⁽ᵗ⁺¹⁾, …, q_r⁽ᵗ⁺¹⁾) ∼ Dirichlet(r₁⁽ᵗ⁾ + 1, …, r_k⁽ᵗ⁾ + 1, 1, …, 1);
every row aᵢ of A:  aᵢ ∼ Multinomial(1, q⁽ᵗ⁺¹⁾).

Thus we can generate the matrix A. Since, for given A, Y ∼ N(Xβ, σ²Iₙ + ψ²BAA'B'), we can generate Y. Let the true β = [1, 0, 1, 1, 1]' and m = 1.
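The A-generation step above can be sketched as follows; this is our reading of the update, with the post-draw removal of empty clusters as an implementation assumption:

```python
import numpy as np

rng = np.random.default_rng(3)

def next_A(A_prev, r):
    """One update: draw cluster weights q from a Dirichlet whose first k
    entries use the current column sums r_j + 1 (padded with 1's up to r
    cells), then redraw each row of A from Multinomial(1, q)."""
    col = A_prev.sum(axis=0)
    alpha = np.concatenate([col + 1.0, np.ones(r - len(col))])
    q = rng.dirichlet(alpha)
    A = rng.multinomial(1, q, size=r).astype(float)
    return A[:, A.sum(axis=0) > 0]   # drop empty clusters

r = 5
A = np.eye(r)           # start from the all-singletons configuration
for _ in range(100):
    A = next_A(A, r)
```

Given a draw of A, Y can then be sampled from N(Xβ, σ²Iₙ + ψ²BAA'B') as in the text.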


And X = [F; I₅; I₅; I₅; I₅] (the five 5 × 5 blocks stacked vertically), where F = diag(1, 2, …, 5).

We generate 1000 data sets for σ² = ψ² = 1, 5, 10 respectively from both origins, and use both the classical normal mixed model and the Dirichlet mixed model to fit the generated data sets. We calculate the corresponding estimators of β, the corresponding MSE, and the estimates of σ² and ψ². Based on the results in previous sections, we know that the BLUE of β and the BLUP ψ̃ are

β̃ = (X'V⁻¹X)⁻¹X'V⁻¹Y,  ψ̃ = C'V⁻¹(Y − Xβ̃),   (3)

where V is the covariance matrix, and the variance of β̃ is Var(β̃) = (X'V⁻¹X)⁻¹. In addition, we use the sample covariance matrix to estimate σ², ψ² and d. Since the covariance matrix of Y in the classical normal mixed model is a special case (corresponding to d = 0, as shown before) of the covariance matrix in the Dirichlet mixed model, the sample covariance matrix method gives the same estimators of σ² and ψ² for both models, due to the special structure of the covariance matrix. On the other hand, we can test hypotheses based on the estimate of β and its corresponding variance matrix. For example, we can test Lβ = 0, where L is a single row. Let

T = Lβ̃ / √( L(X'V̂⁻¹X)⁻¹L' ).   (3)
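The estimators above can be sketched directly. A minimal implementation for the normal-random-effects case (A = I), with the design of this section and illustrative parameter values; the choice C = ψ²B for the BLUP covariance term is our reading of C = Cov(ψ, Y):

```python
import numpy as np

rng = np.random.default_rng(4)
r, n = 5, 25
t = n // r
B = np.kron(np.eye(r), np.ones((t, 1)))
F = np.diag(np.arange(1.0, 6.0))
X = np.vstack([F] + [np.eye(5)] * 4)          # 25 x 5 design from the text

sig2, psi2 = 1.0, 1.0
V = sig2 * np.eye(n) + psi2 * B @ B.T          # covariance with A = I
Vinv = np.linalg.inv(V)

y = X @ np.array([1.0, 0, 1, 1, 1]) + rng.multivariate_normal(np.zeros(n), V)

# BLUE of beta and BLUP of psi.
XtVi = X.T @ Vinv
beta = np.linalg.solve(XtVi @ X, XtVi @ y)
psi_blup = (psi2 * B).T @ Vinv @ (y - X @ beta)

# T statistic for a single-row contrast L beta = 0.
L = np.array([[0.0, 1, 0, 0, 0]])
T = ((L @ beta) / np.sqrt(L @ np.linalg.inv(XtVi @ X) @ L.T)).item()
print(T)
```

Repeating this over many simulated data sets gives the empirical distribution of T used to calibrate the test.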


Although we do not know the exact distribution of T, we can use the 1000 generated data sets to find the estimated distribution of T and the corresponding estimated p value for the observation y.

3.1.2 Simulation Results

Let β̂_D be the estimated BLUE using the Dirichlet model and β̂_N the estimated BLUE using the normal mixed model. Using the generated data sets, we can estimate the covariance matrix by the sample covariance matrix: let V̂_D be the estimated covariance matrix using the Dirichlet mixed model and V̂_N the estimated covariance matrix using the classical normal mixed model. We can also obtain the estimated CDFs of T under both the Dirichlet mixed model and the classical normal mixed model. The estimation results are listed in Tables 3-1 through 3-3; d is the true covariance and d̂ is the estimate of d in the Dirichlet mixed model. From Tables 3-1 through 3-3, we find that MSE(β̂_D) is almost the same for the same σ and ψ, no matter which distribution the true data come from. However, the change in MSE(β̂_N) is bigger when the true model of the data changes from the normal model to the Dirichlet model. The estimate d̂ from the Dirichlet mixed model is close to the true d.

Table 3-1. The simulation results with σ² = ψ² = 1. m = 1.

Data Origin       σ̂²    ψ̂²    MSE(β̂_D)   MSE(β̂_N)   d       d̂
Normal Model      1.00   1.02   1.004      0.832      0       0.019
Dirichlet Model   1.00   0.93   1.001      1.107      0.333   0.326

Table 3-2. The simulation results with σ² = ψ² = 5. m = 1.

Data Origin       σ̂²    ψ̂²    MSE(β̂_D)   MSE(β̂_N)   d       d̂
Normal Model      5.03   5.03   5.219      4.358      0       0.067
Dirichlet Model   5.04   4.95   4.862      5.421      1.667   1.474
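The block-averaging estimates of σ², ψ² and d reported in the tables can be sketched as follows, using simulated data from a covariance with the block structure of Section 2.7.3 (the true values here mirror Table 3-1 and are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
r, t = 5, 5
n = r * t
sig2, psi2, d = 1.0, 1.0, 0.333
I, J = np.eye(t), np.ones((t, t))

# Diagonal blocks sig2*I + psi2*J, off-diagonal blocks d*J.
V = np.kron(np.ones((r, r)) - np.eye(r), d * J) \
    + np.kron(np.eye(r), sig2 * I + psi2 * J)

h = 20000
Y = rng.multivariate_normal(np.zeros(n), V, size=h)
Vhat = np.cov(Y, rowvar=False)   # sample covariance, divisor h - 1

# Averages over the block structure recover the components.
blocks = Vhat.reshape(r, t, r, t).transpose(0, 2, 1, 3)  # blocks[i, j]: t x t
diag_blocks = blocks[np.arange(r), np.arange(r)]
m_diag = np.mean([b[np.diag_indices(t)].mean() for b in diag_blocks])
m_off = np.mean([(b.sum() - np.trace(b)) / (t * (t - 1)) for b in diag_blocks])
m_cross = blocks[~np.eye(r, dtype=bool)].mean()

sig2_hat, psi2_hat, d_hat = m_diag - m_off, m_off, m_cross
print(sig2_hat, psi2_hat, d_hat)
```

The three averages estimate σ² + ψ², ψ², and d respectively, as described in Section 2.7.3.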


Table 3-3. The simulation results with σ² = ψ² = 10. m = 1.

Data Origin       σ̂²     ψ̂²     MSE(β̂_D)   MSE(β̂_N)   d       d̂
Normal Model      10.04   9.75    9.946      7.931      0       0.107
Dirichlet Model   9.83    10.88   9.844      11.322     3.333   3.904

3.2 Application to a Real Data Set

We now use the Dirichlet mixed model to analyze a real data set. The data come from Professor Volker Mai's lab, which wants to investigate the contribution of the gut microbiota to colorectal cancer. An unmatched-pairs case-control study was performed in human volunteers undergoing a screening colonoscopy: fecal samples and multiple colon biopsy samples were collected from the individuals, and the group performed 16S rRNA sequence analysis to obtain the data. Our goal is to test whether there is a difference between case and control for each bacterium (one-sided and two-sided tests). The data file contains the counts for the bacteria: for each bacterium there are count numbers for 30 cases and 30 controls, and there are 6321 rows in the data set. The rows correspond to the bacteria and the columns to the count numbers for cases and controls. We want to test whether there is a difference between the cases and the controls for each bacterium.

3.2.1 The Model and Estimation

For each bacterium, we use the following model to fit the data:

Ỹᵢⱼ = μ + τᵢ + ψᵢⱼ + εᵢⱼ,  i = 1, 2,  j = 1, …, 30,   (3)

where Ỹᵢⱼ is either arcsin(√pᵢⱼ) or log(Yᵢⱼ + c), pᵢⱼ is the column-wise proportion, and c is a small real number; in other words, we consider two kinds of transformation of the response in this part. Here i = 1 denotes case, i = 2 denotes control, and j = 1, …, 30 indexes the subjects (volunteers). We cannot use the sample covariance matrix to estimate σ², ψ² and d for this real data set, since we do not have a sample: we only have one observation for each


bacterium. However, we can use the minimum norm quadratic unbiased estimation (MINQUE) method to estimate σ², ψ² and d for this observation. Let T₀ be the T statistic corresponding to this observation. We do not know the exact distribution of T, but we can estimate it non-parametrically by the permutation method: we randomly permute the treatment labels and recalculate σ̂², ψ̂², d̂ and the T statistic. Repeating this procedure B times, we obtain a set of T statistics T₁, …, T_B, from which we can estimate the p value. For example, if we want to test H₀: τ₁ = τ₂ against H_A: τ₁ > τ₂, then the p value for the bacterium is

p = #{ T_i ≥ T₀, i = 1, …, B } / B.

3.2.2 Simulation Results for the Models in Section 3.2.1

In order to compare the performance of the models, we first do some simulation studies, since we do not know the true results for the models in Section 3.2.1. We generate data sets from negative binomial distributions with several parameter setups. In this section, we just show the numerical results for the different data setups and methods; the discussion and conclusions are given in the next section. Table 3-4 shows the details of the parameters under all the data setups. We generate the data from a negative binomial(μ, r) distribution, i.e., mean = μ and variance = μ + μ²/r. In every data setup, we generate 10 data sets, each with 800 rows. μ_100case means that we generate the data with μ = μ_100case for the cases in the first 100 rows; μ_100control means that we generate the data with μ = μ_100control for the controls in the first 100 rows; μ_other means that we generate the data with μ = μ_other for the cases and controls in the remaining rows.

Data Setup 1. Generate the data from negative binomial(μ, r). Generate 10 data sets, each with 800 rows. Take r = 0.1.
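The permutation step can be sketched as follows; the difference in group means is used here as a stand-in for the model-based T statistic, and the data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)

def perm_pvalue(case, control, B=999):
    """One-sided permutation p-value for H0: tau1 = tau2 vs HA: tau1 > tau2,
    counting permuted statistics at least as large as the observed one."""
    t0 = case.mean() - control.mean()
    pooled = np.concatenate([case, control])
    n1 = len(case)
    count = 0
    for _ in range(B):
        perm = rng.permutation(pooled)
        if perm[:n1].mean() - perm[n1:].mean() >= t0:
            count += 1
    return count / B

case = rng.normal(1.0, 1.0, size=30)      # shifted group
control = rng.normal(0.0, 1.0, size=30)
p = perm_pvalue(case, control)
print(p)
```

In the dissertation's setting the statistic inside the loop would be the T of Eq. (3), recomputed with MINQUE variance estimates after each relabeling.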


Table 3-4. The data setups.

               r       μ_100case   μ_100control   μ_other
Data Setup 1   0.1     2           0.3            0.3
Data Setup 2   0.02    1           0.3            0.3
Data Setup 3   0.005   2           0.3            0.3
Data Setup 4   0.02    5           0.3            0.3
Data Setup 5   0.1     10          0.3            0.3
Data Setup 6   0.02    10          0.3            0.3

Let μ = 2 for the cases in the first 100 rows, μ = 0.3 for the controls in the first 100 rows, and μ = 0.3 for the cases and controls in the remaining rows. By doing this, we generate a data set with about 86% zeros; the maximum value is 103 and the minimum value is 0. (The original real data has about 94% zeros, with maximum value 1573 and minimum value 0.) Use the normal model, the Dirichlet model and the LRT to find the p values.

First use the usual transformation arcsin(√(Yᵢⱼ/nᵢⱼ)). On average, if the significance level is 0.1, the Dirichlet model gives 99 (58) bacteria significant and the normal model gives 104 (58); 310 (97) bacteria are significant by the LRT. One estimate of the FDR is 41/99 = 0.41 in the Dirichlet model, 46/104 = 0.44 in the normal model, and 213/310 = 0.69 when using the LRT. The number in parentheses is the number of significant bacteria from the first 100 bacteria. On average, if the significance level is 0.05, the Dirichlet model gives 62 (42) bacteria significant and the normal model gives 63 (41); 161 (91) bacteria are significant by the LRT. One estimate of the FDR is 20/62 = 0.32 in the Dirichlet model, 22/63 = 0.35 in the normal model, and 70/161 = 0.43 when using the LRT.

Then use the transformation log(Yᵢⱼ).
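The mean/overdispersion parameterization used above can be simulated with the standard gamma-Poisson mixture; a sketch (the sample size is ours, chosen only to check the moments):

```python
import numpy as np

rng = np.random.default_rng(6)

def neg_binomial(mu, r, size, rng):
    """Draw from the negative binomial with mean mu and variance mu + mu^2/r,
    via a gamma-Poisson mixture (gamma shape r, mean mu)."""
    lam = rng.gamma(shape=r, scale=mu / r, size=size)
    return rng.poisson(lam)

# Data Setup 1 cases: r = 0.1, mu = 2, so variance = 2 + 4/0.1 = 42.
cases = neg_binomial(2.0, 0.1, 100_000, rng)
print(cases.mean(), cases.var(), np.mean(cases == 0))
```

With r this small the draws are heavily zero-inflated with occasional large counts, which is the "extreme" behavior the setups are designed to produce.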


On average, if the significance level is 0.1, the Dirichlet model gives 128 (61) bacteria significant and the normal model gives 104 (58). One estimate of the FDR is 67/128 = 0.52 in the Dirichlet model and 46/104 = 0.44 in the normal model. On average, if the significance level is 0.05, the Dirichlet model gives 87 (26) bacteria significant and the normal model gives 95 (26). One estimate of the FDR is 61/87 = 0.70 in the Dirichlet model and 59/95 = 0.62 in the normal model.

Data Setup 2. Generate the data from negative binomial(μ, r). Generate 10 data sets, each with 800 rows. Take r = 0.02. Let μ = 1 for the cases in the first 100 rows, μ = 0.3 for the controls in the first 100 rows, and μ = 0.3 for the cases and controls in the remaining rows. Use the normal model, the Dirichlet model and the LRT to find the p values.

First use the usual transformation arcsin(√(Yᵢⱼ/nᵢⱼ)). On average, if the significance level is 0.1, the Dirichlet model gives 78 (24) bacteria significant and the normal model gives 96 (28); 145 (48) bacteria are significant by the LRT. One estimate of the FDR is 54/78 = 0.69 in the Dirichlet model, 68/96 = 0.70 in the normal model, and 97/145 = 0.67 when using the LRT. On average, if the significance level is 0.05, the Dirichlet model gives 41 (13) bacteria significant and the normal model gives 32 (12); 61 (29) bacteria are significant by the LRT. One estimate of the FDR is 28/41 = 0.68 in the Dirichlet model, 20/32 = 0.63 in the normal model, and 32/61 = 0.53 when using the LRT.

Then use the transformation log(Yᵢⱼ).


On average, if the significance level is 0.1, the Dirichlet model gives 87 (26) bacteria significant and the normal model gives 95 (26). One estimate of the FDR is 61/87 = 0.70 in the Dirichlet model and 69/95 = 0.72 in the normal model. On average, if the significance level is 0.05, the Dirichlet model gives 44 (13) bacteria significant and the normal model gives 31 (11). One estimate of the FDR is 28/44 = 0.64 in the Dirichlet model and 20/31 = 0.64 in the normal model.

Data Setup 3. Generate the data from negative binomial(μ, r). Generate 10 data sets, each with 800 rows. Take r = 0.005; with this small r we increase the variance of the negative binomial distribution, which helps us generate some "extreme" numbers. Let μ = 2 for the cases in the first 100 rows, μ = 0.3 for the controls in the first 100 rows, and μ = 0.3 for the cases and controls in the remaining rows. Use the normal model, the Dirichlet model and the LRT to find the p values.

First use the usual transformation arcsin(√(Yᵢⱼ/nᵢⱼ)). On average, if the significance level is 0.1, the Dirichlet model gives 52 (8) bacteria significant and the normal model gives 44 (7); 12 (3) bacteria are significant by the LRT. One estimate of the FDR is 44/52 = 0.84 in the Dirichlet model, 37/44 = 0.84 in the normal model, and 9/12 = 0.75 when using the LRT. On average, if the significance level is 0.05, the Dirichlet model gives 23 (3) bacteria significant and the normal model gives 3 (0); 2 (1) bacteria are significant by the LRT.


One estimate of the FDR is 20/23 = 0.87 in the Dirichlet model, 3/3 = 1.0 in the normal model, and 1/2 = 0.50 when using the LRT.

Then use the transformation log(Yᵢⱼ). On average, if the significance level is 0.1, the Dirichlet model gives 47 (8) bacteria significant and the normal model gives 44 (8). One estimate of the FDR is 39/47 = 0.83 in the Dirichlet model and 36/44 = 0.82 in the normal model. On average, if the significance level is 0.05, the Dirichlet model gives 23 (4) bacteria significant and the normal model gives 4 (0). One estimate of the FDR is 19/23 = 0.83 in the Dirichlet model and 1 in the normal model.

Data Setup 4. Generate the data from negative binomial(μ, r). Generate 10 data sets, each with 800 rows. Take r = 0.02. Let μ = 5 for the cases in the first 100 rows, μ = 0.3 for the controls in the first 100 rows, and μ = 0.3 for the cases and controls in the remaining rows. Use the normal model, the Dirichlet model and the LRT to find the p values.

First use the usual transformation arcsin(√(Yᵢⱼ/nᵢⱼ)). On average, if the significance level is 0.1, the Dirichlet model gives 80 (27) bacteria significant and the normal model gives 93 (30); 163 (61) bacteria are significant by the LRT. One estimate of the FDR is 53/80 = 0.66 in the Dirichlet model, 63/93 = 0.67 in the normal model, and 102/163 = 0.63 when using the LRT.


On average, if the significance level is 0.05, the Dirichlet model gives 44 (15) bacteria significant and the normal model gives 35 (13); 87 (41) bacteria are significant by the LRT. One estimate of the FDR is 29/44 = 0.66 in the Dirichlet model, 22/35 = 0.63 in the normal model, and 46/87 = 0.53 when using the LRT.

Then use the transformation log(Yᵢⱼ). On average, if the significance level is 0.1, the Dirichlet model gives 98 (32) bacteria significant and the normal model gives 90 (30). One estimate of the FDR is 66/98 = 0.67 in the Dirichlet model and 60/90 = 0.67 in the normal model. On average, if the significance level is 0.05, the Dirichlet model gives 50 (19) bacteria significant and the normal model gives 35 (13). One estimate of the FDR is 31/50 = 0.62 in the Dirichlet model and 22/35 = 0.63 in the normal model.

Data Setup 5. Generate the data from negative binomial(μ, r). Generate 10 data sets, each with 800 rows. Take r = 0.1. Let μ = 10 for the cases in the first 100 rows, μ = 0.3 for the controls in the first 100 rows, and μ = 0.3 for the cases and controls in the remaining rows. Use the normal model, the Dirichlet model and the LRT to find the p values.

First use the usual transformation arcsin(√(Yᵢⱼ/nᵢⱼ)). On average, if the significance level is 0.1, the Dirichlet model gives 94 (79) bacteria significant and the normal model gives 97 (80). One estimate of the FDR is 14/94 = 0.15 in the Dirichlet model and 17/97 = 0.18 in the normal model.


On average, if the significance level is 0.05, the Dirichlet model gives 74 (65) bacteria significant and the normal model gives 73 (64); 165 (100) bacteria are significant by the LRT. One estimate of the FDR is 9/74 = 0.12 in the Dirichlet model, 9/73 = 0.12 in the normal model, and 65/165 = 0.39 when using the LRT.

Then use the transformation log(Yᵢⱼ). On average, if the significance level is 0.1, the Dirichlet model gives 146 (82) bacteria significant and the normal model gives 97 (75). One estimate of the FDR is 64/146 = 0.44 in the Dirichlet model and 22/97 = 0.23 in the normal model. On average, if the significance level is 0.05, the Dirichlet model gives 102 (71) bacteria significant and the normal model gives 70 (59). One estimate of the FDR is 31/102 = 0.30 in the Dirichlet model and 11/70 = 0.16 in the normal model.

Data Setup 6. Generate the data from negative binomial(μ, r). Generate 10 data sets, each with 800 rows. Take r = 0.02. Let μ = 10 for the cases in the first 100 rows, μ = 0.3 for the controls in the first 100 rows, and μ = 0.3 for the cases and controls in the remaining rows. Use the normal model, the Dirichlet model and the LRT to find the p values.

First use the usual transformation arcsin(√(Yᵢⱼ/nᵢⱼ)). On average, if the significance level is 0.1, the Dirichlet model gives 77 (28) bacteria significant, the normal model gives 96 (31), and the LRT gives 117 (52). One estimate of the FDR is 49/77 = 0.64 in the Dirichlet model, 65/96 = 0.68 in the normal model, and 65/117 = 0.56 when using the LRT.


On average, if the significance level is 0.05, the Dirichlet model gives 44 (16) bacteria significant and the normal model gives 31 (13); 90 (52) bacteria are significant by the LRT. One estimate of the FDR is 28/44 = 0.64 in the Dirichlet model, 18/31 = 0.58 in the normal model, and 38/90 = 0.42 when using the LRT.

Then use the transformation log(Yᵢⱼ). On average, if the significance level is 0.1, the Dirichlet model gives 104 (37) bacteria significant and the normal model gives 96 (32). One estimate of the FDR is 67/104 = 0.64 in the Dirichlet model and 64/96 = 0.67 in the normal model. On average, if the significance level is 0.05, the Dirichlet model gives 55 (23) bacteria significant and the normal model gives 32 (14). One estimate of the FDR is 32/55 = 0.58 in the Dirichlet model and 18/32 = 0.56 in the normal model.

3.2.3 Discussion of the Simulation Study Results

From the simulation studies above, we find that there is no big difference between the performance of the classical linear mixed model and that of the Dirichlet process mixed model. The performance of the transformation arcsin(√(Yᵢⱼ/nᵢⱼ)) is very slightly better than that of the transformation log(Yᵢⱼ). However, the generated data sets are not that similar to the original one; that is, the data sets generated from the negative binomial distributions are not as extreme as the original real data. The original real data have about 94% zeros, with maximum value 1573 and minimum value 0. We can generate the corresponding zeros in the simulated data sets, but the maximum values in the simulated data sets are much smaller.


3.2.4 Results Using the Real Data Set

Now we move on to using model (3) to analyze the original data. First, we use arcsin(√(Yᵢⱼ/nᵢⱼ)) as the response in model (3). The Dirichlet mixed model with the transformation arcsin(√(Yᵢⱼ/nᵢⱼ)) tends to declare fewer bacteria significant: if the significance level is 0.1, the Dirichlet model gives 474 bacteria significant while the normal model gives 1005; if the significance level is 0.05, the Dirichlet model gives 205 bacteria significant while the normal model gives 488.


CHAPTER 4
BAYESIAN ESTIMATION UNDER THE DIRICHLET MIXED MODEL

In the previous sections, we discussed classical estimation under the Dirichlet process mixed model: we provided general conditions under which the OLS estimator is the BLUE, and we investigated ways to estimate the variance components σ² and ψ². All of that analysis of the Dirichlet model is from the frequentist viewpoint; another way to treat the Dirichlet process model is the Bayesian way. Since the 1990s, the Bayesian approach has seen the most use in Dirichlet mixed models. For example, Antoniak (1974) showed that the posterior distribution can be described as a mixture of Dirichlet processes; Escobar and West (1995) illustrated Bayesian density estimation using mixtures of Dirichlet processes; Kyung et al. (2009) investigated a variance reduction property of the Dirichlet process prior; and Kyung et al. (2010) provided a new Gibbs sampler for the linear Dirichlet mixed model and discussed estimation of the precision parameter of the Dirichlet process.

We move to Bayesian estimation in this section. In fact, we always put priors on β when using Bayesian methods. Different priors and different random effects might lead to different estimators, different MSEs and different Bayes risks. We can assume that the random effects follow a normal distribution, or that they follow a Dirichlet process; likewise, we can put a normal prior on β, or a flat prior on β. We will investigate which prior/model is better. In this part, we first present the four models we want to discuss and the corresponding Bayesian estimators; we then present the corresponding MSEs and Bayes risks of these Bayesian estimators and discuss which model is better. Furthermore, we investigate the Dirichlet process oneway model in more detail and compare the posterior variance, the variance, and the MSE.


4.1 Bayesian Estimators under Four Models

The BLUE and OLS estimators were discussed in detail in the previous chapters. We now move to Bayesian estimation. First we present four models with different priors on $\beta$ and different random effects; then we give the corresponding Bayesian estimators of $\beta$ under these four models.

4.1.1 Four Models and Corresponding Bayesian Estimators

We consider four models with different random effects and different priors on $\beta$:

Model 1 (Dirichlet + normal): the mixed model with Dirichlet process random effects and a normal prior on $\beta$ (as in Kyung et al. (2009)):
$$Y \mid \beta, A, \sigma^2 \sim N\big(X\beta, \sigma^2(I + cZAA'Z')\big); \quad \beta \sim N(0, \tau\sigma^2 I_p); \quad \sigma^2 \sim IG(a, b).$$

Model 2 (normal + normal): the mixed model with normal random effects and a normal prior on $\beta$:
$$Y \mid \beta, \sigma^2 \sim N\big(X\beta, \sigma^2(I + cZZ')\big); \quad \beta \sim N(0, \tau\sigma^2 I_p); \quad \sigma^2 \sim IG(a, b).$$

Model 3 (Dirichlet + flat prior): the mixed model with Dirichlet process random effects and a flat prior on $\beta$:
$$Y \mid \beta, A, \sigma^2 \sim N\big(X\beta, \sigma^2(I + cZAA'Z')\big); \quad \beta \sim \text{flat prior}; \quad \sigma^2 \sim IG(a, b).$$

Model 4 (normal + flat prior): the mixed model with normal random effects and a flat prior on $\beta$:
$$Y \mid \beta, \sigma^2 \sim N\big(X\beta, \sigma^2(I + cZZ')\big); \quad \beta \sim \text{flat prior}; \quad \sigma^2 \sim IG(a, b).$$

With these four models we can derive the corresponding Bayesian estimators. First consider Model 1. Let $\Sigma_A = I + cZAA'Z'$. The conditional density is
$$\pi(\beta, \sigma^2 \mid A, Y) \propto \Big(\frac{1}{\sigma^2}\Big)^{\frac{n+p}{2}+a+1} \exp\Big\{-\frac{b}{\sigma^2} - \frac{1}{2\tau\sigma^2}\beta^T\beta - \frac{1}{2\sigma^2}(Y - X\beta)^T\Sigma_A^{-1}(Y - X\beta)\Big\}.$$
Integrating out $\sigma^2$ gives
$$\pi(\beta \mid A, Y) = C(A, Y, n)\,\frac{\Gamma\big(\frac{n+p}{2}+a\big)}{\big(b + \frac{1}{2\tau}\beta^T\beta + \frac{1}{2}(Y - X\beta)^T\Sigma_A^{-1}(Y - X\beta)\big)^{\frac{n+p}{2}+a}},$$


where $C(A, Y, n)$ collects the normalizing constants, $C(A, Y, n) = \frac{b^a}{\Gamma(a)}(2\pi)^{-n/2}|\Sigma_A|^{-1/2}$. Summing over all possible $A$ matrices gives the density of $\beta \mid Y$:
$$\pi(\beta \mid Y) = \sum_A P(A)\,C(A, Y, n)\,\frac{\Gamma\big(\frac{n+p}{2}+a\big)}{\big(b + \frac{1}{2\tau}\beta^T\beta + \frac{1}{2}(Y - X\beta)^T\Sigma_A^{-1}(Y - X\beta)\big)^{\frac{n+p}{2}+a}}.$$
Let $H_A = \frac{1}{2}\big(X^T\Sigma_A^{-1}X + \frac{I_p}{\tau}\big)$, $S_A = \frac{1}{2}Y^T\Sigma_A^{-1}X$, and $D_A = H_A^{-1}S_A^T$. Then the density can be rewritten as
$$\pi(\beta \mid Y) = \sum_A P(A)\,C(A, Y, n)\,\frac{\Gamma\big(\frac{n+p}{2}+a\big)}{\big(b + \frac{1}{2}Y^T\Sigma_A^{-1}Y - D_A^T H_A D_A + (\beta - D_A)^T H_A (\beta - D_A)\big)^{\frac{n+p}{2}+a}}.$$
From this expression we can obtain the corresponding posterior mean and posterior variance. The Bayesian estimator of $\beta$, $\hat\beta_B$, is the mean of $\pi(\beta \mid Y)$, i.e.,
$$\hat\beta_B = E(\beta \mid Y) = \sum_A P(A)\,D_A = \sum_A P(A)\Big(X^T\Sigma_A^{-1}X + \frac{I_p}{\tau}\Big)^{-1}X^T\Sigma_A^{-1}Y.$$
It is clear that the Bayesian estimator under Model 1 is biased. A similar calculation gives the Bayesian estimator of $\beta$ under Model 2, $\hat\beta_I$,
$$\hat\beta_I = \Big(X^T\Sigma_I^{-1}X + \frac{I_p}{\tau}\Big)^{-1}X^T\Sigma_I^{-1}Y,$$
where $\Sigma_I = I + cZZ'$. Again, it is biased. For Model 3 the Bayesian estimator of $\beta$ is
$$\hat\beta_{BF} = \sum_A P(A)\big(X^T\Sigma_A^{-1}X\big)^{-1}X^T\Sigma_A^{-1}Y,$$
and for Model 4 it is
$$\hat\beta_{IF} = \big(X^T\Sigma_I^{-1}X\big)^{-1}X^T\Sigma_I^{-1}Y.$$
It is clear that under Models 3 and 4 the estimators $\hat\beta_{BF}$ and $\hat\beta_{IF}$ are unbiased, while under Models 1 and 2 the estimators $\hat\beta_B$ and $\hat\beta_I$ are biased. With these estimators, we


can calculate the corresponding MSEs and Bayes risks and compare the models in the following sections.

4.1.2 More General Cases

In Models 1 and 2 the prior on $\beta$ is $N(0, \tau\sigma^2 I_p)$, which has mean zero. Now consider a more general case in which the prior mean is not the zero vector. Let $\theta_0$ be a known vector, and modify the prior on $\beta$ in Models 1 and 2 to $N(\theta_0, \tau\sigma^2 I_p)$. Models 1 and 2 then become:

Model 1*: the mixed model with Dirichlet process random effects and a normal prior on $\beta$:
$$Y \mid \beta, A, \sigma^2 \sim N\big(X\beta, \sigma^2(I + cZAA'Z')\big); \quad \beta \sim N(\theta_0, \tau\sigma^2 I_p); \quad \sigma^2 \sim IG(a, b).$$

Model 2*: the mixed model with normal random effects and a normal prior on $\beta$:
$$Y \mid \beta, \sigma^2 \sim N\big(X\beta, \sigma^2(I + cZZ')\big); \quad \beta \sim N(\theta_0, \tau\sigma^2 I_p); \quad \sigma^2 \sim IG(a, b).$$

By a similar calculation, the Bayesian estimators for Models 1* and 2* are as follows:
$$\hat\beta_B = \sum_A P(A)\Big(X^T\Sigma_A^{-1}X + \frac{I_p}{\tau}\Big)^{-1}\Big(X^T\Sigma_A^{-1}Y + \frac{\theta_0}{\tau}\Big),$$
$$\hat\beta_I = \Big(X^T\Sigma_I^{-1}X + \frac{I_p}{\tau}\Big)^{-1}\Big(X^T\Sigma_I^{-1}Y + \frac{\theta_0}{\tau}\Big).$$

4.2 The One-way Model

In the previous section we obtained expressions for the Bayesian estimators under the four models. We now investigate the Bayesian version of the one-way model in more detail, again under the four models (Model 1-Model 4) discussed in the previous section. First we rewrite the Bayesian estimators under the four models in a more concise way and show the relationship between the Bayesian estimators and the BLUE; then we compare their variances, biases, and mean squared errors. In the last part of this section we discuss the choice of the parameter $\tau$.


4.2.1 Estimators

Under Model 1, the Bayesian estimator of $\mu$ specializes to
$$\hat\mu_B = \sum_A P(A)\,\frac{\mathbf{1}^T\Sigma_A^{-1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + \frac{1}{\tau}}\,Y.$$
Here $\sum_A P(A)\,\mathbf{1}^T\Sigma_A^{-1}/(\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + 1/\tau)$ is a vector; denote it $h^T = (h_1, \ldots, h_n)$, so that $\hat\mu_B = h^T Y$. Similarly, under Model 2 the Bayesian estimator of $\mu$ is
$$\hat\mu_I = \frac{\mathbf{1}^T\Sigma_I^{-1}}{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1} + \frac{1}{\tau}}\,Y,$$
where $\mathbf{1}^T\Sigma_I^{-1}/(\mathbf{1}^T\Sigma_I^{-1}\mathbf{1} + 1/\tau)$ is a vector; denote it $e^T = (e_1, \ldots, e_n)$, so that $\hat\mu_I = e^T Y$. Let $l_I = \sum_i e_i$. The following theorem gives the relationship between the Bayesian estimators $\hat\mu_B$, $\hat\mu_I$ and the BLUE.

Theorem 14. Let $l = \sum_i h_i$ and $l_I = \sum_i e_i$, where $0 < l \le l_I < 1$. Then
$$\hat\mu_B = h^T Y = l\,\bar{Y}, \qquad \hat\mu_I = e^T Y = l_I\,\bar{Y},$$
where $\bar{Y}$ is the BLUE of $\mu$ under the Dirichlet process one-way model. Moreover, $\hat\mu_{BF} = \hat\mu_{IF} = \bar{Y}$, where $\hat\mu_{BF}$ and $\hat\mu_{IF}$ are the Bayesian estimators under Models 3 and 4, respectively; $\bar{Y}$ is the BLUE, as shown before.

Proof. See Appendix D.
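For small designs the partition sum defining $\hat\mu_B$ can be evaluated exactly by enumerating all set partitions, which gives a numerical check of Theorem 14. The sketch below is our own illustration: the Ewens/Chinese-restaurant form of $P(A)$ and all function names are assumptions, not the dissertation's code.

```python
import numpy as np
from math import gamma

def set_partitions(items):
    """Enumerate all set partitions of a list (Bell-number many)."""
    if len(items) <= 1:
        yield [list(items)]
        return
    first, *rest = items
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def partition_prob(blocks, m, r):
    """Ewens probability of a partition of r subjects under DP precision m."""
    p = gamma(m) / gamma(m + r) * m ** len(blocks)
    for b in blocks:
        p *= gamma(len(b))
    return p

def mu_hat_B(Y, Z, c, tau, m):
    """mu_hat_B = sum_A P(A) 1'S_A^{-1} Y / (1'S_A^{-1}1 + 1/tau), S_A = I + cZAA'Z'."""
    n, r = Z.shape
    one = np.ones(n)
    est, tot = 0.0, 0.0
    for blocks in set_partitions(list(range(r))):
        A = np.zeros((r, len(blocks)))
        for j, b in enumerate(blocks):
            A[b, j] = 1.0
        Sinv = np.linalg.inv(np.eye(n) + c * Z @ A @ A.T @ Z.T)
        w = one @ Sinv
        pA = partition_prob(blocks, m, r)
        est += pA * (w @ Y) / (w @ one + 1.0 / tau)
        tot += pA
    assert abs(tot - 1.0) < 1e-8  # Ewens probabilities sum to one
    return est
```

With $Z = I_n$ (each subject its own random effect) one can verify that $\hat\mu_B/\bar{Y}$ is the same constant $l < 1$ for different responses, as Theorem 14 asserts.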


Because $\hat\mu_{BF} = \hat\mu_{IF} = \bar{Y}$, Models 3 and 4 give the same estimator of $\mu$ (i.e., the Bayesian estimators are the same under the Dirichlet process one-way model with a flat prior and the classical one-way model with a flat prior), and they are unbiased. Since $1 > l_I > l$, the Bayesian estimator under Model 1 shrinks more, and its bias is larger than that of the Bayesian estimator under Model 2. Thus, if we want the estimators of Models 1 and 2 to have the same bias, we should choose a bigger parameter $\tau$ in Model 1 than in Model 2. In particular, if we choose $\tau_1$ in Model 1, then $l = l_I$ when we choose
$$\tau_2 = \frac{l}{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1}\,(1 - l)}$$
in Model 2; with these $\tau_1$ and $\tau_2$ the corresponding estimators have the same bias.

4.2.2 Comparison and Choice of the Parameter Based on MSE

Models 1 and 2 of the previous section depend on the parameter $\tau$; Models 3 and 4 correspond to $\tau = \infty$. Under these four models the MSE of the estimators can be written as
$$\mathrm{MSE}(\hat\mu) = g^2\,\mathrm{Var}(\bar{Y}) + (1 - g)^2\mu^2,$$
where $g \in (0, 1)$ for Models 1 and 2 and $g = 1$ for Models 3 and 4. We therefore focus on this expression and look for a choice of $\tau$ that makes the MSE as small as possible. In fact,
$$\frac{\partial\,\mathrm{MSE}(\hat\mu)}{\partial g} = 2\big(\mu^2 + \mathrm{Var}(\bar{Y})\big)\Big(g - \frac{\mu^2}{\mathrm{Var}(\bar{Y}) + \mu^2}\Big) \begin{cases} \le 0, & g \le \frac{\mu^2}{\mathrm{Var}(\bar{Y}) + \mu^2}; \\[2pt] > 0, & g > \frac{\mu^2}{\mathrm{Var}(\bar{Y}) + \mu^2}, \end{cases}$$
so $\hat{g} = \frac{\mu^2}{\mathrm{Var}(\bar{Y}) + \mu^2}\ (< 1)$ makes the MSE smallest. For Model 1, $g(\tau) = \sum_A P(A)\,\frac{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + 1/\tau}$, which is a monotone, continuous function of $\tau$ with range $(0, 1)$. Thus there is a unique $\hat\tau$ such that
$$\sum_A P(A)\,\frac{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + 1/\hat\tau} = \frac{\mu^2}{\mathrm{Var}(\bar{Y} \mid \text{Dirichlet model}) + \mu^2}.$$
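A minimal sketch of solving this equation numerically, assuming only that $g(\tau)$ is increasing on $(0, \infty)$ (the closed-form $g$ used in the check is the Model 2 one-way form, for which $\hat\tau$ is known exactly):

```python
def solve_tau(g, target, lo=1e-8, hi=1e8, tol=1e-12):
    """Bisection for g(tau) = target, where g is increasing on (0, inf)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)
```

For Model 2, $g(\tau) = C/(C + 1/\tau)$ with $C = \mathbf{1}^T\Sigma_I^{-1}\mathbf{1}$, whose exact solution is $\hat\tau = \text{target}/\big(C\,(1 - \text{target})\big)$, so the bisection can be checked against the closed form.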


If we know $\mu$, we can use the bisection method to find this $\hat\tau$; with this choice of the parameter the MSE is smallest for Model 1. If we plug estimators of $\mu$ and $\mathrm{Var}(\bar{Y} \mid \text{Dirichlet model})$ into the equation above, we get an approximation of $\hat\tau$.

For Model 2, $g(\tau) = \frac{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1} + 1/\tau}$, and we can obtain $\hat\tau$ from
$$\frac{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1} + 1/\hat\tau} = \frac{\mu^2}{\mathrm{Var}(\bar{Y} \mid \text{normal model}) + \mu^2}.$$
Similarly, if we know $\mu$ we can find this $\hat\tau$ by bisection, and with this choice of the parameter the MSE is smallest for Model 2; plugging in estimators of $\mu$ and $\mathrm{Var}(\bar{Y} \mid \text{normal model})$ gives an approximation of $\hat\tau$.

Models 3 and 4 are the limiting cases of Models 1 and 2, respectively. Since the minimum of the MSE is attained at an interior point, the MSE at the optimal interior $\hat\tau$ under Models 1 and 2 is smaller than the MSE under Models 3 and 4, respectively.

The above discussion is based on $\mu$, which is unknown in practice. However, we can replace $\mu$ by $\bar{Y}$ in the two equations above to approximate the good choice of $\tau$. With this approximation, the estimators of $\mu$ under Models 1 and 2 become
$$\hat\mu_B = \frac{\bar{Y}^2}{\widehat{\mathrm{Var}}(\bar{Y} \mid \text{Dirichlet model}) + \bar{Y}^2}\,\bar{Y}, \qquad \hat\mu_I = \frac{\bar{Y}^2}{\widehat{\mathrm{Var}}(\bar{Y} \mid \text{normal model}) + \bar{Y}^2}\,\bar{Y}.$$

4.3 The MSE and Bayes Risks

We have obtained the Bayesian estimators under the four models. A natural question is: which model/prior is better? Comparing the models requires comparing the corresponding Bayesian estimators, which we usually do through their MSEs and Bayes risks. In this section we calculate the MSEs and Bayes risks of the Bayesian estimators in order to compare the four models. First we consider a special case, the one-way model; then we move to the general case. Throughout this part we use the sum of squared error loss.


4.3.1 One-way Model

In the previous part we obtained the estimators of $\mu$ under Models 1-4. Here we investigate the MSEs and Bayes risks of these estimators under the one-way model in more detail. Since
$$\mathrm{MSE}(\hat\mu_B) = \mathrm{Var}(l\bar{Y}) + \mathrm{Bias}^2(l\bar{Y}) = l^2\,\mathrm{Var}(\bar{Y}) + (1 - l)^2\mu^2,$$
and $l^2\,\mathrm{Var}(\bar{Y}) \le l_I^2\,\mathrm{Var}(\bar{Y})$ while $(1 - l)^2\mu^2 \ge (1 - l_I)^2\mu^2$, neither estimator dominates under a common model; the comparison depends on the true model and on $\tau$. The following theorem compares the two estimators, each under its own model.

Theorem 15. Let $\hat\mu_B = l\bar{Y}$ and $\hat\mu_I = l_I\bar{Y}$ with $1 > l_I > l$. When $\tau$ is big, e.g. when $\tau \ge \frac{2(ctr + 1)}{tr}$,
$$\mathrm{Var}(l\bar{Y} \mid \text{Dirichlet model}) \ge \mathrm{Var}(l_I\bar{Y} \mid \text{normal model})$$
and
$$\mathrm{MSE}(\hat\mu_B \mid \text{Dirichlet model}) \ge \mathrm{MSE}(\hat\mu_I \mid \text{normal model}).$$

Proof. See Appendix E.

The theorem tells us that although $1 > l_I > l$, the sign of $\mathrm{Var}(l\bar{Y} \mid \text{Dirichlet model}) - \mathrm{Var}(l_I\bar{Y} \mid \text{normal model})$ depends on $\tau$. Figure 4-1 shows the difference $\mathrm{Var}(l\bar{Y} \mid \text{Dirichlet model}) -$


$\mathrm{Var}(l_I\bar{Y} \mid \text{normal model})$ for different $\tau$. When $\tau$ is small, $\mathrm{Var}(l_I\bar{Y} \mid \text{normal model})$ is the bigger one; when $\tau$ is big, $\mathrm{Var}(l\bar{Y} \mid \text{Dirichlet model})$ is the bigger one.

On the other hand, based on the MSE expression above, the corresponding Bayes risks with squared error loss under Models 1 and 2 are
$$\mathrm{Bayes\ Risk} = \int\big(g^2\,\mathrm{Var}(\bar{Y}) + (1 - g)^2\mu^2\big)\,d\pi(\mu) = g^2\,\mathrm{Var}(\bar{Y}) + (1 - g)^2\tau\sigma^2 = \big(\mathrm{Var}(\bar{Y}) + \tau\sigma^2\big)g^2 - 2\tau\sigma^2 g + \tau\sigma^2,$$
where $g = l = \sum_A P(A)\,\frac{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + 1/\tau}$ in Model 1 and $g = l_I = \frac{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1} + 1/\tau}$ in Model 2.

Figure 4-2. The Bayes risks of the Bayesian estimators in Models 1-4 and the Bayes risk of the BLUE. m = 3.

Figure 4-2 shows the Bayes risks of $\hat\mu_B$, $\hat\mu_I$, $\hat\mu_{BF}$ and $\hat\mu_{IF}$. The Bayes risks in Figure 4-2 are calculated under the assumption that the corresponding model used is the true model, i.e., the Bayes risk of $\hat\mu_B$ is BayesRisk($\hat\mu_B \mid$ Dirichlet model) and the Bayes risk of $\hat\mu_I$ is BayesRisk($\hat\mu_I \mid$ normal model). In other words, the Bayes risks in Figure 4-2 are calculated under different true models.
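The variance comparison behind Theorem 15 can be probed numerically. The sketch below is our own illustration: it samples cluster sizes from a Chinese-restaurant process and averages the conditional variance $\mathrm{Var}(\bar{Y} \mid A) = \sigma^2\big(n + ct^2\sum_l r_l^2\big)/n^2$ (the one-way variance decomposition used in Chapter 5), comparing it with the normal-model value; `crp_sizes` and the parameter choices are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def crp_sizes(r, m):
    """Sample cluster sizes for r subjects from a Chinese-restaurant process."""
    sizes = []
    for _ in range(r):
        probs = np.array(sizes + [m], dtype=float)
        k = rng.choice(len(probs), p=probs / probs.sum())
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
    return sizes

def var_ybar_dirichlet(r, t, c, m, sigma2=1.0, reps=5000):
    """E_A Var(Ybar | A) = sigma^2 (n + c t^2 E[sum_l r_l^2]) / n^2."""
    n = r * t
    acc = 0.0
    for _ in range(reps):
        acc += sigma2 * (n + c * t**2 * sum(s * s for s in crp_sizes(r, m))) / n**2
    return acc / reps

def var_ybar_normal(r, t, c, sigma2=1.0):
    """Independent normal effects: Var(Ybar) = sigma^2 (1 + c t) / n."""
    return sigma2 * (1 + c * t) / (r * t)
```

Since $\sum_l r_l^2 \ge r$ for every partition, the Dirichlet-model value of $\mathrm{Var}(\bar{Y})$ is never below the normal-model one, with equality approached as $m \to \infty$ (all-singleton partitions).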


However, in reality the true model is a single model, which may be either the Dirichlet process one-way model or the normal one-way model. Thus it is more meaningful to compare the MSEs and Bayes risks of the Bayesian estimators under the same true model. Now assume the true model is the Dirichlet model (or the normal model); then we can compare the MSEs and Bayes risks of $\hat\mu_B$ and $\hat\mu_I$. We have the following theorem.

Theorem 16. Let $\hat\mu_B$ and $\hat\mu_I$ be as before: $\hat\mu_B = l\bar{Y}$, $\hat\mu_I = l_I\bar{Y}$. We have the following results:

(1) If the true model is the Dirichlet model, then $\mathrm{Var}(\hat\mu_B) \le \mathrm{Var}(\hat\mu_I)$ and $\mathrm{BayesRisk}(\hat\mu_B) \le \mathrm{BayesRisk}(\hat\mu_I)$. In addition, defining $\Delta_D = \mathrm{BayesRisk}(\hat\mu_I) - \mathrm{BayesRisk}(\hat\mu_B)$,
$$\Delta_D = \big(\mathrm{Var}(\bar{Y} \mid \text{Dirichlet model}) + \tau\sigma^2\big)(l_I - l)\Big(l_I + l - \frac{2\tau\sigma^2}{\mathrm{Var}(\bar{Y} \mid \text{Dirichlet model}) + \tau\sigma^2}\Big).$$

(2) If the true model is the normal model, then $\mathrm{Var}(\hat\mu_B) \le \mathrm{Var}(\hat\mu_I)$ and $\mathrm{BayesRisk}(\hat\mu_B) \ge \mathrm{BayesRisk}(\hat\mu_I)$. Letting $\Delta_N = \mathrm{BayesRisk}(\hat\mu_B) - \mathrm{BayesRisk}(\hat\mu_I)$,
$$\Delta_N = \big(\mathrm{Var}(\bar{Y} \mid \text{normal model}) + \tau\sigma^2\big)(l_I - l)\Big(\frac{2\tau\sigma^2}{\mathrm{Var}(\bar{Y} \mid \text{normal model}) + \tau\sigma^2} - l_I - l\Big).$$

(3) $\Delta_D \ge \Delta_N$.

Proof. See Appendix F.

This theorem measures and compares the increases in the Bayes risks when the model used is not the true model. If the increase is big, we know the estimator used is not so good; if the increase is not big, we know the estimator used is not so bad. The fact that $\Delta_D \ge \Delta_N$ tells us that if the true model is the Dirichlet model and we use the normal model as the true model, the increase $\Delta_D$ in the Bayes risk is bigger. On the other hand, if the true model is the normal model and we use the Dirichlet model as the


true model, the increase $\Delta_N$ in the Bayes risk is not big. The Dirichlet model thus has a kind of robustness property, and in this sense $\hat\mu_B$ is better than $\hat\mu_I$, i.e., we should use Model 1.

Since $\hat\mu_{BF} = \hat\mu_{IF} = \bar{Y}$, these two estimators have the same variance, the same MSE and the same Bayes risk, so there is no difference between using Model 3 and Model 4. Moreover, as will be shown in the proof of Theorem 18, the Bayes risk of $\hat\mu_B$ is always smaller than that of $\hat\mu_{BF}$, which means $\hat\mu_B$ is better than $\hat\mu_{BF}$; thus we should use a normal prior instead of the flat prior. In summary, under the one-way model we should use the Dirichlet model with a normal prior on $\mu$.

4.3.2 General Linear Mixed Model

In the previous section we considered the special case of the one-way model. We now move to the general case, still considering the four Bayesian setups, Models 1-4, which we compare in parts A-C.

Part A: Compare Model 1 and Model 2. Consider the following example.

Example 17. Let $\beta = [0, 1, 1, 1, 1]^T$ and
$$X = \begin{bmatrix} F \\ I_5 \\ \vdots \\ I_5 \end{bmatrix}, \qquad F = \begin{bmatrix} 1 & 2 & \cdots & 5 \end{bmatrix}.$$
Calculate BayesRisk($\hat\beta_B \mid$ Dirichlet model), BayesRisk($\hat\beta_I \mid$ normal model), MSE($\hat\beta_B \mid$ Dirichlet model), and MSE($\hat\beta_I \mid$ normal model).

The results are shown in Figures 4-3 and 4-4. As discussed before, the Bayes risks and MSEs in Figures 4-3 and 4-4 are calculated based on the assumption that the


corresponding models used are the true models, i.e., the Bayes risks and MSEs are calculated under different true models. We may be more interested in comparing the performance of the Bayesian estimators under the same true model. Thus we consider the four differences:

BayesRisk($\hat\beta_B \mid$ normal model) $-$ BayesRisk($\hat\beta_I \mid$ normal model);
BayesRisk($\hat\beta_I \mid$ Dirichlet model) $-$ BayesRisk($\hat\beta_B \mid$ Dirichlet model);
MSE($\hat\beta_B \mid$ normal model) $-$ MSE($\hat\beta_I \mid$ normal model);
MSE($\hat\beta_I \mid$ Dirichlet model) $-$ MSE($\hat\beta_B \mid$ Dirichlet model).

Figure 4-3. The Bayes risks.

As in Theorem 16, these four differences measure the increase in the Bayes risks/MSEs when we use a wrong model as the true model: if the difference is big, the estimator used is not so good; if the difference is not big, the estimator used is not bad. The four differences are shown in Figures 4-5 and 4-6.

The figures tell us that we should choose $\tau$ small, since the differences of the Bayes risks/MSEs are small when $\tau$ is small. In addition, when $\tau$ is small, the increase in the Bayes risk/MSE from using the Dirichlet process mixed model is not big if the true model is the normal model. Similarly,


when $\tau$ is big, the increase in the Bayes risk/MSE from using the normal model is not big if the true model is the Dirichlet process mixed model. Thus the estimator from Model 1 is better if $\tau$ is small; if we choose $\tau$ big, the estimator from the normal model (Model 2) is better.

Figure 4-4. The MSEs under different models.

Figure 4-5. The Bayes risks. $\sigma^2 = 5$.

Part B: Compare Model 3 and Model 4.


Figure 4-6. The MSEs. $\sigma^2 = 5$.

First compare the estimators under Model 3 and Model 4, i.e., $\hat\beta_{BF}$ and $\hat\beta_{IF}$. Note that both $\hat\beta_{BF}$ and $\hat\beta_{IF}$ are unbiased, and $\hat\beta_{IF}$ is the BLUE under the normal model. If OLS is the BLUE ($ZZ'H$ and $ZJZ'H$ are symmetric, where $H$ is the same as before), then $\hat\beta_{IF}$ is the OLS estimator, and the BLUE equals the OLS estimator under the Dirichlet model as well; thus $\hat\beta_{IF}$ is the BLUE under the Dirichlet model. It follows that $\mathrm{Var}(\hat\beta_{BF}) - \mathrm{Var}(\hat\beta_{IF})$ is always a nonnegative definite matrix, no matter what the true distribution of $Y$ is, so $\hat\beta_{IF}$ is better than $\hat\beta_{BF}$. That is, we should use the normal model when we use a flat prior on $\beta$ and the OLS estimator equals the BLUE. If OLS is not the BLUE, I currently have only numerical results.

Table 4-1. The MSEs with $\sigma^2 = 1$.

  True model        MSE($\hat\beta_{BF} \mid \sigma^2 = 1$)   MSE($\hat\beta_{IF} \mid \sigma^2 = 1$)
  Normal model      1.257                                     0.828
  Dirichlet model   1.071                                     1.2325

Table 4-2. The MSEs with $\sigma^2 = 5$.

  True model        MSE($\hat\beta_{BF} \mid \sigma^2 = 5$)   MSE($\hat\beta_{IF} \mid \sigma^2 = 5$)
  Normal model      6.287                                     4.142
  Dirichlet model   5.369                                     6.163


Tables 4-1 and 4-2 show the MSEs under the different true models. The MSEs are close; if we consider the differences of the MSEs as in Part A, we find that the normal model (Model 4) is better here. In short, Model 4 (the normal model) is better in this comparison.

Part C: Compare Model 1 and Model 3.

Finally we compare Model 1 and Model 3. In the one-way model section we saw that, with a good choice of the parameter $\tau$, Model 1 is better than Model 3. Do similar results hold in the general case? Consider again the same $\beta$ and $X$ as in Example 17. The corresponding MSEs of $\hat\beta_B$ and $\hat\beta_{BF}$ are shown in Figure 4-7. Clearly $\hat\beta_B$ is better in this case, since its MSE is always smaller; thus Model 1 is better.

Figure 4-7. The MSEs. $\sigma^2 = 5$.

In short: when comparing Model 1 vs Model 2 (Dirichlet + normal vs normal + normal), we should choose $\tau$ small and use the Dirichlet model (Dirichlet + normal). When comparing Model 3 vs Model 4 (Dirichlet + flat prior vs normal + flat prior), we should choose Model 4 (normal + flat prior). When comparing Model 1 vs Model 3 (Dirichlet + normal vs Dirichlet + flat prior), we should choose Model 1 (Dirichlet + normal).
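The risk differences of Theorem 16 are simple quadratics in the shrinkage factor, so they can be computed directly. A minimal sketch (the numerical inputs are illustrative; following the structure of the theorem's proof, $l$ and $l_I$ are taken as the optimal shrinkage factors under the Dirichlet and normal models, respectively):

```python
def bayes_risk(g, V, v0):
    """Risk of g*Ybar: (V + v0) g^2 - 2 v0 g + v0, with V = Var(Ybar), v0 = tau*sigma^2."""
    return (V + v0) * g**2 - 2.0 * v0 * g + v0

def delta_D(V_dir, v0, l, lI):
    """Extra Bayes risk from using the normal-model estimator when the truth is Dirichlet."""
    return bayes_risk(lI, V_dir, v0) - bayes_risk(l, V_dir, v0)

def delta_N(V_norm, v0, l, lI):
    """Extra Bayes risk from using the Dirichlet-model estimator when the truth is normal."""
    return bayes_risk(l, V_norm, v0) - bayes_risk(lI, V_norm, v0)
```

With $\mathrm{Var}(\bar{Y})$ larger under the Dirichlet model than under the normal model, both differences reduce to $(\mathrm{Var}(\bar{Y}) + \tau\sigma^2)(l_I - l)^2$ evaluated under the respective true model, making $\Delta_D \ge \Delta_N \ge 0$ transparent in this special case.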


CHAPTER 5
MINIMAXITY AND ADMISSIBILITY

Under the classical normal mixed model, the minimax estimators of the fixed effects are known in some special cases. We want to investigate whether minimax estimators still exist in Dirichlet process mixed models. In this chapter we discuss the minimaxity and admissibility of the estimators and show the admissibility of the usual confidence intervals. We consider only squared error loss.

5.1 Minimaxity and Admissibility of Estimators

First consider a special case, the Dirichlet process one-way model: $Y = \mu\mathbf{1} + BA\eta + \varepsilon$. The following theorem shows that $\bar{Y}$ is minimax and admissible under the Dirichlet process one-way model.

Theorem 18. $\bar{Y}$ is minimax and admissible with the Dirichlet process random effect.

Proof. Consider a sequence of prior distributions $\pi_n(\mu) = N(0, n\sigma^2)$. Let
$$l_n = \sum_A P(A)\,\frac{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + \frac{1}{n}}$$
and $r_0 = \mathrm{Var}(\bar{Y})$. By the previous discussion, $r_0 = c_0\sigma^2$, where $c_0$ is a constant free of $\mu$. The corresponding Bayes estimator is $\delta_n = l_n\bar{Y}$, and by a calculation similar to that of the previous chapter, the corresponding Bayes risk is
$$r_n = l_n^2\,\mathrm{Var}(\bar{Y}) + n(1 - l_n)^2\sigma^2 = \big(\mathrm{Var}(\bar{Y}) + n\sigma^2\big)l_n^2 - 2n\sigma^2 l_n + n\sigma^2.$$
Since $n(1 - l_n)^2 \to 0$ and $l_n \to 1$ as $n \to \infty$, we have $r_n \to r_0 = \mathrm{Var}(\bar{Y})$. By the inequality between the harmonic mean and the arithmetic mean,
$$\frac{\mathbf{1}^T\Sigma_A\mathbf{1}}{(rt)^2} \ge \frac{1}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1}},$$
where $\Sigma_A = I + cBAA'B'$, since $\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} = \sum_{i=1}^{rt}\frac{1}{ctr_{index(i)} + 1}$. Then
$$\frac{\mathrm{Var}(\bar{Y})}{\sigma^2} = \frac{1}{(rt)^2}\sum_A P(A)\,\mathbf{1}^T\Sigma_A\mathbf{1} \ge \sum_A P(A)\,\frac{1}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1}}.$$


Let $C_A = \mathbf{1}^T\Sigma_A^{-1}\mathbf{1}$. Then
$$r_n = \mathrm{Var}(\bar{Y})\Big\{\Big(\sum_A P(A)\frac{C_A}{C_A + \frac{1}{n}}\Big)^2 + \frac{\sigma^2}{n\,\mathrm{Var}(\bar{Y})}\sum_A P(A)\Big(\frac{1}{C_A + \frac{1}{n}}\Big)^2\Big\} \le \mathrm{Var}(\bar{Y})\Big\{\Big(\sum_A P(A)\frac{C_A}{C_A + \frac{1}{n}}\Big)^2 + \frac{1}{n}\sum_A P(A)\frac{1}{C_A + \frac{1}{n}}\Big\} = \mathrm{Var}(\bar{Y})\Big\{\Big(\sum_A P(A)\frac{C_A}{C_A + \frac{1}{n}}\Big)^2 + \Big(1 - \sum_A P(A)\frac{C_A}{C_A + \frac{1}{n}}\Big)\Big\} \le \mathrm{Var}(\bar{Y}).$$
Thus $r_n \to r_0 = \mathrm{Var}(\bar{Y})$ and $r_n \le r_0$. Since $\sup_\mu R(\mu, \bar{Y}) = \mathrm{Var}(\bar{Y}) = r_0$, by Theorem 5.1.12 in Lehmann and Casella (1998), $\bar{Y}$ is minimax. On the other hand, the discussion above shows that conditions (a)-(c) of Theorem 5.7.13 in Lehmann and Casella (1998) are satisfied; thus $\bar{Y}$ is admissible.

Now move to the model $Y = B\beta + BA\eta + \varepsilon$, where $B$ is the same as before. The BLUE (OLS) is $\bar{\mathbf{Y}} = (\bar{Y}_1, \bar{Y}_2, \ldots, \bar{Y}_r)$, where $\bar{Y}_i = \frac{1}{t}\sum_s Y_{is}$, $i = 1, \ldots, r$. Similarly, with the squared error loss $\frac{1}{r}\sum_i(\delta_i - \beta_i)^2$, $\bar{\mathbf{Y}}$ is minimax.

5.2 Admissibility of Confidence Intervals

In this section we show the admissibility of the usual frequentist confidence interval.

Theorem 19. The confidence interval for $\mu$ of the form $(\bar{Y} - c_0, \bar{Y} + c_0)$ is admissible in the Dirichlet one-way model.

Proof. As discussed before, $\bar{Y} \mid A \sim N(\mu, \sigma_A^2)$, where $\sigma_A^2 = \frac{1}{n^2}\big(n\sigma^2 + c\sigma^2 t^2\sum_l r_l^2\big)$. Let $f(y \mid A)$ be the normal density with mean $\mu$ and variance $\sigma_A^2$, and let $f_{\bar{Y}}(y)$ be the marginal density of $\bar{Y}$. Then
$$f_{\bar{Y}}(y) = \sum_A f(y \mid A)\,P(A),$$
i.e., the density of $\bar{Y}$ is a mixture of normal densities with the same mean. Assume there is another interval $(g(Y), h(Y))$ satisfying: (1) $h(Y) - g(Y) \le 2c_0$;


(2) for every $\mu$,
$$P_\mu\big(\mu \in (\bar{Y} - c_0, \bar{Y} + c_0)\big) = \sum_A P(A)\,P_\mu\big(\mu \in (\bar{Y} - c_0, \bar{Y} + c_0) \mid A\big) \le P_\mu\big(\mu \in (g(Y), h(Y))\big) = \sum_A P(A)\,P_\mu\big(\mu \in (g(Y), h(Y)) \mid A\big),$$
with strict inequality for at least one $\mu$. Under this assumption, there is at least one matrix $A_0$ such that
$$P_\mu\big(\mu \in (\bar{Y} - c_0, \bar{Y} + c_0) \mid A_0\big) \le P_\mu\big(\mu \in (g(Y), h(Y)) \mid A_0\big)$$
for every $\mu$, with strict inequality for at least one $\mu$. This contradicts the fact that, for every $A$, $\bar{Y} \pm c_0$ is admissible for the normal density $f(y \mid A)$. Thus the assumption is false: there is no such interval $(g(Y), h(Y))$. That is, the confidence interval for $\mu$ of the form $(\bar{Y} - c_0, \bar{Y} + c_0)$ is admissible.

For the model $Y = B\beta + BA\eta + \varepsilon$, $\bar{Y}_i \mid A \sim N(\beta_i, \tilde\sigma^2_{A,i})$. By a similar proof, the confidence interval of the form $(\bar{Y}_i - c_0, \bar{Y}_i + c_0)$ for $\beta_i$ is also admissible.
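Since $\bar{Y}$ is a scale mixture of normals centered at $\mu$, the coverage of $\bar{Y} \pm c_0$ is the $P(A)$-weighted mixture of the component coverages, which is the identity used in the proof above. A small sketch of that identity (the weights and standard deviations below are illustrative, not taken from any fitted model):

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def mixture_coverage(c0, weights, sds):
    """P(|Ybar - mu| < c0) when Ybar - mu is a mixture of N(0, s^2) components."""
    return sum(w * (2.0 * normal_cdf(c0 / s) - 1.0) for w, s in zip(weights, sds))
```

With a single component this reduces to the usual normal coverage (e.g. $c_0 = 1.96\,\sigma_A$ gives roughly 0.95); for a genuine mixture, the coverage lies strictly between the smallest and largest component coverages.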


CHAPTER 6
CONCLUSIONS AND FUTURE WORK

The previous chapters discussed the linear mixed model with Dirichlet process random effects from both the frequentist and the Bayesian viewpoints. We first considered the Dirichlet process as a model for classical random effects and investigated its effect on frequentist estimation in the linear mixed model. I discussed the relationship between the BLUE (Best Linear Unbiased Estimator) and OLS (Ordinary Least Squares) in Dirichlet process mixed models, and gave conditions under which the BLUE coincides with the OLS estimator in the Dirichlet process mixed model. In addition, I investigated the model from the Bayesian view, discussed the properties of the estimators under different model assumptions, compared the estimators under the frequentist model and different Bayesian models, and investigated minimaxity. Furthermore, we applied the linear mixed model with Dirichlet process random effects to a real data set and obtained satisfactory results.

The literature and the present work offer the potential to develop further results. My future research plan focuses on the Dirichlet mixed model from both the theoretical side and the applied side.

(1) In the real example of Kyung et al. (2010), the length of the credible interval obtained from the Dirichlet mixed model is smaller than that from the classical mixed model. Is this always true in general cases? I would like to investigate this question in more detail. The credible intervals from different Bayesian models may have different coverage probabilities: the Dirichlet mixed model gave shorter credible intervals in Kyung et al. (2010), but do these shorter intervals have higher or lower coverage probabilities? Are the coverage probabilities of the credible intervals from the Dirichlet mixed model always larger? I would like to compare the coverage probabilities of the credible intervals under the Dirichlet process mixed model and the classical normal mixed model in general.


(2) We have presented simulation studies and an application to a real data set in the previous chapters, with satisfactory results. I would therefore like to apply this model to more real problems that call for a more flexible, possibly nonparametric structure, such as problems in genetics and the social sciences.


APPENDIX A
PROOF OF THEOREM 2

Proof. Let
$$h(r, m) = \sum_{i=1}^{r-1}\frac{i\,m\,\Gamma(m + r - 1 - i)\,\Gamma(i)}{\Gamma(m + r)}.$$
We only need to prove that $h(r, m)$ is decreasing in $m$ on $m \ge \sqrt{(r-2)(r-1)}$ and on $0 \le m \le 2$. Let $h_i(r, m) = \frac{i\,m\,\Gamma(m + r - 1 - i)\,\Gamma(i)}{\Gamma(m + r)}$, so that $h(r, m) = \sum_{i=1}^{r-1}h_i(r, m)$. There are four cases for $r$.

Case 1: $r = 2$. Then $h = h_1 = \frac{m}{m(m+1)} = \frac{1}{m+1}$, which is obviously decreasing.

Case 2: $r = 3$. Then $h = h_1 + h_2 = \frac{m}{(m+1)(m+2)} + \frac{2m}{m(m+1)(m+2)} = \frac{1}{m+1}$, which is obviously decreasing.

Case 3: $r = 4$. Then $h = h_1 + h_2 + h_3 = \frac{1}{m+3} + \frac{4}{(m+1)(m+2)(m+3)}$, which is obviously decreasing.

Case 4: $r \ge 5$. We use mathematical induction to prove that $h(r, m)$ is decreasing on $m \ge \sqrt{(r-2)(r-1)}$ and on $0 \le m \le 2$. For $r = 2$, $h = \frac{1}{m+1}$ is decreasing. Assume that for $r = s$, $h(s, m)$ is decreasing on $m \ge \sqrt{(s-2)(s-1)}$ and on $0 \le m \le 2$, i.e., $\frac{\partial h(s, m)}{\partial m} \le 0$ there. Now consider $r = s + 1$: we need to show that
$$h(s + 1, m) = \sum_{i=1}^{s}h_i(s + 1, m) = \sum_{i=1}^{s}\frac{i\,m\,\Gamma(m + s - i)\,\Gamma(i)}{\Gamma(m + s + 1)}$$
is decreasing on $m \ge \sqrt{s(s-1)}$ and on $0 \le m \le 2$. A direct calculation gives
$$h_i(s + 1, m) = \Big(1 - \frac{i + 1}{m + s}\Big)h_i(s, m), \qquad i = 1, \ldots, s - 1.$$
Then
$$h(s + 1, m) = \sum_{i=1}^{s-1}h_i(s + 1, m) + h_s(s + 1, m) = h_s(s + 1, m) + \sum_{i=1}^{s-1}\Big(1 - \frac{i + 1}{m + s}\Big)h_i(s, m) = \frac{m\,\Gamma(s + 1)\,\Gamma(m)}{\Gamma(m + s + 1)} + h(s, m) - \sum_{i=1}^{s-1}\frac{m\,\Gamma(m + s - 1 - i)\,\Gamma(i + 2)}{\Gamma(m + s + 1)}$$


$$= h(s, m) + \frac{m\,\Gamma(s + 1)\,\Gamma(m)}{\Gamma(m + s + 1)} - \sum_{j=2}^{s}\frac{m\,\Gamma(m + s - j)\,\Gamma(j + 1)}{\Gamma(m + s + 1)} = h(s, m) + \frac{m\,\Gamma(s + 1)\,\Gamma(m)}{\Gamma(m + s + 1)} - h(s + 1, m) + h_1(s + 1, m).$$
In other words,
$$h(s + 1, m) = \frac{1}{2}\Big[h(s, m) + \frac{m\,\Gamma(s + 1)\,\Gamma(m)}{\Gamma(m + s + 1)} + \frac{m\,\Gamma(m + s - 1)}{\Gamma(m + s + 1)}\Big]. \tag{A}$$
Thus, by the induction hypothesis, we only need to prove that
$$\frac{\partial}{\partial m}\Big\{\frac{\Gamma(s + 1)\,\Gamma(m + 1)}{\Gamma(m + s + 1)} + \frac{m\,\Gamma(m + s - 1)}{\Gamma(m + s + 1)}\Big\} < 0$$
when $m^2 \ge s(s - 1)$ or $m \le 2$. In fact,
$$\frac{\partial}{\partial m}\Big\{\frac{\Gamma(s + 1)\,\Gamma(m + 1)}{\Gamma(m + s + 1)} + \frac{m\,\Gamma(m + s - 1)}{\Gamma(m + s + 1)}\Big\} = \frac{(s - 1)s - m^2}{(s + m)^2(m + s - 1)^2} - \frac{\Gamma(s + 1)}{\prod_{j=1}^{s}(m + j)}\sum_{j=1}^{s}\frac{1}{m + j}.$$
It is obvious that when $m^2 \ge s(s - 1)$ this derivative is negative. Now consider the interval $0 \le m \le 2$. Since $\frac{(s-1)s - m^2}{(s+m)^2(m+s-1)^2}$ is decreasing in $m$, we have
$$\frac{\partial}{\partial m}\Big\{\frac{\Gamma(s + 1)\,\Gamma(m + 1)}{\Gamma(m + s + 1)} + \frac{m\,\Gamma(m + s - 1)}{\Gamma(m + s + 1)}\Big\} \le \frac{s - 1}{s(s + 1)^2} - \frac{\Gamma(s + 1)}{\prod_{j=1}^{s}(2 + j)}\sum_{j=1}^{s}\frac{1}{2 + j} = \frac{s - 1}{s(s + 1)^2} - \frac{2}{(s + 2)(s + 1)}\sum_{j=1}^{s}\frac{1}{2 + j} \le \frac{(1 - \ln 4)s^2 - (\ln 4 + 3)s + 2}{s(s + 1)(s + 2)} < 0.$$
Thus, if $r \ge 5$, $h(r, m)$ is decreasing on $m^2 \ge (r - 2)(r - 1)$ and on $m \le 2$; in other words, if $r \ge 5$, $d$ is decreasing on these ranges. When $r < 5$, $d$ is a decreasing function of $m$ for all $m > 0$.


APPENDIX B
PROOF OF THEOREM 3

Proof. We only need to show that $g$ is an eigenvector of the matrix $ZWZ'$ if and only if $g \in \bigcup_{j=2}^{4}E_j$.

1. Sufficiency. Assume $g \in \bigcup_{j=2}^{4}E_j$. A direct calculation shows that $ZWZ'g$ is a constant multiple of $g$, which means $g$ is an eigenvector. The proof of this part is complete.

2. Necessity. Assume $g$ is an eigenvector of $ZWZ'$. If we can show that $\bigcup_{j=2}^{4}E_j$ contains all the eigenvectors, then $g \in \bigcup_{j=2}^{4}E_j$. For every $h \in E_1$, by the assumption and an algebraic calculation, $h$ is also an eigenvector corresponding to a certain nonzero eigenvalue of $WZ'Z$, and the sum of the geometric multiplicities of all these distinct nonzero eigenvalues is $r$. By linear algebra, if $h$ is an eigenvector of $(WZ')Z = WZ'Z$ corresponding to a nonzero eigenvalue, then $Zh$ is an eigenvector of $Z(WZ') = ZWZ'$ corresponding to the same eigenvalue, with the same algebraic and geometric multiplicities. So every element of $E_2 \cup E_3$ is an eigenvector corresponding to a certain nonzero eigenvalue of $ZWZ'$, and the sum of the geometric multiplicities of all these distinct nonzero eigenvalues is also $r$; $E_2 \cup E_3 \cup \{0\}$ is the union of the corresponding eigenspaces. For every $h \in E_4$ we have $ZWZ'h = 0$, so $h$ is an eigenvector corresponding to the eigenvalue $0$, whose geometric multiplicity is $n - r$; $E_4 \cup \{0\}$ is the corresponding eigenspace. The total of the geometric multiplicities of all these distinct (zero and nonzero) eigenvalues is $(n - r) + r = n$. In other words, $\bigcup_{j=2}^{4}E_j$ contains all the eigenvectors of $ZWZ'$; hence $g \in \bigcup_{j=2}^{4}E_j$. The proof of the necessity part is complete.


APPENDIX C
EVALUATION OF EQUATION (2-4)

Let $A = [a_1, \ldots, a_r]'$. It is straightforward to check that $\sum_A P(A)\,a_i'a_i = 1$. Moreover, for $i \ne j$,
$$d = E(a_i'a_j) = P(i, j \text{ in the same cluster}) = P(1, 2 \text{ in the same cluster}).$$
We can consider $(1, 2)$ as forming a new unit and partition the resulting $r - 1$ subjects, to get
$$P(1, 2 \text{ in the same cluster}) = \frac{\Gamma(m)}{\Gamma(m + r)}\sum_{k=1}^{r-1}\ \sum_{C:\,|C| = k,\ \sum r_i = r - 1}\Gamma(r_1 + 1)\,m^k\prod_{j=2}^{k}\Gamma(r_j) = \frac{\Gamma(m + r - 1)}{\Gamma(m + r)}\Bigg\{\frac{\Gamma(m)}{\Gamma(m + r - 1)}\sum_{k=1}^{r-1}\ \sum_{C:\,|C| = k,\ \sum r_i = r - 1}m^k\,r_1\prod_{j=1}^{k}\Gamma(r_j)\Bigg\} = \frac{\Gamma(m + r - 1)}{\Gamma(m + r)}\,E_{m, r-1}[r_1],$$
where $E_{m, r-1}[r_1]$ is an expectation with respect to the partitions of $r - 1$ subjects, $C$ denotes a partition, $|C| = k$ means the partition divides the sample into $k$ groups, and $r_1$ is the size of the cluster containing the new unit. Using an idea similar to the above, we calculate
$$E_{m, r-1}[r_1] = 1 \cdot P(r_1 = 1) + \cdots + (r - 1)\,P(r_1 = r - 1) = \frac{m\,\Gamma(m + r - 2)\,\Gamma(1)}{\Gamma(m + r - 1)}$$


$$+\ \frac{2m\,\Gamma(m + r - 3)\,\Gamma(2)}{\Gamma(m + r - 1)} + \cdots + (r - 1)\,\frac{m\,\Gamma(m)\,\Gamma(r - 1)}{\Gamma(m + r - 1)} = \sum_{i=1}^{r-1}\frac{i\,m\,\Gamma(m + r - 1 - i)\,\Gamma(i)}{\Gamma(m + r - 1)}.$$
Therefore
$$d = \sum_{i=1}^{r-1}\frac{i\,m\,\Gamma(m + r - 1 - i)\,\Gamma(i)}{\Gamma(m + r)}.$$
When $m \to 0$,
$$d = \Bigg[\sum_{i=1}^{r-2}\frac{i\,m\,\Gamma(m + r - 1 - i)\,\Gamma(i)}{\Gamma(m + r)}\Bigg] + (r - 1)\,\frac{m\,\Gamma(m)\,\Gamma(r - 1)}{\Gamma(m + r)} \to 1,$$
since $m\,\Gamma(m) = \Gamma(m + 1) \to 1$ while the remaining terms vanish. When $m \to \infty$,
$$d = \sum_{i=1}^{r-1}\frac{i\,m\,\Gamma(m + r - 1 - i)\,\Gamma(i)}{\Gamma(m + r)} \to 0.$$

APPENDIX D
PROOF OF THEOREM 14

Proof. Since $\Sigma_A = I + cBAA'B'$, we have
$$\Sigma_A^{-1} = I - BA\Big(A'B'BA + \frac{I}{c}\Big)^{-1}A'B', \qquad \Big(A'B'BA + \frac{I}{c}\Big)^{-1} = \mathrm{diag}\Big\{\frac{c}{ctr_j + 1}\Big\},$$
where $r_j$ is the sum of the $j$-th column of the matrix $A$. Let $index(i)$ denote the cluster that the $i$-th observation of the response belongs to; for example, $index(3) = 2$ means that the third observation corresponds to the second cluster, i.e., $y_3 = \mu + \eta_2 + \varepsilon_{23}$. Then
$$\mathbf{1}^T\Sigma_A^{-1} = \mathbf{1}^T\Big[I - BA\,\mathrm{diag}\Big\{\frac{c}{ctr_j + 1}\Big\}A'B'\Big] = \mathbf{1}^T - (tr_1, \ldots, tr_k)\,\mathrm{diag}\Big\{\frac{c}{ctr_j + 1}\Big\}A'B' = \mathbf{1}^T - \Big(\frac{ctr_{index(1)}}{ctr_{index(1)} + 1}, \ldots, \frac{ctr_{index(n)}}{ctr_{index(n)} + 1}\Big) = \Big(\frac{1}{ctr_{index(1)} + 1}, \ldots, \frac{1}{ctr_{index(n)} + 1}\Big),$$
so that
$$\frac{\mathbf{1}^T\Sigma_A^{-1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + \frac{1}{\tau}} = \frac{\big(\frac{1}{ctr_{index(1)} + 1}, \ldots, \frac{1}{ctr_{index(n)} + 1}\big)}{\sum_{i=1}^{n}\frac{1}{ctr_{index(i)} + 1} + \frac{1}{\tau}}. \tag{D-1}$$
For a given $(index(1), \ldots, index(n))$, let $(index(1)', \ldots, index(n)')$ be a possible permutation. If $A_1$ corresponds to $(index(1), \ldots, index(n))$ and $A_2$ corresponds to $(index(1)', \ldots, index(n)')$, then $P(A_1) = P(A_2)$. Set $f = \sum_{i=1}^{n}\frac{1}{ctr_{index(i)} + 1}$; then
$$\sum_{\text{permutations of }(index(1), \ldots, index(n))}\frac{\mathbf{1}^T\Sigma_A^{-1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + \frac{1}{\tau}} = \sum_{\text{permutations}}\frac{\big(\frac{1}{ctr_{index(1)} + 1}, \ldots, \frac{1}{ctr_{index(n)} + 1}\big)}{f + \frac{1}{\tau}} = (\#\text{ of permutations})\,\frac{1}{f + \frac{1}{\tau}}\,(f_1, \ldots, f_1),$$


where $f_1$ is a constant. Since $f = n - \sum_{i=1}^{n}\frac{ctr_{index(i)}}{ctr_{index(i)} + 1}$, we have $\sum f_1 = f$, i.e., $f_1 = \frac{f}{n}$. Thus
$$\sum_{\text{permutations of }(index(1), \ldots, index(n))}\frac{\mathbf{1}^T\Sigma_A^{-1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + \frac{1}{\tau}}\,Y = (\#\text{ of permutations})\,\frac{f/n}{f + \frac{1}{\tau}}\,(1, \ldots, 1)\,Y = (\#\text{ of permutations})\,\frac{f}{f + \frac{1}{\tau}}\,\bar{Y}.$$
Let $S = \{\text{all possible }(index(1), \ldots, index(n))\}$. By the previous definition,
$$l = \sum h_i = \sum_{(index(1), \ldots, index(n)) \in S}P(A)\,\frac{\sum_{i=1}^{n}\frac{1}{ctr_{index(i)} + 1}}{\sum_{i=1}^{n}\frac{1}{ctr_{index(i)} + 1} + \frac{1}{\tau}}.$$
Let $S_1$ be the largest subset of $S$ satisfying the condition that for every $(index(1), \ldots, index(n)) \in S_1$, its permutations do not belong to $S_1$. Then
$$\hat\mu_B = \Big\{\sum_A P(A)\,\frac{\mathbf{1}^T\Sigma_A^{-1}}{\mathbf{1}^T\Sigma_A^{-1}\mathbf{1} + \frac{1}{\tau}}\Big\}Y = \Big\{\sum_{S_1}P(A)\,(\#\text{ of permutations})\,\frac{f}{f + \frac{1}{\tau}}\Big\}\bar{Y} = \Big\{\sum_{S}P(A)\,\frac{\sum_i\frac{1}{ctr_{index(i)} + 1}}{\sum_i\frac{1}{ctr_{index(i)} + 1} + \frac{1}{\tau}}\Big\}\bar{Y} = l\,\bar{Y}.$$
Thus $\hat\mu_B = l\bar{Y}$, and in addition $l < 1$. Similarly,
$$\hat\mu_I = \frac{\mathbf{1}^T\Sigma_I^{-1}}{\mathbf{1}^T\Sigma_I^{-1}\mathbf{1} + \frac{1}{\tau}}\,Y = l_I\,\bar{Y}, \qquad l_I = \frac{\sum_{i=1}^{n}\frac{1}{ct + 1}}{\sum_{i=1}^{n}\frac{1}{ct + 1} + \frac{1}{\tau}}. \tag{D-2}$$
Since $f = n - \sum_{i=1}^{n}\frac{ctr_{index(i)}}{ctr_{index(i)} + 1} \le \sum_{i=1}^{n}\frac{1}{ct + 1}$, we have $l \le l_I$.


Since $\hat{\mu}_{BF}=\sum_A P(A)\,(X'\Sigma_A^{-1}X)^{-1}X'\Sigma_A^{-1}Y$ and $\hat{\mu}_{IF}=(X'\Sigma_I^{-1}X)^{-1}X'\Sigma_I^{-1}Y$, the same calculation as for $\hat{\mu}_B$ and $\hat{\mu}_I$ also gives $\hat{\mu}_{BF}=\hat{\mu}_{IF}=\bar{Y}$; these are the estimators corresponding to $\tau=\infty$ in Eqs. (D-1) and (D-2).
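The two identities that drive this proof, $\mathbf{1}'\Sigma_A^{-1} = \big(1/(ctr_{\mathrm{index}(i)}+1)\big)_i$ and the permutation-averaging step that turns the weight vector into a multiple of $\bar{Y}$, can be confirmed numerically. The sketch below (pure Python with exact rationals) takes $\Sigma_A = I + ct\,ZZ'$ for the $n\times k$ cluster-membership matrix $Z$, which is one concrete covariance of the stated form; the values of $c$, $t$, the partition, and $\sigma^2/\tau^2=1$ are hypothetical illustrations, not taken from the dissertation.

```python
from fractions import Fraction
from itertools import permutations

c, t = Fraction(1, 2), Fraction(3)            # hypothetical values
index = [1, 1, 2, 2, 2]                       # cluster label of each observation
n, k = len(index), max(index)
r = [index.count(j + 1) for j in range(k)]    # cluster sizes r_j

# Sigma_A = I + c*t * Z Z', with Z the n x k cluster-membership matrix
Z = [[Fraction(int(index[i] == j + 1)) for j in range(k)] for i in range(n)]
Sigma = [[Fraction(int(i == m)) + c * t * sum(Z[i][j] * Z[m][j] for j in range(k))
          for m in range(n)] for i in range(n)]

# claimed weights: 1' Sigma_A^{-1} = (1/(c t r_index(i) + 1))_i
w = [1 / (c * t * r[index[i] - 1] + 1) for i in range(n)]
# equivalent check without inverting: w * Sigma_A must equal the row vector 1'
assert all(sum(w[i] * Sigma[i][m] for i in range(n)) == 1 for m in range(n))

# averaging the weight vector over all distinct permutations of the index
# vector gives equal coordinates f/n, so the average estimator shrinks Ybar
f = sum(w)
perms = sorted(set(permutations(index)))
avg = [sum(Fraction(1) / (c * t * r[p[i] - 1] + 1) for p in perms) / len(perms)
       for i in range(n)]
assert all(a == f / n for a in avg)

# the two shrinkage factors, with sigma^2/tau^2 = 1 (hypothetical)
a = Fraction(1)
l = f / (f + a)                               # Dirichlet-model factor
fI = n / (c * t + 1)
lI = fI / (fI + a)                            # normal-model factor
assert l < lI                                 # l <= l_I, as in the proof
```

Because all arithmetic is done in `Fraction`, the identity checks are exact rather than up to floating-point tolerance.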


APPENDIX E
PROOF OF THEOREM 15

Proof. Since
\[
\mathrm{Var}(l\bar{Y}\mid \text{Dirichlet model})
= l^2\sum_A P(A)\,\frac{n\sigma^2+\sigma^2\sum_{i=1}^n ctr_{\mathrm{index}(i)}}{n^2},
\]
by the expression of $l$ we have
\[
\mathrm{Var}(l\bar{Y}\mid \text{Dirichlet model})
\ge \sum_A P(A)\,\frac{\sigma^2}{n^2}\sum_{i=1}^n\big(ctr_{\mathrm{index}(i)}+1\big)
\left[\frac{\sum_{i=1}^n\frac{1}{ctr_{\mathrm{index}(i)}+1}}
{\frac{\sigma^2}{\tau^2}+\sum_{i=1}^n\frac{1}{ctr_{\mathrm{index}(i)}+1}}\right]^2.
\]
Consider the function
\[
g_A(x_1,\ldots,x_n)=\sum_{i=1}^n x_i
\left[\frac{\sum_{i=1}^n\frac{1}{x_i}}{\frac{\sigma^2}{\tau^2}+\sum_{i=1}^n\frac{1}{x_i}}\right]^2,
\qquad ct+1\le x_i\le ctr+1,
\]
where $r=\max_j r_j$ denotes the largest cluster size. When $\tau^2/\sigma^2\ge 2(ctr+1)/(tr)$, for every $j$,
\begin{align*}
\frac{\partial g_A}{\partial x_j}
&= \frac{\sum_{i=1}^n\frac{1}{x_i}}{\Big(\frac{\sigma^2}{\tau^2}+\sum_{i=1}^n\frac{1}{x_i}\Big)^2}
\left[\sum_{i=1}^n\frac{1}{x_i}
-\frac{2\big(\sum_i x_i\big)\frac{\sigma^2}{\tau^2}\frac{1}{x_j^2}}{\frac{\sigma^2}{\tau^2}+\sum_{i=1}^n\frac{1}{x_i}}\right] \\
&\ge \frac{\sum_{i=1}^n\frac{1}{x_i}}{\Big(\frac{\sigma^2}{\tau^2}+\sum_{i=1}^n\frac{1}{x_i}\Big)^2}
\left[\sum_{i=1}^n\frac{1}{x_i}
-2\Big(\sum_i x_i\Big)\frac{\frac{tr}{ctr+1}\,\frac{1}{(ctr+1)^2}}{\frac{2tr}{ctr+1}}\right]\ \ge\ 0,
\end{align*}
which means that $g_A$ is an increasing function of each $x_j$. Thus,
\[
\mathrm{Var}(l\bar{Y}\mid\text{Dirichlet model})
\ \ge\ \sum_A P(A)\,\frac{\sigma^2}{n^2}\,g_A\big(ctr_{\mathrm{index}(1)}+1,\ldots,ctr_{\mathrm{index}(n)}+1\big)
\ \ge\ \sum_A P(A)\,\frac{\sigma^2}{n^2}\,g_A(ct+1,\ldots,ct+1)
= \mathrm{Var}(l_I\bar{Y}\mid\text{normal model}).
\]
Thus, when $\tau^2/\sigma^2\ge 2(ctr+1)/(tr)$, we have $\mathrm{Var}(l\bar{Y}\mid\text{Dirichlet model})\ge\mathrm{Var}(l_I\bar{Y}\mid\text{normal model})$. Since $l\le l_I$, we have $\mathrm{MSE}(\hat{\mu}_B)\ge\mathrm{MSE}(\hat{\mu}_I)$.
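The monotonicity claim for $g_A$ can be spot-checked numerically. The sketch below uses hypothetical values $c=t=1$, largest cluster size $r=2$, $n=4$, and $\tau^2/\sigma^2=4$, so that the condition $\tau^2/\sigma^2\ge 2(ctr+1)/(tr)=3$ holds and the domain is $ct+1=2\le x_i\le ctr+1=3$; none of these numbers come from the dissertation.

```python
from itertools import product

a = 1 / 4.0                      # sigma^2 / tau^2 (hypothetical: tau^2/sigma^2 = 4)

def g(x):
    # g_A(x_1,...,x_n) = (sum x_i) * [ (sum 1/x_i) / (a + sum 1/x_i) ]^2
    T = sum(1.0 / xi for xi in x)
    return sum(x) * (T / (a + T)) ** 2

grid = [2.0, 2.5, 3.0]           # points in the domain [ct+1, ctr+1] = [2, 3]
for x in product(grid, repeat=4):
    for j in range(4):
        if x[j] < 3.0:
            y = list(x)
            y[j] += 0.5          # increase a single coordinate
            assert g(tuple(y)) >= g(x)       # g_A increasing in each x_j
    # hence g_A is minimized at the all-(ct+1) corner of the domain
    assert g(x) >= g((2.0, 2.0, 2.0, 2.0))
```

The corner $(ct+1,\ldots,ct+1)$ is the configuration in which every observation forms its own cluster ($r_j\equiv 1$), which is where the variance under the normal model is read off.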


APPENDIX F
PROOF OF THEOREM 16

Proof. Since $l\le l_I$, we always have $\mathrm{Var}(\hat{\mu}_B\mid\text{true model})\le\mathrm{Var}(\hat{\mu}_I\mid\text{true model})$, no matter what the true model is.

Part 1: Assume the true model is the Dirichlet model. For $\hat{\mu}_B$ and $\hat{\mu}_I$, the Bayes risks have the form
\[
\text{Bayes Risk}=\int\big(g^2\,\mathrm{Var}(\bar{Y})+(1-g)^2\mu^2\big)\,d\pi(\mu)
=\big(\mathrm{Var}(\bar{Y})+\tau^2\big)g^2-2\tau^2 g+\tau^2,
\]
which can be considered as a quadratic function of $g$ with axis of symmetry $g_0=\tau^2/\big(\tau^2+\mathrm{Var}(\bar{Y})\big)$. For $\hat{\mu}_B$, $g=l$, and for $\hat{\mu}_I$, $g=l_I$. By the proof of Theorem 18, we know that $\mathrm{Var}(\bar{Y})\ge\sigma^2\sum_A P(A)\frac{1}{\mathbf{1}'\Sigma_A^{-1}\mathbf{1}}$ under the Dirichlet model. Then, by Jensen's inequality, we have
\[
g_0=\frac{\tau^2}{\tau^2+\mathrm{Var}(\bar{Y})}
\ \le\ \frac{1}{1+\frac{\sigma^2}{\tau^2}\sum_A P(A)\frac{1}{\mathbf{1}'\Sigma_A^{-1}\mathbf{1}}}
\ \le\ \sum_A P(A)\,\frac{1}{1+\frac{\sigma^2}{\tau^2}\,\frac{1}{\mathbf{1}'\Sigma_A^{-1}\mathbf{1}}}
= l\ \le\ l_I.
\]
Thus, when the true model is the Dirichlet model, $\text{BayesRisk}(\hat{\mu}_B)\le\text{BayesRisk}(\hat{\mu}_I)$. In addition,
\[
\Delta_D=\text{BayesRisk}(\hat{\mu}_I)-\text{BayesRisk}(\hat{\mu}_B)
=\big(\mathrm{Var}(\bar{Y})+\tau^2\big)(l_I-l)\Big(l_I+l-\frac{2\tau^2}{\mathrm{Var}(\bar{Y})+\tau^2}\Big).
\]

Part 2: Assume the true model is the normal model. By a similar argument,
\[
g_0=\frac{\tau^2}{\tau^2+\mathrm{Var}(\bar{Y})}
=\frac{1}{1+\frac{\sigma^2}{\tau^2}\,\frac{1+tc}{n}}=l_I\ \ge\ l.
\]
Thus, when the true model is the normal model, $\text{BayesRisk}(\hat{\mu}_B)\ge\text{BayesRisk}(\hat{\mu}_I)$. In addition,
\[
\Delta_N=\text{BayesRisk}(\hat{\mu}_B)-\text{BayesRisk}(\hat{\mu}_I)
=\big(\mathrm{Var}(\bar{Y})+\tau^2\big)(l_I-l)\Big(\frac{2\tau^2}{\mathrm{Var}(\bar{Y})+\tau^2}-l_I-l\Big).
\]
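The quadratic structure of the Bayes risk is easy to verify directly. The sketch below (illustrative numbers only, not taken from the dissertation) checks that $g_0$ minimizes the risk, that the risk at $l$ is no larger than at $l_I$ whenever $g_0\le l\le l_I$, and the closed form of $\Delta_D$:

```python
def risk(g, V, tau2):
    # Bayes risk of the estimator g * Ybar: (V + tau^2) g^2 - 2 tau^2 g + tau^2
    return (V + tau2) * g * g - 2 * tau2 * g + tau2

tau2, V = 2.0, 3.0               # hypothetical prior variance and Var(Ybar)
g0 = tau2 / (tau2 + V)           # axis of symmetry of the parabola: 0.4
eps = 1e-6
assert risk(g0, V, tau2) <= risk(g0 - eps, V, tau2)
assert risk(g0, V, tau2) <= risk(g0 + eps, V, tau2)

l, lI = 0.5, 0.7                 # any pair with g0 <= l <= l_I
assert risk(l, V, tau2) <= risk(lI, V, tau2)

# Delta_D = (V + tau^2)(l_I - l)(l_I + l - 2 tau^2 / (V + tau^2))
D = risk(lI, V, tau2) - risk(l, V, tau2)
assert abs(D - (V + tau2) * (lI - l) * (lI + l - 2 * tau2 / (V + tau2))) < 1e-12
```

Because a parabola's risk grows with the distance from its vertex, any two coefficients on the same side of $g_0$ are ordered by that distance, which is all the comparison of $l$ and $l_I$ uses.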


Part 3: By the above calculations, we have
\begin{align*}
\Delta_D-\Delta_N
&=4\tau^2(l_I-l)\left[\frac{(l+l_I)\big(2\tau^2+\mathrm{Var}(\bar{Y}\mid\text{Dirichlet model})+\mathrm{Var}(\bar{Y}\mid\text{normal model})\big)}{4\tau^2}-1\right] \\
&=4\tau^2(l_I-l)\left[(l+l_I)\,\frac{\big(\tau^2+\mathrm{Var}(\bar{Y}\mid\text{Dirichlet model})\big)+\big(\tau^2+\mathrm{Var}(\bar{Y}\mid\text{normal model})\big)}{4\tau^2}-1\right] \\
&\ge 4\tau^2(l_I-l)\left[(l+l_I)\Big(\frac{1}{4}\,\frac{1}{l}+\frac{1}{4}\,\frac{1}{l_I}\Big)-1\right] \\
&=4\tau^2(l_I-l)\left[\frac{1}{4}\,\frac{l_I}{l}+\frac{1}{4}\,\frac{l}{l_I}-\frac{1}{2}\right]\ \ge\ 0.
\end{align*}
Thus $\Delta_D\ge\Delta_N$.
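When $l=\tau^2/(\tau^2+\mathrm{Var}(\bar{Y}\mid\text{Dirichlet model}))$ holds with equality, the inequality in Part 3 collapses to the AM-GM fact $\tfrac14(l_I/l+l/l_I)\ge\tfrac12$, and $\Delta_D-\Delta_N$ matches the final bracket exactly. A numeric sketch (illustrative values only, not from the dissertation):

```python
tau2 = 2.0
l, lI = 0.5, 0.7                       # shrinkage factors with l <= l_I
VD = tau2 * (1 - l) / l                # makes l   = tau2/(tau2 + VD) exactly
VN = tau2 * (1 - lI) / lI              # makes l_I = tau2/(tau2 + VN) exactly

def risk(g, V):
    return (V + tau2) * g * g - 2 * tau2 * g + tau2

dD = risk(lI, VD) - risk(l, VD)        # Delta_D, Dirichlet model true
dN = risk(l, VN) - risk(lI, VN)        # Delta_N, normal model true
rhs = 4 * tau2 * (lI - l) * ((lI / l + l / lI) / 4.0 - 0.5)

assert rhs >= 0                        # AM-GM: l_I/l + l/l_I >= 2
assert dD >= dN
assert abs((dD - dN) - rhs) < 1e-12    # boundary case: exact equality
```

The `VD` and `VN` assignments pin each shrinkage factor at the vertex of its own risk parabola, which is the boundary case of the inequality chain; moving `VD` larger only increases `dD - dN`.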


REFERENCES

Afshartous, D. and Wolf, M. (2007). Avoiding 'data snooping' in multilevel and mixed effects models. J. R. Statist. Soc. A, 170, 1035.

Antoniak, C. (1974). Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Annals of Statistics, 2, 1152.

Blackwell, D. and MacQueen, J. (1973). Ferguson distributions via Polya urn schemes. Annals of Statistics, 1, 353.

Brown, K. G. (1976). Asymptotic behavior of MINQUE-type estimators of variance components. The Annals of Statistics, 4, 746.

Burr, D. and Doss, H. (2005). A Bayesian semi-parametric model for random effects meta analysis. J. Amer. Statist. Assoc., 100, 242.

Chaubey, Y. P. (1980). Application of the method of MINQUE for estimation in regression with intraclass covariance matrix. Sankhya: The Indian Journal of Statistics, 42, 28.

Das, K., Jiang, J. and Rao, J. (2004). Mean squared error of empirical predictor. The Annals of Statistics, 32, 818.

Day, N. E. (1969). Estimating the components of a mixture of normal distributions. Biometrika, 56, 463.

Demidenko, E. (2004). Mixed Models: Theory and Applications. Wiley.

Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90, 577.

Ferguson, T. (1973). A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1, 209.

Gill, J. and Casella, G. (2009). Nonparametric priors for ordinal Bayesian social science models: Specification and estimation. J. Amer. Statist. Assoc., 104, 453.

Harville, D. (1976). Extension of the Gauss-Markov theorem to include the estimation of random effects. The Annals of Statistics, 4, 384.

Harville, D. (1977). Maximum likelihood approaches to variance component estimation and to related problems. Journal of the American Statistical Association, 72, 320.

Huang, S. and Lu, H. H. (2001). Extended Gauss-Markov theorem for nonparametric mixed-effects models. Journal of Multivariate Analysis, 76, 249.

Jiang, J. (2007). Linear and Generalized Linear Mixed Models and Their Applications. Springer.


Kackar, R. and Harville, D. (1981). Unbiasedness of two-stage estimation and prediction procedures for mixed linear models. Comm. Statist. A-Theory Methods, 10, 1249.

Kackar, R. and Harville, D. (1984). Approximations for standard errors of estimators of fixed and random effects in mixed linear models. Journal of the American Statistical Association, 79, 853.

Korwar, R. M. and Hollander, M. (1973). Contributions to the theory of Dirichlet processes. Ann. Probab., 1, 705.

Kyung, M., Gill, J. and Casella, G. (2009). Characterizing the variance improvement in linear Dirichlet random effects models. Statistics and Probability Letters, 79, 2343.

Kyung, M., Gill, J. and Casella, G. (2010). Estimation in Dirichlet random effects models. The Annals of Statistics, 38, 979.

Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation. Springer.

Liu, J. S. (1996). Nonparametric hierarchical Bayes via sequential imputations. The Annals of Statistics, 24, 911.

Lo, A. Y. (1984). On a class of Bayesian nonparametric estimates: I. Density estimates. The Annals of Statistics, 12, 351.

MacEachern, S. N. and Müller, P. (1998). Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7, 223.

Mathew, T. (1984). On nonnegative quadratic unbiased estimability of variance components. The Annals of Statistics, 12, 1566.

McCulloch, C. E. and Searle, S. R. (2001). Generalized, Linear and Mixed Models. Wiley-Interscience.

Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9, 249.

Peters, B. and Walker, H. F. (1978). An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions. SIAM J. of Appl. Math., 35, 362.

Pukelsheim, F. (1981). On the existence of unbiased nonnegative estimates of variance covariance components. The Annals of Statistics, 9, 293.

Puntanen, S. and Styan, G. (1989). The equality of the ordinary least squares estimator and the best linear unbiased estimator. The American Statistician, 43, 153.


Rao, C. R. (1979). MINQE theory and its relation to ML and MML estimation of variance components. Sankhya: The Indian Journal of Statistics, 41, 138.

Rao, P. S. R. S. (1977). Theory of the MINQUE: A review. Sankhya: The Indian Journal of Statistics, 39, 201.

Rao, P. S. R. S. and Chaubey, Y. P. (1978). Three modifications of the principle of the MINQUE. Comm. Statist. A-Theory Methods, A7.

Robinson, G. (1991). That BLUP is a good thing: The estimation of random effects. Statistical Science, 6, 15.

Searle, S., Casella, G. and McCulloch, C. (2006). Variance Components. John Wiley and Sons, New Jersey.

Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica Sinica, 4, 639.

Xu, L. and Jordan, M. I. (1996). On convergence properties of the EM algorithm for Gaussian mixtures. Neural Computation, 8, 129.

Young, T. Y. and Coraluppi, G. (1969). On estimation of a mixture of normal density functions. In 1969 IEEE Symposium on Adaptive Processes (8th): Decision and Control, 63.

Zyskind, G. (1967). On canonical forms, non-negative covariance matrices and best and simple least squares linear estimators in linear models. Ann. Math. Statist., 38, 1092.

Zyskind, G. and Martin, F. B. (1969). On best linear estimation and a general Gauss-Markov theorem in linear models with arbitrary nonnegative covariance structure. SIAM Journal on Applied Mathematics, 17, 1190.


BIOGRAPHICAL SKETCH

Chen Li was born in P. R. China. She received her bachelor's degree in applied mathematics from Tongji University in Shanghai, P. R. China, in 2002, and completed her Ph.D. in applied mathematics at Tongji University in 2007; that research focused on numerical methods for partial differential equations. She then joined the Department of Statistics at the University of Florida, where she received her Ph.D. in statistics in 2012. Her statistics research treated the linear mixed model with Dirichlet process random effects from both the frequentist and the Bayesian points of view.