
High Dimensional Inference and Variable Selection


Material Information

Title:
High Dimensional Inference and Variable Selection
Physical Description:
1 online resource (73 p.)
Language:
english
Creator:
Dasgupta, Shibasish
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Statistics
Committee Chair:
Ghosh, Malay
Committee Co-Chair:
Khare, Kshitij
Committee Members:
Bliznyuk, Nikolay A
Lu, Xiaomin

Subjects

Subjects / Keywords:
asymptotic -- bayesian -- consistency -- dimensional -- expansion -- glm -- high -- inference -- kullback -- lasso -- leibler -- normality -- oracle -- posterior -- property -- selection -- variable
Statistics -- Dissertations, Academic -- UF
Genre:
Statistics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
This dissertation consists of three research projects. The first project focuses on asymptotic expansions of posteriors for generalized linear models (GLM) with canonical link functions when the number of regressors grows to infinity at a certain rate relative to the growth of the sample size. As a side result, we have also proved asymptotic normality of the maximum likelihood estimator in a high-dimensional GLM setup. The second project considers posterior consistency in the context of high dimensional variable selection using the Bayesian lasso algorithm. In a frequentist setting, consistency is perhaps the most basic property that we expect any reasonable estimator to achieve. However, in a Bayesian setting, consistency is often ignored or taken for granted, especially in more complex hierarchical Bayesian models. We have derived sufficient conditions for posterior consistency in the Bayesian lasso model (with orthogonal design), where the number of parameters grows with the sample size. The last part of my thesis proposes a new variable selection technique using the Kullback-Leibler (KL) divergence loss and establishes related asymptotic properties.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Shibasish Dasgupta.
Thesis:
Thesis (Ph.D.)--University of Florida, 2013.
Local:
Adviser: Ghosh, Malay.
Local:
Co-adviser: Khare, Kshitij.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2015-08-31

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2013
System ID:
UFE0045721:00001





Full Text

HIGH DIMENSIONAL INFERENCE AND VARIABLE SELECTION

By

SHIBASISH DASGUPTA

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2013

© 2013 Shibasish Dasgupta

To my parents as well as to everyone whose love has made my life what it is and made me the person I am

ACKNOWLEDGMENTS

Everything that I have accomplished in life has been the product of the support that I have been fortunate enough to receive from the people who have always wanted what was best for me. To my family (especially, to my parents, my sister, brother-in-law and my sweet little niece) who has loved me, my (infinite) friends from all over the world who have supported me, all the professors in our department (especially, my advisers Prof. Malay Ghosh and Prof. Kshitij Khare, without whom it would not have been possible for me to finish my PhD), the other members of my PhD committee, all my teachers, advisers/guides/mentors in India and abroad who have inspired me, the staff in our department and all the other persons in my life who have helped me, specifically, my near and dear ones in Gainesville (especially, Dola Kakima, Indraneelda, Arunavada, my brothers Subho, Sayan and all my close ones), the students in my own classes who have tolerated me, and all the people I have ever hurt or let down who have forgiven me: Thank you.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT

CHAPTER
1 INTRODUCTION
2 ASYMPTOTIC EXPANSION OF THE POSTERIOR DENSITY IN HIGH DIMENSIONAL GENERALIZED LINEAR MODELS
  2.1 Literature Survey and Proposed Work
  2.2 Setup and Assumptions
  2.3 Main Result
  2.4 Asymptotic Normality of the MLE
3 HIGH DIMENSIONAL POSTERIOR CONSISTENCY OF THE BAYESIAN LASSO
  3.1 Literature Survey and Proposed Work
    3.1.1 Interpretation of Posterior Consistency
    3.1.2 Formal Definition and Choice of Vector Norm
    3.1.3 Conditional Independence Prior & the Bayesian Lasso
    3.1.4 Shrinkage Priors
  3.2 Main Result
4 VARIABLE SELECTION WITH KULLBACK-LEIBLER DIVERGENCE LOSS
  4.1 Literature Survey and Proposed Work
    4.1.1 Variable Selection and The LASSO
    4.1.2 Oracle Property and the Adaptive Lasso
  4.2 An Alternative Approach to the Adaptive Lasso through the KL Divergence Loss
  4.3 Computations
  4.4 Numerical Example: Diabetes Data
  4.5 Further Extension
5 CONCLUSIONS

REFERENCES
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

HIGH DIMENSIONAL INFERENCE AND VARIABLE SELECTION

By

Shibasish Dasgupta

August 2013

Chair: Malay Ghosh
Cochair: Kshitij Khare
Major: Statistics

This dissertation consists of three research projects. The first project focuses on asymptotic expansions of posteriors for generalized linear models (GLM) with canonical link functions when the number of regressors grows to infinity at a certain rate relative to the growth of the sample size. As a side result, we have also proved asymptotic normality of the maximum likelihood estimator in a high-dimensional GLM setup. The second project considers posterior consistency in the context of high dimensional variable selection using the Bayesian lasso algorithm. In a frequentist setting, consistency is perhaps the most basic property that we expect any reasonable estimator to achieve. However, in a Bayesian setting, consistency is often ignored or taken for granted, especially in more complex hierarchical Bayesian models. We have derived sufficient conditions for posterior consistency in the Bayesian lasso model (with orthogonal design), where the number of parameters grows with the sample size. The last part of my thesis proposes a new variable selection technique using the Kullback-Leibler (KL) divergence loss and establishes related asymptotic properties.

PAGE 7

CHAPTER1INTRODUCTIONThisdoctoraldissertationconsistsofthreeresearchprojects.Therstprojectcentersontheasymptoticexpansionoftheposteriordensityinhighdimensionalgeneralizedlinearmodels(glms).ThesecondprojectconsidersposteriorconsistencyinthecontextofhighdimensionalvariableselectionusingtheBayesianLassoalgorithm.ThelastprojectproposesanewvariableselectiontechniqueusingtheKullback-Leibler(KL)divergencelossandestablishesrelatedasymptoticproperties.WhiledevelopingapriordistributionforanyBayesiananalysis,itisimportanttocheckwhetherthecorrespondingposteriordistributionbecomesdegenerateinthelimittothetrueparametervalueasthesamplesizeincreases.Inthesamevein,itisalsoimportanttounderstandamoredetailedasymptoticbehaviorofposteriordistributions.Thisisparticularlyrelevantinthedevelopmentofmanynonsubjectivepriors.Thesecondchapterofthisdissertationfocusesonasymptoticexpansionsofposteriorsforgeneralizedlinearmodels(GLMs)withcanonicallinkfunctionswhenthenumberofregressorsgrowstoinnityatacertainraterelativetothegrowthofthesamplesize.Asasideresult,wehavealsoprovedasymptoticnormalityofthemaximumlikelihoodestimator(MLE)inahigh-dimensionalGLMsetup.ThirdchapterofthisdissertationconsidersposteriorconsistencyinthecontextofhighdimensionalvariableselectionusingtheBayesianlassoalgorithm.Inafrequentistsetting,consistencyisperhapsthemostbasicpropertythatweexpectanyreasonableestimatortoachieve.However,inaBayesiansetting,consistencyisoftenignoredortakenforgranted,especiallyinmorecomplexhierarchicalBayesianmodels.Inthischapter,wehavederivedsucientconditionsforposteriorconsistencyintheBayesianlassomodel(withorthogonaldesign),wherethenumberofparametersgrowswiththesamplesize.Fourthchapterofthisdissertationdealswithvariableselectionproblem.Theadaptivelassoisarecenttechniqueforsimultaneousestimationandvariableselectionwhereadaptive 7

weights are used for penalizing different coefficients in the l1 penalty. In this chapter, we propose an alternative approach to the adaptive lasso through the Kullback-Leibler (KL) divergence loss, called the KL adaptive lasso, where we replace the squared error loss in the adaptive lasso setup by the KL divergence loss, which is also known as the entropy distance. There are various theoretical reasons to defend the use of Kullback-Leibler distance, ranging from information theory to the relevance of the logarithmic scoring rule and the location-scale invariance of the distance. We show that the KL adaptive lasso enjoys the oracle properties; namely, it performs as well as if the true underlying model were given in advance. We also discuss the extension of the KL adaptive lasso in generalized linear models and show that the oracle properties still hold under mild regularity conditions. We give concluding remarks and discuss some possible future works in the last chapter of this dissertation.
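To make the entropy-distance idea above concrete, the following is a minimal sketch (not the estimator developed in Chapter 4) of the Kullback-Leibler divergence between two Gaussian linear-model fits with a common, known variance; in this special case the KL divergence reduces to a scaled squared error between the fitted means, which is the connection the KL adaptive lasso exploits. The function name kl_gaussian_fits and all numerical values below are illustrative assumptions.

```python
import numpy as np

def kl_gaussian_fits(X, beta1, beta2, sigma2):
    """KL divergence KL( N(X beta1, sigma2 I) || N(X beta2, sigma2 I) ).

    For Gaussian linear models with a common variance, the KL divergence
    reduces to ||X (beta1 - beta2)||^2 / (2 sigma2)."""
    diff = X @ (beta1 - beta2)
    return float(diff @ diff) / (2.0 * sigma2)

# Toy check of the closed form on random data.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
b1, b2 = np.array([1.0, 0.0, -0.5]), np.array([0.8, 0.1, -0.5])
print(kl_gaussian_fits(X, b1, b2, sigma2=1.0))
```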

CHAPTER 2
ASYMPTOTIC EXPANSION OF THE POSTERIOR DENSITY IN HIGH DIMENSIONAL GENERALIZED LINEAR MODELS

2.1 Literature Survey and Proposed Work

Bayesian methodology is gaining increasing prominence in the theory and application of statistics. Its versatility has been enhanced by its implementability via many statistical numerical integration techniques, in particular, the Markov chain Monte Carlo method. Nevertheless, it is important not to overlook the asymptotic performance of any Bayesian procedure. Specifically, it is important to check whether a posterior distribution generated by a prior becomes degenerate in the limit to the true parameter value as the sample size grows to infinity. In the same vein, it is also important to understand a more detailed asymptotic behavior of the posterior, namely the asymptotic distribution of the parameter of interest, when properly centered and scaled.

Asymptotic normality of the posterior for a regular family of distributions (one whose support does not depend on the parameter) based on iid observations was first developed by Bernstein and von Mises (see Bernstein, 1934 [1]). Later, analogous to the frequentist Edgeworth expansion of the density or the distribution function, higher order asymptotic expansions of the posterior were developed to address various other important issues needed for Bayesian analysis, most prominently the development of non-subjective priors using a number of different criteria; see e.g. Ghosh (2011) [2], where other references are cited. To our knowledge, the first work dealing with a comprehensive asymptotic expansion of the posterior is due to Johnson (1967 [3], 1970 [4]). This was followed up later by Walker (1969) [5], Ghosh, Sinha and Joshi (1982) [6], and Crowder (1988) [7], just to name a few. However, much of this work was focused on posteriors generated from iid observations from a regular family of distributions and a smooth family of priors admitting derivatives up to a certain order.

Ghosal (1997 [8], 1999 [9], 2000 [10]) made significant and topical contributions to this area by establishing posterior consistency in a high dimensional context. Specifically, Ghosal

(1997) established asymptotic normality of the posterior for generalized linear models in a high dimensional setup. The number of regressors p_n is allowed to grow with the sample size. In particular, it is assumed that p_n⁴ log p_n / n → 0. Later, Ghosal (1999) established asymptotic normality of the posterior for linear regression models in a similar high dimensional setup as Ghosal (1997). In Ghosal (2000), asymptotic normality of the posterior was established for exponential families with the number of parameters growing with the sample size n.

In this paper, we focus on generalized linear models (GLMs) with canonical link function. Since a general link function is a one-to-one function of the canonical link function, we can get a similar asymptotic expansion for the vector of regression parameters in the general case as well. The main objective of this paper is to extend the asymptotic consistency result of Ghosal (1997) by providing a third order correct asymptotic expansion of the posterior density for GLM with canonical link function when the number of regressors grows to infinity at a certain rate relative to the growth of the sample size n. The results bear the potential for the development of a variety of objective priors in this framework. In particular, the first step towards the development of reference priors, probability matching priors, moment matching priors and others requires asymptotic expansions of posteriors (cf. Ghosh, 2011 [2]).

In order to prove our main asymptotic expansion result (Theorem 1), we first prove (weak) consistency of the maximum likelihood estimator (MLE) in the high dimensional GLM setup considered in this paper (Lemma 1). In Theorem 1, the asymptotic expansion of the posterior density of the regression parameter vector (appropriately normalized) is centred around a multivariate normal density with mean vector zero. In Ghosal (1997), asymptotic consistency was established with respect to a multivariate normal density with a non-zero mean vector. As an interesting side-consequence, we use Theorem 1 and Ghosal (1997)'s result to establish asymptotic normality of the MLE in the high dimensional GLM setup considered in this paper (Theorem 2). To the best of our knowledge, consistency and asymptotic normality of the MLE in GLM under a high dimensional setting has not been considered in previous literature. See Section 3 for a comparison with related works.

The paper is organized as follows. In Section 2, we introduce the model and provide the required assumptions. In Section 3, we prove the main asymptotic expansion result. In Section 4, we use Theorem 1 and Ghosal (1997)'s result to establish the asymptotic normality of the MLE. The appendix contains proofs which establish that the assumptions (in Section 2) on the prior density are satisfied by the multivariate normal and multivariate t densities.

2.2 Setup and Assumptions

Let X_1, ..., X_n be independent random variables. Let f_i(·) denote the density of X_i with respect to a σ-finite measure. Suppose

f_i(X_i) = exp[ X_i θ_i − ψ(θ_i) ],   i = 1, ..., n,   (2-1)

where θ_i = z_i^T β, β = (β_1, ..., β_{p_n})^T is the vector of parameters and z_i = (z_{i1}, ..., z_{ip_n})^T is the vector of covariates for i = 1, ..., n. Note that we are allowing the dimension p_n of the parameter to grow with the sample size n. Also, the cumulant generating function ψ is infinitely differentiable and is assumed to be strictly convex.

Let π(β) denote the prior density of β. Then the posterior distribution of β given the observations X_1, ..., X_n is defined by

π(β | X) = exp[ l_n(β) ] π(β) / ∫ exp[ l_n(β) ] π(β) dβ,   (2-2)

where l_n(β) = Σ_{i=1}^n ( X_i z_i^T β − ψ(z_i^T β) ) is the log-likelihood function. Note that the covariate vectors z_1, ..., z_n, the true parameter value β_0, the prior π(·), and the posterior π(·|X) all change with n. However, we suppress this dependence in our notation for simplicity of exposition.
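As an illustration of the setup (2-1)-(2-2) (the regularity conditions follow below), the sketch here evaluates the canonical-link log-likelihood l_n(β) and an unnormalized log posterior for the Poisson case ψ(θ) = e^θ under a normal prior. This is only a hedged illustration of the formulas; the function names, the prior choice, and the covariate scaling are assumptions, not part of the dissertation.

```python
import numpy as np

def loglik_poisson_glm(beta, X, y):
    """l_n(beta) = sum_i [ y_i z_i^T beta - psi(z_i^T beta) ] with psi = exp,
    i.e., the canonical Poisson GLM log-likelihood of (2-1)."""
    theta = X @ beta
    return float(y @ theta - np.exp(theta).sum())

def log_posterior_kernel(beta, X, y, prior_mean, prior_prec):
    """Unnormalized log posterior log[ exp(l_n(beta)) pi(beta) ] as in (2-2),
    under an assumed N(prior_mean, prior_prec^{-1}) prior."""
    d = beta - prior_mean
    return loglik_poisson_glm(beta, X, y) - 0.5 * float(d @ prior_prec @ d)

rng = np.random.default_rng(1)
n, p = 200, 4
X = rng.standard_normal((n, p)) / np.sqrt(p)   # keeps z_i^T beta bounded
beta0 = np.array([0.5, -0.3, 0.2, 0.0])
y = rng.poisson(np.exp(X @ beta0))
print(log_posterior_kernel(beta0, X, y, np.zeros(p), np.eye(p)))
```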

We now state the regularity conditions needed for our result.

(A-0) The matrix A_n defined by the relation A_n = Σ_{i=1}^n z_i z_i^T is positive definite and the eigenvalues of (1/n)A_n are uniformly bounded, i.e., there exist constants C_1 and C_2 (independent of n) such that the matrix (1/n)A_n satisfies

0 < C_1 ≤ eigenvalues of (1/n)A_n ≤ C_2.

Also, ||z_i|| ≤ M √p_n for every 1 ≤ i ≤ n, for a constant M (independent of n).

(A-1) The true parameter value β_0 satisfies max_{1≤i≤n} |z_i^T β_0| ≤ K for a constant K (independent of n). This is equivalent to the statement that the parameter space is restricted to Θ_n, where

Θ_n = { β : max_{1≤i≤n} |z_i^T β| ≤ K_0 }

for some K_0 > K (K_0 does not depend on n).

(A-2) The (sequence of) prior densities π(·) satisfies π(β_0) ≥ ε_0 for some ε_0 > 0 (ε_0 does not depend on n). Also,

sup_{||β−β_0|| ≤ C_n} ||∇ log π(β)|| ≤ M_1 p_n,   (2-7)

and

sup_{||β−β_0|| ≤ C_n} max_{1≤j,j'≤p_n} | (1/π(β)) ∂²π(β)/∂β_j ∂β_{j'} | ≤ M_2 p_n⁴,   where C_n = O( (p_n/n)^{1/4} ).   (2-8)

This assumption is satisfied by appropriate multivariate t and multivariate normal densities (see appendix).

(A-3) The dimension p_n can grow to infinity such that p_n^{6+δ}/n → 0 as n → ∞ for some δ > 0. This is stronger than the corresponding assumption in Ghosal (1997), which only requires p_n⁴ log p_n / n → 0. However, the goal in Ghosal (1997) is to establish asymptotic normality of the posterior. For this purpose, a second order correct Taylor series expansion of the posterior is needed. Our goal is to get a third order correct asymptotic expansion of the posterior. Hence it is not surprising that we need a slower rate of increase for p_n.

2.3 Main Result

In this section, we derive our main result: a third order correct asymptotic expansion of the posterior π(β|X) around an appropriate normal density. We start with a lemma which will be helpful in proving the main result. Let β̂_n be the maximum likelihood estimator of β. It follows by the convexity of ψ and assumption (A-0) that the Hessian matrix of l_n(β) is a negative definite matrix for all β. Hence l_n(β) is a strictly concave function and has a unique maximum. The following lemma establishes weak consistency of the maximum likelihood estimator β̂_n.

We briefly mention some related works on high dimensional consistency and asymptotic normality of the MLE, and the differences between our setup and the setup in those papers. Portnoy (1984 [11], 1985 [12]) established consistency and asymptotic normality of M-estimators in the context of linear regression, as the number of regression parameters p_n grows with the sample size n (satisfying the condition (p_n log p_n)^{3/2}/n → 0).¹

¹ See Portnoy (1984, 1985) for references to earlier works in this area.

Portnoy (1988) [13] established consistency and asymptotic normality of the MLE for iid observations from exponential families, as the number of parameters p_n grows with the sample size n (satisfying the condition p_n^{3/2}/n → 0). This is a different setting than the regression based setting (with covariates) considered in this paper. Fan and Peng (2004) [14] established high dimensional consistency and asymptotic normality of penalized likelihood estimators (the MLE can be thought of as a special case). However, they considered the iid setting, which is different than the setting in this paper. Zhang et al. (2011) [15] considered penalized pseudo-likelihood estimators for high dimensional GLMs. However, their Bregman divergence based loss functions do not include the negative log-likelihood loss function. More specifically, in the context of GLM with canonical link, Zhang et al. (2011)'s loss function looks like

Σ_{i=1}^n [ −q(X_i) + q(ψ'(z_i^T β)) + (X_i − ψ'(z_i^T β)) q'(ψ'(z_i^T β)) ],   (2-9)

where q(·) is a concave function. The log-likelihood function l_n(β) cannot be written in this form. A proof of asymptotic normality of β̂_n in the special case of logistic regression is provided in Liang and Du (2012) [16]. We now prove the (weak) consistency of β̂_n by adapting the approach of Fan and Peng (2004) (in the iid setting) to the GLM setting.

Lemma 1. If p_n⁶/n → 0 as n → ∞, then β̂_n satisfies ||β̂_n − β_0|| = O_p(√(p_n/n)).

Proof. Let η_n = √(p_n/n). We will show that for any given ε > 0, there exists a constant C such that

P( sup_{||u||=C} l_n(β_0 + η_n u) < l_n(β_0) ) ≥ 1 − ε.   (2-10)

This implies that, with probability at least 1 − ε, l_n has a local maximum inside the ball { β_0 + η_n u : ||u|| ≤ C }; since l_n is strictly concave, this maximum is β̂_n, so that ||β̂_n − β_0|| = O_p(√(p_n/n)). Note that

l_n(β_0 + η_n u) − l_n(β_0)
= η_n Σ_{i=1}^n X_i z_i^T u − [ Σ_{i=1}^n ψ(z_i^T(β_0 + η_n u)) − ψ(z_i^T β_0) ]
= η_n Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) ) z_i^T u − (η_n²/2) Σ_{i=1}^n ψ''(z_i^T β_0)(z_i^T u)² − (η_n³/6) Σ_{i=1}^n ψ'''(θ̃_i)(z_i^T u)³
= I_1 + I_2 + I_3, say,

where θ̃_i lies between z_i^T β_0 and z_i^T(β_0 + η_n u), for every 1 ≤ i ≤ n. Notice that by (A-1), z_i^T β_0 is uniformly bounded (in i and n) and ψ'' is a continuous function. Hence ψ''(z_i^T β_0) is also uniformly bounded (in i and n) by, say, K_1. It follows that

E[ Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) ) z_i^T u ]² = Σ_{i=1}^n E[ ( X_i − ψ'(z_i^T β_0) )² (z_i^T u)² ]   (since the X_i's are independent and E[X_i] = ψ'(z_i^T β_0))
= Σ_{i=1}^n (z_i^T u)² ψ''(z_i^T β_0)   (since E( X_i − ψ'(z_i^T β_0) )² = ψ''(z_i^T β_0))
≤ K_1 n u^T ( Σ_{i=1}^n z_i z_i^T / n ) u ≤ n K_1 C_2 ||u||².

The last step follows by (A-0). Hence E[ Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) ) z_i^T u ]² = O(n) ||u||². Thus,

I_1 = O_p(η_n √n) = O_p(√p_n).   (2-11)

Note that ψ is a strictly convex function and hence ψ'' > 0. Since ψ'' is continuous, it follows that its infimum on a bounded interval is strictly positive. By (A-1), z_i^T β_0 is uniformly bounded. This implies ψ''(z_i^T β_0) is uniformly bounded below by a positive constant, say K_2. Hence

I_2 = −(η_n²/2) Σ_{i=1}^n ψ''(z_i^T β_0)(z_i^T u)²

≤ −(K_2 η_n²/2) Σ_{i=1}^n (z_i^T u)² = −(K_2 η_n²/2) n u^T ( Σ_{i=1}^n z_i z_i^T / n ) u < 0,

by (A-0). Also, by (A-0) and the arguments above,

|I_2| ≥ (K_2 η_n²/2) n u^T ( Σ_{i=1}^n z_i z_i^T / n ) u ≥ (K_2 η_n²/2) n C_1 ||u||² = (C_1 K_2 / 2) p_n ||u||².   (2-12)

Now, since θ̃_i lies between z_i^T β_0 and z_i^T(β_0 + η_n u), it follows by (A-0) and (A-1) that |θ̃_i| is uniformly bounded, so that, by continuity, ψ'''(θ̃_i) is uniformly bounded (in i and n) by, say, K_3. Hence

|I_3| ≤ (η_n³/6) K_3 Σ_{i=1}^n |z_i^T u|³ ≤ (η_n³/6) K_3 n ( M √p_n ||u|| )³ = (K_3 M³/6)(p_n³/√n) ||u||³.   (2-13)
The last step follows by (A-0). Since p_n⁶/n → 0 as n → ∞, it follows by (2-11), (2-12) and (2-13) that the order of I_2 dominates the orders of I_1 and I_3. Since I_2 is negative, the assertion in (2-10) holds. ∎

Remark: Note that by Lemma 1 and (A-0), |z_i^T(β̂_n − β_0)| ≤ ||z_i|| ||β̂_n − β_0|| = O_p(p_n/√n). By (A-1), it follows that

β̂_n ∈ { β : max_{1≤i≤n} |z_i^T β| < K_0 }   with probability tending to 1.   (2-16)
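Lemma 1's √(p_n/n) rate can be probed numerically. The sketch below fits the Poisson GLM by Newton-Raphson (a standard computational choice, not one prescribed by the text) and reports ||β̂_n − β_0|| against √(p_n/n) as n grows with p_n ≈ n^{1/4}; all tuning values are illustrative assumptions.

```python
import numpy as np

def poisson_mle_newton(X, y, n_iter=50):
    """MLE for the canonical Poisson GLM via Newton-Raphson.
    Gradient: X^T (y - exp(X beta)); Hessian: -X^T diag(exp(X beta)) X."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        beta = beta + np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
    return beta

rng = np.random.default_rng(2)
for n in [400, 1600, 6400]:
    p = max(2, int(round(n ** 0.25)))          # p_n grows slowly with n
    X = rng.standard_normal((n, p)) / np.sqrt(p)
    beta0 = 0.5 * np.ones(p) / np.sqrt(p)
    y = rng.poisson(np.exp(X @ beta0))
    err = np.linalg.norm(poisson_mle_newton(X, y) - beta0)
    print(f"n={n:5d} p={p}  ||mle - beta0||={err:.4f}  sqrt(p/n)={np.sqrt(p/n):.4f}")
```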
In the sequel we work with the transformed parameter g = √n(β − β̂_n), and write G_n := { g : β̂_n + g/√n ∈ Θ_n }. The next lemma records the posterior normality result of Ghosal (1997) in our setting.

Lemma 2. Under assumptions (A-0)-(A-3),

∫ | π_n(g | X) − N_{p_n}(g | μ_n, Σ_n) | dg →^P 0,

where Σ_n = ( (1/n) Σ_{i=1}^n ψ''(z_i^T β_0) z_i z_i^T )^{-1} and μ_n = Σ_n (1/√n) Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) ) z_i − √n(β̂_n − β_0).

Proof. We verify that the assumptions in Ghosal (1997) follow from (A-0) to (A-3). Note that Ghosal [equations (2.6) and (2.7)] follow immediately from our assumptions (A-1) and (A-2). Let δ_n = ||( Σ_{i=1}^n z_i z_i^T )^{-1/2}||. By (A-0), it follows that δ_n = O(n^{-1/2}). Note that by (A-2), if ||β − β_0|| ≤ C_4 √(p_n/n), then by the mean value theorem,

| log π(β) − log π(β_0) | ≤ sup_{||β−β_0|| ≤ C_4 √(p_n/n)} ||∇ log π(β)|| ||β − β_0|| ≤ M_1 p_n ||β − β_0||.

Note that p_n (log p_n)^{1/2} δ_n = O(p_n^{1+δ/3}/√n) → 0. Hence, Ghosal [equation (2.8)] is satisfied with K_n = M_1 p_n. Note that K_n δ_n p_n (log p_n)^{1/2} = O(p_n^{2+δ/3}/√n) → 0. Here, we define the norm of a matrix A as ||A|| = sup{ ||Ax||/||x|| : x ≠ 0 }. Let ρ_n = max_{1≤i≤n} ||A_n^{-1/2} z_i||. Then ρ_n ≤ ||A_n^{-1/2}|| max_{1≤i≤n} ||z_i|| = O(√(p_n/n)). This means p_n^{3/2} (log p_n)^{1/2} ρ_n = O( p_n^{3/2+δ/3} √(p_n/n) ) = O(p_n^{2+δ/3}/√n) → 0. Hence, Ghosal [equation (2.10)] is satisfied. Now, since (1/n) Σ_{i=1}^n z_i z_i^T has uniformly bounded eigenvalues (by (A-0)),

tr( (1/n) Σ_{i=1}^n z_i z_i^T ) = O(p_n) ⟺ tr( Σ_{i=1}^n z_i z_i^T ) = O(n p_n) ⟺ Σ_{i=1}^n tr(z_i z_i^T) = O(n p_n)

⟺ Σ_{i=1}^n tr(z_i^T z_i) = O(n p_n) ⟺ Σ_{i=1}^n Σ_{j=1}^{p_n} z_{ij}² = O(n p_n).

Thus, Ghosal [equation (2.11)] is also satisfied. Hence, all the assumptions in Ghosal (1997) hold. The lemma now follows from Theorem 2.1 of Ghosal (1997). ∎

Define the function

Z_{n,β̂_n}(g) := exp[ l_n(β̂_n + g/√n) − l_n(β̂_n) ].   (2-17)

Note that

π(g|X) = Z_{n,β̂_n}(g) π(β̂_n + g/√n) / ∫_{G_n} Z_{n,β̂_n}(g) π(β̂_n + g/√n) dg,   for g ∈ G_n.

We now prove a series of lemmas which help us to prove our main result (Theorem 1). Henceforth, we assume that p_n → ∞. If p_n is bounded, a simple modification of the arguments below can be used to establish the asymptotic expansion result. See the remark following the proof of Theorem 1.

Lemma 3. Let C_n := { g : g ∈ G_n, ||g|| ≤ p_n^{1/2+δ_0} } and K_n = π(β̂_n) (2π)^{p_n/2} | −∇²l_n(β̂_n)/n |^{-1/2}, where δ_0 = δ/6. Then

∫_{C_n} (1/K_n) Z_{n,β̂_n}(g) π(β̂_n + g/√n) dg →^P 1.   (2-18)

Proof. Note that Z_n(g) = exp[ l_n(β̂_n + g/√n) − l_n(β̂_n) ]. By a third order correct Taylor series expansion of l_n around β̂_n, we get that

Z_n(g) = exp( g^T ∇²l_n(β̂_n) g / (2n) − (1/(6n^{3/2})) Σ_{i=1}^n ψ'''(z_i^T ξ_n) ( Σ_{r=1}^{p_n} z_{ir} g_r )³ ),   (2-19)

where ξ_n = ξ_n(g) is a point between β̂_n and β̂_n + g/√n. Note that by Lemma 1, β̂_n ∈ Θ_n with probability tending to 1. Also, if g ∈ C_n,

| Σ_{r=1}^{p_n} z_{ir} g_r | ≤ ||z_i|| ||g|| ≤ M √p_n · p_n^{1/2+δ_0} = M p_n^{1+δ_0}.   (2-20)

Note that

P( ξ_n(g) ∈ Θ_n ∀ g ∈ C_n ) ≥ P( max_{1≤i≤n} |z_i^T β̂_n| < K_0 − M p_n^{1+δ_0}/√n ) → 1,   (2-21)

since, for g ∈ C_n, |z_i^T g|/√n ≤ M p_n^{1+δ_0}/√n → 0 by (A-3), and by (2-16). It follows by (2-19), (2-20), (2-21) and (A-3) that, uniformly in g ∈ C_n, the cubic term in the exponent of (2-19) is of order O_p(p_n^{3+δ/2}/√n) = o_p(1), so that

Z_n(g) = exp( g^T ∇²l_n(β̂_n) g / (2n) ) exp(o_p(1)),   uniformly in g ∈ C_n.   (2-22)

Also,

π(β̂_n + g/√n) / π(β̂_n) = exp( log π(β̂_n + g/√n) − log π(β̂_n) )   (2-23)
= exp( (∇ log π(ζ_n))^T g / √n ), for some ζ_n between β̂_n and β̂_n + g/√n. Note that

sup_{g∈C_n} | (∇ log π(ζ_n))^T g / √n | ≤ sup_{g∈C_n} ||∇ log π(ζ_n)|| ||g|| / √n = O_p( p_n^{3/2+δ_0} / √n ) = o_p(1).   (2-24)

It follows by (2-22), (2-24) and the definition of K_n that

∫_{C_n} (1/K_n) Z_{n,β̂_n}(g) π(β̂_n + g/√n) dg = exp(o_p(1)) ∫_{C_n} N_{p_n}(g | 0, Σ̂_n) dg,   (2-25)

where Σ̂_n = ( −∇²l_n(β̂_n)/n )^{-1} = ( (1/n) Σ_{i=1}^n ψ''(z_i^T β̂_n) z_i z_i^T )^{-1}. If U_n ~ N_{p_n}(0, Σ̂_n), then

sup_{1≤i≤n} |z_i^T( β̂_n + U_n/√n )| > K_0 ⟹ sup_{1≤i≤n} |z_i^T U_n| > √n ( K_0 − sup_{1≤i≤n} |z_i^T β̂_n| ) ⟹ ||U_n|| > √( n/(p_n M²) ) ( K_0 − sup_{1≤i≤n} |z_i^T β̂_n| )   (by (A-0)).   (2-26)

Let us assume that ||β̂_n − β_0|| ≤ n^{-1/3}. By Lemma 1, the probability of this event converges to 1. Using this assumption and (A-1), K_0 − sup_{1≤i≤n} |z_i^T β̂_n| is uniformly bounded away from 0. Note that

E_{N_{p_n}(0,Σ̂_n)} ||U_n||² = trace(Σ̂_n) = O_p(p_n).   (2-27)

Note that C_n^c = G_n^c ∪ { g : ||g|| > p_n^{1/2+δ_0} }. A simple application of Markov's inequality, along with (2-26) and (2-27), yields that

∫_{C_n^c} N_{p_n}(g | 0, Σ̂_n) dg ≤ ∫_{G_n^c} N_{p_n}(g | 0, Σ̂_n) dg + ∫_{ {g : ||g|| ≥ p_n^{1/2+δ_0}} } N_{p_n}(g | 0, Σ̂_n) dg

≤ E||U_n||² p_n M² / ( n ( K_0 − sup_{1≤i≤n} |z_i^T β̂_n| )² ) + E||U_n||² / p_n^{1+2δ_0} = O_p(p_n²/n) + O_p(p_n^{−2δ_0}) = o_p(1).   (2-28)

It follows by (2-25) and (2-28) that ∫_{C_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg →^p 1 as n → ∞. ∎

Lemma 4.

∫_{C_n^c} π(g|X) dg →^p 0.   (2-29)

Proof. Let U_n ~ N_{p_n}(μ_n, Σ_n), where μ_n and Σ_n are as defined in the statement of Lemma 2. Then

P( ||U_n|| > p_n^{1/2+δ_0} ) ≤ P( ||U_n − μ_n|| > p_n^{1/2+δ_0} − ||μ_n|| ) ≤ P( ||Σ_n^{-1/2}(U_n − μ_n)|| > ( p_n^{1/2+δ_0} − ||μ_n|| ) / ||Σ_n^{1/2}|| ).   (2-30)

Since ||Σ_n^{-1/2}(U_n − μ_n)||² has a χ²_{p_n} distribution, ||Σ_n^{-1/2}(U_n − μ_n)|| = O_p(√p_n). Also, by (A-0) and (A-1), we get that ||Σ_n^{1/2}|| = O_p(1). Note that

||μ_n|| ≤ || (1/√n) Σ_n Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) ) z_i || + O_P(√p_n).   (2-31)

Since the eigenvalues of Σ_n are O(1), it follows that

E|| (1/√n) Σ_n Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) ) z_i ||² = O( E|| (1/√n) Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) ) z_i ||² ) = O( E[ (1/n) Σ_{i=1}^n ( X_i − ψ'(z_i^T β_0) )² z_i^T z_i ] )   (since the X_i's are independent)

= O( (1/n) Σ_{i=1}^n E( X_i − ψ'(z_i^T β_0) )² z_i^T z_i ) = O( (1/n) Σ_{i=1}^n ψ''(z_i^T β_0) z_i^T z_i ) ≤ O( (1/n) max_{1≤i≤n} ψ''(z_i^T β_0) Σ_{i=1}^n z_i^T z_i ) = O( (1/n) Σ_{i=1}^n z_i^T z_i )   (by (A-1) and continuity of ψ'')
= (1/n) O(n p_n) = O(p_n).

It follows by (2-31) that

||μ_n|| = O_p(√p_n).   (2-32)

It follows by (2-30) and (2-32) that P( ||U_n|| > p_n^{1/2+δ_0} ) →^p 0 as n → ∞. The result now follows by using Lemma 2. ∎

Lemma 5.

∫_{G_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg →^p 1.

Proof. Note that by Lemma 4,

∫_{ {g : ||g|| > p_n^{1/2+δ_0}} } π(g|X) dg = [ ∫_{G_n∖C_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg ] / [ ∫_{G_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg ] → 0.

Hence,

[ ∫_{G_n∖C_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg ] / [ ∫_{C_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg + ∫_{G_n∖C_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg ] → 0.   (2-33)

Now, by Lemma 3,

∫_{C_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg →^p 1.

The result follows by (2-33). ∎

We now state and prove the main result of the paper.

Theorem 1. Suppose β ∈ R^{p_n} satisfies √n ||β − β̂_n|| ≤ p_n^{1/2+δ/6} for every n. This is equivalent to the assumption that g ∈ C_n. In such a case, under conditions (A-0)-(A-3) described above,

π(g|X) = N_{p_n}(g | 0, Σ̂_n) [ 1 − (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} Σ_{i=1}^n ψ'''(z_i^T β̂_n) g_r g_s g_t z_{ir} z_{is} z_{it} + (1/√n) Σ_{v=1}^{p_n} g_v (∇ log π(β̂_n))_v − { (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} Σ_{i=1}^n ψ'''(z_i^T β̂_n) g_r g_s g_t z_{ir} z_{is} z_{it} } { (1/√n) Σ_{v=1}^{p_n} g_v (∇ log π(β̂_n))_v } + R(g) ] [ 1 − o_p(1) ],   (2-34)

where sup_{g∈C_n} R(g) = O_p( p_n^{6+δ/3}/n ) and N_{p_n}(g | 0, Σ̂_n) is a p_n-dimensional normal density with mean vector 0 and covariance matrix Σ̂_n = ( −∇²l_n(β̂_n)/n )^{-1}, evaluated at g.

Remark: Note that by Lemma 4, the posterior probability that g does not lie in C_n converges to 0.

Proof. Since ∇l_n(β̂_n) = 0, by a fourth order Taylor series expansion around β̂_n, we have:

l_n(β̂_n + g/√n) − l_n(β̂_n) = (1/(2n)) g^T ∇²l_n(β̂_n) g + (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} g_r g_s g_t [ ∂³l_n(β)/∂β_r ∂β_s ∂β_t ]_{β=β̂_n} + (1/(24n²)) Σ_{r,s,t,u=1}^{p_n} g_r g_s g_t g_u [ ∂⁴l_n(β)/∂β_r ∂β_s ∂β_t ∂β_u ]_{β=ξ_n}
= A_1(g) + A_2(g) + A_3(g)   (say).   (2-35)

Here ξ_n = ξ_n(g) is an intermediate point on the line joining β̂_n and β̂_n + g/√n. Based on exactly the same argument leading up to (2-21) (in the proof of Lemma 3),

P( ξ_n(g) ∈ Θ_n ∀ g ∈ C_n ) → 1,   (2-36)

as n → ∞. Also,

π(β̂_n + g/√n) = π(β̂_n) + (1/√n) g^T ∇π(β̂_n) + (1/(2n)) g^T ∇²π(ν_n) g
= π(β̂_n) ( 1 + (1/√n) Σ_{v=1}^{p_n} g_v (∇ log π(β̂_n))_v + (1/(2n)) g^T ∇²π(ν_n) g / π(β̂_n) )
= π(β̂_n) ( 1 + B_1(g) + B_2(g) )   (say),   (2-37)

where ν_n = ν_n(g) is an intermediate point on the line joining β̂_n and β̂_n + g/√n. Based on exactly the same argument leading up to (2-21) (in the proof of Lemma 3),

P( ν_n(g) ∈ Θ_n ∀ g ∈ C_n ) → 1,   (2-38)

as n → ∞. We now analyze different terms in (2-35) and (2-37). By the continuity of ψ''' and the fact that β̂_n ∈ Θ_n with probability tending to 1, it follows that

max_{1≤i≤n} | ψ'''(z_i^T β̂_n) | = O_p(1).   (2-39)

Hence,

|A_2(g)| = | (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} g_r g_s g_t [ ∂³l_n(β)/∂β_r ∂β_s ∂β_t ]_{β=β̂_n} | = | (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} Σ_{i=1}^n ψ'''(z_i^T β̂_n) z_{ir} z_{is} z_{it} g_r g_s g_t |
≤ (1/(6n^{3/2})) max_{1≤i≤n} |ψ'''(z_i^T β̂_n)| Σ_{i=1}^n ( Σ_{r=1}^{p_n} |g_r||z_{ir}| )( Σ_{s=1}^{p_n} |g_s||z_{is}| )( Σ_{t=1}^{p_n} |g_t||z_{it}| )

= (1/(6n^{3/2})) max_{1≤i≤n} |ψ'''(z_i^T β̂_n)| Σ_{i=1}^n ( ||g|| ||z_i|| )³ ≤ max_{1≤i≤n} |ψ'''(z_i^T β̂_n)| ( p_n^{1/2+δ/6} )³ n ( M √p_n )³ / (6n^{3/2}).   (2-40)

The last inequality follows by using (A-0). It follows by (2-40) that

sup_{g∈C_n} |A_2(g)| = O_p( p_n^{3+δ/2}/√n ).   (2-41)

By the continuity of ψ'''' and (2-36), it follows that

sup_{g∈C_n} max_{1≤i≤n} | ψ''''(z_i^T ξ_n) | = O_p(1).   (2-42)

Hence,

|A_3(g)| = | (1/(24n²)) Σ_{r,s,t,u=1}^{p_n} g_r g_s g_t g_u [ ∂⁴l_n(β)/∂β_r ∂β_s ∂β_t ∂β_u ]_{β=ξ_n} | = | (1/(24n²)) Σ_{r,s,t,u=1}^{p_n} Σ_{i=1}^n ψ''''(z_i^T ξ_n) z_{ir} z_{is} z_{it} z_{iu} g_r g_s g_t g_u |
≤ (1/(24n²)) max_{1≤i≤n} |ψ''''(z_i^T ξ_n)| Σ_{i=1}^n ( Σ_{r=1}^{p_n} |g_r||z_{ir}| )( Σ_{s=1}^{p_n} |g_s||z_{is}| )( Σ_{t=1}^{p_n} |g_t||z_{it}| )( Σ_{u=1}^{p_n} |g_u||z_{iu}| )
= (1/(24n²)) max_{1≤i≤n} |ψ''''(z_i^T ξ_n)| Σ_{i=1}^n ( ||g|| ||z_i|| )⁴ = (1/(24n²)) max_{1≤i≤n} |ψ''''(z_i^T ξ_n)| ||g||⁴ Σ_{i=1}^n ||z_i||⁴

≤ max_{1≤i≤n} |ψ''''(z_i^T ξ_n)| ( C p_n^{1/2+δ/6} )⁴ n ( M √p_n )⁴ / (24n²).   (2-43)

The last inequality follows by using (A-0). It follows by (2-43) that

sup_{g∈C_n} |A_3(g)| = O_p( p_n^{4+2δ/3}/n ).   (2-44)

Since ||g|| ≤ C p_n^{1/2+δ/6}, it follows that

Σ_{j=1}^{p_n} |g_j| ≤ √p_n ||g|| ≤ C p_n^{1+δ/6}.   (2-45)

Next we analyze the second order remainder term in (2-37). Note that ||β̂_n − β_0|| = O_p(√(p_n/n)) by Lemma 1 and sup_{g∈C_n} ||ν_n(g) − β̂_n|| = O_p( p_n^{1/2+δ_0}/√n ), as ν_n(g) is an intermediate point on the line joining β̂_n and β̂_n + g/√n. Hence

sup_{g∈C_n} ||ν_n(g) − β_0|| = O_p( p_n^{1/2+δ_0}/√n ).   (2-46)

By (A-3), we get that p_n^{1/2+δ_0}/√n = o( (p_n/n)^{1/4} ). By (A-2) it follows that

sup_{g∈C_n} max_{1≤r,s≤p_n} | (1/π(β)) ∂²π(β)/∂β_r ∂β_s |_{β=ν_n} = O_p(p_n⁴).   (2-47)

Note that π(ν_n)/π(β̂_n) = exp( log π(ν_n) − log π(β̂_n) ) = exp{ (∇ log π(ζ_n))^T (ν_n − β̂_n) }, where ζ_n is an intermediate point on the line joining ν_n and β̂_n. Hence,

sup_{g∈C_n} ||ζ_n − β̂_n|| ≤ sup_{g∈C_n} ||ν_n − β̂_n|| = O_p( p_n^{1/2+δ_0}/√n ).

By Lemma 1 and (A-2) it follows that sup_{g∈C_n} ||∇ log π(ζ_n)|| = O_p(p_n). Hence,

sup_{g∈C_n} π(ν_n)/π(β̂_n) ≤ sup_{g∈C_n} exp( ||∇ log π(ζ_n)|| ||ν_n − β̂_n|| ) = exp( O_p(p_n) O_p( p_n^{1/2+δ_0}/√n ) ) = O_p(1).   (2-48)

It follows that

|B_2(g)| = | (1/(2n)) g^T ∇²π(ν_n) g / π(β̂_n) | = ( π(ν_n)/π(β̂_n) ) | (1/(2n)) Σ_{r,s=1}^{p_n} [ (1/π(β)) ∂²π(β)/∂β_r ∂β_s ]_{β=ν_n} g_r g_s |
≤ ( π(ν_n)/π(β̂_n) ) { (1/(2n)) Σ_{r,s=1}^{p_n} | (1/π(β)) ∂²π(β)/∂β_r ∂β_s |_{β=ν_n} |g_r||g_s| }
≤ ( π(ν_n)/π(β̂_n) ) { (1/(2n)) max_{1≤r,s≤p_n} | (1/π(β)) ∂²π(β)/∂β_r ∂β_s |_{β=ν_n} ( Σ_{r=1}^{p_n} |g_r| )² }.   (2-49)

It follows by (2-45), (2-47), (2-48) and (2-49) that

sup_{g∈C_n} |B_2(g)| = O_p( p_n^{6+δ/3}/n ).   (2-50)

Note that by (2-15), (2-35) and (2-36), π(g|X) = N/D, where

N = π(β̂_n) ( 1 + B_1(g) + B_2(g) ) exp( A_1(g) + A_2(g) + A_3(g) ) / [ π(β̂_n) (2π)^{p_n/2} | −∇²l_n(β̂_n)/n |^{-1/2} ]
= N_{p_n}(g | 0, Σ̂_n) ( 1 + B_1(g) + B_2(g) ) exp( A_2(g) + A_3(g) )
= N_{p_n}(g | 0, Σ̂_n) { (1 + B_1(g))(1 + A_2(g)) + B_2(g)(1 + A_2(g) + A_3(g)) + (1 + B_1(g)) A_3(g) } + N_{p_n}(g | 0, Σ̂_n) { (1 + B_1(g) + B_2(g)) ( exp(A_2(g) + A_3(g)) − (1 + A_2(g) + A_3(g)) ) }
= N_{p_n}(g | 0, Σ̂_n) ( N_1(g) + N_2(g) + N_3(g) + N_4(g) ), (say),   (2-51)

and

D = ∫ N(g) dg.   (2-52)

Now, from (2-41), (2-44) and (2-50), it follows that

sup_{g∈C_n} N_2(g) = sup_{g∈C_n} [ B_2(g)(1 + A_2(g) + A_3(g)) ] = O_p( p_n^{6+δ/3}/n ).   (2-53)

In view of Lemma 1 and (A-2),

sup_{g∈C_n} (1 + B_1(g)) = 1 + sup_{g∈C_n} (1/√n) Σ_{v=1}^{p_n} g_v (∇ log π(β̂_n))_v ≤ 1 + sup_{g∈C_n} (1/√n) ||g|| ||∇ log π(β̂_n)|| = 1 + O_p( p_n^{3/2+δ_0}/√n ) = 1 + o_p(1).   (2-54)

Hence,

sup_{g∈C_n} N_3(g) = sup_{g∈C_n} [ (1 + B_1(g)) A_3(g) ] = O_p( p_n^{4+2δ/3}/n ).   (2-55)

By (2-41) and (2-44),

sup_{g∈C_n} |A_2(g) + A_3(g)| ≤ sup_{g∈C_n} |A_2(g)| + sup_{g∈C_n} |A_3(g)| = O_p( p_n^{3+δ/2}/√n ).

It follows by (A-3) that for large enough n,

sup_{g∈C_n} ( exp(A_2(g) + A_3(g)) − (1 + A_2(g) + A_3(g)) ) ≤ sup_{g∈C_n} ( A_2(g) + A_3(g) )² = O_p( p_n^{6+δ}/n ).   (2-56)

It follows from (2-50), (2-54) and (2-56) that

sup_{g∈C_n} N_4(g) = sup_{g∈C_n} [ (1 + B_1(g) + B_2(g)) ( exp(A_2(g) + A_3(g)) − (1 + A_2(g) + A_3(g)) ) ] = O_p( p_n^{6+δ/3}/n ).   (2-57)

Let R(g) := N_2(g) + N_3(g) + N_4(g). It follows from (2-53), (2-55) and (2-57) that

sup_{g∈C_n} ( N_2(g) + N_3(g) + N_4(g) ) = O_p( p_n^{6+δ/3}/n ).   (2-58)

By Lemma 5, D = ∫_{G_n} (1/K_n) Z_n(g) π(β̂_n + g/√n) dg = 1 + o_p(1). Thus,

π_n(g|X) = N/D = N_{p_n}(g | 0, Σ̂_n) [ ( 1 + (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} g_r g_s g_t [ ∂³l_n(β)/∂β_r ∂β_s ∂β_t ]_{β=β̂_n} )( 1 + (1/√n) Σ_{v=1}^{p_n} g_v (∇ log π(β̂_n))_v ) + R(g) ] [ 1 − o_p(1) ].

Expanding the product and using [ ∂³l_n(β)/∂β_r ∂β_s ∂β_t ]_{β=β̂_n} = −Σ_{i=1}^n ψ'''(z_i^T β̂_n) z_{ir} z_{is} z_{it}, this equals

N_{p_n}(g | 0, Σ̂_n) [ 1 − (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} Σ_{i=1}^n ψ'''(z_i^T β̂_n) g_r g_s g_t z_{ir} z_{is} z_{it} + (1/√n) Σ_{v=1}^{p_n} g_v (∇ log π(β̂_n))_v − { (1/(6n^{3/2})) Σ_{r,s,t=1}^{p_n} Σ_{i=1}^n ψ'''(z_i^T β̂_n) g_r g_s g_t z_{ir} z_{is} z_{it} } { (1/√n) Σ_{v=1}^{p_n} g_v (∇ log π(β̂_n))_v }

+ R(g) ] [ 1 − o_p(1) ],

where sup_{g∈C_n} R(g) = O_p( p_n^{6+δ/3}/n ). ∎

Remark: If p_n is uniformly bounded, then Lemma 3, Lemma 4, Lemma 5 and Theorem 1 can be established by the same set of arguments, by using C_n = { g : g ∈ G_n, ||g|| ≤ c_n } for any sequence c_n → ∞ growing slowly enough.
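The content of Theorem 1 can be visualized in the simplest case p_n = 1. The sketch below, a toy illustration rather than anything taken from the dissertation, compares the exact (grid-normalized) posterior of g = √n(θ − θ̂_n) in a Poisson model with the plain normal approximation and with the skewness- and prior-corrected density suggested by (2-34); the N(0, 10²) prior and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, theta0 = 200, 0.3
x = rng.poisson(np.exp(theta0), size=n)
theta_hat = np.log(x.mean())                 # closed-form MLE here

g = np.linspace(-3.0, 3.0, 1501)             # g = sqrt(n)(theta - theta_hat)
theta = theta_hat + g / np.sqrt(n)
log_post = x.sum() * theta - n * np.exp(theta) - 0.5 * theta**2 / 100.0
post = np.exp(log_post - log_post.max())
dg = g[1] - g[0]
post /= post.sum() * dg                      # exact posterior density of g

sigma2 = np.exp(-theta_hat)                  # (psi''(theta_hat))^{-1}
normal = np.exp(-0.5 * g**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
skew = -np.exp(theta_hat) * g**3 / (6 * np.sqrt(n))   # psi''' correction term
prior = g * (-theta_hat / 100.0) / np.sqrt(n)          # (log pi)'(theta_hat) term
corrected = normal * (1 + skew + prior - skew * prior)

print("max |posterior - normal|   :", np.abs(post - normal).max())
print("max |posterior - corrected|:", np.abs(post - corrected).max())
```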
2.4 Asymptotic Normality of the MLE

In this section, we use Theorem 1 and Ghosal (1997)'s result to establish the asymptotic normality of the MLE. We first record an auxiliary lemma.

Lemma 6.

∫ | N_{p_n}(g | 0, Σ_n) − N_{p_n}(g | 0, Σ̂_n) | dg →^P 0.

Proof. Write f_1 = N_{p_n}(· | 0, Σ_n) and f_2 = N_{p_n}(· | 0, Σ̂_n). Since E_{f_2}(f_1/f_2) = 1, we have

∫ |f_1 − f_2| dg → 0 ⟺ E_{f_2}( f_1/f_2 − 1 )² → 0 ⟺ E_{f_2}( f_1²/f_2² ) − 1 → 0.

By substituting the expressions for f_1 and f_2, it follows that

∫ | f_1(g) − f_2(g) | dg → 0 ⟸ E_{N_{p_n}(g|0,Σ̂_n)} [ ( |Σ̂_n|/|Σ_n| ) exp( g^T Σ̂_n^{-1} g − g^T Σ_n^{-1} g ) ] → 1.   (2-59)

Let Z = Σ̂_n^{-1/2} g. Then,

E_{N_{p_n}(g|0,Σ̂_n)} [ exp( g^T Σ̂_n^{-1} g − g^T Σ_n^{-1} g ) ] = E_{N_{p_n}(z|0,I_{p_n})} [ exp( Z^T ( I − Σ̂_n^{1/2} Σ_n^{-1} Σ̂_n^{1/2} ) Z ) ]
= ∫ (2π)^{-p_n/2} exp( −z^T z / 2 + z^T ( I − Σ̂_n^{1/2} Σ_n^{-1} Σ̂_n^{1/2} ) z ) dz
= ∫ (2π)^{-p_n/2} exp( −z^T ( 2 Σ̂_n^{1/2} Σ_n^{-1} Σ̂_n^{1/2} − I ) z / 2 ) dz
= | 2 Σ̂_n^{1/2} Σ_n^{-1} Σ̂_n^{1/2} − I |^{-1/2} = Π_{i=1}^{p_n} | 2λ_{i,n} − 1 |^{-1/2},   (2-60)

where λ_{1,n}, λ_{2,n}, ..., λ_{p_n,n} are the eigenvalues of Σ̂_n^{1/2} Σ_n^{-1} Σ̂_n^{1/2}. Note that

|| Σ_n^{-1} − Σ̂_n^{-1} || = || (1/n) Σ_{i=1}^n ( ψ''(z_i^T β_0) − ψ''(z_i^T β̂_n) ) z_i z_i^T || ≤ || (1/n) Σ_{i=1}^n ψ'''(z_i^T θ̃_n) z_i^T( β̂_n − β_0 ) z_i z_i^T ||,

where θ̃_n is an intermediate point on the line joining β_0 and β̂_n. It follows that

|| Σ_n^{-1} − Σ̂_n^{-1} || ≤ p_n M_1 M_2 || β̂_n − β_0 || || (1/n) Σ_{i=1}^n z_i z_i^T || ≤ K p_n || β̂_n − β_0 ||

= O_p( p_n^{3/2}/√n ). By (A-1), Lemma 1 and the continuity of ψ'', it follows that the eigenvalues of Σ̂_n are uniformly bounded away from zero and from infinity, with probability tending to 1. Hence,

|| Σ̂_n^{1/2} Σ_n^{-1} Σ̂_n^{1/2} − I || = O_p( p_n^{3/2}/√n ).

It follows that | λ_{i,n} − 1 | = O_p( p_n^{3/2}/√n ), for every 1 ≤ i ≤ p_n. Note that

1 − 2|λ_{i,n} − 1| ≤ |2λ_{i,n} − 1| ≤ 1 + 2|λ_{i,n} − 1| ⟹ ( 1 + 2 max_i |λ_{i,n} − 1| )^{−p_n/2} ≤ Π_{i=1}^{p_n} |2λ_{i,n} − 1|^{-1/2} ≤ ( 1 − 2 max_i |λ_{i,n} − 1| )^{−p_n/2}.

Both the left and right hand sides of the above inequality are of order 1 + O( p_n^{5/2}/√n ). Hence, by (A-3) it follows that Π_{i=1}^{p_n} |2λ_{i,n} − 1|^{-1/2} → 1 as n → ∞. The result follows from (2-59) and (2-60). ∎

We now state and prove the asymptotic normality result.

Theorem 2. Let U_n be a k × p_n matrix such that U_n U_n^T → H as n → ∞. Then √n U_n Σ_n^{-1/2}( β̂_n − β_0 ) →_d N_k(0, H).

Proof. Note that by Theorem 1, if ||g|| ≤ C p_n^{1/2+δ_0}, then

π_n(g|X) = N_{p_n}(g | 0, Σ̂_n) [ 1 + o_p(1) ].   (2-61)

Also, by Lemma 4, for fixed C > 0,

P( ||g|| > C p_n^{1/2+δ_0} | X ) → 0.   (2-62)

By a similar argument as in the proof of Lemma 4,

∫_{ ||g|| > C p_n^{1/2+δ_0} } N_{p_n}(g | 0, Σ̂_n) dg → 0.   (2-63)

Combining (2-61), (2-62) and (2-63), it follows that

∫ | π_n(g|X) − N_{p_n}(g | 0, Σ̂_n) | dg → 0.

By Lemma 2 and the triangle inequality, it follows that

∫ | N_{p_n}(g | μ_n, Σ_n) − N_{p_n}(g | 0, Σ̂_n) | dg → 0.

Now, ∫ | N_{p_n}(g | 0, Σ_n) − N_{p_n}(g | 0, Σ̂_n) | dg → 0 is the content of Lemma 6. By Lemma 6 and the triangle inequality, it follows that

∫ | N_{p_n}(g | μ_n, Σ_n) − N_{p_n}(g | 0, Σ_n) | dg → 0.

We now use the general result:

sup_{A∈B} | P(A) − Q(A) | = (1/2) ∫ | p(x) − q(x) | dx,

where B is the class of all Borel sets, and P and Q are the probability measures associated with distributions with respective densities p and q. This leads to

∫ | N_{p_n}(g | μ_n, Σ_n) − N_{p_n}(g | 0, Σ_n) | dg → 0 ⟹ sup_{A∈B} | P_{μ_n,Σ_n}(g ∈ A) − P_{0,Σ_n}(g ∈ A) | → 0 ⟹ P_{μ_n,Σ_n}( μ_n^T Σ_n^{-1} g ≤ 0 ) − P_{0,Σ_n}( μ_n^T Σ_n^{-1} g ≤ 0 ) → 0
⟹ P_{μ_n,Σ_n}( ( μ_n^T Σ_n^{-1} g − μ_n^T Σ_n^{-1} μ_n ) / √( μ_n^T Σ_n^{-1} μ_n ) ≤ −√( μ_n^T Σ_n^{-1} μ_n ) ) − 1/2 → 0 ⟹ Φ( −√( μ_n^T Σ_n^{-1} μ_n ) ) − 1/2 → 0

⟹ μ_n^T Σ_n^{-1} μ_n → 0 ⟹ ||μ_n|| → 0, since Σ_n is a positive definite matrix with uniformly bounded eigenvalues. Now, define

Y_{n,i} = ( x_i − ψ'(z_i^T β_0) ) (1/√n) U_n Σ_n^{1/2} z_i,   for i = 1, 2, ..., n.

Note that

||Y_{n,i}|| = ( |x_i − ψ'(z_i^T β_0)| / √n ) || U_n Σ_n^{1/2} z_i || ≤ ( |x_i − ψ'(z_i^T β_0)| / √n ) ||U_n|| ||Σ_n^{1/2}|| ||z_i||.

Note that ||U_n|| = O(1), ||Σ_n^{1/2}|| = O(1), and by assumption (A-0), ||z_i|| ≤ M √p_n for all i = 1, ..., n. Also, by (A-1) and the continuity of ψ'' and ψ'''', it follows that E[ x_i − ψ'(z_i^T β_0) ]⁴ = ψ''''(z_i^T β_0) + 3( ψ''(z_i^T β_0) )² is uniformly bounded. It follows that sup_{1≤i≤n} E||Y_{n,i}||⁴ = O(p_n²/n²). If ε > 0 is arbitrarily fixed, then by the Cauchy-Schwarz inequality,

Σ_{i=1}^n E[ ||Y_{n,i}||² 1( ||Y_{n,i}|| > ε ) ] ≤ Σ_{i=1}^n √( E[||Y_{n,i}||⁴] P( ||Y_{n,i}|| > ε ) ) ≤ Σ_{i=1}^n √( E[||Y_{n,i}||⁴] E[||Y_{n,i}||⁴] / ε⁴ ) = Σ_{i=1}^n E[||Y_{n,i}||⁴] / ε² = O(p_n²/n).

Thus,

Σ_{i=1}^n E[ ||Y_{n,i}||² 1( ||Y_{n,i}|| > ε ) ] → 0.

Since the Y_{n,i}'s are independent, it follows that

V( Σ_{i=1}^n Y_{n,i} ) = Σ_{i=1}^n V(Y_{n,i}) = Σ_{i=1}^n V( x_i − ψ'(z_i^T β_0) ) (1/n) U_n Σ_n^{1/2} z_i z_i^T Σ_n^{1/2} U_n^T = U_n Σ_n^{1/2} ( (1/n) Σ_{i=1}^n ψ''(z_i^T β_0) z_i z_i^T ) Σ_n^{1/2} U_n^T = U_n U_n^T → H as n → ∞,

since Σ_n^{-1} = (1/n) Σ_{i=1}^n ψ''(z_i^T β_0) z_i z_i^T. By a multivariate version of the Lindeberg-Feller central limit theorem (van der Vaart (1998) [17]), it follows that

Σ_{i=1}^n Y_{n,i} →_D N(0, H).   (2-64)

Note that

|| √n U_n Σ_n^{-1/2}( β̂_n − β_0 ) − Σ_{i=1}^n Y_{n,i} || = || √n U_n Σ_n^{-1/2}( β̂_n − β_0 ) − (1/√n) Σ_{i=1}^n ( x_i − ψ'(z_i^T β_0) ) U_n Σ_n^{1/2} z_i ||
≤ ||U_n|| ||Σ_n^{-1/2}|| || √n( β̂_n − β_0 ) − Σ_n (1/√n) Σ_{i=1}^n ( x_i − ψ'(z_i^T β_0) ) z_i || = ||U_n|| ||Σ_n^{-1/2}|| ||μ_n|| →^P 0.   (2-65)

The result follows from (2-64) and (2-65). ∎

Acknowledgement: We would like to thank Prof. Subhasish Ghosal for his help with the paper.
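As an empirical companion to Theorem 2, the following simulation sketch (assuming a logistic GLM, k = 1, and U_n a fixed unit row vector, none of which come from the text) checks that √n U_n Σ_n^{-1/2}(β̂_n − β_0) has approximately mean 0 and variance U_nU_n^T = 1 across replications.

```python
import numpy as np

def logistic_mle(X, y, n_iter=25):
    """Newton-Raphson MLE for the canonical logistic GLM, psi = log(1+e^t)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1 - p)
        beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return beta

rng = np.random.default_rng(4)
n, p, reps = 500, 5, 400
beta0 = np.array([0.5, -0.5, 0.25, 0.0, 0.0])
u = np.ones(p) / np.sqrt(p)                    # k = 1, so U_n U_n^T = 1
stats = []
for _ in range(reps):
    X = rng.standard_normal((n, p)) / np.sqrt(p)
    prob = 1.0 / (1.0 + np.exp(-(X @ beta0)))
    y = rng.binomial(1, prob)
    w = prob * (1 - prob)
    Sigma_inv = (X.T @ (X * w[:, None])) / n   # (1/n) sum psi''(z^T beta0) z z^T
    L = np.linalg.cholesky(Sigma_inv)          # L L^T = Sigma_n^{-1}
    stats.append(np.sqrt(n) * u @ (L.T @ (logistic_mle(X, y) - beta0)))
print("mean (should be near 0):", np.mean(stats))
print("var  (should be near 1):", np.var(stats))
```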

Appendix

Multivariate normal distribution satisfies the assumptions in (2-7) and (2-8)

Suppose β ~ N_{p_n}(μ, A). We assume that ||μ|| = O(√p_n) and ||A^{-1}|| = O(√p_n). Note that ∇ log π(β) = −A^{-1}(β − μ). Hence,

||∇ log π(β)|| ≤ ||A^{-1}|| ||β − μ||.   (2-66)

Also,

||β − μ|| ≤ ||β − β_0|| + ||β_0|| + ||μ|| = ||β − β_0|| + √( β_0^T β_0 ) + ||μ||
≤ ||β − β_0|| + √(1/C_1) √( β_0^T ( (1/n) Σ_{i=1}^n z_i z_i^T ) β_0 ) + ||μ||   (by assumption (A-0))
≤ ||β − β_0|| + √(1/C_1) √( (1/n) Σ_{i=1}^n (z_i^T β_0)² ) + ||μ||
≤ ||β − β_0|| + K/√C_1 + ||μ||   (by assumption (A-1))
= ||β − β_0|| + O(√p_n).   (2-67)

It follows from (2-66), (2-67) and the assumptions on μ and A that

sup_{||β−β_0|| ≤ C_n} ||∇ log π(β)|| = O(p_n),   where C_n = O( (p_n/n)^{1/4} ).

Note that for 1 ≤ j, j' ≤ p_n,

(1/π(β)) ∂²π(β)/∂β_j ∂β_{j'}

= −(A^{-1})_{jj'} + ( Σ_{k=1}^{p_n} (A^{-1})_{jk} (β_k − μ_k) )( Σ_{k=1}^{p_n} (A^{-1})_{j'k} (β_k − μ_k) )
≤ ||A^{-1}|| + ( ||A^{-1}|| √p_n ||β − μ|| )² = O(√p_n) + O(p_n²) ||β − μ||².   (2-68)

It follows from (2-67) and (2-68) that

sup_{||β−β_0|| ≤ C_n} max_{1≤j,j'≤p_n} | (1/π(β)) ∂²π(β)/∂β_j ∂β_{j'} | = O(p_n⁴),   where C_n = O( (p_n/n)^{1/4} ).

Multivariate t distribution satisfies the assumptions in (2-7) and (2-8)

Suppose β ~ t_ν(μ, A). Here t_ν(μ, A) denotes the multivariate t distribution with parameters ν, μ and A. The density of this distribution is proportional to

( 1 + (1/ν)(β − μ)^T A^{-1} (β − μ) )^{−(ν+p_n)/2}.

We assume that the eigenvalues of A^{-1} are uniformly bounded in n. Then π(β) = C [ 1 + (1/ν)(β − μ)^T A^{-1}(β − μ) ]^{−(ν+p_n)/2}, where C is a constant. Now,

∇ log π(β) = π'(β)/π(β) = −( (ν + p_n)/ν ) A^{-1}(β − μ) / [ 1 + (1/ν)(β − μ)^T A^{-1}(β − μ) ].

Thus,

||∇ log π(β)||_2 = ( (ν + p_n)/√ν ) √( (β − μ)^T A^{-2}(β − μ) ) / [ 1 + (1/ν)(β − μ)^T A^{-1}(β − μ) ]
≤ C_1 ( (ν + p_n)/√ν ) √( (β − μ)^T A^{-1}(β − μ) ) / [ 1 + (1/ν)(β − μ)^T A^{-1}(β − μ) ] ≤ O(p_n),

where C_1 is a constant. Now, let A^{-1} = ((a_{ij})). Note that

π(β) = C [ 1 + (1/ν) Σ_{k,l=1}^{p_n} a_{kl} (β_k − μ_k)(β_l − μ_l) ]^{−(ν+p_n)/2}.

Hence, for j = 1, ..., p_n,

∂π(β)/∂β_j = −C ( (ν + p_n)/ν ) [ 1 + (1/ν) Σ_{k,l=1}^{p_n} a_{kl} (β_k − μ_k)(β_l − μ_l) ]^{−(ν+p_n)/2 − 1} ( Σ_{k=1}^{p_n} a_{kj} (β_k − μ_k) ).

Now, for j ≠ j', writing u = 1 + (1/ν) Σ_{k,l} a_{kl}(β_k − μ_k)(β_l − μ_l),

(1/π(β)) ∂²π(β)/∂β_j ∂β_{j'} = ( (ν + p_n)(ν + p_n + 2)/ν² ) ( Σ_{k=1}^{p_n} a_{kj}(β_k − μ_k) )( Σ_{k=1}^{p_n} a_{kj'}(β_k − μ_k) ) / u² − ( (ν + p_n)/ν ) a_{jj'} / u
≤(a) O(p_n²) | ( (β − μ)^T A^{-1} )_j ( (β − μ)^T A^{-1} )_{j'} | / ( 1 + (β − μ)^T A^{-1}(β − μ) )² + O(p_n)
≤(b) O(p_n²) ( (β − μ)^T A^{-2}(β − μ) ) / ( 1 + (β − μ)^T A^{-1}(β − μ) )² + O(p_n)
≤(c) O(p_n²) ( (β − μ)^T A^{-1}(β − μ) ) / ( 1 + (β − μ)^T A^{-1}(β − μ) )² + O(p_n)
= O(p_n²),

where (a) and (c) follow from the assumption that the eigenvalues of A^{-1} are uniformly bounded, and (b) follows since | ( (β − μ)^T A^{-1} )_j | ≤ √( (β − μ)^T A^{-2}(β − μ) ) for all j = 1, ..., p_n. Hence, the (sequence of) prior densities satisfies assumption (A-2).
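The gradient formula for the multivariate t prior derived above can be checked mechanically against finite differences; the sketch below does exactly that for an arbitrary positive definite A^{-1}. The function names and parameter values are illustrative assumptions.

```python
import numpy as np

def grad_log_t_prior(beta, mu, A_inv, nu):
    """Closed-form gradient of log pi for the multivariate t density:
    -((nu+p)/nu) A^{-1}(beta-mu) / (1 + (beta-mu)^T A^{-1}(beta-mu)/nu)."""
    d = beta - mu
    q = float(d @ A_inv @ d)
    p = len(beta)
    return -((nu + p) / nu) * (A_inv @ d) / (1.0 + q / nu)

def grad_fd(f, beta, eps=1e-6):
    """Central finite-difference gradient of a scalar function f."""
    g = np.zeros_like(beta)
    for j in range(len(beta)):
        e = np.zeros_like(beta); e[j] = eps
        g[j] = (f(beta + e) - f(beta - e)) / (2 * eps)
    return g

rng = np.random.default_rng(5)
p, nu = 4, 5.0
mu = rng.standard_normal(p)
M = rng.standard_normal((p, p)); A_inv = M @ M.T + np.eye(p)
log_pi = lambda b: -(nu + p) / 2 * np.log(1 + (b - mu) @ A_inv @ (b - mu) / nu)
beta = rng.standard_normal(p)
print(np.abs(grad_log_t_prior(beta, mu, A_inv, nu) - grad_fd(log_pi, beta)).max())
```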

CHAPTER 3
HIGH DIMENSIONAL POSTERIOR CONSISTENCY OF THE BAYESIAN LASSO

3.1 Literature Survey and Proposed Work

The simple problem of regressing a response vector of n observations on p covariates is among the first encountered by any student of statistics, and regression models have practical applications in virtually any conceivable field of study. More specifically, regression under the Bayesian paradigm has grown in popularity in recent decades with the rapid acceleration of computing power and the continuing development of Markov chain Monte Carlo numerical integration techniques. The behavior and theoretical properties of these Bayesian regression models have therefore become important topics of study.

3.1.1 Interpretation of Posterior Consistency

This basic formulation of regression lends itself easily to a Bayesian analysis in which a prior is placed on the unknown coefficient vector β and variance σ². The application of the Bayes theorem and integrating out the nuisance parameter σ² yield the marginal posterior of β, which is the primary object on which Bayesian inference is based. Call this entire Bayesian model P_M. However, suppose that the data is actually generated under some model P_0 comprising the same likelihood as P_M, but with some fixed but unknown parameter values β_0 and σ_0². One would hope that as the sample size tends to infinity, the marginal posterior of β converges to degeneracy at β_0 almost surely under P_0. We call this property posterior consistency, and the obvious question is to determine the values of β_0 and σ_0² for which posterior consistency occurs. In this simplest possible setup, it is easily verified that posterior consistency occurs as long as basic regularity assumptions about the design matrix hold.

In Bayesian analysis, one starts with some prior knowledge (sometimes imprecise) expressed as a distribution on the parameter space and updates the knowledge according to the posterior distribution given the data. It is therefore of utmost importance to know whether the updated knowledge becomes more and more accurate and precise as data are collected indefinitely. This requirement is called the consistency of the posterior distribution. Although it is an asymptotic property, consistency is one of the benchmarks since the violation

of consistency is clearly undesirable, and one may have serious doubts against inferences based on an inconsistent posterior distribution.

3.1.2 Formal Definition and Choice of Vector Norm

Consider a simple Bayesian regression model where σ² is known, the prior on β is specified as either normal or flat, and the dimensionality of the covariate vector is fixed at some finite p. Then posterior consistency for all values of β_0 follows immediately from the form of the posterior distribution of β, as long as basic regularity assumptions about the eigenvalues of the design matrix and prior variance matrix hold. In the more realistic case where σ² is unknown and given a sensible prior, it can also be easily verified that posterior consistency occurs for all β_0. As was previously stated, the notion of posterior consistency considered herein is the convergence of the posterior distribution of β to degeneracy at β_0 with P_0-probability 1. We now state a formal definition of posterior consistency.

Definition. Let β_{0n} ∈ R^{p_n} for each n ≥ 1 and σ_0² > 0. Now let P_0 denote the distribution of { (β̂_n, S_n), n ≥ 1 } under the model Y_n = X_n β_{0n} + e_n for each n ≥ 1, where e_n ~ N_n(0, σ_0² I_n) for each n ≥ 1. The sequence of posterior distributions P_M( β_n | β̂_n, S_n ) is said to be consistent at { (β_{0n}, σ_0²), n ≥ 1 } if P_M( ||β_n − β_{0n}||_∞ > ε | β̂_n, S_n ) → 0 a.s. (P_0) for every ε > 0.

The choice of the ℓ_∞ norm in our definition of posterior consistency warrants some discussion. In the case where the number of covariates p is fixed, it is clear that the particular choice of vector norm is irrelevant, since the ℓ_∞ norm could be replaced by any other ℓ_r norm, 1 ≤ r < ∞, and the definition would still be equivalent. However, the distinction becomes relevant when p tends to infinity at some rate along with the sample size, in which case p, β, and β_0 become p_n, β_n, and β_{0n}. If we wish to allow p_n to grow in proportion to n, then the conventional ℓ_2 norm, defined as ||x||_2 = ( Σ_{i=1}^{p_n} x_i² )^{1/2}, makes posterior consistency unreasonably difficult to achieve. As justification, note that under the ℓ_2 norm, even the MLE itself fails to achieve classical frequentist consistency. Thus, we instead consider posterior consistency under the ℓ_∞ norm ||x||_∞ = max_{1≤i≤p_n} |x_i|.
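Before the formal lemmas that follow, a quick simulation previews the contrast between the two norms when p_n grows proportionally to n (here Z_n ~ N(0, n^{-1} I_{p_n}) with p_n = n/2; the choice V_n = I is an assumption made for simplicity):

```python
import numpy as np

# With p_n = n/2, ||Z_n||_2 stabilizes near sqrt(p_n/n) = sqrt(1/2),
# while ||Z_n||_inf still shrinks (roughly like sqrt(2 log p_n / n)).
rng = np.random.default_rng(6)
for n in [100, 1000, 10000, 100000]:
    p = n // 2
    z = rng.standard_normal(p) / np.sqrt(n)
    print(f"n={n:6d}  ||Z||_2={np.linalg.norm(z):.3f}  "
          f"||Z||_inf={np.abs(z).max():.4f}")
```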

The following lemma and corollary illustrate why the ℓ_2 norm is not sufficiently flexible for our purposes.

Lemma 7. Let Z_n ~ N_{p_n}( 0_{p_n}, n^{-1} V_n ), where p_n ≤ n and V_n is a diagonal matrix whose diagonal elements are bounded below by ω_min > 0 and above by ω_max < ∞. Then ||Z_n||_2 → 0 almost surely if and only if p_n/n → 0.

Proof. Note that Var(Z_{n,i}) = n^{-1} V_{n,ii} ≤ n^{-1} ω_max, and n^{1/2} V_{n,ii}^{-1/2} Z_{n,i} ~ N(0, 1). Now let U_n = n Σ_{i=1}^{p_n} V_{n,ii}^{-1} Z_{n,i}², so that U_n ~ χ²_{p_n}. By the properties of the chi-squared distribution, U_n/n → 0 almost surely if and only if p_n/n → 0. Then since

√( ω_min U_n / n ) ≤ ||Z_n||_2 ≤ √( ω_max U_n / n ),

it follows that ||Z_n||_2 → 0 almost surely if and only if p_n/n → 0. ∎

Corollary 3.1. ||β̂_n − β_{0n}||_2 → 0 a.s. (P_0) if and only if p_n/n → 0.

Proof. Apply Lemma 7 under P_0 with Z_n = β̂_n − β_{0n} and V_n = σ_0² ( (1/n) X_n^T X_n )^{-1}. ∎

As is clear from Corollary 3.1, not even the MLE β̂_n achieves almost sure consistency under the ℓ_2 norm when p_n grows at the same rate as n. Thus, any attempt to establish posterior consistency under the ℓ_2 norm of a Bayesian regression model under the same circumstances would be futile. However, the following lemma and corollary motivate the choice of the ℓ_∞ norm instead.

Lemma 8. Let Z_n ~ N_{p_n}( 0_{p_n}, n^{-1} V_n ), where p_n ≤ n and V_n is as in Lemma 7. Then ||Z_n||_∞ → 0 almost surely.

Proof. Note that Var(Z_{n,i}) = n^{-1} V_{n,ii} ≤ n^{-1} ω_max, and n^{1/2} V_{n,ii}^{-1/2} Z_{n,i} ~ N(0, 1). Then

Σ_{n=1}^∞ P( ||Z_n||_∞ > ε ) = Σ_{n=1}^∞ P( max_{1≤i≤p_n} |Z_{n,i}| > ε ) ≤ Σ_{n=1}^∞ Σ_{i=1}^{p_n} P( n^{1/2} V_{n,ii}^{-1/2} |Z_{n,i}| > ε ( n^{-1} V_{n,ii} )^{-1/2} )

≤ Σ_{n=1}^∞ Σ_{i=1}^{p_n} P( n^{1/2} V_{n,ii}^{-1/2} |Z_{n,i}| > ε ω_max^{-1/2} n^{1/2} ) ≤ Σ_{n=1}^∞ Σ_{i=1}^{p_n} 15 ω_max³ / ( ε⁶ n³ ) < ∞,

by applying Markov's inequality to n³ V_{n,ii}^{-3} Z_{n,i}⁶, and the result follows from the Borel-Cantelli lemma, noting that p_n ≤ n. ∎

Corollary 3.2. ||β̂_n − β_{0n}||_∞ → 0 a.s. (P_0).

3.1.3 Conditional Independence Prior & the Bayesian Lasso

A natural starting point is the conjugate specification in which β given σ² is normal with prior covariance proportional to σ² (X_n^T X_n)^{-1} and σ² ~ InverseGamma(a/2, b/2). One may also wish to use an improper prior proportional to 1/σ², 1/σ, or 1, but these improper priors can be seen to have the same basic form as inverse gamma densities if the parameter restrictions are relaxed to a ≥ −2 and b ≥ 0. Although there exist various prior structures for the coefficient vector with interesting properties and applications, perhaps the most obvious alternative is to simply replace (X_n^T X_n)^{-1} in the prior variance of β given σ² with some diagonal matrix D = Diag(τ_1², ..., τ_p²), yielding β | σ² ~ N_p(μ, σ² D). Thus, the components of β are independent a priori when conditioned on σ². The values of τ_1², ..., τ_p² can be taken as fixed, or they can be set equal to a common value τ² which

is then estimated through an empirical Bayesian approach. However, the most important application of this model is the extension to a hierarchical model in which τ_1², ..., τ_p² are assigned independent exponential priors with common rate parameter λ²/2. As noted by Park and Casella (2008) [18], this formulation leads to a Bayesian version of the lasso of Tibshirani (1996) [19] if the point estimate of the coefficient vector is taken to be its posterior mode. Park and Casella observe that the resulting Bayesian lasso typically yields results quite similar to those of the ordinary lasso, but with the advantage of automatic interval estimates for all parameters via any of the usual constructions of Bayesian credible intervals. Of course, this still leaves the question of how to specify the parameter λ. Casella (2001) [20] examines the replacement of λ with an empirical Bayesian estimate derived by maximizing the marginal likelihood of λ. Alternatively, the hierarchical structure can be extended further by specifying a prior on λ, though Park and Casella advise caution here, as seemingly innocuous improper priors such as 1/λ² can lead to impropriety of the posterior. Further discussion of Bayesian lasso methods can be found in Kyung et al. (2010) [21].

A slight but significant modification of the above structure is to take β and σ² to be a priori independent, removing the dependence on σ² from the prior given above for the coefficient vector. However, Park and Casella (2008) show that this unconditional prior can easily lead to a bimodal posterior on (β, σ²). In contrast, they show that the conditional prior always leads to a unimodal posterior as long as σ² ~ InverseGamma(a/2, b/2), where we permit a ≥ −2 and b ≥ 0, as before. Moreover, Kyung et al. (2010) illustrate other lasso-type penalized regression schemes that can be represented through hierarchical extensions of the conditional independence prior. In addition to Tibshirani's original lasso, both the group lasso of Yuan and Lin (2006) [22] and the elastic net of Zou and Hastie (2005) [23] can be represented in this fashion. A general examination of posterior consistency under hierarchical extensions of the conditional independence prior could provide conditions under which these lasso-type regression techniques are consistent in the frequentist sense.
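For concreteness, here is a minimal Gibbs sampler sketch for the conditional-independence Bayesian lasso hierarchy described above, holding λ and σ² fixed. The full conditionals follow Park and Casella (2008); treating λ and σ² as known, together with all function names and test values, are assumptions of this sketch rather than content of the dissertation.

```python
import numpy as np

def bayesian_lasso_gibbs(X, y, lam, sigma2, n_iter=2000, seed=0):
    """Gibbs sampler for the Bayesian lasso with fixed lambda, sigma^2:
      beta | tau, y    ~ N( A^{-1} X^T y, sigma2 A^{-1} ),  A = X^T X + D^{-1},
      1/tau_j^2 | beta ~ InverseGaussian( sqrt(lam^2 sigma2 / beta_j^2), lam^2 )."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    tau2 = np.ones(p)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        A = XtX + np.diag(1.0 / tau2)
        beta = rng.multivariate_normal(np.linalg.solve(A, Xty),
                                       sigma2 * np.linalg.inv(A))
        mean_ig = np.sqrt(lam**2 * sigma2 / np.maximum(beta**2, 1e-12))
        tau2 = 1.0 / rng.wald(mean_ig, lam**2)   # Wald = inverse Gaussian
        draws[t] = beta
    return draws

rng = np.random.default_rng(7)
n, p = 100, 10
X = rng.standard_normal((n, p))
beta_true = np.r_[2.0, -1.5, np.zeros(p - 2)]
y = X @ beta_true + rng.standard_normal(n)
draws = bayesian_lasso_gibbs(X, y, lam=1.0, sigma2=1.0)
print(np.round(draws[500:].mean(axis=0), 2))   # posterior means after burn-in
```

Park and Casella also give an update for σ² and empirical or fully Bayesian treatments of λ; those are omitted here to keep the sketch minimal.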

3.1.4 Shrinkage Priors

Shrinkage estimation through continuous priors (Griffin & Brown (2007) [24], Park & Casella (2008) [18], Hans (2009) [25], Carvalho et al. (2010) [26], Griffin & Brown (2010) [27]) has found much attention in recent years, along with their frequentist analogues (Knight & Fu (2000) [28], Fan & Li (2001) [29], Yuan & Lin (2005) [30], Zhao & Yu (2006) [31], Zou (2006) [32], Zou & Li (2008) [33]) in the regularization framework. The Lasso of Tibshirani (1996) and its Bayesian analogues relying on double exponential priors (Park & Casella (2008), Hans (2009)) have drawn particular attention, with many variations being proposed. These priors yield undeniable computational advantages in regression models over Bayesian variable selection approaches that require a search over a huge discrete model space (George & McCulloch (1993) [34], Raftery et al. (1997) [35], Chipman et al. (2001) [36], Liang et al. (2008) [37], Clyde et al. (2010) [38]).

Consider the linear model y_n = X_n β_{0n} + ε_n, where y_n is an n-dimensional vector of responses, X_n is the n × p_n design matrix, ε_n ~ N_n(0, σ_0² I_n) with fixed σ_0², and some of the components of β_{0n} are zero. In the Bayesian framework, to justify use in high-dimensional settings, it is important to establish posterior consistency in cases in which the number of parameters p increases with sample size n. Armagan et al. (2013) [39] investigated the asymptotic behavior of posterior distributions of regression coefficients in high-dimensional linear models as the number of parameters grows with the number of observations. Their main contribution is providing a simple sufficient condition on the prior concentration for strong posterior consistency (in ℓ_2 norm) when p_n = o(n). Their particular focus is on shrinkage priors, including the Laplace, Student t, generalized double Pareto, and horseshoe-type priors (Johnstone & Silverman (2004) [40], Griffin & Brown (2007) [24], Carvalho et al. (2010) [26], Armagan et al. (2011a) [41]).

3.2 Main Result

Consider the following Bayesian Lasso (Park & Casella (2008) [18]) model:

y_n | X_n, β_n, σ² ~ N_n( X_n β_n, σ² I_n )


$$
\beta_n \mid \sigma^2, \tau_1^2, \ldots, \tau_{p_n}^2 \sim N_{p_n}(0_{p_n}, \sigma^2 D_\tau), \qquad D_\tau = \mathrm{diag}(\tau_1^2, \ldots, \tau_{p_n}^2),
$$
i.e., $\beta_j \mid \sigma^2, \tau_j^2 \stackrel{ind}{\sim} N(0, \sigma^2\tau_j^2)$ for $j = 1, \ldots, p_n$, and
$$
\tau_j^2 \stackrel{iid}{\sim} \mathrm{Exp}(\lambda^2/2), \quad j = 1, \ldots, p_n; \qquad \tau_1^2, \ldots, \tau_{p_n}^2 > 0.
$$
In the above hierarchical model, we treat $\sigma^2$ as a non-random quantity. Now, suppose that the true model is $y_n = X_n\beta_{0n} + \epsilon_n$ with $\epsilon_n \sim N_n(0, \sigma^2 I_n)$, as in Section 3.1.4. We investigate the posterior consistency of the above Bayesian lasso model under a much more relaxed growth restriction on the dimension, namely $p_n = O(n)$. As discussed above, posterior consistency in the $\ell_2$ norm seems unrealistic to achieve under this growth condition. Hence we consider posterior consistency in the weaker $\ell_\infty$ norm. We have proved the following theorem.

Theorem 3. For an orthogonal design, i.e., $X_n^T X_n = nI_n$, and under the condition $\|\beta_{0n}\|_2^2 = O(n^{2-\delta})$, $\delta > 0$, on the true regression coefficients, posterior consistency of the regression coefficients can be achieved, i.e., $P^n(\|\beta_n - \beta_{0n}\|_\infty > \epsilon \mid y_n) \to 0$ a.s. as $n \to \infty$, for every $\epsilon > 0$.

Proof. Adding and subtracting $E(\beta_n \mid \sigma^2, \tau^2, y_n)$ and applying the triangle inequality,
$$
P^n(\|\beta_n - \beta_{0n}\|_\infty > \epsilon \mid y_n)
\le P^n\big(\|\beta_n - E(\beta_n \mid \sigma^2, \tau^2, y_n)\|_\infty > \epsilon/2 \mid y_n\big)
+ P^n\big(\|E(\beta_n \mid \sigma^2, \tau^2, y_n) - \beta_{0n}\|_\infty > \epsilon/2 \mid y_n\big)
= (I) + (II), \text{ say.} \qquad (3-1)
$$
Let $\tilde\beta(\sigma^2, \tau^2, y_n) = E(\beta_n \mid \sigma^2, \tau^2, y_n) = (nI_n + D_\tau^{-1})^{-1}X_n^Ty_n$. Now, notice that
$$
\big[\beta_n - \tilde\beta(\sigma^2, \tau^2, y_n)\big] \mid \sigma^2, \tau^2, y_n \sim N_{p_n}\big(0, \sigma^2(nI_n + D_\tau^{-1})^{-1}\big).
$$
Also, let $v_{ii}$ be the $i$th diagonal element of $(nI_n + D_\tau^{-1})^{-1}$, so that $v_{ii} = (n + \tau_i^{-2})^{-1} \le 1/n$. Then, by applying the tower property and the Bonferroni bound consecutively, we get
$$
(I) = E\Big[P^n\big(\|\beta_n - \tilde\beta(\sigma^2, \tau^2, y_n)\|_\infty > \epsilon/2 \mid \sigma^2, \tau^2, y_n\big) \,\Big|\, y_n\Big]
\le \sum_{i=1}^{p_n} E\Big[P\Big(|Z_i| > \frac{\epsilon}{2\sigma\sqrt{v_{ii}}}\Big) \,\Big|\, y_n\Big], \quad Z_i \sim N(0,1),
$$
$$
\le \sum_{i=1}^{p_n} E\Big[P\Big(|Z_i| > \frac{\sqrt n\,\epsilon}{2\sigma}\Big) \,\Big|\, y_n\Big]
= p_n\,P\Big(|Z| > \frac{\sqrt n\,\epsilon}{2\sigma}\Big) \to 0, \quad \text{as } n \to \infty. \qquad (3-2)
$$


Next, observe that under the orthogonal design $\hat\beta_n = X_n^Ty_n/n$ is the least squares estimator, so the $i$th coordinate of $\tilde\beta = (nI_n + D_\tau^{-1})^{-1}X_n^Ty_n$ is $n\hat\beta_{n,i}/(n + \tau_i^{-2}) = \hat\beta_{n,i} - \frac{\tau_i^{-2}}{n + \tau_i^{-2}}\hat\beta_{n,i}$. Hence, by the triangle inequality,
$$
(II) = P^n\Big(\max_{1 \le i \le p_n}\Big|\frac{n\hat\beta_{n,i}}{n + \tau_i^{-2}} - \beta_{0,i}\Big| > \epsilon/2 \,\Big|\, y_n\Big)
\le P^n\Big(\max_{1 \le i \le p_n}|\hat\beta_{n,i} - \beta_{0,i}| > \epsilon/4 \,\Big|\, y_n\Big)
+ P^n\Big(\max_{1 \le i \le p_n}\frac{\tau_i^{-2}}{n + \tau_i^{-2}}|\hat\beta_{n,i}| > \epsilon/4 \,\Big|\, y_n\Big)
= (III) + (IV), \text{ say.} \qquad (3-3)
$$
By Corollary 3.2, it is easy to see that $(III) \to 0$. Also, splitting on the indicator of the event $\|\hat\beta_n - \beta_{0n}\|_\infty \le \epsilon/4$,
$$
(IV) = P^n\Big(\max_{1 \le i \le p_n}\frac{\tau_i^{-2}}{n + \tau_i^{-2}}|\hat\beta_{n,i}| > \epsilon/4 \,\Big|\, y_n\Big)\,I\big(\|\hat\beta_n - \beta_{0n}\|_\infty \le \epsilon/4\big)
+ P^n\Big(\max_{1 \le i \le p_n}\frac{\tau_i^{-2}}{n + \tau_i^{-2}}|\hat\beta_{n,i}| > \epsilon/4 \,\Big|\, y_n\Big)\,I\big(\|\hat\beta_n - \beta_{0n}\|_\infty > \epsilon/4\big)
= (V) + (VI), \text{ say.} \qquad (3-4)
$$
Clearly, $(VI) \to 0$, since the indicator vanishes for large $n$. Now, because $\tau_i^{-2}/(n + \tau_i^{-2}) < 1$, coordinates with $|\hat\beta_{n,i}| \le \epsilon/4$ cannot trigger the event, and the exceedance forces $\tau_i^{-2} > n\epsilon/(4|\hat\beta_{n,i}|)$; hence
$$
(V) \le \sum_{i : |\hat\beta_{n,i}| > \epsilon/4} P^n\Big(\tau_i^{-2} > \frac{n\epsilon}{4|\hat\beta_{n,i}|} \,\Big|\, y_n\Big)\,I\big(\|\hat\beta_n - \beta_{0n}\|_\infty \le \epsilon/4\big). \qquad (3-5)
$$


Since $\|\hat\beta_n - \beta_{0n}\|_\infty \le \epsilon/4$ on the event in question and $\|\beta_{0n}\|_2^2 = O(n^{2-\delta})$ with $\delta > 0$, we have $|\hat\beta_{n,i}| = O_p(n^{1-\delta/2})$. Thus, for large $n$, using $K\,(>0)$ as a generic constant,
$$
\sum_{i : |\hat\beta_{n,i}| > \epsilon/4} P^n\Big(\tau_i^{-2} > \frac{n\epsilon}{4|\hat\beta_{n,i}|} \,\Big|\, y_n\Big)\,I(\cdot)
\le \sum_{i : |\hat\beta_{n,i}| > \epsilon/4} P^n\big(\tau_i^{-2} > Kn^{\delta/2} \mid y_n\big)\,I(\cdot)
\le \sum_{i : |\hat\beta_{n,i}| > \epsilon/4} E\Big[P^n\big(\tau_i^{-2} > Kn^{\delta/2} \mid \beta_n, y_n\big) \,\Big|\, y_n\Big]. \qquad (3-6)
$$
Next we observe that $\tau_i^{-2} \mid \beta_n, y_n \sim \mathrm{InverseGaussian}\big(\frac{\lambda\sigma}{|\beta_i|}, \lambda^2\big)$. In order to find an upper bound for the inner probability in (3-6), we need the following lemma.

Lemma 9. Suppose $X \sim \mathrm{InverseGaussian}(\mu, \lambda)$. Then
$$
P(X > M) \le \sqrt{\frac{2\lambda}{\pi M}}\,\exp\Big(\frac{\lambda}{\mu}\Big)\exp\Big(-\frac{\lambda M}{2\mu^2}\Big).
$$

Proof.
$$
P(X > M) = \int_M^\infty \sqrt{\frac{\lambda}{2\pi x^3}}\exp\Big(-\frac{\lambda(x - \mu)^2}{2\mu^2 x}\Big)\,dx
= \sqrt{\frac{\lambda}{2\pi}}\exp\Big(\frac{\lambda}{\mu}\Big)\int_M^\infty \frac{1}{x^{3/2}}\exp\Big(-\frac{\lambda x}{2\mu^2}\Big)\exp\Big(-\frac{\lambda}{2x}\Big)\,dx
$$
$$
\le \sqrt{\frac{\lambda}{2\pi}}\exp\Big(\frac{\lambda}{\mu}\Big)\int_M^\infty \frac{1}{x^{3/2}}\exp\Big(-\frac{\lambda x}{2\mu^2}\Big)\,dx
\le \sqrt{\frac{\lambda}{2\pi}}\exp\Big(\frac{\lambda}{\mu}\Big)\exp\Big(-\frac{\lambda M}{2\mu^2}\Big)\int_M^\infty \frac{dx}{x^{3/2}}
= \sqrt{\frac{2\lambda}{\pi M}}\exp\Big(\frac{\lambda}{\mu}\Big)\exp\Big(-\frac{\lambda M}{2\mu^2}\Big),
$$
using $\int_M^\infty x^{-3/2}\,dx = 2M^{-1/2}$.

By the above lemma, applied with mean $\lambda\sigma/|\beta_i|$ and shape parameter $\lambda^2$, an upper bound for each summand in (3-6) is given by
$$
\frac{K}{n^{\delta/4}}\,E\Big[\exp\Big(\frac{\lambda|\beta_i|}{\sigma}\Big)\exp\Big(-\frac{Kn^{\delta/2}\beta_i^2}{2\sigma^2}\Big) \,\Big|\, y_n\Big]. \qquad (3-7)
$$


Next observe that the $\beta_i$ have iid priors with common pdf $f(\beta \mid \sigma, \lambda) = \frac{\lambda}{2\sigma}\exp\big[-(\lambda/\sigma)|\beta|\big]$. Also, since $X_n^TX_n = nI_n$, writing $\|y_n - X_n\beta_n\|^2 = \|y_n - X_n\hat\beta_n\|^2 + \|X_n(\hat\beta_n - \beta_n)\|^2$ and further $\|X_n(\hat\beta_n - \beta_n)\|^2 = n\|\hat\beta_n - \beta_n\|^2 = n\sum_{i=1}^{p_n}(\beta_i - \hat\beta_{n,i})^2$, the posterior of $\beta_i \mid y_n$ is
$$
\pi(\beta_i \mid y_n) \propto \exp\Big(-\frac{n}{2\sigma^2}(\beta_i - \hat\beta_{n,i})^2 - \frac{\lambda|\beta_i|}{\sigma}\Big). \qquad (3-8)
$$
In view of (3-7), the factor $\exp(\lambda|\beta_i|/\sigma)$ cancels the prior term in the numerator, so
$$
E\Big[\exp\Big(\frac{\lambda|\beta_i|}{\sigma}\Big)\exp\Big(-\frac{Kn^{\delta/2}\beta_i^2}{2\sigma^2}\Big) \,\Big|\, y_n\Big]
= \frac{\int_{-\infty}^\infty \exp\big[-\frac{n}{2\sigma^2}(\beta_i - \hat\beta_{n,i})^2 - \frac{Kn^{\delta/2}\beta_i^2}{2\sigma^2}\big]\,d\beta_i}{\int_{-\infty}^\infty \exp\big[-\frac{n}{2\sigma^2}(\beta_i - \hat\beta_{n,i})^2 - \frac{\lambda|\beta_i|}{\sigma}\big]\,d\beta_i}
= N/D \text{ (say).} \qquad (3-9)
$$
Now,
$$
N \le \int_{-\infty}^\infty \exp\Big[-\frac{n}{2\sigma^2}(\beta_i - \hat\beta_{n,i})^2\Big]\,d\beta_i = (2\pi\sigma^2/n)^{1/2}. \qquad (3-10)
$$
Also, splitting at zero and completing the square in each piece,
$$
D = \int_0^\infty \exp\Big(-\frac{n\beta_i^2}{2\sigma^2} + \frac{n\beta_i\hat\beta_{n,i}}{\sigma^2} - \frac{n\hat\beta_{n,i}^2}{2\sigma^2} - \frac{\lambda\beta_i}{\sigma}\Big)\,d\beta_i
+ \int_{-\infty}^0 \exp\Big(-\frac{n\beta_i^2}{2\sigma^2} + \frac{n\beta_i\hat\beta_{n,i}}{\sigma^2} - \frac{n\hat\beta_{n,i}^2}{2\sigma^2} + \frac{\lambda\beta_i}{\sigma}\Big)\,d\beta_i
$$
$$
= \int_0^\infty \exp\Big[-\frac{n}{2\sigma^2}\Big(\beta_i - \Big(\hat\beta_{n,i} - \frac{\lambda\sigma}{n}\Big)\Big)^2\Big]\exp\Big(\frac{\lambda^2}{2n} - \frac{\lambda\hat\beta_{n,i}}{\sigma}\Big)\,d\beta_i
+ \int_{-\infty}^0 \exp\Big[-\frac{n}{2\sigma^2}\Big(\beta_i - \Big(\hat\beta_{n,i} + \frac{\lambda\sigma}{n}\Big)\Big)^2\Big]\exp\Big(\frac{\lambda^2}{2n} + \frac{\lambda\hat\beta_{n,i}}{\sigma}\Big)\,d\beta_i
$$
$$
= \Bigg[\Phi\Big(\frac{\sqrt n}{\sigma}\Big(\hat\beta_{n,i} - \frac{\lambda\sigma}{n}\Big)\Big)\exp\Big(\frac{\lambda^2}{2n} - \frac{\lambda\hat\beta_{n,i}}{\sigma}\Big)
+ \Phi\Big(-\frac{\sqrt n}{\sigma}\Big(\hat\beta_{n,i} + \frac{\lambda\sigma}{n}\Big)\Big)\exp\Big(\frac{\lambda^2}{2n} + \frac{\lambda\hat\beta_{n,i}}{\sigma}\Big)\Bigg](2\pi\sigma^2/n)^{1/2}. \qquad (3-11)
$$
Since $\hat\beta_{n,i} \to \beta_{0,i}$ a.s., it follows from (3-10) and (3-11) that $N/D = O(1)$ a.s. as $n \to \infty$. Hence, from (3-7) to (3-11), it follows that $(V) \to 0$ a.s. as $n \to \infty$.
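Although not needed for the proof, the hierarchy lends itself to a simple two-block Gibbs sampler built from exactly the two conditional distributions used above: $\beta_n \mid \tau^2, y_n$ is Gaussian with mean $\tilde\beta(\sigma^2, \tau^2, y_n)$, and $\tau_j^{-2} \mid \beta_n, y_n \sim \mathrm{InverseGaussian}(\lambda\sigma/|\beta_j|, \lambda^2)$. The following is a minimal sketch, assuming NumPy; the function name, iteration count, and numerical safeguards are illustrative choices, not part of the dissertation.

```python
import numpy as np

def bayes_lasso_gibbs(y, X, lam, sigma2=1.0, n_iter=2000, seed=0):
    """Two-block Gibbs sampler for the Bayesian lasso with sigma^2 held fixed."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    inv_tau2 = np.ones(p)                      # current values of 1/tau_j^2
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        # beta | tau^2, y ~ N((X'X + D^{-1})^{-1} X'y, sigma^2 (X'X + D^{-1})^{-1})
        prec = XtX + np.diag(inv_tau2)
        cov = sigma2 * np.linalg.inv(prec)
        beta = rng.multivariate_normal(np.linalg.solve(prec, Xty), cov)
        # 1/tau_j^2 | beta ~ InverseGaussian(lam*sigma/|beta_j|, lam^2)
        mu = lam * np.sqrt(sigma2) / np.maximum(np.abs(beta), 1e-12)
        inv_tau2 = rng.wald(mu, lam ** 2)
        draws[it] = beta
    return draws
```

Under the orthogonal design of Theorem 3, the precision matrix is diagonal and the $\beta$-update decouples coordinatewise; tracking $\max_j|\beta_j - \beta_{0j}|$ across the draws then gives an empirical counterpart of the $\ell_\infty$ statement in the theorem.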


CHAPTER 4
VARIABLE SELECTION WITH KULLBACK-LEIBLER DIVERGENCE LOSS

4.1 Literature Survey and Proposed Work

There are two fundamental goals in statistical learning: ensuring high prediction accuracy and discovering relevant predictive variables. Variable selection is particularly important when the true underlying model has a sparse representation; identifying significant predictors enhances the prediction performance of the fitted model. Variable selection is also fundamental to high-dimensional statistical modeling. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic errors in the variable selection process. Regularization methods are characterized by loss functions measuring data fit and penalty terms constraining model parameters. The lasso is a popular regularization technique for simultaneous estimation and variable selection (Tibshirani (1996) [19]). Fan and Li (2006) [42] gave a comprehensive overview of variable/feature selection and proposed a unified framework to approach the problem of variable selection.

4.1.1 Variable Selection and the Lasso

Let us consider the usual linear model setup. Suppose we observe an i.i.d. sample $(x_i, y_i)$, $i = 1, \ldots, n$, where $x_i = (x_{i1}, \ldots, x_{ip})$ is the vector of $p$ covariates. The linear model is given by $Y = X\beta + \epsilon$, where $\epsilon \sim N(0, \sigma^2 I)$. Our main interest is to estimate the regression coefficient vector $\beta = (\beta_1, \ldots, \beta_p)$. We also know that for $p > n$ the least squares estimator (LSE) is not unique, and it will usually overfit the data: all observations are predicted perfectly, but there are many solutions for the coefficients of the fit, and new observations are not uniquely predictable. The classical solution to this problem was to try to reduce the number of variables by processes such as forward and backward regression, with the reduction in variables determined by hypothesis tests; see Draper and Smith (1998) [43], for example.
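The non-uniqueness for $p > n$ is easy to see numerically. The following sketch (assuming NumPy; the data and dimensions are arbitrary illustrations) shows that the minimum-norm least squares solution interpolates the responses, while any vector in the null space of $X$ can be added to the coefficients without changing the fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50                                   # more covariates than observations
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # minimum-norm LSE
print(np.allclose(X @ beta, y))                 # True: the data are fit perfectly

P_null = np.eye(p) - np.linalg.pinv(X) @ X      # projector onto the null space of X
beta2 = beta + P_null @ rng.standard_normal(p)  # a different coefficient vector...
print(np.allclose(X @ beta2, X @ beta))         # ...with exactly the same fit
```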


Suppose the true (unknown) regression coefficients are $\beta_0 = (\beta_{01}, \ldots, \beta_{0p})^T$. We denote the true non-zero set by $S_0 = \{j : \beta_{0j} \neq 0\}$, and the true dimension of this set is given by $d = |S_0|$. So, we have the relationship $d \le p$. Classical remedies for the instability of the LSE include ridge regression (Hoerl and Kennard (1970) [44]), which stabilizes the estimates by shrinkage but keeps all variables in the model. Breiman (1995) [45] proposed the non-negative garrote, which shrinks the full least squares estimates through non-negative scaling factors $c_j$, as follows.

$$
\hat\beta = \arg\min\Big\{\sum_{i=1}^n\Big(y_i - \sum_j c_j\hat\beta_j^0 x_{ij}\Big)^2\Big\} \quad \text{s.t. } c_j \ge 0, \; \sum_j c_j \le t,
$$
where $\hat\beta_j^0$ is the full LSE of $\beta_j$, $j = 1, \ldots, p$. There are some advantages to using the non-negative garrote: it gives lower prediction error than subset selection, and it is competitive with ridge regression except when the true model has many small non-zero coefficients. But the drawback of this method is that it depends on both the sign and the magnitude of the LSE, and thus it suffers when the LSE behaves poorly.

The lasso is a shrinkage and selection method for linear regression. It minimizes the usual sum of squared errors, with a bound on the sum of the absolute values of the coefficients. Because of the nature of this constraint, it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. The lasso has several advantages: it simultaneously performs model selection and model fitting, and, although it is a non-linear method, its solution is the global minimum of a convex penalized loss function and can be computed efficiently. For the lasso, we standardize $x_{ij}$ so that $\sum_i x_{ij}/n = 0$ and $\sum_i x_{ij}^2/n = 1$. Now, denoting $\hat\beta = (\hat\beta_1, \ldots, \hat\beta_p)^T$, the lasso estimate $\hat\beta$ is given by
$$
\hat\beta = \arg\min\Big\{\sum_{i=1}^n\Big(y_i - \sum_j\beta_j x_{ij}\Big)^2\Big\} \quad \text{s.t. } \sum_j|\beta_j| \le t,
$$
where $t \ge 0$ is a tuning parameter.

Here, $t$ controls the amount of shrinkage: when $t < t_0 = \sum_j|\hat\beta_j^0|$, the sum of the absolute values of the full least squares estimates, the solutions are shrunken versions of the least squares estimates, and some coefficients are set exactly to zero.

4.1.2 The Adaptive Lasso

Zou (2006) [32] studied whether the lasso is an oracle procedure, that is, a procedure that identifies the right subset model, $\lim_n P(A_n = A) = 1$, where $A_n = \{j : \hat\beta_j \neq 0\}$ and $A = \{j : \beta_{0j} \neq 0\}$, and that attains the optimal estimation rate, $\sqrt n(\hat\beta_A - \beta_{0A}) \to_d N(0, \Sigma)$ for the true non-zero coefficients. He showed that the lasso penalty, which treats every coefficient equally, can be inconsistent for variable selection, and proposed the adaptive lasso, in which data-dependent weights are used to penalize different coefficients in the $\ell_1$ penalty. The construction is as follows.

Suppose that $\hat\beta$ is a root-$n$-consistent estimator of $\beta$; for example, we can use $\hat\beta(\text{ols})$. Pick a $\gamma > 0$, and define the weight vector $\hat w = 1/|\hat\beta|^\gamma$. The adaptive lasso estimators $\hat\beta^{(adalasso)}$ are given by
$$
\hat\beta^{(adalasso)} = \arg\min_\beta\Big\{\sum_{i=1}^n\Big(y_i - \sum_j\beta_j x_{ij}\Big)^2 + \lambda_n\sum_j\hat w_j|\beta_j|\Big\},
$$
where $\lambda_n$ varies with $n$.

4.2 An Alternative Approach to the Adaptive Lasso through the KL Divergence Loss

In our work, we replace the squared error loss in the adaptive lasso setup by the Kullback-Leibler (KL) divergence loss, which is also known as the entropy distance. There are various theoretical reasons to defend the use of the Kullback-Leibler distance, ranging from information theory to the relevance of the logarithmic scoring rule and the location-scale invariance of the distance, as detailed in Bernardo and Smith (1994) [46]. We also show that the oracle properties discussed above hold in this case under mild regularity conditions.

We adopt the setup of Knight and Fu (2000) [28] for the asymptotic analysis. We assume two conditions:

(a) $y_i = x_i^T\beta^* + \epsilon_i$, where $\epsilon_1, \ldots, \epsilon_n$ are independent identically distributed (iid) random variables with mean 0 and variance $\sigma^2$. Here, for convenience, we assume that $\sigma^2$ is known and is equal to one.

(b) $\frac{1}{n}X^TX \to C$, where $C$ is a positive definite matrix.

Without loss of generality, assume that $A = \{1, 2, \ldots, p_0\}$, and let $C_{11}$ be the $p_0 \times p_0$ upper left-corner partitioned sub-matrix of $C$. Now, suppose that $f(y_i \mid x_i, \beta^*)$ and $f(y_i \mid x_i, \beta)$ are the normal densities evaluated at $\beta^*$ and $\beta$, respectively. Then the "adaptive penalized KL divergence" estimators (which we are going to call the "KL adaptive lasso" estimators from now on), $\hat\beta^{KL}$, are given by
$$
\hat\beta^{KL} = \arg\min_\beta\Big\{\sum_{i=1}^nE\Big[\log\frac{f(y_i \mid x_i, \beta^*)}{f(y_i \mid x_i, \beta)}\Big] + \lambda_n\sum_j\hat w_j|\beta_j|\Big\}. \qquad (4-1)
$$


Now,sincethevectoroftrueregressioncoecientsisunknown,sowereplaceitby^,whichistheordinaryleastsquares(ols)estimatorof.Hence, ^KL=argmin(hX^)]TJ /F5 11.955 Tf 11.95 0 Td[(XiThX^)]TJ /F5 11.955 Tf 11.95 0 Td[(Xi+nXj^wjjjj).(4{2)Thisisaconvexoptimizationprobleminaswehaveusedconvexpenaltyhereandhencethelocalminimizer^KListheuniqueglobalKLadaptivelassoestimator(fornon-convexpenalties,however,thelocalminimizermaynotbegloballyunique).LetAn=nj:^KLj6=0o.Wehaveshownthatwithaproperchoiceofn,theKLadaptivelassoenjoystheoracleproperties. Theorem4. Supposethatn=p n!0andnn()]TJ /F4 7.97 Tf 6.59 0 Td[(1)=2!1.Then,theKLadaptivelassoestimatesmustsatisfythefollowing:1.Consistencyinvariableselection:limP(An=A)=1.2.Asymptoticnormality:p n^KLA)]TJ /F8 11.955 Tf 11.95 0 Td[(A!dN)]TJ /F21 11.955 Tf 5.48 -9.68 Td[(0,C)]TJ /F4 7.97 Tf 6.59 0 Td[(111. Proof. Werstprovetheasymptoticnormalitypart.Let=+u p n,and n(u)=X^)]TJ /F5 11.955 Tf 11.96 0 Td[(X+u p nTX^)]TJ /F5 11.955 Tf 11.95 0 Td[(X+u p n+npXj=1^wjj+uj p n.Let^u(n)=argmin n(u);then^KL=+^u(n) p n,or,^u(n)=p n(^KL)]TJ /F8 11.955 Tf 11.95 0 Td[().Now,dene: V(n)(u)= n(u))]TJ /F7 11.955 Tf 11.95 0 Td[( n(0)=uT1 nXTXu)]TJ /F1 11.955 Tf 11.95 0 Td[(2uTXTX p n(^)]TJ /F8 11.955 Tf 11.95 0 Td[()+n p npXj=1^wjp nj+uj p n)]TJ /F10 11.955 Tf 11.96 10.16 Td[(j. (4{3) 55


Now, since $\hat\beta$ is the ols estimate of $\beta^*$, $\sqrt n(\hat\beta - \beta^*) \to_d N(0, \sigma^2C^{-1})$, and then by assumption (b) we get $\big(\frac{1}{n}X^TX\big)\sqrt n(\hat\beta - \beta^*) \to_d W$, where $W \sim N(0, \sigma^2C)$. Now, consider the limiting behavior of the third term in (4-3). If $\beta_j^* \neq 0$, then $\hat w_j \to_p |\beta_j^*|^{-\gamma}$ (because $\hat\beta_j \to_p \beta_j^*$, and then by the continuous mapping theorem) and $\sqrt n\big(|\beta_j^* + u_j/\sqrt n| - |\beta_j^*|\big) \to u_j\,\mathrm{sgn}(\beta_j^*)$. Then, by Slutsky's theorem, we have
$$
\frac{\lambda_n}{\sqrt n}\,\hat w_j\,\sqrt n\Big(\Big|\beta_j^* + \frac{u_j}{\sqrt n}\Big| - |\beta_j^*|\Big) \to_p 0,
$$
since by one of the assumptions of this theorem $\lambda_n/\sqrt n \to 0$. If $\beta_j^* = 0$, then $\sqrt n\big(|\beta_j^* + u_j/\sqrt n| - |\beta_j^*|\big) = |u_j|$ and
$$
\frac{\lambda_n}{\sqrt n}\,\hat w_j = \frac{\lambda_n}{\sqrt n}\,\frac{n^{\gamma/2}}{|\sqrt n\,\hat\beta_j|^\gamma} = \lambda_n n^{(\gamma - 1)/2}\,\frac{1}{|\sqrt n\,\hat\beta_j|^\gamma},
$$
where $\sqrt n\,\hat\beta_j = O_p(1)$. By one of the other assumptions of this theorem, $\lambda_n n^{(\gamma - 1)/2} \to \infty$, so the third-term summand converges in probability to 0 if $u_j = 0$ and to $\infty$ if $u_j \neq 0$. Hence, we summarize the results as follows:
$$
\frac{\lambda_n}{\sqrt n}\,\hat w_j\,\sqrt n\Big(\Big|\beta_j^* + \frac{u_j}{\sqrt n}\Big| - |\beta_j^*|\Big) \to_p
\begin{cases}
0 & \text{if } \beta_j^* \neq 0; \\
0 & \text{if } \beta_j^* = 0 \text{ and } u_j = 0; \\
\infty & \text{if } \beta_j^* = 0 \text{ and } u_j \neq 0.
\end{cases}
$$
Thus, again by Slutsky's theorem, we see that $V^{(n)}(u) \to_d V(u)$ for every $u$, where $V(u) = u_A^TC_{11}u_A - 2u_A^TW_A$ if $u_j = 0$ for all $j \notin A$, and $V(u) = \infty$ otherwise. Now, note that $V^{(n)}$ is convex and the unique minimum of $V$ is attained at $(C_{11}^{-1}W_A, 0)^T$. Following the epi-convergence results of Geyer (1994) [47] and Knight and Fu (2000) [28], we have $\hat u_A^{(n)} \to_d C_{11}^{-1}W_A$ and $\hat u_{A^c}^{(n)} \to_d 0$. Finally, we observe that $W_A \sim N(0, \sigma^2C_{11})$ with $\sigma^2 = 1$, so that $C_{11}^{-1}W_A \sim N(0, C_{11}^{-1})$; this proves the asymptotic normality part.


Now we show the consistency part. For every $j \in A$, the asymptotic normality result indicates that $\hat\beta_j^{KL} \to_p \beta_j^*$; thus $P(j \in A_n) \to 1$. Then it suffices to show that for every $j' \notin A$, $P(j' \in A_n) \to 0$. Consider the event $j' \in A_n$. Then, by the KKT optimality conditions, we know that
$$
2\,x_{j'}^T\big(X\hat\beta - X\hat\beta^{KL}\big) = \lambda_n\,\hat w_{j'}.
$$
Now, note that
$$
\frac{\lambda_n\,\hat w_{j'}}{\sqrt n} = \lambda_n n^{(\gamma - 1)/2}\,\frac{1}{|\sqrt n\,\hat\beta_{j'}|^\gamma} \to \infty,
$$
since $\lambda_n n^{(\gamma - 1)/2} \to \infty$ and $\sqrt n\,\hat\beta_{j'} = O_p(1)$ (as $j' \notin A$, i.e., $\beta_{j'}^* = 0$), whereas
$$
\frac{2\,x_{j'}^T(X\hat\beta - X\hat\beta^{KL})}{\sqrt n} = 2\,x_{j'}^TX\,\frac{\sqrt n(\hat\beta - \hat\beta^{KL})}{n} \to_d \text{some normal distribution},
$$
since $\sqrt n(\hat\beta - \hat\beta^{KL}) = \sqrt n(\hat\beta - \beta^*) - \sqrt n(\hat\beta^{KL} - \beta^*)$ is the difference of two sequences converging in distribution to normal random variables. Thus,
$$
P(j' \in A_n) \le P\big[2\,x_{j'}^T\big(X\hat\beta - X\hat\beta^{KL}\big) = \lambda_n\,\hat w_{j'}\big] \to 0.
$$
This completes the proof.

4.3 Computations

In this section we discuss computational issues. The KL adaptive lasso estimates in (4-2) can be computed by the LARS algorithm (Efron et al. (2004) [48]). The computational details are given in Algorithm 1.

Algorithm 1 (The LARS algorithm for the KL adaptive lasso).
1. Define $x_j^{**} = x_j/\hat w_j$, $j = 1, \ldots, p$.
2. Evaluate $\hat y = \sum_{j=1}^p x_j\hat\beta_j$, the fitted values based on the pilot estimate $\hat\beta$.
3. Solve the following lasso problem for all $\lambda_n$:
$$
\hat\beta^{**} = \arg\min_\beta\Big\|\hat y - \sum_{j=1}^px_j^{**}\beta_j\Big\|^2 + \lambda_n\sum_{j=1}^p|\beta_j|.
$$
4. Output $\hat\beta_j^{KL} = \hat\beta_j^{**}/\hat w_j$, $j = 1, \ldots, p$.

Tuning is an important issue in practice. Suppose that we use $\hat\beta(\text{ols})$ to construct the adaptive weights in the KL adaptive lasso; we then want to find an optimal pair $(\gamma, \lambda_n)$. We can use two-dimensional cross-validation to tune the KL adaptive lasso. Note that for a given $\gamma$, we can use cross-validation along with the LARS algorithm to exclusively search for the optimal $\lambda_n$.


In principle, we can also replace $\hat\beta(\text{ols})$ with other consistent estimators. Hence we can treat it as the third tuning parameter and perform three-dimensional cross-validation to find an optimal triple $(\hat\beta, \gamma, \lambda_n)$. We suggest using $\hat\beta(\text{ols})$ unless collinearity is a concern, in which case we can try $\hat\beta(\text{ridge})$ from the best ridge regression fit, because it is more stable than $\hat\beta(\text{ols})$.

4.4 Numerical Example: Diabetes Data

From Efron et al. (2004) [48], we have: "Ten baseline variables, age, sex, body mass index (bmi), average blood pressure (map), and six blood serum measurements (tc, ldl, hdl, tch, ltg, glu) were obtained for each of n = 442 diabetes patients, as well as the response of interest (y), a quantitative measure of disease progression one year after baseline." Applying our KL adaptive lasso variable selection methodology to these data, we obtain the following regression coefficient estimates for the predictor variables:

age    0.0000
sex -201.6889
bmi  540.5075
map  314.5000
tc  -514.2194
ldl  268.7080
hdl    0.0000
tch  119.6211
ltg  682.5415
glu    0.0000

It is clear from the above estimates that the variables age, hdl, and glu do not have any significant influence on the response. Hence, we select only the remaining 7 variables to predict the response. This supports the result obtained from the adaptive lasso technique applied to the same data.
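For concreteness, a minimal sketch of Algorithm 1 on the diabetes data follows, assuming scikit-learn and NumPy are available. The values $\gamma = 1$ and the penalty level are illustrative placeholders rather than the cross-validated choices behind the estimates reported above, so the resulting coefficients will differ somewhat.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoLars, LinearRegression

X, y = load_diabetes(return_X_y=True)
X = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize as in Section 4.1.1
y = y - y.mean()

ols = LinearRegression(fit_intercept=False).fit(X, y)
gamma, lam = 1.0, 0.05                          # illustrative tuning values
w = 1.0 / np.abs(ols.coef_) ** gamma            # adaptive weights

X_star = X / w                                  # Step 1: rescale the covariates
y_hat = X @ ols.coef_                           # Step 2: the KL "response" X beta_ols
fit = LassoLars(alpha=lam, fit_intercept=False).fit(X_star, y_hat)  # Step 3: LARS
beta_kl = fit.coef_ / w                         # Step 4: transform back

for name, b in zip(load_diabetes().feature_names, beta_kl):
    print(f"{name:>4s} {b:10.4f}")
```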


4.5 Further Extension

Having shown the oracle properties of the KL adaptive lasso in linear regression models, we have further extended the theory and methodology to generalized linear models (GLMs). We consider the penalized KL divergence loss function using the adaptively weighted $\ell_1$ penalty, where the density belongs to the exponential family with canonical parameter $\theta$. The generic density form can be written as (McCullagh and Nelder (1989) [49])
$$
f(y \mid x, \theta) = h(y)\exp\big(y\theta - \psi(\theta)\big).
$$
Generalized linear models assume that $\theta = x^T\beta$. Suppose that $\hat\beta$ is the maximum likelihood estimate (mle) of $\beta$ in the GLM. We construct the weight vector $\hat w = 1/|\hat\beta|^\gamma$ for some $\gamma > 0$. Now, suppose that $f(y_i \mid x_i, \beta^*)$ and $f(y_i \mid x_i, \beta)$ are the exponential family densities evaluated at $\beta^*$ and $\beta$, respectively. Note that $E(y_i) = \psi'(x_i^T\beta^*)$. Then the KL adaptive lasso estimates $\hat\beta^{KL(glm)}$ are given by
$$
\hat\beta^{KL(glm)} = \arg\min_\beta\Big\{\sum_{i=1}^nE\Big[\log\frac{f(y_i \mid x_i, \beta^*)}{f(y_i \mid x_i, \beta)}\Big] + \lambda_n\sum_j\hat w_j|\beta_j|\Big\}
= \arg\min_\beta\Big\{\sum_{i=1}^n\Big[\psi'(x_i^T\beta^*)(x_i^T\beta^* - x_i^T\beta) - \psi(x_i^T\beta^*) + \psi(x_i^T\beta)\Big] + \lambda_n\sum_j\hat w_j|\beta_j|\Big\}. \qquad (4-4)
$$
Now, we need to replace the true but unknown $\beta^*$ by a root-$n$-consistent estimator of $\beta^*$. Hence, we replace $\beta^*$ by $\hat\beta(\text{mle})$ in the KL divergence loss function. Thus, equation (4-4) becomes
$$
\hat\beta^{KL(glm)} = \arg\min_\beta\Big\{\sum_{i=1}^n\Big[\psi'(x_i^T\hat\beta)(x_i^T\hat\beta - x_i^T\beta) - \psi(x_i^T\hat\beta) + \psi(x_i^T\beta)\Big] + \lambda_n\sum_j\hat w_j|\beta_j|\Big\}. \qquad (4-5)
$$
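The closed form in (4-4) is the standard exponential-family computation: since $h(y)$ cancels in the likelihood ratio and $E_{\beta^*}(y) = \psi'(x^T\beta^*)$,
$$
E_{\beta^*}\Big[\log\frac{f(y \mid x, \beta^*)}{f(y \mid x, \beta)}\Big]
= E_{\beta^*}\big[y(x^T\beta^* - x^T\beta) - \psi(x^T\beta^*) + \psi(x^T\beta)\big]
= \psi'(x^T\beta^*)(x^T\beta^* - x^T\beta) - \psi(x^T\beta^*) + \psi(x^T\beta).
$$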


For logistic regression, (4-5) becomes
$$
\hat\beta^{KL(logistic)} = \arg\min_\beta\Big\{\sum_{i=1}^n\Big[\psi'(x_i^T\hat\beta)(x_i^T\hat\beta - x_i^T\beta) - \log\big(1 + \exp(x_i^T\hat\beta)\big) + \log\big(1 + \exp(x_i^T\beta)\big)\Big] + \lambda_n\sum_j\hat w_j|\beta_j|\Big\}.
$$
For Poisson log-linear regression models, (4-5) becomes
$$
\hat\beta^{KL(poisson)} = \arg\min_\beta\Big\{\sum_{i=1}^n\Big[\psi'(x_i^T\hat\beta)(x_i^T\hat\beta - x_i^T\beta) - \exp(x_i^T\hat\beta) + \exp(x_i^T\beta)\Big] + \lambda_n\sum_j\hat w_j|\beta_j|\Big\}.
$$
Let $L_n(\beta) = \sum_{i=1}^n\big[\psi'(x_i^T\hat\beta)(x_i^T\hat\beta - x_i^T\beta) - \psi(x_i^T\hat\beta) + \psi(x_i^T\beta)\big]$. Then
$$
\frac{\partial^2L_n(\beta)}{\partial\beta\,\partial\beta^T} = \sum_{i=1}^n\psi''(x_i^T\beta)\,x_ix_i^T,
$$
which is positive definite, since the variance function $\psi''(\cdot) > 0$. Now, let $l_n(\beta) = L_n(\beta) + \lambda_n\sum_j\hat w_j|\beta_j|$. Since, in this case, we have used a convex penalty, $l_n(\beta)$ is necessarily convex in $\beta$, and hence the local minimizer $\hat\beta^{KL(glm)}$ is the unique global KL adaptive lasso estimator.

Assume that the true model has a sparse representation. Without loss of generality, let $A = \{j : \beta_j^* \neq 0\} = \{1, 2, \ldots, p_0\}$ with $p_0 < p$, let $I_{11}$ denote the upper-left $p_0 \times p_0$ block of the Fisher information matrix $I(\beta^*)$, and let $A_n = \{j : \hat\beta_j^{KL(glm)} \neq 0\}$. The following analogue of Theorem 4 then holds.

Theorem 5. Suppose that $\lambda_n/\sqrt n \to 0$ and $\lambda_n n^{(\gamma - 1)/2} \to \infty$. Then the KL adaptive lasso estimates for the GLM satisfy the following:
1. Consistency in variable selection: $\lim_n P(A_n = A) = 1$.
2. Asymptotic normality: $\sqrt n\big(\hat\beta_A^{KL(glm)} - \beta_A^*\big) \to_d N\big(0, I_{11}^{-1}\big)$.

Proof. We assume the following regularity conditions:
1. The Fisher information matrix is finite and positive definite, $I(\beta^*) = E\big[\psi''(x^T\beta^*)\,xx^T\big]$.
2. There is a sufficiently large open set $O$ containing $\beta^*$ such that for all $\beta \in O$, $|\psi'''(x^T\beta)| \le M(x) < \infty$ and $E\big[M(x)\,|x_jx_kx_l|\big] < \infty$ for all $1 \le j, k, l \le p$.

We first prove the asymptotic normality part. Recall that
$$
\hat\beta^{KL(glm)} = \arg\min_\beta\Big\{\sum_{i=1}^n\Big[\psi'(x_i^T\hat\beta)(x_i^T\hat\beta - x_i^T\beta) - \psi(x_i^T\hat\beta) + \psi(x_i^T\beta)\Big] + \lambda_n\sum_j\hat w_j|\beta_j|\Big\}.
$$
Let $\beta = \beta^* + \frac{u}{\sqrt n}$, $u \in \mathbb{R}^p$. Define
$$
\Gamma_n(u) = \sum_{i=1}^n\Big[x_i^T\Big(\hat\beta - \beta^* - \frac{u}{\sqrt n}\Big)\psi'(x_i^T\hat\beta) - \psi(x_i^T\hat\beta) + \psi\Big(x_i^T\Big(\beta^* + \frac{u}{\sqrt n}\Big)\Big)\Big] + \lambda_n\sum_j\hat w_j\Big|\beta_j^* + \frac{u_j}{\sqrt n}\Big|
$$
and
$$
\Gamma_n(0) = \sum_{i=1}^n\Big[x_i^T(\hat\beta - \beta^*)\psi'(x_i^T\hat\beta) - \psi(x_i^T\hat\beta) + \psi(x_i^T\beta^*)\Big] + \lambda_n\sum_j\hat w_j|\beta_j^*|.
$$
Let $\hat u_n = \arg\min_u\Gamma_n(u) = \arg\min_u\{\Gamma_n(u) - \Gamma_n(0)\}$. Then $\hat u_n = \sqrt n(\hat\beta^{KL(glm)} - \beta^*)$. Let
$$
H^{(n)}(u) = \Gamma_n(u) - \Gamma_n(0) = \sum_{i=1}^n\Big[-\frac{x_i^Tu}{\sqrt n}\,\psi'(x_i^T\hat\beta) + \psi\Big(x_i^T\beta^* + \frac{x_i^Tu}{\sqrt n}\Big) - \psi(x_i^T\beta^*)\Big] + \frac{\lambda_n}{\sqrt n}\sum_j\hat w_j\,\sqrt n\Big(\Big|\beta_j^* + \frac{u_j}{\sqrt n}\Big| - |\beta_j^*|\Big).
$$
We now expand $\psi$ around $x_i^T\beta^*$.


By a third-order Taylor expansion,
$$
\psi\Big(x_i^T\beta^* + \frac{x_i^Tu}{\sqrt n}\Big) = \psi(x_i^T\beta^*) + \psi'(x_i^T\beta^*)\,\frac{x_i^Tu}{\sqrt n} + \frac{1}{2}\,\psi''(x_i^T\beta^*)\,\frac{u^T(x_ix_i^T)u}{n} + \frac{n^{-3/2}}{6}\,\psi'''(x_i^T\tilde\beta)\,(x_i^Tu)^3,
$$
where $\tilde\beta$ lies between $\beta^*$ and $\beta^* + \frac{u}{\sqrt n}$.


Hence,
$$
H^{(n)}(u) = \sum_{i=1}^n\big(\psi'(x_i^T\beta^*) - \psi'(x_i^T\hat\beta)\big)\frac{x_i^Tu}{\sqrt n} + \frac{1}{2}\sum_{i=1}^n\psi''(x_i^T\beta^*)\frac{u^T(x_ix_i^T)u}{n} + \frac{n^{-3/2}}{6}\sum_{i=1}^n\psi'''(x_i^T\tilde\beta)(x_i^Tu)^3 + \frac{\lambda_n}{\sqrt n}\sum_j\hat w_j\,\sqrt n\Big(\Big|\beta_j^* + \frac{u_j}{\sqrt n}\Big| - |\beta_j^*|\Big)
= A_1^{(n)} + A_2^{(n)} + A_3^{(n)} + A_4^{(n)}, \text{ say}.
$$
Now, expanding $\psi'(x_i^T\hat\beta)$ around $x_i^T\beta^*$,
$$
A_1^{(n)} = -\sum_{i=1}^n\big[\psi'(x_i^T\hat\beta) - \psi'(x_i^T\beta^*)\big]\frac{x_i^Tu}{\sqrt n}
= -\sum_{i=1}^n\big[\psi''(x_i^T\beta^*)\,x_i^T(\hat\beta - \beta^*)\big]\frac{x_i^Tu}{\sqrt n} - \frac{1}{2}\sum_{i=1}^n\big[\psi'''(x_i^T\check\beta)\,\big(x_i^T(\hat\beta - \beta^*)\big)^2\big]\frac{x_i^Tu}{\sqrt n}
= -A_{11}^{(n)} - A_{12}^{(n)}, \text{ say},
$$
where $\check\beta$ lies between $\hat\beta$ and $\beta^*$. Notice that
$$
A_{11}^{(n)} = \sum_{i=1}^n\big[\psi''(x_i^T\beta^*)\,x_i^T(\hat\beta - \beta^*)\big]\frac{x_i^Tu}{\sqrt n} = u^T\Big[\frac{1}{n}\sum_{i=1}^n\psi''(x_i^T\beta^*)\,x_ix_i^T\Big]W_n^0,
$$
where $W_n^0 = \sqrt n(\hat\beta - \beta^*) \to_d W^0$ with $W^0 \sim N\big(0, [I(\beta^*)]^{-1}\big)$. Also, $\frac{1}{n}\sum_{i=1}^n\psi''(x_i^T\beta^*)\,x_ix_i^T \to_p I(\beta^*)$ (by the WLLN). Thus, by Slutsky's theorem, $A_{11}^{(n)} \to_d u^TI(\beta^*)W^0 = u^TW$, where $W \sim N(0, I(\beta^*))$. Now consider $A_{12}^{(n)}$.


$$
A_{12}^{(n)} = \frac{1}{2}\sum_{i=1}^n\big[\psi'''(x_i^T\check\beta)\,\big(x_i^T(\hat\beta - \beta^*)\big)^2\big]\frac{x_i^Tu}{\sqrt n}
= \frac{1}{2\sqrt n}\,{W_n^0}^T\Big[\frac{1}{n}\sum_{i=1}^n\psi'''(x_i^T\check\beta)\,x_ix_i^T\,(x_i^Tu)\Big]W_n^0
\le \frac{1}{2\sqrt n}\,|W_n^0|^T\Big[\frac{1}{n}\sum_{i=1}^nM(x_i)\,x_ix_i^T\,|x_i^Tu|\Big]|W_n^0|,
$$
by regularity condition 2, since $\check\beta \in O$ for large $n$. Also, $W_n^0 = O_p(1)$, since $W_n^0 = \sqrt n(\hat\beta - \beta^*) \to_d W^0$. By the WLLN,
$$
\frac{1}{n}\sum_{i=1}^nM(x_i)\,x_ix_i^T\,|x_i^Tu| \to_p E\big[M(x)\,xx^T|x^Tu|\big].
$$
Now, notice that every entry of the $p \times p$ matrix $E\big[M(x)\,xx^T|x^Tu|\big]$ is finite: for all $1 \le j, k \le p$,
$$
\big(E[M(x)\,xx^T|x^Tu|]\big)_{j,k} = E\big[M(x)\,x_jx_k\,|x^Tu|\big] \le \sum_{l=1}^pE\big[M(x)\,|x_j||x_k||x_l|\,|u_l|\big] = \sum_{l=1}^p|u_l|\,E\big[M(x)\,|x_jx_kx_l|\big] < \infty,
$$
again by regularity condition 2. Hence $\frac{1}{n}\sum_{i=1}^nM(x_i)\,x_ix_i^T\,|x_i^Tu| = O_p(1)$, so $A_{12}^{(n)} \to_p 0$, which implies that $A_1^{(n)} \to_d -u^TW$. For the second term, we observe that $\frac{1}{n}\sum_{i=1}^n\psi''(x_i^T\beta^*)\,x_ix_i^T \to_p I(\beta^*)$; thus, by Slutsky's theorem, $A_2^{(n)} \to_p \frac{1}{2}u^TI(\beta^*)u$. Now, since $\tilde\beta \in O$ for large $n$, regularity condition 2 allows the third term $A_3^{(n)}$ to be bounded as
$$
6\sqrt n\,\big|A_3^{(n)}\big| \le \frac{1}{n}\sum_{i=1}^nM(x_i)\,|x_i^Tu|^3 \to_p E\big[M(x)\,|x^Tu|^3\big] < \infty.
$$


Hence $A_3^{(n)} \to_p 0$. The limiting behavior of the fourth term $A_4^{(n)}$ was already discussed in the proof of Theorem 4; we summarize the results as follows:
$$
\frac{\lambda_n}{\sqrt n}\,\hat w_j\,\sqrt n\Big(\Big|\beta_j^* + \frac{u_j}{\sqrt n}\Big| - |\beta_j^*|\Big) \to_p
\begin{cases}
0 & \text{if } \beta_j^* \neq 0; \\
0 & \text{if } \beta_j^* = 0 \text{ and } u_j = 0; \\
\infty & \text{if } \beta_j^* = 0 \text{ and } u_j \neq 0.
\end{cases}
$$
Thus, by Slutsky's theorem, we see that $H^{(n)}(u) \to_d H(u)$ for every $u$, where
$$
H(u) = \tfrac{1}{2}\,u_A^TI_{11}u_A - u_A^TW_A \quad \text{if } u_j = 0 \ \forall\, j \notin A, \qquad H(u) = \infty \quad \text{otherwise},
$$
with $W \sim N(0, I(\beta^*))$. $H^{(n)}$ is convex, and the unique minimum of $H$ is attained at $(I_{11}^{-1}W_A, 0)^T$. Then we have $\hat u_A^{(n)} \to_d I_{11}^{-1}W_A$ and $\hat u_{A^c}^{(n)} \to_d 0$. Because $W_A \sim N(0, I_{11})$, so that $I_{11}^{-1}W_A \sim N(0, I_{11}^{-1})$, the asymptotic normality part is proven.

Now we show the consistency part. For every $j \in A$, the asymptotic normality indicates that $P(j \in A_n) \to 1$. Then it suffices to show that for every $j' \notin A$, $P(j' \in A_n) \to 0$. Consider the event $j' \in A_n$. By the KKT optimality conditions, we must have
$$
\sum_{i=1}^nx_{ij'}\big[\psi'(x_i^T\hat\beta) - \psi'(x_i^T\hat\beta^{KL(glm)})\big] = \lambda_n\,\hat w_{j'};
$$
thus $P(j' \in A_n) \le P\big(\sum_{i=1}^nx_{ij'}\big[\psi'(x_i^T\hat\beta) - \psi'(x_i^T\hat\beta^{KL(glm)})\big] = \lambda_n\,\hat w_{j'}\big)$. Note that
$$
\frac{1}{\sqrt n}\sum_{i=1}^nx_{ij'}\big[\psi'(x_i^T\hat\beta) - \psi'(x_i^T\hat\beta^{KL(glm)})\big] = B_1^{(n)} + B_2^{(n)} + B_3^{(n)},
$$
with
$$
B_1^{(n)} = \sum_{i=1}^nx_{ij'}\big[\psi'(x_i^T\hat\beta) - \psi'(x_i^T\beta^*)\big]\big/\sqrt n,
$$


$$
B_2^{(n)} = \Big(\frac{1}{n}\sum_{i=1}^nx_{ij'}\,\psi''(x_i^T\beta^*)\,x_i^T\Big)\sqrt n\,\big(\beta^* - \hat\beta^{KL(glm)}\big)
\quad \text{and} \quad
B_3^{(n)} = \frac{1}{2n}\sum_{i=1}^nx_{ij'}\,\psi'''(x_i^T\tilde\beta)\,\big(x_i^T\sqrt n(\beta^* - \hat\beta^{KL(glm)})\big)^2\big/\sqrt n,
$$
where $\tilde\beta$ lies between $\hat\beta^{KL(glm)}$ and $\beta^*$. Now,
$$
B_1^{(n)} = \sum_{i=1}^n\big[\psi''(x_i^T\beta^*)\,x_i^T(\hat\beta - \beta^*)\big]\frac{x_{ij'}}{\sqrt n} + \frac{1}{2}\sum_{i=1}^n\big[\psi'''(x_i^T\check\beta)\,\big(x_i^T(\hat\beta - \beta^*)\big)^2\big]\frac{x_{ij'}}{\sqrt n} = B_{11}^{(n)} + B_{12}^{(n)}, \text{ say}.
$$
Notice that
$$
B_{11}^{(n)} = \Big[\frac{1}{n}\sum_{i=1}^nx_{ij'}\,\psi''(x_i^T\beta^*)\,x_i^T\Big]W_n^0.
$$
Now, $\frac{1}{n}\sum_{i=1}^nx_{ij'}\,\psi''(x_i^T\beta^*)\,x_i^T \to_p I_{j'}$, where $I_{j'}$ is the $j'$th row of $I(\beta^*)$, and $W_n^0 \to_d W^0$. Thus, by Slutsky's theorem, we have $B_{11}^{(n)} \to_d I_{j'}W^0 \sim N\big(0, I_{j'}[I(\beta^*)]^{-1}I_{j'}^T\big)$. Also,
$$
B_{12}^{(n)} = \frac{1}{2}\sum_{i=1}^n\big[\psi'''(x_i^T\check\beta)\,\big(x_i^T(\hat\beta - \beta^*)\big)^2\big]\frac{x_{ij'}}{\sqrt n} = \frac{1}{2\sqrt n}\,{W_n^0}^T\Big[\frac{1}{n}\sum_{i=1}^n\psi'''(x_i^T\check\beta)\,x_ix_i^T\,x_{ij'}\Big]W_n^0
$$


$$
\le \frac{1}{2\sqrt n}\,|W_n^0|^T\Big[\frac{1}{n}\sum_{i=1}^nM(x_i)\,x_ix_i^T\,|x_{ij'}|\Big]|W_n^0|,
$$
by regularity condition 2, since $\check\beta \in O$ for large $n$. Now, we know that $W_n^0 = O_p(1)$, and by the WLLN,
$$
\frac{1}{n}\sum_{i=1}^nM(x_i)\,x_ix_i^T\,|x_{ij'}| \to_p E\big[M(x)\,xx^T|x_{j'}|\big],
$$
which implies $\frac{1}{n}\sum_{i=1}^nM(x_i)\,x_ix_i^T\,|x_{ij'}| = O_p(1)$. Hence $B_{12}^{(n)} \to_p 0$. Next, since $\frac{1}{n}\sum_{i=1}^nx_{ij'}\,\psi''(x_i^T\beta^*)\,x_i^T \to_p I_{j'}$, the asymptotic normality part implies that $B_2^{(n)} \to_d$ some normal random variable, and, arguing exactly as for $A_3^{(n)}$, $B_3^{(n)} \to_p 0$. Meanwhile, we have
$$
\frac{\lambda_n\,\hat w_{j'}}{\sqrt n} = \lambda_n n^{(\gamma - 1)/2}\,\frac{1}{|\sqrt n\,\hat\beta_{j'}|^\gamma} \to_p \infty,
$$
since $\lambda_n n^{(\gamma - 1)/2} \to \infty$ for $\gamma > 0$ by assumption, and since $j' \notin A$ means $\beta_{j'}^* = 0$, so that $\sqrt n(\hat\beta_{j'} - 0) \to_d$ some normal random variable and hence $|\sqrt n\,\hat\beta_{j'}| = O_p(1)$. The normalized left-hand side of the KKT condition thus converges in distribution while the right-hand side diverges; hence $P(j' \in A_n) \to 0$. This completes the proof.
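Algorithm 1 covers only the linear-model case; no specific solver is prescribed here for the GLM objective (4-5). Since the smooth part of (4-5) is convex with a Lipschitz gradient in the logistic case ($\psi'' \le 1/4$), one simple option is a proximal-gradient (ISTA-type) scheme, sketched below under those assumptions. The function name, step-size rule, and iteration count are illustrative choices, and NumPy is assumed.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-np.clip(t, -35.0, 35.0)))

def kl_adaptive_lasso_logistic(X, beta_mle, lam, gamma=1.0, n_iter=5000):
    """Proximal-gradient sketch for the logistic-regression case of (4-5)."""
    w = 1.0 / np.abs(beta_mle) ** gamma            # adaptive weights
    p_hat = sigmoid(X @ beta_mle)                  # psi'(x_i' beta_mle)
    step = 4.0 / np.linalg.norm(X, 2) ** 2         # 1/L, since psi'' <= 1/4
    beta = beta_mle.copy()
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - p_hat)   # gradient of the KL part
        z = beta - step * grad
        # prox of the weighted l1 penalty: coordinatewise soft-thresholding
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)
    return beta
```

The same template applies to the Poisson case by replacing the sigmoid with np.exp, except that $\psi''$ is then unbounded, so a backtracking line search would be needed in place of the fixed step size.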


CHAPTER 5
CONCLUSIONS

The main objective of the second chapter of this dissertation is to provide a third order correct asymptotic expansion of the posterior density for GLM with canonical link function when the number of regressors grows to infinity at a certain rate relative to the growth of the sample size $n$. The results bear the potential for the development of a variety of objective priors in this framework. In particular, the first step towards the development of reference priors, probability matching priors, moment matching priors, and others requires asymptotic expansions of posteriors.

In the third chapter, we have established posterior consistency in the Bayesian lasso model assuming an orthogonal design matrix and a fixed $\sigma_0^2$. In future work, we would like to generalize this result to an arbitrary design matrix and a stochastic variance parameter.

In the fourth chapter, we have proposed the KL adaptive lasso for simultaneous estimation and variable selection. We have shown that the KL adaptive lasso also enjoys the oracle properties by utilizing the adaptively weighted $\ell_1$ penalty. Owing to the efficient path algorithm, the KL adaptive lasso also enjoys the computational advantage of the lasso. Our numerical example has shown that the KL adaptive lasso performs similarly to the adaptive lasso. We would like to extend our KL divergence based variable selection methodology to the high dimensional regime as well as to a survival analysis context, where we have to modify the data to account for censoring, and to investigate the oracle properties specified above in this scenario. We would also like to apply variable selection methodologies to covariance estimation problems. In the future, our interest will focus on the Bayesian versions of some newly developed covariance estimation tools.


REFERENCES

[1] S. N. Bernstein, Theory of Probability. Moscow: GTTI, 1934.

[2] M. Ghosh, "Objective priors: an introduction for frequentists (with discussion)," Statistical Science, vol. 26, pp. 187-211, 2011.

[3] R. A. Johnson, "An asymptotic expansion for posterior distributions," Annals of Mathematical Statistics, vol. 38, pp. 1899-1906, 1967.

[4] ——, "An asymptotic expansion for posterior distributions," Annals of Mathematical Statistics, vol. 42, pp. 1241-1253, 1970.

[5] A. M. Walker, "On the asymptotic behavior of the posterior distribution," Journal of the Royal Statistical Society, Series B, vol. 31, pp. 80-88, 1969.

[6] J. K. Ghosh, B. K. Sinha, and S. N. Joshi, "Expansions for posterior probability and integrated Bayes risk," in Statistical Decision Theory and Related Topics III, S. S. Gupta and J. O. Berger, Eds. New York: Academic Press, 1982, vol. 1, pp. 403-456.

[7] M. Crowder, "Asymptotic expansions of posterior expectations, distributions and densities for stochastic processes," Ann. Inst. Statist. Math., vol. 40, pp. 297-309, 1988.

[8] S. Ghosal, "Normal approximation to the posterior distribution for generalized linear models with many covariates," Mathematical Methods of Statistics, vol. 6, pp. 181-197, 1997.

[9] ——, "Asymptotic normality of posterior distributions in high-dimensional linear models," Bernoulli, vol. 5, pp. 315-331, 1999.

[10] ——, "Asymptotic normality of posterior distributions for exponential families with many parameters," Journal of Multivariate Analysis, vol. 74, pp. 49-69, 2000.

[11] S. Portnoy, "Asymptotic behavior of M-estimators of p regression parameters when p^2/n is large. I: Consistency," Annals of Statistics, vol. 12, pp. 1298-1309, 1984.

[12] ——, "Asymptotic behavior of M-estimators of p regression parameters when p^2/n is large. II: Normal approximation," Annals of Statistics, vol. 13, pp. 1403-1417, 1985.

[13] ——, "Asymptotic behavior of likelihood methods for exponential families when the number of parameters tends to infinity," Annals of Statistics, vol. 16, pp. 356-366, 1988.

[14] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," Annals of Statistics, vol. 32, pp. 928-961, 2004.

[15] C. Zhang, Y. Jiang, and Y. Chai, "Penalized Bregman divergence for large-dimensional regression and classification," Biometrika, vol. 97, pp. 551-566, 2010.


[16] H. Liang and P. Du, "Maximum likelihood estimation in logistic regression models with a diverging number of covariates," Electronic Journal of Statistics, vol. 6, pp. 1838-1846, 2012.

[17] A. van der Vaart, Asymptotic Statistics. Cambridge: Cambridge University Press, 1998.

[18] T. Park and G. Casella, "The Bayesian lasso," Journal of the American Statistical Association, vol. 103, pp. 681-686, 2008.

[19] R. J. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B, vol. 58, pp. 267-288, 1996.

[20] G. Casella, "Empirical Bayes Gibbs sampling," Biostatistics, vol. 2, pp. 485-500, 2001.

[21] M. Kyung, J. Gill, M. Ghosh, and G. Casella, "Penalized regression, standard errors, and Bayesian lassos," Bayesian Analysis, vol. 5, pp. 369-412, 2010.

[22] M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society, Series B, vol. 68, pp. 49-67, 2006.

[23] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B, vol. 67, pp. 301-320, 2005.

[24] J. E. Griffin and P. J. Brown, "Bayesian adaptive lassos with non-convex penalization," Technical Report, 2007.

[25] C. Hans, "Bayesian lasso regression," Biometrika, 2009.

[26] C. M. Carvalho, N. G. Polson, and J. G. Scott, "The horseshoe estimator for sparse signals," Biometrika, vol. 97, pp. 465-480, 2010.

[27] J. E. Griffin and P. J. Brown, "Inference with normal-gamma prior distributions in regression problems," Bayesian Analysis, 2010.

[28] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," Annals of Statistics, 2000.

[29] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, pp. 1348-1360, 2001.

[30] M. Yuan and Y. Lin, "Efficient empirical Bayes variable selection and estimation in linear models," Journal of the American Statistical Association, 2005.

[31] P. Zhao and B. Yu, "On model selection consistency of lasso," J. Mach. Learn. Res., 2006.

[32] H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association, 2006.


[33] H. Zou and R. Li, "One-step sparse estimates in nonconcave penalized likelihood models," The Annals of Statistics, 2008.

[34] E. I. George and R. E. McCulloch, "Variable selection via Gibbs sampling," Journal of the American Statistical Association, 1993.

[35] A. E. Raftery, D. Madigan, and J. A. Hoeting, "Bayesian model averaging for linear regression models," Journal of the American Statistical Association, 1997.

[36] H. Chipman, E. I. George, and R. E. McCulloch, "The practical implementation of Bayesian model selection," IMS Lecture Notes-Monograph Series, vol. 38, 2001.

[37] F. Liang, R. Paulo, G. Molina, M. A. Clyde, and J. O. Berger, "Mixtures of g priors for Bayesian variable selection," Journal of the American Statistical Association, vol. 103, pp. 410-423, 2008.

[38] M. A. Clyde, J. Ghosh, and M. L. Littman, "Bayesian adaptive sampling for variable selection and model averaging," Journal of Computational and Graphical Statistics, vol. 20, no. 1, pp. 80-101, 2011.

[39] A. Armagan, D. B. Dunson, J. Lee, W. U. Bajwa, and N. Strawn, "Posterior consistency in linear models under shrinkage priors," Biometrika (to appear), 2013.

[40] I. M. Johnstone and B. W. Silverman, "Needles and straw in haystacks: empirical Bayes estimates of possibly sparse sequences," Annals of Statistics, 2004.

[41] A. Armagan, D. B. Dunson, and M. Clyde, "Generalized beta mixtures of Gaussians," Advances in Neural Information Processing Systems (NIPS), 2011.

[42] J. Fan and R. Li, "Statistical challenges with high dimensionality: feature selection in knowledge discovery," Proceedings of the Madrid International Congress of Mathematicians, 2006.

[43] N. R. Draper and H. Smith, Applied Regression Analysis (3rd ed.). New York: John Wiley & Sons, 1998.

[44] A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems," Technometrics, 1970.

[45] L. Breiman, "Better subset regression using the nonnegative garrote," Technometrics, 1995.

[46] J. Bernardo and A. F. M. Smith, Bayesian Theory. New York: J. Wiley, 1994.

[47] C. Geyer, "On the asymptotics of constrained M-estimation," The Annals of Statistics, 1994.

[48] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression," The Annals of Statistics, 2004.


[49] P. McCullagh and J. Nelder, Generalized Linear Models (2nd ed.). New York: Chapman & Hall, 1989.


BIOGRAPHICAL SKETCH

Shibasish Dasgupta graduated with a Bachelor of Science in statistics in 2002 from St. Xavier's College, Calcutta, University of Calcutta, India. He received his master's degree in statistics in 2004 and was awarded the Master of Philosophy (M.Phil.) in statistics from the University of Pune, India, in 2005. Prior to coming to the United States to pursue his Ph.D., he worked in industry for almost three years in India, first at the TATA Research Development & Design Center (TRDDC) as a Research Scientist, and then at Persistent Systems Limited (PSL) as a Senior Member of Technical Staff. In fall 2007, he joined the graduate program in the Department of Statistics, University of Florida, Gainesville. He has been a teaching assistant and has also served as the sole instructor of some of the core undergraduate/graduate level courses in the Statistics Department at the University of Florida. His research interests include statistical learning and Bayesian theory and methodology, particularly as related to topics in high-dimensional inference and variable selection, as well as the frequentist performance of Bayesian methods.