Bayes and Empirical Bayes Benchmarking for Small Area Estimation


Material Information

Title:
Bayes and Empirical Bayes Benchmarking for Small Area Estimation
Physical Description:
1 online resource (86 p.)
Language:
english
Creator:
Steorts, Rebecca Carter
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

Thesis/Dissertation Information

Degree:
Doctorate (Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Statistics
Committee Chair:
Ghosh, Malay
Committee Members:
Doss, Hani
Daniels, Michael J
Garvan, Cynthia W

Subjects

Subjects / Keywords:
area-level -- bayes -- benchmarking -- bootstrap -- empirical-bayes -- fay-herriot -- small-area
Statistics -- Dissertations, Academic -- UF
Genre:
Statistics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
Small area estimation has become increasingly popular due to growing demand for such statistics. In order to produce estimates of adequate precision for these small areas, it is often necessary to borrow strength from other related areas. The resulting model-based estimates may not aggregate to the more reliable direct estimates at the higher level, which may be politically problematic. Adjusting model-based estimates to correct this problem is known as benchmarking. In this dissertation, we propose a general class of benchmarked Bayes estimators that can be expressed in the form of a Bayesian adjustment applicable to any estimator, linear or nonlinear. We also derive a second set of estimators under an additional constraint that benchmarks the weighted variability. We illustrate this work using U.S. Census Bureau data. Furthermore, we extend one-stage benchmarking to a two-stage procedure and illustrate this using data from the National Health Interview Survey. Finally, we obtain the benchmarked empirical Bayes (EB) estimator. Our goal is to see how much mean squared error (MSE) is lost due to benchmarking. Furthermore, we find an asymptotically unbiased estimator of this MSE and compare it to the second-order approximation of the MSE of the EB estimator, or equivalently of the MSE of the empirical best linear unbiased predictor (EBLUP), which was derived by Prasad and Rao (1990). Moreover, using methods similar to those of Butar and Lahiri (2003), we compute a parametric bootstrap estimator of the MSE of the benchmarked EB estimator under the Fay-Herriot model and compare it to the MSE of the benchmarked EB estimator found by a second-order approximation. Finally, we illustrate our methods using SAIPE data from the U.S. Census Bureau.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Rebecca Carter Steorts.
Thesis:
Thesis (Ph.D.)--University of Florida, 2012.
Local:
Adviser: Ghosh, Malay.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2014-05-31

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2012
System ID:
UFE0043927:00001




Full Text

PAGE 1

BAYES AND EMPIRICAL BAYES BENCHMARKING FOR SMALL AREA ESTIMATION

By

REBECCA C. STEORTS

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2012

PAGE 2

© 2012 Rebecca C. Steorts

PAGE 3

I would like to dedicate my thesis to my parents and my family. Without their constant support and encouragement, I would not have made it to where I am today.

Secondly, I would like to dedicate this body of work to Malay Ghosh, my advisor. It has been an honor to work with someone who has dedicated his entire life to research, teaching, and enriching the lives of students. It is my greatest wish to be able to mentor students and contribute to statistical research in the way that he has.

Finally, I dedicate my thesis to all of the faculty, students, staff, and friends that made this dissertation possible. I have been humbled by this truly amazing journey.

PAGE 4

ACKNOWLEDGMENTS

I would like to thank my advisor, Malay Ghosh, for being a constant daily reminder of what kind of researcher and teacher I want to be when I graduate from Florida. I have been incredibly grateful for the opportunity to learn under him and see my development in research grow from his endless knowledge, not just in small area estimation but in all areas of statistics. Finally, I greatly admire his work ethic, enthusiasm for learning, and humility, which in return have shaped who I am not only as a researcher but also as a person who wants to serve my community. I will never forget what Professor Ghosh has done for me the past five years at the University of Florida.

I would also like to thank Hani Doss for being one of the best professors I have had in my graduate career and for pushing me to excel and stick with my determination to get my PhD. Professor Doss' honesty and opinions on decisions in my career have been an invaluable asset to me. I am extremely grateful to him for all the time, energy, and encouragement he has given me over the years at the University of Florida. Furthermore, I would like to thank Mike Daniels for his insight that has led to my work improving.

Finally, I would like to thank my parents for constantly being loving, supportive, and encouraging of my goals and dreams. I would not be the person I am today without them, and I would not be as far along in my academic career without their constant encouragement of science, mathematics, and statistics since I was very young.

The research was partially supported by NSF Grant SES1026165 and the United States Census Bureau Dissertation Fellowship Program. The views expressed here are those of the authors and do not reflect those of the United States Census Bureau.

PAGE 5

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS ..... 4
LIST OF TABLES ..... 7
LIST OF FIGURES ..... 8
ABSTRACT ..... 9

CHAPTER

1 INTRODUCTION ..... 11

2 LITERATURE REVIEW ..... 13
2.1 Small Area Estimation ..... 13
2.2 Area-Level Models ..... 13
2.3 Unit-Level Models ..... 16
2.4 BLUP and EBLUP ..... 17
2.5 MSE of the BLUP and EBLUP ..... 19
2.6 Benchmarking ..... 21
2.7 Constrained Bayes Estimation ..... 26
2.8 Proposed Research ..... 27

3 ONE-STAGE BENCHMARKING ..... 29
3.1 Development of the Estimators ..... 29
3.2 Relationship with Some Existing Estimators ..... 33
3.3 Benchmarking with Both Mean and Variability Constraints ..... 35
3.4 An Illustrative Example ..... 38

4 TWO-STAGE BENCHMARKING ..... 43
4.1 Two-Stage Benchmarking Results ..... 43
4.2 An Example ..... 49

5 ON ESTIMATION OF MSEs OF BENCHMARKED EB ESTIMATORS ..... 55
5.1 Benchmarked EB Estimators ..... 56
5.2 Second-Order Approximation to MSE ..... 57
5.3 Estimator of MSE Approximation ..... 66
5.4 Parametric Bootstrap Estimator of the Benchmarked EB Estimator ..... 68
5.5 An Application ..... 69

6 SUMMARY REMARKS AND FUTURE WORK ..... 77

APPENDIX: LEMMAS FOR CHAPTER 5 ..... 79

PAGE 6

REFERENCES ..... 83
BIOGRAPHICAL SKETCH ..... 86

PAGE 7

LIST OF TABLES

Table page

3-1 Benchmarking Statistics for ASEC CPS ..... 40
4-1 Table of estimates using Theorem 1 ..... 52
5-1 Table of estimates for 2000 ..... 73
5-2 Table of estimates for 1997 ..... 75

PAGE 8

LIST OF FIGURES

Figure page

3-1 Change in Estimators due to Benchmarking for 1998 ..... 42
3-2 Change in Estimators due to Benchmarking for 2001 ..... 42

PAGE 9

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

BAYES AND EMPIRICAL BAYES BENCHMARKING FOR SMALL AREA ESTIMATION

By

Rebecca C. Steorts

May 2012

Chair: Malay Ghosh
Major: Statistics

Small area estimation has become increasingly popular due to growing demand for such statistics. In order to produce estimates of adequate precision for these small areas, it is often necessary to borrow strength from other related areas. The resulting model-based estimates may not aggregate to the more reliable direct estimates at the higher level, which may be politically problematic. Adjusting model-based estimates to correct this problem is known as benchmarking.

In this dissertation, we propose a general class of benchmarked Bayes estimators that can be expressed in the form of a Bayesian adjustment applicable to any estimator, linear or nonlinear. We also derive a second set of estimators under an additional constraint that benchmarks the weighted variability. We illustrate this work using U.S. Census Bureau data. Furthermore, we extend one-stage benchmarking to a two-stage procedure and illustrate this using data from the National Health Interview Survey. Finally, we obtain the benchmarked empirical Bayes (EB) estimator. Our goal is to see how much mean squared error (MSE) is lost due to benchmarking. Furthermore, we find an asymptotically unbiased estimator of this MSE and compare it to the second-order approximation of the MSE of the EB estimator, or equivalently of the MSE of the empirical best linear unbiased predictor (EBLUP), which was derived by Prasad and Rao (1990). Moreover, using methods similar to those of Butar and Lahiri (2003), we compute a parametric bootstrap estimator of the MSE of the benchmarked EB estimator under

PAGE 10

the Fay-Herriot model and compare it to the MSE of the benchmarked EB estimator found by a second-order approximation. Finally, we illustrate our methods using SAIPE data from the U.S. Census Bureau.

PAGE 11

CHAPTER 1
INTRODUCTION

Small area estimation has become increasingly popular over the past decade due to a growing need for more reliable small area statistics. Estimation based on direct estimators usually has large standard errors and coefficients of variation. In order to produce estimates for small areas, it is necessary to borrow strength from related areas to form indirect estimators that increase the effective sample size and, hence, increase the precision.

It is widely known that small area estimation needs explicit (or at least implicit) use of models that provide a link to related small areas through supplementary data such as census counts or other administrative data. Model-based estimates often differ widely from the direct estimates, especially for areas with small sample sizes. When aggregated, the model-based estimates are often very different from the overall estimates for the larger geographical areas based on the direct estimates, where the latter are believed to be quite reliable. Furthermore, an overall agreement with the direct estimates at an aggregate level may sometimes be politically necessary to convince legislators of the utility of small area estimators.

We address this problem by benchmarking, which requires the aggregated model-based estimates to match the corresponding direct estimate at a higher level. Model-based estimates do not benchmark to reliable direct survey estimates for large areas. Because of this behavior, and to protect against possible model failure and overshrinkage, we benchmark the model-based estimates in such a way that they add up to the direct estimates for the larger area. We propose constrained Bayes procedures which meet the objective of benchmarking the weighted mean as well as the weighted variability. We extend this idea to two-stage benchmarking using a single model. Finally, we obtain the benchmarked empirical Bayes estimator under specific modeling assumptions and some mild regularity conditions. A subsequent goal is to see how much

PAGE 12

mean squared error (MSE) is lost due to benchmarking. Furthermore, we find an asymptotically unbiased estimator of this MSE and compare it to the second-order approximation of the MSE of the EB estimator, or equivalently of the MSE of the empirical best linear unbiased predictor (EBLUP), which was derived by Prasad and Rao (1990). Moreover, using methods similar to those of Butar and Lahiri (2003), we compute a parametric bootstrap estimator of the MSE of the benchmarked EB estimator under the Fay-Herriot model and compare it to the MSE of the benchmarked EB estimator found by a second-order approximation. Finally, we illustrate our methods using SAIPE data from the U.S. Census Bureau.

Before proceeding with our proposed methods, we first discuss the literature that greatly motivates our work.

PAGE 13

CHAPTER 2
LITERATURE REVIEW

Before presenting our proposed methods on Bayes and empirical Bayes benchmarking, we first introduce small area estimation and present the basic area- and unit-level models. We then provide some background on the best linear unbiased predictor (BLUP) and the empirical BLUP as well as the MSE of each of these quantities of interest. Finally, we explain the idea of benchmarking and provide a review of many of the previously proposed benchmarked estimators in the literature.

2.1 Small Area Estimation

Sample surveys are generally constructed so that they yield reliable direct estimates for large areas or domains. However, in small areas, direct estimators tend to lead to large standard errors and coefficients of variation due to small sample sizes in the domains. Thus, in finding estimates for small areas, it is necessary to borrow strength from neighboring or related areas, which increases the effective sample size and hence the precision of the estimates. It is well known now that the estimates are based on either explicit (or implicit) models that link the small areas through data (such as census counts or administrative records).

2.2 Area-Level Models

You and Rao (2002) and Rao (2003) provided the best explanation for area-level models. Small area models are classified into either area-level or unit-level models. We first cover area-level models. A basic area-level model assumes that some function $\theta_i = g(\mu_i)$ of the small area mean $\mu_i$ is related to the area-specific auxiliary data $x_i = (x_{i1}, \dots, x_{ip})^T$ through a linear model with area-specific random effect $u_i$:

$$\theta_i = g(\mu_i) = x_i^T \beta + u_i, \quad i = 1, \dots, m. \qquad (2.2.1)$$

PAGE 14

Note that $\beta$ is a $p \times 1$ vector of known regression parameters, $u_i \stackrel{iid}{\sim} (0, \sigma_u^2)$, and $m$ is the number of small areas. Normality of the random effects $u_i$ is often taken as a common assumption. Throughout our work, we take $g(\cdot)$ as the identity function.

In the basic area-level model, we assume the direct estimate $\hat{\theta}_i$ of the small area mean $\theta_i$ is available whenever the area sample size $n_i \geq 1$. The direct estimate $\hat{\theta}_i$ is usually design-unbiased, meaning $E(\hat{\theta}_i) = \theta_i$. It is usual to assume

$$\hat{\theta}_i = \theta_i + e_i, \quad i = 1, \dots, m, \qquad (2.2.2)$$

where the $e_i$ denote the sampling errors associated with the transformed direct estimator $\hat{\theta}_i$. It is usual to assume that the $e_i$ are independent normal random variables with $E(e_i \mid \theta_i) = 0$ and $\mathrm{Var}(e_i \mid \theta_i) = D_i$. When we combine the two models, they produce a linear mixed effects model. That is, we obtain the linear mixed model of Fay and Herriot (1979):

$$\hat{\theta}_i = x_i^T \beta + u_i + e_i, \quad i = 1, \dots, m. \qquad (2.2.3)$$

This model involves design-based random variables $e_i$ as well as model-based random variables $u_i$. Methods in empirical Bayes (EB) or hierarchical Bayes (HB) can employ the Fay-Herriot model to obtain improved model-based small area estimates of $\theta_i$ and $\mu_i$ (Ghosh and Rao (1994); and Rao (1999)).

We finally note that $E(e_i \mid \theta_i) = 0$ may not be a valid assumption if the sample size $n_i$ is too small and $\theta_i$ is a nonlinear function of $\mu_i$. We also mention that the sampling variance $D_i$ in the Fay-Herriot model is usually assumed to be known. This is needed in order to avoid any identifiability problems.
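As a quick concrete illustration of the two levels of the Fay-Herriot model (2.2.3), the sketch below (ours, not part of the dissertation) simulates direct estimates from an intercept-only version of the model, $x_i^T \beta = \beta$; the number of areas, $\beta$, $\sigma_u^2$, and the $D_i$ are all made-up values.

```python
import random

random.seed(7)

m = 10           # number of small areas (made up)
beta = 5.0       # intercept-only regression part x_i^T beta
sigma2_u = 4.0   # model variance of the random effects u_i
D = [1.0 + 0.5 * i for i in range(m)]   # known sampling variances D_i

# true small area means: theta_i = beta + u_i, with u_i ~ N(0, sigma2_u)
theta = [beta + random.gauss(0.0, sigma2_u ** 0.5) for _ in range(m)]
# direct estimates: theta_hat_i = theta_i + e_i, with e_i ~ N(0, D_i)
theta_hat = [t + random.gauss(0.0, d ** 0.5) for t, d in zip(theta, D)]

for i in range(3):
    print(f"area {i}: theta_i = {theta[i]:6.2f}, direct estimate = {theta_hat[i]:6.2f}")
```

Areas with larger $D_i$ yield noisier direct estimates, which is exactly why borrowing strength through the model is attractive.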

PAGE 15

First, assuming normality and known $\sigma_u^2$, the basic area-level model can be written as

$$\hat{\theta}_i \mid \theta_i \stackrel{ind}{\sim} N(\theta_i, D_i), \qquad (2.2.4)$$
$$\theta_i \mid \beta, \sigma_u^2 \stackrel{ind}{\sim} N(x_i^T \beta, \sigma_u^2), \quad i = 1, \dots, m,$$
$$\pi(\beta) \propto 1.$$

If we assume that $\sigma_u^2$ is unknown, we simply replace the last line in model (2.2.4) with

$$\pi(\beta, \sigma_u^2) \propto \pi(\beta)\, \pi(\sigma_u^2). \qquad (2.2.5)$$

Furthermore, when $\sigma_u^2$ is known and $\pi(\beta) \propto 1$, the hierarchical Bayes (HB) and best linear unbiased predictor (BLUP) methods (under normality) lead to the same point estimates and measures of variability. This will be explained more in Section 2.4.

We now consider empirical Bayes (EB) estimation for area-level models. Assuming normality, the two-stage HB model is

$$\hat{\theta}_i \mid \theta_i \stackrel{ind}{\sim} N(\theta_i, D_i), \quad i = 1, \dots, m, \qquad (2.2.6)$$
$$\theta_i \stackrel{ind}{\sim} N(x_i^T \beta, \sigma_u^2).$$

The optimal estimate of $\theta_i$ is given by the Bayes estimate

$$\hat{\theta}_i^B = E\big[\theta_i \mid \hat{\theta}_i, \beta, \sigma_u^2\big] = B_i \hat{\theta}_i + (1 - B_i) x_i^T \beta,$$

where $B_i = \sigma_u^2 (\sigma_u^2 + D_i)^{-1}$. This follows from

$$\theta_i \mid \hat{\theta}_i, \beta, \sigma_u^2 \stackrel{ind}{\sim} N(\hat{\theta}_i^B, B_i D_i).$$

The estimate $\hat{\theta}_i^B$ is the Bayes estimate under squared error loss and is optimal in the sense that its MSE is smaller than the MSE of any other estimator of $\theta_i$, linear or nonlinear. The Bayes estimate depends on $\beta$ and $\sigma_u^2$, which must be estimated from the

PAGE 16

marginal distribution

$$\hat{\theta}_i \stackrel{ind}{\sim} N(x_i^T \beta, \sigma_u^2 + D_i)$$

using maximum likelihood (MLE), restricted maximum likelihood (REML), or the method of moments (MOM). We denote these estimators by $\hat{\sigma}_u^2$ and $\hat{\beta}$. We obtain the EB estimator of $\theta_i$ from $\hat{\theta}_i^B$ by substituting $\hat{\beta}$ for $\beta$ and $\hat{\sigma}_u^2$ for $\sigma_u^2$. Then

$$\hat{\theta}_i^{EB} = \hat{\theta}_i^B(\hat{\beta}, \hat{\sigma}_u^2) = \hat{B}_i \hat{\theta}_i + (1 - \hat{B}_i) x_i^T \hat{\beta}.$$

We note that the EB estimator is identical to the empirical BLUP (EBLUP) estimator $\hat{\theta}_i^H$, which will be explained in detail in Section 2.4.

2.3 Unit-Level Models

Next, we turn to the unit-level model as presented by Battese et al. (1988), Datta and Ghosh (1991), and Rao (2003). We assume unit-specific auxiliary data $x_{ij} = (x_{ij1}, \dots, x_{ijp})^T$ are available for each population element $j$ in each small area $i$. We assume $\hat{\theta}_{ij}$ is related to $x_{ij}$ through the one-fold nested error linear regression model

$$\hat{\theta}_{ij} = x_{ij}^T \beta + u_i + e_{ij}, \quad j = 1, \dots, N_i; \; i = 1, \dots, m. \qquad (2.3.1)$$

The area-specific effects $u_i \stackrel{iid}{\sim} (0, \sigma_u^2)$. We assume $e_{ij} = k_{ij} \tilde{e}_{ij}$, where the $k_{ij}$ are known constants and $\tilde{e}_{ij} \stackrel{iid}{\sim} (0, \sigma_e^2)$ and are independent of the $u_i$. Furthermore, normality of the $u_i$ and $e_{ij}$ is often assumed. We assume a sample $s_i$ of size $n_i$ is taken from the $N_i$ units in each of the $m$ areas, and the sample units are assumed to follow the population model.

We now apply an HB approach to the basic unit-level model, assuming $\sigma_u^2$ and $\sigma_e^2$ are both known. Then

$$\hat{\theta}_{ij} \mid \beta, u_i, \sigma_e^2 \stackrel{ind}{\sim} N\big(x_{ij}^T \beta + u_i,\; k_{ij}^2 \sigma_e^2\big), \quad j = 1, \dots, n_i; \; i = 1, \dots, m, \qquad (2.3.2)$$
$$u_i \mid \sigma_u^2 \stackrel{ind}{\sim} N(0, \sigma_u^2),$$
$$\pi(\beta) \propto 1.$$
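Returning to the area-level EB estimator above, its shrinkage form is easy to verify numerically. This sketch (our own, with made-up inputs, an intercept-only model, and $\beta$, $\sigma_u^2$ treated as known rather than estimated) computes $B_i = \sigma_u^2(\sigma_u^2 + D_i)^{-1}$ and the resulting estimates:

```python
beta = 5.0                    # synthetic part x_i^T beta (made up, treated as known)
sigma2_u = 4.0                # model variance (made up, treated as known)
theta_hat = [8.0, 3.0, 10.0]  # direct estimates (made up)
D = [1.0, 4.0, 16.0]          # increasing sampling variances

B = [sigma2_u / (sigma2_u + d) for d in D]
theta_B = [b * th + (1 - b) * beta for b, th in zip(B, theta_hat)]

for th, b, tb in zip(theta_hat, B, theta_B):
    print(f"direct {th:5.1f}   B_i = {b:.2f}   shrunk estimate = {tb:.2f}")
```

The area with $D_i = 16$ gets $B_i = 0.2$ and is pulled hardest toward the synthetic value $\beta = 5$, while the well-estimated area with $D_i = 1$ stays close to its direct estimate.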

PAGE 17

When $\sigma_u^2$ and $\sigma_e^2$ are both unknown, we replace $\pi(\beta) \propto 1$ with

$$\pi(\beta, \sigma_u^2, \sigma_e^2) \propto \pi(\sigma_u^2)\, \pi(\sigma_e^2)$$

in the HB model (2.3.2).

2.4 BLUP and EBLUP

We first present the basic theory for a general linear mixed model. We suppose that the sample data $\hat{\theta}$ (an $m \times 1$ vector of sample observations) obeys the general linear mixed model

$$\hat{\theta} = X\beta + Zu + e. \qquad (2.4.1)$$

Here $X$ and $Z$ are known $m \times p$ and $m \times h$ matrices, and $\beta$ is a $p \times 1$ vector of regression parameters. Furthermore, $u$ and $e$ are independently distributed with means $0$ and covariances $G$ and $R$ depending on variance parameters $\lambda = (\lambda_1, \dots, \lambda_q)^T$. We assume $\lambda$ belongs to a specified subset of Euclidean $q$-space such that $\mathrm{Var}(\hat{\theta}) = V = V(\lambda) = R + ZGZ^T$ is nonsingular for all $\lambda$ in the subset.

We are interested in estimating a linear combination $\tau = l^T \beta + m^T u$, where $l$ and $m$ are vectors of constants. A linear estimator of $\tau$ is $\hat{\tau} = a^T \hat{\theta} + b$ for known $a$ and $b$. The estimator $\hat{\tau}$ is model-unbiased for $\tau$, meaning that $E_M[\hat{\tau}] = E_M[\tau]$, where $E_M$ represents expectation with respect to model (2.4.1). The MSE of $\hat{\tau}$ is $\mathrm{MSE}[\hat{\tau}] = E[(\hat{\tau} - \tau)^2]$. We are interested in finding the best linear unbiased predictor (BLUP) estimator, which minimizes the MSE in the class of linear unbiased estimators $\hat{\tau}$.

For known $\lambda$, the BLUP estimator of $\tau$ is given by

$$\tilde{\tau}^H = t(\lambda, \hat{\theta}) = l^T \tilde{\beta} + m^T \tilde{u} = l^T \tilde{\beta} + m^T G Z^T V^{-1} (\hat{\theta} - X\tilde{\beta}), \qquad (2.4.2)$$

where

$$\tilde{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} \hat{\theta} \qquad (2.4.3)$$

PAGE 18

is the BLUP of $\beta$. Moreover,

$$\tilde{u} = G Z^T V^{-1} (\hat{\theta} - X\tilde{\beta}) \qquad (2.4.4)$$

is the BLUP of $u$. This theory was proposed by Henderson (1950) and appeared in Robinson (1991), Rao (2003), and others.

The BLUP estimator $\tilde{\tau}^H = t(\lambda, \hat{\theta})$ given in (2.4.2) depends on the variance parameters $\lambda$, which are typically unknown in applications. If we replace $\lambda$ by $\hat{\lambda}$, we obtain a two-stage estimator $\hat{\tau}^H = t(\hat{\lambda}, \hat{\theta})$, which we call the empirical BLUP estimator (EBLUP).

We now present the BLUP and EBLUP estimators for the Fay-Herriot model given in (2.2.4). We define $\tilde{\theta}_i^H$ and $\hat{\theta}_i^H$ as the BLUP and EBLUP estimators of $\theta_i$, respectively. We assume that $D_i$ and $\sigma_u^2$ are known. Model (2.2.3) is a special case of (2.4.1). Using the general form of the model, we find that $Z = I$, $G = \sigma_u^2 I$, $R = \mathrm{Diag}(D_1, \dots, D_m)$, and $V = \mathrm{Diag}(\sigma_u^2 + D_1, \dots, \sigma_u^2 + D_m)$. Also, $\tau_i = \theta_i = x_i^T \beta + u_i$, so that $l^T = X$ and $m^T = I$. Making the appropriate substitutions into (2.4.2), we find that

$$\tilde{\theta}_i^H = B_i \hat{\theta}_i + (1 - B_i) x_i^T \tilde{\beta}, \qquad (2.4.5)$$

where $B_i = \sigma_u^2 (\sigma_u^2 + D_i)^{-1}$ and

$$\tilde{\beta} = \left[ \sum_i x_i x_i^T (\sigma_u^2 + D_i)^{-1} \right]^{-1} \left[ \sum_i x_i \hat{\theta}_i (\sigma_u^2 + D_i)^{-1} \right]. \qquad (2.4.6)$$

It is clear from (2.4.5) that the BLUP estimator $\tilde{\theta}_i^H$ can be expressed as a weighted average of the direct estimator $\hat{\theta}_i$ and the regression-synthetic estimator $x_i^T \tilde{\beta}$, where the weight $B_i$ $(0 \leq B_i \leq 1)$ measures the uncertainty in modeling the $\theta_i$, i.e., the model variance $\sigma_u^2$ relative to the total variance $\sigma_u^2 + D_i$.

The BLUP estimator depends on $\sigma_u^2$, which is unknown in practical applications. Replacing $\sigma_u^2$ by an estimate $\hat{\sigma}_u^2$ results in an empirical BLUP (EBLUP) estimator $\hat{\theta}_i^H$,

PAGE 19

where

$$\hat{\theta}_i^H = \hat{B}_i \hat{\theta}_i + (1 - \hat{B}_i) x_i^T \hat{\beta}. \qquad (2.4.7)$$

Also, $\hat{B}_i$ and $\hat{\beta}$ are the estimated values of $B_i$ and $\tilde{\beta}$ when $\sigma_u^2$ is replaced by $\hat{\sigma}_u^2$.

Finally, when $\sigma_u^2$ is known and model (2.2.4) is assumed, straightforward calculations show that

$$\theta_i \mid \hat{\theta}, \beta, \sigma_u^2 \sim N\big(B_i \hat{\theta}_i + (1 - B_i) x_i^T \beta,\; B_i D_i\big).$$

It can be shown that

$$\beta \mid \hat{\theta} \sim N\big(\tilde{\beta},\; (X^T V^{-1} X)^{-1}\big).$$

Using these two facts and the formulas for iterated expectation and variance, we can show that

$$\theta_i \mid \hat{\theta} \sim N\big(\tilde{\theta}_i^H,\; \mathrm{MSE}(\tilde{\theta}_i^H)\big), \qquad (2.4.8)$$

where $\tilde{\theta}_i^H = B_i \hat{\theta}_i + (1 - B_i) x_i^T \tilde{\beta}$ and $\mathrm{MSE}(\tilde{\theta}_i^H) = B_i D_i + (1 - B_i)^2 x_i^T (X^T V^{-1} X)^{-1} x_i$. This implies that when $\sigma_u^2$ is known and when taking a flat prior for $\beta$, the HB and BLUP approaches under normality lead to the same point estimates and measures of variability. Similarly, when $\sigma_u^2$ is known and when taking a flat prior for $\beta$, the EB and EBLUP approaches under normality lead to the same point estimates and measures of variability.

2.5 MSE of the BLUP and EBLUP

Rao (2003) presented the MSE of the BLUP $\tilde{\theta}_i^H$ assuming the basic area-level model (2.2.3). A more general version of the MSE can be found in Section 6.3 of Rao (2003) for the general mixed linear model. However, for the area-level model, the MSE of $\tilde{\theta}_i^H$ is given by

$$\mathrm{MSE}(\tilde{\theta}_i^H) = E(\tilde{\theta}_i^H - \theta_i)^2 = g_{1i}(\sigma_u^2) + g_{2i}(\sigma_u^2), \qquad (2.5.1)$$

PAGE 20

where $g_{1i}(\sigma_u^2) = B_i D_i$ and $g_{2i}(\sigma_u^2) = (1 - B_i)^2 x_i^T (X^T V^{-1} X)^{-1} x_i$. The first term $g_{1i}(\sigma_u^2)$ is of order $O(1)$, and the second term $g_{2i}(\sigma_u^2)$ is of order $O(m^{-1})$ for large $m$, assuming the following regularity conditions:

(i) $D_i$ and $\sigma_u^2$ are uniformly bounded;
(ii) $\sup_{1 \leq i \leq m} x_i^T (X^T X)^{-1} x_i = O(m^{-1})$.

Recall the BLUP estimator $\tilde{\theta}_i^H$ depends on the variance component $\sigma_u^2$, which is unknown in practical applications. We can replace $\sigma_u^2$ with an estimator $\hat{\sigma}_u^2$, hence obtaining an EBLUP estimator $\hat{\theta}_i^H$ as defined in (2.4.7).

We now turn to the work of Prasad and Rao (1990) and focus on model (2.2.4). We note that the normality assumption is not needed to derive the two-stage estimator $\hat{\theta}_i^H$. We have already mentioned that $\tilde{\theta}_i^H = t_i(\sigma_u^2, \hat{\theta})$ depends on the variance component $\sigma_u^2$, which is generally unknown. We usually estimate $t_i(\sigma_u^2, \hat{\theta})$ by replacing $\sigma_u^2$ with an asymptotically consistent estimator. Prasad and Rao (1990) presented an unbiased quadratic estimator of $\sigma_u^2$ in the model (2.2.4):

$$\tilde{\sigma}_u^2 = (m - p)^{-1} \left[ \sum_{i=1}^m \hat{u}_i^2 - \sum_{i=1}^m D_i \big(1 - x_i^T (X^T X)^{-1} x_i\big) \right], \qquad (2.5.2)$$

where $\hat{u}_i = \hat{\theta}_i - x_i^T \hat{\beta}$ and we now take $\hat{\beta} = (X^T X)^{-1} X^T \hat{\theta}$. The two-stage estimator $t_i(\hat{\sigma}_u^2, \hat{\theta}_i)$ is an empirical Bayes estimator of $\theta_i$ under normality (Fay and Herriot (1979)).

We add that it is possible for $\tilde{\sigma}_u^2$ to take positive or negative values. Hence, we instead define a new estimator $\hat{\sigma}_u^2 = \max\{\tilde{\sigma}_u^2, 0\}$. This ensures that the two-stage estimator $\hat{\theta}_i^H$ will have finite expectation. It may also be noted that $P(\tilde{\sigma}_u^2 \leq 0)$ tends to zero as $m \to \infty$.

Prasad and Rao (1990) obtained estimators of the MSE approximation under normality. Their Appendix (Theorem A.2 and Theorem A.3) shows the expectation of the

PAGE 21

MSE estimator is correct to $O(m^{-1})$. They show

$$\mathrm{MSE}[t_i(\hat{\sigma}_u^2, \hat{\theta}_i)] = g_{1i}(\sigma_u^2) + g_{2i}(\sigma_u^2) + g_{3i}(\sigma_u^2) + o(m^{-1}), \qquad (2.5.3)$$

where $g_{1i}(\sigma_u^2) = \sigma_u^2 D_i (\sigma_u^2 + D_i)^{-1}$, $g_{2i}(\sigma_u^2) = D_i^2 (\sigma_u^2 + D_i)^{-2} x_i^T (X^T V^{-1} X)^{-1} x_i$, and $g_{3i}(\sigma_u^2) = D_i^2 (\sigma_u^2 + D_i)^{-3} V(\hat{\sigma}_u^2)$, with

$$V(\hat{\sigma}_u^2) = 2m^{-1} \Big[ \sigma_u^4 + 2m^{-1} \sigma_u^2 \sum_i D_i + m^{-1} \sum_i D_i^2 \Big] + o(m^{-1}).$$

This is estimated by

$$\widehat{\mathrm{MSE}}[t_i(\hat{\sigma}_u^2, \hat{\theta}_i)] = g_{1i}(\hat{\sigma}_u^2) + g_{2i}(\hat{\sigma}_u^2) + 2 g_{3i}(\hat{\sigma}_u^2), \qquad (2.5.4)$$

since

$$E[g_{1i}(\hat{\sigma}_u^2)] = g_{1i}(\sigma_u^2) - g_{3i}(\sigma_u^2) + o(m^{-1}),$$
$$E[g_{2i}(\hat{\sigma}_u^2)] = g_{2i}(\sigma_u^2) + o(m^{-1}),$$
$$E[g_{3i}(\hat{\sigma}_u^2)] = g_{3i}(\sigma_u^2) + o(m^{-1}).$$

2.6 Benchmarking

As we have mentioned already, small area estimation needs explicit or implicit use of models. Such model-based estimates may differ widely from direct estimates, especially in areas where the sample size is particularly low. Problems occur when we aggregate model-based estimates, since they often do not agree with the overall direct estimate for a larger geographical area. Furthermore, an overall agreement with the direct estimators at some higher level may sometimes be politically necessary to convince legislators of the utility of small area estimates. This problem can be more severe in the event of model failure, as often there is no check for validity of the assumed model.

One way to avoid the problem just described is by benchmarking. This entails adjusting the model-based estimates such that the aggregate estimates for the small areas match that for the larger geographical area. A simple illustration of this is to modify

PAGE 22

the model-based state-level estimates so that the aggregate matches the national estimates. The most popular method utilized is the raking or ratio adjustment method, which amounts to multiplying the model-based estimates by a constant data-dependent factor so the weighted total agrees with the direct estimate.

For example, we mention the Small Area Income and Poverty Estimates Program (SAIPE) estimates of the U.S. Census Bureau based on the American Community Survey (ACS) data. Here, estimates are controlled so that the overall weighted estimates agree with the corresponding state estimates. Even though these estimates are model-based, they are quite close to the direct estimates.

We now discuss some of the existing benchmarking literature (mostly frequentist) for small area estimation. Battese et al. (1988) considered prediction of areas under corn and soybeans for 12 counties in Iowa based on the 1978 June Enumerative Survey and LANDSAT satellite data from the United States Department of Agriculture (USDA). Predicting crop areas for small domains such as counties had not been attempted at this time due to a lack of available data from farm surveys for these domains (Battese et al. (1988)). A survey regression predictor was found for the 12 counties' mean crop areas, which had a small variance. However, Battese et al. (1988) pointed out that it is desirable to modify the individual county predictors such that the weighted sum equals the unbiased survey regression predictor for the total area.

You et al. (2004) considered the area-level model of Fay and Herriot (1979) in (2.2.3). They suggested benchmarking the model-based HB estimates in such a way that they will add up to the direct estimates for the large areas. We refer to the HB estimator of $\theta_i$ by $\hat{\theta}_i^B = E(\theta_i \mid \hat{\theta})$. Denote a generic small area predictor by $\hat{\theta}_i^P$. A small area predictor satisfies the benchmarked property if

$$\sum_{i=1}^m w_i \hat{\theta}_i^P = \sum_{i=1}^m w_i \hat{\theta}_i, \qquad (2.6.1)$$

where the $w_i$ are normalized sampling weights such that $\sum_{i=1}^m w_i \hat{\theta}_i$ is design consistent.
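The benchmarked property (2.6.1) is a single linear constraint on the predictors, so checking it is one weighted sum. A small helper (ours, with invented numbers) makes the point that model-based estimates generally fail the check while the direct estimates pass it trivially:

```python
def satisfies_benchmark(w, theta_pred, theta_direct, tol=1e-9):
    """Check property (2.6.1): weighted totals of predictor and direct estimates agree."""
    lhs = sum(wi * p for wi, p in zip(w, theta_pred))
    rhs = sum(wi * t for wi, t in zip(w, theta_direct))
    return abs(lhs - rhs) < tol

w = [0.5, 0.3, 0.2]        # normalized sampling weights (made up)
direct = [10.0, 6.0, 8.0]  # direct estimates (made up)
model = [9.2, 6.4, 7.5]    # un-benchmarked model-based estimates (made up)

print(satisfies_benchmark(w, model, direct))   # False: model total drifts from the direct total
print(satisfies_benchmark(w, direct, direct))  # True: direct estimates trivially benchmark
```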

PAGE 23

For instance, the ratio or raked estimator (R) can be obtained as

$$\hat{\theta}_i^R = \hat{\theta}_i^B \, \frac{\sum_{j=1}^m w_j \hat{\theta}_j}{\sum_{j=1}^m w_j \hat{\theta}_j^B}. \qquad (2.6.2)$$

In many situations survey estimators of the larger geographical area are thought to have adequate precision. However, it may be desirable to use the design consistent estimator of the larger geographical area and require that the weighted sum of the small area estimators equal the design consistent estimator (Wang et al. (2008)). To explain design consistency, we first must define design-unbiased estimators.

Suppose we have a population total $\theta$ and we observe the $\hat{\theta}$ values associated with a sample $s$. In a design-based approach, an estimator $\hat{\theta}^E$ of $\theta$ is said to be design-unbiased if $E_p(\hat{\theta}^E) = \sum_s p(s) \hat{\theta}_s^E = \theta$. The design variance of $\hat{\theta}^E$ is denoted by $V_p(\hat{\theta}^E) = E_p\big[\hat{\theta}^E - E_p(\hat{\theta}^E)\big]^2$. We say that $\hat{\theta}^E$ is design consistent if it is design-unbiased (or its design bias tends to zero as the sample size increases) and if $V_p(\hat{\theta}^E)$ tends to zero as the sample size increases (Rao (2003)).

Wang et al. (2008) considered model (2.2.3). Assuming the variance components $\sigma_u^2$ and $D_i$ are known, the best linear unbiased predictor (estimator) (BLUP) of $\beta$ is

$$\tilde{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} \hat{\theta} = \left[ \sum_i (\sigma_u^2 + D_i)^{-1} x_i x_i^T \right]^{-1} \left[ \sum_i (\sigma_u^2 + D_i)^{-1} x_i \hat{\theta}_i \right],$$

where $X^T = (x_1, \dots, x_m)$, $\hat{\theta}^T = (\hat{\theta}_1, \dots, \hat{\theta}_m)$, and $V = \mathrm{Var}(\hat{\theta}) = \mathrm{Diag}(\sigma_u^2 + D_1, \dots, \sigma_u^2 + D_m)$. Recall the BLUP of $\theta_i$ is

$$\tilde{\theta}_i^H = B_i \hat{\theta}_i + (1 - B_i) x_i^T \tilde{\beta}, \qquad (2.6.3)$$

where

$$B_i = \sigma_u^2 (\sigma_u^2 + D_i)^{-1}. \qquad (2.6.4)$$
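The raked estimator (2.6.2) above multiplies every model-based estimate by one common data-dependent factor, after which the weighted total matches the direct one exactly. A sketch with invented numbers (ours, not the dissertation's data):

```python
w = [0.4, 0.35, 0.25]         # normalized sampling weights (made up)
theta_hat = [10.0, 6.0, 8.0]  # direct estimates (made up)
theta_B = [9.0, 6.5, 7.0]     # model-based (Bayes) estimates (made up)

target = sum(wi * t for wi, t in zip(w, theta_hat))      # weighted direct total
model_total = sum(wi * t for wi, t in zip(w, theta_B))
ratio = target / model_total                             # common raking factor
theta_R = [ratio * t for t in theta_B]                   # raked estimates

print(f"raking factor = {ratio:.4f}")
print(f"benchmarked weighted total = {sum(wi * t for wi, t in zip(w, theta_R)):.4f}")
```

Every area is scaled by the same factor, so relative comparisons among the model-based estimates are preserved while the aggregate constraint is enforced.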

PAGE 24

We now review some benchmarking procedures in the literature. To do so, let $\tilde{\theta}^H$ denote the BLUP predictor of $\theta$ defined in (2.6.3), where $\tilde{\theta}^H = X\tilde{\beta} + \tilde{u}$. Note that $\tilde{\beta}$ and $\tilde{u}$ are any solutions to

$$\begin{bmatrix} X^T \Sigma_d^{-1} X & X^T \Sigma_d^{-1} \\ \Sigma_d^{-1} X & \Sigma_d^{-1} + \Sigma_u^{-1} \end{bmatrix} \begin{bmatrix} \beta \\ u \end{bmatrix} = \begin{bmatrix} X^T \Sigma_d^{-1} \hat{\theta} \\ \Sigma_d^{-1} \hat{\theta} \end{bmatrix}, \qquad (2.6.5)$$

where $\Sigma_u = \sigma_u^2 I_m$ and $\Sigma_d = \mathrm{Diag}(D_1, \dots, D_m)$. Moreover, finding a solution to the mixed model equation in (2.6.5) is equivalent to finding a solution to

$$\min_{\beta, u} \Big\{ [\hat{\theta} - X\beta - u]^T \Sigma_d^{-1} [\hat{\theta} - X\beta - u] + u^T \Sigma_u^{-1} u \Big\}. \qquad (2.6.6)$$

Pfeffermann and Barnard (1991) proposed a modified predictor $\hat{\theta}^{PB} = X\hat{\beta}^{PB} + \hat{u}^{PB}$, where $\hat{\beta}^{PB}$ and $\hat{u}^{PB}$ are the solutions to the minimization problem in (2.6.6) subject to the constraint

$$\sum_i w_i \big( x_i^T \hat{\beta}^{PB} + \hat{u}_i^{PB} \big) = \sum_i w_i \hat{\theta}_i. \qquad (2.6.7)$$

This leads to the estimator

$$\hat{\theta}_i^{PB} = \tilde{\theta}_i^H + [\mathrm{Var}(\tilde{\theta}_w)]^{-1} \mathrm{Cov}(\tilde{\theta}_i^H, \tilde{\theta}_w) \left[ \sum_{j=1}^m w_j \hat{\theta}_j - \tilde{\theta}_w \right], \qquad (2.6.8)$$

where $\tilde{\theta}_w = \sum_{i=1}^m w_i \tilde{\theta}_i^H$, $\mathrm{Cov}(\tilde{\theta}_i^H, \tilde{\theta}_w) = w_i B_i D_i + \sum_{j=1}^m w_j (1 - B_i)(1 - B_j) x_i^T \mathrm{Var}(\tilde{\beta}) x_j$, and $\mathrm{Var}(\tilde{\theta}_w) = \sum_{i=1}^m w_i \mathrm{Cov}(\tilde{\theta}_i^H, \tilde{\theta}_w)$.

Furthermore, Isaki et al. (2000) imposed the restriction by a procedure that basically constructs the best predictors of $n - 1$ quantities that are uncorrelated with $\sum_i w_i \hat{\theta}_i$. After matrix computations, the Isaki-Tsay-Fuller (ITF) estimator can be written as

$$\hat{\theta}_i^{ITF} = \hat{\theta}_i^H + \left[ \sum_{j=1}^m w_j^2 \widehat{\mathrm{Var}}(\hat{\theta}_j) \right]^{-1} w_i \widehat{\mathrm{Var}}(\hat{\theta}_i) \left[ \sum_{j=1}^m w_j \hat{\theta}_j - \sum_{j=1}^m w_j \hat{\theta}_j^H \right], \qquad (2.6.9)$$

where $\widehat{\mathrm{Var}}(\hat{\theta}_i)$ estimates $\sigma_u^2 + D_i$.
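The ITF adjustment (2.6.9) is an additive correction: the benchmarking discrepancy is allocated across areas proportionally to $w_i \widehat{\mathrm{Var}}(\hat{\theta}_i)$, and because the allocation coefficients satisfy $\sum_i w_i a_i = 1$, the adjusted estimates meet (2.6.1) exactly. A sketch with invented inputs (ours; the EBLUP values are simply assumed rather than computed):

```python
sigma2_u = 4.0                # model variance (made up)
D = [1.0, 3.0, 6.0]           # known sampling variances (made up)
w = [0.5, 0.3, 0.2]           # normalized weights (made up)
theta_hat = [10.0, 6.0, 8.0]  # direct estimates (made up)
theta_H = [9.2, 6.4, 7.5]     # EBLUPs (assumed for the sketch)

V_hat = [sigma2_u + d for d in D]                  # estimates of Var(theta_hat_i)
denom = sum(wj * wj * vj for wj, vj in zip(w, V_hat))
a = [wi * vi / denom for wi, vi in zip(w, V_hat)]  # allocation coefficients of (2.6.9)

disc = sum(wj * t for wj, t in zip(w, theta_hat)) - sum(wj * t for wj, t in zip(w, theta_H))
theta_ITF = [t + ai * disc for t, ai in zip(theta_H, a)]

print("sum_i w_i a_i     =", round(sum(wi * ai for wi, ai in zip(w, a)), 6))
print("benchmarked total =", round(sum(wj * t for wj, t in zip(w, theta_ITF)), 6))
print("direct total      =", round(sum(wj * t for wj, t in zip(w, theta_hat)), 6))
```

Areas whose direct estimates are more variable absorb a larger share of the discrepancy, which is the design choice behind the ITF weighting.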

PAGE 25

The Pfeffermann-Barnard (PB) (2.6.8) and ITF (2.6.9) estimators can be written in the form

$$\hat{\theta}_i^a = \hat{\theta}_i^P + a_i \left[ \sum_{j=1}^m w_j \hat{\theta}_j - \sum_{j=1}^m w_j \hat{\theta}_j^P \right], \qquad (2.6.10)$$

where $\sum_i w_i a_i = 1$. Wang et al. (2008) described the restriction in (2.6.1) to be just an adjustment issue. That is, in order to force $\hat{\theta}_i^a$ to satisfy (2.6.1), the difference $\sum_{j=1}^m w_j \hat{\theta}_j - \sum_{j=1}^m w_j \hat{\theta}_j^P$ must be allocated to the small area estimator $\hat{\theta}_j^P$ using $a_i$.

You and Rao (2002) proposed an estimator of $\beta$ such that the resulting estimators or predictors satisfy constraint (2.6.1) under a unit-level model. You and Rao called such estimators self-calibrated. Wang, Fuller, and Qu (2008) applied the methods of You and Rao (2002) to the area-level model to find

$$\hat{\theta}_i^{YR} = \hat{B}_i \hat{\theta}_i + (1 - \hat{B}_i) x_i^T \hat{\beta}^{YR}, \qquad (2.6.11)$$

where $\hat{\beta}^{YR} = \big[ \sum_i w_i (1 - \hat{B}_i) x_i x_i^T \big]^{-1} \sum_i w_i (1 - \hat{B}_i) x_i \hat{\theta}_i$. An estimator $\hat{\theta}_j^P$ is said to be self-calibrated if $\sum_{j=1}^m w_j \hat{\theta}_j - \sum_{j=1}^m w_j \hat{\theta}_j^P = 0$. For example, the You and Rao (YR) estimator (2.6.11) is self-calibrated. This implies that the YR estimator is of the form (2.6.10) with $\hat{\theta}_i^a = \hat{\theta}_i^{YR}$.

Wang et al. (2008) unified the previous work done for a weighted quadratic loss and found the BLUP for $\theta$ that satisfies constraint (2.6.1). They defined the loss function

$$Q(\hat{\theta}^a, \theta) = \sum_{i=1}^m \phi_i E(\hat{\theta}_i^a - \theta_i)^2, \qquad (2.6.12)$$

where the $\phi_i > 0$ are a known set of weights. Using the random effects model defined in (2.2.3), define $\hat{\theta}_i^a$ to be the unique predictor among all linear unbiased predictors satisfying (2.6.1) that minimizes (2.6.12). Then

$$\hat{\theta}_i^a = \tilde{\theta}_i^H + a_i \left[ \sum_{j=1}^m w_j \hat{\theta}_j - \sum_{j=1}^m w_j \tilde{\theta}_j^H \right], \qquad (2.6.13)$$

PAGE 26

where $a_i = \big( \sum_j \phi_j^{-1} w_j^2 \big)^{-1} \phi_i^{-1} w_i$. It is noted that when the variance components are unknown, they are replaced with reasonable estimators to obtain the EBLUP $\hat{\theta}_j^H$. Then the modified estimator or predictor of Wang, Fuller, and Qu becomes

$$\hat{\theta}_i^a = \hat{\theta}_i^H + a_i \left[ \sum_{j=1}^m w_j \hat{\theta}_j - \sum_{j=1}^m w_j \hat{\theta}_j^H \right]. \qquad (2.6.14)$$

Recall the loss function (2.6.12), where the choice of the weights $\phi_i$ depends on the problem at hand. For instance, more weight may be given to more important areas. It is often the case that $\phi_i$ is a function of the variance components, or we can choose the $\phi_i$ in order to obtain desirable properties. For example, choosing $\phi_i = [\widehat{\mathrm{Var}}(\hat{\theta}_i)]^{-1}$ yields the ITF estimator. On the other hand, choosing $\phi_i = w_i [\mathrm{Cov}(\tilde{\theta}_i^H, \tilde{\theta}_w)]^{-1}$, where $\tilde{\theta}_w = \sum_j w_j \tilde{\theta}_j^H$, leads to the Pfeffermann and Barnard (1991) estimator in (2.6.8). Moreover, $\phi_i = [\widehat{\mathrm{Var}}(\hat{\theta}_i^H)]^{-1}$ leads to the estimator by Battese et al. (1988).

2.7 Constrained Bayes Estimation

The work of Louis (1984) and Ghosh (1992) motivates our work on one-stage benchmarking. Under any quadratic loss, the Bayes estimates are usually the posterior means of the parameters of interest; however, the point of an analysis with many parameters is usually to attain another objective as well. This objective may be to have the parameters be above or below a cutoff point, or to have the first two moments of the sample match the true moments. Ghosh (1992) showed

$$E\left[ \sum_{i=1}^m (\theta_i - \bar{\theta})^2 \,\Big|\, x \right] > \sum_{i=1}^m \big( \hat{\theta}_i^B(x) - \bar{\theta}^B(x) \big)^2, \qquad (2.7.1)$$

where $\hat{\theta}_i^B(x) = E(\theta_i \mid x)$. This means that the Bayes estimates are too close together compared to the true parameter values. Louis (1984) commented that this behavior was due to overshrinking of the observed data towards the prior means.

These papers illustrated that the empirical histogram of the posterior means of a set of parameters of interest is underdispersed as compared to the posterior histogram of

PAGE 27

the same set of parameters. Hence, adjustment of Bayes estimators is needed in order to meet the twin objectives of accuracy and closeness of the histogram of the estimates to the posterior estimate of the parameter histogram. In other words, Louis (1984) and Ghosh (1992) wanted to find estimators $e_i$ of $\theta_i$ which satisfy

$E\left[\sum_{i=1}^m \theta_i \,\Big|\, x\right] = \sum_{i=1}^m e_i$   and   (2.7.2)

$E\left[\sum_{i=1}^m (\theta_i - \bar{\theta})^2 \,\Big|\, x\right] = \sum_{i=1}^m (e_i - \bar{e})^2.$   (2.7.3)

Ghosh (1992) provided a general class of constrained Bayes estimators in a general context that are obtained by matching the first two moments of the histogram of the estimates to the posterior expectations of the first two moments of the parameters. The quadratic loss is minimized subject to those conditions.

Thus, taking the motivating work of Louis (1984) and Ghosh (1992), we propose a general loss function without any distributional assumptions and benchmark the weighted mean, as well as the weighted mean and the weighted variability. We will soon see that many of the benchmarked estimators already mentioned are special cases of the general theory we present for an area-level model.

2.8 Proposed Research

Our work extends that of Wang et al. (2008) to a more general setting. Wang et al. (2008) derived a best linear unbiased predictor (BLUP) subject to the standard benchmarking constraint while minimizing a weighted squared error loss. They restricted their attention to a simple random effects model, only considered linear estimators of small area means, and did not consider multivariate settings. In contrast, in our work (Datta et al. (2011)), we develop more general benchmarked Bayes estimators, where the form of the Bayes estimate can be linear or nonlinear. We also derive a second set of estimators under a second constraint that benchmarks the weighted variability. We can extend our results to multivariate settings and use a more general loss
function. Both Wang et al. (2008) and Datta et al. (2011) are able to show that many previously proposed estimators are special cases of their respective estimators. Then, in Ghosh and Steorts (2011), we extend the work above to a two-stage benchmarking procedure under one single model. Finally, since model-based estimates borrow strength, they usually show an improvement over the direct estimates in terms of mean squared error (MSE). In Steorts and Ghosh (2012), we determine how much of this advantage is lost due to benchmarking under specific modeling assumptions. This question was posed through simulation studies using an augmented model by Wang et al. (2008); however, they did not derive any probabilistic results. They showed that the MSE of the benchmarked EB estimator was slightly larger than the MSE of the EB estimator in their simulation studies. In our work, we derive a second-order approximation of the MSE of the benchmarked EB estimator and then find an asymptotically unbiased estimate of this MSE. Finally, we derive a parametric bootstrap estimator of the MSE of the benchmarked EB estimator and analyze our methodology using data from the Small Area Income and Poverty Estimates (SAIPE) program of the U.S. Census Bureau.
CHAPTER 3
ONE-STAGE BENCHMARKING

As we have already mentioned, model-based estimates can differ widely from direct estimates, especially for areas with very low sample sizes. While model-based estimates are useful, one potential difficulty with such estimates is that when aggregated, the overall estimate for a larger geographical area may be quite different from the corresponding direct estimate. One way to avoid this problem is by benchmarking, which amounts to modifying these model-based estimates so that we get the same aggregate estimate for the larger geographical area. Currently the most popular approach is the so-called "raking" or ratio adjustment method, which involves multiplying all the small area estimates by a constant factor so that the weighted total agrees with the direct estimate. The raking approach is ad hoc, although we give it a constrained Bayes interpretation.

Our objective is to develop a general class of Bayes estimators which achieves the necessary benchmarking. For definiteness, we will concentrate only on area-level models. As we will see later, many of the currently proposed benchmarked estimators, including the raked ones, belong to the proposed class of Bayes estimators. In particular, some of the estimators proposed in Pfeffermann and Barnard (1991), Isaki et al. (2000), Wang et al. (2008), and You et al. (2004) are members of this class.

3.1 Development of the Estimators

Let $\hat\theta_1,\ldots,\hat\theta_m$ denote the direct estimators of the $m$ small area means $\theta_1,\ldots,\theta_m$. We write $\hat\theta = (\hat\theta_1,\ldots,\hat\theta_m)^T$ and $\theta = (\theta_1,\ldots,\theta_m)^T$. Initially, we seek the benchmarked Bayes estimator $\hat\theta^{BM1} = (\hat\theta^{BM1}_1,\ldots,\hat\theta^{BM1}_m)^T$ of $\theta$ such that $\sum_{i=1}^m w_i\hat\theta^{BM1}_i = t$, where either $t$ is prespecified from some other source or $t = \sum_{i=1}^m w_i\hat\theta_i$. The $w_i$ are given weights attached to the direct estimators $\hat\theta_i$, and without any loss of generality, $\sum_{i=1}^m w_i = 1$. These weights may depend on $\hat\theta$ (which is most often not the case), but do not depend on $\theta$. For example, we may take $w_i = N_i/\sum_{j=1}^m N_j$, where the $N_i$ are the population sizes for the $m$ small areas.
A Bayesian approach to this end is to minimize the posterior expectation of the weighted squared error loss $\sum_{i=1}^m \phi_i E[(\theta_i - e_i)^2 \mid \hat\theta]$ with respect to the $e_i$ satisfying $e_w = \sum_{i=1}^m w_ie_i = t$. The $\phi_i$ may be the same as the $w_i$, but that need not always be the case. Also, like the $w_i$, the $\phi_i$ may depend on $\hat\theta$, but not on $\theta$. Wang et al. (2008) considered the same loss but restricted themselves to a simple random effects model and considered only linear estimators of small area means. Indeed, in that special case, the Bayes estimator proposed here reduces to theirs. It should be noted, though, that the proposed Bayesian adjustment is applicable to any Bayes estimator, linear or nonlinear.

The $\phi_i$ can be regarded as weights for a multiple-objective decision process. That is, each specific weight is relevant only to the decision-maker for the corresponding small area, who may not be concerned with the weights related to decision-makers in other small areas. Combining losses in such situations in a linear fashion is discussed, for example, in Berger (1985).

We now prove a theorem which provides a solution to our problem. A few notations are needed before stating the theorem. Let $\hat\theta^B_i$ denote the posterior mean of $\theta_i$, $i = 1,\ldots,m$, under a certain prior. The vector of posterior means and the corresponding weighted average are denoted respectively by $\hat\theta^B = (\hat\theta^B_1,\ldots,\hat\theta^B_m)^T$ and $\hat\theta^B_w = \sum_{i=1}^m w_i\hat\theta^B_i$. Also, let $r = (r_1,\ldots,r_m)^T$, where $r_i = w_i/\phi_i$, $i = 1,\ldots,m$, and $s = \sum_{i=1}^m w_i^2/\phi_i$. Then we have the following theorem.

Theorem 1. The minimizer $\hat\theta^{BM1}$ of $\sum_{i=1}^m \phi_i E[(e_i - \theta_i)^2 \mid \hat\theta]$ subject to $e_w = t$ is given by

$\hat\theta^{BM1} = \hat\theta^B + s^{-1}(t - \hat\theta^B_w)\,r.$   (3.1.1)

Proof. First rewrite $\sum_{i=1}^m \phi_i E[(e_i - \theta_i)^2 \mid \hat\theta] = \sum_{i=1}^m \phi_i V(\theta_i \mid \hat\theta) + \sum_{i=1}^m \phi_i(e_i - \hat\theta^B_i)^2$. Now the problem reduces to minimization of $\sum_{i=1}^m \phi_i(e_i - \hat\theta^B_i)^2$ subject to $e_w = t$. A Lagrangian multiplier approach provides the solution. But then we need to show in addition that the
solution provides a minimizer and not a maximizer. Alternately, we can use the identity

$\sum_{i=1}^m \phi_i(e_i - \hat\theta^B_i)^2 = \sum_{i=1}^m \phi_i\big\{e_i - \hat\theta^B_i - s^{-1}(t - \hat\theta^B_w)r_i\big\}^2 + s^{-1}(t - \hat\theta^B_w)^2.$   (3.1.2)

The solution is now immediate from (3.1.2).

Remark 1: The constrained Bayes benchmarked estimator $\hat\theta^{BM1}$ as given in (3.1.1) can also be viewed as a limiting Bayes estimator under the loss

$L(\theta, e) = \sum_{i=1}^m \phi_i(\theta_i - e_i)^2 + \lambda(t - e_w)^2,$   (3.1.3)

where $e = (e_1,\ldots,e_m)^T$ and $\lambda > 0$ is the penalty parameter. Like the $\phi_i$, the penalty parameter $\lambda$ can differ for different policymakers. The Bayes estimator of $\theta$ under the above loss (after some algebra) is given by

$\hat\theta^B_\lambda = \hat\theta^B + (s + \lambda^{-1})^{-1}(t - \hat\theta^B_w)\,r.$

Clearly, when $\lambda \to \infty$, i.e., when we invoke the extreme penalty for not having the exact equality $e_w = t$, we get the estimator given in (3.1.1). Otherwise, $\lambda$ serves as a trade-off between $t$ and $\hat\theta^B_w$, since $w^T\hat\theta^B_\lambda = \frac{\lambda s}{\lambda s + 1}\,t + \frac{1}{\lambda s + 1}\,\hat\theta^B_w$.

Remark 2: The balanced loss of Zellner (1986), Zellner (1988), and Zellner (1994) is not quite the same as the one in Remark 1 and is given by

$L(\theta, e) = \sum_{i=1}^m \phi_i(\theta_i - e_i)^2 + \lambda\sum_{i=1}^m(\hat\theta_i - e_i)^2.$

This leads to the Bayes estimator $\hat\theta^B + \lambda(\lambda I + \Phi)^{-1}(\hat\theta - \hat\theta^B)$, where $I$ is the identity matrix and $\Phi = \mathrm{Diag}(\phi_1,\ldots,\phi_m)$, which is a compromise between the Bayes estimator $\hat\theta^B$ and the direct estimator $\hat\theta$ of $\theta$; it converges to the direct estimator as $\lambda \to \infty$ and to the Bayes estimator when $\lambda \to 0$.
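Theorem 1 is straightforward to check numerically. The sketch below uses entirely made-up inputs (weights, loss weights, posterior means, and benchmark total are all hypothetical) and assumes `numpy` is available; it computes (3.1.1) and verifies that the benchmarking constraint holds exactly.

```python
import numpy as np

# Hypothetical inputs: m = 4 areas with posterior means theta_B,
# benchmarking weights w, loss weights phi, and benchmark total t.
w = np.array([0.4, 0.3, 0.2, 0.1])           # sums to 1
phi = np.array([1.0, 2.0, 1.5, 0.5])         # phi_i > 0
theta_B = np.array([10.0, 12.0, 9.0, 14.0])  # posterior means
t = 11.0

r = w / phi                  # r_i = w_i / phi_i
s = np.sum(w**2 / phi)       # s = sum_i w_i^2 / phi_i
theta_Bw = w @ theta_B       # weighted average of posterior means

theta_BM1 = theta_B + (t - theta_Bw) / s * r  # equation (3.1.1)

assert np.isclose(w @ theta_BM1, t)           # benchmarking constraint holds
```

The constraint holds by construction, since $\sum_i w_ir_i = s$, so the weighted total of the adjustment term is exactly $t - \hat\theta^B_w$.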
The posterior risk of $\hat\theta^{BM1}$ under the given loss in Theorem 1 simplifies to $\sum_{i=1}^m \phi_i\big[V(\theta_i \mid \hat\theta) + s^{-2}(t - \hat\theta^B_w)^2 r_i^2\big]$. Hence the excess posterior risk due to adjustment of the Bayes estimator, if the assumed prior were "true," is given by $\sum_i \phi_i s^{-2}(t - \hat\theta^B_w)^2 r_i^2 = s^{-1}(t - \hat\theta^B_w)^2$. When $\phi_i = w_i$ and $\sum_i w_i = 1$, the expression further simplifies to $(t - \hat\theta^B_w)^2$. However, with a prior different from the assumed one, it is possible for the adjusted Bayes estimator to have a lower posterior risk than the Bayes estimator.

To see this in a very simple setting, consider the case where $\hat\theta_i \mid \theta_i \stackrel{ind}{\sim} N(\theta_i, 1)$ and $\theta_i \stackrel{iid}{\sim} N(0, \sigma^2_u)$, $i = 1,\ldots,m$. Then the Bayes estimator of $\theta$ is $\hat\theta^B = (1 - B)\hat\theta$, where $B = (1 + \sigma^2_u)^{-1}$. Also, if $\phi_i = w_i$ and $\sum_{i=1}^m w_i = 1$, then $r_i = 1$ for all $i = 1,\ldots,m$ and $s = 1$. Further, if $t = \hat\theta_w$, as often is the case with internal benchmarking, then $\hat\theta^{BM1} = (1 - B)\hat\theta + B\hat\theta_w 1_m$, where $1_m$ denotes an $m$-component vector with each element equal to one. If instead we have the prior $\theta_i \stackrel{iid}{\sim} N(0, \sigma^2_v)$, with $B_0 = (1 + \sigma^2_v)^{-1}$, then after some simplification, the posterior risk of $\hat\theta^B$ is $(1 - B_0) + (B - B_0)^2\sum_{i=1}^m w_i(\hat\theta_i - \hat\theta_w)^2 + (B - B_0)^2\hat\theta_w^2$, while that of $\hat\theta^{BM1}$ is $(1 - B_0) + (B - B_0)^2\sum_{i=1}^m w_i(\hat\theta_i - \hat\theta_w)^2 + B_0^2\hat\theta_w^2$. Now $\hat\theta^{BM1}$ has smaller posterior risk than $\hat\theta^B$ if and only if $|B - B_0|/B_0 > 1$, which is quite possible if $B_0$ is very small compared to $B$, i.e., if $\sigma^2_v$ is much larger than $\sigma^2_u$.

We now provide a generalization of Theorem 1 where we consider multiple constraints instead of one single constraint. As an example, for the SAIPE county-level analysis, one may need to control the county estimates in each state so that their weighted total agrees with the corresponding state estimate. We now consider a more general quadratic loss given by

$L(\theta, e) = (\theta - e)^T\Omega(\theta - e),$   (3.1.4)

where $\Omega$ is a positive definite matrix. The following theorem provides a Bayesian solution for the minimization of $E[L(\theta, e) \mid \hat\theta]$ subject to the constraint $W^Te = t$, where $t$ is a $q$-component vector and $W$ is an $m \times q$ matrix of rank $q < m$.

Theorem 2. The constrained Bayesian solution under the loss (3.1.4) is given by
$\hat\theta^{MBM} = \hat\theta^B + \Omega^{-1}W(W^T\Omega^{-1}W)^{-1}(t - \hat\theta^B_w)$, where $\hat\theta^B_w = W^T\hat\theta^B$.

Proof. First write $E[(\theta - e)^T\Omega(\theta - e) \mid \hat\theta] = E[(\theta - \hat\theta^B)^T\Omega(\theta - \hat\theta^B) \mid \hat\theta] + (e - \hat\theta^B)^T\Omega(e - \hat\theta^B)$. Hence, the problem reduces to minimization of $(e - \hat\theta^B)^T\Omega(e - \hat\theta^B)$ with respect to $e$ subject to $W^Te = t$. The result follows from the identity

$(e - \hat\theta^B)^T\Omega(e - \hat\theta^B) = \big[e - \hat\theta^B - \Omega^{-1}W(W^T\Omega^{-1}W)^{-1}(t - \hat\theta^B_w)\big]^T\Omega\big[e - \hat\theta^B - \Omega^{-1}W(W^T\Omega^{-1}W)^{-1}(t - \hat\theta^B_w)\big] + (t - \hat\theta^B_w)^T(W^T\Omega^{-1}W)^{-1}(t - \hat\theta^B_w).$

The choice of the weight matrix $\Omega$ usually depends on how much penalty the experimenter is willing to assign for a misspecified estimator. In the special case of a diagonal $\Omega$, Wang et al. (2008) have argued in favor of $\phi_i = [\mathrm{Var}(\hat\theta_i)]^{-1}$.

3.2 Relationship with Some Existing Estimators

We now show how some of the existing benchmarked estimators follow as special cases of the earlier proposed Bayes estimators. Indeed, our proposed class of Bayes estimators includes some of the raked benchmarked estimators as well as some of the other benchmarked estimators proposed by several authors.

Example 1: It is easy to see why the raked Bayes estimators, considered for example in You and Rao (2004), belong to the general class of estimators proposed in Theorem 1. If we choose (possibly quite artificially) $\phi_i = w_i/\hat\theta^B_i$, $i = 1,\ldots,m$, with $\hat\theta^B_i > 0$ for all $i = 1,\ldots,m$, then $r = \hat\theta^B$ and $s = \hat\theta^B_w$. Consequently, the constrained Bayes estimator proposed in Theorem 1 simplifies to $(t/\hat\theta^B_w)\hat\theta^B$, which is the raked Bayes estimator. In particular, we can take $t = \hat\theta_w$. We may also note that this choice of the $\phi_i$'s is different from the one in Wang et al. (2008), who considered $\phi_i = w_i/\hat\theta_i$.
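Example 1 can likewise be verified numerically. In this hypothetical sketch (all numbers invented, `numpy` assumed), plugging $\phi_i = w_i/\hat\theta^B_i$ into the Theorem 1 formula reproduces the ratio-adjusted (raked) estimator:

```python
import numpy as np

# Hypothetical numbers: choosing phi_i = w_i / theta_B_i in Theorem 1
# collapses the general formula to the raked (ratio-adjusted) estimator.
w = np.array([0.25, 0.25, 0.5])
theta_B = np.array([8.0, 10.0, 12.0])   # posterior means, all positive
t = 11.0

phi = w / theta_B
r = w / phi                             # here r = theta_B
s = np.sum(w**2 / phi)                  # here s = weighted mean of theta_B
theta_Bw = w @ theta_B

general = theta_B + (t - theta_Bw) / s * r   # Theorem 1 formula (3.1.1)
raked = (t / theta_Bw) * theta_B             # direct ratio adjustment

assert np.allclose(general, raked)
```

The collapse happens because $r = \hat\theta^B$ and $s = \hat\theta^B_w$ under this choice of loss weights, exactly as stated in the example.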
Example 2: The next example considers the usual random effects model as considered in Fay and Herriot (1979) or Pfeffermann and Nathan (1981). Under this model, $\hat\theta_i \mid \theta_i \stackrel{ind}{\sim} N(\theta_i, D_i)$ and $\theta_i \stackrel{ind}{\sim} N(x_i^T\beta, \sigma^2_u)$, with the $D_i > 0$ known. For the HB approach, we then use the prior $\pi(\beta, \sigma^2_u) = 1$, although other priors are also possible as long as the posteriors are proper. The HB estimators $E(\theta \mid \hat\theta)$ cannot be obtained analytically, but it is possible to find them numerically, either through Markov chain Monte Carlo (MCMC) or through numerical integration. Denoting the HB estimators by $\hat\theta^B_i$, we can obtain the benchmarked Bayes estimators $\hat\theta^{BM1}_i$ ($i = 1,\ldots,m$) by applying Theorem 1. With an empirical Bayes approach, we estimate $\beta$ and $V_i = \sigma^2_u + D_i$, $i = 1,\ldots,m$, from the marginal independent $N(x_i^T\beta, V_i)$ distributions of the $\hat\theta_i$. Isaki et al. (2000) suggested $\phi_i = \hat V_i^{-1}$, $i = 1,\ldots,m$.

Example 3: Wang et al. (2008) considered a slightly varied form of the Fay-Herriot random effects model, with the only change that the marginal variances of the $\theta_i$ are now $z_i^2\sigma^2_u$, where the $z_i$ are known. They did not assume normality, but they restricted their attention to the class of linear estimators of $\theta$ and benchmarked the best linear unbiased predictor (BLUP) of $\theta$ when $\sigma^2_u$ is known. For this example, the benchmarked estimators given in (3.2.2) of Wang et al. (2008) can be derived from Theorem 1. First, for known $\sigma^2_u$, consider the uniform prior for $\beta$. Write $B_i = D_i/(D_i + z_i^2\sigma^2_u)$, $B = \mathrm{Diag}(B_1,\ldots,B_m)$, $\Sigma = \mathrm{Diag}(D_1 + z_1^2\sigma^2_u,\ldots,D_m + z_m^2\sigma^2_u)$, $X^T = (x_1,\ldots,x_m)$, and $\tilde\beta = (X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}\hat\theta$, assuming $X$ to be a matrix with full column rank. Then the Bayes estimator of $\theta$ is $\tilde\theta^B = (I - B)\hat\theta + BX\tilde\beta$, which is the same as the BLUP of $\theta$ as well. Now identify the $r_i$ in this paper with the $a_i$ of Wang et al. (2008) to get (19) in their paper.

As shown in Wang et al. (2008), the Pfeffermann and Barnard (1991) estimator belongs to their (and accordingly our) general class of estimators, where we choose $\phi_i = w_i/\mathrm{Cov}(\hat\theta^B_i, \hat\theta^B_w)$, with the covariance calculated over the joint distribution of $\hat\theta$ and $\theta$, treating $\beta$ as an unknown but fixed parameter. Then $r$ contains the elements $\mathrm{Cov}(\hat\theta^B_i, \hat\theta^B_w)$ as its components, while $s = V(\hat\theta^B_w)$.
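A minimal sketch of the Example 3 computation, with invented inputs (the design matrix, sampling variances, and direct estimates below are all hypothetical): for known $\sigma^2_u$ and a uniform prior on $\beta$, the Bayes estimator/BLUP is assembled as follows.

```python
import numpy as np

# Sketch of the Example 3 estimator with invented inputs: known sigma2_u,
# uniform prior on beta, B_i = D_i / (D_i + z_i^2 * sigma2_u).
rng = np.random.default_rng(2)
m = 5
X = np.column_stack([np.ones(m), rng.normal(size=m)])  # full column rank
D = np.array([0.4, 0.6, 0.5, 0.3, 0.7])                # known sampling variances
z = np.ones(m)
sigma2_u = 1.0                                         # known model variance
theta_hat = rng.normal(size=m)                         # direct estimators

Sigma_inv = np.diag(1.0 / (D + z**2 * sigma2_u))
beta_tilde = np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ theta_hat)
B = np.diag(D / (D + z**2 * sigma2_u))

# Bayes estimator (equivalently, the BLUP for known sigma2_u):
theta_B = (np.eye(m) - B) @ theta_hat + B @ (X @ beta_tilde)
```

Theorem 1 can then be applied directly to `theta_B` with any choice of the $\phi_i$, e.g. the Pfeffermann-Barnard weights discussed above.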
Instead of the constrained Bayes estimators as given in (3.1.1), it is possible to obtain constrained empirical Bayes (EB) estimators as well, when we estimate the prior parameters from the marginal distribution of $\hat\theta$ (after integrating out $\theta$). The resulting EB estimators are given by

$\hat\theta^{EBM1} = \hat\theta^{EB} + s^{-1}(t - \hat\theta^{EB}_w)\,r,$   (3.2.1)

where $\hat\theta^{EB} = (\hat\theta^{EB}_1,\ldots,\hat\theta^{EB}_m)^T$ is an EB estimator of $\theta$ and $\hat\theta^{EB}_w = \sum_{i=1}^m w_i\hat\theta^{EB}_i$.

Remark 3: In the model as considered in Example 3, for unknown $\sigma^2_u$, we get estimators of $\beta$ and $\sigma^2_u$ simultaneously from the marginals $\hat\theta_i \stackrel{ind}{\sim} N(x_i^T\beta, D_i + z_i^2\sigma^2_u)$ (Fay and Herriot (1979); Prasad and Rao (1990); Datta and Lahiri (2000); Datta et al. (2005)). Denoting the estimator of $\sigma^2_u$ by $\hat\sigma^2_u$, we estimate $\Sigma$ by $\hat\Sigma = \mathrm{Diag}(D_1 + z_1^2\hat\sigma^2_u,\ldots,D_m + z_m^2\hat\sigma^2_u)$, $\beta$ by $\hat\beta = (X^T\hat\Sigma^{-1}X)^{-1}X^T\hat\Sigma^{-1}\hat\theta$, and $B$ by $\hat B = D\hat\Sigma^{-1}$, where $D = \mathrm{Diag}(D_1,\ldots,D_m)$. Denoting the resulting EB estimator of $\theta$ by $\hat\theta^{EB}$, we get

$\hat\theta^{EB} = (I_m - \hat B)\hat\theta + \hat BX\hat\beta.$   (3.2.2)

The benchmarked EB estimator is now obtained from (3.2.1).

Remark 4: The benchmarked EB estimator as given in (3.2.1) includes the one given in Isaki et al. (2000), where we take $\phi_i$ as the reciprocal of the $i$th diagonal element of $\hat\Sigma$ for all $i = 1,\ldots,m$. Another option is to take $\phi_i$ as the reciprocal of an estimator of $V(\hat\theta^B_i)$, with the variance computed once again under the joint distribution of $\hat\theta$ and $\theta$, treating $\beta$ as an unknown but fixed parameter.

3.3 Benchmarking with Both Mean and Variability Constraints

There are situations that demand benchmarking not only the first moment of the Bayes estimators but their variability as well. We will address this issue in the special case when $\phi_i = cw_i$ for some $c > 0$, $i = 1,\ldots,m$. In this case, $\hat\theta^{BM1}_i$ given in (3.1.1) simplifies to $\hat\theta^B_i + (t - \hat\theta^B_w)$ for all $i = 1,\ldots,m$. This itself is not a very desirable estimator, since then $\sum_{i=1}^m w_i(\hat\theta^{BM1}_i - t)^2 = \sum_{i=1}^m w_i(\hat\theta^B_i - \hat\theta^B_w)^2$. Ghosh (1992) showed that $\sum_{i=1}^m w_i(\hat\theta^B_i - \hat\theta^B_w)^2 < \sum_{i=1}^m w_iE[(\theta_i - \theta_w)^2 \mid \hat\theta]$. In other words, the weighted ensemble
variability of the estimators $\hat\theta^{BM1}_i$ is an underestimate of the posterior expectation of the corresponding weighted ensemble variability of the population parameters. To address this issue, or from other considerations, we will consider estimators $\hat\theta^{BM2}_i$, $i = 1,\ldots,m$, which satisfy two constraints, namely, (i) $\sum_{i=1}^m w_i\hat\theta^{BM2}_i = t$ and (ii) $\sum_{i=1}^m w_i(\hat\theta^{BM2}_i - t)^2 = H$, where $H$ is a preassigned number taken from some other source, for example from census data, or is taken as $\sum_{i=1}^m w_iE[(\theta_i - \theta_w)^2 \mid \hat\theta]$, more in the spirit of Louis (1984) and Ghosh (1992). Subject to these two constraints, we minimize $\sum_{i=1}^m w_iE[(\theta_i - e_i)^2 \mid \hat\theta]$. The following theorem provides the resulting estimator.

Theorem 3. Subject to (i) and (ii), the benchmarked Bayes estimators of $\theta_i$, $i = 1,\ldots,m$, are given by

$\hat\theta^{BM2}_i = t + a_{CB}(\hat\theta^B_i - \hat\theta^B_w),$   (3.3.1)

where $a^2_{CB} = H\big/\sum_{i=1}^m w_i(\hat\theta^B_i - \hat\theta^B_w)^2$. Note that $a_{CB} \geq 1$ when $H = \sum_{i=1}^m w_iE[(\theta_i - \theta_w)^2 \mid \hat\theta]$.

Proof. As in Theorem 1, the problem reduces to minimization of $\sum_{i=1}^m w_i(e_i - \hat\theta^B_i)^2$. We will write

$\sum_{i=1}^m w_i(e_i - \hat\theta^B_i)^2 = \sum_{i=1}^m w_i\big[(e_i - e_w) - (\hat\theta^B_i - \hat\theta^B_w)\big]^2 + (e_w - \hat\theta^B_w)^2.$   (3.3.2)

Now define two discrete random variables $Z_1$ and $Z_2$ such that $P(Z_1 = e_i - e_w, Z_2 = \hat\theta^B_i - \hat\theta^B_w) = w_i$, $i = 1,\ldots,m$. Hence,

$\sum_{i=1}^m w_i\big[(e_i - e_w) - (\hat\theta^B_i - \hat\theta^B_w)\big]^2 = V(Z_1) + V(Z_2) - 2\,\mathrm{Cov}(Z_1, Z_2),$

which is minimized when the correlation between $Z_1$ and $Z_2$ equals 1, i.e., when

$e_i - e_w = a(\hat\theta^B_i - \hat\theta^B_w) + b,$   (3.3.3)

$i = 1,\ldots,m$, with $a > 0$. Multiplying both sides of (3.3.3) by $w_i$ and summing over $i = 1,\ldots,m$, we get $b = 0$. Next, squaring both sides of (3.3.3), then multiplying both
sides by $w_i$ and summing over $i = 1,\ldots,m$, we get $H = a^2\sum_{i=1}^m w_i(\hat\theta^B_i - \hat\theta^B_w)^2$ due to condition (ii). Finally, by condition (i), the result follows from (3.3.3).

Remark 5: As in the case of Theorem 1, it is possible to work with arbitrary $\phi_i$ rather than $\phi_i = w_i$ for all $i = 1,\ldots,m$. But then we do not get a closed-form minimizer, although it can be shown that such a minimizer exists. We can also provide an algorithm for finding this minimizer numerically.

The multiparameter extension of the above result proceeds as follows. Suppose now $\hat\theta_1,\ldots,\hat\theta_m$ are the $q$-component direct estimators of the small area means $\theta_1,\ldots,\theta_m$. We may generalize the constraints (i) and (ii) as

(iM) $e_w = \sum_{i=1}^m w_ie_i = t$ for some specified $t$, and

(iiM) $\sum_{i=1}^m w_i(e_i - e_w)(e_i - e_w)^T = H$,

where $H$ is a positive definite (possibly data-dependent) matrix and is often taken as $\sum_{i=1}^m w_iE[(\theta_i - \theta_w)(\theta_i - \theta_w)^T \mid \hat\theta]$. The second condition is equivalent to

$c^T\Big\{\sum_{i=1}^m w_i(e_i - e_w)(e_i - e_w)^T\Big\}c = c^THc$ for every $c = (c_1,\ldots,c_q)^T \neq 0$,

which simplifies to $\sum_{i=1}^m w_i\{c^T(e_i - e_w)\}^2 = c^THc$. An argument similar to before now leads to $c^T\hat\theta^{BM2}_i = c^T\hat\theta_w + a_{CB}\,c^T(\hat\theta^B_i - \hat\theta^B_w)$ for every $c \neq 0$, where $\hat\theta^B_i$ is the posterior mean of $\theta_i$, $\hat\theta^B_w = \sum_{i=1}^m w_i\hat\theta^B_i$, and $a^2_{CB} = c^THc\big/\sum_{i=1}^m w_i\{c^T(\hat\theta^B_i - \hat\theta^B_w)\}^2$. The coordinatewise benchmarked Bayes estimators are now obtained by letting $c = (1,0,\ldots,0)^T, \ldots, (0,0,\ldots,1)^T$ in succession.

The proposed approach can also be extended to two-stage benchmarking, somewhat similar to what is considered by Pfeffermann and Tiller (2006). To cite an example, consider the SAIPE scenario where one wants to estimate the number of poor school children in different counties within a state, as well as those numbers within the different school districts in all these counties. Let $\hat\theta_i$ denote the Current Population
Survey (CPS) estimate of $\theta_i$, the true number of poor school children for the $i$th county, and $\hat\theta^B_i$ the corresponding Bayes estimate, namely the posterior mean. Subject to the constraints $e_w = \sum_{i=1}^m w_ie_i = \sum_{i=1}^m w_i\hat\theta_i = \hat\theta_w$ and $\sum_{i=1}^m w_i(e_i - e_w)^2 = H$, the benchmarked Bayes estimate for $\theta_i$ in the $i$th county is $\hat\theta^{BM2}_i$ as given in (3.3.1). Next, suppose that $\hat\theta_{ij}$ is the CPS estimator of $\theta_{ij}$, the true number of poor school children for the $j$th school district in the $i$th county, and $w_{ij}$ is the weight attached to the direct CPS estimator of $\theta_{ij}$, $j = 1,\ldots,n_i$. We seek estimators $e_{ij}$ of $\theta_{ij}$ such that (i) $e_i = \sum_{j=1}^{n_i} w_{ij}e_{ij} = \hat\theta^{BENCH}_i$, the benchmarked estimator of $\theta_i = \sum_{j=1}^{n_i} w_{ij}\theta_{ij}$, and (ii) $\sum_{j=1}^{n_i} w_{ij}(e_{ij} - e_i)^2 = H_i$ for some preassigned $H_i$, where again $H_i$ can be taken as $\sum_{j=1}^{n_i} w_{ij}E[(\theta_{ij} - \theta_i)^2 \mid \hat x_i, \hat\theta_i]$, $\hat x_i$ being the vector with elements $\hat\theta_{ij}$. A benchmarked estimator similar to (3.3.1) can now be found for the $\theta_{ij}$ as well.

3.4 An Illustrative Example

The motivation behind this example is primarily to illustrate how the proposed Bayesian approach can be used for real-life data. The Small Area Income and Poverty Estimates (SAIPE) program at the U.S. Bureau of the Census produces model-based estimates of the number of poor school-aged children (5-17 years old) at the national, state, county, and school district levels. The school district estimates are benchmarked to the state estimates by the Department of Education to allocate funds under the No Child Left Behind Act of 2001. In the SAIPE program, the model-based state estimates are benchmarked to the national school-aged poverty rate using the ratio adjustment method. The number of poor school-aged children has been collected from the Annual Social and Economic Supplement (ASEC) of the CPS from 1995 to 2004, while ACS estimates have been used beginning in 2005. Additionally, the model-based county estimates are benchmarked to the model-based state estimates in a hierarchical fashion, once again using ratio adjustments. In this section, we will consider three sets of risk function weights $\phi_i$ that will be used to benchmark the estimated state poverty rates based on Theorem 1. We will also benchmark using the results from Theorem 3.
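Before turning to the data, the two-constraint estimator (3.3.1) can be sanity-checked numerically. The sketch below uses made-up numbers throughout and confirms that $\hat\theta^{BM2}$ matches both the mean target $t$ and the variability target $H$:

```python
import numpy as np

# Made-up numbers illustrating (3.3.1): the estimator hits both the mean
# target t and the variability target H exactly.
w = np.array([0.2, 0.3, 0.5])
theta_B = np.array([4.0, 6.0, 5.0])     # posterior means
t = 5.4                                 # benchmark total
H = 0.9                                 # preassigned variability

theta_Bw = w @ theta_B
a_CB = np.sqrt(H / (w @ (theta_B - theta_Bw)**2))
theta_BM2 = t + a_CB * (theta_B - theta_Bw)

assert np.isclose(w @ theta_BM2, t)            # constraint (i)
assert np.isclose(w @ (theta_BM2 - t)**2, H)   # constraint (ii)
```

Constraint (i) holds because the deviations $\hat\theta^B_i - \hat\theta^B_w$ have weighted mean zero, and (ii) holds by the definition of $a_{CB}$.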
In the SAIPE program, the state model for poverty rates in school-aged children follows the basic Fay-Herriot framework (see, e.g., Bell (1999)),

$\hat\theta_i = \theta_i + e_i,$   (3.4.1)
$\theta_i = x_i^T\beta + u_i,$   (3.4.2)

where $\theta_i$ is the true state-level poverty rate, $\hat\theta_i$ is the direct survey estimate (from CPS ASEC), $e_i$ is the sampling error term with assumed known variance $D_i$, the $x_i$ are the predictors, $\beta$ is the vector of regression coefficients, and $u_i$ is the model error with constant variance $\sigma^2_u$. The explanatory variables in the model are an IRS income tax-based pseudo-estimate of the child poverty rate, the IRS non-filer rate, the food stamp rate, and the residual term from the regression of the 1990 Census estimated child poverty rate. The posterior means and variances of $(\theta_i, \sigma^2_u)$ are estimated using a rejection sampling-type algorithm proposed by Everson and Morris (2000). Multivariate normal distributions are required for the first two levels of the hierarchy. See their paper for further details.

The state estimates were benchmarked to the CPS direct estimate of the national school-aged child poverty rate until 2004. The weights $w_i$ used to calibrate the states' poverty rates to the national poverty rate are proportional to the population estimates of the number of school-aged children in each state. We utilize three different sets of risk function weights $\phi_i$ to benchmark the estimated state poverty rates based on Theorem 1. The first set of weights is simply the weights used in the benchmarking, i.e., $\phi_i = w_i$. The second set of weights creates the ratio-adjusted benchmarked estimators, $\phi_i = w_i/\hat\theta^B_i$ (Example 1). The third set of weights uses the results from Pfeffermann and Barnard (1991), where $\phi_i = w_i/\mathrm{Cov}(\hat\theta^B_i, \hat\theta^B_w)$ (Example 3). Let these sets of benchmarked estimates be denoted $\hat\theta^{(1)}$, $\hat\theta^{(r)}$, and $\hat\theta^{(PB)}$, respectively. Finally, we benchmark the state poverty estimates using the results from Theorem 3 and denote the estimator by $\hat\theta^{(3)}$.
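The state model (3.4.1)-(3.4.2) can be sketched in a toy simulation. Every parameter value below is invented purely for illustration; a real analysis would instead use the CPS ASEC data and the Everson-Morris sampler to obtain the posterior means:

```python
import numpy as np

# Toy simulation of (3.4.1)-(3.4.2); every parameter value is invented.
# A real analysis would use the CPS ASEC data and the Everson-Morris
# rejection sampler to obtain the posterior means theta_B_i.
rng = np.random.default_rng(3)
m = 51                                        # states plus D.C.
x = np.column_stack([np.ones(m), rng.normal(size=(m, 4))])  # 4 predictors
beta = np.array([15.0, 1.0, 0.5, 0.8, -0.3])  # hypothetical coefficients
sigma2_u = 1.5                                # model error variance
D = rng.uniform(0.5, 3.0, size=m)             # known sampling variances

theta = x @ beta + rng.normal(scale=np.sqrt(sigma2_u), size=m)  # (3.4.2)
theta_hat = theta + rng.normal(scale=np.sqrt(D), size=m)        # (3.4.1)
```

Simulated direct estimates of this kind are also what one would feed into the benchmarking formulas of Theorems 1 and 3 to study their behavior before applying them to the survey data.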
Table 3-1. Benchmarking Statistics for ASEC CPS

year    $t$     $\hat\theta^B_w$    $a_{CB}$
1995    18.2    17.9    1.04
1997    17.9    17.8    1.04
1998    17.0    16.8    1.08
1999    15.2    14.9    1.07
2000    15.6    15.4    1.05
2001    15.3    15.1    1.05

For benchmarking, as given by Theorems 1 and 3, the key summary quantities are $t = \sum_i w_i\hat\theta_i$, $\hat\theta^B_w = \sum_i w_i\hat\theta^B_i$, and $a_{CB}$. As noted earlier, the choice of $H$ suggested in Theorem 3 implies that $a_{CB} \geq 1$. Six years of historical data from the CPS and the SAIPE program are analyzed and benchmarked using the four criteria mentioned above. Table 3-1 gives the key quantities for these six years. The hierarchical Bayes estimates underestimate the benchmarked poverty rate for all years given. Even if the estimate $\hat\theta^B_w$ is close to the benchmarked value $t$, there is still a strong desire to have exact agreement between the quantities when producing official statistics.

Figure 3-1 shows the differences of the various benchmarked estimates from the hierarchical Bayes estimates $\hat\theta^B_i$ for the year 1998, when the overall poverty level had to be raised to agree with the national direct estimate. We also show this particular year since it illustrates when $a_{CB}$ is the largest, at 1.08, corresponding to a very steep slope for $\hat\theta^{(3)}$: when the Bayes estimate is large we find a large linear increase, whereas when the Bayes estimate is quite small, the estimates are actually adjusted downwards. We contrast this with $\hat\theta^{(r)}$, where a large value of the Bayes estimate results in a small linear increase, whereas a small value of the Bayes estimate results in a small linear adjustment. The behavior of $\hat\theta^{(1)}$ is fairly obvious in that the Bayes estimate is adjusted by the same amount for each small area. Finally, observing $\hat\theta^{(PB)}$, we find that it is somewhat similar to $\hat\theta^{(r)}$; however, it is not a linear estimator.
We compare this plot to Figure 3-2. This figure shows the differences for year 2001, where the overall poverty level again had to be raised to obtain agreement. Similar comments can be made here as were made for year 1998. The main difference, however, regards $\hat\theta^{(3)}$. From Figure 3-2, we can see that the slope is not as steep, which is due to the fact that $a_{CB}$ is 1.04. However, we still notice that when the Bayes estimates are small we can get adjustments that are negative (as was also found in year 1998).

In addition, if the differences of each of the benchmarked estimators $\hat\theta^{(1)}$, $\hat\theta^{(r)}$, and $\hat\theta^{(3)}$ from the HB estimator are plotted versus the value of the Bayes estimator, the differences for each benchmarked estimator fall along a straight line. The lines for all three estimators pass through the point $(\hat\theta^B_w, t - \hat\theta^B_w)$. In fact, these benchmarked estimators can be written in the form

$\hat\theta^{BM}_i = t + \gamma(\hat\theta^B_i - \hat\theta^B_w),$

where $\gamma = 1$ for $\hat\theta^{(1)}$, $\gamma = t/\hat\theta^B_w$ for $\hat\theta^{(r)}$, and $\gamma = a_{CB}$ for $\hat\theta^{(3)}$. The slopes of the lines in Figures 3-1 and 3-2 for the differences of the benchmarked estimates from $\hat\theta^B_i$ are $\gamma - 1$. The slopes for $\hat\theta^{(1)}_i$ and $\hat\theta^{(r)}_i$ depend on whether the benchmarked total $t$ is larger or smaller than the model-based estimate $\hat\theta^B_w$. However, since $a_{CB} \geq 1$, the slope for the difference $\hat\theta^{(3)}_i - \hat\theta^B_i$ will always be non-negative. The Pfeffermann-Barnard benchmarked estimator does not follow this form. However, it does show a trend in a similar direction as the benchmarked estimator $\hat\theta^{(r)}_i$ based on ratio adjustment.
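The common linear form of the three benchmarked estimators can be checked directly. In this hypothetical sketch (weights, HB estimates, and targets all invented), each choice of the slope yields estimates that aggregate exactly to the benchmark $t$:

```python
import numpy as np

# Hypothetical sketch: each of the three linear benchmarked estimators
# is of the form t + gamma * (theta_B - theta_Bw) and aggregates to t.
w = np.array([0.3, 0.3, 0.4])
theta_B = np.array([14.0, 18.0, 16.0])   # HB poverty-rate estimates
t = 16.5                                 # national benchmark
a_CB = 1.08                              # variability-matching slope

theta_Bw = w @ theta_B
for gamma in (1.0, t / theta_Bw, a_CB):  # theta^(1), theta^(r), theta^(3)
    bench = t + gamma * (theta_B - theta_Bw)
    assert np.isclose(w @ bench, t)      # benchmarking constraint
```

The constraint holds for any slope because the deviations $\hat\theta^B_i - \hat\theta^B_w$ have weighted mean zero; the slope only governs how the adjustment is spread across areas.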
Figure 3-1. Change in estimators due to benchmarking for 1998

Figure 3-2. Change in estimators due to benchmarking for 2001
CHAPTER 4
TWO-STAGE BENCHMARKING

We extend the work of Datta et al. (2011) and propose two-stage benchmarking based on a nested error regression model. For example, in the Small Area Income and Poverty Estimates (SAIPE) project of the U.S. Census Bureau, one is required to benchmark the state estimates so that their aggregate matches the national estimate. Also, the county estimates within each state are benchmarked so that the aggregated benchmarked county estimates match the corresponding benchmarked state estimates. The current approach to two-stage benchmarking uses two separate models, for example, one for the states and the other for the counties, to achieve the necessary benchmarking. In contrast, we consider one single model to obtain the Bayes estimates and then adjust the Bayes estimates to achieve benchmarking at both levels. The need for a single model is often felt in many small area problems. For instance, the National Agricultural Statistics Service (NASS) of the United States Department of Agriculture (USDA) requires simultaneous benchmarking of region-level agricultural cash rent estimates to the state estimates, and of the county-level estimates to the corresponding region estimates.

4.1 Two-Stage Benchmarking Results

Let $\theta_{ij}$ denote the true parameter of interest for the $j$th unit in the $i$th area, and let $\hat\theta^B_{ij}$ denote its Bayes estimator under a certain prior ($j = 1,\ldots,n_i$; $i = 1,\ldots,m$). For a given set of normalized weights $w_{ij}$ ($\sum_{j=1}^{n_i} w_{ij} = 1$ for all $i$), let $\theta_{iw} = \sum_j w_{ij}\theta_{ij}$ denote the true weighted mean for the $i$th area. For example, $\theta_{ij}$ may be the true proportion of poor school children in the $j$th county in the $i$th state, and $w_{ij}$ may denote the true proportion of school children for the $j$th county in the $i$th state.

Also, let $\xi_i$ ($i = 1,\ldots,m$) denote the normalized weights for the $i$th area, where $\sum_i \xi_i = 1$. We want to find estimates $\hat\theta_{ij}$ for $\theta_{ij}$ and $e_i$ for $\theta_{iw}$ such that $\sum_i \xi_ie_i = p$ and $\sum_j w_{ij}\hat\theta_{ij} = e_i$ for all $i$. For example, $p$ might be the national proportion of poor school
children. Throughout we will use the notation $\hat\theta = (\hat\theta_{11},\ldots,\hat\theta_{1n_1},\ldots,\hat\theta_{m1},\ldots,\hat\theta_{mn_m})^T$ and $e = (e_1,\ldots,e_m)^T$. Also, $\hat\theta^B_{ij}$ denotes the Bayes estimate of $\theta_{ij}$ under a certain prior ($i = 1,\ldots,m$; $j = 1,\ldots,n_i$).

Our objective is to minimize the weighted squared error loss $L(\hat\theta, e)$ with respect to $\hat\theta_{ij}$ and $e_i$, where

$L(\hat\theta, e) = \sum_i\sum_j \phi_{ij}(\hat\theta_{ij} - \theta_{ij})^2 + \sum_i \psi_i(e_i - \theta_{iw})^2,$   (4.1.1)

subject to the restrictions (i) $\sum_i \xi_ie_i = p$ and (ii) $\sum_j w_{ij}\hat\theta_{ij} = e_i$ for all $i$. Note that the weights $\phi_{ij}$ and $\psi_i$ need not be the same as $w_{ij}$ and $\xi_i$, respectively. Let $s_i = \sum_j w_{ij}^2\phi_{ij}^{-1}$ for all $i$, and write $\hat\theta^B_{iw} = \sum_j w_{ij}\hat\theta^B_{ij}$ and $\hat\theta^B_w = \sum_i \xi_i\hat\theta^B_{iw}$. We now prove the following theorem.

Theorem 4. The minimizer of $E[L(\hat\theta, e) \mid \mathrm{data}]$ with respect to $\hat\theta$ and $e$ subject to the restrictions (i) and (ii) is given by

(a) $\hat\theta_{ij} = \hat\theta^B_{ij} + \dfrac{(p - \hat\theta^B_w)\,\xi_i(1 + \psi_is_i)^{-1}w_{ij}\phi_{ij}^{-1}}{\sum_{k=1}^m \xi_k^2s_k(1 + \psi_ks_k)^{-1}}$ for all $i, j$,

(b) $e_i = \hat\theta^B_{iw} + \dfrac{(p - \hat\theta^B_w)\,\xi_is_i(1 + \psi_is_i)^{-1}}{\sum_{k=1}^m \xi_k^2s_k(1 + \psi_ks_k)^{-1}}$ for all $i$.

Proof. First, we write

$E[L(\hat\theta, e) \mid \mathrm{data}] = E\Big[\sum_i\sum_j \phi_{ij}(\theta_{ij} - \hat\theta^B_{ij} + \hat\theta^B_{ij} - \hat\theta_{ij})^2 + \sum_i \psi_i(\theta_{iw} - \hat\theta^B_{iw} + \hat\theta^B_{iw} - e_i)^2 \,\Big|\, \mathrm{data}\Big]$
$= \sum_i\sum_j \phi_{ij}V(\theta_{ij} \mid \mathrm{data}) + \sum_i\sum_j \phi_{ij}(\hat\theta_{ij} - \hat\theta^B_{ij})^2 + \sum_i \psi_iV(\theta_{iw} \mid \mathrm{data}) + \sum_i \psi_i(\hat\theta^B_{iw} - e_i)^2.$   (4.1.2)

In view of (4.1.2), the problem reduces to minimization of

$g = \sum_i\sum_j \phi_{ij}(\hat\theta_{ij} - \hat\theta^B_{ij})^2 + \sum_i \psi_i(e_i - \hat\theta^B_{iw})^2 - 2\sum_i \lambda_{1i}\Big(\sum_j w_{ij}\hat\theta_{ij} - e_i\Big) - 2\lambda_2\Big(\sum_i \xi_ie_i - p\Big)$   (4.1.3)
with respect to $\hat\theta_{ij}$ and $e_i$ (for all $i, j$), where the $\lambda_{1i}$ and $\lambda_2$ are the Lagrangian multipliers. From (4.1.3), we find

$\partial g/\partial\hat\theta_{ij} = 2\phi_{ij}(\hat\theta_{ij} - \hat\theta^B_{ij}) - 2\lambda_{1i}w_{ij},$   (4.1.4)
$\partial g/\partial e_i = 2\psi_i(e_i - \hat\theta^B_{iw}) + 2\lambda_{1i} - 2\lambda_2\xi_i.$   (4.1.5)

Solving $\partial g/\partial\hat\theta_{ij} = 0$ implies $\hat\theta_{ij} = \hat\theta^B_{ij} + \lambda_{1i}w_{ij}\phi_{ij}^{-1}$. Now invoking $e_i = \sum_j w_{ij}\hat\theta_{ij}$, we find that $e_i = \hat\theta^B_{iw} + \lambda_{1i}s_i$. That is, $\lambda_{1i} = (e_i - \hat\theta^B_{iw})s_i^{-1}$. Next, we solve $\partial g/\partial e_i = 0$, which implies

$\psi_i(e_i - \hat\theta^B_{iw}) + \lambda_{1i} - \lambda_2\xi_i = 0 \iff \psi_i(e_i - \hat\theta^B_{iw}) + s_i^{-1}(e_i - \hat\theta^B_{iw}) = \lambda_2\xi_i \iff (e_i - \hat\theta^B_{iw})(1 + s_i\psi_i) = \lambda_2\xi_is_i \iff e_i = \hat\theta^B_{iw} + \lambda_2\xi_is_i(1 + s_i\psi_i)^{-1}.$   (4.1.6)

Recall that $\sum_i \xi_ie_i = p$. Applying this to (4.1.6),

$p = \hat\theta^B_w + \lambda_2\sum_i \xi_i^2s_i(1 + s_i\psi_i)^{-1},$   (4.1.7)

which implies $\lambda_2 = \dfrac{p - \hat\theta^B_w}{\sum_k \xi_k^2s_k(1 + s_k\psi_k)^{-1}}$. Then by (4.1.6) and (4.1.7), we get part (b) of the theorem.

Next, by $\hat\theta_{ij} = \hat\theta^B_{ij} + \lambda_{1i}w_{ij}\phi_{ij}^{-1}$ and $\lambda_{1i} = (e_i - \hat\theta^B_{iw})s_i^{-1}$, we get

$\hat\theta_{ij} = \hat\theta^B_{ij} + (e_i - \hat\theta^B_{iw})w_{ij}s_i^{-1}\phi_{ij}^{-1}.$   (4.1.8)

Now apply (b) to get (a) from (4.1.8).

Remark 6: The result simplifies somewhat when $\phi_{ij} = w_{ij}$ for all $i, j$. Then $s_i = 1$ for all $i$, so that

$\hat\theta_{ij} = \hat\theta^B_{ij} + \dfrac{(p - \hat\theta^B_w)\,\xi_i(1 + \psi_i)^{-1}}{\sum_{k=1}^m \xi_k^2(1 + \psi_k)^{-1}}, \qquad e_i = \hat\theta^B_{iw} + \dfrac{(p - \hat\theta^B_w)\,\xi_i(1 + \psi_i)^{-1}}{\sum_{k=1}^m \xi_k^2(1 + \psi_k)^{-1}}.$
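A numerical check of Theorem 4 with made-up inputs (two areas; the weights $w_{ij}$, $\xi_i$, the loss weights $\phi_{ij}$, $\psi_i$, and the benchmark $p$ are all invented): parts (a) and (b) satisfy both the within-area and the overall aggregation constraints.

```python
import numpy as np

# Illustrative check of Theorem 4; every number here is made up.
# Two areas with n_i = (2, 3) units; xi are the area weights, w the
# unit weights, and phi, psi the loss weights.
xi = np.array([0.6, 0.4])
w = [np.array([0.5, 0.5]), np.array([0.2, 0.3, 0.5])]
phi = [np.array([1.0, 2.0]), np.array([1.0, 1.0, 4.0])]
psi = np.array([1.5, 0.8])
theta_B = [np.array([2.0, 3.0]), np.array([5.0, 4.0, 6.0])]  # Bayes estimates
p = 4.2                                                      # overall benchmark

s = np.array([np.sum(wi**2 / fi) for wi, fi in zip(w, phi)])
theta_Biw = np.array([wi @ tb for wi, tb in zip(w, theta_B)])
theta_Bw = xi @ theta_Biw
R = np.sum(xi**2 * s / (1.0 + psi * s))

e = theta_Biw + (p - theta_Bw) * xi * s / (1.0 + psi * s) / R      # part (b)
theta = [tb + (p - theta_Bw) * xi[i] / (1.0 + psi[i] * s[i]) * (wi / fi) / R
         for i, (tb, wi, fi) in enumerate(zip(theta_B, w, phi))]   # part (a)

assert np.isclose(xi @ e, p)                   # overall constraint (i)
for i in range(2):
    assert np.isclose(w[i] @ theta[i], e[i])   # within-area constraint (ii)
```

Both assertions hold exactly by the algebra of the proof: summing part (a) over $j$ with weights $w_{ij}$ reproduces part (b), and summing part (b) over $i$ with weights $\xi_i$ reproduces $p$.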
Further simplification is possible when $\psi_i = \xi_i$ ($i = 1,\ldots,m$).

Often, in addition to controlling the averages as in Theorem 4, one wants to control the variability of the estimates as well. The following theorem provides a partial answer to this problem. We take $\phi_{ij} = w_{ij}$.

Theorem 5. The minimizing solution of $E[\sum_i\sum_j w_{ij}(\hat\theta_{ij} - \theta_{ij})^2 + \sum_i \psi_i(e_i - \theta_{iw})^2 \mid \mathrm{data}]$ subject to (i) $\sum_j w_{ij}\hat\theta_{ij} = e_i$, (ii) $\sum_j w_{ij}(\hat\theta_{ij} - e_i)^2 = h_i$, and (iii) $\sum_i \xi_ie_i = p$ is given by

(a) $\hat\theta_{ij} = \hat\theta^B_{ij} + \dfrac{(p - \hat\theta^B_w)\,\xi_i(1 + \psi_i)^{-1}}{\sum_k \xi_k^2(1 + \psi_k)^{-1}} + \big(h_id_i^{-1}\big)^{1/2}(\hat\theta^B_{ij} - \hat\theta^B_{iw})$,

(b) $e_i = \hat\theta^B_{iw} + \dfrac{(p - \hat\theta^B_w)\,\xi_i(1 + \psi_i)^{-1}}{\sum_k \xi_k^2(1 + \psi_k)^{-1}}$,

where $d_i = \sum_j w_{ij}(\hat\theta^B_{ij} - \hat\theta^B_{iw})^2$.

Proof. Similar to Theorem 4, the problem reduces to minimization of

$g = \sum_i\sum_j w_{ij}(\hat\theta_{ij} - \hat\theta^B_{ij})^2 + \sum_i \psi_i(e_i - \hat\theta^B_{iw})^2 - 2\sum_i \lambda_{1i}\Big(\sum_j w_{ij}\hat\theta_{ij} - e_i\Big) - \sum_i \lambda_{3i}\Big(\sum_j w_{ij}(\hat\theta_{ij} - e_i)^2 - h_i\Big) - 2\lambda_2\Big(\sum_i \xi_ie_i - p\Big)$   (4.1.9)

with respect to $\hat\theta_{ij}$ and $e_i$ subject to (i)-(iii), where again the $\lambda_{1i}$, $\lambda_2$, and $\lambda_{3i}$ are the Lagrangian multipliers. We begin with

$0 = \partial g/\partial\hat\theta_{ij} = 2w_{ij}(\hat\theta_{ij} - \hat\theta^B_{ij}) - 2\lambda_{1i}w_{ij} - 2\lambda_{3i}w_{ij}(\hat\theta_{ij} - e_i),$   (4.1.10)
$0 = \partial g/\partial e_i = 2\psi_i(e_i - \hat\theta^B_{iw}) + 2\lambda_{1i} - 2\lambda_2\xi_i.$   (4.1.11)

By (4.1.10), we find

$\hat\theta_{ij} = \hat\theta^B_{ij} + \lambda_{1i} + \lambda_{3i}(\hat\theta_{ij} - e_i).$   (4.1.12)

This implies

$e_i = \hat\theta^B_{iw} + \lambda_{1i}.$   (4.1.13)
Combining( 4.1.12 )and( 4.1.13 ), ^ ij e i = ^ B ij ^ B iw + 3 i ( ^ ij e i ). Nowweapply constraint(ii)tond (1 3 i ) 2 h i = P j w ij ( ^ B ij ^ B iw ) 2 = d i whichimplies 3 i =1 d i h 1 i 1 2 Thisinreturnimpliesthat ^ ij e i = ^ B ij iw + h 1 d i h 1 i 1 2 i ( ^ ij e i ) () ^ ij e i = h i d 1 i 1 2 ( ^ B ij ^ B iw ). (4.1.14) From( 4.1.11 ),weknowthat e i = ^ B iw 1 i 1 i + 2 i 1 i Nowby( 4.1.13 ), e i = ^ B iw ( e i ^ B iw ) 1 i + 2 i 1 i Thisleadsto 2 i =( e i ^ B iw )(1+ i ), whichimplies p = ^ B w + 2 P k 2 k (1+ k ) 1 Then e i = ^ B iw + 2 i (1+ i ) 1 = ) e i = ^ B iw + ( p ^ B w ) P k 2 k (1+ k ) 1 i (1+ i ) 1 (4.1.15) Wecombine( 4.1.14 )and( 4.1.15 )togetthetheabovetheorem. WenowextendTheorem1toamultiparametersetting.Conside rthefollowing notation.Let =( 11 ,..., 1 n 1 ,..., m 1 ,... mn m ) T ^ =( ^ 11 ,..., ^ 1 n 1 ,..., ^ m 1 ,... ^ mn m ) T and e =( e 1 ,..., e m ) T Dene iw = P j W ij ij and w = P i i iw where W ij and i are knownpositivedenitematrices(forall i and j ).WedenotetheBayesestimatorsby ^ B =( ^ B11 ,..., ^ B1 n 1 ,..., ^ Bm 1 ,... ^ Bmn m ) T Wealsodene ^ B iw = P j W ij ^ B ij and ^ B w = P i i ^ B iw Theorem6. Considerthelossfunction L ( ^ e )= P i P j ( ^ ij ij ) T ij ( ^ ij ij )+ P i ( e i iw ) T n i ( e i iw ), where ij and n i arepositivedenitematrices.Thentheminimizing solutionof E [ L ( ^ e ) j data ] subjectto(i) P j W ij ^ ij = e i and(ii) P i i e i = p isgivenby (a) ^ ij = ^ B ij + 1 ij W ij s 1 i (n i + s 1 i ) 1 i R 1 ( p ^ B w ) (b) e i = ^ B iw +(n i + s 1 i ) 1 i R 1 ( p ^ B w ), where s i = P j W ij 1 ij W ij and R = P i i (n i + s 1 i ) 1 i Notethat ij neednotbethesameas W ij Similarly, n i neednotbethesameas i Also, p isgiven. 47


Proof. By standard results, the problem reduces to minimization of

g = Σ_i Σ_j (δ̂_ij − θ̂^B_ij)^T Ξ_ij (δ̂_ij − θ̂^B_ij) + Σ_i (e_i − θ̂^B_iw)^T Λ_i (e_i − θ̂^B_iw) − 2 Σ_i λ_1i^T (Σ_j W_ij δ̂_ij − e_i) − 2 λ_2^T (Σ_i Ω_i e_i − p),

where the λ_1i and λ_2 are the Lagrange multipliers. We solve

0 = ∂g/∂δ̂_ij = 2 Ξ_ij (δ̂_ij − θ̂^B_ij) − 2 W_ij λ_1i ∀ i, j,     (4.1.16)

0 = ∂g/∂e_i = 2 Λ_i (e_i − θ̂^B_iw) + 2 λ_1i − 2 Ω_i λ_2 ∀ i.     (4.1.17)

From (4.1.16),

δ̂_ij = θ̂^B_ij + Ξ_ij^{-1} W_ij λ_1i ⟹ e_i = θ̂^B_iw + s_i λ_1i ⟺ λ_1i = s_i^{-1} (e_i − θ̂^B_iw) ∀ i.     (4.1.18)

From (4.1.17),

e_i = θ̂^B_iw + Λ_i^{-1} Ω_i λ_2 − Λ_i^{-1} λ_1i ⟺ e_i = θ̂^B_iw + Λ_i^{-1} Ω_i λ_2 − Λ_i^{-1} s_i^{-1} (e_i − θ̂^B_iw) ⟺ Λ_i^{-1} Ω_i λ_2 = (I + Λ_i^{-1} s_i^{-1})(e_i − θ̂^B_iw) ⟹ e_i = θ̂^B_iw + (Λ_i + s_i^{-1})^{-1} Ω_i λ_2.     (4.1.19)

Applying constraint (ii), we find p = θ̂^B_w + Σ_i Ω_i (Λ_i + s_i^{-1})^{-1} Ω_i λ_2 = θ̂^B_w + R λ_2, where R := Σ_i Ω_i (Λ_i + s_i^{-1})^{-1} Ω_i. This implies that λ_2 = R^{-1} (p − θ̂^B_w). From (4.1.19),

e_i = θ̂^B_iw + (Λ_i + s_i^{-1})^{-1} Ω_i R^{-1} (p − θ̂^B_w).     (4.1.20)

Combining (4.1.18) and (4.1.20),

δ̂_ij = θ̂^B_ij + Ξ_ij^{-1} W_ij λ_1i ⟺ δ̂_ij = θ̂^B_ij + Ξ_ij^{-1} W_ij s_i^{-1} (e_i − θ̂^B_iw) ⟹ δ̂_ij = θ̂^B_ij + Ξ_ij^{-1} W_ij s_i^{-1} (Λ_i + s_i^{-1})^{-1} Ω_i R^{-1} (p − θ̂^B_w).     (4.1.21)


The result follows from (4.1.20) and (4.1.21).

4.2 An Example

This section considers small area/domain estimation of the proportion of persons without health insurance for several domains of the Asian subpopulation. Our goal is to benchmark the aggregated probabilities that a person does not have health insurance to the corresponding domain values. We also benchmark the domain estimates to match the overall population estimates.

The small domains were constructed on the basis of age, sex, race, and the region where each person lives. The National Health Interview Survey (NHIS) data provides the individual-level binary response data as well as the individual-level covariates (Ghosh, Kim, Sinha, Maiti, Katzoff and Parsons (2009)). We have information on the main response variable of interest, whether or not a person has health insurance. More information on this data is described in Ghosh, Kim, Sinha, Maiti, Katzoff and Parsons (2009). Moreover, when targeting specific subpopulations cross-classified by demographic characteristics, direct estimates are usually accompanied with large standard errors and coefficients of variation. Hence, a procedure such as the one proposed in Section 2 is appropriate.

The Asian group is made up of the following four groups: Chinese, Filipino, Asian Indian, and others such as Koreans, Vietnamese, Japanese, Hawaiian, Samoan, Guamanian, etc. Individuals from these subpopulations are assigned to specific domains depending on their age, gender, and the region they come from. There are three age groups (0–17, 18–64, and 65+). Furthermore, there are two genders, four races, and four regions that depend on the size of the metropolitan statistical area (< 499,999; 500,000–999,999; 1,000,000–2,499,999; > 2,500,000). Hence, there are 4 × 2 × 3 × 4 = 96 total domains.

In Ghosh, Kim, Sinha, Maiti, Katzoff and Parsons (2009) a stepwise regression procedure is used for this data set to reach a final model with an intercept term and


three covariates: family size, education level, and total family income. Since we are using this data for mainly illustrative purposes, we use the same covariates. The model structure is as follows:

y_ij | p_ij ~ind Bin(1, p_ij) ∀ i, j,
logit(p_ij) = x_ij^T β + u_i ∀ i, j,
u_i ~iid N(0, σ_u²) ∀ i,
β ~ Uniform(R^p),
σ_u^{-2} ~ Gamma(c, d).

Here y_ij is the response of the j-th unit in the i-th small domain (i.e., whether or not a person has health insurance) and p_ij is the probability that person j in unit i does not have health insurance.

In our analysis of the data, we first find the Bayes estimates and associated standard errors (of the small domains) using Markov chain Monte Carlo (MCMC). Let p̂^(m)_ij denote the sampled value of p_ij from the MCMC output generated from the m-th draw, where there are M total draws. The Monte Carlo estimate of E(p_ij | y) is M^{-1} Σ_{m=1}^M p̂^(m)_ij. The Monte Carlo estimate of Cov(p_ij, p_ij′ | y) is M^{-1} Σ_{m=1}^M p̂^(m)_ij p̂^(m)_ij′ − (M^{-1} Σ_{m=1}^M p̂^(m)_ij)(M^{-1} Σ_{m=1}^M p̂^(m)_ij′). Then E(p_iw | y) = Σ_{j=1}^{n_i} w_ij E(p_ij | y) and Var(p_iw | y) = Σ_{j=1}^{n_i} Σ_{j′=1}^{n_i} w_ij w_ij′ Cov(p_ij, p_ij′ | y). We will then find the two-stage benchmarked Bayes estimates using Theorem 1 and considering the weights below. Furthermore, to measure the variability associated with the two-stage benchmarked Bayes estimates at the area levels, we use the posterior mean squared error (PMSE), where

PMSE(δ̂^BENCH_i) = E[(δ̂^BENCH_i − θ_i)² | θ̂].     (4.2.1)

Then the PMSE of e_i is given by

PMSE(e_i) = (e_i − θ̂^B_iw)² + Var(θ_iw | θ̂).     (4.2.2)
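The Monte Carlo summaries above can be sketched as follows. This is an illustrative fragment only: `draws` is a stand-in array of hypothetical MCMC samples for one domain, not output from the actual NHIS model fit.

```python
import numpy as np

# Hypothetical MCMC output for one domain: M draws of (p_i1, ..., p_in)
rng = np.random.default_rng(1)
M, n = 2000, 6
draws = rng.beta(2, 12, size=(M, n))         # stand-in for sampled p_ij values
w = np.full(n, 1.0 / n)                      # normalized unit weights w_ij

post_mean = draws.mean(axis=0)               # M^{-1} sum_m p_ij^(m)
# Monte Carlo estimate of Cov(p_ij, p_ij' | y)
post_cov = (draws.T @ draws) / M - np.outer(post_mean, post_mean)

# Aggregated posterior summaries for p_iw = sum_j w_ij p_ij
E_piw = w @ post_mean                        # E(p_iw | y)
Var_piw = w @ post_cov @ w                   # Var(p_iw | y)

# sanity check: aggregating the covariance matrix is the same as taking the
# variance of the weighted draws directly
assert np.isclose(Var_piw, (draws @ w).var())
```

Given a benchmarked area estimate e_i, the PMSE in (4.2.2) would then be `(e_i - E_piw)**2 + Var_piw`.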


We consider in our analysis the weights w_ij = w̃_ij / Σ_j w̃_ij, where w̃_ij is the final person weight in the NHIS data set, and ω_i = Σ_j w̃_ij / (Σ_i Σ_j w̃_ij). This leads to

p = Σ_i Σ_j w̃_ij y_ij / Σ_i Σ_j w̃_ij = Σ_i ω_i (Σ_j w_ij y_ij).

We also take ξ_ij = V^{-1}(p_ij | data) and ψ_i = V^{-1}(Σ_j w_ij p_ij | data). Simplification of the area-level estimate can be found by inserting these quantities into Theorem 1.

Table 4-1 contains the direct, hierarchical Bayes, and benchmarked Bayes estimates and associated posterior RMSEs (PRMSE) for each of the small domains. It also contains the percent increase in the PRMSE in the benchmarked Bayes estimates compared to the HB estimates. The direction of adjustment depends on the sign of p − θ̂^B_w, and the amount of adjustment depends on the relative magnitudes of ω_i s_i (1 + ψ_i s_i)^{-1}. For the given data set, the adjustments are always positive since p > θ̂^B_w. Also, with the present choice of weights, the percent increase in PRMSE over the HB estimators is quite small, which somewhat justifies the choice of the given HB model. Moreover, the amount of adjustment is typically more for domains with smaller sample sizes as compared to those with larger samples, as we would like to see.

Consider, for example, domain 12 with a sample size of 11 and se(HB) equal to 0.056. Since the constraint weight ψ_i is inversely proportional to the posterior variance, we expect to see a larger adjustment in this domain, which is precisely what occurs. We also note that this domain also shows the largest percent increase in PRMSE, which is 1.616. Similar expected behavior occurs for domain 79 with sample size 122, which ends up having the smallest overall adjustment.
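The weight algebra above is easy to verify numerically. The sketch below uses made-up person weights and responses (not the NHIS file) and confirms that the overall target p can be computed either directly or through the domain-level decomposition.

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 5, 8
wt = rng.uniform(100, 5000, (m, n))          # hypothetical final person weights w~_ij
y = rng.integers(0, 2, (m, n))               # binary responses y_ij

w = wt / wt.sum(axis=1, keepdims=True)       # normalized unit weights w_ij
omega = wt.sum(axis=1) / wt.sum()            # domain-level weights omega_i

p_direct = (wt * y).sum() / wt.sum()         # overall weighted proportion
p_via_domains = (omega * (w * y).sum(axis=1)).sum()

assert np.isclose(p_direct, p_via_domains)   # the two forms of p agree
```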


Table 4-1. Table of estimates using Theorem 1

Domain  n_i  Direct  HB     se(HB)   e_i    PRMSE    %inc*
1       10   0.126   0.135  0.04889  0.141  0.04927  0.784
2        0   -       -      -        -      -        -
3       24   0.063   0.088  0.03198  0.091  0.03213  0.467
4       28   0.146   0.150  0.04471  0.158  0.04538  1.514
5       20   0.138   0.125  0.04138  0.132  0.04193  1.320
6       17   0.112   0.130  0.04290  0.135  0.04318  0.662
7       78   0.097   0.108  0.02746  0.110  0.02757  0.396
8       66   0.274   0.214  0.04228  0.219  0.04265  0.880
9        5   0.173   0.127  0.04820  0.132  0.04846  0.542
10       6   0       0.143  0.05422  0.150  0.05471  0.908
11       7   0       0.140  0.05159  0.147  0.05208  0.941
12      11   0.335   0.164  0.05630  0.175  0.05721  1.616
13       7   0.133   0.119  0.04613  0.126  0.04657  0.969
14       2   0       0.097  0.04217  0.100  0.04229  0.288
15      27   0       0.075  0.02837  0.077  0.02846  0.315
16      29   0.113   0.129  0.03963  0.134  0.03992  0.723
17      27   0.12    0.125  0.03925  0.130  0.03955  0.757
18      14   0       0.093  0.03571  0.097  0.03589  0.492
19      77   0.131   0.125  0.02931  0.128  0.02943  0.426
20      75   0.223   0.188  0.03762  0.192  0.03788  0.697
21       3   0       0.100  0.04185  0.104  0.04204  0.449
22       6   0       0.115  0.04532  0.119  0.04554  0.489
23       8   0       0.153  0.05529  0.160  0.05579  0.902
24       9   0       0.119  0.04514  0.124  0.04542  0.625
25      10   0       0.094  0.03777  0.098  0.03799  0.576
26       6   0       0.095  0.04043  0.098  0.04056  0.329
27      32   0.098   0.114  0.03617  0.118  0.03636  0.547
28      23   0       0.083  0.03176  0.086  0.03193  0.531
29      25   0.187   0.122  0.04021  0.127  0.04050  0.711
30      23   0.226   0.145  0.04482  0.150  0.04509  0.602
31      71   0.118   0.118  0.02996  0.121  0.03004  0.282
32      50   0.109   0.109  0.03134  0.112  0.03152  0.567
33       2   0       0.115  0.04834  0.119  0.04856  0.439
34       2   0       0.119  0.04902  0.123  0.04911  0.189
35       8   0.108   0.117  0.04439  0.122  0.04471  0.720
36       7   0       0.125  0.04829  0.130  0.04851  0.456
37       9   0.062   0.098  0.03949  0.103  0.03987  0.966
38      17   0       0.073  0.02929  0.076  0.02937  0.279
39      24   0.117   0.136  0.04386  0.142  0.04423  0.836
40      20   0       0.096  0.03532  0.100  0.03560  0.816
41      50   0.163   0.156  0.03848  0.160  0.03875  0.703

* in PRMSE


Table 4-1. Continued

Domain  n_i  Direct  HB     se(HB)   e_i    PRMSE    %inc*
42      38   0.141   0.122  0.03592  0.126  0.03612  0.558
43      76   0.104   0.114  0.02822  0.116  0.02829  0.251
44      73   0.142   0.121  0.02961  0.124  0.02975  0.468
45       2   0       0.129  0.05255  0.135  0.05284  0.546
46       3   0       0.096  0.04157  0.099  0.04171  0.345
47      10   0       0.098  0.03858  0.101  0.03872  0.371
48       7   0       0.125  0.04711  0.130  0.04739  0.596
49      10   0.087   0.116  0.04245  0.121  0.04275  0.692
50       5   0       0.126  0.05070  0.131  0.05096  0.510
51      23   0.038   0.089  0.03253  0.092  0.03264  0.355
52      21   0.243   0.143  0.04584  0.149  0.04625  0.907
53      31   0.114   0.125  0.03789  0.129  0.03814  0.671
54      18   0.202   0.158  0.04941  0.164  0.04971  0.601
55      74   0.094   0.094  0.02554  0.096  0.02562  0.312
56      83   0.204   0.186  0.03699  0.191  0.03729  0.814
57       2   0       0.140  0.05667  0.145  0.05688  0.363
58       1   0       0.093  0.04170  0.096  0.04177  0.163
59       2   0       0.094  0.04115  0.097  0.04125  0.232
60       8   0.112   0.132  0.04798  0.138  0.04831  0.689
61      16   0.202   0.156  0.05206  0.163  0.05260  1.040
62       3   0.301   0.158  0.06362  0.165  0.06395  0.516
63      33   0.055   0.093  0.03118  0.096  0.03131  0.421
64      28   0.105   0.118  0.03743  0.122  0.03764  0.560
65      33   0.126   0.133  0.03899  0.138  0.03928  0.731
66      13   0.393   0.207  0.06393  0.216  0.06460  1.038
67      70   0.079   0.096  0.02635  0.098  0.02642  0.283
68      75   0.179   0.159  0.03425  0.163  0.03444  0.537
69       1   1       0.146  0.06075  0.149  0.06081  0.107
70       2   0.361   0.157  0.05946  0.162  0.05968  0.372
71       4   0       0.104  0.04291  0.108  0.04307  0.384
72       2   0       0.226  0.08082  0.235  0.08134  0.643
73      45   0.271   0.208  0.04578  0.215  0.04632  1.189
74      10   0       0.096  0.03854  0.100  0.03873  0.485
75      83   0.149   0.134  0.03002  0.137  0.03017  0.489
76      59   0.113   0.126  0.03247  0.129  0.03263  0.505
77      68   0.338   0.270  0.04638  0.276  0.04685  1.024
78      39   0.098   0.107  0.03331  0.111  0.03346  0.452
79     122   0.11    0.106  0.02348  0.108  0.02355  0.285
80     125   0.308   0.293  0.03805  0.298  0.0383   0.651
81       7   0       0.122  0.04681  0.126  0.04704  0.484
82      12   0       0.097  0.03752  0.101  0.03771  0.498
83      13   0.049   0.122  0.04428  0.128  0.04458  0.678

* in PRMSE


Table 4-1. Continued

Domain  n_i  Direct  HB     se(HB)   e_i    PRMSE    %inc*
84       4   0       0.126  0.04915  0.131  0.04940  0.506
85      32   0.189   0.172  0.04641  0.178  0.04674  0.699
86      10   0.135   0.132  0.04844  0.137  0.04878  0.700
87      52   0.192   0.157  0.03765  0.162  0.03800  0.925
88      65   0.153   0.140  0.03346  0.143  0.03366  0.584
89      71   0.285   0.226  0.04132  0.231  0.04164  0.763
90      57   0.086   0.103  0.02965  0.106  0.02979  0.446
91     153   0.149   0.135  0.02387  0.136  0.02394  0.297
92     138   0.308   0.277  0.03542  0.281  0.03563  0.593
93      10   0       0.121  0.04513  0.126  0.04547  0.761
94      16   0.067   0.107  0.03885  0.110  0.03901  0.404
95      18   0.108   0.144  0.04668  0.150  0.04707  0.816
96      14   0.111   0.138  0.04731  0.143  0.04759  0.595

* in PRMSE


CHAPTER 5
ON ESTIMATION OF MSES OF BENCHMARKED EB ESTIMATORS

Small area estimation has become increasingly popular recently due to a growing demand for such statistics. It is well known that direct small-area estimators usually have large standard errors and coefficients of variation. In order to produce estimates for these small areas, it is necessary to borrow strength from other related areas. Accordingly, model-based estimates often differ widely from the direct estimates, especially for areas with small sample sizes. One problem that arises in practice is that the model-based estimates do not aggregate to the more reliable direct survey estimates. Agreement with the direct estimates is often a political necessity to convince legislators regarding the utility of small area estimates. The process of adjusting model-based estimates to correct this problem is known as benchmarking. Another key benefit of benchmarking is protection against model misspecification, as pointed out by You, Rao and Dick (2004) and Datta, Ghosh, Steorts and Maples (2011).

In recent years, the literature on benchmarking has grown in small area estimation. Among others, Pfeffermann and Barnard (1991); You and Rao (2003); You, Rao and Dick (2004); and Pfeffermann and Tiller (2006) have made an impact on the continuing development of this field. Specifically, Wang, Fuller and Qu (2008) provided a frequentist method wherein an augmented model was used to construct a best linear unbiased predictor (BLUP) that automatically satisfies the benchmarking constraint. In addition, Datta, Ghosh, Steorts and Maples (2011) developed very general benchmarked Bayes estimators, which covered most of the earlier estimators that have been motivated from either a frequentist or Bayesian perspective. In particular, they found benchmarked Bayes estimators under the celebrated Fay and Herriot (1979) model.

Due to the fact that they borrow strength, model-based estimates typically show a substantial improvement over direct estimates in terms of mean squared error (MSE). It


is of particular interest to determine how much of this advantage is lost by constraining the estimates through benchmarking. The aforementioned work of Wang, Fuller and Qu (2008) examined this question through simulation studies but did not derive any probabilistic results. They showed that the MSE of the benchmarked EB estimator was slightly larger than the MSE of the EB estimator for their simulation studies. In Section 5.2 we derive a second-order approximation of the MSE of the benchmarked EB estimator to show that the increase due to benchmarking is O(m^{-1}), where m is the number of small areas.

In this paper, we are concerned with the basic area-level model of Fay and Herriot (1979). We obtain benchmarked empirical Bayes estimators, and these estimators are proposed in Section 5.1. In Section 5.2, we derive a second-order asymptotic expansion of the MSE of the benchmarked EB estimator. In Section 5.3, we then find an estimator of this MSE and compare it to the second-order approximation of the MSE of the EB estimator or equivalently the MSE of the EBLUP, which was derived by Prasad and Rao (1990). Finally, in Section 5.4, using methods similar to those of Butar and Lahiri (2003), we compute a parametric bootstrap estimate of the mean squared error of the benchmarked EB estimate under the Fay-Herriot (1979) model and compare it to our estimates from Section 5.1. Section 5.5 contains an application based on Small Area Income and Poverty Estimation Data (SAIPE) from the U.S. Census Bureau.

5.1 Benchmarked EB Estimators

Consider the area-level random effects model

θ̂_i = θ_i + e_i, θ_i = x_i^T β + u_i; i = 1, ..., m,     (5.1.1)

where the e_i and u_i are mutually independent with e_i ~ind N(0, D_i) and u_i ~iid N(0, σ_u²). This model was first considered in the context of estimating income for small areas (population less than 1000) by Fay and Herriot (1979). In model (5.1.1), the D_i are known, as


are the p × 1 design vectors x_i. However, the p × 1 vector of regression coefficients β is unknown.

When the variance component σ_u² is known and β has a uniform prior on R^p, then the Bayes estimator of θ_i is given by θ̂^B_i = (1 − B_i) θ̂_i + B_i x_i^T β̃, where B_i = D_i (σ_u² + D_i)^{-1}, β̃ ≡ β̃(σ_u²) = (X^T V^{-1} X)^{-1} X^T V^{-1} θ̂, and V = Diag(σ_u² + D_1, ..., σ_u² + D_m). Suppose now we want to match the weighted average of some estimates e_i to the weighted average of the direct estimates, which we denote by t. We will assume for our calculations that t = θ̂_w. We denote the normalized weights by w_i, so that Σ_i w_i = 1. Under the loss L(e, θ) = Σ_i w_i (θ_i − e_i)² and subject to Σ_i w_i e_i = Σ_i w_i θ̂_i, the benchmarked Bayes estimator as derived in Datta, Ghosh, Steorts and Maples (2011) is given by

θ̂^BM1_i = θ̂^B_i + (θ̂_w − θ̂^B_w); i = 1, ..., m.     (5.1.2)

In more realistic settings, σ_u² is unknown. Define P_X = X (X^T X)^{-1} X^T, h_ij = x_i^T (X^T X)^{-1} x_j, û_i = θ̂_i − x_i^T β̂, and β̂ = (X^T X)^{-1} X^T θ̂. In this paper, we consider the simple moment estimator given by σ̂_u² = max{0, σ̃_u²}, where σ̃_u² = (m − p)^{-1} [Σ_{i=1}^m û_i² − Σ_{i=1}^m D_i (1 − h_ii)], which is given in Prasad and Rao (1990). Then the empirical benchmarked Bayes estimator of θ_i is given by

θ̂^EBM1_i = θ̂^EB_i + (θ̂_w − θ̂^EB_w),     (5.1.3)

where θ̂^EB_i = (1 − B̂_i) θ̂_i + B̂_i x_i^T β̃(σ̂_u²), B̂_i = D_i (σ̂_u² + D_i)^{-1}; i = 1, ..., m. The objective of the next two sections will be to obtain the MSE of the benchmarked empirical Bayes estimator correct up to O(m^{-1}) and also to find an estimator of the MSE correct to the same order.

5.2 Second-Order Approximation to MSE

Wang et al. (2008) construct a simulation study to compare the MSE of the benchmarked EB estimator to the MSE of the EB estimator. In this section, we derive a second-order expansion for the MSE of the benchmarked Bayes estimator under the same


regularity conditions and assuming the standard benchmarking constraint. That is, assuming the model proposed in Section 5.1, we obtain a second-order approximation to the MSE of the empirical benchmarked Bayes estimator. Define h^V_ij = x_i^T (X^T V^{-1} X)^{-1} x_j and assume that σ_u² > 0.

The following regularity conditions are necessary for establishing Theorem 7:

(i) 0 < D_L ≤ inf_{1≤i≤m} D_i ≤ sup_{1≤i≤m} D_i ≤ D_U < ∞;
(ii) max_{1≤i≤m} h_ii = O(m^{-1}); and
(iii) max_{1≤i≤m} w_i = O(m^{-1}).

Condition (iii) requires a kind of homogeneity of the small areas; in particular, that there do not exist a few large areas which dominate the remaining small areas in terms of the w_i. Conditions (i) and (ii) are similar to those of Prasad and Rao (1990) and are very often assumed in the small area estimation literature. The resulting approximation is given in Theorem 7.

Theorem 7. Assume regularity conditions (i)–(iii) hold. Then

E[(θ̂^EBM1_i − θ_i)²] = g_1i(σ_u²) + g_2i(σ_u²) + g_3i(σ_u²) + g_4(σ_u²) + o(m^{-1}),

where

g_1i(σ_u²) = B_i σ_u²,
g_2i(σ_u²) = B_i² h^V_ii,
g_3i(σ_u²) = B_i³ D_i^{-1} Var(σ̃_u²),
g_4(σ_u²) = Σ_{i=1}^m w_i² B_i² V_i − Σ_{i=1}^m Σ_{j=1}^m w_i w_j B_i B_j h^V_ij (with V_i = σ_u² + D_i),

and where Var(σ̃_u²) = 2 (m − p)^{-2} Σ_{k=1}^m (σ_u² + D_k)² + o(m^{-1}).
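To make the pieces concrete, here is a sketch in Python/NumPy (simulated data; none of the numbers come from the dissertation) that fits the Fay-Herriot model by the Prasad-Rao moment method of Section 5.1, forms the benchmarked EB estimator (5.1.3), and evaluates the four g-terms of Theorem 7 with V_i = σ_u² + D_i:

```python
import numpy as np

rng = np.random.default_rng(2)
m, p = 50, 2
X = np.column_stack([np.ones(m), rng.normal(size=m)])
D = rng.uniform(0.3, 1.0, size=m)                     # known sampling variances
theta = X @ np.array([1.0, 0.5]) + rng.normal(0, np.sqrt(0.6), m)
theta_hat = theta + rng.normal(0, np.sqrt(D))         # direct estimates
w = np.full(m, 1.0 / m)                               # benchmarking weights

# Prasad-Rao moment estimator of sigma_u^2 (Section 5.1)
H = X @ np.linalg.inv(X.T @ X) @ X.T
u_hat = theta_hat - H @ theta_hat                     # residuals theta_hat - X beta_ols
s2 = max(0.0, ((u_hat**2).sum() - (D * (1 - np.diag(H))).sum()) / (m - p))

# EB estimator and its benchmarked version (5.1.3)
V = s2 + D
B = D / V
beta_gls = np.linalg.solve(X.T @ (X / V[:, None]), X.T @ (theta_hat / V))
theta_EB = (1 - B) * theta_hat + B * (X @ beta_gls)
theta_EBM1 = theta_EB + (w @ theta_hat - w @ theta_EB)

# g-terms of Theorem 7, evaluated at s2
hV = X @ np.linalg.inv(X.T @ (X / V[:, None])) @ X.T  # h^V_ij matrix
var_s2 = 2 * (V**2).sum() / (m - p)**2                # leading term of Var(sigma~_u^2)
g1 = B * s2
g2 = B**2 * np.diag(hV)
g3 = B**3 / D * var_s2
g4 = (w**2 * B**2 * V).sum() - (np.outer(w * B, w * B) * hV).sum()
mse_approx = g1 + g2 + g3 + g4                        # second-order MSE approximation

assert np.isclose(w @ theta_EBM1, w @ theta_hat)      # benchmark constraint holds
assert g4 > 0                                         # price paid for benchmarking
```

Note that g_4 is a single constant shared across areas; in the SAIPE application of Section 5.5 it is exactly the shift between the EB and benchmarked-EB MSE columns.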


Proof. Observe

E[(θ̂^EBM1_i − θ_i)²] = E[(θ̂^B_i − θ_i)²] + E[(θ̂^EBM1_i − θ̂^B_i)²]
= E[(θ̂^B_i − θ_i)²] + E[(θ̂^B_i − θ̂^EB_i − t + θ̂^EB_w)²]
= E[(θ̂^B_i − θ_i)²] + E[(θ̂^B_i − θ̂^EB_i + θ̂^EB_w − θ̂^B_w + θ̂^B_w − t)²]
= E[(θ̂^B_i − θ_i)²] + E[(θ̂^EB_i − θ̂^B_i)²] + E[(θ̂^B_w − t)²] + E[(θ̂^EB_w − θ̂^B_w)²]
  − 2 E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_w − θ̂^B_w)] − 2 E[(θ̂^EB_i − θ̂^B_i)(θ̂^B_w − t)]
  + 2 E[(θ̂^EB_w − θ̂^B_w)(θ̂^B_w − t)].     (5.2.1)

Next, we observe that E[(θ̂^B_i − θ_i)²] + E[(θ̂^EB_i − θ̂^B_i)²] = g_1i(σ_u²) + g_2i(σ_u²) + g_3i(σ_u²) + o(m^{-1}) by Prasad and Rao (1990), where g_1i(σ_u²) = B_i σ_u², g_2i(σ_u²) = B_i² h^V_ii, and g_3i(σ_u²) = B_i³ D_i^{-1} Var(σ̃_u²). It may be noted that while g_1i(σ_u²) = O(1), both g_2i(σ_u²) and g_3i(σ_u²) are of order O(m^{-1}), as shown in Prasad and Rao (1990). We will show that E[(θ̂^B_w − t)²] = g_4(σ_u²) = O(m^{-1}), whereas the remaining four terms of expression (5.2.1) are of order o(m^{-1}).

First we show that E[(θ̂^B_w − t)²] = g_4(σ_u²). We observe θ̂^B_w − t = −Σ_{i=1}^m w_i B_i (θ̂_i − x_i^T β̃) and consider

E[(θ̂^B_w − t)²] = E[{Σ_{i=1}^m w_i B_i (θ̂_i − x_i^T β̃)}²]
= Σ_{i=1}^m w_i² B_i² E[(θ̂_i − x_i^T β̃)²] + Σ_{i≠j} w_i w_j B_i B_j E[(θ̂_i − x_i^T β̃)(θ̂_j − x_j^T β̃)]
= Σ_{i=1}^m w_i² B_i² (V_i − h^V_ii) + Σ_{i≠j} w_i w_j B_i B_j (−h^V_ij)
= Σ_{i=1}^m w_i² B_i² V_i − Σ_{i=1}^m Σ_{j=1}^m w_i w_j B_i B_j h^V_ij.     (5.2.2)


We may note that the expression on the right hand side of (5.2.2) is O(m^{-1}), since max_{1≤i≤m} h_ii = O(m^{-1}), which implies that max_{1≤i,j≤m} h^V_ij = O(m^{-1}).

Next, we return to (5.2.1) and show that E[(θ̂^EB_w − θ̂^B_w)²] = o(m^{-1}). Consider

E[(θ̂^EB_w − θ̂^B_w)²] = Σ_i w_i² E[(θ̂^EB_i − θ̂^B_i)²] + 2 Σ_{i=1}^{m−1} Σ_{j=i+1}^m w_i w_j E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_j − θ̂^B_j)]
= 2 Σ_{i=1}^{m−1} Σ_{j=i+1}^m w_i w_j E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_j − θ̂^B_j)] + o(m^{-1}),     (5.2.3)

since Σ_i w_i² E[(θ̂^EB_i − θ̂^B_i)²] = o(m^{-1}). The latter holds because E[(θ̂^EB_i − θ̂^B_i)²] = g_2i(σ_u²) + g_3i(σ_u²) = O(m^{-1}), max_{1≤i≤m} w_i = O(m^{-1}), and Σ_i w_i = 1. Thus, it suffices to show E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_j − θ̂^B_j)] = o(m^{-1}) for all i ≠ j, and we do so by expanding θ̂^EB_i about θ̂^B_i. For simplicity of notation, denote ∂θ̂^B_i/∂σ_u² = ∂θ̂^B_i(σ_u²)/∂σ_u², the first derivative evaluated at σ_u², and ∂²θ̂^B_i/∂(σ_u²)² = ∂²θ̂^B_i(σ_u*²)/∂(σ_u²)², the second derivative evaluated at σ_u*². This results in

θ̂^EB_i − θ̂^B_i = (∂θ̂^B_i/∂σ_u²)(σ̂_u² − σ_u²) + (1/2)(∂²θ̂^B_i/∂(σ_u²)²)(σ̂_u² − σ_u²)²

for some σ_u*² between σ_u² and σ̂_u². The expansion of θ̂^EB_j about θ̂^B_j is similar.

We now consider E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_j − θ̂^B_j)] for i ≠ j. Notice that

E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_j − θ̂^B_j)] = E[(∂θ̂^B_i/∂σ_u²)(∂θ̂^B_j/∂σ_u²)(σ̂_u² − σ_u²)²]
+ (1/2) E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̂_u² − σ_u²)³]
+ (1/2) E[(∂²θ̂^B_i/∂(σ_u²)²)(∂θ̂^B_j/∂σ_u²)(σ̂_u² − σ_u²)³]
+ (1/4) E[(∂²θ̂^B_i/∂(σ_u²)²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̂_u² − σ_u²)⁴]
:= R_0 + R_1 + R_2 + R_3.

In R_1, consider that

E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̂_u² − σ_u²)³]
= E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̃_u² − σ_u²)³ I(σ̃_u² > 0)] − E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ_u²)³ I(σ̃_u² ≤ 0)].     (5.2.4)


Observe that

|E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ_u²)³ I(σ̃_u² ≤ 0)]|
≤ σ_u⁶ E^{1/4}[(∂θ̂^B_i/∂σ_u²)⁴] E^{1/4}[(∂²θ̂^B_j/∂(σ_u²)²)⁴] P^{1/2}(σ̃_u² ≤ 0)
≤ σ_u⁶ E^{1/4}[(∂θ̂^B_i/∂σ_u²)⁴] E^{1/4}[sup_{σ_u*² ≥ 0} (∂²θ̂^B_j/∂(σ_u²)²)⁴] P^{1/2}(σ̃_u² ≤ 0) = o(m^{-r})

for all r > 0, by Lemmas 2(ii) and 3, which we have proved in Appendix A, and since P(σ̃_u² ≤ 0) = O(m^{-r}) ∀ r > 0, as proved in Lemma A.6 of Prasad and Rao (1990). We next consider

E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̃_u² − σ_u²)³ I(σ̃_u² > 0)]
= E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̃_u² − σ_u²)³] − E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̃_u² − σ_u²)³ I(σ̃_u² ≤ 0)],     (5.2.5)

where the second term in (5.2.5) is O(m^{-r}), since P(σ̃_u² ≤ 0) = O(m^{-r}) ∀ r > 0. Continuing along, we next observe that

|E[(∂θ̂^B_i/∂σ_u²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̃_u² − σ_u²)³]|
≤ E^{1/4}[(∂θ̂^B_i/∂σ_u²)⁴] E^{1/4}[(∂²θ̂^B_j/∂(σ_u²)²)⁴] E^{1/2}[(σ̃_u² − σ_u²)⁶]
≤ E^{1/4}[(∂θ̂^B_i/∂σ_u²)⁴] E^{1/4}[sup_{σ_u*² ≥ 0} (∂²θ̂^B_j/∂(σ_u²)²)⁴] E^{1/2}[(σ̃_u² − σ_u²)⁶] = O(m^{-3/2}),

since E[(σ̃_u² − σ_u²)^{2r}] = O(m^{-r}) for any r ≥ 1 by Lemma A.5 in Prasad and Rao (1990). This proves that R_1 = o(m^{-1}), since max_{1≤i≤m} w_i = O(m^{-1}). By symmetry, R_2 is also o(m^{-1}).

Finally, we show that R_3 is o(m^{-1}). Using a calculation similar to that involving R_1, we can show that

E[(∂²θ̂^B_i/∂(σ_u²)²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̂_u² − σ_u²)⁴] = E[(∂²θ̂^B_i/∂(σ_u²)²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̃_u² − σ_u²)⁴] + o(m^{-r}).     (5.2.6)


Observe now that

|E[(∂²θ̂^B_i/∂(σ_u²)²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̃_u² − σ_u²)⁴]|
≤ E^{1/4}[(∂²θ̂^B_i/∂(σ_u²)²)⁴] E^{1/4}[(∂²θ̂^B_j/∂(σ_u²)²)⁴] E^{1/2}[(σ̃_u² − σ_u²)⁸]
≤ E^{1/4}[sup_{σ_u*² ≥ 0} (∂²θ̂^B_i/∂(σ_u²)²)⁴] E^{1/4}[sup_{σ_u*² ≥ 0} (∂²θ̂^B_j/∂(σ_u²)²)⁴] E^{1/2}[(σ̃_u² − σ_u²)⁸] = O(m^{-2}).

Plugging this back into the expression in (5.2.6), we find that

E[(∂²θ̂^B_i/∂(σ_u²)²)(∂²θ̂^B_j/∂(σ_u²)²)(σ̂_u² − σ_u²)⁴] = o(m^{-1}).

Hence, R_3 is o(m^{-1}). Finally, by calculations similar to those used for expression (5.2.4), we find that

R_0 = E[(∂θ̂^B_i/∂σ_u²)(∂θ̂^B_j/∂σ_u²)(σ̂_u² − σ_u²)²] = E[(∂θ̂^B_i/∂σ_u²)(∂θ̂^B_j/∂σ_u²)(σ̃_u² − σ_u²)²] + o(m^{-r}).

Define Σ = V − X (X^T V^{-1} X)^{-1} X^T = (I − P^V_X) V, where P_X = X (X^T X)^{-1} X^T. Also, define P^V_X = X (X^T V^{-1} X)^{-1} X^T V^{-1}, and let e_i represent the i-th unit vector. We can show ∂θ̂^B_i/∂σ_u² = B_i e_i^T Σ V^{-2} ũ, where ũ = θ̂ − X β̃. Define A_ij = B_i B_j V^{-2} Σ e_i e_j^T Σ V^{-2} and consider

E[(∂θ̂^B_i/∂σ_u²)(∂θ̂^B_j/∂σ_u²)(σ̃_u² − σ_u²)²] = E[ũ^T A_ij ũ (σ̃_u² − σ_u²)²]
= Cov(ũ^T A_ij ũ, (σ̃_u² − σ_u²)²) + E[ũ^T A_ij ũ] E[(σ̃_u² − σ_u²)²].


Using Lemma 4 and the relation (I − P_X) Σ = (I − P_X) V,

Cov(ũ^T A_ij ũ, (σ̃_u² − σ_u²)²)
= (m − p)^{-2} Cov(ũ^T A_ij ũ, [ũ^T (I − P_X) ũ − tr{(I − P_X) V}]²)     (5.2.7)
= (m − p)^{-2} Cov(ũ^T A_ij ũ, [ũ^T (I − P_X) ũ]²) − 2 (m − p)^{-2} Cov(ũ^T A_ij ũ, ũ^T (I − P_X) ũ) tr{(I − P_X) V}
= (m − p)^{-2} [4 tr{A_ij Σ (I − P_X) Σ} tr{(I − P_X) V} + 8 tr{A_ij Σ (I − P_X) Σ (I − P_X) Σ} − 4 tr{A_ij Σ (I − P_X) Σ} tr{(I − P_X) V}]
= 8 (m − p)^{-2} tr{A_ij Σ (I − P_X) Σ (I − P_X) Σ}
= 8 (m − p)^{-2} B_i B_j e_j^T Σ V^{-1} (I − P_X) V (I − P_X) V^{-1} Σ e_i,

where tr denotes the trace. Observe that V^{-1} Σ = I − (P^V_X)^T and (I − P^V_X) V (I − (P^V_X)^T) = Σ. Then

Cov(ũ^T A_ij ũ, (σ̃_u² − σ_u²)²) = 8 (m − p)^{-2} B_i B_j e_j^T Σ V^{-1} (I − P_X) V (I − P_X) V^{-1} Σ e_i
= 8 (m − p)^{-2} B_i B_j e_j^T (I − P^V_X) V (I − (P^V_X)^T) e_i
= 8 (m − p)^{-2} B_i B_j e_j^T Σ e_i
= 8 (m − p)^{-2} B_i B_j e_j^T V e_i + O(m^{-3}) = O(m^{-3}),

since the first term is zero because i ≠ j and V is diagonal. We now calculate

E[ũ^T A_ij ũ] = tr{B_i B_j V^{-2} Σ e_i e_j^T Σ V^{-2} Σ} = B_i B_j e_j^T Σ V^{-2} Σ V^{-2} Σ e_i.

Observe that Σ V^{-2} Σ = I − (P^V_X)^T − P^V_X + P^V_X (P^V_X)^T. Then after some computations, we find that E[ũ^T A_ij ũ] = B_i B_j e_j^T V^{-1} e_i + O(m^{-1}) = O(m^{-1}), since i ≠ j. By Lemma 5, E[(σ̃_u² − σ_u²)²] = 2 (m − p)^{-2} Σ_{k=1}^m (σ_u² + D_k)² + O(m^{-2}). Then

E[ũ^T A_ij ũ] E[(σ̃_u² − σ_u²)²] = o(m^{-1})


since i ≠ j. This implies that R_0 = o(m^{-1}), which in turn implies that

E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_j − θ̂^B_j)] = o(m^{-1}) for i ≠ j,     (5.2.8)

since R_0, R_1, R_2, and R_3 are all o(m^{-1}). Finally, this and (5.2.3) establishes that E[(θ̂^EB_w − θ̂^B_w)²] = o(m^{-1}).

We now return to (5.2.1) and show that E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_w − θ̂^B_w)] = o(m^{-1}). By the Cauchy–Schwarz inequality, we find that

E[(θ̂^EB_i − θ̂^B_i)(θ̂^EB_w − θ̂^B_w)] ≤ E^{1/2}[(θ̂^EB_i − θ̂^B_i)²] E^{1/2}[(θ̂^EB_w − θ̂^B_w)²] = o(m^{-1}),

since the first term is O(m^{-1/2}) and the second term is o(m^{-1/2}).

For the next term of (5.2.1), we are interested in showing that E[(θ̂^EB_i − θ̂^B_i)(θ̂^B_w − t)] = o(m^{-1}). First, by Taylor expansion, we find that

θ̂^EB_i − θ̂^B_i = (∂θ̂^B_i/∂σ_u²)(σ̂_u² − σ_u²) + (1/2)(∂²θ̂^B_i/∂(σ_u²)²)(σ̂_u² − σ_u²)²

for some σ_u*² between σ_u² and σ̂_u². Observe that θ̂^B_w − t = −Σ_i w_i B_i (θ̂_i − x_i^T β̃). Then

E[(θ̂^EB_i − θ̂^B_i)(θ̂^B_w − t)] = −Σ_j w_j B_j E[(∂θ̂^B_i/∂σ_u²)(σ̂_u² − σ_u²)(θ̂_j − x_j^T β̃)]
− (1/2) Σ_j w_j B_j E[(∂²θ̂^B_i/∂(σ_u²)²)(σ̂_u² − σ_u²)²(θ̂_j − x_j^T β̃)]
:= R_4 + R_5.


Observe

E[(∂θ̂^B_i/∂σ_u²)(σ̂_u² − σ_u²)(θ̂_j − x_j^T β̃)]
= −σ_u² E[(∂θ̂^B_i/∂σ_u²)(θ̂_j − x_j^T β̃) I(σ̃_u² ≤ 0)] + E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃) I(σ̃_u² > 0)]     (5.2.9)
= E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃) I(σ̃_u² > 0)] + o(m^{-r})
= E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃)] − E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃) I(σ̃_u² ≤ 0)] + o(m^{-r})
= E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃)] + o(m^{-r}),

since we may observe that E[(∂θ̂^B_i/∂σ_u²)(−σ_u²)(θ̂_j − x_j^T β̃) I(σ̃_u² ≤ 0)] = o(m^{-r}) and E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃) I(σ̃_u² ≤ 0)] = o(m^{-r}).

Now, observe that ∂θ̂^B_i/∂σ_u² = B_i e_i^T Σ V^{-2} ũ, and define D_ij = B_i V^{-2} Σ e_i e_j^T. Then by calculations similar to those in expression (5.2.7), we find

E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃)] = Cov(ũ^T D_ij ũ, σ̃_u² − σ_u²)
= (m − p)^{-1} Cov(ũ^T D_ij ũ, ũ^T (I − P_X) ũ − tr{(I − P_X) V})
= 2 (m − p)^{-1} tr{D_ij Σ (I − P_X) Σ}
= 2 (m − p)^{-1} tr{B_i V^{-2} Σ e_i e_j^T Σ (I − P_X) V}
= 2 (m − p)^{-1} B_i e_j^T Σ (I − P_X) V^{-1} Σ e_i
= 2 (m − p)^{-1} B_i e_j^T Σ (I − (P^V_X)^T) e_i
= 2 (m − p)^{-1} B_i [e_j^T V e_i − h^V_ij]
= 2 (m − p)^{-1} B_i e_j^T V e_i + o(m^{-1}).


Using the expression derived above, we find that

Σ_j w_j B_j E[(∂θ̂^B_i/∂σ_u²)(σ̃_u² − σ_u²)(θ̂_j − x_j^T β̃)] = 2 (m − p)^{-1} B_i² w_i (σ_u² + D_i) + o(m^{-1}) = o(m^{-1}).

Hence, R_4 is o(m^{-1}). We now show that R_5 = o(m^{-1}). By calculations similar to those in expression (5.2.9),

Σ_j w_j B_j E[(∂²θ̂^B_i/∂(σ_u²)²)(σ̂_u² − σ_u²)²(θ̂_j − x_j^T β̃)] = Σ_j w_j B_j E[(∂²θ̂^B_i/∂(σ_u²)²)(σ̃_u² − σ_u²)²(θ̂_j − x_j^T β̃)] + o(m^{-r}).

Recall that E[{Σ_j w_j B_j (θ̂_j − x_j^T β̃)}²] = O(m^{-1}) by (5.2.2). Now consider

|Σ_j w_j B_j E[(∂²θ̂^B_i/∂(σ_u²)²)(σ̃_u² − σ_u²)²(θ̂_j − x_j^T β̃)]|
≤ E^{1/4}[(∂²θ̂^B_i/∂(σ_u²)²)⁴] E^{1/4}[(σ̃_u² − σ_u²)⁸] E^{1/2}[{Σ_j w_j B_j (θ̂_j − x_j^T β̃)}²]
≤ E^{1/4}[sup_{σ_u*² ≥ 0} (∂²θ̂^B_i/∂(σ_u²)²)⁴] E^{1/4}[(σ̃_u² − σ_u²)⁸] E^{1/2}[{Σ_j w_j B_j (θ̂_j − x_j^T β̃)}²] = O(m^{-3/2}),

by Lemma 2(ii), by Theorem A.5 of Prasad and Rao (1990), and by expression (5.2.2). Thus, R_5 is o(m^{-1}), and E[(θ̂^EB_i − θ̂^B_i)(θ̂^B_w − t)] = o(m^{-1}).

For the last term in (5.2.1), we use the Cauchy–Schwarz inequality to show

E[(θ̂^EB_w − θ̂^B_w)(θ̂^B_w − t)] ≤ E^{1/2}[(θ̂^EB_w − θ̂^B_w)²] E^{1/2}[(θ̂^B_w − t)²] = o(m^{-1}).

This concludes the proof of the theorem.

5.3 Estimator of MSE Approximation

We now obtain an estimator of the MSE approximation for the Fay-Herriot model (assuming normality). Theorem 8 shows that the expectation of the MSE estimator is correct to O(m^{-1}).


Lemma 1. Suppose that

sup_{t∈T} |h′(t)| = O(m^{-1})     (5.3.1)

for some interval T ⊆ R. If σ̂_u², σ_u² ∈ T w.p. 1, then E[h(σ̂_u²)] = h(σ_u²) + o(m^{-1}).

Proof. Consider the expansion h(σ̂_u²) = h(σ_u²) + h′(σ_u*²)(σ̂_u² − σ_u²) for some σ_u*² between σ_u² and σ̂_u². Then σ_u*² ∈ T a.s., and |h′(σ_u*²)| ≤ sup_{t∈T} |h′(t)| a.s. as well. This implies

|E[h′(σ_u*²)(σ̂_u² − σ_u²)]| ≤ sup_{t∈T} |h′(t)| E|σ̂_u² − σ_u²| = O(m^{-3/2})

by equation (5.3.1) and since E|σ̂_u² − σ_u²| ≤ E^{1/2}[(σ̂_u² − σ_u²)²]. Hence, if assumption (5.3.1) holds, then E[h(σ̂_u²)] = h(σ_u²) + o(m^{-1}).

Theorem 8.

E[g_1i(σ̂_u²) + g_2i(σ̂_u²) + 2 g_3i(σ̂_u²) + g_4(σ̂_u²)] = g_1i(σ_u²) + g_2i(σ_u²) + g_3i(σ_u²) + g_4(σ_u²) + o(m^{-1}),

where g_1i(σ_u²), g_2i(σ_u²), g_3i(σ_u²), and g_4(σ_u²) are defined in Theorem 7.

Proof. By Theorem A.3 in Prasad and Rao (1990),

E[g_1i(σ̂_u²) + g_2i(σ̂_u²) + 2 g_3i(σ̂_u²)] = g_1i(σ_u²) + g_2i(σ_u²) + g_3i(σ_u²) + o(m^{-1}).

In addition, we consider E[g_4(σ̂_u²)], where g_4(σ_u²) = Σ_{i=1}^m w_i² B_i² V_i − Σ_{i=1}^m Σ_{j=1}^m w_i w_j B_i B_j h^V_ij =: g_41(σ_u²) + g_42(σ_u²). We first show that the derivatives of g_41(σ_u²) and g_42(σ_u²) satisfy assumption (5.3.1). Let T = [0, ∞). Consider

sup_{σ_u² ≥ 0} |∂g_41(σ_u²)/∂σ_u²| = sup_{σ_u² ≥ 0} Σ_{i=1}^m w_i² B_i² = O(m^{-1}).

It can be shown that ∂(B_i B_j)/∂σ_u² = −B_i B_j² D_j^{-1} − B_i² B_j D_i^{-1} and X^T V^{-2} X ≼ D_L^{-1} X^T V^{-1} X. Observe that

|∂g_42(σ_u²)/∂σ_u²| ≤ Σ_{i=1}^m Σ_{j=1}^m w_i w_j [|B_i D_L^{-1} h^V_ij| + |B_j D_L^{-1} h^V_ij| + B_i B_j x_i^T (X^T V^{-1} X)^{-1} X^T V^{-2} X (X^T V^{-1} X)^{-1} x_i]
≤ 3 m² (max_{1≤i≤m} w_i)² D_L^{-1} D_U (σ_u² + D_L)^{-1} (σ_u² + D_U) (max_{1≤i≤m} h^V_ii)
≤ 3 m² (max_{1≤i≤m} w_i)² D_L^{-1} D_U (1 + D_U D_L^{-1}) (max_{1≤i≤m} h^V_ii) = O(m^{-1}).


This implies that sup_{σ_u² ≥ 0} |∂g_42(σ_u²)/∂σ_u²| = O(m^{-1}). Since the derivatives of g_41(σ_u²) and g_42(σ_u²) satisfy assumption (5.3.1), we know that E[g_4(σ̂_u²)] = g_4(σ_u²) + o(m^{-1}).

5.4 Parametric Bootstrap Estimator of the Benchmarked EB Estimator

In this section, we extend the methods of Butar and Lahiri (2003) to find a parametric bootstrap estimator of the MSE of the benchmarked empirical Bayes estimator. Under the proposed model, the expectation of the proposed measure of uncertainty of the benchmarked empirical Bayes estimator is correct up to order O(m^{-1}).

To introduce the parametric bootstrap method, consider the following model:

θ̂*_i | u*_i ~ind N(x_i^T β̃ + u*_i, D_i), u*_i ~ind N(0, σ̂_u²).     (5.4.1)

As explained in Butar and Lahiri (2003), we use the parametric bootstrap twice. We first use it to estimate g_1i(σ_u²), g_2i(σ_u²), and g_4(σ_u²) by correcting the bias of g_1i(σ̂_u²), g_2i(σ̂_u²), and g_4(σ̂_u²). We then use it again to estimate E[(θ̂^EB_i − θ̂^B_i)²] = g_3i(σ_u²) + o(m^{-1}).

Butar and Lahiri (2003) derived a parametric bootstrap estimator for the MSE of the EB estimator under the Fay and Herriot (1979) model. Using Theorem A.1 of their paper, they show that the bootstrap estimator V^BOOT_i is

V^BOOT_i = 2 [g_1i(σ̂_u²) + g_2i(σ̂_u²)] − E* [g_1i(σ̂*_u²) + g_2i(σ̂*_u²)] + E* [(θ̂*^EB_i − θ̂^EB_i)²],     (5.4.2)

where E* denotes the expectation computed with respect to the model given in (5.4.1), and θ̂*^EB_i = (1 − B_i(σ̂*_u²)) θ̂*_i + B_i(σ̂*_u²) x_i^T β̂*. Following their work, we propose a parametric bootstrap estimator of the MSE of the benchmarked EB estimator which is a simple extension of (5.4.2).

We propose to estimate g_1i(σ_u²) + g_2i(σ_u²) + g_4(σ_u²) by

2 [g_1i(σ̂_u²) + g_2i(σ̂_u²) + g_4(σ̂_u²)] − E* [g_1i(σ̂*_u²) + g_2i(σ̂*_u²) + g_4(σ̂*_u²)]


and then estimate E[(θ̂^EB_i − θ̂^B_i)²] by E*[(θ̂*^EB_i − θ̂^EB_i)²]. Thus, our proposed estimator of MSE[θ̂^EBM1_i] is given by

V^B-BOOT_i = 2 [g_1i(σ̂_u²) + g_2i(σ̂_u²) + g_4(σ̂_u²)] − E* [g_1i(σ̂*_u²) + g_2i(σ̂*_u²) + g_4(σ̂*_u²)] + E* [(θ̂*^EB_i − θ̂^EB_i)²].

We now show that the expectation of V^B-BOOT_i is correct up to O(m^{-1}).

Theorem 9. E[V^B-BOOT_i] = MSE[θ̂^EBM1_i] + o(m^{-1}).

Proof. First, by Theorem A.1 in Butar and Lahiri (2003), we note that E*[g_1i(σ̂*_u²)] = g_1i(σ̂_u²) − g_3i(σ̂_u²) + o_p(m^{-1}), E*[g_2i(σ̂*_u²)] = g_2i(σ̂_u²) + o_p(m^{-1}), and E*[(θ̂*^EB_i − θ̂^EB_i)²] = g_5i(σ̂_u²) + o_p(m^{-1}), where g_5i(σ̂_u²) = [B_i(σ̂_u²)]⁴ D_i^{-2} (θ̂_i − x_i^T β̃(σ̂_u²))² Var(σ̃_u²)|_{σ_u² = σ̂_u²}. Also, E*[g_4(σ̂*_u²)] = g_4(σ̂_u²) + o_p(m^{-1}), which follows along the lines of the proof of Theorem A.2(b) of Datta and Lahiri (2000). Applying these results and Theorem 7, we find

V^B-BOOT_i = g_1i(σ̂_u²) + g_2i(σ̂_u²) + g_3i(σ̂_u²) + g_4(σ̂_u²) + g_5i(σ̂_u²) + o_p(m^{-1}).

This implies that

E[V^B-BOOT_i] = g_1i(σ_u²) + g_2i(σ_u²) + g_3i(σ_u²) + g_4(σ_u²) + o(m^{-1}),

since E[g_5i(σ̂_u²)] = g_3i(σ_u²) + o(m^{-1}) by Butar and Lahiri (2003).

5.5 An Application

In this section, we consider an example where we compute estimates of the MSE of the EB estimator and the benchmarked EB estimator. We also compute an estimate of the MSE of the EB estimator using a parametric bootstrap procedure for the Fay-Herriot model as described in Section 5.4.
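The double use of the bootstrap can be sketched as follows. This is a minimal illustration on simulated data (not SAIPE), with Monte Carlo averages standing in for the E* terms; the third term follows one common reading of the Butar-Lahiri construction, contrasting the EB estimator recomputed from each bootstrap sample with the one that reuses the original parameter estimates. All helper names are ad hoc.

```python
import numpy as np

def fit_fh(theta, X, D):
    """Prasad-Rao moment fit of the Fay-Herriot model: (sigma_u^2 hat, GLS beta)."""
    m, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    r = theta - H @ theta
    s2 = max(0.0, ((r**2).sum() - (D * (1 - np.diag(H))).sum()) / (m - p))
    V = s2 + D
    beta = np.linalg.solve(X.T @ (X / V[:, None]), X.T @ (theta / V))
    return s2, beta

def g124(s2, X, D, w):
    """Per-area g1 + g2 + g4 of the second-order MSE approximation."""
    V = s2 + D
    B = D / V
    hV = X @ np.linalg.inv(X.T @ (X / V[:, None])) @ X.T
    g4 = (w**2 * B**2 * V).sum() - (np.outer(w * B, w * B) * hV).sum()
    return B * s2 + B**2 * np.diag(hV) + g4

rng = np.random.default_rng(4)
m = 30
X = np.column_stack([np.ones(m), rng.normal(size=m)])
D = rng.uniform(0.3, 1.0, size=m)
w = np.full(m, 1.0 / m)
theta_hat = X @ np.array([1.0, 0.5]) + rng.normal(0, np.sqrt(0.6 + D))

s2, beta = fit_fh(theta_hat, X, D)
B = D / (s2 + D)
fitted = X @ beta

# bootstrap world (5.4.1): theta*_i = x_i' beta_tilde + u*_i + e*_i
R = 400
bias_term = np.empty((R, m))
g3_term = np.empty((R, m))
for r in range(R):
    theta_star = fitted + rng.normal(0, np.sqrt(s2 + D))
    s2_star, beta_star = fit_fh(theta_star, X, D)
    B_star = D / (s2_star + D)
    eb_star = (1 - B_star) * theta_star + B_star * (X @ beta_star)  # re-estimated
    eb_fixed = (1 - B) * theta_star + B * fitted                    # original estimates
    bias_term[r] = g124(s2_star, X, D, w)
    g3_term[r] = (eb_star - eb_fixed)**2

V_B_BOOT = 2 * g124(s2, X, D, w) - bias_term.mean(axis=0) + g3_term.mean(axis=0)
```

As discussed for the 1997 SAIPE data, nothing in this construction forces `V_B_BOOT` to be nonnegative: when σ̂_u² = 0, the bias correction 2g_1i(σ̂_u²) − E*[g_1i(σ̂*_u²)] can go negative.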


We consider data from 1997 and 2000 from the Small Area Income and Poverty Estimates (SAIPE) program at the U.S. Census Bureau, which produces model-based estimates of the number of poor school-aged children (5–17 years old) at the national, state, county, and district levels. The school district estimates are benchmarked to the state estimates by the Department of Education to allocate funds under the No Child Left Behind Act of 2001. In the SAIPE program, the model-based state estimates are benchmarked to the national school-aged poverty rate using the benchmarked estimator in (5.1.3). The number of poor school-aged children has been collected from the Annual Social and Economic Supplement (ASEC) of the CPS from 1995 to 2004, while ACS estimates have been used beginning in 2005. Additionally, the model-based county estimates are benchmarked to the model-based state estimates using the benchmarked estimator in (5.1.3).

In the SAIPE program, the state model for poverty rates in school-aged children follows the basic Fay-Herriot (1979) framework, where θ̂_i = θ_i + e_i and θ_i = x_i^T β + u_i. Here θ_i is the true state-level poverty rate, θ̂_i is the direct survey estimate (from CPS ASEC), e_i is the sampling error term with assumed known variance D_i > 0, the x_i are the predictors, β is the unknown vector of regression coefficients, and u_i is the model error with unknown variance σ_u². The explanatory variables in the model are the IRS income tax–based pseudo-estimate of the child poverty rate, the IRS non-filer rate, the food stamp rate, and the residual term from the regression of the 1990 Census estimated child poverty rate. We estimate β using the weighted least squares estimate β̃ = (X^T V^{-1} X)^{-1} X^T V^{-1} θ̂, and we estimate σ_u² using the modified moment estimator σ̂_u² from Section 5.1.

As shown in Table 5-1, the estimated MSE of the EB estimator, mse(θ̂^EB_i), compared to the estimated MSE of the benchmarked EB estimator, mse(θ̂^EBM1_i), differs by the constant g_4(σ̂_u²), which is 0.015. This constant is effectively the increase in MSE that we suffer from benchmarking, and we see that in this case this quantity is small


(compared to the values of the MSEs). Generally speaking, this quantity is expected to be small since $g_4(\sigma_u^2) = O(m^{-1})$.

It should be noted that in the proofs of our paper above and in Prasad and Rao (1990), we take advantage of the fact that
$$P(\tilde\sigma_u^2 \le 0) = O(m^{-r}) \quad \forall\, r > 0.$$
Practically speaking, this fact should correspond to the aforementioned probability being very small. However, if this is not the case for some particular dataset with fixed $m$, then our theoretical derivations of the MSE of the benchmarked EB estimator and the derivations in Prasad and Rao (1990) under the Fay-Herriot model may be unreliable. We now illustrate this concept with SAIPE data from the U.S. Census Bureau.

In Table 5-1 and Table 5-2, we define mse$_B$ and mse$_{BB}$ as the bootstrap estimates of the MSE of the EB estimator and the benchmarked EB estimator, respectively. We consider two years for illustrative purposes from the SAIPE dataset (years 1997 and 2000). Table 5-1 illustrates the case when $\hat\sigma_u^2$ is not too close to zero, being 3.08. When we perform the bootstrapping, we resample $\tilde\sigma_u^2$ 10,000 times in order to calculate mse$_B$ and mse$_{BB}$, and the proportion of resampled values of $\tilde\sigma_u^2$ that are negative is 0.034. Since this proportion is small, we can see that the bootstrap estimates of the MSE of the benchmarked EB estimator and the EB estimator are somewhat close to each other. This is best understood by considering the concept behind our bootstrapping approach. Consider the behavior of $g_{1i}(\sigma_u^2)$, the only term that is $O(1)$. Ordinarily, $g_{1i}(\hat\sigma_u^2)$ underestimates $g_{1i}(\sigma_u^2)$, and $E_*[g_{1i}(\hat\sigma_u^{2*})]$ underestimates $g_{1i}(\hat\sigma_u^2)$. The basic idea is that we use the amount by which $E_*[g_{1i}(\hat\sigma_u^{2*})]$ underestimates $g_{1i}(\hat\sigma_u^2)$ as an approximation of the amount by which $g_{1i}(\hat\sigma_u^2)$ underestimates $g_{1i}(\sigma_u^2)$.

We run into a problem with the 1997 data, where $g_{1i}(\hat\sigma_u^2)$ is 0, since now $E_*[g_{1i}(\hat\sigma_u^{2*})]$ overestimates $g_{1i}(\hat\sigma_u^2)$. Recall that
$$V_i^{\text{B-BOOT}} = g_{1i}(\hat\sigma_u^2) + \left\{g_{1i}(\hat\sigma_u^2) - E_*[g_{1i}(\hat\sigma_u^{2*})]\right\} + O_p(m^{-1}).$$


Since $g_{1i}(\hat\sigma_u^2)$ is 0 and is the dominating term of $V_i^{\text{B-BOOT}}$, many of the estimated MSEs of the benchmarked bootstrapped estimator (mse$_{BB}$) will be negative. Also, observe that this same behavior holds true for the bootstrapped estimator proposed by Butar and Lahiri (2003), which we denote by mse$_B$.
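The resampling behavior described above, in particular the proportion of $\tilde\sigma_u^2$ draws falling below zero, can be examined with a small simulation. This is an illustrative sketch under assumed parameter values, not the dissertation's code.

```python
import numpy as np

def sigma2_tilde(theta_hat, X, D):
    """Untruncated moment estimator of sigma_u^2 (Prasad-Rao type)."""
    m, p = X.shape
    P_X = X @ np.linalg.solve(X.T @ X, X.T)
    resid = theta_hat - P_X @ theta_hat
    d = np.sum(D * (1.0 - np.diag(P_X)))
    return (resid @ resid - d) / (m - p)

def prop_negative(sigma2, beta, X, D, n_rep=10_000, seed=0):
    """Proportion of resampled sigma2_tilde values that fall below zero."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    count = 0
    for _ in range(n_rep):
        theta = X @ beta + rng.normal(0.0, np.sqrt(sigma2), m)  # model errors
        y = theta + rng.normal(0.0, np.sqrt(D))                 # sampling errors
        if sigma2_tilde(y, X, D) < 0:
            count += 1
    return count / n_rep
```

When the assumed $\sigma_u^2$ is small relative to the sampling variances $D_i$, as in the 1997 SAIPE data, this proportion becomes non-negligible and the asymptotic argument that relies on $P(\tilde\sigma_u^2 \le 0)$ being tiny loses its practical force.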


Table 5-1. Table of estimates for 2000

 i   $\hat\theta_i$   $\hat\theta_i^{EB}$   $\hat\theta_i^{EBM1}$   mse($\hat\theta_i$)   mse($\hat\theta_i^{EB}$)   mse($\hat\theta_i^{EBM1}$)   mse$_B$   mse$_{BB}$
 1   19.38   19.05   19.23   10.78   3.16   3.17   3.01   3.03
 2    9.36    9.90   10.08    2.16   1.59   1.61   1.56   1.58
 3   21.21   19.51   19.69    4.05   2.45   2.47   2.50   2.51
 4   22.94   21.12   21.30   16.07   3.52   3.53   3.42   3.43
 5   18.79   18.48   18.66    1.29   1.12   1.13   1.07   1.09
 6   11.09   11.65   11.83    4.14   2.22   2.24   2.10   2.12
 7    8.60    8.99    9.17    2.78   1.92   1.93   1.80   1.82
 8   13.46   13.09   13.27    5.09   2.35   2.37   2.19   2.21
 9   24.90   22.53   22.71   13.75   5.96   5.97   5.95   5.96
10   16.58   17.25   17.43    1.68   1.36   1.38   1.45   1.47
11   16.04   16.97   17.15    5.70   2.46   2.48   2.36   2.37
12   13.01   12.63   12.81    6.26   3.05   3.07   2.88   2.89
13   19.00   16.59   16.77    3.44   2.09   2.11   2.50   2.51
14   16.54   14.56   14.74    3.69   2.05   2.07   2.24   2.26
15   10.47   11.25   11.42    2.73   1.76   1.77   1.75   1.76
16    8.76    9.43    9.61    1.88   1.40   1.42   1.46   1.47
17   12.70   12.38   12.56    2.49   1.66   1.67   1.56   1.57
18   14.40   16.28   16.46    9.35   3.10   3.11   3.05   3.07
19   23.65   21.27   21.45    9.61   3.34   3.35   3.26   3.27
20    7.41    9.87   10.05    8.55   3.16   3.17   3.20   3.22
21    7.02    8.45    8.63    1.75   1.35   1.36   1.94   1.96
22   14.01   11.98   12.16    3.31   2.11   2.12   2.43   2.44
23   12.96   13.04   13.22    2.43   1.64   1.65   1.53   1.55
24    9.77    9.49    9.67    1.27   1.07   1.08   1.04   1.06
25   14.16   20.15   20.33   12.69   3.64   3.66   4.13   4.15
26    9.41   12.34   12.52    6.63   2.63   2.64   2.85   2.87
27   19.60   17.54   17.72   10.08   3.16   3.18   3.07   3.09
28   10.73   10.86   11.04    2.08   1.52   1.53   1.43   1.44
29   11.88   13.13   13.31    3.50   2.23   2.24   2.26   2.28
30    7.01    6.94    7.12    1.08   0.95   0.97   0.90   0.91
31   10.68   10.77   10.95    1.72   1.33   1.35   1.26   1.27
32   23.59   25.24   25.42   12.21   3.91   3.92   3.86   3.88
33   19.23   18.49   18.66    2.31   1.72   1.73   1.68   1.69
34   17.99   16.54   16.72   10.52   2.83   2.85   2.71   2.73
35   12.40   11.88   12.06    7.64   2.90   2.91   2.74   2.76
36   13.87   13.08   13.26    2.97   1.86   1.88   1.79   1.81
37   21.73   19.74   19.92   14.30   3.19   3.20   3.09   3.10
38   18.40   14.71   14.89    8.61   3.46   3.47   3.57   3.59
39   10.53   10.98   11.16    1.21   1.02   1.03   1.10   1.11
40   11.10   12.06   12.24    6.33   2.72   2.73   2.60   2.61


Table 5-1. Continued

 i   $\hat\theta_i$   $\hat\theta_i^{EB}$   $\hat\theta_i^{EBM1}$   mse($\hat\theta_i$)   mse($\hat\theta_i^{EB}$)   mse($\hat\theta_i^{EBM1}$)   mse$_B$   mse$_{BB}$
41   14.43   16.39   16.57    4.53   2.37   2.38   2.51   2.53
42   13.14   13.44   13.61    4.18   2.40   2.41   2.24   2.26
43   21.64   17.34   17.52    8.15   2.93   2.95   3.19   3.21
44   20.02   19.84   20.02    1.62   1.34   1.35   1.24   1.26
45   10.02   10.57   10.75    3.42   2.12   2.13   2.01   2.03
46   12.80   11.31   11.49    2.62   1.87   1.88   2.03   2.04
47    7.26   10.02   10.20    3.35   1.94   1.96   2.72   2.73
48   10.82   11.10   11.28    2.40   1.66   1.67   1.57   1.59
49   15.22   17.14   17.32    6.41   3.19   3.21   3.23   3.24
50   14.98   12.28   12.46    4.80   2.45   2.47   2.75   2.76
51   13.22   13.02   13.20    3.00   1.96   1.98   1.82   1.84


Table 5-2. Table of estimates for 1997

 i   $\hat\theta_i$   $\hat\theta_i^{EB}$   $\hat\theta_i^{EBM1}$   mse($\hat\theta_i$)   mse($\hat\theta_i^{EB}$)   mse($\hat\theta_i^{EBM1}$)   mse$_B$   mse$_{BB}$
 1   25.16   21.38   21.56   15.72   1.38   1.41    0.02    0.04
 2   10.99   14.94   15.11   10.44   2.12   2.14    0.66    0.68
 3   23.35   20.89   21.06   11.84   1.68   1.70    0.00    0.01
 4   23.32   22.18   22.35   13.85   1.90   1.92    0.37    0.38
 5   23.55   22.71   22.88    2.39   5.92   5.94    1.12    1.13
 6    9.14   13.12   13.29    6.38   2.19   2.22    0.36    0.38
 7   10.34   13.39   13.56    9.85   2.08   2.10    0.39    0.41
 8   15.54   13.06   13.23   17.56   0.91   0.94   -0.47   -0.45
 9   35.85   32.43   32.60   32.35   4.92   4.95    3.49    3.50
10   18.34   19.59   19.76    3.70   3.71   3.74    0.40    0.41
11   23.52   20.53   20.70   12.93   1.16   1.19   -0.38   -0.37
12   18.98   13.72   13.89   20.87   2.45   2.48    1.24    1.26
13   17.56   13.64   13.82   12.38   1.70   1.73    0.23    0.25
14   14.57   15.72   15.89    3.56   3.45   3.47   -0.06   -0.05
15   11.07   12.53   12.70    7.58   1.84   1.86   -0.23   -0.22
16   11.09   11.21   11.38    8.49   1.74   1.76   -0.24   -0.22
17   11.01   13.48   13.65    9.34   1.61   1.63   -0.15   -0.14
18   23.12   20.78   20.95   13.98   1.37   1.40   -0.12   -0.11
19   21.08   24.15   24.32   15.19   1.80   1.82    0.40    0.42
20   13.18   12.44   12.61   13.63   2.09   2.11    0.56    0.57
21    9.90   13.16   13.33    9.28   1.65   1.67   -0.03   -0.01
22   19.66   14.38   14.56    7.66   2.46   2.48    1.02    1.04
23   13.78   16.86   17.03    4.04   3.11   3.13    0.38    0.39
24   14.34   10.11   10.28    9.91   1.64   1.67    0.16    0.17
25   20.58   22.30   22.47   15.07   2.42   2.45    0.97    0.99
26   18.90   15.11   15.28   15.24   1.00   1.03   -0.37   -0.35
27   17.00   18.60   18.77   12.95   1.37   1.40   -0.21   -0.19
28    9.72    9.62    9.79    7.18   2.24   2.26    0.09    0.10
29   14.06   12.94   13.12   10.23   1.71   1.74   -0.06   -0.04
30   10.94    6.72    6.89   11.35   1.88   1.91    0.50    0.52
31   14.66   13.28   13.45    5.52   2.48   2.51   -0.03   -0.01
32   29.69   24.44   24.61   13.18   2.62   2.65    1.38    1.40
33   23.76   22.85   23.02    3.10   4.76   4.79    0.94    0.95
34   13.90   16.58   16.75    5.70   2.29   2.31   -0.01    0.01
35   18.19   13.64   13.81   11.92   1.81   1.84    0.48    0.50
36   13.91   13.64   13.81    3.95   3.07   3.10   -0.25   -0.23
37   16.09   21.50   21.68   11.14   1.52   1.54    0.24    0.26
38   12.60   13.43   13.60   10.35   2.53   2.56    0.83    0.84
39   14.61   13.92   14.09    3.73   3.40   3.42   -0.01    0.00
40   20.37   14.60   14.77   18.53   1.04   1.07   -0.15   -0.14


Table 5-2. Continued

 i   $\hat\theta_i$   $\hat\theta_i^{EB}$   $\hat\theta_i^{EBM1}$   mse($\hat\theta_i$)   mse($\hat\theta_i^{EB}$)   mse($\hat\theta_i^{EBM1}$)   mse$_B$   mse$_{BB}$
41   18.74   21.21   21.38   14.57   1.49   1.52    0.02    0.04
42   12.87   15.77   15.94   12.94   1.98   2.01    0.46    0.47
43   16.09   16.10   16.27   11.94   1.92   1.95    0.28    0.30
44   21.95   21.38   21.55    3.38   4.05   4.07    0.38    0.40
45   11.27    9.76    9.93    9.45   2.28   2.31    0.50    0.51
46   11.15   10.10   10.27   11.95   2.45   2.48    0.86    0.88
47   16.40   14.96   15.13   11.51   1.20   1.22   -0.49   -0.47
48   12.26   13.17   13.34    9.33   1.85   1.87    0.01    0.02
49   18.76   22.25   22.42   13.73   3.81   3.83    2.46    2.48
50    7.60   11.87   12.04    6.41   2.74   2.76    0.97    0.98
51   11.74   11.70   11.87    8.86   2.08   2.10    0.17    0.19


CHAPTER 6
SUMMARY REMARKS AND FUTURE WORK

Small area estimation first grew substantially during World War II but is flourishing again due to its necessity in government agencies worldwide. We briefly describe the importance of small area estimation in Chapter 1 and how benchmarking specifically has played an important role in research. Furthermore, in Chapter 2, we review the important models used in small area estimation in this body of work, most relevantly those of Fay and Herriot (1979), Louis (1984), Prasad and Rao (1990), Ghosh (1992), You and Rao (2002), You and Rao (2003), and Wang, Fuller and Qu (2008).

In Chapter 3, we develop a general class of benchmarked Bayes estimators under a general loss function, where we can benchmark the weighted mean or both the weighted mean and the weighted variability. Moreover, our results do not require any distributional assumptions, and the form of our estimator can be linear or nonlinear. We can also extend our results to multiparameter settings using a general loss function. Finally, many of the previously proposed estimators from the literature result as special cases of the benchmarked Bayes estimator. We illustrate our methods using SAIPE data from the U.S. Census Bureau.

Next, in Chapter 4, we extend the above work to a two-stage benchmarking procedure under one model. For example, we are able to benchmark the weighted means and find closed-form solutions, and we are able to benchmark both the weighted means and the weighted unit-level variability and find closed-form solutions. We can extend these to multiparameter settings. However, going beyond our present work indicates that a solution exists but not in closed form. We study the behavior of our estimates by looking at the proportion of people who do not have health insurance for an Asian subpopulation as studied by the NHIS in 2000.


Finally, in Chapter 5, we find the benchmarked EB estimator under the Fay-Herriot model (assuming normality) and assuming the standard benchmarking constraint. Under some mild regularity conditions, we determine how much mean squared error is lost due to benchmarking. Specifically, using a second-order expansion, we show that the error lost due to benchmarking is $O(m^{-1})$, where $m$ is the number of small areas. We then find an asymptotically unbiased estimator of this MSE. In addition, using methods similar to those in Butar and Lahiri (2003), we derive a parametric bootstrap estimator of the MSE of the benchmarked EB estimator. We investigate the properties of our results using SAIPE data and illustrate interesting behavior that occurs when the estimated value of $\sigma_u^2$ is 0, as occurs in 1997.

In terms of future work, we believe that our work in benchmarking just scratches the surface. Benchmarking in spatial settings has not been explored, and such work would find immediate application whenever response variables show correlation due to geographic proximity or similarity. We also believe there is much more to be done in two-stage benchmarking, including the derivation of closed-form solutions or numerical algorithms if closed-form solutions cannot be obtained. Furthermore, this idea could extend to multi-stage procedures. For example, we might wish to benchmark county-level estimates to district-level estimates, district-level estimates to state-level estimates, and then state-level estimates to the national-level estimate. Finally, we should attempt to find closed-form expressions for the MSE of the benchmarked EB estimator under more complicated constraints and models, as well as under different variance component estimators, in order to generalize our results on the amount of error lost due to benchmarking.


APPENDIX: LEMMAS FOR CHAPTER 5

Lemma 2: Let $r > 0$ be arbitrary. Then
(i) $E\left[\left(\frac{\partial\hat\theta_i^B}{\partial\sigma_u^2}\right)^{2r}\right] = O(1)$, and
(ii) $E\left[\sup_{\sigma_u^2 \ge 0}\left|\frac{\partial^2\hat\theta_i^B}{\partial(\sigma_u^2)^2}\right|^{2r}\right] = O(1)$.

Proof of Lemma 2(i). Recall $\tilde u = \hat\theta - X\tilde\beta$ and define $u^* = \hat\theta - X\beta$. Recall $\hat\theta_i^B = (1 - B_i)\hat\theta_i + B_i x_i^T\tilde\beta$. Since
$$\frac{\partial\tilde\beta}{\partial\sigma_u^2} = -(X^T V^{-1} X)^{-1} X^T V^{-2}\tilde u,$$
we can easily show that
$$\frac{\partial\hat\theta_i^B}{\partial\sigma_u^2} = \left[B_i^2 D_i^{-1} e_i^T - B_i x_i^T (X^T V^{-1} X)^{-1} X^T V^{-2}\right]\tilde u.$$
This implies that
$$\left|\frac{\partial\hat\theta_i^B}{\partial\sigma_u^2}\right| \le D_L^{-1}|e_i^T\tilde u| + |x_i^T (X^T V^{-1} X)^{-1} X^T V^{-2}\tilde u|.$$
Then
$$\left(\frac{\partial\hat\theta_i^B}{\partial\sigma_u^2}\right)^{2r} \le 2^{2r-1} D_L^{-2r}|e_i^T\tilde u|^{2r} + 2^{2r-1}|x_i^T (X^T V^{-1} X)^{-1} X^T V^{-2}\tilde u|^{2r}$$
$$\le 2^{2r-1} D_L^{-2r}|\hat\theta_i - x_i^T\tilde\beta|^{2r} + 2^{2r-1}\left[x_i^T (X^T V^{-1} X)^{-1} x_i \, \tilde u^T V^{-3}\tilde u\right]^r$$
$$\le 2^{2r-1} D_L^{-2r}|\hat\theta_i - x_i^T\tilde\beta|^{2r} + 2^{2r-1}\left[(\max_{1\le i\le m} h_i)(\sigma_u^2 + D_U)(\sigma_u^2 + D_L)^{-1} D_L^{-1}\, u^{*T} V^{-1} u^*\right]^r$$
$$\le 2^{2r-1} D_L^{-2r}|\hat\theta_i - x_i^T\tilde\beta|^{2r} + 2^{2r-1}\left[(1 + D_U D_L^{-1}) D_L^{-1}(\max_{1\le i\le m} h_i)\, u^{*T} V^{-1} u^*\right]^r = O_p(1),$$
since $\hat\theta_i - x_i^T\tilde\beta \sim N(0, V_i)$, $\max_{1\le i\le m} h_i = O(m^{-1})$, and $u^{*T} V^{-1} u^* \sim \chi^2_m$. This implies that
$$E\left[\left(\frac{\partial\hat\theta_i^B}{\partial\sigma_u^2}\right)^{2r}\right] = O(1).$$

Proof of Lemma 2(ii). Recall $P_{VX} = X(X^T V^{-1} X)^{-1} X^T V^{-1}$, $\tilde u = \hat\theta - X\tilde\beta$, and $u^* = \hat\theta - X\beta$. Knowing $\frac{\partial\tilde\beta}{\partial\sigma_u^2} = -(X^T V^{-1} X)^{-1} X^T V^{-2}\tilde u$, we find that
$$\frac{\partial^2\hat\theta_i^B}{\partial(\sigma_u^2)^2} = -2 B_i^3 D_i^{-2} e_i^T\tilde u + 2 B_i^2 D_i^{-1} e_i^T P_{VX} V^{-1}\tilde u + 2 B_i e_i^T P_{VX} V^{-2}\tilde u - 2 B_i e_i^T P_{VX} V^{-1} P_{VX} V^{-1}\tilde u.$$


Then we find
$$\left|\frac{\partial^2\hat\theta_i^B}{\partial(\sigma_u^2)^2}\right| \le |2 B_i^3 D_i^{-2} e_i^T\tilde u| + |2 B_i^2 D_i^{-1} e_i^T P_{VX} V^{-1}\tilde u| + |2 B_i e_i^T P_{VX} V^{-2}\tilde u| + |2 B_i e_i^T P_{VX} V^{-1} P_{VX} V^{-1}\tilde u|. \tag{A.0.1}$$
Define $\Omega = (X^T V^{-1} X)^{-1}(X^T V^{-2} X)(X^T V^{-1} X)^{-1}(X^T V^{-2} X)(X^T V^{-1} X)^{-1}$. Using our expression in (A.0.1), we find that
$$\left|\frac{\partial^2\hat\theta_i^B}{\partial(\sigma_u^2)^2}\right|^{2r} \le 4^{2r-1}\Big[|B_i^3 D_L^{-2} e_i^T\tilde u|^{2r} + B_i^{4r} D_L^{-2r}\left[x_i^T (X^T V^{-1} X)^{-1} x_i\,\tilde u^T V^{-3}\tilde u\right]^r$$
$$+\; B_i^{2r}\left[x_i^T (X^T V^{-1} X)^{-1} x_i\,\tilde u^T V^{-5}\tilde u\right]^r + B_i^{2r}\left[x_i^T\Omega x_i\,\tilde u^T V^{-3}\tilde u\right]^r\Big]$$
$$\le 4^{2r-1}\left[B_i^r D_L^{-4r}|\hat\theta_i - x_i^T\tilde\beta|^{2r} + 3 D_L^{-3r}\left[(1 + D_U D_L^{-1})(\max_{1\le i\le m} h_i)\,\tilde u^T V^{-1}\tilde u\right]^r\right]. \tag{A.0.2}$$
From equation (A.0.2) and since $\tilde u^T V^{-1}\tilde u \le u^{*T} V^{-1} u^*$, it follows that
$$\sup_{\sigma_u^2\ge 0}\left|\frac{\partial^2\hat\theta_i^B}{\partial(\sigma_u^2)^2}\right|^{2r} \le 4^{2r-1} D_L^{-4r}\left[D_U^r \sup_{\sigma_u^2\ge 0}\left|\frac{\hat\theta_i - x_i^T\beta}{(\sigma_u^2 + D_i)^{1/2}}\right|^{2r} + 3 (D_L + D_U)^r(\max_{1\le i\le m} h_i)^r \sup_{\sigma_u^2\ge 0}\left(u^{*T} V^{-1} u^*\right)^r\right],$$


where for all $\sigma_u^2 \ge 0$,
$$\left|\frac{\hat\theta_i - x_i^T\beta}{(\sigma_u^2 + D_i)^{1/2}}\right|^{2r} \stackrel{d}{=} |Z|^{2r}, \quad Z \sim N(0, 1).$$
Also, for all $\sigma_u^2 \ge 0$, $(u^{*T} V^{-1} u^*)^r \stackrel{d}{=} W_m^r$, where $W_m \sim \chi^2_m$. This implies that
$$E\left[\sup_{\sigma_u^2\ge 0}\left|\frac{\partial^2\hat\theta_i^B}{\partial(\sigma_u^2)^2}\right|^{2r}\right] \le 4^{2r-1} D_L^{-4r}\left[D_U^r E[|Z|^{2r}] + 3(D_L + D_U)^r(\max_{1\le i\le m} h_i)^r E[W_m^r]\right] = O(1).$$

Recall that $u^* = \hat\theta - X\beta \sim N(0, V)$. We have the following collection of results:

Lemma 3: Let $r > 0$ and assume $\max_{1\le i\le m}\|x_i\| = O(1)$. Then $\|\hat\theta - X\tilde\beta\|^{2r} = O_p(m^r)$ and $E\left[\|\hat\theta - X\tilde\beta\|^{2r}\right] = O(m^r)$.

Proof of Lemma 3. Recall $\tilde u = \hat\theta - X\tilde\beta$ and let $\Sigma$ denote its covariance matrix. Then $\Sigma^{-1/2}\tilde u \sim N(0, I)$. Let $W = \tilde u^T\Sigma^{-1}\tilde u \sim \chi^2_m$ and observe
$$\tilde u^T\Sigma^{-1}\tilde u \ge \tilde u^T V^{-1}\tilde u \ge \tilde u^T(\sigma_u^2 + D_U)^{-1} I\,\tilde u = (\sigma_u^2 + D_U)^{-1}\|\tilde u\|^2.$$
This implies that $\|\tilde u\|^{2r} \le (\sigma_u^2 + D_U)^r W^r = O_p(m^r)$. Also, $E\left[\|\tilde u\|^{2r}\right] \le (\sigma_u^2 + D_U)^r E[W^r] = O(m^r)$.

Lemma 4: Let $z \sim N_p(0, \Sigma)$ with matrices $A_{p\times p}$ and $B_{p\times p}$, where $B$ is symmetric. Then
(i) $\mathrm{Cov}(z^T A z, z^T B z) = 2\,\mathrm{tr}(A\Sigma B\Sigma)$;
(ii) $\mathrm{Cov}(z^T A z, (z^T B z)^2) = 4\,\mathrm{tr}(A\Sigma B\Sigma)\,\mathrm{tr}(B\Sigma) + 8\,\mathrm{tr}(A\Sigma B\Sigma B\Sigma)$.

Proof of (i). See Searle (1971, pg. 51).

Proof of (ii). First, let $\Sigma = I_p$. By the spectral decomposition theorem, define $D := PBP^T$, where $P$ is orthogonal and $D$ is diagonal with eigenvalues $\lambda_i$. Define $C := PAP^T$. We know that $z^T B z = z^T P^T D P z$ and $z^T A z = z^T P^T C P z$. Also, since $z \sim N_p(0, I_p)$ and $z \stackrel{d}{=} P z$, $\mathrm{Cov}(z^T C z, (z^T D z)^2) = \mathrm{Cov}(z^T A z, (z^T B z)^2)$. Then by the above and algebra, we can show
$$E[(z^T D z)^2] = 2\,\mathrm{tr}(B^2) + [\mathrm{tr}(B)]^2$$


and
$$E\left[z^T C z\,(z^T D z)^2\right] = 8\,\mathrm{tr}(AB^2) + 2\,\mathrm{tr}(A)\,\mathrm{tr}(B^2) + 4\,\mathrm{tr}(AB)\,\mathrm{tr}(B) + \mathrm{tr}(A)[\mathrm{tr}(B)]^2.$$
Hence,
$$\mathrm{Cov}(z^T A z, (z^T B z)^2) = 8\,\mathrm{tr}(AB^2) + 4\,\mathrm{tr}(AB)\,\mathrm{tr}(B). \tag{A.0.3}$$
Now we assume general $\Sigma$ and let $w = \Sigma^{-1/2} z \sim N_p(0, I_p)$. By (A.0.3), we observe that
$$\mathrm{Cov}(z^T A z, (z^T B z)^2) = \mathrm{Cov}\left(w^T\Sigma^{1/2} A\Sigma^{1/2} w, (w^T\Sigma^{1/2} B\Sigma^{1/2} w)^2\right) = 4\,\mathrm{tr}(A\Sigma B\Sigma)\,\mathrm{tr}(B\Sigma) + 8\,\mathrm{tr}(A\Sigma B\Sigma B\Sigma).$$

Lemma 5: $E[(\tilde\sigma_u^2 - \sigma_u^2)^2] = 2(m-p)^{-2}\sum_{i=1}^m(\sigma_u^2 + D_i)^2 + O(m^{-2})$.

Proof. Observe $m - p = \mathrm{tr}\{I - P_X\}$ and define $d = \sum_i D_i(1 - h_i) = \mathrm{tr}\{(I - P_X)D\}$, where $D = \mathrm{Diag}\{D_i\}$. Also, recall $u^* = \hat\theta - X\beta$. Then
$$E[(\tilde\sigma_u^2 - \sigma_u^2)^2] = (m-p)^{-2} E\left[\left(u^{*T}(I - P_X)u^* - \sigma_u^2(m-p) - d\right)^2\right]$$
$$= (m-p)^{-2} E\left[\left(u^{*T}(I - P_X)u^* - \mathrm{tr}\{(I - P_X)V\}\right)^2\right]$$
$$= (m-p)^{-2}\left[E\left[\left(u^{*T}(I - P_X)u^*\right)^2\right] - 2\,\mathrm{tr}\{(I - P_X)V\}\,E\left[u^{*T}(I - P_X)u^*\right] + \left(\mathrm{tr}\{(I - P_X)V\}\right)^2\right]$$
$$= 2(m-p)^{-2}\,\mathrm{tr}\{(I - P_X)V(I - P_X)V\}.$$
Using matrix manipulations, it is easy to show that
$$E[(\tilde\sigma_u^2 - \sigma_u^2)^2] = 2(m-p)^{-2}\sum_{i=1}^m(\sigma_u^2 + D_i)\left[(\sigma_u^2 + D_i) + x_i^T(X^T X)^{-1} X^T V X(X^T X)^{-1} x_i - 2(\sigma_u^2 + D_i)h_i\right]$$
$$= 2(m-p)^{-2}\sum_{i=1}^m(\sigma_u^2 + D_i)^2 + O(m^{-2}).$$
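Both covariance identities in Lemma 4 and the leading term in Lemma 5 can be spot-checked by Monte Carlo. This sketch uses fixed illustrative matrices and parameter values of my own choosing; agreement holds only up to simulation noise and, for Lemma 5, up to the $O(m^{-2})$ remainder.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Lemma 4: quadratic-form covariance identities, z ~ N_p(0, Sigma) ---
Sigma = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]])
A = np.array([[1.0, 1.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])  # arbitrary A
B = np.diag([1.0, 1.0, 2.0])                                        # symmetric B

n = 400_000
z = rng.multivariate_normal(np.zeros(3), Sigma, size=n)
qA = np.einsum('ni,ij,nj->n', z, A, z)   # z^T A z for each draw
qB = np.einsum('ni,ij,nj->n', z, B, z)   # z^T B z for each draw

cov_i = np.cov(qA, qB)[0, 1]                         # Monte Carlo, part (i)
exact_i = 2 * np.trace(A @ Sigma @ B @ Sigma)
cov_ii = np.cov(qA, qB**2)[0, 1]                     # Monte Carlo, part (ii)
exact_ii = (4 * np.trace(A @ Sigma @ B @ Sigma) * np.trace(B @ Sigma)
            + 8 * np.trace(A @ Sigma @ B @ Sigma @ B @ Sigma))

# --- Lemma 5: variance of the untruncated moment estimator of sigma_u^2 ---
m, p = 40, 2
X = np.column_stack([np.ones(m), rng.normal(size=m)])
D = rng.uniform(0.5, 2.0, size=m)
sigma2, beta = 1.0, np.array([2.0, 1.0])
P_X = X @ np.linalg.solve(X.T @ X, X.T)
d = np.sum(D * (1 - np.diag(P_X)))

reps = 20_000
sq_err = np.empty(reps)
for k in range(reps):
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), m) + rng.normal(0.0, np.sqrt(D))
    r = y - P_X @ y
    sq_err[k] = ((r @ r - d) / (m - p) - sigma2) ** 2

mc_lemma5 = np.mean(sq_err)                               # E[(sigma_tilde - sigma)^2]
lead_lemma5 = 2 * np.sum((sigma2 + D) ** 2) / (m - p) ** 2  # leading term of Lemma 5
```

With these sizes, `cov_i` and `cov_ii` land within a few percent of their exact values, and `mc_lemma5` tracks `lead_lemma5` up to the $O(m^{-2})$ terms that the lemma absorbs into the remainder.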


REFERENCES

Battese, G., Harter, R. and Fuller, W. (1988). An error-components model for prediction of county crop area using survey and satellite data. Journal of the American Statistical Association 83, 28–36.

Bell, W. (1999). Accounting for uncertainty about variances in small area estimation. Bulletin of the International Statistical Institute, 52nd Session, Helsinki.

Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis, 2nd Edition. Springer-Verlag, New York.

Butar, F. and Lahiri, P. (2003). On measures of uncertainty of empirical Bayes small area estimators. J. Statist. Plann. Inference 112, 63–76.

Datta, G. and Ghosh, M. (1991). Bayesian prediction in linear models: applications to small area estimation. Annals of Statistics 19, 1748–1770.

Datta, G., Ghosh, M., Steorts, R. and Maples, J. (2011). Bayesian benchmarking with applications to small area estimation. TEST 62, 574–588.

Datta, G. and Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statistica Sinica 10, 613–627.

Datta, G., Rao, J. and Smith, D. (2005). On measuring the variability of small area estimators under a basic area level model. Biometrika 92, 183–196.

Everson, P. and Morris, C. (2000). Inference of multivariate normal hierarchical models. Journal of the Royal Statistical Society, Series B 20, 399–412.

Fay, R. and Herriot, R. (1979). Estimates of income from small places: an application of James-Stein procedures to census data. Journal of the American Statistical Association 74, 269–277.

Ghosh, M. (1992). Constrained Bayes estimation with applications. Journal of the American Statistical Association 87, 533–540.

Ghosh, M., Kim, D., Sinha, K., Maiti, T., Katzoff, M. and Parsons, V. (2009). Hierarchical and empirical Bayes small domain estimation of the proportion of persons without health insurance for minority subpopulations. Survey Methodology 35, 53–66.

Ghosh, M. and Rao, J. (1994). Small area estimation: An appraisal. Statistical Science 9, 55–83.

Ghosh, M. and Steorts, R. (2011). Two stage Bayesian benchmarking as applied to small area estimation. Submitted.

Henderson, C. (1950). Estimation of genetic parameters (abstract). Annals of Mathematical Statistics 21, 309–310.


Henderson, C. (1975). Best linear unbiased estimation and prediction under a selection model. Biometrics 31, 423–447.

Isaki, C., Tsay, J. and Fuller, W. (2000). Estimation of census adjustment factors. Survey Methodology 26, 31–42.

Louis, T. (1984). Estimating a population of parameter values using Bayes and empirical Bayes methods. Journal of the American Statistical Association 79.

Pfeffermann, D. and Barnard, C. (1991). Some new estimators for small area means with application to the assessment of farmland values. Journal of Business and Economic Statistics 9, 31–42.

Pfeffermann, D. and Nathan, G. (1981). Regression analysis of data from a cluster sample. Journal of the American Statistical Association 76, 681–689.

Pfeffermann, D. and Tiller, R. (2006). Small area estimation with state-space models subject to benchmark constraints. Journal of the American Statistical Association 101, 1387–1397.

Prasad, N. and Rao, J. (1990). The estimation of the mean squared error of small area estimators. Journal of the American Statistical Association 85, 163–171.

Rao, J. (1999). Some recent advances in model-based small area estimation. Survey Methodology 25, 175–186.

Rao, J. (2003). Small Area Estimation. Wiley, New York.

Robinson, G. (1991). That BLUP is a good thing: The estimation of random effects. Statistical Science 6, 15–31.

Searle, S. (1971). Linear Models. Wiley, New York.

Steorts, R. and Ghosh, M. (2012). On estimation of mean squared errors of benchmarked empirical Bayes estimators. Submitted.

Wang, J., Fuller, W. and Qu, Y. (2008). Small area estimation under a restriction. Survey Methodology 34, 29–36.

You, Y. and Rao, J. (2002). A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights. The Canadian Journal of Statistics 30, 431–439.

You, Y. and Rao, J. (2003). Pseudo hierarchical Bayes small area estimation combining unit level models and survey weights. Journal of Statistical Planning and Inference 111, 197–208.

You, Y., Rao, J. and Dick, P. (2004). Benchmarking hierarchical Bayes small area estimators in the Canadian census undercoverage estimation. Statistics in Transition 6, 631–640.


Zellner, A. (1986). Further results on Bayesian minimum expected loss (MELO) estimates and posterior distributions for structural coefficients. Advances in Econometrics, 171–182.

Zellner, A. (1988). Bayesian analysis in econometrics. Journal of Econometrics 37, 27–50.

Zellner, A. (1994). Statistical decision theory and related topics. In Statistical Decision Theory and Related Topics (S. Gupta and J. Berger, eds.). Springer-Verlag, New York, 337–390.


BIOGRAPHICAL SKETCH

Rebecca Carter Steorts graduated from Davidson College in 2005 with a B.S. in Mathematics. She then completed an M.S. in Mathematical Sciences at Clemson University in 2007. Finally, she received her Ph.D. from the University of Florida in the Department of Statistics in 2012 under the direction of Malay Ghosh. Her current areas of interest include small area estimation, survey methodology, Bayesian methodology, and decision theory. During her time at the University of Florida she received the United States Census Bureau Dissertation Fellowship, and she was also awarded the UF Innovation through Institutional Integration (I-Cubed) Program (funded by the NSF) Teaching Award for development of a new course to the department curriculum in spring 2011. She plans to join the Department of Statistics at Carnegie Mellon University after graduation as a Visiting Assistant Professor.