## Citation

- Permanent Link:
- http://ufdc.ufl.edu/UFE0021364/00001
## Material Information

- Title:
- Empirical and Hierarchical Bayesian Methods with Applications to Small Area Estimation
- Creator:
- Roy, Ananya
- Place of Publication:
- [Gainesville, Fla.] Florida
- Publisher:
- University of Florida
- Publication Date:
- 2007
- Language:
- english
- Physical Description:
- 1 online resource (97 p.)
## Thesis/Dissertation Information

- Degree:
- Doctorate (Ph.D.)
- Degree Grantor:
- University of Florida
- Degree Disciplines:
- Statistics
- Committee Chair:
- Ghosh, Malay
- Committee Members:
- Presnell, Brett D.
- Randles, Ronald H.
- Daniels, Michael J.
- Rao, Murali
## Subjects

- Subjects / Keywords:
- bayes, binary, estimation, mse, robust
- Statistics -- Dissertations, Academic -- UF
- Genre:
- bibliography ( marcgt )
- theses ( marcgt )
- government publication (state, provincial, territorial, dependent) ( marcgt )
- born-digital ( sobekcm )
- Electronic Thesis or Dissertation
- Statistics thesis, Ph.D.
## Notes

- Abstract:
- This dissertation focuses on the formulation of empirical and hierarchical Bayesian techniques in the context of small area estimation. In the first part of the dissertation, we consider robust Bayes and empirical Bayes (EB) procedures for estimation of small area means. We have introduced the notion of influence functions in the small area context, and have developed robust Bayes and EB estimators based on such influence functions. We have derived an expression for the predictive influence function, and based on a standardized version of the same, we have proposed some new small area estimators. The mean squared errors and estimated mean squared errors of these estimators have also been found. The findings are validated by a simulation study. In the second part, we have considered small area estimation for bivariate binary data in the presence of covariates. We have developed EB estimators along with their mean squared errors. In this case the covariates were assumed to be completely observed. In the presence of missing covariates, we have developed a hierarchical Bayes approach for bivariate binary responses. Under the assumption that the missingness mechanism is missing at random (MAR), we have suggested methods to estimate the small area means along with the associated standard errors. Our findings have been supported by appropriate data analyses. ( en )
- General Note:
- In the series University of Florida Digital Collections.
- General Note:
- Includes vita.
- Bibliography:
- Includes bibliographical references.
- Source of Description:
- Description based on online resource; title from PDF title page.
- Source of Description:
- This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
- Thesis:
- Thesis (Ph.D.)--University of Florida, 2007.
- Local:
- Adviser: Ghosh, Malay.
- Electronic Access:
- RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2010-08-31
- Statement of Responsibility:
- by Ananya Roy.
## Record Information

- Source Institution:
- UFRGP
- Rights Management:
- Copyright Roy, Ananya. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
- Embargo Date:
- 8/31/2010
- Classification:
- LD1780 2007 ( lcc )
## Downloads

This item has the following downloads:

## Full Text

ACKNOWLEDGMENTS

I would like to extend my sincerest gratitude to my advisor, Professor Malay Ghosh, for his invaluable guidance and constant support throughout my graduate studies. It was a privilege to have him as my advisor, who was always available to help. I would also like to thank Professors Brett Presnell, Ronald Randles, Michael Daniels and Murali Rao for serving on my committee. My special thanks to Professors Dalho Kim and Ming-Hui Chen for helping me with computation. I am indebted to my professors from my undergraduate days whose teaching inspired me to pursue higher studies in Statistics. I would especially like to thank Professor Bhaskar Bose from my higher secondary days. He introduced me to Statistics and helped me develop a basic understanding of the subject. My sincerest thanks to Professor Indranil Mukherjee for his constant encouragement during my undergraduate days. The extent of his knowledge and problem-solving capabilities has always inspired me. During my stay at the University of Florida, I have been lucky to have marvelous teachers like Professors Malay Ghosh, Brett Presnell, Andre Khuri and Ronald Randles. They have taught me a lot and will always inspire me to be a good teacher and researcher. I would like to thank my friends and classmates in the department, especially George and Mihai, for helping me whenever I needed it. Life would have been really different in Gainesville if not for friends like Vivekananda, Siulidi and Bharati. I am grateful for their support, guidance and, well, just for being there. I have been fortunate to have come to know and grow close to people like Dolakakima, Bhramardi and little Munai. Their genuine affection and warmth, along with the love and support of my friends, and, of course, 'Asha Parivar', has made Gainesville a second home to me. I don't think any of this would have been possible if not for the encouragement and support of my family.
My heartfelt thanks to my parents, who have always supported me in all the choices I have made and encouraged me to move forward. I owe everything that I have achieved to my mother, who has always been my inspiration.

Let $x_{ij} = (x_{ij1}, \ldots, x_{ij6})^T$, $j = 1, \ldots, n_i$, $i = 1, 2, \ldots, k = 48$. Covariates $x_{ij1}, x_{ij4}, x_{ij5}$ are completely observed, while $x_{ij2}, x_{ij3}$ and $x_{ij6}$ have missing values. The models considered for these three missing covariates are given as follows:

$$f(x_{ij6} \mid \alpha_6) = \frac{\exp\{x_{ij6}(\alpha_{61} + \alpha_{62} x_{ij4} + \alpha_{63} x_{ij5})\}}{1 + \exp(\alpha_{61} + \alpha_{62} x_{ij4} + \alpha_{63} x_{ij5})},$$

where $\alpha_6 = (\alpha_{61}, \alpha_{62}, \alpha_{63})^T$;

$$P((x_{ij2}, x_{ij3}) = (1, 0) \mid \alpha_2, \alpha_3) = \frac{\exp(\alpha_{21} + \alpha_{22} x_{ij4} + \alpha_{23} x_{ij5} + \alpha_{24} x_{ij6})}{1 + \exp(\alpha_{21} + \cdots + \alpha_{24} x_{ij6}) + \exp(\alpha_{31} + \alpha_{32} x_{ij4} + \alpha_{33} x_{ij5} + \alpha_{34} x_{ij6})}$$

and

$$P((x_{ij2}, x_{ij3}) = (0, 1) \mid \alpha_2, \alpha_3) = \frac{\exp(\alpha_{31} + \alpha_{32} x_{ij4} + \alpha_{33} x_{ij5} + \alpha_{34} x_{ij6})}{1 + \exp(\alpha_{21} + \cdots + \alpha_{24} x_{ij6}) + \exp(\alpha_{31} + \cdots + \alpha_{34} x_{ij6})},$$

where $\alpha_l = (\alpha_{l1}, \alpha_{l2}, \alpha_{l3}, \alpha_{l4})^T$ for $l = 2, 3$. We do not need to model $x_{ij1}, x_{ij4}, x_{ij5}$. We assume improper uniform priors for $\alpha_2$, $\alpha_3$ and $\alpha_6$, i.e., $\pi(\alpha) \propto 1$, where $\alpha = (\alpha_2^T, \alpha_3^T, \alpha_6^T)^T$. To prove the propriety of the posterior under a uniform prior for $\alpha$, we start with the complete likelihood for the covariates:

$$L(\alpha_6, \alpha_2, \alpha_3 \mid D_{comp}) = \prod_{i=1}^{k} \prod_{j=1}^{n_i} f(x_{ij6} \mid \alpha_6)\, \frac{\exp\{x_{ij2}(\alpha_{21} + \alpha_{22} x_{ij4} + \alpha_{23} x_{ij5} + \alpha_{24} x_{ij6}) + x_{ij3}(\alpha_{31} + \alpha_{32} x_{ij4} + \alpha_{33} x_{ij5} + \alpha_{34} x_{ij6})\}}{1 + \exp(\alpha_{21} + \cdots + \alpha_{24} x_{ij6}) + \exp(\alpha_{31} + \cdots + \alpha_{34} x_{ij6})},$$

where $D_{comp} = \{(x_{ij2}, x_{ij3}, x_{ij6}, x_{ij4}, x_{ij5}) : j = 1, \ldots, n_i,\ i = 1, \ldots, k\}$. We consider the following four cases.

Writing $m_{ij}(b) = (m_{1ij}, m_{2ij}, m_{3ij})^T$,

$$E[(\hat p^{EB}_{ij} - \hat p^{B}_{ij})(\hat p^{EB}_{ij} - \hat p^{B}_{ij})^T] = \frac{A^2}{(A+1)^2}\, E[\{m_{ij}(\hat b) - m_{ij}(b)\}\{m_{ij}(\hat b) - m_{ij}(b)\}^T].$$

By a one-step Taylor expansion, for $l = 1, 2, 3$, $m_{lij}(\hat b) \doteq m_{lij}(b) + (\partial m_{lij}(b)/\partial b)^T(\hat b - b)$. Hence, for $1 \le l, l' \le 3$,

$$E[\{m_{lij}(\hat b) - m_{lij}(b)\}\{m_{l'ij}(\hat b) - m_{l'ij}(b)\}] \doteq \Big(\frac{\partial m_{lij}(b)}{\partial b}\Big)^T E[(\hat b - b)(\hat b - b)^T]\, \Big(\frac{\partial m_{l'ij}(b)}{\partial b}\Big). \qquad (3\text{-}25)$$

Next we find an approximation to $E[(\hat b - b)(\hat b - b)^T]$ which is correct up to $O(k^{-1})$. To this end, let $S_k(b) = \sum_{i=1}^{k}\sum_{j=1}^{n_i} D_{ij}^T \Sigma_{ij}^{-1} g_{ij}$. Noting that $E(g_{ij}) = 0$, we get

$$U_k = \sum_{i=1}^{k}\sum_{j=1}^{n_i} D_{ij}^T \Sigma_{ij}^{-1} D_{ij}. \qquad (3\text{-}26)$$

We denote by $U_k$ the expression on the right-hand side of Eq. 3-26. We assume that $U_k = O(k)$.
This leads to the approximation $E[(\hat b - b)(\hat b - b)^T] \doteq U_k^{-1} U_k U_k^{-1} = U_k^{-1} = O(k^{-1})$, which, combined with Eq. 3-25, yields the second-order approximations (Eqs. 3-23, 3-24 and 3-27) for the cross-product terms $E[\{m_{ij}(\hat b) - m_{ij}(b)\}\{m_{ij'}(\hat b) - m_{ij'}(b)\}^T]$ entering the MSE of the EB estimators.

The improvements were, for the first variable, the stated percentages for ARMSE, 42.6% for AD and 65.9% for SMSE. For the second variable, the improvements were 50.4% for ARD, 87.6% for ARMSE, 24.5% for AD and 71.5% for SMSE. Table 3-1 and Table 3-2 summarize the performance of the EB estimates for the first and second small area means, respectively, in terms of their bias and MSE estimates.

Table 3-1. Relative biases of the point estimates and relative bias of MSE estimates for the small area mean: the first variable.
A ARD ARMSE AD AMSE EMSE REB
0.614 0.289 0.192 0.358 0.290 0.180 0.036 0.330 0.071 2.320 0.0402 0.005
k = 10: 0.081 0.079 0.069
k = 20: 0.080 0.089 0.068
0.011 0.012 0.008 0.0108 0.013 0.0073 0.037 0.043 0.023 0.036 0.039 0.022 2.385 2.355 3.276 2.406 2.347 2.406

Table 3-2. Relative biases of the point estimates and relative bias of MSE estimates for the small area mean: the second variable.
A ARD ARMSE AD AMSE EMSE REB
0.422 0.208 0.151 0.209 0.290 0.1412 0.007 0.182 0.050 0.355 0.008 0.023
k = 10: 0.074 0.084 0.063
k = 20: 0.075 0.079 0.062
0.009 0.010 0.006 0.010 0.011 0.006 0.007 0.009 0.005 0.007 0.008 0.004 0.214 0.283 0.196 0.221 0.225 0.206

1.5 Layout of this Dissertation

The first part of this dissertation deals with robust Bayes and empirical Bayes methods for SAE. In the second part, we propose EB and HB methods in SAE when the responses are bivariate binary. In Chapter 2, we have considered the basic area level model and proposed some new robust small area estimation procedures based on standardized residuals. The approach consists of first finding influence functions corresponding to each individual area level observation by measuring the divergence between the two posteriors of regression coefficients with and without that observation.
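The influence-function idea described above can be illustrated with a toy sketch. The following is a minimal, hypothetical example (the function names and the simple normal area-level model with known variances are my assumptions, not the dissertation's actual model): it scores each observation by the Kullback-Leibler divergence between the posterior of the regression coefficient computed with and without that observation.

```python
import math

def beta_posterior(ys, xs, var):
    # Flat prior on beta with y_i ~ N(x_i * beta, var):
    # the posterior of beta is normal with these moments.
    prec = sum(x * x for x in xs) / var
    mean = sum(x * y for x, y in zip(xs, ys)) / var / prec
    return mean, 1.0 / prec

def kl_normal(m1, v1, m2, v2):
    # KL( N(m1, v1) || N(m2, v2) ) for univariate normals
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def influence(ys, xs, var, i):
    # Divergence between the full-data posterior and the
    # posterior with observation i deleted (case-deletion influence)
    full = beta_posterior(ys, xs, var)
    loo = beta_posterior(ys[:i] + ys[i + 1:], xs[:i] + xs[i + 1:], var)
    return kl_normal(*full, *loo)

ys = [1.0, 1.1, 0.9, 5.0]   # last area is a gross outlier
xs = [1.0, 1.0, 1.0, 1.0]
scores = [influence(ys, xs, 1.0, i) for i in range(len(ys))]
assert max(range(4), key=lambda i: scores[i]) == 3  # outlier is most influential
```

In the dissertation's actual procedure these influence scores are standardized and used to downweight outlying areas in the robust Bayes and EB estimators; the sketch only shows how a single observation's influence can be measured through posterior divergence.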
Next, based on these influence functions, properly standardized, we have proposed some new robust Bayes and empirical Bayes (EB) small area estimators. The approximated MSE's and estimated MSE's of these estimators have also been found. In Chapter 3, we have addressed small area problems when the response is bivariate binary in the presence of covariates. We have proposed EB estimators of the small area means by assigning a conjugate prior to the parameters after suitable transformation and subsequent modelling on the covariates. We have also provided second-order approximate expressions for the MSE's and estimated MSE's of these estimators. In Chapter 4, we have proposed HB estimators in the bivariate binary setup, where some of the covariates may be partially missing. We implemented hierarchical Bayesian models which accommodate this missingness. Estimators of small area means along with the associated posterior standard errors and posterior correlations have also been provided. Table 4-1.
Bivariate HB estimates along with the Direct estimates and the True values

area n_i P1 P2 D1 D2 HB1 HB2
1 695 0.007 0.068 0.004 0.062 0.006 0.080
2 3 0.015 0.099 0.000 0.000 0.007 0.082
3 41 0.006 0.061 0.000 0.122 0.011 0.108
4 1477 0.006 0.068 0.006 0.071 0.006 0.075
5 425 0.014 0.131 0.024 0.150 0.006 0.077
6 345 0.006 0.067 0.008 0.064 0.006 0.082
7 510 0.005 0.067 0.004 0.061 0.009 0.102
8 888 0.009 0.116 0.008 0.114 0.009 0.116
9 44 0.004 0.081 0.000 0.091 0.005 0.073
10 1718 0.004 0.063 0.002 0.058 0.007 0.083
11 379 0.006 0.068 0.005 0.082 0.005 0.075
12 126 0.013 0.130 0.030 0.135 0.007 0.082
13 5 0.005 0.068 0.000 0.200 0.009 0.091
14 15 0.003 0.077 0.000 0.067 0.006 0.076
15 13 0.003 0.070 0.000 0.000 0.007 0.083
16 975 0.005 0.072 0.007 0.060 0.010 0.107
17 349 0.011 0.135 0.017 0.149 0.007 0.083
18 4 0.005 0.112 0.000 0.500 0.007 0.080
19 58 0.003 0.074 0.000 0.069 0.005 0.078
20 288 0.005 0.082 0.003 0.090 0.007 0.082
21 27 0.002 0.078 0.000 0.111 0.007 0.079
22 237 0.004 0.067 0.000 0.076 0.006 0.076
23 987 0.005 0.068 0.004 0.078 0.006 0.079
24 6 0.019 0.052 0.000 0.000 0.005 0.078
25 50 0.008 0.084 0.000 0.040 0.008 0.084
26 365 0.004 0.059 0.003 0.088 0.004 0.072
27 453 0.008 0.064 0.011 0.077 0.005 0.073
28 78 0.019 0.120 0.012 0.128 0.005 0.078
29 4 0.016 0.093 0.000 0.000 0.007 0.081
30 15 0.004 0.076 0.000 0.133 0.005 0.072
31 19 0.007 0.064 0.000 0.158 0.007 0.086
32 446 0.008 0.071 0.002 0.074 0.005 0.073
33 76 0.015 0.133 0.000 0.145 0.005 0.069
34 8 0.008 0.085 0.000 0.125 0.007 0.086
35 22 0.006 0.056 0.045 0.136 0.009 0.093
36 290 0.014 0.133 0.014 0.131 0.005 0.073
37 3 0.009 0.074 0.000 0.000 0.008 0.086
38 18 0.006 0.061 0.000 0.111 0.006 0.082
39 933 0.005 0.066 0.004 0.062 0.007 0.084
40 207 0.012 0.119 0.019 0.126 0.006 0.082
41 4 0.000 0.084 0.000 0.000 0.008 0.085
42 21 0.008 0.084 0.000 0.048 0.007 0.089
43 1472 0.007 0.070 0.007 0.071 0.008 0.089
44 27 0.004 0.073 0.000 0.037 0.005 0.076
45 275 0.006 0.067 0.003 0.069 0.005 0.072
46 1131 0.006 0.065 0.008 0.060 0.008 0.085
47 191 0.014 0.125 0.016 0.126 0.008 0.087
48 303 0.006 0.060 0.003 0.066 0.008 0.085

For (ii), given $b$, $\sigma^2$, $x_{mis}$, and $D_{obs}$, the $v_i$'s are conditionally independent with

$$\pi(v_i \mid b, \sigma^2, x_{mis}, D_{obs}) \propto \exp\Big\{\sum_{j=1}^{n_i}\sum_{l=1}^{4}\big[z_{ijl}(x_{ij}^T b_l + v_i) - \exp(x_{ij}^T b_l + v_i)\big] - \frac{v_i^2}{2\sigma^2}\Big\},$$

which is log-concave in $v_i$. Again, we sample $v_i$ using the adaptive rejection algorithm of Gilks and Wild (1992) for $i = 1, 2, \ldots, k$. For (iii), we have

$$\sigma^2 \mid v, D_{obs} \sim \text{Inverse Gamma}\Big(c + \frac{k}{2},\ d + \frac{1}{2}\sum_{i=1}^{k} v_i^2\Big).$$

Thus, sampling $\sigma^2$ is straightforward. For (iv), given $b$, $v$, $\alpha$, $D_{obs}$, the $x_{ij,mis}$'s are conditionally independent with

$$\pi(x_{ij,mis} \mid b, v, \alpha, D_{obs}) \propto \exp\Big\{\sum_{l}\big[z_{ijl}(x_{ij}^T b_l + v_i) - \exp(x_{ij}^T b_l + v_i)\big]\Big\}\, f(x_{ij} \mid \alpha).$$

Sampling $x_{ij,mis}$ depends on the form of $f(x_{ij} \mid \alpha)$. For the data given in the next section, $x_{ij,mis}$ is a vector of discrete covariates and hence, it is easy to sample $x_{ij,mis}$ from its conditional posterior distribution. For (v), $\pi(\alpha \mid x_{mis}, D_{obs}) \propto \big[\prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij} \mid \alpha)\big]\,\pi(\alpha)$. For various covariate distributions specified through a series of smaller dimensional conditional distributions, sampling $\alpha$ is straightforward.

4.3 Data Analysis

The motivating example for our study is to estimate jointly the proportion of newborns with low birthweight and the infant mortality rate at low levels of geography such as districts within a state. The data come from an infant mortality study conducted by the National Center for Health Statistics (NCHS) involving birth cohorts from 1995 to 1999. The original survey was designed to obtain reliable estimates at the state level. The same data need to be used to produce estimates at the district level. Other than the intercept term, the covariates considered were (i) mother's schooling categorized into less than 12 years, equal to 12 years, or greater than 12 years; (ii) mother's age categorized into less than 20 years, between 20 and 34 years, and greater than 34 years; and (iii) the smoking habit of the mother categorized into mother does not smoke and the average number of cigarettes smoked.
$$n_{+ij} = \begin{cases} b_{ij}, & (i,j) \in I_{11} \cup I_{21},\\ a_{ij}, & \text{otherwise.} \end{cases} \qquad (4\text{-}14)$$

We are now led to the following theorem.

Theorem 4-1. For $1 \le r \le q$ and $1 \le l \le 4$, suppose that

(C1) the matrix $U^{(l)}$ as defined in Eq. 4-11 is of full rank;
(C2) there exists a positive vector $a = (a_1, \ldots, a_{n+N_l})^T$ such that $U^{(l)T} a = 0$, where $U^{(l)}$ is defined in Eq. 4-12; (4-15)
(C3) $\int \|u\|^q\, dF_l(u) < \infty$ for $l = 1, 2, 3$.

If conditions (C1)-(C3) are satisfied, then the posterior is proper provided

$$\int \prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij,obs} \mid \alpha)\, d\alpha < \infty. \qquad (4\text{-}16)$$

We defer the proof of this theorem to the appendix. To implement the Gibbs sampling algorithm, we sample from the following conditional distributions in turn: (i) $\pi(b \mid v, x_{mis}, D_{obs})$, (ii) $\pi(v \mid b, \sigma^2, x_{mis}, D_{obs})$, (iii) $\pi(\sigma^2 \mid v, D_{obs})$, (iv) $\pi(x_{mis} \mid b, v, \alpha, D_{obs})$, and (v) $\pi(\alpha \mid x_{mis}, D_{obs})$. For (i), given $v$, $x_{mis}$, and $D_{obs}$, the $b_l$'s are conditionally independent with

$$\pi(b_l \mid v, x_{mis}, D_{obs}) \propto \exp\Big\{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\big[z_{ijl}(x_{ij}^T b_l + v_i) - \exp(x_{ij}^T b_l + v_i)\big]\Big\}.$$

It can be shown that $\pi(b_l \mid v, x_{mis}, D_{obs})$ is log-concave in $b_l$ and, hence, we can sample $b_l$ via the adaptive rejection algorithm of Gilks and Wild (1992) for $l = 1, \ldots, 4$. For (ii),

APPENDIX B
PROOFS OF THEOREMS 4-1 AND 4-2

Proof of Theorem 4-1: We start with the joint posterior given by

$$\pi(b, \sigma^2, \alpha, v, x_{mis} \mid D_{obs}) \propto \prod_{i=1}^{k}\prod_{j=1}^{n_i}\prod_{l=1}^{4} \exp\{z_{ijl}(x_{ij}^T b_l + v_i)\}\exp\{-\exp(x_{ij}^T b_l + v_i)\} \prod_{i=1}^{k}\frac{1}{\sigma}\exp\Big(-\frac{v_i^2}{2\sigma^2}\Big) \times \prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij,obs}, x_{ij,mis} \mid \alpha)\, \exp\Big(-\frac{c}{\sigma^2}\Big)(\sigma^2)^{-d/2-1}\pi(\alpha)\pi(b). \qquad (B\text{-}1)$$

Here $b = (b_1, b_2, b_3, b_4)$, $\pi(b) \propto 1$ and $\pi(\alpha) \propto 1$. Now, let $\eta_{ijl} = x_{ij}^T b_l + v_i$, $l = 1, 2, 3, 4$, $j = 1, \ldots, n_i$, $i = 1, \ldots, k$. Let us first integrate the posterior with respect to $\sigma^2$:

$$\pi(b, \alpha, v, x_{mis} \mid D_{obs}) \propto \prod_{i,j,l} \exp\{z_{ijl}\eta_{ijl}\}\exp\{-\exp(\eta_{ijl})\}\ (c + v^T v)^{-(d+k)/2} \prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij,obs}, x_{ij,mis} \mid \alpha). \qquad (B\text{-}2)$$

Then from Eq. B-2 and Eq. 4-10 we get

$$\pi(b, \alpha, v, x_{mis} \mid D_{obs}) \le C \prod_{l=1}^{4}\Big[\prod_{(i,j)\in I_{11}} f_1(\eta_{ijl}) \prod_{(i,j)\in I_{21}} f_2(\eta_{ijl}) f_3(\eta_{ijl})\Big] (c + v^T v)^{-(d+k)/2} \prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij,obs}, x_{ij,mis} \mid \alpha), \qquad (B\text{-}3)$$

where $f_1$, $f_2$ and $f_3$ are defined in Section 4.2.
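The cycle through conditional distributions (i)-(v) above can be sketched in code. The following is a minimal toy sketch, not the dissertation's actual sampler: it uses a simple normal model for the area effects so that step (ii) is conjugate (the real Poisson-form conditionals would require adaptive rejection sampling), while step (iii) keeps the inverse gamma update for $\sigma^2$. All data, priors, and names here are illustrative assumptions.

```python
import random

random.seed(1)
k, c, d = 30, 2.0, 2.0
y = [random.gauss(0.0, 2.0) for _ in range(k)]  # synthetic area-level observations

def gibbs(y, n_iter=2000):
    sigma2, draws = 1.0, []
    v = [0.0] * len(y)
    for _ in range(n_iter):
        # analogue of step (ii): v_i | sigma2, y is conjugate normal
        # in this toy model (y_i = v_i + e_i, e_i ~ N(0,1), v_i ~ N(0,sigma2))
        for i, yi in enumerate(y):
            post_var = sigma2 / (1.0 + sigma2)
            v[i] = random.gauss(yi * post_var, post_var ** 0.5)
        # analogue of step (iii): sigma2 | v ~ Inverse Gamma(c + k/2, d + sum(v^2)/2)
        shape = c + len(y) / 2.0
        rate = d + 0.5 * sum(vi * vi for vi in v)
        sigma2 = 1.0 / random.gammavariate(shape, 1.0 / rate)
        draws.append(sigma2)
    return draws

draws = gibbs(y)
post_mean = sum(draws[500:]) / len(draws[500:])  # discard burn-in
```

The full sampler of Section 4.2 would add steps (i), (iv) and (v), with the log-concave conditionals for the regression coefficients handled by the Gilks-Wild adaptive rejection algorithm.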
For $l = 1, \ldots, 4$ and $1 \le r \le q$, define $\tilde x^{(l)}_{ij,r}$ by taking the endpoint $a^{(l)}_{ij,r}$ or $b^{(l)}_{ij,r}$ according to the sign of $b_{lr}$ and whether $(i,j)$ lies in $\mathcal{M}$ or $\mathcal{M} \cap I_{21}$. Write $\tilde x_{ij} = (x_{ij,obs}, \tilde x_{ij,mis})$ and let $\tilde u^{(l)}_{ij}$ denote the corresponding stacked vector.

To my teachers and my family

Let $D_{obs} = (D_{obs,1}, D_{obs,2}, D_{obs,3})$ and let $x_{mis}$ denote the collection of all missing values. Combining Cases 1-3 leads to

$$L(\alpha_6, \alpha_2, \alpha_3 \mid D_{obs}) = L_1(\alpha_6, \alpha_2, \alpha_3 \mid D_{obs,1})\, L_2(\alpha_6, \alpha_2, \alpha_3 \mid D_{obs,2})\, L_3(\alpha_6, \alpha_2, \alpha_3 \mid D_{obs,3}).$$

To start with, let $u_{ij}^T = (x_{ij1}, x_{ij4}, x_{ij5}, x_{ij6})$, $u_{1,ij} = (u_{ij}^T, 0_4^T)^T$ and $u_{2,ij} = (0_4^T, u_{ij}^T)^T$. Also let $\gamma = (\alpha_2^T, \alpha_3^T)^T$, so that $\gamma_r = \alpha_{2r}$ and $\gamma_{4+r} = \alpha_{3r}$, $r = 1, \ldots, 4$. Let us partition the index set $\{(i,j) : (x_{ij2}, x_{ij3}) \text{ is observed}\}$ into three parts: $J_1 = \{(i,j) : x_{ij2} = 1, x_{ij3} = 0\}$, $J_2 = \{(i,j) : x_{ij2} = 0, x_{ij3} = 1\}$ and $J_3 = \{(i,j) : x_{ij2} = 0, x_{ij3} = 0\}$. Then the likelihood $L(\alpha_2, \alpha_3, \alpha_6 \mid D_{obs})$ can be written as

$$L(\alpha_2, \alpha_3, \alpha_6 \mid D_{obs}) = \prod_{i,j:\, x_{ij6}\ \text{observed}} f(x_{ij6} \mid \alpha_6) \prod_{(i,j)\in J_1}\ \sum_{x_{ij6,mis}} f(x_{ij6} \mid \alpha_6)\,\frac{\exp(u_{1,ij}^T\gamma)}{1 + \exp(u_{1,ij}^T\gamma) + \exp(u_{2,ij}^T\gamma)} \times \prod_{(i,j)\in J_2}\ \sum_{x_{ij6,mis}} f(x_{ij6} \mid \alpha_6)\,\frac{\exp(u_{2,ij}^T\gamma)}{1 + \exp(u_{1,ij}^T\gamma) + \exp(u_{2,ij}^T\gamma)} \times \prod_{(i,j)\in J_3}\ \sum_{x_{ij6,mis}} \frac{f(x_{ij6} \mid \alpha_6)}{1 + \exp(u_{1,ij}^T\gamma) + \exp(u_{2,ij}^T\gamma)}.$$

We follow mainly the techniques of Chen and Shao (2000) and Chen, Ibrahim, and Shao (2006) to prove the propriety of the posterior. First, for $(i,j) \in \mathcal{M}^c = \{(i,j) : x_{ij6} \text{ is observed}\}$, let $x^*_{ij} = (x_{ij1}, x_{ij4}, x_{ij5})^T$ and let $X^*$ be the $n^* \times 3$ design matrix with rows $x^{*T}_{ij}$, where $n^*$ is the cardinality of $\mathcal{M}^c$. Let $z^*_{ij} = 1 - 2x_{ij6}$, and let $X^{**}$ be the matrix with rows $z^*_{ij} x^{*T}_{ij}$, $(i,j) \in \mathcal{M}^c$. Let us state the following conditions: (D1) $X^{**}$ is of full rank; (D2) there exists a positive vector $a = (a_1, \ldots, a_{n^*})^T \in \mathbb{R}^{n^*}$ such that $X^{**T} a = 0$. (4-18)

CHAPTER 3
EMPIRICAL BAYES ESTIMATION FOR BIVARIATE BINARY DATA WITH APPLICATIONS TO SMALL AREA ESTIMATION

3.1 Introduction

Analysis of multivariate binary data is required in many areas of statistics.
For example, in longitudinal data analysis, quite often one encounters binary responses for two or more variables (Bishop et al. (1975); Zhao and Prentice (1990)). Bivariate binary data arise very naturally in ophthalmological studies. Item response theory, which occupies a central role in psychometry and educational statistics, deals almost exclusively with multivariate binary data. The classic example is the Rasch model with its many and varied extensions. Our interest in the analysis of multivariate binary data stems from their potential application in small area estimation. We consider the case when the response is bivariate binary. In Section 2, we have first derived the Bayes estimators, and subsequently the EB estimators, for the small area parameters. Section 3 is devoted to the derivation of the MSE's of the Bayes estimators, and second order correct approximations for the MSE's of the EB estimators. Section 4 is devoted to the second order correct estimators of the MSE's of the EB estimators. We have conducted a simulation study in Section 5 to illustrate our method.

3.2 Bayes and Empirical Bayes Estimators

Let $y_{ij} = (y_{ij1}, y_{ij2})^T$ $(j = 1, \ldots, n_i;\ i = 1, \ldots, k)$ denote the bivariate binary response for the $j$th unit in the $i$th small area. Thus each $y_{ijl}$ $(l = 1 \text{ or } 2)$ assumes the values 0 or 1. The joint probability function for $y_{ij}$ is given by

$$f(y_{ij}) = \frac{\exp(\phi_{1ij} y_{ij1} + \phi_{2ij} y_{ij2} + \phi_{3ij} y_{ij1} y_{ij2})}{1 + \exp(\phi_{1ij}) + \exp(\phi_{2ij}) + \exp(\phi_{1ij} + \phi_{2ij} + \phi_{3ij})}. \qquad (3\text{-}1)$$

With the reparametrization

$$p_{1ij} = \frac{\exp(\phi_{1ij})}{1 + \exp(\phi_{1ij}) + \exp(\phi_{2ij}) + \exp(\phi_{1ij} + \phi_{2ij} + \phi_{3ij})} = P(y_{ij} = (1, 0)),$$

To this end let $g_{1ij} = z_{1ij} - m_{1ij}$, $g_{2ij} = z_{2ij} - m_{2ij}$ and $g_{3ij} = z_{3ij} - m_{3ij}$. Then writing $z_{ij} = (z_{1ij}, z_{2ij}, z_{3ij})^T$, $\Sigma_{ij} = V(z_{ij})$ and

$$D_{ij} = -E\Big(\frac{\partial g_{ij}}{\partial b^T}\Big),$$

the optimal estimating equations are given by $\sum_i \sum_j D_{ij}^T \Sigma_{ij}^{-1} g_{ij} = 0$. Solving these equations, we find the estimators of $b_1$, $b_2$ and $b_3$. We first observe that, again from Eq.
3-9,

$$D_{ij} = \Sigma_{ij}(I_3 \otimes x_{ij}^T), \qquad \Sigma_{ij} = \mathrm{Diag}(m_{1ij}, m_{2ij}, m_{3ij}) - m_{ij} m_{ij}^T, \qquad (3\text{-}9)$$

where $\otimes$ denotes the Kronecker product. Hence, the estimating equations $\sum_i \sum_j D_{ij}^T \Sigma_{ij}^{-1} g_{ij} = 0$ can be rewritten as

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} (I_3 \otimes x_{ij})\, g_{ij} = 0, \qquad (3\text{-}10)$$

which further simplifies into

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}(z_{lij} - m_{lij}) = 0 \quad (l = 1, 2, 3). \qquad (3\text{-}11)$$

Solving Eq. 3-11, one estimates $b_1$, $b_2$ and $b_3$. The EB estimators of $p_{lij}$ $(l = 1, 2, 3)$ are now given by

$$\hat p^{EB}_{lij} = (z_{lij} + A\, m_{lij}(\hat b))/(A + 1), \qquad (3\text{-}12)$$

$j = 1, \ldots, n_i$, $i = 1, \ldots, k$, $l = 1, 2, 3$, where $\hat b = (\hat b_1^T, \hat b_2^T, \hat b_3^T)^T$. Then the EB estimators of $\theta_{1ij}$ and $\theta_{2ij}$ are given by

$$\hat\theta^{EB}_{1ij} = \hat p^{EB}_{1ij} + \hat p^{EB}_{3ij}, \qquad \hat\theta^{EB}_{2ij} = \hat p^{EB}_{2ij} + \hat p^{EB}_{3ij}. \qquad (3\text{-}13)$$

We write $\hat\theta^{EB}_{ij} = (\hat\theta^{EB}_{1ij}, \hat\theta^{EB}_{2ij})^T$.

APPENDIX A
PROOF OF THEOREM 2-3

We start with the identity

$$E(\hat\theta^{REB}_i - \theta_i)^2 = E(\hat\theta^{B}_i - \theta_i)^2 + E(\hat\theta^{REB}_i - \hat\theta^{B}_i)^2. \qquad (A\text{-}1)$$

Next we write

$$E(\hat\theta^{REB}_i - \hat\theta^{B}_i)^2 = E(\hat\theta^{EB}_i - \hat\theta^{B}_i)^2 + E(\hat\theta^{REB}_i - \hat\theta^{EB}_i)^2 + 2E(\hat\theta^{REB}_i - \hat\theta^{EB}_i)(\hat\theta^{EB}_i - \hat\theta^{B}_i). \qquad (A\text{-}2)$$

Now observe that $\hat\theta^{REB}_i - \hat\theta^{EB}_i$ involves truncated residual terms of the form $(y_i - x_i^T\hat b \mp K D_i)\, I[\pm(y_i - x_i^T\hat b) > K D_i]$, and we first show that such terms are $O_p(m^{-1})$. (A-3) To this end, we begin with the one-step Taylor expansion

$$\hat b \doteq b + (X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}(y - Xb) + \Big(\frac{\partial \hat b}{\partial A}\Big)(\hat A - A). \qquad (A\text{-}4)$$

There has been considerable work done in the context of disease mapping. A simple model assumes that the observed small area disease counts $y_i \stackrel{ind}{\sim} \text{Poisson}(n_i\theta_i)$ with $\theta_i \stackrel{iid}{\sim} \text{Gamma}(a, b)$, where $\theta_i$ is the true incidence rate and $n_i$ is the number exposed in the $i$th small area. Maiti (1998) assumed a log-normal distribution for the $\theta_i$'s. He also assumed a spatial dependence model for the log-transformed incidence rates. Some other works in this area include papers by Ghosh et al. (1999), Lahiri and Maiti (2002) and Nandram et al. (1999). Ghosh et al.
(1998) considered hierarchical Bayes (HB) generalized linear models (GLMs) for a unified analysis of both discrete and continuous data. The responses $y_{ij}$ are assumed to be conditionally independent with pdf

$$f(y_{ij} \mid \theta_{ij}) = \exp\{\phi_{ij}^{-1}(y_{ij}\theta_{ij} - b(\theta_{ij})) + c(y_{ij}, \phi_{ij})\}, \qquad (1\text{-}20)$$

$(j = 1, \ldots, n_i;\ i = 1, \ldots, m)$, where the $\phi_{ij}\,(> 0)$ are known. The canonical parameters are modelled as $h(\theta_{ij}) = x_{ij}^T b + u_i + e_{ij}$, where the link function $h(\cdot)$ is a strictly increasing function. Ghosh et al. used the hierarchical model in the following way:

$$y_{ij} \mid \theta, b, u, \sigma_u^2, \sigma_e^2 \stackrel{ind}{\sim} f;\quad h(\theta_{ij}) \mid b, u_i, \sigma_e^2 \stackrel{ind}{\sim} N(x_{ij}^T b + u_i, \sigma_e^2);\quad u_i \mid b, \sigma_u^2, \sigma_e^2 \stackrel{iid}{\sim} N(0, \sigma_u^2); \qquad (1\text{-}21)$$
$$b \sim \text{Uniform}(\mathbb{R}^k);\quad (\sigma_u^2)^{-1} \sim \text{Gamma}(a, b);\quad (\sigma_e^2)^{-1} \sim \text{Gamma}(c, d),$$

where $b$, $\sigma_u^2$, $\sigma_e^2$ are mutually independent and $(a, b, c, d)$ are fixed parameter values. MCMC methods were implemented to evaluate the posterior distribution of the $\theta_{ij}$'s or any functions of them. Ghosh and Maiti (2004) provided small-area estimates of proportions based on natural exponential family models with quadratic variance function (QVF) involving survey weights. In the paper, they considered area level covariates, i.e., the same covariates for all the units in the same area. They introduced the natural exponential family quadratic variance function model, which has the same set up as Eq. 1-20.

Then $\int_{\mathbb{R}^8} L(\gamma \mid D)\, d\gamma < \infty$, and hence

$$\int L(\alpha_2, \alpha_3, \alpha_6 \mid D_{obs})\, d\alpha_2\, d\alpha_3\, d\alpha_6 < \infty.$$

But $\int_{\mathbb{R}^3} \prod_{i,j} f(x_{ij6} \mid \alpha_6)\, d\alpha_6 < \infty$ under conditions (D1) and (D2) by Theorem 2.1 of Chen and Shao (2000). Hence the theorem follows.
Next we calculate $E(\hat B_i - B_i)$. Writing $B_i = V_i/(V_i + A)$, a Taylor expansion of $\hat B_i$ about $A$ gives

$$\hat B_i - B_i = -\frac{V_i(\hat A - A)}{(V_i + A)^2} + \frac{V_i(\hat A - A)^2}{(V_i + A)^3} + o_p(m^{-1}),$$

and analogous expansions, correct up to $o_p(m^{-1})$, hold for the remaining terms involving $(\hat A - A)$, $(\hat A - A)^2$ and $\big(\sum_{j=1}^{m}(V_j + A)^{-1} x_j x_j^T\big)^{-1}$ that appear in the decomposition of the MSE.

CHAPTER 4
HIERARCHICAL BAYES ESTIMATION FOR BIVARIATE BINARY DATA WITH APPLICATIONS TO SMALL AREA ESTIMATION

4.1 Introduction

In Chapter 3, we considered empirical Bayes estimation of the small area parameters for bivariate binary data. In this chapter we have implemented the hierarchical Bayes procedure for bivariate binary data when there is missingness present in the covariates. For our purpose, we have assumed the missingness to be missing at random (MAR) (Little and Rubin, 2002). We have incorporated ideas from Lipsitz and Ibrahim (1996) and Ibrahim, Lipsitz, and Chen (1999) to treat the missing covariates. The general HB models which accommodate this missingness are discussed in Section 2. We have, after suitable reparametrization, converted the bivariate binary likelihood into a multinomial likelihood, and subsequently have utilized the multinomial-Poisson relationship, leading eventually to a Poisson reparameterization. Bayesian analogues of log-linear models are used to estimate the Poisson parameters. We have applied the HB methodology to obtain the small area estimates, the associated standard errors and posterior correlations.
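The multinomial-Poisson relationship mentioned above can be demonstrated numerically. The sketch below (cell counts are made up for illustration) shows that treating the four bivariate-binary cell counts as independent Poissons and normalizing the fitted means recovers exactly the multinomial maximum likelihood estimates of the cell probabilities, which is what licenses the Poisson log-linear reparameterization.

```python
# Counts for the four bivariate-binary cells (1,0), (0,1), (1,1), (0,0);
# the values are illustrative only.
counts = [12, 30, 8, 50]
total = sum(counts)

# Multinomial MLE of the cell probabilities: observed proportions.
p_multinomial = [c / total for c in counts]

# Poisson working model: independent Poisson(lambda_c) per cell.
# The Poisson MLE is lambda_c = count_c; normalizing the fitted means
# recovers the multinomial MLE -- the multinomial-Poisson relationship.
lam = [float(c) for c in counts]
p_poisson = [l / sum(lam) for l in lam]

assert all(abs(a - b) < 1e-12 for a, b in zip(p_multinomial, p_poisson))
```

The same equivalence holds with covariates: a Poisson log-linear model with a free intercept per multinomial stratum yields the same cell-probability estimates as the corresponding multinomial logit model, which is why Bayesian log-linear machinery can be applied here.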
Due to the analytical intractability of the posterior, the HB procedure is implemented via Markov chain Monte Carlo (MCMC) integration techniques, in particular, the Gibbs sampler. In Section 3, the methodology is illustrated with real data related to low birthweight and infant mortality.

4.2 Hierarchical Bayes Method

Let $y_{ij} = (y_{ij1}, y_{ij2})^T$ $(j = 1, \ldots, n_i;\ i = 1, \ldots, k)$ denote the bivariate binary response for the $j$th unit in the $i$th small area. Thus each $y_{ijl}$ $(l = 1 \text{ or } 2)$ assumes the values 0 or 1. The joint probability function for $y_{ij}$ is given by

$$f(y_{ij}) = \frac{\exp(\phi_{1ij} y_{ij1} + \phi_{2ij} y_{ij2} + \phi_{3ij} y_{ij1} y_{ij2})}{1 + \exp(\phi_{1ij}) + \exp(\phi_{2ij}) + \exp(\phi_{1ij} + \phi_{2ij} + \phi_{3ij})}. \qquad (4\text{-}1)$$

3.4 Estimation of the MSE

In this section, we find an approximation of the expression given in the right hand side of Eq. 3-33 which is second order correct, i.e., the first neglected term is $o(k^{-1})$. We may note that the second order term in the right hand side of Eq. 3-33 is $O(k^{-1})$. Hence, substitution of $\hat b$ for $b$ estimates the same correctly up to $O(k^{-1})$. However, estimation of MSE$(\hat\theta^{B}_{ij})$ with the simple substitution of $\hat b$ for $b$ is not adequate, since the first neglected term is $O(k^{-1})$. To this end, we begin with the evaluation of $E[m_{1ij}(\hat b)(1 - m_{1ij}(\hat b))]$. By a two-step Taylor expansion,

$$m_{1ij}(\hat b) \doteq m_{1ij}(b) + \Big(\frac{\partial m_{1ij}(b)}{\partial b}\Big)^T(\hat b - b) + \frac{1}{2}(\hat b - b)^T\,\frac{\partial^2 m_{1ij}(b)}{\partial b\,\partial b^T}\,(\hat b - b). \qquad (3\text{-}34)$$

Hence,

$$E[m_{1ij}(\hat b)(1 - m_{1ij}(\hat b))] \doteq m_{1ij}(b)[1 - m_{1ij}(b)] + (1 - 2m_{1ij}(b))\Big(\frac{\partial m_{1ij}(b)}{\partial b}\Big)^T E(\hat b - b) + \mathrm{tr}\Big[\Big\{(1 - 2m_{1ij}(b))\frac{1}{2}\frac{\partial^2 m_{1ij}(b)}{\partial b\,\partial b^T} - \Big(\frac{\partial m_{1ij}(b)}{\partial b}\Big)\Big(\frac{\partial m_{1ij}(b)}{\partial b}\Big)^T\Big\}\, E(\hat b - b)(\hat b - b)^T\Big]. \qquad (3\text{-}35)$$

From Eq.
B-3,

$$\pi(b, \alpha, v, x_{mis} \mid D_{obs}) \le C \prod_{l=1}^{4}\Big[\prod_{(i,j)\in I_{11}} f_1(\tilde u_{ij}^{(l)T} r_l) \prod_{(i,j)\in I_{21}} f_2(\tilde u_{ij}^{(l)T} r_l)\, f_3(\tilde u_{ij}^{(l)T} r_l)\Big]\, \frac{1}{(c + v^T v)^{(d+k)/2}} \prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij,obs}, x_{ij,mis} \mid \alpha).$$

Then, summing over $x_{mis}$,

$$\pi(b, \alpha, v \mid D_{obs}) \le C \prod_{l=1}^{4}\Big[\prod_{(i,j)\in I_{11}} f_1(\tilde u_{ij}^{(l)T} r_l) \prod_{(i,j)\in I_{21}} f_2(\tilde u_{ij}^{(l)T} r_l)\, f_3(\tilde u_{ij}^{(l)T} r_l)\Big]\, \frac{1}{(c + v^T v)^{(d+k)/2}} \prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij,obs} \mid \alpha).$$

Using the fact that each $f_l$ is bounded and monotone, the integrals over the $b_l$'s can be bounded through Stieltjes integrals with respect to $d(-f_l(u))$ for $l = 1, 2$ and $d(f_3(u))$ (Eqs. B-4 and B-5).

CHAPTER 5
SUMMARY AND FUTURE RESEARCH

In this dissertation, we have developed some new robust Bayes and empirical Bayes estimation procedures in the context of small area estimation under a normal model. It turns out that the proposed EB estimators perform quite well in comparison with the regular EB estimators under a contaminated normal model, both in terms of the bias and the mean squared error, especially for the extreme small area parameters. A closed form expression of the MSE cannot be obtained under this complex situation. We have developed second-order correct MSE estimators for the robust estimators. The other part of this dissertation is small area estimation when the response is bivariate binary in the presence of covariates. We have first developed an EB estimation procedure followed by a hierarchical Bayes procedure. In the latter case, we have covariates which are missing at random. We have shown that both methods perform much better than the direct estimators, through a simulation study in the former case and a real data analysis in the latter. In this dissertation, we have considered only area level small area estimation models for the case of robust estimation of parameters. It is worthwhile to extend the proposed approach to unit level small area estimation models as considered, for example, in Battese et al.
(1988). A more challenging task is to extend these ideas to non-normal models, especially for some of the discrete distributions involving binary, count or categorical data. The posterior of the regression coefficient in such instances does not come out in a closed form. A more manageable alternative seems to involve a prediction problem where one decides to look at the predictive distribution of a future observation, and use the same to derive the influence of a given observation. A natural extension of the first part of this dissertation would be to develop similar robust procedures in the case of general linear mixed models. As noted in the introduction, the normal model considered in the dissertation, and also the other

$$p_{2ij} = \frac{\exp(\phi_{2ij})}{1 + \exp(\phi_{1ij}) + \exp(\phi_{2ij}) + \exp(\phi_{1ij} + \phi_{2ij} + \phi_{3ij})} = P(y_{ij} = (0, 1))$$

and

$$p_{3ij} = \frac{\exp(\phi_{1ij} + \phi_{2ij} + \phi_{3ij})}{1 + \exp(\phi_{1ij}) + \exp(\phi_{2ij}) + \exp(\phi_{1ij} + \phi_{2ij} + \phi_{3ij})} = P(y_{ij} = (1, 1)),$$

we can rewrite the model given in Eq. 3-1 as

$$P(y_{ij1}, y_{ij2}) = p_{1ij}^{y_{ij1}(1 - y_{ij2})}\, p_{2ij}^{y_{ij2}(1 - y_{ij1})}\, p_{3ij}^{y_{ij1} y_{ij2}}\, p_{4ij}^{(1 - y_{ij1})(1 - y_{ij2})}, \qquad (3\text{-}2)$$

where $p_{4ij} = 1 - p_{1ij} - p_{2ij} - p_{3ij}$. Let $\theta_{1ij} = p_{1ij} + p_{3ij}$ and $\theta_{2ij} = p_{2ij} + p_{3ij}$. Our primary interest is the simultaneous estimation of the small area means $(\theta_{1i}, \theta_{2i})^T$, where $\theta_{1i} = \sum_{j=1}^{n_i} w_{ij}\theta_{1ij}$ and $\theta_{2i} = \sum_{j=1}^{n_i} w_{ij}\theta_{2ij}$, with $\sum_{j=1}^{n_i} w_{ij} = 1$ for each $i$. Here the weights $w_{ij}$ are assumed to be known, based on some given sampling scheme. We begin with the likelihood

$$L = \prod_{i=1}^{k}\prod_{j=1}^{n_i} P(y_{ij1}, y_{ij2}). \qquad (3\text{-}3)$$

Writing $p_{ij}^T = (p_{1ij}, p_{2ij}, p_{3ij}, p_{4ij})$, the conjugate Dirichlet prior is given by

$$\pi(p) \propto \prod_{i=1}^{k}\prod_{j=1}^{n_i} p_{1ij}^{A m_{1ij} - 1}\, p_{2ij}^{A m_{2ij} - 1}\, p_{3ij}^{A m_{3ij} - 1}\, p_{4ij}^{A m_{4ij} - 1}, \qquad (3\text{-}4)$$

where $A$ is the dispersion parameter, $m_{4ij} = 1 - \sum_{l=1}^{3} m_{lij}$, and the $m_{lij}$ are functions of the covariates $x_{ij}$ associated with the binary vector $y_{ij}$. In particular, we take

$$m_{lij} = \exp(x_{ij}^T b_l)\big/\big[1 + \exp(x_{ij}^T b_1) + \exp(x_{ij}^T b_2) + \exp(x_{ij}^T b_3)\big] \quad (l = 1, 2, 3), \qquad (3\text{-}5)$$

where $b_1$, $b_2$, $b_3$ are the regression coefficients. Writing $z_{1ij} = y_{ij1}(1 - y_{ij2})$, $z_{2ij} = y_{ij2}(1 - y_{ij1})$, $z_{3ij} = y_{ij1} y_{ij2}$ and $z_{4ij} = 1 - \sum_{l=1}^{3} z_{lij}$, the posterior for $p$ is now given by

$$\pi(p \mid y) \propto \prod_{i=1}^{k}\prod_{j=1}^{n_i} p_{1ij}^{z_{1ij} + A m_{1ij} - 1}\, p_{2ij}^{z_{2ij} + A m_{2ij} - 1}\, p_{3ij}^{z_{3ij} + A m_{3ij} - 1}\, p_{4ij}^{z_{4ij} + A m_{4ij} - 1}. \qquad (3\text{-}6)$$

Table 4-3.
Posterior variance, correlation and 95% HPDs along with predictive p-values

area SE1 SE2 CORR I1 I2 p-value
1 0.001 0.002 0.115 0.005 0.007 0.076 0.085 0.536
2 0.001 0.002 0.151 0.005 0.008 0.077 0.087 0.544
3 0.002 0.006 0.181 0.007 0.015 0.097 0.119 0.556
4 0.001 0.002 0.124 0.004 0.007 0.071 0.080 0.520
5 0.001 0.002 0.121 0.005 0.007 0.072 0.081 0.572
6 0.001 0.002 0.099 0.005 0.007 0.077 0.086 0.552
7 0.002 0.005 0.153 0.006 0.013 0.092 0.112 0.496
8 0.009 0.130 0.149 0.008 0.102 0.005 0.073 0.548
9 0.001 0.002 0.168 0.004 0.007 0.069 0.078 0.536
10 0.001 0.003 0.106 0.006 0.009 0.078 0.088 0.588
11 0.001 0.002 0.170 0.004 0.006 0.071 0.080 0.544
12 0.001 0.003 0.170 0.006 0.009 0.077 0.087 0.524
13 0.001 0.004 0.152 0.006 0.011 0.084 0.099 0.552
14 0.001 0.002 0.151 0.005 0.007 0.072 0.081 0.536
15 0.001 0.003 0.144 0.005 0.009 0.078 0.089 0.540
16 0.002 0.005 0.178 0.007 0.013 0.098 0.117 0.496
17 0.001 0.002 0.162 0.006 0.009 0.078 0.088 0.472
18 0.001 0.003 0.105 0.005 0.008 0.075 0.085 0.596
19 0.001 0.002 0.154 0.004 0.007 0.074 0.083 0.508
20 0.001 0.002 0.174 0.006 0.009 0.078 0.087 0.476
21 0.001 0.004 0.126 0.004 0.009 0.071 0.087 0.556
22 0.001 0.002 0.142 0.004 0.007 0.072 0.081 0.524
23 0.001 0.002 0.097 0.005 0.008 0.075 0.084 0.612
24 0.001 0.002 0.181 0.004 0.007 0.073 0.083 0.540
25 0.001 0.003 0.072 0.006 0.009 0.079 0.089 0.576
26 0.001 0.004 0.104 0.002 0.005 0.064 0.079 0.596
27 0.001 0.002 0.102 0.004 0.006 0.069 0.078 0.604
28 0.001 0.002 0.157 0.004 0.007 0.073 0.083 0.560
29 0.001 0.002 0.132 0.005 0.008 0.076 0.085 0.604
30 0.001 0.003 0.106 0.004 0.006 0.067 0.077 0.492
31 0.001 0.003 0.201 0.005 0.008 0.081 0.091 0.604
32 0.001 0.003 0.172 0.004 0.007 0.067 0.078 0.540
33 0.001 0.002 0.148 0.004 0.007 0.064 0.074 0.496
34 0.001 0.003 0.106 0.005 0.008 0.081 0.091 0.492
35 0.002 0.004 0.163 0.007 0.012 0.085 0.101 0.588
36 0.001 0.003 0.080 0.004 0.006 0.068 0.079 0.544
37 0.001 0.003 0.191 0.006 0.011 0.079 0.092 0.544
38 0.001 0.002 0.088 0.005 0.007 0.077 0.086 0.532
39 0.001 0.003 0.146 0.006 0.009 0.079 0.089 0.556
40 0.001 0.004 0.140 0.004 0.008 0.075 0.089 0.536
41 0.001 0.004 0.152 0.005 0.010 0.078 0.093 0.536
42 0.001 0.003 0.102 0.006 0.009 0.084 0.094 0.488
43 0.001 0.003 0.156 0.006 0.010 0.083 0.094 0.536
44 0.001 0.004 0.052 0.003 0.007 0.069 0.084 0.544
45 0.001 0.002 0.105 0.004 0.006 0.067 0.076 0.560
46 0.001 0.003 0.179 0.006 0.010 0.079 0.091 0.552
47 0.001 0.003 0.164 0.006 0.009 0.081 0.092 0.508
48 0.001 0.003 0.241 0.006 0.010 0.079 0.091 0.564

Lemma 2-3. Assume condition (i) of Theorem 2-3. Then

$$E(\hat B_i - B_i) = B_i(1 - B_i)\Big\{\sum_{j=1}^{m}(1 - B_j)^2\Big\}^{-1}\Big[2(1 - B_i) - \mathrm{tr}\Big\{\Big(\sum_{j=1}^{m}(1 - B_j)x_j x_j^T\Big)^{-1}\Big(\sum_{j=1}^{m}(1 - B_j)^2 x_j x_j^T\Big)\Big\}\Big] + o(m^{-1}). \qquad (2\text{-}24)$$

Proof: By Taylor expansion and Lemma 1,

$$E(\hat B_i - B_i) = V_i(V_i + A)^{-2}\, E\Big[-(\hat A - A) + \frac{(\hat A - A)^2}{V_i + A}\Big] + o(m^{-1}).$$

Now by Lemma 2, and the fact that $E(\hat A - A)^2 = 2(\mathrm{tr}\,\Sigma^{-2})^{-1} + o(m^{-1})$, we get

$$E(\hat B_i - B_i) = B_i(1 - B_i)(\mathrm{tr}\,\Sigma^{-2})^{-1}\big[2(V_i + A)^{-2} - (V_i + A)^{-1}\,\mathrm{tr}\{(X^T\Sigma^{-1}X)^{-1}(X^T\Sigma^{-2}X)\}\big] + o(m^{-1}),$$

which equals the right-hand side of Eq. 2-24. This proves the lemma.

Remark 3. It may be noted that the leading term in $E(\hat A - A)$ is $O(m^{-1})$, as is $g_{5i}$. The lemmas provide approximations to $E(\hat A - A)$ and $E(\hat B_i - B_i)$, both correct up to $O(m^{-1})$. Thus, by Lemmas 2 and 3 we get the following theorem, which gives us the estimate of the MSE correct up to the second order.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

EMPIRICAL AND HIERARCHICAL BAYESIAN METHODS WITH APPLICATIONS TO SMALL AREA ESTIMATION

By Ananya Roy
August 2007
Chair: Malay Ghosh
Major: Statistics

This dissertation focuses on the formulation of empirical and hierarchical Bayesian techniques in the context of small area estimation.
In the first part of the dissertation, we consider robust Bayes and empirical Bayes (EB) procedures for the estimation of small area means. We have introduced the notion of influence functions in the small area context, and have developed robust Bayes and EB estimators based on such influence functions. We have derived an expression for the predictive influence function, and based on a standardized version of the same, we have proposed some new small area estimators. The mean squared errors and estimated mean squared errors of these estimators have also been found. The findings are validated by a simulation study. In the second part, we have considered small area estimation for bivariate binary data in the presence of covariates. We have developed EB estimators along with their mean squared errors. In this case the covariates were assumed to be completely observed. In the presence of missing covariates, we have developed a hierarchical Bayes approach for bivariate binary responses. Under the assumption that the missing mechanism is missing at random (MAR), we have suggested methods to estimate the small area means along with the associated standard errors. Our findings have been supported by appropriate data analyses.

the best linear unbiased estimator of the population total T is given by T̂_LS = T₁ + β̂_LS Σ_{i=n+1}^N x_i, where T₁ is the total of all the n sampled units, Σ_{i=n+1}^N x_i is the sum of the x_i's corresponding to the unsampled population values, and β̂_LS = {Σ_{i=1}^n y_i x_i/v(x_i)}{Σ_{i=1}^n x_i²/v(x_i)}⁻¹ is the standard least squares estimate. To make T̂_LS robust to outliers, Chambers (1986) proposed the estimator

T̂_ψ = T₁ + b_n Σ_{i=n+1}^N x_i + Σ_{i=1}^n u_i ψ{(y_i − b_n x_i)/σ_i},

where u_i = σ_i x_i (Σ_{j=n+1}^N x_j)(Σ_{j=1}^n x_j²)⁻¹. Here, b_n is some outlier-robust estimator of β and ψ is a real-valued function. In his paper, Chambers discussed the different choices of ψ, and primarily the bias and variance robustness of T̂_ψ under a gross error model.
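A minimal sketch of this style of estimator is given below. It is not Chambers's exact construction: the variance function v(x) = x, the Huber choice of ψ, the IRLS slope estimate b_n, and the weights u_i are all illustrative assumptions.

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Huber's psi: a bounded, real-valued choice for the function psi."""
    return np.clip(r, -c, c)

def robust_total(y_s, x_s, x_r, c=1.345, n_iter=50):
    """Outlier-robust estimate of a population total in the spirit of
    Chambers (1986): sampled total + robust-slope prediction for the
    unsampled units + a bounded psi-correction term.
    Assumes v(x) = x (ratio-type variance); u_i weights are illustrative.
    y_s, x_s: sampled responses/covariates; x_r: non-sampled covariates."""
    v = np.sqrt(x_s)                      # sigma_i = sqrt(v(x_i)) with v(x) = x
    b = y_s.sum() / x_s.sum()             # weighted-LS start (ratio estimator)
    for _ in range(n_iter):               # IRLS for the Huber M-estimate of the slope
        r = (y_s - b * x_s) / v
        safe = np.where(np.abs(r) < 1e-12, 1.0, r)
        w = np.where(np.abs(r) < 1e-12, 1.0, huber_psi(r, c) / safe)
        b = np.sum(w * y_s * x_s / v**2) / np.sum(w * x_s**2 / v**2)
    u = v * x_r.sum() / x_s.sum()         # illustrative weights for the psi term
    correction = np.sum(u * huber_psi((y_s - b * x_s) / v, c))
    return y_s.sum() + b * x_r.sum() + correction
```

With a clean sample the correction term is near zero and the estimate is close to T̂_LS; gross outliers in y_s enter only through the bounded ψ, which is the point of the construction.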
Based on the asymptotic theory given in the paper, it appears that variance robustness of T̂_ψ can only be achieved by choosing ψ such that the estimator is no longer bias robust. Zaslavsky et al. (2001) used robust estimation techniques to downweight the effect of clusters, with an application to the 1990 post-enumeration survey. In the context of SAE, Datta and Lahiri (1995) developed a robust HB method which is particularly well-suited when there are one or more outliers in the dataset. They used a class of scale mixtures of normal distributions on the random effects in the basic area model; a Cauchy distribution was used for the outliers. We have tried to outline some of the major works significant in the advances of SAE. Review papers on SAE include Rao (1986), Chaudhuri (1994), Ghosh and Rao (1994), Rao (1999) and Pfeffermann (2002). There is a book on SAE by Rao (2003).

and this approximation is correct up to O(k⁻¹). The expression for ∂m_{1ij}(b)/∂b is given in Eq. 3-29. Based on that expression, one gets

∂²m_{1ij}/∂b∂b^T =
[ m_{1ij}(1 − m_{1ij})(1 − 2m_{1ij})   −m_{1ij}m_{2ij}(1 − 2m_{1ij})   −m_{1ij}m_{3ij}(1 − 2m_{1ij}) ]
[ −m_{1ij}m_{2ij}(1 − 2m_{1ij})       −m_{1ij}m_{2ij}(1 − 2m_{2ij})   2m_{1ij}m_{2ij}m_{3ij}        ]  ⊗ x_{ij}x_{ij}^T. (3-36)
[ −m_{1ij}m_{3ij}(1 − 2m_{1ij})       2m_{1ij}m_{2ij}m_{3ij}          −m_{1ij}m_{3ij}(1 − 2m_{3ij}) ]

We need to find E(b̂ − b). We follow Cox and Snell (1968) to find this expectation. To this end, we first note that ∂S_k/∂b, where S_k is built from the residuals (z_{1ij} − τ_{1ij}, z_{2ij} − τ_{2ij}, z_{3ij} − τ_{3ij}), is a constant. Hence, since E(S_k) = 0, the expectation of the product of any element of S_k with any element of ∂S_k/∂b is zero. Thus, denoting J = (J_{lu,rs,mn}) = (E(S_{k,lu} ∂S_{k,rs}/∂b_{mn})), we have J_{lu,rs,mn} = 0 for all l, r, m = 1, 2, 3 and u, s, n = 1, ..., p.
Accordingly, from Cox and Snell (1968), with

(D_{ij}(b))_{p,p} = m_{pij}(b)[1 − m_{pij}(b)](1 − 2m_{pij}(b)) (∂m_{pij}(b)/∂b)|_{b=b̂}^T tr(U_k⁻¹(b̂)K_{3p}(b̂))
 − tr[{(1 − 2m_{pij}(b)) ∂²m_{pij}(b)/∂b∂b^T + (∂m_{pij}(b)/∂b)(∂m_{pij}(b)/∂b)^T}|_{b=b̂} U_k⁻¹(b̂)], for p = 1, 2, 3,

and

(D_{ij}(b))_{p,q} = [m_{pij}(b) ∂m_{qij}(b)/∂b + m_{qij}(b) ∂m_{pij}(b)/∂b]|_{b=b̂}^T tr(U_k⁻¹(b̂)K_{3p}(b̂))
 − tr[{m_{pij}(b) ∂²m_{qij}(b)/∂b∂b^T + m_{qij}(b) ∂²m_{pij}(b)/∂b∂b^T + (∂m_{pij}(b)/∂b)(∂m_{qij}(b)/∂b)^T + (∂m_{qij}(b)/∂b)(∂m_{pij}(b)/∂b)^T}|_{b=b̂} U_k⁻¹(b̂)], for p ≠ q, p, q = 1, 2, 3.

Remark. As mentioned in Section 2, direct estimation of the dispersion parameter A is impossible due to its nonidentifiability. It may appear that an adaptive estimator of A can be obtained by minimizing some suitable real-valued function of the approximate estimated MSE matrix with respect to A. For example, we can try to minimize the trace of the estimated MSE matrix. Writing the trace of the matrix given in Eq. 3-43 as g(A), we have

g(A) = Σ_{i=1}^k [A(A + 1)⁻² tr(C_i(b̂)) + A²(1 + A)⁻² tr(B_i(b̂))].

Obviously, g(A) attains its minimum 0 at A = 0. But A = 0 gives only the direct estimator, which we want to avoid for small area estimation. Also, solving g′(A) = 0 subject to the

1.3 Analysis of Discrete Data in SAE

The models discussed in the earlier section have mainly been for continuous data. We devote this section to discussing some models for discrete measurements. Some of the recent work in SAE has focused on the analysis of categorical data. MacGibbon and Tomberlin (1989) considered the model where y_ij ~ Bin(1, p_ij) and

logit(p_ij) = x_ij^T b + u_i, u_i ~ N(0, σ_u²). (1-18)

Here the y_ij's are conditionally independent. The independent random effects are denoted by the u_i's. The authors assume a diffuse prior for b, with σ_u² assumed to be known. The joint posterior distribution of b and {u_i, i = 1, ..., m} is approximated by the multivariate normal distribution with mean equal to the true posterior mode and covariance matrix equal to the inverse information matrix evaluated at the mode.
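The posterior-mode step of this approach can be sketched as follows. The model is that of Eq. 1-18, but everything else is an illustrative assumption: σ_u² is treated as known, a generic quasi-Newton optimizer stands in for whatever the authors used, and a full implementation would also extract the curvature at the mode for the normal approximation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def posterior_mode(y, X, area, sigma_u2):
    """Joint posterior mode of (b, u_1, ..., u_m) for the logistic mixed model
    logit(p_ij) = x_ij' b + u_i, u_i ~ N(0, sigma_u2), with a flat prior on b.
    Maximizes the Bernoulli log-likelihood minus the normal penalty on the
    u_i's (illustrative sketch; sigma_u2 is treated as known)."""
    p, m = X.shape[1], area.max() + 1
    def neg_log_post(theta):
        b, u = theta[:p], theta[p:]
        eta = X @ b + u[area]
        loglik = np.sum(y * eta - np.logaddexp(0.0, eta))  # stable log(1 + e^eta)
        return -(loglik - 0.5 * np.sum(u**2) / sigma_u2)
    res = minimize(neg_log_post, np.zeros(p + m), method="BFGS")
    b_hat, u_hat = res.x[:p], res.x[p:]
    return b_hat, u_hat, expit(X @ b_hat + u_hat[area])   # fitted p_ij at the mode
```

The fitted probabilities returned in the last line are exactly the p̂_ij of the next display, evaluated at the posterior modes b̂ and û_i.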
The estimator of p_ij is given by

p̂_ij = [1 + exp{−(x_ij^T b̂ + û_i)}]⁻¹, (1-19)

where b̂ and û_i are the respective posterior modes. The area proportions are then estimated by p̂_i = Σ_j p̂_ij/N_i for small sampling fractions n_i/N_i. For σ_u² unknown, the MLE of σ_u² is plugged in, yielding an EB estimator. A Taylor expansion is used to obtain a naive estimator of the variance of the EB estimator. But the drawback of this estimator is that it ignores the variance component resulting from the estimation of σ_u². Farrell, MacGibbon, and Tomberlin (1997) showed that the variance estimator could be improved upon by the use of a parametric bootstrap procedure. Some other papers relating to the analysis of binary outcomes are by Jiang and Lahiri (2001), who propose a method to find the best predictor (BP) of logit(p_ij) and use the method of moments to obtain sample estimates of b and the u_i's as given in Eq. 1-18. Malec et al. (1997) also consider a logistic linear mixed model for Bernoulli outcomes and use an HB approach to derive model dependent predictors.

Accordingly, the Fisher information matrix for (b, A) is given by

I(b, A) = [ X^TΣ⁻¹X    0
            0          (1/2)Σ_{j=1}^m (V_j + A)⁻² ]. (2-18)

Due to the asymptotic independence of b̂ and Â, Â is asymptotically N(A, 2{Σ_{j=1}^m (V_j + A)⁻²}⁻¹). We now have the following simple but important lemma.

Lemma 2-1. Assume 0 < V_* ≤ min_{1≤i≤m} V_i ≤ max_{1≤i≤m} V_i ≤ V* < ∞. Then
E|Â − A| = O_e(m^{-1/2}) and E|B̂_i − B_i| = O_e(m^{-1/2}), 1 ≤ i ≤ m. Here O_e refers to the exact order, which means that the expectation is bounded both below and above by terms such as C₁m^{-r/2} and C₂m^{-r/2} for large m. This is in contrast to the order O, which only requires the upper bound. A rigorous proof of the lemma requires a uniform integrability argument, which we omit. However, intuitively, E[|Â − A|{Σ_{j=1}^m (V_j + A)⁻²}^{1/2}] = O_e(1). But m(V* + A)⁻² ≤ Σ_{j=1}^m (V_j + A)⁻² ≤ m(V_* + A)⁻². Accordingly, Σ_{j=1}^m (V_j + A)⁻² = O_e(m). This leads to E|Â − A| = O_e(m^{-1/2}). Next we write |B̂_i − B_i| = V_i|Â − A|/[(V_i + A)(V_i + Â)] ≤ |Â − A|/(V_i + Â) for all i = 1, ..., m, which leads to the desired conclusion. Also, we conclude from the lemma that

P(|Â − A| > ε) = O(m^{-r}), P(max_{1≤i≤m}|B̂_i − B_i| > c) = O(m^{-r}), (2-19)

for any arbitrary c > 0 and arbitrarily large r (> 0), which we require in the sequel. We may point out that the assumption regarding the boundedness of the V_i's is required in any asymptotic Bayes risk (or MSE) calculation in this context, as evident, for example, from Prasad and Rao (1990). We are now in a position to find an expression for E(θ̂_i^EB − θ_i)² which is second order correct, i.e., the bias is of the order o(m⁻¹). The following theorem is proved.

Theorem 2-3. Assume (i) 0 < V_* ≤ min_{1≤i≤m} V_i ≤ max_{1≤i≤m} V_i ≤ V* < ∞;
CHAPTER 2
INFLUENCE FUNCTIONS AND ROBUST BAYES AND EMPIRICAL BAYES SMALL AREA ESTIMATION

2.1 Introduction

As noted in Chapter 1, empirical and hierarchical Bayesian methods are widely used for small area estimation. One of the key features of such estimation procedures is that the direct survey estimator for a small area is typically shrunk towards some synthetic or regression estimator, thereby borrowing strength from similar neighboring areas. Shrinkage estimators of the above type are quite reasonable if the assumed prior is approximately true. There may be over- or under-shrinking towards the regression estimator if the assumed prior is not true. Indeed, a standard criticism against any subjective Bayesian method is that it can perform very poorly in the event of significant departure from the assumed prior (see, e.g., Efron and Morris (1971), Fay and Herriot (1979)). Efron and Morris (1971, 1972) remarked that Bayes and empirical Bayes estimators might perform well overall but poorly on individual components. For example, suppose we assume the model where

y_i | θ_i ~ N(θ_i, 1) and θ_i ~ N(μ, A), i = 1, ..., m. (2-1)

But suppose the assumed prior on the θ_i's is actually a mixture of several distributions, one of which is N(μ₁, A₁), A₁ < A. In that case, Efron and Morris (1971) have shown that the Bayes risk for the components which belong to that subpopulation can be very high. Hence the shrinkage estimators, which may be good for most of the components, can be singularly inappropriate for a particular component θ_i. In general, if there is wide departure of the true prior from the assumed prior, the Bayes risk of the Bayes estimator can be quite high. Moreover, there are practical problems associated with over- and under-shrinking in small area estimation problems, especially with widely varying sample sizes.
If this happens, the resulting large area aggregate estimates may differ widely from the direct estimates, which

Thus,

B̂_i(D̂_i − D_i) + D_i(B̂_i − B_i) = −(B_iD_i/2)(Â − A)/(V_i + A) + (3B_iD_i/8)(Â − A)²/(V_i + A)² + O_p(m^{-3/2}),

so that, since (Â − A)² is asymptotically independent of the event [|y_i − x_i^T b̂| > KD̂_i], whose probability tends to 2Φ(−K),

E[{B̂_i(D̂_i − D_i) + D_i(B̂_i − B_i)}² I_{[|y_i − x_i^T b̂| > KD̂_i]}] = (B_i²D_i²/4)(V_i + A)⁻² E[(Â − A)² I_{[|y_i − x_i^T b̂| > KD̂_i]}] + o(m⁻¹) (A-19)
 = B_i²D_i²(1 − B_i)²{Σ_{j=1}^m (1 − B_j)²}⁻¹ Φ(−K) + o(m⁻¹). (A-20)

Combining Eq. A-18 and Eq. A-20, it follows that

E(θ̂_i^REB − θ̃_i^REB)² = B_i²D_i²(1 − B_i)²{Σ_{j=1}^m (1 − B_j)²}⁻¹[2 + 4Kφ(K) − (K² + 4)Φ(−K)] + o(m⁻¹). (A-21)

Finally, by Eq. 2-13 and Eq. A-19 we get

θ̃_i^REB − θ̂_i^REB = −B̂_i[x_i^T(b̂ − b) + (y_i − x_i^T b̂ − KD̂_i)I_{[y_i − x_i^T b̂ > KD̂_i]} + (y_i − x_i^T b̂ + KD̂_i)I_{[y_i − x_i^T b̂ < −KD̂_i]}] + (B̂_i − B_i)(y_i − x_i^T b)I_{[|y_i − x_i^T b̂| ≤ KD̂_i]} + O_p(m⁻¹). (A-22)

Following the same transformations as in Chapter 3, we arrive at the model

P(y_ij1, y_ij2 | p_ij) = p_ij1^{y_ij1(1−y_ij2)} p_ij2^{y_ij2(1−y_ij1)} p_ij3^{y_ij1 y_ij2} p_ij4^{(1−y_ij1)(1−y_ij2)}, (4-2)

where p_ij4 = 1 − Σ_{l=1}^3 p_ijl. Then writing z_ij1 = y_ij1(1 − y_ij2), z_ij2 = y_ij2(1 − y_ij1), z_ij3 = y_ij1 y_ij2 and z_ij4 = (1 − y_ij1)(1 − y_ij2), we get

P(y_ij | p_ij) = p_ij1^{z_ij1} p_ij2^{z_ij2} p_ij3^{z_ij3} p_ij4^{z_ij4}. (4-3)

Next we write p_ijl = ζ_ijl/Σ_{q=1}^4 ζ_ijq and use the well-known relationship between the multinomial and Poisson distributions, namely, (z_ij1, z_ij2, z_ij3, z_ij4) has the same distribution as the joint conditional distribution of (u_ij1, u_ij2, u_ij3, u_ij4) given Σ_{q=1}^4 u_ijq = 1, where the u_ijq are independent Poisson(ζ_ijq) (q = 1, 2, 3, 4). We begin by modelling the ζ_ijl as

log(ζ_ijl) = x_ij^T b_l + v_i, (4-4)

where x_ij = (x_ij1, ..., x_ijq)^T, b_l = (b_l1, ..., b_lq)^T, and the v_i are iid N(0, σ²). In the presence of missing covariates, we write x_ij = (x_ij,mis, x_ij,obs) and assume that the missingness is MAR. In the presence of missingness in some of the components of x_ij, we model only those missing components for necessary imputation.
Further, in order to reduce the number of nuisance parameters, we model the joint distribution of these missing components as the product of several smaller dimensional conditional distributions, as proposed in Lipsitz and Ibrahim (1996) and Ibrahim, Lipsitz, and Chen (1999). Specifically, we partition x_ij,mis into several component vectors x_ij,mis^(1), ..., x_ij,mis^(r), parameterized by α₁, ..., α_r, conditional on the observed x_ij,obs. The prior on σ² is taken to be of the inverse gamma form π(σ²) ∝ exp{−d₀/(2σ²)}(σ²)^{−d/2−1} for known hyperparameters d₀ and d. We let π(α) denote the prior distribution of α. Then, the resulting posterior distribution is given by

π(b, σ², α | D_obs) ∝ L(b, σ², α | D_obs) exp{−d₀/(2σ²)}(σ²)^{−d/2−1}π(α). (4-8)

Direct evaluation of the posterior moments of the p_ijl's is not possible, as it entails the evaluation of numerical integrals which are very difficult to compute. We use the Gibbs sampling method instead. Let X_mis denote the full vector of all the x_ij,mis, and v = (v₁, ..., v_k). To facilitate Gibbs sampling, we consider the following joint posterior based on the complete data likelihood:

π(b, σ², α, v, X_mis | D_obs) ∝ L(b, σ², α | D) exp{−d₀/(2σ²)}(σ²)^{−d/2−1}π(α), (4-9)

where L(b, σ², α | D) is given by Eq. 4-6. It is easy to show that after integrating out v and X_mis from Eq. 4-9, the joint marginal posterior of (b, σ², α) is Eq. 4-8. To prove the propriety of the posterior, we will mainly follow the techniques given in Chen and Shao (2000), Chen et al. (2004) and Chen et al. (2002). We note that our covariates are bounded. We assume that a_ij ≤ x_ij,mis ≤ b_ij, which means all the components of x_ij,mis are bounded by a_ij and b_ij. We start by introducing some notation: let M = {(i, j) : at least one component of x_ij is missing}, M_ij = {1 ≤ r ≤ q : x_ijr is missing}, and M_r* = {(i, j) : r ∈ M_ij}, where j = 1, ..., n_i, i = 1, ..., k.
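The multinomial-Poisson relationship invoked in Eq. 4-3 can be checked numerically. The sketch below simply rejects Poisson draws that fall outside the conditioning event {Σ_q u_q = 1}; the particular ζ values are arbitrary.

```python
import numpy as np

def conditional_poisson_cells(zeta, n_draws, rng):
    """Monte Carlo check of the multinomial-Poisson relationship: independent
    u_q ~ Poisson(zeta_q), conditioned on sum(u) = 1, put all mass on the unit
    vectors, with cell probabilities zeta_q / sum(zeta) -- i.e. they behave as
    a Multinomial(1, p) draw with p_q = zeta_q / sum(zeta)."""
    zeta = np.asarray(zeta, float)
    draws = []
    while len(draws) < n_draws:
        u = rng.poisson(zeta)
        if u.sum() == 1:               # keep only draws in the conditioning event
            draws.append(u)
    return np.mean(draws, axis=0)      # empirical cell probabilities
```

With ζ = (0.2, 0.5, 0.1, 0.3), the empirical frequencies settle near ζ/Σζ ≈ (0.182, 0.455, 0.091, 0.273), which is what the identity predicts.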
EMPIRICAL AND HIERARCHICAL BAYESIAN METHODS WITH APPLICATIONS TO SMALL AREA ESTIMATION

By

ANANYA ROY

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2007

exp(u_{1ij}^T γ)/{1 + exp(u_{1ij}^T γ) + exp(u_{2ij}^T γ)} = ∫₀^∞ exp(−t_ij) exp{−t_ij exp(−u_{1ij}^T γ)} exp{−t_ij exp(−(u_{1ij} − u_{2ij})^T γ)} dt_ij.

Now, let F(v) = exp(−exp(−v)) and, noting that F(v) = ∫ 1(s < v) dF(s) = ∫ 1(s > −v)(−dF(−s)), we get

J_{1ij} = ∫₀^∞ exp(−t_ij) [∫ 1(−u_{1ij}^T γ < s_{1ij})(−dF_{t_ij}(−s_{1ij}))] [∫ 1(−(u_{1ij} − u_{2ij})^T γ < s_{2ij})(−dF_{t_ij}(−s_{2ij}))] dt_ij. (B-14)

Similarly,

exp(u_{2ij}^T γ)/{1 + exp(u_{1ij}^T γ) + exp(u_{2ij}^T γ)} = ∫₀^∞ exp(−t_ij) [∫ 1(−u_{2ij}^T γ < s_{1ij})(−dF_{t_ij}(−s_{1ij}))] [∫ 1(−(u_{2ij} − u_{1ij})^T γ < s_{2ij})(−dF_{t_ij}(−s_{2ij}))] dt_ij, (B-15)

and

1/{1 + exp(u_{1ij}^T γ) + exp(u_{2ij}^T γ)} = ∫₀^∞ exp(−t_ij) [∫ 1(u_{1ij}^T γ < s_{1ij})(−dF_{t_ij}(−s_{1ij}))] [∫ 1(u_{2ij}^T γ < s_{2ij})(−dF_{t_ij}(−s_{2ij}))] dt_ij, (B-16)

where F_{t_ij}(v) = exp{−t_ij exp(−v)}.

where, Var(Â) ≈ 2m⁻¹[A² + 2A Σ V_i/m + Σ V_i²/m]. Lahiri and Rao (1995) showed that the estimator of the MSE as given by (1-16) is robust to departures from normality of the random area effects v_i, though not of the sampling errors e_i. Datta and Lahiri (2000) extended the results of Prasad and Rao to general linear mixed models of the form

Y_i = X_i b + Z_i v_i + e_i, i = 1, ..., m, (1-17)

where Y_i is the vector of sample observations of order n_i, X_i and Z_i are known matrices, and v_i and e_i are distributed independently with means 0 and covariance matrices G_i and R_i, respectively, depending on some parameters λ, the variance components. In their paper, Datta and Lahiri first estimated the variances of the BLUP for the case of λ known. They also developed MSE estimators correct up to the second order when λ is estimated by the ML or REML estimator. They also show that the MSE approximations for the ML and REML methods are exactly the same in the second order asymptotic sense.
However, the second order accurate estimator of the MSE based on the latter method requires less bias correction than the one based on the former method. In the same thread, Datta et al. (2005) derived second order approximations to the MSE of small area estimators based on the Fay-Herriot model as given by (1-1), when the model variances are estimated by the Fay-Herriot iterative method based on a weighted residual sum of squares. They also derived a noninformative prior on the model for which the posterior variance of a small area mean is second-order unbiased for the MSE. Many other models have been proposed in the context of SAE. Cressie (1990) and Ghosh, Natarajan, Stroud, and Carlin (1998) emphasize the use of linear models and generalized linear models, respectively. Erickson and Kadane (1987) carried out sensitivity analysis for SAE. Pfeffermann and Burck (1990), Rao and Yu (1992) and Ghosh et al. (1996) studied the extent of possible use of time series and cross-sectional models in SAE.

Based on the figures in Table 4-1, we calculate the following summary measures to examine the performance of the estimators. Let t_i be the generic symbol for a true proportion in the ith small area and e_i be the corresponding generic symbol for the estimate. Then we define the Average Relative Deviation (ARD) as

ARD = m⁻¹ Σ_{i=1}^m |e_i − t_i|/t_i. (4-21)

The Average Relative Mean Squared Error (ARMSE) is given by

ARMSE = m⁻¹ Σ_{i=1}^m (e_i − t_i)²/t_i². (4-22)

The Average Deviation (AD) is denoted by

AD = m⁻¹ Σ_{i=1}^m |e_i − t_i|, (4-23)

and the Average MSE is given by

AMSE = m⁻¹ Σ_{i=1}^m (e_i − t_i)². (4-24)

Table 4-2 summarizes the performance of the HB estimates, and also of the direct estimates. It follows from Table 4-2 that the HB estimates outperform the direct estimates according to all four criteria. For the HB estimates, the ARD, ARMSE, AD and AMSE improvements over the direct estimates are 39.3%, 74%, 37.92% and about 71%, respectively, for estimating the infant mortality rate.
For low birthweight proportions, the corresponding improvements are 46.59% (ARD), 85.0% (ARMSE) and 43.05% (AD), respectively.

Table 4-2. Measures of precision
          D1        D2        HB1       HB2
ARD       0.7973    0.464     0.4832    0.2478
ARMSE     1.6525    0.6244    0.4151    0.0931
AD        0.0058    0.0367    0.0036    0.0209
AMSE      0.00007   0.005     0.00002   0.0007

© 2007 Ananya Roy

Next we calculate

∂g_{2i}(A)/∂A = ∂/∂A [B_i² x_i^T{Σ_{j=1}^m (V_j + A)⁻¹x_jx_j^T}⁻¹x_i]
= B_i²[−2(V_i + A)⁻¹ x_i^T{Σ_{j=1}^m (V_j + A)⁻¹x_jx_j^T}⁻¹x_i + x_i^T{Σ_{j=1}^m (V_j + A)⁻¹x_jx_j^T}⁻¹{Σ_{j=1}^m (V_j + A)⁻²x_jx_j^T}{Σ_{j=1}^m (V_j + A)⁻¹x_jx_j^T}⁻¹x_i] + O(m⁻¹). (2-28)

Hence, by Lemma 2-2 and the assumptions of Theorem 2-3,

E[g_{2i}(Â)] = g_{2i}(A) − 2B_i²{Σ_{j=1}^m (V_j + A)⁻²}⁻¹ tr[{Σ_{j=1}^m (V_j + A)⁻¹x_jx_j^T}⁻¹{Σ_{j=1}^m (V_j + A)⁻²x_jx_j^T}]{(1 + K²)Φ(−K) − Kφ(K)} + O(m⁻²) = g_{2i}(A) − g_{6im}(A) + O(m⁻²). (2-29)

Thus, we estimate g_{2i}(A) by g_{2i}(Â) + g_{6im}(Â). The theorem follows.

2.5 Simulation Study

We consider a simulation study to see the effectiveness of the proposed estimators and compare them with standard empirical Bayes estimators. The layout of the simulation is as follows. Generate θ_i = μ + u_i, i = 1, ..., m. Generate y_i = θ_i + e_i, i = 1, ..., m. We take e_i ~ N(0, V_i), where for m = 10,

popular small area models are special cases of the general linear mixed models. It will be worthwhile to develop robust methods in this more general case. Another future undertaking will be extending the results for the bivariate binary case in the multivariate context. Also, we have developed EB techniques when there are no missing covariates. It will be interesting to extend these results when some of the covariates are missing.

Theorem 2-4. Assume conditions (i) and (ii) of Theorem 2-3. Then the estimate of E(θ̂_i^REB − θ_i)² correct up to O(m⁻¹) is given by g_{1i}(Â) + V_i g_{5im}(Â) + g_{2i}(Â) + g_{6im}(Â) + g_{3im}(Â) + g_{4im}(Â),
where g_{1i}, g_{2i}, g_{3im} and g_{4im} are defined in Theorem 2-3, and

g_{5im}(A) = B_i(1 − B_i){Σ_{j=1}^m (1 − B_j)²}⁻¹[2(1 − B_i) − tr{(Σ_{j=1}^m (1 − B_j)x_jx_j^T)⁻¹(Σ_{j=1}^m (1 − B_j)²x_jx_j^T)}],
g_{6im}(A) = 2B_i²{Σ_{j=1}^m (V_j + A)⁻²}⁻¹ tr[{Σ_{j=1}^m (V_j + A)⁻¹x_jx_j^T}⁻¹{Σ_{j=1}^m (V_j + A)⁻²x_jx_j^T}]{(1 + K²)Φ(−K) − Kφ(K)}. (2-25)

Proof: In view of Lemma 2-3,

E[V_i(1 − B̂_i)] = g_{1i}(A) − V_i g_{5im}(A) + o(m⁻¹).

Hence, we estimate g_{1i}(A) by g_{1i}(Â) + V_i g_{5im}(Â). Next, by a one-step Taylor approximation,

g_{2i}(Â) ≈ g_{2i}(A) + (Â − A) ∂g_{2i}(A)/∂A. (2-26)

Hence,

E[g_{2i}(Â)] = g_{2i}(A) + E(Â − A) ∂g_{2i}(A)/∂A + o(m⁻¹). (2-27)

where Z = Σ₁^{−1/2}(X − μ₁) ~ N_p(0, I_p), c = Σ₁^{−1/2}(μ₁ − μ₂) and C = Σ₁^{1/2}Σ₂⁻¹Σ₁^{1/2}, the integral reduces to a normal integral in Z, and we get

∫ f₁^{1+A} f₂^{−A} dX = |(1 + A)I_p − AC|^{−1/2} |Σ₁|^{−A/2} |Σ₂|^{A/2} exp{(A(1 + A)/2)(μ₁ − μ₂)^T[(1 + A)Σ₂ − AΣ₁]⁻¹(μ₁ − μ₂)}, (2-6)

where the final step of Eq. 2-6 requires the standard matrix inversion formula (Rao, 1973, p. 33), (A + BDB^T)⁻¹ = A⁻¹ − A⁻¹B(D⁻¹ + B^TA⁻¹B)⁻¹B^TA⁻¹, with A = C⁻¹, B = I_p and D = −A(1 + A)⁻¹I_p. Again,

c^T[C⁻¹ − A(1 + A)⁻¹I_p]⁻¹ c = (1 + A)(μ₁ − μ₂)^T[(1 + A)Σ₂ − AΣ₁]⁻¹(μ₁ − μ₂).

Moreover,

|(1 + A)I_p − AC| = |Σ₁||(1 + A)Σ₁⁻¹ − AΣ₂⁻¹| = |Σ₂|⁻¹|(1 + A)Σ₂ − AΣ₁|. (2-7)

constraint A > 0, one gets

Â = max[0, Σ_{i=1}^k tr(C_i(b̂)) / {Σ_{i=1}^k tr(C_i(b̂)) − 2 Σ_{i=1}^k tr(B_i(b̂))}].

A plot of g(A) reveals that the function increases from 0 to Â and then decreases from Â onwards. Further, since Σ_i tr(C_i(b̂)) = O(k) and Σ_i tr(B_i(b̂)) = O(1), for large k, Â is close to 1. This is also revealed in our data analysis given in the next section. Due to the same condition, the dominating term in the estimated MSE is the first term, namely A(A + 1)⁻² Σ_i tr(C_i(b̂)). Since A(A + 1)⁻² = A⁻¹(A⁻¹ + 1)⁻², the dominating first term has the same expression for A and A⁻¹. Hence, the minimum is attained only for very small or very large values of A in most practical applications.
Thus, the choice of A reflects a trade-off between the direct and the regression estimate, and the practitioner should be guided by the amount of bias that he/she is willing to tolerate.

3.5 Simulation Study

In this section, we discuss a simulation study comparing the performance of the EB estimators with the usual sample means. Here, we consider k = 10 and k = 20, with n_i = 5 for all i. To start with, we generated the covariates from a N(0, 1) distribution and kept them fixed for the rest of the analysis. We generated the z_ij's from a multinomial (p_1ij, p_2ij, p_3ij, p_4ij) distribution, where the p_lij's are simulated from a Dirichlet(Am_1ij, Am_2ij, Am_3ij, Am_4ij) distribution, to replicate the bivariate binary model described in Section 2. Here the m_lij's were modelled on the covariates through the multinomial link. For our simulation, we assumed b₁ = (0, 1)′, b₂ = (0, 2)′ and b₃ = (0, 1)′. Our parameters of interest were the small area means θ_i^(r) = n_i⁻¹ Σ_j (p_rij + p_3ij), r = 1, 2. We used the optim function in R to estimate the parameters. We calculate the following summary measures to examine the performance of the estimators. Let θ̂_i^{(r)} be the estimate of θ_i^{(r)} for the ith small area at the rth simulation. Then we define the

A PROOF OF THEOREM 2-3
B PROOFS OF THEOREMS 4-1 AND 4-2
REFERENCES
BIOGRAPHICAL SKETCH

(ii) max_{1≤i≤m} x_i^T(Σ_{j=1}^m x_jx_j^T)⁻¹x_i = O(m⁻¹). Then
E(θ̂_i^EB − θ_i)² = g_{1i}(A) + g_{2i}(A) + g_{3im}(A) + g_{4im}(A) + o(m⁻¹),

where g_{1i}(A), g_{2i}(A) and g_{3im}(A) are as defined in Theorem 2-2, and

g_{4im}(A) = B_i²D_i²(1 − B_i)²{Σ_{j=1}^m (1 − B_j)²}⁻¹[2(2K² − 1)Φ(−K) − 4(K² − 1)Φ(−K) + 5Kφ(K)].

The proof of this theorem is deferred to the Appendix.

2.4 MSE Estimation

The present section is devoted to obtaining an estimator of the approximate MSE derived in the previous section which is correct up to O(m⁻¹). As noted already, under the assumptions of Theorem 2-3, g_{3im} and g_{4im} are both O(m⁻¹). Hence, substitution of Â for A in these expressions suffices to find terms which are correct up to O_p(m⁻¹). However, a similar substitution for g_{1i} or g_{2i} will ignore terms which are O_p(m⁻¹). Accordingly, to estimate g_{1i} or g_{2i} we need a further Taylor series expansion. To this end, we first state the following lemmas.

Lemma 2-2. Assume the conditions of Theorem 2-3. Then

E(Â − A) = (tr Σ⁻²)⁻¹ tr[(X^TΣ⁻¹X)⁻¹(X^TΣ⁻²X)] + O(m⁻²).

Proof: From Cox and Snell (1968), due to assumption (i) of Theorem 2-3,

E(Â − A) = (1/2)I⁻¹(A)[K(A) + 2J(A)] + O(m⁻²), (2-20)

since (z_lij, p_lij) is distributed independently of (z_l′ij′, p_l′ij′) for j ≠ j′ and all 1 ≤ l, l′ ≤ 3. Hence,

E[(θ̂_ri^B − θ_ri)(θ̂_si^B − θ_si)] = n_i⁻² Σ_{j=1}^{n_i} E[{(p̂_rij^B − p_rij) + (p̂_3ij^B − p_3ij)}{(p̂_sij^B − p_sij) + (p̂_3ij^B − p_3ij)}], r, s = 1, 2. (3-15)

Thus,

MSE(θ̂_ri^B) = n_i⁻² Σ_{j=1}^{n_i} E[(p̂_rij^B − p_rij) + (p̂_3ij^B − p_3ij)]², r = 1, 2. (3-16)

Now for each l = 1, 2, 3,

E(p̂_lij^B − p_lij)² = E[(A + 1)⁻¹{z_lij − p_lij − A(p_lij − m_lij)}]². (3-17)

First conditioning on p_lij, it follows from Eq. 3-17 that

E(p̂_lij^B − p_lij)² = (A + 1)⁻² E[p_lij(1 − p_lij) + A²(p_lij − m_lij)²] = A(A + 1)⁻² m_lij(1 − m_lij), (3-18)

and similarly, for l ≠ l′, E[(p̂_lij^B − p_lij)(p̂_l′ij^B − p_l′ij)] = −A(A + 1)⁻² m_lij m_l′ij.

Datta, G. S., Rao, J. N. K., and Smith, D. D. (2005), "On measuring the variability of small area estimators under a basic area level model," Biometrika, 92, 183-196.

Drew, D., Singh, M. P., and Choudhry, G. H. (1982), "Evaluation of small area estimation techniques for the Canadian Labour Force Survey," Survey Methodology, 8, 17-47.

Efron, B.
and Morris, C. (1971), "Limiting the risk of Bayes and empirical Bayes estimators. I. The Bayes case," Journal of the American Statistical Association, 66, 807-815.

- (1972), "Limiting the risk of Bayes and empirical Bayes estimators. II. The empirical Bayes case," Journal of the American Statistical Association, 67, 130-139.

Erickson, E. P. and Kadane, J. B. (1987), "Sensitivity analysis of local estimates of undercount in the 1980 U.S. Census," in Platek et al. (1987), pp. 23-45.

Farrell, P. J., MacGibbon, B., and Tomberlin, T. J. (1997), "Empirical Bayes estimators of small area proportions in multistage designs," Statistica Sinica, 7, 1065-1083.

Fay, R. E. and Herriot, R. A. (1979), "Estimates of income for small places: an application of James-Stein procedures to census data," Journal of the American Statistical Association, 74, 269-277.

Ghosh, M. and Maiti, T. (2004), "Small-area estimation based on natural exponential family quadratic variance function models and survey weights," Biometrika, 91, 95-112.

Ghosh, M., Nangia, N., and Kim, D. H. (1996), "Estimation of median income of four-person families: a Bayesian approach," Journal of the American Statistical Association, 91, 1423-1431.

Ghosh, M., Natarajan, K., Stroud, T. W. F., and Carlin, B. P. (1998), "Generalized linear models for small-area estimation," Journal of the American Statistical Association, 93, 273-282.

Ghosh, M., Natarajan, K., Waller, L. A., and Kim, D. H. (1999), "Hierarchical Bayes GLMs for the analysis of spatial data: an application to disease mapping," Journal of Statistical Planning and Inference, 75, 305-318.

Ghosh, M. and Rao, J. N. K. (1994), "Small area estimation: an appraisal," Statistical Science, 9, 55-93.

Gilks, W. R. and Wild, P. (1992), "Adaptive rejection sampling for Gibbs sampling," Applied Statistics, 41, 337-348.

Godambe, V. P. and Thompson, M. E. (1989), "An extension of quasi-likelihood estimation," Journal of Statistical Planning and Inference, 22, 137-172.

Gwet, J.-P.
and Rivest, L.-P. (1992), "Outlier resistant alternatives to the ratio estimator," Journal of the American Statistical Association, 87, 1174-1182.

Here, following Chen, Ibrahim, and Shao (2004),

f(z_ij1 | x_ij, b₁, v) = exp{z_ij1 η_ij1} exp{−exp(η_ij1)}/z_ij1! ≤ exp{−exp(η_ij1)} if z_ij1 = 0, and ≤ exp(−η_ij1⁺) exp(−η_ij1⁻) if z_ij1 ≥ 1. (4-10)

Denote I_11 = {(i, j) : z_ij1 = 0} and I_21 = {(i, j) : z_ij1 ≥ 1}. Let f₁(η) = exp{−exp(η)}, f₂(η) = exp(−η⁺) and f₃(η) = exp(−η⁻). Note that f₁ and f₂ are non-increasing in η. Also f₁(∞) = f₂(∞) = 0, and f₁(−∞) < ∞, f₂(−∞) < ∞. On the other hand, f₃ is non-decreasing in η, with f₃(−∞) = 0 and f₃(∞) < ∞. Let n = Σ_{i=1}^k n_i and let N_{2l} be the cardinality of I_{2l}, l = 1, ..., 4. Denote by e_i = (0, ..., 0, 1, 0, ..., 0)^T ∈ R^k the basis vector with 1 at the ith component, and write u_ij = (x_ij^T, e_i^T)^T.

Let U^{(l)} be the (n + N_{2l}) × (q + k) design matrix with rows w_ij^{(l)} u_ij^T, 1 ≤ j ≤ n_i, 1 ≤ i ≤ k, augmented with the rows u_ij^T for (i, j) ∈ I_{2l}, where w_ij^{(l)} = −1 if (i, j) ∈ I_{1l} ∪ I_{2l} and w_ij^{(l)} = +1 otherwise. (4-11, 4-12)

For 1 ≤ r ≤ q and 1 ≤ l ≤ 4, let s_ijr^{(l)} = a_ijr if (i, j) ∈ I_{1l} ∪ I_{2l}, and s_ijr^{(l)} = b_ijr otherwise. (4-13)

Remark 1. It is clear from the above theorem that if Σ₁ and Σ₂ are known, the divergence function D_A(f₁, f₂) is a one-to-one function of (A/2)(μ₁ − μ₂)^T[(1 + A)Σ₂ − AΣ₁]⁻¹(μ₁ − μ₂). In the special case when Σ₁ = Σ₂ = Σ, the above divergence measure is a one-to-one function of the Mahalanobis distance (μ₁ − μ₂)^TΣ⁻¹(μ₁ − μ₂). We now utilize Theorem 2-1 in deriving the influence of (y_i, x_i) in the given small area context. Note that b | y, X ~ N[(X^TΣ⁻¹X)⁻¹X^TΣ⁻¹y, (X^TΣ⁻¹X)⁻¹] and b | y_(−i), X_(−i) ~ N[(X_(−i)^TΣ_(−i)⁻¹X_(−i))⁻¹X_(−i)^TΣ_(−i)⁻¹y_(−i), (X_(−i)^TΣ_(−i)⁻¹X_(−i))⁻¹], where Σ_(−i) is a diagonal matrix similar to Σ with the ith diagonal element (V_i + A) removed. Now writing μ₁ = (X^TΣ⁻¹X)⁻¹X^TΣ⁻¹y, μ₂ = (X_(−i)^TΣ_(−i)⁻¹X_(−i))⁻¹X_(−i)^TΣ_(−i)⁻¹y_(−i), Σ₁ = (X^TΣ⁻¹X)⁻¹ and Σ₂ = (X_(−i)^TΣ_(−i)⁻¹X_(−i))⁻¹, one gets

μ₁ − μ₂ = (X^TΣ⁻¹X)⁻¹X^TΣ⁻¹y − {X^TΣ⁻¹X − (V_i + A)⁻¹x_ix_i^T}⁻¹{X^TΣ⁻¹y − (V_i + A)⁻¹x_iy_i}.
(2-8)

By a standard matrix inversion formula (Rao, 1973, p. 33),

{X^TΣ⁻¹X − (V_i + A)⁻¹x_ix_i^T}⁻¹ = (X^TΣ⁻¹X)⁻¹ + (X^TΣ⁻¹X)⁻¹(V_i + A)⁻¹x_ix_i^T(X^TΣ⁻¹X)⁻¹ / {1 − (V_i + A)⁻¹x_i^T(X^TΣ⁻¹X)⁻¹x_i}. (2-9)

From Eq. 2-8 and Eq. 2-9, on simplification,

μ₁ − μ₂ = (X^TΣ⁻¹X)⁻¹(V_i + A)⁻¹x_i(y_i − x_i^T b̂) / {1 − (V_i + A)⁻¹x_i^T(X^TΣ⁻¹X)⁻¹x_i}. (2-10)

Average Relative Deviation (ARD) as

ARD = (kR)⁻¹ Σ_{i=1}^k Σ_{r=1}^R |θ̂_i^{(r)} − θ_i^{(r)}|/θ_i^{(r)}. (3-44)

The Average Relative Mean Squared Error (ARMSE) is given by

ARMSE = (kR)⁻¹ Σ_{i=1}^k Σ_{r=1}^R (θ̂_i^{(r)} − θ_i^{(r)})²/(θ_i^{(r)})². (3-45)

The Average Deviation (AD) is denoted by

AD = (kR)⁻¹ Σ_{i=1}^k Σ_{r=1}^R |θ̂_i^{(r)} − θ_i^{(r)}|. (3-46)

The Average MSE is given by

AMSE = (kR)⁻¹ Σ_{i=1}^k Σ_{r=1}^R (θ̂_i^{(r)} − θ_i^{(r)})², (3-47)

and the estimated MSE is given by

EMSE = (kR)⁻¹ Σ_{i=1}^k Σ_{r=1}^R M̂SE(θ̂_i^{(r)}), (3-48)

where M̂SE(θ̂_i) is the second order correct estimate of the MSE as given by Eq. 3-43. In addition to the relative bias of the point estimates and the simulated MSEs, we report the relative bias of the estimates of the mean squared errors, as defined below. The relative bias with respect to the empirical MSE is

RB = [Ê{M̂SE(θ̂_i)} − SMSE(θ̂_i)] / SMSE(θ̂_i),

where SMSE(θ̂_i) = R⁻¹ Σ_{r=1}^R (θ̂_i^{(r)} − θ_i^{(r)})² and Ê{M̂SE(θ̂_i)} = R⁻¹ Σ_{r=1}^R M̂SE(θ̂_i^{(r)}). As a matter of interest, we also computed the direct estimators to compare their performance with our empirical Bayes estimators. We observed considerable improvement in terms of bias and MSE when we used our EB estimators as compared to the raw estimators; e.g., for A = 5, the improvement was 48.5% for ARD, 98.1%

CHAPTER 1
REVIEW OF SMALL AREA TECHNIQUES

1.1 Introduction

The terms "small area" and "local area" are commonly used to denote a small geographical area, but they may also be used to describe a "small domain", i.e., a small subpopulation such as a specific age-sex-race group of people within a large geographical area. Small area estimation (SAE) is gaining increasing prominence in survey methodology due to a need for such statistics in both the public and private sectors.
The reason behind its success is that the same survey data, originally targeted towards a higher level of geography (e.g., states), often need to be used also for producing estimates at a lower level of aggregation (e.g., counties, subcounties or census tracts). In such cases, the direct estimates, which are usually based on area-specific sample data, are often unavailable (e.g., due to zero sample size) and almost always unreliable due to large standard errors and coefficients of variation arising from the scarcity of samples in individual areas. SAE techniques provide alternative tools for inference to improve upon the direct estimators. The use of small area statistics goes back to 11th century England and 17th century Canada (Brackstone, 1987). However, these statistics were all based either on a census or on administrative records aiming at complete enumeration. The problem of SAE is twofold. The first is the fundamental issue of producing reliable estimates of parameters of interest (e.g., means, counts, quantiles, etc.) for small domains. The second is to assess the estimation error. In the absence of sufficient samples from the small areas, the only option is to borrow strength from the neighboring areas, i.e., those with similar characteristics. In Section 2, we outline the advances in SAE over the past few decades and some popular works significant to the enhancement of the techniques in this field.

Putting Eq. B-14-Eq. B-16 in Eq. B-13 and integrating w.r.t. η = (α_2^T, α_3^T)^T:

S(< u 1(-)7)< < .)(-dF'(-, )) (i,j) E J O x l(- (i ij t2ij) ,)(-dF 'i(-,,' ,)) dtij /-exp(-t) 1(-f7) < ,)(-dF(- ,)) (i,j) E J2 3 -. .n O . x (-(2j ij)T < I ,)(-dF'ij(-,' ,*))dtid n / exp(-tj)[ 1(- 7)-< ,)(-dF '(- ,)) (i,j)e J3 0- -0 X 1(-_ l y <_ (.) ,(--dF '( ,)) d d7 JR8 JR+n i R2n (B-17)

where dFt(v*) )) (-dFi )) i d(-F(-', ,)). Using Chen and Shao (2000), under conditions (H1) and (H2), we get ∫ L(η | D) dη < ∞. Now, (-dFt7i(-, .))(-dF'(-,,' )) ti~exp(, .)
exp(-tijexp(, .))exp(,' ) exp(- tiexp(' .)). cigarettes smoked during pregnancy is at least 1. It turned out that there was missingness in the responses to both (i) and (iii). If one leaves out completely all the entries where such missingness occurs, then one is left with only a few observations where one encounters both low birthweight and infant mortality, and indeed the total number of such entries will be disproportionately small in comparison with the remaining outcomes. Also, it is very clear that both these covariates have a direct impact on both low birthweight and infant mortality, so that the missingness cannot be treated as "missing completely at random". Instead, this missingness falls in the category of "missing at random" (MAR) (Little and Rubin, 2002). We use our method to estimate the infant mortality rate and the low birthweight rate simultaneously over a period of 5 years for several small areas in one of the states (name withheld for confidentiality). We constructed the small areas by cross-classifying the different cities in the state with mother's race, namely, Non-Hispanic White, Non-Hispanic Black, Non-Hispanic American Indian, Non-Hispanic Asian/Pacific Islander and Hispanic. We get 48 such small domains. For each small domain, we consider the NCHS full data as constituting our population values, and we draw a 2% simple random sample from this population of individuals. Thus the population low birthweight and infant mortality rates are known for these small domains, and will serve as the gold standard for comparison with the different estimates. In particular, we will find the HB estimates, and compare the same with the direct estimates, namely the sample means.
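The direct estimates above can be sketched in a few lines. This is an illustrative toy example, not the dissertation's own code: the data layout, function name and true rates below are hypothetical, but the scheme (a 2% simple random sample per domain, with sample means as direct estimates) follows the description in the text.

```python
import random

# Hedged sketch: for one small domain, the direct estimates are the sample
# means of the binary outcomes in a 2% simple random sample, as described above.
def direct_estimates(population, frac=0.02, seed=7):
    """population: list of (low_birthweight, infant_death) 0/1 pairs."""
    rng = random.Random(seed)
    n = max(1, round(frac * len(population)))
    sample = rng.sample(population, n)          # SRS without replacement
    p1 = sum(r[0] for r in sample) / n          # direct low-birthweight rate
    p2 = sum(r[1] for r in sample) / n          # direct infant-mortality rate
    return p1, p2

# One synthetic domain of 5000 individuals (true rates 0.10 and 0.016).
pop = [(1, 1)] * 50 + [(1, 0)] * 450 + [(0, 1)] * 30 + [(0, 0)] * 4470
p1, p2 = direct_estimates(pop)
```

With only 100 sampled individuals per domain, such direct estimates of a rare outcome like infant mortality are very noisy, which is exactly what motivates the HB estimates.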
In our analysis we consider the response variables and covariates as follows: y_ij1 = infant death (1 = yes, 0 = no); y_ij2 = low birthweight (1 = yes, 0 = no); x_ij1 = 1, denoting the intercept; (x_ij2, x_ij3) = (0, 0), (1, 0) or (0, 1) according as mother's schooling is less than 12 years, equal to 12 years, or greater than 12 years; (x_ij4, x_ij5) = (0, 0), (1, 0) or (0, 1) if mother's age is less than 20 years, between 20 and 34 years, or greater than 34 years; and x_ij6 = 0 or 1 if mother does not smoke, or the average number of cigarettes smoked during pregnancy is at least 1. Here y_ij = (y_ij1, y_ij2)^T is the response vector with the covariates

Again, for 1 <= l, l' <= 3,

E(p̂_lij - p_lij)(p̂_l'ij - p_l'ij) = E[{(A + 1)^{-1}(z_lij - p_lij) - A(A + 1)^{-1}(p_lij - m_lij)} x {(A + 1)^{-1}(z_l'ij - p_l'ij) - A(A + 1)^{-1}(p_l'ij - m_l'ij)}] = (A + 1)^{-2} E[(z_lij - p_lij)(z_l'ij - p_l'ij)] + A^2 (A + 1)^{-2} Cov(p_lij, p_l'ij),

so that

E[(p̂_ij - p_ij)(p̂_ij - p_ij)^T] = A(A + 1)^{-2} [Diag(m_1ij, m_2ij, m_3ij) - (m_1ij, m_2ij, m_3ij)^T (m_1ij, m_2ij, m_3ij)]. (3-20)

It follows from Eq. 3-16 and Eq. 3-20 that

MSE(θ̂_i^B) = A(A + 1)^{-2} Σ_{j=1}^{n_i} w_ij^2 [ (m_1ij + m_3ij)(1 - m_1ij - m_3ij) , m_3ij - (m_1ij + m_3ij)(m_2ij + m_3ij) ; m_3ij - (m_1ij + m_3ij)(m_2ij + m_3ij) , (m_2ij + m_3ij)(1 - m_2ij - m_3ij) ]. (3-21)

Next we observe that

E[(θ̂^B - θ)(θ̂^EB - θ̂^B)^T] = [ E(θ̂_1ij^B - θ_1ij)(θ̂_1ij^EB - θ̂_1ij^B) , E(θ̂_1ij^B - θ_1ij)(θ̂_2ij^EB - θ̂_2ij^B) ; E(θ̂_2ij^B - θ_2ij)(θ̂_1ij^EB - θ̂_1ij^B) , E(θ̂_2ij^B - θ_2ij)(θ̂_2ij^EB - θ̂_2ij^B) ]. (3-22)

Hence, MSE(p̂_ij) can be obtained as in Eq. 3-19. Here V is block diagonal, with ith block σ_e^2 I_i + σ_v^2 J_i, with J_i the square matrix of order n_i with every element equal to 1 and I_i the identity matrix of order n_i. Then the generalized least-squares estimator of b is given by

b̂ = (X^T V^{-1} X)^{-1} X^T V^{-1} y. (1-9)

Then the best linear unbiased predictor (BLUP) for the ith small area mean, when (σ_v^2, σ_e^2) is known, is given by

θ̂_i = x̄_i^T b̂ + g_i (ȳ_i - x̄_i^T b̂), (1-10)

where x̄_i = n_i^{-1} Σ_{j=1}^{n_i} x_ij and g_i = σ_v^2 / (σ_v^2 + n_i^{-1} σ_e^2). In practice, however, σ_v^2 and σ_e^2 are seldom known. A common procedure is to replace them in the BLUP formula by standard variance components estimates such as the maximum likelihood (
ML), restricted maximum likelihood (REML) or analysis of variance (ANOVA) estimators. The resulting predictors are known as the empirical BLUP (EBLUP) or EB predictors. Prasad and Rao (1990), among others, provide details. An alternative method is to find hierarchical Bayes (HB) predictors by specifying prior distributions for b, σ_v^2 and σ_e^2, and computing the posterior distributions f(θ_i | y, X) given all the observations in all the areas (Datta and Ghosh, 1991). Markov chain Monte Carlo (MCMC) techniques are used for simulations from the posterior. As pointed out earlier, another important aspect of SAE, apart from the estimation of parameters, is the calculation of the error associated with estimation, known as the mean squared error (MSE). To this end, we discuss here the work of Prasad and Rao (1990). Prasad and Rao pointed out that the models of Fay and Herriot (1979) and Battese et al. (1988) are all special cases of the general mixed linear model

y = Xb + Zv + e, (1-11)

where y is the vector of sample observations, X and Z are known matrices, and v and e are distributed independently with means 0 and covariance matrices G and R, respectively, depending on some parameters λ, the variance components. Following

Similarly, one finds bias-corrected estimates for m_2ij(b)(1 - m_3ij(b)). Next we calculate

E[m_1ij(b̂) m_2ij(b̂)] = E[{m_1ij(b) + (∂m_1ij(b)/∂b)^T (b̂ - b) + (1/2)(b̂ - b)^T (∂^2 m_1ij(b)/∂b ∂b^T)(b̂ - b)} x {m_2ij(b) + (∂m_2ij(b)/∂b)^T (b̂ - b) + (1/2)(b̂ - b)^T (∂^2 m_2ij(b)/∂b ∂b^T)(b̂ - b)}] ≈ m_1ij(b) m_2ij(b) + (∂m_1ij(b)/∂b)^T E{(b̂ - b)(b̂ - b)^T} (∂m_2ij(b)/∂b) + (1/2) tr[{m_1ij(b)(∂^2 m_2ij(b)/∂b ∂b^T) + m_2ij(b)(∂^2 m_1ij(b)/∂b ∂b^T)} E{(b̂ - b)(b̂ - b)^T}].

Hence we estimate m_1ij(b) m_2ij(b) by

m_1ij(b̂) m_2ij(b̂) - [∂m_1ij(b)/∂b]^T_{b=b̂} U_k^{-1}(b̂) [∂m_2ij(b)/∂b]_{b=b̂} - (1/2) tr[{m_1ij(b̂)(∂^2 m_2ij(b)/∂b ∂b^T)_{b=b̂} + m_2ij(b̂)(∂^2 m_1ij(b)/∂b ∂b^T)_{b=b̂}} U_k^{-1}(b̂)],

where, as before, the bias of b̂ is estimated through the matrices U_k^{-1}(b̂) and K_lu(b̂), via terms of the form tr(U_k^{-1}(b̂) K_1(b̂)) and tr(U_k^{-1}(b̂) K_3p(b̂)). Similarly, one finds bias-corrected estimates of m_1ij(b) m_3ij(b) and m_2ij(b) m_3ij(b). Hence we get the following theorem. Theorem 3-2.
Assume U_k = O_e(k). Then MSE(θ̂_ij^EB) can be estimated, correct up to O(k^{-1}), by A(A + 1)^{-2} Ĉ_1ij(b̂) + A^2 (A + 1)^{-2} Ĉ_2ij(b̂), where Ĉ_1ij(b̂) and Ĉ_2ij(b̂) are built from the bias-corrected estimates described above of m_1ij(b)(1 - m_1ij(b)), m_2ij(b)(1 - m_2ij(b)) and m_3ij(b)(1 - m_3ij(b)) and of the products m_1ij(b) m_2ij(b), m_1ij(b) m_3ij(b) and m_2ij(b) m_3ij(b). (3-43)

Henderson (1975), they proposed the best linear unbiased estimator of μ = l^T b + m^T v as

t(λ, y) = l^T β̂ + m^T G Z^T V^{-1}(y - X β̂), (1-12)

where V = Cov(y) = R + Z G Z^T and β̂ is the GLS estimator of b. They proposed the estimators for the variance components using Henderson's methods. Prasad and Rao also proposed an estimator for the MSE of the estimators. They extended the work of Kackar and Harville (1984), and approximated the true prediction MSE of the EBLUP under normality of the error terms. They showed that

E[t(λ̂) - t(λ)]^2 ≈ tr[(∇t)(∇t)^T E{(λ̂ - λ)(λ̂ - λ)^T}], (1-13)

where ∇t = col_j(∂t(λ)/∂λ_j), so that

MSE[t(λ̂)] ≈ MSE[t(λ)] + tr[(∇t)(∇t)^T E{(λ̂ - λ)(λ̂ - λ)^T}]. (1-14)

They provided general conditions under which the precise order of the neglected terms in the approximation is o(t^{-1}) for large t. These conditions are satisfied by both the Fay-Herriot and the nested error models. For the estimation of the MSE, the models were considered individually, as it is difficult to find general conditions. For example, the MSE approximation for the Fay and Herriot (1979) model is given as follows:

E[θ̂_i^EB - θ_i]^2 ≈ V_i(1 - B_i) + B_i^2 x_i^T (X^T D^{-1} X)^{-1} x_i + E(θ̂_i(Â) - θ̂_i(A))^2. (1-15)

Prasad and Rao estimated V_i(1 - B_i) by V_i(1 - B̂_i) and B_i^2 x_i^T (X^T D^{-1} X)^{-1} x_i by B̂_i^2 x_i^T (X^T D̂^{-1} X)^{-1} x_i, where D = diag(V_1 + A, . . . , V_m + A). They approximated E(θ̂_i(Â) - θ̂_i(A))^2 by B_i^2 (V_i + A)^{-1} Var(Â), while under normality of the error terms, Var(Â) was approximated as Var(Â) = 2m^{-1}[A^2 + 2A Σ_i V_i/m + Σ_i V_i^2/m], correct up to the second order. Hence, the estimated MSE up to the second order is

V_i(1 - B̂_i) + B̂_i^2 x_i^T (X^T D̂^{-1} X)^{-1} x_i + 2 B̂_i^2 (V_i + Â)^{-1} \widehat{Var}(Â). (1-16)

protection against model failure and the excess Bayes risk that one is willing to tolerate when the assumed model is true.
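The Prasad-Rao second-order MSE estimate for the Fay-Herriot model (Eq. 1-16) can be sketched numerically. This is a hedged illustration, not the dissertation's code: the function name is hypothetical, Â is assumed to be supplied by some variance-component estimator, and the Var(Â) formula is the moment-estimator approximation quoted in the text.

```python
import numpy as np

# Hedged sketch of Eq. 1-16: g1 + g2 + 2*g3 for each small area,
# assuming A_hat (the estimate of A) is given.
def prasad_rao_mse(X, V, A_hat):
    """X: (m, p) design matrix; V: (m,) sampling variances V_i."""
    D_inv = np.diag(1.0 / (V + A_hat))              # D = diag(V_i + A)
    XtDX_inv = np.linalg.inv(X.T @ D_inv @ X)
    B = V / (V + A_hat)                             # B_i = V_i / (V_i + A)
    m = len(V)
    var_A = 2.0 / m * (A_hat**2 + 2 * A_hat * V.mean() + (V**2).mean())
    g1 = V * (1 - B)                                # leading term
    g2 = B**2 * np.einsum('ij,jk,ik->i', X, XtDX_inv, X)   # x_i'(X'D^-1X)^-1 x_i
    g3 = B**2 * var_A / (V + A_hat)                 # uncertainty in A_hat
    return g1 + g2 + 2 * g3
```

Note the factor 2 on g3: it absorbs the second-order bias of plugging Â into g1, which is why the naive estimate (first two terms only) understates the MSE.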
We now proceed to the study of robust EB estimators in the next section.

2.3 Robust Empirical Bayes Estimation and MSE Evaluation

In the previous section, the variance component A (> 0) was assumed to be known. If, however, A is unknown, we estimate A from the marginal distributions of the y_i's. We may recall that marginally y_i ~ind N(x_i^T b, V_i + A). There have been various proposals for estimating A, including the method of moments (Fay and Herriot, 1979; Prasad and Rao, 1990), the method of maximum likelihood (ML), or restricted maximum likelihood (REML) (Datta and Lahiri, 2000; Datta et al., 2005). The analytical results that we are going to obtain are valid for all these choices, although our numerical results will be based on the ML estimate of A, which we denote by Â. Let Σ̂ = Diag(V_1 + Â, . . . , V_m + Â), B̂_i = V_i/(Â + V_i), i = 1, . . . , m, and b̂ = (X^T Σ̂^{-1} X)^{-1} X^T Σ̂^{-1} y. Then the robust EB estimators of the θ_i's are given by

θ̂_i^REB = y_i - B̂_i D̂_i ψ_K((y_i - x_i^T b̂)/D̂_i), i = 1, . . . , m, (2-16)

where ψ_K is defined in the previous section after Eq. 2-12. Our objective in this section is to find E(θ̂_i^REB - θ_i)^2 correct up to O(m^{-1}), i.e., we provide an asymptotic expansion of this expectation with an explicit coefficient of m^{-1}, while the remainder term, when multiplied by m, converges to zero, subject to certain assumptions. Here E denotes expectation over the joint distribution of y and θ. We need certain preliminary results before calculating the Bayes risks. First we need the asymptotic distribution of Â. Based on the marginal distribution of the y_i's, the likelihood for b and A is given by

L(b, A) = (2π)^{-m/2} Π_{i=1}^m (V_i + A)^{-1/2} exp[-(1/2) Σ_{i=1}^m (y_i - x_i^T b)^2/(V_i + A)]. (2-17)

Harter, R. M. and Fuller, W. A. (1987), "The multivariate components of variance model in small area estimation," in Platek et al. (1987), pp. 103-123.

Ibrahim, J. G., Lipsitz, S. R., and Chen, M.-H. (1999), "Missing Covariates in Generalized Linear Models When the Missing Data Mechanism Is Nonignorable," Journal of the Royal Statistical Society, Series B, 61, 173-190.

Jiang, J.
and Lahiri, P. (2001), "Empirical Best Prediction for Small Area Inference with Binary Data," Annals of the Institute of Statistical Mathematics, 53, 217-243.

Kackar, R. N. and Harville, D. A. (1984), "Approximations for standard errors of estimators of fixed and random effects in mixed linear models," Journal of the American Statistical Association, 79, 853-862.

Lahiri, P. and Maiti, T. (2002), "Empirical Bayes estimation of relative risks in disease mapping," Calcutta Statistical Association Bulletin, 53, 213-223.

Lahiri, P. and Rao, J. N. K. (1995), "Robust Estimation of Mean Squared Error of Small Area Estimators," Journal of the American Statistical Association, 90, 758-766.

Lipsitz, S. R. and Ibrahim, J. G. (1996), "A Conditional Model for Incomplete Covariates in Parametric Regression Models," Biometrika, 83, 916-922.

Little, R. J. A. and Rubin, D. B. (2002), Statistical Analysis of Missing Data, Wiley, New York, 2nd ed.

Maiti, T. (1998), "Hierarchical Bayes Estimation of Mortality Rates for Disease Mapping," Journal of Statistical Planning and Inference, 69, 339-348.

Malec, D., Sedransk, J., Moriarity, C. L., and LeClere, F. B. (1997), "Small Area Inference for Binary Variables in the National Health Interview Survey," Journal of the American Statistical Association, 92, 815-826.

McGibbon, B. and Tomberlin, T. J. (1989), "Small area estimates of proportions via empirical Bayes techniques," Survey Methodology, 15, 237-252.

Nandram, B., Sedransk, J., and Pickle, L. (1999), "Bayesian analysis of mortality rates for U.S. health service areas," Sankhya, Series B, 61, 145-165.

Pfeffermann, D. (2002), "Small area estimation-new developments and directions," International Statistical Review, 70, 125-143.

Pfeffermann, D. and Burck, L. (1990), "Robust small area estimation combining time series and cross-sectional data," Survey Methodology, 16, 217-237.

Platek, R., Rao, J., Sarndal, S., and Singh, M. (eds.) (1987), Small Area Statistics: An International Symposium, Wiley, New York.

Prasad, N. G. N. and Rao, J.
N. K. (1990), "The estimation of mean squared errors of small area estimators," Journal of the American Statistical Association, 85, 163-171.

Also,

Σ_2 = (X^T Σ^{-1} X - (V_i + A)^{-1} x_i x_i^T)^{-1} = Σ_1 + (V_i + A)^{-1} Σ_1 x_i x_i^T Σ_1 / [1 - (V_i + A)^{-1} x_i^T Σ_1 x_i], where Σ_1 = (X^T Σ^{-1} X)^{-1}. (2-11)

It is easy to see from Eq. 2-10 and Eq. 2-11 that (μ_1 - μ_2)^T [(1 + λ)Σ_2 - λΣ_1]^{-1} (μ_1 - μ_2) is a quadratic form in y_i - x_i^T b̂. We propose restricting the amount of shrinking by controlling the residuals y_i - x_i^T b̂. However, in order to make these residuals scale-free, we standardize them as (y_i - x_i^T b̂)/D_i, where D_i^2 = V(y_i - x_i^T b̂) = V_i + A - x_i^T (X^T Σ^{-1} X)^{-1} x_i, i = 1, . . . , m. Based on these standardized residuals, we propose a robust HB estimator of θ_i as

θ̂_i^RHB = y_i - B_i D_i ψ_K((y_i - x_i^T b̂)/D_i), i = 1, . . . , m, (2-12)

where ψ_K(t) = sgn(t) min(K, |t|).

Remark 2. For the special case of the intercept model as considered in Lindley and Smith (1972), with V_i = 1 and B_i = B = (1 + A)^{-1} for all i, the above robust estimator of θ_i reduces to θ̂_i^RHB = y_i - BD ψ_K((y_i - ȳ)/D), i = 1, . . . , m, where D^2 = (m - 1)/(mB).

The selection of K is always an open question. Often it is dictated from the user's point of view. A somewhat more concrete recommendation can be given from the following theorem, which provides the Bayes risk of the robust Bayes estimator given in Eq. 2-12 under the model Eq. 2-2. Throughout this paper, let Φ denote the standard normal df and φ the standard normal pdf.

Theorem 2-2. Consider the model (2-2). Then E(θ̂_i^RHB - θ_i)^2 = g_1i(A) + g_2i(A) + g_3im(A),

For (i,j) in M_3 such that (x_2ij, x_3ij) is observed, let ũ_ij = (u_ij,obs^T, x̃_2ij)^T, where x̃_2ij is either equal to (1 - x_2ij) or x_2ij if (i,j) in M_6 = {(i,j) : x_6ij is missing}; also ṽ_ij = (v_ij,obs^T, x̃_3ij)^T, where x̃_3ij is either equal to (1 - x_3ij) or x_3ij if (i,j) in M_6. Let U** = (ũ_ij, (i,j) in J_1 ∪ J_3; -ũ_ij, (i,j) in J_1; ṽ_ij, (i,j) in J_2; -ṽ_ij, (i,j) in J_2 ∪ J_3). Let n** be the cardinality of M_3. Then let us state the following conditions: (H1) U** is of full rank; (H2) there exists a positive vector a = (a_1, . . . , a_{2n**}) such that U**^T a = 0. (4-19)

Now we state the following theorem: Theorem 4-2.
Suppose conditions (D1) and (D2), and (H1) and (H2), hold. Then we have

∫∫∫ L(α_2, α_3, α_6 | D_obs) dα_2 dα_3 dα_6 < ∞. (4-20)

For our analysis, we assume an Inverse Gamma(1,1) prior for σ^2. Table 4-1 provides the sample sizes for all the 48 small domains, the population proportions of low birthweight and infant mortality (P1, P2), the corresponding sample means (D1, D2), and the corresponding HB estimates (HB1, HB2).

where Z ~ N(0, 1). Since Z and -Z have the same distribution,

E[(Z^2 - 1)(Z - K) I_[Z>K] - (Z^2 - 1)(Z + K) I_[Z<-K]] = 2E[(Z^2 - 1)(Z - K) I_[Z>K]] = 2 ∫_K^∞ (Z^2 - 1)(Z - K) φ(Z) dZ = 2φ(K).

Similarly, xf b KD)I x D [ve- i b>KDi] (yi xfb + KD)IMf KD i b + K [yi-xTb<-KDi] =2{(Vj + A)-2}-1( + A)- E[(j j-1 -4{(Vj + A)-2}-1( + A)- E[( J1 -4{(Vj + A)-2}-1( + A)- [ ( j 1 Hence, E (eREB B)(REB REB) KB2 m [(Vj + A)-2]-1( + A) j-1 [(Vj + A)-2]- ( + A) j=1 KB2D2 B)2[(1 j=1 K)I[Z>K] (Z + K)I[z<-K]] + o(m 1) K)I[z>K] +o(m-1) - K!)(-K)] + o(m-1). (A-26) b(K) 3{0(K) K)(-K)}] + o(m-1) 1[- (K) + 3K4)(-K)}] + o(m- 1) Combining Eq. A-27 and Eq. A-21 the theorem follows. K (-'(Z))dZ + K K O(Z)dZ] JK JK E[(A- A)2(+A) {(y (A-25) Bj)2] -1[--(K) + 3K4)(-K)}] + o(m-1). (A-27)

Before proceeding further, we recall a few facts about the Dirichlet distribution. Suppose u_1, . . . , u_r have the Dirichlet pdf

f(u_1, . . . , u_r) ∝ (Π_{l=1}^r u_l^{a_l - 1})(1 - Σ_{l=1}^r u_l)^{a_{r+1} - 1},

where a_l > 0 (l = 1, . . . , r + 1). Then E(u_j) = a_j / Σ_{l=1}^{r+1} a_l, V(u_j) = a_j (Σ_l a_l - a_j)/[(Σ_l a_l)^2 (Σ_l a_l + 1)], and Cov(u_j, u_j') = -a_j a_j' /[(Σ_l a_l)^2 (Σ_l a_l + 1)], (1 <= j ≠ j' <= r). Hence from Eq. 3-6, the Bayes estimator (posterior mean) of p_lij is given by

p̂_lij = (z_lij + A m_lij(b))/(A + 1); j = 1, . . . , n_i, i = 1, . . . , k. (3-7)

The resulting estimators for θ_1ij and θ_2ij are

θ̂_1ij = p̂_1ij + p̂_3ij, θ̂_2ij = p̂_2ij + p̂_3ij. (3-8)

In an EB scenario, the parameters b_1, b_2 and b_3 are all unknown and need to be estimated from the marginal distributions of all the y's. We assume that A is known, and outline later its adaptive estimation.
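The posterior-mean form of Eq. 3-7 is a simple convex combination, which a one-line sketch makes explicit. The function and argument names below are illustrative, not the dissertation's code.

```python
# Hedged sketch of Eq. 3-7: under the Dirichlet prior, the Bayes estimate of a
# cell probability is p_hat = (z + A*m) / (A + 1), where z is the observed cell
# indicator, m = m(b) the prior cell probability, and A the Dirichlet precision.
def bayes_cell_estimate(z, m_prior, A):
    return (z + A * m_prior) / (A + 1.0)

# Shrinkage behaviour: larger A pulls the estimate toward the prior mean m(b).
p_weak = bayes_cell_estimate(1, 0.2, 0.5)     # stays close to the data z = 1
p_strong = bayes_cell_estimate(1, 0.2, 50.0)  # pulled toward the prior mean 0.2
```

Summing such cell estimates as in Eq. 3-8 then gives the estimates of the small area means θ_1ij and θ_2ij.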
Direct estimation of A is impossible from the marginal distributions of the y's, since marginally P(y_ij1 = 1, y_ij2 = 0) = E(z_1ij) = m_1ij, P(y_ij1 = 0, y_ij2 = 1) = E(z_2ij) = m_2ij and P(y_ij1 = 1, y_ij2 = 1) = E(z_3ij) = m_3ij, and the joint marginal of the y's is uniquely determined from these probabilities, resulting in nonidentifiability of A. One way to estimate b_1, b_2, b_3 from the marginal Dirichlet-Multinomial distribution of the y's is the method of maximum likelihood. But the MLE's are mathematically intractable, and are not readily implementable in practice. Instead, we appeal to the theory of optimal unbiased estimating functions as proposed in Godambe and Thompson (1989).

l = 1, 2, 3, u = 1, . . . , p. In particular, we have a2 k ni "b, Bbf [S (zlij i 1 j 1 02 k ni OabOb, 1 T jljij )iju Tmijn)XijU] i 1 i 1 2 k ni bt b (zzij m~ijy)x( 2 i j= 1 ab11(abf)1 a2 k ni ObOb1,b [ (zi1 i 1 j 1 02 k ni O ,Ob ,,b [i (z 1 i 1 j 1 k i 1 k i= 1 Tj)xi jU Tnj)Xi jU an Ym. .M. .m(1 j 1 j 1 k ni -2 .m . i= l j=l T 2mij)xijxijxiju, 2m, .XijXTXij, .m )XijXijxiju where 1 <= l, l', l'' <= 3, 1 <= u <= p. (3-39)-(3-42) Thus E(b̂ - b) = O(k^{-1}). From Eq. 3-39-Eq. 3-42, one obtains the elements of the matrices K_lu for all l = 1, 2, 3 and u = 1, . . . , p. Thus from Eq. 3-36-Eq. 3-42, the bias-corrected estimate of m_1ij(b)(1 - m_1ij(b)) is given by tr(Uk 1(b)K1l(b)) 1 \1ij (b) ij s(b) [1 --biji(b)] (1 2bij (b)) b |b b)u -l() -tr[{2(1 2 am1ij (b) ODb bb T li~ j i iju tr(U (b)K3p(b)) m )(bij (b) U- -6 Ob Ib b)-}U1k I02M1ij(b) 2 mnjij (b)) 8b -b b Oij 0 Oi

for all j in the ith area; E(y_ij | θ_i) = ψ'(θ_i) = μ_i and Var(y_ij | θ_i) = ψ''(θ_i)/φ = V(μ_i)/φ, (i = 1, . . . , m). Here, for the quadratic variance function structure, V(μ_i) = v_0 + v_1 μ_i + v_2 μ_i^2, with v_0, v_1, v_2 never simultaneously zero. For example, for the Binomial distribution, v_0 = 0, v_1 = 1 and v_2 = -1, and for the Poisson distribution, v_0 = v_2 = 0 and v_1 = 1.
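The quadratic variance function just described can be checked directly; the tiny sketch below (illustrative names only) confirms that the quoted coefficient triples reproduce the familiar Bernoulli and Poisson variances.

```python
# Hedged check of the quadratic variance function V(mu) = v0 + v1*mu + v2*mu^2
# with the coefficients quoted above: (0, 1, -1) recovers the Bernoulli
# variance mu*(1 - mu), and (0, 1, 0) recovers the Poisson variance mu.
def qvf(mu, v0, v1, v2):
    return v0 + v1 * mu + v2 * mu * mu

assert abs(qvf(0.3, 0, 1, -1) - 0.3 * 0.7) < 1e-12  # Binomial (single trial)
assert qvf(2.5, 0, 1, 0) == 2.5                     # Poisson
```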
A conjugate prior was proposed for the canonical parameters in the model,

π(θ_i) ∝ exp[λ{m_i θ_i - ψ(θ_i)}] C(λ, m_i), (1-22)

where λ is the overdispersion parameter and m_i = g(x_i^T b). Using results from Morris (1983), E(μ_i) = m_i and Var(μ_i) = V(m_i)/(λ - v_2), where λ > max(0, v_2). Ghosh and Maiti provide pseudo BLUPs of the small area means, conditional on the weighted means ȳ_iw = Σ_{j=1}^{n_i} w_ij y_ij, where the w_ij > 0's satisfy Σ_{j=1}^{n_i} w_ij = 1. They eventually arrived at the pseudo EBLUPs by simultaneously estimating b and λ, using the method of unbiased estimating functions following Godambe and Thompson (1989). They also provided approximate formulae for the MSEs and their approximate estimators.

1.4 Robust Estimation in SAE

There has been some research in the past on controlling influential observations in surveys, though not much in the context of SAE. Among others, we refer to Chambers (1986) and Gwet and Rivest (1992), who used robust estimation techniques specifically in the context of ratio estimation. Chambers (1986) mainly considered the robust estimation of a finite population total when the sampled data contain 'representative' outliers, such that there is no good reason to assume that the nonsampled part of the target population is free of outliers. Chambers assumed a superpopulation model ξ_0, under which the random variables r_i = (y_i - βx_i)σ_i^{-1} are independent and identically distributed with zero mean and unit variance. Here the x_i's are known real values associated with the population elements y_i, σ_i^2 = σ^2 v(x_i), β is unknown and σ^2 is assumed to be unknown. Under ξ_0,

for arbitrarily large r (> 0) as m -> ∞. Moreover,

B̂_i(y_i - x_i^T b̂) - B_i(y_i - x_i^T b) = -B_i x_i^T(b̂ - b) + (B̂_i - B_i)(y_i - x_i^T b) + O_p(m^{-1}).

O REB REB -(B B)-(y- xfT) + B[{(y, +{(y- Tb + KD)- (y,- - - xfb KD ) (y- fb KDj)}I >KD b+ KD)}I [-xTb<-KD ] +(B B -)[(y x b KD)I [y-X b>KDi] + ( +Op(m-n1) -( B)(y- x b)I Tb< [ Iv-m~|K 4 K{Bj(D D ) + Dj(B, B)} ( [y-xTb>KD ] x b + KD)I[y,-XTb<-KDi] [y fb D [y,-xTb<-KD]) + Op(mT ).
(A-11) E[(B B )2x(y b)_ X r [ yX b| +K2E[{B{(D! D) + Di(B, B)}2 (ly xTbl>KDi]1 (A-12) -B, V, (V + A) -(V, + A)- (A V(V + A)-2(A V(V + A)-2(A A-A) A)(1 )- A)(t + op,(m-)). E[(B, B)2(,- x_ b)I _xb| A)2(, X 2[lyi-X b xTb) (A-10) Hence, (6 REB E(O, REB) 2 Now, Hence, 2,( + A)-E[(A (A-13)

for all practical purposes, are believed to be quite reliable. This was, for example, pointed out in Fay and Herriot (1979). Efron and Morris (1971, 1972) proposed limited translation rule (LTR) estimators to guard against problems of the above type in the context of compound decision problems. The LTR estimators were obtained with a special choice of what these authors referred to as a "relevance function". The main purpose of the LTR estimators was to "hedge Bayesian bets with frequentist caution" (Efron and Morris, 1972). In this chapter, we have introduced the notion of influence functions in the small area context, and developed robust EB and HB estimators based on such influence functions. These influence functions are akin to the relevance functions of Efron and Morris, but are derived on the basis of a general divergence measure between two distributions. In this way, we are assigning some theoretical underpinnings to the work of Efron and Morris as well. Also, unlike the intercept model of Efron and Morris, our results are derived in the more general regression context, a scenario more suitable for small area estimation. The importance of influence functions in robust estimation is well emphasized in Hampel (1974), Huber (1981), Hampel et al. (1986) and many related papers. The predominant idea is to detect influential observations in terms of their effects on parameters, most typically the regression parameters. Johnson and Geisser (1982, 1983, 1985) took instead a predictive point of view for the detection of influential observations.
In the process, they developed Bayesian predictive influence functions (PIF's) and applied these in several univariate and multivariate prediction problems based on regression models. These predictive influence functions are based on Kullback-Leibler divergence measures. Our proposed influence functions are motivated in a Bayesian way, based on some general divergence measures as introduced in Amari (1982) and Cressie and Read (1984). Included as special cases are the well-known Kullback-Leibler and Hellinger divergence measures.

But

∂b̂/∂A = -(X^T Σ^{-1} X)^{-1} X^T Σ^{-2} (y - X b̂), (A-5)

so that E[∂b̂/∂A] = 0, and after some simplifications,

V(∂b̂/∂A) = (X^T Σ^{-1} X)^{-1} [X^T Σ^{-3} X - (X^T Σ^{-2} X)(X^T Σ^{-1} X)^{-1} (X^T Σ^{-2} X)] (X^T Σ^{-1} X)^{-1}. (A-6)

By assumption (i), (X^T Σ^{-1} X)^{-1} <= (V_* + A)(X^T X)^{-1}. Also, X^T Σ^{-2} X <= A^{-2} X^T X and X^T Σ^{-3} X <= A^{-3} X^T X. Hence from Eq. A-6,

V(∂b̂/∂A) <= (V_* + A)^2 A^{-3} (X^T X)^{-1}.

Now, noting that E(Â - A)^2 = O(m^{-1}), xTE[(6 6)(b6- 6)T]x XiTE[(A- A)2(b )(bA) ] X OAaOA < C{E[(A A)4]1/2 E[( T(b)(A bA) T )4 }1/2 A~~q}O OA( b- (-7 C{E[(A A)4]}1/2{[(iTV(b)x)2]} 1/2 A)4]} 1/2(iT(XT X)- i) (A-7) where C (> 0) is a constant. This implies that x_i^T(b̂ - b) = O_p(m^{-1}). Accordingly, since Syi-Xfb1>KDi [|yi-X fb>KD] [Xf (b-b)|>K|Di-DiD]' where |D̂_i - D_i| = O_p(m^{-1/2}) and x_i^T(b̂ - b) = O_p(m^{-1}), it follows from Eq. A-8 that P( [yi-xTb>KbD [yi-xTb>KDj / 0) 0 < C{E[(A O(m-2), (A-8) (A-9)

least, I would like to thank my husband Yash for being such a positive force in my life. Without him it would have been very difficult to survive the pressure and demand of the last year.

two normal distributions based on the general divergence measure. The result is probably known, but we include a proof for the sake of completeness.

Theorem 2-1. Let f_1 and f_2 denote the N_p(μ_1, Σ_1) and N_p(μ_2, Σ_2) pdf's respectively. Then

D_λ(f_1, f_2) = [λ(λ + 1)]^{-1} [ exp{ (λ(λ + 1)/2) (μ_1 - μ_2)^T ((1 + λ)Σ_2 - λΣ_1)^{-1} (μ_1 - μ_2) } |Σ_1|^{-λ/2} |Σ_2|^{(λ+1)/2} |(1 + λ)Σ_2 - λΣ_1|^{-1/2} - 1 ].

Proof:

D_λ(f_1, f_2) = [λ(λ + 1)]^{-1} [ E{f_1(X)/f_2(X)}^λ - 1 ] = [λ(λ + 1)]^{-1} [ |Σ_2|^{λ/2} |Σ_1|^{-λ/2} E exp{ (λ/2)[(X - μ_2)^T Σ_2^{-1}(X - μ_2) - (X - μ_1)^T Σ_1^{-1}(X - μ_1)] } - 1 ],
where X ~ N_p(μ_1, Σ_1). Next we write

λ(X - μ_2)^T Σ_2^{-1}(X - μ_2) - λ(X - μ_1)^T Σ_1^{-1}(X - μ_1) = (Z + ξ)^T C (Z + ξ) - Z^T Z,

where Z = Σ_1^{-1/2}(X - μ_1) ~ N(0, I). Completing the square and integrating then yields the stated expression for D_λ(f_1, f_2).

Integrating w.r.t. v:

7r(a|Dobs) 4 4 <1K ||1I 1 21{|v| < K||711||} dF k ni x ffJ f(xij,obsa) (B-9) i=l j 1 Then 4 4 fJ j ^ j Lj j { 1 1 (1f (c t)/21|v|| < KV|n<]} } dF1 J 2n N Rn N Ru+NJ J-n N 7=1 2= 1 4 <-, / E i^,i + E Ilq+ E iT.+i,,,q} , l 1 2 (ij)EIii (i,j)EI21 (i,j)EI21 < ∞ (from condition (C3)).

Hence,

∫ π(α | D_obs) dα <= K ∫ Π_{i=1}^k Π_{j=1}^{n_i} f(x_ij,obs | α) dα < ∞. (B-10)

Hence the theorem follows.

REFERENCES

Battese, G. E., Harter, R. M., and Fuller, W. A. (1988), "An error-components model for prediction of county crop areas using survey and satellite data," Journal of the American Statistical Association, 83, 28-36.

Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975), Discrete Multivariate Analysis: Theory and Practice, The MIT Press, Cambridge, Mass.-London.

Brackstone, G. J. (1987), "Small area data: policy issues and technical challenges," in Platek et al. (1987), pp. 3-20.

Chambers, R. L. (1986), "Outlier robust finite population estimation," Journal of the American Statistical Association, 81, 1063-1069.

Chaudhuri, A. (1994), "Small domain statistics: A review," Statistica Neerlandica, 48, 215-236.

Chen, M.-H., Ibrahim, J. G., and Shao, Q.-M. (2004), "Propriety of the Posterior Distribution and Existence of the MLE for Regression Models with Covariates Missing at Random," Journal of the American Statistical Association, 99, 421-438.

- (2006), "Posterior propriety and computation for the Cox regression model with applications to missing covariates," Biometrika, 93, 791-807.

Chen, M.-H. and Shao, Q.-M. (2000), "Propriety of Posterior Distribution for Dichotomous Quantal Response Models," Proceedings of the American Mathematical Society, 129, 293-302.

Chen, M.-H., Shao, Q.-M., and Xu, D. (2002), "Necessary and sufficient conditions on the propriety of posterior distributions for generalized linear mixed models," Sankhya, 64, 57-85.

Cox, D. R. and Snell, E. J. (1968), "A general definition of residuals," Journal of the Royal Statistical Society, Series B, 30, 248-275.

Cressie, N. (1990), "Small-area prediction of undercount using the general linear model," Statistics Canada Symposium, 93-105.

Datta, G. S. and Ghosh, M. (1991), "Bayesian prediction in linear models: applications to small area estimation," Annals of Statistics, 19, 1748-1770.

Datta, G. S. and Lahiri, P. (1995), "Robust hierarchical Bayes estimation of small area characteristics in the presence of covariates and outliers," Journal of Multivariate Analysis, 54, 310-328.

- (2000), "A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems," Statistica Sinica, 10, 613-627.

The general divergence measure between two normal distributions is derived in Section 2. Also, in this section, we have applied these results to find influential observations for certain Bayesian small area estimation regression models. Based on these influence functions, we have proposed certain robust HB small area estimators. The Bayes risk of the estimators (also referred to as the MSE) is also derived. Robust EB estimators have been proposed in Section 3, and second order correct expansions of their Bayes risks or MSE's are provided in this section. Estimators of the MSE's which are correct up to the second order are derived in Section 4. Section 5 contains a simulation study which investigates situations where the proposed EB estimators will perform better than the regular EB estimators.

2.2 Influence Functions and Robust Bayes Estimation

We consider a small area setting where

y_i | θ_i ~ind N(θ_i, V_i), V_i (> 0) known; θ_i ~ind N(x_i^T b, A), i = 1, . . . , m. (2-2)

Here the x_i's are p (< m)-component design vectors. We write X^T = (x_1, . . . , x_m) and assume that rank(X) = p. Also, let y = (y_1, . . . ,
y_m)^T and θ = (θ_1, . . . , θ_m)^T. The above model is often referred to in the literature as the Fay-Herriot (1979) model, where the y_i's are the direct survey estimators of the small area parameters θ_i, and the x_i's are the associated covariates. The Bayes estimator of θ_i when both b and A (> 0) are known is given by

θ̂_i^B = (1 - B_i) y_i + B_i x_i^T b, (2-3)

where B_i = V_i/(V_i + A), i = 1, . . . , m. We first begin with a simple EB or HB scenario where b is unknown but A is known. In this case, the HB estimator of θ_i with a uniform prior on b is given by

θ̂_i^HB = (1 - B_i) y_i + B_i x_i^T b̂. (2-4)

We take V_i = .01 for i = 1, . . . , 4; V_i = .1 for i = 5, . . . , 7; V_i = 1 for i = 8, . . . , 10, and for m = 20, the numbers of small areas are doubled within each group. We consider different distributions for generating the u_i's. (i) Contaminated normal distribution, in particular .9 x N(0, 1) + .1 x N(0, 36), so that 1 of 10 areas is expected to be an outlier. (ii) Normal distribution N(0, 4.5). Note that 4.5 is the variance of the contaminated normal distribution. Then we calculate the following quantities to measure the performance of the SAE. Let θ̂_i be the estimate of θ_i. Here we define the absolute relative bias as

RB(θ̂_i) = | R^{-1} Σ_{r=1}^R (θ̂_i^(r) - θ_i^(r)) / θ_i^(r) |

and the simulated mean squared error as

SMSE(θ̂_i) = R^{-1} Σ_{r=1}^R (θ̂_i^(r) - θ_i^(r))^2,

where R is the number of replicates, which is taken as 1000. The notation used in Table 2-1 is as follows: RB: relative bias of standard EB estimates; RBR: relative bias of robust EB estimates; RBI: relative improvement in relative bias of robust estimates over the standard Bayes estimates; SMSE: empirical MSE of standard EB estimates; SMSER: empirical MSE of robust EB estimates; RMI: relative improvement in MSE of robust estimators over the standard Bayes estimators; REBN: relative bias of the naive MSE estimates under the robust EB method; RI: relative improvement in bias of the estimated MSE over the naive estimated MSE under the robust method. Here the naive MSE estimates are obtained by taking only the first two terms of the MSE estimates.
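The simulation summaries just defined are straightforward to compute; the sketch below (illustrative names, not the dissertation's code) shows them together with the truncation function ψ_K(t) = sgn(t) min(K, |t|) that defines the robust estimator of Eq. 2-12.

```python
import numpy as np

# Hedged sketch: Huber-type truncation used by the robust estimator (Eq. 2-12),
# and the RB / SMSE summaries computed over R simulation replicates.
def psi_K(t, K):
    return np.sign(t) * np.minimum(K, np.abs(t))

def rb_and_smse(theta_hat, theta):
    """theta_hat, theta: arrays of shape (R, m), one row per replicate."""
    rb = np.abs(((theta_hat - theta) / theta).mean(axis=0))  # absolute relative bias
    smse = ((theta_hat - theta) ** 2).mean(axis=0)           # simulated MSE
    return rb, smse
```

Because ψ_K caps the standardized residual at K, an outlying y_i is shrunk by at most K·B_i·D_i, which is what limits the damage an aberrant area can do to the EB estimate.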
The values corresponding to normal random effects are reported within parentheses.

Case 1: All of (x_ij2, x_ij3, x_ij6) are observed. Then

L_1(α_6, α_2, α_3 | D_obs,1) = Π_{i,j: (x_ij2, x_ij3, x_ij6) observed} f(x_ij6 | α_6) x exp{x_ij2(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + x_ij3(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)} / [1 + exp(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + exp(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)],

where D_obs,1 = {(x_ij2, x_ij3, x_ij6, x_ij4, x_ij5) : (x_ij2, x_ij3, x_ij6) are observed, 1 <= j <= n_i, 1 <= i <= k}.

Case 2: x_ij6 is observed but (x_ij2, x_ij3) are missing. Let D_obs,2 = {(x_ij6, x_ij4, x_ij5) : (x_ij6, x_ij4, x_ij5) are observed but (x_ij2, x_ij3) are missing, 1 <= j <= n_i, 1 <= i <= k}. Summing over the possible values (x_ij2,mis, x_ij3,mis) of the missing pair,

L_2(α_6, α_2, α_3 | D_obs,2) = Π_{i,j: x_ij6 observed} f(x_ij6 | α_6) x Σ_{x_ij2,mis, x_ij3,mis} exp{x_ij2(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + x_ij3(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)} / [1 + exp(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + exp(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)] = Π_{i,j: x_ij6 observed} f(x_ij6 | α_6),

since the inner sum equals 1.

Case 3: x_ij6 is missing but (x_ij2, x_ij3) are observed. In this case, we sum over all possible values of x_ij6. Let D_obs,3 = {(x_ij2, x_ij3, x_ij4, x_ij5) : (x_ij2, x_ij3, x_ij4, x_ij5) are observed but x_ij6 is missing, 1 <= j <= n_i, 1 <= i <= k}. Similarly to Case 2, we have

L_3(α_6, α_2, α_3 | D_obs,3) = Π_{i,j: (x_ij2, x_ij3) observed, x_ij6 missing} Σ_{x_ij6,mis} f(x_ij6 | α_6) x exp{x_ij2(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + x_ij3(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)} / [1 + exp(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + exp(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)].

Case 4: (x_ij2, x_ij3, x_ij6) are all missing. In this case, the observed likelihood, after summing over the missing values, is equal to 1.

It is evident from Table 2-1 that the robust EB outperforms the standard EB. The relative bias in SAE is about 24% less on average under the robust method when m = 10 and the distribution of the random effects is a mixture of two normal distributions. This improvement reduces to 10% when m = 20. In the case of the normal distribution of the random effects, the respective improvements are only 10% and 5% for m = 10 and 20. The MSE is about 60%
less on an average under robust method when m 10 while for m 20 its only about -' For normal random effects this improvement reduces to 2T!' . and 2'. for m 10 and m 20 respectively. The corrected MSE estimates are at least 5'. less biased than the naive MSE estimates. For m 20 this difference is about ;'. Thus the bias corrected MSE estimates are useful at least for small samples. Table 2-1. Relative biases of the point estimates and relative bias of MSE estimates. K RB RBR RBI SMSE SMSER RMI REBN REB RI m 10 .5 2.743 2.011 .267 .909 .344 .622 -.106 -.087 .055 (1.150) (1.051) (.086) (.460) (.345) (.250) (-.065) (-.050) (.0434) 1 2.202 .197 .355 .609 -.154 -.133 .059 (-) (1.067) (.072) (-) (.350) (.239) (-.078) (-.062) (.046) 1.5 2.067 .246 .372 .590 -.176 -.153 .062 (-) (0.969) (.157) (-) (.350) (.239) (-.084) (-.066) (.051) m 20 .5 2.581 2.392 .073 .329 .320 .027 -.059 -.050 .028 (.683) (.653) (.059) (.322) (.321) (.003) (-.026) (-.018) (.025) 1 2.390 .074 .328 .003 -.090 -.080 .0305 (-) (.677) (.023) (-) (.318) (.012) (-.041) (-.033) (.025) 1.5 2.120 .179 .318 .033 -.089 -.078 .035 (-) (.649) (.064) (-) (.310) (.037) (-.081) (-.074) (.023) BIOGRAPHICAL SKETCH Ai, iv'a Roy was born in Kolkata, India, on the 24th of December, 1978. She graduated with a bachelor's degree from Calcutta University in 2000 with first class honours in statistics. She then joined the Indian Statistical Institute, from where she received her Master of Statistics degree in 2002. She moved to the University of Florida at Gainesville in the fall of 2002 to pursue her doctoral studies in statistics. Upon graduation, she joined the University of N. Li o-1; at Lincoln as an Assistant Professor in the Department of Statistics. Rao, J. N. K. (1986), "Synthetic Estimators, SPREE and Model Based Predictors," in Proceedings of the Conference on Son, ,,1 Research Methods in Ay,:. Jll,,., U. S. Department of Agriculture,, Washington, D. C., pp. 1-16. 
— (1999), "Some recent advances in model-based small area estimation," Survey Methodology, 25, 175-186.

— (2003), Small Area Estimation, John Wiley and Sons.

Rao, J. N. K. and Yu, M. (1992), "Small area estimation by combining time series and cross-sectional data," ASA Proceedings of the Survey Research Methods Section, 1-19.

Schaible, W. L. (1978), "Choosing weights for composite estimators for small area statistics," ASA Proceedings of the Section on Survey Research Methods, 741-746.

Zaslavsky, A. M., Schenker, N., and Belin, T. R. (2001), "Downweighting Influential Clusters in Surveys: Application to the 1990 Post Enumeration Survey," Journal of the American Statistical Association, 96.

Zhao, L. P. and Prentice, R. L. (1990), "Correlated binary regression using a quadratic exponential model," Biometrika, 77, 642-648.

Hence from Eq. A-16, and the fact that D̂_i² = V_i + Â = V_i + A + O_p(m^{-1/2}),

E[(Â − A)²(y_i − x_i^T b)² I[|y_i − x_i^T b| ≤ K D_i]]
= E(Â − A)² E[(y_i − x_i^T b)² I[|y_i − x_i^T b| ≤ K D_i]] + o(m^{-1})
= 2[Σ_{j=1}^m (V_j + A)^{-2}]^{-1}(V_i + A) E[Z² I[|Z| ≤ K]]
= 2[Σ_{j=1}^m (V_j + A)^{-2}]^{-1}(V_i + A)[2Φ(K) − 1 − 2Kφ(K)] + o(m^{-1}).   (A-17)

From Eq. A-13 and Eq. A-17,

E[(B̂_i − B_i)²(y_i − x_i^T b̂)² I[|y_i − x_i^T b̂| ≤ K D̂_i]]
= 2B_i²(1 − B_i)² A^{-2} D_i² [Σ_{j=1}^m (V_j + A)^{-2}]^{-1}[2Φ(K) − 1 − 2Kφ(K)] + o(m^{-1}).   (A-18)

1.2 Overview of SAE Techniques

As mentioned earlier, one of the fundamental problems in SAE is to find reliable estimators of the characteristics of interest for a small area. Some of the newer methods of estimation are the empirical Bayes (EB), hierarchical Bayes (HB) and empirical best linear unbiased prediction (EBLUP) procedures, which have made a significant impact on small area estimation. In this context, the work of Fay and Herriot (1979) is notable. Fay and Herriot (1979) suggested the use of the James-Stein estimator in SAE. Their objective was to estimate the per capita income (PCI) based on the 1970 US Population Census for several small places, many of which had fewer than 500 people.
Earlier, in the Bureau of the Census, the general formula for estimating PCI in the current year was the product of the PCI value for the previous year and the ratio of the administrative estimate of PCI for the place in 1970 to the derived estimate of PCI for 1969. But this method had many drawbacks, resulting from ignoring available auxiliary variables and from small coverage of small places, thereby inflating the standard errors of the direct estimates. The Fay and Herriot model is as follows:

y_i | θ_i ~ N(θ_i, V_i) and θ_i ~ N(x_i^T b, A), i = 1, …, m,   (1-1)

where the V_i and the auxiliary variables x_i are all known but A is unknown. In this case, let

b̂ = (X^T Σ^{-1} X)^{-1} X^T Σ^{-1} Y   (1-2)

be the regression estimator of b, where Σ = Diag(V_1 + A, …, V_m + A). Then the suggested estimator for the ith small area was

θ̂_i = [A*/(A* + V_i)] y_i + [V_i/(A* + V_i)] x_i^T b̂.   (1-3)

Here A* is the estimate of the unknown variance A, determined from the equation

Σ_{i=1}^m (y_i − x_i^T b̂)²/(A + V_i) = m − p,

where p is the number of auxiliary variables in the regression model and m > p. A* was found by an iterative procedure with the constraint A* ≥ 0.

we get

∫ π(b, σ, v | D_obs) db ≤ ∫ {∫_{R^q} ∏_{(i,j)∈I_1} 1{u_ij ≤ τ_1ij} ∏_{(i,j)∈I_2} 1{u_ij ≤ τ_2ij} db} dF_ℓ,   (B-6)

where dF_ℓ = d(∏_{(i,j)∈I_1}(−f_1(τ_1ij)) ∏_{(i,j)∈I_2}(−f_2(τ_2ij)) ∏_{(i,j)∈I_3}(−f_3(τ_1ij))) for ℓ = 1, …, 4. Now from Lemma 4.1 of Chen and Shao (2000), we know that under conditions (C1) and (C2), ‖η‖ ≤ K‖τ‖, where K is a constant. But ‖η‖ ≤ K‖τ‖ implies ‖b‖ ≤ K‖τ‖ and ‖v‖ ≤ K‖τ‖. Then

∫ {∫ 1{‖v‖ ≤ K‖τ‖} 1{‖b‖ ≤ K‖τ‖} db} dF_ℓ ≤ K ∫ ‖τ‖^q 1{‖v‖ ≤ K‖τ‖} dF_ℓ.   (B-7)

Then from Eq. B-5 we get

∫ π(b, σ, v | D_obs) db ≤ K ∏_{i=1}^k ∏_{j=1}^{n_i} f(x_ij,obs | α) ∫ σ^{−(d+k)} ‖τ‖^q 1{‖v‖ ≤ K‖τ‖} dF_ℓ,   (B-8)

where b̂ = (X^T Σ^{-1} X)^{-1} X^T Σ^{-1} y, Σ = Diag(V_1 + A, …, V_m + A). The same estimator is obtained as an EB estimator of b when one instead estimates A from the marginal independent N(x_i^T b, A + V_i) distributions of the y_i.
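The iterative determination of A* from the moment equation above can be sketched as follows. This is an illustrative fixed-point scheme under the constraint A* ≥ 0, not necessarily the exact algorithm used by Fay and Herriot; `fay_herriot_A` is a hypothetical helper name.

```python
import numpy as np

def fay_herriot_A(y, X, V, tol=1e-8, max_iter=200):
    """Alternate between the GLS estimate of b and a fixed-point update of A
    until sum_i (y_i - x_i^T b_hat)^2 / (V_i + A) = m - p, subject to A >= 0."""
    m, p = X.shape
    A = max(np.var(y) - V.mean(), 0.1)    # crude starting value
    for _ in range(max_iter):
        w = 1.0 / (V + A)
        b_hat = np.linalg.solve((X.T * w) @ X, X.T @ (w * y))
        r2 = (y - X @ b_hat) ** 2
        # move A in the direction that drives the moment equation to zero
        A_new = max(A + (np.sum(w * r2) - (m - p)) / np.sum(w), 0.0)
        if abs(A_new - A) < tol:
            return A_new, b_hat
        A = A_new
    return A, b_hat
```

The update step behaves like a scoring iteration near the root, so convergence is typically fast; when the moment equation has no positive root, the iteration is absorbed at A* = 0.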
As discussed in the introduction, the above EB (or HB) estimator can often over- or under-shrink the direct estimators y_i towards the regression estimators x_i^T b̂. What we propose now is a robust Bayes procedure which is intended to guard against such model misspecification. To this end, we begin by finding the influence of (y_i, x_i), i = 1, …, m, in the context of estimating the parameter b. The derivation of the influence function is based on a general divergence measure, as introduced in Amari (1982) and Cressie and Read (1984). For two densities f_1 and f_2, this general divergence measure is given by

D_λ(f_1, f_2) = [λ(λ + 1)]^{-1} E_{f_1}[{f_1(X)/f_2(X)}^λ − 1].   (2-5)

The above divergence measure should be interpreted as its limiting value when λ → 0 or λ → −1. We may note that D_λ(f_1, f_2) is not necessarily symmetric in f_1 and f_2, but symmetry can always be achieved by considering ½[D_λ(f_1, f_2) + D_λ(f_2, f_1)]. Also, it may be noted that as λ → 0, D_λ(f_1, f_2) → E_{f_1}[log{f_1(X)/f_2(X)}], while as λ → −1, D_λ(f_1, f_2) → E_{f_2}[log{f_2(X)/f_1(X)}]. These are the two Kullback-Leibler divergence measures. Also, D_{−1/2}(f_1, f_2) = 4(1 − ∫ √(f_1 f_2)) = 2H²(f_1, f_2), where H(f_1, f_2) = {2(1 − ∫ √(f_1 f_2))}^{1/2} is the Hellinger divergence measure.

In the present context, we consider the divergence between the densities of b̂ and b̂_(−i), where b̂_(−i) is the EB (or HB) estimator of b under the assumed model given in Eq. 2-2 with (y_i, x_i) removed. To this end, we first prove a general divergence result involving

TABLE OF CONTENTS

                                                                           page
ACKNOWLEDGMENTS .............................................................. 4
LIST OF TABLES ............................................................... 8
ABSTRACT ..................................................................... 9

CHAPTER

1  REVIEW OF SMALL AREA TECHNIQUES .......................................... 10
   1.1  Introduction ........................................................ 10
   1.2  Overview of SAE Techniques .......................................... 11
   1.3  Analysis of Discrete Data in SAE .................................... 17
   1.4  Robust Estimation in SAE ............................................ 19
   1.5  Layout of this Dissertation ......................................... 21

2  INFLUENCE FUNCTIONS AND ROBUST BAYES AND EMPIRICAL BAYES
   SMALL AREA ESTIMATION ................................................... 22
   2.1  Introduction ........................................................ 22
   2.2  Influence Functions and Robust Bayes Estimation ..................... 24
   2.3  Robust Empirical Bayes Estimation and MSE Evaluation ................ 31
   2.4  MSE Estimation ...................................................... 33
   2.5  Simulation Study .................................................... 37

3  EMPIRICAL BAYES ESTIMATION FOR BIVARIATE BINARY DATA
   WITH APPLICATIONS TO SMALL AREA ESTIMATION .............................. 40
   3.1  Introduction ........................................................ 40
   3.2  Bayes and Empirical Bayes Estimators ................................ 40
   3.3  MSE Approximation ................................................... 44
   3.4  Estimation of the MSE ............................................... 49
   3.5  Simulation Study .................................................... 55

4  HIERARCHICAL BAYES ESTIMATION FOR BIVARIATE BINARY DATA
   WITH APPLICATIONS TO SMALL AREA ESTIMATION .............................. 58
   4.1  Introduction ........................................................ 58
   4.2  Hierarchical Bayes Method ........................................... 58
   4.3  Data Analysis ....................................................... 64

5  SUMMARY AND FUTURE RESEARCH ............................................. 74

APPENDIX
B  PROOFS OF THEOREMS 4-1 AND 4-2 .......................................... 85
REFERENCES ................................................................. 93
BIOGRAPHICAL SKETCH ........................................................ 97

where

M_1 = [ X^T Σ^{-2} X    0_{p×1} ;
        0_{1×p}         tr(Σ^{-3}) ],
M_2 = [ 0_{p×p}    0_{p×1} ;
        0_{1×p}    −½ tr(Σ^{-3}) ],   (2-21)

I(A) = E(−∂² log L/∂(b, A) ∂(b, A)^T) = [ X^T Σ^{-1} X    0_{p×1} ;
                                          0_{1×p}         ½ tr(Σ^{-2}) ],   (2-22)

so that I^{-1} = block diag{(X^T Σ^{-1} X)^{-1}, 2[tr(Σ^{-2})]^{-1}}, K(A) = tr(I^{-1} M_1), J(A) = tr(I^{-1} M_2), and

K(A) + 2J(A) = tr[(X^T Σ^{-1} X)^{-1}(X^T Σ^{-2} X)] + {2 tr(Σ^{-3}) − 2 tr(Σ^{-3})}[tr(Σ^{-2})]^{-1}
             = tr[(X^T Σ^{-1} X)^{-1}(X^T Σ^{-2} X)].   (2-23)

The Lemma now follows from Eq. 2-20 and Eq. 2-23.

Table 4-3 provides the posterior standard deviations of the HB estimates, denoted by SE1 and SE2 respectively, and the posterior correlation, denoted by CORR. In addition, we provide 95% highest posterior density (HPD) intervals associated with the HB estimates. These HPD intervals are denoted by I1 and I2. We used the posterior predictive p-values of Gelman et al. (2003, Ch. 6) for assessing the model fit. The posterior predictive p-value for the ith area is defined as pp_i = Pr(T(y^rep, θ_i) ≥ T(y, θ_i) | D_obs), where y^rep denotes replicated data that could have been observed, and T(y, θ_i) = (θ_i − E_y[θ_i])^T Σ_y^{-1}(θ_i − E_y[θ_i]), θ_i = (θ_i1, θ_i2)^T, and E_y[θ_i] and Σ_y are the posterior mean and variance-covariance matrix of θ_i based on the observed data y, for i = 1, 2, …, k = 48.
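A Monte Carlo version of this check can be sketched as follows. The draws below are hypothetical stand-ins for the replicated and realized discrepancies, and `posterior_pp_value` is an illustrative helper name, not code from the dissertation.

```python
import numpy as np

def posterior_pp_value(T_rep, T_obs):
    """Estimate pp = Pr(T(y_rep, theta) >= T(y, theta) | D_obs) by the
    proportion of posterior draws whose replicated discrepancy exceeds
    the realized one."""
    return float(np.mean(np.asarray(T_rep) >= np.asarray(T_obs)))

# hypothetical chi-square-type discrepancies from N = 2000 posterior draws
rng = np.random.default_rng(1)
T_rep = rng.chisquare(df=2, size=2000)   # T(y_rep, theta) per draw
T_obs = rng.chisquare(df=2, size=2000)   # T(y, theta) per draw
pp = posterior_pp_value(T_rep, T_obs)
# a well-calibrated model gives pp near 0.5; values near 0 or 1 signal misfit
```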
Here the p-value is estimated by the proportion of cases in which the simulated discrepancy variable T(y^rep, θ_i) exceeds the realized value T(y, θ_i):

pp_i ≈ N^{-1} Σ_{n=1}^N I(T(y^rep(n), θ_i^(n)) ≥ T(y, θ_i^(n))),  i = 1, …, 48,

N being the number of replications of the data. The posterior predictive p-values are also shown in Table 4-3. If the model is reasonably accurate, the replications should look similar to the observed data y. In the case where the data y are in conflict with the posited model, the T(y, θ_i) are expected to have extreme values, which in turn should give p-values close to 0 or 1. Hence, an ideal fit requires the posterior predictive p-values to be in the ballpark of 0.5. The figures provided in Table 4-3 show that this is indeed the case for our fitted model.

3.3 MSE Approximation

In this section we find the MSE matrix of θ̂_i^B = (θ̂_i1^B, θ̂_i2^B)^T, and also of θ̂_i^EB = (θ̂_i1^EB, θ̂_i2^EB)^T, which is second order correct, i.e., correct up to O(k^{-1}).

Theorem 3-1. Let U_k = Σ_{i=1}^k Σ_{j=1}^{n_i} [⋯ x_ij x_ij^T ⋯] = O(k). Then

MSE(θ̂_i^EB) = A(A + 1)^{-2} A_i(b) + A²(1 + A)^{-2} B_i(b) + o(k^{-1}),

where

A_i(b) = n_i^{-2} Σ_{j=1}^{n_i}
[ (m_1ij + m_3ij)(1 − m_1ij − m_3ij)        m_3ij − (m_1ij + m_3ij)(m_2ij + m_3ij) ;
  m_3ij − (m_1ij + m_3ij)(m_2ij + m_3ij)    (m_2ij + m_3ij)(1 − m_2ij − m_3ij) ]

and

B_i(b) = n_i^{-2} Σ_j Σ_j' [1 0 1; 0 1 1] (∂m_ij(b)/∂b)^T U_k^{-1} (∂m_ij'(b)/∂b) [1 0; 0 1; 1 1].

Proof: Note that for i = 1, …, k,

MSE(θ̂_i^EB) = E[(θ̂_i^EB − θ_i)(θ̂_i^EB − θ_i)^T]
= E[(θ̂_i^B − θ_i + θ̂_i^EB − θ̂_i^B)(θ̂_i^B − θ_i + θ̂_i^EB − θ̂_i^B)^T]
= E[(θ̂_i^B − θ_i)(θ̂_i^B − θ_i)^T] + E[(θ̂_i^EB − θ̂_i^B)(θ̂_i^EB − θ̂_i^B)^T]
= MSE(θ̂_i^B) + E[(θ̂_i^EB − θ̂_i^B)(θ̂_i^EB − θ̂_i^B)^T].

We first observe that for j ≠ j', 1 ≤ l, l' ≤ 3,

E[{(A + 1)^{-1}(z_lij − p_lij) − A(A + 1)^{-1}(p_lij − m_lij)}
  × {(A + 1)^{-1}(z_l'ij' − p_l'ij') − A(A + 1)^{-1}(p_l'ij' − m_l'ij')}] = 0.   (3-14)

Proof of Theorem 4-2: We have

L(α_2, α_3, α_6 | D_obs)
= ∏_{(i,j)∈J_1} f(x_ij6 | α_6) exp(u_ij^T γ) / [1 + exp(u_2ij^T γ) + exp(u_3ij^T γ)]
× ∏_{(i,j)∈J_2} Σ_{x_ij6,mis} f(x_ij6 | α_6) exp(u_ij^T γ) / [1 + exp(u_2ij^T γ) + exp(u_3ij^T γ)]
× ∏_{(i,j)∈J_3} Σ_{x_ij,mis} f(x_ij6 | α_6) exp(u_ij^T γ) / [1 + exp(u_2ij^T γ) + exp(u_3ij^T γ)],   (B-11)

where γ = (α_2^T, α_3^T)^T. For 1 ≤ r ≤ 8, let ū_ijr = 1 if γ_r > 0 and ū_ijr = 0 if γ_r ≤ 0, for (i,j) ∈ J_1;
ū_ijr = 0 if γ_r > 0 and ū_ijr = 1 if γ_r ≤ 0 for (i,j) ∈ J_1 ∪ J_3; ũ_ijr = 1 if γ_r > 0 and ũ_ijr = 0 if γ_r ≤ 0 for (i,j) ∈ J_2; and ũ_ijr = 0 if γ_r > 0 and ũ_ijr = 1 if γ_r ≤ 0 for (i,j) ∈ J_2 ∪ J_3. Then

L(α_2, α_3, α_6 | D_obs)
≤ ∏_{(i,j)∈J_1} f(x_ij6 | α_6) exp(ū_ij^T γ) / [1 + exp(ū_2ij^T γ) + exp(ū_3ij^T γ)]
× ∏_{(i,j)∈J_2} exp(ũ_ij^T γ) / [1 + exp(ũ_2ij^T γ) + exp(ũ_3ij^T γ)]
× ∏_{(i,j)∈J_3} exp(ũ_ij^T γ) / [1 + exp(ũ_2ij^T γ) + exp(ũ_3ij^T γ)],   (B-12)

where the last step follows since Σ_{x_ij6,mis} f(x_ij6 | α_6) = 1. Now, let

L̄(α_2, α_3 | D_obs)
= ∏_{(i,j)∈J_1} exp(ū_ij^T γ) / [1 + exp(ū_2ij^T γ) + exp(ū_3ij^T γ)]
× ∏_{(i,j)∈J_2} exp(ũ_ij^T γ) / [1 + exp(ũ_2ij^T γ) + exp(ũ_3ij^T γ)]
× ∏_{(i,j)∈J_3} exp(ũ_ij^T γ) / [1 + exp(ũ_2ij^T γ) + exp(ũ_3ij^T γ)],   (B-13)

and, with α^T = (α_1^T, …, α_r^T), write the joint pdf of x_ij,mis as

f(x_ij,mis | x_ij,obs, α) = f(x_ij,mis^(r) | x_ij,obs, x_ij,mis^(1), …, x_ij,mis^(r−1), α_r)
× f(x_ij,mis^(r−1) | x_ij,obs, x_ij,mis^(1), …, x_ij,mis^(r−2), α_{r−1}) × ⋯ × f(x_ij,mis^(1) | x_ij,obs, α_1).   (4-5)

It is important to note that since we are assuming MAR missing covariates, we do not need to model the missing-data mechanism. Write z_ij = (z_ij1, …, z_ij4)^T and b = (b_1^T, b_2^T, b_3^T, b_4^T)^T. Let D = (z_ij, x_ij, j = 1, 2, …, n_i, i = 1, 2, …, k) denote the complete data. Also, let D_obs = (z_ij, x_ij,obs, j = 1, 2, …, n_i, i = 1, 2, …, k) denote the observed data. Then the complete-data likelihood under the hierarchical model is given by

L(b, τ², α | D) = ∏_{i=1}^k ∏_{j=1}^{n_i} {∏_{l=1}^4 z_ijl!}^{-1} exp{Σ_{l=1}^4 z_ijl(x_ij^T b_l + v_i)} exp{−Σ_{l=1}^4 exp(x_ij^T b_l + v_i)}
× ∏_{i=1}^k ∏_{j=1}^{n_i} f(x_ij,obs, x_ij,mis | α),   (4-6)

and the observed-data likelihood is of the form

L(b, τ², α | D_obs) = ∏_{i=1}^k ∫ [∏_{j=1}^{n_i} Σ_{x_ij,mis} {∏_{l=1}^4 z_ijl!}^{-1} exp{Σ_{l=1}^4 z_ijl(x_ij^T b_l + v_i)}
× exp{−Σ_{l=1}^4 exp(x_ij^T b_l + v_i)} f(x_ij,obs, x_ij,mis | α)] exp(−v_i²/2τ²) dμ(v_i),   (4-7)

where μ is some σ-finite measure, typically the Lebesgue measure or the counting measure. We assume that the b_l, l = 1, …, 4, τ² and α are a priori mutually independent, with b_l ~ uniform(R^q) for l = 1, …, 4 and τ² ~ Inverse Gamma(c, d), where the Inverse Gamma(c, d) pdf is

where g_1i(A) = V_i(1 − B_i), g_2i(A) = 2B_i² D_i²{(1 + K²)Φ(−K) − Kφ(K)}, and g_3im(A) = B_i² x_i^T(X^T Σ^{-1} X)^{-1} x_i, the expectation being taken over the joint distribution of y and θ.

Proof: E(θ̂_i^RHB − θ_i)² = E(θ̂_i^B − θ_i + θ̂_i^RHB − θ̂_i^B)² = E(θ̂_i^B − θ_i)² + E(θ̂_i^RHB − θ̂_i^B)².
(2-13)

Now,

θ̂_i^RHB − θ̂_i^B = y_i − B_i D_i ψ_K{(y_i − x_i^T b̂)/D_i} − {y_i − B_i(y_i − x_i^T b̂)}
= −B_i [D_i ψ_K{(y_i − x_i^T b̂)/D_i} − (y_i − x_i^T b̂)],

where ψ_K(t) = max(−K, min(t, K)) denotes the truncation function. It is easy to check that E{x_i^T(b̂ − b)}² = x_i^T(X^T Σ^{-1} X)^{-1} x_i. Also, noting that (y_i − x_i^T b)/D_i ~ N(0, 1), and is thus distributed symmetrically about zero, one gets

E[{(y_i − x_i^T b)/D_i − ψ_K((y_i − x_i^T b)/D_i)}²] = E{(Z − K)² I[Z > K] + (Z + K)² I[Z < −K]}   (2-14)
= 2E{(Z − K)² I[Z > K]} = 2{(1 + K²)Φ(−K) − Kφ(K)},   (2-15)

where Z denotes the N(0, 1) variable. Now, by the independence of b̂ and y_i − x_i^T b and the fact that E(b̂) = b, one gets the result from Eq. 2-13-Eq. 2-15.

It may be noted that if the assumed model Eq. 2-2 is correct, then the Bayes risk of the regular EB estimator θ̂_i^EB is g_1i(A) + g_3im(A). Hence, the excess Bayes risk of θ̂_i^REB over θ̂_i^EB under the assumed model is g_2i(A), which is strictly decreasing in K and, as intuitively expected, tends to zero as K → ∞. Hence, setting a value, say M, for g_2i(A) will determine the value of K. Thus, the choice of K will be based on a tradeoff between robustness and efficiency.

We use the approximation

Â − A ≈ [Σ_{j=1}^m (V_j + A)^{-2}]^{-1} Σ_{j=1}^m (V_j + A)^{-2}{(y_j − x_j^T b)² − (V_j + A)},

and write

(y_i − x_i^T b̂)² = [(y_i − x_i^T b) − x_i^T(b̂ − b)]² = (y_i − x_i^T b)² + O_p(m^{-1/2}).   (A-15)

Also, I[|y_i − x_i^T b̂| ≤ K D̂_i] = I[|y_i − x_i^T b| ≤ K D_i] + o_p(m^{-r}) for arbitrarily large r > 0. Since (Â − A)² = O_p(m^{-1}), we now get

E[(Â − A)²(y_i − x_i^T b̂)² I[|y_i − x_i^T b̂| ≤ K D̂_i]] = E[(Â − A)²(y_i − x_i^T b)² I[|y_i − x_i^T b| ≤ K D_i]] + o(m^{-1}).   (A-16)

Next, by the mutual independence of the y_i − x_i^T b, i = 1, …, m, and assumption (i) of Theorem 2-3,

Cov[(Â − A)², (y_i − x_i^T b)² I[|y_i − x_i^T b| ≤ K D_i]]
= [Σ_{j=1}^m (V_j + A)^{-2}]^{-2} Cov[{Σ_{j=1}^m (V_j + A)^{-2}{(y_j − x_j^T b)² − (V_j + A)}}², (y_i − x_i^T b)² I[|y_i − x_i^T b| ≤ K D_i]]
= [Σ_{j=1}^m (V_j + A)^{-2}]^{-2} Cov[(V_i + A)^{-4}{(y_i − x_i^T b)² − (V_i + A)}², (y_i − x_i^T b)² I[|y_i − x_i^T b| ≤ K D_i]] = O(m^{-2}).
(A-14)

Following Cox and Snell (1968), the bias E(b̂ − b) is, to the order considered, a linear combination of the traces

tr(U_k^{-1} K_11), …, tr(U_k^{-1} K_1p), tr(U_k^{-1} K_31), …, tr(U_k^{-1} K_3p),
tr(U_k^{-1} J_11), …, tr(U_k^{-1} J_1p), tr(U_k^{-1} J_31), …, tr(U_k^{-1} J_3p),   (3-37)

where the K_lr and J_lr are built from the second-order partial derivatives ∂²m_lij(b)/∂b_r ∂b_s.   (3-38)

Also,

V_ij = A²(1 + A)^{-2}
[ (∂m̄_1ij/∂b)^T U_k^{-1} (∂m̄_1ij/∂b)    (∂m̄_1ij/∂b)^T U_k^{-1} (∂m̄_2ij/∂b) ;
  (∂m̄_2ij/∂b)^T U_k^{-1} (∂m̄_1ij/∂b)    (∂m̄_2ij/∂b)^T U_k^{-1} (∂m̄_2ij/∂b) ],   (3-28)

with m̄_1ij = m_1ij + m_3ij and m̄_2ij = m_2ij + m_3ij, and the covariance matrix of the multinomial cell indicators is

[ m_1ij(1 − m_1ij)     −m_1ij m_2ij        −m_1ij m_3ij ;
  −m_1ij m_2ij         m_2ij(1 − m_2ij)    −m_2ij m_3ij ;
  −m_1ij m_3ij         −m_2ij m_3ij        m_3ij(1 − m_3ij) ].   (3-29)

By Eq. 3-23-Eq. 3-30,

E[(θ̂_i^EB − θ̂_i^B)(θ̂_i^EB − θ̂_i^B)^T]
= A²(1 + A)^{-2} n_i^{-2} Σ_j Σ_j' [1 0 1; 0 1 1] (∂m_ij(b)/∂b)^T U_k^{-1} (∂m_ij'(b)/∂b) [1 0; 0 1; 1 1] + o(k^{-1}),   (3-32)

where ∂m_ij(b)/∂b = (∂m_1ij(b)/∂b, ∂m_2ij(b)/∂b, ∂m_3ij(b)/∂b). Hence,

MSE(θ̂_i^EB) = MSE(θ̂_i^B)
+ A²(1 + A)^{-2} n_i^{-2} Σ_j Σ_j' [1 0 1; 0 1 1] (∂m_ij(b)/∂b)^T U_k^{-1} (∂m_ij'(b)/∂b) [1 0; 0 1; 1 1] + o(k^{-1}).   (3-33)

The theorem follows.

Now, by the independence of b̂ with (Â, y_i − x_i^T b), we get

E[(θ̃_i^REB − θ̂_i^REB)(θ̂_i^REB − θ̂_i^B)]
= (K B_i/2) E[(Â − A)(V_i + A)^{-1/2}{(y_i − x_i^T b̂ − K D̂_i) I[y_i − x_i^T b̂ > K D̂_i]
− (y_i − x_i^T b̂ + K D̂_i) I[y_i − x_i^T b̂ < −K D̂_i]}] + o(m^{-1}).   (A-23)

With the approximation given in Eq. A-14, and the independence of b̂ with y_i − x_i^T b,

E[(Â − A)(V_i + A)^{-1/2}{(y_i − x_i^T b̂ − K D̂_i) I[y_i − x_i^T b̂ > K D̂_i] − (y_i − x_i^T b̂ + K D̂_i) I[y_i − x_i^T b̂ < −K D̂_i]}]
= E[{Σ_{j=1}^m (V_j + A)^{-2}}^{-1} Σ_{j=1}^m (V_j + A)^{-2}{(y_j − x_j^T b)² − (V_j + A)} (V_i + A)^{-1/2}
× {(y_i − x_i^T b − K D_i) I[y_i − x_i^T b > K D_i] − (y_i − x_i^T b + K D_i) I[y_i − x_i^T b < −K D_i]}] + o(m^{-1})
= {Σ_{j=1}^m (V_j + A)^{-2}}^{-1}(V_i + A)^{-1} E[(Z² − 1)(Z − K) I[Z > K] − (Z² − 1)(Z + K) I[Z < −K]] + o(m^{-1}),   (A-24)

where Z ~ N(0, 1). The idea behind proposing this estimator followed from the James-Stein estimator, which has the property of dominating the sample mean with respect to squared error loss for m > 3.
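This dominance can be illustrated by a small simulation comparing the simulated risks of the direct estimator Y and a James-Stein estimator that shrinks toward the grand mean. The values below are hypothetical and the sketch is only meant to exhibit the phenomenon, not to reproduce any result in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(2)
m, V, R = 10, 1.0, 4000
theta = rng.normal(0.0, 2.0, size=m)    # fixed (hypothetical) true means

risk_direct = risk_js = 0.0
for _ in range(R):
    y = theta + rng.normal(0.0, np.sqrt(V), size=m)
    ybar = y.mean()
    S = np.sum((y - ybar) ** 2)
    c = (m - 3) * V / S                  # shrinkage factor toward the grand mean
    js = (1.0 - c) * y + c * ybar
    risk_direct += np.sum((y - theta) ** 2)
    risk_js += np.sum((js - theta) ** 2)
risk_direct /= R                          # simulated risk of the direct estimator
risk_js /= R                              # simulated risk of James-Stein
# for this setup risk_js falls below risk_direct, reflecting the dominance result
```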
Under the model (1-1) with V_i = V for all i and A known, the James-Stein estimator is given by

θ̂_i = (1 − (m − p − 2)V/S) Y_i + ((m − p − 2)V/S) Y_i*,   (1-4)

where

S = Σ_{i=1}^m (Y_i − Y_i*)².   (1-5)

We observe here that (1-3) is just the EB estimator under model (1-1). Following the argument of Efron and Morris (1971, 1972), who remarked that EB estimators such as (1-4) may perform well overall but poorly on individual components in case of departure from the assumed prior, Fay and Herriot proposed the final EB estimators as

θ̂_i^0 = θ̂_i          if Y_i − √V_i ≤ θ̂_i ≤ Y_i + √V_i,
     = Y_i − √V_i   if θ̂_i < Y_i − √V_i,
     = Y_i + √V_i   if θ̂_i > Y_i + √V_i.

There are other approaches to estimation of the unknown A. Prasad and Rao (1990) proposed the use of unweighted least squares and the method of moments to estimate the regression coefficients and the variance A, respectively. Datta and Lahiri (2000) suggested the use of maximum likelihood and residual maximum likelihood to estimate the same.

The Fay-Herriot estimator is an example of a composite estimator. A composite estimator is a weighted average of the direct survey estimator and some regression estimator, in the presence of auxiliary information; such estimators are called model-based estimators (e.g., the Fay-Herriot estimator). Otherwise, it is a weighted average of the overall mean and the sample mean for that small area; these are called design-based estimators. There have been many proposed approaches regarding the choice of optimal

LIST OF TABLES

Table                                                                       page
2-1  Relative biases of the point estimates and relative bias of MSE
     estimates ............................................................. 39
3-1  Relative biases of the point estimates and relative bias of MSE
     estimates for the small area mean: the first variable ................. 57
3-2  Relative biases of the point estimates and relative bias of MSE
     estimates for the small area mean: the second variable ................ 57
4-1  Bivariate HB estimates along with the direct estimates and the
     true values ........................................................... 70
4-2  Measures of precision ................................................. 71
4-3  Posterior variance, correlation and 95%
     HPDs along with predictive p-values ................................... 73

weights in the latter case. Some of the works include Schaible (1978) and Drew, Singh, and Choudhry (1982).

Another kind of estimation in vogue in SAE is synthetic estimation. Brackstone (1987) described synthetic estimates as follows: "An unbiased estimate is obtained from a sample survey for a large area; when this estimate is used to derive estimates for subareas under the assumption that the small areas have the same characteristics as the large area, we identify these estimates as synthetic estimates." Ghosh and Rao (1994) provided model-based justifications for some of the synthetic and composite estimators in common use.

One of the simplest models in common use is the nested error unit-level regression model, employed originally by Battese et al. (1988) for predicting areas under corn and soybeans in 12 counties in Iowa, USA. They used a variance components model, as suggested by Harter and Fuller (1987) in the context of SAE. They used the model

y_ij = x_ij^T b + v_i + e_ij,  j = 1, …, n_i, i = 1, …, T,   (1-6)

where the v_i and e_ij are mutually independent error terms with zero means and variances σ_v² and σ_e², respectively. Here n_i samples are drawn from areas with N_i observations. The random term v_i represents the joint effect of area characteristics that are not accounted for by the concomitant variables x_ij. Under the model, the small area mean, denoted by θ_i, is

θ_i = x̄_i(p)^T b + v_i,   (1-7)

where x̄_i(p) = N_i^{-1} Σ_{j=1}^{N_i} x_ij. Let u_ij = v_i + e_ij, u = (u_1, …, u_T)^T and E(uu^T) = V = block diag(V_1, V_2, …, V_T), where

V_i = σ_v² J_{n_i} + σ_e² I_{n_i},   (1-8)

ACKNOWLEDGMENTS

I would like to extend my sincerest gratitude to my advisor, Professor Malay Ghosh, for his invaluable guidance and constant support throughout my graduate studies. It was a privilege to have him as my advisor, who was always available to help. I would also like to thank Professors Brett Presnell, Ronald Randles, Michael Daniels and Murali Rao for serving on my committee.
My special thanks to Professors Dalho Kim and Ming-Hui Chen for helping me with computation. I am indebted to my professors from my undergraduate days, whose teaching inspired me to pursue higher studies in Statistics. I would especially like to thank Professor Bhaskar Bose from my higher secondary days. He introduced me to Statistics and helped me develop a basic understanding of the subject. My sincerest thanks to Professor Indranil Mukherjee for his constant encouragement during my undergraduate days. The extent of his knowledge and problem-solving capabilities have always inspired me. During my stay at the University of Florida, I have been lucky to have marvelous teachers like Professors Malay Ghosh, Brett Presnell, Andre Khuri and Ronald Randles. They have taught me a lot and will always inspire me to be a good teacher and researcher. I would like to thank my friends and classmates in the department, especially George and Mihai, for helping me whenever I needed it. Life would have been really different in Gainesville if not for friends like Vivekananda, Siulidi and Bharati. I am grateful for their support, guidance and, well, just for being there. I have been fortunate to have come to know and grow close to people like Dolakakima, Bhramardi and little Munai. Their genuine affection and warmth, along with the love and support of my friends, and, of course, 'Asha Parivar', has made Gainesville a second home to me. I don't think any of this would have been possible if not for the encouragement and support of my family. My heartfelt thanks to my parents, who have always supported me in all the choices I have made and encouraged me to move forward. I owe everything that I have achieved to my mother, who has always been my inspiration. Last, but not the


ABSTRACT

The topic of this dissertation focuses on the formulation of empirical and hierarchical Bayesian techniques in the context of small area estimation.

In the first part of the dissertation, we consider robust Bayes and empirical Bayes (EB) procedures for estimation of small area means. We have introduced the notion of influence functions in the small area context, and have developed robust Bayes and EB estimators based on such influence functions. We have derived an expression for the predictive influence function, and based on a standardized version of the same, we have proposed some new small area estimators. The mean squared errors and estimated mean squared errors of these estimators have also been found. The findings are validated by a simulation study.

In the second part, we have considered small area estimation for bivariate binary data in the presence of covariates. We have developed EB estimators along with their mean squared errors. In this case the covariates were assumed to be completely observed. In the presence of missing covariates, we have developed a hierarchical Bayes approach for bivariate binary responses. Under the assumption that the missing mechanism is missing at random (MAR), we have suggested methods to estimate the small area means along with the associated standard errors. Our findings have been supported by appropriate data analyses.

CHAPTER 1
REVIEW OF SMALL AREA TECHNIQUES

1.1 Introduction

The use of small area statistics goes back to 11th century England and 17th century Canada (Brackstone 1987). However, these statistics were all based either on a census or on administrative records aiming at complete enumeration.

The problem of SAE is twofold. First is the fundamental issue of producing reliable estimates of parameters of interest (e.g., means, counts, quantiles, etc.) for small domains. The second problem is to assess the estimation error. In the absence of sufficient samples from the small areas, the only option is to borrow strength from the neighboring areas, i.e., those with similar characteristics. In Section 2, we outline the advances in SAE over the past few decades and some popular works significant to the enhancement of the techniques in this field.
10 PAGE 11 FayandHerriot ( 1979 )isnotable. FayandHerriot ( 1979 )suggestedtheuseoftheJames-SteinEstimatorinSAE.Theirobjectivewastoestimatethepercapitaincome(PCI)basedontheUSPopulationCensus1970forseveralsmallplaces,manyofwhichhadfewerthan500people.Earlier,intheBureauoftheCensus,thegeneralformulaforestimationofPCIinthecurrentyearwasgivenbyproductofthePCIvaluesofthepreviousyearandtheratioofadministrativeestimateofPCIfortheplacein1970tothederivedestimateforPCIin1969.Butthismethodhadmanydrawbacksresultingfromignoringthepresenceofsomeauxiliaryvariablesandalsoduetosmallcoverageforsmallplaces,therebyincreasingthestandarderrorsofthedirectestimates.TheFayandHerriotmodelisasfollows: whereVi'sandauxiliaryvariablesxi'sareallknownbutAisunknown.Inthiscase,let betheregressionestimatorofi'swhere=Diag(V1+A;;Vm+A).Then,thesuggestedestimatorfortheithsmallareawasgivenby ^i=A HereAistheestimateoftheunknownvarianceAdeterminedfromtheequationPi(YiYi)2 11 PAGE 12 1{1 )withVi=V8iandAknown,theJames-Steinestimatorisgivenby ^i=(1((mp2)V=S))Yi+((mp2)V=S)Yi;(1{4) where Weobserveherethatthe( 1{3 )isjusttheEBestimatorundermodel( 1{1 ).Followingtheargumentof EfronandMorris ( 1971 1972 ),whoremarkedthatEBestimatorssuchas( 1{4 )mayperformoverallbutpoorlyonindividualcomponentsincaseofdeparturefromtheassumedprior, FayandHerriot proposedthenalEBestimatorsas: ^0i=^iifYip PrasadandRao ( 1990 )proposedtheuseoftheunweightedleastsquaresandthemethodofmomentsapproachtoestimatetheregressioncoecientsandvarianceArespectively. DattaandLahiri ( 2000 )suggesttheuseofmaximumlikelihoodandresidualmaximumlikelihoodtoestimatethesame. 
TheFay-Herriotestimatorisanexampleofacompositeestimator.Acompositeestimatorisaweightedaverageofthedirectsurveyestimatorandsomeregressionestimator,inthepresenceofauxiliaryinformation.Thesearecalledmodel-basedestimators(e.g.,Fay-Herriotestimator).Otherwise,itisaweightedaverageoftheoverallmeanandthesamplemeanforthatsmallarea.Thesearecalledthedesign-basedestimators.Therehasbeenmanyproposedaproachesregardingthechoiceofoptimal 12 PAGE 13 Schaible ( 1978 )and Drew,Singh,andChoudhry ( 1982 ). Anotherkindofestimationinvogue,inSAE,issyntheticestimation. Brackstone ( 1987 )describedsyntheticestimatesasfollows:\Anunbiasedestimateisobtainedfromasamplesurveyforalargearea;whenthisestimateisusedtoderiveestimatesforsubareasundertheassumptionthatthesmallareashavethesamecharacteristicsasthelargearea,weidentifytheseestimatesassyntheticestimates". GhoshandRao ( 1994 )providedmodel-basedjusticationsforsomeofthesyntheticandcompositeestimatorsincommonuse. Oneofthesimplestmodelsincommonuseisthenestederrorunitlevelregressionmodel,employedoriginallyby Batteseetal. 
( 1988 )forpredictingareasundercornandsoybeansin12countiesinIowa,USA.Theyusedavariancecomponentsmodelassuggestedby HarterandFuller ( 1987 )inthecontextofSAE.Theyusedthemodel whereviandeijaremutuallyindependenterrortermswithzeromeansandvariances2vand2erespectively.HerenisamplesaredrawnfromareaswithNiobservations.Therandomtermvirepresentsthejointeectofareacharacteristicsthatarenotaccountedbytheconcomitantvariablesxij's.Underthemodel,thesmallareamean,denotedbyi,is wherexi(p)=N1iPNij=1xij.Letui=vi+eij,u=(u1;;uT)TandE(uuT)V=blockdiag(V1;V2;;VT), where 13 PAGE 14 ~b=(XTV1X)1XTV1Y:(1{9) Thenthebestlinearunbiasedpredictor(BLUP)fortheithsmallareamean,when(2v;2e)isknown,isgivenby ^i=xTi~b+(yi:xTi:~b)gi;(1{10) wherexi:=n1iPnij=1xijandgi=2v=(2v+n1i2e): PrasadandRao ( 1990 ),amongothers,providedetails.AnalternativemethodistondHierarchicalBayes(HB)predictorsbyspecifyingpriordistributionsfor,2vand2eandcomputingtheposteriordistributionsf(ijy;X)givenalltheobservationsinalltheareas( DattaandGhosh 1991 ).MarkovchainMonteCarlo(MCMC)techniquesareusedforsimulationsfromtheposterior. Aspointedoutearlier,anotherimportantaspectofSAE,apartfromtheestimationofparameters,isthecalculationoftheerrorassociatedwithestimationknownasmeansquarederror(MSE).Tothisend,wediscussheretheworkof PrasadandRao ( 1990 ). PrasadandRao pointedoutthatthemodelsof FayandHerriot ( 1979 )and Batteseetal. ( 1988 )areallspecialcasesofthegeneralmixedlinearmodel whereyisthevectorofsampleobservations,XandZareknownmatrices,andvandearedistributedindependentlywithmeans0andcovariancematricesGandR,respectively,dependingonsomeparameters,thevariancecomponents.Following 14 PAGE 15 whereV=Cov(y)=R+ZGZTand~istheGLSestimatorofb.TheyproposedtheestimatorsforthevariancecomponentsusingHenderson'smethods. 
Prasad and Rao also proposed an estimator for the MSE of these estimators. They extended the work of Kackar and Harville (1984), and approximated the true prediction MSE of the EBLUP under normality of the error terms. They showed that

MSE[t(\hat φ, y)] ≈ g_1(φ) + g_2(φ) + g_3(φ), with g_3(φ) = tr[ (∇b^T) V (∇b^T)^T E{(\hat φ - φ)(\hat φ - φ)^T} ],

where ∇b^T = col_{1≤j≤p}(∂b^T/∂φ_j) and b^T = m^T G Z^T V^{-1}. Here \tilde φ is Henderson's estimator of φ and \hat φ = max(\tilde φ, 0). They provided general conditions under which the precise order of the neglected terms in the approximation is o(t^{-1}) for large t. These conditions are satisfied by both the Fay-Herriot and the nested error models. For the estimation of the MSE, the models were considered individually, as it is difficult to find general conditions. For example, the MSE approximation for the Fay and Herriot (1979) model is

MSE(\hat θ_i^{EB}) ≈ V_i(1 - B_i) + B_i² x_i^T (X^T D^{-1} X)^{-1} x_i + 2B_i²(V_i + A)^{-1} Var(\hat A),

where B_i = V_i/(V_i + A) and D = diag(V_1 + A, ..., V_m + A). Prasad and Rao estimated V_i(1 - B_i) by V_i(1 - \hat B_i), and B_i² x_i^T(X^T D^{-1} X)^{-1} x_i by \hat B_i² x_i^T(X^T \hat D^{-1} X)^{-1} x_i, where \hat D = diag(V_1 + \hat A, ..., V_m + \hat A). They approximated E[\hat θ_i^{EB}(A) - \hat θ_i^{EB}(\hat A)]² by 2B_i²(V_i + A)^{-1} Var(\hat A), while, under normality of the error terms, Var(\hat A) was approximated as Var(\hat A) = 2m^{-1}[A² + 2A Σ V_i/m + Σ V_i²/m], correct up to the second order. Hence the estimated MSE up to the second order plugs \hat A into these expressions, doubling the third term to compensate for the bias of the first. Lahiri and Rao (1995) showed that the estimator of MSE as given by Eq. 1-16 is robust to departures from normality of the random area effects v_i, though not of the sampling errors e_i.

Datta and Lahiri (2000) extended the results of Prasad and Rao to general linear mixed models of the form

Y_i = X_i b + Z_i v_i + e_i,

where Y_i is the vector of sample observations of order n_i, X_i and Z_i are known matrices, and v_i and e_i are distributed independently with means 0 and covariance matrices G_i and R_i, respectively, depending on some parameters φ, the variance components. In their paper, Datta and Lahiri first estimated the variances of the BLUP for the case of known φ. They also developed MSE estimators, correct up to the second order, when φ is estimated by the ML or REML estimator. They also showed that the MSE approximations for the ML and REML methods are exactly the same in the second order asymptotic sense. However, the second order accurate estimator of the MSE based on the latter method requires less bias correction than the one based on the former method. In the same thread, Datta et al.
(2005) derived a second order approximation to the MSE of small area estimators based on the Fay-Herriot model as given by Eq. 1-1, when the model variances are estimated by the Fay-Herriot iterative method based on the weighted residual sum of squares. They also derived a noninformative prior on the model for which the posterior variance of a small area mean is second order unbiased for the MSE.

Many other models have been proposed in the context of SAE. Cressie (1990) and Ghosh, Natarajan, Stroud, and Carlin (1998) emphasized the use of linear models and generalized linear models, respectively. Ericksen and Kadane (1987) carried out sensitivity analysis for SAE. Pfeffermann and Burck (1990), Rao and Yu (1992) and Ghosh et al. (1996) studied the extent of possible use of time series and cross-sectional models in SAE.

McGibbon and Tomberlin (1989) considered the model where y_ij ~ Bin(1, p_ij) and

logit(p_ij) = x_ij^T b + u_i, u_i ~ N(0, σ_u²). (1-18)

Here the y_ij's are conditionally independent, and the independent random effects are denoted by the u_i's. The authors assume a diffuse prior for b, with σ_u² assumed to be known. The joint posterior distribution of b and {u_i, i = 1, ..., m} is approximated by the multivariate normal distribution with mean equal to the true posterior mode and covariance matrix equal to the inverse information matrix evaluated at the mode. The estimator of p_ij is given by

\tilde p_ij = [1 + exp{-(x_ij^T \tilde b + \tilde u_i)}]^{-1}, (1-19)

where \tilde b and \tilde u_i are the respective modes of the posterior. The area proportions are then estimated by \hat p_i = Σ_{j=1}^{N_i} \tilde p_ij / N_i for small sampling fractions n_i/N_i. For σ_u² unknown, the MLE of σ_u² is plugged in, yielding an EB estimator. A Taylor expansion is used to obtain a naive estimator of the variance of the EB estimator, but the drawback of this estimator is that it ignores the variance component resulting from the estimation of σ_u². Farrell, McGibbon, and Tomberlin (1997) showed that the variance estimator could be improved upon by the use of a parametric bootstrap procedure.

Some other papers relating to the analysis of binary outcomes are by Jiang and Lahiri (2001), who propose a method to find the best predictor (BP) of logit(p_ij) and use the method of moments to obtain sample estimates of b and u_i as given in Eq. 1-18. Malec et al.
(1997) also consider a logistic linear mixed model for Bernoulli outcomes, and use an HB approach to derive model dependent predictors.

Maiti (1998) assumed a log-normal distribution for the incidence rates, together with a spatial dependence model for the log-transformed incidence rates. Some other works in this area include papers by Ghosh et al. (1999), Lahiri and Maiti (2002) and Nandram et al. (1999).

Ghosh et al. (1998) considered hierarchical Bayes (HB) generalized linear models (GLMs) for a unified analysis of both discrete and continuous data. The responses y_ij are assumed to be conditionally independent with pdf

f(y_ij | θ_ij) = exp[ φ_ij^{-1} {y_ij θ_ij - ψ(θ_ij)} + ρ(y_ij; φ_ij) ], (1-20)

a one-parameter exponential family with canonical parameters θ_ij. Ghosh et al. used the hierarchical model in the following way: at the second stage, the canonical parameters are linked to area and unit level effects through

h(θ_ij) = x_ij^T b + u_i + ε_ij, (1-21)

with h a strictly increasing function, the u_i and ε_ij mutually independent normals with zero means, and priors placed on b and the variance components.

Ghosh and Maiti (2004) provided small area estimates of proportions based on natural exponential family models with quadratic variance function (QVF), involving survey weights. In that paper they considered area level covariates, i.e., the same covariates for all the units in the same area. They introduced the natural exponential family quadratic variance function model, which has the same setup as Eq. 1-20 with a conjugate prior on the θ_i's, where λ is the overdispersion parameter and m_i = g(x_i^T b). Using results from Morris (1983), E(θ_i) = m_i and Var(θ_i) is proportional to the quadratic variance function V(m_i). They provide pseudo-BLUPs of the small area means, conditional on the weighted means ȳ_iw = Σ_{j=1}^{n_i} w_ij y_ij, where w_ij > 0 for all j and Σ_{j=1}^{n_i} w_ij = 1. They eventually arrived at the pseudo-EBLUPs by simultaneously estimating b and λ, using the method of unbiased estimating functions following Godambe and Thompson (1989). They also provided approximate formulae for the MSEs and their approximate estimators.

Robust estimation in the sample survey context was considered by Chambers (1986) and Gwet and Rivest (1992), who used robust estimation techniques specifically in the context of ratio estimation. Chambers (1986) mainly considered the robust estimation of a finite population total when the sampled data contain 'representative' outliers, such that there is no good reason to assume that the nonsampled part of the target population is free of outliers.
Chambers assumed a superpopulation model ξ_0 under which the random variables r_I = (y_I - βx_I)/σ_I are independent and identically distributed with zero mean and unit variance. Here the x_I's are known real values associated with the population elements y_I, σ_I² = σ²v(x_I), β is unknown, and σ² is also assumed to be unknown. Under ξ_0, Chambers (1986) proposed a robust estimator T_n of the finite population total. Chambers discussed the different choices of ψ, and primarily discussed the bias and variance robustness of T_n under a gross error model. Based on the asymptotic theory given in the paper, it appears that the variance robustness of T_n can only be achieved by choosing ψ such that this estimator is no longer bias robust.

Zaslavsky et al. (2001) used robust estimation techniques to downweight the effect of clusters, with an application to the 1990 post-enumeration survey. In the context of SAE, Datta and Lahiri (1995) developed a robust HB method which was particularly well-suited when there were one or more outliers in the data set. They used a class of scale mixtures of normal distributions on the random effects in the basic area model; the Cauchy distribution was used for the outliers.

We have tried to outline some of the major works significant in the advances of SAE. Review papers on SAE include Rao (1986), Chaudhuri (1994), Ghosh and Rao (1994), Rao (1999) and Pfeffermann (2002). There is a book on SAE by Rao (2003).

In Chapter 2, we have considered the basic area level model and proposed some new robust small area estimation procedures based on standardized residuals. The approach consists of first finding influence functions corresponding to each individual area level observation, by measuring the divergence between the two posteriors of the regression coefficients with and without that observation. Next, based on these influence functions, properly standardized, we have proposed some new robust Bayes and empirical Bayes (EB) small area estimators. The approximate MSEs and estimated MSEs of these estimators have also been found.
In Chapter 3, we have addressed small area problems when the response is bivariate binary in the presence of covariates. We have proposed EB estimators of the small area means by assigning a conjugate prior to the parameters after suitable transformation and subsequent modelling on the covariates. We have also provided second order approximate expressions for the MSEs and estimated MSEs of these estimators.

In Chapter 4, we have proposed HB estimators in the bivariate binary setup, where some of the covariates may be partially missing. We implemented hierarchical Bayesian models which accommodate this missingness. Estimators of small area means, along with the associated posterior standard errors and posterior correlations, have also been provided.

Shrinkage estimators of the above type are quite reasonable if the assumed prior is approximately true. There may be over- or under-shrinking towards the regression estimator if the assumed prior is not true. Indeed, a standard criticism against any subjective Bayesian method is that it can perform very poorly in the event of a significant departure from the assumed prior (see, e.g., Efron and Morris (1971), Fay and Herriot (1979)). Efron and Morris (1971, 1972) remarked that Bayes and empirical Bayes estimators might perform well overall, but poorly on individual components. For example, suppose we assume a model in which the θ_i's are a priori iid N(μ, A), but the assumed prior on the θ_i's is actually a mixture of several distributions, one of which is N(μ_1, A_1) with A_1 much larger than A. Then the components drawn from that part of the mixture can be badly over-shrunk; related concerns appear in Fay and Herriot (1979). Efron and Morris (1971, 1972) proposed limited translation rule (LTR) estimators to guard against problems of the above type in the context of compound decision problems. The LTR estimators were obtained with a special choice of what these authors referred to as a "relevance function". The main purpose of the LTR estimators was to "hedge Bayesian bets with frequentist caution" (Efron and Morris 1972).
In this chapter, we have introduced the notion of influence functions in the small area context, and developed robust EB and HB estimators based on such influence functions. These influence functions are akin to the relevance functions of Efron and Morris, but are derived on the basis of a general divergence measure between two distributions. In this way, we are assigning some theoretical underpinnings to the work of Efron and Morris as well. Also, unlike the intercept model of Efron and Morris, our results are derived in the more general regression context, a scenario more suitable for small area estimation.

The importance of influence functions in robust estimation is well emphasized in Hampel (1974), Huber (1981), Hampel et al. (1986) and many related papers. The predominant idea is to detect influential observations in terms of their effects on parameters, most typically the regression parameters. Johnson and Geisser (1982, 1983, 1985) took instead a predictive point of view for the detection of influential observations. In the process, they developed Bayesian predictive influence functions (PIFs) and applied these in several univariate and multivariate prediction problems based on regression models. These predictive influence functions are based on Kullback-Leibler divergence measures.

Our proposed influence functions are motivated in a Bayesian way, based on some general divergence measures as introduced in Amari (1982) and Cressie and Read (1984). Included as special cases are the well-known Kullback-Leibler and Hellinger divergence measures.

We consider the basic area level model

y_i = θ_i + e_i, θ_i = x_i^T b + u_i, i = 1, ..., m, (2-1, 2-2)

where the sampling errors e_i ~ N(0, V_i) and the model errors u_i ~ N(0, A) are mutually independent. Here the x_i's are the p(<m)-dimensional design vectors and b is the vector of regression coefficients. Writing B_i = V_i/(V_i + A) and Σ = diag(V_1 + A, ..., V_m + A), the Bayes estimator of θ_i is \tilde θ_i^B = y_i - B_i(y_i - x_i^T b); replacing b by its GLS estimator \tilde b = (X^T Σ^{-1} X)^{-1} X^T Σ^{-1} y yields the EB (or HB) estimator \tilde θ_i^{EB} = y_i - B_i(y_i - x_i^T \tilde b).

As discussed in the introduction, the above EB (or HB) estimator can often over- or under-shrink the direct estimators y_i towards the regression estimators x_i^T \tilde b. What we propose now is a robust Bayesian procedure which is intended to guard against such model misspecification.
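The divergence family that drives the influence-function construction can be checked numerically for univariate normals. This is a sketch only: it assumes the convention D_λ(f1, f2) = [λ(1-λ)]^{-1}[1 - ∫ f1^{1-λ} f2^λ], together with the Chernoff-type closed form of the Gaussian integral, under which the λ → 0 limit is the Kullback-Leibler divergence KL(f1 || f2).

```python
import math

def D(lam, mu1, s1, mu2, s2):
    """Divergence between N(mu1, s1) and N(mu2, s2) (s1, s2 are variances),
    using the closed form of the integral of f1^(1-lam) * f2^lam."""
    s = lam * s1 + (1.0 - lam) * s2
    log_int = (0.5 * (lam * math.log(s1) + (1.0 - lam) * math.log(s2) - math.log(s))
               - 0.5 * lam * (1.0 - lam) * (mu1 - mu2) ** 2 / s)
    return (1.0 - math.exp(log_int)) / (lam * (1.0 - lam))

# closed-form KL(N(0,1) || N(1.5,2)) to check the lam -> 0 limit
kl = 0.5 * (math.log(2.0) + 1.0 / 2.0 + 1.5 ** 2 / 2.0 - 1.0)
d_small = D(1e-6, 0.0, 1.0, 1.5, 2.0)   # should be close to kl
```

The λ = 1/2 member is symmetric in f1 and f2 (it is four times the squared Hellinger distance), while small λ recovers the familiar Kullback-Leibler behavior.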
To this end, we begin by finding the influence of (y_i, x_i), i = 1, ..., m, in the context of estimating the parameter b. The derivation of the influence function is based on a general divergence measure, as introduced in Amari (1982) and Cressie and Read (1984). For two densities f_1 and f_2, this general divergence measure is given by

D_λ(f_1, f_2) = [λ(1-λ)]^{-1} [1 - ∫ f_1^{1-λ} f_2^λ dμ], 0 < λ < 1. (2-3)

The above divergence measure should be interpreted as its limiting value when λ → 0 or λ → 1. We may note that D_λ(f_1, f_2) is not necessarily symmetric in f_1 and f_2, but symmetry can always be achieved by considering (1/2)[D_λ(f_1, f_2) + D_λ(f_2, f_1)]. Also, it may be noted that as λ → 0, D_λ(f_1, f_2) → E_{f_1}[log(f_1/f_2)], the Kullback-Leibler divergence, while D_{1/2}(f_1, f_2) = 4(1 - ∫ √(f_1 f_2) dμ), four times the squared Hellinger distance.

In the present context, we consider the divergence between the posterior densities of b with and without (y_i, x_i) under the assumed model given in Eq. 2-2. To this end, we first prove a general divergence result involving two multivariate normal densities.

Theorem 2-1. Let f_1 and f_2 denote the N_p(μ_1, Σ_1) and N_p(μ_2, Σ_2) densities. Then

D_λ(f_1, f_2) = [λ(1-λ)]^{-1} [ 1 - exp{ -(λ(1-λ)/2)(μ_1 - μ_2)^T (λΣ_1 + (1-λ)Σ_2)^{-1} (μ_1 - μ_2) } |Σ_1|^{λ/2} |Σ_2|^{(1-λ)/2} |λΣ_1 + (1-λ)Σ_2|^{-1/2} ].

The proof proceeds by completing the square in the exponent of f_1^{1-λ} f_2^λ, substituting Z = Σ_1^{-1/2}(X - μ_1) ~ N_p(0, I_p), and applying the standard matrix inversion formula (Rao, 1973, p. 33), (A + BDB^T)^{-1} = A^{-1} - A^{-1}B(D^{-1} + B^T A^{-1} B)^{-1} B^T A^{-1}.

We now utilize Theorem 2-1 in deriving the influence of (y_i, x_i) in the given small area context. Note that

b | y, X ~ N[(X^T Σ^{-1} X)^{-1} X^T Σ^{-1} y, (X^T Σ^{-1} X)^{-1}] (2-8)

and

b | y_(i), X_(i) ~ N[(X_(i)^T Σ_(i)^{-1} X_(i))^{-1} X_(i)^T Σ_(i)^{-1} y_(i), (X_(i)^T Σ_(i)^{-1} X_(i))^{-1}], (2-9)

where the subscript (i) denotes deletion of the i-th observation. By a standard matrix inversion formula (Rao, 1973, p. 33),

(X^T Σ^{-1} X - (V_i + A)^{-1} x_i x_i^T)^{-1} = (X^T Σ^{-1} X)^{-1} + [(V_i + A)^{-1}(X^T Σ^{-1} X)^{-1} x_i x_i^T (X^T Σ^{-1} X)^{-1}] / [1 - (V_i + A)^{-1} x_i^T (X^T Σ^{-1} X)^{-1} x_i].

From Eq. 2-8 and Eq. 2-9, on simplification,

μ_1 - μ_2 = [(V_i + A)^{-1}(X^T Σ^{-1} X)^{-1} x_i (y_i - x_i^T \tilde b)] / [1 - (V_i + A)^{-1} x_i^T (X^T Σ^{-1} X)^{-1} x_i], (2-10)

and

λΣ_1 + (1-λ)Σ_2 = (X^T Σ^{-1} X)^{-1} + (1-λ)[(V_i + A)^{-1}(X^T Σ^{-1} X)^{-1} x_i x_i^T (X^T Σ^{-1} X)^{-1}] / [1 - (V_i + A)^{-1} x_i^T (X^T Σ^{-1} X)^{-1} x_i]. (2-11)

It is easy to see from Eq. 2-10 and Eq.
2-11 that (μ_1 - μ_2)^T [λΣ_1 + (1-λ)Σ_2]^{-1} (μ_1 - μ_2) is a quadratic form in y_i - x_i^T \tilde b. We propose restricting the amount of shrinking by controlling the residuals y_i - x_i^T \tilde b. However, in order to make these residuals scale-free, we standardize them as (y_i - x_i^T \tilde b)/D_i, where

D_i² = V(y_i - x_i^T \tilde b) = V_i + A - x_i^T (X^T Σ^{-1} X)^{-1} x_i, i = 1, ..., m.

This leads to the proposed robust Bayes estimator

\tilde θ_i^{RHB} = y_i - B_i D_i ψ_K((y_i - x_i^T \tilde b)/D_i), (2-12)

where ψ_K(t) = sgn(t) min(K, |t|).

Remark 2. For the special case of the intercept model as considered in Lindley and Smith (1972), with V_i = 1 and B_i = B = (1 + A)^{-1} for all i, the above robust estimator of θ_i reduces to \tilde θ_i^{RHB} = y_i - B D ψ_K((y_i - ȳ)/D), i = 1, ..., m.

The selection of K is always an open question. Often this is dictated from the user's point of view. A somewhat more concrete recommendation can be given from the following theorem, which provides the Bayes risk of the robust Bayes estimator given in Eq. 2-12 under the model Eq. 2-2. Throughout this paper, let Φ denote the standard normal df and φ the standard normal pdf.

Theorem 2-2. Under the model (Eq. 2-2),

E(\tilde θ_i^{RHB} - θ_i)² = g_{1i}(A) + g_{2i}(A) + g_{3im}(A).

The proof rests on the decomposition

\hat θ_i^B - \tilde θ_i^{RHB} = -B_i x_i^T(\tilde b - b) - B_i D_i [Z_i - ψ_K(Z_i)], Z_i = (y_i - x_i^T \tilde b)/D_i.

It is easy to check that E{x_i^T(\tilde b - b)}² = x_i^T (X^T Σ^{-1} X)^{-1} x_i. Also, noting that Z_i ~ N(0, 1), and is thus distributed symmetrically about zero, one gets an explicit expression for E[Z - ψ_K(Z)]² in terms of Φ and φ, where Z denotes an N(0, 1) variable. Now, by the independence of \tilde b and y_i - x_i^T \tilde b, and the fact that E(\tilde b) = b, one gets the result from Eq. 2-13 - Eq. 2-15.

It may be noted that if the assumed model Eq. 2-2 is correct, then the Bayes risk of the regular EB estimator \tilde θ_i^{EB} is g_{1i}(A) + g_{3im}(A). Hence, the excess Bayes risk of \tilde θ_i^{REB} over \tilde θ_i^{EB} under the assumed model is g_{2i}(A), which is strictly decreasing in K and, as intuitively expected, tends to zero as K → ∞. Hence, setting a value, say M, for g_{2i}(A) will determine the value of K. Thus, the choice of K will be based on a tradeoff between robustness under model misspecification and efficiency under the assumed model.

We now proceed to the study of robust EB estimators in the next section. Replacing (A, \tilde b) by their estimates, the robust EB estimator is

\hat θ_i^{REB} = y_i - \hat B_i \hat D_i ψ_K((y_i - x_i^T \hat b)/\hat D_i), (2-16)

where ψ_K is defined in the previous section after Eq. 2-12. Our objective in this section is to find E(\hat θ_i^{REB} - θ_i)² correct up to O(m^{-1}), i.e. we provide an asymptotic expansion of this expectation with an explicit coefficient of m^{-1}, while the remainder term, when multiplied by m, converges to zero, subject to certain assumptions. Here E denotes expectation over the joint distribution of y and θ. We need certain preliminary results before calculating the Bayes risks.
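Eq. 2-12 is easy to sketch numerically: moderate residuals are shrunk exactly as in ordinary EB, while large (outlying) residuals have their shrinkage correction capped by Huber's ψ. The values of B_i, D_i, K and the regression fit below are illustrative only.

```python
def psi(t, K):
    """Huber's psi: sgn(t) * min(K, |t|)."""
    return max(-K, min(K, t))

def robust_hb(y_i, fit_i, B_i, D_i, K):
    """Eq. 2-12: y_i - B_i * D_i * psi_K((y_i - fit_i) / D_i)."""
    return y_i - B_i * D_i * psi((y_i - fit_i) / D_i, K)

# an outlying area: ordinary EB shrinks the full residual, the robust
# estimator caps the correction at B_i * D_i * K
usual_eb = 10.0 - 0.5 * (10.0 - 2.0)                       # = 6.0
robust = robust_hb(10.0, 2.0, B_i=0.5, D_i=2.0, K=1.0)     # = 9.0
```

For a small residual (|residual|/D_i ≤ K) the two estimators coincide, so robustness is bought only where the model fit looks suspect.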
First we need the asymptotic distribution of \hat A. Based on the marginal distribution of the y_i's, the likelihood for b and A is given by

L(b, A) ∝ Π_{i=1}^m (V_i + A)^{-1/2} exp[-(1/2) Σ_{i=1}^m (y_i - x_i^T b)²/(V_i + A)]. (2-17)

The corresponding Fisher information matrix is

I(b, A) = blockdiag( X^T Σ^{-1} X, (1/2) Σ_{i=1}^m (V_i + A)^{-2} ). (2-18)

Due to the asymptotic independence of \hat b and \hat A, \hat A is asymptotically N(A, 2(Σ_{i=1}^m (V_i + A)^{-2})^{-1}). We now have the following simple but important lemma.

Lemma 2-1. If V_L ≤ V_i ≤ V_U for all i, then E|\hat A - A|^r = O_e(m^{-r/2}) for every r > 0.

Here O_e refers to the exact order, which means that the expectation is bounded both below and above by terms such as C_1 m^{-r/2} and C_2 m^{-r/2} for large m. This is in contrast to the order O, which only requires the upper bound. A rigorous proof of the lemma requires a uniform integrability argument, which we omit. However, intuitively, E[|\hat A - A|^r {Σ_{i=1}^m (V_i + A)^{-2}}^{r/2}] = O_e(1). But m(V_U + A)^{-2} ≤ Σ_{i=1}^m (V_i + A)^{-2} ≤ m(V_L + A)^{-2}. Accordingly, Σ_{i=1}^m (V_i + A)^{-2} = O_e(m), which leads to E|\hat A - A|^r = O_e(m^{-r/2}).

Next we write |\hat B_i - B_i| = V_i |\hat A - A| / [(V_i + \hat A)(V_i + A)] ≤ |\hat A - A|/(V_L + A) for all i = 1, ..., m, which leads to the desired conclusion E|\hat B_i - B_i|^r = O(m^{-r/2}). Also, we conclude from the lemma that |\hat A - A| ≤ m^{-1/2+δ} except on a set of probability O(m^{-r}), for any arbitrary δ > 0 and arbitrarily large r > 0, which we require in the sequel. We may point out that the assumption regarding the boundedness of the V_i's is required in any asymptotic Bayes risk (or MSE) calculation in this context, as evident, for example, from Prasad and Rao (1990).

We are now in a position to find an expression for E(\hat θ_i^{REB} - θ_i)² which is second order correct, i.e. with bias of order o(m^{-1}). To this end, we first state the following lemmas.

Lemma 2-2. Write K(A) = tr(I^{-1} M_1) and J(A) = tr(I^{-1} M_2), where M_1 is the matrix of expected third derivatives of log L with respect to (b, A), and M_2 the matrix of expectations of products of first and second derivatives. Direct computation gives

M_1 = blockdiag( X^T Σ^{-2} X, 2 tr(Σ^{-3}) ), M_2 = blockdiag( 0_{p×p}, tr(Σ^{-3}) ).

Since I^{-1} = blockdiag( (X^T Σ^{-1} X)^{-1}, 2{tr(Σ^{-2})}^{-1} ),

K(A) + 2J(A) = tr[(X^T Σ^{-1} X)^{-1}(X^T Σ^{-2} X)] + {2 tr(Σ^{-3}) - 2 tr(Σ^{-3})}{tr(Σ^{-2})}^{-1} = tr[(X^T Σ^{-1} X)^{-1}(X^T Σ^{-2} X)].

Now, by Lemma 2-2 and the fact that E(\hat A - A)² = 2(tr Σ^{-2})^{-1} + o(m^{-1}), we obtain an O(m^{-1}) approximation to the bias E(\hat A - A) involving (tr Σ^{-2})^{-1} tr{(X^T Σ^{-1} X)^{-1}(X^T Σ^{-2} X)}; the corresponding correction term for the i-th area, which carries in addition the factor (V_i + A)^{-1}, is denoted by g_{5im}(A).

Remark 3. It may be noted that the leading term in E(\hat A - A) is O(m^{-1}) and g_{5im} = O(m^{-1}).
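The marginal likelihood (2-17) can be maximized in A quite directly. This sketch uses an intercept-only design, illustrative data, and a crude grid search; in practice a Fisher scoring step based on Eq. 2-18 would be used instead.

```python
import math

def profile_loglik(A, y, V):
    """Log of Eq. 2-17 with b profiled out by its GLS (weighted mean) value."""
    w = [1.0 / (Vi + A) for Vi in V]
    b = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)     # GLS intercept for this A
    return sum(-0.5 * math.log(Vi + A) - 0.5 * (yi - b) ** 2 / (Vi + A)
               for yi, Vi in zip(y, V))

y = [0.3, 2.1, -1.2, 0.8, 1.9, -0.4]    # illustrative direct estimates
V = [1.0] * 6                            # known sampling variances
A_grid = [i * 0.01 for i in range(1, 501)]
A_hat = max(A_grid, key=lambda A: profile_loglik(A, y, V))
```

With equal V_i the maximizer satisfies V + A ≈ the average squared residual about the weighted mean, which is what the grid search recovers.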
The lemmas provide approximations to E(\hat A - A) and E(\hat B_i - B_i), both correct up to O(m^{-1}). Thus, by Lemmas 2-2 and 2-3, we get the following theorem, which gives the estimate of the MSE correct up to the second order.

Theorem 2-3. An estimator of MSE(\hat θ_i^{REB}), correct up to the second order, is g_{1i}(\hat A) + V_i g_{5im}(\hat A) + g_{2i}(\hat A) + g_{6im}(\hat A) + g_{3im}(\hat A), where

g_{6im}(A) = 2B_i² [Σ_{j=1}^m (V_j + A)^{-2}]^{-1} tr[ {Σ_{j=1}^m (V_j + A)^{-1} x_j x_j^T}^{-1} {Σ_{j=1}^m (V_j + A)^{-2} x_j x_j^T} ] [(1 + K²)Φ(-K) - Kφ(K)]. (2-25)

Hence, we estimate g_{1i}(A) by g_{1i}(\hat A) + V_i g_{5im}(\hat A). Next, by a one-step Taylor approximation, g_{2i}(\hat A) = g_{2i}(A) + (\hat A - A) dg_{2i}/dA, and direct differentiation, using D_i² = V_i + A - x_i^T (Σ_j (V_j + A)^{-1} x_j x_j^T)^{-1} x_i, gives

d(B_i² D_i²)/dA = -B_i² + O(m^{-1}).

Hence, by Lemma 2-2 and the assumptions of Theorem 2-3, we estimate g_{2i}(A) by g_{2i}(\hat A) + g_{6im}(\hat A). The theorem follows.

To study the performance of the proposed estimators, we conducted a simulation study. The small areas are divided into groups with differing sampling variances V_i; for m = 20, the numbers of small areas are doubled within each group. We consider different distributions for generating the u_i's: (i) a contaminated normal distribution, .9N(0,1) + .1N(0,36), so that 1 of 10 areas is expected to be an outlier; and (ii) the normal distribution N(0, 4.5) — note that 4.5 is the variance of the contaminated normal distribution. We then calculate the following quantities to measure the performance of the SAE, where \hat θ_i denotes the estimate of θ_i and relative biases are computed by averaging over simulation runs.

RB: relative bias of standard EB estimates; RBR: relative bias of robust EB estimates; RBI: relative improvement of relative bias of robust estimates over the standard Bayes estimates; SMSE: empirical MSE of standard EB estimates; SMSER: empirical MSE of robust EB estimates; RMI: relative improvement of MSE of robust estimators over the standard Bayes estimators; REBN: relative bias of the naive MSE estimates under the robust EB method; REB: relative bias of the second order correct MSE estimates; RI: relative improvement of bias in the estimated MSE over the naive estimated MSE under the robust method. Here the naive MSE estimates are obtained by taking only the first two terms of the MSE estimates. The values corresponding to normal random effects are reported within parentheses.

Table 2-1. Relative biases of the point estimates and relative bias of MSE estimates.
K     RB       RBR      RBI     SMSE    SMSER   RMI     REBN     REB      RI

m = 10
0.5   2.743    2.011    .267    .909    .344    .622    -.106    -.087    .055
      (1.150)  (1.051)  (.086)  (.460)  (.345)  (.250)  (-.065)  (-.050)  (.0434)
1     -        2.202    .197    -       .355    .609    -.154    -.133    .059
      (-)      (1.067)  (.072)  (-)     (.350)  (.239)  (-.078)  (-.062)  (.046)
1.5   -        2.067    .246    -       .372    .590    -.176    -.153    .062
      (-)      (0.969)  (.157)  (-)     (.350)  (.239)  (-.084)  (-.066)  (.051)

m = 20
0.5   2.581    2.392    .073    .329    .320    .027    -.059    -.050    .028
      (.683)   (.653)   (.059)  (.322)  (.321)  (.003)  (-.026)  (-.018)  (.025)
1     -        2.390    .074    -       .328    .003    -.090    -.080    .0305
      (-)      (.677)   (.023)  (-)     (.318)  (.012)  (-.041)  (-.033)  (.025)
1.5   -        2.120    .179    -       .318    .033    -.089    -.078    .035
      (-)      (.649)   (.064)  (-)     (.310)  (.037)  (-.081)  (-.074)  (.023)

Multivariate binary data occur frequently in statistical applications (see, e.g., Bishop et al. (1975); Zhao and Prentice (1990)). Bivariate binary data arise very naturally in ophthalmological studies. Item response theory, which occupies a central role in psychometry and educational statistics, deals almost exclusively with multivariate binary data. The classic example is the Rasch model with its many and varied extensions.

Our interest in the analysis of multivariate binary data stems from their potential application in small area estimation. We consider the case when the response is bivariate binary. In Section 2, we have first derived the Bayes estimators, and subsequently the EB estimators, for the small area parameters. Section 3 is devoted to the derivation of the MSEs of the Bayes estimators, and second order correct approximations for the MSEs of the EB estimators. Section 4 is devoted to the second order correct estimators of the MSEs of the EB estimators. We have conducted a simulation study in Section 5 to illustrate our method.
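The bivariate binary analysis that follows reduces each response pair (y1, y2) to an indicator of one of four mutually exclusive cells; a minimal sketch of that coding:

```python
def bivariate_cells(y1, y2):
    """(z1, z2, z3, z4) cell indicators for a bivariate binary response:
    (1,0), (0,1), (1,1) and (0,0) respectively."""
    return (y1 * (1 - y2), y2 * (1 - y1), y1 * y2, (1 - y1) * (1 - y2))
```

Exactly one indicator equals 1 for each response pair, which is what turns the bivariate binary likelihood into a four-cell multinomial likelihood.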
The joint distribution of the bivariate binary response (y_ij1, y_ij2) for unit j in area i is

f(y_ij1, y_ij2) = exp(y_ij1 θ_ij1 + y_ij2 θ_ij2 + y_ij1 y_ij2 θ_ij3) / [1 + exp(θ_ij1) + exp(θ_ij2) + exp(θ_ij1 + θ_ij2 + θ_ij3)]. (3-1)

With the reparametrization

p_1ij = exp(θ_ij1) / [1 + exp(θ_ij1) + exp(θ_ij2) + exp(θ_ij1 + θ_ij2 + θ_ij3)] = P(y_ij1 = 1, y_ij2 = 0),
p_2ij = exp(θ_ij2) / [1 + exp(θ_ij1) + exp(θ_ij2) + exp(θ_ij1 + θ_ij2 + θ_ij3)] = P(y_ij1 = 0, y_ij2 = 1),
p_3ij = exp(θ_ij1 + θ_ij2 + θ_ij3) / [1 + exp(θ_ij1) + exp(θ_ij2) + exp(θ_ij1 + θ_ij2 + θ_ij3)] = P(y_ij1 = 1, y_ij2 = 1),

we rewrite Eq. 3-1 as a four-cell multinomial probability, where p_4ij = 1 - Σ_{l=1}^3 p_lij. Let θ_1ij = p_1ij + p_3ij and θ_2ij = p_2ij + p_3ij. Our primary interest is the simultaneous estimation of the small area means (θ_1i, θ_2i)^T, where θ_1i = Σ_{j=1}^{n_i} w_ij θ_1ij, θ_2i = Σ_{j=1}^{n_i} w_ij θ_2ij and Σ_{j=1}^{n_i} w_ij = 1 for each i. Here the weights w_ij are assumed to be known, based on some given sampling scheme. Writing z_1ij = y_ij1(1 - y_ij2), z_2ij = y_ij2(1 - y_ij1), z_3ij = y_ij1 y_ij2 and z_4ij = 1 - Σ_{l=1}^3 z_lij, we begin with the likelihood

L = Π_i Π_j p_1ij^{z_1ij} p_2ij^{z_2ij} p_3ij^{z_3ij} p_4ij^{z_4ij}. (3-4)

Writing p^T = (p_11^T, ..., p_1n_1^T, ..., p_k1^T, ..., p_kn_k^T), the conjugate Dirichlet prior is given by

π(p) ∝ Π_i Π_j Π_{l=1}^4 p_lij^{λ m_lij - 1}, (3-5)

where λ is the dispersion parameter, m_4ij = 1 - Σ_{l=1}^3 m_lij, and the m_lij are functions of the covariates x_ij associated with the binary vector y. In particular, we take the multinomial logit form

m_lij(b) = exp(x_ij^T b_l) / [1 + Σ_{u=1}^3 exp(x_ij^T b_u)], l = 1, 2, 3,

where b_1, b_2, b_3 are the regression coefficients. The posterior for p is then a product of independent Dirichlet densities with parameters z_lij + λ m_lij. (3-6)

From Eq. 3-6, the Bayes estimator (posterior mean) of p_lij is given by

\hat p_lij^B = (z_lij + λ m_lij(b)) / (λ + 1), j = 1, ..., n_i, i = 1, ..., m. (3-7)

The resulting estimators for θ_1ij and θ_2ij are

\hat θ_1ij^B = \hat p_1ij^B + \hat p_3ij^B, \hat θ_2ij^B = \hat p_2ij^B + \hat p_3ij^B. (3-8)

In an EB scenario, the parameters b_1, b_2 and b_3 are all unknown and need to be estimated from the marginal distributions of all the y's. We assume that λ is known, and outline later its adaptive estimation. Direct estimation of λ is impossible from the marginal distributions of the y's, since marginally P(y_ij1 = 1, y_ij2 = 0) = E(z_1ij) = m_1ij, P(y_ij1 = 0, y_ij2 = 1) = E(z_2ij) = m_2ij and P(y_ij1 = 1, y_ij2 = 1) = E(z_3ij) = m_3ij, and the joint marginal of the y's is uniquely determined from these probabilities, resulting in nonidentifiability of λ.

One way to estimate b_1, b_2, b_3 from the marginal Dirichlet-multinomial distribution of the y's is the method of maximum likelihood. But the MLEs are mathematically intractable and are not readily implementable in practice. Instead, we appeal to the theory of optimal unbiased estimating functions as proposed in Godambe and Thompson (1989). Writing g_ij = z_ij - m_ij(b), with z_ij = (z_1ij, z_2ij, z_3ij)^T, m_ij = (m_1ij, m_2ij, m_3ij)^T, Σ_ij = Cov(z_ij) and D_ij = ∂m_ij/∂b^T, the optimal estimating equations are given by Σ_{i=1}^k Σ_{j=1}^{n_i} D_ij^T Σ_ij^{-1} g_ij = 0. Solving these equations, we find the estimators of b_1, b_2 and b_3. We first observe that Σ_ij = Δ_ij := diag(m_1ij, m_2ij, m_3ij) - m_ij m_ij^T. Again, from Eq.
3-9, D_ij^T = Δ_ij ⊗ x_ij, where ⊗ denotes the Kronecker product. Hence the equations Σ_{i=1}^k Σ_{j=1}^{n_i} D_ij^T Σ_ij^{-1} g_ij = 0 can be rewritten as Σ_{i=1}^k Σ_{j=1}^{n_i} [I_3 ⊗ x_ij] g_ij = 0, which further simplifies into

Σ_{i=1}^k Σ_{j=1}^{n_i} x_ij (z_lij - m_lij(b)) = 0, l = 1, 2, 3. (3-11)

Solving Eq. 3-11, one estimates b_1, b_2 and b_3. The EB estimators of p_lij (l = 1, 2, 3) are now given by

\hat p_lij^{EB} = (z_lij + λ m_lij(\hat b)) / (λ + 1), j = 1, ..., n_i, i = 1, ..., k, l = 1, 2, 3, (3-12)

where \hat b^T = (\hat b_1^T, \hat b_2^T, \hat b_3^T). Then the EB estimators of θ_1ij and θ_2ij are given by

\hat θ_1ij^{EB} = \hat p_1ij^{EB} + \hat p_3ij^{EB}, \hat θ_2ij^{EB} = \hat p_2ij^{EB} + \hat p_3ij^{EB}. (3-13)

We write \hat θ_i^B = (\hat θ_1i^B, \hat θ_2i^B)^T and \hat θ_i^{EB} = (\hat θ_1i^{EB}, \hat θ_2i^{EB})^T, where \hat θ_ri^B = Σ_{j=1}^{n_i} w_ij \hat θ_rij^B and \hat θ_ri^{EB} = Σ_{j=1}^{n_i} w_ij \hat θ_rij^{EB} (r = 1, 2; i = 1, ..., k).

We turn next to the MSEs. We first observe that for j ≠ j' and 1 ≤ l, l' ≤ 3, the pairs (p_lij, z_lij) and (p_l'ij', z_l'ij') are independent, so that cross-product terms involve only prior moments. First conditioning on the p_lij and then averaging over the Dirichlet prior, one obtains exact expressions for E[(\hat θ_ri^B - θ_ri)(\hat θ_si^B - θ_si)], r, s = 1, 2, in terms of λ and the m_lij(b); this gives MSE(\hat θ_i^B).

Next we approximate the difference between the EB and Bayes estimators. By a one-step Taylor expansion, for l = 1, 2, 3,

\hat p_lij^{EB} - \hat p_lij^B = [λ/(λ + 1)] {m_lij(\hat b) - m_lij(b)} ≈ [λ/(λ + 1)] (∂m_lij/∂b)^T (\hat b - b).

Next we find an approximation to E[(\hat b - b)(\hat b - b)^T] which is correct up to O(k^{-1}). To this end, let S_k(b) = Σ_{i=1}^k Σ_{j=1}^{n_i} D_ij^T Σ_ij^{-1} g_ij. Noting that E(g_ij) = 0,

E[-∂S_k/∂b] = Σ_{i=1}^k Σ_{j=1}^{n_i} D_ij^T Σ_ij^{-1} D_ij = Σ_{i=1}^k Σ_{j=1}^{n_i} [Δ_ij ⊗ x_ij x_ij^T]. (3-26)

We denote by U_k the expression on the rightmost side of Eq. 3-26, and we assume that U_k = O(k). This leads to the approximation

E[(\hat b - b)(\hat b - b)^T] = U_k^{-1} + o(k^{-1}). (3-27)

Combining these results, E[(\hat p_ij^{EB} - \hat p_ij^B)(\hat p_ij'^{EB} - \hat p_ij'^B)^T] := V_ijj' is obtained correct up to O(k^{-1}), and MSE(\hat θ_i^{EB}) equals MSE(\hat θ_i^B) plus a weighted combination of the V_ijj' terms, correct up to O(k^{-1}). The theorem follows.

We turn next to the estimation of the MSE. The approximation of Eq. 3-33 is second order correct, i.e. the first neglected term is o(k^{-1}). We may note that the second order term on the right hand side of Eq. 3-33 is O(k^{-1}); hence, substitution of \hat b for b estimates it correctly up to O(k^{-1}). However, estimation of MSE(\hat θ_i^B) by the simple substitution of \hat b for b is not adequate, since the first neglected term is O(k^{-1}). Hence, we need a bias-corrected estimate of MSE(\hat θ_i^B). To this end, we begin with the evaluation of E[m_1ij(\hat b)(1 - m_1ij(\hat b))]. By a two-step Taylor expansion,

m_1ij(\hat b) = m_1ij(b) + (\hat b - b)^T ∂m_1ij/∂b + (1/2)(\hat b - b)^T [∂²m_1ij/∂b ∂b^T](\hat b - b) + ...,

so that the required expectation involves both E[(\hat b - b)(\hat b - b)^T] and the bias E(\hat b - b). We follow Cox and Snell (1968) to find this expectation.
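The fitting pipeline above can be sketched for the intercept-only case: a gradient iteration solves the simplified estimating equations (Eq. 3-11), and the fitted cell means are then shrunk against the observed cell indicators as in Eq. 3-12. The counts, the value of λ and the unit response below are all illustrative.

```python
import math

def m_probs(b):
    """Intercept-only multinomial logit: m_l = exp(b_l)/(1 + sum exp(b_u)),
    with the fourth cell as baseline."""
    e = [math.exp(bl) for bl in b]
    s = 1.0 + sum(e)
    return [ei / s for ei in e] + [1.0 / s]

counts = [30, 20, 10, 40]           # totals of (z1, z2, z3, z4) over all units
n = sum(counts)
b = [0.0, 0.0, 0.0]
for _ in range(2000):               # gradient iteration solving Eq. 3-11
    m = m_probs(b)
    b = [bl + 0.1 * (counts[l] / n - m[l]) for l, bl in enumerate(b)]

m_hat = m_probs(b)                  # fitted prior cell means m_l(b_hat)
lam = 2.0                           # dispersion parameter, assumed known
z_unit = [0, 0, 1, 0]               # one unit observed in cell 3, i.e. (y1, y2) = (1, 1)
p_eb = [(z + lam * ml) / (lam + 1.0) for z, ml in zip(z_unit, m_hat)]
```

In the intercept-only case the estimating equations just match the fitted cell means to the observed cell proportions, which makes the solution easy to verify.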
To this end, we first note that ∂S_k/∂b is nonrandom. Hence, since E(S_k) = 0, the expectation of the product of any element of S_k with any element of ∂S_k/∂b is zero; that is, J = (J_{lu,rs,mn}) = (E[S_{k,lu} ∂S_{k,rs}/∂b_{mn}]) = 0. The Cox-Snell expansion for the bias therefore reduces to

E(\hat b - b) = -(1/2) U_k^{-1} [tr(U_k^{-1} K_{11}), ..., tr(U_k^{-1} K_{1p}), ..., tr(U_k^{-1} K_{31}), ..., tr(U_k^{-1} K_{3p})]^T + o(k^{-1}),

where the matrices K_{lu} (1 ≤ l ≤ 3, 1 ≤ u ≤ p) collect the expected third order derivative terms of the estimating function, with elements involving the third order moments of the z_lij (1 ≤ l ≠ l' ≠ l'' ≤ 3, 1 ≤ u ≤ p). Thus E(\hat b - b) = O(k^{-1}). From Eq. 3-39 - Eq. 3-42, one obtains the elements of the matrices K_{lu} for all l = 1, 2, 3 and u = 1, ..., p. Thus, from Eq. 3-36 - Eq. 3-42, the bias-corrected estimate of m_1ij(b)(1 - m_1ij(b)) is

m_1ij(\hat b)(1 - m_1ij(\hat b)) - (1/2)(1 - 2m_1ij(\hat b)) (∂m_1ij/∂b)^T \hat E(\hat b - b) - (1/2)(1 - 2m_1ij(\hat b)) tr[ (∂²m_1ij/∂b ∂b^T) U_k^{-1} ],

with all quantities evaluated at \hat b; analogous corrections apply to the cross-product terms m_pij(b) m_qij(b), (p, q) entries of the matrices D_ij(\hat b) and C_i(\hat b) = Σ_{j=1}^{n_i} w_ij² (terms in D_ij(\hat b)). Collecting terms yields the second order correct estimator of MSE(\hat θ_i^{EB}); we denote the right-hand side of the resulting expression, Eq. 3-43, by g(λ).

In the simulation study, the Average Relative Mean Squared Error (ARMSE), the Average Deviation (AD), the average MSE and the estimated MSE are computed by averaging over the small areas and over simulation replications, where the estimated MSE uses the second order correct estimate of the MSE as given by Eq. 3-43. In addition to the relative bias of the point estimates and the simulated MSEs, we report the relative bias of the estimates of the mean squared errors, defined with respect to the empirical MSE as

REB_i = [E{\widehat{MSE}(\hat θ_i)} - SMSE(\hat θ_i)] / SMSE(\hat θ_i).

As a matter of interest, we also computed the direct estimators to compare their performances with our empirical Bayes estimators. We observed that there was considerable improvement in terms of bias and MSE when we used our EB estimators as compared to the raw estimators; e.g., for λ = 5, the improvement was 48.5% for ARD and 98.1% for ARMSE.

Table 3-1.
Relative biases of the point estimates and relative bias of MSE estimates for the small area mean: the first variable.

k = 10
  λ = 0.5:  0.614   0.036   0.081   0.011    0.037   2.385
  λ = 1:    0.289   0.330   0.079   0.012    0.043   2.355
  λ = 5:    0.192   0.071   0.069   0.008    0.023   3.276
k = 20
  λ = 0.5:  0.358   2.320   0.080   0.0108   0.036   2.406
  λ = 1:    0.290   0.0402  0.089   0.013    0.039   2.347
  λ = 5:    0.180   0.005   0.068   0.0073   0.022   2.406

Table 3-2. Relative biases of the point estimates and relative bias of MSE estimates for the small area mean: the second variable.

k = 10
  λ = 0.5:  0.422   0.007   0.074   0.009    0.007   0.214
  λ = 1:    0.208   0.182   0.084   0.010    0.009   0.283
  λ = 5:    0.151   0.050   0.063   0.006    0.005   0.196
k = 20
  λ = 0.5:  0.209   0.355   0.075   0.010    0.007   0.221
  λ = 1:    0.290   0.008   0.079   0.011    0.008   0.225
  λ = 5:    0.1412  0.023   0.062   0.006    0.004   0.206

Missing covariates arise routinely in survey data (see, e.g., Little and Rubin 2002). We have incorporated ideas from Lipsitz and Ibrahim (1996) and Ibrahim, Lipsitz, and Chen (1999) to treat the missing covariates.

The general HB models which accommodate this missingness are discussed in Section 2. We have, after suitable reparametrization, converted the bivariate binary likelihood into a multinomial likelihood, and subsequently utilized the multinomial-Poisson relationship, leading eventually to a Poisson reparameterization. Bayesian analogues of log-linear models are used to estimate the Poisson parameters. We have applied the HB methodology to obtain the small area estimates, the associated standard errors and posterior correlations. Due to the analytical intractability of the posterior, the HB procedure is implemented via Markov chain Monte Carlo (MCMC) integration techniques, in particular the Gibbs sampler. In Section 3, the methodology is illustrated with real data related to low birth weight and infant mortality.

As in Chapter 3, the joint distribution of the bivariate binary response is

f(y_ij1, y_ij2) = exp(y_ij1 θ_ij1 + y_ij2 θ_ij2 + y_ij1 y_ij2 θ_ij3) / [1 + exp(θ_ij1) + exp(θ_ij2) + exp(θ_ij1 + θ_ij2 + θ_ij3)], (4-1)

with cell probabilities p_ij1, p_ij2, p_ij3 defined as before and p_ij4 = 1 - Σ_{l=1}^3 p_ijl. Then, writing z_ij1 = y_ij1(1 - y_ij2), z_ij2 = y_ij2(1 - y_ij1), z_ij3 = y_ij1 y_ij2 and z_ij4 = (1 - y_ij1)(1 - y_ij2), we get a multinomial likelihood in the z's. Next we write p_ijl = λ_ijl / Σ_{r=1}^4 λ_ijr and use the well-known relationship between the multinomial and Poisson distributions, namely that (z_ij1, z_ij2, z_ij3, z_ij4) has the same distribution as the joint conditional distribution of (u_ij1, u_ij2, u_ij3, u_ij4) given Σ_{q=1}^4 u_ijq = 1, where the u_ijq are independent Poisson(λ_ijq) (q = 1, 2, 3, 4).
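The multinomial-Poisson reduction above is easy to sketch: the multinomial cell probabilities are recovered by normalizing the Poisson rates, and the two small area indicators are marginal sums of cells. The rate values below are illustrative only.

```python
def cells_from_rates(lam):
    """p_l = lam_l / sum(lam): multinomial cell probabilities from Poisson rates."""
    s = sum(lam)
    return [l / s for l in lam]

p = cells_from_rates([0.4, 0.9, 0.3, 2.4])
theta1 = p[0] + p[2]    # P(y1 = 1) = p1 + p3
theta2 = p[1] + p[2]    # P(y2 = 1) = p2 + p3
```

Because only the normalized rates matter, the log-linear model for the rates can absorb a common area effect v_i without changing the cell probabilities' sum-to-one constraint.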
We begin modelling the λ_ijl as

log λ_ijl = x_ij^T b_l + v_i, l = 1, ..., 4, (4-4)

where x_ij = (x_ij1, x_ij2, ..., x_ijq)^T, b_l = (b_l1, b_l2, ..., b_lq)^T, and the v_i are iid N(0, σ²). In the presence of missing covariates, we write x_ij^T = (x_ij,mis^T, x_ij,obs^T) and assume that the missingness is MAR. When some of the components of x_ij are missing, we model only those missing components for the necessary imputation. Further, in order to reduce the number of nuisance parameters, we model the joint distribution of these missing components as the product of several smaller dimensional conditional distributions, as proposed in Lipsitz and Ibrahim (1996) and Ibrahim, Lipsitz, and Chen (1999). Specifically, we partition x_ij,mis into several component vectors, say x_ij,mis^(1), ..., x_ij,mis^(r), parameterized by α_1, ..., α_r, and specify their distributions conditional on the observed x_ij,obs. (4-5)

It is important to note that, since we are assuming MAR missing covariates, we do not need to model the missing data mechanism.

Write z_ij = (z_ij1, z_ij2, z_ij3, z_ij4)^T and b^T = (b_1^T, b_2^T, b_3^T, b_4^T). Let D = (z_ij, x_ij; j = 1, 2, ..., n_i, i = 1, 2, ..., k) denote the complete data, and let D_obs = (z_ij, x_ij,obs; j = 1, 2, ..., n_i, i = 1, 2, ..., k) denote the observed data. Then the complete data likelihood under the hierarchical model is given by

L(b, σ², α | D) = Π_{i=1}^k [ Π_{j=1}^{n_i} Π_{l=1}^4 exp{ z_ijl(x_ij^T b_l + v_i) - exp(x_ij^T b_l + v_i) } f(x_ij,mis | x_ij,obs, α) ] φ(v_i; 0, σ²), (4-6)

and the observed data likelihood is of the form

L(b, σ², α | D_obs) = ∫ Π_{i=1}^k ∫ [ Π_{j=1}^{n_i} Π_{l=1}^4 exp{ z_ijl(x_ij^T b_l + v_i) - exp(x_ij^T b_l + v_i) } f(x_ij,mis | x_ij,obs, α) ] φ(v_i; 0, σ²) dv_i dν(x_mis), (4-7)

where ν is some σ-finite measure, typically the Lebesgue measure or the counting measure. We assume that the b_l (l = 1, ..., 4), σ² and α are a priori mutually independent, with b_l ~ uniform(R^q) for l = 1, ..., 4 and σ² ~ InverseGamma(c, d), where the InverseGamma(c, d) pdf is proportional to (σ²)^{-(c+1)} exp(-d/σ²).

Direct evaluation of the posterior moments of the p_ijl is not possible, as it entails numerical integrals which are very difficult to compute. We use the Gibbs sampling method instead. Let x_mis denote the full vector of all the x_ij,mis, and v = (v_1, ..., v_k)^T. To facilitate Gibbs sampling, we consider the following joint posterior based on the complete data likelihood:

π(b, σ², α, v, x_mis | D_obs) ∝ L(b, σ², α | D) π(σ²), (4-9)

where L(b, σ², α | D) is given by Eq. 4-6. It is easy to show that, after integrating out v and x_mis from Eq. 4-9, the joint marginal posterior of (b, σ², α) is Eq. 4-8.

To prove the propriety of the posterior, we will mainly follow the techniques given in Chen and Shao (2000), Chen et al. (2004) and Chen et al.
(2002). We note that our covariates are bounded. We assume that a_ij ≤ x_ij,mis ≤ b_ij, which means all the components of x_ij,mis are bounded by a_ij and b_ij. We start by introducing some notation: let M = {(i, j): at least one component of x_ij is missing}, M_ij = {1 ≤ r ≤ q: x_ijr is missing}, and M_r = {(i, j): r ∈ M_ij}, where j = 1, ..., n_i, i = 1, ..., k.

Following Chen, Ibrahim, and Shao (2004), denote by I_1l = {(i, j): z_ijl = 0} and I_2l = {(i, j): z_ijl ≥ 1}, and define bounding functions f_1 and f_2, non-increasing with f_1(∞) = f_2(∞) = 0 and f_1(-∞) < ∞, f_2(-∞) < ∞, together with a non-decreasing f_3 with f_3(-∞) = 0 and f_3(∞) < ∞. Let n = Σ_{i=1}^k n_i and let N_l2 be the cardinality of I_2l, l = 1, ..., 4. Denote u_ij^T = (x_ij^T, e_i^T), with e_i ∈ R^k the basis vector with 1 at the i-th component. Let U^(l) be the (n + N_l2) × (q + k) design matrix built from the rows w_ij^(l) u_ij^T, with w_ij^(l) = 1 if (i, j) ∈ I_1l ∪ I_2l and w_{n+ij}^(l) = 1 if (i, j) ∈ I_2l. (4-12)

For 1 ≤ r ≤ q and 1 ≤ l ≤ 4, let \hat a_lij = a_ij if (i, j) ∈ I_1l ∪ I_2l and b_ij otherwise, with \hat a_{l,n+ij} = b_ij if (i, j) ∈ I_2l. (4-13)

We are now led to the following theorem.

Theorem 4-1. Suppose that the matrix U^(l) defined in Eq. 4-12 is of full rank for each l, together with further regularity conditions on the bounding functions; then the joint posterior in Eq. 4-9 is proper.

We defer the proof of this theorem to the appendix.

To implement the Gibbs sampling algorithm, we sample from the following conditional distributions in turn: (i) π(b | v, x_mis, D_obs), (ii) π(v | b, σ², x_mis, D_obs), (iii) π(σ² | v, D_obs), (iv) π(x_mis | b, v, α, D_obs), and (v) π(α | x_mis, D_obs).

For (i), given v, x_mis and D_obs, the b_l's are conditionally independent with

π(b_l | v, x_mis, D_obs) ∝ exp( Σ_{i=1}^k Σ_{j=1}^{n_i} [ z_ijl(x_ij^T b_l + v_i) - exp(x_ij^T b_l + v_i) ] ),

which is log-concave in each component of b_l, so we use the adaptive rejection sampling scheme of Gilks and Wild (1992) for l = 1, ..., 4. For (ii), the conditional density of each v_i is likewise log-concave, and adaptive rejection sampling applies again (Gilks and Wild 1992) for i = 1, 2, ..., k. For (iii), we have

σ² | v, D_obs ~ InverseGamma( c + k/2, d + (1/2) Σ_{i=1}^k v_i² ).

For (iv) and (v), the conditionals are determined by the models for the missing covariates; since the missingness is MAR, the missing data mechanism itself need not be modelled (Little and Rubin 2002).
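Step (iii) of the Gibbs sampler is a closed-form draw. A minimal sketch, assuming the standard InverseGamma(c + k/2, d + Σv_i²/2) conditional stated above and illustrative values for the area effects; it uses the fact that rate/Gamma(shape, 1) is an InverseGamma(shape, rate) draw.

```python
import random

def draw_sigma2(v, c, d, rng):
    """One draw of sigma^2 from its inverse-gamma full conditional given v."""
    shape = c + len(v) / 2.0
    rate = d + 0.5 * sum(vi * vi for vi in v)
    return rate / rng.gammavariate(shape, 1.0)   # InverseGamma(shape, rate)

rng = random.Random(7)
v = [0.3, -0.5, 0.1, 0.8, -0.2]                  # illustrative area effects
draws = [draw_sigma2(v, c=1.0, d=1.0, rng=rng) for _ in range(5000)]
mean_draw = sum(draws) / len(draws)              # should be near rate/(shape-1)
```

Alternating this draw with the adaptive-rejection steps for b and v gives the full sampler; the inverse-gamma step is the only one that requires no rejection machinery.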
We use our method to estimate the infant mortality rate and the low birth weight rate simultaneously over a period of 5 years for several small areas in one of the states (name withheld for confidentiality). We constructed the small areas by cross-classifying the different cities in the state with mother's race, namely Non-Hispanic White, Non-Hispanic Black, Non-Hispanic American Indian, Non-Hispanic Asian/Pacific Islander and Hispanic. We get 48 such small domains. For each small domain, we consider the NCHS full data as constituting our population values, and we draw a 2% simple random sample from this population of individuals. Thus the population low birth weight and infant mortality rates are known for these small domains, and will serve as the "gold standard" for comparison with the different estimates. In particular, we will find the HB estimates and compare them with the direct estimates, namely the sample means. In our analysis we consider the response variables and covariates as follows: y_ij1 = infant death (1 = yes, 0 = no); y_ij2 = low birth weight (1 = yes, 0 = no); x_ij1 = 1, denoting the intercept; (x_ij2, x_ij3) = (0,0), (1,0) or (0,1) according as mother's schooling is less than 12 years, equal to 12 years, or greater than 12 years; (x_ij4, x_ij5) = (0,0), (1,0) or (0,1) if mother's age is less than 20 years, between 20 and 34 years, or greater than 34 years; and x_ij6 = 0 or 1 if the mother does not smoke, or the average number of cigarettes smoked during pregnancy is at least 1. Here y = (y_ij1, y_ij2)^T is the response vector with the covariates x_ij. The partially missing covariates are modelled via

P(x_ij2 = 1 | x_ij4, x_ij5, x_ij6, α) = exp(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) / [1 + exp(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + exp(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)],

P(x_ij3 = 1 | x_ij4, x_ij5, x_ij6, α) = exp(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6) / [1 + exp(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + exp(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)],

together with a model f(x_ij6 | α_6) for the smoking indicator.

To prove the propriety of the posterior under a uniform prior for α, we start with the complete likelihood for the covariates:

L(α_6, α_2, α_3 | D_comp) = Π_{i=1}^k Π_{j=1}^{n_i} f(x_ij6 | α_6) exp{ x_ij2(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + x_ij3(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6) } / [1 + exp(α_21 + α_22 x_ij4 + α_23 x_ij5 + α_24 x_ij6) + exp(α_31 + α_32 x_ij4 + α_33 x_ij5 + α_34 x_ij6)].

We consider the following four cases.
The observed-data likelihood of $\alpha$ takes the form
$$L(\alpha \mid D_{obs}) = \prod_{(i,j)\in J_1}\sum_{x_{ij6,mis}}\Big[f(x_{ij6}\mid\alpha_6)\frac{\exp(u_{1ij}^T\alpha)}{1+\exp(u_{1ij}^T\alpha)+\exp(u_{2ij}^T\alpha)}\Big]\prod_{(i,j)\in J_2}\sum_{x_{ij6,mis}}\Big[f(x_{ij6}\mid\alpha_6)\frac{\exp(u_{2ij}^T\alpha)}{1+\exp(u_{1ij}^T\alpha)+\exp(u_{2ij}^T\alpha)}\Big]\prod_{(i,j)\in J_3}\sum_{x_{ij6,mis}}\Big[f(x_{ij6}\mid\alpha_6)\frac{1}{1+\exp(u_{1ij}^T\alpha)+\exp(u_{2ij}^T\alpha)}\Big],$$
where $J_1$, $J_2$ and $J_3$ partition the records according to the observed schooling category $(x_{ij2}, x_{ij3}) = (1,0)$, $(0,1)$ and $(0,0)$ respectively, and $u_{1ij}$, $u_{2ij}$ collect the corresponding regressors.

We follow mainly the techniques of Chen and Shao (2000) and Chen, Ibrahim, and Shao (2006) to prove the propriety of the posterior. First, for $(i,j) \in M_6^c = \{(i,j): x_{ij6} \text{ is observed}\}$, let $\bar x_{ij} = (x_{ij1}, x_{ij4}, x_{ij5})$ and let $\bar X$ be the $n^* \times 3$ design matrix with rows $\bar x_{ij}$, where $n^*$ is the cardinality of $M_6^c$. Let $z_{ij} = 1 - 2x_{ij6}$, and let $X^*$ be the matrix with rows $z_{ij}\bar x_{ij}$, $(i,j) \in M_6^c$. We then state regularity conditions (H1), (H2), (D1) and (D2) on $X^*$, $\bar X$ and $f(\cdot \mid \alpha_6)$, of the full-rank type used by Chen and Shao (2000), and the following theorem: under these conditions the posterior of $\alpha$ is proper. The proof is again deferred to the appendix.

For our analysis, we assume an Inverse Gamma(1,1) prior for $\sigma^2$. Table 4-1 provides the sample sizes for all the 48 small domains, the population proportions of low birth weight and infant mortality (P1, P2), the corresponding sample means (D1, D2), and the corresponding HB estimates (HB1, HB2).

Table 4-1. Bivariate HB estimates along with the direct estimates and the true values

area    ni     P1     P2     D1     D2     HB1    HB2
  1    695   0.007  0.068  0.004  0.062  0.006  0.080
  2      3   0.015  0.099  0.000  0.000  0.007  0.082
  3     41   0.006  0.061  0.000  0.122  0.011  0.108
  4   1477   0.006  0.068  0.006  0.071  0.006  0.075
  5    425   0.014  0.131  0.024  0.150  0.006  0.077
  6    345   0.006  0.067  0.008  0.064  0.006  0.082
  7    510   0.005  0.067  0.004  0.061  0.009  0.102
  8    888   0.009  0.116  0.008  0.114  0.009  0.116
  9     44   0.004  0.081  0.000  0.091  0.005  0.073
 10   1718   0.004  0.063  0.002  0.058  0.007  0.083
 11    379   0.006  0.068  0.005  0.082  0.005  0.075
 12    126   0.013  0.130  0.030  0.135  0.007  0.082
 13      5   0.005  0.068  0.000  0.200  0.009  0.091
 14     15   0.003  0.077  0.000  0.067  0.006  0.076
 15     13   0.003  0.070  0.000  0.000  0.007  0.083
 16    975   0.005  0.072  0.007  0.060  0.010  0.107
 17    349   0.011  0.135  0.017  0.149  0.007  0.083
 18      4   0.005  0.112  0.000  0.500  0.007  0.080
 19     58   0.003  0.074  0.000  0.069  0.005  0.078
 20    288   0.005  0.082  0.003  0.090  0.007  0.082
 21     27   0.002  0.078  0.000  0.111  0.007  0.079
 22    237   0.004  0.067  0.000  0.076  0.006  0.076
 23    987   0.005  0.068  0.004  0.078  0.006  0.079
 24      6   0.019  0.052  0.000  0.000  0.005  0.078
 25     50   0.008  0.084  0.000  0.040  0.008  0.084
 26    365   0.004  0.059  0.003  0.088  0.004  0.072
 27    453   0.008  0.064  0.011  0.077  0.005  0.073
 28     78   0.019  0.120  0.012  0.128  0.005  0.078
 29      4   0.016  0.093  0.000  0.000  0.007  0.081
 30     15   0.004  0.076  0.000  0.133  0.005  0.072
 31     19   0.007  0.064  0.000  0.158  0.007  0.086
 32    446   0.008  0.071  0.002  0.074  0.005  0.073
 33     76   0.015  0.133  0.000  0.145  0.005  0.069
 34      8   0.008  0.085  0.000  0.125  0.007  0.086
 35     22   0.006  0.056  0.045  0.136  0.009  0.093
 36    290   0.014  0.133  0.014  0.131  0.005  0.073
 37      3   0.009  0.074  0.000  0.000  0.008  0.086
 38     18   0.006  0.061  0.000  0.111  0.006  0.082
 39    933   0.005  0.066  0.004  0.062  0.007  0.084
 40    207   0.012  0.119  0.019  0.126  0.006  0.082
 41      4   0.000  0.084  0.000  0.000  0.008  0.085
 42     21   0.008  0.084  0.000  0.048  0.007  0.089
 43   1472   0.007  0.070  0.007  0.071  0.008  0.089
 44     27   0.004  0.073  0.000  0.037  0.005  0.076
 45    275   0.006  0.067  0.003  0.069  0.005  0.072
 46   1131   0.006  0.065  0.008  0.060  0.008  0.085
 47    191   0.014  0.125  0.016  0.126  0.008  0.087
 48    303   0.006  0.060  0.003  0.066  0.008  0.085

Writing $\hat e_i$ for a generic estimate and $T_i$ for the corresponding true proportion in domain $i$, the Average Relative Deviation (ARD) is given by
$$ARD = \frac{1}{48}\sum_{i=1}^{48}\frac{|\hat e_i - T_i|}{T_i}.$$
The Average Relative Mean Squared Error (ARMSE) is given by
$$ARMSE = \frac{1}{48}\sum_{i=1}^{48}\frac{(\hat e_i - T_i)^2}{T_i^2}.$$
The Average Deviation (AD) is denoted by
$$AD = \frac{1}{48}\sum_{i=1}^{48}|\hat e_i - T_i|,$$
and the Average MSE is given by
$$AMSE = \frac{1}{48}\sum_{i=1}^{48}(\hat e_i - T_i)^2.$$

Table 4-2 summarizes the performance of the HB estimates, and also of the direct estimates. It follows from Table 4-2 that the HB estimates outperform the direct estimates according to all four criteria. For the HB estimates, the ARD, ARMSE, AD and AMSE improvements over the direct estimates are 39.39%, 74.88%, 37.93% and 71.43% respectively, for estimating the infant mortality rate. For the low birth weight proportions, the corresponding improvements are 46.59%, 85.09%, 43.05% and 86% respectively.

Table 4-2. Measures of precision

           D1        D2        HB1       HB2
ARD       0.7973    0.4640    0.4832    0.2478
ARMSE     1.6525    0.6244    0.4151    0.0931
AD        0.0058    0.0367    0.0036    0.0209
AMSE      0.00007   0.005     0.00002   0.0007

We used the posterior predictive p-values of Gelman et al. (2003, Ch. 6) for assessing the model fit. The posterior predictive p-value for the $i$th area is defined as
$$pp_i = \Pr\big(T(y_{rep}, \theta; i) \ge T(y, \theta; i) \mid D_{obs}\big).$$
If the model is reasonably accurate, the replications should look similar to the observed data $y$. In the case where the data $y$ are in conflict with the posited model, the $T(y, \theta; i)$'s are expected to have extreme values, which in turn should give p-values close to 0 or 1. Hence, an ideal fit requires the posterior predictive p-values to be in the ballpark of 0.5. The figures provided in Table 4-3 show that this is indeed the case for our fitted model.
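The four precision criteria can be computed directly from the estimate and truth vectors. The sketch below is illustrative, not the dissertation's code; it assumes the standard readings of the four names (absolute and squared deviations, on the raw and relative scales, averaged over the 48 domains), and the function name is hypothetical.

```python
import numpy as np

def precision_measures(est, true):
    """Average precision criteria over small domains.

    Assumed definitions: ARD and AD average the absolute deviations on the
    relative and raw scales respectively; ARMSE and AMSE do the same for
    squared deviations.
    """
    est = np.asarray(est, dtype=float)
    true = np.asarray(true, dtype=float)
    rel = (est - true) / true  # relative errors, domain by domain
    return {
        "ARD": float(np.mean(np.abs(rel))),
        "ARMSE": float(np.mean(rel ** 2)),
        "AD": float(np.mean(np.abs(est - true))),
        "AMSE": float(np.mean((est - true) ** 2)),
    }
```

Applied to a column of Table 4-1 (say HB1) against the truth (P1), this reproduces the kind of summary reported in Table 4-2, and the percentage improvements quoted in the text are then `1 - measure(HB)/measure(D)` for each criterion.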
Table 4-3. Posterior variances, correlation and 95% HPD intervals, along with predictive p-values

area   SE1     SE2     CORR    I1               I2               p-value
  1   0.001   0.002   0.115   (0.005, 0.007)   (0.076, 0.085)   0.536
  2   0.001   0.002   0.151   (0.005, 0.008)   (0.077, 0.087)   0.544
  3   0.002   0.006   0.181   (0.007, 0.015)   (0.097, 0.119)   0.556
  4   0.001   0.002   0.124   (0.004, 0.007)   (0.071, 0.080)   0.520
  5   0.001   0.002   0.121   (0.005, 0.007)   (0.072, 0.081)   0.572
  6   0.001   0.002   0.099   (0.005, 0.007)   (0.077, 0.086)   0.552
  7   0.002   0.005   0.153   (0.006, 0.013)   (0.092, 0.112)   0.496
  8   0.009   0.130   0.149   (0.008, 0.102)   (0.005, 0.073)   0.548
  9   0.001   0.002   0.168   (0.004, 0.007)   (0.069, 0.078)   0.536
 10   0.001   0.003   0.106   (0.006, 0.009)   (0.078, 0.088)   0.588
 11   0.001   0.002   0.170   (0.004, 0.006)   (0.071, 0.080)   0.544
 12   0.001   0.003   0.170   (0.006, 0.009)   (0.077, 0.087)   0.524
 13   0.001   0.004   0.152   (0.006, 0.011)   (0.084, 0.099)   0.552
 14   0.001   0.002   0.151   (0.005, 0.007)   (0.072, 0.081)   0.536
 15   0.001   0.003   0.144   (0.005, 0.009)   (0.078, 0.089)   0.540
 16   0.002   0.005   0.178   (0.007, 0.013)   (0.098, 0.117)   0.496
 17   0.001   0.002   0.162   (0.006, 0.009)   (0.078, 0.088)   0.472
 18   0.001   0.003   0.105   (0.005, 0.008)   (0.075, 0.085)   0.596
 19   0.001   0.002   0.154   (0.004, 0.007)   (0.074, 0.083)   0.508
 20   0.001   0.002   0.174   (0.006, 0.009)   (0.078, 0.087)   0.476
 21   0.001   0.004   0.126   (0.004, 0.009)   (0.071, 0.087)   0.556
 22   0.001   0.002   0.142   (0.004, 0.007)   (0.072, 0.081)   0.524
 23   0.001   0.002   0.097   (0.005, 0.008)   (0.075, 0.084)   0.612
 24   0.001   0.002   0.181   (0.004, 0.007)   (0.073, 0.083)   0.540
 25   0.001   0.003   0.072   (0.006, 0.009)   (0.079, 0.089)   0.576
 26   0.001   0.004   0.104   (0.002, 0.005)   (0.064, 0.079)   0.596
 27   0.001   0.002   0.102   (0.004, 0.006)   (0.069, 0.078)   0.604
 28   0.001   0.002   0.157   (0.004, 0.007)   (0.073, 0.083)   0.560
 29   0.001   0.002   0.132   (0.005, 0.008)   (0.076, 0.085)   0.604
 30   0.001   0.003   0.106   (0.004, 0.006)   (0.067, 0.077)   0.492
 31   0.001   0.003   0.201   (0.005, 0.008)   (0.081, 0.091)   0.604
 32   0.001   0.003   0.172   (0.004, 0.007)   (0.067, 0.078)   0.540
 33   0.001   0.002   0.148   (0.004, 0.007)   (0.064, 0.074)   0.496
 34   0.001   0.003   0.106   (0.005, 0.008)   (0.081, 0.091)   0.492
 35   0.002   0.004   0.163   (0.007, 0.012)   (0.085, 0.101)   0.588
 36   0.001   0.003   0.080   (0.004, 0.006)   (0.068, 0.079)   0.544
 37   0.001   0.003   0.191   (0.006, 0.011)   (0.079, 0.092)   0.544
 38   0.001   0.002   0.088   (0.005, 0.007)   (0.077, 0.086)   0.532
 39   0.001   0.003   0.146   (0.006, 0.009)   (0.079, 0.089)   0.556
 40   0.001   0.004   0.140   (0.004, 0.008)   (0.075, 0.089)   0.536
 41   0.001   0.004   0.152   (0.005, 0.010)   (0.078, 0.093)   0.536
 42   0.001   0.003   0.102   (0.006, 0.009)   (0.084, 0.094)   0.488
 43   0.001   0.003   0.156   (0.006, 0.010)   (0.083, 0.094)   0.536
 44   0.001   0.004   0.052   (0.003, 0.007)   (0.069, 0.084)   0.544
 45   0.001   0.002   0.105   (0.004, 0.006)   (0.067, 0.076)   0.560
 46   0.001   0.003   0.179   (0.006, 0.010)   (0.079, 0.091)   0.552
 47   0.001   0.003   0.164   (0.006, 0.009)   (0.081, 0.092)   0.508
 48   0.001   0.003   0.241   (0.006, 0.010)   (0.079, 0.091)   0.564

The other part of this dissertation is small area estimation when the response is bivariate binary in the presence of covariates. We have first developed an EB estimation procedure, followed by a hierarchical Bayes procedure. In the latter case, we have covariates which are missing at random. We have shown that both methods perform much better than the direct estimators, through a simulation study in the former case and real data analysis in the latter situation.

In this dissertation, we have considered only area-level small area estimation models for the case of robust estimation of parameters. It is worthwhile to extend the proposed approach to unit-level small area estimation models as considered, for example, in Battese et al. (1988). A more challenging task is to extend these ideas to non-normal models, especially for some of the discrete distributions involving binary, count or categorical data. The posterior of the regression coefficient in such instances does not come out in a closed form. A more manageable alternative seems to involve a prediction problem where one decides to look at the predictive distribution of a future observation, and use the same to derive the influence of a given observation.

A natural extension of the first part of this dissertation would be to develop similar robust procedures in the case of general linear mixed models. As noted in the introduction, the normal model considered in the dissertation, and also the other ...

Another future undertaking will be extending the results for the bivariate binary case to the multivariate context. Also, we have developed EB techniques when there are no missing covariates. It will be interesting to extend these results when some of the covariates are missing.

Next we write the difference $\hat\theta_i^{REB} - \hat\theta_i^{B}$ as $(\hat\theta_i^{REB} - \tilde\theta_i^{REB}) + (\tilde\theta_i^{REB} - \hat\theta_i^{B})$. Now observe that $\hat\theta_i^{REB} - \tilde\theta_i^{REB}$ involves the differences $\hat B_i - B_i$ and $\hat D_i - D_i$ together with $y_i - x_i^T\tilde b$.

We first show that $x_i^T(\hat b - \tilde b) = O_p(m^{-1})$. To this end, we begin with the one-step Taylor expansion
$$\hat b \doteq \tilde b + (\hat A - A)\frac{\partial\tilde b}{\partial A}.$$
By assumption (i), writing $\Sigma = \mathrm{diag}(V_1 + A, \dots, V_m + A)$, the matrices $(X^T\Sigma^{-1}X)^{-1}$, $X^T\Sigma^{-2}X$ and $X^T\Sigma^{-3}X$ are bounded by multiples of $(X^TX)^{-1}$ and $X^TX$. Hence from Eq. A-6, $V(\partial\tilde b/\partial A) \le Cm^{-1}$, where $C\,(>0)$ is a constant. This implies that $x_i^T(\hat b - \tilde b) = O_p(m^{-1})$. Accordingly, since $|\hat D_i - D_i| = O_p(m^{-1/2})$ and $x_i^T(\hat b - \tilde b) = O_p(m^{-1})$, it follows from Eq.
A-8 that
$$\hat\theta_i^{REB} - \tilde\theta_i^{REB} = (\hat B_i - B_i)(y_i - x_i^T\tilde b) + \hat B_i\Big[\{(y_i - x_i^T\hat b - K\hat D_i) - (y_i - x_i^T\tilde b - KD_i)\}\,I[y_i - x_i^T\tilde b > KD_i] + \{(y_i - x_i^T\hat b + K\hat D_i) - (y_i - x_i^T\tilde b + KD_i)\}\,I[y_i - x_i^T\tilde b < -KD_i]\Big] + o_p(m^{-1}).$$
Next, write
$$\hat A - A \doteq \Big[\sum_{j=1}^{m}(V_j + A)^{-2}\Big]^{-1}\sum_{j=1}^{m}(V_j + A)^{-2}\{(y_j - x_j^Tb)^2 - (V_j + A)\}, \tag{A-14}$$
and write
$$(y_i - x_i^T\tilde b)^2 = [(y_i - x_i^Tb) - x_i^T(\tilde b - b)]^2 = (y_i - x_i^Tb)^2 + O_p(m^{-1/2}).$$
Also, $I[|y_i - x_i^T\tilde b| \le KD_i] = I[|y_i - x_i^Tb| \le KD_i] + o_p(m^{-r})$ for arbitrarily large $r > 0$. Since $(\hat A - A)^2 = O_p(m^{-1})$, we now get the corresponding expansion with $b$ and $D_i$ in place of $\tilde b$ and $\hat D_i$. Next, by the mutual independence of the $y_i - x_i^Tb$ $(i = 1,\dots,m)$ and assumption (i) of Theorem 2-3,
$$\begin{aligned}
\operatorname{Cov}&\big[(\hat A - A)^2,\ (y_i - x_i^Tb)^2 I[|y_i - x_i^Tb| \le KD_i]\big]\\
&\doteq \Big[\sum_{j=1}^{m}(V_j + A)^{-2}\Big]^{-2}\operatorname{Cov}\Big[\Big\{\sum_{j=1}^{m}(V_j + A)^{-2}\{(y_j - x_j^Tb)^2 - (V_j + A)\}\Big\}^2,\ (y_i - x_i^Tb)^2 I[|y_i - x_i^Tb| \le KD_i]\Big]\\
&= \Big[\sum_{j=1}^{m}(V_j + A)^{-2}\Big]^{-2}\operatorname{Cov}\Big[(V_i + A)^{-4}\{(y_i - x_i^Tb)^2 - (V_i + A)\}^2,\ (y_i - x_i^Tb)^2 I[|y_i - x_i^Tb| \le KD_i]\Big] = O(m^{-2}).
\end{aligned}$$
Hence, from Eq. A-16 and the fact that $D_i^2 = V_i + A + O(m^{-1})$,
$$\begin{aligned}
E\big[(\hat A - A)^2(y_i - x_i^T\tilde b)^2 I[|y_i - x_i^T\tilde b| \le KD_i]\big] &= E(\hat A - A)^2\,E\big[(y_i - x_i^Tb)^2 I[|y_i - x_i^Tb| \le K(V_i + A)^{1/2}]\big] + o(m^{-1})\\
&= 2\Big[\sum_{j=1}^{m}(V_j + A)^{-2}\Big]^{-1}(V_i + A)\,E[Z^2 I(|Z| \le K)] + o(m^{-1})\\
&= 2\Big[\sum_{j=1}^{m}(V_j + A)^{-2}\Big]^{-1}(V_i + A)\,[2\Phi(K) - 1 - 2K\phi(K)] + o(m^{-1}).
\end{aligned}$$
From Eq. A-13 and Eq. A-17, a two-term Taylor expansion of $(V_i + \hat A)^{1/2}$ and $V_i/(V_i + \hat A)$ around $A$, using
$$\frac{V_i(A - \hat A)}{(V_i + A)(V_i + \hat A)} = \frac{V_i(A - \hat A)}{(V_i + A)^2}\Big(1 + \frac{\hat A - A}{V_i + A}\Big)^{-1} = \frac{V_i(A - \hat A)}{(V_i + A)^2}\Big[1 - \frac{\hat A - A}{V_i + A} + O_p(m^{-1})\Big] = -B_i\frac{\hat A - A}{V_i + A} + B_i\frac{(\hat A - A)^2}{(V_i + A)^2} + o_p(m^{-1}),$$
gives
$$\hat B_i(\hat D_i - D_i) = \frac{B_i}{2}\frac{\hat A - A}{(V_i + A)^{1/2}} - \frac{5B_i}{8}\frac{(\hat A - A)^2}{(V_i + A)^{3/2}} + o_p(m^{-1}),$$
so that
$$\hat B_i(\hat D_i - D_i) + D_i(\hat B_i - B_i) = -\frac{B_i}{2}\frac{\hat A - A}{(V_i + A)^{1/2}} + \frac{3B_i}{8}\frac{(\hat A - A)^2}{(V_i + A)^{3/2}} + o_p(m^{-1}),$$
and hence
$$4B_i^2(V_i + A)^{-1}E\{(\hat A - A)^2\}\,E\{I[|y_i - x_i^T\tilde b| > KD_i]\} + o(m^{-1}) = B_i^2(V_i + A)(1 - B_i)^2\Big[\sum_{j=1}^{m}(1 - B_j)^2\Big]^{-1}\,16\{1 - \Phi(K)\} + o(m^{-1}) = B_i^2 D_i^2(1 - B_i)^2\Big[\sum_{j=1}^{m}(1 - B_j)^2\Big]^{-1}\,16\{1 - \Phi(K)\} + o(m^{-1}).$$
Combining Eq. A-18 and Eq. A-20, it follows that these terms contribute to the mean squared error only at the stated order. Finally, by Eq. 2-13 and Eq.
A-19 we get
$$(\tilde\theta_i^{REB} - \hat\theta_i^{B})(\hat\theta_i^{REB} - \tilde\theta_i^{REB}) = B_i\big\{x_i^T(\tilde b - b) + (y_i - x_i^T\tilde b - KD_i)I[y_i - x_i^T\tilde b > KD_i] + (y_i - x_i^T\tilde b + KD_i)I[y_i - x_i^T\tilde b < -KD_i]\big\}\,(\hat\theta_i^{REB} - \tilde\theta_i^{REB}),$$
and, expanding the second factor as before, the leading contribution is of the form
$$\frac{KB_i^2}{2}\Big\{\frac{\hat A - A}{V_i + A} - \frac{3(\hat A - A)^2}{4(V_i + A)^2}\Big\}\big\{(y_i - x_i^T\tilde b - KD_i)I[y_i - x_i^T\tilde b > KD_i] - (y_i - x_i^T\tilde b + KD_i)I[y_i - x_i^T\tilde b < -KD_i]\big\}.$$
Similarly, the expectation
$$E\Big[(\hat A - A)^2(V_i + A)^{-3/2}\big\{(y_i - x_i^T\tilde b - KD_i)I[y_i - x_i^T\tilde b > KD_i] - (y_i - x_i^T\tilde b + KD_i)I[y_i - x_i^T\tilde b < -KD_i]\big\}\Big]$$
is handled by the same conditioning argument and is negligible to the required order.

The joint posterior of $(b, \alpha, v, \sigma^2)$ is given in Eq. B-1. Here $b = (b_1, b_2, b_3, b_4)$, $\pi(b) \propto 1$ and $\pi(\alpha) \propto 1$. Now, let $\theta_{ijl} = x_{ij}^Tb_l + v_i$, $l = 1,\dots,4$, $j = 1,\dots,n_i$, $i = 1,\dots,k$. Let us first integrate the posterior with respect to $\sigma^2$:
$$\pi(b, \alpha, v \mid D_{obs}) \propto \Big(d + \tfrac{1}{2}v^Tv\Big)^{-(c + k/2)}\prod_{l=1}^{4}\prod_{i=1}^{k}\prod_{j=1}^{n_i}\exp\{z_{ijl}\theta_{ijl} - \exp(\theta_{ijl})\}\prod_{i=1}^{k}\prod_{j=1}^{n_i}f(x_{ij,obs}, x_{ij,mis} \mid \alpha). \tag{B-2}$$
Then from Eq. B-2 and Eq. 4-10 we get
$$\pi(b, \alpha, v \mid D_{obs}) \le C\Big(d + \tfrac{1}{2}v^Tv\Big)^{-(c + k/2)}\prod_{l=1}^{4}\Big[\prod_{(i,j)\in I_{1l}}f_1(\theta_{ijl})\prod_{(i,j)\in I_{2l}}f_2(\theta_{ijl})\Big]\prod_{i=1}^{k}\prod_{j=1}^{n_i}f(x_{ij,obs}, x_{ij,mis} \mid \alpha),$$
where $f_1$, $f_2$ and $f_3$ are defined in Section 4.2. For $l = 1,\dots,4$ and $1 \le r \le q$, let $\hat x^{l}_{ijr} = \hat a^{l}_{ij}$ for $(i,j) \in M_r$ and $\tilde x^{l}_{ijr} = \hat a^{l}_{n+ij}$ for $(i,j) \in M_r \cap I_{2l}$ if $b_{lr} \ge 0$; $\hat x^{l}_{ijr} = \hat b^{l}_{ij}$ for $(i,j) \in M_r$ and $\tilde x^{l}_{ijr} = \hat b^{l}_{n+ij}$ for $(i,j) \in M_r \cap I_{2l}$ if $b_{lr} < 0$. Write $\hat x^{(l)}_{ij,mis} = \{\hat x^{l}_{ijr} : r \in M_{ij}\}$ for $(i,j) \in M_r$ and $\tilde x^{(l)}_{ij,mis} = \{\tilde x^{l}_{ijr} : r \in M_{ij}\}$ for $(i,j) \in M_r \cap I_{2l}$; $\hat x^{(l)}_{ij} = (x_{ij,obs}, \hat x^{(l)}_{ij,mis})$ and $\tilde x^{(l)}_{ij} = (x_{ij,obs}, \tilde x^{(l)}_{ij,mis})$. Also, let $\tilde u^{(l)}_{ij} = (\tilde x^{(l)T}_{ij}, e_i^T)^T$ and $\hat u^{(l)}_{ij} = (\hat x^{(l)T}_{ij}, e_i^T)^T$. Then, from Eq. B-3, writing $\eta_l = (b_l^T, v^T)^T$ and summing the bound over the (bounded) values of the missing covariates,
$$\sum_{a_{ij} \le x_{ij,mis} \le b_{ij}}\pi(b, \alpha, v \mid D_{obs}) \le C\Big(d + \tfrac{1}{2}v^Tv\Big)^{-(c + k/2)}\prod_{l=1}^{4}\Big[\prod_{(i,j)\in I_{1l}}f_1(\hat u^{(l)T}_{ij}\eta_l)\prod_{(i,j)\in I_{2l}}f_2(\hat u^{(l)T}_{ij}\eta_l)\,f_3(\tilde u^{(l)T}_{ij}\eta_l)\Big]\prod_{i=1}^{k}\prod_{j=1}^{n_i}f(x_{ij,obs} \mid \alpha). \tag{B-5}$$
Let
$$dF_l = d\Big[\prod_{(i,j)\in I_{1l}}f_1(\theta_{ijl})\prod_{(i,j)\in I_{2l}}f_2(\theta_{ijl})\prod_{(i,j)\in I_{2l}}f_3(\theta_{n+ij,l})\Big], \qquad l = 1,\dots,4.$$
From Chen and Shao (2000), we know that under conditions (C1) and (C2), the integral over $b$ of the resulting indicator bound is controlled by $\|\eta_l\|^{q}$. Then from Eq. B-5 we get, after bounding $(d + \tfrac12 v^Tv)^{-(c + k/2)}$ on the set $\{\|v\| \le K\|\theta_l\|\}$,
$$\int\pi(b, \alpha, v \mid D_{obs})\,db \le K\prod_{l=1}^{4}\int_{R^{n + N_{l2}}}\|\eta_l\|^{q}\,dF_l. \tag{B-8}$$
Then
$$\int\pi(b, \alpha, v \mid D_{obs})\,db \le K\int_{R^{n + N_{12}}}\cdots\int_{R^{n + N_{42}}}\Big\{\prod_{l=1}^{4}\|\eta_l\|^{q}\Big\}\prod_{l=1}^{4}dF_l \le K\prod_{l=1}^{4}\int_{R^{n + N_{l2}}}\Big\{\sum_{(i,j)\in I_{1l}}|\theta_{ijl}|^{q} + \sum_{(i,j)\in I_{2l}}|\theta_{ijl}|^{q} + \sum_{(i,j)\in I_{2l}}|\theta_{n+ij,l}|^{q}\Big\}\,dF_l < \infty$$
(from condition (C3)). Hence the posterior is proper, and the theorem follows.
For the second theorem, the observed-data likelihood of $\alpha$ is
$$L(\alpha \mid D_{obs}) = \prod_{(i,j)\in J_1}\sum_{x_{ij6,mis}}\Big[f(x_{ij6}\mid\alpha_6)\frac{\exp(u_{1ij}^T\alpha)}{1+\exp(u_{1ij}^T\alpha)+\exp(u_{2ij}^T\alpha)}\Big]\prod_{(i,j)\in J_2}\sum_{x_{ij6,mis}}\Big[f(x_{ij6}\mid\alpha_6)\frac{\exp(u_{2ij}^T\alpha)}{1+\exp(u_{1ij}^T\alpha)+\exp(u_{2ij}^T\alpha)}\Big]\prod_{(i,j)\in J_3}\sum_{x_{ij6,mis}}\Big[f(x_{ij6}\mid\alpha_6)\frac{1}{1+\exp(u_{1ij}^T\alpha)+\exp(u_{2ij}^T\alpha)}\Big].$$
For $1 \le r \le 8$, let $\hat u_{1ijr} = 1$ if $\alpha_r > 0$ and $\hat u_{1ijr} = 0$ if $\alpha_r \le 0$ for $(i,j) \in J_1 \cap M_6$; $\tilde u_{1ijr} = 0$ if $\alpha_r > 0$ and $\tilde u_{1ijr} = 1$ if $\alpha_r \le 0$ for $(i,j) \in (J_1 \cup J_3) \cap M_6$; $\hat u_{2ijr} = 1$ if $\alpha_r > 0$ and $\hat u_{2ijr} = 0$ if $\alpha_r \le 0$ for $(i,j) \in J_2 \cap M_6$; and $\tilde u_{2ijr} = 0$ if $\alpha_r > 0$ and $\tilde u_{2ijr} = 1$ if $\alpha_r \le 0$ for $(i,j) \in (J_2 \cup J_3) \cap M_6$. Then
$$L(\alpha \mid D_{obs}) \le \prod_{(i,j)\in J_1}\frac{\exp(\hat u_{1ij}^T\alpha)}{1+\exp(\hat u_{1ij}^T\alpha)+\exp(\tilde u_{2ij}^T\alpha)}\prod_{(i,j)\in J_2}\frac{\exp(\hat u_{2ij}^T\alpha)}{1+\exp(\tilde u_{1ij}^T\alpha)+\exp(\hat u_{2ij}^T\alpha)}\prod_{(i,j)\in J_3}\frac{1}{1+\exp(\tilde u_{1ij}^T\alpha)+\exp(\tilde u_{2ij}^T\alpha)},$$
where the last step follows since $\sum_{x_{ij6,mis}}f(x_{ij6}\mid\alpha_6) = 1$. Now, let
$$L^*(\alpha_2, \alpha_3 \mid D_{obs}) = \prod_{(i,j)\in J_1}\frac{\exp(\hat u_{1ij}^T\alpha)}{1+\exp(\hat u_{1ij}^T\alpha)+\exp(\tilde u_{2ij}^T\alpha)}\prod_{(i,j)\in J_2}\frac{\exp(\hat u_{2ij}^T\alpha)}{1+\exp(\tilde u_{1ij}^T\alpha)+\exp(\hat u_{2ij}^T\alpha)}\prod_{(i,j)\in J_3}\frac{1}{1+\exp(\tilde u_{1ij}^T\alpha)+\exp(\tilde u_{2ij}^T\alpha)}.$$
For a factor of the first type, dividing through by $\exp(\hat u_{1ij}^T\alpha)$,
$$\frac{\exp(\hat u_{1ij}^T\alpha)}{1+\exp(\hat u_{1ij}^T\alpha)+\exp(\tilde u_{2ij}^T\alpha)} = \int_0^{\infty}\exp\Big(-t_{ij}\big[1+\exp(-\hat u_{1ij}^T\alpha)+\exp((\tilde u_{2ij}-\hat u_{1ij})^T\alpha)\big]\Big)\,dt_{ij} = \int_0^{\infty}e^{-t_{ij}}\exp\big(-t_{ij}\exp(-\hat u_{1ij}^T\alpha)\big)\exp\big(-t_{ij}\exp(-(\hat u_{1ij}-\tilde u_{2ij})^T\alpha)\big)\,dt_{ij}.$$
Similarly,
$$\frac{\exp(\hat u_{2ij}^T\alpha)}{1+\exp(\hat u_{2ij}^T\alpha)+\exp(\tilde u_{1ij}^T\alpha)} = \int_0^{\infty}e^{-t_{ij}}\Big[\int 1\big(v_{ij} \le \hat u_{2ij}^T\alpha\big)\,dF_{t_{ij}}(v_{ij})\Big]\Big[\int 1\big(w_{ij} \le (\hat u_{2ij}-\tilde u_{1ij})^T\alpha\big)\,dF_{t_{ij}}(w_{ij})\Big]\,dt_{ij},$$
and, for $J_3$,
$$\frac{1}{1+\exp(\tilde u_{1ij}^T\alpha)+\exp(\tilde u_{2ij}^T\alpha)} = \int_0^{\infty}e^{-t_{ij}}\Big[\int 1\big(v_{ij} \le -\tilde u_{1ij}^T\alpha\big)\,dF_{t_{ij}}(v_{ij})\Big]\Big[\int 1\big(w_{ij} \le -\tilde u_{2ij}^T\alpha\big)\,dF_{t_{ij}}(w_{ij})\Big]\,dt_{ij}.$$
Substituting Eq. B-14 through Eq. B-16 in Eq. B-13 and integrating with respect to $\alpha = (\alpha_2^T, \alpha_3^T)^T$:
$$\int_{R^8}L^*(\alpha \mid D_{obs})\,d\alpha \le \int_{R^8}\Big\{\int_{R_+^n}\exp\Big(-\sum_{i=1}^{k}\sum_{j=1}^{n_i}t_{ij}\Big)\Big[\int_{R^{2n}}1(U^*\alpha \le v)\,dF_t(v)\Big]dt\Big\}d\alpha,$$
where $dF_t(v) = \prod_{i,j}dF_{t_{ij}}(v_{ij})\prod_{i,j}dF_{t_{ij}}(w_{ij})$. Using Chen and Shao (2000), under conditions (H1) and (H2), we get
$$\int_{R^8}1(U^*\alpha \le v)\,d\alpha \le K_1\Big(\sum_{i,j}|v_{ij}|^{8} + \sum_{i,j}|w_{ij}|^{8}\Big).$$
Now,
$$dF_{t_{ij}}(v_{ij})\,dF_{t_{ij}}(w_{ij}) = t_{ij}^2\exp(-v_{ij})\exp\big(-t_{ij}\exp(-v_{ij})\big)\exp(-w_{ij})\exp\big(-t_{ij}\exp(-w_{ij})\big)\,dv_{ij}\,dw_{ij},$$
so that, integrating out $t$,
$$\int_{R^8}L^*(\alpha \mid D_{obs})\,d\alpha \le K_1\int_{R^{2n}}\Big(\sum_{i,j}v_{ij}^{8} + \sum_{i,j}w_{ij}^{8}\Big)\prod_{i,j}\frac{\exp(-v_{ij}-w_{ij})}{(1+\exp(-v_{ij})+\exp(-w_{ij}))^3}\,dv\,dw < \infty.$$
But $\int_{R^3}\prod_{(i,j)\in M_6^c}f(x_{ij6}\mid\alpha_6)\,d\alpha_6 < \infty$ under conditions (D1) and (D2), by Theorem 2.1 of Chen and Shao (2000). Hence the theorem follows.

References

Battese, G. E., Harter, R. M., and Fuller, W. A. (1988), "An error-components model for prediction of county crop areas using survey and satellite data," Journal of the American Statistical Association, 83, 28-36.
Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975), Discrete Multivariate Analysis: Theory and Practice, The MIT Press, Cambridge, Mass.-London.

Brackstone, G. J. (1987), "Small area data: policy issues and technical challenges," in Platek et al. (1987), pp. 3-20.

Chambers, R. L. (1986), "Outlier robust finite population estimation," Journal of the American Statistical Association, 81, 1063-1069.

Chaudhuri, A. (1994), "Small domain statistics: A review," Statistica Neerlandica, 48, 215-236.

Chen, M. H., Ibrahim, J. G., and Shao, Q. M. (2004), "Propriety of the Posterior Distribution and Existence of the MLE for Regression Models with Covariates Missing at Random," Journal of the American Statistical Association, 99, 421-438.

Chen, M. H., Ibrahim, J. G., and Shao, Q. M. (2006), "Posterior propriety and computation for the Cox Regression Model with applications to Missing Covariates," Biometrika, 93, 791-807.

Chen, M. H. and Shao, Q. M. (2000), "Propriety of Posterior Distribution for Dichotomous Quantal Response Models," Proceedings of the American Mathematical Society, 129, 293-302.

Chen, M. H., Shao, Q. M., and Xu, D. (2002), "Necessary and sufficient conditions on the propriety of posterior distributions for generalized linear mixed models," Sankhya, 64, 57-85.

Cox, D. R. and Snell, E. J. (1968), "A general definition of residuals," Journal of the Royal Statistical Society, Series B, 30, 248-275.

Cressie, N. (1990), "Small-area prediction of undercount using the general linear model," Statistics Canada Symposium, 93-105.

Datta, G. S. and Ghosh, M. (1991), "Bayesian prediction in linear models: applications to small area estimation," Annals of Statistics, 19, 1748-1770.

Datta, G. S. and Lahiri, P. (1995), "Robust hierarchical Bayes estimation of small area characteristics in the presence of covariates and outliers," Journal of Multivariate Analysis, 54, 310-328.

Datta, G. S. and Lahiri, P. (2000), "A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems," Statistica Sinica, 10, 613-627.

Datta, G. S., Rao, J. N. K., and Smith, D. D. (2005), "On measuring the variability of small area estimators under a basic area level model," Biometrika, 92, 183-196.
Drew, D., Singh, M. P., and Choudhry, G. H. (1982), "Evaluation of Small Area Estimation Techniques for the Canadian Labour Force Survey," Survey Methodology, 8, 17-47.

Efron, B. and Morris, C. (1971), "Limiting the risk of Bayes and empirical Bayes estimators. I. The Bayes case," Journal of the American Statistical Association, 66, 807-815.

Efron, B. and Morris, C. (1972), "Limiting the risk of Bayes and empirical Bayes estimators. II. The empirical Bayes case," Journal of the American Statistical Association, 67, 130-139.

Erickson, E. P. and Kadane, J. B. (1987), "Sensitivity analysis of local estimates of undercount in the 1980 U.S. Census," in Platek et al. (1987), pp. 23-45.

Farrell, P. J., McGibbon, B., and Tomberlin, T. J. (1997), "Empirical Bayes estimators of small area proportions in multistage designs," Statistica Sinica, 7, 1065-1083.

Fay, R. E. and Herriot, R. A. (1979), "Estimates of income for small places: an application of James-Stein procedures to census data," Journal of the American Statistical Association, 74, 269-277.

Ghosh, M. and Maiti, T. (2004), "Small-area estimation based on natural exponential family quadratic variance function models and survey weights," Biometrika, 91, 95-112.

Ghosh, M., Nangia, N., and Kim, D. H. (1996), "Estimation of median income of four-person families: a Bayesian approach," Journal of the American Statistical Association, 91, 1423-1431.

Ghosh, M., Natarajan, K., Stroud, T. W. F., and Carlin, B. P. (1998), "Generalized linear models for small-area estimation," Journal of the American Statistical Association, 93, 273-282.

Ghosh, M., Natarajan, K., Waller, L. A., and Kim, D. H. (1999), "Hierarchical Bayes GLMs for the analysis of spatial data: an application to disease mapping," Journal of Statistical Planning and Inference, 75, 305-318.

Ghosh, M. and Rao, J. N. K. (1994), "Small area estimation: an appraisal," Statistical Science, 9, 55-93.

Gilks, W. R. and Wild, P. (1992), "Adaptive Rejection Sampling for Gibbs Sampling," Applied Statistics, 41, 337-348.

Godambe, V. P. and Thompson, M. E. (1989), "An extension of quasi-likelihood estimation," Journal of Statistical Planning and Inference, 22, 137-172.

Gwet, J.-P. and Rivest, L.-P. (1992), "Outlier resistant alternatives to the ratio estimator," Journal of the American Statistical Association, 87, 1174-1182.
Harter, R. M. and Fuller, W. (1987), "The Multivariate components of variance model in small area estimation," in Platek et al. (1987), pp. 103-123.

Ibrahim, J. G., Lipsitz, S. R., and Chen, M.-H. (1999), "Missing Covariates in Generalized Linear Models When the Missing Data Mechanism Is Nonignorable," Journal of the Royal Statistical Society, Series B, 61, 173-190.

Jiang, J. and Lahiri, P. (2001), "Empirical Best Prediction for Small Area Inference with Binary Data," Annals of the Institute of Statistical Mathematics, 53, 217-243.

Kackar, R. N. and Harville, D. A. (1984), "Approximations for standard errors of estimators of fixed and random effects in mixed linear models," Journal of the American Statistical Association, 79, 853-862.

Lahiri, P. and Maiti, T. (2002), "Empirical Bayes estimation of relative risks in disease mapping," Calcutta Statistical Association Bulletin, 53, 213-223.

Lahiri, P. and Rao, J. N. K. (1995), "Robust Estimation of Mean Squared Error of Small Area Estimators," Journal of the American Statistical Association, 90, 758-766.

Lipsitz, S. R. and Ibrahim, J. G. (1996), "A Conditional Model for Incomplete Covariates in Parametric Regression Models," Biometrika, 83, 916-922.

Little, R. J. A. and Rubin, D. B. (2002), Statistical Analysis of Missing Data, Wiley, New York, 2nd ed.

Maiti, T. (1998), "Hierarchical Bayes Estimation of Mortality Rates for Disease Mapping," Journal of Statistical Planning and Inference, 69, 339-348.

Malec, D., Sedransk, J., Moriarity, C. L., and LeClere, F. B. (1997), "Small Area Inference for Binary Variables in the National Health Interview Survey," Journal of the American Statistical Association, 92, 815-826.

McGibbon, B. and Tomberlin, T. J. (1989), "Small area estimates of proportions via empirical Bayes techniques," Survey Methodology, 15, 237-252.

Nandram, B., Sedransk, J., and Pickle, L. (1999), "Bayesian analysis of mortality rates for U.S. health service areas," Sankhya, 61, 145-165.

Pfeffermann, D. (2002), "Small area estimation - new developments and directions," International Statistical Review, 70, 125-143.

Pfeffermann, D. and Burck, L. (1990), "Robust small area estimation combining time series and cross-sectional data," Survey Methodology, 16, 217-237.
Platek, R., Rao, J. N. K., Sarndal, C. E., and Singh, M. P. (eds.) (1987), Small Area Statistics: An International Symposium, Wiley, New York.

Prasad, N. G. N. and Rao, J. N. K. (1990), "The estimation of mean squared errors of small area estimators," Journal of the American Statistical Association, 85, 163-171.

Rao, J. N. K. (1986), "Synthetic Estimators, SPREE and Model Based Predictors," in Proceedings of the Conference on Survey Research Methods in Agriculture, U.S. Department of Agriculture, Washington, D.C., pp. 1-16.

Rao, J. N. K. (1999), "Some recent advances in model-based small area estimation," Survey Methodology, 25, 175-186.

Rao, J. N. K. (2003), Small Area Estimation, John Wiley and Sons.

Rao, J. N. K. and Yu, M. (1992), "Small area estimation by combining time series and cross-sectional data," ASA Proceedings of the Survey Research Methods Section, 1-19.

Schaible, W. L. (1978), "Choosing weights for composite estimators for small area statistics," ASA Proceedings of the Section on Survey Research Methods, 741-746.

Zaslavsky, A. M., Schenker, N., and Belin, T. R. (2001), "Downweighting Influential Clusters in Surveys: Application to the 1990 Post Enumeration Survey," Journal of the American Statistical Association, 96, 858-869.

Zhao, L. P. and Prentice, R. L. (1990), "Correlated binary regression using a quadratic exponential model," Biometrika, 77, 642-648.

Biographical Sketch

Ananya Roy was born in Kolkata, India, on the 24th of December, 1978. She graduated with a bachelor's degree from Calcutta University in 2000 with first class honours in statistics. She then joined the Indian Statistical Institute, from where she received her Master of Statistics degree in 2002. She moved to the University of Florida at Gainesville in the fall of 2002 to pursue her doctoral studies in statistics. Upon graduation, she joined the University of Nebraska at Lincoln as an Assistant Professor in the Department of Statistics.