<%BANNER%>

Empirical and Hierarchical Bayesian Methods with Applications to Small Area Estimation

Permanent Link: http://ufdc.ufl.edu/UFE0021364/00001

Material Information

Title: Empirical and Hierarchical Bayesian Methods with Applications to Small Area Estimation
Physical Description: 1 online resource (97 p.)
Language: english
Creator: Roy, Ananya
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2007

Subjects

Subjects / Keywords: bayes, binary, estimation, mse, robust
Statistics -- Dissertations, Academic -- UF
Genre: Statistics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, terriorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: The topic of this dissertation focuses on the formulation of empirical and hierarchical Bayesian techniques in the context of small area estimation. In the first part of the dissertation, we consider robust Bayes and empirical Bayes (EB) procedure for estimation of small area means. We have introduced the notion of influence functions in the small area context, and have developed robust Bayes and EB estimators based on such influence functions. We have derived an expression for the predictive influence function, and based on a standardized version of the same, we have proposed some new small area estimators. The mean squared errors and estimated mean squared errors of these estimators have also been found. The findings are validated by a simulation study. In the second part, we have considered small area estimation for bivariate binary data in the presence of covariates. We have developed EB estimators along with their mean squared errors. In this case the covariates were assumed to be completely observed. In the presence of missing covariates, we have developed a hierarchical Bayes approach for bivariate binary responses. Under the assumption that the missing mechanism is missing at random (MAR), we have suggested methods to estimate the small area means along with the associated standard errors. Our findings have been supported by appropriate data analyses.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Ananya Roy.
Thesis: Thesis (Ph.D.)--University of Florida, 2007.
Local: Adviser: Ghosh, Malay.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2010-08-31

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2007
System ID: UFE0021364:00001

Permanent Link: http://ufdc.ufl.edu/UFE0021364/00001

Material Information

Title: Empirical and Hierarchical Bayesian Methods with Applications to Small Area Estimation
Physical Description: 1 online resource (97 p.)
Language: english
Creator: Roy, Ananya
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2007

Subjects

Subjects / Keywords: bayes, binary, estimation, mse, robust
Statistics -- Dissertations, Academic -- UF
Genre: Statistics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, terriorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: The topic of this dissertation focuses on the formulation of empirical and hierarchical Bayesian techniques in the context of small area estimation. In the first part of the dissertation, we consider robust Bayes and empirical Bayes (EB) procedure for estimation of small area means. We have introduced the notion of influence functions in the small area context, and have developed robust Bayes and EB estimators based on such influence functions. We have derived an expression for the predictive influence function, and based on a standardized version of the same, we have proposed some new small area estimators. The mean squared errors and estimated mean squared errors of these estimators have also been found. The findings are validated by a simulation study. In the second part, we have considered small area estimation for bivariate binary data in the presence of covariates. We have developed EB estimators along with their mean squared errors. In this case the covariates were assumed to be completely observed. In the presence of missing covariates, we have developed a hierarchical Bayes approach for bivariate binary responses. Under the assumption that the missing mechanism is missing at random (MAR), we have suggested methods to estimate the small area means along with the associated standard errors. Our findings have been supported by appropriate data analyses.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Ananya Roy.
Thesis: Thesis (Ph.D.)--University of Florida, 2007.
Local: Adviser: Ghosh, Malay.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2010-08-31

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2007
System ID: UFE0021364:00001


This item has the following downloads:


Full Text





EMPIRICAL AND HIERARCHICAL BAYESIAN METHODS WITH APPLICATIONS
TO SMALL AREA ESTIMATION



















By

ANANYA ROY


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2007
































2007 Ananya Roy

































To my teachers and my family









ACKNOWLEDGMENTS

I would like to extend my sincerest gratitude to my advisor, Professor Malay Ghosh

for his invaluable guidance and constant support throughout my graduate studies. It was

a privilege to have him as my advisor, who was alvb--, available to help. I would also like

to thank Professors Brett Presnell, Ronald Randles, Michael Daniels and Murali Rao for

serving on my committee. My special thanks to Professors Dalho Kim and Ming-Hui C'!, 1i

for helping me with computation.

I am indebted to my professors from my undergraduate d4 v- whose teaching inspired

me to pursue higher studies in Statistics. I would especially like to thank Professor

Bhaskar Bose from my higher secondary d4 v- He introduced me to Statistics and helped

me develop a basic understanding of the subject. My sincerest thanks to Professor Indranil

Mukherjee for his constant encouragement during my undergraduate d ,v-. The extent of

his knowledge and problem-solving capabilities have alvb--, inspired me. During my stay

at the University of Florida, I have been lucky to have marvelous teachers like Professors

Malay Ghosh, Brett Presnell, Andre Khuri and Ronald Randles. They have taught me a

lot and would ahv-- i inspire me to be a good teacher and researcher.

I would like to thank my friends and classmates in the department, especially

George, Mihai, for helping me whenever I needed. Life would have been really different

in Gainesville if not for friends like Vivekananda, Siulidi and Bharati. I am grateful for

their support, guidance and, well, just for being there. I have been fortunate to have come

to know and grow close to people like Dolakakima, Bhramardi and little Munai. Their

genuine affection and warmth, along with the love and support of my friends, and, of

course 'Asha Parivar', has made Gainesville a second home to me.

I don't think any of this would have been possible if not for the encouragement and

support of my family. My heartfelt thanks to my parents, who have ahv-- i supported

me in all the choices I have made and encouraged me to move forward. I owe everything

that I have achieved to my mother who has ahv-- been my inspiration. Last, but not the









least, I would like to thank my husband Yash for being such a positive force in my life.

Without him it would have been very difficult to survive the pressure and demand of the

last year.










TABLE OF CONTENTS


page


ACKNOWLEDGMENTS ..............


LIST O F TABLES . . . . . . . . . .


ABSTRACT .....................


CHAPTER


1 REVIEW OF SMALL AREA TECHNIQUES

1.1 Introduction ...............
1.2 Overview of SAE Techniques .......
1.3 Analysis of Discrete data in SAE .
1.4 Robust Estimation in SAE . .
1.5 Layout of this Dissertation . .


2 INFLUENCE FUNCTIONS AND ROBUST BAYES AND EMPIRI-
CAL BAYES SMALL AREA ESTIMATION .................

2.1 Introduction . . . . . . . . .
2.2 Influence Functions And Robust Bi,-i Estimation ..............
2.3 Robust Empirical B-,-i Estimation and MSE Evaluation .........
2.4 M SE Estim ation . . . . . . . .
2.5 Sim ulation Study . . . . . . . .

3 EMPIRICAL BAYES ESTIMATION FOR BIVARIATE BINARY DATA
WITH APPLICATIONS TO SMALL AREA ESTIMATION ......


Introduction . . . .
B-,i And Empirical B -- Estimators
MSE Approximation ...........
Estimation of The MSE .........
Simulation Study .............


4 HIERARCHICAL BAYES ESTIMATION FOR BIVARIATE BINARY
DATA WITH APPLICATIONS TO SMALL AREA ESTIMATION .

4.1 Introduction . . . . . . . . .
4.2 Hierarchical B ,- M ethod ...........................
4.3 D ata A analysis . . . . . . . . .

5 SUMMARY AND FUTURE RESEARCH ...................

APPENDIX


. . . .
. . . .
. . . .
. . . .
. . . .









A PROOF OF THEOREM 2-3 ...

B PROOFS OF THEOREMS 4-1 AND

REFERENCES ...............

BIOGRAPHICAL SKETCH ........


4 -2 . . . . . .









LIST OF TABLES


Table page

2-1 Relative biases of the point estimates and relative bias of MSE estimates. 39

3-1 Relative biases of the point estimates and relative bias of MSE estimates for
the small area mean: the first variable. . . . . . 57

3-2 Relative biases of the point estimates and relative bias of MSE estimates for
the small area mean: the second variable. . . . . . 57

4-1 Bivariate HB estimates along with the Direct estimates and the True values 70

4-2 M measures of precision . . . . . . . . 71

4-3 Posterior variance, correlation and 95'. HPDs along with predictive p-values 73









Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

EMPIRICAL AND HIERARCHICAL BAYESIAN METHODS WITH APPLICATIONS
TO SMALL AREA ESTIMATION

By

Ananya Roy

August 2007

C(! i': Malay Ghosh
Major: Statistics

The topic of this dissertation focuses on the formulation of empirical and hierarchical

B -i, -i i techniques in the context of small area estimation.

In the first part of the dissertation, we consider robust B-,--, and empirical BE,-.-i

(EB) procedure for estimation of small area means. We have introduced the notion of

influence functions in the small area context, and have developed robust B,-.-- and EB

estimators based on such influence functions. We have derived an expression for the

predictive influence function, and based on a standardized version of the same, we have

proposed some new small area estimators. The mean squared errors and estimated mean

squared errors of these estimators have also been found. The findings are validated by a

simulation study.

In the second part, we have considered small area estimation for bivariate binary data

in the presence of covariates. We have developed EB estimators along with their mean

squared errors. In this case the covariates were assumed to be completely observed. In

the presence of missing covariates, we have developed a hierarchical B,-.-- approach for

bivariate binary responses. Under the assumption that the missing mechanism is missing

at random (M!AR), we have -,r--:. -i. 1 methods to estimate the small area means along

with the associated standard errors. Our findings have been supported by appropriate

data m, i -. -









CHAPTER 1
REVIEW OF SMALL AREA TECHNIQUES

1.1 Introduction

The terms -i, i!! ci, and "local -,r are commonly used to denote a small

geographical area, but they may also be used to describe a -i1 i1! domain", i.e., a small

subpopulation such as a specific age-sex-race group of people within a large geographical

area. Small area estimation (SAE) is gaining increasing prominence in survey methodology

due to a need for such statistics in both the public and private sectors. The reason behind

its success is that the same survey data, originally targeted towards a higher level of

geography (e.g., states) needs to be used also for producing estimates at a lower level

of .,-::--regation (e.g., counties, subcounties or census tracts). In such cases, the direct

estimates, which are usually based on area-specific sample data, are often unavailable

(e.g., due to zero sample size) and almost alv--, unreliable due to large standard errors

and coefficients of variation arising from the scarcity of samples in individual areas. SAE

techniques provide alternative tools for inference to improve upon the direct estimators.

The use of small area statistics goes back to 11th century England and 17th century

Canada (Brackstone, 1987). However, these statistics were all based either on a census or

on administrative records aiming at complete enumeration.

The problem of SAE is twofold. First is the fundamental issue of producing reliable

estimates of parameters of interest (e.g., means, counts, quantiles etc.) for small domains.

The second problem is to assess the estimation error. In the absence of sufficient samples

from the small areas, the only option is to borrow strength from the neighboring areas i.e.,

those with similar characteristics. In Section 2, we outline the advances in SAE over the

past few decades and some popular works significant to the enhancement of the techniques

in this field.









1.2 Overview of SAE Techniques

As mentioned earlier, one of the fundamental problems in SAE is to find reliable

estimators of the characteristics of interest for a small area. Some of the newer methods

of estimation are the empirical B,--. (EB), Hierarchical B-i-., (HB) and empirical best

linear unbiased prediction (EBLUP) procedures which have made significant impact on

small area estimation. In this context, the work of Fay and Herriot (1979) is notable.

Fay and Herriot (1979) -,-. -1. 1 the use of the James-Stein Estimator in SAE. Their

objective was to estimate the per capital income (PCI) based on the US Population Census

1970 for several small places, many of which had fewer than 500 people. Earlier, in the

Bureau of the Census, the general formula for estimation of PCI in the current year was

given by product of the PCI values of the previous year and the ratio of administrative

estimate of PCI for the place in 1970 to the derived estimate for PCI in 1969. But this

method had many drawbacks resulting from ignoring the presence of some auxiliary

variables and also due to small coverage for small places, thereby increasing the standard

errors of the direct estimates. The Fay and Herriot model is as follows:


|^ N(0e,Vi) and O, N(xfb,A),i =1,... ,m (1-1)

where Vi's and auxiliary variables xi's are all known but A is unknown. In this case, let

(xTXE-1X)-1XTE-1Y, (1 -2)

be the regression estimator of Oi's where E = Diag(Vi + A, V, + A). Then, the

-,t.-..-, -ld1 estimator for the ith small area was given by

A* Vi
A* + V A* + 1 3)

Here A* is the estimate of the unknown variance A determined from the equation

i j A+V = m p, where p is the number of auxiliary variables in the regression
model and m > p. A* was found by an iterative procedure with the constraint A* > 0.









The idea behind proposing this estimator followed from the James-Stein Estimator

which has the property of dominating the sample mean with respect to squared error

loss for m > 3. Under the model (1-1) with Vi = V Vi and A known, the James-Stein

estimator is given by

S= (1 ((m p 2)V/S))Y + p(( 2)V/S)Y*, (1-4)


where

s = ( y*)2 (1-5)
i
We observe here that the (1-3) is just the EB estimator under model (1-1). Following

the argument of Efron and Morris (1971,1972), who remarked that EB estimators such as

(1-4) may perform overall but poorly on individual components in case of departure from

the assumed prior, Fay and Herriot proposed the final EB estimators as:


= if Y, Vi < O < Y + V

=Y, v^ if ^ < Y VVz

=Yi + V/ if i > Yi + V/-.

There are other approaches regarding estimation of the unknown A. Prasad and Rao

(1990) proposed the use of the unweighted least squares and the method of moments

approach to estimate the regression coefficients and variance A respectively. Datta and

Lahiri (2000) -.- -1 the use of maximum likelihood and residual maximum likelihood to

estimate the same.

The Fay-Herriot estimator is an example of a composite estimator. A composite

estimator is a weighted average of the direct survey estimator and some regression

estimator, in the presence of auxiliary information. These are called model-based

estimators (e.g., Fay-Herriot estimator). Otherwise, it is a weighted average of the

overall mean and the sample mean for that small area. These are called the design-based

estimators. There has been many proposed approaches regarding the choice of optimal









weights in the latter case. Some of the works include Schaible (1978) and Drew, Singh,

and Choudhry (1982).

Another kind of estimation in vogue, in SAE, is -. ./h., I. estimation. Brackstone

(1987) described synthetic estimates as follows: "An unbiased estimate is obtained from a

sample survey for a large area; when this estimate is used to derive estimates for subareas

under the assumption that the small areas have the same characteristics as the large

area, we identify these estimates as synthetic est ii i,. Ghosh and Rao (1994) provided

model-based justifications for some of the synthetic and composite estimators in common

use.

One of the simplest models in common use is the nested error unit level regression

model, employ, ,1 originally by Battese et al. (1988) for predicting areas under corn

and soybeans in 12 counties in Iowa, USA. They used a variance components model as

-,.-.-, -. ,1 by Harter and Fuller (1987) in the context of SAE. They used the model


yij xT b+, +eij, j 1, ,n,i 1,..--- ,T. (1-6)

where and eij are mutually independent error terms with zero means and variances

a and a2 respectively. Here ni samples are drawn from areas with Ni observations. The

random term represents the joint effect of area characteristics that are not accounted by

the concomitant variables xi's. Under the model, the small area mean, denoted by Oi, is


0i x- p )b +, (1-7)

where (p) = N-1 1yij. Let u = + ej, = (1u, ur)T and E(uur) V

block diag(Vi, V2,--- V),

where

V, crJ, + I, (1-8)









with Ji the square matrix of order ni with every element equal to 1 and Ii the identity

matrix of order ni. Then the generalized least-squares estimator of b is given by

6 (XTV-1X)- XTV-1y. (1-9)

Then the best linear unbiased predictor (BLUP) for the 1th small area mean, when (ac, ,a7)

is known, is given by

O 4x + (y,. 6)g, (1-10)

where Xi. n17 l xij and gi af/(K + n V).

In practice, however, a2 and a are seldom known. A common procedure is to replace

them in the BLUP formula by standard variance components estimates like Maximum

Likelihood Estimators (\! 1.), Restricted MLE (REML) or Analysis of Variance (ANOVA)

estimators. The resulting predictors are known as the Empirical BLUP (EBLUP) or EB

predictors. Prasad and Rao (1990), among others, provide details. An alternative method

is to find Hierarchical B-,.-i (HB) predictors by specifying prior distributions for 3, aQ2 and

a. and computing the posterior distributions f(Oi y, X) given all the observations in all

the areas (Datta and Ghosh, 1991). Markov chain Monte Carlo(\!C\!C) techniques are

used for simulations from the posterior.

As pointed out earlier, another important aspect of SAE, apart from the estimation

of parameters, is the calculation of the error associated with estimation known as mean

squared error (\!Sl ). To this end, we discuss here the work of Prasad and Rao (1990).

Prasad and Rao pointed out that the models of Fay and Herriot (1979) and Battese et al.

(1988) are all special cases of the general mixed linear model

y Xb + Zv + e (1- 11)

where y is the vector of sample observations, X and Z are known matrices, and v

and e are distributed independently with means 0 and covariance matrices G and R,

respectively, depending on some parameters A, the variance components. Following








Henderson (1975),they proposed the best linear unbiased estimator of p = 1 + mTv as

t(A, y) = 1T/3 + mnTGZTV- (y X/3), (1-12)

where V = Cov(y) = R + ZGZT and 3 is the GLS estimator of b. They proposed the
estimators for the variance components using Henderson's methods. Prasad and Rao also
proposed an estimator for the MSE of the estimators. They extended the work of Kackar
and Harville (1984), and approximated the true prediction MSE of the EBLUP under
normality of the error terms. They showed

E[t(A) t(A)]2 tr[(Vb)(VbT)E[( A)(A A)T], (1-13)

where VbT colI estimator of Ai and AX max(X, 0). Hence,

MSE[t(A)] MSE[t(A)] + tr[(Vb)(VbT)E[(A A)(A A)T] (1-14)

They provided general conditions under which the precise order of the neglected terms
in the approximation is o(t-1) for large t. These conditions are satisfied by both
Fay-Herriot and the nested error models. For the estimation of the MSE, the models
were considered individually as it is difficult to find general conditions. For example, the
MSE approximation for the Fay and Herriot (1979) model is given as follows:

E[OfB -O ]2 Vi(1 B ) + BJxTXT D- X)-lx + E(OB (A) Ofi(A))2. (15)

Prasad and Rao estimated Vi(1 Bj) by Vi(1 B) and BfxT(XTD- X)-1xi by
B/x (XTD 1X)-lxi, where D = diag(Vi + A, ... Vm + A). They approximated
(OY(A) O(A))2 by 2Bf(V, + A)-lVar(A). While under normality of the error terms,
Var(A) was approximated as Var(A) = 2m-1[A2 + 2A 7 Vi/m + c V2/m], correct up to
the second order. Hence, the estimated MSE up to the second order is

Vi(1 bi) + B^xf(XTD 1X)- lx + 2Bj(Vi + A)- Var(A). (1-16)









where, Var(A) 2m-1 [A2 + 2A E /m + E V/m]. Lahiri and Rao (1995) showed that

the estimator of MSE as given by (1-16) is robust to departures from normality of the

random area effect 's, though not the sampling error ej's.

Datta and Lahiri (2000) extended the results of Prasad and Rao to general linear

mixed models of the form,


Yi = Xib+Zivi+ei, i =l, ,m, (1-17)


where Yi is the vector of sample observations of order ni, Xi and Zi are known matrices,

and vi and e, are distributed independently with means 0 and covariance matrices Gi

and Ri, respectively, depending on some parameters A, the variance components. In

their paper, Datta and Lahiri first estimated the variances of the BLUP for the case of

A known. They also developed MSE estimators correct up to the second order, when A

is estimated by MLE or REML estimator. They also show that MSE approximations for

the ML or REML methods are exactly the same in the second order .,-vmptotic sense.

However, the second order accurate estimator of the MSE based on the latter method

requires less bias correction than the one based on the former method. In the same

thread, Datta et al. (2005) derived second order approximation to the MSE of small area

estimators based on the F ,v-Herriot model as given by (1-1), when the model variances

are estimated by the Fay-Herriot iterative method based on weighted residual sum of

squares. They also derived a noninformative prior on the model for which the posterior

variance of a small area mean is second-order unbiased for the MSE.

M ,ii. other models have been proposed in the context of SAE. Cressie (1990) and

Ghosh, Natareai ,in Stroud, and Carlin (1998) emphasize the use of linear models and

generalized linear models, respectively. Erickson and Kadane (1987) carried out Sensitivity

Analysis for SAE. Pfeffermann and Burck (1990), Rao and Yu (1992) and Ghosh et al.

(1996), studied the extent of possible use of time series and cross-sectional models in SAE.









1.3 Analysis of Discrete data in SAE

The models discussed in the earlier section has mainly been for continuous data. We

devote this section to discussing some models for discrete measurements. Some of the

recent works in SAE has been focused on the analysis of categorical data. McGibbon and

Tomberlin (1989) considered the model where yj ~- Bin(1,pij) and

logit(pyi) xTb + ui;, ui N(O, 72). (1-18)

Here yj's are conditionally independent. The independent random effects are denoted by

ui's. The authors assume a diffuse prior for b with au assumed to be known. The joint

posterior distribution of b and {ui, i = 1, m} is approximated by the multivariate

normal distribution with mean equal to the true posterior mode and covariance matrix

equal to the inverse information matrix evaluated at the mode. The estimator of pij is

given by

i [1 + exp{-(xb + u)}]- (1-19)

where b and ui are the respective modes of the posterior. Then the area proportions are

estimated by pi = >> pij/Ni for small sampling fractions ni/Ni. For a2 unknown, the

MLE of a~ is phlu.-., d in, yielding an EB estimator. Taylor expansion is used to obtain a

naive estimator of the variance of the EB estimator. But the drawback of this estimator is

it ignores the variance component resulting from the estimation of Ju. Farrell, McGibbon,

and Tomberlin (1997) showed that the variance estimator could be improved upon by the

use of parametric Bootstrap procedure.

Some other papers relating to the analysis of binary outcomes are by Jiang and Lahiri

(2001) where they propose a method to find the Best Predictor (BP) of logit(pyj) and

use the method of moments to obtain sample estimates of b and ui as given in Eq. 1-18.

Malec et al. (1997) also consider a logistic linear mixed model for Bernoulli outcomes and

use HB approach to derive model dependent predictors.









There has been considerable work done in the context of disease mapping. A simple

model assumes that the observed small area disease counts i' 10 id Poisson(ni0i) with

0i i.d. Gamma(a, b), where 0i is the true incidence rate and ni is the number exposed

in the ith small area. Maiti (1998) assumed a log-normal distribution for Oj's. He also

assumed spatial dependence model for the log-transformed incidence rates. Some other

works in this area include papers by Ghosh et al. (1999), Lahiri and Maiti (2002) and

Nandram et al. (1999).

Ghosh et al. (1998) considered Hierarchical Bayes (HB) generalized linear model(GLM)s

for a unified analysis of both discrete and continuous data. The responses yij's are

assumed to be conditionally independent with pdf

f (yi'0 .) exp( '-(ijij b(Oij)) + c(yij, ij), (1-20)


(j 1,... i = 1, ... m), where 0(> 0) are known. The canonical parameters are
modelled as h(Oij) = xTb + ui + ej, where the link function h(.) is a strictly increasing

function. Ghosh et al. used the hierarchical model in the following way:

2 2 ind
yij 'O b, ui, ja, e ~ f
22 2 ind
h(Oij) b, u i, aC N(x b + u, ); u\b, 0a, o, N (0, ,) (1-21)

b ~ Uniform(Rk); (2) -1 Gamma(a, b); (-2)-1 Gamma(c, d),

where b, o- a are mutually independent and (a, b, c, d) are fixed parameter values.

MC'\ IC methods were implemented to evaluate the posterior distribution of O6j's or any

functions of them.

Ghosh and Maiti (2004) provided small-area estimates of proportions based on

natural exponential family models with quadratic variance function (QVF) involving

survey weights. In the paper, they considered area level covariates, i.e., the same

covariates for all the units in the same area. They introduced the natural exponential

family quadratic variance function model, which has the same set up as Eq. 1-20 with









Oij 0 Oi for all j in the ith area; E(y jl0) = '(O) = I( ) and var '"(0i)/@

V(pf)/i (- 1), (i 1, ... ,m). Here for the quadratic variance function structure,
V(pi) =,, + vipi + _i'2 with ',, v, v2 never simultaneously zero. For example, for the
Binomial distribution, v0 = 0, v, = land v2 = -1, and for the Poisson distribution,

,, = v2 = 0 and vi 1. A conjugate prior was proposed for the canonical parameters in

the model,

7r(0i) exp[A{mrn0 )(0i)}] C(A, rnm), (1-22)

where A is the overdipersion parameter and mi = g(xib). Using results from Morris

(1983), E(pi) = mi and Var(pi) = (-, where A > max (0, v2),Ghosh and Maiti
A-V2
provide pseudo BLUPs of the small area means, conditional on the weighted means

yin = y 1E, 1i t,' t *' where ir,, > 0's are all jand 1 7 = 1.They eventually
arrived at the pseudo EBLUPs by simultaneously estimating b and A, using the method

of unbiased estimating functions following Godambe and Thompson (1989). They also

provided approximate formulae for the MSE and their approximate estimators.

1.4 Robust Estimation in SAE

There has been some research in the past on controlling influential observations in

surv ,- though not much in the context of SAE. Among others, we refer to C'!i i : hers

(1986) and Gwet and Rivest (1992), who used robust estimation techniques specifically in

the context of ratio estimation. Chambers (1986) mainly considered the robust estimation

of a finite population total when the sampled data contains 'representative' outliers such

that there is no good reason to assume that the nonsampled part of the target population

is free of outliers. C'!i i hers assumed a superpopulation model 1o, under which the

random variables rj = (yi O3xi)-1 1 are independent and identically distributed with zero

mean and unit variance. Here x1's are known real values associated with the population

elements yI's, ao = a2v(xi), 03 is unknown and a2 is assumed to be unknown. Under (o,









the best linear unbiased estimator of the population total T is given by


TLS T1 + 3LS i,
2
where T1 is the total of all the n sampled units ,Y2 xz indicates the sum of the rz's
corresponding to the unsampled populations values, and /3LS {Z1 yIxI/v(xi)}x

{E11 x/v(xi)}~1 is the standard least squares estimates. To make TLS robust to outliers,
C'! iinl>ers (1986) proposed the estimator


T. T, + b. Y x, + uj u [(y, bnXi),7,
2 1
where u1 = x1q (ZY 2 xj)(32 x j2). Here, bn is some outlier robust estimator of/3

and i) is a real-valued function. In his paper, C'!i iinl>ers discussed the different choices

of ip and primarily discussed the bias and variance robustness of Tn under a gross error

model. Based on the ..i' ,-il, ..I ic theory given in the paper, it appears that the variance

robustness of T, can only be achieved by choosing ip such that this estimator is no longer

bias robust.

Zaslavsky et al. (2001) used robust estimation techniques to downweight the effect

of clusters, with an application to 1990 post enumeration survey. In the context of SAE,

Datta and Lahiri (1995) developed a robust HB method which was particularly well-suited

when there were one or more outliers in the dataset. They used a class of scale mixtures of

normal distributions on the random effects in the basic area model. C 1,, :v distribution

was used for the outliers.

We have tried to outline some of the n i jri" works significant in the advances of SAE.

Review papers on SAE include Rao (1986), Chaudhuri (1994), Ghosh and Rao (1994), Rao

(1999) and Pfeffermann (2002). There is a book on SAE by Rao (2003).









1.5 Layout of this Dissertation

The first part of this dissertation deals with robust B-,i and empirical B-,i, -

methods for SAE. In the second part, we propose EB and HB methods in SAE when the

responses are bivariate binary.

In C'!i Ipter 2, we have considered the basic area level model and proposed some

new robust small area estimation procedures based on standardized residuals. The

approach consists of first finding influence functions corresponding to each individual area

level observation by measuring the divergence between the two posteriors of regression

coefficients with and without that observation. Next, based on these influence functions,

properly standardized, we have proposed some new robust B ,--i and empirical B i,.-

(EB) small area estimators. The approximated MSE's and estimated MSE's of these

estimators have also been found.

In C'!i Ipter 3, we have addressed small area problems when the response is bivariate

binary in the presence of covariates. We have proposed EB estimators of the small area

means by assigning a conjugate prior to the parameters after suitable transformation and

subsequent modelling on the covariates. We have also provided second-order approximate

expressions for the MSE's and estimated MSE's of these estimators.

In C'!i Ipter 4, we have proposed HB estimators in the bivariate binary setup, where

some of the covariates may be partially missing. We implemented hierarchical B ,v ~i i

models which accommodate this missingness. Estimators of small area means along with the

associated posterior standard errors, and posterior correlations have also been provided.









CHAPTER 2
INFLUENCE FUNCTIONS AND ROBUST BAYES AND EMPIRICAL BAYES
SMALL AREA ESTIMATION

2.1 Introduction

As noted in C'!i ipter 1, empirical and hierarchical B i,, -i ,i methods are widely used

for small area estimation. One of the key features of such estimation procedures is that

the direct survey estimator for a small area is typically shrunk towards some synthetic or

regression estimator, thereby borrowing strength from similar neighboring areas.

Shrinkage estimators of the above type are quite reasonable if the assumed prior

is approximately true. There may be over or under shrinking towards the regression

estimator if the assumed prior is not true. Indeed, there is a standard criticism against

any subjective B ,i, -i ,i method is that it can perform very poorly in the event of

significant departure from the assumed prior (see e.g., Efron and Morris (1971),Fay

and Herriot (1979)).

Efron and Morris (1971,1972) remarked that B--~v. and empirical B--v, ; estimators

might perform well overall but poorly on individual component. e.g., suppose we assume

the model where

i 10 N(OA,,1) and Oi N(p,A),i 1,... ,m. (2-1)

But suppose the assumed prior on Oi's is actually a mixture of several distributions, one

of which is N(pi, A1), A1 < A. In that case, Efron and Morris (1971) have shown that the

B-i,- ~ risk for the component which belongs to the subpopulation can be very high. Hence

the shrinkage estimators which may be good for most of the components can be singularly

inappropriate for a particular component 0i. In general, if there is wide departure of the

true prior from the assumed prior, the B-,i risk of the B-,-i estimator can be quite high.

Moreover, there are practical problems associated with over- and under-shrinking in small

area estimation problems, especially with widely varying sample sizes. If this happens, the

resulting large area ..-:-: regate estimates may differ widely from the direct estimates, which









for all practical purposes, are believed to be quite reliable. This was, for example, pointed

out in Fay and Herriot (1979).

Efron and Morris (1971,1972) proposed limited translation rule (LTR) estimators to

guard against problems of the above type in the context of compound decision problems.

The LTR estimators were obtained with a special choice of what these authors referred to

as i !- 1. ,:,'e function". The main purpose of the LTR estimators was to !, 'l,. B i,, -i ,n

bets with frequentist caution" (Efron and Morris, 1972).

In this chapter, we have introduced the notion of influence functions in the small area

context, and developed robust EB and HB estimators based on such influence functions.

These influence functions are akin to the relevance functions of Efron and Morris, but are

derived on the basis of a general divergence measure between two distributions. In this

way, we are assigning some theoretical underpinnings to the work of Efron and Morris as

well. Also, unlike the intercept model of Efron and Morris, our results are derived in the

more general regression context, a scenario more suitable for small area estimation.

The importance of influence functions in robust estimation is well-emphasized

in Hampel (1974), Huber (1981), Hampel et al. (1986) and many related papers.

The predominant idea is to detect influential observations in terms of their effects on

parameters, most typically the regression parameters. Johnson and Geisser (1982, 1983,

1985) took instead a predictive point of view for detection of influential observations. In

the process, they developed B il, ~-i in predictive inference function (PIF's) and applied

these in several univariate and multivariate prediction problems based on regression

models. These predictive influence functions are based on Kullback-Leibler divergence

measures.

Our proposed influence functions are motivated in a B i,v -i ,i way based on some

general divergence measures as introduced in Amari (1982) and Cressie and Read (1984).

Included as special cases are the well-known Kullback-Leibler and Hellinger divergence

measures.









The general divergence measure between two normal distributions is derived

in Section 2. Also, in this section, we have applied these results to find influential

observations for certain B i,-i in small area estimation regression models. Based on

these influence functions, we have proposed certain robust HB small area estimators. The
B -,- a risk of the estimators (also referred to as the MSE) is also derived. Robust EB

estimators have been proposed in Section 3, and second order correct expansions of their
B-,-. risks or MSE's are provided in this section. Estimators of the MSE's which are

correct up to the second order are derived in Section 4. Section 5 contains a simulation

study which investigates situations where the proposed EB estimators will perform better

than the regular EB estimators.

2.2 Influence Functions And Robust Bayes Estimation

We consider a small area setting where

1 0 Id N(oi, V), V(> 0) known,

0, N(xf b,A), i = ,** (2-2)

Here xa's are the p (< m)- component design vectors. We write XT (xi, ... xm) and

assume that rank(X) = p. Also, let y = (yi, .. ym)T and 0 = (01, ... Om)T. The

above model is often referred to in the literature as the Fay-Herriot (1979) model, where

yj's are the direct survey estimators of the small area parameters Oi, and the xi's are the

associated covariates.

The B,-,-- i estimator of Oi when both b and A (> 0) are known is given by


O (1 Bi)y + Bxf b, (2-3)

where Bi = i = 1, m. We first begin with a simple EB or HB scenario where b is

unknown but A is known. In this case, the HB estimator of Oi with a uniform prior on b is

given by

B (1 Bj)y + Bixfb, (2-4)








where b (= XTE-1X)-1XTE-ly, E Diag(Vi + A, .. Vm + A). The same estimator
is obtained as an EB estimator of b when one estimates A instead from the marginal
independent N(xfb, A + Vi) distributions of the yj's.
As discussed in the introduction, the above EB (or HB) estimator can often over or
under shrink the direct estimators yi towards the regression estimators xfb. What we
propose now is a robust B i, -i procedure which is intended to guard against such model
misspecification.
To this end, we begin finding the influence of (yi, x), i =1,. m in the context of
estimating the parameter b. The derivation of the influence function is based on a general
divergence measure as introduced in Amari (1982) Cressie and Read (1984). For two
densities f, and f2, this general divergence measure is given by

DA(fl, f2) A( l)Ef, 1 (2-5)

The above divergence measure should be interpreted as its limiting value when A -+ 0
or A -1. We may note that DA(fl, f2) is not necessarily symmetric in fi and f2, but
the symmetry can al- --- be achieved by considering 1 [DA(fl, f2) + DA(f2, fl)]. Also, it
may be noted that as A -+ 0, D(fl, f2) Ef, log ] while if A -1, D (fl, f2)
Efy log L2 These are the two Kullback-Leibler divergence measures. Also D_ (fi, f2)
4(1 f --lf2) 2H2(f1,f2), where H(fl,f2) {2(1 f flf2)}12, the Hellinger
divergence measure.
In the present context, we consider the divergence between the densities of b and
b_-i), where b(-i) is the EB (or HB) estimator of b under the assumed model given in Eq.
2-2 with (yi, xi) removed. To this end, we first prove a general divergence result involving








two normal distributions based on the general divergence measure. The result is probably
known, but we include a proof for the sake of completeness.

Theorem 2-1. Let f, and f2 denote the Np(/tl, Ei) and N(A2, E2) pdf's respectively.
Then


1 [exp{ A (A t (
A(A1 + t) 2
x |i-| 12S 2 +A), 2


- )T(( + A) 2 A )-1 2)}
AEX1 1].


Proof:

D (fl, f2)
1
A(A + 1)
S E 2 e A X T-(X A) (X PI)TE( -- )} -_ 1
x (E I D. l exp{ 2(X p2 2,A2)2(X2-)- "2
\, lY^1 }


where X ~ N(p 1, E1). Next we write

(X t)T2 1(X 12) (X PI)TE l(X
2
= AlX Al+-^ { 2 1 2 A+

1 T
( [ (X-+ ) T [ X-), )
= (Z +)TC(Z + ) ZTZ,
2


- i)

A2) -(X 1)TE (X -P)
2
2 (X [t + 2 P2)


Dx (fl, f2)









where Z E2(X
Z ~ Np(0, Ip), we get

D,(fl, f2)

A(1 + A)


D), x 72 (p,1


p2) and C


E E2 Noting that


() expf{
(2-)P


ZT (1+ AI
2


AC)Z + AcTCZ + C}dZ


AC- exp {2 (C


AC(1 + AIp


AC)-1C)0}


x 21) (t + A)Ip


A(1+A) ElJ1


exp{ T(C-1


A Ip)- I( +A)Ip


- AC|-


where the final step of Eq. 2-6 requires the standard matrix inversion formula (Rao, 1973,
p 33) (A + BDB)-1 A- A-1B(D- + BTA -1B)BTA-1 with A C-1,B = Ip
and D = Ip. Again,


CT(c-1 A- I
1 +I) A I
1 1
(I~1 jt2)T ]73 Y]7 Y]2]F


A -i -]
x1 2x 1E2 12
1tA


(Ai A2 )T(E2 -1 1 2)
1 + A


(1 + A)(p1


A2 )T-[(1 + A)x2- AE 1]-(j1- P2).


Moreover,


|(1 + A)l lls


A 1 2 2|


Ax1


E2i- 221(1 + A)2- Ai)S1 -

I121(1 +A)x)2 AEll-.


x (t + 2

1
A(1 + A)


(2-6)


1 I
E1 2 (,1


(1 + A)Ip


ACI-


(2-7)










Remark 1. It is clear from the above theorem that if E1 and E2 are known, the divergence
function DA(fl, f2) is one-to-one with (A/ 2)[(1 + A)E2- AEi]-1(t1 2). In the
special case when El E2 = E, the above divergence measure is a one-to-one function of
the Mahalanobis distance (A, A2)T -1-(1- 12).
We now utilize Theorem 2-1 in deriving the influence of (yi, xi) in the given small
area context. Note that

b y, X ~ N[(XTE-1X)-XT'E-ly, (XTE-1X)-1]

and

b|ly(_),X(_j) ~ N[(X(_E X(_) X(_ y_, (X(_)y((-X )-1

where E(_i) is a diagonal matrix similar to E with the ith diagonal element (Vi + A)
removed. Now writing t (XT" 1X)-1xTE-l y,
A2 ( XT i)E- )X(_i))-1T i)X )Y(_i), El (XT-IX)-1 and
E2 (X E- X())-1, one gets

~1 -- 2 XT -1X)- XT -1y
(XTE-IX (V, + A)- xx )-(XT-1Y (V, + A)-lxi ). (2-8)

By a standard matrix inversion formula (Rao, 1973, p 33),

(XE-IX (i + A)-lixT)-1
(XT E-X)_1(V/ + A) -1xx(XTE-1X)-1 (2-9)
(XTE-IX)-1 +
1 (Vi + A)- lx(XT -lX)- lx

From Eq. 2-8 and Eq. 2-9, on simplification,

(XTE-1X)-1(V + A)-1xi(yj Xb)
S- 2 + A) xT )-(2 10)
t (Vi + A) -1XT(XTE-1X)_jXi









Also


(XTE-X)-1( +A)-1xx (XTE- X)-1
(1 + A)2 (X X) + (+ A)-1 + A) -1
1 (V, + A) -lx(XTE-IX)-lxi
(2- 11)
It is easy to see from Eq. 2-10 and Eq. 2-11 that (A1 A2)T[(1 + A)2 -- AI]-(p1 2)

is a quadratic form in y, xfb. We propose restricting the amount of shrinking by

controlling the residuals yi xfb. However, in order to make these residuals scale-free, we
standardize them as (y xfb)/Di, where

D = V(y, XTfb) V, + A- Xf(XT E-1X)-1, i 1,... m.

Based on these standardized residuals, we propose a robust HB estimator of 0i as

OHB i BiD (Yi = 1, m, (2-12)
Di 7

where jb(t) = sgn(t)min(K, |t|).

Remark 2. For the special case of the intercept model as considered in Lindley and Smith
(1972) with V = 1 and B, = B = (1 + A)-1 for all i, the above robust EB estimator of 0i

reduces to
fRHB i- BD (DY ,i 1, ...,m,

where D2 (m )/(mB).

The selection of K is ahv--, an open question. Often this is dictated from the user's

point of view. A somewhat more concrete recommendation can be given from the following
theorem which provides the B-,i risk of robust B-,-i estimator given in Eq. 2-12 under

the model Eq. 2-2. Throughout this paper, let 4 denote the standard normal df and 0 the
standard normal pdf. .

Theorem 2-2. Consider the model (2-2). Then


E(OfHB -)2 91i(A) + g2i(A) + 93im(A),








where gi (A) V(1 Bi), g2i(A) 2B2D2{(1 + K -2)(-K) K) (K)} and g3im(A)
BxTf(XT E-X) -1x, expectation being taken over the joint distribution of y and 0.
Proof:


E (OfIHB o,)2


E (O
E(0,
Vi (l


of + of oHB)2
O)2 +E(OB OHB)2
B)+EE(Of HB)2.


(2-13)


~f 0fHB Biy- B(- xfb) y,, + B(y, xfb)
_BAD {Y- X Dx )
-B \ D D i

-Bi[((6- b) + D Y ) y]

It is easy to check that E{x (b b)}2 xT(XTIX X)-lxi. Also, noting that (y,
xfb)/Di ~ N(0, 1), and is thus distributed symmetrically about zero, one gets

E D- D ) =iD E{(Z- K)21[z>K] + (Z + K)2 [z<-K]}

S2E{(Z-K)2 [Z>K]}
= 2{(1 + K2)>(-K) KO(K)},


(2-14)









(2-15)


where Z denotes the N(0, 1) variable. Now by the independence of b and yi xfb and the
fact that E(b) = b, one gets the result from Eq. 2-13-Eq. 2-15.
It may be noted that if the assumed model Eq. 2-2 is correct, then the B ,-,- risk of
the regular EB estimator 0rB is g91(A) + g3im(A). Hence, the excess B .--. risk of OfEB
over OfB under the assumed model is g2i(A) which is strictly decreasing in K, and as
intuitively expected, tends to zero as K -- oc. Hence, setting a value, i-, M to 92i(A)
will determine the value of K. Thus, the choice of K will be based on a tradeoff between


But









protection against model failure and the excess Bayes risk that one is willing to tolerate

when the assumed model is true.

We now proceed to the study of robust EB estimators in the next section.

2.3 Robust Empirical Bayes Estimation and MSE Evaluation

In the previous section, the variance component A (> 0) was assumed to be known .

If, however, A is unknown, we estimate A from the marginal distributions of the yi's. We

may recall that marginally yi t N(x'b, V, + A). There have been various proposals for

estimating A including method of moments (Fay and Herriot, 1979; Prasad and Rao, 1990)

method of maximum likelihood (il\,), or restricted maximum likelihood (REML) (Datta

and Lahiri, 2000, Datta et al., 2005). The analytical results that we are going to obtain

are valid for all these choices, although our numerical results will be based on the ML

estimate of A, which we denote by A. Let = Diag(Vi + A, Vm + A), B = -- i =
A+Vi
1,''' ,m and b (XT:-IX)-IxT-l'y. Then the robust EB estimators of the Oi's are

given by
IREB D i ~i \ (2-16)


where )b is defined in the previous section after Eq. 2-12.

Our objective in this section is to find E(O'REB _- 0)2 correct up to O(m-1), i.e. we

provide an ..- i,!iii i ,tic expansion of this expectation with an explicit coefficient of m-1,

while the remainder term when multiplied by m converges to zero, subject to certain

assumptions. Here E denotes expectation over the joint distribution of y and 0. We need

certain preliminary results before calculating the B v.-i risks .

First we need the ...,-il' 1 1 ic distribution of A. Based on the marginal distribution of

the yi's, the likelihood for b and A is given by


L(b, A) (27)-7 2 (V + A) exp[- (y- xb)2/( + A)]. (2-17)
1 1









Accordingly the Fisher information matrix for (b, A) is given by


XTE-1X 0
I(b, A)= (2-18)
o1 'Ym(V + A)-2

Due to the ..-', iii .l ic independence of b and A, A is .i-v ,'il l ically N(A, 2('1e(V, +
A)-2)-1). We now have the following simple but important lemma.

Lemma 2-1. Assume 0 < V, < minl E|A A| = e0(m~-) and F[11i, .:: 0.
Here O refers to the exact order which means that the expectation is bounded both
below and above by terms such as Cin-r/2 and C2M-r/2 for large m. This is in contrast
to order 0 which only requires the upper bound.
A rigorous proof of the lemma requires a uniform integrability argument which we
omit. However, intuitively, E[A AI {E (V + A)-2}r/2] Oe(1). But m(V* + A)-2<

Em(V, + A)-2 < rn(V, + A)-2. Accordingly, E7(V, + A)-2 Oe((m). This leads to
EIA- A| =Oe(rm-).
Next we write |B Bil = V,A A|/[(V + A)(V + A)] < A A|/(V, + A) for all
i 1,... m which leads to the desired conclusion. Also, we conclude from the lemma
that
P(|A A| > E) = O(m-), P(maxl c) 0(m-), (2-19)

for any arbitrary c > 0 and arbitrary large r(> 0) which we require in the sequel. We may
point out that the assumption regarding the boundedness of the V's is required in any
.,-i~, iI 1i ic Bayes risk (or MSE) calculations in this context as evident, for example, from

Prasad and Rao (1990).
We are now in a position to find an expression for E (OEB _- 0)2 which is second order
correct, i.e. the bias is of the order o(m-1). The following theorem is proved.
Theorem 2-3. Assume (i) 0 < V, < minl








(ii) maxi

E(O EB Oi)2 li(A) + g2i(A) + g3im(A) + g4im(A) + o(m-1),

where gi (A), g2i(A) and g3im(A) are defined in Theorem 2, and

94i,(A) B 2D2(1 Bi)2[ (- Bj)2]-1[2(2K2 1) 4(K2 1)@(K) 5K(K)],
j=1
The proof of this theorem is deferred to the appendix.
2.4 MSE Estimation

The present section is devoted to obtain an estimator of the approximate MSE
derived in the previous section which is correct up to O(m-1). As noted already under
the assumptions of Theorem 3, gi3, and g4i, are all O(m-1). Hence, substitution of A
by A in all these expressions suffices to find terms which are Op(m-1). However, a similar
substitution for gli or g2i will ignore terms which are Op(m-1). Accordingly, to estimate

gli or g2i we need further Taylor series expansion.
To this end, we first state the following lemmas.
Lemma 2-2. Assume conditions of Theorem 2-3. Then

E(A A) (tr& -2)-ltr[(XT-l X)-l(XTX-2X)] + O(m-2).

Proof. From Cox and Snell (1968), due to assumption (i) of Theorem 3,

E(A A) (1/2)I-'(A)[K(A) + 2J(A)] + O(m-2), (2-20)










S tr( -2) and K(A)


where


I ,
ab 2aA

E
/ ,
'4

abi A2

XTE-2X

OT


' ,4 9ObpA2



S 'A 'A2

.42 A3


(2-21)


0

2tr(E-3)


and


,( I O log L )

E

( OlogL )( -,
(Ob \ ( A2

Opxp Opxi

01xp -tr(E-3)


-( )( OlogL




... (alo ( logL
' \ ~ )k OA2 I


(logL)( -, I'.)
Ob1t 9A2 I


(OlogL -, \ a.i)

(\ L)(\ 9A2 )
O."A )\ OA2


(XTE- X)-1
i-1T


01

{tr(-2)} -1


Str[(XE- 1X)- I(XT-2X)].


The Lemma now follows from Eq. 2-20 and Eq. 2-23.


M2


Since


(2-22)


K(A) + 2J(A)


tr [(XTX X)-I(XT -2X) + {2tr(E-3) 2tr(E-3)}{tr(E-2)}-1]


(2-23)


j (V + A)-2


tr(I-1M), J(A)


tr(I-'M2),


where I( A)









Lemma 2-3. Assume condition (i) of Theorem 2-3. Then


E(B, Bj)


TnZ
B(t Bj){ ( Bj)2-1I[2(
j=1

-tr {(-(1- Bj)xjXT)-(-(1
1 1
gsim + o(m-1) (--).


Bj)


B x)2xjXT)}]+ O( -1)


Proof: By Taylor expansion and Lemma 1,


E(B, Bj)


V-(V + A)-2E[(A A) (A A)2] + o(m-').


Now by Lemma 2, and the fact that E(A A)2 (2trE-2)-1 + o(m-1), we get


E(B, Bj)

= Be( +

Bi(trE-

B= [2(1 -


A)[ 2 tr {(XT 1X)-1(XT -2X}
(A + A)(trE-2) tr-2
-2)-1[2(V + A)-2 (V + A)-'tr{(XT -1X)-1(X -2X)}] + o(nm-1)

Bi)2{Z(- Bj)2}-1
j=1


m m m
-(1 B ){Z(1 B)2}-1tr{((1 Bj)xjx)- ((1- )2xT)}] + o(m -1)
j -1 1 1

Bi(1- B){ (1- Bj)21-1[2(1- Bj)-


mr
tr{(Z(1


Bj)XjX)- I( (1 Bj)2XjXO)}] + o(-1).


This proves the lemma.

Remark 3. It may be noted that the leading term in E(A A) is O(m-1) and g95in

O(m-1).

The lemmas provide approximations to E(A A) and E(Bi Bi) both correct up

to O(m-1). Thus, by Lemmas 2 and 3 we get the following theorem which gives us the

estimate of the MSE correct up to the second order.


(2-24)









Theorem 2-4. Assume the conditions (i) and (ii) of Theorem 2-3. Then the estimate of

E(OfEB Oi)2 correct up to O(m-1) is given by


giw(A) + ,,g5im(A) + g2i(A) + g6im(A) + g3im(A) + g4im(A).

where g1i, g2i, 93im and g4m are defined in Theorem 2, and


m
9gim(A) = Bj(1-B)j{Y(1-Bj)2 -1[2(1-Bj)-tr{
j 1
m m
g6im(A) = 2Bf[ (Vj + A)-21-1tr[{Z(Vj + A)
i= 1 i 1


m(
( (1t


m
-lxj}-I {E(Vj
i 1


Bj)2xTxj}],


+A)-2xjxj}]


(2-25)


[(1 + K2) (-K)


Proof : In view of Lemma 2-3,


E[Vi(1 B)]


giu(A) igim(A) + o(m-1)


Hence, we estimate gli(A) by gli(A) + Vig5im(A).

Next, by one step Taylor approximation,


92i(A) g2i(A) + (A -



E[g2i(A)]= g2i(A) + E(A


KO(K)].


(2-26)


Hence ,


A)A '


a,1
A) A


(2-27)


Bj)XjX m)-( (
By}xyx )- Y( (1









Next we calculate


2 m
dA

d [2( + A)-2{y + A Xj (Vj + A)-I1^ )-I }]
j-1

V 2 [-( + A)-2 + 2(V + A)- (Z(Vj + A)- 1X )- 1
j 1
m m m
+( + A) 2XT(y j + A) 1XXT)-1(Z(j + A) 2XjXT)(Yj + A) -1XXT)-X]
j=1 j-1 j-1
-B + 0(Tm-1). (2-28)


Hence, by Lemma 2-2 and assumptions of Theorem 2-3,


E[g2i(A)] =g2i(A)
m m m
2B[E(j + tA)r-2]-Ir[( + A)-lxjxT)- I((yj + A)-2xxT)]
j-1 j-1 j-1

x {(1 + K2)>I(-K) K(K)} + O(m-2)

=g2i(A)- g6im(A) + O(m-2)

(2-29)

Thus, we estimate g2i(A) by g2i(A) + g6im(A). The theorem follows.

2.5 Simulation Study

We consider a simulation study to see the effectiveness of the proposed estimators and

compare them with standard empirical By,-,i estimators. The layout of the simulation is

as follows:


Generate Oi = + u, i = 1, m.
Generate yi = O + e, i = 1, m.

We take e ~- N(0, Vi), where for m = 10,











.01 i= 1,- -4

.1 i 5,...,7

1 i 8,--- ,10
and for m = 20, the numbers of small areas are doubled within each group. We consider

different distributions for generating uj's. (i) Contaminated normal distribution. In

particular .9 x N(0, 1) + .1 x N(0, 36) so that 1 of 10 areas is expected to be an outlier. (ii)

Normal distribution N(0, 4.5). Note that 4.5 is the variance of the contaminated normal

distribution. Then we calculate the following quantities to measure the performance of the

SAE. Let i, be the estimate of Oi. Here we define the absolute relative bias as


1 O N-
RB (OY) 1 R 0H o

and simulated mean squared error as


SMSE(O) =- (O) r) 2
ri1
where R is the number of replicates which is taken as 1000. The notations used in Table

2-1 are as follows:

RB: Relative bias of standard EB estimates; RBR: relative bias of robust EB

estimates; RBI: relative improvement of relative bias of robust estimates over the standard

B-,--i estimates; SMSE: empirical MSE of standard EB estimates; SMSER: empirical MSE

of robust EB estimates; RMI: Relative improvement of MSE of robust estimators over the

standard B-,-i estimators. REBN: relative bias of the naive MSE estimates under robust

EB method; RI: relative improvement of bias in the estimated MSE over naive estimated

MSE under the robust method. Here the naive MSE estimates are obtained by taking only

the first two terms of MSE estimates. The values corresponding to normal random effects

are reported within parentheses.









It is evident from Table 2-1 the robust EB outperforms the standard EB. The relative

bias in SAE is about 24 less on an average under the robust method when m 10 and

the distribution of the random effects is a mixture of two normal distributions. This

improvement reduces to 10' when m 20. In the case of normal distribution of the

random effects, the respective improvements are only 10''. and 5 respectively for m 10

and 20. The MSE is about 1,,'. less on an average under robust method when m 10 while

for m 20 its only about -' For normal random effects this improvement reduces to 2T!' .

and 2'. for m 10 and m 20 respectively. The corrected MSE estimates are at least 5'.

less biased than the naive MSE estimates. For m 20 this difference is about ;'. Thus the

bias corrected MSE estimates are useful at least for small samples.

Table 2-1. Relative biases of the point estimates and relative bias of MSE estimates.
K RB RBR RBI SMSE SMSER RMI REBN REB RI

m 10
.5 2.743 2.011 .267 .909 .344 .622 -.106 -.087 .055
(1.150) (1.051) (.086) (.460) (.345) (.250) (-.065) (-.050) (.0434)
1 2.202 .197 .355 .609 -.154 -.133 .059
(-) (1.067) (.072) (-) (.350) (.239) (-.078) (-.062) (.046)
1.5 2.067 .246 .372 .590 -.176 -.153 .062
(-) (0.969) (.157) (-) (.350) (.239) (-.084) (-.066) (.051)

m 20
.5 2.581 2.392 .073 .329 .320 .027 -.059 -.050 .028
(.683) (.653) (.059) (.322) (.321) (.003) (-.026) (-.018) (.025)
1 2.390 .074 .328 .003 -.090 -.080 .0305
(-) (.677) (.023) (-) (.318) (.012) (-.041) (-.033) (.025)
1.5 2.120 .179 .318 .033 -.089 -.078 .035
(-) (.649) (.064) (-) (.310) (.037) (-.081) (-.074) (.023)









CHAPTER 3
EMPIRICAL BAYES ESTIMATION FOR BIVARIATE BINARY DATA WITH
APPLICATIONS TO SMALL AREA ESTIMATION

3.1 Introduction

Analysis of multivariate binary data is required in rin ,i areas of statistics. For

example, in longitudinal data analysis, quite often one encounters binary responses for

two or more variables (Bishop et al. (1975); Zhao and Prentice (1990)). Bivariate binary

data arise very naturally in opthalmological studies. Item response theory, which occupies

a central role in psychometry and educational statistics, deals almost exclusively with

multivariate binary data. The classic example is the Rasch model with its many and

varied extensions.

Our interest in the analysis of multivariate binary data stems from their potential

application in small area estimation. We consider the case when the response is bivariate

binary. In Section 2, we have first derived the B-,-,i estimators, and subsequently the EB

estimators for the small area parameters. Section 3 is devoted to the derivation of the

MSE's of B ,--, estimators, and second order correct approximations for the MSE's of the

EB estimators. Section 4 is devoted to the second order correct estimators of the MSE's

of the EB estimators. We have conducted a simulation study in Section 5 to illustrate our

method.

3.2 Bayes And Empirical Bayes Estimators

Let y = (yui, I/ ._.)T (j = 1,.- .* ni; i = 1, k) denote the bivariate binary response

for the jth unit in the ith small area. Thus each yij (1 = or 2) assumes the values 0 or 1.

The joint probability function for y is given by

/ exp(' iyiji + '. + '. ,;ii, ) (3
l+exp('. ,) +exp(', .) +exp( -', + -. .+ ..)

With the reparametrization

exp( 0)1)
Phi 1 + exp( '. ,) + exp( '. _) + exp( ', + + .) P( 1, 0),








exp( ,)
1 l+exp( ') +exp(' .. )+exp( ', + ..+ ..) 0 .. 1)
and

i3 exp( + '..+ ..)
pjt3 l+exp( ',,)+exp( .:_.)+exp( ,,1+ ..+ P.) P' 1,.. 1)

we can rewrite the model given in Eq. 3-1 as

.(1- Yj2) (1-Y ji), i2 (1-yiji)(l-yj2) (3 2)
P(Yiji, 1 ..) iji P1 Pij3 P4ij (3-2)

where P4ij 1 1plij. Let Olij P= ij + Pij3, O2ij P .. + Pij3. Our primary interest

is simultaneous estimation of the small area means (01i, 02i)T, where 01i = YZ, .01ij,

02i 1 it 111' *02ij, 1ji' = 1 for each i. Here the weights w, .'s are assumed to be
known, based on some given sampling scheme. We begin with the likelihood
k ni
L = 1 l P(YJi, 2iJ). (3-3)
i= j=1

Writing pT = (p, p ), the conjugate Dirichlet prior is given by

k ni
P) \ 1rn A -lmiP 1 Arm2-1 Arm3i j-1 Amn4i- 1
l() i= l j 1

where A is the dispersion parameter, m4ij 1 t- E3 lij, and the mlij are functions of
the covariates xij associated with the binary vector y. In particular, we take

mnl = exp(xTbl)/[1 + exp(xTbi) + exp(xb2) + exp(xb3)], (1 1, 2, 3) (3-5)

where b1, b2, b3 are the regression coefficients. Writing zlij = yil(1 i/ ._),z2ij

i ._(1 yiji), Z3ij Y Yijli .: and z4ij 1 y3I zlij, the posterior for p is now given by
k ni
7r(p Y) Noc J J[ Z1+Ami -1izAin y AmiJ-lZ 3i+Am3ij- l T 4i m4ij-11 (3 6)
i= 1 j 1









Before proceeding further, we recall a few facts for the Dirichlet distribution. Suppose

ul, ur have a Dirichlet pdf given by
r r
f (ui, ,Ur) x (hf u1 1)(1 ui)ar 1i1,
l 1 l 1

where a, > 0 ( 1,... r+1). Then E(u ) ay/E j a, V(uJ) aJ((El,) a)/[(Er al)2

(Ei al + 1)], Cov(u UJ) -aay,/[(E r+11a)2(E 11a + 1)], (1 < j / < r). Hence
from Eq. 3-6, the Bw .-- estimator(posterior mean) of plij is given by


p, (zi + Ami(b))/(A + 1); j 1, = 1, m. (3-7)

The resulting estimators for 01ij and 02ij are

ij Pj + P3ij, 02ij P + Pij (3-8)

In an EB scenario, the parameter b6, b2 and b3 are all unknown and need to be

estimated from the marginal distributions of all the y's. We assume that A is known,

and outline later its adaptive estimation. Direct estimation of A is impossible from the

marginal distributions of the y's since marginally P(yijl 1,- i.. = 0) = E(zu-) =

ulij,P(yijl 0,1 ..= 1)= E(z2ij) M2ij and P(yijl 1, ..= 1)= E(z3ij) 3ij,

and the joint marginal of the y is uniquely determined from these probabilities resulting in

nonidentifiability in A.

One way to estimate bi, b2, b3 from the marginal Dirichlet-Multinomial distribution

of the y's is the method of maximum likelihood. But the MLE's are mathematically

intractable, and are not readily implementable in practice. Instead, we appeal to the

theory of optimal unbiased estimating functions as proposed in Godambe and Thompson

(1989).









To this end let gij = zlij mij, g2ij Z2ij m2ij and g3ij Z3ij M3ij. Then writing

zij = (zlij, z2ij, Z3ij)T, ij = V(zij),


-E(a ) -E(-) -E(-9i
ab, ab, ab,
D T -E ) -E(_ ) -E(_) ,
--i(jlj gi gi
-E(g) -E(--) -E(-3)i

the optimal estimating equations are given by iED gyDi ij =gij

equations, we find the estimators of b5, b2 and b3. We first observe that


Again from Eq. 3-9, D = Efi (j

equations Di ji 1 -1

further simplifies into


k lj

i= 1j= 1


Ej = Diag(mlj Tm2ij, mT3ij) -- 2ij lij m 2ij f 3ij

mn3ij


(3-9)



0. Solving these


(3-10)


xij, where 0 denotes the Kronecker product. Hence, the

- 0 can be rewritten as Y Z E ['il[13 0xijij = 0 which


k ni
xjmli (1 = 1, 2, 3).
i=1 j=1


(3- 11)


Solving Eq. 3- 11, one estimates bl, b2 and b3. The EB estimators of plij (1

now given by


1, 2, 3) are


pEB (zlij + A nj(b))/(A + 1), (3-12)

1, i = 1,... ,k; = 1,2,3, where b = (b6 b, 3 ). Then the EB estimators of

and 02ij are given by


nEB EB ,EB OEB EB -EB
Iij Plij + P3ij 02ij 2ij + P3ij *


(3- 13)


We write (i, B ( EB, 2B)T, where O

E t ri' (r 2; -1 ,... k, ).


E 1,.0B ,"








3.3 MSE Approximation
In this section we find the MSE matrix of O= (6 0B)T, and also of 0fB

(O i, O2B)T which is second order correct, i.e. correct up to O(k-1).
Theorem 3-1. Let Uk i li E_ iJ1 0 xx] = Oe(k). Then

MSE(OEB) = A(A + 1)-2 A(b) + A2(1 + A) -2B(b) + O(k- 1),


where


Ai(b) -
.j 1


(mn7ij + T 3ij)(1 1ij T 3i)
mT 3i j (n + n3ij )(mn2ij + T3ij)


3ij (lij + rn3ij)(Tn2ij + Ti 3ij)
(n2ij + T3ij)(l T2ij -3i- j)


and,


(b 1 0 1 mij(b)i amij(b)
BY(b) [ b Ob 0 .
0 1 1/ j" j1


Proof:
Note that for i 1,... ,k,


MSE(OfB')


E[(Of' i )(OfB ) T]
E[(Ob O+ OB bf)(0 o + B oT]
E[(B o)( B o)T] + E[(Of 'B)(OEB OfB)T]

MSE(O) + E[(Of Oj)(Of" ofT].


We first observe that for j / j', 1 < 1, 1' < 3,


E[{(A + 1)-(zij pli) A(A + 1)- (plij m i)}

x{(A+1)-(., .-- .')-A(A+1)-(, -m, .)}]

0,


(3-14)


E(p Pl.j)( ,- .*)









since (zij, plij) is distributed independently of ( .*', p) for j / j'and all 1 < 1, 1' < 3.


Hence,


(3-15)


for all r, s


ni
E[(O B rie Osi) 0 (O
j-1
1, 2. Thus,


(OB, -_ 01i)2
- Olij)(2Bj 02ij)


(i lij)( 02ij)
(o2j o2ij


1 0 1

0 1 1


Now for each I 1, 2, 3,


Plij) A(A + 1) -(Plij- mij)]2


(A + 1)-2E[zj -Ph- A(pij


First conditioning on puij, it follows from Eq. 3-17 that


(A + ) -2E[pj(1 pu) + A2 (Pli


(A + 1) -2[mTi


mlij )2


2 Tnij (1- m~j) + A2 (lij ( m )l
mj A+ I A +


A(A + 1)-2m(ij


MSE(OBf)


"t
1(
j- (%j


j=1


(3-16)


E( p.plj)2


E[(A + 1)- (zlij


mri)]2.


E(p _- Plij)2


(3 17)


Orij)(si -sij),


E[(B ei)(OB ei)T]


(3-1t8)


1 0
p B)(i p )T] 0 1


1 1









Again for 1 < I 1' < 3,


E(pl' Plij)( p. .)


E[{(A + 1)- (z1ij lij) A(A + 1)- (p. m )}

x{(A + 1)-(, p, .) -A(A + )-(p. )}]

(A + 1)-2E[-p aPij + A2Cov(pljp, ,)]
(A + )-2E[ A lij A 2. ,


A+ I
A(A + 1)-2r ..


A(A + 1)-2[Diag(mlij, T 2ij, mn3ij)


Tn ii

2ij ( 1ij T2ij T3ij ]. (3-20)

T 3ij


It follows from Eq. 3-16 and Eq. 3-20 that

MSE(0) A(A + 1)-2
ni (, lij + T 3ij)( T1ij 3ij) 3ij (lij + r 3ij)( n2ij + T 3ij)
X tW'
j=1 [T3ij ( nlij + 3ij) 2ij + ijj) (n2ij + Tn3ij))(1 T 2ij Tn3ij)
(3-21)


Next we observe that


E[(oB oB)(oEB B)T]


[ EB
(lij
it' it ,B
EB
(02


B EB
1lij) lij'
OB lEB
2ij (0lij'


B EB
olij') (0lij
1B -EB
01ij') (02ij


B EB B
lij) (02i 02ij)
6B -EB 1B
02i) (02i 02ij)


(3-22)


Hence,


A-h 1


MSE(p )


(3-19)


Y K

1 0
1 0 1
" 0 1 ( P ,),_ ) P 1
\1 1








Writing mij(b) = mnj = (Tmlij, Tm2ij, T3ij )T,

E[(PEB -pB0)(PE_-P,) T] = A2(+-2E[{m(b)


By one-step Taylor expansion, for I 1,2, 3,

rniO(b) flhy (b))T(b
Hence(b) f ij(b) + ( b (63,

Hence for < I/l_ < 3,


[ amj(b) TE[(


b)( -b)fl [ml (b)]
(3-25)


Next we find an approximation to E[(b
To this end, let Sk(b) Y E


E[ ak(b)] =
i= l j 1


b)(b b)T] which is correct up to 0(k 1)


DT E1 gj. Noting that E(gj)


E[D E 1Diy + (DE )g]


SiJ

j=1


0,


x x ij].


(3-26)


We denote by Uk the expression in the rightmost side of Eq. 3-26. We assume that
Uk = O(k). This leads to the approximation


E[(b- b)(b- b)T] = UUkUk1


Uk1 = 0(k-1).


E[(pSB ) -pB)IVT
[ pEB p ) (pEB B), Vijj,,


(3-24)


w. ,(b)}]


i=l j 1


Now


(3-27)


mii(b)}{mij,(b)-mij,(b)} T]. (3-23)


E[{mnj(b)


muj(b)}][{w. .(b)









where


V Ij, A2(1 + A) -2
(n(b) )TUk ( am ,(b) (b))TU ( a (b)) ( ,) k ( (b)
( (b))TU j() ( a b))TU (am (b)) ( )k" b l (b)

3(b))TUk (amT,(b) ( aT3(b))TU (am2i(b)) ( "bTU (am (b)
(3-28)


where


mi y(1


T
mnij) -mTlijTm2ij TlijT3ij


-mnlijT 2ij T 2ij(1 n2ij)


-mTlij m3ij


-m2ijT3ij mn3i (1


T
Tm2ijT3ij


T
m2ij)


(3-29)


(3-30)


(3 31)


By Eq. 3-23-Eq. 3-30,


E[(oB o)(oEB oB)T]


A2 1 20T 8m U(b)\ a9miy(b)
(1I A)2 1 1]TU [l ,, 0 1 .(3-32)
S)2 0 1 1 Ob 1Ob
)1 1


Hence,


MSE(OEB) = MSE(Of)


(t1+ A)2


1 0
S0 1 ba (b) am aij(b)
S ]TU-1 [ 0 .(3-33)
0 1 1 1Ob
\1 1


The theorem follows.


mij (b)/9b


Om2iy(b)/Ob


Om3ij(b)/9b








3.4 Estimation of The MSE
In this section, we find an approximation of the expression given in the right hand
side of Eq. 3-33 which is second order correct, i.e. the first neglected term is o(k-1). We
may note that the second order term in the right hand side of Eq. 3-33 is O(k-1). Hence,
substitution of b for b estimates the same correctly up to O(k-1). However, estimation of
MSE(OB) with the simple substitution of b for b is not adequate since the first neglected
term is O(k-1). Hence, we need to estimate MSE(bO).
To this end, we begin with the evaluation of E[m1ij(b)(1 miyj(b))]. By two-step
Taylor expansion,

1ijO~(b) ( TijT(b) + ( b 1 2b)
m j(b) *m (b) + ( b )(b b) + 2(6 b 8 (b b). (3-34)

Hence,

E[mij (b) (1 mij(b))]
E[T{m (b) + (a b))T(b- b) + (6 b)T li(b) (6- b)}
Ob 2 ObObT
x {1 mii(b) ( b (6 b) -(6 b)0-2T (b- b)}]
ab 2 abab
Smiij(b)[l miij(b)] + (1 2mlij(b))( aml )E(b b)
1 a2m1ij (b) (am li (b) m1sy (b)
+ tr[{(1 2miij(b)) b ab b b)( b )T}E(- b)(6 b)T].
(3-35)









and this approximation is correct up to O(k-1). The expression for 9miuj(b)/9b is given in

Eq. 3-29. Based on that expression, one gets

02M 1ij


miiy(1 m1ij)(1 21mij) -mTijTm2ij(1 21mij) -mTijTym3ij(1 2mij)
T
-m1ij 2ij(1 2miyj) -Tim2ij(1 j -- 22ij) 2mij2ij3ij xiij

-mTnijT 3ij (1- 2mij ) 2mim2ij3ij -m1ijT3ij(1 2m 3ij)
(3-36)

We need to find E(b b). We follow Cox and Snell (1968) to find this expectation.
Slij T1ij
To this end, we first note that OSk/Ob Z= I [E i E {I3 Xij Z2ij 2ij ]is

Z3ij T- 3ij
a constant. Hence, since E(Sk) = 0 the expectation of the product of any element of Sk

with any element of OSk/Ob is zero. Thus denoting J = Ju,rs,mn) = (E(Sk,u., aob .)), we

have Jl,,rs,mn = 0 for all 1, r, m = 1, 2, 3 and u, s, = 1,... ,p. Accordingly, from Cox and









Snell(1968),


tr(Uk 1K11)



tr (Uk 1K1,)



tr(Uk 1K31)



tr(Uk K3p))

tr(U k 111)



tr(Uk 1K1)



tr(Uk 1K31)



tr(Uk K3p)


tr(Uk- 1J1)



tr(Uk 1Jlp)



tr(Uk 1J31)



\tr(Uk 1J3p)


(3-37)


a2 a2 a2
ab abi abab7 abab3


a2
ab2ab
92
absab"


a2
ab29b2
92
absab2


a2

a2
ab3ab3


(3 38)


k i 1


E(b- b)


1
2 Uk


2
2 Uk


where









I 1, 2, 3, u = 1, .* p. In particular, we have


a2 k ni
"b, Bbf [S (zlij
i 1 j 1


02 k ni

OabOb, 1


T jljij )iju



Tmijn)XijU]


i 1 i 1

2 k ni
bt b (zzij m~ijy)x( 2


i j= 1
ab11(abf)1


a2 k ni

ObOb1,b [ (zi1
i 1 j 1
02 k ni

O ,Ob ,,b [i (z 1
i 1 j 1


k

i 1
k

i= 1


Tj)xi jU


Tnj)Xi jU


an
Ym. .M. .m(1
j 1


j 1


k ni
-2 .m .
i= l j=l


T
2mij)xijxijxiju,


2m, .XijXTXij,


.m )XijXijxiju


where 1 < 1 / l' / l" < 3, 1 < u < p. Thus E(b b) = 0(k-1).

From Eq. 3-39-Eq. 3-42, one obtains the elements of the matrices Kiu for all

I 1, 2, 3 and u 1, ,p. Thus from Eq. 3-36-Eq. 3-42, the bias-corrected estimate of

mnrij(b)(1 mTij(b)) is given by


tr(Uk 1(b)K1l(b))


1 \1ij (b)
ij s(b) [1 --biji(b)] (1 2bij (b)) b |b b)u -l()


-tr[{2(1
2


am1ij (b)
ODb bb


T
li~ j i iju


(3-39)


(3-40)


(3-41)


(3-42)


tr(U (b)K3p(b))
m )(bij (b) U-
-6 Ob Ib b)-}U1k


I02M1ij(b)
2 mnjij (b)) 8b -b b








Similarly, one finds bias-corrected estimates for rn2ij(b)(1


m3ij (b)). Next we calculate

E[mj1(ib)m2ij(b)1
E[{mi,,(b) + ( ib))T(6
Ob


b) + -(6
2


_b)T 2mlij(b) (b
ab) b(
8 bQ^b


b)}

b)}]

b)


1 022ij(b) 02r1ij (b) lmij (b) )a(mlij(b))T
+ tr[{{my(b) + m2(b) } + ( )( )}
x E{(b- b)(6- b)T}].

Hence we estimate mljn(b)m2ij(b) by

mijy(b)rm2ij(b)


tr(Uk '(6)K1(b6))


tr(U'(b)K3p())


m2ij () m i (b) + ,m jb
[ b) b b+m2-j(b) b b


1 02m2i(b)
tr[{2 {mi~(b) b0br b-


am ij (b)
+ b b b


]Tt kl '(b)
-62k


02M 1ij(b)
b+ m2iJ(b) Tbbr b


Omij (b) b-
6)( Ob Ib b ) 6 k


Similarly, one finds bias-corrected estimates of m (lij(b)Tm3ij(b) and mn2ij(b)T3ij(b). Hence
we get the following theorem.


Theorem 3-2. Assume Uk


Oe(k),then MSE(O B) can be estimated, correct up to


0(k-1), by


A(A + ) 2C() + A2( + A) -2 (b),


where


c (b)
j=1


S9


1 0 1
0 1 t


t1 1


\ i j (b)1 8 2 021iJ (b)
X {m2i(b) + ( 2b))T(6 b) + (6 b)T 2i (b6
mb 2 Ob T(b
mjij(b)m2ij(b) + [miij(b) am2(b) + 2ij(b) Mib E(6
+b O2~b)alb


(3- 43)


m2ii(b)) and m3i(b)(1


-








with


(Daj (b)p,q mpij(b)[1 mpij(b)]


S(1 2mpij(b))( Ompij (b) ) (b)

tr(UkV (b)K3p(b))
-tr[{(1 2mij(b)) 02 ij (b) (O j(b) b pij (b) bTIU-1
tr[{21 2m b)) T b6 |b bb \b b bb) U b
for p = q = 1, 2, 3,

= mpij (b) qij (b)

[mpi (b) Ob qijb + mq (b) m b ]T
Ob b b +m-6Ob) a -bb




Xtr(u1(b)K3p ()
1 2mqij(b) 02m -y(b)
-tr[{{mpijy(b) baibr b + mqij(b) bb Ib b 0
2 abOabbOrb bb) + U
+ (mpi (b) b b)(m b) b)}Uk ], for p/q 1,2,3.

Remark. As mentioned in Section 2, direct estimation of the dispersion parameter A is
impossible due to its nonidentifiability. It may appear that an adaptive estimator of A is
obtained by minimizing some suitable real-valued function of the approximate estimated
MSE matrix with respect to A. For example, we can try to minimize the trace of the
estimated MSE matrix. Writing the trace of the matrix given in Eq. 3-43 as g(A), we have

k
g(A) = [A(A + 1)-2tr(C()) + A2(1 + A) -2tr(Bb)).
i= 1
Obviously, g(A) attains its minimum 0 atA = 0. But A = 0 gives only the direct estimator,
which we want to avoid for small area estimation. Also, solving g'(A) = 0 subject to the









constraint A > 0, one gets


A mnax[ tr0].b
ti c tr(Ci(b)) 2 K tr(Bi(b))

A plot of g(A) reveals that the function increases from 0 to A and then decreases from

A onwards. Further, since Ei tr(C(b)) = O(k) and 1tr(Bi(b)) 0= (1), for large k,

A is close to 1. This is also revealed in our data analysis given in the next section. Due to

the same condition, the dominating term in the estimated MSE is the first term, namely

A(A+1)-2 E ltr((C(b6)). Since A(A+1)-2 +-1 A-1+1)-2, the dominating first term has

the same expression for A and A-1. Hence, the minimum is attained only for very small

or very large values of A for most practical applications. Thus, the choice of A reflects a

trade-off between the direct and the regression estimate, and practitioner should be guided

by the amount of bias that he/she is willing to tolerate.

3.5 Simulation Study

In this section, we will discuss a simulation study to compare the performance

of the EB estimators with the usual sample mean. Here, we consider k = 10 and

k = 20 with ni = 5Vi. To start with, we generated the covariates from a N(0, 1)

distribution and kept them fixed for the rest of the analysis. We generated zzij's from

a multinomial (pijy,p2ij,p3ij,p4ij) distribution where the p., 's are simulated from a
dirichlet(Amijy, Am2ij, Ars3i, Ar4ij) distribution to replicate the bivariate binary model

described in Section 2. Here the mnij's were modelled on the covariates through the

multinomial link. For our simulation, we assumed b, = (0, 1)', b2 = (0, 2)' and

b3 = (0, 1)'. Our parameters of interest were the small area means Or 1(Prij +

P3ij), r = 1,2. We used the optim function in R to estimate the parameters. We calculate
the following summary measures to examine the performance of the estimators. Let r)

be the estimate of 0(r) for the 1th small area at the Rth simulation. Then we define the









Average Relative Deviation (ARD) as

1 k R (r (r)
ARD Y j 1< 0 (3-44)
kR ZZ O)
i=l r= 1

The Average Relative Mean Squared Error (ARMSE) is given by

1 k R (r) r)2
ARMSE = k 0(7)2 (3-45)
i=1 r=1 i

The Average Deviation (AD) is denoted by

k R
AD= | _- O (3-46)
i= 1 r 1
The Average MSE is given by

k R
AMSE Y Y O))2 (3-47))
i=1 r=1

and, the estimated MSE is given by

1 k R --- (r)
EMSE= k Y MSE(04) (3-48)
i=l r =

where MSE(Oi) is the second order correct estimate of the MSE as given by 3-43.

In addition to the relative bias of the point estimates and simulated MSE's, we report

relative bias of the estimates of the mean squared errors as defined below.

Relative bias with respect to the empirical MSE:


REB E{MSE()} SMSE(, 1) rn
SMSE(0i)

where, SMSE(O,) = i 1EZ 1(r O )2 E{MSE(O) 1 MSE(0i)

As matter of interest we also computed the direct estimators to compare their

performances with our empirical B-,-.-- estimators. We observed there was considerable

improvement in terms of bias and MSE when we used our EB estimators as compared

to the raw estimators, e.g., for A = 5, the improvement was 48.5'. for ARD, 98.1 .









for ARMSE, 42.6 -, for AD and 65.9 for SMSE for the first variable. For the second

variable, the improvements were 50.4 for ARD, 87.6 for ARMSE, 24.5 for AD

and 71.5 for SMSE. Table 3-1 and Table 3-2 summarizes the performance of the EB

estimates for the first and second small area mean respectively in terms of their bias and

MSE estimates.

Table 3-1. Relative biases of the point estimates and relative bias of MSE estimates for
the small area mean: the first variable.
A ARD ARMSE AD AMSE EMSE REB


0.614
0.289
0.192


0.358
0.290
0.180


0.036
0.330
0.071


2.32
.0402
0.005


k 10
0.081
0.079
.069

k 20
0.080
0.089
0.068


0.011
0.012
0.008


0.0108
0.013
0.0073


0.037
0.043
0.023


0.036
0.039
0.022


2.385
2.355
3.276


2.406
2.347
2.406


Table 3-2. Relative biases of the point estimates and relative bias of MSE estimates for
the small area mean: the second variable.
A ARD ARMSE AD AMSE EMSE REB


0.422
0.208
0.151


0.209
0.290
0.1412


0.007
0.182
0.050


0.355
.008
0.023


k 10
0.074
0.084
0.063

k 20
0.075
0.079
0.062


0.009
0.010
0.006


0.010
0.011
0.006


0.007
0.009
0.005


0.007
0.008
0.004


0.214
0.283
0.196


0.221
0.225
0.206









CHAPTER 4
HIERARCHICAL BAYES ESTIMATION FOR BIVARIATE BINARY DATA
WITH APPLICATIONS TO SMALL AREA ESTIMATION

4.1 Introduction

In C'!i pter 3, we considered empirical B-,- estimation of the small area parameters

for bivariate binary data. In this chapter we have implemented the hierarchical B-,.--i

procedure for bivariate binary data when there is missingness present in the covariates.

For our purpose, we have assumed the missingness to be missing at random ( \!AR) (Little

and Rubin, 2002). We have incorporated ideas from Lipsitz and Ibrahim (1996) and

Ibrahim, Lipsitz, and C'!, 1i (1999) to treat the missing covariates.

The general HB models which accommodate this missingness is discussed in Section

2. We have, after suitable reparametrization, converted the bivariate binary likelihood

into a multinomial likelihood, and subsequently have utilized the multinomial-Poisson

relationship leading eventually to a Poisson reparameterization. B i, i ,i analogues of

log-linear models are used to estimate the Poisson parameters. We have applied the

HB methodology to obtain the small area estimates, the associated standard errors

and posterior correlations. Due to the analytical intractability of the posterior, the HB

procedure is implemented via Markov chain Monte Carlo (\ C'\ C) integration techniques,

in particular, the Gibbs sampler. In Section 3, The methodology is illustrated with the

real data related to low birthweight and infant mortality.

4.2 Hierarchical Bayes Method

Let y = (yij, /i ._.) (j = 1,.- .* n; i = 1, k) denote the bivariate binary response

for the jth unit in the ith small area. Thus each yijl (1 = l or 2) assumes the values 0 or 1.

The joint probability function for y is given by

/ ) exp( .1 i + + + '/i ._.)(4
-) + exp( ,) + exp( + exp( -, + + ..) 1)








Following the same transformations as in C'!i ,pter 3, we arrive at the model

S(1-Y )2) (1- Yij)l i2 (1-YI il)(1-Yi2) 4 2)
P(Yili I1 ._) p Pij3 Pij4 (4-2)

where pij4 t1 1YPijl.Then writing zij = yijl(1 / .), ~.. =/ ._(1 yijl),
Zij3 ijli and Zij4 (1 y)(l / .), we get

P (yij I N ij i 2p ij3pSj4 (4 3)

Next we write piul = (yil/ 1 (ir and use the well-known relationship between
the multinomial and Poisson distribution, namely, (zii, ._., zij, zi4) have the same
distribution as the joint conditional distribution of (uijl, / .., uij3, ui4) given

Zj 1 Uijq 1, where Uiq are independent Poisson((jq) (q 1,2,3,4).
We begin modelling the (ij as

log(ij) = xbi + (4-4)

where xiy (xijl, ._.,..., ,ijq)T, b, (bli, bl2, ..., blq)T, and, id N(0, Ja2). In the
presence of missing covariates, we write x = (x mis, Xjobs) and assume that the
missingness is MAR. In the presence of missingness in some of the components of
xij, we model only those missing components for necessary imputation. Further, in
order to reduce the number of nuisance parameters, we model the joint distribution
of these missing components as the product of several smaller dimensional conditional
distributions as proposed in Lipsitz and Ibrahim (1996) and Ibrahim, Lipsitz, and
C'!. i1 (1999). Specifically, we partition Xij,mis into several component vectors, -v,

(,1)mis, ijmis, parameterized by a1,..., ar, and conditional on the observed Xij,obs









and aT = (a, ... aT), write the joint pdf of Xijmis as


f (Xij,mis Xij,obs, a) J )ijmis ijmis ij,obs, 0)
(1 ) .(1) (r-1) rr
f (Xij,mis 'ij,obs *x ij,mis ix ij,mis r)

r (r- 1) .(1) (r -2)
\ 'ij,mis ij,obs, ij,mis, ij,mis, Or-l)

x (x f(x ij,mis Xijobs, O 1). (4-5)

It is important to note that since we are assuming MAR missing covariates, we do not
need to model the missing data mechanism.
Write zij (Zijl, ._.,Zj3,ij4) and b -(b bT, bT, bb). Let D = (z, xij, j
1, 2,..., ni, i 1, 2,..., k) denote the complete data. Also, let Dobs (Zij, x i,obs,
1, 2,... i 1, 2,..., k) denote the observed data. Then, the complete data likelihood
under the hierarchical model is given by

L(b, 72, aID)


{=[Uzo1 )}-exp exp(x bi +,)} exp(-z+)}
i 1 j 1 l 1 zijl!
k ni
S 1 f(ij,obs Xij,mis, a) (4-6)
i= l j 1
and the observed data likelihood is of the form

L(b, 72, a Dobs)
j k { J ni 4 exp{ziji(xT b + e} 1
zipjt exp -exp( b T b
i 1 j 1 1i 1
2 1 k ni
x exp )dvi f(ij,bsx id ), (47)
i 1 j= 1

where p is some a-finite measure, typically the Lebesgue measure or the counting measure.
We assume bz's, I 1,..., 4, a-2, a are a prior mutually independent with b ~-
uniform(Rq) for I 1,... 4, a2 ~ Inverse Gamma(c, d), where Inverse Gamma(c, d) pdf is









given by 7(y) oc exp(- )y-d/2-1. We let 7(a) denote the prior distribution of a. Then,
the resulting posterior distribution is given by

r(b, a2, a Dobs) O L(b, a2, a Dob) exp(- --)(2)-d/2-17(). (4 8)

Direct evaluation of the posterior moments of Pijl's are not possible as it entails
evaluation of numerical integrals which are very difficult to compute. We use the Gibbs
sampling method instead. Let Xmis denote the full vector of all the ij,,mis, and v =

(vI,... vk). To facilitate Gibbs sampling, we consider the following joint posterior based
on the complete data likelihood as:

7r(b, a2, a mv,i Dob) x L(b, a2, aD) exp(-- )(2)-d/2-17(a), (4-9)

where L(b, a-2, aoD) is given by Eq. 4-6. It is easy to show that after integrating out v
and Xmis from Eq. 4-9, the joint marginal posterior of (b, a 2, a) is Eq. 4-8.
To prove the propriety of the posterior, we will mainly follow the techniques given
in C1'i. i: and Shao (2000), C'i. i, et al. (2004) and Chen et al. (2002). We note that
our covariates are bounded. We assume that aij < xij,miss < bij, which means all
the components of Xij,miss are bounded by aij and bij. We start with introducing some
notations:
let M = {(i,j) : at least one component of xij is missing},
Mij = {1 < r < q : xijr is missing}, and M* = {(i, j) : r E Mij}, where j 1,... ru,
i 1,... ,k.









Here,following C'! i: Ibrahim, and Shao (2004),


f (zijiy Ixiy, bl, v)
1


Sexp{zijl lijl} exp exp (lii)
Ziji!

Sexp exp (lijl) if Ziji = 0
+) exp(- i1) exp(- TI) if zijl > 1

Denote by 1 = {(i,j) : zi = 0} and I21 {(i,j) : ij > 1}.

exp{-exp(,y)}, f2Q() = exp(-,+) and f2(,) = exp(-,-). Note that f,

non-increasing in T1. Also fi(oo) f2(oo) 0, and, fi(oo) < oc, f2(oo)

other hand, f3 is non-decreasing in T1, f3(-oO) 0 and f3(oo) <

i-I ni and N2 be the cardinality of I21, 1, ... 4. Denote i (a
e (0, .., 1, 0, ..0) e Rk as the basis vector with 1 at the 1th component.


(4-10)


Let fi(I) =

and f2 are

< oO. On the

X). Let n =

c., ej) with

Let


U() = (uj, 1 < j < n, 1 < < k, +, (i,) 21)


(4- 11)


be the (n + N2) x (q + k) design matrix such that u = Uijfor (i, j) e I31. Let

w ) lif(i,j) e IuU 21 and 1if(i,j) e I21. Define U,1) to be the matrix with

rows


w .T 1 < j < n, 1 < i < kw^, j +uj, (i, j) C I21
wi? ui?,] _' _+?( ]


(4-12)


For 1 < r < q, 1 < / < 4, let


S ayij
ij
bij

an+ij bij


if (i,j) C IlU U 21
o.w.

if (i,j) e I21


(4-13)











I bij if (i,j) e I U 121
aIij o.w. (4-14)

n+ij =aij if (ij) E I21

We are now led to the following theorem.
Theorem 4-1. For 1 < r < q and 1 < 1 < 4, let x )( = a) for every (i, j) EC A x(1

a) for every j e MA4 n21, or, let x ) for every (i,j) e A*, x ij,r i
for every j e M4* I21,

(Cl) the matrix UM') as defined in Eq. 4-11 is of full rank;
(C2) there exists a positive vector a (ai, a +Ni) E T. such that

U( a = 0 (4-15)

where U( is defined in Eq. 4-12
(C3) fl u|qdf (u) < oc fori 1,2,3.
If conditions (C1)-(C3) are satisfied, then the posterior is proper provided
k ni
J fl f(xij,,obs\ a)da < oc. (4-16)
i=l j 1
We will defer the proof of this theorem to the appendix.
To implement the Gibbs sampling algorithm, we sample from the following conditional
distributions in turn: (i) 7r(blv, Xmis, Dobs), (ii) 7r(v b, 72, xmis, Dobs), (iii) |7r(2 v, Dobs),
(iv) x7(xmisb, v, a, Dobs), and (v) (a x mis, Dobs).
For (i), given v, Xmis, and Dobs, the bl's are conditionally independent with

7(bi v, mis, Dobs) o exp zji(xbi + ) exp(xbi +, ) .
i= l j=
It can be shown that 7r(b Iv, xmis, Dobs) is log-concave in bi and, hence, we can sample bi
via the adaptive rejection algorithm of Gilks and Wild (1992) for I 1,... 4. For (ii),









given b, a2, Xmis, and Dobs, the 's are conditionally independent with

7(' |b, r2, is, Dobs) x exp + 2j2x + ) exp(x bi + ) ,
j= 1 l= 4

which is log-concave in Again, we sample using the adaptive rejection algorithm of

Gilks and Wild (1992) for i 1, 2,..., k. For (iii), we have
k
a2 v, Dobs Inverse Gamma (c + k, d + .
i= 1

Thus, sampling a2 is straightforward. For (iv), given b, v, a, Dobs, the xumijs's are

conditionally independent with

71(xij,mis b, v, a, Dobs) o exp { Lz[(Jxbi + ) exp(x.b + )j f (x+ a).
44
Sampling x,mis depends on the form of f(xiy|a). For the data given in the next section,

Xij,mis is a vector of discrete covariates and hence, it is easy to sample xijmis from its

conditional posterior distribution. For (v), T(aolxmis, Dobs) OC [ Hi Hji1 f(xij a)] r(0).

For various covariate distributions specified through a series of smaller dimensional

conditional distributions, sampling a is straightforward.

4.3 Data Analysis

The motivating example for our study is to estimate jointly the proportion of

newborns with low birthweight and infant mortality rate at low levels of geography such

as districts within a state. The data from the infant mortality study was conducted by

National Center for Health Statistics (NCHS) involving birth cohorts from 1995 to 1999.

The original survey was designed to obtain reliable estimates at the state level. The same

data needs to be used to produce estimates at the district level. Other than the intercept

term, the covariates considered were (i) mother's schooling categorized into less than

12 years, equal to 12 years, or greater than 12 years; (ii) mother's age categorized into

less than 20 years, between 20 and 34 years, and greater than 34 years; (iii) the smoking

habit of the mother categorized into mother does not smoke and the average number of









cigarettes smoked during pregnancy is at least 1. It turned out that there was missingness

in the response to both (i) and (iii). If one leaves out completely all the entries where such

missingness occurs, then one is left with only a few observations where one encounters

both low birthweight and infant mortality, and indeed the total number of such entries will

be disproportionately small in comparison with the remaining outcomes. Also, it is very

clear that both these covariates have direct impact on both low birthweight and infant

mortality so that the missingness cannot be treated as "missing completely at random".

Instead, this missingness falls in the category of "missing at random" (M\ AR) (Little and

Rubin, 2002).

We use our method to estimate infant mortality rate and low birthweight rate

simultaneously over a period of 5 years for several small areas in one of the states (name

withheld for confidentiality). We constructed the small areas by cross-classifying the

different cities in the state with mother's race, namely, Non-Hispanic White, Non-Hispanic

Black, Non-Hispanic American Indian, Non-Hispanic Asian/Pacific Islander and Hispanic.

We get 48 such small domains. For each small domain, we consider the NCHS full data

as constituting our population values, and we draw a 2'-. simple random sample from

this population of individuals. Thus the population low birthweight and infant mortality

rates are known for these small domains, and will serve as the gold stl ,i, ,1. for

comparison with the different estimates. In particular, we will find the HB estimates,

and compare the same with the direct estimates, namely the sample means. In our

analysis we consider the response variables and covariates as follows: yi = infant death

(1 yes, 0 no); i1 ._ = low birthweight (1 yes, 0 no); xjij = 1 denoting the intercept;

(r ._.,Xij3) = (0, 0), (1, 0), or (0, 1) according as mother's schooling is less than 12 years,
equal to 12 years, or is greater than 12 years; (xij4, ij5) = (0, 0), (1,0) or (0,1) if mother's

age is less than 20 years, between 20 and 34 years, or is greater than 34 years; and

xij6 = 0 or 1 if mother does not smoke, or the average number of cigarettes smoked during
pregnancy is at least 1. Here y = (yiji, i ..:)T is the response vector with the covariates









Xij (Xij=i, ._., Xij3, Xij4, Xij5, Xij6)T, j ,...,i, i = 1,2,..., k = 48. Covariates xijI,

Xij4, Xij5 are completely observed, while r ._., xij3 and Xij6 have missing values. The models
considered for these three missing covariates are given as follows:

a6) exp{xij6(a61 + 62x4 + (-, .)}
1 + exp((a61 + a62Xij4 + (,

where a6 = a61, a62, a63)T,

P((r ._,Xij3) (t,0)|a2, a3)
exp(a21 + 22Xij4 + ('. + .,.)
1 + exp(a2 + 22xij4 + (' .. ,1' ...) +exp(a31 + _Xij4 + 33Xij5 + .)

and


P((T .-,Zij3) (0, t)|I 2, 3)
exp(a3i + a32Xij4 + a33Xij5 + (I
+ exp(a21 + a22Xij4 + (. (.1...) + exp(a3 + 32ij4 + 33Xij5 + )

where a = (alI, a2, a3,al4)T for I = 2,3. We do not need to model xijl, Xij4, xij5
We assume improper uniform priors for a2, a3 and a6, i.e., 7r(a) o 1, where a =
(ao, T, aToT.

To prove the propriety of the posterior under an uniform prior for a, we start with
the complete likelihood for the covariates: Let
k ni
L(a6,a2,a3 Dcomp) t ( f (-a)

exp{r ._(a21 + 22j4 +o .-. + o'- ..)} + (a1 + _.xj4 + 33Xj5 + aXj346 )}
1 + exp(o21 + a22Xij4 + .-. + (I ..) + exp(oa31 + ( _.ij4 + a33xij5 + ( ..)

where Dcomp {(r xij3,ij6, Xij4, ij5) j 1,... ni, i 1,..., k}.
We consider the following four cases.








Case 1: All (r .,,Xij3,Xij6) are observed.


Li((a6, aO2, aO3 Dobs,1) H [f(Xij6 a6)
i,j: (xij2,Xij3,xij6) are observed L
exp{r ._(a +21 22Xj4 +o (' .+ ,I .)}+ (c31 + (.x + 033ij5 + 034Xj6)}
1 + exp(ao21 + a22Xij4 + 0- + ( .I .) + exp(o31 + (' _Xj4 + a33j5 + (I .)

where Dobs,1 ((r .-., xij3, j6, xj4,x j) are observed 1 < j < ni, 1 < i < k}.
Case 2: Xij6 observed but (r .-, Xij3) are missing. Let Dobs,2 ((iXj6, j4, ij5)
(xij6,Xij4,Xij5) are observed but (r -..,Xij3) are missing 1 < j < n, 1 _< i < k}.
Let

L2 (C6, a2, C3 Dobs,2) f(xij6a6)x [
i,j: Xij6 is observed \ Xij2,mi,ij3,mis
exp{r .(a21 + '22Xij4 + -+ 1,' -..)}+r (031 _+ 32 j4 + 33Xj5 + 0 *)}
1 +exp(a221 + 22ij4 +o .-+ i ...) +exp(31 + 32Xij4 + 33xij5+ t .) /

H 1/f(xij6 6).
i,j: Xij6 is observed
Case 3: xij6 is missing but (r ., ij3) are observed. In this case, we sum over all possible
values of Xij6. Let Dobs,3 ((ir .., Xij3, X j,) (rij5 ..:,Xij3, Xij4, ij) are observed but Xij6
is missing 1 < j < ni, 1 < i < k}. Similarly to Case 2, we have

L3(af6, af2, a3 Dobs,3) I H 1 f(Xij6 a6)
i,j: (xij2,xij3) are observed but Xij6 missing ij6,is L
exp{r ._(a21 + a22Xi ++ oI .- + 0, ..)} + r. (-31 + a32Xi + a33X + a34Xi)}
1 + exp(o21 + a22Xij4 + ( --+ ( .,.) + exp(o31 + 32Xij4 + 33Xij5 + (I' .)

Case 4: (r .., xij3, xij6) all are missing. In this case, the observed likelihood after
summing over missing values is equal to 1.








Let Dobs = (Dobs,1, Dobs,2, Dobs,3) and let mis denote the collection of all missing values.
Combining Cases 1 3 leads to

L(a6, 2, a3|Dobs) = L(a6, a2, 3 IDcomp)

-Li(06, a2, 3 |Dobs,1)L2(a6, a2, 3|IDobs,2)L3(06, 62, O Dobs,3).

To start with, let uT (xlij, 4ij, 5ij, X6ij) U1,j = (U OT 4)T and uj =

(0X4, ui ). Also let 7 (as, a~)T such that 7, a2r,and, 74+r a= 3r, r 1. 4.
Let us consider partition of the index set {(i, j) : (r .., Xij3) is observed} into three
parts: Ji {(i,j) : x2ij ,x3ij 0}, J2 {(i,j) : x2j 0,3ij = 1} and

J3 {(,) 2ij = O,3ij = 0}. Then the likelihood L(o2, o3,o 6|Dob0) can be written as



L(a2 a3,a6 Dobs) H f (xj6 a6)
i,j: Xij6 is observed

f (7 ., | .)exp (u 7)
1 + exp (U T7) + exp (UT 7)
(i,j)EJ j xij6,mis f J
fv- .exp (u 7) __
t ( | 1 + exp (U T7) + exp (UT 7)
(i,j) J2 Xij6,mis 2 I

X H fexp( 7) + exp (UT ()
(i,j)EJ3 ij6,mis

We follow mainly the techniques of CI.! i1 and Shao (2000) and C.!, i, Ibrahim, and
Shao (2006) to prove the propriety of the posterior. First, for (i,j) c .M {(i,j) *
x6ij is observed}, let x* (lj, x4j, x5ij) and X* be the the n* x 3 design matrix with
rows xi, where n* is the cardinality of Mc. Let z* = 1 2x6ij, and X* be the matrix
with rows z x ~, (i, j) E Mc. Let us state the following conditions:


(Dl) X is of full rank;
(D2) there exists a positive vector a = (ai, a,,) E R"*, such that

X*a 0 (4-18)








For (i,j) C M23 (i,j) (2ij, x3ij) is observed, let ( job~, xs oT) where
xaj is either equal to (1 x2ij) or x2ij if (i,j) cE .M {(ij) : x6j is missing}; also

= (0 ij,obs, I where xij is either equal to (1 x3ij) or X3ij if (i,j) E A.*. Let
U** = (- (i, j) E J1 U J3, (uij j), (i,3) e J1, ( 3 ), (i,j) e J2, -u (ij)
J2 U J3). Let n** be the cardinality of MA3. Then let us state the following conditions:

(HI) U** is of full rank;
(H2) there exists a positive vector a = (a,. a2n**) C such that

**T a 0. (4-19)

Now we state the following theorem:
Theorem 4-2. Suppose conditions (Dl) and (D2), and, (HI) and (H2) hold, then we have


/3 / L(a2, a3, a6 Dobs)da2da3da6 < 00. (4-20)

For our analysis, we assume an Inverse Gamma(l,1) prior for oa2. Table 4-1
provides the sample sizes for all the 48 small domains, the population proportions of
low birthweight and infant mortality (P1,P2), the corresponding sample means (Dl, D2),
and the corresponding HB estimates (HB1, HB2).














Table 4-1. Bivariate HB estimates along with the Direct estimates and the True values


area n, P1 P2 Dl D2 HB1 HB2
1 695 0.007 0.068 0.004 0.062 0.006 0.080
2 3 0.015 0.099 0.000 0.000 0.007 0.082
3 41 0.006 0.061 0.000 0.122 0.011 0.108
4 1477 0.006 0.068 0.006 0.071 0.006 0.075
5 425 0.014 0.131 0.024 0.150 0.006 0.077
6 345 0.006 0.067 0.008 0.064 0.006 0.082
7 510 0.005 0.067 0.004 0.061 0.009 0.102
8 888 0.009 0.116 0.008 0.114 0.009 0.116
9 44 0.004 0.081 0.000 0.091 0.005 0.073
10 1718 0.004 0.063 0.002 0.058 0.007 0.083
11 379 0.006 0.068 0.005 0.082 0.005 0.075
12 126 0.013 0.130 0.03 0.135 0.007 0.082
13 5 0.005 0.068 0.000 0.200 0.009 0.091
14 15 0.003 0.077 0.000 0.067 0.006 0.076
15 13 0.003 0.070 0.000 0.000 0.007 0.083
16 975 0.005 0.072 0.007 0.060 0.010 0.107
17 349 0.011 0.135 0.017 0.149 0.007 0.083
18 4 0.005 0.112 0.000 0.500 0.007 0.080
19 58 0.003 0.074 0.000 0.069 0.005 0.078
20 288 0.005 0.082 0.003 0.090 0.007 0.082
21 27 0.002 0.078 0.000 0.111 0.007 0.079
22 237 0.004 0.067 0.000 0.076 0.006 0.076
23 987 0.005 0.068 0.004 0.078 0.006 0.079
24 6 0.019 0.052 0.000 0.000 0.005 0.078
25 50 0.008 0.084 0.000 0.040 0.008 0.084
26 365 0.004 0.059 0.003 0.088 0.004 0.072
27 453 0.008 0.064 0.011 0.077 0.005 0.073
28 78 0.019 0.120 0.012 0.128 0.005 0.078
29 4 0.016 0.093 0.000 0.000 0.007 0.081
30 15 0.004 0.076 0.000 0.133 0.005 0.072
31 19 0.007 0.064 0.000 0.158 0.007 0.086
32 446 0.008 0.071 0.002 0.074 0.005 0.073
33 76 0.015 0.133 0.000 0.145 0.005 0.069
34 8 0.008 0.085 0.000 0.125 0.007 0.086
35 22 0.006 0.056 0.045 0.136 0.009 0.093
36 290 0.014 0.133 0.014 0.131 0.005 0.073
37 3 0.009 0.074 0.000 0.000 0.008 0.086
38 18 0.006 0.061 0.000 0.111 0.006 0.082
39 933 0.005 0.066 0.004 0.062 0.007 0.084
40 207 0.012 0.119 0.019 0.126 0.006 0.082
41 4 0.000 0.084 0.000 0.000 0.008 0.085
42 21 0.008 0.084 0.000 0.048 0.007 0.089
43 1472 0.007 0.070 0.007 0.071 0.008 0.089
44 27 0.004 0.073 0.000 0.037 0.005 0.076
45 275 0.006 0.067 0.003 0.069 0.005 0.072
46 1131 0.006 0.065 0.008 0.060 0.008 0.085
47 191 0.014 0.125 0.016 0.126 0.008 0.087
48 303 0.006 0.060 0.003 0.066 0.008 0.085









Based on the figures in Table 4-1, we calculate the following summary measures

to examine the performance of the estimators. Let ti be the generic symbol for a true

proportion in the ith small area and ei be the corresponding generic symbol for the

estimate. Then we define the Average Relative Deviation (ARD) as


ARD = m- lei t1i/t. (4-21)
i= 1

The Average Relative Mean Squared Error (ARMSE) is given by


ARMSE = -1 (e_ ti /t2 (4-22)
i= 1

The Average Deviation (AD) is denoted by


AD m-1 |e t1, (4-23)
i= 1

and, the Average MSE is given by


AMSE = m (Ci ti)2. (4-24)
i= 1

Table 4-2 summarizes the performance of the HB estimates, and also of the direct

estimates. It follows from the summary table Table 4-2 that the HB estimates outperform

the direct estimates according to all the four criteria. For the HB estimates, ARD,

ARMSE, AD and AMSE improvement over the direct estimates are 39.3', 74 ,

37.9 2',. and 71.,! respectively, for estimating infant mortality rate. For low birthweight

proportions, corresponding improvements are 46.59'. 85.0''. 43.05'. and i.'. respectively.

Table 4-2. Measures of precision

D1 D2 HB1 HB2
ARD 0.7973 0.464 0.4832 0.2478
ARMSE 1.6525 0.6244 0.4151 0.0931
AD 0.0058 0.0367 0.0036 0.0209
AMSE 0.00007 0.005 0.00002 0.0007









Table 4-3 provides the posterior standard deviations of the HB estimates denoted by

SE1 and SE2 respectively, and the posterior correlation denoted by CORR. In addition, we

provide 95'. highest posterior density (HPD) intervals associated with the HB estimates.

These HPD intervals are denoted by II and 12.

We used the posterior predictive p-values of Gelman et al. (2003, Ch. 6) for assessing

the model fit. The posterior predictive p-value for the ith area is defined as


ppi = Pr(T(yreP, Oi) > T(y, Oi) Dobs),


where yrep denotes the replicated data that could have been observed, and


T(y, 0) (Oi EyOi)T Y1(i EyOi),


Oi = (Oil, Oi2)T, and Ey[Oi] and Zy are the posterior mean and variance-covariance

matrix of Oi based on the observed data y for i = 1, 2,..., k = 48. Here the p-value

is estimated by calculating the proportion of cases in which the simulated discrepancy

variable T(yrep, 0) exceeds the realized value T(y, 0i):

N
P-n1- 1 = I(T(yrep, 0i)> T(y,0i)), i- 1, ,48,
i 1

N being the number of replications of the data. The posterior predictive p-values are also

shown in Table 4-3.

If the model is reasonably accurate, the replications should look similar to the

observed data y. In the case where the data y in in conflict with the posited model,

T(y, 0i))'s are expected to have extreme values, which in turn should give p-values close to

0 or 1. Hence, an ideal fit requires the posterior predictive p-values to be in the ballpark of

0.5. The figures provided in Table 3 show that this is indeed the case for our fitted model.













Table 4-3. Posterior variance, correlation and 95' HPDs along with predictive p-values


II 12
area SE1 SE2 CORR p-value
1 0.001 0.002 0.115 0.005 0.007 0.076 0.085 0.536
2 0.001 0.002 0.151 0.005 0.008 0.077 0.087 0.544
3 0.002 0.006 0.181 0.007 0.015 0.097 0.119 0.556
4 0.001 0.002 0.124 0.004 0.007 0.071 0.080 0.520
5 0.001 0.002 0.121 0.005 0.007 0.072 0.081 0.572
6 0.001 0.002 0.099 0.005 0.007 0.077 0.086 0.552
7 0.002 0.005 0.153 0.006 0.013 0.092 0.112 0.496
8 0.009 0.130 0.149 0.008 0.102 0.005 0.073 0.548
9 0.001 0.002 0.168 0.004 0.007 0.069 0.078 0.536
10 0.001 0.003 0.106 0.006 0.009 0.078 0.088 0.588
11 0.001 0.002 0.170 0.004 0.006 0.071 0.08 0.544
12 0.001 0.003 0.170 0.006 0.009 0.077 0.087 0.524
13 0.001 0.004 0.152 0.006 0.011 0.084 0.099 0.552
14 0.001 0.002 0.151 0.005 0.007 0.072 0.081 0.536
15 0.001 0.003 0.144 0.005 0.009 0.078 0.089 0.540
16 0.002 0.005 0.178 0.007 0.013 0.098 0.117 0.496
17 0.001 0.002 0.162 0.006 0.009 0.078 0.088 0.472
18 0.001 0.003 0.105 0.005 0.008 0.075 0.085 0.596
19 0.001 0.002 0.154 0.004 0.007 0.074 0.083 0.508
20 0.001 0.002 0.174 0.006 0.009 0.078 0.087 0.476
21 0.001 0.004 0.126 0.004 0.009 0.071 0.087 0.556
22 0.001 0.002 0.142 0.004 0.007 0.072 0.081 0.524
23 0.001 0.002 0.097 0.005 0.008 0.075 0.084 0.612
24 0.001 0.002 0.181 0.004 0.007 0.073 0.083 0.540
25 0.001 0.003 0.072 0.006 0.009 0.079 0.089 0.576
26 0.001 0.004 0.104 0.002 0.005 0.064 0.079 0.596
27 0.001 0.002 0.102 0.004 0.006 0.069 0.078 0.604
28 0.001 0.002 0.157 0.004 0.007 0.073 0.083 0.560
29 0.001 0.002 0.132 0.005 0.008 0.076 0.085 0.604
30 0.001 0.003 0.106 0.004 0.006 0.067 0.077 0.492
31 0.001 0.003 0.201 0.005 0.008 0.081 0.091 0.604
32 0.001 0.003 0.172 0.004 0.007 0.067 0.078 0.540
33 0.001 0.002 0.148 0.004 0.007 0.064 0.074 0.496
34 0.001 0.003 0.106 0.005 0.008 0.081 0.091 0.492
35 0.002 0.004 0.163 0.007 0.012 0.085 0.101 0.588
36 0.001 0.003 0.08 0.004 0.006 0.068 0.079 0.544
37 0.001 0.003 0.191 0.006 0.011 0.079 0.092 0.544
38 0.001 0.002 0.088 0.005 0.007 0.077 0.086 0.532
39 0.001 0.003 0.146 0.006 0.009 0.079 0.089 0.556
40 0.001 0.004 0.14 0.004 0.008 0.075 0.089 0.536
41 0.001 0.004 0.152 0.005 0.010 0.078 0.093 0.536
42 0.001 0.003 0.102 0.006 0.009 0.084 0.094 0.488
43 0.001 0.003 0.156 0.006 0.010 0.083 0.094 0.536
44 0.001 0.004 0.052 0.003 0.007 0.069 0.084 0.544
45 0.001 0.002 0.105 0.004 0.006 0.067 0.076 0.560
46 0.001 0.003 0.179 0.006 0.010 0.079 0.091 0.552
47 0.001 0.003 0.164 0.006 0.009 0.081 0.092 0.508
48 0.001 0.003 0.241 0.006 0.010 0.079 0.091 0.564









CHAPTER 5
SUMMARY AND FUTURE RESEARCH

In this dissertation, we have developed some new robust B,--i and empirical BE,--

estimation procedures in the context of small area estimation under a normal model. It

turns out that the proposed EB estimators perform quite well in comparison with the

regular EB estimators under a contaminated normal model, both in terms of the bias

and the mean squared error, especially for the extreme small area parameters. A closed

form expression of the MSE can not be obtained under this complex situation. We have

developed second-order correct MSE estimators for the robust estimators.

The other part of this dissertation is small area estimation when the response is

bivariate binary in the presence of covariates. We have first developed an EB estimation

procedure followed by a hierarchical B-,-i procedure. In the latter case, we have

covariates which are missing at random. We have shown that both the methods perform

much better than the direct estimators through a simulation study in the former case and

real data analysis in the latter situation.

In this dissertation, we have considered only area level small area estimation models

for the case of robust estimation of parameters. It is worthwhile to extend the proposed

approach to unit level small area estimation models as considered for example in Battese

et al. (1988) A more challenging task is to extend these ideas to non-normal models,

especially for some of the discrete distributions involving binary, count or categorical data.

The posterior of the regression coefficient in such instances does not come out in a closed

form. A more managable alternative seems to involve a prediction problem where one

decides to look at the predictive distribution of a future observation, and use the same to

derive the influence of a given observation.

A natural extension of the first part of this dissertation would be to develop

similar robust procedures in the case of general linear mixed models. As noted in

the introduction, the normal model considered in the dissertation and also the other









popular small area models are special cases of the general linear mixed models. It will be

worthwhile to develop robust methods in this more general case.

Another future undertaking will be extending the results for the bivariate binary

case in the multivariate context. Also we have developed EB techniques when there are

no missing covariates. It will be interesting to extend these results when some of the

covariates are missing.









APPENDIX A
PROOF OF THEOREM 2-3


We start with the identity


E (REB


E(of- _0)2 + E(o REB

Vi(1 B) + E(ofEB
V~l-B +Ete -


Next we write


E (REB 0B)2


(0EB 6B + 0REB


E (EB 6B B+ E (REB


Now observe that

i REB REB


B -D ^


BD[Yi XT + (K -
Di

-BeD^ -X6 + (K
Dh


Yi xfb
D

_i XT6)I
D


XT b
xTb
>K]

_b>K]
Di
>1]>


(K + )I-S)
Di b

- (K + )I
yD x


Bj(y xf 6) (y fb) B- {(y xb KD)I -X>KDi]
[v- ib K 4


+(y -fb + KD)I Ixf b< KD]+Bi(y

+(y -fi + KD)I fb<-KD]}.


We first show that xf(b


xf b- KD)I[,-Xb>Kli]


(A-3)


b) = Op(m- 1). To this end, we begin with the one-step Taylor


expansion,


b + (XTE-1X)- XTE- (y


Ob
Xb) + (A A) A


ab
A)b
OA


6iB)2
0B)2.


(A-1)


REB) 2


+2E(REB


eREB) 2


B e REB
0, )(O,


0REB).


(A-2)


Di


xTb
LI,


TTb
Di


b b + (A


(A-4)


Di








But


S- -[(XTE-IX)- I(X -2X)(XT X1)- xTE-I (
OA
+ (XT -IX)-lXTE-2(y- Xb)]

so that E[ ] = 0, and after some simplifications,


ab
V (- )
OA


Xb)


(XTE-X)-I [3(XTE-2X)(XT-I1X)- (XTE-2X)


(A-5)


(A-6)


+ XTE-3X](XTE-1X)- .

By assumption (i), (XT -1X)-1 > (V* + A)-1(XTX)-1. Also, XTX-2X < A-2XTX
and XTE-3X < A-3XTX. Hence from Eq. A-6,


Ob
V(OA)


< (V* + A)-2{A-3 + 3A-4(V* + A)-'}(XTX)-1


Now, noting that E(A A)2 = O(m-1),


xTE[(6 6)(b6- 6)T]x


XiTE[(A- A)2(b )(bA) ] X
OAaOA


< C{E[(A


A)4]1/2 E[( T(b)(A bA) T )4 }1/2
A~~q}O OA( b- (-7


C{E[(A A)4]}1/2{[(iTV(b)x)2]} 1/2


A)4]} 1/2(iT(XT X)- i)


(A-7)


where C(> 0) is a constant. This implies that xf(b


b) = Op(m-1). Accordingly, since


Syi-Xfb1>KDi [|yi-X fb>KD] [Xf (b-b)|>K|Di-DiD]'


where IDi Di = Op(m- 2) and xT(b


b) -Op((m-1), it follows from Eq. A-8 that


P( [yi-xTb>KbD [yi-xTb>KDj / 0) 0


< C{E[(A


O(m-2),


(A-8)


(A-9)









for arbitrarily large r(> 0) as m 0 oc. Moreover,


Bi(y, x b) Bi(yi x b)


Bix (b b) (Bi B)(y,


(B B)(y,


x b) + Op(m-1).


O REB REB


-(B B)-(y- xfT) + B[{(y,

+{(y- Tb + KD)- (y,- -


- xfb KD ) (y- fb KDj)}I >KD

b+ KD)}I [-xTb<-KD ]


+(B B -)[(y x b KD)I [y-X b>KDi] + (

+Op(m-n1)

-( B)(y- x b)I Tb<
[ Iv-m~|K 4


K{Bj(D D ) + Dj(B,


B)} ( [y-xTb>KD ]


x b + KD)I[y,-XTb<-KDi]
[y fb D


[y,-xTb<-KD]) + Op(mT ).
(A- 11)


E[(B B )2x(y b)_ X
r [ yX b|

+K2E[{B{(D! D) + Di(B,


B)}2 (ly xTbl>KDi]1 (A 12)


-B, V, (V + A) -(V, + A)- (A


V(V + A)-2(A

V(V + A)-2(A


A-A)
A)(1 )-

A)(t + op,(m-)).


E[(B, B)2(,- x_ b)I _xb|

A)2(, X 2[lyi-X b

xTb)


(A-10)


Hence,


(6 REB
E(O,


REB) 2


Now,


Hence,


2,( + A)-E[(A


(A-13)









We use the approximation


m m
A- A [(V, + A)-2 -1 (Vj + A)-2{(y b)2 (V + A)},
j-1 j-1


and write


(y x1b)2


Also, I[v_-XTb|
(A A)2 = O((m-1), we now get

E[(A- A)2(y- xb)2 62 b|<[KD


[(yi- xb) x[(b- b)]2

(yi- xfb)2 + O,(m-).


(A-15)


+ op(m-r) for arbitrarily large r > 0. Since


E[(A A)2(yi xb)2 KD] + o(m -1).
(A-16)


Next by the mutual independence of the y, xiTb, (i = 1,... m), and assumption (i) of

Theorem 2-3,


Cov[(A A)2, (y-

[Z(V + A)-
j 1


x Tb)2 [i-xTfbI

xCov[{ (Vj + A)-2{(yj xb)2 (Vj + A)}}2, (yi xTb)2i[ixb| j=1

[(V + A)-2]-2
j 1
xCov[(V, + A)-4(y xTb)2 (Vi + A)}2, (y T b)2I xb|KDJ]

0( -2).


(A-14)









Hence from Eq. A-16, and the fact that D Vi + A + O(m-1),


E[(A -A)2 T [)21]

E(A A)2E[(y x b)2 I[IX{Tb
2[-(Vj + A)-2]-1(/ + A)E[Z2I[IZ| j=1

2[-(V, + A)-2 ]-1( + A)[24(K) 1 2K(K)] + o(m-1). (A-17)
j-1

From Eq. A-13 and Eq. A-17,


E[(B1 Bi)2(y X 6i )21]
S[ly--X 1
= 2BD2(1 B)2[( 2] -[2(K) 1 2K (K)] + o( m-1). (A-18)
J-1









Next we calculate


I m
(V+-{A)(B +- B )[V+A + A)-
j 1

j-1


1A-A
Bi)[+ +
2V+A


1(A- A)2
8 (V + A)2


1 Tm
- XT ( (V +A) -lXjX)-xi 1
j=1
1 3
+ A (Z(V + A) -xxT)-_Xi + o(m-1)]
2 Vi +T A'1


1(A-A)2 -)]
8 (V + A)2 ~


(K + A) (Bi + is -

1A-A
2(+A


1 B(A- A)2
(V( + A)2B8-


2 V(A A) A- A 1)
+ (0 + A)' t)7- + op (m-
(V+A)(V+A)2V+A


Bi A-A
2 ( + A)2
Bi A(-A
2 (V + A)2'


and,


Di(B, Bi)


Bi (A A)2
8 (V+A)3


B(A A)2
-- + Op(m
2(V + A)


5B, (A -A)2 1
3 + op 0n 1),
8 (V+A)2



[V + A + O(n-1)] ( (A- A)
(V+A)(I+A)
V(A- A) A- A1
(V +A)2 +A)-
(.+.. r V+A


VI(A- A)
( + A) 2


- A-
+ Op (m 1)]
gi+A


A- A (A A)2
(V + A) (V + A)2


A-A
Bi + A
(V+A)


(A A)2
+ Bi + A 0P(-1)
(H+A)a


1A-A
2V-+A


(V + A) (Bi + bi









Thus,


Bj(D, D) + Dj(B, Bj)
Bi A- A 3B (A A)2
2 + 3 +Op, m-1)
2 (Vi+A)2 8 (Vi+A)2


so that


E[{Bi(Di Dj) + Di(Bi B d) }2(I [jX-b|>KD]1
Ll z\ z z\ z z!S \ [\ i- b\K i


-B( + A)- E{(A
4


m
B( + A)-[ (Vj + A)-2]-1(
j- 1


B( + A)(1 Bi)2[-
j=

BD (1- B )2[(t-


A)2 }E{I }xTb D + o(T -1)


-K) + o(m-1)


_ )2]-1 (-K) + o(m-1)


)2] -1(-K) + o(T -).


Combining Eq. A-18 and Eq. A-20, it follows that


E (iREB -_ REB)2

BD(1 B )2[(1
j 1

BDf(1- Bi)2[Z(1
j 1


- 2


Bj)2-1[4(K)


j)2]-1[(K2 _


4Kb(K) + K2(1


1(K))] + o(m-1)


2) 4K (K) (K2 4)4(K)(] + o(m-1).

(A-21)


Finally, by Eq. 2-13 and Eq. A-19 we get


(REB 6iB iREB


eREB)


-B{x (6- b) + (y xfb KD _)I xT
[yi- bi >KDi]
+(y x6b + KDj)I[y,-xTb<-KDi]


x [(Bi Bj)(y- x b)I Xb +[yI- { [ Kfb| Di]

+o,(- )) [yi-xTb>KDi] [yJ-xTb


Ki (A
2


<-KDi]}] + O,( n-1).


3(A A)
4(V + A)
(A-22)


(A-19)


(A-20)


A)(V + A)- (1








Now by the independence of b with (A, yi- xfb), we get

(REB _B REB ~REB)
KB12 3(A- A)2
[{(A A)( + A)-- 3-- }
2 4(V +A)3
x {( KD)I -[y- Xb>KDi] Ib + KDi)I[yT <-KD])} + o(-1)
(A-23)

With the approximation given in Eq. A-14, and the independence of b with y xb, we
get

E[(A A)(V + A)- {(y,- xb KD _)I
[yi-Xib>KDi]
-(y- xrb + KDe)I)[ -Tb<-KD}]

= E[(A- A)(V + A)- {((y xb KDj) x(b b))I} b_
[yi-Xib>KDi]
-((yi xfb + KDj) x(b b))I[__x-b<-KD)]}

= E[(A A)(V + A)- {(y xfb KD)(Iy _xb>KD] + op(m-))

-(i- xb +KD )(I_ _xfb<-KD] + op(m ))}]

E[{Z(Vj + A)-2}-1{- (Vy + A)-2((yj xTb)2 (Vj + A))}
j 1 j 1
x(V, + A)- {(y- rb KD)I~ afTb>KD]

-(y- -xb + KD)I_ xf.Tb<-KD}] + oP(m-1)

= { (V + A)-2}- + A)-E[(V + A)- ((y- ixb)2- (Vi + A))}
j-1
x{(y- b- K( +A)2)I _"
[yi-Xib>K(Vi+A) 2]
-(y, b + K(V, + A)i_)I_[xr-b<-K(V1+A)1]}] + o(m-1)

S{ (Vj + A)-2}-1(i + A)- E[(Z2- 1)(Z K)I[>K]- (Z2- 1)(Z + K)I[z<-K]]
j-1
+o(m-1),


(A-24)









where Z ~ N(O, 1). Since Z -Z,

E[(Z2 1)(Z K)I[z>] (Z2 1)(Z + K)I[z<-K]

=2E[(Z2- 1)(Z K)I[>K]


2 (Z2- 1)(Z


- K) (Z)dZ


2[ Z2(- (Z))dZ K Z(-O'(Z))dZ


2[(K2 + 2)0(K)


(K2 O(K) + K{(-K))


O(K) + K@(-K)]


20(K).


Similarly,


xf b KD)I x D
[ve- i b>KDi]


(yi xfb + KD)IMf KD
i b + K [yi-xTb<-KDi]

=2{(Vj + A)-2}-1( + A)- E[(j
j-1

-4{(Vj + A)-2}-1( + A)- E[(
J1

-4{(Vj + A)-2}-1( + A)- [ (
j 1
Hence,

E (eREB B)(REB REB)
KB2 m
[(Vj + A)-2]-1( + A)
j-1

[(Vj + A)-2]- ( + A)
j=1


KB2D2


B)2[(1
j=1


K)I[Z>K] (Z + K)I[z<-K]] + o(m 1)


K)I[z>K] +o(m-1)


- K!)(-K)] + o(m-1). (A-26)






b(K) 3{0(K) K)(-K)}] + o(m-1)


1[- (K) + 3K4)(-K)}] + o(m- 1)


Combining Eq. A-27 and Eq. A-21 the theorem follows.


K (-'(Z))dZ + K K O(Z)dZ]
JK JK


E[(A- A)2(+A) {(y


(A-25)


Bj)2] -1[--(K) + 3K4)(-K)}] + o(m-1).


(A-27)









APPENDIX B
PROOFS OF THEOREMS 4-1 AND 4-2
Proof of Theorem 4-1: We start with the joint posterior given by:

7r(b, -2, a, v, Xmis Dobs)

Ik ni 4 Texpz7(x b + )} 1 Vi2
{i h exp--ziJl(xi exp -exp(x bi +,)} exp 22)
i 1 j 1 1=-1
k ni
x H11 f(xij,obs, Xij,misa) exp(--C)(2)-d/2-17(a)w(b) (B-l)
i=1 j 12

Here b = (bi, b2, b3, b4), 7r(b) x 1 and 7r(a) x 1. Now, let j =- x bi + I
1, 2, 3, 4, j 1, i = 1,... k. Now let us first integrate the posterior w.r.t -2:

7r(b, a,v, Xmis Dobs)

x IHIHexp{zijl } exp exp(9}]) (c + v1v)(d+k)/2
i 1 ( 11
k ni
x l f Jf(Xjobs, x .... a) (B-2)
i=l j 1
Then from Eq. B-2 and Eq. 4-10 we get:

7r(b, a, v, Xmis Dobs)
4
< n[ n f(U-) n Cij3f2 l-3 ]
l1 (ij)1I (i,j)EI21
k ni
1 A ^
x (c + V )d+k)/2 11 iobs, Xij,mis a), (B-3)
i=-1 j=1

where fl, f2 and f3 are defined in Section 4.2. For 1, ... ,4, 1 < r < q, let x$r a1J
for (i, j) c A4M and ir a- i. for (i,j) EM I I21 if br > 0; x b for (i, j) E AM
and V, .for (i, j) E A4 nI21 if bl, < 0. Write x mis {x, r C A4v} for

(i,j^) 4 and miss r A } for (i,) e 2n 2 (ij,obs, ,'miss
and J (xbs, ,miss). Also, let i ( e) andu e()T. Then









from Eq. B-3,


7r(b, a, v, Xmis Dobs)
4
< cH [ f(i() Tr) f2it(I)Tr, )f3 (t)T)]
1l 1 (i,j) Iill (i,j) EI2
1 k ni
(c + VTv)(d+k)/2 111 f(xi,obs, Xij,mis la)
ilj 1


Then,


7r (b, a, v Dobs)
4
< C [ lfi (t( r) f2 t(I)Tr f3 (ITrl)]
l1- (i,j)eIil (i,j)EI21


1 k ni
x (c + vTv)(d+k)/2 1
i=1 j=1 aij

f (xij,obs, X) 1 )


CHi [ A( fi()Tr ) n f2(t( 1rl)T f (()Trl)]
1-1 (i,j)Ell (i,j) E121
k ni
(c + Tv) (d+k) /2 11 f (xi0,b a)
il j 1


Let,


R (i,j) El
Using the fact that


fi(I) 1 7


f u())i ( QTrr) f2( t r)3 (l )Tr }'dbi.
S(ij) 2dbl.
(i,j) E121


d(-fh(u))


-oo
-00


for i = 1, 2, and


u}d(f3 (u)),


(B-4)


(B-5)


f (T) d(f (u)) I {-/ <
--OO ---0








we get,


f= (I n ({u1) Tr < %i -Ad(-fi('^j))

Il u i < Tmij}d(-f2(ljl))
(i,j) E 21

(i,j)EI2 o

R f 1{ T ()- < 7}db} dFi (B-6)

where dF, = d ((i,j)El, ( -fi(riji)) H[(i,j)e21(-f2(T2jl))x

)(i,J)e121(f3(Tliyj))),for1 1,...,4.
Now from Lemma (4.1) of C'!. i, and Shao (2000), we know that under conditions
(Cl) and (C2),
< ~I~1 < K II|

where K is a constant. But

Ir1l < KI17\\ => ||bI\\ < K|I71 & Ivl| < K|I|71.

Then,

1 < {|l{|v < K||7||} } l{||bl| < K||rh||}dbi}dF,

< K I, lq|1{llvll < K\II }ll}dF,. (B-7)

Then from Eq. B-5 we get,

S7r(b, a,v Dobs)db
S { (+4 q llvll q< llll}}F,} [k[ ) (B-8)
< K H N (c )(d+k)21{v < K }dF l f (Xij,obsa) (B 8)
2= i=1 j=1









Integrating w.r.t. v:


7r(a|Dobs)
4 4
<1K ||1I 1 21{|v| < K||711||} dF
k ni
x ffJ f(xij,obsa) (B-9)
i=l j 1
Then
4 4
fJ j ^ j Lj j { 1 1 (1f (c t)/21|v|| < KV|n<]} } dF1


J 2n N Rn N Ru+NJ J-n N 7=1 2= 1
4
<-, / E i^,i + E Ilq+ E iT.+i,,,q} ,
l 1 2 (ij)EIii (i,j)EI21 (i,j)EI21

< oc (from condition (C3))

Hence,
k ni
] 7o(a| Dbs)da < K f (xij,obs a)da < (B10)
ie 1 lj=l
Hence the theorem follows.








Proof of Theorem 4-2: We have


L(a ,a3, a6Dob s)


Sf(xij6 a6)
(

Tf( \exp (u,7)T _
S( 1 + exp ( -y) + exp ( T )
(i,j)EJl xij6,ria 2i
exp (uT7) __
(i,j)EJ2 6, (ij6 6 1 + exp (u, y) + exp (ui7j

x f f(Xij6 I 6) (T (UT (B-11)
ij1 X i t1 + exp (u7) + exp u .7)
(ij)6J3 iE mis6,
For 1 < r < 8, let uljr = 1 if 7, > 0 and uijr = 0 if 7, < 0 for (i,j) e Ji Ag;
S. 0 if > 0 and 5, 1 tif % < 0 for (i,j) e J 1U ;- ifr > 0
and 0 if, < 0 for (i,j) e J2 ; 0 if > 0 and5 1 if 7, < 0 for

(ij) j J2 U J3nm.
Then



L(a2,a3,a6Dobs) < j f(xj6 a6)

exp (lj7) exp (ij 7)
(, 1 + exp ( -7) + exp (i.7 Y) H 1 + exp ( u2-) + exp (u. 7)


x f .
(ij)EJ3 1 +exp (jy) + exp (-i-,)


(B-12)


where the last step follows since Xij6, s fi(xj6 a6) = 1. Now, let


L(a2,a3 Dobs)


exp (u7i)) exp (u,7)ij
(iJ)7, 1 + exp (i -y7) + exp (t, 7Y) ( 1 + exp (uT7-y) + exp (u, Y7)

x (1 +exp (u- --- (B-13)
(j)EJ3 1 + exp (1-7) + exp (uY 7)










exp (uli7)
1 exp (ij7-y) + exp (ij- 7)
j exp tij (1 + exp ( u-y) + exp (-uj u2ij )) dtij

= exp(-t) exp (- y(exp (-1 7)) exp iy(exp (-(i1j 2ij ) dtij

Now, let F(v) = exp(- exp(-v)) and noting that F(v) f= l(s < v)dF(s) =
f 1(s > -v)(-dF(-s)), we get



Ji= exp(-ty) exp (- t-(exp (- uI))) exp (- ti(exp (-(ul -2ij ) )dtij

S exp(-ij) 1(-i < ,)(-dFi(- ,))

x (-(tlij t2ij T )(-dFtij-, ,)) dtij. (B-14)
anod
Similarly,


exp (2i7)j
1 + exp ( -7) + exp (i Y)
S exp (- ) 1(-_ 7 ) < .)(-dF~(- ))
0 --o
X 1(-(i2ij- lij)T < /"' ,)(-dFtij(-"' ,)) dtij (B-15)
--00

and,

1
3 + exp (i 7) + exp ( Y7)
S exp(-t) (--1i7) < .,)(-dFtij(- ,))

x 1(-E, 7 < ,, .)(-_dF(_ ,.)) dt. (B-16)
--oo








Putting Eq. B-14-Eq. B-16 in Eq. B-13 and integrating w.r.t. to 7 (a a )T:

S(< u 1(-)7)< < .)(-dF'(-, ))
(i,j) E J O
x l(- (i ij t2ij) ,)(-dF 'i(-,,' ,)) dtij

/-exp(-t) 1(-f7) < ,)(-dF(- ,))
(i,j) E J2 3






-. .n O .
x (-(2j ij)T < I ,)(-dF'ij(-,' ,*))dtid

n / exp(-tj)[ 1(- 7)-< ,)(-dF '(- ,))
(i,j)e J3 0- -0
X 1(-_ l y <_ (.) ,(--dF '( ,)) d d7


JR8 JR+n i R2n
(B-17)

where dFt(v*) )) (-dFi )) i d(-F(-', ,)). Using C'i.. i, and Shao
(2000),under conditions (HI) and (H2), we get
f k ni
SL(ID)d JR8 +n i1j 1 J 2n

Now,
(-dFt7i(-, .))(-dF'(-,,' )) ti~exp(, .) exp(-tijexp(, .))exp(,' ) exp(- tiexp(' .)).







Then


8 L(7|D)dy
iR2n .
Kf exp(, + ..)
K1 2 (1 +exp(, .) + exp(,,.))3
I 2n Y ''' (1 + exp(i .)+ exp(. ,))3
<00

Then

J3 J8 L(a2, a3,a6 Dobb,)da2da3da6
But fp3 H[(~j f(xij61a6)dao6 < oO under conditions (Dl) and (D2) by (Theorem 2.1) of
C'!., ii and Shao (2000).
Hence the theorem follows.









REFERENCES


Battese, G. E., Harter, R. M., and Fuller, W. A. (1988), "An error-components model for
prediction of county crop areas using survey and satellite data," Journal of American
Statistical Association, 83, 28-36.

Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975), Discrete Multivariate
Ai1i.:.-. Theory and Practice, The MIT Press, Cambridge, Mass.-London.

Brackstone, G. J. (1987), "Small area data: policy issues and technical challenges," in
Platek et al. (1987), pp. 3-20.

C'! iinlhers, R. L. (1986), "Outlier robust finite population estimation," Journal of Ameri-
can Statistical Association, 81, 1063-1069.

C'!i ii'. ,iii A. (1994), "Small domain statistics: A review," Statistica Neerlandica, 48,
215-236.

C'!I ii M. H., Ibrahim, J. G., and Shao, Q. M. (2004), "Propriety of the Posterior
Distribution and Existence of the MLE for Regression Models with Covariates Missing
at Random," Journal of American Statistical Association, 99, 421-438.

- (2006), "Posterior propriety and computation for the Cox Regression Model with
applications to Missing Covariates," Biometrika, 93, 791-807.

C!. i, M. H. and Shao, Q. M. (2000), "Propriety of Posterior Distribution for
Dichotomous Quantal Response Models," Proceedings of the American Mathemati-
cal S .. .. I,; 129, 293-302.

C!. i, M. H., Shao, Q. M., and Xu, D. (2002), \.. r-- y and sufficient conditions on the
properiety of posterior distributions for generalized linear mixed models," Sa,',t1. iji. 64,
57-85.

Cox, D. R. and Snell, E. J. (1968), "A general definition of residuals," Journal of the
Royal Statistical S... I/; Series B, 30, 248-275.

Cressie, N. (1990), "Small-area prediction of undercount using the general linear model,"
Statistics Canada Symposium, 93-105.

Datta, G. S. and Ghosh, M. (1991), "B -i'l prediction in linear models: applications to
small area estimation," Annals of Statistics, 19, 1748-1770.

Datta, G. S. and Lahiri, P. (1995), "Robust hierarchical B- ,-.,- estimation of small area
characteristics in the presence of covariates and outliers," Journal of Multivariate
A .I1.-.:.- 54, 310-328.

- (2000), "A unified measure of uncertainty of estimated best linear unbiased predictors
in small area estimation problems," Statistica Sinica, 10, 613-627.









Datta, G. S., Rao, J. N. K., and Smith, D. D. (2005), "On measuring the variability of
small area estimators under a basic area level model," Biometrika, 92, 183-196.

Drew, D., Singh, M. P., and Choudhry, G. H. (1982), "Evaluation of Small Area
Estimation Techniques for the Canadian Labour Force Survey," Survey M. /i.. .,/'/i 8,
17-47.

Efron, B. and Morris, C. (1971), "Limiting the risk of Bi-, and empirical B-,-.
estimators. I. The B-,- case," Journal of American Statistical Association, 66, 807-815.

- (1972), "Limiting the risk of B ,--, and empirical B ,--, estimators. II. The empirical
B-v-- case," Journal of American Statistical Association, 67, 130 139.

Erickson, E. P. and Kadane, J. B. (1987), "Sensitivity analysis of local estimates of
undercount in the 1980 U.S. Census," in Platek et al. (1987), pp. 23-45.

Farrell, P. J., McGibbon, B., and Tomberlin, T. J. (1997), "Empirical B ,--, estimators of
small area proportions in multistage designs," Statistica Sinica, 7, 1065-1083.

Fay, R. E. and Herriot, R. A. (1979), "Estimates of income for small places: an application
of James-Stein procedures to census data," Journal of American Statistical Association,
74, 269-277.

Ghosh, M. and Maiti, T. (2004), "Small-area estimation based on natural exponential
famliy quadratic variance function models and survey weights," Biometrika, 91, 95-112.

Ghosh, M., Nangia, N., and Kim, D. H. (1996), "Estimation of median income of
four-person families: a B ,. -i approach," Journal of American Statistical Associ-
ation, 91, 1423-1431.

Ghosh, M., Natare'i -in K., Stroud, T. W. F., and Carlin, B. P. (1998), "Generalized linear
models for small-area estimation," Journal of American Statistical Association, 93,
273-282.

Ghosh, M., Natare'i i, K., Waller, L. A., and Kim, D. H. (1999), "Hierarchical B-, i
GLMs for the analysis of spatial data: an application to disease n ,ppii,'- Journal of
Statistical Planning and Inference, 75, 305-318.

Ghosh, M. and Rao, J. N. K. (1994), "Small area estimation: an appraisal," Statistical
Science, 9, 55-93.

Gilks, W. R. and Wild, P. (1992), "Adaptive Rejection Sampling for Gibbs ,-:plin- "
Applied Statistics, 41, 337-348.

Godambe, V. P. and Thompson, M. E. (1989), "An extension of quasi-likelihood
estimation," Journal of Statistical P7la, ;i,,; and Inference, 22, 137-172.

Gwet, J.-P. and Rivest, L.-P. (1992), "Outlier resistant alternatives to the ratio
estimator," Journal of American Statistical Association, 87, 1174-1182.









Harter, R. M. and Fuller, W. (1987), "The Multivariate components of variance model in
small area estimation," in Platek et al. (1987), pp. 103-123.

Ibrahim, J. G., Lipsitz, S. R., and C'!., i M.-H. (1999), \i,--ig Covariates in Generalized
Linear Models When the Missing Data Mechanism Is Nonignorable," Journal of the
Royal Statistical S.. .. I,; Series B, 61, 173-190.

Jiang, J. and Lahiri, P. (2001), "Empirical Best Prediction for Small Area Inference with
Binary Data," Annals of the Institute of Statistical Mathematics, 53, 217 243.

Kackar, R. N. and Harville, D. A. (1984), "Approximations for standard errors of
estimators of fixed and random effects in mixed linear models," Journal of Ameri-
can Statistical Association, 79, 853-862.

Lahiri, P. and Maiti, T. (2002), "Empirical B-i,- estimation of relative risks in disease
mapping," Calcutta Statistical Association Bulletin, 53, 213-223.

Lahiri, P. and Rao, J. N. K. (1995), "Robust Estimation of Mean Squared Error of Small
Area Estimators," Journal of American Statistical Association, 82, 758-766.

Lipsitz, S. R. and Ibrahim, J. G. (1996), "A Conditional Model for Incomplete Covariates
in Parametric Regression Models," Biometrika, 83, 916-922.

Little, R. J. A. and Rubin, D. B. (2002), Statistical A,.1.]l-':- of Missing Data, Wiley, New
York, 2nd ed.

Maiti, T. (1998), "Hierarchical B-,-.- Estimation of Mortality Rates for Disease M ,..i.- "
Journal of Statistical Pu,',',*I and Inference, 69, 339-348.

Malec, D., Sedransk, J., Moriarity, C. L., and LeClere, F. B. (1997), "Small Area Inference
for Binary Variables in the National Health Interview Survey," Journal of American
Statistical Association, 92, 815-826.

McGibbon, B. and Tomberlin, T. J. (1989), "Small area estimates of proportions via
empirical B ,-c- techniques," Survey M. /I, ]./. ..,i/; 15, 237-252.

Nandram, B., Sedransk, J., and Pickle, L. (1999), "B ,, -i ,i: analysis of mortality rates for
U.S. health service areas," S',,./,,';.7 61, 145-165.

Pfeffermann, D. (2002), "Small area estimation-new developments and directions,"
International Statistical Review, 15, 237 252.

Pfeffermann, D. and Burck, L. (1990), "Robust small area estimation combining time
series and cross-sectional data," Survey M //I.'./. .]1i,;. 16, 217-237.

Platek, R., Rao, J., Sarndal, S., and Singh, M. (eds.) (1987), Wiley, New York.

Prasad, N. G. N. and Rao, J. N. K. (1990), "The estimation of mean squared errors of
small area estimators," Journal of American Statistical Association, 85, 163-171.









Rao, J. N. K. (1986), "Synthetic Estimators, SPREE and Model Based Predictors,"
in Proceedings of the Conference on Son, ,,1 Research Methods in Ay,:. Jll,,., U. S.
Department of Agriculture,, Washington, D. C., pp. 1-16.

- (1999), "Some recent advances in model-based small area estimation," Si', :, Method-
1 ..,,; 25, 175-186.

- (2003), Small Area Estimation, John Wiley and Sons.

Rao, J. N. K. and Yu, M. (1992), "Small area estimation by combining time series and
cross-sectional data," ASA Proceedings of the Son', .; Research Methods Section, 1-19.

Schaible, W. L. (1978), "C'!-i.... g weights for composite estimators for small area
statistics," ASA Proceedings of the Section on Survey Research Methods, 741-746.

Zaslavsky, A. M., Schenker, N., and Belin, T. R. (2001), "Downweighting Influential
Clusters in Survy ,- Application to the 1990 Post Enumeration Survey," Journal of
American Statistical Association, 96, ",- -.' .21.

Zhao, L. P. and Prentice, R. L. (1990), "Correlated binary regression using a quadratic
exponential model," Biometrika, 77, 642-648.









BIOGRAPHICAL SKETCH

Ai, iv'a Roy was born in Kolkata, India, on the 24th of December, 1978. She

graduated with a bachelor's degree from Calcutta University in 2000 with first class

honours in statistics. She then joined the Indian Statistical Institute, from where she

received her Master of Statistics degree in 2002. She moved to the University of Florida

at Gainesville in the fall of 2002 to pursue her doctoral studies in statistics. Upon

graduation, she joined the University of N. Li o-1; at Lincoln as an Assistant Professor

in the Department of Statistics.





PAGE 1

1

PAGE 2

2

PAGE 3

3

PAGE 4

Iwouldliketoextendmysincerestgratitudetomyadvisor,ProfessorMalayGhoshforhisinvaluableguidanceandconstantsupportthroughoutmygraduatestudies.Itwasaprivilegetohavehimasmyadvisor,whowasalwaysavailabletohelp.IwouldalsoliketothankProfessorsBrettPresnell,RonaldRandles,MichaelDanielsandMuraliRaoforservingonmycommittee.MyspecialthankstoProfessorsDalhoKimandMing-HuiChenforhelpingmewithcomputation.Iamindebtedtomyprofessorsfrommyundergraduatedays,whoseteachinginspiredmetopursuehigherstudiesinStatistics.IwouldespeciallyliketothankProfessorBhaskarBosefrommyhighersecondarydays.HeintroducedmetoStatisticsandhelpedmedevelopabasicunderstandingofthesubject.MysincerestthankstoProfessorIndranilMukherjeeforhisconstantencouragementduringmyundergraduatedays.Theextentofhisknowledgeandproblem-solvingcapabilitieshavealwaysinspiredme.DuringmystayattheUniversityofFlorida,IhavebeenluckytohavemarvelousteacherslikeProfessorsMalayGhosh,BrettPresnell,AndreKhuriandRonaldRandles.Theyhavetaughtmealotandwouldalwaysinspiremetobeagoodteacherandresearcher.Iwouldliketothankmyfriendsandclassmatesinthedepartment,especiallyGeorge,Mihai,forhelpingmewheneverIneeded.LifewouldhavebeenreallydierentinGainesvilleifnotforfriendslikeVivekananda,SiulidiandBharati.Iamgratefulfortheirsupport,guidanceand,well,justforbeingthere.IhavebeenfortunatetohavecometoknowandgrowclosetopeoplelikeDolakakima,BhramardiandlittleMunai.Theirgenuineaectionandwarmth,alongwiththeloveandsupportofmyfriends,and,ofcourse'AshaParivar',hasmadeGainesvilleasecondhometome.Idon'tthinkanyofthiswouldhavebeenpossibleifnotfortheencouragementandsupportofmyfamily.Myheartfeltthankstomyparents,whohavealwayssupportedmeinallthechoicesIhavemadeandencouragedmetomoveforward.IoweeverythingthatIhaveachievedtomymotherwhohasalwaysbeenmyinspiration.Last,butnotthe 4

PAGE 5

5

PAGE 6

page ACKNOWLEDGMENTS ................................. 4 LISTOFTABLES ..................................... 8 ABSTRACT ........................................ 9 CHAPTER 1REVIEWOFSMALLAREATECHNIQUES ................... 10 1.1Introduction ................................... 10 1.2OverviewofSAETechniques .......................... 11 1.3AnalysisofDiscretedatainSAE ....................... 17 1.4RobustEstimationinSAE ........................... 19 1.5LayoutofthisDissertation ........................... 21 2INFLUENCEFUNCTIONSANDROBUSTBAYESANDEMPIRI-CALBAYESSMALLAREAESTIMATION 22 2.1Introduction .................................. 22 2.2InuenceFunctionsAndRobustBayesEstimation .............. 24 2.3RobustEmpiricalBayesEstimationandMSEEvaluation ......... 31 2.4MSEEstimation ................................ 33 2.5SimulationStudy ................................ 37 3EMPIRICALBAYESESTIMATIONFORBIVARIATEBINARYDATAWITHAPPLICATIONSTOSMALLAREAESTIMATION 40 3.1Introduction ................................... 40 3.2BayesAndEmpiricalBayesEstimators ................... 40 3.3MSEApproximation .............................. 44 3.4EstimationofTheMSE ............................ 49 3.5SimulationStudy ................................ 55 4HIERARCHICALBAYESESTIMATIONFORBIVARIATEBINARYDATAWITHAPPLICATIONSTOSMALLAREAESTIMATION 58 4.1Introduction ................................... 58 4.2HierarchicalBayesMethod ........................... 58 4.3DataAnalysis .................................. 64 5SUMMARYANDFUTURERESEARCH 74 APPENDIX 6

PAGE 7

76 BPROOFSOFTHEOREMS4-1AND4-2 ...................... 85 REFERENCES ....................................... 93 BIOGRAPHICALSKETCH ................................ 97 7

PAGE 8

Table page 2-1RelativebiasesofthepointestimatesandrelativebiasofMSEestimates. ... 39 3-1RelativebiasesofthepointestimatesandrelativebiasofMSEestimatesforthesmallareamean:therstvariable. ....................... 57 3-2RelativebiasesofthepointestimatesandrelativebiasofMSEestimatesforthesmallareamean:thesecondvariable. ..................... 57 4-1BivariateHBestimatesalongwiththeDirectestimatesandtheTruevalues .. 70 4-2Measuresofprecision ................................. 71 4-3Posteriorvariance,correlationand95%HPDsalongwithpredictivep-values .. 73 8

PAGE 9

ThetopicofthisdissertationfocusesontheformulationofempiricalandhierarchicalBayesiantechniquesinthecontextofsmallareaestimation. Intherstpartofthedissertation,weconsiderrobustBayesandempiricalBayes(EB)procedureforestimationofsmallareameans.Wehaveintroducedthenotionofinuencefunctionsinthesmallareacontext,andhavedevelopedrobustBayesandEBestimatorsbasedonsuchinuencefunctions.Wehavederivedanexpressionforthepredictiveinuencefunction,andbasedonastandardizedversionofthesame,wehaveproposedsomenewsmallareaestimators.Themeansquarederrorsandestimatedmeansquarederrorsoftheseestimatorshavealsobeenfound.Thendingsarevalidatedbyasimulationstudy. Inthesecondpart,wehaveconsideredsmallareaestimationforbivariatebinarydatainthepresenceofcovariates.WehavedevelopedEBestimatorsalongwiththeirmeansquarederrors.Inthiscasethecovariateswereassumedtobecompletelyobserved.Inthepresenceofmissingcovariates,wehavedevelopedahierarchicalBayesapproachforbivariatebinaryresponses.Undertheassumptionthatthemissingmechanismismissingatrandom(MAR),wehavesuggestedmethodstoestimatethesmallareameansalongwiththeassociatedstandarderrors.Ourndingshavebeensupportedbyappropriatedataanalyses. 9

PAGE 10

Theuseofsmallareastatisticsgoesbackto11thcenturyEnglandand17thcenturyCanada( Brackstone 1987 ).However,thesestatisticswereallbasedeitheronacensusoronadministrativerecordsaimingatcompleteenumeration. TheproblemofSAEistwofold.Firstisthefundamentalissueofproducingreliableestimatesofparametersofinterest(e.g.,means,counts,quantilesetc.)forsmalldomains.Thesecondproblemistoassesstheestimationerror.Intheabsenceofsucientsamplesfromthesmallareas,theonlyoptionistoborrowstrengthfromtheneighboringareasi.e.,thosewithsimilarcharacteristics.InSection2,weoutlinetheadvancesinSAEoverthepastfewdecadesandsomepopularworkssignicanttotheenhancementofthetechniquesinthiseld. 10

PAGE 11

FayandHerriot ( 1979 )isnotable. FayandHerriot ( 1979 )suggestedtheuseoftheJames-SteinEstimatorinSAE.Theirobjectivewastoestimatethepercapitaincome(PCI)basedontheUSPopulationCensus1970forseveralsmallplaces,manyofwhichhadfewerthan500people.Earlier,intheBureauoftheCensus,thegeneralformulaforestimationofPCIinthecurrentyearwasgivenbyproductofthePCIvaluesofthepreviousyearandtheratioofadministrativeestimateofPCIfortheplacein1970tothederivedestimateforPCIin1969.Butthismethodhadmanydrawbacksresultingfromignoringthepresenceofsomeauxiliaryvariablesandalsoduetosmallcoverageforsmallplaces,therebyincreasingthestandarderrorsofthedirectestimates.TheFayandHerriotmodelisasfollows: whereVi'sandauxiliaryvariablesxi'sareallknownbutAisunknown.Inthiscase,let betheregressionestimatorofi'swhere=Diag(V1+A;;Vm+A).Then,thesuggestedestimatorfortheithsmallareawasgivenby ^i=A HereAistheestimateoftheunknownvarianceAdeterminedfromtheequationPi(YiYi)2 11

PAGE 12

1{1 )withVi=V8iandAknown,theJames-Steinestimatorisgivenby ^i=(1((mp2)V=S))Yi+((mp2)V=S)Yi;(1{4) where Weobserveherethatthe( 1{3 )isjusttheEBestimatorundermodel( 1{1 ).Followingtheargumentof EfronandMorris ( 1971 1972 ),whoremarkedthatEBestimatorssuchas( 1{4 )mayperformoverallbutpoorlyonindividualcomponentsincaseofdeparturefromtheassumedprior, FayandHerriot proposedthenalEBestimatorsas: ^0i=^iifYip PrasadandRao ( 1990 )proposedtheuseoftheunweightedleastsquaresandthemethodofmomentsapproachtoestimatetheregressioncoecientsandvarianceArespectively. DattaandLahiri ( 2000 )suggesttheuseofmaximumlikelihoodandresidualmaximumlikelihoodtoestimatethesame. TheFay-Herriotestimatorisanexampleofacompositeestimator.Acompositeestimatorisaweightedaverageofthedirectsurveyestimatorandsomeregressionestimator,inthepresenceofauxiliaryinformation.Thesearecalledmodel-basedestimators(e.g.,Fay-Herriotestimator).Otherwise,itisaweightedaverageoftheoverallmeanandthesamplemeanforthatsmallarea.Thesearecalledthedesign-basedestimators.Therehasbeenmanyproposedaproachesregardingthechoiceofoptimal 12

PAGE 13

Schaible ( 1978 )and Drew,Singh,andChoudhry ( 1982 ). Anotherkindofestimationinvogue,inSAE,issyntheticestimation. Brackstone ( 1987 )describedsyntheticestimatesasfollows:\Anunbiasedestimateisobtainedfromasamplesurveyforalargearea;whenthisestimateisusedtoderiveestimatesforsubareasundertheassumptionthatthesmallareashavethesamecharacteristicsasthelargearea,weidentifytheseestimatesassyntheticestimates". GhoshandRao ( 1994 )providedmodel-basedjusticationsforsomeofthesyntheticandcompositeestimatorsincommonuse. Oneofthesimplestmodelsincommonuseisthenestederrorunitlevelregressionmodel,employedoriginallyby Batteseetal. ( 1988 )forpredictingareasundercornandsoybeansin12countiesinIowa,USA.Theyusedavariancecomponentsmodelassuggestedby HarterandFuller ( 1987 )inthecontextofSAE.Theyusedthemodel whereviandeijaremutuallyindependenterrortermswithzeromeansandvariances2vand2erespectively.HerenisamplesaredrawnfromareaswithNiobservations.Therandomtermvirepresentsthejointeectofareacharacteristicsthatarenotaccountedbytheconcomitantvariablesxij's.Underthemodel,thesmallareamean,denotedbyi,is wherexi(p)=N1iPNij=1xij.Letui=vi+eij,u=(u1;;uT)TandE(uuT)V=blockdiag(V1;V2;;VT), where 13

PAGE 14

~b=(XTV1X)1XTV1Y:(1{9) Thenthebestlinearunbiasedpredictor(BLUP)fortheithsmallareamean,when(2v;2e)isknown,isgivenby ^i=xTi~b+(yi:xTi:~b)gi;(1{10) wherexi:=n1iPnij=1xijandgi=2v=(2v+n1i2e): PrasadandRao ( 1990 ),amongothers,providedetails.AnalternativemethodistondHierarchicalBayes(HB)predictorsbyspecifyingpriordistributionsfor,2vand2eandcomputingtheposteriordistributionsf(ijy;X)givenalltheobservationsinalltheareas( DattaandGhosh 1991 ).MarkovchainMonteCarlo(MCMC)techniquesareusedforsimulationsfromtheposterior. Aspointedoutearlier,anotherimportantaspectofSAE,apartfromtheestimationofparameters,isthecalculationoftheerrorassociatedwithestimationknownasmeansquarederror(MSE).Tothisend,wediscussheretheworkof PrasadandRao ( 1990 ). PrasadandRao pointedoutthatthemodelsof FayandHerriot ( 1979 )and Batteseetal. ( 1988 )areallspecialcasesofthegeneralmixedlinearmodel whereyisthevectorofsampleobservations,XandZareknownmatrices,andvandearedistributedindependentlywithmeans0andcovariancematricesGandR,respectively,dependingonsomeparameters,thevariancecomponents.Following 14

PAGE 15

whereV=Cov(y)=R+ZGZTand~istheGLSestimatorofb.TheyproposedtheestimatorsforthevariancecomponentsusingHenderson'smethods. PrasadandRao alsoproposedanestimatorfortheMSEoftheestimators.Theyextendedtheworkof KackarandHarville ( 1984 ),andapproximatedthetruepredictionMSEoftheEBLUPundernormalityoftheerrorterms.Theyshowed whererbT=col1jp(@bT=@j)andbT=mTGZTV1.Here~istheHenderson'sestimatorofiand^=max(~;0).Hence, Theyprovidedgeneralconditionsunderwhichthepreciseorderoftheneglectedtermsintheapproximationiso(t1)forlarget.TheseconditionsaresatisedbybothFay-Herriotandthenestederrormodels.FortheestimationoftheMSE,themodelswereconsideredindividuallyasitisdiculttondgeneralconditions.Forexample,theMSEapproximationforthe FayandHerriot ( 1979 )modelisgivenasfollows: PrasadandRao estimatedVi(1Bi)byVi(1^Bi)andB2ixTi(XTD1X)1xiby^B2ixTi(XT^D1X)1xi,where^D=diag(V1+^A;;Vm+^A).TheyapproximatedE(^EBi(A)^EBi(^A))2by2B2i(Vi+A)1Var(^A):Whileundernormalityoftheerrorterms,Var(^A)wasapproximatedasVar(^A)=2m1[A2+2APVi=m+PV2i=m];correctuptothesecondorder.Hence,theestimatedMSEuptothesecondorderis 15

PAGE 16

( 1995 )showedthattheestimatorofMSEasgivenby( 1{16 )isrobusttodeparturesfromnormalityoftherandomareaeectvi's,thoughnotthesamplingerrorei's. DattaandLahiri ( 2000 )extendedtheresultsof PrasadandRao togenerallinearmixedmodelsoftheform, whereYiisthevectorofsampleobservationsoforderni,XiandZiareknownmatrices,andviandeiaredistributedindependentlywithmeans0andcovariancematricesGiandRi,respectively,dependingonsomeparameters,thevariancecomponents.Intheirpaper, DattaandLahiri rstestimatedthevariancesoftheBLUPforthecaseofknown.TheyalsodevelopedMSEestimatorscorrectuptothesecondorder,whenisestimatedbyMLEorREMLestimator.TheyalsoshowthatMSEapproximationsfortheMLorREMLmethodsareexactlythesameinthesecondorderasymptoticsense.However,thesecondorderaccurateestimatoroftheMSEbasedonthelattermethodrequireslessbiascorrectionthantheonebasedontheformermethod.Inthesamethread, Dattaetal. ( 2005 )derivedsecondorderapproximationtotheMSEofsmallareaestimatorsbasedontheFay-Herriotmodelasgivenby( 1{1 ),whenthemodelvariancesareestimatedbytheFay-Herriotiterativemethodbasedonweightedresidualsumofsquares.Theyalsoderivedanoninformativeprioronthemodelforwhichtheposteriorvarianceofasmallareameanissecond-orderunbiasedfortheMSE. ManyothermodelshavebeenproposedinthecontextofSAE. Cressie ( 1990 )and Ghosh,Natarajan,Stroud,andCarlin ( 1998 )emphasizetheuseoflinearmodelsandgeneralizedlinearmodels,respectively. EricksonandKadane ( 1987 )carriedoutSensitivityAnalysisforSAE. PfeermannandBurck ( 1990 ), RaoandYu ( 1992 )and Ghoshetal. ( 1996 ),studiedtheextentofpossibleuseoftimeseriesandcross-sectionalmodelsinSAE. 16

PAGE 17

McGibbonandTomberlin ( 1989 )consideredthemodelwhereyijBin(1;pij)and logit(pij)=xTijb+ui;uiN(0;2u):(1{18) Hereyij'sareconditionallyindependent.Theindependentrandomeectsaredenotedbyui's.Theauthorsassumeadiusepriorforbwith2uassumedtobeknown.Thejointposteriordistributionofbandfui;i=1;;mgisapproximatedbythemultivariatenormaldistributionwithmeanequaltothetrueposteriormodeandcovariancematrixequaltotheinverseinformationmatrixevaluatedatthemode.Theestimatorofpijisgivenby ~pij=[1+expf(xTij~b+~ui)g]1(1{19) where~band~uiaretherespectivemodesoftheposterior.Thentheareaproportionsareestimatedby^pi=PNij=1~pij=Niforsmallsamplingfractionsni=Ni.For2uunknown,theMLEof2uispluggedin,yieldinganEBestimator.TaylorexpansionisusedtoobtainanaiveestimatorofthevarianceoftheEBestimator.Butthedrawbackofthisestimatorisitignoresthevariancecomponentresultingfromtheestimationof2u. Farrell,McGibbon,andTomberlin ( 1997 )showedthatthevarianceestimatorcouldbeimproveduponbytheuseofparametricBootstrapprocedure. Someotherpapersrelatingtotheanalysisofbinaryoutcomesareby JiangandLahiri ( 2001 )wheretheyproposeamethodtondtheBestPredictor(BP)oflogit(pij)andusethemethodofmomentstoobtainsampleestimatesofbanduiasgiveninEq. 1{18 Malecetal. ( 1997 )alsoconsideralogisticlinearmixedmodelforBernoullioutcomesanduseHBapproachtoderivemodeldependentpredictors. 17

PAGE 18

Maiti ( 1998 )assumedalog-normaldistributionfori's.Healsoassumedspatialdependencemodelforthelog-transformedincidencerates.Someotherworksinthisareaincludepapersby Ghoshetal. ( 1999 ), LahiriandMaiti ( 2002 )and Nandrametal. ( 1999 ). Ghoshetal. ( 1998 )consideredHierarchicalBayes(HB)generalizedlinearmodel(GLM)sforauniedanalysisofbothdiscreteandcontinuousdata.Theresponsesyij'sareassumedtobeconditionallyindependentwithpdf Ghoshetal. usedthehierarchicalmodelinthefollowingway: (1{21) GhoshandMaiti ( 2004 )providedsmall-areaestimatesofproportionsbasedonnaturalexponentialfamilymodelswithquadraticvariancefunction(QVF)involvingsurveyweights.Inthepaper,theyconsideredarealevelcovariates,i.e.,thesamecovariatesforalltheunitsinthesamearea.Theyintroducedthenaturalexponentialfamilyquadraticvariancefunctionmodel,whichhasthesamesetupasEq. 1{20 with 18

PAGE 19

whereistheoverdipersionparameterandmi=g(xib).UsingresultsfromMorris(1983),E(i)=miandVar(i)=V(mi) providepseudoBLUPsofthesmallareameans,conditionalontheweightedmeansyiw=Pmi=1Pnij=1wijyij,wherewij>0'sarealljandPnij=1wij=1.TheyeventuallyarrivedatthepseudoEBLUPsbysimultaneouslyestimatingband,usingthemethodofunbiasedestimatingfunctionsfollowing GodambeandThompson ( 1989 ).TheyalsoprovidedapproximateformulaefortheMSEandtheirapproximateestimators. Chambers ( 1986 )and GwetandRivest ( 1992 ),whousedrobustestimationtechniquesspecicallyinthecontextofratioestimation. Chambers ( 1986 )mainlyconsideredtherobustestimationofanitepopulationtotalwhenthesampleddatacontains'representative'outlierssuchthatthereisnogoodreasontoassumethatthenonsampledpartofthetargetpopulationisfreeofoutliers. Chambers assumedasuperpopulationmodel0,underwhichtherandomvariablesrI=(yIxI)1Iareindependentandidenticallydistributedwithzeromeanandunitvariance.HerexI'sareknownrealvaluesassociatedwiththepopulationelementsyI's,I=2v(xI),isunknownand2isassumedtobeunknown.Under0, 19

PAGE 20

Chambers ( 1986 )proposedtheestimator Chambers discussedthedierentchoicesofandprimarilydiscussedthebiasandvariancerobustnessofTnunderagrosserrormodel.Basedontheasymptotictheorygiveninthepaper,itappearsthatthevariancerobustnessofTncanonlybeachievedbychoosingsuchthatthisestimatorisnolongerbiasrobust. Zaslavskyetal. ( 2001 )usedrobustestimationtechniquestodownweighttheeectofclusters,withanapplicationto1990postenumerationsurvey.InthecontextofSAE, DattaandLahiri ( 1995 )developedarobustHBmethodwhichwasparticularlywell-suitedwhentherewereoneormoreoutliersinthedataset.Theyusedaclassofscalemixturesofnormaldistributionsontherandomeectsinthebasicareamodel.Cauchydistributionwasusedfortheoutliers. WehavetriedtooutlinesomeofthemajorworkssignicantintheadvancesofSAE.ReviewpapersonSAEinclude Rao ( 1986 ), Chaudhuri ( 1994 ), GhoshandRao ( 1994 ), Rao ( 1999 )and Pfeermann ( 2002 ).ThereisabookonSAEby Rao ( 2003 ). 20

PAGE 21

InChapter2,wehaveconsideredthebasicarealevelmodelandproposedsomenewrobustsmallareaestimationproceduresbasedonstandardizedresiduals.Theapproachconsistsofrstndinginuencefunctionscorrespondingtoeachindividualarealevelobservationbymeasuringthedivergencebetweenthetwoposteriorsofregressioncoecientswithandwithoutthatobservation.Next,basedontheseinuencefunctions,properlystandardized,wehaveproposedsomenewrobustBayesandempiricalBayes(EB)smallareaestimators.TheapproximatedMSE'sandestimatedMSE'softheseestimatorshavealsobeenfound. InChapter3,wehaveaddressedsmallareaproblemswhentheresponseisbivariatebinaryinthepresenceofcovariates.WehaveproposedEBestimatorsofthesmallareameansbyassigningaconjugatepriortotheparametersaftersuitabletransformationandsubsequentmodellingonthecovariates.Wehavealsoprovidedsecond-orderapproximateexpressionsfortheMSE'sandestimatedMSE'softheseestimators. InChapter4,wehaveproposedHBestimatorsinthebivariatebinarysetup,wheresomeofthecovariatesmaybepartiallymissing.WeimplementedhierarchicalBayesianmodelswhichaccomodatethismissingness.Estimatorsofsmallareameansalongwiththeassociatedposteriorstandarderrors,andposteriorcorrelationshavealsobeenprovided. 21

PAGE 22

Shrinkageestimatorsoftheabovetypearequitereasonableiftheassumedpriorisapproximatelytrue.Theremaybeover-orunder-shrinkingtowardstheregressionestimatoriftheassumedpriorisnottrue.Indeed,thereisastandardcriticismagainstanysubjectiveBayesianmethodisthatitcanperformverypoorlyintheeventofsignicantdeparturefromtheassumedprior(seee.g., EfronandMorris ( 1971 ), FayandHerriot ( 1979 )). EfronandMorris ( 1971 1972 )remarkedthatBayesandempiricalBayesestimatorsmightperformwelloverallbutpoorlyonindividualcomponent.e.g.,supposeweassumethemodelwhere Butsupposetheassumedprioroni'sisactuallyamixtureofseveraldistributions,oneofwhichisN(1;A1);A1
PAGE 23

FayandHerriot ( 1979 ). EfronandMorris ( 1971 1972 )proposedlimitedtranslationrule(LTR)estimatorstoguardagainstproblemsoftheabovetypeinthecontextofcompounddecisionproblems.TheLTRestimatorswereobtainedwithaspecialchoiceofwhattheseauthorsreferredtoas\relevancefunction".ThemainpurposeoftheLTRestimatorswasto\hedgeBayesianbetswithfrequentistcaution"( EfronandMorris 1972 ). Inthischapter,wehaveintroducedthenotionofinuencefunctionsinthesmallareacontext,anddevelopedrobustEBandHBestimatorsbasedonsuchinuencefunctions.Theseinuencefunctionsareakintotherelevancefunctionsof EfronandMorris ,butarederivedonthebasisofageneraldivergencemeasurebetweentwodistributions.Inthisway,weareassigningsometheoreticalunderpinningstotheworkof EfronandMorris aswell.Also,unliketheinterceptmodelof EfronandMorris ,ourresultsarederivedinthemoregeneralregressioncontext,ascenariomoresuitableforsmallareaestimation. Theimportanceofinuencefunctionsinrobustestimationiswell-emphasizedinHampel(1974),Huber(1981),Hampeletal.(1986)andmanyrelatedpapers.Thepredominantideaistodetectinuentialobservationsintermsoftheireectsonparameters,mosttypicallytheregressionparameters.JohnsonandGeisser(1982,1983,1985)tookinsteadapredictivepointofviewfordetectionofinuentialobservations.Intheprocess,theydevelopedBayesianpredictiveinferencefunction(PIF's)andappliedtheseinseveralunivariateandmultivariatepredictionproblemsbasedonregressionmodels.ThesepredictiveinuencefunctionsarebasedonKullback-Leiblerdivergencemeasures. OurproposedinuencefunctionsaremotivatedinaBayesianwaybasedonsomegeneraldivergencemeasuresasintroducedinAmari(1982)andCressieandRead(1984).Includedasspecialcasesarethewell-knownKullback-LeiblerandHellingerdivergencemeasures. 23

PAGE 24

Herexi'sarethep(0)areknownisgivenby ^Bi=(1Bi)yi+BixTib;(2{3) whereBi=Vi ~HBi=(1Bi)yi+BixTi~b;(2{4) 24

PAGE 25

Asdiscussedintheintroduction,theaboveEB(orHB)estimatorcanoftenover-orunder-shrinkthedirectestimatorsyitowardstheregressionestimatorsxTi~b.WhatweproposenowisarobustBayesianprocedurewhichisintendedtoguardagainstsuchmodelmisspecication. Tothisend,webeginndingtheinuenceof(yi;xi);i=1;;minthecontextofestimatingtheparameterb.ThederivationoftheinuencefunctionisbasedonageneraldivergencemeasureasintroducedinAmari(1982)CressieandRead(1984).Fortwodensitiesf1andf2,thisgeneraldivergencemeasureisgivenby Theabovedivergencemeasureshouldbeinterpretedasitslimitingvaluewhen!0or!1.WemaynotethatD(f1;f2)isnotnecessarilysymmetricinf1andf2,butthesymmetrycanalwaysbeachievedbyconsidering1 2[D(f1;f2)+D(f2;f1)].Also,itmaybenotedthatas!0;D(f1;f2)!Ef1hlogf1 2(f1;f2)=4(1Rp Inthepresentcontext,weconsiderthedivergencebetweenthedensitiesof~band~b(i),where~b(i)istheEB(orHB)estimatorofbundertheassumedmodelgiveninEq. 2{2 with(yi;xi)removed.Tothisend,werstproveageneraldivergenceresultinvolving 25

PAGE 26

2(12)T((1+)21)1(12)gj1j 2j(1+)21j1 21]: j1j 2(X1)T11(X1)=(X1+12)T12(X1+12)1 2(X1)T11(X1)=h1 21(X1+12)iT1 21121 211 21(X1+12)1 2h1 21(X1)iTh1 21(X1)i=(Z+)TC(Z+)1 2ZTZ;

PAGE 27

21(X1);=1 21(12)andC=1 21121 21.NotingthatZNp(0;Ip),weget j1j (2)p 2ZT( 1+IpC)Z+TCZ+ j1j 2expf 1+IpC)1C)g1#=1 j1j 21; wherethenalstepofEq. 2{6 requiresthestandardmatrixinversionformula(Rao,1973,p33)(A+BDBT)1=A1A1B(D1+BTA1B)BTA1withA=C1;B=IpandD= 211 2121 21 2121 2111 21(12)=(12)T(2 Moreover, 2=j(1+)1 21111 211 21121 21j1 2=j1j1 2j(1+)1112j1 2=j1j1 2j12(1+)21)11j1 2=j2j1 2j(1+)21j1 2:

PAGE 28

WenowutilizeTheorem2-1inderivingtheinuenceof(yi;xi)inthegivensmallareacontext.Notethatbjy;XN[(XT1X)1XT1y;(XT1X)1] andbjy(i);X(i)N[(XT(i)1(i)X(i))1XT(i)1(i)y(i);(XT(i)1(i)X(i))1]; Byastandardmatrixinversionformula(Rao,1973,p33), (XT1X(Vi+A)1xixTi)1=(XT1X)1+(XT1X)1(Vi+A)1xixTi(XT1X)1 FromEq. 2{8 andEq. 2{9 ,onsimplication, 1(Vi+A)1xTi(XT1X)1xi:(2{10) 28

PAGE 29

(1+)21=(XT1X)1+(1+)(XT1X)1(Vi+A)1xixTi(XT1X)1 ItiseasytoseefromEq. 2{10 andEq. 2{11 that(12)T[(1+)21]1(12)isaquadraticforminyixTi~b.WeproposerestrictingtheamountofshrinkingbycontrollingtheresidualsyixTi~b.However,inordertomaketheseresidualsscale-free,westandardizethemas(yixTi~b)=Di,whereD2i=V(yixTi~b)=Vi+AxTi(XT1X)1xi;i=1;;m: ~RHBi=yiBiDiyixTi~b where(t)=sgn(t)min(K;jtj). Remark2.ForthespecialcaseoftheinterceptmodelasconsideredinLindleyandSmith(1972)withVi=1andBi=B=(1+A)1foralli,theaboverobustEBestimatorofireducesto~RHBi=yiBDyiy D;i=1;;m; TheselectionofKisalwaysanopenquestion.Oftenthisisdictatedfromtheuser'spointofview.AsomewhatmoreconcreterecommendationcanbegivenfromthefollowingtheoremwhichprovidestheBayesriskofrobustBayesestimatorgiveninEq. 2{12 underthemodelEq. 2{2 .Throughoutthispaper,letdenotethestandardnormaldfandthestandardnormalpdf.. 2{2 ).ThenE(~RHBii)2=g1i(A)+g2i(A)+g3im(A);

PAGE 30

But ^Bi~RHBi=yiBi(yixTib)yi+Bi(yixTi~b)BiDi(yixTi~b ItiseasytocheckthatEfxTi(~bb)g2=xTi(XT1X)1xi.Also,notingthat(yixTi~b)=DiN(0;1),andisthusdistributedsymmetricallyaboutzero,onegets whereZdenotestheN(0;1)variable.Nowbytheindependenceof~bandyixTi~bandthefactthatE(~b)=b,onegetstheresultfromEq. 2{13 -Eq. 2{15 ItmaybenotedthatiftheassumedmodelEq. 2{2 iscorrect,thentheBayesriskoftheregularEBestimator~EBiisg1i(A)+g3im(A).Hence,theexcessBayesriskof~REBiover~EBiundertheassumedmodelisg2i(A)whichisstrictlydecreasinginK,andasintuitivelyexpected,tendstozeroasK!1.Hence,settingavalue,say,Mtog2i(A)willdeterminethevalueofK.Thus,thechoiceofKwillbebasedonatradeobetween 30

PAGE 31

WenowproceedtothestudyofrobustEBestimatorsinthenextsection. ^REBi=yi^Bi^DiyixTi^b whereisdenedintheprevioussectionafterEq. 2{12 OurobjectiveinthissectionistondE(^REBii)2correctuptoO(m1),i.e.weprovideanasymptoticexpansionofthisexpectationwithanexplicitcoecientofm1,whiletheremaindertermwhenmultipliedbymconvergestozero,subjecttocertainassumptions.HereEdenotesexpectationoverthejointdistributionofyand.WeneedcertainpreliminaryresultsbeforecalculatingtheBayesrisks. Firstweneedtheasymptoticdistributionof^A.Basedonthemarginaldistributionoftheyi's,thelikelihoodforbandAisgivenby 2exp[1 2mX1(yixTib)2=(Vi+A)]:(2{17) 31

PAGE 32

2Pm1(Vi+A)2375:(2{18) DuetotheasymptoticindependenceofbandA,^AisasymptoticallyN(A;2(Pm1(Vi+A)2)1).Wenowhavethefollowingsimplebutimportantlemma. HereOereferstotheexactorderwhichmeansthattheexpectationisboundedbothbelowandabovebytermssuchasC1mr=2andC2mr=2forlargem.ThisisincontrasttoorderOwhichonlyrequirestheupperbound. Arigorousproofofthelemmarequiresauniformintegrabilityargumentwhichweomit.However,intuitively,E[j^AAjrfPm1(Vi+A)2gr=2]=Oe(1).Butm(V+A)2Pm1(Vi+A)2m(V+A)2.Accordingly,Pm1(Vi+A)2=Oe(m).ThisleadstoEj^AAjr=Oe(mr Nextwewritej^BiBij=Vij^AAj=[(Vi+^A)(Vi+A)]j^AAj=(V+A)foralli=1;;mwhichleadstothedesiredconclusion.Also,weconcludefromthelemmathat foranyarbitrary>0andarbitrarylarger(>0)whichwerequireinthesequel.WemaypointoutthattheassumptionregardingtheboundednessoftheVi'sisrequiredinanyasymptoticBayesrisk(orMSE)calculationsinthiscontextasevident,forexample,fromPrasadandRao(1990). WearenowinapositiontondanexpressionforE(^REBii)2whichissecondordercorrect,i.e.thebiasisoftheordero(m1).Thefollowingtheoremisproved. 32

PAGE 33

Tothisend,werststatethefollowinglemmas. 33

PAGE 34

2Pmj=1(Vj+A)2=1 2tr(2)andK(A)=tr(I1M1);J(A)=tr(I1M2),where @b21@A@3logL @b1@bp@A@3logL @b1@A2@3logL @b1@bp@A@3logL @b2p@A@3logL @bp@A2@3logL @b1@A2@3logL @bp@A2@3logL @A3377777775=264XT2X00T2tr(3)375; and @b21)(@logL @A)(@2logL @b1@bp)(@logL @A)(@logL @b1)(@2logL @A2)(@2logL @b1@bp)(@logL @A)(@2logL @b2p)(@logL @A)(@logL @bp)(@2logL @A2)(@logL @b1)(@2logL @A2)(@logL @bp)(@2logL @A2)(@logL @A)(@2logL @A2)377777775=2640pp0p101ptr(3)375: SinceI1=264(XT1X)1010Tpftr(2)g1375; K(A)+2J(A)=tr[(XT1X)1(XT2X)+f2tr(3)2tr(3)gftr(2)g1]=tr[(XT1X)1(XT2X)]: TheLemmanowfollowsfromEq. 2{20 andEq. 2{23 34

PAGE 35

NowbyLemma2,andthefactthatE(^AA)2=(2tr2)1+o(m1),weget (Vi+A)(tr2)trf(XT1X)1(XT2Xg Remark3.ItmaybenotedthattheleadingterminE(^AA)isO(m1)andg5im=O(m1). ThelemmasprovideapproximationstoE(^AA)andE(^BiBi)bothcorrectuptoO(m1).Thus,byLemmas2and3wegetthefollowingtheoremwhichgivesustheestimateoftheMSEcorrectuptothesecondorder. 35

PAGE 36

g6im(A)=2B2i[mXi=1(Vj+A)2]1tr[fmXi=1(Vj+A)1xjxTjg1fmXi=1(Vj+A)2xjxTjg][(1+K2)(K)K(K)]:(2{25) Hence,weestimateg1i(A)byg1i(^A)+Vig5im(^A). Next,byonestepTaylorapproximation,g2i(^A)=g2i(A)+(^AA)@g2i 36

PAGE 37

dA(B2iD2i)=d dA[V2i(Vi+A)2fVi+AxTi(mXj=1(Vj+A)1xjxTj)1xig]=V2i[(Vi+A)2+2(Vi+A)3xTi(mXj=1(Vj+A)1xjxTj)1xi+(Vi+A)2xTi(mXj=1(Vj+A)1xjxTj)1(mXj=1(Vj+A)2xjxTj)(mXj=1(Vj+A)1xjxTj)1xi]=B2i+O(m1): Hence,byLemma2-2andassumptionsofTheorem2-3, Thus,weestimateg2i(A)byg2i(^A)+g6im(^A).Thetheoremfollows. 37

PAGE 38

andform=20,thenumbersofsmallareasaredoubledwithineachgroup.Weconsiderdierentdistributionsforgeneratingui's.(i)Contaminatednormaldistribution.Inparticular:9N(0;1)+:1N(0;36)sothat1of10areasisexpectedtobeanoutlier.(ii)NormaldistributionN(0;4:5).Notethat4.5isthevarianceofthecontaminatednormaldistribution.ThenwecalculatethefollowingquantitiestomeasuretheperformanceoftheSAE.Let^ibetheestimateofi.Herewedenetheabsoluterelativebiasas RB:RelativebiasofstandardEBestimates;RBR:relativebiasofrobustEBestimates;RBI:relativeimprovementofrelativebiasofrobustestimatesoverthestandardBayesestimates;SMSE:empiricalMSEofstandardEBestimates;SMSER:empiricalMSEofrobustEBestimates;RMI:RelativeimprovmentofMSEofrobustestimatorsoverthestandardBayesestimators.REBN:relativebiasofthenaiveMSEestimatesunderrobustEBmethod;RI:relativeimprovementofbiasintheestimatedMSEovernaiveestimatedMSEundertherobustmethod.HerethenaiveMSEestimatesareobtainedbytakingonlythersttwotermsofMSEestimates.Thevaluescorrespondingtonormalrandomeectsarereportedwithinparentheses. 38

PAGE 39

Table2-1. RelativebiasesofthepointestimatesandrelativebiasofMSEestimates. KRBRBRRBISMSESMSERRMIREBNREBRI m=10.52.7432.011.267.909.344.622-.106-.087.055(1.150)(1.051)(.086)(.460)(.345)(.250)(-.065)(-.050)(.0434)1-2.202.197-.355.609-.154-.133.059(-)(1.067)(.072)(-)(.350)(.239)(-.078)(-.062)(.046)1.5-2.067.246-.372.590-.176-.153.062(-)(0.969)(.157)(-)(.350)(.239)(-.084)(-.066)(.051)m=20.52.5812.392.073.329.320.027-.059-.050.028(.683)(.653)(.059)(.322)(.321)(.003)(-.026)(-.018)(.025)1-2.390.074-.328.003-.090-.080.0305(-)(.677)(.023)(-)(.318)(.012)(-.041)(-.033)(.025)1.5-2.120.179-.318.033-.089-.078.035(-)(.649)(.064)(-)(.310)(.037)(-.081)(-.074)(.023) 39

PAGE 40

Bishopetal. ( 1975 ); ZhaoandPrentice ( 1990 )).Bivariatebinarydataariseverynaturallyinopthalmologicalstudies.Itemresponsetheory,whichoccupiesacentralroleinpsychometryandeducationalstatistics,dealsalmostexclusivelywithmultivariatebinarydata.TheclassicexampleistheRaschmodelwithitsmanyandvariedextensions. Ourinterestintheanalysisofmultivariatebinarydatastemsfromtheirpotentialapplicationinsmallareaestimation.Weconsiderthecasewhentheresponseisbivariatebinary.InSection2,wehaverstderivedtheBayesestimators,andsubsequentlytheEBestimatorsforthesmallareaparameters.Section3isdevotedtothederivationoftheMSE'sofBayesestimators,andsecondordercorrectapproximationsfortheMSE'softheEBestimators.Section4isdevotedtothesecondordercorrectestimatorsoftheMSE'softheEBestimators.WehaveconductedasimulationstudyinSection5toillustrateourmethod. 1+exp(ij1)+exp(ij2)+exp(ij1+ij2+ij3):(3{1) Withthereparametrizationp1ij=exp(ij1) 1+exp(ij1)+exp(ij2)+exp(ij1+ij2+ij3)=P(yij1=1;yij2=0);

PAGE 41

1+exp(ij1)+exp(ij2)+exp(ij1+ij2+ij3)=P(yij1=0;yij2=1); 1+exp(ij1)+exp(ij2)+exp(ij1+ij2+ij3)=P(yij1=1;yij2=1); 3{1 as wherep4ij=1P3l=1plij.Let1ij=pij1+pij3,2ij=pij2+pij3.Ourprimaryinterestissimultaneousestimationofthesmallareameans(1i;2i)T,where1i=Pnij=1wij1ij,2i=Pnij=1wij2ij,Pnij=1wij=1foreachi.Heretheweightswij'sareassumedtobeknown,basedonsomegivensamplingscheme.Webeginwiththelikelihood WritingpT=(pT11;;pT1n1;;pTk1;;pTknk),theconjugateDirichletpriorisgivenby whereisthedispersionparameter,m4ij=1P3l=1mlij,andthemlijarefunctionsofthecovariatesxijassociatedwiththebinaryvectory.Inparticular,wetake whereb1;b2;b3aretheregressioncoecients.Writingz1ij=yij1(1yij2),z2ij=yij2(1yij1),z3ij=yij1yij2andz4ij=1P3l=1zlij,theposteriorforpisnowgivenby 41

PAGE 42

3{6 ,theBayesestimator(posteriormean)ofplijisgivenby ^pBlij=(zlij+mlij(b))=(+1);j=1;;ni;i=1;;m:(3{7) Theresultingestimatorsfor1ijand2ijare ^B1ij=^pB1ij+^pB3ij;^B2ij=^pB2ij+^pB3ij:(3{8) InanEBscenario,theparameterb1;b2andb3areallunknownandneedtobeestimatedfromthemarginaldistributionsofallthey's.Weassumethatisknown,andoutlinelateritsadaptiveestimation.Directestimationofisimpossiblefromthemarginaldistributionsofthey'ssincemarginallyP(yij1=1;yij2=0)=E(z1ij)=m1ij;P(yij1=0;yij2=1)=E(z2ij)=m2ijandP(yij1=1;yij2=1)=E(z3ij)=m3ij,andthejointmarginaloftheyisuniquelydeterminedfromtheseprobabilitiesresultinginnonidentiabilityin. Onewaytoestimateb1;b2;b3fromthemarginalDirichlet-Multinomialdistributionofthey'sisthemethodofmaximumlikelihood.ButtheMLE'saremathematicallyintractable,andarenotreadilyimplementableinpractice.Instead,weappealtothetheoryofoptimalunbiasedestimatingfunctionsasproposedin GodambeandThompson ( 1989 ). 42

PAGE 43

theoptimalestimatingequationsaregivenbyPki=1Pnij=1DTij1ijgij=0.Solvingtheseequations,wendtheestimatorsofb1;b2andb3.Werstobservethat AgainfromEq. 3{9 ,DTij=ijxij,wheredenotestheKroneckerproduct.Hence,theequationsPki=1Pnij=1DTij1ijgij=0canberewrittenasPki=1Pnij=1[I3xij]gij=0whichfurthersimpliesinto SolvingEq. 3{11 ,oneestimatesb1;b2andb3.TheEBestimatorsofplij(l=1;2;3)arenowgivenby ^pEBlij=(zlij+mlij(^b))=(+1);(3{12)j=1;;ni;i=1;;k;l=1;2;3;where^bT=(^bT1;^bT2;^bT3).ThentheEBestimatorsof1ijand2ijaregivenby ^EB1ij=^pEB1ij+^pEB3ij;^EB2ij=^pEB2ij+^pEB3ij:(3{13) Wewrite^Bi=(^B1i;^B2i)T,^EBi=(^EB1i;^EB2i)T,where^Bri=Pnij=1wij^Brij,^EBri=Pnij=1wij^EBrij,(r=1;2;i=1;;k). 43

PAGE 44

Werstobservethatforj6=j0,1l;l03,

PAGE 45

forallr;s=1;2.Thus, Nowforeachl=1;2;3, Firstconditioningonplij,itfollowsfromEq. 3{17 that 45

PAGE 46

Hence, ItfollowsfromEq. 3{16 andEq. 3{20 that Nextweobservethat 46

PAGE 47

Byone-stepTaylorexpansion,forl=1;2;3, Hencefor1l6=l03, NextwendanapproximationtoE[(^bb)(^bb)T]whichiscorrectuptoO(k1). Tothisend,letSk(b)=Pki=1Pnij=1DTij1ijgij.NotingthatE(gij)=0, @b(DTij1ij)gij]=kXi=1niXj=1DTij1ijDij=kXi=1niXj=1[ijxijxTij]: WedenotebyUktheexpressionintherightmostsideofEq. 3{26 .WeassumethatUk=O(k).Thisleadstotheapproximation NowE[(^pEBij^pBij)(^pEBij0^pBij0)T]:=Vijj0;

PAGE 48

where ByEq. 3{23 -Eq. 3{30 Hence, Thetheoremfollows. 48

PAGE 49

3{33 whichissecondordercorrect,i.e.therstneglectedtermiso(k1).WemaynotethatthesecondordertermintherighthandsideofEq. 3{33 isO(k1).Hence,substitutionof^bforbestimatesthesamecorrectlyuptoO(k1).However,estimationofMSE(^Bi)withthesimplesubstitutionof^bforbisnotadequatesincetherstneglectedtermisO(k1).Hence,weneedtoestimateMSE(^Bi). Tothisend,webeginwiththeevaluationofE[m1ij(^b)(1m1ij(^b))].Bytwo-stepTaylorexpansion, 2(^bb)T@2m1ij(b) Hence, 2(^bb)T@2m1ij(b) 2(^bb)T@2m1ij(b) 2(12m1ij(b))@2m1ij(b) 49

PAGE 50

3{29 .Basedonthatexpression,onegets WeneedtondE(^bb).Wefollow CoxandSnell ( 1968 )tondthisexpectation. Tothisend,werstnotethat@Sk=@b=@ @b[Pki=1Pnij=1fI3xijg0BBBB@z1ijm1ijz2ijm2ijz3ijm3ij1CCCCA]isaconstant.Hence,sinceE(Sk)=0theexpectationoftheproductofanyelementofSkwithanyelementof@Sk=@biszero.ThusdenotingJ=(Jlu;rs;mn)=(E(Sk;lu;@Sk;rs 50

PAGE 51

2U1k26666666666666666640BBBBBBBBBBBBBBBBB@tr(U1kK11)...tr(U1kK1p)...tr(U1kK31)...tr(U1kK3p)1CCCCCCCCCCCCCCCCCA+20BBBBBBBBBBBBBBBBB@tr(U1kJ11)...tr(U1kJ1p)...tr(U1kJ31)...tr(U1kJ3p)1CCCCCCCCCCCCCCCCCA3777777777777777775=1 2U1k2666666666666666664tr(U1kK11)...tr(U1kK1p)...tr(U1kK31)...tr(U1kK3p)3777777777777777775; where 51

PAGE 52

where1l6=l06=l003,1up.ThusE(^bb)=O(k1). FromEq. 3{39 -Eq. 3{42 ,oneobtainstheelementsofthematricesKluforalll=1;2;3andu=1;;p.ThusfromEq. 3{36 -Eq. 3{42 ,thebias-correctedestimateofm1ij(b)(1m1ij(b))isgivenby 2(12m1ij(^b))(@m1ij(b) 2(12m1ij(^b))@2m1ij(b)

PAGE 53

2(^bb)T@2m1ij(b) 2(^bb)T@2m2ij(b) 2fm1ij(b)@2m2ij(b) 2U1k(^b)266664tr(U1k(^b)K11(^b))...tr(U1k(^b)K3p(^b))377775tr[f1 2fm1ij(^b)@2m2ij(b) whereCi(^b)=niXj=1w2ij0B@1010111CADij(^b)0BBBB@1001111CCCCA:

PAGE 54

(Dij(^b)p;q=mpij(^b)[1mpij(^b)]1 2(12mpij(^b))(@mpij(b) 2(12mpij(^b))@2mpij(b) 2U1k(^b)266664tr(U1k(^b)K11(^b))...tr(U1k(^b)K3p(^b))377775tr[f1 2fmpij(^b)@2mqij(b) 3{43 asg(),wehave 54

PAGE 55

55

PAGE 56

TheAverageRelativeMeanSquaredError(ARMSE)isgivenby TheAverageDeviation(AD)isdenotedby TheAverageMSEisgivenby and,theestimatedMSEisgivenby where\MSE(^i)(r)isthesecondordercorrectestimateoftheMSEasgivenby 3{43 .InadditiontotherelativebiasofthepointestimatesandsimulatedMSE's,wereportrelativebiasoftheestimatesofthemeansquarederrorsasdenedbelow. RelativebiaswithrespecttotheempiricalMSE:REBi=Ef\MSE(^i)gSMSE(^i) AsmatterofinterestwealsocomputedthedirectestimatorstocomparetheirperformanceswithourempiricalBayesestimators.WeobservedtherewasconsiderableimprovementintermsofbiasandMSEwhenweusedourEBestimatorsascomparedtotherawestimators,e.g.,for=5,theimprovementwas48.5%forARD,98.1% 56

PAGE 57

Table3-1. RelativebiasesofthepointestimatesandrelativebiasofMSEestimatesforthesmallareamean:therstvariable. k=10.50.6140.0360.0810.0110.0372.38510.2890.3300.0790.0120.0432.35550.1920.071.0690.0080.0233.276k=20.50.3582.320.0800.01080.0362.40610.290.04020.0890.0130.0392.34750.1800.0050.0680.00730.0222.406 Table3-2. RelativebiasesofthepointestimatesandrelativebiasofMSEestimatesforthesmallareamean:thesecondvariable. k=10.50.4220.0070.0740.0090.0070.21410.2080.1820.0840.0100.0090.28350.1510.0500.0630.0060.0050.196k=20.50.2090.3550.0750.0100.0070.22110.290.0080.0790.0110.0080.22550.14120.0230.0620.0060.0040.206 57

PAGE 58

LittleandRubin 2002 ).Wehaveincorporatedideasfrom LipsitzandIbrahim ( 1996 )and Ibrahim,Lipsitz,andChen ( 1999 )totreatthemissingcovariates. ThegeneralHBmodelswhichaccomodatethismissingnessisdiscussedinSection2.Wehave,aftersuitablereparametrization,convertedthebivariatebinarylikelihoodintoamultinomiallikelihood,andsubsequentlyhaveutilizedthemultinomial-PoissonrelationshipleadingeventuallytoaPoissonreparameterization.Bayesiananaloguesoflog-linearmodelsareusedtoestimatethePoissonparameters.WehaveappliedtheHBmethodologytoobtainthesmallareaestimates,theassociatedstandarderrorsandposteriorcorrelations.Duetotheanalyticalintractabilityoftheposterior,theHBprocedureisimplementedviaMarkovchainMonteCarlo(MCMC)integrationtechniques,inparticular,theGibbssampler.InSection3,Themethodologyisillustratedwiththerealdatarelatedtolowbirthweightandinfantmortality. 1+exp(ij1)+exp(ij2)+exp(ij1+ij2+ij3):(4{1) 58

PAGE 59

wherepij4=1P3l=1pijl.Thenwritingzij1=yij1(1yij2),zij2=yij2(1yij1),zij3=yij1yij2andzij4=(1yij1)(1yij2),weget Nextwewritepijl=ijl=P4r=1ijrandusethewell-knownrelationshipbetweenthemultinomialandPoissondistribution,namely,(zij1;zij2;zij3;zij4)havethesamedistributionasthejointconditionaldistributionof(uij1;uij2;uij3;uij4)givenP4q=1uijq=1,whereuijqareindependentPoisson(ijq)(q=1,2,3,4). Webeginmodellingtheijlas wherexij=(xij1;xij2;:::;xijq)T,bl=(bl1;bl2;:::;blq)T,andviiidN(0;2).Inthepresenceofmissingcovariates,wewritexTij=(xTij;mis;xTij;obs)andassumethatthemissingnessisMAR.Inthepresenceofmissingnessinsomeofthecomponentsofxij,wemodelonlythosemissingcomponentsfornecessaryimputation.Further,inordertoreducethenumberofnuisanceparameters,wemodelthejointdistributionofthesemissingcomponentsastheproductofseveralsmallerdimensionalconditionaldistributionsasproposedin LipsitzandIbrahim ( 1996 )and Ibrahim,Lipsitz,andChen ( 1999 ).Specically,wepartitionxij;misintoseveralcomponentvectors,say,x(1)ij;mis;;x(r)ij;mis,parameterizedby1;:::;r,andconditionalontheobservedxij;obs

PAGE 60

ItisimportanttonotethatsinceweareassumingMARmissingcovariates,wedonotneedtomodelthemissingdatamechanism. Writezij=(zij1;zij2;zij3;zij4)TandbT=(bT1;bT2;bT3;bT4).LetD=(zij;xij;j=1;2;:::;ni;i=1;2;:::;k)denotethecompletedata.Also,letDobs=(zij;xij;obs;j=1;2;:::;ni;i=1;2;:::;k)denotetheobserveddata.Then,thecompletedatalikelihoodunderthehierarchicalmodelisgivenbyL(b;2;jD)=(kYi=1"niYj=14Yl=1expfzijl(xTijbl+vi)g (4{6) andtheobserveddatalikelihoodisoftheformL(b;2;jDobs)=Z(kYi=1Z"niYj=14Yl=1expfzijl(xTijbl+vi)g whereissome-nitemeasure,typicallytheLebesguemeasureorthecountingmeasure. Weassumebl's,l=1;:::;4,2,areapriorimutuallyindependentwithbluniform(Rq)forl=1;:::;4,2InverseGamma(c;d),whereInverseGamma(c;d)pdfis 60

PAGE 61

Directevaluationoftheposteriormomentsofpijl'sarenotpossibleasitentailsevaluationofnumericalintegralswhichareverydiculttocompute.WeusetheGibbssamplingmethodinstead.Letxmisdenotethefullvectorofallthexij;mis,andv=(v1;:::;vk)T.TofacilitateGibbssampling,weconsiderthefollowingjointposteriorbasedonthecompletedatalikelihoodas: whereL(b;2;jD)isgivenbyEq. 4{6 .ItiseasytoshowthatafterintegratingoutvandxmisfromEq. 4{9 ,thejointmarginalposteriorof(b;2;)isEq. 4{8 Toprovetheproprietyoftheposterior,wewillmainlyfollowthetechniquesgivenin ChenandShao ( 2000 ), Chenetal. ( 2004 )and Chenetal. ( 2002 ).Wenotethatourcovariatesarebounded.Weassumethataijxij;missbij,whichmeansallthecomponentsofxij;missareboundedbyaijandbij:Westartwithintroducingsomenotations: letM=f(i;j):atleastonecomponentofxijismissingg,Mij=f1rq:xijrismissingg,andMr=f(i;j):r2Mijg;wherej=1;;ni,i=1;;k:

PAGE 62

Chen,Ibrahim,andShao ( 2004 ), DenotebyI1l=f(i;j):zijl=0gandI2l=f(i;j):zijl1g.Letf1()=expfexp()g,f2()=exp(+)andf2()=exp():Notethatf1andf2arenon-increasingin.Alsof1(1)=f2(1)=0,and,f1(1)<1;f2(1)<1.Ontheotherhand,f3isnon-decreasingin,f3()=0andf3(1)<1.Letn=Pki=1niandNl2bethecardinalityofI2l;l=1;;4:DenoteuTij=(xTij;ei)withe=(0;::;1;0;::0)2Rkasthebasisvectorwith1attheithcomponent.Let bethe(n+Nl2)(q+k)designmatrixsuchthatu(l)n+ij=uijfor(i;j)2I3l.Letw(l)ij=1if(i;j)2I1lSI2landw(l)n+ij=1if(i;j)2I2l:DeneU(l)tobethematrixwithrowsw(l)ijuTij;1jni;1ikw(l)n+iju(l)Tn+ij;(i;j)2I2l For1rq,1l4,let ^alij=8><>:aijif(i;j)2I1lSI2lbijo.w.^aln+ij=bijif(i;j)2I2l(4{13) 62

PAGE 63

Wearenowledtothefollowingtheorem. 4{11 isoffullrank; whereU(l)isdenedinEq. 4{12 Wewilldefertheproofofthistheoremtotheappendix. ToimplementtheGibbssamplingalgorithm,wesamplefromthefollowingconditionaldistributionsinturn:(i)(bjv;xmis;Dobs),(ii)(vjb;2;xmis;Dobs),(iii)(2jv;Dobs),(iv)(xmisjb;v;;Dobs),and(v)(jxmis;Dobs). For(i),givenv,xmis,andDobs,thebl'sareconditionallyindependentwith(bljv;xmis;Dobs)/exp(kXi=1niXj=1hzijl(xTijbl+vi)exp(xTijbl+vi)i): GilksandWild ( 1992 )forl=1;:::;4.For(ii), 63

PAGE 64

GilksandWild ( 1992 )fori=1;2;:::;k.For(iii),wehave2jv;DobsInverseGammac+k;d+kXi=1v2i: 64

PAGE 65

LittleandRubin 2002 ). Weuseourmethodtoestimateinfantmortalityrateandlowbirthweightratesimultaneouslyoveraperiodof5yearsforseveralsmallareasinoneofthestates(namewithheldforcondentiality).Weconstructedthesmallareasbycross-classifyingthedierentcitiesinthestatewithmother'srace,namely,Non-HispanicWhite,Non-HispanicBlack,Non-HispanicAmericanIndian,Non-HispanicAsian/PacicIslanderandHispanic.Weget48suchsmalldomains.Foreachsmalldomain,weconsidertheNCHSfulldataasconstitutingourpopulationvalues,andwedrawa2%simplerandomsamplefromthispopulationofindividuals.Thusthepopulationlowbirthweightandinfantmortalityratesareknownforthesesmalldomains,andwillserveasthe\goldstandard"forcomparisonwiththedierentestimates.Inparticular,wewillndtheHBestimates,andcomparethesamewiththedirectestimates,namelythesamplemeans.Inouranalysisweconsidertheresponsevariablesandcovariatesasfollows:yij1=infantdeath(1=yes,0=no);yij2=lowbirthweight(1=yes,0=no);xij1=1denotingtheintercept;(xij2;xij3)=(0;0),(1;0),or(0;1)accordingasmother'sschoolingislessthan12years,equalto12years,orisgreaterthan12years;(xij4;xij5)=(0;0),(1,0)or(0,1)ifmother'sageislessthan20years,between20and34years,orisgreaterthan34years;andxij6=0or1ifmotherdoesnotsmoke,ortheaveragenumberofcigarettessmokedduringpregnancyisatleast1.Herey=(yij1;yij2)Tistheresponsevectorwiththecovariates 65

PAGE 66

1+exp(21+22xij4+23xij5+24xij6)+exp(31+32xij4+33xij5+34xij6); 1+exp(21+22xij4+23xij5+24xij6)+exp(31+32xij4+33xij5+34xij6); Toprovetheproprietyoftheposteriorunderanuniformpriorfor,westartwiththecompletelikelihoodforthecovariates:LetL(6;2;3jDcomp)=kYi=1niYj=1"f(xij6j6)expfxij2(21+22xij4+23xij5+24xij6)g+xij3(31+32xij4+33xij5+34xij6)g Weconsiderthefollowingfourcases. 66

PAGE 67

67

PAGE 68

1+exp(uT1ij)+exp(uT2ij)#Y(i;j)2J2Xxij6;mis"f(xij6j6)exp(uT2ij) 1+exp(uT1ij)+exp(uT2ij)#Y(i;j)2J3Xxij6;mis"f(xij6j6)1 1+exp(uT1ij)+exp(uT2ij)#: Wefollowmainlythetechniquesof ChenandShao ( 2000 )and Chen,Ibrahim,andShao ( 2006 )toprovetheproprietyoftheposterior.First,for(i;j)2Mc6=f(i;j):x6ijisobservedg;letxij=(x1ij;x4ij;x5ij)andXbethethen3designmatrixwithrowsxij,wherenisthecardinalityofMc6.Letzij=12x6ij;andXbethematrixwithrowszijxij;(i;j)2Mc6:Letusstatethefollowingconditions: 68

PAGE 69

Nowwestatethefollowingtheorem: Forouranalysis,weassumeanInverseGamma(1,1)priorfor2.Table4-1providesthesamplesizesforallthe48smalldomains,thepopulationproportionsoflowbirthweightandinfantmortality(P1,P2),thecorrespondingsamplemeans(D1,D2),andthecorrespondingHBestimates(HB1,HB2). 69

PAGE 70

BivariateHBestimatesalongwiththeDirectestimatesandtheTruevalues areaniP1P2D1D2HB1HB2 16950.0070.0680.0040.0620.0060.080230.0150.0990.0000.0000.0070.0823410.0060.0610.0000.1220.0110.108414770.0060.0680.0060.0710.0060.07554250.0140.1310.0240.1500.0060.07763450.0060.0670.0080.0640.0060.08275100.0050.0670.0040.0610.0090.10288880.0090.1160.0080.1140.0090.1169440.0040.0810.0000.0910.0050.0731017180.0040.0630.0020.0580.0070.083113790.0060.0680.0050.0820.0050.075121260.0130.1300.030.1350.0070.0821350.0050.0680.0000.2000.0090.09114150.0030.0770.0000.0670.0060.07615130.0030.0700.0000.0000.0070.083169750.0050.0720.0070.0600.0100.107173490.0110.1350.0170.1490.0070.0831840.0050.1120.0000.5000.0070.08019580.0030.0740.0000.0690.0050.078202880.0050.0820.0030.0900.0070.08221270.0020.0780.0000.1110.0070.079222370.0040.0670.0000.0760.0060.076239870.0050.0680.0040.0780.0060.0792460.0190.0520.0000.0000.0050.07825500.0080.0840.0000.0400.0080.084263650.0040.0590.0030.0880.0040.072274530.0080.0640.0110.0770.0050.07328780.0190.1200.0120.1280.0050.0782940.0160.0930.0000.0000.0070.08130150.0040.0760.0000.1330.0050.07231190.0070.0640.0000.1580.0070.086324460.0080.0710.0020.0740.0050.07333760.0150.1330.0000.1450.0050.0693480.0080.0850.0000.1250.0070.08635220.0060.0560.0450.1360.0090.093362900.0140.1330.0140.1310.0050.0733730.0090.0740.0000.0000.0080.08638180.0060.0610.0000.1110.0060.082399330.0050.0660.0040.0620.0070.084402070.0120.1190.0190.1260.0060.0824140.0000.0840.0000.0000.0080.08542210.0080.0840.0000.0480.0070.0894314720.0070.0700.0070.0710.0080.08944270.0040.0730.0000.0370.0050.076452750.0060.0670.0030.0690.0050.0724611310.0060.0650.0080.0600.0080.085471910.0140.1250.0160.1260.0080.087483030.0060.0600.0030.0660.0080.085 70

PAGE 71

TheAverageRelativeMeanSquaredError(ARMSE)isgivenby TheAverageDeviation(AD)isdenotedby and,theAverageMSEisgivenby Table4-2summarizestheperformanceoftheHBestimates,andalsoofthedirectestimates.ItfollowsfromthesummarytableTable4-2thattheHBestimatesoutperformthedirectestimatesaccordingtoallthefourcriteria.FortheHBestimates,ARD,ARMSE,ADandAMSEimprovementoverthedirectestimatesare39.39%,74.88%,37.93%and71.43%respectively,forestimatinginfantmortalityrate.Forlowbirthweightproportions,correspondingimprovementsare46.59%,85.09%,43.05%and86%respectively. Table4-2. Measuresofprecision D1D2HB1HB2 ARD0.79730.4640.48320.2478ARMSE1.65250.62440.41510.0931AD0.00580.03670.00360.0209AMSE0.000070.0050.000020.0007 71

PAGE 72

Weusedtheposteriorpredictivep-valuesofGelmanetal.(2003,Ch.6)forassessingthemodelt.Theposteriorpredictivep-valuefortheithareaisdenedasppi=Pr(T(yrep;i)T(y;i)jDobs); Ifthemodelisreasonablyaccurate,thereplicationsshouldlooksimilartotheobserveddatay.Inthecasewherethedatayininconictwiththepositedmodel,T(y;i))'sareexpectedtohaveextremevalues,whichinturnshouldgivep-valuescloseto0or1.Hence,anidealtrequirestheposteriorpredictivep-valuestobeintheballparkof0.5.TheguresprovidedinTable3showthatthisisindeedthecaseforourttedmodel. 72

PAGE 73

Posteriorvariance,correlationand95%HPDsalongwithpredictivep-values I1I2areaSE1SE2CORRp-value 10.0010.0020.1150.0050.0070.0760.0850.53620.0010.0020.1510.0050.0080.0770.0870.54430.0020.0060.1810.0070.0150.0970.1190.55640.0010.0020.1240.0040.0070.0710.0800.52050.0010.0020.1210.0050.0070.0720.0810.57260.0010.0020.0990.0050.0070.0770.0860.55270.0020.0050.1530.0060.0130.0920.1120.49680.0090.1300.1490.0080.1020.0050.0730.54890.0010.0020.1680.0040.0070.0690.0780.536100.0010.0030.1060.0060.0090.0780.0880.588110.0010.0020.1700.0040.0060.0710.080.544120.0010.0030.1700.0060.0090.0770.0870.524130.0010.0040.1520.0060.0110.0840.0990.552140.0010.0020.1510.0050.0070.0720.0810.536150.0010.0030.1440.0050.0090.0780.0890.540160.0020.0050.1780.0070.0130.0980.1170.496170.0010.0020.1620.0060.0090.0780.0880.472180.0010.0030.1050.0050.0080.0750.0850.596190.0010.0020.1540.0040.0070.0740.0830.508200.0010.0020.1740.0060.0090.0780.0870.476210.0010.0040.1260.0040.0090.0710.0870.556220.0010.0020.1420.0040.0070.0720.0810.524230.0010.0020.0970.0050.0080.0750.0840.612240.0010.0020.1810.0040.0070.0730.0830.540250.0010.0030.0720.0060.0090.0790.0890.576260.0010.0040.1040.0020.0050.0640.0790.596270.0010.0020.1020.0040.0060.0690.0780.604280.0010.0020.1570.0040.0070.0730.0830.560290.0010.0020.1320.0050.0080.0760.0850.604300.0010.0030.1060.0040.0060.0670.0770.492310.0010.0030.2010.0050.0080.0810.0910.604320.0010.0030.1720.0040.0070.0670.0780.540330.0010.0020.1480.0040.0070.0640.0740.496340.0010.0030.1060.0050.0080.0810.0910.492350.0020.0040.1630.0070.0120.0850.1010.588360.0010.0030.080.0040.0060.0680.0790.544370.0010.0030.1910.0060.0110.0790.0920.544380.0010.0020.0880.0050.0070.0770.0860.532390.0010.0030.1460.0060.0090.0790.0890.556400.0010.0040.140.0040.0080.0750.0890.536410.0010.0040.1520.0050.0100.0780.0930.536420.0010.0030.1020.0060.0090.0840.0940.488430.0010.0030.1560.0060.0100.0830.0940.536440.0010.0040.0520.0030.0070.0690.0840.544450.0010.0020.1050.0040.0060.0670.0760.560460.0010.0030.1790.0060.0100.0790.0910.552470.0010.0030.1640.0060.0090.0810.0920.508480.0010.0030.2410.0060.0100.0790.0910.564 73

PAGE 74

Theotherpartofthisdissertationissmallareaestimationwhentheresponseisbivariatebinaryinthepresenceofcovariates.WehaverstdevelopedanEBestimationprocedurefollowedbyahierarchicalBayesprocedure.Inthelattercase,wehavecovariateswhicharemissingatrandom.Wehaveshownthatboththemethodsperformmuchbetterthanthedirectestimatorsthroughasimulationstudyintheformercaseandrealdataanalysisinthelattersituation. Inthisdissertation,wehaveconsideredonlyarealevelsmallareaestimationmodelsforthecaseofrobustestimationofparameters.Itisworthwhiletoextendtheproposedapproachtounitlevelsmallareaestimationmodelsasconsideredforexamplein Batteseetal. ( 1988 )Amorechallengingtaskistoextendtheseideastonon-normalmodels,especiallyforsomeofthediscretedistributionsinvolvingbinary,countorcategoricaldata.Theposterioroftheregressioncoecientinsuchinstancesdoesnotcomeoutinaclosedform.Amoremanagablealternativeseemstoinvolveapredictionproblemwhereonedecidestolookatthepredictivedistributionofafutureobservation,andusethesametoderivetheinuenceofagivenobservation. Anaturalextensionoftherstpartofthisdissertationwouldbetodevelopsimilarrobustproceduresinthecaseofgenerallinearmixedmodels.Asnotedintheintroduction,thenormalmodelconsideredinthedissertationandalsotheother 74

PAGE 75

Anotherfutureundertakingwillbeextendingtheresultsforthebivariatebinarycaseinthemultivariatecontext.AlsowehavedevelopedEBtechniqueswhentherearenomissingcovariates.Itwillbeinterestingtoextendtheseresultswhensomeofthecovariatesaremissing. 75

PAGE 76

Nextwewrite Nowobservethat ^iREB~REBi=BiDiyixTi~b WerstshowthatxTi(^b~b)=Op(m1).Tothisend,webeginwiththeone-stepTaylorexpansion, ^b:=~b+(^AA)@~b 76

PAGE 77

sothatE[@~b Byassumption(i),(XT1X)1(V+A)1(XTX)1.Also,XT2XA2XTXandXT3XA3XTX.HencefromEq. A{6 ,V(@~b whereC(>0)isaconstant.ThisimpliesthatxTi(^b~b)=Op(m1).Accordingly,since wherej^DiDij=Op(m1 2)andxTi(^b~b)=Op(m1),itfollowsfromEq. A{8 that 77

PAGE 78

^iREB~REBi=(^BiBi)(yixTi~b)+^Bi[f(yixTi^bK^Di)(yixTi~bKDi)gI[yixTi~b>KDi]+f(yixTi^b+K^Di)(yixTi~b+KDi)gI[yixTi~bKDi]+(yixTi~b+KDi)I[yixTi~bKDi]I[yixTi~b
PAGE 79

^AA:=[mXj=1(Vj+A)2]1mXj=1(Vj+A)2f(yjxTjb)2(Vj+A)g;(A{14) andwrite (yixTi~b)2=[(yixTib)xTi(~bb)]2=(yixTib)2+Op(m1 2): Also,I[jyixTi~bjKDi]=I[jyixTibjKDi]+op(mr)forarbitrarilylarger>0.Since(^AA)2=Op(m1),wenowget NextbythemutualindependenceoftheyixiTb,(i=1;;m),andassumption(i)ofTheorem2-3, Cov[(^AA)2;(yixTib)2I[jyixTibjKDi]]:=[mXj=1(Vj+A)2]2Cov[fmXj=1(Vj+A)2f(yjxTjb)2(Vj+A)gg2;(yixTib)2I[jyixTibjKDi]]=[mXj=1(Vj+A)2]2Cov[(Vi+A)4f(yixTib)2(Vi+A)g2;(yixTib)2I[jyixTibjKDi]]]=O(m2):

PAGE 80

A{16 ,andthefactthatD2i=Vi+A+O(m1); E[(^AA)2(yixTi~b)2I[jyixTi~bjKDi]]=E(^AA)2E[(yixTib)2I[jyixTibjK(Vi+A)1 2]]+o(m1)=2[mXj=1(Vj+A)2]1(Vi+A)E[Z2I[jZjK]]+o(m1)=2[mXj=1(Vj+A)2]1(Vi+A)[2(K)12K(K)]+o(m1): FromEq. A{13 andEq. A{17 80

PAGE 81

(Vi+A)1 2(Bi+^BiBi)[f1+^AA Vi+AxTi 2f1xTi 2]=(Vi+A)1 2(Bi+^BiBi)[1+1 2^AA Vi+A1 8(^AA)2 2xTi 2xTi 2(Bi+^BiBi)[1 2^AA Vi+A1 8(^AA)2 2Bi1 2^AA Vi+A(Vi+A)1 2Bi1 8(^AA)2 2Vi(A^A) (Vi+A)(Vi+^A)1 2^AA Vi+A+op(m1)=Bi 2Bi 2Bi(^AA)2 2+op(m1)=Bi 25Bi 2+op(m1); 2Vi(A^A) (Vi+A)(Vi+^A)=Vi(A^A) (Vi+A)3 2(1+^AA Vi+A)1=Vi(A^A) (Vi+A)3 2[1^AA Vi+A+Op(m1)]=Vi[^AA 2+(^AA)2 2+op(m1)]=Bi^AA 2+Bi(^AA)2 2+op(m1):

PAGE 82

^Bi(^DiDi)+Di(^BiBi)=Bi 2+3Bi 2+op(m1); sothat 4B2i(Vi+A)1Ef(^AA)2gEfI[jyixTi~bj>KDi]g+o(m1)=B2i(Vi+A)1[mXj=1(Vj+A)2]1(K)+o(m1)=B2i(Vi+A)(1Bi)2[mXj=1(1Bj)2]1(K)+o(m1)=B2iD2i(1Bi)2[mXj=1(1Bj)2]1(K)+o(m1): CombiningEq. A{18 andEq. A{20 ,itfollowsthat Finally,byEq.2-13andEq. A{19 weget (~REBi^iB)(^iREB~REBi)=BifxTi(~bb)+(yixTi~bKDi)I[yixTi~b>KDi]+(yixTi~b+KDi)I[yixTi~bKDi]I[yixTi~b
PAGE 83

(~REBi^iB)(^iREB~REBi)=KB2i 23(^AA)2 2gf(yixTi~bKDi)I[yixTi~b>KDi](yixTi~b+KDi)I[yixTi~bKDi](yixTi~b+KDi)I[yixTi~bKDi]((yixTib+KDi)xTi(~bb))I[yixTi~bKDi]+op(m1 2))(yixTib+KDi)(I[yixTibKDi](yixTib+KDi)I[yixTibK(Vi+A)1 2](yixTib+K(Vi+A)1 2)I[yixTibK](Z21)(Z+K)I[Z
PAGE 84

Similarly,E[(^AA)2(Vi+A)3 2f(yixTi~bKDi)I[yixTi~b>KDi](yixTi~b+KDi)I[yixTi~bK](Z+K)I[ZK]]+o(m1)=4fmXj=1(Vj+A)2g1(Vi+A)1[(K)K(K)]+o(m1): Hence, CombiningEq. A{27 andEq. A{21 thetheoremfollows. 84

PAGE 85

(B{1) Hereb=(b1;b2;b3;b4),(b)/1and()/1.Now,letijl=xTijbl+vi;l=1;2;3;4;j=1;;ni;i=1;;k:Nowletusrstintegratetheposteriorw.r.t2: (c+vTv)(d+k)=2kYi=1niYj=1f(xij;obs;xij;misj) (B{2) ThenfromEq. B{2 andEq.4-10weget: (c+vTv)(d+k)=2kYi=1niYj=1f(xij;obs;xij;misj); wheref1,f2andf3aredenedinSection4.2.Forl=1;;4,1rq,let^xlijr=^alijfor(i;j)2Mrand~xlijr=^aln+ijfor(i;j)2MrTI2lifblr0;^xlijr=^blijfor(i;j)2Mrand~xlijr=^bln+ijfor(i;j)2MrTI2lifblr<0.Write^x(l)ij;miss=f^xlijr;r2Mijgfor(i;j)2Mrand~x(l)ij;miss=f~xlijr;r2Mijgfor(i;j)2MrTI2l;^x(l)ij=(xij;obs;^x(l)ij;miss)and~x(l)ij=(xij;obs;~x(l)ij;miss):Also,let~u(l)ij=(~x(l)ijT;eTi)Tand^u(l)ij=(^x(l)ijT;eTi)T:Then 85

PAGE 86

B{3 (c+vTv)(d+k)=2kYi=1niYj=1f(xij;obs;xij;misj) (B{4) Then, (c+vTv)(d+k)=2kYi=1niYj=1Xaijxijbijf(xij;obs;xij;misj)=C4Yl=1hY(i;j)2I1lf1(^u(l)ijTl)Y(i;j)2I2lf2(^u(l)ijTl)f3(~u(l)ijTl)i1 (c+vTv)(d+k)=2kYi=1niYj=1f(xij;obsj) (B{5) Let,

PAGE 87

wheredFl=dQ(i;j)2I1l(f1(ijl))Q(i;j)2I2l(f2(ijl))Q(i;j)2I2l(f3(ijl)),forl=1;;4: ChenandShao ( 2000 ),weknowthatunderconditions(C1)and(C2), ThenfromEq. B{5 weget,Z(b;;vjDobs)dbK4Yl=1nZRn+Nl2klkq (B{8) 87

PAGE 88

(c+vTv)(d+k)=21fkvkKklkgi4Yl=1dFlokYi=1niYj=1f(xij;obsj) (B{9) Then (c+vTv)(d+k)=21fkvkKklkgio4Yl=1dFlKZRn+N12ZRn+N22ZRn+N32ZRn+N42n4Yl=1klkqg4Yl=1dFlK4Yl=1ZRn+Nl2nX(i;j)2I1ljijljq+X(i;j)2I2ljijljq+X(i;j)2I2ljn+ij;ljqodFl<1(fromcondition(C3)) Hence, Hencethetheoremfollows. 88

PAGE 89

1+exp(uT1ij)+exp(uT2ij)#Y(i;j)2J2Xxij6;mis"f(xij6j6)exp(uT2ij) 1+exp(uT1ij)+exp(uT2ij)#Y(i;j)2J3Xxij6;mis"f(xij6j6)1 1+exp(uT1ij)+exp(uT2ij)#: For1r8;let^u1ijr=1ifr>0and^u1ijr=0ifr0for(i;j)2J1TM6;~u1ijr=0ifr>0and~u1ijr=1ifr0for(i;j)2J1SJ3TM6;^u2ijr=1ifr>0and^u2ijr=0ifr0for(i;j)2J2TM6;~u2ijr=0ifr>0and~u2ijr=1ifr0for(i;j)2J2SJ3TM6: 1+exp(^uT1ij)+exp(~uT2ij)Y(i;j)2J2exp(^uT2ij) 1+exp(~uT1ij)+exp(^uT2ij)Y(i;j)2J31 1+exp(~uT1ij)+exp(~uT2ij); wherethelaststepfollowssincePxij6;misf(xij6j6)=1:Now,letL(2;3jDobs)=Y(i;j)2J1exp(^uT1ij) 1+exp(^uT1ij)+exp(~uT2ij)Y(i;j)2J2exp(^uT2ij) 1+exp(~uT1ij)+exp(^uT2ij)Y(i;j)2J31 1+exp(~uT1ij)+exp(~uT2ij); 89

PAGE 90

1+exp(^uT1ij)+exp(~uT2ij)=Z10exphtij1+exp(^uT1ij)+exp((^u1ij~u2ij)T)idtij=Z10exp(t)exptij(exp(^uT1ij))exptij(exp((^u1ij~u2ij)T)dtij Similarly, 1+exp(^uT2ij)+exp(~uT1ij)=Z10exp(tij)hZ11(^uT2ij)vij)(dFtij(vij))Z11((^u2ij~u1ij)Twij)(dFtij(wij))idtij and,J3=1 1+exp(^uT2ij)+exp(~uT1ij)=Z10exp(tij)hZ11(~uT1ij)vij)(dFtij(vij))Z11(~uT2ijwij)(dFtij(wij))idtij: 90

PAGE 91

B{14 -Eq. B{16 inEq. B{13 andintegratingw.r.t.to=(T2;T3)T:ZR8L(jD)dZR8nY(i;j)2J1hZ10exp(tij)hZ11(^uT1ij)vij)(dFtij(vij))Z11((^u1ij~u2ij)Twij)(dFtij(wij))idtijiY(i;j)2J2hZ10exp(tij)hZ11(^uT2ij)vij)(dFtij(vij))Z11((^u2ij~u1ij)Twij)(dFtij(wij))idtijiY(i;j)2J3hZ10exp(tij)hZ11(~uT2ij)vij)(dFtij(vij))Z11(~uT1ijwij)(dFtij(wij))idtijiod=ZR8nZR+nexp(kXi=1niXj=1tij)hZR2n1(Uv)dFt(v)idtod wheredFt(v)=Qi;j(dFtij(vij))Qi;j(dFtij(wij)):Using ChenandShao ( 2000 ),underconditions(H1)and(H2),weget Now,(dFtij(vij))(dFtij(wij))=t2ijexp(vij)exp(tijexp(vij))exp(wij)exp(tijexp(wij)):

PAGE 92

(1+exp(vij)+exp(wij))3K1ZR2n(Xi;jv8ij+Xi;jw8ij)Yi;jexp(vij+wij) (1+exp(vij)+exp(wij))3<1 ButRR3Q(i;j)2Mc6f(xij6j6)d6<1underconditions(D1)and(D2)by(Theorem2.1)of ChenandShao ( 2000 ). Hencethetheoremfollows. 92

PAGE 93

Battese,G.E.,Harter,R.M.,andFuller,W.A.(1988),\Anerror-componentsmodelforpredictionofcountycropareasusingsurveyandsatellitedata,"JournalofAmericanStatisticalAssociation,83,28{36. Bishop,Y.M.M.,Fienberg,S.E.,andHolland,P.W.(1975),DiscreteMultivariateAnalysis:TheoryandPractice,TheMITPress,Cambridge,Mass.-London. Brackstone,G.J.(1987),\Smallareadata:policyissuesandtechnicalchallanges,"in Plateketal. ( 1987 ),pp.3{20. Chambers,R.L.(1986),\Outlierrobustnitepopulationestimation,"JournalofAmeri-canStatisticalAssociation,81,1063{1069. Chaudhuri,A.(1994),\Smalldomainstatistics:Areview,"StatisticaNeerlandica,48,215{236. Chen,M.H.,Ibrahim,J.G.,andShao,Q.M.(2004),\ProprietyofthePosteriorDistributionandExistenceoftheMLEforRegressionModelswithCovariatesMissingatRandom,"JournalofAmericanStatisticalAssociation,99,421{438. |(2006),\PosteriorproprietyandcomputationfortheCoxRegressionModelwithapplicationstoMissingCovariates,"Biometrika,93,791{807. Chen,M.H.andShao,Q.M.(2000),\ProprietyofPosteriorDistributionforDichotomousQuantalResponseModels,"ProceedingsoftheAmericanMathemati-calSociety,129,293{302. Chen,M.H.,Shao,Q.M.,andXu,D.(2002),\Necessaryandsucientconditionsontheproperietyofposteriordistributionsforgeneralizedlinearmixedmodels,"Sankhya,64,57{85. Cox,D.R.andSnell,E.J.(1968),\Ageneraldenitionofresiduals,"JournaloftheRoyalStatisticalSociety,SeriesB,30,248{275. Cressie,N.(1990),\Small-areapredictionofundercountusingthegenerallinearmodel,"StatisticsCanadaSymposium,93{105. Datta,G.S.andGhosh,M.(1991),\Bayesianpredictioninlinearmodels:applicationstosmallareaestimation,"AnnalsofStatistics,19,1748{1770. Datta,G.S.andLahiri,P.(1995),\RobusthierarchicalBayesestimationofsmallareacharacteristicsinthepresenceofcovariatesandoutliers,"JournalofMultivariateAnalysis,54,310{328. |(2000),\Auniedmeasureofuncertaintyofestimatedbestlinearunbiasedpredictorsinsmallareaestimationproblems,"StatisticaSinica,10,613{627. 93

PAGE 94

Datta,G.S.,Rao,J.N.K.,andSmith,D.D.(2005),\Onmeasuringthevariabilityofsmallareaestimatorsunderabasicarealevelmodel,"Biometrika,92,183{196. Drew,D.,Singh,M.P.,andChoudhry,G.H.(1982),\EvaluationofSmallAreaEstimationTechniquesfortheCanadianLabourForceSurvey,"SurveyMethodology,8,17{47. Efron,B.andMorris,C.(1971),\LimitingtheriskofBayesandempiricalBayesestimators.I.TheBayescase,"JournalofAmericanStatisticalAssociation,66,807{815. |(1972),\LimitingtheriskofBayesandempiricalBayesestimators.II.TheempiricalBayescase,"JournalofAmericanStatisticalAssociation,67,130{139. Erickson,E.P.andKadane,J.B.(1987),\Sensitivityanalysisoflocalestimatesofundercountinthe1980U.S.Census,"in Plateketal. ( 1987 ),pp.23{45. Farrell,P.J.,McGibbon,B.,andTomberlin,T.J.(1997),\EmpiricalBayesestimatorsofsmallareaproportionsinmultistagedesigns,"StatisticaSinica,7,1065{1083. Fay,R.E.andHerriot,R.A.(1979),\Estimatesofincomeforsmallplaces:anapplicationofJames-Steinprocedurestocensusdata,"JournalofAmericanStatisticalAssociation,74,269{277. Ghosh,M.andMaiti,T.(2004),\Small-areaestimationbasedonnaturalexponentialfamliyquadraticvariancefunctionmodelsandsurveyweights,"Biometrika,91,95{112. Ghosh,M.,Nangia,N.,andKim,D.H.(1996),\Estimationofmedianincomeoffour-personfamilies:aBayesianapproach,"JournalofAmericanStatisticalAssoci-ation,91,1423{1431. Ghosh,M.,Natarajan,K.,Stroud,T.W.F.,andCarlin,B.P.(1998),\Generalizedlinearmodelsforsmall-areaestimation,"JournalofAmericanStatisticalAssociation,93,273{282. Ghosh,M.,Natarajan,K.,Waller,L.A.,andKim,D.H.(1999),\HierarchicalBayesGLMsfortheanalysisofspatialdata:anapplicationtodiseasemapping,"JournalofStatisticalPlanningandInference,75,305{318. Ghosh,M.andRao,J.N.K.(1994),\Smallareaestimation:anappraisal,"StatisticalScience,9,55{93. Gilks,W.R.andWild,P.(1992),\AdaptiveRejectionSamplingforGibbsSampling,"AppliedStatistics,41,337{348. Godambe,V.P.andThompson,M.E.(1989),\Anextensionofquasi-likelihoodestimation,"JournalofStatisticalPlanningandInference,22,137{172. Gwet,J.-P.andRivest,L.-P.(1992),\Outlierresistantalternativestotheratioestimator,"JournalofAmericanStatisticalAssociation,87,1174{1182.

PAGE 95

Harter,R.M.andFuller,W.(1987),\TheMultivariatecomponentsofvariancemodelinsmallareaestimation,"in Plateketal. ( 1987 ),pp.103{123. Ibrahim,J.G.,Lipsitz,S.R.,andChen,M.-H.(1999),\MissingCovariatesinGeneralizedLinearModelsWhentheMissingDataMechanismIsNonignorable,"JournaloftheRoyalStatisticalSociety,SeriesB,61,173{190. Jiang,J.andLahiri,P.(2001),\EmpiricalBestPredictionforSmallAreaInferencewithBinaryData,"AnnalsoftheInstituteofStatisticalMathematics,53,217{243. Kackar,R.N.andHarville,D.A.(1984),\Approximationsforstandarderrorsofestimatorsofxedandrandomeectsinmixedlinearmodels,"JournalofAmeri-canStatisticalAssociation,79,853{862. Lahiri,P.andMaiti,T.(2002),\EmpiricalBayesestimationofrelativerisksindiseasemapping,"CalcuttaStatisticalAssociationBulletin,53,213{223. Lahiri,P.andRao,J.N.K.(1995),\RobustEstimationofMeanSquaredErrorofSmallAreaEstimators,"JournalofAmericanStatisticalAssociation,82,758{766. Lipsitz,S.R.andIbrahim,J.G.(1996),\AConditionalModelforIncompleteCovariatesinParametricRegressionModels,"Biometrika,83,916{922. Little,R.J.A.andRubin,D.B.(2002),StatisticalAnalysisofMissingData,Wiley,NewYork,2nded. Maiti,T.(1998),\HierarchicalBayesEstimationofMortalityRatesforDiseaseMapping,"JournalofStatisticalPlanningandInference,69,339{348. Malec,D.,Sedransk,J.,Moriarity,C.L.,andLeClere,F.B.(1997),\SmallAreaInferenceforBinaryVariablesintheNationalHealthInterviewSurvey,"JournalofAmericanStatisticalAssociation,92,815{826. McGibbon,B.andTomberlin,T.J.(1989),\SmallareaestimatesofproportionsviaempiricalBayestechniques,"SurveyMethodology,15,237{252. Nandram,B.,Sedransk,J.,andPickle,L.(1999),\BayesiananalysisofmortalityratesforU.S.healthserviceareas,"Sankhya,61,145{165. Pfeermann,D.(2002),\Smallareaestimation-newdevelopmentsanddirections,"InternationalStatisticalReview,15,237{252. Pfeermann,D.andBurck,L.(1990),\Robustsmallareaestimationcombiningtimeseriesandcross-sectionaldata,"SurveyMethodology,16,217{237. Platek,R.,Rao,J.,Sarndal,S.,andSingh,M.(eds.)(1987),Wiley,NewYork. Prasad,N.G.N.andRao,J.N.K.(1990),\Theestimationofmeansquarederrorsofsmallareaestimators,"JournalofAmericanStatisticalAssociation,85,163{171.

PAGE 96

Rao,J.N.K.(1986),\SyntheticEstimators,SPREEandModelBasedPredictors,"inProceedingsoftheConferenceonSurveyResearchMethodsinAgriculture,U.S.DepartmentofAgriculture,,Washington,D.C.,pp.1{16. |(1999),\Somerecentadvancesinmodel-basedsmallareaestimation,"SurveyMethod-ology,25,175{186. |(2003),SmallAreaEstimation,JohnWileyandSons. Rao,J.N.K.andYu,M.(1992),\Smallareaestimationbycombiningtimeseriesandcross-sectionaldata,"ASAProceedingsoftheSurveyResearchMethodsSection,1{19. Schaible,W.L.(1978),\Choosingweightsforcompositeestimatorsforsmallareastatistics,"ASAProceedingsoftheSectiononSurveyResearchMethods,741{746. Zaslavsky,A.M.,Schenker,N.,andBelin,T.R.(2001),\DownweightingInuentialClustersinSurveys:Applicationtothe1990PostEnumerationSurvey,"JournalofAmericanStatisticalAssociation,96,858{8629. Zhao,L.P.andPrentice,R.L.(1990),\Correlatedbinaryregressionusingaquadraticexponentialmodel,"Biometrika,77,642{648.

PAGE 97

AnanyaRoywasborninKolkata,India,onthe24thofDecember,1978.Shegraduatedwithabachelor'sdegreefromCalcuttaUniversityin2000withrstclasshonoursinstatistics.ShethenjoinedtheIndianStatisticalInstitute,fromwhereshereceivedherMasterofStatisticsdegreein2002.ShemovedtotheUniversityofFloridaatGainesvilleinthefallof2002topursueherdoctoralstudiesinstatistics.Upongraduation,shejoinedtheUniversityofNebraskaatLincolnasanAssistantProfessorintheDepartmentofStatistics. 97