Hierarchical Bayes analysis for continuous and discrete data


Material Information

Title:
Hierarchical Bayes analysis for continuous and discrete data
Physical Description:
vii, 65 leaves : ; 29 cm.
Language:
English
Creator:
Natarajan, Kannan, 1966-
Publication Date:

Subjects

Subjects / Keywords:
Statistics thesis Ph.D
Dissertations, Academic -- Statistics -- UF
Genre:
bibliography   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1993.
Bibliography:
Includes bibliographical references (leaves 61-64).
Statement of Responsibility:
by Kannan Natarajan.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001950889
oclc - 31209698
notis - AKC7431
System ID:
AA00003251:00001

Full Text









HIERARCHICAL BAYES ANALYSIS FOR CONTINUOUS AND DISCRETE
DATA













By

KANNAN NATARAJAN


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1993














To my parents and teachers












ACKNOWLEDGEMENTS


I would like to express my sincere gratitude to Professor Malay Ghosh for being
my advisor and for all the attention I received from him for the past four years.
Words simply cannot express how grateful I am for his patience, encouragement and
invaluable guidance. Without his help it would not have been possible to complete
the work. I was extremely fortunate to work under him as his research assistant for
most of my years in the graduate program. Also, I would like to thank Professors
Alan Agresti, Richard L. Scheaffer, Ramon Littell and Patrick Thompson for their
encouragement and advice while serving on my committee.
I am also grateful and highly indebted to my alma mater Indian Statistical
Institute, Calcutta, India, for all the support I received. I would like to take this
opportunity to thank all the Professors in ISI, to whom I definitely owe a lot for
my basic understanding of statistics. I would like to express my appreciation to
Professor Bikas K. Sinha for his genuine interest in me. Also, I would like to
acknowledge the help and support I received as an undergraduate from the professors
in the Department of Statistics, Loyola College, Madras, India.
Much gratitude is owed to my parents, sisters and brothers-in-law, whose sup-
port, advice, guidance and prayers throughout the years of my life have made this
achievement possible. Very special thanks are offered to my best friends Atalanta
Ghosh and Sofia Paul for their support, friendship and love. There are numerous
other friends who made my years in graduate school so memorable and wonderful.
Their friendship will never be forgotten.














TABLE OF CONTENTS




ACKNOWLEDGEMENTS

ABSTRACT

CHAPTERS

1 INTRODUCTION

  1.1 Literature Review
  1.2 The Subject of this Dissertation

2 ADJUSTMENT OF 1990 CENSUS UNDERCOUNT: A HIERARCHICAL BAYES APPROACH

  2.1 Introduction
  2.2 Hierarchical Bayes Model and Gibbs Sampling
  2.3 Adjustment of 1990 Census Data

3 REFINEMENT OF QUALITY MEASUREMENT PLAN

  3.1 Introduction
  3.2 Notations and Assumptions
  3.3 Hierarchical Bayes Model
  3.4 An Example

4 BAYESIAN ANALYSIS OF CATEGORICAL SURVEY DATA

  4.1 Introduction
  4.2 Generalized Linear Models for Two-Stage Sampling Within Strata
  4.3 Analysis of Multi-Category Data
  4.4 An Example

5 SUMMARY AND FUTURE RESEARCH

  5.1 Summary
  5.2 Future Research

BIBLIOGRAPHY

BIOGRAPHICAL SKETCH













Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment
of the Requirements for the Degree of
Doctor of Philosophy


HIERARCHICAL BAYES ANALYSIS FOR CONTINUOUS AND DISCRETE
DATA


By

Kannan Natarajan

December 1993

Chairman: Malay Ghosh
Major Department: Statistics

This dissertation considers several problems where hierarchical Bayes methodology

is used for obtaining estimates and the associated standard errors. Although the

methods are presented in the context of some specific problems, they are fairly general

in nature and can easily be adapted to other related problems as well.

The first problem is related to the adjustment of census undercount. Adjustment

of the decennial census counts in the United States has been a topic of heated debate

for more than a decade. Many statisticians, including some within the Bureau of the

Census, have recognized the importance of a model based approach for adjustments.

In this dissertation, we present a multivariate hierarchical linear model and also relax

many of the assumptions which have been the subject of criticism. In particular,

we have devised a computer-intensive fully Bayesian procedure which uses Monte Carlo

numerical integration techniques like the Gibbs sampler. This eliminates the need for







assuming sample variance-covariance matrices of the adjustment factors to be known

which were hitherto assumed in any Bayesian or non-Bayesian analysis.

The second specific problem is related to the Quality Measurement Plan (QMP),

a plan implemented for reporting the quality assurance audit results to Bell System

management. An important function of the Bell Laboratories Quality Assurance

Center and the Western Electric Quality Assurance Directorate is to audit the quality

of the products manufactured and the services provided by the Western Electric

Company to determine if the intended quality standards are met. Starting with

the seventh period of 1980, the QMP was implemented. The QMP is based on an

empirical Bayes model of the audit-sampling process. It uses the past sample indices

but makes an inference about current quality. However, parts of the derivation of

QMP are heuristic, including the derivation of the posterior distribution of the current

population index, the parameter of interest. Here, we present a hierarchical Bayes

model, which avoids the ad hoc approximations made in the derivation of the QMP.

The third problem deals with the Bayesian analysis of categorical survey data.

Much of the earlier work deals with Bayesian analysis for data in binary fashion,

where presence or absence of a specific response is considered. In the case of multi-

category data, there will be times, however, when one would like to analyze the

responses jointly, arriving at a posterior covariance matrix for the response pattern

rather than just a variance for one alternative at a time. Here, a hierarchical Bayesian

approach is used to estimate finite population proportions under two-stage sampling

within strata based on generalized linear models. In particular, for data on items

containing three or more possible responses, a hierarchical Bayesian analysis based

on a Poisson model for counts is provided. A Monte Carlo method, the Gibbs sampler,

has been used to overcome the computational limitations that have plagued Bayesian

analysis for years. The main technique is illustrated using Canada Youth and AIDS

Study data.














CHAPTER 1

INTRODUCTION



1.1 Literature Review


Empirical and hierarchical Bayes methods are becoming increasingly popular in

statistics, especially in the context of simultaneous estimation of several parameters.

For example, agencies of the federal government have been involved in obtaining

estimates of per capita income, unemployment rates, crop yields and so forth simul-

taneously for several state and local government areas. In such situations, quite often

estimates of certain area means, or simultaneous estimates of several area means can

be improved by incorporating information from similar neighboring areas. Examples

of this type are especially suitable for empirical Bayes (EB) analysis. As described in

Berger (1985), an EB scenario is one in which known relationships among the coordinates of the parameter vector, say $\theta = (\theta_1, \ldots, \theta_p)^T$, allow use of the data to estimate some features of the prior distribution. Such problems occur quite frequently in statistics. One such situation is when the $\theta_i$'s arise from some common population; one can then imagine a probabilistic model for that population and interpret it as the prior distribution. For example, one may have reason to believe that the $\theta_i$'s are iid from a prior $\pi_0(\cdot \mid \lambda)$, where $\pi_0$ is structurally known except possibly for some unknown parameter $\lambda$. A parametric empirical Bayes (EB) procedure is one in which $\lambda$ is estimated from the marginal distribution of the observations.







Closely related to the EB procedure is the hierarchical Bayes (HB) procedure

which models the prior distribution in stages. In the first stage, conditional on $\lambda$, the $\theta_i$'s are iid with a prior $\pi_0(\cdot \mid \lambda)$. In the second stage, a prior distribution (often improper) is assigned to $\lambda$. This is an example of a two-stage prior. The idea can be

generalized to multistage priors, but will not be pursued in this dissertation.

It is apparent that both the EB and the HB procedures recognize the uncertainty

in the prior information. Whereas the HB procedure models the uncertainty in the

prior information by assigning a distribution (often noninformative or improper) to

the prior parameters (usually called hyperparameters), the EB procedure attempts to

estimate the unknown hyperparameters, typically by some classical method such as

the method of moments or method of maximum likelihood, etc., and use the resulting

estimated priors for inferential purposes. In the context of point estimation, both

methods often lead to comparable results. However, when it comes to the question

of measuring the standard errors associated with these estimators, the HB method

has a clear edge over a naive EB method. Empirical Bayes theory by itself does

not indicate how to incorporate the hyperparameter estimation error in the analysis.

The HB analysis incorporates such errors automatically and hence is generally the

more reasonable of the two approaches. Also, there are no clear-cut measures of standard errors

associated with EB point estimators. But the same is not true with HB estimators. To

be precise, if one estimates the parameter of interest by its posterior mean, then a very

natural estimate of the risk associated with this estimator is its posterior variance.

Estimates of the standard errors associated with EB point estimators usually need

an ingenious approximation (see, e.g., Morris, 1981, 1983), whereas the posterior

variances associated with the HB estimators, though often complicated, can be found

exactly.
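The point can be made explicit with the usual conditional-variance identity; the display below is a standard decomposition for a generic hyperparameter $\lambda$, written here for illustration rather than quoted from the dissertation:

$$V(\theta \mid y) = E\bigl[\,V(\theta \mid y, \lambda) \mid y\,\bigr] + V\bigl[\,E(\theta \mid y, \lambda) \mid y\,\bigr].$$

A naive EB analysis reports only the first component evaluated at an estimate $\hat\lambda$, namely $V(\theta \mid y, \hat\lambda)$, while the HB posterior variance retains the second component, which carries the hyperparameter uncertainty.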

Berger (1985) observes, in addition to the proper accounting of the hyperparameter estimation error, two more advantages of using the HB procedure. There are often available both structural prior information (leading to the first stage prior structure) and subjective prior information about the location of $\theta$. The hierarchical Bayes approach allows the use of both types of information, and this can be especially valuable for smaller $p$. Also, another advantage of the HB approach is that it easily produces more information about the posterior distribution, such as the posterior covariances, which would require considerable work to derive in a sophisticated empirical Bayes fashion.

The term hierarchical Bayes was first used by Good (1965). Lindley and Smith

(1972) called such priors multistage priors. The latter used the idea very effectively for

estimating the vector of normal means, as well as the vector of regression coefficients.

Indeed, Lindley and Smith (1972) reanalyzed the usual linear statistical model using

Bayesian methods and the concept of exchangeability. They find estimates in a linear

model that substantially improve over the usual estimates derived by the method of

least squares, by exploiting the available prior information about the parameters.

There is a huge literature on hierarchical Bayes analysis for a wide range of prob-

lems, in the case of continuous data. Much of the literature for continuous data

deals with the estimation of parameters of the normal distribution. Ghosh (1992)

reviews and unifies the hierarchical and empirical Bayes approach for estimating the

multivariate normal mean. To handle the case of heavy tailed priors of the normal

distribution, Datta and Lahiri (1992) and Angers and Berger (1991) used t-priors

viewing them as scale mixture of normals.

Hierarchical Bayes methodology has also been implemented in improving small area estimators. Empirical Bayes or the variance components approach has been considered for simultaneous estimation of the parameters for several small areas (strata), where each stratum contains a finite number of elements, by Fay and Herriot (1979), Ghosh and Meeden (1986), Ghosh and Lahiri (1987), Battese, Harter and Fuller (1988), and Prasad and Rao (1990). Ghosh and Lahiri (1992) and Datta and Ghosh






(1991) proposed HB procedures as an alternative to the EB procedures for small area

estimation problems.

HB procedures also have been used for discrete data in specific contexts. George,

Makov and Smith (1992) provide a Bayesian hierarchical analysis of the pump failure

data, previously analyzed by Gaver and O'Muircheartaigh (1987) in an empirical

Bayes fashion, by using a Poisson-Gamma hierarchical model. Albert (1988) provides

a Bayesian hierarchical generalized linear model (GLM) for the assessment of the

goodness of fit of the GLM and the estimation of the mean $\mu_i$ of the random variable

from an exponential family. He also discusses tractable accurate approximations

for the posterior calculations. The GLM hierarchical model of Albert (1988) is a

generalization of the normal hierarchical model of Lindley and Smith (1972). Leonard

and Novick (1986) used exchangeable and log-linear hierarchical models for Poisson

data while modelling the structure of an r x s contingency table and for drawing

marginal inferences about all parameters in the model. Zeger and Karim (1991) cast

the generalized linear random effects model in a hierarchical Bayesian framework. The

methodology is illustrated through a simulation study and an analysis of infectious

disease data by fitting a logistic-normal random effects model.

From a calculational perspective the comparison of the HB approach versus the

EB approach previously was something of a toss-up. EB theory requires solving

likelihood equations, while the HB approach requires numerical integration, often

multi-dimensional. In the past, the use of the HB approach was hampered by the

need for multi-dimensional integration. The usual numerical integration tools are

not very reliable in high dimensions. Tierney and Kadane (1986), Kass, Tierney and

Kadane (1989), Kass and Steffey (1989) have used Laplace's method of approximat-

ing marginal posterior densities and moments. The proposed method, like the EB

approach, requires solving likelihood equations instead of numerical integration. In







recent years, with the advent of fast computers, Monte Carlo numerical integration

techniques like the Gibbs sampler have become very popular.

By now a large body of literature has evolved dealing with small area estimation

problems. One specific problem of small area estimation is related to adjustment of

the census undercount. Ericksen and Kadane (1985,1987) proposed a model-based

approach toward adjustment of census counts. They advocated shrinking the adjust-

ment factors calculated as the ratio of the 1980 census post enumeration survey (PES)

estimates to the census figures toward some suitable regression model similar to the

ones considered in Fay and Herriot (1979) and Morris (1983). The model considered

by Ericksen and Kadane (1985) is univariate. But Datta et al. (1992), in their article

on the 1988 Missouri Dress Rehearsal data discussed a multivariate generalization of

the model. These authors developed procedures that were used to model data from

the 1990 census and the subsequent PES and smooth survey-based estimates of the

adjustment factors.

Hoadley (1981) developed a plan, implemented for reporting the quality assur-

ance audit results to Bell System management, called the Quality Measurement Plan

(QMP). The QMP is based on an empirical Bayes model of the audit-sampling pro-

cess. It represents a considerable improvement in the statistical power for detecting

substandard quality as compared with the old rules based on the T-rate system,

evolved from the work of Dodge and others. It uses the past indices but makes an

inference about current quality.

Another specific problem of small area estimation is related to categorical survey

data. Unlike the frequentist approach, the prior structure assumed by the Bayesian

approach enables the estimation of the population parameters in cells which contain

no data. Stroud (1991) provided a hierarchical-conjugate Bayesian analysis, encom-

passing simple random, stratified, cluster and two-stage sampling, as well as two-stage







sampling within strata, for data in binary fashion, where presence or absence of a spe-

cific response is considered. The main technique was illustrated using a small subset

of Canada Youth and AIDS Study data.

1.2 The Subject of this Dissertation


This dissertation considers several problems where hierarchical Bayes methodology

is used for obtaining estimates and the associated standard errors. Although the

methods are presented in the context of some specific problems, they are fairly general

in nature, and can easily be adapted to other related problems as well.

In Chapter 2, we discuss a model-based approach towards adjustment of the 1990

census data. A hierarchical Bayes procedure is proposed, which overcomes many

of the criticisms levelled against the Bayesian procedures of earlier authors like Er-

icksen and Kadane (1985,1987) and Datta et al. (1992). In particular, we have

devised a computer-intensive fully Bayesian procedure which uses Monte Carlo nu-

merical integration techniques like the Gibbs sampler. This eliminates the need for

assuming sample variance-covariance matrices of the adjustment factors to be known

which were hitherto assumed in any Bayesian or non-Bayesian analysis. The find-

ings also indicate that some of the standard errors one obtains by assuming known

sample variance-covariance matrices may result in serious underestimation in com-

parison with what one would have obtained when the uncertainty of such matrices

was modelled appropriately.

In Chapter 3, we provide a hierarchical Bayes refinement of Hoadley's Quality

Measurement Plan (QMP), which has been severely criticized on several grounds.

The HB procedure proposed will avoid the ad hoc approximations needed in Hoadley's

original procedure. Also, the method proposed will provide another illustration of the

Markov chain Monte Carlo integration technique, Gibbs sampling, which has gained

popularity over recent years.






Chapter 4 addresses the Bayesian analysis of categorical survey data, where the

data are classified into several (not necessarily two) categories. A hierarchical Bayes

procedure is used for the analysis of such data. More generally, a complete HB

analysis is given for two-stage sampling within strata based on generalized linear

models. The computational limitation of multi-dimensional integration which has

plagued Bayesian analysis for years is overcome with the use of the Monte Carlo

integration method, the Gibbs sampler. The main technique is illustrated using

Canada Youth and AIDS Study data.














CHAPTER 2

ADJUSTMENT OF 1990 CENSUS UNDERCOUNT: A HIERARCHICAL BAYES APPROACH



2.1 Introduction


Adjustment of census counts has been a topic of heated debate for nearly a decade.

The 1980 counts were never officially adjusted due to a decision of the then commerce

secretary, Mr. Robert Mosbacher. However, in several lawsuits brought against

the Bureau of the Census by different states and cities who demanded revision of

the reported counts, the topic of adjustment came up repeatedly in the courtroom

testimony of statisticians appearing as expert witnesses on both sides. The issue was

again hotly discussed and debated in subsequent scientific publications (see Ericksen

and Kadane, 1985; and Freedman and Navidi, 1986). It is clear from these discussions

that the statistics community is sharply divided within itself regarding the desirability

of adjusting census counts.

Far from being over, the issue has resurfaced with the appearance of the 1990

census data. Once again secretary Moshbacher announced on July 15, 1991, that the

results of the 1990 census would not be adjusted, thus overturning the Census Bureau

recommendation to use "adjusted" census data. Almost immediately after this, the

city of New York and others brought a lawsuit seeking to overturn the decision of the

Commerce Secretary. The case was tried in the courtroom of Federal Judge Joseph M.

McLaughlin of Manhattan during May, 1992, and a verdict is yet to come. However,







it is clear that there is yet no consensus even among statisticians on whether or not

to adjust the counts.

The objective of this chapter is not to deal with the pros and cons of adjustment

but instead to introduce a methodology that can be used for adjustment of the 1990

census if needed. The present method is a refinement and generalization of the

previous work of Datta et al. (1992) where hierarchical and empirical Bayes methods

were proposed for adjusting census data. While Datta et al. (1992) analyzed the 1988

Missouri Dress Rehearsal data, the present chapter analyzes the actual 1990 census

data.

Like other proponents of adjustment, we agree that the 1990 post enumeration

survey (PES) data collected in August 1990 forms the basis of adjustment. The 1990

PES is a sample of 170,000 housing units in 5,400 sample block clusters, each cluster

being either one block or a collection of several small blocks. To be useful, the PES

results must be generalized to nonsampled blocks. With this end, the population

is divided into several groups or poststrata. The census count is known for each

such poststratum, while the PES estimates the corresponding true population. The

ratio of the PES estimate of the true population to the census count is known as the

adjustment factor. The construction of poststrata has undergone several revisions

with the original proposal of 1392 poststrata being now replaced by 357 poststrata.

The detailed description of the latest poststrata appears in Section 2.3.

We begin at the point where a set of estimated raw adjustment factors and their

variances for the different poststrata are available for modelling based on the 1990

census and the subsequent PES. We introduce in Section 2.2 a hierarchical linear

model for this purpose, and relax many of the assumptions which have hitherto been

the subject of criticism. Ericksen and Kadane (1985) were the first proponents of

hierarchical models. Many of the earlier criticisms levelled against their procedure

were taken into account in Datta et al. (1992). However, the latter did not model the







sample variance-covariance matrix of the adjustment factors which are estimates and

thus bear uncertainty. The present chapter models this uncertainty as well. Since

the pairwise sample correlation coefficients of the adjustment factors between the

different poststrata are much smaller compared to the variances, they are not taken

into account in the present analysis.

Section 2.3 contains the actual analysis of the data. We obtain in this section

the smoothed adjustment factors and the associated standard errors. We use the

Gibbs sampling Monte-Carlo integration technique to carry out the Bayesian analy-

sis. This is in sharp contrast to the previous work of Datta et al. (1992) which used

a simple one-dimensional numerical integration subroutine. Due to unknown vari-

ances, the present Bayesian analysis involves high-dimensional numerical integration

which seems impossible to carry out without resort to some Monte-Carlo integration

technique.

There are some important (though not surprising) consequences of the analysis

of our data. First, the two sets of point estimates of the true adjustment factors

are very close whether we use the present hierarchical Bayes (HB) procedure or the

one of Datta et al. (1992). However, for most of the poststrata the standard errors

obtained by the present method are 1.5 to 2 times higher than those obtained by the

earlier method. From a statistical point of view, this additional variability can be

explained very easily from the fact of modelling the sample variance-covariance matrix

rather than treating it as fixed. Also, our findings lend some support to Dr. Fay's

testimony before Judge McLaughlin (see Fienberg, 1992, p. 35) that the variances of

the adjustment factors were understated by a factor of 1.7 to 3.0.

We conclude this section by saying that we believe in the need for adjustment of

census data and that a Bayesian analysis is suitable for this purpose. However, to

achieve greater robustness, a full HB analysis as done in this chapter is much preferred

to a subjective Bayes analysis.






2.2 Hierarchical Bayes Model and Gibbs Sampling


Suppose there are $m$ poststrata. Let $Y_i$ denote the sample adjustment factor for the $i$th poststratum, and $\theta_i$ the corresponding true adjustment factor. Also, let $V_i$ denote the sample variance for the $i$th poststratum $(i = 1, \ldots, m)$, and let $\psi_i$ denote the true sampling variance of $Y_i$.

The following HB model is considered.

I. Conditional on $\theta_1, \ldots, \theta_m$, $\psi_1, \ldots, \psi_m$, $\beta$ and $\sigma^2$, the $Y_i$'s and $V_i$'s are mutually independent with $Y_i \stackrel{ind}{\sim} N(\theta_i, \psi_i)$ and $n_i V_i / \psi_i \stackrel{ind}{\sim} \chi^2_{n_i}$;

II. $\theta_1, \ldots, \theta_m \mid \psi_1, \ldots, \psi_m, \beta, \sigma^2 \stackrel{ind}{\sim} N(x_i^T \beta, \sigma^2)$;

III. Marginally, $\beta$, $\sigma^2$, $\psi_1, \ldots, \psi_m$ are mutually independent with $\beta \sim$ Uniform$(R^p)$, $Z = (\sigma^2)^{-1} \sim$ Gamma$(\frac{1}{2}c, \frac{1}{2}d)$ and the $\xi_i = \psi_i^{-1}$'s $\sim$ Gamma$(\frac{1}{2}a, \frac{1}{2}b)$. [A random variable $W$ is said to have a Gamma$(\alpha, \beta)$ distribution if it has a pdf of the form $f(w) \propto \exp(-\alpha w)\, w^{\beta - 1} I_{(0,\infty)}(w)$, where $I$ denotes the usual indicator function.] We allow the possibility of diffuse priors for $Z$ or the $\xi_i$'s, for example $a = c = 0$, etc. Note that the above hierarchical model is suitable for other contexts as well, for example in the estimation of income of small places as considered by Fay and Herriot (1979). These authors considered an alternative empirical Bayes approach for this problem.

We shall use the notations $Y = (Y_1, \ldots, Y_m)^T$, $\theta = (\theta_1, \ldots, \theta_m)^T$, $X^T = (x_1, \ldots, x_m)$. Then the posterior distribution of $\theta$ given $Y = y$ and $V_i = v_i$ $(i = 1, \ldots, m)$ is obtained as follows:

(i) conditional on $Y = y$, $V_i = v_i$, $\xi_i$ $(i = 1, \ldots, m)$ and $Z = z$, $\theta \sim N_m(E^{-1}\Lambda y,\, E^{-1})$, where $\Lambda = \mathrm{diag}(\xi_1, \ldots, \xi_m)$ and $E = \Lambda + z\{I_m - X(X^TX)^{-1}X^T\}$;

(ii) conditional on $Y_i = y_i$ and $V_i = v_i$ $(i = 1, \ldots, m)$, $Z, \xi_1, \ldots, \xi_m$ have joint pdf

$$f(z, \xi_1, \ldots, \xi_m \mid y_1, \ldots, y_m, v_1, \ldots, v_m) \propto |E|^{-1/2} \exp\Bigl[-\tfrac{1}{2}\, y^T(\Lambda - \Lambda E^{-1}\Lambda)\, y\Bigr]\, z^{\frac{1}{2}(m - p + d) - 1} \exp\bigl(-\tfrac{1}{2}cz\bigr) \prod_{i=1}^{m}\Bigl\{\xi_i^{\frac{1}{2}(n_i + b + 1) - 1} \exp\bigl(-\tfrac{1}{2}\xi_i(n_i v_i + a)\bigr)\Bigr\}.$$


Finding the posterior distribution of $\theta$ through (i) and (ii) requires evaluation of $(m+1)$-dimensional integrals. The task becomes quite formidable even for moderate $m$. Rather than using multidimensional numerical integration, we use Monte Carlo numerical integration to generate the posterior distributions and associated means and variances. More specifically, we use Gibbs sampling, originally introduced in Geman and Geman (1984), and more recently popularized by Gelfand and Smith (1990) and Gelfand et al. (1990). Gibbs sampling is described below.

Gibbs sampling is a Markovian updating scheme. Given an arbitrary starting set of values $U_1^{(0)}, \ldots, U_k^{(0)}$, we draw $U_1^{(1)} \sim [U_1 \mid U_2^{(0)}, \ldots, U_k^{(0)}]$, $U_2^{(1)} \sim [U_2 \mid U_1^{(1)}, U_3^{(0)}, \ldots, U_k^{(0)}]$, $\ldots$, $U_k^{(1)} \sim [U_k \mid U_1^{(1)}, \ldots, U_{k-1}^{(1)}]$, where $[\,\cdot \mid \cdot\,]$ denotes the relevant conditional distributions. Thus, each variable is visited in the natural order and a cycle in this scheme requires $k$ random variate generations. After $t$ such iterations, one arrives at $(U_1^{(t)}, \ldots, U_k^{(t)})$. As $t \to \infty$, $(U_1^{(t)}, \ldots, U_k^{(t)}) \stackrel{d}{\to} (U_1, \ldots, U_k)$. Gibbs sampling through $q$ replications of the aforementioned $t$ iterations generates $q$ iid $k$-tuples $(U_{1j}^{(t)}, \ldots, U_{kj}^{(t)})$ $(j = 1, \ldots, q)$. $U_1, \ldots, U_k$ could possibly be vectors in the above scheme.
Using Gibbs sampling, the joint posterior pdf of $\theta_1, \ldots, \theta_m$ is approximated by

$$q^{-1}\sum_{j=1}^{q}\Bigl[(\theta_1, \ldots, \theta_m) \mid y_1, \ldots, y_m, v_1, \ldots, v_m, \xi_i = \xi_{ij}^{(t)}, z = z_j^{(t)}, \beta = \beta_j^{(t)}, i = 1, \ldots, m\Bigr]. \qquad (2.2.1)$$

To estimate the posterior moments, we use Rao-Blackwellized estimates as in Gelfand and Smith (1991). Notice that

$$E(\theta_i \mid y_1, \ldots, y_m, v_1, \ldots, v_m, \xi_1, \ldots, \xi_m, \beta, z) = (1 - B_i) y_i + B_i x_i^T\beta,$$

where $B_i = z/(z + \xi_i)$, $i = 1, \ldots, m$. This is approximated by

$$q^{-1}\sum_{j=1}^{q}\Bigl[(1 - B_{ij}^{(t)})\, y_i + B_{ij}^{(t)}\, x_i^T\beta_j^{(t)}\Bigr], \qquad (2.2.2)$$

where, as before, $t$ denotes the number of iterations needed to generate a sample. Next, noting that

$$V(\theta_i \mid y_1, \ldots, y_m, v_1, \ldots, v_m)$$
$$= E[\,V(\theta_i \mid y_1, \ldots, y_m, v_1, \ldots, v_m, \xi_1, \ldots, \xi_m, \beta, z) \mid y_1, \ldots, y_m, v_1, \ldots, v_m\,]$$
$$\quad + V[\,E(\theta_i \mid y_1, \ldots, y_m, v_1, \ldots, v_m, \xi_1, \ldots, \xi_m, \beta, z) \mid y_1, \ldots, y_m, v_1, \ldots, v_m\,]$$
$$= E[\,(z + \xi_i)^{-1} \mid y_1, \ldots, y_m, v_1, \ldots, v_m\,] + V[\,(1 - B_i) y_i + B_i x_i^T\beta \mid y_1, \ldots, y_m, v_1, \ldots, v_m\,], \qquad (2.2.3)$$

one approximates the same by

$$q^{-1}\sum_{j=1}^{q}\bigl(z_j^{(t)} + \xi_{ij}^{(t)}\bigr)^{-1} + q^{-1}\sum_{j=1}^{q}\bigl(B_{ij}^{(t)}\bigr)^2\bigl(y_i - x_i^T\beta_j^{(t)}\bigr)^2 - \Bigl[q^{-1}\sum_{j=1}^{q} B_{ij}^{(t)}\bigl(y_i - x_i^T\beta_j^{(t)}\bigr)\Bigr]^2. \qquad (2.2.4)$$

The Gibbs sampling analysis is based on the following posterior distributions:

(i) $\beta \mid y, \theta, \xi_1, \ldots, \xi_m, v_1, \ldots, v_m, z \sim N_p\bigl((X^TX)^{-1}X^T\theta,\; z^{-1}(X^TX)^{-1}\bigr)$;

(ii) $z \mid y, v_1, \ldots, v_m, \xi_1, \ldots, \xi_m, \beta, \theta \sim$ Gamma$\Bigl(\tfrac{1}{2}\bigl(c + \sum_{i=1}^{m}(\theta_i - x_i^T\beta)^2\bigr),\; \tfrac{1}{2}(d + m)\Bigr)$;

(iii) $\xi_i \mid y, \theta, v_1, \ldots, v_m, \beta, z \stackrel{ind}{\sim}$ Gamma$\Bigl(\tfrac{1}{2}\bigl(a + (y_i - \theta_i)^2 + n_i v_i\bigr),\; \tfrac{1}{2}(n_i + b + 1)\Bigr)$;

(iv) $\theta_1, \ldots, \theta_m \mid y, v_1, \ldots, v_m, \xi_1, \ldots, \xi_m, \beta, z \stackrel{ind}{\sim} N\bigl((1 - B_i) y_i + B_i x_i^T\beta,\; (z + \xi_i)^{-1}\bigr)$.

We investigate in the next section how this approach leads to smoothed adjustment factors for the 1990 census.
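The full conditionals (i)-(iv) translate directly into a sampler. The sketch below, in Python, runs the scheme on simulated inputs; the dimensions, the simulated data and the diffuse choice $a = b = c = d = 0$ are illustrative stand-ins and do not reproduce the Bureau's data or the dissertation's computations.

import numpy as np

rng = np.random.default_rng(1)

# --- illustrative simulated inputs (not the census data) ---
m, p = 48, 9                          # poststrata, regression coefficients
X = rng.normal(size=(m, p))           # design matrix (rows x_i^T)
y = rng.normal(1.02, 0.02, size=m)    # raw adjustment factors Y_i
v = rng.uniform(1e-4, 4e-4, size=m)   # sample variances V_i
n = rng.integers(20, 200, size=m)     # degrees of freedom n_i
a = b = c = d = 0.0                   # diffuse gamma priors

# --- starting values ---
theta, beta = y.copy(), np.zeros(p)
z, xi = 1.0 / np.var(y), 1.0 / v
XtX_inv = np.linalg.inv(X.T @ X)

def gibbs_cycle(theta, beta, z, xi):
    # (i)  beta | ... ~ N_p((X'X)^{-1} X' theta, z^{-1}(X'X)^{-1})
    beta = rng.multivariate_normal(XtX_inv @ X.T @ theta, XtX_inv / z)
    # (ii) z | ... ~ Gamma(rate=(c + sum(theta_i - x_i'beta)^2)/2, shape=(d+m)/2)
    resid = theta - X @ beta
    z = rng.gamma(shape=(d + m) / 2, scale=2.0 / (c + resid @ resid))
    # (iii) xi_i | ... ~ Gamma(rate=(a + (y_i - theta_i)^2 + n_i v_i)/2,
    #                          shape=(n_i + b + 1)/2)
    rate = (a + (y - theta) ** 2 + n * v) / 2
    xi = rng.gamma(shape=(n + b + 1) / 2, scale=1.0 / rate)
    # (iv) theta_i | ... ~ N((1 - B_i) y_i + B_i x_i'beta, (z + xi_i)^{-1})
    B = z / (z + xi)
    theta = rng.normal((1 - B) * y + B * (X @ beta), np.sqrt(1.0 / (z + xi)))
    return theta, beta, z, xi

# q independent replicates of t = 50 cycles, in the spirit of Section 2.3
draws = []
for _ in range(200):                  # 200 replicates here; the text uses 2500
    th, be, zz, xx = theta, beta, z, xi
    for _ in range(50):
        th, be, zz, xx = gibbs_cycle(th, be, zz, xx)
    draws.append((th, be, zz, xx))

# Rao-Blackwellized point estimate (2.2.2) for each poststratum
B_draws = np.array([zz / (zz + xx) for _, _, zz, xx in draws])
fit = np.array([X @ be for _, be, _, _ in draws])
hb_est = np.mean((1 - B_draws) * y + B_draws * fit, axis=0)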







2.3 Adjustment of 1990 Census Data


As mentioned in the introduction, the latest adjustment factors are available for

357 poststrata. We now give a brief description of what these poststrata are. First,

for non-Hispanic white and other owners, there are four geographic areas under con-

sideration: (i) northeast, (ii) south, (iii) midwest and (iv) west. Each geographic area

is then divided into (a) urbanized areas with population 250,000+, (b) other urban

areas, and (c) nonurban area. This leads to 12 strata. Similarly for non-Hispanic

white and other non-owners (renters), there are 12 such strata. Next black owners in

urbanized areas with population 250,000+ are classified into four strata according to

four geographic areas. However, black owners in other urban areas are collapsed into

one stratum as are black owners in nonurban areas. This leads to 6 strata for black

owners. Similarly each category of black nonowners, nonblack Hispanic owners, and

nonblack Hispanic nonowners is divided into 6 strata following the same pattern used

in the construction of strata for the black owners.

So far we have reached a total of 12+12+6+6+6+6 = 48 strata. Added to these

are 3 strata containing (i) Asian and Pacific-Islander owners, (ii) Asian and Pacific-

Islander nonowners, and (iii) American Indians on reservations. This leads to a total

of 51 strata. Each such stratum is now cross-classified with 7 age-sex categories: (a)

0-17 (males and females), (b) 18-29 (males), (c) 18-29 (females), (d) 30-49 (males),

(e) 30-49 (females), (f) 50+ (males), and (g) 50+ (females). This leads to a total of

51 x 7 = 357 poststrata.

The set of adjustment factors and the sample variances are available for all the

357 poststrata. However, for performing the HB analysis, we have not taken into

account the last three categories of (i) Asian and Pacific-Islander owners, (ii) Asian

and Pacific-Islander nonowners, and (iii) American Indians on reservations, as it

is generally felt that these categories should not be merged with the rest, and an







HB analysis combines information from all the sources in computing the smoothed

adjustment factors. This leads to a HB analysis based on 336 poststrata. We do not

report that analysis here but discuss instead the results of a simpler analysis based on

48 poststrata where all seven age-sex categories are pooled into one. Even with this

simplification, the main messages of this chapter, namely the need for (i) smoothing

the adjustment factors and (ii) providing more reliable estimates of the associated

standard errors, are clearly conveyed in our analysis.

We consider the hierarchical model as given in Section 2.2 with a=b=c=d=0 to

ensure some form of diffuse gamma priors for the inverse of the variance components

in our model. The results, however, are not very sensitive to the choice of a, b, c, d

as long as some version of diffuse prior is used. Next, the ni's, the degrees of freedom

for the $\chi^2$ distribution associated with $V_i$ in the $i$th poststratum, represent the P-sample size (the number of persons counted in the PES) in the $i$th poststratum divided by some factor, here 300. We admit the ad hockery of the number 300, but feel that

division by some such factor is essential to perform some meaningful analysis. The

design matrix X provided to us from the Bureau of the Census was obtained via best

subsets regression and is of the form

$$X^T = (x_1, \ldots, x_{48}),$$

where each xi is a nine component column-vector with the first element equal to 1,

the second element equal to the indicator for nonowner, the third and the fourth

elements equal to the indicators for black and Hispanic respectively, the fifth and

the sixth elements denoting, respectively, the indicators for an urbanized area with a

population of 250,000+ and a nonurbanized area, and finally the seventh, eighth and

ninth elements denoting, respectively, the indicator or proportion in northeast, south

and west.
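As an illustration of how each row of $X$ is assembled from the poststratum labels, the short sketch below builds the nine-component vector just described. The attribute names are invented for the example, the Bureau's actual coding is not reproduced, and the sketch uses 0/1 indicators throughout even though the text notes that collapsed strata carry proportions rather than indicators in the regional positions.

def design_row(tenure, race, urbanicity, region):
    """Build the 9-component x_i of Section 2.3 (labels are illustrative)."""
    return [
        1.0,                                            # intercept
        1.0 if tenure == "nonowner" else 0.0,           # nonowner indicator
        1.0 if race == "black" else 0.0,                # black indicator
        1.0 if race == "hispanic" else 0.0,             # Hispanic indicator
        1.0 if urbanicity == "urbanized250k" else 0.0,  # urbanized area, 250,000+
        1.0 if urbanicity == "nonurban" else 0.0,       # nonurbanized area
        1.0 if region == "northeast" else 0.0,
        1.0 if region == "south" else 0.0,
        1.0 if region == "west" else 0.0,               # midwest is the baseline
    ]

# e.g. a black nonowner poststratum in a large urbanized southern area
x_i = design_row("nonowner", "black", "urbanized250k", "south")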

The HB analysis was performed by using the Gibbs sampler. In performing the

analysis, we have taken t (the number of iterations needed to generate a sample)







equal to 50, while the number of samples is taken as 2500. The stability in the point

estimates of the adjustment factors is achieved once a sample of 1500 is generated

while stability in the associated standard errors is achieved once a sample of 2500 is

generated.

The results of the HB analysis are reported in Table 2.1 which provides the adjust-

ment factors (Y), the corresponding standard errors (SD.Y), the smoothed adjustment

factors using the hierarchical model of Section 2.2 (HB1), the associated standard er-

rors (SD.HB1), the smoothed adjustment factors using the model of Datta et al.

(HB2) and the associated standard errors (SD.HB2) for all the 48 poststrata.

It is clear from Table 2.1 that both the present method and the one of Datta et

al. essentially lead to the same point estimates of the adjustment factors and both

the methods lead to substantial reduction in the standard errors. However, in most

of the 48 poststrata, the estimated standard errors obtained by the present method

(SD.HB1) are 1.5 to 2 times (sometimes even more) larger than the ones of Datta

et al. (SD.HB2). A few exceptions are poststrata 12, 25, 27, 28, 31-34 where the

estimated standard errors using the present method are lower than the ones using the

model of Datta et al. This is somewhat surprising, and we do not have an intuitive

explanation of this phenomenon as yet.

We conclude with the assertion that a model-based approach for smoothing the

adjustment factors is strongly recommended. Also, hierarchical modelling is particu-

larly well-suited to meet this need.







TABLE 2.1. RAW ADJUSTMENT FACTORS, HB ESTIMATORS

AND STANDARD ERRORS


I    Y    SD.Y    HB1    SD.HB1    HB2    SD.HB2

[Entries for poststrata 1-48 are not reproduced here: the column layout of the scanned table could not be recovered reliably.]







Table 2.2 HB1 and HB2


I    HB1    SD.HB1    HB2    SD.HB2


[Entries for poststrata 1-24 are not reproduced here: the column layout of the scanned table could not be recovered reliably.]







Table 2.2 (continued)




25 1.0108 .00565 1.0071 .00839
26 1.0242 .00649 1.0202 .00649
27 1.0151 .00580 1.0120 .00660
28 1.0234 .00578 1.0230 .00634
29 1.0230 .00671 1.0198 .00551
30 1.0271 .00836 1.0226 .00505
31 1.0445 .00754 1.0448 .00948
32 1.0579 .00665 1.0578 .00754
33 1.0489 .00697 1.0496 .00777
34 1.0570 .00714 1.0604 .00722
35 1.0561 .00712 1.0568 .00647
36 1.0605 .00935 1.0598 .00592
37 1.0132 .00524 1.0107 .00465
38 1.0267 .00463 1.0239 .00298
39 1.0176 .00492 1.0156 .00357
40 1.0258 .00461 1.0265 .00297
41 1.0255 .00439 1.0249 .00285
42 1.0282 .00526 1.0262 .00304
43 1.0469 .00752 1.0483 .00550
44 1.0604 .00541 1.0614 .00385
45 1.0513 .00660 1.0532 .00430
46 1.0595 .00667 1.0640 .00358
47 1.0584 .00619 1.0618 .00324
48 1.0621 .00770 1.0644 .00316














CHAPTER 3
REFINEMENT OF QUALITY MEASUREMENT PLAN



3.1 Introduction


The primary responsibility of Bell Laboratories Quality Assurance Center (QAC)

is to maintain quality requirements in the communication products designed by Bell

Laboratories, manufactured by Western Electric Company, Incorporated, and then

marketed to Bell System operating companies. In order to meet this responsibility,

the QAC conducts quality assurance audits on the products along with its Western

Electric agents, the Quality Assurance Directorate (QAD) and Purchased Products

Inspection (PPI) organizations.

Quality assurance audits are a structured system of inspections done on a sampling

basis by inspectors in production processes in order to report product quality to the

management. The audits are based on defects, defectives or demerits. Each sampled

product is inspected, and the defects are assessed whenever the product fails to meet

engineering requirements. The results are then compared to a quality standard, a

target value reflecting a tradeoff between manufacturing cost, operating costs and

customer needs. For audits based on defects or defectives, the standards are expressed

in terms of defects or defectives per unit. For audits based on demerits, the standards

are derived from fundamental defects-per-unit counts of A, B, C, D type defects (see

Hoadley, 1981).







The Quality Measurement Plan (QMP), developed by Hoadley (1981), is a statis-

tical method for analyzing discrete quality audit data which consist of the expected

number of defects given standard quality. This plan was implemented for report-

ing the quality assurance audit results to Bell system management starting with the

seventh period of 1980. The QMP is based on an empirical Bayes model of the audit-

sampling process. It uses the past sample indices, but makes inference about the

current quality. The method represented a considerable improvement in the statisti-

cal power for detecting substandard quality as compared with the old rules based on

the T-rate system, evolved from the work of Shewhart, Dodge and others, starting in

the 1920s.

In spite of its wide publicity, QMP has been criticized on several grounds (see

for example Barlow and Irony, 1992). The main criticism is that Hoadley's original

procedure is at best heuristic, and a full Bayesian implementation of the procedure

will require high-dimensional numerical integration. Hoadley's original procedure

involves a Poisson likelihood with a gamma prior with the parameters of the gamma

prior estimated from the marginal likelihood (after integrating with respect to the

parameters of interest). However, the empirical Bayes versions of posterior means and

variances as given by Hoadley (1981) are based on the assumption of independence

of certain variables which usually fails to hold, especially for small samples. These

points will be made specific in Section 3.3.

The primary objective of this chapter is to provide a hierarchical Bayes (HB)

refinement of Hoadley's QMP. Such a HB procedure will avoid the ad hoc approxi-

mations needed in Hoadley's solution. Second, the present method will provide yet

another illustration of the powerful Markov chain Monte Carlo integration technique

which is gaining rapid popularity in recent years.

The outline of the remaining sections is as follows. In Section 3.2, we provide

the notations and assumptions needed to describe the QMP model. In Section 3.3,







we describe the HB model and contrast it with Hoadley's (1981) model. Based on

the present hierarchical model, we have found the posterior distributions of the pa-

rameters of interest as well as the posterior means and variances by using the Gibbs

sampling technique (Gelfand and Smith, 1990; Gelfand et al., 1990). The Gibbs sam-

pling method requires generating samples from different posterior distributions. In

our derivation, one of the posterior distributions is known only up to a multiplicative

constant. Accordingly, an accept-reject algorithm is used to generate samples from

such a posterior. However, since this posterior turns out to be log-concave, we have

been able to use the adaptive rejection sampling algorithm of Gilks and Wild (1992).

A similar application of adaptive rejection sampling appears in George, Makov and

Smith (1993). Besides giving the formulas for posterior means and variances, we

have provided in this section a brief description of the Gibbs sampler as well as a

description of the adaptive rejection sampling scheme. Finally, as a possible approxi-

mate solution, we have also discussed the Laplace approximation method (see Tierney

and Kadane, 1986; Kass, Tierney and Kadane, 1989; Kass and Steffey, 1989) in the

present context.

Section 3.4 contains the actual analysis of the data. We have provided the Bayes

estimates and the associated standard errors of the current quality index using the

HB model introduced in Section 3.3. We have also shown that the Laplace method

may provide a poor approximation in this situation due to a heavily skewed posterior

density.

Irony et al. (1992) have recently used an additive model and a multiplicative model

as alternatives to QMP. The additive model deals with production processes that de-

grade as time goes by (processes that age for instance). The multiplicative model

is appropriate for processes that improve with time (e.g. processes that depend on

learning). In contrast, the QMP model of Hoadley assumes that the process average, say $\theta$, although unknown, is fixed. In reality, however, $\theta$ may be changing over time;






to handle this, the QMP procedure uses a moving window of six periods of data to

infer on the current quality index. This underscores the importance of small sample

inference associated with QMP procedures.


3.2 Notations and Assumptions


Suppose there are $T$ rating periods: $t = 1, \ldots, T$, where $T$ is the current period. For

period t, we have the following data from the audit:

nt = audit sample size;

zt = number of defects in the audit sample;

s = standard number of defects per unit;

et = snt = expected number of defects in the audit sample when the quality standard

is met;

It = xt/et = defect index of the current sample.

ASSUMPTIONS: $X_t \sim$ Poisson$(n_t\lambda_t)$, where $\lambda_t$ is the defect rate per unit. Reparameterize $\lambda_t$ as $\theta_t = \lambda_t/s$ = quality index at rating period $t$. Then $\theta_t = 1$ is the standard value and also $X_t \mid \theta_t \sim$ Poisson$(e_t\theta_t)$.

The parameter of interest is $\theta_T$, the current quality index. The objective is to derive the posterior distribution of $\theta_T$ given the data $x$, which include the past data $(x_1, \ldots, x_{T-1})$ and the current data $x_T$.
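To make the bookkeeping concrete, the short sketch below computes the expectancy $e_t$ and the defect index $I_t$ from invented audit figures; the numbers are made up purely for illustration.

# Illustrative audit record for one rating period (numbers are invented).
n_t = 500          # audit sample size
s = 0.04           # standard number of defects per unit
x_t = 29           # defects observed in the audit sample

e_t = s * n_t      # expected defects when the quality standard is met
I_t = x_t / e_t    # defect index of the current sample

print(e_t, I_t)    # 20.0 and 1.45: observed defects exceed the standard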

3.3 Hierarchical Bayes Model


The following hierarchical Bayes (HB) model is considered:

I. Conditional on $\theta_1, \ldots, \theta_T$, $\alpha$ and $\beta$, $X_t \stackrel{ind}{\sim}$ Poisson$(e_t\theta_t)$;

II. Conditional on $\alpha$ and $\beta$, $\theta_t \stackrel{iid}{\sim}$ Gamma$(\alpha, \beta)$, where a Gamma$(\alpha, \beta)$ variable, say $Z$, has pdf $f(z \mid \alpha, \beta) = \exp(-\alpha z)\, z^{\beta - 1}\alpha^{\beta}/\Gamma(\beta)$, $z > 0$, $\alpha > 0$, $\beta > 0$;

III. Marginally $\alpha$ and $\beta$ have joint pdf

$$\pi(\alpha, \beta) \propto \alpha^{-1}\beta^{-a} \qquad (a > 1).$$

While doing the data analysis in Section 3.4, we shall consider several choices of $a$.

Jeffreys' prior is most widely used as a noninformative prior. This prior is proportional to the positive square root of the determinant of the expected Fisher information matrix. For the Gamma$(\alpha, \beta)$ density, the expected Fisher information matrix is given by

$$I(\alpha, \beta) = \begin{pmatrix} E\Bigl(-\dfrac{\partial^2 \log f}{\partial \alpha^2}\Bigr) & E\Bigl(-\dfrac{\partial^2 \log f}{\partial \alpha\,\partial \beta}\Bigr) \\[2mm] E\Bigl(-\dfrac{\partial^2 \log f}{\partial \beta\,\partial \alpha}\Bigr) & E\Bigl(-\dfrac{\partial^2 \log f}{\partial \beta^2}\Bigr) \end{pmatrix} = \begin{pmatrix} \beta/\alpha^2 & -1/\alpha \\ -1/\alpha & \partial^2 \log\Gamma(\beta)/\partial\beta^2 \end{pmatrix}.$$

Hence, Jeffreys' prior for $(\alpha, \beta)$ is given by

$$\pi(\alpha, \beta) \propto |I(\alpha, \beta)|^{1/2} = \frac{1}{\alpha}\Bigl\{\beta\,\frac{\partial^2 \log\Gamma(\beta)}{\partial\beta^2} - 1\Bigr\}^{1/2}. \qquad (3.3.1)$$

This prior has limited practical utility due to the appearance of the complicated trigamma function. However, using Stirling's approximation

$$\Gamma(\beta) \approx \sqrt{2\pi}\, e^{-\beta}\beta^{\beta - 1/2},$$

we get

$$\frac{\partial^2 \log\Gamma(\beta)}{\partial\beta^2} \approx \frac{1}{\beta} + \frac{1}{2\beta^2}. \qquad (3.3.2)$$

Substitution of (3.3.2) into (3.3.1) yields

$$\pi(\alpha, \beta) \propto \frac{1}{\alpha}\Bigl\{\beta\Bigl(\frac{1}{\beta} + \frac{1}{2\beta^2}\Bigr) - 1\Bigr\}^{1/2} \propto \alpha^{-1}\beta^{-1/2}. \qquad (3.3.3)$$

However, this leads to an improper posterior for the $\theta_t$'s. To avoid this, we take a prior of the form given in III.







The present hierarchical model is closely akin to a similar model of George, Makov

and Smith (1993). The difference occurs at the third stage of the hierarchical model

where George et al. use proper independent gamma priors for a and /3, whereas we

are using some diffuse gamma priors instead. We prefer to use the present class of

priors for this problem due to lack of prior elicitation, making subjective analysis
more difficult to justify.

Based on the present hierarchical model, a subjective Bayesian approach to find the posterior distribution of $\theta_T$ given $x$ proceeds as follows:

(i) $\theta_T \mid x, \alpha, \beta \sim$ Gamma$(e_T + \alpha,\; x_T + \beta)$; $\qquad$ (3.3.4)

(ii) $p(\alpha, \beta \mid x) \propto \beta^{-a}\alpha^{T\beta - 1}\prod_{t=1}^{T}(\alpha + e_t)^{-(x_t + \beta)}\prod_{t=1}^{T}\{\Gamma(x_t + \beta)/\Gamma(\beta)\}$. $\qquad$ (3.3.5)

Lemma 3.1. Suppose $x_t \geq 1$ for all $t = 1, \ldots, T$. Then $\int_0^\infty\!\!\int_0^\infty p(\alpha, \beta \mid x)\, d\alpha\, d\beta < \infty$ provided $\sum_{t=1}^{T} x_t > T > a > 1$.

Proof of Lemma 3.1. Note $p(\alpha, \beta \mid x) \propto h(\alpha, \beta)$, where

$$h(\alpha, \beta) = \beta^{-a}\alpha^{T\beta - 1}\prod_{t=1}^{T}(\alpha + e_t)^{-(x_t + \beta)}\prod_{t=1}^{T}\{\Gamma(x_t + \beta)/\Gamma(\beta)\}.$$

In what follows, we shall use the notation $K\ (>0)$ for a generic constant which may depend on $x$, but not on $\alpha$ and $\beta$.

First, using Stirling's bounds for factorials, for $\beta \geq \beta_0 > 0$,

$$\prod_{t=1}^{T}\{\Gamma(\beta + x_t)/\Gamma(\beta)\} \leq K\prod_{t=1}^{T}\beta^{x_t}\Bigl(1 + \frac{x_t}{\beta}\Bigr)^{\beta + x_t + \frac12} e^{-x_t} \leq K\,\beta^{\sum_t x_t}\exp\Bigl\{\sum_t x_t(x_t + 2)/\beta_0\Bigr\} \leq K\,\beta^{\sum_t x_t},$$

since $(\beta + x_t + \tfrac12)\log(1 + x_t/\beta) - x_t \leq x_t(x_t + 2)/\beta_0$ for $\beta \geq \beta_0$.

For $0 < \beta < \beta_0$,

$$\Gamma(\beta + x_t + 1)/\Gamma(\beta + 1) = \int_0^\infty e^{-z}z^{\beta + x_t}\, dz \Big/ \int_0^\infty e^{-z}z^{\beta}\, dz = E_\beta(Z^{x_t}),$$

where $Z \sim$ Gamma$(1, \beta + 1)$. Using the MLR property of gamma distributions, $E_\beta(Z^{x_t})$ is increasing in $\beta$, so that $E_\beta(Z^{x_t}) \leq E_{\beta_0}(Z^{x_t})$. Now, since $\Gamma(\beta + x_t)/\Gamma(\beta) = \beta\,\Gamma(\beta + x_t + 1)/\{(\beta + x_t)\Gamma(\beta + 1)\}$,

$$\prod_{t=1}^{T}\{\Gamma(\beta + x_t)/\Gamma(\beta)\} \leq K\,\beta^{T}\prod_{t=1}^{T}x_t^{-1} \leq K\,\beta^{T} \qquad \text{for } 0 < \beta < \beta_0.$$

Hence, writing $c = \sum_{t=1}^{T}[\log(\alpha + e_t) - \log\alpha]$, so that $\alpha^{T\beta}\prod_t(\alpha + e_t)^{-\beta} = e^{-c\beta}$, it follows that

$$\int_0^\infty h(\alpha, \beta)\, d\beta \leq K\,\alpha^{-1}\prod_{t=1}^{T}(\alpha + e_t)^{-x_t}\Bigl[\int_0^{\beta_0}e^{-c\beta}\beta^{T - a}\, d\beta + \int_{\beta_0}^{\infty}e^{-c\beta}\beta^{\sum_t x_t - a}\, d\beta\Bigr]$$
$$\leq K\,\alpha^{-1}\prod_{t=1}^{T}(\alpha + e_t)^{-x_t}\Bigl[c^{-(T - a + 1)} + c^{-(\sum_t x_t - a + 1)}\Bigr]$$
$$\leq K\,\alpha^{-1}(\alpha + e_{\min})^{-\sum_t x_t}\Bigl[\{\log(\alpha + e_{\min}) - \log\alpha\}^{-T + a - 1} + \{\log(\alpha + e_{\min}) - \log\alpha\}^{-\sum_t x_t + a - 1}\Bigr] \qquad (3.3.6)$$

(since $c = \sum_t[\log(\alpha + e_t) - \log\alpha] \geq T\{\log(\alpha + e_{\min}) - \log\alpha\}$, where $e_{\min} = \min_t e_t$, and $\sum_t x_t > a$ and $T > a$).

Consider an interior point $d$ of $(0, \infty)$ with $d \geq e_{\min}$. Using $\log(\alpha + e_{\min}) - \log\alpha = \log(1 + e_{\min}/\alpha) \geq e_{\min}/(2\alpha)$ for $\alpha \geq e_{\min}$,

$$\int_d^\infty \alpha^{-1}(\alpha + e_{\min})^{-\sum_t x_t}\{\log(\alpha + e_{\min}) - \log\alpha\}^{-T + a - 1}\, d\alpha \leq K\int_d^\infty \alpha^{-1 - \sum_t x_t}\Bigl(\frac{e_{\min}}{2\alpha}\Bigr)^{-T + a - 1} d\alpha$$
$$\leq K\int_d^\infty \alpha^{T - a - \sum_t x_t}\, d\alpha < \infty \qquad (3.3.7)$$

(since $\sum_t x_t > T$ and $a > 1$). Similar calculations yield

$$\int_d^\infty \alpha^{-1}(\alpha + e_{\min})^{-\sum_t x_t}\{\log(\alpha + e_{\min}) - \log\alpha\}^{-\sum_t x_t + a - 1}\, d\alpha \leq K\int_d^\infty \alpha^{-a}\, d\alpha < \infty. \qquad (3.3.8)$$

Now, observe that

$$\int_0^d \alpha^{-1}(\alpha + e_{\min})^{-\sum_t x_t}\{\log(\alpha + e_{\min}) - \log\alpha\}^{-T + a - 1}\, d\alpha$$
$$\leq (e_{\min})^{-\sum_t x_t + 1}\int_0^d \alpha^{-1}(\alpha + e_{\min})^{-1}\{\log(\alpha + e_{\min}) - \log\alpha\}^{-T + a - 1}\, d\alpha$$
$$= \frac{(e_{\min})^{-\sum_t x_t}}{T - a}\{\log(d + e_{\min}) - \log d\}^{-T + a} < \infty \qquad (\text{since } T > a), \qquad (3.3.9)$$

on noting that $d\{\log(\alpha + e_{\min}) - \log\alpha\}/d\alpha = -e_{\min}/\{\alpha(\alpha + e_{\min})\}$ and that $\log(\alpha + e_{\min}) - \log\alpha \to \infty$ as $\alpha \to 0$. Again,

$$\int_0^d \alpha^{-1}(\alpha + e_{\min})^{-\sum_t x_t}\{\log(\alpha + e_{\min}) - \log\alpha\}^{-\sum_t x_t + a - 1}\, d\alpha$$
$$\leq \frac{(e_{\min})^{-\sum_t x_t}}{\sum_t x_t - a}\{\log(d + e_{\min}) - \log d\}^{-\sum_t x_t + a} < \infty \qquad (\text{since } \textstyle\sum_t x_t > a). \qquad (3.3.10)$$

Combine (3.3.6)-(3.3.10) to get $\int_0^\infty\!\!\int_0^\infty h(\alpha, \beta)\, d\alpha\, d\beta < \infty$.

The above result ensures (due to (3.3.4) and (3.3.5)) that the posterior pdf of $\theta_T$ given $x$ is a proper pdf under the same condition. If this is the case, then using (3.3.4), the posterior mean and the posterior variance are given by

$$E[\theta_T \mid x] = E[E(\theta_T \mid \alpha, \beta, x) \mid x] = E[(x_T + \beta)(e_T + \alpha)^{-1} \mid x]; \qquad (3.3.11)$$

$$V[\theta_T \mid x] = E[V(\theta_T \mid x, \alpha, \beta) \mid x] + V[E(\theta_T \mid x, \alpha, \beta) \mid x]$$
$$= E[(x_T + \beta)(e_T + \alpha)^{-2} \mid x] + V[(x_T + \beta)(e_T + \alpha)^{-1} \mid x]. \qquad (3.3.12)$$

Hoadley (1981) uses the notation $\theta = \beta/\alpha$, the prior mean of the $\theta_t$'s, which he calls the "process average", and $\gamma = \beta/\alpha^2$, the prior variance, which he calls the "process variance". Writing $w_T = \alpha/(e_T + \alpha)$, it follows from (3.3.11) that

$$E[\theta_T \mid x] = E[(1 - w_T)I_T + w_T\theta \mid x]. \qquad (3.3.13)$$

If the prior parameters $\alpha$ and $\beta$ were known, as in a subjective Bayes analysis, then the posterior mean would be $(1 - w_T)I_T + w_T\theta$, a weighted average of the current defect index $I_T$ and the process average $\theta$. If $\alpha$ is small compared to $e_T$, i.e. the sample evidence outweighs the prior evidence, then the weighted average leans more toward $I_T$, the current index. The opposite is the case when $\alpha$ is large compared to $e_T$.

Hoadley starts with (3.3.13) but, unlike our stage III, does not assume any hyperprior for $\alpha$ and $\beta$. Instead he estimates $\alpha$ and $\beta$ from the marginal distribution of $x$ after integrating out $\theta_1, \ldots, \theta_T$. In this way an approximation $(1 - \hat w_T)I_T + \hat w_T\hat\theta$ for the expression on the right hand side of (3.3.13) can be made, where $\hat w_T$ and $\hat\theta$ are EB estimators of $w_T$ and $\theta$ respectively. But Hoadley (see his p. 233) seems to argue $E(w_T\theta \mid x) \approx E(w_T \mid x)\, E(\theta \mid x)$ and then approximates each of $E(w_T \mid x)$ and $E(\theta \mid x)$. The posterior uncorrelation of $w_T$ and $\theta$ does not hold in general. Next note that

$$V[(x_T + \beta)(e_T + \alpha)^{-1} \mid x] = V[(1 - w_T)I_T + w_T\theta \mid x] = V[w_T(\theta - I_T) \mid x]$$
$$= V[w_T(\theta - \theta_B + \theta_B - I_T) \mid x], \qquad (3.3.14)$$

where $\theta_B = E(\theta \mid x)$. Hoadley approximates the right hand side of (3.3.14) by $\hat w_T^2 V(\theta \mid x) + (\theta_B - I_T)^2 V(w_T \mid x)$. This approximation is much more questionable, since neglecting Cov$(w_T(\theta - \theta_B),\, w_T(\theta_B - I_T) \mid x)$ may be too much of a sacrifice. Second, the approximation of $V(w_T(\theta - \theta_B) \mid x)$ by $\hat w_T^2 V(\theta \mid x)$ does not take into account the posterior dependence of $w_T$ and $\theta$.

The above does not undermine Hoadley's novel contribution. The main difficulty that he faced was that finding the posterior distribution of $\theta_T$ given $x$ using the hierarchical Bayes model requires multidimensional integrals. The usual numerical integration tools are not very reliable in high dimensions. Monte Carlo numerical integration was not very popular in those days.






In the present study, we use Monte Carlo numerical integration to generate the posterior distributions and associated means and variances. More specifically, we use Gibbs sampling, originally introduced in Geman and Geman (1984), and more recently popularized by Gelfand and Smith (1990) and Gelfand et al. (1990). The method is described below.

Gibbs sampling is a Markovian updating scheme. Given an arbitrary starting set of values $U_1^{(0)}, \ldots, U_p^{(0)}$, we draw $U_1^{(1)} \sim [U_1 \mid U_2^{(0)}, \ldots, U_p^{(0)}]$, $U_2^{(1)} \sim [U_2 \mid U_1^{(1)}, U_3^{(0)}, \ldots, U_p^{(0)}]$, $\ldots$, $U_p^{(1)} \sim [U_p \mid U_1^{(1)}, \ldots, U_{p-1}^{(1)}]$, where $[\,\cdot \mid \cdot\,]$ denotes the relevant conditional distributions. Thus, each variable is visited in the natural order and a cycle in this scheme requires $p$ random variate generations. After $k$ such iterations, one arrives at $(U_1^{(k)}, \ldots, U_p^{(k)})$. As $k \to \infty$, $(U_1^{(k)}, \ldots, U_p^{(k)}) \stackrel{d}{\to} (U_1, \ldots, U_p)$. Gibbs sampling through $q$ replications of the aforementioned $k$ iterations generates $q$ iid $p$-tuples $(U_{1j}^{(k)}, \ldots, U_{pj}^{(k)})$ $(j = 1, \ldots, q)$; $U_1, \ldots, U_p$ could possibly be vectors in the above scheme.

Using Gibbs sampling, the joint posterior pdf of $\theta = (\theta_1, \ldots, \theta_T)$ is approximated by

$$q^{-1}\sum_{j=1}^{q}\Bigl[\theta \mid x, \alpha = \alpha_j^{(k)}, \beta = \beta_j^{(k)}\Bigr]. \qquad (3.3.15)$$

The Gibbs sampling analysis is based on the following posterior distributions:

(i) $\theta_t \mid x, \alpha, \beta \stackrel{ind}{\sim}$ Gamma$(e_t + \alpha,\; x_t + \beta)$;

(ii) $\alpha \mid x, \beta, \theta \sim$ Gamma$\bigl(\sum_{t=1}^{T}\theta_t,\; T\beta\bigr)$;

(iii) $\beta \mid \theta, x, \alpha$ has pdf $p(\beta \mid \theta, x, \alpha) \propto \bigl(\prod_{t=1}^{T}\theta_t\bigr)^{\beta - 1}\beta^{-a}\alpha^{T\beta}/\{\Gamma(\beta)\}^{T}$.

To estimate the posterior moments, we use Rao-Blackwellized estimates as in Gelfand and Smith (1991). Using (3.3.11), $E(\theta_T \mid x)$ is approximated by

$$q^{-1}\sum_{j=1}^{q}\frac{x_T + \beta_j^{(k)}}{e_T + \alpha_j^{(k)}}, \qquad (3.3.16)$$

where, as before, $k$ denotes the number of iterations needed to generate a sample. Next, using (3.3.12), $V(\theta_T \mid x)$ is approximated by

$$q^{-1}\sum_{j=1}^{q}\frac{x_T + \beta_j^{(k)}}{\bigl(e_T + \alpha_j^{(k)}\bigr)^2} + q^{-1}\sum_{j=1}^{q}\Bigl(\frac{x_T + \beta_j^{(k)}}{e_T + \alpha_j^{(k)}}\Bigr)^2 - \Bigl(q^{-1}\sum_{j=1}^{q}\frac{x_T + \beta_j^{(k)}}{e_T + \alpha_j^{(k)}}\Bigr)^2. \qquad (3.3.17)$$

In implementing the Gibbs sampler, one should be able to draw samples from the
conditional densities given in (i)-(iii). Simulation from the conditional densities (i)
and (ii), which are both gamma densities, can be done by standard methods. However,
the posterior pdf of $\beta$ given $\theta$, $x$ and $\alpha$ is known only up to a multiplicative constant.
In order to simulate from this density, one general approach is to use the Metropolis-
Hastings accept-reject algorithm.
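A minimal Python sketch of the sampler defined by (i)-(iii) and the Rao-Blackwellized estimate (3.3.16) is given below, on invented audit figures. For the $\beta$ step it uses the random-walk Metropolis device just mentioned rather than the adaptive rejection scheme developed in the remainder of this section, so it approximates the chapter's algorithm rather than reproducing it; the data, the hyperparameter a and the tuning constants are all illustrative.

import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(7)

# --- invented audit data for T rating periods (not Hoadley's data) ---
e = np.array([18.0, 22.0, 20.0, 25.0, 21.0, 20.0])   # expected defects e_t
x = np.array([21,   30,   19,   28,   26,   27  ])   # observed defects x_t
T, a_hyper = len(e), 2.0            # prior pi(alpha, beta) ~ beta^{-a} / alpha

def log_beta_cond(beta, theta, alpha):
    # log p(beta | theta, x, alpha) up to a constant, conditional (iii)
    return ((beta - 1) * np.log(theta).sum() - a_hyper * np.log(beta)
            + T * beta * np.log(alpha) - T * gammaln(beta))

theta, alpha, beta = x / e, 1.0, 1.0
post_mean_terms = []
for it in range(6000):
    # (i)  theta_t | x, alpha, beta ~ Gamma(rate = e_t + alpha, shape = x_t + beta)
    theta = rng.gamma(shape=x + beta, scale=1.0 / (e + alpha))
    # (ii) alpha | theta, beta ~ Gamma(rate = sum theta_t, shape = T beta)
    alpha = rng.gamma(shape=T * beta, scale=1.0 / theta.sum())
    # (iii) beta | theta, alpha: random-walk Metropolis on log beta
    #       (a stand-in for the adaptive rejection sampling step in the text)
    prop = beta * np.exp(0.3 * rng.normal())
    log_ratio = (log_beta_cond(prop, theta, alpha)
                 - log_beta_cond(beta, theta, alpha)
                 + np.log(prop) - np.log(beta))      # Jacobian of the log transform
    if np.log(rng.uniform()) < log_ratio:
        beta = prop
    if it >= 1000:                                   # discard burn-in
        post_mean_terms.append((x[-1] + beta) / (e[-1] + alpha))

# Rao-Blackwellized estimate (3.3.16) of E(theta_T | x)
print(np.mean(post_mean_terms))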
Fortunately, the task becomes simpler for us because of the following result.

Lemma 3.2. $\log p(\beta \mid \theta, x, \alpha)$ is a concave function of $\beta$ if $T > a$.

Proof of Lemma 3.2. Consider $p(\beta \mid \theta, x, \alpha) \propto \bigl(\prod_{t=1}^{T}\theta_t\bigr)^{\beta - 1}\beta^{-a}\alpha^{T\beta}/\{\Gamma(\beta)\}^{T}$. Then

$$\log p(\beta \mid \theta, x, \alpha) = C + (\beta - 1)\sum_t \log\theta_t - a\log\beta + T\beta\log\alpha - T\log\Gamma(\beta),$$

where $C$ is the norming constant. Hence, using $\partial\log\Gamma(\beta)/\partial\beta = \partial\log\Gamma(\beta + 1)/\partial\beta - 1/\beta$,

$$\frac{\partial\log p(\beta \mid \theta, x, \alpha)}{\partial\beta} = \sum_t \log\theta_t - \frac{a}{\beta} + T\log\alpha - T\,\frac{\partial\log\Gamma(\beta)}{\partial\beta}$$
$$= \sum_t \log\theta_t + T\log\alpha + \frac{T - a}{\beta} - T\int_0^\infty e^{-z}z^{\beta}\log z\, dz \Big/ \Gamma(\beta + 1). \qquad (3.3.18)$$

Therefore,

$$\frac{\partial^2\log p(\beta \mid \theta, x, \alpha)}{\partial\beta^2} = -\frac{T - a}{\beta^2} - T\Bigl[\int_0^\infty e^{-z}z^{\beta}(\log z)^2\, dz \Big/ \Gamma(\beta + 1) - \Bigl(\int_0^\infty e^{-z}z^{\beta}\log z\, dz \Big/ \Gamma(\beta + 1)\Bigr)^2\Bigr]$$
$$= -\frac{T - a}{\beta^2} - T\, V_{\beta + 1}(\log Z) < 0 \qquad (3.3.19)$$

for $T > a$, where $Z \sim$ Gamma$(1, \beta + 1)$.
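As a quick sanity check of (3.3.19), the short sketch below evaluates the second derivative in two equivalent forms at a few values of $\beta$; the values of T and a are chosen arbitrarily, and polygamma(1, ·) is the trigamma function.

# Numerical check of (3.3.19): for T > a the second derivative of
# log p(beta | theta, x, alpha) is negative (T and a below are illustrative).
import numpy as np
from scipy.special import polygamma

T, a = 6, 2.0
for beta in [0.1, 0.5, 1.0, 5.0, 50.0]:
    d2 = a / beta**2 - T * polygamma(1, beta)                    # direct form
    d2_alt = -(T - a) / beta**2 - T * polygamma(1, beta + 1.0)   # form (3.3.19)
    assert np.isclose(d2, d2_alt) and d2 < 0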


Because of the log-concavity of this posterior density, we can use the adaptive rejection sampling algorithm of Gilks and Wild (1992) to simulate from this density. Adaptive rejection sampling is a black box technique for sampling from any univariate log-concave probability density function $f(x)$. The algorithm is based on the fact that any concave function can be bounded by a rejection envelope and a squeezing function which are piecewise exponential functions, the rejection envelope being constructed from tangents at, and the squeezing function from chords between, sampled points at which the function has been evaluated over its domain. As sampling proceeds, the rejection envelope and the squeezing function converge to the density function, and hence the method is adaptive.

We now describe adaptive rejection sampling in a general framework. Let $f(x)$ be a probability density function with domain $D$. It is assumed that $D$ is connected, that $f(x)$ is continuous and differentiable everywhere in $D$, and that $h(x) = \ln f(x)$ is concave everywhere in $D$. Consider $m$ abscissa points in $D$: $x_1 < x_2 < \cdots < x_m$. Let $T_m = \{x_i;\ i = 1, \ldots, m\}$. For $j = 1, \ldots, m - 1$ the tangents to $h(x)$ at $x_j$ and $x_{j+1}$ in $T_m$ intersect at

$$ z_j = \frac{h(x_{j+1}) - h(x_j) - x_{j+1} h'(x_{j+1}) + x_j h'(x_j)}{h'(x_j) - h'(x_{j+1})}. \qquad (3.3.20) $$


The rejection envelope on $T_m$ is defined as $\exp(u_m(x))$, where $u_m(x)$ is a piecewise linear upper hull of the form

$$ u_m(x) = h(x_j) + (x - x_j)\, h'(x_j) \quad \text{for } x \in [z_{j-1}, z_j],\ j = 1, \ldots, m, \qquad (3.3.21) $$

where $z_0$ is the lower bound of $D$ (or $-\infty$ if $D$ is unbounded below) and $z_m$ is the upper bound of $D$ (or $+\infty$ if $D$ is unbounded above). The squeezing function on $T_m$ is defined as $\exp(l_m(x))$, where $l_m(x)$ is a piecewise linear lower hull formed from the chords between adjacent abscissae in $T_m$ and is of the form

$$ l_m(x) = \frac{(x_{j+1} - x)\, h(x_j) + (x - x_j)\, h(x_{j+1})}{x_{j+1} - x_j} \quad \text{for } x \in [x_j, x_{j+1}],\ j = 1, \ldots, m - 1. \qquad (3.3.22) $$

For $x < x_1$ or $x > x_m$, $l_m(x) = -\infty$. Also, define the normalized envelope density

$$ s_m(x) = \frac{\exp(u_m(x))}{\int_D \exp(u_m(x'))\, dx'}. \qquad (3.3.23) $$

The concavity of $h(x)$ ensures that $l_m(x) \le h(x) \le u_m(x)$ for all $x$ in $D$. To sample $n$ points independently from $f(x)$ the following steps are performed: (1) initialization step, (2) sampling step and (3) updating step.

Initialization step: Initialize the abscissa points in $T_m$. If $D$ is unbounded below, then $x_1$ is chosen such that $h'(x_1) > 0$, and if $D$ is unbounded above, then $x_m$ is chosen such that $h'(x_m) < 0$. The functions $u_m(x)$, $l_m(x)$ and $s_m(x)$ are found from equations (3.3.21), (3.3.22) and (3.3.23) respectively.

Sampling step: $x^*$ is sampled from $s_m(x)$ and a value $w$ is sampled independently from the Uniform(0,1) distribution. The squeezing test is performed as follows: if

$$ w \le \exp\{l_m(x^*) - u_m(x^*)\}, $$

then $x^*$ is accepted. Else $h(x^*)$ and $h'(x^*)$ are evaluated and the rejection test is performed: if

$$ w \le \exp\{h(x^*) - u_m(x^*)\}, $$

then $x^*$ is accepted; otherwise $x^*$ is rejected.

Updating step: If $h(x^*)$ and $h'(x^*)$ were evaluated at the sampling step, then $x^*$ is included in $T_m$ to form $T_{m+1}$ and the elements of $T_{m+1}$ are relabelled in ascending order. The functions $u_{m+1}(x)$, $l_{m+1}(x)$ and $s_{m+1}(x)$ are then evaluated from equations (3.3.21), (3.3.22) and (3.3.23) respectively. We return to the sampling step if $n$ points have not yet been accepted.
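The following sketch (Python with NumPy assumed) implements the heart of this scheme for a log-concave density on an interval: the tangent intersection points (3.3.20), the piecewise-exponential envelope (3.3.21) and (3.3.23), the rejection test, and the adaptive insertion of rejected points. It is a simplified version, not the full Gilks-Wild algorithm: the squeezing test is omitted, so h is evaluated at every candidate, and adaptation occurs only on rejection. The functions h and hprime (which must accept NumPy arrays), the starting abscissae and the domain bounds are user-supplied assumptions.

```python
import numpy as np

def ars_sample(h, hprime, x_init, n_samples, lower=0.0, upper=np.inf, rng=None):
    """Sample from a log-concave density on (lower, upper), given h = log f
    (up to a constant) and hprime = h'.  Requires strictly decreasing hprime
    over the abscissae, hprime(x_init[0]) > 0 if lower is -inf, and
    hprime(x_init[-1]) < 0 if upper is +inf, so the envelope has finite mass."""
    rng = np.random.default_rng() if rng is None else rng
    xs = np.sort(np.asarray(x_init, dtype=float))
    hs, dhs = h(xs), hprime(xs)
    out = []
    while len(out) < n_samples:
        # Tangent intersections z_1,...,z_{m-1} (3.3.20), padded with the domain ends.
        z_mid = (hs[1:] - hs[:-1] - xs[1:] * dhs[1:] + xs[:-1] * dhs[:-1]) / (dhs[:-1] - dhs[1:])
        z = np.concatenate(([lower], z_mid, [upper]))
        b = dhs                                   # tangent slope on each piece
        left, right = z[:-1] - xs, z[1:] - xs     # piece endpoints relative to x_j
        hs0 = hs - hs.max()                       # stabilize the exponentials
        with np.errstate(over='ignore'):
            mass = np.where(np.abs(b) > 1e-12,
                            np.exp(hs0) * (np.exp(b * right) - np.exp(b * left)) / b,
                            np.exp(hs0) * (z[1:] - z[:-1]))
        probs = mass / mass.sum()
        j = rng.choice(len(xs), p=probs)          # choose a piece of the envelope
        u = rng.uniform()
        if abs(b[j]) > 1e-12:                     # invert the piecewise-exponential cdf
            lo, hi = np.exp(b[j] * left[j]), np.exp(b[j] * right[j])
            xstar = xs[j] + np.log(lo + u * (hi - lo)) / b[j]
        else:
            xstar = z[j] + u * (z[j + 1] - z[j])
        upper_hull = hs[j] + (xstar - xs[j]) * b[j]           # u_m(x*) from (3.3.21)
        if np.log(rng.uniform()) <= h(xstar) - upper_hull:    # rejection test
            out.append(xstar)
        else:                                                 # adapt on rejection
            pos = np.searchsorted(xs, xstar)
            xs = np.insert(xs, pos, xstar)
            hs = np.insert(hs, pos, h(xstar))
            dhs = np.insert(dhs, pos, hprime(xstar))
    return np.array(out)
```

For the conditional of β in (iii), one would take h(β) as in the proof of Lemma 3.2 (up to a constant), together with its derivative, with domain (0, ∞) and a rightmost starting abscissa at which h' is negative.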
As an alternative to Gibbs sampling, the posterior moments of $\theta_T$ given $x$ can also be obtained using the Laplace method of approximation (see Tierney and Kadane, 1986). Note that

$$ E[\theta_T \mid x] = E\left[\frac{x_T + \beta}{e_T + \alpha}\ \Big|\ x\right] = \frac{\displaystyle\iint \frac{x_T + \beta}{e_T + \alpha}\, p(\alpha, \beta, x)\, d\alpha\, d\beta}{\displaystyle\iint p(\alpha, \beta, x)\, d\alpha\, d\beta}. \qquad (3.3.24) $$

Setting

$$ L = \log p(\alpha, \beta, x) = \log C + (T\beta - 1)\log \alpha - a \log \beta + \sum_{t} \log \Gamma(x_t + \beta) - T \log \Gamma(\beta) - \sum_{t} (x_t + \beta) \log (e_t + \alpha), \qquad (3.3.25) $$

where $C$ is a norming constant, and

$$ L^{*} = \log\left(\frac{x_T + \beta}{e_T + \alpha}\right) + \log p(\alpha, \beta, x), \qquad (3.3.26) $$

produces the approximation

$$ \hat{E}[\theta_T \mid x] = \left(\frac{\det \Sigma^{*}}{\det \Sigma}\right)^{1/2} \exp\left\{ L^{*}(\hat{\alpha}^{*}, \hat{\beta}^{*}) - L(\hat{\alpha}, \hat{\beta}) \right\} \qquad (3.3.27) $$

to $E[\theta_T \mid x]$, where $(\hat{\alpha}^{*}, \hat{\beta}^{*})$ and $(\hat{\alpha}, \hat{\beta})$ maximize $L^{*}$ and $L$ respectively, and $\Sigma^{*}$ and $\Sigma$ are minus the inverse Hessians of $L^{*}$ and $L$ at $(\hat{\alpha}^{*}, \hat{\beta}^{*})$ and $(\hat{\alpha}, \hat{\beta})$ respectively. The approximation (3.3.27) is referred to as the first order Laplace approximation. A







similar approximation applies to the posterior variance when one writes

$$ V\left[\frac{x_T + \beta}{e_T + \alpha}\ \Big|\ x\right] = E\left[\left(\frac{x_T + \beta}{e_T + \alpha}\right)^2 \Big|\ x\right] - \left\{E\left[\frac{x_T + \beta}{e_T + \alpha}\ \Big|\ x\right]\right\}^2, \qquad (3.3.28) $$

substitutes (3.3.28) in (3.3.12), and uses calculations similar to (3.3.25) and (3.3.26) for each term in (3.3.12) to arrive at an expression similar to (3.3.27).
It should be noted in the present context that since $(x_T + \beta)/(e_T + \alpha) > 0$, a second order Laplace approximation is automatically achieved using the present approach, as in Tierney and Kadane (1986). One does not need to appeal to Kass, Tierney and Kadane (1989), which provides a second order Laplace approximation even when the integrand is not necessarily positive.
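As a sketch of how (3.3.25)-(3.3.27) can be evaluated numerically (assuming NumPy and SciPy, assuming the prior form π(α, β) ∝ α⁻¹β⁻ᵃ implicit in (3.3.25), and assuming the current period T is the last entry of the data vectors; the starting values, bounds and step size are arbitrary choices):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def laplace_posterior_mean(x, e, a, start=(1.0, 1.0)):
    """First order Laplace approximation (3.3.27) to E[theta_T | x]."""
    x, e = np.asarray(x, float), np.asarray(e, float)
    T = len(x)

    def L(p):        # log p(alpha, beta, x) up to a constant, equation (3.3.25)
        al, be = p
        return ((T * be - 1) * np.log(al) - a * np.log(be)
                + gammaln(x + be).sum() - T * gammaln(be)
                - ((x + be) * np.log(e + al)).sum())

    def Lstar(p):    # L* = log((x_T + beta)/(e_T + alpha)) + L, equation (3.3.26)
        al, be = p
        return np.log((x[-1] + be) / (e[-1] + al)) + L(p)

    def hess(f, p, h=1e-4):          # central-difference Hessian
        p = np.asarray(p, float)
        H = np.zeros((2, 2))
        for i in range(2):
            for j in range(2):
                pp, pm, mp, mm = (p.copy() for _ in range(4))
                pp[i] += h; pp[j] += h
                pm[i] += h; pm[j] -= h
                mp[i] -= h; mp[j] += h
                mm[i] -= h; mm[j] -= h
                H[i, j] = (f(pp) - f(pm) - f(mp) + f(mm)) / (4 * h * h)
        return H

    bounds = [(1e-6, None), (1e-6, None)]
    fit = minimize(lambda p: -L(p), start, bounds=bounds)
    fit_star = minimize(lambda p: -Lstar(p), start, bounds=bounds)
    Sigma = np.linalg.inv(-hess(L, fit.x))                # minus inverse Hessian of L
    Sigma_star = np.linalg.inv(-hess(Lstar, fit_star.x))  # same for L*
    ratio = np.sqrt(np.linalg.det(Sigma_star) / np.linalg.det(Sigma))
    return ratio * np.exp(Lstar(fit_star.x) - L(fit.x))   # equation (3.3.27)
```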

3.4 An Example


The example in this section considers the same defect data as given in Hoadley (1981). Hoadley's primary goal was to compare QMP with the then-existing T-rate method. Our objective is to compare and contrast the present HB method with Hoadley's EB method.
In deriving the HB estimates of the present chapter, we have used the Gibbs sampler with a burn-in of 2000 iterations, retaining one sample every 50 subsequent iterations. A sample of size 10,000 is taken to obtain the Monte Carlo estimates, as stability seems to be achieved with this sample size.
Table 3.1 provides $e_t$ (the expected number of defects in the audit sample when the quality standard is met for period t), $I_t$ (the defect index of the current sample for period t), the posterior mean and posterior variance of the quality index at period t using Hoadley's QMP, the posterior mean and posterior variance of the quality index at period t using the present HB method for the different choices a = 2, 3 and 4, and the posterior mean and posterior variance of the quality index at period t using Laplace's approximation, again for a = 2, 3 and 4. These figures are provided for the same 9 time periods t = 1, ..., 9 as given in Table 4 of Hoadley (1981).

An inspection of Table 3.1 reveals that there can be significant differences between the estimates of $\theta_t$ as given in Hoadley and the HB estimates given in this chapter. Also, the HB procedure is somewhat sensitive to the choice of "a", as different choices of "a" can lead to slightly different point estimates. The posterior variances obtained by the HB approach are also dissimilar to the approximate expressions given by Hoadley.

They are substantially smaller for periods 3 and 4 but, on the other hand, much bigger for periods 1, 2, 5, 6, 7, 8 and 9. Also, there is some sensitivity of the proposed HB method to the choice of "a". However, these differences are not as drastic as the differences between the HB method and Hoadley's approximations.

Laplace's approximations do not work very well in the present context. The main reason is that the joint posterior distribution of (α, β) given the data is highly skewed. It is folklore that the Laplace approximation works well only when the posterior distribution is close to Gaussian; the greater the departure from normality, the greater the inadequacy of Laplace's method. The present example provides yet another illustration of this phenomenon. The situation may improve if one works with some transformation of α and β. This idea is not explored in this chapter, and will be a topic for future study.





Table 3.1. Posterior means and variances of the quality index for periods t = 1, ..., 9: $e_t$, $I_t$, Hoadley's QMP values, the present HB values (a = 2, 3, 4), and the Laplace approximations (a = 2, 3, 4).














CHAPTER 4

BAYESIAN ANALYSIS OF CATEGORICAL SURVEY DATA



4.1 Introduction


Small area estimation is gaining increasing importance in survey sampling due to the growing demand for reliable small area statistics from both the public and private sectors.

In typical small area estimation problems, there exist a large number of local areas,

but samples available from an individual area are not usually adequate to achieve

accuracy at a specified level. The reason behind this is that the original survey was

designed to provide specific accuracy at a much higher level of aggregation than that

for local areas. This makes it a necessity to "borrow strength" or connect these

local areas explicitly or implicitly through models. In consequence, an estimate for

a particular local area utilizes information from similar neighboring areas. For an early history as well as the recent developments on small area estimation, the reader is referred to the survey article of Ghosh and Rao (1991).

For quite some time now, Bayesian methods have been applied very extensively for solving small area estimation problems. Particularly effective in this regard have been the hierarchical and empirical Bayes (HB and EB) approaches, which are especially suited for a systematic connection of the local areas through models. For a general discussion of the EB or HB methodology in the small area estimation context, the reader is referred to Ghosh and Meeden (1986), Ghosh and Lahiri (1987), and Datta and Ghosh (1991), among others.







However, the development to date has mainly concentrated on numerical valued

variates. Often the survey data are categorical, for which the HB or EB analysis

suitable for continuous variates is not very appropriate. It is only recently that some

work has started on the analysis of binary survey data. McGibbon and Tomberlin

(1989) obtain small area estimates of proportions via EB techniques, while Stasny

(1991) uses a HB model to estimate the probability that an individual has a cer-

tain characteristic. She uses data from the national crime survey (NCS) to estimate

the probability of being victimized, and the data from the current population survey

(CPS) to estimate the probability of being unemployed. Stroud (1991) develops a

general HB methodology for binary data, and subsequently (Stroud, 1992) provides a

comprehensive treatment of binary categorical survey data encompassing simple ran-

dom, stratified, cluster and two-stage sampling as well as two-stage sampling within

strata.

However, often the survey data, by nature, are more appropriately classified into

several categories instead of two. Simple examples of such multi-category responses

are choice of transportation to take to work (drive, bus, subway, walk, bicycle), consensus on an opinion (strongly agree, agree, disagree, and strongly disagree), and political

ideology (liberal, moderate, and conservative). To our knowledge, hardly any EB or

HB analysis seems available for such data. The objective of this chapter is to provide

a general HB methodology related to inference for data on items containing three

or more possible responses. The analysis is done within the framework of two-stage

sampling within strata. As a specific example to be considered in this chapter, we

cite the recent Canada Youth and AIDS study (King et al., 1988). In the different

provinces of Canada, children within selected schools were asked the question "how

often have you had sexual intercourse?" There were four response categories: never,

once, a few times, and often. Stroud (1991) analyzed the data by collapsing the four

categories into two : "often" and "not often"; but a more elaborate analysis involving







all the four categories is bound to be more informative. We shall present a general

HB methodology which will be found adequate to handle data of the above type.

The outline of the remaining sections is as follows. In Section 4.2, we present a

general HB algorithm for inference based on generalized linear models. The inference

includes, but is not restricted to, the important logistic regression and log-linear models.

As mentioned earlier, the method is described when there is two-stage sampling within

the strata, and in this way, our method extends the work of Zeger and Karim (1991)

who consider one stage sampling within strata. As in Zeger and Karim (1991), we

find the necessary posterior distributions by using Gibbs sampling (see Geman

and Geman, 1984; Gelfand and Smith, 1990), but there is one crucial simplification

in this chapter. We have identified the log-concavity of several densities which are

known only up to a multiplicative constant, and in this way have been able to use the

general adaptive rejection sampling of Gilks and Wild (1992) in contrast to the more

complex direct Metropolis-Hastings algorithm as done in Zeger and Karim (1991).

Also, in Section 4.2 of this chapter, we have contrasted the present method to that

of Albert (1988) and of Leonard and Novick (1986).

Section 4.3 considers general multi-category survey data admitting a multinomial

likelihood. However, we have viewed the multinomial distribution as the joint distri-

bution of several independent Poisson variables conditional on their sum, and in this

way, have been able to bring in directly the results of Section 4.2 for the analysis of

multi-category survey data.

Finally, in Section 4.4, we have considered the Youth and AIDS data to illustrate the general methods described in Section 4.3.

4.2 Generalized Linear Models for Two-Stage Sampling Within Strata


Let $Y_{ijk}$ denote the response (discrete or continuous) of the kth unit within the jth cluster in the ith stratum ($k = 1, \ldots, n_{ij}$; $j = 1, \ldots, c_i$; $i = 1, \ldots, m$). The $Y_{ijk}$ are assumed to be independent with pdfs

$$ f(y_{ijk}; \theta_{ijk}, \phi_{ijk}) = \exp\left[ \phi_{ijk}\left( y_{ijk}\,\theta_{ijk} - \psi(\theta_{ijk}) \right) + \rho(y_{ijk}; \phi_{ijk}) \right]. \qquad (4.2.1) $$

Such a model is referred to as a generalized linear model (McCullagh and Nelder, 1989, p. 28). The density (4.2.1) is parametrized with respect to the canonical parameters $\theta_{ijk}$ and scale parameters $\phi_{ijk}$. It is assumed that the scale parameters $\phi_{ijk}$ are known. The natural parameters $\theta_{ijk}$ are modelled as

$$ \theta_{ijk} = x_{ij}^{T}\beta + u_{ij} + \epsilon_{ijk}, \qquad (4.2.2) $$

where the $x_{ij}$ ($p \times 1$) are known design vectors, $\beta$ ($p \times 1$) is the unknown regression coefficient, the $u_{ij}$ are random effects, and the $\epsilon_{ijk}$ are errors. It is assumed that the $u_{ij}$ and the $\epsilon_{ijk}$ are mutually independent with $u_{ij}$ iid $N(0, \sigma_B^2)$ and $\epsilon_{ijk}$ iid $N(0, \sigma^2)$.

It is possible to represent (4.2.1) and (4.2.2) as a conditionally independent hierarchical model (see, e.g., Kass and Steffey (1989)). Write $A_{ij} = x_{ij}^{T}\beta + u_{ij}$, $R_B = (\sigma_B^2)^{-1}$ and $R = (\sigma^2)^{-1}$. Then the hierarchical model is given by

I. Conditional on $\theta$, $\beta$, $R_B = r_B$ and $R = r$, the $Y_{ijk}$ are mutually independent with a density of the form given in (4.2.1).

II. Conditional on $A$, $\beta$, $R_B = r_B$ and $R = r$, $\theta_{ijk} \stackrel{ind}{\sim} N(A_{ij}, r^{-1})$.

III. Conditional on $\beta$, $R_B = r_B$ and $R = r$, $A_{ij} \stackrel{ind}{\sim} N(x_{ij}^{T}\beta, r_B^{-1})$.

To complete the HB analysis, we assign the prior

IV. $\beta$, $R_B$ and $R$ are mutually independent with $\beta \sim \mathrm{uniform}(R^p)$, $R_B \sim \mathrm{Gamma}(\tfrac{1}{2}a, \tfrac{1}{2}b)$ and $R \sim \mathrm{Gamma}(\tfrac{1}{2}c, \tfrac{1}{2}d)$.

[A rv $Z$ is said to have a $\mathrm{Gamma}(\alpha, \beta)$ distribution if it has a pdf of the form $f(z) \propto \exp(-\alpha z)\, z^{\beta - 1} I_{[z > 0]}$, where $\alpha > 0$, $\beta > 0$.] We allow the possibility of diffuse gamma priors by allowing $a$, $b$, $c$ and $d$ to be zeroes or even negative.
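To make stages I-III concrete, here is a small forward-simulation sketch for the Poisson special case (ψ(θ) = exp(θ), φ = 1), assuming NumPy; the dimensions, the design array x, the true β and the precisions r_B and r are illustrative inputs, not quantities from the text.

```python
import numpy as np

def simulate_two_stage_poisson(x, beta, r_B, r, n, rng=None):
    """Draw (A, theta, y) from stages III, II and I for the Poisson case.
    x has shape (m, c, p): one p-vector of covariates per (stratum, cluster)."""
    rng = np.random.default_rng(1) if rng is None else rng
    m, c, _ = x.shape
    A = x @ beta + rng.normal(0.0, 1.0 / np.sqrt(r_B), size=(m, c))           # stage III
    theta = A[..., None] + rng.normal(0.0, 1.0 / np.sqrt(r), size=(m, c, n))  # stage II
    y = rng.poisson(np.exp(theta))                                            # stage I
    return A, theta, y

# e.g. simulate_two_stage_poisson(np.ones((2, 3, 1)), np.array([0.5]), 4.0, 4.0, n=6)
```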

We are interested in the posterior distribution of the $\theta_{ijk}$ given the data $y_{ijk}$ ($k = 1, \ldots, n_{ij}$; $j = 1, \ldots, c_i$; $i = 1, \ldots, m$). This is best accomplished by using the






Gibbs sampler (Geman and Geman (1984), Gelfand and Smith (1990)). For its implementation, the necessary posterior distributions based on (I)-(IV) are given as follows:

(i) $\beta \mid A, y, \theta, r_B, r \sim N\left( \left(\sum_i \sum_j x_{ij} x_{ij}^{T}\right)^{-1} \sum_i \sum_j x_{ij} A_{ij},\ r_B^{-1} \left(\sum_i \sum_j x_{ij} x_{ij}^{T}\right)^{-1} \right)$;

(ii) $A_{ij} \mid \beta, y, \theta, r_B, r \sim N\left( (n_{ij} r + r_B)^{-1} \left( r \sum_k \theta_{ijk} + r_B\, x_{ij}^{T}\beta \right),\ (n_{ij} r + r_B)^{-1} \right)$;

(iii) $R \mid \beta, y, \theta, r_B, A \sim \mathrm{Gamma}\left( \tfrac{1}{2}\left(c + \sum_i \sum_j \sum_k (\theta_{ijk} - A_{ij})^2\right),\ \tfrac{1}{2}\left(d + \sum_i \sum_j n_{ij}\right) \right)$;

(iv) $R_B \mid \beta, y, \theta, r, A \sim \mathrm{Gamma}\left( \tfrac{1}{2}\left(a + \sum_i \sum_j (A_{ij} - x_{ij}^{T}\beta)^2\right),\ \tfrac{1}{2}\left(b + \sum_i c_i\right) \right)$;

(v) the $\theta_{ijk} \mid \beta, y, r_B, r, A$ are mutually independent with

$$ f(\theta_{ijk} \mid \beta, y, r_B, r, A) \propto \exp\left( \phi_{ijk}\left(\theta_{ijk}\, y_{ijk} - \psi(\theta_{ijk})\right) - \tfrac{r}{2}\, (\theta_{ijk} - A_{ij})^2 \right). $$

It is clear from the above that it is possible to generate samples from the normal and gamma distributions given in (i)-(iv). On the other hand, as evidenced in (v), the posterior distribution of $\theta_{ijk}$ given $\beta$, $y$, $r_B$, $r$ and $A$ is known only up to a multiplicative constant, and accordingly one has to use a general accept-reject algorithm to generate samples from this pdf. Fortunately, the task becomes much simpler due to the following lemma establishing log-concavity of this posterior density, because then one can use the adaptive rejection sampling of Gilks and Wild (1992).

Lemma 4.1. $\log f(\theta_{ijk} \mid \beta, y, r_B, r, A)$ is a concave function of $\theta_{ijk}$.

Proof of Lemma 4.1.

$$ \frac{\partial \log f(\theta_{ijk} \mid \beta, y, r_B, r, A)}{\partial \theta_{ijk}} = \phi_{ijk}\left( y_{ijk} - \psi'(\theta_{ijk}) \right) - r(\theta_{ijk} - A_{ij}). $$

Hence,

$$ \frac{\partial^2 \log f(\theta_{ijk} \mid \beta, y, r_B, r, A)}{\partial \theta_{ijk}^2} = -\phi_{ijk}\, \psi''(\theta_{ijk}) - r < 0, $$

using the fact that $r > 0$, $\phi_{ijk} > 0$, and $\psi''(\theta_{ijk}) = \phi_{ijk}\, V(Y_{ijk} \mid \theta_{ijk}) > 0$.






The actual implementation of the Gibbs sampling technique in the specific exam-
ple mentioned in the introduction is given in Section 4.4.

In Zeger and Karim (1991), the basic data consist of $y_{ik}$, the response for the kth unit in the ith stratum. In this way, no two-stage sampling is involved, thereby eliminating several steps in (i)-(v). However, Zeger and Karim (1991) allow the possibility of correlated errors $\epsilon_{ijk}$, and thereafter put an inverse Wishart prior on the covariance matrix rather than the inverse gamma distribution as done in this chapter. Zeger and Karim (1991) proposed modelling $h(\theta_{ijk})$, where $h$ is a strictly monotone increasing function, by (4.2.2) rather than modelling $\theta_{ijk}$ itself. However, in their simulation work, they worked with the canonical link $\theta_{ijk}$. Their calculations can be greatly simplified by adaptation of the Gilks-Wild algorithm.

Two special cases are of immense practical interest. The first is the logistic regression model, where $\theta_{ijk} = \log\left(p_{ijk}/(1 - p_{ijk})\right)$, the $p_{ijk}$ being the success probabilities in Bernoulli trials. The second is the log-linear model, where $\theta_{ijk} = \log \zeta_{ijk}$, the $\zeta_{ijk}$ being Poisson means. We shall consider the second situation in Section 4.3.
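For the log-linear case, the full conditional (v) takes an explicit form that can be fed directly to an adaptive rejection sampler such as the sketch given in Chapter 3. A minimal sketch (assuming NumPy, with ψ(θ) = exp(θ) and scale φ = 1; y is an observed count, r the error precision and A the value of $A_{ij}$):

```python
import numpy as np

def log_full_conditional(theta, y, r, A):
    """log f(theta_ijk | beta, y, r_B, r, A) up to a constant, Poisson case."""
    return y * theta - np.exp(theta) - 0.5 * r * (theta - A) ** 2

def dlog_full_conditional(theta, y, r, A):
    """Its derivative; the second derivative, -exp(theta) - r, is negative,
    in agreement with Lemma 4.1."""
    return y - np.exp(theta) - r * (theta - A)
```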

The log-concavity idea is used slightly differently in Dellaportas and Smith (1993), where the prime objective is inference about $\beta$ in generalized linear models and the $\theta_{ijk}$ are modelled as functions of $\beta$ without any error. Dellaportas and Smith (1993) have used a $N(\beta_0, D_0)$ prior for $\beta$, where $\beta_0$ and $D_0$ are known, and their method, unlike ours, does not use any unknown variance components.

Our method should also be contrasted to that of Albert (1988) which generalizes

Leonard and Novick (1986). Albert's (1988) method when generalized to the present

setting will first assign independent conjugate prior distributions

$$ \pi(\theta_{ijk} \mid m_{ij}, \zeta) = \exp\left[ \zeta\left( m_{ij}\,\theta_{ijk} - \psi(\theta_{ijk}) \right) + g(m_{ij}; \zeta) \right] \qquad (4.2.3) $$

to the $\theta_{ijk}$. Next one assumes that $h(m_{ij}) = x_{ij}^{T}\beta$ for some monotone function $h$. Subsequently, he assigns distributions (possibly diffuse) to the hyperparameters $\beta$ and $\zeta$. Thus, Albert's (1988) procedure amounts to modelling some function of







the prior mean through some linear model without any error. This can, of course, be generalized by adding an error component to the regression term. It should also be noted that Albert's paper was written before the recent surge of Monte Carlo integration. He, therefore, suggested approximations to the Bayes procedure by one or the other of three methods: (i) the Laplace method (see, e.g., Tierney and Kadane, 1986), (ii) the quasi-likelihood approach, and (iii) Brooks' (1984) method. These approximations are, in general, unnecessary now with the advent of sophisticated Monte Carlo integration techniques.

4.3 Analysis of Multi-Category Data


We now see how the results of the previous section help in the analysis of multi-category data. Consider $m$ strata labelled $1, \ldots, m$. Within each stratum several units are selected, and suppose that the responses of individuals within each selected unit are independent and can be classified into $J$ categories. For the kth selected unit within the ith stratum, let $p_{ijk}$ denote the probability that an individual's response belongs to the jth category, and let $Z_{ijk}$ denote the number of individuals whose response falls in the jth category ($j = 1, \ldots, J$; $k = 1, \ldots, n_i$). Then, within the kth selected unit within the ith stratum, $(Z_{i1k}, \ldots, Z_{iJk})$ has a joint multinomial distribution with cell probabilities $(p_{i1k}, \ldots, p_{iJk})$. Using the well-known relationship between the multinomial and Poisson distributions, $(Z_{i1k}, \ldots, Z_{iJk})$ has the same distribution as the joint conditional distribution of $(Y_{i1k}, \ldots, Y_{iJk})$ given $\sum_{j=1}^{J} Y_{ijk}$, where the $Y_{ijk}$ $(j = 1, \ldots, J)$ are independent Poisson($\zeta_{ijk}$) and $p_{ijk} = \zeta_{ijk}/\sum_{j'=1}^{J} \zeta_{ij'k}$ $(j = 1, \ldots, J)$. Thus, although the present structure is not strictly two-stage sampling within strata (since the suffix $j$ corresponds to a category, and not a primary unit within the ith stratum), the results of the previous section apply (with $n_{ij} = n_i$ for all $j$ and $c_i = J$ for all $i$) for finding the posterior distributions of the $\zeta_{ijk}$. The posterior means and variances of the $p_{ijk}$ are then obtained by using $p_{ijk} = \zeta_{ijk}/\sum_{j'=1}^{J} \zeta_{ij'k}$ $(j = 1, \ldots, J)$ and the Monte Carlo integration algorithm.

To be specific, suppose that the Gibbs sampler uses $t$ iterates and $G$ replications. The corresponding sampled $\zeta_{ijk}$ values are denoted by $\zeta_{ijkg}^{(t)}$ $(g = 1, \ldots, G)$. Then $E(p_{ijk} \mid y)$ is approximated by

$$ G^{-1} \sum_{g=1}^{G} \frac{\zeta_{ijkg}^{(t)}}{\sum_{j'=1}^{J} \zeta_{ij'kg}^{(t)}}, $$

while $E(p_{ijk}^2 \mid y)$ is approximated by

$$ G^{-1} \sum_{g=1}^{G} \frac{\left(\zeta_{ijkg}^{(t)}\right)^2}{\left(\sum_{j'=1}^{J} \zeta_{ij'kg}^{(t)}\right)^2}. $$

$V(p_{ijk} \mid y)$ is now approximated by using the individual approximations for $E(p_{ijk}^2 \mid y)$ and $E(p_{ijk} \mid y)$. Further, $E(p_{ijk}\, p_{i'j'k'} \mid y)$ is approximated by

$$ G^{-1} \sum_{g=1}^{G} \frac{\zeta_{ijkg}^{(t)}}{\sum_{j'=1}^{J} \zeta_{ij'kg}^{(t)}} \cdot \frac{\zeta_{i'j'k'g}^{(t)}}{\sum_{j''=1}^{J} \zeta_{i'j''k'g}^{(t)}}, $$

which leads to an approximation for $\mathrm{Cov}(p_{ijk}, p_{i'j'k'} \mid y)$.
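A sketch of this final step (assuming NumPy): zeta_draws is a hypothetical array of shape (G, J) holding, for a fixed stratum i and unit k, the G retained draws of $(\zeta_{i1k}, \ldots, \zeta_{iJk})$.

```python
import numpy as np

def proportion_moments(zeta_draws):
    """Monte Carlo approximations of E(p | y), V(p | y) and Cov(p, p' | y)."""
    p = zeta_draws / zeta_draws.sum(axis=1, keepdims=True)   # p_j = zeta_j / sum zeta
    post_mean = p.mean(axis=0)                               # E(p_ijk | y)
    post_var = (p ** 2).mean(axis=0) - post_mean ** 2        # E(p^2 | y) - E(p | y)^2
    post_cov = (p.T @ p) / len(p) - np.outer(post_mean, post_mean)  # within-unit covariances
    return post_mean, post_var, post_cov
```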


4.4 An Example


We illustrate the methods of Sections 4.2 and 4.3 with an analysis of the Canada Youth & AIDS Study data mentioned in the introduction. Recall that the question "how often have you had sexual intercourse?" had four response categories: never, once, a few times, and often. In this section we obtain the posterior mean and standard deviation of the proportion of Grade 9 students in the selected province of Newfoundland, Canada, who would respond in any one of the four categories if sampled. No attempt is made to examine the question of reporting bias.







The different school boards are stratified by Catholic/Protestant and by urban/rural.

This is an attempt to minimize the effect of selection bias, since some school boards refused to participate. Refusal was often based on the personal choice of the school board official and was related to how busy the school was, how many other issues the

school had to deal with and a reticence to get involved in a situation perceived to

have political ramifications. It is reasonable to assume that, within urban/rural and

Catholic/Protestant categories within the geographical area studied here, student re-

sponses would be uncorrelated with reasons of refusal. We also assume that would-be

responses are uncorrelated with student nonresponse (chiefly due to absence), though

this is clearly a possible source of nonsampling error. Methods of modelling non-

response in stratified sampling used by Stasny (1991) have not been developed for

complex sampling designs.

School boards were selected according to a probability scheme where larger boards

had a larger probability of being selected. Classes within schools within boards

were randomly selected, and all students in attendance in the sampled classes were

given the questionnaire. Let the stratum of Catholic/Protestant and urban/rural

be indexed by r and c (rows and columns) respectively, where $r = 1, \ldots, R$ and $c = 1, \ldots, C$. Here, in this example, R = 2 (Catholic and Protestant) and C = 3 (Rural-Small Town, Town and Small City). For a given school board, it turns out that all schools within that school board fall within the same (r, c) stratum. Thus, the cluster is indexed by i, corresponding to the school board, k corresponds to the school within a school board, and j corresponds to the alternatives of the response ($i = 1, \ldots, m$; $k = 1, \ldots, n_i$; $j = 1, \ldots, J$). Thus $n_{ij} = n_i$ for all $j$, and $c_i = J$ for all $i$.

We begin with the Poisson model for counts and then obtain the proportions as given in Section 4.3. The y-values within cell $(i, j, k)$ are distributed as

$$ Y_{ijk} \stackrel{ind}{\sim} \mathrm{Poisson}(\zeta_{ijk}). \qquad (4.4.4) $$







Then, as discussed in Section 4.2, the natural parameter is modelled using the canonical link; specifically, in the Poisson case, the parameter $\theta_{ijk} = \log \zeta_{ijk}$ is modelled as

$$ \theta_{ijk} = x_{ij}^{T}\beta + u_{ij} + \epsilon_{ijk}, \qquad (4.4.5) $$

where

$$ u_{ij} \sim N(0, \sigma_B^2), \qquad \epsilon_{ijk} \sim N(0, \sigma^2). \qquad (4.4.6) $$

Keeping in mind that a given school board $i$ corresponds to a particular $(r, c)$ combination, we have

$$ x_{ij}^{T}\beta = \mu + \tau^{J}_{j} + \tau^{R}_{r} + \tau^{C}_{c} + \tau^{JR}_{jr} + \tau^{JC}_{jc} + \tau^{RC}_{rc}, \qquad (4.4.7) $$

for $r = 1, \ldots, R$, $c = 1, \ldots, C$ and $j = 1, \ldots, J$. In the above, $\mu$ is the general effect, $\tau^{J}_{j}$ is the main effect of the jth alternative of the response, $\tau^{R}_{r}$ is the main effect of the rth row, $\tau^{C}_{c}$ is the main effect of the cth column, $\tau^{JR}_{jr}$ is the interaction effect of the jth response and the rth row, $\tau^{JC}_{jc}$ is the interaction effect of the jth response and the cth column, and $\tau^{RC}_{rc}$ is the interaction effect of the rth row and the cth column. To avoid redundancy we assume the corner point restrictions, namely

$$ \tau^{J}_{J} = \tau^{R}_{R} = \tau^{C}_{C} = \tau^{JR}_{Jr} = \tau^{JR}_{jR} = \tau^{JC}_{Jc} = \tau^{JC}_{jC} = \tau^{RC}_{Rc} = \tau^{RC}_{rC} = 0 \qquad (4.4.8) $$

for all $(r, c, j)$. The additive log-linear model (4.4.5) will cause estimates of the $\zeta_{ijk}$ to borrow strength from other estimates in board $i$ and other estimates in school $k$. It is recommended in situations where some $(i, j, k)$ cells have few samples, or even none.
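One conventional way to realize (4.4.7)-(4.4.8) in computation is indicator (dummy) coding with the last level of each factor as the reference, as in the following sketch (assuming NumPy; the ordering of the entries of β is a choice made here for illustration, not prescribed by the text):

```python
import numpy as np

def design_vector(j, r, c, J=4, R=2, C=3):
    """x for response category j, row r, column c under corner-point restrictions:
    all effects involving the last level (j = J, r = R or c = C) are set to zero."""
    def dummies(level, nlevels):
        d = np.zeros(nlevels - 1)
        if level < nlevels:            # last level is the reference (effect = 0)
            d[level - 1] = 1.0
        return d
    tj, tr, tc = dummies(j, J), dummies(r, R), dummies(c, C)
    return np.concatenate([
        [1.0],                         # general effect mu
        tj, tr, tc,                    # main effects of response, row and column
        np.outer(tj, tr).ravel(),      # response x row interactions
        np.outer(tj, tc).ravel(),      # response x column interactions
        np.outer(tr, tc).ravel(),      # row x column interactions
    ])
```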

Tables 4.1 and 4.2 provide the hierarchical Bayes estimates and the associated standard errors for the proportions of students responding in the four categories in the forty selected schools within the fifteen school boards. Clearly, there is a distinction between the sample proportions and the HB estimates. In particular, if no student responds in a specific category, for example "often", the sample proportion is clearly zero, whereas the HB method usually assigns a small probability to the event. Judging by the subjective nature of the responses, the HB estimates are probably more meaningful than the sample proportions, at least for this category.

The biggest advantage of using the HB method instead of the sample proportions is the tremendous reduction in standard errors for the three categories "Never", "Once" and "Few Times". For some of these categories, the reduction is often as high as fifty per cent. On the other hand, if no students respond to a certain category, the estimated standard errors based on the sample proportions are clearly zero. Such estimates are usually questionable, but the HB method consistently rectifies this deficiency by producing positive estimates of these standard errors. Perhaps the most important advantage of the HB method, however, lies in finite population sampling. If, after drawing a random sample, some clusters are not represented at all, the sample proportions for those clusters are not available. On the other hand, it is still possible to estimate the proportions in these categories by the HB method by borrowing strength from other clusters.







Table 4.1. Estimated and Sample Proportions

BOARD   CLASS   NEVER   ONCE   FEW TIMES   OFTEN

(Hierarchical Bayes estimates of the proportion of students responding in each of the four categories, with the corresponding sample proportions in parentheses, for the forty selected schools within the fifteen school boards.)







Table 4.2. Standard Errors for Estimated and Sample Proportions

BOARD   CLASS   NEVER   ONCE   FEW TIMES   OFTEN

(Standard errors of the hierarchical Bayes estimates, with the standard errors based on the sample proportions in parentheses, for the forty selected schools within the fifteen school boards.)














CHAPTER 5

SUMMARY AND FUTURE RESEARCH



5.1 Summary


In this dissertation, we have considered several problems where the hierarchical

Bayes (HB) methodology is used to obtain estimates and the associated standard

errors. The Bayesian methodology has been applied to two specific problems of small

area estimation, namely, the adjustment of the census undercount and categorical

survey data. We have also provided a hierarchical Bayes refinement of Hoadley's

Quality Measurement Plan (QMP).

The hierarchical Bayes procedure proposed in Chapter 2 for the adjustment of

the 1990 Census undercount overcomes many of the criticisms levelled against the

Bayesian procedures of earlier authors. In particular, we have discussed a model-based approach which eliminates the need to assume that the variance-covariance matrices of the adjustment factors are known, as had hitherto been assumed in any Bayesian or non-Bayesian analysis.

Despite its wide publicity, the QMP developed by Hoadley (1981) has been criticized by many statisticians. One of the main criticisms levelled against the procedure is that it is heuristic and would require high-dimensional numerical integration for a full Bayesian implementation. In Chapter 3, we have provided an HB procedure that avoids all the ad hoc approximations needed in Hoadley's solution.







In Chapter 4, a full Bayesian analysis is provided for categorical survey data,

where data are classified into several (not necessarily two) categories. More generally,

we have provided a complete HB analysis for two-stage sampling within strata based

on generalized linear models. The technique has been used to produce estimates and

standard errors for the Canada Youth and AIDS Study data.

In all chapters of this dissertation, the implementation of the HB methodology

has been illustrated by adopting a Monte Carlo integration technique known as the

Gibbs sampler. Using this procedure, the posterior density as well as conditional

mean and variance can be obtained with considerable ease. Also, a special technique

called the adaptive rejection sampling has been extensively used to generate samples

from log-concave densities.

5.2 Future Research


The Gibbs sampler and iterative simulation methods are potentially very helpful

for summarizing univariate and multivariate distributions. In all of our applications, we have employed a single sequence of $t \times G$ Gibbs iterates, storing every tth iterate to provide i.i.d. $p$-tuples $(U_{1g}^{(t)}, \ldots, U_{pg}^{(t)})$ $(g = 1, \ldots, G)$. Since there are no well-established techniques to monitor convergence of an iterative simulation, we have employed crude existing techniques for assessing convergence. But it is possible that

when using a single sequence, the inferences may be unduly influenced by slow-moving

realizations of the iterative simulation. It is important to establish the convergence

by implementing quantitative methods in monitoring convergence. One such possible

strategy is to use several independent sequences, with starting points sampled from

an overdispersed distribution, as recommended by Gelman and Rubin (1992). Also, in the case of simulating samples from non-log-concave densities, it is possible to use adaptive rejection Metropolis sampling as in Gilks et al. (1993).







Coming to the specific problems considered in this dissertation, in the adjustment

of the census undercount, we have not taken into account the pairwise correlations of the adjustment factors between the different poststrata, since the sample correlations were too small compared to the variances. We have, in a previous study, considered a general correlation structure by assuming Wishart-type priors on the variance-covariance matrix, but this yielded unreasonable estimates. The case in which a special type of correlation structure is more appropriate needs further investigation.

In addition to the refinement of Hoadley's QMP, it is important to investigate the possibility of a full Bayesian implementation of the additive and multiplicative model proposed by Irony et al. (1992) for analyzing discrete time series of quality data.

We can extend the HB analysis of categorical survey data to prediction in the

case of finite population sampling. As discussed in Chapter 4, the HB method is well

suited for predictive inference, since the method estimates the unsampled portion by

borrowing strength from related areas.















BIBLIOGRAPHY


Albert, J. H. (1988). Computational methods using a Bayesian hierarchical generalized lin-
ear model. J. Amer. Statist. Assoc., 83, 1037-1044.

Angers, J. F. and Berger, J. O. (1991). Robust hierarchical Bayes estimation of exchangeable means. Canadian J. Statist., 19, 39-56.

Barlow, R. E. and Irony, T. Z. (1992). Foundations of statistical quality control. Current Issues in Statistical Inference: Essays in Honor of D. Basu. IMS Lecture Notes-Monograph Series, Eds. M. Ghosh and P. K. Pathak, pp. 99-112.

Battese, G. E., Harter, R. M., and Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. J. Amer. Statist. Assoc., 83, 28-36.

Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis, 2nd Edition. Springer-Verlag, New York.

Brooks, R. J. (1984). Approximate likelihood ratio tests in the analysis of beta-binomial data. Applied Statistics, 33, 285-289.

Datta, G. S. and Ghosh, M. (1991). Bayesian prediction in linear models: Applications to small area estimation. The Annals of Statistics, 19, 1748-1770.

Datta, G. S., Ghosh, M., Huang, E. T., Isaki, C. T., Schultz, L. K. and Tsay, J. H. (1992). Hierarchical and empirical Bayes methods for adjustment of census undercount: The 1988 Missouri Dress Rehearsal data. Survey Methodology, 18, 95-108.

Datta, G. S. and Lahiri, P. (1992). Robust hierarchical Bayes estimation of small area characteristics in presence of covariates. Technical Report No. 92-28, Dept. of Statistics, University of Georgia, Athens.

Dellaportas, P. and Smith, A. F. M. (1993). Bayesian inference for generalized linear and proportional hazards models via Gibbs sampling. Applied Statistics, 42, 443-459.

Ericksen, E. P. and Kadane, J. B. (1985). Estimating the population in a census year (with discussion). J. Amer. Statist. Assoc., 80, 98-131.







Ericksen, E. P. and Kadane, J. B. (1987). Sensitivity analysis of local estimates of undercount in the 1980 U.S. Census. Small Area Statistics. Eds. R. Platek, J. N. K. Rao, C. E. Sarndal, and M. P. Singh. Wiley, New York, pp. 23-45.

Fay, R.E. and Herriot, R.A. (1979). Estimates of income for small places: an application of
James-Stein procedures to census data. J. Amer. Statist. Assoc., 74, 269-277.

Fienberg, S. E. (1992). An adjusted census in 1990? The trial. Chance, New Directions for
Stat. and Computers, 5, 28-38.

Freedman, D. A. and Navidi, W. C. (1986). Regression models for adjusting the 1980 census
(with discussion). Statistical Science, 1, 3-39.

Gaver, D. P. and O'Muircheartaigh, I. G. (1987). Robust empirical Bayes analysis of event
rates. Technometrics, 29, 1-15.

Gelfand, A. E., Hills, S. E., Racine-Poon, A. and Smith, A. F. M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Amer. Statist. Assoc., 85, 972-985.

Gelfand, A. E. and Smith, A. F. M. (1990). Sampling based approaches to calculating marginal densities. J. Amer. Statist. Assoc., 85, 398-409.

Gelfand, A. E. and Smith, A. F. M. (1991). Gibbs sampling for marginal posterior expectations. Commun. in Statist.-Theory Meth., 20(5 & 6), 1747-1766.

Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple
sequences. Statistical Science, 7, 457-511.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the
Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 6, 721-741.

George, E. I., Makov, U. E. and Smith, A. F. M. (1993). Conjugate Likelihood Distribution,
To appear in the Scandinavian Journal of Statistics.

Ghosh, M. (1992). Hierarchical and empirical Bayes multivariate estimation. Current Issues
in Statistical Inference: Essays in Honor of D. Basu. IMS Lecture Notes- Monograph
Series. Eds. M. Ghosh and P. K. Pathak. 17, pp. 151-177.

Ghosh, M. and Lahiri, P. (1987). Robust empirical Bayes estimation of means from strati-
fied samples. J. Amer. Statist. Assoc., 82, 1153-1162.







Ghosh, M. and Lahiri, P. (1992). A hierarchical Bayes approach to small area estimation with auxiliary information. Bayesian Analysis in Statistics and Econometrics, Eds. P. K. Goel and N. S. Iyengar, Lecture Notes in Statistics, 75, Springer-Verlag, New York, pp. 107-125.

Ghosh, M. and Meeden, G. (1986). Empirical Bayes estimation in finite population sampling. J. Amer. Statist. Assoc., 81, 1058-1062.

Ghosh, M. and Rao, J. N. K. (1991). Small area estimation: An appraisal. Technical Report 390, Dept. of Statistics, Univ. of Florida, Gainesville.

Gilks, W. R., Best, N. G. and Tan, K. K. C. (1993). Adaptive rejection Metropolis sampling for Gibbs sampling. Submitted for review.

Gilks, W. R. and Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics, 41, 337-348.

Good, I. J. (1965). The Estimation of Probabilities, An Essay on Modern Bayesian Methods.
M.I.T. Press, Cambridge, Massachusetts.

Hoadley, B. (1981). The Quality Measurement Plan (QMP). The Bell System Technical Journal, 60, 215-273.

Irony, T. Z., Pereira, C. A. de B., and Barlow, R. E. (1992). Bayesian Statistics, 4, Eds. J.
M. Bernado, J. O. Berger, A. P. Dawid and A. F. M. Smith. Oxford University Press,
New York, pp. 675-688.

Kass, R. E. and Steffey, D. (1989). Approximate Bayesian inference in conditionally independent hierarchical models (parametric empirical Bayes models). J. Amer. Statist. Assoc., 84, 717-726.

Kass, R. E., Tierney, L. and Kadane, J. B. (1989). Fully exponential Laplace approximations to expectations and variances of nonpositive functions. J. Amer. Statist. Assoc., 84, 710-716.

King, A. J. C., Beazley, R. P., Warren, W. K., Hankins, C. A., Robertson, A. S. and Radford, J. L. (1988). Canada Youth & AIDS Study. Federal Centre for AIDS, Ottawa.

Leonard, T. and Novick, M. R. (1986). Bayesian full rank marginalization for two-way contingency tables. J. of Educ. Stat., 11, 33-56.

Lindley, D. V. and Smith, A. F. M. (1972). Bayes estimates for the linear model. J. R. Statist. Soc. B, 34, 1-41.

McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models (2nd ed.). Chapman & Hall, London.








Morris, C. (1981). Parametric empirical Bayes confidence intervals. Scientific Inference, Data Analysis, and Robustness, Eds. G. E. P. Box, T. Leonard and C. F. Jeff Wu, Academic Press, pp. 25-50.

Morris, C. (1983). Parametric empirical Bayes inference and applications. J. Amer. Statist. Assoc., 78, 47-65.

Prasad, N.G.N., and Rao, J.N.K. (1990). The estimation of mean squared errors of small-
area estimators. J. Amer. Statist. Assoc., 85, 163-171.

Stasny, E. A. (1991). Hierarchical models for victimization and nonresponse in the National
Crime Survey. J. Amer. Statist. Assoc., 86, 296-303.

Stroud, T. W. F. (1991). Hierarchical Bayes predictive means and variances with applica-
tion to sample survey inference. Commun. in Statist.-Theory Meth., 20(1), 13-36.

Stroud, T. W. F. (1992). Bayesian inference from categorical survey data. Preprint No.
1991-3, Dept. of Mathematics and Statistics, Queen's University.

Tierney, L. and Kadane, J. B. (1986). Accurate approximations for posterior moments and
marginal densities. J. Amer. Statist. Assoc., 81, 82-86.

Zeger, S. L. and Karim, M. R. (1991). Generalized linear models with random effects; A
Gibbs sampling approach. J. Amer. Statist. Assoc., 86, 79-86.














BIOGRAPHICAL SKETCH


Kannan Natarajan was born on October 21, 1966, in Madras, Tamil Nadu, India.

He received his Bachelor of Science degree with major in statistics in 1987 and there-

after joined the Indian Statistical Institute, Calcutta, India, where he received his

Master of Statistics degree in 1989. In August 1989, he joined the doctoral program

in statistics in the University of Florida at Gainesville. He expects to receive a Ph.D.

degree in December 1993. During his time at the University of Florida, he worked as a research assistant to Dr. Malay Ghosh. He was also employed as a teaching

assistant in the Department of Statistics. Upon graduation, he will join the Division

of Clinical Statistics and Computing, Abbott Laboratories, Abbott Park.






I certify that I have read this study and that in my opinion it conforms to accept-
able standards of scholarly presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.


Malay Ghosh, chairman
Professor of Statistics


I certify that I have read this study and that in my opinion it conforms to accept-
able standards of scholarly presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.


Alan Agresti
Professor of Statistics


I certify that I have read this study and that in my opinion it conforms to accept-
able standards of scholarly presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.


Richard L. Scheaffer
Professor of Statistics


I certify that I have read this study and that in my opinion it conforms to accept-
able standards of scholarly presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.


Ramon C. Littell
Professor of Statistics


I certify that I have read this study and that in my opinion it conforms to accept-
able standards of scholarly presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.


Patrick A. Thompson
Assistant Professor of Decision
and Information Sciences







This dissertation was submitted to the Graduate Faculty of the Department of
Statistics in the College of Liberal Arts and Sciences and to the Graduate School and
was accepted as partial fulfillment of the requirements for the degree of Doctor of
Philosophy.


December 1993

Dean, Graduate School

































































