Citation
A comparison of methods for combining tests of significance

Material Information

Title:
A comparison of methods for combining tests of significance
Creator:
Louv, William C., 1952-
Publication Date:
Copyright Date:
1979
Language:
English
Physical Description:
vii, 122 leaves : ill. ; 28 cm.

Subjects

Subjects / Keywords:
Approximation ( jstor )
Binomials ( jstor )
Degrees of freedom ( jstor )
Null hypothesis ( jstor )
Probabilities ( jstor )
Random variables ( jstor )
Sample size ( jstor )
Significance level ( jstor )
Statistical discrepancies ( jstor )
Statistics ( jstor )
Dissertations, Academic -- Statistics -- UF
Statistical hypothesis testing ( lcsh )
Statistics thesis Ph. D
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis--University of Florida.
Bibliography:
Bibliography: leaves 118-121.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by William C. Louv.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
023309024 ( AlephBibNum )
06429717 ( OCLC )
AAL1950 ( NOTIS )

Full Text







A COMPARISON OF METHODS FOR COMBINING
TESTS OF SIGNIFICANCE













BY

WILLIAM C. LOUV


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL
OF THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY



UNIVERSITY OF FLORIDA


1979




















Digitized by the Internet Archive
in 2009 with funding from
University of Florida, George A. Smathers Libraries


http://www.archive.org/details/comparisonofmeth00louv















ACKNOWLEDGMENTS


I am indebted to Dr. Ramon C. Littell for his guidance and

encouragement, without which this dissertation would not have been

completed. I also wish to thank Dr. John G. Saw for his careful

proofreading and many helpful suggestions. The assistance of Dr.

Dennis D. Wackerly throughout my course of graduate study is greatly

appreciated.

My special thanks go to Dr. William Mendenhall who gave me the

opportunity to come to the University of Florida and who encouraged

me to pursue the degree of Doctor of Philosophy.











TABLE OF CONTENTS

                                                                    Page

ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . .  iii

ABSTRACT  . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   vi

CHAPTER

  I   INTRODUCTION AND LITERATURE REVIEW  . . . . . . . . . . . . .    1

      1.1  Statement of the Combination Problem . . . . . . . . . .    1
      1.2  Non-Parametric Combination Methods  . . . . . . . . . .     2
      1.3  A Comparison of Non-Parametric Methods  . . . . . . . .     5
      1.4  Parametric Combination Methods  . . . . . . . . . . . .     8
      1.5  Weighted Methods of Combination . . . . . . . . . . . .    11
      1.6  The Combination of Dependent Tests  . . . . . . . . . .    12
      1.7  The Combination of Tests Based on Discrete Data . . . .    13
      1.8  A Preview of Chapters II, III, and IV . . . . . . . . .    18

  II  BAHADUR EFFICIENCIES OF GENERAL COMBINATION METHODS . . . . .   19

      2.1  The Notion of Bahadur Efficiency  . . . . . . . . . . .    19
      2.2  The Exact Slopes for T(A) and T(P)  . . . . . . . . . .    21
      2.3  Further Results on Bahadur Efficiencies . . . . . . . .    26
      2.4  Optimality of T(F) in the Discrete Data Case  . . . . .    28

  III THE COMBINATION OF BINOMIAL EXPERIMENTS . . . . . . . . . . .   32

      3.1  Introduction  . . . . . . . . . . . . . . . . . . . . .    32
      3.2  Parametric Combination Methods  . . . . . . . . . . . .    33
      3.3  Exact Slopes of Parametric Methods  . . . . . . . . . .    37
      3.4  Approximate Slopes of Parametric Methods  . . . . . . .    44
      3.5  Powers of Combination Methods . . . . . . . . . . . . .    54
      3.6  A Synthesis of Comparisons  . . . . . . . . . . . . . .    57
      3.7  Approximation of the Null Distributions of T(F), T(LR),
           and T(ALR)  . . . . . . . . . . . . . . . . . . . . . .    79

  IV  APPLICATIONS AND FUTURE RESEARCH  . . . . . . . . . . . . . .   96

      4.1  Introduction  . . . . . . . . . . . . . . . . . . . . .    96
      4.2  Estimation: Confidence Regions Based on
           Non-parametric Combination Methods  . . . . . . . . . .    96
      4.3  The Combination of 2 x 2 Tables . . . . . . . . . . . .   110
      4.4  Testing for the Heterogeneity of Variances  . . . . . .   113
      4.5  Testing for the Difference of Means with
           Incomplete Data . . . . . . . . . . . . . . . . . . . .   115
      4.6  Asymptotic Efficiencies for k → ∞ . . . . . . . . . . .   116

BIBLIOGRAPHY  . . . . . . . . . . . . . . . . . . . . . . . . . . .  118

BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . . . . . .  122






Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy


A COMPARISON OF METHODS FOR COMBINING
TESTS OF SIGNIFICANCE

By

William C. Louv

August 1979

Chairman: Ramon C. Littell
Major Department: Statistics

Given test statistics X(1),...,X(k) for testing the null hypotheses H1,...,Hk, respectively, the combining problem is to select a function of X(1),...,X(k) to be used as an overall test of the hypothesis H = H1 ∩ H2 ∩ ... ∩ Hk. Functions based on the probability integral transformation, that is, on the significance levels attained by X(1),...,X(k), form a class of non-parametric combining methods. These methods are compared in a general setting with respect to Bahadur asymptotic relative efficiency. It is concluded that Fisher's omnibus method is at least as efficient as all other methods whether X(1),...,X(k) arise from continuous or discrete distributions.

Given a specific parametric setting, it may be possible to improve upon the non-parametric methods. The problem of combining binomial experiments is studied in detail. Parametric methods analogous to the sum of chi's procedure and the Cochran-Mantel-Haenszel procedure, as well as the likelihood ratio test and an approximate likelihood ratio test, are compared to Fisher's method. Comparisons are made with respect to Bahadur efficiency and with respect to exact power. The power






comparisons take the form of plots of contours of equal power. If

prior information concerning the nature of the unknown binomial success

probabilities is unavailable, Fisher's method is recommended. Other

methods are preferred when specific assumptions can be made concerning

the success probabilities. For instance, the Cochran-Mantel-Haenszel

procedure is optimal when the success probabilities have a common value.

Fisher's statistic has a chi-square distribution with 2k degrees

of freedom when X(1),...,X(k) are continuous. In the discrete case,

however, the exact distribution of Fisher's statistic is difficult to

obtain. Several approximate methods are compared and Lancaster's mean

chi-square approximation is recommended.

The combining problem is also approached from the standpoint

of estimation. Non-parametric methods are inverted to form k-dimensional

confidence regions. Several examples for k=2 are graphically displayed.














CHAPTER I


INTRODUCTION AND LITERATURE REVIEW



1.1 Statement of the Combination Problem


The problem of combining tests of significance has been studied

by several writers over the past fifty years. The problem is: Given test statistics X(1),...,X(k) for testing null hypotheses H1,...,Hk, respectively, to select a function of X(1),...,X(k) to be used as the combined test of the hypothesis H = H1 ∩ H2 ∩ ... ∩ Hk. In most of the work cited, the X(i) are assumed to be mutually independent, and, except where stated otherwise, that is true in this paper.

Some practical situations in which an experimenter may wish to combine tests are:

i. The data from k separate experiments, each conducted to test the same H, yield the respective test statistics X(1),...,X(k). It is desired to pool the information from the separate experiments to form a combined test of H. It would be desirable to pool the information by combining the X(i) if (a) only the X(i), instead of the raw data, are available, if (b) the information from the ith experiment is sufficiently contained in X(i), or if (c) a theoretically optimal test based on all the data is intractable.

ii. The ith of k experiments yields X(i) to test a hypothesis H_i, i = 1,...,k, and a researcher wishes to simultaneously test








the truth of H1,...,Hk. Considerations (a), (b), and (c) in the preceding paragraph again lead to the desirability of combining the X(i) as a test of H = H1 ∩ ... ∩ Hk.

iii. A simultaneous test of H = H1 ∩ ... ∩ Hk is desired, and the data from a single experiment yield X(1),...,X(k) as tests of H1,...,Hk, respectively. Combining the X(i) can provide a test of H.

In Section 1.2 several non-parametric methods of combination

are introduced. A literature review of comparisons of these procedures

is given in Section 1.3. The remainder of this chapter is primarily

a literature review of more specific aspects of the combination problem.

We make some minor extensions which are identified as such.



1.2 Non-parametric Combination Methods


Suppose that H_i is rejected for large values of X(i). Define L(i) = 1 - F_i(X(i)), where F_i is the cumulative distribution function of X(i) under H_i. If X(i) is a continuous random variable, then L(i) is uniformly distributed on (0,1) under H_i. Many of the well-known methods of combination may be expressed in terms of the L(i). Such methods considered here are:

(1) T(F) = -2 Σ ln L(i)   (Omnibus method, Fisher [13])

(2) T(N) = -Σ Φ^(-1)(L(i))   (Normal transform, Liptak [26])

(3) T(m) = -min L(i)   (Minimum significance level, Tippett [42])

(4) T(M) = -max L(i)   (Maximum significance level, Wilkinson [44])

(5) T(P) = 2 Σ ln(1 - L(i))   (Pearson [36])

(6) T(A) = -Σ L(i)   (Edgington [12]).









As the statistics are defined here, H is rejected when large values are observed. Figure 1 (page 4) shows the rejection regions for the statistics defined above when k = 2.
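In the continuous case the six statistics are elementary functions of the attained levels. The following sketch (ours, not from the text; Python's standard-library NormalDist supplies the normal quantile) computes all six for a set of levels with k = 3.

```python
import math
from statistics import NormalDist  # standard library; supplies the normal quantile

def combine(levels):
    """Compute the six combined statistics of Section 1.2 from the attained
    significance levels L(1),...,L(k); each rejects H for large values.
    (A sketch; the function name and dict layout are ours.)"""
    phi_inv = NormalDist().inv_cdf
    return {
        "T(F)": -2 * sum(math.log(L) for L in levels),      # Fisher's omnibus
        "T(N)": -sum(phi_inv(L) for L in levels),           # Liptak's normal transform
        "T(m)": -min(levels),                               # Tippett's minimum level
        "T(M)": -max(levels),                               # Wilkinson's maximum level
        "T(P)": 2 * sum(math.log(1 - L) for L in levels),   # Pearson
        "T(A)": -sum(levels),                               # Edgington's sum
    }

stats = combine([0.04, 0.10, 0.30])
# In the continuous case T(F) is referred to a chi-square on 2k = 6 degrees
# of freedom; here T(F) = -2(ln 0.04 + ln 0.10 + ln 0.30) ≈ 13.45.
```

The dictionary keys mirror the text's labels so the statistics can be compared side by side for the same data.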

In the continuous case, the null distributions of these statis-

tics are easily obtained. They are all based upon the fact that the

L are uniformly distributed under H. It is easily established that

this is true. The cumulative distribution function for L(i) is

P{L(i) ≤ ℓ} = P{1 - F(X) ≤ ℓ}
            = 1 - P{F(X) ≤ 1 - ℓ}
            = 1 - P{X ≤ F^(-1)(1 - ℓ)}
            = 1 - F(F^(-1)(1 - ℓ))
            = 1 - (1 - ℓ) = ℓ.

That T(N) has a normal distribution with mean 0 and variance k follows trivially. The statistics T(m) and T(M) are seen to be based on the order statistics of uniform random variables on (0,1) and are therefore distributed according to beta distributions.

That T(F) and -T(P) are distributed as chi-squares on 2k degrees of freedom is established as follows. The probability density function of L(i) is

f_L(ℓ) = I_(0,1)(ℓ).

Let S = -2 ln L. Then

L = e^(-S/2),   dL/dS = -(1/2) e^(-S/2).

It follows that

f_S(s) = f_L(ℓ) |dL/dS| = (1/2) e^(-s/2) I_(0,∞)(s),

the chi-square density on 2 degrees of freedom, so each statistic is a sum of k independent chi-squares on 2 degrees of freedom.

Edgington's statistic, T(A), is a sum of uniform random variables.

As shown by Edgington, significance levels can be established for






[Figure 1. Rejection Regions in Terms of the Significance Levels for k=2: the rejection regions of T(F), T(N), T(m), T(M), T(P), and T(A) in the (L(1), L(2)) plane. Graphic not reproduced.]







values of T(A) on the basis of the following equation [12]:

P{T(A) ≥ -t} = t^k/k! - C(k,1)(t-1)^k/k! + C(k,2)(t-2)^k/k! - C(k,3)(t-3)^k/k! + ... ± C(k,S)(t-S)^k/k!,   (1.1)

where C(k,j) denotes a binomial coefficient and S is the largest integer less than t, so that terms with t - j ≤ 0 are dropped.


1.3 A Comparison of Non-parametric Methods


The general non-parametric methods of combination are rules prescribing that H should be rejected for certain values of (L(1), L(2),..., L(k)). Several basic theoretical results for non-parametric methods of combination are due to Birnbaum [7]. Some of these results are summarized in the following paragraphs.

Under H_i, L(i) is distributed uniformly on (0,1) in the continuous case. When H_i is not true, L(i) is distributed according to a non-increasing density function on (0,1), say g_i(L(i)), if X(i) has a distribution belonging to the exponential family. Some overall alternative hypotheses that may be considered are:

HA: One or more of the L(i)'s have non-uniform densities g_i(L(i)).

HB: All of the L(i)'s have the same non-uniform density g(L).

HC: One of the L(i)'s has a non-uniform density g_i(L(i)).

HA is the appropriate alternative hypothesis in most cases where prior knowledge of the alternative densities g_i(L(i)) is unavailable [7].

The following condition is satisfied by all of the methods introduced in Section 1.2.

Condition 1: If H is rejected for any given set of L(i)'s, then it will also be rejected for all sets of L(i)*'s such that L(i)* ≤ L(i) for each i [7].








It can be shown that the best test of H versus any particular alternative in HA must satisfy Condition 1. It seems reasonable, therefore, that any method not satisfying Condition 1 can be eliminated from consideration [7].

In the present context, Condition 1 does little to restrict the class of methods from which to choose. In fact, "for each non-parametric method of combination satisfying Condition 1, we can find some alternative H represented by non-increasing functions g_1(L(1)),...,g_k(L(k)) against which that method of combination gives a best test of H" [7].

It should be noted that direct comparison of general combining methods with respect to power is difficult in typical contexts. The precise distributions of the g_i(L(i)) under the alternative hypothesis are intractable except in very special cases.
When the X(i) have distributions belonging to the one-parameter exponential family, the overall null hypothesis can be written H: θ(1) = θ0(1),..., θ(k) = θ0(k). Rejection of H is based upon (X(1),...,X(k)). It is reasonable to reject the use of inadmissible tests. A test is inadmissible if there exists another test which is at least as powerful for all alternatives and more powerful for at least one alternative. Birnbaum proves that a necessary condition for the admissibility of a test is convexity of the acceptance region in the (X(1),...,X(k)) hyperplane. For X(i) with distributions in the exponential family, T(P) and T(M) do not have convex acceptance regions and are therefore inadmissible [7].

Although Birnbaum does not consider Edgington's method, it is clear that T(A) must also be inadmissible. For instance, for k=2, consider the points (0,c), (c,0), and (c/2,c/2) in the (L(1),L(2)) plane which fall on the boundary of the acceptance region of T(A). The points in the (X(1),X(2)) plane corresponding to (0,c) and (c,0) would fall on the axes at infinity. The point corresponding to (c/2,c/2) certainly falls interior to the boundaries described by the points corresponding to (c,0) and (0,c). The acceptance region cannot, therefore, be convex, and hence T(A) is inadmissible. This argument is virtually the same as that used by Birnbaum to establish the inadmissibility of T(P) and T(M).

For a given inadmissible test it is not known how to find a particular test which dominates it. Birnbaum, however, argues that the choice of which test to use should be restricted to admissible tests. The choice of a test from the class of admissible tests is then contingent upon which test has more power against alternatives of interest [7].

In summary of Birnbaum's observations, since T(P) and T(M) do not in general form convex acceptance regions in the (X(1),...,X(k)) hyperplane, they are not in general admissible and can be eliminated as viable methods. We can extend Birnbaum's reasoning to reach the same conclusion about T(A). By inspecting the acceptance regions formed by the various methods, Birnbaum also observes that T(m) is more sensitive than T(F) to HC (departure from H by exactly one parameter). The test T(F), however, has better overall sensitivity to HA [7].

Littell and Folks have carried out comparisons of general non-

parametric methods with respect to exact Bahadur asymptotic relative

efficiency. A detailed account of the notion of Bahadur efficiencies

is deferred to Section 2.1.








In their first investigation [26], Littell and Folks compare T(F), T(N), T(M), and T(m). The actual values of the efficiencies are given in Section 2.3. The authors show that T(F) is superior to the other three procedures according to this criterion. They also observe that the relative efficiency of T(m) is consistent with Birnbaum's observation that T(m) performs well versus HC.

Further, Littell and Folks show that T(F), with some restrictions on the parameter space, is optimal among all tests based on the X(i) as long as the X(i) are themselves optimal. This result is extended in a subsequent paper [28] by showing that T(F) is at least as efficient as any other combination procedure. The only condition necessary for this extension is equivalent to Birnbaum's Condition 1. A formal statement of this result is given in Section 2.3.



1.4 Parametric Combination Methods


The evidence thus far points strongly to T(F) as the choice among general non-parametric combination procedures when prior knowledge of the alternative space is unavailable. When the distributions of the X(i) belong to some parametric family, or when the alternative parameter space can be characterized, it may be possible that T(F) and the other general non-parametric methods can be improved upon. A summary of such investigations follows.

Oosterhoff [33] considers the combination of k normally distributed random variables with known variances and unknown means μ1, μ2,..., μk. The null hypothesis tested is H: μ1 = μ2 = ... = μk = 0 versus the alternative HA: μi ≥ 0, with strict inequality for at least









one i. He observed that many combination problems reduce to this situation asymptotically. The difference in power between a particular test and the optimal test for a given (μ1, μ2,..., μk) is called the shortcoming. Oosterhoff proves that the shortcomings of T(F) and the maximum likelihood test go to zero for all (μ1,...,μk) as the overall significance level tends to zero. The maximum shortcoming of the likelihood ratio test is shown to be smaller than the maximum shortcoming of T(F).

Oosterhoff derives a most stringent Bayes test with respect to a least favorable prior. According to numerical comparisons (again with respect to shortcomings), the most stringent test performs similarly to the likelihood ratio test. The likelihood ratio test is much easier to implement than the most stringent test and is therefore preferable. Fisher's statistic, T(F), is seen to be slightly more powerful than the likelihood ratio test when the means are similar; the opposite is true when the means are dissimilar. A simple summing of the normal variates performs better than all other methods when the means are very similar [33].

Koziol and Perlman [20] study the combination of chi-square variates X(i) ~ χ²_{νi}(δi). The hypothesis test considered is H: δ1 = ... = δk = 0 vs HA: δi ≥ 0 (strict inequality for at least one i), where the δi are non-centrality parameters and the νi are the respective degrees of freedom. An earlier Monte Carlo study by Bhattacharya [6] also addressed this problem and compared the statistics T(F), T(m), and ΣX(i). Bhattacharya concluded that ΣX(i) and T(F) were almost equally powerful and that both of these methods clearly dominated T(m). Koziol and Perlman endeavor to establish the power of T(F) and









ΣX(i) in some absolute sense. To do this, they compare T(F) and ΣX(i) to Bayes procedures, since Bayes procedures are admissible and have good power in an absolute sense [20].

When the νi are equal, ΣX(i) is Bayes with respect to priors giving high probability to points (δ1,...,δk) central to the parameter space (Type B alternatives). The test Σ exp{X(i)/2} is Bayes with respect to priors which assign high probability to the extremes of the parameter space (Type C alternatives). For unequal νi's the Bayes tests have slightly altered forms. The Bayes procedures are compared to T(F), T(m), and T(N) for k=2 for various values of (ν1, ν2) via numerical tabulations and via the calculation of power contours.

The statistic T(m) is seen to have better power than the other tests for Type C alternatives but performs rather poorly in other situations. The Bayes test performs comparably to T(m) for Type C alternatives and is much more sensitive to Type B alternatives than T(m). The statistic T(N) is relatively powerful over only a small region at the center of the parameter space, and is seen to be dominated by some other procedure for each value of k investigated. The statistics T(F) and ΣX(i) are good overall procedures, with T(F) more sensitive to Type C alternatives and ΣX(i) more sensitive to Type B alternatives. For ν ≥ 2, T(F) is more sensitive to Type B alternatives than ΣX(i) is to Type C alternatives, and T(F) is therefore recommended. The opposite is true for ν = 1. These observations were supported for k > 2 through Monte Carlo simulations.

Koziol and Perlman also consider the maximum shortcomings of the tests. In the context of no prior information, they show that T(F) minimizes the maximum shortcoming for νi ≥ 2 while ΣX(i) minimizes the maximum shortcoming for νi = 1. An additional statistic can be considered when νi = 1: T(X) = Σ(X(i))^(1/2), the sum of chi's procedure. For k=2, T(X) is powerful only for a small region in the center of the parameter space. For large k, the performance of T(X) becomes progressively worse. It can be said that T(X) performs similarly to T(N).



1.5 Weighted Methods of Combination


Good [14] suggests a weighted version of Fisher's statistic, T(G) = -Σ λ_i ln L(i). He showed that, if the λ_i are all different, significance probabilities can be found from the relationship

P{T(G) > x} = Σ_{r=1}^{k} A_r exp(-x/λ_r),

where

A_r = λ_r^(k-1) / [(λ_r - λ_1)(λ_r - λ_2)...(λ_r - λ_{r-1})(λ_r - λ_{r+1})...(λ_r - λ_k)].
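Good's identity is easy to check numerically. The sketch below (ours, with an arbitrary weight vector) evaluates it and compares it with simulation under H, where each L(i) is Uniform(0,1).

```python
import math, random

def good_tail(x, lam):
    """P{T(G) > x} for T(G) = -sum_i lam_i * ln L(i), all lam_i distinct:
    sum_r A_r * exp(-x / lam_r), with
    A_r = lam_r**(k-1) / prod over s != r of (lam_r - lam_s)."""
    k = len(lam)
    total = 0.0
    for r in range(k):
        A_r = lam[r] ** (k - 1)
        for s in range(k):
            if s != r:
                A_r /= lam[r] - lam[s]
        total += A_r * math.exp(-x / lam[r])
    return total

# Under H each L(i) is Uniform(0,1); simulate T(G) and compare tails.
random.seed(2)
lam = [1.0, 2.0, 3.5]
emp = sum(-sum(l * math.log(random.random()) for l in lam) > 8.0
          for _ in range(100_000)) / 100_000
# good_tail(0, lam) = 1, and good_tail(8.0, lam) agrees with emp to
# roughly two decimal places.
```

At x = 0 the coefficients A_r must sum to one, which provides a quick sanity check on the partial-fraction weights.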

Zelen [45] illustrates the use of T(G) in the analysis of

incomplete block designs. In such designs, it is often possible to

perform two independent analyses of the data. The usual analysis

(intrablock analysis) depends only on comparisons within blocks. The

second analysis (interblock analysis) makes use of the block totals

only. Zelen defines independent F-ratios corresponding to the two

types of analysis. The attained significance level corresponding to

the interblock analysis is weighted according to the interblock effi-

ciency which is a function of the estimated block and error variances.









A similar example is given by Pape [34]. Pape extends Zelen's method to the more general context of a multi-way completely random design.

Koziol and Perlman [20] also considered weighted methods for the problem of combining independent chi-squares. They conclude that when prior information about the non-centrality parameters is available, increased power can be achieved at the appropriate alternative by a weighted version of the sum test, Σ b_i X(i), if νi > 2 for all i, and by the weighted Fisher statistic, T(G), when νi ≤ 2 for all i.



1.6 The Combination of Dependent Tests


The combinations considered up to this point have been based on mutually independent L(i) arising from mutually independent statistics X(i). As previously indicated, in such cases the functions of the L(i) which comprise the general methods have null distributions which are easily obtained. When the X(i) (and thus the L(i)) are not independent, the null distributions are not tractable in typical cases.

Brown [9] considers a particular example of the problem of combining dependent statistics. The statistics to be combined are assumed to have a joint multivariate normal distribution with known covariance matrix Σ and unknown mean vector (μ1, μ2,..., μk)'. The hypothesis test of interest is H: μi = μi0 versus HA: μi ≥ μi0 (strict inequality for at least one i). A likelihood ratio test can be derived [31], but obtaining significance values from this approach is difficult.

Brown bases his solution on T(F). The null distribution of T(F) is not chi-square on 2k degrees of freedom in this case. The mean of T(F) is 2k, as in the independent case. The variance has covariance terms which Brown approximates. The approximation is expressed as a function of the correlations between the normal variates. These first two moments are equated to the first two moments of a gamma distribution. The resultant gamma distribution is used to obtain approximate significance levels.



1.7 The Combination of Tests Based on Discrete Data


As noted in previous sections, the literature tends to support T(F) as a non-parametric combining method in the general, continuous data framework. Those authors who have addressed the problem of combining discrete statistics have utilized T(F) assuming that the optimality properties established in the continuous case are applicable.

The problem then becomes one of determining significance probabilities, since T(F) is no longer distributed as a chi-square on 2k degrees of freedom. We describe the problem as follows. Suppose L(i)* derives from a discontinuous statistic, X(i)*, and that a and b are possible values of L(i)*, 0 ≤ a < b ≤ 1, such that a < L(i)* < b is impossible. For a < ℓ < b, P{L(i)* ≤ ℓ} = P{L(i)* ≤ a} = a. If L(i) derives from a continuous statistic, X(i), then P{L(i) ≤ ℓ} = ℓ. Since a < ℓ, L(i)* is stochastically larger than L(i). It follows that Fisher's statistic is stochastically smaller in the discrete case than in the continuous case. The ultimate result is that if T(F) is compared to a chi-square distribution with 2k degrees of freedom when the data are discrete, the null hypothesis will be rejected with too low a probability.
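A quick Monte Carlo illustration of this conservativeness (ours, not from the text): combine k = 2 one-sided binomial(5, 1/2) tests and refer T(F) to a chi-square on 4 degrees of freedom. The true size falls far below the nominal 0.05.

```python
import math, random

random.seed(3)

def binom_tail(x, n=5, p=0.5):
    """Attained level L = P{X >= x} of a one-sided binomial test."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(x, n + 1))

def chi2_4_tail(t):
    """Upper tail of a chi-square on 4 df (closed form for even df)."""
    return (1 + t / 2) * math.exp(-t / 2)

trials, reject = 100_000, 0
for _ in range(trials):
    tf = -2 * sum(math.log(binom_tail(sum(random.random() < 0.5 for _ in range(5))))
                  for _ in range(2))
    reject += chi2_4_tail(tf) <= 0.05
rate = reject / trials
# rate comes out near 0.013, well below the nominal 0.05.
```

With only six attainable levels per test, the discreteness is severe; the gap shrinks as the number of attainable levels grows.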









When k, the number of statistics to be combined, is small and the numbers of attainable levels (n1, n2,..., nk) of the discrete statistics are small, the exact distribution of T(F) can be determined. Wallis [43] gives algorithms to generate null distributions when all of the X(i) are discrete and when one X(i) is discrete. Generating null distributions via Wallis' algorithms becomes intractable very quickly as k and the number of attainable levels of the X(i) increase. The generation of complete null distributions is even beyond the capability of usual computer storage limitations in experiments of modest size. The significance level attained by a particular value of T(F) can be obtained for virtually any situation with a computer, however. A transformation of T(F) which can be referred to standard tables is indicated.

A method suggested by Pearson [37] involves the addition, by a separate random experiment, of a continuous variable to the original discrete variable, thus yielding a continuous variable. Suppose X(i) can take on values 0, 1, 2,..., ni with probabilities p0, p1,..., p_{ni}. Let P_j(i) = Σ_{x=j}^{ni} p_x. Note that P_{ni}(i) ≤ P_{ni-1}(i) ≤ ... ≤ P_1(i) ≤ P_0(i) = 1. If the null hypothesis is rejected for large values of X(i), then the P_j(i), j = 0, 1, 2,..., ni, are the observable significance levels for the ith test; i.e., the observable values of the random variable L(i) = 1 - F_i(X(i) - 1) under the null hypothesis. Denote by U(i), i = 1, 2,..., k, mutually independent uniform random variables on (0,1). Pearson's statistic is defined as

L_P(i)(X(i), U(i)) = L(i)(X(i)) - U(i) P{X(i)}.









We now establish that L_P(i)(X(i), U(i)) is uniformly distributed on (0,1) if and only if X(i) and U(i) are independent and U(i) is uniformly distributed on (0,1). Omitting the superscripts for convenience, define the random variable L_P(X,U) by L_P = L(X) - U P{X}, where 0 < U < 1. It follows that

P{X = x, U ≤ u} = P{L(x) - u P(x) < L_P ≤ L(x)}
                = u P(x)
                = P{X = x} P{U ≤ u}.

The statistic -2 Σ ln L_P(i) thus has an exact chi-square distribution with 2k degrees of freedom. Exact significance levels can be

determined for any combination problem. The concept of randomization

to obtain statistics with exact distributions has been debated by

statisticians. That a decision may depend on an extraneous source of

variation seems to violate some common sense principle. Pearson [37]

argues, however, that his randomization scheme is no more an extran-

eous source of variation than is the a priori random assignment of

treatments to experimental units.
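The construction is easy to verify empirically. The sketch below (ours, not from the text) draws Pearson's randomized level L_P for a binomial(4, 1/2) statistic and checks that it is uniform on (0,1).

```python
import math, random

random.seed(4)
n, p = 4, 0.5
pmf = [math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
tail = [sum(pmf[x:]) for x in range(n + 1)]   # L(x) = P{X >= x}

def L_pearson():
    """One draw of Pearson's randomized level L_P = L(X) - U * P{X = x}."""
    x = random.choices(range(n + 1), weights=pmf)[0]
    return tail[x] - random.random() * pmf[x]

draws = sorted(L_pearson() for _ in range(50_000))
# The empirical distribution of the draws is uniform on (0,1), so
# -2 * sum of ln L_P over k independent experiments is exactly
# chi-square on 2k degrees of freedom.
```

Conditional on X = x, L_P sweeps uniformly over the gap (L(x+1), L(x)), and the gaps tile (0,1); that is the whole argument in miniature.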

Lancaster [21] considers a pair of approximations to T(F). Although Lancaster does not consider Pearson's method, the statistics he introduces can be expressed in terms of the L_P(i) [19]:

i. Mean chi-square (χ²_m):

E(-2 ln L_P(i)) = ∫₀¹ (-2 ln L_P(i)) du

= 2 + 2{L(i)(X+1) ln L(i)(X+1) - L(i)(X) ln L(i)(X)}/P(X).   (1.3)

ii. Median chi-square (χ'²_m):

Median(-2 ln L_P(i)) = -2 ln{(1/2)(L(i)(X) + L(i)(X+1))}   if L(i)(X+1) ≠ 0

= 2 - 2 ln L(i)(X)   if L(i)(X+1) = 0.

The expectation of χ²_m is 2. The variance of χ²_m is slightly less than 4. The median chi-square is introduced because of its ease of calculation; with the ready availability of pocket calculators with ln functions, this justification no longer seems valid. The expectation of χ'²_m is less than 2. The alternate definition of χ'²_m for L(i)(X+1) = 0 is intended to reduce the bias (without increasing the difficulty of calculation) [21].
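Equation (1.3) gives the conditional mean of -2 ln L_P given the observed value of X. A small sketch (ours, with a binomial example of our choosing):

```python
import math

def mean_chi_sq(L_x, L_x1):
    """Lancaster's mean chi-square, equation (1.3): the conditional
    expectation of -2 ln L_P given X, with L_x = L(X), L_x1 = L(X+1),
    and P(X) = L_x - L_x1.  (x ln x -> 0 as x -> 0 handles L_x1 = 0.)"""
    P = L_x - L_x1
    t = L_x1 * math.log(L_x1) if L_x1 > 0 else 0.0
    return 2.0 + 2.0 * (t - L_x * math.log(L_x)) / P

# Observing X = 4 successes in a binomial(5, 1/2) experiment:
# L(4) = 6/32 and L(5) = 1/32.
stat = mean_chi_sq(6 / 32, 1 / 32)   # about 4.631
```

Averaging the statistic over the full null distribution of X telescopes back to 2, consistent with the text's claim that χ²_m has expectation 2.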

David and Johnson [10] undertake a theoretical investigation of the distribution of χ'²_m. They prove that as n, the number of attainable levels of X(i) (and hence of L(i)), increases without limit, the moments of χ'²_m converge to those of a chi-square distribution on 2 degrees of freedom. We obtain a similar result for χ²_m by adapting David and Johnson's proof for χ'²_m (superscripts are omitted for convenience). From the definition of χ²_m given in equation (1.3), it follows that

lim E(χ²_m)^b = lim E[∫₀¹ (-2 ln L_P) du]^b,

where the limit is taken as n → ∞ with P(x_j) → 0, j = 1, 2,..., n. Then,

lim E(χ²_m)^b = lim Σ_{j=1}^{n} P(x_j)[∫₀¹ (-2 ln L_P) du]^b

= (-2)^b lim Σ_{j=1}^{n} P(x_j)[∫₀¹ ln(L(x_j) - u P(x_j)) du]^b

= (-2)^b lim Σ_{j=1}^{n} P(x_j)[∫₀¹ {(L(x_j) - u P(x_j) - 1) - (1/2)(L(x_j) - u P(x_j) - 1)² + ...} du]^b.

(Note: ln a = (a-1) - (1/2)(a-1)² + (1/3)(a-1)³ - ....)

Since all of the terms in the expansion of [∫₀¹ ln(L(x_j) - u P(x_j)) du]^b are multiplied by P(x_j), the terms in u P(x_j) give rise only to second-order terms in P(x_j), which can be ignored in the limit. The limit thus reduces to

(-2)^b lim Σ_{j=1}^{n} P(x_j)[ln L(x_j)]^b = (-2)^b ∫₀¹ [ln L(x)]^b dL(x).

Letting Y = -ln L(x) yields

(-2)^b (-1)^b ∫₀^∞ Y^b e^(-Y) dY = 2^b b!,

which is the bth moment of a chi-square distribution with 2 degrees of freedom.

The convergence of moments does not in general imply convergence in distribution. However, if there is at most one distribution function F such that lim ∫ x^b dF_n = ∫ x^b dF for every b, then F_n → F in distribution [8]. Since the chi-square distribution is uniquely determined by its moments, it follows that χ²_m → χ²₂ in distribution.










1.8 A Preview of Chapters II, III, and IV


In Section 2.1, the notion of Bahadur asymptotic relative efficiency is introduced. In Section 2.2, we derive the Bahadur exact slopes for T(A) and T(P). The results due to Littell and Folks mentioned in Section 1.3 are summarized in detail in Section 2.3. In Section 2.4, we extend the optimality property for T(F) given by Littell and Folks to the discrete data case.

Chapter III deals with a particular combination problem: the combination of mutually independent binomial experiments. Fisher's method, T(F), is compared to several methods which are based directly on the X(i) (rather than the L(i)). Comparisons are made via both approximate and exact slopes in Sections 3.3 and 3.4. The tests are also compared by exact power studies. These results are given in Section 3.5. A summary of the tests' relative performances follows. Recommendations as to the appropriate use of the methods are given.

Section 1.7 described some proposed approximations to the null distribution of T(F). In Section 3.7, these methods are shown to be less reliable than might be expected. Alternative approaches are also evaluated.

In Section 4.2, the combination problem is approached from the standpoint of estimation. Confidence regions based upon the non-parametric combination methods are derived. The remainder of Chapter IV introduces future research problems which are related to the general combination problem.















CHAPTER II


BAHADUR EFFICIENCIES OF GENERAL COMBINATION METHODS



2.1 The Notion of Bahadur Efficiency


Due to the intractability of exact distribution theory, it is often advantageous to consider an asymptotic comparison of two competing test statistics. In typical cases, the significance level attained by each test statistic will converge to zero at an exponential rate as the sample size increases without bound. The idea of Bahadur asymptotic relative efficiency is to compare the rates at which the attained significance levels converge to zero when the null hypothesis is not true. The test statistic which yields the faster rate of convergence is deemed superior. A more detailed definition follows.

Denote by (Y_1, Y_2, ...) an infinite sequence of independent observations of a random variable Y, whose probability distribution P_θ depends on a parameter θ ∈ Θ. Let H be the null hypothesis H: θ ∈ Θ_0 and let A be the alternative A: θ ∈ Θ − Θ_0. For n = 1, 2, ..., let X_n be a real valued test statistic which depends only on the first n observations Y_1, ..., Y_n. Assume that the probability distribution of X_n is the same for all θ ∈ Θ_0. Define the significance level attained by X_n by L_n = 1 − F_n(X_n), where F_n(x) = P_0{X_n < x}. Let {X_n} = {X_1, X_2, ..., X_n, ...}








denote an infinite sequence of test statistics. In typical cases, there

exists a positive valued function c(θ), called the exact slope of {X_n}, such that for θ ∈ Θ − Θ_0,

(−2/n) ln L_n → c(θ)

with probability one [θ]. That is,

P_θ{ lim_{n→∞} (−2/n) ln L_n = c(θ) } = 1.

If {X_n^(1)} and {X_n^(2)} have exact slopes c_1(θ) and c_2(θ), respectively, then the ratio φ_12(θ) = c_1(θ)/c_2(θ) is the exact Bahadur efficiency of {X_n^(1)} relative to {X_n^(2)}.

An alternative interpretation of φ_12(θ) is that it gives the limiting ratio of sample sizes required by the two test statistics to attain equally small significance levels. That is, if for ε > 0, N^(i)(ε) is the smallest sample size such that L_n^(i) < ε for all sample sizes n ≥ N^(i)(ε), i = 1, 2, then as ε tends to zero [3],

lim_{ε→0} N^(2)(ε)/N^(1)(ε) = φ_12(θ).
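To make the definition concrete, the following Monte Carlo sketch (not from the dissertation) uses the one-sample normal mean test, where T_n = √n ȳ and the exact slope is the standard textbook value c(θ) = θ²; the quantity −(2/n) ln L_n can be seen settling near that value. The sample sizes and seed are arbitrary choices.

```python
import math, random

def attained_level(ys):
    # L_n = 1 - Phi(sqrt(n) * ybar) for testing H: theta = 0 against theta > 0
    n = len(ys)
    z = math.sqrt(n) * (sum(ys) / n)
    return 0.5 * math.erfc(z / math.sqrt(2))

random.seed(7)
theta = 0.5                      # true mean; the exact slope is theta**2 = 0.25
for n in (100, 400, 1600):
    ys = [random.gauss(theta, 1) for _ in range(n)]
    print(n, -2 / n * math.log(attained_level(ys)))
```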

The following theorem, due to Bahadur [3], gives a method for calculating exact slopes. A proof is given by I. R. Savage [40].

Theorem 1. Suppose {T_n} is a sequence of test statistics which satisfies the following two properties:

1. There exists a function b(θ), 0 < b(θ) < ∞, such that T_n/√n → b(θ) with probability one [θ].

2. There exists a function f(t), 0 < f(t) < ∞, continuous in some open set containing the range of b(θ), such that for each t in the open set, (−1/n) ln[1 − F_n(√n t)] → f(t), where F_n(t) = P_0(T_n < t).

Then the exact slope of {T_n} is c(θ) = 2f(b(θ)).










2.2 The Exact Slopes for T^(A) and T^(P)


In terms of the above discussion, the general combination problem can be defined as follows. There are k sequences {X_{n_1}^(1)}, ..., {X_{n_k}^(k)} of statistics for testing H: θ ∈ Θ_0. For all sample sizes n_1, ..., n_k the statistics are independently distributed. Let L_{n_i}^(i) be the level attained by X_{n_i}^(i), i = 1, 2, ..., k. Assume that {X_{n_i}^(i)} has exact slope c_i(θ); that is,

(−2/n_i) ln L_{n_i}^(i) → c_i(θ)  as n_i → ∞

with probability one [θ]. Assume also that the sample sizes n_1, ..., n_k satisfy n_1 + ... + n_k = nk and lim_{n→∞} n_i/n = λ_i, i = 1, ..., k. Then λ_1 + ... + λ_k = k and

(−2/n) ln L_{n_i}^(i) → λ_i c_i(θ)  as n → ∞

with probability one [θ]. As defined here, n can be thought of as the average sample size of the k tests.



Two general combining methods introduced in Section 1.2 are T^(A), Edgington's additive method, and T^(P), Pearson's method. Derivations of the exact slopes of T^(A) and T^(P) follow.








Proposition 1. Let T_n^(A) = (−2/√n) ln Σ_{i=1}^k L_{n_i}^(i). The exact slope of T^(A) is c_A(θ) = k min_i (λ_i c_i(θ)).

Proof: This proof requires the above definition of T_n^(A), although T^(A) = Σ_i L_{n_i}^(i) is a more obvious definition and is the form given in Section 1.2 where the non-parametric methods are introduced. Nothing is lost, however, since equivalent statistics yield identical exact slopes. Proposition 1 is proved by using Theorem 1 (Bahadur-Savage). The first step is to establish b(θ) of Part 1 of Theorem 1. To accomplish this, first suppose that

λ_1 c_1(θ) = min_i {λ_i c_i(θ)}.

It follows that for all ε > 0, there exists N = N(ε) such that for n > N,

(−2/n) ln L_{n_1}^(1) ≤ (−2/n) ln L_{n_i}^(i) + ε,  i = 1, 2, ..., k,

with probability one [θ]. Then, for n > N,

L_{n_1}^(1) ≥ L_{n_i}^(i) e^{−nε/2},  i = 1, 2, ..., k,

with probability one [θ]. It follows that, for n > N,

L_{n_1}^(1) ≤ Σ_{i=1}^k L_{n_i}^(i) ≤ k L_{n_1}^(1) e^{nε/2}

with probability one [θ]. Thus, for n > N,

(−2/n) ln L_{n_1}^(1) ≥ (−2/n) ln Σ_i L_{n_i}^(i) ≥ (−2/n) ln L_{n_1}^(1) − (2/n) ln k − ε

with probability one [θ]. Since (2/n) ln k → 0 and ε is arbitrary, it follows that, as n tends to infinity,

(−2/n) ln Σ_i L_{n_i}^(i) → λ_1 c_1(θ)

with probability one [θ]. Thus,

T_n^(A)/√n → λ_1 c_1(θ)

with probability one [θ]. The choice of λ_1 c_1(θ) as the minimum was arbitrary. Hence,

T_n^(A)/√n → min_i λ_i c_i(θ),

giving b(θ) of Part 1 of Theorem 1. Now, as n tends to infinity,

lim (−1/n) ln[1 − F_n(√n t)]

  = lim (−1/n) ln P{(−2/√n) ln Σ_i L_{n_i}^(i) ≥ √n t}

  = lim (−1/n) ln P{Σ_i L_{n_i}^(i) ≤ e^{−nt/2}}

  = lim (−1/n) ln[exp(−ntk/2)/k!].

The last equality is true since t is positive and therefore exp(−nt/2) is less than one: under H the L_{n_i}^(i) are independent uniform (0,1) random variables, and for 0 < x ≤ 1 the probability that the sum of k independent uniforms is at most x is x^k/k!. Hence,

lim (−1/n) ln[1 − F_n(√n t)] = lim [tk/2 + (1/n) ln k!] = tk/2.

This gives f(t) of Part 2 of Theorem 1. Thus, from Theorem 1,

c_A(θ) = 2f(b(θ)) = k min_i λ_i c_i(θ).
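For reference, the combining statistics discussed here can be computed from a set of attained levels as follows. This is an illustrative sketch using the simple Section 1.2 forms (sums rather than the −(2/√n) ln transformations used in the proofs); the levels are made up.

```python
import math

def fisher(levels):
    # T^(F): -2 * sum ln L_i; chi-square on 2k d.f. under H (continuous case)
    return -2 * sum(math.log(L) for L in levels)

def edgington(levels):
    # T^(A) in its obvious form: the sum of the attained levels (small => reject)
    return sum(levels)

def pearson(levels):
    # T^(P) combines -sum ln(1 - L_i) (small => reject)
    return -sum(math.log1p(-L) for L in levels)

levels = [0.01, 0.20, 0.12]          # hypothetical attained levels
print(fisher(levels), edgington(levels), pearson(levels))
```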


Proposition 2. Let T_n^(P) = (−2/√n) ln[−Σ_i ln(1 − L_{n_i}^(i))]. The exact slope of T^(P) is c_P(θ) = k min_i λ_i c_i(θ).

The form of Pearson's statistic given in this proposition is equivalent to the form given in Section 1.2. The proof of Proposition 2 entails use of the Bahadur-Savage Theorem (Theorem 1). The derivation of b(θ) of Part 1 of the theorem parallels the derivation of b(θ) in Proposition 1. In order to establish f(t) for Part 2 of the theorem, a result due to Killeen, Hettmansperger, and Sievers [18] is required. They show that under broad conditions,

(1/n) ln f_n(√n t) − (1/n) ln P{X_n ≥ √n t} = o(1)   (2.1)

as n → ∞, where f_n(t) is the density function of X_n.

Proof of Proposition 2. By a first-order Taylor series approximation,

−Σ_i ln(1 − L_{n_i}^(i)) = Σ_i L_{n_i}^(i) + Σ_i ε_{n_i} |L_{n_i}^(i)|,

where ε_{n_i} → 0 as L_{n_i}^(i) → 0 for all i. It follows that as n tends to infinity,

lim T_n^(P)/√n = lim (−2/n) ln[−Σ_i ln(1 − L_{n_i}^(i))]

  = lim (−2/n) ln[Σ_i L_{n_i}^(i)]

  = min_i λ_i c_i(θ).

The last step is shown in Proposition 1. Now, to derive f(t) of Part 2 of Theorem 1 via equation (2.1), note that the U_{n_i}^(i) = 1 − L_{n_i}^(i) are mutually independent uniform random variables on (0,1) under H. Letting V_{n_i} = −ln U_{n_i}^(i), it follows that

f_{V_{n_i}}(v) = e^{−v},  0 < v < ∞.

Letting W_n = Σ_i V_{n_i} = −Σ_i ln U_{n_i}^(i), it follows directly that

f_{W_n}(w) = (1/Γ(k)) w^{k−1} e^{−w},

since W_n is a gamma random variable with parameters (k, 1). Now, letting Y_n = −ln W_n,

f_{Y_n}(y) = (1/Γ(k)) exp{−ky − e^{−y}}.   (2.2)

It follows that, as n tends to infinity,

lim (−1/n) ln P{T_n^(P) ≥ √n t}

  = lim (−1/n) ln P{(−2/√n) ln W_n ≥ √n t}

  = lim (−1/n) ln P{Y_n ≥ nt/2}

  = lim (−1/n) ln f_{Y_n}(nt/2)   (2.3)

by the result given in equation (2.1). Substituting (2.2) into (2.3), we find that

lim (−1/n) ln[(1/Γ(k)) exp{−k(nt/2) − exp(−nt/2)}]

  = lim [kt/2 + (1/n) exp(−nt/2) + (1/n) ln Γ(k)]

  = kt/2.

Applying the result of Theorem 1,

c_P(θ) = 2f(b(θ)) = k min_i λ_i c_i(θ).



2.3 Further Results on Bahadur Efficiencies


Bahadur exact slopes are derived in Littell and Folks [26] for other general combining methods. Their results are summarized below.

Test          Exact Slope

T^(F)         c_F(θ) = Σ_i λ_i c_i(θ)

T^(N)         c_N(θ) = (1/k)[Σ_i (λ_i c_i(θ))^{1/2}]²

T^(A), T^(P)  c_A(θ) = c_P(θ) = k min_i λ_i c_i(θ)

T^(m)         c_m(θ) = max_i λ_i c_i(θ)

Thus T^(A) and T^(P) have the same exact slope, denoted c_M below. The relationship among these quantities is displayed in Figure 2.











[Figure 2 is a diagram of the (λ_1c_1, λ_2c_2) plane for k = 2, partitioned by the lines λ_2c_2 = λ_1c_1, λ_2c_2 = 2λ_1c_1, and λ_2c_2 = (1 + √2)²λ_1c_1; each region is labeled with the resulting ordering of the slopes c_M, c_m, and c_F.]

Figure 2. Relative Sizes of Exact Slopes for k = 2.










The optimality property of T^(F) established by Littell and Folks [28] is mentioned in Section 1.3. The detailed result is given here as a theorem.

Theorem 2. Let T_n be any function of T_{n_1}^(1), ..., T_{n_k}^(k) which is non-decreasing in each of the T_{n_i}^(i); that is, t_1 ≤ t_1′, ..., t_k ≤ t_k′ implies T_n(t_1, ..., t_k) ≤ T_n(t_1′, ..., t_k′). Then the exact slope c(θ) of T_n satisfies c(θ) ≤ Σ_i λ_i c_i(θ).

The non-decreasing condition is equivalent to Birnbaum's Condition 1 (see Section 1.3). The condition is not very restrictive; it is satisfied by every method thus far introduced and by virtually all other reasonable statistics.



2.4 Optimality of T^(F) in the Discrete Data Case


Littell and Folks established that the exact slope for T^(F) is c_F(θ) = Σ_i λ_i c_i(θ). This derivation is contingent on the fact that T^(F) has a chi-square distribution on 2k degrees of freedom, which is true only when the X^(i) are distributed according to continuous distributions. A proof that the exact slope of T^(F) is Σ_i λ_i c_i(θ) when the X^(i) are discrete follows.

Suppose X_{n_i}^(i) can take on values 0, 1, 2, ..., n_i with probabilities p_0, p_1, ..., p_{n_i}. Let P_j^(i) = Σ_{x=j}^{n_i} p_x. Note that

P_{n_i}^(i) ≤ P_{n_i−1}^(i) ≤ ... ≤ P_1^(i) ≤ P_0^(i) = 1.

If H is rejected for large values of X_{n_i}^(i), then the P_j^(i), j = 0, 1, 2, ..., n_i, are the observable significance levels for the i-th test; that is, the observable values of the random variable L_{n_i}^(i) = 1 − F(X_{n_i}^(i) − 1) under the null hypothesis. Assume that an exact slope exists for all tests; that is, assume that there exist functions c_1(θ), c_2(θ), ..., c_k(θ) such that

(−2/n_i) ln L_{n_i}^(i) → c_i(θ)  as n_i → ∞

with probability one [θ] for i = 1, 2, ..., k.


Proposition 3. Let T_n^(d) = (−2 Σ_i ln L_{n_i}^(i))^{1/2}. If lim_{n→∞} (−1/n) ln[1 − F_n^(d)(√n t)] exists, then the exact slope of T^(d) is

c_d(θ) = Σ_{i=1}^k λ_i c_i(θ).

Proof: This proof utilizes the Bahadur-Savage Theorem (Theorem 1). To establish the first part of Theorem 1, observe that

T_n^(d)/√n = [ (−2/n) ln L_{n_1}^(1) + ... + (−2/n) ln L_{n_k}^(k) ]^{1/2}

  → (λ_1 c_1(θ) + ... + λ_k c_k(θ))^{1/2}  as n → ∞

with probability one [θ]. Consistent with the notation of Theorem 1, denote this limiting quantity b_d(θ).

Now, to establish f(t) of Part 2 of Theorem 1, choose ℓ_i ∈ (0,1), i = 1, 2, ..., k. For each i, there exists a j such that

ℓ_i ∈ [P_j^(i), P_{j−1}^(i)].

Now, since L_{n_i}^(i) is a discrete random variable,

P{L_{n_i}^(i) ≤ ℓ_i} = P{L_{n_i}^(i) ≤ P_j^(i)} = P_j^(i) ≤ ℓ_i.

Thus

P{L_{n_i}^(i) ≤ ℓ_i} ≤ P{U^(i) ≤ ℓ_i},

where the U^(i) are mutually independent uniform random variables on (0,1). It follows that

P{−2 ln L_{n_i}^(i) ≥ z_i} ≤ P{−2 ln U^(i) ≥ z_i},

and hence that

P{ Σ_{i=1}^k −2 ln L_{n_i}^(i) ≥ z } ≤ P{ Σ_{i=1}^k −2 ln U^(i) ≥ z }.

Thus,

(−1/n) ln P{[Σ_i −2 ln L_{n_i}^(i)]^{1/2} ≥ √n t} ≥ (−1/n) ln P{[Σ_i −2 ln U^(i)]^{1/2} ≥ √n t}.   (2.4)

The quantity Z_n = [Σ_i −2 ln U^(i)]^{1/2} is distributed as the square root of a chi-square random variable with 2k degrees of freedom. It follows that the density of Z_n is

f_{Z_n}(z) = (1/(2^{k−1} Γ(k))) z^{2k−1} e^{−z²/2}.

Thus, from the result given in (2.1), the limit of the right-hand side of (2.4) can be written as

lim (−1/n) ln f_{Z_n}(√n t)

  = lim (−1/n) ln[ (1/(2^{k−1} Γ(k))) (√n t)^{2k−1} exp(−(√n t)²/2) ]

  = lim { −(1/n)(2k−1) ln(√n t) + t²/2 + (1/n) ln(2^{k−1} Γ(k)) }

  = t²/2.

Hence, it follows from (2.4) that

f_d(t) ≥ t²/2,

since it is assumed that the limit of the left-hand side exists. Applying the result of Theorem 1,

c_d(θ) ≥ Σ_i λ_i c_i(θ).

By Theorem 2,

c_d(θ) ≤ Σ_i λ_i c_i(θ).

Hence,

c_d(θ) = Σ_i λ_i c_i(θ).

That is, the exact slope of T^(F) is Σ_i λ_i c_i(θ) regardless of whether the X^(i) are continuous or discrete. The condition imposed in Proposition 3 that lim_{n→∞} (−1/n) ln[1 − F_n^(d)(√n t)] exist is not very restrictive. It is satisfied in most typical cases and in every example considered in this dissertation.
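As an illustration of the discrete levels P_j^(i) (a sketch, not from the dissertation): for a binomial test of H: p = 1/2 against p > 1/2, the observable levels are the upper-tail sums, and Fisher's statistic may be computed from them.

```python
from math import comb, log

def upper_tail_levels(n):
    # P_j = P{X >= j | p = 1/2}, j = 0..n: the observable significance levels
    pmf = [comb(n, x) / 2 ** n for x in range(n + 1)]
    return [sum(pmf[j:]) for j in range(n + 1)]

def fisher_discrete(xs, ns):
    # T^(F) = -2 sum ln L^(i), with L^(i) = P{X^(i) >= x_obs} under H
    return -2 * sum(log(upper_tail_levels(n)[x]) for x, n in zip(xs, ns))

print(upper_tail_levels(4))    # [1.0, 0.9375, 0.6875, 0.3125, 0.0625]
print(fisher_discrete([3, 7], [4, 10]))
```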














CHAPTER III


THE COMBINATION OF BINOMIAL EXPERIMENTS



3.1 Introduction


Chapter III deals with the combination of binomial experiments. That is, suppose k binomial experiments are performed. Let n_1, n_2, ..., n_k and X_{n_1}^(1), X_{n_2}^(2), ..., X_{n_k}^(k) be the sizes and the observed numbers of successes, respectively, for the experiments. Denote the unknown success probabilities as p_1, p_2, ..., p_k. Suppose one wishes to test the overall null hypothesis H: p_1 = p_10, p_2 = p_20, ..., p_k = p_k0 versus the alternative hypothesis H_A: p_1 ≥ p_10, ..., p_k ≥ p_k0 (with strict inequality for at least one p_i). The problem, then, is to choose the best function of (X_{n_1}^(1), ..., X_{n_k}^(k)) for this hypothesis test.

The results of Chapters I and II support T^(F) as a non-parametric method with good overall power when there is no prior information concerning the unknown parameters. The method based on the minimum significance level, T^(m), is sensitive to situations where exactly one of the individual hypotheses is rejected. That is, T^(m) is powerful versus the alternative H_C of Section 1.3.

The investigations of Koziol and Perlman [20] and Oosterhoff [33] show that the general non-parametric combining methods can be improved











on for certain parametric combining problems. It follows that there may be combination methods based directly on (X_{n_1}^(1), ..., X_{n_k}^(k)) that are superior to Fisher's omnibus procedure.

Chapter III is a detailed comparison of T^(F) and several parametric combination methods.



3.2 Parametric Combination Methods


As stated in Section 1.3, no method of combination is most powerful versus all possible alternative hypotheses. There are, however, certain restricted alternative hypotheses against which most powerful tests do exist.

Let the likelihood function for the i-th binomial experiment be denoted by

L(p_i) = (n_i choose X^(i)) p_i^{X^(i)} (1 − p_i)^{n_i − X^(i)}.   (3.1)

According to the Neyman-Pearson Lemma, if a most powerful test of the null hypothesis H: p_i = p_i0, all i, versus the alternative hypothesis H_A: p_i ≥ p_i0 (with strict inequality for at least one i) exists, it is to reject H if

Π_{i=1}^k L(p_i0)/L(p_i) < C.

Upon substituting (3.1) and taking logs, an equivalent form of the test is to reject H if

Σ_i X^(i) ln{ p_i(1 − p_i0) / [p_i0(1 − p_i)] } > C.   (3.2)

It follows that rejecting H when

Σ_i X^(i) > C   (3.3)

is most powerful if p_i(1 − p_i0)/[p_i0(1 − p_i)] is constant in i.









The problem of combining 2×2 contingency tables is closely related to the problem being considered. The purpose of each 2×2 table can be interpreted as testing for the equality of success probabilities between a standard treatment and an experimental treatment. The overall null hypothesis is that the experimental and standard success probabilities are equal in all experiments. The overall alternative hypothesis is that the experimental success probability is superior in at least one experiment.

Cochran [10] and Mantel and Haenszel [29] suggest the statistic

Σ_i w_i d_i / √(Σ_i w_i p̄_i q̄_i)   (3.4)

where

w_i = n_i1 n_i2/(n_i1 + n_i2),  d_i = p̂_i1 − p̂_i2,

for combining 2×2 tables. Mantel and Pasternak [34] have discussed this statistic in the context of combining binomial experiments. Each individual binomial experiment is similar to an experiment resulting in a 2×2 table, two cells of which are empty because the control success probability is considered known and need not be estimated from the data. The statistic defined by (3.4) will be denoted by T^(CMH). It can easily be shown that the test T^(CMH) > c is equivalent to the test Σ_i X^(i) > C; thus T^(CMH) is the most powerful test when p_i(1 − p_i0)/[p_i0(1 − p_i)] is constant in i.

In many practical combination problems with binomial experiments, p_i0 = 1/2 for all i. The null hypothesis is then H: p_i = 1/2, i = 1, 2, ..., k, and the general alternative hypothesis is H_A: p_i ≥ 1/2 (strict inequality for at least one i). This is the hypothesis testing problem under consideration throughout the remainder of Chapter III. For p_i0 = 1/2, all i, T^(CMH) is uniformly most powerful for testing H: p_i = 1/2, all i, versus H_B: p_1 = p_2 = p_3 = ... = p_k > 1/2.

For the hypothesis test just described, T^(CMH) can be written

Σ_{i=1}^k (X^(i) − n_i/2) / ( Σ_{i=1}^k n_i/4 )^{1/2}.   (3.5)

This variate is asymptotically standard normal. It is of note that this form is standardized by a pooled estimate of the standard deviation. An alternative statistic can be formed by standardizing each X^(i), yielding

T^(X) = (1/√k) Σ_i { (X^(i) − n_i/2) / (n_i/4)^{1/2} },

which also has an asymptotic standard normal distribution. The statistic T^(X) is analogous to the sum of chi's procedure which has been recommended for combining 2×2 tables. The statistic T^(X) is not in general equivalent to T^(CMH); in fact, the test T^(X) > c is equivalent to the test Σ_{i=1}^k n_i^{−1/2} X^(i) > c. When the n_i are all equal, T^(X) and T^(CMH) are equivalent.
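As a concrete sketch (the counts are hypothetical, and the helper names are not from the dissertation), the pooled standardization (3.5) and the sum of chi's standardization can be computed as follows.

```python
import math

def t_cmh(x, n):
    # pooled standardization, equation (3.5)
    num = sum(xi - ni / 2 for xi, ni in zip(x, n))
    return num / math.sqrt(sum(ni / 4 for ni in n))

def t_chi(x, n):
    # sum of chi's: standardize each experiment separately, then combine
    k = len(x)
    z = [(xi - ni / 2) / math.sqrt(ni / 4) for xi, ni in zip(x, n)]
    return sum(z) / math.sqrt(k)

x, n = [14, 40], [20, 60]           # hypothetical success counts and sizes
print(t_cmh(x, n), t_chi(x, n))
```

With equal sample sizes the two statistics coincide, as the text notes.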
Weighted sums of the X^(i), say S(g) = Σ_i g_i X^(i), form a general class of statistics. Oosterhoff considers this class and makes the following observations concerning their relationship with the individual sample sizes [32]. It follows from (3.2) that if ln(p_i/(1 − p_i)) = a g_i, then the most powerful test of H versus H_A is Σ_i g_i X^(i) > c. Let p_i = 1/2 + ε_i. It follows that

ln(p_i/(1 − p_i)) = ln((1 + 2ε_i)/(1 − 2ε_i))

  = 2{2ε_i + (2ε_i)³/3 + (2ε_i)⁵/5 + ...}

  = 4ε_i + O(ε_i³)  as ε_i → 0.

This implies that for alternatives close to the null hypothesis, H, S(g) is most powerful if ε_i = c g_i; that is, if the deviations from the null values of the p_i are proportional to the respective g_i. The sum of chi's procedure, T^(X), is a special case where g_i = (n_i)^{−1/2}. It follows that the set of alternatives against which T^(X) is powerful is strongly related to the sample sizes n_1, n_2, ..., n_k.

The weighted sum, S(g), may be a viable statistic if prior information concerning the p_i is available. Under the null hypothesis, S(g) is a linear combination of binomial random variables, each with success probability 1/2. The null distribution of S(g) will therefore be asymptotically normal. The proper normalization of S(g) is analogous to that of T^(CMH) given in (3.5).

A well-known generalization of the likelihood ratio test is to reject the null hypothesis for large values of −2 ln{ sup_{θ∈Θ_0} L(θ,X) / sup_{θ∈Θ} L(θ,X) }. It is easily shown that for the hypothesis test being considered, the likelihood ratio statistic is

T^(LR) = Σ_{i=1}^k 2n_i { (X^(i)/n_i) ln(2X^(i)/n_i) + (1 − X^(i)/n_i) ln(2(1 − X^(i)/n_i)) } δ_i,

where

δ_i = 1 if X^(i)/n_i ≥ 1/2,  δ_i = 0 if X^(i)/n_i < 1/2.

Under broad conditions, which are satisfied in this instance, the statistic T^(LR) has an asymptotic chi-square null distribution.

Suppose z_i, i = 1, 2, ..., k, are normal random variables with means μ_i and variance 1. The likelihood ratio test for H: μ_i = 0, i = 1, ..., k, versus H_1: μ_i ≥ 0 (with strict inequality for at least one i) is to reject H for large values of

Σ_{i=1}^k z_i² I{z_i > 0}.   (3.6)

For the binomial problem, an "approximate likelihood ratio" test is then to reject for large values of

T^(ALR) = Σ_{i=1}^k (X^(i) − n_i/2)²/(n_i/4) · I{X^(i) > n_i/2},

since (X^(i) − n_i/2)/(n_i/4)^{1/2} is asymptotically a standard normal random variable under H. The exact null distribution of (3.6) is easily derived. Critical values are tabled in Oosterhoff. When p = 1/2, the normal approximation to the binomial is considered satisfactory for even fairly small sample sizes. It follows that the exact null distribution of (3.6) should serve as an adequate approximation to the null distribution of T^(ALR).



3.3 Exact Slopes of Parametric Methods


In this section, the exact slopes of T^(F), T^(CMH), and T^(LR) are compared. We have not been successful in deriving the exact slope for T^(X). A more complete comparison of methods is given in Section 3.4 with respect to approximate slopes.

Suppose X_{n_i}^(i) is a binomial random variable based on n_i observations with unknown success probability p_i. Consider testing the single null hypothesis H: p_i = 1/2 versus the single alternative hypothesis H_A: p_i > 1/2.


Proposition 4. Let T_{n_i}^(i) = X_{n_i}^(i)/√n_i. The exact slope of T^(i) is

c_i(θ) = 2{p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.

Theorem 1 is used to prove Proposition 4. There are several means by which the function f(t) of Part 2 of Theorem 1 can be obtained. Perhaps the most straightforward way is by using Chernoff's Theorem [1]. Bahadur, in fact, suggests that the derivation of f(t) provides a good exercise in the application of Chernoff's Theorem.

Theorem 3 (Chernoff's Theorem). Let y be a real valued random variable. Let φ(t) = E(e^{ty}) be the moment generating function of y. Then 0 < φ(t) ≤ ∞ for each t, and φ(0) = 1. Let ρ = inf{φ(t): t ≥ 0}. Let y_1, y_2, ... denote a sequence of independent replicates of y, and for n = 1, 2, ..., let P_n = P{y_1 + ... + y_n ≥ 0}. Then

(1/n) ln P_n → ln ρ  as n → ∞.

Proof of Proposition 4. For Part 1 of Theorem 1,

T_{n_i}^(i)/√n_i = X_{n_i}^(i)/n_i → p_i

with probability one [θ], giving b(θ). For the binomial problem,

θ = (p_1, p_2, ..., p_k).

Now, as n_i tends to infinity,

lim (−1/n_i) ln(1 − F_{n_i}(√n_i a))

  = lim (−1/n_i) ln P{X_{n_i}^(i)/√n_i ≥ √n_i a}

  = lim (−1/n_i) ln P{X_{n_i}^(i) ≥ n_i a}

  = lim (−1/n_i) ln P{X_{n_i}^(i) − n_i a ≥ 0}.

The random variable X_{n_i}^(i) − n_i a can be expressed as

X_{n_i}^(i) − n_i a = (y_1 − a) + (y_2 − a) + ... + (y_{n_i} − a),

where the y_j are independent replicates of a Bernoulli random variable y with parameter 1/2. Therefore, φ(t) of Chernoff's Theorem is

φ(t) = e^{−at} · (1/2)(1 + e^t).

The quantity e^{−at}(1/2)(1 + e^t) is minimized for

t = ln(a/(1 − a)).

Thus,

ρ = (1/2) e^{−a ln(a/(1−a))} (1 + a/(1−a))   (3.7)

and

ln ρ = −a ln(a/(1−a)) + ln((1/2)(1 + a/(1−a))).

Hence,

lim (−1/n_i) ln P{X_{n_i}^(i) ≥ n_i a} = a ln(a/(1−a)) − ln((1/2)(1 + a/(1−a))),   (3.8)

giving f(a) of Part 2 of Theorem 1. Thus,

c_i(θ) = 2{p_i ln(p_i/(1−p_i)) − ln((1/2)(1 + p_i/(1−p_i)))}

  = 2{p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.
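The closed form for c_i(θ) can be checked numerically (an illustrative sketch): minimize φ(t) = e^{−at}(1 + e^t)/2 over a grid of t ≥ 0 and compare −2 ln ρ, evaluated at a = p_i, with the stated slope. The grid bounds are arbitrary.

```python
import math

def slope_closed_form(p):
    # c_i(theta) = 2{p ln 2p + (1-p) ln 2(1-p)}
    return 2 * (p * math.log(2 * p) + (1 - p) * math.log(2 * (1 - p)))

def slope_via_chernoff(a, steps=100000, tmax=10.0):
    # rho = inf over t >= 0 of e^{-at}(1 + e^t)/2; slope = -2 ln rho at a = p
    phi = min(math.exp(-a * t) * (1 + math.exp(t)) / 2
              for t in (j * tmax / steps for j in range(steps + 1)))
    return -2 * math.log(phi)

for p in (0.6, 0.75, 0.9):
    print(p, slope_closed_form(p), slope_via_chernoff(p))
```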



Following the notation of Section 2.2, suppose k binomial experiments are to be combined and the sample sizes n_1, n_2, ..., n_k satisfy n_1 + n_2 + ... + n_k = nk and

lim_{n→∞} n_i/n = λ_i,  i = 1, ..., k.

Then λ_1 + ... + λ_k = k and

(−2/n) ln L_{n_i}^(i) → λ_i c_i(θ)  as n → ∞

with probability one [θ]. According to Proposition 3, c_F(θ) = Σ λ_i c_i(θ) in both the continuous and discrete case if lim (−1/n) ln[1 − F_n(√n t)] exists. The existence of this limit for a single binomial experiment is shown in (3.8) of the proof of Proposition 4. Therefore, for the binomial combination problem, the exact slope for Fisher's method, T^(F), is

c_F(θ) = 2 Σ_{i=1}^k λ_i {p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.

A property of likelihood ratio test statistics is that they achieve the maximum possible exact slope [2]. Theorem 2 states that the exact slope for the combination problem is bounded above by Σ λ_i c_i(θ). Proposition 3 shows that T^(F) achieves this bound. It follows that T^(F) and the likelihood ratio test have the same exact slope; that is,

c_F(θ) = c_LR(θ).

This relationship is true regardless of whether the data are discrete or continuous.

Let T_n^(CMH) = (1/√(nk)) Σ_i X_{n_i}^(i). This form of the Cochran-Mantel-Haenszel statistic is equivalent to those previously given in (3.3) and (3.5).

Proposition 5. The exact slope of T_n^(CMH) is

c_CMH(θ) = 2k{p̄ ln 2p̄ + (1 − p̄) ln 2(1 − p̄)},  where p̄ = (1/k) Σ_i λ_i p_i.

Proof. To get b(θ) of Part 1 of Theorem 1,

T_n^(CMH)/√n = Σ_i X_{n_i}^(i)/(n√k) = (1/√k) Σ_i (n_i/n)(X_{n_i}^(i)/n_i) → (1/√k) Σ_i λ_i p_i = √k p̄

with probability one [θ]. Now, for Part 2 of Theorem 1, as n tends to infinity,

lim (−1/n) ln[1 − F(√n a)]

  = lim (−1/n) ln P{(1/√(nk)) Σ_i X_{n_i}^(i) ≥ √n a}

  = lim (−k/nk) ln P{Σ_i X_{n_i}^(i) ≥ nk(a/√k)} = f(a).   (3.9)

Under the null hypothesis, Σ_i X_{n_i}^(i) is a binomial random variable based on n_1 + ... + n_k = nk trials with success probability 1/2. The quantity (3.9) is k times the quantity treated in (3.7) and (3.8), with n_i replaced by nk and a replaced by a′ = a/√k. Theorem 3 can be directly applied to line (3.9), yielding

f(a) = k{ a′ ln(a′/(1 − a′)) − ln((1/2)(1 + a′/(1 − a′))) },  a′ = a/√k,

and therefore, since b(θ) = √k p̄ gives a′ = p̄,

c_CMH(θ) = 2f(b(θ)) = 2k{p̄ ln(p̄/(1−p̄)) − ln((1/2)(1 + p̄/(1−p̄)))}

  = 2k{p̄ ln 2p̄ + (1 − p̄) ln 2(1 − p̄)}.

A comparison of T^(CMH) relative to T^(F) and T^(LR) with respect to exact slopes is given in the next section.
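A numeric sketch (hypothetical λ_i and p_i, helper names assumed) comparing the exact slopes just derived: the ratio c_CMH(θ)/c_F(θ) is the exact Bahadur efficiency of T^(CMH) relative to T^(F).

```python
import math

def h(p):
    # the function 2{p ln 2p + (1-p) ln 2(1-p)}, with the convention h(0) = h(1) = 2 ln 2
    if p in (0.0, 1.0):
        return 2 * math.log(2)
    return 2 * (p * math.log(2 * p) + (1 - p) * math.log(2 * (1 - p)))

def c_fisher(lams, ps):
    # c_F(theta) = sum lambda_i * c_i(theta)
    return sum(l * h(p) for l, p in zip(lams, ps))

def c_cmh(lams, ps):
    # c_CMH(theta) = 2k{pbar ln 2pbar + (1-pbar) ln 2(1-pbar)}
    k = len(ps)
    pbar = sum(l * p for l, p in zip(lams, ps)) / k
    return k * h(pbar)

lams, ps = [1.0, 1.0], [0.6, 0.9]     # hypothetical lambda_i and p_i
print(c_cmh(lams, ps) / c_fisher(lams, ps))
```

By Theorem 2 the ratio cannot exceed one.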

Derivation of the exact slope of the sum of chi's procedure, T^(X), has not been accomplished. An incomplete approach to the problem follows.

Let T_n^(X) = Σ_i n_i^{−1/2} X_{n_i}^(i). To derive b(θ) of Part 1 of Theorem 1,

T_n^(X)/√n = Σ_i (X_{n_i}^(i)/n_i)(n_i/n)^{1/2} → Σ_i λ_i^{1/2} p_i  as n → ∞

with probability one [θ]. Now, as n tends to infinity,

lim (−1/n) ln[1 − F(√n a)]

  = lim (−1/n) ln P{ (n/n_1)^{1/2} X_{n_1}^(1) + ... + (n/n_k)^{1/2} X_{n_k}^(k) ≥ n a }.

The left-hand side of the above probability statement is a weighted sum of independent binomial random variables based on varying sample sizes n_1, n_2, ..., n_k, each with success probability 1/2. The moment generating function of this random variable is therefore

( 1/2 + (1/2) e^{(n/n_1)^{1/2} t} )^{n_1} ··· ( 1/2 + (1/2) e^{(n/n_k)^{1/2} t} )^{n_k}.   (3.10)

From the form of the moment generating function given in (3.10), it is apparent that the random variable in question can be regarded as a sum of n independent identically distributed variates, each with moment generating function

( 1/2 + (1/2) e^{(n/n_1)^{1/2} t} )^{n_1/n} ··· ( 1/2 + (1/2) e^{(n/n_k)^{1/2} t} )^{n_k/n}.

Then, since (n/n_i)^{1/2} → λ_i^{−1/2} and n_i/n → λ_i as n tends to infinity, φ(t) of Theorem 3 is

φ(t) = e^{−at} ( 1/2 + (1/2) e^{λ_1^{−1/2} t} )^{λ_1} ··· ( 1/2 + (1/2) e^{λ_k^{−1/2} t} )^{λ_k},

and ρ = inf{φ(t): t ≥ 0}. The quantity ρ has not been found.
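Although ρ has not been found in closed form, it can be evaluated numerically for given λ_i and a (an illustrative sketch; the λ_i, the value of a, and the grid are arbitrary choices):

```python
import math

def phi(t, a, lams):
    # phi(t) = e^{-at} * prod_i (1/2 + (1/2) e^{t / sqrt(lambda_i)})^{lambda_i}
    v = math.exp(-a * t)
    for lam in lams:
        v *= (0.5 + 0.5 * math.exp(t / math.sqrt(lam))) ** lam
    return v

def rho(a, lams, tmax=10.0, steps=100000):
    # numeric infimum of phi over a grid of t >= 0
    return min(phi(j * tmax / steps, a, lams) for j in range(steps + 1))

lams = [0.5, 1.5]      # hypothetical relative sample sizes, summing to k = 2
a = 1.3                # a point between the null drift and the maximal drift
r = rho(a, lams)
print(r, -2 * math.log(r))
```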










3.4 Approximate Slopes of Parametric Methods


Exact slopes are defined in Section 2.1. In Section 3.3 some comparisons among methods are made with respect to exact slopes and corresponding efficiencies. Bahadur also defines a quantity called the approximate slope [3]. Suppose that X_n has an asymptotic null distribution F; that is,

lim_{n→∞} F_n(x) = F(x)

for all x. For each n, let

L_n^(a) = 1 − F(X_n)

be the approximate level attained. (Consistent with Bahadur's notation, the superscript (a) stands for approximate.) If there exists a c^(a)(θ) such that

(−2/n) ln L_n^(a) → c^(a)(θ)

with probability one [θ], then c^(a)(θ) is called the approximate slope of {X_n}.

If c_i^(a)(θ) is the approximate slope of a sequence {X_n^(i)}, i = 1, 2, then c_1^(a)(θ)/c_2^(a)(θ) is known as the approximate asymptotic efficiency of {X_n^(1)} relative to {X_n^(2)}.

A result similar to Theorem 1 is given by Bahadur [3] for the calculation of approximate slopes. Suppose that there exists a function b(θ), 0 < b(θ) < ∞, such that

T_n/√n → b(θ)

with probability one [θ]. Suppose that for some a, 0 < a < ∞, the limiting null distribution F satisfies

ln[1 − F(t)] ~ −(1/2) a t²  as t → ∞.

Then the approximate slope is c^(a)(θ) = a[b(θ)]². This result is applicable with a = 1 for statistics with asymptotic standard normal distributions [3]. This result can be shown directly by applying the result of Killeen et al. given in (2.1).

The approximate slope, c^(a)(θ), and the exact slope, c(θ), of a sequence of test statistics are guaranteed to be in agreement only for alternative hypotheses close to the null hypothesis. Otherwise, they may result in very different quantities. One notable exception is the likelihood ratio statistic. When the asymptotic null distribution is taken to be the chi-square distribution from the well-known −2 ln (likelihood ratio) approximation, the approximate slope of the likelihood ratio statistic is the same as the exact slope. The approximate slope is based upon the asymptotic distribution of the statistic. Equivalent test statistics may have different asymptotic null distributions, giving rise to different approximate slopes. This apparent shortcoming does not exist with exact slopes.

In typical applied situations, the significance levels attained by T^(CMH) and T^(X) will be ascertained by appealing to their asymptotic normal distributions. Similarly, T^(LR) will be compared to the appropriate chi-square distribution, and approximate levels for T^(ALR) will be obtained from the asymptotic distribution given in Section 3.2. Approximate slopes based upon these asymptotic distributions would therefore seem to afford a more appropriate comparison of the methods. In other words, it is appealing to consider the null distribution that will be used to obtain significance levels in practice when comparing the statistics. The only statistic which will not usually be compared to an asymptotic distribution is perhaps T^(CMH). The null distribution of T^(CMH) is binomial, based on n_1 + ... + n_k trials with success probability 1/2. However, even with the availability of extensive binomial tables, T^(CMH) will often be standardized as in (3.5) and compared to standard normal tables, since the normal approximation to the binomial when p = 1/2 is satisfactory even for fairly small sample sizes.

The asymptotic null distribution of T^(F) in the discrete case is easily shown to be chi-square with 2k degrees of freedom. This is also the exact distribution of T^(F) in the continuous case. It follows that the approximate slope in the discrete case is the same as the exact slope in the continuous case. In summary,

c_LR^(a)(θ) = c_F^(a)(θ) = c_LR(θ) = c_F(θ) = 2 Σ_{i=1}^k λ_i {p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.
i=l
In order to derive the approximate slopes for T^(CMH) and T^(X), consider the linear combination Σ_i n_i^α X_{n_i}^(i), of which T^(CMH) and T^(X) are special cases. The variate X_{n_i}^(i) has an asymptotic normal distribution with mean n_i/2 and variance n_i/4 under the null hypothesis. It follows directly that

T_n^(α) = ( Σ_i n_i^α X_{n_i}^(i) − (1/2) Σ_i n_i^{α+1} ) / ( (1/2) √(Σ_i n_i^{2α+1}) )

is asymptotically standard normal.

Proposition 6. The approximate slope of T_n^(α) is

c_α^(a)(θ) = [ Σ_i λ_i^{α+1} (2p_i − 1) ]² / Σ_i λ_i^{2α+1}.

Proof. First, to get b(θ),

T_n^(α)/√n = ( Σ_i n_i^α X_{n_i}^(i) − (1/2) Σ_i n_i^{α+1} ) / ( (1/2) √(n Σ_i n_i^{2α+1}) )

  → ( Σ_i λ_i^{α+1} p_i − (1/2) Σ_i λ_i^{α+1} ) / ( (1/2) √(Σ_i λ_i^{2α+1}) )

  = Σ_i λ_i^{α+1} (2p_i − 1) / √(Σ_i λ_i^{2α+1})  as n → ∞

with probability one [θ]. Now, since T_n^(α) is asymptotically standard normal,

c_α^(a)(θ) = [b(θ)]² = [ Σ_i λ_i^{α+1} (2p_i − 1) ]² / Σ_i λ_i^{2α+1}.

Letting α = 0 yields the approximate slope of T^(CMH),

c_CMH^(a)(θ) = [ Σ_i λ_i (2p_i − 1) ]² / k.

Letting α = −1/2 yields the approximate slope of T^(X),

c_X^(a)(θ) = [ Σ_i λ_i^{1/2} (2p_i − 1) ]² / k.
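Proposition 6 makes the comparison between T^(CMH) and T^(X) easy to carry out numerically (a sketch with hypothetical λ_i and p_i):

```python
import math

def c_alpha(alpha, lams, ps):
    # approximate slope of T^(alpha) from Proposition 6
    num = sum(l ** (alpha + 1) * (2 * p - 1) for l, p in zip(lams, ps))
    return num ** 2 / sum(l ** (2 * alpha + 1) for l in lams)

lams = [0.5, 1.5]                         # hypothetical relative sample sizes
for ps in ([0.55, 0.65], [0.65, 0.55]):   # p_i aligned / misaligned with lambda_i
    print(ps, c_alpha(0.0, lams, ps),     # alpha = 0:    T^(CMH)
              c_alpha(-0.5, lams, ps))    # alpha = -1/2: T^(X)
```

With these values the first case (larger p_i on the larger experiment) favors T^(CMH) and the second favors T^(X), in line with the discussion below.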









By inspection of the above approximate slopes, it is apparent that T^(CMH) is more efficient when the p_i are proportional to the λ_i (the relative sample sizes) and T^(X) is more efficient when the p_i are inversely related to the λ_i. The boundary of the parameter space where c_CMH^(a)(θ) = c_X^(a)(θ) is not p_1 = p_2 = ... = p_k, however. The statistic T^(CMH) is more efficient than T^(X) in more than half of the parameter space. As a further comparison, e^(a)(T^(CMH), T^(X)), the approximate efficiency of T^(CMH) with respect to T^(X), can be integrated over the parameter space. The result is greater than one, which again supports use of T^(CMH). It should be noted, however, that when the p_i are proportional to the λ_i, both tests have high efficiencies relative to when the p_i are inversely related to the λ_i. Therefore T^(X) is more efficient in a region of the parameter space where both tests have relatively low efficiency. This is a good property for T^(X).

An "approximate" likelihood ratio test is introduced in

Section 3.2. A statistic which is equivalent to the form given in

Section 3.2 is
k n
(ALR) [(i) 1 2 i (i)
T = (E [(X n) / IX n
i=l i 4 n 2

Proposition 7. The approximate slope of TALR) is
n

c (o) = eA.(2p.-)1).

Proof. To find b(0) of Part 1 of Theorem 1,

(i)
,(ALRO) X
Tn 1 (i 1 2 Ix(i) 1
-4- --( )n.}}
S n ni 2 2 1


{4E A.(p --1)2 as n co
i i 2









with probability one [θ]. Thus,

    b(θ) = {4 Σ_i λ_i (p_i - 1/2)^2}^(1/2) = {Σ_i λ_i (2p_i - 1)^2}^(1/2).

To find f(t) of Part 2 of Theorem 1, the asymptotic null distribution of T_n^(ALR) is required. According to Oosterhoff [33],

    P{Σ_{j=1}^k z_j^2 I[z_j > 0] ≥ s} = 2^(-k) Σ_{j=1}^k (k choose j) P{χ_j^2 ≥ s},

where z_j is a standard normal random variable. Since, under the null hypothesis,

    (X^(i) - n_i/2) / (n_i/4)^(1/2) → z_i

in distribution, it follows that

    P{T_n^(ALR) ≥ s} = P{(T_n^(ALR))^2 ≥ s^2} → 2^(-k) Σ_{j=1}^k (k choose j) P{χ_j^2 ≥ s^2}     (3.11)

as n → ∞ for all s. It follows that the associated density function is a linear combination of chi-square densities. The result of Killeen et al. can be applied to verify that

    ln[1 - F(t)] ~ -t^2/2 as t → ∞,

where F is the asymptotic null distribution of T_n^(ALR). Hence,

    c_ALR(θ) = [b(θ)]^2 = Σ_i λ_i (2p_i - 1)^2.

Before proceeding to a further comparison of approximate slopes, the slopes are summarized in the following listing.

    Test                                         Approximate Slope

    Fisher's (T^(F))                             2 Σ_i λ_i {p_i ln 2p_i + (1-p_i) ln 2(1-p_i)}

    Likelihood Ratio (T^(LR))                    2 Σ_i λ_i {p_i ln 2p_i + (1-p_i) ln 2(1-p_i)}

    "Approximate Likelihood Ratio" (T^(ALR))     Σ_i λ_i (2p_i - 1)^2

    Sum of Chi's (T^(X))                         (1/k)[Σ_i λ_i^(1/2) (2p_i - 1)]^2

    Cochran-Mantel-Haenszel (T^(CMH))            (1/k)[Σ_i λ_i (2p_i - 1)]^2



Letting A_i = λ_i^(1/2) (2p_i - 1), it is easy to see that c_ALR(θ) ≥ c_X(θ), since Σ_{i=1}^k A_i^2 ≥ (1/k)[Σ_{i=1}^k A_i]^2. It is also true that c_ALR(θ) ≥ c_CMH(θ). Let B_i = (2p_i - 1). It can easily be shown that

    Σ_i λ_i B_i^2 - (1/k)[Σ_i λ_i B_i]^2 = (1/k) Σ_{i<j} λ_i λ_j (B_i - B_j)^2 ≥ 0,

given that Σ_i λ_i = k. Therefore T^(ALR) dominates both T^(CMH) and T^(X) with respect to approximate slopes.
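This identity can be checked numerically. The snippet below is an illustrative sketch only, using arbitrary hypothetical values of the λ_i and B_i for k = 3.

```python
lams = [0.5, 1.0, 1.5]   # relative sample sizes lambda_i, summing to k = 3
B = [0.2, 0.6, 0.9]      # B_i = 2 p_i - 1 for hypothetical p_i = .6, .8, .95
k = len(lams)

# left side: c_ALR - c_CMH
lhs = (sum(l * b * b for l, b in zip(lams, B))
       - sum(l * b for l, b in zip(lams, B)) ** 2 / k)
# right side: (1/k) sum over pairs of lambda_i lambda_j (B_i - B_j)^2
rhs = sum(lams[i] * lams[j] * (B[i] - B[j]) ** 2
          for i in range(k) for j in range(i + 1, k)) / k
# lhs equals rhs (up to rounding) and both are nonnegative,
# so c_ALR >= c_CMH as claimed
```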

Approximate efficiencies of T^(CMH) and T^(ALR) with respect to T^(F) (and equivalently T^(LR)) for λ_1 = λ_2 = 1 are given for several points in the parameter space in Table 1. In this case of equal sample sizes, T^(X) is equivalent to T^(CMH). Table 2 gives efficiencies of T^(CMH), T^(X), and T^(ALR) with respect to T^(F) (and equivalently T^(LR)) for λ_1 = 1/3, λ_2 = 5/3. The values of λ_1 and λ_2 imply that the second test is based on five times as many observations as the first test. When the exact null distribution of T^(CMH) (binomial with parameters Σ_{i=1}^k n_i and 1/2) is to be used to determine significance levels it is more appropriate to employ the exact slope, c_CMH(θ), rather than the approximate slope, c_CMH^(a)(θ). The exact efficiencies of T^(CMH) relative to T^(F) (and equivalently T^(LR)) are given in Tables 1 and 2 in parentheses.
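The approximate slopes in the listing above can be evaluated directly. The routine below is an illustrative Python sketch, not the dissertation's original program; it reproduces the Table 1 entry at p_1 = p_2 = .6.

```python
import math

def _h(p):
    # p ln 2p + (1-p) ln 2(1-p), with 0 ln 0 taken as 0
    out = p * math.log(2 * p)
    if p < 1:
        out += (1 - p) * math.log(2 * (1 - p))
    return out

def approx_slopes(lams, ps):
    """Approximate slopes from the listing above; lams are the relative
    sample sizes (summing to k), ps the success probabilities (>= 1/2)."""
    k = len(lams)
    B = [2 * p - 1 for p in ps]
    fisher = 2 * sum(l * _h(p) for l, p in zip(lams, ps))          # T^(F), T^(LR)
    alr = sum(l * b * b for l, b in zip(lams, B))                  # T^(ALR)
    chi = sum(math.sqrt(l) * b for l, b in zip(lams, B)) ** 2 / k  # T^(X)
    cmh = sum(l * b for l, b in zip(lams, B)) ** 2 / k             # T^(CMH)
    return fisher, alr, chi, cmh

f, alr, chi, cmh = approx_slopes([1.0, 1.0], [0.6, 0.6])
# cmh / f and alr / f both come to about 0.993, the Table 1 entry
```

With λ_1 = λ_2 = 1 the T^(X) and T^(CMH) slopes coincide, as noted in the text.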

The efficiencies listed in Tables 1 and 2 support several previously made observations:

1. The statistic T^(ALR) dominates T^(CMH) and T^(X) with respect to approximate slopes (efficiencies).

2. The test based on T^(CMH) dominates the test based on T^(X) when the success parameters are proportional to the sample sizes. The test based on T^(X) is more efficient in the reverse case. The test based on T^(X) is more efficient in a region of relatively low efficiencies for both T^(X) and T^(CMH).

3. Exact and approximate slopes are not, in general, equivalent. They are in close agreement for parameters close to the null hypothesis.

4. All of the tabled efficiencies are at most one. This is expected from the optimality properties of T^(F) and T^(LR) given by Theorem 2 and Proposition 3. A value of one is achieved only for the exact efficiency of T^(CMH) when p_1 = p_2. This is consistent with the fact that T^(CMH) is the most powerful test (and the likelihood ratio test) when p_1 = p_2.















Table 1

Efficiencies of T^(CMH), T^(X), and T^(ALR) Relative to T^(LR) (or, Equivalently, to T^(F)), λ_1 = λ_2 = 1

Each cell gives the approximate efficiencies of T^(CMH) (equivalent to T^(X) for equal sample sizes) and of T^(ALR), with the exact efficiency of T^(CMH) in parentheses.

    p_2   entry          p_1 = .5     .6       .7       .8       .9      1.0

    .5    CMH               --      0.497    0.486    0.467    0.435    0.361
          ALR               --      0.993    0.972    0.934    0.869    0.721
          CMH (exact)       --     (0.497)  (0.489)  (0.474)  (0.447)  (0.377)

    .6    CMH             0.497     0.993    0.879    0.752    0.644    0.505
          ALR             0.993     0.993    0.976    0.939    0.876    0.729
          CMH (exact)    (0.497)   (1.000)  (0.892)  (0.773)  (0.674)  (0.540)

    .7    CMH             0.486     0.879    0.972    0.909    0.799    0.632
          ALR             0.972     0.976    0.972    0.945    0.888    0.748
          CMH (exact)    (0.489)   (0.892)  (1.000)  (0.951)  (0.856)  (0.697)

    .8    CMH             0.467     0.752    0.909    0.934    0.874    0.722
          ALR             0.934     0.939    0.945    0.934    0.892    0.768
          CMH (exact)    (0.474)   (0.773)  (0.951)  (1.000)  (0.964)  (0.831)

    .9    CMH             0.435     0.644    0.799    0.874    0.869    0.763
          ALR             0.869     0.876    0.888    0.892    0.869    0.773
          CMH (exact)    (0.447)   (0.674)  (0.856)  (0.964)  (1.000)  (0.932)

    1.0   CMH             0.361     0.505    0.632    0.722    0.763    0.721
          ALR             0.721     0.729    0.748    0.768    0.773    0.721
          CMH (exact)    (0.377)   (0.540)  (0.697)  (0.831)  (0.932)  (1.000)












Table 2

Efficiencies of T^(CMH), T^(X), and T^(ALR) Relative to T^(LR) (or, Equivalently, to T^(F)), λ_1 = 1/3, λ_2 = 5/3

Each cell gives the approximate efficiencies of T^(CMH), T^(X), and T^(ALR), with the exact efficiency of T^(CMH) in parentheses.

    p_2   entry          p_1 = .5     .6       .7       .8       .9      1.0

    .5    CMH               --      0.166    0.162    0.156    0.145    0.120
          X                 --      0.497    0.486    0.467    0.435    0.361
          ALR               --      0.993    0.972    0.934    0.869    0.721
          CMH (exact)       --     (0.165)  (0.162)  (0.156)  (0.145)  (0.121)

    .6    CMH             0.828     0.993    0.893    0.737    0.576    0.420
          X               0.497     0.867    0.981    0.934    0.830    0.660
          ALR             0.993     0.993    0.984    0.954    0.896    0.756
          CMH (exact)    (0.832)   (1.000)  (0.901)  (0.736)  (0.585)  (0.428)

    .7    CMH             0.810     0.935    0.972    0.932    0.838    0.673
          X               0.486     0.694    0.848    0.924    0.921    0.812
          ALR             0.972     0.973    0.972    0.960    0.924    0.815
          CMH (exact)    (0.826)   (0.957)  (1.000)  (0.964)  (0.872)  (0.751)

    .8    CMH             0.778     0.867    0.921    0.934    0.904    0.805
          X               0.467     0.604    0.725    0.815    0.861    0.827
          ALR             0.934     0.935    0.937    0.934    0.916    0.845
          CMH (exact)    (0.814)   (0.914)  (0.978)  (1.000)  (0.976)  (0.878)

    .9    CMH             0.725     0.790    0.839    0.867    0.869    0.823
          X               0.435     0.531    0.623    0.702    0.759    0.767
          ALR             0.869     0.871    0.874    0.876    0.869    0.829
          CMH (exact)    (0.791)   (0.872)  (0.938)  (0.982)  (1.000)  (0.962)

    1.0   CMH             0.601     0.646    0.685    0.714    0.731    0.721
          X               0.361     0.426    0.490    0.550    0.601    0.629
          ALR             0.721     0.723    0.727    0.732    0.736    0.721
          CMH (exact)    (0.703)   (0.771)  (0.836)  (0.897)  (0.952)  (1.000)









3.4 Powers of Combination Methods


In the previous two sections, competing methods were compared with respect to asymptotic efficiencies. Asymptotic efficiencies compare sequences of test statistics in some sense as the sample sizes tend to infinity. Such comparisons may or may not be applicable to situations when small sample sizes are encountered. Therefore, the methods of combination are compared in this section with respect to exact power.

As mentioned previously, exact power studies are often intractable. For the test statistics considered here, power functions are not obtainable in any simple form which would allow direct comparisons between competing methods. However, through the use of the computer, it is possible to plot contours of equal power in the parameter space. From such plots, the relative powers of the competing methods can be surmised.

The first step in obtaining the power contours is the generation of the null distributions for each of the five statistics: T^(LR), T^(F), T^(CMH), T^(X), and T^(ALR). Size α = .05 acceptance regions for each of the statistics for varying sample sizes are shown in Figures 3-8.

Acceptance regions for tests with equal sample sizes (n_1 = n_2 = 10, 15, 20, 30) appear in Figures 3-6. The statistics T^(LR) and T^(ALR) define very similar, but not identical, tests for n_1 = n_2 = 10, 15, 20, 30. They define exactly the same α = .05 acceptance regions in all four cases, and will therefore yield identical power contours. Fisher's statistic, T^(F), defines a test similar to T^(LR) and T^(ALR) for n_1 = n_2 = 10, 15; in fact, T^(F) defines the same α = .05 acceptance region for those sample sizes. The major difference between T^(F) and the two likelihood ratio statistics, T^(LR) and T^(ALR), is that T^(F) has many more attainable levels. For sample sizes n_1 = n_2 = 20, 30, T^(F) defines different α = .05 acceptance regions than T^(LR) and T^(ALR). The statistics T^(CMH) and T^(X) are equivalent for n_1 = n_2.

Figures 7-8 portray acceptance regions for cases of unequal sample sizes (n_1 = 10, n_2 = 20 and n_1 = 10, n_2 = 50). The difference between T^(X) and T^(CMH) is apparent for the case of unequal sample sizes. The statistics T^(LR) and T^(ALR) define different α = .05 acceptance regions. In both figures, it is seen that T^(F), T^(LR), and T^(ALR) define similar regions.

In Section 1.3 it was stated that Birnbaum [7] has shown that combination procedures must produce convex acceptance regions in the (X^(1), X^(2), ..., X^(k)) hyperplane in order to be admissible. Each of the acceptance regions in Figures 3-8 appears to satisfy this convexity condition.

The acceptance regions given in Figures 3-8 are not exact α = .05 size regions. They are the nominal acceptance regions which are the closest to size α = .05. In order to make a fair comparison among the powers of the competing methods, all of the acceptance regions must be of exactly the same size. This can be accomplished by admitting certain values of (X^(1), X^(2)) to the acceptance region with probabilities between zero and one. A more precise definition of this procedure follows. Suppose

    P{T_n^(i) ≤ t_L} = .95 - a,

    P{T_n^(i) ≤ t_u} = .95 + b,

and T_n^(i) does not take on any values between t_L and t_u. Then all T_n^(i) such that T_n^(i) ≤ t_L are included in the acceptance region with probability one, and T_n^(i) = t_u is included in the acceptance region with probability a/(a+b).


The power of the i-th test is one minus the probability that T_n^(i) falls in the acceptance region. More precisely, define the power of the i-th test to be

    Π_i(p_1, p_2) = 1 - [P{T_n^(i) < t_u} + (a/(a+b)) P{T_n^(i) = t_u}]

                  = 1 - [Σ_{(x^(1),x^(2)): T^(i) < t_u} C(n_1, x^(1)) p_1^{x^(1)} (1-p_1)^{n_1 - x^(1)} C(n_2, x^(2)) p_2^{x^(2)} (1-p_2)^{n_2 - x^(2)}

                       + (a/(a+b)) Σ_{(x^(1),x^(2)): T^(i) = t_u} C(n_1, x^(1)) p_1^{x^(1)} (1-p_1)^{n_1 - x^(1)} C(n_2, x^(2)) p_2^{x^(2)} (1-p_2)^{n_2 - x^(2)}],

where the probabilities are computed at (p_1, p_2).
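The construction above can be carried out exactly. The following self-contained sketch is illustrative only (Python standing in for the dissertation's FORTRAN program), for the special case T^(CMH) = X^(1) + X^(2), which is Binomial(n_1 + n_2, 1/2) under the null.

```python
from math import comb

def binom_pmf(n, x, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def randomization(n1, n2, alpha=0.05):
    """For T = X1 + X2, find t_u and the probability a/(a+b) with which
    T = t_u is admitted to the acceptance region, so the size is exactly alpha."""
    n = n1 + n2
    null = [binom_pmf(n, t, 0.5) for t in range(n + 1)]
    t_u = next(t for t in range(n + 1)
               if sum(null[t + 1:]) <= alpha < sum(null[t:]))
    a = (1 - alpha) - sum(null[:t_u])      # P{T <= t_L} = (1 - alpha) - a
    b = sum(null[:t_u + 1]) - (1 - alpha)  # P{T <= t_u} = (1 - alpha) + b
    return t_u, a / (a + b)

def power(n1, n2, p1, p2, t_u, frac):
    """1 - P{accept}, with T = t_u accepted with probability frac."""
    acc = tie = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            pr = binom_pmf(n1, x1, p1) * binom_pmf(n2, x2, p2)
            if x1 + x2 < t_u:
                acc += pr
            elif x1 + x2 == t_u:
                tie += pr
    return 1 - (acc + frac * tie)

t_u, frac = randomization(10, 10)
# power(10, 10, 0.5, 0.5, t_u, frac) is exactly .05, up to rounding
```

Evaluating this power over a grid of (p_1, p_2) values is the computation behind the contour plots that follow.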

For each test statistic, power is calculated for 2500 values of (p_1, p_2) in the alternative parameter space. This was accomplished with a FORTRAN computer program. A data set consisting of these calculated powers is then passed into SAS [4,16]. The plotting capabilities of SAS are then exploited to portray contours of equal power in the (p_1, p_2) plane.










Figures 9-14 are .90 power contours corresponding to the acceptance regions of Figures 3-8, respectively. The Cochran-Mantel-Haenszel procedure, T^(CMH), is most powerful in the center of the parameter space; that is, when p_1 and p_2 are nearly equal. This is expected since T^(CMH) is uniformly most powerful when p_1 = p_2 for any choice of sample sizes. The statistic T^(CMH) is clearly inferior to T^(F), T^(LR), and T^(ALR) in the extremes of the parameter space, that is, when p_1 and p_2 are quite different. Further, the deficiency of T^(CMH) compared to the other methods when p_1 and p_2 are different is larger than the deficiency of the other methods when p_1 = p_2. From Figures 9-12 it can also be seen that the central wedge of the parameter space where T^(CMH) is more powerful shrinks as the sample sizes increase. Fisher's statistic, T^(F), and the likelihood ratio statistics, T^(LR) and T^(ALR), have similar power. Fisher's method gives slightly more power in the central region of the parameter space while T^(LR) and T^(ALR) are slightly more powerful when p_1 and p_2 are very different.

For unequal sample sizes (Figures 13, 14), T^(F) and T^(LR) yield power contours too similar to be separated on the drawings. The approximate likelihood ratio test, T^(ALR), has almost the same power as T^(LR) and T^(F), having slightly more power when the experiment based on more observations has the larger p_i, and slightly less power in the reverse case. The sum of chi's procedure, T^(X), is not equivalent to T^(CMH) when n_1 ≠ n_2. The power contours are very different, with T^(CMH) being more powerful when the larger experiment matches with a large p_i. The statistic T^(X) is more powerful in the opposite case.





Figures 15-20 are .60 power contours concomitant with the .90 power contours in Figures 9-14. The comparison of competing methods may be more appropriately made for low powers. When all of the powers of the tests are high it is probably unimportant which test is used. The patterns observed in the .90 power contours are virtually the same in the .60 power contours, however. No additional information is apparent except that the patterns are consistent over a wide range of powers.






















Figure 3. Acceptance Regions for n_1 = n_2 = 10.

Figure 4. Acceptance Regions for n_1 = n_2 = 15.

Figure 5. Acceptance Regions for n_1 = n_2 = 20.

Figure 6. Acceptance Regions for n_1 = n_2 = 30.

Figure 7. Acceptance Regions for n_1 = 10, n_2 = 20.

Figure 8. Acceptance Regions for n_1 = 10, n_2 = 50.

(Each figure plots the α = .05 acceptance regions in the (X^(1), X^(2)) plane for T^(F), T^(LR), T^(ALR), and T^(CMH); T^(X) is also shown in Figures 7 and 8, where T^(F) and T^(LR) are virtually the same.)



Figure 9. .90 Power Contours for n_1 = n_2 = 10.

Figure 10. .90 Power Contours for n_1 = n_2 = 15.

Figure 11. .90 Power Contours for n_1 = n_2 = 20.

Figure 12. .90 Power Contours for n_1 = n_2 = 30.

Figure 13. .90 Power Contours for n_1 = 10, n_2 = 20.

Figure 14. .90 Power Contours for n_1 = 10, n_2 = 50.

(Contours are drawn in the (p_1, p_2) plane for T^(F), T^(LR), T^(ALR), and T^(CMH); T^(X) is also shown in Figures 13 and 14, where T^(F) and T^(LR) are virtually the same.)











Figure 15. .60 Power Contours for n_1 = n_2 = 10.

Figure 16. .60 Power Contours for n_1 = n_2 = 15.

Figure 17. .60 Power Contours for n_1 = n_2 = 20.

Figure 18. .60 Power Contours for n_1 = n_2 = 30.

Figure 19. .60 Power Contours for n_1 = 10, n_2 = 20.

Figure 20. .60 Power Contours for n_1 = 10, n_2 = 50.

(Contours are drawn in the (p_1, p_2) plane for T^(F), T^(LR), T^(ALR), and T^(CMH); T^(X) is also shown in Figures 19 and 20, where T^(F) and T^(LR) are virtually the same.)


3.6 A Synthesis of Comparisons


When detailed prior knowledge of the unknown parameters is unavailable the class of competing methods can be restricted to T^(F), T^(LR), T^(ALR), and T^(X). These methods are compared with respect to various criteria in previous sections. In this section, the results of these comparisons are synthesized to make recommendations concerning the optimum choice of method for various situations.

For the comparisons in the previous sections, the null hypothesis considered is H_0: p_1 = p_2 = ... = p_k = 1/2. The most general alternative hypothesis considered is H_A: p_i ≥ 1/2 (strict inequality for at least one i). In some situations, it is reasonable to assume that the success probability is consistent from experiment to experiment. In such cases the alternative hypothesis of interest is H_B: p_1 = p_2 = ... = p_k > 1/2. A third alternative hypothesis of possible interest is H_C: p_j > 1/2 (exactly one j). This alternative is appropriate if the researcher believes that at most one p_j will be greater than 1/2. The hypotheses H_A and H_B are probably the more frequently encountered alternatives in practical situations.

The following recommendations are based on evidence presented thus far in this dissertation:

1. The minimum significance level procedure has good power versus the H_C alternative. It performs poorly, however, versus other alternatives.

2. The Cochran-Mantel-Haenszel statistic, T^(CMH), forms the uniformly most powerful test against H_B. Its use is therefore indicated whenever it can be assumed that the p_i are not very different. The statistic T^(CMH) performs relatively poorly versus alternatives in the extremes of the parameter space.

3. Fisher's combination, T^(F), is not, in general, the most powerful test versus a particular simple alternative hypothesis. Its power, however, is never much less than that of the optimum test. Fisher's method gives good coverage to the entire parameter space, and its use is therefore indicated whenever specification of the alternative hypothesis cannot be made more precisely than H_A.

4. There seems to be no compelling reason to recommend the use of the sum of chi's procedure, T^(X), unless it is known, a priori, that the p_i are inversely related to the sample sizes of the individual binomial experiments.

5. The likelihood ratio statistic, T^(LR), and the approximate likelihood ratio statistic, T^(ALR), define tests very similar to T^(F). They obtain approximately the same powers throughout the parameter space. Choosing among these three statistics then depends upon which yields significance levels with the greatest ease and accuracy. This problem is addressed in Section 3.7.










3.7 Approximation of the Null Distributions of T^(F), T^(LR), and T^(ALR)




In Section 1.6 the problem of obtaining significance levels for Fisher's statistic, T^(F), when the data are discrete is discussed. Lancaster's transformations, X_m^2 and X_m'^2, are introduced. It is established in Section 1.6 that X_m^2 and X_m'^2 both converge to chi-squares with 2k degrees of freedom. Although Lancaster's approach can be expected to yield good approximate levels for large sample sizes, the degree of accuracy has not been established for small or moderate sample sizes. Some indication of the accuracy of significance levels obtained from X_m^2 and X_m'^2 is given by observing the mean and variance of these variates. Table 3 (page 79) lists the means and variances for n = 1, 2, ..., 20 for X_m^2 and X_m'^2 when applied to one experiment. Since the altered form of T^(F) will be compared to a chi-square distribution with 2k degrees of freedom, it is desirable that the mean and variance of X_m^2 and X_m'^2 be as close as possible to the mean and variance of the chi-square distribution with two degrees of freedom, which are 2 and 4, respectively. For n ≥ 3, the mean and variance of X_m^2 are closer to 2 and 4, respectively, than the mean and variance of X_m'^2. This suggests that X_m^2 should, in general, be a more accurate approximation than X_m'^2.

In Section 3.2, the likelihood ratio statistic, T^(LR), and the approximate likelihood ratio statistic, T^(ALR), are introduced. The necessary regularity conditions can be shown to be satisfied for T^(LR), so that the statistic can be deemed asymptotically a chi-square with k degrees of freedom. As previously stated, the null distribution of














Table 3

Mean and Variance of Lancaster's X_m'^2 and X_m^2

          Median chi-square (X_m'^2)      Mean chi-square (X_m^2)
    n        Mean       Variance             Mean       Variance

    1       1.9808       1.9753             2.000       1.9218
    2       1.9531       2.8587             2.000       2.8036
    3       1.9394       3.2223             2.000       3.2419
    4       1.9362       3.3754             2.000       3.4760
    5       1.9385       3.4514             2.000       3.6101
    6       1.9430       3.5014             2.000       3.6923
    7       1.9481       3.5428             2.000       3.7462
    8       1.9530       3.5804             2.000       3.7836
    9       1.9574       3.6151             2.000       3.8110
   10       1.9612       3.6465             2.000       3.8320
   11       1.9645       3.6748             2.000       3.8486
   12       1.9674       3.6998             2.000       3.8621
   13       1.9698       3.7218             2.000       3.8733
   14       1.9719       3.7412             2.000       3.8828
   15       1.9738       3.7582             2.000       3.8909
   16       1.9754       3.7733             2.000       3.8980
   17       1.9768       3.7866             2.000       3.9042
   18       1.9781       3.7986             2.000       3.9097
   19       1.9793       3.8092             2.000       3.9145
   20       1.9803       3.8188             2.000       3.9189








T^(ALR) can be approximated with a distribution derived by Oosterhoff. Significance levels are then determined by the relationship

    P{T^(ALR) ≥ c} = 2^(-k) Σ_{j=1}^k (k choose j) P{χ_j^2 ≥ c}.

The null density functions of T^(F), T^(LR), and T^(ALR) are plotted in Figures 21-22 for k = 2, n_1 = n_2 = 6. These plots give an indication of the difficulty of approximating the respective null density functions. More extreme (larger) values of the statistics do not always occur with smaller probabilities. This fact gives the jagged appearances of the density functions. This lack of smoothness causes difficulty in approximating a discrete density with a continuous one.

The remainder of this section contains numerical comparisons of the above-mentioned approximations. The goal is to choose the approximation which yields significance levels closest to the exact levels of the respective statistic.

Tables 4 and 5 correspond to the density functions pictured in Figures 21-22. Table 4 lists the possible events as ordered by T^(F). Lancaster's approximate statistics, X_m^2 and X_m'^2, are calculated for each event. Although it is not generally true, X_m^2 and X_m'^2 maintain the same ordering of events as the uncorrected T^(F). Significance levels obtained by comparing X_m^2 and X_m'^2 to a chi-square distribution with four degrees of freedom are then compared to exact levels. The inaccuracies of these approximations are then reflected in the columns labeled percentage error. Table 5 gives an evaluation of the approximations given by T^(LR) and T^(ALR). These statistics define equivalent tests, but yield different approximations to the exact densities.
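Entries of this kind can be checked by direct enumeration of the 2^12 equally likely binomial outcomes. The sketch below is an illustration only (Python, under the stated setup k = 2, n_1 = n_2 = 6); it recovers the exact null level of the group of events tied with (6,3) under T^(LR), together with the chi-square (2 d.f.) approximation.

```python
from math import comb, exp, log

N = 6  # n1 = n2 = 6

def lr_part(x, n=N):
    """One-sided per-experiment likelihood ratio contribution,
    2[x ln(x/(n/2)) + (n - x) ln((n - x)/(n/2))], zero when x <= n/2."""
    if x <= n / 2:
        return 0.0
    s = 2 * x * log(x / (n / 2))
    if x < n:
        s += 2 * (n - x) * log((n - x) / (n / 2))
    return s

t = lr_part(6) + lr_part(3)   # the value shared by (6,3), (6,2), (6,1), (6,0)
exact = sum(comb(N, x1) * comb(N, x2)
            for x1 in range(N + 1) for x2 in range(N + 1)
            if lr_part(x1) + lr_part(x2) >= t - 1e-9) / 4 ** N
approx = exp(-t / 2)          # chi-square (2 d.f.) survival at t
```

Here exact comes to 127/4096 ≈ .0310 while approx is .0156, roughly the -49.7 percent error tabled: the chi-square approximation overstates the significance of this group.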










The percentage errors given in Tables 4 and 5 tend to favor Lancaster's approximations over the approximations to T^(LR) and T^(ALR). Both X_m^2 and X_m'^2 yield generally conservative results in this particular case. The mean chi-square, X_m^2, is somewhat more accurate than the median chi-square, X_m'^2.

All of the approximations can be expected to improve as the sample sizes increase. To indicate the behavior of the contending approximations for increasing sample sizes, nominal α = .05 and α = .01 values for each statistic are given in Tables 6-8 for n_1 = n_2 = 3, 4, 5, ....

The data in Tables 4-8 indicate that Lancaster's approximations clearly dominate the approximations to T^(LR) and T^(ALR). The optimal choice then becomes either the X_m^2 or the X_m'^2 correction to T^(F). Table 6 gives no clear indication as to whether X_m^2 or X_m'^2 yields a better approximation. Both statistics give large errors for small sample sizes. Both statistics yield errors less than 12% for both α = .05 and α = .01 levels for n ≥ 16.

The superiority of the mean chi-square, X_m^2, over the median chi-square, X_m'^2, becomes clear for k = 3. Table 9 gives the nominal α = .05 and α = .01 values for X_m^2 and X_m'^2 for k = 3, n_1 = n_2 = n_3 = 2, 3, 4, ..., 10. The mean chi-square, X_m^2, is more accurate in all cases but two (α = .01, n = 8 and α = .05, n = 5).
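Lancaster's mean correction for a single discrete experiment replaces -2 ln P by its conditional expectation over the attainable p-value interval. The formula used below is stated as an assumption (Section 1.6 is not reproduced here), but it does reproduce the Table 4 entries; the sketch is illustrative Python only.

```python
from math import comb, log

def mean_chisq(x, n):
    """Lancaster's mean correction for one Binomial(n, 1/2) experiment:
    E[-2 ln U | P' < U <= P] = 2 - 2(P ln P - P' ln P')/(P - P'),
    with P = P{X >= x}, P' = P{X > x} under the null."""
    P = sum(comb(n, j) for j in range(x, n + 1)) / 2 ** n
    Pp = sum(comb(n, j) for j in range(x + 1, n + 1)) / 2 ** n
    top = P * log(P) - (Pp * log(Pp) if Pp > 0 else 0.0)
    return 2.0 - 2.0 * top / (P - Pp)

# the correction has mean exactly 2 for every n (cf. Table 3), so the sum
# over k experiments is referred to a chi-square with 2k degrees of freedom
mean = sum(comb(6, x) / 2 ** 6 * mean_chisq(x, 6) for x in range(7))
# 2 * mean_chisq(6, 6) gives 20.6355, the X_m^2 value for (6,6) in Table 4
```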
















































































Figure 21. Density Function of T^(F) for n_1 = n_2 = 6.

Figure 22. Density Functions of T^(LR) and T^(ALR) for n_1 = n_2 = 6 (the two are equivalent at T^(LR) = T^(ALR) = 0).


Table 4

Lancaster's Approximations to T^(F) for k = 2, n_1 = n_2 = 6

    Event    X_m'^2    Approx.   Percent    X_m^2     Approx.   Percent    Exact
                       Level     Error                Level     Error      Level

    6,6     20.6355    .000374    53.2     20.6355    .000374    53.2     .0002441
    6,5     15.8629    .003209     1.1     16.0951    .002894    -8.8     .003174
    6,4     13.2872    .009954    -5.2     13.3847    .009541    -9.1     .01050
    6,3     11.7041    .01969     -2.8     11.7376    .01941     -4.2     .02026
    5,5     11.0904    .02557    -12.0     11.5546    .02099    -27.7     .02905
    6,2     10.8316    .02852    -21.6     10.8393    .02843    -21.9     .03638
    6,1     10.4468    .03354    -14.7     10.4477    .03353    -14.7     .03931
    6,0     10.3335    .03517    -11.6     10.3335    .03517    -11.6     .03979
    5,4      8.5146    .07445    -11.1      8.8442    .06511    -22.2     .08374
    5,3      6.9315    .1396      -1.9      7.1972    .1258     -11.6     .1423
    5,2      6.0590    .1948       4.6      6.2988    .1779      -4.5     .1863
    4,4      5.9389    .2038     -15.5      6.1338    .1894     -21.5     .2412
    5,1      5.6743    .2248     -13.1      5.9072    .2062     -20.3     .2588
    5,0      5.5609    .2344     -10.4      5.7930    .2152     -17.7     .2617
    4,3      4.3558    .3600     -11.8      4.4867    .3441     -15.7     .4082
    4,2      3.4833    .4804      -7.3      3.5884    .4646     -10.3     .5181
    4,1      3.0985    .5415      -3.6      3.1968    .5254      -6.5     .5620
    4,0      2.9852    .5603      -1.6      3.0826    .5441      -4.4     .5693
    3,3      2.7726    .5966     -10.6      2.8397    .5850     -12.3     .6670
    3,2      1.9001    .7541      -7.3      1.9414    .7465      -8.2     .8135
    3,1      1.5154    .8239      -5.5      1.5498    .8178      -6.2     .8721
    3,0      1.4020    .8438      -4.3      1.4356    .8380      -5.0     .8818
    2,2      1.0276    .9056      -3.3      1.0431    .9032      -3.6     .9368
    2,1      0.6429    .9582      -2.3      0.6514    .9572      -2.4     .9807
    2,0      0.5295    .9706      -1.8      0.5372    .9698      -1.8     .9880
    1,1      0.2582    .9924      -0.4      0.2598    .9923      -0.5     .9968
    1,0      0.1448    .9975      -0.2      0.1456    .9975      -0.2     .9997
    0,0      0.0314    .9999       0.0      0.0315    .9999       0.0     1.0000














Table 5

Approximations to T^(LR) and T^(ALR) for k = 2, n_1 = n_2 = 6

    Event           T^(LR)   Approx.   Percent    T^(ALR)   Approx.   Percent    Exact
                             Level     Error                Level     Error      Level

    (6,6)           16.64    .000244     0.0      12.00     .000886   263.0     .0002441
    (6,5)           11.23    .0036      12.5       8.667    .004901    52.7     .003209
    (6,4)            8.997   .0111       5.7       6.667    .01383     31.7     .0105
    (6,3),(6,2),
    (6,1),(6,0)      8.318   .0156     -49.7       6.000    .0196     -36.8     .0310
    (5,5)            5.822   .0544      36.7       5.333    .02783    -30.1     .0398
    (5,4)            3.591   .1660      98.3       3.333    .08117     -3.0     .0837
    (5,3),(5,2),
    (5,1),(5,0)      2.911   .2333      12.8       2.667    .1171     -43.4     .2068
    (4,4)            1.359   .5069      93.7       1.333    .2525      -3.5     .2617
    (4,3),(4,2),
    (4,1),(4,0)      0.680   .7119      25.0       0.667    .3862     -32.2     .5693
    Remainder        0       1.0000      0.0       0        1.0000      0.0     1.0000












Tables 6-8

Nominal α = .05 and α = .01 values, with exact levels and percentage errors, for the contending approximations (X_m'^2, X_m^2, T^(LR), and T^(ALR)); k = 2, n_1 = n_2 = 3, 4, 5, ....

Table 9

Nominal α = .05 and α = .01 values for X_m'^2 and X_m^2; k = 3, n_1 = n_2 = n_3 = 2, 3, 4, ..., 10.

The statistics T^(F), T^(LR), and T^(ALR) define very similar tests. In Section 3.6 it is concluded, therefore, that the choice among these three statistics should depend upon which affords the best approximation to its null distribution. The evidence of this section indicates that Lancaster's mean chi-square (χ²_m) approximation to T^(F) is the best choice. Even this approximation yields large errors for small sample sizes. Tables 10 and 11 give nominal α = .05 and α = .01 levels for ΠL^(i), an equivalent and more convenient form of T^(F), for k = 2 and k = 3. The exact significance levels are also given.
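The size distortion is easy to reproduce by enumeration. The sketch below (our illustration, not the dissertation's program; the equal sample sizes and the success probability 1/2 are arbitrary choices) computes the exact null distribution of T^(F) = -2Σ ln L^(i) for two binomial experiments and evaluates the exact size of the nominal chi-square critical value on 2k = 4 degrees of freedom.

```python
from math import comb, log

def binom_pmf(n, x, p=0.5):
    # Null probability P{X = x} for X ~ Binomial(n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def attained_levels(n, p=0.5):
    # L(x) = P{X >= x}: the significance level attained by observing x
    pmf = [binom_pmf(n, x, p) for x in range(n + 1)]
    return pmf, [sum(pmf[x:]) for x in range(n + 1)]

def exact_fisher_tail(n1, n2, t, p=0.5):
    # Exact P{T >= t} under H for T = -2*(ln L1 + ln L2), by enumeration
    pmf1, lev1 = attained_levels(n1, p)
    pmf2, lev2 = attained_levels(n2, p)
    tail = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            t_obs = -2.0 * (log(lev1[x1]) + log(lev2[x2]))
            if t_obs >= t:
                tail += pmf1[x1] * pmf2[x2]
    return tail

t_05 = 9.488  # nominal .05 critical value of chi-square on 4 df
print(exact_fisher_tail(10, 10, t_05))  # exact size; generally not .05
```

The discrepancy between the printed exact size and the nominal .05 is precisely the error that Tables 10 and 11 document.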






ACKNOWLEDGMENTS

I am indebted to Dr. Ramon C. Littell for his guidance and encouragement, without which this dissertation would not have been completed. I also wish to thank Dr. John G. Saw for his careful proofreading and many helpful suggestions. The assistance of Dr. Dennis D. Wackerly throughout my course of graduate study is greatly appreciated. My special thanks go to Dr. William Mendenhall, who gave me the opportunity to come to the University of Florida and who encouraged me to pursue the degree of Doctor of Philosophy.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT

CHAPTER I   INTRODUCTION AND LITERATURE REVIEW
    1.1  Statement of the Combination Problem
    1.2  Non-Parametric Combination Methods
    1.3  A Comparison of Non-Parametric Methods
    1.4  Parametric Combination Methods
    1.5  Weighted Methods of Combination
    1.6  The Combination of Dependent Tests
    1.7  The Combination of Tests Based on Discrete Data
    1.8  A Preview of Chapters II, III, and IV

CHAPTER II  BAHADUR EFFICIENCIES OF GENERAL COMBINATION METHODS
    2.1  The Notion of Bahadur Efficiency
    2.2  The Exact Slopes for T^(A) and T^(P)
    2.3  Further Results on Bahadur Efficiencies
    2.4  Optimality of T^(F) in the Discrete Data Case

CHAPTER III THE COMBINATION OF BINOMIAL EXPERIMENTS
    3.1  Introduction
    3.2  Parametric Combination Methods
    3.3  Exact Slopes of Parametric Methods
    3.4  Approximate Slopes of Parametric Methods
    3.5  Powers of Combination Methods
    3.6  A Synthesis of Comparisons
    3.7  Approximation of the Null Distributions of T^(F), T^(LR), and T^(ALR)

CHAPTER IV  APPLICATIONS AND FUTURE RESEARCH
    4.1  Introduction
    4.2  Estimation: Confidence Regions Based on Non-Parametric Combination Methods

TABLE OF CONTENTS (Continued)

CHAPTER IV (Continued)
    4.3  The Combination of 2 x 2 Tables
    4.4  Testing for the Heterogeneity of Variances
    4.5  Testing for the Difference of Means with Incomplete Data
    4.6  Asymptotic Efficiencies for k → ∞

BIBLIOGRAPHY

BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

A COMPARISON OF METHODS FOR COMBINING TESTS OF SIGNIFICANCE

By William C. Louv
August 1979
Chairman: Ramon C. Littell
Major Department: Statistics

Given test statistics X^(1),...,X^(k) for testing the null hypotheses H_1,...,H_k, respectively, the combining problem is to select a function of X^(1),...,X^(k) to be used as an overall test of the hypothesis H = H_1 ∩ H_2 ∩ ... ∩ H_k. Functions based on the probability integral transformation, that is, on the significance levels attained by X^(1),...,X^(k), form a class of non-parametric combining methods. These methods are compared in a general setting with respect to Bahadur asymptotic relative efficiency. It is concluded that Fisher's omnibus method is at least as efficient as all other methods whether X^(1),...,X^(k) arise from continuous or discrete distributions.

Given a specific parametric setting, it may be possible to improve upon the non-parametric methods. The problem of combining binomial experiments is studied in detail. Parametric methods analogous to the sum of chi's procedure and the Cochran-Mantel-Haenszel procedure, as well as the likelihood ratio test and an approximate likelihood ratio test, are compared to Fisher's method. Comparisons are made with respect to Bahadur efficiency and with respect to exact power. The power

comparisons take the form of plots of contours of equal power. If prior information concerning the nature of the unknown binomial success probabilities is unavailable, Fisher's method is recommended. Other methods are preferred when specific assumptions can be made concerning the success probabilities. For instance, the Cochran-Mantel-Haenszel procedure is optimal when the success probabilities have a common value.

Fisher's statistic has a chi-square distribution with 2k degrees of freedom when X^(1),...,X^(k) are continuous. In the discrete case, however, the exact distribution of Fisher's statistic is difficult to obtain. Several approximate methods are compared and Lancaster's mean chi-square approximation is recommended.

The combining problem is also approached from the standpoint of estimation. Non-parametric methods are inverted to form k-dimensional confidence regions. Several examples for k = 2 are graphically displayed.

CHAPTER I
INTRODUCTION AND LITERATURE REVIEW

1.1 Statement of the Combination Problem

The problem of combining tests of significance has been studied by several writers over the past fifty years. The problem is: Given test statistics X^(1),...,X^(k) for testing null hypotheses H_1,...,H_k, respectively, to select a function of X^(1),...,X^(k) to be used as the combined test of the hypothesis H = H_1 ∩ H_2 ∩ ... ∩ H_k. In most of the work cited, the X^(i) are assumed to be mutually independent, and, except where stated otherwise, that is true in this paper.

Some practical situations in which an experimenter may wish to combine tests are:

i. The data from k separate experiments, each conducted to test the same H, yield the respective test statistics X^(1),...,X^(k). It is desired to pool the information from the separate experiments to form a combined test of H. It would be desirable to pool the information by combining the X^(i) if (a) only the X^(i), instead of the raw data, are available, if (b) the information from the i-th experiment is sufficiently contained in X^(i), or if (c) a theoretically optimal test based on all the data is intractable.

ii. The i-th of k experiments yields X^(i) to test a hypothesis H_i, i = 1,...,k, and a researcher wishes to simultaneously test

the truth of H_1,...,H_k. Considerations (a), (b), and (c) in the preceding paragraph again lead to the desirability of combining the X^(i) as a test of H = H_1 ∩ ... ∩ H_k.

iii. A simultaneous test of H = H_1 ∩ ... ∩ H_k is desired, and the data from a single experiment yield X^(1),...,X^(k) as tests of H_1,...,H_k, respectively. Combining the X^(i) can provide a test of H.

In Section 1.2 several non-parametric methods of combination are introduced. A literature review of comparisons of these procedures is given in Section 1.3. The remainder of this chapter is primarily a literature review of more specific aspects of the combination problem. We make some minor extensions which are identified as such.

1.2 Non-Parametric Combination Methods

Suppose that H_i is rejected for large values of X^(i). Define L^(i) = 1 - F_i(X^(i)), where F_i is the cumulative distribution function of X^(i) under H_i. If X^(i) is a continuous random variable, then L^(i) is uniformly distributed on (0,1) under H_i. Many of the well-known methods of combination may be expressed in terms of the L^(i). Such methods considered here are:

(1) T^(F) = -2Σ ln L^(i)  (Omnibus method, Fisher [13])
(2) T^(N) = -Σ Φ^(-1)(L^(i))  (Normal transform, Liptak [26])
(3) T^(m) = -min L^(i)  (Minimum significance level, Tippett [42])
(4) T^(M) = -max L^(i)  (Maximum significance level, Wilkinson [44])
(5) T^(P) = Σ ln(1 - L^(i))  (Pearson [36])
(6) T^(A) = -Σ L^(i)  (Edgington [12]).
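For continuous data, Fisher's statistic T^(F) = -2Σ ln L^(i) has a chi-square null distribution on 2k degrees of freedom, so a combined significance level can be computed in closed form: for even degrees of freedom 2k, P{χ²_{2k} > t} = e^{-t/2} Σ_{j=0}^{k-1} (t/2)^j / j!. The following sketch is ours, not the dissertation's:

```python
from math import log, exp

def fisher_statistic(levels):
    # T = -2 * sum(ln L_i): Fisher's omnibus statistic
    return -2.0 * sum(log(L) for L in levels)

def chi2_sf_even_df(t, k):
    # P{chi-square on 2k df > t}: closed form for even degrees of freedom
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= (t / 2.0) / j
        total += term
    return exp(-t / 2.0) * total

def fisher_pvalue(levels):
    # Combined significance level of the k attained levels
    return chi2_sf_even_df(fisher_statistic(levels), len(levels))

print(fisher_pvalue([0.08, 0.06]))  # combined level for two marginal levels
```

For k = 1 the formula reduces to the original level itself, as it must, since P{χ²_2 > -2 ln L} = exp(ln L) = L.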

As the statistics are defined here, H is rejected when large values are observed. Figure 1 shows the rejection regions for the statistics defined above when k = 2.

In the continuous case, the null distributions of these statistics are easily obtained. They are all based upon the fact that the L^(i) are uniformly distributed under H. It is easily established that this is true. The cumulative distribution function for L is

P{L ≤ ℓ} = P{1 - F(X) ≤ ℓ} = 1 - P{F(X) < 1 - ℓ} = 1 - P{X < F^(-1)(1 - ℓ)} = 1 - F{F^(-1)(1 - ℓ)} = 1 - (1 - ℓ) = ℓ.

That T^(N) has a normal distribution with mean 0 and variance k follows trivially. The statistics T^(m) and T^(M) are seen to be based on the order statistics of a uniform random variable on (0,1) and are therefore distributed according to beta distributions.

That T^(F) (and, similarly, -2T^(P)) is distributed as a chi-square on 2k degrees of freedom is established as follows. The probability density function of L^(i) is f(L) = 1, 0 < L < 1. Let S = -2 ln L. Then L = e^(-S/2) and dL/dS = -(1/2)e^(-S/2). It follows that

f(S) = (1/2)e^(-S/2), S > 0,

which is the chi-square density on 2 degrees of freedom; the sum of k such independent variables is chi-square on 2k degrees of freedom.

Edgington's statistic, T^(A), is a sum of uniform random variables. As shown by Edgington, significance levels can be established for

[Figure 1. Rejection Regions in Terms of the Significance Levels for k = 2. Panels sketch, in the (L^(1), L^(2)) unit square, the rejection regions of T^(m), T^(M), T^(P), and T^(A); the graphic itself is not legible in this scan.]

values of T^(A) on the basis of the following equation [12]:

P{T^(A) ≥ -t} = P{Σ L^(i) ≤ t} = (1/k!) Σ_{j=0}^{S-1} (-1)^j C(k,j) (t - j)^k,   (1.1)

where C(k,j) denotes the binomial coefficient and S is the smallest integer greater than t.

1.3 A Comparison of Non-Parametric Methods

The general non-parametric methods of combination are rules prescribing that H should be rejected for certain values of (L^(1), L^(2),...,L^(k)). Several basic theoretical results for non-parametric methods of combination are due to Birnbaum [7]. Some of these results are summarized in the following paragraphs.

Under H_i, L^(i) is distributed uniformly on (0,1) in the continuous case. When H_i is not true, L^(i) is distributed according to a non-increasing density function on (0,1), say g_i(L^(i)), if X^(i) has a distribution belonging to the exponential family. Some overall alternative hypotheses that may be considered are:

H_A: One or more of the L^(i)'s have non-uniform densities.
H_B: All of the L^(i)'s have the same non-uniform density g(L).
H_C: One of the L^(i)'s has a non-uniform density g_i(L^(i)).

H_A is the appropriate alternative hypothesis in most cases where prior knowledge of the alternative densities g_i(L^(i)) is unavailable [7].

The following condition is satisfied by all of the methods introduced in Section 1.2.

Condition 1: If H is rejected for any given set of L^(i)'s, then it will also be rejected for all sets of L^(i)*'s such that L^(i)* ≤ L^(i) for each i [7].
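Edgington's alternating series for P{Σ L^(i) ≤ t}, the distribution of a sum of k independent uniform levels, can be transcribed directly into code. The sketch below is ours, not the dissertation's:

```python
from math import comb, factorial

def edgington_pvalue(t, k):
    # P{L(1) + ... + L(k) <= t} for k independent uniform levels, 0 <= t <= k.
    # Alternating sum over j, where S is the smallest integer greater than t.
    S = int(t) + 1
    total = 0.0
    for j in range(S):
        total += (-1) ** j * comb(k, j) * (t - j) ** k
    return total / factorial(k)

print(edgington_pvalue(0.3, 3))  # significance of an observed sum 0.3, k = 3
```

For t ≤ 1 only the first term survives, giving the familiar t^k/k!.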

It can be shown that the best test of H versus any particular alternative in H_A must satisfy Condition 1. It seems reasonable, therefore, that any method not satisfying Condition 1 can be eliminated from consideration [7]. In the present context, Condition 1 does little to restrict the class of methods from which to choose. In fact, "for each non-parametric method of combination satisfying Condition 1, we can find some alternative H_A represented by non-increasing functions g_1(L^(1)),...,g_k(L^(k)) against which that method of combination gives a best test of H" [7].

It should be noted that direct comparison of general combining methods with respect to power is difficult in typical contexts. The precise distributions of the g_i(L^(i)) under the alternative hypothesis are intractable except in very special cases.

When the X^(i) have distributions belonging to the one-parameter exponential family, the overall null hypothesis can be written H: g_1(L^(1)) = ... = g_k(L^(k)) = 1. Rejection of H is based upon (X^(1),...,X^(k)). It is reasonable to reject the use of inadmissible tests. A test is inadmissible if there exists another test which is at least as powerful for all alternatives and more powerful for at least one alternative. Birnbaum proves that a necessary condition for the admissibility of a test is convexity of the acceptance region in the (X^(1),...,X^(k)) hyperplane. For X^(i) with distributions in the exponential family, T^(P) and T^(M) do not have convex acceptance regions and are therefore inadmissible [7].

Although Birnbaum does not consider Edgington's method, we see that it is clear that T^(A) must also be inadmissible. For instance,

for k = 2, consider the points (0,c), (c,0), and (c/2,c/2) in the (L^(1),L^(2)) plane, which fall on the boundary L^(1) + L^(2) = c of the acceptance region of T^(A). The points in the (X^(1),X^(2)) plane corresponding to (0,c) and (c,0) would fall on the axes at infinity. The point corresponding to (c/2,c/2) certainly falls interior to the boundaries described by the points corresponding to (c,0) and (0,c). The acceptance region cannot, therefore, be convex, and hence T^(A) is inadmissible. This argument is virtually the same as that used by Birnbaum to establish the inadmissibility of T^(P) and T^(M).

For a given inadmissible test it is not known how to find a particular test which dominates it. Birnbaum, however, argues that the choice of which test to use should be restricted to admissible tests. The choice of a test from the class of admissible tests is then contingent upon which test has more power against alternatives of interest [7].

In summary of Birnbaum's observations, since T^(P) and T^(M) do not in general form convex acceptance regions in the (X^(1),...,X^(k)) hyperplane, they are not in general admissible and can be eliminated as viable methods. We can extend Birnbaum's reasoning to reach the same conclusion about T^(A). By inspecting the acceptance regions formed by the various methods, Birnbaum also observes that T^(m) is more sensitive to H_C (departure from H by exactly one parameter) than T^(F). The test T^(F), however, has better overall sensitivity to H_A [7].

Littell and Folks have carried out comparisons of general non-parametric methods with respect to exact Bahadur asymptotic relative efficiency. A detailed account of the notion of Bahadur efficiency is deferred to Section 2.1.
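The non-convexity argument can be checked numerically. In the sketch below (our illustration; taking the X^(i) to be standard exponential under H, so that L^(i) = exp(-X^(i)), is an assumption made for convenience), two points on the boundary L^(1) + L^(2) = c of Edgington's acceptance region are mapped to the (X^(1),X^(2)) plane, and the midpoint of their images violates the acceptance inequality:

```python
from math import log, exp

def x_of(L):
    # Inverse of L = exp(-X) for a standard exponential test statistic
    return -log(L)

c = 0.8
p1 = (0.1, 0.7)   # boundary point: L1 + L2 = c
p2 = (0.7, 0.1)   # boundary point: L1 + L2 = c
x1 = (x_of(p1[0]), x_of(p1[1]))
x2 = (x_of(p2[0]), x_of(p2[1]))
mid = ((x1[0] + x2[0]) / 2.0, (x1[1] + x2[1]) / 2.0)
mid_sum = exp(-mid[0]) + exp(-mid[1])
print(mid_sum)  # strictly less than c: the midpoint leaves the region
```

Since the midpoint of two points of the acceptance region {L^(1) + L^(2) ≥ c} falls outside it, the region in the (X^(1),X^(2)) plane is not convex.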

In their first investigation [27], Littell and Folks compare T^(F), T^(N), T^(M), and T^(m). The actual values of the efficiencies are given in Section 2.3. The authors show that T^(F) is superior to the other three procedures according to this criterion. They also observe that the relative efficiency of T^(m) is consistent with Birnbaum's observation that T^(m) performs well versus H_C. Further, Littell and Folks show that T^(F), with some restrictions on the parameter space, is optimal among all tests based on the X^(i) as long as the X^(i) are optimal. This result is extended in a subsequent paper [28] by showing that T^(F) is at least as efficient as any other combination procedure. The only condition necessary for this extension is equivalent to Birnbaum's Condition 1. A formal statement of this result is given in Section 2.3.

1.4 Parametric Combination Methods

The evidence thus far points strongly to T^(F) as the choice among general non-parametric combination procedures when prior knowledge of the alternative space is unavailable. When the distributions of the X^(i) belong to some parametric family, or when the alternative parameter space can be characterized, it may be possible that T^(F) and the other general non-parametric methods can be improved upon. A summary of such investigations follows.

Oosterhoff [33] considers the combination of k normally distributed random variables with known variances and unknown means μ_1, μ_2,...,μ_k. The null hypothesis tested is H: μ_1 = μ_2 = ... = μ_k = 0 versus the alternative H_A: μ_i ≥ 0, with strict inequality for at least

one i. He observed that many combination problems reduce to this situation asymptotically. The difference in power between a particular test and the optimal test for a given (μ_1, μ_2,...,μ_k) is called the shortcoming. Oosterhoff proves that the shortcomings of T^(F) and the maximum likelihood test go to zero for all (μ_1,...,μ_k) as the overall significance level tends to zero. The maximum shortcoming of the likelihood ratio test is shown to be smaller than the maximum shortcoming of T^(F).

Oosterhoff derives a most stringent Bayes test with respect to a least favorable prior. According to numerical comparisons (again with respect to shortcomings), the most stringent test performs similarly to the likelihood ratio test. The likelihood ratio test is much easier to implement than the most stringent test and is therefore preferable. Fisher's statistic, T^(F), is seen to be slightly more powerful than the likelihood ratio test when the means are similar; the opposite is true when the means are dissimilar. A simple summing of the normal variates performs better than all other methods when the means are very similar [33].

Koziol and Perlman [20] study the combination of chi-square variates X^(i) ~ χ²_{ν_i}(θ_i). The hypothesis test considered is H: θ_1 = ... = θ_k = 0 vs H_A: θ_i ≥ 0 (strict inequality for at least one i), where the θ_i are non-centrality parameters. The ν_i correspond to the respective degrees of freedom. An earlier Monte Carlo study by Bhattacharya [6] also addressed this problem and compared the statistics T^(F), T^(m), and ΣX^(i). Bhattacharya concluded that ΣX^(i) and T^(F) were almost equally powerful and that both of these methods clearly dominated T^(m). Koziol and Perlman endeavor to establish the power of T^(F) and

ΣX^(i) in some absolute sense. To do this, they compare T^(F) and ΣX^(i) to Bayes procedures, since Bayes procedures are admissible and have good power in an absolute sense [20].

When the ν_i are equal, ΣX^(i) is Bayes with respect to priors giving high probability to (θ_1,...,θ_k) central to the parameter space (Type B alternatives). The test Σ exp{γX^(i)} is Bayes with respect to priors which assign high probability to the extremes of the parameter space (Type C alternatives). For unequal ν_i's the Bayes tests have slightly altered forms. The Bayes procedures are compared to T^(F), T^(m), T^(N), and ΣX^(i) for k = 2 for various values of (ν_1,ν_2) via numerical tabulations and via the calculation of power contours.

The statistic T^(m) is seen to have better power than the other tests for Type C alternatives but performs rather poorly in other situations. The Bayes test performs comparably to T^(m) for Type C alternatives and is much more sensitive to Type B alternatives than T^(m). The statistic T^(N) is relatively powerful over only a small region at the center of the parameter space. The statistic T^(M) is seen to be dominated by some other procedure for each value of k investigated. The statistics T^(F) and ΣX^(i) are good overall procedures, with T^(F) more sensitive to Type C alternatives and ΣX^(i) more sensitive to Type B alternatives. For ν_i ≥ 2, T^(F) is more sensitive to Type B alternatives than ΣX^(i) is to Type C alternatives, and T^(F) is therefore recommended. The opposite is true for ν_i = 1. These observations were supported for k > 2 through Monte Carlo simulations.

Koziol and Perlman also consider the maximum shortcomings of the tests. In the context of no prior information, they show that T^(F)

minimizes the maximum shortcoming for ν_i ≥ 2 while ΣX^(i) minimizes the maximum shortcoming for ν_i = 1. An additional statistic can be considered when ν_i = 1: it is T^(X) = Σ(X^(i))^(1/2), the sum of chi's procedure. For k = 2, T^(X) is powerful only for a small region in the center of the parameter space. For large k, the performance of T^(X) becomes progressively worse. It can be said that T^(X) performs similarly to T^(N).

1.5 Weighted Methods of Combination

Good [14] suggests a weighted version of Fisher's statistic, T^(G) = -Σ λ_i ln L^(i). He showed that, if the λ_i are all different, significance probabilities can be found by the relationship

P{T^(G) > x} = Σ_{r=1}^{k} A_r exp(-x/λ_r), where A_r = λ_r^(k-1) / Π_{s≠r} (λ_r - λ_s).

Zelen [45] illustrates the use of T^(G) in the analysis of incomplete block designs. In such designs, it is often possible to perform two independent analyses of the data. The usual analysis (intrablock analysis) depends only on comparisons within blocks. The second analysis (interblock analysis) makes use of the block totals only. Zelen defines independent F-ratios corresponding to the two types of analysis. The attained significance level corresponding to the interblock analysis is weighted according to the interblock efficiency, which is a function of the estimated block and error variances.
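Good's tail formula, with coefficients A_r = λ_r^(k-1)/Π_{s≠r}(λ_r - λ_s), is easy to evaluate; note that the A_r must sum to one, because P{T^(G) > 0} = 1. A sketch (ours, with arbitrary distinct example weights):

```python
from math import exp

def good_tail(x, lam):
    # P{T > x} for T = -sum(lam_r * ln L_r), all lam_r distinct (Good [14])
    k = len(lam)
    total = 0.0
    for r in range(k):
        A_r = lam[r] ** (k - 1)
        for s in range(k):
            if s != r:
                A_r /= (lam[r] - lam[s])  # product over s != r
        total += A_r * exp(-x / lam[r])
    return total

lam = [1.0, 0.6, 0.3]        # example weights: arbitrary, all distinct
print(good_tail(5.0, lam))   # significance probability of T = 5
```

With a single weight the expression collapses to exp(-x/λ_1), the tail of a scaled chi-square on 2 degrees of freedom, as it should.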

A similar example is given by Pape [34]. Pape extends Zelen's method to the more general context of a multi-way completely random design.

Koziol and Perlman [20] also considered weighted methods for the problem of combining independent chi-squares. They conclude that when prior information about the non-centrality parameters is available, increased power can be achieved at the appropriate alternative by a weighted version of the sum test, Σ b_i X^(i), if ν_i > 2 for all i, and by the weighted Fisher statistic, T^(G), when ν_i ≤ 2 for all i.

1.6 The Combination of Dependent Tests

The combinations considered up to this point have been based on mutually independent L^(i) arising from mutually independent statistics X^(i). As previously indicated, in such cases the functions of the L^(i) which comprise the general methods have null distributions which are easily obtained. When the X^(i) (and thus the L^(i)) are not independent, the null distributions are not tractable in typical cases.

Brown [9] considers a particular example of the problem of combining dependent statistics. The statistics to be combined are assumed to have a joint multivariate normal distribution with known covariance matrix Σ and unknown mean vector (μ_1, μ_2,...,μ_k)'. The hypothesis test of interest is H: μ_i = μ_i0 versus H_A: μ_i ≥ μ_i0 (strict inequality for at least one i). A likelihood ratio test can be derived [31], but obtaining significance values from this approach is difficult. Brown bases his solution on T^(F). The null distribution of T^(F) is not chi-square on 2k degrees of freedom in this case. The mean of

T^(F) is 2k, as in the independent case. The variance has covariance terms which Brown approximates. The approximation is expressed as a function of the correlations between the normal variates. These first two moments are equated to the first two moments of a gamma distribution. The resultant gamma distribution is used to obtain approximate significance levels.

1.7 The Combination of Tests Based on Discrete Data

As noted in previous sections, the literature tends to support T^(F) as a non-parametric combining method in the general, continuous data framework. Those authors who have addressed the problem of combining discrete statistics have utilized T^(F), assuming that the optimality properties established in the continuous case are applicable. The problem then becomes one of determining significance probabilities, since T^(F) is no longer distributed as a chi-square on 2k degrees of freedom. We describe the problem as follows. Suppose L^(i)* derives from a discontinuous statistic, X^(i)*, and that a and b are possible values of L^(i)*, 0 < a < b ≤ 1 ...
When k, the number of statistics to be combined, is small and (n_1, n_2,...,n_k), the numbers of attainable levels of the discrete statistics, are small, the exact distribution of T^(F) can be determined. Wallis [43] gives algorithms to generate null distributions when all of the X^(i) are discrete and when one X^(i) is discrete. Generating null distributions via Wallis' algorithms becomes intractable very quickly as k and the number of attainable levels of the X^(i) increase. The generation of complete null distributions is even beyond the capability of usual computer storage limitations in experiments of modest size. The significance level attained by a particular value of T^(F) can be obtained for virtually any situation with a computer, however. A transformation of T^(F) which can be referred to standard tables is indicated.

A method suggested by Pearson [37] involves the addition, by a separate random experiment, of a continuous variable to the original discrete variable, thus yielding a continuous variable. Suppose X^(i) can take on values 0, 1, 2,...,n_i with probabilities p_0, p_1,...,p_{n_i}. Let P_j^(i) = Σ_{x=j}^{n_i} p_x. Note that the P_j^(i), j = 0,1,2,...,n_i, are the observable significance levels for the i-th test; i.e., the observable values of the random variable L^(i) = 1 - F(X^(i) - 1) under the null hypothesis. Denote by U^(i), i = 1,2,...,k, mutually independent uniform random variables on (0,1). Pearson's statistic is defined as L_P^(i) = L^(i)(X^(i)) - U^(i) P{X^(i)}.
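Pearson's randomized level, L_P^(i) = L^(i)(X^(i)) - U^(i) P{X^(i)}, can be checked by simulation to be uniform on (0,1) under the null hypothesis. A minimal sketch (ours; the binomial null with n = 5 and p = 0.4 is an arbitrary choice):

```python
import random
from math import comb

random.seed(1)
n, p = 5, 0.4  # arbitrary small binomial null distribution

pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
lev = [sum(pmf[x:]) for x in range(n + 1)]  # L(x) = P{X >= x}

def randomized_level():
    # L_P = L(X) - U * P{X}: Pearson's randomized significance level
    x = random.choices(range(n + 1), weights=pmf)[0]
    return lev[x] - random.random() * pmf[x]

draws = [randomized_level() for _ in range(100000)]
print(sum(draws) / len(draws))  # near 1/2 when L_P is uniform on (0,1)
```

The empirical mean sits near 1/2, and every draw falls in (0,1], in agreement with the uniformity result established in the text.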

We now establish that L_P^(i)(X^(i), U^(i)) is uniformly distributed on (0,1) if and only if X^(i) and U^(i) are independent and U^(i) is uniformly distributed on (0,1). Omitting the superscripts for convenience, define the random variable L_P by

L_P = L(X) - U P{X}, where 0 < U < 1.

It follows that

P{X = x, U ≤ u} = P{L(x) - u P(x) ≤ L_P < L(x)} = u P(x) = P{X = x} P{U ≤ u}.

The statistic -2Σ ln L_P^(i) thus has an exact chi-square distribution with 2k degrees of freedom. Exact significance levels can be determined for any combination problem.

The concept of randomization to obtain statistics with exact distributions has been debated by statisticians. That a decision may depend on an extraneous source of variation seems to violate some common sense principle. Pearson [37] argues, however, that his randomization scheme is no more an extraneous source of variation than is the a priori random assignment of treatments to experimental units.

Lancaster [21] considers a pair of approximations to T^(F). Although Lancaster does not consider Pearson's method, the statistics he introduces can be expressed in terms of the L_P^(i) [19]:

i. Mean chi-square (χ²_m):

χ²_m = E_U(-2 ln L_P^(i)) = ∫_0^1 (-2 ln L_P^(i)) du = 2 - 2{L^(i)(X) ln L^(i)(X) - L^(i)(X+1) ln L^(i)(X+1)}/P(X).   (1.3)

ii. Median chi-square (χ̃²_m):

χ̃²_m = Median(-2 ln L_P^(i)) = -2 ln (1/2){L^(i)(X) + L^(i)(X+1)} if L^(i)(X+1) ≠ 0,
     = 2 - 2 ln L^(i)(X) if L^(i)(X+1) = 0.

The expectation of χ²_m is 2. The variance of χ²_m is slightly less than 4. The median chi-square is introduced because of its ease of calculation. With the ready availability of pocket calculators with ln functions, this justification no longer seems valid. The expectation of χ̃²_m is less than 2. The alternate definition of χ̃²_m for the case L^(i)(X+1) = 0 is intended to reduce the bias (without increasing the difficulty of the calculations) [21].

David and Johnson [10] undertake a theoretical investigation of the distribution of χ̃²_m. They prove that as n, the number of attainable levels of X (and hence of L), increases without limit, the moments of χ̃²_m converge to those of a chi-square distribution on 2 degrees of freedom. We obtain a similar result for χ²_m by adapting David and Johnson's proof for χ̃²_m (superscripts are omitted for convenience). From the definition of χ²_m given in equation (1.3), it follows that, for P(x_j) ≠ 0 (j = 1,2,...,n),

lim_{n→∞} E(χ²_m)^b = lim Σ_{j=1}^{n} P(x_j) [∫_0^1 (-2 ln L_P) du]^b
   = (-2)^b lim Σ_{j=1}^{n} P(x_j) [∫_0^1 ln(L(x_j) - u P(x_j)) du]^b
   = (-2)^b lim Σ P(x_j) [∫_0^1 {(L(x_j) - u P(x_j) - 1) - (1/2)(L(x_j) - u P(x_j) - 1)² + ...} du]^b.

(Note: ln(a) = (a-1) - (1/2)(a-1)² + (1/3)(a-1)³ - ... .) Upon performing the integrations,

lim E(χ²_m)^b = (-2)^b lim Σ P(x_j) [(L(x_j) - (1/2)P(x_j) - 1) - (1/2)([L(x_j) - 1]² - P(x_j)(L(x_j) - 1) + (1/3)[P(x_j)]²) + ...]^b.

Since all of the terms in the expansion of [∫_0^1 ln(L(x_j) - u P(x_j)) du]^b are multiplied by P(x_j), it is evident that u P(x_j) gives rise only to second-order terms in P(x_j), which can be ignored in the limit. The limit thus reduces to

(-2)^b lim Σ P(x_j) [ln L(x_j)]^b = (-2)^b ∫_0^1 [ln L(x)]^b dL(x).

Letting Y = -ln L(x) yields

(-2)^b (-1)^b ∫_0^∞ e^(-Y) Y^b dY = 2^b b!,

which is the b-th moment of a chi-square distribution with 2 degrees of freedom. The convergence of moments does not in general imply convergence in distribution. However, if there is at most one distribution function F such that lim ∫ x^b dF_n = ∫ x^b dF, then F_n → F in distribution [8]. Since the chi-square distribution is uniquely defined by its moments, it follows that χ²_m → χ²_2 in distribution.
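Equation (1.3) can also be checked numerically: because -2 ln L_P is distributed exactly as chi-square on 2 degrees of freedom, averaging χ²_m over the null distribution of X must give exactly 2, whatever the discrete null. A sketch (ours; the binomial null is an arbitrary example):

```python
from math import comb, log

def mean_chisquare_expectation(n, p):
    # chi2_m(x) = 2 - 2*{L(x) ln L(x) - L(x+1) ln L(x+1)}/P(x), eq. (1.3),
    # with L(x) = P{X >= x} and the convention 0 * ln(0) = 0.
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    lev = [sum(pmf[x:]) for x in range(n + 2)]  # lev[n + 1] = 0
    def xlnx(v):
        return v * log(v) if v > 0 else 0.0
    expectation = 0.0
    for x in range(n + 1):
        chi2_m = 2.0 - 2.0 * (xlnx(lev[x]) - xlnx(lev[x + 1])) / pmf[x]
        expectation += pmf[x] * chi2_m
    return expectation

print(mean_chisquare_expectation(7, 0.3))  # equals 2 up to rounding error
```

The identity E(χ²_m) = 2 holds exactly for any finite support, even though the full distribution of χ²_m only approaches chi-square on 2 degrees of freedom as the number of attainable levels grows.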

1.8 A Preview of Chapters II, III, and IV

In Section 2.1, the notion of Bahadur asymptotic relative efficiency is introduced. In Section 2.2, we derive the Bahadur exact slopes for T^(A) and T^(P). The results due to Littell and Folks mentioned in Section 1.3 are summarized in detail in Section 2.3. In Section 2.4, we extend the optimality property for T^(F) given by Littell and Folks to the discrete data case.

Chapter III deals with a particular combination problem: the combination of mutually independent binomial experiments. Fisher's method, T^(F), is compared to several methods which are based directly on the X^(i) (rather than the L^(i)). Comparisons are made via both exact and approximate slopes in Sections 3.3 and 3.4. The tests are also compared by exact power studies. These results are given in Section 3.5. A summary of the tests' relative performances follows. Recommendations as to the appropriate use of the methods are given. Section 1.7 describes some proposed approximations to the null distribution of T^(F); in Section 3.7, these methods are shown to be less reliable than might be expected. Alternative approaches are also evaluated.

In Section 4.2, the combination problem is approached from the standpoint of estimation. Confidence regions based upon T^(F) and T^(m) are derived. The remainder of Chapter IV introduces future research problems which are related to the general combination problem.

CHAPTER II
BAHADUR EFFICIENCIES OF GENERAL COMBINATION METHODS

2.1 The Notion of Bahadur Efficiency

Due to the intractability of exact distribution theory, it is often advantageous to consider an asymptotic comparison of two competing test statistics. In typical cases, the significance level attained by each test statistic will converge to zero at an exponential rate as the sample size increases without bound. The idea of Bahadur asymptotic relative efficiency is to compare the rates at which the attained significance levels converge to zero when the null hypothesis is not true. The test statistic which yields the faster rate of convergence is deemed superior. A more detailed definition follows.

Denote by (Y_1, Y_2,...) an infinite sequence of independent observations of a random variable Y, whose probability distribution P_θ depends on a parameter θ ∈ Θ. Let H be the null hypothesis H: θ ∈ Θ_0, and let A be the alternative A: θ ∈ Θ - Θ_0. For n = 1,2,..., let X_n be a real-valued test statistic which depends only on the first n observations Y_1,...,Y_n. Assume that the probability distribution of X_n is the same for all θ ∈ Θ_0. Define the significance level attained by X_n by L_n = 1 - F_n(X_n), where F_n(x) = P{X_n < x} under H. Let {X_n} = {X_1, X_2,...,X_n,...}

denote an infinite sequence of test statistics. In typical cases, there exists a positive-valued function c(θ), called the exact slope of {X_n}, such that for θ ∈ Θ - Θ_0,

(-2/n) ln L_n → c(θ)

with probability one [θ]. That is, P{lim (-2/n) ln L_n = c(θ)} = 1. If {X_n^(1)} and {X_n^(2)} have exact slopes c_1(θ) and c_2(θ), respectively, then the ratio φ_12(θ) = c_1(θ)/c_2(θ) is the exact Bahadur efficiency of {X_n^(1)} relative to {X_n^(2)}.

An alternative interpretation of φ_12(θ) is that it gives the limiting ratio of sample sizes required by the two test statistics to attain equally small significance levels. That is, if for ε > 0, N^(i)(ε) is the smallest sample size such that L_n^(i) < ε for all sample sizes n > N^(i)(ε), i = 1,2, then as ε tends to zero [3],

lim N^(2)(ε)/N^(1)(ε) = φ_12(θ).

Exact slopes are typically found by the following result.

Theorem 1. Suppose that the sequence {T_n} satisfies:
1. T_n/√n → b(θ) with probability one [θ];
2. there exists a function f(t), 0 < t < ∞, such that -(1/n) ln[1 - F_n(√n t)] → f(t), where F_n(t) = 1 - P{T_n ≥ t} under H.
Then the exact slope of {T_n} is c(θ) = 2f(b(θ)).
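As a concrete example of an exact slope (our illustration, not from the dissertation): for the one-sided test of H: μ = 0 based on the mean of n observations from N(μ,1), the attained level is L_n = 1 - Φ(√n Ȳ_n), and the exact slope is c(μ) = μ², since ln[1 - Φ(x)] = -x²/2 - ln(x√(2π)) + o(1) for large x. The sketch evaluates -(2/n) ln L_n along the limiting path Ȳ_n = μ, using the leading terms of the Mills-ratio expansion to avoid floating-point underflow:

```python
from math import log, pi, sqrt

def log_sf_normal(x):
    # ln(1 - Phi(x)) for large x > 0 via the asymptotic (Mills ratio) series
    return -x * x / 2.0 - log(x * sqrt(2.0 * pi)) + log(1.0 - 1.0 / x**2 + 3.0 / x**4)

def slope_estimate(mu, n):
    # -(2/n) ln L_n with L_n = 1 - Phi(sqrt(n) * mu): converges to mu**2
    return -2.0 / n * log_sf_normal(sqrt(n) * mu)

print(slope_estimate(1.0, 10**6))  # close to c(mu) = mu^2 = 1
```

The slower the level decays (the smaller the slope), the larger the sample size needed to reach a given level, which is exactly the ratio-of-sample-sizes interpretation above.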

2.2 The Exact Slopes for T^(A) and T^(P)

In terms of the above discussion the general combination problem can be defined as follows. There are k sequences {X_n1^(1)}, ..., {X_nk^(k)} of statistics for testing H: θ ∈ Θ_0. For all sample sizes n_1, ..., n_k, the statistics are independently distributed. Let L_ni^(i) be the level attained by X_ni^(i), i = 1, 2, ..., k.

Assume that for each i = 1, ..., k, the sequence {X_ni^(i)} has exact slope c_i(θ); that is,

    (−2/n_i) ln L_ni^(i) → c_i(θ) as n_i → ∞

with probability one [θ]. Assume also that the sample sizes n_1, ..., n_k satisfy n_1 + ... + n_k = nk and lim n_i/n = λ_i, i = 1, ..., k. Then λ_1 + ... + λ_k = k and

    (−2/n) ln L_ni^(i) → λ_i c_i(θ) as n → ∞

with probability one [θ]. As defined here, n can be thought of as the average sample size of the k tests.

Two general combining methods introduced in Section 1.2 are T^(A), Edgington's additive method, and T^(P), Pearson's method. Derivations of the exact slopes of T^(A) and T^(P) follow.
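For concreteness, the two statistics — in the equivalent forms used in Propositions 1 and 2 below — can be computed directly from the attained levels. This is an illustrative sketch; the function names are not from the text.

```python
import math

def t_additive(levels, n):
    # Edgington-type statistic, Proposition 1 form:
    # T_n^(A) = -(1/sqrt(n)) * ln(sum of attained levels).
    return -math.log(sum(levels)) / math.sqrt(n)

def t_pearson(levels, n):
    # Pearson-type statistic, Proposition 2 form:
    # T_n^(P) = -(1/sqrt(n)) * ln(-sum_i ln(1 - L_i)).
    return -math.log(-sum(math.log1p(-L) for L in levels)) / math.sqrt(n)
```

When the levels are small, −ln(1 − L) ≈ L and the two statistics nearly coincide; this is precisely the Taylor-series step used in the proof of Proposition 2.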


Proposition 1. Let T_n^(A) = −(1/√n) ln Σ_i L_ni^(i). The exact slope of T^(A) is

    c_A(θ) = k min_i λ_i c_i(θ).

Proof: This proof requires the above definition of T_n^(A), although a more obvious definition is the form given in Section 1.2 where the non-parametric methods are introduced. Nothing is lost, however, since equivalent statistics yield identical exact slopes.

Proposition 1 is proved by using Theorem 1 (Bahadur–Savage). The first step is to establish b(θ) of Part 1 of Theorem 1. To accomplish this, first suppose that λ_1 c_1(θ) = min_i {λ_i c_i(θ)}. It follows that for all ε > 0, there exists N = N(ε) such that for n > N,

    (−2/n) ln L_n1^(1) ≤ (−2/n) ln L_ni^(i), i = 1, 2, ..., k,

with probability one [θ]. Then, for n > N,

    L_n1^(1) ≥ L_ni^(i), i = 1, 2, ..., k,

with probability one [θ]. It follows that for n > N,

    L_n1^(1) ≤ Σ_{i=1}^k L_ni^(i) ≤ k L_n1^(1)

with probability one [θ]. Thus, for n > N,

    (−2/n) ln L_n1^(1) ≥ (−2/n) ln Σ_i L_ni^(i) ≥ (−2/n) ln kL_n1^(1)

with probability one [θ]. It follows that as n tends to infinity


    lim (−2/n) ln L_n1^(1) ≥ lim (−2/n) ln Σ_i L_ni^(i) ≥ lim (−2/n) ln kL_n1^(1)

with probability one [θ]. Hence,

    λ_1 c_1(θ) ≥ lim (−2/n) ln Σ_i L_ni^(i) ≥ λ_1 c_1(θ)

with probability one [θ] since, as n tends to infinity,

    lim (−2/n) ln kL_n1^(1) = lim {(−2/n) ln L_n1^(1) − (2/n) ln k} = lim (−2/n) ln L_n1^(1)

with probability one [θ]. Thus, T_n^(A)/√n = −(1/n) ln Σ_i L_ni^(i) → ½ λ_1 c_1(θ) with probability one [θ]. The choice of λ_1 c_1(θ) as the minimum was arbitrary. Hence,

    T_n^(A)/√n → ½ min_i λ_i c_i(θ),

giving b(θ) of Part 1 of Theorem 1. Now, as n tends to infinity,

    lim (1/n) ln[1 − F_n(√n t)] = lim (1/n) ln P{−(1/√n) ln Σ_i L_ni^(i) > √n t} = lim (1/n) ln P{Σ_i L_ni^(i) < e^{−nt}}.

Under H, the L_ni^(i) are independent uniform random variables on (0,1), so that P{Σ_i L_ni^(i) < e^{−nt}} = e^{−knt}/k! for e^{−nt} ≤ 1. Hence,

    lim (1/n) ln[1 − F_n(√n t)] = lim (1/n)[−knt − ln k!] = −kt.

This gives f(t) = kt of Part 2 of Theorem 1. Thus, from Theorem 1,

    c_A(θ) = 2f(b(θ)) = k min_i λ_i c_i(θ).

Proposition 2. Let T_n^(P) = −(1/√n) ln[−Σ_i ln(1 − L_ni^(i))]. The exact slope of T^(P) is

    c_P(θ) = k min_i λ_i c_i(θ).

The form of Pearson's statistic given in this proposition is equivalent to the form given in Section 1.2. The proof of Proposition 2 entails use of the Bahadur–Savage Theorem (Theorem 1). The derivation of b(θ) of Part 1 of the theorem parallels the derivation of b(θ) in Proposition 1. In order to establish f(t) for Part 2 of the theorem, a result due to Killeen, Hettmansperger, and Sievers [18] is required. They show that under broad conditions,

    ln f_n(√n t) − ln P{X_n > √n t} = o(1)    (2.1)

as n → ∞, where f_n(t) is the density function of X_n.

Proof of Proposition 2. By a first-order Taylor series approximation,

    −Σ_i ln(1 − L_ni^(i)) = Σ_i L_ni^(i) + Σ_i ε_n^(i) L_ni^(i),

where ε_n^(i) → 0 as L_ni^(i) → 0 for all i. It follows that as n tends to infinity,


    lim T_n^(P)/√n = lim −(1/n) ln[−Σ_i ln(1 − L_ni^(i))] = lim −(1/n) ln[Σ_i L_ni^(i)] = ½ min_i λ_i c_i(θ).

The last step is shown in Proposition 1. Now, to derive f(t) of Part 2 of Theorem 1 via equation (2.1), note that the U_ni^(i) = 1 − L_ni^(i) are mutually independent uniform random variables on (0,1). Letting V_n^(i) = −ln U_ni^(i), it follows that

    f_{V^(i)}(v) = e^{−v}, 0 < v < ∞,

so that Y_n = Σ_i V_n^(i) has a gamma distribution with density

    f_Y(y) = y^{k−1} e^{−y}/Γ(k), 0 < y < ∞.    (2.2)

Now, as n tends to infinity,

    lim (1/n) ln[1 − F_n(√n t)] = lim (1/n) ln P{−(1/√n) ln Y_n > √n t}


    = lim (1/n) ln P{Y_n < e^{−nt}}.

By the result given in equation (2.1), this limit may be evaluated from the density of T_n^(P). Substituting (2.2), the density of T_n^(P) = −(1/√n) ln Y_n is

    f_T(s) = (√n/Γ(k)) exp{−k√n s − exp(−√n s)},

so that

    lim (1/n) ln f_T(√n t) = lim (1/n)[½ ln n − ln Γ(k) − knt − e^{−nt}] = −kt.    (2.3)

This gives f(t) = kt of Part 2 of Theorem 1. Applying the result of Theorem 1,

    c_P(θ) = 2f(b(θ)) = k min_i λ_i c_i(θ).

2.3 Further Results on Bahadur Efficiencies

Bahadur exact slopes are derived in Littell and Folks [26] for other general combining methods. Their results are summarized below.

    Test      Exact Slope
    T^(F)     c_F(θ) = Σ_i λ_i c_i(θ)
    T^(N)     c_N(θ) = (1/k)[Σ_i (λ_i c_i(θ))^{1/2}]²
    T^(L)     c_L(θ) = k min_i λ_i c_i(θ)
    T^(m)     c_m(θ) = max_i λ_i c_i(θ)

Thus T^(A) and T^(P) have the same exact slope as T^(L). The relationship among these quantities is displayed in Figure 2.
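The listing can be checked numerically. The sketch below is illustrative (the function name and the single-letter method labels follow the table above, which is my reading of the scan); its inputs are the products λ_i c_i(θ). It also exhibits the ordering under which Fisher's slope is maximal.

```python
import math

def slopes(lam_c):
    # Exact slopes of the general combining methods, as functions of the
    # products v_i = lam_i * c_i(theta).
    k = len(lam_c)
    return {
        "F": sum(lam_c),                                 # Fisher
        "N": sum(math.sqrt(v) for v in lam_c) ** 2 / k,  # inverse normal
        "L": k * min(lam_c),                             # additive / Pearson
        "m": max(lam_c),                                 # minimum level
    }
```

For example, slopes([2.0, 0.5, 1.0]) gives c_F = 3.5, c_L = 1.5, c_m = 2.0, and c_N ≈ 3.25; c_F is the largest. When λ_1c_1 = λ_2c_2 = 1 (k = 2), c_F = c_N = c_L = 2 while c_m = 1.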


[Figure 2. The exact slopes c_F, c_N, c_L, and c_m compared as functions of λ_1c_1(θ) and λ_2c_2(θ) for k = 2; one labeled reference line is λ_1c_1 = (1 + √2)λ_2c_2. The drawing itself is not recoverable from the scan.]

The optimality property of T^(F) established by Littell and Folks [28] is mentioned in Section 1.3. The detailed result is given here as a theorem.

Theorem 2. Let T_n be any function of T_n1^(1), ..., T_nk^(k) which is non-decreasing in each of the T_ni^(i); that is, t_i ≤ t'_i, i = 1, ..., k, implies T_n(t_1, ..., t_k) ≤ T_n(t'_1, ..., t'_k). If {T_n} has exact slope c(θ), then

    c(θ) ≤ Σ_i λ_i c_i(θ),

so that no monotone combination procedure has exact slope exceeding that of Fisher's method.

The results above presume that the attained levels are continuous random variables. For discrete test statistics, the attained levels take only countably many values. Suppose, then, that the i-th test rejects H

for large values of X_ni^(i), so that the p_j^(i), j = 0, 1, 2, ..., n_i, are the observable significance levels for the i-th test; that is, the observable values of the random variable L_ni^(i) = 1 − F(X_ni^(i) − 1) under the null hypothesis. Assume that an exact slope exists for all tests; that is, assume that there exist functions c_1(θ), c_2(θ), ..., c_k(θ) such that

    (−2/n_i) ln L_ni^(i) → c_i(θ) as n_i → ∞

with probability one [θ] for i = 1, 2, ..., k.

Proposition 3. Let T_n^(F) = (−2 Σ_i ln L_ni^(i))^{1/2}. If lim_{n→∞} (1/n) ln[1 − F_n(√n t)] exists, then the exact slope of T^(F) is

    c_d(θ) = Σ_{i=1}^k λ_i c_i(θ).

Proof: This proof utilizes the Bahadur–Savage Theorem (Theorem 1). To establish the first part of Theorem 1, observe that

    T_n^(F)/√n = {(−2/n) ln L_n1^(1) + ... + (−2/n) ln L_nk^(k)}^{1/2} → (λ_1c_1(θ) + ... + λ_kc_k(θ))^{1/2} as n → ∞

with probability one [θ]. Consistent with the notation of Theorem 1, denote this limiting quantity b_d(θ).

Now, to establish f(t) of Part 2 of Theorem 1, choose ℓ_i ∈ (0,1), i = 1, 2, ..., k. For each i, there exists a j such that p_j^(i) ≤ ℓ_i < p_{j−1}^(i).


Now, since L_ni^(i) is a discrete random variable,

    P{L_ni^(i) ≤ ℓ_i} = P{L_ni^(i) ≤ p_j^(i)} = p_j^(i) ≤ ℓ_i.

Thus

    P{L_ni^(i) ≤ ℓ_i} ≤ P{U^(i) ≤ ℓ_i},

where the U^(i) are mutually independent uniform random variables on (0,1). It follows that

    P{−2 ln L_ni^(i) ≥ ℓ'_i} ≤ P{−2 ln U^(i) ≥ ℓ'_i},

and hence that

    P{Σ_{i=1}^k −2 ln L_ni^(i) ≥ ℓ''} ≤ P{Σ_{i=1}^k −2 ln U^(i) ≥ ℓ''}.

Thus,

    (1/n) ln P{[Σ_i −2 ln L_ni^(i)]^{1/2} > √n t} ≤ (1/n) ln P{[Σ_i −2 ln U^(i)]^{1/2} > √n t}.    (2.4)

The quantity Z_n = [Σ_i −2 ln U^(i)]^{1/2} is distributed as the square root of a chi-square with 2k degrees of freedom. It follows that the density of Z_n is

    f_{Z_n}(z) = z^{2k−1} e^{−z²/2} / (2^{k−1} Γ(k)), 0 < z < ∞.

Thus, from the result given in (2.1), the limit of the right-hand side of (2.4) can be written as

    lim (1/n) ln f_{Z_n}(√n t) = lim (1/n) ln[(√n t)^{2k−1} exp(−(√n t)²/2) / (2^{k−1}Γ(k))] = lim (1/n)[(2k−1) ln(√n t) − ½nt²] = −½t².
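The inequality P{L ≤ ℓ} ≤ ℓ for a discrete attained level can be verified by direct enumeration. The sketch below (illustrative; the function names are assumed) does so for X ~ Binomial(n, 1/2) with level L = P{X ≥ x_obs}.

```python
from math import comb

def level_of(x, n):
    # Observable level p_x = P{X >= x} for X ~ Binomial(n, 1/2).
    return sum(comb(n, i) for i in range(x, n + 1)) / 2 ** n

def prob_level_at_most(n, ell):
    # P{L <= ell} under H, where L = level_of(X, n).
    return sum(comb(n, x) for x in range(n + 1) if level_of(x, n) <= ell) / 2 ** n
```

P{L ≤ ℓ} ≤ ℓ holds for every ℓ, with equality exactly at the attainable levels p_j — the stochastic comparison with a uniform variable used above.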


Hence, it follows from (2.4) that f_d(t) ≥ ½t², since it is assumed that the limit of the left-hand side exists. Applying the result of Theorem 1,

    c_d(θ) ≥ Σ_i λ_i c_i(θ).

By Theorem 2,

    c_d(θ) ≤ Σ_i λ_i c_i(θ).

Hence,

    c_d(θ) = Σ_i λ_i c_i(θ).

That is, the exact slope of T^(F) is Σ_i λ_i c_i(θ) regardless of whether the X_ni^(i) are continuous or discrete. The condition imposed in Proposition 3 that lim (1/n) ln[1 − F_n(√n t)] exist is not very restrictive. It is satisfied in most typical cases and in every example considered in this dissertation.


CHAPTER III
THE COMBINATION OF BINOMIAL EXPERIMENTS

3.1 Introduction

Chapter III deals with the combination of binomial experiments. That is, suppose k binomial experiments are performed. Let n_1, n_2, ..., n_k and X_n1^(1), X_n2^(2), ..., X_nk^(k) be the sizes and the observed numbers of successes, respectively, for the experiments. Denote the unknown success probabilities as p_1, p_2, ..., p_k. Suppose one wishes to test the overall null hypothesis H: p_1 = p_10, p_2 = p_20, ..., p_k = p_k0 versus the alternative hypothesis H_A: p_1 ≥ p_10, ..., p_k ≥ p_k0 (with strict inequality for at least one p_i). The problem, then, is to choose the best function of (X_n1^(1), ..., X_nk^(k)) for this hypothesis test.

The results of Chapters I and II support T^(F) as a non-parametric method with good overall power when there is no prior information concerning the unknown parameters. The method based on the minimum significance level, T^(m), is sensitive to situations where exactly one of the individual hypotheses is rejected. That is, T^(m) is powerful versus the alternative H_1 of Section 1.3.

The investigations of Koziol and Perlman [20] and Oosterhoff [33] show that the general non-parametric combining methods can be improved


on for certain parametric combining problems. It follows that there may be combination methods based directly on (X_n1^(1), ..., X_nk^(k)) that are superior to Fisher's omnibus procedure. Chapter III is a detailed comparison of T^(F) and several parametric combination methods.

3.2 Parametric Combination Methods

As stated in Section 1.3, no method of combination is most powerful versus all possible alternative hypotheses. There are, however, certain restricted alternative hypotheses against which most powerful tests do exist. Let the likelihood function for the i-th binomial experiment be denoted by

    L(p_i) = C(n_i, X^(i)) p_i^{X^(i)} (1 − p_i)^{n_i − X^(i)},    (3.1)

where C(n_i, X^(i)) is the binomial coefficient. According to the Neyman–Pearson Lemma, if a most powerful test of the null hypothesis H: p_i = p_i0, all i, versus the alternative hypothesis H_A: p_i ≥ p_i0 (with strict inequality for at least one i) exists, it is to reject H if

    Π_{i=1}^k L(p_i)/L(p_i0) > c.

Upon substituting (3.1) and taking logs, an equivalent form of the test is to reject H if

    Σ_i X^(i) ln{p_i(1 − p_i0)/[p_i0(1 − p_i)]} > c.    (3.2)

It follows that rejecting H when

    Σ_i X^(i) > c    (3.3)

is most powerful if p_i(1 − p_i0)/[p_i0(1 − p_i)] is constant in i.
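The reduction from (3.2) to (3.3) is easy to confirm numerically. In the sketch below (illustrative; the function name is assumed), when p_i(1 − p_i0)/[p_i0(1 − p_i)] is constant in i the statistic is a constant positive multiple of ΣX^(i), so the two tests order the data identically.

```python
import math

def np_combined_stat(x, p_alt, p_null):
    # Left-hand side of (3.2): sum_i x_i * ln{p_i (1 - p_i0) / [p_i0 (1 - p_i)]}.
    return sum(xi * math.log(pa * (1 - p0) / (p0 * (1 - pa)))
               for xi, pa, p0 in zip(x, p_alt, p_null))
```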


The problem of combining 2×2 contingency tables is closely related to the problem being considered. The purpose of each 2×2 table can be interpreted as testing for the equality of success probabilities between a standard treatment and an experimental treatment. The overall null hypothesis is that the experimental and standard success probabilities are equal in all experiments. The overall alternative hypothesis is that the experimental success probability is superior in at least one experiment. Cochran [10] and Mantel and Haenszel [29] suggest the statistic

    Σ_i w_i d_i / (Σ_i w_i p̄_i q̄_i)^{1/2},    (3.4)

where w_i = n_i1 n_i2/(n_i1 + n_i2) and d_i = p̂_i1 − p̂_i2, for combining 2×2 tables. Mantel and Pasternack [34] have discussed this statistic in the context of combining binomial experiments. Each individual binomial experiment is similar to an experiment resulting in a 2×2 table, two cells of which are empty because the control success probability is considered known and need not be estimated from the data. The statistic defined by (3.4) will be denoted by T^(CMH). It can easily be shown that the test T^(CMH) > c is equivalent to the test Σ_i X^(i) > c; thus T^(CMH) is the most powerful test when p_i(1 − p_i0)/[p_i0(1 − p_i)] is constant in i.

In many practical combination problems with binomial experiments, p_i0 = 1/2 for all i. The overall null hypothesis is then H: p_i = 1/2, i = 1, 2, ..., k, and the general alternative hypothesis is H_A: p_i ≥ 1/2 (strict inequality for at least one i). This is the hypothesis testing problem under


consideration throughout the remainder of Chapter III. For p_i0 = 1/2, all i, T^(CMH) is uniformly most powerful for testing H: p_i = 1/2, all i, versus H_A: p_1 = p_2 = ... = p_k > 1/2.

For the hypothesis test just described, T^(CMH) can be written

    T^(CMH) = Σ_{i=1}^k (X^(i) − ½n_i) / (Σ_{j=1}^k ¼n_j)^{1/2}.    (3.5)

This variate is asymptotically standard normal. It is of note that this form is standardized by a pooled estimate of the standard deviation. An alternative statistic can be formed by standardizing each X^(i), yielding

    T^(χ) = k^{−1/2} Σ_i (X^(i) − ½n_i)/(¼n_i)^{1/2},

which also has an asymptotic standard normal distribution. The statistic T^(χ) is analogous to the sum of chi's procedure which has been recommended for combining 2×2 tables. The statistic T^(χ) is not in general equivalent to T^(CMH); in fact, the test T^(χ) > c is equivalent to the test Σ_i n_i^{−1/2} X^(i) > c. When the n_i are all equal, T^(χ) and T^(CMH) are equivalent.

Weighted sums of the X^(i), say S(g) = Σ_i g_i X^(i), form a general class of statistics. Oosterhoff considers this class and makes the following observations concerning their relationship with the individual sample sizes [32]. It follows from (3.2) that if ln(p_i/(1 − p_i)) = a g_i, then the most powerful test of H versus H_A is Σ_i g_i X^(i) > c. Let p_i = ½ + ε_i. It follows that
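Both statistics are immediate to compute. The sketch below (illustrative; the function names are assumed) implements (3.5) and the standardized-sum form, and exhibits their equivalence for equal n_i.

```python
import math

def t_cmh(x, n):
    # Pooled standardization, eq. (3.5).
    num = sum(xi - ni / 2 for xi, ni in zip(x, n))
    return num / math.sqrt(sum(ni / 4 for ni in n))

def t_chi(x, n):
    # Sum of chi's: k^{-1/2} * sum_i (x_i - n_i/2) / sqrt(n_i/4).
    k = len(x)
    return sum((xi - ni / 2) / math.sqrt(ni / 4)
               for xi, ni in zip(x, n)) / math.sqrt(k)
```

With n_1 = n_2 the two agree; with unequal sample sizes they can disagree noticeably, reflecting the different implicit weights n_i versus n_i^{1/2}.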


    ln(p_i/(1 − p_i)) = ln((1 + 2ε_i)/(1 − 2ε_i)) = 2{2ε_i + (2ε_i)³/3 + (2ε_i)⁵/5 + ...} ~ 4ε_i as ε_i → 0.

This implies that for alternatives close to the null hypothesis H, S(g) is most powerful if ε_i = a g_i; that is, if the deviations from the null values of the p_i are proportional to the respective g_i. The sum of chi's procedure, T^(χ), is a special case where g_i = n_i^{−1/2}. It follows that the set of alternatives against which T^(χ) is powerful is strongly related to the sample sizes n_1, n_2, ..., n_k.

The weighted sum, S(g), may be a viable statistic if prior information concerning the p_i is available. Under the null hypothesis, S(g) is a linear combination of binomial random variables, each with success probability 1/2. The null distribution of S(g) will therefore be asymptotically normal. The proper normalization of S(g) is analogous to that of T^(CMH) given in (3.5).

A well-known generalization of the likelihood ratio test is to reject the null hypothesis for large values of

    −2 ln{sup_{θ ∈ Θ_0} L(θ, X) / sup_{θ ∈ Θ} L(θ, X)}.

It is easily shown that for the hypothesis test being considered, the likelihood ratio statistic is

    T^(LR) = 2 Σ_i {X^(i) ln(2X^(i)/n_i) + (n_i − X^(i)) ln(2(1 − X^(i)/n_i))} I{X^(i)/n_i > 1/2},

where

    I{X^(i)/n_i > 1/2} = 1 if X^(i)/n_i > 1/2, and 0 if X^(i)/n_i ≤ 1/2.


Under broad conditions, which are satisfied in this instance, the statistic T^(LR) has an asymptotic chi-square distribution with k degrees of freedom. Suppose z_i, i = 1, 2, ..., k, are normal random variables with means μ_i and variance 1. The likelihood ratio test for H: μ_i = 0, i = 1, ..., k, versus H_A: μ_i ≥ 0 (with strict inequality for at least one i) is to reject H for large values of

    Σ_{i=1}^k z_i² I{z_i > 0}.    (3.6)

For the binomial problem, an "approximate likelihood ratio" test is then to reject for large values of

    T^(ALR) = Σ_{i=1}^k {(X^(i) − ½n_i)² / (¼n_i)} I{X^(i)/n_i > 1/2},

since (X^(i) − ½n_i)/(¼n_i)^{1/2} is asymptotically a standard normal random variable under H. The exact null distribution of (3.6) is easily derived. Critical values are tabled in Oosterhoff. When p = 1/2, the normal approximation to the binomial is considered satisfactory for even fairly small sample sizes. It follows that the exact null distribution of (3.6) should serve as an adequate approximation to the null distribution of T^(ALR).

3.3 Exact Slopes of Parametric Methods

In this section, the exact slopes of T^(CMH), T^(LR), and T^(F) are compared. We have not been successful in deriving the exact slope for T^(χ). A more complete comparison of methods is given in Section 3.4 with respect to approximate slopes.
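The two statistics can be sketched as follows (illustrative; the function names are assumed). The indicator confines contributions to experiments with observed proportion above 1/2, matching the one-sided alternative.

```python
import math

def t_lr(x, n):
    # Likelihood ratio statistic for H: p_i = 1/2 vs p_i >= 1/2.
    total = 0.0
    for xi, ni in zip(x, n):
        if xi / ni > 0.5:
            term = xi * math.log(2 * xi / ni)
            if xi < ni:
                term += (ni - xi) * math.log(2 * (1 - xi / ni))
            total += 2 * term
    return total

def t_alr(x, n):
    # "Approximate likelihood ratio": sum of squared positive z-scores.
    total = 0.0
    for xi, ni in zip(x, n):
        z = (xi - ni / 2) / math.sqrt(ni / 4)
        if z > 0:
            total += z * z
    return total
```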


Suppose X_ni^(i) is a binomial random variable based on n_i observations with unknown success probability p_i. Consider testing the single null hypothesis H: p_i = 1/2 versus the single alternative hypothesis H_A: p_i > 1/2.

Proposition 4. Let T_ni^(i) = X_ni^(i)/√n_i. The exact slope of T^(i) is

    c_i(θ) = 2{p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.

Theorem 1 is used to prove Proposition 4. There are several means by which the function f(t) of Part 2 of Theorem 1 can be obtained. Perhaps the most straightforward way is by using Chernoff's Theorem [1]. Bahadur, in fact, suggests that the derivation of f(t) provides a good exercise in the application of Chernoff's Theorem.

Theorem 3 (Chernoff's Theorem). Let y be a real-valued random variable and let φ(t) = E(e^{ty}) be the moment generating function of y. Then 0 < φ(t) ≤ ∞ for each t and φ(0) = 1. Let ρ = inf{φ(t): t ≥ 0}. Let y_1, y_2, ... denote a sequence of independent replicates of y and, for n = 1, 2, ..., let P_n = P{y_1 + ... + y_n ≥ 0}. Then (1/n) ln P_n → ln ρ as n → ∞.

Proof of Proposition 4. For Part 1 of Theorem 1,

    T_ni^(i)/√n_i = X_ni^(i)/n_i → p_i as n_i → ∞

with probability one [θ], giving b(θ) = p_i. For the binomial problem, θ = (p_1, p_2, ..., p_k)′.


Now, as n_i tends to infinity,

    lim (1/n_i) ln[1 − F_ni(√n_i a)] = lim (1/n_i) ln P{X_ni^(i)/√n_i > √n_i a}
        = lim (1/n_i) ln P{X_ni^(i) > n_i a}
        = lim (1/n_i) ln P{X_ni^(i) − n_i a ≥ 0}.    (3.7)

The random variable X_ni^(i) − n_i a can be expressed as

    X_ni^(i) − n_i a = (y_1 − a) + (y_2 − a) + ... + (y_{n_i} − a),

where the y_j are independent replicates of a Bernoulli random variable y with parameter 1/2. Therefore, φ(t) of Chernoff's Theorem is

    φ(t) = E e^{t(y − a)} = ½ e^{−at}(1 + e^t).

It follows that

    ρ = inf{½ e^{−at}(1 + e^t): t ≥ 0}.

The quantity ½ e^{−at}(1 + e^t) is minimized at t = ln(a/(1 − a)). Thus,

    ρ = ½ e^{−a ln(a/(1−a))}(1 + a/(1 − a)),

and

    ln ρ = −a ln(a/(1 − a)) + ln ½(1 + a/(1 − a)).
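The minimization can be confirmed numerically. In the sketch below (illustrative; the function names are assumed) a grid minimum of ln φ(t) is compared with the closed form just derived, which simplifies to ln ρ = −{a ln 2a + (1 − a) ln 2(1 − a)}.

```python
import math

def ln_phi(t, a):
    # ln of phi(t) = (1/2) e^{-a t} (1 + e^t).
    return -a * t + math.log1p(math.exp(t)) - math.log(2.0)

def ln_rho_numeric(a, tmax=10.0, steps=20001):
    # Grid minimum of ln phi(t) over t in [0, tmax].
    return min(ln_phi(i * tmax / (steps - 1), a) for i in range(steps))

def ln_rho_closed(a):
    # ln rho = -{a ln 2a + (1 - a) ln 2(1 - a)}, cf. eq. (3.8).
    return -(a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a)))
```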


Hence,

    lim (1/n_i) ln P{X_ni^(i) ≥ n_i a} = −a ln(a/(1 − a)) + ln ½(1 + a/(1 − a)),    (3.8)

giving f(a) = a ln(a/(1 − a)) − ln ½(1 + a/(1 − a)) of Part 2 of Theorem 1. Thus,

    c_i(θ) = 2f(b(θ)) = 2{p_i ln(p_i/(1 − p_i)) − ln ½(1 + p_i/(1 − p_i))} = 2{p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.

Following the notation of Section 2.2, suppose k binomial experiments are to be combined and the sample sizes n_1, n_2, ..., n_k satisfy n_1 + n_2 + ... + n_k = nk and lim n_i/n = λ_i, i = 1, ..., k. Then λ_1 + ... + λ_k = k and

    (−2/n) ln L_ni^(i) → λ_i c_i(θ) as n → ∞.

According to Proposition 3, c_F(θ) = Σ_i λ_i c_i(θ) in both the continuous and discrete cases if lim (1/n_i) ln[1 − F_ni(√n_i t)] exists. The existence of this limit for a single binomial experiment is shown in (3.8) of the proof of Proposition 4. Therefore, for the binomial combination problem, the exact slope for Fisher's method, T^(F), is

    c_F(θ) = 2 Σ_i λ_i {p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.

A property of likelihood ratio test statistics is that they achieve the maximum possible exact slope [2]. Theorem 2 states that the exact slope for the combination problem is bounded above by Σ_i λ_i c_i(θ). Proposition 3 shows that T^(F) achieves this bound. It follows


that T^(F) and the maximum likelihood test have the same exact slope; that is,

    c_F(θ) = c_LR(θ) = 2 Σ_i λ_i {p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.

This relationship is true regardless of whether the data are discrete or continuous.

Let T_n^(CMH) = (1/(k√n)) Σ_i X_ni^(i). This form of the Cochran–Mantel–Haenszel statistic is equivalent to those previously given in (3.3) and (3.5).

Proposition 5. The exact slope of T^(CMH) is

    c_CMH(θ) = 2k{p̄ ln 2p̄ + (1 − p̄) ln 2(1 − p̄)}, where p̄ = (1/k) Σ_i λ_i p_i.

Proof. To get b(θ) of Part 1 of Theorem 1,

    T_n^(CMH)/√n = Σ_i X_ni^(i)/(nk) → (1/k) Σ_i λ_i p_i = p̄

with probability one [θ]. Now, for Part 2 of Theorem 1, as n tends to infinity,

    lim (1/n) ln[1 − F_n(√n a)] = lim (1/n) ln P{(1/(nk)) Σ_i X_ni^(i) ≥ a} = lim k (1/(nk)) ln P{Σ_i X_ni^(i) ≥ nka}.    (3.9)


Under the null hypothesis, Σ_i X_ni^(i) is a binomial random variable based on n_1 + ... + n_k = nk trials with success probability 1/2. The quantity (3.9) is the same as the quantity (3.7) except that n_i has been replaced by nk. Theorem 3 can be directly applied to line (3.9), yielding

    lim (1/n) ln[1 − F_n(√n a)] = −k{a ln(a/(1 − a)) − ln ½(1 + a/(1 − a))} = −f(a),

and therefore

    c_CMH(θ) = 2f(b(θ)) = 2k{p̄ ln(p̄/(1 − p̄)) − ln ½(1 + p̄/(1 − p̄))} = 2k{p̄ ln 2p̄ + (1 − p̄) ln 2(1 − p̄)}.

A comparison of T^(CMH) relative to T^(F) and T^(LR) with respect to exact slopes is given in the next section.

Derivation of the exact slope of the sum of chi's procedure, T^(χ), has not been accomplished. An incomplete approach to the problem follows.

Let T_n^(χ) = Σ_i n_i^{−1/2} X_ni^(i). To derive b(θ) of Part 1 of Theorem 1,

    T_n^(χ)/√n = Σ_i (X_ni^(i)/n_i)(n_i/n)^{1/2} → Σ_i λ_i^{1/2} p_i as n → ∞

with probability one [θ]. Now, as n tends to infinity,

    lim (1/n) ln[1 − F_n(√n a)] = lim (1/n) ln P{Σ_i (n/n_i)^{1/2} X_ni^(i) ≥ na}.


The left-hand side of the above probability statement is a weighted sum of independent binomial random variables based on varying sample sizes n_1, ..., n_k, each with success probability 1/2. The moment generating function of this random variable is therefore

    Π_{i=1}^k [½ + ½ exp(t (n/n_i)^{1/2})]^{n_i}.    (3.10)

From the form of the moment generating function given in (3.10), it is apparent that the random variable in question can be regarded as a sum of n independent and identically distributed blocks, so that the φ(t) of Theorem 3 is

    φ(t) = e^{−at} Π_{i=1}^k (½ + ½ e^{t λ_i^{−1/2}})^{λ_i},

and

    ρ = inf{φ(t): t ≥ 0}.

The quantity ρ has not been found in closed form.
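Although ρ is not available in closed form, it is easy to evaluate numerically for given λ_i and a. The sketch below is an illustrative numerical exploration, not a derivation from the text, and the function names are assumed. As a check, for λ_1 = λ_2 = 1 the minimum must agree with twice the single-binomial value from (3.8) evaluated at a/2, since the factors of φ(t) then coincide.

```python
import math

def ln_phi_chi(t, a, lam):
    # ln phi(t) = -a t + sum_i lam_i * ln{(1 + exp(t / sqrt(lam_i))) / 2}.
    return -a * t + sum(li * (math.log1p(math.exp(t / math.sqrt(li))) - math.log(2.0))
                        for li in lam)

def ln_rho_chi(a, lam, tmax=10.0, steps=20001):
    # Grid minimum of ln phi(t) over t in [0, tmax].
    return min(ln_phi_chi(i * tmax / (steps - 1), a, lam) for i in range(steps))
```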


3.4 Approximate Slopes of Parametric Methods

Exact slopes are defined in Section 2.1. In Section 3.3 some comparisons among methods are made with respect to exact slopes and corresponding efficiencies. Bahadur also defines a quantity called the approximate slope [3]. Suppose that X_n has an asymptotic null distribution F; that is,

    lim_{n→∞} F_n(x) = F(x)

for all x. For each n, let

    L_n^(a) = 1 − F(X_n)

be the approximate level attained. (Consistent with Bahadur's notation, the superscript a stands for approximate.) If there exists a c^(a)(θ) such that

    (−2/n) ln L_n^(a) → c^(a)(θ)

with probability one [θ], then c^(a)(θ) is called the approximate slope of {X_n}. If c_i^(a)(θ) is the approximate slope of a sequence {X_n^(i)}, i = 1, 2, then c_1^(a)(θ)/c_2^(a)(θ) is known as the approximate asymptotic efficiency of {X_n^(1)} relative to {X_n^(2)}.

A result similar to Theorem 1 is given by Bahadur [3] for the calculation of approximate slopes. Suppose that there exists a function b(θ), 0 < b(θ) < ∞, such that

    T_n/√n → b(θ)


with probability one [θ]. Suppose that for some a, 0 < a < ∞, the limiting null distribution F satisfies

    ln[1 − F(t)] ~ −½ a t² as t → ∞.

Then the approximate slope is c^(a)(θ) = a[b(θ)]². This result is applicable with a = 1 for statistics with asymptotic standard normal distributions [3]. This result can be shown directly by applying the result of Killeen et al. given in (2.1).

The approximate slope, c^(a)(θ), and the exact slope, c(θ), of a sequence of test statistics are guaranteed to be in agreement only for alternative hypotheses close to the null hypothesis. Otherwise, they may result in very different quantities. One notable exception is the likelihood ratio statistic. When the asymptotic null distribution is taken to be the chi-square distribution from the well-known −2 ln (likelihood ratio statistic) approximation, the approximate slope of the likelihood ratio statistic is the same as the exact slope.

The approximate slope is based upon the asymptotic distribution of the statistic. Equivalent test statistics may have different asymptotic null distributions giving rise to different approximate slopes. This apparent shortcoming does not exist with exact slopes.

In typical applied situations, the significance levels attained by T^(CMH) and T^(χ) will be ascertained by appealing to their asymptotic normal distributions. Similarly, T^(LR) will be compared to the appropriate chi-square distribution, and approximate levels for T^(ALR) will be obtained from the asymptotic distribution given in Section 1.2. Approximate slopes based upon these asymptotic distributions would therefore seem to afford a more appropriate comparison of the methods. In other


words, it is appealing to consider the null distribution that will be used to obtain significance levels in practice when comparing the statistics. The only statistic which will not usually be compared to an asymptotic distribution is perhaps T^(CMH). The null distribution of T^(CMH) is binomial based on n_1 + ... + n_k trials with success probability 1/2. However, even with the availability of extensive binomial tables, T^(CMH) will often be standardized as in (3.5) and compared to standard normal tables, since the normal approximation to the binomial when p = 1/2 is satisfactory even for fairly small sample sizes.

The asymptotic null distribution of T^(F) in the discrete case is easily shown to be chi-square with 2k degrees of freedom. This is also the exact distribution of T^(F) in the continuous case. It follows that the approximate slope in the discrete case is the same as the exact slope in the continuous case. In summary,

    c_LR^(a)(θ) = c_F^(a)(θ) = c_LR(θ) = c_F(θ) = 2 Σ_i λ_i {p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}.

In order to derive the approximate slopes for T^(CMH) and T^(χ), consider the linear combination Σ_i n_i^α X^(i), of which T^(CMH) and T^(χ) are special cases. The variate X^(i) has an asymptotic normal distribution with mean ½n_i and variance ¼n_i under the null hypothesis. It follows directly that

    T_n^(α) = (Σ_i n_i^α X^(i) − ½ Σ_i n_i^{α+1}) / (¼ Σ_i n_i^{2α+1})^{1/2}

is asymptotically standard normal.

Proposition 6. The approximate slope of T^(α) is

    c_α^(a)(θ) = [Σ_i λ_i^{α+1}(2p_i − 1)]² / Σ_i λ_i^{2α+1}.


Proof. First, to get b(θ),

    T_n^(α)/√n = (Σ_i n_i^α X^(i) − ½ Σ_i n_i^{α+1}) / (√n (¼ Σ_i n_i^{2α+1})^{1/2}) → Σ_i λ_i^{α+1}(2p_i − 1) / (Σ_i λ_i^{2α+1})^{1/2} as n → ∞

with probability one [θ]. Now, since T_n^(α) is asymptotically standard normal,

    c_α^(a)(θ) = [b(θ)]² = [Σ_i λ_i^{α+1}(2p_i − 1)]² / Σ_i λ_i^{2α+1}.

Letting α = 0 yields the approximate slope of T^(CMH),

    c_CMH^(a)(θ) = [Σ_i λ_i(2p_i − 1)]²/k.

Letting α = −½ yields the approximate slope of T^(χ),

    c_χ^(a)(θ) = [Σ_i λ_i^{1/2}(2p_i − 1)]²/k.


By inspection of the above approximate slopes, it is apparent that T^(CMH) is more efficient when the p_i are proportional to the λ_i (the relative sample sizes) and T^(χ) is more efficient when the p_i are inversely related to the λ_i. The boundary of the parameter space where c_χ^(a)(θ) = c_CMH^(a)(θ) separates these two regions. The statistic T^(CMH) is more efficient than T^(χ) in more than half of the parameter space. As a further comparison, e^(a)(T^(CMH), T^(χ)), the approximate efficiency of T^(CMH) with respect to T^(χ), can be integrated over the parameter space. The result is greater than one, which again supports use of T^(CMH). It should be noted, however, that when the p_i are proportional to the λ_i, both tests have high efficiencies relative to when the p_i are inversely related to the λ_i. Therefore T^(χ) is more efficient in a region of the parameter space where both tests have relatively low efficiency. This is a good property for T^(χ).

An "approximate" likelihood ratio test is introduced in Section 3.2. A statistic which is equivalent to the form given in Section 3.2 is

    T_n^(ALR) = [Σ_i {(X^(i) − ½n_i)²/(¼n_i)} I{X^(i)/n_i > ½}]^{1/2}.

Proposition 7. The approximate slope of T^(ALR) is

    c_ALR^(a)(θ) = Σ_i λ_i (2p_i − 1)².

Proof. To find b(θ) of Part 1 of Theorem 1,

    T_n^(ALR)/√n → {4 Σ_i λ_i (p_i − ½)²}^{1/2} as n → ∞


with probability one [θ]. Thus,

    b(θ) = {4 Σ_i λ_i(p_i − ½)²}^{1/2} = {Σ_i λ_i(2p_i − 1)²}^{1/2}.

To find f(t) of Part 2 of Theorem 1, the asymptotic null distribution of T^(ALR) is required. According to Oosterhoff [33],

    P{Σ_i z_i² I{z_i > 0} ≥ s} = 2^{−k} Σ_{j=1}^k C(k, j) P{χ_j² ≥ s},

where the z_i are standard normal random variables. Since, under the null hypothesis,

    (X^(i) − ½n_i)/(¼n_i)^{1/2} → z_i

in distribution, it follows that

    P{T_n^(ALR) ≥ s} = P{(T_n^(ALR))² ≥ s²} → 2^{−k} Σ_{j=1}^k C(k, j) P{χ_j² ≥ s²}    (3.11)

as n → ∞ for all s. It follows that the associated density function is a linear combination of chi-square densities. The result of Killeen et al. can be applied to verify that

    ln[1 − F(t)] ~ −½t² as t → ∞,

where F is the asymptotic null distribution of T^(ALR). Hence,

    c_ALR^(a)(θ) = [b(θ)]² = Σ_i λ_i(2p_i − 1)².

Before proceeding to a further comparison of approximate slopes, the slopes are summarized in the following listing.


    Test                                      Approximate Slope
    Fisher's (T^(F))                          2 Σ_i λ_i {p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}
    Likelihood Ratio (T^(LR))                 2 Σ_i λ_i {p_i ln 2p_i + (1 − p_i) ln 2(1 − p_i)}
    "Approximate Likelihood Ratio" (T^(ALR))  Σ_i λ_i (2p_i − 1)²
    Sum of Chi's (T^(χ))                      (1/k)[Σ_i λ_i^{1/2}(2p_i − 1)]²
    Cochran–Mantel–Haenszel (T^(CMH))         (1/k)[Σ_i λ_i(2p_i − 1)]²

Letting A_i = λ_i^{1/2}(2p_i − 1), it is easy to see that c_ALR^(a)(θ) ≥ c_χ^(a)(θ), since Σ_i A_i² ≥ (1/k)[Σ_i A_i]². It is also true that c_ALR^(a)(θ) ≥ c_CMH^(a)(θ). Let B_i = (2p_i − 1). It can easily be shown that

    Σ_i λ_i B_i² − (1/k)[Σ_i λ_i B_i]² = (1/k) Σ_{i<j} λ_i λ_j (B_i − B_j)² ≥ 0,

given that Σ_i λ_i = k. Therefore T^(ALR) dominates both T^(CMH) and T^(χ) with respect to approximate slopes.

Approximate efficiencies of T^(CMH) and T^(ALR) with respect to T^(F) (and equivalently T^(LR)) for λ_1 = λ_2 = 1 are given for several points in the parameter space in Table 1. In this case of equal sample sizes, T^(CMH) is equivalent to T^(χ). Table 2 gives efficiencies of T^(CMH), T^(χ), and T^(ALR) with respect to T^(F) (and equivalently T^(LR)) for λ_1 = 1/3, λ_2 = 5/3. The values of λ_1 and λ_2 imply that the second test is based on five times as many observations as the first test. When the exact null distribution of T^(CMH) (binomial with parameters Σ_i n_i and 1/2) is
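The listing can be encoded directly. The sketch below is illustrative (the function name is assumed, and the p_i are restricted to (½, 1) to keep the logarithms finite); it verifies the dominance relations just established at sample points.

```python
import math

def approx_slopes(lam, p):
    # Approximate slopes from the listing above; the lam_i sum to k.
    k = len(lam)
    B = [2 * pi - 1 for pi in p]
    fisher = 2 * sum(li * (pi * math.log(2 * pi) + (1 - pi) * math.log(2 * (1 - pi)))
                     for li, pi in zip(lam, p))
    return {
        "F": fisher,
        "ALR": sum(li * bi ** 2 for li, bi in zip(lam, B)),
        "chi": sum(math.sqrt(li) * bi for li, bi in zip(lam, B)) ** 2 / k,
        "CMH": sum(li * bi for li, bi in zip(lam, B)) ** 2 / k,
    }
```

For λ = (1/3, 5/3), taking (p_1, p_2) = (0.9, 0.6) makes T^(χ) the better of the two weighted sums, while (0.6, 0.9) reverses the ordering; T^(ALR) dominates both in either case.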


to be used to determine significance levels, it is more appropriate to employ the exact slope, c_CMH(θ), rather than the approximate slope, c_CMH^(a)(θ). The exact efficiencies of T^(CMH) relative to T^(F) (and equivalently T^(LR)) are given in Tables 1 and 2 in parentheses.

The efficiencies listed in Tables 1 and 2 support several previously made observations:

1. The statistic T^(ALR) dominates T^(CMH) and T^(χ) with respect to approximate slopes (efficiencies).

2. The test based on T^(CMH) dominates the test based on T^(χ) when the success parameters are proportional to the sample sizes. The test based on T^(χ) is more efficient in the reverse case. The test based on T^(χ) is more efficient in a region of relatively low efficiencies for both T^(χ) and T^(CMH).

3. Exact and approximate slopes are not, in general, equivalent. They are in close agreement for parameters close to the null hypothesis.

4. All of the tabled efficiencies are at most one. This is expected from the optimality properties of T^(F) and T^(LR) given by Theorem 2 and Proposition 3. A value of one is achieved only for the exact efficiency of T^(CMH) when p_1 = p_2. This is consistent with the fact that T^(CMH) is the most powerful test (and the likelihood ratio test) when p_1 = p_2.


Table 1. Efficiencies of T^(CMH), T^(χ), and T^(ALR) relative to T^(F), or equivalently, to T^(LR); λ_1 = λ_2 = 1. Rows index p_1 and columns index p_2 over the grid .5, .6, .7, .8, .9, 1.0; exact efficiencies of T^(CMH) appear in parentheses.

[The numerical entries of Table 1 are not recoverable from the scan.]


Table 2. Efficiencies of T^(CMH), T^(χ), and T^(ALR) relative to T^(F), or equivalently, to T^(LR); λ_1 = 1/3, λ_2 = 5/3. Rows index p_1 and columns index p_2 over the grid .5, .6, .7, .8, .9, 1.0; exact efficiencies of T^(CMH) appear in parentheses.

[The numerical entries of Table 2 are not recoverable from the scan.]


3.5 Powers of Combination Methods

In the previous two sections, competing methods were compared with respect to asymptotic efficiencies. Asymptotic efficiencies compare sequences of test statistics in some sense as the sample sizes tend to infinity. Such comparisons may or may not be applicable to situations where small sample sizes are encountered. Therefore, the methods of combination are compared in this section with respect to exact power.

As mentioned previously, exact power studies are often intractable. For the test statistics considered here, power functions are not obtainable in any simple form which would allow direct comparisons between competing methods. However, through the use of the computer, it is possible to plot contours of equal power in the parameter space. From such plots, the relative powers of the competing methods can be surmised.

The first step in obtaining the power contours is the generation of the null distributions for each of the five statistics: T^(F), T^(CMH), T^(χ), T^(LR), and T^(ALR). Size α = .05 acceptance regions for each of the statistics for varying sample sizes are shown in Figures 3–8. Acceptance regions for tests with equal sample sizes (n_1 = n_2 = 10, 15, 20, 30) appear in Figures 3–6. The statistics T^(LR) and T^(ALR) define very similar, but not identical, tests. For n_1 = n_2 = 10, 15, 20, 30 they define exactly the same α = .05 acceptance regions in all four cases, and will therefore yield identical power contours. Fisher's statistic, T^(F), defines a test similar to T^(LR) and T^(ALR) for n_1 = n_2 = 10, 15; in fact, T^(F) defines the same α = .05 acceptance region for those sample sizes. The major difference between


T^(F) and the two likelihood ratio statistics, T^(LR) and T^(ALR), is that T^(F) has many more attainable levels. For sample sizes n_1 = n_2 = 20, 30, T^(F) defines different α = .05 acceptance regions than T^(LR) and T^(ALR). The statistics T^(CMH) and T^(χ) are equivalent for n_1 = n_2.

Figures 7–8 portray acceptance regions for cases of unequal sample sizes (n_1 = 10, n_2 = 20 and n_1 = 10, n_2 = 50). The difference between T^(CMH) and T^(χ) is apparent for the case of unequal sample sizes: the two statistics define different α = .05 acceptance regions. In both figures, it is seen that T^(F), T^(LR), and T^(ALR) define similar regions.

In Section 1.3 it was stated that Birnbaum [7] has shown that combination procedures must produce convex acceptance regions in the (X^(1), X^(2), ..., X^(k)) hyperplane in order to be admissible. Each of the acceptance regions in Figures 3–8 appears to satisfy this convexity condition.

The acceptance regions given in Figures 3–8 are not exact α = .05 size regions. They are the nominal acceptance regions which are closest to size α = .05. In order to make a fair comparison among the powers of the competing methods, all of the acceptance regions must be of exactly the same size. This can be accomplished by admitting certain values of (X^(1), X^(2)) to the acceptance region with probabilities between zero and one. A more precise definition of this procedure follows. Suppose

    P{T_n^(i) ≤ t_ℓ} = .95 − a,
    P{T_n^(i) ≤ t_u} = .95 + b,


and T^(i) does not take on any values between t_l and t_u. Then all (X^(1), X^(2)) such that T^(i) = t_l are included in the acceptance region with probability one, and T^(i) = t_u is included in the acceptance region with probability a/(a+b).

The power of the i-th test is one minus the probability that T^(i) falls in the acceptance region. More precisely, define the power of the i-th test to be

π_i(p1, p2) = 1 - [P{T^(i) ≤ t_l | (p1, p2)} + (a/(a+b)) P{T^(i) = t_u | (p1, p2)}]

= 1 - [ Σ_{(x^(1), x^(2)): T^(i) ≤ t_l} C(n1, x^(1)) p1^{x^(1)} (1-p1)^{n1-x^(1)} C(n2, x^(2)) p2^{x^(2)} (1-p2)^{n2-x^(2)}

+ (a/(a+b)) Σ_{(x^(1), x^(2)): T^(i) = t_u} C(n1, x^(1)) p1^{x^(1)} (1-p1)^{n1-x^(1)} C(n2, x^(2)) p2^{x^(2)} (1-p2)^{n2-x^(2)} ].

For each test statistic, power is calculated for 2500 values of (p1, p2) in the alternative parameter space. This was accomplished with a FORTRAN computer program. A data set consisting of these calculated powers is then passed into SAS [4,16]. The plotting capabilities of SAS are then exploited to portray contours of equal power in the (p1, p2) plane.
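The randomized-acceptance-region construction above can be sketched in modern code. The following Python fragment is an illustrative reconstruction (not the original FORTRAN program): it builds an exact size-.05 randomized acceptance region for Fisher's statistic applied to k = 2 binomial experiments, then evaluates the exact power at a point (p1, p2). All function names are ours.

```python
from itertools import product
from math import comb, log

def binom_pmf(n, x, p):
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def level(n, x):
    # one-sided significance level P{Bin(n, 1/2) >= x}
    return sum(comb(n, j) for j in range(x, n + 1)) / 2.0 ** n

def fisher_T(x1, x2, n1, n2):
    return -2.0 * (log(level(n1, x1)) + log(level(n2, x2)))

def exact_power(n1, n2, p1, p2, alpha=0.05):
    pts = [(x1, x2, round(fisher_T(x1, x2, n1, n2), 10))
           for x1, x2 in product(range(n1 + 1), range(n2 + 1))]
    # null mass of each attainable value of T
    mass = {}
    for x1, x2, t in pts:
        mass[t] = mass.get(t, 0.0) + binom_pmf(n1, x1, 0.5) * binom_pmf(n2, x2, 0.5)
    # accept small T; accumulate until 1 - alpha is straddled by (t_l, t_u)
    cum = 0.0
    for t in sorted(mass):
        if cum + mass[t] >= 1.0 - alpha:
            t_u, a, b = t, (1.0 - alpha) - cum, cum + mass[t] - (1.0 - alpha)
            break
        cum += mass[t]
    gamma = a / (a + b)          # inclusion probability for T = t_u
    accept = 0.0
    for x1, x2, t in pts:
        w = binom_pmf(n1, x1, p1) * binom_pmf(n2, x2, p2)
        if t < t_u:
            accept += w
        elif t == t_u:
            accept += gamma * w
    return 1.0 - accept
```

By construction the randomized test has size exactly alpha, so `exact_power(n1, n2, 0.5, 0.5)` returns .05 up to rounding.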


Figures 9-14 are .90 power contours corresponding to the acceptance regions of Figures 3-8, respectively. The Cochran-Mantel-Haenszel procedure, T^(CMH), is most powerful in the center of the parameter space; that is, when p1 and p2 are nearly equal. This is expected, since T^(CMH) is uniformly most powerful when p1 = p2 for any choice of sample sizes. The statistic T^(CMH) is clearly inferior to T^(F), T^(LR), and T^(ALR) in the extremes of the parameter space, that is, when p1 and p2 are quite different. Further, the deficiency of T^(CMH) compared to the other methods when p1 and p2 are different is larger than the deficiency of the other methods when p1 = p2. From Figures 9-12 it can also be seen that the central wedge of the parameter space where T^(CMH) is more powerful shrinks as the sample sizes increase.

Fisher's statistic, T^(F), and the likelihood ratio statistics, T^(LR) and T^(ALR), have similar power. Fisher's method gives slightly more power in the central region of the parameter space, while T^(LR) and T^(ALR) are slightly more powerful when p1 and p2 are very different.

For unequal sample sizes (Figures 13, 14), T^(F) and T^(LR) yield power contours too similar to be separated on the drawings. The approximate likelihood ratio test, T^(ALR), has almost the same power as T^(F) and T^(LR), having slightly more power when the experiment based on more observations has the larger p_i, and slightly less power in the reverse case. The sum of chi's procedure, T^(X), is not equivalent to T^(CMH) when n1 ≠ n2. The power contours are very different, with T^(CMH) being more powerful when the larger experiment matches with a large p_i. The statistic T^(X) is more powerful in the opposite case.


Figures 15-20 are .60 power contours concomitant with the .90 power contours in Figures 9-14. The comparison of competing methods may be more appropriately made at low powers: when all of the powers of the tests are high, it is probably unimportant which test is used. The patterns observed in the .90 power contours are virtually the same in the .60 power contours, however. No additional information is apparent except that the patterns are consistent over a wide range of powers.


[Figure 3. Acceptance Regions for n1 = n2 = 10; curves for T^(F), T^(LR), T^(ALR), and T^(CMH) in the (X^(1), X^(2)) plane.]


[Figure 4. Acceptance Regions for n1 = n2 = 15; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 5. Acceptance Regions for n1 = n2 = 20; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 6. Acceptance Regions for n1 = n2 = 30; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 7. Acceptance Regions for n1 = 10, n2 = 20; curves for T^(F) and T^(LR) (virtually the same), T^(ALR), T^(CMH), and T^(X).]


[Figure 8. Acceptance Regions for n1 = 10, n2 = 50; curves for T^(F), T^(LR), T^(ALR), T^(CMH), and T^(X) (caption not recovered; sample sizes inferred from the text).]


[Figure 9. .90 Power Contours for n1 = n2 = 10; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 10. .90 Power Contours for n1 = n2 = 15; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 11. .90 Power Contours for n1 = n2 = 20; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 12. .90 Power Contours for n1 = n2 = 30; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 13. .90 Power Contours for n1 = 10, n2 = 20; curves for T^(F) and T^(LR) (virtually the same), T^(ALR), T^(CMH), and T^(X).]


[Figure 14. .90 Power Contours for n1 = 10, n2 = 50; curves for T^(F) and T^(LR) (virtually the same), T^(ALR), T^(CMH), and T^(X).]


[Figure 15. .60 Power Contours for n1 = n2 = 10; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 16. .60 Power Contours for n1 = n2 = 15; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 17. .60 Power Contours for n1 = n2 = 20; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 18. .60 Power Contours for n1 = n2 = 30; curves for T^(F), T^(LR), T^(ALR), and T^(CMH).]


[Figure 19. .60 Power Contours for n1 = 10, n2 = 20; curves for T^(F) and T^(LR) (virtually the same), T^(ALR), T^(CMH), and T^(X).]


[Figure 20. .60 Power Contours for n1 = 10, n2 = 50; curves for T^(F) and T^(LR) (virtually the same), T^(ALR), T^(CMH), and T^(X).]


3.6 A Synthesis of Comparisons

When detailed prior knowledge of the unknown parameters is unavailable, the class of competing methods can be restricted to T^(F), T^(LR), T^(ALR), and T^(X). These methods are compared with respect to various criteria in previous sections. In this section, the results of these comparisons are synthesized to make recommendations concerning the optimum choice of method for various situations.

For the comparisons in the previous sections, the null hypothesis considered is H0: p1 = p2 = ... = pk = 1/2. The most general alternative hypothesis considered is H_A: p_i ≥ 1/2 (strict inequality for at least one i). In some situations, it is reasonable to assume that the success probability is consistent from experiment to experiment; in such cases the alternative hypothesis of interest is H_B: p1 = p2 = ... = pk > 1/2. A third alternative hypothesis of possible interest is H_C: p_j > 1/2 (exactly one j). This alternative is appropriate if the researcher believes that at most one p_j will be greater than 1/2. The hypotheses H_A and H_B are probably the more frequently encountered alternatives in practical situations.

The following recommendations are based on evidence presented thus far in this dissertation:

1. The minimum significance level, T^(m), has good power versus the H_C alternative. It performs poorly, however, versus other alternatives.

2. The Cochran-Mantel-Haenszel statistic, T^(CMH), forms the uniformly most powerful test against H_B. Its use is therefore indicated


whenever it can be assumed that the p_i are not very different. The statistic T^(CMH) performs relatively poorly versus alternatives in the extremes of the parameter space (Type C alternatives).

3. Fisher's combination, T^(F), is not, in general, the most powerful test versus a particular simple alternative hypothesis. Its power, however, is never much less than that of the optimum test. Fisher's method gives good coverage to the entire parameter space, and its use is therefore indicated whenever specification of the alternative hypothesis cannot be made more precisely than H_A.

4. There seems to be no compelling reason to recommend the use of the sum of chi's procedure, T^(X), unless it is known, a priori, that the p_i are inversely related to the sample sizes of the individual binomial experiments.

5. The likelihood ratio statistic, T^(LR), and the approximate likelihood ratio statistic, T^(ALR), define tests very similar to T^(F). They obtain approximately the same powers throughout the parameter space. Choosing among these three statistics then depends upon which yields significance levels with the greatest ease and accuracy. This problem is addressed in Section 3.7.


3.7 Approximation of the Null Distributions of T^(F), T^(LR), and T^(ALR)

In Section 1.6 the problem of obtaining significance levels for Fisher's statistic, T^(F), when the data are discrete is discussed. Lancaster's transformations, X²_m and X′²_m, are introduced. It is established in Section 1.6 that X²_m and X′²_m both converge to chi-squares with 2k degrees of freedom. Although Lancaster's approach can be expected to yield good approximate levels for large sample sizes, the degree of accuracy has not been established for small or moderate sample sizes. Some indication of the accuracy of significance levels obtained from X²_m and X′²_m is given by observing the mean and variance of these variates. Table 3 lists the means and variances for n = 1, 2, ..., 20 for X²_m and X′²_m when applied to one experiment. Since the altered form of T^(F) will be compared to a chi-square distribution with 2k degrees of freedom, it is desirable that the mean and variance of X²_m and X′²_m be as close as possible to the mean and variance of the chi-square distribution with 2 degrees of freedom, which are 2 and 4, respectively. For n > 3, the mean and variance of X²_m are closer to 2 and 4, respectively, than the mean and variance of X′²_m. This suggests that X²_m should, in general, be a more accurate approximation than X′²_m.

In Section 3.2, the likelihood ratio statistic, T^(LR), and the approximate likelihood ratio statistic, T^(ALR), are introduced. The necessary regularity conditions can be shown to be satisfied for T^(LR), so that the statistic can be deemed asymptotically a chi-square with k degrees of freedom. As previously stated, the null distribution of


[Table 3. Mean and Variance of Lancaster's X′²_m (median chi-square) and X²_m (mean chi-square), n = 1, 2, ..., 20; table data not recovered.]
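The two corrections compared in Table 3 can be stated compactly. For a discrete test with attained level P = P{T ≥ t_obs} and next smaller attainable level P′ = P{T > t_obs}, Lancaster's median chi-square replaces -2 ln L by -2 ln of the midpoint of (P′, P], while the mean chi-square replaces it by the conditional expectation of -2 ln U over that interval, U being uniform on (0, 1). A minimal Python sketch (function names are illustrative):

```python
from math import log

def mean_chi_square(P, P_next):
    """Lancaster's mean chi-square: E[-2 ln U | P_next < U <= P],
    where P = P{T >= t_obs} and P_next = P{T > t_obs}."""
    if P_next == 0.0:                      # most extreme outcome
        return 2.0 - 2.0 * log(P)
    return 2.0 - 2.0 * (P * log(P) - P_next * log(P_next)) / (P - P_next)

def median_chi_square(P, P_next):
    """Lancaster's median chi-square: -2 ln of the interval midpoint."""
    return -2.0 * log((P + P_next) / 2.0)
```

Summing either quantity over the k experiments gives the corrected form of T^(F), which is then referred to a chi-square distribution on 2k degrees of freedom.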


T^(ALR) can be approximated with a distribution derived by Oosterhoff. Significance levels are then determined by the relationship

P{T^(ALR) > c} = 2^{-k} Σ_{j=1}^{k} C(k, j) P{χ²_j > c}.

The null density functions of T^(F), T^(LR), and T^(ALR) are plotted in Figures 21-22 for k = 2, n1 = n2 = 6. These plots give an indication of the difficulty of approximating the respective null density functions. More extreme (larger) values of the statistics do not always occur with smaller probabilities; this fact gives the jagged appearance of the density functions. This lack of smoothness causes difficulty in approximating a discrete density with a continuous one.

The remainder of this section contains numerical comparisons of the above-mentioned approximations. The goal is to choose the approximation which yields significance levels closest to the exact levels of the respective statistic.

Tables 4 and 5 correspond to the density functions pictured in Figures 21-22. Table 4 lists the possible events as ordered by T^(F). Lancaster's approximate statistics, X²_m and X′²_m, are calculated for each event. Although it is not generally true, X²_m and X′²_m maintain the same ordering of events as the uncorrected T^(F). Significance levels obtained by comparing X²_m and X′²_m to a chi-square distribution with four degrees of freedom are then compared to exact levels. The inaccuracies of these approximations are reflected in the columns labeled percentage error.

Table 5 gives an evaluation of the approximations given by T^(LR) and T^(ALR). These statistics define equivalent tests, but yield different approximations to the exact densities.
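The mixture tail quoted above involves only central chi-square tails, which have closed forms for small degrees of freedom. A Python sketch (illustrative names; the survival function uses the standard two-step recurrence over degrees of freedom):

```python
from math import comb, erfc, exp, gamma, sqrt

def chi2_sf(df, c):
    """P{chi-square(df) > c}, by the standard recurrence over df."""
    if df % 2 == 0:
        q, j = exp(-c / 2.0), 2
    else:
        q, j = erfc(sqrt(c / 2.0)), 1
    while j < df:
        q += (c / 2.0) ** (j / 2.0) * exp(-c / 2.0) / gamma(j / 2.0 + 1.0)
        j += 2
    return q

def alr_sf(k, c):
    """Oosterhoff's approximation: 2^-k * sum_j C(k, j) P{chi2(j) > c}."""
    return sum(comb(k, j) * chi2_sf(j, c) for j in range(1, k + 1)) / 2.0 ** k
```

Note that alr_sf(k, 0) = 1 - 2^{-k}, reflecting the point mass of 2^{-k} that T^(ALR) places at zero under the null hypothesis.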


The percentage errors given in Tables 4 and 5 tend to favor Lancaster's approximations over the approximations to T^(LR) and T^(ALR). Both X²_m and X′²_m yield generally conservative results in this particular case. The mean chi-square, X²_m, is somewhat more accurate than the median chi-square, X′²_m.

All of the approximations can be expected to improve as the sample sizes increase. To indicate the behavior of the contending approximations for increasing sample sizes, nominal α = .05 and α = .01 values for each statistic are given in Tables 6-8 for n1 = n2 = 3, 4, 5, .... The data in Tables 4-8 indicate that Lancaster's approximations clearly dominate the approximations to T^(LR) and T^(ALR). The optimal choice then becomes either the X²_m or the X′²_m correction to T^(F). Table 6 gives no clear indication as to whether X²_m or X′²_m yields a better approximation. Both statistics give large errors for small sample sizes. Both statistics yield errors less than 12% for both α = .05 and α = .01 levels for n > 16.

The superiority of the mean chi-square, X²_m, over the median chi-square, X′²_m, becomes clear for k = 3. Table 9 gives the nominal α = .05 and α = .01 values for X²_m and X′²_m for k = 3, n1 = n2 = n3 = 2, 3, 4, ..., 10. The mean chi-square, X²_m, is more accurate in all cases but two (α = .01, n = 8 and α = .05, n = 5).


[Figure 21. Density Function of T^(F) for k = 2, n1 = n2 = 6.]


84 .4030 .20 04 y(LR) y(ALR) (equivalent for T^^^^=T^^^^^=0) 3 6 49 12 15 18 Figure 22. Density Functions of T*-^^^ and t*'^^^'' for n = n =6.


[Table 4. Lancaster's Approximations to T^(F) for k = 2, n1 = n2 = 6: events ordered by T^(F), with X²_m and X′²_m values, approximate and exact levels, and percentage errors; table data not reliably recovered.]


[Table 4 (continued).]


[Table 5. Approximations to T^(LR) and T^(ALR) for k = 2, n1 = n2 = 6; table data not recovered.]


[Tables 6-9. Nominal α = .05 and α = .01 values for the contending approximations (Tables 6-8: k = 2, n1 = n2 = 3, 4, 5, ...; Table 9: k = 3, n1 = n2 = n3 = 2, 3, ..., 10); table data not recovered.]


The statistics T^(F), T^(LR), and T^(ALR) define very similar tests. In Section 3.6 it is concluded, therefore, that the choice among these three statistics should depend upon which affords the best approximation to its null distribution. The evidence of this section indicates that Lancaster's mean chi-square (X²_m) approximation to T^(F) is the best choice. Even this approximation yields large errors for small sample sizes. Tables 10 and 11 give nominal α = .05 and α = .01 levels for Π L_i, an equivalent and more convenient form of T^(F), for k = 2 and k = 3. The exact significance levels are also given.


[Table 10. Nominal α = .05 and α = .01 Events for Π L_i, k = 2; table data not recovered.]


[Table 11. Nominal α = .05 and α = .01 Events for Π L_i, k = 3; table data not recovered.]


CHAPTER IV
APPLICATIONS AND FUTURE RESEARCH

4.1 Introduction

In the first three chapters, the combination problem was approached from the standpoint of hypothesis testing. It is often desirable to consider problems in statistical inference from the standpoint of estimation. In Section 4.2, such an approach is considered: confidence regions based upon previously considered testing methods are derived. The remaining sections of Chapter IV discuss potential research problems for which the combination methods studied in this dissertation may be applied to gain solutions.

4.2 Estimation: Confidence Regions Based on Non-parametric Combination Methods

Consider the general combining problem where the null hypotheses of concern are parametric. That is, the i-th null hypothesis can be written H_i: θ_i = d_i, where θ_i is an unknown parameter and d_i is a specified value. Since the test statistic X^(i) typically depends on the particular value of d_i, write X^(i) = X^(i)(d_i). The observed significance level for the i-th test can then be written L^(i)(d_i) = 1 - F_{i,d_i}(X^(i)(d_i)), where F_{i,d_i} is the cumulative distribution function


of X^(i)(d_i) under the null hypothesis H_i: θ_i = d_i. Following the same notation, Fisher's statistic can be written

T^(F)(d1, d2, ..., dk) = -2 Σ_{i=1}^{k} ln L^(i)(d_i)

and Tippett's statistic, the minimum significance level, can be written

T^(m)(d1, d2, ..., dk) = min_i L^(i)(d_i).

A 100(1-α)% joint confidence region for the parameter vector θ = (θ1, θ2, ..., θk) can be obtained by inverting Fisher's statistic T^(F)(d1, ..., dk). Following the usual procedure of inverting test statistics, the resulting region is given by

S_F = {(d1, d2, ..., dk): -2 Σ_{i=1}^{k} ln L^(i)(d_i) ≤ χ²_{2k}(1-α)},

where χ²_{2k}(1-α) is the 1-α quantile of the central chi-square distribution with 2k degrees of freedom. Regions can likewise be obtained by inverting T^(m), yielding

S_m = {(d1, d2, ..., dk): L^(i)(d_i) > 1 - (1-α)^{1/k}, i = 1, 2, ..., k}.

Any of the other non-parametric combining methods introduced in Section 1.2 can likewise be inverted to yield confidence regions.

If the θ_i are one-dimensional and functionally independent, then the region S_m is the usual rectangular confidence region, and therefore is easily computed and displayed in application. The inequality which defines S_F cannot generally be simplified. However, as will be shown, the region S_F has a simple explicit form for location parameters of negative exponential distributions. While the region S_F is not as simply computed and displayed as S_m, it can be obtained and


displayed using a computer program package capable of computing significance levels (that is, capable of evaluating probability integral transformations) and constructing plots. The Statistical Analysis System (SAS) [4,15] is one such package, which can readily evaluate the probability integral transformation for common distribution functions (including normal, t, F, and gamma) with integrated data management and plotting capabilities. All the plots in the examples of this section were obtained by hand or by using SAS. The amount of work involved in constructing the plots is essentially the same as that required to plot response surface contours for second order models.

The relative performance of confidence regions is directly related to the relative performance of the corresponding tests whose inversion yields the confidence regions. Specifically, a most powerful test yields confidence sets which minimize the coverage probability of "non-true" parameters [21]. The results of the comparisons among combining procedures given in this dissertation can therefore be translated into comparisons of procedures for obtaining joint confidence regions.

The results of Chapters I and II tend generally to support T^(F) as a procedure which, though not most powerful, has good power (relative to a most powerful test) over most of the combined alternative space. We thus conclude that, if θ = (θ1, ..., θk) is the "true" parameter value, then S_F should have relatively low coverage probability of θ′ for all θ′ ≠ θ. It will be seen that S_F is often similar to the region S_LR based on the likelihood ratio statistic.

The test statistic T^(m) generally has quite good relative power against points in the combined alternative space for which only one (or


at least a small number) of H1, ..., Hk are false (Type C alternative), but does not have good relative power against points for which all, or nearly all, of H1, ..., Hk are false. Therefore, S_m should have quite low relative coverage probability of θ′ = (θ′1, ..., θ′k) if, say, θ′1 ≠ θ1 and θ′_i = θ_i, i = 2, ..., k, but S_m may have a rather high relative coverage probability of θ′ if θ′_i ≠ θ_i, i = 1, ..., k.

Several examples of confidence regions based upon non-parametric combining methods are now presented.

Example 1. Shift Parameters for Negative Exponentials

Let x_{i1}, ..., x_{in_i} be a random sample from a population with density function f(x_{ij}; θ_i) = e^{-(x_{ij} - θ_i)}, x_{ij} > θ_i, i = 1, ..., k. We wish to obtain 100(1-α)% confidence regions for θ = (θ1, ..., θk). The likelihood ratio test of H_i: θ_i = d_i versus the alternative A_i: θ_i > d_i is to reject H_i if X_{i(1)} - d_i is larger than a prescribed value, where X_{i(1)} is the smallest order statistic from the i-th sample. Taking X^(i)(d_i) = X_{i(1)} - d_i, one obtains L^(i)(d_i) = e^{-n_i(X_{i(1)} - d_i)} and thus

S_F = {(d1, ..., dk): 2 Σ_{i=1}^{k} n_i(X_{i(1)} - d_i) ≤ χ²_{2k}(1-α); d_i ≤ X_{i(1)}, i = 1, ..., k}.

The (one-sided) rectangular region S_m has the form

S_m = {(d1, ..., dk): X_{i(1)} - c_i < d_i ≤ X_{i(1)}, i = 1, ..., k},

where c_i = -ln[1 - (1-α)^{1/k}] / n_i. Figure 23 displays the two regions for an example with k = 2; for these shift parameters the Fisher region coincides with the likelihood ratio region, S_F = S_LR.

[Figure 23. Confidence Regions S_m and S_F = S_LR for the Shift Parameters of Two Negative Exponential Distributions.]
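The regions of Example 1 are explicit enough to compute directly. A Python sketch for α = .05 and k ≤ 3 (the chi-square quantiles are tabulated for this level; all names are illustrative):

```python
from math import log

CHI2_95 = {2: 5.991, 4: 9.488, 6: 12.592}   # .95 quantiles of chi-square(2k)

def neg_exp_regions(minima, sizes):
    """Regions of Example 1 for k shifted exponential samples, alpha = .05.
    minima : the sample minima X_i(1);  sizes : the sample sizes n_i.
    Returns the per-coordinate bounds of the rectangular region S_m and
    a membership test for the Fisher region S_F."""
    alpha = 0.05
    k = len(minima)
    c = [-log(1.0 - (1.0 - alpha) ** (1.0 / k)) / n for n in sizes]
    s_m = [(x - ci, x) for x, ci in zip(minima, c)]      # intervals (lower, upper]
    q = CHI2_95[2 * k]
    def in_s_f(d):
        if any(di > xi for di, xi in zip(d, minima)):
            return False          # the likelihood is zero for d_i > X_i(1)
        return 2.0 * sum(n * (x - di)
                         for n, x, di in zip(sizes, minima, d)) <= q
    return s_m, in_s_f
```

The point (X_1(1), ..., X_k(1)) always belongs to S_F, and each coordinate interval of S_m has upper endpoint X_i(1), mirroring the one-sided form displayed above.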


Example 2. Means of Normal Distributions

The region based on Fisher's combination is

S_F = {(d1, ..., dk): -2 Σ_{i=1}^{k} ln 2[1 - Φ(|X̄_i - d_i|/σ_i)] ≤ χ²_{2k}(1-α)},

where Φ is the standard normal distribution function.


[Figure 24. Confidence Regions S_m, S_F, and S_LR for Means of Two Normal Distributions.]


The appearance of S_F and S_m when the σ_i² are unknown is essentially the same as shown in Figure 24.

Now suppose the μ_i have a common, but unknown, value μ, and suppose the σ_i² are unknown. An exact 100(1-α)% confidence set for μ can once again be obtained by inverting T^(F) or T^(m), giving

S_F = {d: -2 Σ_{i=1}^{k} ln 2[1 - F_i(X^(i)(d))] ≤ χ²_{2k}(1-α)}

and

S_m = {d: 2[1 - F_i(X^(i)(d))] > 1 - (1-α)^{1/k}, i = 1, ..., k},

where X^(i)(d) = √n_i |x̄_i - d|/s_i and F_i is the distribution function of Student's t on n_i - 1 degrees of freedom.

It should be noted that S_F and S_m are not necessarily intervals. The set S_F may be a union of intervals, and the set S_m is the intersection of the k individual 100(1-α)^{1/k}% intervals that could be constructed from the individual samples. However, the failure of S_F or S_m to be an interval is an indication that the model is not valid, that is, that not all the means really are equal. Thus the failure of S_F or S_m to yield an interval is more a fault of the assumed model than of the statistical procedure.

It should be noted that, generally, if one desires to estimate a parameter θ common to each of k populations, and if nuisance parameters vary from one population to the next, then a combination of k individual tests for θ can be inverted to obtain a combined confidence set for θ.
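In each of these examples, checking whether a candidate parameter point lies in S_F or S_m reduces to inequalities on the observed significance levels. A Python sketch (illustrative names; the chi-square quantile is obtained by bisection on the closed-form even-degrees-of-freedom tail):

```python
from math import exp, log

def chi2_even_quantile(df, prob):
    """prob-quantile of chi-square with even df, by bisection on the
    closed-form tail exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    def cdf(x):
        term, s = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2.0) / j
            s += term
        return 1.0 - exp(-x / 2.0) * s
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def in_S_F(levels, alpha=0.05):
    # levels: the observed L^(i)(d_i) at the candidate point
    stat = -2.0 * sum(log(L) for L in levels)
    return stat <= chi2_even_quantile(2 * len(levels), 1.0 - alpha)

def in_S_m(levels, alpha=0.05):
    k = len(levels)
    return min(levels) > 1.0 - (1.0 - alpha) ** (1.0 / k)
```

Plotting either region amounts to evaluating these membership tests over a grid of candidate points, which is the computation the SAS contour plots of this section perform.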


Example 3. Mean and Variance for a Normal Distribution

Suppose x1, ..., xn is a sample from a normal distribution with mean μ and variance σ². The usual test of H1: μ = m, σ = t versus the alternative A1: μ ≠ m, σ = t, where m and t are known and specified, is to reject H1 if X^(1) = √n |x̄ - m|/t is large. Also, the usual test of H2: σ = t versus the alternative A2: σ ≠ t is to reject H2 if X^(2) = Σ(x_i - x̄)²/t² is either too small or too large. Under the combined null hypothesis H = H1 ∩ H2, the distributions of X^(1) and X^(2) are independent. Thus an exact 100(1-α)% confidence region for (μ, σ²) is given by

S_F = {(m, t²): -2 ln L^(1)(m, t) - 2 ln L^(2)(t) ≤ χ²_4(1-α)}     (4.2)

and the corresponding rectangular region is

S_m = {(m, t²): L^(1)(m, t) > 1 - (1-α)^{1/2}, L^(2)(t) > 1 - (1-α)^{1/2}}.

[Figure 25. Confidence Regions S_m, S_F, and S_LR for the Mean and Variance of a Normal Distribution.]
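Several of the examples in this section require two-sided t and F significance levels. These can be evaluated without special tables from the regularized incomplete beta function; the Python sketch below follows the standard continued-fraction evaluation (Numerical Recipes style), and includes the Fisher combination of one t level and one F level, as used in Sukhatme's procedure of Example 5. All names are illustrative.

```python
from math import exp, lgamma, log

def _betacf(a, b, x, itmax=200, eps=3e-12):
    # continued fraction for the incomplete beta function
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c, d = 1.0, 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > 1e-300 else 1e-300)
    h = d
    for m in range(1, itmax + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > 1e-300 else 1e-300)
        c = 1.0 + aa / c
        c = c if abs(c) > 1e-300 else 1e-300
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > 1e-300 else 1e-300)
        c = 1.0 + aa / c
        c = c if abs(c) > 1e-300 else 1e-300
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betai(a, b, x):
    """Regularized incomplete beta I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
             + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def t_level(t, df):
    # two-sided level P{|T_df| > |t|}
    return betai(df / 2.0, 0.5, df / (df + t * t))

def f_level(f, df1, df2):
    # two-sided (equal-tails) level of the F-test
    p_up = betai(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))
    return 2.0 * min(p_up, 1.0 - p_up)

def sukhatme_T(t, df_t, f, df1, df2):
    # Fisher combination of the two levels; refer to chi-square(4)
    return -2.0 * (log(t_level(t, df_t)) + log(f_level(f, df1, df2)))
```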


for S_LR is not exact, however, because it was obtained from the asymptotic distribution of the likelihood ratio statistic.

Example 4. Shift and Scale Parameters of a Negative Exponential Distribution with Censored Data

Let x_(1), ..., x_(r) denote the first r order statistics of a sample of size n from a population with density function f(x; B, σ) = σ^{-1} e^{-(x-B)/σ}, x > B. Then, under the hypothesis H: B = m, σ = t, the statistics

X^(1) = 2n(x_(1) - m)/t     (4.3)

and

X^(2) = 2[Σ_{i=1}^{r} (x_(i) - x_(1)) + (n - r)(x_(r) - x_(1))]/t     (4.4)

are independent χ² variates on 2 and 2(r-1) degrees of freedom, respectively. Thus an exact 100(1-α)% confidence region for (B, σ) is given by (4.2), where L^(1)(m, t) and L^(2)(t) are the levels of X^(1) and X^(2) given in (4.3) and (4.4). Optimality of Fisher's combination of X^(1) and X^(2) is discussed by Perng [38].

Example 5. Mean Differences and Variance Ratios

Let x1, ..., xm and y1, ..., yn be independent random samples from two normal populations with respective means μ1 and μ2 and respective variances σ1² and σ2². A test of H: μ1 = μ2, σ1² = σ2² versus the alternative A: μ1 ≠ μ2 or σ1² ≠ σ2² is obtained by combining the usual t-test for testing H1: μ1 = μ2 and the usual F-test for testing H2: σ1² = σ2². This


procedure was proposed by Sukhatme [41] and was shown by Perng and Littell [38] to be asymptotically as efficient as the likelihood ratio test. Exact 100(1-α)% confidence regions for (μ2 - μ1, σ2²/σ1²) are obtained by inverting the combined test, yielding

S_F = {(d1, d2): -2 ln L^(1)(d1) - 2 ln L^(2)(d2) ≤ χ²_4(1-α)},

where L^(1)(d1) is the two-sided level of the t-test for H1: μ2 - μ1 = d1 and L^(2)(d2) is the two-sided level of the F-test for H2: σ2²/σ1² = d2.

Example 6. One-sided Bounds for Multivariate Means

Consider now a sample of size n from a multivariate normal population with mean vector μ and known covariance matrix Σ. In view of the computational difficulties involved in computing the likelihood ratio statistic for a one-sided test about μ, Brown [9] proposed Fisher's combination of k univariate one-sided tests as a one-sided test about μ. Brown's approach is summarized in Section 1.5. This procedure can be inverted to obtain approximate 100(1-α)% one-sided confidence regions for μ which have a distinct computational advantage over regions based on likelihood ratio statistics for the one-sided problem.

Example 7. Variance Components

Consider the random one-way classification model Y_ij = μ + a_i + e_ij (i = 1, ..., t; j = 1, ..., n), where μ is a fixed parameter, a_i is normal with zero mean and variance σ²_a, e_ij is normal with zero mean and variance σ², and all a_i and e_ij are mutually


independent. The analysis of variance sums of squares can be used to construct a joint 100(1-α)% confidence region for (σ²_a, σ²).

A joint confidence region for the success probabilities (p1, p2, ..., pk) of the binomial combination problem of Chapter III can be obtained in the same way, by inverting Fisher's combination of the k one-sided binomial tests:

S_F = {(d1, d2, ..., dk): -2 Σ_{i=1}^{k} ln L^(i)(d_i) ≤ χ²_{2k}(1-α)}.

In light of the results of Section 3.7, the significance levels should be determined via Lancaster's mean chi-square approximation. That is, define

-2 ln[L^(i)(d_i)] = 2 - 2{P_i ln P_i - P_{i+1} ln P_{i+1}} / (P_i - P_{i+1}),

where P_i is the attained level

P_i = Σ_{x = x_i}^{n_i} C(n_i, x) d_i^x (1 - d_i)^{n_i - x}

and P_{i+1} is the next smaller attainable level. The concept of inverting a test combination procedure to obtain confidence regions is not new (though perhaps unrecognized), because the usual rectangular confidence regions are in fact inversions of the test


combination statistic T^(m). Since Fisher's combination procedure, T^(F), generally performs better than T^(m) over a broad alternative space, there is a corresponding benefit in constructing S_F instead of S_m as an omnibus procedure. Regions obtained from inverting T^(F) are typically similar to regions obtained by inverting the likelihood ratio statistic. The rectangular form of S_m is more easily reported than is the form of S_F. However, S_F can be displayed graphically for a k-dimensional parameter with about the same amount of difficulty that is required to display the contours of a second-order response surface in k variables.

4.3 The Combination of 2 x 2 Tables

As noted in Section 3.2, the problem of combining independent binomial experiments is closely related to the problem of combining independent 2 x 2 tables. In the latter problem, each individual experiment consists of two "sub-experiments": one to assess the performance of a standard treatment, the other to assess the performance of an experimental treatment. The quantities of interest for the i-th experiment are summarized in the following listing.

              Experimental          Standard             Total
Successes     X^(i)                 Y^(i)                S^(i)
Failures      n_e^(i) - X^(i)       n_s^(i) - Y^(i)      n^(i) - S^(i)
Total         n_e^(i)               n_s^(i)              n^(i)


In typical cases, the experimental and standard treatments are evaluated with distinct groups of experimental units. It follows that X^(i) and Y^(i) are independent binomial random variables based on n_e^(i) and n_s^(i) trials and with unknown success probabilities p_e^(i) and p_s^(i), respectively. It is often desired to form a combined test of the null hypothesis H: p_e^(i) = p_s^(i), i = 1, 2, ..., k; that is, that the experimental and standard treatments are equivalent in all k experiments. Analogous to the binomial combination problem considered in Chapter III, there is no uniformly most powerful test versus the general alternative H_A: p_e^(i) ≥ p_s^(i) (strict inequality for at least one i). The parametric combination procedures discussed in Chapter III can be applied to the problem of combining 2 x 2 tables as shown in the following paragraphs.

Define the likelihood function of the i-th experiment to be

L(p_e^(i), p_s^(i)) = C(n_e^(i), x^(i)) (p_e^(i))^{x^(i)} (1 - p_e^(i))^{n_e^(i) - x^(i)} C(n_s^(i), y^(i)) (p_s^(i))^{y^(i)} (1 - p_s^(i))^{n_s^(i) - y^(i)}

= C(n_e^(i), x^(i)) C(n_s^(i), y^(i)) [p_e^(i)/(1 - p_e^(i))]^{x^(i)} [p_s^(i)/(1 - p_s^(i))]^{y^(i)} (1 - p_e^(i))^{n_e^(i)} (1 - p_s^(i))^{n_s^(i)}.

If a uniformly most powerful test of H_i: p_e^(i) = p_s^(i) versus A_i: p_e^(i) > p_s^(i) exists for the i-th individual experiment, it is of the form

L(p_e^(i) = p_s^(i)) / L(p_e^(i) > p_s^(i)) < c,


which is equivalent to

ψ_i^{-x^(i)} < c′, where ψ_i = [p_e^(i)/(1 - p_e^(i))] / [p_s^(i)/(1 - p_s^(i))].

Conditional on S^(i), this is equivalent to X^(i) > c. This is known as Fisher's exact test. Under H_i, X^(i) has a hypergeometric distribution (conditional on S^(i)). If Fisher's statistic is to be used to form a combined test, the individual significance levels L^(i), i = 1, 2, ..., k, should be determined via Fisher's exact test. As noted in Chapters I and II, if the individual significance levels are determined by optimal tests, then Fisher's procedure yields an optimum combination procedure in the Bahadur sense.

The likelihood function for all k experiments is

L((p_e^(1), p_s^(1)), (p_e^(2), p_s^(2)), ..., (p_e^(k), p_s^(k))) = Π_{i=1}^{k} L(p_e^(i), p_s^(i)).

If a uniformly most powerful test of the combined null hypothesis H: p_e^(i) = p_s^(i), all i, versus the alternative H_A: p_e^(i) ≥ p_s^(i) (strict inequality for at least one i) exists, it is of the form

L(p_e^(i) = p_s^(i), all i) / L(p_e^(i) > p_s^(i)) < c.


Conditional on (S^(1), S^(2), ..., S^(k)), this is equivalent to

Σ_{i=1}^{k} X^(i) > c

if the odds ratio

ψ_i = [p_e^(i)/(1 - p_e^(i))] / [p_s^(i)/(1 - p_s^(i))]

is constant in i. The test Σ X^(i) > c is equivalent to the Cochran-Mantel-Haenszel procedure given in (3.4). In other words, for the combination of 2 x 2 tables, T^(CMH) is uniformly most powerful for H: ψ1 = ψ2 = ... = ψk = 1 versus H_A: ψ1 = ψ2 = ... = ψk > 1.

The test statistics T^(LR), T^(ALR), T^(m), and T^(X) can also be constructed to yield combined tests for the 2 x 2 tables problem. A comparison of the procedures parallel to that given in Chapter III is indicated. In particular, the relative performance of the procedures for various points in the parameter space (ψ1, ψ2, ..., ψk) needs to be determined. As in Chapter III, the notion of Bahadur asymptotic relative efficiency could be used and exact power studies could be undertaken.

4.4 Testing for the Heterogeneity of Variances

Suppose x1, x2, ..., xm are mutually independent normal random variables with means μ1, μ2, ..., μm and variances σ1², σ2², ..., σm². It is often desired to test for the heterogeneity of the σi². The likelihood ratio test was proposed as a solution by Neyman and Pearson [32]. Bartlett [5] subsequently proposed a modification to the likelihood


ratio test. A further simplification leads to the maximum F test due to Hartley [15]. The relative merits of these procedures are discussed by Hartley. None of these three well-known methods yields statistics with common distributions, and they therefore require either special tables or approximation procedures.

An alternative test can be constructed by applying Fisher's combination method to a sequence of independent statistics described by Hogg [16]. Hogg points out that the ratios

R_i = V_i / Σ_{j=1}^{i-1} V_j,   i = 2, 3, ..., m,

where the V_i are the sample variances, are stochastically independent under the null hypothesis H0: σ1² = σ2² = ... = σm². Thus, when multiplied by the appropriate constants, the R_i yield a succession of m-1 mutually independent F-tests. The resultant significance levels can then be combined by the statistic T^(F) to yield a test of H0. The computations involved in this alternative approach are probably more cumbersome than even Bartlett's test. From theoretical considerations presented in this paper, however, it is likely that the performance of this alternative method will approximate that of the likelihood ratio procedure. An added benefit is that the null distribution of T^(F) is known exactly.
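Hogg's construction can be sketched as follows. The F survival function is assumed to be supplied by the caller (for example, scipy.stats.f.sf); the successive ratios R_i are expressed in the equivalent pooled-variance form, in which the i-th comparison pits V_i against the variance pooled over the first i samples, and the resulting two-sided levels are combined by T^(F). Names are illustrative.

```python
from math import log

def hogg_fisher_T(variances, dfs, f_sf):
    """Fisher combination of Hogg's successive variance-ratio tests.

    variances : sample variances V_1, ..., V_m (mean squares)
    dfs       : their degrees of freedom (n_i - 1)
    f_sf      : F survival function, f_sf(x, dfn, dfd) -> P{F > x}
    Refer the result to chi-square on 2(m-1) degrees of freedom.
    """
    T = 0.0
    for i in range(1, len(variances)):
        den_df = sum(dfs[:i])
        pooled = sum(d * v for d, v in zip(dfs[:i], variances[:i])) / den_df
        p_up = f_sf(variances[i] / pooled, dfs[i], den_df)
        level = 2.0 * min(p_up, 1.0 - p_up)     # two-sided level
        T += -2.0 * log(level)
    return T
```

Under H0 each pooled-variance ratio is an F variate, and the m-1 levels are independent by Hogg's result, so T has an exact chi-square null distribution on 2(m-1) degrees of freedom (up to the discreteness-free continuity of the F levels).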


4.5 Testing for the Difference of Means with Incomplete Data

Suppose (X_1, X_2)' is a random vector normally distributed with mean vector (μ_1, μ_2)' and covariance matrix Σ. Assume that N pairs of observations are taken on (X_1, X_2)', of which n_1 observations corresponding to X_1 are randomly missing and n_2 observations corresponding to X_2 are randomly missing. Testing H_0: μ_1 = μ_2 versus H_1: μ_1 > μ_2 can be described as the usual paired t-test with missing data. Special cases of this problem are considered in detail by Lin [23,24,25]. The interested reader is referred to Lin for additional references. To date, the work of Lin is based on the assumption that data are missing on only one variate. (He notes in [25] that results for the more general case of data missing on both variates are forthcoming.) For the case when Σ is not known and no data are discarded, only approximate tests have been derived. Lin shows, however, that in most cases the approximate tests are more powerful than the exact tests formed by discarding data.

An alternative approach to the problem described in the first paragraph of this section is to apply Fisher's combination procedure. A description of this approach follows. Consider two subsets of the original data set: (1) the N - n_1 - n_2 complete pairs, and (2) the n_2 unmatched observations on X_1 and the n_1 unmatched observations on X_2. The data in subset 1 can be analyzed via the usual paired t-test. The data in subset 2 can be analyzed via the usual two-sample t-test. The statistics that result are independent since they are based on distinct sets of data. Combination of the concomitant significance levels yields an exact method of performing the hypothesis test of interest. The optimality results for T^(F) established thus far in this paper tend to suggest that this approach will yield a viable solution. A possible added advantage of the approach described here is its simplicity. The same idea can be generalized to more complex designs in which missing data have occurred.

4.6 Asymptotic Efficiencies for k → ∞

The Bahadur efficiencies given in this dissertation are derived assuming that a fixed number of tests, k, is performed; the sample sizes are allowed to increase without bound. Experimental situations often occur in which an individual experiment is replicated several times with the sample size remaining fairly constant. For such cases, the derivation of Bahadur efficiencies may be more appropriate if each experiment is assumed to be based on a fixed sample size, n, while k is allowed to increase without bound. Monti and Sen [30] consider the case k → ∞ in their approach to the combination problem. They derive locally optimum tests in a general multivariate framework. For local alternatives, Monti and Sen show that their procedures are more efficient in the Bahadur sense.

The Bahadur-Savage Theorem (Theorem 1) can be used to derive the Bahadur slopes of the various combination procedures for the case k → ∞. In order to derive the function f(t) for Part 2 of the Bahadur-Savage Theorem, Chernoff's Theorem (Theorem 3) may be useful. If the individual tests are based on a common sample size, the statistics formed by the combination procedures will be sums of independent, identically distributed random variables. If the moment generating function of an individual test statistic is known, Chernoff's Theorem can then be directly applied.
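The two-subset combination described in Section 4.5 can be sketched as follows. This is an illustrative reconstruction only: the simulated data, the sample sizes, and the use of scipy's t-tests (the `alternative` keyword requires scipy ≥ 1.6) are my own assumptions, and the equal-variance two-sample t-test is used for the unmatched subset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: N pairs from a bivariate normal with a common mean
# (so H0: mu1 = mu2 holds), after which some values go missing at random.
N, n1, n2 = 30, 5, 4
x = rng.normal(size=N)
y = 0.6 * x + 0.8 * rng.normal(size=N)   # correlated with x, same mean

n_complete = N - n1 - n2
x_c, y_c = x[:n_complete], y[:n_complete]   # subset 1: complete pairs
x_only = x[n_complete:n_complete + n1]      # X2 missing for these pairs
y_only = y[n_complete + n1:]                # X1 missing for these pairs

# Subset 1: the usual paired t-test (one-sided, H1: mu1 > mu2).
p1 = stats.ttest_rel(x_c, y_c, alternative='greater').pvalue
# Subset 2: the usual two-sample t-test on the unmatched observations.
p2 = stats.ttest_ind(x_only, y_only, alternative='greater').pvalue

# The two statistics use disjoint data, so Fisher's statistic
# T(F) = -2(log p1 + log p2) has an exact chi-square null distribution
# with 4 degrees of freedom -- an exact combined test.
T = -2.0 * (np.log(p1) + np.log(p2))
p_combined = float(stats.chi2.sf(T, df=4))
print(0.0 < p_combined < 1.0)   # prints True
```

The exactness comes from the independence of the two subsets, mirroring the argument in Section 4.5; no approximation to the null distribution is needed.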


BIBLIOGRAPHY

[1] Bahadur, R. R. (1971). Some Limit Theorems in Statistics. Society for Industrial and Applied Mathematics, Philadelphia.

[2] Bahadur, R. R. (1967). An optimal property of the likelihood ratio statistic. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, I, 13-26.

[3] Bahadur, R. R. (1967). Rates of convergence of estimates and test statistics. Annals of Mathematical Statistics, 38, 303-324.

[4] Barr, A. J., Goodnight, J. H., Sall, J. P., and Helwig, J. T. (1976). A User's Guide to SAS-76. SAS Institute, Raleigh, N.C.

[5] Bartlett, M. S. (1937). Proceedings of the Royal Statistical Society Series A, 160, 268.

[6] Bhattacharya, N. (1961). Sampling experiments on the combination of independent tests. Sankhya Series A, 23, 191-196.

[7] Birnbaum, A. (1954). Combining independent tests of significance. Journal of the American Statistical Association, 49, 559-575.

[8] Breiman, L. (1968). Probability. Addison-Wesley, Reading, Massachusetts.

[9] Brown, M. B. (1975). A method for combining non-independent one-sided tests of significance. Biometrics, 31, 987-992.

[10] Cochran, W. G. (1954). Some methods for strengthening the common χ² tests. Biometrics, 10, 417.

[11] David, F. N. and Johnson, N. L. (1950). The probability integral transformation when the variable is discontinuous. Biometrika, 37, 42-49.

[12] Edgington, E. S. (1972). An additive method for combining probability values from independent experiments. The Journal of Psychology, 80, 351-363.

[13] Fisher, R. A. (1932). Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh, London.

[14] Good, I. J. (1955). On the weighted combination of significance tests. Journal of the Royal Statistical Society Series B, 17, 264-265.

[15] Hartley, H. O. (1950). The maximum F-ratio as a short cut test for heterogeneity of variance. Biometrika, 37, 308-312.

[16] Helwig, J. T., editor. (1977). SAS Supplemental Library User's Guide. SAS Institute, Raleigh, N.C.

[17] Hogg, R. V. (1961). On the resolution of statistical hypotheses. Journal of the American Statistical Association, 56, 978-989.

[18] Killeen, T. J., Hettmansperger, T. P., and Sievers, G. L. (1972). An elementary theorem on the probability of large deviations. Annals of Mathematical Statistics, 43, 181-192.

[19] Kincaid, W. M. (1962). The combination of tests based on discrete distributions. Journal of the American Statistical Association, 57, 10-19.

[20] Koziol, J. A. and Perlman, M. D. (1978). Combining independent chi-squared tests. Journal of the American Statistical Association, 73, 753-763.

[21] Lancaster, H. O. (1949). The combination of probabilities arising from data in discrete distributions. Biometrika, 36, 370-382.

[22] Lehmann, E. L. (1959). Testing Statistical Hypotheses. Wiley and Sons, London.

[23] Lin, Pi-Erh. (1971). Estimation procedures for difference of means with missing data. Journal of the American Statistical Association, 66, 634-636.

[24] Lin, Pi-Erh. (1973). Procedures for testing the difference of means with incomplete data. Journal of the American Statistical Association, 68, 699-703.

[25] Lin, Pi-Erh and Stivers, L. E. (1975). Testing for equality of means with incomplete data on one variable: A Monte Carlo study. Journal of the American Statistical Association, 70, 190-193.

[26] Liptak, T. (1958). On the combination of independent tests. Magyar Tud. Akad. Mat. Kutato Int. Kozl., 3, 171-197.

[27] Littell, R. C. and Folks, J. L. (1971). Asymptotic optimality of Fisher's method of combining independent tests. Journal of the American Statistical Association, 66, 802-806.


[28] Littell, R. C. and Folks, J. L. (1973). Asymptotic optimality of Fisher's method of combining independent tests II. Journal of the American Statistical Association, 68, 193-194.

[29] Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.

[30] Monti, K. L. and Sen, P. K. (1976). The locally optimal combination of independent test statistics. Journal of the American Statistical Association, 71, 903-911.

[31] Nuesch, P. E. (1966). On the problem of testing location in multivariate populations for restricted alternatives. Annals of Mathematical Statistics, 37, 113-119.

[32] Neyman, J. and Pearson, E. S. (1931). Bull. Int. Acad. Cracovie, A, 460.

[33] Oosterhoff, J. (1969). Combination of One-Sided Statistical Tests. Tract 28, Amsterdam Mathematical Centre.

[34] Pape, E. (1972). A combination of F statistics. Technometrics, 14, 89-99.

[35] Pasternak, B. S. and Mantel, N. (1966). A deficiency in the summation of chi procedure. Biometrics, 22, 407-409.

[36] Pearson, E. S. (1938). The probability integral transformation for testing goodness of fit and combining independent tests of significance. Biometrika, 30, 134-148.

[37] Pearson, E. S. (1950). On questions raised by the combination of tests based on discontinuous distributions. Biometrika, 37, 383-398.

[38] Perng, S. K. (1977). An asymptotically efficient test for the location parameter and the scale parameter of an exponential distribution. Communications in Statistics: Theory and Methods, A6(14), 1399-1407.

[39] Perng, S. K. and Littell, R. C. (1976). A test of equality of two normal population means and variances. Journal of the American Statistical Association, 71, 968-971.

[40] Savage, I. R. (1969). Non-parametric statistics: A personal review. Sankhya, 31, 107-144.

[41] Sukhatme, P. V. (1935). A contribution to the problem of two samples. Proceedings India Academy Science, A(2), 384-604.


[42] Tippett, L. H. C. (1931). The Methods of Statistics. Williams and Norgate, London.

[43] Wallis, W. A. (1942). Compounding probabilities from independent significance tests. Econometrica, 10, 229-248.

[44] Wilkinson, R. (1951). A statistical consideration in psychological research. Psychological Bulletin, 48, 156-157.

[45] Zelen, M. (1957). The analysis of incomplete block designs. Journal of the American Statistical Association, 52, 204-217.


BIOGRAPHICAL SKETCH

William C. Louv was born on February 11, 1952, in Schenectady, New York. William's family moved to suburban Philadelphia in 1957, where they still reside. Upon graduating from Upper Merion Area High School in 1970, William matriculated at The College of William and Mary in Williamsburg, Virginia. There he received a Bachelor of Science degree in biology in 1974. William came to the University of Florida in the fall of 1974 and received a Master of Statistics degree in 1976. Since that time he has been pursuing a doctoral degree in statistics. While studying at the University of Florida, William has been employed by the Department of Statistics as a research and teaching assistant. William has been married to the former Jill Stumpe since June, 1974.


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Ramon C. Littell, Chairman
Associate Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

John G. Saw
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Dennis D. Wackerly
Associate Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Jerry F. Butler
Associate Professor of Entomology


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

James T. McClave
Associate Professor of Statistics

This dissertation was submitted to the Graduate Faculty of the Department of Statistics in the College of Liberal Arts and Sciences and to the Graduate Council, and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August 1979

Dean, Graduate School