Citation

## Material Information

Title:
Exact unconditional tests for 2x2 contingency tables
Creator:
Suissa, Samy Salomon, 1954- ( Dissertant )
Place of Publication:
Gainesville, Fla.
Publisher:
University of Florida
Publication Date:
1982
Language:
English
Physical Description:
vii, 101 leaves : ill. ; 28 cm.

## Subjects

Subjects / Keywords:
Approximation ( jstor )
Binomials ( jstor )
Critical values ( jstor )
Mathematical independent variables ( jstor )
Power functions ( jstor )
Proportions ( jstor )
Sample size ( jstor )
Significance level ( jstor )
Statistical discrepancies ( jstor )
Statistics ( jstor )
Contingency tables ( lcsh )
Dissertations, Academic -- Statistics -- UF
Statistical hypothesis testing ( lcsh )
Statistics thesis Ph. D
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

## Notes

Abstract:
The so-called "exact" conditional tests are very popular for testing hypotheses in the presence of nuisance parameters. However, in the context of discrete distributions, they must be supplemented with randomization to become exactly of size α, the nominal significance level. This practice is undesirable since irrelevant events should not affect one's decision. Consequently, the conditional test without randomization, while still called "exact," becomes conservative. As an unconditional alternative, a methodology is developed to compute the exact size of any test when the null power function is of a given form. This approach is a way of catering to the worst possible configuration of the nuisance parameter by maximizing the null power function over the domain of the nuisance parameter. As special cases, the 2x2 contingency table to compare two independent proportions and the 2x2 contingency table to compare two correlated proportions are considered. For the equal sample size case, exact critical values of the Z-test for comparing two independent proportions are computed and tabulated for n=10(1)150, α=.025 and α=.05. Sample size requirements based on the exact unconditional one-sided Z-test, with α=.05 and 80% power, were never larger than the corresponding sample size requirements based on the "exact" conditional test, namely Fisher's exact test. In fact, the proposed Z-test is uniformly more powerful than Fisher's exact test for n=10(1)150, α=.025 and α=.05. For comparing two correlated proportions, exact critical values of the Z-test, which is the appropriate square root of McNemar's chi-square test, are computed and tabulated for N=10(1)200, α=.025 and α=.05, where N is the total number of matched pairs. Here again, sample size determinations based on the exact unconditional one-sided Z-test, with α=.05 and 80% power, were never larger than the corresponding sample size requirements based on the "exact" conditional test. In fact, the proposed Z-test is uniformly more powerful than the "exact" conditional test for the cases considered, namely N=10(1)200, α=.025 and α=.05.
Thesis:
Thesis (Ph. D.)--University of Florida, 1982.
Bibliography:
Includes bibliographic references (leaves 98-100).
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Samy Salomon Suissa.

## Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
000339612 ( alephbibnum )
09600408 ( oclc )
ABW9299 ( notis )

Full Text

EXACT UNCONDITIONAL TESTS FOR
2x2 CONTINGENCY TABLES

BY

SAMY SALOMON SUISSA

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1982

To Nicole, Daniel

and

my parents

ACKNOWLEDGEMENTS

I am deeply grateful to Professor Jonathan Shuster for

being an instrumental part of my education. His understanding

and confidence were important in my entry to this graduate

program. Throughout my studies, his patience, help and

guidance were invaluable to my work. His contribution to

this dissertation was crucial.

I am thankful to my family for their constant support:

My wife Nicole, for her patience, love and understanding,

my son Daniel, for the joys and the change of pace, and my

parents, brother and sisters who were always there with love

and encouragement.

My appreciation extends to Professors Ronald Randles,

Mark Hale, Ramon Littell and Ken Portier for their full

cooperation on such short notice.

ACKNOWLEDGEMENTS
ABSTRACT

CHAPTER

1 INTRODUCTION
1.1 The Problem
1.2 Some Methods of Eliminating Nuisance Parameters
1.3 Numerical Example
1.4 Proposed Approach and Preview

2 METHODOLOGY FOR COMPUTING THE SIZE OF A TEST
2.1 Introduction
2.2 Local Bound for π'(p)
2.3 Least Upper Bound for π(p)
2.4 Stability of the Null Power Function
2.5 Choice of the Test Statistic

3 THE 2x2 TABLE FOR INDEPENDENT PROPORTIONS
3.1 Introduction
3.2 Asymptotic Tests and Sample Size Formulae
3.3 Fisher's Exact Test
3.4 An Exact Unconditional Test
3.5 Relation to the Chi-square Goodness-of-fit Test
3.6 Power and Sample Sizes

4 THE 2x2 TABLE FOR CORRELATED PROPORTIONS
4.1 Introduction
4.2 McNemar's Test, Other Asymptotic Tests and Sample Size Formulae
4.3 The Exact Conditional Test
4.4 An Exact Unconditional Test
4.5 Power and Sample Sizes

APPENDICES
A TABLES
B PLOTS OF THE NULL POWER FUNCTION
C COMPUTER PROGRAMS

REFERENCES
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

EXACT UNCONDITIONAL TESTS FOR
2x2 CONTINGENCY TABLES

by

Samy Salomon Suissa

August 1982

Chairman: Jonathan J. Shuster
Major Department: Statistics

The so-called "exact" conditional tests are very popular
for testing hypotheses in the presence of nuisance parameters.
However, in the context of discrete distributions, they must
be supplemented with randomization to become exactly of size
α, the nominal significance level. This practice is undesirable
since irrelevant events should not affect one's decision.
Consequently, the conditional test without randomization,
while still called "exact," becomes conservative.

As an unconditional alternative, a methodology is developed
to compute the exact size of any test when the null
power function is of a given form. This approach is a way of
catering to the worst possible configuration of the nuisance
parameter by maximizing the null power function over the
domain of the nuisance parameter. As special cases, the
2x2 contingency table to compare two independent proportions
and the 2x2 contingency table to compare two correlated
proportions are considered.

For the equal sample size case, exact critical values
of the Z-test for comparing two independent proportions are
computed and tabulated for n=10(1)150, α=.025 and α=.05.
Sample size requirements based on the exact unconditional
one-sided Z-test, with α=.05 and 80% power, were never larger
than the corresponding sample size requirements based on the
"exact" conditional test, namely Fisher's exact test. In
fact, the proposed Z-test is uniformly more powerful than
Fisher's exact test for n=10(1)150, α=.025 and α=.05.

For comparing two correlated proportions, exact critical
values of the Z-test, which is the appropriate square root of
McNemar's chi-square test, are computed and tabulated for
N=10(1)200, α=.025 and α=.05, where N is the total number of
matched pairs. Here again, sample size determinations based
on the exact unconditional one-sided Z-test, with α=.05 and
80% power, were never larger than the corresponding sample
size requirements based on the "exact" conditional test.
In fact, the proposed Z-test is uniformly more powerful than
the "exact" conditional test for the cases considered, namely
N=10(1)200, α=.025 and α=.05.

CHAPTER 1

INTRODUCTION

1.1 The Problem

The problem of testing a simple hypothesis about a
parameter θ, the index of a known probability distribution
P_θ, against a simple alternative hypothesis, for a specified
significance level α, is straightforward. Its optimal
solution is given by the fundamental lemma of Neyman and
Pearson (Lehmann, 1959:65). However, many statistical models
involve more than the parameter being tested. The distribution
could be indexed by the parameters (θ,ν), where ν is an
additional unknown parameter. If inference is required only
about θ, the parameter ν is called a nuisance parameter. The
problem then becomes one of eliminating this nuisance
parameter from the model.

Over the years, many solutions to this problem have been

put forward. Basu (1977), in reviewing two such elimination

methods, classified all solutions that have been proposed

into ten categories. In this dissertation, three of the most

common of these methods will be discussed with applications

to 2x2 contingency tables. The first method, often used in

asymptotic theory, is that of replacing the nuisance

parameter by an estimate. The second is the conditional

method, which can evolve from two different approaches. The

third method considered is the maximization method which

essentially caters to the worst possible configuration of the

null power function of a test by maximizing this function

over the domain of the nuisance parameter. This is the

definition of the size of a test as given in Lehmann (1959)

or Ferguson (1967). These three methods of eliminating

nuisance parameters will be presented in section 1.2, with

an emphasis on their advantages and drawbacks. Greater

emphasis will be put on the third method, namely the

maximization method, since it is the proposed solution to the

problems considered in this dissertation. In section 1.3,

some intuitive reasons, accentuated by an example from a 2x2

contingency table, will be given as the motivating force for

considering the maximization method as a potential competitor

to the other two. In section 1.4, the proposed approach

will be discussed in the context of the two problems to which

it is applied, namely comparing two independent proportions

and comparing two correlated proportions. Furthermore, a

preview of the results will be given.

1.2 Some Methods of Eliminating Nuisance Parameters

The estimation method. A classical method of eliminating
the nuisance parameter ν is to replace it by an estimate
ν̂. This practice is popular in asymptotic theory where all
the parameters, except the parameter being tested, are often
replaced by consistent estimators. This method usually leads
to simple test statistics that are referred to well-tabulated
distributions. For example, the chi-square test for comparing
two independent proportions and McNemar's test for
comparing two correlated proportions both have an asymptotic
chi-square null distribution with one degree of freedom.
However, because all results are asymptotic or approximate,
these methods should not be used for small samples. Further,
it is usually unclear what size constitutes a "large enough"
sample. This method should be avoided in any study where
asymptotic theory is questionable.

The conditional method. The conditional method of
eliminating nuisance parameters may evolve from two different
approaches. The first of these approaches, due to R.A.
Fisher, is described in Kendall and Stuart (1967, vol. 2) as
follows. Let (T,U) be sufficient statistics for (θ,ν). If
the marginal distribution of U is independent of θ, then U
is called an ancillary statistic. The conditionality
principle of Fisher is stated as follows: If U is an
ancillary statistic (distributed free of θ), then the
conditional distribution of T|U is all that is needed to test
a hypothesis about θ. This principle is clearly acceptable
when U is a sufficient statistic for ν (when θ is known)
because the likelihood will then factor into two distributions,
namely that of T|U with parameter θ and that of U with
parameter ν. The problem arises, however, when U is not
sufficient for ν. It is then doubtful whether this principle
leads to "optimal" tests. In fact, Welch (1939) gave an
example in which the conditional test based on T|U may be
uniformly less powerful than an alternative unconditional
test.

The conditional method can be obtained through another
approach. By restricting the test procedure to a smaller
class of tests, namely α-similar or unbiased tests, the
nuisance parameter may be eliminated. In particular, if the
distribution of (T,U), the sufficient statistics for (θ,ν),
is in the exponential family, the distribution of T|U will
give UMPU (uniformly most powerful unbiased) tests for
hypotheses about θ (Lehmann, 1959:134). Because of the
richness of the exponential family, UMPU tests can be readily
obtained for a large number of problems through a conditional
distribution which is free of the nuisance parameter.
However, if T|U is discrete, a randomization supplementing
the experiment will be required to make the test exactly of
level α. This randomization is undesirable since irrelevant
events should not affect one's decision. Hence, the test
without randomization becomes conservative and consequently
loses power in the process.

The maximization method. The third method of eliminating
nuisance parameters is the maximization method. It is
based on the direct utilization of the definition of the size
of a test, as defined in Lehmann (1959:61), which requires
that the null power function of any test be maximized over
the domain of the nuisance parameter. In other words, the
maximization method is based on the premise that the worst
possible configuration of the nuisance parameter could take
place and that the testing procedure should always be
protected against this eventuality. This method is easily
explained to laymen because they are already familiar with
the test of hypothesis for a given value of the nuisance
parameter ν. Therefore, by letting ν vary, they will have
an intuitive feeling for the maximization method. The major
drawback to this method is the complexity of maximizing the
null power function. With the advances in computer
technology, this method should be investigated more fully.

1.3 Numerical Example

The problem of comparing two independent proportions,
called the 2x2 comparative trial, is an example of a test of
hypothesis in the presence of a nuisance parameter. If θ1
and θ2 are the success rates of the first and second populations
respectively, then the equality of these success rates
θ1=θ2 (=θ, unspecified) involves a nuisance parameter, namely
θ, the unknown common proportion of success in the two
populations. A numerical example is given to present the
discrepancies that occur between the first two methods of
eliminating that nuisance parameter. Let the outcome of
independent random samples of size n=10 each from two binary
populations Π1 and Π2 be

Table 1.1

          S    F    totals
Π1        4    6    10
Π2        8    2    10
totals   12    8    20

where S=success and F=failure. The question is: Is the
success rate of Π1 less than the success rate of Π2 at
significance level α=.05?

In the spirit of the estimation method, θ may be
replaced by θ̂ = 12/20 in the Z-test statistic with pooled
variance estimator (section 3.2) to give

$$Z = \frac{\sqrt{20}\,(8-4)}{\sqrt{12 \times 8}} = 1.83$$

which, upon referring to the standard normal distribution,
gives an attained significance level (p-value) of .0336,
which is less than α=.05.

If the conditional argument is used, Fisher's exact
test (section 3.3) is the solution to both approaches. By
restricting the sample space to only those outcomes with the
same marginal totals as Table 1.1, namely

Table 1.2

          S    F    totals
Π1                  10
Π2                  10
totals   12    8    20

the p-value is given by the sum of the hypergeometric
probabilities of all tables with the above marginal totals
which are at least as extreme as the observed one (i.e. ≤ 4
successes in Π1) in the direction of the alternative hypothesis.
Accordingly, Fisher's exact test for Table 1.1 has
p-value = .0849 > .05.
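
The two attained significance levels for Table 1.1 can be checked with a few lines of Python. This is only a sketch: the function names `pooled_z` and `fisher_one_sided` are illustrative and not from the dissertation, and the general form of the pooled Z statistic is assumed from the numerical expression above (section 3.2 is not reproduced here).

```python
from math import comb, sqrt
from statistics import NormalDist

def pooled_z(s1, s2, n):
    """Pooled-variance Z statistic for two samples of common size n (assumed form)."""
    m = s1 + s2                          # total number of successes
    return sqrt(2 * n) * (s2 - s1) / sqrt(m * (2 * n - m))

def fisher_one_sided(s1, s2, n):
    """P(S1 <= s1 | S1 + S2 = m) under the hypergeometric null distribution."""
    m = s1 + s2
    lo = max(0, m - n)                   # smallest feasible number of successes in sample 1
    tail = sum(comb(n, x) * comb(n, m - x) for x in range(lo, s1 + 1))
    return tail / comb(2 * n, m)

z = pooled_z(4, 8, 10)
p_normal = 1 - NormalDist().cdf(z)       # one-sided normal approximation
p_fisher = fisher_one_sided(4, 8, 10)
print(f"Z = {z:.2f}, normal p-value = {p_normal:.4f}, Fisher p-value = {p_fisher:.4f}")
```

The Fisher tail is 10695/125970, which rounds to .0849. The normal p-value from the unrounded Z is slightly above the .0336 quoted in the text, which corresponds to the rounded value Z=1.83.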

The statistician is now faced with a dilemma. He would
conclude that θ1<θ2 by the first method, but could not by the
second method at α=.05. This discrepancy raises several
questions: How accurate is the normal approximation to the
Z-test? How conservative is Fisher's exact test without
randomization? If the conditional method is preferred,
how does one explain to the layman that the sample space is
restricted only to the tables with 12 total successes?

The answers to these questions lie in the null power
function, a function of the nuisance parameter θ. For each
of the two tests, the exact null power function is plotted
on the basis of the attained significance levels as the new
nominal significance levels. For the Z-test, the plot of the
exact null power function, based on the Z-value of 1.83
(nominal significance .0336), is given in Figure 1.1. It is
seen that the exact significance would be greater than the
nominal significance level of .0336 when .13<θ<.87. In that
range, the Z-test is liberal in the sense that it would
reject the null hypothesis when it should not at level .0336.
However, notice that the exact null power function never
exceeds the original significance level of α=.05, the maximum
being .047. For the conditional approach, the plot of the
exact null power function based on the attained significance
level of Fisher's exact test (.0849) is given in Figure 1.2.
The conservativeness of this conditional test is obvious.
In fact, its null power function is never larger than the
original nominal significance level of α=.05, the maximum
being .045.

Further insight into the reasons for such largely

different results can be obtained by inspecting the critical

regions of each test. Once more, the Z-test will be based

on the nominal significance level .0336 and Fisher's exact

test on the level .0849. By representing all the possible

results of the experiment in the form of points of a lattice

diagram, given in Figure 1.3, the critical regions are simply

given by marked subsets of these points.

[Lattice diagram of the sample points (S1, S2), with S1 = 0, ..., 10 on the horizontal axis and S2 = 0, ..., 10 on the vertical axis]

Figure 1.3 Critical regions of Z-test and Fisher's
exact test.

In Figure 1.3, S1 = number of successes from Π1 and S2 =

number of successes from Π2. The sample points in the

lattice diagram marked by an "O" belong to the critical

region defined by the Z-test (nominal significance level

.0336) and those marked by an "x" belong to the critical

region defined by Fisher's exact test (nominal significance

level .0849). Although the nominal significance level of the

Z-test (.0336) is much smaller than that of Fisher's exact

test (.0849), notice that the critical region of the Z-test

contains the critical region of Fisher's exact test. This
is a flagrant example of the conservativeness of Fisher's

exact test and of the liberalness of the Z-test.
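
The containment of one critical region in the other can be reproduced by enumerating the 11 x 11 lattice. The sketch below assumes that each region consists of the sample points at least as extreme as Table 1.1 under the respective statistic; the helper names, the convention Z=0 for tables with no successes or no failures, and the small floating-point tolerance are choices made for this illustration.

```python
from math import comb, sqrt

N = 10  # common sample size in the example

def pooled_z(s1, s2, n=N):
    """Pooled-variance Z statistic; degenerate tables are assigned Z = 0 (assumption)."""
    m = s1 + s2
    if m in (0, 2 * n):
        return 0.0
    return sqrt(2 * n) * (s2 - s1) / sqrt(m * (2 * n - m))

def fisher_p(s1, s2, n=N):
    """One-sided Fisher p-value P(S1 <= s1 | S1 + S2 = m)."""
    m = s1 + s2
    lo = max(0, m - n)
    return sum(comb(n, x) * comb(n, m - x) for x in range(lo, s1 + 1)) / comb(2 * n, m)

points = [(s1, s2) for s1 in range(N + 1) for s2 in range(N + 1)]
z_obs, p_obs = pooled_z(4, 8), fisher_p(4, 8)

# Points at least as extreme as the observed table under each test
z_region = {pt for pt in points if pooled_z(*pt) >= z_obs - 1e-12}
fisher_region = {pt for pt in points if fisher_p(*pt) <= p_obs + 1e-12}

print("Fisher region contained in Z region:", fisher_region <= z_region)
print("Points only in the Z region:", sorted(z_region - fisher_region))
```

Under these assumptions the Z region contains every point of the Fisher region plus the two points (0,3) and (7,10).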

These discrepancies suggest that the maximization method
might be more appropriate, if not more exact, than the first
two methods of eliminating the nuisance parameter. Therefore,

by either adjusting the Z-test or the unconditional Fisher's

exact test, the maximization method could lead to significance
levels which are closer to the nominal levels.

1.4 Proposed Approach and Preview
In this dissertation, the maximization method will be

used in conjunction with tests derived from the estimation
method to develop an exact unconditional testing procedure

as an alternative to the popular, but conservative, condi-

tional method. This procedure will be applied to two
problems of the 2x2 contingency table, namely that of com-

paring two independent proportions and of comparing two

correlated proportions. For this purpose, a methodology
for computing the supremum of a null power function of a

certain general form is developed in Chapter 2. The form
of this null power function includes, as particular cases,
the null power functions of the two problems considered in
this dissertation. In Chapter 3, the supremum of the null
power function of the Z-test (the size of the Z-test) for

comparing two independent proportions, will be computed for

the equal sample size case. Consequently, exact critical

values and minimum required sample sizes will be obtained

and tabulated for the design and analysis of such a trial.

Comparisons with the conditional method (Fisher's exact

test) will show that the exact unconditional method leads to

smaller (or equal) sample sizes for α=.05, 80% power and

common sample size n=10(1)150.

In Chapter 4, the size of the Z-test for comparing two

correlated proportions will be computed via the methodology

developed in Chapter 2. As a result, exact critical values

and sample size determinations will be obtained and

tabulated. This exact unconditional test will produce

smaller (or equal) sample sizes than the conditional test

(the sign test) for α=.05, 80% power and the number of

paired observations N=10(1)200.

Furthermore, a comparison of the critical regions will

show that the exact unconditional Z-tests are uniformly more

powerful than their conditional counterparts in the range

considered, namely α=.025, α=.05 and n=10(1)150 for the

independent case, and N=10(1)200 for the correlated case.

CHAPTER 2

METHODOLOGY FOR COMPUTING
THE SIZE OF A TEST

2.1 Introduction

In this chapter, a methodology is developed to compute

the unconditional size of any test of hypothesis T when:

a. There exists a nuisance parameter p such that 0<p<1,
and

b. The null power function of T is a linear combination

of a finite number of binomial terms in p.

Thus, we assume that the null power function of T can

be written as

$$\pi(p) = \sum_{i \in C} a_i\, p^{b_i} (1-p)^{c_i} \tag{2.1.1}$$

where a_i, b_i and c_i are ≥ 0, i is an indexing subscript over

the whole sample space S defined by the sampling scheme, C is

the set of subscripts for which the related sample points

belong to the critical region defined by the testing procedure

T and p is a nuisance parameter on the unit interval.

The size of the test T is given by sup_p π(p) and, because
only discrete null distributions will be considered, is
restricted to a finite collection of possible values, namely
the natural levels of test T, as they are called in Randles and
Wolfe (1979). The remainder of this chapter deals with the
technique of computing sup_p π(p). The method used is based on
the mean value theorem of differential calculus applied to
successive subintervals of the unit interval.

The use of this methodology will be illustrated in two

important cases. The case of comparing two independent

proportions is given in Chapter 3 and that of comparing two

correlated proportions in Chapter 4.

2.2 Local Bound for π'(p)

The applicability of the mean value theorem of differential
calculus, as stated in Courant and John (1965),
requires primarily a bound on the derivative of π(p) for
each subinterval. The task of finding such a bound is
facilitated by the form of π(p), namely that it is a linear
combination of binomial terms. In this section, a method of
computing this bound for any subinterval is given. First,
note that the derivative of π(p),

$$\pi'(p) = \sum_{i \in C} a_i \left\{ b_i\, p^{b_i-1}(1-p)^{c_i} - c_i\, p^{b_i}(1-p)^{c_i-1} \right\} \tag{2.2.1}$$

is also a linear combination of binomial terms of the form

$$h(p) = p^{r} (1-p)^{s-r}$$

so that for any given subinterval I=(a,b) with 0≤a<b≤1,

$$\sup_{p \in I} h(p) = \begin{cases} h(b) & \text{if } \hat{\theta} > b \\ h(a) & \text{if } \hat{\theta} < a \\ h(\hat{\theta}) & \text{if } \hat{\theta} \in I \end{cases} \tag{2.2.2}$$

and

$$\inf_{p \in I} h(p) = \min\left( h(a), h(b) \right) \tag{2.2.3}$$

where θ̂ = r/s.

An upper bound for π'(p) can be obtained on (a,b) by
substituting the right hand side of (2.2.2) for each
positive term of (2.2.1) and the right hand side of (2.2.3)
for each negative term of (2.2.1). Similarly, a lower bound
for π'(p) can be obtained on (a,b) by reversing these
substitutions. Finally, a bound M for |π'(p)| on (a,b) is taken as
the larger of the two bounds, in absolute value.

2.3 A Least Upper Bound for π(p)

Since local bounds for |π'(p)| can now be computed for
any subinterval (a,b), the mean value theorem can be applied
to the successive subintervals I_1=(0,.01), I_2=(.01,.02),...,
I_100=(.99,1) of the unit interval to obtain an upper bound
for π(p) in each I_j. An upper bound for π(p), 0<p<1, is
simply the maximum of these local upper bounds.

First, for each I_j, j=1(1)100, it is now possible to
find an M_j such that

$$|\pi'(\theta_j)| < M_j \quad \text{for all } \theta_j \in I_j,$$

where the M_j's are obtained by the method of section 2.2.
By the mean value theorem of differential calculus, it can
be concluded that

$$\pi(\theta_j) \in \left( \pi(p_j) - .005 M_j,\ \pi(p_j) + .005 M_j \right) \tag{2.3.1}$$

for all θ_j ∈ I_j, where p_j=(j-.5)/100, the midpoint of I_j.
For each I_j, j=1(1)100, the local upper bound is given by

$$\pi^{+}(p_j) = \pi(p_j) + .005 M_j.$$

It can then be deduced that, for p on the unit interval (0,1),

$$\pi(p) < \max \{ \pi^{+}(p_j);\ j=1(1)100 \}. \tag{2.3.2}$$

Therefore, the right hand side of (2.3.2) is greater than
sup π(p), the size of T.
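
The bounds of sections 2.2 and 2.3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the program of Appendix C: a null power function of form (2.1.1) is represented as a list of `(a, b, c)` triples with a ≥ 0 and integer b, c ≥ 0, and the names `h_sup_inf`, `derivative_bound` and `size_upper_bound` are hypothetical. The toy function p²(1-p)², whose supremum is 1/16, is used only as a check.

```python
def h_sup_inf(r, s, lo, hi):
    """Sup and inf over [lo, hi] of h(p) = p**r * (1-p)**(s-r), as in (2.2.2)-(2.2.3)."""
    if s == 0:                            # h is identically 1
        return 1.0, 1.0
    h = lambda p: p**r * (1 - p)**(s - r)
    mode = r / s                          # h is unimodal with its peak at r/s
    if mode > hi:
        peak = h(hi)
    elif mode < lo:
        peak = h(lo)
    else:
        peak = h(mode)
    return peak, min(h(lo), h(hi))

def derivative_bound(terms, lo, hi):
    """Bound M on |pi'(p)| over [lo, hi], built term by term from (2.2.1)."""
    upper = lower = 0.0
    for a, b, c in terms:
        if b > 0:                         # positive term a*b*p^(b-1)*(1-p)^c
            sup_h, inf_h = h_sup_inf(b - 1, b + c - 1, lo, hi)
            upper += a * b * sup_h
            lower += a * b * inf_h
        if c > 0:                         # negative term -a*c*p^b*(1-p)^(c-1)
            sup_h, inf_h = h_sup_inf(b, b + c - 1, lo, hi)
            upper -= a * c * inf_h
            lower -= a * c * sup_h
    return max(abs(upper), abs(lower))

def size_upper_bound(terms, k=100):
    """Right hand side of (2.3.2): max over k subintervals of pi(p_j) + half-width * M_j."""
    pi = lambda p: sum(a * p**b * (1 - p)**c for a, b, c in terms)
    best = 0.0
    for j in range(1, k + 1):
        lo, hi, mid = (j - 1) / k, j / k, (j - 0.5) / k
        best = max(best, pi(mid) + (0.5 / k) * derivative_bound(terms, lo, hi))
    return best

terms = [(1, 2, 2)]                       # toy example: pi(p) = p^2 (1-p)^2
bound = size_upper_bound(terms)
print(bound)
```

With k=100 the half-width 0.5/k equals the constant .005 of (2.3.1), and for the toy example the bound exceeds the true supremum .0625 by less than .0001.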

Because M_j can be quite large for some values of j,
it is possible that the bound on the right hand side of
(2.3.2) will be a conservative upper bound for the function
π(p). This bound can be improved upon to produce a least
upper bound of precision δ in the following manner. Any
interval I_j for which

$$\pi^{+}(p_j) > \max \{ \pi(p_i);\ i=1(1)100 \} + \delta$$

must be iteratively subdivided into 2^{m_j} subintervals
{ I_{jk_j} ; k_j=1,2,..., 2^{m_j}, m_j ≥ 1 }, where m_j is the smallest
integer such that

$$\pi(p_{jk_j}) + .005\, M_{jk_j} / 2^{m_j} < \max \left\{ \max_{j} \pi(p_j),\ \max_{j,k_j} \pi(p_{jk_j}) \right\} + \delta \tag{2.3.3}$$

for all k_j, and where

$$p_{jk_j} = (j-1)/100 + (2k_j - 1)/(100 \times 2^{m_j+1})$$

is the midpoint of I_{jk_j} and M_{jk_j} is the local bound for
|π'(p)| in I_{jk_j}. The inequality (2.3.3) is clearly
attainable since, from (2.2.1),

$$M_{jk_j} \le \sum_{i \in C} a_i \max(b_i, c_i). \tag{2.3.4}$$

Moreover, using the right hand side of (2.3.4) rather than
M_{jk_j} is inefficient and would make computing costs
prohibitively large.
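
The role of (2.3.4) as a termination guarantee can be illustrated with a deliberately simplified refinement. The sketch below departs from the procedure described above in two labeled ways: it bisects the whole unit interval rather than starting from 100 subintervals, and it uses the crude global bound on the right hand side of (2.3.4) in place of the local bounds M_{jk_j}, which the text notes is inefficient for real problems. The function name and the toy example are assumptions made for illustration.

```python
def size_with_precision(terms, delta=1e-4):
    """Return a value within delta below sup pi(p) for pi of form (2.1.1).

    terms: list of (a, b, c) with a >= 0, so that |pi'(p)| <= sum a*max(b, c).
    """
    pi = lambda p: sum(a * p**b * (1 - p)**c for a, b, c in terms)
    m_global = sum(a * max(b, c) for a, b, c in terms)   # right hand side of (2.3.4)
    best = 0.0                                           # largest pi value seen so far
    stack = [(0.0, 1.0)]
    while stack:
        lo, hi = stack.pop()
        mid, half = (lo + hi) / 2, (hi - lo) / 2
        value = pi(mid)
        best = max(best, value)
        # Mean value theorem: pi <= value + half * m_global on [lo, hi].
        # Split the interval only if that bound could still beat best + delta.
        if value + half * m_global > best + delta:
            stack.extend([(lo, mid), (mid, hi)])
    return best

size = size_with_precision([(1, 2, 2)])   # toy example with supremum 1/16
print(size)
```

Every interval is eventually accepted because its upper bound falls below `best + delta` once its half-width is at most `delta / m_global`, so the loop terminates and the returned value is within `delta` of the supremum.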

By the methodology developed in this section, it is

now possible to compute the size (with precision δ) of any

test for which the null power function is of the form (2.1.1).

These computations will then shed a light on the extent

of conservativeness or liberalness of the tests that are

used in the elimination of a nuisance parameter in the

present context.

2.4 Stability of the Null Power Function

In the ideal case of the absence of a nuisance
parameter, the null power function is constant over the
null parameter space. However, when a nuisance parameter
is present, the magnitude of its effect on the null power
function π(p) can be of interest. A relevant feature is
the stability or flatness of π(p). An indication of this
stability can be created using

$$\inf_{p} \{ \pi(p);\ p \in (p_a, p_b) \} \tag{2.4.1}$$

where p_a and p_b are chosen appropriately for each problem.
The difference between sup π(p) and (2.4.1) is then an
indicator of this stability. The whole unit interval is
not used in (2.4.1) because π(0)=π(1)=0. A lower bound
for the set in (2.4.1) can be computed from (2.3.1) by

$$\min \{ \pi(p_j) - .005 M_j;\ j = j_a(1)j_b \}$$

where j_a = int(100p_a) + 1,
j_b = int(100p_b) + 1
and int(.) is the integer function, p_j being as in (2.3.1).

2.5 Choice of the Test Statistic

The choice of the testing procedure is quite arbitrary

since the goal here is to compare unconditional to

conditional tests. The test statistic that is derived from

a testing procedure will simply be a means of dividing the

unconditional sample space S into a critical region C and

an acceptance region S-C. The choice should then be based on

optimal procedures that produce test statistics that are

powerful, simple to compute, and intuitively appealing.

Three such procedures are the likelihood ratio test

criterion, the chi-square goodness-of-fit test and a Z-test

based on the asymptotically standardized maximum likelihood

estimator, often a function of the sufficient statistic, of

the parameter being tested.

The likelihood ratio criterion is given by

$$R = \frac{\sup_{H_0} L(\theta)}{\sup L(\theta)}$$

where L(θ) is the likelihood function of the sample. If
large sample theory applies, the more convenient equivalent
statistic

$$X^2 = -2 \log(R) \tag{2.5.1}$$

can be used because of its limiting chi-square null distribution.
The chi-square goodness-of-fit test, based on the statistic

$$X^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i} \tag{2.5.2}$$

is especially apropos because of its applicability to
multinomial data which lead to null power functions of the
form (2.1.1). The asymptotic null distribution of (2.5.2)
is also chi-square. The third test is based on the
asymptotic normality of θ̂_n, the maximum likelihood estimator,
if one exists, of the parameter, call it θ, being tested.
The statistic

$$Z = \frac{\hat{\theta}_n}{s(\hat{\theta}_n)} \tag{2.5.3}$$

where s²(θ̂_n) is the asymptotic variance of θ̂_n or a
consistent estimator thereof, has a standard normal asymptotic
null distribution under the regularity conditions of likelihood
theory. For the two problems considered in Chapters 3
and 4, the maximum likelihood estimator is a function of the
sufficient statistic. This statistic and the chi-square
goodness-of-fit statistic are the most appealing since (2.5.1)
could be computationally laborious.

A further advantage of using the test statistics (2.5.1),

(2.5.2) and (2.5.3) is their well-known and well-tabulated

asymptotic distributions. These tables can be used to find

reasonable starting points for the critical values. Moreover,

these values can be compared to the percentage points of the

asymptotic distributions and thus provide a study of the

accuracy of the large sample approximations.

CHAPTER 3

THE 2x2 TABLE FOR
INDEPENDENT PROPORTIONS

3.1 Introduction

A classical problem is the one of comparing two

proportions from independent samples. This seemingly simple

problem that involves only four numbers, has generated a

large amount of literature and has been the subject of much

controversy about the use of conditional tests. Since

Fisher (1935) proposed the "exact" test, Barnard (1947) and

Pearson (1947) started a conflict that has not yet been

resolved, as can be seen in the recent articles by Berkson

(1978), Barnard (1979), Basu (1979), Corsten and de Kroon

(1979) and Kempthorne (1979). Because of the complexity of

the power function, only partial attempts have been made in

order to resolve the argument. The statement of the problem

follows.

Let X and Y be independent binomial random variables with parameters $(n,p_1)$ and $(n,p_2)$ respectively. An experiment that compares $p_1$ and $p_2$ is called a 2x2 comparative trial by Barnard (1947), the outcome of which is represented in the form of Table 3.1,

Table 3.1

            S       F       totals

    T1      x       n-x     n

    T2      y       n-y     n

where $p_i = P(S \mid T_i) = 1 - P(F \mid T_i)$, $i=1,2$. The labels S and F represent the binary outcomes (S=success, F=failure) and T1 and T2 represent the two populations being compared.

The problem is to test, at level $\alpha$, the null hypothesis $H_0: p_1=p_2$ against the alternative hypothesis $H_a: p_1<p_2$. Since the other one-sided alternative $H_a: p_1>p_2$ and the two-sided alternative $H_a: p_1 \ne p_2$ are treated in a similar manner, only this one-sided case will be considered here. Furthermore, only the case of equal sample sizes will be considered because of its optimality under equal sampling costs (Lehmann, 1959:146). The probability of observing the outcome in Table 3.1 is

$$P(X=x,Y=y) = \binom{n}{x} p_1^x (1-p_1)^{n-x} \binom{n}{y} p_2^y (1-p_2)^{n-y}$$

and is, under the null hypothesis $H_0: p_1=p_2$ ($=p$ say),

$$P(X=x,Y=y) = \binom{n}{x}\binom{n}{y}\, p^{x+y} (1-p)^{2n-x-y},$$

a function of the nuisance parameter p, the unspecified common value of $p_1$ and $p_2$ under $H_0$.

Because of this dependence on a nuisance parameter,

either approximate tests based on asymptotic results or

exact conditional tests are used. Few attempts, however,

have been made to compute the exact unconditional size of

any of these tests. Barnard (1947) proposed an unconditional test based on $\sup_p \pi(p)$, the size. The criterion that he suggested was intricate and no methodology for computing the size was given. McDonald, Davis and Milliken (1977) tabulated critical regions based on the unconditional size of Fisher's exact test for $n \le 15$ and $\alpha=.01$ and .05.

Again, no formal methodology for computing the size was

given. Furthermore, no sample size tables or power

calculations based on an exact unconditional test exist.

In this chapter, the most common tests are presented.

For the asymptotic case, two normal tests and some sample

size formulae are given. For the general case, and in

particular for small sample sizes, two derivations that both lead to Fisher's exact test are presented. As an alternative to these tests, the results of Chapter 2 are used to compute and tabulate the exact unconditional size of two simple statistics as well as the required sample sizes for a significance level of $\alpha=.05$ and a power of $1-\beta=.80$. It is also shown that these tests are uniformly more powerful than Fisher's exact test in the range considered, namely $\alpha=.025$, $\alpha=.05$ and n=10(1)150.

3.2 Asymptotic Tests and Sample Size Formulae

A way of circumventing the effect of the nuisance

parameter is through the use of asymptotic tests. These

approximate tests are appealing because they are usually

based on simple test statistics for which the limiting

distributions are well tabulated. They are, however,

approximations and should not be used when the sample sizes

are small.

When n is relatively large, the most widely used tests

for the hypothesis of interest are the normal tests. The

first one, based on the inversion of the asymptotic confidence interval for $p_2-p_1$, is the Z-test with an unpooled estimator of the variance and is given by

$$Z_u = \frac{\sqrt{n}\,(\hat p_2 - \hat p_1)}{(\hat p_2 \hat q_2 + \hat p_1 \hat q_1)^{1/2}} \tag{3.2.1}$$

where $\hat p_1 = x/n = 1-\hat q_1$, $\hat p_2 = y/n = 1-\hat q_2$, with x, y and n as in Table 3.1. The second one, based on the asymptotic null distribution of $\hat p_2-\hat p_1$, is the Z-test with a pooled variance estimator and is given by

$$Z_p = \frac{\sqrt{n}\,(\hat p_2 - \hat p_1)}{(2\hat p \hat q)^{1/2}} \tag{3.2.2}$$

where $\hat p_1$ and $\hat p_2$ are as in (3.2.1) and $\hat p = (x+y)/2n = 1-\hat q$. The limiting distribution of both $Z_u$ and $Z_p$ is the standard normal distribution and an approximate test of size $\alpha$ is based on the percentage points of $\Phi$, the standard normal distribution function. The test statistic $Z_p$ is most frequently used through an equivalent test statistic, the chi-square goodness of fit statistic given by

$$X_p^2 = Z_p^2 \tag{3.2.3}$$

which has a $\chi^2_1$ limiting distribution. Because this chi-square test deals with two-sided alternatives, the statistic $Z_p$ is preferred in the present context of a one-sided test.
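As an illustration, the two normal statistics can be evaluated directly from the entries of Table 3.1. The following is a minimal sketch only; the function names `z_unpooled` and `z_pooled` and the example counts are assumptions introduced here, not part of the original programs.

```python
from math import sqrt

def z_unpooled(x, y, n):
    """Z-statistic with unpooled variance estimator (hypothetical helper).

    Undefined (division by zero) when both samples are all successes or all failures.
    """
    p1, p2 = x / n, y / n
    return sqrt(n) * (p2 - p1) / sqrt(p2 * (1 - p2) + p1 * (1 - p1))

def z_pooled(x, y, n):
    """Z-statistic with pooled variance estimator (hypothetical helper).

    Undefined when x + y is 0 or 2n.
    """
    p1, p2 = x / n, y / n
    p = (x + y) / (2 * n)            # pooled estimate of the common proportion
    return sqrt(n) * (p2 - p1) / sqrt(2 * p * (1 - p))

# Illustrative counts: 3 successes out of 10 under T1, 8 out of 10 under T2.
zu = z_unpooled(3, 8, 10)
zp = z_pooled(3, 8, 10)
chi_square = zp ** 2                 # the equivalent chi-square statistic
```

Each value would be referred to the upper percentage points of the standard normal distribution for the one-sided test.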

The accuracy of the approximation was studied by

various authors. The nominal significance level $\alpha$ was compared to the actual significance level by computing (3.4.3) for some values of p. Between the papers by Pearson (1947) and Berkson (1978), numerous studies have shown that for $Z_p$ the actual level could be larger than the nominal level for some values of p, making this test a liberal one.

To determine the sample size required in each group,

two formulae are often used. The first one, based on $Z_p$ and derived in Fleiss (1980), determines the sample size in each group by

$$n_p = \frac{[\,z_\alpha (2\bar p \bar q)^{1/2} + z_\beta (p_1 q_1 + p_2 q_2)^{1/2}\,]^2}{(p_2-p_1)^2} \tag{3.2.4}$$

where $z_\gamma$ is the upper $100\gamma$ percentile of the standard normal distribution, $\alpha$ and $\beta$ are the type I and type II error probabilities, $p_1$ and $p_2$ are the desired alternatives, $q_i = 1-p_i$, and $\bar p = (p_1+p_2)/2 = 1-\bar q$. The second formula, based on the variance stabilizing property of the arcsine transformation on proportions, is given in Cochran and Cox (1957) by

$$n_{as} = \frac{(z_\alpha + z_\beta)^2}{2\,(\sin^{-1}\sqrt{p_2} - \sin^{-1}\sqrt{p_1})^2} \tag{3.2.5}$$

Other formulae have been derived and are mostly corrected

versions of (3.2.4). Kramer and Greenhouse (1959), arguing

that the test based on $Z_p$ was too liberal, adjusted (3.2.4)

and found

n { 1+[1+8(p2-P1)/np }2 (3.2.6)
n =n
c p 4(22 12
4 (P2 Pl)2

where $n_p$ is the sample size found from (3.2.4). More recently, namely since sample size tables based on the exact conditional test were computed, a further adjustment to (3.2.4) was suggested by Casagrande, Pike and Smith (1978a) in order to arrive closer to results based on the exact conditional test. They proposed the formula

n =n { 1 4( /n 2 (3.2.7)
r p (2 2
4 (P2 Pl)

the derivation of which was based on a slight deviation

from the derivation of (3.2.6).
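The two basic formulae (3.2.4) and (3.2.5) are simple enough to evaluate numerically. The sketch below assumes a one-sided test, angles in radians, and upper percentiles taken from Python's `statistics.NormalDist`; the function names are hypothetical, and the corrected versions (3.2.6) and (3.2.7) are not included.

```python
from math import asin, sqrt
from statistics import NormalDist

def upper_z(gamma):
    """Upper 100*gamma percentile of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - gamma)

def n_pooled(p1, p2, alpha=0.05, beta=0.20):
    """Sample size per group from (3.2.4), before rounding up."""
    pbar = (p1 + p2) / 2
    qbar = 1 - pbar
    num = (upper_z(alpha) * sqrt(2 * pbar * qbar)
           + upper_z(beta) * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return num / (p2 - p1) ** 2

def n_arcsine(p1, p2, alpha=0.05, beta=0.20):
    """Sample size per group from the arcsine formula (3.2.5), before rounding up."""
    delta = asin(sqrt(p2)) - asin(sqrt(p1))
    return (upper_z(alpha) + upper_z(beta)) ** 2 / (2 * delta ** 2)

# Illustrative alternatives p1 = .2 and p2 = .5 with alpha = .05 and power .80.
np_example = n_pooled(0.2, 0.5)
nas_example = n_arcsine(0.2, 0.5)
```

In practice each value would be rounded up to the next integer.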

3.3 Fisher's Exact Test

The "exact" method of eliminating the nuisance parameter

is based on a conditional argument and can be obtained via

two different approaches. The first approach, put forward by

Fisher (1935), is that of a permutation test. The permutation test argument is a conditional one in that the critical region is constructed on a space conditional on some information from the data. Fisher argues that, because the marginal totals of Table 3.1 alone do not supply any information about the equality of $p_1$ and $p_2$, it is reasonable to test conditionally. Thus, given x+y and under $H_0: p_1=p_2$, the probability of Table 3.1 is given by the hypergeometric distribution, namely

$$P(x,y \mid x+y) = \frac{\binom{n}{x}\binom{n}{y}}{\binom{2n}{x+y}}. \tag{3.3.1}$$

This is Fisher's exact test, the size of which is based on the tail areas of (3.3.1).
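The tail area of (3.3.1) is easily computed. The following sketch assumes the one-sided alternative that the success probability of T1 is smaller than that of T2, so that small values of x are extreme; the function name `fisher_lower_tail` is hypothetical.

```python
from math import comb

def fisher_lower_tail(x, y, n):
    """Conditional probability of observing x or fewer successes in T1,
    given the total t = x + y, under the hypergeometric distribution (3.3.1)."""
    t = x + y
    lowest = max(0, t - n)           # smallest value X can take given the total t
    total = comb(2 * n, t)
    return sum(comb(n, u) * comb(n, t - u) for u in range(lowest, x + 1)) / total

# Illustrative table with n = 3 per group: x = 0 successes in T1, y = 3 in T2.
p_value = fisher_lower_tail(0, 3, 3)
```

The nonrandomized test rejects when this tail area does not exceed the nominal level.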

The second approach is based on the Neyman-Pearson lemma

for testing hypotheses, a thorough treatment of which is

given in Lehmann (1959:134) in the case for which a nuisance

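parameter is present. In the current case, the probability of Table 3.1 is given by

$$
\begin{aligned}
P(x,y) &= \binom{n}{x} p_1^x (1-p_1)^{n-x} \binom{n}{y} p_2^y (1-p_2)^{n-y} \\
&= \binom{n}{x}\binom{n}{y} (1-p_1)^n (1-p_2)^n \exp\{x \log[p_1/(1-p_1)] + y \log[p_2/(1-p_2)]\} \\
&= \binom{n}{x}\binom{n}{y} (1-p_1)^n (1-p_2)^n \exp\left\{x \log\left[\frac{p_1/(1-p_1)}{p_2/(1-p_2)}\right] + (x+y)\log[p_2/(1-p_2)]\right\}.
\end{aligned}
$$

By Lemma 2 of Lehmann (1959:139), the uniformly most powerful unbiased (UMPU) level $\alpha$ test for comparing $p_1$ and $p_2$ is based on the conditional distribution of X (=x) given T=X+Y (=t) and has the form

$$
\phi(x,t) = \begin{cases} 1 & \text{when } x < C(t) \\ \gamma(t) & \text{when } x = C(t) \\ 0 & \text{when } x > C(t) \end{cases}
$$

where C and $\gamma$ are determined by

$$E_{H_0}[\,\phi(X,T) \mid T=t\,] = \alpha$$

for all t, that is

$$\alpha = P_{H_0}[X < C(t) \mid T=t] + \gamma(t)\,P_{H_0}[X = C(t) \mid T=t].$$

The conditional distribution is

$$P(X=x \mid T=t) = \frac{\binom{n}{x}\binom{n}{t-x}\rho^x}{\sum_{u=0}^{t}\binom{n}{u}\binom{n}{t-u}\rho^u}$$

where $\rho = \dfrac{p_1/(1-p_1)}{p_2/(1-p_2)}$ is the odds ratio, and under $H_0$, the distribution is given by

$$P_{H_0}(X=x \mid T=t) = \frac{\binom{n}{x}\binom{n}{t-x}}{\binom{2n}{t}}, \quad x=0,\ldots,t,$$

the same hypergeometric distribution found by Fisher's permutation method. Here, C(t) is taken to be the largest value such that

$$\sum_{x=0}^{C(t)-1} \frac{\binom{n}{x}\binom{n}{t-x}}{\binom{2n}{t}} \le \alpha.$$

In practice, the nonrandomized version of $\phi$ is used, that is $\phi$ without the random element $\gamma(t)$. Therefore, the conditional test always has size $\le \alpha$ and, unlike $\phi$, is not UMPU of level $\alpha$.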

3.4 An Exact Unconditional Test

In this section, the methodology of Chapter 2 is used

to compute the size of $Z_u$, the normal test statistic with unpooled variance estimator. This statistic was chosen on the basis of its computational simplicity and its intuitively appealing form. It is given by

$$Z_u = \frac{\sqrt{n}\,(\hat p_2 - \hat p_1)}{(\hat p_2 \hat q_2 + \hat p_1 \hat q_1)^{1/2}} \tag{3.4.1}$$

where $\hat p_1 = x/n = 1-\hat q_1$ and $\hat p_2 = y/n = 1-\hat q_2$, with x, y and n as in Table 3.1. The asymptotic null distribution (the standard normal) of $Z_u$ is frequently used in this problem to approximate its actual size. The results of this chapter can thus be used to verify the accuracy of this approximation.

Since x and y are outcomes of independent binomial random variables with parameters $(n,p_1)$ and $(n,p_2)$ respectively, the power function of any test is given by

$$\pi(p_1,p_2) = \sum\sum_{(x,y)\in C} \binom{n}{x} p_1^x (1-p_1)^{n-x} \binom{n}{y} p_2^y (1-p_2)^{n-y} \tag{3.4.2}$$

and under $H_0: p_1=p_2$ ($=p$ say), the null power function, also denoted by $\pi$, is given by

$$\pi(p) = \sum\sum_{(x,y)\in C} \binom{n}{x}\binom{n}{y}\, p^{x+y} (1-p)^{2n-x-y} \tag{3.4.3}$$

where p is the nuisance parameter and C is the critical region defined by the test statistic. For the one-sided test of interest, the critical region defined by $Z_u$ is given by

$$C = \{(x,y): Z_u > z_u;\ x,y=0(1)n,\ z_u > 0\}. \tag{3.4.4}$$

For an $\alpha$ level test, the critical value of $Z_u$, namely $z_u^*$, satisfies the equation

$$z_u^* = \inf\{z_u: \sup_p \pi(p) \le \alpha\}. \tag{3.4.5}$$

Since (3.4.3) has the form of (2.1.1), the methodology of Chapter 2 can be utilized to find $\alpha^*$, a value at most $\delta$ above $\sup_p \pi(p)$ in (3.4.5).

First, to simplify the computations, (3.4.3) can be reduced to a single summation by solving the inequality $Z_u > z_u$, namely

$$y - x > z_u \left[\frac{y(n-y) + x(n-x)}{n}\right]^{1/2}. \tag{3.4.6}$$

After squaring both sides of (3.4.6), the larger root for y in

$$n\,(y-x)^2 = z_u^2\,[\,y(n-y) + x(n-x)\,]$$

is found to be

$$y = h(x) = \frac{b + (b^2-4ac)^{1/2}}{2a}$$

where $a = 1 + z_u^2/n$, $b = 2x + z_u^2$ and $c = ax^2 - xz_u^2$. Hence (3.4.4) reduces to

$$C = \{(x,y): y > h(x);\ x,y=0(1)n\}.$$

Next, (3.4.3) can be written as

$$
\begin{aligned}
\pi(p) &= \sum_{x=0}^{v} \sum_{y>h(x)} \binom{n}{x}\binom{n}{y}\, p^{x+y} (1-p)^{2n-x-y} && (3.4.7) \\
&= \sum_{x=0}^{v} \binom{n}{x} p^x (1-p)^{n-x} \sum_{y>h(x)} \binom{n}{y} p^y (1-p)^{n-y} \\
&= \sum_{x=0}^{v} f(x)\,[\,1 - F(h(x))\,] && (3.4.8)
\end{aligned}
$$

where $v = \mathrm{int}[\,n^2/(n+z_u^2)\,]$, int[.] is the integer function, f(.) is the binomial probability mass function with parameters (n,p) and F(.) is its cumulative distribution function.
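The reduction to a single summation can be checked numerically against the double summation over the critical region. The sketch below is an illustration only: the function names are hypothetical, and infinite values are assigned to the statistic when its denominator vanishes (an assumption made here to handle the corner points of the table).

```python
from math import comb, sqrt, floor, inf

def z_u(x, y, n):
    """Unpooled Z-statistic written in terms of the counts x and y."""
    num = y - x
    den = y * (n - y) + x * (n - x)
    if den == 0:                      # both samples all-success or all-failure
        return 0.0 if num == 0 else (inf if num > 0 else -inf)
    return sqrt(n) * num / sqrt(den)

def null_power_direct(n, z, p):
    """Double summation of the null probabilities over {Z_u > z}."""
    return sum(comb(n, x) * comb(n, y) * p ** (x + y) * (1 - p) ** (2 * n - x - y)
               for x in range(n + 1) for y in range(n + 1) if z_u(x, y, n) > z)

def h(x, n, z):
    """Larger root in y of n(y-x)^2 = z^2 [y(n-y) + x(n-x)]."""
    a = 1 + z * z / n
    b = 2 * x + z * z
    c = a * x * x - x * z * z
    return (b + sqrt(max(b * b - 4 * a * c, 0.0))) / (2 * a)

def null_power_reduced(n, z, p):
    """Single summation over x of f(x) times the upper binomial tail beyond h(x)."""
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    v = int(n * n / (n + z * z))
    total = 0.0
    for x in range(v + 1):
        first_y = floor(h(x, n, z)) + 1   # smallest integer y with y > h(x)
        total += pmf[x] * sum(pmf[first_y:])
    return total

# Illustrative check for n = 10 and a critical value of 1.96 at p = .3.
direct = null_power_direct(10, 1.96, 0.3)
reduced = null_power_reduced(10, 1.96, 0.3)
```

The two functions should agree up to rounding error; the reduced form needs only one pass over x once the binomial probabilities are tabulated.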

The derivative of $\pi(p)$, which can also be reduced significantly to simplify the computations, is given by

$$\pi'(p) = \sum_C \binom{n}{x}\binom{n}{y}(x+y)\,p^{x+y-1}(1-p)^{2n-x-y} - \sum_C \binom{n}{x}\binom{n}{y}(n-x+n-y)\,p^{x+y}(1-p)^{2n-x-y-1}$$

where $\sum_C$ denotes the double summation $\sum\sum_{(x,y)\in C}$. It can be rewritten as

$$
\begin{aligned}
\pi'(p) ={}& \sum_C n\binom{n-1}{x-1}\binom{n}{y}\, p^{x+y-1}(1-p)^{2n-x-y} \\
&+ \sum_C n\binom{n}{x}\binom{n-1}{y-1}\, p^{x+y-1}(1-p)^{2n-x-y} \\
&- \sum_C n\binom{n-1}{x}\binom{n}{y}\, p^{x+y}(1-p)^{2n-x-y-1} \\
&- \sum_C n\binom{n}{x}\binom{n-1}{y}\, p^{x+y}(1-p)^{2n-x-y-1}
\end{aligned}
\tag{3.4.9}
$$

where $\binom{n-1}{-1} = \binom{n-1}{n} = 0$.

Consider the boundary of the critical region C defined by

$$W = \{(x,y): (x,y)\in C \text{ and } (x+1,y)\notin C\}.$$

The sum of the first and third terms of (3.4.9) becomes, after cancellation of opposing signed identical contributions,

$$\pi_1'(p) = -\sum_W n\binom{n-1}{x}\binom{n}{y}\, p^{x+y}(1-p)^{2n-x-y-1}. \tag{3.4.10}$$

The other boundary of C is defined by

$$V = \{(x,y): (x,y)\in C \text{ and } (x,y-1)\notin C\}.$$

Then the sum of the second and fourth terms of (3.4.9) becomes

$$\pi_2'(p) = \sum_V n\binom{n}{x}\binom{n-1}{y-1}\, p^{x+y-1}(1-p)^{2n-x-y}. \tag{3.4.11}$$

Upon combining (3.4.10) and (3.4.11), the derivative of $\pi(p)$ is given by

$$\pi'(p) = \pi_1'(p) + \pi_2'(p)$$

and can be further reduced by noticing, from (3.4.4) and (3.4.6), that

$$(x_0,y_0)\in C \iff (n-y_0,n-x_0)\in C$$

so that

$$(x_1,y_1)\in V \iff (n-y_1,n-x_1)\in C \text{ and } (n-y_1+1,n-x_1)\notin C \iff (n-y_1,n-x_1)\in W.$$

The derivative of $\pi(p)$ can finally be written as

$$
\begin{aligned}
\pi'(p) &= \sum_V n\left[\binom{n}{x}\binom{n-1}{y-1}\, p^{x+y-1}(1-p)^{2n-x-y} - \binom{n-1}{n-y}\binom{n}{n-x}\, p^{2n-x-y}(1-p)^{x+y-1}\right] \\
&= \sum_V n\binom{n}{x}\binom{n-1}{y-1}\left[\,p^{x+y-1}(1-p)^{2n-x-y} - p^{2n-x-y}(1-p)^{x+y-1}\,\right],
\end{aligned}
$$

a summation over the set of sample points that form a boundary of the critical region C and which are directly obtained from the reduction (3.4.7).

The methodology developed in Chapter 2 can now be utilized to find $\alpha^*$, the size (of precision $\delta=.001$) of $Z_u$ for any value $z_u$. For a test of significance of level $\alpha$, the critical value $z_u^*$ can then be obtained by equation (3.4.5). This is done by using the upper $100\alpha$ percentile point of the standard normal distribution as a starting value of $z_u$. This value is then incremented or decremented until $\alpha^* \le \alpha$ and $z_u^*$ is taken as the smallest value which satisfies this inequality. This procedure was implemented in a FORTRAN computer program, listed in Appendix C.1.
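The search for the critical value can be imitated crudely in a few lines. The sketch below is not the Chapter 2 methodology nor the FORTRAN program: it assumes that the supremum of the null power function is adequately approximated by its maximum over a fixed grid of values of p, and it scans the attained values of the statistic from the largest downwards. All function names and the grid size are assumptions.

```python
from math import comb, sqrt, inf

def z_u(x, y, n):
    """Unpooled Z-statistic in terms of the counts; infinite at degenerate corners."""
    num = y - x
    den = y * (n - y) + x * (n - x)
    if den == 0:
        return 0.0 if num == 0 else (inf if num > 0 else -inf)
    return sqrt(n) * num / sqrt(den)

def size_on_grid(n, z, grid):
    """Maximum over the grid of the null power of the region {Z_u > z}."""
    region = [(x, y) for x in range(n + 1) for y in range(n + 1)
              if z_u(x, y, n) > z]
    def null_power(p):
        return sum(comb(n, x) * comb(n, y) * p ** (x + y) * (1 - p) ** (2 * n - x - y)
                   for x, y in region)
    return max(null_power(p) for p in grid)

def critical_value(n, alpha, grid_points=199):
    """Smallest attained value z of Z_u whose region {Z_u > z} has grid size <= alpha.

    Returns the pair (z, grid size), or None if no attained value qualifies.
    """
    grid = [(i + 1) / (grid_points + 1) for i in range(grid_points)]
    values = {z_u(x, y, n) for x in range(n + 1) for y in range(n + 1)}
    candidates = sorted(v for v in values if 0 < v < inf)
    best = None
    for z in reversed(candidates):        # the region grows as z decreases
        s = size_on_grid(n, z, grid)
        if s > alpha:
            break
        best = (z, s)
    return best

result = critical_value(10, 0.05)
```

Because the size is a step function of the critical value, only the attained values of the statistic need to be examined; a grid maximum can understate the true supremum, so this sketch does not replace the bound of precision given by the exact method.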

For n=10(1)150 and $\alpha=.05$ and .025, $z_u^*$, the exact critical values, and $\alpha^*$, the size (of precision $\delta=.001$) of $Z_u$, were computed and are given in Table A.1. Furthermore, Table A.1 also contains $\alpha_1$, a lower bound for {$\pi(p)$; .05 [...] and $\alpha_2$, a lower bound for {$\pi(p)$; .10 [...], as indicators of the stability of the null power function. The critical values of $Z_p$, the Z statistic with pooled variance estimator discussed in the next section, are also given in Table A.1 and are denoted by $z_p^*$.

The null power function, $\pi(p)$, of each test procedure was plotted for some values of n and a nominal significance level of $\alpha=.05$. For n=10, Figure B.1 contains the plot of $\pi(p)$ based on the normal approximation of $Z_u$ ($z_u=1.645$) and Figure B.2 is based on $Z_p$ ($z_p=1.645$). Although both graphs exceed the ideal value of $\alpha=.05$ for some values of p, showing that the normal approximation produces a liberal test in this case, it is clear that $Z_p$ gives a better approximation than $Z_u$ when referred to a standard normal distribution. This point is discussed at length in the next section. Figure B.3

is based on the unconditional critical region defined by Fisher's exact test, the criterion used by McDonald, Davis and Milliken (1977) in their unconditional approach. The conservativeness of Fisher's exact test is evident; its actual size in this case being approximately .02. In Figure B.4, the value $z_u^*=1.96$ (or equivalently $z_p^*=1.80$) of Table A.1 is used to plot $\pi(p)$. This plot is seen to perform the best at approaching the nominal level $\alpha=.05$ without exceeding it.

For larger values of n, the null power function of the exact unconditional test $Z_u$ (or equivalently $Z_p$) is seen to behave better. For n=20, Figure B.5 is the plot of $\pi(p)$ based on $z_u^* = 1.85$ (or $z_p^* = 1.78$) and Figure B.6 is based on n=30 and $z_u^* = 1.77$ (or $z_p^* = 1.73$).

3.5 Relation to the Chi-square Goodness-of-fit Test

The other test statistic given in (3.2.2), namely $Z_p$, the square of which is the chi-square goodness-of-fit test statistic, has a functional relationship with $Z_u$ in the present equal sample size case. This relationship can be derived by first noticing that

$$Z_u = \frac{\sqrt{n}\,(y-x)}{[\,y(n-y) + x(n-x)\,]^{1/2}} \tag{3.5.1}$$

and

$$Z_p = \frac{\sqrt{n}\,(y-x)}{[\,(x+y)(2n-x-y)/2\,]^{1/2}}. \tag{3.5.2}$$

The square of the denominator of $Z_u$,

$$yn - y^2 + xn - x^2, \tag{3.5.3}$$

can be compared to the square of the denominator of $Z_p$,

$$yn + xn - (x+y)^2/2. \tag{3.5.4}$$

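By rewriting (3.5.3) as

$$yn + xn - \frac{(x+y)^2}{2} - \frac{(y-x)^2}{2},$$

it is clear that $Z_p$ and $Z_u$ satisfy the relation

$$\frac{1}{Z_u^2} = \frac{1}{Z_p^2} - \frac{1}{2n}$$

which can be rewritten as

$$Z_p^2 = \frac{2n Z_u^2}{2n + Z_u^2} \tag{3.5.5}$$

so that the critical values $z_p^*$ of $Z_p$ given in Table A.1 are obtained by

$$Z_p = \mathrm{sgn}(y-x)\left[\frac{2n Z_u^2}{2n + Z_u^2}\right]^{1/2} \tag{3.5.6}$$

where

$$\mathrm{sgn}(u) = \begin{cases} 1 & u > 0 \\ 0 & u = 0 \\ -1 & u < 0. \end{cases}$$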

Robbins (1977) has noted that $|Z_u| \ge |Z_p|$ for the equal sample size case and has posed the question as to which of $Z_u$ or $Z_p$ is more powerful. This question was investigated by Eberhardt and Fligner (1977). They noticed, via a computational argument, that the increase in the significance level for $Z_u$ is compensated fairly well by an increase in power. Moreover, they suggested that $Z_u$ should not be used for small samples because $Z_p$ is closer to a standard normal random variable. In view of the relation

(3.5.5), $Z_u$ and $Z_p$ are monotonic increasing functions of each other and are therefore equivalent in the sense that $Z_u$, with some nominal significance level $\alpha$, is equivalent to $Z_p$ with some higher nominal level $\alpha'$. Thus, for the same nominal level $\alpha$, $Z_u$ will reject $H_0$ more often than $Z_p$ will.
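The functional relationship between the two statistics gives a direct conversion of critical values. The sketch below applies it to the critical values quoted in section 3.4 for n = 10, 20 and 30; the function name `zp_from_zu` is hypothetical.

```python
from math import sqrt, copysign

def zp_from_zu(zu, n):
    """Value of the pooled statistic corresponding to a value zu of the
    unpooled statistic, for equal group sizes n (sign is preserved)."""
    if zu == 0:
        return 0.0
    magnitude = sqrt(2 * n * zu ** 2 / (2 * n + zu ** 2))
    return copysign(magnitude, zu)

# Critical values of the unpooled statistic quoted for n = 10, 20 and 30.
converted = [zp_from_zu(1.96, 10), zp_from_zu(1.85, 20), zp_from_zu(1.77, 30)]
```

To two decimals these agree with the pooled critical values 1.80, 1.78 and 1.73 quoted alongside them, and the converted value is always smaller in magnitude than the original.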

3.6 Power and Sample Sizes

Given the critical values of Table A.1, it is now possible to compute the exact power by (3.4.2) for $\alpha=.025$ and $\alpha=.05$ and various values of $p_1$ and $p_2$. The minimum sample size required per group to attain a power of $1-\beta$ and significance level of $\alpha$ can thus be computed by solving the equation

$$n^* = \min\{n: \pi(p_1,p_2) \ge 1-\beta\}$$

where the critical region that defines $\pi(p_1,p_2)$ is based on $z_u^*$, a function of n.

This equation was solved for $\alpha=.05$ and $1-\beta=.80$ and the results are given in Table A.2 for various combinations of $p_1$ and $p_2$. Table A.2 also contains the critical values $z_u^*$, the size $\alpha^*$ (of precision $\delta=.001$) and the attained power $1-\beta^*$. This table is thus sufficient for both the design and analysis of the 2x2 comparative trial.

Table A.3 compares the results of Table A.2 to the exact conditional test sample sizes [$n_e$] found in Gail and Gart (1973), Haseman (1978) and Casagrande, Pike, and Smith (1978b). Furthermore, the approximate formulae given in section 3.2 are also computed and compared to $n^*$ and $n_e$ in Table A.3. For the configurations considered, it is seen that $n^*$ tends to be smaller than $n_e$, the sample sizes determined by Fisher's exact test. Furthermore, the sample sizes based on the arcsine formula [$n_{as}$] and those based on $Z_p$, the pooled Z-test [$n_p$], tend to agree quite well and to be, in general, slightly smaller than $n^*$. The other formulae discussed in section 3.2, namely $n_c$ and $n_r$, are seen to exceed $n^*$ and $n_e$.

A direct comparison of the critical regions defined by

Fisher's exact test and the exact Z-tests was performed numer-

ically. It showed that, for all the cases considered, the

critical region defined by Fisher's exact test is contained in

the critical region defined by the exact Z-tests. Therefore,

the exact Z-tests are uniformly more powerful than Fisher's

exact test for the cases n=10(1)150 and a=.05 and .025.

CHAPTER 4

THE 2x2 TABLE FOR
CORRELATED PROPORTIONS

4.1 Introduction

When the dichotomous responses for each of two regimens

are sampled in pairs, either by measuring the same experimen-

tal unit under each regimen or by pairing experimental units

with respect to some common characteristic, the problem of

comparing the success rate of these two regimens involves

two correlated proportions. Prior to 1947, this type of data

was incorrectly analyzed as if they were independent binomial

samples. McNemar (1947) derived the variance of the differ-

ence between two correlated binary random variables under the

null hypothesis of equal success rate and consequently, using

an asymptotic approach, derived the well-known "McNemar's

test". This problem, like the independent binomial case,

falls into the realm of testing a hypothesis in the presence

of a nuisance parameter. Analogous to the independent

binomial case (Chapter 3), the most common methods of tack-

ling this problem are based on asymptotic approximations or

on the conditional approach. The problem is formulated as

follows.

Let (R,S) represent a pair of binary random variables with joint distribution

$$P(R=i,S=j) = p_{ij}, \quad i,j=0,1, \quad \sum_i\sum_j p_{ij} = 1.$$

The outcome of a random sample of N such matched pairs is usually displayed in the form of a 2x2 contingency table such as Table 4.1,

Table 4.1

                      S
                 0       1       totals

    R    0       u       x       u+x

         1       y       v       y+v

      totals     u+y     x+v     N

where {u,x,y,v} are the frequencies.

The problem is to test, at level $\alpha$, the null hypothesis $H_0: P(R=1) = P(S=1)$ against one of the alternative hypotheses $H_a: P(R=1) < P(S=1)$, $H_a: P(R=1) > P(S=1)$ or $H_a: P(R=1) \ne P(S=1)$. For the sake of illustration, only the alternative hypothesis $H_a: P(R=1) < P(S=1)$ will be considered. Note that

$$P(R=1) = P(R=1,S=0) + P(R=1,S=1)$$

and

$$P(S=1) = P(R=0,S=1) + P(R=1,S=1)$$

so that the problem becomes that of testing $H_0: p_{01}=p_{10}$ against $H_a: p_{01}>p_{10}$. The likelihood of the sample is given by

$$P(u,x,y,v) = \binom{N}{u\ x\ y\ v}\, p_{00}^u\, p_{01}^x\, p_{10}^y\, p_{11}^v,$$

the quadrinomial distribution with probabilities {$p_{ij}$; i,j=0,1}. Under the null hypothesis $H_0: p_{01}=p_{10}$ ($=p$ say), the likelihood of the sample becomes

$$P(u,x,y,v) = \binom{N}{u\ x\ y\ v}\, p_{00}^u\, p^{x+y}\, (1-p_{00}-2p)^v,$$

a function of the unspecified common proportion p and an unknown probability $p_{00}$.

The problem was first tackled by McNemar (1947) who used

the asymptotic approach of the standardized sufficient

statistic. Cochran (1950), by an intuitive argument, reduced

the problem to a sign test, which is the exact conditional

test obtained by the Neyman-Pearson approach to the elimina-

tion of nuisance parameters. Bennett (1967) has computed the

chi-square goodness-of-fit test statistic and observed that

it coincides with McNemar's test. A point to note about

these asymptotic tests is that they are also conditional in

the sense that they only involve x and y, and not N. It

turns out that they simply evolve from the asymptotic null

distribution of the exact conditional test. No attempts have

been made to compute the size of any of these tests, although

Bennett and Underwood (1970), in assessing the adequacy of

McNemar's test against its continuity-corrected form, have

computed their null power functions for three values of the

nuisance parameter p, namely p=.10, .50 and .90. Beyond this

investigation, researchers have completely relied upon these

asymptotic approximations and the conditional test. It is

surprising that conditional tests were not contested in this

problem, in light of the fact that they are solely based on

the number of discordant pairs x and y, and not at all on the

number of concordant pairs u and v. That these tests do not

involve N could be disturbing. Lehmann (1959:147), discuss-

ing in the context of the sign test with ties, has hinted

that N enters the picture through the parameter p00 when the

unconditional power is computed.

Approximate power calculations and derivations of sample

size formulae were made by Miettinen (1968). Bennett and

Underwood (1970) compared the exact and approximate powers of

McNemar's test and its continuity-corrected form for alterna-

tives close to the null state. Schork and Williams (1980)

tabulated the required sample sizes based on the exact power

function of the conditional test.

In this chapter, McNemar's test and other asymptotic-type tests will be presented. The approximate sample size formulae will also be given. The exact conditional test will

be derived via the Neyman-Pearson approach. The results of

Chapter 2 will then be used in section 4.4 to compute and

tabulate the size of McNemar's test for the one-sided case.

In section 4.5, the exact unconditional critical values

obtained in 4.4 will be used to tabulate the required sample

sizes for a significance level of $\alpha=.05$ and a power of $1-\beta=.80$. It is also shown that this exact unconditional test is uniformly more powerful than the exact conditional sign test for the cases considered, namely $\alpha=.05$, $\alpha=.025$ and N=10(1)200.

4.2 McNemar's Test, Other Asymptotic Tests and Sample Size Formulae

McNemar (1947) derived the mean and variance of S-R

(as defined in Table 4.1) under the null hypothesis and thus

proposed the asymptotic test statistic

$$X^2 = \frac{(x-y)^2}{x+y} \tag{4.2.1}$$

for the two-sided alternative. This statistic has an asymptotic $\chi^2_1$ null distribution. Cochran (1950) reduced the problem to a sign test, using the statistic

$$X^2 = \frac{(x-\tfrac12 n)^2}{\tfrac12 n} + \frac{(y-\tfrac12 n)^2}{\tfrac12 n} = \frac{(x-y)^2}{x+y}$$

where n=x+y, the total number of discordant pairs. Bennett (1967) used the chi-square goodness-of-fit test, applied to the quadrinomial frequencies of Table 4.1, to find

$$X^2 = \frac{(u-N\hat p_{00})^2}{N\hat p_{00}} + \frac{(x-N\hat p_{01})^2}{N\hat p_{01}} + \frac{(y-N\hat p_{10})^2}{N\hat p_{10}} + \frac{(v-N\hat p_{11})^2}{N\hat p_{11}} = \frac{(x-y)^2}{x+y}$$

where the $\hat p_{ij}$'s are the maximum likelihood estimators of the $p_{ij}$'s under $H_0$. The three methods lead to the same test statistic, namely McNemar's, and therefore have the same asymptotic null distribution.
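That the three derivations give the same number can be verified on any table. The sketch below uses the maximum likelihood estimates under the null hypothesis, namely u/N and v/N for the concordant cells and (x+y)/2N for each discordant cell; the function names and the illustrative counts are assumptions.

```python
def mcnemar(x, y):
    """McNemar's statistic based on the discordant counts only."""
    return (x - y) ** 2 / (x + y)

def cochran(x, y):
    """Sign-test form: goodness of fit of (x, y) to equal expected counts n/2."""
    n = x + y
    return (x - n / 2) ** 2 / (n / 2) + (y - n / 2) ** 2 / (n / 2)

def bennett(u, x, y, v):
    """Chi-square goodness of fit over the four cells, with expected counts
    computed from the maximum likelihood estimates under the null hypothesis."""
    expected = [u, (x + y) / 2, (x + y) / 2, v]
    observed = [u, x, y, v]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

# Illustrative table: u = 20, x = 3, y = 9, v = 8.
values = (mcnemar(3, 9), cochran(3, 9), bennett(20, 3, 9, 8))
```

The concordant cells contribute nothing to the goodness-of-fit sum because their expected counts equal their observed counts, which is why only x and y remain.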

To determine the required sample size, Miettinen (1968)

derived two formulas. The first one, based on an approximation to the asymptotic unconditional power function of $X^2$ (McNemar's test statistic) gives, for a one-sided test of significance level $\alpha$ and power $1-\beta$, the required sample size as

$$N_{a1} = \frac{\{\,z_\alpha \psi + z_\beta (\psi^2-\Delta^2)^{1/2}\,\}^2}{\psi\Delta^2} \tag{4.2.2}$$

where $\psi=p_{01}+p_{10}$, $\Delta=p_{10}-p_{01}$ and $z_\gamma$ is the upper $100\gamma$ percentile of the standard normal distribution. The second formula is based on a more precise approximation to the asymptotic unconditional power function of $X^2$ and is given by

$$N_{a2} = \frac{\{\,z_\alpha \psi + z_\beta [\,\psi^2 - \tfrac14\Delta^2(3+\psi)\,]^{1/2}\,\}^2}{\psi\Delta^2} \tag{4.2.3}$$

For the purpose of comparison with exact conditional

and exact unconditional results, these formulas were computed

and are given in Table A.6. These comparisons are discussed

in section 4.5.

4.3 The Exact Conditional Test

The exact conditional test is obtained by the Neyman-

Pearson approach described in Lehmann (1959). The

probability of the sample is given by

$$P(u,x,y,v) = \binom{N}{u\ x\ y\ v}\, p_{00}^u\, p_{01}^x\, p_{10}^y\, p_{11}^v$$

and can be written in the exponential family form as

$$P(u,x,y,v) = \binom{N}{u\ x\ y\ v} \exp\{\,u\log(p_{00}) + x\log(p_{01}) + (N-u-x-v)\log(p_{10}) + v\log(p_{11})\,\}.$$

It can be reparametrized as

$$P(u,x,y,v) = \binom{N}{u\ x\ y\ v} \exp\{\,u\log(p_{00}/p_{10}) + x\log(p_{01}/p_{10}) + v\log(p_{11}/p_{10}) + N\log(p_{10})\,\}.$$

The new parameters are, in the notation of Lehmann (1959),

$$\theta = \log(p_{01}/p_{10}), \qquad \nu = (\,\log(p_{00}/p_{10}),\ \log(p_{11}/p_{10})\,)$$

and the hypothesis to be tested becomes $H_0:\theta=0$ against $H_a:\theta>0$. The sufficient statistics are

$$X = \sum_{i=1}^{N}(1-R_i)S_i, \qquad T = (U,V) = \Big(\sum_{i=1}^{N}(1-R_i)(1-S_i),\ \sum_{i=1}^{N}R_iS_i\Big).$$

Therefore the UMPU test is given by

$$
\phi(x,t) = \begin{cases} 1 & \text{when } x > C(t) \\ \gamma(t) & \text{when } x = C(t) \\ 0 & \text{when } x < C(t) \end{cases}
$$

where C and $\gamma$ are such that

$$E_{H_0}\{\,\phi(X,U,V) \mid U=u, V=v\,\} = \alpha \quad \text{for all } u,v.$$

To find this conditional expectation, first notice that the distribution of (U,V) is

$$P(u,v) = \binom{N}{u\ v\ N-u-v}\, p_{00}^u\, p_{11}^v\, (1-p_{00}-p_{11})^{N-u-v}$$

so that the distribution of X given U=u and V=v is

$$
\begin{aligned}
P(x \mid u,v) &= \binom{n}{x}\, p_{01}^x\, p_{10}^y \,/\, (p_{01}+p_{10})^{N-u-v} \\
&= \binom{n}{x} \left(\frac{p_{01}}{p_{01}+p_{10}}\right)^x \left(\frac{p_{10}}{p_{01}+p_{10}}\right)^{n-x} \\
&= \binom{n}{x}\, p^x (1-p)^{n-x}
\end{aligned}
$$

where n=x+y and $p = p_{01}/(p_{01}+p_{10})$. Therefore, the null hypothesis $H_0:p_{01}=p_{10}$ reduces to $H_0:p=\tfrac12$, the usual sign test problem based only on n, the total number of discordant pairs. Because this conditional distribution of X is discrete, the test $\phi$ needs the randomization element $\gamma$ to become UMPU of level $\alpha$. However, since the practice of using $\gamma$ is rare, the test without randomization will be a conservative one and not UMPU of level $\alpha$.
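The nonrandomized conditional test amounts to an upper binomial tail with success probability one half. The following sketch computes that tail for the one-sided alternative considered here, under which large values of x are extreme; the function name `sign_test_upper_tail` is hypothetical.

```python
from math import comb

def sign_test_upper_tail(x, y):
    """Conditional probability, given n = x + y discordant pairs, of observing
    x or more pairs of the first discordant type when each type is equally likely."""
    n = x + y
    return sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n

# Illustrative discordant counts: x = 9 and y = 3.
tail = sign_test_upper_tail(9, 3)
```

The concordant counts u and v do not enter the computation at all, which is the feature of the conditional test discussed in the introduction to this chapter.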

4.4 An Exact Unconditional Test

As in the case of two independent proportions, the

choice of the test statistic is based on the standardization

of the sufficient statistic for the parameter being tested,

namely $p_{01}-p_{10}$. This statistic is the square root of McNemar's test statistic and is given by

$$Z_c = \frac{x-y}{(x+y)^{1/2}} \tag{4.4.1}$$

where x and y are as in Table 4.1. This statistic is often written in terms of n (=x+y), the total number of discordant pairs, as

$$Z_c = \frac{x-\tfrac12 n}{\tfrac12\sqrt{n}}, \tag{4.4.2}$$

the approximation to the sign test, referred to the standard normal distribution. In this section, the methodology of Chapter 2 is used to compute the size of $Z_c$ and the exact critical values based on $Z_c$. These values will then provide a means of assessing the accuracy of the normal approximation.

The power function of $Z_c$ is given by

$$\Pi(p_{01},p_{10}) = \sum\sum_{(x,y)\in C}\ \sum_u \binom{N}{u\ x\ y\ v}\, p_{00}^u\, p_{01}^x\, p_{10}^y\, p_{11}^v,$$

a function only of $p_{01}$ and $p_{10}$ since it is based on the marginal distribution of (X,Y) which is obtained as

$$
\begin{aligned}
P(x,y) &= \sum_{u=0}^{N-x-y} \binom{N}{u\ x\ y\ v}\, p_{00}^u\, p_{01}^x\, p_{10}^y\, p_{11}^{N-x-y-u} \\
&= \binom{N}{x\ y\ N-x-y}\, p_{01}^x\, p_{10}^y\, (1-p_{01}-p_{10})^{N-x-y}.
\end{aligned}
$$

The power function of $Z_c$ can then be written in terms of n (=x+y) as

$$\Pi(p_{01},p_{10}) = \sum\sum_{(x,y)\in C} \binom{N}{x\ n-x\ N-n}\, p_{01}^x\, p_{10}^{n-x}\, (1-p_{01}-p_{10})^{N-n} \tag{4.4.3}$$

where C is the critical region defined by $Z_c$. Under the null hypothesis $H_0:p_{01}=p_{10}$ ($=\theta$ say), the null power function of $Z_c$ is given by

$$\pi(\theta) = \sum\sum_{(x,n)\in C} \binom{N}{x\ n-x\ N-n}\, \theta^n (1-2\theta)^{N-n}$$

or equivalently, if $p = 2\theta$, by

$$\pi(p) = \sum\sum_{(x,n)\in C} \binom{N}{x\ n-x\ N-n}\, (\tfrac12)^n\, p^n (1-p)^{N-n}, \tag{4.4.4}$$

a function of the nuisance parameter p, the probability of a discordant pair. For the one-sided test of interest, namely for the alternative hypothesis $H_a:p_{01}>p_{10}$, the critical region C defined by $Z_c$ is given by

$$C = \{(x,n): Z_c > z_c;\ x=0(1)n,\ n=0(1)N,\ z_c>0\}. \tag{4.4.5}$$

For an $\alpha$ level test, the critical value of $Z_c$, namely $z_c^*$, satisfies the equation

$$z_c^* = \inf\{z_c: \sup_p \pi(p) \le \alpha\}. \tag{4.4.6}$$

Note that $\pi(p)$, as defined in (4.4.4), is a function of p as well as of $z_c$ through (4.4.5). Since (4.4.4) has the form of (2.1.1), the methodology developed in Chapter 2 can be utilized to solve (4.4.6) and thus find $\alpha^*$, a value at most $\delta$ above $\sup_p \pi(p)$.

First, to simplify the computation of (4.4.4), notice that the inequality $Z_c > z_c$ is in fact

$$x > \tfrac12\,(z_c\sqrt{n} + n)$$

so that the critical region C reduces to

$$C = \{(x,n): x > h(n);\ x=0(1)n,\ n=0(1)N\},$$

where $h(n) = \tfrac12\{z_c\sqrt{n} + n\}$. The null power function (4.4.4) becomes

$$
\begin{aligned}
\pi(p) &= \sum_{n=0}^{N} \sum_{x>h(n)} \binom{N}{n}\binom{n}{x} (\tfrac12)^n\, p^n (1-p)^{N-n} \\
&= \sum_{n=0}^{N} \binom{N}{n} p^n (1-p)^{N-n} \sum_{x>h(n)} \binom{n}{x} (\tfrac12)^n \\
&= \sum_{n=k}^{N} \binom{N}{n} p^n (1-p)^{N-n}\,[\,1 - F_n(i_n)\,]
\end{aligned}
$$

where $k = \mathrm{int}[z_c^2 + 1]$, $i_n = \mathrm{int}[h(n)]$, int[.] is the integer function and $F_n(.)$ is the binomial cumulative distribution with parameters $(n,\tfrac12)$. Notice that, since $i_n \ge \tfrac12 n$, it is more efficient to compute

$$\sum_{x=i_n+1}^{n} \binom{n}{x} (\tfrac12)^n$$

instead of $1-F_n(i_n)$. Then, by the symmetry of the binomial distribution with $(n,\tfrac12)$, the null power function of $Z_c$ can be rewritten as

$$\pi(p) = \sum_{n=k}^{N} \binom{N}{n} p^n (1-p)^{N-n}\, F_n(n-i_n-1), \tag{4.4.7}$$

in the form of (2.1.1). The derivative of the null power function is

$$\pi'(p) = \sum_{n=k}^{N} F_n(n-i_n-1) \binom{N}{n} [\,n p^{n-1}(1-p)^{N-n} - (N-n)\, p^n (1-p)^{N-n-1}\,]$$

so that the methodology developed in Chapter 2 can now be utilized to find $\alpha^*$, the size (of precision $\delta=.001$) of $Z_c$ for any value $z_c$.
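The reduced form of the null power function can be compared with a direct summation over the critical region. The sketch below is illustrative only: the function names are hypothetical, the case of no discordant pairs is excluded from the critical region, and the cumulative distribution is computed by brute force.

```python
from math import comb, sqrt

def null_power_direct(N, z, p):
    """Sum of the null probabilities of all (x, n) with (2x - n)/sqrt(n) > z."""
    total = 0.0
    for n in range(1, N + 1):                 # n = 0 can never lead to rejection
        weight = comb(N, n) * p ** n * (1 - p) ** (N - n)
        for x in range(n + 1):
            if (2 * x - n) / sqrt(n) > z:     # same statistic as (x - n/2)/(sqrt(n)/2)
                total += weight * comb(n, x) * 0.5 ** n
    return total

def binom_half_cdf(n, m):
    """Cumulative distribution at m of a binomial with parameters (n, 1/2)."""
    if m < 0:
        return 0.0
    return sum(comb(n, j) for j in range(m + 1)) / 2 ** n

def null_power_reduced(N, z, p):
    """Single summation over n using the symmetry of the binomial (n, 1/2)."""
    k = int(z * z + 1)                        # smallest n for which rejection is possible
    total = 0.0
    for n in range(k, N + 1):
        i_n = int((z * sqrt(n) + n) / 2)      # integer part of h(n)
        total += comb(N, n) * p ** n * (1 - p) ** (N - n) * binom_half_cdf(n, n - i_n - 1)
    return total

# Illustrative check for N = 10 matched pairs and a critical value of 1.90.
direct = null_power_direct(10, 1.90, 0.5)
reduced = null_power_reduced(10, 1.90, 0.5)
```

The two values should coincide up to rounding error; evaluating the reduced form over a range of p gives a picture of the null power function for a given critical value.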

In this problem, the size was taken as $\sup\{\pi(p): 0<p<.99\}$ because of the behaviour of $\pi(p)$ when p approaches 1. The null power function is dominated by the last term of the summation, namely $p^N F_N(N-i_N-1)$, when p tends to 1. From the practical point of view, the fact that p tends to 1 implies that almost no concordant pairs will be observed, so that the problem virtually reduces to a problem with no nuisance parameter, namely the sign test with no ties. It then seems reasonable to compute the size on the interval $p \in (0,.99)$.

For a test of significance of level $\alpha$, the critical value $z_c^*$ can then be obtained by equation (4.4.6). Because $Z_c$ is asymptotically normal, the $100\alpha$ upper percentile point of the standard normal distribution can be used as a starting value of $z_c$. This value is then incremented until $\alpha^* \le \alpha$, and $z_c^*$ is taken as the smallest value of $z_c$ which satisfies $\alpha^* \le \alpha$. The FORTRAN computer program used to implement this procedure is given in Appendix C.2.

For N=10(1)200 and $\alpha=.05$ and .025, $z_c^*$, the exact critical values, and $\alpha^*$, the size (of precision $\delta=.001$) of $Z_c$, were computed and are given in Table A.4. Furthermore, Table A.4 also contains $\alpha_1$, a lower bound for {$\pi(p)$: p > 5/N} and $\alpha_2$, a lower bound for {$\pi(p)$: p > 10/N} as indicators of the stability of the null power function. These lower bounds on p are obtained for expected number of discordant pairs of at least 5 and 10 respectively.

The null power function, $\pi(p)$, of the exact conditional test, as well as of the exact and approximate unconditional tests, was plotted for some values of N and a nominal level of significance of $\alpha=.05$. For N=10, Figure B.7 contains the plot of $\pi(p)$ based on the normal approximation of $Z_c$ (critical value $z_c=1.645$). It is apparent that using the normal approximation in this case induces a liberal test for that range of the nuisance parameter p where the null power function exceeds the nominal level $\alpha=.05$, namely .30 < p < .74. Figure B.8 is the plot based on the unconditional critical region defined by the exact conditional test, namely the sign test. Here, the test is very conservative, its actual size being approximately .013. In Figure B.9, the exact unconditional critical value $z_c^*=1.90$ of Table A.4 is used to plot $\pi(p)$. From these plots, the exact Z-test (Figure B.9) is seen to perform best at approaching the nominal significance level without exceeding it, although, because of the sparsity of its natural levels, its size is only .0265.

The null power function of the exact unconditional test $Z_c$ is seen to behave better for larger values of N. For N=30, Figure B.10 is the plot of $\pi(p)$ based on $z_c^*=1.74$ and Figure B.11 is the plot of $\pi(p)$ based on $z_c^*=1.68$ and N=40.

4.5 Power and Sample Sizes

Now that the exact critical values of $Z_c$ have been computed (Table A.4), the exact power in (4.4.3) can be readily obtained for $\alpha$=.025, $\alpha$=.05, N=10(1)200 and various values of $p_{01}$ and $p_{10}$. Consequently, the minimum sample size required to achieve a power of $1-\beta$ and a significance level of $\alpha$ for a combination of $(p_{01},p_{10})$ can be computed by solving the equation

$$N^* = \min \{\, N : \Pi(p_{01},p_{10}) \ge 1-\beta \,\} \qquad (4.5.1)$$

where the critical region that defines $\Pi(p_{01},p_{10})$ is based on $z_c^*$, a function of N.
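As an illustration of (4.5.1), the following Python sketch computes the exact power by summing trinomial probabilities over the critical region and then scans N upward. It is a sketch under stated assumptions rather than the program used for Table A.5: the statistic is taken to be $Z_c = (n_{10}-n_{01})/\sqrt{n_{10}+n_{01}}$, the function names are hypothetical, and the critical values are supplied by the caller as a dictionary `crit` mapping N to $z_c^*$ (for example, read from Table A.4).

```python
from math import comb, sqrt


def exact_power(N, z, p10, p01):
    """P(Z_c >= z) when (n10, n01, n00) is trinomial with parameters (N; p10, p01, p00)."""
    p00 = 1.0 - p10 - p01
    total = 0.0
    for n10 in range(N + 1):
        for n01 in range(N - n10 + 1):
            d = n10 + n01  # number of discordant pairs
            if d == 0 or (n10 - n01) / sqrt(d) < z:
                continue  # outside the critical region
            n00 = N - d
            total += (
                comb(N, n10) * comb(N - n10, n01)
                * p10 ** n10 * p01 ** n01 * p00 ** n00
            )
    return total


def min_sample_size(p10, p01, crit, power=0.80):
    """Smallest N in `crit` whose exact power reaches `power`, or None if none does."""
    for N in sorted(crit):
        if exact_power(N, crit[N], p10, p01) >= power:
            return N
    return None
```

With $p_{10}$=.63 and $p_{01}$=.03 (that is, a difference of .60 and a sum of .66) and the Table A.4 critical values 1.90 for N=10 and 1.74 for N=11, `min_sample_size(0.63, 0.03, {10: 1.90, 11: 1.74})` returns 11 with an attained power near .8935, matching the last row of Table A.5.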

Because all other sample size results are given in terms of the parameters $\psi = p_{01}+p_{10}$ and $\Delta = p_{10}-p_{01}$, equation (4.5.1) was solved in terms of these parameters for the purpose of comparability. For $\alpha$=.05 and $1-\beta$=.80, and various combinations of $\psi$ and $\Delta$, the minimum sample sizes from (4.5.1) are given in Table A.5. This table also contains the critical values $z_c^*$, the size $\alpha^*$ (of precision $\delta$=.001) of $Z_c$ and the attained power $1-\beta^*$. Therefore, Table A.5 is sufficient for both the design and the analysis of the 2x2 table for comparing two correlated proportions.

In Table A.6, the exact unconditional sample sizes [$N^*$] of Table A.5 are compared to the exact conditional sample sizes [$N_e$] found in Schork and Williams (1980). Furthermore, the approximate formulae derived by Miettinen (1968), namely $N_{a1}$ and $N_{a2}$ of section 4.2, are computed and also compared to $N^*$ and $N_e$ in Table A.6. The exact unconditional sample sizes $N^*$ are seen to be smaller than $N_e$, the sample sizes based on the exact conditional test, for all except some combinations of $\psi$ and $\Delta$. This seems to happen for larger values of $\psi$ and $\Delta$. The approximate sample sizes $N_{a1}$ and $N_{a2}$ are almost equal to each other, much smaller than $N_e$ and slightly smaller than $N^*$.

Because these results suggest that the exact unconditional test might be more powerful than the exact conditional test, the critical regions of the two tests were compared numerically. This comparison showed that, for all the cases considered, the critical region defined by the exact conditional test (sign test) is contained in the critical region defined by the exact Z-test. Therefore, the exact Z-test is uniformly more powerful than the exact conditional test for the cases considered, namely N=10(1)200, $\alpha$=.025 and $\alpha$=.05.

APPENDIX A

TABLES

These tables contain critical values and sample size

determinations for the problems of comparing two independent

proportions and of comparing two correlated proportions.

For one-sided tests, the tables of critical values are produced for significance levels $\alpha$=.05 and .025, and the sample size tables for a level of $\alpha$=.05 and 80% power. The legend for these tables is given below.

Legend for Tables A.1, A.2 and A.3: two independent proportions

$n$ = sample size in each group
$\alpha$ = nominal significance level
$\alpha_1$ = lower bound for {$\pi(p)$: .05
$\alpha_2$ = lower bound for {$\pi(p)$: .10
$\pi(p)$ = null power function
$z_u^*$ = exact one-sided critical values of $Z_u$, the Z-test with unpooled variance estimator
$z_p^*$ = exact one-sided critical values of $Z_p$, the Z-test with pooled variance estimator
$p_i$ = probability of success in group i, i=1,2
$n^*$ = sample size determined by exact Z-tests
$\alpha^*$ = size of Z-tests for $n^*$ and $z_u^*$ or $z_p^*$
$1-\beta^*$ = attained power for $n^*$, $z_u^*$ or $z_p^*$, $p_1$ and $p_2$

The following are the sample sizes of Table A.2 as determined by:

$n_e$ = Fisher's exact test, Casagrande et al. (1978b)
$n_c$ = the corrected $\chi^2$ approximation (3.2.6)
$n_r$ = the recorrected $\chi^2$ approximation (3.2.7)
$n_p$ = the uncorrected $\chi^2$ approximation (3.2.4)
$n_{as}$ = the arcsine formula (3.2.5)
$n^*$ = the exact Z-tests.

Legend for Tables A.4, A.5 and A.6: two correlated proportions

$N$ = sample size (number of matched pairs)
$\alpha$ = nominal significance level
$\alpha_1$ = lower bound for {$\pi(p)$: p > 5/N}
$\alpha_2$ = lower bound for {$\pi(p)$: p > 10/N}
$\pi(p)$ = null power function
$z_c^*$ = exact one-sided critical values of $Z_c$, the Z-test
$\Delta$ = $p_{10}-p_{01}$
$\psi$ = $p_{10}+p_{01}$
$p_{01}$ = P(R=0,S=1) (see section 4.1)
$p_{10}$ = P(R=1,S=0) (see section 4.1)
$N^*$ = sample size determined by the exact Z-test
$\alpha^*$ = size of Z-test for $N^*$ and $z_c^*$
$1-\beta^*$ = attained power for $N^*$, $z_c^*$, $\Delta$ and $\psi$

The following are the sample sizes of Table A.5 as determined by:

$N_{a1}$ = McNemar's test, first approximation (4.2.2)
$N_{a2}$ = McNemar's test, second approximation (4.2.3)
$N_e$ = the exact conditional test (sign test), Schork and Williams (1980)
$N^*$ = the exact Z-test.

Table A.1 Critical Values and Sizes of Z-tests for
Comparing Two Independent Proportions.

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$ $z_p^*$

.0068
.0086
.0105
.0125
.0146
.0168
.0188
.0209
.0230
.0251

.0271
.0290
.0308
.0325
.0341
.0342
.0361
.0369
.0386
.0393

.0395
.0398
.0387
.0395
.0404
.0376
.0380
.0398
.0408
.0409

.0410
.0411
.0406
.0411
.0412
.0396
.0383
.0395
.0404
.0417

.0251
.0292
.0329
.0363
.0394
.0422
.0289
.0308
.0335
.0368

.0339
.0351
.0360
.0367
.0373
.0342
.0361
.0369
.0386
.0393

.0395
.0398
.0387
.0395
.0405
.0376
.0380
.0398
.0414
.0410

.0419
.0422
.0406
.0413
.0420
.0396
.0383
.0395
.0404
.0417

.0476
.0504
.0471
.0484
.0495
.0505
.0421
.0426
.0430
.0435

.0438
.0445
.0484
.0447
.0458
.0448
.0449
.0450
.0458
.0458

.0467
.0494
.0462
.0476
.0493
.0476
.0467
.0467
.0476
.0469

.0470
.0491
.0481
.0487
.0498
.0492
.0482
.0492
.0478
.0486

1.96
1.84
1.86
1.81
1.77
1.74
1.92
1.90
1.88
1.86

1.85
1.84
1.83
1.84
1.81
1.80
1.79
1.79
1.78
1.78

1.77
1.77
1.80
1.77
1.75
1.75
1.75
1.74
1.74
1.74

1.73
1.73
1.78
1.76
1.74
1.73
1.73
1.72
1.72
1.72

1.80
1.72
1.74
1.71
1.68
1.66
1.82
1.81
1.80
1.78

1.78
1.77
1.77
1.78
1.76
1.75
1.74
1.74
1.74
1.74

1.73
1.73
1.76
1.73
1.72
1.72
1.72
1.71
1.71
1.71

1.70
1.70
1.75
1.73
1.71
1.71
1.71
1.70
1.70
1.70

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$ $z_p^*$

.0005
.0007
.0010
.0014
.0018
.0024
.0030
.0036
.0042
.0049

.0057
.0064
.0071
.0078
.0086
.0094
.0102
.0109
.0117
.0124

.0034
.0138
.0145
.0151
.0157
.0163
.0168
.0173
.0063
.0169

.0186
.0186
.0193
.0196
.0199
.0202
.0204
.0176
.0097
.0176

.0044
.0058
.0073
.0089
.0106
.0122
.0137
.0152
.0166
.0174

.0192
.0183
.0175
.0179
.0178
.0184
.0183
.0191
.0194
.0195

.0128
.0198
.0187
.0193
.0193
.0204
.0206
.0208
.0141
.0169

.0203
.0186
.0193
.0197
.0208
.0217
.0215
.0177
.0174
.0180

.0212
.0208
.0225
.0200
.0209
.0218
.0252
.0233
.0241
.0246

.0252
.0251
.0245
.0239
.0222
.0233
.0217
.0225
.0233
.0242

.0219
.0243
.0236
.0245
.0238
.0243
.0245
.0249
.0225
.0233

.0242
.0237
.0244
.0241
.0243
.0249
.0247
.0251
.0233
.0241

2.17
2.40
2.26
2.26
2.19
2.14
2.10
2.21
2.14
2.14

2.10
2.17
2.17
2.17
2.15
2.13
2.12
2.11
2.10
2.09

2.15
2.11
2.09
2.07
2.06
2.06
2.06
2.05
2.13
2.10

2.08
2.07
2.05
2.05
2.04
2.03
2.03
2.06
2.09
2.07

1.96
2.14
2.06
2.07
2.03
2.00
1.97
2.07
2.02
2.03

2.00
2.06
2.07
2.07
2.06
2.04
2.04
2.03
2.03
2.02

2.08
2.04
2.03
2.01
2.00
2.00
2.00
2.00
2.07
2.05

2.03
2.02
2.00
2.00
2.00
1.99
1.99
2.02
2.05
2.03

Table A.1 -- continued

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$ $z_p^*$

.0419
.0420
.0435
.0440
.0394
.0396
.0413
.0398
.0389
.0398

.0405
.0416
.0409
.0428
.0406
.0406
.0407
.0327
.0406
.0383

.0387
.0392
.0393
.0400
.0409
.0412
.0412
.0413
.0414
.0415

.0415
.0430
.0327
.0327
.0396
.0398
.0393
.0399
.0403
.0410

.0429
.0426
.0435
.0444
.0394
.0396
.0413
.0398
.0389
.0398

.0405
.0416
.0409
.0428
.0406
.0406
.0407
.0378
.0406
.0383

.0387
.0392
.0393
.0400
.0409
.0412
.0417
.0418
.0419
.0420

.0437
.0430
.0378
.0378
.0396
.0398
.0393
.0399
.0403
.0410

.0494
.0487
.0489
.0499
.0481
.0482
.0498
.0495
.0494
.0499

.0500
.0501
.0502
.0503
.0485
.0486
.0491
.0473
.0483
.0483

.0487
.0489
.0486
.0485
.0486
.0486
.0487
.0488
.0489
.0489

.0490
.0496
.0480
.0480
.0491
.0491
.0491
.0496
.0492
.0494

1.71
1.72
1.71
1.71
1.76
1.75
1.73
1.72
1.72
1.71

1.71
1.70
1.72
1.70
1.72
1.72
1.72
1.75
1.74
1.73

1.72
1.71
1.71
1.71
1.71
1.71
1.70
1.70
1.70
1.70

1.70
1.70
1.74
1.73
1.72
1.71
1.71
1.70
1.70
1.69

1.69
1.70
1.69
1.69
1.74
1.73
1.71
1.70
1.70
1.69

1.69
1.68
1.70
1.68
1.70
1.70
1.70
1.73
1.72
1.72

1.71
1.70
1.70
1.70
1.70
1.70
1.69
1.69
1.69
1.69

1.69
1.69
1.73
1.72
1.71
1.70
1.70
1.69
1.69
1.68

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$ $z_p^*$

.0177
.0177
.0178
.0178
.0178
.0178
.0178
.0178
.0126
.0178

.0178
.0178
.0178
.0178
.0178
.0178
.0178
.0179
.0179
.0147

.0152
.0179
.0197
.0193
.0194
.0198
.0198
.0198
.0198
.0198

.0198
.0161
.0162
.0160
.0194
.0196
.0191
.0195
.0200
.0200

.0181
.0182
.0184
.0185
.0187
.0189
.0191
.0192
.0180
.0196

.0207
.0198
.0194
.0197
.0204
.0208
.0207
.0211
.0212
.0175

.0195
.0196
.0199
.0193
.0194
.0200
.0204
.0210
.0211
.0210

.0213
.0186
.0187
.0176
.0194
.0196
.0191
.0195
.0200
.0205

.0248
.0245
.0242
.0247
.0236
.0249
.0246
.0248
.0222
.0236

.0250
.0249
.0247
.0246
.0249
.0249
.0239
.0240
.0247
.0231

.0243
.0242
.0247
.0248
.0249
.0251
.0250
.0251
.0252
.0238

.0246
.0243
.0240
.0240
.0244
.0249
.0242
.0246
.0243
.0248

2.05
2.04
2.04
2.04
2.04
2.03
2.03
2.03
2.09
2.07

2.05
2.04
2.03
2.03
2.02
2.03
2.04
2.03
2.03
2.08

2.06
2.05
2.04
2.03
2.03
2.02
2.02
2.02
2.02
2.03

2.03
2.07
2.06
2.05
2.04
2.03
2.03
2.03
2.03
2.02

2.01
2.00
2.00
2.01
2.01
2.00
2.00
2.00
2.06
2.04

2.02
2.01
2.00
2.00
1.99
2.00
2.01
2.00
2.00
2.05

2.03
2.02
2.01
2.01
2.01
2.00
2.00
2.00
2.00
2.01

2.01
2.05
2.04
2.03
2.02
2.01
2.01
2.01
2.01
2.00

Table A.1 -- continued

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$

n

90
91
92
93
94
95
96
97
98
99

100
101
102
103
104
105
106
107
108
109

110
111
112
113
114
115
116
117
118
119

120
121
122
123
124
125
126
127
128
129

.0418
.0423
.0425
.0431
.0436
.0442
.0440
.0445
.0450
.0329

.0382
.0382
.0382
.0382
.0382
.0407
.0413
.0419
.0421
.0422

.0427
.0426
.0437
.0435
.0439
.0443
.0391
.0380
.0392
.0402

.0396
.0405
.0405
.0407
.0411
.0418
.0419
.0426
.0427
.0424

.0418
.0423
.0425
.0431
.0436
.0442
.0440
.0445
.0450
.0389

.0391
.0392
.0393
.0393
.0394
.0407
.0413
.0419
.0421
.0422

.0427
.0426
.0437
.0435
.0439
.0443
.0400
.0380
.0398
.0402

.0396
.0405
.0405
.0407
.0411
.0418
.0419
.0426
.0428
.0424

.0504
.0496
.0494
.0495
.0500
.0503
.0499
.0501
.0504
.0486

.0504
.0491
.0503
.0496
.0493
.0495
.0495
.0495
.0495
.0495

.0495
.0496
.0497
.0495
.0496
.0499
.0484
.0484
.0486
.0500

.0494
.0520
.0502
.0497
.0497
.0502
.0496
.0500
.0497
.0497

1.69
1.69
1.69
1.69
1.68
1.68
1.68
1.68
1.68
1.72

1.71
1.71
1.70
1.70
1.70
1.69
1.69
1.69
1.69
1.69

1.69
1.69
1.68
1.68
1.68
1.68
1.72
1.72
1.71
1.70

1.70
1.69
1.69
1.69
1.69
1.68
1.69
1.68
1.68
1.69

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$ $z_p^*$

1.68
1.68
1.68
1.68
1.67
1.67
1.67
1.67
1.67
1.71

1.70
1.70
1.69
1.69
1.69
1.68
1.68
1.68
1.68
1.68

1.68
1.68
1.67
1.67
1.67
1.67
1.71
1.71
1.70
1.69

1.69
1.68
1.68
1.68
1.68
1.67
1.68
1.67
1.67
1.68

.0200
.0201
.0201
.0202
.0170
.0171
.0171
.0153
.0191
.0153

.0191
.0192
.0192
.0192
.0193
.0193
.0211
.0212
.0176
.0183

.0184
.0185
.0182
.0182
.0182
.0182
.0199
.0203
.0205
.0206

.0209
.0209
.0212
.0215
.0213
.0181
.0187
.0185
.0186
.0186

.0207
.0211
.0208
.0209
.0184
.0185
.0186
.0185
.0191
.0186

.0191
.0192
.0192
.0192
.0193
.0193
.0211
.0212
.0176
.0183

.0184
.0192
.0193
.0189
.0194
.0196
.0199
.0203
.0205
.0206

.0209
.0209
.0212
.0215
.0213
.0181
.0187
.0193
.0200
.0193

.0249
.0250
.0242
.0245
.0228
.0232
.0236
.0244
.0249
.0244

.0247
.0246
.0244
.0247
.0250
.0249
.0250
.0245
.0225
.0232

.0236
.0243
.0247
.0243
.0249
.0248
.0251
.0250
.0248
.0245

.0247
.0247
.0247
.0250
.0248
.0232
.0239
.0243
.0250
.0249

2.02
2.01
2.02
2.02
2.07
2.06
2.05
2.04
2.03
2.03

2.02
2.02
2.02
2.02
2.01
2.01
2.01
2.01
2.07
2.06

2.05
2.04
2.03
2.03
2.02
2.02
2.01
2.01
2.01
2.01

2.01
2.01
2.01
2.00
2.01
2.05
2.04
2.03
2.02
2.02

2.00
1.99
2.00
2.00
2.05
2.04
2.03
2.02
2.01
2.01

2.00
2.00
2.00
2.00
1.99
1.99
2.00
2.00
2.05
2.04

2.03
2.03
2.02
2.02
2.01
2.01
2.00
2.00
2.00
2.00

2.00
2.00
2.00
1.99
2.00
2.04
2.03
2.02
2.01
2.01

Table A.1 -- continued

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$ $z_p^*$

n

130
131
132
133
134
135
136
137
138
139

140
141
142
143
144
145
146
147
148
149
150

.0432
.0431
.0435
.0438
.0436
.0383
.0402
.0388
.0403
.0394

.0395
.0402
.0406
.0409
.0413
.0415
.0422
.0423
.0427
.0427
.0427

.0497
.0497
.0498
.0499
.0500
.0486
.0486
.0485
.0499
.0500

.0499
.0500
.0500
.0501
.0500
.0500
.0501
.0498
.0500
.0502
.0501

1.68
1.68
1.68
1.68
1.68
1.72
1.71
1.71
1.70
1.70

1.70
1.69
1.69
1.69
1.69
1.69
1.68
1.68
1.68
1.68
1.68

1.68
1.68
1.68
1.68
1.68
1.71
1.70
1.70
1.70
1.70

1.70
1.69
1.69
1.69
1.69
1.69
1.68
1.68
1.68
1.68
1.68

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_u^*$ $z_p^*$

.0187
.0187
.0187
.0188
.0188
.0189
.0189
.0189
.0190
.0190

.0191
.0191
.0185
.0191
.0192
.0191
.0192
.0194
.0194
.0195
.0195

.0193
.0197
.0198
.0199
.0204
.0205
.0208
.0208
.0207
.0210

.0208
.0210
.0185
.0191
.0203
.0191
.0192
.0194
.0197
.0199
.0203

.0243
.0251
.0242
.0241
.0246
.0244
.0246
.0244
.0245
.0244

.0245
.0249
.0236
.0243
.0251
.0246
.0247
.0242
.0246
.0246
.0252

2.02
2.01
2.02
2.02
2.01
2.01
2.01
2.01
2.01
2.01

2.01
2.01
2.04
2.03
2.02
2.02
2.02
2.02
2.01
2.01
2.00

2.01
2.00
2.01
2.01
2.00
2.00
2.00
2.00
2.00
2.00

2.00
2.00
2.03
2.02
2.01
2.01
2.01
2.01
2.00
2.00
1.99

.0427
.0427
.0427
.0427
.0428
.0383
.0402
.0388
.0403
.0394

.0395
.0402
.0406
.0409
.0413
.0415
.0422
.0423
.0427
.0427
.0427

Table A.2 Minimum Sample Sizes to Achieve 80% Power
and $\alpha \le .05$ for One-sided Z-tests for
Comparing Two Independent Proportions.

$p_1$ $p_2$ $n^*$ $z_u^*$ $z_p^*$ $\alpha^*$ $1-\beta^*$

.05 .15 107 1.69 1.68 .0495 .8009
.20 56 1.73 1.71 .0498 .8016
.25 38 1.74 1.71 .0476 .8098
.30 28 1.78 1.74 .0458 .8095
.35 22 1.83 1.77 .0484 .8095
.40 18 1.88 1.80 .0430 .8190
.45 13 1.81 1.71 .0484 .8142

.10 .25 79 1.70 1.69 .0489 .8026
.30 49 1.72 1.70 .0486 .8071
.35 35 1.75 1.72 .0476 .8063
.40 26 1.79 1.74 .0449 .8088
.45 21 1.84 1.77 .0445 .8057
.50 17 1.90 1.81 .0426 .8213
.55 13 1.81 1.71 .0484 .8016
.60 10 1.96 1.80 .0476 .8016

.15 .30 95 1.68 1.67 .0503 .8023
.35 59 1.71 1.69 .0499 .8115
.40 40 1.73 1.70 .0470 .8077
.45 29 1.78 1.74 .0458 .8001
.50 23 1.84 1.78 .0447 .8134
.55 18 1.88 1.80 .0430 .8033
.60 14 1.77 1.68 .0495 .8016
.65 13 1.81 1.71 .0484 .8366

.20 .35 111 1.69 1.68 .0496 .8005
.40 68 1.74 1.72 .0483 .8038
.45 44 1.74 1.71 .0498 .8017
.50 32 1.80 1.76 .0462 .8060
.55 26 1.79 1.74 .0449 .8298
.60 20 1.85 1.78 .0438 .8153
.65 15 1.74 1.66 .0505 .8273
.70 13 1.81 1.71 .0484 .8239

.25 .40 123 1.69 1.68 .0497 .8017
.45 71 1.71 1.70 .0489 .8017
.50 48 1.72 1.70 .0478 .8061
.55 33 1.77 1.73 .0476 .8051
.60 26 1.79 1.74 .0449 .8006
.65 20 1.85 1.78 .0438 .8091
.70 15 1.74 1.66 .0505 .8232
.75 13 1.81 1.71 .0484 .8205

Table A.2 -- continued

$p_1$ $p_2$ $n^*$ $z_u^*$ $z_p^*$ $\alpha^*$ $1-\beta^*$

.30 .45 132 1.68 1.68 .0498 .8015
.50 77 1.70 1.69 .0488 .8021
.55 50 1.71 1.69 .0494 .8029
.60 37 1.74 1.71 .0467 .8145
.65 27 1.79 1.74 .0450 .8064
.70 20 1.85 1.78 .0438 .8075

.35 .50 136 1.71 1.70 .0486 .8016
.55 79 1.70 1.69 .0489 .8070
.60 51 1.72 1.70 .0487 .8085
.65 37 1.74 1.71 .0467 .8111

.40 .55 144 1.69 1.69 .0500 .8021
.60 79 1.70 1.69 .0489 .8057

Table A.3 Comparison of Sample Sizes to Achieve 80%
Power and $\alpha$=.05 for One-sided Tests for
Comparing Two Independent Proportions.

$p_1$ $p_2$ $n_e$ $n_c$ $n_r$ $n_p$ $n_{as}$ $n^*$

.05 .15 126 148 130 111 105 107
.20 67 84 72 59 55 56
.25 45 57 48 39 35 38
.30 34 42 36 28 25 28
.35 25 33 28 21 19 22
.40 20 27 22 17 15 18
.45 17 23 19 14 12 13

.10 .25 89 104 92 79 76 79
.30 56 67 58 49 47 49
.35 39 49 42 34 32 35
.40 30 37 31 25 24 26
.45 24 30 25 20 19 21
.50 19 25 20 16 15 17
.55 16 21 17 13 12 13
.60 13 18 14 11 10 10

.15 .30 106 120 108 95 94 95
.35 65 76 67 57 56 59
.40 46 54 46 39 38 40
.45 34 40 35 28 28 29
.50 26 32 27 22 21 23
.55 22 26 22 17 17 18
.60 17 22 18 14 13 14
.65 15 18 15 11 11 13

.20 .35 121 134 122 109 108 111
.40 73 83 74 64 64 68
.45 49 58 50 43 42 44
.50 36 43 37 31 30 32
.55 27 34 28 23 23 26
.60 23 27 23 18 18 20
.65 17 22 18 14 14 15
.70 15 19 15 12 12 13

.25 .40 132 145 133 120 119 123
.45 78 89 79 70 69 71
.50 54 61 53 46 46 48
.55 37 45 39 32 32 33
.60 30 35 30 24 24 26
.65 23 28 23 19 19 20
.70 18 23 19 15 15 15
.75 15 19 15 12 12 13

Table A.3 -- continued

p1    p2     n_e   n_c   n_r   n_p   n_as   n*

.30   .45                                   132
      .50                                    77
      .55                                    50
      .60                                    37
      .65                                    27
      .70                                    20

.35   .50                                   136
      .55                                    79
      .60                                    51
      .65                                    37

.40   .55    144   162   149   136   136
      .60     85    96    86    77    77

Table A.4 Critical Values and Sizes of Z-test for
Comparing Two Correlated Proportions.

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0114 .0265
.0322 .0345 .0395
.0205 .0205 .0373
.0316 .0328 .0430
.0303 .0303 .0371
.0187 .0187 .0370
.0312 .0349 .0454
.0261 .0261 .0413
.0313 .0353 .0446
.0307 .0333 .0399

.0220 .0220 .0395
.0310 .0356 .0459
.0278 .0278 .0425
.0309 .0357 .0427
.0307 .0332 .0412
.0305 .0357 .0494
.0301 .0357 .0434
.0275 .0275 .0413
.0307 .0349 .0407
.0302 .0315 .0407

.0303 .0357 .0450
.0303 .0357 .0407
.0262 .0262 .0407
.0301 .0357 .0458
.0299 .0299 .0435
.0298 .0356 .0429
.0304 .0327 .0425
.0302 .0356 .0449
.0299 .0355 .0419
.0400 .0411 .0501

.0399 .0400 .0500
.0305 .0305 .0500
.0303 .0354 .0433
.0298 .0323 .0428
.0375 .0375 .0500
.0352 .0352 .0501
.0395 .0395 .0501
.0382 .0382 .0501
.0290 .0300 .0427
.0396 .0408 .0501

1.90
1.74
1.74
1.74
1.74
1.81
1.74
1.74
1.74
1.74

1.79
1.74
1.74
1.74
1.74
1.74
1.74
1.74
1.74
1.74

1.74
1.74
1.77
1.74
1.74
1.74
1.74
1.74
1.74
1.68

1.68
1.72
1.74
1.74
1.68
1.68
1.68
1.68
1.74
1.70

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0114 .0208
.0062 .0062 .0197
.0133 .0204 .0233
.0120 .0120 .0213
.0069 .0069 .0126
.0127 .0186 .0211
.0114 .0114 .0209
.0125 .0175 .0225
.0128 .0163 .0208
.0103 .0103 .0208

.0126 .0187 .0246
.0142 .0142 .0251
.0200 .0200 .0249
.0182 .0182 .0247
.0121 .0121 .0213
.0124 .0166 .0211
.0154 .0154 .0246
.0204 .0204 .0246
.0185 .0185 .0246
.0120 .0128 .0207

.0126 .0166 .0207
.0155 .0155 .0246
.0196 .0196 .0246
.0179 .0179 .0246
.0125 .0128 .0207
.0208 .0208 .0249
.0123 .0151 .0226
.0127 .0182 .0226
.0119 .0168 .0226
.0123 .0192 .0242

.0192 .0192 .0250
.0120 .0143 .0226
.0208 .0208 .0250
.0163 .0163 .0247
.0186 .0186 .0247
.0175 .0175 .0248
.0198 .0198 .0248
.0202 .0202 .0248
.0115 .0150 .0229
.0118 .0167 .0228

2.01
2.12
2.01
2.01
2.14
2.01
2.01
2.01
2.01
2.07

2.01
1.97
1.97
1.97
2.05
2.05
1.97
1.97
1.97
2.05

2.05
1.98
1.98
1.98
2.06
1.98
2.01
2.01
2.01
2.01

1.98
2.04
1.98
1.99
1.99
1.99
1.99
1.99
2.03
2.03

Table A.4 -- continued

$\alpha$ = .05

N $\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0325 .0325 .0500
.0352 .0352 .0500
.0333 .0333 .0500
.0365 .0365 .0500
.0359 .0359 .0501
.0387 .0387 .0501
.0299 .0360 .0453
.0309 .0309 .0501
.0390 .0407 .0501
.0311 .0311 .0501

.0347 .0347 .0501
.0329 .0329 .0501
.0294 .0359 .0445
.0298 .0359 .0427
.0298 .0358 .0470
.0294 .0358 .0453
.0297 .0356 .0442
.0382 .0395 .0498
.0378 .0378 .0501
.0334 .0334 .0499

.0381 .0398 .0499
.0346 .0346 .0495
.0378 .0393 .0501
.0377 .0401 .0500
.0374 .0374 .0501
.0376 .0384 .0500
.0375 .0391 .0501
.0374 .0395 .0501
.0373 .0396 .0501
.0371 .0393 .0500

.0370 .0397 .0499
.0369 .0388 .0501
.0369 .0388 .0500
.0368 .0389 .0501
.0367 .0390 .0500
.0365 .0390 .0501
.0364 .0391 .0500
.0362 .0380 .0500
.0361 .0381 .0500
.0360 .0381 .0500

1.70
1.70
1.70
1.70
1.70
1.70
1.74
1.73
1.68
1.73

1.70
1.70
1.74
1.74
1.74
1.74
1.74
1.70
1.70
1.70

1.68
1.70
1.68
1.68
1.68
1.68
1.68
1.67
1.67
1.69

1.69
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0167 .0167 .0248
.0183 .0183 .0247
.0174 .0174 .0247
.0194 .0194 .0246
.0200 .0200 .0246
.0120 .0151 .0230
.0123 .0163 .0227
.0187 .0187 .0248
.0177 .0177 .0248
.0168 .0168 .0247

.0186 .0186 .0247
.0190 .0190 .0246
.0197 .0197 .0250
.0200 .0200 .0246
.0125 .0185 .0234
.0116 .0189 .0231
.0119 .0186 .0230
.0201 .0201 .0248
.0200 .0200 .0247
.0200 .0200 .0247

.0199 .0200 .0247
.0119 .0180 .0234
.0199 .0199 .0250
.0194 .0194 .0247
.0198 .0199 .0247
.0191 .0191 .0247
.0197 .0198 .0247
.0181 .0181 .0247
.0186 .0186 .0247
.0196 .0197 .0251

.0112 .0185 .0234
.0195 .0195 .0250
.0194 .0195 .0247
.0194 .0195 .0247
.0192 .0192 .0247
.0193 .0194 .0247
.0192 .0193 .0247
.0192 .0192 .0247
.0191 .0192 .0251
.0190 .0192 .0251

1.99
1.99
1.99
1.99
1.99
2.03
2.03
1.99
1.99
1.99

1.99
1.99
1.99
1.99
2.01
2.01
2.01
1.99
1.99
1.99

1.99
2.02
1.99
1.99
1.99
1.99
1.99
1.99
1.99
1.98

2.02
1.99
1.99
1.99
1.99
1.99
1.99
1.99
1.98
1.98

Table A.4 -- continued

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0359 .0382
.0358 .0382
.0357 .0383
.0356 .0383
.0354 .0384
.0353 .0385
.0351 .0371
.0350 .0372
.0348 .0372
.0347 .0372

.0346 .0373
.0344 .0371
.0343 .0371
.0342 .0410
.0341 .0396
.0341 .0383
.0340 .0394
.0339 .0396
.0338 .0409
.0338 .0410

.0337 .0411
.0336 .0412
.0395 .0413
.0395 .0413
.0395 .0399
.0382 .0382
.0394 .0404
.0394 .0409
.0380 .0380
.0388 .0388

.0390 .0390
.0392 .0403
.0391 .0406
.0391 .0406
.0391 .0407
.0390 .0407
.0390 .0408
.0390 .0409
.0389 .0409
.0389 .0410

.0500
.0499
.0500
.0500
.0499
.0499
.0499
.0500
.0499
.0500

.0500
.0500
.0500
.0499
.0500
.0500
.0500
.0500
.0499
.0499

.0499
.0499
.0499
.0499
.0499
.0499
.0499
.0499
.0499
.0499

.0499
.0499
.0499
.0500
.0499
.0499
.0499
.0498
.0499
.0498

1.69
1.69
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67

1.67
1.70
1.70
1.68
1.68
1.68
1.68
1.67
1.67
1.67

1.67
1.67
1.67
1.67
1.69
1.69
1.68
1.68
1.68
1.68

1.68
1.68
1.68
1.67
1.67
1.67
1.67
1.69
1.69
1.68

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0190 .0191 .0248
.0189 .0191 .0247
.0189 .0190 .0247
.0189 .0189 .0247
.0188 .0190 .0246
.0188 .0189 .0247
.0188 .0188 .0251
.0187 .0187 .0250
.0187 .0187 .0249
.0187 .0187 .0248

.0117 .0194 .0236
.0118 .0190 .0236
.0185 .0185 .0248
.0199 .0199 .0248
.0192 .0192 .0247
.0198 .0198 .0247
.0199 .0199 .0247
.0199 .0199 .0247
.0198 .0198 .0246
.0198 .0198 .0250

.0198 .0198 .0249
.0203 .0203 .0246
.0190 .0190 .0246
.0203 .0203 .0251
.0202 .0202 .0248
.0192 .0192 .0248
.0197 .0197 .0251
.0203 .0203 .0247
.0205 .0205 .0247
.0205 .0205 .0251

.0205 .0205 .0251
.0205 .0205 .0251
.0204 .0205 .0245
.0204 .0204 .0245
.0182 .0182 .0245
.0204 .0204 .0250
.0204 .0204 .0250
.0198 .0198 .0251
.0199 .0199 .0251
.0203 .0203 .0251

1.99
2.00
2.00
2.00
2.00
2.00
1.99
1.98
1.98
1.98

2.01
2.01
1.99
1.99
1.99
1.99
1.99
1.99
1.99
1.99

1.99
2.00
2.00
1.99
1.99
1.99
1.99
1.99
1.99
1.98

1.98
1.99
2.00
2.00
2.00
1.98
1.98
1.98
1.98
1.98

Table A.4 -- continued

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0388
.0388
.0387
.0387
.0386
.0386
.0386
.0385
.0385
.0385

.0384
.0383
.0383
.0382
.0382
.0381
.0381
.0380
.0380
.0379

.0379
.0379
.0379
.0378
.0377
.0376
.0376
.0375
.0374
.0374

.0373
.0373
.0372
.0372
.0372
.0371
.0371
.0370
.0370
.0369

.0403
.0403
.0404
.0389
.0397
.0397
.0398
.0398
.0399
.0399

.0400
.0400
.0401
.0401
.0402
.0402
.0403
.0403
.0404
.0395

.0395
.0395
.0396
.0396
.0387
.0387
.0388
.0388
.0388
.0389

.0389
.0389
.0390
.0390
.0390
.0390
.0391
.0391
.0391
.0392

.0499
.0498
.0498
.0499
.0499
.0500
.0499
.0499
.0499
.0499

.0499
.0498
.0497
.0497
.0499
.0499
.0499
.0498
.0498
.0498

.0498
.0498
.0499
.0499
.0498
.0498
.0499
.0497
.0496
.0500

.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0498

1.68
1.68
1.68
1.68
1.68
1.67
1.67
1.67
1.67
1.67

1.67
1.67
1.68
1.68
1.67
1.67
1.67
1.67
1.67
1.67

1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.68
1.68
1.67

1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0203
.0203
.0202
.0202
.0202
.0202
.0202
.0202
.0201
.0201

.0199
.0201
.0200
.0200
.0094
.0095
.0200
.0199
.0199
.0199

.0199
.0198
.0198
.0198
.0198
.0198
.0197
.0197
.0197
.0197

.0196
.0196
.0196
.0196
.0195
.0195
.0195
.0195
.0194
.0194

.0203
.0203
.0203
.0203
.0202
.0202
.0202
.0202
.0202
.0201

.0199
.0201
.0201
.0201
.0197
.0197
.0200
.0200
.0200
.0199

.0199
.0199
.0199
.0199
.0198
.0198
.0197
.0197
.0197
.0197

.0197
.0197
.0197
.0196
.0196
.0196
.0196
.0196
.0195
.0195

.0251
.0251
.0251
.0245
.0245
.0245
.0250
.0250
.0250
.0250

.0250
.0250
.0251
.0251
.0241
.0237
.0249
.0246
.0246
.0246

.0246
.0250
.0250
.0250
.0250
.0250
.0250
.0246
.0246
.0246

.0249
.0247
.0246
.0246
.0246
.0246
.0246
.0246
.0246
.0247

1.98
1.98
1.98
2.00
2.00
2.00
1.98
1.98
1.98
1.98

1.98
1.98
1.98
1.98
2.01
2.01
1.99
1.99
1.99
1.99

1.99
1.98
1.98
1.98
1.98
1.98
1.98
2.00
2.00
2.00

1.99
1.99
1.99
1.99
1.99
1.99
1.99
1.99
1.99
1.99

Table A.4 -- continued

$\alpha$ = .05

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0368
.0367
.0367
.0366
.0365
.0365
.0364
.0363
.0363
.0362

.0362
.0361
.0361
.0360
.0360
.0360
.0359
.0359
.0359
.0358

.0358
.0358
.0357
.0357
.0357
.0356
.0356
.0355
.0355
.0354
.0354

.0392
.0393
.0393
.0393
.0382
.0382
.0383
.0383
.0383
.0383

.0383
.0384
.0372
.0372
.0372
.0372
.0373
.0373
.0373
.0373

.0373
.0373
.0374
.0374
.0374
.0374
.0374
.0374
.0374
.0374
.0375

.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0498

.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0499
.0501
.0497

.0498
.0498
.0500
.0498
.0498
.0498
.0498
.0498
.0498
.0498
.0498

1.67
1.67
1.68
1.68
1.67
1.67
1.67
1.67
1.67
1.67

1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.68

1.68
1.68
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67
1.67

$\alpha$ = .025

$\alpha_1$ $\alpha_2$ $\alpha^*$ $z_c^*$

.0194
.0194
.0193
.0193
.0193
.0193
.0192
.0192
.0192
.0191

.0191
.0191
.0191
.0191
.0190
.0190
.0190
.0190
.0190
.0189

.0189
.0189
.0189
.0189
.0189
.0188
.0188
.0188
.0188
.0188
.0187

.0195
.0195
.0195
.0195
.0194
.0193
.0193
.0193
.0193
.0193

.0193
.0192
.0191
.0191
.0190
.0190
.0190
.0190
.0190
.0189

.0189
.0189
.0189
.0189
.0189
.0188
.0188
.0188
.0188
.0188
.0187

.0251
.0245
.0245
.0251
.0249
.0247
.0246
.0246
.0246
.0246

.0251
.0251
.0247
.0250
.0245
.0250
.0251
.0249
.0247
.0246

.0246
.0246
.0251
.0250
.0250
.0250
.0251
.0250
.0251
.0247
.0247

1.98
2.00
2.00
1.99
1.99
1.99
1.99
1.99
1.99
1.99

1.98
1.98
1.99
1.99
2.00
1.99
1.99
1.99
1.99
1.99

1.99
1.99
1.98
1.98
1.98
1.98
1.98
1.98
1.98
1.99
1.99

Table A.5 Minimum Sample Sizes to Achieve 80% Power
and $\alpha \le .05$ for One-sided Z-test for
Comparing Two Correlated Proportions.

$\Delta$ $\psi$ $N^*$ $z_c^*$ $\alpha^*$ $1-\beta^*$

.10 .30 185 1.67 .0498 .8006
.22 134 1.68 .0499 .8014
.14 80 1.69 .0499 .8038

.20 .98 153 1.67 .0499 .8003
.94 146 1.67 .0499 .8014
.90 139 1.67 .0499 .8014
.86 135 1.67 .0500 .8066
.82 129 1.68 .0498 .8018
.78 122 1.68 .0499 .8018
.74 116 1.68 .0499 .8042
.70 108 1.67 .0499 .8016
.66 103 1.68 .0499 .8026
.62 96 1.67 .0499 .8027
.58 89 1.67 .0500 .8013
.54 82 1.67 .0500 .8025
.50 77 1.67 .0501 .8070
.46 70 1.68 .0499 .8005
.42 67 1.70 .0498 .8219
.38 58 1.68 .0501 .8193
.34 51 1.70 .0500 .8058
.30 44 1.68 .0500 .8186
.26 38 1.74 .0419 .8081
.22 30 1.74 .0450 .8124

.30 .88 63 1.74 .0427 .8029
.84 58 1.68 .0501 .8081
.80 57 1.73 .0501 .8088
.76 52 1.70 .0500 .8033
.72 49 1.70 .0501 .8035
.68 46 1.68 .0501 .8029
.64 44 1.68 .0500 .8117
.60 39 1.68 .0501 .8020
.56 39 1.68 .0501 .8324
.52 35 1.74 .0429 .8003
.48 33 1.74 .0458 .8120
.44 30 1.74 .0450 .8123
.40 26 1.74 .0434 .8075
.36 22 1.74 .0425 .8076
.32 19 1.74 .0399 .8195

.40 .98 39 1.68 .0501 .8209
.94 38 1.74 .0419 .8089
.90 35 1.74 .0429 .8037
.86 34 1.74 .0435 .8046

Table A.5 -- continued

$\Delta$ $\psi$ $N^*$ $z_c^*$ $\alpha^*$ $1-\beta^*$

.40 .82 33 1.74 .0458 .8090
.78 31 1.74 .0407 .8114
.74 29 1.74 .0407 .8117
.70 27 1.74 .0413 .8108
.66 25 1.74 .0494 .8067
.62 23 1.74 .0427 .8003
.58 22 1.74 .0425 .8120
.54 20 1.79 .0395 .8134
.50 17 1.74 .0413 .8002
.46 15 1.81 .0370 .8030
.42 14 1.74 .0371 .8365

.50 .88 21 1.74 .0459 .8029
.84 21 1.74 .0459 .8201
.80 19 1.74 .0399 .8062
.76 18 1.74 .0446 .8049
.72 17 1.74 .0413 .8031
.68 16 1.74 .0454 .8119
.64 14 1.74 .0371 .8020
.60 13 1.74 .0430 .8166
.56 12 1.74 .0373 .8325
.52 11 1.74 .0395 .8517

.60 .98 18 1.74 .0446 .8522
.94 16 1.74 .0454 .8414
.90 16 1.74 .0454 .8525
.86 15 1.81 .0370 .8196
.82 13 1.74 .0430 .8023
.78 12 1.74 .0373 .8084
.74 11 1.74 .0395 .8107
.70 11 1.74 .0395 .8504
.66 11 1.74 .0395 .8935

Table A.6 Comparison of Sample Sizes to Achieve 80%
Power and $\alpha$=.05 for One-sided Tests for
Comparing Two Correlated Proportions.

$\Delta$ $\psi$ $N_{a1}$ $N_{a2}$ $N_e$ $N^*$

.10 .30 179 180 199 185
.22 127 128 146 134
.14 70 74 94 80

.20 .98 150 150 159 153
.94 143 143 152 146
.90 137 137 147 139
.86 131 131 141 135
.82 125 125 135 129
.78 118 118 129 122
.74 112 112 122 116
.70 106 106 116 108
.66 99 99 110 103
.62 93 93 103 96
.58 86 87 96 89
.54 80 80 90 82
.50 73 74 83 77
.46 67 67 76 70
.42 60 61 70 67
.38 53 54 63 58
.34 46 48 56 51
.30 39 41 50 44
.26 31 33 44 38
.22 22 25 36 30

.30 .88 58 59 65 63
.84 56 56 61 58
.80 53 53 59 57
.76 50 50 56 52
.72 47 47 53 49
.68 44 44 50 46
.64 41 41 47 44
.60 38 38 44 39
.56 35 35 41 39
.52 32 32 38 35
.48 29 29 35 33
.44 25 26 32 30
.40 22 23 30 26
.36 18 20 27 22
.32 14 16 23 19

.40 .98 36 36 42 39
.94 34 35 38 38
.90 33 33 37 35
.86 31 31 35 34


Table A.6 -- continued

$\Delta$ $\psi$ $N_{a1}$ $N_{a2}$ $N_e$ $N^*$

.40 .82 29 30 34 33
.78 28 28 33 31
.74 26 26 31 29
.70 24 25 29 27
.66 23 23 27 25
.62 21 21 25 23
.58 19 19 24 22
.54 17 18 22 20
.50 15 16 21 17
.46 13 14 19 15
.42 10 11 17 14

.50 .88 20 20 23 21
.84 19 19 22 21
.80 17 18 21 19
.76 16 16 20 18
.72 15 15 19 17
.68 14 14 18 16
.64 13 13 17 14
.60 11 12 16 13
.56 10 10 14 12
.52 8 9 13 11

.60 .98 15 15 18 18
.94 14 14 17 16
.90 13 13 16 16
.86 13 13 15 15
.82 12 12 15 13
.78 11 11 14 12
.74 10 10 13 11
.70 9 9 12 11
.66 8 8 11 11

APPENDIX B

PLOTS OF THE NULL POWER FUNCTION

In this appendix, plots of $\pi(p)$, the null power function, are given for the two problems considered here. The dotted line represents the nominal significance level on which $\pi(p)$ is based. These plots are referred to in section 3.4 for the two independent proportions case, and in section 4.4 for the two correlated proportions case.

[Figures: plots of the null power function $\pi(p)$ against the nuisance parameter $p$.]

APPENDIX C

COMPUTER PROGRAMS

The listing of the FORTRAN computer programs used

to compute the size of the Z-test for each of the two

problems considered is given in this Appendix. For the

case of two independent proportions, Appendix C.1 gives

the exact p-value for any n=10(1)150 and any value of the

Z-test statistic with unpooled variance estimator. In

Appendix C.2, the case of two correlated proportions, the

exact p-value for N=10(1)200 and any value of the Z-test

statistic can be obtained.
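The kind of computation performed for the two independent proportions case can be sketched in Python as follows. This is an illustration under assumptions, not a transcription of the listing below: the boundary of the critical region is taken as the larger root of the quadratic in y obtained by setting the unpooled statistic equal to z (the same coefficients A, B and C appear in the listing), and the supremum over p is approximated on a plain grid instead of with bounds on the derivative of the null power function. The function names and the grid size are hypothetical.

```python
from math import comb, floor, sqrt


def critical_boundary(n, z):
    """For x = 0, 1, ..., the smallest y with Z_u(x, y) > z, as long as one exists."""
    a = 1 + z ** 2 / n
    bounds = []
    for x in range(n + 1):
        b = 2 * x + z ** 2
        c = x ** 2 * a - z ** 2 * x
        t = (b + sqrt(b ** 2 - 4 * a * c)) / (2 * a)  # larger root in y
        if t >= n:
            break  # no y <= n lies beyond the boundary from this x onward
        bounds.append(floor(t) + 1)
    return bounds


def exact_size(n, z, grid=200):
    """Largest value of the null power function over p = 1/grid, ..., (grid-1)/grid."""
    bounds = critical_boundary(n, z)
    size = 0.0
    for i in range(1, grid):
        p = i / grid
        pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
        upper = [0.0] * (n + 2)  # upper[y] = P(Y >= y)
        for k in range(n, -1, -1):
            upper[k] = upper[k + 1] + pmf[k]
        pi_p = sum(pmf[x] * upper[y0] for x, y0 in enumerate(bounds))
        size = max(size, pi_p)
    return size
```

For n=10 and z=1.96, `exact_size(10, 1.96)` is close to the size .0476 reported with that critical value in Table A.2.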

APPENDIX C.1
C
C
C
C
C
C
      REAL A,B,C,TBY(151)
      INTEGER X,Y,U,UU,BOUNDY(151),PHI(151)
      DOUBLE PRECISION LFAC(151),P,Q,LP,LQ,PX(151),FX(151),
     *PVALUE,DERBND,DERU,DERL,LCOM,PHAT1,PHAT2,
     *PMAX1,PMAX2,PMIN1,PMIN2,PL,PU,PL1,PL2,PU1,PU2,SUP,INF,
     *C1,C2,T,T3,T4,T5,T6,D1,D2,D3,D4,D5,D6,LPX,LFX,
     *MAXP,MAXIM,MINIM5,MINI10
      DOUBLE PRECISION DLOG,DLOG10,DEXP,DMAX1,DMIN1,DABS
      WRITE(6,40)
   40 FORMAT(///,
C
C
     *'THIS PROGRAM COMPUTES THE P-VALUE OF THE Z-TEST'/
     *'WITH UNPOOLED VARIANCE ESTIMATOR, TO COMPARE'/
     *'TWO INDEPENDENT PROPORTIONS FOR'/
     *' N = SAMPLE SIZE IN EACH GROUP,'/
     *' Z = NORMAL STATISTIC WITH UNPOOLED VARIANCE.'/
     *'ENTER N, Z, IN FREE FORMAT'/)
      READ(5,*) N,Z
C
C
      MAXP=0.0
      N1=N+1
C     LFAC(J) HOLDS THE LOG OF (J-1) FACTORIAL
      LFAC(1)=0.D0
      DO 1 J=2,N1
    1 LFAC(J)=LFAC(J-1)+DLOG(DBLE(FLOAT(J-1)))
C
C
C SIMPLIFY THE COMPUTATIONS AND FORM THE
C BOUNDARY OF THE CRITICAL REGION
C
C
C     TBY(X+1) IS THE LARGER ROOT IN Y OF A*Y**2-B*Y+C=0
      X=0
      A=1+(Z**2)/N
    2 B=2*X+Z**2
      C=(X**2)*A-(Z**2)*X
      TBY(X+1)=(B+SQRT((B**2)-(4*A*C)))/(2*A)
      IF (TBY(X+1)-N) 3,4,4
    3 X=X+1
      GO TO 2
    4 U=X-1
      UU=U+1
      DO 5 X=1,UU
      BOUNDY(X)=INT(TBY(X)+1)
      PHI(X)=INT(TBY(X))
      L=X-1
    5 CONTINUE
      MAXIM=0.D0
      MINIM5=1.D0
      MINI10=1.D0
      I9=0
      I5=0
      P=-.005D0
    6 IF (PP.GT.0.49D0) GO TO 76
   72 IF (L.EQ.1) P=P+.01D0
      IF (L.EQ.1) PP=P+.005D0
      Q=1-P
      LP=DLOG(P)
      LQ=DLOG(Q)
      DO 7 J=1,N1
      X=J-1
C     T IS THE LOG OF THE BINOMIAL(N,P) PROBABILITY OF X
      T=LFAC(N+1)-LFAC(J)-LFAC(N+1-J+1)+X*LP+(N-X)*LQ
      IF (T.LT.-180.0) PX(J)=0.D0
      IF (T.GE.-180.0) PX(J)=DEXP(T)

      IF (X) 8,9,8
    9 FX(J)=PX(J)
      GO TO 10
    8 FX(J)=FX(J-1)+PX(J)
   10 CONTINUE
    7 CONTINUE
      PVALUE=0
      DERU=0
      DERL=0
      DO 30 K=1,UU
      X=K-1
      IF (PX(K).LE.0.D0) GO TO 11
      LPX=DLOG10(PX(K))
      IF ((1-FX(PHI(K)+1)).LE.0.D0) GO TO 11
      LFX=DLOG10(1-FX(PHI(K)+1))
      IF ((LPX+LFX).LT.-70.0) GO TO 11
C     ADD PR(X)*PR(Y.GT.PHI(X)) TO THE NULL POWER AT P
      PVALUE=PVALUE+PX(K)*(1-FX(PHI(K)+1))
   11 Y=BOUNDY(K)
C     LCOM IS THE LOG OF C(N,X)*N!/((Y-1)!*(N-Y)!)
      LCOM=LFAC(N+1)-LFAC(K)-LFAC(N+1-K+1)
     *+LFAC(N+1)-LFAC(Y)-LFAC(N+1-Y-1+1)
C
C
C LOCAL BOUND FOR THE DERIVATIVE OF
C THE NULL POWER FUNCTION
C
C
      C1=FLOAT(X+Y-1)
      C2=FLOAT(2*N-X-Y)
      PHAT1=C1/(C1+C2)
      PHAT2=1-PHAT1
      PL=P-(.005D0/(2**(L-1)))
      PU=P+(.005D0/(2**(L-1)))
      IF (PHAT1.GT.PU) GO TO 12
      IF ((PHAT1.LE.PU).AND.(PHAT1.GE.PL)) GO TO 13
      IF (PHAT1.LT.PL) GO TO 14
   12 PMAX1=PU
      PMIN1=PL
      GO TO 15
   13 PMAX1=PHAT1
      IF (PL.NE.0.D0) GO TO 21
      PMIN1=PL
      GO TO 15
   21 PL1=C1*DLOG(PL)+C2*DLOG(1-PL)
      PU1=C1*DLOG(PU)+C2*DLOG(1-PU)
      IF (PL1.LT.PU1) PMIN1=PL
      IF (PL1.GE.PU1) PMIN1=PU
      GO TO 15
   14 PMAX1=PL
      PMIN1=PU
   15 CONTINUE
      IF (PHAT2.GT.PU) GO TO 16
      IF ((PHAT2.LE.PU).AND.(PHAT2.GE.PL)) GO TO 17
      IF (PHAT2.LT.PL) GO TO 18
   16 PMAX2=PU
      PMIN2=PL
      GO TO 19
   17 PMAX2=PHAT2
      IF (PL.NE.0.D0) GO TO 22
      PMIN2=PL
      GO TO 19
   22 PL2=C2*DLOG(PL)+C1*DLOG(1-PL)
      PU2=C2*DLOG(PU)+C1*DLOG(1-PU)
      IF (PL2.LT.PU2) PMIN2=PL
      IF (PL2.GE.PU2) PMIN2=PU
      GO TO 19
   18 PMAX2=PL
      PMIN2=PU
   19 CONTINUE
      T3=LCOM+C1*DLOG(PMAX1)+C2*DLOG(1-PMAX1)
      IF (T3.LT.-180.D0) D3=0.D0
      IF (T3.GE.-180.D0) D3=DEXP(T3)
      IF (PMIN2.EQ.0.D0) GO TO 23