NONPARAMETRIC ANALYSIS
OF
BIVARIATE CENSORED DATA
By
EDWARD ANTHONY POPOVICH
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1983
To my parents, this is as
much yours as it is mine.
ACKNOWLEDGEMENTS
This dissertation completes the work toward my Doctor of
Philosophy degree. However, I would be remiss in failing to mention
the contributions of many others without whom this work would not be
possible. I would like to thank my committee for their guidance and
support. Dr. Dennis Wackerly was especially helpful due to his expert
proofreading. Even though it goes without saying that the major
advisor has the most important direct impact on the actual work, I will
further say that Dr. P.V. Rao went beyond the call of duty. The
endless hours given by Dr. Rao will not be forgotten. Moreover, he is
more than just my advisor, he is a respected friend and shall always
remain so. I would like to thank the Department of Statistics for
allowing me to be a part of this excellent group. To my family,
especially my parents, I will always be grateful for their support
during my years at the University of Florida. I would like to thank
Debra Lynn Smith for her understanding and for being there when I
needed her the most. To all my friends, especially my roommates during
the last two years, John L. Jackson, Jr. and George Tanase, Jr., I
would like to express my gratitude for just being my friends. I would
also like to thank my typist, Donna Ornowski, for the many hours,
including several evenings, she gave in order to do such an excellent
job. Without those mentioned above, this work could not have been
completed; therefore, in a real way this degree is as much theirs as it
is mine.
TABLE OF CONTENTS

                                                          PAGE NUMBER

ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . .  iii

ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . .  v

CHAPTER

ONE      INTRODUCTION . . . . . . . . . . . . . . . . . . . .  1
         1.1  What is Censoring? . . . . . . . . . . . . . . .  1
         1.2  Censored Matched Pairs . . . . . . . . . . . . .  3

TWO      TWO CONDITIONAL TESTS FOR CENSORED
         MATCHED PAIRS . . . . . . . . . . . . . . . . . . . .
         2.1  Introduction and Summary . . . . . . . . . . . .
         2.2  Notation and Assumptions . . . . . . . . . . . .
         2.3  Preliminary Results . . . . . . . . . . . . . . .
         2.4  Two Tests for Difference in Location . . . . . .

THREE    A CLASS OF TESTS FOR TESTING FOR DIFFERENCE
         IN LOCATION . . . . . . . . . . . . . . . . . . . . .
         3.1  Introduction and Summary . . . . . . . . . . . .
         3.2  An Exact Test for Difference in Location . . . .
         3.3  The Asymptotic Distribution of the
              Test Statistic . . . . . . . . . . . . . . . . .
         3.4  A Small-Sample Example . . . . . . . . . . . . .
         3.5  Choosing the Test Statistic . . . . . . . . . . .

FOUR     SIMULATION RESULTS AND CONCLUSIONS . . . . . . . . .
         4.1  Introduction and Summary . . . . . . . . . . . .
         4.2  Test Statistics to Be Compared . . . . . . . . .
         4.3  Simulation Results . . . . . . . . . . . . . . .
         4.4  Conclusions . . . . . . . . . . . . . . . . . . .

APPENDIX TABLES OF CRITICAL VALUES FOR TESTING
         FOR DIFFERENCE IN LOCATION . . . . . . . . . . . . .  72

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . .  83

BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . . .  84
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
NONPARAMETRIC ANALYSIS
OF
BIVARIATE CENSORED DATA
By
Edward Anthony Popovich
December, 1983
Chairman: Pejaver V. Rao
Major Department: Statistics
A class of statistics is proposed for the problem of testing for
location difference using censored matched pair data. The class
consists of linear combinations of two conditionally independent
statistics where the conditioning is on the number, N1, of pairs in
which both members are uncensored and the number, N2, of pairs in which
exactly one member is uncensored. Since every member of the class is
conditionally distribution-free under the null hypothesis, H0: no
location difference, the statistics in the proposed class can be
utilized to provide an exact conditional test of H0 for all N1 and N2.
If n denotes the total number of pairs, then under suitable
conditions the proposed test statistics are shown to have asymptotic
normal distributions as n tends to infinity. As a result, large sample
tests can be performed using any member of the proposed class.
A method that can be used to choose one test statistic from the
proposed class of test statistics is outlined. However, the resulting
test statistic depends on the underlying distributional forms of the
populations from which the bivariate data and censoring variables are
sampled.
Simulation results indicate that the powers of certain members in
the class are as good as and, in some cases, better than the power of a
test for H0 proposed by Woolson and Lachenbruch in their paper titled
"Rank Tests for Censored Matched Pairs," appearing on pages 597-606 of
Biometrika in 1980. Also, unlike the test of Woolson and Lachenbruch,
the critical values for small samples can be tabulated for the tests in
the new class. Consequently, members of the new class of tests are
recommended for testing the null hypothesis.
CHAPTER ONE
INTRODUCTION
1.1 What is Censoring?
In many experiments it is important to compare two or more treatments
by observing the times to "failure" of objects subjected to the
different treatments. For example, one might be interested in comparing
two drugs to see if there is a difference in the length of time
the drugs (treatments) are effective. The time period in which a
treatment is effective is called the survival time or the lifetime of
that treatment. Unfortunately, in many experiments the survival times
for all treatments under study are not observable due to a variety of
reasons. If for a particular subject the treatment is still effective
at the time of the termination of the experiment, then the corresponding
survival time is known to be longer than the observation time of
that subject. In that case the survival time is unknown and said to be
right censored at the length of time the subject was under observation.
It is also possible that at the time of first observation of a treated
subject, the treatment is already ineffective, so that the corresponding
survival time is not observable but is known to be less than the
observation time of that subject. In that case the survival time, which
is unknown, is said to be left censored at the observation time. It
should be noted that censored data can arise in cases where responses
are not measured on a time scale. Miller (1981) gives an example from
Leavitt and Olshen (1974) where the measured response is the amount of
money paid on an insurance claim on a certain date. For this example,
an uncensored response would be the amount paid on the total claim,
while a right-censored response would be the amount paid to date if the
total claim is unknown.
Censoring can arise in different situations determined by the
experimental restrictions on the observation time. For example, Type I
censoring occurs when the period of observation for each subject is
preset to be a certain length of time T; the survival times longer than
T would be right censored, while those survival times less than T would
be observed and are known as failure times. Type II censoring occurs
when the period of observation terminates as soon as the rth
failure is observed, where r is a predetermined integer greater than or
equal to one. If n is the total number of subjects in the experiment,
then the n - r survival times which are unobserved are right censored at
the time of the rth failure. Random censorship is a third form of censoring,
and in many cases it is a generalization of Type I censoring. Random
censoring occurs when each subject has its own period of observation,
which need not be the same for all subjects. This could occur, for
instance, when the duration of the experiment is fixed, but the subjects
do not necessarily enter at the beginning of the experiment. In
that case each subject could have its own observation time. When the
time of entry is random, the observation time will also be random.
Clearly, if a subject's survival time is greater than the subject's
observation time, then a right-censored survival time results; otherwise,
a failure time is observed for that subject. In the random
censoring case, left-censored survival times could also occur if at the
time of first observation of the subject a failure has already occurred.
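The observed-data construction above can be illustrated with a short simulation sketch (the exponential survival and censoring rates below are arbitrary assumptions, not values from this dissertation): under random censorship, each subject's recorded value is the minimum of its survival time and its observation time, together with an indicator of whether a failure was actually observed.

```python
import random

def observe(survival, censor):
    """Return (observed time, indicator): 1 means the failure time was
    observed, 0 means the survival time is right censored."""
    if survival <= censor:
        return survival, 1   # failure observed
    return censor, 0         # right censored at the observation time

random.seed(1)
# Random censorship: every subject gets its own observation period.
pairs = [(random.expovariate(1.0), random.expovariate(0.5)) for _ in range(5)]
for t, d in (observe(s, c) for s, c in pairs):
    print(f"time={t:.3f}  censored={'no' if d else 'yes'}")
```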
1.2 Censored Matched Pairs
This dissertation is concerned with the analysis of the survival
times of two treatments on matched pairs of subjects. The problem of
testing for the difference in locations of two treatments using censored
matched pair data was examined by Woolson and Lachenbruch (1980)
under the assumption that both members of a pair have equal but random
observation times and that the observation times for different pairs
are independent. They utilized the concept of the generalized rank
vector introduced by Kalbfleisch and Prentice (1973), and also used by
Prentice (1978), to derive a family of rank tests for the matched pairs
problem. The tests are developed by imitating the derivation of the
locally most powerful (LMP) rank tests for the uncensored case
(Lehmann, 1959). In spite of the fact that the Woolson-Lachenbruch
family of rank tests (WL tests) is an intuitively reasonable generalization
of the LMP rank tests for the uncensored case, the complicated
nature of these tests makes it difficult to investigate their theoretical
properties. For example, it is not clear that these tests are LMP
in the censored case. Also, the asymptotic normality of these tests,
as claimed by Woolson and Lachenbruch, is strictly valid only when N1
(the number of pairs in which both members are uncensored) and N2 (the
number of pairs in which exactly one member is censored) are regarded
as nonrandom and tending to infinity simultaneously. It is not known
whether the asymptotic normality holds (unconditionally) as the total
sample size tends to infinity.
For small sample sizes, exact critical values for the WL test
statistics could be based on their (conditional on N1 and N2) permutation
distributions. However, the determination of these critical
values becomes progressively impractical when either N1 or N2 is
greater than 8 and the other is greater than 2. An example of the
Woolson and Lachenbruch test statistic is presented in Chapter Four.
The objective of this dissertation is to propose a class of
statistics for testing the null hypothesis, H0: no difference in
treatment effects. Any member of this class is an alternative to the
WL statistic for the censored matched pair problem. The statistics in
the proposed class are not only computationally simple, but under H0
are also distribution-free (conditional on N1 and N2). Furthermore,
the critical values of these statistics are easily tabulated.

Chapter Two begins with the general setup of the problem and some
preliminary results which are needed in Chapter Three. The class of
linear combinations of two conditionally distribution-free and independent
statistics for testing H0 is proposed in Chapter Three, and it is
shown that the asymptotic normality of the corresponding test statistics
holds unconditionally as the total sample size n tends to infinity.
Simulation studies are used in Chapter Four to compare the test proposed
by Woolson and Lachenbruch (1980) to the tests proposed in this
dissertation.
CHAPTER TWO
TWO CONDITIONAL TESTS FOR CENSORED MATCHED PAIRS
2.1 Introduction and Summary
In this chapter two statistics for testing the null hypothesis H0,
that there is no difference in treatment effects, using censored
matched pair data are developed. Either of these statistics can be
utilized to test for a shift in location, and together they provide
conditionally independent tests of H0, conditional on the numbers of
observed pairs in which both members of a pair are uncensored (N1) and
in which exactly one member of a pair is censored (N2).

The two statistics are proposed in Section 2.4, and it is shown
that the exact conditional distributions as well as the conditional
asymptotic distributions of the statistics are easily specified. Some
notations and assumptions are stated in Section 2.2, while Section 2.3
contains preliminary results needed in the development of Section 2.4.
2.2 Notation and Assumptions
This section begins with a set of notation which will be used
hereafter throughout. Let {(Xi°, Yi°, Ci); i = 1, 2, ..., n} denote
independent random variables distributed as (X°, Y°, C). In addition,
(X°, Y°) and C are referred to as the true value of the treatment
responses and the censoring variable, respectively. The actual observed
value of the treatment responses is (X, Y), where X = min(X°, C)
and Y = min(Y°, C). In addition to (X, Y) one also observes the value
of the random variable (δ(X), δ(Y)), where

    δ(X) = 1 if X = X°,  δ(X) = 0 if X = C ,

and

    δ(Y) = 1 if Y = Y°,  δ(Y) = 0 if Y = C .

Note here that the indicator random variable δ(X) (respectively, δ(Y))
indicates whether or not X (respectively, Y) represents a true
uncensored response.
Let

    N1 = Σ_{i=1}^{n} δ(Xi) δ(Yi) ,

    N2L = Σ_{i=1}^{n} δ(Xi)[1 - δ(Yi)] ,

    N2R = Σ_{i=1}^{n} [1 - δ(Xi)] δ(Yi) ,

    N2 = N2L + N2R ,

and

    N3 = Σ_{i=1}^{n} [1 - δ(Xi)][1 - δ(Yi)]
       = n - N1 - N2 .
Clearly, N1, N2, and N3 denote the number of pairs (Xi, Yi) in which
both members are uncensored, exactly one member is censored, and both
members are censored, respectively. Similar interpretations for N2L
and N2R may be given.

The test procedures under consideration here are based on the
observed differences {Di = Xi - Yi; i = 1, ..., n}. An observed
difference Di is said to fall into category C1 if δ(Xi)δ(Yi) = 1, into
category C2 if δ(Xi)[1 - δ(Yi)] + [1 - δ(Xi)]δ(Yi) = 1, and into
category C3 if [1 - δ(Xi)][1 - δ(Yi)] = 1. It is evident that Nk is the
number of pairs (Xi, Yi) in category Ck, k = 1, 2, 3. Also, as noted by
Woolson and Lachenbruch (1980), a Di in category C1 equals
Xi° - Yi° ≡ Di°; furthermore, a Di in category C2 either satisfies
Di° < Di (i.e., Di° is left censored at Di) if δ(Xi)[1 - δ(Yi)] = 1 or
satisfies Di° > Di (i.e., Di° is right censored at Di) if
[1 - δ(Xi)]δ(Yi) = 1. The order relationship between Di and Di° for
those Di in category C3 is unknown.
Additional notation is needed for the N1 and N2 pairs (Xi, Yi) in
categories C1 and C2, respectively. Let (X1i, Y1i), i = 1, ..., N1,
denote the N1 pairs in category C1 and let (X2j, Y2j), j = 1, ..., N2,
denote the N2 pairs in category C2. Moreover, let D1i and D2j be
defined as follows:

    D1i = X1i - Y1i ,   i = 1, ..., N1 ,
and                                                          (2.2.1)
    D2j = X2j - Y2j ,   j = 1, ..., N2 .
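The category counts defined above are simple bookkeeping on the censoring indicators. A minimal sketch, using hypothetical observed pairs (x, δ(x), y, δ(y)):

```python
def classify(pairs):
    """pairs: list of (x, dx, y, dy), where dx = delta(X), dy = delta(Y).
    Returns the counts (N1, N2L, N2R, N3) and the C1 differences D1i."""
    n1 = n2l = n2r = n3 = 0
    d1 = []
    for x, dx, y, dy in pairs:
        if dx and dy:          # C1: both members uncensored
            n1 += 1
            d1.append(x - y)
        elif dx and not dy:    # C2: Y censored, D° left censored at D
            n2l += 1
        elif dy and not dx:    # C2: X censored, D° right censored at D
            n2r += 1
        else:                  # C3: both members censored
            n3 += 1
    return n1, n2l, n2r, n3, d1

# hypothetical observed pairs (x, delta_x, y, delta_y)
sample = [(2.0, 1, 1.5, 1), (3.0, 1, 3.0, 0), (1.2, 0, 0.7, 1), (2.5, 0, 2.5, 0)]
print(classify(sample))  # (1, 1, 1, 1, [0.5])
```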
Now a set of assumptions, which are used later, is stated.

Assumptions

(A1): (Xi°, Yi°) are independent and identically distributed
      (i.i.d.) as a continuous bivariate random variable (X°, Y°).

(A2): There exists a real number θ such that (X°, Y° + θ) is an
      exchangeable pair.

(A3): C1, C2, ..., Cn are i.i.d. as a continuous random variable C,
      and C is independent of (X°, Y°).

(A4): Ci is independent of (Xi°, Yi°).

(A5): Let Pθ{·} indicate probability calculated with the value of θ
      satisfying (A2). Then

          Pθ{Di ∈ C3} ≡ P3(θ) = Pθ{X° > C, Y° > C} < 1 .

Note that in view of (A2), X° and Y° + θ have identical marginal
distributions and that (A5) implies Pθ{N3 = n} < 1. Therefore, under
(A5), with positive probability there will be at least one Di in
either category C1 or C2 or both. If the probability of observing a Di
in category C1 is positive, then Fθ(x) will be used to denote the
distribution function of D1i. In other words,

    Fθ(x) ≡ Pθ{D1i ≤ x}                                      (2.2.2)
          = Pθ{Di ≤ x | Di ∈ C1} .

Finally, because a test of H0: θ = θ0 utilizing Di, i = 1, 2, ..., n,
is a test of H0: θ = 0 utilizing Di - θ0, we may, without loss of
generality, take θ0 = 0.
2.3 Preliminary Results
It is obvious that if N2 = N3 = 0 with probability one, then the
problem under consideration is the usual paired sample problem. Since
under (A2) the Di are symmetrically distributed about 0, there exist a
number of linear signed rank tests appropriate for testing H0: θ = 0
(Randles and Wolfe, 1979), of which the well-known Wilcoxon signed rank
test is an important example. However, if N2 > 0 with positive
probability, then the situation is different because a Di in C2 will
be either left or right censored at Di and not all the Di° will equal
Di. Lemmas 2.3.1 and 2.3.2 below state some properties of the Di
belonging to C1 and C2.

Lemma 2.3.1:
Let Pθ{Di ∈ C1} ≡ P1(θ) ≡ Pθ{X° ≤ C, Y° ≤ C}.
Suppose θ = 0 and assumptions (A1), (A2), and (A4) hold. Suppose
further that P1(0) > 0, and set ψ(x) = 1 or 0 depending on whether
x > 0 or x ≤ 0, respectively.

(a) The distribution of Di conditional on Di ∈ C1 is symmetric
about 0. That is, for every real number a,

    P0{Di ≤ -a | Di ∈ C1} = P0{Di ≥ a | Di ∈ C1} .

(b) The random variables |Di| and ψ(Di) are conditionally
independent. That is, for every real number a ≥ 0 and u = 0, 1,

    P0{|Di| ≤ a, ψ(Di) = u | Di ∈ C1}
    = P0{|Di| ≤ a | Di ∈ C1} P0{ψ(Di) = u | Di ∈ C1} .
Proof:

To prove (a) consider

    P0{Di ≤ -a | Di ∈ C1} P0{Di ∈ C1}
    = P0{Xi° - Yi° ≤ -a, Xi° ≤ Ci, Yi° ≤ Ci} .

By (A2), Xi° and Yi° are exchangeable; therefore, the quantity above
equals

    P0{Yi° - Xi° ≤ -a, Yi° ≤ Ci, Xi° ≤ Ci}
    = P0{Xi° - Yi° ≥ a, Xi° ≤ Ci, Yi° ≤ Ci}
    = P0{Di ≥ a | Di ∈ C1} P0{Di ∈ C1} ,

which completes the proof of (a).

To prove (b) consider, for a > 0,

    P0{|Di| ≤ a, ψ(Di) = 1 | Di ∈ C1} = P0{0 < Di ≤ a | Di ∈ C1} .

From part (a) the conditional distribution of Di is symmetric about 0.
Therefore, the quantity on the right hand side of the equation equals

    (1/2)[P0{0 < Di ≤ a | Di ∈ C1} + P0{-a ≤ Di < 0 | Di ∈ C1}]
    = (1/2)[(P0{Di ≤ a | Di ∈ C1} - P0{Di ≤ 0 | Di ∈ C1})
        + (P0{Di < 0 | Di ∈ C1} - P0{Di < -a | Di ∈ C1})]
    = (1/2)[P0{-a ≤ Di ≤ a | Di ∈ C1} - P0{Di = 0 | Di ∈ C1}] .

Since under (A1) (X°, Y°) has a continuous bivariate distribution, then
for all θ, Pθ{X° - Y° = 0} = 0, and since Di ∈ C1 implies Di = Di°, the
quantity above equals

    (1/2) P0{-a ≤ Di ≤ a | Di ∈ C1}
    = (1/2) P0{|Di| ≤ a | Di ∈ C1} .                         (2.3.1)

Now, from part (a),

    P0{Di > 0 | Di ∈ C1} = P0{ψ(Di) = 1 | Di ∈ C1} = 1/2

and

    P0{Di ≤ 0 | Di ∈ C1} = P0{ψ(Di) = 0 | Di ∈ C1} = 1/2 .

Consequently, from (2.3.1) it can be seen that

    P0{|Di| ≤ a, ψ(Di) = 1 | Di ∈ C1}
    = (1/2) P0{|Di| ≤ a | Di ∈ C1}
    = P0{|Di| ≤ a | Di ∈ C1} P0{ψ(Di) = 1 | Di ∈ C1} .

Similarly,

    P0{|Di| ≤ a, ψ(Di) = 0 | Di ∈ C1}
    = P0{|Di| ≤ a | Di ∈ C1} - P0{|Di| ≤ a, ψ(Di) = 1 | Di ∈ C1}
    = P0{|Di| ≤ a | Di ∈ C1} - P0{|Di| ≤ a | Di ∈ C1} P0{ψ(Di) = 1 | Di ∈ C1}
    = P0{|Di| ≤ a | Di ∈ C1} [1 - P0{ψ(Di) = 1 | Di ∈ C1}]
    = P0{|Di| ≤ a | Di ∈ C1} P0{ψ(Di) = 0 | Di ∈ C1} .

This completes the proof of (b). □
Lemma 2.3.2: Let D11, ..., D1N1 be as defined in equation (2.2.1).
Then, conditional on Ni = ni, i = 1, 2, 3, D11, ..., D1N1, and N2R are
mutually independent. That is, for real numbers d1, ..., dn1 and
integer n2R,

    Pθ{D1j ≤ dj, j = 1, ..., N1, and N2R = n2R | Ni = ni, i = 1, 2, 3}
    = [Π_{j=1}^{n1} Pθ{D1j ≤ dj | Ni = ni, i = 1, 2, 3}]
      × Pθ{N2R = n2R | Ni = ni, i = 1, 2, 3} .

Proof:

Let

    P1(d, θ) = Pθ{Di° = Xi° - Yi° ≤ d, max(Xi°, Yi°) ≤ Ci} ,
    P2R(θ) = Pθ{Yi° ≤ Ci < Xi°} ,   P2L(θ) = Pθ{Xi° ≤ Ci < Yi°} ,
    P2(θ) = P2R(θ) + P2L(θ) = Pθ{Di ∈ C2} .

Recall that from assumption (A5),

    P3(θ) = Pθ{min(Xi°, Yi°) > Ci} = Pθ{Di ∈ C3} .

Also, as a result of the definition of P1(θ) in Lemma 2.3.1,

    P1(θ) = P1(∞, θ) = Pθ{Di ∈ C1} .

For arbitrary integers n1, n2 = n2R + n2L, and n3 such that
n = n1 + n2 + n3, and for an arbitrary permutation

    (i1, ..., in1, in1+1, ..., in1+n2R, in1+n2R+1, ..., in1+n2,
     in1+n2+1, ..., in)

of

    (1, 2, ..., n) ,

under assumptions (A1), (A3), and (A4) the following identity results:

    Pθ{D1j ≤ dj, j = 1, ..., n1; [1 - δ(Xij)]δ(Yij) = 1,
       j = n1 + 1, ..., n1 + n2R; δ(Xij)[1 - δ(Yij)] = 1,
       j = n1 + n2R + 1, ..., n1 + n2;
       [1 - δ(Xij)][1 - δ(Yij)] = 1, j = n1 + n2 + 1, ..., n}
    = [Π_{j=1}^{n1} P1(dj, θ)] [P2R(θ)]^{n2R} [P2L(θ)]^{n2L} [P3(θ)]^{n3} .   (2.3.2)

Therefore, summing (2.3.2) over the n!/(n1! n2R! n2L! n3!) distinct
ways of dividing the n subscripts into four groups of n1, n2R, n2L, and
n3 subscripts, it follows that

    Pθ{D1j ≤ dj, j = 1, ..., n1, and N2R = n2R, N2L = n2L, N3 = n3}
    = [n!/(n1! n2R! n2L! n3!)] [Π_{j=1}^{n1} P1(dj, θ)]
      × [P2R(θ)]^{n2R} [P2L(θ)]^{n2L} [P3(θ)]^{n3} .

Now, since (N1, N2, N3) has a trinomial distribution with probabilities
Pi(θ), i = 1, 2, 3, then

    Pθ{Ni = ni, i = 1, 2, 3}
    = [n!/(n1! n2! n3!)] [P1(θ)]^{n1} [P2(θ)]^{n2} [P3(θ)]^{n3} .

Therefore,

    Pθ{D11 ≤ d1, ..., D1N1 ≤ dn1, N2R = n2R | Ni = ni, i = 1, 2, 3}
    = { [n!/(n1! n2R! n2L! n3!)] [Π_{j=1}^{n1} P1(dj, θ)]
        [P2R(θ)]^{n2R} [P2L(θ)]^{n2L} [P3(θ)]^{n3} }
      / { [n!/(n1! n2! n3!)] [P1(θ)]^{n1} [P2(θ)]^{n2} [P3(θ)]^{n3} }
    = [Π_{j=1}^{n1} P1(dj, θ)/P1(θ)]
      × (n2 choose n2R) [P2R(θ)/P2(θ)]^{n2R} [P2L(θ)/P2(θ)]^{n2L} ,   (2.3.3)

which shows that the joint conditional distribution of the D1j and N2R
is a product of the marginals, thereby completing the proof of
Lemma 2.3.2. □

Lemmas 2.3.1 and 2.3.2 will be utilized in the next section to develop
two test statistics for testing H0: θ = 0.
2.4 Two Tests for Difference in Location

Lemma 2.3.1 suggests that one could use the observed differences in
C1 to define a conditional test (conditional on N1 = n1) of H0: θ = 0.
As a consequence of the symmetry of the conditional null distribution of
the D1j, j = 1, 2, ..., n1, any test for the location of a symmetric
population is a reasonable candidate for a test of H0: θ = 0. For example,
one might consider using a linear signed rank test. The well-known
Wilcoxon signed rank test (Randles and Wolfe, 1979, p. 322) is an obvious
choice due to its high efficiency for a wide variety of distributions
and the availability of tables of critical values for a large selection
of sample sizes. Accordingly, consider the statistic T1n ≡ T1n(0), where

    T1n(θ) = Σ_{j=1}^{N1} ψ(D1j - θ) Rj(θ) ,

with Rj(θ) denoting the rank of |D1j - θ| among

    |D11 - θ|, |D12 - θ|, ..., |D1N1 - θ| .

Clearly, the Di in C2 also contain information about H0. Lemma
2.3.2 implies that, conditional on N1 = n1, N2 = n2, any test based on
N2R and N2L will be independent of the test based on T1n. Since, as will
subsequently be shown, under H0,

    E[N2R - N2L | Ni = ni, i = 1, 2, 3] = 0 ,

a second statistic for testing H0 is T2n, where

    T2n = N2R - N2L = 2N2R - N2 .
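Both statistics are easy to compute from the categorized differences. The sketch below forms T1n as the Wilcoxon signed rank sum over hypothetical C1 differences and T2n = N2R - N2L from the singly censored counts; ties among the |D1j| are ignored, consistent with the continuity assumption (A1).

```python
def t1n(d1):
    """Wilcoxon signed rank statistic: sum of the ranks of |D1j|
    over the positive D1j (assumes no ties, per assumption (A1))."""
    ranked = sorted(range(len(d1)), key=lambda j: abs(d1[j]))
    return sum(rank + 1 for rank, j in enumerate(ranked) if d1[j] > 0)

def t2n(n2r, n2l):
    """Sign-type statistic from the singly censored pairs."""
    return n2r - n2l

d1 = [0.8, -0.3, 1.5, 0.2]   # hypothetical C1 differences
print(t1n(d1))               # ranks of |d|: 0.2->1, 0.3->2, 0.8->3, 1.5->4; positives: 1+3+4 = 8
print(t2n(5, 2))             # 3
```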
Some properties of the conditional distributions of T1n and
T2n follow directly from Lemmas 2.3.1 and 2.3.2. These properties are
summarized in Lemma 2.4.1.
Lemma 2.4.1: Let Fθ(·) be defined as in equation (2.2.2).
Also, let Eθ(·) and Varθ(·) denote the expectation and variance with
respect to Pθ(·). Under (A1), (A2), (A3), and (A4):

(a) Conditional on Ni = ni, i = 1, 2, 3, T1n and T2n are independent.

(b) Eθ[T1n | Ni = ni, i = 1, 2, 3] = μ1n(θ)

and

    Varθ[T1n | Ni = ni, i = 1, 2, 3] = σ1n²(θ) + O(n1²) ,

where

    μ1n(θ) = (n1 choose 2) ∫_{-∞}^{∞} [1 - Fθ(-x)] dFθ(x) + n1 [1 - Fθ(0)]

and

    σ1n²(θ) = 4 (n1 choose 2)² n1^{-1} { ∫_{-∞}^{∞} [1 - Fθ(-x)]² dFθ(x)
              - [∫_{-∞}^{∞} [1 - Fθ(-x)] dFθ(x)]² } .

(c) Eθ[T2n | Ni = ni, i = 1, 2, 3] = μ2n(θ)

and

    Varθ[T2n | Ni = ni, i = 1, 2, 3] = σ2n²(θ) ,

where

    μ2n(θ) = n2 [P2(θ)]^{-1} [P2R(θ) - P2L(θ)]

and

    σ2n²(θ) = 4 n2 [P2(θ)]^{-2} P2R(θ) P2L(θ) .
Proof:

The proof of (a) follows directly from Lemma 2.3.2, since T1n is only
a function of D1j, j = 1, ..., N1, and T2n is only a function of N2R.

The proof of (b) follows by noting that

    T1n ≡ T1n(0) = Σ_{j=1}^{n1} ψ(D1j) Rj(0)
    = Σ_{1 ≤ i < j ≤ n1} ψ(D1i + D1j) + Σ_{j=1}^{n1} ψ(D1j)
    = (n1 choose 2) U1,n1 + n1 W1,n1 ,

where

    U1,n1 = (n1 choose 2)^{-1} Σ_{1 ≤ i < j ≤ n1} ψ(D1i + D1j)

and

    W1,n1 = n1^{-1} Σ_{j=1}^{n1} ψ(D1j)

are two U-statistics (Randles and Wolfe, 1979, page 83). Since U1,n1
and W1,n1 are unbiased estimates of Pθ(D1i + D1j > 0) and Pθ(D1j > 0),
respectively, then

    μ1n(θ) = (n1 choose 2) Pθ(D1i + D1j > 0) + n1 Pθ(D1j > 0) .

Upon noting

    Pθ(D1i + D1j > 0)
    = ∫_{-∞}^{∞} Pθ(D1i > -x | D1j = x) dPθ(D1j ≤ x)
    = ∫_{-∞}^{∞} [1 - Fθ(-x)] dFθ(x)

and

    Pθ(D1j > 0) = 1 - Fθ(0) ,

the expression for μ1n(θ) follows.

Now,

    Varθ[T1n | Ni = ni, i = 1, 2, 3]
    = (n1 choose 2)² Varθ U1,n1 + n1² Varθ W1,n1
      + 2 (n1 choose 2) n1 Covθ(U1,n1, W1,n1)
    = (n1 choose 2)² Varθ U1,n1 + n1² Varθ W1,n1
      + 2 (n1 choose 2) n1 ρ1(θ) [Varθ U1,n1]^{1/2} [Varθ W1,n1]^{1/2} ,

where ρ1(θ) = Corr[U1,n1, W1,n1] is the correlation coefficient
between U1,n1 and W1,n1.

As a result of Lemma A, part (iii), of Serfling (1980, p. 183), it
follows that

    Varθ U1,n1 = (2)² n1^{-1} {Eθ[ψ(D11 + D12) ψ(D11 + D13)]
                 - [Eθ ψ(D11 + D12)]²} + O(n1^{-2})
    = 4 n1^{-1} {Pθ(D11 + D12 > 0, D11 + D13 > 0)
                 - [Pθ(D11 + D12 > 0)]²} + O(n1^{-2})
    = 4 n1^{-1} { ∫_{-∞}^{∞} [1 - Fθ(-x)]² dFθ(x)
                 - [∫_{-∞}^{∞} [1 - Fθ(-x)] dFθ(x)]² } + O(n1^{-2}) .

Also,

    Varθ W1,n1 = n1^{-1} {Eθ[ψ(D11)]² - [Eθ ψ(D11)]²} = O(n1^{-1}) .

Since ρ1(θ) = O(1), it follows that

    Varθ[T1n | Ni = ni, i = 1, 2, 3]
    = 4 (n1 choose 2)² n1^{-1} { ∫ [1 - Fθ(-x)]² dFθ(x)
        - [∫ [1 - Fθ(-x)] dFθ(x)]² }
      + (n1 choose 2)² O(n1^{-2}) + n1² O(n1^{-1})
      + 2 (n1 choose 2) n1 [O(n1^{-1/2})] [O(n1^{-1/2})]
    = σ1n²(θ) + O(n1²) + O(n1) + O(n1²)
    = σ1n²(θ) + O(n1²) .

The proof of (c) follows from Lemma 2.3.2 when it is noted that

    Pθ{N2R = n2R | Ni = ni, i = 1, 2, 3}
    = (n2 choose n2R) [P2R(θ)/P2(θ)]^{n2R} [P2L(θ)/P2(θ)]^{n2L} ,

which indicates that the conditional distribution of N2R is Binomial
with parameters n2 and [P2(θ)]^{-1} P2R(θ).

Now,

    Eθ[T2n | Ni = ni, i = 1, 2, 3]
    = Eθ[N2R - N2L | Ni = ni, i = 1, 2, 3]
    = n2 [P2(θ)]^{-1} [P2R(θ) - P2L(θ)]
    = μ2n(θ)

and

    Varθ[T2n | Ni = ni, i = 1, 2, 3]
    = Varθ[2N2R - N2 | Ni = ni, i = 1, 2, 3]
    = 4 n2 [P2(θ)]^{-2} P2R(θ) P2L(θ)
    = σ2n²(θ) ,

which completes the proof of Lemma 2.4.1. □
Note that if θ = 0 then Fθ(x) is symmetric by part (a) of Lemma
2.3.1, and as a result

    μ1n(0) = (n1 choose 2) ∫ [1 - F0(-x)] dF0(x) + n1 [1 - F0(0)]
    = (n1 choose 2) ∫ F0(x) dF0(x) + (1/2) n1
    = (1/2) n1 (n1 - 1) ∫_0^1 u du + (1/2) n1
    = (1/4) n1 (n1 - 1) + (1/2) n1
    = n1 (n1 + 1)/4 .                                        (2.4.1)

Similarly,

    σ1n²(0) = 4 (n1 choose 2)² n1^{-1} {(1/3) - (1/2)²}
    = σ_{n1}² + O(n1²) ,                                     (2.4.2)

where

    σ_{n1}² = n1 (n1 + 1)(2n1 + 1)/24

is the null variance of the Wilcoxon signed rank statistic based on n1
observations.

Also, since under (A2) P2R(0) = P2L(0) = (1/2) P2(0),

    μ2n(0) = n2 [P2(0)]^{-1} [P2R(0) - P2L(0)] = 0 .         (2.4.3)

Similarly,

    σ2n²(0) = 4 n2 [P2(0)]^{-2} P2R(0) P2L(0) = n2 .         (2.4.4)
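The null values (2.4.1) and (2.4.2) can be checked by brute force for a small n1: under H0 each of the 2^{n1} sign patterns of the D1j is equally likely, so the exact null mean and variance of the signed rank statistic follow by enumerating all sign assignments.

```python
from itertools import product

def wilcoxon_null_moments(n1):
    """Exact null mean and variance of the Wilcoxon signed rank
    statistic on n1 observations, by enumerating all 2**n1 sign patterns."""
    stats = [sum(rank for rank, s in zip(range(1, n1 + 1), signs) if s)
             for signs in product((0, 1), repeat=n1)]
    mean = sum(stats) / len(stats)
    var = sum((t - mean) ** 2 for t in stats) / len(stats)
    return mean, var

n1 = 6
mean, var = wilcoxon_null_moments(n1)
print(mean, n1 * (n1 + 1) / 4)                  # 10.5 10.5
print(var, n1 * (n1 + 1) * (2 * n1 + 1) / 24)   # 22.75 22.75
```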
In Lemma 2.4.2 it is shown that, conditional on N1 = n1 and N2 = n2,
T1n and T2n have asymptotic normal distributions.

Lemma 2.4.2:

(a) lim_{n1→∞} Pθ{[σ1n(θ)]^{-1} (T1n - μ1n(θ)) ≤ x | Ni = ni, i = 1, 2, 3} = Φ(x) ,

where

    Φ(x) = (2π)^{-1/2} ∫_{-∞}^{x} exp[-y²/2] dy .

(b) lim_{n2→∞} Pθ{[σ2n(θ)]^{-1} (T2n - μ2n(θ)) ≤ x | Ni = ni, i = 1, 2, 3} = Φ(x) .
Proof:

(a) From the proof of Lemma 2.4.1,

    T1n = (n1 choose 2) U1,n1 + n1 W1,n1 .

Therefore, it follows that

    n1^{1/2} (n1 choose 2)^{-1} [T1n - μ1n(θ)]
    = n1^{1/2} [U1,n1 - Eθ U1,n1]
      + n1^{3/2} (n1 choose 2)^{-1} [W1,n1 - Eθ W1,n1] .

Now,

    W1,n1 = n1^{-1} Σ_{j=1}^{n1} ψ(D1j) ,

and since Eθ[ψ(D1j)]² = Pθ(D1j > 0) is finite, then by Corollary 3.2.5
of Randles and Wolfe (1979), W1,n1 converges in quadratic mean to
Eθ W1,n1. Therefore,

    plim n1^{3/2} (n1 choose 2)^{-1} [W1,n1 - Eθ W1,n1] = 0

as n1 → ∞, showing that

    n1^{1/2} (n1 choose 2)^{-1} [T1n - μ1n(θ)]

and

    n1^{1/2} [U1,n1 - Eθ U1,n1]

have the same limiting distribution as n1 → ∞.

By taking the asymptotic variance to be

    n1 (n1 choose 2)^{-2} σ1n²(θ)

in Theorem 3.3.13 of Randles and Wolfe (1979), it is seen that

    [U1,n1 - Eθ(U1,n1)] / [n1^{-1/2} (n1 choose 2)^{-1} σ1n(θ)]

has an asymptotic normal distribution.

Note that σ1n²(θ) = O(n1³), so that
Varθ[T1n | Ni = ni, i = 1, 2, 3] / σ1n²(θ) = 1 + O(n1^{-1}), which
implies that Varθ[T1n | Ni = ni, i = 1, 2, 3] / σ1n²(θ) tends to one as
n1 → ∞.

Therefore,

    [n1^{1/2} (n1 choose 2)^{-1} (T1n - μ1n(θ))]
    / [n1^{1/2} (n1 choose 2)^{-1} σ1n(θ)]
    = [T1n - μ1n(θ)] / σ1n(θ)

has a limiting standard normal distribution, which completes
the proof of (a).

The proof of (b) follows since the conditional distribution
of N2R is binomial with parameters n2 and [P2(θ)]^{-1} P2R(θ). □

Lemma 2.4.3 follows from the conditional independence of T1n and T2n
shown in Lemma 2.4.1.

Lemma 2.4.3:

    lim_{min(n1,n2)→∞} Pθ{ [T1n - μ1n(θ)]/σ1n(θ) ≤ x,
        [T2n - μ2n(θ)]/σ2n(θ) ≤ y | Ni = ni, i = 1, 2, 3}
    = Φ(x) Φ(y) .

The conditional independence of T1n and T2n indicates that two
conditionally independent tests of H0 can be employed using the same
data set. In the next chapter a test for H0 based on a combination of
T1n and T2n is considered.
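Lemma 2.4.2(b) can be illustrated numerically. Under H0, T2n = 2N2R - N2 with N2R distributed Binomial(n2, 1/2), so standardized tail probabilities of T2n should approach standard normal ones as n2 grows. A sketch (the choice n2 = 100 is arbitrary):

```python
from math import comb, erf, sqrt

def binom_cdf_half(k, n):
    """P(Binomial(n, 1/2) <= k), computed exactly."""
    return sum(comb(n, j) for j in range(k + 1)) / 2 ** n

def t2n_cdf(x, n2):
    """P(T2n <= x | N2 = n2) under H0, where T2n = 2*N2R - N2."""
    k = (x + n2) // 2               # largest N2R value with 2*N2R - n2 <= x
    return binom_cdf_half(k, n2)

def std_normal_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

n2 = 100
x = 10                              # one standardized unit: x / sqrt(n2) = 1
print(t2n_cdf(x, n2), std_normal_cdf(x / sqrt(n2)))
```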
CHAPTER THREE
A CLASS OF TESTS FOR
TESTING FOR DIFFERENCE IN LOCATION
3.1 Introduction and Summary
In Chapter Two the general setup was given for the problem of
testing for the difference in locations of two treatments applied to
matched pairs of subjects when random censoring is present. In
addition, two conditionally independent test statistics were proposed
for testing H0: θ = 0. The main focus of this chapter is on a class of
test statistics consisting of certain linear combinations of these two
test statistics.

In Section 3.2 a class of test statistics for testing H0: θ = 0 is
proposed and some properties of the conditional distributions
(conditional on N1 and N2) of the members in this class are stated. In
particular it is shown that the statistics in the proposed class are
conditionally distribution-free. The asymptotic null distributions of
these statistics are shown to be normal in Section 3.3. Section 3.4
contains a worked example to demonstrate the use of one member of this
class for testing H0 when the sample size is small. In Section 3.5 a
method to choose one test statistic from the class of test statistics
is suggested.
3.2 An Exact Test for Difference in Location

In this section a class of exact conditional tests of H0 based on
T1n and T2n is proposed.

Let

    T1n*(N1) = [N1(N1 + 1)(2N1 + 1)/24]^{-1/2} [T1n - N1(N1 + 1)/4] ,

    T2n*(N2) = N2^{-1/2} T2n ,

and let {Ln} be a sequence of random variables satisfying:

Condition I

(a) Ln is only a function of N1 and N2,
(b) 0 ≤ Ln ≤ 1 with probability one,
(c) There exists a constant L such that plim Ln = L as n → ∞.

The proposed test statistic is

    Tn(N1, N2) = (1 - Ln)^{1/2} T1n*(N1) + Ln^{1/2} T2n*(N2) .

Note that under H0 and conditional on Ni = ni, i = 1, 2, T1n*(N1) has a
distribution which is the same as the distribution of a standardized
Wilcoxon signed rank statistic based on n1 observations (Randles and
Wolfe, 1979), while T2n*(N2) is distributed as an independent
standardized Binomial random variable with parameters n2 and 1/2.
Therefore, Tn(N1, N2) is conditionally distribution-free under H0, and
a test of H0 can be based on Tn(N1, N2).

Hereafter, for the sake of simplicity, Tn(n1, n2) will denote the
statistic Tn(N1, N2) conditioned on Ni = ni, i = 1, 2. Useful
properties of Tn(n1, n2) are stated in Lemma 3.2.1.
Lemma 3.2.1: Under H0: θ = 0 and (A1), (A2), (A3), and (A5),

(a) E0[Tn(n1, n2)] = 0,
(b) Var0[Tn(n1, n2)] = 1,
(c) Tn(n1, n2) is symmetrically distributed about 0.

Proof:

Under H0 and conditional on Ni = ni, i = 1, 2, T1n*(N1) has the same
distribution as a standardized Wilcoxon signed rank statistic,
and T2n*(N2) has the same distribution as a standardized Binomial
random variable. Also, conditional on Ni = ni, i = 1, 2, Ln is
nonrandom. Therefore,

    E0[T1n*(N1) | Ni = ni, i = 1, 2, 3] = 0 ,
    E0[T2n*(N2) | Ni = ni, i = 1, 2, 3] = 0 ,
    Var0[T1n*(N1) | Ni = ni, i = 1, 2, 3] = 1 ,

and

    Var0[T2n*(N2) | Ni = ni, i = 1, 2, 3] = 1 .

As a result (a) follows directly, and by Lemma 2.4.1(a) it is clear
that

    Var0[Tn(n1, n2)]
    = (1 - Ln) Var0[T1n*(N1) | Ni = ni, i = 1, 2, 3]
      + Ln Var0[T2n*(N2) | Ni = ni, i = 1, 2, 3]
    = (1 - Ln) + Ln
    = 1 ,

from which (b) follows.

(c) It is known (Randles and Wolfe, 1979) that the Wilcoxon signed
rank statistic and the Binomial random variable with p = 1/2 are
symmetrically distributed about their respective means. Therefore, it
follows that the conditional null distributions of T1n*(N1) and
T2n*(N2) are symmetric about 0 under H0. Since T1n*(N1) and T2n*(N2)
are conditionally independent, the proof of Lemma 3.2.1 is
complete. □
The conditional distribution of Tn(N1, N2) is a convolution of two
well-known discrete distributions (i.e., the Wilcoxon signed rank and
Binomial distributions). The critical values for tests based on
Tn(n1, n2) are easily obtained for small and moderate values of ni,
i = 1, 2. Note that the tables given in the appendix are applicable to
the test statistic

    Tn(n1, n2) = (.5)^{1/2} T1n*(n1) + (.5)^{1/2} T2n*(n2)

for n1 = 1, ..., 10 and n2 = 1, 2, ..., 15 at the .01, .025, .05, and
.10 levels of significance. Tables of critical values for other choices
of Tn(n1, n2) can be similarly produced, but they depend on the
choice of Ln.

For large values of ni, i = 1, 2, a test of H0 can be based on the
asymptotic distribution of Tn(N1, N2), as seen in Section 3.3.
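In principle, tables of this kind can be reproduced by convolving the two exact null distributions. The sketch below enumerates the Wilcoxon and Binomial null distributions, standardizes each, and searches for an upper critical value of Tn, assuming equal weights Ln = 1/2 with square-root weighting as in Lemma 3.2.1; the sample sizes in the example are arbitrary, and no attempt is made to match the appendix tables exactly.

```python
from itertools import product
from math import comb, sqrt

def wilcoxon_null_pmf(n1):
    """Exact null pmf of the signed rank statistic on n1 observations."""
    pmf = {}
    for signs in product((0, 1), repeat=n1):
        t = sum(rank for rank, s in zip(range(1, n1 + 1), signs) if s)
        pmf[t] = pmf.get(t, 0.0) + 0.5 ** n1
    return pmf

def tn_null_distribution(n1, n2):
    """Null distribution of Tn = sqrt(.5)*T1n* + sqrt(.5)*T2n* (Ln = 1/2)."""
    mu1 = n1 * (n1 + 1) / 4
    sd1 = sqrt(n1 * (n1 + 1) * (2 * n1 + 1) / 24)
    dist = {}
    for t1, p1 in wilcoxon_null_pmf(n1).items():
        for k in range(n2 + 1):              # N2R ~ Binomial(n2, 1/2)
            p2 = comb(n2, k) * 0.5 ** n2
            t2 = 2 * k - n2                  # T2n = 2*N2R - N2
            tn = sqrt(.5) * (t1 - mu1) / sd1 + sqrt(.5) * t2 / sqrt(n2)
            key = round(tn, 10)
            dist[key] = dist.get(key, 0.0) + p1 * p2
    return dist

def upper_critical_value(n1, n2, alpha):
    """Smallest t with P(Tn >= t) <= alpha under H0 (None if unattainable)."""
    tail, crit = 0.0, None
    for t, p in sorted(tn_null_distribution(n1, n2).items(), reverse=True):
        tail += p
        if tail > alpha:
            break
        crit = t
    return crit

print(upper_critical_value(6, 8, 0.05))
```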
3.3 The Asymptotic Distribution of the Test Statistic
In this section it will be shown that the null distribution of
Tn(N1, N2) converges in distribution to a standard normal distribution
as n tends to infinity, thereby showing that Tn(N1, N2) is
asymptotically distribution-free. This fact can be utilized to derive
a large sample test of H0.

The main result of this section is Theorem 3.3.3, which will be
proved after establishing several preliminary results. The result
stated in Lemma 3.3.1 is a generalization of Theorem 1 of Anscombe
(1952).
Lemma 3.3.1: Let {T_{n1,n2}} for n1 = 1, 2, ..., n2 = 1, 2, ... be an
array of random variables satisfying conditions (i) and (ii).

Condition (i): There exists a real number γ, an array of positive
numbers {ω_{n1,n2}}, and a distribution function F(·) such that

    lim_{min(n1,n2)→∞} P{T_{n1,n2} - γ ≤ x ω_{n1,n2}} = F(x)

at every continuity point x of F(·).

Condition (ii): Given any ε > 0 and η > 0 there exist
ν = ν(ε, η) and c = c(ε, η) such that whenever min(n1, n2) > ν, then

    P{|T_{n1',n2'} - T_{n1,n2}| < ε ω_{n1,n2} for all n1', n2' such that
      |n1' - n1| < c n1, |n2' - n2| < c n2} > 1 - η .

Let {nr} be an increasing sequence of positive integers tending to
infinity and let {N1r} and {N2r} be random variables taking on positive
integer values such that

    plim (λi nr)^{-1} Nir = 1

as r → ∞ for 0 < λi < 1, i = 1, 2.

Then at every continuity point x of F(·),

    lim_{r→∞} P{T_{N1r,N2r} - γ ≤ x ω_{[λ1 nr],[λ2 nr]}} = F(x) ,

where [x] denotes the greatest integer less than or equal to x.
33
Proof:
For given $\varepsilon>0$ and $\eta>0$ let $\nu^*$ and $c$ satisfy condition (ii). Also, let $\nu^{**}$ be chosen such that for $\min(\lambda_1 n_r, \lambda_2 n_r) > \nu^{**}$,
$$P\{|N_{ir} - \lambda_i n_r| < c\,\lambda_i n_r\} > 1 - \eta/2, \quad i=1,2.$$
Then for $\min(\lambda_1 n_r, \lambda_2 n_r) > \max(\nu^*, \nu^{**}) = \nu$,
$$
\begin{aligned}
P\{|N_{1r} - \lambda_1 n_r| < c\lambda_1 n_r \text{ and } |N_{2r} - \lambda_2 n_r| < c\lambda_2 n_r\}
&\ge P\{|N_{1r} - \lambda_1 n_r| < c\lambda_1 n_r\} + P\{|N_{2r} - \lambda_2 n_r| < c\lambda_2 n_r\} - 1 \\
&\ge (1 - \eta/2) + (1 - \eta/2) - 1 = 1 - \eta. \qquad (3.3.1)
\end{aligned}
$$
Let $n_{1r} = [\lambda_1 n_r]$ and $n_{2r} = [\lambda_2 n_r]$ and define the events $A_r$, $B_r$, $O_r$, $D_r$, and $E_r$ as follows:

$A_r$: $|T_{N_{1r},N_{2r}} - T_{n_{1r},n_{2r}}| < \varepsilon\,\omega_{n_{1r},n_{2r}}$;

$B_r$: $|T_{n_1',n_2'} - T_{n_{1r},n_{2r}}| < \varepsilon\,\omega_{n_{1r},n_{2r}}$ for all $n_i'$, $i=1,2$, such that $|n_i' - n_{ir}| < c\,n_{ir}$, $i=1,2$;

$O_r$: $|N_{1r} - n_{1r}| < c\,n_{1r}$ and $|N_{2r} - n_{2r}| < c\,n_{2r}$;

$D_r$: $T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}}$ and $A_r$;

$E_r$: $T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}}$ and $A_r^c$,

where $A_r^c$ denotes the complement of $A_r$.

Note that for given $r$ such that $\min(n_{1r},n_{2r}) > \nu$, the events $B_r$ and $O_r$ together imply $A_r$: on $O_r$, the pair $(N_{1r}, N_{2r})$ satisfies the constraint in the definition of $B_r$, so that $|T_{N_{1r},N_{2r}} - T_{n_{1r},n_{2r}}| < \varepsilon\,\omega_{n_{1r},n_{2r}}$. Hence, for given $r$ such that $\min(n_{1r},n_{2r}) > \nu$,
$$P\{A_r\} \ge P\{B_r \text{ and } O_r\} = P\{B_r\} - P\{B_r \text{ and } O_r^c\} \ge P\{B_r\} - P\{O_r^c\}.$$
By condition (ii), $P\{B_r\} > 1-\eta$, and by (3.3.1), $P\{O_r\} > 1-\eta$, which implies $P\{O_r^c\} < \eta$. Therefore, if $\min(n_{1r},n_{2r}) > \nu$, then
$$P\{A_r\} \ge 1 - 2\eta, \qquad (3.3.2)$$
which implies $P\{A_r^c\} \le 2\eta$.
Now,
$$
\begin{aligned}
P\{D_r\} &= P\{T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}} \text{ and } |T_{N_{1r},N_{2r}} - T_{n_{1r},n_{2r}}| < \varepsilon\,\omega_{n_{1r},n_{2r}}\} \\
&= P\{T_{n_{1r},n_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}} + (T_{n_{1r},n_{2r}} - T_{N_{1r},N_{2r}}) \text{ and } |T_{N_{1r},N_{2r}} - T_{n_{1r},n_{2r}}| < \varepsilon\,\omega_{n_{1r},n_{2r}}\}.
\end{aligned}
$$
Since $|T_{N_{1r},N_{2r}} - T_{n_{1r},n_{2r}}| < \varepsilon\,\omega_{n_{1r},n_{2r}}$ implies
$$x\,\omega_{n_{1r},n_{2r}} + (T_{n_{1r},n_{2r}} - T_{N_{1r},N_{2r}}) < (x + \varepsilon)\,\omega_{n_{1r},n_{2r}},$$
it follows that
$$P\{D_r\} \le P\{T_{n_{1r},n_{2r}} - \gamma \le (x + \varepsilon)\,\omega_{n_{1r},n_{2r}}\}. \qquad (3.3.3)$$

Now, $D_r$ and $E_r$ are mutually exclusive events such that $D_r \cup E_r$ is the event $T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}}$, so that
$$P\{D_r \cup E_r\} = P\{D_r\} + P\{E_r\} \le P\{T_{n_{1r},n_{2r}} - \gamma \le (x+\varepsilon)\,\omega_{n_{1r},n_{2r}}\} + P\{A_r^c\}.$$
Therefore, from (3.3.3), for $\min(n_{1r},n_{2r}) > \nu$,
$$P\{T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}}\} \le P\{T_{n_{1r},n_{2r}} - \gamma \le (x+\varepsilon)\,\omega_{n_{1r},n_{2r}}\} + 2\eta. \qquad (3.3.4)$$
Now, for $\min(n_{1r},n_{2r}) > \nu$,
$$
\begin{aligned}
P\{D_r\} &= P\{T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}} \text{ and } A_r\} \\
&= P\{T_{n_{1r},n_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}} + (T_{n_{1r},n_{2r}} - T_{N_{1r},N_{2r}}) \text{ and } A_r\}.
\end{aligned}
$$
Now, since the event $A_r$ implies
$$-\varepsilon\,\omega_{n_{1r},n_{2r}} \le T_{n_{1r},n_{2r}} - T_{N_{1r},N_{2r}},$$
the event
$$\{T_{n_{1r},n_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}} - \varepsilon\,\omega_{n_{1r},n_{2r}} \text{ and } A_r\}$$
implies the event
$$\{T_{n_{1r},n_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}} + (T_{n_{1r},n_{2r}} - T_{N_{1r},N_{2r}}) \text{ and } A_r\}.$$
Therefore,
$$
\begin{aligned}
P\{D_r\} &\ge P\{T_{n_{1r},n_{2r}} - \gamma \le (x-\varepsilon)\,\omega_{n_{1r},n_{2r}} \text{ and } A_r\} \\
&\ge P\{T_{n_{1r},n_{2r}} - \gamma \le (x-\varepsilon)\,\omega_{n_{1r},n_{2r}}\} + P\{A_r\} - 1.
\end{aligned}
$$
Therefore, from (3.3.2), if $\min(n_{1r},n_{2r}) > \nu$, then
$$P\{D_r\} \ge P\{T_{n_{1r},n_{2r}} - \gamma \le (x-\varepsilon)\,\omega_{n_{1r},n_{2r}}\} + (1 - 2\eta) - 1 = P\{T_{n_{1r},n_{2r}} - \gamma \le (x-\varepsilon)\,\omega_{n_{1r},n_{2r}}\} - 2\eta.$$
As a result, for given $r$ such that $\min(n_{1r},n_{2r}) > \nu$,
$$P\{T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}}\} = P\{D_r\} + P\{E_r\} \ge P\{D_r\} \ge P\{T_{n_{1r},n_{2r}} - \gamma \le (x-\varepsilon)\,\omega_{n_{1r},n_{2r}}\} - 2\eta. \qquad (3.3.5)$$
In view of equations (3.3.4) and (3.3.5), it follows that for given $r$ such that $\min(n_{1r},n_{2r}) > \nu$,
$$P\{T_{n_{1r},n_{2r}} - \gamma \le (x-\varepsilon)\,\omega_{n_{1r},n_{2r}}\} - 2\eta \le P\{T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}}\} \le P\{T_{n_{1r},n_{2r}} - \gamma \le (x+\varepsilon)\,\omega_{n_{1r},n_{2r}}\} + 2\eta. \qquad (3.3.6)$$
Furthermore, condition (i) implies
$$\lim_{r\to\infty} P\{T_{n_{1r},n_{2r}} - \gamma \le (x-\varepsilon)\,\omega_{n_{1r},n_{2r}}\} = F(x-\varepsilon)$$
and
$$\lim_{r\to\infty} P\{T_{n_{1r},n_{2r}} - \gamma \le (x+\varepsilon)\,\omega_{n_{1r},n_{2r}}\} = F(x+\varepsilon)$$
at all continuity points $(x-\varepsilon)$ and $(x+\varepsilon)$ of $F(\cdot)$. Therefore, if $x$ is a continuity point of $F(\cdot)$, then $(x-\varepsilon)$ and $(x+\varepsilon)$ can be chosen to be continuity points of $F(\cdot)$, since the set of continuity points is dense. Consequently,
$$\lim_{r\to\infty} P\{T_{N_{1r},N_{2r}} - \gamma \le x\,\omega_{n_{1r},n_{2r}}\} = F(x),$$
which completes the proof of Lemma 3.3.1. □
Lemma 3.3.2 below follows from Lemmas 2.4.1, 2.4.3, and 3.2.1.

Lemma 3.3.2: Let $L_n = L_n(N_1,N_2)$ be defined as in Condition I. Under $H_0$: $\theta=0$, and (A1), (A2), (A3), and (A4),
$$P_0\{T_n(N_1,N_2) \le x \mid N_i = n_i,\ i=1,2,3\} \text{ tends to } \Phi(x)$$
as $\min(n_1,n_2) \to \infty$, subject to the condition
$$\lim_{\min(n_1,n_2)\to\infty} L_n(n_1,n_2) = L.$$

Proof:
Lemma 2.4.3 shows that $\sigma_{1n}^{-1}(\theta)[T_{1n} - \mu_{1n}(\theta)]$ and $\sigma_{2n}^{-1}(\theta)[T_{2n} - \mu_{2n}(\theta)]$ possess conditional distributions, conditional on $N_i = n_i$, $i=1,2,3$, which converge to standard normal distributions as $\min(n_1,n_2)$ tends to infinity. From equations (2.4.1) and (2.4.2) it is seen that
$$\mu_{1n}(0) = \frac{n_1(n_1+1)}{4}$$
and
$$\lim_{n_1\to\infty} \sigma_{1n}^2(0) \Big/ \frac{n_1(n_1+1)(2n_1+1)}{24} = 1.$$
Therefore, under $H_0$, $T^*_{1n}(N_1)$ has a conditional distribution which is asymptotically standard normal. Similarly, under $H_0$: $\theta=0$, $T^*_{2n}(N_2)$ has a conditional distribution which is asymptotically standard normal. As a consequence of Lemmas 2.4.3 and 3.2.1, Lemma 3.3.2 holds. □
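The null mean and variance of the Wilcoxon signed rank statistic used in this proof can be checked by brute-force enumeration; the helper below (illustrative, not from the dissertation) computes the exact moments for small $n_1$:

```python
from itertools import product

def signed_rank_moments(n):
    """Exact null mean and variance of the Wilcoxon signed rank statistic,
    found by enumerating all 2^n equally likely sign patterns."""
    values = [sum(rank for rank, sign in zip(range(1, n + 1), signs) if sign)
              for signs in product((0, 1), repeat=n)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var
```

The enumerated moments agree with the closed forms $n_1(n_1+1)/4$ and $n_1(n_1+1)(2n_1+1)/24$ quoted above.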
The following theorem contains the main result of this section. The proof makes use of a result of Sproule (1974), stated here as Lemma 3.3.3.

Lemma 3.3.3: Suppose that
$$U_n = \binom{n}{r}^{-1} \sum_{\beta \in B} f(X_{i_1},\dots,X_{i_r}),$$
where $B$ is the set of all unordered subsets of $r$ integers chosen without replacement from the set of integers $\{1,2,\dots,n\}$, is a U-statistic of degree $r$ with a symmetric kernel $f(\cdot)$. Let $\{n_r\}$ be an increasing sequence of positive integers tending to $\infty$ as $r \to \infty$ and let $\{N_r\}$ be a sequence of random variables taking on positive integer values with probability one. If $E\{f(X_1,\dots,X_r)\}^2 < \infty$,
$$\lim_{n\to\infty} \operatorname{Var}(n^{1/2} U_n) = r^2 \zeta_1 > 0,$$
and
$$\operatorname*{plim}_{r\to\infty}\, n_r^{-1} N_r = 1,$$
then
$$\lim_{r\to\infty} P\{(U_{N_r} - E U_{N_r}) \le N_r^{-1/2}\, x\, (r^2 \zeta_1)^{1/2}\} = \Phi(x).$$

Proof:
It should be noted that in order to prove Lemma 3.3.3, only condition C2 of Anscombe (1952) needs to be verified, since condition C1 is valid under the hypothesis. If C1 and C2 of Anscombe (1952) are valid, then Theorem 1 of Anscombe (1952), and thus Lemma 3.3.3, is valid. This proof is contained in the proof of Theorem 6 of Sproule (1974). □
Theorem 3.3.4: Under $H_0$: $\theta=0$, and (A1), (A2), (A3), (A4), and (A5), for every real value $x$,
$$\lim_{n\to\infty} P_0\{T_n(N_1,N_2) \le x\} = \Phi(x).$$

Proof:
Let $T_{n_1,n_2} = T_n(n_1,n_2)$. Then Lemma 3.3.2 shows that $\{T_{n_1,n_2}\}$, for $n_1=1,2,\dots$, $n_2=1,2,\dots$, satisfies condition (i) of Lemma 3.3.1 with $\gamma=0$ and $\omega_{n_1,n_2} = 1$. Note that under $H_0$,
$$p_i(0) = \operatorname*{plim}_{n\to\infty}\, n^{-1} N_i = \lambda_i.$$
Therefore, from assumption (A5) it can be seen that $\lambda_i > 0$ for at least one $i=1,2$. Since $p_i(0) = 0$ implies $P_0\{N_i > 0\} \to 0$, $T_n(N_1,N_2)$ is equivalent to one $T^*_{in}(N_i)$ if $p_i(0) = 0$ for exactly one $i=1,2$. In that case Theorem 3.3.4 follows by a straightforward application of Theorem 1 of Anscombe (1952) and Lemma 3.3.3, in a manner similar to the argument which follows for the case where $\lambda_i > 0$ for $i=1,2$. Consequently, it will be assumed in the following argument that $\lambda_i > 0$ for $i=1,2$. Now all that is needed to prove Theorem 3.3.4 is to show that condition (ii) of Lemma 3.3.1 is satisfied.

Lemma 2.4.2 shows that $\sigma_{1n}^{-1}(\theta)[T_{1n} - \mu_{1n}(\theta)]$ has a limiting standard normal distribution. As a result, $T^*_{1n}(n_1)$ has a limiting standard normal distribution under $H_0$: $\theta=0$. Now, under $H_0$ it can be seen that
$$T^*_{1n}(n_1) = (3n_1)^{1/2}(U_{1,n_1} - E_0 U_{1,n_1}) + o_p(1), \qquad (3.3.7)$$
where $U_{1,n_1}$ is the U-statistic defined in the proof of Lemma 2.4.1(b). As a consequence of Lemma 3.3.3, $U_{1,n_1}$ satisfies condition C2 of Anscombe (1952). In other words, for given $\varepsilon_1 > 0$ and $\eta > 0$, there exist $\nu_1$ and $d_1 > 0$ such that for any $n_1 > \nu_1$,
$$P\{(3n_1)^{1/2}|U_{1,n_1'} - U_{1,n_1}| < \varepsilon_1 \text{ for all } n_1' \text{ such that } |n_1' - n_1| < d_1 n_1\} > 1 - \eta.$$
As a result of (3.3.7), it can be seen that $T^*_{1n}(n_1)$ also satisfies condition C2 of Anscombe (1952). In other words, for given $\varepsilon_1 > 0$ and $\eta > 0$, there exist $\nu_1$ and $d_1 > 0$ such that for any $n_1 > \nu_1$,
$$P\{|T^*_{1n}(n_1') - T^*_{1n}(n_1)| < \varepsilon_1 \text{ for all } n_1' \text{ such that } |n_1' - n_1| < d_1 n_1\} > 1 - \eta. \qquad (3.3.8)$$
Now let $D_{21},\dots,D_{2n_2}$ be as defined in equation (2.2.1), let $D^0_{2i}$ denote the true difference corresponding to $D_{2i}$, and let $I[D_{2i} < D^0_{2i}] = 1$ or $0$ according to whether $D_{2i} < D^0_{2i}$ or $D_{2i} \ge D^0_{2i}$. Note that, conditional on $N_2 = n_2$,
$$T^*_{2n}(n_2) = n_2^{-1/2}\Big(2\sum_{i=1}^{n_2} I[D_{2i} < D^0_{2i}] - n_2\Big) = 2 n_2^{1/2}\, U_{2,n_2} - n_2^{1/2},$$
where
$$U_{2,n_2} = n_2^{-1} \sum_{i=1}^{n_2} I[D_{2i} < D^0_{2i}]$$
is the U-statistic estimator of $P_\theta[D_{2i} < D^0_{2i} \mid D_{2i} \in C_2]$. Now an argument similar to the one used above for $T^*_{1n}(n_1)$ shows that $U_{2,n_2}$ satisfies condition C2 of Anscombe (1952). Thus, under $H_0$, $T^*_{2n}(n_2)$ satisfies condition C2. In other words, for given $\varepsilon_2 > 0$ and $\eta > 0$, there exist $\nu_2$ and $d_2 > 0$ such that for any $n_2 > \nu_2$,
$$P\{|T^*_{2n}(n_2') - T^*_{2n}(n_2)| < \varepsilon_2 \text{ for all } n_2' \text{ such that } |n_2' - n_2| < d_2 n_2\} > 1 - \eta. \qquad (3.3.9)$$
Let $\varepsilon > 0$ and $\eta > 0$ be given and let $\nu_1$, $\nu_2$, $d_1$, $d_2$ satisfy (3.3.8) and (3.3.9). Let $\nu = \max(\nu_1,\nu_2)$ and $c = \min(d_1,d_2)$. Consider
$$T'_{n_1,n_2} = (1-L)^{1/2}\, T^*_{1n}(n_1) + L^{1/2}\, T^*_{2n}(n_2).$$
Then
$$
\begin{aligned}
&P\{|T'_{n_1',n_2'} - T'_{n_1,n_2}| < 2\varepsilon \text{ for all } n_1', n_2' \text{ such that } |n_1'-n_1| < c n_1,\ |n_2'-n_2| < c n_2\} \\
&\quad\ge P\{(1-L)^{1/2}|T^*_{1n}(n_1') - T^*_{1n}(n_1)| + L^{1/2}|T^*_{2n}(n_2') - T^*_{2n}(n_2)| < 2\varepsilon \text{ for all such } n_1', n_2'\} \\
&\quad\ge P\{(1-L)^{1/2}|T^*_{1n}(n_1') - T^*_{1n}(n_1)| < \varepsilon \text{ and } L^{1/2}|T^*_{2n}(n_2') - T^*_{2n}(n_2)| < \varepsilon \text{ for all such } n_1', n_2'\} \\
&\quad\ge P\{(1-L)^{1/2}|T^*_{1n}(n_1') - T^*_{1n}(n_1)| < \varepsilon \text{ for all } n_1' \text{ such that } |n_1'-n_1| < c n_1\} \\
&\qquad + P\{L^{1/2}|T^*_{2n}(n_2') - T^*_{2n}(n_2)| < \varepsilon \text{ for all } n_2' \text{ such that } |n_2'-n_2| < c n_2\} - 1. \qquad (3.3.10)
\end{aligned}
$$
Assume $0 < L < 1$; for, if not, then by a straightforward application of Lemma 3.3.3 and Theorem 1 of Anscombe (1952),
$$T_n(N_1,N_2) = (1-L)^{1/2}\, T^*_{1n}(N_1) + L^{1/2}\, T^*_{2n}(N_2)$$
has an asymptotic standard normal distribution as $n \to \infty$. Now, applying (3.3.8) and (3.3.9) to (3.3.10) with $\varepsilon_1 = \varepsilon(1-L)^{-1/2}$ and $\varepsilon_2 = \varepsilon L^{-1/2}$ yields, for $\min(n_1,n_2) > \nu$,
$$P\{|T'_{n_1',n_2'} - T'_{n_1,n_2}| < 2\varepsilon \text{ for all } n_1', n_2' \text{ such that } |n_1'-n_1| < c n_1,\ |n_2'-n_2| < c n_2\} \ge (1-\eta) + (1-\eta) - 1 = 1 - 2\eta.$$
Therefore, $T'_{n_1,n_2}$ satisfies condition (ii) of Lemma 3.3.1, so that the theorem is valid for
$$T'_n(N_1,N_2) = (1-L)^{1/2}\, T^*_{1n}(N_1) + L^{1/2}\, T^*_{2n}(N_2).$$
To see that the theorem is true if $L$ is replaced by $L_n$, consider
$$
\begin{aligned}
T_n(N_1,N_2) - T'_n(N_1,N_2) &= (1-L_n)^{1/2}\, T^*_{1n}(N_1) + L_n^{1/2}\, T^*_{2n}(N_2) - (1-L)^{1/2}\, T^*_{1n}(N_1) - L^{1/2}\, T^*_{2n}(N_2) \\
&= [(1-L_n)^{1/2} - (1-L)^{1/2}]\, T^*_{1n}(N_1) + [L_n^{1/2} - L^{1/2}]\, T^*_{2n}(N_2). \qquad (3.3.11)
\end{aligned}
$$
Since $T^*_{1n}(N_1)$ and $T^*_{2n}(N_2)$ converge in distribution to standard normal random variables, $T^*_{1n}(N_1)$ and $T^*_{2n}(N_2)$ are $O_p(1)$ (Serfling, 1980, p. 8). Also, since $\operatorname{plim} L_n = L$ as $n \to \infty$ by Condition I, $[(1-L_n)^{1/2} - (1-L)^{1/2}]$ and $[L_n^{1/2} - L^{1/2}]$ are $o_p(1)$. Consequently, (3.3.11) shows that
$$T_n(N_1,N_2) - T'_n(N_1,N_2) = o_p(1),$$
thus proving Theorem 3.3.4. □
As a consequence of the results obtained in Sections 3.2 and 3.3, it is clear that a distribution-free test of $H_0$: $\theta=0$ can be based on $T_n(N_1,N_2)$. For small sample sizes, exact tests utilizing the conditional distribution of $T_n(N_1,N_2)$ are possible, and for large sample sizes the asymptotic normality of $T_n(N_1,N_2)$ can be used. In the next section the use of $T_n(N_1,N_2)$ is illustrated with a worked example.
3.4 A Small-Sample Example

To illustrate the use of the statistic $T_n(N_1,N_2)$ developed in Sections 3.2 and 3.3, the data set considered by Woolson and Lachenbruch (1980), which is a slightly modified version of a data set of Holt and Prentice (1974), will be considered in this section. The data set is reproduced below.

TABLE 3.1
DAYS OF SURVIVAL OF SKIN GRAFTS ON BURN PATIENTS

Patient                          1    2    3    4    5    6    7    8    9   10   11
Survival of close
match (exp(X_i^0))              37   19  57+   93   16   22   20   18   63   29  60+
Survival of poor
match (exp(Y_i^0))              29   13   15   26   11   17   26   21   43   15   40

Note. 57+ and 60+ denote right-censored observations at 57 and 60.
The model considered by Woolson and Lachenbruch (1980) is
$$\exp(X_i^0) = \phi\, V_{1i} W_i, \qquad \exp(Y_i^0) = V_{2i} W_i, \qquad (3.4.1)$$
where $\phi > 0$ is an unknown parameter, $V_{1i}$ and $V_{2i}$ are independent and identically distributed nonnegative random variables, and $W_i$ is an independent nonnegative random variable for all $i$. Taking natural logarithms in (3.4.1) yields
$$X_i^0 = \theta + \log V_{1i} + \log W_i, \qquad Y_i^0 = \log V_{2i} + \log W_i, \qquad (3.4.2)$$
where $\theta = \log\phi$. It is clear that a test of $H_0$: $\phi = 1$ is equivalent to a test of $H_0$: $\theta=0$. Now, considering the differences $X_i^0 - Y_i^0$, the data set is given in Table 3.2.
TABLE 3.2
LOGARITHMS OF THE DIFFERENCE IN SURVIVAL TIME OF THE SKIN GRAFTS

Patient              1        2        3        4        5        6
X_i^0 - Y_i^0      0.2436   0.3795  1.3350+  1.2745   0.3747   0.2578

Patient              7        8        9       10       11
X_i^0 - Y_i^0     -0.2624  -0.1542   0.3819   0.6592  0.4055+

As can be seen from the data, $N_1 = 9$, $N_{2L} = 0$, $N_{2R} = 2$, and $N_2 = 2$. It follows by some elementary calculations that
$$T_{1n} = 40, \quad T_{2n} = 2, \quad T^*_{1n}(9) = 2.073, \quad T^*_{2n}(2) = 1.414.$$
The test statistic $T_n(9,2)$ can now be calculated if $L_n$ is known. Assuming that nothing more is known about the underlying distributions, a reasonable choice would be $L_n = .5$ and, consequently, $(1 - L_n) = .5$. Note that for this choice equal weight is given to
$T^*_{1n}(N_1)$ and $T^*_{2n}(N_2)$ in the calculation of $T_n(N_1,N_2)$. Now,
$$T_n(9,2) = (.5)^{1/2}\, T^*_{1n}(9) + (.5)^{1/2}\, T^*_{2n}(2) = 2.466.$$
Comparing this result with the critical values appearing in the tables given in the Appendix, it can be seen that
$$P_0\{T_n(9,2) \ge 2.298\} = 0.0093,$$
so that the observed value of 2.466 is significant at the .01 level.
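The elementary calculations above can be reproduced directly from the data of Table 3.2; the following sketch (an illustration, not part of the original analysis) computes the signed rank part, the binomial part, and their combination:

```python
from math import sqrt

# Differences X_i - Y_i from Table 3.2; the two right-censored differences
# (patients 3 and 11) are excluded here and enter only through T_2n.
uncensored = [0.2436, 0.3795, 1.2745, 0.3747, 0.2578,
              -0.2624, -0.1542, 0.3819, 0.6592]
n2, t2n = 2, 2                    # N_2 = 2 singly censored pairs, both positive

# T_1n: Wilcoxon signed rank statistic on the N_1 = 9 uncensored differences
n1 = len(uncensored)
order = sorted(range(n1), key=lambda i: abs(uncensored[i]))
t1n = sum(rank + 1 for rank, i in enumerate(order) if uncensored[i] > 0)

# Standardize each part and combine with equal weights (.5)^(1/2)
t1_star = (t1n - n1 * (n1 + 1) / 4) / sqrt(n1 * (n1 + 1) * (2 * n1 + 1) / 24)
t2_star = (2 * t2n - n2) / sqrt(n2)
t_combined = sqrt(0.5) * t1_star + sqrt(0.5) * t2_star   # T_n(9, 2)
```

Running this yields $T_{1n}=40$, $T^*_{1n}(9)=2.073$, $T^*_{2n}(2)=1.414$, and $T_n(9,2)=2.466$, matching the values quoted above.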
As shown in Woolson and Lachenbruch (1980), the WL test statistic with logistic scores yields the large-sample test statistic $Z = 2.49$, which indicates significance at the .01 level. Also, the exact level for the WL test statistic based on the permutation distribution of the logistic scores is 11/2048, or 0.00537, again showing significance at the .01 level. Note that the value of 8/2048 given in Woolson and Lachenbruch (1980) is incorrect due to a minor computational error. In the next section a method for choosing $L_n$ is outlined.
3.5 Choosing the Test Statistic

In Section 3.4 an example was presented which illustrated the testing of $H_0$: $\theta=0$ by using a linear combination of $T^*_{1n}(N_1)$ and $T^*_{2n}(N_2)$. In this section a method for selecting the linear combination with the maximum conditional Pitman Asymptotic Relative Efficiency (PARE) is outlined. The main result of this section is contained in Theorem 3.5.3, which is proved after establishing some necessary preliminary results.
The following lemma is a restatement of a result of Callaert and Janssen (1978).

Lemma 3.5.1:
Let $U_n$ be a U-statistic of degree $r=2$ as defined in Lemma 3.3.3 such that
$$\lim_{n\to\infty} \operatorname{Var}(n^{1/2} U_n) = 4\zeta_1 > 0.$$
If $\nu_3 = E|f(X_1,X_2)|^3 < \infty$, then there exists a constant $M$ such that for all $n \ge 2$,
$$\sup_{-\infty < x < \infty} \big| P\{n^{1/2}(4\zeta_1)^{-1/2}(U_n - EU_n) \le x\} - \Phi(x) \big| \le M\, \nu_3\, \zeta_1^{-3/2}\, n^{-1/2}.$$

Proof:
The proof is given by Callaert and Janssen (1978). □
Lemma 3.5.2:
Let $\mu_{in}(\theta)$ and $\sigma^2_{in}(\theta)$, $i=1,2$, be as defined in Lemma 2.4.1, and let
$$\sigma_1^2(\theta) = P_\theta[D_{11} + D_{12} > 0,\ D_{11} + D_{13} > 0] - \{P_\theta[D_{11} + D_{12} > 0]\}^2$$
and
$$\sigma_2^2(\theta) = [p_2(\theta)]^{-2}\, p_{2R}(\theta)\, p_{2L}(\theta).$$
Suppose that for some $\delta > 0$,

(i) $\inf_{0\le\theta\le\delta} \sigma_1^2(\theta) = \sigma_{10}^2 > 0$,

(ii) $\inf_{0\le\theta\le\delta} \sigma_2^2(\theta) = \sigma_{20}^2 > 0$.

Then
$$\lim P\{\sigma_{in}^{-1}(\theta)[T_{in} - \mu_{in}(\theta)] \le x \mid N_1 = n_1,\ N_2 = n_2\} = \Phi(x), \quad i=1,2,$$
uniformly in $\theta \in [0,\delta]$ for every $x$.

Proof:
The proof follows from Lemma 3.5.1 concerning the rate of convergence to normality of the distribution of a U-statistic.
In Lemma 2.4.2(a) it was shown that
$$\sigma_{1n}^{-1}(\theta)[T_{1n} - \mu_{1n}(\theta)] = \binom{n_1}{2}\sigma_{1n}^{-1}(\theta)[U_{1,n_1} - E_\theta U_{1,n_1}] + n_1\,\sigma_{1n}^{-1}(\theta)[W_{1,n_1} - E_\theta W_{1,n_1}], \qquad (3.5.1)$$
where
$$U_{1,n_1} = \binom{n_1}{2}^{-1} \sum_{1\le i<j\le n_1} \Psi(D_{1i} + D_{1j})
\quad\text{and}\quad
W_{1,n_1} = n_1^{-1} \sum_{j=1}^{n_1} \Psi(D_{1j} + D_{1j}).$$
It will now be shown that $[W_{1,n_1} - E_\theta W_{1,n_1}]$ converges to 0 uniformly in $\theta \in [0,\delta]$, and that the distribution of $n_1^{1/2}[U_{1,n_1} - E_\theta U_{1,n_1}]$ converges to a normal distribution uniformly in $\theta \in [0,\delta]$. In view of equation (3.5.1), this will complete the proof of Lemma 3.5.2 for $i=1$.

It is clear that $W_{1,n_1}$ is an unbiased estimator of $P_\theta[D_{11} > 0]$ and has variance
$$\operatorname{Var}(W_{1,n_1}) = n_1^{-1}\, P_\theta[D_{11} > 0]\,[1 - P_\theta(D_{11} > 0)] \le (4n_1)^{-1}.$$
Therefore, by Chebyshev's inequality (Randles and Wolfe, 1979),
$$P\{|W_{1,n_1} - E_\theta W_{1,n_1}| \ge \varepsilon\} \le \frac{\operatorname{Var}(W_{1,n_1})}{\varepsilon^2} \le \frac{1}{4 n_1 \varepsilon^2},$$
which implies that $[W_{1,n_1} - E_\theta W_{1,n_1}]$ converges in probability to 0 uniformly in $\theta \in [0,\delta]$. Since $n_1\,\sigma_{1n}^{-1}(\theta)$ converges to zero as $n_1$ tends to infinity, the second term on the right-hand side of equation (3.5.1) converges to zero uniformly in $\theta \in [0,\delta]$.
Let $\nu_3 = E_\theta[|\Psi(D_{11} + D_{12})|^3]$. Now, from Lemma 3.5.1 it follows that for some positive constant $M$ and $n_1 \ge 2$,
$$\sup_x \big| P_\theta\{2^{-1} n_1^{1/2}\, \sigma_1^{-1}(\theta)[U_{1,n_1} - E_\theta U_{1,n_1}] \le x \mid N_1 = n_1,\ N_2 = n_2\} - \Phi(x) \big| \le M\, \nu_3\, [\sigma_1(\theta)]^{-3}\, n_1^{-1/2}. \qquad (3.5.2)$$
Note that $\nu_3 \le 1$ by the definition of the function $\Psi(\cdot)$. Further, by assumption (i) the right-hand side of equation (3.5.2) is less than or equal to $M\, \sigma_{10}^{-3}\, n_1^{-1/2}$ for $0 \le \theta \le \delta$. Consequently, the distribution of $n_1^{1/2}[U_{1,n_1} - E_\theta U_{1,n_1}]$ converges in law to a normal distribution as $n_1$ tends to infinity, uniformly in $\theta \in [0,\delta]$, thus proving the lemma for $i=1$. The proof for $i=2$ follows similarly, except that assumption (ii) is needed for the application of Lemma 3.5.1 to the U-statistic $U_{2,n_2}$, where $U_{2,n_2}$ is defined as in the proof of Lemma 2.4.1(c). □
The main concern of this section is choosing a test statistic of the form
$$T_{\lambda n} = (1-\lambda)^{1/2}\, T^*_{1n}(N_1) + \lambda^{1/2}\, T^*_{2n}(N_2), \qquad (3.5.3)$$
where $0 < \lambda < 1$. The next theorem provides a method for selecting $\lambda$ such that the corresponding test procedure, given by

Reject $H_0$ if $T_{\lambda n} > t$; do not reject $H_0$ otherwise, (3.5.4)

where $t$ is a constant, has the maximum PARE relative to all test procedures corresponding to statistics of the form $T_{\lambda n}$.
Theorem 3.5.3:
Let $\mu_{in}(\theta)$ and $\sigma^2_{in}(\theta)$, $i=1,2$, be as defined in Lemma 2.4.1. Suppose the following conditions hold:

(i) For some $\delta > 0$ and $\theta \in [0,\delta]$, $\mu_{in}(\theta)$ is differentiable with respect to $\theta$ and
$$\mu'_{in}(0) = \frac{d}{d\theta}\,\mu_{in}(\theta)\Big|_{\theta=0} > 0, \quad i=1,2.$$

(ii) There exist positive constants $k_i$, $i=1,2$, such that
$$k_i = \lim_{n_i\to\infty}\, [\sigma_{in}(0)]^{-1}\,[n_i^{-1/2}\,\mu'_{in}(0)], \quad i=1,2.$$

(iii) $\lim_{n_i\to\infty} \mu'_{in}(\theta_{n_i}) \big/ \mu'_{in}(0) = 1$, $i=1,2$, where $\theta_{n_i} = O(n_i^{-1/2})$.

(iv) $\lim_{n_i\to\infty} \sigma_{in}(\theta_{n_i}) \big/ \sigma_{in}(0) = 1$, $i=1,2$, where $\theta_{n_i} = O(n_i^{-1/2})$.

(v) For some $\delta > 0$, conditions (i) and (ii) of Lemma 3.5.2 hold.

Then among all tests for testing $H_0$: $\theta = 0$ against $H_a$: $\theta > 0$ having the form (3.5.4), the test corresponding to the test statistic $T_{\lambda n}$ with
$$\lambda = (k_1^2 + k_2^2)^{-1}\, k_2^2$$
has maximum conditional PARE (as $\min(n_1,n_2) \to \infty$) relative to all tests of the form (3.5.4).
Proof:
In view of Theorem 10.2.3 of Serfling (1980), the proof is complete if the Pitman conditions P(1) through P(6) (see Serfling (1980), p. 317 and p. 323) are verified for $T_{1n}$ and $T_{2n}$, conditional on $N_1 = n_1$, $N_2 = n_2$. Conditions P(2) through P(5) follow directly from assumptions (i), (ii), (iii), and (iv), while P(1) follows from assumption (v) and Lemma 3.5.2. To verify condition P(6), note that by Lemma 2.4.2 the conditional asymptotic joint distribution of $(T_{1n}, T_{2n})$ is bivariate normal, and from Lemma 3.5.2 it follows that
$$\sup_{-\infty<x<\infty}\ \sup_{-\infty<y<\infty} \big| P_\theta\{\sigma_{1n}^{-1}(\theta)[T_{1n} - \mu_{1n}(\theta)] \le x,\ \sigma_{2n}^{-1}(\theta)[T_{2n} - \mu_{2n}(\theta)] \le y \mid N_1 = n_1,\ N_2 = n_2\} - \Phi(x)\Phi(y) \big| \to 0$$
as $\min(n_1,n_2) \to \infty$. Thus, the proof of Theorem 3.5.3 follows by taking $\rho = 0$ in Theorem 10.2.3 of Serfling (1980). □
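The form of the optimal $\lambda$ can be checked numerically: since the null variance of $T_{\lambda n}$ is 1 for every $\lambda$, the PARE is maximized by maximizing the drift $(1-\lambda)^{1/2}k_1 + \lambda^{1/2}k_2$. The grid search below (with hypothetical efficacies $k_1$, $k_2$ chosen only for illustration) agrees with the closed form $\lambda = k_2^2/(k_1^2 + k_2^2)$:

```python
from math import sqrt

def drift(lam, k1, k2):
    """Pitman drift of T_lambda_n = (1-lam)^(1/2) T*_1n + lam^(1/2) T*_2n."""
    return sqrt(1.0 - lam) * k1 + sqrt(lam) * k2

k1, k2 = 1.3, 0.7                        # hypothetical efficacies, illustration only
grid = [i / 10000 for i in range(10001)]
lam_star = max(grid, key=lambda lam: drift(lam, k1, k2))
closed_form = k2 ** 2 / (k1 ** 2 + k2 ** 2)
```

At the maximizing $\lambda$ the drift equals $(k_1^2 + k_2^2)^{1/2}$, which is never smaller than the drift of the equal-weight choice $\lambda = .5$.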
Selection of an optimum test based on Theorem 3.5.3 requires verification of assumptions (i) through (v) and the computation of the constants $k_1$ and $k_2$. Unfortunately, implementation of this method of selection appears intractable because of the general nature of the functional form of $F_\theta(\cdot)$ given in equation (2.2.2). It is not clear in general what assumptions concerning $F_\theta(\cdot)$ can be considered reasonable. However, as is shown below, in the special case where $X^0$ and $Y^0$ obey the usual nonparametric analogue of the paired sample model, the form of $F_\theta(\cdot)$ is considerably simplified.
Suppose $X^0$ and $Y^0$ can be modelled as follows:
$$X^0 = \theta + B + E_1, \qquad Y^0 = B + E_2, \qquad (3.5.5)$$
where $\theta$ is a constant, $B$ is a random variable representing the "pairing" variable, and $E_i$, $i=1,2$, are i.i.d. according to the distribution function $F(\cdot)$. Furthermore, assume the censoring variable $C$ has distribution function $H(\cdot)$, while $C - B$ has distribution function $G(\cdot)$. Let
$$P_1(x) = P[E_1 \le (C-B) - x,\ E_2 \le (C-B)] = \int_{-\infty}^{\infty} F(c - x)\, F(c)\, dG(c).$$
Note that
$$p_1(\theta) = P_\theta[X^0 \le C,\ Y^0 \le C] = P[E_1 \le (C-B) - \theta,\ E_2 \le (C-B)] = P_1(\theta)$$
and, as indicated in the proof of Theorem 3.3.4, $\lambda_1$ can be assumed to be greater than 0. Therefore,
$$F_\theta(x) = P_\theta[X^0 - Y^0 \le x \mid X^0 \le C,\ Y^0 \le C] = \frac{A(x,\theta)}{P_1(\theta)},$$
where
$$
\begin{aligned}
A(x,\theta) &= P_\theta[X^0 - Y^0 \le x,\ X^0 \le C,\ Y^0 \le C] \\
&= P[E_1 - E_2 \le x - \theta,\ E_1 \le (C-B) - \theta,\ E_2 \le (C-B)] \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{c} P[E_1 \le e + x - \theta,\ E_1 \le c - \theta]\, dF(e)\, dG(c) \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{c} F[\min(e + x,\, c) - \theta]\, dF(e)\, dG(c).
\end{aligned}
$$
From an application of results such as Theorems A.2.3 and A.2.4 in Randles and Wolfe (1979), it now follows that if $F(\cdot)$ is absolutely continuous and its derivative $f(\cdot)$ is bounded, then both $P_1(\theta)$ and $A(x,\theta)$ are differentiable functions of $\theta$. Furthermore, $F_\theta(\cdot)$ is then also a differentiable and bounded function of $\theta$. Therefore, the representations for $\mu_{1n}(\theta)$ and $\sigma^2_{1n}(\theta)$ given in Lemma 2.4.1(b) can possibly be utilized for checking the desired assumptions. For example, assumptions (iii) and (iv) follow directly, but verification of assumptions (i), (ii), and (v) will require more conditions on the specific form of $F(\cdot)$, a problem yet to be solved. Similarly, assumptions (iii) and (iv) are easily verified for $T_{2n}$, because the representations
$$p_{2R}(\theta) = P_\theta[Y^0 \le C < X^0] = P[E_2 \le (C-B) < E_1 + \theta] = \int_{-\infty}^{\infty} [1 - F(c - \theta)]\, F(c)\, dG(c)$$
and
$$p_{2L}(\theta) = P_\theta[X^0 \le C < Y^0] = \int_{-\infty}^{\infty} [1 - F(c)]\, F(c - \theta)\, dG(c)$$
can be utilized in conjunction with the formulas for $\mu_{2n}(\theta)$ and $\sigma^2_{2n}(\theta)$ given in Lemma 2.4.1(c).
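The integral representations above can be checked numerically for a concrete special case. The sketch below assumes, purely for illustration, that $F$ is the standard logistic distribution and that $C - B$ is Uniform$[0,3]$ (neither choice is from the dissertation), and verifies that $A(x,\theta) \to P_1(\theta)$ as $x \to \infty$:

```python
from math import exp, log

# Hypothetical choices for illustration: F logistic, C - B ~ Uniform[0, 3]
F = lambda t: 1.0 / (1.0 + exp(-t))
F_inv = lambda u: log(u / (1.0 - u))
B_UP, M = 3.0, 300                       # support of G and grid size

def p1(theta):
    """P_1(theta) = integral of F(c - theta) F(c) dG(c), midpoint rule in c."""
    cs = [(i + 0.5) * B_UP / M for i in range(M)]
    return sum(F(c - theta) * F(c) for c in cs) / M

def A(x, theta):
    """A(x, theta) = double integral of F(min(e + x, c) - theta) over e <= c."""
    cs = [(i + 0.5) * B_UP / M for i in range(M)]
    total = 0.0
    for c in cs:
        top = F(c)                       # substitute u = F(e), with 0 < u < F(c)
        us = [(j + 0.5) * top / M for j in range(M)]
        total += sum(F(min(F_inv(u) + x, c) - theta) for u in us) * top / M
    return total / M
```

The ratio $A(x,\theta)/P_1(\theta)$ then behaves like a distribution function in $x$, as the representation of $F_\theta(\cdot)$ requires.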
CHAPTER FOUR
SIMULATION RESULTS AND CONCLUSIONS

4.1 Introduction and Summary

In the previous chapters a class of test statistics was proposed for the purpose of testing for location difference in censored matched pairs. This class of test statistics is the set of all standardized linear combinations of two conditionally independent test statistics proposed in Chapter Two. It was shown in Chapter Three that under certain assumptions each member of this class is conditionally distribution-free and has an asymptotically normal distribution.

In this chapter a simulation study is presented in order to compare the powers of some members of the proposed class and a test proposed by Woolson and Lachenbruch (1980). In Section 4.2 the statistics that are to be compared will be identified, and in Section 4.3 the results of the simulation studies will be listed. Section 4.4 will contain the conclusions that are drawn based on the simulation study.
4.2 Test Statistics To Be Compared

In this section the six test statistics to be compared using a simulation study will be identified. Four of these statistics are members of the class proposed in Chapter Three, another statistic is based on the two statistics proposed in Chapter Two, while the remaining one is a member of the family of statistics proposed by Woolson and Lachenbruch (1980).

At this point some additional notation will be needed. Let $Z_{(i)}$, $i=1,2,\dots,N_1$, denote the ordered absolute values of the $D_{1i} \in C_1$, and let $Z_{(0)} = 0$ and $Z_{(N_1+1)} = \infty$. Let $m_i$, $i=0,1,\dots,N_1$, denote the number of $|D_{2j}|$, $j=1,2,\dots,N_2$, where $D_{2j} \in C_2$, contained in the interval $[Z_{(i)}, Z_{(i+1)})$, while $p_i$, $i=0,1,\dots,N_1$, represents the number of these $D_{2j}$ which are positive. Also, let
$$M_j = \sum_{k=j}^{N_1} (m_k + 1).$$
The Woolson and Lachenbruch (1980) (WL) test statistic utilizing logistic scores can be written as
$$T_{WL} = T_u + T_c,$$
where
$$T_u = \sum_{i=1}^{N_1} [2\Psi(D_{1(i)}) - 1]\Big[1 - \prod_{j=1}^{i} M_j/(M_j+1)\Big]$$
and
$$T_c = \tfrac{1}{2}(2p_0 - m_0) + \sum_{i=1}^{N_1} (2p_i - m_i)\Big[1 - \prod_{j=1}^{i} M_j/(M_j+1)\Big],$$
with $D_{1(i)}$ denoting the $D_{1i}$ whose absolute value is $Z_{(i)}$. Under $H_0$, the variance of $T_{WL}$, conditional on the observed pattern of censoring, is given by
$$\sigma^2_{WL} = \sum_{i=1}^{N_1} \Big[1 - \prod_{j=1}^{i} M_j/(M_j+1)\Big]^2 + \sum_{i=1}^{N_1} m_i \Big[1 - \prod_{j=1}^{i} M_j/(M_j+1)\Big]^2 + \tfrac{1}{4}\, m_0.$$
It should be noted that if there is no censoring, that is, if $N_1 = n$ with probability one, then $T_{WL}$ is equivalent to the Wilcoxon signed rank statistic.
The second test statistic studied in Section 4.3 is $T_{MAX}$, where
$$T_{MAX} = \max\{T^*_{1n}(N_1),\ T^*_{2n}(N_2)\}.$$
It is clear that $T_{MAX}$ utilizes the test statistics $T_{1n}$ and $T_{2n}$ proposed in Chapter Two, since $T^*_{1n}(N_1)$ and $T^*_{2n}(N_2)$ are the standardized versions of $T_{1n}$ and $T_{2n}$. Since, according to Lemma 2.4.1(a), $T_{1n}$ and $T_{2n}$ are conditionally independent, an exact conditional test of $H_0$ can be performed using $T_{MAX}$.
The remaining test statistics are of the form
$$T_n(N_1, N_2) = (1 - L_n)^{1/2}\, T^*_{1n}(N_1) + L_n^{1/2}\, T^*_{2n}(N_2), \qquad (4.2.1)$$
with $L_n$ as defined below. The first test statistic of this type is
$$T_{EQ} = (.5)^{1/2}\, T^*_{1n}(N_1) + (.5)^{1/2}\, T^*_{2n}(N_2),$$
corresponding to $L_n = .5$ in (4.2.1). The second is
$$T_{SQR} = [N_1/(N_1+N_2)]^{1/2}\, T^*_{1n}(N_1) + [N_2/(N_1+N_2)]^{1/2}\, T^*_{2n}(N_2),$$
obtained by selecting $L_n$ proportional to $N_2$. The third statistic sets $L_n$ proportional to $N_2^2$, with the result
$$T_{SS} = [N_1^2/(N_1^2+N_2^2)]^{1/2}\, T^*_{1n}(N_1) + [N_2^2/(N_1^2+N_2^2)]^{1/2}\, T^*_{2n}(N_2).$$
The remaining test statistic to be studied is $T_{STD}$, where
$$T_{STD} = \frac{T_{1n} + T_{2n} - N_1(N_1+1)/4 - N_2/2}{\{[N_1(N_1+1)(2N_1+1)/24] + N_2/4\}^{1/2}}.$$
Note that $T_{STD}$ is obtained by standardizing $T_{1n} + T_{2n}$ using its mean and standard deviation under $H_0$: $\theta=0$. Simple calculations show that $T_{STD}$ has the form (4.2.1) with
$$L_n = \sigma^2_{N_2}\big/(\sigma^2_{N_1} + \sigma^2_{N_2}),$$
where
$$\sigma^2_{N_1} = N_1(N_1+1)(2N_1+1)/24 \quad\text{and}\quad \sigma^2_{N_2} = N_2/4.$$
Therefore, the weights $(1-L_n)$ and $L_n$ given to $T^*_{1n}(N_1)$ and $T^*_{2n}(N_2)$ by $T_{STD}$ are proportional to the null variances of $T_{1n}$ and $T_{2n}$, respectively.
The statistics $T_{EQ}$, $T_{SQR}$, $T_{SS}$, and $T_{STD}$ are all of the form (4.2.1) and are chosen to represent the class of all test statistics proposed in Chapter Three. In the next section some simulation results are given comparing these statistics with each other and with $T_{WL}$ and $T_{MAX}$.
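The four choices of $L_n$ above can be collected in a small routine; the helper below is illustrative only, and its test checks the algebraic identity that $T_{STD}$ is of the form (4.2.1) for the data of Section 3.4:

```python
from math import sqrt

def l_n(n1, n2, kind):
    """L_n in (4.2.1) for the four statistics T_EQ, T_SQR, T_SS, and T_STD."""
    if kind == "EQ":
        return 0.5
    if kind == "SQR":                    # L_n proportional to N_2
        return n2 / (n1 + n2)
    if kind == "SS":                     # L_n proportional to N_2 squared
        return n2 ** 2 / (n1 ** 2 + n2 ** 2)
    # "STD": L_n proportional to the null variance of T_2n
    v1 = n1 * (n1 + 1) * (2 * n1 + 1) / 24
    v2 = n2 / 4
    return v2 / (v1 + v2)

def combine(t1_star, t2_star, ln):
    """A statistic of the form (4.2.1) built from the standardized components."""
    return sqrt(1 - ln) * t1_star + sqrt(ln) * t2_star
```

For $N_1 = 9$, $N_2 = 2$, combining the standardized statistics with `l_n(9, 2, "STD")` reproduces $T_{STD}$ computed directly from its definition.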
4.3 Simulation Results

In this section, following Woolson and Lachenbruch (1980), the assumed model for $X_i^0$ and $Y_i^0$, $i=1,2,\dots,n$, is the log linear model
$$\exp(X_i^0) = \phi\, V_{1i} W_i, \qquad \exp(Y_i^0) = V_{2i} W_i,$$
where $\phi > 0$ is an unknown parameter, $V_{1i}$ and $V_{2i}$ are independent and identically distributed nonnegative random variables, and $W_i$ is an independent nonnegative random variable for all $i$. The simulation results for each case are developed by generating 500 random samples of 50 observations of $(\log V_{1i}, \log V_{2i})$ and $C$, where $C$ is an independent observation of some censoring variable. It is clear that if $\theta = \log\phi$, then
$$X_i^0 - Y_i^0 = \theta + (\log V_{1i} - \log V_{2i}), \quad i=1,2,\dots,n;$$
consequently, the distributional form of $X_i^0 - Y_i^0$ depends on the distributional form of $\log V_{1i} - \log V_{2i}$.
Samples from four distributional forms for $X_i^0 - Y_i^0$ were simulated in this study. The logistic distribution was obtained by generating $V_{1i}$ and $V_{2i}$ with independent exponential distributions. Another light-tailed distribution, the normal distribution, was obtained by generating $\log V_{1i}$ and $\log V_{2i}$ as independent standard normal variables. Two heavy-tailed distributions, the double exponential distribution and the Ramberg-Schmeiser-Tukey (RST) Lambda distribution (Randles and Wolfe, 1979, p. 416), were also studied. The double exponential distribution was arrived at by generating $\log V_{1i}$ and $\log V_{2i}$ with independent exponential distributions. The RST Lambda cumulative distribution function (c.d.f.) cannot be expressed explicitly, but its inverse c.d.f. can be given as follows:
$$F^{-1}(u) = \lambda_1 + [u^{\lambda_3} - (1-u)^{\lambda_4}]/\lambda_2, \qquad 0 < u < 1.$$
As shown in Ramberg and Schmeiser (1972), when $\lambda_1 = 0$, $\lambda_3 = \lambda_4 = -1$, and $\lambda_2 = -3.0674$, the RST Lambda distribution approximates the Cauchy distribution. This particular choice of RST Lambda distribution was used to generate $\log V_{1i}$ and $\log V_{2i}$.
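The explicit inverse c.d.f. makes simulation straightforward: a Uniform$(0,1)$ variate pushed through $F^{-1}$ yields an RST Lambda variate. A minimal sketch follows; the parameter signs are as reconstructed above and should be treated as an assumption when checking against Ramberg and Schmeiser (1972):

```python
import random

# Approximate-Cauchy parameter choice (signs assumed as reconstructed above)
LAM1, LAM2, LAM3, LAM4 = 0.0, -3.0674, -1.0, -1.0

def rst_lambda_inv(u):
    """Inverse c.d.f. of the RST Lambda distribution, for 0 < u < 1."""
    return LAM1 + (u ** LAM3 - (1.0 - u) ** LAM4) / LAM2

def rst_lambda_sample(n, rng):
    """n independent variates via the inverse-c.d.f. method."""
    out = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        out.append(rst_lambda_inv(u))
    return out
```

With this parameter choice the distribution is symmetric about 0 and has Cauchy-like heavy tails, since $F^{-1}(u)$ grows like $(1-u)^{-1}/|\lambda_2|$ as $u \to 1$.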
To generate the censoring random variable, the Uniform $[0, B]$ distribution was utilized for the logistic and double exponential cases, while the natural logarithm of the Uniform $[0, B]$ distribution was used in the normal and RST Lambda cases. The choice of $B$ was made in each case in such a way that, under $H_0$, the proportion of uncensored pairs was approximately 75% of the total sample size, while approximately 20% of the total sample size consisted of pairs in which exactly one member of the pair was uncensored. Consequently, approximately 5% of the pairs were not utilized, since they were pairs in which both members were censored. Since the results presented in this section apply only to the pattern of censoring described above, any conclusions drawn apply only to this form of censoring.
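The logistic-case design just described can be sketched as follows. The censoring bound `B = 20.0` is hypothetical (the study's actual $B$ is not stated here), $W_i = 1$ is assumed, and only the power of $T_{EQ}$ is estimated:

```python
import random
from math import log, sqrt

B = 20.0    # hypothetical upper censoring bound; the study's actual B is not given here

def t_eq(pairs):
    """T_EQ from (x, y, log_c) triples; a member is censored if it exceeds log_c."""
    diffs, signs = [], []
    for x, y, log_c in pairs:
        cx, cy = x > log_c, y > log_c
        if not cx and not cy:
            diffs.append(x - y)                # both uncensored: D_i observed
        elif cx != cy:
            signs.append(1 if cx else 0)       # one censored: only sign of D_i known
    n1, n2 = len(diffs), len(signs)
    order = sorted(range(n1), key=lambda i: abs(diffs[i]))
    t1n = sum(r + 1 for r, i in enumerate(order) if diffs[i] > 0)
    t1s = (t1n - n1 * (n1 + 1) / 4) / sqrt(n1 * (n1 + 1) * (2 * n1 + 1) / 24) if n1 else 0.0
    t2s = (2 * sum(signs) - n2) / sqrt(n2) if n2 else 0.0
    return sqrt(0.5) * t1s + sqrt(0.5) * t2s

def estimated_power(theta, reps, rng):
    """Proportion of replicates (n = 50 pairs) with T_EQ above 1.645."""
    hits = 0
    for _ in range(reps):
        pairs = []
        for _ in range(50):
            x = theta + log(rng.expovariate(1.0))    # log(phi V_1i), with W_i = 1 assumed
            y = log(rng.expovariate(1.0))            # log V_2i; X - Y is shifted logistic
            pairs.append((x, y, log(rng.uniform(1e-9, B))))
        hits += t_eq(pairs) > 1.645
    return hits / reps
```

With $\theta = 0$ the estimate should sit near the nominal .05 level, and it should rise toward 1 as $\theta$ grows, in qualitative agreement with Table 4.1.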
In each case the hypothesis tested was $H_0$: $\theta=0$ against $H_a$: $\theta>0$, at level .05. Based on the asymptotic distribution of the test statistics, the critical value was chosen to be 1.645 in each case except for $T_{MAX}$, which has a critical value of 1.955. Tables 4.1, 4.2, 4.3, and 4.4 give the resulting power (the proportion of times the corresponding test statistic exceeds the critical value) of the six tests described in Section 4.2 for the logistic, normal, double exponential, and RST Lambda distributions, respectively.
TABLE 4.1
POWER OF THE TESTS FOR THE LOGISTIC DISTRIBUTION
        θ=0    θ=.181  θ=.453  θ=.907  θ=1.814
TWL .050 .186 .532 .944 1.000
TMAX .040 .128 .414 .900 1.000
TEQ .048 .172 .512 .890 .998
TSQR .054 .156 .474 .896 1.000
TSS .040 .152 .444 .894 1.000
TSTD .040 .116 .248 .522 .922
TABLE 4.2
POWER OF THE TESTS FOR THE NORMAL DISTRIBUTION

        θ=0    θ=.141  θ=.354  θ=.707  θ=1.414
TWL    .060    .180    .546    .960    1.000
TMAX   .046    .142    .344    .886    1.000
TEQ    .052    .170    .514    .944    1.000
TSQR   .054    .184    .514    .948    1.000
TSS    .052    .156    .466    .936    1.000
TSTD   .050    .150    .418    .876    1.000

TABLE 4.3
POWER OF THE TESTS FOR THE DOUBLE EXPONENTIAL DISTRIBUTION

        θ=0    θ=.141  θ=.354  θ=.707  θ=1.414
TWL    .064    .250    .728    .996    1.000
TMAX   .038    .188    .584    .970    1.000
TEQ    .064    .262    .746    .998    1.000
TSQR   .050    .264    .750    .998    1.000
TSS    .056    .236    .742    .998    1.000
TSTD   .060    .224    .648    .974    1.000
TABLE 4.4
POWER OF THE TESTS FOR AN APPROXIMATE CAUCHY DISTRIBUTION
        θ=0    θ=.2    θ=.4    θ=.6    θ=.8    θ=1.0
TWL .050 .124 .254 .370 .528 .670
TMAX .048 .102 .208 .304 .470 .596
TEQ .042 .126 .234 .350 .516 .652
TSQR .056 .132 .288 .394 .576 .720
TSS    .060   .124    .272    .386    .594    .728
TSTD .056 .126 .248 .356 .562 .694
Note that the values of $\theta$ appearing in Tables 4.1, 4.2, and 4.3 are the values $0$, $.1\sigma$, $.25\sigma$, $.5\sigma$, and $1\sigma$, where in Table 4.1 $\sigma^2 = \pi^2/3$, which is the variance of the logistic distribution, while in Tables 4.2 and 4.3 $\sigma^2 = 2$, which is the variance of the underlying normal and double exponential distributions, respectively. In Table 4.4 the values of $\theta$ were chosen such that the corresponding percentiles of the RST Lambda distribution are approximately equal to the corresponding percentiles determined by the values of $\theta$ used in Tables 4.1, 4.2, and 4.3.
The simulation results displayed in this section show some differences in the powers of the test statistics considered. For example, $T_{MAX}$ appears to be less powerful than the other test statistics near the null hypothesis. However, the results indicate that the choice of critical value for $T_{MAX}$ is conservative, since the .05 level of significance is not attained under the null hypothesis. Consequently, if the critical value were decreased until the .05 level of significance is attained, then $T_{MAX}$ could be expected to perform better than it did in these studies. Nevertheless, all six test statistics are similar in that their powers not only increase as $\theta$ increases, but for large $\theta$ they all have high powers.
In Table 4.1 the results indicate that for the logistic distribution $T_{WL}$ and $T_{EQ}$ have similar local (near the null hypothesis) powers, with $T_{SQR}$ and $T_{SS}$ falling closely behind. Table 4.2 shows that $T_{WL}$, $T_{EQ}$, and $T_{SQR}$ all have similar local power when the underlying distribution is normal. For the heavier-tailed distributions, the powers presented in Tables 4.3 and 4.4 show that $T_{SQR}$ has the highest local power, with the powers of $T_{WL}$ and $T_{EQ}$ only slightly lower.
4.4 Conclusions

The simulation study of Section 4.3 compared the large-sample powers of a test proposed by Woolson and Lachenbruch (1980) and some tests from the class of tests proposed in this dissertation. As noted in the previous section, no noticeable difference in large-sample powers was observed. Therefore, the selection of a test procedure for testing $H_0$: $\theta = 0$ is not clear based on the results given in Section 4.3 alone. However, this selection should not only take into account the power of the test procedures, but should also be based on the availability of critical values and the level of difficulty involved in the implementation of the test procedures.
As has been noted in this dissertation, critical values for the proposed test statistics $T_n(N_1,N_2)$ are easily obtained when the sample size is small. This is not the case for the Woolson and Lachenbruch (1980) statistic, $T_{WL}$, since its critical values need to be derived from the permutation distribution conditional on the observed scores. This can be time consuming and impractical, since the critical values of $T_{WL}$ depend on the observed sample and thus cannot be tabulated. Tables of critical values for $T_n(N_1,N_2)$ are easily produced, as noted in the Appendix. As for large sample sizes, Theorem 3.3.4 shows that the critical values for $T_n(N_1,N_2)$ can be found by using the standard normal distribution. The conditional distribution of the statistic $T_{WL}$ under $H_0$ is shown by Woolson and Lachenbruch (1980) to have an asymptotic normal distribution as $\min(n_1,n_2) \to \infty$, but it is not clear whether this holds unconditionally as the overall sample size, $n$, tends to infinity.
Computation of the values of the statistics Tn(N1,N2) is not
difficult, since it involves only the calculation of the values of
a Wilcoxon signed rank statistic and a binomial statistic. The value
of the statistic TWL is more difficult to find and, for larger sample
sizes, is easily obtainable only through the use of a computer.
Therefore, for moderate and large sample sizes the lack of computing
facilities makes the use of TWL somewhat impractical.
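The computation just described can be sketched in a few lines. The sketch below is illustrative only: the function names are hypothetical, ties among the absolute differences are ignored, and the equal weights (.5)^½ of the statistic TEQ are assumed.

```python
import math

def standardized_wilcoxon(diffs):
    # Standardized Wilcoxon signed rank statistic for the n1 uncensored
    # differences (ties among |d_i| ignored in this sketch).
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    w_plus = sum(ranks[i] for i in range(n) if diffs[i] > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (w_plus - mean) / math.sqrt(var)

def standardized_sign(signs):
    # Standardized binomial statistic: count of positive signs among the n2
    # censored pairs, centered at n2/2 and scaled by sqrt(n2/4).
    n = len(signs)
    b = sum(1 for s in signs if s > 0)
    return (b - n / 2) / math.sqrt(n / 4)

def t_eq(diffs, signs):
    # Equal-weight combination: each standardized component gets weight (.5)**0.5.
    return math.sqrt(0.5) * standardized_wilcoxon(diffs) + \
           math.sqrt(0.5) * standardized_sign(signs)
```

Either component alone requires only counting and ranking, which is why the statistic can be computed by hand for small samples.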
The simulation study in Section 4.3 does not include small-sample
power results, since these results are not easily obtained. This is due
to the difficulty encountered in finding critical values for
TWL, since these values depend on the particular sample observed.
Consequently, the selection of the test procedure can now be made based
upon the availability of critical values, the level of difficulty of
calculating the corresponding test statistic, and the large-sample
power results of the previous section. Based on these criteria, TEQ is
the recommended statistic, due to its ease of computation, the
availability of tables of critical values as in the Appendix, and the
fact that the large-sample power results contained in Section 4.3 show
its performance to be close to the best of the statistics considered in
each case. However, if the form of the underlying distributions is
known, then it might be possible to apply the method of Section 3.5 to
find the test statistic Tn(N1,N2) which maximizes the conditional
Pitman Asymptotic Relative Efficiency. In that case, the resulting
statistic would be the recommended choice.
APPENDIX
TABLES OF CRITICAL VALUES FOR
TESTING FOR DIFFERENCE IN LOCATION
The tables in this appendix list the critical values of

Tn(n1,n2) = (.5)^½ T1n(n1) + (.5)^½ T2n(n2)

at the α = .01, .025, .05, and .10 levels of significance for
n1 = 1,2,...,10 and n2 = 1,2,...,15. The critical values for larger n1
or n2 can be approximated by the critical values obtained from the
standard normal distribution. When n1 = 0, the critical values can
be obtained from the critical values of the standardized binomial
distribution with parameters n2 and p = .5. Similarly, when n2 = 0,
the critical values can be obtained from the critical values of
the standardized Wilcoxon signed rank statistic based on n1 obser-
vations. The critical values of this test statistic are derived for
each n1 and n2 by convoluting the standardized Wilcoxon signed rank
distribution based on n1 observations with the standardized binomial
distribution based on n2 observations. Since both distributions are
discrete, exact .01, .025, .05, and .10 level critical values do not
always exist; these levels are only approximated. In addition to the
critical values, the attained significance level of each critical value
(i.e., the probability that Tn(n1,n2) is greater than or equal to the
critical value) is given in parentheses.
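Since both component distributions are finite and discrete, the convolution just described is a finite computation. The following sketch illustrates it; the function names are hypothetical, and the equal weights (.5)^½ are assumed.

```python
import math
from collections import defaultdict

def wilcoxon_dist(n):
    # Exact null distribution of the Wilcoxon signed rank statistic W+ for n
    # observations: rank k contributes 0 or k, each with probability 1/2.
    dist = {0: 1.0}
    for k in range(1, n + 1):
        new = defaultdict(float)
        for v, p in dist.items():
            new[v] += p / 2
            new[v + k] += p / 2
        dist = dict(new)
    return dist

def binomial_dist(n):
    # Exact Binomial(n, p = .5) distribution.
    return {k: math.comb(n, k) / 2 ** n for k in range(n + 1)}

def standardize(dist, mean, sd):
    return {(v - mean) / sd: p for v, p in dist.items()}

def critical_value(n1, n2, alpha):
    # Convolute the two standardized distributions with weights (.5)**0.5 and
    # return the smallest support point whose upper-tail probability does not
    # exceed alpha, together with the attained significance level; returns
    # None when no such point exists.
    w = standardize(wilcoxon_dist(n1), n1 * (n1 + 1) / 4,
                    math.sqrt(n1 * (n1 + 1) * (2 * n1 + 1) / 24))
    b = standardize(binomial_dist(n2), n2 / 2, math.sqrt(n2 / 4))
    conv = defaultdict(float)
    for v1, p1 in w.items():
        for v2, p2 in b.items():
            conv[round(math.sqrt(0.5) * v1 + math.sqrt(0.5) * v2, 10)] += p1 * p2
    tail, best = 0.0, None
    for v in sorted(conv, reverse=True):
        tail += conv[v]
        if tail <= alpha + 1e-12:
            best = (v, tail)
        else:
            break
    return best
```

For example, with n1 = 1 and n2 = 3 the only attainable upper critical value at the .10 level is approximately 1.932, with attained significance level .0625.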
n1=1
n2=1 n2=2 n2=3
  1.932 (.0625)
n2=4 n2=5 n2=6
  2.439 (.0078)
 2.288 (.0156) 2.439 (.0078)
2.121 (.0313) 2.288 (.0156) 1.862 (.0547)
2.121 (.0313) 1.656 (.0938) 1.862 (.0547)
n2=7 n2=8 n2=9
2.578 (.0039) 2.707 (.0020) 2.357 (.0098)
2.578 (.0039) 2.207 (.0176) 2.357 (.0098)
2.043 (.0313) 2.207 (.0176) 1.886 (.0449)
2.043 (.0313) 1.293 (.0742) 1.886 (.0449)
n2=10 n2=11 n2=12
2.496 (.0054) 2.626 (.0029) 2.340 (.0096)
2.049 (.0273) 2.200 (.0164) 2.340 (.0096)
2.049 (.0273) 1.773 (.0566) 1.742 (.0366)
1.529 (.0864) 1.638 (.0569) 1.334 (.0985)
n2=13 n2=14 n2=15
2.472 (.0056) 2.597 (.0032) 2.032 (.0088)
1.842 (.0231) 2.219 (.0143) 1.985 (.0296)
1.688 (.0668) 1.561 (.0453) 1.666 (.0299)
1.450 (.0676) 1.463 (.1064) 1.301 (.0773)
n1=2
n2=1 n2=2 n2=3
  2.173 (.0313)
 1.949 (.0625) 1.541 (.0625)
n2=4 n2=5 n2=6
 2.530 (.0078) 2.681 (.0039)
2.363 (.0156) 2.530 (.0078) 2.103 (.0273)
1.730 (.0313) 1.897 (.0547) 2.048 (.0313)
1.656 (.0938) 1.897 (.0547) 1.526 (.0898)
n2=7 n2=8 n2=9
2.820 (.0020) 2.316 (.0098) 2.438 (.0054)
2.187 (.0176) 2.316 (.0098) 1.966 (.0273)
1.750 (.0586) 1.684 (.0459) 1.656 (.0688)
1.555 (.0742) 1.449 (.1006) 1.334 (.0908)
n2=10 n2=11 n2=12
2.290 (.0139) 2.029 (.0098) 2.357 (.0056)
1.920 (.0166) 2.015 (.0299) 1.949 (.0231)
1.658 (.0569) 1.603 (.0380) 1.765 (.0533)
1.396 (.1106) 1.382 (.0985) 1.919 (.1159)
n2=13 n2=14 n2=15
2.321 (.0120) 2.206 (.0088) 2.325 (.0053)
2.081 (.0144) 1.952 (.0243) 1.959 (.0193)
1.601 (.0453) 1.705 (.0604) 1.594 (.0535)
1.297 (.1088) 1.327 (.1229) 1.327 (.0952)
n1=3
n2=1 n2=2 n2=3
 2.134 (.0313) 2.359 (.0156)
 1.756 (.0625) 1.603 (.0469)
1.841 (.0625) 1.378 (.0938) 1.542 (.0938)
n2=4 n2=5 n2=6
2.548 (.0078) 2.337 (.0078) 2.488 (.0039)
2.170 (.0156) 2.083 (.0273) 1.911 (.0293)
1.792 (.0547) 1.705 (.0508) 1.711 (.0625)
1.414 (.1016) 1.450 (.0977) 1.333 (.1055)
n2=7 n2=8 n2=9
2.249 (.0098) 2.256 (.0093) 2.312 (.0139)
2.092 (.0166) 2.000 (.0239) 1.934 (.0254)
1.714 (.0459) 1.756 (.0415) 1.650 (.0505)
1.401 (.1016) 1.378 (.0908) 1.370 (.1106)
n2=10 n2=11 n2=12
2.167 (.0098) 2.248 (.0090) 2.359 (.0120)
2.028 (.0299) 1.870 (.0239) 1.981 (.0215)
1.650 (.0526) 1.773 (.0541) 1.603 (.0477)
1.272 (.1052) 1.396 (.0917) 1.225 (.1028)
n2=13 n2=14 n2=15
2.143 (.0090) 2.268 (.0124) 2.386 (.0101)
1.779 (.0245) 1.890 (.0250) 2.034 (.0269)
1.737 (.0422) 1.512 (.0494) 1.682 (.0485)
1.408 (.0125) 1.134 (.0988) 1.304 (.1128)
n1=4
n2=1 n2=2 n2=3
 2.291 (.0156) 2.516 (.0078)
1.998 (.0313) 2.033 (.0313) 1.999 (.0234)
1.740 (.0625) 1.775 (.0469) 1.741 (.0391)
1.481 (.0938) 1.291 (.1094) 1.441 (.1016)
n2=4 n2=5 n2=6
2.189 (.0117) 2.356 (.0059) 2.248 (.0107)
1.998 (.0273) 1.981 (.0293) 1.929 (.0244)
1.740 (.0508) 1.723 (.0430) 1.671 (.0527)
1.291 (.1055) 1.349 (.1055) 1.352 (.0957)
n2=7 n2=8 n2=9
2.369 (.0093) 2.291 (.0120) 2.379 (.0085)
1.871 (.0249) 2.016 (.0251) 1.998 (.0256)
1.595 (.0498) 1.758 (.0500) 1.695 (.0515)
1.318 (.1055) 1.291 (.0994) 1.268 (.1061)
n2=10 n2=11 n2=12
2.236 (.0091) 2.357 (.0103) 2.299 (.0078)
1.927 (.0278) 2.009 (.0211) 1.933 (.0245)
1.720 (.0475) 1.672 (.0504) 1.699 (.0532)
1.291 (.1099) 1.312 (.0994) 1.333 (.0991)
n2=13 n2=14 n2=15
2.281 (.0081) 2.286 (.0101) 2.222 (.0095)
1.889 (.0272) 2.028 (.0252) 1.946 (.0257)
1.631 (.0482) 1.669 (.0518) 1.643 (.0502)
1.249 (.0987) 1.291 (.1035) 1.322 (.1004)
n1=5
n2=1 n2=2 n2=3
 2.430 (.0078) 2.274 (.0117)
1.947 (.0313) 2.049 (.0234) 1.892 (.0273)
1.756 (.0469) 1.667 (.0547) 1.701 (.0508)
1.375 (.1094) 1.430 (.0938) 1.320 (.0977)
n2=4 n2=5 n2=6
2.272 (.0098) 2.379 (.0098) 2.394 (.0093)
2.082 (.0215) 1.997 (.0244) 2.008 (.0283)
1.700 (.0488) 1.676 (.0498) 1.631 (.0532)
1.319 (.1035) 1.295 (.0996) 1.255 (.0967)
n2=7 n2=8 n2=9
2.348 (.0076) 2.286 (.0099) 2.317 (.0090)
2.004 (.0254) 1.976 (.0220) 1.947 (.0279)
1.660 (.0559) 1.667 (.0530) 1.666 (.0486)
1.316 (.0950) 1.333 (.0988) 1.285 (.1013)
n2=10 n2=11 n2=12
2.325 (.0104) 2.351 (.0086) 2.274 (.0099)
1.943 (.0252) 1.969 (.0240) 1.919 (.0244)
1.687 (.0514) 1.643 (.0543) 1.675 (.0520)
1.312 (.0940) 1.307 (.1027) 1.293 (.1036)
n2=13 n2=14 n2=15
2.360 (.0095) 2.327 (.0102) 2.242 (.0099)
1.992 (.0249) 1.962 (.0262) 2.019 (.0256)
1.617 (.0493) 1.613 (.0526) 1.648 (.0501)
1.240 (.1002) 1.308 (.0983) 1.267 (.1028)
n1=6
n2=1 n2=2 n2=3
2.264 (.0078) 2.408 (.0078) 2.337 (.0098)
1.967 (.0234) 1.964 (.0273) 1.965 (.0254)
1.671 (.0547) 1.667 (.0547) 1.668 (.0527)
1.374 (.1094) 1.371 (.1016) 1.299 (.1035)
n2=4 n2=5 n2=6
2.264 (.0107) 2.357 (.0098) 2.267 (.0107)
1.967 (.0254) 1.952 (.0229) 1.970 (.0273)
1.671 (.0488) 1.655 (.0498) 1.674 (.0525)
1.340 (.1025) 1.319 (.0986) 1.260 (.1021)
n2=7 n2=8 n2=9
2.358 (.0090) 2.315 (.0096) 2.290 (.0102)
1.945 (.0236) 1.964 (.0261) 1.967 (.0258)
1.648 (.0504) 1.667 (.0499) 1.671 (.0518)
1.321 (.1010) 1.315 (.1017) 1.321 (.1018)
n2=10 n2=11 n2=12
2.305 (.0099) 2.289 (.0101) 2.337 (.0098)
2.006 (.0235) 1.975 (.0244) 1.931 (.0245)
1.566 (.0503) 1.678 (.0490) 1.632 (.0534)
1.265 (.1043) 1.307 (.1043) 1.299 (.1023)
n2=13 n2=14 n2=15
2.284 (.0100) 2.313 (.0103) 2.321 (.0096)
1.987 (.0244) 1.949 (.0249) 1.956 (.0252)
1.648 (.0534) 1.653 (.0500) 1.660 (.0505)
1.308 (.1000) 1.289 (.0989) 1.294 (.1011)
n1=7
n2=1 n2=2 n2=3
2.261 (.0078) 2.315 (.0098) 2.300 (.0098)
1.902 (.0273) 1.956 (.0273) 1.962 (.0244)
1.663 (.0547) 1.673 (.0508) 1.703 (.0508)
1.305 (.1171) 1.315 (.1055) 1.344 (.0996)
n2=4 n2=5 n2=6
2.261 (.0107) 2.298 (.0095) 2.251 (.0106)
2.012 (.0244) 1.940 (.0254) 1.971 (.0258)
1.663 (.0518) 1.666 (.0513) 1.653 (.0510)
1.305 (.1064) 1.307 (.1011) 1.294 (.1053)
n2=7 n2=8 n2=9
2.292 (.0108) 2.315 (.0103) 2.261 (.0105)
1.941 (.0250) 1.956 (.0265) 1.909 (.0260)
1.695 (.0478) 1.641 (.0501) 1.663 (.0498)
1.287 (.0991) 1.315 (.1025) 1.305 (.1033)
n2=10 n2=11 n2=12
2.298 (.0103) 2.313 (.0102) 2.300 (.0103)
1.970 (.0252) 1.954 (.0257) 1.962 (.0249)
1.642 (.0522) 1.647 (.0516) 1.653 (.0526)
1.284 (.1056) 1.305 (.1000) 1.295 (.1034)
n2=13 n2=14 n2=15
2.295 (.0102) 2.329 (.0096) 2.241 (.0103)
1.937 (.0263) 1.971 (.0249) 1.982 (.0254)
1.664 (.0503) 1.651 (.0499) 1.637 (.0494)
1.305 (.1010) 1.312 (.1000) 1.278 (.0982)
n1=8
n2=1 n2=2 n2=3
2.192 (.0098) 2.287 (.0098) 2.314 (.0093)
1.895 (.0273) 1.990 (.0244) 1.992 (.0239)
1.697 (.0488) 1.693 (.0498) 1.695 (.0488)
1.400 (.0947) 1.297 (.1064) 1.324 (.0972)
n2=4 n2=5 n2=6
2.305 (.0098) 2.274 (.0103) 2.326 (.0096)
1.994 (.0242) 1.977 (.0240) 1.947 (.0269)
1.683 (.0505) 1.642 (.0532) 1.667 (.0497)
1.301 (.1040) 1.306 (.1027) 1.287 (.0997)
n2=7 n2=8 n2=9
2.287 (.0107) 2.287 (.0106) 2.291 (.0100)
1.951 (.0253) 1.985 (.0256) 1.971 (.0251)
1.653 (.0504) 1.688 (.0498) 1.674 (.0498)
1.297 (.1032) 1.297 (.0984) 1.301 (.1023)
n2=10 n2=11 n2=12
2.284 (.0097) 2.315 (.0097) 2.314 (.0097)
1.936 (.0260) 1.957 (.0256) 1.930 (.0251)
1.639 (.0510) 1.660 (.0500) 1.633 (.0499)
1.294 (.0968) 1.302 (.1004) 1.299 (.1032)
n2=13 n2=14 n2=15
2.304 (.0102) 2.299 (.0103) 2.272 (.0100)
1.944 (.0259) 1.965 (.0250) 1.971 (.0249)
1.647 (.0518) 1.643 (.0498) 1.674 (.0499)
1.285 (.0990) 1.303 (.1001) 1.287 (.1007)
n1=9
n2=1 n2=2 n2=3
2.173 (.0098) 2.298 (.0093) 2.272 (.0110)
1.922 (.0244) 1.963 (.0254) 1.958 (.0239)
1.670 (.0508) 1.634 (.0498) 1.685 (.0496)
1.335 (.1064) 1.298 (.1001) 1.309 (.0972)
n2=4 n2=5 n2=6
2.294 (.0103) 2.293 (.0099) 2.286 (.0107)
1.959 (.0248) 1.958 (.0248) 1.951 (.0262)
1.670 (.0505) 1.661 (.0508) 1.634.(.0499)
1.298 (.1012) 1.288 (.1007) 1.289 (.1038)
n2=7 n2=8 n29
2.300 (.0100) 2.298 (.0101) 2.309 (.0101)
1.965 (.0248) 1.963 (.0251) 1.953 (.0250)
1.649 (.0508) 1.634 (.0485) 1.649 (.0496)
1.294 (.1017) 1.296 (.1012) 1.304 (.0997)
n2=10 n2=11 n2=12
2.305 (.0100) 2.288 (.0100) 2.282 (.0103)
1.943 (.0244) 1.953 (.0242) 1.947 (.0255)
1.662 (.0499) 1.672 (.0500) 1.634 (.0502)
1.300 (.0987) 1.283 (.0980) 1.288 (.1017)
n2=13 n2=14 n2=15
2.283 (.0100) 2.306 (.0100) 2.302 (.0097)
1.948 (.0247) 1.932 (.0251) 1.960 (.0254)
1.639 (.0500) 1.638 (.0500) 1.655 (.0498)
1.300 (.1008) 1.300 (.1001) 1.290 (.1017)
n1=10
n2=1 n2=2 n2=3
2.185 (.0093) 2.261 (.0105) 2.270 (.0107)
1.968 (.0210) 1.973 (.0247) 1.958 (.0253)
1.680 (.0483) 1.685 (.0503) 1.670 (.0505)
1.320 (.1079) 1.324 (.1030) 1.309 (.1030)
n2=4 n2=5 n2=6
2.315 (.0097) 2.282 (.0104) 2.273 (.0099)
1.968 (.0244) 1.938 (.0253) 1.982 (.0249)
1.666 (.0508) 1.650 (.0500) 1.624 (.0496)
1.320 (.0997) 1.289 (.1025) 1.264 (.0997)
n2=7 n2=8 n2=9
2.309 (.0097) 2.324 (.0098) 2.300 (.0103)
1.961 (.0244) 1.968 (.0256) 1.968 (.0250)
1.643 (.0501) 1.676 (.0498) 1.647 (.0507)
1.300 (.1005) 1.320 (.1000) 1.287 (.1011)
n2=10 n2=11 n2=12
2.315 (.0097) 2.321 (.0099) 2.318 (.0097)
1.954 (.0250) 1.967 (.0251) 1.957 (.0253)
1.651 (.0509) 1.673 (.0500) 1.645 (.0515)
1.291 (.1013) 1.312 (.1004) 1.286 (.0993)
n2=13 n2=14 n2=15
2.306 (.0100) 2.306 (.0101) 2.318 (.0100)
1.954 (.0256) 1.961 (.0249) 1.958 (.0255)
1.642 (.0501) 1.657 (.0501) 1.665 (.0504)
1.297 (.1001) 1.296 (.1005) 1.305 (.1001)
BIBLIOGRAPHY
Anscombe, F.J. (1952). Large-Sample Theory of Sequential Estimation.
Proc. Cambridge Philos. Soc., 48, 600-607.
Callaert, H., and Janssen, P. (1978). The Berry-Esseen Theorem for
U-statistics. The Annals of Statistics, 6, 417-421.
Holt, J.D., and Prentice, R.L. (1974). Survival Analysis in Twin
Studies and Matched Pairs Experiments. Biometrika, 61, 17-30.
Kalbfleisch, J.D., and Prentice, R.L. (1973). Marginal Likelihoods
Based on Cox's Regression and Life Model. Biometrika, 60, 267-278.
Leavitt, S.S., and Olshen, R.A. (1974). The Insurance Claims Adjuster
as Patient's Advocate: Quantitative Impact. Report for Insurance
Technology Company, Berkeley, California.
Lehmann, E.L. (1959). Testing Statistical Hypotheses. Wiley, New
York.
Miller, R.G. (1981). Survival Analysis. Wiley, New York.
Prentice, R.L. (1978). Linear Rank Tests with Right Censored Data.
Biometrika, 65, 167-179.
Ramberg, J.S., and Schmeiser, B.W. (1972). An Approximate Method for
Generating Symmetric Random Variables. Communications of the
Association for Computing Machinery, 15, 987-990.
Randles, R.H., and Wolfe, D.A. (1979). Introduction to the Theory of
Nonparametric Statistics. Wiley, New York.
Serfling, R.J. (1980). Approximation Theorems of Mathematical
Statistics. Wiley, New York.
Sproule, R.N. (1974). Asymptotic Properties of U-statistics. Trans.
Am. Mathematical Soc., 199, 55-64.
van Eeden, C. (1963). The Relation Between Pitman's Asymptotic
Relative Efficiency of Two Tests and the Correlation Coefficient
Between Their Test Statistics. The Annals of Mathematical
Statistics, 34, 1442-1451.
Woolson, R.F., and Lachenbruch, P.A. (1980). Rank Tests for Censored
Matched Pairs. Biometrika, 67, 597-606.
BIOGRAPHICAL SKETCH
Edward Anthony Popovich was born in Detroit, Michigan, on March 2,
1957. He moved to San Diego, California, in 1961 and remained there
until he moved to Satellite Beach, Florida, in 1964. After graduating
from Satellite High School in 1975, he enrolled at the University of
Florida. He received his Bachelor of Science degree in Mathematics in
1977, at which time he received the Four Year Scholar award as
valedictorian at his commencement. He entered the Graduate School at the
University of Florida in 1978, and later received his Master of
Statistics degree in 1979. He expects to receive the degree of Doctor
of Philosophy in December, 1983. He is a member of the American
Statistical Association.
His professional career has included teaching various courses in
both the Mathematics and Statistics Departments at the University of
Florida and working two summers for NASA at the Kennedy Space Center in
Florida. He has been the recipient of the Wentworth Scholarship,
Graduate School fellowships and teaching assistantships during his
academic career at the University of Florida.
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Pejaver V. Rao, Chairman
Professor of Statistics
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Gerald J. Elfenbein
Associate Professor of Immunology
and Medical Microbiology
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Ronald H. Randles
Professor of Statistics
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Andrew J. Rosalsky
Assistant Professor of Statistics
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Dennis D. Wackerly
Associate Professor of Statistics
This dissertation was submitted to the Graduate Faculty of the
Department of Statistics in the College of Liberal Arts and Sciences
and to the Graduate School, and was accepted for partial fulfillment of
the requirements of the degree of Doctor of Philosophy.
December, 1983
Dean for Graduate Studies
and Research