
Full Citation 
Material Information 

Title: 
Tests of symmetry based on the Cramér-von Mises and Watson statistics 

Alternate Title: 
Cramér-von Mises and Watson statistics 

Physical Description: 
v, 65 leaves : ; 28 cm. 

Language: 
English 

Creator: 
Hill, David Lawrence, 1949 

Publication Date: 
1976 

Copyright Date: 
1976 
Subjects 

Subject: 
Mathematical statistics ( lcsh ) Statistics ( lcsh ) Symmetry ( lcsh ) Statistics thesis Ph. D Dissertations, Academic  Statistics  UF 

Genre: 
bibliography ( marcgt ) nonfiction ( marcgt ) 
Notes 

Thesis: 
Thesis, University of Florida. 

Bibliography: 
Bibliography: leaves 63-64. 

General Note: 
Typescript. 

General Note: 
Vita. 

Statement of Responsibility: 
by David Lawrence Hill. 
Record Information 

Bibliographic ID: 
UF00098122 

Volume ID: 
VID00001 

Source Institution: 
University of Florida 

Holding Location: 
University of Florida 

Rights Management: 
All rights reserved by the source institution and holding location. 

Resource Identifier: 
alephbibnum  000181650 oclc  03213612 notis  AAU8193 


Full Text 
TESTS OF SYMMETRY BASED ON
THE CRAMÉR-VON MISES AND WATSON STATISTICS
By
DAVID LAWRENCE HILL
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1976
ACKNOWLEDGMENTS
I would like to express my sincere thanks and appreciation
to Dr. P. V. Rao for his guidance and assistance as the chairman of
my committee. I would also like to thank Dr. Dennis D. Wackerly for
providing valuable advice whenever it was needed. Additional thanks
go to my parents for their encouragement and support, and to Mrs.
Edna Larrick for her excellent job of typing the manuscript.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
CHAPTER
1 INTRODUCTION
1.1 Literature Review
1.2 Summary of Results
2 CRAMÉR-VON MISES TYPE STATISTICS FOR TESTING SYMMETRY
2.1 Two Classes of Cramér-von Mises Type Symmetry Statistics
2.2 Properties of $R_n^{(a)}$ and $S_n^{(a)}$
2.3 A Third Class of Cramér-von Mises Type Statistics
3 A TEST OF SYMMETRY BASED ON THE WATSON STATISTIC
3.1 A Class of Symmetry Statistics Based on the Watson Statistic
3.2 Properties of $U_n^{(a)}$
3.3 The Asymptotic Null Distribution of $U_n^{(a)}$
3.4 The Exact Null Distribution of $U_n$
4 PROBLEMS FOR FURTHER RESEARCH
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
TESTS OF SYMMETRY BASED ON
THE CRAMÉR-VON MISES AND WATSON STATISTICS
By
David Lawrence Hill
August, 1976
Chairman: Dr. P. V. Rao
Major Department: Statistics
Three Cramér-von Mises type statistics for testing symmetry about zero have recently appeared in the literature. Two of these have the desirable property of invariance with respect to taking the negatives of the observations (Invariance Property I). However, none of these statistics possesses the property of invariance with respect to taking the reciprocals of the observations (Invariance Property II). In this dissertation, several related statistics, which possess both invariance properties, are developed.
In Chapter 2, two classes of statistics obtained by modifying the two-sample Cramér-von Mises statistic are studied. Statistics in these two classes are known to lead to consistent tests of symmetry and to possess Invariance Property I. In addition, a relationship between the two classes is established, and each class is shown to contain one of the above existing statistics. By combining statistics from the two classes, a third class is obtained. Statistics in the third class are shown to lead to consistent tests of symmetry and to possess Invariance Property II. Exact critical values are given for one of the statistics in the third class.
Since the asymptotic null distribution of statistics in the third class is not available at this time, a class of statistics obtained from a modification of the Watson two-sample statistic is defined in Chapter 3. Statistics in this fourth class are shown to lead to consistent tests of symmetry and to possess Invariance Property I. In addition, one of the statistics in the fourth class is shown to possess both invariance properties, and exact critical values are calculated for this statistic. The common asymptotic null distribution for statistics in the fourth class is shown to be the asymptotic null distribution of the Cramér-von Mises goodness-of-fit and two-sample statistics.
CHAPTER 1
INTRODUCTION
1.1 Literature Review
This dissertation explores the problem of testing whether a random variable with a continuous cumulative distribution function (CDF) is distributed symmetrically about a known point. A random variable with a continuous CDF, $F$, will be said to have a symmetric distribution about the point $\theta$ if, for all $y$,
$$F(\theta - y) = 1 - F(\theta + y).$$
Without loss of generality, the value of $\theta$ can be taken as zero, since $X - \theta$ has a symmetric distribution about zero if and only if $X$ has a symmetric distribution about $\theta$. Hence, the hypothesis of symmetry can be defined as follows:
Definition 1.1.1:
A CDF, $F$, satisfies the hypothesis of symmetry, $H_0$, provided $F \in \Phi_s$, where $\Phi_s$ denotes the set of all continuous CDFs such that, for all $y$,
$$F(y) = 1 - F(-y). \tag{1.1.1}$$
Note that if $\Phi$ denotes the set of all continuous CDFs, then $\Phi - \Phi_s$ is the set of all alternatives to $H_0$.
A variety of statistics are available for testing the hypothesis
$$H_0: F(y) = G(y) \quad \text{for all } y,$$
where $F$ and $G$ denote the unknown CDFs of two populations. These statistics will be referred to as two-sample statistics. Several two-sample statistics involve the empirical (sample) distribution function (EDF), $F_n(\cdot)$. If $X_1, X_2, \ldots, X_n$ denotes a random sample, the EDF, $F_n(\cdot)$, is defined by
$$F_n(y) = n^{-1}\,(\text{the number of } X_i\text{'s} \le y).$$
One such statistic is the two-sample Cramér-von Mises statistic, defined by Kiefer [10] as
$$\omega_n^2 = n \int_{-\infty}^{+\infty} [F_n(y) - G_n(y)]^2 \, dH_n(y),$$
where $F_n$ ($G_n$) denotes the EDF of a random sample of size $n$ from the population with CDF, $F$ ($G$), and $H_n$ denotes the EDF based on the combined sample. Another two-sample EDF statistic, due to Watson [20], is defined by
$$U_n^2 = n \int_{-\infty}^{+\infty} \Big\{ F_n(y) - G_n(y) - \int_{-\infty}^{+\infty} [F_n(w) - G_n(w)] \, dH_n(w) \Big\}^2 dH_n(y),$$
where $F_n$, $G_n$ and $H_n$ are as defined above. It should be noted here that similar but slightly more complicated forms of $\omega_n^2$ and $U_n^2$ exist, for use in the case of unequal sample sizes.
Recently, several articles have appeared in the literature in which a statistic for testing symmetry about zero is obtained by modifying one of the two-sample EDF statistics ([3], [4], [11], [12], [15] and [17]). Such modifications of $\omega_n^2$ and $U_n^2$ will be studied in this dissertation.
To date, three symmetry statistics obtained from $\omega_n^2$ have been studied. Orlov [12] and Rothman and Woodroofe [15] have independently proposed essentially equivalent statistics. Orlov's statistic is obtained from $\omega_n^2$ by replacing $[F_n(y) - G_n(y)]$ with the expression $[F_n(y) + F_n(-y) - 1]$, and is given by
$$L(0) = n \int_{-\infty}^{+\infty} [F_n(y) + F_n(-y) - 1]^2 \, dF_n(y).$$
Orlov points out that the value of $L(0)$ depends only on the values of $[F_n(y) + F_n(-y) - 1]^2$ at its discontinuity points, $X_1, X_2, \ldots, X_n$, and that these values may be taken to be anywhere between the right-hand and left-hand limits of $[F_n(y) + F_n(-y) - 1]^2$ without altering the asymptotic properties of $L(0)$. Orlov derives the asymptotic null distribution of $L(0)$ and gives a table of percentage points for this distribution.
Rothman and Woodroofe [15] have chosen a specific modification of $[F_n(y) + F_n(-y) - 1]$, and their statistic, $R_n'$, is defined as follows:
$$R_n' = n \int_{-\infty}^{+\infty} [F_n'(y) + F_n'(-y) - 1]^2 \, dF_n(y), \tag{1.1.2}$$
where
$$F_n'(y) = \tfrac{1}{2}[F_n(y+) + F_n(y-)]. \tag{1.1.3}$$
This modification of the EDF has the desirable effect of producing a statistic which is invariant with respect to the transformation $X_i \to -X_i$. In other words, if $X$ and $-X$ denote, respectively, the vectors $(X_1, X_2, \ldots, X_n)$ and $(-X_1, -X_2, \ldots, -X_n)$, where $X_1, X_2, \ldots, X_n$ denotes a random sample from $F \in \Phi$, then
$$R_n'(X) = R_n'(-X) \quad \text{w.p. 1},$$
where $R_n'(X)$ denotes the value of the statistic, $R_n'$, for the sample $X$. In the sequel, this property will be called Invariance Property I.
The asymptotic null distribution of $R_n'$, equivalent to that of $L(0)$, is obtained by Rothman and Woodroofe. In addition, they give approximate critical values for selected sample sizes and levels, $\alpha$, based on a Monte Carlo study. Tests based on $R_n'$ are shown to be consistent against all alternatives in $\Phi - \Phi_s$, and the asymptotic distribution of $R_n'$ under a sequence of local alternatives is also investigated.
The third modification of $\omega_n^2$ is due to Srinivasan and Godio [17]. A discussion of this statistic requires the definition of the following functions, to be used throughout the dissertation.
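Definition 1.1.2:
Let $(Z_{(1)}, Z_{(2)}, \ldots, Z_{(n)})$ denote the vector obtained by ordering the elements of $Z = (Z_1, Z_2, \ldots, Z_n)$ in ascending order according to the order relation given by
$$Z_i \text{ "<" } Z_j \quad \text{if } |Z_i| < |Z_j|.$$
In addition, if $P$ denotes a proposition, let the indicator function $I[\cdot]$ be given by
$$I[P] = \begin{cases} 1 & \text{if } P \text{ is true} \\ 0 & \text{if } P \text{ is false.} \end{cases} \tag{1.1.4}$$
Then, the functions $\delta_k(\cdot)$, $P_k(\cdot)$ and $N_k(\cdot)$, for $k = 1, 2, \ldots, n$, are defined by
$$\delta_k(Z) = \begin{cases} -1 & \text{if } Z_{(k)} < 0 \\ \phantom{-}1 & \text{if } Z_{(k)} \ge 0, \end{cases} \tag{1.1.5}$$
$$P_k(Z) = \sum_{j=1}^{k} I[\delta_j(Z) = 1], \tag{1.1.6}$$
and
$$N_k(Z) = \sum_{j=1}^{k} I[\delta_j(Z) = -1]. \tag{1.1.7}$$
If $Z$ denotes a random sample from a continuous CDF, then
$$P(|Z_i| = |Z_j|) = P(Z_i = 0) = 0, \quad i \ne j.$$
Thus, with probability one, the functions $\delta_k(\cdot)$, $P_k(\cdot)$ and $N_k(\cdot)$ are well defined.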
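The functions of Definition 1.1.2 can be sketched in a few lines of Python. This is only an illustration, not code from the dissertation: the function names (`abs_ordered`, `delta`, `running_counts`) and the sample values are assumptions introduced here, and the sample is assumed to contain no zeros and no ties in absolute value.

```python
def abs_ordered(z):
    """Order the sample by increasing absolute value: Z_(1), ..., Z_(n)."""
    return sorted(z, key=abs)

def delta(z):
    """delta_k(Z) for k = 1..n: -1 if Z_(k) < 0, +1 otherwise (1.1.5)."""
    return [-1 if v < 0 else 1 for v in abs_ordered(z)]

def running_counts(z):
    """Return the lists (P_1..P_n) and (N_1..N_n) of (1.1.6) and (1.1.7)."""
    pos, neg, P, N = 0, 0, [], []
    for d in delta(z):
        if d == 1:
            pos += 1
        else:
            neg += 1
        P.append(pos)
        N.append(neg)
    return P, N

z = [0.3, -1.2, 2.5, -0.1, 0.8]   # illustrative values only
P, N = running_counts(z)
print(delta(z))   # signs taken in the order -0.1, 0.3, 0.8, -1.2, 2.5
print(P, N)
```

Since every observation is counted in exactly one of the two running totals, $P_k(Z) + N_k(Z) = k$ for each $k$.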
The statistic proposed by Srinivasan and Godio, $S_n'$, may now be defined as follows:
$$S_n' \equiv S_n'(X) = \sum_{k=1}^{n} \Big[\frac{k}{2} - P_k(X)\Big]^2 = \sum_{k=1}^{n} \tfrac{1}{4}[N_k(X) - P_k(X)]^2. \tag{1.1.8}$$
The asymptotic and exact (for $n = 10(1)20$) null distributions of $S_n'$ are given in [17], the former being equivalent to the common asymptotic null distribution of the statistics $L(0)$ and $R_n'$. In addition, tests based on $S_n'$ are shown to be consistent against all alternatives in $\Phi - \Phi_s$. Although it is not mentioned in [17], it is clear from (1.1.8) that $S_n'$ possesses Invariance Property I.
1.2 Summary of Results
In Chapter 2, two classes of statistics for testing symmetry are defined, one containing $R_n'$, and the other containing a statistic equivalent to $S_n'$. The statistics in both classes are shown to possess Invariance Property I and to have a common asymptotic null distribution. Tests based on these statistics are shown to be consistent against all alternatives to symmetry about zero. For each statistic in the first class, it is shown that there exists a corresponding statistic in the second class with the same exact null distribution. These pairs of statistics are related further in that the test based on one is equivalent to the test based on the other using the sample of reciprocals, $(X_1^{-1}, X_2^{-1}, \ldots, X_n^{-1})$. By combining these pairs, a third class of statistics is defined. Statistics in the third class lead to consistent tests of symmetry, possess Invariance Property I and, in addition, are invariant with respect to the transformation $X_i \to X_i^{-1}$. In other words, if $X^{-1}$ denotes the vector of reciprocals, $(X_1^{-1}, X_2^{-1}, \ldots, X_n^{-1})$, where $X$ denotes a random sample from $F \in \Phi$, and if $T$ denotes a member of the third class of statistics, then
$$T(X) = T(X^{-1}) \quad \text{w.p. 1}.$$
This invariance property will henceforth be referred to as Invariance Property II.
Exact critical values are given for one member of the third class for $n = 10(1)24$. However, it does not appear that the common asymptotic null distribution of members of the third class will be easy to obtain. For this reason, a fourth class of statistics, based on a modification of Watson's two-sample statistic, $U_n^2$, is defined in Chapter 3. These statistics are shown to lead to consistent tests of symmetry, as well as to possess Invariance Property I. In addition, the common asymptotic null distribution of statistics in the fourth class is shown to be that of the well-known Cramér-von Mises goodness-of-fit statistic [7]. One member of the fourth class is shown to possess Invariance Property II, and exact critical values for this statistic are given for $n = 9(1)20$.
Chapter 4 contains a brief description of the problems that are yet to be solved in the general area of the topics considered in this dissertation.
CHAPTER 2
CRAMÉR-VON MISES TYPE STATISTICS
FOR TESTING SYMMETRY
2.1 Two Classes of Cramér-von Mises
Type Symmetry Statistics
Intuitively, a test of $H_0$ may be based on either of the following:
1. A comparison of estimates of $F(y)$ and $(1 - F(-y))$ for all values of $y$.
2. A comparison of estimates of $(F(y) - F(0))$ and $(F(0) - F(-y))$ for all values of $y$.
Since a natural estimator for $F(y)$ is the EDF, $F_n(y)$, a reasonable test statistic can be constructed from any functional of the difference between the estimates to be compared in (1) or (2) above. In (1), this test statistic would be based on
$$F_n(y) - (1 - F_n(-y)) = F_n(y) + F_n(-y) - 1, \tag{2.1.1}$$
whereas in (2), it would be based on
$$F_n(y) - F_n(0) - (F_n(0) - F_n(-y)) = F_n(y) + F_n(-y) - 2F_n(0). \tag{2.1.2}$$
A modified version of $F_n$, slightly more general than the one introduced by Rothman and Woodroofe [15], will be used to define the statistics considered in this dissertation. The reason for introducing this version of the EDF will be noted after the proof of Theorem 2.2.2. The modified EDF is presented in the following definition. Hereafter, $X = (X_1, X_2, \ldots, X_n)$ will denote a random sample from $F \in \Phi$, and $F_n$ will denote the standard version of the EDF obtained from $X$.
Definition 2.1.1:
Corresponding to each $a \in [0,1]$, the modified EDF, $F_n^{(a)}$, is defined as
$$F_n^{(a)}(y) = \begin{cases} a F_n(y+) + (1-a) F_n(y-) & \text{if } y < 0 \\ (1-a) F_n(y+) + a F_n(y-) & \text{if } y \ge 0, \end{cases} \tag{2.1.3}$$
where $F_n(y+)$ and $F_n(y-)$ are the right-hand and left-hand limits, respectively, of $F_n$ at $y$. Note that, from (1.1.3),
$$F_n' = F_n^{(1/2)}, \tag{2.1.4}$$
and, like $F_n'$, $F_n^{(a)}$ differs from $F_n$ only at its discontinuity points, $X_1, X_2, \ldots, X_n$. At these points, $F_n^{(a)}$ lies between the right-hand and left-hand limits of $F_n$. Specifically, for $i = 1, 2, \ldots, n$,
$$F_n^{(a)}(X_i) = \begin{cases} F_n(X_i) - \dfrac{(1-a)}{n} & \text{if } X_i < 0 \\[1ex] F_n(X_i) - \dfrac{a}{n} & \text{if } X_i > 0, \end{cases} \tag{2.1.5}$$
since
$$F_n(X_i) = F_n(X_i+) = F_n(X_i-) + \frac{1}{n}.$$
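A minimal Python sketch of Definition 2.1.1 follows. It is an illustration only: the function names and sample values are assumptions, and exact rational arithmetic (`fractions.Fraction`) is used so that (2.1.5) can be checked without rounding error. The right-hand limit $F_n(y+)$ is taken to be $F_n(y)$ itself, since the EDF is right-continuous.

```python
from fractions import Fraction

def edf(x, y):
    """Standard EDF F_n(y): proportion of observations <= y."""
    return Fraction(sum(v <= y for v in x), len(x))

def edf_left(x, y):
    """Left-hand limit F_n(y-): proportion of observations < y."""
    return Fraction(sum(v < y for v in x), len(x))

def modified_edf(x, y, a):
    """Modified EDF F_n^(a)(y) of (2.1.3)."""
    a = Fraction(a)
    if y < 0:
        return a * edf(x, y) + (1 - a) * edf_left(x, y)
    return (1 - a) * edf(x, y) + a * edf_left(x, y)

x = [0.3, -1.2, 2.5, -0.1, 0.8]   # illustrative values only
a = Fraction(1, 4)
for xi in x:
    # shift predicted by (2.1.5): (1-a)/n below zero, a/n above zero
    shift = (1 - a) if xi < 0 else a
    print(xi, modified_edf(x, xi, a), edf(x, xi) - shift / len(x))
```

The last two columns printed for each observation should agree, which is the content of (2.1.5); away from the observations the modified and standard EDFs coincide.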
The following lemma shows that $F_n$ and $F_n^{(a)}$ behave similarly for large samples.
Lemma 2.1.1:
For every $F \in \Phi$ and every $a \in [0,1]$,
$$\sup_y |F_n^{(a)}(y) - F(y)| \to 0 \quad \text{w.p. 1, as } n \to \infty.$$
Proof:
It is easily seen from (2.1.5) that
$$\sup_y |F_n^{(a)}(y) - F_n(y)| \le n^{-1} \max\{a, (1-a)\} \le n^{-1}.$$
The desired result now follows from the triangle inequality and the Glivenko-Cantelli Lemma [5].
The following corollary follows directly from (1.1.1) and Lemma 2.1.1. It will be needed in the proof of a later theorem.
Corollary 2.1.1:
Under $H_0$,
$$\sup_y |F_n^{(a)}(y) + F_n^{(a)}(-y) - 1| \to 0 \quad \text{w.p. 1, as } n \to \infty,$$
for every $a \in [0,1]$.
Now, by replacing $F_n$ with the modified EDF, $F_n^{(a)}$, in the comparisons given by (2.1.1) and (2.1.2), and then applying the functional $\int (\cdot)^2$, two distinct Cramér-von Mises type statistics for testing $H_0$ are produced. Since the index, $a$, can take any value in $[0,1]$, two classes of statistics actually result. These two classes will be denoted $\{R_n^{(a)} : 0 \le a \le 1\}$ and $\{S_n^{(a)} : 0 \le a \le 1\}$, and their general members are now defined.
Definition 2.1.2:
For $a \in [0,1]$,
$$R_n^{(a)} \equiv R_n^{(a)}(X) = n \int_{-\infty}^{+\infty} [F_n^{(a)}(y) + F_n^{(a)}(-y) - 1]^2 \, dF_n(y), \tag{2.1.6}$$
and
$$S_n^{(a)} \equiv S_n^{(a)}(X) = n \int_{-\infty}^{+\infty} [F_n^{(a)}(y) + F_n^{(a)}(-y) - 2F_n(0)]^2 \, dF_n(y). \tag{2.1.7}$$
Some properties of $R_n^{(a)}$ and $S_n^{(a)}$ will be considered in the next section. It should be noted here that reasonable tests of $H_0$ are obtained by rejecting $H_0$ for "large" values of $R_n^{(a)}$ or $S_n^{(a)}$. Also, from (2.1.4), $R_n' = R_n^{(1/2)}$, so that one member of the class $\{R_n^{(a)} : 0 \le a \le 1\}$ has already been studied by Rothman and Woodroofe [15].
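Because $dF_n$ places mass $1/n$ on each observation, the integrals in (2.1.6) and (2.1.7) reduce to sums of squared integrand values over the sample. The following Python sketch computes both statistics directly from that observation. The function names and data are illustrative assumptions, and the sample is assumed to have no zeros and no ties in absolute value.

```python
from fractions import Fraction

def edf(x, y, left=False):
    """F_n(y), or the left-hand limit F_n(y-) when left=True."""
    count = sum(v < y if left else v <= y for v in x)
    return Fraction(count, len(x))

def modified_edf(x, y, a):
    """F_n^(a)(y) of (2.1.3); w is the weight on the right-hand limit."""
    a = Fraction(a)
    w = a if y < 0 else 1 - a
    return w * edf(x, y) + (1 - w) * edf(x, y, left=True)

def R_stat(x, a):
    """R_n^(a) of (2.1.6): n * integral = sum over the observations."""
    return sum((modified_edf(x, v, a) + modified_edf(x, -v, a) - 1) ** 2
               for v in x)

def S_stat(x, a):
    """S_n^(a) of (2.1.7), centred at 2 F_n(0) instead of 1."""
    centre = 2 * edf(x, 0)
    return sum((modified_edf(x, v, a) + modified_edf(x, -v, a) - centre) ** 2
               for v in x)

x = [0.3, -1.2, 2.5, -0.1, 0.8]   # illustrative values only
print(S_stat(x, 0), R_stat(x, 1))
print(R_stat(x, Fraction(1, 2)))  # Rothman and Woodroofe's R_n' by (2.1.4)
```

Negating every observation leaves both values unchanged for this sample, which is the behaviour that Invariance Property I requires.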
2.2 Properties of $R_n^{(a)}$ and $S_n^{(a)}$
As pointed out by Rothman and Woodroofe [15], it is reasonable to require that a test statistic for $H_0$ possess Invariance Property I. In Chapter 1, it was noted that both $R_n'$ and $S_n'$ satisfy this requirement. The next theorem shows that each member of the classes $\{R_n^{(a)} : 0 \le a \le 1\}$ and $\{S_n^{(a)} : 0 \le a \le 1\}$ possesses Invariance Property I.
Theorem 2.2.1:
For every $a \in [0,1]$,
$$R_n^{(a)}(X) = R_n^{(a)}(-X) \quad \text{w.p. 1}, \tag{2.2.1}$$
and
$$S_n^{(a)}(X) = S_n^{(a)}(-X) \quad \text{w.p. 1}. \tag{2.2.2}$$
Proof:
Let $G_n$ and $G_n^{(a)}$ denote, respectively, the EDF and its modification (according to Definition 2.1.1) for the sample, $-X$. If $y < 0$,
$$\begin{aligned} F_n^{(a)}(y) &= a F_n(y+) + (1-a) F_n(y-) \\ &= n^{-1}\Big\{ a \sum_{i=1}^{n} I[X_i \le y] + (1-a) \sum_{i=1}^{n} I[X_i < y] \Big\} \\ &= n^{-1}\Big\{ a \sum_{i=1}^{n} \big(1 - I[-X_i < -y]\big) + (1-a) \sum_{i=1}^{n} \big(1 - I[-X_i \le -y]\big) \Big\} \\ &= 1 - a\, G_n((-y)-) - (1-a)\, G_n((-y)+). \end{aligned}$$
Since $y < 0$ implies that $-y > 0$, the last expression equals
$$1 - G_n^{(a)}(-y).$$
Analogously, the same result holds if $y > 0$. Since $F_n(0) = 1 - G_n(0)$ with probability one, for all $y$,
$$F_n^{(a)}(y) = 1 - G_n^{(a)}(-y) \quad \text{w.p. 1}.$$
Hence, for all $y$,
$$F_n^{(a)}(y) + F_n^{(a)}(-y) - 1 = -[G_n^{(a)}(y) + G_n^{(a)}(-y) - 1] \quad \text{w.p. 1},$$
from which (2.2.1) follows immediately. A sketch of the proof of (2.2.2) is given immediately following Corollary 2.2.1.
Recall that in (1.1.8), $S_n'$ was defined in terms of $P_k(\cdot)$, $N_k(\cdot)$ and $\delta_k(\cdot)$, and not in terms of the EDF. The next theorem provides representations of $R_n^{(a)}$ and $S_n^{(a)}$ in terms of $P_k(\cdot)$, $N_k(\cdot)$ and $\delta_k(\cdot)$. These representations not only provide a convenient form for the actual computation of the test statistics, but also enable one to establish an important relation between $R_n^{(1-a)}$ and $S_n^{(a)}$, $0 \le a \le 1$. In addition, they are useful in proving several theorems which follow.
Theorem 2.2.2:
With probability one, for every $a \in [0,1]$,
$$R_n^{(a)}(X) = n^{-2} \sum_{k=1}^{n} [N_k(X^{-1}) - P_k(X^{-1}) + (1-a)\,\delta_k(X^{-1})]^2, \tag{2.2.3}$$
and
$$S_n^{(a)}(X) = n^{-2} \sum_{k=1}^{n} [N_k(X) - P_k(X) + a\,\delta_k(X)]^2, \tag{2.2.4}$$
where $\delta_k(\cdot)$, $P_k(\cdot)$ and $N_k(\cdot)$ are given by (1.1.5), (1.1.6) and (1.1.7).
Proof:
The result given by (2.2.3) will be proved first. If it can be shown that, for $k = 1, 2, \ldots, n$,
$$F_n^{(a)}(X_{(n-k+1)}) + F_n^{(a)}(-X_{(n-k+1)}) - 1 = n^{-1}[N_k(X^{-1}) - P_k(X^{-1}) + (1-a)\,\delta_k(X^{-1})], \tag{2.2.5}$$
then (2.2.3) will follow immediately from (2.1.6) by summing the squares on both sides of (2.2.5) over the index $k$, $k = 1, 2, \ldots, n$.
To establish (2.2.5), expressions involving $F_n$ will be found for $P_k(X^{-1})$ and $N_k(X^{-1})$. If $Y = X^{-1}$, then, from Definition 1.1.2, $(Y_{(n-j+1)})^{-1} = X_{(j)}$. Hence, for $k = 1, 2, \ldots, n$, with probability one,
$$P_k(X^{-1}) = \sum_{j=n-k+1}^{n} I[X_{(j)} \ge 0], \tag{2.2.6}$$
$$N_k(X^{-1}) = \sum_{j=n-k+1}^{n} I[X_{(j)} < 0]. \tag{2.2.7}$$
Since $|X_{(i)}| < |X_{(j)}|$ if and only if $i < j$, equivalent expressions of (2.2.6) and (2.2.7) are given by
$$P_k(X^{-1}) = \sum_{i=1}^{n} I[X_i \ge |X_{(n-k+1)}|]$$
and
$$N_k(X^{-1}) = \sum_{i=1}^{n} I[X_i \le -|X_{(n-k+1)}|].$$
Thus, with probability one, for $k = 1, 2, \ldots, n$,
$$P_k(X^{-1}) = \begin{cases} n[1 - F_n(|X_{(n-k+1)}|)] & \text{if } X_{(n-k+1)} < 0 \\ n[1 - F_n(|X_{(n-k+1)}|)] + 1 & \text{if } X_{(n-k+1)} > 0 \end{cases} \;=\; n[1 - F_n(|X_{(n-k+1)}|)] + I[X_{(n-k+1)} \ge 0],$$
and
$$N_k(X^{-1}) = n F_n(-|X_{(n-k+1)}|).$$
Hence, with probability one,
$$\begin{aligned} N_k(X^{-1}) - P_k(X^{-1}) + (1-a)\,\delta_k(X^{-1}) &= n\{F_n(-|X_{(n-k+1)}|) - 1 + F_n(|X_{(n-k+1)}|)\} - I[X_{(n-k+1)} \ge 0] + (1-a)\,\delta_{n-k+1}(X) \\ &= n\{F_n(X_{(n-k+1)}) + F_n(-X_{(n-k+1)}) - 1 - \alpha_k\}, \end{aligned} \tag{2.2.8}$$
where
$$\alpha_k = n^{-1}\{I[X_{(n-k+1)} \ge 0] - (1-a)\,\delta_{n-k+1}(X)\} = \begin{cases} \dfrac{(1-a)}{n} & \text{if } X_{(n-k+1)} < 0 \\[1ex] \dfrac{a}{n} & \text{if } X_{(n-k+1)} \ge 0, \end{cases} \qquad k = 1, 2, \ldots, n.$$
Now, it follows from (2.1.5) that
$$F_n(X_{(n-k+1)}) + F_n(-X_{(n-k+1)}) - \alpha_k = F_n^{(a)}(X_{(n-k+1)}) + F_n^{(a)}(-X_{(n-k+1)}),$$
and, hence, (2.2.5) follows from (2.2.8), which establishes (2.2.3).
The proof of (2.2.4) is almost identical. It can be shown that
$$P_k(X) = \sum_{i=1}^{n} I[0 \le X_i \le |X_{(k)}|]$$
and
$$N_k(X) = \sum_{i=1}^{n} I[-|X_{(k)}| \le X_i < 0].$$
This leads to the following equation corresponding to (2.2.5):
$$F_n^{(a)}(X_{(k)}) + F_n^{(a)}(-X_{(k)}) - 2F_n(0) = -n^{-1}[N_k(X) - P_k(X) + a\,\delta_k(X)]. \tag{2.2.9}$$
Thus, (2.2.4) follows as before, and the proof of the theorem is complete.
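The two representations in Theorem 2.2.2 can be compared numerically against the integral definitions (2.1.6) and (2.1.7). The sketch below is an illustration under stated assumptions: the function names and the sample are invented here, the sample has no zeros and no ties in absolute value, and rational arithmetic is used so the comparison is exact. The running variable `diff` uses the identity $N_k - P_k = -(\delta_1 + \cdots + \delta_k)$.

```python
from fractions import Fraction

def edf(x, y, left=False):
    """F_n(y), or the left-hand limit F_n(y-) when left=True."""
    count = sum(v < y if left else v <= y for v in x)
    return Fraction(count, len(x))

def modified_edf(x, y, a):
    """F_n^(a)(y) of (2.1.3); w is the weight on the right-hand limit."""
    w = a if y < 0 else 1 - a
    return w * edf(x, y) + (1 - w) * edf(x, y, left=True)

def integral_form(x, a, centre):
    """n * integral of [F^(a)(y) + F^(a)(-y) - centre]^2 dF_n(y)."""
    return sum((modified_edf(x, v, a) + modified_edf(x, -v, a) - centre) ** 2
               for v in x)

def R_integral(x, a):
    return integral_form(x, a, 1)                # definition (2.1.6)

def S_integral(x, a):
    return integral_form(x, a, 2 * edf(x, 0))    # definition (2.1.7)

def rank_sum(z, w):
    """n^-2 * sum_k [N_k(z) - P_k(z) + w * delta_k(z)]^2."""
    total, diff = Fraction(0), 0
    for v in sorted(z, key=abs):
        d = -1 if v < 0 else 1
        diff -= d                                # diff is now N_k - P_k
        total += (diff + w * d) ** 2
    return total / len(z) ** 2

def R_rank(x, a):
    return rank_sum([1 / v for v in x], 1 - a)   # representation (2.2.3)

def S_rank(x, a):
    return rank_sum(x, a)                        # representation (2.2.4)

x = [Fraction(3, 10), Fraction(-6, 5), Fraction(5, 2),
     Fraction(-1, 10), Fraction(4, 5)]           # illustrative values only
for a in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    print(a, R_integral(x, a) == R_rank(x, a), S_integral(x, a) == S_rank(x, a))
```

The rank forms need only one sort and one pass over the signs, which is why they are the more convenient choice for computation.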
An immediate consequence of (1.1.8) and (2.2.4) is that
$$S_n^{(0)}(X) = (4n^{-2})\, S_n'(X).$$
Hence, Srinivasan and Godio's statistic, $S_n'$, can be expressed in terms of the modified EDF, $F_n^{(0)}$, using (2.1.7), which provides the reason for introducing $F_n^{(a)}$.
The following corollary, which follows directly from Theorem 2.2.2, establishes an important relation between the classes $\{R_n^{(a)} : 0 \le a \le 1\}$ and $\{S_n^{(a)} : 0 \le a \le 1\}$.
Corollary 2.2.1:
For every $a \in [0,1]$,
$$R_n^{(1-a)}(X) = S_n^{(a)}(X^{-1}) \quad \text{w.p. 1}.$$
In other words, the test of symmetry based on $S_n^{(a)}$, using the sample $X$, is equivalent to the test based on $R_n^{(1-a)}$ using the sample of reciprocals, $X^{-1}$.
As a result of Corollary 2.2.1, the proof of Theorem 2.2.1 can easily be completed by noting that (2.2.2) follows directly from (2.2.1) and Corollary 2.2.1, since the transformations $X \to X^{-1}$ and $X \to -X$ are commutative.
In conjunction with Corollary 2.2.1, the following lemma will be useful in establishing several later results.
Lemma 2.2.1:
Let $F$ and $F_r$ denote the CDFs of $X$ and its reciprocal, $X^{-1}$, respectively. Then,
$$F \in \Phi_s \quad \text{if and only if} \quad F_r \in \Phi_s.$$
Proof:
Let $F^*$ denote the CDF of the random variable $-X$. Then it follows directly from (1.1.1) that
$$F \in \Phi_s \quad \text{if and only if} \quad F = F^* \text{ and } F \in \Phi. \tag{2.2.10}$$
Also observe that
$$F_r(y) = \begin{cases} F(0) - F(y^{-1}) & \text{if } y < 0 \\ F(0) & \text{if } y = 0 \\ 1 + F(0) - F(y^{-1}) & \text{if } y > 0. \end{cases}$$
Thus, since
$$\lim_{y \to 0-} [F(0) - F(y^{-1})] = \lim_{y \to 0+} [1 + F(0) - F(y^{-1})] = F(0),$$
it follows that
$$F = F^* \text{ and } F \in \Phi \quad \text{if and only if} \quad F_r = F_r^* \text{ and } F_r \in \Phi.$$
Combining this result with (2.2.10) completes the proof.
Another consequence of Theorem 2.2.2 is that, for every $a \in [0,1]$, both $R_n^{(a)}$ and $S_n^{(a)}$ are distribution free. Considering $S_n^{(a)}$ first, it is clear from Definition 1.1.2 and Theorem 2.2.2 that $S_n^{(a)}(X)$ depends on $X$ only through
$$\delta = (\delta_1(X), \delta_2(X), \ldots, \delta_n(X)).$$
A result of Hájek and Šidák ([8], p. 40) shows that, under $H_0$,
$$P(\delta = d) = 2^{-n}$$
for every $d \in \Delta$, where
$$\Delta = \{(d_1, d_2, \ldots, d_n) : d_i = \pm 1, \ i = 1, 2, \ldots, n\}.$$
Hence, for every $a \in [0,1]$, $S_n^{(a)}$ is distribution free under $H_0$. It now follows from Corollary 2.2.1 and Lemma 2.2.1 that $R_n^{(a)}$ is likewise distribution free under $H_0$, and that $S_n^{(a)}$ and $R_n^{(1-a)}$ are identically distributed for every $a \in [0,1]$ and every $F \in \Phi_s$. That is, for each statistic in the class $\{S_n^{(a)} : 0 \le a \le 1\}$ there exists a corresponding statistic in the class $\{R_n^{(a)} : 0 \le a \le 1\}$ having the same exact null distribution. It should be noted, however, that these pairs of statistics do not produce equivalent tests of $H_0$, even when $n$ becomes large, since the statistics in a pair will not have equal values for every sample, $X$, and since $|R_n^{(1-a)} - S_n^{(a)}|$ does not converge to zero.
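Since $S_n^{(a)}$ depends on the sample only through the sign vector $\delta$, and every $d \in \Delta$ has null probability $2^{-n}$, the exact null distribution can in principle be tabulated by brute force. The Python sketch below illustrates this for $n^2 S_n^{(a)}$ using (2.2.4). The function names are assumptions made for this example, and the approach is practical only for small $n$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def n2_S(d, a):
    """n^2 * S_n^(a) as a function of the sign vector d, from (2.2.4)."""
    total, diff = Fraction(0), 0
    for dk in d:
        diff -= dk                      # N_k - P_k = -(d_1 + ... + d_k)
        total += (diff + a * dk) ** 2
    return total

def exact_null_distribution(n, a):
    """Map each attainable value of n^2 * S_n^(a) to its null probability."""
    counts = Counter(n2_S(d, a) for d in product((-1, 1), repeat=n))
    return {value: Fraction(c, 2 ** n) for value, c in sorted(counts.items())}

for value, prob in exact_null_distribution(3, Fraction(0)).items():
    print(value, prob)
```

Reversing each sign vector gives the corresponding values of $n^2 R_n^{(1-a)}$, and reversal is a one-to-one map of $\Delta$ onto itself, so the same table serves for both statistics.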
Rothman and Woodroofe [15] do not give exact critical values for $R_n'$, and Srinivasan and Godio [17] were not successful in deriving them with the method they employed to obtain the exact values for $S_n'$. However, from the above discussion, it follows that the critical values given in [17] are also the critical values for $(\tfrac{1}{4} n^2) R_n^{(1)}$. Hence, exact critical values are available for a statistic closely related to $R_n'$. In fact, $R_n^{(1/2)}$ and $R_n^{(1)}$ have the same asymptotic null distribution. Since $R_n^{(1)}$ and $S_n^{(0)}$ are identically distributed, this follows from the fact, noted in Section 1.1, that $R_n^{(1/2)}$ and $S_n^{(0)}$ have a common asymptotic null distribution. The next theorem shows that this relation between $R_n^{(1/2)}$ and $R_n^{(1)}$ holds for any pair of statistics in the $R_n^{(a)}$ class, and for any pair in the $S_n^{(a)}$ class as well.
Theorem 2.2.3:
Under $H_0$, $R_n^{(a)}$ ($S_n^{(a)}$) and $R_n^{(b)}$ ($S_n^{(b)}$) have the same asymptotic distribution for any $a$ and $b$ in the unit interval. In fact, under $H_0$,
$$|R_n^{(a)} - R_n^{(b)}| \to 0 \quad \text{w.p. 1, as } n \to \infty,$$
and
$$|S_n^{(a)} - S_n^{(b)}| \to 0 \quad \text{w.p. 1, as } n \to \infty.$$
Proof:
In view of Corollary 2.2.1 and Lemma 2.2.1, it will be sufficient to prove the result only for $R_n^{(a)}$. Observe that
$$\begin{aligned} R_n^{(a)} = R_n^{(b)} &+ n \int [F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)]^2 \, dF_n(y) \\ &+ 2n \int [F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)] \, [F_n^{(b)}(y) + F_n^{(b)}(-y) - 1] \, dF_n(y), \end{aligned}$$
so that
$$\begin{aligned} |R_n^{(a)} - R_n^{(b)}| \le{} & n \Big[\sup_y |F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)|\Big]^2 \\ & + 2n \sup_y |F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)| \cdot \sup_y |F_n^{(b)}(y) + F_n^{(b)}(-y) - 1|. \end{aligned}$$
From (2.1.5),
$$\sup_y |F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)| \le 2n^{-1},$$
and thus it follows that
$$|R_n^{(a)} - R_n^{(b)}| \le o(1) + 4 \sup_y |F_n^{(b)}(y) + F_n^{(b)}(-y) - 1|. \tag{2.2.11}$$
By Corollary 2.1.1, the second term of the last expression, under $H_0$, converges to zero with probability one, completing the proof of the theorem.
It is now clear, from the fact that $R_n^{(1-a)}$ and $S_n^{(a)}$ are identically distributed, that all statistics in both classes have a common asymptotic null distribution. This agrees with the work of Rothman and Woodroofe [15] and Srinivasan and Godio [17], who independently derived the same asymptotic distribution for the statistics $R_n^{(1/2)}$ and $S_n^{(0)}$, respectively. As noted in Section 1.1, percentage points have been tabulated for this distribution by Orlov [12].
Consistency of the test for symmetry based on any statistic in the classes $\{R_n^{(a)} : 0 \le a \le 1\}$ or $\{S_n^{(a)} : 0 \le a \le 1\}$ is now established.
Theorem 2.2.4:
For any $a \in [0,1]$, $R_n^{(a)}$ ($S_n^{(a)}$) is consistent against any alternative in $\Phi - \Phi_s$.
Proof:
Rothman and Woodroofe [15] show that if $H_0$ is not satisfied,
$$R_n^{(1/2)} \to +\infty \quad \text{w.p. 1}.$$
Since $\sup_y |F_n^{(1/2)}(y) + F_n^{(1/2)}(-y) - 1| \le 1$, it follows from (2.2.11) that for any $a \in [0,1]$, $|R_n^{(a)} - R_n^{(1/2)}| \le o(1) + 4$. Hence for any $F \in \Phi - \Phi_s$,
$$R_n^{(a)} \to +\infty \quad \text{w.p. 1}, \tag{2.2.12}$$
and thus $R_n^{(a)}$ is consistent against any alternative in $\Phi - \Phi_s$ for every $a \in [0,1]$. In view of Corollary 2.2.1 and Lemma 2.2.1, $S_n^{(a)}$ is likewise consistent for every $a \in [0,1]$, completing the proof.
Since exact critical values for $R_n^{(1)}$ are readily available, it is of interest to know whether there is any further advantage to be gained by using $R_n^{(1)}$ instead of $R_n^{(1/2)}$. More generally, one might look for a criterion to select a statistic from each of the two classes, $\{R_n^{(a)} : 0 \le a \le 1\}$ and $\{S_n^{(a)} : 0 \le a \le 1\}$. It has been established by Theorem 2.2.3 that the effect of the choice of the index, $a$, becomes negligible as the sample size becomes large. However, the value of the index will affect the small sample distributions of both $R_n^{(a)}$ and $S_n^{(a)}$. Specifically, the approximately $2^n \alpha$ sample points (of $\Delta$) comprising the rejection region of an $\alpha$-level test based on $R_n^{(a)}$ or $S_n^{(a)}$ will depend on the value of the index, $a$, and hence, so will the small sample power of such a test. The following corollary to Theorem 2.2.2 may be applied to provide a characterization of the effect of the choice of the index, $a$, on the statistics in the two classes.
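Corollary 2.2.2:
With probability one, and for every $a \in [0,1]$,
$$(n^2)\, S_n^{(a)} = \sum_{k=1}^{n-1} [N_k(X) - P_k(X)]^2 + (1-a)[N_n(X) - P_n(X)]^2 - na(1-a), \tag{2.2.13}$$
and
$$(n^2)\, R_n^{(a)} = \sum_{k=1}^{n-1} [N_k(X^{-1}) - P_k(X^{-1})]^2 + a\,[N_n(X^{-1}) - P_n(X^{-1})]^2 - na(1-a).$$
Proof:
It will be sufficient to prove only the result for $S_n^{(a)}$, as the corresponding result for $R_n^{(a)}$ will then follow directly from Corollary 2.2.1. From (2.2.4),
$$n^2 S_n^{(a)} = \sum_{k=1}^{n} [N_k(X) - P_k(X)]^2 + 2a \sum_{k=1}^{n} [N_k(X) - P_k(X)]\,\delta_k(X) + \sum_{k=1}^{n} [a\,\delta_k(X)]^2, \tag{2.2.14}$$
and from (1.1.6) and (1.1.7),
$$N_k(X) - P_k(X) = (-1) \sum_{j=1}^{k} \delta_j(X). \tag{2.2.15}$$
Hence, the second term on the right-hand side of (2.2.14) can be rewritten as
$$\begin{aligned} -2a \sum_{k=1}^{n} \delta_k(X) \sum_{j=1}^{k} \delta_j(X) &= -a \sum_{k=1}^{n} [\delta_k(X)]^2 - a \sum_{k=1}^{n} \sum_{j=1}^{n} \delta_k(X)\,\delta_j(X) \\ &= -a \sum_{k=1}^{n} [\delta_k(X)]^2 - a \Big[\sum_{k=1}^{n} \delta_k(X)\Big]^2. \end{aligned}$$
Using (2.2.15) again, this last expression becomes
$$-a \sum_{k=1}^{n} [\delta_k(X)]^2 - a\,[N_n(X) - P_n(X)]^2.$$
Hence, (2.2.14) becomes
$$n^2 S_n^{(a)} = \sum_{k=1}^{n} [N_k(X) - P_k(X)]^2 - a\,[N_n(X) - P_n(X)]^2 + (a^2 - a) \sum_{k=1}^{n} [\delta_k(X)]^2,$$
which reduces to (2.2.13) in view of the fact that $[\delta_k(X)]^2 = 1$ with probability one, for $k = 1, 2, \ldots, n$. This completes the proof of the corollary.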
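The decomposition (2.2.13) can be checked exhaustively for a small $n$, since both sides depend only on the sign vector. The following Python sketch is illustrative only: the function names are assumptions, `lhs` evaluates $n^2 S_n^{(a)}$ from (2.2.4), and `rhs` evaluates the right-hand side of (2.2.13) using the partial sums of the signs, whose squares equal $[N_k - P_k]^2$ by (2.2.15).

```python
from fractions import Fraction
from itertools import accumulate, product

def lhs(d, a):
    """n^2 * S_n^(a) from (2.2.4), for a sign vector d."""
    total, diff = Fraction(0), 0
    for dk in d:
        diff -= dk                       # N_k - P_k
        total += (diff + a * dk) ** 2
    return total

def rhs(d, a):
    """Right-hand side of (2.2.13)."""
    partial = list(accumulate(d))        # d_1 + ... + d_k = P_k - N_k
    n = len(d)
    head = sum(s * s for s in partial[:-1])
    return head + (1 - a) * partial[-1] ** 2 - n * a * (1 - a)

n = 6
ok = all(lhs(d, a) == rhs(d, a)
         for d in product((-1, 1), repeat=n)
         for a in (Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(1)))
print(ok)
```

The form of `rhs` makes the role of the index visible: only the weight on the last squared difference and an additive constant change with $a$.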
An intuitively reasonable criterion for choosing a representative statistic from each class may be based upon the preceding corollary. With respect to $S_n^{(a)}$, $(1-a)$ may be regarded as the weight assigned to $[N_n(X) - P_n(X)]^2$, whereas, for $R_n^{(a)}$, the corresponding weight assigned to $[N_n(X^{-1}) - P_n(X^{-1})]^2$ is $(a)$. It is not clear how one would justify the assignment to $[N_n(X) - P_n(X)]^2$ or $[N_n(X^{-1}) - P_n(X^{-1})]^2$ of a weight different from that assigned to $[N_k(X) - P_k(X)]^2$ or $[N_k(X^{-1}) - P_k(X^{-1})]^2$, $k = 1, 2, \ldots, n-1$ (i.e., 1). Since $S_n^{(0)}$ and $R_n^{(1)}$ are the two statistics in which the full weight of 1 is assigned to the $n$th term, they appear to be logical choices in determining representative statistics from the two classes. In addition, exact critical values are available for both $S_n^{(0)}$ and $R_n^{(1)}$, and these two statistics are somewhat easier to compute than other members of their respective classes.
2.3 A Third Class of Cramér-von Mises Type Statistics
Both invariance properties defined in Chapter 1 are desirable properties for tests of symmetry. In Section 2.2, it was shown that $S_n^{(a)}$ and $R_n^{(a)}$ possess Invariance Property I for all $a \in [0,1]$. However, it is easily seen that neither class contains a statistic possessing Invariance Property II. By Corollary 2.2.1, the assumption that either $R_n^{(a)}$ or $S_n^{(a)}$ possesses this property leads to the contradiction that $S_n^{(a)}$ is equivalent to $R_n^{(1-a)}$. Corollary 2.2.1 does suggest, however, a method of combining statistics of the $R_n^{(a)}$ class with those of the $S_n^{(a)}$ class to produce statistics possessing both invariance properties, as shown by the following theorem.
Theorem 2.3.1:
Let
$$T_n^{(a)} \equiv T_n^{(a)}(X) = \tfrac{1}{2}\big[R_n^{(1-a)}(X) + S_n^{(a)}(X)\big].$$
Then, for every $a \in [0,1]$, $T_n^{(a)}$ possesses Invariance Properties I and II, and leads to a consistent test of symmetry against any alternative in $\Phi - \Phi_s$.
Proof:
That $T_n^{(a)}$ possesses Invariance Property I is clear, as both $R_n^{(1-a)}$ and $S_n^{(a)}$ possess the property. Invariance under the transformation $X \to X^{-1}$ follows from Corollary 2.2.1, since, with probability one,
$$\begin{aligned} T_n^{(a)}(X) &= \tfrac{1}{2}\big[S_n^{(a)}(X) + R_n^{(1-a)}(X)\big] \\ &= \tfrac{1}{2}\big[R_n^{(1-a)}(X^{-1}) + S_n^{(a)}(X^{-1})\big] \\ &= T_n^{(a)}(X^{-1}). \end{aligned}$$
Finally, the consistency of the test based on $T_n^{(a)}$ follows immediately from (2.2.12) and the corresponding result for $S_n^{(a)}$.
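Both invariance properties of $T_n^{(a)}$ can be observed numerically. The Python sketch below builds $T_n^{(a)}$ from the rank representations (2.2.3) and (2.2.4); with index $1-a$, the weight on $\delta_k(X^{-1})$ in (2.2.3) is $a$, so the same helper serves both terms. The function names and sample values are illustrative assumptions, and the sample is assumed free of zeros and of ties in absolute value.

```python
from fractions import Fraction

def rank_sum(z, w):
    """n^-2 * sum_k [N_k(z) - P_k(z) + w * delta_k(z)]^2."""
    total, diff = Fraction(0), 0
    for v in sorted(z, key=abs):
        d = -1 if v < 0 else 1
        diff -= d                       # diff is now N_k - P_k
        total += (diff + w * d) ** 2
    return total / len(z) ** 2

def T_stat(x, a):
    """T_n^(a)(X) = [R_n^(1-a)(X) + S_n^(a)(X)] / 2."""
    recip = [1 / v for v in x]
    r_term = rank_sum(recip, a)         # R_n^(1-a)(X) via (2.2.3)
    s_term = rank_sum(x, a)             # S_n^(a)(X) via (2.2.4)
    return (r_term + s_term) / 2

x = [Fraction(3, 10), Fraction(-6, 5), Fraction(5, 2),
     Fraction(-1, 10), Fraction(4, 5)]  # illustrative values only
a = Fraction(1, 4)
print(T_stat(x, a))
print(T_stat([-v for v in x], a))       # Invariance Property I
print(T_stat([1 / v for v in x], a))    # Invariance Property II
```

All three printed values should coincide. Written this way, $T_n^{(a)}$ is visibly symmetric in the roles of the sample and its reciprocals, which is the mechanism behind Invariance Property II.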
For the reasons stated at the end of Section 2.2, T appears
n
to be a logical choice of the "best" statistic from the class
2
{T(a) : 0aSl Table 1 provides the critical values for T*= T(0)
n n 4 n
at selected levels, a, and sample sizes, n = 10(1)24. Because the exact
null distribution is not available in closed form for either R(a) or
n
(a) (0)
S the exact distribution of T was calculated by means of a com
n n
puter enumeration of all possible values of the statistic, one for each
d in A. Since A contains 2n sample points, this method is suitable only
_ _
TABLE 1
EXACT CRITICAL VALUES FOR $T^* = 4n^2 T_n^{(1/2)}$

n = 10
  x            32.25   33.25   38.25   39.25   44.25   48.25   60.25   62.25
  P(T* < x)   0.8945  0.9102  0.9492  0.9570  0.9746  0.9785  0.9863  0.9902

n = 11
  x            33.00   34.50   49.50   50.00   56.50   57.50   81.50   82.00
  P(T* < x)   0.8955  0.9072  0.9443  0.9502  0.9736  0.9775  0.9893  0.9912

n = 12
  x            43.00   44.00   53.00   54.00   71.50   73.50   80.50   83.50
  P(T* < x)   0.8970  0.9087  0.9497  0.9517  0.7961  0.7961  0.9883  0.9912

n = 13
  x            49.25   51.75   64.75   65.75   77.75   78.75  101.75  104.25
  P(T* < x)   0.8982  0.9009  0.9490  0.9504  0.9736  0.9751  0.9883  0.9902

n = 14
  x            54.75   55.75   79.75   80.75   91.75   92.75  125.75  127.72
  P(T* < x)   0.8979  0.9000  0.9497  0.9512  0.9736  0.9751  0.9899  0.9907

n = 15
  x            68.00   68.50   84.00   84.50  112.50  113.00  132.50  133.00
  P(T* < x)   0.8984  0.9001  0.9492  0.9501  0.9743  0.9750  0.9899  0.9902

n = 16
  x            71.50   72.50   99.50  100.00  120.00  121.00  155.50  156.50
  P(T* < x)   0.8988  0.9013  0.9487  0.9509  0.9748  0.9752  0.9897  0.9903

n = 17
  x            83.25   83.74  115.25  115.75  137.75  138.25  184.75  185.25
  P(T* < x)   0.8989  0.9009  0.9499  0.9502  0.9749  0.9753  0.9900  0.9903

n = 18
  x            96.25   97.25  121.25  122.25  161.25  162.25  192.25  193.25
  P(T* < x)   0.8989  0.9008  0.9493  0.9506  0.9746  0.9752  0.9898  0.9907

TABLE 1 (Continued)

n = 19
  x           101.50  102.00  141.50  142.00  169.50  170.00  220.00  220.50
  P(T* < x)   0.8990  0.9004  0.9495  0.9505  0.9747  0.9750  0.9897  0.9900

n = 20
  x           117.00  117.50  152.00  152.50  192.50  193.00  253.00  253.50
  P(T* < x)   0.8981  0.9003  0.9496  0.9503  0.9748  0.9753  0.9900  0.9902

n = 21
  x           125.25  126.25  167.75  168.25  218.25  218.75  263.75  264.25
  P(T* < x)   0.8998  0.9001  0.9496  0.9504  0.9749  0.9751  0.9899  0.9900

n = 22
  x           137.73  138.75  189.75  190.75  227.75  228.75  296.75  297.75
  P(T* < x)   0.8987  0.9006  0.9499  0.9506  0.9747  0.9751  0.9900  0.9901

n = 23
  x           155.75  156.25  199.25  199.75  257.25  257.75  332.25  332.75
  P(T* < x)   0.8998  0.9003  0.9497  0.9501  0.9748  0.9750  0.9900  0.9901

n = 24
  x           163.25  163.75  222.75  223.25  280.75  281.25  348.25  348.75
  P(T* < x)   0.8996  0.9007  0.9498  0.9505  0.9750  0.9751  0.9900  0.9901
for small sample sizes. Also, as noted above, the problem of deriving the asymptotic null distribution of $T_n^{(1/2)}$ does not appear to have an easy solution. However, there is some evidence indicating that the critical values corresponding to n = 24 will work fairly well for sample sizes larger than 24. This can be seen from Table 2, which gives the exact level, $\alpha'_m$, resulting from the use of the correct $\alpha$-level critical value (i.e., the critical value with exact level closest to $\alpha$) corresponding to samples of size m, m = 15(1)24, when the sample size is actually 24. Also given is $\alpha'_m - \alpha_{24}$, where $\alpha_{24}$ is obtained by using the correct $\alpha$-level critical value for samples of size 24. As can be seen from Table 2, this difference is fairly small.
TABLE 2
EXACT LEVELS OF TESTS FOR SAMPLES OF SIZE 24 BASED ON
THE $\alpha$-LEVEL CRITICAL VALUES OF $T_m^{(1/2)}$ ($15 \le m \le 24$)

            $\alpha$ = .05                            $\alpha$ = .01
  m    $\alpha'_m$    $\alpha'_m - \alpha_{24}$      $\alpha'_m$    $\alpha'_m - \alpha_{24}$
 15    0.0549      +0.0047        0.0112      +0.0012
 16    0.0457      -0.0045        0.0097      -0.0003
 17    0.0443      -0.0059        0.0080      -0.0020
 18    0.0542      +0.0040        0.0109      +0.0009
 19    0.0472      -0.0030        0.0095      -0.0005
 20    0.0518      +0.0016        0.0082      -0.0018
 21    0.0518      +0.0016        0.0102      +0.0002
 22    0.0472      -0.0030        0.0093      -0.0007
 23    0.0542      +0.0040        0.0084      -0.0016
 24    0.0502       0.0           0.0100       0.0
CHAPTER 3
A TEST OF SYMMETRY BASED ON
THE WATSON STATISTIC
3.1 A Class of Symmetry Statistics
Based on the Watson Statistic
Although $T_n^{(1/2)}$, the statistic described in Section 2.3, is desirable from an invariance point of view, it does not appear that its properties will be easy to determine analytically. As noted in the previous chapter, its asymptotic null distribution is not available at this time. For this reason, an alternative statistic for testing $H_0$ will be developed and studied in this chapter. This statistic will be based on the two-sample statistic proposed by Watson [20], which is defined in Section 1.1.
Watson's two-sample statistic and its goodness-of-fit version [19] possess several desirable properties. Both statistics are particularly useful when dealing with populations having circular distributions. Stephens [18] compared the goodness-of-fit test based on Watson's statistic with four other EDF-based goodness-of-fit tests using a Monte Carlo study. This study showed that the Watson statistic performed best against shifts in variance. In this chapter several nice properties of the modified version of Watson's statistic for testing symmetry will be established.
The procedure used to modify the Cramér-von Mises two-sample statistic can also be used to modify Watson's two-sample statistic. Thus, the expression $[F_n(y) - G_n(y)]$ in Watson's statistic may be replaced by either $[F_n^{(a)}(y) + F_n^{(a)}(-y) - 1]$ or $[F_n^{(a)}(y) + F_n^{(a)}(-y) - 2F_n(0)]$ to produce two classes, one corresponding to each of the above substitutions. However, the two resulting classes are equivalent. To see this, first note that for any constant C,
\[ \int_{-\infty}^{+\infty} C \, dF_n(y) = C. \]
Thus, substituting in Watson's statistic any expression of the form
\[ F_n^{(a)}(y) + F_n^{(a)}(-y) - C, \]
where C is constant with respect to the variable of integration, will result in a statistic independent of the value of C. For the sake of convenience in proving several of the results which follow, members of this class of symmetry statistics will be defined with C = 1.
Definition 3.1.1:
For any $a \in [0,1]$, the statistic $U_n^{(a)}$ is defined by
\[ U_n^{(a)} = n \int_{-\infty}^{+\infty} \Big[ F_n^{(a)}(y) + F_n^{(a)}(-y) - 1 - \int_{-\infty}^{+\infty} \big( F_n^{(a)}(w) + F_n^{(a)}(-w) - 1 \big) \, dF_n(w) \Big]^2 dF_n(y), \]
where $F_n^{(a)}$ is given by Definition 2.1.1.
Note that a reasonable test of symmetry is obtained by rejecting $H_0$ for "large" values of $U_n^{(a)}$.
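As an illustration of Definition 3.1.1 and of the remark that the constant C does not matter, the following sketch evaluates the statistic directly from a sample. It assumes that the version of the EDF in Definition 2.1.1 has the form $F_n^{(a)}(y) = n^{-1}[\#\{X_i < y\} + a\,\#\{X_i = y\}]$; that form, the function names `edf` and `u_stat`, and the sample values are assumptions made here for illustration, not taken from the text. Because $dF_n$ places mass $1/n$ on each observation, the integrals reduce to sums over the sample.

```python
from fractions import Fraction


def edf(sample, y, a):
    """Assumed form of F_n^{(a)}(y): (#{X_i < y} + a * #{X_i = y}) / n."""
    less = sum(1 for x in sample if x < y)
    ties = sum(1 for x in sample if x == y)
    return (less + a * ties) / Fraction(len(sample))


def u_stat(sample, a=Fraction(1, 2), c=1):
    """U_n^{(a)} with the constant C left as a parameter (C = 1 in Definition 3.1.1)."""
    n = len(sample)
    # Integrand evaluated at each observation.
    g = [edf(sample, x, a) + edf(sample, -x, a) - c for x in sample]
    # Inner integral with respect to dF_n is the sample mean of g.
    centre = sum(g) / n
    # n * (1/n) * sum of squared deviations.
    return sum((v - centre) ** 2 for v in g)


sample = [-5, -1, 2, 3, 7]          # hypothetical data
half = Fraction(1, 2)
values = [u_stat(sample, half, c) for c in (0, 1, Fraction(7, 3))]
print(values)                        # the three entries agree
```

Exact rational arithmetic is used so that equality of the values for different C can be checked without rounding error.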
Corresponding to (2.2.3) and (2.2.4), the following theorem gives two expressions for $U_n^{(a)}$ in terms of the functions $N_k(\cdot)$, $P_k(\cdot)$ and $\delta_k(\cdot)$. These will provide a convenient form for computation of $U_n^{(a)}$ and will be useful in establishing several properties of $U_n^{(a)}$.
Theorem 3.1.1:
Let the functions $\delta_k(\cdot)$, $P_k(\cdot)$ and $N_k(\cdot)$ be given by (1.1.5), (1.1.6) and (1.1.7), respectively. Then, with probability one,
n
U(a)(X) 2 (iXi) k + (X)]J2
n [2 =N k(X) Pk + aSk(X) 2
k=l
n
n1([ N X) Pk(X) + a6sk X)2
k=l
n
= n2 Nk ) PkX) + (1a)k(X1)] 2
n
n [ (Nk(X1) Pk(X1) + (1a)6kX1))]2) (3.1.1)
k=l
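Proof:
Clearly, $U_n^{(a)}$ can be expressed as
\[ U_n^{(a)} = \sum_{j=1}^{n} \big[ F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 1 \big]^2 - n^{-1} \Big[ \sum_{j=1}^{n} \big( F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 1 \big) \Big]^2 \]
\[ = \sum_{j=1}^{n} \big[ F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 2F_n(0) \big]^2 - n^{-1} \Big[ \sum_{j=1}^{n} \big( F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 2F_n(0) \big) \Big]^2 . \]
The theorem then follows from the two equalities, (2.2.5) and (2.2.9), established in proving Theorem 2.2.2.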
3.2 Properties of $U_n^{(a)}$
Due to the close relationship between Watson's two-sample statistic and the Cramér-von Mises two-sample statistic, it is not surprising that the properties of $U_n^{(a)}$ are similar to those of $R_n^{(a)}$ and $S_n^{(a)}$ (see Chapter 2). For example, as a consequence of Theorem 3.1.1, $U_n^{(a)}$ depends on X through $\delta(X)$ alone, and, hence, $U_n^{(a)}$ is distribution free (see remarks after Lemma 2.2.1). The next theorem, analogous to Theorem 2.2.1, establishes Invariance Property I for $U_n^{(a)}$.
Theorem 3.2.1:
For any a E [0,1],
U (a)( = U a)(X) w.p. 1. (3.2.1)
n n
Proof:
From Definition 1.1.2, it is easily seen that, with probabil
ity one,
Nk(X) = Pk(X),
and
6k(X) = 6k ()
Hence, (3.2.1) follows directly from (3.1.1), and the proof is complete.
The following theorem is a direct consequence of Theorem 3.1.1 and is stated without proof.
Theorem 3.2.2:
For every $a \in [0,1]$,
\[ U_n^{(a)}(X) = U_n^{(1-a)}(-X) \quad \text{w.p. 1}, \]
and, in particular,
\[ U_n^{(1/2)}(X) = U_n^{(1/2)}(-X) \quad \text{w.p. 1}. \]
Thus, $U_n^{(1/2)}$ possesses both invariance properties I and II, and, therefore, is the logical choice as a "best" statistic from among those in the class $\{U_n^{(a)} : 0 \le a \le 1\}$. The next theorem shows that all statistics in this class are asymptotically equivalent in the sense that the difference between any two statistics converges to zero with probability one.
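The reflection identity of Theorem 3.2.2 can be checked numerically on a small sample. The sketch below again assumes the form $F_n^{(a)}(y) = n^{-1}[\#\{X_i < y\} + a\,\#\{X_i = y\}]$ for the EDF version of Definition 2.1.1; the helper name `u_stat` and the data are hypothetical. Under that assumption, reflecting the sample turns the version with weight a into the version with weight 1 - a, so only a = 1/2 is unchanged.

```python
from fractions import Fraction


def u_stat(sample, a):
    """U_n^{(a)} of Definition 3.1.1 under the assumed form of F_n^{(a)}."""
    n = len(sample)

    def edf(y):
        # (#{X_i < y} + a * #{X_i = y}) / n, in exact arithmetic
        return (sum(x < y for x in sample) + a * sum(x == y for x in sample)) / Fraction(n)

    g = [edf(x) + edf(-x) - 1 for x in sample]
    centre = sum(g) / n
    return sum((v - centre) ** 2 for v in g)


sample = [-7, -4, -1, 2, 3, 5, 8, 9]      # hypothetical data, distinct absolute values
reflected = [-x for x in sample]

for a in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    print(a, u_stat(sample, a), u_stat(reflected, 1 - a))
```

Each printed row shows the same value twice, and the row for a = 1/2 compares the statistic with itself on the reflected sample.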
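Theorem 3.2.3:
Under $H_0$, $U_n^{(a)}$ and $U_n^{(b)}$ have the same asymptotic distribution for any $a, b \in [0,1]$. In fact,
\[ |U_n^{(a)} - U_n^{(b)}| \to 0 \quad \text{w.p. 1 as } n \to \infty. \]
Proof:
For any $a \in [0,1]$, let
\[ \Delta_n^{(a)}(y) = F_n^{(a)}(y) + F_n^{(a)}(-y) - 1 - \int_{-\infty}^{+\infty} \big[ F_n^{(a)}(w) + F_n^{(a)}(-w) - 1 \big] \, dF_n(w). \]
Then, for any $b \in [0,1]$,
\[ U_n^{(a)} = n \int_{-\infty}^{+\infty} \big[ \Delta_n^{(a)}(y) \big]^2 dF_n(y) = n \int_{-\infty}^{+\infty} \big[ \big( \Delta_n^{(a)}(y) - \Delta_n^{(b)}(y) \big) + \Delta_n^{(b)}(y) \big]^2 dF_n(y) \]
\[ = n \int_{-\infty}^{+\infty} \big[ \Delta_n^{(a)}(y) - \Delta_n^{(b)}(y) \big]^2 dF_n(y) + U_n^{(b)} + 2n \int_{-\infty}^{+\infty} \big[ \Delta_n^{(a)}(y) - \Delta_n^{(b)}(y) \big] \big[ \Delta_n^{(b)}(y) \big] \, dF_n(y). \]
Hence,
\[ |U_n^{(a)} - U_n^{(b)}| \le n \int_{-\infty}^{+\infty} \Big[ \sup_y |\Delta_n^{(a)}(y) - \Delta_n^{(b)}(y)| \Big]^2 dF_n(y) + 2n \int_{-\infty}^{+\infty} \sup_y |\Delta_n^{(a)}(y) - \Delta_n^{(b)}(y)| \cdot \sup_y |\Delta_n^{(b)}(y)| \, dF_n(y) \]
\[ = n \Big[ \sup_y |\Delta_n^{(a)}(y) - \Delta_n^{(b)}(y)| \Big]^2 + 2n \sup_y |\Delta_n^{(a)}(y) - \Delta_n^{(b)}(y)| \cdot \sup_y |\Delta_n^{(b)}(y)|. \tag{3.2.2} \]
Now,
\[ \sup_y |\Delta_n^{(a)}(y) - \Delta_n^{(b)}(y)| \le 2 \Big[ \sup_y |F_n^{(a)}(y) - F_n^{(b)}(y)| + \sup_y |F_n^{(a)}(-y) - F_n^{(b)}(-y)| \Big]. \]
Since
\[ \sup_y |F_n^{(a)}(y) - F_n^{(b)}(y)| \le n^{-1}, \]
it follows that
\[ \sup_y |\Delta_n^{(a)}(y) - \Delta_n^{(b)}(y)| \le 4n^{-1}. \]
Therefore, from (3.2.2),
\[ |U_n^{(a)} - U_n^{(b)}| \le n (4n^{-1})^2 + 2n (4n^{-1}) \sup_y |\Delta_n^{(b)}(y)| = o(1) + 8 \sup_y |\Delta_n^{(b)}(y)|. \]
Thus the proof will be complete if it can be shown that $\sup_y |\Delta_n^{(b)}(y)|$ converges to zero with probability one as n tends to infinity. But this follows from Corollary 2.2.1, since
\[ \sup_y |\Delta_n^{(b)}(y)| \le 2 \sup_y |F_n^{(b)}(y) + F_n^{(b)}(-y) - 1|. \]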
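Note that $U_n^{(1/2)}$ may also be regarded as a member of the class
\[ \big\{ \tfrac{1}{2} \big[ U_n^{(a)} + U_n^{(1-a)} \big] : 0 \le a \le 1 \big\}, \]
all of whose members possess both invariance properties. However, since
\[ \big| \tfrac{1}{2} \big[ U_n^{(a)} + U_n^{(1-a)} \big] - U_n^{(b)} \big| \le \tfrac{1}{2} |U_n^{(a)} - U_n^{(b)}| + \tfrac{1}{2} |U_n^{(1-a)} - U_n^{(b)}|, \]
Theorem 3.2.3 implies that, under $H_0$, statistics in this class are asymptotically equivalent to any statistic in the $U_n^{(a)}$ class. For this reason, the class $\{ \tfrac{1}{2} [ U_n^{(a)} + U_n^{(1-a)} ] : 0 \le a \le 1 \}$ will not be given further consideration.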
The final theorem presented in this section establishes the consistency of tests of symmetry based on $U_n^{(a)}$.
Theorem 3.2.4:
For every $a \in [0,1]$, $U_n^{(a)}$ is consistent against any alternative not in $\Phi_s$.
Proof:
Observe that, from (1.1.1) and the continuity of F,
\[ r = \int_{-\infty}^{+\infty} \Big[ F(y) + F(-y) - 1 - \int_{-\infty}^{+\infty} \big( F(w) + F(-w) - 1 \big) \, dF(w) \Big]^2 dF(y) \]
is positive if and only if $F \notin \Phi_s$. If it can be shown that for any $a \in [0,1]$,
\[ \lim_{n \to \infty} n^{-1} U_n^{(a)} = r \quad \text{w.p. 1}, \tag{3.2.3} \]
it will then follow that, with probability one, $U_n^{(a)}$ tends to infinity with n if and only if $F \notin \Phi_s$, which will imply the consistency of tests based on $U_n^{(a)}$. Thus it is only necessary to establish (3.2.3).
Let $H_n(y) = F_n^{(a)}(y) - F(y)$. Then,
\[ n^{-1} U_n^{(a)} = \int_{-\infty}^{+\infty} \big[ H_n(y) + H_n(-y) + \big( F(y) + F(-y) - 1 \big) \big]^2 dF_n(y) - \Big[ \int_{-\infty}^{+\infty} \big( H_n(w) + H_n(-w) + ( F(w) + F(-w) - 1 ) \big) \, dF_n(w) \Big]^2 . \tag{3.2.4} \]
Upon expanding the integrand and rearranging terms, the first integral in (3.2.4) becomes
\[ \int_{-\infty}^{+\infty} \big[ H_n^2(y) + H_n^2(-y) + 2 \big( H_n(y) + H_n(-y) \big) \big( F(y) + F(-y) - 1 \big) + 2 H_n(y) H_n(-y) \big] \, dF_n(y) + \int_{-\infty}^{+\infty} \big[ F(y) + F(-y) - 1 \big]^2 dF_n(y). \tag{3.2.5} \]
The two integrals in (3.2.5) will now be considered individually. The absolute value of the first integral is bounded above by
\[ \Big[ \sup_y |H_n(y)| \Big]^2 + \Big[ \sup_y |H_n(-y)| \Big]^2 + 2 \Big[ \sup_y |H_n(y)| + \sup_y |H_n(-y)| \Big] \times \sup_y |F(y) + F(-y) - 1| + 2 \sup_y |H_n(y)| \cdot \sup_y |H_n(-y)|. \tag{3.2.6} \]
From Lemma 2.1.1 and the fact that
\[ \sup_y |F(y) + F(-y) - 1| \le 1, \]
it follows that (3.2.6), and hence the first integral in (3.2.5), converges to zero with probability one as n tends to infinity. The second integral in (3.2.5) can be expressed as
\[ n^{-1} \sum_{i=1}^{n} \big[ F(X_i) + F(-X_i) - 1 \big]^2 . \tag{3.2.7} \]
Since
\[ \int_{-\infty}^{+\infty} \big[ F(y) + F(-y) - 1 \big]^2 dF(y) \le 1, \]
the strong law of large numbers implies that (3.2.7) converges with probability one to
\[ E\big[ ( F(X) + F(-X) - 1 )^2 \big] = \int_{-\infty}^{+\infty} \big[ F(y) + F(-y) - 1 \big]^2 dF(y). \tag{3.2.8} \]
Thus the expression given by (3.2.5), as n tends to infinity, converges with probability one to the right-hand side of (3.2.8). A similar argument shows that
\[ \int_{-\infty}^{+\infty} \big[ H_n(w) + H_n(-w) + ( F(w) + F(-w) - 1 ) \big] \, dF_n(w) \]
converges with probability one to
\[ \int_{-\infty}^{+\infty} \big[ F(w) + F(-w) - 1 \big] \, dF(w). \]
Hence, the second term on the right-hand side of (3.2.4) converges with probability one to
\[ \Big( \int_{-\infty}^{+\infty} \big[ F(w) + F(-w) - 1 \big] \, dF(w) \Big)^2 . \]
This establishes (3.2.3) and completes the proof of the theorem.
3.3 The Asymptotic Null Distribution of $U_n^{(a)}$
In this section, the common asymptotic null distribution of the statistics in the class $\{U_n^{(a)} : 0 \le a \le 1\}$ will be derived by considering a related random variable, $U_n$, defined as
\[ U_n = n \int_{-\infty}^{+\infty} \Big[ F_n(y) + F_n(-y) - 1 - \int_{-\infty}^{+\infty} \big( F_n(w) + F_n(-w) - 1 \big) \, dF(w) \Big]^2 dF(y). \]
Note that, because of the continuity of F, $U_n$ is independent of the version of the EDF ($F_n$ or $F_n^{(a)}$, $0 \le a \le 1$) appearing in the integrand. Theorem 3.3.1 below shows that $U_n$ and $U_n^{(1/2)}$ have the same asymptotic null distribution. The proof is based on the corresponding proof given by Watson [20] for his two-sample statistic. Before stating this theorem, two lemmas necessary for its proof will be established.
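Lemma 3.3.1:
Let $F_n$ denote the EDF based on a sample of size n from $F \in \Phi_s$, and let
\[ V_n = n^{1/2} \int_{-\infty}^{+\infty} \big[ F_n(y) + F_n(-y) - 1 \big] \, dF(y). \tag{3.3.1} \]
Then, as n tends to infinity, $V_n$ converges in law to a normal random variable with mean zero and variance 1/3.
Proof:
Clearly,
\[ V_n = n^{1/2} \int_{-\infty}^{+\infty} \Big[ n^{-1} \sum_{i=1}^{n} I[X_i \le y] + n^{-1} \sum_{i=1}^{n} I[X_i \le -y] - 1 \Big] dF(y) \]
\[ = n^{1/2} \Big[ n^{-1} \sum_{i=1}^{n} \Big( \int_{X_i}^{+\infty} dF(y) + \int_{-\infty}^{-X_i} dF(y) \Big) - 1 \Big] \]
\[ = n^{1/2} \Big[ n^{-1} \sum_{i=1}^{n} \Big( \int_{F(X_i)}^{1} dt + \int_{0}^{F(-X_i)} dt \Big) - 1 \Big] \]
\[ = n^{1/2} \Big[ n^{-1} \sum_{i=1}^{n} \big( 1 - F(X_i) + F(-X_i) \big) - 1 \Big]. \]
Since $F(-X_i) = 1 - F(X_i)$ if $F \in \Phi_s$, the last expression above equals
\[ n^{1/2} \Big[ 1 - 2n^{-1} \sum_{i=1}^{n} F(X_i) \Big]. \tag{3.3.2} \]
Now, since $\{F(X_1), F(X_2), \ldots, F(X_n)\}$ is distributed as a random sample from a uniform distribution on [0,1], so that each $F(X_i)$ has mean 1/2 and variance 1/12, the lemma follows from (3.3.2) and the Central Limit Theorem. This proof is therefore complete.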
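The second lemma needed for the proof of Theorem 3.3.1 is now established.
Lemma 3.3.2:
Let
\[ C_n = V_n - n^{1/2} \int_{-\infty}^{+\infty} \big[ F_n^{(1/2)}(y) + F_n^{(1/2)}(-y) - 1 \big] \, dF_n(y), \tag{3.3.3} \]
where $V_n$ is given by (3.3.1). Then, for every $F \in \Phi_s$,
\[ E(C_n) = 0, \tag{3.3.4} \]
and
\[ \mathrm{Var}(C_n) \to 0 \quad \text{as } n \to \infty. \tag{3.3.5} \]
Proof:
Note that the second term on the right-hand side of (3.3.3) is equal to $V_n$ with $dF(y)$ replaced by $dF_n(y)$. Using (3.3.2), with probability one,
\[ C_n = n^{1/2} \Big[ 1 - 2n^{-1} \sum_{i=1}^{n} F(X_i) \Big] - n^{1/2} \Big[ n^{-1} \sum_{i=1}^{n} \Big( F_n^{(1/2)}(X_i) + n^{-1} \sum_{j=1}^{n} I[X_j \le -X_i] \Big) - 1 \Big] \]
\[ = n^{1/2} \Big[ 2 - 2n^{-1} \sum_{i=1}^{n} F(X_i) - n^{-1} \sum_{i=1}^{n} F_n^{(1/2)}(X_i) - n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n} I[X_i + X_j \le 0] \Big]. \]
Since, with probability one,
\[ n^{-1} \sum_{i=1}^{n} F_n^{(1/2)}(X_i) = n^{-2} \sum_{i=1}^{n} \big( i - \tfrac{1}{2} \big) = \tfrac{1}{2}, \]
it follows that
\[ C_n = n^{1/2} \Big[ \tfrac{3}{2} - 2n^{-1} \sum_{i=1}^{n} F(X_i) - n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n} I[X_i + X_j \le 0] \Big]. \tag{3.3.6} \]
Therefore,
\[ E(C_n) = n^{1/2} \Big[ \tfrac{3}{2} - 2n^{-1} \sum_{i=1}^{n} E(F(X_i)) - n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n} E(I[X_i + X_j \le 0]) \Big]. \tag{3.3.7} \]
Since $F(X_i)$ has a uniform distribution on the unit interval,
\[ E(F(X_i)) = \tfrac{1}{2}, \quad i = 1, 2, \ldots, n. \tag{3.3.8} \]
Using (2.2.10), it can be easily shown that if $F \in \Phi_s$, $(X_i + X_j)$ is distributed symmetrically about zero. Thus, for $i, j = 1, 2, \ldots, n$,
\[ E(I[X_i + X_j \le 0]) = P(X_i + X_j \le 0) = \tfrac{1}{2}. \tag{3.3.9} \]
Substituting (3.3.8) and (3.3.9) in (3.3.7) yields
\[ E(C_n) = n^{1/2} \Big[ \tfrac{3}{2} - 2n^{-1} \big( \tfrac{n}{2} \big) - n^{-2} \big( \tfrac{n^2}{2} \big) \Big] = 0, \]
thus establishing (3.3.4).
Now, consider the variance of $C_n$. Squaring the expression given by (3.3.6) gives
\[ C_n^2 = n \Big[ \tfrac{9}{4} + 4n^{-2} \Big( \sum_{i=1}^{n} F(X_i) \Big)^2 + n^{-4} \Big( \sum_{i=1}^{n} \sum_{j=1}^{n} I[X_i + X_j \le 0] \Big)^2 - 6n^{-1} \sum_{i=1}^{n} F(X_i) - 3n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n} I[X_i + X_j \le 0] + 4n^{-3} \sum_{i=1}^{n} F(X_i) \sum_{j=1}^{n} \sum_{k=1}^{n} I[X_j + X_k \le 0] \Big]. \]
Taking expectations on both sides and using (3.3.8) and (3.3.9) results in the following expression for the variance of $C_n$:
\[ \mathrm{Var}(C_n) = -\tfrac{9}{4} n + 4n \, E\Big[ \Big( n^{-1} \sum_{i=1}^{n} F(X_i) \Big)^2 \Big] + n^{-3} E\Big[ \Big( \sum_{i=1}^{n} \sum_{j=1}^{n} I[X_i + X_j \le 0] \Big)^2 \Big] + 4n^{-2} E\Big[ \sum_{i=1}^{n} F(X_i) \sum_{j=1}^{n} \sum_{k=1}^{n} I[X_j + X_k \le 0] \Big]. \tag{3.3.10} \]
The three expectations in (3.3.10) are now considered individually.
As noted earlier, $\{F(X_1), F(X_2), \ldots, F(X_n)\}$ is distributed as a random sample from a uniform distribution on the unit interval. Therefore, the first expectation in (3.3.10) is given by
\[ E\Big[ \Big( n^{-1} \sum_{i=1}^{n} F(X_i) \Big)^2 \Big] = \mathrm{Var}\Big( n^{-1} \sum_{i=1}^{n} F(X_i) \Big) + \Big( E\Big( n^{-1} \sum_{i=1}^{n} F(X_i) \Big) \Big)^2 = \tfrac{1}{12} n^{-1} + \tfrac{1}{4}. \tag{3.3.11} \]
To evaluate the second expectation in (3.3.10), let $\alpha_{ij}$ denote $I[X_i + X_j \le 0]$. Then
\[ \Big( \sum_{i=1}^{n} \sum_{j=1}^{n} I[X_i + X_j \le 0] \Big)^2 = \sum_{i} \alpha_{ii}^2 + \sum_{i \ne j} \big( \alpha_{ij}^2 + \alpha_{ij} \alpha_{ji} \big) + \sum_{i \ne j} \alpha_{ii} \alpha_{jj} + 4 \sum_{i \ne j} \alpha_{ii} \alpha_{ij} \]
\[ + \sum_{i \ne j \ne k} \big( \alpha_{ii} \alpha_{jk} + \alpha_{jk} \alpha_{ii} \big) + 4 \sum_{i \ne j \ne k} \alpha_{ij} \alpha_{ik} + \sum_{i \ne j \ne k \ne \ell} \alpha_{ij} \alpha_{k\ell}, \]
where each sum extends over distinct indices. Since $\alpha_{ij} = \alpha_{ji}$ for every pair of integers, i and j, and since the terms within a particular sum all have identical expectations, the expectation of the right-hand side of the above equality becomes, for $n \ge 4$,
\[ nE(\alpha_{ii}^2) + 2n(n-1)E(\alpha_{ij}^2) + n(n-1)E(\alpha_{ii} \alpha_{jj}) + 4n(n-1)E(\alpha_{ii} \alpha_{ij}) + 2n(n-1)(n-2)E(\alpha_{ii} \alpha_{jk}) \]
\[ + 4n(n-1)(n-2)E(\alpha_{ij} \alpha_{ik}) + n(n-1)(n-2)(n-3)E(\alpha_{ij} \alpha_{k\ell}), \tag{3.3.12} \]
where i, j, k and $\ell$ denote four distinct positive integers less than or equal to n. Now, the expectations in (3.3.12) may be evaluated as follows. First, since $\alpha_{ij}^2 = \alpha_{ij}$ and $\alpha_{ii}^2 = \alpha_{ii}$, (3.3.9) implies that
\[ E(\alpha_{ij}^2) = E(\alpha_{ij}) = E(\alpha_{ii}^2) = E(\alpha_{ii}) = \tfrac{1}{2}. \tag{3.3.13} \]
Next, since $\alpha_{ij}$ is independent of $\alpha_{k\ell}$, and $\alpha_{ii}$ is independent of $\alpha_{jj}$ and $\alpha_{jk}$, it follows that
\[ E(\alpha_{ii} \alpha_{jj}) = E(\alpha_{ii} \alpha_{jk}) = E(\alpha_{ij} \alpha_{k\ell}) = \tfrac{1}{4}. \tag{3.3.14} \]
Finally,
\[ E(\alpha_{ij} \alpha_{ik}) = P(X_i + X_j \le 0 \text{ and } X_i + X_k \le 0) = \iiint_{\{y_i + y_j \le 0\} \cap \{y_i + y_k \le 0\}} dF(y_k) \, dF(y_j) \, dF(y_i) = \int_0^1 \int_0^{1-s} \int_0^{1-s} dt \, dw \, ds = \tfrac{1}{3}, \tag{3.3.15} \]
and, similarly,
\[ E(\alpha_{ii} \alpha_{ij}) = \tfrac{3}{8}. \tag{3.3.16} \]
Substituting (3.3.13), (3.3.14), (3.3.15) and (3.3.16) in (3.3.12),
\[ E\Big[ \Big( \sum_{i=1}^{n} \sum_{j=1}^{n} I[X_i + X_j \le 0] \Big)^2 \Big] = n(\tfrac{1}{2}) + 2n(n-1)(\tfrac{1}{2}) + n(n-1)(\tfrac{1}{4}) + 4n(n-1)(\tfrac{3}{8}) + 2n(n-1)(n-2)(\tfrac{1}{4}) \]
\[ + 4n(n-1)(n-2)(\tfrac{1}{3}) + n(n-1)(n-2)(n-3)(\tfrac{1}{4}) = (\tfrac{1}{4}) n^4 + (\tfrac{1}{3}) n^3 + o(n^3). \tag{3.3.17} \]
In a similar manner, the third expectation in (3.3.10) can be shown to be equal to
\[ E\Big[ \sum_{i=1}^{n} F(X_i) \sum_{j=1}^{n} \sum_{k=1}^{n} I[X_j + X_k \le 0] \Big] = (\tfrac{1}{4}) n^3 - (\tfrac{1}{6}) n^2 + o(n^2). \tag{3.3.18} \]
Substituting (3.3.11), (3.3.17) and (3.3.18) in (3.3.10) yields
\[ \mathrm{Var}(C_n) = -(\tfrac{9}{4}) n + 4n \big[ (\tfrac{1}{12}) n^{-1} + \tfrac{1}{4} \big] + n^{-3} \big[ (\tfrac{1}{4}) n^4 + (\tfrac{1}{3}) n^3 + o(n^3) \big] + 4n^{-2} \big[ (\tfrac{1}{4}) n^3 - (\tfrac{1}{6}) n^2 + o(n^2) \big] \]
\[ = n^{-3} o(n^3) + n^{-2} o(n^2), \]
which establishes (3.3.5) and completes the proof of the lemma.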
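The cancellation in the last display can be checked with exact arithmetic. The sketch below evaluates the pattern counts of (3.3.12) with the expectations (3.3.13) to (3.3.16), and combines them as in (3.3.10). The lower-order terms of the third expectation are not given in the text; the index-pattern counts and the values 1/6 and 1/8 used in `cross_moment` were worked out here for illustration under the null hypothesis and should be read as assumptions, as should the function names.

```python
from fractions import Fraction as Fr


def second_moment(n):
    """E[(sum_i sum_j alpha_ij)^2] from the counts in (3.3.12)."""
    return (n * Fr(1, 2)
            + 2 * n * (n - 1) * Fr(1, 2)
            + n * (n - 1) * Fr(1, 4)
            + 4 * n * (n - 1) * Fr(3, 8)
            + 2 * n * (n - 1) * (n - 2) * Fr(1, 4)
            + 4 * n * (n - 1) * (n - 2) * Fr(1, 3)
            + n * (n - 1) * (n - 2) * (n - 3) * Fr(1, 4))


def cross_moment(n):
    """E[sum_i F(X_i) * sum_j sum_k alpha_jk], by index pattern (illustrative)."""
    return (n * (n - 1) * (n - 2) * Fr(1, 4)   # i, j, k all distinct
            + n * (n - 1) * Fr(1, 4)           # j = k, i different
            + 2 * n * (n - 1) * Fr(1, 6)       # i equal to exactly one of j, k
            + n * Fr(1, 8))                    # i = j = k


def var_c(n):
    """Right-hand side of (3.3.10) in exact rational arithmetic."""
    first = Fr(1, 12 * n) + Fr(1, 4)           # (3.3.11)
    return (-Fr(9, 4) * n + 4 * n * first
            + second_moment(n) / n ** 3
            + 4 * cross_moment(n) / n ** 2)


for n in (1, 2, 5, 10, 100, 1000):
    print(n, var_c(n), float(var_c(n)))
```

Under these assumptions the terms of order n and of order one cancel exactly, leaving $(2n-1)/(12n^2)$, which tends to zero as (3.3.5) asserts.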
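Theorem 3.3.1 may now be given.
Theorem 3.3.1:
For any $F \in \Phi_s$, $(U_n - U_n^{(1/2)})$ converges to zero in probability as n tends to infinity.
Proof:
Expanding the integrands and rearranging terms yields
\[ U_n - U_n^{(1/2)} = n \Big\{ \int_{-\infty}^{+\infty} \big[ F_n(y) + F_n(-y) - 1 \big]^2 dF(y) - \int_{-\infty}^{+\infty} \big[ F_n^{(1/2)}(y) + F_n^{(1/2)}(-y) - 1 \big]^2 dF_n(y) \Big\} \]
\[ - n \Big\{ \Big( \int_{-\infty}^{+\infty} \big[ F_n(y) + F_n(-y) - 1 \big] \, dF(y) \Big)^2 - \Big( \int_{-\infty}^{+\infty} \big[ F_n^{(1/2)}(y) + F_n^{(1/2)}(-y) - 1 \big] \, dF_n(y) \Big)^2 \Big\}. \tag{3.3.19} \]
The fact that the first term on the right-hand side of (3.3.19) converges to zero in probability has been established by Rothman and Woodroofe [15]. Using (3.3.1) and (3.3.3), the expression multiplied by n in the second pair of braces can be written as
\[ V_n^2 - (V_n - C_n)^2 = C_n (2V_n - C_n). \]
Hence, by Lemmas 3.3.1 and 3.3.2, the second term on the right-hand side of (3.3.19) converges to zero in probability as n tends to infinity, and the proof is complete.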
The asymptotic null distribution of $U_n$ may now be derived. It will be necessary at this time to consider some preliminary results. Proofs of Theorems 3.3.2 and 3.3.3, and Lemma 3.3.3 may be found in the references indicated, and these results will be stated without proof. Proofs of the remainder of the preliminary results, Lemmas 3.3.4, 3.3.5, 3.3.6 and 3.3.7, will be given.
Theorem 3.3.2 (Prabhu [13], p. 28):
Let y(t) be a stochastic process on [0,1], with
\[ E(|y(t)|^2) < +\infty \]
for every t in [0,1], and let K(s,t) denote the covariance kernel of the process, y(t). Then y(t) is Riemann integrable if and only if the integral
\[ \int_0^1 \int_0^1 K(s,t) \, ds \, dt \]
exists. If the above integral does exist, then
\[ E\Big( \int_0^1 y(t) \, dt \Big) = \int_0^1 E(y(t)) \, dt, \]
and
\[ E\Big( \int_0^1 \int_0^1 y(s) \, y(t) \, ds \, dt \Big) = \int_0^1 \int_0^1 K(s,t) \, ds \, dt. \]
The following definitions will be needed in several of the following results. Let D denote the metric space of functions on [0,1] that are right-continuous and have left-hand limits, and let $\Lambda$ denote the class of strictly increasing, continuous mappings of [0,1] onto itself. The Skorokhod metric, d, may be defined as follows ([2], p. 111): for $x, y \in D$,
\[ d(x,y) = \inf \Big\{ \varepsilon > 0 : \exists \, \lambda \in \Lambda \text{ such that } \sup_t |t - \lambda(t)| \le \varepsilon \text{ and } \sup_t |x(t) - y(\lambda(t))| \le \varepsilon \Big\}. \]
Theorem 3.3.3 may now be stated.
Theorem 3.3.3 (Billingsley [2], p. 30):
If the sequence of stochastic processes $\{y_n(t); n = 1, 2, \ldots\}$ in D converges in law to the process y(t), and if $g(\cdot)$ is any measurable functional on D which is continuous in metric d almost everywhere (with respect to the probability measure associated with y(t)), then as n tends to infinity, $g(y_n(t))$ converges in law to $g(y(t))$.
Lemma 3.3.3 (Durbin [7], p. 31):
The functional on D defined by
\[ z(x(\cdot)) = \int_0^1 x(t) \, dt \]
is continuous in the Skorokhod metric, d.
Lemma 3.3.4:
Let Z(t) be a Riemann integrable Gaussian process on [0,1]. Then $\int_0^1 Z(t) \, dt$ is a normal random variable.
Proof:
Let
\[ m(t) = E(Z(t)), \]
\[ K(s,t) = \mathrm{Cov}(Z(s), Z(t)), \]
and
\[ W_n = \sum_{j=1}^{n} [t_j(n) - t_{j-1}(n)] \, Z(t_j(n)), \]
where $\{0 = t_0(n) < t_1(n) < \cdots < t_n(n) = 1; \; n = 1, 2, \ldots\}$ is a sequence of partitions of [0,1] such that $\max_j |t_j(n) - t_{j-1}(n)|$ converges to zero as n tends to infinity. Clearly, for each n, $W_n$ is a normal random variable with
\[ E(W_n) = \sum_{j=1}^{n} [t_j(n) - t_{j-1}(n)] \, m(t_j(n)), \]
and
\[ \mathrm{Var}(W_n) = \sum_{k=1}^{n} \sum_{j=1}^{n} [t_k(n) - t_{k-1}(n)] [t_j(n) - t_{j-1}(n)] \, K(t_k(n), t_j(n)). \]
By the definition of the Riemann integral, as n tends to infinity,
\[ E(W_n) \to M = \int_0^1 m(t) \, dt, \]
and
\[ \mathrm{Var}(W_n) \to \sigma^2 = \int_0^1 \int_0^1 K(s,t) \, ds \, dt. \]
This implies that the characteristic function of $W_n$ converges to the characteristic function of a normal random variable with mean, M, and variance, $\sigma^2$, and hence, that $W_n$ converges in law to such a normal random variable. Since $\int_0^1 Z(t) \, dt$ is the limit (in mean square) of $W_n$ as n tends to infinity, the proof is complete.
Lemma 3.3.5:
Let $G_n$ denote the EDF corresponding to a random sample of size n from a uniform distribution on [0,1]. If the processes $Y_n^*(t)$ and $Q_n^*(t)$ are defined by
\[ Y_n^*(t) = n^{1/2} \Big[ G_n\Big( \frac{1+t}{2} \Big) - \frac{1+t}{2} \Big], \quad -1 \le t \le 1, \]
and
\[ Q_n^*(t) = Y_n^*(t) + Y_n^*(-t) - \int_0^1 \big[ Y_n^*(w) + Y_n^*(-w) \big] \, dw, \quad 0 \le t \le 1, \]
then $Q_n^*(t)$ converges in law to a Gaussian process on [0,1].
Proof:
It is a well-known result (see [2], p. 141) that the process $Y_n(t)$ defined by
\[ Y_n(t) = n^{1/2} (G_n(t) - t) \]
converges in law to a Brownian bridge process, Y(t), on [0,1]. Clearly, $Y_n^*$ is the process obtained from $Y_n$ by rescaling to the interval [-1,1]. Hence $Y_n^*$ converges in law to $Y^*$, where $Y^*$ is the process on [-1,1] defined by
\[ Y^*(t) = Y\Big( \frac{1+t}{2} \Big). \]
Since Y is a Brownian bridge process, $Y^*$ is a Gaussian process on [-1,1] with
\[ E(Y^*(t)) = E\Big( Y\Big( \frac{1+t}{2} \Big) \Big) = 0, \quad -1 \le t \le 1, \tag{3.3.20} \]
and
\[ K^*(s,t) = E(Y^*(s) Y^*(t)) = E\Big( Y\Big( \frac{1+s}{2} \Big) Y\Big( \frac{1+t}{2} \Big) \Big) = \Big( \frac{1+s}{2} \Big) \Big( 1 - \frac{1+t}{2} \Big) = \tfrac{1}{4} (1+s)(1-t), \quad -1 \le s \le t \le 1. \tag{3.3.21} \]
Now, by Lemma 3.3.3 and Theorem 3.3.3, $Q_n^*(t)$ converges in law to Q(t), where
\[ Q(t) = Y^*(t) + Y^*(-t) - \int_0^1 \big[ Y^*(w) + Y^*(-w) \big] \, dw, \quad 0 \le t \le 1. \tag{3.3.22} \]
In view of Lemma 3.3.4, it follows that Q(t) is a Gaussian process, which proves the lemma.
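Lemma 3.3.6:
Let Q(t) denote the Gaussian process given by (3.3.22). Then
\[ \mathrm{Cov}(Q(s), Q(t)) = -t + \tfrac{1}{2} (s^2 + t^2) + \tfrac{1}{3}, \quad 0 \le s \le t \le 1. \tag{3.3.23} \]
Proof:
By Theorem 3.3.2 and (3.3.20),
\[ E(Q(t)) = E(Y^*(t)) + E(Y^*(-t)) - \int_0^1 \big[ E(Y^*(w)) + E(Y^*(-w)) \big] \, dw = 0 \]
for $t \in [0,1]$. Hence, for $0 \le s \le t \le 1$,
\[ \mathrm{Cov}(Q(s), Q(t)) = E \Big\{ \Big[ Y^*(s) + Y^*(-s) - \int_0^1 [ Y^*(w) + Y^*(-w) ] \, dw \Big] \times \Big[ Y^*(t) + Y^*(-t) - \int_0^1 [ Y^*(w) + Y^*(-w) ] \, dw \Big] \Big\} \]
\[ = E(Y^*(s) Y^*(t)) + E(Y^*(-s) Y^*(t)) + E(Y^*(s) Y^*(-t)) + E(Y^*(-s) Y^*(-t)) \]
\[ - E \Big\{ \big[ Y^*(s) + Y^*(-s) + Y^*(t) + Y^*(-t) \big] \int_0^1 [ Y^*(w) + Y^*(-w) ] \, dw \Big\} + E \Big\{ \Big( \int_0^1 [ Y^*(w) + Y^*(-w) ] \, dw \Big)^2 \Big\}. \]
Letting
\[ J(t) = E \Big\{ Y^*(t) \int_0^1 [ Y^*(w) + Y^*(-w) ] \, dw \Big\} \]
and using Theorem 3.3.2, the last expression becomes
\[ K^*(s,t) + K^*(-s,t) + K^*(s,-t) + K^*(-s,-t) - J(s) - J(-s) - J(t) - J(-t) + \int_0^1 [ J(w) + J(-w) ] \, dw. \tag{3.3.24} \]
From (3.3.21),
\[ J(t) = \int_0^1 \big[ K^*(t,w) + K^*(t,-w) \big] \, dw = \tfrac{1}{4} \Big[ \int_0^t (1+w)(1-t) \, dw + \int_t^1 (1+t)(1-w) \, dw + \int_0^1 (1-w)(1-t) \, dw \Big] = \tfrac{1}{4} (1 - t^2), \tag{3.3.25} \]
for $t \in [0,1]$. Similarly,
\[ J(-t) = \tfrac{1}{4} (1 - t^2). \tag{3.3.26} \]
Thus,
\[ \int_0^1 [ J(w) + J(-w) ] \, dw = \tfrac{1}{2} \int_0^1 (1 - w^2) \, dw = \tfrac{1}{3}. \tag{3.3.27} \]
Finally, since the four covariance terms in (3.3.24) sum to $1 - t$ for $0 \le s \le t \le 1$, substituting (3.3.21), (3.3.25), (3.3.26) and (3.3.27) in (3.3.24) gives
\[ \mathrm{Cov}(Q(s), Q(t)) = -t + \tfrac{1}{2} (s^2 + t^2) + \tfrac{1}{3}, \quad 0 \le s \le t \le 1, \]
as was to be shown.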
Lemma 3.3.7:
The functional $z : D \to D$ defined by
\[ z(x(t)) = x^2(t) \]
is continuous in the Skorokhod metric, d.
Proof:
Let $x(\cdot) \in D$ be fixed. It must then be shown that, given $\varepsilon > 0$, there exists an $\varepsilon' > 0$ such that for any $y \in D$, $d(x,y) < \varepsilon'$ implies that $d(x^2, y^2) \le \varepsilon$. Let $\varepsilon > 0$ be fixed, let
B = max 0^{
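and take $\varepsilon' = \varepsilon (2B + \varepsilon)^{-1}$. If $d(x,y) < \varepsilon'$, then, by the definition of d, given just before Theorem 3.3.3, there exists a $\lambda_0 \in \Lambda$ such that
\[ \sup_{0 \le t \le 1} |t - \lambda_0(t)| < \varepsilon (2B + \varepsilon)^{-1} < \varepsilon \]
and
\[ \sup_{0 \le t \le 1} |x(t) - y(\lambda_0(t))| \le \varepsilon (2B + \varepsilon)^{-1}. \]
Since
\[ \sup_{0 \le t \le 1} |x(t) + y(\lambda_0(t))| \le \sup_{0 \le t \le 1} |x(t)| + \sup_{0 \le t \le 1} |y(\lambda_0(t))| \le 2B + \varepsilon, \]
it follows that
\[ \sup_{0 \le t \le 1} |x^2(t) - y^2(\lambda_0(t))| \le \varepsilon. \]
In view of the definition of d, this implies that $d(x^2, y^2) \le \varepsilon$, which completes the proof of the lemma.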
Theorem 3.3.4, giving the asymptotic null distribution of $U_n$, may now be stated and proved.
Theorem 3.3.4:
Let U be defined by
\[ U = \int_0^1 Q^2(t) \, dt, \]
where Q(t) is a Gaussian process on [0,1] with
\[ E(Q(t)) = 0 \]
and
\[ \mathrm{Cov}(Q(s), Q(t)) = K(s,t) = -t + \tfrac{1}{2} (s^2 + t^2) + \tfrac{1}{3}, \quad 0 \le s \le t \le 1. \]
Then, for every $F \in \Phi_s$, $U_n$ converges in law to U as n tends to infinity.
Proof:
Since $U_n$ is distribution free, without loss of generality F may be assumed to be the CDF of a uniform random variable on the interval [-1,1], that is,
\[ F(t) = \begin{cases} 0 & \text{if } t < -1, \\ \tfrac{1}{2} (t + 1) & \text{if } -1 \le t \le 1, \\ 1 & \text{if } t > 1. \end{cases} \]
Now, let
\[ Q_n(t) = n^{1/2} \Big[ F_n(t) + F_n(-t) - 1 - \int_0^1 \big( F_n(w) + F_n(-w) - 1 \big) \, dw \Big], \quad -1 \le t \le 1. \]
Then,
\[ Q_n(t) = Q_n(-t), \]
and
\[ U_n = n \int_{-1}^{1} \Big[ F_n(t) + F_n(-t) - 1 - \int_{-1}^{1} \big( F_n(w) + F_n(-w) - 1 \big) \tfrac{1}{2} \, dw \Big]^2 \tfrac{1}{2} \, dt \]
\[ = n \int_0^1 \Big[ F_n(t) + F_n(-t) - 1 - \int_0^1 \big( F_n(w) + F_n(-w) - 1 \big) \, dw \Big]^2 dt = \int_0^1 Q_n^2(t) \, dt. \tag{3.3.28} \]
Since $(X+1)/2$ has a uniform distribution on [0,1] if X has a uniform distribution on [-1,1], it follows from Lemmas 3.3.5 and 3.3.6 that $Q_n(t)$ converges in law to a Gaussian process, Q(t), on [0,1] with mean zero and covariance kernel given by (3.3.23). Hence, by Theorem 3.3.3 and Lemmas 3.3.3 and 3.3.7, as n tends to infinity,
\[ \int_0^1 Q_n^2(t) \, dt \to \int_0^1 Q^2(t) \, dt \quad \text{in law}, \]
and, in view of (3.3.28), the proof is complete.
It has thus been shown that $U_n$, and hence $U_n^{(a)}$ for any $a \in [0,1]$, converge in law to $U = \int_0^1 Q^2(t) \, dt$, where Q is a Gaussian process. The problem now is to obtain the distribution of the random variable U. In what follows, it will be shown that the distribution of U is the same as the asymptotic null distribution of the well-known Cramér-von Mises goodness-of-fit statistic [7]. This result will follow from the fact that U can be expressed as a certain infinite sum involving independent $\chi^2$ random variables. The following two theorems will be needed in establishing this representation for U.
Theorem 3.3.5 (Rosenblatt [14], pp. 185-95):
Let y(t) be a Gaussian process on [0,1] with continuous covariance kernel, K(s,t). Let $\lambda_j$ and $f_j(\cdot)$, j = 1, 2, ..., be the eigenvalues and corresponding eigenfunctions of the integral equation
\[ \lambda f(t) = \int_0^1 K(s,t) f(s) \, ds, \tag{3.3.29} \]
satisfying the normalizing condition
\[ \int_0^1 f_j(s) f_{j'}(s) \, ds = \delta_{jj'}, \tag{3.3.30} \]
where $\delta_{jj'}$ denotes the Kronecker delta. Then the process
\[ y_n(t) = \sum_{j=1}^{n} \lambda_j^{1/2} Z_j f_j(t), \]
where $\{Z_j; \; j = 1, 2, \ldots, n\}$ is a set of independent standard normal random variables, converges in mean square to the process y(t).
Theorem 3.3.6 (Kac and Siegert [9]):
Under the conditions of Theorem 3.3.5,
\[ B = \int_0^1 y^2(t) \, dt = \sum_{j=1}^{\infty} \lambda_j Z_j^2 \quad \text{w.p. 1}. \]
The characteristic function of B, $\phi_B(\xi)$, is given by
\[ \phi_B(\xi) = \prod_{j=1}^{\infty} (1 - 2i \xi \lambda_j)^{-1/2}. \]
An immediate consequence of Theorems 3.3.5 and 3.3.6 is that, with probability one,
\[ U = \sum_{j=1}^{\infty} \lambda_j Z_j^2, \tag{3.3.31} \]
where $\{\lambda_j; \; j = 1, 2, \ldots\}$ is the set of solutions to the integral equation
\[ \lambda f(t) = \int_0^1 \big[ -\max\{s,t\} + \tfrac{1}{2} (s^2 + t^2) + \tfrac{1}{3} \big] f(s) \, ds \tag{3.3.32} \]
satisfying (3.3.30), and $\{Z_j^2; \; j = 1, 2, \ldots\}$ is a set of independent $\chi_1^2$ random variables. Furthermore, the characteristic function of U is given by
\[ \phi_U(\xi) = \prod_{j=1}^{\infty} (1 - 2i \xi \lambda_j)^{-1/2}. \tag{3.3.33} \]
Thus the distribution of U can be obtained from (3.3.33) if the solutions of (3.3.32) can be found. In this connection, note that since
\[ \int_0^1 \big[ -\max\{s,t\} + \tfrac{1}{2} (s^2 + t^2) + \tfrac{1}{3} \big] \, ds = 0, \]
$\lambda = 0$ and $f(s) \equiv 1$ give a solution to (3.3.32) satisfying (3.3.30). Any other eigenfunction, f, must therefore satisfy
\[ \int_0^1 f(s) \, ds = 0. \tag{3.3.34} \]
Let f denote an eigenfunction corresponding to any nonzero eigenvalue, $\lambda$. Then
\[ \lambda f(t) = \int_0^1 \big[ -\max\{s,t\} + \tfrac{1}{2} s^2 \big] f(s) \, ds + \big( \tfrac{1}{2} t^2 + \tfrac{1}{3} \big) \int_0^1 f(s) \, ds \]
\[ = \int_0^t (-t) f(s) \, ds + \int_t^1 (-s) f(s) \, ds + \tfrac{1}{2} \int_0^1 s^2 f(s) \, ds. \]
Differentiating both sides with respect to t results in the equation,
\[ \lambda f'(t) = (-t) f(t) - \int_0^t f(s) \, ds + t f(t) = - \int_0^t f(s) \, ds. \]
Putting t = 0 and t = 1 on the right-hand side above and using (3.3.34), it follows that
\[ f'(0) = f'(1) = 0. \tag{3.3.35} \]
Taking the derivative a second time shows that
\[ \lambda f''(t) = -f(t). \]
That is,
\[ \lambda f''(t) + f(t) = 0. \]
Now, the general solution of the differential equation above is given by (see, for example, [6], p. 506):
\[ f(t) = \alpha \sin(\lambda^{-1/2} t) + \beta \cos(\lambda^{-1/2} t), \tag{3.3.36} \]
where $\alpha$ and $\beta$ are constants to be determined using the boundary conditions, (3.3.30) and (3.3.35). The derivative of the general solution is
\[ f'(t) = \lambda^{-1/2} \alpha \cos(\lambda^{-1/2} t) - \lambda^{-1/2} \beta \sin(\lambda^{-1/2} t). \]
Hence, by (3.3.35),
\[ \lambda^{-1/2} \alpha \cos(0) - \lambda^{-1/2} \beta \sin(0) = 0, \]
from which it follows that $\alpha = 0$. Again from (3.3.35), with $\alpha = 0$,
\[ \lambda^{-1/2} \beta \sin(\lambda^{-1/2}) = 0, \]
from which it follows that
\[ \lambda^{-1/2} = j\pi, \quad j = 1, 2, \ldots. \]
Thus, the eigenvalues of (3.3.32) are given by
\[ \lambda_j = (j\pi)^{-2}, \quad j = 1, 2, \ldots. \]
Finally, from (3.3.31),
\[ U = \sum_{j=1}^{\infty} (j\pi)^{-2} Z_j^2 \quad \text{w.p. 1}, \]
so that from (3.3.33),
\[ \phi_U(\xi) = \prod_{j=1}^{\infty} \big( 1 - 2i \xi (j\pi)^{-2} \big)^{-1/2}. \]
Durbin [7] has shown that under the null hypothesis the Cramér-von Mises goodness-of-fit statistic converges in law to a random variable with characteristic function $\phi_U(\xi)$. Thus the asymptotic null distributions of the Cramér-von Mises goodness-of-fit statistic and $U_n^{(a)}$ are identical. Percentage points for this distribution have been tabulated by Anderson and Darling [1].
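The eigenvalue calculation can be checked numerically. The sketch below approximates the integral operator with kernel $K(s,t) = \tfrac{1}{3} - \max\{s,t\} + \tfrac{1}{2}(s^2+t^2)$ by a midpoint rule and compares its action on $f_j(t) = \sqrt{2}\cos(j\pi t)$ with $(j\pi)^{-2} f_j(t)$. The function names, the quadrature rule and the grid size are choices made here for illustration; the normalizing factor $\sqrt{2}$ is the one implied by (3.3.30).

```python
import math


def kernel(s, t):
    """Covariance kernel of Q: 1/3 - max(s, t) + (s^2 + t^2) / 2 on the unit square."""
    return 1.0 / 3.0 - max(s, t) + 0.5 * (s * s + t * t)


def apply_kernel(f, t, m=4000):
    """Midpoint-rule approximation of the integral of K(s, t) f(s) over 0 <= s <= 1."""
    h = 1.0 / m
    return h * sum(kernel((i + 0.5) * h, t) * f((i + 0.5) * h) for i in range(m))


def eigen_residual(j, t):
    """|integral of K(s, t) f_j(s) ds - lambda_j f_j(t)| with lambda_j = (j*pi)^(-2)."""
    lam = 1.0 / (j * math.pi) ** 2

    def f(s):
        return math.sqrt(2.0) * math.cos(j * math.pi * s)

    return abs(apply_kernel(f, t) - lam * f(t))


worst = max(eigen_residual(j, t) for j in (1, 2, 3) for t in (0.0, 0.3, 0.5, 1.0))
trace = sum(1.0 / (j * math.pi) ** 2 for j in range(1, 20001))
print(worst)          # small quadrature error only
print(trace)          # close to 1/6, the integral of K(t, t) over [0, 1]
```

The partial sums of the eigenvalues approach 1/6, which is also the mean of U, since each $Z_j^2$ has expectation one.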
The result that the Cramér-von Mises goodness-of-fit statistic and $U_n^{(a)}$ have the same limiting distribution is interesting in view of the similar result that the Watson goodness-of-fit and two-sample statistics have the same asymptotic null distribution as the corresponding Kolmogorov-Smirnov two-sided statistics ([19] and [20]). No simple explanation has been found for either of these results.
3.4 The Exact Null Distribution of $U_n^{(a)}$
As was the case in Chapter 2 for $T_n^{(a)}$, a closed form expression is not available for the exact null distribution of $U_n^{(a)}$. However, using the method described in Section 2.3, exact critical values for $U_n^{(1/2)}$ were calculated for various levels and sample sizes n = 9(1)20. These critical values are given in Table 3.
In order to compare the exact distribution of $U_n^{(1/2)}$ with the asymptotic distribution, exact tail probabilities, P, for n = 15(1)20, were calculated using the critical values obtained from the asymptotic distribution. These probabilities are given in Table 4. Apparently, for sample sizes larger than 20, a test based on the $\alpha$-level critical value of U will have an actual level less than $\alpha$, with the difference between the actual level and $\alpha$ being less than the corresponding difference for n = 20, given in Table 4. This difference appears to decrease as n increases. A test using the $\alpha$-level critical values for $U_{20}^{(1/2)}$ when n is somewhat larger than 20 will likely have an actual level greater than, but closer to, $\alpha$, than a test based on the asymptotic $\alpha$-level critical values.
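The enumeration behind the exact critical values can be sketched as follows. Because the statistic is distribution free, the null distribution may be obtained by attaching each of the $2^n$ equally likely sign vectors to the magnitudes 1, ..., n. The code assumes the form $F_n^{(a)}(y) = n^{-1}[\#\{X_i < y\} + a\,\#\{X_i = y\}]$ for the EDF version of Definition 2.1.1; that form and all function names are assumptions made for illustration, and this is not the program used for the tables.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def edf(sample, y, a):
    """Assumed form of F_n^{(a)}(y): (#{X_i < y} + a * #{X_i = y}) / n."""
    less = sum(1 for x in sample if x < y)
    ties = sum(1 for x in sample if x == y)
    return (less + a * ties) / Fraction(len(sample))


def u_stat(sample, a=Fraction(1, 2)):
    """U_n^{(a)} of Definition 3.1.1, with integrals against dF_n written as sums."""
    g = [edf(sample, x, a) + edf(sample, -x, a) - 1 for x in sample]
    centre = sum(g) / len(g)
    return sum((v - centre) ** 2 for v in g)


def exact_null_distribution(n, a=Fraction(1, 2)):
    """Map each attainable value of U_n^{(a)} to its exact null probability."""
    counts = Counter()
    for signs in product((-1, 1), repeat=n):
        # Sign s attached to the observation with the rank-th smallest magnitude.
        sample = [s * rank for rank, s in enumerate(signs, start=1)]
        counts[u_stat(sample, a)] += 1
    return {u: Fraction(c, 2 ** n) for u, c in sorted(counts.items())}


def tail_probability(dist, x):
    """P(U > x) under the enumerated distribution."""
    return sum(p for u, p in dist.items() if u > x)


dist = exact_null_distribution(9)
largest = max(dist)
print(float(largest), float(dist[largest]))
```

For n = 9 the largest attainable value is 20/27, approximately 0.74074, attained only when all observations have the same sign. The running time doubles with each additional observation, which is why the method is practical only for small n.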
TABLE 3
EXACT CRITICAL VALUES FOR $U_n^{(1/2)}$

$\alpha$ = .10
  n       x      P(U_n > x)       x      P(U_n > x)
  9    0.29904    0.1016       0.30727    0.0859
 10    0.29600    0.1094       0.30400    0.0977
 11    0.31405    0.1006       0.31555    0.0986
 12    0.31192    0.1011       0.31713    0.0991
 13    0.31953    0.1008       0.32135    0.0999
 14    0.32067    0.1007       0.32106    0.0997
 15    0.32000    0.1003       0.32178    0.0998
 16    0.32398    0.1016       0.32422    0.0999
 17    0.32363    0.1004       0.32444    0.1000
 18    0.32596    0.1001       0.32648    0.0992
 19    0.32716    0.1003       0.32804    0.0999
 20    0.32738    0.1003       0.32800    0.0995
 $\infty$    0.34730

$\alpha$ = .05
  n       x      P(U_n > x)       x      P(U_n > x)
  9    0.37037    0.0547       0.39506    0.0469
 10    0.40000    0.0527       0.40100    0.0449
 11    0.40421    0.0518       0.41473    0.0498
 12    0.40741    0.0513       0.40914    0.0483
 13    0.41875    0.0500       0.41966    0.0496
 14    0.42310    0.0504       0.42456    0.0494
 15    0.42193    0.0504       0.42370    0.0500
 16    0.42554    0.0504       0.42749    0.0499
 17    0.42825    0.0502       0.42866    0.0497
 18    0.43090    0.0500       0.43210    0.0494
 19    0.43126    0.0502       0.43155    0.0500
 20    0.43200    0.0501       0.43238    0.0497
 $\infty$    0.46136

TABLE 3 (Continued)

$\alpha$ = .01
  n       x      P(U_n > x)       x      P(U_n > x)
  9    0.56790    0.0117       0.65295    0.0039
 10    0.56900    0.0117       0.61600    0.0078
 11    0.59654    0.0117       0.61758    0.0098
 12    0.60359    0.0103       0.63194    0.0098
 13    0.64087    0.0100       0.64907    0.0095
 14    0.64140    0.0104       0.64723    0.0099
 15    0.65600    0.0100       0.65659    0.0099
 16    0.65625    0.0102       0.65991    0.0100
 17    0.66436    0.0100       0.66517    0.0099
 18    0.67164    0.0101       0.67215    0.0100
 19    0.67298    0.0100       0.67357    0.0099
 20    0.67738    0.0101       0.67800    0.0100
 $\infty$    0.74346

$\alpha$ = .001
  n       x      P(U_n > x)       x      P(U_n > x)
  9    0.65295    0.0039       0.74074    0.0000
 10    0.74400    0.0020       0.82500    0.0000
 11    0.76033    0.0029       0.83396    0.0010
 12    0.85417    0.0015       0.92303    0.0005
 13    0.85662    0.0015       0.90123    0.0010
 14    0.90671    0.0012       0.93477    0.0010
 15    0.93333    0.0011       0.94637    0.0010
 16    0.95679    0.0010       0.97241    0.0009
 17    0.97619    0.0010       0.97659    0.0010
 18    0.99263    0.0010       0.99314    0.0010
 19    0.99898    0.0010       0.99927    0.0010
 20    1.00638    0.0010       1.00738    0.0010
 $\infty$    1.16786

In each panel U_n denotes $U_n^{(1/2)}$, and the row labelled $\infty$ gives the asymptotic critical value.
TABLE 4
EXACT TAIL PROBABILITIES FOR $U_n^{(1/2)}$ BASED ON THE
$\alpha$-LEVEL CRITICAL VALUES OF U
$P = P(U_n^{(1/2)} > U_\alpha)$

        $\alpha$ = .10          $\alpha$ = .05          $\alpha$ = .01          $\alpha$ = .001
  n      P     $\alpha - P$       P     $\alpha - P$       P     $\alpha - P$       P     $\alpha - P$
 15   0.0837  0.0163    0.0382  0.0118    0.0055  0.0045    0.0002  0.0008
 16   0.0848  0.0152    0.0395  0.0105    0.0056  0.0044    0.0002  0.0008
 17   0.0858  0.0142    0.0401  0.0099    0.0059  0.0041    0.0002  0.0008
 18   0.0858  0.0142    0.0411  0.0089    0.0060  0.0040    0.0002  0.0008
 19   0.0873  0.0127    0.0413  0.0087    0.0064  0.0036    0.0003  0.0007
 20   0.0884  0.0116    0.0419  0.0081    0.0065  0.0035    0.0003  0.0007
CHAPTER 4
PROBLEMS FOR FURTHER RESEARCH
Many problems in the area of symmetry tests remain to be solved. As mentioned above, the asymptotic null distribution of the statistics in the $T_n^{(a)}$ class is not known. Also, it would be desirable to obtain a method for determining the power of the tests based on statistics in any of the four classes discussed above against various types of alternatives. In addition to the Cramér-von Mises type symmetry statistics, there exist corresponding EDF symmetry statistics based on the Kolmogorov-Smirnov statistic ([3], [4], [11] and [16]). The Bahadur efficiencies of several of these statistics, relative to each other, have been obtained by Littell [11], but there are no criteria available for comparing the Cramér-von Mises with the Kolmogorov-Smirnov statistics. The two invariance properties discussed in previous chapters, as well as the two comparisons upon which the Cramér-von Mises type statistics are based, have not yet been investigated with respect to the Kolmogorov-Smirnov type symmetry statistics.
In the area of distribution theory, it has been shown above that $U_n^{(a)}$ and the Cramér-von Mises goodness-of-fit statistic have the same asymptotic null distribution. Also, as noted in Section 3.3, the Watson goodness-of-fit and two-sample statistics have the same asymptotic null distribution as the corresponding two-sided Kolmogorov-Smirnov statistics. As it appears that these two distributions are common ones, it might be possible to determine when a particular statistic will have one of them as its asymptotic distribution.
Finally, in many practical situations, the center of symmetry, $\theta$, may not be known. It would thus be desirable to find a distribution-free EDF test for symmetry when $\theta$ is estimated. Since most distribution-free tests lose this property when parameters are estimated, this problem may be impossible to solve for small samples, but asymptotically distribution-free procedures may possibly be obtained.
BIBLIOGRAPHY
[1] Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of
certain "goodness of fit" criteria based on stochastic
processes. Ann. Math. Statist. 23 193-212.
[2] Billingsley, P. (1968). Convergence of Probability Measures.
Wiley, New York.
[3] Butler, C. C. (1969). A test for symmetry using the sample
distribution function. Ann. Math. Statist. 40 2209-2210.
[4] Chatterjee, S. K. and Sen, P. K. (1973). On Kolmogorov-Smirnov
type tests for symmetry. Ann. Inst. Statist. Math. 25
287-300.
[5] Chung, K. L. (1968). A Course in Probability Theory. Harcourt,
Brace and World, New York.
[6] Courant, R. (1937). Differential and Integral Calculus, Vol. I,
2nd ed. (translated by E. J. McShane). Interscience,
New York.
[7] Durbin, J. (1973). Distribution Theory for Tests Based on the
Sample Distribution Function. SIAM, Philadelphia.
[8] Hájek, J. and Šidák, Z. (1967). Theory of Rank Tests. Academic
Press, New York.
[9] Kac, M. and Siegert, A. J. F. (1947). An explicit representation
of a stationary Gaussian process. Ann. Math. Statist. 18
438-442.
[10] Kiefer, J. (1959). K-sample analogues of the Kolmogorov-Smirnov
and Cramér-von Mises tests. Ann. Math. Statist. 30 420-447.
[11] Littell, R. C. (1974). On the relative efficiency of some
Kolmogorov-Smirnov-type tests for symmetry about zero.
Commun. Statist. 3 1069-1076.
[12] Orlov, A. I. (1972). On testing the symmetry of distributions.
Theor. Probability Appl. 17 357-361.
[13] Prabhu, N. U. (1965). Stochastic Processes. Macmillan, New York.
[14] Rosenblatt, M. (1962). Random Processes. Oxford Univ. Press,
New York.
[15] Rothman, E. D. and Woodroofe, M. (1972). A Cramér-von Mises
type statistic for testing symmetry. Ann. Math. Statist. 43
2035-2038.
[16] Smirnov, N. V. (1947). Sur un critère de symétrie de la loi de
distribution d'une variable aléatoire. Izv. Acad. Sci. USSR.
56 11-14.
[17] Srinivasan, R. and Godio, L. B. (1974). A Cramér-von Mises type
statistic for testing symmetry. Biometrika. 61 196-198.
[18] Stephens, M. A. (1974). EDF statistics for goodness of fit and
some comparisons. J. Amer. Statist. Assoc. 69 730-737.
[19] Watson, G. S. (1961). Goodness-of-fit tests on a circle.
Biometrika. 48 109-114.
[20] Watson, G. S. (1962). Goodness-of-fit tests on a circle. II.
Biometrika. 49 57-63.
BIOGRAPHICAL SKETCH
David Lawrence Hill was born on July 11, 1949, in Morristown,
New Jersey, and raised in nearby Bernardsville. Upon graduation from
Bernards High School in June, 1967, he attended Bucknell University in
Lewisburg, Pennsylvania, and received the degree of Bachelor of Science
in Mathematics in May, 1971. While at Bucknell, he became interested
in statistics through the influence of the late Professor Paul Benson.
In September, 1971, David enrolled in the graduate school at the
University of Florida, and, in June, 1973, was awarded the degree of
Master of Statistics. Since then, he has been working toward the
degree of Doctor of Philosophy, and has been the recipient of a
graduate fellowship and a research assistantship from the University of
Florida.
After receiving his degree, David will assume a position as
an Assistant Professor in the Department of Mathematical Sciences
at Northern Illinois University in DeKalb, Illinois. He is a member
of the American Statistical Association.
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the degree
of Doctor of Philosophy.
P. V. Rao, Chairman
Professor of Statistics
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the
degree of Doctor of Philosophy.
R. C. Littell
Associate Professor of Statistics
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the
degree of Doctor of Philosophy.
D. D. Wackerly
Assistant Professor of Statistics
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the degree
of Doctor of Philosophy.
R. D. Mauldin
Associate Professor of Mathematics
This dissertation was submitted to the Graduate Faculty of the
Department of Statistics in the College of Arts and Sciences and
to the Graduate Council, and was accepted as partial fulfillment
of the requirements for the degree of Doctor of Philosophy.
August, 1976
Dean, Graduate School

