Permanent Link: http://ufdc.ufl.edu/UF00098122/00001
 Material Information
Title: Tests of symmetry based on the Cramér-von Mises and Watson statistics
Alternate Title: Cramér-von Mises and Watson statistics
Physical Description: v, 65 leaves : ; 28 cm.
Language: English
Creator: Hill, David Lawrence, 1949-
Publication Date: 1976
Copyright Date: 1976
 Subjects
Subject: Mathematical statistics   ( lcsh )
Statistics   ( lcsh )
Symmetry   ( lcsh )
Statistics thesis Ph. D
Dissertations, Academic -- Statistics -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
 Notes
Thesis: Thesis--University of Florida.
Bibliography: leaves 63-64.
General Note: Typescript.
General Note: Vita.
Statement of Responsibility: by David Lawrence Hill.
 Record Information
Bibliographic ID: UF00098122
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000181650
oclc - 03213612
notis - AAU8193
















TESTS OF SYMMETRY BASED ON
THE CRAMÉR-VON MISES AND WATSON STATISTICS







By

DAVID LAWRENCE HILL


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
THE DEGREE OF DOCTOR OF PHILOSOPHY








UNIVERSITY OF FLORIDA

1976


























ACKNOWLEDGMENTS


I would like to express my sincere thanks and appreciation

to Dr. P. V. Rao for his guidance and assistance as the chairman of

my committee. I would also like to thank Dr. Dennis D. Wackerly for

providing valuable advice whenever it was needed. Additional thanks

go to my parents for their encouragement and support, and to Mrs.

Edna Larrick for her excellent job of typing the manuscript.















































TABLE OF CONTENTS

                                                              Page

ACKNOWLEDGMENTS ............................................... ii

ABSTRACT ...................................................... iv

CHAPTER

1  INTRODUCTION ............................................... 1

   1.1  Literature Review ..................................... 1
   1.2  Summary of Results .................................... 6

2  CRAMÉR-VON MISES TYPE STATISTICS
   FOR TESTING SYMMETRY ....................................... 8

   2.1  Two Classes of Cramér-von Mises Type
        Symmetry Statistics ................................... 8
   2.2  Properties of R_n^{(a)} and S_n^{(a)} ................. 11
   2.3  A Third Class of Cramér-von Mises
        Type Statistics ....................................... 23

3  A TEST OF SYMMETRY BASED ON THE WATSON STATISTIC ........... 28

   3.1  A Class of Symmetry Statistics Based
        on the Watson Statistic ............................... 28
   3.2  Properties of U_n^{(a)} ............................... 31
   3.3  The Asymptotic Null Distribution of U_n^{(a)} ......... 37
   3.4  The Exact Null Distribution of U_n .................... 56

4  PROBLEMS FOR FURTHER RESEARCH .............................. 61

BIBLIOGRAPHY .................................................. 63

BIOGRAPHICAL SKETCH ........................................... 65












Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


TESTS OF SYMMETRY BASED ON
THE CRAMÉR-VON MISES AND WATSON STATISTICS

By

David Lawrence Hill

August, 1976


Chairman: Dr. P. V. Rao
Major Department: Statistics

Three Cramér-von Mises type statistics for testing symmetry

about zero have recently appeared in the literature. Two of these

have the desirable property of invariance with respect to taking the

negatives of the observations (Invariance Property I). However, none

of these statistics possesses the property of invariance with respect

to taking the reciprocals of the observations (Invariance Property II).

In this dissertation, several related statistics, which possess both

invariance properties, are developed.

In Chapter 2, two classes of statistics obtained by modifying

the two-sample Cramér-von Mises statistic are studied. Statistics in

these two classes are known to lead to consistent tests of symmetry

and to possess Invariance Property I. In addition, a relationship

between the two classes is established, and each class is shown to con-

tain one of the above existing statistics. By combining statistics

from the two classes, a third class is obtained. Statistics in the

third class are shown to lead to consistent tests of symmetry and to











possess Invariance Property II. Exact critical values are given for

one of the statistics in the third class.

Since the asymptotic null distribution of statistics in the

third class is not available at this time, a class of statistics ob-

tained from a modification of the Watson two-sample statistic is defined

in Chapter 3. Statistics in this fourth class are shown to lead to con-

sistent tests of symmetry and to possess Invariance Property I. In

addition, one of the statistics in the fourth class is shown to possess

both invariance properties, and exact critical values are calculated

for this statistic. The common asymptotic null distribution for sta-

tistics in the fourth class is shown to be the asymptotic null distribu-

tion of the Cramér-von Mises goodness-of-fit and two-sample statistics.

















CHAPTER 1


INTRODUCTION


1.1 Literature Review

This dissertation explores the problem of testing whether a

random variable with a continuous cumulative distribution function (CDF)

is distributed symmetrically about a known point. A random variable

with a continuous CDF, F, will be said to have a symmetric distribution

about the point θ if, for all y,

    F(θ - y) = 1 - F(θ + y).

Without loss of generality, the value of θ can be taken as zero, since
X - θ has a symmetric distribution about zero if and only if X has a sym-
metric distribution about θ. Hence, the hypothesis of symmetry can be

defined as follows:


Definition 1.1.1:

A CDF, F, satisfies the hypothesis of symmetry, H_0, provided
F ∈ Φ_s, where Φ_s denotes the set of all continuous CDFs such that, for
all y,

    F(-y) = 1 - F(y).                                          (1.1.1)

Note that if Φ denotes the set of all continuous CDFs, then
Φ - Φ_s is the set of all alternatives to H_0.

A variety of statistics are available for testing the hypothesis

    H_0: F(y) = G(y) for all y,










where F and G denote the unknown CDFs of two populations. These statis-

tics will be referred to as two-sample statistics. Several two-sample

statistics involve the empirical (sample) distribution function (EDF),

F_n(·).¹ One such statistic is the two-sample Cramér-von Mises statistic,
defined by Kiefer [10] as

    σ_n² = n ∫_{-∞}^{∞} [F_n(y) - G_n(y)]² dH_n(y),

where F_n (G_n) denotes the EDF based on a random sample of size n from
the population with CDF F (G), and H_n denotes the EDF based on the com-
bined sample. Another two-sample EDF statistic, due to Watson [20], is
defined by

    ρ_n² = n ∫_{-∞}^{∞} [F_n(y) - G_n(y) - ∫_{-∞}^{∞} (F_n(w) - G_n(w)) dH_n(w)]² dH_n(y),

where F_n, G_n and H_n are as defined above. It should be noted here that
similar but slightly more complicated forms of σ_n² and ρ_n² exist, for use
in the case of unequal sample sizes.

Recently, several articles have appeared in the literature in
which a statistic for testing symmetry about zero is obtained by modi-
fying one of the two-sample EDF statistics ([3], [4], [11], [12], [15]
and [17]). Such modifications of σ_n² and ρ_n² will be studied in this
dissertation.

To date, three symmetry statistics obtained from σ_n² have been
studied. Orlov [12] and Rothman and Woodroofe [15] have independently

__________
¹If X_1,X_2,...,X_n denotes a random sample, the EDF, F_n(·), is
defined by

    F_n(y) = n^{-1} (the number of X_i ≤ y).
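The two-sample statistic above can be computed directly from the EDFs: with equal sample sizes n, the pooled EDF H_n places mass 1/(2n) at each of the 2n pooled observations, so the integral reduces to a finite sum. A minimal sketch in Python; the function names and sample data are illustrative, not from the dissertation:

```python
def edf(sample, y):
    # F_n(y) = n^{-1} (number of observations <= y)
    return sum(v <= y for v in sample) / len(sample)

def cramer_von_mises_2s(xs, ys):
    # sigma_n^2 = n * integral of [F_n(y) - G_n(y)]^2 dH_n(y), where
    # H_n is the EDF of the pooled sample (equal sizes n assumed), so
    # dH_n assigns mass 1/(2n) to every pooled observation.
    n = len(xs)
    pooled = xs + ys
    return n * sum((edf(xs, y) - edf(ys, y)) ** 2 for y in pooled) / (2 * n)
```

Identical samples give a statistic of zero, while well-separated samples drive it up.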












proposed essentially equivalent statistics. Orlov's statistic is
obtained from σ_n² by replacing [F_n(y) - G_n(y)] with the expression
[F_n(y) + F_n(-y) - 1], and is given by

    L(0) = n ∫_{-∞}^{∞} [F_n(y) + F_n(-y) - 1]² dF_n(y).

Orlov points out that the value of L(0) depends only on the values of
[F_n(y) + F_n(-y) - 1]² at its discontinuity points, X_1,X_2,...,X_n, and
that these values may be taken to be anywhere between the right-hand
and left-hand limits of [F_n(y) + F_n(-y) - 1]² without altering the
asymptotic properties of L(0). Orlov derives the asymptotic null dis-
tribution of L(0) and gives a table of percentage points for this
distribution.

Rothman and Woodroofe [15] have chosen a specific modification
of [F_n(y) + F_n(-y) - 1], and their statistic, R'_n, is defined as follows:

    R'_n = n ∫_{-∞}^{∞} [F'_n(y) + F'_n(-y) - 1]² dF_n(y),         (1.1.2)

where

    F'_n(y) = (1/2)[F_n(y+) + F_n(y-)].                            (1.1.3)

This modification of the EDF has the desirable effect of producing a
statistic which is invariant with respect to the transformation X_i → -X_i.
In other words, if X and -X denote, respectively, the vectors
(X_1,X_2,...,X_n) and (-X_1,-X_2,...,-X_n), where X_1,X_2,...,X_n denotes a
random sample from F ∈ Φ, then

    R'_n(X) = R'_n(-X) w.p. 1,













where R'_n(X) denotes the value of the statistic, R'_n, for the sample X.
In the sequel, this property will be called Invariance Property I.

The asymptotic null distribution of R'_n, equivalent to that of
L(0), is obtained by Rothman and Woodroofe. In addition, they give
approximate critical values for selected sample sizes and levels, α,
based on a Monte Carlo study. Tests based on R'_n are shown to be con-
sistent against all alternatives in Φ - Φ_s, and the asymptotic distri-
bution of R'_n under a sequence of local alternatives is also investigated.
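As a sketch of how R'_n can be evaluated in practice, note that dF_n places mass 1/n at each observation, so the integral reduces to an average over the sample, and the leading n cancels that 1/n. The following Python is illustrative only (the names are mine, not the dissertation's):

```python
def f_mid(sample, y):
    # F'_n(y) = (1/2)[F_n(y+) + F_n(y-)]: the midpoint modification (1.1.3)
    n = len(sample)
    return 0.5 * (sum(v <= y for v in sample) + sum(v < y for v in sample)) / n

def rothman_woodroofe(x):
    # R'_n = n * integral of [F'_n(y) + F'_n(-y) - 1]^2 dF_n(y)   (1.1.2);
    # the mass 1/n of dF_n cancels the factor n in front.
    return sum((f_mid(x, v) + f_mid(x, -v) - 1.0) ** 2 for v in x)

x = [0.5, -2.0, 1.0]
# Invariance Property I: the statistic is unchanged (up to rounding)
# when every observation is negated.
assert abs(rothman_woodroofe(x) - rothman_woodroofe([-v for v in x])) < 1e-12
```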
The third modification of σ_n² is due to Srinivasan and Godio [17].
A discussion of this statistic requires the definition of the following
functions, to be used throughout the dissertation.


Definition 1.1.2:

Let (Z_(1),Z_(2),...,Z_(n)) denote the vector obtained by ordering
the elements of Z = (Z_1,Z_2,...,Z_n) in ascending order according to the
order relation given by

    Z_i "<" Z_j if |Z_i| < |Z_j|.

In addition, if P denotes a proposition, let the indicator function
I[·] be given by

    I[P] = 1 if P is true
           0 if P is false.                                    (1.1.4)

Then, the functions δ_k(·), P_k(·) and N_k(·,·), for k = 1,2,...,n, are
defined by

    δ_k(Z) = -1 if Z_(k) < 0
              1 if Z_(k) ≥ 0,                                  (1.1.5)

    P_k(Z) = Σ_{j=1}^{k} I[δ_j(Z) = 1],                        (1.1.6)
and
    N_k(Z) = Σ_{j=1}^{k} I[δ_j(Z) = -1].                       (1.1.7)

If Z denotes a random sample from a continuous CDF, then

    P(|Z_i| = |Z_j|) = P(Z_i = 0) = 0,  i ≠ j.

Thus, with probability one, the functions δ_k(·), P_k(·) and N_k(·) are
well defined.

The statistic proposed by Srinivasan and Godio, S'_n, may now be
defined as follows:

    S'_n ≡ S'_n(X) = (1/8) Σ_{k=1}^{n} [N_k(X) - P_k(X)]²
                     + (1/8) Σ_{k=1}^{n} [N_k(-X) - P_k(-X)]².     (1.1.8)

The asymptotic and exact (for n = 10(1)20) null distributions of S'_n are
given in [17], the former being equivalent to the common asymptotic null
distribution of the statistics L(0) and R'_n. In addition, tests based on
S'_n are shown to be consistent against all alternatives in Φ - Φ_s.
Although it is not mentioned in [17], it is clear from (1.1.8) that S'_n
possesses Invariance Property I.
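The functions of Definition 1.1.2 are simple to compute: order the sample by absolute value and keep running counts of signs. A sketch (the variable names are mine); note how negating the sample merely swaps the roles of P_k and N_k, which is what makes Invariance Property I transparent:

```python
def deltas(z):
    # delta_k(Z): +1 if the k-th smallest observation in absolute
    # value is >= 0, and -1 otherwise   (1.1.5)
    return [1 if v >= 0 else -1 for v in sorted(z, key=abs)]

def P_N(z):
    # P_k(Z), N_k(Z): running counts of +1's and -1's among
    # delta_1, ..., delta_k   (1.1.6)-(1.1.7)
    P, N, p, m = [], [], 0, 0
    for d in deltas(z):
        p += (d == 1)
        m += (d == -1)
        P.append(p)
        N.append(m)
    return P, N

x = [0.5, -2.0, 1.0]           # ordered by |.|: 0.5, 1.0, -2.0
P, N = P_N(x)
# Negating the sample swaps the roles of P_k and N_k:
assert P_N([-v for v in x]) == (N, P)
```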











1.2 Summary of Results

In Chapter 2, two classes of statistics for testing symmetry are
defined, one containing R'_n, and the other containing a statistic equiva-
lent to S'_n. The statistics in both classes are shown to possess Invar-
iance Property I and to have a common asymptotic null distribution.
Tests based on these statistics are shown to be consistent against all
alternatives to symmetry about zero. For each statistic in the first
class, it is shown that there exists a corresponding statistic in the
second class with the same exact null distribution. These pairs of
statistics are related further in that the test based on one is equiva-
lent to the test based on the other using the sample of reciprocals,
(X_1^{-1},X_2^{-1},...,X_n^{-1}). By combining these pairs, a third class
of statistics is defined. Statistics in the third class lead to consistent
tests of symmetry, possess Invariance Property I and, in addition, are
invariant with respect to the transformation X_i → X_i^{-1}. In other
words, if X^{-1} denotes the vector of reciprocals,
(X_1^{-1},X_2^{-1},...,X_n^{-1}), where X denotes a random sample from
F ∈ Φ, and if T denotes a member of the third class of statistics, then

    T(X) = T(X^{-1}) w.p. 1.

This invariance property will henceforth be referred to as Invariance
Property II.

Exact critical values are given for one member of the third
class for n = 10(1)24. However, it does not appear that the common
asymptotic null distribution of members of the third class will be easy
to obtain. For this reason, a fourth class of statistics, based on a
modification of Watson's two-sample statistic, ρ_n², is defined in










Chapter 3. These statistics are shown to lead to consistent tests of

symmetry, as well as to possess Invariance Property I. In addition, the

common asymptotic null distribution of statistics in the fourth class is

shown to be that of the well-known Cramér-von Mises goodness-of-fit sta-

tistic [7]. One member of the fourth class is shown to possess Invar-

iance Property II, and exact critical values for this statistic are

given for n = 9(1)20.

Chapter 4 contains a brief description of the problems that are

yet to be solved in the general area of the topics considered in this

dissertation.
















CHAPTER 2


CRAMÉR-VON MISES TYPE STATISTICS
FOR TESTING SYMMETRY


2.1 Two Classes of Cramér-von Mises
Type Symmetry Statistics

Intuitively, a test of H_0 may be based on either of the following:

1. A comparison of estimates of F(y) and (1 - F(-y)) for all
values of y.

2. A comparison of estimates of (F(y) - F(0)) and (F(0) - F(-y))
for all values of y.

Since a natural estimator for F(y) is the EDF, F_n(y), a reasonable test
statistic can be constructed from any functional of the difference be-
tween the estimates to be compared in (1) or (2) above. In (1), this
test statistic would be based on

    F_n(y) - (1 - F_n(-y))
        = F_n(y) + F_n(-y) - 1,                                (2.1.1)

whereas in (2), it would be based on

    F_n(y) - F_n(0) - (F_n(0) - F_n(-y))
        = F_n(y) + F_n(-y) - 2F_n(0).                          (2.1.2)


A modified version of F_n, slightly more general than the one

introduced by Rothman and Woodroofe [15], will be used to define the

statistics considered in this dissertation. The reason for introducing


(2.1.1)










this version of the EDF will be noted after the proof of Theorem 2.2.2.
The modified EDF is presented in the following definition. Hereafter,
X = (X_1,X_2,...,X_n) will denote a random sample from F ∈ Φ, and F_n will
denote the standard version of the EDF obtained from X.


Definition 2.1.1:

Corresponding to each a ∈ [0,1], the modified EDF, F_n^{(a)}, is
defined as

    F_n^{(a)}(y) = aF_n(y+) + (1-a)F_n(y-)      if y < 0
                   (1-a)F_n(y+) + aF_n(y-)      if y ≥ 0,      (2.1.3)

where F_n(y+) and F_n(y-) are the right-hand and left-hand limits, respec-
tively, of F_n at y. Note that, from (1.1.3),

    F'_n = F_n^{(1/2)},                                        (2.1.4)

and like F'_n, F_n^{(a)} differs from F_n only at its discontinuity points,
X_1,X_2,...,X_n. At these points, F_n^{(a)} lies between the right-hand and
left-hand limits of F_n. Specifically, for i = 1,2,...,n,

    F_n^{(a)}(X_i) = F_n(X_i) - (1-a)/n     if X_i < 0
                     F_n(X_i) - a/n         if X_i > 0,        (2.1.5)

since

    F_n(X_i) = F_n(X_i+) = F_n(X_i-) + 1/n.

The following lemma shows that F_n and F_n^{(a)} behave similarly
for large samples.
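Definition 2.1.1 translates directly into code: evaluate the right- and left-hand limits of F_n and mix them according to the sign of y. A sketch under my own naming, with (2.1.5) checked at a negative observation:

```python
def F_a(sample, y, a):
    # Modified EDF F_n^{(a)}(y) of Definition 2.1.1, eq. (2.1.3).
    n = len(sample)
    f_plus = sum(v <= y for v in sample) / n    # F_n(y+) = F_n(y)
    f_minus = sum(v < y for v in sample) / n    # F_n(y-)
    if y < 0:
        return a * f_plus + (1 - a) * f_minus
    return (1 - a) * f_plus + a * f_minus

x = [-1.0, 2.0]
# Per (2.1.5), at a negative observation F_n^{(a)}(X_i) = F_n(X_i) - (1-a)/n:
assert F_a(x, -1.0, 0.0) == 0.0     # F_n(-1) - 1/2 = 1/2 - 1/2
assert F_a(x, 2.0, 0.0) == 1.0      # F_n(2) - 0/2 = 1
```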












Lemma 2.1.1:

For every F ∈ Φ and every a ∈ [0,1],

    sup_y |F_n^{(a)}(y) - F(y)| → 0 w.p. 1, as n → ∞.

Proof:

It is easily seen from (2.1.5) that

    sup_y |F_n^{(a)}(y) - F_n(y)| ≤ n^{-1} max{a, (1-a)} ≤ n^{-1}.

The desired result now follows from the triangle inequality and the
Glivenko-Cantelli Lemma [5].

The following corollary follows directly from (1.1.1) and
Lemma 2.1.1. It will be needed in the proof of a later theorem.

Corollary 2.1.1:

Under H_0,

    sup_y |F_n^{(a)}(y) + F_n^{(a)}(-y) - 1| → 0 w.p. 1, as n → ∞,

for every a ∈ [0,1].

Now, by replacing F_n with the modified EDF, F_n^{(a)}, in the com-
parisons given by (2.1.1) and (2.1.2), and then applying the functional
n∫(·)² dF_n, two distinct Cramér-von Mises type statistics for test-
ing H_0 are produced. Since the index, a, can take any value in [0,1],
two classes of statistics actually result. These two classes will be
denoted {R_n^{(a)}: 0 ≤ a ≤ 1} and {S_n^{(a)}: 0 ≤ a ≤ 1}, and their general
members are now defined.










Definition 2.1.2:

For a ∈ [0,1],

    R_n^{(a)} ≡ R_n^{(a)}(X) = n ∫_{-∞}^{∞} [F_n^{(a)}(y) + F_n^{(a)}(-y) - 1]² dF_n(y),      (2.1.6)

and

    S_n^{(a)} ≡ S_n^{(a)}(X) = n ∫_{-∞}^{∞} [F_n^{(a)}(y) + F_n^{(a)}(-y) - 2F_n(0)]² dF_n(y). (2.1.7)

Some properties of R_n^{(a)} and S_n^{(a)} will be considered in the next
section. It should be noted here that reasonable tests of H_0 are obtained
by rejecting H_0 for "large" values of R_n^{(a)} or S_n^{(a)}. Also, from (2.1.4),
R'_n = R_n^{(1/2)}, so that one member of the class {R_n^{(a)}: 0 ≤ a ≤ 1} has
already been studied by Rothman and Woodroofe [15].



2.2 Properties of R_n^{(a)} and S_n^{(a)}

As pointed out by Rothman and Woodroofe [15], it is reasonable to
require that a test statistic for H_0 possess Invariance Property I. In
Chapter 1, it was noted that both R'_n and S'_n satisfy this requirement.
The next theorem shows that each member of the classes {R_n^{(a)}: 0 ≤ a ≤ 1}
and {S_n^{(a)}: 0 ≤ a ≤ 1} possesses Invariance Property I.

Theorem 2.2.1:

For every a ∈ [0,1],

    R_n^{(a)}(X) = R_n^{(a)}(-X) w.p. 1,                       (2.2.1)
and
    S_n^{(a)}(X) = S_n^{(a)}(-X) w.p. 1.                       (2.2.2)










Proof:

Let G_n and G_n^{(a)} denote, respectively, the EDF and its modifica-
tion (according to Definition 2.1.1) for the sample, -X. If y < 0,

    F_n^{(a)}(y) = aF_n(y+) + (1-a)F_n(y-)
        = n^{-1} [a Σ_{i=1}^{n} I[X_i ≤ y] + (1-a) Σ_{i=1}^{n} I[X_i < y]]
        = n^{-1} [a Σ_{i=1}^{n} (1 - I[-X_i < -y]) + (1-a) Σ_{i=1}^{n} (1 - I[-X_i ≤ -y])]
        = 1 - aG_n(-y-) - (1-a)G_n(-y+).

Since y < 0 implies that -y > 0, the last expression equals

    1 - G_n^{(a)}(-y).

Analogously, the same result holds if y > 0. Since F_n(0) = 1 - G_n(0)
with probability one, for all y,

    F_n^{(a)}(y) = 1 - G_n^{(a)}(-y) w.p. 1.

Hence, for all y,

    F_n^{(a)}(y) + F_n^{(a)}(-y) - 1 = -[G_n^{(a)}(y) + G_n^{(a)}(-y) - 1] w.p. 1,

from which (2.2.1) follows immediately. A sketch of the proof of
(2.2.2) is given on page 16, immediately following Corollary 2.2.1.
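The key identity in this proof, F_n^{(a)}(y) = 1 - G_n^{(a)}(-y) with G_n^{(a)} built from the negated sample, is easy to spot-check numerically, including at the jump points. A sketch under my own naming:

```python
def F_a(sample, y, a):
    # Modified EDF of Definition 2.1.1, eq. (2.1.3)
    n = len(sample)
    f_plus = sum(v <= y for v in sample) / n
    f_minus = sum(v < y for v in sample) / n
    if y < 0:
        return a * f_plus + (1 - a) * f_minus
    return (1 - a) * f_plus + a * f_minus

x = [-1.0, 2.0, 0.5]
neg = [-v for v in x]
# F_n^{(a)}(y) = 1 - G_n^{(a)}(-y) for all y (no ties, no zeros):
for a in (0.0, 0.25, 1.0):
    for y in x + [0.7, -0.7]:
        assert abs(F_a(x, y, a) - (1 - F_a(neg, -y, a))) < 1e-12
```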


Recall that in (1.1.8), S'_n was defined in terms of P_k(·), N_k(·)
and δ_k(·), and not in terms of the EDF. The next theorem provides rep-
resentations of R_n^{(a)} and S_n^{(a)} in terms of P_k(·), N_k(·) and δ_k(·).
These representations not only provide a convenient form for the actual
computation of the test statistics, but also enable one to establish an
important relation between R_n^{(1-a)} and S_n^{(a)}, 0 ≤ a ≤ 1. In addition,
they are useful in proving several theorems which follow.



Theorem 2.2.2:

With probability one, for every a ∈ [0,1],

    R_n^{(a)}(X) = n^{-2} Σ_{k=1}^{n} [N_k(X^{-1}) - P_k(X^{-1}) + (1-a)δ_k(X^{-1})]²,   (2.2.3)

and

    S_n^{(a)}(X) = n^{-2} Σ_{k=1}^{n} [N_k(X) - P_k(X) + aδ_k(X)]²,                      (2.2.4)

where δ_k(·), P_k(·) and N_k(·) are given by (1.1.5), (1.1.6) and (1.1.7).


Proof:

The result given by (2.2.3) will be proved first. If it can be
shown that, for k = 1,2,...,n,

    F_n^{(a)}(X_(n-k+1)) + F_n^{(a)}(-X_(n-k+1)) - 1
        = n^{-1} [N_k(X^{-1}) - P_k(X^{-1}) + (1-a)δ_k(X^{-1})],       (2.2.5)

then (2.2.3) will follow immediately from (2.1.6) by summing the squares
on both sides of (2.2.5) over the index k, k = 1,2,...,n.

To establish (2.2.5), expressions involving F_n will be found for
P_k and N_k. If Y = X^{-1}, then, from Definition 1.1.2, Y_(n-j+1) = X_(j)^{-1}.
Hence, for k = 1,2,...,n, with probability one,

    P_k(X^{-1}) = Σ_{j=n-k+1}^{n} I[X_(j) ≥ 0],                        (2.2.6)
and
    N_k(X^{-1}) = Σ_{j=n-k+1}^{n} I[X_(j) < 0].                        (2.2.7)

Since |X_(i)| < |X_(j)| if and only if i < j, equivalent expressions of
(2.2.6) and (2.2.7) are given by

    P_k(X^{-1}) = Σ_{i=1}^{n} I[X_i ≥ |X_(n-k+1)|],
and
    N_k(X^{-1}) = Σ_{i=1}^{n} I[X_i ≤ -|X_(n-k+1)|].

Thus, with probability one, for k = 1,2,...,n (noting that F_n jumps at
|X_(n-k+1)| only when X_(n-k+1) ≥ 0),

    P_k(X^{-1}) = n[1 - F_n(|X_(n-k+1)|)] + I[X_(n-k+1) ≥ 0],
and
    N_k(X^{-1}) = nF_n(-|X_(n-k+1)|).

Hence, with probability one, since δ_k(X^{-1}) = δ_{n-k+1}(X),

    N_k(X^{-1}) - P_k(X^{-1}) + (1-a)δ_k(X^{-1})
        = n[F_n(-|X_(n-k+1)|) - 1 + F_n(|X_(n-k+1)|)]
              - I[X_(n-k+1) ≥ 0] + (1-a)δ_{n-k+1}(X)
        = n[F_n(X_(n-k+1)) + F_n(-X_(n-k+1)) - 1 - a_k],              (2.2.8)

where

    a_k = n^{-1} {I[X_(n-k+1) ≥ 0] - (1-a)δ_{n-k+1}(X)}
        = (1-a)/n   if X_(n-k+1) < 0
          a/n       if X_(n-k+1) ≥ 0,       k = 1,2,...,n.

Now, it follows from (2.1.5) that

    F_n(X_(n-k+1)) + F_n(-X_(n-k+1)) - a_k
        = F_n^{(a)}(X_(n-k+1)) + F_n^{(a)}(-X_(n-k+1)),

and, hence, (2.2.5) follows from (2.2.8), which establishes (2.2.3).

The proof of (2.2.4) is almost identical. It can be shown that

    P_k(X) = Σ_{i=1}^{n} I[0 ≤ X_i ≤ |X_(k)|],
and
    N_k(X) = Σ_{i=1}^{n} I[-|X_(k)| ≤ X_i < 0].

This leads to the following equation corresponding to (2.2.5):

    F_n^{(a)}(X_(k)) + F_n^{(a)}(-X_(k)) - 2F_n(0)
        = -n^{-1} [N_k(X) - P_k(X) + aδ_k(X)].                        (2.2.9)

Thus, (2.2.4) follows as before, and the proof of the theorem is complete.

An immediate consequence of (1.1.8) and (2.2.4) is that

    S_n^{(0)}(X) = 4n^{-2} S'_n(X).

Hence, Srinivasan and Godio's statistic, S'_n, can be expressed in terms of
the modified EDF, F_n^{(a)}, using (2.1.7), which provides the reason for
introducing F_n^{(a)}.
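The representations (2.2.3) and (2.2.4) make both statistics one-pass computations on the sign sequence δ, since N_k - P_k = -(δ_1 + ... + δ_k), and the reciprocal sample enters R_n^{(a)} exactly as in (2.2.3). A sketch (function names are mine, not the dissertation's); the sample values are chosen so that their reciprocals are exactly representable:

```python
def _np_delta(z):
    # delta_k per (1.1.5) and the running differences N_k - P_k
    d = [1 if v >= 0 else -1 for v in sorted(z, key=abs)]
    np_, s = [], 0
    for dk in d:
        s += dk
        np_.append(-s)          # N_k - P_k = -(delta_1 + ... + delta_k)
    return np_, d

def S_stat(x, a):
    # S_n^{(a)}(X) = n^{-2} sum_k [N_k(X) - P_k(X) + a*delta_k(X)]^2   (2.2.4)
    np_, d = _np_delta(x)
    return sum((m + a * dk) ** 2 for m, dk in zip(np_, d)) / len(x) ** 2

def R_stat(x, a):
    # R_n^{(a)}(X): the same form evaluated at the reciprocals   (2.2.3)
    np_, d = _np_delta([1.0 / v for v in x])
    return sum((m + (1 - a) * dk) ** 2 for m, dk in zip(np_, d)) / len(x) ** 2

x = [0.5, -2.0, 1.0]
# Invariance Property I (Theorem 2.2.1):
assert abs(S_stat(x, 0.25) - S_stat([-v for v in x], 0.25)) < 1e-12
# Corollary 2.2.1: R^{(1-a)} on the reciprocal sample equals S^{(a)} on X:
inv = [1.0 / v for v in x]
assert abs(R_stat(inv, 0.75) - S_stat(x, 0.25)) < 1e-12
```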











The following corollary, which follows directly from Theorem 2.2.2,
establishes an important relation between the classes {R_n^{(a)}: 0 ≤ a ≤ 1}
and {S_n^{(a)}: 0 ≤ a ≤ 1}.

Corollary 2.2.1:

For every a ∈ [0,1],

    R_n^{(1-a)}(X^{-1}) = S_n^{(a)}(X) w.p. 1.

In other words, the test of symmetry based on S_n^{(a)}, using the
sample X, is equivalent to the test based on R_n^{(1-a)}, using the sample of
reciprocals, X^{-1}.

As a result of Corollary 2.2.1, the proof of Theorem 2.2.1 can
easily be completed by noting that (2.2.2) follows directly from (2.2.1)
and Corollary 2.2.1, since the transformations X → X^{-1} and X → -X are
commutative.

In conjunction with Corollary 2.2.1, the following lemma will be
useful in establishing several later results.


Lemma 2.2.1:

Let F and F_r denote the CDFs of X and its reciprocal, X^{-1},
respectively. Then,

    F ∈ Φ_s if and only if F_r ∈ Φ_s.

Proof:

Let F* denote the CDF of the random variable -X. Then it
follows directly from (1.1.1) that

    F ∈ Φ_s if and only if F = F* and F ∈ Φ.                  (2.2.10)

Also observe that

    F_r(y) = F(0) - F(y^{-1})       if y < 0
             F(0)                   if y = 0
             1 + F(0) - F(y^{-1})   if y > 0.

Thus, since

    lim_{y→0-} [F(0) - F(y^{-1})] = lim_{y→0+} [1 + F(0) - F(y^{-1})] = F(0),

it follows that

    F_r = F_r* and F_r ∈ Φ if and only if F = F* and F ∈ Φ.

Combining this result with (2.2.10) completes the proof.

Another consequence of Theorem 2.2.2 is that, for every a ∈ [0,1],
both R_n^{(a)} and S_n^{(a)} are distribution free. Considering S_n^{(a)}
first, it is clear from Definition 1.1.2 and Theorem 2.2.2 that S_n^{(a)}(X)
depends on X only through

    δ = (δ_1(X), δ_2(X), ..., δ_n(X)).

A result of Hájek and Šidák ([8], p. 40) shows that, under H_0,

    P(δ = d) = 2^{-n}

for every d ∈ Δ, where

    Δ = {(d_1,d_2,...,d_n) : d_i = ±1}.

Hence, for every a ∈ [0,1], S_n^{(a)} is distribution free under H_0. It now
follows from Corollary 2.2.1 and Lemma 2.2.1 that R_n^{(a)} is likewise
distribution free under H_0, and that S_n^{(a)} and R_n^{(1-a)} are identically
distributed for every a ∈ [0,1] and every F ∈ Φ_s. That is, for each
statistic in the class {S_n^{(a)}: 0 ≤ a ≤ 1}, there exists a corresponding
statistic in the class {R_n^{(a)}: 0 ≤ a ≤ 1} with the same exact null
distribution. It should be noted, however, that these pairs of statistics
do not produce equivalent tests of H_0, even when n becomes large, since
the statistics in a pair will not have equal values for every sample, X,
and since R_n^{(1-a)} - S_n^{(a)} does not converge to zero.
Rothman and Woodroofe [15] do not give exact critical values for
R'_n, and Srinivasan and Godio [17] were not successful in deriving them
with the method they employed to obtain the exact values for S'_n. However,
from the above discussion, it follows that the critical values given in [17]
are also the critical values for (n²/4)R_n^{(1)}. Hence, exact critical
values are available for a statistic closely related to R'_n. In fact,
R'_n and R_n^{(1)} have the same asymptotic null distribution. Since
R_n^{(1)} and S_n^{(0)} are identically distributed, this follows from the
fact, noted in Section 1.1, that R_n^{(1/2)} and S_n^{(0)} have a common
asymptotic null distribution. The next theorem shows that this relation
between R'_n and R_n^{(1)} holds for any pair of statistics in the R_n^{(a)}
class, and for any pair in the S_n^{(a)} class as well.


Theorem 2.2.3:

Under H_0, R_n^{(a)} (S_n^{(a)}) and R_n^{(b)} (S_n^{(b)}) have the same asymptotic
distribution for any a and b in the unit interval. In fact, under H_0,

    |R_n^{(a)} - R_n^{(b)}| → 0 w.p. 1, as n → ∞,
and
    |S_n^{(a)} - S_n^{(b)}| → 0 w.p. 1, as n → ∞.
n n










Proof:

In view of Corollary 2.2.1 and Lemma 2.2.1, it will be sufficient
to prove the result only for R_n^{(a)}. Observe that

    R_n^{(a)} = R_n^{(b)}
        + n ∫_{-∞}^{∞} [F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)]² dF_n(y)
        + 2n ∫_{-∞}^{∞} [F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)]
              × [F_n^{(b)}(y) + F_n^{(b)}(-y) - 1] dF_n(y).

From (2.1.5),

    sup_y |F_n^{(a)}(y) + F_n^{(a)}(-y) - F_n^{(b)}(y) - F_n^{(b)}(-y)| ≤ 2n^{-1},

and thus it follows that

    |R_n^{(a)} - R_n^{(b)}| ≤ o(1) + 4 sup_y |F_n^{(b)}(y) + F_n^{(b)}(-y) - 1|.    (2.2.11)

By Corollary 2.1.1, the second term of the last expression, under H_0,
converges to zero with probability one, completing the proof of the
theorem.
It is now clear, from the fact that R_n^{(1-a)} and S_n^{(a)} are identi-
cally distributed, that all statistics in both classes have a common
asymptotic null distribution. This agrees with the work of Rothman and
Woodroofe [15] and Srinivasan and Godio [17], who independently derived
the same asymptotic distribution for the statistics R_n^{(1/2)} and S_n^{(0)},
respectively. As noted in Section 1.1, percentage points have been
tabulated for this distribution by Orlov [12].

Consistency of the test for symmetry based on any statistic in
the classes {R_n^{(a)}: 0 ≤ a ≤ 1} or {S_n^{(a)}: 0 ≤ a ≤ 1} is now established.
n n


Theorem 2.2.4:

For any a ∈ [0,1], R_n^{(a)} (S_n^{(a)}) is consistent against any alter-
native in Φ - Φ_s.

Proof:

Rothman and Woodroofe [15] show that if H_0 is not satisfied,

    R_n^{(1/2)} → ∞ w.p. 1.

Since sup_y |F_n(y) + F_n(-y) - 1| ≤ 1, it follows from (2.2.11) that for
any a ∈ [0,1], |R_n^{(a)} - R_n^{(1/2)}| ≤ o(1) + 4. Hence for any
F ∈ Φ - Φ_s,

    R_n^{(a)} → ∞ w.p. 1,                                      (2.2.12)

and thus R_n^{(a)} is consistent against any alternative in Φ - Φ_s for every
a ∈ [0,1]. In view of Corollary 2.2.1 and Lemma 2.2.1, S_n^{(a)} is likewise
consistent for every a ∈ [0,1], completing the proof.

Since exact critical values for R_n^{(1)} are readily available, it
is of interest to know whether there is any further advantage to be
gained by using R_n^{(1/2)} instead of R_n^{(1)}. More generally, one might
look for a criterion to select a statistic from each of the two classes,
{R_n^{(a)}: 0 ≤ a ≤ 1} and {S_n^{(a)}: 0 ≤ a ≤ 1}. It has already been estab-
lished by Theorem 2.2.3 that the effect of the choice of the index, a,
becomes negligible as the sample size becomes large. However, the
value of the index will affect the small sample distributions of both
R_n^{(a)} and S_n^{(a)}. Specifically, the approximately 2^n·α sample points
(of Δ) comprising the rejection region of an α-level test based on R_n^{(a)}
or S_n^{(a)} will depend on the value of the index, a, and hence, so will the
small sample power of such a test. The following corollary to Theorem
2.2.2 may be applied to provide a characterization of the effect of the
choice of the index, a, on the statistics in the two classes.


Corollary 2.2.2:

With probability one, and for every a ∈ [0,1],

    n²S_n^{(a)} = Σ_{k=1}^{n-1} [N_k(X) - P_k(X)]²
                  + (1-a)[N_n(X) - P_n(X)]² - na(1-a),         (2.2.13)

and

    n²R_n^{(a)} = Σ_{k=1}^{n-1} [N_k(X^{-1}) - P_k(X^{-1})]²
                  + a[N_n(X^{-1}) - P_n(X^{-1})]² - na(1-a).

Proof:

It will be sufficient to prove only the result for S_n^{(a)}, as the
corresponding result for R_n^{(a)} will then follow directly from Corollary
2.2.1. From (2.2.4),

    n²S_n^{(a)} = Σ_{k=1}^{n} [N_k(X) - P_k(X)]²
        + 2a Σ_{k=1}^{n} [N_k(X) - P_k(X)]δ_k(X) + Σ_{k=1}^{n} [aδ_k(X)]²,   (2.2.14)

and from (1.1.6) and (1.1.7),

    N_k(X) - P_k(X) = (-1) Σ_{j=1}^{k} δ_j(X).                 (2.2.15)

Hence, the second term on the right-hand side of (2.2.14) can be
rewritten as

    -2a Σ_{k=1}^{n} δ_k(X) Σ_{j=1}^{k} δ_j(X)
        = -a Σ_{k=1}^{n} [δ_k(X)]² - a [Σ_{k=1}^{n} δ_k(X)]².

Using (2.2.15) again, this last expression becomes

    -a Σ_{k=1}^{n} [δ_k(X)]² - a[N_n(X) - P_n(X)]².

Hence, (2.2.14) becomes

    n²S_n^{(a)} = Σ_{k=1}^{n} [N_k(X) - P_k(X)]² - a[N_n(X) - P_n(X)]²
        + (a² - a) Σ_{k=1}^{n} [δ_k(X)]²,

which reduces to (2.2.13) in view of the fact that [δ_k(X)]² = 1 with
probability one, for k = 1,2,...,n. This completes the proof of the
theorem.
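Identity (2.2.13) is easy to verify by direct computation, which also guards against sign slips in the algebra above. The sketch below (names are mine) compares the definition (2.2.4) with the reduced form:

```python
def stats_parts(x):
    # delta_k and the running differences N_k - P_k = -(delta_1+...+delta_k)
    d = [1 if v >= 0 else -1 for v in sorted(x, key=abs)]
    np_, s = [], 0
    for dk in d:
        s += dk
        np_.append(-s)
    return np_, d

def n2_S_direct(x, a):
    # n^2 * S_n^{(a)} straight from (2.2.4)
    np_, d = stats_parts(x)
    return sum((m + a * dk) ** 2 for m, dk in zip(np_, d))

def n2_S_reduced(x, a):
    # n^2 * S_n^{(a)} via the reduced form (2.2.13)
    np_, _ = stats_parts(x)
    n = len(x)
    return (sum(m ** 2 for m in np_[:-1])
            + (1 - a) * np_[-1] ** 2 - n * a * (1 - a))

x = [0.5, -2.0, 1.0, -0.25, 4.0]
for a in (0.0, 0.25, 0.5, 1.0):
    assert abs(n2_S_direct(x, a) - n2_S_reduced(x, a)) < 1e-9
```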












An intuitively reasonable criterion for choosing a representa-
tive statistic from each class may be based upon the preceding corollary.
With respect to S_n^{(a)}, (1-a) may be regarded as the weight assigned to
[N_n(X) - P_n(X)]², whereas, for R_n^{(a)}, the corresponding weight assigned
to [N_n(X^{-1}) - P_n(X^{-1})]² is a. It is not clear how one would justify
assigning to [N_n(X) - P_n(X)]² or [N_n(X^{-1}) - P_n(X^{-1})]² a weight
different from that assigned to [N_k(X) - P_k(X)]² or
[N_k(X^{-1}) - P_k(X^{-1})]², k = 1,2,...,n-1, (i.e., 1). Since S_n^{(0)}
and R_n^{(1)} are the two statistics in which the full weight of 1 is
assigned to the nth term, they appear to be logical choices in determining
representative statistics from the two classes. In addition, exact
critical values are available for both S_n^{(0)} and R_n^{(1)}, and these
two statistics are somewhat easier to compute than other members of their
respective classes.


2.3 A Third Class of Cramér-von Mises Type Statistics

Both invariance properties defined in Chapter 1 are desirable
properties for tests of symmetry. In Section 2.2, it was shown that
S_n^{(a)} and R_n^{(a)} possess Invariance Property I for all a ∈ [0,1]. However,
it is easily seen that neither class contains a statistic possessing
Invariance Property II. By Corollary 2.2.1, the assumption that either
R_n^{(1-a)} or S_n^{(a)} possesses this property leads to the contradiction that
S_n^{(a)} is equivalent to R_n^{(1-a)}. Corollary 2.2.1 does suggest, however,
a method of combining statistics of the R_n^{(a)} class with those of the
S_n^{(a)} class to produce statistics possessing both invariance properties,
as shown by the following theorem.











Theorem 2.3.1:

Let

    T_n^{(a)} ≡ T_n^{(a)}(X) = (1/2)[R_n^{(1-a)}(X) + S_n^{(a)}(X)].

Then, for every a ∈ [0,1], T_n^{(a)} possesses Invariance Properties I
and II, and leads to a consistent test of symmetry against any alterna-
tive in Φ - Φ_s.

Proof:

That T_n^{(a)} possesses Invariance Property I is clear, as both
R_n^{(1-a)} and S_n^{(a)} possess the property. Invariance under the transforma-
tion X → X^{-1} follows from Corollary 2.2.1, since, with probability one,

    T_n^{(a)}(X) = (1/2)[S_n^{(a)}(X) + R_n^{(1-a)}(X)]
               = (1/2)[R_n^{(1-a)}(X^{-1}) + S_n^{(a)}(X^{-1})]
               = T_n^{(a)}(X^{-1}).

Finally, the consistency of the test based on T_n^{(a)} follows immediately
from (2.2.12) and the corresponding result for S_n^{(a)}.

For the reasons stated at the end of Section 2.2, T_n^{(0)} appears
to be a logical choice of the "best" statistic from the class
{T_n^{(a)} : 0 ≤ a ≤ 1}. Table 1 provides the critical values for T* = 4n^2 T_n^{(0)}
at selected levels, α, and sample sizes, n = 10(1)24. Because the exact
null distribution is not available in closed form for either R_n^{(a)} or
S_n^{(a)}, the exact distribution of T_n^{(0)} was calculated by means of a computer
enumeration of all possible values of the statistic, one for each
δ in Δ. Since Δ contains 2^n sample points, this method is suitable only



















TABLE 1

EXACT CRITICAL VALUES FOR T* = 4n^2 T_n^{(0)}

n = 10
    x      P(T* ≤ x)
  32.25     0.8945
  33.25     0.9102
  38.25     0.9492
  39.25     0.9570
  44.25     0.9746
  48.25     0.9785
  60.25     0.9863
  62.25     0.9902

n = 11
  33.00     0.8955
  34.50     0.9072
  49.50     0.9443
  50.00     0.9502
  56.50     0.9736
  57.50     0.9775
  81.50     0.9893
  82.00     0.9912

n = 12
  43.00     0.8970
  44.00     0.9087
  53.00     0.9497
  54.00     0.9517
  71.50     0.7961
  73.50     0.7961
  80.50     0.9883
  83.50     0.9912

n = 13
  49.25     0.8982
  51.75     0.9009
  64.75     0.9490
  65.75     0.9504
  77.75     0.9736
  78.75     0.9751
 101.75     0.9883
 104.25     0.9902

n = 14
  54.75     0.8979
  55.75     0.9000
  79.75     0.9497
  80.75     0.9512
  91.75     0.9736
  92.75     0.9751
 125.75     0.9899
 127.72     0.9907

n = 15
  68.00     0.8984
  68.50     0.9001
  84.00     0.9492
  84.50     0.9501
 112.50     0.9743
 113.00     0.9750
 132.50     0.9899
 133.00     0.9902

n = 16
  71.50     0.8988
  72.50     0.9013
  99.50     0.9487
 100.00     0.9509
 120.00     0.9748
 121.00     0.9752
 155.50     0.9897
 156.50     0.9903

n = 17
  83.25     0.8989
  83.74     0.9009
 115.25     0.9499
 115.75     0.9502
 137.75     0.9749
 138.25     0.9753
 184.75     0.9900
 185.25     0.9903

n = 18
  96.25     0.8989
  97.25     0.9008
 121.25     0.9493
 122.25     0.9506
 161.25     0.9746
 162.25     0.9752
 192.25     0.9898
 193.25     0.9907
















TABLE 1 (Continued)

n = 19
    x      P(T* ≤ x)
 101.50     0.8990
 102.00     0.9004
 141.50     0.9495
 142.00     0.9505
 169.50     0.9747
 170.00     0.9750
 220.00     0.9897
 220.50     0.9900

n = 20
 117.00     0.8981
 117.50     0.9003
 152.00     0.9496
 152.50     0.9503
 192.50     0.9748
 193.00     0.9753
 253.00     0.9900
 253.50     0.9902

n = 21
 125.25     0.8998
 126.25     0.9001
 167.75     0.9496
 168.25     0.9504
 218.25     0.9749
 218.75     0.9751
 263.75     0.9899
 264.25     0.9900

n = 22
 137.73     0.8987
 138.75     0.9006
 189.75     0.9499
 190.75     0.9506
 227.75     0.9747
 228.75     0.9751
 296.75     0.9900
 297.75     0.9901

n = 23
 155.75     0.8998
 156.25     0.9003
 199.25     0.9497
 199.75     0.9501
 257.25     0.9748
 257.75     0.9750
 332.25     0.9900
 332.75     0.9901

n = 24
 163.25     0.8996
 163.75     0.9007
 222.75     0.9498
 223.25     0.9505
 280.75     0.9750
 281.25     0.9751
 348.25     0.9900
 348.75     0.9901














for small sample sizes. Also, as noted above, the problem of deriving
the asymptotic null distribution of T_n^{(a)} does not appear to have an easy
solution. However, there is some evidence indicating that the critical
values corresponding to n = 24 will work fairly well for sample sizes
larger than 24. This can be seen from Table 2, which gives the exact
level, α'_m, resulting from the use of the correct α-level critical value
(i.e., the critical value with exact level closest to α) corresponding
to samples of size m, m = 15(1)24, when the sample size is actually 24.
Also given is α'_m - α'_24, where α'_24 is obtained by using the correct α-level
critical value for samples of size 24. As can be seen from Table 2, this
difference is fairly small.



TABLE 2

EXACT LEVELS OF TESTS FOR SAMPLES OF SIZE 24 BASED ON
THE α-LEVEL CRITICAL VALUES OF T_m^{(0)}, m = 15(1)24

             α = .05                   α = .01
  m      α'_m    α'_m - α'_24      α'_m    α'_m - α'_24
 15     0.0549    +0.0047         0.0112    +0.0012
 16     0.0457    -0.0045         0.0097    -0.0003
 17     0.0443    -0.0059         0.0080    -0.0020
 18     0.0542    +0.0040         0.0109    +0.0009
 19     0.0472    -0.0030         0.0095    -0.0005
 20     0.0518    +0.0016         0.0082    -0.0018
 21     0.0518    +0.0016         0.0102    +0.0002
 22     0.0472    -0.0030         0.0093    -0.0007
 23     0.0542    +0.0040         0.0084    -0.0016
 24     0.0502     0.0            0.0100     0.0
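The enumeration scheme used to obtain Table 1 can be sketched in a few lines of code. Under H_0 the sign vector δ is uniformly distributed over the 2^n points of Δ, so an exact null distribution follows by visiting every sign vector. The statistic below (`w_stat`, a sum of squared partial sums of the signs) is a hypothetical stand-in, not the thesis's computing formula for T_n^{(0)}; the enumeration logic is the point.

```python
from itertools import product

def exact_null_distribution(n, stat):
    """Enumerate all 2^n sign vectors delta (equally likely under H0) and
    return the exact null distribution of stat(delta) as {value: probability}."""
    counts = {}
    for delta in product((-1, 1), repeat=n):
        v = stat(delta)
        counts[v] = counts.get(v, 0) + 1
    return {v: c / 2 ** n for v, c in sorted(counts.items())}

def w_stat(delta):
    # Stand-in sign statistic: sum of squared partial sums of the signs.
    s = total = 0
    for d in delta:
        s += d
        total += s * s
    return total

dist = exact_null_distribution(10, w_stat)
```

Cumulating the probabilities of `dist` in increasing order of the statistic yields exactly the kind of bracketing pairs (x, P) reported in Table 1; as the text notes, the 2^n cost confines the method to small n.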

















CHAPTER 3


A TEST OF SYMMETRY BASED ON
THE WATSON STATISTIC


3.1 A Class of Symmetry Statistics
Based on the Watson Statistic


Although T_n^{(0)}, the statistic described in Section 2.3, is
desirable from an invariance point of view, it does not appear that
its properties will be easy to determine analytically. As noted in
the previous chapter, its asymptotic null distribution is not available
at this time. For this reason, an alternative statistic for testing H_0
will be developed and studied in this chapter. This statistic will be
based on ψ_n^2, the two-sample statistic proposed by Watson [20]. A definition
of ψ_n^2 appears in Section 1.1.

The statistic ψ_n^2 and its goodness-of-fit version, ψ_n^{gf} [19],
possess several desirable properties. Both statistics are particularly
useful when dealing with populations having circular distributions.
Stephens [12] compared the goodness-of-fit test based on ψ_n^{gf} with four
other EDF-based goodness-of-fit tests using a Monte Carlo study. This
study showed that the Watson statistic performed best against shifts in
variance. In this chapter several nice properties of the modified version
of ψ_n^2 for testing symmetry will be established.

The procedure used to modify the Cramér-von Mises two-sample
statistic, ω_n^2, can also be used to modify ψ_n^2. Thus, the expression
[F_n(y) - G_n(y)]^2 in ψ_n^2 may be replaced by either [F_n^{(a)}(y) + F_n^{(a)}(-y) - 1]^2
or [F_n^{(a)}(y) + F_n^{(a)}(-y) - 2F_n(0)]^2 to produce two classes, one corresponding
to each of the above substitutions. However, the two resulting
classes are equivalent. To see this, first note that for any constant C,

    ∫_{-∞}^{+∞} C dF_n(y) = C.

Thus, substituting in ψ_n^2 any expression of the form

    F_n^{(a)}(y) + F_n^{(a)}(-y) - C,

where C is constant with respect to the variable of integration, will
result in a statistic independent of the value of C. For the sake of
convenience in proving several of the results which follow, members of
this class of symmetry statistics will be defined with C = 1.
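The constant-invariance argument above is easy to verify numerically. The sketch below (function names are mine; it uses the plain right-continuous EDF and standard normal data) shows that mean-centering under dF_n makes the value of C irrelevant.

```python
import random

def watson_symmetry_stat(x, c):
    """n * integral of [g(y) - mean(g)]^2 dF_n(y), where
    g(y) = F_n(y) + F_n(-y) - c is evaluated at the sample points.
    Centering at the dF_n-mean makes the result independent of c."""
    n = len(x)
    Fn = lambda y: sum(xi <= y for xi in x) / n
    g = [Fn(xi) + Fn(-xi) - c for xi in x]
    gbar = sum(g) / n
    return sum((gi - gbar) ** 2 for gi in g)

random.seed(1)
x = [random.gauss(0, 1) for _ in range(25)]
u0 = watson_symmetry_stat(x, 0.0)
u1 = watson_symmetry_stat(x, 1.0)
```

Changing c shifts every g value by the same constant, which the centering removes, so u0 and u1 agree to rounding error.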


Definition 3.1.1:

For any a ∈ [0,1], the statistic U_n^{(a)} is defined by

    U_n^{(a)} = n ∫_{-∞}^{+∞} [F_n^{(a)}(y) + F_n^{(a)}(-y) - 1
              - ∫_{-∞}^{+∞} (F_n^{(a)}(w) + F_n^{(a)}(-w) - 1) dF_n(w)]^2 dF_n(y),

where F_n^{(a)} is given by Definition 2.1.1.
n
Note that a reasonable test of symmetry is obtained by rejecting
H_0 for "large" values of U_n^{(a)}.

Corresponding to (2.2.3) and (2.2.4), the following theorem
gives two expressions for U_n^{(a)} in terms of the functions N_k(·), P_k(·)
and δ_k(·). These will provide a convenient form for computation of U_n^{(a)}
and will be useful in establishing several properties of U_n^{(a)}.


Theorem 3.1.1:

Let the functions δ_k(·), P_k(·) and N_k(·) be given by (1.1.5),
(1.1.6) and (1.1.7), respectively. Then, with probability one,

    U_n^{(a)}(X) = n^{-2} Σ_{k=1}^{n} [N_k(X) - P_k(X) + aδ_k(X)]^2
                 - n^{-3} {Σ_{k=1}^{n} [N_k(X) - P_k(X) + aδ_k(X)]}^2

               = n^{-2} Σ_{k=1}^{n} [N_k(X^{-1}) - P_k(X^{-1}) + (1-a)δ_k(X^{-1})]^2
                 - n^{-3} {Σ_{k=1}^{n} [N_k(X^{-1}) - P_k(X^{-1}) + (1-a)δ_k(X^{-1})]}^2.   (3.1.1)
k=l

Proof:

Clearly, U_n^{(a)} can be expressed as

    U_n^{(a)} = Σ_{j=1}^{n} [F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 1]^2
              - n^{-1} {Σ_{j=1}^{n} [F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 1]}^2

            = Σ_{j=1}^{n} [F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 2F_n(0)]^2
              - n^{-1} {Σ_{j=1}^{n} [F_n^{(a)}(X_j) + F_n^{(a)}(-X_j) - 2F_n(0)]}^2.

The theorem then follows from the two equalities, (2.2.5) and (2.2.9),
established in proving Theorem 2.2.2.


3.2 Properties of U_n^{(a)}

Due to the close relationship between ψ_n^2 and ω_n^2, it is not
surprising that the properties of U_n^{(a)} are similar to those of R_n^{(a)} and
S_n^{(a)} (see Chapter 2). For example, as a consequence of Theorem 3.1.1,
U_n^{(a)} depends on X through δ(X) alone, and, hence, U_n^{(a)} is distribution
free (see remarks after Lemma 2.2.1). The next theorem, analogous to
Theorem 2.2.1, establishes Invariance Property I for U_n^{(a)}.
n


Theorem 3.2.1:

For any a ∈ [0,1],

    U_n^{(a)}(X) = U_n^{(a)}(-X) w.p. 1.   (3.2.1)

Proof:

From Definition 1.1.2, it is easily seen that, with probability one,

    N_k(X) = P_k(-X),
and
    δ_k(X) = -δ_k(-X).

Hence, (3.2.1) follows directly from (3.1.1), and the proof is complete.
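Invariance Property I can also be illustrated numerically. The sketch below uses the "mid" version of the EDF as a stand-in for the a = 1/2 member of the class (these are my own conventions, not a reproduction of Definition 2.1.1): negating every observation leaves the statistic unchanged.

```python
import random

def u_half(x):
    """A U_n^(1/2)-type statistic computed with the 'mid' EDF,
    F(y) = (#{x_i < y} + #{x_i <= y}) / (2n); a sketch of the a = 1/2 case."""
    n = len(x)
    def fmid(y):
        return (sum(xi < y for xi in x) + sum(xi <= y for xi in x)) / (2 * n)
    g = [fmid(xi) + fmid(-xi) - 1 for xi in x]
    gbar = sum(g) / n
    return sum((gi - gbar) ** 2 for gi in g)

random.seed(2)
x = [random.gauss(0.3, 1) for _ in range(30)]
u_pos = u_half(x)
u_neg = u_half([-xi for xi in x])
```

With the mid EDF, each centered summand merely changes sign under x → -x, so the squared sum, and hence the statistic, is identical for the two samples.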

The following theorem is a direct consequence of Theorem 3.1.1

and is stated without proof.













Theorem 3.2.2:

For every a ∈ [0,1],

    U_n^{(a)}(X) = U_n^{(1-a)}(X^{-1}) w.p. 1,

and, in particular,

    U_n^{(1/2)}(X) = U_n^{(1/2)}(X^{-1}) w.p. 1.

Thus, U_n^{(1/2)} possesses both Invariance Properties I and II, and,
therefore, is the logical choice as a "best" statistic from among those
in the class {U_n^{(a)} : 0 ≤ a ≤ 1}. The next theorem shows that all statistics
in this class are asymptotically equivalent in the sense that
the difference between any two statistics converges to zero with
probability one.


Theorem 3.2.3:

Under H_0, U_n^{(a)} and U_n^{(b)} have the same asymptotic distribution
for any a, b ∈ [0,1]. In fact,

    |U_n^{(a)} - U_n^{(b)}| → 0 w.p. 1 as n → ∞.

Proof:

For any a ∈ [0,1], let

    A_n^{(a)}(y) = F_n^{(a)}(y) + F_n^{(a)}(-y) - 1
                 - ∫_{-∞}^{+∞} [F_n^{(a)}(w) + F_n^{(a)}(-w) - 1] dF_n(w).

Then, for any b ∈ [0,1],

    U_n^{(a)} = n ∫_{-∞}^{+∞} [A_n^{(a)}(y)]^2 dF_n(y)
             = n ∫_{-∞}^{+∞} [(A_n^{(a)}(y) - A_n^{(b)}(y)) + A_n^{(b)}(y)]^2 dF_n(y)
             = n ∫_{-∞}^{+∞} [A_n^{(a)}(y) - A_n^{(b)}(y)]^2 dF_n(y) + U_n^{(b)}
               + 2n ∫_{-∞}^{+∞} [A_n^{(a)}(y) - A_n^{(b)}(y)] A_n^{(b)}(y) dF_n(y).

Hence,

    |U_n^{(a)} - U_n^{(b)}| ≤ n [sup_y |A_n^{(a)}(y) - A_n^{(b)}(y)|]^2
      + 2n sup_y |A_n^{(a)}(y) - A_n^{(b)}(y)| · sup_y |A_n^{(b)}(y)|.   (3.2.2)

Now,

    sup_y |A_n^{(a)}(y) - A_n^{(b)}(y)|
      ≤ 2 [sup_y |F_n^{(a)}(y) - F_n^{(b)}(y)| + sup_y |F_n^{(a)}(-y) - F_n^{(b)}(-y)|].

Since

    sup_y |F_n^{(a)}(y) - F_n^{(b)}(y)| ≤ n^{-1},

it follows that

    sup_y |A_n^{(a)}(y) - A_n^{(b)}(y)| ≤ 4n^{-1}.

Therefore, from (3.2.2),

    |U_n^{(a)} - U_n^{(b)}| ≤ 16n^{-1} + 2n(4n^{-1}) sup_y |A_n^{(b)}(y)|
                           = o(1) + 8 sup_y |A_n^{(b)}(y)|.

Thus the proof will be complete if it can be shown that sup_y |A_n^{(b)}(y)|
converges to zero with probability one as n tends to infinity. But
this follows from Corollary 2.2.1, since

    sup_y |A_n^{(b)}(y)| ≤ 2 sup_y |F_n^{(b)}(y) + F_n^{(b)}(-y) - 1|.

Note that U_n^{(1/2)} may also be regarded as a member of the class
{(1/2)[U_n^{(a)} + U_n^{(1-a)}] : 0 ≤ a ≤ 1}, all of whose members possess both
invariance properties. However, since

    |(1/2)[U_n^{(a)} + U_n^{(1-a)}] - U_n^{(b)}|
      ≤ (1/2)|U_n^{(a)} - U_n^{(b)}| + (1/2)|U_n^{(1-a)} - U_n^{(b)}|,

Theorem 3.2.3 implies that, under H_0, statistics in this class are
asymptotically equivalent to any statistic in the U_n^{(a)} class. For this
reason, the class {(1/2)[U_n^{(a)} + U_n^{(1-a)}] : 0 ≤ a ≤ 1} will not be given further
consideration.

The final theorem presented in this section establishes the
consistency of tests of symmetry based on U_n^{(a)}.

Theorem 3.2.4:

For every a ∈ [0,1], U_n^{(a)} is consistent against any alternative
in D_s.











Proof:

Observe that, from (1.1.1) and the continuity of F,

    Γ = ∫_{-∞}^{+∞} [F(y) + F(-y) - 1 - ∫_{-∞}^{+∞} (F(w) + F(-w) - 1) dF(w)]^2 dF(y)

is positive if and only if F ∈ D_s. If it can be shown that for any
a ∈ [0,1],

    lim_{n→∞} n^{-1} U_n^{(a)} = Γ w.p. 1,   (3.2.3)

it will then follow that, with probability one, U_n^{(a)} tends to infinity
with n if and only if F ∈ D_s, which will imply the consistency of
tests based on U_n^{(a)}. Thus it is only necessary to establish (3.2.3).

Let H_n(y) = F_n^{(a)}(y) - F(y). Then,

    n^{-1} U_n^{(a)} = ∫_{-∞}^{+∞} [H_n(y) + H_n(-y) + (F(y) + F(-y) - 1)]^2 dF_n(y)
      - {∫_{-∞}^{+∞} [H_n(w) + H_n(-w) + (F(w) + F(-w) - 1)] dF_n(w)}^2.   (3.2.4)

Upon expanding the integrand and rearranging terms, the first integral
in (3.2.4) becomes

    ∫_{-∞}^{+∞} [H_n^2(y) + H_n^2(-y) + 2(H_n(y) + H_n(-y))(F(y) + F(-y) - 1)
      + 2 H_n(y) H_n(-y)] dF_n(y) + ∫_{-∞}^{+∞} [F(y) + F(-y) - 1]^2 dF_n(y).   (3.2.5)

The two integrals in (3.2.5) will now be considered individually. The
absolute value of the first integral is bounded above by

    [sup_y |H_n(y)|]^2 + [sup_y |H_n(-y)|]^2 + 2 [sup_y |H_n(y)| + sup_y |H_n(-y)|]
      × sup_y |F(y) + F(-y) - 1| + 2 sup_y |H_n(y)| · sup_y |H_n(-y)|.   (3.2.6)

From Lemma 2.1.1 and the fact that

    sup_y |F(y) + F(-y) - 1| ≤ 1,

it follows that (3.2.6), and hence the first integral in (3.2.5), converges
to zero with probability one as n tends to infinity. The second
integral in (3.2.5) can be expressed as

    n^{-1} Σ_{i=1}^{n} [F(X_i) + F(-X_i) - 1]^2.   (3.2.7)

Since

    ∫_{-∞}^{+∞} [F(y) + F(-y) - 1]^2 dF(y) ≤ 1,

the strong law of large numbers implies that (3.2.7) converges with
probability one to

    E[(F(X) + F(-X) - 1)^2] = ∫_{-∞}^{+∞} [F(y) + F(-y) - 1]^2 dF(y).   (3.2.8)

Thus the expression given by (3.2.5), as n tends to +∞, converges with
probability one to the right-hand side of (3.2.8). A similar argument
shows that

    ∫_{-∞}^{+∞} [H_n(w) + H_n(-w) + (F(w) + F(-w) - 1)] dF_n(w)

converges with probability one to

    ∫_{-∞}^{+∞} [F(w) + F(-w) - 1] dF(w).

Hence, the second integral on the right-hand side of (3.2.4) converges
with probability one to

    {∫_{-∞}^{+∞} [F(w) + F(-w) - 1] dF(w)}^2.

This establishes (3.2.3) and completes the proof of the theorem.
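The limit (3.2.3) can be sketched numerically. For a standard exponential alternative (all mass on the positive axis), g(y) = F(y) + F(-y) - 1 reduces to F(y) - 1 on the support, and Γ works out to Var(F(X)) = 1/12. The helper below is my own construction using the plain EDF.

```python
import bisect
import random

def u_over_n(x):
    """(1/n) * U_n computed from the plain EDF: the sample variance of
    g_j = F_n(x_j) + F_n(-x_j) - 1 over the sample points."""
    n = len(x)
    xs = sorted(x)
    g = [(bisect.bisect_right(xs, xi) + bisect.bisect_right(xs, -xi)) / n - 1
         for xi in x]
    gbar = sum(g) / n
    return sum((gi - gbar) ** 2 for gi in g) / n

random.seed(3)
x = [random.expovariate(1.0) for _ in range(20000)]
gamma_hat = u_over_n(x)   # should be close to Gamma = 1/12 for Exp(1)
```

Since every observation is positive, g_j is simply (rank of x_j)/n - 1, whose variance tends to 1/12; the statistic therefore grows linearly in n under this alternative, which is the content of the consistency proof.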


3.3 The Asymptotic Null Distribution of U_n^{(a)}

In this section, the common asymptotic null distribution of the
statistics in the class {U_n^{(a)} : 0 ≤ a ≤ 1} will be derived by considering
a related random variable, Ũ_n, defined as

    Ũ_n = n ∫_{-∞}^{+∞} [F_n(y) + F_n(-y) - 1
         - ∫_{-∞}^{+∞} (F_n(w) + F_n(-w) - 1) dF(w)]^2 dF(y).

Note that, because of the continuity of F, Ũ_n is independent of
the version of the EDF (F_n or F_n^{(a)}, 0 ≤ a ≤ 1) appearing in the integrand.
Theorem 3.3.1 below shows that Ũ_n and U_n^{(a)} have the same asymptotic null
distribution. The proof is based on the corresponding proof given by
Watson [20] for ψ_n^2. Before stating this theorem, two lemmas necessary
for its proof will be established.


Lemma 3.3.1:

Let F_n denote the EDF based on a sample of size n from F ∈ Φ_s,
and let

    V_n = √n ∫_{-∞}^{+∞} [F_n(y) + F_n(-y) - 1] dF(y).   (3.3.1)

Then, as n tends to infinity, V_n converges in law to a normal random
variable with mean zero and variance 1/3.

Proof:

Clearly,

    V_n = n^{-1/2} ∫_{-∞}^{+∞} {Σ_{i=1}^{n} I[X_i ≤ y] + Σ_{i=1}^{n} I[X_i ≤ -y] - n} dF(y)
        = n^{-1/2} {Σ_{i=1}^{n} [∫_{X_i}^{+∞} dF(y) + ∫_{-∞}^{-X_i} dF(y)] - n}
        = n^{-1/2} {Σ_{i=1}^{n} [1 - F(X_i) + F(-X_i)] - n}.

Since F(-X_i) = 1 - F(X_i) if F ∈ Φ_s, the last expression above equals

    n^{-1/2} Σ_{i=1}^{n} [1 - 2F(X_i)].   (3.3.2)

Now, since {F(X_1), F(X_2), ..., F(X_n)} is distributed as a random sample
from a uniform distribution on [0,1], each summand in (3.3.2) has mean
zero and variance 1/3, and the lemma follows from (3.3.2)
and the Central Limit Theorem. The proof is therefore complete.
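The limiting mean and variance in Lemma 3.3.1 are easy to confirm by simulation, since (3.3.2) makes V_n an exact mean-zero, variance-1/3 average for every n. A minimal sketch (variable names are mine):

```python
import random

def v_n(u):
    """V_n = n^(-1/2) * sum(1 - 2*F(X_i)); with F(X_i) ~ Uniform(0,1),
    each summand has mean 0 and variance 1/3, so Var(V_n) = 1/3 exactly."""
    return sum(1 - 2 * ui for ui in u) / len(u) ** 0.5

random.seed(4)
reps, n = 4000, 50
vs = [v_n([random.random() for _ in range(n)]) for _ in range(reps)]
v_mean = sum(vs) / reps
v_var = sum((v - v_mean) ** 2 for v in vs) / (reps - 1)
```

The empirical mean and variance over many replications should sit near 0 and 1/3, matching the normal limit N(0, 1/3).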

The second lemma needed for the proof of Theorem 3.3.1 is now
established.

Lemma 3.3.2:

Let

    C_n = V_n - √n ∫_{-∞}^{+∞} [F_n(y) + F_n(-y) - 1] dF_n(y),   (3.3.3)

where V_n is given by (3.3.1). Then, for every F ∈ Φ_s,

    E(C_n) = 0,   (3.3.4)
and
    Var(C_n) → 0 as n → ∞.   (3.3.5)
n
Proof:

Note that the second term on the right-hand side of (3.3.3) is
equal to V_n with dF(y) replaced by dF_n(y). Using (3.3.2), with probability
one,

    C_n = √n [1 - 2n^{-1} Σ_{i=1}^{n} F(X_i)]
        - n^{-1/2} Σ_{i=1}^{n} [F_n(X_i) + n^{-1} Σ_{j=1}^{n} I[X_i + X_j ≤ 0] - 1].

Since, with probability one,

    Σ_{i=1}^{n} F_n(X_i) = Σ_{i=1}^{n} (i - 1/2) n^{-1} = n/2,

it follows that

    C_n = √n [3/2 - 2n^{-1} Σ_{i=1}^{n} F(X_i)
        - n^{-2} Σ_{i=1}^{n} Σ_{j=1}^{n} I[X_i + X_j ≤ 0]].   (3.3.6)

Therefore,

    E(C_n) = √n [3/2 - 2n^{-1} Σ_{i=1}^{n} E(F(X_i))
           - n^{-2} Σ_{i=1}^{n} Σ_{j=1}^{n} E(I[X_i + X_j ≤ 0])].   (3.3.7)

Since F(X_i) has a uniform distribution on the unit interval,

    E(F(X_i)) = 1/2, i = 1, 2, ..., n.   (3.3.8)

Using (2.2.10), it can be easily shown that if F ∈ Φ_s, (X_i + X_j) is
distributed symmetrically about zero. Thus, for i, j = 1, 2, ..., n,

    E(I[X_i + X_j ≤ 0]) = P(X_i + X_j ≤ 0) = 1/2.   (3.3.9)

Substituting (3.3.8) and (3.3.9) in (3.3.7) yields

    E(C_n) = √n [3/2 - 2n^{-1}(n/2) - n^{-2}(n^2/2)] = 0,

thus establishing (3.3.4).
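The zero-mean property (3.3.4) can be checked by simulation. The sketch below hard-codes my own reading of the computing form (3.3.6) for F uniform on (-1, 1), i.e. F(t) = (t + 1)/2, and counts the pairs (i, j), including i = j, with X_i + X_j ≤ 0.

```python
import bisect
import random

def c_n(x):
    """C_n per (3.3.6) with F uniform on (-1, 1):
    C_n = sqrt(n) * [3/2 - (2/n) sum F(X_i) - n^(-2) #{(i,j): X_i + X_j <= 0}]."""
    n = len(x)
    xs = sorted(x)
    pairs = sum(bisect.bisect_right(xs, -xi) for xi in x)  # counts i = j too
    f_sum = sum((xi + 1) / 2 for xi in x)
    return n ** 0.5 * (1.5 - 2 * f_sum / n - pairs / n ** 2)

random.seed(5)
reps, n = 2000, 100
cs = [c_n([random.uniform(-1, 1) for _ in range(n)]) for _ in range(reps)]
c_mean = sum(cs) / reps
c_var = sum((c - c_mean) ** 2 for c in cs) / (reps - 1)
```

Over many replications the empirical mean of C_n stays near zero, and its variance is already small at n = 100, consistent with (3.3.4) and (3.3.5).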

Now, consider the variance of C_n. Squaring the expression given
by (3.3.6) gives

    C_n^2 = n {9/4 + 4n^{-2} [Σ_i F(X_i)]^2 + n^{-4} [Σ_i Σ_j I[X_i + X_j ≤ 0]]^2
          - 6n^{-1} Σ_i F(X_i) - 3n^{-2} Σ_i Σ_j I[X_i + X_j ≤ 0]
          + 4n^{-3} Σ_i F(X_i) · Σ_j Σ_k I[X_j + X_k ≤ 0]}.

Taking expectations on both sides and using (3.3.8) and (3.3.9) results
in the following expression for the variance of C_n:

    Var(C_n) = -(9/4)n + 4n^{-1} E{[Σ_{i=1}^{n} F(X_i)]^2}
             + n^{-3} E{[Σ_{i=1}^{n} Σ_{j=1}^{n} I[X_i + X_j ≤ 0]]^2}
             + 4n^{-2} E{Σ_{i=1}^{n} F(X_i) · Σ_{j=1}^{n} Σ_{k=1}^{n} I[X_j + X_k ≤ 0]}.   (3.3.10)

The three expectations in (3.3.10) are now considered individually.

As noted earlier, {F(X_1), F(X_2), ..., F(X_n)} is distributed as a
random sample from a uniform distribution on the unit interval. Therefore,
the first expectation in (3.3.10) is given by

    E{[Σ_i F(X_i)]^2} = Var(Σ_i F(X_i)) + [E(Σ_i F(X_i))]^2
                      = n/12 + n^2/4.   (3.3.11)

To evaluate the second expectation in (3.3.10), let α_ij denote
I[X_i + X_j ≤ 0]. Since α_ij = α_ji for every pair of integers i and j, and
since the terms within a particular group all have identical expectations,
the square of the double sum becomes, for n ≥ 4,

    E{[Σ_i Σ_j α_ij]^2} = n E(α_ii^2) + 2n(n-1) E(α_ij^2) + n(n-1) E(α_ii α_jj)
      + 4n(n-1) E(α_ii α_ij) + 2n(n-1)(n-2) E(α_ii α_jk)
      + 4n(n-1)(n-2) E(α_ij α_ik) + n(n-1)(n-2)(n-3) E(α_ij α_kl),   (3.3.12)

where i, j, k and l denote four distinct positive integers less than or
equal to n. Now, the expectations in (3.3.12) may be evaluated as
follows. First, since α_ij^2 = α_ij and α_ii^2 = α_ii, (3.3.9) implies that

    E(α_ij^2) = E(α_ij) = E(α_ii^2) = E(α_ii) = 1/2.   (3.3.13)

Next, since α_ii is independent of α_jj and of α_jk, and α_ij is independent
of α_kl, it follows that

    E(α_ii α_jj) = E(α_ii α_jk) = E(α_ij α_kl) = 1/4.   (3.3.14)

Finally,

    E(α_ij α_ik) = P(X_i + X_j ≤ 0 and X_i + X_k ≤ 0)
                 = ∫∫∫_{{y_i+y_j ≤ 0} ∩ {y_i+y_k ≤ 0}} dF(y_k) dF(y_j) dF(y_i)
                 = ∫_0^1 ∫_0^{1-s} ∫_0^{1-s} dt dw ds = 1/3,   (3.3.15)

and, similarly,

    E(α_ii α_ij) = 3/8.   (3.3.16)

Substituting (3.3.13), (3.3.14), (3.3.15) and (3.3.16) in (3.3.12),

    E{[Σ_i Σ_j α_ij]^2} = n(1/2) + 2n(n-1)(1/2) + n(n-1)(1/4) + 4n(n-1)(3/8)
      + 2n(n-1)(n-2)(1/4) + 4n(n-1)(n-2)(1/3) + n(n-1)(n-2)(n-3)(1/4)
      = (1/4)n^4 + (1/3)n^3 + o(n^3).   (3.3.17)

In a similar manner, the third expectation in (3.3.10) can be shown to
be equal to

    E{Σ_i F(X_i) Σ_j Σ_k I[X_j + X_k ≤ 0]} = (1/4)n^3 - (1/6)n^2 + o(n^2).   (3.3.18)

Substituting (3.3.11), (3.3.17) and (3.3.18) in (3.3.10) yields

    Var(C_n) = -(9/4)n + 4n^{-1}[n/12 + n^2/4] + n^{-3}[(1/4)n^4 + (1/3)n^3 + o(n^3)]
             + 4n^{-2}[(1/4)n^3 - (1/6)n^2 + o(n^2)]
             = n(-9/4 + 1 + 1/4 + 1) + (1/3 + 1/3 - 2/3) + o(1) = o(1),

which establishes (3.3.5) and completes the proof of the lemma.

Theorem 3.3.1 may now be given.

Theorem 3.3.1:

For any F ∈ Φ_s, (Ũ_n - U_n^{(a)}) converges to zero in probability as
n tends to infinity.

Proof:

Expanding the integrands and rearranging terms yields

    Ũ_n - U_n^{(a)} = {n ∫_{-∞}^{+∞} [F_n(y) + F_n(-y) - 1]^2 dF(y)
                      - n ∫_{-∞}^{+∞} [F_n^{(a)}(y) + F_n^{(a)}(-y) - 1]^2 dF_n(y)}
                    - {n (∫_{-∞}^{+∞} [F_n(w) + F_n(-w) - 1] dF(w))^2
                      - n (∫_{-∞}^{+∞} [F_n(w) + F_n(-w) - 1] dF_n(w))^2}.   (3.3.19)

The fact that the first term on the right-hand side of (3.3.19) converges
to zero in probability has been established by Rothman and
Woodroofe [15]. Using (3.3.1) and (3.3.3), the second term can be
written as

    V_n^2 - (V_n - C_n)^2 = C_n(2V_n - C_n).

Hence, by Lemmas 3.3.1 and 3.3.2, the second term on the right-hand
side of (3.3.19) converges to zero in probability as n tends to infinity,
and the proof is complete.

The asymptotic null distribution of Ũ_n may now be derived.
It will be necessary at this time to consider some preliminary results.
Proofs of Theorems 3.3.2 and 3.3.3 and Lemma 3.3.3 may be found in the
references indicated, and these results will be stated without proof.
Proofs of the remaining preliminary results, Lemmas 3.3.4,
3.3.5, 3.3.6 and 3.3.7, will be given.

Theorem 3.3.2 (Prabhu [13], p. 28):

Let y(t) be a stochastic process on [0,1], with

    E(|y(t)|^2) < +∞

for every t in [0,1], and let K(s,t) denote the covariance kernel of the
process y(t). Then y(t) is Riemann integrable if and only if the
integral

    ∫_0^1 ∫_0^1 K(s,t) ds dt

exists. If the above integral does exist, then

    E(∫_0^1 y(t) dt) = ∫_0^1 E(y(t)) dt,
and
    E(∫_0^1 ∫_0^1 y(s) y(t) ds dt) = ∫_0^1 ∫_0^1 K(s,t) ds dt.

The following definitions will be needed in several of the
results below. Let D denote the metric space of functions on [0,1]
that are right-continuous and have left-hand limits, and let Λ denote
the class of strictly increasing, continuous mappings of [0,1] onto
itself. The Skorokhod metric, d, may be defined as follows ([2], p. 111):
for x, y ∈ D,

    d(x,y) = inf{ε > 0 : there exists λ ∈ Λ such that sup_{0≤t≤1} |t - λ(t)| ≤ ε
             and sup_{0≤t≤1} |x(t) - y(λ(t))| ≤ ε}.

Theorem 3.3.3 may now be stated.


Theorem 3.3.3 (Billingsley [2], p. 30):

If the sequence of stochastic processes {y_n(t); n = 1, 2, ...} in D
converges in law to the process y(t), and if g(·) is any measurable functional
on D which is continuous in metric d almost everywhere (with
respect to the probability measure associated with y(t)), then, as n tends
to infinity, g(y_n(t)) converges in law to g(y(t)).


Lemma 3.3.3 (Durbin [7], p. 31):

The functional on D defined by

    z(x(·)) = ∫_0^1 x(t) dt

is continuous in the Skorokhod metric, d.


Lemma 3.3.4:

Let Z(t) be a Riemann integrable Gaussian process on [0,1].
Then ∫_0^1 Z(t) dt is a normal random variable.

Proof:

Let

    m(t) = E(Z(t)),
    K(s,t) = Cov(Z(s), Z(t)),

and

    W_n = Σ_{j=1}^{n} [t_j(n) - t_{j-1}(n)] Z(t_j(n)),

where {0 = t_0(n) < t_1(n) < ... < t_n(n) = 1} is a sequence
of partitions of [0,1] such that max_j |t_j(n) - t_{j-1}(n)| converges to zero
as n tends to infinity. Clearly, for each n, W_n is a normal random
variable with

    E(W_n) = Σ_{j=1}^{n} [t_j(n) - t_{j-1}(n)] m(t_j(n)),
and
    Var(W_n) = Σ_{k=1}^{n} Σ_{j=1}^{n} [t_k(n) - t_{k-1}(n)][t_j(n) - t_{j-1}(n)] K(t_k(n), t_j(n)).

By the definition of the Riemann integral, as n tends to infinity,

    E(W_n) → M = ∫_0^1 m(t) dt,
and
    Var(W_n) → σ^2 = ∫_0^1 ∫_0^1 K(s,t) ds dt.

This implies that the characteristic function of W_n converges to the
characteristic function of a normal random variable with mean M and
variance σ^2, and hence, that W_n converges in law to such a normal
random variable. Since ∫_0^1 Z(t) dt is the limit (in mean square) of
W_n as n tends to infinity, the proof is complete.
n

Lemma 3.3.5:

Let G_n denote the EDF corresponding to a random sample of size
n from a uniform distribution on [0,1]. If the processes Y*_n(t) and
Q*_n(t) are defined by

    Y*_n(t) = √n [G_n((1+t)/2) - (1+t)/2], -1 ≤ t ≤ 1,
and
    Q*_n(t) = Y*_n(t) + Y*_n(-t) - ∫_0^1 [Y*_n(w) + Y*_n(-w)] dw, 0 ≤ t ≤ 1,

then Q*_n(t) converges in law to a Gaussian process on [0,1].

Proof:

It is a well-known result (see [2], p. 141) that the process
Y_n(t) defined by

    Y_n(t) = √n (G_n(t) - t)

converges in law to a Brownian bridge process, Y(t), on [0,1]. Clearly,
Y*_n is the process obtained from Y_n by rescaling to the interval [-1,1].
Hence Y*_n converges in law to Y*, where Y* is the process on [-1,1]
defined by

    Y*(t) = Y((1+t)/2).

Since Y is a Brownian bridge process, Y* is a Gaussian process on [-1,1]
with

    E(Y*(t)) = E(Y((1+t)/2)) = 0, -1 ≤ t ≤ 1,   (3.3.20)

and, for -1 ≤ s ≤ t ≤ 1,

    K*(s,t) = E(Y*(s)Y*(t)) = E(Y((1+s)/2) Y((1+t)/2))
            = [(1+s)/2][1 - (1+t)/2]
            = (1/4)(1+s)(1-t).   (3.3.21)

Now, by Lemma 3.3.3 and Theorem 3.3.3, Q*_n(t) converges in law to Q(t),
where

    Q(t) = Y*(t) + Y*(-t) - ∫_0^1 [Y*(w) + Y*(-w)] dw, 0 ≤ t ≤ 1.   (3.3.22)

In view of Lemma 3.3.4, it follows that Q(t) is a Gaussian process,
which proves the lemma.


Lemma 3.3.6:

Let Q(t) denote the Gaussian process given by (3.3.22). Then

    Cov(Q(s),Q(t)) = -t + (1/2)(s^2 + t^2) + 1/3, 0 ≤ s ≤ t ≤ 1.   (3.3.23)

Proof:

By Theorem 3.3.2 and (3.3.20),

    E(Q(t)) = E(Y*(t)) + E(Y*(-t)) - ∫_0^1 [E(Y*(w)) + E(Y*(-w))] dw = 0

for t ∈ [0,1]. Hence, for 0 ≤ s ≤ t ≤ 1,

    Cov(Q(s),Q(t)) = E{[Y*(s) + Y*(-s) - ∫_0^1 (Y*(w) + Y*(-w)) dw]
                   × [Y*(t) + Y*(-t) - ∫_0^1 (Y*(w) + Y*(-w)) dw]}
      = E(Y*(s)Y*(t)) + E(Y*(s)Y*(-t)) + E(Y*(-s)Y*(t)) + E(Y*(-s)Y*(-t))
        + E{[∫_0^1 (Y*(w) + Y*(-w)) dw]^2}
        - E{[Y*(s) + Y*(-s) + Y*(t) + Y*(-t)] ∫_0^1 (Y*(w) + Y*(-w)) dw}.

Letting

    J(t) = E[Y*(t) ∫_0^1 (Y*(w) + Y*(-w)) dw]

and using Theorem 3.3.2, the last expression becomes

    K*(s,t) + K*(-s,t) + K*(s,-t) + K*(-s,-t)
      - J(s) - J(-s) - J(t) - J(-t) + ∫_0^1 [J(w) + J(-w)] dw.   (3.3.24)

From (3.3.21),

    J(t) = ∫_0^1 [K*(t,w) + K*(t,-w)] dw
         = (1/4) [∫_0^t (1+w)(1-t) dw + ∫_t^1 (1+t)(1-w) dw
           + ∫_0^1 (1-w)(1-t) dw]
         = (1/4)(1 - t^2),   (3.3.25)

for t ∈ [0,1]. Similarly,

    J(-t) = (1/4)(1 - t^2).   (3.3.26)

Thus,

    ∫_0^1 [J(w) + J(-w)] dw = ∫_0^1 (1/2)(1 - w^2) dw = 1/3.   (3.3.27)

Finally, substituting (3.3.21), (3.3.25), (3.3.26) and (3.3.27) in
(3.3.24) gives

    Cov(Q(s),Q(t)) = -t + (1/2)(s^2 + t^2) + 1/3, 0 ≤ s ≤ t ≤ 1,

as was to be shown.
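The kernel (3.3.23) can be checked by Monte Carlo, because for a Uniform(-1,1) sample the finite-n covariance of the process Q_n (used below in the proof of Theorem 3.3.4) already coincides with the limiting kernel. The helper names and the simplification of the centering integral to -X̄ are my own.

```python
import random

def q_n(x, t):
    """Q_n(t) = sqrt(n) [F_n(t) + F_n(-t) - 1 + Xbar] for a Uniform(-1,1)
    sample; Xbar equals minus the centering dF-integral in the definition."""
    n = len(x)
    Fn = lambda y: sum(xi <= y for xi in x) / n
    return n ** 0.5 * (Fn(t) + Fn(-t) - 1 + sum(x) / n)

def k(s, t):
    # Claimed covariance kernel (3.3.23), for 0 <= s <= t <= 1.
    return -t + 0.5 * (s * s + t * t) + 1.0 / 3.0

random.seed(6)
reps, n, s, t = 20000, 40, 0.3, 0.6
qs, qt = [], []
for _ in range(reps):
    x = [random.uniform(-1, 1) for _ in range(n)]
    qs.append(q_n(x, s))
    qt.append(q_n(x, t))
ms, mt = sum(qs) / reps, sum(qt) / reps
cov_st = sum((a - ms) * (b - mt) for a, b in zip(qs, qt)) / (reps - 1)
```

The empirical covariance at (s, t) = (0.3, 0.6) should sit near k(0.3, 0.6) = -1/24, up to Monte Carlo noise.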


Lemma 3.3.7:

The functional z: D → D defined by

    z(x(t)) = x^2(t)

is continuous in the Skorokhod metric, d.

Proof:

Let x(·) ∈ D be fixed. It must be shown that, given
ε > 0, there exists an ε' > 0 such that for any y ∈ D, d(x,y) < ε'
implies that d(x^2, y^2) ≤ ε. Let ε > 0 be fixed, let

    B = sup_{0≤t≤1} |x(t)|,

and take ε' = ε(2B + ε)^{-1}. If d(x,y) < ε', then, by the definition of
d given just before Theorem 3.3.3, there exists a λ_0 ∈ Λ such that

    sup_{0≤t≤1} |t - λ_0(t)| ≤ ε(2B + ε)^{-1} ≤ ε,

and

    sup_{0≤t≤1} |x(t) - y(λ_0(t))| ≤ ε(2B + ε)^{-1}.

Since

    sup_{0≤t≤1} |x(t) + y(λ_0(t))| ≤ 2 sup_{0≤t≤1} |x(t)| + ε ≤ 2B + ε,

it follows that

    sup_{0≤t≤1} |x^2(t) - y^2(λ_0(t))| ≤ ε.

In view of the definition of d, this implies that d(x^2, y^2) ≤ ε, which
completes the proof of the lemma.

Theorem 3.3.4, giving the asymptotic null distribution of Ũ_n,
may now be stated and proved.

Theorem 3.3.4:

Let U be defined by

    U = ∫_0^1 Q^2(t) dt,

where Q(t) is a Gaussian process on [0,1] with

    E(Q(t)) = 0
and
    Cov(Q(s),Q(t)) = K(s,t) = -t + (1/2)(s^2 + t^2) + 1/3, 0 ≤ s ≤ t ≤ 1.

Then, for every F ∈ Φ_s, Ũ_n converges in law to U as n tends to infinity.

Proof:

Since Ũ_n is distribution free, without loss of generality
F may be assumed to be the CDF of a uniform random variable on the
interval [-1,1], that is,

    F(t) = 0 if t < -1,
    F(t) = (1/2)(t+1) if -1 ≤ t ≤ 1,
    F(t) = 1 if t > 1.

Now, let

    Q_n(t) = √n {F_n(t) + F_n(-t) - 1
           - (1/2) ∫_{-1}^{1} [F_n(w) + F_n(-w) - 1] dw}, -1 ≤ t ≤ 1.

Then,

    Q_n(t) = Q_n(-t),
and
    Ũ_n = (1/2) ∫_{-1}^{1} Q_n^2(t) dt = ∫_0^1 Q_n^2(t) dt.   (3.3.28)

Since (1+X)/2 has a uniform distribution on [0,1] if X has a uniform
distribution on [-1,1], it follows from Lemmas 3.3.5 and 3.3.6 that
Q_n(t) converges in law to a Gaussian process, Q(t), on [0,1] with mean
zero and covariance kernel given by (3.3.23). Hence, by Theorem 3.3.3
and Lemmas 3.3.3 and 3.3.7, as n tends to infinity,

    ∫_0^1 Q_n^2(t) dt → ∫_0^1 Q^2(t) dt in law,

and, in view of (3.3.28), the proof is complete.
It has thus been shown that Ũ_n, and hence U_n^{(a)} for any a ∈ [0,1],
converges in law to U = ∫_0^1 Q^2(t) dt, where Q is a Gaussian process. The
problem now is to obtain the distribution of the random variable U.
In what follows, it will be shown that the distribution of U is the
same as the asymptotic null distribution of the well-known Cramér-von
Mises goodness-of-fit statistic, ω_n^{gf} [7]. This result will follow from
the fact that U can be expressed as a certain infinite sum involving
independent χ_1^2 random variables. The following two theorems will be
needed in establishing this representation for U.


Theorem 3.3.5 (Rosenblatt [14], pp. 185-95):

Let y(t) be a Gaussian process on [0,1] with continuous covariance
kernel K(s,t). Let λ_j and f_j(·), j = 1, 2, ..., be the eigenvalues
and corresponding eigenfunctions of the integral equation

    λf(t) = ∫_0^1 K(s,t) f(s) ds,   (3.3.29)

satisfying the normalizing condition

    ∫_0^1 f_j(s) f_{j'}(s) ds = δ_{jj'},   (3.3.30)

where δ_{jj'} denotes the Kronecker delta. Then the process

    y_n(t) = Σ_{j=1}^{n} λ_j^{1/2} Z_j f_j(t),

where {Z_j; j = 1, 2, ..., n} is a set of independent standard normal random
variables, converges in mean square to the process y(t) as n tends to
infinity.


Theorem 3.3.6 (Kac and Siegert [9]):

Under the conditions of Theorem 3.3.5,

    B = ∫_0^1 y^2(t) dt = Σ_{j=1}^{∞} λ_j Z_j^2 w.p. 1.

The characteristic function of B, φ_B(ξ), is given by

    φ_B(ξ) = Π_{j=1}^{∞} (1 - 2iξλ_j)^{-1/2}.

An immediate consequence of Theorems 3.3.5 and 3.3.6 is that,
with probability one,

    U = Σ_{j=1}^{∞} λ_j Z_j^2,   (3.3.31)

where {λ_j; j = 1, 2, ...} is the set of solutions to the integral equation

    λf(t) = ∫_0^1 [-max{s,t} + (1/2)(s^2 + t^2) + 1/3] f(s) ds   (3.3.32)

satisfying (3.3.30), and {Z_j^2; j = 1, 2, ...} is a set of independent χ_1^2
random variables. Furthermore, the characteristic function of U is
given by

    φ_U(ξ) = Π_{j=1}^{∞} (1 - 2iξλ_j)^{-1/2}.   (3.3.33)

Thus the distribution of U can be obtained from (3.3.33) if the solutions
of (3.3.32) can be found. In this connection, note that since

    ∫_0^1 [-max{s,t} + (1/2)(s^2 + t^2) + 1/3] ds = 0,

λ = 0 and f(s) ≡ 1 give a solution to (3.3.32) satisfying (3.3.30).
Any other eigenfunction, f, must therefore satisfy

    ∫_0^1 f(s) ds = 0.   (3.3.34)


Let f denote an eigenfunction corresponding to any non-zero eigenvalue, λ.
Then

    λf(t) = ∫_0^t (-t) f(s) ds + ∫_t^1 (-s) f(s) ds
          + ∫_0^1 [(1/2)(s^2 + t^2) + 1/3] f(s) ds
          = -t ∫_0^t f(s) ds - ∫_t^1 s f(s) ds + (1/2) ∫_0^1 s^2 f(s) ds,

where the last equality uses (3.3.34). Differentiating both sides with
respect to t results in the equation

    λf'(t) = -∫_0^t f(s) ds - t f(t) + t f(t) = -∫_0^t f(s) ds.

Putting t = 0 and t = 1 on the right-hand side above and using (3.3.34),
it follows that

    f'(0) = f'(1) = 0.   (3.3.35)

Taking the derivative a second time shows that

    λf''(t) = -f(t).

That is,

    λf''(t) + f(t) = 0.

Now, the general solution of the differential equation above is given by
(see, for example, [6], p. 506):

    f(t) = α sin(λ^{-1/2} t) + β cos(λ^{-1/2} t),   (3.3.36)

where α and β are constants to be determined using the boundary
conditions, (3.3.30) and (3.3.35). The derivative of the general solution
is

    f'(t) = λ^{-1/2} α cos(λ^{-1/2} t) - λ^{-1/2} β sin(λ^{-1/2} t).

Hence, by (3.3.35),

    λ^{-1/2} α cos(0) - λ^{-1/2} β sin(0) = 0,

from which it follows that α = 0. Again from (3.3.35), with α = 0,

    λ^{-1/2} β sin(λ^{-1/2}) = 0,

from which it follows that

    λ^{-1/2} = jπ, j = 1, 2, ....

Thus, the eigenvalues of (3.3.32) are given by

    λ_j = (jπ)^{-2}, j = 1, 2, ....
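The eigenvalues λ_j = (jπ)^{-2} can be confirmed numerically by a Nyström-type discretization of the integral equation (3.3.32); this is a sketch using a midpoint grid, not part of the derivation above.

```python
import numpy as np

# Discretize the kernel K(s,t) = -max(s,t) + (s^2 + t^2)/2 + 1/3 with the
# midpoint rule and compare the top of the spectrum with (j*pi)^(-2).
m = 800
grid = (np.arange(m) + 0.5) / m
S, T = np.meshgrid(grid, grid)
K = -np.maximum(S, T) + 0.5 * (S ** 2 + T ** 2) + 1.0 / 3.0
eigs = np.sort(np.linalg.eigvalsh(K / m))[::-1]
lam = [(j * np.pi) ** -2 for j in (1, 2, 3)]
```

The leading discrete eigenvalues match 1/π^2, 1/(4π^2), ..., and the trace of the discretized operator reproduces Σ λ_j = ∫_0^1 K(t,t) dt = 1/6.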


Finally, from (3.3.31),

    U = Σ_{j=1}^{∞} (jπ)^{-2} Z_j^2 w.p. 1,

so that from (3.3.33),

    φ_U(ξ) = Π_{j=1}^{∞} [1 - 2iξ(jπ)^{-2}]^{-1/2}.

Durbin [7] has shown that under the null hypothesis the Cramér-von
Mises goodness-of-fit statistic, ω_n^{gf}, converges in law to a random
variable with characteristic function φ_U(ξ). Thus the asymptotic null
distributions of ω_n^{gf} and U_n^{(a)} are identical. Percentage points for this
distribution have been tabulated by Anderson and Darling [1].
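The series representation of U can be sampled directly. With a modest truncation, the Monte Carlo mean matches E(U) = Σ (jπ)^{-2} = 1/6, and the upper 5% point agrees with the asymptotic value 0.46136 appearing in Table 3; the truncation level is my own choice.

```python
import math
import random

def sample_u(terms=100):
    """One draw from the truncated series U ~ sum_{j<=terms} Z_j^2/(j*pi)^2."""
    return sum(random.gauss(0, 1) ** 2 / (j * math.pi) ** 2
               for j in range(1, terms + 1))

random.seed(7)
draws = [sample_u() for _ in range(20000)]
mean_u = sum(draws) / len(draws)                     # E(U) = 1/6
p05 = sum(d > 0.46136 for d in draws) / len(draws)   # asymptotic 5% point
```

The simulated tail probability at 0.46136 sits near 0.05, consistent with the Anderson-Darling tabulation cited above.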

The result that ω_n^{gf} and U_n^{(a)} have the same limiting distribution
is interesting in view of the similar result that the Watson goodness-of-fit
and two-sample statistics, ψ_n^{gf} and ψ_n^2, have the same asymptotic null
distributions as the corresponding Kolmogorov-Smirnov two-sided statistics
([19] and [20]). No simple explanation has been found for either of
these results.


3.4 The Exact Null Distribution of U_n^{(a)}

As was the case in Chapter 2 for T_n^{(a)}, a closed form expression
is not available for the exact null distribution of U_n^{(a)}. However,
using the method described in Section 2.3, exact critical values for
U_n^{(0)} were calculated for various levels and sample sizes n = 9(1)20.
These critical values are given in Table 3.

In order to compare the exact distribution of U_20^{(0)} with the
asymptotic distribution, exact tail probabilities, P, for n = 15(1)20,
were calculated using the critical values obtained from the asymptotic
distribution. These probabilities are given in Table 4. Apparently,
for sample sizes larger than 20, a test based on the asymptotic α-level
critical value will have an actual level less than α, with the difference
between the actual level and α being less than the corresponding difference
for n = 20, given in Table 4. This difference appears to decrease
as n increases. A test using the α-level critical values for U_20^{(0)} when
n is somewhat larger than 20 will likely have an actual level greater
than, but closer to, α, than a test based on the asymptotic α-level
critical values.
















TABLE 3

EXACT CRITICAL VALUES FOR U_n^{(0)}

(Each level α is bracketed by two attainable values, x_1 and x_2;
the last row gives the asymptotic critical value.)

α = .10
  n      x_1     P(U_n^{(0)} > x_1)     x_2     P(U_n^{(0)} > x_2)
  9    0.29904      0.1016           0.30727      0.0859
 10    0.29600      0.1094           0.30400      0.0977
 11    0.31405      0.1006           0.31555      0.0986
 12    0.31192      0.1011           0.31713      0.0991
 13    0.31953      0.1008           0.32135      0.0999
 14    0.32067      0.1007           0.32106      0.0997
 15    0.32000      0.1003           0.32178      0.0998
 16    0.32398      0.1016           0.32422      0.0999
 17    0.32363      0.1004           0.32444      0.1000
 18    0.32596      0.1001           0.32648      0.0992
 19    0.32716      0.1003           0.32804      0.0999
 20    0.32738      0.1003           0.32800      0.0995
  ∞    0.34730

α = .05
  9    0.37037      0.0547           0.39506      0.0469
 10    0.40000      0.0527           0.40100      0.0449
 11    0.40421      0.0518           0.41473      0.0498
 12    0.40741      0.0513           0.40914      0.0483
 13    0.41875      0.0500           0.41966      0.0496
 14    0.42310      0.0504           0.42456      0.0494
 15    0.42193      0.0504           0.42370      0.0500
 16    0.42554      0.0504           0.42749      0.0499
 17    0.42825      0.0502           0.42866      0.0497
 18    0.43090      0.0500           0.43210      0.0494
 19    0.43126      0.0502           0.43155      0.0500
 20    0.43200      0.0501           0.43238      0.0497
  ∞    0.46136




















TABLE 3 (Continued)

α = .01
  n      x_1     P(U_n^{(0)} > x_1)     x_2     P(U_n^{(0)} > x_2)
  9    0.56790      0.0117           0.65295      0.0039
 10    0.56900      0.0117           0.61600      0.0078
 11    0.59654      0.0117           0.61758      0.0098
 12    0.60359      0.0103           0.63194      0.0098
 13    0.64087      0.0100           0.64907      0.0095
 14    0.64140      0.0104           0.64723      0.0099
 15    0.65600      0.0100           0.65659      0.0099
 16    0.65625      0.0102           0.65991      0.0100
 17    0.66436      0.0100           0.66517      0.0099
 18    0.67164      0.0101           0.67215      0.0100
 19    0.67298      0.0100           0.67357      0.0099
 20    0.67738      0.0101           0.67800      0.0100
  ∞    0.74346

α = .001
  9    0.65295      0.0039           0.74074      0.0000
 10    0.74400      0.0020           0.82500      0.0000
 11    0.76033      0.0029           0.83396      0.0010
 12    0.85417      0.0015           0.92303      0.0005
 13    0.85662      0.0015           0.90123      0.0010
 14    0.90671      0.0012           0.93477      0.0010
 15    0.93333      0.0011           0.94637      0.0010
 16    0.95679      0.0010           0.97241      0.0009
 17    0.97619      0.0010           0.97659      0.0010
 18    0.99263      0.0010           0.99314      0.0010
 19    0.99898      0.0010           0.99927      0.0010
 20    1.00638      0.0010           1.00738      0.0010
  ∞    1.16786



















TABLE 4

EXACT TAIL PROBABILITIES FOR U_n^(a) BASED ON THE
α-LEVEL CRITICAL VALUES OF U

P = P(U_n^(a) > U_α)

α = .10

  n       P       α - P
       0.0837    0.0163
       0.0848    0.0152
       0.0858    0.0142
       0.0858    0.0142
       0.0873    0.0127
       0.0884    0.0116

α = .05

  n       P       α - P
       0.0382    0.0118
       0.0395    0.0105
       0.0401    0.0099
       0.0411    0.0089
       0.0413    0.0087
       0.0419    0.0081

α = .01

  n       P       α - P
       0.0055    0.0045
       0.0056    0.0044
       0.0059    0.0041
       0.0060    0.0040
       0.0064    0.0036
       0.0065    0.0035

α = .001

  n       P       α - P
       0.0002    0.0008
       0.0002    0.0008
       0.0002    0.0008
       0.0002    0.0008
       0.0003    0.0007
       0.0003    0.0007
















CHAPTER 4


PROBLEMS FOR FURTHER RESEARCH



Many problems in the area of symmetry tests remain to be solved.
As mentioned above, the asymptotic null distribution of the statistics
in the T_n class is not known. Also, it would be desirable to obtain
a method for determining the power of the tests based on statistics in
any of the four classes discussed above against various types of
alternatives. In addition to the Cramér-von Mises type symmetry
statistics, there exist corresponding EDF symmetry statistics based on
the Kolmogorov-Smirnov statistic ([3], [4], [11] and [16]). The Bahadur
efficiencies of several of these statistics, relative to each other,
have been obtained by Littell [11], but there are no criteria available
for comparing the Cramér-von Mises with the Kolmogorov-Smirnov
statistics. The two invariance properties discussed in previous
chapters, as well as the two comparisons upon which the Cramér-von
Mises type statistics are based, have not yet been investigated with
respect to the Kolmogorov-Smirnov type symmetry statistics.

In the area of distribution theory, it has been shown above that
U_n^(a) and the Cramér-von Mises goodness-of-fit statistic have the
same asymptotic null distribution. Also, as noted in Section 3.3, the
Watson goodness-of-fit and two-sample statistics have the same
asymptotic null distribution as the corresponding two-sided
Kolmogorov-Smirnov statistics. As it appears that these two
distributions are common ones, it might be possible to determine when
a particular statistic will have one of them as its asymptotic
distribution.

Finally, in many practical situations, the center of symmetry,
θ, may not be known. It would thus be desirable to find a
distribution-free EDF test for symmetry when θ is estimated. Since
most distribution-free tests lose this property when parameters are
estimated, this problem may be impossible to solve for small samples,
but asymptotically distribution-free procedures may possibly be
obtained.
















BIBLIOGRAPHY


[1] Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of
certain "goodness of fit" criteria based on stochastic
processes. Ann. Math. Statist. 23 193-212.

[2] Billingsley, P. (1968). Convergence of Probability Measures.
Wiley, New York.

[3] Butler, C. C. (1969). A test for symmetry using the sample dis-
tribution function. Ann. Math. Statist. 40 2209-2210.

[4] Chatterjee, S. K. and Sen, P. K. (1973). On Kolmogorov-Smirnov-
type tests for symmetry. Ann. Inst. Statist. Math. 25
287-300.

[5] Chung, K. L. (1968). A Course in Probability Theory. Harcourt,
Brace and World, New York.

[6] Courant, R. (1937). Differential and Integral Calculus, Vol. I,
2nd ed. (translated by E. J. McShane). Interscience,
New York.

[7] Durbin, J. (1973). Distribution Theory for Tests Based on the
Sample Distribution Function. SIAM, Philadelphia.

[8] Hájek, J. and Šidák, Z. (1967). Theory of Rank Tests. Academic
        Press, New York.

[9] Kac, M. and Siegert, A. J. F. (1947). An explicit representation
of a stationary Gaussian process. Ann. Math. Statist. 18
438-442.

[10] Kiefer, J. (1959). K-sample analogues of the Kolmogorov-Smirnov
        and Cramér-von Mises tests. Ann. Math. Statist. 30 420-447.

[11] Littell, R. C. (1974). On the relative efficiency of some
Kolmogorov-Smirnov-type tests for symmetry about zero.
Commun. Statist. 3 1069-1076.

[12] Orlov, A. I. (1972). On testing the symmetry of distributions.
Theor. Probability Appl. 17 357-361.

[13] Prabhu, N. U. (1965). Stochastic Processes. Macmillan, New York.













[14] Rosenblatt, M. (1962). Random Processes. Oxford Univ. Press,
New York.

[15] Rothman, E. D. and Woodroofe, M. (1972). A Cramér-von Mises
        type statistic for testing symmetry. Ann. Math. Statist. 43
        2035-2038.

[16] Smirnov, N. V. (1947). Sur un critère de symétrie de la loi de
        distribution d'une variable aléatoire. Izv. Acad. Sci. USSR
        56 11-14.

[17] Srinivasan, R. and Godio, L. B. (1974). A Cramér-von Mises type
        statistic for testing symmetry. Biometrika 61 196-198.

[18] Stephens, M. A. (1974). EDF statistics for goodness of fit and
some comparisons. J. Amer. Statist. Assoc. 69 730-737.

[19] Watson, G. S. (1961). Goodness-of-fit tests on a circle.
        Biometrika 48 109-114.

[20] Watson, G. S. (1962). Goodness-of-fit tests on a circle. II.
        Biometrika 49 57-63.
















BIOGRAPHICAL SKETCH


David Lawrence Hill was born on July 11, 1949, in Morristown,

New Jersey, and raised in nearby Bernardsville. Upon graduation from

Bernards High School in June, 1967, he attended Bucknell University in

Lewisburg, Pennsylvania, and received the degree of Bachelor of Science

in Mathematics in May, 1971. While at Bucknell, he became interested

in statistics through the influence of the late Professor Paul Benson.

In September, 1971, David enrolled in the graduate school at the

University of Florida, and, in June, 1973, was awarded the degree of

Master of Statistics. Since then, he has been working toward the

degree of Doctor of Philosophy, and has been the recipient of a grad-

uate fellowship and a research assistantship from the University of

Florida.

After receiving his degree, David will assume a position as

an Assistant Professor in the Department of Mathematical Sciences

at Northern Illinois University in DeKalb, Illinois. He is a member

of the American Statistical Association.














I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the degree
of Doctor of Philosophy.




P. V. Rao, Chairman
Professor of Statistics







I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the
degree of Doctor of Philosophy.




R. C. Littell
Associate Professor of Statistics







I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the
degree of Doctor of Philosophy.




D. D. Wackerly
Assistant Professor of Statistics














I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the degree
of Doctor of Philosophy.




R. D. Mauldin
Associate Professor of Mathematics







This dissertation was submitted to the Graduate Faculty of the
Department of Statistics in the College of Arts and Sciences and
to the Graduate Council, and was accepted as partial fulfillment
of the requirements for the degree of Doctor of Philosophy.


August, 1976


Dean, Graduate School



