THE PROBABILITY THAT PART OF A SET OF EQUICORRELATED
NORMAL VARIABLES ARE POSITIVE

BY

THOMAS RAY HOFFMAN

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA IN PARTIAL
FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1972
TO MY PARENTS
ACKNOWLEDGMENTS
I would like to express my appreciation to my major professor, Dr. John Saw, who suggested the topic of this dissertation, and who was always available for assistance. Appreciation is expressed also to the other members of my supervisory committee, Professors R. L. Scheaffer, P. V. Rao, and Z. R. Pop-Stojanovic.
Also, I would like to extend my thanks to the other members, faculty, students, and staff of the Department of Statistics. They made my stay at the University of Florida both rewarding and enjoyable.
The manuscript was typed by Mrs. Edna Larrick. Her patience and assistance through those final predeadline weeks will always be remembered and appreciated. Also, I would like to thank Mrs. Deborah Ingram for her excellent work drawing graphs.
Finally, I express deep appreciation to Professor Paul Benson of Bucknell University. Without his guidance and encouragement I would never have entered the field of statistics.
TABLE OF CONTENTS

                                                                    Page
ACKNOWLEDGMENTS ................................................... iii
LIST OF TABLES .................................................... vi
LIST OF FIGURES ................................................... vii
ABSTRACT .......................................................... viii

CHAPTER

1  INTRODUCTION ................................................... 1
   1.1  Introduction .............................................. 1
   1.2  Definition of $P_{r:m}(\rho)$ ............................. 2
   1.3  A Transformation Simplifying $P_{r:m}(\rho)$ .............. 6
   1.4  Summary of the Results of This Dissertation ............... 12

2  AN EXPRESSION FOR $P_{r:m}(\rho)$ INVOLVING TCHEBYCHEFF-HERMITE
   AND LEGENDRE POLYNOMIALS ...................................... 14
   2.1  Definitions and Properties ................................ 14
   2.2  The Fundamental Result .................................... 18
   2.3  The Integral $J_k(\rho)$ .................................. 20

3  NUMERICAL RESULTS .............................................. 26
   3.1  Exact Results ............................................. 26
   3.2  Evaluating the Integral $J_k(\rho)$ ....................... 31
   3.3  Computing $P_{r:m}(\rho)$ ................................. 38
   3.4  Accuracy of the Results ................................... 42

4  APPLICATION: A TEST FOR NORMALITY .............................. 44
   4.1  Introduction .............................................. 44
   4.2  The Null Distribution ..................................... 47
   4.3  Approximations to the Null Distribution ................... 49

5  OTHER METHODS OF EXPRESSING $P_m(\rho)$ ........................ 62
   5.1  Introduction .............................................. 62
   5.2  A Power Series in Rho ..................................... 62
   5.3  A Series Resulting from an Inverse Taylor Series
        Expansion of g(u) ......................................... 73
   5.4  Using Moments of Extreme Order Statistics ................. 82

APPENDIXES ........................................................ 84
BIBLIOGRAPHY ...................................................... 98
BIOGRAPHICAL SKETCH ............................................... 100
LIST OF TABLES

Table                                                               Page
 1  Standard Deviation of Y ....................................... 48
 2  The Cumulative Distribution of Y, n = 19 ...................... 51
 3  $C(\rho, 2)$ .................................................. 57
 4  $C(\rho, 3)$ .................................................. 58
 6  Moments of Y, n = 9 ........................................... 61
 7  Error Involved in Computing $P_m(\rho)$ when the Series
    in Rho Is Truncated after Five Terms .......................... 72
 8  Values of $a_j$, j = 0,2,...,22 ............................... 78
 9  Values of $b_k(m)$, $J_k(1/3)$, $J_k(1/4)$,
    k = 0,2,...,22, m = 10 ........................................ 80
10  The First n Terms in the Series $P_m(\rho)$,
    m = 10, $\rho$ = 1/3, 1/4, n = 0,2,...,22 ..................... 81
LIST OF FIGURES

Figure                                                              Page
1  The function $h^{1/2} L_k[2G(u)-1]\,[e^{-u^2/2}]^{h-1}$,
   $\rho$ = 1/10 .................................................. 32
2  The function $h^{1/2} L_k[2G(u)-1]\,[e^{-u^2/2}]^{h-1}$,
   $\rho$ = 1/2 ................................................... 33
3  The function $h^{1/2} L_k[2G(u)-1]\,[e^{-u^2/2}]^{h-1}$,
   $\rho$ = 9/10 .................................................. 34
4  The function $(-h)^{1/2} L_k[if(u)]\,[e^{-u^2/2}]^{-h-1}$,
   $\rho$ = -1/2, $\rho$ = -1/10 .................................. 37
Abstract of Dissertation Presented to the
Graduate Council of the University of Florida in Partial Fulfillment
of the Requirements for the Degree of Doctor of Philosophy

THE PROBABILITY THAT PART OF A SET OF EQUICORRELATED NORMAL VARIABLES ARE POSITIVE

By
Thomas Ray Hoffman
March, 1972

Chairman: Dr. J. G. Saw
Major Department: Statistics
The probability that part of a set of equicorrelated normal variables are positive is defined by a multiple integral expression involving the multivariate normal density function. Although much research related to this integral expression has been published, most results do not include a practical method of its evaluation. Also, when the correlation is negative, no direct method of evaluating the integral expression is available. In this paper we discuss several methods of expressing the integral. One of these expressions, valid for both positive and negative correlation, is used to obtain numerical results.
A transformation is used to simplify the integral expression for the probability that part of a set of equicorrelated normal variables are positive. Then the probability can be written as an integral involving the real normal distribution function when the correlation is positive, and the complex normal distribution function when the correlation is negative. For positive correlation, this integral expression has been used by other authors to obtain numerical results.
Next, we use a result connecting the terms of a binomial series with Tchebycheff-Hermite and Legendre polynomials to obtain a finite series expression for the probability. Although the general term in the series involves an integral which cannot be evaluated in closed form, this integral depends only on the correlation and can be evaluated by numerical integration for both positive and negative correlation. The numerical results are included in the appendixes.
As an application, the number of observations larger than the sample mean is used as a test statistic for testing the hypothesis that a population is normally distributed. The small sample null distribution is derived and numerical results are given. Approximations to the null distribution are also discussed.
Finally, we discuss three other methods of expressing the probability that the entire set of equicorrelated normal variables is positive. Two of these methods express the probability as an infinite series. However, in both cases the convergence is quite slow. The third expression, involving moments of extreme order statistics, can be used for obtaining numerical results only for limited positive values of the correlation.
CHAPTER 1
INTRODUCTION
1.1 Introduction
In a recent paper, David (1962) suggested using the number of observations larger than the sample mean as a test for the homogeneity of a random sample. Assuming a normal population, he showed that the proportion of observations larger than the sample mean has an asymptotic normal distribution. However, David did not discuss the small sample distribution for the test statistic.
The work on this dissertation began as a search for the small sample distribution of David's statistic. However, this work soon led to the more general problem of finding the probability that part of a set of equicorrelated normal variables are positive and, in particular, the problem of evaluating a multivariate normal integral expression for the probability that all the variables are positive.
Much research related to the multivariate normal integral has been published. Gupta (1963b), in addition to an excellent survey paper, gives a complete bibliography of articles related to the multivariate normal integral. However, only a few of these articles offer a practical method of evaluating the integral. Also, although Steck (1962) gives a relation connecting the results for positive and negative correlation, no direct method of evaluating the integral has been obtained when the correlation is negative. Only two authors, Ruben (1954) and Gupta (1963a), give numerical results.
It is the purpose of this dissertation to find at least one method of evaluating the multivariate normal integral that works for both positive and negative correlation and that can be used easily to obtain numerical results. Then the small sample distribution of David's statistic can be given as an application to the more general problem.
1.2 Definition of $P_{r:m}(\rho)$

Suppose the m variates $X_1, X_2, \dots, X_m$, each with zero mean and unit variance, have a multivariate normal distribution. Of interest is the probability that exactly r of these m variables are positive. If the variables are mutually independent, the problem has the binomial solution $\binom{m}{r}(\tfrac12)^m$. However, when the variables are dependent, no simple solution exists. In this paper we shall consider the case when the variables have common correlation $\rho$, $-(m-1)^{-1} < \rho < 1$. $P_{r:m}(\rho)$ will denote the probability that exactly r of the m variables are positive. That is,

$P_{r:m}(\rho) = \sum P(X_{i_1} > 0, \dots, X_{i_r} > 0;\; X_{i_{r+1}} < 0, \dots, X_{i_m} < 0),$

where the summation is over all partitions $[i_1,\dots,i_r;\ i_{r+1},\dots,i_m]$ of the set $[1,2,\dots,m]$. Since $X_1, X_2, \dots, X_m$ are identically distributed, the above equation may be written as

$P_{r:m}(\rho) = \binom{m}{r} P(X_1 > 0, \dots, X_r > 0;\; X_{r+1} < 0, \dots, X_m < 0).$

Letting $g(x_1, x_2, \dots, x_m)$ represent the density of $X_1, X_2, \dots, X_m$, we have

(1.2.1)  $P_{r:m}(\rho) = \binom{m}{r} \int\cdots\int_{[r,m]} g(x_1, x_2, \dots, x_m)\, dx_1\, dx_2 \cdots dx_m,$

where $[r,m]$ will be used to denote the range of integration

$(x_i > 0:\ 1 \le i \le r;\qquad x_i < 0:\ r+1 \le i \le m).$

In the case r = m we will, for convenience, write $P_m(\rho)$ rather than $P_{m:m}(\rho)$.
A first approach to the problem might be to write down the density function $g(x_1, x_2, \dots, x_m)$. From multivariate theory we have

(1.2.2)  $g(x_1, x_2, \dots, x_m) = (2\pi)^{-m/2}\, |V|^{-1/2} \exp\left(-\tfrac12\, x' V^{-1} x\right), \qquad -\infty < x_i < \infty,\ i = 1,2,\dots,m,$

where x denotes the vector $(x_1, x_2, \dots, x_m)'$ and V is the dispersion matrix given by

$V = \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}.$

Due to the simplicity of the dispersion matrix, the determinant $|V|$ is easily evaluated and the quadratic form $x' V^{-1} x$ has a simple scalar representation. In fact, it will be shown that

(1.2.3)  $|V| = (1-\rho)^{m-1}[1+(m-1)\rho],$

(1.2.4)  $V^{-1} = \dfrac{1}{(1-\rho)[1+(m-1)\rho]} \begin{pmatrix} 1+(m-2)\rho & -\rho & \cdots & -\rho \\ -\rho & 1+(m-2)\rho & \cdots & -\rho \\ \vdots & \vdots & & \vdots \\ -\rho & -\rho & \cdots & 1+(m-2)\rho \end{pmatrix}.$

Since the determinant of a matrix equals the product of its latent roots, (1.2.3) can be proven by finding the m latent roots of V. If $\lambda$ is a latent root of V it must satisfy $(V - \lambda I)y = 0$, where 0 represents the $(m \times 1)$ vector of zeros, for at least one nonzero vector y. Letting $\lambda = 1-\rho$, we note that the matrix $(V - \lambda I)$ contains only one distinct element. Therefore the rank of $(V - \lambda I)$ is one and there exist $(m-1)$ linearly independent nonzero vectors y satisfying $(V - \lambda I)y = 0$. Hence, $(1-\rho)$ is an $(m-1)$-fold latent root of V. To find the last latent root we note that the trace of a matrix equals the sum of its latent roots. Since the trace of V equals m, the remaining root is $\lambda = m - (m-1)(1-\rho) = 1 + (m-1)\rho$, and (1.2.3) is proven.
To prove (1.2.4), denote $V^{-1}$ by $A = (a_{ij})$. Then A must satisfy

$\sum_{j=1}^{m} v_{ij}\, a_{ji} = 1, \qquad \sum_{j=1}^{m} v_{ij}\, a_{j\ell} = 0, \quad \ell \ne i.$

By symmetry, A has a common diagonal element, say $\beta$, and a common off-diagonal element, say $\alpha$. Substituting for $v_{ij}$ in the above equations, we have

(1.2.5)  $(m-1)\rho\alpha + \beta = 1, \qquad \rho\beta + \alpha + (m-2)\rho\alpha = 0.$

Subtracting these last two equations, we have

(1.2.6)  $\beta - \alpha = \dfrac{1}{1-\rho}.$

Combining (1.2.6) with the first of equations (1.2.5), that is, with

(1.2.7)  $(m-1)\rho\alpha + \beta = 1,$

the solutions are

$\alpha = \dfrac{-\rho}{(1-\rho)[1+(m-1)\rho]}, \qquad \beta = \dfrac{1+(m-2)\rho}{(1-\rho)[1+(m-1)\rho]},$

as was to be shown.
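Since (1.2.3) and (1.2.4) are purely algebraic, they are easy to check mechanically. The sketch below is our own illustration (not part of the dissertation); it uses exact rational arithmetic to verify that the claimed inverse satisfies $VA = I$ and that the determinant formula agrees with Gaussian elimination:

```python
from fractions import Fraction

def det_frac(M):
    """Determinant by fraction-exact Gaussian elimination."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return d

def check_equicorrelated(m, rho):
    """Verify (1.2.3) and (1.2.4) for the m x m equicorrelation matrix V."""
    one = Fraction(1)
    V = [[one if i == j else rho for j in range(m)] for i in range(m)]
    denom = (one - rho) * (one + (m - 1) * rho)
    beta = (one + (m - 2) * rho) / denom     # diagonal of V^{-1}, from (1.2.4)
    alpha = -rho / denom                     # off-diagonal of V^{-1}
    A = [[beta if i == j else alpha for j in range(m)] for i in range(m)]
    VA = [[sum(V[i][k] * A[k][j] for k in range(m)) for j in range(m)]
          for i in range(m)]
    assert all(VA[i][j] == (1 if i == j else 0)
               for i in range(m) for j in range(m))   # so (1.2.4) holds
    assert det_frac(V) == (one - rho) ** (m - 1) * (one + (m - 1) * rho)  # (1.2.3)

check_equicorrelated(5, Fraction(1, 3))
check_equicorrelated(4, Fraction(-1, 5))   # a negative rho above -1/(m-1)
```

The negative-rho case illustrates that both formulas remain valid throughout $-(m-1)^{-1} < \rho < 1$.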
Using the scalar representations of $|V|$ and $x' V^{-1} x$ in the density function, (1.2.1) may be written as

(1.2.8)  $P_{r:m}(\rho) = \binom{m}{r} \int\cdots\int_{[r,m]} (2\pi)^{-m/2} (1-\rho)^{-(m-1)/2} [1+(m-1)\rho]^{-1/2} \exp\left\{ -\dfrac{[1+(m-2)\rho] \sum_{i=1}^{m} x_i^2 \;-\; 2\rho \sum_{i<j} x_i x_j}{2(1-\rho)[1+(m-1)\rho]} \right\} dx_1 \cdots dx_m.$

Unfortunately, although free of matrix notation, the above representation of $P_{r:m}(\rho)$ is not well suited to obtaining numerical results. A more workable form of $P_{r:m}(\rho)$ is needed.
1.3 A Transformation Simplifying $P_{r:m}(\rho)$

Consider the m variables $X_1', X_2', \dots, X_m'$ defined by

$X_i' = Y_i - \theta Y_0, \qquad i = 1, 2, \dots, m,$

where the $(m+1)$ variables $Y_0, Y_1, \dots, Y_m$ have a multivariate normal distribution with mean vector zero and dispersion matrix I, and $\theta$ is an arbitrary constant. It follows that $X_1', X_2', \dots, X_m'$ have a multivariate normal distribution with

$E(X_i') = E(Y_i) - \theta E(Y_0) = 0,$

$\mathrm{Var}(X_i') = \mathrm{Var}(Y_i) + \theta^2\, \mathrm{Var}(Y_0) - 2\theta\, \mathrm{Cov}(Y_i, Y_0) = 1 + \theta^2,$

$\mathrm{Cov}(X_i', X_j') = \mathrm{Cov}(Y_i - \theta Y_0,\, Y_j - \theta Y_0) = \mathrm{Cov}(Y_i, Y_j) - \theta\, \mathrm{Cov}(Y_i, Y_0) - \theta\, \mathrm{Cov}(Y_0, Y_j) + \theta^2\, \mathrm{Var}(Y_0) = \theta^2.$

Therefore, assuming $\rho > 0$, if we define $\theta$ by

$\theta = \left( \dfrac{\rho}{1-\rho} \right)^{1/2},$

then $\theta^2 (1+\theta^2)^{-1}$ equals $\rho$ and the variables $X_1'/(1+\theta^2)^{1/2}, X_2'/(1+\theta^2)^{1/2}, \dots, X_m'/(1+\theta^2)^{1/2}$ have the same distribution as $X_1, X_2, \dots, X_m$ defined in Section 1.2. That is,

$\mathrm{Var}\!\left( \dfrac{X_i'}{(1+\theta^2)^{1/2}} \right) = 1, \qquad \mathrm{Cov}\!\left( \dfrac{X_i'}{(1+\theta^2)^{1/2}},\, \dfrac{X_j'}{(1+\theta^2)^{1/2}} \right) = \dfrac{\theta^2}{1+\theta^2} = \rho.$

Hence, we have

$P_{r:m}(\rho) = \binom{m}{r} P(X_1 > 0, \dots, X_r > 0;\; X_{r+1} < 0, \dots, X_m < 0)$
$\qquad\qquad = \binom{m}{r} P(Y_1 - \theta Y_0 > 0, \dots, Y_r - \theta Y_0 > 0;\; Y_{r+1} - \theta Y_0 < 0, \dots, Y_m - \theta Y_0 < 0).$

Let g(y) and G(y) denote the standard normal density and distribution functions,

$g(y) = \dfrac{1}{\sqrt{2\pi}}\, e^{-y^2/2}, \qquad G(y) = \int_{-\infty}^{y} g(t)\, dt,$

respectively. Hence, writing $P_{r:m}(\rho)$ conditional on $Y_0 = y$ and integrating over y, we have

$P_{r:m}(\rho) = \binom{m}{r} \int_{-\infty}^{\infty} P(Y_1 > \theta y, \dots, Y_r > \theta y;\; Y_{r+1} < \theta y, \dots, Y_m < \theta y)\, g(y)\, dy.$

Finally, using the independence and identically distributed properties of $Y_1, Y_2, \dots, Y_m$, we have

(1.3.1)  $P_{r:m}(\rho) = \binom{m}{r} \int_{-\infty}^{\infty} [1 - G(\theta y)]^r [G(\theta y)]^{m-r}\, g(y)\, dy.$
Results similar to (1.3.1) have been given by Ruben (1954), Dunnett and Sobel (1955), Moran (1956), and Stuart (1958).
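For $\rho > 0$, the representation (1.3.1) is immediately computable: it is a one-dimensional integral of a smooth, rapidly decaying function. A minimal sketch (ours, not the dissertation's program; Simpson's rule on a truncated range, with ad hoc grid sizes):

```python
import math

def G(x):                       # standard normal distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def g(x):                       # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def p_rm(r, m, rho, n=4000, lim=8.0):
    """P_{r:m}(rho) from (1.3.1) by Simpson's rule on [-lim, lim]; rho > 0."""
    theta = math.sqrt(rho / (1.0 - rho))
    def f(y):
        Gy = G(theta * y)
        return (1.0 - Gy) ** r * Gy ** (m - r) * g(y)
    h = 2.0 * lim / n
    s = f(-lim) + f(lim) + sum((4 if j % 2 else 2) * f(-lim + j * h)
                               for j in range(1, n))
    return math.comb(m, r) * s * h / 3.0
```

The values so computed sum to one over r and satisfy the symmetry $P_{r:m}(\rho) = P_{m-r:m}(\rho)$, both of which are established in Chapter 3.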
Although the expression for $P_{r:m}(\rho)$ given in (1.3.1) was derived assuming $\rho > 0$, Steck and Owen (1962) have shown that it also holds for $\rho < 0$ by defining $G(\theta y)$ in the complex plane. For $\rho < 0$, $\theta$ is an imaginary number and can be written $\theta = i\varphi$, where

$\varphi = \left( \dfrac{-\rho}{1-\rho} \right)^{1/2}.$

Then $G(\theta y)$ equals $G(i\varphi y)$ and is defined by integrating along a path in the complex plane parallel to the x-axis from $-\infty + i\varphi y$ to $i\varphi y$. That is,

(1.3.2)  $G(i\varphi y) = e^{\varphi^2 y^2/2} \int_{-\infty}^{0} e^{-i\varphi t y}\, g(t)\, dt.$
The proof of (1.3.1) for $\rho < 0$ consists of showing that the right-hand sides of equations (1.2.8) and (1.3.1) are identical. First we note that

$1 - G(i\varphi y) = 1 - e^{\varphi^2 y^2/2} \int_{-\infty}^{0} e^{-i\varphi t y}\, g(t)\, dt$
$\qquad = e^{\varphi^2 y^2/2} \int_{-\infty}^{\infty} e^{-i\varphi t y}\, g(t)\, dt - e^{\varphi^2 y^2/2} \int_{-\infty}^{0} e^{-i\varphi t y}\, g(t)\, dt$
$\qquad = e^{\varphi^2 y^2/2} \int_{0}^{\infty} e^{-i\varphi t y}\, g(t)\, dt,$

since $\int_{-\infty}^{\infty} e^{-i\varphi t y}\, g(t)\, dt = e^{-\varphi^2 y^2/2}$ is the characteristic function of a standard normal variable evaluated at $-\varphi y$. Then using (1.3.2) and writing the right-hand side of equation (1.3.1), say R, as a multiple integral, we have

$R = \binom{m}{r} \int_{-\infty}^{\infty} [1 - G(i\varphi y)]^r [G(i\varphi y)]^{m-r}\, g(y)\, dy$
$\quad = \binom{m}{r} \int_{-\infty}^{\infty} \left[ \int\cdots\int_{[r,m]} e^{m\varphi^2 y^2/2}\, e^{-i\varphi y \sum_{j=1}^{m} t_j}\, g(t_1) \cdots g(t_m)\, dt_1 \cdots dt_m \right] g(y)\, dy.$
Since $\rho > -(m-1)^{-1}$,

$m\varphi^2 = \dfrac{-m\rho}{1-\rho} < \dfrac{m/(m-1)}{1 + 1/(m-1)} = 1,$

and the integral

$\int_{-\infty}^{\infty} e^{m\varphi^2 y^2/2}\, e^{-i\varphi y \sum_{j=1}^{m} t_j}\, g(y)\, dy$

converges. Therefore, interchanging the order of integration is permitted and we have

$R = \binom{m}{r} \int\cdots\int_{[r,m]} \left[ \int_{-\infty}^{\infty} \dfrac{1}{\sqrt{2\pi}}\, e^{-(1 - m\varphi^2) y^2/2}\, e^{-i\varphi y \sum t_j}\, dy \right] g(t_1) \cdots g(t_m)\, dt_1 \cdots dt_m.$

The integral in brackets multiplied by $(1 - m\varphi^2)^{1/2}$ is the characteristic function of a normal random variable with zero mean and variance $(1 - m\varphi^2)^{-1}$. Hence,

$R = \binom{m}{r} (2\pi)^{-m/2} (1 - m\varphi^2)^{-1/2} \int\cdots\int_{[r,m]} \exp\left\{ -\tfrac12 \sum_{j=1}^{m} t_j^2 - \dfrac{\varphi^2 \left( \sum t_j \right)^2}{2(1 - m\varphi^2)} \right\} dt_1 \cdots dt_m.$

Substituting $-\rho(1-\rho)^{-1}$ for $\varphi^2$, so that $1 - m\varphi^2 = [1+(m-1)\rho]/(1-\rho)$, gives

$R = \binom{m}{r} (2\pi)^{-m/2} (1-\rho)^{1/2} [1+(m-1)\rho]^{-1/2} \int\cdots\int_{[r,m]} \exp\left\{ -\tfrac12 \sum_{j=1}^{m} t_j^2 + \dfrac{\rho \left( \sum t_j \right)^2}{2[1+(m-1)\rho]} \right\} dt_1 \cdots dt_m.$

Finally, making the transformation $t_j = x_j (1-\rho)^{-1/2}$, we have

$R = \binom{m}{r} (2\pi)^{-m/2} (1-\rho)^{-(m-1)/2} [1+(m-1)\rho]^{-1/2} \int\cdots\int_{[r,m]} \exp\left\{ -\dfrac{[1+(m-2)\rho] \sum_{j=1}^{m} x_j^2 - 2\rho \sum_{i<j} x_i x_j}{2(1-\rho)[1+(m-1)\rho]} \right\} dx_1 \cdots dx_m,$

and the proof is complete, since this is the right-hand side of (1.2.8).
The results of this section for positive and negative rho can be summarized in the following lemma.

Lemma 1

$P_{r:m}(\rho) = \binom{m}{r} \int_{-\infty}^{\infty} [1 - G(\theta y)]^r [G(\theta y)]^{m-r}\, g(y)\, dy,$

where

$\theta = \left( \dfrac{\rho}{1-\rho} \right)^{1/2}$

and is written as $i\varphi$ when $\rho < 0$, with $\varphi$ defined as

$\varphi = \left( \dfrac{-\rho}{1-\rho} \right)^{1/2}.$

The functions g(y) and $G(\theta y)$ are defined by

$g(y) = \dfrac{1}{\sqrt{2\pi}}\, e^{-y^2/2}$

and

$G(\theta y) = \begin{cases} \displaystyle\int_{-\infty}^{\theta y} g(t)\, dt, & \rho > 0, \\[2ex] e^{\varphi^2 y^2/2} \displaystyle\int_{-\infty}^{0} e^{-i\varphi t y}\, g(t)\, dt, & \rho < 0. \end{cases}$
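Lemma 1 with the complex $G(i\varphi y)$ can also be checked by brute force. The sketch below is our own illustration (tolerances and grids are ad hoc): it evaluates $G(i\varphi y)$ from (1.3.2) by numerical integration and then integrates (1.3.1); the result comes out real, as it must:

```python
import cmath, math

def p_rm_negative(r, m, rho, n_outer=320, n_inner=1200, lim=8.0):
    """P_{r:m}(rho) for -1/(m-1) < rho < 0, via Lemma 1 and (1.3.2)."""
    phi = math.sqrt(-rho / (1.0 - rho))
    g = lambda t: math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

    def G_complex(y):
        # G(i phi y) = e^{phi^2 y^2/2} * integral_{-inf}^0 e^{-i phi t y} g(t) dt
        h = lim / n_inner
        f = lambda t: cmath.exp(-1j * phi * t * y) * g(t)
        s = f(-lim) + f(0.0) + sum((4 if j % 2 else 2) * f(-lim + j * h)
                                   for j in range(1, n_inner))
        return math.exp(0.5 * phi * phi * y * y) * s * h / 3.0

    def F(y):
        Gc = G_complex(y)
        return (1.0 - Gc) ** r * Gc ** (m - r) * g(y)

    h = 2.0 * lim / n_outer
    s = F(-lim) + F(lim) + sum((4 if j % 2 else 2) * F(-lim + j * h)
                               for j in range(1, n_outer))
    return math.comb(m, r) * s * h / 3.0
```

For m = 3 and $\rho = -0.3$ the imaginary part vanishes numerically and the real part agrees with the closed form $P_{1:3}(\rho) = 3/8 - (3/4\pi)\sin^{-1}\rho$ derived in Section 3.1.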
1.4 Summary of the Results of This Dissertation
In the next chapter, Lemma 1 and a result connecting the terms of a binomial series with Tchebycheff-Hermite and Legendre polynomials are used to obtain a finite series expression for $P_{r:m}(\rho)$ valid for all allowable values of $\rho$. For $\rho < 0$ the results reduce to a workable form once the real and imaginary parts of $G(i\varphi y)$ are isolated.
In Chapter 3 it is shown how the results of Chapter 2 can be programmed to obtain numerical results. Since $P_m(\rho)$ is simpler than $P_{r:m}(\rho)$ for computing purposes, a result expressing $P_{r:m}(\rho)$ as a sum involving $P_j(\rho)$, j = r, r+1, ..., m, is proven. The accuracy of the computed results is verified by comparison with exact results for special values of m and $\rho$ and with the results of Ruben and Gupta.
An application of the results of the first three chapters is
given in Chapter 4. The statistic suggested by David is used for testing the hypothesis that a population is normally distributed. The null distribution is discussed and numerical results are given for sample sizes not exceeding 22. Approximations to the null distribution are also given.
Finally, in Chapter 5 we discuss three alternative methods for computing $P_m(\rho)$. However, none of these methods can be used to obtain numerical results as readily as the method discussed in Chapters 2 and 3.
CHAPTER 2
AN EXPRESSION FOR $P_{r:m}(\rho)$ INVOLVING TCHEBYCHEFF-HERMITE AND LEGENDRE POLYNOMIALS
2.1 Definitions and Properties
Let $c_k(r,m)$ denote the kth order Tchebycheff-Hermite polynomial orthogonal on r = 0, 1, ..., m. Then $c_k(r,m)$ can be written (see, for example, Plackett, Sec. 6.5) as

(2.1.1)  $c_k(r,m) = \dfrac{(k!)^3}{(2k)!} \sum_{j=0}^{k} (-1)^j \binom{r}{k-j} \binom{2k-j}{k} \binom{m-k+j}{j},$

and satisfies the following three properties:

(2.1.2)  $\sum_{r=0}^{m} c_k(r,m) = 0, \qquad k = 1, 2, \dots,$

(2.1.3)  $\sum_{r=0}^{m} c_j(r,m)\, c_k(r,m) = 0, \qquad j \ne k,$

(2.1.4)  $\sum_{r=0}^{m} c_k(r,m)^2 = \dfrac{(k!)^2 \binom{m+k+1}{2k+1}}{\binom{2k}{k}}.$
Also, let $L_k(t)$ represent the kth order Legendre polynomial in t. $L_k(t)$ is given by (see, for example, Abramowitz and Stegun, Chap. 22) the coefficient of $s^k$ in the expansion of $(1 - 2ts + s^2)^{-1/2}$ and can be computed from the recurrence relation

(2.1.5)  $L_0(t) = 1, \qquad L_1(t) = t, \qquad L_k(t) = \dfrac{2k-1}{k}\, t\, L_{k-1}(t) - \dfrac{k-1}{k}\, L_{k-2}(t), \qquad k \ge 2.$
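The recurrence (2.1.5) is all that is needed to evaluate $L_k$ numerically. A short sketch (the function name is ours):

```python
def legendre(k, t):
    """Evaluate the Legendre polynomial L_k(t) by the recurrence (2.1.5)."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, t
    for n in range(2, k + 1):
        prev, cur = cur, ((2 * n - 1) * t * cur - (n - 1) * prev) / n
    return cur
```

For instance, legendre(2, t) reproduces $(3t^2 - 1)/2$, and $L_k(1) = 1$ for every k.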
The following result, due to Saw and Chow (1966), connects the terms of a binomial series with the Tchebycheff-Hermite and Legendre polynomials. For any p,

(2.1.6)  $\sum_{r=0}^{m} \binom{m}{r} p^r (1-p)^{m-r}\, c_k(r,m) = \dfrac{m!\,(k!)^2}{(m-k)!\,(2k)!}\, L_k(2p-1).$
The importance of this result to the next section justifies the inclusion of the following proof.

After substituting $c_k(r,m)$, as defined by (2.1.1), into equation (2.1.6) and multiplying the equation by $(m-k)!\,(2k)!/(k!)^2$, the equation to be verified simplifies to

$\sum_{r=0}^{m} \sum_{j=0}^{k} (-1)^j\, \dfrac{(2k-j)!\,(m-k+j)!}{j!\,(k-j)!}\, \binom{r}{k-j} \binom{m}{r} p^r (1-p)^{m-r} = m!\, L_k(2p-1).$

Notice that $\binom{r}{k-j}$ equals zero unless $r \ge k-j$. Thus, letting Q represent the left-hand side of the above equation and changing the order of summation, we have

$Q = \sum_{j=0}^{k} (-1)^j\, \dfrac{(2k-j)!\,(m-k+j)!}{j!\,(k-j)!} \left[ \sum_{r=k-j}^{m} \binom{m}{r} \binom{r}{k-j} p^r (1-p)^{m-r} \right].$

Letting $r' = r - k + j$, the sum in brackets simplifies to

$\sum_{r'=0}^{m-k+j} \binom{m}{r'+k-j} \binom{r'+k-j}{k-j} p^{r'+k-j} (1-p)^{m-k+j-r'} = \binom{m}{k-j} p^{k-j} \sum_{r'=0}^{m-k+j} \binom{m-k+j}{r'} p^{r'} (1-p)^{m-k+j-r'} = \binom{m}{k-j}\, p^{k-j}.$

Therefore Q reduces to

$Q = \sum_{j=0}^{k} (-1)^j\, \dfrac{m!\,(2k-j)!}{j!\,[(k-j)!]^2}\, p^{k-j}.$

Since by definition

$\sum_{k=0}^{\infty} L_k(2p-1)\, s^k = [1 - 2s(2p-1) + s^2]^{-1/2},$

it remains to show that

$\sum_{k=0}^{\infty} \dfrac{Q}{m!}\, s^k = [1 - 2s(2p-1) + s^2]^{-1/2}.$

Making the change of variable $\ell = k - j$, the left-hand side of the above equation becomes

$\sum_{\ell=0}^{\infty} \sum_{j=0}^{\infty} (-1)^j\, \dfrac{(2\ell+j)!}{j!\,(\ell!)^2}\, p^{\ell}\, s^{\ell+j}.$

However, it can be shown that $[1 - 2s(2p-1) + s^2]^{-1/2}$ also reduces to the above sum. We have

$[1 - 2s(2p-1) + s^2]^{-1/2} = [(1+s)^2 - 4ps]^{-1/2} = (1+s)^{-1} \left[ 1 - \dfrac{4ps}{(1+s)^2} \right]^{-1/2} = (1+s)^{-1} \sum_{\ell=0}^{\infty} \binom{-1/2}{\ell} \left( \dfrac{-4ps}{(1+s)^2} \right)^{\ell}.$

But $\binom{-1/2}{\ell}$ may be written as $(-1)^{\ell}\, \dfrac{(2\ell)!}{4^{\ell}\,(\ell!)^2}$, since

$\binom{-1/2}{\ell} = \dfrac{(-\tfrac12)(-\tfrac32) \cdots (-\tfrac{2\ell-1}{2})}{\ell!} = \dfrac{(-1)^{\ell}}{\ell!} \cdot \dfrac{1 \cdot 3 \cdots (2\ell-1)}{2^{\ell}} = (-1)^{\ell}\, \dfrac{(2\ell)!}{4^{\ell}\,(\ell!)^2}.$

Substituting the above result into the expression for $[1 - 2s(2p-1) + s^2]^{-1/2}$, we have

$[1 - 2s(2p-1) + s^2]^{-1/2} = \sum_{\ell=0}^{\infty} \dfrac{(2\ell)!}{(\ell!)^2}\, (ps)^{\ell}\, (1+s)^{-(2\ell+1)} = \sum_{\ell=0}^{\infty} \sum_{j=0}^{\infty} (-1)^j\, \dfrac{(2\ell+j)!}{j!\,(\ell!)^2}\, p^{\ell}\, s^{\ell+j},$

using the negative binomial expansion $(1+s)^{-(2\ell+1)} = \sum_{j=0}^{\infty} (-1)^j \binom{2\ell+j}{j} s^j$, which completes the proof.
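Identity (2.1.6), together with the orthogonality properties (2.1.2)–(2.1.4) behind it, can be confirmed in exact arithmetic for particular m, k, and rational p. A sketch (our own helper names):

```python
import math
from fractions import Fraction

def c_k(r, m, k):
    """Tchebycheff-Hermite polynomial c_k(r, m) from (2.1.1), exact."""
    s = sum((-1) ** j * math.comb(r, k - j) * math.comb(2 * k - j, k)
            * math.comb(m - k + j, j) for j in range(k + 1))
    return Fraction(math.factorial(k) ** 3, math.factorial(2 * k)) * s

def legendre_exact(k, t):
    """L_k(t) for rational t, via the recurrence (2.1.5)."""
    if k == 0:
        return Fraction(1)
    prev, cur = Fraction(1), Fraction(t)
    for n in range(2, k + 1):
        prev, cur = cur, (Fraction(2 * n - 1) * t * cur - (n - 1) * prev) / n
    return cur

def saw_chow_holds(m, k, p):
    """Check (2.1.6) exactly for a rational p."""
    lhs = sum(math.comb(m, r) * p ** r * (1 - p) ** (m - r) * c_k(r, m, k)
              for r in range(m + 1))
    rhs = Fraction(math.factorial(m) * math.factorial(k) ** 2,
                   math.factorial(m - k) * math.factorial(2 * k)) \
          * legendre_exact(k, 2 * p - 1)
    return lhs == rhs

assert saw_chow_holds(6, 3, Fraction(1, 3))
assert sum(c_k(r, 5, 2) for r in range(6)) == 0                   # (2.1.2)
assert sum(c_k(r, 5, 1) * c_k(r, 5, 3) for r in range(6)) == 0    # (2.1.3)
assert sum(c_k(r, 5, 1) ** 2 for r in range(6)) == Fraction(35, 2)  # (2.1.4)
```

Because everything is rational, equality here is exact rather than approximate, which makes the check a genuine verification for the chosen m, k, and p.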
2.2 The Fundamental Result

In this section we use the results of the last section and Lemma 1 to obtain a finite series expression for $P_{r:m}(\rho)$.

First we let $p = 1 - G(\theta y)$ in equation (2.1.6). Then we have

$\sum_{r=0}^{m} \binom{m}{r} [1 - G(\theta y)]^r [G(\theta y)]^{m-r}\, c_k(r,m) = \dfrac{m!\,(k!)^2}{(m-k)!\,(2k)!}\, L_k[1 - 2G(\theta y)].$

Multiplication of the above equation by g(y) and integration with respect to y gives

(2.2.1)  $\sum_{r=0}^{m} c_k(r,m) \int_{-\infty}^{\infty} \binom{m}{r} [1 - G(\theta y)]^r [G(\theta y)]^{m-r}\, g(y)\, dy = \dfrac{m!\,(k!)^2}{(m-k)!\,(2k)!} \int_{-\infty}^{\infty} L_k[1 - 2G(\theta y)]\, g(y)\, dy.$

Defining $J_k(\rho)$ by

(2.2.2)  $J_k(\rho) = \int_{-\infty}^{\infty} L_k[1 - 2G(\theta y)]\, g(y)\, dy,$

where $\rho = \theta^2 (1+\theta^2)^{-1}$, and applying Lemma 1 to the left-hand side of (2.2.1), we have

(2.2.3)  $\sum_{r=0}^{m} c_k(r,m)\, P_{r:m}(\rho) = \dfrac{m!\,(k!)^2}{(m-k)!\,(2k)!}\, J_k(\rho).$
An alternate expression for $P_{r:m}(\rho)$ can be obtained by noting that for fixed m the set of points $P_{r:m}(\rho)$, r = 0, 1, ..., m, lie on a polynomial of degree at most m. Hence, for some constants, say $e_0, e_1, \dots, e_m$, we can write

(2.2.4)  $P_{r:m}(\rho) = \sum_{j=0}^{m} e_j\, c_j(r,m).$

Multiplying (2.2.4) by $c_k(r,m)$ and summing over r yields

(2.2.5)  $\sum_{r=0}^{m} c_k(r,m)\, P_{r:m}(\rho) = \sum_{r=0}^{m} \sum_{j=0}^{m} e_j\, c_j(r,m)\, c_k(r,m).$

Using the properties (2.1.3) and (2.1.4) of $c_k(r,m)$, the right-hand side above reduces to

$e_k \sum_{r=0}^{m} c_k(r,m)^2 = e_k\, \dfrac{(k!)^2 \binom{m+k+1}{2k+1}}{\binom{2k}{k}}.$

Hence, the constants $e_0, e_1, \dots, e_m$ can be determined by equating the right-hand sides of equations (2.2.3) and (2.2.5). That is,

$e_k\, \dfrac{(k!)^2 \binom{m+k+1}{2k+1}}{\binom{2k}{k}} = \dfrac{m!\,(k!)^2}{(m-k)!\,(2k)!}\, J_k(\rho).$

After slight simplification, we have

$e_k = \left[ \dfrac{m!\,(2k+1)!}{(k!)^2\,(m+k+1)!} \right] J_k(\rho), \qquad k = 0, 1, \dots, m.$

Letting $b_k(m)$ denote the constant in brackets and substituting into equation (2.2.4), we have

(2.2.6)  $P_{r:m}(\rho) = \sum_{k=0}^{m} c_k(r,m)\, b_k(m)\, J_k(\rho).$

Before investigating the integral $J_k(\rho)$ we should comment on the utility of the expression for $P_{r:m}(\rho)$ given by (2.2.6). Most important, by defining $J_k(\rho)$ appropriately, the expression is valid for both positive and negative values of rho. Next, the integral $J_k(\rho)$ does not depend on r or m. Hence, for a given value of $\rho$, only one set of values $J_k(\rho)$, k = 0, 1, ..., is needed. Also, as will be shown in the next section, $J_k(\rho) = 0$ for odd k, thus decreasing the number of terms in the series by one-half. Furthermore, for large m, the series may be truncated without serious effect, since the factor $c_k(r,m)\, b_k(m)$ approaches zero as k increases.
2.3 The Integral $J_k(\rho)$

Consider the integral

$J_k(\rho) = \int_{-\infty}^{\infty} L_k[1 - 2G(\theta y)]\, g(y)\, dy$

defined for k = 0, 1, ... in the last section. Using the recurrence relation (2.1.5), it can be seen that the Legendre polynomial $L_k(t)$ is an even or odd function in t, depending on whether k is even or odd, respectively. Also, when rho is positive, $G(\theta y)$ is the normal distribution function, which implies $1 - 2G(\theta y)$ is an odd function in y. Therefore, $L_k[1 - 2G(\theta y)]$ is an odd function in y when k is odd and an even function in y when k is even. Since g(y), the normal density function, is an even function in y, it follows that $J_k(\rho)$, rho positive, equals zero when k is odd, and for even k

$J_k(\rho) = 2 \int_{0}^{\infty} L_k[2G(\theta y) - 1]\, g(y)\, dy.$

After making the transformation $u = \theta y$, $J_k(\rho)$ becomes

$J_k(\rho) = 2h^{1/2} \int_{0}^{\infty} L_k[2G(u) - 1]\, [e^{-u^2/2}]^{h-1}\, dG(u),$

where $h = \theta^{-2}$, so that $\rho = (h+1)^{-1}$, and $0 < \rho < 1$ implies $0 < h < \infty$.
Next consider the integral $J_k(\rho)$ when rho is negative. Now $\theta$ is imaginary and is written as $\theta = i\varphi$, where $\varphi = [-\rho/(1-\rho)]^{1/2}$. Hence, the function $G(\theta y)$ appearing in $J_k(\rho)$ is complex, and in order to simplify $J_k(\rho)$ the real and imaginary parts of $G(i\varphi y)$ must be isolated. Denote these real and imaginary parts by $\alpha(\varphi y)$ and $\beta(\varphi y)$, respectively. Then $G(i\varphi y)$ and its conjugate can be written

(2.3.1)  $G(i\varphi y) = \alpha(\varphi y) + i\beta(\varphi y), \qquad \overline{G(i\varphi y)} = \alpha(\varphi y) - i\beta(\varphi y).$

Using the definition of $G(i\varphi y)$ given in equation (1.3.2), we have

$2\alpha(\varphi y) = e^{\varphi^2 y^2/2} \left\{ \int_{-\infty}^{0} e^{-i\varphi t y}\, g(t)\, dt + \int_{-\infty}^{0} e^{i\varphi t y}\, g(t)\, dt \right\},$

so that

$\alpha(\varphi y) = e^{\varphi^2 y^2/2} \int_{-\infty}^{0} \cos(t\varphi y)\, g(t)\, dt = \tfrac12,$

since, by the symmetry of g, the integral is half of $\int_{-\infty}^{\infty} \cos(t\varphi y)\, g(t)\, dt = e^{-\varphi^2 y^2/2}$. Subtracting equations (2.3.1) gives

$2i\beta(\varphi y) = e^{\varphi^2 y^2/2} \left\{ \int_{-\infty}^{0} e^{-i\varphi t y}\, g(t)\, dt - \int_{-\infty}^{0} e^{i\varphi t y}\, g(t)\, dt \right\} = e^{\varphi^2 y^2/2} \int_{0}^{\infty} \left( e^{it\varphi y} - e^{-it\varphi y} \right) g(t)\, dt.$

Since $e^{it\varphi y} - e^{-it\varphi y} = 2i \sin(t\varphi y)$, we have

(2.3.2)  $\beta(\varphi y) = e^{\varphi^2 y^2/2} \int_{0}^{\infty} \sin(t\varphi y)\, g(t)\, dt.$

$\beta(\varphi y)$ can be further simplified through integration by parts and differentiation of (2.3.2). First, integrating by parts, we have

(2.3.3)  $\beta(\varphi y) = e^{\varphi^2 y^2/2} \left[ \dfrac{1}{\varphi y} \cdot \dfrac{1}{\sqrt{2\pi}} - \dfrac{1}{\varphi y} \int_{0}^{\infty} t \cos(t\varphi y)\, g(t)\, dt \right].$

Next, differentiating $\beta(\varphi y)$ with respect to $\varphi y$ yields

(2.3.4)  $\dfrac{d\beta(\varphi y)}{d(\varphi y)} = \varphi y\, e^{\varphi^2 y^2/2} \int_{0}^{\infty} \sin(t\varphi y)\, g(t)\, dt + e^{\varphi^2 y^2/2} \int_{0}^{\infty} t \cos(t\varphi y)\, g(t)\, dt,$

where differentiation was permitted inside the integral, since $|t \cos(t\varphi y)\, g(t)| \le t\, g(t)$, which is integrable. Combining equations (2.3.2), (2.3.3), and (2.3.4), we have

$\dfrac{d\beta(\varphi y)}{d(\varphi y)} = \varphi y\, \beta(\varphi y) + \dfrac{1}{\sqrt{2\pi}}\, e^{\varphi^2 y^2/2} - \varphi y\, \beta(\varphi y) = \dfrac{1}{\sqrt{2\pi}}\, e^{\varphi^2 y^2/2}.$

It follows, since $\beta(0) = 0$, that

(2.3.5)  $\beta(\varphi y) = \int_{0}^{\varphi y} \dfrac{1}{\sqrt{2\pi}}\, e^{t^2/2}\, dt.$

Hence the complex function $G(i\varphi y)$ can be written as

$G(i\varphi y) = \dfrac12 + i \int_{0}^{\varphi y} \dfrac{1}{\sqrt{2\pi}}\, e^{t^2/2}\, dt = \dfrac12 + i\beta(\varphi y).$
Returning to the integral $J_k(\rho)$, we can now write

$J_k(\rho) = \int_{-\infty}^{\infty} L_k[-2i\beta(\varphi y)]\, g(y)\, dy.$

Although the Legendre polynomial has an imaginary argument, its definition and recurrence relation still hold. In fact, $L_k[2i\beta(\varphi y)]$ is an even function in y when k is even and an odd function in y when k is odd. Therefore, as in the case when rho is positive, $J_k(\rho)$ equals zero for odd k, and for even k (for which $L_k(-t) = L_k(t)$)

$J_k(\rho) = 2 \int_{0}^{\infty} L_k[2i\beta(\varphi y)]\, g(y)\, dy.$

After making the transformations $u = \varphi y$ and $h = -\varphi^{-2}$, $J_k(\rho)$ becomes

$J_k(\rho) = 2(-h)^{1/2} \int_{0}^{\infty} L_k[2i\beta(u)]\, [e^{-u^2/2}]^{-h-1}\, dG(u),$

where $\rho = (h+1)^{-1}$ and $-(m-1)^{-1} < \rho < 0$ imply that $h < -m$, $m \ge 2$.

The following lemma summarizes the results of this chapter.
Lemma 2

$P_{r:m}(\rho) = \sum_{\substack{k=0 \\ k\ \mathrm{even}}}^{m} c_k(r,m)\, b_k(m)\, J_k(\rho),$

where

$c_k(r,m) = \dfrac{(k!)^3}{(2k)!} \sum_{j=0}^{k} (-1)^j \binom{r}{k-j} \binom{2k-j}{k} \binom{m-k+j}{j},$

$b_k(m) = \dfrac{m!\,(2k+1)!}{(k!)^2\,(m+k+1)!},$

and, for even k, $J_k(\rho)$ is the integral

$J_k(\rho) = 2|h|^{1/2} \int_{0}^{\infty} L_k[h(u)]\, [e^{-u^2/2}]^{|h|-1}\, dG(u)$

with

$\rho = \dfrac{1}{h+1}$

and

$h(u) = \begin{cases} 2G(u) - 1, & h > 0, \\[1ex] 2i \displaystyle\int_{0}^{u} \dfrac{1}{\sqrt{2\pi}}\, e^{t^2/2}\, dt, & h < -m,\ m \ge 2. \end{cases}$

$L_k(t)$ is defined by the recurrence relation

$L_0(t) = 1, \qquad L_1(t) = t, \qquad L_k(t) = \dfrac{2k-1}{k}\, t\, L_{k-1}(t) - \dfrac{k-1}{k}\, L_{k-2}(t), \qquad k \ge 2.$
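For $\rho > 0$, Lemma 2 can be turned into code directly: compute each $J_k$ by quadrature in y and sum the even-k terms. A sketch (our own names and ad hoc quadrature grid, not the dissertation's program):

```python
import math

def G(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def legendre(k, t):
    if k == 0:
        return 1.0
    prev, cur = 1.0, t
    for n in range(2, k + 1):
        prev, cur = cur, ((2 * n - 1) * t * cur - (n - 1) * prev) / n
    return cur

def c_k(r, m, k):
    return (math.factorial(k) ** 3 / math.factorial(2 * k)) * sum(
        (-1) ** j * math.comb(r, k - j) * math.comb(2 * k - j, k)
        * math.comb(m - k + j, j) for j in range(k + 1))

def b_k(m, k):
    return (math.factorial(m) * math.factorial(2 * k + 1)
            / (math.factorial(k) ** 2 * math.factorial(m + k + 1)))

def J_k(k, rho, n=2000, lim=8.0):
    """J_k(rho) = int L_k[1 - 2 G(theta y)] g(y) dy, rho > 0, by Simpson's rule."""
    theta = math.sqrt(rho / (1.0 - rho))
    f = lambda y: (legendre(k, 1.0 - 2.0 * G(theta * y))
                   * math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi))
    h = 2.0 * lim / n
    return (f(-lim) + f(lim) + sum((4 if j % 2 else 2) * f(-lim + j * h)
                                   for j in range(1, n))) * h / 3.0

def p_rm_series(r, m, rho):
    """P_{r:m}(rho) by the finite series of Lemma 2 (even k only)."""
    return sum(c_k(r, m, k) * b_k(m, k) * J_k(k, rho)
               for k in range(0, m + 1, 2))
```

At $\rho = 1/2$ this reproduces $P_{r:m}(\tfrac12) = 1/(m+1)$, and for m = 2 it matches the closed form $P_2(\rho) = \tfrac14 + \tfrac{1}{2\pi}\sin^{-1}\rho$; both results are derived in Section 3.1.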
CHAPTER 3
NUMERICAL RESULTS
3.1 Exact Results

In general the value of the integral $J_k(\rho)$ can only be approximated, so that exact results for $P_{r:m}(\rho)$ are not available. However, exact values of $J_0(\rho)$ and $J_2(\rho)$ can be found. Then, since $J_k(\rho)$ is independent of m and r, $P_{r:2}(\rho)$, r = 0, 1, 2, and $P_{r:3}(\rho)$, r = 0, 1, 2, 3, can be determined.

Since $L_0(t) = 1$, we have from Lemma 2,

$J_0(\rho) = 2|h|^{1/2} \int_{0}^{\infty} [e^{-u^2/2}]^{|h|-1}\, dG(u) = 2|h|^{1/2} \int_{0}^{\infty} \dfrac{1}{\sqrt{2\pi}}\, e^{-|h| u^2/2}\, du = 1.$

$J_2(\rho)$ can be determined indirectly by first finding $P_2(\rho)$. Letting m = r = 2 in Lemma 2,

$P_{2:2}(\rho) = P_2(\rho) = c_0(2,2)\, b_0(2)\, J_0(\rho) + c_2(2,2)\, b_2(2)\, J_2(\rho),$

so that

(3.1.1)  $J_2(\rho) = \dfrac{P_2(\rho) - c_0(2,2)\, b_0(2)\, J_0(\rho)}{c_2(2,2)\, b_2(2)}.$
The value of $P_2(\rho)$ can be found in closed form by integrating the original expression for $P_{r:m}(\rho)$ given in equation (1.2.8). With m = r = 2, we have

$P_2(\rho) = \dfrac{1}{2\pi}\, (1-\rho^2)^{-1/2} \int_{0}^{\infty} \int_{0}^{\infty} \exp\left\{ -\dfrac{x_1^2 + x_2^2 - 2\rho x_1 x_2}{2(1-\rho^2)} \right\} dx_1\, dx_2.$

Making the transformation

$u_1 = \dfrac{x_1 - \rho x_2}{(1-\rho^2)^{1/2}}, \qquad u_2 = x_2,$

it follows that

$x_1 = (1-\rho^2)^{1/2} u_1 + \rho u_2, \qquad x_2 = u_2,$

and the Jacobian of the transformation is

$J(x_1, x_2 \to u_1, u_2) = \begin{vmatrix} (1-\rho^2)^{1/2} & \rho \\ 0 & 1 \end{vmatrix} = (1-\rho^2)^{1/2}.$

Therefore,

$P_2(\rho) = \dfrac{1}{2\pi} \int_{0}^{\infty} \int_{-\rho u_2/(1-\rho^2)^{1/2}}^{\infty} e^{-(u_1^2 + u_2^2)/2}\, du_1\, du_2.$

Finally, making the polar transformation

$u_1 = r \cos\vartheta, \qquad u_2 = r \sin\vartheta,$

we have

$P_2(\rho) = \dfrac{1}{2\pi} \int_{0}^{\cos^{-1}(-\rho)} d\vartheta \int_{0}^{\infty} r\, e^{-r^2/2}\, dr = \dfrac{\cos^{-1}(-\rho)}{2\pi} = \dfrac14 + \dfrac{1}{2\pi} \sin^{-1}\rho.$
Next we need the values of $c_k(r,m)$ and $b_k(m)$ for k = 0 and k = 2. For k = 0, $c_0(r,m) = 1$, and for k = 2,

$c_2(r,m) = r(r-1) - r(m-1) + \dfrac16\, m(m-1).$

Thus for m = 2, $c_2(r,2)$ equals 1/3, -2/3, and 1/3 for r = 0, 1, and 2, respectively, and for m = 3, $c_2(r,3)$ equals 1, -1, -1, and 1 for r = 0, 1, 2, and 3, respectively. The constants $b_0(m)$ and $b_2(m)$ are given by

$b_0(m) = \dfrac{1}{m+1}, \qquad b_2(m) = \dfrac{30}{(m+1)(m+2)(m+3)}.$

Substituting m = 2 and m = 3 gives

$b_0(2) = \dfrac13, \quad b_0(3) = \dfrac14, \quad b_2(2) = \dfrac12, \quad b_2(3) = \dfrac14.$
Now using (3.1.1) we find that

$J_2(\rho) = \dfrac{3}{\pi} \sin^{-1}\rho - \dfrac12.$

Finally, substituting the above results into Lemma 2, we have, for m = 2,

$P_{0:2}(\rho) = \dfrac14 + \dfrac{1}{2\pi} \sin^{-1}\rho,$
$P_{1:2}(\rho) = \dfrac12 - \dfrac{1}{\pi} \sin^{-1}\rho,$
$P_{2:2}(\rho) = \dfrac14 + \dfrac{1}{2\pi} \sin^{-1}\rho,$

and for m = 3,

$P_{0:3}(\rho) = \dfrac18 + \dfrac{3}{4\pi} \sin^{-1}\rho,$
$P_{1:3}(\rho) = \dfrac38 - \dfrac{3}{4\pi} \sin^{-1}\rho,$
$P_{2:3}(\rho) = \dfrac38 - \dfrac{3}{4\pi} \sin^{-1}\rho,$
$P_{3:3}(\rho) = \dfrac18 + \dfrac{3}{4\pi} \sin^{-1}\rho.$
Notice in the above results that

$P_{r:m}(\rho) = P_{m-r:m}(\rho)$

and

$\sum_{r=0}^{m} P_{r:m}(\rho) = 1.$

These properties also hold in general. The first result follows immediately by replacing r with m-r in Lemma 1 and noting that $G(-\theta y) = 1 - G(\theta y)$ for both positive and negative rho. Lemma 2 is used in showing the second property. Since $\sum_{r=0}^{m} c_k(r,m) = 0$ for k = 1, 2, ..., we have

$\sum_{r=0}^{m} P_{r:m}(\rho) = \sum_{r=0}^{m} \sum_{k=0}^{m} b_k(m)\, c_k(r,m)\, J_k(\rho) = b_0(m)\, J_0(\rho) \sum_{r=0}^{m} c_0(r,m) = \dfrac{1}{m+1} \cdot 1 \cdot (m+1) = 1.$
$P_{r:m}(\rho)$ can also be computed exactly in the case when $\rho = \tfrac12$. From Lemma 1,

$P_{r:m}(\tfrac12) = \binom{m}{r} \int_{-\infty}^{\infty} [1 - G(y)]^r [G(y)]^{m-r}\, g(y)\, dy,$

since $\theta = 1$ when $\rho = \tfrac12$. But

$\binom{m}{r} [1 - G(y)]^r [G(y)]^{m-r}\, g(y) = \dfrac{1}{m+1} \left\{ \dfrac{(m+1)!}{(m-r)!\, r!}\, [G(y)]^{m-r} [1 - G(y)]^r\, g(y) \right\},$

where the term inside the braces is the density of the (m-r+1)st order statistic from a normal random sample of size (m+1). Therefore,

$P_{r:m}(\tfrac12) = \dfrac{1}{m+1}, \qquad r = 0, 1, \dots, m.$

This last result implies that $J_k(\tfrac12) = 0$, $k \ge 2$. From Lemma 2,

$P_{r:m}(\tfrac12) = \sum_{\substack{k=0 \\ k\ \mathrm{even}}}^{m} b_k(m)\, c_k(r,m)\, J_k(\tfrac12) = \dfrac{1}{m+1} + \sum_{\substack{k=2 \\ k\ \mathrm{even}}}^{m} b_k(m)\, c_k(r,m)\, J_k(\tfrac12).$

Since the last sum equals zero for all r and m, and since $b_k(m)\, c_k(r,m)$ does not vanish identically, $J_k(\tfrac12) = 0$.
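The closed forms for m = 2 and m = 3 give a convenient end-to-end check of the machinery above. A Monte Carlo sketch (ours) that samples the representation $X_i = (Y_i - \theta Y_0)/\sqrt{1+\theta^2}$ of Section 1.3 and compares observed frequencies with the formulas:

```python
import math, random

def simulate_counts(m, rho, trials=200_000, seed=7):
    """Relative frequency of 'exactly r of m positive' under common correlation rho > 0."""
    theta = math.sqrt(rho / (1.0 - rho))
    rng = random.Random(seed)
    freq = [0] * (m + 1)
    for _ in range(trials):
        y0 = rng.gauss(0.0, 1.0)
        r = sum(1 for _ in range(m) if rng.gauss(0.0, 1.0) > theta * y0)
        freq[r] += 1
    return [c / trials for c in freq]

rho = 0.3
s = math.asin(rho)
exact = [0.125 + 3 * s / (4 * math.pi),   # P_{0:3}
         0.375 - 3 * s / (4 * math.pi),   # P_{1:3}
         0.375 - 3 * s / (4 * math.pi),   # P_{2:3}
         0.125 + 3 * s / (4 * math.pi)]   # P_{3:3}
observed = simulate_counts(3, rho)
```

With 200,000 trials the standard error of each frequency is about 0.001, so agreement to two decimal places is expected.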
3.2 Evaluating the Integral $J_k(\rho)$

The Case When Rho Is Positive

Recall from Lemma 2, with $\rho = (h+1)^{-1} > 0$, that

$J_k(\rho) = 2h^{1/2} \int_{0}^{\infty} L_k[2G(u) - 1]\, [e^{-u^2/2}]^{h-1}\, dG(u).$

Before attempting numerical integration, the integrand should first be investigated for different values of k and $\rho$. For given values of G(u), the Legendre polynomials can be evaluated from the recurrence relation (2.1.5). Also, the value of u corresponding to G(u) is given to eight decimal places in The Kelley Statistical Tables. For each of the cases $\rho$ = 1/10, 1/2, and 9/10 the integrand is plotted in Figures 1, 2, and 3, respectively, for both k = 2 and k = 10.

Since numerical integration is most accurate when the function being integrated is well behaved, we should expect good results for $\rho \le \tfrac12$, with the accuracy increasing as rho decreases. Also, since k is the order of the polynomial being integrated, the better results should occur when k is small. When $\rho > \tfrac12$, the integrand approaches infinity as G(u) approaches one, as can be seen from Figure 3. In this case, it is not likely that numerical integration will give accurate results.
Figure 1. The function $h^{1/2} L_k[2G(u)-1]\,[e^{-u^2/2}]^{h-1}$, $\rho = 1/10$, k = 2 and k = 10.

Figure 2. The function $h^{1/2} L_k[2G(u)-1]\,[e^{-u^2/2}]^{h-1}$, $\rho = 1/2$, k = 2 and k = 10.

Figure 3. The function $h^{1/2} L_k[2G(u)-1]\,[e^{-u^2/2}]^{h-1}$, $\rho = 9/10$, k = 2 and k = 10.

Using Simpson's rule with 200 intervals, the interval width equals .0025, with G(u) taking the values .5 + (.0025)j, j = 0, 1, ..., 200. Then $J_k(\rho)$ can be approximated by

$J_k(\rho) \approx 2h^{1/2}\, \dfrac{.0025}{3} \sum_{j=0}^{200} c_j\, L_k(.005j)\, [e^{-u_j^2/2}]^{h-1},$

where

$c_0 = c_{200} = 1, \qquad c_j = 4,\ j\ \mathrm{odd}, \qquad c_j = 2,\ j\ \mathrm{even}\ (0 < j < 200),$

and $u_j$ is the value of u satisfying $G(u_j) = .5 + (.0025)j$.
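The Simpson scheme just described can be reproduced directly; the only extra ingredient is an inverse of G for the grid points $u_j$. A sketch (ours; bisection for the inverse, and the j = 200 endpoint taken as its limit 0, which is valid when h > 1, i.e. $\rho < 1/2$):

```python
import math

def G(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def G_inv(p):
    """Inverse of G on [.5, 1) by bisection."""
    lo, hi = 0.0, 40.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if G(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def legendre(k, t):
    if k == 0:
        return 1.0
    prev, cur = 1.0, t
    for n in range(2, k + 1):
        prev, cur = cur, ((2 * n - 1) * t * cur - (n - 1) * prev) / n
    return cur

def J_k_simpson(k, rho):
    """Simpson's rule on the G(u) grid .5 + .0025 j, as in the text (0 < rho < 1/2)."""
    h = (1.0 - rho) / rho               # rho = (h + 1)^{-1}
    total = 0.0
    for j in range(201):
        cj = 1 if j in (0, 200) else (4 if j % 2 else 2)
        if j == 200:
            val = 0.0                   # limit of the integrand as G(u) -> 1, h > 1
        else:
            u = G_inv(0.5 + 0.0025 * j)
            val = legendre(k, 0.005 * j) * math.exp(-0.5 * (h - 1.0) * u * u)
        total += cj * val
    return 2.0 * math.sqrt(h) * (0.0025 / 3.0) * total
```

With $\rho = 1/4$ this gives $J_0 \approx 1$ and $J_2$ close to the exact $(3/\pi)\sin^{-1}\rho - \tfrac12$ of Section 3.1, illustrating the accuracy of the 200-interval rule for $\rho < 1/2$.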
The Case When Rho Is Negative

With $\rho < 0$, the Legendre polynomial has the imaginary argument $if(u)$, where f(u) is given by

$f(u) = \dfrac{2}{\sqrt{2\pi}} \int_{0}^{u} e^{t^2/2}\, dt.$

For given values of u, the function f(u) can be evaluated quite rapidly by first expanding the integrand in a Maclaurin series and integrating term by term. That is,

$f(u) = \dfrac{2}{\sqrt{2\pi}} \int_{0}^{u} \sum_{j=0}^{\infty} \dfrac{t^{2j}}{2^j\, j!}\, dt = \dfrac{2}{\sqrt{2\pi}} \sum_{j=0}^{\infty} \int_{0}^{u} \dfrac{t^{2j}}{2^j\, j!}\, dt = \sum_{j=0}^{\infty} \dfrac{2\, u^{2j+1}}{\sqrt{2\pi}\, 2^j\, (2j+1)\, j!}.$

Letting $f_j(u)$ denote the (j+1)st term of the sum, we have the recurrence relation

$f_0(u) = \dfrac{2u}{\sqrt{2\pi}},$
$f_j(u) = \dfrac{u^2\, (2j-1)}{2j\, (2j+1)}\, f_{j-1}(u), \qquad j = 1, 2, \dots,$

which can be easily programmed.
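The term-by-term recurrence for f(u) is a two-line loop. A sketch (ours), checked against direct quadrature of the defining integral:

```python
import math

def f_series(u, terms=60):
    """f(u) = (2/sqrt(2 pi)) * int_0^u e^{t^2/2} dt, by the Maclaurin recurrence."""
    term = 2.0 * u / math.sqrt(2.0 * math.pi)   # f_0(u)
    total = term
    for j in range(1, terms):
        term *= u * u * (2 * j - 1) / (2 * j * (2 * j + 1))
        total += term
    return total

def f_quadrature(u, n=20_000):
    """Midpoint-rule evaluation of the same integral, for comparison."""
    h = u / n
    s = sum(math.exp(0.5 * ((i + 0.5) * h) ** 2) for i in range(n))
    return 2.0 * s * h / math.sqrt(2.0 * math.pi)
```

For moderate u the series converges in a handful of terms, since the term ratio $u^2(2j-1)/[2j(2j+1)]$ shrinks like $u^2/(2j)$.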
Although the Legendre polynomial has the imaginary argument if(u), for even values of k it is a real function. Hence, for computing purposes, we can avoid complex numbers by defining the function $L_k^*(t)$ by the recurrence relation

(3.2.1)  $L_0^*(t) = 1, \qquad L_1^*(t) = t, \qquad L_k^*(t) = (-1)^{k+1}\, \dfrac{2k-1}{k}\, t\, L_{k-1}^*(t) - \dfrac{k-1}{k}\, L_{k-2}^*(t), \qquad k \ge 2.$

Then the function $L_k(it)$ can be determined by

$L_k(it) = \begin{cases} L_k^*(t), & k\ \mathrm{even}, \\ i\, L_k^*(t), & k\ \mathrm{odd}. \end{cases}$

The above relation can be verified by substituting it into the recurrence relation (3.2.1) and comparing the results with the recurrence relation for $L_k(t)$ given by (2.1.5).
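The starred recurrence can be checked against the ordinary recurrence (2.1.5) evaluated with complex arithmetic. A sketch (ours):

```python
def legendre_star(k, t):
    """L_k*(t) from (3.2.1); L_k(it) = L_k*(t) for even k, i L_k*(t) for odd k."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, t
    for n in range(2, k + 1):
        sign = -1.0 if n % 2 == 0 else 1.0   # (-1)^{n+1}
        prev, cur = cur, (sign * (2 * n - 1) * t * cur - (n - 1) * prev) / n
    return cur

def legendre_complex(k, z):
    """Ordinary recurrence (2.1.5), evaluated at a complex argument."""
    if k == 0:
        return 1.0 + 0.0j
    prev, cur = 1.0 + 0.0j, z
    for n in range(2, k + 1):
        prev, cur = cur, ((2 * n - 1) * z * cur - (n - 1) * prev) / n
    return cur
```

For example, legendre_star(2, t) reproduces $L_2(it) = -(3t^2+1)/2$, a real number, with no complex arithmetic.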
Once the Legendre function $L_k[if(u)]$ is evaluated, we can examine the integrand, $(-h)^{1/2}\, L_k[if(u)]\, [e^{-u^2/2}]^{-h-1}$, of $J_k(\rho)$. The integrand is plotted in Figure 4 for $\rho = -1/2$, k = 2, and for $\rho = -1/10$, k = 2 and k = 10. Notice that the scale of the graph differs from those in Figures 1, 2, and 3, since, for large k, $|J_k(\rho)|$ is quite large. Also, unlike the case when rho is positive, the functions do not cross the G(u) axis. These differences, together with the smoothness of the curves, should make numerical integration even more accurate for $\rho < 0$.

Figure 4. The function $(-h)^{1/2}\, L_k[if(u)]\, [e^{-u^2/2}]^{-h-1}$, $\rho = -1/2$ and $\rho = -1/10$.
3.3 Computing $P_m(\rho)$

Once the $J_k(\rho)$, k = 0, 2, ..., m, are computed, $P_{r:m}(\rho)$ is determined, for r = 0, 1, ..., m, by the expression given in Lemma 2. That is,

(3.3.1)  $P_{r:m}(\rho) = \sum_{\substack{k=0 \\ k\ \mathrm{even}}}^{m} b_k(m)\, c_k(r,m)\, J_k(\rho).$

Unfortunately, since the Tchebycheff-Hermite polynomial $c_k(r,m)$ cannot be expressed in a recurrence relation, it is difficult to compute $P_{r:m}(\rho)$ using the above expression, except for small values of m. However, with r = m, $c_k(r,m)$ simplifies to, say, $c_k(m)$,

(3.3.2)  $c_k(m) = \dfrac{m!\,(k!)^2}{(m-k)!\,(2k)!}.$

Furthermore, the probability $P_{r:m}(\rho)$ can be written as a linear combination of $P_k(\rho)$, k = r, r+1, ..., m. That is,

(3.3.3)  $P_{r:m}(\rho) = \binom{m}{r} \sum_{j=0}^{m-r} (-1)^j \binom{m-r}{j}\, P_{r+j}(\rho).$

(The proofs of (3.3.2) and (3.3.3) are given at the end of this section.) Hence, we use (3.3.1) only in the special case when r = m. Relation (3.3.3) then can be used for computing $P_{r:m}(\rho)$ when $r \ne m$.
39
A further simplification in the computation of P (p) can be
m
made by combining the constants bk(m) and ck(m). Letting dk(m) denote the product, we have
dk(m) = bk(m) ck(m)
2
m!(2k+l)! m!(k)
(k:) 2 (m+k+l)! (mk)! (2k)!
(m.)2 (2k+l) (m+k+l)! (mk)!
The constant dk(m) can be computed for even values of k by the recurk
rence relation,
1
d (m) =
o m+l
(3.3.4)
2k+l mk+2 mk+l d (m) = d (m), k = 2,4,... ,m.
k 2k3 mik+1 m+k k2
Finally, combining the above results, we have the following computing formula for P (p):
m
m
(3.3.5) Pm(p) = E dk(m) Jk(P
k=O
k even
The proofs of (3.3.2) and (3.3.3) follow. To prove (3.3.2), we first let r=m in the definition of ck(r,m) given by (2.2.1). Then, writing combinations as factorials and cancelling like terms, we have
ck(m) (ki k)3m ) 2kj mk+j
k (2k)! 1 j=O ) k j
3=0
(k!)2 m! k(2kj)
(2k)! (=!kj) '.
40
Thus, we must show that the summation above equals one. This summation can be written as
k
j 2kj\ k1
E (_) (k / (kji
j=0
Then, letting Z = kj, the result to be proved becomes k1 k+1) (k)
(1) = 1.
=0
Next, we introduce the negative binomial and binomial identities given by
1 r (k+r~ r
(3.3.6) k+l (_1) (kr ar
(l+a) r=0
k k (3.3.7) 1 + = Z k
Multiplying the lefthand sides of (3.3.6) and (3.3.7),gives
k
(l+a) 1 1 ( L
k k+l k k () a
a (1+a) a (1+a) a k=0
Equating this product to the product of the righthand sides of (3.3.6)
and (3.3.7), we have
1 k r k+r k rSE (1) a = E ()1 ) a
a 2=0 r=0 =0
0 k
Finally, equating coefficients of a and dividing by (1)k, we have
k Ik k+ k\
E (1) = 1,
a=0
as was to be shown.
41
In proving (3.3.3), we begin with the basic definition of Pr:m(p). That is, (mr
P (p) =() PX >0 ...'Xr >0; Xr+ <0,... X <0.
r:m r \ 1 r r+1 m
Next, let A. be the event [X.>O] and denote its complement by A.. Also, define I as the intersection A1A ...Ak. Then Pr:m (p) can
k 1 2 k r:m
be written as
P (p)= P ...A
r:m r 1 r r+1.. Xm
w/ r 1A.. r Ar+1 XM)
= ( P(I r) P(A,.. .AA ..A .
(r r1r r+1 m)
Applying de Morgan's rules, and since P(I ) = P (p), we have r r
(m \r mr
r:m ( (r) LPr(P \P( 1 Ar U Ar+j
j=1
( ) r(p)PCPUA..AAr. .
r (P L r= 1 r r+j
j=1
Finally, using the formula for the probability of a union, we have the desired result. That is,
(m{ emr
P (0) = Pr (p) r P A AA
r:m r r L 1 r r+j)
j=1
Z P(A ...AA A )
+1r r+3 r+A
1
+ E E P(A...AA *A A
J < j<" r r+J1 r+J2 r+j3
42
F (A Arr A
S+(r)(mr)
,r ) m /mr 1 (I
3 (Ir+3) +..+ _m r
(M) m _)j(mr P (P
rj=0 )rj
3.4 Accuracy of the Results
The integrals Jk(p), the constants dk(m), and the probabilities P (p) were evaluated with doubleprecision accuracy using an IBM model
360 computer. The computed values of Jk(p) were checked against the
1
exact values for k=0 and k=2 and for p =. For k=0 and k=2.
the results were accurate to at least seven significant digits for !1 I and to five significant digits for 1 1 was
computed accurately to six significant digits for k< 14 and to five
aIf 1
significant digits for 16< k < 22. Hence, for 1p we would
expect Jk(p), k>2, to be accurate to at least the fifth significant digit. The computed values of Jk(p), k< 22, are given in Appendix 1
1 1
for p = , p = 2(1)25, and for p p = 3(1)26.
p P
1
As expected, with p > ., accurate values of Jk(p) were not obtained using the method of quadrature described in Section 3.2. Further investigation has to be made in order to find a means of
evaluating Jk(p) accurately for p > .
43
The constants d k(m) were evaluated exactly using the recurrence relation (3.3.4). They are tabulated in Appendix 2 for m<25, k = 0,2,.. m.
Finally, the probabilities P (p) were evaluated using the
m
formula (3.3.5). Results for p >0 were compared with Ruben's tables and were found accurate to at least five decimal places. For p <0, Steck's relation,
MI) P (1:2) mm 2 k PRl(k ) Pk Y '
rn 2 k=2 LI
k even
was used in making comparisons. Again, the computed values of Pm (p) were accurate to at least five decimal places. P (p) is tabulated in
1
Appendix 3 for p = , p = 2(l)(25) and 2 < m < 22, and for
1
p p, p = 2(1)21, and 2 < m < p.
CHAPTER 4
APPLICATION: A TEST FOR NORMLITY
4.1 Introduction
Consider using a random sample, Y 1 Y 2 ...,Y n from a continuous distribution F to test the hypotheses: H F is a normal distribution
0
H F is a skewed distribution.
a
As David (1962) suggested, one might consider the number of observations larger than the sample mean as a test statistic. Letting
1 if Y. >
i
0 if Y < V for j = 1,2,...,n, the test statistic, Y, can be expressed as
n
Y = E q i .
j=1
Without loss of generality, we can assume that the variables in the sample have been standardized to have zero mean and unit variance. Then, using the representation of Y given by (4.1.1) and assuming that H 0 is true, we can find the mean and variance of Y. For the mean, we have
44
45
[n
E(Y) ELZ C j=1
n
= Z2 E (Cp) j=1
n
= P( = 1)
j=l
n
= P(Y Y > O)
j=1
n
1
2
j=1
n
2'
since Y. Y is symmetrically distributed about zero. Before
calculating the variance of Y, we first need the variance of CP and
covariance of c; and k. We have
Var (p.) = E(.) [E(j)]
E (C j) ( )2
2
1
and
cov (jk) = E(p k) E(c ) E(Cpk)
= P(Y Y>O, Y Y>O) Sk 4
But the probability above is identical to P2(p), where p equals
cov (Y.Y, Y Y) corr (Y. Y, Y Y) = k
3 k
A/Var (Y.Y) Var (Y Y)
3
46
The covariance of Y. and Y k Y and variance of Y. Y are
coy (V IY k) coy (Y .,Y) coy (iYk Y+cov (YY)
0 1 1 +1
n n fl
n
and
Var (Y 2 cov (Y.,Y + Var(Y
2 1
n n
n
respectively. Hence,
1
n
Therefore,
co (%Jk~ =2 (n 4i)
1 si N and
(jn
Var (Y) =Var (z )
n
Z Var (cp)+ 2Z cov(CIT, k
n +n(n1) si1 (I ) T 2r s 1~
47
The standard deviation has been computed for n< 50 and is given in Table 1.
4.2 The Null Distribution Clearly the test statistic, Y, has a discrete distribution,
taking on values 1,2,...,n1 with positive probabilities. The probability of the event, [Y=r], r = 1,2,...,nl, can be written as
P (Y=r) = P(Cpi = 1,. ..ir = ;ir+ = 0'... 'i =0) ,
n L"1 " r+ 1 n
where the summation is over all partitions fi ,...,i ; i +1,....i n 1 r r+l n
of the set [1,2,...,n]. Then, since the variables 11'2 ... Yn are
identically distributed,
P (Y=r) = r P(1 = 1..... r= 1; Y r+=0,... n=0)
n\r /1 'r' r+1' n 0
= (n)P(YlY>O,... ,YY>0; Y r+ 7Y<0,
(r 1r r+1
...,Ynf<0).
After the transformation
U. = (Y. Y) n
Jrn
3 3 \n17
we have
P (Y= r) (n) P(U >0,... ,U >0; U <0,... ,U <0),
n r/ 1 r r+l1 n
where the normal variables U 1,U2 ...U n, each have mean zero and common variance and correlation given by
48
TABLE 1
STANDARD DEVIATION OF Y
n n
3 .50000 27 1.56581
4 .59242 28 1.59456
5 .66760 29 1.62281
6 73389 30 1.65058
7 .79416 31 1.67788
8 .84994 32 1.70475
9 .90214 33 1.73119
10 .95140 34 1.75724
11 .99818 35 1.78291
12 1.04283 36 1.80822
13 1.08562 37 1.83317
14 1.12678 38 1.85779
15 1.16646 39 1.88208
16 1.20484 40 1.90607
1.24202 41 1.92976
18 1.27811 42 1.95316
19 1.31320 43 1.97628
20 1.34738 44 1.99914
21 1.38071 45 2.02193
22 1.41325 46 2.04408
23 1.44505 47 2.06619
24 1.47617 48 2.08806
25 1.50664 49 2.19701
26 1.53651 50 2.13112
49
Var (U.) n L Var (.Y,
3 ni
n (1 1 )
ni n
and
corr (U.,U n . coy (y, Y, Y Y)
k ni
n (_1
ni n
1
ni
Hence, for m
Ult 2 .... ,U., have a multivariate normal distribution. Therefore,
P pk < m, is defined and can be computed, using the method discussed in Chapter 3. Furthermore, since P(Y=rn) = 0, we can set P (p) equai
n
to zero and use relation (3.3.3) to find P n(Y= r). That is, (4.2.1) Pn(Y= r) = (n n r () j (Y) p
where P 1 and P n(p) = 0. Using equation (4.2.i) and
the results of Chapter 3, the null distribution on Y was obtained. The results, for n < 22, are given in Appendix 4.
4.3 Approximations to the Null Distribution
David (1962) has shown that the asymptotic distribution of
1 i
is normal wihmean zeoand variance
wit zeo Hence, for
n 4 2TT
large n, we should be able to approximate the distribution on Y, using a normal distribution function. In particular, the critical values of
50
Y needed to form the rejection region can be determined using the approximation. With a 10%1 level of significance and a twotailed alternative, the critical values are the solutions to the equation
1 n
2 2 = 1.645
vVar (Y)
(Notice that 1. was added as a correction for continuity factor.) As an example, with n = 19, we have 1 19
1.31320 = 164
or
r =9 2.1
a7, 11.
As a check, from the small sample distribution on Y given in Appendix 4, we have
P 1 (Y< 7) = P 1 (Y;>ll) = .0595.
Table 2 compares the small sample distribution of Y with the normally
approximated distribution for n= 19. (The results of the approxiriation which will be discussed next arelisted in the third column.) We would expect the approximated results to increase in accuracy as the sample size increases.
An alternate approach to the problem of approximating the null distribution on Y is through the use of order statistics. Consider again the random sample of standardized variables, YY 20 ... Y.n Then, letting Y 0)denote the Vth largest order statistic, the events [Y< r] and CY(v :7 ] are equivalent if we set V equal to n r. Letting F (x)
51
TABLE 2
THE CUMULATIVE DISTRIBUTION OF Y, n= 19 Approximations
Small
Sample Edgeworth' st
r Distribution Normal Expansion
1 .0000 .0000 .0000
2 .0000 .0000 .0000
3 .0000 .0000 .0002
4 .0000 .0000 .0007
5 .0007 .0011 .0029
6 .0092 .0112 .0142
7 .0595 .0639 .0638
8 .2190 .2231 .2175
9 .5000 .5000 .5000
10 .7810 .7769 .7825
11 .9405 .9361 .9362
12 .9908 .9889 .9858
13 .9993 .9989 .9971
14 1. 0000 1.0000 .9993
15 1.0000 1.0000 .9998
16 1.0000 1.0000 1.0000
17 1.0000 1.0000 1.0000
18 1.0000 1.0000 1.0000
Uses f irst two Troments of Y.
t
Uses first four moments of Y ( Y.
52
represent the standardized distribution function of Y( Y, we can write
(4.3.1) P (Y < r) P(Y O)
PY( )Y <
2 V 2 v
where aIVnd 2 represent the mean and variance of Y(respectively. An approximation to the distribution F V(x) can be obtained by using Edgeworth's expansion (see, for example, Cramer, p. 229). Letting kgv represent the kth central moment of Y(V)7, we have, using the first four moments, (4.3.2) F (x) L G(x) g(x) 1 3 %) (X2
(4.3.2 F6 3/2(x)
13
+ 4 3) (x3 3x)
2 V
where G(x) and g(x) are the standard normal distribution and density functions, respectively. Thus letting x in equation (4.3.2),
2 v
we can approximate P (Y r).
n
By using a power series representation, the moments, k k = 2,3,4, can be determined. Saw (1958) has shown that the kth moment, k%, of the Vth order statistic can be expressed as
53
1
(4.3.3) k4V = H (p ,O,k)
j=0 (n+2)
= E H.(pv,k) n(j), j=0
where p and, for convenience, we have replaced H. (p ,0,k) and
V n+1 j v
1 by H (p ,k) and n(j), respectively. The constants, H (p ,k), (n+2)j V
are tabulated for k = 1,2,3,4,j SE 5, and p = .50, .55,...,.95 by Flora (1965). It follows, since the moments kE are functions of kV' that
k = C.(p ,k) n(j),
j=0
where the constants C.(p ,k) are functions of H (p ,k).
In order to find the C,(p ,k), we must first express the .3 V
k Vas functions of the kP Letting Y(t) represent the characteristic function of Y Y, we have
cp(t) = E[e ( )]
Since Y and Y )Y are independent, we can write
_(t) = E [e t(Y)
StY
2n (it)j
.e
E=e
j=0
54
00 it () 1 O
1.0AL(2n) jz
C O 2 O 2 1 + j i t
1=0 j=O (n
After the change of variable, k = 22 + j, it follows that
~~(t o (it)Jk Fk/2 k!______k=O L1=o 2(k21)! (2n) (2V
Therefore,
k k/2 k
2!(k22e)! (2n)2 and, in particular,
E(Y 0 Y) =IP
E(Y ) =1
00) 2 V n
E( y3 2
(V) Y 3PV n 1V
E (Y 46 2
\O Y) 4= I n 2 .\ +
n
Using this information, the central moments, k of Y 0 Y can be
expressed as functions of k4V For example, with k =2,
2 (YCV 2
~ E(Y 2Y 4+
2 2 V) n V 1
22
2 2? 'V 1PV n
55
Similarly;
V 3 \ 3 2N "V +p 21
6 2 3 _462
4 V = 4P ~V + 2 1~ 3 4V 1 6 YV 2V
n
6 3P4
n 2PJ 1 I
Finally, the constants, C.i(p.,k), can be determined as functions of H (pvk) by substituting the power series representation of k P and
equating coefficients of n(j). With k = 2, we have
2 1
2 v 2 2V 1V n
=Z H.(p\,,2) n(j) H (p,1) n~j
j=0 J L iv=0 '
n:72 2'
1 n+ 2
E H.(p,,2) n(j) E H.(p 1) n(2j)
j=0 J =
2 E E H (p 1) H (P ,1) n(j+k) Z 2 2' n(j+1)
E C~ i (P,2 n(j)
Equating coefficients of n(j), we find that
C (pV2) = Ho(p2) H 2(pV,1),
C 1(p ,2) = H 1(pV2) H (pV1) H 1(pV 4) 1,
2 v' 2 1 V'op~)H(v 2 ,
56
2H 1(PJI) H 2(Pvtl) 4, C3 (P ,2) = H (p 2) 2H (p 1) H (pV,1)
3 %02 o 2 3 0V 4
2H (p ,1) H (p 1) 4,
2
C4 (P,2) = H4 (PV,2) H2(p ,1) 2Ho(Pl) H4(PV,1)
2H(PV,1) H3 (PV,1) 8
C (p ,2) = H (P,2) 2H (p ,1) H5(p ,1)
5 V) 5 0' o V 5 IV
2H (p,1) H4 (PIl) 2H2 (pV,1) H3 (VP,1) 16.
The constants, C (p ,3) and C (pu,4),can be found in the same tedious manner. The values of C (p ,k), k = 2,3,4, j' 5, and pV = .50, .55,..... 90, are tabulated in Tables 3, 4, and 5.
Using these tables, we can approximate the moments 2%) 3 V' 1p9
and 4%, and then use relation (4.3.2), with x = 1 in order
2 V
to obtain an approximate distribution on Y. As an example, we take n = 19. Then
p
V n+l
19r
20
Tables 3, 4, and 5 were used to approximate 2f' 3f and 4, for r = 1,2,...,9. The results are listed in Table 6. The values for 1P were taken from tables computed by Teichroew (1956). Teichroew's tables were also used to check the accuracy of the series approximation for 2~ For n = 19, the approximation was accurate to five decimal places.
TABLE 3
c (pv 2)
p \Vl 0 1 2 3 4 5
.50 Zero .57079633 .46740110 .53726893 3.86976182 12.0889819
.55 Zero .57983932 .49410800 .48836662 3.82833586 12.1396154
.60 Zero .60792651 .57852616 .33227941 3.70150569 12.3352921
.65 Zero .65821859 .73530568 .03640528 3.48374825 12.8518246
.70 Zero .73711417 .99616602 .47260438 3.17871571 14.2029896
.75 Zero .85676747 1.42762451 1.35866689 2.85548457 18.0128221
.80 Zero 1.04137154 2.18241438 3.03241842 2.93521545 30.4943719
.85 Zero 1.34534536 3.67686183 6.76962950 6.03082075 84.1011123
.90 Zero 1.92211072 7.45692711 18.3890366,2 35.47326470 474.58344823
4
TABLE 4
c i (pv 0 3)
p \ 0 2 3 4 5
.50 Zero Zero Zero Zero Zero Zero
.55 Zero Zero .14261956 .46342873 .77469507 .2860125
.60 Zero Zero .30026404 .99445568 1.67177376 .4710749
.65 Zero Zero .49242777 1.68698946 2.86385938 .3195549
.70 Zero Zero .75034258 2.70841421 4.66613950 .8926114
.75 Zero Zero 1.13308641 4.41851254 7.77768619 5.7261556
.80 Zero Zero 1.77168120 7.74697987 14.08664124 25.1106438
.85 Zero Zero 3.02054891 15.75323340 30.27071486 122.3376979
.90 Zero Zero 6.18793059 43.35945955 93.78561212 932.6502938
00
TABLE 5
c i (pv 0 4)
p \Vj 0 2 3 4 5
.50 Zero Zero .97862534 2.30187673 .68109061 14.7500839
.55 Zero Zero 1.00864090 2.46469125 1.08317896 21.1792945
.60 Zero Zero 1.10872392 3.01034124 2.48383168 13.8178825
.65 Zero Zero 1.29975514 4.11498558 5.51922615 11.4702010
.70 Zero Zero 1.63001191 6.20062897 11.82707278 5.4443069
.75 Zero Zero 2.20215147 10.27259002 25.71608407 9.5845455
.80 Zero Zero 3.25336401 19.03426372 60.41653805 41.8041794
.85 Zero Zero 5.42986248 41.49665868 168.6658848 71.0033502
.90 Zero Zero 11.08343110 121.2957630 683.1599674 891.8775674
60
These moments were used in the expression for F V x) given by (4.3.2) to obtain an approximate distribution on Y for n =19. The results are given i~n Table 2, page 51. For li= 19, there appears to be little difference in accuracy between the two approximating methodscertainly not enough to justify the extra labor involved in computations for the latter method. However, the second method does work well and is at least of theoretical interest.
61
TABLE 6
MOMENTS OF Y(V) Y, n = 19
r 2J **4
1 18 1.37994 .11015 .01897 .04152
2 17 1. 09945 .07308 .00868 .01768
3 16 .88586 .05484 .00492 .00975
4 15 .70661 .04416 .00308 .00624
5 14 .54771 .03739 .00202 .00442
6 13 .40164 .03298 .00131 .00342
7 12 .26374 .03020 .00079 .00285
8 11 .13072 .02866 .00038 .00255
9 10 .00000 .02816 .00000 .00246
= nr.
lv = E(Y () Y).
tk = E((V) y k
i ) k =2,3,4.
CHAPTER 5
OTHER METHODS OF EXPRESSING P (P)
5.1 Introduction
Before discovering the expression for P M(p) Involving
TchebycheffHermite and Legendre polynomials, three other methods of expressing P m(p) were used in attempting to obtain numerical results. Each of the first two methods, outlined in Sections 5.2 and 5.3, expresses P M(p) as an infinite series. However, in each case, not only is the series slow to converge, but no workable expression can be given for the kth term of the series. Therefore, these methods are not useful in obtaining accurate numerical results. In Section 5.4, we give an expression for P m(p) involving the moments of extreme order statistics. However, this expression can be used only for limited values of m and rho.
5.2 A Power Series in Rho
Using the definition of P m(p) given in Chapter 1, we can write
P (P) = I... I' g(x ,... ,x ;p) dx . .dx
0 0
where g(x ,..., mx p) is the multivariate normal density on the equicorrelated variables XV,...PXm. Since, when rho equals zero, the density function simplifies to
62
63
m
1 2
m  Zx
2 2j=1 g(xl,...x ; 0) = (2r) e
1 mi
m
we could simplify the integrand by expanding g(xI,...,xm; p) in a
Maclaurin series in rho. We have
O k k
(5.2.1) g(x1,... ,xm; P) = P g(x1,...,x ; P)
k=_0 P m p=0
Unfortunately, as a function of rho, g(x1,..., xm ; p) is quite complicated and it is not feasible to take derivatives with respect to rho.
However, the following identity simplifies the problem to.some extent:
(5.2.2) g(x ... ,xm; p)
p=O
6 6\ k ; EE Tx) g(x1,...,xm; 0)
i
In proving identity (5.2.2), we use the characteristic function,
q(tl...',tm), of the variables X, ...Xm. By definition
m
LE t.X. (5.2.3) c(t ,...,t ) = Ee j=l
1 m
m
CO CO E t.x.
j=1 3
= ... e g(x ... ,x ; p)dx ...dx
1 1 m
_CO 00
I t'Vt
= e
m
1 2
 t p t.t
e 2 j=l J i
64
Differentiating with respect to rho, we have
m
t.x.
... e j=1 g(x1 ...,x ; p)dx1...dx
1 2
 t pE' tmt E~ t. 2~ t t
2 <. j k i
= t.t. e
i
1
Since, for < p < 1, g(x ,...x ; p) is a continuous function
mn1 1''m
8
in p, P g(x1,...x; p) exists and is integrable. Therefore, differentiation inside the integral is permitted, and, at the point p = 0,
we have
m
O CO E t.x.
(5.2.4) L g(x1,. ,xm; p) e 6dx1...dxm
p=0
m
1 2
E t
= 2 t.t. e i
Next, we consider the characteristic function of Xi,...,X when p =0.
1m From (5.2.3) we have
m
CO I E t.x.
j=1 ; ...J e g(xl,...,x ; O)dx .. dx
_m _.. 1 xm 1 m
1m 2
E t
e2 j= j
2j=1
65
Differentiation with respect to ta and tb gives
m
O m E t.x.
F P 6 6 j=1 33
... e g(x ...,xm; O)dx ...dx
F a b
1 t2 8 6 2 j
=*e ot ot
a b
so that
m
CO CO Z t.x.
] j=1 iJ
(5.2.5) ... (xaxb) e g(x1 ,...,xm; O)dx1...dx
m
j J xaxb1 m .1* m
1 m 2 SE t 2 jlJ = tt e j=1
ab
Again, differentiation inside the integral was permitted, since the
function xxb g(x ,... ,xm; 0) exists and is integrable. Since
at 1 m
6 6
xA g(x1,... ,xm; 0) T x g(x1... ,xm; 0), a b
after summing both sides of equation (5.2.5) over values of a and b
such that a < b, we have
m
O /.L Et.x.
(5.2.6) ... 0 g(x ... xm; 0) e J=1 dx1...dx
FO r E E 6X F g (x ....
c a
m 2
2. j
1 =e 2 j1
a
66
Finally, addition of equations (5.2.4) and (5.2.6) gives
m m p =0
6 8"
a x & .'X 1
b1 E g(x1,... ,x;0) a
m
I E t.x.
j=1 a
*e dx1...dx =0,
1 m
which implies that
6 6 6
g(x1,..., x;P E (x1,..xm;O)
p=0 a
Since derivatives of all orders exist and are continuous, the preceding process can be repeated any number of times. Hence, the identity (5.2.2) is proven.
Using this identity in the Maclaurin expansion of g(x ... ,Xm;P),
1' m
given by (5.2.1), and substituting the resulting expression into Pm (p), we have
(5.2.7) PmP ...'J E k g(x1,...,xm;0)dx1...dx
o o a
k=0
= * a
k=0
k0 o o a < b a xb
The utility of this expression for Pm(p) depends on how readily each term in the series can be determined and on how quickly the series converges. Differentiation and integration are no problem once the
67
sum, I x has been expanded. In fact, the jth derivative
a
of g(x ..., x ; 0) with respect to x is given by
g(xl ... ,x ; 0) = (a g(x.) g(x)
ix 1 m \ida 1~ d jx x a
a a
= (1)j H.(x ) g(x ...,x m; 0), j a 1 m
where H.(x) is the jth order Hermite polynomial in x. For example,
3
2
H (x) = 1, H2(x) = x 1,
o2
3
H1(x) =x, H3(x)= x 3x.
It follows that
(5.2.8) g(x) dx = g(x)J
O O
= (1)j1 H j1(x) g(x)J
O
0
H (0)
= (1)
For even values of j, the (jl)st order Hermite polynomial vanishes
at the point x=0. Therefore, when expanding the sum, we need to
consider terms that involve only derivatives of odd orders.
6
Denoting  by 8a, we can represent the kth power of the
a
sum by the multinomial expansion
68
k k! k1 k2 k
E E 6 6 ) P
a
where N = (2) and where the summation is over all integer values of
N
klfk k satisfying Z k. = k and k. 0, j = 1,2,...,N.
V fj=l JJ '
Using this expansion and the value of the integral given by (5.2.8), we can determine the first few terms in the series expression (5.2.7) for Pm (p). Denoting the (k+l)st term in the series by S, for the first two terms, we have
0 M c
S g(x,...,x; 0) dx...dx
o o
= ()M
and
k1 .... k. 1 2 1 3m1 m
o 0 1 N
g(x1 .... xm)dx1. .dx 1 1 m
N
Since E k. = 1, there are N = (2)terms in the sum. Therefore,
s1 J
since X,..Xm are identically distributed, it follows that
69
o o 9 (x l .. . x O )d ' d X m
o
o2 o 1 1 m
0
M'" H (0) 2 2
(lm m (2)
where m(k) will be used to denote the permutation of m elements k at a time.
N
For the third term, S 2, Z k = 2. Since a value of k.
j=l
equal to two would result in an even powered derivative, we need to consider only values of k. equal to one. Furthermore, these two
3
"ones" must be assigned to two of the exponents in the quantity
kl(1 3k2.. k N
1 62 1 3) (6m1m) so that the resulting product contains
four distinct 5's. The first "one" can be assigned in N = (M) ways.
k.
Then, there remains (m 2 couples, 6i, containing 's dis1
tinct from the couple (6ii determined by the assignment of
the first "one." Hence, the second "one" can be assigned in (m 2) ways. Since the "ones" are not distinguishable, there are possibilities for selecting the two nonzero exponents. It follows that
70
2 O aO
2 6 6 6
2 2! 1 1! 2 2 / ' "
'o o 1 2 3 4
g(x, ... ,xm; 0)dx ...dx 1 m 1 mn
2 m H (0) 4 m4
p (4) 0 /1
2 4 L. pJ \
2 mm
P2 /1\m (4)
2 2
TT
Although one might hope for a general expression for S such k'
hopes diminish after evaluating the next two terms in the series. With k= 3, we must consider assigning either one "three" or three ones" to the N k.'s. (The choice of one "two" and one "one" results
3
in an even powered derivative.) The assignment of the three "ones" can result in two types of products with only odd exponents on the 6's.
The first type, 63 6 can be formed by [2(m2)](m3)~
T h e f i s t t y e 5 i6 2 6i 3 6 i 4 c a n 3 !frm d b
different assignments, and the second type, 5616i26i36i46i56i6, can be
formed by (2) /m2\ m4 Idifferent assignments of the three "ones".
2 2 XL 2 /3!. Consequently,
3 ( 63 3
3S3 ( 2/ F g(x ,x;O)dx1...dxm
o o 1 2
3!Dm 1 e 6 6 6 6
+ 1!111 k(2) [2(m2)](m3) 33...J
0 0 1 2 3 4
g(x ,...,x m;0)dx ...dx 1 m 1 m
71
0.\l'\~ 1 T2 3 X4 5 6S
*g(xl.. x M;O)dx 1**dx
p ~n(2) HO2 (1m
2 L r
1 H 2(0) _H(0_ 3 m+ m(4) L r "O(O)"3(*mj
+m(6) H 0 l
3 m rm( 4m (4 371 \2, LT 2 T
Finally, for k=4, values of the k.'s equal to "three" and one" or "two", and two "ones' can both be assigned to form a product
33
result in either a 8 5 product or a j&...
1l6'2636141'516 616'2.. 18
product. It follows that
= ()[2(m2)](m3)j[ 2(O12 ,/2(lm
r4~~~ /m 1 1(_20'2, H (0) 2 /m4 + L211 2[2(m2)l(m3):L JL K)
____ 414\ H~ H2 () H 0(0 5
+ (m [2(m2)]( ) ~L LOj
M6
2~)
72
L Ll17 1 ~2 2 A2A) 4iL Tj ( )
4 (1~ ~4Omi(4) 4m (6) m (8)
~L 1* TT 3T
It is unlikely, from the expressions given for S 3and S 4fthat a simple general expression for the kth term exists. Although additional terms in the series could be determined, it was seen from numerical examples that the series converges quite slowly, especially for large values of rho. Some of these results are given in Table 7.
TABLE 7
ERROR INVOVED IN COMPUTING P (p) WHEN THE SERIES IN RHO IS TRUNCATED AFTER FIVE TERMS m= 5 M_l0
Absolute S 4 Absolute S 4
Error Error
1/10 .00001 .00006 1/15 .00006 .00002
1/5 .00008 .00101 1/10 .00029 .00008
1/5 .00048 .00101 1/5 .00318 .00127
1/10 .00003 .00006 1/10 .00022 .00008
It should be noted that this method of computing P m(p) is valid for negative values of rho. In fact, if the results in Table 7 are any indication of the general behavior of the series, we would expect the fastest convergence for p <0.
73
5.3 A Series Resulting from an Inverse
Taylor Series Expansion of g(u)
For positive rho, consider the expansion for Pr:m(p) given in Lemma 1:
Pr:m(p) = ( [1 G(Gy)]r [G(Gy)]mr g(y)dy
r:m r .
2 1
where = p (1) Since P r:m(p) = P mr:m(p), we can write
r:m mr:m
P(p) = o:m(p),' so that
CO
P m(p) = G(y)m g(y)dy m2
After the change of variables, u = ey and h = 1/2, we have
21
1 1
(5.3.1) P (p) = f G(u) e dy
h1
2 1 m h1
= (2rr) h S G(u) g(u) dG(u)
o
0
By expanding the density, g(u), in an inverse Taylor's series, the integrand will contain only terms involving G(u). Expanding about an arbitrary point G( ), we have
[G(u) G(f)] d
(5.3.2) g(u) = gdu)
jO 3u=jM0
= (C)[G(u) G()]
j=0
74
The function G.(u), j = 0,1,..., is given by
3
.(u) d g(u),
j \dG(u)
and can be written
1 d ddg(u)
S j! dG(u))/ L dG(u)J
1 d ,j1 rdg(u) dG(u) j! dG (u)) L du du
1 fd j1
j dG(u) U
The function
dG u) u g(u) = g(u) (j+1)! o. (u)
dG(u)/j u 3+1
is tabulated by Saw (1958) for j = 0,1,...,10.
Substituting the expansion (5.3.2) of g(u) into (5.3.1), we have
h1i
S1 ( .)h1
= (2) h2 E .()[G(u) G(h)]
o j=0
m
SG(u) dG(u) hI
1 C
 1 ko
S(2rr) 2 h5 S k(f;p)[G(u) G(C)]kG(u)mdG(u),
o k=0
where the constant (f;p), k = 0,1,..., can be determined from the ak(f)'s. Assuming that an interchange of integration and summation
is permitted, we have
75
h1
cc 1
2 k m
P (p (2TT) hT k [G(u) G( )] G(u) dG(u)
k 0 0
Next, declining y k (F.,m) by
Y (Zm) = I M [G(u) G(7).l k G(u) m dG(u),
k
0
we can write h1
2
P (p (2TT) k( ;P) Yk( 'm)
k 0
A recurrence relation for Yk(f'm) can be found by first writing the
integrand as
k 1 k1
G(u)M[G(u) G(f)] G(u)' [G(u) G(f)]
m k1
G(f) G(u) [G(u) G(f)]
and then integrating by parts. We have
Yk (fm) I G (u)'n+ 1 [G(u) G(f ) ]kI dG(u)
0
G( ) G(u)m [G(u) G(f)] k1 dG(u)
0
1 G(u)' 1 [G(u) G( ) I kj
0
m+1 G(u)m [G(u) G(f) ]k dG(u)
k
0
G(f) Ykl(f'm)
k m+1 (I G(f)] Yk
Yk (f'm) 1(f'm)
76
It follows that
(5.3.3) ,k((,m) 1 [1 G()]k kG(() .k_(Cm).
k m+k+1 L kSince
1
yo({,m) = G(u)m dG(u)
o
1
=m+1 '
the Y k(,m), k = 1,2 ..., can be obtained easily from (5.3.3) for given values of m and G(f).
The selection of the point G(f) should be made so that the
Yk(t,m)'s are small. Since (m+l) G(u)m represents the density on G(u), we can write
1 k Y (f,m) = 1 E[G(u) G(f)]k
k m+1
Therefore, by letting
G(f) = E[G(u)]
1
p ~ m+ 1
= (m+l) G(u)m dG(u)
O
m+1
m+2 '
y1(f,m) equals zero and y2(f,m) is minimized. However, numerical work has shown that, although yk(f,m) becomes quite small as k increases, the values of .(), j = 0,1,..., corresponding to the
J
m+1
point f that satisfies G(f) m+ become exceedingly large.
m+2
(For example, with m= 8, T= 1.28155157 and a (e ) = 3,825,025.96.)
77
Therefore, it would be better to choose the point G( ) that minimizes aj(f), j = 0,1,..., especially since the yk( ,m)'s are bounded by
1
(r+l) Since a () can be represented by
3
a' ( ) = g( )
a'. ( ') = I J1 j2)
a j,i j 2,
j,(f)jI i=0 where the a. are constants, satisfying
a 0, i + j odd,
a < 0, i+ j even,
the choice = 0 clearly minimizes 1011()I. Denoting a'j(0) by a'j, we have
32
(2Tr) 2
so that a vanishes for odd values of j. Since 0k(0;p) say k(p) is a product of the a'.'s such that Eji = k, it follows that k (p) vanishes for odd values of k, and that the series P (p) contains only
m
even terms.
In order to determine the values for the a.'s, we first need
3
i
the coefficients of u in the function a'.(u). Denoting this coeffi3
cient by a'j.i and using the recurrence relation given by Saw (1958), we have
78
=(2TT) 2
o0,0
2n
j. j(j1) [j2)(j3)a. 2
3, JJ2,i2 + (2ij 5i + j3) j_2,i j32 ,i
+ (i+1)(i+2)a j2,i2], i j2, j = 2,4,..., where c ji equals zero if either i > j1 or i < 0. Using this relation, the values of aj, j = 0,2,...,22, were calculated. They appear in Table 8.
TABLE 8
VALUES OF a., j = 0,2,...,22
3
3 J J0'
0 .3989422803 8 1.958451122 16 105.6166131
2 1.253314138 10 4.703578753 18 326.3330223
4 .6562337483 12 12.48581643 20 1037.319292
6 .9620889240 14 35.44811307 22 3373.253924
1
Substituting G(f) = G(0) = into the recurrence relation
2
(5.3.3) for yk(,m) and denoting k(0,m) by yk(m), we have
Yk(i) 1 [(I.k kk
1 fl\k k
Yk(m) Yk* y(m)]
m+k+1 \2/ 2 yk1(m Since we need yk(m) only for even values of k, we can eliminate yk1(m) in the above relation. It follows that
79
(i nn) in ~ll k (k1) Tk (mn) 7 k=2,,
yk~m (m+k+l) (m~k) 21
km)=+ k~ k2m], k = 2,4,. .
This relation can be easily programmed to evaluate N'k (m), k = 2,4,..., for a given value of m.
Numerical examples were used to investigate how quickly the series,
hi
(5.3.4) P (p) = (2rr) hT E 0k(P) Yk(m),
k=0
converges. As examples, we include the numerical computations for m= 10, p =1/3, 1/4. The values of Yk(m), k (1/3), and 5k (1/4) for k = 0,2,..., 22 are given in Table 9. In Table 10 is listed
hi
2 n
(2 ) h E k ( ) k(m) k= 0
for n = 0,2 ... ,22 and p = 1/3, 1/4. The exact value of P (P) is given after n= o. As can be seen, absolute errors for p = 1/3 and
p = 1/4 are .00258 and .00122, respectively. In addition to the slow convergence of the series P (p), this method of evaluating P (p) has two other disadvantages. First, the expression for P (p) given in
m
(5.3.4) is only valid for p = h+1 h = 1,2,.... Also, the constants k (p) are difficult to evaluate for small values of rho.
80
TABLE 9
VALUES OF 'yk(m)' 8k( 1/3), 1/4),
k = 0,2,...,22, m = 10
k Yk (10) $ k (1/3) k (1/4)
0 .0909090909 .3989422803 .159154 9430
2 .0163170163 1.253314138 1.000000000
4 .0032092907 .6562337483 1.047197553
6 .0006629400 .9620889240 .8772981706
8 .0001413557 1.958451122 1.279624120
10 .0000308241 4.703578753 2.418906535
12 .0000068352 12.48581643 5.323901882
14 .0000015356 35.44811307 12.95550088
16 .0000003486 105.6166131 33.85865868
18 .0000000798 326.3330223 93.33839437
20 .0000000184 1037.319292 268.1907545
22 .0000000043 3373.253924 796.5361116
81
TABLE 10
THE FIRST n TERMS IN THE SERIES P (p),
m
m= 10, p =1/3, 1/4, n 0, 2,...,22
n p = 1/3t p = 1/4
0 .128564 .157459
2 .056070 .020116
4 .048604 .016459
6 .046344 .02278S
8 .045362 .024757
10 .044848 .025568
12 .044546 .025964
14 .044353 .026181
16 .044222 .026309
18 .044130 .026390
20 .044062 .026444
22 .044011 .026481
co .043753 .026603
1 1 n
tTabulated entries are (2rr)T 2r Z k (1/3)yk(10).
k0
Tabulated entries are (2r) 3 Z S k(l/4)v k(10).
k=0
82
5.4 Using Moments of Extreme Order Statistics
If we let f(u;m) represent the density on the largest order statistic in a sample of size m from a normal distribution, then
f(u;m) = G(u) du
mni
= mG(u)m g(u), 0
Using the representation of Pm(p) given in (5.3.1), we can write h1
(2rn) h 2 h1
P (P) + g(u) f(u;m+l) du.
The integral, say I, above can be simplified by successively integrating by parts. For example, after integrating by parts twice, we have
I = Sg(u)h1 dG(u)m+1
CO
= g(u)h1 G(u)m+1 (h1)u g(u)h1 G(u)m+1 du
= o + u g~Cuo2d~u1
h1 COguh2 dGum+2 = 0 + IIu g(u)h dG(u) mC2
(hi) (h2) c 2 h2 m+
(h1)(h2) (u2_1) g(u)h2 G(u)2 du m+2
(h1)(h2) 2_ h3 3
S(m+2)(m+3) (u2 1) g(u)3 dG(u)m+
It can be shown by induction that after integrating by parts k times,
83
= (h1)(h2)...(hk) hk1
(m+2)(m+3)... (m+k+1) ~ k g(u) f(u;mk)du
where Hk(u) is the kth order Hermite polynomial in u. Therefore,
after h1 integrations, we have
\[
(5.4.1)\qquad P_m(\rho) \;=\; \frac{(2\pi)^{(h-1)/2}\,h^{1/2}\,(h-1)(h-2)\cdots 2\cdot 1}{(m+1)(m+2)\cdots(m+h)} \int_{-\infty}^{\infty} H_{h-1}(u)\,f(u;m+h)\,du
\]
\[
\phantom{(5.4.1)\qquad P_m(\rho)} \;=\; \frac{(2\pi)^{(h-1)/2}\,h^{1/2}\,(h-1)!}{(m+1)(m+2)\cdots(m+h)}\; E\big[H_{h-1}(U_{(m+h)})\big],
\]
where U_{(m+h)} is the largest order statistic in a normal sample of size m+h.
For example, with ρ = 1/3 and ρ = 1/4, expression (5.4.1) simplifies to
\[
P_m(1/3) \;=\; \frac{2\sqrt{\pi}}{(m+1)(m+2)}\; E\big[U_{(m+2)}\big]
\]
and
\[
P_m(1/4) \;=\; \frac{4\pi\sqrt{3}}{(m+1)(m+2)(m+3)}\; E\big[U_{(m+3)}^{2} - 1\big],
\]
respectively. Using the table of moments of extreme order statistics computed by Ruben (1954), expression (5.4.1) can be used to determine P_m(ρ) for values of m and ρ satisfying ρ ≥ 1/12 and m + ρ^{-1} ≤ 51, where ρ^{-1} is a positive integer.
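As a check on the two special cases above, the required moments of the largest order statistic can also be obtained by numerical quadrature instead of Ruben's tables. The sketch below is ours (the trapezoidal integrator and tolerances are implementation choices, not part of the dissertation); the constants 2√π and 4π√3 are as reconstructed here, and the results can be compared with the values P_2(1/3) ≈ 0.30409, P_2(1/4) ≈ 0.29022, and P_10(1/3) ≈ 0.04375 listed in Appendix 3.

```python
import math

def g(u):
    # standard normal density
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def G(u):
    # standard normal distribution function via the error function
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def max_moment(k, n, a=-10.0, b=10.0, steps=20000):
    # E[U_(n)^k]: k-th moment of the largest of n iid N(0,1) variables,
    # computed by the trapezoidal rule on u^k * n * G(u)^(n-1) * g(u)
    h = (b - a) / steps
    fn = lambda u: (u ** k) * n * (G(u) ** (n - 1)) * g(u)
    s = 0.5 * (fn(a) + fn(b)) + sum(fn(a + i * h) for i in range(1, steps))
    return s * h

def P_one_third(m):
    # P_m(1/3) = 2 sqrt(pi) / ((m+1)(m+2)) * E[U_(m+2)]
    return 2.0 * math.sqrt(math.pi) / ((m + 1) * (m + 2)) * max_moment(1, m + 2)

def P_one_quarter(m):
    # P_m(1/4) = 4 pi sqrt(3) / ((m+1)(m+2)(m+3)) * E[U_(m+3)^2 - 1]
    c = 4.0 * math.pi * math.sqrt(3.0) / ((m + 1) * (m + 2) * (m + 3))
    return c * (max_moment(2, m + 3) - 1.0)

print(P_one_third(2), P_one_quarter(2), P_one_third(10))
```

The computed values agree with Appendix 3 to the five decimal places tabulated there.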
APPENDIXES
APPENDIX 1
J_k(ρ)
                ρ
k      1/2       1/3       1/4       1/5
0   1.00000   1.00000   1.00000   1.00000
2   0.00000   0.17548   0.25871   0.30772
4   0.00000   0.00877   0.02771   0.06523
6   0.00000   0.00189   0.00243   0.00299
8   0.00000   0.00052   0.00045   0.00041
10  0.00000   0.00021   0.00012   0.00008
12  0.00000   0.00010   0.00004   0.00002
14  0.00000   0.00006   0.00002   0.00001
16  0.00000   0.00003   0.00001   0.00000
18  0.00000   0.00002   0.00000   0.00000
20  0.00000   0.00002   0.00000   0.00000
22  0.00000   0.00001   0.00000   0.00000
       1/6       1/7       1/8       1/9
0   1.00000   1.00000   1.00000   1.00000
2   0.34010   0.36311   0.38032   0.39368
4   0.09766   0.12483   0.14757   0.16673
6   0.01365   0.02608   0.04070   0.05415
8   0.00002   0.00211   0.00647   0.01172
10  0.00004   0.00006   0.00032   0.00134
12  0.00001   0.00000   0.00002   0.00002
14  0.00000   0.00000   0.00000   0.00000
16  0.00000   0.00000   0.00000   0.00000
18  0.00000   0.00000   0.00000   0.00000
20  0.00000   0.00000   0.00000   0.00000
22  0.00000   0.00000   0.00000   0.00000
      1/10      1/11      1/12      1/13
0   1.00000   1.00000   1.00000   1.00000
2   0.40413   0.41307   0.42033   0.42647
4   0.18304   0.19704   0.20919   0.21981
6   0.06606   0.07895   0.09005   0.10040
8   0.01776   0.02455   0.03096   0.03771
10            0.00524   0.00796   0.01107
12            0.00061   0.00140   0.00239
14  0.00000   0.00003   0.00014   0.00034
16  0.00000   0.00000   0.00002
18  0.00000   0.00000   0.00000
20  0.00000   0.00000   0.00000   0.00000
22  0.00000   0.00000   0.00000
APPENDIX 1 (Continued)
                ρ
k     1/14      1/15      1/16      1/17
0   1.00000   1.00000   1.00000   1.00000
2   0.43172   0.43629   0.44028   0.44350
4   0.22917   0.23749   0.24491   0.25159
6   0.10992   0.11872   0.12686   0.13439
8   0.04440   0.05096   0.05733   0.06348
10  0.01447   0.01809   0.02185   0.02570
12  0.00363   0.00512   0.00682   0.00870
14  0.00066   0.00110   0.00168   0.00239
16  0.00007   0.00017   0.00031   0.00051
18  0.00000   0.00001   0.00004   0.00018
20  0.00000   0.00000   0.00000   0.00001
22  0.00000   0.00000   0.00000   0.00000
      1/18      1/19      1/20      1/21
0   1.00000   1.00000   1.00000   1.00000
2   0.44692   0.44972   0.45223   0.45451
4   0.25761   0.26309   0.26807   0.27263
6   0.14138   0.14727   0.15390   0.15953
8   0.06941   0.07510   0.08055   0.08577
10  0.02961   0.03352   0.03742   0.04129
12  0.01073   0.01289   0.01516   0.01751
14  0.00323   0.00419   0.00526   0.00644
16  0.00078   0.00112   0.00154   0.00202
18  0.00015   0.00024   0.00037   0.00053
20  0.00002   0.00004   0.00007   0.00011
22  0.00000   0.00000   0.00001   0.00002
      1/22      1/23      1/24      1/25
0   1.00000   1.00000   1.00000   1.00000
2   0.45658   0.45847   0.46020   0.46179
4   0.27682   0.28068   0.28426   0.28757
6   0.16479   0.16971   0.17433   0.17866
8   0.09076   0.09553   0.10005   0.10445
10  0.04510   0.04885   0.05253   0.05613
12  0.01991   0.02237   0.02485   0.02735
14  0.00771   0.00906   0.01048   0.01196
16  0.00258   0.00322   0.00390   0.00465
18  0.00072   0.00098   0.00126   0.00159
20  0.00017   0.00025   0.00035   0.00047
22  0.00003   0.00005   0.00008   0.00012
APPENDIX 1 (Continued)
                ρ
k      1/3       1/4       1/5       1/6
0   1.00000   1.00000   1.00000   1.00000
2   0.62452   0.74120   0.69228   0.65990
4   2.44596   1.51071   1.15617   0.96895
6                       4.04666   2.53547

       1/7       1/8       1/9      1/10
0   1.00000   1.00000   1.00000   1.00000
2   0.63689   0.61968   0.60632   0.59565
4   0.85316   0.77505   0.71856   0.67592
6   1.86571   1.49315   1.25860   1.09863
8   7.06260   4.46698   3.20847   2.48636
10                     12.72389   8.09368
      1/11      1/12      1/13      1/14
0   1.00000   1.00000   1.00000   1.00000
2   0.58693   0.57967   0.57353   0.56827
4   0.64263   0.61592   0.59403   0.57576
6   0.98317   0.89624   0.82862   0.77462
8   2.07210   1.71384   1.46857   1.32003
10  5.71690   4.32794   3.43837   2.83220
12 23.39377  14.93534  10.42892   7.75338
14                     43.61756  27.91700
      1/15      1/16      1/17      1/18
0   1.00000   1.00000   1.00000   1.00000
2   0.56371   0.55972   0.55620   0.55308
4   0.56029   0.54703   0.53553   0.52546
6   0.73158   0.69400   0.66318   0.63667
8   1.18988   1.08676   1.00330   0.93454
10  2.39876   2.07684   1.83032   1.63672
12  6.03314   4.86226   4.03107   3.41832
14 19.31964  14.15336  10.82316   8.57463
16 62.16048  52.68241  36.20390  26.21621
18                    155.97371 100.15185
APPENDIX 1 (Continued)
                ρ
k     1/19      1/20      1/21      1/22
0    1.00000   1.00000   1.00000   1.00000
2    0.55028   0.54777   0.54549   0.54342
4    0.51658   0.50868   0.50162   0.49527
6    0.61415   0.59436   0.57696   0.56155
8    0.87702   0.82827   0.78648   0.75030
12   2.95303   2.59080   2.30307   2.07017
14   6.97997   5.81267   4.93343   4.25500
16  19.77901  15.42370  12.35840  10.12928
18  69.43843  49.08521  36.49633  28.16580
20 297.92709 191.51078 130.26340  92.67789
22                     571.91135 367.95599
        1/23       1/24       1/25       1/26
0     1.00000    1.00000    1.00000    1.00000
2     0.54153    0.53980    0.53821    0.53674
4     0.48951    0.48429    0.47952    0.47514
6     0.54762    0.53550    0.52439    0.51432
8     0.71870    0.69089    0.66622    0.64422
10    1.08596    1.02140    0.96543    0.91650
12    1.87876    1.71931    1.58483    1.47034
14    3.72057    3.29200    2.94211    2.65465
16    8.46292    7.18797    6.19143    5.39946
18   22.25930   17.98990   14.82064   12.41346
20   68.40097   52.04764   40.63786   32.43603
22  249.31169  176.16453  128.84259   97.09741
24 1102.38011  709.76567  479.32931  336.70025
26                       2132.22218 1373.66691
APPENDIX 2
d_k(m)
                  m
k        2           3           4           5
0   0.33333333  0.25000000  0.20000000  0.16666667
2   0.16666667  0.25000000  0.28571429  0.29761905
4 0.28571423 0.29761905
         6           7           8           9
0   0.14285714  0.12500000  0.11111111  0.10000000
2   0.29761905  0.29166667  0.28292828  0.27272727
4   0.05844156  0.07951545  0.09730210  0.11328671
6   0.00108225  0.00378788  0.00808081  0.01363636
O.CC808C81 C.01363636
        10          11          12          13
0   0.09090909  0.08333333  0.07692308  0.07142857
2   0.26223776  0.25183150  0.24175824  0.23214286
4 Ct.'?587413 C.la5GasCl 0.143qEE36 G.15021CG8
6   0.02045348  0.02696078  0.03415573  0.04111013
a Cl.Cl3q2cl2 C.OC185560 O.CC318103 C.CC488722
V, r.CCO' C541 O.CC:32977 O.CCC:931; 0.00021874
0 c c c IC 9 3 19 C.OC021874
        14          15          16          17
0   0.06666667  0.06250000  0.05882353  0.05555556
2   0.22303922  0.21446078  0.20639835  0.19863041
4   0.15495356  0.15847123  0.16099071  0.16267943
6   0.04796182  0.05450206  0.06066317  0.06640778
8   0.00694127  0.00929634  0.01189931  0.01469616
10  0.00042811  0.00074202  0.00117258  0.00172896
12  0.00001723  0.00002337  0.00005157  0.00009437
14  0.00000002  0.00000019  0.00000077  0.00000232
16 O.CCCCCr77 COC000232
APPENDIX 2 (Continued)
m
        18          19          20          21
0   0.05263158  0.05000000  0.04761905  0.04545455
2 O.IS172932 C.lE5C6 G4 0.17HOC670 0.17292490
4 Q.i6358986 C.16114 55 0.16414455 0.16377318
6   0.07172041  0.07660079  0.08105904  0.08511199
8   0.01763530  0.02067005  0.02375868  0.02686559
10  0.00241458  0.00322631  0.00416566  0.00521391
12  0.00017309  0.00027895  0.00042265  0.00060912
14  0.00000570  0.00001211  0.00002307  0.00004038
16  0.00000007  0.00000022  0.00000059  0.00000137
18  0.00000000  0.00000000  0.00000001  0.00000002
2,3 O.CccccrIol C.O,CCO ,02
        22          23          24          25
0   0.04347826  0.04166667  0.04000000  0.03846154
2   0.16739130  0.16217049  0.15726496  0.1526251
4   0.16309922  0.16217349  0.16106101  0.15978275
6   0.08878061  0.09208812  0.09505871  0.09771660
8   0.02996170  0.03301913  0.03602093  0.03834997
10  0.00637367  0.00763419  0.00897407  0.01038666
12  0.00084232  0.00112522  0.00145975  0.00184685
14  0.00006604  0.00010212  0.00015082  0.00021423
16  0.00000284  0.00000536  0.00000942  0.00001557
18  0.00000015  0.00000033  0.00000066
20  0.00000000  0.00000000  0.00000001  0.00000001
22  0.00000000  0.00000000  0.00000000
24  0.00000000  0.00000000
APPENDIX 3
P_m(ρ)
                ρ
m      1/2       1/3       1/4       1/5
2   0.33333   0.30409   0.29022   0.28205
3   0.25000   0.20613   0.18532   0.17307
4   0.20000   0.14974   0.12648   0.11301
5   0.16667   0.11413   0.09066   0.07741
6   0.14286   0.09012   0.06748   0.05508
7   0.12500   0.07311   0.05176   0.04043
8   0.11111   0.06061   0.04067   0.03044
9   0.10000   0.05113   0.03262   0.02343
10  0.09091   0.04375   0.02660   0.01836
11  0.08333   0.03790   0.02202   0.01463
12  0.07692   0.03318   0.01845   0.01182
13  0.07143   0.02930   0.01564   0.00967
14  0.06667   0.02608   0.01336   0.00799
15  0.06250   0.02338   0.01155   0.00668
16  0.05882   0.02108   0.01004   0.00563
17  0.05556   0.01912   0.00879   0.00478
18  0.05263   0.01742   0.00775   0.00409
19  0.05000   0.01594   0.00687   0.00352
20  0.04762   0.01465   0.00612   0.00305
21  0.04545   0.01351   0.00548   0.00266
22  0.04348   0.01251   0.00492   0.00233
       1/6       1/7       1/8       1/9
2   0.27665   0.27281   0.26995   0.26772
3   0.16498   0.15922   0.15492   0.15158
4   0.10422   0.09804   0.09345   0.08990
5   0.06892   0.06206   0.05875   0.05546
6   0.04733   0.04205   0.03825   0.03538
7   0.03352   0.02892   0.02566   0.02323
8   0.02437   0.02042   0.01766   0.01565
9   0.01812   0.01474   0.01244   0.01079
10  0.01374   0.01086   0.00894   0.00758
11  0.01050   0.00814   0.00654   0.00543
12  0.00829   0.00620   0.00486   0.00395
13  0.00658   0.00479   0.00366   0.00291
14  0.00548   0.00374   0.00280   0.00218
15  0.00420   0.00296   0.00216   0.00165
16  0.00351   0.00237   0.00169   0.00126
17  0.00290   0.00191   0.00133   0.00098
18  0.00242   0.00155   0.00106   0.00076
19  0.00203   0.00127   0.00085   0.00060
20  0.00172   0.00105   0.00069   0.00048
21  0.00146   0.00088   0.00056   0.00038
22  0.00125   0.00073   0.00046   0.00031
