Full Citation 
Material Information 

Title: 
Estimating and testing the parameters of a generalization of the first order nonstationary autoregressive process 

Physical Description: 
vii, 90 leaves. : illus. ; 28 cm. 

Language: 
English 

Creator: 
Downing, Darryl Jon, 1947-

Publication Date: 
1974 

Copyright Date: 
1974 
Subjects 

Subject: 
Stochastic processes (lcsh); Estimation theory (lcsh); Statistics thesis Ph. D.; Dissertations, Academic -- Statistics -- UF

Genre: 
bibliography (marcgt); nonfiction (marcgt)
Notes 

Thesis: 
Thesis--University of Florida.

Bibliography: 
Bibliography: leaves 87-88.

General Note: 
Typescript. 

General Note: 
Vita. 
Record Information 

Bibliographic ID: 
UF00098169 

Volume ID: 
VID00001 

Source Institution: 
University of Florida 

Holding Location: 
University of Florida 

Rights Management: 
All rights reserved by the source institution and holding location. 

Resource Identifier: 
alephbibnum: 000582655; oclc: 14155852; notis: ADB1032


Full Text 
ESTIMATING AND TESTING THE PARAMETERS OF A
GENERALIZATION OF THE FIRST ORDER NONSTATIONARY
AUTOREGRESSIVE PROCESS
by
Darryl Jon Downing
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1974
TO BARBARA, DARREN, AND KELLY
FOR THEIR LOVE, UNDERSTANDING,
AND CONSTANT SUPPORT.
ACKNOWLEDGMENTS
I would like to express my deepest gratitude and
heartfelt thanks to Dr. John Saw. He suggested this topic
to me, was always available for assistance, and without his
help I would have never completed this work. Appreciation
is expressed also to the other members of my supervisory
committee, Professors D. T. Hughes, M. C. K. Yang, and
Z. R. Pop-Stojanovic.
A special thank you is given to Dr. William Mendenhall,
who made it possible for me to come to the University of
Florida. His concern for my welfare and my family's will
always be remembered and appreciated.
To Libby Coker, who typed this manuscript from a rough
draft, I owe more than a simple thank you can express. Her
dedication and perseverance will always be remembered and
appreciated.
The years of study here were consummated by the whole of
the Statistics Department. To all of the faculty,
secretaries, and students I extend my thanks for making me
feel wanted and welcome.
TABLE OF CONTENTS
Page
ACKNOWLEDGMENTS .......................................
ABSTRACT ..............................................
CHAPTER
1 STATEMENT OF THE PROBLEM ...................
1.1 Introduction ..........................
1.2 Summary of Results ....................
1.3 Notation ..............................
2 THE DOOLITTLE DECOMPOSITION AND ASSOCIATED
DISTRIBUTION THEORY ........................
2.1 Introduction ..........................
2.2 V: A Class of Dispersion Matrices .....
2.3 The Doolittle Decomposition and Its
Jacobian ..............................
2.4 The Joint Distribution of G and D for
Arbitrary V and in Particular When
$V \in \mathcal{V}$ .......................
2.5 The Distribution of G, of D, and of G
Conditional on D When V Is Arbitrary and
When $V \in \mathcal{V}$ ..................
2.6 Verification of the Distribution of G .
3 THE ESTIMATORS $\sigma^2_*$, $\beta^2_*$, AND $\{\alpha^*_j : 1 \le j \le m\}$ .....
(Table of Contents Continued)
Chapter Page
3.1 Introduction ............................ 25
3.2 The Distribution and Properties of
$\sigma^2_*$, $\beta^2_*$, and $\{\alpha^*_j : 1 \le j \le m\}$ .... 25
3.3 Tests of Hypothesis ..................... 31
4 MAXIMUM LIKELIHOOD ESTIMATORS ............... 34
4.1 Introduction ............................ 34
4.2 The Maximum Likelihood Estimators and
Their Distribution .................... 34
4.3 Properties of the Maximum Likelihood
Estimators and Their Distribution ..... 42
5 A TEST OF THE ADEQUACY OF THE MODEL ........ 51
5.1 Introduction ............................ 51
5.2 An Approximation to the Distribution of
$\rho_0 \log \lambda_1$ ................. 53
5.3 The Distribution of T, a Function of
$\{g_{ij} : 0 \le j < i \le m\}$ ........
5.4 Asymptotic Performance of $\rho_0 \log \lambda_1$
and T ................................. 73
6 COMPUTER SIMULATIONS AND AN APPLICATION ..... 75
6.1 Introduction ............................ 75
6.2 Computer Simulation Results ............. 76
6.3 Application ............................. 81
BIBLIOGRAPHY ........................................... 87
BIOGRAPHICAL SKETCH .................................... 89
Abstract of Dissertation Presented to the
Graduate Council of the University of Florida in Partial
Fulfillment of the Requirements for the
Degree of Doctor of Philosophy
ESTIMATING AND TESTING THE PARAMETERS OF A
GENERALIZATION OF THE FIRST ORDER NONSTATIONARY
AUTOREGRESSIVE PROCESS
by
Darryl Jon Downing
August, 1974
Chairman: Dr. J. G. Saw
Major Department: Statistics
A stochastic process is represented as having two
components. The first component is called drift and measures
location. The second component, called noise, measures the
variability of the stochastic process. This paper is
concerned with estimating the noise process when the noise
process is assumed to follow what we shall call a generalized
first order nonstationary autoregressive process. The
generalized first order autoregressive process is defined
similarly to the first order autoregressive process, except that
the parameter relating two observations is different for each
time point. In order to estimate these parameters it is
necessary that the stochastic process be replicated a
sufficient number of times.
A method of estimating the parameters is proposed and
the broad attendant distribution theory is delineated, both
in a general setting and for specific situations. The properties
of these estimators are given and some tests of
hypothesis concerning the parameters are investigated. In
order to comment further on the value of the proposed
estimators, we use as a benchmark the maximum likelihood
estimators. Their properties are given and a critical
comparison is made between them and the proposed estimators.
In any practical situation it will be necessary to
decide whether or not the first order generalized autoregressive
process is sufficiently accurate to describe the data.
Therefore, a test of the adequacy of the model is given.
Finally, numerical results are obtained using a
computer simulation. The proposed estimators and the maximum
likelihood estimators are compared. Also a practical
application is given.
Chapter I
STATEMENT OF THE PROBLEM
1.1 Introduction
The statistical model in this dissertation is a
stochastic process $\{Y(t) : t \in T\}$. Usually $T$ will denote a
time interval and we shall suppose that replications of the
process can be monitored during $T$ at times $t_0 < t_1 < \cdots < t_m$,
a typical replicate yielding the random sample $y_j = Y(t_j) : 0 \le j \le m$.
It is not necessary that the time increments $t_1 - t_0$, $t_2 - t_1$,
$\ldots$, $t_m - t_{m-1}$ be of equal length.
If we write $\mu(t) = \mathcal{E}Y(t)$ and $X(t) = Y(t) - \mu(t)$, then we may
think of $\mu(t)$ as the "drift" of the sample paths of $Y(t)$ and
$X(t)$ as "noise". Clearly $\mathcal{E}X(t) = 0$. Various schemes have,
classically, been used to describe the noise process. In
particular one may assume that, with $y_j = \mu(t_j) + x_j : 0 \le j \le m$,
$x_j = \alpha_1 x_{j-1} + \alpha_2 x_{j-2} + \cdots + \alpha_p x_{j-p} + \varepsilon_j : p \le j \le m$,  (1.1.1)
where $\varepsilon_p, \varepsilon_{p+1}, \ldots, \varepsilon_m$ are independent identically distributed
random variables.
In order that the data lend themselves to analysis under
this classical model, several assumptions must be made. The
most restrictive of these is that of stationarity. Expressed
informally, stationarity assumes that the process has been
running a sufficiently long time so that it has settled down.
Putting this into a probabilistic context, stationarity
implies that the probability distribution of $x_{t_1}, x_{t_2}, \ldots, x_{t_k}$
is the same as the probability distribution of $x_{t_1+t}, x_{t_2+t},
\ldots, x_{t_k+t}$ for every finite set of values $(t_1, t_2,
\ldots, t_k)$ and for every finite $t$.
The classical analysis of the model of equation (1.1.1),
known as the pth order autoregressive model, is likely to be
inappropriate in many cases due to the requirement of
stationarity. For example, consider observing the effect of
a diet on weight loss. Initially the weight loss will be
greatest and will tend toward zero as time goes on and the
subject tends to some constant weight. Obviously since the
larger values appear first the probability distribution of
the initial observations is not the same as that occurring
later. A second example is the effect of drug infusion. A
patient is given a dose of some drug, either orally or
intravenously, and blood samples are drawn at various times
$t_0, t_1, \ldots, t_m$ thereafter. The amount of drug in the blood
is then measured for each sample at each time. Again the
initial readings will be larger than the later ones since
the drug will be absorbed into the system or discharged as
time goes on.
It may be argued in both examples cited, that successive
differences (or perhaps successive second differences) have,
approximately, a stationary distribution. Rather than concede
to ad hoc procedures we prefer to replace the stationary
autoregressive process by a nonstationary process. We gain
this generality in the model for $\{Y(t) : t \in T\}$ at the expense
of requiring (for our analysis of, and estimation of the
parameters of the process) several replications of this
process. Fortunately, in many instances of interest,
replicates will be available.
The simplest alternative to the pth order autoregressive
scheme is what we shall call the first order generalized
autoregressive scheme. Formally we shall assume that the
errors $x_1, x_2, \ldots, x_m$ satisfy
$x_j = \alpha_j x_{j-1} + \varepsilon_j : 1 \le j \le m$,  (1.1.2)
where again $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m$ are independent identically
distributed random variables. It will be assumed that the
joint distribution of the errors is multivariate normal.
The assumption that the process can be replicated is needed
in order to estimate the unknown parameters $\alpha_1, \alpha_2, \ldots, \alpha_m$.
In the analysis of this model we shall be concerned with
three major problems: (1) providing estimators for the
unknown parameters, (2) finding the distribution of the
estimators and comparing them to other estimators (typically
likelihood estimators) and, (3) providing methodology for
testing the goodness of fit of the model.
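The replicated scheme of equation (1.1.2) is easy to simulate. The sketch below (a minimal illustration, assuming NumPy and hypothetical parameter values not taken from the dissertation) generates n replicates of a generalized first order autoregressive noise process and checks that, across replicates, the lag-one covariance ratio recovers $\alpha_1$:

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 4, 400                            # m steps after t0, n replicates
alpha = np.array([0.9, 0.7, 0.5, 0.3])   # hypothetical alpha_1..alpha_m
sigma, beta = 1.0, 2.0                   # sd(eps_j); Var(x_0) = sigma^2 beta^2

# Each replicate: x_0 ~ N(0, sigma^2 beta^2), then x_j = alpha_j x_{j-1} + eps_j
x = np.empty((n, m + 1))
x[:, 0] = rng.normal(0.0, sigma * beta, size=n)
for j in range(1, m + 1):
    x[:, j] = alpha[j - 1] * x[:, j - 1] + rng.normal(0.0, sigma, size=n)

# Across replicates, Cov(x_0, x_1) / Var(x_0) estimates alpha_1
V_hat = np.cov(x, rowvar=False)
alpha1_hat = V_hat[0, 1] / V_hat[0, 0]
```

Because the parameter relating consecutive observations changes with the time point, the estimate must pool across replicates at a fixed pair of times, not across time within one replicate.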
1.2 Summary of Results
A method of estimating the parameters is proposed in
Chapter 2 and the broad attendant distribution theory is
delineated, both in a general setting and for specific
situations. The properties of these estimators are provided
in Chapter 3. In Chapter 4 a critical comparison is made
between the maximum likelihood estimators and the estimators
which we propose as an alternative. In any practical
situation it will be necessary to decide whether or not the
first order generalized autoregressive scheme is
sufficiently accurate to describe the data. A decision
procedure bearing on this aspect is investigated in Chapter 5.
An application of the theory is given in Chapter 6,
with a comparison between the estimators derived in this
dissertation and the usual maximum likelihood estimators.
1.3 Notation
In almost all areas of statistics notation is very
important; consistent notation aids in solving and understanding
the material presented. Also it is very convenient
to abbreviate the distributional properties of random
variables. For these reasons, certain notational conventions
have been adopted. Although many of these are standard,
they will be listed here for reference.
1. An underscored lower case letter invariably
represents a column vector. Its row dimension will
be given the first time a vector appears. Thus
x:(m) denotes a column vector consisting of m elements.
The vector which has all its elements zero will be
denoted by 0.
2. Matrices will be denoted by capital letters and
the first time a matrix appears, its row and column
dimensions will be given. Thus M:(rxc) denotes a
~
matrix with r rows and c columns. Denote the zero matrix
by (0). The symbol "I" will be reserved for the
identity matrix.
3. The elements of a matrix will be denoted by the
corresponding small letter with subscripts to denote
their row and column position. Thus $m_{ij}$ denotes the
element in the ith row and jth column of the matrix
M. The symbol $(M)_{ij}$ is equivalent to $m_{ij}$ and is
sometimes substituted for convenience.
4. It is sometimes convenient to form row and column
vectors from a given matrix. The symbol $(M)_{i\cdot}$ will
represent the row vector formed from the ith row of
M. Similarly $(M)_{\cdot j}$ will denote the column vector
formed from the jth column of M.
5. The matrix formed from the first j rows and
columns of M will be denoted by M(j). M(j) is
commonly called the jth principal submatrix.
6. The Kronecker product of M:(m×m) and N:(n×n) will
be denoted by $M \otimes N$ and is a matrix P:(mn×mn) with
$(P)_{(k-1)n+i,\,(l-1)n+j} = m_{kl}\, n_{ij}$.
7. Diagonal matrices will be denoted by $D = \mathrm{diag}(d_1, d_2,
\ldots, d_m)$, where diag is short for diagonal and $d_1, d_2,
\ldots, d_m$ are the elements on the diagonal.
8. In keeping with conventional notation we shall
write etr(·) to denote the constant "e" (the base of
the natural logarithm) raised to the tr(·) power.
9. In distribution theory transformations are often
made use of. The notation $J\{X \to Y\}$ will
represent the Jacobian of the transformation from
the X-space into the Y-space.
10. If y is a random variable having a normal density,
with mean $\mu$ and variance $\sigma^2$, we will write
$y \sim N(\mu, \sigma^2)$.
11. If y is a random variable defined on $(0, \infty)$ with
density
$f(y) = \{\Gamma(\tfrac{\nu}{2})\, 2^{(\nu/2)-1}\}^{-1}\, y^{\nu-1} \exp\{-\tfrac{1}{2}y^2\}$
we will denote this by
$y \sim \chi_\nu(0)$.
This is to be read as "y has the central chi density
on $\nu$ degrees of freedom."
12. If y is a random variable defined on $(0, \infty)$ with
density
$f(y) = \{\Gamma(\tfrac{\nu}{2})(2\sigma^2)^{\nu/2}\}^{-1}\, y^{(\nu/2)-1}\, e^{-y/2\sigma^2}$
we shall abbreviate this by
$y \sim \sigma^2\chi^2_\nu(0)$.
13. If y is an m-dimensional column vector whose
elements have a joint normal density with mean
vector $\mu:(m)$ and dispersion matrix V:(m×m), this will
be denoted by
$y \sim N_m(\mu, V)$.
14. If $y_1, y_2, \ldots, y_n$ are mutually independent
m-variate column vectors with
$y_i \sim N_m(\mu, V) : i = 1, \ldots, n$,
then, with $Y = (y_1, y_2, \ldots, y_n)$ we will write
$Y \sim N_{m \times n}(M, V \otimes I)$,
where $M = (\mu, \mu, \ldots, \mu)$.
15. With Y an m×n matrix such that
$Y \sim N_{m \times n}(M, V \otimes I)$,
the m×m matrix
$W = YY'$
has a noncentral Wishart distribution with dispersion matrix
V, degrees of freedom n, and noncentrality matrix
MM'. We will write
$W \sim W_m(V, n, MM')$.
16. If W is an m×m symmetric matrix whose m(m+1)/2
mathematically independent elements have the density
$f(W) = k_m(V,\nu)\, |W|^{\frac{1}{2}(\nu-m-1)}\, \mathrm{etr}\{-\tfrac{1}{2}V^{-1}W\}$
over the group of positive definite matrices, where
$k_m(V,\nu) = \{|V|^{\nu/2}\, 2^{\nu m/2}\, \pi^{m(m-1)/4} \prod_{j=1}^{m} \Gamma(\tfrac{1}{2}(\nu-j+1))\}^{-1}$,
we will write
$W \sim W_m(V, \nu, (0))$.
17. For referencing within the text, [·] will denote
bibliographical references, while (·,·,·) will denote
references to equations. Thus [4] refers to the
fourth entry in the bibliography, while (1.2.3) refers
to equation 3 in section 2 of Chapter I.
Chapter II
THE DOOLITTLE DECOMPOSITION AND ASSOCIATED
DISTRIBUTION THEORY
2.1 Introduction
In the event that the "noise" process, $X(t) = Y(t) - \mu(t)$,
is nonstationary, the general models and methods of
estimation, like those given by Box and Jenkins [5] and
Anderson [2], are no longer valid. We ignore the cases
where the nonstationarity is caused by trend, since this
can be removed and the resulting series is stationary and
usual methods apply. An appropriate model for a "noise"
process with this irregular behavior is
$x_j = \alpha_j x_{j-1} + \varepsilon_j : 1 \le j \le m$.
Our purpose in this chapter is to find a method of
estimating the parameters of this model and to establish
the attendant distribution theory.
Throughout, $\{y_j : 0 \le j \le m\}$ will denote the observed values
of $Y(t)$ at times $t_0 < t_1 < \cdots < t_m$, and these are assumed to be
normally distributed.
2.2 V: A Class of Dispersion Matrices
If $\{X(t) = Y(t) - \mu(t) : t \in T\}$ is a nonstationary
time series whose realizations, $x_j$, satisfy the relationship
$x_j = \alpha_j x_{j-1} + \varepsilon_j : 1 \le j \le m$,  (2.2.1)
then $\{X(t)\}$ is said to follow the first order generalized
autoregressive sequence.
Suppose we arrive at the (m+1)-variate column vector
$x = (x_0, x_1, \ldots, x_m)'$, obtained from random sampling from
the above process. We assume that x is an observation from
the (m+1)-variate normal distribution, that is,
$x \sim N_{m+1}(\mu, V)$,  (2.2.2)
where $\mu = \mathcal{E}x = 0$, since $\mathcal{E}X(t) = 0$, and
$V = \mathrm{Var}\, x = \mathcal{E}xx'$.
In order to determine V we assume that $x_0, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m$
are uncorrelated random variables with
$\mathrm{Var}(x_0) = \sigma^2\beta^2$  (2.2.3)
and
$\mathrm{Var}(\varepsilon_j) = \sigma^2 : 1 \le j \le m$.
Since the process follows equation (2.2.1) we can write
$\{x_0, x_1, \ldots, x_m\}$ in terms of $\{x_0, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m\}$. Applying
equation (2.2.1) recursively we find the relationship between
$x_j$ and $x_0, \varepsilon_1, \ldots, \varepsilon_m$. Letting $x:(m+1) = (x_0, x_1, \ldots, x_m)'$,
$\varepsilon:(m+1) = (x_0, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m)'$, and A:(m+1 × m+1) having elements
$a_{ii} = 1 : 0 \le i \le m$,
$a_{ij} = \alpha_{j+1}\alpha_{j+2}\cdots\alpha_i : 0 \le j < i \le m$,  (2.2.4)
$a_{ij} = 0$ : elsewhere,
then it is easily verified that
$x = A\varepsilon$.  (2.2.5)
From expression (2.2.5) we see that
$V = AUA'$,  (2.2.6)
where
$U:(m+1 \times m+1) = \mathrm{diag}(\sigma^2\beta^2, \sigma^2, \sigma^2, \ldots, \sigma^2)$.  (2.2.7)
Writing out the elements of V explicitly we have
$v_{00} = \sigma^2\beta^2$,
$v_{jj} = \sigma^2 + \alpha_j^2\, v_{j-1,j-1} : 1 \le j \le m$,  (2.2.8)
$v_{jk} = v_{kj} = \alpha_{j+1}\alpha_{j+2}\cdots\alpha_k\, v_{jj} : 0 \le j < k \le m$.
In the density of x we need $\Lambda = V^{-1}$; because of the
form of V, $\Lambda$ has a particularly simple form. By taking the
inverse of the product in (2.2.6) we find
$\Lambda = (A^{-1})'U^{-1}A^{-1}$.  (2.2.9)
The inverse of U is trivial and since A is lower triangular
its inverse is easily shown to have elements:
$a^{jj} = 1 : 0 \le j \le m$,
$a^{j+1,j} = -\alpha_{j+1} : 0 \le j \le m-1$,  (2.2.10)
$a^{jk} = 0$ : elsewhere.
Hence the elements of $\Lambda$ are given by
$\sigma^2\lambda_{00} = \beta^{-2} + \alpha_1^2$; $\quad \sigma^2\lambda_{mm} = 1$,
$\sigma^2\lambda_{jj} = 1 + \alpha_{j+1}^2 : 1 \le j \le m-1$,  (2.2.11)
$\sigma^2\lambda_{j+1,j} = \sigma^2\lambda_{j,j+1} = -\alpha_{j+1} : 0 \le j \le m-1$,
$\sigma^2\lambda_{jk} = 0$ : elsewhere.
A square matrix M, with $m_{ij} = 0 : |i-j| > 1$, is called a
"Jacobi matrix" in the literature. This matrix can be
factored: $\sigma^2\Lambda = R'R$, where R is lower triangular with
$r_{00} = \beta^{-1}$,
$r_{jj} = 1 : 1 \le j \le m$,  (2.2.12)
$r_{j,j-1} = -\alpha_j : 1 \le j \le m$,
$r_{jk} = 0$ : elsewhere.
This result can be obtained from (2.2.9) very easily.
One further property of V is: let $V_{(j)}$ be the
(j+1 × j+1) principal submatrix formed from V; then
$|V_{(j)}| = \sigma^{2(j+1)}\beta^2 : 0 \le j \le m$.  (2.2.13)
This result can be obtained by partitioning the
matrices in (2.2.6) and taking the determinant of the
corresponding product of the partitioned matrices. Since
A is lower triangular with unit diagonal elements any
square partition has determinant equal to unity. The
square partition of U is diagonal with determinant equal to
(2.2.13).
We note that the special form of V and its properties
are due to the model. We shall let $\mathcal{V}$ denote the class of
matrices with this special form. Specifically, $\mathcal{V}$ is the
class of all positive definite matrices V such that $V^{-1}$ is
a Jacobi matrix. To see that $V \in \mathcal{V}$ implies $V^{-1}$ is a Jacobi
matrix we note that V may always be represented by AUA',
where A is lower triangular and given by (2.2.4). Since $A^{-1}$
is given by (2.2.10), the resulting product $(A^{-1})'U^{-1}A^{-1}$
is always a Jacobi matrix.
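The construction $V = AUA'$ and the two properties just noted (the inverse is a Jacobi, i.e. tridiagonal, matrix; the principal minors satisfy (2.2.13)) are easy to confirm numerically. A minimal sketch, assuming NumPy and hypothetical values of $\alpha_j$, $\sigma^2$, $\beta^2$:

```python
import numpy as np

m = 4
alpha = np.array([0.8, 0.6, 0.4, 0.2])   # hypothetical alpha_1..alpha_m
sigma2, beta2 = 1.5, 2.0                 # sigma^2 and beta^2

# A of (2.2.4): unit diagonal, a_ij = alpha_{j+1}...alpha_i below it
A = np.eye(m + 1)
for i in range(1, m + 1):
    for j in range(i):
        A[i, j] = np.prod(alpha[j:i])

U = np.diag([sigma2 * beta2] + [sigma2] * m)   # (2.2.7)
V = A @ U @ A.T                                # (2.2.6)

# V^{-1} is a Jacobi matrix: zero beyond the first off-diagonal
Lam = np.linalg.inv(V)
off = np.abs(np.triu(Lam, k=2)).max()

# |V_(j)| = sigma^{2(j+1)} beta^2  (2.2.13)
minors = [np.linalg.det(V[: j + 1, : j + 1]) for j in range(m + 1)]
```

Both checks are exact up to floating-point rounding, whatever values are chosen for the parameters.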
2.3 The Doolittle Decomposition and Its Jacobian
Suppose we observe a number of (m+1)-variate column
vectors $y_j : 1 \le j \le n$ (n > m+1), obtained by random sampling from
an (m+1)-variate normal population with mean $\mu$ and dispersion
matrix V. It is well known that the maximum likelihood
estimates of $\mu$ and V are given by
$\hat{\mu} = \frac{1}{n}\sum_{j=1}^{n} y_j$  (2.3.1)
and
$\hat{V} = \frac{1}{n} W$,  (2.3.2)
where W is the (m+1 × m+1) matrix
$W = \sum_{j=1}^{n} (y_j - \bar{y})(y_j - \bar{y})'$.  (2.3.3)
W has the central Wishart distribution with $\nu = n-1$ degrees
of freedom and dispersion matrix V. It is well known that
$\mathcal{E}W = \nu V$, so that $\nu^{-1}W$ is an unbiassed estimate of V. In
later sections we shall assume that $V \in \mathcal{V}$ so that V may be
written as $V = AUA'$, where A and U are defined in section 2.
We note that the subdiagonal of A is $(\alpha_1, \alpha_2, \ldots, \alpha_m)$, with
the other elements being products of the $\alpha$'s. Since A contains
all the information on the $\alpha$'s and U contains information
on $\sigma^2$ and $\beta^2$, if we could estimate these matrices we would
have estimates of the unknown parameters. Since W estimates
V, perhaps a transformation on the elements of W will give
us estimates of A and U. It is with this intuitive notion
in mind that we proceed. We assume that a matrix W is
available with $\nu$ degrees of freedom and, for the moment,
that V is an arbitrary positive definite matrix.
For convenience we label the rows and columns of W
zero through m (rather than 1 through m+1) and let $W_{(j)}$ be
the (j+1 × j+1) principal submatrix of W. Define
$d_0 = |W_{(0)}|$,
$d_j = |W_{(j)}| / |W_{(j-1)}| : 1 \le j \le m$.  (2.3.4)
With $D:(m+1 \times m+1) = \mathrm{diag}(d_0, d_1, \ldots, d_m)$ define G:(m+1 ×
m+1), a lower triangular matrix with unit diagonal elements,
(uniquely) by
$W = GDG'$.  (2.3.5)
Since the (m+1) random diagonal elements of D and the
$\frac{1}{2}m(m+1)$ random elements $\{g_{ij} : 0 \le j < i \le m\}$ of G together
give $\frac{1}{2}(m+2)(m+1)$ random variables we see that the trans-
formation from W into G and D is nonsingular. The actual
decomposition of W into G and D can be obtained using the
forward Doolittle procedure outlined in Rao [14] and Saw [16].
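The forward Doolittle decomposition $W = GDG'$ is the same computation as the square-root-free Cholesky (LDL') factorization. A minimal sketch, assuming NumPy, together with a check of $d_j$ against the principal-minor ratios of (2.3.4):

```python
import numpy as np

def doolittle(W):
    """Forward Doolittle: W = G D G' with G unit lower triangular, D diagonal."""
    p = W.shape[0]
    G = np.eye(p)
    d = np.empty(p)
    S = W.astype(float).copy()
    for j in range(p):
        d[j] = S[j, j]
        G[j + 1:, j] = S[j + 1:, j] / d[j]
        # sweep row/column j out of the trailing submatrix
        S[j + 1:, j + 1:] -= d[j] * np.outer(G[j + 1:, j], G[j + 1:, j])
    return G, d

rng = np.random.default_rng(1)
Y = rng.normal(size=(6, 12))
W = Y @ Y.T                      # a symmetric positive definite test matrix

G, d = doolittle(W)

# d_j equals the principal-minor ratio |W_(j)| / |W_(j-1)|
minors = [np.linalg.det(W[: j + 1, : j + 1]) for j in range(6)]
ratios = [minors[0]] + [minors[j] / minors[j - 1] for j in range(1, 6)]
```

The sweep at step j forms the Schur complement of the leading j+1 rows and columns, which is why the pivots reproduce the determinant ratios.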
We wish to determine the joint density of G and D.
This can be obtained by using the density of W, denoted by
f(·), and obtaining the Jacobian of the transformation from
W into G and D defined by (2.3.5). Denoting this Jacobian
by $J\{W \to G, D\}$, and the joint density of G and D by h(G,D), then
$h(G,D) = f(GDG')\, J\{W \to G, D\}$.  (2.3.6)
Direct evaluation of the Jacobian is cumbersome and the
method used here is due to Hsu as reported by Deemer and
Olkin [7]. Since the derivation of the Jacobian is rather
long, the rest of this section is devoted to it.
We seek the Jacobian of the transformation from W to
G and D defined by
$W = GDG'$,  (2.3.7)
with all matrices (m+1 × m+1) and G lower triangular with
unit diagonals. Let $[\delta G]$ and $[\delta D]$, both (m+1 × m+1),
denote small changes in G and D, respectively. Suppose that
the changes $[\delta G]$ in G and $[\delta D]$ in D bring about a change
$[\delta W]$ in W so that (2.3.5) is preserved. That is,
$W + [\delta W] = (G + [\delta G])(D + [\delta D])(G + [\delta G])'$.  (2.3.8)
Expanding equation (2.3.8) and dropping terms of second
order in the $[\delta\,\cdot\,]$, we find that
$W + [\delta W] = GDG' + [\delta G]DG' + G[\delta D]G' + GD[\delta G]'$.  (2.3.9)
Since $W = GDG'$, we see that
$[\delta W] = [\delta G]DG' + G[\delta D]G' + GD[\delta G]'$.  (2.3.10)
Hsu has shown that
$J\{W \to G, D\} = J\{[\delta W] \to [\delta G], [\delta D]\}$,  (2.3.11)
where $J\{[\delta W] \to [\delta G], [\delta D]\}$ is the Jacobian of the transforma-
tion defined by (2.3.10), in which G and D are considered
to be fixed (m+1 × m+1) matrices. In essence we have gone
from a nonlinear transformation in G and D to a linear
transformation in the differential elements $[\delta G]$ and $[\delta D]$.
Pre- and post-multiplying (2.3.10) by $G^{-1}$ and $(G')^{-1}$,
respectively, gives
$G^{-1}[\delta W](G')^{-1} = G^{-1}[\delta G]D + [\delta D] + D[\delta G]'(G')^{-1}$.  (2.3.12)
Let
$A = G^{-1}[\delta W](G')^{-1}$,
$B = G^{-1}[\delta G]$,  (2.3.13)
$C = [\delta D]$.
We note that A is symmetric, B is lower triangular with
$(B)_{ii} = 0 : 0 \le i \le m$, and C is diagonal. We may rewrite (2.3.12)
as
$A = BD + C + DB'$.  (2.3.14)
From equations (2.3.10), (2.3.12), (2.3.13), and (2.3.14) we
have
$J\{W \to G, D\} = J\{[\delta W] \to [\delta G], [\delta D]\}
= J\{[\delta W] \to A\}\; J\{A \to B, C\}\; J\{B, C \to [\delta G], [\delta D]\}$.  (2.3.15)
We shall evaluate the last three Jacobians separately.
The Jacobian $J\{[\delta W] \to A\}$ is the Jacobian of the first
transformation defined by (2.3.13). This can be evaluated
by usual methods and we find
$J\{[\delta W] \to A\} = |G|^{-(m+2)} = 1$,  (2.3.16)
since G is lower triangular with unit diagonals.
The Jacobian $J\{B, C \to [\delta G], [\delta D]\}$ is the Jacobian of
the transformation defined by the last two equations in
(2.3.13). Hence it may be factored into the product of
two Jacobians, namely,
$J\{B, C \to [\delta G], [\delta D]\} = J\{B \to [\delta G]\}\, J\{C \to [\delta D]\}$.  (2.3.17)
By the usual methods for determining Jacobians we find
$J\{B \to [\delta G]\} = |G|^{m+1} = 1$  (2.3.18)
and
$J\{C \to [\delta D]\} = |I|^{m+1} = 1$,  (2.3.19)
so that equation (2.3.17) is unity.
Finally we need to determine $J\{A \to B, C\}$. Writing out
the equations given by (2.3.14) and using the fact that B
is lower triangular with zero diagonal elements we find
$a_{ij} = a_{ji} = b_{ij}\, d_j : 0 \le j < i \le m$
and
$a_{jj} = c_{jj} : 0 \le j \le m$.  (2.3.20)
Hence we find that
$\partial a_{jj}/\partial c_{jj} = 1 : 0 \le j \le m$
and
$\partial a_{ij}/\partial b_{ij} = d_j : 0 \le j < i \le m$,  (2.3.21)
so that the Jacobian is
$J\{A \to B, C\} = \dfrac{\partial(a_{00}, a_{10}, a_{20}, \ldots, a_{m0}, a_{11}, a_{21}, \ldots, a_{m1}, \ldots, a_{mm})}{\partial(c_{00}, b_{10}, b_{20}, \ldots, b_{m0}, c_{11}, b_{21}, \ldots, b_{m1}, \ldots, c_{mm})} = \prod_{j=0}^{m} d_j^{\,m-j}$.  (2.3.22)
Following equation (2.3.15) we obtain
$J\{W \to G, D\} = \prod_{j=0}^{m} d_j^{\,m-j}$.  (2.3.23)
2.4 The Joint Distribution of G and D for Arbitrary V and
in Particular When $V \in \mathcal{V}$.
In section 3 we derived the Jacobian of the nonlinear
transformation $W = GDG'$. We now suppose that, for
arbitrary positive definite V,
$W \sim W_{m+1}(V, \nu, (0))$.  (2.4.1)
The joint density of G and D is then obtained from (2.4.1),
(2.3.6), and (2.3.23):
$h(G,D) = K_{m+1}(V,\nu)\, \mathrm{etr}\{-\tfrac{1}{2}V^{-1}GDG'\} \prod_{j=0}^{m} d_j^{\frac{1}{2}(\nu+m)-j-1}$  (2.4.2)
on
$d_j > 0 : 0 \le j \le m$ and $-\infty < g_{ij} < \infty : 0 \le j < i \le m$.
The term $K_{m+1}(V,\nu)$ is defined by
$K_{m+1}(V,\nu) = \{|V|^{\nu/2}\, 2^{\nu(m+1)/2}\, \pi^{m(m+1)/4} \prod_{j=0}^{m}\Gamma(\tfrac{1}{2}(\nu-j))\}^{-1}$.  (2.4.3)
With $\Lambda = V^{-1}$ we may write
$\mathrm{tr}\{V^{-1}GDG'\} = \mathrm{tr}\{\Lambda GDG'\} = \mathrm{tr}\{DG'\Lambda G\} = \sum_{j=0}^{m} d_j\, (G'\Lambda G)_{jj}$.  (2.4.4)
Hence we see that the density in (2.4.2) partitions into
the subsets $\{d_0, g_{10}, g_{20}, \ldots, g_{m0}\}$; $\{d_1, g_{21}, g_{31}, \ldots, g_{m1}\}$; $\ldots$;
$\{d_{m-1}, g_{m,m-1}\}$; $\{d_m\}$, which are mutually independent, but
variables within a subset are dependent.
In the case that $V \in \mathcal{V}$ we may write $\sigma^2\Lambda = R'R$ from
equation (2.2.12) and we find
$(G'\Lambda G)_{jj} = \sigma^{-2}(G'R'RG)_{jj} = \sigma^{-2}(RG)'_{\cdot j}(RG)_{\cdot j}$,  (2.4.5)
where
$(RG)_{\cdot 0} = (\beta^{-1};\; g_{10} - \alpha_1;\; g_{20} - \alpha_2 g_{10};\; \ldots;\; g_{m0} - \alpha_m g_{m-1,0})'$
and for $1 \le j \le m-1$
$(RG)_{\cdot j} = (0;\, 0;\, \ldots;\, 1;\; g_{j+1,j} - \alpha_{j+1};\; g_{j+2,j} - \alpha_{j+2}g_{j+1,j};\; \ldots;\; g_{mj} - \alpha_m g_{m-1,j})'$.  (2.4.6)
The "1" appears as the jth element in $(RG)_{\cdot j}$.
Now we may write
$(G'\Lambda G)_{00} = \sigma^{-2}\{\beta^{-2} + \sum_{k=1}^{m}(g_{k0} - \alpha_k g_{k-1,0})^2\}$
and for $1 \le j \le m-1$  (2.4.7)
$(G'\Lambda G)_{jj} = \sigma^{-2}\{1 + \sum_{k=j+1}^{m}(g_{kj} - \alpha_k g_{k-1,j})^2\}$,
with $g_{00} = g_{jj} = 1$.
The density h(G,D) factors into
$h(G,D) = \{\prod_{j=0}^{m-1} h_j(d_j, g_{j+1,j}, g_{j+2,j}, \ldots, g_{mj})\}\, h_m(d_m)$,  (2.4.8)
where
$h_0(d_0, g_{10}, g_{20}, \ldots, g_{m0}) =
\{\Gamma(\tfrac{1}{2}\nu)\, 2^{\frac{1}{2}(\nu+m)}\, \pi^{\frac{1}{2}m}\, \sigma^{\nu+m}\beta^{\nu}\}^{-1}\,
d_0^{\frac{1}{2}(\nu+m)-1} \exp\{-\tfrac{d_0}{2\sigma^2}[\beta^{-2} + \sum_{k=1}^{m}(g_{k0} - \alpha_k g_{k-1,0})^2]\}$,  (2.4.9)
for $1 \le j \le m-1$
$h_j(d_j, g_{j+1,j}, \ldots, g_{mj}) =
\{\Gamma(\tfrac{1}{2}(\nu-j))\, 2^{\frac{1}{2}(\nu+m)-j}\, \pi^{\frac{1}{2}(m-j)}\, \sigma^{\nu+m-2j}\}^{-1}\,
d_j^{\frac{1}{2}(\nu+m)-j-1} \exp\{-\tfrac{d_j}{2\sigma^2}[1 + \sum_{k=j+1}^{m}(g_{kj} - \alpha_k g_{k-1,j})^2]\}$,  (2.4.10)
and
$h_m(d_m) = \{\Gamma(\tfrac{1}{2}(\nu-m))\,(2\sigma^2)^{\frac{1}{2}(\nu-m)}\}^{-1}\, d_m^{\frac{1}{2}(\nu-m)-1} \exp\{-\tfrac{d_m}{2\sigma^2}\}$.  (2.4.11)
(2.4.11)
2.5 The Distribution of G, of D, and of G Conditional on
D When V Is Arbitrary and When $V \in \mathcal{V}$.
The necessity of knowing the distribution of $d_0; d_1; \ldots;$
$d_m$ and the subsets $\{g_{10}; g_{20}; \ldots; g_{m0}\}$; $\{g_{21}; g_{31}; \ldots; g_{m1}\}$; $\ldots$;
$\{g_{m,m-1}\}$ arises from the fact that functions of these statis-
tics will be estimators for the parameters $\sigma^2$, $\beta^2$, and
$\{\alpha_1, \alpha_2, \ldots, \alpha_m\}$. A knowledge of the distribution of the
estimators gives us the information we need to talk about
the "goodness" of the estimators. It is to this end that
we derive the distribution of G, D, and G conditional on D.
The distribution of the elements of D follows directly
from a theorem given in class lecture notes and in Saw [15].
Theorem 2.5.1
If $W \sim W_{m+1}(V, \nu, (0))$, $\nu$ integer with $\nu > m$, and $W_{(r)}$ is
defined by
$W_{(r)} = \begin{bmatrix} w_{00} & \cdots & w_{0r} \\ \vdots & & \vdots \\ w_{r0} & \cdots & w_{rr} \end{bmatrix} : 0 \le r \le m$,  (2.5.1)
then $|W_{(0)}|/|V_{(0)}|$ and $\{|W_{(r)}|/|W_{(r-1)}|\}/\{|V_{(r)}|/|V_{(r-1)}|\} : 1 \le r \le m$
are independent chi-square variates such that
$|W_{(0)}|/|V_{(0)}| \sim \chi^2_{\nu}(0)$  (2.5.2)
and for $1 \le r \le m$
$\{|W_{(r)}|/|W_{(r-1)}|\}/\{|V_{(r)}|/|V_{(r-1)}|\} \sim \chi^2_{\nu-r}(0)$.  (2.5.3)
Since from (2.3.4) we have
$d_0 = |W_{(0)}|, \qquad d_j = |W_{(j)}|/|W_{(j-1)}| : 1 \le j \le m$,
then by direct application of Theorem 2.5.1 we have
$d_0 \sim |V_{(0)}|\, \chi^2_{\nu}(0)$
and for $1 \le j \le m$
$d_j \sim \{|V_{(j)}|/|V_{(j-1)}|\}\, \chi^2_{\nu-j}(0)$.  (2.5.4)
Now if we allow $V \in \mathcal{V}$, then since $|V_{(j)}| = \sigma^{2(j+1)}\beta^2$ we
find
$d_0 \sim \sigma^2\beta^2\chi^2_{\nu}(0)$
and for $1 \le j \le m$  (2.5.5)
$d_j \sim \sigma^2\chi^2_{\nu-j}(0)$.
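The chi-square characterization in (2.5.5) lends itself to a Monte Carlo check: repeatedly draw a Wishart matrix from a dispersion matrix in $\mathcal{V}$, form the $d_j$, and compare sample means with $\mathcal{E}d_0 = \sigma^2\beta^2\nu$ and $\mathcal{E}d_j = \sigma^2(\nu-j)$. A sketch, assuming NumPy and hypothetical parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 40
nu = n - 1
alpha = np.array([0.7, 0.5, 0.3])        # hypothetical alpha_1..alpha_m
sigma2, beta2 = 1.0, 1.5

# V = A U A' as in section 2.2
A = np.eye(m + 1)
for i in range(1, m + 1):
    for j in range(i):
        A[i, j] = np.prod(alpha[j:i])
V = A @ np.diag([sigma2 * beta2] + [sigma2] * m) @ A.T
L = np.linalg.cholesky(V)

reps = 3000
d = np.empty((reps, m + 1))
for r in range(reps):
    X = L @ rng.normal(size=(m + 1, n))          # n columns from N_{m+1}(0, V)
    Xc = X - X.mean(axis=1, keepdims=True)
    W = Xc @ Xc.T                                # Wishart with nu = n - 1 d.f.
    minors = [np.linalg.det(W[: j + 1, : j + 1]) for j in range(m + 1)]
    d[r, 0] = minors[0]
    d[r, 1:] = np.array(minors[1:]) / np.array(minors[:-1])

means = d.mean(axis=0)
expected = [sigma2 * beta2 * nu] + [sigma2 * (nu - j) for j in range(1, m + 1)]
```

The agreement of the sample means (and, with more work, the higher moments) with the stated chi-square laws is what the estimators of Chapter III exploit.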
To find the density of the subsets $\{g_{10}; g_{20}; \ldots; g_{m0}\}$;
$\{g_{21}; g_{31}; \ldots; g_{m1}\}$; $\ldots$; $\{g_{m,m-1}\}$ we refer back to equations
(2.4.9) and (2.4.10). Using those equations we may write,
for $0 \le j \le m-1$,
$h_j(d_j, g_{j+1,j}; \ldots; g_{mj}) = C\, d_j^{\frac{1}{2}(\nu+m)-j-1}\, \mathrm{etr}\{-\tfrac{1}{2}d_j(G'\Lambda G)_{jj}\}$,  (2.5.6)
for some constant C. Performing the integration over $d_j$ and
replacing $(G'\Lambda G)_{jj}$ by $\sum_{k=j}^{m}\sum_{l=j}^{m} \lambda_{kl}\, g_{kj}\, g_{lj}$ we have
$h_j(g_{j+1,j}; \ldots; g_{mj}) = C_m(\nu:V,j)\{\sum_{k=j}^{m}\sum_{l=j}^{m} \lambda_{kl}\, g_{kj}\, g_{lj}\}^{-[\frac{1}{2}(\nu+m)-j]}$,  (2.5.7)
where, with $|V_{(-1)}|$ taken as unity,
$C_m(\nu:V,j) = \dfrac{\Gamma(\tfrac{1}{2}(\nu+m)-j)\, |V_{(j-1)}|^{\frac{1}{2}(\nu-j)}\, |V_{(j)}|^{\frac{1}{2}}}{\Gamma(\tfrac{1}{2}(\nu-j))\, \pi^{\frac{1}{2}(m-j)}\, |V|^{\frac{1}{2}}\, |V_{(j)}|^{\frac{1}{2}(\nu-j)}}$.  (2.5.8)
Remembering that $g_{jj} = 1$, a transformation shows the variables
in (2.5.7) have a multivariate t-distribution. The form and
properties of the multivariate t-distribution were found by
Cornish [6] in 1954 and also in the same year by Dunnett and
Sobel [8].
Now if we allow $V \in \mathcal{V}$, we find
$h_0(g_{10}; g_{20}; \ldots; g_{m0}) = \dfrac{\beta^m\,\Gamma(\tfrac{1}{2}(\nu+m))}{\pi^{\frac{1}{2}m}\,\Gamma(\tfrac{1}{2}\nu)}\{1 + \beta^2\sum_{k=1}^{m}(g_{k0} - \alpha_k g_{k-1,0})^2\}^{-\frac{1}{2}(\nu+m)}$  (2.5.9)
and for $1 \le j \le m-1$
$h_j(g_{j+1,j}; \ldots; g_{mj}) = \dfrac{\Gamma(\tfrac{1}{2}(\nu+m)-j)}{\pi^{\frac{1}{2}(m-j)}\,\Gamma(\tfrac{1}{2}(\nu-j))}\{1 + \sum_{k=j+1}^{m}(g_{kj} - \alpha_k g_{k-1,j})^2\}^{-[\frac{1}{2}(\nu+m)-j]}$.  (2.5.10)
Evidently $g_{10}, g_{21}, \ldots, g_{m,m-1}$ are mutually independent and
$\nu^{\frac{1}{2}}\beta(g_{10} - \alpha_1) \sim t_{\nu}(0)$  (2.5.11)
and
$(\nu-k+1)^{\frac{1}{2}}(g_{k,k-1} - \alpha_k) \sim t_{\nu-k+1}(0) : 2 \le k \le m$.  (2.5.12)
To find the distribution of G conditional on D, note first
that, with $|V_{(-1)}|$ taken as unity, the marginal distribution of D is
$h(D) = h(d_0)\, h(d_1) \cdots h(d_m) = \prod_{j=0}^{m} \dfrac{d_j^{\frac{1}{2}(\nu-j)-1} \exp\{-\tfrac{1}{2}|V_{(j-1)}|\,|V_{(j)}|^{-1} d_j\}}{(2|V_{(j)}|\,|V_{(j-1)}|^{-1})^{\frac{1}{2}(\nu-j)}\, \Gamma(\tfrac{1}{2}(\nu-j))}$.  (2.5.13)
We have from equations (2.4.2) and (2.4.4) that the joint
density of (G,D) is
$h(G,D) = K_{m+1}(V,\nu) \prod_{j=0}^{m} \{d_j^{\frac{1}{2}(\nu+m)-j-1} \exp\{-\tfrac{1}{2}d_j\,(G'\Lambda G)_{jj}\}\}$.
The conditional distribution of G given D is then
$h(G|D) = \prod_{j=0}^{m-1} \dfrac{d_j^{\frac{1}{2}(m-j)} \exp\{-\tfrac{1}{2}d_j[(G'\Lambda G)_{jj} - |V_{(j-1)}|\,|V_{(j)}|^{-1}]\}}{(|V|\,|V_{(j)}|^{-1})^{\frac{1}{2}}\,(2\pi)^{\frac{1}{2}(m-j)}}$.  (2.5.14)
Although this does not have a very pleasing form, if we let
$V \in \mathcal{V}$ we find the conditional density simplifies greatly. If
we write $G'\Lambda G = \sigma^{-2}(RG)'(RG)$ and use the fact that $|V_{(j)}| = \sigma^{2(j+1)}\beta^2$,
we find
$h(G|D) = \prod_{j=0}^{m-1} \{(2\pi\sigma^2 d_j^{-1})^{-\frac{1}{2}(m-j)} \exp[-\tfrac{d_j}{2\sigma^2} \sum_{k=j+1}^{m}(g_{kj} - \alpha_k g_{k-1,j})^2]\}$.  (2.5.15)
Let $g_j$ be the (m-j)-variate column vector given by
$g_j = (g_{j+1,j};\; g_{j+2,j};\; \ldots;\; g_{mj})' : 0 \le j \le m-1$,  (2.5.16)
$a_j$ the (m-j)-dimensional column vector defined by
$a_j = (\alpha_{j+1};\; \alpha_{j+1}\alpha_{j+2};\; \ldots;\; \alpha_{j+1}\alpha_{j+2}\cdots\alpha_m)' : 0 \le j \le m-1$,  (2.5.17)
and $V_j$ the (m-j × m-j) matrix whose elements are given by
$(V_j)_{11} = \sigma^2/d_j : 0 \le j \le m-1$,
$(V_j)_{kk} = \sigma^2/d_j + \alpha_{j+k}^2\,(V_j)_{k-1,k-1} : 2 \le k \le m-j;\; 0 \le j \le m-1$,  (2.5.18)
$(V_j)_{kl} = (V_j)_{lk} = \alpha_{j+k+1}\alpha_{j+k+2}\cdots\alpha_{j+l}\,(V_j)_{kk} : 1 \le k < l \le m-j;\; 0 \le j \le m-1$.
Then equation (2.5.15) may be written as
$h(G|D) = \prod_{j=0}^{m-1} h_j(g_j|d_j)$,  (2.5.19)
where for $0 \le j \le m-1$
$h_j(g_j|d_j) = (2\pi)^{-\frac{1}{2}(m-j)}\,|V_j|^{-\frac{1}{2}} \exp[-\tfrac{1}{2}(g_j - a_j)'V_j^{-1}(g_j - a_j)]$.  (2.5.20)
That is,
$g_j|d_j \sim N_{m-j}(a_j, V_j) : 0 \le j \le m-1$.  (2.5.21)
2.6 Verification of the Distribution of G.
In finding the density of the subset $(g_{k+1,k}; \ldots; g_{mk})$
the constant $C_m(\nu:V,k)$ was given, but was not verified. In
this section we show how the value of the constant can be
obtained.
We need to evaluate the integral
$I_g = \int\cdots\int \big(\sum_{i=0}^{p}\sum_{j=0}^{p} a_{ij}\, u_i u_j\big)^{-\beta}\, du_1 du_2 \cdots du_p$,  (2.6.1)
where $u_0 = 1$ and $a_{ij} = a_{ji}$.
We may write
$\sum_{i=0}^{p}\sum_{j=0}^{p} a_{ij}\, u_i u_j = a_{00} + 2u'a + u'Au$,  (2.6.2)
where
$a = (a_{01}, a_{02}, \ldots, a_{0p})'$  (2.6.3)
and
$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pp} \end{bmatrix}$.  (2.6.4)
Now equation (2.6.2) may be rewritten as
$\sum_{i=0}^{p}\sum_{j=0}^{p} a_{ij}\, u_i u_j = a_{00} + u'Au + 2u'a$  (2.6.5)
$\qquad = a_{00} + (u + A^{-1}a)'A(u + A^{-1}a) - a'A^{-1}a$.  (2.6.6)
Write
$u + A^{-1}a = (a_{00} - a'A^{-1}a)^{\frac{1}{2}}\, K w$,  (2.6.7)
where $K'AK = I$, so that $|K| = |A|^{-\frac{1}{2}}$. The Jacobian from u
into w can be found by standard methods and is
$J\{u \to w\} = (a_{00} - a'A^{-1}a)^{\frac{1}{2}p}\, |K|$  (2.6.8)
$\qquad = (a_{00} - a'A^{-1}a)^{\frac{1}{2}p}\, |A|^{-\frac{1}{2}}$.  (2.6.9)
Hence we have
$I_g = (a_{00} - a'A^{-1}a)^{\frac{1}{2}p-\beta}\, |A|^{-\frac{1}{2}} \int\cdots\int (1 + w'w)^{-\beta}\, dw_1 \cdots dw_p$.  (2.6.10)
We compare this with the following version of the multi-
variate t-distribution:
$f(t) = C_p\, |\Sigma|^{-\frac{1}{2}}\, [1 + n^{-1}(t-\theta)'\Sigma^{-1}(t-\theta)]^{-\frac{1}{2}(n+p)} : -\infty < t_j < \infty,\; 1 \le j \le p$,  (2.6.11)
where
$C_p = \Gamma(\tfrac{1}{2}(n+p)) / \{(n\pi)^{\frac{1}{2}p}\, \Gamma(\tfrac{1}{2}n)\}$.  (2.6.12)
Let $\Sigma = n^{-1}I$ and $\theta = 0$; then setting $\beta = \tfrac{1}{2}(n+p)$ we see that
$n = 2\beta - p$; since $n > 0$ we must have $\beta > \tfrac{1}{2}p$ and we find
$\int\cdots\int (1 + w'w)^{-\beta}\, dw_1 \cdots dw_p = \pi^{\frac{1}{2}p}\, \Gamma(\beta - \tfrac{1}{2}p) / \Gamma(\beta)$.  (2.6.13)
Hence,
$I_g = (a_{00} - a'A^{-1}a)^{\frac{1}{2}p-\beta}\, |A|^{-\frac{1}{2}}\, \pi^{\frac{1}{2}p}\, \Gamma(\beta - \tfrac{1}{2}p)/\Gamma(\beta)$  (2.6.14)
so that
$I_g^{-1} = (a_{00} - a'A^{-1}a)^{\beta-\frac{1}{2}p}\, |A|^{\frac{1}{2}}\, \pi^{-\frac{1}{2}p}\, \Gamma(\beta)/\Gamma(\beta - \tfrac{1}{2}p)$.  (2.6.15)
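Equation (2.6.13) is easy to verify numerically in the case p = 1, where it asserts $\int_{-\infty}^{\infty}(1+w^2)^{-\beta}\,dw = \pi^{\frac{1}{2}}\Gamma(\beta-\tfrac{1}{2})/\Gamma(\beta)$. A quadrature sketch, assuming NumPy (the truncation width and grid are arbitrary choices):

```python
import math
import numpy as np

def t_integral(beta_exp, half_width=400.0, n_pts=4_000_001):
    """Trapezoidal quadrature of the p = 1 integral of (2.6.13),
    truncated to [-half_width, half_width]."""
    w = np.linspace(-half_width, half_width, n_pts)
    y = (1.0 + w * w) ** (-beta_exp)
    dx = w[1] - w[0]
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

beta_exp = 2.0
numeric = t_integral(beta_exp)
closed = math.pi ** 0.5 * math.gamma(beta_exp - 0.5) / math.gamma(beta_exp)
```

For $\beta = 2$ the closed form reduces to $\pi/2$, and the truncation error is of order `half_width`$^{-3}$, far below the quadrature tolerance used here.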
Referring back to (2.5.8) we see that $p = m-j$ and $\beta = \tfrac{1}{2}(\nu+m)-j$,
so that
$C_m(\nu:V,j) = \dfrac{\Gamma(\tfrac{1}{2}(\nu+m)-j)}{\Gamma(\tfrac{1}{2}(\nu-j))\, \pi^{\frac{1}{2}(m-j)}}\, (a_{00} - a'A^{-1}a)^{\frac{1}{2}(\nu-j)}\, |A|^{\frac{1}{2}}$,  (2.6.16)
where now the $a_{ij}$ are the elements $\lambda_{kl} : j \le k, l \le m$. Writing
$A_1 = \begin{bmatrix} \lambda_{jj} & \lambda_{j,j+1} & \cdots & \lambda_{jm} \\ \lambda_{j+1,j} & \lambda_{j+1,j+1} & \cdots & \lambda_{j+1,m} \\ \vdots & & & \vdots \\ \lambda_{mj} & \lambda_{m,j+1} & \cdots & \lambda_{mm} \end{bmatrix}$  (2.6.17)
and
$A_2 = \begin{bmatrix} \lambda_{j+1,j+1} & \lambda_{j+1,j+2} & \cdots & \lambda_{j+1,m} \\ \lambda_{j+2,j+1} & \lambda_{j+2,j+2} & \cdots & \lambda_{j+2,m} \\ \vdots & & & \vdots \\ \lambda_{m,j+1} & \lambda_{m,j+2} & \cdots & \lambda_{mm} \end{bmatrix}$,  (2.6.18)
we have $A = A_2$ and $a_{00} - a'A^{-1}a = |A_1|/|A_2|$. Now with $\Lambda = V^{-1}$,
then by an application of the clockwise rule,
$|A_1| = |V|^{-1}\, |V_{(j-1)}|$  (2.6.19)
and
$|A_2| = |V|^{-1}\, |V_{(j)}|$,  (2.6.20)
hence we find
$a_{00} - a'A^{-1}a = |V_{(j-1)}|/|V_{(j)}|$,  (2.6.21)
and we confirm that
$C_m(\nu:V,j) = \dfrac{\Gamma(\tfrac{1}{2}(\nu+m)-j)\, |V_{(j-1)}|^{\frac{1}{2}(\nu-j)}\, |V_{(j)}|^{\frac{1}{2}}}{\Gamma(\tfrac{1}{2}(\nu-j))\, \pi^{\frac{1}{2}(m-j)}\, |V|^{\frac{1}{2}}\, |V_{(j)}|^{\frac{1}{2}(\nu-j)}}$.  (2.6.22)
Chapter III
THE ESTIMATORS $\sigma^2_*$, $\beta^2_*$, AND $\{\alpha^*_j : 1 \le j \le m\}$
3.1 Introduction
In Chapter II the unknown parameters $\sigma^2$, $\beta^2$, and
$\{\alpha_j : 1 \le j \le m\}$ were introduced to define the generalized first
order autoregressive process and thence the underlying
distribution of the observations. In this chapter we
propose estimators (alternatives to the maximum likelihood
estimators) for these parameters and discuss their properties.
(To distinguish between the estimators proposed in Chapter II
and the maximum likelihood estimators, we shall reserve the
hat (^) notation for the latter and star (*) notation for
the former.)
Finally, tests of hypothesis concerning the parameters
are discussed. Special attention is given to the case of
testing $\alpha_k = \alpha_{k+1} = \cdots = \alpha_m = 0$, for $k = 2, 3, \ldots, m$. A
method, due to Fisher, of combining independent tests is
likely to be appropriate.
2 2
3.2 The Distribution and Properties of o 3, 8 and {a,;: lj!m.}
Suppose that we observe a number of (m+1)-variate
column vectors y_j : 1≤j≤n (n>m+1), obtained by random sampling
from an (m+1)-variate normal population with mean μ and
dispersion matrix V. We assume that these observations come from
the process Y(t) = μ(t) + X(t), observed at times t₀ < t₁ < ⋯ < t_m,
where X(t) follows the first order generalized autoregressive
process. Since 𝓔Y(t) = μ(t) we have, letting μ(t_i) = μ_i :
0≤i≤m, that

    μ*_i = ȳ_i = n⁻¹ Σ_{j=1}^n y_{ij} : 0≤i≤m ,                  (3.2.1)

or alternatively, with the (m+1) dimensional column vector μ =
(μ₀, μ₁, ..., μ_m)', that

    μ* = ȳ .                                                     (3.2.2)
Estimates of the noise process X(t) at the times t₀, t₁, ..., t_m
are given by

    x_j = y_j − ȳ : 1≤j≤n .                                      (3.2.3)

From these we may arrive at

    W = Σ_{j=1}^n x_j x_j'                                       (3.2.4)

      = Σ_{j=1}^n (y_j − ȳ)(y_j − ȳ)' ,                          (3.2.5)
where W has the central Wishart distribution on v = n−1
degrees of freedom and dispersion matrix V (where V∈𝒱).
Hence the theory of Chapter II is applicable and we have
from equations (2.5.5), (2.5.11), and (2.5.12) that

    d₀ ~ σ²β² χ²_v(0) ,                                         (3.2.6)

    d_j ~ σ² χ²_{v−j}(0) : 1≤j≤m ,                               (3.2.7)

    v^{½} β (g_{10} − α₁) ~ t_v(0) ,                             (3.2.8)

and

    (v−k+1)^{½} (g_{k,k−1} − α_k) ~ t_{v−k+1}(0) : 2≤k≤m .        (3.2.9)

Hence the estimators for {α_j : 1≤j≤m} are

    α*_j = g_{j,j−1} : 1≤j≤m ,                                   (3.2.10)

with densities given in equations (3.2.8) and (3.2.9).
The estimators for σ² and β² are

    σ*² = (mc₁)⁻¹ Σ_{j=1}^m d_j ,                                (3.2.11)

where

    c₁ = v − ½(m+1) ,                                            (3.2.12)

and

    β*² = c₂ d₀ (Σ_{j=1}^m d_j)⁻¹ ,                              (3.2.13)

where

    c₂ = v⁻¹(mc₁ − 2) .                                          (3.2.14)

We note that σ*² has a Gamma density and β*² has a Beta Type
2 density.
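As a minimal illustrative sketch (not part of the original text; the function name and array layout are assumptions), the starred estimators can be computed from the raw observations, obtaining the quantities d_j and g_{j,j−1} from the decomposition W = GDG' of Chapter II via the Cholesky factor of W:

```python
import numpy as np

def starred_estimators(Y):
    """Starred estimators of sigma^2, beta^2 and {alpha_j} (illustrative sketch).

    Y is an ((m+1) x n) array whose columns are the observed (m+1)-variate
    vectors y_1, ..., y_n, with rows indexed 0..m as in the text.
    """
    m = Y.shape[0] - 1
    n = Y.shape[1]
    v = n - 1                                     # degrees of freedom, v = n-1
    ybar = Y.mean(axis=1, keepdims=True)          # (3.2.2): mu* = ybar
    X = Y - ybar                                  # (3.2.3)
    W = X @ X.T                                   # (3.2.4)-(3.2.5)
    # W = G D G' with G unit lower triangular; from the Cholesky factor C of W,
    # d_j = C[j,j]^2 and g_{jk} = C[j,k]/C[k,k].
    C = np.linalg.cholesky(W)
    d = np.diag(C) ** 2
    alpha_star = np.array([C[j, j - 1] / C[j - 1, j - 1]
                           for j in range(1, m + 1)])             # (3.2.10)
    c1 = v - 0.5 * (m + 1)                        # (3.2.12)
    sigma2_star = d[1:].sum() / (m * c1)          # (3.2.11)
    c2 = (m * c1 - 2.0) / v                       # (3.2.14)
    beta2_star = c2 * d[0] / d[1:].sum()          # (3.2.13)
    return alpha_star, sigma2_star, beta2_star, d, v
```

The tail d₁,...,d_m feeds σ*², while the ratio d₀/Σd_j feeds β*².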
In order to evaluate the "goodness" of these estimators
the following properties are investigated: (1) the first two
moments, (2) the consistency of the estimators, and (3) the
efficiency of the estimators.
The first moments of the estimators are given by

    𝓔(α*_j) = α_j : 1≤j≤m ,                                      (3.2.15)

    𝓔(σ*²) = σ² ,                                                (3.2.16)

and

    𝓔(β*²) = β² ,                                                (3.2.17)

so that the estimators are all unbiassed.
Letting Σ* denote the (m+2 × m+2) variance-covariance
matrix of the (m+2) dimensional vector of unbiassed estimators
(α*₁, α*₂, ..., α*_m, σ*², β*²), we find

    (Σ*)₁₁ = β⁻²(v−2)⁻¹ ,                                        (3.2.18)

    (Σ*)_{jj} = (v−j−1)⁻¹ : 2≤j≤m ,                               (3.2.19)

    (Σ*)_{m+1,m+1} = 2(mc₁)⁻¹σ⁴ ,                                (3.2.20)

    (Σ*)_{m+2,m+2} = 2(v+mc₁−2)[v(mc₁−4)]⁻¹β⁴ ,                  (3.2.21)

    (Σ*)_{m+1,m+2} = (Σ*)_{m+2,m+1} = −2(mc₁)⁻¹σ²β² ,            (3.2.22)

and

    (Σ*)_{ij} = 0 : elsewhere .                                  (3.2.23)
In Fisz [10] and Feller [9] it is shown that a statistic
t_v based on v observations, with mean θ and variance τ_v², will be
consistent for θ if

    lim_{v→∞} τ_v² = 0 .                                         (3.2.24)

In the equations (3.2.18) thru (3.2.21), if we let v→∞,
we find the variance of each estimator goes to zero. Hence
the estimators (α*₁, α*₂, ..., α*_m, σ*², β*²) are consistent.
The likelihood function, from which the estimators
are derived, is the density of W, given by

    L = f(W) = K_{m+1}(V,v) |W|^{½(v−m−2)} etr{−½V⁻¹W} .          (3.2.25)

Using this we may obtain the "information matrix", F,
whose elements are defined by

    (F)_{jℓ} = −𝓔(∂² log L / ∂θ_j ∂θ_ℓ) ,                        (3.2.26)

where θ_j and θ_ℓ are any two of the parameters α₁, ..., α_m,
σ², and β². The jth diagonal element of F⁻¹ gives the
minimum variance bound of any estimator of θ_j. Letting
(θ₁, θ₂, ..., θ_m, θ_{m+1}, θ_{m+2}) = (α₁, α₂, ..., α_m, σ², β²) we find

    (F)_{jj} = 𝓔(w_{j−1,j−1}/σ²) = σ⁻² v v_{j−1,j−1} : 1≤j≤m ,    (3.2.27)

with

    v_{j−1,j−1} = σ²(1 + α²_{j−1} + α²_{j−1}α²_{j−2} + ⋯ + α²_{j−1}α²_{j−2}⋯α²₂
                    + α²_{j−1}α²_{j−2}⋯α²₂α²₁β²) .                (3.2.28)
    (F)_{m+1,m+1} = −𝓔[½v(m+1)σ⁻⁴ − σ⁻⁴ trV⁻¹W]
                  = v(m+1)/(2σ⁴) ,                               (3.2.29)

    (F)_{m+2,m+2} = −𝓔[½vβ⁻⁴ − w₀₀σ⁻²β⁻⁶]
                  = v/(2β⁴) ,                                    (3.2.30)

    (F)_{m+1,m+2} = (F)_{m+2,m+1} = 𝓔[w₀₀/(2σ⁴β⁴)]
                  = v/(2σ²β²) ,                                  (3.2.31)

and

    (F)_{jℓ} = 0 : elsewhere .                                   (3.2.32)

To see that (F)_{jℓ} = 0 elsewhere we note that

    ∂² log L / ∂α_j ∂α_ℓ = 0 : 1≤j≤m; 1≤ℓ≤m; j≠ℓ ,               (3.2.33)

    ∂² log L / ∂α_j ∂θ_{m+1} = σ⁻⁴(α_j w_{j−1,j−1} − w_{j,j−1}) : 1≤j≤m ,   (3.2.34)

and

    ∂² log L / ∂α_j ∂θ_{m+2} = 0 : 1≤j≤m .                       (3.2.35)

Now

    𝓔[σ⁻⁴(α_j w_{j−1,j−1} − w_{j,j−1})] = σ⁻⁴ v(α_j v_{j−1,j−1} − v_{j,j−1})
                                        = 0 ,                    (3.2.36)

since v_{j,j−1} = α_j v_{j−1,j−1}. Hence (F)_{jℓ} is zero as was
shown.
With Σ = F⁻¹ we find the minimum variance bounds to be

    (Σ)₁₁ = v⁻¹β⁻² ,                                             (3.2.37)

    (Σ)_{jj} = (v v_{j−1,j−1} σ⁻²)⁻¹ : 2≤j≤m ,                    (3.2.38)

    (Σ)_{m+1,m+1} = 2(vm)⁻¹σ⁴ ,                                  (3.2.39)

    (Σ)_{m+2,m+2} = 2(m+1)(vm)⁻¹β⁴ ,                             (3.2.40)

    (Σ)_{m+1,m+2} = (Σ)_{m+2,m+1} = −2(vm)⁻¹σ²β² ,

and

    (Σ)_{jk} = 0 : elsewhere .                                   (3.2.41)
Since the efficiency of an estimator is the ratio of the
minimum variance bound to the variance of the estimator,
we have from equations (3.2.18) thru (3.2.23) and equations
(3.2.37) thru (3.2.41) that the efficiency of a starred
estimator, denoted by Eff(θ*), is

    Eff(α*₁) = (v−2)v⁻¹ ,                                        (3.2.42)

    Eff(α*_j) = (v−j−1)(v v_{j−1,j−1}σ⁻²)⁻¹ : 2≤j≤m ,             (3.2.43)

    Eff(σ*²) = c₁v⁻¹ ,                                           (3.2.44)

and

    Eff(β*²) = (m+1)v(mc₁−4)[vm(v+mc₁−2)]⁻¹ .                    (3.2.45)

The estimators α*₁, σ*², and β*² are asymptotically efficient
while the asymptotic efficiency of α*_j : 2≤j≤m is given by

    lim_{v→∞} Eff(α*_j) = σ² / v_{j−1,j−1} : 2≤j≤m .              (3.2.46)
Replacing v_{j−1,j−1} by the right hand side of equation
(3.2.28) gives

    lim_{v→∞} Eff(α*_j) = (1 + α²_{j−1} + α²_{j−1}α²_{j−2} + ⋯ + α²_{j−1}α²_{j−2}⋯α²₂
                             + α²_{j−1}α²_{j−2}⋯α²₂α²₁β²)⁻¹ : 2≤j≤m .   (3.2.47)

Hence the asymptotic efficiency of α*_j depends on the true
values of the previous α's and β². If the true value of
α_{j−1} is zero then α*_j is asymptotically efficient regardless
of the values of any of the previous α's. That is, the
asymptotic efficiency of α*_j is dependent most upon the true
value of α_{j−1}, the previous α, next upon α_{j−2}, and so forth
with the least dependence on α₁ and β². If we suppose that
all the parameters are less than unity in absolute value and
write

    a = max(|α₁|, |α₂|, ..., |α_{j−1}|, |β|) ,                    (3.2.48)

then we may arrive at the inequality

    (1 + a² + a⁴ + ⋯ + a^{2(j−2)} + a^{2j}) ≥ (1 + α²_{j−1} + α²_{j−1}α²_{j−2} + ⋯
        + α²_{j−1}α²_{j−2}⋯α²₂ + α²_{j−1}α²_{j−2}⋯α²₂α²₁β²) ,     (3.2.49)

where equality holds only when |α₁| = |α₂| = ⋯ =
|α_{j−1}| = |β| = a. The inequality reverses upon taking
reciprocals and we find the asymptotic efficiency of α*_j is
at least as great as

    min Eff(α*_j) = (1−a²)(1 − a^{2(j−1)} + a^{2j} − a^{2(j+1)})⁻¹ : 2≤j≤m .   (3.2.50)

Although {α*_j : 2≤j≤m} is inefficient, this loss in efficiency
is more than made up for by the fact that the distribution
of {(α*_j − α_j) : 2≤j≤m} contains no unknown parameters.
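The agreement between the limit (3.2.47) and the bound (3.2.50) at equal parameter values can be checked numerically. A small sketch (hypothetical helper names; parameters assumed less than unity in absolute value):

```python
def asymptotic_eff(alphas, beta):
    """Limiting efficiency (3.2.47); alphas = (alpha_1, ..., alpha_{j-1})."""
    total, prod = 1.0, 1.0
    for a in reversed(alphas[1:]):          # alpha_{j-1}, alpha_{j-2}, ..., alpha_2
        prod *= a * a
        total += prod
    total += prod * alphas[0] ** 2 * beta ** 2   # last term carries alpha_1^2 beta^2
    return 1.0 / total

def min_eff(a, j):
    """Lower bound (3.2.50), a = max(|alpha_1|, ..., |alpha_{j-1}|, |beta|)."""
    return (1 - a * a) / (1 - a ** (2 * (j - 1)) + a ** (2 * j) - a ** (2 * (j + 1)))
```

When every parameter equals a the two values coincide; otherwise min_eff is the smaller, as (3.2.49) requires.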
3.3 Tests of Hypotheses
Since the distributions of the estimators are known,
tests of hypotheses may be carried out with ease. We
tabulate here a few hypotheses of interest.
It is desired to test the hypothesis

    H₀: α_k = α_{k+1} = ⋯ = α_m = π₀   (k = 2,3,...,m)
    against                                                      (3.3.1)
    Hₐ: at least one of the equalities does not hold.

With

    t_j = (v−j+1)^{½}(g_{j,j−1} − π₀) : 2≤j≤m ,                   (3.3.2)

define

    p_j = P{|t_{v−j+1}| ≥ |t_j|} : 2≤j≤m .                        (3.3.3)

An appropriate test statistic for testing (3.3.1) is

    L = −2 Σ_{j=k}^m log_e(p_j) .                                 (3.3.4)

The quantity L has the chi-square distribution on 2(m−k+1)
degrees of freedom. The hypothesis is rejected at signifi-
cance level α if

    L > L_α ,                                                    (3.3.5)

where L_α is chosen so that

    P{χ²_{2(m−k+1)} > L_α} = α .                                  (3.3.6)
This procedure is called Fisher's method of combining
independent tests. It has been shown by Littell and Folks
[12] to be asymptotically optimal over other tests as judged
by Bahadur relative efficiency. The Bahadur relative
efficiency compares the rates at which the competing test
statistics' observed significance levels converge to zero,
in some sense, when the null hypothesis is false. The
interested reader is referred to Bahadur [3] and Littell
and Folks [12].
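As an illustrative numerical sketch (not from the dissertation; the function name and argument layout are assumptions), the combined test (3.3.2)-(3.3.6) can be coded directly with scipy:

```python
import numpy as np
from scipy import stats

def fisher_combined_test(g, v, pi0, k):
    """Fisher's method (3.3.4) for H0: alpha_k = ... = alpha_m = pi0 (sketch).

    g[j-2] holds g_{j,j-1} for j = 2..m, and v = n-1.
    Returns the statistic L and its chi-square p-value on 2(m-k+1) df.
    """
    m = len(g) + 1                                  # g covers j = 2..m
    L = 0.0
    for j in range(k, m + 1):
        tj = np.sqrt(v - j + 1) * (g[j - 2] - pi0)  # (3.3.2)
        pj = 2 * stats.t.sf(abs(tj), df=v - j + 1)  # (3.3.3): two-sided p_j
        L += -2.0 * np.log(pj)                      # (3.3.4)
    df = 2 * (m - k + 1)
    return L, stats.chi2.sf(L, df)                  # (3.3.5)-(3.3.6)
```

Each p_j comes from an exact t reference distribution, so the χ² combination is exact under H₀, not merely asymptotic.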
The above hypothesis has some interesting interpretations
for choices of π₀. If π₀ = 0, we are testing whether the
process is white noise from some point k on. In the case
where π₀ is a constant, not equal to zero, we are hypothe-
sizing that the time series is stationary.
Hypotheses concerning individual parameters can be
tested in the usual manner since the distributions of
the estimators are known.
An hypothesis of importance concerning a single
parameter would be

    H₀: β² = β₀²
    against                                                      (3.3.7)
    Hₐ: β² ≠ β₀² .

An appropriate test statistic is

    F₀ = mc₁β*² / [(mc₁−2)β₀²] ,                                  (3.3.8)

which has an F distribution on v and mc₁ degrees of
freedom, where c₁ is defined in equation (3.2.12) as
v − ½(m+1). The null hypothesis is rejected at the α level
of significance if

    F₀ > F^v_{mc₁,α} ,                                            (3.3.9)

where F^v_{mc₁,α} is chosen so that

    P{F^v_{mc₁} > F^v_{mc₁,α}} = α .                              (3.3.10)

Choosing β₀² = 1, the hypothesis implies homoscedasticity
between the initial observation and the errors of the "noise"
process.
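A sketch of the test (3.3.8)-(3.3.10) in terms of d₀ and d₁,...,d_m (hypothetical function name; the rewriting of F₀ as mc₁d₀/(vβ₀²Σd_j) follows from the definition (3.2.13) of β*²):

```python
from scipy import stats

def beta2_F_test(d0, d_rest, v, beta0_sq=1.0, alpha=0.05):
    """F test (3.3.8)-(3.3.10) of H0: beta^2 = beta0^2 (illustrative sketch).

    Under the model d0 ~ sigma^2 beta^2 chi^2_v and sum(d_rest) ~ sigma^2
    chi^2_{m c1}, so F0 = m*c1*d0/(v*beta0_sq*sum(d_rest)) ~ F(v, m*c1)
    under H0; this is (3.3.8) rewritten in terms of d0 and the d_j.
    """
    m = len(d_rest)
    c1 = v - 0.5 * (m + 1)                       # (3.2.12)
    F0 = m * c1 * d0 / (v * beta0_sq * sum(d_rest))
    F_crit = stats.f.isf(alpha, v, m * c1)       # (3.3.10): upper alpha point
    return F0, F0 > F_crit
```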
Chapter IV
THE MAXIMUM LIKELIHOOD ESTIMATORS
4.1 Introduction
In order to comment further on the value of the
estimators given in Chapter III some standard of comparison
must be employed. To this end we study the maximum
likelihood estimators. In this chapter we obtain the
maximum likelihood estimators and examine their sampling
properties. A comparison is then made between the maximum
likelihood estimators and the starred estimators of Chapter III.
4.2 The Maximum Likelihood Estimators and Their Distribution
As in section 2 of Chapter III, we suppose that we
observe a number of (m+1)-variate column vectors y_j : 1≤j≤n
(n>m+1), obtained by random sampling from an (m+1)-variate
normal population with mean μ and dispersion matrix V. As
in Chapter III, we estimate μ by ȳ and form the (m+1 × m+1)
matrix W by

    W = Σ_{j=1}^n (y_j − ȳ)(y_j − ȳ)' .                          (4.2.1)

W has the central Wishart distribution with v = n−1 degrees
of freedom and dispersion matrix V. W may also be represented
by

    W = Σ_{j=1}^v z_j z_j' ,                                     (4.2.2)

where z₁, z₂, ..., z_v (v = n−1) are mutually independent and

    z_j ~ N_{m+1}(0, V) : 1≤j≤v .                                (4.2.3)
To see this let the (m+1 × n) matrix Y be defined by

    Y = (y₁, y₂, ..., y_n)                                       (4.2.4)

and let B be any orthogonal (n × n) matrix with last column

    n^{−½}(1, 1, ..., 1)' .                                      (4.2.5)

Define the (m+1 × n) matrix Z = (z₁, z₂, ..., z_n) by

    Z = YB .                                                     (4.2.6)

We note that

    z_n = n^{½} ȳ .                                              (4.2.7)

Now W may be written as

    W = YY' − nȳȳ' ,                                             (4.2.8)

and since B is orthogonal (BB' = I) we may write

    YY' = (YB)(YB)' = ZZ' ,                                      (4.2.9)

and upon substituting YY' = ZZ' and z_n = n^{½}ȳ into
(4.2.8) we find

    W = ZZ' − z_n z_n' = Σ_{j=1}^{n−1} z_j z_j' .                 (4.2.10)

Hence we have the representation

    w_{ij} = Σ_{k=1}^n (y_{ik} − ȳ_i)(y_{jk} − ȳ_j)
           = Σ_{k=1}^v z_{ik} z_{jk} : 0≤i≤m; 0≤j≤m .            (4.2.11)

Henceforth we shall use W = ZZ', keeping in mind the
representation given in (4.2.11).
Assuming that Y(t) = μ(t) + X(t), where X(t) follows the
first order generalized autoregressive process, then V∈𝒱 and
has the properties given in section 2 of Chapter II. Since
it is through W that we obtain estimates, the elements of W
serve as observations and the likelihood of this set of
observations is L = f(W) given in (3.2.25). Taking the
logarithm of (3.2.25) and utilizing the form and properties
of V we obtain

    log L = C − ½v(m+1) log σ² − ½v log β²
            + ½(v−m−2) log|W| − ½ trV⁻¹W ,                       (4.2.12)

where C is a constant. Recalling that Λ = V⁻¹ has the
special form given by (2.2.11) we may write

    trV⁻¹W = trΛW
           = σ⁻²[β⁻²w₀₀ + Σ_{j=1}^m w_{jj}
                 + Σ_{j=1}^m (α_j²w_{j−1,j−1} − 2α_j w_{j,j−1})] .   (4.2.13)

Substituting (4.2.13) into (4.2.12) we find

    log L = C − ½v(m+1) log σ² − ½v log β² + ½(v−m−2) log|W|
            − (2σ²)⁻¹{β⁻²w₀₀ + Σ_{j=1}^m w_{jj}
                 + Σ_{j=1}^m (α_j²w_{j−1,j−1} − 2α_j w_{j,j−1})} .   (4.2.14)
Differentiation of (4.2.14) with respect to α_j yields

    ∂ log L / ∂α_j = −σ⁻²(α_j w_{j−1,j−1} − w_{j,j−1}) : 1≤j≤m ,   (4.2.15)

differentiation with respect to σ² yields

    ∂ log L / ∂σ² = −v(m+1)/(2σ²) + (2σ⁴)⁻¹{β⁻²w₀₀ + Σ_{j=1}^m w_{jj}
                      + Σ_{j=1}^m (α_j²w_{j−1,j−1} − 2α_j w_{j,j−1})} ,   (4.2.16)

and finally differentiation with respect to β² gives

    ∂ log L / ∂β² = −v/(2β²) + w₀₀/(2σ²β⁴) .                      (4.2.17)

Setting ∂logL/∂α_j, ∂logL/∂σ², and ∂logL/∂β² equal to
zero and solving we obtain the maximum likelihood estimators:

    α̂_j = w_{j,j−1} w⁻¹_{j−1,j−1} : 1≤j≤m ,                       (4.2.18)

    σ̂² = [v(m+1)]⁻¹[β̂⁻²w₀₀ + Σ_{j=1}^m (w_{jj} − w²_{j,j−1}w⁻¹_{j−1,j−1})] ,   (4.2.19)

and

    β̂² = (vσ̂²)⁻¹ w₀₀ .                                           (4.2.20)

Eliminating β̂² from equation (4.2.19) yields

    σ̂² = (vm)⁻¹ Σ_{j=1}^m (w_{jj} − w²_{j,j−1}w⁻¹_{j−1,j−1}) .     (4.2.21)
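The closed forms (4.2.18)-(4.2.21) translate directly into code. A minimal sketch (hypothetical function name; rows and columns of W indexed 0..m as in the text):

```python
import numpy as np

def mle_estimators(W, v):
    """Maximum likelihood estimators (4.2.18)-(4.2.21) from W (sketch).

    W is the (m+1 x m+1) scatter matrix on v = n-1 degrees of freedom.
    """
    m = W.shape[0] - 1
    alpha_hat = np.array([W[j, j - 1] / W[j - 1, j - 1]
                          for j in range(1, m + 1)])              # (4.2.18)
    # (4.2.21): sigma^2_hat = (vm)^{-1} sum_j (w_jj - w_{j,j-1}^2 / w_{j-1,j-1})
    eta = [W[j, j] - W[j, j - 1] ** 2 / W[j - 1, j - 1] for j in range(1, m + 1)]
    sigma2_hat = sum(eta) / (v * m)
    beta2_hat = W[0, 0] / (v * sigma2_hat)                        # (4.2.20)
    return alpha_hat, sigma2_hat, beta2_hat
```

Note that, unlike the starred estimators, α̂_j regresses row j on row j−1 of W directly rather than through the g-statistics.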
We now proceed to determine the distribution of the
maximum likelihood estimators. In order to do this we shall
use conditional arguments frequently. We shall write

    y | z ~ f(·)                                                 (4.2.22)

in order to imply that "y conditional on z has the density
f(·)." Using the representation of w_{ij} given in (4.2.11)
one has

    α̂_j = Σ_{k=1}^v z_{j−1,k} z_{jk} / Σ_{k=1}^v z²_{j−1,k} : 1≤j≤m .   (4.2.23)

Letting

    ℓ_{jk} = z_{jk} / Σ_{k=1}^v z²_{jk} : 0≤j≤m ; 1≤k≤v ,          (4.2.24)

we may write

    α̂_j = Σ_{k=1}^v ℓ_{j−1,k} z_{jk} : 1≤j≤m .                    (4.2.25)

Recalling that

    z_k ~ N_{m+1}(0, V) : 1≤k≤v                                  (4.2.26)

when V∈𝒱, we must be able to represent z_{jk} by

    z_{jk} = α_j z_{j−1,k} + ε_{jk} : 1≤j≤m ; 1≤k≤v ,             (4.2.27)

where the ε_{jk} are independent identically distributed normal
random variables with mean zero and variance σ².
Hence

    z_{jk} | z_{j−1,k} ~ N(α_j z_{j−1,k}, σ²) : 1≤j≤m ; 1≤k≤v .    (4.2.28)

Using (4.2.28) in (4.2.25),

    α̂_j | {z_{j−1,k} : 1≤k≤v} ~ N(α_j Σ_{k=1}^v ℓ_{j−1,k}z_{j−1,k}, σ² Σ_{k=1}^v ℓ²_{j−1,k}) .   (4.2.29)

Since

    Σ_{k=1}^v ℓ_{j−1,k} z_{j−1,k} = 1                             (4.2.30)

and

    Σ_{k=1}^v ℓ²_{j−1,k} = (Σ_{k=1}^v z²_{j−1,k})⁻¹ ,              (4.2.31)

we find upon their substitution into (4.2.29) that

    α̂_j | {z_{j−1,k} : 1≤k≤v} ~ N(α_j, σ²(Σ_{k=1}^v z²_{j−1,k})⁻¹) .   (4.2.32)
To complete the derivation we note the following.
Let

    u ~ N(μ, a²)                                                 (4.2.33)

and

    q ~ a²χ²_ν(0) ;                                              (4.2.34)

then Student's t-distribution is defined as

    t = (u − μ)(ν/q)^{½} .                                       (4.2.35)

We note that the distribution of t conditional on q is

    t | q ~ N(0, νa²q⁻¹) .                                       (4.2.36)

Hence, by analogy, the distribution of α̂_j is also t, and
we have

    (v v_{j−1,j−1} σ⁻²)^{½}(α̂_j − α_j) ~ t_v(0) : 1≤j≤m ,          (4.2.37)

where

    v_{j−1,j−1} = σ²(1 + α²_{j−1} + α²_{j−1}α²_{j−2} + ⋯
                    + α²_{j−1}α²_{j−2}⋯α²₂α²₁β²) : 1≤j≤m .        (4.2.38)
To find the distribution of σ̂² we define the (v×1)
column vectors δ_j = (δ_{j1}, δ_{j2}, ..., δ_{jv})' by

    δ_{jk} = (Σ_{k=1}^v z²_{jk})^{−½} z_{jk} : 1≤k≤v ; 0≤j≤m ,     (4.2.39)

and note that

    δ_j'δ_j = 1 : 0≤j≤m .                                        (4.2.40)

In terms of the (m+1 × v) matrix Z we have

    δ_j = {(Z)_{j·}(Z)'_{j·}}^{−½} (Z)'_{j·} ,                     (4.2.41)

where (Z)_{j·} denotes the jth row of Z.
For convenience we take

    η_j = w_{jj} − w²_{j,j−1} w⁻¹_{j−1,j−1} : 1≤j≤m                (4.2.42)

and write

    σ̂² = (vm)⁻¹ Σ_{j=1}^m η_j .                                   (4.2.43)

Using δ_j we may write

    η_j = (Z)_{j·}(I_v − δ_{j−1}δ'_{j−1})(Z)'_{j·} : 1≤j≤m ,       (4.2.44)

where I_v is the (v×v) identity matrix. Consider the
distribution of η_m conditional on {(Z)_{j·} : 0≤j≤m−1}. Now

    (Z)'_{m·} | {(Z)_{j·} : 0≤j≤m−1} ~ N_v(α_m(Z)'_{m−1,·}, σ²I_v) .   (4.2.45)

Conditional on {(Z)_{j·} : 0≤j≤m−1} the matrix (I_v − δ_{m−1}δ'_{m−1})
is symmetric and idempotent with rank (v−1), and the
quadratic form η_m follows the noncentral chi-square
distribution, that is,

    η_m | {(Z)_{j·} : 0≤j≤m−1} ~ σ²χ²_{v−1}(γ_m) ,                 (4.2.46)

where

    γ_m = (2σ²)⁻¹ α_m² (Z)_{m−1,·}(I_v − δ_{m−1}δ'_{m−1})(Z)'_{m−1,·} .   (4.2.47)

Upon replacing δ_{m−1} by its definition in (4.2.47) we find

    γ_m = 0 .                                                    (4.2.48)

Since none of {(Z)_{j·} : 0≤j≤m−1} enter into the distribution
of η_m, the unconditional distribution is the same; hence η_m
is independent of {(Z)_{j·} : 0≤j≤m−1} and

    η_m ~ σ²χ²_{v−1}(0) .                                        (4.2.49)
The distribution of η_{m−1} conditional on {(Z)_{j·} : 0≤j≤m−2}
can be obtained in exactly the same manner, and since it
depends only on {(Z)_{j·} : 0≤j≤m−2} it is independent of
η_m. By the same arguments as above, η_{m−1} can be shown to
be independent of {(Z)_{j·} : 0≤j≤m−2} and hence

    η_{m−1} ~ σ²χ²_{v−1}(0) .                                     (4.2.50)

In exactly the same manner we can show η_{k+1} is independent
of η_k, and hence so are η_{k+2}, η_{k+3}, ..., η_m since they are
independent of η_{k+1}. Again the distribution of η_k is free
of {(Z)_{j·} : 0≤j≤k−1} so that the unconditional distribution of
η_k is identical to that of η_m. In this way we argue that {η_j : 1≤j≤m}
are mutually independent and identically distributed with

    η_j ~ σ²χ²_{v−1}(0) : 1≤j≤m .                                 (4.2.51)

Since σ̂² = (vm)⁻¹ Σ_{j=1}^m η_j we have

    mvσ̂² ~ σ²χ²_{m(v−1)}(0) .                                    (4.2.52)
In the above argument we note that the variables
{η_m, η_{m−1}, ..., η_k} are independent of {(Z)_{j·} : 0≤j≤k−1}.
Hence {η_m, η_{m−1}, ..., η_1} are independent of (Z)_{0·},
and hence of

    w₀₀ = (Z)_{0·}(Z)'_{0·} .                                     (4.2.53)

Since

    z_{0k} ~ N(0, σ²β²) : 1≤k≤v ,                                 (4.2.54)

then

    w₀₀ ~ σ²β²χ²_v(0) .                                          (4.2.55)

Since β̂² = (vσ̂²)⁻¹w₀₀, we then have

    (v−1)β̂² / (vβ²) ~ F^v_{m(v−1)}(0) .                           (4.2.56)
4.3 Properties of the Maximum Likelihood Estimators
Since the distribution of the maximum likelihood
estimators is known, their properties are easily obtained.
We find using equations (4.2.37), (4.2.52), and (4.2.56)
that

    𝓔(α̂_j) = α_j : 1≤j≤m ,                                       (4.3.1)

    𝓔(σ̂²) = (v−1)v⁻¹σ² ,                                         (4.3.2)

and

    𝓔(β̂²) = mv[m(v−1)−2]⁻¹β² .                                   (4.3.3)

Hence the α̂_j are unbiassed estimators while σ̂² and β̂² are
biassed. Since unbiassedness is a desirable property we shall
use the unbiassed estimators of σ² and β² in calculating the
rest of the properties.
Letting Σ̂ denote the (m+2 × m+2) variance-covariance
matrix of the (m+2) dimensional vector of unbiassed
estimators (α̂₁, α̂₂, ..., α̂_m, v(v−1)⁻¹σ̂², [m(v−1)−2](mv)⁻¹β̂²)
we find

    (Σ̂)₁₁ = (v−2)⁻¹β⁻² ,                                         (4.3.4)

    (Σ̂)_{jj} = (v−2)⁻¹σ²v⁻¹_{j−1,j−1} : 2≤j≤m ,                   (4.3.5)

where v_{j−1,j−1} is defined in (4.2.38),

    (Σ̂)_{m+1,m+1} = 2[m(v−1)]⁻¹σ⁴ ,                               (4.3.6)

    (Σ̂)_{m+2,m+2} = 2[m(v−1)+v−2]{v[m(v−1)−4]}⁻¹β⁴ ,              (4.3.7)

    (Σ̂)_{m+1,m+2} = (Σ̂)_{m+2,m+1} = −2[m(v−1)]⁻¹σ²β² ,            (4.3.8)

and

    (Σ̂)_{jk} = 0 : elsewhere .                                    (4.3.9)
To see that the set of estimators {α̂_j : 1≤j≤m} are
independent of the set {σ̂², β̂²} we note that the distributions
of σ̂² and β̂² are free of the elements α̂_j, and hence the two
sets of estimators are independent. To see that the covari-
ances between the α̂'s are zero we note that

    α̂_j = Σ_{k=1}^v ℓ_{j−1,k} z_{j,k}
        = Σ_{k=1}^v ℓ_{j−1,k}(α_j z_{j−1,k} + ε_{jk})
        = α_j + Σ_{k=1}^v ℓ_{j−1,k} ε_{j,k} : 1≤j≤m ,             (4.3.10)

where ℓ_{j,k} is defined by equation (4.2.24) as

    ℓ_{j,k} = z_{j,k} / Σ_{k=1}^v z²_{j,k} .

Hence the covariance between α̂_j and α̂_ℓ (j≠ℓ) is given by

    (Σ̂)_{jℓ} = 𝓔(α̂_j α̂_ℓ) − 𝓔(α̂_j)𝓔(α̂_ℓ)
             = 𝓔[(α_j + Σ_{k=1}^v ℓ_{j−1,k}ε_{jk})(α_ℓ + Σ_{k=1}^v ℓ_{ℓ−1,k}ε_{ℓk})] − α_jα_ℓ
             = 𝓔[α_jα_ℓ + α_j Σ_{k=1}^v ℓ_{ℓ−1,k}ε_{ℓk} + α_ℓ Σ_{k=1}^v ℓ_{j−1,k}ε_{jk}
                 + Σ_{i=1}^v Σ_{k=1}^v ℓ_{j−1,i}ε_{ji} ℓ_{ℓ−1,k}ε_{ℓk}] − α_jα_ℓ .   (4.3.11)

Since the {ε_{st} : 1≤s≤m, 1≤t≤v} are independent identically
distributed normal random variables with zero mean, we have,
taking expectations first with respect to the ε_{st}'s, that the
last three terms in brackets in (4.3.11) vanish and we are left
with

    (Σ̂)_{jℓ} = α_jα_ℓ − α_jα_ℓ
             = 0 : 1≤j≠ℓ≤m .                                      (4.3.12)
Although this shows the estimators are uncorrelated, it
is not true that they are independent. To see this we
examine the case for m=2. Writing W = GDG' we have

    w₀₀ = d₀ ,   w₁₀ = d₀g₁₀ ,   w₂₀ = d₀g₂₀ ,
    w₁₁ = d₁ + d₀g₁₀² ,   w₂₁ = d₁g₂₁ + d₀g₁₀g₂₀ ,
    w₂₂ = d₂ + d₁g₂₁² + d₀g₂₀² ,                                 (4.3.13)

so that

    α̂₁ = w₁₀/w₀₀ = g₁₀                                           (4.3.14)

and

    α̂₂ = w₂₁/w₁₁ = (d₀g₁₀g₂₀ + d₁g₂₁)/(d₀g₁₀² + d₁) .            (4.3.15)

We shall show that

    α̂₂ | (g₁₀, d₀, d₁) ~ N(α₂, σ²/(d₁ + d₀g₁₀²)) ;               (4.3.16)

by equation (4.3.14) this is equivalent to

    α̂₂ | (α̂₁, d₀, d₁) ~ N(α₂, σ²/(d₁ + d₀α̂₁²)) .                 (4.3.17)
Since the conditional distribution of α̂₂ depends on α̂₁, we
have that they are dependent. To show (4.3.16) we need the
distribution of (g₁₀, g₂₀, g₂₁) conditional on (d₀, d₁, d₂).
Referring back to (2.5.21) we have

    (g₁₀, g₂₀)' | (d₀, d₁) ~ N₂(μ₀, V₀) ,                         (4.3.18)

where

    μ₀ = (α₁, α₁α₂)'                                             (4.3.19)

and

    V₀ = (σ²/d₀) [ 1    α₂
                   α₂   1+α₂² ] ,                                (4.3.20)

and

    g₂₁ | (d₀, d₁) ~ N(α₂, σ²/d₁) ,                               (4.3.21)

independent of (g₁₀, g₂₀). Now the distribution of g₂₀
conditional on (g₁₀, d₀, d₁) is easily shown to be

    g₂₀ | (g₁₀, d₀, d₁) ~ N(α₂g₁₀, σ²/d₀) .                       (4.3.22)

Since g₂₁ is independent of g₁₀, conditioning on g₁₀ does
not affect the distribution of g₂₁, that is,

    g₂₁ | (g₁₀, d₀, d₁) ~ N(α₂, σ²/d₁) .

We note, by equation (4.3.15), that α̂₂ conditional on (g₁₀,
d₀, d₁) is simply a linear combination of g₂₀ and g₂₁. Since
they are normally distributed (conditionally), so will be α̂₂
(conditionally). All we need do is calculate the mean and
variance to find the conditional distribution of α̂₂.
    𝓔{α̂₂ | (g₁₀,d₀,d₁)} = 𝓔{(d₀g₁₀g₂₀ + d₁g₂₁)/(d₀g₁₀² + d₁) | (g₁₀,d₀,d₁)}
                        = (d₀g₁₀ · α₂g₁₀ + d₁α₂)/(d₀g₁₀² + d₁)
                        = α₂ ,                                    (4.3.23)

and

    Var{α̂₂ | (g₁₀,d₀,d₁)} = Var{(d₀g₁₀g₂₀ + d₁g₂₁)/(d₀g₁₀² + d₁) | (g₁₀,d₀,d₁)} ;

recalling that g₂₀ and g₂₁ are independent we have

    Var{α̂₂ | (g₁₀,d₀,d₁)}
        = [d₀²g₁₀² Var{g₂₀ | (g₁₀,d₀,d₁)} + d₁² Var{g₂₁ | (g₁₀,d₀,d₁)}] / (d₀g₁₀² + d₁)²
        = (d₀g₁₀²σ² + d₁σ²)/(d₀g₁₀² + d₁)²
        = σ²/(d₀g₁₀² + d₁) .                                      (4.3.24)

Hence

    α̂₂ | (g₁₀,d₀,d₁) ~ N(α₂, σ²/(d₀g₁₀² + d₁)) ,

as was to be shown.
Furthermore it can be shown that α̂₁ and α̂₂ do not have
a bivariate t-distribution. To see this we find
f(α̂₂ | α̂₁ = 0). We have that

    α̂₂ | (α̂₁, d₀, d₁) ~ N(α₂, σ²/(d₁ + d₀α̂₁²)) ,

and hence

    α̂₂ | (α̂₁=0, d₀, d₁) ~ N(α₂, σ²/d₁) .                         (4.3.25)

Since this distribution does not depend on d₀, and since
d₀ and d₁ are independent, we have

    α̂₂ | (α̂₁=0, d₁) ~ N(α₂, σ²/d₁) ,                             (4.3.26)

and from (2.5.5)

    d₁ ~ σ²χ²_{v−1}(0) ,                                         (4.3.27)

so that

    f(α̂₂, d₁ | α̂₁=0) = [d₁^{½}/(2πσ²)^{½}] e^{−(d₁/2σ²)(α̂₂−α₂)²}
        × d₁^{½(v−1)−1} e^{−d₁/2σ²} / [(2σ²)^{½(v−1)} Γ(½(v−1))]
        : −∞<α̂₂<∞ ; d₁>0 ; (v−1)>0 .                             (4.3.28)
Integrating over d₁, we have

    f(α̂₂ | α̂₁=0) = ∫₀^∞ d₁^{½v−1} e^{−(d₁/2σ²)[1+(α̂₂−α₂)²]} d(d₁)
                      / [(2σ²)^{½v} π^{½} Γ(½(v−1))]
                  = Γ(½v) [1 + (α̂₂−α₂)²]^{−½v} / [π^{½} Γ(½(v−1))]
        : −∞<α̂₂<∞ ; v>0 .                                        (4.3.29)

Hence we find α̂₂ conditional on α̂₁ = 0 has a t-distribution
with (v−1) degrees of freedom, but we know α̂₂ has the
t-distribution with v degrees of freedom; this will yield a
contradiction showing that α̂₁ and α̂₂ cannot have a bivariate
t-distribution.
Now suppose t₁ and t₂ have a bivariate t-distribution
with v degrees of freedom. Their joint density is given by

    f(t₁,t₂) = Γ(½(v+2)) |Σ|^{−½} [1 + (t−μ)'Σ⁻¹(t−μ)/v]^{−½(v+2)} / [Γ(½v) vπ]
        : −∞<t_i<∞, i=1,2 ; v>0 ,                                (4.3.30)

where t = (t₁,t₂)', μ = (μ₁,μ₂)', and μ₁ and μ₂ are the expected
values of t₁ and t₂, respectively. Also Σ is the (2×2)
variance-covariance matrix of (t₁,t₂). Relating this to α̂₁
and α̂₂ we would replace (μ₁,μ₂) by (α₁,α₂) and Σ would be a
diagonal matrix since we have shown the covariance of α̂₁ and
α̂₂ to be zero. Without any loss of generality we may take
μ₁ = μ₂ = 0 and Σ = I. The marginal density of t₁ is

    f(t₁) = Γ(½(v+1)) [1 + t₁²/v]^{−½(v+1)} / [(vπ)^{½} Γ(½v)]
        : −∞<t₁<∞ ; v>0 .                                        (4.3.31)

Hence the conditional density of t₂ given t₁ is

    f(t₂|t₁) = f(t₁,t₂)/f(t₁)
             = Γ(½(v+2)) [1 + (t₁²+t₂²)/v]^{−½(v+2)} [1 + t₁²/v]^{½(v+1)}
               / [(vπ)^{½} Γ(½(v+1))]
        : −∞<t₂<∞ ; v>0 .                                        (4.3.32)

In particular, suppose t₁ = 0; then

    f(t₂|t₁=0) = Γ(½(v+2)) [1 + t₂²/v]^{−½(v+2)} / [(vπ)^{½} Γ(½(v+1))]
        : −∞<t₂<∞ ; v>0 .                                        (4.3.33)

That is, t₂ conditional on t₁ = 0 has the Student t-distribution
with (v+1) degrees of freedom. Hence we see that if (α̂₁,α̂₂)
are bivariate t-distributed, then since α̂₂ conditional on
α̂₁ = 0 has the Student t-distribution with (v−1) degrees of
freedom it must be that α̂₂ has the Student t-distribution with
(v−2) degrees of freedom, but this is a contradiction since
α̂₂ has the Student t-distribution with v degrees of freedom.
Hence (α̂₁,α̂₂) do not have a bivariate t-distribution.
Hence we see that the maximum likelihood estimators are
not independently distributed and their joint distribution
is not multivariate t. This of course is a drawback in using
the maximum likelihood estimators and accentuates the benefits
of using the starred estimators, which are independent and have
the t-distribution.
It is easily seen that the unbiassed estimators are
consistent since their variances tend to zero as the sample
size increases without bound. To find the efficiency of the
maximum likelihood estimators we compare their variances to
the minimum variance bounds given in equations (3.2.37) through
(3.2.41). We find that

    Eff(α̂_j) = (v−2)v⁻¹ : 1≤j≤m ,                                 (4.3.34)

    Eff(v(v−1)⁻¹σ̂²) = (v−1)v⁻¹ ,                                  (4.3.35)

and

    Eff([m(v−1)−2](mv)⁻¹β̂²) = [(m+1)(m(v−1)−4)][m(m(v−1)+v−2)]⁻¹ .   (4.3.36)

It is obvious that the estimators are asymptotically efficient.
Unlike the starred estimators, the efficiency of the maximum
likelihood estimators does not depend on the unknown parameters,
but the distribution of the maximum likelihood estimators
{α̂_j : 1≤j≤m} depends on the unknown parameters, so that a test
of hypothesis on α_j depends on knowing the values of α₁, α₂, ...,
α_{j−1}. This clearly shows the tradeoff between the starred
estimators and the maximum likelihood estimators. While the
starred estimators are inefficient, tests of hypotheses are
performed with no difficulty; conversely, the maximum
likelihood estimators are efficient (asymptotically), but
tests of hypotheses are complicated since their distribution
depends on several unknown parameters. Moreover the
dependence between the α̂'s also causes complications in making
tests of hypotheses concerning two or more of the parameters
since the joint distribution may be very complex.
Chapter V
A TEST OF THE ADEQUACY OF THE MODEL
5.1 Introduction
Throughout we have assumed that the process is adequately
described by the first order autoregressive model. In this
chapter we propose a method of testing the validity of this
assumption. Due to the assumption of the first order auto-
regressive process a class of dispersion matrices arose which
we identified by 𝒱. Since this class of dispersion matrices
is a consequence of the model, a test to validate the model
is equivalent to a test of H₀: V∈𝒱 against H₁: V is an
arbitrary positive definite matrix.
In order to arrive at a test statistic for testing this
hypothesis we recall that if V∈𝒱 then V = AUA', where A and U
were defined in equations (2.2.4) and (2.2.7). In particular,
U was given as the (m+1)×(m+1) diagonal matrix

    U = diag(σ²β², σ², σ², ..., σ²) .                            (5.1.1)

We also showed that

    d_j ~ σ²χ²_{v−j}(0) : 1≤j≤m ,                                (5.1.2)

and with v large compared to m each of the d_j's should be
nearly the same. Ignoring the first row and column of U we
have that the remaining diagonal elements of U are σ², and
{d.: lsjan} are independent estimators of this quantity. If
H0 is true then all of the d 's should be equal. Another
way of putting this is that the arithmetic mean of dl,d2,
...,dm is equal to the geometric mean, that is,
m
H d,
i=l 1
1 m m
Sdj
m d.
= m ) 1, (VEV) (5.1.3)
m Z d.
j=1
If H0 is false then the dj's will not be equal and the
arithmetic mean will be larger than the geometric mean and
Ai will be less than one. Hence we see that we reject HF for
small values of A1. The asymptotic distribution of A1 will
be investigated in section 2. This test has an interesting
geometrical interpretation. If we consider the
{d.: lsjsm} to be the squared lengths of the principal axes of
an m dimensional ellipsoid, the above hypothesis specifies
that these are all equal, that is, that the ellipsoid is a
sphere. Hence this test is the sphericity test on the
{d.:l sjnm}.
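In code the statistic is essentially one line (hypothetical helper name; the input is the vector d₁,...,d_m):

```python
import numpy as np

def lambda1(d):
    """Sphericity statistic (5.1.3); equals 1 iff all the d_j are equal."""
    d = np.asarray(d, dtype=float)
    return d.prod() / d.mean() ** len(d)
```

By the arithmetic-geometric mean inequality the value never exceeds one, and small values indicate a departure from the model.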
Besides the sphericity test we consider another test,
independent of Λ₁, based on {g_ij : 0≤j<i≤m}. The statistics
{g_ij : 0≤j<i≤m} contain information about the autoregressive
process since they are used as estimators of {α_j : 1≤j≤m} and
hence of V. In section 3 we shall investigate the distribution
of a function of these statistics under the hypothesis that
V∈𝒱. In section 4 we discuss combining the two tests given
in sections 2 and 3 and the asymptotic equivalence of the
combined tests as compared to the likelihood ratio.
5.2 An Approximation to the Distribution of −ρ log Λ₁
Referring to equation (5.1.3) we see that Λ₁ may be
written as the product of u₁, u₂, ..., u_m where

    u_i = d_i / (m⁻¹ Σ_{j=1}^m d_j) : 1≤i≤m ,                     (5.2.1)

that is,

    Λ₁ = Π_{i=1}^m u_i .                                         (5.2.2)
Rather than consider the distribution of Λ₁ we shall
consider the distribution of

    η = −ρ log Λ₁ : 0≤η<∞ ,                                      (5.2.3)

where ρ is some constant. The moment generating function
of η is

    φ_η(θ) = 𝓔(e^{θη}) = 𝓔(Π_{i=1}^m u_i^{−θρ}) .                 (5.2.4)

In order to find this expectation we need to find the joint
distribution of (u₁, u₂, ..., u_m). To find the joint distribution
we shall transform from (d₁, d₂, ..., d_m) into (u₁, u₂, ..., u_m, S),
where u_i is defined by (5.2.1) and S = m⁻¹ Σ_{j=1}^m d_j. Hence
we seek the Jacobian of the transformation from (d₁, d₂, ...,
d_m) to (u₁, u₂, ..., u_m, S) defined by

    d_i = u_i S : 1≤i≤m ,                                        (5.2.5)

with

    Σ_{i=1}^m u_i = m .                                          (5.2.6)
Let [δu_i] and [δS] denote small changes in u_i and S,
respectively. Suppose that the changes [δu_i] in u_i and
[δS] in S bring about a change [δd_i] in d_i so that (5.2.5)
and (5.2.6) are preserved. That is,

    d_i + [δd_i] = (u_i + [δu_i])(S + [δS]) : 1≤i≤m ,             (5.2.7)

and

    Σ_{i=1}^m (u_i + [δu_i]) = m .                               (5.2.8)

Expanding the above equations and dropping terms of second
order in the [δe] [e∈(d_i,S)], we find that

    d_i + [δd_i] = u_iS + [δu_i]S + u_i[δS] : 1≤i≤m ,             (5.2.9)

and

    Σ_{i=1}^m u_i + Σ_{i=1}^m [δu_i] = m .                        (5.2.10)

Since d_i = u_iS and Σ_{i=1}^m u_i = m, we see that

    [δd_i] = [δu_i]S + u_i[δS] : 1≤i≤m ,                          (5.2.11)

and

    Σ_{i=1}^m [δu_i] = 0 .                                       (5.2.12)
To write the above in vector notation we define the (m×1)
column vectors

    [δd] = ([δd₁], [δd₂], ..., [δd_m])' ,                         (5.2.13)
    d = (d₁, d₂, ..., d_m)' ,                                    (5.2.14)
    [δu] = ([δu₁], [δu₂], ..., [δu_m])' ,                         (5.2.15)
    u = (u₁, u₂, ..., u_m)' ,                                    (5.2.16)

and

    1_m = (1, 1, ..., 1)' .                                      (5.2.17)

Equations (5.2.11) and (5.2.12) may now be written

    [δd] = [δu]S + u[δS]                                         (5.2.18)

and

    1_m'[δu] = 0 .                                               (5.2.19)

Equation (5.2.11) can be thought of as a singular
transformation from {[δd₁], [δd₂], ..., [δd_m]} to {[δu₁], [δu₂],
..., [δu_m], [δS]} made one-to-one through use of equation
(5.2.12). Saw [17] has shown that

    J{d → u, S} = J{[δd] → [δu], [δS]} ,                          (5.2.20)

where J{[δd] → [δu], [δS]} is the Jacobian of the transformation
defined by (5.2.18), in which u and S are considered fixed.
Choose P to be an orthogonal m×m matrix with first
row m^{−½}1_m' and premultiply equation (5.2.18) by it. This
gives

    y = P[δd] = P[δu]S + Pu[δS] ,                                (5.2.21)

where the first element of P[δu] is zero since 1_m'[δu] = 0.
From equations (5.2.20) and (5.2.21) we have

    J{d → u, S} = J{[δd] → y} J{y → y₂, ..., y_m, [δS]} .         (5.2.22)

The Jacobian J{[δd] → y} is unity since P is an orthogonal
matrix. The Jacobian J{y → y₂, ..., y_m, [δS]} is the modulus
of the determinant of the (m×m) matrix K with elements

    (K)₁₁ = ∂y₁/∂[δS] = m^{½} ,
    (K)_{jj} = ∂y_j/∂y_j = S : 2≤j≤m ,
    (K)_{kj} = 0 : k<j ,                                          (5.2.23)

so that K is triangular. Hence

    J{y → y₂, ..., y_m, [δS]} = |K| = m^{½} S^{m−1} ,              (5.2.24)

and finally, apart from the constant factor m^{½} (which is
absorbed in the normalization below),

    J{d → u, S} = S^{m−1} .                                       (5.2.25)
Since the d_j's are independent with

    d_j ~ σ²χ²_{v−j}(0) : 1≤j≤m ,                                (5.2.26)

then

    f(d₁, d₂, ..., d_m) = Π_{j=1}^m d_j^{½v_j−1} e^{−d_j/2σ²} / [(2σ²)^{½v_j} Γ(½v_j)]
        : d_j>0 ; 1≤j≤m ,                                        (5.2.27)

where v_j = v−j : 1≤j≤m. Hence we find the joint distribution of
u = (u₁, u₂, ..., u_m)' and S is

    f(u₁, u₂, ..., u_m, S) ∝ [Π_{j=1}^m u_j^{½v_j−1} / Γ(½v_j)]
        × S^{½Σ_j v_j − 1} e^{−mS/2σ²}
        : 0≤u_j ; Σ_{j=1}^m u_j = m ; S>0 .                       (5.2.28)

Hence we find u and S are independently distributed with

    mS ~ σ²χ²_{Σ_{j=1}^m v_j}(0) ,                                (5.2.29)

and u is distributed as mZ, where Z has the Dirichlet distri-
bution with (m×1) dimensional parameter vector (½v₁, ½v₂, ...,
½v_m)' = (½(v−1), ½(v−2), ..., ½(v−m))'.
If (y₁, y₂, ..., y_n) has the Dirichlet distribution with
parameters (a₁, a₂, ..., a_n), then the moments about the
origin are given by

    μ'_{r₁,r₂,...,r_n}(y₁, y₂, ..., y_n)
        = {Π_{j=1}^n Γ(a_j+r_j)} Γ(Σ_{j=1}^n a_j)
          / [Γ(Σ_{j=1}^n (a_j+r_j)) {Π_{j=1}^n Γ(a_j)}] .          (5.2.30)

Hence we find the moment generating function of η is given
by

    φ_η(θ) = 𝓔(e^{θη}) = 𝓔(Π_{i=1}^m u_i^{−θρ}) ,

and letting u_i = mZ_i gives

    φ_η(θ) = 𝓔(Π_{i=1}^m (mZ_i)^{−θρ}) = m^{−mθρ} 𝓔(Π_{i=1}^m Z_i^{−θρ}) ,

and since the Z_i : 1≤i≤m are Dirichlet we have from (5.2.30)
that

    φ_η(θ) = m^{−mθρ} {Π_{j=1}^m Γ(½v_j − θρ)} Γ(½ Σ_{j=1}^m v_j)
        / [Γ(Σ_{j=1}^m (½v_j − θρ)) {Π_{j=1}^m Γ(½v_j)}] .         (5.2.31)
Since the moments of η are functions of gamma functions,
we can apply Box's [4] method to obtain an approximation to
the distribution of η. A good discussion of Box's method is
also contained in Anderson [1].
Using equation (5.2.31) the cumulant generating function
for η is

    Ψ_η(θ) = log φ_η(θ)
           = k − mθρ log m + Σ_{j=1}^m log Γ(½(v−j) − θρ)
             − log Γ(½[mv − ½m(m+1)] − mθρ) ,                     (5.2.32)

where k has a value independent of θ. Rewriting this as

    Ψ_η(θ) = k − mρθ log m + Σ_{j=1}^m log Γ(a_j + ½ρ(1−2θ))
             − log Γ(β + ½mρ(1−2θ)) ,                             (5.2.33)

where

    a_j = ½(v−ρ−j) : 1≤j≤m ,
and                                                              (5.2.34)
    β = ½[mv − mρ − ½m(m+1)] .
We use the expansion formula

    log Γ(x+h) = log √(2π) + (x+h−½) log x − x
                 − Σ_{r=1}^∞ (−1)^r B_{r+1}(h) / [r(r+1)x^r] ,     (5.2.35)

where B_s(h) is the sth Bernoulli polynomial defined by

    τ e^{hτ} / (e^τ − 1) = Σ_{s=0}^∞ (τ^s/s!) B_s(h) ;             (5.2.36)

for example,

    B₁(h) = h − ½ ;   B₂(h) = h² − h + 1/6 ;
    B₃(h) = h³ − (3/2)h² + ½h .
Using the expansion formula (5.2.35) on (5.2.33) we find
that

    Ψ_η(θ) = k* − ½(m−1) log(1−2θ) + Σ_{r=1}^∞ ω_r (1−2θ)^{−r} ,    (5.2.37)

where k* collects all terms free of θ, with

    ω_r = (−1)^{r+1} {Σ_{j=1}^m B_{r+1}(a_j) − m^{−r} B_{r+1}(β)}
          / [r(r+1)(½ρ)^r] .                                      (5.2.38)

By virtue of the fact that Ψ_η(0) = 0 we must have

    k* = −Σ_{r=1}^∞ ω_r ,                                         (5.2.39)

so that we may write

    Ψ_η(θ) = −½(m−1) log(1−2θ) + Σ_{r=1}^∞ ω_r [(1−2θ)^{−r} − 1] .  (5.2.40)

If η had the chi-square distribution on e degrees of freedom,
then its cumulant generating function would be

    Ψ(θ) = −½e log(1−2θ) ;                                        (5.2.41)

we see that equation (5.2.40) has the same form with
e = (m−1) degrees of freedom and an additional sum which may
be called the remainder. This remainder may be reduced by
choosing ρ = ρ₀ so that ω₁ = 0, and the approximation is improved.
For ω₁ = 0 we must have

    B₂(β) = m Σ_{j=1}^m B₂(a_j) ,                                 (5.2.42)

or

    β² − β + 1/6 = m Σ_{j=1}^m (a_j² − a_j + 1/6) .                (5.2.43)

Recall that

    β = ½[mv − mρ − ½m(m+1)]
and
    a_j = ½(v−ρ−j) : 1≤j≤m ;

letting δ = ½(v−ρ), then

    β = mδ − ¼m(m+1)                                             (5.2.44)

and

    a_j = δ − ½j : 1≤j≤m .                                        (5.2.45)

Substituting (5.2.44) and (5.2.45) into equation (5.2.43)
gives

    [mδ − ¼m(m+1)]² − [mδ − ¼m(m+1)] + 1/6
        = m Σ_{j=1}^m [(δ − ½j)² − (δ − ½j) + 1/6] .              (5.2.46)

Expanding the left hand side of equation (5.2.46) one has

    m²δ² − ½m²(m+1)δ + (1/16)m²(m+1)² − mδ + ¼m(m+1) + 1/6 ,      (5.2.47)

and expanding and summing the right hand side of equation
(5.2.46) one has

    m²δ² − ½m²(m+1)δ + (1/24)m²(m+1)(2m+1) − m²δ + ¼m²(m+1) + m²/6 .   (5.2.48)

Collecting like terms we find

    δ = (m+1)(m² + 12m + 8) / (48m) ,                             (5.2.49)

hence

    ρ₀ = v − (m+1)(m² + 12m + 8) / (24m) .                        (5.2.50)
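The value (5.2.49) can be verified by exact rational arithmetic: after the substitutions (5.2.44)-(5.2.45) the condition (5.2.42) is linear in δ, because the δ² terms on the two sides of (5.2.46) cancel. A sketch (hypothetical helper names):

```python
from fractions import Fraction

def bernoulli2(h):
    """Second Bernoulli polynomial B_2(h) = h^2 - h + 1/6 (cf. (5.2.36))."""
    return h * h - h + Fraction(1, 6)

def delta_from_condition(m):
    """Solve B_2(beta) = m * sum_j B_2(a_j) for delta, using (5.2.44)-(5.2.45).

    Both sides are quadratics in delta with equal delta^2 coefficients
    (m^2 each), so their difference is linear in delta; two evaluations
    of the difference determine the root exactly.
    """
    def diff(delta):
        beta = m * delta - Fraction(m * (m + 1), 4)       # (5.2.44)
        rhs = m * sum(bernoulli2(delta - Fraction(j, 2))  # (5.2.45)
                      for j in range(1, m + 1))
        return bernoulli2(beta) - rhs
    f0, f1 = diff(Fraction(0)), diff(Fraction(1))
    return -f0 / (f1 - f0)      # root of the line through (0, f0) and (1, f1)
```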
We find then that

    φ_η(θ)|_{ρ=ρ₀} = (1−2θ)^{−½(m−1)} exp{Σ_{r=2}^∞ ω_r [(1−2θ)^{−r} − 1]} ,   (5.2.51)

where

    ω₂|_{ρ=ρ₀} = (m+1)[3m⁶ − 36m⁵ − 583m⁴ − 336m³ + 160m² − 192]
                 / (6912 ρ₀² m²) .                                (5.2.52)

Thus the cumulative distribution function of η = −ρ₀ log Λ₁ is
found from

    Pr{−ρ₀ log Λ₁ ≤ x} = Pr{χ²_{m−1} ≤ x}
        + ω₂ (Pr{χ²_{m+3} ≤ x} − Pr{χ²_{m−1} ≤ x}) + R'(ρ₀) ,      (5.2.53)

with R'(ρ₀) a remainder involving terms in ρ₀⁻³, ρ₀⁻⁴, ....
Asymptotically we have that the distribution of

    −ρ₀ log Λ₁ = −ρ₀ log [Π_{j=1}^m d_j / (m⁻¹ Σ_{j=1}^m d_j)^m]    (5.2.54)

tends to that of a chi-square variate with (m−1) degrees of
freedom.
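The asymptotic result can be used directly for an approximate test. A sketch (hypothetical function name) that keeps only the leading chi-square term of (5.2.53), omitting the ω₂ correction and the higher-order remainder:

```python
import numpy as np
from scipy import stats

def sphericity_pvalue(d, v):
    """Leading-term p-value for the sphericity test of section 5.2 (sketch).

    d holds (d_1, ..., d_m); uses eta = -rho0*log(Lambda_1) with rho0 from
    (5.2.50) and the chi-square(m-1) leading term of (5.2.53).
    """
    d = np.asarray(d, dtype=float)
    m = len(d)
    lam1 = d.prod() / d.mean() ** m                          # (5.1.3)
    rho0 = v - (m + 1) * (m * m + 12 * m + 8) / (24.0 * m)   # (5.2.50)
    eta = -rho0 * np.log(lam1)
    return stats.chi2.sf(eta, m - 1)                         # small p => reject H0
```

For v large compared to m, ρ₀ ≈ v and the approximation is close; spread-out d_j drive the p-value toward zero.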
5.3 The Distribution of T, a Function of {g_ij : 0≤j<i≤m}
In section 5 of Chapter II we derived the distribution
of {g_ij : 0≤j<i≤m} conditional on {d_j : 0≤j≤m−1} when V is
arbitrary and when V∈𝒱. In particular, when V∈𝒱 and m=4
we find from equation (2.5.15)

    f(g₁₀,g₂₀,g₃₀,g₄₀,g₂₁,g₃₁,g₄₁,g₃₂,g₄₂,g₄₃ | d₀,d₁,d₂,d₃)

    = (2πσ²d₀⁻¹)⁻² exp{−(d₀/2σ²)[(g₁₀−α₁)² + (g₂₀−α₂g₁₀)²
          + (g₃₀−α₃g₂₀)² + (g₄₀−α₄g₃₀)²]}

    × (2πσ²d₁⁻¹)^{−3/2} exp{−(d₁/2σ²)[(g₂₁−α₂)² + (g₃₁−α₃g₂₁)²
          + (g₄₁−α₄g₃₁)²]}

    × (2πσ²d₂⁻¹)⁻¹ exp{−(d₂/2σ²)[(g₃₂−α₃)² + (g₄₂−α₄g₃₂)²]}

    × (2πσ²d₃⁻¹)^{−½} exp{−(d₃/2σ²)(g₄₃−α₄)²}
        : −∞<g_ij<∞ ; 0≤j<i≤4 .                                  (5.3.1)
If in (5.3.1) we replace (α₁,α₂,α₃,α₄) by their estimators
(g₁₀,g₂₁,g₃₂,g₄₃) we obtain the statistics {(g₂₀−g₂₁g₁₀),
(g₃₀−g₃₂g₂₀), (g₄₀−g₄₃g₃₀), (g₃₁−g₃₂g₂₁), (g₄₁−g₄₃g₃₁),
(g₄₂−g₄₃g₃₂)}. These statistics indicate a departure from
the model. We shall investigate the distribution of a function
of these statistics. No compact expressions have been found
for general m, so we present here the case for m=4. The
case for general m follows directly from the case m=4. In
what follows we shall use conditional distributions and for
convenience shall let ℋ denote a set of fixed variables.
Consider first the statistics {(g₄₀−g₄₃g₃₀), (g₄₁−g₄₃g₃₁),
(g₄₂−g₄₃g₃₂)}. We shall determine their joint distribution
conditional on ℋ = {d₀,d₁,d₂,d₃,d₄,g₃₀,g₃₁,g₃₂}. We have
from (5.3.1) that g₄₀, g₄₁, g₄₂, and g₄₃ are normally
distributed conditional on ℋ, so we need to find the moments
of {(g₄₀−g₄₃g₃₀), (g₄₁−g₄₃g₃₁), (g₄₂−g₄₃g₃₂)} conditional
on ℋ to determine their joint distribution.
We have that

E[(g40−g43g30) | H] = E{ E[(g40−g43g30)] | H } ,                 (5.3.2)

where the inner expectation is with respect to g40 and the outer
expectation is with respect to g43. From (5.3.1) we obtain E(g40 | H)
by inspection to be a4g30, hence

E[(g40−g43g30) | H] = E[(a4−g43)g30 | H]                         (5.3.3)
                    = 0 ,                                        (5.3.4)

since by inspection of (5.3.1), E(g43 | H) = a4.
In the same way

E[(g41−g43g31) | H] = E{ E[(g41−g43g31)] | H }
                    = E[(a4−g43)g31 | H]
                    = 0                                          (5.3.5)

and

E[(g42−g43g32) | H] = E{ E[(g42−g43g32)] | H }
                    = E[(a4−g43)g32 | H]
                    = 0 .                                        (5.3.6)
The second moments are handled identically and we find

E[(g40−g43g30)² | H] = E{ E[g40² − 2g40g43g30 + g43²g30²] | H }
    = E[ σ²/d0 + a4²g30² − 2a4g43g30² + g43²g30² | H ]
    = σ²/d0 + a4²g30² − 2a4²g30² + g30²(σ²/d3 + a4²)
    = σ²/d0 + g30²σ²/d3 ,                                        (5.3.7)

E[(g41−g43g31)² | H] = E{ E[g41² − 2g41g43g31 + g43²g31²] | H }
    = E[ σ²/d1 + a4²g31² − 2a4g43g31² + g43²g31² | H ]
    = σ²/d1 + a4²g31² − 2a4²g31² + g31²(σ²/d3 + a4²)
    = σ²/d1 + g31²σ²/d3 ,                                        (5.3.8)

and

E[(g42−g43g32)² | H] = E{ E[g42² − 2g42g43g32 + g43²g32²] | H }
    = E[ σ²/d2 + a4²g32² − 2a4g43g32² + g43²g32² | H ]
    = σ²/d2 + a4²g32² − 2a4²g32² + g32²(σ²/d3 + a4²)
    = σ²/d2 + g32²σ²/d3 .                                        (5.3.9)

The cross product terms are handled similarly and we have that

E[(g40−g43g30)(g41−g43g31) | H]
    = E{ E[ E(g40g41 − g40g43g31 − g41g43g30 + g43²g30g31) ] | H } ,  (5.3.10)

where the inner, middle, and outer expectations are with respect to
g40, g41, and g43, respectively. Continuing we find

E[(g40−g43g30)(g41−g43g31) | H]
    = E{ E[ a4g30g41 − a4g30g43g31 − g41g43g30 + g43²g30g31 ] | H }
    = E{ a4²g30g31 − a4g30g43g31 − a4g43g30g31 + g43²g30g31 | H }
    = a4²g30g31 − 2a4²g30g31 + (σ²/d3 + a4²)g30g31
    = (σ²/d3) g30g31 ,                                           (5.3.11)

E[(g40−g43g30)(g42−g43g32) | H]
    = E{ E[ E(g40g42 − g40g43g32 − g42g43g30 + g43²g30g32) ] | H }
    = E{ E[ a4g30g42 − a4g30g43g32 − g42g43g30 + g43²g30g32 ] | H }
    = E{ a4²g30g32 − a4g30g43g32 − a4g43g30g32 + g43²g30g32 | H }
    = a4²g30g32 − 2a4²g30g32 + (σ²/d3 + a4²)g30g32
    = (σ²/d3) g30g32 ,                                           (5.3.12)

and

E[(g41−g43g31)(g42−g43g32) | H]
    = E{ E[ E(g41g42 − g41g43g32 − g42g43g31 + g43²g31g32) ] | H }
    = E{ E[ a4g31g42 − a4g31g43g32 − g42g43g31 + g43²g31g32 ] | H }
    = E{ a4²g31g32 − a4g31g43g32 − a4g43g31g32 + g43²g31g32 | H }
    = a4²g31g32 − 2a4²g31g32 + (σ²/d3 + a4²)g31g32
    = (σ²/d3) g31g32 .                                           (5.3.13)
Hence we find

( g40−g43g30 )
( g41−g43g31 ) | H  ~  N₃(μ, σ²V₃) ,                             (5.3.14)
( g42−g43g32 )

where

μ = (0, 0, 0)′                                                   (5.3.15)

and

      ( 1/d0 + g30²/d3    g30g31/d3         g30g32/d3      )
V₃ =  ( g30g31/d3         1/d1 + g31²/d3    g31g32/d3      ) .   (5.3.16)
      ( g30g32/d3         g31g32/d3         1/d2 + g32²/d3 )

It is well known that if x:(n×1) ~ N_n(0, Σ) then x′Σ⁻¹x ~ χ²_n(0).
Therefore it follows that

     ( g40−g43g30 )′      ( g40−g43g30 )
r4 = ( g41−g43g31 )  V₃⁻¹ ( g41−g43g31 ) | H  ~  σ²χ²₃(0) ,      (5.3.17)
     ( g42−g43g32 )       ( g42−g43g32 )

where the subscript on r equals the first subscript on g.
Since the distribution above is functionally independent of
the variables in H we have the unconditional distribution
also, that is,

r4 ~ σ²χ²₃(0) .                                                  (5.3.18)
Putting H = {d0,d1,d2,d3,d4,g20,g21}, we now find the joint
distribution of {(g30−g32g20), (g31−g32g21)}.
Proceeding as before,

E[(g30−g32g20) | H] = E{ E[(g30−g32g20)] | H }
                    = E[(a3−g32)g20 | H]
                    = 0                                          (5.3.19)

and

E[(g31−g32g21) | H] = E{ E[(g31−g32g21)] | H }
                    = E[(a3−g32)g21 | H]
                    = 0 .                                        (5.3.20)

The second moments are

E[(g30−g32g20)² | H] = E{ E[g30² − 2g30g32g20 + g32²g20²] | H }
    = E[ σ²/d0 + a3²g20² − 2a3g32g20² + g32²g20² | H ]
    = σ²/d0 + a3²g20² − 2a3²g20² + g20²(σ²/d2 + a3²)
    = σ²/d0 + g20²σ²/d2                                          (5.3.21)

and

E[(g31−g32g21)² | H] = E{ E[g31² − 2g31g32g21 + g32²g21²] | H }
    = E[ σ²/d1 + a3²g21² − 2a3g32g21² + g32²g21² | H ]
    = σ²/d1 + a3²g21² − 2a3²g21² + g21²(σ²/d2 + a3²)
    = σ²/d1 + g21²σ²/d2 .                                        (5.3.22)

The cross product term is

E[(g30−g32g20)(g31−g32g21) | H]
    = E{ E[ E(g30g31 − g30g32g21 − g31g32g20 + g32²g20g21) ] | H }
    = E{ E[ a3g20g31 − a3g20g32g21 − g31g32g20 + g32²g20g21 ] | H }
    = E{ a3²g20g21 − a3g20g32g21 − a3g32g20g21 + g32²g20g21 | H }
    = a3²g20g21 − 2a3²g20g21 + (σ²/d2 + a3²)g20g21
    = (σ²/d2) g20g21 .                                           (5.3.23)
Hence we find that

( g30−g32g20 )
( g31−g32g21 ) | H  ~  N₂(μ, σ²V₂) ,                             (5.3.24)

where

μ = (0, 0)′                                                      (5.3.25)

and

      ( 1/d0 + g20²/d2    g20g21/d2      )
V₂ =  (                                  ) .                     (5.3.26)
      ( g20g21/d2         1/d1 + g21²/d2 )

Now form

     ( g30−g32g20 )′      ( g30−g32g20 )
r3 = (            )  V₂⁻¹ (            ) ,                       (5.3.27)
     ( g31−g32g21 )       ( g31−g32g21 )

where

r3 | H  ~  σ²χ²₂(0) .                                            (5.3.28)

By the same arguments as before we have that the distribution
of r3 is functionally independent of the elements of H and
hence

r3 ~ σ²χ²₂(0)                                                    (5.3.29)

unconditionally. Moreover r4 is independent of r3 since the
distribution of r4 is functionally independent of the
elements of r3.

Finally we consider (g20−g21g10) and put
H = {d0,d1,d2,d3,d4,g10}.
We have

E[(g20−g21g10) | H] = E{ E[(g20−g21g10)] | H }
                    = E[(a2−g21)g10 | H]
                    = 0                                          (5.3.30)

and

E[(g20−g21g10)² | H] = E{ E[g20² − 2g20g21g10 + g21²g10²] | H }
    = E[ σ²/d0 + a2²g10² − 2a2g21g10² + g21²g10² | H ]
    = σ²/d0 + a2²g10² − 2a2²g10² + g10²(σ²/d1 + a2²)
    = σ²/d0 + g10²σ²/d1 .                                        (5.3.31)

Hence

(g20−g21g10) | H  ~  N(μ, σ²V₁) ,                                (5.3.32)

where

μ = 0                                                            (5.3.33)

and

V₁ = 1/d0 + g10²/d1 .                                            (5.3.34)

With

r2 = (g20−g21g10)² / (1/d0 + g10²/d1) ,                          (5.3.35)

then

r2 | H  ~  σ²χ²₁(0) .                                            (5.3.36)

Since the distribution of r2 is functionally independent of the
elements of H we have, unconditionally, that r2 is σ² times a chi-
square variate with one degree of freedom. Also r2 is independent
of r3 and r4 since their distributions are functionally independent
of the elements of r2. Since the three statistics are independent
we may add them to get

R = (r2 + r3 + r4) ~ σ²χ²₆(0) .                                  (5.3.37)

We note that the distribution of R depends on the nuisance
parameter σ²; to eliminate this parameter we consider

T = (R/6) / σ̂²  ~  F_{6, 4ν−10}(0) .                            (5.3.38)

Since R is independent of {d0,d1,d2,d3,d4} it is independent of
σ̂², and T is the ratio of two independent chi-square variates
divided by their degrees of freedom; that is, T has the F
distribution.
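The conditioning argument behind (5.3.35)-(5.3.36) can be checked by simulation. The sketch below draws g20 and g21 from the conditional model read off (5.3.1), namely g20 | H ~ N(a2 g10, σ²/d0) and g21 | H ~ N(a2, σ²/d1) independently, and verifies that r2 averages to σ², the mean of a σ²χ²₁ variate. The particular parameter values are illustrative assumptions.

```python
import random
import math

random.seed(1)

# Illustrative values (not taken from the dissertation's data).
a2, sigma2, g10 = 0.6, 2.25, 1.3
d0, d1 = 5.0, 7.0
sd = math.sqrt(sigma2)

def draw_r2():
    # Conditional model from (5.3.1): g20 | H ~ N(a2*g10, sigma^2/d0),
    # g21 | H ~ N(a2, sigma^2/d1), independently.
    g20 = random.gauss(a2 * g10, sd / math.sqrt(d0))
    g21 = random.gauss(a2, sd / math.sqrt(d1))
    # r2 of (5.3.35); by (5.3.36), r2 ~ sigma^2 * chi-square(1).
    return (g20 - g21 * g10) ** 2 / (1.0 / d0 + g10 ** 2 / d1)

n = 200_000
mean_r2 = sum(draw_r2() for _ in range(n)) / n   # should be near sigma2
```

The Monte Carlo mean of r2 settles near σ² = 2.25, in agreement with (5.3.36).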
Before extending this result to general m we note that the dispersion
matrix V₃ of equation (5.3.16) may be written

V₃ = D(3)⁻¹ + γ₃γ₃′ ,                                            (5.3.39)

where

D(3) = diag(d0,d1,d2)                                            (5.3.40)

and

γ₃ = d3^{−1/2} (g30, g31, g32)′ .                                (5.3.41)

Since V₃ may be written in this form its inverse can be obtained
from the Binomial Inverse Theorem, found in Press [13], which states

(D(3)⁻¹ + γ₃γ₃′)⁻¹ = D(3) − D(3)γ₃γ₃′D(3) / (1 + γ₃′D(3)γ₃) .    (5.3.42)

This is very useful in the actual computation of the statistic T.
It follows that V₂ is of the same form and can be written

V₂ = D(2)⁻¹ + γ₂γ₂′ ,                                            (5.3.43)

where

D(2) = diag(d0,d1)                                               (5.3.44)

and

γ₂ = d2^{−1/2} (g20, g21)′ .                                     (5.3.45)
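As a quick numerical check of the Binomial Inverse Theorem of (5.3.42), the sketch below builds D(3)⁻¹ + γγ′ for arbitrary positive d's, forms the claimed inverse, and multiplies the two matrices; the numerical values are illustrative only.

```python
# Verify (5.3.42) numerically:
# (D^-1 + g g')^-1 = D - D g g' D / (1 + g' D g), with D = diag(d0,d1,d2).
d = [2.0, 3.0, 5.0]                                  # illustrative d0, d1, d2
d3 = 4.0
g = [0.8 / d3 ** 0.5, -0.5 / d3 ** 0.5, 1.1 / d3 ** 0.5]   # plays gamma_3

n = len(d)
# A = D^-1 + g g'
A = [[(1.0 / d[i] if i == j else 0.0) + g[i] * g[j] for j in range(n)]
     for i in range(n)]
# Claimed inverse: B = D - D g g' D / (1 + g' D g)
c = 1.0 + sum(d[i] * g[i] * g[i] for i in range(n))
B = [[(d[i] if i == j else 0.0) - d[i] * g[i] * g[j] * d[j] / c
      for j in range(n)] for i in range(n)]
# The product A B should be the identity matrix.
P = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
     for i in range(n)]
```

Only a diagonal matrix multiplication, an outer product, and one scalar division are needed, which is what makes (5.3.42) convenient when computing T.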
In general we can form

r_j = g_j′ V_{j−1}⁻¹ g_j : 2 ≤ j ≤ m ,                           (5.3.46)

where

g_j = [(g_{j,0} − g_{j,j−1} g_{j−1,0}), (g_{j,1} − g_{j,j−1} g_{j−1,1}), ... ,
       (g_{j,j−2} − g_{j,j−1} g_{j−1,j−2})]′ : 2 ≤ j ≤ m         (5.3.47)

and

V_{j−1} = D(j−1)⁻¹ + γ_{j−1}γ_{j−1}′ : 2 ≤ j ≤ m ,               (5.3.48)

with

D(j−1) = diag(d0,d1,...,d_{j−2}) : 2 ≤ j ≤ m                     (5.3.49)

and

γ_{j−1} = d_{j−1}^{−1/2} (g_{j−1,0}, g_{j−1,1}, ..., g_{j−1,j−2})′
                                       : 2 ≤ j ≤ m .             (5.3.50)

Following the pattern given for m = 4, when V ∈ 𝒱,

r_j ~ σ²χ²_{j−1}(0) : 2 ≤ j ≤ m ,                                (5.3.51)

and they are mutually independent so that

R = Σ_{j=2}^m r_j ~ σ²χ²_{½m(m−1)}(0) , (V ∈ 𝒱)                  (5.3.52)

and finally

T = (R / ½m(m−1)) / σ̂²  ~  F_{½m(m−1), mν−½m(m+1)}(0) , (V ∈ 𝒱).  (5.3.53)

No attempt has been made to find the distribution of T when V ∉ 𝒱,
but a computer simulation indicates, as we would expect, that T is
stochastically larger when V ∉ 𝒱 than when V ∈ 𝒱.
5.4 Asymptotic Performance of (−ρ₀ log λ₁) and T

In section 2 we derived the asymptotic distribution of −ρ₀ log λ₁
and showed it to have a chi-square distribution with (m−1) degrees
of freedom. In that section we also showed that the distribution of
−ρ₀ log λ₁ is independent of d̄ = (1/m) Σ_{j=1}^m d_j, or equivalently
of Σ_{j=1}^m d_j. In section 3 we showed that R has a chi-square
distribution with ½m(m−1) degrees of freedom, independent of
{d0,d1,...,dm} and hence independent of −ρ₀ log λ₁ and Σ_{j=1}^m d_j.
Since R is independent of Σ_{j=1}^m d_j it is independent of σ̂², and
hence we formed T equal to the ratio of R and σ̂² divided by the
appropriate constants to form an F distribution with ½m(m−1) degrees
of freedom in the numerator and [mν − ½m(m+1)] degrees of freedom in
the denominator. Since both R and σ̂² are independent of −ρ₀ log λ₁,
then so is T. Now the distribution of ½m(m−1)T tends to that of a
chi-square variate with ½m(m−1) degrees of freedom as ν → ∞. Since
T and −ρ₀ log λ₁ are independent we have

lim_{ν→∞} {−ρ₀ log λ₁ + ½m(m−1)T} ~ χ²_{½(m−1)(m+2)}(0) .        (5.4.1)

It has been shown by Wilks [18] that under certain regularity
conditions −2 log Λ will be asymptotically distributed as a
chi-square with ℓ degrees of freedom under the null hypothesis,
where Λ denotes the likelihood ratio. The degrees of freedom ℓ may
be computed from (ℓ₁ − ℓ₀), where ℓ₁ equals the number of parameters
estimated under the alternative hypothesis (H₁) and ℓ₀ equals the
number of parameters estimated under the null hypothesis (H₀). For
the problem here we find that under H₁, V is arbitrary and we must
estimate all ½(m+1)(m+2) = ℓ₁ different parameters. Under H₀ there
are only (m+2) = ℓ₀ unknown parameters to estimate and hence

ℓ = ℓ₁ − ℓ₀ = ½(m+2)(m−1) .                                      (5.4.2)

That is, the asymptotic distributions of −2 log Λ and
(−ρ₀ log λ₁ + ½m(m−1)T) agree under the null hypothesis. Hence both
methods are asymptotically equivalent under the null hypothesis.

We note that since −ρ₀ log λ₁ and T are independent, Fisher's method
of combining independent tests may be used in place of
(−ρ₀ log λ₁ + ½m(m−1)T). Fisher's method would be especially
appropriate if the sample size is small.
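For two independent tests, Fisher's method combines the p-values through X = −2(log p₁ + log p₂), which is χ²₄ under the null hypothesis. A minimal sketch follows; the two p-values shown are hypothetical, not taken from the dissertation, and the closed-form chi-square tail is valid for even degrees of freedom only.

```python
import math

def chi2_sf_even(x, df):
    """Survival function of a chi-square variate with even df,
    using the closed-form Poisson tail sum."""
    k = df // 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total

def fisher_combine(p_values):
    # Fisher's method: X = -2 * sum(log p_i) ~ chi-square with 2k df.
    x = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even(x, 2 * len(p_values))

# Hypothetical p-values from the two independent tests of the model.
combined_p = fisher_combine([0.18, 0.10])
```

Two individually unremarkable p-values can combine into stronger overall evidence, which is why the method is attractive when the sample size is small.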
Chapter VI
COMPUTER SIMULATIONS AND AN APPLICATION
6.1 Introduction
A computer simulation of the generalized autoregressive
process was performed thirty times. Each simulation had
fifty vector observations, with each vector observation having
six measures including the initial measure. Specific values
were given to (a1,a2,a3,a4,a5), σ² and β²; they were
(0.80, 0.60, 0.50, 0.30, 0.20), 1.00, and 4.00, respectively.
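A run of the simulated process can be sketched as follows. The innovation structure shown, an initial deviation with variance β² followed by measures x_j = a_j x_{j−1} + e_j with Var(e_j) = σ², is an assumed reading of the model, and the means are taken as zero for illustration.

```python
import random

random.seed(2)

A = [0.80, 0.60, 0.50, 0.30, 0.20]   # (a1,...,a5) as in the simulations
SIGMA2, BETA2 = 1.00, 4.00

def simulate(n_vectors):
    """Generate n_vectors observations of the six-measure process."""
    data = []
    for _ in range(n_vectors):
        x = [random.gauss(0.0, BETA2 ** 0.5)]        # initial measure
        for a in A:                                   # five further measures
            x.append(a * x[-1] + random.gauss(0.0, SIGMA2 ** 0.5))
        data.append(x)
    return data

sample = simulate(50)   # fifty vector observations, as in each simulation run
```

Regressing the second measure on the first across a large number of simulated vectors recovers a value near a1 = 0.80, and the sample variance of the initial measure is near β² = 4.00.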
The simulations were made using a computer program
written for the IBM 360 computer. The output from the program
includes
(1) the data used in the analysis,
(2) the mean for each time period,
(3) the cross product matrix,
(4) the G matrix,
(5) the diagonal elements of D,
(6) the starred estimates of {a_i : 1 ≤ i ≤ m}, σ*² and β*²,
(7) the maximum likelihood estimates of {a_i : 1 ≤ i ≤ m},
    σ̂² and β̂², and
(8) the values of −ρ₀ log λ₁ and T used in testing the
    adequacy of the model.
The main purpose of the simulations was to see if the
starred estimators would perform well. In keeping with this
we present only the starred and maximum likelihood estimates
for {a_i : 1 ≤ i ≤ m}, σ² and β².
An application of the theory was made using data from
a drug study at the University of Florida. This study was
directed by Dr. Arlan L. Rosenbloom. Each patient was
infused with glucose and observations were taken on the
patient's level of calcium prior to infusion and at 90 minute
intervals thereafter for four additional observations.
6.2 Computer Simulation Results
Each of the estimates was tested against its true value
at the .05 level of significance. On the average then we
would expect to reject two out of the thirty estimates by
chance alone. Those that were significantly different from
the actual value are listed with an asterisk. Counting the
number of tests that were accepted as a measure of the
estimator's goodness we find *,1 gave 28 acceptable estimates
out of 30. Since a,1 is identical to the maximum likelihood
estimator, &1, there is no comparison. a*2 gave acceptable
estimates in all 30 runs while 82 gave 28. Estimating
a3=.50, the starred estimators did slightly better with a*3
giving 28 acceptable estimates and 83 giving 27. a*4 gave
acceptable estimates in all runs while 84 gave 29. The last
estimators, a*S and 85, both gave 28 acceptable estimates.
We note that whenever the starred estimate was rejected so
was the maximum likelihood estimate, but not conversely.
Tests were also performed on the estimates of σ² and β².
In order to test both σ*² and σ̂², an approximation to the
distribution of chi-square given by Wilson and Hilferty [19]
was used. Their result is that (χ²/ν)^{1/3} is approximately
normally distributed with mean 1 − 2/(9ν) and variance 2/(9ν).
This result and a discussion are also given in Kendall and
Stuart [11]. The results of the tests showed that the starred
estimator gave 25 acceptable estimates while the maximum
likelihood estimator gave 24. Again both estimates were
rejected on the same runs, with one exception, when the
maximum likelihood estimate was too high. All of the rejections
for the starred estimates were caused by underestimating
the true value.
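The Wilson and Hilferty approximation can be inverted to give approximate chi-square critical values, as sketched below; the result is close to the tabulated χ²_{3,.05} = 7.81 quoted in section 6.3.

```python
def wh_chi2_quantile(nu, z):
    """Approximate chi-square quantile via Wilson and Hilferty [19]:
    (X/nu)^(1/3) is roughly N(1 - 2/(9*nu), 2/(9*nu))."""
    mean = 1.0 - 2.0 / (9.0 * nu)
    sd = (2.0 / (9.0 * nu)) ** 0.5
    return nu * (mean + z * sd) ** 3

# Upper 5% point of chi-square with 3 df (z_.95 = 1.645); table value 7.81.
crit = wh_chi2_quantile(3, 1.645)
```

Even at ν = 3 the cube-root approximation is accurate to a few hundredths, which is why it served well for testing σ*² and σ̂² against their true values.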
The starred estimates and maximum likelihood estimates
performed equally well in estimating β². Both gave
acceptable estimates in 26 of the 30 runs. Of the four
incorrect estimates, both were high on three and low on one.
They both gave poor estimates on the same runs.

Overall the starred estimators performed as well as or
better than the maximum likelihood estimators. As can be
seen from the means and standard deviations at the bottom of
Tables 1 through 3, both estimates are very close to the true
value. The mean of the maximum likelihood estimates is closer
to the true value for a2, a4 and a5, but not for a3, σ² or β².
Also we note that the sample standard deviations are smaller
for the maximum likelihood estimates except for β̂². None
of the differences seems to be appreciable in any case.
Table 1

ESTIMATES OF a1, a2, AND a3 FOR
COMPUTER SIMULATED PROCESS

Run       a1=.80    a2=.60   a2=.60   a3=.50   a3=.50
Number    a*1=â1    a*2      â2       a*3      â3

 1        0.766     0.484    0.503    0.415    0.421
 2        0.770     0.691    0.625    0.274    0.414
 3        0.767     0.639    0.647    0.623    0.564
 4        0.747     0.427    0.475    0.620    0.518
 5        0.643*    0.576    0.611    0.420    0.356
 6        0.842     0.749    0.722    0.423    0.553
 7        0.723     0.490    0.650    0.469    0.452
 8        0.839     0.488    0.661    0.485    0.521
 9        0.906     0.373    0.492    0.837*   0.719*
10        0.902     0.565    0.514    0.556    0.540
11        0.892     0.428    0.554    0.851*   0.691*
12        0.674     0.730    0.663    0.422    0.493
13        0.799     0.503    0.528    0.389    0.333
14        0.747     0.754    0.655    0.775    0.541
15        0.826     0.725    0.643    0.579    0.643
16        0.748     0.696    0.592    0.277    0.381
17        0.795     0.591    0.624    0.411    0.415
18        0.815     0.634    0.626    0.389    0.508
19        0.861     0.546    0.659    0.650    0.573
20        0.848     0.506    0.557    0.559    0.398
21        0.810     0.458    0.532    0.258    0.335
22        0.722     0.702    0.636    0.375    0.368
23        0.746     0.861    0.667    0.610    0.709
24        0.747     0.819    0.571    0.522    0.603
25        0.826     0.456    0.575    0.663    0.632
26        0.927     0.695    0.784*   0.255    0.429
27        0.741     0.459    0.514    0.529    0.600
28        0.652*    0.601    0.566    0.607    0.642
29        0.719     0.596    0.683    0.508    0.510
30        0.706     0.352    0.435*   0.568    0.521

Mean      0.784     0.586    0.599    0.511    0.513
Standard
Deviation 0.074     0.134    0.079    0.158    0.112

*indicates estimate is significantly different from the true
value, at the .05 level of significance.
Table 2

ESTIMATES OF a4 AND a5 FOR
COMPUTER SIMULATED PROCESSES

Run       a4=.30   a4=.30   a5=.20   a5=.20
Number    a*4      â4       a*5      â5

 1        0.360    0.292    0.032    0.060
 2        0.353    0.358    0.077    0.090
 3        0.241    0.282    0.042    0.013
 4        0.257    0.256    0.156*   0.071*
 5        0.366    0.449    0.334    0.204
 6        0.195    0.202    0.105    0.109
 7        0.010    0.073    0.050    0.048
 8        0.223    0.267    0.080    0.076
 9        0.133    0.234    0.347    0.338
10        0.312    0.351    0.268    0.300
11        0.352    0.264    0.209    0.212
12        0.063    0.070*   0.053    0.093
13        0.461    0.383    0.119    0.175
14        0.269    0.225    0.096    0.102
15        0.431    0.295    0.259    0.224
16        0.364    0.271    0.225    0.187
17        0.444    0.352    0.095    0.027
18        0.261    0.147    0.209    0.196
19        0.099    0.194    0.040    0.108
20        0.388    0.425    0.214    0.199
21        0.070    0.168    0.022    0.026
22        0.425    0.405    0.301    0.307
23        0.262    0.349    0.106*   0.121*
24        0.180    0.284    0.236    0.250
25        0.195    0.170    0.390    0.363
26        0.264    0.207    0.330    0.327
27        0.018    0.102    0.135    0.110
28        0.392    0.420    0.381    0.294
29        0.276    0.260    0.084    0.099
30        0.168    0.261    0.033    0.029

Mean      0.261    0.267    0.128    0.140
Standard
Deviation 0.129    0.101    0.160    0.129

*indicates estimate is significantly different from
the true value, at the .05 level of significance.
Table 3

ESTIMATES OF σ² AND β² FOR
COMPUTER SIMULATED PROCESSES

Run       σ²=1.00   σ²=1.00   β²=4.00   β²=4.00
Number    σ*²       σ̂²        β*²       β̂²

 1        0.842     0.826     4.982     5.127
 2        0.947     0.928     3.490     3.594
 3        0.805*    0.788*    6.847*    7.055*
 4        0.791*    0.788*    3.915     3.965
 5        0.920     0.899     4.256     4.394
 6        1.037     1.015     5.268     5.432
 7        0.993     0.975     4.070     4.179
 8        0.940     0.925     3.120     3.278
 9        1.180     1.163*    2.396*    2.451*
10        0.818*    0.811*    5.888*    5.991*
11        0.811*    0.795*    3.164     3.255
12        0.969     0.935     4.235     4.426
13        1.029     1.003     3.194     3.305
14        0.925     0.926     4.096     4.129
15        1.016     0.992     4.700     4.853
16        1.028     0.992     3.227     3.426
17        0.963     0.929     4.122     4.309
18        0.816*    0.797*    5.684*    5.876*
19        1.011     0.981     3.191     3.317
20        0.989     0.993     3.904     3.922
21        1.082     1.058     3.458     3.569
22        1.010     0.988     4.296     4.429
23        0.840     0.831     3.004     3.061
24        1.081     1.053     3.637     3.768
25        1.141     1.103     4.413     4.608
26        0.994     0.971     3.494     3.608
27        0.996     0.957     4.023     4.223
28        0.894     0.861     4.517     4.733
29        0.986     0.942     4.209     4.444
30        0.865     0.867     4.217     4.243

Mean      0.957     0.936     4.102     4.232
Standard
Deviation 0.102     0.096     0.944     0.971

*indicates estimate is significantly different from
the true value, at the .05 level of significance.
6.3 Application

As discussed in section 1, patients were infused with
glucose and measurements were taken on their calcium level
prior to infusion and four times thereafter at 90-minute
intervals. The data are given in Table 4. Inspecting the
means at each period, given at the bottom of Table 4, we see
that, on the average, the calcium reading was highest
initially and the infusion of glucose caused it to drop
continually until the last time period, where there is a mild
increase in the level of calcium. In Table 5 both the starred
estimates and the maximum likelihood estimates are given.
Both estimators gave similar results for all of the parameters,
with a4 having the largest value, probably reflecting the
increase in the level of calcium from time period 3 to time
period 4.

Table 6 shows the standard deviations and 95% confidence
intervals for a*1, a*2, a*3, and a*4. The confidence intervals
for a*1, a*2, and a*3 contain zero, implying the parameters do
not differ significantly from zero. This could have been
guessed by noting the relatively small change in the mean
level of calcium from one period to the next. Since the mean
level rose in the last period the parameter a*4 is large and,
as noted by the 95% confidence interval, is significantly
different from zero.
In testing the adequacy of the model we found −ρ₀ log λ₁ =
4.91 and T = 1.77. Since the distribution of −ρ₀ log λ₁ is
approximately chi-square with 3 degrees of freedom we compare
the calculated value against the tabulated value at the .05
level of significance. We find χ²_{3,.05} = 7.81; since the
calculated value is less than this we accept the hypothesis
of sphericity. The distribution of T is F with 6 degrees of
freedom in the numerator and 114 degrees of freedom in the
denominator. The upper 5% point of this distribution is
F_{6,114,.05} = 2.18. Since the tabulated value is greater than
the calculated value we accept the adequacy of the model.
To see how well the model fits the data we randomly
selected patient number 18 and calculated his measurements
for the four time periods using his previous readings.
Letting Y*_{jk} denote the predicted value at time j for
patient k, we have

Y*_{jk} = ȳ_j + x̂_{jk} : 1 ≤ j ≤ 4 ; 1 ≤ k ≤ 32 ,               (6.3.1)

where

x̂_{jk} = â_j x_{j−1,k} : 1 ≤ j ≤ 4 ; 1 ≤ k ≤ 32                  (6.3.2)

and

x_{jk} = Y_{jk} − ȳ_j : 0 ≤ j ≤ 4 ; 1 ≤ k ≤ 32 .                 (6.3.3)

Hence we may write

Y*_{jk} = ȳ_j + â_j (Y_{j−1,k} − ȳ_{j−1}) : 1 ≤ j ≤ 4 ; 1 ≤ k ≤ 32 .  (6.3.4)

Given that patient 18 had an initial reading of 9.9, the
prediction for his 90-minute reading is

Y*_{1,18} = 9.15 − .137(9.64) + .137(9.9)
          = 9.20 .

Similarly for the rest of the readings we find that

Y*_{2,18} = 9.02 ,
Y*_{3,18} = 9.06 ,
and Y*_{4,18} = 9.29 .
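The computation in (6.3.4) can be sketched as below, using the period means from Table 4 and the starred estimates from Table 5; tiny discrepancies from the values quoted above are rounding, since the dissertation carried more decimal places.

```python
# Predict patient 18's four follow-up calcium readings via (6.3.4):
# Y*_j = ybar_j + a_j * (Y_{j-1} - ybar_{j-1}).
ybar = [9.64, 9.15, 9.11, 8.94, 9.01]      # period means, Table 4
a_star = [0.137, 0.334, 0.305, 0.506]      # starred estimates, Table 5
y18 = [9.9, 8.9, 9.5, 9.5, 9.8]            # patient 18's readings, Table 4

predicted = [ybar[j] + a_star[j - 1] * (y18[j - 1] - ybar[j - 1])
             for j in range(1, 5)]
```

Each prediction uses only the previous observed reading, so the model is applied one step ahead at a time.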
Table 4

LEVEL OF CALCIUM IN GRAMS PER LITER IN
PATIENTS INFUSED WITH GLUCOSE

Patient   Initial   90        180       270       360
Number    Period    Minutes   Minutes   Minutes   Minutes

 1         9.1       9.2       8.9       8.5       8.3
 2        10.0       9.2       9.5       9.2       8.4
 3        10.1       9.8      10.1       8.0       9.7
 4        10.0      10.1       9.1       9.4       9.5
 5         9.7       8.9       9.1       9.1       9.2
 6         9.5       8.7       9.1       8.3       8.6
 7         9.5       9.4       9.3       9.6       9.3
 8        10.1       9.2       9.3       9.1       8.7
 9         9.6       8.9       9.3       9.4       8.9
10         9.1       9.3       9.0       9.0       9.0
11         9.6       8.8       8.9       8.8       9.4
12         9.3       9.4       9.3       9.4       9.7
13        10.2       9.5       9.8       9.9       9.8
14         9.2       8.8       9.4       8.9       8.2
15         9.6       9.4       8.9       8.9       9.0
16        10.1       9.0       9.1       9.2       9.1
17         9.4       8.5       8.5       8.6       8.7
18         9.9       8.9       9.5       9.5       9.8
19        10.4       8.9       9.4       8.3       8.1
20         9.0       8.8       8.5       8.5       8.4
21         9.7       9.6       9.4       8.4       8.8
22        10.2       8.1       9.0       8.9       9.4
23         9.2      10.3       9.0       8.7       8.7
24         9.7       8.9       9.1       9.1       9.2
25         9.0       8.4       8.1       8.7       8.7
26         9.4       9.2       9.2       9.2       9.0
27         9.4       8.9       8.8       8.7       9.0
28        10.1       9.8       9.1       8.8       9.0
29         9.8       9.3       9.5       9.3       9.6
30         9.6       9.1       8.4       8.6       8.5
31         9.5       8.8       8.5       8.7       9.2
32         9.4       9.6       9.4       9.4       9.5

Mean       9.64      9.15      9.11      8.94      9.01
Table 5

ESTIMATES OF THE PARAMETERS
FOR THE GLUCOSE STUDY

Type of
Estimate              a1      a2      a3      a4      σ²      β²

Starred
Estimate            0.137   0.334   0.305   0.506   0.173   0.843

Maximum
Likelihood          0.137   0.383   0.312   0.590   0.174   0.854
Table 6

STANDARD DEVIATIONS AND 95% CONFIDENCE INTERVALS
FOR a*1, a*2, a*3, AND a*4
Comparing these to the actual measurements of 8.9, 9.5, 9.5,
and 9.8 we see that the model gives reasonable predictions.
BIBLIOGRAPHY

[ 1] Anderson, T. W. (1958). An Introduction to Multivariate
     Statistical Analysis. Wiley, New York.

[ 2] Anderson, T. W. (1971). The Statistical Analysis of
     Time Series. Wiley, New York.

[ 3] Bahadur, R. R. (1960). Stochastic Comparison of Tests.
     Ann. Math. Statist., 31, 276-295.

[ 4] Box, G. E. P. (1949). A General Distribution Theory
     for a Class of Likelihood Criteria. Biometrika,
     36, 317-346.

[ 5] Box, G. E. P. and Jenkins, G. M. (1970). Time Series
     Analysis (Forecasting and Control). Holden-Day,
     San Francisco.

[ 6] Cornish, E. A. (1954). The Multivariate t-Distribution
     Associated with a Set of Normal Sample Deviates.
     Australian Journal of Physics, 7, 531-542.

[ 7] Deemer, W. L. and Olkin, I. (1951). The Jacobians
     of Certain Matrix Transformations Useful in
     Multivariate Analysis. Biometrika, 38, 345-367.

[ 8] Dunnett, C. W. and Sobel, M. (1954). A Bivariate
     Generalization of Student's t-Distribution with
     Tables for Certain Special Cases. Biometrika,
     41, 153-169.

[ 9] Feller, W. (1966). An Introduction to Probability
     Theory and Its Applications, Vol. 2. Wiley, New York.

[10] Fisz, M. (1967). Probability Theory and Mathematical
     Statistics. Wiley, New York.

[11] Kendall, M. G. and Stuart, A. (1967). The Advanced
     Theory of Statistics, Vol. 2. Hafner, New York.

[12] Littell, R. C. and Folks, J. L. (1971). Asymptotic
     Optimality of Fisher's Method of Combining
     Independent Tests. Journal of the American Statis-
     tical Association, 66, 802-806.

[13] Press, S. J. (1972). Applied Multivariate Analysis.
     Holt, New York.

[14] Rao, C. R. (1952). Advanced Statistical Methods in
     Biometric Research. Wiley, New York.

[15] Saw, J. G. (1964). Likelihood Ratio Tests of Hypotheses
     on Multivariate Populations, Volume I: Distribution
     Theory. Virginia Polytechnic Institute, Blacksburg,
     Virginia.

[16] Saw, J. G. (1964). Likelihood Ratio Tests of Hypotheses
     on Multivariate Populations, Volume II: Tests of
     Hypotheses. Virginia Polytechnic Institute,
     Blacksburg, Virginia.

[17] Saw, J. G. (1973). Jacobians of Singular Transformations
     with Applications to Statistical Distribution Theory.
     Communications in Statistics, 1, 81-91.

[18] Wilks, S. S. (1938). The Large Sample Distribution of
     the Likelihood Ratio for Testing Composite Hypotheses.
     Ann. Math. Statist., 9, 60-62.

[19] Wilson, E. B. and Hilferty, M. M. (1931). The Distribution
     of Chi-square. Proc. Nat. Acad. Sci., U.S.A., 17, 684-688.
BIOGRAPHICAL SKETCH

Darryl Jon Downing was born January 4, 1947 in
Beaver Dam, Wisconsin, the youngest of the five
children of William and Roberta Downing. He spent
most of his youth in Janesville, Wisconsin, where he
graduated from high school in 1965.
Shortly after high school he married Barbara Ann Fisher.
It was through Barbara's coaxing that Darryl applied to
Whitewater State University where he obtained a Bachelor of
Science degree with a major in mathematics, in January of
1970. While attending Whitewater State University Darryl
met Dr. David Stoneman who introduced him to the field of
statistics. Dr. Stoneman was also instrumental in helping
Darryl go to graduate school.
After graduating from Whitewater State University
Darryl attended graduate school at Michigan Technological
University, majoring in mathematics. He attended Michigan
for six months and left for the University of Florida in
the Fall of 1970. In June, 1972 Darryl received the Master
of Statistics degree. From 1972 until the present he has
been working towards the degree of Doctor of Philosophy with
a major in Statistics.
Darryl and Barbara have two children: Darren Jon,
age 8 and Kelly Ann, age 6. Both children were born in
Janesville, Wisconsin while Darryl was attending Whitewater
State University.
Darryl has been hired as an Assistant Professor of
Statistics at Marquette University's Mathematics and
Statistics Department in Milwaukee, Wisconsin and will start
teaching there in August, 1974.
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.
J. G. Saw, Chairman
Professor of Statistics
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the Degree of Doctor of Philosophy.
D. T. Hughes
Assistant Professor of Statistics
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the Degree of Doctor of Philosophy.
M. C. K. Yang
Assistant Professor of Statistics
I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the Degree of Doctor of Philosophy.
Z. R. Pop-Stojanovic
Professor of Mathematics
This dissertation was submitted to the Department of Statistics
in the College of Arts and Sciences and the Graduate Council,
and was accepted as partial fulfillment of the requirements for
the degree of Doctor of Philosophy.
August, 1974
Dean, Graduate School