Title: Estimating and testing the parameters of a generalization of the first order nonstationary autoregressive process
CITATION PDF VIEWER THUMBNAILS PAGE IMAGE ZOOMABLE
Full Citation
STANDARD VIEW MARC VIEW
Permanent Link: http://ufdc.ufl.edu/UF00098169/00001
 Material Information
Title: Estimating and testing the parameters of a generalization of the first order nonstationary autoregressive process
Physical Description: vii, 90 leaves. : illus. ; 28 cm.
Language: English
Creator: Downing, Darryl Jon, 1947-
Publication Date: 1974
Copyright Date: 1974
 Subjects
Subject: Stochastic processes   ( lcsh )
Estimation theory   ( lcsh )
Statistics thesis Ph. D
Dissertations, Academic -- Statistics -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
 Notes
Thesis: Thesis--University of Florida.
Bibliography: Bibliography: leaves 87-88.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00098169
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000582655
oclc - 14155852
notis - ADB1032

Downloads

This item has the following downloads:

estimatingtestin00down ( PDF )


Full Text

















ESTIMATING AND TESTING THE PARAMETERS OF A
GENERALIZATION OF THE FIRST ORDER NONSTATIONARY
AUTOREGRESSIVE PROCESS





by

Darryl Jon Downing


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF ILORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA
1974


































TO BARBARA, DARREN, AND KELLY
FOR THEIR LOVE, UNDERSTANDING,
AND CONSTANT SUPPORT.
















ACKNOWLEDGMENTS


I would like to express my deepest gratitude and

heartfelt thanks to Dr. John Saw. He suggested this topic

to me, was always available for assistance, and without his

help I would have never completed this work. Appreciation

is expressed also to the other members of my supervisory

committee, Professors D. T. Hughes, M. C. K. Yang, and

Z. R. Pop-Stojanovic.

A special thank you is given to Dr. William Mendenhall,

who made it possible for me to come to the University of

Florida. His concern for my welfare and my family's will

always be remembered and appreciated.

To Libby Coker who typed this manuscript from a rough

draft, I owe more than a simple thank you can express. Her

dedication and perseverance will always be remembered and

appreciated.

The years of study here were consummated by the whole of

the Statistics Department. To all of the faculty,

secretaries, and students I extend my thanks for making me

feel wanted and welcome.

















TABLE OF CONTENTS


Page


ACKNOWLEDGMENTS .................. .....................

ABSTRACT ..............................................

CHAPTER

1 STATEMENT OF THE PROBLEM ...................

1.1 Introduction ..........................

1.2 Summary of Results ....................

1.3 Notation ..............................

2 THE DOOLITTLE DECOMPOSITION AND ASSOCIATED

DISTRIBUTION THEORY ........................

2.1 Introduction ..........................

2.2. V: A Class of Dispersion Matrices .....

2.3 The Doolittle Decomposition and Its

Jacobian ..............................
2.4 The Joint Distribution of G and D for

Arbitrary V and In Particular When

VV. .. ..................................

2.5 The Distribution of G, of D, and of G

Conditional on D When V Is Arbitrary and

When V V ...............................

2.6 Verification of the Distribution of G..


3 THE ESTIMATORS 02, 8,, AND :lsjm} .....
J zj .









(Table of Contents Continued)


Chapter Page
3 3.1 Introduction ............................ 25

3.2 The Distribution and Properties of
2 225
2 and {a, :l jm .................... 25

3.3 Tests of Hypothesis ............... ....... 31

4 MAXIMUM LIKELIHOOD ESTIMATORS ............... 34

4.1 Introduction ............................ 34

4.2 The Maximum Likelihood Estimators and

Their Distribution ........ ............ 34

4.3 Properties of the Maximum Likelihood

Estimators and Their Distribution ..... 42

5 A TEST OF THE ADEQUACY OF THE MODEL ........ 51

5.1 Introduction. ........................... 51

5.2 An Approximation to the Distribution of

-p0logX1 .............................. 53

5.3 The Distribution of T a Function of

gij :0sj
5.4 Asymptotic Performance of (-p0log1 )

and T.................................. .. 73

6 COMPUTER SIMULATIONS AND AN APPLICATION..... 75

6.1 Introduction ............................ 75

6.2 Computer Simulation Results ........... 76

6.3 Application ............................. 81

BIBLIOGRAPHY ........................................... 87

BIOGRAPHICAL SKETCH ....................... ............. 89
















Abstract of Dissertation Presented to the
Graduate Council of the University of Florida in Partial
Fulfillment of the Requirements for the
Degree of Doctor of Philosophy

ESTIMATING AND TESTING THE PARAMETERS OF A
GENERALIZATION OF THE FIRST ORDER NONSTATIONARY
AUTOREGRESSIVE PROCESS

by

Darryl Jon Downing
August, 1974

Chairman: Dr. J. G. Saw
Major Department: Statistics

A stochastic process is represented as having two

components. The first component is called drift and measures

location. The second component, called noise, measures the

variability of the stochastic process. This paper is

concerned with estimating the noise process when the noise

process is assumed to follow what we shall call a generalized

first order nonstationary autoregressive process. The

generalized first order autoregressive process is defined

similar to the first order autoregressive process, except that

the parameter relating two observations is different for each

time point. In order to estimate these parameters it is

necessary that the stochastic process be replicated a

sufficient number of times.

A method of estimating the parameters is proposed and

the broad attendant distribution theory is delineated, both

in a general setting and for specific situations. The prop-









erties of these estimators are given and some tests of

hypothesis concerning the parameters are investigated. In

order to comment further on the value of the proposed

estimators, we use as a benchmark the maximum likelihood

estimators. Their properties are given and a critical

comparison is made between them and the proposed estimators.

In any practical situation it will be necessary to

decide whether or not the first order generalized autoregres-

sive process is sufficiently accurate to describe the data.

Therefore, a test of the adequacy of the model is given.

Finally, numerical results are obtained using a

computer simulation. The proposed estimators and the maximum

likelihood estimators are compared. Also a practical

application is given.















Chapter I

STATEMENT OF THE PROBLEM


1.1 Introduction

The statistical model in this dissertation is a

stochastic process {Y(t):tET}. Usually T will denote a

time interval and we shall suppose that replications of the

process can be monitored during T at times t0
a typical replicate yielding the random sample yj=Y(tj):Ojim.

It is not necessary that the time increments tl t0, t2 -tl,

*** t- tm-1 be of equal length.

If we write p(t)=EY(t) and X(t)=Y(t)-p(t), then we may

think of p(t) as the "drift" of the sample paths of Y(t) and

X(t) as "noise". Clearly EX(t)=0. Various schemes have,

classically, been used to describe the noise process. In

particular one may assume that, with y.=i(t.)+x.:Osjsm,

xj = alxjl+a2xj-2+ ..+ap xjp+cj : p5jm, (1.1.1)

where E Ep+l, ..' em are independent identically distri-

buted random variables.

In order that the data lend themselves to analysis under

this classical model, several assumptions must be made. The

most restrictive of these is that of stationarity. Expressed

informally, stationarity assumes that the process has been

running a sufficiently long time so that it has settled down.










Putting this into a probabilistic context, stationarity

implies that the probability distribution of xtl, x, .

xk is the same as the probability distribution of xt +t

t2 +t ..., xtk+t for every finite set of values (tl,t2,

...,tk) and for every finite t.

The classical analysis of the model of equation (1.1.1),

known as the p-th order autoregressive model is likely to be

inappropriate in many cases due to the requirement of

stationarity. For example, consider observing the effect of

a diet on weight loss. Initially the weight loss will be

greatest and will tend toward zero as time goes on and the

subject tends to some constant weight. Obviously since the

larger values appear first the probability distribution of

the initial observations is not the same as that occurring

later. A second example is the effect of drug infusion. A

patient is given a dose of some drug, either orally or

intravenously, and blood samples are drawn at various times

t0o tl, ... tm thereafter. The amount of drug in the blood

is then measured for each sample at each time. Again the

initial readings will be larger than the later ones since

the drag will be absorbed into the system or discharged as

time goes on.

It may be argued in both examples cited, that successive

differences (or perhaps successive second differences) have,

approximately, a stationary distribution. Rather than con-

cede to ad hoc procedures we prefer to replace the stationary

autoregressive process by a nonstationary process. We gain

this generality in the model for {Y(t) : teT} at the expense


~ ~~









of requiring (for our analysis of, and estimation of the

parameters of the process) several replications of this

process. Fortunately, in many instances of interest,

replicates will be available.

The simplest alternative to the p-th order autoregres-

sive scheme is what we shall call the first order generalized

autoregressive scheme. Formally we shall assume that the

errors xl, x2, ..., xm satisfy

x. = ajx. + Ej : ljcm, (1.1.2)

where again cl' E2 ..2' m are independent identically

distributed random variables. It will be assumed that the

joint distribution of the errors is multivariate normal.

The assumption that the process can be replicated is needed

in order to estimate the unknown parameters al, a2, ..., am

In the analysis of this model we shall be concerned with

three major problems: (1) providing estimators for the

unknown parameters, (2) finding the distribution of the

estimators and comparing them to other estimators (typically

likelihood estimators) and, (3) providing methodology for

the testing the goodness of fit of the model.


1.2 Summary of Results

A method of estimating the parameters is proposed in

Chapter 2 and the broad attendant distribution theory is

delineated, both in a general setting and for specific

situations. The properties of these estimators are provided

in Chapter 3. In Chapter 4 a critical comparison is made


~









between the maximum likelihood estimators and the estimators

which we propose as an alternative. In any practical

situation it will be necessary to decide whether or not the

first order generalized autoregressive scheme is

sufficiently accurate to describe the data. A decision

procedure bearing on this aspect is investigated in Chapter 5.

An application of the theory is given in Chapter 6,

with a comparison between the estimators derived in this

dissertation and the usual maximum likelihood estimators.


1.3 Notation

In almost all areas of statistics notation is very

important consistent notation aids in solving and under-

standing the material presented. Also it is very convenient

to abbreviate the distributional properties of random

variables. For these reasons, certain notational conventions

have been adopted. Although many of these are standard,

they will be listed here for reference.


1. An underscored lower case letter invariably

represents a column vector. Its row dimension will

be given the first time a vector appears. Thus

x:(m) denotes a column vector consisting of m elements.

The vector which has all its elements zero will be

denoted by 0.

2. Matrices will be denoted by capital letters and

the first time a matrix appears, its row and column

dimensions will be given. Thus M:(rxc) denotes a


~









matrix with r rows and c columns. Denote the zero matrix

by (0). The symbol "I" will be reserved for the

identity matrix.

3. The elements of a matrix will be denoted by the

corresponding small letter with subscripts to denote

their row and column position. Thus m.. denotes the

element in the ith row and jth column of the matrix

M. The symbol (M).. is equivalent to m.. and is

sometimes substituted for convenience.

-4. It is sometimes convenient to form row and column

vectors from a given matrix. The symbol (M). will

represent the row vector formed from the ith row of

M. Similarly (M).j will denote the column vector

formed from the jth column of M.

5. The matrix formed from the first j rows and

columns of M will be denoted by M(j). M(j) is

commonly called the jth principal submatrix.

6. The Kronecher product of M:(mxm) and N:(nxn) will

be denoted by M s N and is a matrix P:(mnxmn) with

(k-l)s+i,( -l)s+j = mk nij

7. Diagonal matrices will be denoted by D=diag(dl,d2,

...,dm), where diag is short for diagonal and dl,d2,

...,dm are the elements on the diagonal.

8. In keeping with conventional notation we shall

write etr(.) to denote the constant "e" (Euler's

constant) raised to the tr(.) power.


~









9. In distribution theory transformations are often

made use of. To denote the Jacobian of the

transformation the following notation J{X-Y} will

represent the Jacobian of the transformation from

the X-space into the Y-space.


10. If y is a random variable having a normal density,
2
with mean p, and variance o, we will write

yN(p,o2)


11. If y is a random variable defined on (o,m) with

density

f(y) = {r(v) 2(V-2)}v- exp {-y2}

we will denote this by

yXV(0).
This is to be read as "y has the central Chi density

on v degrees of freedom."


12. If y is a random variable defined on (o,-) with

density

f(y) = {F(v)(2o2)v}-1 y-1 e-y/22

we shall abbreviate this by

y o 2X2(0).

13. If y is an m-dimensional column vector whose

elements have a joint normal density with mean

vector, : (m) and dispersion matrix V:(mxm), this will

be denoted by


yN (IjV)









14. If Yl, 2' "*. yn are mutually independent

m-variate column vectors with

Zi Nm(j,V) i=l,...,n .
Then, with Y = (1, y2..."' n) we will write

Y \Nmxn(M, VI) ,

where M = (,v, ...,).

15. With Y an mxn matrix such that

Y %N (M, VaI)
mxn
the mxm matrix

W=YY'

has a noncentral Wishart distribution with dispersion matrix

V, degrees of freedom n, and noncentrality matrix

MM'. We will write

W .Wm(V,n,MM')

16. If W is an mxm symmetric matrix whose 'm(m+l)/2

mathematically independent elements have the density

f(W) = km(V,v) IWI(v-m-l) etr{-V-1W}

over the group of positive definite matrices, where

m j=

we will write

W W (V,v, (0)).

17. For referencing within the text I'] will denote

bibliographical references, while (,) will denote

references to equations. Thus [4] refers to the

fourth entry in the bibliography, while (1.2.3) refers

to equation 3 in section 2 of Chapter I.















Chapter II

THE DOOLITTLE DECOMPOSITION AND ASSOCIATED
DISTRIBUTION THEORY


2.1 Introduction

In the event that the "noise" process, X(t)=Y(t)-U(t),

is nonstationary, the general models and methods of

estimation, like those given by Box and Jenkins [5] and

Anderson [2], are no longer valid. We ignore the cases

where the nonstationarity is caused by trend, since this

can be removed and the resulting series is stationary and

usual methods apply. An appropriate model for a "noise"

process with this irregular behavior is

Xj = a j-1+ j : l-jn

Our purpose in this chapter is to find a method of

estimating the parameters of this model and to establish

the attendent distribution theory.

Throughout, {y.:0jsm}, will denote the observed values

of Y(t) at times t0
normally distributed.


2.2 V: A Class of Dispersion Matrices

If {X(t) = Y(t) p(t): tET} is a nonstationary

time series, whose realizations, xj, satisfy the relationship

x. = a.xj_ + e. : l : m, (2.2.1)
J ] a -1 ]









then {X(t)} is said to follow the first order generalized

autoregressive sequence.

Suppose we arrive at the (m+l)-variate column vector

S= (x0,x, ... ,xm) ', obtained from random sampling from

the above process. We assume that x is an observation from

the (m+l)-variate normal distribution, that is,

x'N +l(,V) (2.2.2)
m+l
where p = gx = o, since Sx(t) = o, and

V = Var x = gxx'

In order to determine V we assume that x E, E' "... m

are uncorrelated random variables with

Var(x ) = o2 2 (2.2.3)

and

Var(e ) = o : ljsm .

Since the process follows equation (2.2.1) we can write

{X0,xl,...,"x} in terms of {x0,1,E 2,.. m}. Applying

equation (2.2.1) recursively we find the relationship between

xj and x0' ...',Em Letting x:(m+l)=(x0,x,...,xm) ',

: (m) = (EE2, ..., m) and A:(m+l x m+l) having elements

a.. = 1 : oim

aij = aj _, ,i : osj
ak1 =o : elsewhere,

then it is easily verified that

x = A (2.2.5)

From expression (2.2.5) we see that

V = AUA' (2.2.6)

where U:(m+l x m+l) = diag(2,o2,2 ... 2). (2.2.7)









Writing out the elements of V explicitly we have

v = 22
2 2
00
v = o2 + 2vj-,j- : l1jIn (2.2.8)

Vjk = kj = "j+l'+2 -... akjj : 1rj
In the density of x we need A=V1, because of the

form of V, A has a particularly simple form. By taking the

inverse of the product in (2.2.6) we find

A = (A-1) 'U-1A-1 (2.2.9)

The inverse of U is trivial and since A is lower triangular

its inverse is easily shown to have elements:

ai = 1 : osj an

aj+lj = -Xj+l : osjmn-1 (2.2.10)

ajk = o : elsewhere

Hence the elements of A are given by
2 -2 2 2
oo = B2 + ; 1mm = 1

2
o21j = 1 + j2 + : lIsjn-l
2 2 (2.2.11)
2 2
0 Aj+lj = O j',j+l -j+1 : ojn-1

02 j = o : elsewhere
jk

A square matrix M,with m.. = 0 : li-jj>l, is called a

"Jacobi matrix" in the literature. This matrix can be

factored into c2A = R'R, where R is lower triangular with
r =B-1

r.. = 1 : lsjsm
r3 (2.2.12)
r. ., = -ci : l5j5m

rjk = 0 : elsewhere

This result can be obtained from (2.2.9) very easily.









One further property of V is: let V(j) be the

(j+l x j+l) principal submatrix formed from V, then

V (j) = 2(j+l) 2 : osjsm (2.2.13)

This result can be obtained by partitioning the

matrices in (2.2.6) and taking the determinant of the

corresponding product of the partitioned matrices. Since

A is lower triangular with unit diagonal elements any

square partition has determinant equal to unity. The

square partition of U is diagonal with determinant equal to

(2.2.13).

We note that the special form of V and its properties

are due to the model. We shall let V denote the class of

matrices with this special form. Specifically, V is the

class of all positive definite matrices V such that V- is

a Jacobi matrix. To see that VeV implies V- is a Jacobi

matrix we note that V may always be represented by AUA'

where A is lower triangular given by (2.2.4). Since A-

is given by (2.2.10) then the resulting product (A-)'U 1A-

is always a Jacobi matrix.


2.3 The Doolittle Decomposition and Its Jacobian

Suppose we observe a number of (m+l)-variate column

vectors zj: lsjsn (n>m+l), obtained by random sampling from

an (m+l)-variate normal population with mean 4 and dispersion

matrix V. It is well known that the maximum likelihood

estimates of P and V are given by
1 n
S= y (2.3.1)
n j=l









and

V= W (2.3.2)
n
where W is the (m+l x m+l) matrix
n
W = (y.-y) (Yj-Y)' (2.3.3)
j=1
W has the central Wishart distribution with v=n-l degrees

of freedom and dispersion matrix V. It is well known that

W = vV, so that v W is an unbiassed estimate of V. In

later sections we shall assume that VeV so that V may be

written as V=AUA', where A and U are defined in section 2.

We note that the sub-diagonal of A is (al,.2, ... ,am) with

other elements being products of the a's. Since A contains

all the information on the a's and U contains information
2 2
on o and 8 if we could estimate these matrices we would

have estimates of the unknown parameters. Since W estimates

V, perhaps a transformation on the element of W will give

us estimates of A and U. It is with this intuitive notion

in mind that we proceed. We assume that a matrix W is

available with v degrees of freedom, and for the moment,

that V is an arbitrary positive definite matrix.

For convenience we label the rows and columns of W

zero through m (rather than 1 through m+l) and let W(j) be

the (j+1 x j+l) principal submatrix of W. Define

do = lW( o)
S () 1 (2.3.4)
dj = IW W() I ] lj: 1 m .

With D:(m+l x m+l) = diag(d0,dl,...,dm) define G:(m+l x

m+l), a lower triangular matrix with unit diagonal elements

(uniquely) by









W = GDG' (2.3.5)

Since the (m+l) random diagonal elements of D and the

m(m+l) random elements {gij: o
give (m+2)(m+l) random variables we see that the trans-

formation from W into G and D is nonsingular. The actual

decomposition of W into G and D can be obtained using the

forward Doolittle procedure outlined in Rao [14] and Saw [16]

We wish to determine the joint density of G and D.

This can be obtained by using the density of W, denoted by

f('), and obtaining the Jacobian of the transformation from

W into G and D defined by (2.3.5). Denoting this Jacobian

by J{W+G,D}, and the joint density on G and D by h(G,D) then

h(G,D) = f(GDG')J{W-G,D} (2.3.6)

Direct evaluation of the Jacobian is cumbersome and the

method used here is due to Hsu as reported by Deemer and

Olkin [7]. Since the derivation of the Jacobian is rather

long the rest of this section is devoted to it.

We seek the Jacobian of the transformation from W to

G and D defined by

W = GDG', (2.3.7)

with all matrices (m+l x m+l) and G lower triangular with

unit diagonals. Let [6G] and [6D], both (m+l x m+l),

denote small changes in G and D, respectively. Suppose that

the changes [6G] in G and [6D] in D bring about a change

[6W] in W so that (2.3.5) is preserved. That is

W+[6W] = (G+[6G])(D+[6D]) (G+[6G])' (2.3.8)

Expanding equation (2.3.8) and dropping terms of second









order in the [6*][*E(G,D)], we find that

W+[6W] = GDG' + [G' + S]D + G[6D]G' + GD[6G]' (2.3.9)

Since W = GDG', we see that

[6W] = [6G]DG' + G[6D]G' + GD[6G]' (2.3.10)

Hsu has shown that

J{W+G,D} = J{[6W]-[6G],[6D]}, (2.3.11)

where J{[6W]*[6G],[6D]} is the Jacobian of the transforma-

tion defined by (2.3.10), in which G and D are considered

to be fixed (m+l x m+l) matrices. In essence we have gone

from a non-linear transformation in G and D into a linear

transformation in the differential elements [6G] and [6D].

Pre and post multiplying (2.3.10) by G-1 and (G')1,

respectively gives

G [6W] (G')- = G [61G]D 4 [6D] + D[6G]'(G')-1 (2.3.12)

Let A = G [6W](G')-1

B = G- [6G] (2.3.13)

and C = [6D]

We note that A is symmetric, B is lower triangular with

(B)i = 0:0sitm and C is diagonal. We may rewrite (2.3.12)

as

A = BD + C + DB' .(2.3.14)

From equations (2.3.10), (2.3.12), (2.3.13), and (2.3.14) we

have

J{W+G,D} = J{[6W]-[6G], [6D]}

= JC[6W]-A} J{A-B,C} J{B,C-[6G],[6D]}. (2.3.15)

We shall evaluate the last three Jacobians separately.

The Jacobian, J{[6W]-A}, is the Jacobian of the first









transformation defined by (2.3.13). This can be evaluated

by usual methods and we find

J{ [W]+A} = IGI(m+2) = 1 (2.3.16)

since G is lower triangular with unit diagonals.

The Jacobian, J{B,C-[6G],[6D]}, is the Jacobian of

the transformation defined by the last two equations in

(2.3.13). Hence it may be factored into the product of

two Jacobians, namely,

J{B,C-[6G], [6D]} = J({B [6G]}-J{C-[6D]}. (2.3.17)

By the usual methods for determining Jacobians we find

J{B-[6G]} = G- 1(m+l) = 1 (2.3.18)

and J{C-[6D]} = II (m+l) = 1 (2.3.19)

so that equation (2.3.17) is unity.

Finally we need to determine J{A-B,C}. Writing out

the equations given by (2.3.14) and using the fact that B

is lower triangular with zero diagonal elements we find

aji = a.. = b.i.d : 0
and ajj = cjj : Ojgm

Hence we find that

3a.
3 = 1 : orj s:
33
and
(2.3.21)
Ba..
1) = d : oj

so that the Jacobian is

J{A+B,C}= 3(aooal0a20'. amOalla21 '*aml ''amm
3(coo0b0b20 bmo llb21 bml mm









= dM- (2.3.22)
j=o 3

Following equation (2.3.15) we obtain

J{W-G,D} = M dm (2.3.23)
j=o


2.4 The Joint Distribution of G and D for Arbitrary V and

in Particular When VsV.

In section 3 we derived the Jacobian of the non-linear

transformation W = GDG'. We now suppose that, for

arbitrary positive definite V,

W Wm+1(V,v, (0)). (2.4.1)

The joint density on G and D is then obtained from (2.4.1),

(2.3.6), and (2.3.23).

h(G,D) = Km+1(V,v)etr{-V-1GDG'} I d v+m)-j- (2.4.2)
j=0 3
on

d.>o : ojsfm

and -j
The term Km+(V,v) is defined by
~m+l

Kml(V,) = IV1v (2)\( m+1) km(m+l) m F((v-j)). (2.4.3)
j=0
With A=V- we may write,

tr{-V-GDG'} = tr{-AGDG'}

= tr{-DG'AG} (2.4.4)
m
= - Z d.(G'AG).
j=0

Hence we see that the density in (2.4.2) partitions into

the subsets {do ,g10,g20 ... ,gO}; {dl,g21,g 31,...,g'l ; ...

{dm-l' m, m-1; {dm} which are mutually independent, but

variables within a subset are dependent.


___









In the case that VeV then we may write o2A= R'R from

equation (2.2.12) and we find


(GAG)jj = 2 (G'R'RG)jj

1 (2.4.5)
= 2 (RG)' (RG) ,
J -j
where

(RG)'o (~l;g910-a;g20-2glO; ;mo-amgm,m-l)

and for isj n-1

(RG)j = (0;0;...;l;gj+1,j-aj+;gj+2,j-aj+2gj+l,j; '. -


gmj- amgm-lj) (2.4.6)

The "1" appears as the j element in (RG)

Now we may write

2 m 2
(G'AG)oo- 12 + (ko kk-1,o
0 k=l
and for lsjam-1 (2.4.7)

(G'AG) = 1 {1 + E (gk,j-akuk-g,j2
o k=j+l

The density h(G,D) factors into
m-i
h(G,D) = { I h (dj*,j+2,j g ,j ... g j)}hm(dm), (2.4.3)
j=o j j l 2 mj m m
where
ho(d0'g10,g20'..'gmo) =

2 m
do(v+m)-letr {-do,-2+ (g ko-kk-1,0)2
k=l
rF( )2 (v+m) 1m IV IV(0) 1(v-1) (2.4.9)

and for 1ijsm-1









hj(dj,gj+,j ..gmj) =

m
d(+m)-j-letr{- dj [1 + k=+l(gkj-kk-l,j2

-j) (m-j) IVIV(j 1(v-j-l) iv()()-j) '

(2.4.10)
v-nm -
V-M
-id2 2 d m exp[_hl12dm
and h (d ) = {F((v-m))(2o2 2 } mexp

(2.4.11)

2.5 The Distribution of G, of D, and of G Conditional on

D When V Is Arbitrary and When VV .

The necessity of knowing the distribution of d ;dl;...;

dm and the subsets {g0o;g20;...;g; 21; 31 ;ml ...

{gm-1,m} arises from the fact that functions of these statis-

tics will be estimators for the parameters 2, and

{al,a2, ..., m. A knowledge of the distribution of the

estimators gives us the information we need to talk about

the "goodness" of the estimators. It is to this end that

we derive the distribution of G, D, and G conditional on D.

The distribution on the elements of D follow directly

from a theorem given in class lecture notes and in Saw [15].

Theorem 2.5.1

If Wwm+ (V,v, (0)), v integer with v>m; and W(r) is

defined by



W(r) 10 11 W Ir : Orsm .


rO wrl rr








Then {W( ); {Wr '/(r-) Ir are independent chi-

square variates such that

(o) I ( ) X2 (0) (2.5.2)

and for lasrm

(r) (r-) V l/lv X-r (0). (2.5.3)
Wr I/IW(r-i)'IV(r) I/U(r-l) Xv-r

Since from (2.3.4) we have

do = W(0)

dj = W(j) I/ W(j_l) I: l j n ,

then by direct application of Theorem 2.5.1 we have
do lV( 2 (o)
0 (0)1 ()
and for ljan

d-V() I/IV l) I X_ (0) (2.5.4)
] (j) (j-1l) v-
Now if we allow VEV, then since IV() = 02(j+1) 2 we
find

do 02 2X2 (0)
0 V
and for lsj sn (2.5.5)
d.ja 2 (0)
j v-j

To find the density on the subsets {g910;920;...9m;g

921;931; ... ;gml; "; {gm,m-_l we refer back to equations
(2.4.9) and (2.4.10). Using those equations we may write
for o j sn-l,

hj(d jgj+l,j;) C d(v+m)-j-l etr{-dj(G'AG)}, (2.5.6)

for some constant C. Performing the integration over d. and
m m
replacing (G'AG) by Z E X kg kjgj we have,
k=j k=jkgkj






20
m m
h (gj+,j; ...;mj) = Cm(v:V,j){ m E kgkj j-
lk=j =j (2.5.7)

where, with V(-1) taken as unity


C (v:V,j) - vi= .v(j ) 1 ( ) (2.5.8)
S0(V) (m-) V H(v-j-l)

Remembering that g.jjl, a transformation shows the variables

in (2.5.7) have a multivariate t-distribution. The form and

properties of the multivariate t-distribution were found by

Cornish [6] in 1954 and also in the same year by Dunnett and

Sobel [8].

Now if we allow VEV, we find
Sm 2 -(v+m)
hg (10;20 ;g .;gm)=m Cm (V:I ){l+B2 (gkO- k-1,0
k=l
(2.5.9)
and for l1j.m-l

hj (gj+l,j ; ;gmj)=Cm Ij){l+ (gk,j-akk-1, j) 2 }-4 +m) +j
k=+l (2.5.10)
Evidently g0',g21 .... ,gm,m are mutually independent and

vle(gl0-al) t (0) (2.5.11)
and
(v-k+l) (gk,k-l-ak) t-k+1 : 2_kmn (2.5.12)

To find the distribution of G conditional on D let

IV() I = 1, then the marginal distribution on D is

h(D)= h(d0) -h(dl)...h(dm)

m d.v-j)-lexp{- IV J-1V |d(}
= 3 (-j j (2.5.13)
j=o (2|1V V(j ) l-1 -J (( -j))

We have from equations (2.4.2) and (2.4.4) that the joint

density on (G,D) is
m _(V+p)-j-l
h(G,D)=k (V,v) {d. + exp{-d. (G'AG) .}}
Tj=0
The conditional distribution of G given D is









h(GD) = d(m-j) exp{-d [(G'AG)j -IV I-lI]}
h(=ID) = -1 (2J ) (-j )
j=0 (IVIIV(j)l 1) -(2T'-2 (M


}.(2.5.14)


Although this does not have a very pleasing form if we let

VcV, we find the conditional density simplifies greatly. If

we write G'G =(RG)'(RG) and use the fact that IV(j) I= o2(j+1) 2

we find
m-1 -1 -(m-j) -2 m 2
h(GID)= n {(27ca2rd ) exp[-Ca d E (gkj-a k-g,j .
j=o jk=j+l
(2.5.15)
Let gj be the (m-j)-variate column vector given by

9j = (gjj+ ,j; j+2,j;... mj)' : Oj m-l, (2.5.16)
_ the (m-j) dimensional column vector defined by

_j = (aj+l;aj+l j+2;...;aj+l j+2... ) ':o jm-l, (2.5.17)

and V. the (m-j x m-j) matrix whose elements are given by
2
(V )11 d : Osjsm-1 ,

2
o 2
(Vj)kk d- + j+kjk-lkl : 2ksm-j; oajsm-l, (2.5.18)
(' ~kk d +j j k k-l,k- 1

and (Vj )kl= (Vj )k=aj+kaj+k+l...j+l (Vj )kk: 1k
oajgm-l.


Then equation (2.5.15) may be written as
m-1
h(GID) = H h (g.jld )
j=o

where for oej:m-l


h (g ld ) = (2)-) (m-j)IV j l exp[- (gj-j)'v3 (gj-_j)].
That is


9jldj Nmj( j,Vj) : orjSm-1 .


(2.5.19)


2.5.20)

2.5.21)


(









2.6 Verification of the Distribution of G.

In finding the density of the subset (gk+,k;...;gmk)

the constant Cm (:V,k) was given, but was not verified. In

this section we show how the value of the constant can be

obtained.

We need to evaluate the integral

P P
I = ( E ai. .u .u ) dudu ..du (2.6.1)
i=0 j=0 1J 1 p

{-m
where u =1 and a..= .
o 13 ji
We may write
P P
i=0 iuj 1 o -00 (2.6.2)
i=0 j=0
a AjK

where a = (a01, a02' .... op)' (2.6.3)


al l 12 ... alp

and A = a21 a22 .. a2p (2.6.4)


pi pp2 .."' pp


Now equation (2.6.2) may be rewritten as

P P
I E a.iu.u. =a +u'Au+2u'a (2.6.5)
i=0 j=0

=a +(u-A-l1a)'A(u+A- a)-a'Aa.(2.6.6)
00
-1 -1 -
Write u + A = (a00-a'A a) Kw (2.6.7)

where KAK'=I, so that XKJ = IAl1-. The Jacobian from u









into w can be found by standard methods and is

J{u-w} = (a00-'A '-)PIKI (2.6.8)

= (a00-a'A a) PlAI-2 (2.6.9)

Hence we have

Ig (a0-a A-a) A-/(l+w'w) dw ...dw (2.6.10)

{--
We compare this with the following version of the multi-

variate t-distribution

f(t)=C I EI- I(4 t-4 ) -le (t-,)]-(n+p) :-_ o (2.6.11)
P IJp
ljap
where C = F((n+p))/{-r F(n)} (2.6.12)
P
Let E=I and 8=o then setting 8=2(n+p) we see that

n=2B-p, since n>o we must have >kp and we find


/( +w'w) -dw ...dw (2.6.13)


{--
Hence, p
S -a -B (- p)
I .= (a -a'A a) A r(B) (2.6.14)
g oo -


.I -- A F(B- T-p) P (2.6.15)
AI (p+1)- B)- *

Referring back to (2.5.8) we see that p=m-j; B=(v+m)-j,

so that








rF(v+m)-j) |A (-j)
F ------ ) (m-j) iA21 - -j- )


where

jJ j,j+l J ,m


A1 = j+1,j +j+l,j+l x j+1,m


m,j m,j+1 m,m

and


xj+l,j+l j+1,j+2 -- j+l,m

A2 = j+2,j+l xj+2,j+2 ... xj+2,m


m,j+l m,j+2 ... m,m


Now with A=v-1, then by an application of the

A1 -12l -1 1
21 A2 2 11 12 22 21

= l22 V11ll-1


hence we find


So that





and we confirm

Cm(v:V,j)


^221 =vl-ljVll .


AJ = Vi-1 (j-1)

A21 = IVI-1 IV M ,
that

= F((v+m)-j) Iv(j_ ) I(V-j)

F(-(v-j)) -(m-j) IV lV (j) -j-1)


(2.6.17)









(2.6.18)





clockwise rule


(2.6.19)


(2.6.20)


(2.6.21)

(2.6.22)


Cm(V:V,j) =


24


(2.6.16)


~~__~
















Chapter III
2 2
THE ESTIMATORS o,, *,, and {a* :lsjsm}


3.1 Introduction

In Chapter II the unknown parameters a2, 2, and

{a. :1sj4n were introduced to define the generalized first

order autoregressive process and thence the underlying

distribution of the observations. In this chapter we

propose estimators (alternatives to the maximum likelihood

estimators) for these parameters and discuss their properties.

(To distinguish between the estimators proposed in Chapter II

and the maximum likelihood estimator, we shall reserve the

hat (^) notation for the latter and star (*) notation for

the former.)

Finally, tests of hypothesis concerning the parameters

are discussed. Special attention is given to the case of

testing ak = ak+ = = = am = ro, for k = 2,3,...,m. A

method, due to Fisher, of combining independent tests is

likely to be appropriate.

2 2
3.2 The Distribution and Properties of o 3, 8 and {a,;: lj!m.}

Suppose that we observe a number of (m+l)-variate

column vectors yj: 1sjrn (n>m+l), obtained by random sampling

from an (m+l)-variate normal population with mean p and









dispersion matrix V. We assume that these observations come from

the process Y(t) = L(t) + X(t), during times t0
where X(t) follows the first order generalized autoregressive

process. Since Y(t) = w(t) we have, letting I(t.) = Vi:

osism, that
1 n
Vi = Yi = n j lYiJ : osism, (3.2.1)
1 1 n = 1

or alternatively with the (m+l) dimensional column vector p=

(V0' I' ".. m) that

S= = (3.2.2)

Estimates of the noise process X(t) for t0
x. = yj y : lsjsn (3.2.3)

From these we may arrive at
n ,
W = x.x. (3.2.4)
j=l -3-3
n
= ( -) (y ) ,(3.2.5)
j= 1
where W has the central Wishart distribution on v=n-l

degrees of freedom and dispersion matrix V (where VeV).

Hence the theory of Chapter II is applicable and we have

from equation (2.5.5), (2.5.11), and (2.5.12) that

do 'o2 2 X (0) (3.2.6)
22
d.j o2 X2 (0) : Isjmn (3.2.7)

v B(gl0-al) %tV(0) (3.2.8)

and (v-k+l)2 (gk,k-_-nk)t -k+1_(0): 2sksm .(3.2.9)

Hence the estimators for {a.: ljszm} are:
3
a, = gjj-I : lSjsm, (3.2.10)








with densities given in equations (3.2.8) and (3.2.9).
2 2
The estimators for 2 and 2 are:
2 1 m
= J d (3.2.11)
1 j=1

where c1 = V -!(m+l) (3.2.12)
m
2 m -1
and 32 = d0(c2 dj) (3.2.13)
0 =1
where c2 = v(mc1-2)-1 (3.2.14)
2 2
We note that o2 has a Gamma density and 2B has a Beta Type

2 density.

In order to evaluate the "goodness" of these estimators

the following properties are investigated: (1) the first two

moments, (2) the consistency of the estimators, and (3) the

efficiency of the estimators.

The first moment of the estimators are given by

S(a,) = aj : lsjim (3.2.15)
2 2
(o2) = 2 (3.2.16)

and 2(B2) 2 (3.2.17)

so that the estimators are all unbiassed.

Letting E, denote the (m+2 x m+2) variance-covariance

matrix of the (m+2) dimensional vector of unbiassed estimators
2 2
(a*l' a*2, ..., a*m' 02, ,2), we find


(E ) = -2(v-2)- (3.2.18)

(2*)j = (v-j+l)-1 : 2Sj m (3.2.19)
-1 4
(Z*)m+l,m+l = 2(mcl) (3.2.20)

(Z*)m+2m+2 = 2(v+mcl-2)[v(mcl-4)]-l4 (3.2.21)
1 2 2
*)m+l,m+2 (*)m+2,m+l = -2(mc1)-1 22, (3.2.22)
and (*)ij = 0 : elsewhere (3.2.23)









In Fisz [10] and Feller [9] it is shown that a statistic

t based on v observations, with mean and variance T will be

consistent for 8 if

lim T = 0 (3.2.24)

In the equations (3.2.18) thru (3.2.21) if we let v-m,

we find the variance of the estimator goes to zero. Hence
2 2
the estimators (a*1, a*2i a *m o*, B,) are consistent.

The likelihood function, from which the estimators

are derived, is the density of W, given by

L = f(W) = Km+(V.,v) W (v-m-2)etr{-V-W} (3.2.25)

Using this we may obtain the "Information Matrix", F,

whose elements are defined by


(F) = log L (3.2.26)


where j and 6 are any two of the parameters al,...,am,
2 2 th -1
0 and The j diagonal element of F gives the

minimum varinace bound of any estimator of 0.. Letting

(1 2,' ..., m m+l' m+2) (alla2' ... ,m'2 2) we find

(F)jj = j(

,2 j-l,j-1
=- ?vj.-,- : lSj^m (3.2.27)

2 2 2 2 2 2 2
with Vjlj-l= (l+ajl+aj-laj_2+. ..+aj-_j-2...a2

2 2 222
+aj-laj -2...c2 2) (3.2.28)

(F) = [- V(+l) + 1 trV-W]
m+l,m+l 2a 4 4

= v(m+l) (3.2.29)
20








w i
(F) = j + a
(m+2,m+2 2= + 26j
L

4
264


(F) = (F) I
m+1,m+2 m+2,m+1 4 4


,)
202 '

and (F).j = 0:elsewhere.

To see that (F)j = 0 elsewhere we note that


221og- = 0 for lsjam; l!gsm; j f3 ,


2 log L = 1- j
6 6m+1 4 (a wj-1,j-1 -j-,j) : jm '
and

a2 log L 0 eIsm
2 6 =0 : 10m .
j m+2


Now e( 4 [ajwj_,j_-w j_,j]) = -(ajj_lj_ -


= 0 ,

since vj_l,j=avjj_l,j_1. Hence (F) j is zero as was

shown.

With Z = F- we find that the minimum variance

are
-1 2
(E)j (v v1lGl : 2ljm ,

-1 4
(E)m+l,m+1 = 2(Vm)-l ,


(E),=+2,m+2 2(m+l)(vm)-1 4
( 1)m+2,m+2
() 2 (2) 282
(Em+l,m+2 (Z)m+2,m+l = -2(Vm) a


(3.2.30)







(3.2.31)


(3.2.32)




(3.2.33)



(3.2.34)


(3.2.35)



j-l,j)


(3.2.36)

to be



bounds



(3.2.37)


(3.2.38)


(3.2.39)

(3.2.40)









and (E)jk = 0 : elsewhere (3.2.41)

Since the efficiency of an estimator is the ratio of the

minimum variance bound to the variance of the estimator,

we have from equations (3.2.18) thru (3.2.23) and equations

(3.2.37)thru (3.2.41) that the efficiency of the starred

estimator, denoted by Eff(6,), is
-1
Eff(aL) = (v-2)v-1 (3.2.42)

Eff(a ,) = (v-j-l)(vv j-,j- lo2 21jsn (.3.2.43)
2 -1
Eff(o ) = c l (3.2.44)

and Eff(SB) = (m+l)v(mc -4) [vm(v+mc -2)]-1 (3.2.45)

2 2
The estimators al, a and 6B are asymptotically efficient

while the asymptotic efficiency of a*.: 2sjsm is given by

lim Eff(aj) = v 02 : 2 xi -3 J- l'

Replacing vj-l,j-1 by the right hand side of equation

(3.2.28) gives

2 2 +2 2 2 2 2 2
li Eff(a) (1+a j-2_ +...+ -2 a 2j.. .a- + j-22

2 2 222-1
ajla j 2.-.2a 2 )- : 2sjFn (3.2.47)

Hence the asymptotic efficiency of aj depends on the true

values of the previous a's and B If the true value of

aj-_ is zero then a,j is asymptotically efficient regardless
j-1 i
of the values of any of the previous a's. That is, the

asymptotic efficiency of a. is dependent most upon the true

value of aj- the previous a, next upon aj-2, and so forth

with the least dependence on al and 52. If we suppose that








all the parameters are less than unity in absolute value and

write,

a = max( all, 1 21 .... Ic[aj-_I 6 I), (3.2.48)

then we may arrive at the inequality

2(j-2) 2j 2 2 2
(l+a.++a ...+a 2)+a) ) (l+a2 + 2. +...+
j-1 j-1 -2
2 2 2 2 2 2 2 2
j-lj_-2...a2+ j-la j-_ 2...a2al ) (3.2.49)

where equality holds only when all = 121 = .

lajl1 = 18 = a The inequality reverses upon taking
reciprocals and we find the asymptotic efficiency of a i is

at least as qreat as

2(j-1)2j_ 2 (j+1 2
min Eff(a,) = (l-a2(j-l)+a2 -a2(j+ ))(1-a2) (3.2.50)

2sjn .
Although {a i: 2ai:m} is inefficient this loss in efficiency

is more than made up for by the fact that the distribution

of {(a i-i ): 2i~)n} contains no unknown parameters.


3.3 Tests of Hypothesis

Since the distribution of the estimators are known,

tests of hypothesis may be carried out with ease. We

tabulate here a few hypothesis of interest.

It is desired to test the hypothesis

Ho: ak = ak+l = -.. = rm = (k=2,3,...,m)

against (3.3.1)

Ha: at least one of the qualities does not hold.
With

tj = (v-j+l)(gj,j_-10) 2 : 2sjm (3.3.2)









define

Pj = P{It(-j+l) atj} : 2jmam. (3.3.3)

An appropriate test statistic for testing (3.3.1) is
m
L = -2 E loge(p'). (3.3.4)
j=k

The quantity, L, has the chi-square distribution on 2(m-k+l)

degrees of freedom. The hypothesis is rejected at signifi-

cance level a if

L > Z (3.3.5)

where L is chosen so that
2
P{X2(m-k+) = a (3.3.6)

This procedure is called Fisher's method of combining

independent tests. It has been shown by Littell and Folks

[12] to be asymptotically optimal over other tests as judged

by Bahadur relative efficiency. The Bahadur relative

efficiency compares the rates at which the competing teSt

statistics observed significance levels converge to zero,

in some sense, when the null hypothesis is false. The

interested reader is referred to Bahadur [3] and Littell

and Folks [12] .

The above hypothesis has some interesting interpretations

for choicesof no. If n0 = 0, we are testing whether the

process is white noise from some point k on. In the case

where no is a constant, not equal to zero, we are hypothe-

sizing that the time series is stationary.

Hypothesis concerning individual parameters can be

carried out in the usual manner since the distribution of


I









the estimator is known.

An hypothesis of importance, concerning a single

parameter would be
2 2
Ho: p- =0
0 0
against (3.3.7)
2 2
Ha 82 f3 .
a 0

An appropriate test statistic is
2 2
F0 = mc l,/(mcl-2))0 (3.3.8)

which has an F distribution on v and mec degrees of

freedom, where cl is defined in equation (3.2.12) as

v-(m+l). The null hypothesis is rejected at the a level

of significance if

F > F (3.3.9)
0 mclla
where FV is chosen so that
mci,a

P ({F > F } = a (3.3.10)
mc1 mc1,a

2
Choosing 80 = 1, the hypothesis implies homoscedasticity

between the initial observation and the errors of the "noise"

process.
















Chapter IV

THE MAXIMUM LIKELIHOOD ESTIMATORS


4.1 Introduction

In order to comment further on the value of the

estimatorsgiven in Chapter III some standard of comparison

must be employed. To this end we study the maximum

likelihood estimators. In this chapter we obtain the

maximum likelihood estimators and examine their sampling

properties. A comparison is then made between the maximum

likelihood estimatorsand the starred estimators of Chapter III.


4.2 The Maximum Likelihood Estimators and Their Distribution

As in section 2 of Chapter III, we suppose that we

observe a number of (m+l)-variate column vectors yj: lrsjn

(n>m+l), obtained by random sampling from an (m+l)-variate

normal population with mean p and dispersion matrix V. As

in Chapter III, we estimate p byy and form the (m+l x m+l)

matrix W by
n
W = jl (yj-y) (j-y)) (4.2.1)

W has the central Wishart distribution with v=n-l degrees

of freedom and dispersion matrix V. W may also be represented

by
W = E zz. (4.2.2)
j=-3-3









where zl, z2' ..., z- (v=n-l) are mutally independent and


zj N+ (O,V) : -3 m+l -

To see this let the (m+l x n) matrix Y be defined by


Y = (yl'Y2' ... 'n) (4.

and let B be any orthogonal (n x n) matrix with last colur


( 1 1 (4.

Define the (m+l x n) matrix Z = (z, z, ..., z ) by
Z YB (4.
Z = YB (4.:


2.3)




2.4)

mn

2.5)




2.6)


We note that


(4.2.7)


Now W may be written as

W =

and since B is orthogonal

YY =



and upon substituting YY'

(4.2.8) we find


YY'-nyy',

(BB'=I) we may write

(YB) (YB)'

ZZ' ,

= ZZ' and z = /n y into


W = Z Z z z
-n-n




n-i
= z.
j=l-j-

Hence we have the representation
n
wij= z (Yi-yi) (Yjz -Yj)
-i-j ^^^ ii i 3x


(4.2.8)






(4.2.9)


(4.2.10)








= ZikZjk : osinm; osjsm (4.2.11)
k=l1

Hence forth we shall use W = ZZ keeping in mind the

representation given in (4.2.11).

Assuming that Y(t) = p(t) + X(t), where X(t) follows the

first order generalized autoregressive process then VEV and

has the properties given in section 2 of Chapter II. Since

it is through W that we obtain estimates, the elements of W

serve as observations and the likelihood of this set of

observations is L = f(W) given in (3.2.25). Taking the

logarithm of (3.2.25) and utilizing the form and properties

of V we obtain

log L = C v(m+l) log o2 log2
f-l

+(v-m-2)ioglwI-trV- W (4.2.12)

where C is a constant. Recalling that A = V- has the

special form given by (2.2.11) we may write

trV-iW = trAW

m m
1 2 -1 2
= ) w + Z w.-+ Z (a.w. -2a.w
o2 00 j=l j3=i 3 - -l
(4.2.13)
Substituting (4.2.13) into (4.2.12) we find

log L = C iv(m+l)logo2 log q2 + (v-m-2)loglWI
1 2 -1 m m 2
2- w00+ E w..+ (ew ,1-2-.w. -l,)}
20a 0 j=1 i j=l --1,j-1 -
(4.2.14)
Differentiation of (4.2.14) with respect to aj yields

8 log L 1 r
j--- 2 j l jw wj l,} : lcj aj a 2 j j-lj-l ]-1,j









differentiation with respect to 2 yields

m
S log L v(m+i) 1 2 -1
S20 2o4 j=1


+ j (aw. J -2a.w. ) } (4.2.16)


and finally differentiation with respect to 92 gives

3 log L =-- 00 (4.2.17)
as2 292 2 2B4


Setting { L log L and log Lequal to
j So2 59
zero and solving we obtain the maximum likelihood estimators:

-1
j = .Wl jwj : 1sj (4.2.18)


2 -1 -2 m 2 -1
8 = [v(m+l)l i w0 j + (wj-wjl -l-l'
0 =1
(.4.2.19)

and


2 = ( 2)-w00 (4.2.20)

Eliminating 92 from equation (4.2.19) yields

^2 -1 2 -1
o = (vm)- j (w -w j-,j) (4.2.21)


We now proceed to determine the distribution of the

maximum likelihood estimators. In order to do this we shall

use conditional arguments frequently. We shall write

y z f(.) (4.2.22)

in order to imply that "y conditional on z has the density

...." Using the representation of wi. given in (4.2.11)

one has






38


Z j-l,k jk
k=l
aj = -: ljsm (4.2.23)
2
7 j-l,k


Letting

jj zk_ : osjsm ; lIk:v (4.2.24)
jk v 2
E jk
k=l

we may write

ai = k I j-l,kzjk : l:jsm (4.2.25)


Recalling that


-k Nm+ (0,V) : lsk:v (4.2.26)

when VeV, we must be able to represent zjk by

Zjk = aj-l,k + jk : lsjam; lks: (4.2.27)

where Ejk are independent identically distributed normal
2
random variables with mean zero and variance o

Hence

zjk Zj-l,k N(a zj. ,ko2): lsjam; 1sksv (4.2.28)


Using (4.2.28) in (4.2.25),
v v
2 2
aj {zj-l k:lksv}I-N(jk. j- ,kzj-l, k; =l ,k)
(4.2.29)
Since v
v 1j-l,k
S. j-l,kzj-l,k = v 2 1 (4.2.30)

k=l ]-1,k

and 2 1
and 2 1 (4.2.31)
k=1 E Z j-l,k
k=l









we find upon their substitution into (4.2.29) that

aj{z llk: lsk:s'v.5N(a ,c2 ( 2 -,)


To complete the derivation we note the following.

Let

u s N(p,a2)
2U2
and v oX2(0) ,

then Student's t-distribution is defined as


t = (u -P) ( v) 2

We note that the distribution of t conditional on v is

t vaNN(0,va 2v )


Hence, by analogy, the distribution of a. is also

we have


(aj aj) %t (0): l1j2mn


t and



(4.2.37)


and v .
and + a2 + 2 2 2 + a2 2 2
a 2 ~_jl ij- 1j-2 + j-lj-2''2

a2 -a-2 .02 2) l: 1jim (4.2.38)
j-i j-2"- 21l

To find the distribution of 2 we define the (vxl)

column vectors 6 = (jl, j2 ..., .j )' by
31 32Jv


V 2 7v
jk= (k ZJk2 ;zjk : Isksv; o!jm ,

and note that

0. = 1 : O:rjm .
-3 --3

In terms cf the (m+l x v) matrix Z we have

j ={(Z) (Z) j } (Z J.


(4.2.39)



(4.2.40)



(4.2.41)


(4.2.32)


(4.2.33)

(4.2.34)


(4.2.35)



(4.3.36)


v
oo 2
where =
2









For convenience we take

2
j = wj- W- lJlj_1 : ljm (4.2.42)

and write
2 1 m
Sm j=lj 3 (4.2.43)

Using 6j we may write

.. = (Z) ( j6j 1_1) (Z) : l jm (4.2.44)

where I is the (vxv) identity matrix. Consider the

distribution of qm conditional on {(Z) j:0sjsm-l}. Now


(Z)m. {(Z) .: 0;j:m-l}1 Nv (xl( (Z)m l,- ;o 2I ). (4.2.45)

Conditional on {(Z). : osj m-l1 the matrix (Iv 6 6'l)
3" v -m-1-1-1
is symmetric and idempotent with rank (v-1), and the

quadratic form nm follows the non-central chi-square

distribution, that is

nmj{(Z)j.: Osjsm-l} 02X2_ (Ym) (4.2.46)

where
1 2
m = 2 m (Z)m-1l,- I m-m- m-l, (4.2.47)

Upon replacing m-i by its definition in (4.2.47) we find

Ym = 0 (4.2.48)

Since none of {(Z) j : Ojsm-l} enter into the distribution

of nm we have the unconditional distribution is the same and

hence iq is independent of {(Z). n : o
2 2
m (v-l) (0) (4.2.49)

The distribution of q(m-1) conditional on {(Z). : osjsn-2}

can be obtained in exactly the same manner and since it









depends only on {(Z).. : Osjsm-2J it is independent of

Tm. By the same arguments as above n(m-!) can be shown to

be independent of {(Z)j.: osjsr-2} and hence

2 2
0(m-l) X (0) (4.2.50)
(v-1)

In exactly the same manner we can show qk+l is independent

of nk and hence so is q(k+2)' (k+3)' ..' m since they are

independent of n(k+l) Again the distribution of nk is free

of {(Z) .: O0jsk-l} so that the unconditional distribution of

nk is identical to nm In this way we argue that {rj: Isjan}

are mutually independent and identically distributed with
2 2
nj to X2(v-) (0) : 1ijsm (4.2.51)

2 1 m
Since 8 E n- we have
j=1 J
2
2 2 (0) 4.2.52)
my Xm(v-1)

In the above argument we note that the variables

{nm, (m-l,...,' 'k} are independent of {(Z) .: osj:k-l}.
Hence we have {nm' (m-l)' "'' rl1 are independent of (Z)o.,

and hence of

w00 = (Z) (Z)0 (4.2.53)

Since

ZOk N(0,o 2 ) : lksu (4.2.54)
then

W 0 T2 X2 (0) (4.2.55)

Since 2 = (va2) w00' then we have









(v-1) 2 F
Nv- mF2 (0) (4.2.56)
%2 m(v-1)

4.3 Properties of the Maximum Likelihood Estimators

Since the distribution of the maximum likelihood

estimators is known their properties are easily obtained.

We find using equations (4.2.37), (4.2.52), and (4.2.56)

that (& a) = a : lsjsm (4.3.1)

8(82) = (v- )v-1 a2 (4.3.2)

and _(2) = mv[m(v-l)-2]-2 (4.3.3)

"2 2
Hence the a. are unbiassed estimators while o and 2 are
3
biassed. Since unbiassedness is a desirable property we shall
2 2
use the unbiassed estimators of o and 52 in calculating the

rest of the properties.

SLetting 2 denote the (m+2 x m--2) variance-covariance

matrix of the (m+2) dimensional vector of unbiassed

estimators (a1, a2' ... m, v(v-1) -12, [m(v-l)-2] (mv)-1 2)

we find
1 22
()11 = (v-2)- -2 (4.3.4)

-1-1 2
(M). = (v-2) v ,j-l, : 2jcm (4.3.5)

where vjlj1 is defined in (4.2.38)

-1 4
(E)m+lm+l = 2[m(v-l)] 4 (4.3.6)


(E)m+2,m+2 = 2[m(v-l)+v-2] [mv(v-l)-4v] -14 (4.3.7)


()m+,m+2 = ()m+2,m+ = -2[m(v-l)]- o22 (4.3.8)

and


()jk = 0 : elsewhere .
jic


~


(4.3.9)









To see that the set of estimators (i.: l1jTi} are

independent of the set { 2 2} we note that the distribution

of 82 and 2 are free of the elements of a. and hence the two
3
sets of estimators are independent. To see that the covari-

ances between the c's are zero we note that
V kl-,k k
a. = E 1, k-lz.,k
1 k=l j,k


where j,k


k= j1, k(a j j-,k+ jk)


= a + E j-l,kej,k : 1!sj!m ,
is defined by equations (4.2.24) as
is defined by equations (4.2.24) as


(4.3.10)


Szj,k
j,k v 2

k zj,k
k=l 3

Hence the covariance between 8. and a (jZ) is given by


( )j = &a ju)- ( j)E (

v v
S[( + j-,k jk)(ak + E -l,kE Zk) taZ



= (.ea+a E 9-l,k Ck+ak Z .j-lkjk
3Jk=l k=l


V V
+ E k j-li ji -,k k)-a (4.3.11)
i=l k=l j-l'j -

Since the {est:Issm, lstsv}are independent identically

distributed normal random variables with zero mean, we have

takingexpectations first with respect to the st s that the









last three termsin brackets in (4.3.11) vanish and we are left

with

(C)jS = a j-aj"
= 0 : ljfRua (4.3.12)

Although this shows the estimators are uncorrelated it

is not true that they are independent. To see this we

examine the case for m=2. Write

W=GDG'


dl+d0g20

dlg21+d0910920


dl 21+d0 10 20

d2~dl21+d0g20.1

(4.3.13)


w 01
1 woo
01 0 _


= 10 '


(4.3.14).


w12
2 w 1


(d0g10g20+dlg21)
2
d0g10+dl
We shall show that
2
2 (g10,d0',dl) N( 2' 2 )
dl+d0g10
by equation (4.3.14) this is equivalent to


(4.3.15)


(4.3.16)


Hence,


d 10


dO


= d010

d0O20








2
a2 (al,d0,dl)d1N(a2 2 ) (4.3.17)
d 1+d 0l

Since the conditional distribution of a2 depends on &1 we

have that they are dependent. To show (4.3.16) we need the

distribution of (gl0,g20'g21) conditional on (d0,d1,d2).

Referring back to (2.5.21) we have

20 = (g10'920)' (d0,dl)N 2(P0'V0) (4.3.18)

where = (Oaela2)' (4.3.19)

and
2 2
do d
V0 = (4.3.20)
2 2 2
o a 2
2 d0 d0 2 d0


and

g21 (d0,dl)N(a2'- ) (4.3.21)

independent of (gl0,g20). Now the distribution of g20

conditional on (gl0,d0,dl) is easily shown to be
2
20l (10,d,dOdl)N(ae2g10' ) (4.3.22)

Since g21 is independent of g10, conditioning on gl0 does

not affect the distribution of g21' that is,

2
921 (gl,'d0,dl)N(a2' )

We note, by equation (4.3.15), that &2 conditional on (gl0'

d0,d1) is simply a linear combination of g20 and g21. Since

they are normally distributed (conditionally) so will 82









(conditionally). All we need do is calculate the mean and

variance to find the conditional distribution of 82

(dg 10g 20+d121)
P (21 (gl00''ddl)- M (d0g10920+d 2I I (gl0'd0'dl)


a2(d0g10+dl)
2
dogl2+d!
d0O 101

= a2 (4.3.23)

Var(&2) (g10,d0,d1) = Var{ (d0g10 20+d1g21)
2 I (g10'd0,'d l)
d0g10 1

recalling that g20 and g21 are independent we have,


dg2 0 Varg20 (g10,ddl))+d2Var{(g21) 1(g10,d0dl)}
(dg 2+d 2

2 2
22 2 o 2 a
d0O 0 + d
0 1
(d0g0+dl) 2


2
02 (4.3.24)
(do010+di)

Hence
2
21 (g10,dOdl)N(2' 2 2
(d0g10+dI)

as was to be shown.

Furthermore it can be shown that &1 and 62 do not have
a bivariate t-distribution. To see this we find

f(821i. = 0). We have that








2
2 (al o,do,dl) N(a2'1
SGal+al
0 11


and hence


2
21 (1=0,ddl )^N(a2,- ) (4.3.25)
Since the distribution does not depend on d and since
Since the distribution does not depend on dO and since


dO and d1 are independent we have

2
21(1=0,dl1 )N(a2'd- )

and from (2.5.5)

2o 22
dl~vo Xv_1(0) ,


so that



f(&2,d lal=0) =


d
-- (2-0 )2
1 2
2o2 2- 2

/2i (s-)
d)
V2


dl
d(v-1)- 202
_1 e


2 (v-1)
(202)



,<& a2< .


r ((v-1))



(4.3.28)


dl>0

(v-1)>0


Integrating over dl, we have


d
--- [+ (a -a) 2]
dl e d(dl)
y2 (2o2r ) vF ((N-I))


SF(v) [i+(82-a2) 2]-
Fr()r((v-1))


:-o v>0 .


Hence we find &2 conditional on = 0 has a t-distribution


(4.3.26)



(4.3.27)


(4.3.29)


f( 21l=0) -








with (v-1) degrees of freedom, but we know &2 has the

t-distribution with v degrees of freedom, this will show a

contradiction and 8 and a2 cannot have a bivariate t-

distribution.

Now suppose tl and t2 have a bivariate t-distribution

with v degrees of freedom. Their joint density is given by

r( (v+2)) lZ -

f(tl't) t +2) - r (v) 11 -1 +V1 2) i
t2- 2 2 t2- 2 1=l,2
v>0

(4.3.30)

where i1 and v2 are the expected values of tl and t2

respectively. Also Z is the (2x2) variance-covariance matrix

of (tl't2). Relating this to &1 and 82 we would replace

(,1'12) by (al~,2) and Z would be a diagonal matrix since
we have shown the covariance of 1 and 82 to be zero. With-

out any loss of generality we may take p 1=2=0 and Z=I. The

marginal density of tl is


f(t) = (-+l) :-tl
(1) (v)[+t2 (v+I) 1 r1(v)[I v>O

Hence the conditional density of t2 given ti is

f(t ,t2)
2'tl) f(tl)


r((v+2)) [1+t2 1(v+1)
I2.2 :(v+2) : -m r( (v+ )) [ 2 ] (v+2) 2
rC,5(v+l)) [1+t1+I 2 v2!0*









In particular, suppose t,=0, then


fF( (u+2))
f(t2 |t1=) = 2 (+2) : - T2 F((v+l))[1+t2 V20 .


That is, t2 conditional on tl = 0 has the Student t-distribution

with (v+l) degrees of freedom. Hence we see that if (&B,,2)

are bivariate t-distributed, then since &2 conditional on

61 = 0 has the Student t-distribution with (v-1) degrees of

freedom it must be that &2 has the Student t-distribution with

(v-2) degrees of freedom, but this is a contradiction since

82 has the Student t-distribution with v degrees of freedom.

Hence (01,22) do not have a bivariate t-distribution.

Hence we see that the maximum likelihood estimators are

not independently distributed and their joint distribution

is not multivariate t. This of course is a drawback in using

the maximum likelihood estimators and accentuates the benefits

of using the starred estimator, which are independent and have

the t-distribution.

It is easily seen that the unbiased estimators are

consistent since the variance tends to zero as the sample

size increases without bound. To find the efficiency of the

maximum likelihood estimators we compare their variance to

the minimum variance bounds given in equations (3.2.37) through

(3.2.41). We find that

Eff ( )=(v)-2)v-1 : Isjm (4.3.34)

Eff (v- )1) 2) = (v-l) 1 (4.3.35)

and Eff([m(v-l)-2](mv) 2)=[(m+l) (m(v-1)-4)] [m(m(v-l)+v-2]-1

(4.3.36)









It is obvious that the estimators are asymptotically efficient.  Unlike the starred estimators, the efficiency of the maximum likelihood estimators does not depend on the unknown parameters; but the distribution of the maximum likelihood estimators {α̂ⱼ: 1≤j≤m} does depend on the unknown parameters, so that a test of hypothesis about αⱼ depends on knowing the values of α₁, α₂, ..., α_{j-1}.  This clearly shows the trade-off between the starred estimators and the maximum likelihood estimators.  While the starred estimators are inefficient, tests of hypotheses are performed with no difficulty; conversely, the maximum likelihood estimators are efficient (asymptotically), but tests of hypotheses are complicated since their distributions depend on several unknown parameters.  Moreover, the dependence between the α̂'s also causes complications in making tests of hypotheses concerning two or more of the parameters, since the joint distribution may be very complex.

















Chapter V

A TEST OF THE ADEQUACY OF THE MODEL


5.1 Introduction

Throughout we have assumed that the process is adequately described by the first order autoregressive model.  In this chapter we propose a method of testing the validity of this assumption.  Due to the assumption of the first order autoregressive process, a class of dispersion matrices arose which we identified by V.  Since this class of dispersion matrices is a consequence of the model, a test to validate the model is equivalent to a test of H₀: V∈V against H₁: V is an arbitrary positive definite matrix.

In order to arrive at a test statistic for testing this hypothesis we recall that if V∈V then V = AUA', where A and U were defined in equations (2.2.4) and (2.2.7).  In particular, U was given as the (m+1)×(m+1) diagonal matrix

U = diag(σ²β², σ², σ², ..., σ²).   (5.1.1)

We also showed that

dⱼ ~ σ²χ²_{v-j}(0)  :  1 ≤ j ≤ m,   (5.1.2)

and with v large compared to m each of the dⱼ's should be nearly the same.  Ignoring the first row and column of U, we have that the remaining diagonal elements of U are all σ², and {dⱼ: 1≤j≤m} are independent estimators of this quantity.  If








H₀ is true then all of the dⱼ's should be equal.  Another way of putting this is that the arithmetic mean of d₁, d₂, ..., d_m is equal to the geometric mean, that is,

λ₁ = ∏_{j=1}^m dⱼ / (m⁻¹ Σ_{j=1}^m dⱼ)^m ≈ 1,  (V∈V).   (5.1.3)

If H₀ is false then the dⱼ's will not be equal, the arithmetic mean will be larger than the geometric mean, and λ₁ will be less than one.  Hence we see that we reject H₀ for small values of λ₁.  The asymptotic distribution of λ₁ will be investigated in section 2.  This test has an interesting geometrical interpretation.  If we consider the {dⱼ: 1≤j≤m} to be the squared lengths of the principal axes of an m-dimensional ellipsoid, the above hypothesis specifies that these are all equal, that is, that the ellipsoid is a sphere.  Hence this test is the sphericity test on the {dⱼ: 1≤j≤m}.
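As a computational note, the statistic λ₁ of (5.1.3) is easily evaluated once d₁, ..., d_m are available.  The following is a minimal sketch in Python; the numeric values shown are hypothetical and serve only as an illustration.

```python
import numpy as np

def lambda1(d):
    """Sphericity statistic of (5.1.3): the product of the d_j divided by
    the m-th power of their arithmetic mean.  lambda1 <= 1, with equality
    only when all d_j coincide, so small values speak against H0: V in V."""
    d = np.asarray(d, dtype=float)
    # work in logs to avoid overflow/underflow in the product
    return float(np.exp(np.sum(np.log(d)) - d.size * np.log(np.mean(d))))

# Hypothetical d_1, ..., d_4 for illustration only.
print(lambda1([1.2, 0.9, 1.1, 1.0]))   # close to 1: little evidence against H0
```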

Besides the sphericity test we consider another test, independent of λ₁, based on {g_ij: 0≤j<i≤m}.  The {g_ij: 0≤j<i≤m} carry information about the process since they are used as estimators of {αⱼ: 1≤j≤m} and

hence of V. In section 3 we shall investigate the distribution









of a function of these statistics under the hypothesis that

V∈V.  In section 4 we discuss combining the two tests given

in sections 2 and 3 and the asymptotic equivalence of the

combined tests as compared to the likelihood ratio.


5.2 An Approximation to the Distribution of -ρ₀ log λ₁

Referring to equation (5.1.3) we see that λ₁ may be written as the product of u₁, u₂, ..., u_m, where

uᵢ = dᵢ / (m⁻¹ Σ_{j=1}^m dⱼ)  :  1 ≤ i ≤ m,   (5.2.1)

that is,

λ₁ = ∏_{i=1}^m uᵢ.   (5.2.2)

Rather than consider the distribution of λ₁ we shall consider the distribution of

η = -ρ log λ₁  :  0 ≤ η < ∞,   (5.2.3)

where ρ is some constant.  The moment generating function of η is

φ_η(θ) = E[e^{θη}] = E[(∏_{i=1}^m uᵢ)^{-θρ}].   (5.2.4)

In order to find this expectation we need to find the joint distribution of (u₁, u₂, ..., u_m).  To find the joint distribution we shall transform from (d₁, d₂, ..., d_m) into (u₁, u₂, ..., u_m, S),








where uᵢ is defined by (5.2.1) and S = m⁻¹ Σ_{j=1}^m dⱼ.  Hence we seek the Jacobian of the transformation from (d₁, d₂, ..., d_m) to (u₁, u₂, ..., u_m, S) defined by

dᵢ = uᵢS  :  1 ≤ i ≤ m,   (5.2.5)

with

Σ_{i=1}^m uᵢ = m.   (5.2.6)

Let [δuᵢ] and [δS] denote small changes in uᵢ and S respectively.  Suppose that the changes [δuᵢ] in uᵢ and [δS] in S bring about a change [δdᵢ] in dᵢ so that (5.2.5) and (5.2.6) are preserved.  That is,

dᵢ + [δdᵢ] = (uᵢ + [δuᵢ])(S + [δS])  :  1 ≤ i ≤ m,   (5.2.7)

and

Σ_{i=1}^m (uᵢ + [δuᵢ]) = m.   (5.2.8)

Expanding the above equations and dropping terms of second order in the [δ·], we find that

dᵢ + [δdᵢ] = uᵢS + [δuᵢ]S + uᵢ[δS]  :  1 ≤ i ≤ m,   (5.2.9)

and

Σ_{i=1}^m uᵢ + Σ_{i=1}^m [δuᵢ] = m.   (5.2.10)

Since dᵢ = uᵢS and Σ_{i=1}^m uᵢ = m we see that

[δdᵢ] = [δuᵢ]S + uᵢ[δS]  :  1 ≤ i ≤ m,   (5.2.11)

and

Σ_{i=1}^m [δuᵢ] = 0.   (5.2.12)

To write the above in vector notation we define the (m×1) column vectors

[δd] = ([δd₁], [δd₂], ..., [δd_m])',   (5.2.13)
d = (d₁, d₂, ..., d_m)',   (5.2.14)
[δu] = ([δu₁], [δu₂], ..., [δu_m])',   (5.2.15)
u = (u₁, u₂, ..., u_m)',   (5.2.16)
and   1_m = (1, 1, ..., 1)'.   (5.2.17)

Equations (5.2.11) and (5.2.12) may now be written

[δd] = [δu]S + u[δS]   (5.2.18)

and

1_m'[δu] = 0.   (5.2.19)

Equations (5.2.11) can be thought of as a singular transformation from {[δd₁], [δd₂], ..., [δd_m]} to {[δu₁], [δu₂], ..., [δu_m], [δS]} made one-to-one through use of equation (5.2.12).  Saw [17] has shown that

J{d→u,S} = J{[δd]→[δu],[δS]},   (5.2.20)

where J{[δd]→[δu],[δS]} is the Jacobian of the transformation defined by (5.2.18), in which u and S are considered fixed.

Choose P to be an orthogonal m×m matrix with its first row proportional to 1_m', and pre-multiply equation (5.2.18) by it, giving

v = P[δd] = P[δu]S + Pu[δS] = yS + Pu[δS],   (5.2.21)

where the first element y₁ of y = P[δu] is zero since 1_m'[δu] = 0.

From equations (5.2.20) and (5.2.21) we have

J{d→u,S} = J{[δd]→[δu],[δS]} = J{[δd]→v} · J{v→y₂,...,y_m,[δS]}.   (5.2.22)

The Jacobian J{[δd]→v} is unity since P is an orthogonal matrix.  The Jacobian J{v→y₂,...,y_m,[δS]} is the modulus of the determinant of the (m×m) matrix K with elements

(K)₁₁ = ∂v₁/∂[δS] = 1,
(K)ⱼⱼ = ∂vⱼ/∂yⱼ = S  :  2 ≤ j ≤ m,
(K)_{kj} = 0  :  elsewhere.   (5.2.23)

Hence K is a diagonal matrix and

J{v→y₂,...,y_m,[δS]} = |K| = S^{m-1},   (5.2.24)

and finally

J{d→u,S} = S^{m-1}.   (5.2.25)

Since the dⱼ's are independent with

dⱼ ~ σ²χ²_{vⱼ}(0)  :  1 ≤ j ≤ m,   (5.2.26)

where vⱼ = v-j : 1 ≤ j ≤ m, then

f(d₁,d₂,...,d_m) = ∏_{j=1}^m dⱼ^{½vⱼ-1} e^{-dⱼ/(2σ²)} / [(2σ²)^{½vⱼ} Γ(½vⱼ)]  :  dⱼ > 0, 1 ≤ j ≤ m.   (5.2.27)

Hence we find the joint distribution of u = (u₁,u₂,...,u_m)' and S is

f(u₁,u₂,...,u_m,S) = [∏_{j=1}^m uⱼ^{½vⱼ-1} / ∏_{j=1}^m Γ(½vⱼ)] · S^{(Σⱼ½vⱼ)-1} e^{-mS/(2σ²)} / (2σ²)^{Σⱼ½vⱼ}

   :  0 ≤ uⱼ ≤ m, Σ_{j=1}^m uⱼ = m ; S > 0.   (5.2.28)

Hence we find u and S are independently distributed with

S ~ (σ²/m) χ²_{Σⱼvⱼ}(0),   (5.2.29)

and u is distributed as mZ where Z has the Dirichlet distribution with (m×1) dimensional parameter vector ½(v₁, v₂, ..., v_m)' = ½(v-1, v-2, ..., v-m)'.

If (y₁, y₂, ..., y_n) has the Dirichlet distribution with parameters (a₁, a₂, ..., a_n), then the moments about the origin are given by

E[y₁^{r₁} y₂^{r₂} ⋯ y_n^{r_n}] = {∏_{j=1}^n Γ(aⱼ+rⱼ)} Γ(Σ_{j=1}^n aⱼ) / [Γ(Σ_{j=1}^n (aⱼ+rⱼ)) {∏_{j=1}^n Γ(aⱼ)}].   (5.2.30)

Hence we find the moment generating function of η is given by

φ_η(θ) = E[e^{θη}] = E[(∏_{i=1}^m uᵢ)^{-θρ}],

and letting uᵢ = mZᵢ gives

φ_η(θ) = E[∏_{i=1}^m (mZᵢ)^{-θρ}] = m^{-mθρ} E[∏_{i=1}^m Zᵢ^{-θρ}],

and since the Zᵢ : 1 ≤ i ≤ m are Dirichlet we have from (5.2.30) that

φ_η(θ) = m^{-mθρ} {∏_{j=1}^m Γ(½vⱼ - θρ)} Γ(Σ_{j=1}^m ½vⱼ) / [Γ(Σ_{j=1}^m (½vⱼ - θρ)) {∏_{j=1}^m Γ(½vⱼ)}].   (5.2.31)

Since the moments of η are functions of gamma functions we can apply Box's [4] method to obtain an approximation to the distribution of η.  A good discussion of Box's method is also contained in Anderson [1].

Using equation (5.2.31) the cumulant generating function for η is

Ψ_η(θ) = log φ_η(θ)
       = k - mθρ log m + Σ_{j=1}^m log Γ(½(v-j) - θρ) - log Γ(½mv - ¼m(m+1) - mθρ),   (5.2.32)

where k has a value independent of θ.  Rewriting this as

Ψ_η(θ) = k - mρθ log m + Σ_{j=1}^m log Γ(αⱼ + ½ρ(1-2θ)) - log Γ(β + ½mρ(1-2θ)),   (5.2.33)

where

αⱼ = ½(v-ρ-j)  :  1 ≤ j ≤ m,
and   β = ½[mv - mρ - ½m(m+1)].   (5.2.34)

We use the expansion formula

log Γ(x+h) = ½ log 2π + (x+h-½) log x - x - Σ_{r=1}^∞ (-1)^r B_{r+1}(h) / [r(r+1)x^r],   (5.2.35)

where B_s(h) is the s-th Bernoulli polynomial defined by

T e^{hT} / (e^T - 1) = Σ_{s=0}^∞ B_s(h) T^s / s!;   (5.2.36)

for example,

B₁(h) = h - ½ ;  B₂(h) = h² - h + 1/6 ;  B₃(h) = h³ - (3/2)h² + ½h.

Using the expansion formula (5.2.35) on (5.2.33) we find that

Ψ_η(θ) = k + ½(m-1)(log 2π - log ½ρ) - (β - ½ + ½mρ) log m - ½(m-1) log(1-2θ) + Σ_{r=1}^∞ ω_r (1-2θ)^{-r},   (5.2.37)

with

ω_r = (-1)^r [m^{-r} B_{r+1}(β) - Σ_{j=1}^m B_{r+1}(αⱼ)] / [r(r+1)(½ρ)^r].   (5.2.38)

By virtue of the fact that Ψ_η(θ=0) = 0 we must have

k + ½(m-1)(log 2π - log ½ρ) - (β - ½ + ½mρ) log m = -Σ_{r=1}^∞ ω_r,   (5.2.39)

so that we may write

Ψ_η(θ) = -½(m-1) log(1-2θ) + Σ_{r=1}^∞ ω_r [(1-2θ)^{-r} - 1].   (5.2.40)








If τ has the chi-square distribution on e degrees of freedom then its cumulant generating function is

Ψ_τ(θ) = -½e log(1-2θ).   (5.2.41)

We see that equation (5.2.40) has the same form, with e = (m-1) degrees of freedom and an additional sum which may be called the remainder.  This remainder may be reduced by choosing ρ = ρ₀ so that ω₁ = 0, and the approximation is improved.  For ω₁ = 0 we must have

B₂(β) = m Σ_{j=1}^m B₂(αⱼ),   (5.2.42)

or

β² - β + 1/6 = m Σ_{j=1}^m (αⱼ² - αⱼ + 1/6).   (5.2.43)

Recall that

β = ½m(v-ρ) - ¼m(m+1)

and

αⱼ = ½(v-ρ) - ½j  :  1 ≤ j ≤ m;

letting δ = ½(v-ρ), then

β = mδ - ¼m(m+1)   (5.2.44)

and

αⱼ = δ - ½j  :  1 ≤ j ≤ m.   (5.2.45)

Substituting (5.2.44) and (5.2.45) into equation (5.2.43) gives

[mδ - ¼m(m+1)]² - mδ + ¼m(m+1) + 1/6 = m Σ_{j=1}^m [(δ - ½j)² - δ + ½j + 1/6].   (5.2.46)

Expanding the left hand side of equation (5.2.46) one has

m²δ² - ½m²(m+1)δ + (1/16)m²(m+1)² - mδ + ¼m(m+1) + 1/6,   (5.2.47)

and expanding and summing the right hand side of equation (5.2.46) one has

m²δ² - ½m²(m+1)δ + (1/24)m²(m+1)(2m+1) - m²δ + ¼m²(m+1) + m²/6.   (5.2.48)

Collecting like terms we find

δ = (m+1)(m²+12m+8) / (48m),   (5.2.49)

hence

ρ₀ = v - (m+1)(m²+12m+8) / (24m).   (5.2.50)

We find then that

φ_η(θ)|_{ρ=ρ₀} = (1-2θ)^{-½(m-1)} exp{Σ_{r=2}^∞ ω_r [(1-2θ)^{-r} - 1]},   (5.2.51)

where

ω₂|_{ρ=ρ₀} = (m+1)[3m⁶ - 36m⁵ - 583m⁴ - 336m³ + 160m² - 192] / (6912 ρ₀² m²).   (5.2.52)

Thus the cumulative distribution function of η = -ρ₀ log λ₁ is found from

Pr{-ρ₀ log λ₁ ≤ x} = Pr{χ²_{m-1} ≤ x} + ω₂ (Pr{χ²_{m+3} ≤ x} - Pr{χ²_{m-1} ≤ x}) + R'(ρ₀⁻³),   (5.2.53)

with R'(ρ₀⁻³) a remainder involving terms in ρ₀⁻³, ρ₀⁻⁴, ... .

Asymptotically we have that the distribution of

-ρ₀ log λ₁ = -ρ₀ Σ_{i=1}^m log [dᵢ / (m⁻¹ Σ_{j=1}^m dⱼ)]   (5.2.54)

tends to that of a chi-square variate with (m-1) degrees of freedom.
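For reference, the leading term of (5.2.53) can be evaluated directly once d₁, ..., d_m and v are known.  The sketch below (Python, with scipy assumed available) computes ρ₀ from (5.2.50) as reconstructed above and returns η = -ρ₀ log λ₁ together with its approximate p-value from the χ²_{m-1} reference distribution; the ω₂ correction term of (5.2.53) is omitted.

```python
import numpy as np
from scipy.stats import chi2

def sphericity_pvalue(d, v):
    """Leading-term approximation to the distribution of eta = -rho0*log(lambda_1).

    d : the m values d_1, ..., d_m (each distributed as sigma^2 * chi^2_{v-j}).
    v : the degrees-of-freedom parameter of the process.
    Returns (eta, approximate p-value) from the chi-square reference with
    (m-1) degrees of freedom."""
    d = np.asarray(d, dtype=float)
    m = d.size
    rho0 = v - (m + 1) * (m**2 + 12*m + 8) / (24.0 * m)     # equation (5.2.50)
    log_lambda1 = np.sum(np.log(d)) - m * np.log(np.mean(d))
    eta = -rho0 * log_lambda1
    return eta, chi2.sf(eta, df=m - 1)
```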









5.3 The Distribution of T, a Function of {g_ij : 0≤j<i≤m}

In section 5 of Chapter II we derived the distribution of {g_ij : 0≤j<i≤m} conditional on D when V is arbitrary and when V∈V.  In particular, when V∈V and m=4 we find from equation (2.5.15)

f(g₁₀,g₂₀,g₃₀,g₄₀,g₂₁,g₃₁,g₄₁,g₃₂,g₄₂,g₄₃ | d₀,d₁,d₂,d₃) =

  (2πσ²d₀⁻¹)⁻² exp{-(d₀/(2σ²))[(g₁₀-α₁)² + (g₂₀-α₂g₁₀)² + (g₃₀-α₃g₂₀)² + (g₄₀-α₄g₃₀)²]}

  × (2πσ²d₁⁻¹)^{-3/2} exp{-(d₁/(2σ²))[(g₂₁-α₂)² + (g₃₁-α₃g₂₁)² + (g₄₁-α₄g₃₁)²]}

  × (2πσ²d₂⁻¹)⁻¹ exp{-(d₂/(2σ²))[(g₃₂-α₃)² + (g₄₂-α₄g₃₂)²]}

  × (2πσ²d₃⁻¹)^{-1/2} exp{-(d₃/(2σ²))(g₄₃-α₄)²}

  :  -∞ < g_ij < ∞, 0 ≤ j < i ≤ 4.   (5.3.1)


If in (5.3.1) we replace (α₁,α₂,α₃,α₄) by their estimators (g₁₀,g₂₁,g₃₂,g₄₃) we obtain the statistics {(g₂₀-g₂₁g₁₀), (g₃₀-g₃₂g₂₀), (g₄₀-g₄₃g₃₀), (g₃₁-g₃₂g₂₁), (g₄₁-g₄₃g₃₁), (g₄₂-g₄₃g₃₂)}.  These statistics indicate a departure from the model.  We shall investigate the distribution of a function of these statistics.  No compact expressions have been found for general m, so we present here the case for m=4.  The case for general m follows directly from the case m=4.  In what follows we shall use conditional distributions and for convenience shall let ℋ denote a set of fixed variables.

Consider first the statistics {(g₄₀-g₄₃g₃₀), (g₄₁-g₄₃g₃₁), (g₄₂-g₄₃g₃₂)}.  We shall determine their joint distribution conditional on ℋ = {d₀,d₁,d₂,d₃,d₄,g₃₀,g₃₁,g₃₂}.  We have from (5.3.1) that g₄₀, g₄₁, g₄₂, and g₄₃ are normally distributed conditional on ℋ, so we need to find the moments of {(g₄₀-g₄₃g₃₀), (g₄₁-g₄₃g₃₁), (g₄₂-g₄₃g₃₂)} conditional on ℋ to determine their joint distribution.

We have that

E[(g₄₀-g₄₃g₃₀)|ℋ] = E{E[(g₄₀-g₄₃g₃₀)]|ℋ},   (5.3.2)

where the inner expectation is with respect to g₄₀ and the outer expectation is with respect to g₄₃.  From (5.3.1) we obtain E(g₄₀|ℋ) by inspection to be α₄g₃₀; hence

E[(g₄₀-g₄₃g₃₀)|ℋ] = E[(α₄-g₄₃)g₃₀|ℋ]   (5.3.3)
                  = 0,   (5.3.4)

since by inspection of (5.3.1), E(g₄₃|ℋ) = α₄.  In the same way

E[(g₄₁-g₄₃g₃₁)|ℋ] = E{E[(g₄₁-g₄₃g₃₁)]|ℋ} = E[(α₄-g₄₃)g₃₁|ℋ] = 0,   (5.3.5)

and

E[(g₄₂-g₄₃g₃₂)|ℋ] = E{E[(g₄₂-g₄₃g₃₂)]|ℋ} = E[(α₄-g₄₃)g₃₂|ℋ] = 0.   (5.3.6)

The second moments are handled identically and we find

E[(g₄₀-g₄₃g₃₀)²|ℋ] = E{E[(g₄₀² - 2g₄₀g₄₃g₃₀ + g₄₃²g₃₀²)]|ℋ}
 = E[σ²/d₀ + α₄²g₃₀² - 2α₄g₄₃g₃₀² + g₄₃²g₃₀² | ℋ]
 = σ²/d₀ + α₄²g₃₀² - 2α₄²g₃₀² + (σ²/d₃ + α₄²)g₃₀²
 = σ²/d₀ + (σ²/d₃)g₃₀²,   (5.3.7)

E[(g₄₁-g₄₃g₃₁)²|ℋ] = E{E[(g₄₁² - 2g₄₁g₄₃g₃₁ + g₄₃²g₃₁²)]|ℋ}
 = E[σ²/d₁ + α₄²g₃₁² - 2α₄g₄₃g₃₁² + g₄₃²g₃₁² | ℋ]
 = σ²/d₁ + (σ²/d₃)g₃₁²,   (5.3.8)

and

E[(g₄₂-g₄₃g₃₂)²|ℋ] = E{E[(g₄₂² - 2g₄₂g₄₃g₃₂ + g₄₃²g₃₂²)]|ℋ}
 = E[σ²/d₂ + α₄²g₃₂² - 2α₄g₄₃g₃₂² + g₄₃²g₃₂² | ℋ]
 = σ²/d₂ + (σ²/d₃)g₃₂².   (5.3.9)

The cross product terms are handled similarly and we have that

E[(g₄₀-g₄₃g₃₀)(g₄₁-g₄₃g₃₁)|ℋ] = E{E[(g₄₀g₄₁ - g₄₀g₄₃g₃₁ - g₄₁g₄₃g₃₀ + g₄₃²g₃₀g₃₁)]|ℋ},   (5.3.10)








where the inner, middle, and outer expectations are with respect to g₄₀, g₄₁, and g₄₃, respectively.  Continuing we find

E[(g₄₀-g₄₃g₃₀)(g₄₁-g₄₃g₃₁)|ℋ] = E{E[α₄g₃₀g₄₁ - α₄g₃₀g₄₃g₃₁ - g₄₁g₄₃g₃₀ + g₄₃²g₃₀g₃₁]|ℋ}
 = E[α₄²g₃₀g₃₁ - α₄g₄₃g₃₀g₃₁ - α₄g₄₃g₃₀g₃₁ + g₄₃²g₃₀g₃₁ | ℋ]
 = α₄²g₃₀g₃₁ - 2α₄²g₃₀g₃₁ + (σ²/d₃ + α₄²)g₃₀g₃₁
 = (σ²/d₃)g₃₀g₃₁,   (5.3.11)

E[(g₄₀-g₄₃g₃₀)(g₄₂-g₄₃g₃₂)|ℋ] = E{E[(g₄₀g₄₂ - g₄₀g₄₃g₃₂ - g₄₂g₄₃g₃₀ + g₄₃²g₃₀g₃₂)]|ℋ}
 = E[α₄²g₃₀g₃₂ - α₄g₄₃g₃₀g₃₂ - α₄g₄₃g₃₀g₃₂ + g₄₃²g₃₀g₃₂ | ℋ]
 = α₄²g₃₀g₃₂ - 2α₄²g₃₀g₃₂ + (σ²/d₃ + α₄²)g₃₀g₃₂
 = (σ²/d₃)g₃₀g₃₂,   (5.3.12)

and

E[(g₄₁-g₄₃g₃₁)(g₄₂-g₄₃g₃₂)|ℋ] = E{E[(g₄₁g₄₂ - g₄₁g₄₃g₃₂ - g₄₂g₄₃g₃₁ + g₄₃²g₃₁g₃₂)]|ℋ}
 = E[α₄²g₃₁g₃₂ - α₄g₄₃g₃₁g₃₂ - α₄g₄₃g₃₁g₃₂ + g₄₃²g₃₁g₃₂ | ℋ]
 = α₄²g₃₁g₃₂ - 2α₄²g₃₁g₃₂ + (σ²/d₃ + α₄²)g₃₁g₃₂
 = (σ²/d₃)g₃₁g₃₂.   (5.3.13)


Hence we find

(g₄₀-g₄₃g₃₀, g₄₁-g₄₃g₃₁, g₄₂-g₄₃g₃₂)' | ℋ ~ N₃(μ, σ²V₃),   (5.3.14)

where

μ = (0, 0, 0)'   (5.3.15)

and

V₃ = [ 1/d₀ + g₃₀²/d₃     g₃₀g₃₁/d₃          g₃₀g₃₂/d₃
       g₃₀g₃₁/d₃          1/d₁ + g₃₁²/d₃     g₃₁g₃₂/d₃
       g₃₀g₃₂/d₃          g₃₁g₃₂/d₃          1/d₂ + g₃₂²/d₃ ].   (5.3.16)

It is well known that if x:(n×1) ~ N_n(0,Σ) then x'Σ⁻¹x ~ χ²_n(0).  Therefore it follows that

r₄ = (g₄₀-g₄₃g₃₀, g₄₁-g₄₃g₃₁, g₄₂-g₄₃g₃₂) V₃⁻¹ (g₄₀-g₄₃g₃₀, g₄₁-g₄₃g₃₁, g₄₂-g₄₃g₃₂)' | ℋ ~ σ²χ²₃(0),   (5.3.17)

where the subscript on r equals the first subscript on g.  Since the distribution above is functionally independent of the variables in ℋ we have the unconditional distribution also, that is,

r₄ ~ σ²χ²₃(0).   (5.3.18)

Putting ℋ = {d₀,d₁,d₂,d₃,d₄,g₂₀,g₂₁} we now find the joint distribution of {(g₃₀-g₃₂g₂₀), (g₃₁-g₃₂g₂₁)}.  Proceeding as before,

E[(g₃₀-g₃₂g₂₀)|ℋ] = E{E[(g₃₀-g₃₂g₂₀)]|ℋ} = E[(α₃-g₃₂)g₂₀|ℋ] = 0,   (5.3.19)

and

E[(g₃₁-g₃₂g₂₁)|ℋ] = E{E[(g₃₁-g₃₂g₂₁)]|ℋ} = E[(α₃-g₃₂)g₂₁|ℋ] = 0.   (5.3.20)









The second moments are

E[(g₃₀-g₃₂g₂₀)²|ℋ] = E{E[(g₃₀² - 2g₃₀g₃₂g₂₀ + g₃₂²g₂₀²)]|ℋ}
 = E[σ²/d₀ + α₃²g₂₀² - 2α₃g₃₂g₂₀² + g₃₂²g₂₀² | ℋ]
 = σ²/d₀ + α₃²g₂₀² - 2α₃²g₂₀² + (σ²/d₂ + α₃²)g₂₀²
 = σ²/d₀ + (σ²/d₂)g₂₀²,   (5.3.21)

and

E[(g₃₁-g₃₂g₂₁)²|ℋ] = E{E[(g₃₁² - 2g₃₁g₃₂g₂₁ + g₃₂²g₂₁²)]|ℋ}
 = E[σ²/d₁ + α₃²g₂₁² - 2α₃g₃₂g₂₁² + g₃₂²g₂₁² | ℋ]
 = σ²/d₁ + (σ²/d₂)g₂₁².   (5.3.22)

The cross product term is

E[(g₃₀-g₃₂g₂₀)(g₃₁-g₃₂g₂₁)|ℋ] = E{E[(g₃₀g₃₁ - g₃₀g₃₂g₂₁ - g₃₁g₃₂g₂₀ + g₃₂²g₂₀g₂₁)]|ℋ}
 = E[α₃²g₂₀g₂₁ - 2α₃g₃₂g₂₀g₂₁ + g₃₂²g₂₀g₂₁ | ℋ]
 = α₃²g₂₀g₂₁ - 2α₃²g₂₀g₂₁ + (σ²/d₂ + α₃²)g₂₀g₂₁
 = (σ²/d₂)g₂₀g₂₁.   (5.3.23)

Hence we find that

(g₃₀-g₃₂g₂₀, g₃₁-g₃₂g₂₁)' | ℋ ~ N₂(μ, σ²V₂),   (5.3.24)

where

μ = (0, 0)'   (5.3.25)

and

V₂ = [ 1/d₀ + g₂₀²/d₂     g₂₀g₂₁/d₂
       g₂₀g₂₁/d₂          1/d₁ + g₂₁²/d₂ ].   (5.3.26)

Now form

r₃ = (g₃₀-g₃₂g₂₀, g₃₁-g₃₂g₂₁) V₂⁻¹ (g₃₀-g₃₂g₂₀, g₃₁-g₃₂g₂₁)',   (5.3.27)

so that

r₃ | ℋ ~ σ²χ²₂(0).   (5.3.28)

By the same arguments as before we have that the distribution of r₃ is functionally independent of the elements of ℋ and hence

r₃ ~ σ²χ²₂(0)   (5.3.29)

unconditionally.  Moreover r₄ is independent of r₃ since the distribution of r₄ is functionally independent of the elements of r₃.

Finally we consider (g₂₀-g₂₁g₁₀) and put ℋ = {d₀,d₁,d₂,d₃,d₄,g₁₀}.









We have

E[(g₂₀-g₂₁g₁₀)|ℋ] = E{E[(g₂₀-g₂₁g₁₀)]|ℋ} = E[(α₂-g₂₁)g₁₀|ℋ] = 0,   (5.3.30)

and

E[(g₂₀-g₂₁g₁₀)²|ℋ] = E{E[(g₂₀² - 2g₂₀g₂₁g₁₀ + g₂₁²g₁₀²)]|ℋ}
 = E[σ²/d₀ + α₂²g₁₀² - 2α₂g₂₁g₁₀² + g₂₁²g₁₀² | ℋ]
 = σ²/d₀ + α₂²g₁₀² - 2α₂²g₁₀² + (σ²/d₁ + α₂²)g₁₀²
 = σ²/d₀ + (σ²/d₁)g₁₀².   (5.3.31)

Hence

(g₂₀-g₂₁g₁₀) | ℋ ~ N(μ, σ²V₁),   (5.3.32)

where

μ = 0   (5.3.33)

and

V₁ = 1/d₀ + g₁₀²/d₁.   (5.3.34)

With

r₂ = (g₂₀-g₂₁g₁₀)² / (1/d₀ + g₁₀²/d₁),   (5.3.35)

then

r₂ | ℋ ~ σ²χ²₁(0).   (5.3.36)

Since the distribution of r₂ is functionally independent of the elements of ℋ we have, unconditionally, that r₂ is σ² times a chi-square variate with one degree of freedom.  Also r₂ is independent of r₃ and r₄ since their distributions are functionally independent of the elements of r₂.  Since the three statistics are independent we may add them to get

R = (r₂ + r₃ + r₄) ~ σ²χ²₆(0).   (5.3.37)

We note that the distribution of R depends on the nuisance parameter σ²; to eliminate this parameter we consider

T = (R/6) / σ*² ~ F^{6}_{4v-10}(0).   (5.3.38)

Since R is independent of {d₀,d₁,d₂,d₃,d₄} it is independent of σ*², and T is the ratio of two independent chi-square variates divided by their degrees of freedom; that is, T has the F distribution.

Before extending this result to general m we note that the dispersion matrix V₃ of equation (5.3.16) may be written

V₃ = D(3) + γ₃γ₃',   (5.3.39)

where

D(3) = diag(d₀⁻¹, d₁⁻¹, d₂⁻¹)   (5.3.40)

and

γ₃ = d₃^{-½}(g₃₀, g₃₁, g₃₂)'.   (5.3.41)

Since V₃ may be written in this form its inverse can be obtained from the Binomial Inverse Theorem, found in Press [13], which states

(D(3) + γ₃γ₃')⁻¹ = D(3)⁻¹ - D(3)⁻¹γ₃γ₃'D(3)⁻¹ / (1 + γ₃'D(3)⁻¹γ₃).   (5.3.42)

This is very useful in the actual computation of the statistic T.  It follows that V₂ has the same form and can be written

V₂ = D(2) + γ₂γ₂',   (5.3.43)

where

D(2) = diag(d₀⁻¹, d₁⁻¹)   (5.3.44)

and

γ₂ = d₂^{-½}(g₂₀, g₂₁)'.   (5.3.45)

In general we can form

rⱼ = gⱼ'Vⱼ⁻¹gⱼ  :  2 ≤ j ≤ m,   (5.3.46)

where

gⱼ = [(g_{j0} - g_{j-1,0}g_{j,j-1}), (g_{j1} - g_{j-1,1}g_{j,j-1}), ..., (g_{j,j-2} - g_{j-1,j-2}g_{j,j-1})]'  :  2 ≤ j ≤ m,   (5.3.47)

and

Vⱼ = D(j-1) + γ_{j-1}γ_{j-1}'  :  2 ≤ j ≤ m,   (5.3.48)

with

D(j-1) = diag(d₀⁻¹, d₁⁻¹, ..., d_{j-2}⁻¹)  :  2 ≤ j ≤ m   (5.3.49)

and

γ_{j-1} = d_{j-1}^{-½}(g_{j-1,0}, g_{j-1,1}, ..., g_{j-1,j-2})'  :  2 ≤ j ≤ m.   (5.3.50)

Following the pattern given for m=4, when V∈V,

rⱼ ~ σ²χ²_{j-1}(0)  :  2 ≤ j ≤ m,   (5.3.51)

and they are mutually independent, so that

R = Σ_{j=2}^m rⱼ ~ σ²χ²_{½m(m-1)}(0),  (V∈V),   (5.3.52)

and finally

T = [R/(½m(m-1))] / σ*² ~ F^{½m(m-1)}_{mv-½m(m+1)}(0),  (V∈V).   (5.3.53)

No attempt has been made to find the distribution of T when V∉V, but a computer simulation indicates, as we would expect, that T is stochastically larger when V∉V than when V∈V.
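The Binomial Inverse Theorem (5.3.42) makes the rⱼ of (5.3.46) cheap to evaluate, since Vⱼ⁻¹ never has to be formed explicitly.  The following sketch (Python; the arrays G and d are assumed to hold the Doolittle quantities g_ij and dⱼ of Chapter II, and the function names are illustrative only) computes the rⱼ and their sum R via the Sherman-Morrison form of (5.3.42).

```python
import numpy as np

def r_block(g_row_j, g_row_prev, d):
    """r_j = g_j' V_j^{-1} g_j of (5.3.46) for one row j.

    g_row_j    : (g_{j,0}, ..., g_{j,j-1})        -- length j
    g_row_prev : (g_{j-1,0}, ..., g_{j-1,j-2})    -- length j-1
    d          : (d_0, ..., d_{j-1})              -- length j

    V_j = diag(1/d_0, ..., 1/d_{j-2}) + gamma gamma', with
    gamma = g_row_prev / sqrt(d_{j-1}); its inverse is applied through
    the rank-one (Binomial Inverse / Sherman-Morrison) identity (5.3.42)."""
    gj = np.asarray(g_row_j, float)
    gp = np.asarray(g_row_prev, float)
    d = np.asarray(d, float)
    resid = gj[:-1] - gj[-1] * gp            # e.g. g_40 - g_43*g_30, ...
    gamma = gp / np.sqrt(d[-1])
    A_inv = d[:-1]                           # inverse of diag(1/d_0, ..., 1/d_{j-2})
    u = A_inv * gamma
    x = A_inv * resid - u * (gamma @ (A_inv * resid)) / (1.0 + gamma @ u)
    return float(resid @ x)                  # = resid' V_j^{-1} resid

def R_statistic(G, d):
    """R = r_2 + ... + r_m of (5.3.52); G is a lower-triangular array with
    G[i, k] = g_{ik} for k < i, and d = (d_0, ..., d_m)."""
    m = len(d) - 1
    return sum(r_block(G[j, :j], G[j - 1, :j - 1], d[:j]) for j in range(2, m + 1))
```

T then follows from (5.3.53) once R is divided by ½m(m-1) and by the independent estimate of σ² based on Σⱼ dⱼ.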









5.4 Asymptotic Performance of (-ρ₀ log λ₁) and T

In section 2 we derived the asymptotic distribution of -ρ₀ log λ₁ and showed it to have a chi-square distribution with (m-1) degrees of freedom.  In that section we also showed that the distribution of -ρ₀ log λ₁ is independent of S = m⁻¹ Σ_{j=1}^m dⱼ, or equivalently of Σ_{j=1}^m dⱼ.  In section 3 we showed that R has a chi-square distribution with ½m(m-1) degrees of freedom, independent of {d₀,d₁,...,d_m} and hence independent of -ρ₀ log λ₁ and Σⱼ dⱼ.  Since R is independent of Σⱼ dⱼ it is independent of σ*², and hence we formed T equal to the ratio of R and σ*², divided by the appropriate constants, to form an F distribution with ½m(m-1) degrees of freedom in the numerator and [mv - ½m(m+1)] degrees of freedom in the denominator.  Since both R and σ*² are independent of -ρ₀ log λ₁, then so is T.  Now the distribution of ½m(m-1)T tends to that of a chi-square variate with ½m(m-1) degrees of freedom as v→∞.  Since T and -ρ₀ log λ₁ are independent we have

lim_{v→∞} {-ρ₀ log λ₁ + ½m(m-1)T} ~ χ²_{½(m-1)(m+2)}(0).   (5.4.1)

It has been shown by Wilks [18] that under certain regularity conditions -2 log λ will be asymptotically distributed as a chi-square with ℓ degrees of freedom under the null hypothesis, where λ denotes the likelihood ratio.  The degrees of freedom ℓ may be computed from (ℓ₁ - ℓ₀), where ℓ₁ equals the number of parameters estimated under the alternative hypothesis (H₁) and ℓ₀ equals the number of parameters estimated under the null hypothesis (H₀).  For the problem here we find that under H₁ V is arbitrary and we must estimate all ½(m+1)(m+2) = ℓ₁ different parameters.  Under H₀ there are only (m+2) = ℓ₀ unknown parameters to estimate, and hence

ℓ = ℓ₁ - ℓ₀ = ½(m+2)(m-1).   (5.4.2)

That is, the asymptotic distributions of -2 log λ and (-ρ₀ log λ₁ + ½m(m-1)T) agree under the null hypothesis.  Hence both methods are asymptotically equivalent under the null hypothesis.

We note that since -ρ₀ log λ₁ and T are independent, Fisher's method of combining independent tests may be used in place of (-ρ₀ log λ₁ + ½m(m-1)T).  Fisher's method would be especially appropriate if the sample size is small.
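A sketch of Fisher's combination of the two independent tests follows (Python, with scipy assumed available).  The numerical p-values shown use the glucose-study values -ρ₀ log λ₁ = 4.91 and T = 1.77 of Chapter VI purely as an illustration.

```python
import numpy as np
from scipy.stats import chi2, f

def fisher_combined(p_values):
    """Fisher's method: -2 * sum(log p_i) is chi-square with 2k d.f. under H0."""
    p = np.asarray(p_values, dtype=float)
    stat = -2.0 * np.sum(np.log(p))
    return stat, chi2.sf(stat, df=2 * p.size)

# Illustration with the two independent tests of this chapter,
# using the glucose-study numbers of Chapter VI (m = 4, v = 31).
m, v = 4, 31
p1 = chi2.sf(4.91, df=m - 1)                                   # from -rho0*log(lambda_1)
p2 = f.sf(1.77, dfn=m * (m - 1) // 2, dfd=m * v - m * (m + 1) // 2)   # from T
print(fisher_combined([p1, p2]))
```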
















Chapter VI

COMPUTER SIMULATIONS AND AN APPLICATION


6.1 Introduction

A computer simulation of the generalized autoregressive

process was performed thirty times. Each simulation had

fifty vector observations with each vector observation having

six measures including the initial measure. Specific values
were given to (α₁, α₂, α₃, α₄, α₅), σ², and β²; they were

(0.80, 0.60, 0.50, 0.30, 0.20), 1.00, and 4.00, respectively.

The simulations were made using a computer program

written for the IBM 360 computer. The output from the program

includes

(1) the data used in the analysis

(2) the mean for each time period

(3) the cross product matrix

(4) the G matrix

(5) the diagonal elements of D
(6) the starred estimates of {αᵢ: 1≤i≤m}, σ², and β²

(7) the maximum likelihood estimates of {αᵢ: 1≤i≤m}, σ², and β²

(8) the values of -ρ₀ log λ₁ and T used in testing the adequacy of the model.

The main purpose of the simulations was to see if the









starred estimators would perform well. In keeping with this

we present only the starred and maximum likelihood estimates for {αᵢ: 1≤i≤m}, σ², and β².
An application of the theory was made using data from

a drug study at the University of Florida. This study was

directed by Dr. Arlan L. Rosenbloom. Each patient was

infused with glucose and observations were taken on the

patient's level of calcium prior to infusion and at 90 minute

intervals thereafter for four additional observations.


6.2 Computer Simulation Results

Each of the estimates was tested against its true value

at the .05 level of significance. On the average then we

would expect to reject two out of the thirty estimates by

chance alone. Those that were significantly different from

the actual value are listed with an asterisk. Counting the

number of tests that were accepted as a measure of the estimator's goodness, we find α*₁ gave 28 acceptable estimates out of 30.  Since α*₁ is identical to the maximum likelihood estimator α̂₁, there is no comparison.  α*₂ gave acceptable estimates in all 30 runs while α̂₂ gave 28.  Estimating α₃ = .50, the starred estimators did slightly better, with α*₃ giving 28 acceptable estimates and α̂₃ giving 27.  α*₄ gave acceptable estimates in all runs while α̂₄ gave 29.  The last estimators, α*₅ and α̂₅, both gave 28 acceptable estimates.

We note that whenever the starred estimate was rejected so

was the maximum likelihood estimate, but not conversely.










Tests were also performed on the estimates of σ² and β².  In order to test both σ*² and σ̂², an approximation to the distribution of chi-square given by Wilson and Hilferty [19] was used.  Their result is that (χ²/v)^{1/3} is approximately normally distributed with mean 1 - 2/(9v) and variance 2/(9v).  This result and a discussion are also given in Kendall and Stuart [11].  The results of the tests showed that the starred estimator gave 25 acceptable estimates while the maximum likelihood estimator gave 24.  Again both estimates were rejected on the same runs, with one exception, when the maximum likelihood estimate was too high.  All of the rejections for the starred estimates were caused by underestimating the true value.
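The Wilson-Hilferty screening used above reduces to a single normal deviate.  A minimal sketch follows (Python; how the chi-square value is formed from a particular variance estimate is not repeated here, so the inputs shown are placeholders only).

```python
import numpy as np
from scipy.stats import norm

def wilson_hilferty_z(chi2_value, df):
    """Wilson-Hilferty approximation: (X/df)**(1/3) is treated as normal
    with mean 1 - 2/(9*df) and variance 2/(9*df); returns the standard
    normal deviate."""
    mean = 1.0 - 2.0 / (9.0 * df)
    var = 2.0 / (9.0 * df)
    return ((chi2_value / df) ** (1.0 / 3.0) - mean) / np.sqrt(var)

# Two-sided test at the .05 level: reject when |z| exceeds 1.96.
z = wilson_hilferty_z(chi2_value=55.0, df=40)   # illustrative numbers only
print(z, abs(z) > norm.ppf(0.975))
```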

The starred estimates and maximum likelihood estimates

performed equally well in estimating β².  Both gave

acceptable estimates 26 out of the 30 runs. Of the four in-

correct estimates both were high on three and low on one.

They both gave poor estimates on the same runs.

Overall the starred estimators performed as well or

better than the maximum likelihood estimators. As can be

seen by the means and standard deviations at the bottom of

Tables 1 through 3, both estimates are very close to the true

value. The mean of the maximum likelihood estimates is closer
2 2
to the true value for a2, a,4 and a5, but not for a3' 0 or 8

Also we note that the sample standard deviations are smaller
2
for the maximum likelihood estimates except for 92. None

of the differences seem to be appreciable in any case.
















Table 1

ESTIMATES OF α₁, α₂, AND α₃ FOR

COMPUTER SIMULATED PROCESS


Run        α₁=.80      α₂=.60     α₂=.60     α₃=.50     α₃=.50
Number     α*₁=α̂₁      α*₂        α̂₂         α*₃        α̂₃


0.766
0.770
0.767
0.747
0.643*
0.842
0.723
0.839
0.906
0.902
0.892
0.674
0,799
0.747
0.826
0.748
0.795
0.815
0.861
0.848
0.810
0.722
0.746
0.747
0.826
0.927
0.741
0.652*
0.719
0.706


Mean
Standard Deviation


0.784


0.484
0.691
0.639
0.427
0.576
0.749
0.490
0.488
0.373
0.565
0.428
0.730
0.503
0.754
0.725
0.696
0.591
0.634
0.546
0.506
0.458
0.702
0.861
0.819
0.456
0.695
0.459
0.601
0.596
0.352

0.586


0.503
0.625
0.647
0.475
0.611
0.722
0.650
0.661
0.492
0.514
0.554
0.663
0.528
0.655
0.643
0.592
0.624
0.626
0.659
0.557
S0.532
0.636
0.667
0.571
0.575
0.784*
0.514
0.566
0.683
0.435*


0.599


0.415
0.274
0.623
0.620
0.420
0.423
0.469
0.485
0.837*
0.556
0.851*
0.422
0.389
0.775
0.579
0.277
0.411
0.389
0.650
0.559
0.258
0.375
0.610
0.522
0.663
0.255
0.529
0.607
0.508
0.568

0.511


0.421
0.414
0.564
0.518
0.356
0.553
0.452
0.521
0.719*
0.540
0.691*
0.493
0.333
0.541
0.643
0.381
0.415
0.508
0.573
0.398
0.335
0.368
0.709
0.603
0.632
0.429
0.600
0.642
0.510
0.521

0.513


0.074 0.134 0.079 0.158 0.112
*indicates estimate is significantly different from the true value, at the .05 level of significance.















Table 2

ESTIMATES OF α₄ AND α₅ FOR

COMPUTER SIMULATED PROCESSES


Run        α₄=.30     α₄=.30     α₅=.20     α₅=.20
Number     α*₄        α̂₄         α*₅        α̂₅


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
Mean
Standard Deviation


0.360
0.353
0.241
0.257
0.366
0.195
0.010
0.223
0.133
0.312
0.352
0.063
0.461
0.269
0.431
0.364
0.444
0.261
0.099
0.388
0.070
0.425
0.262
0.180
0.195
0.264
0.018
0.392
0.276
0.168
0.261


0.129


0.292
0.358
0.282
0.256
0.449
0.202
0.073
0.267
0.234
0.351
0.264
0.070*
0.383
0.225
0.295
0.271
0.352
0.147
0.194
0.425
0.168
0.405
0.349
0.284
0.170
0.207
0.102
0.420
0.260
0.261
0.267

0.101


-0.032 0.060
-0.077 0.090
-0.042 -0.013
-0.156* -0.071*
0.334 0.204
0.105 0.109
-0.050 -0.048
0.080 0.076
0.347 0.338
0.268 0.300
0.209 0.212
0.053 0.093
0.119 0.175
0.096 0.102
0.259 0.224
0.225 0.187
-0.095 0.027
0.209 0.196
0.040 0.108
0.214 0.199
0.022 -0.026
0.301 0.307
-0.106* -0.121*
0.236 0.250
0.390 0.363
0.330 0.327
0.135 0.110
0.381 0.294
0.084 0.099
-0.033 0.029
0.128 0.140


0.160 0.129


*indicates estimate is significantly different from the true value, at the .05 level of significance.


















Table 3

ESTIMATES OF σ² AND β² FOR

COMPUTER SIMULATED PROCESSES

Run        σ²=1.00     σ²=1.00    β²=4.00    β²=4.00
Number     σ*²         σ̂²         β*²        β̂²


0.842
0.947
0.805*
0.791*
0.920
1.037
0.993
0.940
1.180
0.818*
0.811*
0.969
1.029
0.925
1.016
1.028
0.963
0.816*
1.011
0.989
1.082
1.010
0.840
1.081
1.141
0.994
0.996
0.894
0.986
0.865


Mean                 0.957
Standard Deviation   0.102


0.826
0.928
0.788*
0.788*
0.899
1.015
0.975
0.925
1.163*
0.811*
0.795*
0.935
1.003
0.926
0.992
0.992
0.929
0.797*
0.981
0.993
1.058
0.988
0.831
1.053
1.103
0.971
0.957
0.861
0.942
0.867


0.936

0.096


4.982
3.490
6.847*
3.915
4.256
5.268
4.070
3.120
2.396*
5.888*
3.164
4.235
3.194
4.096
4.700
3.227
4.122
5.684*
3.191
3.904
3.458
4.296
3.004
3.637
4.413
3.494
4.023
4.51.7
4.209
4.217


4.102

0.944


5.127
3.594
7.055*
3.965
4.394
5.432
4.179
3.278
2.451*
5.991*
3.255
4.426
3.305
4.129
4.853
3.426
4.309
5.876*
3.317
3.922
3.569
4.429
3.061
3.768
4.608
3.608
4.223
4.733
4.444
4.243
4.232

0.971


*indicates estimate is significantly different from the true value, at the .05 level of significance.










6.3 Application

As discussed in section 1, patients were infused with

glucose and measurements were taken on their calcium level,

prior to infusion and four times later at 90-minute periods.

The data are given in Table 4. Inspecting the means at each

period given at the bottom of Table 4 we see that, on the

average, initially the calcium reading was highest and

infusion of glucose caused it to drop continually until the

last time period where there is a mild increase in the level

of calcium. In Table 5, both the starred estimates and the

maximum likelihood estimates are given. Both estimators

gave similar results for all of the parameters with a4 having

the largest value, probably reflecting the increase in the

level of calcium from time period 3 to time period 4.

Table 6 shows the standard deviations and 95% confidence intervals for α*₁, α*₂, α*₃, and α*₄.  The confidence intervals for α*₁, α*₂, and α*₃ contain zero, implying the parameters do not differ significantly from zero.  This could have been guessed by noting the relatively small change in the mean level of calcium from one period to the next.  Since the mean level rose in the last period the parameter α*₄ is large and, as noted by the 95% confidence interval, is significantly different from zero.

In testing the adequacy of the model we found -ρ₀ log λ₁ = 4.91 and T = 1.77.  Since the distribution of -ρ₀ log λ₁ is approximately chi-square with 3 degrees of freedom we compare the calculated value against the tabulated value at the .05 level of significance.  We find χ²₃,.₀₅ = 7.81; since the calculated value is less than this we accept the hypothesis of sphericity.  The distribution of T is F with 6 degrees of freedom in the numerator and 114 degrees of freedom in the denominator.  The upper 5% point of this distribution is F₆,₁₁₄;.₀₅ = 2.18.  Since the tabulated value is greater than the calculated value we accept the adequacy of the model.
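The two critical values quoted above can be reproduced directly; a one-line check in Python (scipy assumed available):

```python
from scipy.stats import chi2, f

# Critical values used for the glucose data (m = 4, v = 31):
print(chi2.ppf(0.95, df=3))          # about 7.81, reference for -rho0*log(lambda_1)
print(f.ppf(0.95, dfn=6, dfd=114))   # about 2.18, reference for T
```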

To see how well the model fits the data we randomly selected patient number 18 and calculated his measurements for the four time periods using his previous readings.  Letting Y*_{jk} denote the predicted value at time j for patient k, we have that

Y*_{jk} = ȳⱼ + x̂_{jk}  :  1 ≤ j ≤ 4 ; 1 ≤ k ≤ 32,   (6.3.1)

where

x̂_{jk} = α*ⱼ x_{j-1,k}  :  1 ≤ j ≤ 4 ; 1 ≤ k ≤ 32,   (6.3.2)

and

x_{jk} = y_{jk} - ȳⱼ  :  0 ≤ j ≤ 4 ; 1 ≤ k ≤ 32.   (6.3.3)

Hence we may write

Y*_{jk} = ȳⱼ - α*ⱼ ȳ_{j-1} + α*ⱼ y_{j-1,k}  :  1 ≤ j ≤ 4 ; 1 ≤ k ≤ 32.   (6.3.4)

Given that patient 18 had an initial reading of 9.9, the prediction for his 90-minute reading is

Y*₁,₁₈ = 9.15 - .137(9.64) + .137(9.9) = 9.20.

Similarly for the rest of the readings we find that

Y*₂,₁₈ = 9.02,
Y*₃,₁₈ = 9.06,
and  Y*₄,₁₈ = 9.29.
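These predictions can be reproduced from Tables 4 and 5 with a few lines of code.  The sketch below (Python) applies (6.3.4) with the period means and the starred estimates; the small differences from the hand-rounded values in the text are rounding only.

```python
import numpy as np

# Period means and starred estimates of alpha_1..alpha_4 from Tables 4 and 5.
means = np.array([9.64, 9.15, 9.11, 8.94, 9.01])
alpha = np.array([0.137, 0.334, 0.305, 0.506])

def predict(readings):
    """One-step-ahead predictions Y*_j = ybar_j + alpha_j (y_{j-1} - ybar_{j-1}),
    each prediction using the patient's observed reading at the previous period."""
    y = np.asarray(readings, float)
    return means[1:] + alpha * (y[:-1] - means[:-1])

# Patient 18: initial 9.9, then 8.9, 9.5, 9.5, 9.8.
print(np.round(predict([9.9, 8.9, 9.5, 9.5, 9.8]), 2))
# roughly [9.19, 9.03, 9.06, 9.29]; the text's 9.20 and 9.02 differ only by rounding
```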
















Table 4

LEVEL OF CALCIUM IN GRAMS PER LITER IN

PATIENTS INFUSED WITH GLUCOSE

Patient Initial 90 180 270 360
Number Period Minutes Minutes Minutes Minutes
1 9.1 9.2 8.9 8.5 8.3
2 10.0 9.2 "9.5 9.2 8.4
3 10.1 9.8 10.1 8.0 9.7
4 10.0 10.1 9.1 9.4 9.5
5 9.7 8.9 9.1 9.1 9.2
6 9.5 8.7 9.1 8.3 8.6
7 9.5 9.4 9.3 9.6 9.3
8 10.1 9.2 9.3 9.1 8.7
9 9.6 8.9 9.3 9.4 8.9
10 9.1 9.3 9.0 9.0 9.0
11 9.6 8.8 8.9 8.8 9.4
12 9.3 9.4 9.3 9.4 9.7
13 10.2 9.5 9.8 9.9 9.8
14 9.2 8.8 9.4 8.9 8.2
15 9.6 9.4 8.9 8.9 9.0
16 10.1 9.0 9.1 9.2 9.1
17 9.4 8.5 8.5 8.6 8.7
18 9.9 8.9 9.5 9.5 9.8
19 10.4 8.9 9.4 8.3 8.1
20 9.0 8.8 8.5 8.5 8.4
21 9.7 9.6 9.4 8.4 8.8
22 10.2 8.1 9.0 8.9 9.4
23 9.2 10.3 9.0 8.7 8.7
24 9.7 8.9 9.1 9.1 9.2
25 9.0 8.4 8.1 8.7 8.7
26 9.4 9.2 9.2 9.2 9.0
27 9.4 8.9 8.8 8.7 9.0
28 10.1 9.8 9.1 8.8 9.0
29 9.8 9.3 9.5 9.3 9.6
30 9.6 9.1 8.4 8.6 8.5
31 9.5 8.8 8.5 8.7 9.2
32 9.4 9.6 9.4 9.4 9.5


Mean       9.64     9.15     9.11     8.94     9.01

















Table 5

ESTIMATES OF THE PARAMETERS FOR THE GLUCOSE STUDY

Type of Estimate        α₁       α₂       α₃       α₄       σ²       β²

Starred Estimate        0.137    0.334    0.305    0.506    0.173    0.843
Maximum Likelihood      0.137    0.383    0.312    0.590    0.174    0.854








































Table 6

STANDARD DEVIATIONS AND 95% CONFIDENCE INTERVALS FOR α*₁, α*₂, α*₃, AND α*₄


Comparing these to the actual measurements of 8.9, 9.5, 9.5, and 9.8, we see that the model gives reasonable predictions.
















BIBLIOGRAPHY


[1] Anderson, T. W. (1958). An Introduction to Multivariate
Statistical Analysis. Wiley, New York .

[ 2] Anderson, T. W. (1971). The Statistical Analysis of
Time Series. Wiley, New York.

[3] Bahadur, R. R. (1960). Stochastic Comparison of Tests.
Ann. Math. Statist., 31, 276-295.

[ 4] Box, G. E. P. (1949). A General Distribution Theory
for a Class of Likelihood Criteria. Biometrika,
36, 317-346.

[ 5] Box, G. E. P. and Jenkins, G. M. (1970). Time Series
Analysis (Forecasting and Control). Holden-Day,
San Francisco.

[6] Cornish, E. A. (1954). The Multivariate t-Distribution
Associated with a Set of Normal Sample Deviates.
Australian Journal of Physics, 7, 531-542.

[ 7] Deemer, W. L. and Olkin, I. (1951). The Jacobians
of Certain Matrix Transformations Useful In
Multivariate Analysis. Biometrika, 38, 345-367.

[ 8] Dunnett, C. W. and Sobel, M. (1954). A Bivariate
Generalization of Students t-Distribution with
Tables for Certain Special Cases. Biometrika,
41, 153-169.

[ 9] Feller, W. (1966). An Introduction to Probability
Theory and Its Application, Vol. 2. Wiley, New York.

[10] Fisz, M. (1967). Probability Theory and Mathematical
Statistics. Wiley, New York.

[11] Kendall, M. G. and Stuart, A. (1967). The Advanced
Theory of Statistics, Vol. 2. Hafner, New York.

[12] Littell, R. C. and Folks, J. L. (1971). Asymptotic
Optimality of Fisher's Method of Combining
Independent Tests. Journal of the American Statis-
tical Association, 66, 802-806.









[13] Press, S. J. (1972). Applied Multivariate Analysis.
Holt, New York.

[14] Rao, C. R. (1952). Advanced Statistical Methods in
Biometric Research. Wiley, New York.

[15] Saw, J. G. (1964). Likelihood Ratio Tests of Hypothesis
on Multivariate Populations, Volume I: Distribution
Theory. Virginia Polytechnical Institute, Blacksburg,
Virginia.

[16] Saw, J. G. (1964). Likelihood Ratio Tests of Hypothesis
on Multivariate Populations, Volume II: Tests of
Hypothesis. Virginia Polytechnical Institute,
Blacksburg, Virginia.

[17] Saw, J. G. (1973). Jacobians of Singular Transformations
with Applications to Statistical Distribution Theory.
Communications In Statistics, 1, 81-91.

[18] Wilks, S. S. (1938). The Large Sample Distribution of
the Likelihood Ratio for Testing Composite Hypothesis
Ann. Math. Statist., 9, 60.

[19] Wilson, E. B. and Hilferty, M. M. (1931). The Distribution
of Chi-square. Proc. Nat. Acad. Sci., U.S.A., 17, 684.
















BIOGRAPHICAL SKETCH


Darryl Jon Downing was born January 4, 1947 in

Beaver Dam, Wisconsin, and was the youngest of the five children of William and Roberta Downing.  He spent

most of his youth in Janesville, Wisconsin where he

graduated from high school in 1965.

Shortly after high school he married Barbara Ann Fisher.

It was through Barbara's coaxing that Darryl applied to

Whitewater State University where he obtained a Bachelor of

Science degree with a major in mathematics, in January of

1970.  While attending Whitewater State University, Darryl

met Dr. David Stoneman who introduced him to the field of

statistics. Dr. Stoneman was also instrumental in helping

Darryl go to graduate school.

After graduating from Whitewater State University

Darryl attended graduate school at Michigan Technological

University, majoring in mathematics. He attended Michigan

for six months and left for the University of Florida in

the Fall of 1970. In June, 1972 Darryl received the Master

of Statistics degree. From 1972 until the present he has

been working towards the degree of Doctor of Philosophy with

a major in Statistics.








Darryl and Barbara have two children: Darren Jon,

age 8 and Kelly Ann, age 6. Both children were born in

Janesville, Wisconsin while Darryl was attending Whitewater

State University.

Darryl has been hired as an Assistant Professor of

Statistics at Marquette University's Mathematics and

Statistics Department in Milwaukee, Wisconsin and will start

teaching there in August, 1974.










I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.



J. G. Saw, Chairman
Professor of Statistics


I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the Degree of Doctor of Philosophy.



D. T. Hughes
Assistant Professor of Statistics


I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the Degree of Doctor of Philosophy.



M. C. K. Yang
Assistant Professor of Statistics


I certify that I have read this study and that in my opinion
it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a dissertation
for the Degree of Doctor of Philosophy.



Z. R. Pop-Stojanovic
Professor of Mathematics


This dissertation was submitted to the Department of Statistics
in the College of Arts and Sciences and the Graduate Council,
and was accepted as partial fulfillment of the requirements for
the degree of Doctor of Philosophy.

August, 1974


Dean, Graduate School



