Title: On some problems of estimation and prediction for non-stationary time series
Creator: McClave, James Thomas, 1944-
Publication date: 1971
Physical description: viii, 103 leaves : ill. ; 28 cm.
Note: Thesis--University of Florida, 1971. Bibliography: leaves 100-102.



ON SOME PROBLEMS OF ESTIMATION AND PREDICTION
FOR NON-STATIONARY TIME SERIES

By

JAMES THOMAS McCLAVE

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1971

TO MY FATHER, MY MOTHER

AND MY WIFE

FOR THEIR LOVE, UNDERSTANDING

AND FINANCIAL SUPPORT

ACKNOWLEDGMENTS

I wish to express my heartfelt thanks to Dr. K. C. Chanda for

his expert and patient guidance in this effort.

I also want to express my gratitude to Dr. J. G. Saw and to

my office partner, Dr. J. Shuster, for many helpful discussions.

Special thanks go to my wife, Mary Jay, for her loving support,

financial and otherwise.

Finally, I wish to thank Mrs. Edna Larrick for the magnificent

job of turning the rough draft I gave her into this typing masterpiece.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

I   STATEMENT OF THE PROBLEM
    1.1 Introduction
    1.2 History--Stationary Processes
    1.3 History--Non-stationary Processes
    1.4 Summary of Results

II  LINEAR STATIONARY MODELS
    2.1 Introduction
    2.2 The Autoregressive Sequence
    2.3 The Moving Average Sequence
    2.4 The Hybrid Model

III THE LOGISTIC GROWTH PROCESS
    3.1 Introduction
    3.2 Probability Distribution of {X(t)}
    3.3 Equispaced Observation
    3.4 Continuous Observation in [0,T]
    3.5 Arrival Time Observation
    3.6 Prediction

IV  AN EMPIRICAL COMPARISON
    4.1 Introduction
    4.2 Rhodes' Estimators
    4.3 Sampling Properties of Rhodes' Estimators
    4.4 Monte Carlo Experiments

BIBLIOGRAPHY

BIOGRAPHICAL SKETCH

LIST OF TABLES

Table

2.1 Table of Efficiencies for Y_0 = Σ_{j=1}^{k} d_j C_j

4.1 Generated and Expected Sequence for n = 40, a = 10.0, and p = 0.8

4.2 Maximum Likelihood Estimates, E, and Rhodes' Estimates for Data in Table 4.1 for n = 40, a = 10.0 and p = 0.8

4.3 Generated and Expected Sequence for n = 100, a = 5.0, and p = 0.8

4.4 Maximum Likelihood Estimates, E, and Rhodes' Estimates for Data in Table 4.3 for n = 100, a = 5.0 and p = 0.8

4.5 Generated and Expected Sequence for n = 400, a = 3.0, and p = 0.8

4.6 Maximum Likelihood Estimates, E, and Rhodes' Estimates for Data in Table 4.5 for n = 400, a = 3.0 and p = 0.8

4.7 Summary of Results of Study

LIST OF FIGURES

Figure

4.1 Generated and Expected Sequence for n = 40, a = 10.0, and p = 0.8

4.2 Generated and Expected Sequence for n = 100, a = 5.0, and p = 0.8

4.3 Generated and Expected Sequence for n = 400, a = 3.0, and p = 0.8

Abstract of Dissertation Presented to the
Graduate Council of the University of Florida in Partial Fulfillment
of the Requirements for the Degree of Doctor of Philosophy

ON SOME PROBLEMS OF ESTIMATION AND PREDICTION
FOR NON-STATIONARY TIME SERIES

By

James Thomas McClave

March, 1971

Chairman: Dr. Kamal C. Chanda
Major Department: Statistics

Many techniques are available for estimating the parameters of

linear stationary time series. The effect of some of these estimators

on the least squares predictor for some future value of the series is

examined. We have obtained approximations for the increase in predic-

tion error due to parameter estimation in several cases for which no

exact expression could be found.

An efficient estimator for the parameters in a first order

autoregressive-moving average model is developed by making use of a

linear function of the autocorrelations. For more general models we

conclude that efficient estimation is so difficult to attain that first

consideration in many estimation prediction problems should be given

to ease of calculation.

Numerous unsolved estimation and prediction problems remain

for non-stationary time series. We consider the logistic growth proc-

ess, which is used extensively as an economic and population growth

model, in detail. Current estimation procedures for the logistic

process' parameters make no reference to an error structure.

We propose a probability structure consistent with the realistic

properties of the series. We then use this structure to obtain esti-

mators from three different observational standpoints:

(1) that of observation at equidistant time points,

(2) that of continuous observation, and

(3) that of arrival time observation.

For observation types (1) and (2) we have used a modification

of maximum likelihood procedures to obtain estimators having most of

the usual properties associated with maximum likelihood estimators.

A computer program was written for type (1) to solve the intractable

estimation equations, using the Newton-Raphson procedure. Observation

of arrival times is shown not to lead to any useful estimation proce-

dures. We also examine the effect of the estimators calculated from

the observations taken at equal time intervals (type (1) above) on the

error of prediction.

The procedure developed for observation type (1) is then compared

by means of example to an estimation procedure developed by Rhodes.

The logistic model is also fitted by each method to the population of

conterminous United States. The estimates are then used to obtain

predictions of the population of conterminous United States in 1970.

We conclude from the results that the effort required to calculate the

maximum likelihood estimates is worthwhile. We further conjecture that

the methods developed may be applicable to other growth processes.


CHAPTER I

STATEMENT OF THE PROBLEM

1.1 Introduction

Consider the following two types of time series:

1. {X(t); t ∈ J_{a,b}}, where J_{a,b} is the set of integers

{a, a+1, ..., b}. This is the discrete time series.

2. {X(t); t ∈ I_{a,b}}, where I_{a,b} is the interval (open or closed)

between the real numbers a and b. This is the continuous

time series.

Suppose an observation is made on a given series, yielding either an observation set of the kind {x(S), x(S+1), ..., x(T)}, with S and T integers such that J_{S,T} ⊂ J_{a,b}, from the first or second type; or one of the kind {x(t); t ∈ I_{S,T}}, with I_{S,T} ⊂ I_{a,b}, from the second type. The problem to be considered is that of finding the least squares predictor of X(T+m) (with m an integer if the series is discrete, and T+m ≤ b) based upon the observation. Thus, we want to find that function of the observation set, say x̂(m), which minimizes the mean square error of prediction.

The first step toward the solution of this problem is to construct

a parametric representation of the series. One must make use of all avail-

able information about {X(t)} in order to place the problem in a parametric

framework. This parametrization may vary from specification of the first

and/or second moments of the series to complete specification of the

probability distribution at each point t.

This study of the series' genesis will usually place the process

into one of two subclasses:

(a) Weakly stationary, hereafter referred to as stationary,

series; that is to say, those whose first two moments are,

for every t,

E{X(t)} = μ, (1.1.1)

independent of t; and

E{X(t+s)X(t)} = γ_s, (1.1.2)

a function of s only.

(b) Non-stationary series, the complement of subclass (a).

If the series satisfies the conditions of stationarity, the

prediction problem is simplified. The minimum mean square error pre-

dictor depends on the first two moments of the time series. Since for

stationary processes the second moment depends only upon the "lag" or

time difference of the random variables, and, since the first moment is

independent of time, all sequences of observations which are of equal

lengths yield the same information about the series' first two moments,

no matter when in the series' development the observations are taken.

This second order homogeneity simplifies the problem to the extent that

all further parametrization pertains to the description of μ and γ_s,

from which the predictor will be calculated.

On the other hand, if the series is non-stationary, the problem

is more complex because of the nature of this subclass. We will divide

the non-stationary subclass into two categories:

(i) those for which a simple transformation exists which trans-

forms the process into a stationary time series, and

(ii) those for which either no such transformation exists, or at

least no simple transformation can be found.

As to the first category, the nature of the transformation

depends, of course, on the reason for the series' non-stationarity.

As a simple example, suppose (X(t)} is a continuous time process with

E{X(t)} = m(t) = α + βt, (1.1.3)

E{[X(t) - m(t)][X(t+s) - m(t+s)]} = γ_s. (1.1.4)

Then the process fY(t)), formed by taking

Y(t) = X(t) - m(t), (1.1.5)

will be stationary.

This particular kind of transformation is popularly referred to

as removal of the trend from {X(t)}; it can be accomplished for simple

linear trends like that given by (1.1.3), or for more complex seasonal

or harmonic trends common to many economic time series. The end result

is the same; this further parametrization has permitted us to reduce the

problem to that of stationary series prediction, for which solutions

are known.

However, it is perhaps within the second category that one

encounters the most complex prediction problems in time series. Without

the homogeneity of the first two moments, the non-stationary process'

analysis will certainly depend on when the observation is taken; and the

underlying probability structure of the series can be extremely compli-

cated. In short, general methods cannot be used to solve the prediction

problem for non-stationary processes with satisfactory results. Each

series must be considered individually, its evolution thoroughly researched, a parametrization made, and then an optimal prediction procedure developed. One can hope that this systematic method will apply to a general group of processes, but it is extremely doubtful that the actual procedure developed will apply to any but the specific series being investigated.

1.2 History--Stationary Processes

The literature on the prediction problem for the stationary time

series is voluminous. That the area is extremely attractive follows from

the above; most results have wide application within the subclass of

stationary processes, and the chances that a problem is mathematically

tractable are enhanced by stationarity. In addition, of course, station-

ary processes have great practical significance.

Problems of least squares prediction for stochastic processes

were first considered by Kolmogorov [20], using Hilbert space geometry.

This treatment has been adapted to the stationary processes by several

authors, including Doob [9], Yaglom [36], and Parzen [24]. Hence the

prediction problem is essentially solved for all stationary time series;

the known general methods need only be applied to the specific stationary

process being considered. Many authors have contributed to this end,

among them Doob [9], Yaglom [36], Whittle [33,34], Box and Jenkins [3,4],

and Jenkins and Watts [17].

The particular parametrization proposed often results in an

estimation problem which must be solved prior to making the prediction.

This is illustrated by three of the most commonly hypothesized dis-

crete time series: the autoregressive, moving average, and mixed

autoregressive-moving average (hybrid) models. Each receives wide appli-

cation in engineering and economic research, and each entails parameter

estimation. These estimation problems have been considered by Mann and

Wald [21], Whittle [33], Durbin [10,11,12], Walker [28,29,30,31],

Hannan [16], and Box and Jenkins [4]. Some of the estimation procedures

are rather complicated to apply; but, for the most part, satisfactory

estimation methods exist for these linear stationary models.

The literature is nearly devoid of research exploring the effect

of parameter estimation on the error of least squares predictors. Even

though the error increase due to parameter estimation will usually be of

a smaller order than the mean square error, the precise nature of this

increase needs to be studied. Box and Jenkins [4] do present a brief

discussion of this problem for one simple time series model, but to my

knowledge the subject has not been mentioned elsewhere. This problem

will be considered at some length in Chapter II.

1.3 History--Non-stationary Processes

All the work on the prediction problem for non-stationary time

series is divided into two distinct groups: that which deals with trend

removal in order to make the process stationary, and that which does not.

We shall consider only the latter in this dissertation; the interested

reader may consult Grenander and Rosenblatt [15] and Box and Jenkins [4]

for a complete treatment of trend removal.

Kolmogorov's Hilbert space solution to the prediction problem

is applicable to non-stationary processes. The solution depends mainly

on the second order properties of the process. Since the second moments

are inherently more complex for non-stationary processes than those for

stationary processes, the result of the application of Kolmogorov's

method to a non-stationary time series usually does not have general

application within the class of non-stationary processes. For this

reason one finds that most of the estimation and prediction research in

this area has dealt with specific process types; for example, queuing

processes, renewal processes, birth-death processes, and growth

(evolutive) processes. The list is not exhaustive; each class has

received attention in the literature, much of it devoted to the estima-

tion and prediction problems.

Specifically, considering the birth-death and growth processes,

both usually Markov processes, most of the research has been concen-

trated on homogeneous processes; that is, those with transition probabil-

ities independent of time. Feller [13] gives a rather complete intro-

duction to the birth-death processes, defining the underlying probabil-

ity structure of them. Then David Kendall [18] uses generating functions

to obtain solutions for the probability distribution of these series at

any time t. Moran [23] partly solves the problem of estimation for

homogeneous birth-death processes, and both Moran [22] and Kendall [19]

propose several methods for estimating the parameters of the transition

probabilities of some simple non-homogeneous birth-death and growth

processes.

Since the publication of these papers, research in this area has

been sparse at best; and the prediction problem has received surpris-

ingly little attention from statisticians. Biologists and economists

(Rhodes [25], Tintner [26], and Granger [14]) have taken some interest in

this area, since it includes prediction of animal population size and

economic forecasting. Unfortunately, these problems are often treated

with little reference to underlying probability structures. The results

rely more on intuition than mathematics, and hence the predictions are

often unsatisfactory.

An example of this kind of treatment is given by Rhodes [25].

Rhodes considers the problem of estimating human population size without

using any probabilistic arguments. His early paper was an indication of

things to come. In all fairness though, early researchers had a legit-

imate excuse for steering away from these arguments, for they often make

prediction problems for non-homogeneous Markov processes mathematically

intractable. However, in this age of the electronic computer, many

formerly insoluble problems can be treated iteratively and solutions

obtained.

1.4 Summary of Results

The results are divided into two distinct categories: first, in

Chapter II the effect of estimation on the error of prediction for linear

stationary time series is considered. For several models an exact expres-

sion for the error increase due to parameter estimation is obtained; in

many cases for which this is not possible, approximations for this

increase are given. For the mixed autoregressive-moving average model,

a new estimation procedure is examined with respect to the prediction

problem.

In Chapter III the logistic growth process, a non-stationary,

Markov time series, is considered in detail. A realistic probabilistic

structure is assigned to this process, and then the parameters involved

are estimated from several different observational standpoints. Finally,

the prediction problem is considered for this process, and the effect

of parameter estimation on the mean square error of prediction is

examined.

The methods developed for the logistic process may be applicable

to other non-stationary processes. Nevertheless, the logistic process

has application in describing animal population growth and certain eco-

nomic growth patterns, so that the results may be useful in themselves.

Chapter IV is devoted to a comparison of an estimation procedure

(for the logistic process) derived by Rhodes [25] with one of those

derived in this paper. The analytical intractability of the estimators'

sampling properties calls for numerical comparison, so that conclusions

are tentative. The purpose of this Monte Carlo study is to determine

whether the extra effort which the probabilistic methods require is

worthwhile, or whether one should continue to estimate the parameters,

using methods like those proposed by Rhodes; that is to say, those which

rely mainly on intuitive appeal and their ease of calculation.

CHAPTER II

LINEAR STATIONARY MODELS

2.1 Introduction

Many volumes have been written about estimation and prediction

problems for linear stationary time series, but there is little in the

literature which considers the effect of estimation upon prediction.

That is, the increase in the mean square error of prediction due to

parameter estimation has not been examined.

Knowledge about the increase in prediction error for various

predictors may be useful. If the increase is small for several differ-

ent estimators, one might choose the estimator easiest to calculate.

On the other hand, one would spare no effort in finding that estimator

which least increased the error of prediction if that increase were

known to be significant.

Our purpose in this chapter is to consider this estimation-

prediction problem for the three basic linear stationary time series:

autoregressive, moving average, and a mixture of these (hybrid). Only

discrete time series will be examined; since the stationarity of the

process implies that the mean is mathematically independent of time,

this mean will be assumed known and equal to zero. All results remain

asymptotically true for the case of an unknown mean for the series

formed by subtracting the arithmetic mean from each of the observed

values. To pursue this would be stray from the objective; the inter-

ested render is referred to Bartlett [1] and Box and Jenkins [1].

Finally, the sequence {Z_t; t = 0, ±1, ...} will be used throughout

to denote a stationary series whose first two moments are the following:

E{Z_t} = 0, (2.1.1)

E{Z_t Z_{t+j}} = σ_Z²  if j = 0,
             = 0      if j ≠ 0. (2.1.2)

From time to time comment will be made on the case where Zt is normally

distributed. But unless specified, no distributional assumptions are made.

2.2 The Autoregressive Sequence

If {X_t; t ∈ J_{-∞,∞}} is a stationary time series satisfying the relationship

X_t + α_1 X_{t-1} + ... + α_p X_{t-p} = Z_t (2.2.1)

for every t, then {X_t} is said to be a pth order autoregressive sequence.

The problem of estimating the set of parameters {α_j; j ∈ J_{1,p}} is analogous to the classical regression problem. Defining γ_j as the j-step covariance of {X_t} and ρ_j as the j-step correlation, that is to say

γ_j = E{X_{t+j} X_t} (2.2.2)

and

ρ_j = γ_j γ_0^{-1}, (2.2.3)

we get

Σ_{j=1}^{p} α_j ρ_{j-r} = -ρ_r (2.2.4)

for r ∈ J_{1,p} by multiplying (2.2.1) by X_{t-r} and taking expected values.

We use the fact that Z_t and X_{t-r} are uncorrelated when r > 0, which will be shown later. Equation (2.2.4) gives rise to the estimating equations

Σ_{j=1}^{p} α̂_j C_{j-r} = -C_r (2.2.5)

for r ∈ J_{1,p}, where

C_j = N^{-1} Σ_{t=1}^{N-|j|} x_t x_{t+|j|} (2.2.6)

is the sample covariance function for the observed sequence {x_t; t ∈ J_{1,N}}. Mann and Wald [21] have shown that the errors of the estimators {α̂_j; j ∈ J_{1,p}} given by (2.2.5) are asymptotically normally distributed with null mean and variance-covariance matrix (Nγ_0)^{-1} σ_Z² Σ^{-1}, where Σ is a (p × p) matrix whose element in the (i,j) cell is

(Σ)_{ij} = ρ_{i-j}. (2.2.7)

The vector of estimates whose transpose is α̂^T = (α̂_1, α̂_2, ..., α̂_p) is shown to be the maximum likelihood estimator of α^T = (α_1, α_2, ..., α_p) if the distribution of Z_t is assumed normal for each t.
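The estimating equations lend themselves to direct computation. The routine below is a sketch added for illustration, not the author's program: it forms the sample covariances of (2.2.6) and solves the linear system (2.2.5), keeping the sign convention of model (2.2.1); the simulated series and parameter values used to exercise it are arbitrary.

```python
import numpy as np

def sample_autocov(x, j):
    # C_j of (2.2.6): N^{-1} sum_{t=1}^{N-|j|} x_t x_{t+|j|}
    x = np.asarray(x, dtype=float)
    N, j = len(x), abs(j)
    return np.dot(x[:N - j], x[j:]) / N

def yule_walker(x, p):
    # Solve sum_{j=1}^{p} alpha_j C_{j-r} = -C_r, r = 1, ..., p   (2.2.5)
    # for the coefficients of X_t + alpha_1 X_{t-1} + ... + alpha_p X_{t-p} = Z_t.
    C = [sample_autocov(x, j) for j in range(p + 1)]
    A = np.array([[C[abs(j - r)] for j in range(1, p + 1)]
                  for r in range(1, p + 1)])
    b = -np.array(C[1:])
    return np.linalg.solve(A, b)
```

Fitting a long simulated first order sequence generated with α_1 = -0.8 should return an estimate near -0.8.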

In order to predict the value of X_{N+m} using the observations x_1, x_2, ..., x_N, we will use the function which minimizes the mean square error of prediction. That is, the optimal predictor is x̂_N^{(1)}(m), which minimizes

MSE_1(m) = E{[X_{N+m} - x̂_N^{(1)}(m)]²}. (2.2.8)

But this implies that

x̂_N^{(1)}(m) = E{X_{N+m} | (X_j = x_j; j ∈ J_{1,N})}. (2.2.9)

Substitution of (2.2.1) into (2.2.9) yields

x̂_N^{(1)}(m) = E{Z_{N+m} - Σ_{j=1}^{p} α_j X_{N+m-j} | (X_j = x_j; j ∈ J_{1,N})}

= -Σ_{j=1}^{p} α_j x̂_N^{(1)}(m-j), (2.2.10)

where

x̂_N^{(1)}(j) = E{X_{N+j} | (X_k = x_k; k ∈ J_{1,N})}  if j > 0,

x̂_N^{(1)}(j) = x_{N+j}  if j ≤ 0. (2.2.11)

Thus one can calculate the m-step predictor, x̂_N^{(1)}(m), using the recursive relation (2.2.10).
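The recursion (2.2.10)-(2.2.11) is easy to mechanize. The short routine below is a sketch added for illustration, not part of the original dissertation: it carries the observed values forward, replacing each unknown future value by its predictor.

```python
def predict_ar(x, alpha, m):
    # m-step least squares predictor via the recursion (2.2.10):
    #   xhat(j) = -(alpha_1 xhat(j-1) + ... + alpha_p xhat(j-p)),
    # with xhat(j) equal to the observed x_{N+j} for j <= 0 (2.2.11).
    # alpha = (alpha_1, ..., alpha_p) from X_t + sum_j alpha_j X_{t-j} = Z_t;
    # x holds at least the last p observations, ending with x_N.
    hist = list(x)
    for _ in range(m):
        step = -sum(a * hist[-j] for j, a in enumerate(alpha, start=1))
        hist.append(step)
    return hist[-1]
```

For p = 1 this reduces to repeated multiplication by the single coefficient applied to x_N.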

This, however, requires that the autoregressive parameters be

known. When they are not, we may logically use

x̂_N^{(2)}(m) = -Σ_{j=1}^{p} α̂_j x̂_N^{(2)}(m-j). (2.2.12)

To be determined is the increase in the error of prediction caused by using x̂_N^{(2)}(m) instead of x̂_N^{(1)}(m).

Case 1: p = m = 1

For the sake of simplicity, let α = -α_1, so that {X_t} satisfies the relationship

X_t - αX_{t-1} = Z_t (2.2.13)

for all t. Using the Mann-Wald estimate for α, one has

α̂ = C_1 C_0^{-1} (2.2.14)

and

Var(α̂) ≈ (Nγ_0)^{-1} σ_Z². (2.2.15)

Rewriting (2.2.13) and using the fact that the stationarity of {X_t} implies |α| < 1, one obtains

X_t = α(αX_{t-2} + Z_{t-1}) + Z_t

= α^k X_{t-k} + Σ_{j=0}^{k-1} α^j Z_{t-j} (2.2.16)

for any positive integer k. We can extend this to the limit in the mean square sense as k → ∞. That is, one can write

X_t = Σ_{j=0}^{∞} α^j Z_{t-j}, (2.2.17)

since

E{X_t²} = σ_Z² Σ_{j=0}^{∞} α^{2j}

= (1-α²)^{-1} σ_Z² (2.2.18)

is finite. Note that the left side of (2.2.18) is, by definition, γ_0, and

γ_k = E{X_{t+k} X_t}

= E{Σ_{j=0}^{∞} α^j Z_{t+k-j} Σ_{j=0}^{∞} α^j Z_{t-j}}

= α^k γ_0. (2.2.19)

Thus

ρ_k = γ_k γ_0^{-1} = α^k. (2.2.20)

In (2.2.19) and (2.2.20) we assume k is a positive integer; that γ_{-k} = γ_k is easily shown.
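The geometric decay ρ_k = α^k of (2.2.20) is easy to confirm numerically. The sketch below is an illustration added here (the value α = 0.6 and the sample size are arbitrary): it compares sample autocorrelations of a simulated first order sequence with the corresponding powers of α.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, N = 0.6, 50_000
z = rng.standard_normal(N)
x = np.zeros(N)
for t in range(1, N):
    x[t] = alpha * x[t - 1] + z[t]   # X_t - alpha X_{t-1} = Z_t, as in (2.2.13)

c0 = np.dot(x, x) / N                # sample gamma_0
for k in (1, 2, 3):
    ck = np.dot(x[:N - k], x[k:]) / N
    print(k, round(ck / c0, 3), alpha ** k)   # sample rho_k against alpha^k
```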

The minimum mean square error predictor of X_{N+1} is given by setting p = m = 1 in (2.2.10), whereupon

x̂_N^{(1)}(1) = αx_N. (2.2.21)

The prediction error is given by

MSE_1 = E{[X_{N+1} - x̂_N^{(1)}(1)]²}

= E{(X_{N+1} - αX_N)²}

= σ_Z². (2.2.22)

If α is unknown, the predictor corresponding to (2.2.12) is

x̂_N^{(2)}(1) = α̂ x_N. (2.2.23)

Using the fact that [X_{N+1} - x̂_N^{(1)}(1)] is uncorrelated with the sequence {X_j; j ∈ J_{1,N}}, one has

MSE_2 = E{[X_{N+1} - x̂_N^{(2)}(1)]²}

= MSE_1 + E{X_N²(α̂-α)²}. (2.2.24)

Thus the increase in the error of prediction due to estimating α is

MSE_2 - MSE_1 = E{X_N²(α̂-α)²}. (2.2.25)

Suppose the estimator α̂ is calculated using only the first k observations in the sequence {x_j; j ∈ J_{1,N}}, where k << N; that is to say, we form the sample covariances in (2.2.6), using only k observations. Since ρ_j decreases exponentially, {X_j; j ∈ J_{1,k}} and X_N are virtually uncorrelated, and hence so are α̂ and X_N. A first approximation for (2.2.25) is thus

MSE_2 - MSE_1 ≈ E{X_N²} E{(α̂-α)²}

= γ_0 k^{-1}(1-α²) = k^{-1} σ_Z². (2.2.26)

Box and Jenkins [4] present an argument similar to the above. The degree to which the approximation (2.2.26) holds is impossible to ascertain from these arguments.

To obtain a more exact expression for (2.2.25), first define

δ_j = C_j - γ_j (2.2.27)

for j ∈ J_{0,1}. Then we can write

α̂ - α = C_1 C_0^{-1} - α

= (δ_0 + γ_0)^{-1} (δ_1 - αδ_0), (2.2.28)

using the fact that γ_1 - αγ_0 = 0. Then

E{(α̂-α)²} = E{(δ_1-αδ_0)² (δ_0+γ_0)^{-2}}

= E{γ_0^{-2}(δ_1-αδ_0)² (1 - 2δ_0γ_0^{-1} + 3(δ_0γ_0^{-1})² - ...)}

= E{γ_0^{-2}(δ_1-αδ_0)²} + o(N^{-1}), (2.2.29)

since δ_0 and δ_1 have sampling errors whose order of magnitude is N^{-1/2} (Bartlett [1]). Thus, retaining terms of order N^{-1} and higher, (2.2.25) becomes

E{X_N²(α̂-α)²} ≈ γ_0^{-2} E{X_N²(δ_1 - αδ_0)²}. (2.2.30)

We now state

Theorem 2.2.1:

If {X_t; t ∈ J_{-∞,∞}} is an autoregressive sequence of order one, then

MSE_2 - MSE_1 = N^{-1}{κ_Z(2) + (1+α²)^{-1} α² κ_Z(4) κ_Z^{-1}(2)} + o(N^{-1}), (2.2.31)

where κ_Z(r) is the rth cumulant of the distribution of {Z_t}.

Proof:

Referring to (2.2.30), we must evaluate

(i) E{X_N² δ_0²},

(ii) E{X_N² δ_1²}, and

(iii) E{X_N² δ_0 δ_1}.

(i) We will evaluate the first term in detail in order to make the methods clear. First, note that

E{X_N² δ_0²} = E{(C_0 - γ_0)² X_N²}

= E{(N^{-1} Σ_{t=1}^{N} X_t² - γ_0)² X_N²}

= E{N^{-2}(Σ_{t=1}^{N} X_t²)² X_N²} - 2N^{-1} γ_0 E{Σ_{t=1}^{N} X_t² X_N²} + γ_0² E{X_N²}. (2.2.32)

Now

E{(Σ_{t=1}^{N} X_t²)² X_N²} = Σ_{t=1}^{N} E{X_N² X_t⁴} + 2 Σ_{t=2}^{N} Σ_{s=1}^{t-1} E{X_N² X_t² X_s²}. (2.2.33)

From (2.2.16), for t > s,

X_t = α^{t-s} X_s + Σ_{k=0}^{t-s-1} α^k Z_{t-k}

= α^{t-s} X_s + Y_{s,t}, (2.2.34)

where

Y_{s,t} = Σ_{k=0}^{t-s-1} α^k Z_{t-k}. (2.2.35)

We now write

ν_{s,t}^{(k)} = E{Y_{s,t}^k}, (2.2.36)

defining

ν_{s,s}^{(k)} = Y_{s,s} = 0 (2.2.37)

for all k ∈ J_{1,∞}. Hence we obtain from (2.2.34), for t ≤ N,

E{X_N² X_t⁴} = E(E{X_N² X_t⁴ | X_t})

= α^{2(N-t)} E{X_t⁶} + ν_{t,N}^{(2)} E{X_t⁴}. (2.2.38)

Furthermore, for s ≤ t ≤ N,

E{X_N² X_t² X_s²} = E(E{X_N² X_t² X_s² | X_s})

= α^{2(N-t)} E{[α^{t-s} X_s + Y_{s,t}]⁴ X_s²} + ν_{t,N}^{(2)} E{[α^{t-s} X_s + Y_{s,t}]² X_s²}

= α^{2(N-t)+4(t-s)} E{X_s⁶}

+ [6α^{2(N-t)+2(t-s)} ν_{s,t}^{(2)} + α^{2(t-s)} ν_{t,N}^{(2)}] E{X_s⁴}

+ 4α^{2(N-t)+(t-s)} ν_{s,t}^{(3)} E{X_s³}

+ [α^{2(N-t)} ν_{s,t}^{(4)} + ν_{s,t}^{(2)} ν_{t,N}^{(2)}] E{X_s²}. (2.2.39)

Before we can write (2.2.32) in terms of the cumulants of {Z_t}, we need to note that the rth cumulant of the {X_t} process is, using (2.2.17),

κ_X(r) = (1-α^r)^{-1} κ_Z(r), (2.2.40)

and hence that

E{X_s⁶} = (1-α⁶)^{-1} κ_Z(6) + 15[(1-α⁴)(1-α²)]^{-1} κ_Z(4) κ_Z(2)

+ 10(1-α³)^{-2} κ_Z²(3) + 15(1-α²)^{-3} κ_Z³(2), (2.2.41)

E{X_s⁴} = (1-α⁴)^{-1} κ_Z(4) + 3(1-α²)^{-2} κ_Z²(2), (2.2.42)

E{X_s³} = (1-α³)^{-1} κ_Z(3), (2.2.43)

and

E{X_s²} = (1-α²)^{-1} κ_Z(2). (2.2.44)

We will also need, from (2.2.35) and (2.2.36),

ν_{s,t}^{(4)} = Σ_{j=0}^{t-s-1} α^{4j} [κ_Z(4) + 3κ_Z²(2)] + 6 Σ_{i=1}^{t-s-1} Σ_{j=0}^{i-1} α^{2(i+j)} κ_Z²(2)

= (1-α⁴)^{-1}(1-α^{4(t-s)}) κ_Z(4) + 3(1-α²)^{-2}(1-α^{2(t-s)})² κ_Z²(2), (2.2.45)

ν_{s,t}^{(3)} = Σ_{j=0}^{t-s-1} α^{3j} κ_Z(3)

= (1-α³)^{-1}(1-α^{3(t-s)}) κ_Z(3), (2.2.46)

and

ν_{s,t}^{(2)} = Σ_{j=0}^{t-s-1} α^{2j} κ_Z(2)

= (1-α²)^{-1}(1-α^{2(t-s)}) κ_Z(2). (2.2.47)

We are now prepared to obtain (2.2.39) in terms of the cumulants of {Z_t}; the result of substitution of (2.2.41) - (2.2.47) into (2.2.39) is

E{X_N² X_t² X_s²} = (1-α⁶)^{-1} α^{2(N-s)+2(t-s)} κ_Z(6)

+ [(1-α⁴)(1-α²)]^{-1} [8α^{2(N-s)+2(t-s)} + 5α^{2(N-s)} + α^{2(N-t)} + α^{2(t-s)}] κ_Z(4) κ_Z(2)

+ (1-α³)^{-2} [6α^{2(N-s)+2(t-s)} + 4α^{2N-s-t}] κ_Z²(3)

+ (1-α²)^{-3} [1 + 10α^{2(N-s)} + 2α^{2(N-t)} + 2α^{2(t-s)}] κ_Z³(2). (2.2.48)

We now substitute (2.2.48) into (2.2.33) to get

E{(Σ_{t=1}^{N} X_t²)² X_N²} = (1-α⁶)^{-1} κ_Z(6) {Σ_{t=0}^{N-1} α^{2t} + 2 Σ_{t=1}^{N-1} Σ_{s=1}^{t} α^{2s+2t}}

+ [(1-α⁴)(1-α²)]^{-1} κ_Z(4) κ_Z(2) {N + 14 Σ_{t=0}^{N-1} α^{2t} + 16 Σ_{t=1}^{N-1} Σ_{s=1}^{t} α^{2s+2t}

+ 2 Σ_{t=0}^{N-2} (N-t-1) α^{2t} + Σ_{t=1}^{N-1} (2N+8t) α^{2t}}

+ (1-α³)^{-2} κ_Z²(3) {10 Σ_{t=0}^{N-1} α^{2t} + 12 Σ_{t=1}^{N-1} Σ_{s=1}^{t} α^{2s+2t} + 8 Σ_{t=1}^{N-1} Σ_{s=0}^{t-1} α^{t+s}}

+ (1-α²)^{-3} κ_Z³(2) {N² + 2N + 12 Σ_{t=0}^{N-1} α^{2t} + 4 Σ_{t=0}^{N-2} (N-t-1) α^{2t} + Σ_{t=1}^{N-1} (4N+16t) α^{2t}}. (2.2.49)

The second term on the right side of (2.2.32) is similarly evaluated to obtain

Σ_{t=1}^{N} E{X_N² X_t²} = (1-α⁴)^{-1} κ_Z(4) Σ_{t=1}^{N} α^{2(N-t)} + (1-α²)^{-2} κ_Z²(2) Σ_{t=1}^{N} [1 + 2α^{2(N-t)}]

= (1-α⁴)^{-1} κ_Z(4) Σ_{t=0}^{N-1} α^{2t} + (1-α²)^{-2} κ_Z²(2) [N + 2 Σ_{t=0}^{N-1} α^{2t}]. (2.2.50)

Putting (2.2.49) and (2.2.50) into (2.2.32) yields

E{X_N² δ_0²} = N^{-1} {[(1-α⁴)(1-α²)²]^{-1} (1+α²) κ_Z(4) κ_Z(2) + 2(1-α²)^{-4} (1+α²) κ_Z³(2)} + o(N^{-1}). (2.2.51)

(ii) We proceed in the manner described above to obtain

E{X_N² δ_1²} = N^{-1} {2α²(2+α²)[(1-α⁴)(1-α²)²]^{-1} κ_Z(4) κ_Z(2) + (1-α²)^{-4} (1+4α²-α⁴) κ_Z³(2)} + o(N^{-1}). (2.2.52)

(iii) Similarly, we find

E{X_N² δ_0 δ_1} = N^{-1} {2α(1+α²)[(1-α⁴)(1-α²)²]^{-1} κ_Z(4) κ_Z(2) + 4α(1-α²)^{-4} κ_Z³(2)} + o(N^{-1}). (2.2.53)

Finally, from (2.2.30) and (2.2.51) - (2.2.53),

MSE_2 - MSE_1 = (1-α²)² κ_Z^{-2}(2) [E{X_N² δ_1²} - 2α E{X_N² δ_0 δ_1} + α² E{X_N² δ_0²}]

= N^{-1} {κ_Z(2) + (1+α²)^{-1} α² κ_Z(4) κ_Z^{-1}(2)} + o(N^{-1}), ((2.2.31))

which completes the proof of the theorem.

Note that if {Z_t} is a normal process, κ_Z(4) is zero, so that

MSE_2 - MSE_1 = N^{-1} κ_Z(2) + o(N^{-1})

= N^{-1} σ_Z² + o(N^{-1}). (2.2.54)

Thus we have the rather remarkable result that, when {Z_t} is a normal process, X_N and α̂ are uncorrelated when terms of lower order than N^{-1} are ignored. That is,

MSE_2 - MSE_1 = E{X_N²} E{(α̂-α)²} + o(N^{-1}). (2.2.55)
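The order-N^{-1} result (2.2.54) can be checked by simulation. The following Monte Carlo sketch is added for illustration and is not part of the original study; α, N, and the number of replicates are arbitrary choices. It averages X_N²(α̂-α)² over many simulated normal series and compares the average with σ_Z²/N.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, N, reps = 0.5, 200, 5000
excess = 0.0
for _ in range(reps):
    z = rng.standard_normal(N + 1)
    x = np.zeros(N + 1)
    for t in range(1, N + 1):
        x[t] = alpha * x[t - 1] + z[t]      # X_t - alpha X_{t-1} = Z_t
    obs = x[1:]                              # x_1, ..., x_N
    a_hat = np.dot(obs[:-1], obs[1:]) / np.dot(obs, obs)   # C_1 / C_0, (2.2.14)
    excess += obs[-1] ** 2 * (a_hat - alpha) ** 2          # X_N^2 (a_hat - alpha)^2
print(excess / reps, 1.0 / N)   # the average should be close to sigma_Z^2 / N
```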

Case 2: p = 1, m general

Using (2.2.9) and (2.2.16)
m-1
xl) (m) = Ef E ZN+mj +m XN/(X = x; j Jl, )
XN = N+m-j j
j=O

= mxN. (2.2.56)

Then

MSE1 = E[XN+m (1) (m)]2}

2 2m 2 (2.)
= a (1-0 )(1-0? (2.2.57)

Assuming α unknown, one forms the predictor

x̂_N^{(2)}(m) = α̂^m x_N.   (2.2.58)

As in Case 1,

MSE_2 − MSE_1 = E{[x̂_N^{(1)}(m) − x̂_N^{(2)}(m)]^2}

= E{X_N^2 [(δα + α)^m − α^m]^2},   (2.2.59)

writing

δα = α̂ − α.

Upon expanding (2.2.59) and remembering that δα has sampling error of order N^{-1/2}, we have

MSE_2 − MSE_1 = E{X_N^2 [(δα)^m + mα(δα)^{m-1} + … + mα^{m-1}(δα)]^2}

= m^2 α^{2(m-1)} E{X_N^2 (δα)^2} + o(N^{-1}).   (2.2.60)

Using Theorem 2.2.1 and assuming Z_t normal, this becomes

MSE_2 − MSE_1 = N^{-1} m^2 α^{2(m-1)} σ_Z^2 + o(N^{-1}).   (2.2.61)

If the assumption is dropped, the fourth order cumulant must be included

in (2.2.61).
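As a purely illustrative numerical sketch (ours, not the thesis's; the function name and parameter values are assumptions), the two quantities (2.2.57) and (2.2.61) for the AR(1) case X_t = αX_{t-1} + Z_t with normal Z_t can be evaluated as:

```python
def ar1_prediction_error(alpha, sigma2, m, N):
    """AR(1) X_t = alpha*X_{t-1} + Z_t with Var(Z_t) = sigma2.

    Returns (MSE1, increase): MSE1 is the m-step error (2.2.57) of the
    known-parameter predictor alpha**m * x_N, and `increase` is the
    leading-order penalty (2.2.61), N^{-1} m^2 alpha^{2(m-1)} sigma2,
    paid when alpha must be estimated from N observations.
    """
    mse1 = sigma2 * (1.0 - alpha ** (2 * m)) / (1.0 - alpha ** 2)
    increase = m ** 2 * alpha ** (2 * (m - 1)) * sigma2 / N
    return mse1, increase
```

For α = 0.5, σ_Z^2 = 1 and N = 100 the one-step penalty is 0.01, i.e. one percent of MSE_1 = 1 — negligible for large N, as the text asserts.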

Case 3: p general, m = 1

The optimal predictor here is

x̂_N^{(1)}(1) = −Σ_{t=1}^p α_t x_{N+1-t},   (2.2.62)

with prediction error

MSE_1 = E{[X_{N+1} − x̂_N^{(1)}(1)]^2}

= E{(X_{N+1} + Σ_{j=1}^p α_j X_{N+1-j})^2}

= σ_Z^2.   (2.2.63)

The predictor becomes, when the parameters are unknown,

x̂_N^{(2)}(1) = −Σ_{j=1}^p α̂_j x_{N+1-j}.   (2.2.64)

The increase in prediction error is given by

MSE_2 − MSE_1 = E{[x̂_N^{(2)}(1) − x̂_N^{(1)}(1)]^2}

= E{(δα)^T X X^T (δα)},   (2.2.65)

using matrix notation with X^T = (X_N, X_{N-1}, …, X_{N+1-p}). The evaluation of (2.2.65) is intractable; Theorem 2.2.1 leads us to conjecture that, for normal Z_t, X and (δα) are uncorrelated (ignoring terms of order less than N^{-1}). In this case

MSE_2 − MSE_1 ≈ γ_0 E{(δα)^T R (δα)},

where X X^T has been replaced by its expected value γ_0 R, R being the (p×p) correlation matrix of (X_N, …, X_{N+1-p}). It follows from the results of Mann and Wald [21] that, asymptotically,

√N (α̂ − α) ~ N_p(φ_p, σ_Z^2 (γ_0 R)^{-1}),   (2.2.66)

where N_p denotes a p-variate normal distribution and φ_p a null vector of order p. Thus

MSE_2 − MSE_1 = γ_0[(Nγ_0)^{-1} σ_Z^2 p] + o(N^{-1})

= N^{-1} p σ_Z^2 + o(N^{-1}),   (2.2.67)

using the fact that N(δα)^T [σ_Z^2 (γ_0 R)^{-1}]^{-1} (δα) has, asymptotically, a chi-square distribution with p degrees of freedom. Note that expression (2.2.67) is a simple generalization of expression (2.2.54) for the case p = 1.
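The conjecture behind (2.2.67) can be probed by simulation. The following sketch (our own illustration; the least-squares fitting routine, sample sizes and tolerances are assumptions, not the thesis's procedure) estimates E{[(α̂−α)^T X]^2} for an AR(2) by Monte Carlo and compares it with p σ_Z^2/N:

```python
import numpy as np

def ar_fit_ls(x, p):
    """Least-squares (Mann-Wald) estimate of (a_1, ..., a_p) in
    X_t = a_1 X_{t-1} + ... + a_p X_{t-p} + Z_t."""
    Y = x[p:]
    M = np.column_stack([x[p - j:-j] for j in range(1, p + 1)])
    return np.linalg.lstsq(M, Y, rcond=None)[0]

def prediction_penalty(a, sigma, N, reps, rng):
    """Monte Carlo estimate of MSE2 - MSE1 = E{[(a_hat - a)' X]^2},
    to be compared with the asymptotic value p*sigma^2/N of (2.2.67)."""
    p = len(a)
    diffs = []
    for _ in range(reps):
        z = rng.normal(0.0, sigma, N + 200)
        x = np.zeros(N + 200)
        for t in range(p, N + 200):
            x[t] = a @ x[t - p:t][::-1] + z[t]
        x = x[200:]                      # discard burn-in
        ahat = ar_fit_ls(x, p)
        xlast = x[-1:-p - 1:-1]          # (x_N, x_{N-1}, ..., x_{N+1-p})
        diffs.append(float(((ahat - a) @ xlast) ** 2))
    return float(np.mean(diffs))
```

With a stationary AR(2) the simulated penalty settles near 2σ_Z^2/N, consistent with the chi-square argument above.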

Case 4: p, m general

In this most general case, the predictor is (2.2.10), where the values x̂_N^{(1)}(j) are determined recursively. When {α̂_j; j ∈ J_{1,p}} are substituted for {α_j; j ∈ J_{1,p}}, one obtains x̂_N^{(2)}(m), given by (2.2.12). We are interested in calculating

MSE_2 − MSE_1 = E{[x̂_N^{(2)}(m) − x̂_N^{(1)}(m)]^2}.   (2.2.68)

All attempts to obtain (2.2.68) in closed form have been unsuccessful. That the term is of order N^{-1} follows from the properties of (α̂_j).

Further conclusions might be possible with the aid of Monte Carlo

experiments.

In summary, the autoregressive process allows one to calculate

estimators having many satisfying properties, including relative ease

of calculation. Furthermore, the effect of these estimators on the

error of prediction becomes negligible when N is large. Exactly how

large N must be is dependent upon p and m. We have obtained precise

expressions for the increase in prediction error for several special

cases.

2.3 The Moving Average Sequence

If {X_t; t ∈ J_{-∞,∞}} is a stationary time series satisfying the relationship

X_t = Z_t + β_1 Z_{t-1} + … + β_q Z_{t-q}   (2.3.1)

for every t, then {X_t} is said to be a q-th-order moving average process.

The problem of parameter estimation for the moving average sequence is complicated by the fact that the sample covariances, C_j, do not provide efficient estimators of the covariance function {γ_j} of {X_t}. (Efficiency is defined with respect to the variance of the maximum likelihood estimator when Z_t is assumed normal.) The following is a brief history of the estimation problem.

Whittle [33] obtained maximum likelihood equations for the moving average process assuming Z_t to be normal. Estimators were not obtained from these equations, however; even iterative methods have proven unsatisfactory for solving the likelihood equations. The search thus began for estimators of calculable form and with high efficiency.

We note that

γ_j = E{X_{t+j} X_t}

= E{(Σ_{k=0}^q β_k Z_{t+j-k})(Σ_{k=0}^q β_k Z_{t-k})}

= σ_Z^2 Σ_{k=0}^{q-j} β_k β_{k+j}   if j ∈ J_{0,q}

= 0   if j ∈ J_{q+1,∞},   (2.3.2)

defining β_0 = 1 and noting that γ_j = γ_{-j}. One can obtain estimates for {β_j; j ∈ J_{1,q}} by substituting C_j for γ_j in (2.3.2) for j ∈ J_{0,q}. However, the lack of efficiency of C_j for γ_j has a predictable effect on these estimators. Whittle [33] has shown that for q = 1 and β_1 = 0.5, the efficiency of this method is about 0.3.

Another representation of {X_t} is the following:

X_t + Σ_{j=1}^∞ c_j X_{t-j} = Z_t,   (2.3.3)

where

c_r = Σ_{j_1+j_2+…+j_q = r; j_s ≥ 0} τ_1^{j_1} τ_2^{j_2} ⋯ τ_q^{j_q},   (2.3.4)

and where {τ_j; j ∈ J_{1,q}} are the (distinct) roots of

z^q + β_1 z^{q-1} + … + β_q = 0.   (2.3.5)

If the roots of (2.3.5) are not all different, the representation is not affected; only the definition of the parameter set {c_j; j ∈ J_{1,∞}} differs slightly. Also, in order that (2.3.3) be a valid representation (in the mean square sense), we must require that all roots of (2.3.5) lie inside the unit circle. This restriction is not serious; in fact, it insures that the representation (2.3.1) is unique.

Thus {X_t} may be considered an autoregressive process of infinite order. Durbin [10] proposes that, since it is easily shown that c_r decreases exponentially in r, we use the representation

X_t + Σ_{j=1}^k c_j X_{t-j} ≈ Z_t,   (2.3.6)

where k is chosen large, but still k << N. The parameters {c_j; j ∈ J_{1,k}} may then be estimated using methods discussed in 2.2; finally, (2.3.4) and (2.3.5) are used to transform {ĉ_j; j ∈ J_{1,k}} to estimators of {β_j; j ∈ J_{1,q}}. Durbin shows the estimators can be made efficient by choosing k (and hence N) large. All properties derived assume N infinite and k large enough that relationships (2.3.3) and (2.3.6) are nearly equivalent. The computation of the estimators requires considerable effort.
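For q = 1 Durbin's idea admits a compact sketch. The following is our own simplified variant (not Durbin's full procedure; the regression step used to collapse the ĉ_j, and all names, are assumptions): fit a long AR(k), then exploit the fact that for an MA(1) the AR(∞) coefficients satisfy c_j = (−β)^j, i.e. c_{j+1} = −β c_j.

```python
import numpy as np

def durbin_ma1(x, k):
    """Durbin-style estimate of beta in X_t = Z_t + beta*Z_{t-1}:
    fit a long autoregression x_t = -c_1 x_{t-1} - ... - c_k x_{t-k} + z_t,
    then use the relation c_{j+1} = -beta * c_j (valid since c_j = (-beta)^j)."""
    Y = x[k:]
    M = np.column_stack([x[k - j:-j] for j in range(1, k + 1)])
    chat = -np.linalg.lstsq(M, Y, rcond=None)[0]
    c = np.concatenate(([1.0], chat))
    # least-squares slope of c_{j+1} on c_j gives -beta
    return -float(np.dot(c[:-1], c[1:]) / np.dot(c[:-1], c[:-1]))
```

The truncation point k plays exactly the role described in the text: it must be large enough that (2.3.6) approximates (2.3.3), yet small relative to N.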

Walker [31] finds linear functions of the sample correlations {r_j = C_j C_0^{-1}; j ∈ J_{1,N}} which provide asymptotically efficient estimators of the correlation function {ρ_j = γ_j γ_0^{-1}; j ∈ J_{1,q}}. He then substitutes these estimators into (2.3.2) (expressing (2.3.2) in terms of correlations rather than covariances), obtaining asymptotically efficient estimators of {β_j; j ∈ J_{1,q}}. The efficiency depends primarily on the number of correlations used in the calculation of the estimators and the number of observations, N. The weights assigned to the sample correlations used in these linear estimators are determined iteratively, so calculations may be tedious.

The most recent contribution is due to Hannan [16]. He uses the spectral representation of {X_t} to obtain parameter estimators. Though his work is too complicated to summarize here, it should be noted that the estimators are shown to be asymptotically normal with variance-covariance matrix identical to that of the maximum likelihood estimators. The procedure does involve an iterative procedure which, by his own examples, may be slow and thus expensive. His Monte Carlo experiments also showed inexplicable bias in his estimates.

By now one realizes that the estimation problem for the moving average time series is anything but simple; but the author's objective is to obtain an estimator suitable for use in a predictor. The predictor which minimizes the mean square error of prediction is

x̂_N^{(1)}(m) = E{X_{N+m}/(X_j = x_j; j ∈ J_{1,N})}.   (2.3.7)

If the entire history of {X_t} were known, that is, if the observations were {x_j; j ∈ J_{-∞,N}}, we would have

x̂_N^{(1)}(m) = E{Σ_{j=0}^q β_j Z_{N+m-j}/(X_j = x_j; j ∈ J_{-∞,N})}.   (2.3.8)

Note that {X_j = x_j; j ∈ J_{-∞,N}} uniquely determines {Z_j = z_j; j ∈ J_{-∞,N}}, and conversely, by using (2.3.1) and (2.3.3). Thus

E{Z_{N+k}/(X_j = x_j; j ∈ J_{-∞,N})} = E{Z_{N+k}/(Z_j = z_j; j ∈ J_{-∞,N})} = 0,   (2.3.9)

for k ∈ J_{1,∞}, since the {Z_t} process is an uncorrelated sequence.

Thus (2.3.8) becomes

x̂_N^{(1)}(m) = Σ_{j=m}^q β_j z_{N+m-j}.   (2.3.10)

Remembering that c_r decreases exponentially in r, we can approximate z_{N-k} by

z_{N-k} ≈ x_{N-k} + Σ_{j=1}^{N-k-1} c_j x_{N-k-j}   (2.3.11)

for use in (2.3.10). Henceforth we assume that the entire history of the {X_t} series is known up to time N so that the discussion is not needlessly complicated.

Case 1: q = m = 1

From (2.3.10),

x̂_N^{(1)}(1) = β z_N,   (2.3.12)

writing β = β_1. The error is

MSE_1 = E{[X_{N+1} − x̂_N^{(1)}(1)]^2}

= E{(X_{N+1} − βZ_N)^2}

= σ_Z^2.   (2.3.13)
If β is unknown, and one uses the estimator β̂, we have

x̂_N^{(2)}(1) = β̂ ẑ_N,   (2.3.14)

where from (2.3.3)

Z_t = X_t + Σ_{j=1}^∞ (−β)^j X_{t-j},   (2.3.15)

and so

ẑ_N = x_N + Σ_{j=1}^{N-1} (−β̂)^j x_{N-j}.   (2.3.16)

From previous arguments we know that

MSE_2 − MSE_1 = E{(β̂ẑ_N − βZ_N)^2}.   (2.3.17)

If we assume that β̂ has sampling error of order N^{-1/2}, which is the case for each of the estimation procedures considered above, we have

MSE_2 − MSE_1 = E{[(β+δβ)(Z_N+δZ_N) − βZ_N]^2}

= E{[Z_N(δβ)]^2} + β^2 E{(δZ_N)^2} + o(N^{-1}),   (2.3.18)

where δβ = β̂ − β and δZ_N = ẑ_N − Z_N.

Note that it follows from (2.3.9) that if we use only {x_j; j ∈ J_{1,N-1}} in calculating β̂, then β̂ and Z_N are uncorrelated. Then

E{Z_N^2 (δβ)^2} = E{Z_N^2} E{(δβ)^2}

= σ_Z^2 E{(β̂−β)^2}.   (2.3.19)

The second term on the right-hand side of (2.3.18) follows from

E{(δZ_N)^2} = E{[Σ_{j=1}^{N-1} (−1)^j (β̂^j − β^j) X_{N-j}]^2}

= E{[Σ_{j=1}^{N-1} (−1)^j jβ^{j-1} (δβ) X_{N-j}]^2} + o(N^{-1}).   (2.3.20)

The value of (2.3.20) depends on the specific form of β̂. However, the expression will be complicated for all estimates considered. One can obtain an approximation for (2.3.20) by replacing (δβ)^2 by its expected value to get

E{(δZ_N)^2} ≈ E{(δβ)^2}[γ_0 Σ_{j=1}^∞ j^2 β^{2(j-1)} − 2γ_1 Σ_{j=1}^∞ j(j+1) β^{2j-1}]

= E{(δβ)^2} σ_Z^2 (1−β^2)^{-1}.   (2.3.21)

Substitution of (2.3.19) and (2.3.21) into (2.3.18) yields

MSE_2 − MSE_1 = E{(β̂−β)^2} σ_Z^2 [1 + β^2(1−β^2)^{-1}]

= E{(β̂−β)^2} σ_Z^2 (1−β^2)^{-1}.   (2.3.22)

Whittle [33] has shown that the maximum likelihood estimator for β has asymptotic variance N^{-1}(1−β^2), so that, at best, the increase in the mean square error of prediction is given by

MSE_2 − MSE_1 = N^{-1} σ_Z^2.   (2.3.23)
2 1 Z

A strong argument can be made for using the inefficient estimator given by substituting sample correlations into (2.3.2) and solving for β; the estimation equation is thus

r_1 = C_1 C_0^{-1} = β̂(1+β̂^2)^{-1}.   (2.3.24)

Only one of the solutions to this quadratic will be less than unity. The estimator is easy to calculate, and even if the efficiency is only 0.1, the additional error given by (2.3.22) is negligible for large N.
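Solving the quadratic in (2.3.24) explicitly gives β̂ = [1 − √(1 − 4r_1^2)]/(2r_1), the root of modulus less than one. A minimal sketch of this moment estimator (ours; the guard against |r_1| ≥ 1/2, which cannot occur for an exact MA(1) correlation but can in samples, is an assumption):

```python
import numpy as np

def ma1_moment_estimate(x):
    """Solve r1 = C1/C0 = beta/(1 + beta^2)  (cf. (2.3.24)) for the root
    with |beta| <= 1; the other root of the quadratic is its reciprocal."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    r1 = np.dot(x[:-1], x[1:]) / np.dot(x, x)
    if r1 == 0.0:
        return 0.0
    if abs(r1) >= 0.5:        # |r1| <= 0.5 for every MA(1); clip at the boundary
        return float(np.sign(r1))
    return float((1.0 - np.sqrt(1.0 - 4.0 * r1 ** 2)) / (2.0 * r1))
```

Despite its modest efficiency, the estimator is a one-line computation, which is precisely the argument made above.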

Case 2: q, m general (m < q)

Using the m-step predictor given in (2.3.10), the mean square error is

MSE_1 = E{[X_{N+m} − Σ_{j=0}^{q-m} β_{j+m} Z_{N-j}]^2}

= σ_Z^2 (1 + Σ_{j=1}^{m-1} β_j^2).   (2.3.25)

When the estimator

x̂_N^{(2)}(m) = Σ_{j=m}^q β̂_j ẑ_{N+m-j}   (2.3.26)

is used, the increase in prediction error is given by

MSE_2 − MSE_1 = E{[x̂_N^{(1)}(m) − x̂_N^{(2)}(m)]^2}

= E{[Σ_{j=m}^q (Z_{N+m-j}(δβ_j) + β_j(δZ_{N+m-j}))]^2} + o(N^{-1}).   (2.3.27)

By using only {x_j; j ∈ J_{1,N-q}} in the calculation of {β̂_j; j ∈ J_{1,q}}, the cross product terms in (2.3.27) vanish, leaving

MSE_2 − MSE_1 = E{[Σ_{j=m}^q Z_{N+m-j}(δβ_j)]^2} + E{[Σ_{j=m}^q β_j δZ_{N+m-j}]^2} + o(N^{-1}).   (2.3.28)

The first term on the right-hand side of (2.3.28) is easily evaluated as

E{[Σ_{j=m}^q Z_{N+m-j}(δβ_j)]^2} = σ_Z^2 Σ_{j=m}^q E{(β̂_j−β_j)^2}.   (2.3.29)

The second term, however, is complicated by the fact that the {X_t} process is q-dependent; that is to say, {γ_j; j ∈ J_{-q,q}} are all non-zero. If we apply the methods used in Case 1,

E{[Σ_{j=m}^q β_j δZ_{N+m-j}]^2} = E{[Σ_{j=m}^q β_j Σ_t (ĉ_t − c_t) X_{N+m-j-t}]^2},   (2.3.30)

where {c_j; j ∈ J_{1,∞}} are obtained from (2.3.4), using {τ_j; j ∈ J_{1,q}} as the roots of

z^q + β_1 z^{q-1} + … + β_q = 0.   (2.3.31)

All that could be gleaned from (2.3.30) is the fact that an increase in q will result in an increase in the prediction error increase given by (2.3.27). But since no more concise expression could be obtained for (2.3.30), no approximation was obtained for (2.3.27). The argument for using estimators which are easily calculated, but perhaps inefficient, still applies, of course. But determination of N in order that the increase be bounded by a given quantity must be accomplished by using numerical methods.

2.4 The Hybrid Model

If {X_t; t ∈ J_{-∞,∞}} is a stationary time series satisfying the relationship

X_t + α_1 X_{t-1} + … + α_p X_{t-p} = Z_t + β_1 Z_{t-1} + … + β_q Z_{t-q}   (2.4.1)

for every t, then {X_t} is said to be a hybrid model of order (p,q).

Estimation of the parameter set {α_1, …, α_p; β_1, …, β_q} for the hybrid model is the most difficult of those considered. The fact that the sample covariance function is an inefficient estimator of the true covariance function contributes to this difficulty. The maximum likelihood estimators are again intractable, but their asymptotic variance-covariance matrix can be used for calculations of efficiency.

Durbin [11,12] again makes use of an infinite order autoregressive representation of {X_t}. The problems with the procedure are similar to those mentioned in 2.3: the choice of truncation point for the autoregressive representation, the choice of N, and the cumbersome iterative procedure used in the calculations. However, the efficiency of the estimators can be made as close to unity as desired by taking the truncation point and N large.

Walker [32] approaches the problem via the covariance function; by finding an efficient estimator of this function one can efficiently estimate the hybrid model parameters. The following relationships are used:

E{(Σ_{j=0}^p α_j X_{t-j})(Σ_{j=0}^p α_j X_{s-j})} = E{(Σ_{j=0}^q β_j Z_{t-j})(Σ_{j=0}^q β_j Z_{s-j})}   (2.4.2)

implies that

Σ_{j=0}^p Σ_{i=0}^p α_j α_i γ_{(t-s)+(i-j)} = σ_Z^2 Σ_{j=|t-s|}^q β_j β_{j-|t-s|},   (2.4.3)

where α_0 = β_0 = 1, and |t−s| ≤ q. This system of equations gives {α_j; j ∈ J_{1,p}} and {β_j; j ∈ J_{1,q}} in terms of the covariance function {γ_j}. In order that the solutions to (2.4.3) be unique, one must require that the roots of

z^q + β_1 z^{q-1} + … + β_q = 0   (2.4.4)

have modulus less than unity.

Also, after multiplication of (2.4.1) by X_{t-r} for r ∈ J_{q+1,q+p}, taking expected values yields

Σ_{j=0}^p α_j γ_{r-j} = 0,   (2.4.5)

using the fact that X_t is uncorrelated with the sequence {Z_{t+j}; j ∈ J_{1,∞}}. Walker finds efficient estimators of {γ_j}, and then uses (2.4.3) and (2.4.5) to calculate estimates of the hybrid model parameters. An iterative procedure is required, but the resulting estimators are asymptotically normal and fully efficient.

Hannan [16] achieves the same goal by a different method:

that of estimating the spectral density and using this to get estimates

of the parameters. The effort required is substantial, but the esti-

mators are asymptotically normal and efficient.

The predictor of X_{N+m} given the observations {x_j; j ∈ J_{-∞,N}} is

x̂_N^{(1)}(m) = E{X_{N+m}/(X_j = x_j; j ∈ J_{-∞,N})}

= −Σ_{j=1}^p α_j x̂_N^{(1)}(m−j) + Σ_{j=m}^q β_j z_{N+m-j}   if m ∈ J_{1,q}

= −Σ_{j=1}^p α_j x̂_N^{(1)}(m−j)   if m ∈ J_{q+1,∞},   (2.4.6)

where

x̂_N^{(1)}(j) denotes the predictor above if j ∈ J_{1,∞}, and x_{N+j} if j ∈ J_{-∞,0}.   (2.4.7)

Note that the values of {z_{N-j}; j ∈ J_{0,q-1}} are calculated from {x_j; j ∈ J_{-∞,N}}, using the infinite order autoregressive representation. When only {x_j; j ∈ J_{1,N}} is observed, we may approximate {z_{N-j}; j ∈ J_{0,q-1}} by truncating the autoregressive representation after N terms.

When the parameters are unknown, the predictor used is

x̂_N^{(2)}(m) = −Σ_{j=1}^p α̂_j x̂_N^{(2)}(m−j) + Σ_{j=m}^q β̂_j ẑ_{N+m-j}   if m ∈ J_{1,q}

= −Σ_{j=1}^p α̂_j x̂_N^{(2)}(m−j)   if m ∈ J_{q+1,∞},   (2.4.8)

where x̂_N^{(2)}(j) is calculated just as x̂_N^{(1)}(j) with the parameters replaced by their estimators, and ẑ_j is calculated using the autoregressive representation with estimated parameters.

Case 1: p = q = m = 1

The time series {X_t; t ∈ J_{-∞,∞}} satisfies

X_t − αX_{t-1} = Z_t + βZ_{t-1},   (2.4.9)

writing α = −α_1 and β = β_1 in (2.4.1). From (2.4.6),

x̂_N^{(1)}(1) = αx_N + βz_N.   (2.4.10)

The autoregressive representation of Z_N is

Z_N = X_N + Σ_{j=1}^∞ (−β)^j (X_{N-j} − αX_{N-j-1})

= X_N + (1 + αβ^{-1}) Σ_{j=1}^∞ (−β)^j X_{N-j}.   (2.4.11)

Noticing that the autoregressive parameters (the coefficients of the X_t's) on the right-hand side of (2.4.11) are decreasing exponentially, we will use

ẑ_N = x_N + (1 + αβ^{-1}) Σ_{j=1}^{N-1} (−β)^j x_{N-j}   (2.4.12)

in (2.4.10) to calculate x̂_N^{(1)}(1). The mean square error of prediction for (2.4.10) is

MSE_1 = E{[X_{N+1} − x̂_N^{(1)}(1)]^2}

= E{[X_{N+1} − (αX_N + βZ_N)]^2}

= σ_Z^2.   (2.4.13)

When α and β are estimated by α̂ and β̂, respectively, we use

x̂_N^{(2)}(1) = α̂x_N + β̂ẑ_N.   (2.4.14)

The increase in prediction error is

MSE_2 − MSE_1 = E{[αX_N + βZ_N − (α̂X_N + β̂ẑ_N)]^2}.   (2.4.15)

The term ẑ_N greatly complicates this expression; but if we assume ẑ_N ≈ Z_N,

MSE_2 − MSE_1 ≈ E{[(α̂−α)X_N + (β̂−β)Z_N]^2}

≈ γ_0 E{(α̂−α)^2} + σ_Z^2 E{(β̂−β)^2} + 2σ_Z^2 E{(α̂−α)(β̂−β)},   (2.4.16)

where

(i) X_N and α̂ are assumed approximately uncorrelated;

(ii) (α̂,β̂) are calculated using only {x_j; j ∈ J_{1,N-1}} so that β̂ and Z_N are uncorrelated; and, using the fact that {X_t} can be represented as an infinite order moving average process,

(iii) E{X_N Z_N} = E{[Z_N + Σ_{j=1}^∞ ψ_j Z_{N-j}] Z_N} = σ_Z^2.   (2.4.17)

To obtain {ψ_j; j ∈ J_{1,∞}} in terms of (α,β), note that

X_t = Σ_{j=0}^∞ α^j (Z_{t-j} + βZ_{t-j-1})

= Z_t + (α+β) Σ_{j=1}^∞ α^{j-1} Z_{t-j}.   (2.4.18)

One can derive an estimator for α from the equation (2.4.5):

γ_r − αγ_{r-1} = 0   (2.4.19)
for every r ∈ J_{2,∞}. With this in mind, we define Σ_k as an asymptotic (k×k) covariance matrix with (i,j) element

σ*_{ij} = lim_{N→∞} N E{(C*_i − α)(C*_j − α)},   (2.4.20)

where

C*_i = C_{i+1} C_i^{-1}   (2.4.21)

for i ∈ J_{1,k}. By letting

δC*_i = C_{i+1}C_i^{-1} − γ_{i+1}γ_i^{-1} = C*_i − α   (2.4.22)

and

δC_i = C_i − γ_i,   (2.4.23)

we have

δC*_j = (C_{j+1} − αC_j)C_j^{-1}

= (δC_{j+1} − αδC_j)(δC_j + γ_j)^{-1}

= γ_j^{-1}(δC_{j+1} − αδC_j) + o_p(N^{-1/2}).   (2.4.24)

But Bartlett [1] has shown that C_j is consistent for γ_j, so that (2.4.24) implies that C*_j is consistent for α.

Furthermore, from (2.4.18) we find

γ_s = σ_Z^2[1 + (α+β)^2(1−α^2)^{-1}]   if s = 0

= γ_0 a α^{s-1}   if s ≠ 0,   (2.4.25)

where

a = (α+β)(1+αβ)(1+2αβ+β^2)^{-1}.   (2.4.27)

Defining

λ_{r,r+s} = E{(δC_r)(δC_{r+s})},   (2.4.26)

Bartlett [1] has shown that

Nλ_{r,r+s} = γ_0^2 Σ_j (ρ_jρ_{j+s} + ρ_{j-r}ρ_{j+r+s}) + o(1).   (2.4.28)

Remembering that ρ_j = γ_j γ_0^{-1}, we use (2.4.25) to get

Σ_j ρ_jρ_{j+s} = 1 + 2a^2(1−α^2)^{-1}   if s = 0

= aα^{s-2}[2aα^2(1−α^2)^{-1} + (s−1)a + 2α]   if s ≠ 0.   (2.4.29)

We now state

Theorem 2.4.1:

The estimator

α̂_0 = Σ_{j=1}^k d_{j0} C*_j,   (2.4.30)

where

d_0^T = (d_{10}, d_{20}, …, d_{k0})   (2.4.31)

= 1^T Σ_k^{-1} [1^T Σ_k^{-1} 1]^{-1},   (2.4.32)

with 1^T = (1,1,…,1), is consistent and has minimum mean square error among the estimators

α̂ = Σ_{j=1}^k d_j C*_j   (2.4.33)

with d^T 1 = 1. This error is given by

MSE(α̂_0) = [1^T Σ_k^{-1} 1]^{-1},   (2.4.34)

where

σ*_{r,r+s} = (aα^{r-1})^{-2}(1 + α^2 − 4aα + 2a^2)   if s = 0

= (a^2α^{2r-1})^{-1}(2a − a^2α − α)   if s = 1

= α^{-2r}(1−α^2)   if s ∈ J_{2,∞}.   (2.4.35)

Proof:

That α̂_0 is consistent for α follows immediately from the fact that C*_j is consistent for α (2.4.24) and d_0^T 1 = 1.

Now if α̂ is of form (2.4.33),

MSE{α̂} = d^T Σ_k d.   (2.4.36)

In order to minimize this with the restriction d^T 1 = 1, we form the Lagrange equation

L = d^T Σ_k d − 2λ(d^T 1 − 1).   (2.4.37)

But ∂L/∂d = 0 implies, at d = d_0,

d_0^T Σ_k − λ1^T = 0.   (2.4.38)

Using d_0^T 1 = 1, it follows that

λ = [1^T Σ_k^{-1} 1]^{-1}   (2.4.39)

and hence

d_0^T = 1^T Σ_k^{-1} [1^T Σ_k^{-1} 1]^{-1},   (2.4.40)

as was to be shown.

Substituting (2.4.40) into (2.4.36), we obtain

MSE(α̂_0) = [1^T Σ_k^{-1} 1]^{-1}.   (2.4.41)

Finally, noting from (2.4.24) that

σ*_{r,r+s} = lim_{N→∞} N E{(δC*_r)(δC*_{r+s})}

= (γ_rγ_{r+s})^{-1} lim_{N→∞} N(λ_{r+1,r+s+1} − αλ_{r,r+s+1} − αλ_{r+1,r+s} + α^2 λ_{r,r+s}),   (2.4.42)

substitution of (2.4.28) and (2.4.29) into (2.4.42) yields (2.4.35), which completes the proof.
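The weighting scheme (2.4.32) is a generic minimum-variance combination under a unit-sum constraint, and can be sketched directly (our illustration; the function name and the example covariance matrix are assumptions):

```python
import numpy as np

def min_mse_weights(S):
    """Weights d0 = S^{-1} 1 / (1' S^{-1} 1) of (2.4.32), and the attained
    minimum d0' S d0 = 1 / (1' S^{-1} 1) of (2.4.34), for a covariance
    matrix S of the component estimators C*_1, ..., C*_k."""
    ones = np.ones(S.shape[0])
    w = np.linalg.solve(S, ones)
    denom = float(ones @ w)
    return w / denom, 1.0 / denom
```

For a diagonal S the weights are simply inverse variances renormalized, which makes the Lagrange solution above easy to check by hand.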

Table 2.4.1 gives efficiencies for α̂_0 for several values of α and β; three values of k (the number of sample covariances used in the calculation of α̂_0) are given. All calculations were carried out on the University of Florida's IBM 360 computer. The efficiencies are calculated with respect to the variance of the maximum likelihood estimator of α given by Whittle [appendix of 35] as

W = N^{-1}(1−α^2)(1+αβ)^2(α+β)^{-2}.   (2.4.43)

The table shows that the efficiency is especially sensitive to changes in β; as β tends to unity, it becomes necessary to use more sample covariances to achieve high efficiency.

Table 2.4.1

Efficiencies for α̂_0 = Σ_{j=1}^k d_{j0} C*_j

k = 2                    α
         .1     .3     .5     .7     .9
   .1  0.999  0.999  0.999  1.000  1.000
   .3  0.952  0.961  0.972  0.983  0.994
β  .5  0.791  0.840  0.888  0.934  0.979
   .7  0.591  0.687  0.780  0.871  0.958
   .9  0.427  0.551  0.680  0.809  0.937

k = 4                    α
         .1     .3     .5     .7     .9
   .1  1.000  1.000  1.000  1.000  1.000
   .3  0.999  0.999  0.999  1.000  1.000
β  .5  0.967  0.975  0.983  0.990  0.997
   .7  0.830  0.877  0.918  0.954  0.986
   .9  0.632  0.731  0.819  0.897  0.968

k = 6                    α
         .1     .3     .5     .7     .9
   .1  1.000  1.000  1.000  1.000  1.000
   .3  1.000  1.000  1.000  1.000  1.000
β  .5  0.996  0.997  0.998  0.999  1.000
   .7  0.931  0.951  0.968  0.982  0.995
   .9  0.746  0.820  0.882  0.935  0.980

In practice, one would need to estimate Σ_k (using (2.4.35)) in order to calculate α̂_0. Thus, though the effort of calculation is somewhat reduced, an iterative method must be used. We can obtain an upper bound to the mean square error of prediction for

x̂_N^{(2)}(1) = α̂_0 x_N + β̂ẑ_N   (2.4.44)

by considering

x̂_N^{(3)}(1) = α̂_0 x_N.   (2.4.45)

The corresponding increase in prediction error is

MSE_3 − MSE_1 = E{[(α̂_0−α)X_N + βZ_N]^2}

≈ σ_Z^2[β^2 + K^{-1}(1+αβ)^2(α+β)^{-2}(1+2αβ+β^2)],   (2.4.46)

where we assume K observations {x_j; j ∈ J_{1,K}} have been used in the calculation of α̂_0, with K << N, in order that the assumption that X_N and α̂_0 are uncorrelated is more plausible.

In fact, the use of x̂_N^{(3)}(m) as a predictor of X_{N+m} may not be unreasonable as long as β is small, thus saving the rather tedious calculation of ẑ_N.

Case 2: p = q = 1; m general

The m-step predictor is given by

x̂_N^{(1)}(m) = α^m x_N + α^{m-1}βz_N,   (2.4.47)

with mean square error

MSE_1 = σ_Z^2   if m = 1

= σ_Z^2[1 + (α+β)^2(1−α^{2(m-1)})(1−α^2)^{-1}]   if m ∈ J_{2,∞}.   (2.4.48)

If we use

x̂_N^{(2)}(m) = α̂^m x_N + α̂^{m-1}β̂ẑ_N,   (2.4.49)

the increase in prediction error will be

MSE_2 − MSE_1 = E{[(α̂^m − α^m)X_N + (α̂^{m-1}β̂ − α^{m-1}β)Z_N]^2}

= E{[mα^{m-1}(δα)X_N + (α^{m-1}(δβ) + (m−1)α^{m-2}β(δα))Z_N]^2} + o(N^{-1})

≈ m^2 α^{2(m-1)}(γ_0 + σ_Z^2)E{(δα)^2} + α^{2(m-1)} σ_Z^2 E{(δβ)^2}

+ 2mα^{2(m-1)}(1+β)σ_Z^2 E{(δα)(δβ)},   (2.4.50)

where we have replaced ẑ_N by Z_N, and assumed (α̂,β̂) to be uncorrelated with X_N.

We can again obtain an upper bound for this rather complicated expression by considering

x̂_N^{(3)}(m) = α̂_0^m x_N.   (2.4.51)

Then MSE_2 − MSE_1 is bounded above by

σ_Z^2 α^{2(m-1)}[β^2 + K^{-1} m^2 (1+αβ)^2(α+β)^{-2}(1+2αβ+β^2)].   (2.4.52)

In addition to the comments applied to this quantity in Case 1, notice

that the increase becomes less significant as m is increased.

Case 3: p general; q = m = 1

The representation of {X_t; t ∈ J_{-∞,∞}} is now, from (2.4.1),

X_t + α_1X_{t-1} + … + α_pX_{t-p} = Z_t + βZ_{t-1}.   (2.4.53)

The predictor is

x̂_N^{(1)}(1) = −Σ_{j=1}^p α_j x_{N+1-j} + βz_N,   (2.4.54)

and the error of prediction is

MSE_1 = E{[X_{N+1} − (−Σ_{j=1}^p α_j X_{N+1-j} + βZ_N)]^2}

= σ_Z^2.   (2.4.55)

When {α_j; j ∈ J_{1,p}} are unknown, estimators can be obtained from (2.4.2) and (2.4.5); we have

Σ_{j=0}^p α_j C_{r-j} = 0   (2.4.56)

for r ∈ J_{2,p+1} (α_0 = 1), and (2.4.3) for estimating the parameters. One can also extend the method presented in Theorem 2.4.1 to this case, but the analysis is increasingly complicated as p increases. The determination of estimates from (2.4.56) and (2.4.3) is probably easiest, and the order of the error increase is N^{-1} even though the estimators are inefficient.

The cases with p, q, and m general are analytically intractable insofar as evaluation of the increase in prediction error due to parameter estimation is concerned. But the study of the simple cases has shown that primary consideration should be given to the construction of estimators which are consistent and have mean square error of order N^{-1}. As long as N is large, high efficiency is not a necessary condition for small mean square error of prediction.

CHAPTER III

THE LOGISTIC GROWTH PROCESS

3.1 Introduction

Let {X(t); t ∈ I_{0,∞}} be a time series such that

E{X(t)/X(0) = n} = n(1+α)(1+αρ^t)^{-1},   (3.1.1)

where α > 0, n ∈ J_{1,∞} and ρ ∈ (0,1). Then {X(t)} is said to be a growth process following the logistic law, or simply a logistic growth process. This process receives wide application in animal population models and various economic models; the representation varies slightly, depending upon the particular application, but with a little adjustment each can be written as (3.1.1).
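The mean function (3.1.1) is easily computed; the following minimal sketch (ours, with illustrative parameter values) shows that it rises monotonically from n at t = 0 toward the saturation level n(1+α) as t → ∞:

```python
def logistic_mean(n, alpha, rho, t):
    """Mean function (3.1.1) of the logistic growth process:
    E[X(t) | X(0) = n] = n(1+alpha)/(1+alpha*rho^t), with alpha > 0
    and 0 < rho < 1. Rises from n at t = 0 toward n(1+alpha)."""
    return n * (1.0 + alpha) / (1.0 + alpha * rho ** t)
```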

The problems to be considered are:

(i) the estimation of α and ρ in (3.1.1), and

(ii) the prediction of future values of X(t) when α and ρ are estimated.

Several observational possibilities will be considered. The process may be observed periodically, at equidistant time points; it may be observed continuously during a given time interval; or only the times at which the process increases (assuming all increases are discrete) may be observed.

3.2 Probability Distribution of {X(t)}

The probability structure of a growth process may be specified in any of an infinite number of ways. But in reality these processes are often used with no reference to such a structure; the parameter estimation problem is considered without reference to error.

We shall use the following probabilistic model for {X(t)}: let {X(t); t ∈ I_{0,∞}} be a Markov process with

P{X(t+δt) = j/X(t) = i}

= iλ(t)δt + o(δt)   if j = i + 1

= 1 − iλ(t)δt + o(δt)   if j = i

= o(δt)   otherwise,   (3.2.1)

where δt is an infinitesimal time interval. This model obviously requires the assumption that {X(t)} grows in equivalent discrete units (economic units, population units, etc.), and that the probability of a one-unit increase at any given time is proportional to the size of the process at that time. The reader will recognize (3.2.1) as the common definition given a birth-death process with the death rate zero for all t. Kendall [19] has shown that the above representation is the birth-death model which minimizes the variance of {X(t)}.
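A sample path of the pure birth model (3.2.1) can be generated directly; the sketch below is our own illustration (not part of the thesis) and uses thinning, which is valid here because the per-individual rate λ(t) = −(log ρ)αρ^t/(1+αρ^t) of the logistic case is decreasing in t, so the total rate i·λ(t) at the time of the last event bounds the rate until the next one.

```python
import math, random

def simulate_logistic_growth(n, alpha, rho, T, rng):
    """Simulate the pure birth process (3.2.1) up to time T by thinning,
    with the logistic per-individual rate
        lambda(t) = -log(rho) * alpha * rho**t / (1 + alpha * rho**t).
    Between events the population i is constant and lambda is decreasing,
    so i*lambda(t_current) dominates the total rate. Returns X(T)."""
    lam = lambda t: -math.log(rho) * alpha * rho ** t / (1.0 + alpha * rho ** t)
    t, i = 0.0, n
    while True:
        bound = i * lam(t)                 # valid bound until next event
        t += rng.expovariate(bound)        # candidate event time
        if t >= T:
            return i
        if rng.random() < i * lam(t) / bound:
            i += 1                         # accepted birth
```

Averaging many runs reproduces the logistic mean n(1+α)/(1+αρ^T), which is how the rate above is calibrated.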

Defining

p_{ij}(s,t) = P{X(t) = j/X(s) = i},   (3.2.2)

an application of the Kolmogorov equations yields

p_{ij}(s,t+δt) = (j−1)λ(t)δt p_{i,j-1}(s,t) + [1 − jλ(t)δt]p_{ij}(s,t) + o(δt)   if j ∈ J_{i+1,∞}

= [1 − jλ(t)δt]p_{ij}(s,t) + o(δt)   if j = i.   (3.2.3)

Thus

∂/∂t [p_{ij}(s,t)] = (j−1)λ(t)p_{i,j-1}(s,t) − jλ(t)p_{ij}(s,t)   if j ∈ J_{i+1,∞}

= −jλ(t)p_{ij}(s,t)   if j = i.   (3.2.4)

Multiplication of (3.2.4) by j and summation over all j gives

Σ_{j=i}^∞ j ∂/∂t [p_{ij}(s,t)] = λ(t)[Σ_{j=i+1}^∞ j(j−1)p_{i,j-1}(s,t) − Σ_{j=i}^∞ j^2 p_{ij}(s,t)]

= λ(t) Σ_{j=i}^∞ j p_{ij}(s,t).   (3.2.5)

Defining

m_i(s,t) = Σ_{j=i}^∞ j p_{ij}(s,t),   (3.2.6)

we rewrite (3.2.5), obtaining

∂/∂t [m_i(s,t)] = λ(t) m_i(s,t).   (3.2.7)

Solving,

m_i(s,t) = i exp{∫_s^t λ(u)du},   (3.2.8)

using the initial condition

m_i(s,s) = i.   (3.2.9)

The function m_i(s,t) shall be called the mean function of the growth process {X(t)}, since obviously

m_i(s,t) = E{X(t)/X(s) = i}.   (3.2.10)

From (3.1.1) we know that

m (0,t) = n(l+Qi)(l+0p t)-
n

(3.2.11)

is the number of units at time zero. Equating (3.2.11) and

(for s = 0) implies that, for the logistic growth process,

X(t) = a((a + p -1 (3.2.12)

(3.2.13)

0 = log p.

We now state

Theorem 3.2.1:

If {X(t); t ∈ I_{0,∞}} is a logistic growth process whose probability structure is (3.2.1), then

p_{ij}(s,t) = C(j−1, i−1) T_{s,t}^i (1−T_{s,t})^{j-i}   if j ∈ J_{i,∞}

= 0   otherwise,   (3.2.14)

where

T_{s,t} = (1 + αρ^t)(1 + αρ^s)^{-1}.   (3.2.15)

Proof:

The proof is a special case of that given by Kendall [19] for the general birth-death process. We let λ(t) be defined by (3.2.12) and μ(t), the death rate, be zero to obtain the density given in (3.2.14). The rigorous proof of this is given for the general case in many places, and so will not be reproduced here.

3.3 Equispaced Observation

Assume that N observations are made on the logistic growth process {X(t); t ∈ I_{0,∞}} at equispaced intervals of length Δ. Without losing any generality in the results to be presented, we henceforth assume Δ = 1. Thus {X(j); j ∈ J_{1,N}} will be observed, with X(0) = n known.
known.

Writing the set of observations as [xj; j e J ,N], the likeli-

hood function for this set is

L[La,p; (x ; j J1,N)] =

= P (Xj = x.; j J1,N)/e,p n). (3.3.1)

Using the fact that [X(t)3 is a Markov process and applying Theorem

3.2.1,

N
L[e,p; (xj; j C J1)] = IT P[X(j) = x./X(j-1) = x )
j=l1 j-1

N x.- x. x.-x.
TT T J-1 (1-T ) J j-1
j=1 j-1
(3.3.2)

where x0 = n.

Taking the logarithm of (3.3.2), and replacing T_{j-1,j} by the defining equation (3.2.15), one gets

log L = Σ_{j=1}^N log C(x_j−1, x_{j-1}−1) + Σ_{j=1}^N x_{j-1}[log(1+αρ^j) − log(1+αρ^{j-1})]

+ Σ_{j=1}^N (x_j−x_{j-1})[log α(ρ^{j-1}−ρ^j) − log(1+αρ^{j-1})],   (3.3.3)

writing L = L[α,ρ; (x_j; j ∈ J_{1,N})] for simplicity. Differentiation of (3.3.3) with respect to α yields

∂log L/∂α = n[ρ(1+αρ)^{-1} − α^{-1}] + Σ_{j=1}^{N-1} x_j[ρ^{j+1}(1+αρ^{j+1})^{-1} − ρ^{j-1}(1+αρ^{j-1})^{-1}]

+ x_N[α^{-1} − ρ^{N-1}(1+αρ^{N-1})^{-1}];   (3.3.4)

and differentiation with respect to ρ gives

∂log L/∂ρ = n[(1−ρ)^{-1} + α(1+αρ)^{-1}]

+ Σ_{j=1}^{N-1} x_j[(j+1)αρ^j(1+αρ^{j+1})^{-1} − ρ^{-1} − (j−1)αρ^{j-2}(1+αρ^{j-1})^{-1}]

+ x_N[(N−1)ρ^{-1} − (1−ρ)^{-1} − (N−1)αρ^{N-2}(1+αρ^{N-1})^{-1}].   (3.3.5)

Before considering (3.3.4) and (3.3.5) at greater length, let {Y_i(t); t ∈ I_{0,∞}}, i ∈ J_{1,n}, be n independent logistic processes, where Y_i(0) = 1 for each i, and each of the n processes shares the same parameters (α,ρ) as {X(t)}. Defining, for s < t,

H_{Y_i}(s,t; z,w) = Σ_{m=1}^∞ Σ_{k=m}^∞ P{Y_i(s) = m; Y_i(t) = k} z^k w^m,   (3.3.6)

one has

H_{Y_i}(s,t; z,w) = Σ_{m=1}^∞ P{Y_i(s) = m/Y_i(0) = 1} w^m Σ_{k=m}^∞ P{Y_i(t) = k/Y_i(s) = m} z^k

= Σ_{m=1}^∞ T_{0,s}(1−T_{0,s})^{m-1} w^m Σ_{k=m}^∞ C(k−1, m−1) T_{s,t}^m (1−T_{s,t})^{k-m} z^k

= T_{0,s} δ(w,z;T_{s,t})[1 − (1−T_{0,s})δ(w,z;T_{s,t})]^{-1},   (3.3.7)

where

δ(w,z;T_{s,t}) = T_{s,t} wz[1 − z(1−T_{s,t})]^{-1}.   (3.3.8)

Similarly, defining

H_X(s,t; z,w) = Σ_{m=n}^∞ Σ_{k=m}^∞ P{X(s) = m; X(t) = k/X(0) = n} z^k w^m,   (3.3.9)

one has

H_X(s,t; z,w) = Σ_{m=n}^∞ C(m−1, n−1) T_{0,s}^n (1−T_{0,s})^{m-n} w^m Σ_{k=m}^∞ C(k−1, m−1) T_{s,t}^m (1−T_{s,t})^{k-m} z^k

= (T_{0,s} δ(w,z;T_{s,t})[1 − (1−T_{0,s})δ(w,z;T_{s,t})]^{-1})^n,   (3.3.10)

which implies that

H_X(s,t; z,w) = [H_{Y_i}(s,t; z,w)]^n = Π_{i=1}^n H_{Y_i}(s,t; z,w).   (3.3.11)

Hence

X(t) = Σ_{i=1}^n Y_i(t)   (t ∈ I_{0,∞}).   (3.3.12)

If we rewrite (3.3.3) for the random sequence {X(j) = X_j; j ∈ J_{1,N}}, we get

log L(α,ρ; X_1, …, X_N) = n φ_0(α,ρ) + Σ_{j=1}^N X_j φ_j(α,ρ),   (3.3.13)

where

φ_j(α,ρ) = log[(1+αρ)[α(1−ρ)]^{-1}]   if j = 0

= log[(1+αρ^{j+1})[ρ(1+αρ^{j-1})]^{-1}]   if j ∈ J_{1,N-1}

= log[αρ^{N-1}(1−ρ)(1+αρ^{N-1})^{-1}]   if j = N,   (3.3.14)

and where terms not involving (α,ρ), such as the binomial coefficients log C(X_j−1, X_{j-1}−1), have been suppressed. But in view of (3.3.12), this can be written

log L(α,ρ; Y_1, Y_2, …, Y_n) = n φ_0(α,ρ) + Σ_{j=1}^N Σ_{i=1}^n Y_{i,j} φ_j(α,ρ)

= Σ_{i=1}^n z_i,   (3.3.15)

where

Y_i^T = (Y_{i,1}, Y_{i,2}, …, Y_{i,N}),   (3.3.16)

Y_{i,j} = Y_i(j),   (3.3.17)

and

z_i = φ_0(α,ρ) + Σ_{j=1}^N φ_j(α,ρ) Y_{i,j}.   (3.3.18)

It follows from the discussion leading up to (3.3.12) that

z_i = log p(y_i; α,ρ),

where

p(y_i; α,ρ) = P{Y_{i,1} = y_{i,1}, …, Y_{i,N} = y_{i,N}}.   (3.3.19)

We are now ready to state and prove

Theorem 3.3.1: The equations

∂log L(α,ρ; X_1, X_2, …, X_N)/∂α = 0

∂log L(α,ρ; X_1, X_2, …, X_N)/∂ρ = 0   (3.3.20)

have a solution (α̂,ρ̂) which is weakly consistent for (α,ρ). Furthermore, the asymptotic distribution of [√n(α̂−α); √n(ρ̂−ρ)] as n → ∞ is normal, with null mean vector and variance-covariance matrix Σ, where

Σ^{-1} = (σ^{11}  σ^{12}; σ^{12}  σ^{22}),   (3.3.21)

with

σ^{11} = ρ^2(1+αρ)^{-2} − α^{-2}

+ Σ_{j=1}^{N-1} (1+α)(1+αρ^j)^{-1}[ρ^{2(j+1)}(1+αρ^{j+1})^{-2} − ρ^{2(j-1)}(1+αρ^{j-1})^{-2}]

+ (1+α)(1+αρ^N)^{-1}[α^{-2} − ρ^{2(N-1)}(1+αρ^{N-1})^{-2}],   (3.3.22)

σ^{12} = −(1+αρ)^{-2}

+ Σ_{j=1}^{N-1} (1+α)(1+αρ^j)^{-1}[(j−1)ρ^{j-2}(1+αρ^{j-1})^{-2} − (j+1)ρ^j(1+αρ^{j+1})^{-2}]

+ (1+α)(1+αρ^N)^{-1}(N−1)ρ^{N-2}(1+αρ^{N-1})^{-2},   (3.3.23)

and

σ^{22} = α^2(1+αρ)^{-2} − (1−ρ)^{-2}

+ Σ_{j=1}^{N-1} (1+α)(1+αρ^j)^{-1}{(j+1)^2 α^2 ρ^{2j}(1+αρ^{j+1})^{-2} − j(j+1)αρ^{j-1}(1+αρ^{j+1})^{-1} − ρ^{-2}

+ (j−1)(j−2)αρ^{j-3}(1+αρ^{j-1})^{-1} − (j−1)^2 α^2 ρ^{2(j-2)}(1+αρ^{j-1})^{-2}}

+ (1+α)(1+αρ^N)^{-1}{(N−1)(N−2)αρ^{N-3}(1+αρ^{N-1})^{-1} − (N−1)^2 α^2 ρ^{2(N-2)}(1+αρ^{N-1})^{-2}

+ (1−ρ)^{-2} + (N−1)ρ^{-2}}.   (3.3.24)

Proof:

Write p = p(y ; a,p), 01 = a, and 62 = p for convenience.

Chanda [5] has shown that the following three conditions are suffi-

cient for the theorem to be true:

(i) For all (01, 2) e ( = [(0,=) X (0,1)3,

3
a log p 8 log p and log p exist for almost a
8 9 *and o9 oe9 exist for almost all },
i 1 3 1 j k

and for all i,j,k e J
1,2-
(ii) Defining F(-) to be a summable function when

E F(y ) < <, (3.3.25)
i -
where Z is the summation over almost all ,, we must
yi
have, for all (91, 2) e 0 I < F

I 12P {3 log p k
< Fij (y), and aeisejae

where F (y) and F.. (-) are summable functions, and
i ~ ij1

S H k(y )p(y; e, 6) < M,
E Hj ,k i 1 2 )
zi

M being a finite positive constant. This condition

must be satisfied for i,j,k e J1,2

(iii) For all (98182) e C, the matrix J = ((Ji,j (1 2) ),

where

(3 log p 3 log p,
J (6 ,9 ) =) p(y, 9 9e )
i,j 1 2 1P(i, '1 2
1i 1 3

is positive definite and J|J is finite.

(i) Condition (i) will be established as soon as ∂c_r/∂θ_i, ∂²c_r/∂θ_i∂θ_j, and ∂³c_r/∂θ_i∂θ_j∂θ_k are shown to exist for r ∈ J_{0,N} and i, j, k ∈ J_{1,2}. This follows from the form of log p(y_i; α, ρ), which from (3.3.18) is

\[ \log p(y_i;\alpha,\rho) = \sum_{j=0}^{N} c_j(\alpha,\rho)\,y_{i,j}, \tag{3.3.26} \]

where we define y_{i,0} = 1.

But by inspection of (3.3.14), one sees that c_r(α, ρ) is the logarithm of a continuous finite function for all (α, ρ) ∈ Ω. Thus it is easily shown that derivatives up to third order exist in Ω.

(ii) It follows from (3.3.26) that

\[ p(y_i;\alpha,\rho) = \exp\Bigl[\sum_{j=0}^{N} c_j(\alpha,\rho)\,y_{i,j}\Bigr]. \tag{3.3.27} \]

Thus

\[ \frac{\partial p_i}{\partial\theta_j} = \sum_{r=0}^{N} y_{i,r}\,\frac{\partial c_r}{\partial\theta_j}\,\exp\Bigl[\sum_{j=0}^{N} c_j\,y_{i,j}\Bigr] = \sum_{r=0}^{N} y_{i,r}\,\frac{\partial c_r}{\partial\theta_j}\,p(y_i;\theta_1,\theta_2), \tag{3.3.28} \]

which is bounded by the summable function

\[ F_j(y_i) = \sum_{r=0}^{N}\Bigl|\frac{\partial c_r}{\partial\theta_j}\Bigr|\,y_{i,N}\,p(y_i;\theta_1,\theta_2), \tag{3.3.29} \]

for from (i) the terms ∂c_r/∂θ_j are finite, and E{y_{i,N}} < ∞.
3
Similarly,

\[ \frac{\partial^2 p_k}{\partial\theta_i\,\partial\theta_j} = \frac{\partial}{\partial\theta_i}\Bigl[\sum_{r=0}^{N} y_{k,r}\,\frac{\partial c_r}{\partial\theta_j}\,p(y_k;\theta_1,\theta_2)\Bigr] = \sum_{r=0}^{N} y_{k,r}\,\frac{\partial^2 c_r}{\partial\theta_i\,\partial\theta_j}\,p(y_k;\theta_1,\theta_2) + \sum_{r=0}^{N}\sum_{s=0}^{N} y_{k,r}\,y_{k,s}\,\frac{\partial c_r}{\partial\theta_j}\,\frac{\partial c_s}{\partial\theta_i}\,p(y_k;\theta_1,\theta_2), \tag{3.3.30} \]

which requires (i) and a finite second-order moment of y_{k,N} in order to be bounded by the summable function

\[ F_{ij}(y_k) = \sum_{r=0}^{N}\Bigl|\frac{\partial^2 c_r}{\partial\theta_i\,\partial\theta_j}\Bigr|\,y_{k,N}\,p(y_k;\theta_1,\theta_2) + \sum_{r=0}^{N}\sum_{s=0}^{N}\Bigl|\frac{\partial c_r}{\partial\theta_i}\Bigr|\,\Bigl|\frac{\partial c_s}{\partial\theta_j}\Bigr|\,y_{k,N}^2\,p(y_k;\theta_1,\theta_2). \tag{3.3.31} \]

Finally,

\[ \frac{\partial^3 \log p_s}{\partial\theta_i\,\partial\theta_j\,\partial\theta_k} = \sum_{r=0}^{N}\frac{\partial^3 c_r}{\partial\theta_i\,\partial\theta_j\,\partial\theta_k}\,y_{s,r} \tag{3.3.32} \]

is bounded by

\[ H_{ijk}(y_s) = y_{s,N}\sum_{r=0}^{N}\Bigl|\frac{\partial^3 c_r}{\partial\theta_i\,\partial\theta_j\,\partial\theta_k}\Bigr|, \tag{3.3.33} \]

which has finite expected value by (i) and the finite expected value of y_{s,N}. Thus (ii) is established.

(iii) Chanda has shown that J will be positive definite as long as p is a density function. That |J| is finite follows from the fact that each product (∂ log p/∂θ_i)(∂ log p/∂θ_j) has finite expectation (from (i)) for all (θ_1, θ_2) ∈ Ω.

Thus the proof of Theorem 3.3.1 is complete. A closed form has not been found for the solution to (3.3.20). Some numerical results using an iterative procedure will be presented in Chapter IV. Also, we attempted to establish stronger consistency properties for (α̂, ρ̂), using results of Billingsley [2] and Wald [27], but have had no success as yet.
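Since (3.3.20) has no closed-form solution, the numerical results of Chapter IV come from an iterative procedure. The sketch below shows a generic two-parameter Newton-Raphson iteration of that kind; the quadratic toy score (with root at (10, 0.8)) is a hypothetical stand-in for the logistic score equations, and `newton_raphson_2d` is an illustrative name, not the dissertation's program.

```python
import numpy as np

def newton_raphson_2d(score, theta0, tol=1e-10, max_iter=50):
    """Solve score(theta) = 0 for a 2-vector theta by Newton-Raphson,
    using a finite-difference Jacobian of the score."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        s = score(theta)
        h = 1e-6
        J = np.empty((2, 2))
        for j in range(2):
            e = np.zeros(2)
            e[j] = h
            # column j of the Jacobian: directional difference quotient
            J[:, j] = (score(theta + e) - s) / h
        step = np.linalg.solve(J, s)
        theta = theta - step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# toy score: gradient of -(a-10)^2 - 100*(p-0.8)^2, zero at (10, 0.8)
score = lambda th: np.array([-2.0 * (th[0] - 10.0),
                             -200.0 * (th[1] - 0.8)])
root = newton_raphson_2d(score, [5.0, 0.5])
```

For the actual estimates one would replace `score` with the pair of derivatives in (3.3.20) evaluated on the observed sequence.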

3.4 Continuous Observation in [0,T]

Assume that the logistic growth process {X(t); t ∈ I_{0,∞}} is observed continuously in [0,T] = I_{0,T}. That is, {X(t); t ∈ I_{0,T}} will be observed, with X(0) = n known.

The problem of estimating α and ρ will be approached as follows: first, divide the interval I_{0,T} into N equal intervals of infinitesimal length h. Assume that observations will be made only at the end of each of the N intervals; that is, {X(jh); j ∈ J_{1,N}} will be observed, where Nh = T. From (3.3.2) we can write the likelihood function for such an observation:

\[ L[\alpha,\rho;\ (x_{jh};\ j \in J_{1,N})] = \prod_{j=1}^{N}\binom{x_{jh}-1}{x_{(j-1)h}-1}\,T_{(j-1)h,jh}^{\,x_{(j-1)h}}\,(1-T_{(j-1)h,jh})^{\,x_{jh}-x_{(j-1)h}}, \tag{3.4.1} \]

where x_0 = n and T_{s,t} is defined in (3.2.15). Note that to obtain the continuous observation in I_{0,T}, one must let h → 0 and N → ∞ such that Nh = T. We thus want to consider the properties of log {L[α,ρ; (X_{jh}; j ∈ J_{0,N})]} as h → 0.

We first define the concept of "limit in mean" (l.i.m.). If a sequence of random variables {Z_1, Z_2, ..., Z_m, ...} is such that

\[ \lim_{m\to\infty} E[(Z_m - Z)^2] = 0, \tag{3.4.2} \]

then {Z_m} is said to approach Z as a limit in the mean square sense, or simply as a limit in mean. We will write

\[ \operatorname*{l.i.m.}_{m\to\infty} Z_m = Z. \tag{3.4.3} \]
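As a concrete illustration (not part of the original text): if Z_m is the mean of m independent variables with mean μ and variance σ², then E[(Z_m - μ)²] = σ²/m → 0, so l.i.m. Z_m = μ. The arithmetic can be checked directly:

```python
# Mean-square error of a sample mean: E[(Z_m - mu)^2] = sigma^2 / m.
# As m grows this tends to 0, so Z_m approaches mu as a limit in mean.
sigma2 = 4.0
mse = [sigma2 / m for m in (1, 10, 100, 1000)]
```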

We now state

Theorem 3.4.1:

The limit in mean of log {L[α,ρ; (X_{jh}; j ∈ J_{0,N})]} as h → 0 (Nh = T) does not exist. However, for fixed (α_0, ρ_0) ∈ Ω,

\[ \operatorname*{l.i.m.}_{h\to 0,\ Nh=T}\ \bigl\{\log L[\alpha,\rho;\ (X_{jh};\ j\in J_{0,N})] - \log L[\alpha_0,\rho_0;\ (X_{jh};\ j\in J_{0,N})]\bigr\} = \int_0^T(\log\xi_t - \log\xi^0_t)\,dX(t) + \int_0^T X(t)\,[\xi^0_t - \xi_t]\,dt, \tag{3.4.4} \]

where we define

\[ \xi_t = \alpha\beta\rho^t(1+\alpha\rho^t)^{-1} \tag{3.4.5} \]

and

\[ \xi^0_t = \alpha_0\beta_0\rho_0^t(1+\alpha_0\rho_0^t)^{-1}. \tag{3.4.6} \]

The stochastic integrals on the right-hand side of (3.4.4) are defined in the usual sense, and remember β = -log ρ.

Before proving Theorem 3.4.1, we will need the following lemmas.

Before proving Theorem 3.4.1, we will need the following lemmas.

Lemma 3.4.1:

\[ \lim_{h\to 0,\ Nh=T}\ \sum_{j=1}^{N}\log T_{(j-1)h,jh} = -\int_0^T\xi_t\,dt. \tag{3.4.7} \]

The limit

\[ \lim_{h\to 0,\ Nh=T}\ \sum_{j=1}^{N}\log\,(1 - T_{(j-1)h,jh}) \]

does not exist, but for any fixed (α_0, ρ_0) ∈ Ω,

\[ \lim_{h\to 0,\ Nh=T}\ \sum_{j=1}^{N} h\,\log\bigl[(1 - T_{(j-1)h,jh})(1 - T^0_{(j-1)h,jh})^{-1}\bigr] = \int_0^T(\log\xi_t - \log\xi^0_t)\,dt, \tag{3.4.8} \]

writing, for s < t,

\[ T^0_{s,t} = (1+\alpha_0\rho_0^t)(1+\alpha_0\rho_0^s)^{-1}. \tag{3.4.9} \]

Proof of Lemma 3.4.1:

Using (3.2.15),

\[ T_{(j-1)h,jh} = (1+\alpha\rho^{jh})(1+\alpha\rho^{(j-1)h})^{-1} = (1+\alpha\rho^{jh})\bigl[1+\alpha\rho^{jh}\exp(\beta h)\bigr]^{-1} = \bigl[1 + (1+\alpha\rho^{jh})^{-1}\alpha\rho^{jh}(\exp(\beta h)-1)\bigr]^{-1} = \bigl[1 + h\,\xi_{jh} + O(h^2)\bigr]^{-1}. \tag{3.4.10} \]

Hence

\[ \log T_{(j-1)h,jh} = -\log\bigl[1 + h\,\xi_{jh} + O(h^2)\bigr] = -h\,\xi_{jh} + O(h^2), \tag{3.4.11} \]

from which (3.4.7) follows immediately.

But from (3.4.10),

\[ 1 - T_{(j-1)h,jh} = [h\,\xi_{jh} + O(h^2)][1 + h\,\xi_{jh} + O(h^2)]^{-1}, \tag{3.4.12} \]

which implies that lim_{h→0, Nh=T} Σ_{j=1}^N log(1 - T_{(j-1)h,jh}) does not exist. Now

\[ (1 - T_{(j-1)h,jh})(1 - T^0_{(j-1)h,jh})^{-1} = [h\,\xi_{jh}+O(h^2)][1+h\,\xi^0_{jh}+O(h^2)][h\,\xi^0_{jh}+O(h^2)]^{-1}[1+h\,\xi_{jh}+O(h^2)]^{-1} = \xi_{jh}\,(\xi^0_{jh})^{-1}[1 + O(h)]. \tag{3.4.13} \]

We substitute (3.4.13) into the left-hand side of (3.4.8) to complete the proof.

Lemma 3.4.2:

If we define, for s ≤ t ≤ u,

\[ R_s(t,u) = C_s(t,u) - m_{x_s}(s,t)\,m_{x_s}(s,u), \tag{3.4.14} \]

where

\[ C_s(t,u) = E\{X(u)X(t)/(X(s) = x_s)\} \tag{3.4.15} \]

and m_{x_s}(s,t) is defined in (3.2.10), then

\[ R_s(t,u) = x_s\,\alpha(\rho^s - \rho^t)(1+\alpha\rho^s)\bigl[(1+\alpha\rho^t)(1+\alpha\rho^u)\bigr]^{-1}. \tag{3.4.16} \]

Proof of Lemma 3.4.2:

Using (3.2.26),

\[ C_s(t,u) = E\bigl\{X(t)\,E[X(u)/X(t) = x_t]\,/(X(s) = x_s)\bigr\} = (1+\alpha\rho^t)(1+\alpha\rho^u)^{-1}\,C_s(t,t). \tag{3.4.17} \]

And from Theorem 3.2.1,

\[ E\{[X(t)-x_s][X(t)-x_s-1]/(X(s) = x_s)\} = \sum_{j=0}^{\infty} j(j-1)\binom{x_s+j-1}{j}T_{s,t}^{\,x_s}(1-T_{s,t})^{j} = (x_s+1)\,x_s\,(1-T_{s,t})^2\,(T_{s,t})^{-2}. \tag{3.4.18} \]

Thus

\[ C_s(t,t) = (x_s+1)\,x_s\,(1-T_{s,t})^2(T_{s,t})^{-2} + (2x_s+1)\,E\{X(t)/(X(s) = x_s)\} - x_s(x_s+1) = x_s\,(T_{s,t})^{-2}\,(x_s + 1 - T_{s,t}). \tag{3.4.19} \]

Substitution of (3.4.19) into (3.4.17) gives

\[ C_s(t,u) = x_s\,(T_{s,t}\,T_{s,u})^{-1}\,(x_s + 1 - T_{s,t}). \tag{3.4.20} \]

Then using (3.4.20) in (3.4.14) yields

\[ R_s(t,u) = x_s\,(T_{s,t}\,T_{s,u})^{-1}\,(1 - T_{s,t}), \tag{3.4.21} \]

which is (3.4.16) upon substitution of (3.2.15) in (3.4.21).
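The closed form can be checked numerically against the negative-binomial variance of Theorem 3.2.1: at t = u, R_s(t,t) should equal x_s(1-T)T^{-2} with T = T_{s,t}. A small sketch (function names are illustrative, and the covariance formula below is the form recovered from (3.4.21)):

```python
def T_st(alpha, rho, s, t):
    """Success probability T_{s,t} = (1+alpha*rho^t)/(1+alpha*rho^s)."""
    return (1 + alpha * rho ** t) / (1 + alpha * rho ** s)

def R_s(x_s, alpha, rho, s, t, u):
    """Conditional covariance of Lemma 3.4.2:
    x_s*alpha*(rho^s - rho^t)*(1+alpha*rho^s) / ((1+alpha*rho^t)*(1+alpha*rho^u))."""
    return (x_s * alpha * (rho ** s - rho ** t) * (1 + alpha * rho ** s)
            / ((1 + alpha * rho ** t) * (1 + alpha * rho ** u)))

# at t = u this reduces to the negative-binomial variance x_s*(1-T)/T^2
x_s, alpha, rho, s, t = 40, 10.0, 0.8, 1.0, 3.0
T = T_st(alpha, rho, s, t)
var_nb = x_s * (1 - T) / T ** 2
```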


Lemma 3.4.3:

The stochastic integral

\[ I_1 = \operatorname*{l.i.m.}_{h\to 0,\ Nh=T}\ \sum_{j=1}^{N} X(jh)\,\xi_{jh}\,h \tag{3.4.22} \]

exists in the mean square sense for all (α, ρ) ∈ Ω. We shall write this limiting random variable

\[ I_1 = \int_0^T X(t)\,\xi_t\,dt. \tag{3.4.23} \]

Proof of Lemma 3.4.3:

Using a result from Cramér and Leadbetter [7], sufficient conditions for the existence of I_1 are that R_0(t,u) be of bounded variation in {I_{0,T} × I_{0,T}} and that

\[ J_1 = \int_0^T\!\!\int_0^T\xi_t\,\xi_u\,R_0(t,u)\,dt\,du \tag{3.4.24} \]

exist. Since, from (3.4.16),

\[ 0 \le R_0(t,u) \le x_0\,\alpha(1+\alpha), \tag{3.4.25} \]

the first of these conditions is satisfied. And since ξ_t is finite for any (α, ρ) ∈ Ω and all t ∈ I_{0,T}, the integral (3.4.24) exists. Thus the lemma is proved.

Lemma 3.4.4:

The stochastic integral

\[ I_2 = \operatorname*{l.i.m.}_{h\to 0,\ Nh=T}\ \sum_{j=1}^{N}(\log\xi_{jh})\,[X(jh) - X((j-1)h)] \tag{3.4.26} \]

exists in the mean square sense for all (α, ρ) ∈ Ω. We shall write this limiting random variable

\[ I_2 = \int_0^T(\log\xi_t)\,dX(t). \tag{3.4.27} \]

Proof of Lemma 3.4.4:

Again from Cramér and Leadbetter [7], sufficient conditions for the existence of I_2 are that R_0(t,u) be of bounded variation and that

\[ J_2 = \int_0^T\!\!\int_0^T(\log\xi_t)(\log\xi_u)\,d_t\,d_u[R_0(t,u)] \tag{3.4.28} \]

exist. The first of these has already been shown, and by differentiation of R_0(t,u), one has

\[ J_2 = x_0\int_0^T\!\!\int_0^T(\log\xi_t)(\log\xi_u)\,[\alpha\beta(1+\alpha)]^2\,\rho^{t+u}\,\bigl[(1+\alpha\rho^t)(1+\alpha\rho^u)\bigr]^{-2}\,dt\,du. \tag{3.4.29} \]

The integrand is finite in Ω and {I_{0,T} × I_{0,T}}, which completes the proof.

Proof of Theorem 3.4.1:

From equation (3.4.1),

\[ \log L[\alpha,\rho;\ (X_{jh};\ j \in J_{0,N})] = \sum_{j=1}^{N}\log\binom{X_{jh}-1}{X_{(j-1)h}-1} + \sum_{j=1}^{N} X_{(j-1)h}\,\log T_{(j-1)h,jh} + \sum_{j=1}^{N}(X_{jh}-X_{(j-1)h})\,\log\,[1 - T_{(j-1)h,jh}]. \tag{3.4.30} \]

The third term on the right-hand side of (3.4.30) has no limit in the mean square sense because of the first part of Lemma 3.4.1. We thus consider

\[ \log L[\alpha,\rho;(X_{jh};j\in J_{0,N})] - \log L[\alpha_0,\rho_0;(X_{jh};j\in J_{0,N})] = \sum_{j=1}^{N} X_{(j-1)h}\,(\log T_{(j-1)h,jh} - \log T^0_{(j-1)h,jh}) + \sum_{j=1}^{N}(X_{jh}-X_{(j-1)h})\,\log\bigl[(1-T_{(j-1)h,jh})(1-T^0_{(j-1)h,jh})^{-1}\bigr] \tag{3.4.31} \]

for any (α_0, ρ_0) ∈ Ω.

Using Lemmas 3.4.3 and 3.4.4, we have

\[ \log L^*[\alpha,\rho;\alpha_0,\rho_0;\ (X(t);\ t\in I_{0,T})] = \int_0^T X(t)\,(\xi^0_t - \xi_t)\,dt + \int_0^T(\log\xi_t - \log\xi^0_t)\,dX(t), \tag{3.4.32} \]

writing

\[ \log L^*[\alpha,\rho;\alpha_0,\rho_0;\ (X(t);\ t\in I_{0,T})] = \operatorname*{l.i.m.}_{h\to 0,\ Nh=T}\ \bigl\{\log L[\alpha,\rho;\ (X_{jh};\ j\in J_{0,N})] - \log L[\alpha_0,\rho_0;\ (X_{jh};\ j\in J_{0,N})]\bigr\}. \tag{3.4.33} \]

This completes the proof of the theorem. The idea of resolving the limit problem by considering the ratio of likelihood functions, as in (3.4.31), is from Bartlett [1].

By using the expression (3.3.12), we may rewrite (3.4.32) to obtain

\[ \log L^*[\alpha,\rho;\alpha_0,\rho_0;\ (X(t);\ t\in I_{0,T})] = \sum_{i=1}^{n} Z_{T,i}, \tag{3.4.34} \]

where

\[ Z_{T,i} = \int_0^T Y_i(t)\,(\xi^0_t - \xi_t)\,dt + \int_0^T(\log\xi_t - \log\xi^0_t)\,dY_i(t). \tag{3.4.35} \]

{Z_{T,i}; i ∈ J_{1,n}} is a sequence of independent, identically distributed random variables. We will be interested in

\[ \frac{\partial\log L^*}{\partial\alpha} = \sum_{i=1}^{n}\Bigl\{\int_0^T\bigl[\alpha^{-1} - \rho^t(1+\alpha\rho^t)^{-1}\bigr]\,dY_i(t) - \beta\int_0^T\rho^t(1+\alpha\rho^t)^{-2}\,Y_i(t)\,dt\Bigr\} \tag{3.4.36} \]

and

\[ \frac{\partial\log L^*}{\partial\rho} = \sum_{i=1}^{n}\Bigl\{-\beta^{-1}\rho^{-1}\int_0^T(1+\alpha\rho^t-\beta t)(1+\alpha\rho^t)^{-1}\,dY_i(t) + \alpha\rho^{-1}\int_0^T\rho^t(1+\alpha\rho^t-\beta t)(1+\alpha\rho^t)^{-2}\,Y_i(t)\,dt\Bigr\}. \tag{3.4.37} \]
0

Since L* is not the sum of probability densities, as is the usual case for likelihood functions, we will need the following

Lemma 3.4.5:

\[ E\Bigl\{\frac{\partial\log L^*}{\partial\alpha}\Bigr\} = E\Bigl\{\frac{\partial\log L^*}{\partial\rho}\Bigr\} = 0. \tag{3.4.38} \]

Proof:

From (3.4.36),

\[ E\Bigl\{\frac{\partial\log L^*}{\partial\alpha}\Bigr\} = \sum_{i=1}^{n}\Bigl\{\int_0^T\bigl[\alpha^{-1} - \rho^t(1+\alpha\rho^t)^{-1}\bigr]\,d_t m_1(0,t) - \beta\int_0^T\rho^t(1+\alpha\rho^t)^{-2}\,m_1(0,t)\,dt\Bigr\} = \sum_{i=1}^{n}\Bigl\{\alpha\beta(\alpha+1)\int_0^T\bigl[\alpha^{-1} - \rho^t(1+\alpha\rho^t)^{-1}\bigr]\rho^t(1+\alpha\rho^t)^{-2}\,dt - \beta(\alpha+1)\int_0^T\rho^t(1+\alpha\rho^t)^{-3}\,dt\Bigr\} = 0, \tag{3.4.39} \]

using the fact that (Cramér-Leadbetter [7])

\[ E\{dY_i(t)\} = d_t E\{Y_i(t)\} = d_t[m_1(0,t)]. \tag{3.4.40} \]

Similarly,

\[ E\Bigl\{\frac{\partial\log L^*}{\partial\rho}\Bigr\} = \sum_{i=1}^{n}\Bigl\{-\alpha\rho^{-1}(\alpha+1)\int_0^T(1+\alpha\rho^t-\beta t)\,\rho^t(1+\alpha\rho^t)^{-3}\,dt + \alpha\rho^{-1}(\alpha+1)\int_0^T(1+\alpha\rho^t-\beta t)\,\rho^t(1+\alpha\rho^t)^{-3}\,dt\Bigr\} = 0, \tag{3.4.41} \]

which completes the proof.

We are now ready to state and prove

Theorem 3.4.2:

The equations

\[ \frac{\partial\log L^*[\alpha,\rho;\alpha_0,\rho_0;\ (X(t);\ t\in I_{0,T})]}{\partial\alpha} = 0, \qquad \frac{\partial\log L^*[\alpha,\rho;\alpha_0,\rho_0;\ (X(t);\ t\in I_{0,T})]}{\partial\rho} = 0 \tag{3.4.42} \]

have a solution (α*, ρ*) which is weakly consistent for (α, ρ). Furthermore, the asymptotic distribution of [√n(α* - α); √n(ρ* - ρ)] as n → ∞ is normal, with null mean vector and variance-covariance matrix Σ*, where

\[ \Sigma^* = \begin{pmatrix}\sigma^*_{11} & \sigma^*_{12}\\ \sigma^*_{12} & \sigma^*_{22}\end{pmatrix}. \tag{3.4.43} \]

Now,

\[ \sigma^*_{11} = \Delta^{-2}(\omega_{22}^2\lambda_{11} + \omega_{12}^2\lambda_{22} - 2\omega_{22}\omega_{12}\lambda_{12}), \tag{3.4.44} \]

\[ \sigma^*_{22} = \Delta^{-2}(\omega_{12}^2\lambda_{11} + \omega_{11}^2\lambda_{22} - 2\omega_{11}\omega_{12}\lambda_{12}), \tag{3.4.45} \]

and

\[ \sigma^*_{12} = \Delta^{-2}\bigl[(\omega_{12}^2 + \omega_{11}\omega_{22})\lambda_{12} - \omega_{12}\omega_{22}\lambda_{11} - \omega_{11}\omega_{12}\lambda_{22}\bigr], \tag{3.4.46} \]

with, writing θ_1 = α and θ_2 = ρ,

\[ \omega_{ij} = -n^{-1}E\Bigl\{\frac{\partial^2\log L^*}{\partial\theta_i\,\partial\theta_j}\Bigr\}, \tag{3.4.47} \]

\[ \lambda_{ij} = n^{-1}E\Bigl\{\Bigl(\frac{\partial\log L^*}{\partial\theta_i}\Bigr)\Bigl(\frac{\partial\log L^*}{\partial\theta_j}\Bigr)\Bigr\}, \tag{3.4.48} \]

and

\[ \Delta = \omega_{11}\omega_{22} - \omega_{12}^2. \]

Proof:

Writing

\[ \log p^*_i[\alpha,\rho;\alpha_0,\rho_0;\ (Y_i(t);\ t\in I_{0,T})] = \int_0^T Y_i(t)\,(\xi^0_t - \xi_t)\,dt + \int_0^T(\log\xi_t - \log\xi^0_t)\,dY_i(t), \tag{3.4.49} \]

we have from (3.4.34) that

\[ \log L^*[\alpha,\rho;\alpha_0,\rho_0;\ (X(t);\ t\in I_{0,T})] = \sum_{i=1}^{n}\log p^*_i[\alpha,\rho;\alpha_0,\rho_0;\ (Y_i(t);\ t\in I_{0,T})]. \tag{3.4.50} \]

As mentioned prior to the statement of Lemma 3.4.5, we cannot regard {p*_i; i ∈ J_{1,n}} as a sequence of probability density functions. However, as we shall see, this is not essential to the proof of the theorem.

In the proofs given by Cramér [6] for the single-parameter case and Chanda [5] for the multiparameter case, the only property of the density function crucial to the proofs is

\[ E\Bigl\{\frac{\partial\log f}{\partial\theta}\Bigr\} = 0, \tag{3.4.51} \]

where f is the density function and θ some parameter of f. The property holds as soon as |∂f/∂θ| is bounded by an integrable function (over the sample space).

But in the present case we have, from Lemma 3.4.5,

\[ \sum_{i=1}^{n}E\Bigl\{\frac{\partial\log p^*_i}{\partial\theta_j}\Bigr\} = E\Bigl\{\frac{\partial\log L^*}{\partial\theta_j}\Bigr\} = 0, \tag{3.4.52} \]

which implies

\[ E\Bigl\{\frac{\partial\log p^*_i}{\partial\theta_j}\Bigr\} = 0, \tag{3.4.53} \]

for j ∈ J_{1,2}. Thus the assumption that p*_i is a probability density, and, in fact, the requirement that |∂p*_i/∂θ_j| be integrable (or summable in this case) are unnecessary.

We now must establish the three conditions given in the proof of Theorem 3.3.1 to complete the proof of this theorem.

(i) To see that log p*_i is differentiable up to the third order for almost all y_i, we need only note that ξ_t and log ξ_t are differentiable to that order; this is easily shown using (3.4.5).

(ii) As mentioned, we no longer need the summability condition for |∂p*_i/∂θ_j|. The additional requirement that |∂²p*_i/∂θ_j∂θ_k| be bounded by a summable function is used such that

\[ \sum_{y_i}\frac{\partial^2 p^*_i}{\partial\theta_j\,\partial\theta_k} = 0, \tag{3.4.54} \]

and hence

\[ E\Bigl\{\frac{\partial^2\log p^*_i}{\partial\theta_j\,\partial\theta_k}\Bigr\} = -E\Bigl\{\frac{\partial\log p^*_i}{\partial\theta_j}\,\frac{\partial\log p^*_i}{\partial\theta_k}\Bigr\}. \tag{3.4.55} \]

But we shall not make use of the property (3.4.55), so it will not be necessary to have |∂²p*_i/∂θ_j∂θ_k| bounded by a summable function. Thus, we need only establish the condition that |∂³ log p*_i/∂θ_j∂θ_k∂θ_l| < H_{j,k,l}(y_i), where E[H_{j,k,l}(y_i)] < M, some finite, positive constant.

Using (3.4.49),

\[ \frac{\partial^3\log p^*_i}{\partial\theta_j\,\partial\theta_k\,\partial\theta_l} = -\int_0^T\frac{\partial^3\xi_t}{\partial\theta_j\,\partial\theta_k\,\partial\theta_l}\,Y_i(t)\,dt + \int_0^T\Bigl[\xi_t^{-1}\frac{\partial^3\xi_t}{\partial\theta_j\,\partial\theta_k\,\partial\theta_l} - \xi_t^{-2}\Bigl(\frac{\partial\xi_t}{\partial\theta_j}\frac{\partial^2\xi_t}{\partial\theta_k\,\partial\theta_l} + \frac{\partial\xi_t}{\partial\theta_k}\frac{\partial^2\xi_t}{\partial\theta_j\,\partial\theta_l} + \frac{\partial\xi_t}{\partial\theta_l}\frac{\partial^2\xi_t}{\partial\theta_j\,\partial\theta_k}\Bigr) + 2\xi_t^{-3}\frac{\partial\xi_t}{\partial\theta_j}\frac{\partial\xi_t}{\partial\theta_k}\frac{\partial\xi_t}{\partial\theta_l}\Bigr]\,dY_i(t). \tag{3.4.56} \]

Since, for every (θ_1, θ_2) ∈ Ω, we can easily bound ξ_t and its derivatives used in (3.4.56) for all t ∈ I_{0,T}, we can find H_{j,k,l}(y_i) as soon as E{∫_0^T Y_i(t) dt} < ∞ and E{∫_0^T dY_i(t)} < ∞, which follows from the proof of Lemma 3.4.5.

(iii) The third condition required for Theorem 3.4.2 to be true is that J = ((J_{i,j}(θ_1, θ_2))) is positive definite and that |J| is finite, where

\[ J_{i,j}(\theta_1,\theta_2) = E\Bigl\{\frac{\partial\log p^*}{\partial\theta_i}\,\frac{\partial\log p^*}{\partial\theta_j}\Bigr\} = \lambda_{ij}. \tag{3.4.57} \]

From (3.4.36) and (3.4.37),

\[ n\lambda_{11} = \int_0^T\!\!\int_0^T\bigl[\alpha^{-1}-\rho^t(1+\alpha\rho^t)^{-1}\bigr]\bigl[\alpha^{-1}-\rho^u(1+\alpha\rho^u)^{-1}\bigr]\,d_t\,d_u R_0(t,u) + \beta^2\int_0^T\!\!\int_0^T\rho^{t+u}\bigl[(1+\alpha\rho^t)(1+\alpha\rho^u)\bigr]^{-2}R_0(t,u)\,dt\,du - 2\beta\int_0^T\!\!\int_0^T\bigl[\alpha^{-1}-\rho^t(1+\alpha\rho^t)^{-1}\bigr]\rho^u(1+\alpha\rho^u)^{-2}\,d_t R_0(t,u)\,du, \tag{3.4.58} \]

\[ n\lambda_{12} = -\beta^{-1}\rho^{-1}\int_0^T\!\!\int_0^T\bigl[\alpha^{-1}-\rho^t(1+\alpha\rho^t)^{-1}\bigr](1+\alpha\rho^u-\beta u)(1+\alpha\rho^u)^{-1}\,d_t\,d_u R_0(t,u) - \alpha\beta\rho^{-1}\int_0^T\!\!\int_0^T\rho^{t+u}(1+\alpha\rho^u-\beta u)\bigl[(1+\alpha\rho^t)(1+\alpha\rho^u)\bigr]^{-2}R_0(t,u)\,dt\,du + \int_0^T\!\!\int_0^T\Bigl\{\alpha\rho^{-1}\bigl[\alpha^{-1}-\rho^t(1+\alpha\rho^t)^{-1}\bigr]\rho^u(1+\alpha\rho^u-\beta u)(1+\alpha\rho^u)^{-2} + \rho^{-1}(1+\alpha\rho^t-\beta t)(1+\alpha\rho^t)^{-1}\rho^u(1+\alpha\rho^u)^{-2}\Bigr\}\,d_t R_0(t,u)\,du, \tag{3.4.59} \]

and

\[ n\lambda_{22} = \beta^{-2}\rho^{-2}\int_0^T\!\!\int_0^T(1+\alpha\rho^t-\beta t)(1+\alpha\rho^u-\beta u)\bigl[(1+\alpha\rho^t)(1+\alpha\rho^u)\bigr]^{-1}\,d_t\,d_u R_0(t,u) + \alpha^2\rho^{-2}\int_0^T\!\!\int_0^T\rho^{t+u}(1+\alpha\rho^t-\beta t)(1+\alpha\rho^u-\beta u)\bigl[(1+\alpha\rho^t)(1+\alpha\rho^u)\bigr]^{-2}R_0(t,u)\,dt\,du - 2\alpha\beta^{-1}\rho^{-2}\int_0^T\!\!\int_0^T(1+\alpha\rho^t-\beta t)(1+\alpha\rho^t)^{-1}\rho^u(1+\alpha\rho^u-\beta u)(1+\alpha\rho^u)^{-2}\,d_t R_0(t,u)\,du. \tag{3.4.60} \]

For all (α, ρ) ∈ Ω, R_0(t,u), given by Lemma 3.4.2, is finite, as are its derivatives. Thus λ_{ij} is the integral of a finite integrand over a finite interval, and so is finite. This completes the proof of the theorem.

Thus we have shown that (α*, ρ*) have the properties usually associated with maximum likelihood estimators, losing only the property that

\[ E\Bigl\{\frac{\partial^2\log p^*_i}{\partial\theta_j\,\partial\theta_k}\Bigr\} = -E\Bigl\{\frac{\partial\log p^*_i}{\partial\theta_j}\,\frac{\partial\log p^*_i}{\partial\theta_k}\Bigr\}, \tag{3.4.61} \]

that is to say,

\[ \omega_{jk} = \lambda_{jk}. \tag{3.4.62} \]

When this property holds, the form of Σ* is greatly simplified; since we cannot assume (3.4.62) here, calculation of Σ* is tedious. In addition to the λ_{ij}'s given by (3.4.58)-(3.4.60), we will need

\[ \omega_{11} = (\alpha+1)\beta\int_0^T\rho^t\bigl[\alpha(1+\alpha\rho^t)^4\bigr]^{-1}\,dt, \tag{3.4.63} \]

\[ \omega_{12} = (\alpha+1)\int_0^T(1+\alpha\rho^t)^{-2}\bigl[\rho^t(1+\alpha\rho^t)^{-1} - \beta t\,\rho^t(1+\alpha\rho^t)^{-2}\bigr]\,dt, \tag{3.4.64} \]

and

\[ \omega_{22} = \alpha(\alpha+1)\beta\int_0^T\rho^t\bigl[(1+\alpha\rho^t)^{-1} - \beta t(1+\alpha\rho^t)^{-2}\bigr]^2\,dt. \tag{3.4.65} \]

No closed form has been found for the solution to the estimation equations (3.4.36) and (3.4.37), nor could a more convenient form be found for the elements of Σ*. It is conjectured that iterative numerical methods can be used, but we leave this particular topic open to further research.

3.5 Arrival Time Observation

Assume that instead of observing the logistic process {X(t)} itself, we observe only the times at which the process increases; that is, the "arrival times" are observed. Observation is terminated when the time of the Nth increase has been recorded. Hence we will observe {T_j; j ∈ J_{1,N}}, where T_j is defined by

\[ x(t) < n+j \ \ \text{if}\ t < T_j, \qquad x(t) \ge n+j \ \ \text{if}\ t \ge T_j, \tag{3.5.1} \]

where x(0) = n.

The distribution of this sequence of random variables is easily obtained using Theorem 3.2.1. For any θ ≥ 0,

\[ P\{T_{j+k} - T_j \le \theta/(T_i = t_i;\ i \in J_{1,j})\} = P\{X(t_j+\theta) - X(t_j) \ge k/(t_i;\ i \in J_{1,j})\} = P\{X(t_j+\theta) - X(t_j) \ge k/C\}, \tag{3.5.2} \]

where C denotes the event

\[ x(t) = n \ \text{if}\ t < t_1; \quad x(t) = n+1 \ \text{if}\ t_1 \le t < t_2; \quad \dots; \quad x(t) = n+j-1 \ \text{if}\ t_{j-1} \le t < t_j; \quad x(t) = n+j \ \text{if}\ t = t_j. \tag{3.5.3} \]

Since {X(t)} is Markov, we have

\[ P\{T_{j+k} - T_j \le \theta/(T_i = t_i;\ i \in J_{1,j})\} = P\{X(t_j+\theta) - X(t_j) \ge k/(X(t_j) = n+j)\} = P\{X(t_j+\theta) \ge n+j+k/(X(t_j) = n+j)\} = \sum_{i=k}^{\infty}\binom{n+j+i-1}{i}\,T_{t_j,t_j+\theta}^{\,n+j}\,(1-T_{t_j,t_j+\theta})^{i}. \tag{3.5.4} \]

Since (3.5.4) is a function of t_j only, it follows that {T_j; j ∈ J_{1,N}} is a Markov sequence, and from (3.5.4) the distribution function is given by

\[ F_{T_{j+1}}(\theta; t_j) = P\{T_{j+1} \le t_j + \theta/(T_j = t_j)\} = 1 - (T_{t_j,t_j+\theta})^{n+j}, \tag{3.5.5} \]

requiring θ ≥ 0.

The approach used by Moran [22] for several different birth and death processes would be to calculate the likelihood function

\[ L(\alpha,\rho;\ t_1,\dots,t_N) = \prod_{j=1}^{N-1} f_{T_{j+1}}(\alpha,\rho;\ t_j), \tag{3.5.6} \]

where

\[ f_{T_{j+1}}(\theta; t_j) = \frac{d}{d\theta}\bigl[F_{T_{j+1}}(\theta; t_j)\bigr], \tag{3.5.7} \]

which exists in Ω since T_{j+1} > t_j, and thus F_{T_{j+1}}(θ; t_j) is absolutely continuous for θ > 0.

However, Moran's approach can be shown unsatisfactory for this case by observing that, from (3.5.5),

\[ \lim_{\theta\to\infty} F_{T_{j+1}}(\theta; t_j) = \lim_{\theta\to\infty}\bigl\{1 - [(1+\alpha\rho^{t_j+\theta})(1+\alpha\rho^{t_j})^{-1}]^{n+j}\bigr\} = 1 - (1+\alpha\rho^{t_j})^{-(n+j)}. \tag{3.5.8} \]

Hence there is a non-zero probability that, given T_j = t_j, T_{j+1} will not be finite. Thus E{T_{j+1}/(T_j = t_j)} does not exist, and the likelihood function given by (3.5.6) is not useful for estimating (α, ρ). Perhaps a conditional likelihood could be used to get estimation equations. Further study of the problem is needed.
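The defect in (3.5.8) is easy to evaluate numerically. The following sketch (an illustration; `escape_defect` is a hypothetical name) computes the probability that T_{j+1} is never observed, given T_j = t_j:

```python
def escape_defect(alpha, rho, t_j, n, j):
    """P{T_{j+1} = infinity | T_j = t_j} = (1 + alpha*rho^t_j)^-(n+j),
    the limit defect implied by (3.5.8)."""
    return (1.0 + alpha * rho ** t_j) ** (-(n + j))
```

The defect shrinks as n + j grows but is never zero, which is why E{T_{j+1}/(T_j = t_j)} fails to exist.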

3.6 Prediction

Assume that N equispaced observations will be made on the logistic growth process {X(t); t ∈ I_{0,∞}}, as in 3.3. Thus, based upon observations taken on {X(j); j ∈ J_{1,N}}, we will predict the value of the time series at time N + m, that is, X(N+m).

Restricting our attention to the predictor which minimizes the mean square error of prediction, we define

\[ \hat{x}_N^{(1)}(m) = E\{X(N+m)/[X(j) = x_j;\ j \in J_{1,N}]\} = E\{X(N+m)/(X(N) = x_N)\} = m_{x_N}(N,N+m) = x_N(1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1}, \tag{3.6.1} \]

using (3.2.10). Then \( \hat{x}_N^{(1)}(m) \) minimizes the mean square error of prediction, which is

\[ \mathrm{MSE}_1 = E\{[X(N+m) - \hat{x}_N^{(1)}(m)]^2\} = E\{R_{X_N}(N+m,N+m)\} = E\{X_N\,\alpha(\rho^N-\rho^{N+m})(1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-2}\} = n(1+\alpha)\,\alpha\rho^N(1-\rho^m)(1+\alpha\rho^{N+m})^{-2}, \tag{3.6.2} \]

where X(0) = n.
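The predictor (3.6.1) and its error (3.6.2) transcribe directly into code. The sketch below assumes known α and ρ; the function names are illustrative:

```python
def predictor(x_N, alpha, rho, N, m):
    """Minimum-MSE predictor of X(N+m) given X(N) = x_N, eq. (3.6.1)."""
    return x_N * (1 + alpha * rho ** N) / (1 + alpha * rho ** (N + m))

def mse1(n, alpha, rho, N, m):
    """Prediction MSE of (3.6.2):
    n*(1+a)*a*rho^N*(1-rho^m) / (1+a*rho^(N+m))^2."""
    return (n * (1 + alpha) * alpha * rho ** N * (1 - rho ** m)
            / (1 + alpha * rho ** (N + m)) ** 2)
```

Setting m = 0 recovers the observed value and zero error, and doubling n doubles MSE_1, in line with the proportionality to n noted in the observations that follow.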

If α and ρ are unknown, consider the predictor

\[ \hat{x}_N^{(2)}(m) = x_N(1+\hat\alpha\hat\rho^N)(1+\hat\alpha\hat\rho^{N+m})^{-1}. \tag{3.6.3} \]

Denoting the mean square error of \( \hat{x}_N^{(2)}(m) \) by MSE_2, it follows that

\[ \mathrm{MSE}_2 - \mathrm{MSE}_1 = E\{[\hat{x}_N^{(2)}(m) - \hat{x}_N^{(1)}(m)]^2\} = E\bigl\{X_N^2\bigl[(1+\hat\alpha\hat\rho^N)(1+\hat\alpha\hat\rho^{N+m})^{-1} - (1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1}\bigr]^2\bigr\}. \tag{3.6.4} \]

Define

\[ U = 1+\hat\alpha\hat\rho^{N}, \tag{3.6.5} \]
\[ V = 1+\hat\alpha\hat\rho^{N+m}, \tag{3.6.6} \]
\[ \delta_\alpha = \hat\alpha - \alpha, \tag{3.6.7} \]
\[ \delta_\rho = \hat\rho - \rho. \tag{3.6.8} \]

Then

\[ \delta_U = (1+\hat\alpha\hat\rho^N) - (1+\alpha\rho^N) = \hat\alpha\hat\rho^N - \alpha\rho^N = (\delta_\alpha+\alpha)(\delta_\rho+\rho)^N - \alpha\rho^N = \rho^N\delta_\alpha + N\alpha\rho^{N-1}\delta_\rho + O_p(n^{-1}), \tag{3.6.9} \]

making use of the fact that the maximum likelihood estimators are weakly consistent. Similarly,

\[ \delta_V = (1+\hat\alpha\hat\rho^{N+m}) - (1+\alpha\rho^{N+m}) = \rho^{N+m}\delta_\alpha + (N+m)\alpha\rho^{N+m-1}\delta_\rho + O_p(n^{-1}). \tag{3.6.10} \]

Hence

\[ UV^{-1} = [\delta_U + (1+\alpha\rho^N)][\delta_V + (1+\alpha\rho^{N+m})]^{-1} = (1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1} + (1+\alpha\rho^{N+m})^{-1}\delta_U - (1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-2}\delta_V + O_p(n^{-1}). \tag{3.6.11} \]

Now we can get an approximation for (3.6.4) by taking

\[ \mathrm{MSE}_2 - \mathrm{MSE}_1 = E\bigl\{X_N^2[UV^{-1} - (1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1}]^2\bigr\} \approx E\{X_N^2\}\,E\bigl\{[UV^{-1} - (1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1}]^2\bigr\} = C_0(N,N)(1+\alpha\rho^{N+m})^{-2}\,E\bigl\{[\delta_U - (1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1}\delta_V]^2\bigr\} = \rho^{2N}\,n\bigl[n+\alpha(1-\rho^N)(1+\alpha)^{-1}\bigr](1+\alpha)^2\bigl[(1+\alpha\rho^N)(1+\alpha\rho^{N+m})\bigr]^{-2}\Bigl\{\sigma_{11}\bigl[1-(1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1}\rho^m\bigr]^2 + 2\sigma_{12}\,\alpha\rho^{-1}\bigl[N-(2N+m)\rho^m(1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1} + (N+m)\rho^{2m}(1+\alpha\rho^N)^2(1+\alpha\rho^{N+m})^{-2}\bigr] + \sigma_{22}\,\alpha^2\rho^{-2}\bigl[N-(N+m)(1+\alpha\rho^N)(1+\alpha\rho^{N+m})^{-1}\rho^m\bigr]^2\Bigr\}. \tag{3.6.12} \]

Perhaps (3.6.12) is only a crude approximation for (3.6.4), but inspection of MSE_1 (3.6.2) and the approximation for MSE_2 - MSE_1 (3.6.12) leads to several interesting observations:

1. MSE_1 is proportional to ρ^N; that is, MSE_1 decreases exponentially as N increases. This is intuitively plausible, since the correlation between X(N) and X(N+m) is given by

\[ \mathrm{Corr}[X(N),X(N+m)] = R_0(N,N+m)\bigl[R_0(N,N)\,R_0(N+m,N+m)\bigr]^{-1/2} = \bigl[(1-\rho^N)(1-\rho^{N+m})^{-1}\bigr]^{1/2}. \tag{3.6.13} \]

Hence the correlation is a monotonically increasing function of N, and is near one when N is large; thus the decrease in prediction error.

2. MSE1 is proportional to n. Thus, contrary to the error

associated with the maximum likelihood estimators of the logistic param-

eters, the error of prediction increases with the initial size of the

process. This is perhaps partially explained by the fact that the

process variance at any time is proportional to n (Lemma 3.4.2);

heuristically, the population or process size is less deterministic for

large populations.
3. MSE_2 - MSE_1 is proportional to ρ^{2N}. Hence the additional prediction error incurred by parameter estimation also decreases as N increases, and at a faster rate than MSE_1.

4. MSE_2 - MSE_1 is proportional to n.

Observation (4) is surprising if true; but it depends most directly upon the accuracy of approximation (3.6.12). Thus, though this may be just another of the logistic process' strange characteristics, one must question the credibility of (3.6.12). Rather, one expects the increase in prediction error to be of order one, following the general pattern that a decrease in estimator variance produces a corresponding decrease in the ratio of MSE_2 - MSE_1 to MSE_1.

CHAPTER IV

AN EMPIRICAL COMPARISON

4.1 Introduction

Rhodes [25] has given a method of estimating the logistic

parameters which receives wide application. We present his method in

this chapter, with an attempt to examine the estimators' sampling prop-

erties. These shall be shown to be analytically intractable; thus

a Monte Carlo experiment is performed in order to compare these esti-

mators with those presented in the last chapter. Finally, the logistic

model will be fitted to the population of conterminous United States,

using both Rhodes' and maximum likelihood estimation procedures.

4.2 Rhodes' Estimators

Recall that the mean function for the logistic growth process, given by (3.2.12), is

\[ E\{X(t)/X(0) = n\} = m_n(0,t) = n(1+\alpha)(1+\alpha\rho^t)^{-1}. \tag{3.2.12} \]

Upon inspection, it is seen that

\[ [m_n(0,t+1)]^{-1} = \rho[m_n(0,t)]^{-1} + (1-\rho)[n(1+\alpha)]^{-1}. \tag{4.2.1} \]

Rhodes thus defines

\[ Y(t) = [X(t)]^{-1} \tag{4.2.2} \]

and, observing the random variables {Y(j) = Y_j; j ∈ J_{1,N}}, he finds estimators for ρ and γ in

\[ Y_{t+1} = \rho Y_t + \gamma + V_t, \qquad t \in J_{1,N-1}, \tag{4.2.3} \]

where

\[ \gamma = (1-\rho)[n(1+\alpha)]^{-1} \tag{4.2.4} \]

and y_t is the observed value of Y_t.

The estimators, obtained by minimizing

\[ S = \sum_{t=1}^{N-1}[y_{t+1} - (\rho y_t + \gamma)]^2, \tag{4.2.5} \]

are

\[ \hat{\rho}_r = \Bigl[\sum_{t=1}^{N-1}(y_t - \bar{y}_1)(y_{t+1} - \bar{y}_2)\Bigr]\Bigl[\sum_{t=1}^{N-1}(y_t - \bar{y}_1)^2\Bigr]^{-1} \tag{4.2.6} \]

and

\[ \hat{\gamma}_r = \bar{y}_2 - \hat{\rho}_r\,\bar{y}_1, \tag{4.2.7} \]

where

\[ \bar{y}_1 = (N-1)^{-1}\sum_{t=1}^{N-1} y_t \tag{4.2.8} \]

and

\[ \bar{y}_2 = (N-1)^{-1}\sum_{t=1}^{N-1} y_{t+1}. \tag{4.2.9} \]

Using (4.2.4), one gets an estimate for α:

\[ \hat{\alpha}_r = (1-\hat{\rho}_r)(n\hat{\gamma}_r)^{-1} - 1. \tag{4.2.10} \]
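Equations (4.2.5)-(4.2.10) amount to an ordinary least-squares fit of y_{t+1} on y_t. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def rhodes_estimates(x, n):
    """Rhodes' estimators (4.2.6)-(4.2.10): least squares on the
    reciprocal recursion y_{t+1} = rho*y_t + gamma, with y_t = 1/x_t."""
    y = 1.0 / np.asarray(x, dtype=float)
    y1, y2 = y[:-1], y[1:]                       # y_t and y_{t+1}
    rho = (np.sum((y1 - y1.mean()) * (y2 - y2.mean()))
           / np.sum((y1 - y1.mean()) ** 2))     # (4.2.6)
    gamma = y2.mean() - rho * y1.mean()          # (4.2.7)
    alpha = (1.0 - rho) / (n * gamma) - 1.0      # (4.2.10)
    return alpha, rho

# exact mean sequence m_n(0,t) for n = 40, alpha = 10, rho = 0.8
x = [40 * (1 + 10.0) / (1 + 10.0 * 0.8 ** t) for t in range(1, 31)]
a_r, p_r = rhodes_estimates(x, 40)
```

Applied to the exact mean sequence, the fit recovers (α, ρ) essentially exactly, since the reciprocal means satisfy (4.2.1) without error.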

Rhodes then shows by application that one obtains reasonably close agreement between projection and reality, using these estimated values. To my knowledge no probabilistic study of the properties of these estimators has been made.
4.3 Sampling Properties of Rhodes' Estimators

Define, for 0 ≤ s ≤ t and u > 0,

\[ \beta_n(s,t) = E\{Y(t)/Y(s) = n^{-1}\}, \tag{4.3.1} \]

\[ g_n(s,t) = E\{Y^2(t)/Y(s) = n^{-1}\}, \tag{4.3.2} \]

and

\[ \eta_n(s,t,t+u) = E\{Y(t+u)Y(t)/Y(s) = n^{-1}\}. \tag{4.3.3} \]

Now fix s, t and u, and write β_n, g_n and η_n for (4.3.1), (4.3.2) and (4.3.3), respectively. We now prove
Theorem 4.3.1:

(i)

\[ \beta_n = v^n\log T - \sum_{j=1}^{n-1} j^{-1}v^{n-j}; \tag{4.3.4} \]

(ii)

\[ g_n = \tfrac{1}{12}v^n\bigl[3[\cos^{-1}(1-T)]^2 - 6\pi\cos^{-1}(1-T) + 2\pi^2\bigr] - \sum_{j=1}^{n-1}\Bigl[j^{-1}v^{n}\log T - j^{-1}\sum_{k=1}^{j-1}k^{-1}v^{n-k}\Bigr]; \tag{4.3.5} \]

and

(iii)

\[ \eta_{n+1} = v[\eta_n - n^{-1}\beta^*_n], \tag{4.3.6} \]

writing

\[ T = (1+\alpha\rho^t)(1+\alpha\rho^s)^{-1} \tag{4.3.7} \]

and

\[ \beta^*_n = \beta_n(s,t+u), \tag{4.3.8} \]

and defining

\[ v = T(T-1)^{-1}. \tag{4.3.9} \]

Proof:

It follows from Theorem 3.2.1 that

\[ P\{Y(t) = j^{-1}/Y(s) = n^{-1}\} = P\{X(t) = j/X(s) = n\} = \binom{j-1}{n-1}T^n(1-T)^{j-n} \quad \text{if}\ j = n, n+1, \dots, \ \text{and zero otherwise.} \tag{4.3.10} \]

(i) Thus

\[ \beta_n = \sum_{j=n}^{\infty} j^{-1}\binom{j-1}{n-1}T^n(1-T)^{j-n} = \sum_{j=0}^{\infty}(j+n)^{-1}\binom{j+n-1}{n-1}T^n(1-T)^{j}, \tag{4.3.11} \]

and

\[ \beta_{n+1} = \sum_{j=n+1}^{\infty} j^{-1}\binom{j-1}{n}T^{n+1}(1-T)^{j-n-1} = n^{-1}v\sum_{j=0}^{\infty}\bigl[n(j+n)^{-1} - 1\bigr]\binom{j+n-1}{n-1}T^n(1-T)^{j} = v(\beta_n - n^{-1}). \tag{4.3.12} \]

Notice that

\[ \beta_1 = E\{Y(t)/Y(s) = 1\} = \sum_{j=1}^{\infty} j^{-1}T(1-T)^{j-1} = v\log T. \tag{4.3.13} \]

Now, defining the generating function

\[ P_\beta(z) = \sum_{n=1}^{\infty}\beta_n z^n, \tag{4.3.14} \]

we have, using (4.3.12) and (4.3.13),

\[ P_\beta(z) = \beta_1 z + vz\,P_\beta(z) - vz\sum_{n=1}^{\infty} n^{-1}z^n. \tag{4.3.15} \]

Rearranging,

\[ P_\beta(z) = (1 - vz)^{-1}vz\,[\log T + \log(1-z)], \tag{4.3.16} \]

which implies that

\[ \beta_n = v^n\log T - \sum_{j=1}^{n-1} j^{-1}v^{n-j}, \tag{4.3.17} \]

as was to be shown.

(ii) Similarly,

\[ g_n = \sum_{j=0}^{\infty}(j+n)^{-2}\binom{j+n-1}{n-1}T^n(1-T)^j, \]

and

\[ g_{n+1} = \sum_{j=0}^{\infty}(j+n+1)^{-2}\binom{j+n}{n}T^{n+1}(1-T)^{j} = n^{-1}v\sum_{j=0}^{\infty}(j+n)^{-1}\bigl[n(j+n)^{-1} - 1\bigr]\binom{j+n-1}{n-1}T^n(1-T)^j = v(g_n - n^{-1}\beta_n). \tag{4.3.18} \]

Noticing that, from a result in Davis [8],

\[ g_1 = E\{Y^2(t)/Y(s) = 1\} = \sum_{j=1}^{\infty} j^{-2}T(1-T)^{j-1} = \tfrac{1}{12}v\bigl[3[\cos^{-1}(1-T)]^2 - 6\pi\cos^{-1}(1-T) + 2\pi^2\bigr], \tag{4.3.19} \]

we define the generating function

\[ P_g(z) = \sum_{n=1}^{\infty} g_n z^n. \tag{4.3.20} \]

It then follows from (4.3.18) that

\[ P_g(z) = g_1 z + vz\Bigl[P_g(z) - \sum_{n=1}^{\infty} n^{-1}\beta_n z^n\Bigr], \tag{4.3.21} \]

which implies that

\[ g_n = v^{n-1}g_1 - \sum_{j=1}^{n-1} j^{-1}v^{n-j}\beta_j. \tag{4.3.22} \]

Upon substitution of (4.3.19) and (4.3.4), this is (4.3.5).

(iii) From (4.3.3),

\[ \eta_n = \sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\bigl[(j+n)(k+j+n)\bigr]^{-1}P\{X(t+u) = k+j+n/X(t) = j+n\}\,P\{X(t) = j+n/X(s) = n\}, \tag{4.3.23} \]

and

\[ \eta_{n+1} = \sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\bigl[(j+n+1)(k+j+n+1)\bigr]^{-1}P\{X(t+u) = k+j+n+1/X(t) = j+n+1\}\,P\{X(t) = j+n+1/X(s) = n+1\} = n^{-1}v\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}(k+j+n)^{-1}\bigl[n(j+n)^{-1} - 1\bigr]P\{X(t+u) = k+j+n/X(t) = j+n\}\,P\{X(t) = j+n/X(s) = n\} = v[\eta_n - n^{-1}\beta^*_n], \tag{4.3.24} \]

as was to be shown. This completes the proof of Theorem 4.3.1.

The reason generating functions were not employed to further resolve (4.3.24) is that no closed form could be found for η_1.

It is not difficult to use Theorem 4.3.1 to obtain approximations for the first order sampling properties of Rhodes' estimators. The expressions are tedious, the computing process was extremely time consuming (and hence expensive), and the few numerical results obtained were nonsensical. This is probably due to rounding errors resulting from the great number of calculations involved, or perhaps from the inaccuracy of approximations made in order to obtain these expected values.

Second order properties of Rhodes' estimators would involve fourth order moments of {Y(t)}, which are intractable. However, one can perform Monte Carlo experiments to at least compare the estimators for specific processes.

4.4 Monte Carlo Experiments

A computer program was written for the IBM 360 which

(1) simulates a logistic growth process given n, α, ρ and N;

(2) uses this sequence to calculate the maximum likelihood estimates given in 3.3, using the Newton-Raphson iterative procedure;

(3) calculates Σ̂, the asymptotic variance-covariance matrix for these estimators;

(4) calculates Rhodes' estimates for α and ρ, using the method given in 4.2.
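Step (1) follows from Theorem 3.2.1: the one-step transition is negative binomial with success probability T_{t-1,t} = (1+αρ^t)(1+αρ^{t-1})^{-1}. A minimal sketch of such a simulator, assuming unit-spaced observations (illustrative names; this is not the IBM 360 program itself):

```python
import numpy as np

def simulate_logistic(n, alpha, rho, N, rng):
    """Simulate X(1), ..., X(N) of the logistic growth process at unit
    spacing, using the negative-binomial transition of Theorem 3.2.1
    with success probability T_{s,t} = (1+alpha*rho^t)/(1+alpha*rho^s)."""
    x = [n]
    for t in range(1, N + 1):
        T = (1 + alpha * rho ** t) / (1 + alpha * rho ** (t - 1))
        # negative_binomial returns the number of failures before
        # x[-1] successes, i.e. the increment X(t) - X(t-1)
        x.append(x[-1] + rng.negative_binomial(x[-1], T))
    return x[1:]
```

Each draw adds a nonnegative increment, so the path is nondecreasing with conditional mean x/T, matching m_n(0,t) = n(1+α)(1+αρ^t)^{-1}.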

The results of several of the experiments are presented. The format is

as follows: first, a table is given which lists the generated logistic

sequence and the sequence of expected values. This table is followed

by a graph depicting both the generated sequence and the expected

sequence. Finally, a second table gives the maximum likelihood esti-

mates and the corresponding asymptotic variance-covariance matrix,

using the first third, first two-thirds, and all of the generated

sequence, respectively. Rhodes' estimators are also calculated for

each case.

Three such experiments are presented, each with a different value of n and α. We have conducted several other experiments, and those presented here are exemplary. Changes in ρ do not significantly affect the results, except that rounding errors in the computations affected the maximum likelihood estimates for small ρ. These cases are practically uninteresting, though, since for small ρ the process quickly approaches an asymptotic value, a characteristic uncommon to most economic and other real sequences.

Note that Rhodes' estimates are unreliable when N is small, whereas the effect of N on the maximum likelihood estimates is negligible. In addition, we could obtain asymptotic confidence intervals for the parameters, using the maximum likelihood estimates and Σ̂. We also note that the bias in α̂ seems to depend primarily on the relationship

Table 4.1

Generated and Expected Sequence for n = 40, α = 10.0, and ρ = 0.8

  i   x_i   m_n(0,i)      i   x_i   m_n(0,i)
  1    55    59.46       16   267   359.13
  2    58    71.90       17   276   372.84
  3    72    86.34       18   291   384.58
  4    84   102.88       19   296   394.52
  5    94   121.50       20   302   402.84
  6   110   142.07       21   307   409.76
  7   128   164.32       22   309   415.48
  8   144   187.86       23   313   420.16
  9   164   212.18       24   314   423.98
 10   176   236.69       25   318   427.09
 11   192   260.79       26   319   429.61
 12   211   283.92       27   321   431.65
 13   225   305.60       28   321   433.30
 14   244   325.48       29   324   434.62
 15   262   343.35       30   326   435.68

[Figure: plot of the observed (o) and expected (*) values of X(t) against t, for t = 0 to 30.]

Fig. 4.1. Generated and Expected Sequence for n = 40, α = 10.0, and ρ = 0.8

Table 4.2

Maximum Likelihood Estimates and Rhodes' Estimates for Data in Table 4.1
for n = 40, α = 10.0 and ρ = 0.8

N = 10:   α̂ = 5.909,  ρ̂ = 0.794
          Σ̂ = ( 81.242    0.603   ;  0.603    0.00493 )
          α̂_r = 69.464,  ρ̂_r = 0.873

N = 20:   α̂ = 7.134,  ρ̂ = 0.810
          Σ̂ = (  2.801   -0.0170  ; -0.0170   0.182×10⁻³ )
          α̂_r = 9.747,  ρ̂_r = 0.851

N = 30:   α̂ = 7.045,  ρ̂ = 0.808
          Σ̂ = (  3.247   -0.0162  ; -0.0162   0.112×10⁻³ )
          α̂_r = 8.290,  ρ̂_r = 0.845

Table 4.3

Generated and Expected Sequence for n = 100, α = 5.0, and ρ = 0.8

  i   x_i   m_n(0,i)      i   x_i   m_n(0,i)
  1   146   142.86       16   645   539.28
  2   179   168.54       17   655   550.42
  3   225   196.85       18   668   559.67
  4   265   227.41       19   677   567.30
  5   304   259.66       20   682   573.55
  6   339   292.89       21   686   578.65
  7   386   326.29       22   691   582.80
  8   425   359.05       23   696   586.16
  9   472   390.40       24   697   588.88
 10   509   419.73       25   698   591.07
 11   534   446.56       26   700   592.83
 12   566   470.63       27   702   594.25
 13   591   491.84       28   702   595.39
 14   607   510.24       29   702   596.31
 15   629   525.98       30   703   597.04

[Figure: plot of the observed (o) and expected (*) values of X(t) against t, for t = 0 to 30.]

Fig. 4.2. Generated and Expected Sequence for n = 100, α = 5.0, and ρ = 0.8