On the maximum chi-squared test for white noise


Material Information

Title: On the maximum chi-squared test for white noise
Author: Sincich, Terence L. (Terence Leigh), 1953-
Physical Description: viii, 216 leaves ; 28 cm.
Subjects / Keywords: Chi-square test (lcsh); Statistical hypothesis testing (lcsh); bibliography, theses, non-fiction (marcgt)
Thesis (Ph.D.)--University of Florida, 1980.
Includes bibliographical references (leaves 213-215).
Statement of Responsibility: by Terence L. Sincich.

Record Information

Source Institution: University of Florida
Rights Management: All applicable rights reserved by the source institution and holding location.
Resource Identifier: aleph 000099733; notis AAL5191; oclc 07119781










ACKNOWLEDGEMENTS

I am indebted to Dr. James T. McClave, not only for his patience

and encouragement while guiding me through this dissertation, but also

for the invaluable friendship he provided during my course of graduate

study. For their helpful suggestions towards this research, I wish to

thank Dr. John G. Saw, Dr. Mark C. Yang, Dr. Andre Khuri, Dr. Ramon C.

Littell and Dr. Andrew Rosalsky. Also, I am most appreciative of Dr.

Dennis D. Wackerly and Dr. Richard L. Scheaffer for their understanding

and assistance in matters related to my graduate program.

Finally, I especially thank my typist, Cecily Noble, for her

amazing ability to transform the raw text, bulky equations and lengthy

tables handed her into immaculate typed copy.

ACKNOWLEDGEMENTS ................................................. iii

ABSTRACT .......................................................... vi

I INTRODUCTION ................................................. 1

1.1 The Problem ............................................. 1
1.2 Various Tests ........................................... 4
1.2.1 The Durbin-Watson Bounds Test .................... 4
1.2.2 Beta Approximation ............................... 9
1.2.3 dU Approximation ................................. 10
1.2.4 BLUS Estimators .................................. 11
1.2.5 The Sign Test .................................... 13
1.2.6 Periodogram Based Tests .......................... 14
1.3 Concluding Note ......................................... 16

2.1 Power Comparisons ....................................... 17
2.1.1 Exact d Test Versus Approximate Tests of
      Serial Correlation ............................... 17
2.1.2 d Test Versus BLUS Test .......................... 18
2.1.3 The Durbin-Watson d Test Versus Non-first
      Order Alternative Tests .......................... 19
2.2 The Alternative Error Model ............................. 20

3.1 Box and Pierce 'Portmanteau' Test ....................... 22
3.2 A New Technique: The Max-χ² Procedure ................... 25
3.3 Asymptotic Distribution of the Test Statistics .......... 30
3.3.1 Preliminaries .................................... 30
3.3.2 General Case ..................................... 31
3.3.3 Special Cases .................................... 42
      Model 1: the Case J=2, K=2 ....................... 49
      Model 2: the Case J=2, K=4 ....................... 53
      Approximate Asymptotic Power of the Box-Pierce Test 55
      Approximate Asymptotic Power of the Max-χ² Test ... 60

AND D TESTS ..................................................... 71

4.1 Monte Carlo Simulations ................................. 71
4.1.1 Observable Residuals: No Regression .............. 74
4.1.2 Estimated Residuals After Regression ............. 95
4.2 Power Approximations for Large n ........................ 113
4.2.1 Case: J=2, K=4 ................................... 114
4.2.2 Case: J=3, K=4 ................................... 115
4.2.3 Case: J=4, K=4 ................................... 116
4.2.4 Approximation Results ............................ 117
4.2.5 A Note on the Taylor-Series Approximation ........ 128
4.3 Summary ................................................. 131

5.1 Asymptotic Powers for Large n ........................... 133
5.2 Hodges and Lehmann Asymptotic Relative Efficiency ....... 142
5.3 Likelihood Ratio ........................................ 153
5.3.1 General Case ..................................... 154
5.3.2 Special Case: J=5, K=2 ........................... 156
5.3.3 Special Case: J=K ................................ 158

VI CONCLUSION ................................................... 164

6.1 Concluding Remarks ...................................... 164
6.2 Future Research ......................................... 165

BIBLIOGRAPHY .................................................... 213

BIOGRAPHICAL SKETCH ............................................. 216

Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy



Terence L. Sincich
August, 1980

Chairman: James T. McClave
Major Department: Statistics

Consider the general regression model

Y_t = β_0 + β_1 X_1t + β_2 X_2t + . . . + β_g X_gt + Z_t ,   t = 1, 2, . . ., n,

where the dependent variable Y_t and the independent variables X_1t, X_2t, . . .,

X_gt are observed and the vector of model parameters β' = (β_0, β_1, . . .,

β_g) is unknown. The usual assumption on the unobservable errors {Z_t} is

that Z_1, Z_2, . . ., Z_n are independent with zero mean and constant

variance. When this assumption is violated, the application of the least

squares regression method will often lead to erroneous inferences on the

elements of B. In order to avoid this problem, a test for independence of

the errors is essential.

Many testing procedures have been proposed for this problem and a

summary of the most widely used tests is given. However, these tests

attain high power only when the regression errors follow the first order

autoregressive process

Z_t = φ Z_{t-1} + ε_t ,

where |φ| < 1 and {ε_t} is white noise. We consider two tests, based on

the vector of sample autocorrelations, which are designed to have high

power against a more general autoregressive alternative: the Box-Pierce

test and a new procedure called the max-χ² test. An extensive Monte Carlo

simulation study is undertaken in order to compare the powers of the tests

under the first order lag J autoregressive alternative

Z_t = φ Z_{t-J} + ε_t ,

where J is not necessarily 1. This particular model is chosen because it is

one which is near white noise, and hence one for which many testing pro-

cedures will fail to detect the presence of serially correlated errors.

A graphical display of the simulated powers of the tests shows that

although both tests attain a high power in this instance, the max-χ²

test clearly outperforms the Box-Pierce procedure.
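For concreteness, both competing statistics are functions of the sample autocorrelations r_k of the series. A minimal Python sketch of that computation and of the Box-Pierce statistic Q = n Σ_{k=1}^K r_k² follows; the lag J = 3, φ = 0.6 alternative and all numerical settings are invented for the illustration, and the max-χ² statistic, developed later in the dissertation, is not reproduced here.

```python
import numpy as np

def sample_autocorrelations(z, K):
    """Lag 1..K sample autocorrelations r_k of a mean-corrected series."""
    z = np.asarray(z, dtype=float) - np.mean(z)
    denom = np.sum(z * z)
    return np.array([np.sum(z[k:] * z[:-k]) / denom for k in range(1, K + 1)])

def box_pierce_Q(z, K):
    """Box-Pierce portmanteau statistic Q = n * sum_{k=1}^K r_k^2,
    referred to a chi-squared(K) distribution under white noise."""
    r = sample_autocorrelations(z, K)
    return len(z) * float(np.sum(r ** 2))

rng = np.random.default_rng(0)
white = rng.standard_normal(500)

# Lag J = 3 alternative Z_t = 0.6 Z_{t-3} + e_t (arbitrary illustration)
z = np.zeros(500)
for t in range(3, 500):
    z[t] = 0.6 * z[t - 3] + rng.standard_normal()

Q_white = box_pierce_Q(white, 4)   # modest under the null
Q_ar = box_pierce_Q(z, 4)          # large under the lag-3 alternative
```

Under the alternative, the large lag-3 autocorrelation drives Q far above typical chi-squared(4) values.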

The two tests are also compared to the Durbin-Watson d test for several

least squares regression models. Since the d test is optimal when J=1

in the alternative error model, it is recommended as the initial test

to apply in practice. However, the d test performs poorly for J>1 and

should be supplemented by another test which has high power in this case.

Power simulation results indicate that the max-χ² test would be the better

supplementary test.

Since the problem of analytically deriving the exact powers of the

tests is intractable even for simple alternative error models, asymptotic

powers are considered. Approximations to the asymptotic powers of the

Box-Pierce and max-χ² tests are discussed and their performance evaluated.

Comparisons of the Box-Pierce and max-χ² tests are also made with

respect to Hodges-Lehmann efficiency and deficiency and with respect to

likelihood ratio. Although the asymptotic relative efficiency of the


tests is 1, the Box-Pierce test is shown to be asymptotically deficient

when compared to the max-χ² test. A derivation of the likelihood ratio

statistic for special cases of the problem reveals that the max-χ² test

is asymptotically equivalent to the likelihood ratio test.




1.1 The Problem

Consider the general regression model

Y_t = β_0 + β_1 X_1t + β_2 X_2t + . . . + β_g X_gt + Z_t

where the dependent variable, Y_t, and the independent variables, X_1t,

. . ., X_gt, are observed and the random error, Z_t, is unobserved, t = 1,

2, . . ., n. In most practical cases, the vector of model parameters,

β' = (β_0, β_1, . . ., β_g), will be unknown. The researcher who is interested

in estimating the parameters for prediction purposes usually applies the

least squares regression method. The least squares model can be written

as follows:

Ŷ_t = β̂_0 + β̂_1 X_1t + β̂_2 X_2t + . . . + β̂_g X_gt   (1.1.1)

where β̂' = (β̂_0, β̂_1, . . ., β̂_g) is the vector of parameter estimates and

Ŷ_t is the predicted value of the dependent variable for the particular

set of independent variables (X_1t, X_2t, . . ., X_gt) observed.

Before utilizing the least squares prediction model, the researcher

often wishes to make inferences about the components of B. To ensure the

validity of these inferences, the following assumptions must be made:

(1) The error terms (Zt's) have zero mean and equal variance,

i.e., Z_t ~ (0, σ²).

(2) If the independent variables in the model are random, then the

error terms are distributed independently of these independent

variables, i.e., Zt is distributed independently of the Xt's.

(3) Successive errors are uncorrelated.

When autoregressive schemes or stochastic difference equations are

used in the regression model, lagged values of the dependent variable

occur as independent variables. These types of models violate assumption

(2), and will be excluded from further consideration in this research.

For models which do not contain lagged values of the dependent variables

as regressors, the independent variables can be regarded as fixed, known

constants, even if they are in fact random variables. All inferences are

then conditional on the observed values of the independent variables.

Of importance, then, is determining whether the model under consideration

is in violation of assumption (3). When this assumption is violated, as

is often the case in the analysis of time series data and in econometric

modeling, complications arise. In order to grasp the severity of the

problem, let us consider the following:

Rewrite the regression model (1.1.1) in matrix form as

Y = Xβ + Z ,   (1.1.2)

where Y is (n×1), X is (n×g), β is (g×1), and Z is (n×1).

Let us write assumption (3) as Z ~ (0, σ²I), or, more strongly, Z ~ N(0, σ²I).

Then E(Y) = Xβ and E{(Y − Xβ)(Y − Xβ)'} = σ²I. Applying the method of maximum

likelihood to the density of Y, the least squares estimates of the unknown

model parameters (β) are obtained and are given by the well-known formula

β̂ = (X'X)⁻¹X'Y. Under the assumption that Z ~ N(0, σ²I), the vector β̂ has

the following properties:

(a) E(β̂) = β

(b) E{(β̂ − β)(β̂ − β)'} = σ²(X'X)⁻¹

(c) β̂ is the BLUE (best linear unbiased estimator) for β (Gauss-Markov Theorem).

(d) β̂ has minimum variance among all LUE for β (Gauss-Markov Theorem).

(e) β̂ is a maximum likelihood estimate.

(f) β̂ ~ N{β, σ²(X'X)⁻¹}.
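The formula β̂ = (X'X)⁻¹X'Y and property (b) are easy to exercise numerically; a hedged sketch (the design matrix, parameter values, and sample size are invented for the example, and σ² is treated as known rather than estimated):

```python
import numpy as np

rng = np.random.default_rng(1)
n, g = 200, 2
# Fixed design: intercept column plus g regressors (invented values)
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, (n, g))])
beta = np.array([1.0, 2.0, -0.5])
sigma2 = 0.25
Y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), n)   # Z ~ N(0, sigma^2 I)

# Least squares / maximum likelihood estimate: beta_hat = (X'X)^{-1} X'Y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y

# Property (b): the covariance of beta_hat is sigma^2 (X'X)^{-1}
cov_beta_hat = sigma2 * XtX_inv
```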

Suppose now that assumption (3) is violated and that Z ~ N{0, Σ},

where Σ = σ²Ψ, a general covariance matrix. Then E(Y) = Xβ and

E{(Y − Xβ)(Y − Xβ)'} = Σ = σ²Ψ. In order to obtain estimates of the elements

of β, the maximum likelihood procedure is again applied to the density of

Y. This yields the estimator β̃ = (X'Ψ⁻¹X)⁻¹X'Ψ⁻¹Y, sometimes called the

Markov estimator. From the Gauss-Markov Theorem, it can be shown that

β̃ is the BLUE for β and that β̃ has minimum variance (σ²(X'Ψ⁻¹X)⁻¹) among

all LUE for β. Let us consider the plight of the researcher who (in place

of the more general Markov estimator) utilizes the usual least squares

estimate of β, say β̂_L, in the presence of dependent errors. Only if the

characteristic vectors of Ψ have a specific form does β̂_L equal β̃ [4].

However, this is usually not the case in any practical situation. Assuming

that the two estimators are unequal, the vector β̂_L has the following

properties:

(a) E(β̂_L) = β

(b) E{(β̂_L − β)(β̂_L − β)'} = (X'X)⁻¹X'ΣX(X'X)⁻¹ = σ²(X'X)⁻¹X'ΨX(X'X)⁻¹

(c) β̂_L no longer has minimum variance among all LUE for β

(d) β̂_L is no longer a maximum likelihood estimator

(e) β̂_L ~ N{β, σ²(X'X)⁻¹X'ΨX(X'X)⁻¹}

The implication of properties (b) and (e) is that the usual least squares

formula for finding the variance of β̂_L, σ²(X'X)⁻¹, yields a biased estimate

of the true variance E{(β̂_L − β)(β̂_L − β)'} = σ²(X'X)⁻¹X'ΨX(X'X)⁻¹.

The least squares test statistics used for making inferences about the

elements of β are often stochastically larger in the presence of dependent

errors. Thus, the researcher will tend to reject the null hypothesis

H_0: β_i = 0 more often than he should, and hence, will include insignificant
terms in the model.
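The bias just described can be made concrete by comparing the true covariance σ²(X'X)⁻¹X'ΨX(X'X)⁻¹ with the nominal least squares formula. A sketch, assuming AR(1) errors with φ = 0.7 and a simple linear trend regressor (both choices are arbitrary, not from the dissertation):

```python
import numpy as np

def ar1_covariance(n, phi, sigma2=1.0):
    """Sigma = sigma^2 * Psi for stationary AR(1) errors:
    Cov(Z_s, Z_t) = sigma^2 * phi^|s-t| / (1 - phi^2)."""
    h = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return sigma2 * phi ** h / (1.0 - phi ** 2)

n, phi = 50, 0.7
# Simple linear trend regression (arbitrary illustrative design)
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
XtX_inv = np.linalg.inv(X.T @ X)
Psi = ar1_covariance(n, phi)

# True covariance of the least squares estimator under dependent errors:
# (X'X)^{-1} X' Sigma X (X'X)^{-1}
true_cov = XtX_inv @ X.T @ Psi @ X @ XtX_inv

# The least squares user plugs (at best) the marginal error variance into
# the nominal formula sigma^2 (X'X)^{-1}; with phi > 0 this understates
# the true slope variance, so the t statistics come out too large.
marginal_var = Psi[0, 0]
ratio = true_cov[1, 1] / (marginal_var * XtX_inv[1, 1])
```

Here `ratio` measures how far the nominal variance understates the truth; for positively correlated errors it is well above 1.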

In order to avoid the problems introduced by applying the least squares

procedure when the errors are dependent, a test for independence in the

errors (sometimes called a test for serial correlation of the errors) is essential.


1.2 Various Tests

Since the errors (Z) of a least squares regression are not observable,

a test for serial correlation of the errors must be based on the residuals

(Ẑ) obtained from the calculated regression, where Ẑ = Y − Xβ̂. However,

in most cases the residuals are correlated whether the errors are dependent

or not; consequently, ordinary tests of independence for an observed se-

quence of random variables cannot be used without modification. Many

test procedures have been proposed for this problem, and in this section

a summary of several popular tests will be presented.

1.2.1 The Durbin-Watson Test

J. Durbin and G.S. Watson [10] in 1950-51 were among the first to

consider the problem of testing for serial correlation of the errors in

least squares regression, and their test and its modified versions remain

the most widely applied tests of regression error independence. In order

to investigate the random error component Z in the model

Y = Xβ + Z ,   where Y is (n×1), X is (n×g), β is (g×1), and Z is (n×1),


Durbin-Watson (D-W) make the usual assumption under the null hypothesis:

the errors are independent and identically distributed normal random

variables with zero mean and constant variance, i.e., Z_t ~ i.i.d. N(0, σ²).

The type of alternative they consider is one for which the vector of

random errors, Z, is stationary, and for which Cov{Zt,Zt+h} = y(h) is

exponentially (or geometrically) decreasing in h. A convenient model

for this hypothesis is the first order autoregressive [AR(1)] model, or

as it is sometimes called, the stationary Markov process: Z_t = φZ_{t-1} + ε_t,

where |φ| < 1 and ε_t ~ i.i.d. N(0, σ²) independently of the Z_{t-1}'s. Thus,

the null and alternative hypotheses take the form:

H_0: φ = 0   versus   H_a: φ ≠ 0 .

The test statistic derived by (D-W) is given by:

d = Σ_{t=2}^n (Ẑ_t − Ẑ_{t-1})² / Σ_{t=1}^n Ẑ_t² ,   (1.2.1)

where Ẑ is the vector of residuals from the calculated least squares

regression (Ẑ = Y − Xβ̂).

Durbin and Watson derived the d statistic using a result obtained by

T.W. Anderson [5] in 1948. Anderson showed that no test exists which

is uniformly most powerful against the two-sided alternative. However,

for cases in which the columns of the X (design) matrix in the model

(called 'regression vectors' by Anderson) are linear combinations of the

eigenvectors (latent vectors) of a matrix A (where A occurs in the density
function of Z), tests based on d are uniformly most powerful (UMP)

against one-sided alternatives, and have optimal properties for two-sided

alternatives. The density functions of Z which lead to the UMP test have

been shown by Anderson to take the form:

f(Z; φ, σ²) = K₁ exp[−{(1+φ²)Z'Z − 2φ Z'A₁Z} / (2σ²)] .   (1.2.2)

When the distribution of Z takes the form of the AR(1) model, the density

of Z can be written as:

f(Z; φ, σ²) = K₂ exp[−(1/(2σ²)){(1+φ²) Σ_{t=1}^n Z_t² − φ²(Z_1² + Z_n²) − 2φ Σ_{t=2}^n Z_t Z_{t-1}}] .   (1.2.3)

By taking Z'A₁Z = Σ_{t=2}^n Z_t Z_{t-1} + ½(Z_1² + Z_n²) in equation (1.2.2),

(D-W) obtained the density function:

f(Z; φ, σ²) = K₃ exp[−(1/(2σ²)){(1+φ²) Σ_{t=1}^n Z_t² − φ(Z_1² + Z_n²) − 2φ Σ_{t=2}^n Z_t Z_{t-1}}] ,   (1.2.4)

which is very close to the AR(1) density in equation (1.2.3). Thus, an

appropriate choice of A, namely

          [  1  -1   0  . . .   0   0 ]
          [ -1   2  -1  . . .   0   0 ]
          [  0  -1   2  . . .   0   0 ]
    A  =  [  .   .   .          .   . ]
          [  0   0   0  . . .   2  -1 ]
          [  0   0   0  . . .  -1   1 ]

leads to a d statistic which provides a UMP test against one-sided

alternatives in the "latent vector case", i.e., the case when the columns

of the design matrix coincide with the latent vectors of A.

The d statistic is bounded below by 0 (obvious) and above by 4. In

order to see that d ≤ 4, expand the numerator of the statistic:

Σ_{t=2}^n (Ẑ_t − Ẑ_{t-1})² = Σ_{t=2}^n Ẑ_t² − 2 Σ_{t=2}^n Ẑ_t Ẑ_{t-1} + Σ_{t=2}^n Ẑ_{t-1}²

                           ≤ 2 Σ_{t=1}^n Ẑ_t² − 2 Σ_{t=2}^n Ẑ_t Ẑ_{t-1} .

Hence

d = Σ_{t=2}^n (Ẑ_t − Ẑ_{t-1})² / Σ_{t=1}^n Ẑ_t² ≤ 2[1 − Σ_{t=2}^n Ẑ_t Ẑ_{t-1} / Σ_{t=1}^n Ẑ_t²] ,

and since

Σ_{t=2}^n Ẑ_t Ẑ_{t-1} / Σ_{t=1}^n Ẑ_t² ≥ −1 ,

it follows that

d ≤ 2[1 − (−1)] = 4 .

Thus, if the errors were positively serially correlated, e.g., Z_t = φZ_{t-1}

+ ε_t, φ > 0, d would tend to be relatively small, and if the errors were

negatively serially correlated, e.g., Z_t = φZ_{t-1} + ε_t, φ < 0, d would

tend to be relatively large. The user of the d test interested in detecting

the existence of positive serial correlation of the regression errors would

reject the null hypothesis of independent errors (φ = 0) in favor of the

alternative φ > 0 if d < d*, where d* is the appropriate critical value

of d for a lower-tailed test. Upper-tailed tests (tests for negative

serial correlation) are carried out in a similar manner.
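The behavior of d described above is easy to reproduce; a sketch, applied here to simulated errors rather than to calculated regression residuals, with φ = ±0.8 and n = 400 chosen arbitrarily for the illustration:

```python
import numpy as np

def durbin_watson(resid):
    """d = sum_{t=2}^n (Z_t - Z_{t-1})^2 / sum_{t=1}^n Z_t^2, with 0 <= d <= 4."""
    z = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(z) ** 2) / np.sum(z ** 2))

rng = np.random.default_rng(2)
e = rng.standard_normal(400)

pos = np.zeros(400)               # Z_t =  0.8 Z_{t-1} + e_t  (phi > 0)
neg = np.zeros(400)               # Z_t = -0.8 Z_{t-1} + e_t  (phi < 0)
for t in range(1, 400):
    pos[t] = 0.8 * pos[t - 1] + e[t]
    neg[t] = -0.8 * neg[t - 1] + e[t]

d_pos = durbin_watson(pos)        # small: positive serial correlation
d_white = durbin_watson(e)        # near 2 under independence
d_neg = durbin_watson(neg)        # large: negative serial correlation
```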

A major drawback to this testing procedure is that it is only possible

to find the exact null distribution of d in very special cases (see

Anderson [5]), cases which do not frequently occur in practice. However,

D-W show that it is possible to calculate upper (d_U*) and lower (d_L*)

bounds to the critical value d*, where 0 ≤ d_L* ≤ d* ≤ d_U* ≤ 4. Tabulated
values of the critical bounds are made available in Durbin and Watson

[10]. The procedure for conducting a test designed to detect the

existence of positive serial correlation is as follows:

(a) if d < d_L*, reject H_0 (significant at level α)

(b) if d > d_U*, do not reject H_0 (insignificant at level α)

(c) if d_L* < d < d_U*, the test is inconclusive.

By observing that the null distribution of d is symmetric, with 0 ≤ d ≤ 4,

a test for negative serial correlation can be carried out exactly as

above, only using the statistic 4-d. Two-sided tests are obtained by

combining single tail areas, using a/2.

When the observed value of d falls in the "region of ignorance",

d_L* < d < d_U*, the bounds test is inconclusive. It is here that D-W

recommend applying a beta approximation. The procedure involves transforming

d so that its range is (0,1), accomplished by the simple transformation

d/4. A beta distribution with the same mean and variance as d/4 is then fit

using the method of moments. Critical values of d/4 are found from Tables

of the Incomplete Beta Function, and the test is conducted in the usual manner.
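The method-of-moments step can be sketched: given the mean and variance of the transformed statistic, the beta parameters p and q satisfy mean = p/(p+q) and var = pq/((p+q)²(p+q+1)). A small illustration in Python (the numerical moments below are invented for the example, not taken from the dissertation):

```python
def beta_moments_fit(mean, var):
    """Method-of-moments fit of Beta(p, q):
    mean = p/(p+q), var = pq/((p+q)^2 (p+q+1)).
    Solving gives p + q = mean*(1-mean)/var - 1."""
    if not 0.0 < mean < 1.0 or not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("moments not attainable by a beta distribution")
    s = mean * (1.0 - mean) / var - 1.0      # s = p + q
    return mean * s, (1.0 - mean) * s

# Invented moments for the transformed statistic d/4
p, q = beta_moments_fit(0.5, 0.02)
```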


Tests for the case in which the errors in a least squares regression

follow a specific non-first order autoregressive process were developed

by Peter Schmidt [29] and K.F. Wallis [34] and were based upon the work

done by D-W. Schmidt considered the AR(2) alternative: Z_t = φ₁Z_{t-1} +

φ₂Z_{t-2} + ε_t, ε_t ~ i.i.d.(0, σ²). A test of the null hypothesis H_0: φ₁ = φ₂ = 0

is carried out by considering a generalization of the D-W d statistic:

d₂ = [Σ_{t=2}^n (Ẑ_t − Ẑ_{t-1})² + Σ_{t=3}^n (Ẑ_t − Ẑ_{t-2})²] / Σ_{t=1}^n Ẑ_t² .
When analyzing quarterly data, Wallis argued that researchers may be

interested in detecting the existence of seasonal variation of the error

terms. One possible seasonal error model is the AR(4) process: Zt =

φZ_{t-4} + ε_t, ε_t ~ i.i.d.(0, σ²). Wallis' "seasonal effects" test is based

upon the value of the modified D-W statistic:

d₄ = Σ_{t=5}^n (Ẑ_t − Ẑ_{t-4})² / Σ_{t=1}^n Ẑ_t² .

Critical bounds for both non-first order tests have been tabulated by

their respective developers. The tests are conducted in exactly the same

manner as the D-W bounds test.
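Both generalized statistics are direct to compute; a sketch, again on simulated errors rather than regression residuals, with an arbitrary φ = 0.8 seasonal alternative:

```python
import numpy as np

def schmidt_d2(resid):
    """Schmidt's AR(2) statistic:
    d2 = [sum_{t=2}(Z_t - Z_{t-1})^2 + sum_{t=3}(Z_t - Z_{t-2})^2] / sum Z_t^2."""
    z = np.asarray(resid, dtype=float)
    num = np.sum((z[1:] - z[:-1]) ** 2) + np.sum((z[2:] - z[:-2]) ** 2)
    return float(num / np.sum(z ** 2))

def wallis_d4(resid):
    """Wallis's seasonal statistic: d4 = sum_{t=5}(Z_t - Z_{t-4})^2 / sum Z_t^2."""
    z = np.asarray(resid, dtype=float)
    return float(np.sum((z[4:] - z[:-4]) ** 2) / np.sum(z ** 2))

rng = np.random.default_rng(3)
e = rng.standard_normal(400)

seasonal = np.zeros(400)          # Z_t = 0.8 Z_{t-4} + e_t (arbitrary phi)
for t in range(4, 400):
    seasonal[t] = 0.8 * seasonal[t - 4] + e[t]

d4_seasonal = wallis_d4(seasonal)   # small under the seasonal alternative
d4_white = wallis_d4(e)             # near 2 under independence
```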

1.2.2 Beta Approximation

H. Theil and A.L. Nagar [33] criticized the D-W bounds test because

of its inconclusive region (d_L*, d_U*), and feared that the practical

research worker may interpret "no inference possible" as equivalent to

"no evidence to reject the null hypothesis of independence." This results

in bias in the sense that many cases of positive serial correlation will

be overlooked, especially when the inconclusive region is large. This

situation occurs when the number of observations is small and the number

of independent variables in the model is large.

The Theil-Nagar (T-N) approach is to fit a beta distribution immedi-

ately, dispensing with finding bounds for the critical value d*.

However, their method of fitting the beta distribution differs from that

used by D-W. The d statistic is transformed to a beta (p,q) variable

in the range (0,1) by considering the quantity X = (d − a)/(b − a), where the

range of d is (a,b). The parameters p, q, a, and b are then determined

by the method of moments, using the first four moments of d. The first

two moments of d are used to find expressions for a and b, while the

third and fourth moments of d are utilized for determining p and q. T-N

simplify the necessary calculations considerably by ignoring terms of

lower order in the moment expressions.

R.C. Henshaw [20] also considered a beta approximation to the distri-

bution of d. However, where T-N approximate the first four moments of d,

Henshaw uses the exact moments of d in fitting the beta distribution.

The transformation X = (d − v₁)/(v_{n-g} − v₁) is used, where v₁ and v_{n-g}

are, respectively, the smallest and largest latent roots of (I − X(X'X)⁻¹X')A.

The parameters p, q, v₁, and v_{n-g} are then found by the method of moments.

The calculations needed in Henshaw's test are much more extensive

than those of T-N. However, T-N point out that their approximations are

good only when "the behavior of the [independent] variables is sufficiently

smooth in the sense that the first and second differences are small compared

with the range of the variable itself."

1.2.3 dU Approximation

For polynomial regression, E.J. Hannan [17] discovered that the D-W

upper bounding statistic, dU, gives a very good approximation to the true

distribution of d. Hannan's reasoning was based on the fact that the

dU bound is attained in the latent vector case, i.e., when the g columns

of the design matrix are linear combinations of the g latent vectors

associated with the smallest non-zero latent roots of the matrix

A. Later, Hannan and R.D. Terrell [18] showed that the approximation

remains reasonably good whenever the coded values of the independent

variables are concentrated near the origin (the spectrum of the independent

variables in the model is concentrated near the origin), which fre-

quently occurs for economic time series.

In the light of Hannan's work, Durbin and Watson [12] theorized that

the shape of the d distribution might well be better approximated by the

distribution of the upper bounding statistic dU. However, D-W felt that

this approximate distribution should have the same mean and variance

as the distribution of d since this was the one desirable property possessed

by the beta approximation. The statistic d* = a + bdU was considered;

where a and b are chosen so that d* has the same first two moments as

d. Critical values of dU, tabulated by D-W in their 1951 paper, are used

in finding the critical values of d* [11].

1.2.4 BLUS Estimators

It can be shown that the covariance matrix for the residuals is

E(ẐẐ') = σ²{I − X(X'X)⁻¹X'} = σ²M.

Thus, even if the null hypothesis is true, the residuals are, in general,

correlated and heteroscedastic. This result complicates matters considerably,

as evidenced by D-W's upper and lower bounds to the significant points

of d. H. Theil [32] reasoned that the "testing procedure would be simpli-

fied considerably if the covariance matrix of the [residuals] were of the

form o2I rather than o2M." Theil considered new estimates of the regression

errors, called BLUS estimates; that is, best linear unbiased within the

class of those linear unbiased estimates that have scalar covariance

matrix σ²I.

The new regression residuals, V, are formed such that E{(V-Z)'(V-Z)}

is minimized subject to E(V) = E(Z) = 0 and E(VV') = σ²Ω, where Ω is

chosen a priori. By choosing Ω (n×n) as a diagonal matrix with n−g ones

and g zeroes along the diagonal, the new residuals V are BLUS estimators

of Z. Theil's test statistic takes the form of D-W's d statistic, only

with the Zt's replaced by the Vt's:

d_BLUS = Σ_{t=2}^{n-g} (V_t − V_{t-1})² / Σ_{t=1}^{n-g} V_t² .

Now the BLUS estimates, under the null hypothesis, are independent with

zero mean and constant variance, hence, the exact null distribution of

d_BLUS is a standard one. Critical points are tabulated by B.I. Hart [19].

Theil notes that the choice of the matrix Q actually results in a

dropping of g of the n residuals from the estimation process. This proce-

dure permits any set of g residuals to be dropped, the choice being left

to the user. However, as pointed out by A.P.J. Abrahamse and A.S. Louter

[2], there exist (n choose g) possible BLUS estimators, each of which will, in
general, yield a different result. No method has been given for selecting

the optimal vector of BLUS estimators.

Abrahamse and Louter (A-L) hypothesize that a diagonal matrix Ω is

not a necessary condition to eliminate the inconclusive region in the

D-W bounds test. It is sufficient to require that 0 does not depend on

the design matrix. The estimators of A-L are derived in the same manner

as the BLUS estimators, with the modification that σ²Ω = σ²KK' approximates

σ²M, and the test statistic, like Theil's, takes the form:

Q = Σ_t (V_t − V_{t-1})² / Σ_t V_t² ,

with the sums taken over the estimated residuals V_t.

A-L show that the Q statistic has the same distribution as the D-W upper

bounding statistic dU. Hence, it is an exact test, with tables available

for finding significance points.

The procedures developed by Theil and A-L both necessitate tedious

calculations in order to obtain residual estimates. G.D.A. Phillips and

A.C. Harvey [26] sought to simplify the calculations and interpretations

of the estimated residuals, while retaining the properties of the BLUS

estimates. They determined that this could be accomplished by utilizing

"recursive residuals." One normally estimates the model parameters on

the basis of all n observations. The recursive residuals, denoted by

Ẑ_j, are found by using only the first j observations, j = g+1, g+2, . . ., n.

These residuals can be easily generated by a recursive algorithm.

The Phillips and Harvey (P-H) test statistic takes the familiar form:

d_R = Σ_{t=g+2}^n (Ẑ_t − Ẑ_{t-1})² / Σ_{t=g+1}^n Ẑ_t² .

Under H_0, the {Ẑ_j} are distributed independently with zero mean and constant

variance. Critical values of this standard distribution have been tabu-

lated by Hart [19]. Like Theil's BLUS estimators, the recursive residuals

are linear, unbiased, and possess a scalar covariance matrix.
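The recursive algorithm can be sketched as follows, in the standardized form usually associated with recursive residuals: moving through the data, each new observation's one-step prediction error is scaled so that, under H_0, the results are independent with constant variance. A plain re-fit at each step is used for clarity, and the data and model below are invented for the example.

```python
import numpy as np

def recursive_residuals(y, X):
    """Standardized recursive residuals: fit least squares on the first j
    observations, then standardize the prediction error for the next one,
        w = (y_new - x_new' b_j) / sqrt(1 + x_new' (X_j'X_j)^{-1} x_new).
    Under H0 (independent N(0, sigma^2) errors) the w's are i.i.d. N(0, sigma^2)."""
    y, X = np.asarray(y, dtype=float), np.asarray(X, dtype=float)
    n, k = X.shape
    w = []
    for j in range(k, n):
        G = np.linalg.inv(X[:j].T @ X[:j])
        b = G @ X[:j].T @ y[:j]          # least squares fit on first j rows
        x_new = X[j]
        f = 1.0 + x_new @ G @ x_new      # prediction-error variance factor
        w.append((y[j] - x_new @ b) / np.sqrt(f))
    return np.array(w)

rng = np.random.default_rng(4)
n = 120
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(n)   # independent errors
w = recursive_residuals(y, X)     # n - k = 118 residuals here
```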

1.2.5 The Sign Test

One of the first researchers to consider a non-parametric alternative

to the D-W bounds test was R.C. Geary [14]. Geary found that a simple

count of the number of sign changes in the least squares residuals could

be used as a test for serial correlation in the errors. If the errors

are independent, the signs (plus or minus) of the residuals in the se-

quence will occur in random order. Serial correlation is inferred, then,

if the number of sign changes is fewer than expected.

Let n be the number of observations and T the number of sign changes

in the sequence of residuals. Then under the null hypothesis of

independence, T has the distribution:

P(T = t) = (n−1)! / {t! (n−1−t)!} · (½)^{n−1} ,   t = 0, 1, . . ., n−1.

Thus, it is possible to calculate significance points of T under H_0. The

hypothesis of white noise is rejected when T < T*, where T* is the tabulated

significance point for the values of n and α.

Geary notes that the T-test is very similar to the familiar runs

test. In fact, the number of runs, U, is one more than the number of

sign changes, T, i.e., U = T + 1. In runs theory, the number of plus

and minus signs are taken into account, whereas these are ignored in the

sign test. Geary remarks that the sign test is a handy tool for assessing

the probable presence of serial correlation for those workers in multi-

variate regression time series who are computerless, or without the Durbin-

Watson routine in their computer.
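The sign test is simple enough to carry out by hand or in a few lines of code; a sketch using the binomial form of the null distribution, P(T = t) = (n−1)!/{t!(n−1−t)!}·(½)^{n−1} (the ten residuals below are invented for the example):

```python
from math import comb

def sign_change_count(resid):
    """T = number of sign changes in the sequence of residuals."""
    signs = [1 if z > 0 else -1 for z in resid]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def sign_test_pvalue(T, n):
    """Lower-tail probability P(T' <= T) under H0, where
    P(T' = t) = C(n-1, t) / 2^(n-1): too few sign changes point to
    positive serial correlation."""
    return sum(comb(n - 1, t) for t in range(T + 1)) / 2 ** (n - 1)

# Ten invented residuals with a single sign change
resid = [1.2, 0.8, 1.1, 0.9, 1.4, -0.3, -0.7, -0.2, -1.1, -0.5]
T = sign_change_count(resid)
p_value = sign_test_pvalue(T, len(resid))
```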

1.2.6 Periodogram Based Tests

Recall that D-W's 1950 d test was designed to have high power against

only a first order autoregressive alternative error model. However, the

nature of the residual dependence, when it exists, may be more general

(non-first order). Thus, an investigator may wish to obtain a more

comprehensive picture of the departure from serial independence than is

provided by a single statistic like d. Durbin [9] suggests that it may

be more appropriate to ask what the data reveals about the departure from

serial independence rather than to set up a particular parametric alter-

native and seek a test which has high power against it. Durbin's 1969

technique for studying the general nature of the serial dependence in a

stationary series of observations Z₁, Z₂, . . ., Z_T is to compute the quantities


s_j = Σ_{r=1}^j p_r / Σ_{r=1}^m p_r ,   j = 1, 2, 3, . . ., m,

where

p_r = (2/T) | Σ_{t=1}^T Z_t e^{(2πirt)/T} |² ,   r = 1, 2, 3, . . ., m,

and then to make a plot of s_j versus j/m. This plot is called the cumu-

lated periodogram, s_j is its sample path, and p_r, r = 1, 2, . . ., m, are the

periodogram ordinates.
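The path s_j can be computed directly from the definitions above; a sketch (the series length is arbitrary, only ordinates below the Nyquist frequency are used, and under the white-noise null the path should hug the 45-degree line j/m):

```python
import numpy as np

def cumulated_periodogram(z):
    """Periodogram ordinates p_r = (2/T)|sum_t Z_t exp(2*pi*i*r*t/T)|^2 for
    r = 1, ..., m (below the Nyquist frequency), cumulated and normalized:
    s_j = sum_{r<=j} p_r / sum_{r<=m} p_r, to be plotted against j/m."""
    z = np.asarray(z, dtype=float)
    T = len(z)
    m = (T - 1) // 2
    t = np.arange(1, T + 1)
    p = np.array([(2.0 / T) * abs(np.sum(z * np.exp(2j * np.pi * r * t / T))) ** 2
                  for r in range(1, m + 1)])
    return np.cumsum(p) / np.sum(p)

rng = np.random.default_rng(5)
s = cumulated_periodogram(rng.standard_normal(201))   # m = 100 ordinates
```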

It is known that when Z1, Z2, ., ZT are independent and identically

distributed, the sample path s_j behaves asymptotically like that of the

sample distribution function in the theory of order statistics. Hence,

the Kolmogorov-Smirnov limits can be applied to provide a test of serial

independence. However, when the periodogram is computed from the least

squares residuals (Ẑ_t), modifications to this procedure are necessary,

since the residuals themselves are correlated in any case.

For a test against positive serial correlation, a critical value c₀

for the Kolmogorov-Smirnov statistic is acquired from available tables.

Like the D-W bounds procedure, there are three regions for the sample

path of s_j:

(1) If the sample path of s_j crosses an upper line c₀ + j/m',

reject H₀.

(2) If the sample path of s_j fails to cross a lower line −c₀ +

[j − ½(g−1)]/m', do not reject H₀ (where ½g ≤ j ≤ m and

m' = ½(n−g)).

(3) If the sample path crosses neither the lower nor the upper line,

the test is inconclusive (analogous to the inconclusive region

for the D-W bounds test).

Durbin notes that this technique is conservative in the sense that, for

a test of significance level α, the probability of falsely rejecting the

null hypothesis does not exceed α. Tests against negative serial cor-

relation are carried out similarly, and a two-sided test may be obtained

by applying both one-sided regions at significance level α/2.

1.3 Concluding Note

We have presented a brief summary of available tests for serial

correlation of the errors in a regression analysis. The properties of

these tests are discussed, and comparisons made, in the chapter that

follows. In particular, we are concerned with the performance of these

tests in the presence of non-first order serially correlated errors.



2.1 Power Comparisons

Many researchers have conducted Monte Carlo and simulation studies

in order to estimate the powers of the various tests for serial inde-

pendence. Most of the power studies in the literature focus on the

Durbin-Watson d test. In the following sections, the d test is compared

to three groups of alternative tests and a brief summary of the results

is presented.

2.1.1 Exact d Test Versus Approximate Tests of Serial Correlation

In 1971, Durbin and Watson [12] investigated and compared the exact

distribution of d with certain approximate tests which were developed

using their procedure as a basis. These include the Durbin-Watson,

Theil-Nagar, and Henshaw beta approximations and the Hannan dU and Durbin-

Watson a+bdU approximations.

Letting d' denote a random variable having a distribution used as an

approximation to the true distribution of d, D-W computed P(d' ≤ d*) for

each approximating distribution, where P(d ≤ d*) = α. The power comparisons

reveal that the tests fall into three categories: Theil-Nagar beta and

Hannon dU; D-W beta and D-W a+bdU; Henshaw beta. These groups are given

in increasing order of accuracy, which coincides with increasing difficulty

of calculation.

As a result of their research, D-W make the following recommendation:

when a test for serial correlation based on the least squares residuals

is required, the D-W d test should be applied first. If the result is

inconclusive, apply the D-W beta or a+bdU approximation. When special

accuracy is needed, use the Henshaw beta approximation. Finally, if

a more comprehensive picture of the serial properties of the errors is

desired than is provided by the value of a single statistic, employ the

cumulated periodogram method.

2.1.2 d Test Versus BLUS Test

Abrahamse and Koerts [1] calculated the power of the BLUS test for various values of n and g (number of independent variables) with the parameter of the alternative error hypothesis at a relatively large value of φ = .8. This was compared with the following three quantities:

(a) P(d < dL*|Ha), called the probability of a correct decision for the d test, where dL* is the lower critical bound for d,

(b) P(d > dU*|Ha), called the probability of an incorrect decision for the d test, where dU* is the upper critical bound for d,

(c) P(d < d*|Ha), the power of the d test, where d* is the approximate true significance point of d obtained by using a method similar to those discussed in Chapter I.

Note that the probability of a correct decision, as defined by Abrahamse and Koerts, is the power of the d test when using D-W's bounds.

The results obtained by A-K can be summarized as follows:

(i) The power of the d test generally exceeds the power of the BLUS test.
(ii) The power of the BLUS test dominates the probability of a correct

decision for the d test. The smaller the number of degrees of freedom

(n-g), the greater the difference.

(iii) The power of the BLUS test, the probability of a correct

decision and the power of the d test converge as n-g increases, for

in this case the inconclusive region of the d test disappears.

Since calculation of approximate significance points for the d test

may require many more computations than the BLUS procedure, A-K recommend

one follow the schematic shown in Figure 2.1.1 when choosing a test for

residual correlation.


n-g large: apply the Durbin-Watson d test.

n-g small: apply the Durbin-Watson d test; if inconclusive, apply the BLUS test.

FIGURE 2.1.1


2.1.3 The Durbin-Watson d Test Versus Non-First Order Alternative Tests

V. Kerry Smith [30] compared the approximate power of four tests for serial correlation with two sets of alternatives: second order autoregressive processes, denoted AR(2), and first order moving average processes, denoted MA(1). The four tests considered in the study were the D-W d test, the Durbin cumulated periodogram (S_j) test, the Schmidt (d₂) test and the Geary (T) count of sign changes test. Smith's Monte

Carlo results are summarized as follows:

(a) For the MA(1) error model, the d test proved to be consistently

more powerful than the other procedures. However, with small

samples and low first order correlation coefficients, the

differences in the tests are not pronounced.

(b) For the AR(2) error model, three main results are obtained.

(i) For certain values of φ₁ and φ₂ in the AR(2) model, the d test is more powerful than the d₂ test. The T test is shown to be much inferior to the d.

(ii) In models with large and approximately equal values of φ₁ and φ₂, the d₂ test appears to become more powerful than the others. (Large φ₁ and φ₂ imply first and second order autocorrelations near 1. For a detailed discussion of autocorrelation, see Section 3.3.1.)

(iii) The S_j test appears to be at least as powerful as the d test for most of the values of φ₁ and φ₂ considered by Smith. However, the S_j test does not perform as well as the d₂ test in most instances.

2.2 The Alternative Error Model

Almost all of the tests discussed in Section 1.2 consider only a first order autoregressive hypothesis, i.e., one in which the hypothesized error model under Ha takes the form

Z_t = φZ_{t-1} + ε_t,

where {ε_t} is i.i.d. (0, σ²), independent of past values of {Z_t}. The primary reason for this restrictive hypothesis is that the distributions of the test statistics are complex, even in this simple case. However, many time series exhibit autocorrelation patterns which are not well modeled by the first order model. When analyzing quarterly residuals, it may be more appropriate to consider the model

Z_t = φZ_{t-4} + ε_t,

where {ε_t} is defined as above. The performance of the Durbin-Watson d or similar tests should be evaluated for models like these.

In the sections that follow, we will examine the properties of tests designed for the general alternative autoregressive-moving average (ARMA) model

Z_t = φ₁Z_{t-1} + φ₂Z_{t-2} + ... + φ_pZ_{t-p} + ε_t + θ₁ε_{t-1} + θ₂ε_{t-2} + ... + θ_qε_{t-q}.



3.1 Box and Pierce "Portmanteau" Test

G.E.P. Box and D.A. Pierce [ 7] developed a method for checking the

adequacy of fit in autoregressive-moving average (ARMA) time series models.

Finding a test for serial independence of the errors in least squares

regression was not of prime importance to Box and Pierce (B-P). However,

they did discover that the statistic used for checking model adequacy is,

under certain conditions, very nearly equivalent to the D-W d test.

Consider the general ARMA(p,q) model:

Z_t = Σ_{j=1}^p φ_j Z_{t-j} + Σ_{i=1}^q θ_i a_{t-i} + a_t,  t = 1, 2, ..., n,  (3.1.1)

where the {a_t} are independent unobservable random variables, and the vector of model parameters (φ₁, φ₂, ..., φ_p, θ₁, θ₂, ..., θ_q)' is unknown. In order for (3.1.1) to attain stationarity, the autoregressive parameters (φ₁, φ₂, ..., φ_p) must satisfy certain constraints. Fuller [13] and others show that the condition of stationarity is met if the roots of the equation x^p + φ₁x^{p-1} + φ₂x^{p-2} + ... + φ_p = 0 are all in the unit circle. To guarantee the invertibility of the moving average part of model (3.1.1), Fuller states that the roots of the equation y^q + θ₁y^{q-1} + θ₂y^{q-2} + ... + θ_q = 0 must also lie in the unit circle.

This condition on (θ₁, θ₂, ..., θ_q) enables the researcher to express the {Z_t} generated from the MA process,

Z_t = Σ_{i=0}^∞ θ_i a_{t-i},

as an AR time series,

Z_t = Σ_{j=1}^∞ φ_j Z_{t-j} + a_t.

(We utilize this invertibility property later in Section 3.3.) Throughout, then, we assume that model (3.1.1) is stationary and invertible, i.e., the roots of the equations

x^p + φ₁x^{p-1} + φ₂x^{p-2} + ... + φ_p = 0,

y^q + θ₁y^{q-1} + θ₂y^{q-2} + ... + θ_q = 0

are all in the unit circle.

After calculating the parameter estimates φ̂₁, φ̂₂, ..., φ̂_p, θ̂₁, θ̂₂, ..., θ̂_q by the usual time series methods (see Box-Jenkins [8]), the researcher may then be interested in checking the adequacy of the fitted model

Z_t = Σ_{j=1}^p φ̂_j Z_{t-j} + Σ_{i=1}^q θ̂_i a_{t-i} + a_t.  (3.1.2)

B-P proposed that the first K sample autocorrelations of the series {a_t}, r₁, r₂, ..., r_K, be used for this purpose, where K is small relative to n, and where

r_j = (Σ_{t=j+1}^n a_t a_{t-j}) / (Σ_{t=1}^n a_t²),  j = 1, 2, ..., K.  (3.1.3)

In particular, B-P considered the statistic

n Σ_{j=1}^K r_j².

However, the true sample autocorrelations cannot be calculated since the {a_t} in (3.1.1) are unobservable. In order to proceed, the researcher may apply the method of least squares to (3.1.2) to obtain the residuals {â_t}. Replacing the {a_t} with {â_t} in (3.1.3), estimates of the first K sample autocorrelations are found, i.e.,

r̂_j = (Σ_{t=j+1}^n â_t â_{t-j}) / (Σ_{t=1}^n â_t²),  j = 1, 2, ..., K.  (3.1.4)

The statistic Q = n Σ_{j=1}^K r̂_j² is then of interest.
j=1 J
B-P show that Q will, to a close approximation, have an asymptotic χ²-distribution with K-p-q degrees of freedom. The 'portmanteau' test for checking the adequacy of any ARMA(p,q) process will lead to a rejection of the hypothesis that the model adequately fits the data if Q is "too large", where Q ~ χ²(K-p-q).
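The portmanteau calculation just described can be sketched in a few lines. This is our own illustration, not B-P's code; the simulated residual series and the choices K = 10 and p = q = 0 are arbitrary:

```python
import random

def sample_autocorrelations(a, K):
    """First K sample autocorrelations of a series, as in (3.1.3)/(3.1.4)."""
    n = len(a)
    denom = sum(x * x for x in a)
    return [sum(a[t] * a[t - j] for t in range(j, n)) / denom
            for j in range(1, K + 1)]

def box_pierce_Q(a, K):
    """Portmanteau statistic Q = n * sum_{j=1}^K r_j^2; if the ARMA(p,q)
    fit is adequate, Q is approximately chi-square with K - p - q df."""
    return len(a) * sum(r * r for r in sample_autocorrelations(a, K))

# White-noise residuals: Q should be unremarkable relative to a
# chi-square distribution with K - p - q degrees of freedom.
random.seed(1)
residuals = [random.gauss(0.0, 1.0) for _ in range(200)]
Q = box_pierce_Q(residuals, K=10)
```

The statistic is then referred to a chi-square table with K - p - q degrees of freedom.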

Although B-P are not actually considering a regression model in which the errors are to be tested for independence, notice that if p = q = 0 in (3.1.1), i.e., if the {Z_t} are assumed independent, a test based on

r₁ = (Σ_t Z_t Z_{t-1}) / (Σ_t Z_t²)

is very nearly equivalent to the D-W d test. This fact would lead one to believe that the 'portmanteau' test may have applications in the area of testing for serial independence of the errors in a regression analysis. In particular, let p ≠ 0, q = 0, in (3.1.1). The model then takes the form

Z_t = φ₁Z_{t-1} + φ₂Z_{t-2} + ... + φ_pZ_{t-p} + ε_t,  (3.1.5)

where {ε_t} are independently distributed. If we let {Z_t} represent the unobservable errors in a regression model, then (3.1.5) is the general autoregressive alternative error model discussed in Section 2.2. Under the null hypothesis of white noise, p = 0, and the B-P statistic Q is distributed asymptotically as a chi-square random variable with K degrees of freedom. It seems reasonable to utilize the Q statistic in a test for serial correlation of the {Z_t}. "Large" values of Q would lead to a rejection of the hypothesis of white noise. Further, since the form of (3.1.5) is more general than the alternative model considered by D-W and others (see Section 1.2), one would expect the 'portmanteau' test to be more powerful against general alternatives.

3.2 A New Technique: The Max-χ² Procedure

James T. McClave [24] considered a new method of estimating the order of autoregressive time series models, called the maximum chi-square (max-χ²) technique. As was the case with B-P, a test for serial correlation of the errors in a regression model was not of prime importance in McClave's research. However, after developing the max-χ² procedure, McClave hypothesized that this technique could also be used as a test for serial correlation of the regression errors, and that this test would have high power against general alternatives.

McClave considered the stationary autoregressive (AR) model,

Z_t = φ₁Z_{t-1} + φ₂Z_{t-2} + ... + φ_mZ_{t-m} + ε_t,  (3.2.1)

where {ε_t} is a white noise process. Call model (3.2.1) a pth order subset AR model with maximum lag m if exactly p of the lag coefficients φ₁, φ₂, ..., φ_m are nonzero, φ_m ≠ 0 and φ_j = 0, ∀j > m. The researcher whose ultimate goal is to obtain estimates of the nonzero lag coefficients in (3.2.1) must first estimate the elements of the parameter set Θ = {p; j₁, j₂, ..., j_p}, where (j₁, j₂, ..., j_p) are the lags having nonzero coefficients. The initial step in the order estimation process is the application of the 'subset autoregression algorithm' developed by McClave [24]. Subset autoregression enables the researcher to generate a sequence of estimates of Θ, say S = {Θ̂₀, Θ̂₁, Θ̂₂, ..., Θ̂_{K-1}}, where Θ̂_k = {k; ĵ₁, ĵ₂, ..., ĵ_k}. Once these estimates are
obtained, McClave proposes the following sequential procedure for choosing

the order from S:

1. Compute the sequence of statistics {M̂₀, M̂₁, ..., M̂_{K-1}}, where M̂_k = n(σ̂²_k - σ̂²_{k+1})/σ̂²_{k+1}, k = 0, 1, 2, ..., K-1, σ̂²_k is the residual variance corresponding to the model with parameter estimates Θ̂_k, and σ̂²_{k+1} is the residual variance corresponding to the model with parameter estimates Θ̂_{k+1}.

2. For a given choice of α, determine C_k such that P{M_k > C_k} = α, where M_k is the maximum order statistic in a sequence of (K-k) independent χ²₁ random variables.

3. Choose the estimated order, p̂, such that

p̂ = min{k: M̂_k ≤ C_k, 0 ≤ k ≤ K},

and thus Θ̂ = Θ̂_p̂. This method of estimating Θ is referred to as the "max-χ² technique."

To illustrate the application of the max-χ² technique in tests for serial correlation, consider the first order AR model with nonzero but unknown lag j₁:

Z_t = φ_{j₁} Z_{t-j₁} + ε_t,  (3.2.2)

where {ε_t} is white noise. The hypotheses of interest to the researcher who is attempting to detect serial correlation in {Z_t} are

H₀: φ_{j₁} = 0
H₁: φ_{j₁} ≠ 0.

Rewriting H₀ and H₁ in terms of the parameter set Θ, we have

H₀: Θ = {0} (white noise)
H₁: Θ = {1; j₁},

where 1 ≤ j₁ ≤ K. The test statistic takes the form:

T = n(σ̂₀² - σ̂₁²)/σ̂₁²,

where σ̂₀² = (1/n)Σ_{t=1}^n Z_t² = C₀, and σ̂₁² is generic for the estimated residual variance of each first order AR model. However, McClave has shown that the subset algorithm, when applied to (3.2.2), will choose that lag j₁ which minimizes σ̂₁² over all first order models. Hence, the test statistic takes the form:

M̂ = max_{j₁} T = n{C₀ - min_{j₁} σ̂₁²} / min_{j₁} σ̂₁².

Now σ̂₁² = C₀(1 - r²_{j₁}), where r_j is the jth sample autocorrelation function. Hence

M̂ = n{C₀ - min_{j₁}[C₀(1 - r²_{j₁})]} / min_{j₁}[C₀(1 - r²_{j₁})]  (3.2.3)

  = n{C₀ - C₀[1 - max_{j₁} r²_{j₁}]} / C₀[1 - max_{j₁} r²_{j₁}]

  = n max_{j₁} r²_{j₁} / (1 - max_{j₁} r²_{j₁}).

In equation (3.2.3), notice that M̂ is asymptotically equivalent to the statistic n max_{j₁} r²_{j₁}. McClave showed that T →D χ²₁ under H₀, i.e., the statistic T converges in distribution when H₀ is true to a χ² random variable with 1 degree of freedom. It follows, then, that under the hypothesis of white noise, M̂ →D M, where M is distributed as the maximum order statistic from a sequence of K independent χ²₁ random variables. Values of the test statistic "too large" will lead to a rejection of H₀, where "too large" is determined according to the distribution of M.

Let us compare the "max-χ²" statistic,

n max_{1≤j≤K} r_j²,

with the Box-Pierce "portmanteau" statistic,

n Σ_{j=1}^K r_j².

Both statistics are based on the vector of sample autocorrelations, r = (r₁, r₂, ..., r_K)'. Models (3.1.5) and (3.2.1) both take the form of the more general alternative error structure

Z_t = φ₁Z_{t-1} + φ₂Z_{t-2} + ... + φ_mZ_{t-m} + ε_t,  (3.2.4)

which gives rise to the plausibility that both testing procedures attain

high power against non-first order alternatives. The similarities between

the two statistics are evident, and considering the fact that the well-known

B-P procedure has been in the literature and widely used since 1970,

some may question the motivation of those investigating the max-χ² technique as a test for serial correlation. Obtaining an answer to this

query is the goal of this research. As a partial answer, we offer the following: the max-χ² test is presented as an alternative testing procedure to those researchers concerned with studying parsimonious models, i.e., models with few nonzero lag parameters. In most practical cases, if the {Z_t} are in fact correlated, a minimum number of the lag parameters (φ₁, φ₂, ..., φ_m) in model (3.2.4) are nonzero. Consider

the researcher who utilizes

n Σ_{j=1}^K r_j²

as a test statistic. We will discover in Section 3.3 that knowledge of the sample autocorrelation vector r is sufficient for calculation of the parameter estimates (φ̂₁, φ̂₂, ..., φ̂_m). Then the researcher is essentially computing estimates of lag parameters which are known to be zero as well as estimates of those nonzero lag parameters. These "zero" estimates, of course, contribute very little to the statistic

n Σ_{j=1}^K r_j².

Our conjecture is that a predominance of small r_j's will deflate the sum, resulting in a failure to reject the null hypothesis of white noise when in fact several of the true autocorrelations are nonzero. This, in turn, leads to an invalid implementation of the least squares inferential techniques (see Section 1.1). Our goal is to show that, in many cases, the practical research worker can avoid this problem by employing the max-χ² procedure.

Consideration is given to both the null and non-null asymptotic

distributions of the test statistics in the following section. These

results become the foundation for the power comparisons discussed in

Chapter 4.

3.3 Asymptotic Distribution of the Test Statistics

3.3.1 Preliminaries

Assume that Z₁, Z₂, ..., Z_n are observations generated at n consecutive time points by a stationary AR model of order p with lag coefficients (φ₁, φ₂, ..., φ_p), i.e.,

Z_t = φ₁Z_{t-1} + φ₂Z_{t-2} + ... + φ_pZ_{t-p} + ε_t,  t = 1, 2, ..., n,  (3.3.1)

where {ε_t} is an uncorrelated series with mean 0 and variance σ² (white noise). Let

γ_v = E{Z_t Z_{t+v}},  v = 0, ±1, ±2, ...,

C_v = (1/n) Σ_{t=1}^{n-|v|} Z_t Z_{t+|v|},  v = 0, ±1, ±2, ..., ±(n-1),

represent the true and sample autocovariance functions, respectively. Also, let ρ_v = γ_v/γ₀ and r_v = C_v/C₀ be the true and sample autocorrelations of order v, respectively. Then the vector of lag coefficients φ = (φ₁, φ₂, ..., φ_p)' is estimated by φ̂ = (φ̂₁, φ̂₂, ..., φ̂_p)' in the modified Yule-Walker equations:

Γ̂ φ̂ = r,

where Γ̂ is a symmetric p×p matrix given by:

    [ 1        r₁       r₂      ...  r_{p-1} ]
    [ r₁       1        r₁      ...  r_{p-2} ]
    [ r₂       r₁       1       ...  r_{p-3} ]
    [ ...                                    ]

and r = (r₁, r₂, ..., r_p)'. Also, the estimate of σ² is given by

σ̂_p² = C₀[1 - r'φ̂].
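These estimation formulas can be sketched directly. The code below is our own illustration (function names and the AR(1) check with φ = 0.5 and n = 500 are arbitrary choices, not the dissertation's):

```python
import numpy as np

def yule_walker(z, p):
    """Solve the modified Yule-Walker equations Gamma_hat phi_hat = r and
    return (phi_hat, sigma2_hat), with sigma2_hat = C0 [1 - r' phi_hat]."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    # Sample autocovariances C_0, ..., C_p and autocorrelations r_1, ..., r_p.
    C = [z[v:] @ z[:n - v] / n for v in range(p + 1)]
    r = np.array(C[1:]) / C[0]
    # Gamma_hat has (i, j) entry r_{|i-j|}, with r_0 = 1 on the diagonal.
    Gamma = np.array([[1.0 if i == j else r[abs(i - j) - 1]
                       for j in range(p)] for i in range(p)])
    phi_hat = np.linalg.solve(Gamma, r)
    sigma2_hat = C[0] * (1.0 - r @ phi_hat)
    return phi_hat, sigma2_hat

# Quick check on a simulated AR(1) series with phi = 0.5.
rng = np.random.default_rng(0)
z = np.zeros(500)
for t in range(1, 500):
    z[t] = 0.5 * z[t - 1] + rng.standard_normal()
phi_hat, sigma2_hat = yule_walker(z, 1)
```

For p = 1 the system collapses to φ̂₁ = r₁, so the estimate should land near the true value 0.5.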

3.3.2 General Case

Due to the complexities involved with obtaining the exact distribution of the two test statistics of interest, this research will consider the large sample properties of the tests. Anderson [4], Grenander and Rosenblatt [15], Fuller [13] and others showed that the asymptotic joint distribution of the first K sample autocorrelations is multivariate normal, although Bartlett [6] was one of the first to consider the form of the limiting variance of the sample autocorrelation. This asymptotic result, given in its most general form by Anderson, is stated in the lemma that follows.

Lemma 1: Let

Z_t = Σ_{j=-∞}^∞ α_j ε_{t-j},

where {ε_t} is white noise and

Σ_{j=-∞}^∞ |α_j| < ∞.

Define the quantity P_v as

P_v = Σ_{j=-∞}^∞ ρ_j ρ_{v-j},

where

ρ_v = γ_v/γ₀ = E{Z_t Z_{t-v}}/E{Z_t²},

and let

r_v = (Σ_{t=v+1}^n Z_t Z_{t-v}) / (Σ_{t=1}^n Z_t²),  1 ≤ v ≤ K.

Then √n(r - ρ) →D N_K{0, V}, where r = (r₁, r₂, ..., r_K)', ρ = (ρ₁, ρ₂, ..., ρ_K)' and V = (v_ij) is the K×K dispersion matrix of the limiting distribution whose elements are given by:

v_ii = P₀ + P_{2i} + 2ρ_i²P₀ - 4ρ_iP_i,  1 ≤ i ≤ K,  (3.3.2)

v_ij = P_{i+j} + P_{i-j} + 2ρ_iρ_jP₀ - 2ρ_iP_j - 2ρ_jP_i,  1 ≤ j < i ≤ K.  (3.3.3)

Proof: See Anderson [4], pp. 489-495.

Recall the invertibility property of the ARMA model in Section 3.1. Fuller [13] and others show that {Z_t} generated from the AR process given by (3.3.1) can be expressed as an infinite moving average (MA) of {ε_t}, i.e.,

Z_t = Σ_{j=0}^∞ α_j ε_{t-j},

with {α_j} absolutely summable, if the roots of the equation

x^p + φ₁x^{p-1} + φ₂x^{p-2} + ... + φ_p = 0

are all in the unit circle. Since model (3.3.1) is stationary, this condition is satisfied and thus the AR process is just a special case of the time series model defined in Lemma 1. However, before applying Lemma 1 to obtain the asymptotic distribution of the Box-Pierce and max-χ² test statistics, we point out that in practice, the {Z_t} in model (3.3.1) are unknown. Hence, the researcher, unable to calculate the sample autocorrelations,

r_v = (Σ_{t=v+1}^n Z_t Z_{t-v}) / (Σ_{t=1}^n Z_t²),  1 ≤ v ≤ K,

must, instead, calculate the estimated sample autocorrelations,

r̂_v = (Σ_{t=v+1}^n Ẑ_t Ẑ_{t-v}) / (Σ_{t=1}^n Ẑ_t²),  1 ≤ v ≤ K,

where {Ẑ_t} are the residuals computed from least squares regression. The {r̂_v} are used in place of {r_v} when calculating the Box-Pierce and max-χ² test statistics. Thus, our interest lies in determining the limiting behavior of the estimated sample autocorrelations, (r̂₁, r̂₂, ..., r̂_K).
A result due to Anderson [4], formally stated in Lemma 2, shows that under certain regularity conditions on the design matrix X of the regression, the limiting distribution of the sample autocorrelations defined in Lemma 1 also holds for autocorrelations computed using the residuals {Ẑ_t} calculated from least squares regression.

Lemma 2: Let Y = Xβ + Z be the regression model given in equation (1.1.1), where

Z_t = Σ_{j=-∞}^∞ α_j ε_{t-j},

Σ_{j=-∞}^∞ |α_j| < ∞,

and where {ε_t} is white noise with bounded moments. Let

a_ij(h) = Σ_{t=1}^n x_{i,t+h} x_{j,t},  1 ≤ i ≤ g, 1 ≤ j ≤ g,

and

    [ x₁₁  x₂₁  ...  x_{g1} ]
X = [ x₁₂  x₂₂  ...  x_{g2} ]
    [ ...                   ]
    [ x₁ₙ  x₂ₙ  ...  x_{gn} ].

Given that the following four conditions (Grenander's conditions) hold as n → ∞:

(1) a_ii(0) → ∞,

(2) x²_{i,n+1}/a_ii(0) → 0,

(3) a_ij(h)/√(a_ii(0) a_jj(0)) → ρ_ij(h),

(4) R(0) is nonsingular, where R(h) = [ρ_ij(h)],

then

√n(r̂ - ρ) →D N_K{0, V},

where ρ and V are defined in Lemma 1,

r̂_j = (Σ_{t=j+1}^n Ẑ_t Ẑ_{t-j}) / (Σ_{t=1}^n Ẑ_t²),  1 ≤ j ≤ K,

and Ẑ = Y - Xβ̂ = [I - X(X'X)^{-1}X']Y.

Proof: See Anderson [4], pg. 593.
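The quantities appearing in Lemma 2 are easy to compute directly. The following sketch is our own code (the intercept-plus-trend design and all numerical values are arbitrary illustrations); it forms Ẑ = [I - X(X'X)⁻¹X']Y and its first K sample autocorrelations:

```python
import numpy as np

def residual_autocorrelations(Y, X, K):
    """Least squares residuals Z_hat = [I - X(X'X)^{-1}X']Y and their
    first K sample autocorrelations r_hat_1, ..., r_hat_K."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix X(X'X)^{-1}X'
    Z_hat = Y - H @ Y
    n = len(Z_hat)
    denom = Z_hat @ Z_hat
    r_hat = np.array([Z_hat[j:] @ Z_hat[:n - j] / denom
                      for j in range(1, K + 1)])
    return Z_hat, r_hat

# Example: white-noise errors around a linear trend.
rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), np.arange(n)])
Y = X @ np.array([1.0, 0.25]) + rng.standard_normal(n)
Z_hat, r_hat = residual_autocorrelations(Y, X, K=10)
```

With white-noise errors, Lemma 2 says these residual autocorrelations behave asymptotically like those of the unobserved errors themselves.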

In order to simplify notation, let

T^(BP) = n Σ_{j=1}^K r_j² = n r'r,

T^(m) = n max_{1≤j≤K} r_j².

The asymptotic null and non-null distributions of T^(BP) and T^(m) are given in theorem form. As a consequence of Lemma 2, the results stated throughout the remainder of this chapter may also be applied when T^(BP) and T^(m) are computed from least squares residuals.

Theorem 1: Let r = (r₁, r₂, ..., r_K)' be the vector of sample autocorrelations generated from model (3.3.1), and let T^(BP) and T^(m) represent the Box-Pierce and max-χ² test statistics, respectively. Then

T^(BP) →D χ²_K(0) under H₀,

i.e., T^(BP) converges in distribution when the null hypothesis of white noise is true to a central χ² random variable with K degrees of freedom, and

T^(m) →D M under H₀,

i.e., T^(m) converges in distribution when the null hypothesis of white noise is true to the distribution of M, where M represents the maximum order statistic from a sequence of K independent χ²₁ random variables.

Proof: Under the null hypothesis of white noise, {Z_t} is uncorrelated with mean 0 and variance σ². Without loss of generality, take σ² = 1. In the null case, we then have

γ_v = 1 (v = 0);  γ_v = 0 (v = 1, 2, ..., K),

ρ_v = 1 (v = 0);  ρ_v = 0 (v = 1, 2, ..., K),

P_v = 1 (v = 0);  P_v = 0 (v = 1, 2, ..., K).

Recall that {Z_t} in the stationary AR model of (3.3.1) may be written as an infinite MA process,

Z_t = Σ_{j=0}^∞ α_j ε_{t-j},

where the {α_j} are absolutely summable. Having satisfied the conditions of Lemma 1, we apply equations (3.3.2) and (3.3.3) to obtain

v_ij = 1 (i = j);  v_ij = 0 (i ≠ j),  (i = 1, 2, ..., K), (j = 1, 2, ..., K).

Hence, by Lemma 1:

√n r →D N_K{0, I} under H₀.

Then T^(BP) →D χ²_K(0), where χ²_K(0) is a central χ² random variable with K degrees of freedom, and T^(m) →D M, where M is the maximum order statistic from a sequence of K independent χ²₁ random variables.
We now proceed to find the appropriate critical values, C_α^(BP) and C_α^(m), where:

P_{H₀}{T^(BP) > C_α^(BP)} = α,

P_{H₀}{T^(m) > C_α^(m)} = α.

It is obvious that C_α^(BP) = χ²_{K,α}, where χ²_{K,α} is given by

P{χ²_K(0) > χ²_{K,α}} = α.

In order to find C_α^(m), let C_α^(m) = c and observe that:

P{n max r_j² < c} = P{nr₁² < c, nr₂² < c, ..., nr_K² < c}

= P{nr₁² < c}·P{nr₂² < c}···P{nr_K² < c}

= [P{-√c < √n r₁ < √c}]^K

= [P{-√c < Z < √c}]^K,

where Z is a normal random variable with mean 0 and variance 1. Then

1 - α = [P{-√c < Z < √c}]^K,

(1 - α)^{1/K} = P{-√c < Z < √c} = 2Φ(√c) - 1,

where

Φ(x) = ∫_{-∞}^x (2π)^{-1/2} e^{-z²/2} dz.

Thus, we have that

C_α^(m) = [Φ^{-1}({(1-α)^{1/K} + 1}/2)]².

The two procedures for testing H₀: {Z_t} uncorrelated versus H₁: {Z_t} correlated, at significance level α, can be outlined as follows:

1. Box-Pierce test: reject H₀ in favor of H₁ if T^(BP) > χ²_{K,α}.

2. Max-χ² test: reject H₀ in favor of H₁ if T^(m) > [Φ^{-1}(½ + ½(1-α)^{1/K})]².

In order to compare the two procedures in terms of power (Chapter 4), consideration must also be given to the non-null distributions of the test statistics. However, the asymptotic distributions of T^(BP) and T^(m) are not so easily derived when the {Z_t} are dependent.

Recall that √n(r - ρ) →D N_K{0, V} (Lemma 1). When the alternative of serial correlation is true, the asymptotic dispersion matrix V has a non-diagonal form. For our purposes, we write that, under Ha, √n r ~ N_K{√n ρ, V} for large n, where "~" means "has an approximate distribution," and ρ and V are determined from the expressions given in Lemma 1. A result due to H. Scheffé [28], given here as a lemma, enables us to express the Box-Pierce statistic T^(BP) as a linear combination of independent non-central χ² random variables. Following this lemma, the asymptotic non-null distribution of T^(BP) is given in theorem form.

Lemma 3: Let x = (x₁, x₂, ..., x_K)' be a column random vector which follows a multivariate normal law with mean vector μ and covariance matrix Σ. Consider the quadratic form Q = x'Ax, where A is a K×K symmetric matrix. Then if Σ is nonsingular, Q may be expressed in the form

Q = Σ_{i=1}^k λ_i χ²_{r_i}(δ_i²),

where (λ₁, λ₂, ..., λ_k) are the distinct nonzero eigenvalues of AΣ, (r₁, r₂, ..., r_k) their respective orders of multiplicity, (δ₁², δ₂², ..., δ_k²) are certain linear combinations of (μ₁, μ₂, ..., μ_K), and the {χ²_{r_i}(δ_i²)} are a sequence of independent non-central χ² random variables with r_i degrees of freedom and non-centrality parameter δ_i².

Proof: See Scheffé [28], p. 418.

Theorem 2: Let r = (r₁, r₂, ..., r_K)' be the vector of sample autocorrelations generated from model (3.3.1), and V = (v_ij) the symmetric positive definite K×K dispersion matrix of the asymptotic distribution of √n r. Also, let V = TT', T a lower triangular matrix, and consider the statistic

T^(BP) = n Σ_{j=1}^K r_j² = n(r'r).

Then we can write, for large n:

T^(BP) ~ Σ_{i=1}^ℓ λ_i χ²_{m_i}(δ_i²),

where (i) (λ₁, λ₂, ..., λ_ℓ) are the distinct nonzero eigenvalues of V,

(ii) (m₁, m₂, ..., m_ℓ) are the respective orders of multiplicity for the eigenvalues,

(iii) {χ²_{m_i}(δ_i²)}_{i=1}^ℓ is a sequence of independent non-central χ² random variables with m_i degrees of freedom and non-centrality parameter δ_i², i = 1, 2, ..., ℓ,

(iv) δ_i² = n ρ'V^{-1}T{Π_{s=1,s≠i}^ℓ (λ_i - λ_s)^{-1}(T'T - λ_s·I)}T^{-1}ρ,  i = 1, 2, ..., ℓ.

Proof: From Lemma 1, we have √n r ~ N_K{√n ρ, V} for large n. Let x = √n r, μ = √n ρ, Σ = V, and A = I in Lemma 3. Then

T^(BP) = n(r'r) = (√n r)'I(√n r) = x'Ax,

and the conditions of Lemma 3 are satisfied. Thus, for large n,

T^(BP) ~ Σ_{i=1}^ℓ λ_i χ²_{m_i}(δ_i²),

where λ_i, m_i, and χ²_{m_i}(δ_i²) are defined by (i), (ii), and (iii). If V is positive definite, W.Y. Tan [31] obtained a formula useful for computing the non-centrality parameters {δ_i²}:

δ_i² = n ρ'V^{-1}T E_i* T^{-1}ρ  (i = 1, 2, ..., ℓ),

where V = TT' and E_i* = Π_{s=1,s≠i}^ℓ (λ_i - λ_s)^{-1}(T'T - λ_s·I), i = 1, 2, ..., ℓ. This is condition (iv) of the theorem, and thus completes the proof.
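In practice, the tail probability of this weighted sum can also be approximated by drawing directly from the limiting normal law √n r ~ N_K(√n ρ, V), which sidesteps the eigenvalue bookkeeping entirely. The Monte Carlo sketch below is our own device, not a method used in the dissertation:

```python
import numpy as np

def bp_power_mc(rho, V, n, crit, reps=20000, seed=0):
    """Approximate P{T^(BP) > crit} under H_a by simulating
    sqrt(n) r ~ N_K(sqrt(n) rho, V) and forming T^(BP) = n r'r."""
    rng = np.random.default_rng(seed)
    mean = np.sqrt(n) * np.asarray(rho, dtype=float)
    x = rng.multivariate_normal(mean, V, size=reps)  # draws of sqrt(n) r
    T = (x ** 2).sum(axis=1)                         # = n * r'r
    return float((T > crit).mean())
```

As a sanity check, with ρ = 0 and V = I (the null case of Theorem 1) and the 5% point χ²₃,.05 ≈ 7.815, the simulated rejection rate should come back near .05.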

By Theorem 2, the asymptotic distribution of T^(BP) is found by determining the distribution of a weighted sum of independent non-central χ² random variables. The problem is not an easy one, but J. P. Imhof [23] and others have succeeded in deriving an expression for the distribution of the quadratic form Q (given in Lemma 3), by means of an inversion of the characteristic function of Q. These results are later applied in Chapter 4, where we seek powers of the test.

Similarly, the asymptotic non-null distribution of the max-χ² statistic T^(m) is a complex one, as described in the following theorem.

Theorem 3: Let r = (r₁, r₂, ..., r_K)' be the vector of sample autocorrelations generated from model (3.3.1), and let T^(m) = n·max_{1≤j≤K} r_j². Then, for large n, T^(m) is approximately distributed as the maximum order statistic from a sequence of K dependent non-central χ² random variables, {v_ii·χ²₁(δ_i²)}_{i=1}^K, with

(i) non-centrality parameters given by δ_i² = nρ_i²/(2v_ii), 1 ≤ i ≤ K,

(ii) v_ii = P₀ + P_{2i} + 2ρ_i²P₀ - 4ρ_iP_i, 1 ≤ i ≤ K.

Proof: Again, by Lemma 1, √n r ~ N_K{√n ρ, V} for large n, where the elements of V = (v_ij) are found from equations (3.3.2) and (3.3.3). Hence, for large n, the marginal distributions of {√n r_i}_{i=1}^K can be approximated by a sequence of K dependent normal random variables with means {√n ρ_i}_{i=1}^K and variances {v_ii}_{i=1}^K, respectively. Now, for 1 ≤ i ≤ K, √n r_i ~ N{√n ρ_i, v_ii} implies that

(√n r_i)/√v_ii ~ N{(√n ρ_i)/√v_ii, 1}.

Thus, we have that nr_i²/v_ii ~ χ²₁(δ_i²), where δ_i² = nρ_i²/(2v_ii); or, we write, nr_i² ~ v_ii χ²₁(δ_i²), 1 ≤ i ≤ K. It follows that T^(m) = max_{1≤j≤K}(nr_j²) is approximately distributed as the maximum order statistic of the dependent (since v_ij ≠ 0 in general) sequence {v_ii χ²₁(δ_i²)}_{i=1}^K.

The complexity of the distribution derived in Theorem 3 makes power computations very difficult. In Section, this problem is made somewhat more workable by considering a Taylor-series expansion of the density function of T^(m).

In the sections that follow, a few simple alternative error models are presented, and the non-null distributions of the max-χ² and Box-Pierce statistics are discussed. The severity of the distributional problem, even in these simple cases, will be illustrated.

3.3.3 Special Cases

Recall the first order AR alternative error model considered by Durbin-Watson (Section 1.2.1). Let us consider a more general form of this model,

Z_t = φZ_{t-J} + ε_t,  (3.3.4)

where t = 1, 2, ..., n and {ε_t} is a white noise process. In the terminology of the previous sections, we call model (3.3.4) a first order AR model with maximum (and singleton) lag J, where J is unknown.

Lemma 4: The vector of true autocorrelations, ρ = (ρ₁, ρ₂, ...)', for model (3.3.4) is calculated as follows:

ρ_v = 1,    v = 0;
ρ_v = φ^m,  v = mJ (m = 1, 2, 3, ...);  (3.3.5)
ρ_v = 0,    otherwise.

Proof: Recall that ρ_v = γ_v/γ₀, v = 1, 2, 3, .... Using model (3.3.4) we write, for v = 1, 2, ...,

γ_v = E{Z_t·Z_{t-v}} = φE{Z_{t-J}·Z_{t-v}} + E{ε_t·Z_{t-v}}
    = φE{Z_t·Z_{t-(J-v)}} + E{ε_t·Z_{t-v}}.

Now, E{ε_t·Z_{t-v}} = 0 since ε_t is uncorrelated with Z_{t-v} for v ≥ 1. Hence,

γ_v = φ·E{Z_t·Z_{t-(J-v)}} = φγ_{J-v}  (v = 1, 2, 3, ...).  (3.3.6)

Immediately, we see that for m = 1, 2, 3, ...,

γ_mJ = φγ_{(m-1)J},

and thus,

ρ_mJ = φρ_{(m-1)J}.

Since ρ₀ = 1, it follows that ρ_mJ = φ^m (m = 1, 2, 3, ...). Consider now those autocorrelations of order not a multiple of J. Take v = J-v in equation (3.3.6). Then γ_{J-v} = φγ_v; but also γ_v = φγ_{J-v}. This implies that γ_v = φ²γ_v, for v = 1, 2, 3, ..., J-1, J+1, ..., 2J-1, 2J+1, .... And since φ² ≠ 1 (by stationarity), it must be that γ_v = 0 for v ≠ mJ (m = 1, 2, 3, ...). This completes the proof.
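The spike pattern in (3.3.5) is easy to see by simulation. The sketch below is our own illustration; the choices φ = 0.6, J = 3, and n = 5000 are arbitrary:

```python
import random

def simulate_lag_j_ar(phi, J, n, seed=0):
    """Simulate Z_t = phi * Z_{t-J} + eps_t with standard normal white noise."""
    random.seed(seed)
    z = [0.0] * (J + n)
    for t in range(J, J + n):
        z[t] = phi * z[t - J] + random.gauss(0.0, 1.0)
    return z[J:]          # drop the zero start-up values

def sample_acf(z, K):
    """Sample autocorrelations r_1, ..., r_K."""
    n = len(z)
    c0 = sum(x * x for x in z)
    return [sum(z[t] * z[t - v] for t in range(v, n)) / c0
            for v in range(1, K + 1)]

# Only lags 3, 6, 9 should stand out, near phi, phi^2, phi^3.
z = simulate_lag_j_ar(0.6, 3, 5000)
r = sample_acf(z, 9)
```

Lags that are not multiples of J hover near zero, exactly as Lemma 4 predicts.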

The vector ρ, given in equation (3.3.5), will be used to compute the asymptotic mean vector of the sample autocorrelations for this special case.

Lemma 5: For model (3.3.4), the quantities {P_v}, v = 0, 1, 2, ..., defined in Lemma 1, are given by:

P_v = (1+φ²)/(1-φ²),                 v = 0;
P_v = φ^m[m+1-(m-1)φ²]/(1-φ²),       v = mJ (m = 1, 2, 3, ...);  (3.3.7)
P_v = 0,                             otherwise.

Proof: From Lemma 1, we have P_v = Σ_{j=-∞}^∞ ρ_j ρ_{v-j} (v = 0, 1, 2, ...). First let v = 0. Then we have

P₀ = Σ_{j=-∞}^∞ ρ_j² = ρ₀² + 2 Σ_{j=1}^∞ ρ_j².

Applying the results from Lemma 4, we obtain

P₀ = 1 + 2 Σ_{m=1}^∞ ρ_{mJ}² = 1 + 2 Σ_{m=1}^∞ φ^{2m}.

Since

Σ_{i=1}^∞ a^i = a/(1-a)

if |a| < 1, we have

P₀ = 1 + 2φ²/(1-φ²) = (1+φ²)/(1-φ²).

For v a non-multiple of J and j ∈ (-∞, ∞), the integers j and v-j cannot both be multiples of J simultaneously. It follows that P_v = 0 for v ≠ mJ (m = 1, 2, ...).

To find the quantities P_mJ (m = 1, 2, ...), we write

P_mJ = Σ_{j=-∞}^∞ ρ_j ρ_{mJ-j} = Σ_{j=-∞}^∞ ρ_{jJ} ρ_{(m-j)J}

     = Σ_{j=-∞}^{-1} ρ_{jJ} ρ_{(m-j)J} + Σ_{j=0}^m ρ_{jJ} ρ_{(m-j)J} + Σ_{j=m+1}^∞ ρ_{jJ} ρ_{(m-j)J}.  (3.3.8)

Utilizing the results in Lemma 4, ρ_{jJ} = φ^{|j|}, so the middle sum contributes (m+1)φ^m. Now take i = -j in the first summation and i = j-m in the last summation of equation (3.3.8) and simplify to finally obtain:

P_mJ = φ^m Σ_{i=1}^∞ φ^{2i} + (m+1)φ^m + φ^m Σ_{i=1}^∞ φ^{2i}

     = 2φ^m·φ²/(1-φ²) + (m+1)φ^m

     = φ^m[m+1-(m-1)φ²]/(1-φ²).
We now proceed to find the structure of the asymptotic distribution

of the sample autocorrelations for this special case using the results

derived in Lemmas 4 and 5.

Theorem 4: The covariance structure for the asymptotic distribution of the first K sample autocorrelations, (r₁, r₂, ..., r_K), from model (3.3.4), is given by:

v_vv = {1+φ² - (2m+1)φ^{2m} + (2m-1)φ^{2(m+1)}}/(1-φ²),  v = mJ (m = 1, 2, 3, ...);
v_vv = {1+φ² + (m+1)φ^m - (m-1)φ^{m+2}}/(1-φ²),          v = mJ/2, J even (m = 1, 3, 5, ...);  (3.3.9)
v_vv = (1+φ²)/(1-φ²),                                    otherwise;

and

v_{v₁v₂} = φ^m[m+1-(m-1)φ²]/(1-φ²),  if v₁+v₂ = mJ or v₁-v₂ = mJ, but not both (m = 1, 2, 3, ...);
v_{v₁v₂} = {φ^m[m+1-(m-1)φ²] + φ^ℓ[ℓ+1-(ℓ-1)φ²]}/(1-φ²),  if v₁+v₂ = mJ (m = 1, 2, 3, ...) and v₁-v₂ = ℓJ (ℓ = 1, 2, 3, ...);  (3.3.10)
v_{v₁v₂} = {φ^{m-ℓ}[m-ℓ+1-(m-ℓ-1)φ²] - φ^{m+ℓ}[m+ℓ+1-(m+ℓ-1)φ²]}/(1-φ²),  if v₁ = mJ (m = 1, 2, ...) and v₂ = ℓJ (ℓ = 1, 2, ...);
v_{v₁v₂} = 0,  otherwise.



Proof: Recall from Lemma 1 the expression for the asymptotic variance of the vth order sample autocorrelation, √n r_v:

v_vv = P₀ + P_{2v} + 2ρ_v²P₀ - 4ρ_vP_v.  (3.3.11)

For v ≠ mJ and 2v ≠ mJ, we have ρ_v = 0 (Lemma 4) and P_v = P_{2v} = 0 (Lemma 5), thus

v_vv = P₀ = (1+φ²)/(1-φ²)  (v ≠ mJ, 2v ≠ mJ; m = 1, 2, 3, ...).

Now consider v = mJ. Here we have, from Lemmas 4 and 5, ρ_v = φ^m,

P_v = φ^m[m+1-(m-1)φ²]/(1-φ²),

P_{2v} = φ^{2m}[2m+1-(2m-1)φ²]/(1-φ²).

We then can write

v_vv = (1+φ²)/(1-φ²) + φ^{2m}[2m+1-(2m-1)φ²]/(1-φ²) + 2φ^{2m}(1+φ²)/(1-φ²) - 4φ^{2m}[m+1-(m-1)φ²]/(1-φ²)

     = {1+φ² + φ^{2m}[-2m-1 + φ²(-1+2m)]}/(1-φ²)

     = {1+φ² - (2m+1)φ^{2m} + (2m-1)φ^{2(m+1)}}/(1-φ²)  (v = mJ; m = 1, 2, ...).

Because of the term P_{2v} in equation (3.3.11), we must also consider those orders which are multiples of J/2 when J is an even integer, i.e., v = J/2, 3J/2, 5J/2, .... Here 2v becomes an odd multiple of J, thus (from Lemmas 4 and 5)

ρ_v = 0, P_v = 0 and P_{2v} = P_mJ = φ^m[m+1-(m-1)φ²]/(1-φ²)  (m = 1, 3, 5, ...).

It follows that

v_vv = (1+φ²)/(1-φ²) + φ^m[m+1-(m-1)φ²]/(1-φ²)

     = {1+φ² + (m+1)φ^m - (m-1)φ^{m+2}}/(1-φ²),  v = mJ/2, J even (m = 1, 3, 5, ...).

This completes the proof of equation (3.3.9).

The asymptotic covariances of the sample autocorrelations are found

in a similar manner. Rewriting equation (3.3.3) from Lemma 1, we have,

for large n,

V1 v2 = PVl+V2 + PV1- 2 + 2pv pv2Po 2P P 2

2p Pv (V1 v2). (3.3.12)

These computations become quite involved, for now we must consider those

values of v1, v2, v1 + v2, or v1 - v2 which are multiples of J. The result

is obtained in a straightforward but nontrivial fashion if we group the

possibilities as follows:

a) v1 a multiple of J, but v2 a nonmultiple of J.

b) v2 a multiple of J, but v1 a nonmultiple of J.

c) v1 + v2 a multiple of J, but v1 - v2 a nonmultiple of J.

d) v1 - v2 a multiple of J, but v1 + v2 a nonmultiple of J.

e) Both v1 + v2 and v1 - v2 multiples of J, but neither v1

nor v2 a multiple of J.

f) Both v1 and v2 multiples of J.
Notice that in case (a) and case (b), neither v1 + v2 nor v1 - v2 can

be a multiple of J. Hence, each of the quantities summed in equation

(3.3.12) is zero. Thus,

v_{v1,v2} = 0 when exactly one of v1 and v2 is a multiple of J.

For case (c), the only nonzero term in (3.3.12) is P_{v1+v2}. Similarly,

for case (d), the only nonzero term in (3.3.12) is P_{v1-v2}. This gives

v_{v1,v2} = φ^m[m+1-(m-1)φ²]/(1-φ²)   for either v1+v2 = mJ

or v1-v2 = mJ, but not both (m = 1, 2, ...).

When the orders of the sample autocorrelations follow the pattern in case

(e), only the terms P_{v1+v2} and P_{v1-v2} are nonzero. Hence,

v_{v1,v2} = P_mJ + P_ℓJ = {φ^m[m+1-(m-1)φ²] + φ^ℓ[ℓ+1-(ℓ-1)φ²]}/(1-φ²)

for v1+v2 = mJ (m = 1, 2, 3, ...) and v1-v2 = ℓJ (ℓ = 1, 2, 3, ...).

Finally, we consider the order pattern in (f). Both v1 and v2 multiples

of J imply that both v1+v2 and v1-v2 are multiples of J. Thus, all of

the terms summed in (3.3.12) are nonzero. Utilizing Lemmas 4 and 5, and letting

v1 = mJ and v2 = ℓJ, we obtain:

v_{v1,v2} = P_{(m+ℓ)J} + P_{(m-ℓ)J} + 2ρ_mJ·ρ_ℓJ·P0 - 2ρ_mJ·P_ℓJ - 2ρ_ℓJ·P_mJ

= φ^(m+ℓ)[m+ℓ+1-(m+ℓ-1)φ²]/(1-φ²) + φ^(m-ℓ)[m-ℓ+1-(m-ℓ-1)φ²]/(1-φ²)

+ 2φ^(m+ℓ)(1+φ²)/(1-φ²) - 2φ^(m+ℓ)[ℓ+1-(ℓ-1)φ²]/(1-φ²) - 2φ^(m+ℓ)[m+1-(m-1)φ²]/(1-φ²)

= {φ^(m-ℓ)[m-ℓ+1-(m-ℓ-1)φ²] - φ^(m+ℓ)[m+ℓ+1-(m+ℓ-1)φ²]}/(1-φ²).

This completes the proof.
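The closed forms in (3.3.9) can be spot-checked numerically. The sketch below is my own (the helper names are not from the text); it assumes the lemma-level definitions ρ_v = φ^m at lags v = mJ (zero elsewhere) and P_v = Σ_k ρ_k ρ_{k+v}, truncates the infinite sums, evaluates the Bartlett-type expression (3.3.11), and compares with the theorem's closed forms for J = 2.

```python
def rho(v, phi, J):
    # true autocorrelation of Z_t = phi*Z_{t-J} + e_t: phi^m at lags v = mJ, else 0
    return phi ** (abs(v) // J) if v % J == 0 else 0.0

def P(v, phi, J, terms=400):
    # P_v = sum over integer lags k of rho_k * rho_{k+v} (truncated sum)
    return sum(rho(k, phi, J) * rho(k + v, phi, J)
               for k in range(-terms, terms + 1))

def asy_var(v, phi, J):
    # Bartlett-type variance (3.3.11): v_vv = P0 + P_2v + 2*rho_v^2*P0 - 4*rho_v*P_v
    return (P(0, phi, J) + P(2 * v, phi, J)
            + 2 * rho(v, phi, J) ** 2 * P(0, phi, J)
            - 4 * rho(v, phi, J) * P(v, phi, J))

phi, J = 0.5, 2
print(asy_var(1, phi, J))  # (3.3.9), case v = J/2:  (1+phi)^2/(1-phi^2) = 3.0
print(asy_var(2, phi, J))  # (3.3.9), case v = J:    1-phi^2 = 0.75
print(asy_var(3, phi, J))  # (3.3.9), case v = 3J/2: (1+phi^2+4phi^3-2phi^5)/(1-phi^2)
```

All three values agree with the corresponding branches of (3.3.9), which is a useful guard against sign errors in the derivation above.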

Model 1: The Case J=2, K=2

Let us now apply these results to model (3.3.4) when J=2, i.e.,

Zt = φZ(t-2) + εt .   (3.3.13)

Suppose also that the researcher chooses to base the test of

H0: φ = 0   versus   Ha: φ ≠ 0

on the first two sample autocorrelations rl and r2 (i.e., K=2). Call

this the case (J=2, K=2). Applying formulas (3.3.9) and (3.3.10) from

Theorem 4, we obtain

v11 = (1+2φ+φ²)/(1-φ²) = (1+φ)²/(1-φ²) ,

v22 = (1-2φ²+φ⁴)/(1-φ²) = 1-φ² ,

v12 = 0 .

From equation (3.3.5), we obtain ρ1 = 0 and ρ2 = φ. Thus, we have that,

as n tends to infinity, the vector (√n r1, √n r2 - √n φ)' is distributed

asymptotically normal with mean vector (0,0)' and dispersion matrix

V = [ (1+φ)²/(1-φ²)   0   ;   0   1-φ² ] .

We write:

(√n r1, √n r2)' ~ N{ (0, √n φ)' , [ (1+φ)²/(1-φ²)  0 ; 0  1-φ² ] }   (3.3.14)
for large n.

Corollary 1: Under the alternative error model given in equation (3.3.13),

the Box-Pierce statistic T(BP) = nr1² + nr2² is asymptotically distributed

as a weighted sum of independent single-degree-of-freedom chi-square random

variables:

T(BP) ~ [(1+φ)²/(1-φ²)]·χ²1(0) + (1-φ²)·χ²1(λ) ,

where λ = nφ²/(1-φ²).

Proof: From Theorem 2, we write T(BP) = Σi λi·χ²_mi(δi²), where the {λi},

{mi}, and {δi²} are defined previously. Since V is a diagonal matrix,

it is obvious that the eigenvalues of V are:

λ1 = (1+φ)²/(1-φ²)   and   λ2 = 1-φ² ,

with corresponding multiplicities m1 = 1 and m2 = 1. In order to determine

δ1² and δ2², we utilize (iv) of Theorem 2. Now

E1 = (λ1-λ2)⁻¹(T'T - λ2·I) = [1 0; 0 0] ,

E2 = (λ2-λ1)⁻¹(T'T - λ1·I) = [0 0; 0 1] ,

where

T = [ (1+φ)/√(1-φ²)   0 ; 0   √(1-φ²) ]   (recall that V = TT').

Hence, with ρ = (0, φ)', we have

δ1² = n·ρ'V⁻¹TE1T⁻¹ρ = 0 ,

δ2² = n·ρ'V⁻¹TE2T⁻¹ρ = nφ²/v22 = nφ²/(1-φ²) ,

which establishes the stated weights and noncentralities. []


From Corollary 1, then, we write the exact asymptotic power of the Box-

Pierce test as:

lim(n→∞) P{T(BP) > C(BP)} = P{ [(1+φ)²/(1-φ²)]·χ²1(0) + (1-φ²)·χ²1(λ) > C(BP) } ,   (3.3.15)

where λ = nφ²/(1-φ²). Now the probability on the right-hand side of

equation (3.3.15) is not an easy one to compute, since the constants

multiplying the single-degree-of-freedom χ² random variables are unequal.

In fact, the problem of obtaining the exact probability remains unsolved.

In the subsections that follow, we present three methods of approximating

the right-hand side of (3.3.15).
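One crude but serviceable benchmark for the right-hand side of (3.3.15) is direct simulation of the weighted sum in Corollary 1. The sketch below is mine, not part of the dissertation's method: the weights and noncentrality follow Corollary 1, while the illustrative critical value 5.991 assumes the 0.95 quantile of a central χ² with 2 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)
phi, n, reps = 0.3, 200, 200_000
lam1 = (1 + phi)**2 / (1 - phi**2)      # weight of the central chi-square(1)
lam2 = 1 - phi**2                       # weight of the noncentral chi-square(1)
delta2 = n * phi**2 / (1 - phi**2)      # noncentrality lambda = n*phi^2/(1-phi^2)

# draw T(BP) ~ lam1*chi2_1(0) + lam2*chi2_1(delta2)
z1 = rng.standard_normal(reps)
z2 = rng.standard_normal(reps) + np.sqrt(delta2)
T = lam1 * z1**2 + lam2 * z2**2

C = 5.991                               # assumed chi-square(2) 0.95 quantile
power = (T > C).mean()
print(round(power, 3))
```

The analytic approximations developed below (Imhoff integration, the Pearson three-moment method, and Rao's normal transformation) replace this brute-force device.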

Corollary 2: Under the alternative error model given in equation (3.3.13),

the exact asymptotic power of the max-χ² test, T(m) = max(nr1², nr2²), is

given by:

lim(n→∞) P{T(m) > C(m)} = 1 - [2Φ(√(C(m))·√(1-φ²)/(1+φ)) - 1]·[Φ((√C(m) - √n φ)/√(1-φ²)) - Φ((-√C(m) - √n φ)/√(1-φ²))] ,   (3.3.16)

where Φ(x) = ∫_(-∞)^x (2π)^(-1/2) e^(-z²/2) dz.

Proof: Let C(m) = C and start by writing the power of the max-χ² test as

P{T(m) > C} = P{max(nr1², nr2²) > C}

= 1 - P{max(nr1², nr2²) ≤ C}

= 1 - P{nr1² ≤ C, nr2² ≤ C}.

From the asymptotic distribution given in (3.3.14), we have that for

large n, √n r1 and √n r2 are distributed as independent normal random

variates; hence,

lim(n→∞) P{T(m) > C} = 1 - P{-√C ≤ √n r1 ≤ √C}·P{-√C ≤ √n r2 ≤ √C}

= 1 - [Φ(√C/√v11) - Φ(-√C/√v11)]·[Φ((√C-√nφ)/√v22) - Φ((-√C-√nφ)/√v22)]

= 1 - [2Φ(√C·√(1-φ²)/(1+φ)) - 1]·[Φ((√C-√nφ)/√(1-φ²)) - Φ((-√C-√nφ)/√(1-φ²))]. []
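Since (3.3.16) involves only standard normal tail areas, the exact asymptotic power is easy to evaluate. A minimal sketch (function names are mine; the critical value C is supplied by the user, here an assumed illustrative value near the 5% point):

```python
from math import erf, sqrt

def Phi(x):
    # standard normal cdf
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def max_chi2_power(phi, n, C):
    # exact asymptotic power (3.3.16) for the case J=2, K=2
    b1 = sqrt(C * (1 - phi**2)) / (1 + phi)   # sqrt(C)/sqrt(v11)
    s2 = sqrt(1 - phi**2)                     # sqrt(v22)
    b2 = (sqrt(C) - sqrt(n) * phi) / s2
    a2 = (-sqrt(C) - sqrt(n) * phi) / s2
    return 1.0 - (2.0 * Phi(b1) - 1.0) * (Phi(b2) - Phi(a2))

print(round(max_chi2_power(0.0, 50, 5.0), 3))   # phi = 0: near the nominal level
print(round(max_chi2_power(0.5, 200, 5.0), 3))  # moderate phi, n = 200: near 1
```

At φ = 0 the formula collapses to the size of the test, and for |φ| ≥ .5 with n = 200 the power is essentially 1, consistent with the simulation results reported in Chapter 4.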

Of course, tables are available for computing tail areas of the standard

normal distribution. Thus, for the special case (J=2, K=2), the exact

asymptotic power can be calculated for the max-X2 test. However, the

notion of exact asymptotic power is the exception rather than the rule.

The researcher who attempts to calculate powers utilizing an analytic

approach similar to the work above, will more often than not be forced

into approximating the true asymptotic power, as with the Box-Pierce

statistic. This point is further illustrated in the following section.

Model 2: The Case J=2, K=4

Again, let us consider model (3.3.13) as an appropriate representa-

tion of the residuals when serial correlation is present. Now we base

the test on the first four sample autocorrelations (r1, r2, r3, r4).

Call this the case (J=2, K=4). From equation (3.3.5), we observe that

ρ1 = 0, ρ2 = φ, ρ3 = 0, and ρ4 = φ². Applying (3.3.9) gives

v11 = (1+φ)²/(1-φ²) ,

v22 = 1-φ² ,

v33 = (1+φ²+4φ³-2φ⁵)/(1-φ²) ,

v44 = (1+φ²-5φ⁴+3φ⁶)/(1-φ²) = 1+2φ²-3φ⁴ .

The covariances, obtained from (3.3.10), are computed as follows:

v12 = 0 ,

v13 = [φ²(3-φ²)+2φ]/(1-φ²) = (2φ+3φ²-φ⁴)/(1-φ²) ,

v14 = 0 ,

v23 = 0 ,

v24 = [2φ-φ³(4-2φ²)]/(1-φ²) = 2φ(1-2φ²+φ⁴)/(1-φ²) = 2φ(1-φ²) ,

v34 = 0 .

Then, as n tends to infinity, the vector √n(r-ρ) is distributed

asymptotically normal, with mean vector 0 and dispersion matrix V,

V =

[ (1+φ)²/(1-φ²)        0           (2φ+3φ²-φ⁴)/(1-φ²)      0
  0                    1-φ²        0                       2φ(1-φ²)
  (2φ+3φ²-φ⁴)/(1-φ²)   0           (1+φ²+4φ³-2φ⁵)/(1-φ²)   0
  0                    2φ(1-φ²)    0                       1+2φ²-3φ⁴ ] .   (3.3.17)
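The matrix (3.3.17) can be assembled numerically as a check on the algebra; it should be symmetric, reduce to the identity under white noise (φ = 0), and remain positive definite for |φ| < 1. A sketch with an illustrative value of φ (the function name is mine):

```python
import numpy as np

def V_matrix(phi):
    # dispersion matrix (3.3.17) for the case J=2, K=4
    d = 1 - phi**2
    v11 = (1 + phi)**2 / d
    v33 = (1 + phi**2 + 4*phi**3 - 2*phi**5) / d
    v13 = (2*phi + 3*phi**2 - phi**4) / d
    v24 = 2*phi*(1 - phi**2)
    v44 = 1 + 2*phi**2 - 3*phi**4
    return np.array([[v11, 0.0, v13, 0.0],
                     [0.0,   d, 0.0, v24],
                     [v13, 0.0, v33, 0.0],
                     [0.0, v24, 0.0, v44]])

V = V_matrix(0.4)
print(np.all(np.linalg.eigvalsh(V) > 0))      # positive definite for |phi| < 1
print(np.allclose(V_matrix(0.0), np.eye(4)))  # white-noise case reduces to I
```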

The method of obtaining exact asymptotic powers in the case (J=2, K=2)

depended heavily on the independence of the first two sample autocorrela-

tions. The addition of the third and fourth order sample autocorrelations

into the testing procedure contributes nonzero off-diagonal elements to

the covariance structure, V. These covariances make the distributional

problem so complex that the researcher is forced to abandon the search

for exact asymptotic powers and seek, instead, approximate asymptotic

powers. In the subsections that follow, we examine several approximate

procedures for the Box-Pierce test, and one approximation for the max-χ²

test.

Approximate asymptotic power of the Box-Pierce test

Three approximate methods for computing the asymptotic power of the

Box-Pierce test are given in theorem form, and are labeled as follows:

(i) Imhoff numerical integration

(ii) Extension to Pearson's three-moment approximation

(iii) Rao asymptotic normality.

Lemma 6 (Imhoff): Let x have an n-variate normal distribution with

mean μ and covariance V, where V is an n×n symmetric positive definite

matrix. Consider the quadratic form

Q = x'A x = Σi λi·χ²_mi(δi²) ,

where {λi}, {mi}, and {δi²} are defined in Lemma 2. Then

P{Q > C} = 1/2 + (1/π) ∫₀^∞ [sin θ(u)]/[u·ρ(u)] du ,

where

θ(u) = (1/2) Σi [mi·tan⁻¹(λi u) + δi²λi u(1+λi²u²)⁻¹] - Cu/2 ,

ρ(u) = Πi (1+λi²u²)^(mi/4) · exp{(1/2) Σi (δi λi u)²/(1+λi²u²)} .

Proof: Imhoff's result is obtained by inversion of the characteristic

function of the variable Q. For details, see Imhoff [23], p. 421. []

Theorem 6 (Imhoff numerical integration): Under the alternative

model given by (3.3.13), the asymptotic power of the Box-Pierce test,

based on K=4 sample autocorrelations, is given by

lim(n→∞) P{T(BP) > C(BP)} = 1/2 + (1/π) ∫₀^∞ [sin θ(u)]/[u·ρ(u)] du ,   (3.3.18)

where

θ(u) = (1/2) Σi [mi·tan⁻¹(λi u) + δi²λi u(1+λi²u²)⁻¹] - C(BP)u/2 ,

ρ(u) = Πi (1+λi²u²)^(mi/4) · exp{(1/2) Σi (δi λi u)²/(1+λi²u²)} ,

and where {λi} are the eigenvalues of the matrix V given in (3.3.17),

with respective multiplicities {mi}, and {δi²} are obtained from (iv)

of Theorem 2.

Proof: The result follows directly from Theorem 2 and Lemma 6. []

The power expression in (3.3.18) results from Imhoff's ability to

find the distribution function of a linear combination of non-central χ²

random variables explicitly. However, the integral in (3.3.18) cannot

be computed analytically. Imhoff outlines a procedure in which the re-

searcher can approximate the power by a numerical integration method.

In Chapter 4, approximate asymptotic powers of the Box-Pierce test for

this case (J=2, K=4) and others are calculated by means of Imhoff's

procedure using numerical integration.
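A hedged sketch of such a routine follows. It is my own implementation of the inversion integral of Lemma 6, using a plain trapezoid rule rather than Imhoff's adaptive scheme, and it is validated on the one configuration with a closed form: λ = (1,1) with central single-degree-of-freedom components, for which Q is an ordinary χ²(2) and P{Q > C} = e^(-C/2).

```python
import numpy as np

def imhof_tail(C, lam, m, delta2, upper=200.0, npts=800_000):
    # P{ sum_i lam_i * chi2_{m_i}(delta2_i) > C } via the inversion
    # integral of Lemma 6, evaluated with a uniform trapezoid rule
    u = np.linspace(1e-6, upper, npts)
    theta = -0.5 * C * u
    log_rho = np.zeros_like(u)
    for li, mi, d in zip(lam, m, delta2):
        lu2 = (li * u)**2
        theta += 0.5 * (mi * np.arctan(li * u) + d * li * u / (1 + lu2))
        log_rho += 0.25 * mi * np.log1p(lu2) + 0.5 * d * lu2 / (1 + lu2)
    integrand = np.sin(theta) / (u * np.exp(log_rho))
    h = u[1] - u[0]
    val = h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return 0.5 + val / np.pi

# sanity check: lam = (1,1), central, is an ordinary chi-square(2),
# whose upper tail is exp(-C/2)
C = 5.991
print(round(imhof_tail(C, [1, 1], [1, 1], [0, 0]), 4))
print(round(np.exp(-C / 2), 4))
```

With the eigenvalues, multiplicities, and noncentralities of (3.3.17) substituted for the toy inputs, the same routine evaluates (3.3.18).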

Lemma 7 (Pearson): Given the conditions of Lemma 3, tail areas of

the distribution of the quadratic form Q may be approximated as follows:

P{Q > C} ≈ P{χ²_h(0) > C*} ,

where h = a2³/a3², C* = (C-a1)(h/a2)^(1/2) + h, aj = Σi λi^j(mi + j·δi²)

(j = 1, 2, 3), and χ²_h(0) is a central χ² random variable with h degrees of freedom.

Proof: The result is derived by extending Pearson's [25] "three-moment

central χ² approximation to the distribution of a non-central χ²" to

the general case of a quadratic form in normal random variables. Details

are given in Imhoff [23], p. 425. []

Theorem 7 (Imhoff's extension to Pearson's 3-moment approximation):

An approximation to the asymptotic power of the Box-Pierce test is given

by

lim(n→∞) P{T(BP) > C(BP)} ≈ P{χ²_h(0) > C*} ,   (3.3.19)

where h = a2³/a3², C* = (C(BP)-a1)(h/a2)^(1/2) + h, aj = Σi λi^j(mi+j·δi²)

(j = 1, 2, 3), with {λi}, {mi}, and {δi²} defined as in (iv) of Theorem 2,

and where "≈" represents Pearson's 3-moment central χ² approximation to the

distribution of a non-central χ².

Proof: The results of Theorem 2 enable us to write T(BP) as a quadratic

form, Q. Apply Imhoff's extension to Pearson's 3-moment approximation

(Lemma 7) and the result follows. []

Since h in Theorem 7 need not necessarily be an integer, the user of

this extended three-moment approximate method must have access to a

computer routine which computes tail areas of central χ² random variables

with non-integer degrees of freedom. For a numerical application of this

result, see Section 4.1.
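Such a routine can be built from the regularized incomplete gamma function. The sketch below is an assumption of mine, not the routine used in the dissertation: it evaluates the central χ² tail for non-integer h by the usual series/continued-fraction method and then applies Lemma 7, checked on the case where the approximation is exact (a central χ²(2)).

```python
from math import exp, log, lgamma, sqrt

def _gser(a, x, itmax=300, eps=1e-12):
    # series expansion for the lower regularized incomplete gamma P(a, x)
    ap, s, d = a, 1.0/a, 1.0/a
    for _ in range(itmax):
        ap += 1.0
        d *= x/ap
        s += d
        if abs(d) < abs(s)*eps:
            break
    return s*exp(-x + a*log(x) - lgamma(a))

def _gcf(a, x, itmax=300, eps=1e-12):
    # continued fraction for the upper regularized incomplete gamma Q(a, x)
    tiny = 1e-300
    b, c, d = x + 1.0 - a, 1.0/tiny, 1.0/(x + 1.0 - a)
    h = d
    for i in range(1, itmax):
        an = -i*(i - a)
        b += 2.0
        d = an*d + b
        d = tiny if abs(d) < tiny else d
        c = b + an/c
        c = tiny if abs(c) < tiny else c
        d = 1.0/d
        delt = d*c
        h *= delt
        if abs(delt - 1.0) < eps:
            break
    return exp(-x + a*log(x) - lgamma(a))*h

def chi2_sf(x, df):
    # upper tail of a central chi-square with (possibly non-integer) df
    a, xx = df/2.0, x/2.0
    if xx <= 0.0:
        return 1.0
    return 1.0 - _gser(a, xx) if xx < a + 1.0 else _gcf(a, xx)

def pearson_tail(C, lam, m, delta2):
    # Pearson three-moment approximation of Lemma 7 / Theorem 7
    a1, a2, a3 = (sum(l**j*(mi + j*d) for l, mi, d in zip(lam, m, delta2))
                  for j in (1, 2, 3))
    h = a2**3/a3**2
    Cstar = (C - a1)*sqrt(h/a2) + h
    return chi2_sf(Cstar, h)

C = 5.991
print(round(pearson_tail(C, [1, 1], [1, 1], [0, 0]), 4))  # reduces to exp(-C/2)
```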

Lemma 8 (Rao): Let √n(x-μ) →D N{0, Σ} and let g: R^K → R¹

be a continuous function. Then, under certain regularity conditions,

√n[g(x) - g(μ)] →D N{0, ξ'Σξ} ,

where ξ' = (∂g(t)/∂t1, ∂g(t)/∂t2, ..., ∂g(t)/∂tK) evaluated at t = μ.

Proof: See C. R. Rao [27], p. 387. []

Theorem 8 (Rao's asymptotic normal transformation): An alternate

form for the asymptotic power of the Box-Pierce test is given by:

lim(n→∞) P{T(BP) > C(BP)} = 1 - Φ( [√C(BP) - √n(ρ'ρ)^(1/2)] / [(ρ'ρ)⁻¹ρ'Vρ]^(1/2) ) + Φ( [-√C(BP) - √n(ρ'ρ)^(1/2)] / [(ρ'ρ)⁻¹ρ'Vρ]^(1/2) ) ,   (3.3.20)

where

Φ(x) = (2π)^(-1/2) ∫_(-∞)^x e^(-z²/2) dz ,

and ρ and V are the asymptotic mean vector and asymptotic dispersion

matrix, respectively, of the vector of sample autocorrelations, r.

Proof: From Lemma 1, we have √n(r-ρ) →D N{0, V}; hence, we may

apply the result of Lemma 8. Consider the function

g(t) = (t't)^(1/2) = (t1² + t2² + ... + tK²)^(1/2) .

Then

∂g(t)/∂ti = (1/2)(t1² + t2² + ... + tK²)^(-1/2)·(2ti)

= ti(t't)^(-1/2)   (i = 1, 2, ..., K),

which, evaluated at ti = ρi (i = 1, 2, ..., K),

implies that ξ = (ρ'ρ)^(-1/2)ρ. Thus, from Lemma 8:

√n[(r'r)^(1/2) - (ρ'ρ)^(1/2)] →D N{0, (ρ'ρ)⁻¹ρ'Vρ} .

For large n, we see that √n(r'r)^(1/2) has an approximate normal distribution

with mean √n(ρ'ρ)^(1/2) and variance (ρ'ρ)⁻¹ρ'Vρ. We compute the asymptotic

power by writing:

P{T(BP) > C} = P{n·r'r > C} = 1 - P{n·r'r ≤ C}

= 1 - P{-√C ≤ √n(r'r)^(1/2) ≤ √C}.

Taking limits on both sides, we have

lim(n→∞) P{T(BP) > C} = 1 - P{ [-√C - √n(ρ'ρ)^(1/2)]/[(ρ'ρ)⁻¹ρ'Vρ]^(1/2) ≤ Z ≤ [√C - √n(ρ'ρ)^(1/2)]/[(ρ'ρ)⁻¹ρ'Vρ]^(1/2) }

= 1 - Φ( [√C - √n(ρ'ρ)^(1/2)]/[(ρ'ρ)⁻¹ρ'Vρ]^(1/2) ) + Φ( [-√C - √n(ρ'ρ)^(1/2)]/[(ρ'ρ)⁻¹ρ'Vρ]^(1/2) ). []

We apply this result (Theorem 8) to the case (J=2, K=4).

Corollary 3: For the case (J=2, K=4), the asymptotic power of the Box-

Pierce test is given by the expression:

lim(n→∞) P{T(BP) > C(BP)} = 1 - Φ( [√C(BP) - φ√(n(1+φ²))] / √[(1+4φ²-2φ⁴-3φ⁶)/(1+φ²)] )

+ Φ( [-√C(BP) - φ√(n(1+φ²))] / √[(1+4φ²-2φ⁴-3φ⁶)/(1+φ²)] ) .   (3.3.21)

Proof: Recall that, for (J=2, K=4), ρ = (0, φ, 0, φ²)', and the matrix V

is given by (3.3.17). Then ρ'ρ = φ² + φ⁴ = φ²(1+φ²) and

ρ'Vρ = φ[φ(1-φ²) + 2φ³(1-φ²)] + φ²[2φ²(1-φ²) + φ²(1+2φ²-3φ⁴)]

= φ²(1-φ²) + 4φ⁴(1-φ²) + φ⁴(1+2φ²-3φ⁴)

= φ² + 4φ⁴ - 2φ⁶ - 3φ⁸

= φ²(1+4φ²-2φ⁴-3φ⁶) ,

so that

(ρ'ρ)⁻¹(ρ'Vρ) = (1+4φ²-2φ⁴-3φ⁶)/(1+φ²)

and √n(ρ'ρ)^(1/2) = φ√(n(1+φ²)). Now utilize equation (3.3.20) in

Theorem 8 to obtain the approximate asymptotic power expression (3.3.21). []
A comparison of the three approximations for computing the power of

the Box-Pierce statistic is found in Chapter 4.

Approximate asymptotic power of the max-χ² test

We now restrict our attention to the max-χ² test. For the case (J=2,

K=4), the asymptotic power of T(m) may be written:

P{T(m) > C} = P{n·max(1≤j≤4) rj² > C}

= 1 - P{nr1² ≤ C, nr2² ≤ C, nr3² ≤ C, nr4² ≤ C}

= 1 - P{-√C ≤ √n r1 ≤ √C, ..., -√C ≤ √n r4 ≤ √C}.

Let xi = √n ri and μi = √n ρi, 1 ≤ i ≤ 4. Then the problem is to evaluate

1 - ∫_(-√C)^(√C) ∫_(-√C)^(√C) ∫_(-√C)^(√C) ∫_(-√C)^(√C) f(x: μ, V) dx1 dx2 dx3 dx4 ,   (3.3.22)

where f(x: μ, V) is the joint density function of the first four sample

autocorrelations. For large n, √n r is multivariate normal; thus

f(x: μ, V) = (2π)⁻²|V|^(-1/2) exp{-(1/2)(x-μ)'V⁻¹(x-μ)} .

Due to the presence of off-diagonal elements in the matrix V, the exact

computation of the multiple integral in (3.3.22) becomes extremely cumber-

some. The researcher must again resort to approximate methods. We propose

a Taylor-series expansion method of approximating (3.3.22).

Definition 1: Let y = (y1, y2, ..., yK)' be distributed as

multivariate normal, with E(yi) = 0, E(yi²) = 1, 1 ≤ i ≤ K, and E(yi·yj) = Cij,

1 ≤ i ≤ K, 1 ≤ j ≤ K, i ≠ j (i.e., y ~ N{0, R}, where R = (Cij) is a correlation

matrix). A Taylor-series expansion of the density of y, f(y: 0, R), about

the elements Cij in R is defined as follows:

f(y: 0, R) = Σ(t=0 to ∞) (1/t!) [ Σ(i<j) Cij (∂/∂Cij) ]^t f(y: 0, R) |_{Cij}=0 ,   (3.3.23)

where {Cij} = 0 implies Cij = 0 for all i, j (i ≠ j). []

For notational purposes, let f0 denote f(y: 0, I), and evaluate all

derivatives with respect to the Cij at {Cij} = 0. Expanding (3.3.23), we obtain:

f(y: 0, R) = f0 + [C12(∂/∂C12)f0 + C13(∂/∂C13)f0 + ··· + C(K-1,K)(∂/∂C(K-1,K))f0]   (3.3.24)

+ (1/2!)[C12²(∂/∂C12)²f0 + ··· + C(K-1,K)²(∂/∂C(K-1,K))²f0

+ 2C12C13(∂²/∂C12∂C13)f0 + ··· + 2C(K-2,K)C(K-1,K)(∂²/∂C(K-2,K)∂C(K-1,K))f0]

+ (1/3!)[C12³(∂/∂C12)³f0 + ··· + C(K-1,K)³(∂/∂C(K-1,K))³f0

+ 3C12²C13(∂³/∂C12²∂C13)f0 + ···] + ··· .

We will attempt to simplify equation (3.3.24) in order to make it

more workable for the practical researcher. To do this, we introduce the

concept of Hermite polynomials.

Definition 2: Let φ(x) represent the density function of a

univariate N(0,1) random variable (i.e., φ(x) = (2π)^(-1/2) e^(-x²/2)). The nth

Hermite polynomial, Hn(x), is defined from the following identity:

(-1)^n Hn(x)φ(x) = (d/dx)^n φ(x). []   (3.3.25)

The properties of Hermite polynomials are given in the following lemma.

Lemma 9: Let Hn(x) be the nth Hermite polynomial. Then H0(x) = 1,

H1(x) = x, H2(x) = x²-1, H3(x) = x³-3x and, in general,

Hn+1(x) = x·Hn(x) - (d/dx)Hn(x).   (3.3.26)

Proof: By convention, take H0(x) = 1. Now,

(d/dx)φ(x) = -xφ(x) ,

(d/dx)²φ(x) = -φ(x) + x²φ(x) = (x²-1)φ(x) ,

(d/dx)³φ(x) = (2x)φ(x) - x(x²-1)φ(x) = -(x³-3x)φ(x) .

By equation (3.3.25), then, H1(x) = x, H2(x) = x²-1, and H3(x) = x³-3x.

In general, the (n+1)st Hermite polynomial is found as follows:

(-1)^(n+1) Hn+1(x)φ(x) = (d/dx)^(n+1) φ(x)

= (d/dx)[(-1)^n Hn(x)φ(x)]

= (-1)^n {[(d/dx)Hn(x)]φ(x) - x·Hn(x)φ(x)} .

Thus, we obtain the (n+1)st Hermite polynomial from the recursive relation:

Hn+1(x) = x·Hn(x) - (d/dx)Hn(x). []

In the lemmas that follow, the application of Hermite polynomials in the

problem at hand becomes evident.
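The recursion (3.3.26) is easy to mechanize. A sketch that carries each polynomial as a list of coefficients (the representation and function name are mine):

```python
def hermite_coeffs(n):
    # coefficients of the nth Hermite polynomial (coeffs[k] multiplies x^k),
    # built from the recursion (3.3.26): H_{n+1}(x) = x*H_n(x) - H_n'(x)
    H = [1.0]                                    # H_0(x) = 1
    for _ in range(n):
        xH = [0.0] + H                           # multiply H_n by x
        dH = [k*c for k, c in enumerate(H)][1:]  # differentiate H_n
        dH += [0.0]*(len(xH) - len(dH))          # pad to a common length
        H = [a - b for a, b in zip(xH, dH)]
    return H

print(hermite_coeffs(2))  # x^2 - 1        -> [-1.0, 0.0, 1.0]
print(hermite_coeffs(3))  # x^3 - 3x       -> [0.0, -3.0, 0.0, 1.0]
print(hermite_coeffs(4))  # x^4 - 6x^2 + 3 -> [3.0, 0.0, -6.0, 0.0, 1.0]
```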

Lemma 10: Given the conditions of Definition 1, we have:

(i) (∂/∂Cij)^t f(y: 0, R) = (∂²/∂yi∂yj)^t f(y: 0, R), 1 ≤ i ≤ K, 1 ≤ j ≤ K, i ≠ j;

(ii) (∂²/∂yi∂yj)^t f(y: 0, R) |_{Cij}=0 = Ht(yi)Ht(yj) f(y: 0, I).

Proof: The proof of (i) is straightforward. Perform the required partial

differentiation of the density f(y: 0, R) on each side of the equation, and

the result follows (this requires many tedious computations).

We derive the identity in (ii) for the case K=2. For the case K=2,

we have

R = [1  C12; C12  1] ,   R⁻¹ = (1-C12²)⁻¹ [1  -C12; -C12  1] ,

f(y: 0, R) = (2π)⁻¹(1-C12²)^(-1/2) exp{-(y1² - 2C12·y1y2 + y2²)/[2(1-C12²)]} .

Applying (i) and taking the partial derivatives, we obtain

(∂/∂C12)f(y: 0, R) = (∂²/∂y1∂y2)f(y: 0, R)

= (∂/∂y2){ f(y: 0, R)·[(C12·y2-y1)/(1-C12²)] }

= f(y: 0, R)·[C12/(1-C12²)] + f(y: 0, R)·[(C12·y2-y1)/(1-C12²)]·[(C12·y1-y2)/(1-C12²)] .

Taking C12 = 0 in the above equation gives:

(∂/∂C12)f(y: 0, R) |_{Cij}=0 = y1y2·f(y: 0, I) = H1(y1)H1(y2)f(y: 0, I).

The proof of the more general result (ii) follows a similar argument. []

Lemma 11: Given the conditions of Definition 1, the Taylor-series

expansion of f(y: 0, R) can be expressed as follows:

f(y: 0, R) = { 1 + [C12H1(y1)H1(y2) + C13H1(y1)H1(y3) + ··· + C(K-1,K)H1(y(K-1))H1(yK)]

+ (1/2!)[C12²H2(y1)H2(y2) + ··· + C(K-1,K)²H2(y(K-1))H2(yK)

+ 2C12C13H2(y1)H1(y2)H1(y3) + ··· + 2C(K-2,K)C(K-1,K)H1(y(K-2))H1(y(K-1))H2(yK)]

+ (1/3!)[C12³H3(y1)H3(y2) + ··· + C(K-1,K)³H3(y(K-1))H3(yK)

+ 3C12²C13H3(y1)H2(y2)H1(y3) + ··· + 6C12C13C14H3(y1)H1(y2)H1(y3)H1(y4) + ···]

+ (1/4!)[···] + ··· } f0 ,

where f0 = f(y: 0, I) = φ(y1)φ(y2)···φ(yK).

Proof: Apply the results of Lemma 10 directly to equation (3.3.24). []

Lemma 12:

∫_a^b Hn(x)φ(x)dx = Φ(b) - Φ(a) ,   n = 0,

∫_a^b Hn(x)φ(x)dx = H(n-1)(a)φ(a) - H(n-1)(b)φ(b) ,   n ≥ 1,

where

φ(x) = (2π)^(-1/2) e^(-x²/2) ,   Φ(x) = ∫_(-∞)^x φ(z)dz .

Proof: For n = 0,

∫_a^b H0(x)φ(x)dx = ∫_a^b φ(x)dx = Φ(b) - Φ(a) .

For n ≥ 1,

∫_a^b Hn(x)φ(x)dx = (-1)^n ∫_a^b (d/dx)^n φ(x)dx

= [(-1)^n (d/dx)^(n-1) φ(x)]_a^b

= [-H(n-1)(x)φ(x)]_a^b

= H(n-1)(a)φ(a) - H(n-1)(b)φ(b) . []
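Lemma 12 can be confirmed by quadrature. The sketch below (the interval and grid size are my choices) checks n = 1, 2, 3 on an arbitrary interval against the boundary-term formula:

```python
import numpy as np

def phi_pdf(x):
    return np.exp(-np.asarray(x, dtype=float)**2/2)/np.sqrt(2*np.pi)

# H_0, ..., H_3 from Lemma 9
H = [lambda x: np.ones_like(np.asarray(x, dtype=float)),
     lambda x: np.asarray(x, dtype=float),
     lambda x: np.asarray(x, dtype=float)**2 - 1.0,
     lambda x: np.asarray(x, dtype=float)**3 - 3.0*np.asarray(x, dtype=float)]

a, b = -0.7, 1.3
x = np.linspace(a, b, 200_001)
h = x[1] - x[0]
diffs = []
for n in (1, 2, 3):
    f = H[n](x)*phi_pdf(x)
    lhs = h*(f.sum() - 0.5*(f[0] + f[-1]))                 # trapezoid rule
    rhs = float(H[n-1](a)*phi_pdf(a) - H[n-1](b)*phi_pdf(b))
    diffs.append(abs(lhs - rhs))
print(max(diffs) < 1e-8)
```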

Lemma 13:

∫_(a1)^(b1) ∫_(a2)^(b2) ··· ∫_(aK)^(bK) H(n1)(y1)H(n2)(y2)···H(nK)(yK) φ(y1)φ(y2)···φ(yK) dy1 dy2 ··· dyK

= Π(i=1 to K) [Φ(bi) - Φ(ai)] ,   if ni = 0 for all i,

= Π(i=1 to K) [H(ni-1)(ai)φ(ai) - H(ni-1)(bi)φ(bi)] ,   if ni ≥ 1 for all i,

= Π(i=1 to m) [Φ(bi) - Φ(ai)] · Π(i=m+1 to K) [H(ni-1)(ai)φ(ai) - H(ni-1)(bi)φ(bi)] ,   otherwise,

where n1, n2, ..., nK are reordered so that ni = 0 for 1 ≤ i ≤ m and

ni ≥ 1 for m+1 ≤ i ≤ K.

Proof: Break up the integration as follows:

∫_(a1)^(b1) H(n1)(y1)φ(y1)dy1 · ∫_(a2)^(b2) H(n2)(y2)φ(y2)dy2 ··· ∫_(aK)^(bK) H(nK)(yK)φ(yK)dyK .

The result now follows directly from Lemma 12. []

We now return to the problem of evaluating the multiple integral in

(3.3.22). Let

yi = (xi - μi)/√vii ,

where V = (vij), 1 ≤ i ≤ 4, 1 ≤ j ≤ 4. Then y = (y1, y2, y3, y4)' is multivariate

normal with mean 0 and dispersion matrix R = (Cij), where Cii = 1 and

Cij = vij/√(vii·vjj) .

The asymptotic power of the max-χ² test for the case (J=2, K=4) may

now be written:

P{T(m) > C(m)} = 1 - ∫_(a1)^(b1) ∫_(a2)^(b2) ∫_(a3)^(b3) ∫_(a4)^(b4) f(y: 0, R) dy1 dy2 dy3 dy4 ,   (3.3.28)

where

ai = (-√C(m) - √n ρi)/√vii ,   bi = (√C(m) - √n ρi)/√vii .

The Taylor-series expansion of f(y: 0, R) is used to obtain an approximation

of (3.3.28).

Theorem 9 (Taylor-series approximation): A Taylor-series approxima-

tion to the asymptotic power of the max-χ² test for the case (J=2, K=4)

is obtained by computing

1 - { [2Φ(b1)-1][Φ(b2)-Φ(a2)][2Φ(b3)-1][Φ(b4)-Φ(a4)]   (3.3.29)

+ C24[2Φ(b1)-1][2Φ(b3)-1][φ(a2)-φ(b2)][φ(a4)-φ(b4)]

+ (1/2!){ C13²[Φ(b2)-Φ(a2)][Φ(b4)-Φ(a4)][4b1φ(b1)b3φ(b3)]

+ C24²[2Φ(b1)-1][2Φ(b3)-1][a2φ(a2)-b2φ(b2)][a4φ(a4)-b4φ(b4)] }

+ (1/3!){ C24³[2Φ(b1)-1][2Φ(b3)-1][(a2²-1)φ(a2)-(b2²-1)φ(b2)][(a4²-1)φ(a4)-(b4²-1)φ(b4)]

+ 3C13²C24[4b1φ(b1)b3φ(b3)][φ(a2)-φ(b2)][φ(a4)-φ(b4)] }

+ (1/4!){ C13⁴[Φ(b2)-Φ(a2)][Φ(b4)-Φ(a4)][4(b1³-3b1)φ(b1)(b3³-3b3)φ(b3)]

+ C24⁴[2Φ(b1)-1][2Φ(b3)-1][(a2³-3a2)φ(a2)-(b2³-3b2)φ(b2)][(a4³-3a4)φ(a4)-(b4³-3b4)φ(b4)]

+ 6C13²C24²[4b1φ(b1)b3φ(b3)][a2φ(a2)-b2φ(b2)][a4φ(a4)-b4φ(b4)] }

+ (1/5!){ C24⁵[2Φ(b1)-1][2Φ(b3)-1][(a2⁴-6a2²+3)φ(a2)-(b2⁴-6b2²+3)φ(b2)][(a4⁴-6a4²+3)φ(a4)-(b4⁴-6b4²+3)φ(b4)]

+ 5C13⁴C24[4(b1³-3b1)φ(b1)(b3³-3b3)φ(b3)][φ(a2)-φ(b2)][φ(a4)-φ(b4)]

+ 10C13²C24³[4b1φ(b1)b3φ(b3)][(a2²-1)φ(a2)-(b2²-1)φ(b2)][(a4²-1)φ(a4)-(b4²-1)φ(b4)] }

+ (1/6!){ C13⁶[Φ(b2)-Φ(a2)][Φ(b4)-Φ(a4)][4(b1⁵-10b1³+15b1)φ(b1)(b3⁵-10b3³+15b3)φ(b3)]

+ 15C13⁴C24²[4(b1³-3b1)φ(b1)(b3³-3b3)φ(b3)][a2φ(a2)-b2φ(b2)][a4φ(a4)-b4φ(b4)]

+ 15C13²C24⁴[4b1φ(b1)b3φ(b3)][(a2³-3a2)φ(a2)-(b2³-3b2)φ(b2)][(a4³-3a4)φ(a4)-(b4³-3b4)φ(b4)]

+ C24⁶[2Φ(b1)-1][2Φ(b3)-1][(a2⁵-10a2³+15a2)φ(a2)-(b2⁵-10b2³+15b2)φ(b2)][(a4⁵-10a4³+15a4)φ(a4)-(b4⁵-10b4³+15b4)φ(b4)] } } ,

where

φ(x) = (2π)^(-1/2) e^(-x²/2) ,   Φ(x) = ∫_(-∞)^x φ(z)dz ,

C13 = (2φ+3φ²-φ⁴)/[(1+φ)√(1+φ²+4φ³-2φ⁵)] ,   C24 = 2φ√(1-φ²)/√(1+2φ²-3φ⁴) ,

b1 = -a1 = √(C(m)(1-φ²))/(1+φ) ,

a2 = (-√C(m) - √n φ)/√(1-φ²) ,   b2 = (√C(m) - √n φ)/√(1-φ²) ,

b3 = -a3 = √(C(m)(1-φ²)/(1+φ²+4φ³-2φ⁵)) ,

a4 = (-√C(m) - √n φ²)/√(1+2φ²-3φ⁴) ,   b4 = (√C(m) - √n φ²)/√(1+2φ²-3φ⁴) .

Proof: From earlier results (3.3.17),

v11 = (1+φ)²/(1-φ²) ,

v22 = 1-φ² ,

v33 = (1+φ²+4φ³-2φ⁵)/(1-φ²) ,

v44 = 1+2φ²-3φ⁴ ,

v13 = (2φ+3φ²-φ⁴)/(1-φ²) ,

v24 = 2φ(1-φ²) ,

v12 = v14 = v23 = v34 = 0 ;

hence, the elements of R are written,

C11 = C22 = C33 = C44 = 1 ,

C12 = C14 = C23 = C34 = 0 ,

C13 = (2φ+3φ²-φ⁴)/[(1+φ)√(1+φ²+4φ³-2φ⁵)] ,

C24 = 2φ√(1-φ²)/√(1+2φ²-3φ⁴) .

Now ρ1 = 0, ρ3 = 0, ρ2 = φ, and ρ4 = φ² imply that

a1 = -√(C(m)(1-φ²))/(1+φ) ,   b1 = √(C(m)(1-φ²))/(1+φ) ,

a2 = (-√C(m) - √n φ)/√(1-φ²) ,   b2 = (√C(m) - √n φ)/√(1-φ²) ,

a3 = -√(C(m)(1-φ²)/(1+φ²+4φ³-2φ⁵)) ,   b3 = √(C(m)(1-φ²)/(1+φ²+4φ³-2φ⁵)) ,

a4 = (-√C(m) - √n φ²)/√(1+2φ²-3φ⁴) ,   b4 = (√C(m) - √n φ²)/√(1+2φ²-3φ⁴)

in the power expression (3.3.28). Put C12 = C14 = C23 = C34 = 0 in equation

(3.3.27) to obtain the Taylor-series expansion:

f(y: 0, R) = { 1 + [C13H1(y1)H1(y3) + C24H1(y2)H1(y4)]

+ (1/2!)[C13²H2(y1)H2(y3) + C24²H2(y2)H2(y4) + 2C13C24H1(y1)H1(y2)H1(y3)H1(y4)]

+ (1/3!)[C13³H3(y1)H3(y3) + C24³H3(y2)H3(y4) + 3C13²C24H2(y1)H1(y2)H2(y3)H1(y4)

+ 3C13C24²H1(y1)H2(y2)H1(y3)H2(y4)]

+ (1/4!)[C13⁴H4(y1)H4(y3) + C24⁴H4(y2)H4(y4) + 4C13³C24H3(y1)H1(y2)H3(y3)H1(y4)

+ 6C13²C24²H2(y1)H2(y2)H2(y3)H2(y4) + 4C13C24³H1(y1)H3(y2)H1(y3)H3(y4)]

+ (1/5!)[C13⁵H5(y1)H5(y3) + C24⁵H5(y2)H5(y4) + 5C13⁴C24H4(y1)H1(y2)H4(y3)H1(y4)

+ 10C13³C24²H3(y1)H2(y2)H3(y3)H2(y4) + 10C13²C24³H2(y1)H3(y2)H2(y3)H3(y4)

+ 5C13C24⁴H1(y1)H4(y2)H1(y3)H4(y4)]

+ (1/6!)[C13⁶H6(y1)H6(y3) + C24⁶H6(y2)H6(y4) + 6C13⁵C24H5(y1)H1(y2)H5(y3)H1(y4)

+ 15C13⁴C24²H4(y1)H2(y2)H4(y3)H2(y4) + 20C13³C24³H3(y1)H3(y2)H3(y3)H3(y4)

+ 15C13²C24⁴H2(y1)H4(y2)H2(y3)H4(y4) + 6C13C24⁵H1(y1)H5(y2)H1(y3)H5(y4)]

+ ··· } f(y: 0, I) .   (3.3.30)

Before integrating (3.3.30), notice that φ(x) is an even function, i.e.,

φ(x) = φ(-x). Also, Hn(x) is an even function when n is even, and is an

odd function when n is odd, i.e.,

Hn(-x) = Hn(x) ,   n even;   Hn(-x) = -Hn(x) ,   n odd.

Since a1 = -b1 and a3 = -b3, we have from Lemma 12, for i = 1, 3,

∫_(ai)^(bi) Hn(x)φ(x)dx = H(n-1)(ai)φ(ai) - H(n-1)(bi)φ(bi)   (3.3.31)

= H(n-1)(-bi)φ(-bi) - H(n-1)(bi)φ(bi)

= -2H(n-1)(bi)φ(bi) ,   if n is even (n-1 odd),

= 0 ,   if n is odd (n-1 even).

Now integrate (3.3.30) using (3.3.31), Lemma 9 and Lemma 13 to

obtain the desired result. []
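The mechanics of the approximation are easiest to see for K = 2, where the expansion involves a single correlation C12 and the rectangle probability becomes Σ_t (C12^t/t!)·I_t(a1,b1)·I_t(a2,b2), with I_t the interval integrals of Lemma 12. The sketch below (all names and the test rectangle are mine) compares a truncated series with a brute-force two-dimensional integration of the exact bivariate normal density; the value recurrence used for the Hermite polynomials is equivalent to (3.3.26).

```python
import numpy as np
from math import erf, factorial, sqrt, pi, exp

def Phi(x): return 0.5*(1.0 + erf(x/sqrt(2.0)))
def phi_pdf(x): return exp(-x*x/2.0)/sqrt(2.0*pi)

def H(n, x):
    # Hermite values via the three-term recurrence H_{k+1} = x*H_k - k*H_{k-1}
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x*h1 - k*h0
    return h1

def I(n, a, b):
    # Lemma 12: integral of H_n(y)phi(y) over [a, b]
    if n == 0:
        return Phi(b) - Phi(a)
    return H(n-1, a)*phi_pdf(a) - H(n-1, b)*phi_pdf(b)

def rect_prob_series(C, a1, b1, a2, b2, order=10):
    # K = 2 Taylor-series approximation of P{a1<y1<b1, a2<y2<b2}
    return sum(C**t/factorial(t)*I(t, a1, b1)*I(t, a2, b2)
               for t in range(order + 1))

# brute-force two-dimensional integration of the exact bivariate density
C, a1, b1, a2, b2 = 0.4, -1.0, 1.5, -0.5, 2.0
y1 = np.linspace(a1, b1, 1001)[:, None]
y2 = np.linspace(a2, b2, 1001)[None, :]
dens = np.exp(-(y1**2 - 2*C*y1*y2 + y2**2)/(2*(1 - C**2)))/(2*pi*sqrt(1 - C**2))
dx, dy = (b1 - a1)/1000, (b2 - a2)/1000
col = (dens[:, :-1] + dens[:, 1:]).sum(axis=1)*dy/2.0
exact = float((col[:-1] + col[1:]).sum()*dx/2.0)
print(round(rect_prob_series(C, a1, b1, a2, b2), 4), round(exact, 4))
```

The two values agree to a few decimal places at this truncation order, which is the behavior the K = 4 expression (3.3.29) relies on.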

In Chapter 4, each of the approximate methods outlined in this section

will be applied to models similar to (3.3.4), and numerical powers of the

two tests compared. Also, a study designed in part to determine the

accuracy of these approximations is conducted. This is accomplished by

comparing the approximate power of the test to the power generated from a

Monte-Carlo simulation.



In Sections 3.1 and 3.2 we presented two testing procedures, the

Box-Pierce "portmanteau" and max-X2 tests, designed to detect serial

correlation of the errors in the presence of a general alternative

error structure. In Chapter 4, an extensive power simulation study

is undertaken. Our goal is three-fold. First, we wish to substantiate

the conjecture that the Box-Pierce and max-χ² tests attain high power

in the general case. Secondly, we investigate the specific types of

alternative error models for which one test outperforms the other.

Finally, the accuracy of the asymptotic power approximations (Section 3.3)

is determined by comparing the simulated power to the approximate power.

4.1 Monte Carlo Simulations

Recall the stationary AR model of order p with lag coefficients

(φ1, φ2, ..., φp) given by equation (3.3.1). In order to obtain the

simulated powers of the tests via the Monte Carlo method, a technique

for generating the sequence of observations Z1, Z2, ..., Zn needs to

be developed. We adopt a procedure in which the first p variates Z1,

Z2, ..., Zp are generated so that they possess the desired covariance

structure of the general alternative error model under consideration.

The remaining n-p variates Z(p+1), Z(p+2), ..., Zn are then calculated

recursively from equation (3.3.1). However, the formation of the

covariance structure first entails computing the vector of true autocorrelations,

ρ. Given the lag coefficients (φ1, φ2, ..., φp) of the model, the

vector ρ is easily obtained. The relationship between ρ and the lag

coefficients is given in the following lemma.
Lemma 14: Consider the stationary pth-order AR model

Zt = φ1·Z(t-1) + φ2·Z(t-2) + ··· + φp·Z(t-p) + εt ,   (4.1.1)

where {εt} is an uncorrelated series with mean 0 and variance σ², and

the vector of lag coefficients φ = (φ1, φ2, ..., φp)' is known. Let

γ and ρ represent the vectors of true autocovariances and true auto-

correlations, respectively, where γv = E{Zt·Z(t+v)}, ρv = γv/γ0, v = 0, 1, 2, ... .

Then Aρ = φ, where the elements of the matrix A are given by:

aii = 1 - φ(2i) ,   if 2i ≤ p;   aii = 1 ,   if 2i > p;

aij = -φ(i+j) ,   if i+j ≤ p, i < j;   aij = 0 ,   if i+j > p, i < j;

aij = -(φ(i-j) + φ(i+j)) ,   if i+j ≤ p, i > j;   aij = -φ(i-j) ,   if i+j > p, i > j.

Proof: Multiply Zt in model (4.1.1) by Z(t+v), v = 1, 2, 3, ..., p, and

take expectations to obtain the true autocovariances:

γ1 = φ1γ0 + φ2γ1 + φ3γ2 + φ4γ3 + φ5γ4 + ··· + φp·γ(p-1)

γ2 = φ1γ1 + φ2γ0 + φ3γ1 + φ4γ2 + φ5γ3 + ··· + φp·γ(p-2)

γ3 = φ1γ2 + φ2γ1 + φ3γ0 + φ4γ1 + φ5γ2 + ··· + φp·γ(p-3)

···

γp = φ1·γ(p-1) + φ2·γ(p-2) + φ3·γ(p-3) + φ4·γ(p-4) + φ5·γ(p-5) + ··· + φp·γ0 .

Now utilize ρv = γv/γ0 to obtain expressions for the true autocorrelations:

ρ1 = φ1 + φ2ρ1 + φ3ρ2 + φ4ρ3 + φ5ρ4 + ··· + φp·ρ(p-1)

ρ2 = φ1ρ1 + φ2 + φ3ρ1 + φ4ρ2 + φ5ρ3 + ··· + φp·ρ(p-2)

···

ρp = φ1·ρ(p-1) + φ2·ρ(p-2) + φ3·ρ(p-3) + φ4·ρ(p-4) + φ5·ρ(p-5) + ··· + φp .

Solving the above equations for the lag parameters φ1, φ2, φ3, ..., φp gives:

φ1 = (1-φ2)ρ1 - φ3ρ2 - φ4ρ3 - φ5ρ4 - ··· - φp·ρ(p-1)

φ2 = -(φ1+φ3)ρ1 + (1-φ4)ρ2 - φ5ρ3 - ··· - φp·ρ(p-2)

···

φp = -φ(p-1)·ρ1 - φ(p-2)·ρ2 - φ(p-3)·ρ3 - ··· - φ1·ρ(p-1) + ρp ,

or in matrix form:

[ (1-φ2)     -φ3      -φ4   ···   -φp    0   ···   0
  -(φ1+φ3)   (1-φ4)   -φ5   ···                    
  ···
  -φ(p-1)    -φ(p-2)  -φ(p-3)  ···  -φ2  -φ1   1 ] ρ = φ .

To see this more clearly, consider the case p=4. Expressions for the

first four true autocorrelations are given by:

ρ1 = φ1 + φ2ρ1 + φ3ρ2 + φ4ρ3

ρ2 = φ1ρ1 + φ2 + φ3ρ1 + φ4ρ2

ρ3 = φ1ρ2 + φ2ρ1 + φ3 + φ4ρ1

ρ4 = φ1ρ3 + φ2ρ2 + φ3ρ1 + φ4 .

Now solve each of the above equations, respectively, for φj, j = 1, 2,

3, 4 to obtain:

φ1 = (1-φ2)ρ1 - φ3ρ2 - φ4ρ3

φ2 = -(φ1+φ3)ρ1 + (1-φ4)ρ2

φ3 = -(φ2+φ4)ρ1 - φ1ρ2 + ρ3

φ4 = -φ3ρ1 - φ2ρ2 - φ1ρ3 + ρ4 .

In matrix form, we have:

[ 1-φ2       -φ3   -φ4   0
  -(φ1+φ3)   1-φ4   0    0
  -(φ2+φ4)   -φ1    1    0
  -φ3        -φ2   -φ1   1 ] ρ = φ ,

which agrees with the lemma result. []
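Lemma 14 reduces nicely to code. The sketch below (function name is mine) builds A for general p, exploiting the fact that a coefficient φk vanishes when k falls outside 1, ..., p, which covers all of the lemma's cases at once; it is checked against the familiar AR(2) Yule-Walker solution ρ1 = φ1/(1-φ2), ρ2 = φ2 + φ1ρ1.

```python
import numpy as np

def A_matrix(phi):
    # matrix A of Lemma 14, so that A @ rho = phi
    p = len(phi)
    f = lambda k: phi[k-1] if 1 <= k <= p else 0.0   # phi_k, zero off-range
    A = np.empty((p, p))
    for i in range(1, p + 1):
        for j in range(1, p + 1):
            if i == j:
                A[i-1, j-1] = 1.0 - f(2*i)
            else:
                # f() vanishing outside 1..p reproduces every off-diagonal case
                A[i-1, j-1] = -f(i - j) - f(i + j)
    return A

# AR(2) check
phi = [0.5, 0.3]
rho = np.linalg.solve(A_matrix(phi), np.array(phi))
print(np.round(rho, 6))  # rho1 = phi1/(1-phi2), rho2 = phi2 + phi1*rho1
```

For p = 4 the routine reproduces the displayed matrix entry for entry.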

The simulation technique used to generate the powers of the tests

in Sections 4.1.1 and 4.1.2 is outlined in Table 4.1.

In Section 4.1.1 we examine the powers of the Box-Pierce and max-χ²

tests when the vector of sample autocorrelations r is known, i.e., when

the random errors {Zt} are observable. Of course, in practice the {Zt}

are unobservable. We have indicated (Section 3.3.2) that the researcher

must instead calculate the vector of estimated sample autocorrelations, r̂,

using the residuals {Ẑt} computed from least squares regression. Power

comparisons of the Box-Pierce and max-χ² tests in this more practical case

are discussed in Section 4.1.2. These power comparisons also include the

Durbin-Watson d test, which was designed to test regression residuals.

4.1.1 Observable Residuals: No Regression

In this section we present a power simulation study assuming that

the residuals {Zt} from model (4.1.1) are observable. The objective is

to detect power trends which may extend to the more practical instance

in which powers are calculated using the residuals {Ẑt} computed from

least squares regression.


TABLE 4.1

ERROR MODEL: Zt = φ1·Z(t-1) + φ2·Z(t-2) + ··· + φp·Z(t-p) + εt ,

where {εt} is white noise.

STEP 1: Utilizing Lemma 14, compute the vector of true autocorrelations

ρ = A⁻¹φ ,

where ρ = (ρ1, ρ2, ..., ρp)', φ = (φ1, φ2, ..., φp)', and the elements

of A are given by:

aii = 1 - φ(2i) if 2i ≤ p;  1 if 2i > p;

aij = -φ(i+j) if i+j ≤ p, i < j;  0 if i+j > p, i < j;

aij = -(φ(i-j)+φ(i+j)) if i+j ≤ p, i > j;  -φ(i-j) if i+j > p, i > j.

STEP 2: Generate the covariance structure, Γ, of the first p variates

(Z1, Z2, ..., Zp) as follows:

Γ = [ γ0       γ1       γ2       ···  γ(p-1)
      γ1       γ0       γ1       ···  γ(p-2)
      ···
      γ(p-1)   γ(p-2)   γ(p-3)   ···  γ0     ] ,

where γv = E{Zt·Z(t+v)} = ρv·γ0, v = 1, 2, 3, ..., p, and

γ0 = φ1γ1 + φ2γ2 + ··· + φpγp + σ² = (φ1ρ1 + φ2ρ2 + ··· + φpρp)γ0 + σ² ,

so that

γ0 = σ²/[1 - (φ1ρ1 + φ2ρ2 + ··· + φpρp)] .

Without loss of generality, take E(εt²) = σ² = 1.

STEP 3: Write Γ = TT', where T is a p×p lower triangular matrix. Solve

for the elements of T using a computer routine.

STEP 4: Generate the first p random variates Z1, Z2, ..., Zp by

computing Zp = T·ep, where Zp = (Z1, Z2, ..., Zp)' and where ep =

(ε1, ε2, ..., εp)' is a vector of independent normal random variables

with mean 0 and variance 1. (Many computer packages have routines which

generate N(0,1) random variates.) Note that Cov{Zp} = E{Zp·Zp'} =

T·E{ep·ep'}·T' = TT' = Γ, and thus the first p random variates generated

have the desired covariance structure.

STEP 5: Compute the remaining n-p variates, Z(p+1), Z(p+2), ..., Zn, from

the recursive relations given by model (4.1.1), i.e.,

Z(p+1) = φ1Zp + φ2Z(p-1) + ··· + φpZ1 + ε(p+1)

Z(p+2) = φ1Z(p+1) + φ2Zp + ··· + φpZ2 + ε(p+2)

···

Zn = φ1Z(n-1) + φ2Z(n-2) + ··· + φpZ(n-p) + εn ,

where ε(p+1), ..., εn are distributed multivariate normal with mean 0 and

covariance I.

STEP 6: Use the generated random errors Z1, Z2, ..., Zn to calculate

the first K sample autocorrelations r1, r2, ..., rK, where

rj = [ Σ(t=j+1 to n) Zt·Z(t-j) ] / [ Σ(t=1 to n) Zt² ] ,   j = 1, 2, ..., K.

STEP 7: Compute the test statistics, T(BP) = n·Σ(j=1 to K) rj² and

T(m) = n·max(1≤j≤K) rj². Determine whether the calculated values of the

test statistics fall in their respective rejection regions:

T(BP) > χ²_K(1-α) ,

T(m) > χ²_1((1-α)^(1/K)) ,

where χ²_v(q) denotes the qth quantile of the central χ² distribution

with v degrees of freedom.

STEP 8: Repeat steps 1-7 N times, where we selected N = 1000.

STEP 9: Compute the simulated power for each test as follows:

power = (number of rejections)/1000 .
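The procedure of Table 4.1 can be sketched in miniature. This is my own condensed version, not the dissertation's program: for the single-lag model Zt = φZ(t-J) + εt, the J interleaved subseries are independent AR(1) chains, so an exact stationary start Z1, ..., ZJ ~ iid N(0, 1/(1-φ²)) replaces the general Cholesky factorization of Steps 2-4, and the critical values are taken as empirical null quantiles rather than χ² table values.

```python
import numpy as np

def gen_series(phi, J, n, rng):
    # Z_t = phi*Z_{t-J} + e_t with an exact stationary start for the
    # special single-lag model (shortcut for Steps 2-5 of Table 4.1)
    Z = np.empty(n)
    Z[:J] = rng.standard_normal(J)/np.sqrt(1.0 - phi**2)
    e = rng.standard_normal(n)
    for t in range(J, n):
        Z[t] = phi*Z[t-J] + e[t]
    return Z

def autocorrs(Z, K):
    # Step 6: first K sample autocorrelations
    denom = Z @ Z
    return np.array([Z[j:] @ Z[:-j] for j in range(1, K + 1)])/denom

def sim_stats(phi, J, K, n, reps, rng):
    # Step 7 statistics over `reps` replications
    TBP, Tm = np.empty(reps), np.empty(reps)
    for i in range(reps):
        r = autocorrs(gen_series(phi, J, n, rng), K)
        TBP[i], Tm[i] = n*np.sum(r**2), n*np.max(r**2)
    return TBP, Tm

rng = np.random.default_rng(7)
n, J, K, reps, alpha = 100, 2, 4, 2000, 0.05
TBP0, Tm0 = sim_stats(0.0, J, K, n, reps, rng)        # null reference run
cBP, cm = np.quantile(TBP0, 1 - alpha), np.quantile(Tm0, 1 - alpha)
TBP1, Tm1 = sim_stats(0.5, J, K, n, reps, rng)        # alternative phi = .5
print(round((TBP1 > cBP).mean(), 3), round((Tm1 > cm).mean(), 3))
```

With φ = .5, n = 100, both simulated powers come out close to 1, in line with the trends reported in the figures below, and the empirical Box-Pierce critical value lands near the χ²(4) value 9.49 that Step 7 prescribes.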

A list of the controllable parameters in the study is given below:

n = number of residuals generated

α = probability of a Type I error

K = number of sample autocorrelations included in the Box-Pierce

and max-χ² test statistics

p = order of the autoregressive model

(φ1, φ2, ..., φp) = values of the lag coefficients in model (4.1.1)

m = number of nonzero lag coefficients in model (4.1.1)

(j1, j2, ..., jm) = lags associated with nonzero lag coefficients.

Of course, there exists an infinite number of parameter combinations. For

our study, we have attempted to select those model parameters which are

intuitively appealing to the time-series analyst and/or econometrician,

i.e., those which also have practical applications. Our basic error model

takes the form of equation (3.3.4). The two versions of this model used

in the study are: (1) the first-order AR process with lag J,

Zt = φZ(t-J) + εt ,   (4.1.2)

and (2) the second-order AR process with lags J1 and J2,

Zt = φ(J1)·Z(t-J1) + φ(J2)·Z(t-J2) + εt .
The parameter values of each model are given in Table 4.2 and Table 4.3,

respectively. In order to present the results of the study in a clear

fashion, we do not give tables of the numerical power simulations in this

section. Instead, we sketch power curves for various cases which are

typical of those considered and which also best emphasize the main results.

Tables of all numerical results obtained in the study are included in the

Appendix which follows the final chapter. However, when summarizing the

simulation results, we will at times refer the reader to these tables.



TABLE 4.2

ERROR MODEL 1: Zt = φZ(t-J) + εt

[Parameter combinations {(J,K), φ, n, α} for error model 1; the individual

table entries are not recoverable from the scan.]

*NOTE: Powers are generated for all of the above combinations of

{(J,K), φ, n, α} with the exception of the case {(5,4), φ, 200, .05}.

TABLE 4.3

ERROR MODEL 2: Zt = φ(J1)·Z(t-J1) + φ(J2)·Z(t-J2) + εt

(J1,J2;K)     (φ(J1), φ(J2))     n       α

(1,2;2)       (-.5, .3)          50      .05
(1,2;12)      (.5, .3)           200
              (1.0, -.6)
              (-.3, .1)
              (.7, .3)

(1,3;4)       (.1, -.9)          50      .05
              (.1, .7)           200
              (-1.0, .1)

[Figures 4.1 - 4.5: simulated power curves of the Box-Pierce test T(BP)

and the max-χ² test T(M), plotted against the lag parameter φ

(-.9 ≤ φ ≤ .9) for sample sizes n = 50 and n = 200; the final figure

displays the case J=5, K=4. The plotted values are tabulated in the

Appendix.]

We first discuss the power simulation results for error model 1

(see Tables A-1 - A-9 in the Appendix). Note that for n=50, Figures 4.1 -

4.4 show that moderate and large values of the lag parameter (in absolute

value), i.e., .5 ≤ |φ| ≤ 1, lead to a high rate of rejection of the null

hypothesis of white noise for both the Box-Pierce and max-χ² tests. When

the sample size is increased to 200, high rejection rates result at values

of the lag parameter of .3 or larger (in absolute value). Except for the

case J=5, K=4 (Figure 4.5), these results hold for all other cases con-

sidered. For these models, then, this simulation study clearly supports

our supposition that both tests attain high power under a general alternative.
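
The two statistics compared in the study can be sketched as follows, taking the Box-Pierce statistic to be Q = n Σ r_k² (referred to chi-squared with K degrees of freedom under the null) and reading the max-χ² statistic as the largest of the individual n·r_k² terms; the helper names are ours, and the max-χ² form is our interpretation of the earlier development rather than a quotation of it.

```python
import numpy as np

def sample_autocorrs(z, K):
    """First K sample autocorrelations r_1,...,r_K of a mean-corrected series."""
    z = np.asarray(z, dtype=float) - np.mean(z)
    c0 = np.dot(z, z)
    return np.array([np.dot(z[k:], z[:-k]) / c0 for k in range(1, K + 1)])

def box_pierce(z, K):
    """Box-Pierce statistic Q = n * sum_{k=1}^K r_k^2."""
    r = sample_autocorrs(z, K)
    return len(z) * np.sum(r ** 2)

def max_chi2(z, K):
    """Largest of the n * r_k^2 terms (our reading of the max-chi-squared statistic)."""
    r = sample_autocorrs(z, K)
    return len(z) * np.max(r ** 2)
```

Since every n·r_k² term is nonnegative, the maximum can never exceed the sum, so max_chi2(z, K) ≤ box_pierce(z, K) for any series.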

Compare now Figures 4.1 and 4.3, i.e., the cases J=2, K=4 and J=2, K=12.

Notice that the powers of the tests generally decrease as K, the number of

autocorrelations included in the test statistic, is increased from 4 to 12.

In all other cases considered as well, for a fixed lag J, the powers of the

tests decreased as K increased. In contrast, if the researcher unknowingly

selects K less than the smallest nonzero lag in the model (J in our model),

the powers of the tests are reduced dramatically, as demonstrated in Figure

4.5 for the case J=5, K=4. For moderate and small values of the lag para-

meter, the powers of the tests are less than .10. Thus, the test user's

choice of K is a delicate one. The researcher who conservatively selects

a K much larger than the largest nonzero lag in the general alternative

error model will more frequently fail to reject the null hypothesis of

white noise when serial correlation exists. Alternatively, if K is chosen

smaller than the lowest lag in the hypothesized error model, the tests

have very low power.
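
A power simulation of the kind summarized here reduces to a Monte Carlo loop: generate the lag-J error model repeatedly and record how often the test rejects. The sketch below estimates the Box-Pierce rejection rate; the function name, the reps default, and the choice of a hard-coded critical value are ours (9.488 is the upper .05 point of chi-squared with 4 degrees of freedom, so the default crit matches K=4 at α=.05).

```python
import numpy as np

def power_box_pierce(phi, J, K, n, reps=500, crit=9.488, seed=1):
    """Monte Carlo estimate of Box-Pierce power against Z_t = phi*Z_{t-J} + e_t.

    crit should be the upper-alpha chi-squared point with K df;
    9.488 is the .05 point for K=4.  (Illustrative sketch only.)
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        # simulate the lag-J AR error model, discarding a 200-step burn-in
        z = np.zeros(200 + n)
        for t in range(J, 200 + n):
            z[t] = phi * z[t - J] + rng.standard_normal()
        z = z[200:] - np.mean(z[200:])
        c0 = np.dot(z, z)
        r = np.array([np.dot(z[k:], z[:-k]) / c0 for k in range(1, K + 1)])
        if n * np.sum(r ** 2) > crit:  # Q = n * sum r_k^2
            rejections += 1
    return rejections / reps
```

With K chosen below the model's lag (e.g., J=5, K=4), none of the r_k in the loop estimates a nonzero quantity, which is exactly the power collapse the J=5, K=4 case exhibits.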

Comparing the powers of the tests, Figures 4.1a - 4.4a reveal that

the power of the max-χ² procedure dominates the power of the Box-Pierce

test for almost all negative values of the lag parameter, or more

specifically, for φ ≤ -.3. Consider the case J=12, K=24 (Figure 4.4).

Referring to Table A-8 in the Appendix, at φ = -.7 and n=50, the max-χ²

test, with a simulated power of .853, greatly outperforms the Box-Pierce

test (simulated power of .583). When analyzing yearly data, of course,

it is very possible that the regression errors {Z_t} follow an auto-

regressive scheme with nonzero lag 12. Thus, the practical significance

of the result is clear. For small values of the lag parameter, i.e.,

-.3 ≤ φ ≤ .3, the figures show that the Box-Pierce power is generally

larger than the max-χ² power. However, this result seems much weaker

than that previously stated since the powers attained by both tests

for φ in this range are very low, and in some cases near zero. The

apparent trend in the powers of the tests for moderate and large values

of the lag parameter at n=50 can be stated as follows: (a) for .3 ≤ φ ≤ .5,

the powers of the tests are very nearly equivalent; (b) for φ > .5, the

power of the max-χ² test is slightly higher than the Box-Pierce power.

The max-χ² procedure also seems to be less sensitive to an increase in K.

For example, compare Figures 4.1 and 4.3, the cases J=2, K=4 and J=2, K=12.

At φ = .3 and n=200, the Box-Pierce power drops from .947 (Table A-1) to

.837 (Table A-4), while the drop in the power of the max-χ² test is only

from .959 to .925.

Figure 4.5 represents the only case (J=5, K=4) in which the Box-

Pierce test clearly outperforms the max-χ² test. In this instance, we

expect both tests to perform poorly. With J=5, the first four true

autocorrelations ρ1, ρ2, ρ3, ρ4 are zero, and thus each of the first

four sample autocorrelations r1, r2, r3, r4 estimates a quantity known

to be zero. Hence, both the sum and the maximum of the first four sample