ON THE MAXIMUM CHI-SQUARED TEST FOR WHITE NOISE

BY

TERENCE L. SINCICH

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1980

TO MY FAMILY AND FAITH

ACKNOWLEDGEMENTS

I am indebted to Dr. James T. McClave, not only for his patience and encouragement while guiding me through this dissertation, but also for the invaluable friendship he provided during my course of graduate study. For their helpful suggestions towards this research, I wish to thank Dr. John G. Saw, Dr. Mark C. Yang, Dr. Andre Khuri, Dr. Ramon C. Littell and Dr. Andrew Rosalsky. Also, I am most appreciative of Dr. Dennis D. Wackerly and Dr. Richard L. Scheaffer for their understanding and assistance in matters related to my graduate program. Finally, I especially thank my typist, Cecily Noble, for her amazing ability to transform the raw text, bulky equations and lengthy tables handed her into immaculate typed copy.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT

CHAPTERS

I INTRODUCTION
  1.1 The Problem
  1.2 Various Tests
    1.2.1 The Durbin-Watson Bounds Test
    1.2.2 Beta Approximation
    1.2.3 dU Approximation
    1.2.4 BLUS Estimators
    1.2.5 The Sign Test
    1.2.6 Periodogram-Based Tests
  1.3 Concluding Note

II COMPARING TESTS OF SERIAL CORRELATION
  2.1 Power Comparisons
    2.1.1 Exact d Test Versus Approximate Tests of Serial Correlation
    2.1.2 d Test Versus BLUS Test
    2.1.3 The Durbin-Watson d Test Versus Non-First Order Alternative Tests
  2.2 The Alternative Error Model

III TWO TESTS DESIGNED TO DETECT A GENERAL ALTERNATIVE
  3.1 Box and Pierce 'Portmanteau' Test
  3.2 A New Technique: The Max-χ² Procedure
  3.3 Asymptotic Distribution of the Test Statistics
    3.3.1 Preliminaries
    3.3.2 General Case
    3.3.3 Special Cases
      3.3.3.1 Model 1: the Case J=2, K=2
      3.3.3.2 Model 2: the Case J=2, K=4
        3.3.3.2.1 Approximate Asymptotic Power of the Box-Pierce Test
        3.3.3.2.2 Approximate Asymptotic Power of the Max-χ² Test

IV A POWER COMPARISON OF THE BOX-PIERCE 'PORTMANTEAU', MAX-χ², AND D TESTS
  4.1 Monte Carlo Simulations
    4.1.1 Observable Residuals: No Regression
    4.1.2 Estimated Residuals After Regression
  4.2 Power Approximations for Large n
    4.2.1 Case: J=2, K=4
    4.2.2 Case: J=3, K=4
    4.2.3 Case: J=4, K=4
    4.2.4 Approximation Results
    4.2.5 A Note on the Taylor-Series Approximation
  4.3 Summary
V OTHER PROPERTIES OF THE BOX-PIERCE AND MAX-χ² TESTS
  5.1 Asymptotic Powers for Large n
  5.2 Hodges and Lehmann Asymptotic Relative Efficiency
  5.3 Likelihood Ratio
    5.3.1 General Case
    5.3.2 Special Case: J=5, K=2
    5.3.3 Special Case: J=K

VI CONCLUSION
  6.1 Concluding Remarks
  6.2 Future Research

APPENDIX
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ON THE MAXIMUM CHI-SQUARED TEST FOR WHITE NOISE

By Terence L. Sincich
August, 1980
Chairman: James T. McClave
Major Department: Statistics

Consider the general regression model

Y_t = β_0 + β_1 X_1t + β_2 X_2t + ... + β_g X_gt + Z_t,   t = 1, 2, ..., n,

where the dependent variable Y_t and the independent variables X_1t, X_2t, ..., X_gt are observed and the vector of model parameters β' = (β_0, β_1, ..., β_g) is unknown. The usual assumption on the unobservable errors {Z_t} is that Z_1, Z_2, ..., Z_n are independent with zero mean and constant variance. When this assumption is violated, the application of the least squares regression method will often lead to erroneous inferences on the elements of β. In order to avoid this problem, a test for independence of the errors is essential. Many testing procedures have been proposed for this problem, and a summary of the most widely used tests is given.
However, these tests attain high power only when the regression errors follow the first order autoregressive process

Z_t = φZ_{t-1} + ε_t,

where |φ| < 1 and {ε_t} is white noise. We consider two tests, based on the vector of sample autocorrelations, which are designed to have high power against a more general autoregressive alternative: the Box-Pierce test and a new procedure called the max-χ² test. An extensive Monte Carlo simulation study is undertaken in order to compare the powers of the tests under the first order lag J autoregressive alternative

Z_t = φZ_{t-J} + ε_t,

where J is not necessarily 1. This particular model is chosen because it is one which is near white noise, and hence one for which many testing procedures will fail to detect the presence of serially correlated errors. A graphical display of the simulated powers of the tests shows that although both tests attain high power in this instance, the max-χ² test clearly outperforms the Box-Pierce procedure.

The two tests are also compared to the Durbin-Watson d test for several least squares regression models. Since the d test is optimal when J=1 in the alternative error model, it is recommended as the initial test to apply in practice. However, the d test performs poorly for J>1 and should be supplemented by another test which has high power in this case. Power simulation results indicate that the max-χ² test would be the better supplementary test.

Since the problem of analytically deriving the exact powers of the tests is intractable even for simple alternative error models, asymptotic powers are considered. Approximations to the asymptotic powers of the Box-Pierce and max-χ² tests are discussed and their performance evaluated. Comparisons of the Box-Pierce and max-χ² tests are also made with respect to Hodges-Lehmann efficiency and deficiency and with respect to likelihood ratio.
Although the asymptotic relative efficiency of the two tests is 1, the Box-Pierce test is shown to be asymptotically deficient when compared to the max-χ² test. A derivation of the likelihood ratio statistic for special cases of the problem reveals that the max-χ² test is asymptotically equivalent to the likelihood ratio test.

CHAPTER I
INTRODUCTION

1.1 The Problem

Consider the general regression model

Y_t = β_0 + β_1 X_1t + β_2 X_2t + ... + β_g X_gt + Z_t,

where the dependent variable, Y_t, and the independent variables, X_1t, ..., X_gt, are observed and the random error, Z_t, is unobserved, t = 1, 2, ..., n. In most practical cases, the vector of model parameters, β' = (β_0, β_1, ..., β_g), will be unknown. The researcher who is interested in estimating the parameters for prediction purposes usually applies the least squares regression method. The least squares model can be written as follows:

Ŷ_t = β̂_0 + β̂_1 X_1t + β̂_2 X_2t + ... + β̂_g X_gt,   (1.1.1)

where β̂' = (β̂_0, β̂_1, ..., β̂_g) is the vector of parameter estimates and Ŷ_t is the predicted value of the dependent variable for the particular set of independent variables (X_1t, X_2t, ..., X_gt) observed.

Before utilizing the least squares prediction model, the researcher often wishes to make inferences about the components of β. To ensure the validity of these inferences, the following assumptions must be made:

(1) The error terms (Z_t's) have zero mean and equal variance, i.e., Z_t ~ (0, σ²).
(2) If the independent variables in the model are random, then the error terms are distributed independently of these independent variables, i.e., Z_t is distributed independently of the X_t's.
(3) Successive errors are uncorrelated.

When autoregressive schemes or stochastic difference equations are used in the regression model, lagged values of the dependent variable occur as independent variables. These types of models violate assumption (2), and will be excluded from further consideration in this research.
For models which do not contain lagged values of the dependent variable as regressors, the independent variables can be regarded as fixed, known constants, even if they are in fact random variables. All inferences are then conditional on the observed values of the independent variables. Of importance, then, is determining whether the model under consideration is in violation of assumption (3). When this assumption is violated, as is often the case in the analysis of time series data and in econometric modeling, complications arise. In order to grasp the severity of the problem, let us consider the following. Rewrite the regression model (1.1.1) in matrix form as

Y(n×1) = X(n×g) β(g×1) + Z(n×1).   (1.1.2)

Let us write assumption (3) as Z ~ (0, σ²I), or, more strongly, Z ~ N(0, σ²I). Then E(Y) = Xβ and E{(Y−Xβ)(Y−Xβ)'} = σ²I. Applying the method of maximum likelihood to the density of Y, the least squares estimates of the unknown model parameters (β) are obtained and are given by the well-known formula

β̂ = (X'X)⁻¹X'Y.

Under the assumption that Z ~ N(0, σ²I), the vector β̂ has the following properties:

(a) E(β̂) = β.
(b) E{(β̂−β)(β̂−β)'} = σ²(X'X)⁻¹.
(c) β̂ is the BLUE (best linear unbiased estimator) for β (Gauss-Markov Theorem).
(d) β̂ has minimum variance among all LUE for β (Gauss-Markov Theorem).
(e) β̂ is a maximum likelihood estimate.
(f) β̂ ~ N{β, σ²(X'X)⁻¹}.

Suppose now that assumption (3) is violated and that Z ~ N(0, Σ), where Σ = σ²Ψ is a general covariance matrix. Then E(Y) = Xβ and E{(Y−Xβ)(Y−Xβ)'} = Σ = σ²Ψ. In order to obtain estimates of the elements of β, the maximum likelihood procedure is again applied to the density of Y. This yields the estimator

β̃ = (X'Ψ⁻¹X)⁻¹X'Ψ⁻¹Y,

sometimes called the Markov estimator. From the Gauss-Markov Theorem, it can be shown that β̃ is the BLUE for β and that β̃ has minimum variance, σ²(X'Ψ⁻¹X)⁻¹, among all LUE for β.
Let us consider the plight of the researcher who, in place of the more general Markov estimator, utilizes the usual least squares estimate of β, say β̂_L, in the presence of dependent errors. Only if the characteristic vectors of Ψ have a specific form does β̂_L equal β̃ [4]. However, this is usually not the case in any practical situation. Assuming that the two estimators are unequal, the vector β̂_L has the following properties:

(a) E(β̂_L) = β.
(b) E{(β̂_L−β)(β̂_L−β)'} = (X'X)⁻¹X'ΣX(X'X)⁻¹ = σ²(X'X)⁻¹X'ΨX(X'X)⁻¹.
(c) β̂_L no longer has minimum variance among all LUE for β.
(d) β̂_L is no longer a maximum likelihood estimator.
(e) β̂_L ~ N{β, σ²(X'X)⁻¹X'ΨX(X'X)⁻¹}.

The implication of properties (b) and (e) is that the usual least squares formula for the variance of β̂_L, namely σ²(X'X)⁻¹, yields a biased estimate of the true variance E{(β̂_L−β)(β̂_L−β)'} = σ²(X'X)⁻¹X'ΨX(X'X)⁻¹. The least squares test statistics used for making inferences about the elements of β are often stochastically larger in the presence of dependent errors. Thus, the researcher will tend to reject the null hypothesis H_0: β_i = 0 more often than he should, and hence will include insignificant terms in the model. In order to avoid the problems introduced by applying the least squares procedure when the errors are dependent, a test for independence of the errors (sometimes called a test for serial correlation of the errors) is essential.

1.2 Various Tests

Since the errors (Z) of a least squares regression are not observable, a test for serial correlation of the errors must be based on the residuals (Ẑ) obtained from the calculated regression, where Ẑ = Y − Xβ̂. However, in most cases the residuals are correlated whether the errors are dependent or not; consequently, ordinary tests of independence for an observed sequence of random variables cannot be used without modification. Many test procedures have been proposed for this problem, and in this section a summary of several popular tests will be presented.
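The bias just described is easy to exhibit numerically. The following sketch is not from the dissertation; the trend design matrix, φ = 0.8, and the sample size are illustrative choices. It compares the naive least squares covariance σ²(X'X)⁻¹ with the true covariance σ²(X'X)⁻¹X'ΨX(X'X)⁻¹ when Ψ is an AR(1) correlation matrix:

```python
import numpy as np

# Illustrative setup (hypothetical values, not from the text):
# intercept + linear trend design, AR(1) errors with phi = 0.8.
n, phi, sigma2 = 50, 0.8, 1.0
t = np.arange(1, n + 1, dtype=float)
X = np.column_stack([np.ones(n), t])

# AR(1) correlation matrix: Psi[i, j] = phi^|i - j|
Psi = phi ** np.abs(np.subtract.outer(t, t))

XtX_inv = np.linalg.inv(X.T @ X)
naive = sigma2 * XtX_inv                                  # what least squares reports
true_cov = sigma2 * XtX_inv @ X.T @ Psi @ X @ XtX_inv     # the actual covariance

# With positive serial correlation and a smooth regressor, the naive
# variance understates the truth, inflating the usual test statistics.
print(naive[1, 1], true_cov[1, 1])
```

For this smooth regressor the true slope variance is several times the naive one, which is exactly why H_0: β_i = 0 is rejected too often.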
1.2.1 The Durbin-Watson Test

J. Durbin and G.S. Watson [10] in 1950-51 were among the first to consider the problem of testing for serial correlation of the errors in least squares regression, and their test and its modified versions remain the most widely applied tests of regression error independence. In order to investigate the random error component Z in the model

Y(n×1) = X(n×g) β(g×1) + Z(n×1),   (1.2.1)

Durbin and Watson (DW) make the usual assumption under the null hypothesis: the errors are independent and identically distributed normal random variables with zero mean and constant variance, i.e., Z_t ~ i.i.d. N(0, σ²). The type of alternative they consider is one for which the vector of random errors, Z, is stationary, and for which Cov{Z_t, Z_{t+h}} = γ(h) is exponentially (or geometrically) decreasing in h. A convenient model for this hypothesis is the first order autoregressive [AR(1)] model, or, as it is sometimes called, the stationary Markov process:

Z_t = φZ_{t-1} + ε_t,

where |φ| < 1 and ε_t ~ i.i.d. N(0, σ²) independently of the Z_{t-1}'s. Thus, the null and alternative hypotheses take the form H_0: φ = 0 versus H_a: φ ≠ 0. The test statistic derived by DW is given by

d = Σ_{t=2}^n (Ẑ_t − Ẑ_{t-1})² / Σ_{t=1}^n Ẑ_t²,

where Ẑ is the vector of residuals from the calculated least squares regression (Ẑ = Y − Xβ̂). Durbin and Watson derived the d statistic using a result obtained by T.W. Anderson [5] in 1948. Anderson showed that no test exists which is uniformly most powerful against the two-sided alternative. However, for cases in which the columns of the X (design) matrix in the model (called 'regression vectors' by Anderson) are linear combinations of the eigenvectors (latent vectors) of a matrix A (where A occurs in the density function of Z), tests based on Z'AZ/Z'Z are uniformly most powerful (UMP) against one-sided alternatives, and have optimal properties for two-sided alternatives.
The density functions of Z which lead to the UMP test have been shown by Anderson to take the form

f(Z; φ, σ²) = K_1 exp[−(1/2σ²){(1+φ²)Z'Z − 2φZ'AZ}].   (1.2.2)

When the distribution of Z takes the form of the AR(1) model, the density of Z can be written as

f(Z; φ, σ²) = K_2 exp[−(1/2σ²){(1+φ²)Σ_{t=1}^n Z_t² − φ²(Z_1² + Z_n²) − 2φΣ_{t=2}^n Z_t Z_{t-1}}].   (1.2.3)

By taking Z'AZ = Σ_{t=2}^n Z_t Z_{t-1} in equation (1.2.2), DW obtained the density function

f(Z; φ, σ²) = K_3 exp[−(1/2σ²){(1+φ²)Σ_{t=1}^n Z_t² − φ(Z_1² + Z_n²) − 2φΣ_{t=2}^n Z_t Z_{t-1}}],   (1.2.4)

which is very close to the AR(1) density in equation (1.2.3). Thus, an appropriate choice of A, namely the tridiagonal matrix

A = [  1 -1  0 ...  0  0
      -1  2 -1 ...  0  0
       0 -1  2 ...  0  0
       ................
       0  0  0 ...  2 -1
       0  0  0 ... -1  1 ],

leads to a d statistic which provides a UMP test against one-sided alternatives in the "latent vector case", i.e., the case when the columns of the design matrix coincide with the latent vectors of A.

The d statistic is bounded below by 0 (obvious) and above by 4. In order to see that d ≤ 4, expand the numerator of the statistic:

Σ_{t=2}^n (Ẑ_t − Ẑ_{t-1})² = Σ_{t=2}^n Ẑ_t² − 2Σ_{t=2}^n Ẑ_t Ẑ_{t-1} + Σ_{t=2}^n Ẑ_{t-1}².

Since each of Σ_{t=2}^n Ẑ_t² and Σ_{t=2}^n Ẑ_{t-1}² is at most Σ_{t=1}^n Ẑ_t²,

d ≤ 2[1 − Σ_{t=2}^n Ẑ_t Ẑ_{t-1} / Σ_{t=1}^n Ẑ_t²].

Now Σ_{t=2}^n Ẑ_t Ẑ_{t-1} / Σ_{t=1}^n Ẑ_t² ≥ −1, hence d ≤ 2[1−(−1)] = 4.

Thus, if the errors were positively serially correlated, e.g., Z_t = φZ_{t-1} + ε_t with φ > 0, d would tend to be relatively small, and if the errors were negatively serially correlated (φ < 0), d would tend to be relatively large. The user of the d test interested in detecting the existence of positive serial correlation of the regression errors would reject the null hypothesis of independent errors (φ = 0) in favor of the alternative φ > 0 if d < d*, where d* is the appropriate critical value of d for a lower-tailed test. Upper-tailed tests (tests for negative serial correlation) are carried out in a similar manner.
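The behavior of d can be illustrated with a short sketch (not part of the original text; the series length and the value φ = 0.8 are arbitrary choices): for independent errors d sits near 2, while positive serial correlation pulls it toward 0.

```python
import numpy as np

# d = sum_{t=2}^n (Z_t - Z_{t-1})^2 / sum_{t=1}^n Z_t^2
def durbin_watson(resid):
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(0)
n = 500

# Independent errors: d should sit near 2.
z_white = rng.standard_normal(n)

# Positively autocorrelated AR(1) errors (phi = 0.8): d falls well below 2.
z_ar = np.zeros(n)
for t in range(1, n):
    z_ar[t] = 0.8 * z_ar[t - 1] + rng.standard_normal()

d_white, d_ar = durbin_watson(z_white), durbin_watson(z_ar)
print(d_white, d_ar)
```

Both values necessarily fall in [0, 4] by the bound derived above, and the AR(1) series gives the smaller statistic.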
A major drawback to this testing procedure is that it is only possible to find the exact null distribution of d in very special cases (see Anderson [5]), cases which do not frequently occur in practice. However, DW show that it is possible to calculate upper (dU*) and lower (dL*) bounds to the critical value d*, where dL* ≤ d* ≤ dU*. Values of the critical bounds are made available in Durbin and Watson [10]. The procedure for conducting a test designed to detect the existence of positive serial correlation is as follows:

(a) if d < dL*, reject H_0 (significant at level α);
(b) if d > dU*, do not reject H_0 (insignificant at level α);
(c) if dL* < d < dU*, the test is inconclusive.

By observing that the null distribution of d is symmetric, with 0 ≤ d ≤ 4, a test for negative serial correlation can be carried out exactly as above, only using the statistic 4−d. Two-sided tests are obtained by combining single tail areas, using α/2.

When the observed value of d falls in the "region of ignorance", dL* < d < dU*, the bounds test is inconclusive. It is here that DW recommend applying a beta approximation. The procedure involves transforming d so that its range is (0,1), accomplished by the simple transformation ¼d. A beta distribution with the same mean and variance as ¼d is then fit using the method of moments. Critical values of d are found from tables of the incomplete beta function, and the test is conducted in the usual manner.

Tests for the case in which the errors in a least squares regression follow a specific non-first order autoregressive process were developed by Peter Schmidt [29] and K.F. Wallis [34] and were based upon the work done by DW. Schmidt considered the AR(2) alternative:

Z_t = φ_1 Z_{t-1} + φ_2 Z_{t-2} + ε_t,   ε_t ~ i.i.d. (0, σ²).
A test of the null hypothesis H_0: φ_1 = φ_2 = 0 is carried out by considering a generalization of the DW d statistic:

d_2 = [Σ_{t=2}^n (Ẑ_t − Ẑ_{t-1})² + Σ_{t=3}^n (Ẑ_t − Ẑ_{t-2})²] / Σ_{t=1}^n Ẑ_t².

When analyzing quarterly data, Wallis argued that researchers may be interested in detecting the existence of seasonal variation of the error terms. One possible seasonal error model is the AR(4) process:

Z_t = φZ_{t-4} + ε_t,   ε_t ~ i.i.d. (0, σ²).

Wallis' "seasonal effects" test is based upon the value of the modified DW statistic

d_4 = Σ_{t=5}^n (Ẑ_t − Ẑ_{t-4})² / Σ_{t=1}^n Ẑ_t².

Critical bounds for both non-first order tests have been tabulated by their respective developers. The tests are conducted in exactly the same manner as the DW bounds test.

1.2.2 Beta Approximation

H. Theil and A.L. Nagar [33] criticized the DW bounds test because of its inconclusive region (dL*, dU*), fearing that the practical research worker may interpret "no inference possible" as equivalent to "no evidence to reject the null hypothesis of independence." This results in bias in the sense that many cases of positive serial correlation will be overlooked, especially when the inconclusive region is large. This situation occurs when the number of observations is small and the number of independent variables in the model is large.

The Theil-Nagar (TN) approach is to fit a beta distribution immediately, dispensing with finding bounds for the critical value d*. However, their method of fitting the beta distribution differs from that used by DW. The d statistic is transformed to a beta(p, q) variable in the range (0,1) by considering the quantity X = (d−a)/(b−a), where the range of d is (a, b). The parameters p, q, a, and b are then determined by the method of moments, using the first four moments of d. The first two moments of d are used to find expressions for a and b, while the third and fourth moments of d are utilized for determining p and q.
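The moment-matching step shared by these beta approximations can be sketched as follows. This is a generic method-of-moments fit, not the exact DW or TN calculation: given the mean and variance of the transformed statistic X on (0,1), the beta parameters follow in closed form.

```python
# Match a Beta(p, q) law to a given mean and variance on (0, 1).
# From mean = p/(p+q) and var = pq/((p+q)^2 (p+q+1)), one gets
# p + q = mean(1-mean)/var - 1, then p = mean(p+q), q = (1-mean)(p+q).
def beta_from_moments(mean, var):
    total = mean * (1.0 - mean) / var - 1.0   # this is p + q
    return mean * total, (1.0 - mean) * total

# Beta(2, 2) has mean 1/2 and variance 1/20, so the fit recovers p = q = 2.
p, q = beta_from_moments(0.5, 0.05)
print(p, q)
```

The approximations above differ only in how the moments of the transformed d are obtained (approximate moments for DW and TN, exact moments for Henshaw), not in this matching step.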
TN simplify the necessary calculations considerably by ignoring terms of lower order in the moment expressions.

R.C. Henshaw [20] also considered a beta approximation to the distribution of d. However, where TN approximate the first four moments of d, Henshaw uses the exact moments of d in fitting the beta distribution. The transformation X = (d − ν_1)/(ν_{n-g} − ν_1) is used, where ν_1 and ν_{n-g} are, respectively, the smallest and largest latent roots of (I − X(X'X)⁻¹X')A. The parameters p, q, ν_1, and ν_{n-g} are then found by the method of moments. The calculations needed in Henshaw's test are much more extensive than those of TN. However, TN point out that their approximations are good only when "the behavior of the [independent] variables is sufficiently smooth in the sense that the first and second differences are small compared with the range of the variable itself."

1.2.3 dU Approximation

For polynomial regression, E.J. Hannan [17] discovered that the DW upper bounding statistic, dU, gives a very good approximation to the true distribution of d. Hannan's reasoning was based on the fact that the dU bound is attained in the latent vector case, i.e., when the g columns of the design matrix are linear combinations of the g latent vectors associated with the smallest nonzero latent roots of the matrix A. Later, Hannan and R.D. Terrell [18] showed that the approximation remains reasonably good whenever the coded values of the independent variables are concentrated near the origin (the spectrum of the independent variables in the model is concentrated near the origin), which frequently occurs for economic time series. In the light of Hannan's work, Durbin and Watson [12] theorized that the shape of the d distribution might well be better approximated by the distribution of the upper bounding statistic dU.
However, DW felt that this approximate distribution should have the same mean and variance as the distribution of d, since this was the one desirable property possessed by the beta approximation. The statistic d* = a + b·dU was considered, where a and b are chosen so that d* has the same first two moments as d. Critical values of dU, tabulated by DW in their 1951 paper, are used in finding the critical values of d* [11].

1.2.4 BLUS Estimators

It can be shown that the covariance matrix of the residuals is E(ẐẐ') = σ²{I − X(X'X)⁻¹X'} = σ²M. Thus, even if the null hypothesis is true, the residuals are, in general, correlated and heteroscedastic. This result complicates matters considerably, as evidenced by DW's upper and lower bounds to the significance points of d.

H. Theil [32] reasoned that the "testing procedure would be simplified considerably if the covariance matrix of the [residuals] were of the form σ²I rather than σ²M." Theil considered new estimates of the regression errors, called BLUS estimates; that is, best linear unbiased within the class of those linear unbiased estimates that have scalar covariance matrix σ²I. The new regression residuals, V, are formed such that E{(V−Z)'(V−Z)} is minimized subject to E(V) = E(Z) = 0 and E(VV') = σ²Ω, where Ω is chosen a priori. By choosing Ω (n×n) as a diagonal matrix with n−g ones and g zeroes along the diagonal, the new residuals V are BLUS estimators of Z. Theil's test statistic takes the form of DW's d statistic, only with the Ẑ_t's replaced by the V_t's:

d_BLUS = Σ_{t=2}^n (V_t − V_{t-1})² / Σ_{t=1}^n V_t².

Now the BLUS estimates, under the null hypothesis, are independent with zero mean and constant variance; hence, the exact null distribution of d_BLUS is a standard one. Critical points are tabulated by B.I. Hart [19]. Theil notes that the choice of the matrix Ω actually results in the dropping of g of the n residuals from the estimation process.
This procedure permits any set of g residuals to be dropped, the choice being left to the user. However, as pointed out by A.P.J. Abrahamse and A.S. Louter [2], there exist (n choose g) possible BLUS estimators, each of which will, in general, yield a different result. No method has been given for selecting the optimal vector of BLUS estimators.

Abrahamse and Louter (AL) hypothesize that a diagonal matrix Ω is not a necessary condition for eliminating the inconclusive region in the DW bounds test; it is sufficient to require that Ω not depend on the design matrix. The estimators of AL are derived in the same manner as the BLUS estimators, with the modification that σ²Ω = σ²KK' approximates σ²M, and the test statistic, like Theil's, takes the form

Q = Σ_{t=2}^n (V_t − V_{t-1})² / Σ_{t=1}^n V_t².

AL show that the Q statistic has the same distribution as the DW upper bounding statistic dU. Hence, it is an exact test, with tables available for finding significance points.

The procedures developed by Theil and AL both necessitate tedious calculations in order to obtain residual estimates. G.D.A. Phillips and A.C. Harvey [26] sought to simplify the calculation and interpretation of the estimated residuals, while retaining the properties of the BLUS estimates. They determined that this could be accomplished by utilizing "recursive residuals." One normally estimates the model parameters on the basis of all n observations. The recursive residuals, denoted by Ẑ_j, are found by using only the first j observations, j = g+1, g+2, ..., n. These residuals can be easily generated by a recursive algorithm. The Phillips and Harvey (PH) test statistic takes the familiar form

d_PH = Σ_{t=g+2}^n (Ẑ_t − Ẑ_{t-1})² / Σ_{t=g+1}^n Ẑ_t².

Under H_0, the {Ẑ_j} are distributed independently with zero mean and constant variance. Critical values of this standard distribution have been tabulated by Hart [19]. Like Theil's BLUS estimators, the recursive residuals are linear, unbiased, and possess a scalar covariance matrix.
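The recursive-residual idea can be sketched as below. This is the standard standardized one-step prediction-error construction, not necessarily PH's exact algorithm, and the simulated design, coefficients, and sample size are illustrative assumptions: the j-th residual is the prediction error of observation j from a least squares fit to the preceding observations, scaled so that the residuals have constant variance under H_0.

```python
import numpy as np

# Illustrative data: intercept + trend with hypothetical coefficients (1, 0.5).
rng = np.random.default_rng(1)
n, g = 40, 2
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
y = X @ np.array([1.0, 0.5]) + rng.standard_normal(n)

resid = []
for j in range(g, n):                       # 0-based: fit on rows 0..j-1
    Xj, yj = X[:j], y[:j]
    beta = np.linalg.lstsq(Xj, yj, rcond=None)[0]
    xj = X[j]
    h = xj @ np.linalg.inv(Xj.T @ Xj) @ xj  # leverage of the new point
    # standardized one-step prediction error
    resid.append((y[j] - xj @ beta) / np.sqrt(1.0 + h))
resid = np.asarray(resid)                   # n - g recursive residuals
print(resid.shape)
```

The d_PH statistic is then the usual ratio of squared first differences to the sum of squares of these n−g residuals.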
1.2.5 The Sign Test

One of the first researchers to consider a nonparametric alternative to the DW bounds test was R.C. Geary [14]. Geary found that a simple count of the number of sign changes in the least squares residuals could be used as a test for serial correlation in the errors. If the errors are independent, the signs (plus or minus) of the residuals in the sequence will occur in random order. Serial correlation is inferred, then, if the number of sign changes is fewer than expected.

Let n be the number of observations and T the number of sign changes in the sequence of residuals. Then under the null hypothesis of independence, T has the binomial distribution

P(T = t) = [(n−1)! / (t!(n−1−t)!)] (½)^{n−1}.

Thus, it is possible to calculate significance points of T under H_0. The hypothesis of white noise is rejected when T < T*, where T* is the tabulated significance point for the given values of n and α.

Geary notes that the T test is very similar to the familiar runs test. In fact, the number of runs, U, is one more than the number of sign changes, T, i.e., U = T + 1. In runs theory, the numbers of plus and minus signs are taken into account, whereas these are ignored in the sign test. Geary remarks that the sign test is a handy tool for assessing the probable presence of serial correlation for those workers in multivariate regression time series who are computerless, or without the Durbin-Watson routine in their computer.

1.2.6 Periodogram-Based Tests

Recall that DW's 1950 d test was designed to have high power against only a first order autoregressive alternative error model. However, the nature of the residual dependence, when it exists, may be more general (non-first order). Thus, an investigator may wish to obtain a more comprehensive picture of the departure from serial independence than is provided by a single statistic like d.
Durbin [9] suggests that it may be more appropriate to ask what the data reveal about the departure from serial independence rather than to set up a particular parametric alternative and seek a test which has high power against it. Durbin's 1969 technique for studying the general nature of the serial dependence in a stationary series of observations Z_1, Z_2, ..., Z_T is to compute the quantities

S_j = Σ_{r=1}^j p_r / Σ_{r=1}^m p_r,   j = 1, 2, ..., m,

where

p_r = (1/T) |Σ_{t=1}^T Z_t e^{2πirt/T}|²,

and then to make a plot of S_j versus j/m. This plot is called the cumulated periodogram; S_j is its sample path, and p_r, r = 1, 2, ..., m, are the periodogram ordinates. It is known that when Z_1, Z_2, ..., Z_T are independent and identically distributed, the sample path S_j behaves asymptotically like that of the sample distribution function in the theory of order statistics. Hence, the Kolmogorov-Smirnov limits can be applied to provide a test of serial independence. However, when the periodogram is computed from the least squares residuals (Ẑ_t), modifications to this procedure are necessary, since the residuals themselves are correlated in any case.

For a test against positive serial correlation, a critical value C_0 for the Kolmogorov-Smirnov statistic is acquired from available tables. Like the DW bounds procedure, there are three regions for the sample path of S_j:

(1) If the sample path of S_j crosses an upper line C_0 + j/m', reject H_0.
(2) If the sample path of S_j fails to cross a lower line −C_0 + [j − ½(g−1)]/m', where ½g ≤ j ≤ m and m' = ½(n−g), do not reject H_0.
(3) If the sample path crosses neither the lower nor the upper line, the test is inconclusive (analogous to the inconclusive region for the DW bounds test).

Durbin notes that this technique is conservative in the sense that, for a test of significance level α, the probability of falsely rejecting the null hypothesis does not exceed α.
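For an observed (not residual-based) series, the cumulated periodogram can be sketched as follows; the simulated white-noise series and the choice m = (T−1)/2 are illustrative assumptions, and the residual-based modifications above are not included:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 200
z = rng.standard_normal(T)          # illustrative white-noise series

# Periodogram ordinates p_r = |sum_t Z_t exp(2*pi*i*r*t/T)|^2 / T, r = 1..m.
m = (T - 1) // 2
t = np.arange(1, T + 1)
p = np.array([np.abs(np.sum(z * np.exp(2j * np.pi * r * t / T))) ** 2 / T
              for r in range(1, m + 1)])

# Cumulated periodogram S_j = (p_1 + ... + p_j) / (p_1 + ... + p_m).
S = np.cumsum(p) / np.sum(p)

# Under independence the path hugs the diagonal j/m; its maximum deviation
# is the Kolmogorov-Smirnov-type quantity compared against C_0.
D = np.max(np.abs(S - np.arange(1, m + 1) / m))
print(D)
```

For white noise the path climbs roughly linearly and D stays small; serial correlation concentrates the periodogram at low or high frequencies and pushes the path through the boundary lines.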
Tests against negative serial correlation are carried out similarly, and a two-sided test may be obtained by applying both one-sided regions at significance level α/2.

1.3 Concluding Note

We have presented a brief summary of available tests for serial correlation of the errors in a regression analysis. The properties of these tests are discussed, and comparisons made, in the chapter that follows. In particular, we are concerned with the performance of these tests in the presence of non-first order serially correlated errors.

CHAPTER II
COMPARING TESTS OF SERIAL CORRELATION

2.1 Power Comparisons

Many researchers have conducted Monte Carlo simulation studies in order to estimate the powers of the various tests for serial independence. Most of the power studies in the literature focus on the Durbin-Watson d test. In the following sections, the d test is compared to three groups of alternative tests and a brief summary of the results is presented.

2.1.1 Exact d Test Versus Approximate Tests of Serial Correlation

In 1971, Durbin and Watson [12] investigated and compared the exact distribution of d with certain approximate tests which were developed using their procedure as a basis. These include the Durbin-Watson, Theil-Nagar, and Henshaw beta approximations and the Hannan dU and Durbin-Watson a+b·dU approximations. Letting d' denote a random variable having a distribution used as an approximation to the true distribution of d, DW computed P(d' ≤ d_α) for each approximating distribution, where P(d ≤ d_α) = α. The power comparisons reveal that the tests fall into three categories: Theil-Nagar beta and Hannan dU; DW beta and DW a+b·dU; Henshaw beta. These groups are given in increasing order of accuracy, which coincides with increasing difficulty of calculation. As a result of their research, DW make the following recommendation: when a test for serial correlation based on the least squares residuals is required, the DW d test should be applied first.
If the result is inconclusive, apply the DW beta or a+bd_U approximation. When special accuracy is needed, use the Henshaw beta approximation. Finally, if a more comprehensive picture of the serial properties of the errors is desired than is provided by the value of a single statistic, employ the cumulated periodogram method.

2.1.2 d Test Versus BLUS Test

Abrahamse and Koerts [1] calculated the power of the BLUS test for various values of n and g (the number of independent variables), with the parameter of the alternative error hypothesis at a relatively large value of φ = .8. This was compared with the following three quantities:

(a) P(d < d_L* | H_a), called the probability of a correct decision for the d test, where d_L* is the lower critical bound for d;
(b) P(d > d_U* | H_a), called the probability of an incorrect decision for the d test, where d_U* is the upper critical bound for d;
(c) P(d < d* | H_a), where d* is an approximate true significance point of d obtained by using a method similar to those discussed in Chapter I.

Note that the probability of a correct decision, as defined by Abrahamse and Koerts, is the power of the d test when using DW's bounds. The results obtained by AK can be summarized as follows: (i) The power of the d test generally exceeds the power of the BLUS test. (ii) The power of the BLUS test dominates the probability of a correct decision for the d test; the smaller the number of degrees of freedom (n-g), the greater the difference. (iii) The power of the BLUS test, the probability of a correct decision, and the power of the d test converge as n-g increases, for in this case the inconclusive region of the d test disappears. Since calculation of approximate significance points for the d test may require many more computations than the BLUS procedure, AK recommend one follow the schematic shown in Figure 2.1.1 when choosing a test for residual correlation.
[Figure 2.1.1 is a decision schematic: when n-g is large, apply the Durbin-Watson d test, turning to approximate methods if the result is inconclusive; when n-g is small, apply the BLUS test.]

FIGURE 2.1.1 ABRAHAMSE-KOERTS CHOICE OF A TEST FOR SERIAL CORRELATION

2.1.3 The Durbin-Watson d Test Versus Non-First Order Alternative Tests

V. Kerry Smith [30] compared the approximate power of four tests for serial correlation with two sets of alternatives: second order autoregressive processes, denoted AR(2), and first order moving average processes, denoted MA(1). The four tests considered in the study were the DW d test, the Durbin cumulated periodogram (s_j) test, the Schmidt (d_2) test, and the Geary (τ) count of sign changes test. Smith's Monte Carlo results are summarized as follows:

(a) For the MA(1) error model, the d test proved to be consistently more powerful than the other procedures. However, with small samples and low first order correlation coefficients, the differences in the tests are not pronounced.
(b) For the AR(2) error model, three main results are obtained. (i) For certain values of φ_1 and φ_2 in the AR(2) model, the d test is more powerful than the d_2 test. The τ test is shown to be much inferior to the d. (ii) In models with large and approximately equal values of φ_1 and φ_2, the d_2 test appears to become more powerful than the others. (Large φ_1 and φ_2 imply first and second order autocorrelations near 1. For a detailed discussion of autocorrelation, see Section 3.3.1.) (iii) The s_j test appears to be at least as powerful as the d test for most of the values of φ_1 and φ_2 considered by Smith. However, the s_j test does not perform as well as the d_2 test in most instances.

2.2 The Alternative Error Model

Almost all of the tests discussed in Section 1.2 consider only a first order autoregressive hypothesis, i.e., one in which the hypothesized error model under H_a takes the form

Z_t = φ Z_{t-1} + ε_t,

where {ε_t} is an i.i.d. (0, σ²) sequence.
The primary reason for this restrictive hypothesis is that the distributions of the test statistics are complex, even in this simple case. However, many time series exhibit autocorrelation patterns which are not well modeled by the first order model. When analyzing quarterly residuals, it may be more appropriate to consider the model

Z_t = φ Z_{t-4} + ε_t,

where {ε_t} is defined as above. The performance of the Durbin-Watson d or similar test should be evaluated for models like these. In the sections that follow, we will examine the properties of tests designed for the general alternative autoregressive-moving average (ARMA) model

Z_t = φ_1 Z_{t-1} + φ_2 Z_{t-2} + ... + φ_p Z_{t-p} + ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2} + ... + θ_q ε_{t-q}.

CHAPTER III
TWO TESTS DESIGNED TO DETECT A GENERAL ALTERNATIVE

3.1 Box and Pierce "Portmanteau" Test

G.E.P. Box and D.A. Pierce [7] developed a method for checking the adequacy of fit in autoregressive-moving average (ARMA) time series models. Finding a test for serial independence of the errors in least squares regression was not of prime importance to Box and Pierce (BP). However, they did discover that the statistic used for checking model adequacy is, under certain conditions, very nearly equivalent to the DW d test. Consider the general ARMA(p,q) model:

Z_t = Σ_{j=1}^p φ_j Z_{t-j} + Σ_{i=1}^q θ_i a_{t-i} + a_t, t = 1, 2, ..., n, (3.1.1)

where the {a_t} are independent unobservable random variables, and the vector of model parameters (φ_1, φ_2, ..., φ_p, θ_1, θ_2, ..., θ_q)' is unknown. In order for (3.1.1) to attain stationarity, the autoregressive parameters (φ_1, φ_2, ..., φ_p) must satisfy certain constraints. Fuller [13], and others, show that the condition of stationarity is met if the roots of the equation

x^p - φ_1 x^{p-1} - φ_2 x^{p-2} - ... - φ_p = 0

are all in the unit circle. To guarantee the invertibility of the moving average part of model (3.1.1), Fuller states that the roots of the equation

y^q + θ_1 y^{q-1} + θ_2 y^{q-2} + ... + θ_q = 0

must also lie in the unit circle.
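As a concrete instance of the unit-circle condition: for an AR(2) model Z_t = φ_1 Z_{t-1} + φ_2 Z_{t-2} + ε_t, the requirement that both roots of x² - φ_1 x - φ_2 = 0 lie inside the unit circle is equivalent to the well-known "stationarity triangle" of inequalities. A sketch (the function name is our own):

```python
def ar2_is_stationary(phi1, phi2):
    # Both roots of x^2 - phi1*x - phi2 = 0 lie inside the unit circle
    # if and only if (phi1, phi2) lies in the stationarity triangle:
    return abs(phi2) < 1 and phi1 + phi2 < 1 and phi2 - phi1 < 1

# e.g. (0.5, 0.3) is stationary; (0.5, 0.6) is not, since 0.5 + 0.6 > 1
```

The complex-root region (φ_1² + 4φ_2 < 0) inside this triangle is where the AR(2) autocorrelations oscillate, one of the patterns a first-order alternative cannot capture.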
This condition on (θ_1, θ_2, ..., θ_q) enables the researcher to express the {Z_t} generated from the MA process

Z_t = Σ_{i=0}^q θ_i a_{t-i} (θ_0 = 1)

as an AR time series,

Z_t = Σ_{j=1}^∞ π_j Z_{t-j} + a_t.

(We utilize this invertibility property later in Section 3.3.) Throughout, then, we assume that model (3.1.1) is stationary and invertible, i.e., the roots of the equations x^p - φ_1 x^{p-1} - φ_2 x^{p-2} - ... - φ_p = 0 and y^q + θ_1 y^{q-1} + θ_2 y^{q-2} + ... + θ_q = 0 are all in the unit circle. After calculating the parameter estimates φ̂_1, φ̂_2, ..., φ̂_p, θ̂_1, θ̂_2, ..., θ̂_q by the usual time series methods (see Box-Jenkins [8]), the researcher may then be interested in checking the adequacy of the model:

Z_t = Σ_{j=1}^p φ̂_j Z_{t-j} + Σ_{i=1}^q θ̂_i a_{t-i} + a_t. (3.1.2)

BP proposed that the first K sample autocorrelations of the residual series {a_t}, namely r_1, r_2, ..., r_K, be used for this purpose, where K is small relative to n, and where

r_j = (Σ_{t=j+1}^n a_t a_{t-j}) / (Σ_{t=1}^n a_t²), j = 1, 2, ..., K. (3.1.3)

In particular, BP considered the statistic n Σ_{j=1}^K r_j². However, the true sample autocorrelations cannot be calculated, since the {a_t} in (3.1.1) are unobservable. In order to proceed, the researcher may apply the method of least squares to (3.1.2) to obtain the residuals {â_t}. Replacing the {a_t} with {â_t} in (3.1.3), estimates of the first K sample autocorrelations are found, i.e.,

r̂_j = (Σ_{t=j+1}^n â_t â_{t-j}) / (Σ_{t=1}^n â_t²), j = 1, 2, ..., K. (3.1.4)

The statistic Q = n Σ_{j=1}^K r̂_j² is then of interest. BP show that Q will, to a close approximation, have an asymptotic χ² distribution with K-p-q degrees of freedom.
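The residual autocorrelations (3.1.4) and the statistic Q are direct to compute. A minimal sketch in pure Python (the toy residual series is an illustrative assumption):

```python
def sample_autocorrelations(a, K):
    # r_j = sum_{t=j+1}^n a_t * a_{t-j} / sum_{t=1}^n a_t^2   (eq. 3.1.3/3.1.4)
    n = len(a)
    denom = sum(x * x for x in a)
    return [sum(a[t] * a[t - j] for t in range(j, n)) / denom
            for j in range(1, K + 1)]

def box_pierce_Q(a, K):
    # Q = n * sum_{j=1}^K r_j^2; approximately chi-square(K - p - q)
    # when computed from the residuals of an adequate ARMA(p,q) fit
    n = len(a)
    return n * sum(rj * rj for rj in sample_autocorrelations(a, K))

# illustrative residual series
res = [0.8, -0.3, 0.5, -0.9, 0.2, 0.6, -0.4, 0.1]
Q = box_pierce_Q(res, K=3)
```

For a perfectly alternating series such as (1, -1, 1, -1) with K = 1, the lag-one autocorrelation is -3/4 and Q = 4(9/16) = 2.25, which can serve as a hand check.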
The 'portmanteau' test for checking the adequacy of any ARMA(p,q) process will lead to a rejection of the hypothesis that the model adequately fits the data if Q is "too large," where Q ~ χ²(K-p-q). Although BP are not actually considering a regression model in which the errors are to be tested for independence, notice that if p = q = 0 in (3.1.1), i.e., if the {Z_t} are assumed independent, a test based on

r_1 = (Σ_{t=2}^n Z_t Z_{t-1}) / (Σ_{t=1}^n Z_t²)

is very nearly equivalent to the DW d test. This fact would lead one to believe that the 'portmanteau' test may have applications in the area of testing for serial independence of the errors in a regression analysis. In particular, let p ≠ 0, q = 0 in (3.1.1). The model then takes the form:

Z_t = φ_1 Z_{t-1} + φ_2 Z_{t-2} + ... + φ_p Z_{t-p} + ε_t, (3.1.5)

where the {ε_t} are independently distributed. If we let {Z_t} represent the unobservable errors in a regression model, then (3.1.5) is the general autoregressive alternative error model discussed in Section 2.2. Under the null hypothesis of white noise, p = 0, and the BP statistic Q is distributed asymptotically as a chi-square random variable with K degrees of freedom. It seems reasonable to utilize the Q statistic in a test for serial correlation of the {Z_t}: "large" values of Q would lead to a rejection of the hypothesis of white noise. Further, since the form of (3.1.5) is more general than the alternative model considered by DW and others (see Section 1.2), one would expect the 'portmanteau' test to be more powerful against general alternatives.

3.2 A New Technique: The Max-χ² Procedure

James T. McClave [24] considered a new method of estimating the order of autoregressive time series models, called the maximum chi-square (max-χ²) technique. As was the case with BP, a test for serial correlation of the errors in a regression model was not of prime importance in McClave's research.
However, after developing the max-χ² procedure, McClave hypothesized that this technique could also be used as a test for serial correlation of the regression errors, and that this test would have high power against general alternatives. McClave considered the stationary autoregressive (AR) model

Z_t = φ_1 Z_{t-1} + φ_2 Z_{t-2} + ... + φ_m Z_{t-m} + ε_t, (3.2.1)

where {ε_t} is a white noise process. Call model (3.2.1) a pth order subset AR model with maximum lag m if exactly p of the lag coefficients φ_1, φ_2, ..., φ_m are nonzero, with φ_m ≠ 0 and φ_j = 0 for all j > m. The researcher whose ultimate goal is to obtain estimates of the nonzero lag coefficients in (3.2.1) must first estimate the elements of the parameter set

Ω = {p; j_1, j_2, ..., j_p},

where (j_1, j_2, ..., j_p) are the lags having nonzero coefficients. The initial step in the order estimation process is the application of the 'subset autoregression algorithm' developed by McClave [24]. Subset autoregression enables the researcher to generate a sequence of estimates of Ω, say S = {Ω̂_0, Ω̂_1, ..., Ω̂_K}, where Ω̂_k = {k; ĵ_1, ĵ_2, ..., ĵ_k}. Once these estimates are obtained, McClave proposes the following sequential procedure for choosing the order from S:

1. Compute the sequence of statistics {M_0, M_1, ..., M_{K-1}}, where M_k = n(σ̂²_k - σ̂²_{k+1})/σ̂²_{k+1}, k = 0, 1, 2, ..., K-1; here σ̂²_k is the residual variance corresponding to the model with parameter estimates Ω̂_k, and σ̂²_{k+1} is the residual variance corresponding to the model with parameter estimates Ω̂_{k+1}.

2. For a given choice of α, determine C_k such that P{M*_k > C_k} = α, where M*_k is the maximum order statistic in a sequence of (K-k) independent χ²_1 random variables.

3. Choose the estimated order p̂ such that p̂ = min{k: M_k < C_k, 0 ≤ k ≤ K}, and thus Ω̂ = Ω̂_p̂.

This method of estimating Ω is referred to as the "max-χ² technique." To illustrate the application of the max-χ² technique in tests for serial correlation, consider the first order AR model with nonzero but unknown lag j_1:

Z_t = φ_{j_1} Z_{t-j_1} + ε_t, (3.2.2)

where the {ε_t} is white noise.
The hypotheses of interest to the researcher who is attempting to detect serial correlation in {Z_t} are then

H_0: φ_{j_1} = 0
H_1: φ_{j_1} ≠ 0.

Rewriting H_0 and H_1 in terms of the parameter set, we have

H_0: Ω = {0} (white noise)
H_1: Ω = {1; j_1}, where 1 ≤ j_1 ≤ K.

The test statistic takes the form

T = n(σ̂²_0 - σ̂²_1)/σ̂²_1,

where σ̂²_0 = (1/n) Σ_{t=1}^n Z_t² = C_0, and σ̂²_1 is generic for the estimated residual variance of each first order AR model. However, McClave has shown that the subset algorithm, when applied to (3.2.2), will choose that lag j_1 which minimizes σ̂²_{j_1} over all first order models. Hence, the test statistic takes the form

M = n[C_0 - min_{j_1} σ̂²_{j_1}] / min_{j_1} σ̂²_{j_1}.

Now σ̂²_{j_1} = C_0(1 - r²_{j_1}), where r_j is the jth sample autocorrelation function. Then,

M = n[C_0 - min_{j_1} C_0(1 - r²_{j_1})] / min_{j_1} C_0(1 - r²_{j_1})
  = n C_0[1 - (1 - max_{j_1} r²_{j_1})] / C_0[1 - max_{j_1} r²_{j_1}]
  = n max_{j_1} r²_{j_1} / [1 - max_{j_1} r²_{j_1}]. (3.2.3)

In equation (3.2.3), notice that M is equivalent to the statistic n max_j r_j². McClave showed that, for any fixed lag j_1, T converges in distribution when H_0 is true to a χ² random variable with 1 degree of freedom. It follows, then, that under the hypothesis of white noise, M →_D M*, where M* is distributed as the maximum order statistic from a sequence of K independent χ²_1 random variables. Values of the test statistic "too large" will lead to a rejection of H_0, where "too large" is determined according to the distribution of M*. Let us compare the "max-χ²" statistic,

n max_{1≤j≤K} r_j²,

with the Box-Pierce "portmanteau" statistic,

n Σ_{j=1}^K r_j².

Both statistics are based on the vector of sample autocorrelations, r = (r_1, r_2, ..., r_K)'. Models (3.1.5) and (3.2.1) both take the form of the more general alternative error structure

Z_t = φ_1 Z_{t-1} + φ_2 Z_{t-2} + ... + φ_m Z_{t-m} + ε_t, (3.2.4)

which gives rise to the plausibility that both testing procedures attain high power against non-first order alternatives.
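Both forms of the max-χ² statistic are immediate to compute from the sample autocorrelations; since x/(1-x) is increasing in x on [0, 1), the ratio form (3.2.3) and n·max r_j² are maximized at the same lag. A sketch (the autocorrelation values are illustrative assumptions):

```python
def max_chisq_stat(r, n):
    # Returns (T, M): T = n * max_j r_j^2, and the equivalent
    # ratio form M = n * max r^2 / (1 - max r^2) of eq. (3.2.3)
    r2max = max(rj * rj for rj in r)
    return n * r2max, n * r2max / (1.0 - r2max)

# illustrative: K = 3 autocorrelations, largest at lag 2
T_m, M = max_chisq_stat([0.10, 0.50, 0.20], n=100)
```

Here the dominating lag contributes r² = 0.25, so T_m = 25 while M = 25/0.75; both tests reject or fail to reject together once their critical values are matched.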
The similarities between the two statistics are evident, and considering the fact that the well-known BP procedure has been in the literature and widely used since 1970, some may question the motivation of those investigating the max-χ² technique as a test for serial correlation. Obtaining an answer to this query is the goal of this research. As a partial answer, we offer the following: the max-χ² test is presented as an alternative testing procedure to those researchers concerned with studying parsimonious models, i.e., models with few nonzero lag parameters. In most practical cases, if the {Z_t} are in fact correlated, a minimum number of the lag parameters (φ_1, φ_2, ..., φ_m) in model (3.2.4) are nonzero. Consider the researcher who utilizes

n Σ_{j=1}^K r_j²

as a test statistic. We will discover in Section 3.3 that knowledge of the sample autocorrelation vector r is sufficient for calculation of the parameter estimates (φ̂_1, φ̂_2, ..., φ̂_m). Then the researcher is essentially computing estimates of lag parameters which are known to be zero as well as estimates of the nonzero lag parameters. These "zero" estimates, of course, contribute very little to the statistic n Σ_{j=1}^K r_j². Our conjecture is that a predominance of small r_j's will deflate the sum, resulting in a failure to reject the null hypothesis of white noise when in fact several of the true autocorrelations are nonzero. This, in turn, leads to an invalid implementation of the least squares inferential techniques (see Section 1.1). Our goal is to show that, in many cases, the practical research worker can avoid this problem by employing the max-χ² procedure. Consideration is given to both the null and nonnull asymptotic distributions of the test statistics in the following section. These results become the foundation for the power comparisons discussed in Chapter 4.
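The conjecture can be illustrated numerically. Below, one sizable autocorrelation at a single lag is buried among K-1 negligible ones; the Box-Pierce sum then falls short of its χ²_K critical value while the maximum-based statistic exceeds its own. (The particular values n = 100, K = 20, and the r_j are illustrative assumptions; 31.41 is the upper 5% point of the χ²_20 distribution.)

```python
from statistics import NormalDist

n, K, alpha = 100, 20, 0.05
r = [0.02] * K
r[11] = 0.40                        # one genuinely nonzero autocorrelation

Q = n * sum(rj * rj for rj in r)    # Box-Pierce statistic
T_m = n * max(rj * rj for rj in r)  # max-chi-squared statistic

chi2_20_05 = 31.41                  # chi-square(20) upper 5% point
# closed-form max-chi-squared critical value (derived in Section 3.3.2)
c_m = NormalDist().inv_cdf((1 + (1 - alpha) ** (1 / K)) / 2) ** 2

# Q = 16.76 fails to reject white noise; T_m = 16 exceeds c_m (about 9.1)
```

The nineteen "zero" lags contribute only 0.76 to Q, yet their presence inflates the χ²_K critical value enough to mask the real correlation at lag 12.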
3.3 Asymptotic Distribution of the Test Statistics

3.3.1 Preliminaries

Assume that Z_1, Z_2, ..., Z_n are observations generated at n consecutive time points by a stationary AR model of order p with lag coefficients (φ_1, φ_2, ..., φ_p), i.e.,

Z_t = φ_1 Z_{t-1} + φ_2 Z_{t-2} + ... + φ_p Z_{t-p} + ε_t, t = 1, 2, ..., n, (3.3.1)

where {ε_t} is an uncorrelated series with mean 0 and variance σ² (white noise). Let

γ_v = E{Z_t Z_{t+v}}, v = 0, ±1, ±2, ...,

and

C_v = (1/n) Σ_{t=1}^{n-|v|} Z_t Z_{t+|v|}, v = 0, ±1, ±2, ..., ±(n-1),

represent the true and sample autocovariance functions, respectively. Also, let ρ_v = γ_v/γ_0 and r_v = C_v/C_0 be the true and sample autocorrelations of order v, respectively. Then the vector of lag coefficients φ = (φ_1, φ_2, ..., φ_p)' is estimated by φ̂ = (φ̂_1, φ̂_2, ..., φ̂_p)' in the modified Yule-Walker equations

r = R φ̂,

where R is the symmetric p×p matrix

R = [ 1      r_1    r_2   ...  r_{p-1} ;
      r_1    1      r_1   ...  r_{p-2} ;
      r_2    r_1    1     ...  r_{p-3} ;
      ...                              ;
      r_{p-1} r_{p-2} ...       1      ]

and r = (r_1, r_2, ..., r_p)'. Also, the estimate of σ² is given by

σ̂²_p = C_0[1 - r'φ̂].

3.3.2 General Case

Due to the complexities involved with obtaining the exact distribution of the two test statistics of interest, this research will consider the large sample properties of the tests. Anderson [4], Grenander and Rosenblatt [15], Fuller [13] and others showed that the asymptotic joint distribution of the first K sample autocorrelations is multivariate normal, although Bartlett [6] was one of the first to consider the form of the limiting variance of the sample autocorrelation. This asymptotic result, given in its most general form by Anderson, is stated in the lemma that follows.

Lemma 1: Let

Z_t = Σ_{j=-∞}^∞ a_j ε_{t-j},

where {ε_t} is white noise and Σ_{j=-∞}^∞ |a_j| < ∞. Define the quantity

P_v = Σ_{j=-∞}^∞ ρ_j ρ_{j+v}, where ρ_j = γ_j/γ_0 = E{Z_t Z_{t+j}}/E{Z_t²},

and let

r_v = (Σ_{t=v+1}^n Z_t Z_{t-v}) / (Σ_{t=1}^n Z_t²), 1 ≤ v ≤ K.

Then √n(r - ρ) →_D N_K{0, V}, where r = (r_1, r_2, ..., r_K)', ρ = (ρ_1, ρ_2, ..., ρ_K)', and V = (v_ij) is the K×K dispersion matrix of the limiting distribution, whose elements are given by:

v_ii = P_0 + P_{2i} + 2ρ_i² P_0 - 4ρ_i P_i, 1 ≤ i ≤ K, (3.3.2)
v_ij = P_{i-j} + P_{i+j} + 2ρ_i ρ_j P_0 - 2ρ_i P_j - 2ρ_j P_i, 1 ≤ j < i ≤ K. (3.3.3)

Proof: See Anderson [4], pp. 489-495. □

Recall the invertibility property of the ARMA model in Section 3.1. Fuller [13] and others show that {Z_t} generated from the AR process given by (3.3.1) can be expressed as an infinite moving average (MA) of {ε_t}, i.e.,

Z_t = Σ_{j=0}^∞ a_j ε_{t-j},

with {a_j} absolutely summable, if the roots of the equation x^p - φ_1 x^{p-1} - φ_2 x^{p-2} - ... - φ_p = 0 are all in the unit circle. Since model (3.3.1) is stationary, this condition is satisfied, and thus the AR process is just a special case of the time series model defined in Lemma 1. However, before applying Lemma 1 to obtain the asymptotic distributions of the Box-Pierce and max-χ² test statistics, we point out that in practice the {Z_t} in model (3.3.1) are unobservable. Hence the researcher, unable to calculate the sample autocorrelations

r_v = (Σ_{t=v+1}^n Z_t Z_{t-v}) / (Σ_{t=1}^n Z_t²), 1 ≤ v ≤ K,

must instead calculate the estimated sample autocorrelations

r̂_v = (Σ_{t=v+1}^n Ẑ_t Ẑ_{t-v}) / (Σ_{t=1}^n Ẑ_t²), 1 ≤ v ≤ K,

where {Ẑ_t} are the residuals computed from least squares regression. The {r̂_v} are used in place of the {r_v} when calculating the Box-Pierce and max-χ² test statistics. Thus, our interest lies in determining the limiting behavior of the estimated sample autocorrelations (r̂_1, r̂_2, ..., r̂_K). A result due to Anderson [4], formally stated in Lemma 2, shows that under certain regularity conditions on the design matrix X of the regression, the limiting distribution of the sample autocorrelations defined in Lemma 1 also holds for autocorrelations computed using the residuals {Ẑ_t} calculated from least squares regression.

Lemma 2: Let Y = Xβ + Z be the regression model given in equation (1.1.1), where Z_t = Σ_j a_j ε_{t-j} with Σ_j |a_j| < ∞, and where {ε_t} is white noise with bounded moments. Let

a_ij(h) = Σ_{t=1}^{n-h} x_{i,t+h} x_{j,t}, 1 ≤ i ≤ g, 1 ≤ j ≤ g,

where X is the n×g design matrix with (t,i)th element x_{it}. Given that the following four conditions (Grenander's conditions) hold:

(1) a_ii(0) → ∞ as n → ∞;
(2) x²_{i,n+1} / a_ii(0) → 0 as n → ∞;
(3) a_ij(h) / √(a_ii(0) a_jj(0)) → ρ_ij(h) as n → ∞;
(4) R(0) is nonsingular, where R(h) = [ρ_ij(h)];

then √n(r̂ - ρ) →_D N_K{0, V}, where ρ and V are defined in Lemma 1,

r̂_j = (Σ_{t=j+1}^n Ẑ_t Ẑ_{t-j}) / (Σ_{t=1}^n Ẑ_t²), 1 ≤ j ≤ K,

and Ẑ = Y - Xβ̂ = [I - X(X'X)^{-1}X']Y.

Proof: See Anderson [4], p. 593. □

In order to simplify notation, let

T^(BP) = n Σ_{j=1}^K r_j² = n r'r and T^(m) = n max_{1≤j≤K} r_j².

The asymptotic null and nonnull distributions of T^(BP) and T^(m) are given in theorem form. As a consequence of Lemma 2, the results stated throughout the remainder of this chapter may also be applied when T^(BP) and T^(m) are computed from least squares residuals.

Theorem 1: Let r = (r_1, r_2, ..., r_K)' be the vector of sample autocorrelations generated from model (3.3.1), and let T^(BP) and T^(m) represent the Box-Pierce and max-χ² test statistics, respectively. Then

T^(BP) →_D χ²_K(0) under H_0,

i.e., T^(BP) converges in distribution when the null hypothesis of white noise is true to a central χ² random variable with K degrees of freedom, and

T^(m) →_D M under H_0,

i.e., T^(m) converges in distribution when the null hypothesis of white noise is true to the distribution of M, where M represents the maximum order statistic from a sequence of K independent χ²_1 random variables.

Proof: Under the null hypothesis of white noise, {Z_t} is uncorrelated with mean 0 and variance σ². Without loss of generality, take σ² = 1. In the null case we then have

γ_0 = 1; γ_v = 0, v = 1, 2, ..., K;
ρ_v = 0, v = 1, 2, ..., K;
P_0 = 1; P_v = 0, v = 1, 2, ..., K.

Recall that {Z_t} in the stationary AR model of (3.3.1) may be written as an infinite MA process, Z_t = Σ_{j=0}^∞ a_j ε_{t-j}, where the {a_j} are absolutely summable. Having satisfied the conditions of Lemma 1, we apply equations (3.3.2) and (3.3.3) to obtain

v_ij = 1 if i = j, and v_ij = 0 if i ≠ j (i = 1, 2, ..., K; j = 1, 2, ..., K).

Hence, by Lemma 1,

√n r →_D N_K{0, I}.
It follows that T^(BP) = n r'r →_D χ²_K(0), where χ²_K(0) is a central χ² random variable with K degrees of freedom, and T^(m) →_D M, where M is the maximum order statistic from a sequence of K independent χ²_1 random variables. □

We now proceed to find the appropriate critical values, C_α^(BP) and C_α^(m), where

P_{H_0}{T^(BP) > C_α^(BP)} = α and P_{H_0}{T^(m) > C_α^(m)} = α.

It is obvious that C_α^(BP) = χ²_{K,α}, where χ²_{K,α} is given by

P{χ²_K(0) > χ²_{K,α}} = α.

In order to find C_α^(m), let C_α^(m) = c and observe that

P{n max_{1≤j≤K} r_j² < c} = P{n r_1² < c, n r_2² < c, ..., n r_K² < c}
   = P{n r_1² < c} P{n r_2² < c} ... P{n r_K² < c}
   = [P{-√c < Z < √c}]^K,

where Z is a normal random variable with mean 0 and variance 1. Then 1 - α = [P{-√c < Z < √c}]^K implies

(1-α)^{1/K} = P{-√c < Z < √c} = 2Φ(√c) - 1,

where Φ(x) = ∫_{-∞}^x (2π)^{-1/2} e^{-z²/2} dz. Thus, we have that

C_α^(m) = [Φ^{-1}({1 + (1-α)^{1/K}}/2)]².

The two procedures for testing H_0: {Z_t} uncorrelated versus H_1: {Z_t} correlated, at significance level α, can be outlined as follows:

1. Box-Pierce test: reject H_0 in favor of H_1 if T^(BP) > χ²_{K,α}.
2. Max-χ² test: reject H_0 in favor of H_1 if T^(m) > [Φ^{-1}({1 + (1-α)^{1/K}}/2)]².

In order to compare the two procedures in terms of power (Chapter 4), consideration must also be given to the nonnull distributions of the test statistics. However, the asymptotic distributions of T^(BP) and T^(m) are not so easily derived when the {Z_t} are dependent. Recall that √n(r - ρ) →_D N_K{0, V} (Lemma 1). When the alternative of serial correlation is true, the asymptotic dispersion matrix V has a nondiagonal form. For our purposes, we write that, under H_a,

√n r ~ N_K{√n ρ, V} for large n,

where "~" means "has an approximate distribution," and ρ and V are determined from the expressions given in Lemma 1. A result due to H. Scheffé [28], given here as a lemma, enables us to express the Box-Pierce statistic T^(BP) as a linear combination of independent noncentral χ² random variables.
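Before turning to the nonnull case, note that the null critical values just outlined are simple to evaluate; a sketch using the standard normal quantile from the Python standard library (α = 0.05 and the K values are illustrative; for K = 1 the max-χ² value reduces to the usual χ²_1 point [Φ^{-1}(1-α/2)]² ≈ 3.84):

```python
from statistics import NormalDist

def max_chisq_critical(alpha, K):
    # C_alpha^(m) = [Phi^{-1}({1 + (1 - alpha)^{1/K}} / 2)]^2
    p = (1.0 + (1.0 - alpha) ** (1.0 / K)) / 2.0
    return NormalDist().inv_cdf(p) ** 2

c1 = max_chisq_critical(0.05, 1)   # about 3.84, the chi-square(1) point
c5 = max_chisq_critical(0.05, 5)   # larger, to control the maximum of 5
```

The critical value grows with K, reflecting the multiplicity adjustment implicit in taking the maximum of K squared autocorrelations.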
Following this lemma, the asymptotic nonnull distribution of T^(BP) is given in theorem form.

Lemma 3: Let x = (x_1, x_2, ..., x_K)' be a column random vector which follows a multivariate normal law with mean vector μ and covariance matrix Σ. Consider the quadratic form Q = x'Ax, where A is a K×K symmetric matrix. Then if Σ is nonsingular, Q may be expressed in the form

Q = Σ_{i=1}^k λ_i χ²_{r_i}(δ_i²),

where (λ_1, λ_2, ..., λ_k) are the distinct nonzero eigenvalues of AΣ, (r_1, r_2, ..., r_k) their respective orders of multiplicity, (δ_1², δ_2², ..., δ_k²) are certain linear combinations of (μ_1, μ_2, ..., μ_K), and the {χ²_{r_i}(δ_i²)} are a sequence of independent noncentral χ² random variables with r_i degrees of freedom and noncentrality parameter δ_i².

Proof: See Scheffé [28], p. 418. □

Theorem 2: Let r = (r_1, r_2, ..., r_K)' be the vector of sample autocorrelations generated from model (3.3.1), and V = (v_ij) the symmetric positive definite K×K dispersion matrix of the asymptotic distribution of √n r. Also, let V = TT', T a lower triangular matrix, and consider the statistic

T^(BP) = n Σ_{j=1}^K r_j² = n(r'r).

Then we can write, for large n:

T^(BP) ≈ Σ_{i=1}^ℓ λ_i χ²_{m_i}(δ_i²),

where

(i) (λ_1, λ_2, ..., λ_ℓ) are the distinct nonzero eigenvalues of V,
(ii) (m_1, m_2, ..., m_ℓ) are the respective orders of multiplicity for the eigenvalues,
(iii) {χ²_{m_i}(δ_i²)}_{i=1}^ℓ is a sequence of independent noncentral χ² random variables with m_i degrees of freedom and noncentrality parameter δ_i², i = 1, 2, ..., ℓ,
(iv) δ_i² = n ρ'V^{-1}T {Π_{s=1, s≠i}^ℓ [1/(λ_i - λ_s)](T'T - λ_s I)} T^{-1}ρ, i = 1, 2, ..., ℓ.

Proof: From Lemma 1, we have √n r ~ N_K{√n ρ, V} for large n. Let x = √n r, μ = √n ρ, Σ = V, and A = I in Lemma 3. Then

T^(BP) = n(r'r) = (√n r)'I(√n r) = x'Ax,

and the conditions of Lemma 3 are satisfied. Thus, for large n, T^(BP) ≈ Σ_{i=1}^ℓ λ_i χ²_{m_i}(δ_i²), where λ_i, m_i, and χ²_{m_i}(δ_i²) are defined by (i), (ii), and (iii). If V is positive definite, W.Y. Tan [31] obtained a formula useful for computing the noncentrality parameters {δ_i²}:

δ_i² = n ρ'V^{-1}T E_i* T^{-1}ρ, i = 1, 2, ..., ℓ,

where V = TT' and E_i* = Π_{s=1, s≠i}^ℓ [1/(λ_i - λ_s)](T'T - λ_s I). This is condition (iv) of the theorem, and thus completes the proof. □

By Theorem 2, the asymptotic distribution of T^(BP) is found by determining the distribution of a weighted sum of independent noncentral χ² random variables. The problem is not an easy one, but J.P. Imhof [23] and others have succeeded in deriving an expression for the distribution of the quadratic form Q (given in Lemma 3) by means of an inversion of the characteristic function of Q. These results are given in Section 3.3.3.2.1 and later applied in Chapter 4, where we seek powers of the tests. Similarly, the asymptotic nonnull distribution of the max-χ² statistic T^(m) is a complex one, as described in the following theorem.

Theorem 3: Let r = (r_1, r_2, ..., r_K)' be the vector of sample autocorrelations generated from model (3.3.1), and let T^(m) = n max_{1≤j≤K} r_j². Then, for large n, T^(m) is approximately distributed as the maximum order statistic from a sequence of K dependent noncentral χ² random variables, {v_ii χ²_1(δ_i²)}_{i=1}^K, with

(i) noncentrality parameters given by δ_i² = n ρ_i² / (2 v_ii), 1 ≤ i ≤ K, and
(ii) v_ii = P_0 + P_{2i} + 2ρ_i² P_0 - 4ρ_i P_i.

Proof: Again, by Lemma 1, √n r ~ N_K{√n ρ, V} for large n, where the elements of V = (v_ij) are found from equations (3.3.2) and (3.3.3). Hence, for large n, the marginal distributions of {√n r_i}_{i=1}^K can be approximated by a sequence of K dependent normal random variables with means {√n ρ_i}_{i=1}^K and variances {v_ii}_{i=1}^K, respectively. Now, for 1 ≤ i ≤ K, √n r_i ~ N{√n ρ_i, v_ii} implies that

(√n r_i - √n ρ_i)/√v_ii ~ N{0, 1}.

Thus, we have that n r_i²/v_ii ~ χ²_1(δ_i²), where δ_i² = n ρ_i²/(2 v_ii); or, we write, n r_i² ~ v_ii χ²_1(δ_i²), 1 ≤ i ≤ K. It follows that T^(m) = max(n r_i²) is approximately distributed as the maximum order statistic of the dependent (since v_ij ≠ 0 in general) sequence {v_ii χ²_1(δ_i²)}_{i=1}^K. □

The complexity of the distribution derived in Theorem 3 makes power computations very difficult.
In Section 3.3.3.2.2, this problem is made somewhat more workable by considering a Taylor-series expansion of the density function of T^(m). In the sections that follow, a few simple alternative error models are presented, and the nonnull distributions of the max-χ² and Box-Pierce statistics are discussed. The severity of the distributional problem, even in these simple cases, will be illustrated.

3.3.3 Special Cases

Recall the first order AR alternative error model considered by Durbin-Watson (Section 1.2.1). Let us consider a more general form of this model,

Z_t = φ Z_{t-J} + ε_t, (3.3.4)

where t = 1, 2, ..., n and {ε_t} is a white noise process. In the terminology of the previous sections, we call model (3.3.4) a first order AR model with maximum (and singleton) lag J, where J is unknown.

Lemma 4: The vector of true autocorrelations, ρ = (ρ_1, ρ_2, ...)', for model (3.3.4) is calculated as follows:

ρ_v = φ^m if v = mJ, m = 1, 2, 3, ...;
ρ_v = 0 otherwise. (3.3.5)

Proof: Recall that ρ_v = γ_v/γ_0, v = 1, 2, 3, .... Using model (3.3.4), we write for v = 1, 2, ...

γ_v = E{Z_t Z_{t-v}} = φ E{Z_{t-J} Z_{t-v}} + E{ε_t Z_{t-v}}.

Now E{ε_t Z_{t-v}} = 0, since Z_{t-v} is uncorrelated with ε_{t-v+1}, ε_{t-v+2}, ..., ε_t, ε_{t+1}, .... Hence

γ_v = φ E{Z_{t-J} Z_{t-v}} = φ γ_{J-v}, v = 1, 2, 3, .... (3.3.6)

Immediately we see that, for m = 1, 2, 3, ..., γ_mJ = φ γ_{(m-1)J}, and thus ρ_mJ = φ ρ_{(m-1)J}. Since ρ_0 = 1, it follows that ρ_mJ = φ^m, m = 1, 2, 3, .... Consider now those autocorrelations of order not a multiple of J. Take v = J - v in equation (3.3.6). Then γ_v = φ γ_{J-v}; but also γ_{J-v} = φ γ_v. This implies that γ_v = φ² γ_v, for v = 1, 2, ..., J-1, J+1, ..., 2J-1, 2J+1, .... And since φ² ≠ 1, it must be that γ_v = 0 for v ≠ mJ, m = 1, 2, 3, .... This completes the proof. □

The vector ρ, given in equation (3.3.5), will be used to compute the asymptotic mean vector of the sample autocorrelations for this special case.

Lemma 5: For the model (3.3.4), the quantities {P_v}, v = 0, 1, 2, ..., defined in Lemma 1, are given by:

P_0 = (1 + φ²)/(1 - φ²);
P_v = φ^m[m + 1 - (m-1)φ²]/(1 - φ²) if v = mJ, m = 1, 2, 3, ...;
P_v = 0 otherwise. (3.3.7)

Proof: From Lemma 1, we have P_v = Σ_{j=-∞}^∞ ρ_j ρ_{j+v}, v = 0, 1, 2, .... First let v = 0. Then we have

P_0 = Σ_{j=-∞}^∞ ρ_j² = ρ_0² + 2 Σ_{j=1}^∞ ρ_j².

Applying the results from Lemma 4, we obtain

P_0 = 1 + 2 Σ_{m=1}^∞ ρ_mJ² = 1 + 2 Σ_{m=1}^∞ φ^{2m}.

Since Σ_{i=1}^∞ a^i = a/(1-a) if |a| < 1, we have

P_0 = 1 + 2φ²/(1 - φ²) = (1 + φ²)/(1 - φ²).

For v a nonmultiple of J and j ∈ (-∞, ∞), the integers j and j+v cannot both be multiples of J simultaneously. It follows that P_v = 0 for v ≠ mJ, m = 1, 2, .... To find the quantities P_mJ, m = 1, 2, ..., note that only those indices which are multiples of J contribute to the sum, so that, by Lemma 4,

P_mJ = Σ_{j=-∞}^∞ ρ_jJ ρ_{(j+m)J} = Σ_{j=-∞}^∞ φ^{|j|} φ^{|j+m|}. (3.3.8)

Splitting the sum in (3.3.8) according to the signs of j and j+m, the terms with j ≥ 0 and the terms with j ≤ -m each contribute a geometric series summing to φ^m/(1 - φ²), while the (m-1) terms with -m < j < 0 each equal φ^m. Hence

P_mJ = 2φ^m/(1 - φ²) + (m-1)φ^m = φ^m[m + 1 - (m-1)φ²]/(1 - φ²).

This completes the proof. □

We now proceed to find the structure of the asymptotic distribution of the sample autocorrelations for this special case, using the results derived in Lemmas 4 and 5.

Theorem 4: The covariance structure for the asymptotic distribution of the first K sample autocorrelations, (r_1, r_2, ..., r_K), from model (3.3.4) is given by:

v_vv = [1 + φ² - (2m+1)φ^{2m} + (2m-1)φ^{2(m+1)}]/(1 - φ²) if v = mJ, m = 1, 2, 3, ...;
v_vv = [1 + φ² + (m+1)φ^m - (m-1)φ^{m+2}]/(1 - φ²) if v = mJ/2, J even, m = 1, 3, 5, ...;
v_vv = (1 + φ²)/(1 - φ²) otherwise; (3.3.9)

and, for v_1 > v_2,

v_{v_1,v_2} = φ^m[m + 1 - (m-1)φ²]/(1 - φ²) if v_1 + v_2 = mJ or v_1 - v_2 = mJ, but not both (m = 1, 2, 3, ...);
v_{v_1,v_2} = {φ^m[m + 1 - (m-1)φ²] + φ^ℓ[ℓ + 1 - (ℓ-1)φ²]}/(1 - φ²) if v_1 + v_2 = mJ (m = 1, 2, 3, ...) and v_1 - v_2 = ℓJ (ℓ = 1, 2, 3, ...), with neither v_1 nor v_2 a multiple of J;
v_{v_1,v_2} = {φ^{m-ℓ}[m - ℓ + 1 - (m-ℓ-1)φ²] - φ^{m+ℓ}[m + ℓ + 1 - (m+ℓ-1)φ²]}/(1 - φ²) if v_1 = mJ and v_2 = ℓJ (m > ℓ ≥ 1);
v_{v_1,v_2} = 0 otherwise. (3.3.10)

Proof: Recall from Lemma 1 the expression for the asymptotic variance of the vth order sample autocorrelation √n r_v:

v_vv = P_0 + P_{2v} + 2ρ_v² P_0 - 4ρ_v P_v. (3.3.11)

For v ≠ mJ and 2v ≠ mJ, we have ρ_v = 0 (Lemma 4) and P_v = P_{2v} = 0 (Lemma 5); thus

v_vv = P_0 = (1 + φ²)/(1 - φ²).

Now consider v = mJ.
Here we have, from Lemmas 4 and 5, ρ_v = φ^m,

P_v = φ^m[m + 1 - (m-1)φ²]/(1 - φ²), and P_{2v} = φ^{2m}[2m + 1 - (2m-1)φ²]/(1 - φ²).

We then can write

v_vv = (1 + φ²)/(1 - φ²) + φ^{2m}[2m + 1 - (2m-1)φ²]/(1 - φ²) + 2φ^{2m}(1 + φ²)/(1 - φ²) - 4φ^{2m}[m + 1 - (m-1)φ²]/(1 - φ²)
    = {1 + φ² + φ^{2m}[-(2m+1) + (2m-1)φ²]}/(1 - φ²)
    = {1 + φ² - (2m+1)φ^{2m} + (2m-1)φ^{2(m+1)}}/(1 - φ²), v = mJ, m = 1, 2, ....

Because of the term P_{2v} in equation (3.3.11), we must also consider those orders which are odd multiples of J/2 when J is an even integer, i.e., v = J/2, 3J/2, 5J/2, .... Here 2v is an odd multiple of J; thus (from Lemmas 4 and 5)

ρ_v = 0, P_v = 0, and P_{2v} = P_mJ = φ^m[m + 1 - (m-1)φ²]/(1 - φ²), m = 1, 3, 5, ....

It follows that

v_vv = (1 + φ²)/(1 - φ²) + φ^m[m + 1 - (m-1)φ²]/(1 - φ²)
    = [1 + φ² + (m+1)φ^m - (m-1)φ^{m+2}]/(1 - φ²), v = mJ/2, J even, m = 1, 3, 5, ....

This completes the proof of equation (3.3.9). The asymptotic covariances of the sample autocorrelations are found in a similar manner. Rewriting equation (3.3.3) from Lemma 1, we have, for large n,

v_{v_1,v_2} = P_{v_1+v_2} + P_{v_1-v_2} + 2ρ_{v_1}ρ_{v_2}P_0 - 2ρ_{v_1}P_{v_2} - 2ρ_{v_2}P_{v_1}, v_1 ≠ v_2. (3.3.12)

These computations become quite involved, for now we must consider those values of v_1, v_2, v_1 + v_2 or v_1 - v_2 which are multiples of J. The result is obtained in a straightforward but nontrivial fashion if we group the possibilities as follows:

a) v_1 a multiple of J, but v_2 a nonmultiple of J.
b) v_2 a multiple of J, but v_1 a nonmultiple of J.
c) v_1 + v_2 a multiple of J, but v_1 - v_2 a nonmultiple of J.
d) v_1 - v_2 a multiple of J, but v_1 + v_2 a nonmultiple of J.
e) Both v_1 + v_2 and v_1 - v_2 multiples of J, but neither v_1 nor v_2 a multiple of J.
f) Both v_1 and v_2 multiples of J.

Notice that in case (a) and case (b), neither v_1 + v_2 nor v_1 - v_2 can be a multiple of J. Hence each of the quantities summed in equation (3.3.12) is zero. Thus, v_{v_1,v_2} = 0 when exactly one of v_1, v_2 is a multiple of J. For case (c), the only nonzero term in (3.3.12) is P_{v_1+v_2}. Similarly, for case (d), the only nonzero term in (3.3.12) is P_{v_1-v_2}. This gives

v_{v_1,v_2} = φ^m[m + 1 - (m-1)φ²]/(1 - φ²) for either v_1 + v_2 = mJ or v_1 - v_2 = mJ, but not both (m = 1, 2, ...).
When the orders of the sample autocorrelations follow the pattern in case (e), only the terms P_{v1+v2} and P_{v1−v2} are nonzero. Hence,

v_{v1,v2} = P_{mJ} + P_{ℓJ} = {φ^m[m+1−(m−1)φ²] + φ^ℓ[ℓ+1−(ℓ−1)φ²]}/(1−φ²)

for v_1+v_2 = mJ (m = 1, 2, 3, ...) and |v_1−v_2| = ℓJ (ℓ = 1, 2, 3, ...).

Finally, we consider the order pattern in (f). Both v_1 and v_2 multiples of J imply that both v_1+v_2 and v_1−v_2 are multiples of J, so that every term summed in (3.3.12) can be nonzero. Utilizing Lemmas 4 and 5, and letting v_1 = mJ and v_2 = ℓJ (m > ℓ), we obtain:

v_{v1,v2} = P_{(m+ℓ)J} + P_{(m−ℓ)J} + 2ρ_{mJ}ρ_{ℓJ}P_0 − 2ρ_{mJ}P_{ℓJ} − 2ρ_{ℓJ}P_{mJ}
= {φ^{m+ℓ}[m+ℓ+1−(m+ℓ−1)φ²] + φ^{m−ℓ}[m−ℓ+1−(m−ℓ−1)φ²] + 2φ^{m+ℓ}(1+φ²) − 2φ^{m+ℓ}[ℓ+1−(ℓ−1)φ²] − 2φ^{m+ℓ}[m+1−(m−1)φ²]}/(1−φ²)
= {φ^{m−ℓ}[m−ℓ+1−(m−ℓ−1)φ²] − φ^{m+ℓ}[m+ℓ+1−(m+ℓ−1)φ²]}/(1−φ²).

This completes the proof. □

3.3.3.1 Model 1: The Case J=2, K=2

Let us now apply these results to model (3.3.4) when J=2, i.e.,

Z_t = φZ_{t−2} + ε_t.  (3.3.13)

Suppose also that the researcher chooses to base the test of H_0: φ = 0 versus H_a: φ ≠ 0 on the first two sample autocorrelations, r_1 and r_2 (i.e., K=2). Call this the case (J=2, K=2). Applying formulas (3.3.9) and (3.3.10) from Theorem 5, we obtain

v_11 = (1+2φ+φ²)/(1−φ²) = (1+φ)²/(1−φ²),
v_22 = (1−2φ²+φ⁴)/(1−φ²) = 1−φ²,
v_12 = 0.

From equation (3.3.5), we obtain ρ_1 = 0 and ρ_2 = φ. Thus we have that, as n tends to infinity, the vector (√n r_1, √n r_2 − √n φ)' is distributed asymptotically normal with mean vector (0, 0)' and dispersion matrix

V = [ (1+φ)²/(1−φ²)   0
      0               1−φ² ].

We write

(√n r_1, √n r_2)' ~ N{ (0, √n φ)', V }  (3.3.14)

for large n.

Corollary 1: Under the alternative error model given in equation (3.3.13), the Box-Pierce statistic T(BP) = nr_1² + nr_2² is asymptotically distributed as a weighted sum of independent single-degree-of-freedom chi-square random variables:

T(BP) ~ [(1+φ)²/(1−φ²)]·χ²_1(0) + (1−φ²)·χ²_1(δ²),

where δ² = nφ²/(1−φ²).

Proof: From Theorem 2, we write T(BP) = Σ_i λ_i χ²_{m_i}(δ_i²), where the {λ_i}, {m_i}, and {δ_i²} are defined previously.
Since V is a diagonal matrix, it is obvious that the eigenvalues of V are

λ_1 = (1+φ)²/(1−φ²)  and  λ_2 = 1−φ²,

with corresponding multiplicities m_1 = 1 and m_2 = 1. In order to determine δ_1² and δ_2², we utilize (iv) of Theorem 2. Now

δ_1² = n·ρ'(T^{-1})'E_1T^{-1}ρ  and  δ_2² = n·ρ'(T^{-1})'E_2T^{-1}ρ,

where

T = [ (1+φ)/(1−φ²)^{1/2}   0
      0                    (1−φ²)^{1/2} ]

(recall that V = TT'). Hence, with ρ = (0, φ)', we have T^{-1}ρ = (0, φ(1−φ²)^{-1/2})', so that

δ_1² = n·(0, φ(1−φ²)^{-1/2}) E_1 (0, φ(1−φ²)^{-1/2})' = 0,

and, similarly,

δ_2² = n·(0, φ(1−φ²)^{-1/2}) E_2 (0, φ(1−φ²)^{-1/2})' = nφ²/(1−φ²).

From Corollary 1, then, we write the exact asymptotic power of the Box-Pierce test as

lim_{n→∞} P{T(BP) > C(BP)} = P{ [(1+φ)²/(1−φ²)]·χ²_1(0) + (1−φ²)·χ²_1(δ²) > C(BP) },  (3.3.15)

where δ² = nφ²/(1−φ²). Now the probability on the right-hand side of equation (3.3.15) is not an easy one to compute, since the constants multiplying the single-degree-of-freedom χ² random variables are unequal. In fact, the problem of obtaining the exact probability remains unsolved. In Section 3.3.3.2, we present three methods of approximating the right-hand side of (3.3.15).

Corollary 2: Under the alternative error model given in equation (3.3.13), the exact asymptotic power of the max-χ² test, T(m) = max(nr_1², nr_2²), is given by

lim_{n→∞} P{T(m) > C(m)} = 1 − [2Φ(√(C(1−φ)/(1+φ))) − 1]·[Φ((√C − √n φ)/(1−φ²)^{1/2}) − Φ((−√C − √n φ)/(1−φ²)^{1/2})],  (3.3.16)

where C = C(m) and Φ(x) = ∫_{−∞}^{x} (2π)^{-1/2} e^{−z²/2} dz.

Proof: Let C(m) = C and start by writing the power of the max-χ² test as follows:

P{T(m) > C} = P{max(nr_1², nr_2²) > C} = 1 − P{max(nr_1², nr_2²) ≤ C} = 1 − P{nr_1² ≤ C, nr_2² ≤ C}.

From the asymptotic distribution given in (3.3.14), we have that for large n, √n r_1 and √n r_2 are distributed as independent normal random variates; hence,

lim_{n→∞} P{T(m) > C} = 1 − P{nr_1² ≤ C}·P{nr_2² ≤ C}
= 1 − P{|√n r_1| ≤ √C}·P{|√n r_2| ≤ √C}
= 1 − [2Φ(√C·(1−φ²)^{1/2}/(1+φ)) − 1]·[Φ((√C − √n φ)/(1−φ²)^{1/2}) − Φ((−√C − √n φ)/(1−φ²)^{1/2})]
= 1 − [2Φ(√(C(1−φ)/(1+φ))) − 1]·[Φ((√C − √n φ)/(1−φ²)^{1/2}) − Φ((−√C − √n φ)/(1−φ²)^{1/2})]. □

Of course, tables are available for computing tail areas of the standard normal distribution. Thus, for the special case (J=2, K=2), the exact asymptotic power can be calculated for the max-χ² test.
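The two normal probabilities in Corollary 2 are simple to evaluate numerically. A minimal sketch, assuming the asymptotic distribution (3.3.14) (the function name and the use of Python's NormalDist are mine, not the text's):

```python
from statistics import NormalDist

def maxchi2_power(phi, n, C):
    """Asymptotic power of the max-chi^2 test for the case (J=2, K=2)
    under Z_t = phi*Z_{t-2} + e_t, following Corollary 2 / eq. (3.3.16)."""
    Phi = NormalDist().cdf
    s1 = ((1 + phi) ** 2 / (1 - phi ** 2)) ** 0.5   # sd of sqrt(n)*r1
    s2 = (1 - phi ** 2) ** 0.5                      # sd of sqrt(n)*(r2 - phi)
    c = C ** 0.5
    p1 = 2 * Phi(c / s1) - 1                        # P{n*r1^2 <= C}
    mu = n ** 0.5 * phi
    p2 = Phi((c - mu) / s2) - Phi((-c - mu) / s2)   # P{n*r2^2 <= C}
    return 1 - p1 * p2
```

At φ = 0 the expression collapses to the size of the test, and the value rises quickly with |φ| and n, as the power curves of Chapter 4 suggest it should.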
However, the notion of exact asymptotic power is the exception rather than the rule. The researcher who attempts to calculate powers utilizing an analytic approach similar to the work above will more often than not be forced into approximating the true asymptotic power, as with the Box-Pierce statistic. This point is further illustrated in the following section.

3.3.3.2 Model 2: The Case J=2, K=4

Again, let us consider model (3.3.13) as an appropriate representation of the residuals when serial correlation is present. Now we base the test on the first four sample autocorrelations (r_1, r_2, r_3, r_4). Call this the case (J=2, K=4). From equation (3.3.5), we observe that ρ_1 = 0, ρ_2 = φ, ρ_3 = 0, and ρ_4 = φ². Applying (3.3.9) gives

v_11 = (1+φ)²/(1−φ²),
v_22 = 1−φ²,
v_33 = (1+φ²+4φ³−2φ⁵)/(1−φ²),
v_44 = (1+φ²−5φ⁴+3φ⁶)/(1−φ²) = 1+2φ²−3φ⁴.

The covariances, obtained from (3.3.10), are computed as follows:

v_12 = 0,
v_13 = [φ²(3−φ²)+2φ]/(1−φ²) = (2φ+3φ²−φ⁴)/(1−φ²),
v_14 = 0,
v_23 = 0,
v_24 = [2φ − φ³(4−2φ²)]/(1−φ²) = 2φ(1−2φ²+φ⁴)/(1−φ²) = 2φ(1−φ²),
v_34 = 0.

Then, as n tends to infinity, the vector √n(r−ρ) is distributed asymptotically normal, with mean vector 0 and dispersion matrix V, where

V = [ (1+φ)²/(1−φ²)        0           (2φ+3φ²−φ⁴)/(1−φ²)      0
      0                    1−φ²        0                       2φ(1−φ²)
      (2φ+3φ²−φ⁴)/(1−φ²)   0           (1+φ²+4φ³−2φ⁵)/(1−φ²)   0
      0                    2φ(1−φ²)    0                       1+2φ²−3φ⁴ ].  (3.3.17)

The method of obtaining exact asymptotic powers in the case (J=2, K=2) depended heavily on the independence of the first two sample autocorrelations. The addition of the third- and fourth-order sample autocorrelations into the testing procedure contributes nonzero off-diagonal elements to the covariance structure V. These covariances make the distributional problem so complex that the researcher is forced to abandon the search for exact asymptotic powers and seek, instead, approximate asymptotic powers. In the subsections that follow, we examine several approximate procedures for the Box-Pierce test, and one approximation for the max-χ² test.
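The entries of (3.3.17) can be checked mechanically: truncating the defining series for the quantities P_v and substituting into (3.3.12) reproduces each closed-form entry. A small sketch (function names and the truncation length are mine):

```python
def rho(v, phi):
    """True autocorrelations of Z_t = phi*Z_{t-2} + e_t (J=2):
    phi^m at lag 2m, zero at odd lags."""
    v = abs(v)
    return phi ** (v // 2) if v % 2 == 0 else 0.0

def bartlett_cov(i, j, phi, terms=400):
    """lim n*Cov(r_i, r_j) from eq. (3.3.12), with each P_v summed directly."""
    P = lambda v: sum(rho(k, phi) * rho(k + v, phi)
                      for k in range(-terms, terms + 1))
    return (P(i + j) + P(i - j) + 2 * rho(i, phi) * rho(j, phi) * P(0)
            - 2 * rho(i, phi) * P(j) - 2 * rho(j, phi) * P(i))

phi = 0.5
V = [[bartlett_cov(i, j, phi) for j in range(1, 5)] for i in range(1, 5)]
```

For φ = .5 this yields v_11 = 3, v_22 = .75, v_33 = 2.25, v_44 = 1.3125, v_13 = 2.25, and v_24 = .75, with the remaining covariances zero, in agreement with (3.3.17).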
3.3.3.2.1 Approximate asymptotic power of the Box-Pierce test

Three approximate methods for computing the asymptotic power of the Box-Pierce test are given in theorem form, and are labeled as follows:

(i) Imhoff numerical integration
(ii) Extension to Pearson's three-moment approximation
(iii) Rao asymptotic normality.

Lemma 6 (Imhoff): Let x have an n-variate normal distribution with mean μ and covariance V, where V is an n×n symmetric positive definite matrix. Consider the quadratic form

Q = x'Ax = Σ_i λ_i χ²_{m_i}(δ_i²),

where {λ_i}, {m_i}, and {δ_i²} are defined in Lemma 2. Then

P{Q > C} = 1/2 + (1/π) ∫_0^∞ [sin θ(u) / (u ρ(u))] du,

where

θ(u) = (1/2) Σ_i [m_i tan^{-1}(λ_i u) + δ_i² λ_i u (1+λ_i²u²)^{-1}] − (1/2)Cu

and

ρ(u) = Π_i (1+λ_i²u²)^{m_i/4} · exp{ (1/2) Σ_i δ_i² λ_i²u² / (1+λ_i²u²) }.

Proof: Imhoff's result is obtained by inversion of the characteristic function of the variable Q. For details, see Imhoff [23], p. 421. □

Theorem 6 (Imhoff numerical integration): Under the alternative model given by (3.3.13), the asymptotic power of the Box-Pierce test, based on K=4 sample autocorrelations, is given by

lim_{n→∞} P{T(BP) > C(BP)} = 1/2 + (1/π) ∫_0^∞ [sin θ(u)/(u ρ(u))] du,  (3.3.18)

where

θ(u) = (1/2) Σ_i [m_i tan^{-1}(λ_i u) + δ_i² λ_i u(1+λ_i²u²)^{-1}] − (1/2) C(BP) u,
ρ(u) = Π_i (1+λ_i²u²)^{m_i/4} · exp{ (1/2) Σ_i δ_i² λ_i²u²/(1+λ_i²u²) },

and where {λ_i} are the eigenvalues of the matrix V given in (3.3.17), with respective multiplicities {m_i}, and {δ_i²} are obtained from (iv) of Theorem 2.

Proof: The result follows directly from Theorem 2 and Lemma 6. □

The power expression in (3.3.18) results from Imhoff's ability to find the distribution function of a linear combination of noncentral χ² random variables explicitly. However, the integral in (3.3.18) cannot be computed analytically. Imhoff outlines a procedure in which the researcher can approximate the power by a numerical integration method.
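The Imhoff integral is straightforward to evaluate by elementary quadrature. A rough sketch (the truncation point and step count are ad hoc choices of mine, not tuned values from the text):

```python
import math

def imhof_tail(C, lams, mults, deltas2, upper=150.0, steps=150000):
    """P{Q > C} for Q = sum_i lam_i * chi2_{m_i}(delta_i^2), computed from
    Imhoff's inversion integral (Lemma 6) with a plain trapezoidal rule."""
    def theta(u):
        return 0.5 * sum(m * math.atan(l * u) + d2 * l * u / (1.0 + (l * u) ** 2)
                         for l, m, d2 in zip(lams, mults, deltas2)) - 0.5 * C * u
    def rho(u):
        r = 1.0
        for l, m, d2 in zip(lams, mults, deltas2):
            lu2 = (l * u) ** 2
            r *= (1.0 + lu2) ** (m / 4.0) * math.exp(0.5 * d2 * lu2 / (1.0 + lu2))
        return r
    h = upper / steps
    total = 0.0
    for k in range(1, steps + 1):       # the integrand is finite at u = 0
        u = k * h
        w = 0.5 if k == steps else 1.0
        total += w * math.sin(theta(u)) / (u * rho(u))
    return 0.5 + total * h / math.pi
```

As a check, taking λ = (1), m = (2), δ² = (0) recovers the central χ²_2 tail area exp(−C/2).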
In Chapter 4, approximate asymptotic powers of the Box-Pierce test for this case (J=2, K=4) and others are calculated by means of Imhoff's procedure using numerical integration.

Lemma 7 (Pearson): Given the conditions of Lemma 3, tail areas of the distribution of the quadratic form Q may be approximated as follows:

P{Q > C} ≈ P{χ²_h(0) > C*},

where

C* = (C − a_1)(h/a_2)^{1/2} + h,  a_j = Σ_i λ_i^j(m_i + jδ_i²)  (j = 1, 2, 3),  h = a_2³/a_3²,

and χ²_h(0) is a central χ² random variable with h degrees of freedom.

Proof: The result is derived by extending Pearson's [25] three-moment central χ² approximation to the distribution of a noncentral χ² to the general case of a quadratic form in normal random variables. Details are given in Imhoff [23], p. 425. □

Theorem 7 (Imhoff's extension to the Pearson three-moment approximation): An approximation to the asymptotic power of the Box-Pierce test is given by

lim_{n→∞} P{T(BP) > C(BP)} ≈ P{χ²_h(0) > C*},  (3.3.19)

where h = a_2³/a_3², C* = (C(BP) − a_1)(h/a_2)^{1/2} + h, a_j = Σ_i λ_i^j(m_i + jδ_i²) (j = 1, 2, 3), with {λ_i}, {m_i}, and {δ_i²} defined as in (iv) of Theorem 2, and where "≈" denotes Pearson's three-moment central χ² approximation to the distribution of a noncentral χ².

Proof: The results of Theorem 2 enable us to write T(BP) as a quadratic form, Q. Apply Imhoff's extension to Pearson's three-moment approximation (Lemma 7) and the result follows. □

Since h in Theorem 7 need not necessarily be an integer, the user of this extended three-moment approximate method must have access to a computer routine which computes tail areas of central χ² random variables with noninteger degrees of freedom. For a numerical application of this result, see Section 4.1.

Lemma 8 (Rao): Let √n(x−μ) →_D N{0, Σ} and let g: R^K → R be a continuous function. Then, under certain regularity conditions,

√n[g(x) − g(μ)] →_D N{0, λ'Σλ},

where

λ = ( ∂g(t)/∂t_1, ∂g(t)/∂t_2, ..., ∂g(t)/∂t_K )' evaluated at t = μ.

Proof: See C.R. Rao [27], p. 387. □
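Lemma 8, applied to g(t) = (t't)^{1/2}, reduces tail probabilities of nr'r to normal tail areas; this is the device used in the theorem that follows. A sketch of the resulting computation, with the asymptotic mean vector and dispersion matrix supplied by the caller (names are mine, not the text's):

```python
from statistics import NormalDist

def rao_power(C, n, p, V):
    """Normal approximation to P{n r'r > C} when sqrt(n)(r - p) -> N(0, V),
    via the delta method of Lemma 8 with g(t) = (t't)^(1/2)."""
    Phi = NormalDist().cdf
    K = len(p)
    pp = sum(x * x for x in p)                                   # p'p
    pVp = sum(p[i] * V[i][j] * p[j] for i in range(K) for j in range(K))
    mu = (n * pp) ** 0.5               # asymptotic mean of sqrt(n)*(r'r)^(1/2)
    sd = (pVp / pp) ** 0.5             # asymptotic std. dev. of the same
    c = C ** 0.5
    return 1.0 - Phi((c - mu) / sd) + Phi((-c - mu) / sd)
```

For the (J=2, K=4) model at φ = .5, feeding in the mean vector (0, .5, 0, .25)' and the matrix (3.3.17) reproduces ρ'Vρ = φ²(1+4φ²−2φ⁴−3φ⁶) and a power near one at n = 200.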
Theorem 8 (Rao's asymptotic normal transformation): An alternate form for the asymptotic power of the Box-Pierce test is given by

lim_{n→∞} P{T(BP) > C(BP)} = 1 − Φ( [√C(BP) − √n(ρ'ρ)^{1/2}] / [(ρ'ρ)^{-1/2}(ρ'Vρ)^{1/2}] ) + Φ( [−√C(BP) − √n(ρ'ρ)^{1/2}] / [(ρ'ρ)^{-1/2}(ρ'Vρ)^{1/2}] ),  (3.3.20)

where Φ(x) = ∫_{−∞}^x (2π)^{-1/2}e^{−z²/2}dz, and where ρ and V are the asymptotic mean vector and asymptotic dispersion matrix, respectively, of the vector of sample autocorrelations, r.

Proof: From Lemma 1, we have √n(r−ρ) →_D N{0, V}; hence, we may apply the result of Lemma 8. Consider the function

g(t) = (t't)^{1/2} = (t_1² + t_2² + ... + t_K²)^{1/2}.

Then

∂g(t)/∂t_i = (1/2)(t_1² + t_2² + ... + t_K²)^{-1/2}(2t_i) = t_i(t't)^{-1/2}  (i = 1, 2, ..., K)

and

∂g(t)/∂t_i |_{t=ρ} = ρ_i(ρ'ρ)^{-1/2}  (i = 1, 2, ..., K)

imply that λ = (ρ'ρ)^{-1/2}ρ. Thus, from Lemma 8,

√n[ (r'r)^{1/2} − (ρ'ρ)^{1/2} ] →_D N{0, (ρ'ρ)^{-1}ρ'Vρ}.

For large n, we see that √n(r'r)^{1/2} has an approximate normal distribution with mean √n(ρ'ρ)^{1/2} and variance (ρ'ρ)^{-1}ρ'Vρ. We compute the asymptotic power by writing:

P{T(BP) > C} = P{nr'r > C} = 1 − P{nr'r ≤ C} = 1 − P{−√C ≤ √n(r'r)^{1/2} ≤ √C}.

Taking limits on both sides, we have

lim_{n→∞} P{T(BP) > C} = 1 − Φ( [√C − √n(ρ'ρ)^{1/2}]/[(ρ'ρ)^{-1/2}(ρ'Vρ)^{1/2}] ) + Φ( [−√C − √n(ρ'ρ)^{1/2}]/[(ρ'ρ)^{-1/2}(ρ'Vρ)^{1/2}] ). □

We apply this result (Theorem 8) to the case (J=2, K=4).

Corollary 3: For the case (J=2, K=4), the asymptotic power of the Box-Pierce test is given by the expression

lim_{n→∞} P{T(BP) > C(BP)} = 1 − Φ( [√C(BP) − φ√(n(1+φ²))]·(1+φ²)^{1/2}/(1+4φ²−2φ⁴−3φ⁶)^{1/2} ) + Φ( [−√C(BP) − φ√(n(1+φ²))]·(1+φ²)^{1/2}/(1+4φ²−2φ⁴−3φ⁶)^{1/2} ).  (3.3.21)

Proof: Recall that, for (J=2, K=4), ρ = (0, φ, 0, φ²)', and the matrix V is given by (3.3.17). Then

ρ'ρ = φ² + φ⁴ = φ²(1+φ²)

and

ρ'Vρ = φ²(1−φ²) + 2φ³·2φ(1−φ²) + φ⁴(1+2φ²−3φ⁴)
= φ²[(1−φ²) + 4φ²(1−φ²) + φ²(1+2φ²−3φ⁴)]
= φ² + 4φ⁴ − 2φ⁶ − 3φ⁸
= φ²(1+4φ²−2φ⁴−3φ⁶).

Thus,

(ρ'ρ)^{-1/2}(ρ'Vρ)^{1/2} = [φ(1+φ²)^{1/2}]^{-1}·φ(1+4φ²−2φ⁴−3φ⁶)^{1/2}
= (1+4φ²−2φ⁴−3φ⁶)^{1/2}/(1+φ²)^{1/2}.

Now utilize equation (3.3.20) in Theorem 8 to obtain the approximate asymptotic power expression

lim_{n→∞} P{T(BP) > C(BP)} = 1 − Φ( [√C(BP) − φ√(n(1+φ²))]·(1+φ²)^{1/2}/(1+4φ²−2φ⁴−3φ⁶)^{1/2} ) + Φ( [−√C(BP) − φ√(n(1+φ²))]·(1+φ²)^{1/2}/(1+4φ²−2φ⁴−3φ⁶)^{1/2} ). □

A comparison of the three approximations for computing the power of the Box-Pierce statistic is found in Chapter 4.

3.3.3.2.2 Approximate asymptotic power of the max-χ² test

We now restrict our attention to the max-χ² test. For the case (J=2, K=4), the asymptotic power of T(m) may be written:

P{T(m) > C} = P{ n·max_{1≤j≤4} r_j² > C } = 1 − P{ nr_j² ≤ C, 1 ≤ j ≤ 4 } = 1 − P{ −√C ≤ √n r_j ≤ √C, 1 ≤ j ≤ 4 }.

Let x_i = √n r_i and μ_i = √n ρ_i, 1 ≤ i ≤ 4. Then the problem is to evaluate

1 − ∫_{−√C}^{√C} ∫_{−√C}^{√C} ∫_{−√C}^{√C} ∫_{−√C}^{√C} f(x; μ, V) dx_1 dx_2 dx_3 dx_4,  (3.3.22)

where f(x; μ, V) is the joint density function of the first four (normalized) sample autocorrelations. For large n, √n r is multivariate normal; thus

f(x; μ, V) = (2π)^{-2}|V|^{-1/2} exp{ −(1/2)(x−μ)'V^{-1}(x−μ) }.

Due to the presence of off-diagonal elements in the matrix V, the exact computation of the multiple integral in (3.3.22) becomes extremely cumbersome. The researcher must again resort to approximate methods. We propose a Taylor-series expansion method of approximating (3.3.22).

Definition 1: Let y = (y_1, y_2, ..., y_K)' be distributed as multivariate normal, with E(y_i) = 0 and E(y_i²) = 1 (1 ≤ i ≤ K), and E(y_iy_j) = C_ij (1 ≤ i < j ≤ K); i.e., y ~ N{0, R}, where R = (C_ij) is a correlation matrix. A Taylor-series expansion of the density of y, f(y; 0, R), about the point {C_ij} = 0 is defined as follows:

f(y; 0, R) = Σ_{t=0}^{∞} (1/t!) [ Σ_{i<j} C_ij(∂/∂C_ij) ]^t f(y; 0, R) |_{{C_ij}=0},  (3.3.23)

where {C_ij} = 0 means that C_ij = 0 for all i, j (i ≠ j).
For notational purposes, write f_0 = f(y; 0, I), and let (∂/∂C_ij)^t f_0 denote the corresponding partial derivative of f(y; 0, R) evaluated at {C_ij} = 0. Expanding (3.3.23), we obtain:

f(y; 0, R) = f_0 + [ C_12(∂/∂C_12)f_0 + C_13(∂/∂C_13)f_0 + ... + C_{K−1,K}(∂/∂C_{K−1,K})f_0 ]
+ (1/2!)[ C_12²(∂/∂C_12)²f_0 + ... + C_{K−1,K}²(∂/∂C_{K−1,K})²f_0 + 2C_12C_13(∂²/∂C_12∂C_13)f_0 + ... + 2C_{K−2,K}C_{K−1,K}(∂²/∂C_{K−2,K}∂C_{K−1,K})f_0 ]
+ (1/3!)[ C_12³(∂/∂C_12)³f_0 + ... + C_{K−1,K}³(∂/∂C_{K−1,K})³f_0 + 3C_12²C_13(∂³/∂C_12²∂C_13)f_0 + ... + 6C_12C_13C_14(∂³/∂C_12∂C_13∂C_14)f_0 + ... ]
+ (1/4!)[ ... ] + ....  (3.3.24)

We will attempt to simplify equation (3.3.24) in order to make it more workable for the practical researcher. To do this, we introduce the concept of Hermite polynomials.

Definition 2: Let ϕ(x) represent the density function of a univariate N(0,1) random variable (i.e., ϕ(x) = (2π)^{-1/2}e^{−x²/2}). The nth Hermite polynomial, H_n(x), is defined from the following identity:

(−1)^n H_n(x)ϕ(x) = (d/dx)^n ϕ(x).  (3.3.25)

The properties of Hermite polynomials are given in the following lemma.

Lemma 9: Let H_n(x) be the nth Hermite polynomial. Then H_0(x) = 1, H_1(x) = x, H_2(x) = x²−1, H_3(x) = x³−3x, and, in general,

H_{n+1}(x) = xH_n(x) − (d/dx)H_n(x).  (3.3.26)

Proof: By convention, take H_0(x) = 1. Now,

(d/dx)ϕ(x) = −xϕ(x),
(d/dx)²ϕ(x) = (x²−1)ϕ(x),
(d/dx)³ϕ(x) = [2x − x(x²−1)]ϕ(x) = −(x³−3x)ϕ(x).

By equation (3.3.25), then, H_1(x) = x, H_2(x) = x²−1, and H_3(x) = x³−3x. In general, the (n+1)st Hermite polynomial is found as follows:

(−1)^{n+1}H_{n+1}(x)ϕ(x) = (d/dx)^{n+1}ϕ(x) = (d/dx)[(−1)^n H_n(x)ϕ(x)] = (−1)^n{ [(d/dx)H_n(x)]ϕ(x) − xH_n(x)ϕ(x) }.

Thus, we obtain the (n+1)st Hermite polynomial from the recursive relation

H_{n+1}(x) = xH_n(x) − (d/dx)H_n(x). □

In the lemmas that follow, the application of Hermite polynomials to the problem at hand becomes evident.

Lemma 10: Given the conditions of Definition 1, we have:

(i) (∂/∂C_ij)^t f(y; 0, R) = (∂²/∂y_i∂y_j)^t f(y; 0, R),  1 ≤ i < j ≤ K;
(ii) (∂/∂C_ij)^n f(y; 0, R) |_{{C_ij}=0} = H_n(y_i)H_n(y_j)·f(y; 0, I).

Proof: The proof of (i) is straightforward. Perform the required partial differentiation of the density f(y; 0, R) on each side of the equation, and the result follows (this requires many tedious computations).
We derive the identity in (ii) for the case K=2. For K=2, we have

R = [ 1      C_12
      C_12   1 ],   R^{-1} = (1−C_12²)^{-1} [ 1      −C_12
                                              −C_12   1 ],

and

f(y; 0, R) = (2π)^{-1}(1−C_12²)^{-1/2} exp{ −(y_1² − 2C_12y_1y_2 + y_2²)/[2(1−C_12²)] }.

Applying (i) and taking the partial derivatives, we obtain

(∂/∂C_12)f(y; 0, R) = (∂²/∂y_1∂y_2)f(y; 0, R) = f(y; 0, R)·[ C_12/(1−C_12²) + (y_1−C_12y_2)(y_2−C_12y_1)/(1−C_12²)² ].

Taking C_12 = 0 in the above equation gives:

(∂/∂C_12)f(y; 0, R) |_{C_12=0} = y_1y_2·f(y; 0, I) = H_1(y_1)H_1(y_2)f(y; 0, I).

The proof of the more general result (ii) follows a similar argument. □

Lemma 11: Given the conditions of Definition 1, the Taylor-series expansion of f(y; 0, R) can be expressed as follows:

f(y; 0, R) = { 1 + [C_12H_1(y_1)H_1(y_2) + C_13H_1(y_1)H_1(y_3) + ... + C_{K−1,K}H_1(y_{K−1})H_1(y_K)]
+ (1/2!)[C_12²H_2(y_1)H_2(y_2) + ... + C_{K−1,K}²H_2(y_{K−1})H_2(y_K) + 2C_12C_13H_2(y_1)H_1(y_2)H_1(y_3) + ... + 2C_{K−2,K}C_{K−1,K}H_1(y_{K−2})H_1(y_{K−1})H_2(y_K)]
+ (1/3!)[C_12³H_3(y_1)H_3(y_2) + ... + C_{K−1,K}³H_3(y_{K−1})H_3(y_K) + 3C_12²C_13H_3(y_1)H_2(y_2)H_1(y_3) + ... + 6C_12C_13C_14H_3(y_1)H_1(y_2)H_1(y_3)H_1(y_4) + ... + 6C_{K−3,K}C_{K−2,K}C_{K−1,K}H_1(y_{K−3})H_1(y_{K−2})H_1(y_{K−1})H_3(y_K)]
+ (1/4!)[ ... ] + ... } f_0,  (3.3.27)

where f_0 = f(y; 0, I) = ϕ(y_1)ϕ(y_2)···ϕ(y_K).

Proof: Apply the results of Lemma 10 directly to equation (3.3.24). □

Lemma 12:

∫_a^b H_n(x)ϕ(x)dx = Φ(b) − Φ(a),  n = 0;
∫_a^b H_n(x)ϕ(x)dx = H_{n−1}(a)ϕ(a) − H_{n−1}(b)ϕ(b),  n ≥ 1,

where ϕ(x) = (2π)^{-1/2}e^{−x²/2} and Φ(x) = ∫_{−∞}^x ϕ(z)dz.

Proof: For n = 0,

∫_a^b H_0(x)ϕ(x)dx = ∫_a^b ϕ(x)dx = Φ(b) − Φ(a).

For n ≥ 1, since d[H_{n−1}(x)ϕ(x)] = −H_n(x)ϕ(x)dx by (3.3.25),

∫_a^b H_n(x)ϕ(x)dx = −∫_a^b d[H_{n−1}(x)ϕ(x)] = H_{n−1}(a)ϕ(a) − H_{n−1}(b)ϕ(b). □

Lemma 13:

∫_{a_1}^{b_1} ··· ∫_{a_K}^{b_K} H_{n_1}(y_1)H_{n_2}(y_2)···H_{n_K}(y_K)·ϕ(y_1)ϕ(y_2)···ϕ(y_K) dy_1dy_2···dy_K
= Π_{i: n_i=0} [Φ(b_i) − Φ(a_i)] · Π_{i: n_i≥1} [H_{n_i−1}(a_i)ϕ(a_i) − H_{n_i−1}(b_i)ϕ(b_i)].

Proof: Break up the integration as the product

∫_{a_1}^{b_1} H_{n_1}(y_1)ϕ(y_1)dy_1 · ∫_{a_2}^{b_2} H_{n_2}(y_2)ϕ(y_2)dy_2 ··· ∫_{a_K}^{b_K} H_{n_K}(y_K)ϕ(y_K)dy_K.

The result now follows directly from Lemma 12. □
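Both the recursion of Lemma 9 and the closed-form integral of Lemma 12 are easy to verify numerically. The sketch below generates H_n from the recursion, rewritten using the derivative identity H_n'(x) = nH_{n−1}(x) (a standard property of these polynomials, assumed here rather than proved in the text), and compares Lemma 12 against brute-force quadrature:

```python
import math
from statistics import NormalDist

def npdf(x):
    # standard normal density phi(x)
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def hermite(n, x):
    # Lemma 9: H_{k+1}(x) = x*H_k(x) - H_k'(x), with H_k'(x) = k*H_{k-1}(x)
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x                  # H_0, H_1
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def lemma12(n, a, b):
    # closed form of the integral of H_n(x)*phi(x) over [a, b]
    if n == 0:
        Phi = NormalDist().cdf
        return Phi(b) - Phi(a)
    return hermite(n - 1, a) * npdf(a) - hermite(n - 1, b) * npdf(b)

def brute_force(n, a, b, steps=20000):
    # trapezoidal quadrature of the same integral, for comparison
    h = (b - a) / steps
    vals = [hermite(n, a + k * h) * npdf(a + k * h) for k in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
```

The recursion reproduces H_2(x) = x²−1, H_3(x) = x³−3x, H_4(x) = x⁴−6x²+3, and the two integral evaluations agree to quadrature accuracy.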
We now return to the problem of evaluating the multiple integral in (3.3.22). Let

y_i = (x_i − μ_i)/v_ii^{1/2},  1 ≤ i ≤ 4,

where V = (v_ij), 1 ≤ i ≤ 4, 1 ≤ j ≤ 4. Then y = (y_1, y_2, y_3, y_4)' is multivariate normal with mean 0 and dispersion matrix R = (C_ij), where C_ii = 1 and

C_ij = v_ij/(v_ii v_jj)^{1/2}.

The asymptotic power of the max-χ² test for the case (J=2, K=4) may now be written:

P{T(m) > C(m)} = 1 − ∫_{a_1}^{b_1} ∫_{a_2}^{b_2} ∫_{a_3}^{b_3} ∫_{a_4}^{b_4} f(y; 0, R) dy_1 dy_2 dy_3 dy_4,  (3.3.28)

where

a_i = (−√C(m) − √n ρ_i)/v_ii^{1/2}  and  b_i = (√C(m) − √n ρ_i)/v_ii^{1/2},  1 ≤ i ≤ 4.

The Taylor-series expansion of f(y; 0, R) is used to obtain an approximation of (3.3.28).

Theorem 9 (Taylor-series approximation): For the case (J=2, K=4), a Taylor-series approximation to the asymptotic power of the max-χ² test is obtained by computing

P{T(m) > C(m)} ≈ 1 − Σ_{j+k≤6} (C_13^j/j!)(C_24^k/k!)·I_1(j)I_3(j)I_2(k)I_4(k),  (3.3.29)

where the sum extends over j ≥ 0 and k ≥ 0 with j+k ≤ 6, and where, from Lemma 12,

I_i(0) = Φ(b_i) − Φ(a_i),  I_i(n) = H_{n−1}(a_i)ϕ(a_i) − H_{n−1}(b_i)ϕ(b_i)  (n ≥ 1),  1 ≤ i ≤ 4.

Since a_1 = −b_1 and a_3 = −b_3, the factors I_1(n) and I_3(n) vanish for odd n, so that only terms with even j contribute, and for i = 1, 3,

I_i(0) = 2Φ(b_i) − 1,  I_i(2) = −2b_iϕ(b_i),  I_i(4) = −2(b_i³−3b_i)ϕ(b_i),  I_i(6) = −2(b_i⁵−10b_i³+15b_i)ϕ(b_i).

Here ϕ(x) = (2π)^{-1/2}e^{−x²/2}, Φ(x) = ∫_{−∞}^x ϕ(z)dz,

C_13 = (2φ+3φ²−φ⁴)/[(1+φ)(1+φ²+4φ³−2φ⁵)^{1/2}],
C_24 = 2φ(1−φ²)^{1/2}/(1+2φ²−3φ⁴)^{1/2},

b_1 = −a_1 = [C(m)(1−φ²)]^{1/2}/(1+φ),
b_2 = (√C(m) − √n φ)/(1−φ²)^{1/2},  a_2 = (−√C(m) − √n φ)/(1−φ²)^{1/2},
b_3 = −a_3 = [C(m)(1−φ²)]^{1/2}/(1+φ²+4φ³−2φ⁵)^{1/2},
b_4 = (√C(m) − √n φ²)/(1+2φ²−3φ⁴)^{1/2},  and  a_4 = (−√C(m) − √n φ²)/(1+2φ²−3φ⁴)^{1/2}.

From the earlier results (3.3.17),

v_11 = (1+φ)²/(1−φ²),  v_22 = 1−φ²,  v_33 = (1+φ²+4φ³−2φ⁵)/(1−φ²),  v_44 = 1+2φ²−3φ⁴,
v_13 = (2φ+3φ²−φ⁴)/(1−φ²),  v_24 = 2φ(1−φ²),  v_12 = v_14 = v_23 = v_34 = 0;

hence, the elements of R are written

C_11 = C_22 = C_33 = C_44 = 1,  C_12 = C_14 = C_23 = C_34 = 0,
C_13 = v_13/(v_11v_33)^{1/2} = (2φ+3φ²−φ⁴)/[(1+φ)(1+φ²+4φ³−2φ⁵)^{1/2}],
C_24 = v_24/(v_22v_44)^{1/2} = 2φ(1−φ²)^{1/2}/(1+2φ²−3φ⁴)^{1/2}.

Proof: Now ρ_1 = 0, ρ_2 = φ, ρ_3 = 0, and ρ_4 = φ² imply the values of a_i and b_i given above for the power expression (3.3.28). Putting C_12 = C_14 = C_23 = C_34 = 0 in equation (3.3.27) leaves only the derivatives with respect to C_13 and C_24, and Lemma 10 then gives the Taylor-series expansion

f(y; 0, R) = { Σ_{j,k≥0} (C_13^j/j!)(C_24^k/k!) H_j(y_1)H_k(y_2)H_j(y_3)H_k(y_4) } f(y; 0, I),  (3.3.30)

truncated after total order j+k = 6. Before integrating (3.3.30), notice that ϕ(x) is an even function, i.e., ϕ(−x) = ϕ(x). Also, H_n(x) is an even function when n is even and an odd function when n is odd. Since a_1 = −b_1 and a_3 = −b_3, we have from Lemma 12, for i = 1, 3,

∫_{a_i}^{b_i} H_n(x)ϕ(x)dx = H_{n−1}(−b_i)ϕ(b_i) − H_{n−1}(b_i)ϕ(b_i) = −2H_{n−1}(b_i)ϕ(b_i)  if n is even (n ≥ 2),  and  = 0  if n is odd.  (3.3.31)

Now integrate (3.3.30) term by term over the region of (3.3.28), using (3.3.31), Lemma 12, and Lemma 13, to obtain the desired result. □
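Once a_i, b_i, C_13, and C_24 are in hand, the approximation of Theorem 9 is mechanical: each term of the truncated expansion is a product of four one-dimensional integrals from Lemma 12. A sketch of the whole computation for the case (J=2, K=4) (organization and names are mine; the total truncation order of six follows the text):

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf

def npdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def hermite(n, x):
    # probabilists' Hermite polynomial via the Lemma 9 recursion
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def I(n, a, b):
    # Lemma 12: integral of H_n(x)*phi(x) over [a, b]
    if n == 0:
        return Phi(b) - Phi(a)
    return hermite(n - 1, a) * npdf(a) - hermite(n - 1, b) * npdf(b)

def maxchi2_power_taylor(phi, n, C, order=6):
    """Taylor-series approximation (Theorem 9) to the asymptotic power of the
    max-chi^2 test for (J=2, K=4) under Z_t = phi*Z_{t-2} + e_t."""
    one = 1.0 - phi * phi
    v = [(1 + phi) ** 2 / one,                         # v11
         one,                                          # v22
         (1 + phi**2 + 4 * phi**3 - 2 * phi**5) / one, # v33
         1 + 2 * phi**2 - 3 * phi**4]                  # v44
    rho = [0.0, phi, 0.0, phi * phi]
    c = math.sqrt(C)
    a = [(-c - math.sqrt(n) * rho[i]) / math.sqrt(v[i]) for i in range(4)]
    b = [(c - math.sqrt(n) * rho[i]) / math.sqrt(v[i]) for i in range(4)]
    C13 = ((2 * phi + 3 * phi**2 - phi**4) / one) / math.sqrt(v[0] * v[2])
    C24 = (2 * phi * one) / math.sqrt(v[1] * v[3])
    total = 0.0
    for j in range(order + 1):
        for k in range(order + 1 - j):
            term = (C13**j / math.factorial(j)) * (C24**k / math.factorial(k))
            term *= I(j, a[0], b[0]) * I(j, a[2], b[2])   # y1 and y3 factors
            term *= I(k, a[1], b[1]) * I(k, a[3], b[3])   # y2 and y4 factors
            total += term
    return 1.0 - total
```

At φ = 0 all correlations vanish and the value collapses to 1 − [2Φ(√C) − 1]⁴, the answer for four independent standard normals.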
In Chapter 4, each of the approximate methods outlined in this section will be applied to models similar to (3.3.4), and numerical powers of the two tests compared. Also, a study designed in part to determine the accuracy of these approximations is conducted. This is accomplished by comparing the approximate power of each test to the power generated from a Monte Carlo simulation.

CHAPTER IV
A POWER COMPARISON OF THE BOX-PIERCE "PORTMANTEAU", MAX-χ², AND d TESTS

In Sections 3.1 and 3.2 we presented two testing procedures, the Box-Pierce "portmanteau" and max-χ² tests, designed to detect serial correlation of the errors in the presence of a general alternative error structure. In Chapter 4, an extensive power simulation study is undertaken. Our goal is threefold. First, we wish to substantiate the conjecture that the Box-Pierce and max-χ² tests attain high power in the general case. Secondly, we investigate the specific types of alternative error models for which one test outperforms the other. Finally, the accuracy of the asymptotic power approximations (Section 3.3) is determined by comparing the simulated power to the approximate power.

4.1 Monte Carlo Simulations

Recall the stationary AR model of order p with lag coefficients (φ_1, φ_2, ..., φ_p) given by equation (3.3.1). In order to obtain the simulated powers of the tests via the Monte Carlo method, a technique for generating the sequence of observations Z_1, Z_2, ..., Z_n needs to be developed. We adopt a procedure in which the first p variates Z_1, Z_2, ..., Z_p are generated so that they possess the desired covariance structure of the general alternative error model under consideration. The remaining n−p variates Z_{p+1}, Z_{p+2}, ..., Z_n are then calculated recursively from equation (3.3.1). However, the formation of the covariance structure first entails computing the vector of true autocorrelations, ρ. Given the lag coefficients (φ_1, φ_2, ..., φ_p) of the model, the vector ρ is easily obtained.
The relationship between ρ and the lag coefficients is given in the following lemma.

Lemma 14: Consider the stationary pth-order AR model

Z_t = φ_1Z_{t−1} + φ_2Z_{t−2} + ... + φ_pZ_{t−p} + ε_t,  (4.1.1)

where {ε_t} is an uncorrelated series with mean 0 and variance σ², and the vector of lag coefficients φ = (φ_1, φ_2, ..., φ_p)' is known. Let γ and ρ represent the vectors of true autocovariances and true autocorrelations, respectively, where γ_v = E{Z_tZ_{t+v}} and ρ_v = γ_v/γ_0, v = 0, 1, 2, .... Then Aρ = φ, where the elements of the matrix A are given by (with the convention that φ_k = 0 for k < 1 or k > p):

a_ii = 1 − φ_{2i}  if 2i ≤ p,  a_ii = 1  if 2i > p;
a_ij = −(φ_{i−j} + φ_{i+j})  if i ≠ j.

Proof: Multiply Z_t in model (4.1.1) by Z_{t+v}, v = 1, 2, ..., p, and take expectations to obtain the true autocovariances:

γ_1 = φ_1γ_0 + φ_2γ_1 + φ_3γ_2 + φ_4γ_3 + ... + φ_pγ_{p−1}
γ_2 = φ_1γ_1 + φ_2γ_0 + φ_3γ_1 + φ_4γ_2 + ... + φ_pγ_{p−2}
γ_3 = φ_1γ_2 + φ_2γ_1 + φ_3γ_0 + φ_4γ_1 + ... + φ_pγ_{p−3}
⋮
γ_p = φ_1γ_{p−1} + φ_2γ_{p−2} + φ_3γ_{p−3} + ... + φ_pγ_0.

Now utilize ρ_v = γ_v/γ_0 to obtain expressions for the true autocorrelations:

ρ_1 = φ_1 + φ_2ρ_1 + φ_3ρ_2 + φ_4ρ_3 + ... + φ_pρ_{p−1}
ρ_2 = φ_1ρ_1 + φ_2 + φ_3ρ_1 + φ_4ρ_2 + ... + φ_pρ_{p−2}
⋮
ρ_p = φ_1ρ_{p−1} + φ_2ρ_{p−2} + φ_3ρ_{p−3} + ... + φ_p.

Solving the above equations for the lag parameters φ_1, φ_2, ..., φ_p gives:

φ_1 = (1−φ_2)ρ_1 − φ_3ρ_2 − φ_4ρ_3 − ... − φ_pρ_{p−1}
φ_2 = −(φ_1+φ_3)ρ_1 + (1−φ_4)ρ_2 − φ_5ρ_3 − ... − φ_pρ_{p−2}
⋮
φ_p = −φ_{p−1}ρ_1 − φ_{p−2}ρ_2 − ... − φ_1ρ_{p−1} + ρ_p,

or, in matrix form, Aρ = φ. To see this more clearly, consider the case p=4. Expressions for the first four true autocorrelations are given by:

ρ_1 = φ_1 + φ_2ρ_1 + φ_3ρ_2 + φ_4ρ_3
ρ_2 = φ_1ρ_1 + φ_2 + φ_3ρ_1 + φ_4ρ_2
ρ_3 = φ_1ρ_2 + φ_2ρ_1 + φ_3 + φ_4ρ_1
ρ_4 = φ_1ρ_3 + φ_2ρ_2 + φ_3ρ_1 + φ_4.

Now solve each of the above equations, respectively, for φ_j, j = 1, 2, 3, 4, to obtain:

φ_1 = (1−φ_2)ρ_1 − φ_3ρ_2 − φ_4ρ_3
φ_2 = −(φ_1+φ_3)ρ_1 + (1−φ_4)ρ_2
φ_3 = −(φ_2+φ_4)ρ_1 − φ_1ρ_2 + ρ_3
φ_4 = −φ_3ρ_1 − φ_2ρ_2 − φ_1ρ_3 + ρ_4.

In matrix form, we have:

[ 1−φ_2        −φ_3    −φ_4    0
  −(φ_1+φ_3)   1−φ_4   0       0
  −(φ_2+φ_4)   −φ_1    1       0
  −φ_3         −φ_2    −φ_1    1 ] ρ = φ,

which agrees with the lemma result. □
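Lemma 14 is exactly what Step 1 of the simulation in the next section requires: given (φ_1, ..., φ_p), solve a p×p linear system for (ρ_1, ..., ρ_p). A pure-Python solver for these Yule-Walker relations, equivalent to inverting A (the function name and the elimination routine are mine):

```python
def true_autocorrs(phi):
    """First p true autocorrelations of a stationary AR(p) with coefficients
    phi = [phi_1, ..., phi_p], from rho_k = sum_j phi_j * rho_{|k-j|} with
    rho_0 = 1 (the system A*rho = phi of Lemma 14)."""
    p = len(phi)
    A = [[0.0] * p for _ in range(p)]
    b = [0.0] * p
    for k in range(1, p + 1):
        A[k - 1][k - 1] += 1.0
        for j in range(1, p + 1):
            lag = abs(k - j)
            if lag == 0:
                b[k - 1] += phi[j - 1]       # rho_0 = 1 term moves to the RHS
            else:
                A[k - 1][lag - 1] -= phi[j - 1]
    # Gaussian elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for cc in range(c, p):
                A[r][cc] -= f * A[c][cc]
            b[r] -= f * b[c]
    x = [0.0] * p
    for r in range(p - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, p))) / A[r][r]
    return x
```

For an AR(2) with φ_1 = .5 and φ_2 = .3 this returns ρ_1 = φ_1/(1−φ_2) = 5/7 and ρ_2 = φ_2 + φ_1²/(1−φ_2), the familiar Yule-Walker values.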
The simulation technique used to generate the powers of the tests in Sections 4.1.1 and 4.1.2 is outlined in Table 4.1. In Section 4.1.1 we examine the powers of the Box-Pierce and max-χ² tests when the vector of sample autocorrelations r is known, i.e., when the random errors {Z_t} are observable. Of course, in practice the {Z_t} are unobservable. We have indicated (Section 3.3.2) that the researcher must instead calculate the vector of estimated sample autocorrelations using the residuals computed from least squares regression. Power comparisons of the Box-Pierce and max-χ² tests in this more practical case are discussed in Section 4.1.2. These power comparisons also include the Durbin-Watson d test, which was designed to test regression residuals.

4.1.1 Observable Residuals: No Regression

In this section we present a power simulation study assuming that the residuals {Z_t} from model (4.1.1) are observable. The objective is to detect power trends which may extend to the more practical instance in which powers are calculated using the residuals computed from least squares regression.

TABLE 4.1 MONTE CARLO SIMULATION

ERROR MODEL: Z_t = φ_1Z_{t−1} + φ_2Z_{t−2} + ... + φ_pZ_{t−p} + ε_t, where {ε_t} is white noise.

STEP 1: Utilizing Lemma 14, compute the vector of true autocorrelations ρ = A^{-1}φ, where ρ = (ρ_1, ρ_2, ..., ρ_p)', φ = (φ_1, φ_2, ..., φ_p)', and the elements of the matrix A are as given in Lemma 14.

STEP 2: Generate the covariance structure, Γ, of the first p variates (Z_1, Z_2, ..., Z_p) as follows:

Γ = [ γ_0      γ_1      γ_2      ...  γ_{p−1}
      γ_1      γ_0      γ_1      ...  γ_{p−2}
      ⋮
      γ_{p−1}  γ_{p−2}  γ_{p−3}  ...  γ_0 ],

where γ_v = E{Z_tZ_{t+v}} = ρ_vγ_0, v = 1, 2, ..., p, and

γ_0 = φ_1γ_1 + φ_2γ_2 + ... + φ_pγ_p + σ² = (φ_1ρ_1 + φ_2ρ_2 + ... + φ_pρ_p)γ_0 + σ²,

so that

γ_0 = σ²/[1 − (φ_1ρ_1 + φ_2ρ_2 + ... + φ_pρ_p)].

Without loss of generality, take E(ε_t²) = σ² = 1.

STEP 3: Write Γ = TT', where T is a p×p lower triangular matrix. Solve for the elements of T using a computer routine.
STEP 4: Generate the first p random variates Z_1, Z_2, ..., Z_p by computing Z_p = Tε_p, where Z_p = (Z_1, Z_2, ..., Z_p)' and where ε_p = (ε_1, ε_2, ..., ε_p)' is a vector of independent normal random variables with mean 0 and variance 1. (Many computer packages have routines which generate N(0,1) random variates.) Note that Cov{Z_p} = E{Z_pZ_p'} = T·E{ε_pε_p'}·T' = TT' = Γ, and thus the first p random variates generated have the desired covariance structure.

STEP 5: Compute the remaining n−p variates, Z_{p+1}, Z_{p+2}, ..., Z_n, from the recursive relations given by model (4.1.1), i.e.,

Z_{p+1} = φ_1Z_p + φ_2Z_{p−1} + ... + φ_pZ_1 + ε_{p+1}
Z_{p+2} = φ_1Z_{p+1} + φ_2Z_p + ... + φ_pZ_2 + ε_{p+2}
⋮
Z_n = φ_1Z_{n−1} + φ_2Z_{n−2} + ... + φ_pZ_{n−p} + ε_n,

where ε_{p+1}, ..., ε_n are distributed multivariate normal with mean 0 and covariance I.

STEP 6: Use the generated random errors Z_1, Z_2, ..., Z_n to calculate the first K sample autocorrelations r_1, r_2, ..., r_K, where

r_j = Σ_{t=j+1}^{n} Z_tZ_{t−j} / Σ_{t=1}^{n} Z_t²,  1 ≤ j ≤ K.

STEP 7: Compute the test statistics

T(BP) = n Σ_{j=1}^{K} r_j²  and  T(m) = n·max_{1≤j≤K} r_j².

Determine whether the calculated values of the test statistics fall in their respective rejection regions:

T(BP) > χ²_{K,α};  T(m) > χ²_{1,α*}, where α* = 1 − (1−α)^{1/K}.

STEP 8: Repeat Steps 1-7 N times, where we selected N = 1000.

STEP 9: Compute the simulated power for each test as follows:

power = (number of rejections)/1000.

A list of the controllable parameters in the study is given below:

n = number of residuals generated
α = probability of a Type I error
K = number of sample autocorrelations included in the Box-Pierce and max-χ² test statistics
p = order of the autoregressive model
(φ_1, φ_2, ..., φ_p) = values of the lag coefficients in model (4.1.1)
m = number of nonzero lag coefficients in model (4.1.1)
(j_1, j_2, ..., j_m) = lags associated with the nonzero lag coefficients.

Of course, there exists an infinite number of parameter combinations.
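The nine steps can be compressed considerably for error model 1, where a long burn-in can stand in for the exact start-up covariance construction of Steps 2-4 (a simplification of mine, not the table's method). A sketch, with the Step 7 critical values passed in by the caller:

```python
import random

def simulate_power(phi, J, K, n, C_bp, C_max, reps=400, seed=7):
    """Condensed sketch of the Table 4.1 Monte Carlo for error model 1,
    Z_t = phi*Z_{t-J} + e_t; returns (Box-Pierce power, max-chi^2 power)."""
    rng = random.Random(seed)
    rej_bp = rej_max = 0
    for _ in range(reps):
        z = [0.0] * J
        for _ in range(200 + n):                 # burn-in, then n kept values
            z.append(phi * z[-J] + rng.gauss(0.0, 1.0))
        z = z[-n:]
        c0 = sum(x * x for x in z)
        r = [sum(z[t] * z[t - j] for t in range(j, n)) / c0
             for j in range(1, K + 1)]           # Step 6 sample autocorrelations
        if n * sum(rj * rj for rj in r) > C_bp:  # Box-Pierce statistic
            rej_bp += 1
        if n * max(rj * rj for rj in r) > C_max: # max-chi^2 statistic
            rej_max += 1
    return rej_bp / reps, rej_max / reps
```

For K = 4 and α = .05, the Step 7 critical values are C_bp = χ²_{4,.05} ≈ 9.488 and C_max ≈ 6.21, the upper 1 − .95^{1/4} point of χ²_1.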
For our study, we have attempted to select those model parameters which are intuitively appealing to the time-series analyst and/or econometrician, i.e., those which also have practical applications. Our basic error model takes the form of equation (3.3.4). The two versions of this model used in the study are (1) the first-order AR process with lag J,

Z_t = φZ_{t−J} + ε_t,  (4.1.2)

and (2) the second-order AR process with lags J_1 and J_2,

Z_t = φ_{J_1}Z_{t−J_1} + φ_{J_2}Z_{t−J_2} + ε_t.  (4.1.3)

The parameter values of each model are given in Table 4.2 and Table 4.3, respectively. In order to present the results of the study in a clear fashion, we do not give tables of the numerical power simulations in this section. Instead, we sketch power curves for various cases which are typical of those considered and which also best emphasize the main results. Tables of all numerical results obtained in the study are included in the Appendix which follows the final chapter. However, when summarizing the simulation results, we will at times refer the reader to these tables.

TABLE 4.2 SIMULATION STUDY
ERROR MODEL 1: Z_t = φZ_{t−J} + ε_t
Parameters:
(J, K): (2,4), (3,4), (4,4), (5,4)*, (2,12), (3,12), (4,12), (12,12), (12,24)
φ: −.9, −.7, −.5, −.3, −.1, 0, .1, .3, .5, .7, .9
n: 50, 200
α: .05
*NOTE: Powers are generated for all of the above combinations of {(J,K), φ, n, α} with the exception of the case {(5,4), φ, 200, .05}.

TABLE 4.3 SIMULATION STUDY
ERROR MODEL 2: Z_t = φ_{J_1}Z_{t−J_1} + φ_{J_2}Z_{t−J_2} + ε_t
Parameters:
(J_1,J_2; K) = (1,2;2), (1,2;12):  (φ_{J_1}, φ_{J_2}) = (.5,.3), (.5,.3), (1.0,.6), (1.0,.6), (.1,.7), (.1,.7), (.3,.1), (.7,.3), (.7,.3);  n = 50, 200;  α = .05.
(J_1,J_2; K) = (1,3;4):  (φ_{J_1}, φ_{J_2}) = (.1,.9), (.1,.7), (.5,1.2), (.5,.3), (1.0,.1), (1.0,.1), (.5,.3), (1.0,.3), (.3,.1);  n = 50, 200;  α = .05.
[Figures 4.1a-4.5b: Power simulation curves for error model 1, plotting simulated power against the lag parameter φ for the Box-Pierce (T(BP)) and max-χ² (T(m)) tests at n = 50 and n = 200. Figures 4.1a/4.1b: J=2, K=4 (negative and positive φ); Figures 4.2a/4.2b: J=4, K=4; Figures 4.3a/4.3b: J=2, K=12; Figures 4.4a/4.4b: J=12, K=24; Figures 4.5a/4.5b: J=5, K=4.]

We first discuss the power simulation results for error model 1 (see Tables A1-A9 in the Appendix).
Note that for n=50, Figures 4.1-4.4 show that moderate and large values of the lag parameter (in absolute value), i.e., .5 ≤ |φ| < 1, lead to a high rate of rejection of the null hypothesis of white noise for both the Box-Pierce and max-χ² tests. When the sample size is increased to 200, high rejection rates result at values of the lag parameter of .3 or larger (in absolute value). Except for the case J=5, K=4 (Figure 4.5), these results hold for all other cases considered. For these models, then, this simulation study clearly supports our supposition that both tests attain high power under a general alternative.

Compare now Figures 4.1 and 4.3, i.e., the cases J=2, K=4 and J=2, K=12. Notice that the powers of the tests generally decrease as K, the number of autocorrelations included in the test statistic, is increased from 4 to 12. And in all other cases considered, for a fixed lag J, the powers of the tests decreased as K increased. In contrast, if the researcher unknowingly selects K less than the smallest nonzero lag in the model (J in our model), the powers of the tests are reduced dramatically, as demonstrated in Figure 4.5 for the case J=5, K=4. For moderate and small values of the lag parameter, the powers of the tests are less than .10. Thus, the test user's choice of K is a delicate one. The researcher who conservatively selects a K much larger than the largest nonzero lag in the general alternative error model will, at a more frequent rate, fail to reject the null hypothesis of white noise when serial correlation exists. Alternatively, if K is chosen smaller than the lowest lag in the hypothesized error model, the tests have very low power.

Comparing the powers of the tests, Figures 4.1a-4.4a reveal that the power of the max-χ² procedure dominates the power of the Box-Pierce test for almost all negative values of the lag parameter, or more specifically, for φ ≤ −.3. Consider the case J=12, K=24 (Figure 4.4).
Referring to Table A8 in the Appendix, at φ = −.7 and n=50, the max-χ² test, with a simulated power of .853, greatly outperforms the Box-Pierce test (simulated power of .583). When analyzing yearly data, of course, it is very possible that the regression errors {Z_t} follow an autoregressive scheme with nonzero lag 12. Thus, the practical significance of the result is clear.

For small values of the lag parameter, i.e., −.3 ≤ φ ≤ .3, the figures show that the Box-Pierce power is generally larger than the max-χ² power. However, this result seems much weaker than that previously stated, since the powers attained by both tests for φ in this range are very low, and in some cases near zero. The apparent trend in the powers of the tests for moderate and large values of the lag parameter at n=50 can be stated as follows: (a) for .3 ≤ φ ≤ .5, the powers of the tests are very nearly equivalent; (b) for φ ≥ .5, the power of the max-χ² test is slightly higher than the Box-Pierce power. The max-χ² procedure also seems to be less sensitive to an increase in K. For example, compare Figures 4.1 and 4.3, the cases J=2, K=4 and J=2, K=12. At φ = −.3 and n=200, the Box-Pierce power drops from .947 (Table A1) to .837 (Table A4), while the drop in the power of the max-χ² test is only from .959 to .925.

Figure 4.5 represents the only case (J=5, K=4) in which the Box-Pierce test clearly outperforms the max-χ² test. In this instance, we expect both tests to perform poorly. With J=5, the first four true autocorrelations ρ_1, ρ_2, ρ_3, ρ_4 are zero, and thus each of the first four sample autocorrelations r_1, r_2, r_3, r_4 estimates a quantity known to be zero. Hence, both the sum and the maximum of the first four sample