EXACT TESTING PROCEDURES FOR UNBALANCED RANDOM AND MIXED LINEAR MODELS

By

ROBERT CALVIN CAPEN

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1991

To my mother and the memory of my father

ACKNOWLEDGMENTS

I would like to express my most sincere appreciation to Dr. André I. Khuri for his patience, understanding, and guidance during the writing of this dissertation. Also, I would like to thank Dr. Malay Ghosh, Dr. Ramon Littell, Dr. P. V. Rao, and Dr. Timothy White for serving on my committee. I am grateful to my fiancée, Debra Thurman, for all the love and support she has provided me during these past two years. She gave me the strength to continue. I would also like to thank Jeff, Joe, Amber, and Julie for their friendship and moral support. Finally, I wish to acknowledge Leslie Easom for her superb typing job; thank you very much.

LIST OF TABLES

3.1 An example of an unbalanced random two-way model with missing cells
3.2 An example of an unbalanced random 3-fold nested model
3.3 Information needed to determine the vector v in (3.257) for a three-way cross-classification model in which Factors A and B are fixed while Factor C is random
3.4 Information needed to determine the vector x in (3.288) for a three-way cross-classification model in which Factor A is fixed while Factors B and C are random
3.5 A comparison of the powers of the exact tests for the variance components in an unbalanced mixed two-way cross-classification model
3.6 Numerical scores resulting from a comparison between dyed cotton-synthetic cloth and a standard
3.7 Some useful quantities used in the construction of the exact tests for the variance components utilizing the methodology developed in Section 3.3.3
3.8 Expected mean square values for the cloth example
3.9 Results from the analysis of the variance components for the cloth example
3.10 Some useful quantities used in the construction of the exact tests for the fixed effects utilizing the methodology developed in Section 3.3.4
3.11 Summary of results for the cloth example
3.12 Performance values for three machines
3.13 Results from the analysis of the variance components for the machine example
3.14 Some useful quantities needed in the development of the exact confidence intervals for the machine example
3.15 Exact and approximate 95% confidence intervals for the machine example
3.16 Some useful quantities used in the construction of the exact tests for the four variance components in the example
3.17 Results from our testing procedures

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT

CHAPTERS

ONE INTRODUCTION

TWO LITERATURE REVIEW

THREE EXACT TESTING PROCEDURES
3.1 The Unbalanced Random Two-Way Model with Some Cells Missing
3.1.1 Introduction
3.1.2 Preliminary Development
3.1.3 An Exact Test Concerning σ²_α
3.1.4 An Exact Test Concerning σ²_β
3.1.5 An Exact Test Concerning σ²_αβ
3.1.6 A Numerical Example
3.1.7 The Power of Exact Tests
3.1.8 Concluding Remarks
3.2 The Unbalanced Three-Fold Nested Random Model
3.2.1 Introduction
3.2.2 Preliminary Development
3.2.3 An Exact Test Concerning σ²_α
3.2.4 An Exact Test Concerning σ²_β
3.2.5 Reduction of the Exact Tests when the Design is Balanced
3.2.6 A Numerical Example
3.2.7 The Power of Exact Tests
3.2.8 Concluding Remarks
3.3 The General Unbalanced Mixed Model with Imbalance Affecting the Last Stage Only
3.3.1 Introduction
3.3.2 Preliminary Development
3.3.3 Analyzing the Variance Components
3.3.4 Analyzing the Fixed-Effects Parameters
3.3.5 Reduction of the Tests When the Design is Balanced
3.3.6 The Power of the Exact Tests
3.3.7 An Alternative Approach to the Analysis of the Unbalanced Mixed Model with Imbalance Affecting the Last Stage Only
3.3.8 Numerical Examples
3.3.9 Concluding Remarks
3.4 The Random Unbalanced Nested Model with Imbalance Affecting the Last Two Stages Only
3.4.1 Introduction
3.4.2 Notations and Preliminary Development
3.4.3 An Exact Test for σ²_s
3.4.4 An Exact Test for σ²_{s−1}
3.4.5 An Exact Test for σ²_1, σ²_2, ..., σ²_{s−2}
3.4.6 Reduction of the Exact Tests When the Design is Balanced
3.4.7 The Power of the Exact Tests
3.4.8 A Numerical Example
3.4.9 Concluding Remarks

FOUR CONCLUSIONS AND DIRECTIONS FOR FUTURE RESEARCH

APPENDICES
A MATHEMATICAL RESULTS
B STATISTICAL RESULTS
C AN ILLUSTRATION OF THE UTILITY OF RESAMPLING
D HIROTSU'S APPROXIMATION

REFERENCES

BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

EXACT TESTING PROCEDURES FOR UNBALANCED RANDOM AND MIXED LINEAR MODELS

By

Robert Calvin Capen

August 1991

Chairman: Dr. André I. Khuri
Major Department: Statistics

Although much attention has been given to the problem of estimating variance components in unbalanced random and mixed linear models, the same cannot be said about the development of exact testing procedures for these quantities. One reason for this is the mathematical complexity of the problem. Even in some balanced random or mixed models, no exact analysis of variance tests exist for testing certain variance components. Consequently, approximate methods, such as Satterthwaite's procedure, are typically relied upon to furnish the required tests. Unfortunately, the exact distributions of these approximate procedures are generally unknown, even under the null hypothesis.

Recently, some progress has been made in the construction of exact tests for the variance components and for estimable linear functions of the fixed-effects parameters in unbalanced random and mixed models. In this dissertation we extend some of these known results and generate some new ones.
Specifically, exact procedures are developed to test hypotheses concerning the variance components in an unbalanced random two-way model with some cells missing, an unbalanced random 3-fold nested model, and an unbalanced random s-fold nested model where the imbalance affects the last two stages only. Also, methods are derived to test hypotheses about the variance components and about estimable linear functions of the fixed-effects parameters in a general unbalanced mixed model with imbalance affecting the last stage only.

CHAPTER ONE
INTRODUCTION

Although much attention has been given to the problem of estimating variance components (see, e.g., Rao and Kleffe (1988)), the same cannot be said about the development of exact testing procedures concerning the variance components in unbalanced random and mixed linear models. Perhaps one reason for this is the mathematical complexity of the problem. Even for some balanced models no exact analysis of variance (ANOVA) tests exist for testing certain variance components. For example, in a three-way cross-classification random-effects model, no exact procedures exist for testing hypotheses concerning the variance components associated with the main effects of the model. Also, when exact tests do exist for balanced models, they generally cannot be used to test the corresponding hypotheses in unbalanced models. The reason is that for unbalanced models, the traditional test statistics used with balanced data are, in general, no longer based upon independent, chi-squared-type sums of squares. Approximate methods, such as Satterthwaite's (1946) procedure, are typically relied upon to furnish tests for the variance components when no such exact procedure exists. However, approximate methods are just that: approximate.
Some may be relatively easy to implement and perform reasonably well in certain situations, but they cannot be expected to always produce adequate solutions (see, e.g., Cummings and Gaylor (1974), Tietjen (1974), and, more recently, Ames and Webster (1991)). Thus there is a clear need for the development of exact testing procedures.

In this dissertation we have concentrated on developing exact methodology for some unbalanced random and mixed linear models. Much of this work has been inspired by the derivations given in Khuri (1984, 1987, 1990), Khuri and Littell (1987), Gallo (1987), and Gallo and Khuri (1990). Their results and those obtained in this dissertation are based on certain transformations which reduce the analysis of the unbalanced model to that of a balanced one. A crucial theme in the construction of these transformations is a scheme for resampling from the error vector. In this scheme, only a portion of the error vector is utilized in making the transformation. Consequently, the values of the resulting test statistics will not be unique. However, the distributional properties of these statistics will remain unchanged no matter what portion of the error vector is used in the transformation. It is in this sense that our methodology is similar to procedures involving resampling. Appendix C provides an illustration of the utility of resampling as a means of "counteracting" the imbalance in a design.

This thesis is divided into four chapters. Chapters one and two give a brief introduction to the area of exact testing procedures for unbalanced random and mixed linear models and a review of the pertinent literature. The main contributions of this study are found in Chapter three. This chapter is divided into four sections. In the first section we develop exact tests for the variance components in a random two-way model with some cells missing. Section two deals with the unbalanced random 3-fold nested model.
Here the imbalance is assumed to affect all stages of the design. In the last two sections we concentrate on more general models. The general mixed model with imbalance affecting the last stage only is analyzed in Section three, while in Section four we obtain exact procedures for testing the variance components in a random s-fold nested model with imbalance affecting the last two stages only. Finally, Chapter four summarizes our findings and offers directions for future research.

Throughout this work we will try to motivate, as much as possible, the reason behind each transformation we use. At times, the mathematical derivations may become "intense" in the sense that there may be pages containing nothing but equations. Unfortunately, there seems to be no way to avoid this. A numerical example is given at the end of each section in Chapter three to illustrate the proposed methodology. These examples also serve as useful (and probably needed) "rest stops" between pages full of formulas.

The following notation will be used throughout this thesis:

* rank(A) will denote the rank of the matrix A.
* range(A) will denote the range space of the matrix A, that is, the vector space spanned by the columns of A.
* tr(A) will denote the trace of the matrix A.
* A⁻ will denote a generalized inverse of A, that is, any matrix which satisfies AA⁻A = A.
* diag(A, B) will denote the block-diagonal matrix [A 0; 0 B]. This can also be represented by A⊕B, where "⊕" is the direct sum operator.
* A⊗B will be used to represent the (right) direct (Kronecker) product of two matrices. Briefly, if A is n×m and B is p×q, then A⊗B is the np×mq partitioned matrix whose (i,j)th block is a_ij B, where a_ij is the (i,j)th element of A.

CHAPTER TWO
LITERATURE REVIEW

The first paper that addresses the issue of exact tests for variance components in unbalanced models is due to Wald (1940). Wald constructed an exact test for σ²_a/σ²_ε in the random one-way model:

y_ij = μ + a_i + ε_ij    (2.1)

(i = 1, 2, ..., p; j = 1, 2, ..., m_i), where μ is an unknown constant parameter, and a_i and ε_ij are independent, normally distributed random variables with zero means and variances σ²_a and σ²_ε, respectively. He later extended this result to the general random-effects model without interactions. Wald (1947) also considered the mixed model

y = Xα + Zb + ε    (2.2)

where α is a vector of fixed effects and b ~ N(0, σ²_b I) independently of ε ~ N(0, σ²_ε I). Using a regression analysis approach, Wald was able to construct an exact test for σ²_b.

As pointed out in Spjøtvoll (1968), Wald's idea cannot be used to test variance components in all linear models. To illustrate this phenomenon, Spjøtvoll considered the random two-way model. Based on a modification of Wald's method he was able to derive an exact test for the variance component associated with the interaction effect. However, only under the assumption of additivity could he construct exact tests for the main effects variance components.

By modifying an approach suggested by Graybill and Hultquist (1961), Thomsen (1975) was able to construct exact tests for the variance components in an unbalanced two-way random model. Thomsen did this by first making an orthogonal transformation as outlined below. Let ȳ be the vector of cell means; then in matrix notation we have (see Thomsen (1975, p. 258)):

ȳ = μ1_rs + B₁α + B₂β + I_rs(αβ) + ε̄    (2.3)

where 1_rs is the vector of ones of dimension rs; I_rs is the rs×rs identity matrix; B₁ and B₂ are defined as

B₁ = I_r ⊗ 1_s
B₂ = 1_r ⊗ I_s    (2.4)

and r is the number of rows and s is the number of columns in the design. Since A₁ = B₁B₁′ and A₂ = B₂B₂′ commute, there exists an orthogonal matrix P which simultaneously diagonalizes A₁ and A₂. Further, the first row of P may be chosen to be (1/√(rs))1′_rs. Making the transformation

z = Pȳ    (2.5)

results in Thomsen's "semicanonical" form.
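Thomsen's diagonalization step can be illustrated numerically. The sketch below (numpy; not code from the dissertation) builds A₁ = B₁B₁′ and A₂ = B₂B₂′ for a full r×s layout, verifies that they commute, and simultaneously diagonalizes them with an orthogonal P whose first row is (1/√(rs))1′_rs. Taking P as the Kronecker product of two Helmert-type matrices is one convenient choice; Thomsen does not prescribe a particular construction.

```python
import numpy as np

def helmert(m):
    # Orthogonal m x m Helmert-type matrix; its first row is (1/sqrt(m)) 1'.
    H = np.zeros((m, m))
    H[0] = 1.0 / np.sqrt(m)
    for i in range(1, m):
        H[i, :i] = 1.0 / np.sqrt(i * (i + 1))
        H[i, i] = -i / np.sqrt(i * (i + 1))
    return H

r, s = 3, 4                                   # a full r x s layout, no missing cells
A1 = np.kron(np.eye(r), np.ones((s, s)))      # B1 B1' with B1 = I_r (x) 1_s
A2 = np.kron(np.ones((r, r)), np.eye(s))      # B2 B2' with B2 = 1_r (x) I_s
assert np.allclose(A1 @ A2, A2 @ A1)          # A1 and A2 commute

P = np.kron(helmert(r), helmert(s))           # orthogonal; first row (1/sqrt(rs)) 1'
D1, D2 = P @ A1 @ P.T, P @ A2 @ P.T
assert np.allclose(D1, np.diag(np.diag(D1)))  # both simultaneously diagonalized
assert np.allclose(D2, np.diag(np.diag(D2)))
assert np.allclose(P[0], 1.0 / np.sqrt(r * s))
```

The Kronecker structure works here precisely because the layout is complete; the point of Section 3.1 below is that this device fails once cells are missing.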
Thomsen used the last rs−1 elements of z to derive exact tests for σ²_αβ (the variance component associated with the interaction effect) and for σ²_α and σ²_β (the main effects variance components). However, as in Spjøtvoll (1968), the development of the latter two tests rested on the assumption that σ²_αβ = 0. Also, Thomsen assumed the layout contained no missing cells. (In a later section he dropped this assumption, but still assumed the design was connected.)

In 1983, Seely and El-Bassiouni returned to Wald's variance component test. In their paper, necessary and sufficient conditions were given under which the Wald test can be used in mixed models. They also developed a uniqueness property of Wald's test. Thus, based solely on matrix ranks, one can determine if a particular variance component test is a Wald's test. They applied their notion to the tests developed by Spjøtvoll (1968) and Thomsen (1975). They concluded that the tests derived by Thomsen are indeed Wald's tests, while only the test for interaction developed by Spjøtvoll is a Wald's test.

Although both Spjøtvoll and Thomsen needed to assume additivity in their development of exact tests for the main effects variance components in an unbalanced random two-way model, Khuri and Littell (1987) demonstrated that this assumption was unnecessary. Initially, their work proceeded similarly to Thomsen (1975). However, they also made a particular transformation that reduced the analysis of the unbalanced model to that of a balanced one. Their idea was based on a scheme for resampling from the error vector. After making this transformation they were left with a model of the form (see Khuri and Littell (1987, p. 548)),

w = P₁B₁α + P₁B₂β + ε*    (2.6)

where α ~ N(0, σ²_α I_r); β ~ N(0, σ²_β I_s); ε* ~ N(0, δI_{rs−1}), where δ = σ²_αβ + λ_max σ²_ε and λ_max is the largest eigenvalue of the matrix P₁KP₁′, with K = diag(n₁₁⁻¹, n₁₂⁻¹, ..., n_rs⁻¹) and P₁ consisting of the last rs−1 rows of P (see (2.5)). Furthermore, α, β, and ε* are mutually independent.
B₁ and B₂ are defined in (2.4). After partitioning w into w = (w_α′, w_β′, w_αβ′)′, Khuri and Littell developed exact tests for σ²_α and σ²_β based on (w_α′, w_αβ′)′ and (w_β′, w_αβ′)′, respectively. In particular, their test statistic for testing H₀: σ²_α = 0 is given by

F = [w_α′w_α/(r−1)] / [w_αβ′w_αβ/((r−1)(s−1))].    (2.7)

It should be noted that

rank(P₁B₁ : P₁B₂) = r + s − 2
rank(P₁B₁) = r − 1
rank(P₁B₂) = s − 1.

Consequently, according to Seely and El-Bassiouni (1983), the tests derived by Khuri and Littell (1987) for the main effects variance components are Wald's tests based on the w model in (2.6). One attractive feature of a Wald's test is that when applied in an unbalanced one-way random-effects model, it may be interpreted as being "almost equal to the most powerful invariant tests against large alternatives" (Spjøtvoll (1967, p. 422)). This result is also hinted at in Mostafa (1967). Also, under certain conditions, Mathew and Sinha (1988) show that the Wald test for the variance component associated with the random effect in the unbalanced two-way mixed model without interaction is the uniformly most powerful invariant test under a certain group of transformations. This is more fully discussed in Chapter four.

Besides the above-mentioned results, other exact procedures do exist in the literature. Khuri (1987) developed an exact test for the nesting effect's variance component in an unbalanced random 2-fold nested model. The idea of resampling from the error vector was also utilized in that paper. Verdooren (1988) also analyzed the unbalanced random 2-fold nested model. He was able to derive an exact test for ρ₁ for each given value of ρ₂. Here ρ₁ is the ratio of the nesting effect's variance component to the error variance component, and ρ₂ is the ratio of the nested effect's variance component to the error variance component. Exact tests for the random and fixed effects in some unbalanced mixed models were derived in Gallo (1987) and Gallo and Khuri (1990). Hocking (1988) also considered the mixed linear model.
Although he was more interested in estimation than in testing, he did note that his estimators could be used to develop tests of hypotheses for designs with moderate imbalance. These tests, however, are not exact. We mention them only because they may represent reasonable alternatives to the methodology presented in Section three of Chapter three.

Turning to more general models, Khuri (1990) developed exact tests for random models with unequal cell frequencies in the last stage. These designs include all cross-classification random-effects models with no missing cells. Utilizing resampling, Khuri was able to construct a set of mutually independent sums of squares that are distributed as scaled chi-squared variates. These sums of squares were then used to test hypotheses concerning the variance components. It should be noted that not all variance components can be tested exactly using Khuri's procedure. This should not be too surprising, however, since this phenomenon occurs even for some balanced random (or mixed) models (cf. Kendall and Stuart (1968), p. 83).

Using a different approach, Öfversten (1989) was also able to construct exact tests for the variance components in general mixed linear models. His idea was to use a result found in Allen (1974) to orthogonally transform the model matrices into so-called layer triangular form. The advantage of this type of transformation is that, when combined with orthogonal row permutations, it can be used to reduce matrices to ones with full row rank. This technique is repeatedly utilized in Öfversten's work to produce exact tests for the variance components in certain mixed models. However, when certain rank conditions are not satisfied this approach cannot be used to produce the exact procedures. When this occurs, Öfversten relies on resampling to develop his exact tests. Öfversten's methodology can be used in any unbalanced mixed model to construct tests for the variance components.
However, it is not altogether clear how to adapt his methods to designs not examined in his dissertation. More is said about Öfversten's results in Chapter four.

Partial motivation behind the need to develop exact testing procedures can be found in Tietjen (1974) and Cummings and Gaylor (1974). Both of these papers considered the random 2-fold nested model. In his simulation study, Tietjen showed that the approximate test based on Satterthwaite's (1946) procedure for testing the nesting effect's variance component can be markedly off in approximating the true p-value. This led him to recommend the use of the conventional F-test (that is, the test used when the data are balanced) as an approximate test for this hypothesis. Cummings and Gaylor (1974) considered various 2-fold nested designs in their simulations. These designs were specifically chosen so that the mean squares involved in the approximate (Satterthwaite) F-test were either dependent, failed to be distributed as scaled chi-squared variates, or both. They demonstrated that when the mean squares were dependent, the estimated test size was less than the stated test size, and thus the approximate test was conservative. When the mean squares were independent but did not have chi-squared-type distributions, the opposite occurred: the estimated test size was greater than the stated test size. When the mean squares were both dependent and not distributed as scaled chi-squared variates, a "counterbalancing" effect was observed. The approximate F-test in this case seemed to perform well. However, this counteracting effect may not occur in higher-level designs. As Cummings and Gaylor (1974, p. 771) state, "there is a possibility the test size disturbances due to mean square dependence and non-chi-squaredness will provide some reinforcement."

In the next chapter we extend the work of Khuri (1987, 1990), Khuri and Littell (1987), and Gallo and Khuri (1990).
The idea of resampling from the error vector is vital in our development and is used repeatedly throughout this dissertation. The reader is referred to Appendix C, where the utility of this approach is illustrated.

CHAPTER THREE
EXACT TESTING PROCEDURES

3.1. The Unbalanced Random Two-Way Model with Some Cells Missing

3.1.1. Introduction

In this section we will consider the unbalanced random two-way model with some cells missing. Specifically, we will concentrate on the following model:

y_ijk = μ + α_i + β_j + (αβ)_ij + ε_ijk    (3.1)

((i,j) ∈ 𝒴; k = 1, 2, ..., n_ij), where μ is an unknown constant parameter; α_i, β_j, (αβ)_ij, and ε_ijk are independently distributed normal random variables with zero means and variances σ²_α, σ²_β, σ²_αβ, and σ²_ε, respectively; 𝒴 is the set of subscripts, (i,j), which label the nonempty cells of (3.1). That is,

𝒴 = {(i,j): n_ij ≥ 1}    (3.2)

where r is the number of rows and s the number of columns of the design. Note that 𝒴 is a proper subset of I×J, where I = {i: i = 1, 2, ..., r}; J = {j: j = 1, 2, ..., s}; and × denotes the Cartesian product.

In matrix form, (3.1) becomes

y = μ1_n + X₁α + X₂β + X₃(αβ) + ε    (3.3)

where y is the vector of observations of dimension n = Σ_{(i,j)∈𝒴} n_ij; 1_n is a vector of ones of dimension n; X₁, X₂, and X₃ are known matrices of zeros and ones of orders n×r, n×s, and n×(rs−p), respectively, where p is the number of empty cells; α, β, and (αβ) are column vectors whose elements are the α_i, β_j, and (αβ)_ij, respectively ((i,j) ∈ 𝒴); ε is the n×1 vector of errors. Define

ȳ_ij = (1/n_ij) Σ_{k=1}^{n_ij} y_ijk    (3.4)

for all (i,j) ∈ 𝒴. From (3.1) we obtain

ȳ_ij = μ + α_i + β_j + (αβ)_ij + ε̄_ij    (3.5)

((i,j) ∈ 𝒴), where

ε̄_ij = (1/n_ij) Σ_{k=1}^{n_ij} ε_ijk    (3.6)

for all (i,j) ∈ 𝒴. In matrix form, (3.5) becomes

ȳ = μ1_{rs−p} + A₁α + A₂β + I_{rs−p}(αβ) + ε̄    (3.7)

where ȳ is the (rs−p)×1 vector of (filled) cell means; I_{rs−p} is the identity matrix of order rs−p; and A₁ and A₂ are known matrices of zeros and ones of orders (rs−p)×r and (rs−p)×s, respectively. A₁ and A₂ can be expressed as

A₁ = ⊕_{i=1}^{r} 1_{s−p_i}    (3.8)

A₂ = [(I_{s−p₁} : 0_{(s−p₁)×p₁})E₁; (I_{s−p₂} : 0_{(s−p₂)×p₂})E₂; ...; (I_{s−p_r} : 0_{(s−p_r)×p_r})E_r]    (3.9)

(the r blocks in (3.9) being stacked one above the other), where p_i is the number of empty cells in row i (i = 1, 2, ..., r); 0_{(s−p_i)×p_i} is a matrix of zeros of order (s−p_i)×p_i; and E_i is a product of interchange matrices (i = 1, 2, ..., r). We take (I_{s−p_i} : 0_{(s−p_i)×p_i}) = I_s when p_i = 0.

Example 3.1. Consider a two-way table with r = 3 rows and s = 4 columns, in which an "X" represents a filled cell: row 1 is complete, row 2 contains one empty cell, and row 3 contains two empty cells. Then p₁ = 0, p₂ = 1, p₃ = 2, and p = 3. Furthermore, E₁ = I₄ = E₃, while E₂ is the interchange matrix that moves the filled columns of row 2 ahead of its empty column. Consequently, A₁ = diag(1₄, 1₃, 1₂), and the rows of A₂ are the rows of I₄ corresponding to the columns of the filled cells, stacked in cell order.

Besides the usual assumptions concerning the random effects, we will also assume:

(i) the model used is model (3.1) (or model (3.3));
(ii) both r and s are at least two;
(iii) rank(A₁ : A₂) = r + s − 1;    (3.10)
(iv) p < (r−1)(s−1);
(v) n > 2(rs−p) − min(r,s).

Assumption (iii) implies that the design is connected (Gately (1962), p. 46), while the last assumption (along with the others) is needed to insure that the exact tests for the main effects' variance components are valid.

The inclusion of missing cells into any design adds a degree of difficulty to the analysis. The two-way random model is no exception. One problem is that the matrices B₁ = A₁A₁′ and B₂ = A₂A₂′ do not, in general, commute. B₁ and B₂ will commute if all cells are filled (see Thomsen (1975) or Khuri and Littell (1987)).

Reconsidering Example 3.1, we have B₁ = diag(J₄, J₃, J₂), where J_a = 1_a1′_a. For any two filled cells a and c, the (a,c)th element of B₁B₂ equals one exactly when the cell lying in the row of a and the column of c is filled; with empty cells present this condition need not hold symmetrically in a and c, so B₁B₂ is not symmetric. Thus B₁ and B₂ do not commute. Consequently no orthogonal matrix exists that will simultaneously diagonalize B₁ and B₂.

The next section lays the groundwork necessary for the construction of the exact tests.
3.1.2. Preliminary Development

Recall (3.7). From the distributional properties of the random effects, we can conclude that ȳ ~ N(μ1_{rs−p}, Σ), where

Σ = B₁σ²_α + B₂σ²_β + I_{rs−p}σ²_αβ + Kσ²_ε    (3.11)

K = ⊕_{(i,j)∈𝒴} n_ij⁻¹    (3.12)

B₁ = A₁A₁′;  B₂ = A₂A₂′.    (3.13)

The following lemma provides the ranks of B₁ and B₂.

Lemma 3.1
(a) rank(B₁) = r
(b) rank(B₂) = s

Proof:
(a) rank(B₁) = rank(A₁A₁′) = rank(A₁). Since (see (3.8)) A₁ = ⊕_{i=1}^{r} 1_{s−p_i}, we have rank(A₁) = Σ_{i=1}^{r} rank(1_{s−p_i}) = Σ_{i=1}^{r} (1) = r, and the result follows.
(b) Without loss of generality, we can assume that each column (and row) in the layout includes at least one cell containing observations. Thus each column in A₂ contains at least one "1". Hence there exists a permutation matrix Π such that ΠA₂ = [I_s′ : Φ′]′ for some (rs−p−s)×s matrix Φ. Consequently, rank(B₂) = rank(A₂A₂′) = rank(A₂) = rank(ΠA₂) = s. □

Now, let

u = H₁ȳ    (3.14)

where H₁ consists of the last rs−p−1 rows of the (rs−p)×(rs−p) Helmert matrix (see Searle (1982), p. 71). We note that H₁1_{rs−p} = 0_{rs−p−1}. Then it is easily verified that

E(u) = 0_{rs−p−1}    (3.15)
Var(u) = H₁B₁H₁′σ²_α + H₁B₂H₁′σ²_β + I_{rs−p−1}σ²_αβ + H₁KH₁′σ²_ε.

The purpose of this transformation is to construct a vector which has a zero mean vector. The ranks of H₁B₁H₁′ and H₁B₂H₁′ are examined in the next lemma.

Lemma 3.2
(a) rank(H₁B₁H₁′) = r − 1
(b) rank(H₁B₂H₁′) = s − 1
(c) rank(H₁B₁H₁′ + H₁B₂H₁′) = r + s − 2

Proof:
(a) rank(H₁B₁H₁′) = rank(H₁A₁A₁′H₁′) = rank(H₁A₁) (from Lemma 3.1(a)). However, A₁1_r = (⊕_{i=1}^{r} 1_{s−p_i})1_r = 1_{rs−p}. Thus H₁A₁1_r = H₁1_{rs−p} = 0_{rs−p−1}. Therefore rank(H₁B₁H₁′) ≤ r − 1. Conversely, by the Frobenius Inequality (see Lemma A.1), we have rank(H₁B₁H₁′) = rank(H₁A₁) ≥ rank(H₁) + rank(A₁) − rank(I_{rs−p}) = (rs−p−1) + r − (rs−p) = r − 1. Consequently, rank(H₁B₁H₁′) = r − 1.
(b) Similar to (a).
(c) rank(H₁B₁H₁′ + H₁B₂H₁′) ≤ rank(H₁B₁H₁′) + rank(H₁B₂H₁′) = r + s − 2 (from (a) and (b)). Conversely, the Frobenius Inequality gives us rank(H₁B₁H₁′ + H₁B₂H₁′) = rank(H₁(B₁ + B₂)H₁′) = rank(H₁(A₁A₁′ + A₂A₂′)H₁′) = rank(H₁[A₁ : A₂]) ≥ rank(H₁) + rank([A₁ : A₂]) − rank(I_{rs−p}) = (rs−p−1) + (r+s−1) − (rs−p) = r + s − 2, since, by assumption, rank([A₁ : A₂]) = r + s − 1 (see (3.10)). □

The next step in our analysis is to simultaneously diagonalize H₁B₁H₁′ and H₁B₂H₁′. Because B₁ and B₂ do not, in general, commute, neither will H₁B₁H₁′ and H₁B₂H₁′, so we cannot achieve simultaneous diagonalization by way of an orthogonal transformation. However, both H₁B₁H₁′ and H₁B₂H₁′ are n.n.d. Consequently a nonsingular matrix M, of order (rs−p−1)×(rs−p−1), exists that diagonalizes both H₁B₁H₁′ and H₁B₂H₁′ (see Lemma A.4). Define

Λ₁ = MH₁B₁H₁′M′    (3.16)
Λ₂ = MH₁B₂H₁′M′    (3.17)

Both Λ₁ and Λ₂ are diagonal matrices of order rs−p−1. By Corollary A.4.1 of Lemma A.4, we can express Λ₁ and Λ₂ as

Λ₁ = diag(I_{r−1}, 0_{s−1}, 0_{(r−1)(s−1)−p})    (3.18)
Λ₂ = diag(0_{r−1}, I_{s−1}, 0_{(r−1)(s−1)−p}).    (3.19)

Consider, then, the (rs−p−1)×1 vector v defined as
Also, let V1:3 be the (s(r1)p)xl vector defined as 27 (3.27) Yl 1 :3 = Y3 where y1 is (r1)xl while Y3 is ((rl(sl)p)xl. Then, from (3.21) we have E(y1:3) = 9 (3.28) Var(y1:3)= AlOa+FI:3Tap+GI:30f where r1 A1 = 0 0 0(r1)(s1)p FIl F113 F1:3= G1l Gl:3= G13 (3.29) (3.30) (3.31) G13 G33 From (3.22) notice that since both F and G are positive definite (p.d.), F1:3 and GI:3 are also both p.d. 28 Now, there exists a nonsingular matrix N or order s(rl)p such that (see Lemma A.3) NFI:3N' =Is(rl)p (3.32) and NAlN' = A, (3.33) where A1 is a diagonal matrix whose diagonal entries are solutions 0 to IA10FI:3 = 0. (3.34) We note that if F13 represents the first r1 rows and columns of F;3 then the nonzero solutions to (3.34) are exactly the eigenvalues of F1:3. Without loss of generality we may express A, in (3.33) as A= ] (3.35) 0 0 where A* is a diagonal matrix of order r1 whose diagonal entries are the eigenvalues of F:l3. (If A1 is not in this form then we can find an orthogonal matrix P such that A0 0 PAP'= 1 0 0 29 Thus, letting N* = PN we see that N*AIN*' = PNA1N'P' =PA1P' (see (3.33)) 0 0 The purpose of introducing the matrix N was to simultaneously diagonalize A1 and F1:3 in the expression for Var(yV:3) given in (3.28). Define the s(r1)p random vector x1:3 as (3.36) where Xi is (rl)xl and x3 is ((r1)(s1)p)xl. Then E(x1:3) =0 (3.37) Var(x:32) = A +2 2+LIr. var(l:3) =Alaa+Is(rl)_p a# +  X:3= = N = Ny1:3 13 Y3 30 This follows from (3.28), (3.32), and (3.33). L1 is the (s(rl)p) x(s(rl)p) full rank matrix L1 = NG: 3N'. (3.38) Ve can equivalently express Var(51:3) in (3.37) as Var(xl:3)=diag(A* +Ir1p I(r1)(s1)p a0)+1E. (3.39) Because L1 is not diagonal, x, and x3 are not independent. To achieve independence between x1 and x3, we resample from the error vector. The results of Lemma B.6 are utilized heavily in the forthcoming development. Utilizing resampling. 
Because model (3.1) or (3.3) is a special case of model (B.1) the error sum of squares, SSE, associated with this model can be expressed as SSE = y'Ry (3.40) where R is the nxn symmetric, idempotent matrix (3.41) R = In W(W'VW)W' where W= [In:Xi:X2:X3] (3.42) 31 and X1, X2, and X3 are described in (3.3). Since range(In)C range(X3) range(Xi)C range(X3) (i = 1,2), (3.41) can also be written as R= In X3(X )3 =In (i (Jn./nij). (i,j) EY ij (3.43) This last equality follows since X = e lnij (i,j)e3 TJ (3.44) Following Lemma B.6 (iv), there exists an orthogonal matrix C and a diagonal matrix F such that R= CrC'. (3.45) Partition C and r as C= [C1C2:C:C3 (3.46) 32 r = diag(I,1,,12,0) where T1= s(r1)p (3.47) 12 = n2(rs p) +s. We note that + 72= n (rsp) = rank(R) and 172>0 by (3.10)(v). C1,C2, and C3 are matrices or orders nxYr, nx72 and nx(rsp), respectively. From (3.46), R can be written as R = C1C+C +CA. (3.48) Now, consider the (s(r1)p)xl random vector W1:3 defined to be Y :3 = :3+ ( l,maxI i L1)2Cy (3.49) 1 where Al,max is the largest eigenvalue of L1 and (Al,maxI7qL1)2 is a symmetric matrix with eigenvalues equal to the square roots of the eigenvalues of (A, maxl 1L1), which are nonnegative. Let y1:3 be partitioned just like x1:3 in (3.36), that is 13= (1) (3.50) where wy and w3 are of orders r1 and (r1)(sl)p, respectively. 33 The major result of this section is given in the following theorem. Theorem 3.1 (a) E(w1)=O; E(Y3) = (b) W, and yw are independent, normally distributed random vectors and have the following variancecovariance matrices: Var(wl) = Aa+ 61Ir_1 (3.51) Var(w3) = 61i(r1)(s1)p where l = ap +Ai,max i. Proof: (a) From (3.49) we have 1 E(wl:3) = E(xl:3) + (Al,maxIz1 L1)2C'E(y) From (3.37) we know that E(x1:3)=0. Since Rln= (see Lemma B.6(ii)) and range(C1) Crange(R), we have that C1n=O. Thus the result follows after noting that E(y) =Pin (see (3.3)). 34 (b) Clearly y1 and w3 are normally distributed. From Lemma B.6(v) we have that y is independent of SSE. 
Consequently DOR =0 where D is such that Dy=y, and n = Var(y) = XXl, a + X2X'2 + X3XaZ0 + 2 EIn (3.52) (see (3.3)). However, since range(C1)Crange(R) it follows that DC1 = 0. Thus y is independent of C'y. Since x1:3 is a function of y, it must also be independent of C'y. Hence 1 1 Var(W:3) = Var(XZ1:3) + (A1,maxIgl L1)'C'1OC1(A1,maxIq7 Lj)!. (3.53) From Lemma B.6(ii), ROR=R2a2. Thus since range(C1)Crange(R) and R is idempotent we have C f2C1 = CCa = or 21 The last equality follows from the fact that C = [C:C2:C3] is orthogonal. Therefore, (3.53) can be expressed as 35 Var(xl:3) = Var(x1:3) + (Al,maxI1 L) = +2 'a + l 1 2 L1)a 2 =Ai +Is(r1)_pp a + ,(Al,maxI, l ) = A1o% + Is(rl1)p ap '1,max0) Consequently, from (3.35), Var(yw) = A Ia +61Ir_1 Var(w3) = 61I(r1)(s1)p where 61 = p + Amax' 0 As a result of Theorem 3.1 we obtain (i) SS1 = '(A*+r 2 1 (3.54) (ii) SS3= yW3/61~ xr1l)(sl)p (3.55) (iii) SSI and SS3 are independent. A test statistic for testing H0:Oi=0 vs. Ha:a:0O is then F YW1/(r1) 33/((r1)(s1)p) (3.56) 36 Under Ho, F1Frl,(rl)(sl)p* Note that <^)J= 12a + (3.57) where W1 is the average of the eigenvalues of F\:3 (see (3.35)). Since F1:3 (see (3.30)) is p.d. it follows that F13 is p.d. Thus Fl\1 is p.d., implying that 01>0. Consequently, we would reject Ho at the 7 level of significance if F1> F,r1,(r1)(s1)p where F,r1,(r1)(s)p is the upper 1007% point of the corresponding Fdistribution. We now turn our attention to the construction of an exact test concerning aa. 3.1.4 An Exact Test Concerning a2 Initial development. This section is concerned with the construction of an exact test for a~. Much of the development in this section parallels that of Section 3.1.4 except that we now focus on the subvectors y2 and v3 of v in (3.23). Let v2:3 be the (r(s1)p)xl random vector S(3.58) Y2:3 = (3.58) Y3 37 where v2 is (sl)xl and vy (recall) is ((r1)(s1)p)x1. 
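Before continuing with the test for σβ², it is worth isolating the covariance-whitening device used in (3.49), the same device that is used again in (3.67): adding an independent term with covariance (λ1,max I − L1)σε² turns the L1σε² part of the covariance into λ1,max σε² I. The sketch below uses a random symmetric p.d. stand-in for L1 and checks both the symmetric square root and the whitening identity.

```python
import numpy as np

rng = np.random.default_rng(3)

# Random symmetric p.d. stand-in for L1 = N G_{1:3} N' of (3.38).
Z = rng.standard_normal((6, 6))
L1 = Z @ Z.T
lam_max = np.linalg.eigvalsh(L1)[-1]

# Symmetric square root of the p.s.d. matrix lam_max*I - L1, as in (3.49):
# spectral decomposition, square roots of the (nonnegative) eigenvalues.
w, U = np.linalg.eigh(lam_max * np.eye(6) - L1)
S = U @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ U.T

# If Var(x) = L1*s2 and, independently, Var(e) = s2*I, then
# Var(x + S e) = L1*s2 + S S' s2 = lam_max*s2*I: the sigma_e^2 part of
# the covariance is whitened, which is what decouples w1 from w3.
s2 = 2.5
var_sum = L1 * s2 + S @ S.T * s2
```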
Similarly to Section 3.1.3 we can make a nonsingular transformation, T, to obtain a random vector x2:3 = [x:x']'=Tv2:3 with E(x2:3) = (3.59) Var(x?2:3) = A2U + I(s1)ppr+L 2 where A2=diag(Ah, 0) and Ah is a diagonal matrix of order s1 whose diagonal entries are the eigenvalues of F3. F2:3 represents the first s1 rows and columns of F3 where F22 F23 F2:3 = (3.60) F'23 F33 (see (3.25)). Also, L2 is the (r(s1)p)x(r(s1)p) full rank matrix L2 = TG2:3T'. (3.61) T is the nonsingular matrix that simultaneously diagonalizes F2:3 and A2 where A2 1= (3.62) 0 0(r1)(s1)p (see(3.24) ; and G2:3 is defined to be 38 G22 G22 G2:3 = (3.63) G23 G23 (see (3.26)). Ve note that x3, defined in x2:3 above, is not the same as x3 in (3.36) since a different nonsingular transformation was necessary to simultaneously diagonalize F2:3 and A2. Because x2 and x* are not independent (L2 is not diagonal (see (3.39))), we will again need to employ resampling to aid in the construction of the exact test. Utilizing resampling. Consider again the matrix R defined in (3.41) or (3.43). Unlike the partitioning of C and r described in (3.46), we now partition C and r as C= [C*:C*:C*] (3.64) rF=diag(I ,1I *,0) where l= r(s1)p (3.65) 72 = n2(rsp) +r. Note that q1 +1 =n(rsp)=rank(R) and >0 by (3.10)(v). Cl, C(, 39 and C* are matrices of orders nx i, nx 7, and nx(rsp), respectively. From (3.64), we can express R as R= C' + C2' (3.66) (see also (3.45)). Now define the (r(sl)p)x1 random vector 2.:3 as W2:3= 2:3+(A2,maxI L2)C'y (3.67) 1 where A2,max is the largest eigenvalue of L2 and (A2,maxI *L2)2 is a symmetric matrix with eigenvalues equal to the square roots of the eigenvalues of (A2,maxI *L2), which are nonnegative. Let 2:3 be partitioned just like x2:3, that is W2 "2:3= (3.68) 2:3 where y2 and w* are of orders s1 and (r1)(sl)p, respectively. Theorem 3.2 represents the major result of this section. 
Theorem 3.2

(a) E(w2) = 0; E(w3*) = 0.

(b) w2 and w3* are independent, normally distributed random vectors and have the following variance-covariance matrices:

Var(w2) = Λ2*σβ² + δ2 I_{s-1}
Var(w3*) = δ2 I_{(r-1)(s-1)-p} (3.69)

where δ2 = σαβ² + λ2,max σε².

Proof: The proof of Theorem 3.2 is very similar to the proof of Theorem 3.1 and is thus omitted. □

As a result of Theorem 3.2 we obtain

(i) SS2* = w2'(Λ2*σβ² + δ2 I_{s-1})⁻¹w2 ~ χ²_{s-1} (3.70)

(ii) SS3* = w3*'w3*/δ2 ~ χ²_{(r-1)(s-1)-p} (3.71)

(iii) SS2* and SS3* are independent.

A test statistic for testing H0: σβ² = 0 vs. Ha: σβ² ≠ 0 is then

F2 = [w2'w2/(s−1)] / [w3*'w3*/((r−1)(s−1)−p)]. (3.72)

Under H0, F2 ~ F_{s-1,(r-1)(s-1)-p}. Note that

E[w2'w2/(s−1)] = θ2σβ² + δ2 (3.73)

where θ2 is the average of the eigenvalues of F^{2:3}, the matrix formed from the first s−1 rows and columns of F2:3⁻¹. Since F2:3 (see (3.60)) is p.d. it follows that F2:3⁻¹ is p.d. Thus F^{2:3} is p.d., implying that θ2 > 0. Consequently, we would reject H0 at the γ level of significance if F2 > F_{γ,s-1,(r-1)(s-1)-p}.

A test for the variance component associated with the interaction effect is developed in the next section.

3.1.5 An Exact Test Concerning σαβ²

In this section we develop an exact test for σαβ² by using v3 (see (3.23)). We note at the outset that this test is equivalent to Thomsen's (1975) test. From (3.24)-(3.26) we have

E(v3) = 0
Var(v3) = F33σαβ² + G33σε². (3.74)

Since v is a function of ȳ (see (3.14) and (3.20)) and ȳ is independent of SSE (Lemma B.6(v)), v is also independent of SSE. Thus v3 is independent of SSE, as it is a subvector of v. Also, it clearly follows from (3.74) that

v3'(F33σαβ² + G33σε²)⁻¹v3 ~ χ²_{(r-1)(s-1)-p}. (3.75)

Because SSE/σε² ~ χ²_{n-(rs-p)}, a test statistic for testing H0: σαβ² = 0 vs. Ha: σαβ² ≠ 0 is

F3 = [v3'G33⁻¹v3/((r−1)(s−1)−p)] / [SSE/(n−(rs−p))]. (3.76)

Under H0, F3 ~ F_{(r-1)(s-1)-p, n-(rs-p)}. Note that

E[v3'G33⁻¹v3/((r−1)(s−1)−p)] = θ3σαβ² + σε² (3.77)

where θ3 is the average of the eigenvalues of G33⁻¹F33. Since both G33⁻¹ and F33 are p.d., θ3 > 0 (see Graybill (1983, p. 397)). Hence we would reject H0 for large values of F3. In this case, we did not need to resample to construct the exact test.
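The null behavior of F3 in (3.76) can be checked by simulation: under H0 the statistic is F-distributed, so its rejection rate at level 0.05 should be close to 5%. The sketch below uses hypothetical degrees of freedom and a random positive definite stand-in for G33; none of these values come from the text.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(4)

# Hypothetical dimensions: df3 = (r-1)(s-1)-p interaction df, dfe error df.
df3, dfe, s2 = 9, 22, 1.7
Z = rng.standard_normal((df3, df3))
G33 = Z @ Z.T + df3 * np.eye(df3)      # stand-in p.d. matrix for G33
G33_half = np.linalg.cholesky(G33)
G33_inv = np.linalg.inv(G33)

crit = f.ppf(0.95, df3, dfe)
n_sim, reject = 20000, 0
for _ in range(n_sim):
    # Under H0 (sigma_ab^2 = 0): v3 ~ N(0, G33*s2), and SSE/s2 ~ chi2(dfe).
    v3 = G33_half @ rng.standard_normal(df3) * np.sqrt(s2)
    sse = s2 * rng.chisquare(dfe)
    F3 = (v3 @ G33_inv @ v3 / df3) / (sse / dfe)
    reject += F3 > crit
rate = reject / n_sim
```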
3.1.6 A Numerical Example

An automobile manufacturer wished to study the effects of differences between drivers and differences between cars on gasoline consumption. Four drivers were selected at random; also, five cars of the same model with manual transmission were randomly selected from the assembly line. Each driver drove each car a maximum of three times over a 40-mile test course and the miles per gallon were recorded. For technical reasons the drivers did not drive the cars the same number of times, and unforeseen mechanical problems forced the cancellation of some of the test drives. The data are recorded in Table 3.1.

We first test for a significant interaction effect using (3.76). We find that

F3 = 0.7167, numerator d.f. = 9, denominator d.f. = 22, p-value = 0.6885.

Thus there does not appear to be a significant interaction effect. The results concerning the main-effects variance components are now presented. For testing H0: σα² = 0, regarding the driver variance component, we use (3.56). For these data we find that

F1 = 261.1971, numerator d.f. = 3, denominator d.f. = 9, p-value < 0.0001.

The test for the car variance component resulted in (see (3.72))

F2 = 71.2372, numerator d.f. = 4, denominator d.f. = 9, p-value < 0.0001.

All calculations were performed using S-plus.

Table 3.1  An example of an unbalanced random two-way model with missing cells (source: Dr. André I. Khuri)

A(driver)                    B(car) 1-5
1   25.3 25.2   33.6 32.9   31.8   27.7 28.5
4   29.2 29.3
2   28.1 28.9   30.0   36.7 36.5
3   24.8 25.1   31.7 31.9   30.7 30.4   32.4 32.4
4   28.4 27.9 26.7   35.6 35.0   29.7 30.2   31.5 31.8   30.7   33.7 33.9 32.9   29.2 28.9   30.3 29.9

3.1.7 The Power of the Exact Tests

The approximate power of the exact test concerning σα² is derived in this section. Similar expressions can also be developed for the tests for σβ² and σαβ². Our development is based on the results in Hirotsu (1979, pp. 578-579). The reader is referred to Appendix D. Let π1 denote the power of the test for σα².
Then from (3.56) 3/[(ris1) Fp] FT,r1,(r1)(s1)p Ha (3.78) where H:a~ 0. Under Ha, w1wl is distributed like a random variable of the form (see Box (1954b)) r1 T= E (j 61)j, (3.79) j=1 where the Xj's are independent chisquared variates with one degree of freedom, and A is the jth diagonal element of A* (j=1,2,,r1); A* is defined in (3.35) and 61 in (3.51). The exact distribution of T is complicated. To obtain approximate power values let MSz3= V3*3 (3.80) 3= (r1)(sl)p H = yl/cf (3.81) MS3/1b 46 h = cf 7,rl,(r1)(s1)p where in this case r1 S(j+ 1)2 c j=1 (3.83) c=1 r1 E (j.+ 1) j=l Fr1 f = L (3.84) (A+1)2 j=1 2 2 = c (3.85) =1 02 2 1 a p A1, maxe Then L = P(H > hHa) P(Fff2 h) +[p/{3(f +2)(f +4)B(lf,f2)}] x(1 fh(f + f2)/(fh) f x[( +)( +) 2(f +f2)(f +4) (f + f2 + 2)(f+f2) (3.86) x[(f+2)(f+4) + 2 S1 + f/(fh) {1 + f/(fh)}2 where (3.87) f = (r1)(s1)p 47 [r1 )r1 3 p + 1= 1) I (3.88) "r1 +)2 j=1 and B(ml,m2) is the beta function. We note that from (3.86) the power of the exact test depends on the design parameters Al,max',l*,' r the level of significance; and the ratios a /a. and ,a2 of the variance components. 3.1.8 Concluding Remarks The key to the construction of the exact tests for the main effects' variance components were the random vectors W1:3 and W2:3 in (3.49) and (3.67), respectively. Both of these vectors represent a linear combination between a function of y and a portion of the error vector. Ve note that C, in (3.49) and C* in (3.67) are not unique. Thus the values of the test statistics F1 and F, (see (3.56) and (3.72), respectively) will depend on the choice for C1 and C*, respectively. However, the distributions of W1:3 and Y2:3 are invariant to these choices. It is in this sense that our methodology is similar to techniques involving resampling. We also presented an exact test for the variance component associated with the interaction effect. This test is similar to Thomsen's (1975) procedure. 
Seely and El-Bassiouni (1983) point out that Thomsen's (1975) test for σαβ² is a Wald's test. Since our test is based on the same degrees of freedom as Thomsen's test, we conclude that our test for σαβ² is also a Wald's test. Consequently, F3 in (3.76) can be equivalently expressed as

F3 = [R((αβ)|μ,α,β)/((r−1)(s−1)−p)] / [SSE/(n−(rs−p))] (3.89)

where R((αβ)|μ,α,β) represents the sum of squares for (αβ) adjusted for μ, α, and β. Here, α and β are treated as if they were fixed effects (see Seely and El-Bassiouni (1983, p. 198)).

Consider again w1:3 defined in (3.49). It is easy to verify that w1:3 can be written as

w1:3 = Λ^{1/2}α* + ξ (3.90)

where

Λ = [Λ1* 0; 0 0] (3.91)

(see (3.35)), and

ξ ~ N(0, δ1 I_{s(r-1)-p}). (3.92)

δ1 is defined in (3.51). Now, we have

rank(Λ^{1/2}) = r−1

and (using notation similar to that of Seely and El-Bassiouni),

f = s(r−1)−p−(r−1) = (r−1)(s−1)−p
k_W = r−1.

Thus, the test statistic F1 is a Wald's test based on the w1:3 model. Likewise, we can also show that F2 is a Wald's test based on the w2:3 model.

3.2 The Unbalanced 3-Fold Nested Random Model

3.2.1 Introduction

The unbalanced 3-fold nested random model can be expressed as

y_ijkl = μ + αi + βij + γijk + ε_ijkl (3.93)

(i = 1,2,...,a; j = 1,2,...,bi; k = 1,2,...,cij; l = 1,2,...,nijk), where μ is an unknown constant parameter; αi, βij, and γijk are random effects associated with the first-stage, second-stage, and third-stage nested factors, respectively; and ε_ijkl is the random error component. We assume that αi, βij, γijk, and ε_ijkl are independently distributed normal random variables with zero means and variances σα², σβ², σγ², and σε², respectively. We assume there are no missing cells. If n is the total number of observations and c and b are the total numbers of levels of the third- and second-stage nested factors, respectively, then

n = Σ_{i,j,k} nijk
c = Σ_{i,j} cij (3.94)
b = Σ_i bi.

We are concerned with constructing exact tests for σα² and σβ². We again need to use resampling to achieve our goal.
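A quick way to sanity-check test procedures for model (3.93) is Monte Carlo simulation. The sketch below simulates the model under H0: σγ² = 0 and verifies that the classical exact F test for σγ², which compares R(γ|μ,α,β) with SSE as recalled in this section, holds its nominal 5% level. The layout, variance values, and simulation settings are all hypothetical.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(5)

# Hypothetical unbalanced nested layout: (i, j, k) cell labels, 2-3 replicates.
cells = [(i, j, k) for i in range(3) for j in range(2 + i % 2) for k in range(2)]
reps = {ijk: 2 + (sum(ijk) % 2) for ijk in cells}
n = sum(reps.values())
c = len(cells)
b = len({ijk[:2] for ijk in cells})

def f_gamma(y_by_cell):
    # Classical exact test for sigma_gamma^2: R(gamma|mu,alpha,beta) is the
    # drop in SSE from second-stage cell means to third-stage cell means.
    sse_full = sum(((v - v.mean()) ** 2).sum() for v in y_by_cell.values())
    groups = {}
    for ijk, v in y_by_cell.items():
        groups.setdefault(ijk[:2], []).append(v)
    sse_red = sum(((np.concatenate(vs) - np.concatenate(vs).mean()) ** 2).sum()
                  for vs in groups.values())
    r_gamma = sse_red - sse_full
    return (r_gamma / (c - b)) / (sse_full / (n - c))

# Under H0 (sigma_gamma^2 = 0) the rejection rate should be close to 5%,
# whatever the values of the alpha and beta variance components.
crit = f.ppf(0.95, c - b, n - c)
reject = 0
for _ in range(4000):
    beta_eff = {ij: rng.normal(0, 1.0) for ij in {ijk[:2] for ijk in cells}}
    y_by_cell = {ijk: beta_eff[ijk[:2]] + rng.normal(0, 0.5, reps[ijk])
                 for ijk in cells}
    reject += f_gamma(y_by_cell) > crit
rate = reject / 4000
```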
The results given in Seely and El-Bassiouni (1983) will allow us to conclude that our tests are Wald's tests.

We note that an exact test for σγ² is known to exist based on comparing R(γ|μ,α,β) to SSE, where R(γ|μ,α,β) is the sum of squares for γ adjusted for μ, α, and β (treating α, β, and γ as fixed effects), and SSE is the error sum of squares. That is, to test H0: σγ² = 0 vs. Ha: σγ² ≠ 0 we would reject H0 for large values of Fγ, where

Fγ = [R(γ|μ,α,β)/(c−b)] / [SSE/(n−c)].

Under H0, Fγ ~ F_{c-b,n-c} (see, e.g., Searle (1971)). Resampling was not needed to develop this test.

Now, in matrix notation, model (3.93) becomes

y = μ1n + X1α + X2β + X3γ + ε (3.95)

where y is the n×1 vector of observations; 1n is a vector of ones of dimension n; X1, X2, and X3 are known matrices of ones and zeros (see (3.97)-(3.99)); α, β, and γ are random vectors whose elements are the αi, βij, and γijk, respectively; and ε is the n×1 vector of errors. If ni.. and nij. are defined as

ni.. = Σ_{j,k} nijk   (i = 1,2,...,a)
nij. = Σ_k nijk   (i = 1,2,...,a; j = 1,2,...,bi), (3.96)

then

X1 = ⊕_{i=1}^{a} 1_{ni..} (3.97)

X2 = ⊕_{i=1}^{a} ⊕_{j=1}^{bi} 1_{nij.} (3.98)

X3 = ⊕_{i=1}^{a} ⊕_{j=1}^{bi} ⊕_{k=1}^{cij} 1_{nijk}. (3.99)

From (3.95) we have that

y ~ N(μ1n, Ω)

where

Ω = (⊕_i J_{ni..})σα² + (⊕_{i,j} J_{nij.})σβ² + (⊕_{i,j,k} J_{nijk})σγ² + σε²In. (3.100)

Now, define ȳijk. as

ȳijk. = (1/nijk) Σ_{l=1}^{nijk} y_ijkl   (i = 1,2,...,a; j = 1,2,...,bi; k = 1,2,...,cij). (3.101)

Then from (3.93) we have

ȳijk. = μ + αi + βij + γijk + ε̄ijk. (3.102)

(i = 1,2,...,a; j = 1,2,...,bi; k = 1,2,...,cij), where

ε̄ijk. = (1/nijk) Σ_{l=1}^{nijk} ε_ijkl.

In matrix form, (3.102) becomes

ȳ = μ1c + A1α + A2β + Icγ + ε̄ (3.103)

where ȳ is the c×1 vector of cell means; ε̄ is the c×1 vector ε̄ = (ε̄111., ε̄112., ..., ε̄_{a,ba,c_{a,ba}}.)'; and A1 and A2 are defined as

A1 = ⊕_{i=1}^{a} 1_{ci.} (3.104)

A2 = ⊕_{i=1}^{a} ⊕_{j=1}^{bi} 1_{cij} (3.105)

where

ci. = Σ_j cij   (i = 1,2,...,a). (3.106)

We note that range(1c) ⊆ range(A1) ⊆ range(A2). In fact,

1c = A1 1a = A2 1b
A1 = A2(⊕_{i=1}^{a} 1_{bi}). (3.107)

From (3.103) we have

ȳ ~ N(μ1c, Σ) (3.108)

where

Σ = Ā1σα² + Ā2σβ² + Icσγ² + Kσε² (3.109)

and

Ā1 = A1A1' = ⊕_{i=1}^{a} J_{ci.} (3.110)

Ā2 = A2A2' = ⊕_{i=1}^{a} ⊕_{j=1}^{bi} J_{cij}

K = ⊕_{i=1}^{a} ⊕_{j=1}^{bi} ⊕_{k=1}^{cij} (1/nijk). (3.111)

Besides the assumptions concerning the random effects, we will also assume:

(i) The model used is model (3.93) (or model (3.95));
(ii) b − a > 0 (b is defined in (3.94)); (3.112)
(iii) n > 2c − 1 (n and c are defined in (3.94));
(iv) c > 2b − 1.

The last two assumptions (along with the others) are needed to ensure the validity of our procedures. Neither is overly restrictive. Assumption (iii) can be satisfied if, for example, nijk ≥ 2 for all (i,j,k), while Assumption (iv) will hold if, for example, cij ≥ 2 for all (i,j).

3.2.2 Preliminary Development

In this section we use the last c−1 rows of the c×c Helmert matrix (see Searle (1982), p. 71) to transform model (3.103) into one that has a zero mean vector. We also present some results concerning the ranks of the model matrices in the transformed model.

Let H1 be the (c−1)×c matrix resulting from the deletion of the first row of the c×c Helmert matrix. H1 has orthonormal rows and is such that H1 1c = 0_{c-1}. Define the (c−1)×1 vector u as

u = H1ȳ. (3.113)

Then,

E(u) = H1E(ȳ) = μH1 1c = 0_{c-1}
Var(u) = H1Ā1H1'σα² + H1Ā2H1'σβ² + I_{c-1}σγ² + Lσε² (3.114)

where

L = H1KH1'. (3.115)

Recalling (3.104) and (3.105), it is easily seen that

rank(A1) = a
rank(A2) = b. (3.116)

The next lemma provides information on the ranks of H1Ā1H1' and H1Ā2H1'.
(3.110) i=l j=l 1i a bi Cij 1 K= $ Enik. (3.111) i=1 j=l k=1 Besides the assumptions concerning the random effects, we will also assume: (i) The model used is model (3.93) (or model (3.95)) (ii) ba>0 (b is defined in (3.94)) (3.112) (iii) n>2c1 (n and c are defined in (3.94)) (iv) c > 2b1. The last two assumptions (along with the others) are needed to insure the validity of our procedures. Neither is overly restrictive. Assumption (iii) can be satisfied if, for example, nijk>2 for all (i,j,k) while Assumption (iv) will hold if, for example, cij22 for all (i,j). 3.2.2. Preliminary Development In this section we use the last c1 rows of the cxc Helmert matrix (see Searle (1982), p. 71) to transform model (3.103) into one that has a zero mean vector. We also present some results concerning the ranks of the model matrices in the transformed model. 55 Let HI be the (c1)xc matrix resulting from the deletion of the first row of the cxc Helmert matrix. HI has orthonormal rows and is such that Hi1c=0cl. Define the (cl)xl vector u as U = Hly. (3.113) Then, E(u) = HIE(y) = pHl1c = Qc_1 Var(u) = H1A1Hlo + HlA2H O + fl c1 + L (3.114) where L = H1KH . (3.115) Recalling (3.104) and (3.105), it is easily seen that rank(A,) = a (3.116) rank(A2) = b. The next lemma provides information on the ranks of HIA1H' and H1A2H . 56 Lemma 3.3 (a) rank(HI iH') =al (b) rank(Hix2H') =b1 (c) rank(HliHA+H liH') = b 1. Proof: (a) rank(HA1H,') = rank(H1AA'H') = rank(HIA1) < rank(A1) = a. (see (3.116)) However, from (3.107), c =A1a. Since Ic = Qc1 it follows that the a columns of HIA1 are linearly dependent. Thus rank(H1A1) Frobenius Inequality (see Lemma A.1), we have rank(H11Hi) = rank(H1Aj) > rank(H1) +rank(A1) rank(Ic) 57 =cl+ac=a1. Thus, rank(H1A1Hi ) = a1. (b) Similar to (a). (c) Since range(A1)Crange(A2) we have that range(HiA1)Crange(H1A2). Consequently, range (HA1AAH') = range(H1iA) C range(H1A2) = range(HA2An H) . Thus there exists a matrix B such that HiAIH' = (HiA2H))B. 
(Since both H1A1HA and H1iA2H are n.n.d. B must be n.n.d.) Hence, rank(H1A1Hl + H1A2H) = rank[(HHA2H) B + HiA21H = rank[Hi2HA (B+I)] = rank(H1A2HI) (B+Ihas full rank) (see (b)). =b1 58 3.2.3 An Exact Test Concerning a2 Introduction. In this section we construct an exact test for the variance component associated with the secondstage nested factor. The idea is to first make a transformation of model (3.113) based on resampling from the error vector. This transformation will result in a model that is a special case of the model described in Seely and ElBassiouni (1983, p. 198). We will then apply the results in Seely and ElBassiouni's Equation 3.2 to develop the exact procedure. Utilizing resampling. Recall model (3.95). Let W be the partitioned matrix W=[1in:Xi:X2:X3] (3.117) and let R be the matrix R = I W(W'W)'. (3.118) Since range(1n) Crange(X) C range(X2)C range(X3), (3.118) can be expressed as R = In X3(XX3)x. 3 (3.119) 59 The error sum of squares associated with model (3.95) is given by SSE=y'Ry. (3.120) It is known that R is symmetric, idempotent, and has rank nc. Furthermore, SSE/a. has the chisquared distribution with nc degrees of freedom independently of y (see, e.g., Lemma B.6). We can express R as R = CrC' (3.121) where C is an orthogonal matrix and r is a diagonal matrix whose first nc diagonal elements are equal to unity and the remaining c entries are equal to zero. By assumption (3.112)(iii), we can partition C and r as C = [C1:C2:C3] (3.122) r =diag(I1,I,2,0) where I =c1 (3.123) '12 = n2c+l > 0 60 and C1, C2, and C3 are matrices of orders nxI1, nxy2, and nxc, respectively. Note that 71+72 = nc = rank(R). From these partitionings and the fact that (recall C is orthogonal) CiCi =I (i =1,2,3) (3.124) C'Cj =0 (i# j), we obtain R= CC' + CC'. (3.125) Now, define the (c1)xl random vector w as 1 L= +(AmaxI,1 L)2C'y (3.126) where Amax is the largest eigenvalue of L (see (3.115)) and C' is the nx~l matrix defined in (3.122). The matrix (AmaxI 1 L) is p.s.d. 
Hence (AmaxI 1L)2 is welldefined with eigenvalues equal to the square roots of the eigenvalues of (AmaxII1L), which are nonnegative. The first major result of this section is given in the next theorem which provides some distributional properties for w. 61 Theorem 3.3 The random vector w is normally distributed with E(y)= ci (3.127) Var(y) = HiAH, 2 + H1A2H+2 6Ic_ where 6 = + Amax27 Proof: The fact that w is normally distributed is clear. We know that E(u)=O (see (3.114)). From Lemma B.6 (ii) we have that Rln=O. Since range(C1)C range(R), it follows that C(1n=Q (recall that R is symmetric). Hence E(w) = E(u) + (Amaxl1 L)2CE(y) 1 = O+ (AmaxI1 L)Yi (ln)p =0. Now, we claim that u and C y are independent in the expression given for y in (3.126). To verify this recall that SSE is independent of y 62 (see Lemma (B.6)(v)). Consequently, DOR = 0 where D is such that Dy=y, and 0 is defined in (3.100). Since range(C1) Crange(R) it follows that DQC1 = 0. Thus y is independent of C'y. Since u is a function of y, u is also independent of C'y as claimed. Hence, 1 1 Var(w) = Var(u)+(Amax1 L)2CQCi (AmaxI771 L)2. (3.128) From Lemma B.6(ii), RR=R2a2. Thus since range(C1)C range(R) and R is idempotent we have C, CI = CcClo2 = a11. The last equality follows from (3.124). Therefore we can express (3.128) as Var(w) = Var(u) + (Amaxy L)O = HI1AHl + H"a Hll + Ica + La + (Amaxli1 L I1 1 2 C t77 63 = HIA1Ha ~ + HiA2H 61 + 61_1 where 6= + Amaxa O From Theorem 3.3 we note that model (3.126) can also be written as w=B B1+B2P+ (3.129) 1 where B1= HA1; B2 = HA2; and = HI7+H + (AmaxI1 L)2C'i ~N(O,b6Il) independently of a and /. This model in the form required by Seely and ElBassiouni (1983, p. 198). We are now in position to construct an exact test for testing Ho:uO O= vs. Ha:0 0O. The exact test. Let us start with some notation. SSE, will denote the error sum of squares associated with model (3.129). 
That is, SSE= w'(Ic1 [Bi:B2]([Bi:B2]'[B :B2])[B1:2]', (3.130) Also, let R,(Pla) denote the sum of squares for # adjusted for a (treating a and 6 as fixed) in model (3.129). Thus Rw( g) = '(I1 B(B'B)B) SSEw. (3.131) Since range(B1) Crange(B2), (3.130) and (3.131) can be equivalently expressed as 64 SSER = 'wl(I_ B2(B'B,)B R( a) = wy B2(B(B2)B B,1(B B1B'. (3.132) (3.133) The second and last major result of this section is given in the next theorem. Theorem 3.4 a) A ginverse of B'Bj = A'.HH1Aj is given by (A'.HH1A')= (A.Aj)1 (j=l,2). b) RV(PI Q)/6 _XbXa under Ho =0O c) SSEW/6 ~ x2 Xcb d) R.(flIg) is independent of SSEW Proof: a) Since HiH'i=Ic_1 and H1lc=0c_1. It follows that H'HI=Ic c where Jc= Ic'. Also recall that 1 = Ala = A2b (see (3.107)). Thus A'H'HlAj = A'(IcJc)Aj 65 = A'Aj A' JA = A'Aj A' AjJ A'jAj t a if j=1 t b if j=2. (j=1,2), (3.134) Therefore, A'HHA (A' jA )A'H'HA =(A'A A'jJtA'.Aj)(A'jAj)'(A'Aj A'jAjJ tAAj) =(It A' jAJt)(A'Aj A'AjJtA jA = A'A A' A J A' A + A'. A J A' A J A'. A =AjAjJJ it JA J c AjJ J AjJ J =A'Aj JAjJ +A A'JcA JA'Aj = A'. A AJcA' +A'.Jc cA =j A Cj C 2 A c JJ = A'.Aj C Si i i = Aj(Ic J)Aj = A' H'Aj. where 66 Although (b), (c), and (d) follow from Seely and ElBassiouni's (1983) Equation (3.2), we nevertheless prove these results for the sake of completeness. b) First note that (3.133) can be expressed as (3.135) w'(P2 P) where (j = 1,2). (3.136) We need to verify that (P2P1)(Var(w))/6 is idempotent of rank b a when = 0 (see Lemma (B.4)). Recall from Theorem 3.3 that Var(w) = H1A1AH la + H1A2AH + 6c1 Since Pj is idempotent of rank t1 (see part (a) of this theorem, Lemma (3.3), and (3.134)) and range(P1) Crange(P2) it follows that there exists a matrix 0 such that P1 = P20 ==0'P2. (by symmetry) Pj =HAj(A'Aj)1'A'H' 67 Thus P1P2= 0'PP, = OP2 = PI. Consequently, P2P1 is idempotent and has rank (b1) (a1) =ba. 
Now, since range(H1A1,H') C range(HIA2AgAH) = range(P2), we have (P1P2 = P1) =(I P ,) [HAA'Hoa + H1A2A He' + 6P2] = (I P1)H1A2AH'I +6(aIP1)P (3.137) = 6(P2 P1) (when a = 0). Formula (3.137) follows since range(H1A1A'H')=range(P1) and (IPI)PI=O0. Consequently, (P2P1)(var(y))/6 is idempotent of rank ba. Thus from Lemma B.4 we have R( l)/6~Xba under Ho:92 =0. c) SSEW in (3.132) can be expressed as (P2 P1)(Var()) = (I P)P2Var(yw) SSEW = w'(Ic1 P2)" 68 In the expression for Var(y), range(H1A1AH1) C range(HIA2AH') =range(P2). Thus since (IP2)P2=0 it follows that (IP2)(Var(w))/6=IP, and therefore has rank (c1)(b1)=cb. Thus from Lemma B.4, SSE/~ c_b d) To show that R(/PIa) is independent of SSEw we need to verify that (P2P )(Var(w))(Ic P2)=0 (see Lemma B.2). From (c) we have (Var(w))(IP2) = (I P2) Thus (P2P)(Var(w))= 6(P2P)(IP2) = 0. The last equality follows since range(P1)Crange(P2). Consequently R(/PI) and SSEW are independent. We can conclude from Theorem 3.4 that Rw(1q)/(ba) F SSE/(cb) bacb /3 SSE,/(cb) ba,cb (3.138) 69 under IHO:e2 =O. We would reject Ho at the v level of significance if Ff>Fv,ba,cb where Fu,ba,cb is the upper 100%l point of the corresponding F distribution. The fact that we reject HO for large values of our test statistic follows from the next lemma. Lemma 3.4 a) tr[(I P1)H1A2A'H'] >0 b) rank[(I P)HAAH] = ba, where (IP1)H1A2A2H1 is part of the expression for (P2P)(Var(w)) in (3.137). Proof: a) Both (IP1) and H1A2A H' are n.n.d. Thus tr[(IP1)H1A2A H] >0 (Graybill (1983, p. 397)). However, since range(P1) Crange(P2) =range(H1A2A''H) it follows that (IP1)H1A2AH 0 if it were zero then range(H1A2AAHl) would be contained in range(P1), a contradiction). Hence tr[(IP1)H1A2A H']>0 (see Graybill (1983, p. 397)). b) First note that P2HIA2A'H' = HIA2AHi. Thus, since P1P2 =P1 70 (I P1)H1A2A'H = (P2 PI)H1A2AH. Consequently, rank[ (I PI)HIA2AH'] = rank[ (P2 P )HiA2AH'l] < rank(P2Pi) = ba. 
Conversely, by the Frobenius Inequality (see Lemma A.1), rank[(I PI)HIA2A'Hl] > rank(I P1) + rank(H1A2A'H') rank(Ic_ ) = ca+b1 (c1) = ba. Hence, rank[(I P1)H1A2A~H'] = ba. 1 If 0P3(>0) denotes the average of the nonzero eigenvalues of (IP1)HIA2A'H' then from (3.137) b1aI) = tr[(P2 P)Var(w) (ba) = Bp/2 +6. We note that 0 is an average since rank[(IP1)HA2AH']= ba by (b) of this lemma. We now develop an exact test for ac. 71 3.2.4. An Exact Test Concerning ar Introduction. In this section an exact test is constructed for the variance component associated with the nesting factor. Similarly to Section 3.2.3, we would like to use model (3.129) to develop our exact procedure. Unfortunately, because range(B1)Crange(B2) in (3.129), we have R( I ) = wIc B(B2 B2)B SSEw =0. Thus another approach is necessary. The problem here is similar to the problem we would have encountered if we had tried to use the original model (3.95) to construct an exact test for ao. Namely because range(X2) Crange(X3), R(31p,g,7)=0. We overcame this problem by first averaging over the last subscript, then transforming the resulting model into one that had a zero mean vector. Finally, another transformation was made based on resampling from the error vector to arrive at model (3.129). From here we were able to derive an exact test for a . The same idea can be used again to develop an exact test for oa. Now, however, our starting model is model (3.129). This model is similar to an unbalanced random twofold nested model (without an intercept term) in the sense that it enjoys all the same mathematical and statistical properties of such a model. 72 Initial development. Our first step will be to "average over the last subscript". This can be accomplished by considering the bxl random vector w defined as D= (BAB22)B = (A H'HIA2)A2Hy (A~'A2)A Hl . (3.139) The last equality follows from Theorem 3.4(a). Next we transform model (3.139) into one that has a "zero mean vector." 
Although E() )=O, it is still convenient (for mathematical reasons) to make this transformation. Let H* be the last b1 rows of the bxb Helmert matrix. Then H*H*'= Ib and Hlb= b1 Define y as the (bl)xl vector y= H*. (3.140) The mean and variance of y is provided in the next lemma. Lemma 3.5 a) E(v) =Qb1 b) Var(y) = Ha Jb.1 + b1 +HI(A2Az) H'6 (3.141) 73 Proof: a) From Theorem 3.3, E(w) =Q_1. Thus E(y) = HE(D) = H(A A2) 1AH'E(y) = . b) Also from Theorem 3.3, Var(w) = HlA c1a + HI A2Ha 1 + 1 Thus, Var(y) =H*[Var(g)]H*' = H(AA2)'A H'[Var( )]HiA2(A A2)'H ' = H*(A'A,) 1A [HIAH1H "a +A, 2H'0 + 6Ic1_ x HiA2(AA ) H1 Now, from (3.107), A1 = A2( i Jbi Therefore HiAIH' = HIA2 1 bi H'I. eJb)A''1 (3.142) 74 Also, ) A2H1A 2 1 1 I(A2A2) 2Ic c)A2 = H (A A2) '[A A2 A JcA21 = H(AA ) A A JbA (from (3.107)) = H[Ib JbA A2] =H (recall that Hb = 0) (3.143) Consequently, from (3.142) and (3.143), Var(v) = [H i1Jbi HlJa+HA H*r +H *(A2A2)'A2H16]HA2(A2A2)1 H = ( H Jb H'a + Hb  + l(AA A)1H *6. 'i=1 i * :( i=1 b )H +bl P T122) 1H6 Our last step will be to resample from the error vector. In this case, the error vector is R*w (R*=Ic1 P2 ) since the error sum of squares for the w model (see (3.129)) is given by SSE =w'(IclP2)w . Utilizing resampling. Noting that R* is symmetric, idempotent, and has rank cb, we can write it as 75 R* = C*F*C*' (3.144) where C* is orthogonal and F* is diagonal whose first cb diagonal entries are equal to unity and the remaining entries are equal to zero. Furthermore C* and F* can be partitioned as C* = [C:C :C C*] (3.145) r*=diag(I *,I *,0), 71 '72 where '7 =b1 (3.146) 2 = c2b+ 1. We note that +' =+ = rank(R*) and 9 >0 by (3.112)(iv). Also, C*, C*, and C* are matrices of orders (cl)x*, (c1)x7*, and (c1)x9, respectively, satisfying CIyCi = I (i = 1,2,3) (3.147) CiC= 0 (i j). 
1 3 76 Thus, R* = C*C*' +C (3.148) Define the (b1)xl random vector r as 1 S= Y + (Amax L*)iCw (3.149) where Amax is the largest eigenvalue of L* = H(A2A2)'H' and 1 (A*axI *L*)2 is a welldefined matrix with eigenvalues equaling the square roots of the eigenvalues of (AmaxI *L*), which are 771 nonnegative. The major result of this section is given in the following theorem. Theorem 3.5 The random vector r is normally distributed with a zero mean vector and a variancecovariance matrix given by Var(r) =H*( a Jbi)H' + 6*Ib1 (3.150) 1i=1 E where 6* = +Aax5. 77 Proof: Clearly r is normally distributed. From Lemma 3.5 we know that E(v)=Qb1. E(w)=OQc_ from Theorem 3.3. Thus 1 E(r)=E(y)+(Amai L*)C'E(W) (see (3.149)) =0. We claim that v = HI = (A'A )'AH'i (see (3.140) and (3.139)) is independent of C*'w in (3.149). To verify this consider (A A2)'AH':[Var(w) R* = (A'A2)'A H'[Var(w)](IP2) =6(AA2)1'A'H(IP2) (recall (3.127) and (3.136)) =0. The last equality follows since range[(A/A2)l'A'H] = range(P2). Since range(C*)C range(R*) it follows that (AA2)AWHI [Var(w) ]C= 0. Thus v is independent of Cu'w as claimed. Consequently, from (3.149) 1 1 Var() =Var(v) + (A*axI L*)2C'[Var() ]C(A*axI L*)2 91 91 78 Now, recalling that [Var(y)]R*=6R* (see (3.127) and (3.136)), C*'[Var(w) ]C* = C*'R*[Var(w) ]R*C* = 6C*'R*C* (from (3.148)) (R* is idempotent) = 6C*ICb =6Ib1 Hence, from Lemma 3.5(b) and the fact that L* =H*(AA2) 'HI, Var(r) = Var(y) + (AaxI L*)S 771 = H 1bi)i ) + Ibl + L5* + (AmaxI L*)6 =H Jb a b16 i=1 77 where 6* = o + max6. We can infer from Theorem 3.5 that model (3.149) can be re expressed as (3.151) 1 where ?=H 1 +H *+(AmaxI *L*)2C *1* N(Q,6*Ibl) independently of a and T* = (AA2) A'H'i*. The vector is defined in (3.129). Model r=(H *ilb i+ b1 79 (3.151) is similar to an unbalanced oneway random effects model with zero intercept. 
We note that rank(H H lbi )< rank( i bi) But, since ( i 1bi a=ib and H b=Ob1, it follows that rank(H i=lb ) Lemma A.1), rank (H* ilb ) rank(H*) +rank( i b )rank(Ib) 1 i i= l" / b) =b1 +ab =a1. Hence, rank(H i bi)=al. We have therefore just proven Lemma 3.6. Lemma 3.6 In (3.151), rank(H i=(lb )=a1. We can now develop the exact test for ra. The exact test. We now construct an exact test for HO:a=2=0 vs. Ha:2a~#O. To start with let F be the (bl)xa matrix a F=HHJ Dlb.b (3.152) ii1 1 80 Also, let SSET denote the error sum of squares for model (3.151). That is, SSET = '(Ib1 F(F'F)F')r. (3.153) R,(q) will denote the sum of squares for the a effect in model (3.151). That is Rr(q) = r'F(F'F)F'. (3.154) In the next theorem we find a generalized inverse for F'F and show that R(go) and SSE, are independent and have scaled chisquared distributions (the former having a chisquared distribution only under HO: 2=0). Theorem 3.6 (a) A ginverse of F'F=( i 1Ib.i)1'H1 1bi.) is a  (F'F) = b1. i=1 (b) R(aQ)/6*a~a_1, under Ho:a* =0 (c) SSE/6l* 2b Xba (d) r(qa) is independent of SSE,. 81 Proof: The proof of Theorem 3.6 follows along much the same reasoning as that for Theorem 3.4 and is thus omitted. A test statistic for testing HO:oa2=0 vs. Ha:a2 0O is given by SR(()/(a1) F SSE/(ba) a SSE,/(ba) (3.155) Under HO, Fa Fal,ba Note that (3.156) E =i] +6* where 0a is the average of the eigenvalues of FF'=Hi=~l(Jbi)t'' If H* denotes bxb Helmert matrix, then (3.157) Consequently, b i= i bi = tr b i 1 bi)b b i=1 bi)lb 4ib = Jb H* ' i ( i= b Hi 1bbi =H 1 11 b H* Hi n 82 lit 1 i +( ( i ) i = br tE) Hence, ba 1 b i >0. (3.158) Therefore we would reject HO:aO==0 for large values of F.a 3.2.5. Reduction of the Exact Tests when the Design is Balanced Introduction. We now show that the exact tests for a2 and ao given in (3.155) and (3.138), respectively, reduce to the usual ANOVA tests when the design is balanced. 
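One piece of this reduction can be checked numerically. In the balanced case every nijk = n*, so K = (1/n*)Ic, the matrix L = H1KH1' of (3.115) becomes (1/n*)I_{c-1} as in (3.162), its largest eigenvalue is 1/n*, and the resampling correction matrix in (3.126) vanishes, giving w = u as in (3.165). The sketch below uses arbitrary values of c and n*; the helper builds the last c−1 rows of the Helmert matrix following the standard construction.

```python
import numpy as np

def helmert_sub(c):
    # Rows 2..c of the c x c Helmert matrix: orthonormal rows, each
    # orthogonal to the vector of ones (so H1 @ 1_c = 0).
    H = np.zeros((c - 1, c))
    for k in range(1, c):
        H[k - 1, :k] = 1.0 / np.sqrt(k * (k + 1))
        H[k - 1, k] = -k / np.sqrt(k * (k + 1))
    return H

# Balanced case: n_ijk = n* for every cell, so K = (1/n*) I_c and the
# correction (lam_max*I - L)^(1/2) used in (3.126) is the zero matrix.
c, n_star = 10, 3
H1 = helmert_sub(c)
K = np.eye(c) / n_star
L = H1 @ K @ H1.T
lam_max = np.linalg.eigvalsh(L)[-1]
correction = lam_max * np.eye(c - 1) - L
```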
The test statistic $F_\gamma$, given in Section 3.2.1, is the usual ANOVA test for testing $H_0: \sigma_\gamma^2 = 0$ vs. $H_a: \sigma_\gamma^2 \ne 0$. When the design is balanced we have

$$b_i = b^* \ \forall i, \qquad c_{ij} = c^* \ \forall (i,j), \qquad n_{ijk} = n^* \ \forall (i,j,k). \tag{3.159}$$

Consequently,

$$b = \sum_i b_i = ab^*, \qquad c = \sum_{i,j} c_{ij} = ab^*c^*, \qquad n = \sum_{i,j,k} n_{ijk} = ab^*c^*n^*. \tag{3.160}$$

Also, recalling (3.104), (3.105), and (3.111),

$$A_1 = I_a \otimes 1_{b^*} \otimes 1_{c^*}, \qquad A_2 = I_a \otimes I_{b^*} \otimes 1_{c^*}, \qquad K = \bigoplus_{i,j,k} n_{ijk}^{-1} = \frac{1}{n^*}(I_a \otimes I_{b^*} \otimes I_{c^*}). \tag{3.161}$$

Thus the matrix $L$ defined in (3.115) becomes

$$L = HKH' = \frac{1}{n^*}I_{c-1}. \tag{3.162}$$

Its largest eigenvalue is therefore $\lambda_{\max} = 1/n^*$. The matrix $L^*$, used in (3.149), was defined as $L^* = H^*(A_2'A_2)^{-1}H^{*\prime}$. When the design is balanced, $L^*$ becomes (see (3.161))

$$L^* = H^*(c^*I_{ab^*})^{-1}H^{*\prime} = \frac{1}{c^*}I_{b-1}. \tag{3.163}$$

The largest eigenvalue of $L^*$, $\lambda^*_{\max}$, is therefore $1/c^*$. The matrices $P_1$ and $P_2$ were defined in (3.136). When the design is balanced they reduce to

$$P_1 = H(I_a \otimes \bar{J}_{b^*} \otimes \bar{J}_{c^*})H', \qquad P_2 = H(I_a \otimes I_{b^*} \otimes \bar{J}_{c^*})H', \tag{3.164}$$

where $\bar{J}_s = \frac{1}{s}1_s1_s'$.

Reduction of the exact tests. Consider first the vector $w$ defined in (3.126). When the design is balanced, $w$ becomes

$$w = u + (\lambda_{\max}I_{c-1} - L)^{1/2}C'y = u \quad (\text{from } (3.162)) = H\bar{y} \quad (\text{see } (3.113)). \tag{3.165}$$

Using (3.163), the vector $r$, defined in (3.149), reduces to

$$r = v + (\lambda^*_{\max}I_{b-1} - L^*)^{1/2}C^{*\prime}w = v \quad (\text{from } (3.163)) = H^*(A_2'A_2)^{-1}A_2'H'w \quad (\text{see } (3.139), (3.140)) = \frac{1}{c^*}H^*(I_a \otimes I_{b^*} \otimes 1_{c^*}')H'w \quad (\text{from } (3.161)). \tag{3.166}$$

Since the design is balanced,

$$P_2 - P_1 = H[I_a \otimes (I_{b^*} - \bar{J}_{b^*}) \otimes \bar{J}_{c^*}]H'.$$

Consequently (see (3.135)),

$$R_w(\beta\,|\,\alpha) = w'(P_2 - P_1)w = \bar{y}'H'H[I_a \otimes (I_{b^*} - \bar{J}_{b^*}) \otimes \bar{J}_{c^*}]H'H\bar{y} = \bar{y}'[I_a \otimes (I_{b^*} - \bar{J}_{b^*}) \otimes \bar{J}_{c^*}]\bar{y}. \tag{3.167}$$

The last equality follows since $H'H = I_c - \bar{J}_c$ and $\bar{J}_{b^*}(I_{b^*} - \bar{J}_{b^*}) = 0$. Also,

$$SSE_w = w'(I_{c-1} - P_2)w \quad (\text{see } (3.132)) = \bar{y}'[I_a \otimes I_{b^*} \otimes (I_{c^*} - \bar{J}_{c^*})]\bar{y}. \tag{3.168}$$

The last equality follows since $\bar{J}_{c^*}(I_{c^*} - \bar{J}_{c^*}) = 0$.
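In the balanced case the matrices $L$ and $L^*$ collapse to scalar multiples of the identity, which is exactly why the correction terms vanish and $w = u$, $r = v$. A small numerical check of these two collapses (the dimensions $a, b^*, c^*, n^*$ are arbitrary illustrative values, and the contrast bases are built, as one valid choice, from an SVD of the centering matrix):

```python
import numpy as np

def contrast_basis(m):
    """(m-1) x m matrix Q with Q 1_m = 0 and Q Q' = I_{m-1}."""
    Cm = np.eye(m) - np.ones((m, m)) / m
    return np.linalg.svd(Cm)[2][:m - 1]

a, b_star, c_star, n_star = 2, 3, 2, 4
b, c = a * b_star, a * b_star * c_star

H = contrast_basis(c)          # (c-1) x c
Hstar = contrast_basis(b)      # (b-1) x b

# K = diag(1/n_ijk) = (1/n*) I_c when balanced, so L = H K H' = (1/n*) I_{c-1}
K = np.eye(c) / n_star
L = H @ K @ H.T
assert np.allclose(L, np.eye(c - 1) / n_star)        # hence lambda_max = 1/n*

# A2 = I_a (x) I_b* (x) 1_c*, so A2'A2 = c* I_{ab*} and L* = (1/c*) I_{b-1}
A2 = np.kron(np.eye(a * b_star), np.ones((c_star, 1)))
assert np.allclose(A2.T @ A2, c_star * np.eye(b))
Lstar = Hstar @ np.linalg.inv(A2.T @ A2) @ Hstar.T
assert np.allclose(Lstar, np.eye(b - 1) / c_star)    # hence lambda*_max = 1/c*
```

Since $\lambda_{\max}I_{c-1} - L$ and $\lambda^*_{\max}I_{b-1} - L^*$ are then both zero matrices, their square roots are zero and the augmentation terms in (3.126) and (3.149) disappear.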
Furthermore, from (3.153) and (3.154) we have

$$SSE_r = r'(I_{b-1} - F(F'F)^{-}F')r = r'\Bigl(I_{b-1} - \frac{1}{b^*}FF'\Bigr)r \quad (\text{by Theorem 3.6(a)}) = r'(I_{b-1} - H^*(I_a \otimes \bar{J}_{b^*})H^{*\prime})r \quad (\text{see } (3.152)) = r'H^*[I_a \otimes (I_{b^*} - \bar{J}_{b^*})]H^{*\prime}r.$$

From (3.166) we can further reduce $SSE_r$. Writing $Q = I_a \otimes I_{b^*} \otimes 1_{c^*}'$, so that, from (3.166), $r = \frac{1}{c^*}H^*QH'w$, we obtain

$$SSE_r = \frac{1}{c^{*2}}w'HQ'H^{*\prime}H^*[I_a \otimes (I_{b^*} - \bar{J}_{b^*})]H^{*\prime}H^*QH'w = \frac{1}{c^{*2}}w'HQ'[I_a \otimes (I_{b^*} - \bar{J}_{b^*})]QH'w,$$

since $H^{*\prime}H^* = I_b - \bar{J}_b$ and $\bar{J}_b[I_a \otimes (I_{b^*} - \bar{J}_{b^*})] = 0$. Since

$$Q'[I_a \otimes (I_{b^*} - \bar{J}_{b^*})]Q = c^*[I_a \otimes (I_{b^*} - \bar{J}_{b^*}) \otimes \bar{J}_{c^*}],$$

we have

$$SSE_r = \frac{1}{c^*}w'H[I_a \otimes (I_{b^*} - \bar{J}_{b^*}) \otimes \bar{J}_{c^*}]H'w = \frac{1}{c^*}w'(P_2 - P_1)w.$$

Thus, $c^*SSE_r = R_w(\beta\,|\,\alpha)$. Similarly, we can show that

$$R_r(\alpha) = r'F(F'F)^{-}F'r = \frac{1}{c^*}w'P_1w. \tag{3.169}$$

Thus,

$$c^*R_r(\alpha) = \bar{y}'H'P_1H\bar{y}. \tag{3.170}$$

Since the design is balanced,

$$\bar{y} = \frac{1}{n^*}(I_a \otimes I_{b^*} \otimes I_{c^*} \otimes 1_{n^*}')y. \tag{3.171}$$

Thus (3.167), (3.168), and (3.170) can be expressed as, respectively,

$$R_w(\beta\,|\,\alpha) = \frac{1}{n^*}y'[I_a \otimes (I_{b^*} - \bar{J}_{b^*}) \otimes \bar{J}_{c^*} \otimes \bar{J}_{n^*}]y, \tag{3.172}$$

$$SSE_w = \frac{1}{n^*}y'[I_a \otimes I_{b^*} \otimes (I_{c^*} - \bar{J}_{c^*}) \otimes \bar{J}_{n^*}]y, \tag{3.173}$$

$$R_r(\alpha) = \frac{1}{n^*c^*}y'[(I_a - \bar{J}_a) \otimes \bar{J}_{b^*} \otimes \bar{J}_{c^*} \otimes \bar{J}_{n^*}]y. \tag{3.174}$$

Therefore, from Khuri (1982), it follows that $n^*c^*R_r(\alpha)$, $n^*R_w(\beta\,|\,\alpha) = n^*c^*SSE_r$, and $n^*SSE_w$ reduce to the usual ANOVA sums of squares for balanced data.

3.2.6. A Numerical Example

The following data are taken from Bennett and Franklin (1954, p. 405) and have been made unbalanced for illustrative purposes. The data represent part of a statistical study on the variability of a number of properties of crude smoked rubber. The figures given in Table 3.2 represent measurements of the modulus at 700% elongation made on 24 samples. On each of these samples at most 3 determinations of the modulus at 700% elongation were made, resulting in a total of 53 observations. The three factors of interest are supplier, batch, and mix, and we assume each factor to be random. $Y_{ijkl}$ will represent the $l$th replicate determination from mix $k$, which is derived from batch $j$ of supplier $i$. The linear model for this experiment is

$$Y_{ijkl} = \mu + \alpha_i + \beta_{ij} + \gamma_{ijk} + \epsilon_{ijkl},$$

where $i = 1,2,3,4$; $j = 1,2,\ldots,b_i$; $k = 1,2$; $l = 1,2,\ldots,n_{ijk}$. In this case $b_1 = 2$, $b_2 = 4$, $b_3 = 3$, and $b_4 = 3$.
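Quadratic forms such as (3.172) can be verified against the familiar mean-based ANOVA sums of squares on a small balanced layout. The sketch below checks that $n^*R_w(\beta\,|\,\alpha)$, written as a Kronecker-product quadratic form, equals $\sum_{i,j} c^*n^*(\bar{y}_{ij\cdot\cdot} - \bar{y}_{i\cdot\cdot\cdot})^2$, the usual balanced-data sum of squares for batches within suppliers (all dimensions are arbitrary illustrative values):

```python
import numpy as np

a, b_star, c_star, n_star = 2, 3, 2, 2
rng = np.random.default_rng(1)
y = rng.standard_normal(a * b_star * c_star * n_star)  # y ordered as (i, j, k, l)

def Jbar(m):
    """The averaging matrix (1/m) 1_m 1_m'."""
    return np.ones((m, m)) / m

# n* R_w(beta|alpha) = y' [I_a (x) (I_b* - Jbar_b*) (x) Jbar_c* (x) Jbar_n*] y
M = np.kron(np.kron(np.kron(np.eye(a), np.eye(b_star) - Jbar(b_star)),
                    Jbar(c_star)), Jbar(n_star))
quad = y @ M @ y

# usual ANOVA sum of squares: sum_{ij} c* n* (ybar_ij.. - ybar_i...)^2
Y = y.reshape(a, b_star, c_star, n_star)
ybar_ij = Y.mean(axis=(2, 3))
ybar_i = Y.mean(axis=(1, 2, 3))
ss = c_star * n_star * ((ybar_ij - ybar_i[:, None]) ** 2).sum()
assert np.isclose(quad, ss)
```

The same pattern (a Kronecker product of identity, centering, and averaging matrices) reproduces (3.173) and (3.174) with the obvious substitutions.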
Notice that $c_{ij} = 2$ for all $(i,j)$, so that the design associated with this experiment is partially balanced. We start our analysis by testing for a mix effect; that is, we test $H_0: \sigma_\gamma^2 = 0$ vs. $H_a: \sigma_\gamma^2 \ne 0$. The appropriate test statistic is $F_\gamma$, given in Section 3.2.1. In the present case we find

$F_\gamma = 1.3110$
numerator d.f. $= 12$
denominator d.f. $= 29$
$p$-value $= 0.2650$.

Thus there does not appear to be any significant variation due to the mix effect. To test for a significant batch effect we use $F_\beta$, the test statistic developed in Section 3.2.3. For these data we find

$F_\beta = 0.3800$
numerator d.f. $= 8$
denominator d.f. $= 12$
$p$-value $= 0.9114$.

No significant variation in the moduli of the 24 samples exists among batches. The test statistic $F_\alpha$ (see (3.155)) is used to test for a significant supplier effect. We find

$F_\alpha = 39.0024$
numerator d.f. $= 3$
denominator d.f. $= 8$
$p$-value $< 0.0001$.

There is highly significant variation in the moduli of the samples among the four suppliers. All calculations were performed using S-Plus.

Table 3.2 An example of an unbalanced random 3-fold nested model

Supplier        A        B        C        D
Mix           1   2    1   2    1   2    1   2
Batch I
Batch II
Batch III
Batch IV

196 200 250 238 226 248 249 193 204 165 194 209 221 196 156 186 196 174 172 202 211 204 323 279 251 238 250 273 221 262 272 223 256 230

(Source: Bennett and Franklin (1954, p. 405))
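The degrees of freedom quoted for the three tests match what the counts of this example give ($a = 4$ suppliers, $b = 12$ batches, $c = 24$ mixes, $n = 53$ observations): $(c-b,\ n-c)$ for the mix test, $(b-a,\ c-b)$ for the batch test, and $(a-1,\ b-a)$ for the supplier test. A quick Python check (this recomputes only the degrees of freedom, not the test statistics themselves):

```python
b_i = [2, 4, 3, 3]             # batches per supplier
a = len(b_i)                   # 4 suppliers
b = sum(b_i)                   # 12 batches
c = 2 * b                      # 24 mixes, since c_ij = 2 for all (i, j)
n = 53                         # total number of observations

df_mix      = (c - b, n - c)   # F_gamma
df_batch    = (b - a, c - b)   # F_beta
df_supplier = (a - 1, b - a)   # F_alpha

assert df_mix == (12, 29)
assert df_batch == (8, 12)
assert df_supplier == (3, 8)
```

These agree with the (12, 29), (8, 12), and (3, 8) degrees of freedom reported above for $F_\gamma$, $F_\beta$, and $F_\alpha$.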