- Permanent Link: http://ufdc.ufl.edu/AA00066650/00001

## Material Information
- Title: Some new extended block designs and their analyses
- Creator: Schreckengost, Jack Franklyn, 1944-
- Publication Date: 1974
- Language: English
- Physical Description: ix, 9 leaves ; 28 cm.

## Subjects
- Subjects / Keywords: Block designs ( lcsh ); Statistics thesis Ph. D; Dissertations, Academic -- Statistics -- UF; Block designs. ( fast )
- Genre: bibliography ( marcgt ); non-fiction ( marcgt ); Academic theses. ( lcgft )

## Notes
- Thesis: Thesis -- University of Florida.
- Bibliography: Includes bibliographical references (leaves 106-108).
- General Note: Typescript.
- General Note: Vita.

## Record Information
- Source Institution: University of Florida
- Holding Location: University of Florida
- Rights Management: The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
- Resource Identifier: 022828571 ( ALEPH ); 14101107 ( OCLC )

SOME NEW EXTENDED BLOCK DESIGNS AND THEIR ANALYSES
By
JACK FRANKLYN SCHRECKENGOST

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1974

TO MY WIFE

ACKNOWLEDGMENTS

I would like to express my appreciation to Dr. John A. Cornell for his guidance and assistance while directing this dissertation. My thanks, also, to the other members of my advisory committee, Dr. F. W. Knapp, Dr. Frank G. Martin, Dr. John G. Saw, and Dr. P. V. Rao, for their helpful suggestions.

A belated thanks is expressed to Mr. Ronald E. Boyer, a good teacher and a valued friend, who gave me much encouragement from the very beginning.

Appreciation to my wife, Donna Rae, cannot be expressed as deeply as is felt. I thank her for her patience and understanding during my many hours of study, research, and writing.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT
CHAPTER
1 INTRODUCTION
  1.1 Blocking Designs
  1.2 Extended Complete Block Designs
  1.3 Purpose of This Work
2 LITERATURE REVIEW
3 EXTENDED COMPLETE BLOCK DESIGNS WITH CORRELATED OBSERVATIONS
  3.1 Notation and Definitions
  3.2 Intrablock Estimation of the Treatment Effects
  3.3 A Test for the Presence of Correlation
  3.4 The Exact Distribution of SSR for $\rho > 0$
  3.5 An Approximate Distribution of SSR for $\rho > 0$
  3.6 An Estimate of the Correlation
4 A PARTIALLY BALANCED GROUP DIVISIBLE ECBD
  4.1 Definitions and Notation
  4.2 Intrablock Estimation of the Treatment Effects
  4.3 Distributions of the Sums of Squares and Relevant Tests of Hypotheses
  4.4 Mixed Model Analysis
5 A PARTIALLY BALANCED ECBD WITH THE L ASSOCIATION SCHEME
  5.1 Intrablock Analysis
  5.2 Distributions of the Sums of Squares and Relevant Tests of Hypotheses
6 THE GENERAL PARTIALLY BALANCED EXTENDED COMPLETE BLOCK DESIGN
  6.1 Intrablock Analysis
7 CONCLUDING REMARKS AND A SENSORY TESTING EXAMPLE
APPENDIX
1 THE EXACT DISTRIBUTION OF SSR FOR b = 2t AND k = t+1 WHEN $\rho > 0$
2 AN APPROXIMATE DISTRIBUTION OF SSR FOR b = 2t AND k = t+1 WHEN $\rho > 0$
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table
1 Intrablock Analysis of Variance for an ECBD
2 Values of g and h for the Approximate Distribution of SSR, I
3 Intrablock Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs
4 Mixed Model Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs
5 Intrablock Analysis of Variance for Partially Balanced ECBD of the Association Scheme ...
A1 Values of g and h for the Approximate Distribution of SSR, II
A2 Comparison of the Exact Distribution and Two Approximate Distributions of SSR

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

SOME NEW EXTENDED BLOCK DESIGNS AND THEIR ANALYSES

By
Jack Franklyn Schreckengost
August, 1974

Chairman: Dr. John A. Cornell
Major Department: Statistics

An extended complete block design is a balanced block design consisting of t treatments in b blocks each of size k such that k varies between t and 2t.
The balance among the treatments is achieved by selecting duplicates of some of the treatments for each block according to the scheme followed when selecting blocks from the class of balanced incomplete block designs. Under the assumption of an additive model, it may be of interest to investigate the existence of correlation between responses to the same treatment in the same block. When a positive correlation between duplicate observations is present, it has been previously shown that k should be taken equal to t+1 for maximum efficiency with the extended complete block designs when compared to complete block designs.

A procedure for the test of the hypothesis of zero correlation is presented as is a method for estimating the correlation if the hypothesis of zero correlation is rejected in favor of the alternative hypothesis of positive correlation. Particular attention is given to the distribution of the sum of squares for remainder, where remainder is defined as residual minus duplication error, under the alternative hypothesis of positive correlation. The distribution of the sum of squares for remainder is necessary for calculating the power of the test and for obtaining an approximation to the distribution of the estimator of the correlation. A specific formula for the distribution of the sum of squares for remainder is given for the case t = b and k = t+1. The exact distribution and an approximate distribution of the sum of squares for remainder are also presented for the case b = 2t and k = t+1.

The general partially balanced extended complete block design is defined as a partially balanced block design consisting of t treatments in b blocks each of size k greater than t. The analyses of variance for the non-additive fixed effects and mixed models are presented for the special class of designs called partially balanced group divisible extended complete block designs.
The analysis of variance of the additive fixed effects model is also presented for the class of partially balanced extended complete block designs with the (Latin Square) association scheme. The analysis of variance of the non-additive mixed model for this class of designs is mentioned briefly. The intrablock analysis of variance for the additive model is developed for the general partially balanced extended complete block designs. Also, the recovery of the interblock information and the combined intrablock and interblock analysis for these general designs are mentioned briefly.

The final chapter contains some comments about the assumptions made and about directions for future study. A numerical example of a taste testing experiment is also presented with the resulting analysis for the balanced extended complete block designs considered in this work.

CHAPTER 1
INTRODUCTION

In many fields of experimentation, a distinction that has long been implicit in the statistical literature is the difference between experiments designed for the estimating of absolute treatment effects and experiments of the comparative type. In comparative experiments, the emphasis is on performing comparisons between the effects of the different treatments such as the effects of different doses of a drug or the effects of different levels of nitrogen on the average yield of soybeans. While the distinction between comparative experiments and experiments designed to estimate the absolute treatment effects individually is perhaps not always clearly defined, nevertheless, the idea of a comparative type of experiment remains convenient and useful.

For comparative experiments, it is clear that an advantage is to be gained by comparing the treatments under homogeneous conditions. To achieve this end, much of the effort in choosing the homogeneous conditions is directed toward the selection and use of block designs.
Over the years, both complete and incomplete block designs have been discussed in detail. In this work, we shall be concerned mainly with combinations of these block designs for use in comparative type experiments.

1.1 Blocking Designs

In blocking experiments where the objective is the comparison of different treatment effects, the number of experimental units in each block may or may not equal the number of treatments to be compared. When the size of the block, where size refers to the number of experimental units in each block, is equal to the number of different treatments and each treatment is randomly assigned exactly once in each block, alongside every other treatment, the design is known as a randomized complete block design. If the size of the block is less than the number of treatments, an incomplete block design may be used. Incomplete block designs are common in applications where either the number of treatments is large or the size of the block must be kept small in order to ensure homogeneity of the experimental units in each block.

Still another type of block design exists when the size of the block exceeds the number of different treatments. In this latter design, if each block contains first replicates of all of the treatments plus duplicates or second replicates of some of the treatments, the design is called an "extended complete block design." We now discuss such designs.

1.2 Extended Complete Block Designs

In an attempt to increase the precision of the comparisons between the effects of each of the treatments and the effect of a control treatment, Pearce (1960) introduced blocking experiments where in each block the control treatment was replicated. Later, Pearce (1964) considered possible methods for designing experiments in which for a given experiment the blocks are of varying sizes. The analysis of a fertilizer experiment on strawberries in which an extended complete block design was used is mentioned briefly by Pearce (1963).
Extended complete block designs, as introduced by John (1963), are block designs in which each block contains a first replicate of all of the treatments plus a duplicate or second replicate of some of the treatments. These second replicates in each block comprise an incomplete block selected from the class of balanced incomplete block designs. An example of an extended complete block design formed by augmenting complete blocks of size three with balanced incomplete blocks of size two resulting in extended complete blocks of size five is presented in Figure 1, where the three treatments are denoted by A, B, and C.

| Block 1 | Block 2 | Block 3 | Component |
|---|---|---|---|
| A | A | A | complete block design |
| B | B | B | complete block design |
| C | C | C | complete block design |
| A | A | B | balanced incomplete block design |
| B | C | C | balanced incomplete block design |

Figure 1. An extended complete block design consisting of three blocks each of size five experimental units containing treatments A, B, and C.

Extended complete block designs can be used in a variety of experimental situations. In sensory experiments where the objective is the comparison of preferences for different food samples (treatments) expressed by a panel of judges (blocks), the number of food samples that a panelist may effectively evaluate at a single sitting is limited but may be more than the number of different samples to be evaluated. Acquiring panelists for these sensory experiments is often difficult and/or costly. Hence, if a panelist can effectively evaluate all of the different samples plus replicates of some of the samples at a single sitting and if a fixed number of observed values of each sample is necessary, a smaller number of panelists would be required with the use of an extended complete block design than if each panelist could evaluate each of the samples only once. The use of a smaller number of panelists would result in a savings in terms of time and cost.
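The construction shown in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the array `N_star` (how often each treatment is duplicated in each block) and the helper name `extend_complete_blocks` are hypothetical names introduced here, not notation from the dissertation at this point.

```python
import numpy as np

# Rows are treatments A, B, C; columns are blocks 1, 2, 3.
# N_star[i, j] = 1 if treatment i receives a second replicate in block j.
# These are the balanced incomplete blocks of size two in Figure 1.
N_star = np.array([
    [1, 1, 0],   # A is duplicated in blocks 1 and 2
    [1, 0, 1],   # B is duplicated in blocks 1 and 3
    [0, 1, 1],   # C is duplicated in blocks 2 and 3
])

def extend_complete_blocks(n_star):
    """Add one complete replicate of every treatment to every block."""
    return np.ones_like(n_star) + n_star

N = extend_complete_blocks(N_star)
block_sizes = N.sum(axis=0)    # experimental units per block
replications = N.sum(axis=1)   # observations per treatment
```

For this design every block has five units and every treatment is observed five times, which matches the caption of Figure 1.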
In an agricultural setting, an experimenter wishing to compare the effects of different chemical sprays on citrus trees may have available more trees in a block than the number of sprays to be tested. On the additional trees in each of the blocks, second replicates of some of the different chemical sprays could be applied. In an industrial experiment, on a given day an experimenter may be able to obtain observed responses from each of the treatments as well as responses from second replicates of some of the treatments. If he does not have enough time to observe the responses from second replicates of all the treatments, an extended complete block design could be used.

The analysis of a block design in which two treatments are applied to the experimental units in blocks of size three was discussed by John (1962). The following year, John (1963) introduced extended complete block designs and presented their analysis. In his designs, the block size k could vary between t and 2t, where t is the number of different treatments used in the experiment. Trail and Weeks (1973) generalized this latter work of John to include designs in which k is greater than 2t. In the papers by John (1963) and Trail and Weeks (1973), the analysis of the fixed effects model as well as the mixed model was presented in detail.

The application of the extended complete block designs of John to the area of sensory evaluation was considered by Cornell and Knapp (1972, 1974). Also considered by Cornell (1974) was the efficiency of these designs compared to randomized complete block designs.

1.3 Purpose of This Work

The first part of this work will be concentrated on extending the works by Cornell and Knapp (1972) and Cornell (1974) with special emphasis on the area of sensory evaluation.
Specifically, we shall be interested in the analysis of extended complete block designs where correlation is present between duplicate responses to the same treatment in the same block and the magnitude of the correlation is constant over all treatments and blocks. In sensory experiments, for example, the presence of correlated observations easily could arise as a result of using highly skilled judges (as the blocks). Therefore, we shall be interested in testing whether there is any evidence of correlation present in the data. Furthermore, if there is sufficient evidence to indicate that correlation is present, we shall seek to obtain an estimate of the magnitude of the correlation, which is denoted by $\rho$. An approximate test on the treatment effects in the presence of a value of $\rho$ greater than zero will be suggested, since an exact test on treatment effects cannot be performed for this experimental situation.

In the second part of this work, we shall generalize the work of Trail and Weeks (1973) to include extended complete block designs generated by partially balanced incomplete block (PBIB) designs with two associate classes.

CHAPTER 2
LITERATURE REVIEW

The analysis of block designs in which the block size k could vary between t and 2t, where t denotes the number of treatments in the experiment, was first introduced by John (1963). These designs, called extended complete block (ECB) designs, contain in each of the b blocks first replicates of all of the treatments plus second replicates of k-t of the treatments. The method taken by John of choosing the k-t second replicates in each block was to use the class of balanced incomplete block (BIB) designs of block size k-t.

Using formulae similar in structure to the formulae used in the analysis of balanced incomplete block designs, John discussed the intrablock analysis, the interblock analysis, and the recovery of the interblock information.
The recovery of the interblock information was achieved by combining the two independent intrablock and interblock estimates of the effects of each of the treatments.

In the intrablock analysis, in addition to obtaining the treatment effects adjusted for blocks and the unadjusted block analysis, John obtained estimates of both the experimental error variation and the block × treatment interaction. The measure of the interaction was obtained by subtracting the experimental error variation from the residual variation in the intrablock analysis of variance. With the interblock analysis, however, an additive model was assumed. That is, the block × treatment interaction variance component was assumed to be zero. Using the assumption of the additive model then, the combined estimate of each of the treatment effects was formed using a linear combination of the weighted intrablock and interblock estimates. The weights used with the intrablock and interblock estimates were the reciprocals of the estimates of their respective variances. The special case where t = b and k = t+1 was presented in detail.

Trail and Weeks (1973) considered the aforementioned extended complete block designs (ECBD) as a special case of the more general class of designs which they called extended complete block designs generated by balanced incomplete block designs (BIBD). Their generalization of the work of John (1963) included balanced block designs in which the block size could exceed 2t. An example of this more general design in which a balanced incomplete block design is added to a double complete block design (CBD) is presented in Figure 2. A second example of these more general designs in which the complete block design is augmented by two balanced incomplete block designs is presented in Figure 3. In both of these figures, the three treatments are denoted by the letters A, B, and C.
It should be noted that the treatments would be randomly assigned to the experimental units within each block when the experiment is performed.

| Block 1 | Block 2 | Block 3 | Component |
|---|---|---|---|
| A | A | A | CBD |
| B | B | B | CBD |
| C | C | C | CBD |
| A | A | A | CBD |
| B | B | B | CBD |
| C | C | C | CBD |
| A | A | B | BIBD |
| B | C | C | BIBD |

Figure 2. An ECBD for three treatments generated by a BIBD consisting of three blocks each of size eight experimental units.

| Block 1 | Block 2 | Block 3 | Component |
|---|---|---|---|
| A | A | A | CBD |
| B | B | B | CBD |
| C | C | C | CBD |
| A | A | B | BIBD |
| B | C | C | BIBD |
| A | B | A | BIBD |
| C | C | B | BIBD |

Figure 3. An ECBD for three treatments generated by a BIBD consisting of three blocks each of size seven experimental units.

In a block design in which t treatments are arranged in b blocks, properties of the design can be obtained by studying the elements of the incidence matrix of the design. The incidence matrix $N = (n_{ij})$ is a t × b matrix such that $n_{ij}$ denotes the number of times the $i$th treatment appears in the $j$th block. The elements of the incidence matrix for the extended complete block designs may be constructed by appropriately summing the elements of the incidence matrix of a balanced incomplete block design and the elements of the incidence matrix of a complete block design.

For the extended complete block designs generated by BIB designs, Trail and Weeks showed that the incidence matrix $N$ can be generated from the incidence matrix $N^*$ of any balanced incomplete block design by using the equation

$$N = c_0 J + (c_1 - c_0) N^* \tag{2.1}$$

where $J$ is the incidence matrix of a complete block design, that is, $J$ is a t × b matrix of ones, and $c_0$ and $c_1$ are elements of the set of positive integers. The model used by the authors is

$$\mathbf{y} = C(X_1 \boldsymbol{\tau} + X_2 \boldsymbol{\beta} + \boldsymbol{\gamma}) + \boldsymbol{\varepsilon} \tag{2.2}$$

where $\boldsymbol{\tau}$ is a t × 1 vector of treatment effects, $\boldsymbol{\beta}$ is a b × 1 vector of block effects, $\boldsymbol{\gamma}$ is a bt × 1 vector of interaction effects, and $\boldsymbol{\varepsilon}$ is a bk × 1 vector of independent random error effects. Letting $\mathbf{1}_t$ denote a t × 1 vector of ones, $I_t$ denote the t × t identity matrix, and $y_{ij\ell}$ represent the $\ell$th response to the $i$th treatment in the $j$th block, the vector $\mathbf{y}$ and the matrices $C$, $X_1$, and $X_2$ in (2.2) are of the forms

$$
\underset{bk \times 1}{\mathbf{y}} =
\begin{bmatrix} y_{111} \\ \vdots \\ y_{11n_{11}} \\ y_{211} \\ \vdots \\ y_{tbn_{tb}} \end{bmatrix},
\qquad
\underset{bt \times t}{X_1} =
\begin{bmatrix} I_t \\ I_t \\ \vdots \\ I_t \end{bmatrix},
\qquad
\underset{bt \times b}{X_2} =
\begin{bmatrix} \mathbf{1}_t & & & \\ & \mathbf{1}_t & & \\ & & \ddots & \\ & & & \mathbf{1}_t \end{bmatrix},
$$

and

$$
\underset{bk \times bt}{C} =
\begin{bmatrix} \mathbf{1}_{n_{11}} & & & \\ & \mathbf{1}_{n_{21}} & & \\ & & \ddots & \\ & & & \mathbf{1}_{n_{tb}} \end{bmatrix}.
\tag{2.3}
$$

In our notation, $\mathbf{1}_{n_{21}}$ is an $n_{21}$ × 1 vector of ones.

In addition to presenting the intrablock and interblock analyses, Trail and Weeks expressed the formula for calculating the combined estimates of the treatment effects using the method presented by Seshadri (1963a) for combining unbiased estimators. Trail and Weeks also discussed how "best" designs might be obtained. They defined the "best" design as that design for which the variance of the difference between the intrablock estimates of the effects of any two different treatments is a minimum for fixed t and k. The minimum value of the variance of the difference is achieved by minimizing the absolute value of the difference $c_0 - c_1$, where $c_0$ and $c_1$ are the magnitudes of the elements in the incidence matrix $N$ of the design.

An application of extended complete block designs to sensory testing experiments was presented by Cornell and Knapp (1972). Separate estimates of block × treatment interaction and experimental error were obtained in their analyses. Cornell and Knapp showed that the use of the experimental error only as a measure of the within treatment variability when comparing treatments results in a more efficient test than when using the residual variation (the sum of the experimental error and the interaction variation) when some measure, however small, of interaction is present.

Replication of extended complete block designs was also discussed by Cornell and Knapp (1974). Replication of the designs was performed to achieve a balance between the blocks and the treatments. By balance is meant that each and every treatment appears in each block (is evaluated by each panelist) the same number of times over the replications.
Hence, pairwise comparisons of the treatments in each block can be made with equal precision. With the replicated designs, the assumption of negligible replication variation was made by the authors. This assumption resulted in simpler expressions for the formulae for calculating estimates of the treatment effects as well as the intrablock sums of squares when compared to the formulae used with the unreplicated extended complete block designs. An example of a replicated extended complete block design is presented in Figure 4.

Using a non-additive model, Cornell (1974) discussed the efficiency of extended complete block designs compared to complete block designs for uncorrelated observations. The efficiency of each design was defined as the reciprocal of the variance of the difference between any pair of treatment means with the respective design. To illustrate, with the extended complete block design consisting of b blocks each of size k, the estimate of the variance of the difference between any two treatment means is

$$\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{ECB} = \frac{2k(t-1)}{b(k^2 - 3k + 2t)}\,\hat{\sigma}_e^2 \tag{2.4}$$

where $\hat{\sigma}_e^2$ is the intrablock estimate of experimental error. An estimate of the efficiency of the extended complete block design would be the reciprocal of (2.4). (The author used k for the size of the blocks in the balanced incomplete block design used in the extension. In this work, k* denotes the size of the blocks in the balanced incomplete block design used in the extension, while k refers to the size of the blocks in the extended complete block design.)

| Block | Replication I | Replication II | Replication III |
|---|---|---|---|
| 1 | ABCAC | ABCBC | ABCAB |
| 2 | ABCBC | ABCAB | ABCAC |
| 3 | ABCAB | ABCAC | ABCBC |

Figure 4. A replicated extended complete block design consisting of three replicates of an ECBD.
In a complete block design with the same number of replicates of each of the treatments, that is, with bk/t complete blocks of size t, the estimate of the variance of the difference between any two different treatment means is

$$\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{CB} = \frac{2t}{bk}\,\hat{\sigma}^2_{\text{residual}} \tag{2.5}$$

where $\hat{\sigma}^2_{\text{residual}}$ is the residual mean square. From (2.4) and (2.5) an estimate of the efficiency of the extended complete block design compared to the complete block design is obtained using the ratio

$$\mathrm{Eff}(\text{ECB to CB}) = \frac{\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{CB}}{\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{ECB}} = \frac{t(k^2 - 3k + 2t)}{k^2(t-1)} \cdot \frac{\hat{\sigma}^2_{\text{residual}}}{\hat{\sigma}_e^2}.$$

To obtain estimated efficiency values for different values of t and k, the value of the ratio of $\hat{\sigma}^2_{\text{residual}}$ to $\hat{\sigma}_e^2$ is required. For the value of this ratio, Cornell used the ratio of the mean square for interaction to the mean square for error, which is easily obtainable from the analysis of variance table of the extended complete block design. With this ratio, denoted by F, Cornell showed that when the hypothesis of zero interaction effect is true, resulting in F = 1, the extended complete block design is a slightly less efficient design than the complete block design with the same number of replicates of the treatments. However, when F is greater than 1, the extended complete block design is the more efficient design with the efficiency increasing with increasing values of F.

Cornell (1974) also considered the situation where a positive correlation $\rho$ exists between the two responses to a treatment in the same block. For an extended complete block design having fixed balanced incomplete block size k*, it was found that as $\rho$ approaches one the efficiency of the extended complete block design compared to the complete block design decreases. In fact, the larger the value of k* (k* ≤ t) the faster the efficiency of the extended complete block design approaches one-half that of the complete block design with twice as many blocks.
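For the uncorrelated case, the efficiency ratio built from (2.4) and (2.5) is easy to tabulate. The sketch below is illustrative only: the function name `ecb_efficiency` and the argument `f_ratio` (standing in for the ratio F of the interaction mean square to the error mean square) are names assumed here, and the range check reflects designs with k between t and 2t.

```python
def ecb_efficiency(t, k, f_ratio=1.0):
    """Estimated efficiency of an ECBD relative to a complete block design
    with the same number of replicates, i.e. the ratio of (2.5) to (2.4):

        t (k^2 - 3k + 2t) / (k^2 (t - 1)) * F
    """
    if not t <= k <= 2 * t:
        raise ValueError("expected t <= k <= 2t")
    return t * (k**2 - 3 * k + 2 * t) * f_ratio / (k**2 * (t - 1))

# Efficiencies for t = 3 treatments when F = 1 (no interaction).
table = {k: ecb_efficiency(3, k) for k in range(3, 7)}
```

With t = 3 and k = 5, the design of Figure 1, the value at F = 1 is 48/50 = 0.96, in line with the remark that the extended design is slightly less efficient when the interaction is absent; any F above roughly 1.04 pushes the ratio past one for that design.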
This implies that if one suspects a positive correlation to be present between duplicate treatment responses in the same block, one should use k* equal to one for maximum efficiency if using an extended complete block design.

Owing to the results previously found concerning the effect of correlated observations on the efficiency of extended complete block designs compared to complete block designs, in the next chapter we shall investigate the formulation of the test of the hypothesis of zero correlation. If there is evidence of correlation present between duplicate treatment responses in the same block, we shall want to estimate the correlation $\rho$. With an estimate of $\rho$, an estimate of the variance of the difference between any two intrablock estimates of the treatment effects can be calculated.

CHAPTER 3
EXTENDED COMPLETE BLOCK DESIGNS WITH CORRELATED OBSERVATIONS

In the extended complete block designs discussed to this point, we have observed that some of the treatments in each block are duplicated. In the papers by John (1963), Trail and Weeks (1973), Cornell and Knapp (1972), and Cornell (1974), the responses to the duplicated treatments in each block are assumed to be independent and are used to obtain an estimate of the experimental error. Comments on the efficiency of these designs when the duplicated observations are not independent but rather are positively correlated were made in the latter paper.

As mentioned previously in Section 1.3, in sensory experiments correlated observations are a real possibility. A panelist's response to a treatment might very likely be positively correlated with his response to the duplicate of the treatment, particularly if the panelist has previously been trained for these experiments. The presence of positive correlation between responses to the same treatment by a panelist reflects a measure of the efficiency of the panelist.
That is, the closer in magnitude the responses to the same treatment by a panelist are, the more consistent the panelist is in evaluating that treatment. Although the correlation could be different for each treatment and/or each panelist, we shall consider only the case where the correlation is assumed to be constant and equal for all panelists (blocks) and treatments.

3.1 Notation and Definitions

The parameters associated with an extended complete block design are as follows:

t = the number of treatments,
b = the number of blocks,
k = the number of experimental units in each block (block size),
r = the number of replications of each treatment in the experiment,
$\lambda$ = the number of distinct pairs of experimental units which receive any fixed pair of treatments while appearing in the same block, and
$N = (n_{ij})$ = the incidence matrix, where $n_{ij}$ denotes the number of times the $i$th treatment appears in the $j$th block.

The following parameters are associated with the balanced incomplete block design used to form the extended complete block design:

t = the number of treatments,
b = the number of blocks,
k* = the block size,
r* = the number of replications of each treatment,
$\lambda^*$ = the number of times over the b blocks each pair of treatments appears in the same block, and
$N^* = (n^*_{ij})$ = the incidence matrix.

The following identities involving the aforementioned parameters are satisfied:

1. $r = r^* + b$
2. $k = k^* + t$
3. $r^* t = b k^*$
4. $rt = bk$
5. $N = N^* + J$, where $J$ is a t × b matrix of ones
6. $\lambda = 2r - b + \lambda^*$
7. $\lambda^*(t-1) = r^*(k^*-1)$
8. $\lambda(t-1) = rk - 3r + 2b$.

The model written in matrix notation is

$$\mathbf{y} = C(\mu \mathbf{1}_{bt} + X_1 \boldsymbol{\tau} + X_2 \boldsymbol{\beta}) + \boldsymbol{\varepsilon} \tag{3.1.1}$$

where all symbols are defined following (2.2) with the exception of $\mu$ and $\boldsymbol{\varepsilon}$.
These parameters are $\mu$, the overall mean, and $\boldsymbol{\varepsilon}$, a bk × 1 vector of random errors with the properties $E(\boldsymbol{\varepsilon}) = \mathbf{0}$, where $E(\cdot)$ denotes mathematical expectation, and

$$
E(\varepsilon_{ij\ell}\,\varepsilon_{i'j'\ell'}) =
\begin{cases}
\rho\sigma^2, & i = i',\ j = j',\ \ell \neq \ell' \\
\sigma^2, & i = i',\ j = j',\ \ell = \ell' \\
0, & \text{otherwise}
\end{cases}
\tag{3.1.2}
$$

where $\sigma^2$ denotes the variance of the distribution from which the errors are sampled and $\rho$ denotes the correlation between the duplicate observations. We shall assume that the values of $\rho$ lie in the interval $0 < \rho < 1$. Owing to the properties in (3.1.2) of the random errors, then

$$E(\mathbf{y}) = C(\mu \mathbf{1}_{bt} + X_1 \boldsymbol{\tau} + X_2 \boldsymbol{\beta}) \tag{3.1.3}$$

and

$$\mathrm{Var}(\mathbf{y}) = E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}') = V. \tag{3.1.4}$$

The matrix $V$ consists of the following partitions corresponding to the form of the vector $\mathbf{y}$ in (2.3); on the main diagonal of $V$ are positioned the matrices

$$\sigma^2 \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \quad \text{and} \quad \sigma^2 [\,1\,],$$

while there are zeros located in all other positions. Hence, under the assumption of the normality of the random errors,

$$\mathbf{y} \sim N\big(C(\mu \mathbf{1}_{bt} + X_1 \boldsymbol{\tau} + X_2 \boldsymbol{\beta}),\ V\big).$$

Before discussing the intrablock estimation of the treatment effects, we illustrate the form of the matrix $V$ by referring to the extended complete block design presented in Figure 1. If the vector of observations $\mathbf{y}$ is written as

$$\mathbf{y} = (y_{A11},\ y_{A12},\ y_{A21},\ y_{A22},\ y_{A31},\ y_{B11},\ y_{B12},\ y_{B21},\ y_{B31},\ y_{B32},\ y_{C11},\ y_{C21},\ y_{C22},\ y_{C31},\ y_{C32})'$$

then the corresponding matrix $V$ is the 15 × 15 block-diagonal matrix

$$V = \sigma^2\,\mathrm{diag}(R,\ R,\ 1,\ R,\ 1,\ R,\ 1,\ R,\ R), \qquad R = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}.$$

On the main diagonal of the matrix $V$ are positioned the matrices $\sigma^2 R$, which correspond to the duplicate responses to a treatment in the same block, and the matrices $\sigma^2[\,1\,]$, which correspond to the response to only a single treatment in the block.

We shall now discuss the intrablock estimation of the treatment effects where both the treatments and the panelists are assumed to represent fixed effects in the model in (3.1.1). The panelist effects represent fixed effects either when it is desired to compare the specific panelists used in the experiment or when the panelists chosen to evaluate the treatments cannot realistically be assumed to represent the general public.
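A case which comes to mind in this latter situation is when trained panelists are used in an attempt to enhance the efficiency of the comparisons between the treatments.

3.2 Intrablock Estimation of the Treatment Effects

To obtain the intrablock estimates of the effects of the different treatments, we recall the form (3.1.1) of the model

$$\mathbf{y} = C(\mu \mathbf{1}_{bt} + X_1 \boldsymbol{\tau} + X_2 \boldsymbol{\beta}) + \boldsymbol{\varepsilon},$$

where the elements of the random error vector $\boldsymbol{\varepsilon}$ have the properties specified in (3.1.2). If the method of least squares is used to obtain the intrablock estimates $\hat{\boldsymbol{\tau}}$ of $\boldsymbol{\tau}$, the normal equations are

$$
\begin{bmatrix} \mathbf{1}_{bt}' C' \mathbf{y} \\ X_1' C' \mathbf{y} \\ X_2' C' \mathbf{y} \end{bmatrix}
=
\begin{bmatrix}
\mathbf{1}_{bt}' D \mathbf{1}_{bt} & \mathbf{1}_{bt}' D X_1 & \mathbf{1}_{bt}' D X_2 \\
X_1' D \mathbf{1}_{bt} & X_1' D X_1 & X_1' D X_2 \\
X_2' D \mathbf{1}_{bt} & X_2' D X_1 & X_2' D X_2
\end{bmatrix}
\begin{bmatrix} \hat{\mu} \\ \hat{\boldsymbol{\tau}} \\ \hat{\boldsymbol{\beta}} \end{bmatrix}
\tag{3.2.1}
$$

where the bt × bt matrix $D = C'C$ and the hat (^) denotes estimate. According to the definitions and parameter identities specified in Section 3.1, the forms of the matrices $X_1' D X_1$, $X_1' D X_2$, and $X_2' D X_2$ in (3.2.1) are

$$X_1' D X_1 = r I_t, \qquad X_1' D X_2 = N, \qquad \text{and} \qquad X_2' D X_2 = k I_b,$$

and therefore the normal equations (3.2.1) are expressed as

$$
\begin{bmatrix} G \\ \mathbf{T} \\ \mathbf{B} \end{bmatrix}
=
\begin{bmatrix}
bk & r\mathbf{1}_t' & k\mathbf{1}_b' \\
r\mathbf{1}_t & r I_t & N \\
k\mathbf{1}_b & N' & k I_b
\end{bmatrix}
\begin{bmatrix} \hat{\mu} \\ \hat{\boldsymbol{\tau}} \\ \hat{\boldsymbol{\beta}} \end{bmatrix}
\tag{3.2.2}
$$

where $G$ denotes the grand total of the observations and $\mathbf{T}$ and $\mathbf{B}$ are the t × 1 vector of treatment totals and the b × 1 vector of block totals, respectively.

For a solution to the normal equations, both sides of the equality in (3.2.2) are premultiplied by the matrix

$$
\begin{bmatrix}
1/bk & \mathbf{0}' & \mathbf{0}' \\
\mathbf{0} & I_t & -N/k \\
\mathbf{0} & -N'/r & I_b
\end{bmatrix}
$$

and the constraints $\mathbf{1}_t' \hat{\boldsymbol{\tau}} = 0$ and $\mathbf{1}_b' \hat{\boldsymbol{\beta}} = 0$ are imposed on the parameter estimates. Corresponding to these particular constraints imposed, the following relation results

$$
\begin{bmatrix} G/bk \\ k\mathbf{Q} \\ r\mathbf{B} - N'\mathbf{T} \end{bmatrix}
=
\begin{bmatrix} \hat{\mu} \\ A\hat{\boldsymbol{\tau}} \\ (rkI_b - N'N)\hat{\boldsymbol{\beta}} \end{bmatrix}
\tag{3.2.3}
$$

where $k\mathbf{Q} = k\mathbf{T} - N\mathbf{B}$ and $A = rkI_t - NN'$. Characteristic of these designs, the matrix $NN' = (rk - \lambda t)I_t + \lambda J$, where $J$ here denotes the t × t matrix of ones. Hence, the matrix $A$ can be expressed in the simple form

$$(1/k)A = (\lambda t/k)\,[I_t - (1/t)J].$$

From the equation $k\mathbf{Q} = \lambda t\,\hat{\boldsymbol{\tau}}$ in (3.2.3), the t × 1 vector $\hat{\boldsymbol{\tau}}$ of intrablock estimates of the treatment effects is

$$\hat{\boldsymbol{\tau}} = k\mathbf{Q}/\lambda t, \tag{3.2.4}$$

where $\lambda = (rk - 3r + 2b)/(t-1)$.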
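A small numerical sketch of (3.2.4) may help fix the notation. The function name `intrablock_estimates` and the noise-free data below are hypothetical, chosen only so that the estimates can be checked by hand; the formula for $\lambda$ assumes an ECBD with block size between t and 2t, as in Section 3.1.

```python
import numpy as np

def intrablock_estimates(N, T, B):
    """Intrablock estimates tau_hat = k Q / (lambda t), with Q = T - N B / k.

    N : t x b incidence matrix of the extended complete block design
    T : length-t vector of treatment totals
    B : length-b vector of block totals
    """
    N = np.asarray(N, dtype=float)
    T = np.asarray(T, dtype=float)
    B = np.asarray(B, dtype=float)
    t, b = N.shape
    k = N[:, 0].sum()                        # common block size
    r = N[0, :].sum()                        # common number of replications
    lam = (r * k - 3 * r + 2 * b) / (t - 1)  # identity 8 of Section 3.1
    Q = T - N @ B / k                        # treatment totals adjusted for blocks
    return k * Q / (lam * t)

# Hypothetical noise-free responses on the design of Figure 1:
# every unit of treatment i in block j responds mu + tau_i + beta_j.
N = np.array([[2, 2, 1], [2, 1, 2], [1, 2, 2]])
mu = 10.0
tau = np.array([1.0, 0.0, -1.0])             # treatment effects, summing to zero
beta = np.array([0.5, -0.5, 0.0])            # block effects
cell_means = mu + tau[:, None] + beta[None, :]
T = (N * cell_means).sum(axis=1)             # treatment totals
B = (N * cell_means).sum(axis=0)             # block totals
tau_hat = intrablock_estimates(N, T, B)
```

Because the illustrative data contain no error, `tau_hat` reproduces `tau`; with real data the same function returns the estimates adjusted for blocks.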
Furthermore, with the properties of the vector $\varepsilon$ specified by (3.1.2), the $i$th element $\hat\tau_i$ of the vector $\hat\tau$ is unbiased for $\tau_i$, since $E(\varepsilon) = 0$, and with $1_t'\hat\tau = 0$,

$$\mathrm{Cov}(\hat\tau_i, \hat\tau_{i'}) = \begin{cases} (t-1)\big(\lambda k + 2[\lambda(k+t) - 2rk]\rho\big)\sigma^2/(t\lambda)^2 , & i = i', \\ -\mathrm{Var}(\hat\tau_i)/(t-1) , & i \ne i'. \end{cases} \qquad (3.2.5)$$

Since we are interested in the pairwise comparison of the treatment effects, we also have

$$E(\hat\tau_i - \hat\tau_{i'}) = \tau_i - \tau_{i'} \qquad (3.2.6)$$

and

$$\mathrm{Var}(\hat\tau_i - \hat\tau_{i'}) = \frac{2k\sigma^2}{t\lambda} + \frac{4\{\lambda(k+t) - 2rk\}}{t\lambda^2}\,\rho\sigma^2 . \qquad (3.2.7)$$

In the formula (3.2.7), the quantity $2k\sigma^2/t\lambda$ on the right-hand side of the equality is the variance of the difference between the intrablock estimates of the treatment effects $\tau_i$ and $\tau_{i'}$ in the case of uncorrelated errors. Thus, if correlation is present between responses to the same treatment in the same block, the variance of the difference between the intrablock estimates of any two treatments, over all blocks, is greater than the variance between the same two treatments when the observations are uncorrelated, since the quantity $[\lambda(k+t) - 2rk]$ is always positive.

The intrablock analysis of variance table is presented in Table 1. It is clear from Table 1 that an exact test does not exist for testing the hypothesis of equal treatment effects when a non-zero correlation is present. If we wish to test this hypothesis, an approximate test must be performed.

Before suggesting an approximate test for the equality of the treatment effects when $\rho > 0$, we shall first consider a procedure for testing for the presence of correlation. If correlation is present, we shall need to know how this correlation affects the distributional properties of the sums of squares associated with the two sources, treatments and residual.

TABLE 1
Intrablock Analysis of Variance for an ECBD

| Source | df | Sum of Squares | EMS* |
|---|---|---|---|
| Treatments (adjusted) | $t-1$ | $SST_A = (k/\lambda t)\sum_i Q_i^2$ | $E(MST_A)$ |
| Blocks (unadjusted) | $b-1$ | $SSB_U = (1/k)\sum_j B_j^2 - G^2/bk$ | |
| Residual | $bk-t-b+1$ | (by subtraction) | $E(MSR_e)$ |
| Total | $bk-1$ | $TSS = \sum_i\sum_j\sum_l y_{ijl}^2 - G^2/bk$ | |

\* $E(MST_A) = \sigma^2\Big\{1 + \dfrac{2\rho}{\lambda k}\,[\lambda(k+t) - 2rk]\Big\} + \dfrac{\lambda t}{k(t-1)}\sum_i \tau_i^2$

$E(MSR_e) = \sigma^2 + \dfrac{1}{bk-t-b+1}\,\{(b-1)(t-1)\phi - b(k-t)\}\,\rho\sigma^2$

3.3 A Test for the Presence of Correlation

Although one of the initial steps in the analysis of data arising from a comparative type experiment is a test on the equality of the treatment effects, in this section we shall first investigate the possible presence of correlation between duplicate observations in the same block. The reason for this investigation is that if correlation is present, an exact test of the hypothesis of equal treatment effects cannot be performed and an approximate test must be derived. Furthermore, if correlation is present, the formula for the variance of the intrablock estimates of the treatment effects contains $\rho$, and an estimate of $\rho$ is needed to estimate this variance. The same is true of the formula for the variance of the difference between the intrablock estimates of the effects of two treatments.

To determine if there is evidence of correlation in the data, we shall consider a test of the hypothesis $H_0\colon \rho = 0$. If this hypothesis is rejected in favor of the alternative hypothesis $H_a\colon \rho > 0$, we shall conclude that the duplicate observations are correlated and proceed to find an estimate of $\rho$. If, on the other hand, the hypothesis is not rejected, the inference made here shall be that the duplicate observations are uncorrelated, or, if they are correlated, that there is not sufficient information in the data to show that the magnitude of the correlation is greater than zero.

In order to test the hypothesis $H_0\colon \rho = 0$, we first need to derive the form of a test statistic. To this end, recall from Table 1 that the source of variation termed residual has $bk-b-t+1$ degrees of freedom. The residual variation is a composite of duplication variation as well as another source of variation which we shall call remainder variation.
To see this, let $d_{ij}$ be the difference or range of the observations made on the $i$th treatment in the $j$th block so that if $n_{ij} = 2$, then $d_{ij} \ge 0$, and if $n_{ij} = 1$, then $d_{ij} = 0$. If each of the $d_{ij}$ is squared and these squares are summed over all treatments and blocks, then the resulting quantity when multiplied by one-half is called the sum of squares for duplication variation (SSDV). In summation notation, the sum of squares for duplication variation is given by

$$SSDV = \tfrac{1}{2}\sum_{i=1}^{t}\sum_{j=1}^{b} d_{ij}^2 .$$

The sum of squares for remainder (SSR) is found by calculating the difference, sum of squares for residual minus SSDV.

To derive the form of a test statistic for testing the hypothesis, we require the separate distributional properties of the sum of squares for duplication variation and the sum of squares for remainder. The distributional properties of SSDV and SSR are most easily obtained by rewriting SSDV and SSR as quadratic forms and then using our knowledge of the distributional properties of quadratic forms. In matrix notation then, SSDV and SSR can be expressed in the quadratic forms

$$SSDV = y'\,[\,I_{bk} - CD^{-1}C'\,]\,y = y'A_1y \qquad (3.3.1)$$

and

$$SSR = y'\Big[\,CD^{-1}C' - \tfrac{1}{k}CX_2X_2'C' - \tfrac{k}{\lambda t}\,C\big(I_{bt} - \tfrac{1}{k}X_2X_2'D\big)X_1X_1'\big(I_{bt} - \tfrac{1}{k}DX_2X_2'\big)C'\,\Big]\,y = y'A_2y , \qquad (3.3.2)$$

where both the matrices $A_1$ and $A_2$ are real, symmetric, and idempotent.

In the quadratic form (3.3.1) for SSDV, the matrix $A_1$ consists of the square matrices

$$\tfrac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \quad\text{and}\quad [\,0\,]$$

on the main diagonal and zeros in all other positions. This partitioning corresponds identically to the partitioning of the matrix $V$ as defined and illustrated in Section 3.1. Thus, by direct computation,

$$A_1V = (1-\rho)\sigma^2A_1 , \qquad (3.3.3)$$

and since $A_1$ is a real, symmetric, idempotent matrix, so also is the matrix $A_1V/\{(1-\rho)\sigma^2\}$. The trace of the matrix $A_1$ is equal to $b(k-t)$ and therefore, under the assumption of normality of the errors,

$$SSDV \sim (1-\rho)\sigma^2\,\chi^2_{b(k-t)} , \qquad (3.3.4)$$

where $\chi^2_\nu$ denotes a random variable with a central chi square distribution with $\nu$ degrees of freedom.
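As a small numerical illustration of the two ways of writing SSDV, the sketch below computes one-half the sum of squared ranges and compares it with the sum of squared deviations of the observations from their cell means, which is what the quadratic form $y'(I - CD^{-1}C')y$ amounts to. The responses and the dictionary layout are made up for the illustration and are not data from the thesis.

```python
def ssdv_from_ranges(cells):
    """One-half the sum of squared ranges d_ij; d_ij = 0 for a single response."""
    total = 0.0
    for obs in cells.values():
        d = max(obs) - min(obs)
        total += 0.5 * d * d
    return total


def ssdv_quadratic_form(cells):
    """Sum of squared deviations from the cell means, i.e. y'(I - C D^{-1} C')y."""
    total = 0.0
    for obs in cells.values():
        mean = sum(obs) / len(obs)
        total += sum((y - mean) ** 2 for y in obs)
    return total


# Hypothetical responses keyed by (treatment, block); each cell holds 1 or 2 values.
cells = {
    (1, 1): [5.0, 7.0],
    (2, 1): [4.0],
    (1, 2): [6.0],
    (2, 2): [3.0, 3.5],
}
by_ranges = ssdv_from_ranges(cells)
by_form = ssdv_quadratic_form(cells)
print(by_ranges, by_form)
```

The agreement holds because each cell contains at most two observations, so the squared deviations about a cell mean reduce to half the squared range.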
With $E(\cdot)$ denoting mathematical expectation, then

$$E(SSDV) = b(k-t)(1-\rho)\sigma^2 \qquad (3.3.5)$$

and

$$E(MSDV) = (1-\rho)\sigma^2 , \qquad (3.3.6)$$

where MSDV denotes the mean square for duplication variation. (An alternate derivation of the distribution of SSDV is presented in Section 3.4.)

In the quadratic form (3.3.2) for SSR, the matrix $A_2$ is real, symmetric, and idempotent. However, it can be shown that if $c$ is a scalar constant, the equality $A_2VA_2 = cA_2$ is not true in general. (To see this would only require working through the small example where $t = b = 3$ and $k = r = 4$.) Hence, unlike SSDV, the random variable SSR does not have an exact weighted chi square distribution when $\rho > 0$. The exact distribution of SSR is discussed in Section 3.4.

The distributions of SSDV and SSR are independent, since $A_1VC = 0$. The trace of the matrix $A_2V$ is equal to $(b-1)(t-1)(1+\phi\rho)\sigma^2$, so that

$$E(MSR) = (1+\phi\rho)\sigma^2 , \qquad (3.3.7)$$

where $\phi$ is defined in Table 1.

When the hypothesis $H_0\colon \rho = 0$ is true, the random variable $SSR/\sigma^2$ has a chi square distribution. Hence, to test the hypothesis $H_0\colon \rho = 0$, the test statistic $F_\rho = MSR/MSDV$ is used. When $\rho = 0$, the test statistic has an $F$ distribution with $(b-1)(t-1)$ and $b(k-t)$ degrees of freedom in the numerator and denominator, respectively. Therefore, the hypothesis is rejected in favor of the alternative hypothesis $H_a\colon \rho > 0$ for large values of $F_\rho$.

A brief discussion of the power of this test is reserved for a later section, since we must first consider the distributional properties of SSR when $\rho > 0$.

3.4 The Exact Distribution of SSR for $\rho > 0$

In the previous section, it was shown that when $\rho > 0$, the sum of squares for remainder does not in general have a weighted chi square distribution. This is because the matrix $A_2$ of the quadratic form SSR does not necessarily satisfy the equality $A_2VA_2 = cA_2$, where $c$ is some constant and $V$ is the covariance matrix of the observations.
In this section we shall seek to find an expression involving independent chi square distributed random variables from which the moments of the distribution of SSR can be found.

The approach we shall use to find the distribution of SSR involves rewriting the model in (3.1.1) in the form

$$y_{ijl} = \mu + \tau_i + \beta_j + (1-\rho)^{1/2}z_{ijl} + \rho^{1/2}u_{ij} , \qquad (3.4.1)$$

$i = 1, 2, \ldots, t$; $j = 1, 2, \ldots, b$; and $l = 1, 2, \ldots, n_{ij}$, where the $\tau_i$ are the treatment effects, the $\beta_j$ are the block effects, $\rho$ is the magnitude of the correlation between duplicate responses to the same treatment in the same block, and $z_{ijl}$ and $u_{ij}$ are independent, identically distributed normal random variables each with mean zero and variance $\sigma^2$.

Let us now define the random variable $SSR\,|\,u_{ij}$ to be the usual sum of squares for interaction (which we have chosen to call remainder in our additive model) given the $u_{ij}$. Since the conditional distribution of SSR given $u_{ij}$ can be found, the form of the unconditional distribution of SSR is obtained by taking the expectation of the random variable $SSR\,|\,u_{ij}$ with respect to $u_{ij}$. The distribution of the random variable $SSR\,|\,u_{ij}$ is given by

$$SSR\,|\,u_{ij} \sim \sigma^2(1-\rho)\,\chi^2_{(b-1)(t-1)}\!\Big(\frac{\rho}{2(1-\rho)}R^2\Big) , \qquad (3.4.2)$$

where $\chi^2_\nu(\lambda)$ denotes a random variable with a non-central chi square distribution with $\nu$ degrees of freedom and noncentrality parameter $\lambda$. In the noncentrality parameter of the distribution in (3.4.2), $R^2$ is of the form

$$R^2 = \min_{\tau^*,\,\beta^*}\ \sum_i\sum_j n_{ij}\,(u_{ij} - \tau_i^* - \beta_j^*)^2 , \qquad (3.4.3)$$

where $\tau_i^*$ and $\beta_j^*$ are the parameters in the conditional distribution corresponding to $\mu+\tau_i$ and $\mu+\beta_j$, respectively, in the unconditional distribution. In order that the distribution in (3.4.2) be expressed in a simpler form, it is convenient to write $R^2$ in terms of the design parameters $t$, $b$, $k$, and $r$.
To this end, let $D^{*-1}$ denote the diagonal matrix of cell frequencies associated with a $t \times b$ table in a two-way cross classification, that is,

$$D^{*-1} = \mathrm{diag}(n_{11}, n_{12}, \ldots, n_{1b}, n_{21}, \ldots, n_{tb}) .$$

Then (3.4.3) may be written in the form

$$R^2 = \min_{\gamma\,:\,F'\gamma = 0}\ (u-\gamma)'D^{*-1}(u-\gamma) , \qquad (3.4.4)$$

where

$$\gamma' = (\gamma_{11}, \ldots, \gamma_{1b}, \gamma_{21}, \ldots, \gamma_{tb}) , \qquad u' = (u_{11}, \ldots, u_{1b}, u_{21}, \ldots, u_{tb}) ,$$

and $F$ is a matrix of constraints for additivity in a $t \times b$ cross classification. The form of the matrix $F$ is given shortly.

The quantity $R^2$ equals the minimum value of the quadratic form in (3.4.4). To find this minimum value, we write

$$Q = (u-\gamma)'D^{*-1}(u-\gamma) + 2\Pi'F'\gamma ,$$

where $\Pi$ is a $(b-1)(t-1) \times 1$ vector of Lagrange multipliers. Differentiating $Q$ with respect to $\gamma$ and setting the result equal to zero, the minimum value in (3.4.4) is

$$R^2 = u'F(F'D^*F)^{-1}F'u . \qquad (3.4.5)$$

An expression for (3.4.5) involving the design parameters $t$, $b$, $k$, and $r$ for our problem requires the latent roots of the matrix $F(F'D^*F)^{-1}F'$. If these roots are denoted by $\theta$, then the $\theta$ are the solutions to the equation

$$|\,F(F'D^*F)^{-1}F' - \theta I\,| = 0 . \qquad (3.4.6)$$

Using the following identity,

$$\begin{vmatrix} \theta I & F \\ F' & F'D^*F \end{vmatrix} = \theta^{\,b+t-1}\,|\,\theta F'D^*F - F'F\,| = |\,F'D^*F\,|\;|\,\theta I - F(F'D^*F)^{-1}F'\,| ,$$

we find that $b+t-1$ roots of (3.4.6) are zero while the remaining $(b-1)(t-1)$ roots are positive. These latter $(b-1)(t-1)$ positive roots can be found by solving for $\theta'$ in the equation

$$|\,F'D^*F - \theta'F'F\,| = 0 \qquad (3.4.7)$$

and setting $\theta = 1/\theta'$. Since the $\theta'$ are non-zero, the fraction $1/\theta'$ is always defined.

We should like to express equation (3.4.7) in a simpler form to find the values of $\theta'$. To this end, the matrix of constraints $F$ is written as the direct product of two other matrices. This direct product is

$$F = F(t-1) \otimes F(b-1) ,$$

where the two matrices are defined by

$$F(t-1) = \begin{bmatrix} 1_{t-1}' \\ -I_{t-1} \end{bmatrix}_{t \times (t-1)} \quad\text{and}\quad F(b-1) = \begin{bmatrix} 1_{b-1}' \\ -I_{b-1} \end{bmatrix}_{b \times (b-1)} .$$

Then

$$F'F = L_{t-1} \otimes L_{b-1} , \quad\text{where}\quad L_a = I_a + J_a . \qquad (3.4.8)$$

That is, $L_a$ is an $a \times a$ matrix with 2's on the main diagonal and 1's in all other positions.
We now make use of the following theorem and corollary to find the values of $\theta'$ satisfying (3.4.7).

Theorem. Let $W$ be an $m \times m$ matrix with the distinct latent roots $w_j$ with respective multiplicities $m_j$, $j = 1, 2$. Let $R$ be an $m \times s$ matrix satisfying $R'R = I_s$. Then the roots of the matrix $R'WR$ are the values of $\theta'$ satisfying the equation $|\,R'WR - \theta'I\,| = 0$. These values are $\theta' = w_1\theta'' + w_2(1-\theta'')$, where the $\theta''$ are the solutions of $|\,R'MR - \theta''I\,| = 0$ with

$$M = (W - w_2I_m)/(w_1 - w_2) .$$

Proof: The proof follows directly by replacing $M$ with $(W - w_2I_m)/(w_1 - w_2)$ in the determinant $|\,R'MR - \theta''I\,| = 0$ and simplifying.

Corollary. If $W$ is defined as in the theorem, and $F$ is an $m \times s$ matrix of full rank $s < m$, then the solutions $\theta'$ of the equation $|\,F'WF - \theta'F'F\,| = 0$ are $\theta' = w_1\theta'' + w_2(1-\theta'')$, where the $\theta''$ are the solutions of $|\,F'MF - \theta''F'F\,| = 0$ with $M$ defined in the theorem.

Proof: There exists a matrix $K$ such that $K'F'FK = I_s$. Let $R' = K'F'$ and apply the theorem.

In our problem, $W$ is the matrix $D^*$. Therefore from the theorem, $M$ is a diagonal matrix of ones and zeros. Referring to the corollary to obtain the values of $\theta'$ satisfying (3.4.7), we now need only to find the solutions $\theta''$ of

$$|\,F'MF - \theta''F'F\,| = 0 . \qquad (3.4.9)$$

Because the structure of the matrix $F'F$ depends on the matrices $L_{t-1}$ and $L_{b-1}$ as defined in (3.4.8), a forward Doolittle procedure is performed on $L_a$ to find that the values of $\theta''$ satisfying (3.4.9) are the same as the values of $\theta''$ satisfying

$$|\,F^{*\prime}MF^* - \theta''I_{(b-1)(t-1)}\,| = 0 , \qquad (3.4.10)$$

where $F^* = H(t) \otimes H(b)$ with the matrix $H(a)$ defined as the first $a-1$ columns of the $a \times a$ Helmert orthogonal matrix. Since $M = M'$ and $M'M = M$, then $F^{*\prime}MF^* = (MF^*)'MF^*$, and the positive values of $\theta''$ satisfying (3.4.10) are the positive solutions $\theta''$ satisfying

$$|\,MF^*F^{*\prime}M - \theta''I_{bt}\,| = 0 . \qquad (3.4.11)$$

Since we may write

$$F^*F^{*\prime} = (G_t \otimes G_b)/bt ,$$

where $G_a$ is an $a \times a$ matrix with $a-1$ on the main diagonal and $-1$ in the other positions, then the positive solutions $\theta''$ of (3.4.11) are functions of the positive solutions $\theta^*$ satisfying the equation

$$|\,M(G_t \otimes G_b)M - \theta^*I_{bt}\,| = 0 , \qquad (3.4.12)$$

where $\theta^* = bt\,\theta''$.

At this stage, an expression for the random variable SSR cannot yet be obtained for general $t$, $b$, $k$, and $r$. In the remainder of this section, we shall derive an expression for the random variable SSR when $\rho > 0$ for the special case $t = b$ and $r = k = t+1$. A similar expression for SSR when $\rho > 0$ for the case $b = 2t$, $k = t+1$, and $r = 2(t+1)$ is presented in Appendix 1.

For the special case considered in this section, the matrix $M(G_t \otimes G_b)M$ in (3.4.12) has the non-zero partition

$$\{(t-1)^2 - 1\}\,I_t + J_t ,$$

for which the positive latent roots are $t(t-2)$ and $t(t-1)$ with multiplicities $t-1$ and 1, respectively. Hence, since these roots are simple multiples of the solutions $\theta''$ of (3.4.9), the $\theta''$ are

$$\theta'' = \begin{cases} (t-2)/t & \text{with multiplicity } t-1 , \\ (t-1)/t & \text{with multiplicity } 1 , \\ 0 & \text{with multiplicity } (t-1)^2 - t . \end{cases}$$

With the use of the corollary where $w_1 = \tfrac{1}{2}$ and $w_2 = 1$, the values of $\theta'$ satisfying (3.4.7) are

$$\theta' = \begin{cases} (t+2)/2t & \text{with multiplicity } t-1 , \\ (t+1)/2t & \text{with multiplicity } 1 , \\ 1 & \text{with multiplicity } (t-1)^2 - t , \end{cases}$$

and the values of $\theta$ satisfying (3.4.6) are

$$\theta = \begin{cases} 2t/(t+2) & \text{with multiplicity } t-1 , \\ 2t/(t+1) & \text{with multiplicity } 1 , \\ 1 & \text{with multiplicity } (t-1)^2 - t . \end{cases}$$

In the distribution of the conditional random variable SSR given $u_{ij}$ in (3.4.2), we may now express $R^2$ as

$$R^2 = \frac{2t}{t+2}\,\chi^2_{t-1} + \frac{2t}{t+1}\,\chi^2_1 + \chi^2_{(t-1)^2-t} .$$

Furthermore, upon taking the expectation of $SSR\,|\,u_{ij}$ with respect to $u_{ij}$ and using the notation $SSR = E(SSR\,|\,u_{ij})$, the sum of squares for remainder when $\rho > 0$ is distributed as

$$SSR/\sigma^2 \sim a_1\chi^2_{v_1} + a_2\chi^2_{v_2} + a_3\chi^2_{v_3} , \qquad (3.4.13)$$

where

$$a_1 = 1 + \rho\,\frac{t-2}{t+1} , \qquad a_2 = 1 + \rho\,\frac{t-1}{t+1} , \qquad a_3 = 1 ,$$

and

$$v_1 = t-1 , \qquad v_2 = 1 , \qquad v_3 = (t-1)^2 - t .$$

An approximate distribution to (3.4.13) will be obtained in Section 3.5.

The conditional distribution of the usual sum of squares for error given $u_{ij}$ is

$$SSE\,|\,u_{ij} \sim \sigma^2(1-\rho)\,\chi^2_{b(k-t)}(0) ,$$

and the random variable $SSE\,|\,u_{ij}$ is independent of $u_{ij}$. Hence, we have

$$SSDV \sim \sigma^2(1-\rho)\,\chi^2_{b(k-t)}(0) ,$$

where SSDV denotes the expectation of the random variable $SSE\,|\,u_{ij}$ with respect to $u_{ij}$. Since the duplication variation sum of squares random variable is also independent of the random variable $SSR\,|\,u_{ij}$, the random variable SSDV is independent of the random variable SSR. By independent random variables is meant that the distributions of the random variables are independent. (The independence of the distributions of SSDV and SSR was established previously in Section 3.3 through the use of quadratic forms.)

To this point, it has been shown that for the special case $t = b$ and $r = k = t+1$, the random variable SSR is distributed as a sum of weighted independent chi square random variables when $\rho > 0$. We still do not have the exact form of the density of the random variable SSR, which is necessary in order to specify the distribution of the random variable SSR/SSDV. The distribution of SSR/SSDV is also necessary in order that we may calculate the power of the test of the hypothesis $H_0\colon \rho = 0$ for non-zero values of $\rho$. Since an exact form of the density of SSR would likely require an excessive amount of work and since a simpler form of an approximating distribution of SSR would suffice for our problem in a majority of cases, an approximate distribution of the random variable SSR will now be considered. A check on the accuracy of the approximate distribution when compared to the exact distribution of SSR when $\rho > 0$ is presented in Table A2 of Appendix 2.

3.5 An Approximate Distribution of SSR for $\rho > 0$

In this section we shall consider an approximation of the distribution of the sum of squares for remainder when $\rho > 0$ for the special case $t = b$ and $r = k = t+1$. An approximate distribution of SSR when $\rho > 0$ for the special case $b = 2t$, $k = t+1$, and $r = 2(t+1)$ is given in Appendix 2.

There are numerous approaches that could be used to approximate the distribution of SSR.
The approach used in this section (and also used in Appendix 2) was introduced by Box (1954). The rationale in selecting Box's approximation lies not only in its relative ease of application but also in the fact that it was shown by Box that the approximate distribution of a quadratic form agrees fairly well with the exact distribution except when small differences in probability are to be examined. We now state the theorem in his paper which we shall use.

Theorem. The quadratic form

$$Q = \sum_{j=1}^{p} a_j\chi^2_{v_j}$$

is distributed approximately as $g\chi^2_h$, where

$$g = \sum_{j=1}^{p} a_j^2v_j \Big/ \sum_{j=1}^{p} a_jv_j \qquad (3.5.1)$$

and

$$h = \Big(\sum_{j=1}^{p} a_jv_j\Big)^2 \Big/ \sum_{j=1}^{p} a_j^2v_j . \qquad (3.5.2)$$

In both of the expressions for the scale constant $g$ and the degrees of freedom $h$, the $a_j$ are scalars and the $v_j$ are the degrees of freedom of the respective chi square random variables that are summed to form $Q$, $j = 1, 2, \ldots, p$.

In our problem we seek to approximate the distribution of the random variable SSR, where

$$SSR/\sigma^2 \sim a_1\chi^2_{v_1} + a_2\chi^2_{v_2} + a_3\chi^2_{v_3}$$

with $a_j$ and $v_j$, $j = 1, 2,$ and 3, defined following (3.4.13). From the theorem by Box stated above, if the $a_j$ and $v_j$ are substituted into (3.5.1) and (3.5.2) to find $g$ and $h$, respectively, we may say that $SSR/\sigma^2$ is approximately distributed as $g$ times a chi square random variable with $h$ degrees of freedom. A tabulation of the values of $g$ and $h$ corresponding to the integer values 3, 4, 5, 6, and 7 of $t$ and to some values of $\rho$ in the interval between zero and one is presented in Table 2, where $h$ has been rounded to the nearest integer and $g$ has been rounded to four decimal places.

The approximate distribution of SSR may be used, when testing the null hypothesis $H_0\colon \rho = 0$ against the general alternative hypothesis $H_a\colon \rho > 0$, to compute the power of the test under the alternative hypothesis for values of $\rho$ greater than zero but less than one.
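The constants of Box's approximation are simple to compute. The sketch below assumes the usual moment-matching forms, $g = \sum a_j^2v_j / \sum a_jv_j$ and $h = (\sum a_jv_j)^2 / \sum a_j^2v_j$, and applies them to the coefficients and degrees of freedom listed in the first row of Table 2 ($t = 3$, $\rho = 0.1$). The function name is illustrative only.

```python
def box_approximation(a, v):
    """Scale g and degrees of freedom h such that sum_j a_j * chi2(v_j) ~ g * chi2(h).

    Matches the first two moments of the weighted sum of independent
    chi square variables (moment-matching form, assumed here).
    """
    s1 = sum(aj * vj for aj, vj in zip(a, v))        # proportional to the mean
    s2 = sum(aj * aj * vj for aj, vj in zip(a, v))   # proportional to the variance
    return s2 / s1, s1 * s1 / s2


# Coefficients and degrees of freedom from the first row of Table 2.
a = [1.025, 1.05, 1.0]
v = [2, 1, 1]
g, h = box_approximation(a, v)
print(round(g, 4), round(h))
```

Rounded as in Table 2, this gives the tabulated pair for that row. When all the $a_j$ are equal, the function returns $g = a_j$ and $h = \sum v_j$, so the approximation is exact in that case.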
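Since the distribution of SSR is independent of the distribution of SSDV, then under the alternative hypothesis we approximate the distribution of the statistic

$$F_\rho = \frac{MSR}{MSDV}$$

with a weighted $F$ distribution with $h$ and $b(k-t)$ degrees of freedom, where the weight is given by $g/(1-\rho)$. That is,

$$\frac{1-\rho}{g}\,F_\rho \sim F_{h,\,b(k-t)} \qquad (3.5.3)$$

approximately. The probability that $F_\rho$ exceeds some value $F_0$ is approximately equal to the probability that the random variable $F_{h,\,b(k-t)}$ exceeds $(1-\rho)F_0/g$.

TABLE 2
Values of g and h for the Approximate Distribution of SSR

| $\rho$ | $t$ | $v_1$ | $v_2$ | $v_3$ | $a_1$ | $a_2$ | $a_3$ | $g$ | $h$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 3 | 2 | 1 | 1 | 1.025 | 1.05 | 1 | 1.0253 | 4 |
| 0.3 | 3 | 2 | 1 | 1 | 1.075 | 1.15 | 1 | 1.0776 | 4 |
| 0.5 | 3 | 2 | 1 | 1 | 1.125 | 1.25 | 1 | 1.1319 | 4 |
| 0.7 | 3 | 2 | 1 | 1 | 1.175 | 1.35 | 1 | 1.1880 | 4 |
| 0.9 | 3 | 2 | 1 | 1 | 1.225 | 1.45 | 1 | 1.2457 | 4 |
| 0.1 | 4 | 3 | 1 | 5 | 1.04 | 1.06 | 1 | 1.0205 | 9 |
| 0.3 | 4 | 3 | 1 | 5 | 1.12 | 1.18 | 1 | 1.0645 | 9 |
| 0.5 | 4 | 3 | 1 | 5 | 1.2 | 1.3 | 1 | 1.1121 | 9 |
| 0.7 | 4 | 3 | 1 | 5 | 1.28 | 1.42 | 1 | 1.1629 | 9 |
| 0.9 | 4 | 3 | 1 | 5 | 1.36 | 1.54 | 1 | 1.2166 | 9 |
| 0.1 | 5 | 4 | 1 | 11 | 1.05 | 1.0667 | 1 | 1.0173 | 16 |
| 0.3 | 5 | 4 | 1 | 11 | 1.15 | 1.2 | 1 | 1.0554 | 16 |
| 0.5 | 5 | 4 | 1 | 11 | 1.25 | 1.3333 | 1 | 1.0978 | 16 |
| 0.7 | 5 | 4 | 1 | 11 | 1.35 | 1.4667 | 1 | 1.1441 | 16 |
| 0.9 | 5 | 4 | 1 | 11 | 1.45 | 1.6 | 1 | 1.1940 | 15 |
| 0.1 | 6 | 5 | 1 | 19 | 1.0571 | 1.0714 | 1 | 1.0149 | 25 |
| 0.3 | 6 | 5 | 1 | 19 | 1.1714 | 1.2143 | 1 | 1.0485 | 25 |
| 0.5 | 6 | 5 | 1 | 19 | 1.2857 | 1.3571 | 1 | 1.0867 | 25 |
| 0.7 | 6 | 5 | 1 | 19 | 1.4 | 1.5 | 1 | 1.1291 | 24 |
| 0.9 | 6 | 5 | 1 | 19 | 1.5143 | 1.6429 | 1 | 1.1754 | 24 |
| 0.1 | 7 | 6 | 1 | 29 | 1.0625 | 1.075 | 1 | 1.0131 | 36 |
| 0.3 | 7 | 6 | 1 | 29 | 1.1875 | 1.225 | 1 | 1.0431 | 36 |
| 0.5 | 7 | 6 | 1 | 29 | 1.3125 | 1.375 | 1 | 1.0778 | 35 |
| 0.7 | 7 | 6 | 1 | 29 | 1.4375 | 1.525 | 1 | 1.1168 | 35 |
| 0.9 | 7 | 6 | 1 | 29 | 1.5625 | 1.675 | 1 | 1.1599 | 35 |

A method for estimating the magnitude of the correlation between duplicate responses observed with the same treatment in the same block will be discussed in the following section.

3.6 An Estimate of the Correlation

In Section 3.3 a procedure for testing the hypothesis $H_0\colon \rho = 0$ was outlined in detail. As mentioned at the beginning of Section 3.3, the test of the hypothesis of zero correlation is normally the first action to be taken during the analysis of the experimental data. If the hypothesis is rejected, we should then want an estimate of $\rho$.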
The estimate of $\rho$ would be used when estimating the variance of the intrablock estimate of the effect of the $i$th treatment as shown in (3.2.5) and/or when estimating the variance of the difference between the intrablock estimates of the effects of two treatments as shown in (3.2.7). Still another use for the estimate of $\rho$ would be when estimating the efficiency of the extended complete block design compared to the complete block design as shown in the paper by Cornell (1974). The value of the relative efficiency of the two designs could be very useful when considering designs for subsequent experimentation, particularly in a sensory experiment where the same panelists are to be used in additional experiments.

Referring to the formulae (3.3.6) and (3.3.7), we see that the expectations of the mean squares for duplication variation and for remainder variation are

$$E(MSDV) = (1-\rho)\sigma^2 \qquad (3.6.1)$$

and

$$E(MSR) = (1+\phi\rho)\sigma^2 , \qquad (3.6.2)$$

respectively. If a ratio of two linear combinations of these expected mean squares is considered, we can express the correlation $\rho$ in the form

$$\rho = \{E(MSR) - E(MSDV)\}/\{E(MSR) + \phi E(MSDV)\} . \qquad (3.6.3)$$

As an estimate of $\rho$ then, the expectations in (3.6.3) are replaced by their respective mean squares, resulting in the formula

$$\hat\rho = \frac{MSR - MSDV}{MSR + \phi\,MSDV} . \qquad (3.6.4)$$

Similarly from (3.6.1) and (3.6.2), an estimate of $\sigma^2$ may be obtained as

$$\hat\sigma^2 = \frac{MSR + \phi\,MSDV}{1 + \phi} .$$

Since the calculated value of MSR is always greater than or equal to zero, then from (3.6.4) $\hat\rho \ge -1/\phi$. Furthermore, since the calculated value of MSDV is always greater than or equal to zero, then $\hat\rho \le 1$. If these extremes are considered as the endpoints of the range for the values of the estimate $\hat\rho$, then $-1/\phi \le \hat\rho \le 1$. Since we are interested only in the values of $\rho$ in the interval between zero and one, any negative value of $\hat\rho$ calculated is considered meaningless and is set equal to zero in this case.
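The estimator (3.6.4), the truncation at zero, and the companion estimate of $\sigma^2$ can be sketched as follows. The mean squares and the value of $\phi$ used in the example are illustrative assumptions, not values from the thesis; $\phi$ is the design constant referred to in Table 1 and must be supplied by the user.

```python
def estimate_rho(msr, msdv, phi):
    """rho_hat = (MSR - MSDV) / (MSR + phi * MSDV), set to zero when negative.

    Assumes MSR and MSDV are not both zero (the ratio is then undefined).
    """
    rho_hat = (msr - msdv) / (msr + phi * msdv)
    return max(rho_hat, 0.0)


def estimate_sigma2(msr, msdv, phi):
    """sigma2_hat = (MSR + phi * MSDV) / (1 + phi)."""
    return (msr + phi * msdv) / (1.0 + phi)


# Illustrative check: feed in the expected mean squares for known rho and sigma^2.
phi, rho, sigma2 = 0.225, 0.4, 2.0          # hypothetical values
msdv = (1.0 - rho) * sigma2                 # E(MSDV) from (3.6.1)
msr = (1.0 + phi * rho) * sigma2            # E(MSR) from (3.6.2)
rho_hat = estimate_rho(msr, msdv, phi)
sigma2_hat = estimate_sigma2(msr, msdv, phi)
truncated = estimate_rho(1.0, 2.0, phi)     # MSR < MSDV gives a negative raw value
print(rho_hat, sigma2_hat, truncated)
```

Substituting the expected mean squares returns the parameter values used to generate them, which is a quick check that (3.6.3) and the expression for $\hat\sigma^2$ are algebraically consistent with (3.6.1) and (3.6.2).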
(Setting a negative estimate of a non-negative parameter equal to zero is a procedure practiced when estimating variance components in random and mixed models.)

The distribution of the random variable $\hat\rho$ depends on the forms of the distributions of the random variables SSR and SSDV. It was shown in Section 3.4 that the random variable SSR is distributed as a weighted sum of independent chi square random variables. Although an approximate distribution of SSR was given in Section 3.5, an approximation to the distribution of $\hat\rho$ is not at present available. Nevertheless, the first two moments of the distribution of $\hat\rho$ could be approximated using a Taylor series. That is, the formula in (3.6.4) for $\hat\rho$ may be expressed in a Taylor series, and the mean and the variance of the distribution of $\hat\rho$ could be approximated with a finite number of terms in the series by taking the appropriate expectations.

CHAPTER 4
A PARTIALLY BALANCED GROUP DIVISIBLE ECBD

In the extended complete block designs presented thus far, only the class of balanced incomplete block designs was considered in the extended portion of the $b$ blocks. That is, in making the extended complete blocks of size $k$, we have considered in combination with the complete blocks of size $t$ only balanced incomplete blocks of size $k-t$. By restricting attention to the use of balanced incomplete block designs only in the extended portion, the extended complete block designs retain the property of balance among the treatments. By balance is meant that the off-diagonal elements in the matrix $A$ (or $NN'$) in (3.2.3) are all equal, resulting in a single value of the variance for all pairwise treatment comparisons. Hence, all pairwise treatment comparisons could be made with the same precision.
When it is not necessary to have equal precision for all pairwise treatment comparisons, or when achieving balance with balanced incomplete block designs requires a large number of replications of the treatments or possibly too many blocks, a partially balanced incomplete block design (PBIBD) might be used in the extended portion of an extended complete block design. To illustrate this point, consider an extended complete block design consisting of six treatments in blocks of size nine. If a BIBD were used in the extended portion of the blocks, the balanced design would require ten extended blocks supporting fifteen replicates of each of the six treatments. On the other hand, if a PBIBD were used in the extended portion, only six blocks supporting nine replications of each treatment would be required.

In this and subsequent chapters then, the use of PBIB designs in the extended portion of the extended complete block design will be considered. Specifically, we shall limit our attention to the use of PBIB designs with two associate classes. By relaxing the requirement of balance, in most cases we do not sacrifice much precision when considering PBIB designs with two associate classes, where instead two variances are required for making all pairwise comparisons of the treatments. The two variances arise because with each treatment a subset of the $t-1$ other treatments are first associates while the remaining other treatments are second associates. One variance is used for pairwise comparisons among the treatments that are first associates while the second variance is used among treatments that are second associates. The generalization to partially balanced incomplete block designs with more than two associate classes should be straightforward.

For the balanced extended complete block designs, the case where responses to the same treatment in the same block were positively correlated was presented in detail.
The general theory developed in Section 3.4 on the exact distribution of SSR when $\rho > 0$ with the additive model could be used with partially balanced extended complete block designs. Hence, since the theory is general, the analysis of partially balanced ECB designs with correlated observations will not be presented. Our discussion will be limited to the additive and non-additive models when all observations are uncorrelated.

The group divisible association scheme for $t = mn$ treatments, where $m$ and $n$ are integers, is derived by partitioning the treatments into $m$ groups of $n$ treatments each, with those in the same group being first associates and those in different groups being second associates. For example, with six treatments (denoted by the numbers 1, 2, 3, 4, 5, and 6) a group divisible association scheme for three groups of two treatments each would be given by the $3 \times 2$ rectangular array

    1  2
    3  4
    5  6

In the array, treatments in the same row (1 and 2, 3 and 4, 5 and 6) are first associates. Treatments not in the same row as a specified treatment are second associates of that treatment. For example, the set of second associates of treatment 1 consists of the treatments 3, 4, 5, and 6.

For a PBIBD with the group divisible association scheme and incidence matrix $N^*$, the matrix $N^*N^{*\prime}$ may be arranged in a particular pattern that will be described in detail in Section 4.2. The particular pattern of the matrix $N^*N^{*\prime}$ facilitates finding a solution of the normal equations for the intrablock estimates of the treatment effects. The pattern of $N^*N^{*\prime}$ carries over to the matrix $NN'$, where $N$ is the incidence matrix of the extended complete block design. Extended complete block designs generated by the class of PBIB designs with the group divisible association scheme will be called "partially balanced group divisible extended complete block designs."
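The association scheme just described can be written down mechanically. The following sketch assumes the treatments are numbered 1 through mn row by row in the m x n array, as in the six-treatment example; the function name is illustrative only.

```python
def group_divisible_associates(m, n):
    """First and second associates for t = m*n treatments numbered row by row.

    Treatments in the same row (group) of the m x n array are first
    associates; treatments in different rows are second associates.
    """
    all_treatments = set(range(1, m * n + 1))
    first, second = {}, {}
    for trt in sorted(all_treatments):
        group = (trt - 1) // n                              # zero-based row index
        same_row = set(range(group * n + 1, (group + 1) * n + 1))
        first[trt] = sorted(same_row - {trt})
        second[trt] = sorted(all_treatments - same_row)
    return first, second


# The 3 x 2 array of the six-treatment example.
first, second = group_divisible_associates(3, 2)
print(first[1], second[1])
```

Each treatment therefore has $n - 1$ first associates and $t - n$ second associates, which are the counts that enter the analysis of these designs.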
4.1 Definitions and Notation

An extended complete block design generated by a PBIBD is defined as a connected, two-way classification with the following properties:

1. Each treatment is applied either $c_0$ or $c_1$ times in a block, $c_i > 0$, $i = 0, 1$.
2. In the incidence matrix of the design, replacement of $c_0$ by zero and $c_1$ by unity results in the incidence matrix $N^*$ of a PBIBD (with two associate classes).

It follows from this definition that the incidence matrix $N$ of such a design can be generated from the incidence matrix of any PBIBD. That is, given a PBIBD with the incidence matrix $N^*$ and denoting by $J$ a matrix of 1's (the incidence matrix of a complete block design), the incidence matrix of an extended complete block design (ECBD) generated by a PBIBD is

$$N = c_0J + (c_1 - c_0)N^* . \qquad (4.1.1)$$

This equation is identical to the equation (2.1) for the incidence matrix of a balanced ECBD except that $N^*$ is now the incidence matrix of the generating PBIBD.

The parameters associated with an ECBD generated by a PBIBD are as follows:

$t$ = the number of treatments,
$b$ = the number of blocks,
$k$ = the block size,
$r$ = the number of replications of each treatment in the experiment,
$\lambda_i$ = the number of distinct pairs of experimental units which receive any fixed pair of $i$th associates while appearing in the same block, $i = 1, 2$,
$n_i$ = the number of $i$th associates of each treatment, $i = 1, 2$, and
$p^i_{jk}$ = the number of treatments that are both $j$th associates of treatment $\alpha$ and $k$th associates of treatment $\beta$ given that $\alpha$ and $\beta$ are $i$th associates.

The corresponding design parameters of the generating PBIBD will be denoted by $t$, $b$, $k^*$, $r^*$, $\lambda_i^*$, $n_i$, and $p^i_{jk}$. Owing to the definitions given above for the parameters of the generating PBIBD as well as the ECBD generated by a PBIBD, the following relationships are satisfied:

1. $r = c_1r^* + (b - r^*)c_0 = c_0b + (c_1 - c_0)r^*$ (4.1.2)
2. $k = c_1k^* + (t - k^*)c_0 = c_0t + (c_1 - c_0)k^*$ (4.1.3)
3. $r^*t = bk^*$ (4.1.4)
4. $rt = bk$ (4.1.5)
5. $\lambda_i = (c_1 - c_0)^2\lambda_i^* + c_0(2r - bc_0)$, $i = 1, 2$ (4.1.6)
6. $rk - (c_0 + c_1)r + c_0c_1b = n_1\lambda_1 + n_2\lambda_2$. (4.1.7)

In matrix notation, the non-additive model is

$$y = C(\mu 1_{bt} + X_1\tau + X_2\beta + \gamma) + \varepsilon , \qquad (4.1.8)$$

where $\mu$ is the overall mean effect, $\tau$ is a $t \times 1$ vector of treatment effects, $\beta$ is a $b \times 1$ vector of block effects, and $\gamma$ is a $bt \times 1$ vector of block x treatment interaction effects. Letting $1_t$ denote a $t \times 1$ vector of 1's, $I_t$ denote the $t \times t$ identity matrix, and $y_{ijl}$ represent the $l$th response to the $i$th treatment in the $j$th block, the vector $y$ and the matrices $C$, $X_1$, and $X_2$ in (4.1.8) are of the forms

$$y = (y_{111}, \ldots, y_{11n_{11}}, y_{211}, \ldots, y_{tbn_{tb}})' \quad (bk \times 1) ,$$

$$X_1 = \begin{bmatrix} I_t \\ I_t \\ \vdots \\ I_t \end{bmatrix}_{bt \times t} , \qquad X_2 = \begin{bmatrix} 1_t & & & \\ & 1_t & & \\ & & \ddots & \\ & & & 1_t \end{bmatrix}_{bt \times b} ,$$

and

$$C = \mathrm{diag}(1_{n_{11}}, \ldots, 1_{n_{tb}}) \quad (bk \times bt) .$$

4.2 Intrablock Estimation of the Treatment Effects

Consider the model in (4.1.8) where the interaction effects are all zero. This additive model is written as

$$y = C(\mu 1_{bt} + X_1\tau + X_2\beta) + \varepsilon . \qquad (4.2.1)$$

Setting up the normal equations exactly as detailed in Section 3.2 results in the equality

$$\begin{bmatrix} G/bk \\ kQ \\ rB - N'T \end{bmatrix} = \begin{bmatrix} \hat\mu \\ A\hat\tau \\ (rkI_b - N'N)\hat\beta \end{bmatrix} , \qquad (4.2.2)$$

where all symbols are defined in Section 3.2. The solution of $kQ = A\hat\tau$ in (4.2.2) depends upon the form of the matrix

$$A = rkI_t - NN' , \qquad (4.2.3)$$

which in turn depends upon the form of the matrix $N^*$ from (4.1.1).

For a PBIBD with the group divisible association scheme, the matrix $N^*N^{*\prime}$ may be arranged in a particular pattern as follows. For an association scheme consisting of $m$ groups each containing $n$ treatments (so that $t = mn$), let the treatments in the first group be labeled 1 through $n$; in the second group, the treatments are labeled $n+1$ through $2n$; and so on, so that in the $m$th group, the treatments are labeled $(m-1)n+1$ through $mn$.
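To make the parameter relationships (4.1.2) through (4.1.7) concrete, the sketch below builds an ECBD from a small group divisible PBIBD and checks each identity numerically. The six blocks are a hypothetical blocking plan constructed for this illustration (the thesis does not list one here); they give t = 6, b = 6, k* = 3, r* = 3 with groups {1,2}, {3,4}, {5,6}, and with c0 = 1 and c1 = 2 they yield the block size and replication of the six-treatment example of this chapter.

```python
def incidence(blocks, t):
    """Incidence matrix N* (t rows, b columns) of a binary block design."""
    return [[1 if trt in blk else 0 for blk in blocks] for trt in range(1, t + 1)]


def extend(n_star, c0, c1):
    """N = c0*J + (c1 - c0)*N*, as in (4.1.1)."""
    return [[c0 + (c1 - c0) * x for x in row] for row in n_star]


def concurrence(N, i, j):
    """Element (i, j) of NN' (zero-based treatment indices)."""
    return sum(a * b for a, b in zip(N[i], N[j]))


# Hypothetical group divisible PBIBD with groups {1,2}, {3,4}, {5,6}.
blocks = [{1, 2, 3}, {1, 2, 4}, {3, 4, 5}, {3, 4, 6}, {5, 6, 1}, {5, 6, 2}]
t, b, c0, c1 = 6, 6, 1, 2
n1, n2 = 1, 4                      # numbers of first and second associates

n_star = incidence(blocks, t)
N = extend(n_star, c0, c1)

r_star, k_star = sum(n_star[0]), sum(row[0] for row in n_star)
r, k = sum(N[0]), sum(row[0] for row in N)
lam1_star, lam2_star = concurrence(n_star, 0, 1), concurrence(n_star, 0, 2)
lam1, lam2 = concurrence(N, 0, 1), concurrence(N, 0, 2)
print(r, k, lam1, lam2)
```

Treatments 1 and 2 are first associates and treatments 1 and 3 are second associates, so the two calls to `concurrence` pick out $\lambda_1$ and $\lambda_2$ directly from $NN'$.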
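Now, with the corresponding blocking plan and the labeled treatments listed in numerical order in the incidence matrix $\mathbf{N}^*$, the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ becomes

$$\mathbf{N}^*\mathbf{N}^{*\prime} = (r^* - \lambda_1^*)(\mathbf{I}_m \otimes \mathbf{I}_n) + (\lambda_1^* - \lambda_2^*)(\mathbf{I}_m \otimes \mathbf{J}_n) + \lambda_2^*(\mathbf{J}_m \otimes \mathbf{J}_n), \tag{4.2.4}$$

where $\otimes$ denotes the direct (Kronecker) product of two matrices and $\lambda_1^*$ and $\lambda_2^*$ are the number of distinct pairs of experimental units over all blocks which receive any fixed pair of first and second associates, respectively, in the same block in the generating PBIBD. Since the incidence matrix of the ECBD generated by a PBIBD equals $c_0\mathbf{J} + (c_1 - c_0)\mathbf{N}^*$, then

$$\mathbf{N}\mathbf{N}' = (c_1 - c_0)^2\mathbf{N}^*\mathbf{N}^{*\prime} + c_0(2r - c_0b)(\mathbf{J}_m \otimes \mathbf{J}_n). \tag{4.2.5}$$

By substituting (4.2.4) into (4.2.5) and simplifying, we obtain

$$\mathbf{N}\mathbf{N}' = [(c_0 + c_1)r - c_0c_1b - \lambda_1](\mathbf{I}_m \otimes \mathbf{I}_n) + (\lambda_1 - \lambda_2)(\mathbf{I}_m \otimes \mathbf{J}_n) + \lambda_2(\mathbf{J}_m \otimes \mathbf{J}_n). \tag{4.2.6}$$

To facilitate finding a solution of $k\mathbf{Q} = \mathbf{A}\hat{\boldsymbol{\tau}}$ for $\hat{\boldsymbol{\tau}}$ in (4.2.2), an expression for the matrix $\mathbf{A}$ may now be found by substituting (4.2.6) for $\mathbf{N}\mathbf{N}'$ into (4.2.3). By further simplification using the identity (4.1.7) involving $\lambda_1$ and $\lambda_2$, the result is

$$k\mathbf{Q} = [(n\lambda_1 + n_2\lambda_2)(\mathbf{I}_m \otimes \mathbf{I}_n) - (\lambda_1 - \lambda_2)(\mathbf{I}_m \otimes \mathbf{J}_n) - \lambda_2(\mathbf{J}_m \otimes \mathbf{J}_n)]\hat{\boldsymbol{\tau}}$$

of which the $i$th element is given by

$$kQ_i = (n\lambda_1 + n_2\lambda_2)\hat{\tau}_i - (\lambda_1 - \lambda_2)G(\hat{\tau}_i) - \lambda_2\hat{\tau}_{\cdot} \tag{4.2.7}$$

where $G(\hat{\tau}_i)$ denotes the sum of all the estimated treatment effects of the treatments in the group containing the $i$th treatment. In other words, $G(\hat{\tau}_i)$ is the sum of the effect of the $i$th treatment plus the effects of all first associates of the $i$th treatment. Also in (4.2.7), $\hat{\tau}_{\cdot}$ denotes the sum of all the estimated treatment effects. However, one of the restrictions used on the treatment effects to obtain (4.2.2) was to set $\hat{\tau}_{\cdot}$ equal to zero. Therefore, if $G(kQ_i)$ denotes the sum of the $kQ_i$ in (4.2.7) plus the $kQ_{i'}$, $i \neq i'$, corresponding to all the first associates of the $i$th treatment, we obtain

$$G(kQ_i) = (n\lambda_1 + n_2\lambda_2)G(\hat{\tau}_i) - n(\lambda_1 - \lambda_2)G(\hat{\tau}_i) = t\lambda_2G(\hat{\tau}_i). \tag{4.2.8}$$

By substituting $G(\hat{\tau}_i)$ from (4.2.8) back into (4.2.7), the intrablock estimate of the effect of the $i$th treatment is found by solving for $\hat{\tau}_i$ in the expression

$$(n\lambda_1 + n_2\lambda_2)\hat{\tau}_i = kQ_i + \frac{\lambda_1 - \lambda_2}{t\lambda_2}G(kQ_i). \tag{4.2.9}$$

The difference between the intrablock estimates of the effects of the treatments $i$ and $i'$, $i \neq i'$, can be written as

$$\hat{\tau}_i - \hat{\tau}_{i'} = \frac{1}{n\lambda_1 + n_2\lambda_2}\left\{k(Q_i - Q_{i'}) + \frac{\lambda_1 - \lambda_2}{t\lambda_2}[G(kQ_i) - G(kQ_{i'})]\right\}.$$

Under the assumption that the errors are normally distributed with mean zero and variance structure $\sigma_e^2\mathbf{I}_{bk}$, the difference $\hat{\tau}_i - \hat{\tau}_{i'}$ has the properties

$$E(\hat{\tau}_i - \hat{\tau}_{i'}) = \tau_i - \tau_{i'}$$

and

$$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \begin{cases} \dfrac{2k\sigma_e^2}{n\lambda_1 + n_2\lambda_2}, & i \text{ and } i' \text{ are 1st associates,} \\[2ex] \dfrac{2k\sigma_e^2}{n\lambda_1 + n_2\lambda_2}\left(1 + \dfrac{\lambda_1 - \lambda_2}{t\lambda_2}\right), & i \text{ and } i' \text{ are 2nd associates.} \end{cases}$$

The intrablock analysis of variance table for the partially balanced group divisible extended complete block design is presented in Table 3. In the sum of squares expressions in Table 3,

$$CM = \frac{1}{bk}\Big(\sum_i\sum_j\sum_\ell y_{ij\ell}\Big)^2 .$$

TABLE 3
Intrablock Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs

| Source | df | Sum of Squares | EMS* |
|---|---|---|---|
| Treatments (adjusted) | $t-1$ | $SST_A = \dfrac{1}{k(n\lambda_1 + n_2\lambda_2)}\Big[\sum_i (kQ_i)^2 + \dfrac{\lambda_1 - \lambda_2}{t\lambda_2}\sum_i (kQ_i)G(kQ_i)\Big]$ | $E(MST_A)$ |
| Blocks (unadjusted) | $b-1$ | $SSB_U = \dfrac{1}{k}\sum_j B_j^2 - CM$ | |
| Remainder | $(t-1)(b-1)$ | $SSR = \sum_i\sum_j \dfrac{1}{n_{ij}}R_{ij}^2 - SST_A - SSB_U - CM$ | $E(MSR)$ |
| Error | $b(k-t)$ | $SSE = \sum_i\sum_j\sum_\ell y_{ij\ell}^2 - \sum_i\sum_j \dfrac{1}{n_{ij}}R_{ij}^2$ | $E(MSE)$ |
| Total | $bk-1$ | $TSS = \sum_i\sum_j\sum_\ell y_{ij\ell}^2 - CM$ | |

\* $E(MST_A) = \sigma_e^2 + \dfrac{1}{k(t-1)}\boldsymbol{\tau}'\mathbf{A}\boldsymbol{\tau}$, $E(MSR) = \sigma_e^2 + \dfrac{1}{(t-1)(b-1)}\boldsymbol{\gamma}'\mathbf{D}\boldsymbol{\gamma}$, and $E(MSE) = \sigma_e^2$, where $\mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}'$ and $\mathbf{D} = \mathbf{C}'\mathbf{C}$.

As can be seen from Table 3, the ratio of the mean square for remainder to the mean square for error (previously called the mean square for duplication variation) provides a statistic for a test on the validity of the additive model.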
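The closed form (4.2.9) can be checked numerically against the reduced normal equations $k\mathbf{Q} = \mathbf{A}\hat{\boldsymbol{\tau}}$. The sketch below is illustrative only: the generating design (four treatments in two groups of two, with blocks formed from one treatment of each group), the choice $c_0 = 1$, $c_1 = 2$, and the adjusted totals `kQ` are assumptions made for the example, not values from this work.

```python
from fractions import Fraction

def gd_intrablock_estimates(kQ, m, n, lam1, lam2):
    """Solve (4.2.9) for a group divisible ECBD.

    kQ lists k*Q_i in group order (group 1 first); there are m groups of n.
    """
    t = m * n
    n2 = n * (m - 1)                       # number of second associates
    theta = Fraction(lam1 - lam2, t * lam2)
    denom = n * lam1 + n2 * lam2
    tau = []
    for i, q in enumerate(kQ):
        g = i // n                         # index of the group holding treatment i
        G = sum(kQ[g * n:(g + 1) * n])     # G(kQ_i): total over that group
        tau.append((q + theta * G) / denom)
    return tau

# Hypothetical generating PBIBD: rows are treatments, columns are the
# blocks {1,3}, {1,4}, {2,3}, {2,4}; groups are {1,2} and {3,4}.
N_star = [[1, 1, 0, 0],
          [0, 0, 1, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1]]
c0, c1 = 1, 2
N = [[c0 + (c1 - c0) * x for x in row] for row in N_star]   # N = c0*J + (c1-c0)*N*
t, b = len(N), len(N[0])
r = sum(N[0])                              # replications per treatment
k = sum(row[0] for row in N)               # block size
NNt = [[sum(N[i][j] * N[h][j] for j in range(b)) for h in range(t)]
       for i in range(t)]
lam1, lam2 = NNt[0][1], NNt[0][2]          # first- and second-associate counts
A = [[(r * k if i == h else 0) - NNt[i][h] for h in range(t)] for i in range(t)]

kQ = [Fraction(5), Fraction(-3), Fraction(1), Fraction(-3)]  # made up, sums to zero
tau_hat = gd_intrablock_estimates(kQ, m=2, n=2, lam1=lam1, lam2=lam2)
A_tau = [sum(A[i][h] * tau_hat[h] for h in range(t)) for i in range(t)]
```

Exact rational arithmetic is used so that the comparison of `A_tau` with `kQ` is an equality rather than a tolerance check; the same run also confirms the identity (4.1.7) for this small design.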
If the hypothesis is not rejected, a test of the hypothesis of equal treatment effects could be performed using as the statistic the ratio of the mean square for treatments (adjusted) to the mean square resulting from pooling the mean square for remainder with the mean square for error. If the hypothesis is rejected, then we might wish to consider a non-additive model. With the non-additive model, we could concern ourselves with the estimation of the block x treatment interaction effects or concentrate on testing the hypothesis of equal treatment effects in the presence of interaction effects.

The next section gives validity to both of the above mentioned tests of hypotheses. As will be shown, the sums of squares in Table 3 are each distributed as weighted chi square random variables and each sum of squares is distributed independently of the others.

4.3 Distributions of the Sums of Squares and Relevant Tests of Hypotheses

In order to validate the tests mentioned at the end of Section 4.2, we must first obtain the distributions of the sums of squares in Table 3. The approach that will be used to derive the distributions of these sums of squares is to express them as quadratic forms and use our knowledge of the distributions of quadratic forms.

The sums of squares in Table 3 can be written in the quadratic forms

$$SST_A = \mathbf{y}'\mathbf{A}_1\mathbf{y} = \mathbf{y}'\Big\{\frac{k}{n\lambda_1 + n_2\lambda_2}\,\mathbf{C}\big(\mathbf{I}_{bt} - \tfrac{1}{k}\mathbf{X}_2\mathbf{X}_2'\mathbf{D}\big)\mathbf{X}_1\Big[\mathbf{I}_t + \frac{\lambda_1 - \lambda_2}{t\lambda_2}\mathbf{F}\Big]\mathbf{X}_1'\big(\mathbf{I}_{bt} - \tfrac{1}{k}\mathbf{D}\mathbf{X}_2\mathbf{X}_2'\big)\mathbf{C}'\Big\}\mathbf{y}, \tag{4.3.1}$$

$$SSB_U = \mathbf{y}'\mathbf{A}_2\mathbf{y} = \mathbf{y}'\big(\tfrac{1}{k}\mathbf{C}\mathbf{X}_2\mathbf{X}_2'\mathbf{C}' - \tfrac{1}{bk}\mathbf{J}\big)\mathbf{y}, \tag{4.3.2}$$

$$SSR = \mathbf{y}'\mathbf{A}_3\mathbf{y} = \mathbf{y}'\big(\mathbf{C}\mathbf{D}^{-1}\mathbf{C}' - \tfrac{1}{k}\mathbf{C}\mathbf{X}_2\mathbf{X}_2'\mathbf{C}' - \mathbf{A}_1\big)\mathbf{y}, \tag{4.3.3}$$

$$SSE = \mathbf{y}'\mathbf{A}_4\mathbf{y} = \mathbf{y}'\big(\mathbf{I}_{bk} - \mathbf{C}\mathbf{D}^{-1}\mathbf{C}'\big)\mathbf{y}, \tag{4.3.4}$$

and

$$TSS = \mathbf{y}'\mathbf{A}_5\mathbf{y} = \mathbf{y}'\big(\mathbf{I}_{bk} - \tfrac{1}{bk}\mathbf{J}\big)\mathbf{y}, \tag{4.3.5}$$

where in (4.3.1), $\mathbf{F} = \mathbf{I}_m \otimes \mathbf{J}_n$.

In the quadratic forms (4.3.1) through (4.3.5), each of the matrices $\mathbf{A}_1$, $\mathbf{A}_2$, $\mathbf{A}_3$, $\mathbf{A}_4$, and $\mathbf{A}_5$ is real, symmetric, and idempotent, and

$$\sum_{p=1}^{4}\mathbf{A}_p = \mathbf{A}_5 .$$

Also, the ranks of the five matrices, where $r(\mathbf{A}_p)$ denotes the rank of the matrix $\mathbf{A}_p$, are

$$r(\mathbf{A}_1) = t-1, \quad r(\mathbf{A}_2) = b-1, \quad r(\mathbf{A}_3) = (b-1)(t-1), \quad r(\mathbf{A}_4) = b(k-t), \quad r(\mathbf{A}_5) = bk-1 = \sum_{p=1}^{4} r(\mathbf{A}_p).$$

Hence, by applying Theorem 5 in Searle (1971) on the distribution of quadratic forms, it is found that when $\mathbf{y} \sim N(\boldsymbol{\mu}, \sigma_e^2\mathbf{I}_{bk})$, then

$$\mathbf{y}'\mathbf{A}_p\mathbf{y} \sim \sigma_e^2\,\chi'^2_{r(\mathbf{A}_p)}\big(\boldsymbol{\mu}'\mathbf{A}_p\boldsymbol{\mu}/2\sigma_e^2\big) \tag{4.3.6}$$

for $p = 1, 2, 3, 4$ and the $\mathbf{y}'\mathbf{A}_p\mathbf{y}$ are mutually independent. The distributional forms in (4.3.6) will be used in constructing the aforementioned tests of hypotheses.

The test of the assumption concerning the validity of the additive model corresponds to the test of the hypothesis of zero interaction effects when the non-additive model is considered. To test the validity of the additive model, the test statistic used is the ratio of the mean square for remainder to the mean square for error. If the additive model holds, then the test statistic possesses an F distribution with the appropriate degrees of freedom, and the hypothesis concerning the validity of the model is rejected for large values of this ratio. If the additive model assumption is valid, a test of the hypothesis of equal treatment effects would be performed using the ratio of the mean square for treatments (adjusted) to either the mean square for error or the pooled mean square for remainder plus error. Under the hypothesis of equal treatment effects, this ratio possesses an F distribution. On the other hand, if there is evidence to reject the assumption of the additive model in favor of the non-additive model, then an approximate test on the treatments could be performed if desired.

The intrablock analysis given in this section and in the previous section is, of course, an analysis of a model in which the treatment, block, and interaction effects are considered as fixed effects. In comparative type experiments, usually we seek to draw inferences about the effects of the specific treatments used in the experiment and hence the assumption of fixed treatment effects presents little argument. Now the block effects, on the other hand, may be fixed or random.
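The two F ratios described in this section can be formed directly from the sums of squares and degrees of freedom of Table 3. The following is a minimal sketch; the function name and the numerical sums of squares are made up for illustration, and pooling is done in the usual way by adding sums of squares and degrees of freedom.

```python
def additivity_and_treatment_tests(ss, t, b, k, pool=True):
    """Return (F for additivity, F for treatments) from Table 3 quantities.

    ss is a dict with keys 'SSTA', 'SSR' and 'SSE'.
    """
    df_trt, df_rem, df_err = t - 1, (t - 1) * (b - 1), b * (k - t)
    mst = ss['SSTA'] / df_trt
    msr = ss['SSR'] / df_rem
    mse = ss['SSE'] / df_err
    f_additivity = msr / mse               # remainder against error
    if pool:
        # pooled mean square for remainder plus error
        denominator = (ss['SSR'] + ss['SSE']) / (df_rem + df_err)
    else:
        denominator = mse
    return f_additivity, mst / denominator

# Made-up sums of squares for t = 5, b = 10, k = 6 (illustration only).
f_add, f_trt = additivity_and_treatment_tests(
    {'SSTA': 40.0, 'SSR': 36.0, 'SSE': 10.0}, t=5, b=10, k=6)
```

In practice each ratio would be compared with the tabular F value for its own pair of degrees of freedom, and pooling would be used only when the additivity test does not reject.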
In this latter case the emphasis may be on drawing inferences about the magnitude of the variance of the population from which the sample of block effects was assumed to be drawn. A model in which the treatment effects are fixed while the block effects are random is called a mixed model. Since the partially balanced group divisible extended complete block designs would frequently be used with random block effects, we shall now present the analysis of a mixed model for these designs.

4.4 Mixed Model Analysis

In the mixed model analysis, all symbols in the model are defined exactly as in (4.1.8) with the exception of $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}$. In the mixed model, the parameters $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}$ are assumed to be independently distributed normal random variables with

$$\boldsymbol{\beta} \sim N(\mathbf{0}, \sigma_b^2\mathbf{I}_b), \qquad \boldsymbol{\gamma} \sim N(\mathbf{0}, \sigma_{bt}^2\mathbf{I}_{bt}),$$

and each distributed independently of the random errors $\boldsymbol{\epsilon}$. Under these assumptions, the expectation of the vector of observations is

$$E(\mathbf{y}) = \mathbf{C}(\mu\mathbf{1}_{bt} + \mathbf{X}_1\boldsymbol{\tau}) = \boldsymbol{\mu} \tag{4.4.1}$$

and the variance of the observations is

$$\mathrm{Var}(\mathbf{y}) = E\{[\mathbf{C}(\mathbf{X}_2\boldsymbol{\beta} + \boldsymbol{\gamma}) + \boldsymbol{\epsilon}][\mathbf{C}(\mathbf{X}_2\boldsymbol{\beta} + \boldsymbol{\gamma}) + \boldsymbol{\epsilon}]'\} = \mathbf{C}\mathbf{X}_2\mathbf{X}_2'\mathbf{C}'\sigma_b^2 + \mathbf{C}\mathbf{C}'\sigma_{bt}^2 + \sigma_e^2\mathbf{I}_{bk} = \mathbf{V}. \tag{4.4.2}$$

The analysis of variance table for the mixed model is presented in Table 4. The differences between the entries in Table 4 and Table 3 for the fixed effects model are the replacement of the source of variation called blocks (unadjusted) by the source of variation called blocks (adjusted) and the expected mean square expressions. Of course other differences exist in the use of the two tables, namely in the interpretation of the tests of hypotheses.
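The structure of $\mathbf{V}$ in (4.4.2) is easiest to see element by element: two observations share $\sigma_b^2$ if they lie in the same block, share $\sigma_{bt}^2$ in addition if they are duplicates of the same treatment in that block, and each observation carries its own $\sigma_e^2$. The sketch below builds $\mathbf{V}$ that way for a small layout; the layout and the variance values are hypothetical.

```python
def mixed_model_covariance(n_cells, b, t, var_b, var_bt, var_e):
    """Build V of (4.4.2) for cell sizes n_cells[j][i] (block j, treatment i).

    Observations are ordered block by block, treatment within block,
    duplicate within treatment.
    """
    labels = [(j, i) for j in range(b) for i in range(t)
              for _ in range(n_cells[j][i])]
    V = []
    for (j1, i1) in labels:
        row = []
        for (j2, i2) in labels:
            v = 0.0
            if j1 == j2:
                v += var_b                 # C X2 X2' C' term: same block
                if i1 == i2:
                    v += var_bt            # C C' term: same block-treatment cell
            row.append(v)
        V.append(row)
    for p in range(len(labels)):
        V[p][p] += var_e                   # sigma_e^2 I term
    return labels, V

# Hypothetical layout: b = 2 blocks, t = 2 treatments, one duplicate per block.
labels, V = mixed_model_covariance([[2, 1], [1, 2]], b=2, t=2,
                                   var_b=4.0, var_bt=2.0, var_e=1.0)
```

With these values the diagonal entries are 7, duplicates of the same treatment in a block have covariance 6, different treatments in the same block have covariance 4, and observations in different blocks are uncorrelated.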
The expected mean squares in Table 4 were obtained using the identity

$$E(\mathbf{y}'\mathbf{A}_p\mathbf{y}) = \mathrm{tr}(\mathbf{A}_p\mathbf{V}) + \boldsymbol{\mu}'\mathbf{A}_p\boldsymbol{\mu}$$

involving the expectation of a quadratic form with $\mathbf{y} \sim N(\boldsymbol{\mu}, \mathbf{V})$ and where $\mathrm{tr}(\mathbf{A}_p\mathbf{V})$ denotes the trace of the matrix $\mathbf{A}_p\mathbf{V}$.

TABLE 4
Mixed Model Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs

| Source | df | SS* | Expected Mean Squares** |
|---|---|---|---|
| Treatments (adjusted) | $t-1$ | $SST_A$ | $\sigma_e^2 + \dfrac{\phi^*}{t-1}\sigma_{bt}^2 + \dfrac{1}{k(t-1)}\boldsymbol{\tau}'\mathbf{A}\boldsymbol{\tau}$ |
| Blocks (adjusted) | $b-1$ | $SSB_A$ | e + (si- 7 + hT<>* 7F (r-k) s,}at2 b-1 V1 r 2' b b-11^ rk |
| Interaction | $(t-1)(b-1)$ | $SSI$ | $\sigma_e^2 + \dfrac{1}{(b-1)(t-1)}\big\{s_1 - \tfrac{1}{k}s_2 - \phi^*\big\}\sigma_{bt}^2$ |
| Error | $b(k-t)$ | $SSE$ | $\sigma_e^2$ |
| Total | $bk-1$ | $TSS$ | |

\* $SSB_A = SST_A + SSB_U - SST_U$ and the other SS are defined in Table 3 with $SSI = SSR$.

\*\* $\mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}'$, $s_p = \sum_i\sum_j n_{ij}^p$ for $p = 1$, 2, and 3, $\lambda^{**} = (c_0 + c_1)\lambda_1 - c_0c_1r$, and ** = bk(n>^2A2) '1 + ) (kslS2-2slS3+S|) + (X1s22X**s1) } .

To test the hypothesis $H_0$: $\sigma_{bt}^2 = 0$, the statistic takes the form of the ratio of MSI to MSE, where MSI equals $SSI/(b-1)(t-1)$ and MSE equals $SSE/b(k-t)$ with SSI and SSE defined in Table 4. The hypothesis is rejected for values of this ratio larger than the appropriate tabular F value. If the hypothesis is rejected, an estimate of $\sigma_{bt}^2$ could be found using the analysis of variance approach. An estimate of the random blocks component of variance may also be obtained using the analysis of variance approach.

When the hypothesis $H_0$: $\sigma_{bt}^2 = 0$ is rejected, we may still wish to test the hypothesis that the treatment effects are small relative to the magnitude of the interaction variation. For this test the statistic is given by

$$F_T = \frac{\dfrac{s_1 - (1/k)s_2 - \phi^*}{(b-1)(t-1)}\,MST_A - \dfrac{\phi^*}{t-1}\,MSI}{\Big[\dfrac{s_1 - (1/k)s_2 - \phi^*}{(b-1)(t-1)} - \dfrac{\phi^*}{t-1}\Big]MSE}$$

where expressions for calculating the values of $s_1$, $s_2$, and $\phi^*$ are presented at the bottom of Table 4.
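A computational sketch of the approximate treatment test follows. It assumes that $F_T$ is the ratio of $a\,MST_A - c\,MSI$ to $(a - c)\,MSE$, where $a$ and $c$ are the coefficients of $\sigma_{bt}^2$ in $E(MSI)$ and $E(MST_A)$ respectively, and it uses the standard Satterthwaite (1946) degrees-of-freedom formula for a linear combination of independent mean squares. The function name and all numerical inputs, including the value of $\phi^*$, are hypothetical.

```python
def approximate_treatment_test(mst_a, msi, mse, s1, s2, phi, t, b, k):
    """Return (F_T, f): the approximate statistic and its numerator df.

    phi stands for phi* of Table 4 and is taken here as a given number.
    """
    a = (s1 - s2 / k - phi) / ((b - 1) * (t - 1))   # sigma_bt^2 coefficient in E(MSI)
    c = phi / (t - 1)                               # sigma_bt^2 coefficient in E(MSTA)
    numerator = a * mst_a - c * msi
    denominator = (a - c) * mse
    # Satterthwaite: (sum a_i MS_i)^2 / sum[(a_i MS_i)^2 / df_i]
    f = numerator ** 2 / ((a * mst_a) ** 2 / (t - 1)
                          + (c * msi) ** 2 / ((b - 1) * (t - 1)))
    return numerator / denominator, f

# Hypothetical inputs: t = 3, b = 3, k = 4 with one duplicate per block,
# so s1 = 12 and s2 = 18; phi and the mean squares are made up.
F_T, f = approximate_treatment_test(mst_a=10.0, msi=3.0, mse=2.0,
                                    s1=12, s2=18, phi=1.5, t=3, b=3, k=4)
```

Because the numerator contains a negative coefficient, the Satterthwaite approximation can be poor when the two terms are of similar size, so the resulting degrees of freedom should be read with caution.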
The statistic $F_T$ can be shown to possess an approximate F distribution with $f$ and $b(k-t)$ degrees of freedom in the numerator and denominator, respectively, where the number $f$ is computed by the procedure given by Satterthwaite (1946) for approximating the distribution of the estimate of a variance component.

If the hypothesis $H_0$: $\sigma_{bt}^2 = 0$ is not rejected, a simpler test on the treatment effects may be performed. For this case, the statistic is the ratio of the mean square for treatments (adjusted) to the pooled mean square consisting of the mean squares for interaction and error.

CHAPTER 5
A PARTIALLY BALANCED ECBD WITH THE L2 ASSOCIATION SCHEME

The partially balanced group divisible extended complete block designs presented in Chapter 4 comprise a large class of partially balanced ECB designs. However, because of its general applicability, still another class of partially balanced ECB designs to be considered is the class of partially balanced extended complete block designs with the L2 (Latin Square) association scheme. The L2 association scheme for $t = n^2$ treatments is characterized from the arrangement of the treatments in a square array. The classification of the treatments to one another in the L2 association scheme is such that the treatments in the same row or same column are first associates and the treatments not in the same row or same column are second associates. For example, with sixteen treatments (denoted by the numbers 1 through 16) an L2 association scheme would be determined from the square array

$$\begin{array}{rrrr} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{array}$$

The first associates of treatment 1 are the treatments 2, 3, 4, 5, 9, and 13, while the second associates of treatment 1 are the treatments 6, 7, 8, 10, 11, 12, 14, 15, and 16.

In this chapter, we shall present only the intrablock analysis of the partially balanced extended complete block designs with the L2 association scheme. The mixed model analysis follows the procedure outlined in Section 4.4.
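The classification rule of the L2 scheme is simple to state in code. The sketch below (the function name is ours) lists the first and second associates of a treatment when the treatments are numbered row by row in an $n \times n$ array, and reproduces the sixteen-treatment example given above.

```python
def l2_associates(treatment, n):
    """First and second associates of a treatment (numbered from 1) in an n x n array."""
    row, col = divmod(treatment - 1, n)
    first, second = [], []
    for other in range(1, n * n + 1):
        if other == treatment:
            continue
        r, c = divmod(other - 1, n)
        # same row or same column -> first associate, otherwise second
        (first if r == row or c == col else second).append(other)
    return first, second

first, second = l2_associates(1, 4)
```

For any treatment the counts are $n_1 = 2(n-1)$ first associates and $n_2 = (n-1)^2$ second associates, which gives 6 and 9 when $n = 4$.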
In fact, the only difference in the final analysis of the mixed model with the two association schemes is that with the partially balanced ECB designs with the L2 association scheme, the expected mean square expressions are slightly more complicated than the corresponding expressions with the partially balanced group divisible ECB designs.

In the intrablock analysis, we shall make use of row sum and column sum operators which we denote by RS( ) and CS( ) respectively. For the square array containing the treatment effects $\tau_1$ through $\tau_9$ the row sums RS($\tau_1$), RS($\tau_5$), and RS($\tau_9$) are given by

$$RS(\tau_1) = \tau_1 + \tau_2 + \tau_3 = RS(\tau_2) = RS(\tau_3),$$

$$RS(\tau_5) = \tau_4 + \tau_5 + \tau_6,$$

and

$$RS(\tau_9) = \tau_7 + \tau_8 + \tau_9 .$$

The column sums for the same treatment effects are given by

$$CS(\tau_1) = \tau_1 + \tau_4 + \tau_7 = CS(\tau_4) = CS(\tau_7),$$

$$CS(\tau_5) = \tau_2 + \tau_5 + \tau_8,$$

and

$$CS(\tau_9) = \tau_3 + \tau_6 + \tau_9 .$$

Note that for the $i$th treatment effect,

$$RS(\tau_i) + CS(\tau_i) = S_1(\tau_i) + 2\tau_i,$$

where $S_1(\tau_i)$ denotes the sum of the effects of all treatments that are first associates of the $i$th treatment.

For a PBIBD with the L2 association scheme, the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ may be arranged in a particular pattern which will be described in Section 5.1. The particular pattern of the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ facilitates solving the normal equations to obtain the intrablock estimates of the treatment effects. As expected, the pattern of $\mathbf{N}^*\mathbf{N}^{*\prime}$ is reflected in the matrix $\mathbf{N}\mathbf{N}'$, where $\mathbf{N}$ is the incidence matrix of the ECB design.

5.1 Intrablock Analysis

The definitions and notation presented in Section 4.1 will be used again in this section with the exception that $\mathbf{N}^*$ now denotes the incidence matrix of the generating PBIBD with the L2 association scheme.

Let us consider an additive model consisting of the overall mean parameter, a treatment parameter, a block parameter, and a random error term.
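Since the interaction effects are all zero, the model in (4.1.8) may be written as

$$\mathbf{y} = \mathbf{C}(\mu\mathbf{1}_{bt} + \mathbf{X}_1\boldsymbol{\tau} + \mathbf{X}_2\boldsymbol{\beta}) + \boldsymbol{\epsilon}. \tag{5.1.1}$$

Using the same form of the normal equations as detailed in Sections 3.2 and 4.1, we have

$$\begin{pmatrix} G/bk \\ k\mathbf{Q} \\ r\mathbf{B} - \mathbf{N}'\mathbf{T} \end{pmatrix} = \begin{pmatrix} \hat{\mu} \\ \mathbf{A}\hat{\boldsymbol{\tau}} \\ (rk\mathbf{I}_b - \mathbf{N}'\mathbf{N})\hat{\boldsymbol{\beta}} \end{pmatrix} \tag{5.1.2}$$

where all symbols are defined in Section 3.2.

The solution of $k\mathbf{Q} = \mathbf{A}\hat{\boldsymbol{\tau}}$ in (5.1.2) depends upon the form of the matrix $\mathbf{N}\mathbf{N}'$, since $\mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}'$. For a PBIBD with the L2 association scheme, the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ may be arranged in a particular pattern as follows. In the L2 association scheme, let us label or number the treatments from 1 to $n^2$. Then, if the treatments are listed in numerical order in the incidence matrix according to the particular blocking plan used, the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ is of the form

$$\mathbf{N}^*\mathbf{N}^{*\prime} = (r^* - 2\lambda_1^* + \lambda_2^*)(\mathbf{I}_n \otimes \mathbf{I}_n) + (\lambda_1^* - \lambda_2^*)(\mathbf{I}_n \otimes \mathbf{J}_n) + (\lambda_1^* - \lambda_2^*)(\mathbf{J}_n \otimes \mathbf{I}_n) + \lambda_2^*(\mathbf{J}_n \otimes \mathbf{J}_n) \tag{5.1.3}$$

where $\otimes$ denotes the direct product of two matrices and where over all blocks in the generating PBIBD, $\lambda_1^*$ and $\lambda_2^*$ are the number of distinct pairs of experimental units which receive any fixed pair of first and second associates, respectively, in the same block.

Since the matrix $\mathbf{N}$ can be expressed in the form

$$\mathbf{N} = c_0\mathbf{J} + (c_1 - c_0)\mathbf{N}^*,$$

then

$$\mathbf{N}\mathbf{N}' = (c_1 - c_0)^2\mathbf{N}^*\mathbf{N}^{*\prime} + c_0(2r - c_0b)(\mathbf{J}_n \otimes \mathbf{J}_n). \tag{5.1.4}$$

Now if the form (5.1.3) of the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ is substituted into $\mathbf{N}\mathbf{N}'$ in (5.1.4), the resulting expression for $\mathbf{N}\mathbf{N}'$ after simplifying is

$$\mathbf{N}\mathbf{N}' = [(c_1 + c_0)r - c_0c_1b + \lambda_2 - 2\lambda_1](\mathbf{I}_n \otimes \mathbf{I}_n) + (\lambda_1 - \lambda_2)[(\mathbf{J}_n \otimes \mathbf{I}_n) + (\mathbf{I}_n \otimes \mathbf{J}_n)] + \lambda_2(\mathbf{J}_n \otimes \mathbf{J}_n). \tag{5.1.5}$$

This expression for $\mathbf{N}\mathbf{N}'$ can now be substituted into

$$\mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}',$$

so that the intrablock estimates of the treatment effects are obtained by solving the equation

$$k\mathbf{Q} = \{n[2\lambda_1 + (n-2)\lambda_2](\mathbf{I}_n \otimes \mathbf{I}_n) - (\lambda_1 - \lambda_2)[(\mathbf{I}_n \otimes \mathbf{J}_n) + (\mathbf{J}_n \otimes \mathbf{I}_n)] - \lambda_2(\mathbf{J}_n \otimes \mathbf{J}_n)\}\hat{\boldsymbol{\tau}}$$

of which the $i$th element is

$$kQ_i = n[2\lambda_1 + (n-2)\lambda_2]\hat{\tau}_i - (\lambda_1 - \lambda_2)[RS(\hat{\tau}_i) + CS(\hat{\tau}_i)] \tag{5.1.6}$$

since $-\lambda_2\hat{\tau}_{\cdot} = 0$. In (5.1.6), RS($\hat{\tau}_i$) and CS($\hat{\tau}_i$) are the row sum and column sum, respectively, of the estimated $i$th treatment effect.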
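The element-wise form (5.1.6) can be checked against the matrix $\mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}'$ written out entry by entry. In the sketch below, $\mathbf{A}$ is filled in from the association scheme alone (diagonal $n_1\lambda_1 + n_2\lambda_2$, off-diagonal $-\lambda_1$ or $-\lambda_2$), and the row and column sum operators are implemented directly. The values of $n$, $\lambda_1$, $\lambda_2$ and the vector of effects are hypothetical; only the requirement that the effects sum to zero is taken from the text.

```python
from fractions import Fraction

def row_col_sums(values, n):
    """For each position in the n x n array, the sum over its row and over its column."""
    rs = [sum(values[(i // n) * n:(i // n) * n + n]) for i in range(n * n)]
    cs = [sum(values[i % n::n]) for i in range(n * n)]
    return rs, cs

def l2_rhs_5_1_6(tau, n, lam1, lam2):
    """Right-hand side of (5.1.6) for effects tau that sum to zero."""
    rs, cs = row_col_sums(tau, n)
    lead = n * (2 * lam1 + (n - 2) * lam2)
    return [lead * tau[i] - (lam1 - lam2) * (rs[i] + cs[i]) for i in range(n * n)]

def l2_A_matrix(n, lam1, lam2):
    """A = rk I - NN' for the L2 scheme, using rk - diag(NN') = n1*lam1 + n2*lam2."""
    t = n * n
    A = [[0] * t for _ in range(t)]
    for i in range(t):
        for h in range(t):
            if i == h:
                A[i][h] = 2 * (n - 1) * lam1 + (n - 1) ** 2 * lam2
            elif i // n == h // n or i % n == h % n:
                A[i][h] = -lam1            # first associates
            else:
                A[i][h] = -lam2            # second associates
    return A

n, lam1, lam2 = 3, 5, 4                    # hypothetical design constants
tau = [Fraction(x) for x in (4, -1, 0, 2, -3, 1, -2, 0, -1)]   # sums to zero
rs, cs = row_col_sums(tau, n)
A = l2_A_matrix(n, lam1, lam2)
A_tau = [sum(A[i][h] * tau[h] for h in range(n * n)) for i in range(n * n)]
```

The same helpers confirm the identity $RS(\tau_i) + CS(\tau_i) = S_1(\tau_i) + 2\tau_i$, since the row and column through a treatment together contain its first associates once and the treatment itself twice.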
To obtain the expression for $\hat{\tau}_i$ from (5.1.6), we need the row and column sums of $kQ_i$ in (5.1.6). These sums are respectively

$$RS(kQ_i) = [n\lambda_1 + n(n-1)\lambda_2]\,RS(\hat{\tau}_i) \tag{5.1.7}$$

and

$$CS(kQ_i) = [n\lambda_1 + n(n-1)\lambda_2]\,CS(\hat{\tau}_i). \tag{5.1.8}$$

Replacing RS($\hat{\tau}_i$) and CS($\hat{\tau}_i$) in (5.1.6) by their respective equivalent expressions in (5.1.7) and (5.1.8), then $kQ_i$ in (5.1.6) may be rewritten as

$$kQ_i = n[2\lambda_1 + (n-2)\lambda_2]\hat{\tau}_i - \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}[RS(kQ_i) + CS(kQ_i)],$$

and hence the intrablock estimate of the effect of the $i$th treatment is given by the equation

$$n[2\lambda_1 + (n-2)\lambda_2]\hat{\tau}_i = kQ_i + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}[S_1(kQ_i) + 2kQ_i] \tag{5.1.9}$$

where $S_1(kQ_i)$ is the sum of the $kQ_{i'}$, $i \neq i'$, in (5.1.6) corresponding to the first associates of the $i$th treatment.

The difference between the estimated effects of treatments $i$ and $i'$ can be written as

$$\hat{\tau}_i - \hat{\tau}_{i'} = \frac{1}{n[2\lambda_1 + (n-2)\lambda_2]}\left\{k(Q_i - Q_{i'}) + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}[S_1(kQ_i) - S_1(kQ_{i'}) + 2k(Q_i - Q_{i'})]\right\}.$$

Under the assumption that the random errors in (5.1.1) are normally distributed with mean zero and variance structure $\sigma_e^2\mathbf{I}_{bk}$, the difference $\hat{\tau}_i - \hat{\tau}_{i'}$ has the properties

$$E(\hat{\tau}_i - \hat{\tau}_{i'}) = \tau_i - \tau_{i'}$$

and

$$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \begin{cases} K\left[1 + \dfrac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}\right], & i \text{ and } i' \text{ are 1st associates,} \\[2ex] K\left[1 + \dfrac{2(\lambda_1 - \lambda_2)}{n\lambda_1 + n(n-1)\lambda_2}\right], & i \text{ and } i' \text{ are 2nd associates,} \end{cases}$$

where $K = 2k\sigma_e^2/n[2\lambda_1 + (n-2)\lambda_2]$ and $n = \sqrt{t}$.

The intrablock analysis of variance table for the partially balanced extended complete block designs with the L2 association scheme is presented in Table 5. All symbols in Table 5 are defined as they were defined in Table 3 with the exception of $S_1(kQ_i)$ which has been defined in this section.

5.2 Distributions of the Sums of Squares and Relevant Tests of Hypotheses

As in Section 4.3, before considering any relevant tests of hypotheses, we shall obtain the distributions of the sums of squares in Table 5.
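Resorting once again to the theory of the distributional properties of quadratic forms, the sums of squares formulae in Table 5 are expressed as quadratic forms in the following matrix notation

TABLE 5
Intrablock Analysis of Variance for Partially Balanced ECBD of the L2 Association Scheme

| Source | df | Sum of Squares | EMS* |
|---|---|---|---|
| Treatments (adjusted) | $t-1$ | $SST_A = \dfrac{k}{2n\lambda_1 + n(n-2)\lambda_2}\Big\{\sum_i Q_i^2 + \dfrac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}\sum_i [Q_iS_1(Q_i) + 2Q_i^2]\Big\}$ | $E(MST_A)$ |
| Blocks (unadjusted) | $b-1$ | $SSB_U = \dfrac{1}{k}\sum_j B_j^2 - CM$ | |
| Remainder | $(t-1)(b-1)$ | $SSR = \sum_i\sum_j \dfrac{1}{n_{ij}}R_{ij}^2 - SST_A - SSB_U - CM$ | $E(MSR)$ |
| Error | $b(k-t)$ | $SSE = \sum_i\sum_j\sum_\ell y_{ij\ell}^2 - \sum_i\sum_j \dfrac{1}{n_{ij}}R_{ij}^2$ | $E(MSE)$ |
| Total | $bk-1$ | $TSS = \sum_i\sum_j\sum_\ell y_{ij\ell}^2 - CM$ | |

\* $E(MST_A) = \sigma_e^2 + \dfrac{1}{k(t-1)}\boldsymbol{\tau}'\mathbf{A}\boldsymbol{\tau}$, $E(MSR) = \sigma_e^2 + \dfrac{1}{(t-1)(b-1)}\boldsymbol{\gamma}'\mathbf{D}\boldsymbol{\gamma}$, and $E(MSE) = \sigma_e^2$, where $\mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}'$ and $\mathbf{D} = \mathbf{C}'\mathbf{C}$.

$$SST_A = \mathbf{y}'\mathbf{A}_1\mathbf{y} = \mathbf{y}'\Big\{\frac{k}{2n\lambda_1 + n(n-2)\lambda_2}\,\mathbf{C}\big(\mathbf{I}_{bt} - \tfrac{1}{k}\mathbf{X}_2\mathbf{X}_2'\mathbf{D}\big)\mathbf{X}_1\Big[\mathbf{I}_t + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}\mathbf{H}\Big]\mathbf{X}_1'\big(\mathbf{I}_{bt} - \tfrac{1}{k}\mathbf{D}\mathbf{X}_2\mathbf{X}_2'\big)\mathbf{C}'\Big\}\mathbf{y}, \tag{5.2.1}$$

$$SSB_U = \mathbf{y}'\mathbf{A}_2\mathbf{y} = \mathbf{y}'\big(\tfrac{1}{k}\mathbf{C}\mathbf{X}_2\mathbf{X}_2'\mathbf{C}' - \tfrac{1}{bk}\mathbf{J}\big)\mathbf{y}, \tag{5.2.2}$$

$$SSR = \mathbf{y}'\mathbf{A}_3\mathbf{y} = \mathbf{y}'\big(\mathbf{C}\mathbf{D}^{-1}\mathbf{C}' - \tfrac{1}{bk}\mathbf{J} - \mathbf{A}_1 - \mathbf{A}_2\big)\mathbf{y}, \tag{5.2.3}$$

$$SSE = \mathbf{y}'\mathbf{A}_4\mathbf{y} = \mathbf{y}'\big(\mathbf{I}_{bk} - \mathbf{C}\mathbf{D}^{-1}\mathbf{C}'\big)\mathbf{y}, \tag{5.2.4}$$

and

$$TSS = \mathbf{y}'\mathbf{A}_5\mathbf{y} = \mathbf{y}'\big(\mathbf{I}_{bk} - \tfrac{1}{bk}\mathbf{J}\big)\mathbf{y}, \tag{5.2.5}$$

where in (5.2.1), the matrix $\mathbf{H}$ is given by

$$\mathbf{H} = (\mathbf{I}_n \otimes \mathbf{J}_n) + (\mathbf{J}_n \otimes \mathbf{I}_n).$$

The matrices $\mathbf{A}_1$, $\mathbf{A}_2$, $\mathbf{A}_3$, $\mathbf{A}_4$, and $\mathbf{A}_5$ are each real, symmetric, and idempotent, and

$$\sum_{p=1}^{4}\mathbf{A}_p = \mathbf{A}_5 .$$

Furthermore, the ranks of the five matrices, where $r(\mathbf{A}_p)$ denotes the rank of the matrix $\mathbf{A}_p$, are

$$r(\mathbf{A}_1) = t-1, \quad r(\mathbf{A}_2) = b-1, \quad r(\mathbf{A}_3) = (b-1)(t-1), \quad r(\mathbf{A}_4) = b(k-t), \quad r(\mathbf{A}_5) = bk-1 = \sum_{p=1}^{4} r(\mathbf{A}_p).$$

Hence, by again referring to Theorem 5 in Searle (1971) on the distribution of quadratic forms, we find that when $\mathbf{y} \sim N(\boldsymbol{\mu}, \sigma_e^2\mathbf{I}_{bk})$, then

$$\mathbf{y}'\mathbf{A}_p\mathbf{y} \sim \sigma_e^2\,\chi'^2_{r(\mathbf{A}_p)}\big(\boldsymbol{\mu}'\mathbf{A}_p\boldsymbol{\mu}/2\sigma_e^2\big) \tag{5.2.6}$$

for $p = 1$, 2, 3, and 4 and the $\mathbf{y}'\mathbf{A}_p\mathbf{y}$ are mutually independent. The distributional forms (5.2.6) can now be used to construct statistics for the tests of hypotheses in the usual manner.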
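As a numerical check on the closed forms of Section 5.1 that feed Table 5, the sketch below computes the intrablock estimates from (5.1.9) and the adjusted treatment sum of squares from the Table 5 formula, and compares the latter with the usual identity $SST_A = \sum_i \hat{\tau}_iQ_i$. The design constants and the adjusted totals are hypothetical; the only structural assumption is that the treatments are listed row by row of the square array and that the $Q_i$ sum to zero.

```python
from fractions import Fraction

def l2_estimates(kQ, n, lam1, lam2):
    """Intrablock estimates from (5.1.9); kQ is listed row by row of the n x n array."""
    theta = Fraction(lam1 - lam2, n * lam1 + n * (n - 1) * lam2)
    lead = n * (2 * lam1 + (n - 2) * lam2)
    tau = []
    for i, q in enumerate(kQ):
        row = sum(kQ[(i // n) * n:(i // n) * n + n])
        col = sum(kQ[i % n::n])
        tau.append((q + theta * (row + col)) / lead)   # row + col = S1(kQ_i) + 2 kQ_i
    return tau

def l2_sst_adjusted(Q, k, n, lam1, lam2):
    """SST_A of Table 5 in terms of the adjusted treatment totals Q_i."""
    theta = Fraction(lam1 - lam2, n * lam1 + n * (n - 1) * lam2)
    total = 0
    for i, q in enumerate(Q):
        row = sum(Q[(i // n) * n:(i // n) * n + n])
        col = sum(Q[i % n::n])
        total += q * q + theta * q * (row + col)       # Q_i S1(Q_i) + 2 Q_i^2 = Q_i (row + col)
    return Fraction(k, n * (2 * lam1 + (n - 2) * lam2)) * total

n, k, lam1, lam2 = 3, 10, 5, 4             # hypothetical design constants
Q = [Fraction(x) for x in (3, -2, 1, 0, -4, 2, 1, 1, -2)]      # made up, sums to zero
kQ = [k * q for q in Q]
tau_hat = l2_estimates(kQ, n, lam1, lam2)
sst_a = l2_sst_adjusted(Q, k, n, lam1, lam2)
```

Rational arithmetic keeps the comparison exact, so any slip in a coefficient of (5.1.9) or of the Table 5 formula shows up as a failed equality rather than a rounding difference.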
The test of the assumption concerning the validity of the additive model corresponds to the test of the hypothesis of zero interaction effects when the non-additive model is considered. To test the validity of the additive model, the test statistic used is the ratio of the mean square for remainder to the mean square for error. If the additive model assumption holds, then the test statistic possesses an F distribution with the appropriate degrees of freedom, and the hypothesis concerning the validity of the model is rejected for large values of this ratio. If the additive model assumption is valid, a test of the hypothesis of equal treatment effects would be performed using the ratio of the mean square for treatments (adjusted) to either the mean square for error or the pooled mean square for remainder plus error. Under the hypothesis of equal treatment effects, this ratio possesses an F distribution. On the other hand, if there is evidence to reject the assumption of the additive model in favor of the non-additive model, then an approximate test on the treatment effects could be performed if desired.

CHAPTER 6
THE GENERAL PARTIALLY BALANCED EXTENDED COMPLETE BLOCK DESIGN

In Chapter 4 the analysis of the fixed effects model as well as the analysis of the mixed model for the class of partially balanced group divisible extended complete block designs was presented in detail. In Chapter 5 the analysis of the fixed effects model was presented in detail and the analysis of the mixed model was mentioned for the class of partially balanced extended complete block designs with the L2 association scheme. These two special cases of the general partially balanced (GPB) extended complete block designs were presented in detail, not only because of their general applicability, but also because the constants (containing the parameters of the designs) were of the same form for both special cases.
In this chapter, we shall present the analysis of the GPB extended complete block designs. For this general class of designs, it will be necessary to introduce new constants to aid in simplifying the forms of the necessary calculating formulae. The introduction of these new constants stems from the desire to conform to the use of the standard notation for the analysis of general partially balanced incomplete block designs. In particular, the new constants are $d_1$, $d_2$, and $\Delta_s$ which correspond to the constants $c_1$, $c_2$, and $\Delta$ as defined and used by Bose and Shimamoto (1952) in the analysis of PBIB designs.

We now present the intrablock analysis of variance for the GPB extended complete block designs.

6.1 Intrablock Analysis

Let us consider the model in (4.1.8) where again the interaction effects are all zero. The additive model is written as

$$\mathbf{y} = \mathbf{C}(\mu\mathbf{1}_{bt} + \mathbf{X}_1\boldsymbol{\tau} + \mathbf{X}_2\boldsymbol{\beta}) + \boldsymbol{\epsilon} \tag{6.1.1}$$

where the symbols $\mathbf{y}$, $\mathbf{C}$, $\mu$, $\mathbf{1}_{bt}$, $\mathbf{X}_1$, $\boldsymbol{\tau}$, $\mathbf{X}_2$, $\boldsymbol{\beta}$, and $\boldsymbol{\epsilon}$ are defined following (4.1.8). The normal equations are set up exactly as presented in Sections 3.2 and 5.1, resulting in

$$\begin{pmatrix} G/bk \\ k\mathbf{Q} \\ r\mathbf{B} - \mathbf{N}'\mathbf{T} \end{pmatrix} = \begin{pmatrix} \hat{\mu} \\ \mathbf{A}\hat{\boldsymbol{\tau}} \\ (rk\mathbf{I}_b - \mathbf{N}'\mathbf{N})\hat{\boldsymbol{\beta}} \end{pmatrix} \tag{6.1.2}$$

where $G$ is the grand total of the observations, $\mathbf{Q}$ is the vector of adjusted treatment totals, $\mathbf{B}$ and $\mathbf{T}$ are respectively vectors of the unadjusted totals of the blocks and treatments, $\mathbf{N}$ is the incidence matrix of the design, $r$ is the number of replications of each treatment in the experiment, and $k$ is the block size. As in the previous sections where the intrablock analysis was discussed, the matrix $\mathbf{A}$ is given by

$$\mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}',$$

and $\hat{\mu}$, $\hat{\boldsymbol{\tau}}$, and $\hat{\boldsymbol{\beta}}$ are respectively the estimates of the parameters $\mu$, $\boldsymbol{\tau}$, and $\boldsymbol{\beta}$ in (6.1.1).

To find an expression for the intrablock estimate of the effect of the $i$th treatment, we note from the $t \times 1$ vector of equations $k\mathbf{Q} = \mathbf{A}\hat{\boldsymbol{\tau}}$ in (6.1.2) that the $i$th element can be written as

$$kQ_i = rk\hat{\tau}_i - \sum_{j=1}^{b}\sum_{h=1}^{t} n_{ij}n_{hj}\hat{\tau}_h, \tag{6.1.3}$$

where $Q_i$ is the adjusted total for the $i$th treatment and $n_{ij}$ and $n_{hj}$ are elements of the incidence matrix $\mathbf{N}$ of the design.
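The quantities $\sum_j\sum_h n_{ij}n_{hj}\hat{\tau}_h$ can be expressed in terms of the parameters of the design as follows. The terms with $i = h$ contribute $\sum_j n_{ij}^2\hat{\tau}_i$, the terms in which $i \neq h$ and $i$ and $h$ are 1st associates contribute $\lambda_1\sum\hat{\tau}_h$, and the terms in which $i \neq h$ and $i$ and $h$ are 2nd associates contribute $\lambda_2\sum\hat{\tau}_h$, so that

$$\sum_j\sum_h n_{ij}n_{hj}\hat{\tau}_h = (rk - n_1\lambda_1 - n_2\lambda_2)\hat{\tau}_i + \lambda_1S_1(\hat{\tau}_i) + \lambda_2S_2(\hat{\tau}_i),$$

where $S_1(\hat{\tau}_i)$ is the sum of the estimated treatment effects of all treatments ($n_1$ in number) that are first associates of the $i$th treatment, and likewise, $S_2(\hat{\tau}_i)$ is the sum of the estimated treatment effects of all treatments ($n_2$ in number) that are second associates of the $i$th treatment. By replacing the quantities $\sum_j\sum_h n_{ij}n_{hj}\hat{\tau}_h$ in equation (6.1.3) with their equivalent expressions involving $S_1(\hat{\tau}_i)$ and $S_2(\hat{\tau}_i)$, we may write equation (6.1.3) in the form

$$kQ_i = (n_1\lambda_1 + n_2\lambda_2)\hat{\tau}_i - \lambda_1S_1(\hat{\tau}_i) - \lambda_2S_2(\hat{\tau}_i). \tag{6.1.4}$$

Equation (6.1.4) is now summed over the first associates of the $i$th treatment resulting in the expression

$$kS_1(Q_i) = -n_1\lambda_1\hat{\tau}_i + S_1(\hat{\tau}_i)(n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{11}^1 - \lambda_2p_{12}^1) + S_2(\hat{\tau}_i)(-\lambda_1p_{11}^2 - \lambda_2p_{12}^2). \tag{6.1.5}$$

Summing (6.1.4) over the second associates of the $i$th treatment results in

$$kS_2(Q_i) = -n_2\lambda_2\hat{\tau}_i + S_2(\hat{\tau}_i)(n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{12}^2 - \lambda_2p_{22}^2) + S_1(\hat{\tau}_i)(-\lambda_1p_{12}^1 - \lambda_2p_{22}^1). \tag{6.1.6}$$

In (6.1.5) and (6.1.6) $p_{jk}^i$ is the number of treatments that are both a $j$th associate of treatment $\alpha$ and a $k$th associate of treatment $\beta$ given that treatments $\alpha$ and $\beta$ are $i$th associates. As in Bose and Shimamoto (1952), we write equations (6.1.5) and (6.1.6) in the forms

$$kS_1(Q_i) = -n_1\lambda_1\hat{\tau}_i + a_{11}S_1(\hat{\tau}_i) + a_{12}S_2(\hat{\tau}_i) \tag{6.1.7}$$

and

$$kS_2(Q_i) = -n_2\lambda_2\hat{\tau}_i + a_{21}S_1(\hat{\tau}_i) + a_{22}S_2(\hat{\tau}_i) \tag{6.1.8}$$

where

$$a_{11} = n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{11}^1 - \lambda_2p_{12}^1, \qquad a_{12} = -\lambda_1p_{11}^2 - \lambda_2p_{12}^2,$$

$$a_{21} = -\lambda_1p_{12}^1 - \lambda_2p_{22}^1, \qquad a_{22} = n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{12}^2 - \lambda_2p_{22}^2 .$$

In order to express the intrablock estimate $\hat{\tau}_i$ as a function of $Q_i$, $S_1(Q_i)$, and $S_2(Q_i)$ having arrived at the equations (6.1.7) and (6.1.8), we interrupt the development briefly to see how the sums $S_1(\hat{\tau}_i)$ and $S_2(\hat{\tau}_i)$ can be replaced in (6.1.7) and (6.1.8) with the quantities $S_1(Q_i)$ and $S_2(Q_i)$.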
To this end, consider the linear combination

$$L = k^2Q_i + d_1kS_1(Q_i) + d_2kS_2(Q_i) \tag{6.1.9}$$

involving only the $Q_i$ and parameters of the design with $d_1$ and $d_2$ being constants consisting of functions of the $a_{ij}$, $i, j = 1, 2$. If (6.1.4), (6.1.7), and (6.1.8) are substituted into (6.1.9) for $kQ_i$, $kS_1(Q_i)$, and $kS_2(Q_i)$, respectively, the resulting expression for $L$ is

$$L = [k(n_1\lambda_1 + n_2\lambda_2) - d_1n_1\lambda_1 - d_2n_2\lambda_2]\hat{\tau}_i + (d_1a_{11} + d_2a_{21} - k\lambda_1)S_1(\hat{\tau}_i) + (d_1a_{12} + d_2a_{22} - k\lambda_2)S_2(\hat{\tau}_i). \tag{6.1.10}$$

The quantities $d_1$ and $d_2$ in (6.1.10) are now chosen so that upon equating the right-hand side of (6.1.9) to the right-hand side of (6.1.10), and using $\hat{\tau}_i + S_1(\hat{\tau}_i) + S_2(\hat{\tau}_i) = 0$, the equation (6.1.10) becomes

$$k^2Q_i + d_1kS_1(Q_i) + d_2kS_2(Q_i) = k(n_1\lambda_1 + n_2\lambda_2)\hat{\tau}_i . \tag{6.1.11}$$

That is, equation (6.1.11) expresses the estimate $\hat{\tau}_i$ as a function of the quantities $Q_i$, $S_1(Q_i)$, and $S_2(Q_i)$. To obtain the values for $d_1$ and $d_2$ so that equation (6.1.11) is as shown, we require the identities

$$k\lambda_1 = d_1(a_{11} + n_1\lambda_1) + d_2(a_{21} + n_2\lambda_2) \tag{6.1.12}$$

and

$$k\lambda_2 = d_1(a_{12} + n_1\lambda_1) + d_2(a_{22} + n_2\lambda_2). \tag{6.1.13}$$

Solving equations (6.1.12) and (6.1.13) simultaneously by the use of determinants, we have $d_1 = D_1/D$ and $d_2 = D_2/D$, where $D$, $D_1$, and $D_2$ are given by

$$D = (a_{11} + n_1\lambda_1)(a_{22} + n_2\lambda_2) - (a_{12} + n_1\lambda_1)(a_{21} + n_2\lambda_2), \tag{6.1.14}$$

$$D_1 = k\lambda_1(a_{22} + n_2\lambda_2) - k\lambda_2(a_{21} + n_2\lambda_2), \tag{6.1.15}$$

and

$$D_2 = k\lambda_2(a_{11} + n_1\lambda_1) - k\lambda_1(a_{12} + n_1\lambda_1). \tag{6.1.16}$$

Substituting for $a_{11}$, $a_{12}$, $a_{21}$, and $a_{22}$ in (6.1.14), (6.1.15), and (6.1.16), simplifying, and writing $D = k^2\Delta_s$, the following equations for $d_1$ and $d_2$ are obtained

$$k\Delta_sd_1 = \lambda_1(n_1\lambda_1 + n_2\lambda_2 + \lambda_2) + (\lambda_1 - \lambda_2)(\lambda_2p_{12}^1 - \lambda_1p_{12}^2) \tag{6.1.17}$$

and

$$k\Delta_sd_2 = \lambda_2(n_1\lambda_1 + n_2\lambda_2 + \lambda_1) + (\lambda_1 - \lambda_2)(\lambda_2p_{12}^1 - \lambda_1p_{12}^2), \tag{6.1.18}$$

where

$$k^2\Delta_s = (n_1\lambda_1 + n_2\lambda_2 + \lambda_1)(n_1\lambda_1 + n_2\lambda_2 + \lambda_2) + (\lambda_1 - \lambda_2)[(n_1\lambda_1 + n_2\lambda_2)(p_{12}^1 - p_{12}^2) + \lambda_2p_{12}^1 - \lambda_1p_{12}^2].$$

The expressions (6.1.17) and (6.1.18) for the values of $d_1$ and $d_2$, respectively, are now substituted back into (6.1.11). The intrablock estimate of the effect of the $i$th treatment is

$$\hat{\tau}_i = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[k^2Q_i + d_1S_1(kQ_i) + d_2S_2(kQ_i)] \tag{6.1.19}$$

which, because $kQ_i + S_1(kQ_i) + S_2(kQ_i) = 0$, has the alternate forms

$$\hat{\tau}_i = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[(k - d_2)kQ_i + (d_1 - d_2)S_1(kQ_i)] \tag{6.1.20}$$

and

$$\hat{\tau}_i = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[(k - d_1)kQ_i + (d_2 - d_1)S_2(kQ_i)]. \tag{6.1.21}$$

The use of one of the alternate forms would probably be more convenient since only one sum, either $S_1(kQ_i)$ or $S_2(kQ_i)$, for each treatment need be calculated.

The difference of the estimated effects of treatments $i$ and $i'$, $i \neq i'$, using the alternate form in (6.1.20) above is

$$\hat{\tau}_i - \hat{\tau}_{i'} = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[(k - d_2)k(Q_i - Q_{i'}) + (d_1 - d_2)k\{S_1(Q_i) - S_1(Q_{i'})\}].$$

Under the assumption that the random errors in (6.1.1) are independent normally distributed random variables with mean zero and variance $\sigma_e^2$, the difference has the properties

$$E(\hat{\tau}_i - \hat{\tau}_{i'}) = \tau_i - \tau_{i'}$$

and

$$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \begin{cases} \dfrac{2(k - d_1)\sigma_e^2}{n_1\lambda_1 + n_2\lambda_2}, & i \text{ and } i' \text{ are 1st associates,} \\[2ex] \dfrac{2(k - d_2)\sigma_e^2}{n_1\lambda_1 + n_2\lambda_2}, & i \text{ and } i' \text{ are 2nd associates.} \end{cases}$$

The analysis of variance table for the general partially balanced extended complete block design is exactly of the same form as Table 5 except that the calculating formula for the sum of squares for treatments (adjusted) is now given by

$$SST_A = \frac{1}{k^2(n_1\lambda_1 + n_2\lambda_2)}\sum_i[(k - d_2)(kQ_i)^2 + (d_1 - d_2)(kQ_i)S_1(kQ_i)].$$

Also, the tests of hypotheses usually performed are conducted in the manner described in Section 5.2.

The recovery of the interblock information and a combined estimate of the intrablock and interblock treatment effects can also be obtained with the straightforward application of the maximum likelihood method of Rao (1947).

The utility of the general partially balanced extended complete block designs is at present limited. This limitation is imposed by the complexity of the formulae for the estimates of the treatment effects, in that the constants $d_1$ and $d_2$ must be calculated for any design used.
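The calculation of the constants $d_1$, $d_2$, and $\Delta_s$ from (6.1.17) and (6.1.18) is mechanical once the design parameters are known. The sketch below computes them and then forms the estimates from the single-sum form that follows from (6.1.19) together with $kQ_i + S_1(kQ_i) + S_2(kQ_i) = 0$, namely $[(k - d_2)kQ_i + (d_1 - d_2)S_1(kQ_i)]/k(n_1\lambda_1 + n_2\lambda_2)$. The example design is a hypothetical group divisible ECBD with two groups of two treatments ($k = 6$, $\lambda_1 = 8$, $\lambda_2 = 9$, $p_{12}^1 = 0$, $p_{12}^2 = 1$); the function names and the adjusted totals are assumptions made for illustration.

```python
from fractions import Fraction

def gpb_constants(k, n1, n2, lam1, lam2, p1_12, p2_12):
    """Return (d1, d2, Delta_s) from (6.1.17), (6.1.18) and the expression for k^2 Delta_s."""
    E = n1 * lam1 + n2 * lam2
    diff = lam1 - lam2
    common = diff * (lam2 * p1_12 - lam1 * p2_12)
    k2_delta = ((E + lam1) * (E + lam2)
                + diff * (E * (p1_12 - p2_12) + lam2 * p1_12 - lam1 * p2_12))
    delta_s = Fraction(k2_delta, k * k)
    d1 = Fraction(lam1 * (E + lam2) + common) / (k * delta_s)
    d2 = Fraction(lam2 * (E + lam1) + common) / (k * delta_s)
    return d1, d2, delta_s

def gpb_estimates(kQ, first_associates, k, n1, n2, lam1, lam2, d1, d2):
    """tau_i = [(k - d2) kQ_i + (d1 - d2) S1(kQ_i)] / [k (n1 lam1 + n2 lam2)]."""
    E = n1 * lam1 + n2 * lam2
    return [((k - d2) * kQ[i]
             + (d1 - d2) * sum(kQ[h] for h in first_associates[i])) / (k * E)
            for i in range(len(kQ))]

# Hypothetical design: treatments 0..3, groups {0,1} and {2,3}.
k, lam1, lam2 = 6, 8, 9
first = {0: [1], 1: [0], 2: [3], 3: [2]}
d1, d2, delta_s = gpb_constants(k, n1=1, n2=2, lam1=lam1, lam2=lam2,
                                p1_12=0, p2_12=1)
kQ = [Fraction(5), Fraction(-3), Fraction(1), Fraction(-3)]   # made up, sums to zero
tau_hat = gpb_estimates(kQ, first, k, 1, 2, lam1, lam2, d1, d2)
```

For this design the estimates agree with those obtained from the group divisible closed form (4.2.9), and they satisfy the reduced normal equations $k\mathbf{Q} = \mathbf{A}\hat{\boldsymbol{\tau}}$ exactly.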
A similar difficulty is encountered in the analysis of PBIB designs, unless one has reference to the extensive listing of PBIB designs (with two associate classes) and their associated constants necessary for an analysis as given by Clatworthy (1973).

CHAPTER 7
CONCLUDING REMARKS AND A SENSORY TESTING EXAMPLE

Throughout the development and presentation of this work, it has been necessary to make certain assumptions concerning the model as well as the type of correlation present in the data in order to formulate our methods of analysis. In Chapter 3 for example, in the development concerning the possible presence of the correlation $\rho$ between duplicate responses to the same treatment in the same block, the additive model only was assumed. Without making the assumption that the interaction effects are all zero, the estimate of the magnitude of the correlation would be confounded with the estimates of the interaction effects. In this case, an estimate of the correlation free of the interaction effects would not have been possible by the method we used. Although the assumption of additivity for many realistic situations is somewhat restrictive, the additivity assumption was made as a matter of necessity and to be in line with the assumption employed in the analysis of randomized complete block designs (of which our designs are just extensions).

The assumption that the correlation between duplicate responses to the same treatment in the same block is constant and equal for all treatments and blocks may also appear to be somewhat restrictive. It would seem more appropriate perhaps to assume the correlation is not constant but rather varies over the treatments and blocks. That is, for many practical applications it may be more realistic to consider the correlations $\rho_{ij}$, $i = 1, 2, \ldots, t$ and $j = 1, 2, \ldots, b$.
However, this non-equality of the correlations would give rise to the difficulty of having to estimate a larger number of parameters than the number of observations present in the experiment. Also, all the correlations $\rho_{ij}$ could not be estimated in a given experiment since not all block-treatment combinations would have duplicate responses.

A simplification of the problem of non-equality of the correlations would be to consider the correlations $\rho_j$, $j = 1, 2, \ldots, b$, that is, a different correlation is associated with each block. Such a case might arise when panelists of varying degrees of proficiency are used in sensory testing experiments. The estimation of the $\rho_j$ and the subsequent test on the treatment effects is being considered for future work.

The methods presented for testing the hypothesis of zero correlation and for estimating $\rho$ may appear to be somewhat intuitive. First attempts in finding a likelihood ratio test of the hypothesis of zero correlation and a maximum likelihood estimator of $\rho$ resulted in complex expressions which did not seem to simplify. These likelihood procedures could be investigated in future work. At that time, it may be of interest to compare the likelihood results with the results contained in this work.

The possible lack of utility of the general partially balanced extended complete block designs developed in Chapter 6 was mentioned at the end of that chapter. Investigations into the possibility of expressing our solutions in terms of the parameters and constants already tabulated for PBIB designs by Clatworthy (1973) could be considered. Hopefully, this would enhance the utility of the GPB extended complete block designs with respect to the calculations involved in the analysis.

As mentioned at the end of Section 3.2, we shall now suggest a test of the hypothesis of equal treatment effects in the presence of a non-zero correlation.
The test procedure is just a suggestion since the properties of the procedure have not been studied in detail at this time and remain for future consideration.

The quadratic forms for the sums of squares for treatments (adjusted) in Table 1 of Section 3.2 and for remainder in (3.3.2) are not independently distributed in general. However, the quadratic forms for $SST_A$ and the sum of squares for duplication variation in (3.3.1) are independently distributed. In fact, we have that

$$SST_A \sim \sigma^2\Big\{1 + \frac{2\rho}{\lambda k}[(k+t)\lambda - 2rk]\Big\}\chi^2_{t-1}$$

when the hypothesis of equal treatment effects is true, that

$$SSDV \sim \sigma^2(1 - \rho)\chi^2_{b(k-t)},$$

and that these random variables are distributed independently of one another. Thus, to test the hypothesis under consideration, we may use the test statistic

$$F_\rho = \frac{MST_A\Big/\Big\{1 + \dfrac{2\rho}{\lambda k}[(k+t)\lambda - 2rk]\Big\}}{MSDV/(1 - \rho)}$$

which possesses a central F distribution with $t-1$ and $b(k-t)$ degrees of freedom in the numerator and denominator, respectively. When the hypothesis is not true, the test statistic $F_\rho$ has a non-central F distribution which depends upon the true value of $\rho$ as well as the unknown value of the ratio of $\sum\tau_i^2$ to $\sigma^2$. Thus, the power of the test could be calculated for various values of $\rho$ and the ratio $\sum\tau_i^2/\sigma^2$.

The value of the test statistic $F_\rho$ depends upon the true value of $\rho$ which is usually unknown. The suggested procedure, therefore, is to replace $\rho$ with $\hat{\rho}$ resulting in the approximate test statistic $F^*$ given as

$$F^* = \frac{MST_A\Big/\Big\{1 + \dfrac{2\hat{\rho}}{\lambda k}[(k+t)\lambda - 2rk]\Big\}}{MSDV/(1 - \hat{\rho})} .$$

The distribution of the approximate test statistic $F^*$ depends upon the unknown distribution of $\hat{\rho}$.
At this point, complications arise in arriving at the exact or an approximate distribution of $F^*$ since $\hat{\rho}$ is calculated using the value of SSR which, as a random variable, is not independent of the random variable $SST_A$. Hence, until more investigation may be made into the distribution of the estimator $\hat{\rho}$, the distribution of the test statistic $F^*$ might be approximated by an F distribution with $t-1$ and $b(k-t)$ degrees of freedom (the distribution of F). The closeness of this approximation to the exact distribution is being considered and will hopefully be reported in later work.

The following is a numerical example of a taste testing experiment. The objectives of the experiment were twofold. First, it was of interest to compare the degree of preference for the treatments by the specific panelists used. Second, it was suspected that correlation would be present in the data and hence a test for its presence was to be performed.

Each of the trained panelists (denoted by the numbers 1 through 10) was asked to evaluate five different treatments (denoted by the letters A, B, C, D, and E) by assigning a numerical value of 1 through 9 according to his or her degree of preference for the treatments. The lower end of this hedonic scale reflects an extreme non-preference while the upper end reflects an extreme preference for the treatments. Since there was an interest in measuring the consistency of the panelists used and since each panelist could evaluate six food samples effectively at one sitting, each panelist was asked to evaluate each of the treatments plus a replicate of one of the treatments. The data with some calculations is

[Data table by treatment: not recoverable from the extracted text]

where the $kQ_i$ are calculated using the formula

$$kQ_i = kT_i - \sum_j n_{ij}B_j,$$

which follows (3.2.3), and the $\hat{\tau}_i$ are calculated using the formula in (3.2.4). For treatment A, the necessary calculations are

$$kQ_A = (6)(89) - \{(2)(38+19) + (1)(34+27+23+36+31+42+30+22)\} = 175$$

and

$$\hat{\tau}_A = \frac{175}{(14)(5)} = 2.5 .$$

The sums of squares for treatments (adjusted) and blocks (unadjusted), the total sum of squares, and the sum of squares for residual are calculated using the formulae in Table 1. The results of these calculations are

SOME NEW EXTENDED BLOCK DESIGNS AND THEIR ANALYSES
By
JACK FRANKLYN SCHRECKENGOST

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1974

TO MY WIFE

ACKNOWLEDGMENTS

I would like to express my appreciation to Dr. John A. Cornell for his guidance and assistance while directing this dissertation. My thanks, also, to the other members of my advisory committee, Dr. F. W. Knapp, Dr. Frank G. Martin, Dr. John G. Saw, and Dr. P. V. Rao, for their helpful suggestions.

A belated thanks is expressed to Mr. Ronald E. Boyer, a good teacher and a valued friend, who gave me much encouragement from the very beginning.

Appreciation to my wife, Donna Rae, cannot be expressed as deeply as is felt. I thank her for her patience and understanding during my many hours of study, research, and writing.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT
CHAPTER
1 INTRODUCTION
  1.1 Blocking Designs
  1.2 Extended Complete Block Designs
  1.3 Purpose of This Work
2 LITERATURE REVIEW
3 EXTENDED COMPLETE BLOCK DESIGNS WITH CORRELATED OBSERVATIONS
  3.1 Notation and Definitions
  3.2 Intrablock Estimation of the Treatment Effects
  3.3 A Test for the Presence of Correlation
  3.4 The Exact Distribution of SSR for $\rho > 0$
  3.5 An Approximate Distribution of SSR for $\rho > 0$
  3.6 An Estimate of the Correlation
4 A PARTIALLY BALANCED GROUP DIVISIBLE ECBD
  4.1 Definitions and Notation
  4.2 Intrablock Estimation of the Treatment Effects
  4.3 Distributions of the Sums of Squares and Relevant Tests of Hypotheses
  4.4 Mixed Model Analysis
5 A PARTIALLY BALANCED ECBD WITH THE $L_2$ ASSOCIATION SCHEME
  5.1 Intrablock Analysis
  5.2 Distributions of the Sums of Squares and Relevant Tests of Hypotheses
6 THE GENERAL PARTIALLY BALANCED EXTENDED COMPLETE BLOCK DESIGN
  6.1 Intrablock Analysis
7 CONCLUDING REMARKS AND A SENSORY TESTING EXAMPLE
APPENDIX
1 THE EXACT DISTRIBUTION OF SSR FOR b = 2t AND k = t+1 WHEN $\rho > 0$
2 AN APPROXIMATE DISTRIBUTION OF SSR FOR b = 2t AND k = t+1 WHEN $\rho > 0$
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH

LIST OF TABLES

1 Intrablock Analysis of Variance for an ECBD
2 Values of g and h for the Approximate Distribution of SSR, I
3 Intrablock Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs
4 Mixed Model Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs
5 Intrablock Analysis of Variance for Partially Balanced ECBD of the $L_2$ Association Scheme
A1 Values of g and h for the Approximate Distribution of SSR, II
A2 Comparison of the Exact Distribution and Two Approximate Distributions of SSR

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

SOME NEW EXTENDED BLOCK DESIGNS AND THEIR ANALYSES

By
Jack Franklyn Schreckengost
August, 1974

Chairman: Dr. John A. Cornell
Major Department: Statistics

An extended complete block design is a balanced block design consisting of t treatments in b blocks each of size k such that k varies between t and 2t.
The balance among the treatments is achieved by selecting duplicates of some of the treatments for each block according to the scheme followed when selecting blocks from the class of balanced incomplete block designs. Under the assumption of an additive model, it may be of interest to investigate the existence of correlation between responses to the same treatment in the same block. When a positive correlation between duplicate observations is present, it has been previously shown that k should be taken equal to t+1 for maximum efficiency with the extended complete block designs when compared to complete block designs.

A procedure for the test of the hypothesis of zero correlation is presented as is a method for estimating the correlation if the hypothesis of zero correlation is rejected in favor of the alternative hypothesis of positive correlation. Particular attention is given to the distribution of the sum of squares for remainder, where remainder is defined as residual minus duplication error, under the alternative hypothesis of positive correlation. The distribution of the sum of squares for remainder is necessary for calculating the power of the test and for obtaining an approximation to the distribution of the estimator of the correlation. A specific formula for the distribution of the sum of squares for remainder is given for the case t = b and k = t+1. The exact distribution and an approximate distribution of the sum of squares for remainder are also presented for the case b = 2t and k = t+1.

The general partially balanced extended complete block design is defined as a partially balanced block design consisting of t treatments in b blocks each of size k greater than t. The analyses of variance for the non-additive fixed effects and mixed models are presented for the special class of designs called partially balanced group divisible extended complete block designs.
The analysis of variance of the additive fixed effects model is also presented for the class of partially balanced extended complete block designs with the $L_2$ (Latin Square) association scheme. The analysis of variance of the non-additive mixed model for this class of designs is mentioned briefly. The intrablock analysis of variance for the additive model is developed for the general partially balanced extended complete block designs. Also, the recovery of the interblock information and the combined intrablock and interblock analysis for these general designs are mentioned briefly.

The final chapter contains some comments about the assumptions made and about directions for future study. A numerical example of a taste testing experiment is also presented with the resulting analysis for the balanced extended complete block designs considered in this work.

CHAPTER 1
INTRODUCTION

In many fields of experimentation, a distinction that has long been implicit in the statistical literature is the difference between experiments designed for the estimating of absolute treatment effects and experiments of the comparative type. In comparative experiments, the emphasis is on performing comparisons between the effects of the different treatments such as the effects of different doses of a drug or the effects of different levels of nitrogen on the average yield of soybeans. While the distinction between comparative experiments and experiments designed to estimate the absolute treatment effects individually is perhaps not always clearly defined, nevertheless, the idea of a comparative type of experiment remains convenient and useful.

For comparative experiments, it is clear that an advantage is to be gained by comparing the treatments under homogeneous conditions. To achieve this end, much of the effort in choosing the homogeneous conditions is directed toward the selection and use of block designs.
Over the years, both complete and incomplete block designs have been discussed in detail. In this work, we shall be concerned mainly with combinations of these block designs for use in comparative type experiments.

1.1 Blocking Designs

In blocking experiments where the objective is the comparison of different treatment effects, the number of experimental units in each block may or may not equal the number of treatments to be compared. When the size of the block, where size refers to the number of experimental units in each block, is equal to the number of different treatments and each treatment is randomly assigned once with every other treatment in each block, the design is known as a randomized complete block design. If the size of the block is less than the number of treatments, an incomplete block design may be used. Incomplete block designs are common in applications where either the number of treatments is large or the size of the block must be kept small in order to ensure homogeneity of the experimental units in each block.

Still another type of block design exists when the size of the block exceeds the number of different treatments. In this latter design, if each block contains first replicates of all of the treatments plus duplicates or second replicates of some of the treatments, the design is called an "extended complete block design." We now discuss such designs.

1.2 Extended Complete Block Designs

In an attempt to increase the precision of the comparisons between the effects of each of the treatments and the effect of a control treatment, Pearce (1960) introduced blocking experiments where in each block the control treatment was replicated. Later, Pearce (1964) considered possible methods for designing experiments in which for a given experiment the blocks are of varying sizes. The analysis of a fertilizer experiment on strawberries in which an extended complete block design was used is mentioned briefly by Pearce (1963).
Extended complete block designs, as introduced by John (1963), are block designs in which each block contains a first replicate of all of the treatments plus a duplicate or second replicate of some of the treatments. These second replicates in each block comprise an incomplete block selected from the class of balanced incomplete block designs. An example of an extended complete block design formed by augmenting complete blocks of size three with balanced incomplete blocks of size two resulting in extended complete blocks of size five is presented in Figure 1, where the three treatments are denoted by A, B, and C.

    A  A  A
    B  B  B    complete block design
    C  C  C
    A  A  B
    B  C  C    balanced incomplete block design

Figure 1. An extended complete block design consisting of three blocks each of size five experimental units containing treatments A, B, and C.

Extended complete block designs can be used in a variety of experimental situations. In sensory experiments where the objective is the comparison of preferences for different food samples (treatments) expressed by a panel of judges (blocks), the number of food samples that a panelist may effectively evaluate at a single sitting is limited but may be more than the number of different samples to be evaluated. Acquiring panelists for these sensory experiments is often difficult and/or costly. Hence, if a panelist can effectively evaluate all of the different samples plus replicates of some of the samples at a single sitting and if a fixed number of observed values of each sample is necessary, a smaller number of panelists would be required with the use of an extended complete block design than if each panelist could evaluate each of the samples only once. The use of a smaller number of panelists would result in a savings in terms of time and cost.
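The construction shown in Figure 1 can be sketched in a few lines. This is only an illustration under the assumption that a design is stored as a list of blocks, each block being a list of treatment labels; the function name `extend_blocks` is hypothetical, and the randomization of treatments to units within a block is not shown.

```python
def extend_blocks(treatments, bibd_blocks):
    """Append each balanced incomplete block to a complete replicate of the treatments.

    treatments  : iterable of treatment labels, one complete replicate
    bibd_blocks : iterable of incomplete blocks (the second replicates)
    Returns one extended complete block per incomplete block.
    """
    return [list(treatments) + list(extra) for extra in bibd_blocks]


# Figure 1: complete blocks {A, B, C} augmented by the BIB blocks AB, AC, BC.
figure1 = extend_blocks("ABC", ["AB", "AC", "BC"])
for block in figure1:
    print(" ".join(block))
```

Each of the three resulting blocks has five units, and every treatment appears five times over the whole design (three first replicates plus two duplicates).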
In an agricultural setting, an experimenter wishing to compare the effects of different chemical sprays on citrus trees may have available more trees in a block than the number of sprays to be tested. On the additional trees in each of the blocks, second replicates of some of the different chemical sprays could be applied. In an industrial experiment, on a given day an experimenter may be able to obtain observed responses from each of the treatments as well as responses from second replicates of some of the treatments. If he does not have enough time to observe the responses from second replicates of all the treatments, an extended complete block design could be used.

The analysis of a block design in which two treatments are applied to the experimental units in blocks of size three was discussed by John (1962). The following year, John (1963) introduced extended complete block designs and presented their analysis. In his designs, the block size k could vary between t and 2t, where t is the number of different treatments used in the experiment. Trail and Weeks (1973) generalized this latter work of John to include designs in which k is greater than 2t. In the papers by John (1963) and Trail and Weeks (1973), the analysis of the fixed effects model as well as the mixed model was presented in detail.

The application of the extended complete block designs of John to the area of sensory evaluation was considered by Cornell and Knapp (1972, 1974). Also considered by Cornell (1974) was the efficiency of these designs compared to randomized complete block designs.

1.3 Purpose of This Work

The first part of this work will be concentrated on extending the works by Cornell and Knapp (1972) and Cornell (1974) with special emphasis on the area of sensory evaluation.
Specifically, we shall be interested in the analysis of extended complete block designs where correlation is present between duplicate responses to the same treatment in the same block and the magnitude of the correlation is constant over all treatments and blocks. In sensory experiments for example, the presence of correlated observations easily could arise as a result of using highly skilled judges (as the blocks). Therefore, we shall be interested in testing whether there is any evidence of correlation present in the data. Furthermore, if there is sufficient evidence to indicate that correlation is present, we shall seek to obtain an estimate of the magnitude of the correlation, which is denoted by $\rho$. An approximate test on the treatment effects in the presence of a value of $\rho$ greater than zero will be suggested, since an exact test on treatment effects cannot be performed for this experimental situation.

In the second part of this work, we shall generalize the work of Trail and Weeks (1973) to include extended complete block designs generated by partially balanced incomplete block (PBIB) designs with two associate classes.

CHAPTER 2
LITERATURE REVIEW

The analysis of block designs in which the block size k could vary between t and 2t, where t denotes the number of treatments in the experiment, was first introduced by John (1963). These designs, called extended complete block (ECB) designs, contain in each of the b blocks first replicates of all of the treatments plus second replicates of k-t of the treatments. The method taken by John of choosing the k-t second replicates in each block was to use the class of balanced incomplete block (BIB) designs of block size k-t.

Using formulae similar in structure to the formulae used in the analysis of balanced incomplete block designs, John discussed the intrablock analysis, the interblock analysis, and the recovery of the interblock information.
The recovery of the interblock information was achieved by combining the two independent intrablock and interblock estimates of the effects of each of the treatments.

In the intrablock analysis, in addition to obtaining the treatment effects adjusted for blocks and the unadjusted block analysis, John obtained estimates of both the experimental error variation and the block × treatment interaction. The measure of the interaction was obtained by subtracting the experimental error variation from the residual variation in the intrablock analysis of variance. With the interblock analysis, however, an additive model was assumed. That is, the block × treatment interaction variance component was assumed to be zero. Using the assumption of the additive model then, the combined estimate of each of the treatment effects was formed using a linear combination of the weighted intrablock and interblock estimates. The weights used with the intrablock and interblock estimates were the reciprocals of the estimates of their respective variances. The special case where t = b and k = t+1 was presented in detail.

Trail and Weeks (1973) considered the aforementioned extended complete block designs (ECBD) as a special case of the more general class of designs which they called extended complete block designs generated by balanced incomplete block designs (BIBD). Their generalization of the work of John (1963) included balanced block designs in which the block size could exceed 2t. An example of this more general design in which a balanced incomplete block design is added to a double complete block design (CBD) is presented in Figure 2. A second example of these more general designs in which the complete block design is augmented by two balanced incomplete block designs is presented in Figure 3. In both of these figures, the three treatments are denoted by the letters A, B, and C.
It should be noted that the treatments would be randomly assigned to the experimental units within each block when the experiment is performed.

    A  A  A
    B  B  B    CBD
    C  C  C
    A  A  A
    B  B  B    CBD
    C  C  C
    A  A  B
    B  C  C    BIBD

Figure 2. An ECBD for three treatments generated by a BIBD consisting of three blocks each of size eight experimental units.

    A  A  A
    B  B  B    CBD
    C  C  C
    A  A  B
    B  C  C    BIBD
    A  B  A
    C  C  B    BIBD

Figure 3. An ECBD for three treatments generated by a BIBD consisting of three blocks each of size seven experimental units.

In a block design in which t treatments are arranged in b blocks, properties of the design can be obtained by studying the elements of the incidence matrix of the design. The incidence matrix $N = (n_{ij})$ is a t × b matrix such that $n_{ij}$ denotes the number of times the $i$th treatment appears in the $j$th block. The elements of the incidence matrix for the extended complete block designs may be constructed by appropriately summing the elements of the incidence matrix of a balanced incomplete block design and the elements of the incidence matrix of a complete block design.

For the extended complete block designs generated by BIB designs, Trail and Weeks showed that the incidence matrix N can be generated from the incidence matrix N* of any balanced incomplete block design by using the equation

$$N = c_0 J + (c_1 - c_0)N^* \qquad (2.1)$$

where J is the incidence matrix of a complete block design, that is, J is a t × b matrix of ones, and $c_0$ and $c_1$ are elements of the set of positive integers. The model used by the authors is

$$y = C(X_1\tau + X_2\beta + \gamma) + \varepsilon \qquad (2.2)$$

where $\tau$ is a t × 1 vector of treatment effects, $\beta$ is a b × 1 vector of block effects, $\gamma$ is a bt × 1 vector of interaction effects, and $\varepsilon$ is a bk × 1 vector of independent random error effects. Letting $1_t$ denote a t × 1 vector of ones, $I_t$ denote the t × t identity matrix, and $y_{ijl}$ represent the $l$th response to the $i$th treatment in the $j$th block, the vector y and the matrices C, $X_1$, and $X_2$ in (2.2) are of the forms

$$
y = \begin{pmatrix} y_{111} \\ \vdots \\ y_{11n_{11}} \\ y_{211} \\ \vdots \\ y_{tbn_{tb}} \end{pmatrix}_{bk \times 1}, \qquad
X_1 = \begin{pmatrix} I_t \\ I_t \\ \vdots \\ I_t \end{pmatrix}_{bt \times t}, \qquad
X_2 = \begin{pmatrix} 1_t & & & \\ & 1_t & & \\ & & \ddots & \\ & & & 1_t \end{pmatrix}_{bt \times b},
$$

and

$$
C = \begin{pmatrix} 1_{n_{11}} & & & \\ & 1_{n_{21}} & & \\ & & \ddots & \\ & & & 1_{n_{tb}} \end{pmatrix}_{bk \times bt}. \qquad (2.3)
$$

In our notation, $1_{n_{21}}$ is an $n_{21}$ × 1 vector of ones.

In addition to presenting the intrablock and interblock analyses, Trail and Weeks expressed the formula for calculating the combined estimates of the treatment effects using the method presented by Seshadri (1963a) for combining unbiased estimators. Trail and Weeks also discussed how "best" designs might be obtained. They defined the "best" design as that design for which the variance of the difference between the intrablock estimates of the effects of any two different treatments is a minimum for fixed t and k. The minimum value of the variance of the difference is achieved by minimizing the absolute value of the difference $c_0 - c_1$, where $c_0$ and $c_1$ are the magnitudes of the elements in the incidence matrix N of the design.

An application of extended complete block designs to sensory testing experiments was presented by Cornell and Knapp (1972). Separate estimates of block × treatment interaction and experimental error were obtained in their analyses. Cornell and Knapp showed that the use of the experimental error only as a measure of the within treatment variability when comparing treatments results in a more efficient test than when using the residual variation (the sum of the experimental error and the interaction variation) when some measure, however small, of interaction is present.

Replication of extended complete block designs was also discussed by Cornell and Knapp (1974). Replication of the designs was performed to achieve a balance between the blocks and the treatments. By balance is meant, each and every treatment appears in each block (is evaluated by each panelist) the same number of times over the replications.
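Equation (2.1) is easy to illustrate on the three-treatment designs pictured earlier in this chapter. The sketch below is illustrative only: the name `extended_incidence` is hypothetical, matrices are plain lists of rows, and the choice $c_0 = 2$, $c_1 = 3$ for Figure 2 is read off the figure (two complete replicates, plus one more unit for each treatment that appears in the balanced incomplete block).

```python
def extended_incidence(n_star, c0, c1):
    """Return N = c0*J + (c1 - c0)*N* for a t x b BIB incidence matrix N*."""
    return [[c0 + (c1 - c0) * entry for entry in row] for row in n_star]


# N* for the BIB blocks AB, AC, BC (rows are treatments A, B, C; columns are blocks).
N_STAR = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]

figure1_n = extended_incidence(N_STAR, 1, 2)   # one complete replicate plus the BIBD
figure2_n = extended_incidence(N_STAR, 2, 3)   # two complete replicates plus the BIBD

# Column sums give the block sizes, row sums the replications of each treatment.
block_sizes = [sum(column) for column in zip(*figure2_n)]
replications = [sum(row) for row in figure2_n]
```

For Figure 2 every column of the resulting matrix sums to eight, the block size stated in the caption, and every row sums to eight as well.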
Hence, pairwise comparisons of the treatments in each block can be made with equal precision.

With the replicated designs, the assumption of negligible replication variation was made by the authors. This assumption resulted in simpler expressions for the formulae for calculating estimates of the treatment effects as well as the intrablock sums of squares when compared to the formulae used with the unreplicated extended complete block designs. An example of a replicated extended complete block design is presented in Figure 4.

Using a non-additive model, Cornell (1974) discussed the efficiency of extended complete block designs compared to complete block designs for uncorrelated observations. The efficiency of each design was defined as the reciprocal of the variance of the difference between any pair of treatment means with the respective design. To illustrate, with the extended complete block design consisting of b blocks each of size k, the estimate of the variance of the difference between any two treatment means is

$$\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{ECB} = \frac{2k(t-1)}{b(k^2 - 3k + 2t)}\,\hat{\sigma}_e^2 \qquad (2.4)$$

where $\hat{\sigma}_e^2$ is the intrablock estimate of experimental error. An estimate of the efficiency of the extended complete block design would be the reciprocal of (2.4). (The author used k for the size of the blocks in the balanced incomplete block design used in the extension. In this work, k* denotes the size of the blocks in the balanced incomplete block design used in the extension, while k refers to the size of the blocks in the extended complete block design.)

               Replications
         I        II       III
    1    ABCAC    ABCBC    ABCAB
    2    ABCBC    ABCAB    ABCAC
    3    ABCAB    ABCAC    ABCBC

Figure 4. A replicated extended complete block design consisting of three replicates of an ECBD.
In a complete block design with the same number of replicates of each of the treatments, that is, with bk/t complete blocks of size t, the estimate of the variance of the difference between any two different treatment means is

$$\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{CB} = \frac{2t}{bk}\,\hat{\sigma}^2_{residual} \qquad (2.5)$$

where $\hat{\sigma}^2_{residual}$ is the residual mean square. From (2.4) and (2.5) an estimate of the efficiency of the extended complete block design compared to the complete block design is obtained using the ratio

$$\mathrm{Eff}(\text{ECB to CB}) = \frac{\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{CB}}{\widehat{\mathrm{Var}}(\hat{\tau}_i - \hat{\tau}_{i'})_{ECB}} = \frac{t(k^2 - 3k + 2t)}{k^2(t-1)} \cdot \frac{\hat{\sigma}^2_{residual}}{\hat{\sigma}^2_e} .$$

To obtain estimated efficiency values for different values of t and k, the value of the ratio of $\hat{\sigma}^2_{residual}$ to $\hat{\sigma}^2_e$ is required. For the value of this ratio, Cornell used the ratio of the mean square for interaction to the mean square for error, which is easily obtainable from the analysis of variance table of the extended complete block design. With this ratio, denoted by F, Cornell showed that when the hypothesis of zero interaction effect is true, resulting in F = 1, the extended complete block design is a slightly less efficient design than the complete block design with the same number of replicates of the treatments. However, when F is greater than 1, the extended complete block design is the more efficient design with the efficiency increasing with increasing values of F.

Cornell (1974) also considered the situation where a positive correlation $\rho$ exists between the two responses to a treatment in the same block. For an extended complete block design having fixed balanced incomplete block size k*, it was found that as $\rho$ approaches one the efficiency of the extended complete block design compared to the complete block design decreases. In fact, the larger the value of k* ($k^* \le t$) the faster the efficiency of the extended complete block design approaches one-half that of the complete block design with twice as many blocks.
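The efficiency ratio formed from (2.4) and (2.5) can be tabulated for candidate designs before an experiment is run. The sketch below is an illustration for uncorrelated observations only; the function name and the argument `variance_ratio`, which stands for the ratio of the residual mean square to the experimental error estimate (Cornell's F), are assumptions of this example.

```python
def efficiency_ecb_to_cb(t, k, variance_ratio=1.0):
    """Estimated efficiency of an ECBD relative to complete blocks.

    t              : number of treatments
    k              : block size of the extended design (t < k <= 2t)
    variance_ratio : residual mean square divided by the error estimate
    """
    design_factor = t * (k * k - 3 * k + 2 * t) / (k * k * (t - 1))
    return design_factor * variance_ratio


# Design factor alone (variance ratio equal to one) for a few (t, k) pairs.
for t, k in [(3, 4), (3, 5), (5, 6)]:
    print(t, k, round(efficiency_ecb_to_cb(t, k), 4))
```

With the ratio equal to one the design factor is below one (for example 0.96 for t = 3, k = 5), which matches the statement that the extended design is then slightly less efficient; a ratio only modestly above one reverses the comparison.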
This implies that if one suspects a positive correlation to be present between duplicate treatment responses in the same block, one should use k* equal to one for maximum efficiency if using an extended complete block design.

Owing to the results previously found concerning the effect of correlated observations on the efficiency of extended complete block designs compared to complete block designs, in the next chapter we shall investigate the formulation of the test of the hypothesis of zero correlation. If there is evidence of correlation present between duplicate treatment responses in the same block, we shall want to estimate the correlation $\rho$. With an estimate of $\rho$, an estimate of the variance of the difference between any two intrablock estimates of the treatment effects can be calculated.

CHAPTER 3
EXTENDED COMPLETE BLOCK DESIGNS WITH CORRELATED OBSERVATIONS

In the extended complete block designs discussed to this point, we have observed that some of the treatments in each block are duplicated. In the papers by John (1963), Trail and Weeks (1973), Cornell and Knapp (1972), and Cornell (1974), the responses to the duplicated treatments in each block are assumed to be independent and are used to obtain an estimate of the experimental error. Comments on the efficiency of these designs when the duplicated observations are not independent but rather are positively correlated were made in the latter paper.

As mentioned previously in Section 1.3, in sensory experiments correlated observations are a real possibility. A panelist's response to a treatment might very likely be positively correlated with his response to the duplicate of the treatment, particularly if the panelist has previously been trained for these experiments. The presence of positive correlation between responses to the same treatment by a panelist reflects a measure of the efficiency of the panelist.
That is, the closer in magnitude the responses to the same treatment by a panelist are, the more consistent the panelist is in evaluating that treatment. Although the correlation could be different for each treatment and/or each panelist, we shall consider only the case where the correlation is assumed to be constant and equal for all panelists (blocks) and treatments.

3.1 Notation and Definitions

The parameters associated with an extended complete block design are as follows:

- t = the number of treatments,
- b = the number of blocks,
- k = the number of experimental units in each block (block size),
- r = the number of replications of each treatment in the experiment,
- $\lambda$ = the number of distinct pairs of experimental units which receive any fixed pair of treatments while appearing in the same block, and
- $N = (n_{ij})$ = the incidence matrix, where $n_{ij}$ denotes the number of times the $i$th treatment appears in the $j$th block.

The following parameters are associated with the balanced incomplete block design used to form the extended complete block design:

- t = the number of treatments,
- b = the number of blocks,
- k* = the block size,
- r* = the number of replications of each treatment,
- $\lambda^*$ = the number of times over the b blocks each pair of treatments appears in the same block, and
- $N^* = (n^*_{ij})$ = the incidence matrix.

The following identities involving the aforementioned parameters are satisfied:

1. $r = r^* + b$
2. $k = k^* + t$
3. $r^*t = bk^*$
4. $rt = bk$
5. $N = N^* + J$, where J is a t × b matrix of 1's
6. $\lambda = 2r - b + \lambda^*$
7. $\lambda^*(t-1) = r^*(k^*-1)$
8. $\lambda(t-1) = rk - 3r + 2b$.

The model written in matrix notation is

$$y = C(\mu 1_{bt} + X_1\tau + X_2\beta) + \varepsilon \qquad (3.1.1)$$

where all symbols are defined following (2.2) with the exception of $\mu$ and $\varepsilon$.
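The eight identities of this section can be checked numerically on a small case. The following sketch does so for the design of Figure 1; it is illustrative only, the function name `ecbd_parameters` is hypothetical, and it assumes the incidence matrix N* really does come from a balanced incomplete block design, so that reading r*, k*, and $\lambda^*$ off the first rows and column is legitimate.

```python
def ecbd_parameters(n_star):
    """Derive ECBD parameters from the t x b incidence matrix N* of a BIB design."""
    t, b = len(n_star), len(n_star[0])
    r_star = sum(n_star[0])                       # replications of treatment 1 in the BIBD
    k_star = sum(row[0] for row in n_star)        # size of block 1 in the BIBD
    # Number of BIB blocks containing both treatment 1 and treatment 2.
    lam_star = sum(x * y for x, y in zip(n_star[0], n_star[1]))
    n = [[entry + 1 for entry in row] for row in n_star]   # identity 5: N = N* + J
    r = r_star + b                                # identity 1
    k = k_star + t                                # identity 2
    lam = 2 * r - b + lam_star                    # identity 6
    return {"t": t, "b": b, "r_star": r_star, "k_star": k_star,
            "lam_star": lam_star, "N": n, "r": r, "k": k, "lam": lam}


# Figure 1: BIB blocks AB, AC, BC (rows are treatments, columns are blocks).
p = ecbd_parameters([[1, 1, 0], [1, 0, 1], [0, 1, 1]])

# Direct count of pairs of units receiving treatments 1 and 2 in the same block.
pairs_in_same_block = sum(x * y for x, y in zip(p["N"][0], p["N"][1]))
```

For Figure 1 this gives r = 5, k = 5, and $\lambda = 8$, and the direct count of unit pairs agrees with $\lambda$, as do identities 3, 4, 7, and 8.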
These parameters are $\mu$, the overall mean, and $\varepsilon$, a bk × 1 vector of random errors with the properties

$$E(\varepsilon) = 0,$$

where $E(\cdot)$ denotes mathematical expectation, and

$$
E(\varepsilon_{ijl}\,\varepsilon_{i'j'l'}) =
\begin{cases}
\rho\sigma^2, & i = i',\ j = j',\ l \ne l' \\
\sigma^2, & i = i',\ j = j',\ l = l' \\
0, & \text{otherwise}
\end{cases}
\qquad (3.1.2)
$$

where $\sigma^2$ denotes the variance of the distribution from which the errors are sampled and $\rho$ denotes the correlation between the duplicate observations. We shall assume that the values of $\rho$ lie in the interval $0 < \rho < 1$. Owing to the properties in (3.1.2) of the random errors, then

$$E(y) = C(\mu 1_{bt} + X_1\tau + X_2\beta) \qquad (3.1.3)$$

and

$$\mathrm{Var}(y) = E(\varepsilon\varepsilon') = V . \qquad (3.1.4)$$

The matrix V consists of the following partitions corresponding to the form of the vector y in (2.3); on the main diagonal of V are positioned the matrices

$$\sigma^2\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \quad \text{and} \quad \sigma^2[\,1\,],$$

while there are zeros located in all other positions. Hence, under the assumption of the normality of the random errors,

$$y \sim N\big(C(\mu 1_{bt} + X_1\tau + X_2\beta),\ V\big).$$

Before discussing the intrablock estimation of the treatment effects, we illustrate the form of the matrix V by referring to the extended complete block design presented in Figure 1. If the vector of observations y is written as

$$y = (y_{A11},\ y_{A12},\ y_{A21},\ y_{A22},\ y_{A31},\ y_{B11},\ y_{B12},\ y_{B21},\ y_{B31},\ y_{B32},\ y_{C11},\ y_{C21},\ y_{C22},\ y_{C31},\ y_{C32})',$$

then the corresponding matrix V is

$$V = \sigma^2\,\mathrm{diag}(R,\ R,\ 1,\ R,\ 1,\ R,\ 1,\ R,\ R), \qquad R = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$

On the main diagonal of the matrix V are positioned the matrices $\sigma^2 R$, which correspond to the duplicate responses to a treatment in the same block, and the matrices $\sigma^2[\,1\,]$, which correspond to the response to only a single treatment in the block.

We shall now discuss the intrablock estimation of the treatment effects where both the treatments and the panelists are assumed to represent fixed effects in the model in (3.1.1). The panelist effects represent fixed effects either when it is desired to compare the specific panelists used in the experiment or when the panelists chosen to evaluate the treatments cannot realistically be assumed to represent the general public.
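The block-diagonal form of V illustrated for Figure 1 can be generated mechanically from (3.1.2). The sketch below is an illustration only: each observation is tagged with a hypothetical (treatment, block) label, and two distinct observations are given covariance $\rho\sigma^2$ exactly when their labels coincide.

```python
def covariance_matrix(labels, rho, sigma2=1.0):
    """Build V from (treatment, block) labels following the error structure (3.1.2)."""
    n = len(labels)
    v = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for c in range(n):
            if a == c:
                v[a][c] = sigma2                  # variance of a single response
            elif labels[a] == labels[c]:
                v[a][c] = rho * sigma2            # duplicates: same treatment, same block
    return v


# Observations of Figure 1 in the order used for the vector y in this section.
FIGURE1_LABELS = [
    ("A", 1), ("A", 1), ("A", 2), ("A", 2), ("A", 3),
    ("B", 1), ("B", 1), ("B", 2), ("B", 3), ("B", 3),
    ("C", 1), ("C", 2), ("C", 2), ("C", 3), ("C", 3),
]

V = covariance_matrix(FIGURE1_LABELS, rho=0.3)
duplicate_pairs = sum(1 for a in range(15) for c in range(a) if V[a][c] != 0.0)
```

The matrix has six 2 × 2 blocks, one for each of the b(k-t) = 6 duplicated treatments, and three isolated diagonal entries for the treatments that appear only once in a block.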
A case which comes to mind in this latter situation is when trained panelists are used in an attempt to enhance the efficiency of the comparisons between the treatments.

3.2 Intrablock Estimation of the Treatment Effects

To obtain the intrablock estimates of the effects of the different treatments, we recall the form (3.1.1) of the model

$$y = C(\mu 1_{bt} + X_1\tau + X_2\beta) + \varepsilon ,$$

where the elements of the random error vector $\varepsilon$ have the properties specified in (3.1.2). If the method of least squares is used to obtain the intrablock estimates $\hat{\tau}$ of $\tau$, the normal equations are

$$
\begin{pmatrix} 1_{bt}'C'y \\ X_1'C'y \\ X_2'C'y \end{pmatrix}
=
\begin{pmatrix}
1_{bt}'D1_{bt} & 1_{bt}'DX_1 & 1_{bt}'DX_2 \\
X_1'D1_{bt} & X_1'DX_1 & X_1'DX_2 \\
X_2'D1_{bt} & X_2'DX_1 & X_2'DX_2
\end{pmatrix}
\begin{pmatrix} \hat{\mu} \\ \hat{\tau} \\ \hat{\beta} \end{pmatrix}
\qquad (3.2.1)
$$

where the bt × bt matrix $D = C'C$ and the hat (^) denotes estimate. According to the definitions and parameter identities specified in Section 3.1, the forms of the matrices $X_1'DX_1$, $X_1'DX_2$, and $X_2'DX_2$ in (3.2.1) are

$$X_1'DX_1 = rI_t, \qquad X_1'DX_2 = N, \qquad \text{and} \qquad X_2'DX_2 = kI_b ,$$

and therefore the normal equations (3.2.1) are expressed as

$$
\begin{pmatrix} G \\ T \\ B \end{pmatrix}
=
\begin{pmatrix}
bk & r1_t' & k1_b' \\
r1_t & rI_t & N \\
k1_b & N' & kI_b
\end{pmatrix}
\begin{pmatrix} \hat{\mu} \\ \hat{\tau} \\ \hat{\beta} \end{pmatrix}
\qquad (3.2.2)
$$

where G denotes the grand total of the observations and T and B are the t × 1 vector of treatment totals and the b × 1 vector of block totals, respectively.

For a solution to the normal equations, both sides of the equality in (3.2.2) are premultiplied by the matrix

$$
\begin{pmatrix}
1/bk & 0 & 0 \\
0 & I_t & -N/k \\
0 & -N'/r & I_b
\end{pmatrix}
$$

and the constraints $1_t'\hat{\tau} = 0$ and $1_b'\hat{\beta} = 0$ are imposed on the parameter estimates. Corresponding to these particular constraints imposed, the following relation results

$$
\begin{pmatrix} G/bk \\ kQ \\ rB - N'T \end{pmatrix}
=
\begin{pmatrix} \hat{\mu} \\ A\hat{\tau} \\ (rkI_b - N'N)\hat{\beta} \end{pmatrix}
\qquad (3.2.3)
$$

where $kQ = kT - NB$ and $A = rkI_t - NN'$. Characteristic of these designs, the matrix $NN' = (rk - \lambda t)I_t + \lambda J$. Hence, the matrix A can be expressed in the simple form

$$(1/k)A = (\lambda t/k)[I_t - (1/t)J] .$$

From the equation $kQ = A\hat{\tau}$ in (3.2.3), the t × 1 vector $\hat{\tau}$ of intrablock estimates of the treatment effects is

$$\hat{\tau} = kQ/\lambda t , \qquad (3.2.4)$$

where $\lambda = (rk - 3r + 2b)/(t-1)$.
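Formula (3.2.4) can be traced through with the figures quoted for treatment A in the sensory example of the final chapter (k = 6, treatment total 89, block totals 38 and 19 for the two panelists who received A twice). The sketch below is only an illustration: the function name is hypothetical, the order of the block totals is an assumption made here for convenience, and $\lambda = 14$ is obtained from $\lambda(t-1) = rk - 3r + 2b$ with t = 5, b = 10, and r = 12.

```python
def kq_and_tau(k, t, lam, treatment_total, n_row, block_totals):
    """Return (kQ_i, tau_hat_i) for one treatment.

    n_row        : row of the incidence matrix N for this treatment
    block_totals : block totals B_j, in the same block order as n_row
    """
    kq = k * treatment_total - sum(n * total for n, total in zip(n_row, block_totals))
    return kq, kq / (lam * t)


# Treatment A: duplicated in the blocks with totals 38 and 19 (block order assumed).
n_row_a = [2, 2, 1, 1, 1, 1, 1, 1, 1, 1]
block_totals = [38, 19, 34, 27, 23, 36, 31, 42, 30, 22]
kq_a, tau_a = kq_and_tau(6, 5, 14, 89, n_row_a, block_totals)
print(kq_a, tau_a)
```

Because only the sum of $n_{Aj}B_j$ enters the formula, the assumed ordering of the blocks does not affect the result.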
Furthermore, with the properties of the vector $\boldsymbol{\varepsilon}$ specified by (3.1.2), the $i$th element $\hat{\tau}_i$ of the vector $\hat{\boldsymbol{\tau}}$ is unbiased for $\tau_i$, since $E(\boldsymbol{\varepsilon}) = \mathbf{0}$, and with $\mathbf{1}_t'\boldsymbol{\tau} = 0$,

\[ \mathrm{Cov}(\hat{\tau}_i, \hat{\tau}_{i'}) = \begin{cases} (t-1)\big(\lambda k + 2[\lambda(k+t) - 2rk]\rho\big)\sigma^2/(t\lambda)^2 , & i = i' \\ -\mathrm{Var}(\hat{\tau}_i)/(t-1) , & i \neq i' . \end{cases} \tag{3.2.5} \]

Since we are interested in the pairwise comparison of the treatment effects, we also have

\[ E(\hat{\tau}_i - \hat{\tau}_{i'}) = \tau_i - \tau_{i'} \tag{3.2.6} \]

and

\[ \mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \frac{2k\sigma^2}{t\lambda} + \frac{4\{\lambda(k+t) - 2rk\}}{t\lambda^2}\,\rho\sigma^2 . \tag{3.2.7} \]

In the formula (3.2.7), the quantity $2k\sigma^2/t\lambda$ on the right-hand side of the equality is the variance of the difference between the intrablock estimates of the treatment effects $\tau_i$ and $\tau_{i'}$ in the case of uncorrelated errors. Thus, if correlation is present between responses to the same treatment in the same block, the variance of the difference between the intrablock estimates of any two treatments, over all blocks, is greater than the variance of the difference between the same two treatments when the observations are uncorrelated, since the quantity $[\lambda(k+t) - 2rk]$ is always positive.

The intrablock analysis of variance table is presented in Table 1. It is clear from Table 1 that an exact test does not exist for testing the hypothesis of equal treatment effects when a non-zero correlation is present. If we wish to test this hypothesis, an approximate test must be performed.

Before suggesting an approximate test for the equality of the treatment effects when $\rho > 0$, we shall first consider a procedure for testing for the presence of correlation. If correlation is present, we shall need to know how this correlation affects the distributional properties of the sums of squares associated with the two sources, treatments and residual.

TABLE 1
Intrablock Analysis of Variance for an ECBD

Source                   df          Sum of Squares                                      EMS*
Treatments (adjusted)    t-1         $SST_A = (k/\lambda t) \sum_i Q_i^2$                $E(MST_A)$
Blocks (unadjusted)      b-1         $SSB_U = (1/k) \sum_j B_j^2 - G^2/bk$
Residual                 bk-t-b+1    (by subtraction)                                    $E(MSR_e)$
Total                    bk-1        $TSS = \sum_i \sum_j \sum_l y_{ijl}^2 - G^2/bk$

* $E(MST_A) = \sigma^2 \left\{ 1 + \dfrac{2\rho}{\lambda k}[\lambda(k+t) - 2rk] \right\} + \dfrac{\lambda t}{k(t-1)} \sum_i \tau_i^2$

  $E(MSR_e) = \sigma^2 + \dfrac{1}{bk-t-b+1} \{ (b-1)(t-1)\phi - b(k-t) \} \rho\sigma^2$

3.3 A Test for the Presence of Correlation

Although one of the initial steps in the analysis of data arising from a comparative type experiment is a test on the equality of the treatment effects, in this section we shall first investigate the possible presence of correlation between duplicate observations in the same block. The reason for this investigation is that if correlation is present, an exact test of the hypothesis of equal treatment effects cannot be performed and an approximate test must be derived. Furthermore, if correlation is present, the formula for the variance of the intrablock estimates of the treatment effects contains $\rho$, and an estimate of $\rho$ is needed to estimate this variance. The same is true of the formula for the variance of the difference between the intrablock estimates of the effects of two treatments.

To determine if there is evidence of correlation in the data, we shall consider a test of the hypothesis $H_0$: $\rho = 0$. If this hypothesis is rejected in favor of the alternative hypothesis $H_a$: $\rho > 0$, we shall conclude that the duplicate observations are correlated and proceed to find an estimate of $\rho$. If, on the other hand, the hypothesis is not rejected, the inference made here shall be that the duplicate observations are uncorrelated, or, if they are correlated, that there is not sufficient information in the data to show that the magnitude of the correlation is greater than zero.

In order to test the hypothesis $H_0$: $\rho = 0$, we first need to derive the form of a test statistic. To this end, recall from Table 1 that the source of variation termed residual has $bk-b-t+1$ degrees of freedom. The residual variation is a composite of duplication variation as well as another source of variation which we shall call remainder variation.
To see this, let $d_{ij}$ be the difference or range of the observations made on the $i$th treatment in the $j$th block, so that if $n_{ij} = 2$, then $d_{ij} \ge 0$, and if $n_{ij} = 1$, then $d_{ij} = 0$. If each of the $d_{ij}$ is squared and these squares are summed over all treatments and blocks, then the resulting quantity when multiplied by one-half is called the sum of squares for duplication variation (SSDV). In summation notation, the sum of squares for duplication variation is given by

\[ SSDV = \tfrac{1}{2} \sum_{i}^{t} \sum_{j}^{b} d_{ij}^2 . \]

The sum of squares for remainder (SSR) is found by calculating the difference, sum of squares for residual $-$ SSDV.

To derive the form of a test statistic for testing the hypothesis, we require the separate distributional properties of the sum of squares for duplication variation and the sum of squares for remainder. The distributional properties of SSDV and SSR are most easily obtained by rewriting SSDV and SSR as quadratic forms and then using our knowledge of the distributional properties of quadratic forms. In matrix notation then, SSDV and SSR can be expressed in the quadratic forms

\[ SSDV = \mathbf{y}'[\,\mathbf{I}_{bk} - \mathbf{C}\mathbf{D}^{-1}\mathbf{C}'\,]\mathbf{y} = \mathbf{y}'\mathbf{A}_1\mathbf{y} \tag{3.3.1} \]

and

\[ SSR = \mathbf{y}'\Big[\,\mathbf{C}\mathbf{D}^{-1}\mathbf{C}' - \tfrac{1}{k}\mathbf{C}\mathbf{X}_2\mathbf{X}_2'\mathbf{C}' - \tfrac{k}{\lambda t}\mathbf{C}\big(\mathbf{X}_1 - \tfrac{1}{k}\mathbf{X}_2\mathbf{N}'\big)\big(\mathbf{X}_1' - \tfrac{1}{k}\mathbf{N}\mathbf{X}_2'\big)\mathbf{C}'\,\Big]\mathbf{y} = \mathbf{y}'\mathbf{A}_2\mathbf{y} , \tag{3.3.2} \]

where both the matrices $\mathbf{A}_1$ and $\mathbf{A}_2$ are real, symmetric, and idempotent.

In the quadratic form (3.3.1) for SSDV, the matrix $\mathbf{A}_1$ consists of the square matrices

\[ \tfrac{1}{2} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \quad \text{and} \quad [\,0\,] \]

on the main diagonal and zeros in all other positions. This partitioning corresponds identically to the partitioning of the matrix $\mathbf{V}$ as defined and illustrated in Section 3.1. Thus, by direct computation,

\[ \mathbf{A}_1\mathbf{V} = (1-\rho)\sigma^2\mathbf{A}_1 , \tag{3.3.3} \]

and since $\mathbf{A}_1$ is a real, symmetric, idempotent matrix, so also is the matrix $\mathbf{A}_1\mathbf{V}/\{(1-\rho)\sigma^2\}$. The trace of the matrix $\mathbf{A}_1$ is equal to $b(k-t)$ and therefore, under the assumption of normality of the errors,

\[ SSDV \sim (1-\rho)\sigma^2\,\chi^2_{b(k-t)} , \tag{3.3.4} \]

where $\chi^2_\nu$ denotes a random variable with a central chi square distribution with $\nu$ degrees of freedom.
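The definition of SSDV and the identity (3.3.3) can be sketched in a few lines of Python. The cell data, the values of $\rho$ and $\sigma^2$, and the helper names below are illustrative assumptions, not quantities from the text; the check of (3.3.3) is carried out on a single duplicated cell, which suffices because both $\mathbf{A}_1$ and $\mathbf{V}$ are block diagonal with the same partitioning.

```python
def ssdv(cells):
    """Sum of squares for duplication variation: half the sum of d_ij squared."""
    total = 0
    for obs in cells:
        if len(obs) == 2:            # n_ij = 2: a duplicated treatment
            d = obs[0] - obs[1]
            total += d * d
    return total / 2

def matmul(P, Q):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(P[i][m] * Q[m][j] for m in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

# Assumed values, chosen so that the floating point arithmetic is exact.
rho, sigma2 = 0.25, 4.0
V_pair = [[sigma2, rho * sigma2], [rho * sigma2, sigma2]]   # diagonal block of V
A_pair = [[0.5, -0.5], [-0.5, 0.5]]                          # diagonal block of A_1
AV = matmul(A_pair, V_pair)
target = [[(1 - rho) * sigma2 * a for a in row] for row in A_pair]

# Hypothetical cells: three duplicated treatments and two single responses.
cells = [[5, 7], [3], [4, 4], [6], [2, 5]]
value = ssdv(cells)
```

Here `AV` equals `target`, that is, $\mathbf{A}_1\mathbf{V} = (1-\rho)\sigma^2\mathbf{A}_1$ on the block, and the three squared differences 4, 0, and 9 give SSDV = 6.5.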
With $E(\cdot)$ denoting mathematical expectation, then

\[ E(SSDV) = b(k-t)(1-\rho)\sigma^2 \tag{3.3.5} \]

and

\[ E(MSDV) = (1-\rho)\sigma^2 , \tag{3.3.6} \]

where MSDV denotes the mean square for duplication variation. (An alternate derivation of the distribution of SSDV is presented in Section 3.4.)

In the quadratic form (3.3.2) for SSR, the matrix $\mathbf{A}_2$ is real, symmetric, and idempotent. However, it can be shown that if $c$ is a scalar constant, the equality

\[ \mathbf{A}_2\mathbf{V}\mathbf{A}_2 = c\mathbf{A}_2 \]

is not true in general. (To see this would only require working through the small example where $t = b = 3$ and $k = r = 4$.) Hence, unlike SSDV, the random variable SSR does not have an exact weighted chi square distribution when $\rho > 0$. The exact distribution of SSR is discussed in Section 3.4.

The distributions of SSDV and SSR are independent, since $\mathbf{A}_1\mathbf{V}\mathbf{A}_2 = \mathbf{0}$. The trace of the matrix $\mathbf{A}_2\mathbf{V}$ is equal to $(b-1)(t-1)(1+\phi\rho)\sigma^2$, and therefore

\[ E(MSR) = (1+\phi\rho)\sigma^2 , \tag{3.3.7} \]

where $\phi$ is defined in Table 1. When the hypothesis $H_0$: $\rho = 0$ is true, the random variable SSR has a chi square distribution. Hence, to test the hypothesis $H_0$: $\rho = 0$, the test statistic

\[ F_\rho = MSR/MSDV \]

is used. When $\rho = 0$, the test statistic has an F distribution with $(b-1)(t-1)$ and $b(k-t)$ degrees of freedom in the numerator and denominator, respectively. Therefore, the hypothesis is rejected in favor of the alternative hypothesis $H_a$: $\rho > 0$ for large values of $F_\rho$.

A brief discussion of the power of this test is reserved for a later section, since we must first consider the distributional properties of SSR when $\rho > 0$.

3.4 The Exact Distribution of SSR for $\rho > 0$

In the previous section, it was shown that when $\rho > 0$, the sum of squares for remainder does not in general have a weighted chi square distribution. This is because the matrix $\mathbf{A}_2$ of the quadratic form SSR does not necessarily satisfy the equality $\mathbf{A}_2\mathbf{V}\mathbf{A}_2 = c\mathbf{A}_2$, where $c$ is some constant and $\mathbf{V}$ is the covariance matrix of the observations.
In this section we shall seek to find an expression involving independent chi square distributed random vari ables for which the moments of the distribution of SSR can be found. The approach we shall use to find the distribution of SSR involves rewriting the model in (3.1.1) in the form 31 yij = P + T + gj + (l-p)?Szijjl + p^Ujlj (3.4.1) i 1 / 2 f t / ^ 1 / 2/ Id ^ dnd = 1 / 2 / j , where the x^ are the treatment effects, the Bj are the block effects, p is the magnitude of the correlation between du plicate responses to the same treatment in the same block, and zj_jÂ£ and uij are independent, identically distributed normal random variables each with mean zero and variance a2. Let us now define the random variable SSR|u^j to be the usual sum of squares for interaction (which we have chosen to call remainder in our additive model) given the u- Since the conditional distribution of SSR given u- 1J J can be found, the form of the unconditional distribution of SSR is obtained by taking the expectation of the random variable SSR|u^j with respect to u^j. The distribution of the random variable SSR|u^j is given by SSR | u^ j ~ a2 (1 p) X2(b_i) (t-1) ( 2 (-p) r2) (3.4.2) where x2 (1) denotes a random variable with a non-central chi square distribution with v degrees of freedom and non centrality parameter X. In the noncentrality parameter of the distribution in (3.4.2), R2 is of the form >2 _ = min l Z x* B*)2 , (3.4.3) T*, B* i j where x* and B are the parameters in the conditional distri bution corresponding to y+x^ and y+Bj, respectively, in the unconditional distribution. In order that the distribution 32 in (3.4.2) be expressed in a simpler form, it is convenient to write R2 in terms of the design parameters t, b, k, and r. 
To this end, let D~^ denote the diagonal matrix of cell frequencies associated with a t x b table in a two-way cross classification, that is, n 11 n 12 n lb n 21 Then (3.4.3) may be written in the form R2 = min (u-y)'D*1(u-y) , y :F y = 0 where (3.4.4) y' (yxl, ..., ylb, v21, ytb)' u' (ui;l, ..., ulb, U21, utb) ' and F is a matrix of constraints for additivity in a t x b cross classification. The form of the matrix F is given shortly. The quantity R2 equals the minimum value of the 33 quadratic form in (3.4.4). To find this minimum value, we write Q = (u-y)'D*1(u-y) + 2H'F'y where II is a (b-1) (t-1) x l vector of Lagrange multipliers. Differentiating Q with respect to y and setting the result equal to zero, the minimum value in (3.4.4) is R2 u'F(F'D*F) u . (3.4.5) An expression for (3.4.5) involving the design parameters t, b, k, and r for our problem requires the la tent roots of the matrix F(F'D*F) ^F'. If these roots are denoted by 0, then 0 are the solutions to the equation I ?(r?*!,)~V ei(b-l)(t-l) I = 0 (3.4.6) Using the following identity, 6(b-1) (t-1) F' F'D*F Qb+t-lj 0F.D*F p.F| F'D*F 01 (b-1) (t-1) - F(F'D tF) 1fi we find that b+t-1 roots of (3.4.6) are zero while the re maining (b-1)(t-1) roots are positive. These latter (b-1)(t-1) positive roots can be found by solving for 0' in the equation F'D*F 0 1 F F | = 0 (3.4.7) 34 and setting 9 = 1/9'. Since the 9' are non-zero, the frac tion 1/9' is not undefined. We should like to express equation (3.4.7) in a simpler form to find the values of 9'. To this end, the ma trix of constraints F is written as the direct product of two other matrices. This direct product is F = F(t-l) F(b-l) , where the two matrices are defined by and F(t-l) 14-i -I t-1 tx(t-1) F(b-l) = l-i -I b-1 J bx(b-1) Then F'F 1 -b-1 f where L = I + J (3.4.8) ~a ~a That is, L is an a x a matrix with 2's on the main diagonal ~a and l's in all other positions. 
We now make use of the fol lowing theorem and corollary to find the values of 9' satis fying (3.4.7) . Theorem. Let W be an m x m matrix with the distinct latent 35 roots w_. with respective multiplicities m., j = 1, 2. Let R be an m x s matrix satisfying R'R = I Then the roots of ~ ~ ~ ~s the matrix R'WR are the values of 0' satisfying the equation R'WR 6'I | = 0. These values are 9' = w10" + w2(l-0") , where 0" are the solutions of | R'MR 0"I | =0 with V v Proof; The proof follows directly by replacing M with (W-w^Im)/(w^-w2) in the determinant | R'MR 0"I | = 0 and simplifying. Corollary. If W is defined as in the theorem, and F is an m x s matrix of full rank s < m, then the solutions 0' of the equation | F'WF 0'F'F | = 0 are 0 = w-^0 + w2 (1-0" ) , where 0" are the solutions of | F'MF 0"F'F | =0 with M defined in the theorem. Proof: There exists a matrix K such that K'F'FK = Ig. Let r' = k'F' and apply the theorem. In our problem, W is the matrix D*. Therefore from the theorem, M is a diagonal matrix of ones and zeros. Referring to the corollary to obtain the values of 0' satis fying (3.4.7), we now need only to find the solutions 0" of 36 | F'MF 6"F'F |=0. (3.4.9) Because the structure of the matrix F'F depends on the matrices L^._^ and Lb_^ as defined in (3.4.8), a forward Doolittle procedure is performed on La to find that the val ues of 0" satisfying (3.4.9) are the same as the values of 0" satisfying I e"i(b-l) (t-l) I = 0 > (3.4.10) where F* = H(t) 0 H(b) with the matrix H(a) defined as the first a-1 columns of the a x a Helmertz orthogonal matrix. Since M = M* and M'M = M, then F;MF* = (MF*)'MF*, and the positive values of 0" satisfying (3.4.10) are the positive solutions 0" satisfying | MF*F;M 0"Ibt |=0. 
(3.4.11) Since we may write f*f; = (Gt 0 Gb)/bt , where Ga is an a x a matrix with a-1 on the main diagonal and -1 in the other positions, then the positive solutions 0" of (3.4.11) are functions of the positive solutions 0* satisfying the equation | M(Gt 0 Gb)M 0*Ibt |=0, (3.4.12) where 0* = bt0". At this stage, an expression for the random vari able SSR is presently untenable for general t, b, k, and r. 37 In the remainder of this section, we shall derive an expres sion for the random variable SSR when p > 0 for the special case t = b and r = k = t+1. A similar expression for SSR when p > 0 for the case b = 2t, k = t+1, and r = 2 (t+1) is presented in Appendix 1. For the special case considered in this section, the matrix M(G 0 G, )M in (3.4.12) has the non-zero partition ~ ~t ~b ~ {(t-1)+ J / for which the positive latent roots are t(t-2) and t(t-l) with multiplicities t-1 and 1, respectively. Hence, since these roots are simple multiples of the solutions 0" of (3.4.9), the 0" are 0" (t-2)/t with multiplicity t-1 (t-l)/t 1 0 (t-l)2-t With the use of the corollary where w^ = % and w^ = 1/ the values of 0' satisfying (3.4.7) are 0' (t+2)/2t with multiplicity t-1 (t+1)/2t 1 1 (t-l)2-t and the values of 0 satisfying (3.4.6) are 0 = 2t/(t+2) with multiplicity 2t/(t+1) 1 11 11 t-1 1 (t-l)2-t 38 In the distribution of the conditional random variable SSR given u^ in (3.4.2), we may now express R2 as R2 = 2t X2 2t 2 t+2 At-1 t+1 xi + x(t-D2-t Furthermore, upon taking the expectation of SSR|u^_. with re spect to u,, and using the notation SSR = E(SSR|u,.), the lj 1 lj sum of squares for remainder when p > 0 is distributed as SSR/a: al Xvx + a2 xv2 + a3 Xv3 ' (3.4.13) where and al = 1 + p TT a2 1 + P t+1 ' a3 1 , V1 = t-1 , v2 = 1 , v3 = (t-1)2 t An approximate distribution to (3.4.13) will be obtained in Section 3.5. The conditional distribution of the usual sum of squares for error given u^ is SSE|uj ~ a2(1-p) X(k_t)(0) , and the random variable SSE|u. 
is independent of $u_{ij}$. Hence, we have

\[ SSDV \sim \sigma^2(1-\rho)\,\chi^2_{b(k-t)}(0) , \]

where SSDV denotes the expectation of the random variable $SSE\,|\,u_{ij}$ with respect to $u_{ij}$. Since the duplication variation sum of squares random variable is also independent of the random variable $SSR\,|\,u_{ij}$, then the random variable SSDV is independent of the random variable SSR. By independent random variables is meant that the distributions of the random variables are independent. (The independence of the distributions of SSDV and SSR was established previously in Section 3.3 through the use of quadratic forms.)

To this point, it has been shown that for the special case $t = b$ and $r = k = t+1$, the random variable SSR is distributed as a sum of weighted independent chi square random variables when $\rho > 0$. We still do not have the exact form of the density of the random variable SSR, which is necessary in order to specify the distribution of the random variable SSR/SSDV. The distribution of SSR/SSDV is also necessary in order that we may calculate the power of the test of the hypothesis $H_0$: $\rho = 0$ for non-zero values of $\rho$. Since an exact form of the density of SSR would likely require an excessive amount of work and since a simpler form of an approximating distribution of SSR would suffice for our problem in a majority of cases, an approximate distribution of the random variable SSR will now be considered. A check on the accuracy of the approximate distribution when compared to the exact distribution of SSR when $\rho > 0$ is presented in Table A2 of Appendix 2.

3.5 An Approximate Distribution of SSR for $\rho > 0$

In this section we shall consider an approximation of the distribution of the sum of squares for remainder when $\rho > 0$ for the special case $t = b$ and $r = k = t+1$. An approximate distribution of SSR when $\rho > 0$ for the special case $b = 2t$, $k = t+1$, and $r = 2(t+1)$ is given in Appendix 2.

There are numerous approaches that could be used to approximate the distribution of SSR.
The approach used in this section (and also used in Appendix 2) was introduced by Box (1954). The rationale in selecting Box's approximation lies not only in its relative ease of application but also in the fact that it was shown by Box that the approximate distribution compared to the exact distribution of a quadratic form is fairly good except when small differences in probability are to be examined. We now state the theorem in his paper which we shall use.

Theorem. The quadratic form

\[ Q = \sum_{j=1}^{p} a_j \chi^2_{\nu_j} \]

is distributed approximately as $g\chi^2_h$, where

\[ g = \sum_{j=1}^{p} a_j^2 \nu_j \Big/ \sum_{j=1}^{p} a_j \nu_j \tag{3.5.1} \]

and

\[ h = \Big( \sum_{j=1}^{p} a_j \nu_j \Big)^2 \Big/ \sum_{j=1}^{p} a_j^2 \nu_j . \tag{3.5.2} \]

In both of the expressions for the scale constant $g$ and the degrees of freedom $h$, the $a_j$ are scalars and the $\nu_j$ are the degrees of freedom of the respective chi square random variables that are summed to form $Q$, $j = 1, 2, \ldots, p$.

In our problem we seek to approximate the distribution of the random variable SSR, where

\[ SSR/\sigma^2 \sim a_1 \chi^2_{\nu_1} + a_2 \chi^2_{\nu_2} + a_3 \chi^2_{\nu_3} \]

with $a_j$ and $\nu_j$, $j = 1, 2,$ and $3$, defined following (3.4.13). From the theorem by Box previously stated then, if the $a_j$ and $\nu_j$ are substituted into (3.5.1) and (3.5.2) to find $g$ and $h$, respectively, we may say that SSR is approximately distributed as a scaled chi square random variable with $h$ degrees of freedom. A tabulation of the values of $g$ and $h$ corresponding to the integer values 3, 4, 5, 6, and 7 of $t$ and to some values of $\rho$ in the interval between zero and one is presented in Table 2, where $h$ has been rounded to the nearest integer and $g$ has been rounded to four decimal places.

The approximate distribution of SSR may be used, when testing the null hypothesis $H_0$: $\rho = 0$ against the general alternative hypothesis $H_a$: $\rho > 0$, to compute the power of the test under the alternative hypothesis for values of $\rho$ greater than zero but less than one. Since the distribution of SSR is independent of the distribution of SSDV, then under the alternative hypothesis we approximate the distribution of the statistic

\[ F_\rho = \frac{MSR}{MSDV} \]

with a weighted F distribution with $h$ and $b(k-t)$ degrees of freedom, where the weight is given by $g/(1-\rho)$. That is,

\[ \frac{1-\rho}{g}\,F_\rho \sim F_{h,\,b(k-t)} \tag{3.5.3} \]

approximately. The probability that $F_\rho$ exceeds some value $F_0$ is approximately equal to the probability that the random variable $F_{h,\,b(k-t)}$ exceeds $(1-\rho)F_0/g$. A method for estimating the magnitude of the correlation between duplicate responses observed with the same treatment in the same block will be discussed in the following section.

TABLE 2
Values of g and h for the Approximate Distribution of SSR

rho   t   nu_1  nu_2  nu_3   a_1      a_2      a_3   g        h
0.1   3   2     1     1      1.025    1.05     1     1.0253   4
0.3                          1.075    1.15           1.0776   4
0.5                          1.125    1.25           1.1319   4
0.7                          1.175    1.35           1.1880   4
0.9                          1.225    1.45           1.2457   4
0.1   4   3     1     5      1.04     1.06     1     1.0205   9
0.3                          1.12     1.18           1.0645   9
0.5                          1.2      1.3            1.1121   9
0.7                          1.28     1.42           1.1629   9
0.9                          1.36     1.54           1.2166   9
0.1   5   4     1     11     1.05     1.0667   1     1.0173   16
0.3                          1.15     1.2            1.0554   16
0.5                          1.25     1.3333         1.0978   16
0.7                          1.35     1.4667         1.1441   16
0.9                          1.45     1.6            1.1940   15
0.1   6   5     1     19     1.0571   1.0714   1     1.0149   25
0.3                          1.1714   1.2143         1.0485   25
0.5                          1.2857   1.3571         1.0867   25
0.7                          1.4      1.5            1.1291   24
0.9                          1.5143   1.6429         1.1754   24
0.1   7   6     1     29     1.0625   1.075    1     1.0131   36
0.3                          1.1875   1.225          1.0431   36
0.5                          1.3125   1.375          1.0778   35
0.7                          1.4375   1.525          1.1168   35
0.9                          1.5625   1.675          1.1599   35

3.6 An Estimate of the Correlation

In Section 3.3 a procedure for testing the hypothesis $H_0$: $\rho = 0$ was outlined in detail. As mentioned at the beginning of Section 3.3, the test of the hypothesis of zero correlation is normally the first action to be taken during the analysis of the experimental data. If the hypothesis is rejected, we should then want an estimate of $\rho$.
The estimate of $\rho$ would be used when estimating the variance of the intrablock estimate of the effect of the $i$th treatment as shown in (3.2.5) and/or when estimating the variance of the difference between the intrablock estimates of the effects of two treatments as shown in (3.2.7). Still another use for the estimate of $\rho$ would be when estimating the efficiency of the extended complete block design compared to the complete block design as shown in the paper by Cornell (1974). The value of the relative efficiency of the two designs could be very useful when considering designs for subsequent experimentation, particularly in a sensory experiment where the same panelists are to be used in additional experiments.

Referring to the formulae (3.3.6) and (3.3.7), we see that the expectations of the mean squares for duplication variation and for remainder variation are

\[ E(MSDV) = (1-\rho)\sigma^2 \tag{3.6.1} \]

and

\[ E(MSR) = (1+\phi\rho)\sigma^2 , \tag{3.6.2} \]

respectively. If a ratio of two linear combinations of these expected mean squares is considered, we can express the correlation $\rho$ in the form

\[ \rho = \{E(MSR) - E(MSDV)\}/\{E(MSR) + \phi E(MSDV)\} . \tag{3.6.3} \]

As an estimate of $\rho$ then, the expectations in (3.6.3) are replaced by their respective mean squares, resulting in the formula

\[ \hat{\rho} = \frac{MSR - MSDV}{MSR + \phi\,MSDV} . \tag{3.6.4} \]

Similarly from (3.6.1) and (3.6.2), an estimate of $\sigma^2$ may be obtained as

\[ \hat{\sigma}^2 = \frac{MSR + \phi\,MSDV}{1 + \phi} . \]

Since the calculated value of MSR is always greater than or equal to zero, then from (3.6.4) $\hat{\rho} \ge -1/\phi$. Furthermore, since the calculated value of MSDV is always greater than or equal to zero, then $\hat{\rho} \le 1$. If these extremes are considered as the endpoints of the range for the values of the estimate $\hat{\rho}$, then $-1/\phi \le \hat{\rho} \le 1$. Since we are interested only in the values of $\rho$ in the interval between zero and one, any negative value of $\hat{\rho}$ calculated is considered meaningless and is set equal to zero in this case.
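The estimates in (3.6.4) and the truncation at zero can be sketched as follows. The function names and the numerical values of MSR, MSDV, and $\phi$ are hypothetical; in practice $\phi$ would be taken from its definition in Table 1 for the design at hand.

```python
def estimate_rho(msr, msdv, phi):
    """rho-hat of (3.6.4), with negative values set equal to zero."""
    rho_hat = (msr - msdv) / (msr + phi * msdv)
    return max(rho_hat, 0.0)

def estimate_sigma2(msr, msdv, phi):
    """sigma^2-hat obtained from (3.6.1) and (3.6.2)."""
    return (msr + phi * msdv) / (1 + phi)

# Illustrative mean squares and an assumed value of phi.
rho_hat = estimate_rho(2.0, 1.0, 0.25)
sigma2_hat = estimate_sigma2(2.0, 1.0, 0.25)
truncated = estimate_rho(0.5, 1.0, 0.25)   # MSR < MSDV gives a negative ratio
```

With these inputs $\hat{\rho} = 4/9$ and $\hat{\sigma}^2 = 1.8$, and the pair reproduces the observed mean squares in the sense that $(1-\hat{\rho})\hat{\sigma}^2 = MSDV$; the second call returns zero because the untruncated ratio is negative.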
(Setting a negative estimate of a non-negative parameter equal to zero is a procedure practiced when estimating variance components in random and mixed models.)

The distribution of the random variable $\hat{\rho}$ depends on the forms of the distributions of the random variables SSR and SSDV. It was shown in Section 3.4 that the random variable SSR is distributed as a weighted sum of independent chi square random variables. Although an approximate distribution of SSR was given in Section 3.5, an approximation to the distribution of $\hat{\rho}$ is not at present available. Nevertheless, the first two moments of the distribution of $\hat{\rho}$ could be approximated using a Taylor series. That is, the formula in (3.6.4) for $\hat{\rho}$ may be expressed in a Taylor series, and the mean and the variance of the distribution of $\hat{\rho}$ could be approximated with a finite number of terms in the series by taking the appropriate expectations.

CHAPTER 4
A PARTIALLY BALANCED GROUP DIVISIBLE ECBD

In the extended complete block designs presented thus far, only the class of balanced incomplete block designs was considered in the extended portion of the $b$ blocks. That is, in making the extended complete blocks of size $k$, we have considered in combination with the complete blocks of size $t$ only balanced incomplete blocks of size $k-t$. By restricting attention to the use of balanced incomplete block designs only in the extended portion, the extended complete block designs retain the property of balance among the treatments. By balance is meant that the off-diagonal elements in the matrix $\mathbf{A}$ (or $\mathbf{N}\mathbf{N}'$) in (3.2.3) are all equal, resulting in a single value of the variance for all pairwise treatment comparisons. Hence, all pairwise treatment comparisons could be made with the same precision.
When it is not necessary to have equal precision for all pairwise treatment comparisons, or when achieving balance through the use of balanced incomplete block designs requires a large number of replications of the treatments or possibly too many blocks, a partially balanced incomplete block design (PBIBD) might be used in the extended portion of an extended complete block design. To illustrate this point, consider an extended complete block design consisting of six treatments in blocks of size nine. If a BIBD were used in the extended portion of the blocks, the balanced design would require ten extended blocks supporting fifteen replicates of each of the six treatments. On the other hand, if a PBIBD were used in the extended portion, only six blocks supporting nine replications of each treatment would be required.

In this and subsequent chapters then, the use of PBIB designs in the extended portion of the extended complete block design will be considered. Specifically, we shall limit our attention to the use of PBIB designs with two associate classes. By relaxing the requirement of balance, in most cases we do not sacrifice much precision when considering PBIB designs with two associate classes, where instead two variances are required for making all pairwise comparisons of the treatments. The two variances arise because with each treatment a subset of the $t-1$ other treatments are first associates while the remaining other treatments are second associates. One variance is used for pairwise comparisons among the treatments that are first associates while the second variance is used among treatments that are second associates. The generalization to partially balanced incomplete block designs with more than two associate classes should be straightforward.

For the balanced extended complete block designs, the case where responses to the same treatment in the same block were positively correlated was presented in detail.
The general theory developed in Section 3.4 on the exact distribution of SSR when $\rho > 0$ with the additive model could be used with partially balanced extended complete block designs. Hence, since the theory is general, the analysis of partially balanced ECB designs with correlated observations will not be presented. Our discussion will be limited to the additive and non-additive models when all observations are uncorrelated.

The group divisible association scheme for $t = mn$ treatments, where $m$ and $n$ are integers, is derived by partitioning the treatments into $m$ groups of $n$ treatments each, with those in the same group being first associates and those in different groups being second associates. For example, with six treatments (denoted by the numbers 1, 2, 3, 4, 5, and 6) a group divisible association scheme for three groups of two treatments each would be given by the $3 \times 2$ rectangular array

1 2
3 4
5 6 .

In the array, treatments in the same row (1 and 2, 3 and 4, 5 and 6) are first associates. Treatments not in the same row as a specified treatment are second associates of that treatment. For example, the set of second associates of treatment 1 consists of the treatments 3, 4, 5, and 6.

For a PBIBD with the group divisible association scheme and incidence matrix $\mathbf{N}^*$, the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ may be arranged in a particular pattern that will be described in detail in Section 4.2. The particular pattern of the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ facilitates finding a solution of the normal equations for the intrablock estimates of the treatment effects. The pattern of $\mathbf{N}^*\mathbf{N}^{*\prime}$ carries over to the matrix $\mathbf{N}\mathbf{N}'$, where $\mathbf{N}$ is the incidence matrix of the extended complete block design. Extended complete block designs generated by the class of PBIB designs with the group divisible association scheme will be called "partially balanced group divisible extended complete block designs."
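The group divisible association scheme described above can be sketched with two short Python functions. The function names are assumptions introduced for illustration; the labeling of treatments 1 through mn row by row follows the rectangular array in the text.

```python
def group_divisible_scheme(m, n):
    """Arrange treatments 1..mn in an m x n array; each row is one group."""
    return [list(range(g * n + 1, (g + 1) * n + 1)) for g in range(m)]

def associates(treatment, groups):
    """Return (first associates, second associates) of a treatment."""
    for row in groups:
        if treatment in row:
            first = [x for x in row if x != treatment]
            second = [x for other in groups if other is not row for x in other]
            return first, second
    raise ValueError("treatment is not in the scheme")

groups = group_divisible_scheme(3, 2)
first, second = associates(1, groups)
```

For the six-treatment example, `groups` is the array with rows (1, 2), (3, 4), (5, 6), and treatment 1 has the single first associate 2 and the second associates 3, 4, 5, and 6, so that $n_1 = n - 1$ and $n_2 = t - n$.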
4.1 Definitions and Notation

An extended complete block design generated by a PBIBD is defined as a connected, two-way classification with the following properties:

1. Each treatment is applied either $c_0$ or $c_1$ times in a block, $c_i > 0$, $i = 0, 1$.
2. In the incidence matrix of the design, replacement of $c_0$ by zero and $c_1$ by unity results in the incidence matrix $\mathbf{N}^*$ of a PBIBD (with two associate classes).

It follows from this definition that the incidence matrix $\mathbf{N}$ of such a design can be generated from the incidence matrix of any PBIBD. That is, given a PBIBD with the incidence matrix $\mathbf{N}^*$ and denoting by $\mathbf{J}$ a matrix of 1's (the incidence matrix of a complete block design), the incidence matrix of an extended complete block design (ECBD) generated by a PBIBD is

\[ \mathbf{N} = c_0\mathbf{J} + (c_1 - c_0)\mathbf{N}^* . \tag{4.1.1} \]

This equation is identical to the equation (2.1) for the incidence matrix of a balanced ECBD except that $\mathbf{N}^*$ is now the incidence matrix of the generating PBIBD.

The parameters associated with an ECBD generated by a PBIBD are as follows:

$t$ = the number of treatments,
$b$ = the number of blocks,
$k$ = the block size,
$r$ = the number of replications of each treatment in the experiment,
$\lambda_i$ = the number of distinct pairs of experimental units which receive any fixed pair of $i$th associates while appearing in the same block, $i = 1, 2$,
$n_i$ = the number of $i$th associates of each treatment, $i = 1, 2$, and
$p^i_{jk}$ = the number of treatments that are both $j$th associates of treatment $\alpha$ and $k$th associates of treatment $\beta$ given that $\alpha$ and $\beta$ are $i$th associates.

The corresponding design parameters of the generating PBIBD will be denoted by $t$, $b$, $k^*$, $r^*$, $\lambda_i^*$, $n_i$, and $p^i_{jk}$. Owing to the definitions given above for the parameters of the generating PBIBD as well as the ECBD generated by a PBIBD, the following relationships are satisfied:

1. $r = c_1 r^* + (b - r^*)c_0 = c_0 b + (c_1 - c_0)r^*$   (4.1.2)
2. $k = c_1 k^* + (t - k^*)c_0 = c_0 t + (c_1 - c_0)k^*$   (4.1.3)
3. $r^* t = b k^*$   (4.1.4)
4. $rt = bk$   (4.1.5)
5. $\lambda_i = (c_1 - c_0)^2\lambda_i^* + c_0(2r - bc_0)$, $i = 1, 2$   (4.1.6)
6. $rk - (c_0 + c_1)r + c_0 c_1 b = n_1\lambda_1 + n_2\lambda_2$ .   (4.1.7)

In matrix notation, the non-additive model is

\[ \mathbf{y} = \mathbf{C}(\mu\mathbf{1}_{bt} + \mathbf{X}_1\boldsymbol{\tau} + \mathbf{X}_2\boldsymbol{\beta} + \boldsymbol{\gamma}) + \boldsymbol{\varepsilon} \tag{4.1.8} \]

where $\mu$ is the overall mean effect, $\boldsymbol{\tau}$ is a $t \times 1$ vector of treatment effects, $\boldsymbol{\beta}$ is a $b \times 1$ vector of block effects, and $\boldsymbol{\gamma}$ is a $bt \times 1$ vector of block $\times$ treatment interaction effects. Letting $\mathbf{1}_t$ denote a $t \times 1$ vector of 1's, $\mathbf{I}_t$ denote the $t \times t$ identity matrix, and $y_{ijl}$ represent the $l$th response to the $i$th treatment in the $j$th block, the vector $\mathbf{y}$ and the matrices $\mathbf{C}$, $\mathbf{X}_1$, and $\mathbf{X}_2$ in (4.1.8) are of the forms

\[ \mathbf{y} = (y_{111}, \ldots, y_{11n_{11}}, y_{211}, \ldots, y_{tbn_{tb}})' \quad (bk \times 1) , \]

\[ \mathbf{X}_1 = \begin{bmatrix} \mathbf{I}_t \\ \mathbf{I}_t \\ \vdots \\ \mathbf{I}_t \end{bmatrix}_{bt \times t} , \qquad \mathbf{X}_2 = \begin{bmatrix} \mathbf{1}_t & & & \\ & \mathbf{1}_t & & \\ & & \ddots & \\ & & & \mathbf{1}_t \end{bmatrix}_{bt \times b} , \]

and

\[ \mathbf{C} = \begin{bmatrix} \mathbf{1}_{n_{11}} & & \\ & \ddots & \\ & & \mathbf{1}_{n_{tb}} \end{bmatrix}_{bk \times bt} . \]

4.2 Intrablock Estimation of the Treatment Effects

Consider the model in (4.1.8) where the interaction effects are all zero. This additive model is written as

\[ \mathbf{y} = \mathbf{C}(\mu\mathbf{1}_{bt} + \mathbf{X}_1\boldsymbol{\tau} + \mathbf{X}_2\boldsymbol{\beta}) + \boldsymbol{\varepsilon} . \tag{4.2.1} \]

Setting up the normal equations exactly as detailed in Section 3.2 results in the equality

\[ \begin{bmatrix} G/bk \\ k\mathbf{Q} \\ r\mathbf{B} - \mathbf{N}'\mathbf{T} \end{bmatrix} = \begin{bmatrix} \hat{\mu} \\ \mathbf{A}\hat{\boldsymbol{\tau}} \\ (rk\mathbf{I}_b - \mathbf{N}'\mathbf{N})\hat{\boldsymbol{\beta}} \end{bmatrix} \tag{4.2.2} \]

where all symbols are defined in Section 3.2. The solution of $k\mathbf{Q} = \mathbf{A}\hat{\boldsymbol{\tau}}$ in (4.2.2) depends upon the form of the matrix

\[ \mathbf{A} = rk\mathbf{I}_t - \mathbf{N}\mathbf{N}' , \tag{4.2.3} \]

which in turn depends upon the form of the matrix $\mathbf{N}^*$ from (4.1.1).

For a PBIBD with the group divisible association scheme, the matrix $\mathbf{N}^*\mathbf{N}^{*\prime}$ may be arranged in a particular pattern as follows. For an association scheme consisting of $m$ groups each containing $n$ treatments (so that $t = mn$), let the treatments in the first group be labeled 1 through $n$; in the second group, the treatments are labeled $n+1$ through $2n$; and so on, so that in the $m$th group, the treatments are labeled $(m-1)n+1$ through $mn$.
Now, with the corresponding blocking plan and the labeled treatments listed in numerical order in the incidence matrix $N^*$, the matrix $N^*N^{*\prime}$ becomes

$N^*N^{*\prime} = (r^* - \lambda_1^*)(I_m \otimes I_n) + (\lambda_1^* - \lambda_2^*)(I_m \otimes J_n) + \lambda_2^*(J_m \otimes J_n)$,   (4.2.4)

where $\otimes$ denotes the direct (Kronecker) product of two matrices and $\lambda_1^*$ and $\lambda_2^*$ are the number of distinct pairs of experimental units over all blocks which receive any fixed pair of first and second associates, respectively, in the same block in the generating PBIBD. Since the incidence matrix of the ECBD generated by a PBIBD equals $c_0J + (c_1 - c_0)N^*$, then

$NN' = (c_1 - c_0)^2 N^*N^{*\prime} + c_0(2r - c_0b)(J_m \otimes J_n)$.   (4.2.5)

By substituting (4.2.4) into (4.2.5) and simplifying, we obtain

$NN' = [(c_0 + c_1)r - c_0c_1b - \lambda_1](I_m \otimes I_n) + (\lambda_1 - \lambda_2)(I_m \otimes J_n) + \lambda_2(J_m \otimes J_n)$.   (4.2.6)

To facilitate finding a solution of $kQ = A\hat{\tau}$ for $\hat{\tau}$ in (4.2.2), an expression for the matrix $A$ may now be found by substituting (4.2.6) for $NN'$ into (4.2.3). By further simplification using the identity (4.1.7) involving $\lambda_1$ and $\lambda_2$, the result is

$kQ = [(n\lambda_1 + n_2\lambda_2)(I_m \otimes I_n) - (\lambda_1 - \lambda_2)(I_m \otimes J_n) - \lambda_2(J_m \otimes J_n)]\hat{\tau}$

of which the $i$th element is given by

$kQ_i = (n\lambda_1 + n_2\lambda_2)\hat{\tau}_i - (\lambda_1 - \lambda_2)G(\hat{\tau}_i) - \lambda_2\hat{\tau}_.$   (4.2.7)

where $G(\hat{\tau}_i)$ denotes the sum of all the estimated treatment effects of the treatments in the group containing the $i$th treatment. In other words, $G(\hat{\tau}_i)$ is the sum of the effect of the $i$th treatment plus the effects of all first associates of the $i$th treatment. Also in (4.2.7), $\hat{\tau}_.$ denotes the sum of all the estimated treatment effects. However, one of the restrictions used on the treatment effects to obtain (4.2.2) was to set $\hat{\tau}_.$ equal to zero. Therefore, if $G(kQ_i)$ denotes the sum of the $kQ_i$ in (4.2.7) plus the $kQ_{i'}$, $i \neq i'$, corresponding to all the first associates of the $i$th treatment, we obtain

$G(kQ_i) = (n\lambda_1 + n_2\lambda_2)G(\hat{\tau}_i) - n(\lambda_1 - \lambda_2)G(\hat{\tau}_i) = t\lambda_2 G(\hat{\tau}_i)$.   (4.2.8)

By substituting $G(\hat{\tau}_i)$ from (4.2.8) back into (4.2.7), the intrablock estimate of the effect of the $i$th treatment is found by solving for $\hat{\tau}_i$ in the expression

$(n\lambda_1 + n_2\lambda_2)\hat{\tau}_i = kQ_i + \frac{\lambda_1 - \lambda_2}{t\lambda_2} G(kQ_i)$.   (4.2.9)

The difference between the intrablock estimates of the effects of the treatments $i$ and $i'$, $i \neq i'$, can be written as

$\hat{\tau}_i - \hat{\tau}_{i'} = \frac{1}{n\lambda_1 + n_2\lambda_2}\left\{ k(Q_i - Q_{i'}) + \frac{\lambda_1 - \lambda_2}{t\lambda_2}[G(kQ_i) - G(kQ_{i'})] \right\}$.

Under the assumption that the errors are normally distributed with mean zero and variance structure $\sigma_e^2 I_{bk}$, the difference $\hat{\tau}_i - \hat{\tau}_{i'}$ has the properties

$E(\hat{\tau}_i - \hat{\tau}_{i'}) = \tau_i - \tau_{i'}$

and

$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \frac{2k\sigma_e^2}{n\lambda_1 + n_2\lambda_2}$ if $i$ and $i'$ are 1st associates,
$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \frac{2k\sigma_e^2}{n\lambda_1 + n_2\lambda_2}\left(1 + \frac{\lambda_1 - \lambda_2}{t\lambda_2}\right)$ if $i$ and $i'$ are 2nd associates.

The intrablock analysis of variance table for the partially balanced group divisible extended complete block design is presented in Table 3. In the sum of squares expressions in Table 3,

$CM = \frac{1}{bk}\left(\sum_i \sum_j \sum_\ell y_{ij\ell}\right)^2$.

TABLE 3
Intrablock Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs

Source | df | Sum of Squares | EMS*
Treatments (adjusted) | $t-1$ | $SST_A = \frac{1}{k(n\lambda_1 + n_2\lambda_2)}\left[\sum_i (kQ_i)^2 + \frac{\lambda_1 - \lambda_2}{t\lambda_2}\sum_i (kQ_i)G(kQ_i)\right]$ | $E(MST_A)$
Blocks (unadjusted) | $b-1$ | $SSB_U = \frac{1}{k}\sum_j B_j^2 - CM$ |
Remainder | $(t-1)(b-1)$ | $SSR = \sum_i \sum_j \frac{1}{n_{ij}} R_{ij}^2 - SST_A - SSB_U - CM$ | $E(MSR)$
Error | $b(k-t)$ | $SSE = \sum_i \sum_j \sum_\ell y_{ij\ell}^2 - \sum_i \sum_j \frac{1}{n_{ij}} R_{ij}^2$ | $E(MSE)$
Total | $bk-1$ | $TSS = \sum_i \sum_j \sum_\ell y_{ij\ell}^2 - CM$ |

* $E(MST_A) = \sigma_e^2 + \frac{1}{k(t-1)}\tau'A\tau$, $E(MSR) = \sigma_e^2 + \gamma'D\gamma$, and $E(MSE) = \sigma_e^2$, where $A = rkI_t - NN'$ and $D = C'C$.

As can be seen from Table 3, the ratio of the mean square for remainder to the mean square for error (previously called the mean square for duplication variation) provides a statistic for a test on the validity of the additive model.
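The closed-form solution (4.2.9) can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available. The function names and the parameter values m = 3, n = 2, λ1 = 5, λ2 = 3 are hypothetical choices made only for illustration and do not correspond to a design in the text. The sketch builds A in the Kronecker form used for (4.2.7), generates kQ = Aτ for effects that sum to zero, and recovers τ from (4.2.9).

```python
import numpy as np

def gd_matrix(m, n, lam1, lam2):
    """A = rk*I - NN' for a group divisible ECBD, written in Kronecker form."""
    t = m * n
    n2 = n * (m - 1)                         # number of second associates
    a = n * lam1 + n2 * lam2                 # coefficient of tau_i in (4.2.7)
    F = np.kron(np.eye(m), np.ones((n, n)))  # I_m (x) J_n: sums within a group
    A = a * np.eye(t) - (lam1 - lam2) * F - lam2 * np.ones((t, t))
    return A, F, a

def gd_estimate(kQ, m, n, lam1, lam2):
    """Intrablock estimates from (4.2.9), under the restriction sum(tau) = 0."""
    t = m * n
    _, F, a = gd_matrix(m, n, lam1, lam2)
    G = F @ kQ                               # G(kQ_i): group totals of the kQ_i
    return (kQ + (lam1 - lam2) / (t * lam2) * G) / a

# Hypothetical parameter values, chosen only to exercise the algebra.
m, n, lam1, lam2 = 3, 2, 5.0, 3.0
rng = np.random.default_rng(0)
tau = rng.normal(size=m * n)
tau -= tau.mean()                            # impose sum(tau) = 0
A, F, a = gd_matrix(m, n, lam1, lam2)
kQ = A @ tau                                 # right-hand side of kQ = A tau
tau_hat = gd_estimate(kQ, m, n, lam1, lam2)
print(np.allclose(tau_hat, tau))
```

Because A has zero row sums, it is singular; the sketch therefore relies on the same restriction (the estimated effects sum to zero) that the text uses to obtain a unique solution.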
If the hypothesis is not rejected, a test of the hypothesis of equal treatment effects could be performed using as the statistic the ratio of the mean square for treatments (adjusted) to the mean square resulting from pooling the mean square for remainder with the mean square for error. If the hypothesis is rejected, then we might wish to consider a non-additive model. With the non-additive model, we could concern ourselves with the estimation of the block x treatment interaction effects or concentrate on testing the hypothesis of equal treatment effects in the presence of interaction effects.

The next section gives validity to both of the above mentioned tests of hypotheses. As will be shown, the sums of squares in Table 3 are each distributed as weighted chi-square random variables and each sum of squares is distributed independently of the others.

4.3 Distributions of the Sums of Squares and Relevant Tests of Hypotheses

In order to validate the tests mentioned at the end of Section 4.2, we must first obtain the distributions of the sums of squares in Table 3. The approach that will be used to derive the distributions of these sums of squares is to express them as quadratic forms and use our knowledge of the distributions of quadratic forms.

The sums of squares in Table 3 can be written in the quadratic forms

$SST_A = y'A_1y = y'\left\{\frac{k}{n\lambda_1 + n_2\lambda_2}\, C\left(I_{bt} - \frac{1}{k}X_2X_2'D\right)X_1\left[I_t + \frac{\lambda_1 - \lambda_2}{t\lambda_2}F\right]X_1'\left(I_{bt} - \frac{1}{k}DX_2X_2'\right)C'\right\}y$,   (4.3.1)

$SSB_U = y'A_2y = y'\left(\frac{1}{k}CX_2X_2'C' - \frac{1}{bk}J\right)y$,   (4.3.2)

$SSR = y'A_3y = y'\left(CD^{-1}C' - \frac{1}{bk}J - A_1 - A_2\right)y$,   (4.3.3)

$SSE = y'A_4y = y'(I_{bk} - CD^{-1}C')y$,   (4.3.4)

and

$TSS = y'A_5y = y'\left(I_{bk} - \frac{1}{bk}J\right)y$,   (4.3.5)

where in (4.3.1), $F = I_m \otimes J_n$.

In the quadratic forms (4.3.1) through (4.3.5), each of the matrices $A_1$, $A_2$, $A_3$, $A_4$, and $A_5$ is real, symmetric, and idempotent, and

$\sum_{p=1}^{4} A_p = A_5$.

Also, the ranks of the five matrices, where $r(A_p)$ denotes the rank of the matrix $A_p$, are

$r(A_1) = t-1$,
$r(A_2) = b-1$,
$r(A_3) = (b-1)(t-1)$,
$r(A_4) = b(k-t)$,
$r(A_5) = bk-1 = \sum_{p=1}^{4} r(A_p)$.

Hence, by applying Theorem 5 in Searle (1971) on the distribution of quadratic forms, it is found that when $y \sim N(\boldsymbol{\mu}, \sigma_e^2 I_{bk})$, then

$y'A_py \sim \sigma_e^2\, \chi'^2_{r(A_p)}\left(\boldsymbol{\mu}'A_p\boldsymbol{\mu}/2\sigma_e^2\right)$   (4.3.6)

for $p = 1, 2, 3, 4$ and the $y'A_py$ are mutually independent. The distributional forms in (4.3.6) will be used in constructing the aforementioned tests of hypotheses.

The test of the assumption concerning the validity of the additive model corresponds to the test of the hypothesis of zero interaction effects when the non-additive model is considered. To test the validity of the additive model, the test statistic used is the ratio of the mean square for remainder to the mean square for error. If the additive model holds, then the test statistic possesses an F distribution with $(b-1)(t-1)$ and $b(k-t)$ degrees of freedom, and the hypothesis concerning the validity of the model is rejected for large values of this ratio. If the additive model assumption is valid, a test of the hypothesis of equal treatment effects would be performed using the ratio of the mean square for treatments (adjusted) to either the mean square for error or the pooled mean square for remainder plus error. Under the hypothesis of equal treatment effects, this ratio possesses an F distribution. On the other hand, if there is evidence to reject the assumption of the additive model in favor of the non-additive model, then an approximate test on the treatments could be performed if desired.

The intrablock analysis given in this section and in the previous section is, of course, an analysis of a model in which the treatment, block, and interaction effects are considered as fixed effects. In comparative type experiments, usually we seek to draw inferences about the effects of the specific treatments used in the experiment and hence the assumption of fixed treatment effects presents little argument. Now the block effects, on the other hand, may be fixed or random.
In this latter case the emphasis may be on drawing inferences about the magnitude of the variance of the population from which the sample of block effects was assumed to be drawn. A model in which the treatment effects are fixed while the block effects are random is called a mixed model. Since the partially balanced group divisible extended complete block designs would frequently be used with random block effects, we shall now present the analysis of a mixed model for these designs.

4.4 Mixed Model Analysis

In the mixed model analysis, all symbols in the model are defined exactly as in (4.1.8) with the exception of $\beta$ and $\gamma$. In the mixed model, the parameters $\beta$ and $\gamma$ are assumed to be independently distributed normal random variables with

$\beta \sim N(0, \sigma_b^2 I_b)$, $\quad \gamma \sim N(0, \sigma_{bt}^2 I_{bt})$,

and each distributed independently of the random errors $\varepsilon$. Under these assumptions, the expectation of the vector of observations is

$E(y) = C(\mu 1_{bt} + X_1\tau) = \boldsymbol{\mu}$   (4.4.1)

and the variance of the observations is

$\mathrm{Var}(y) = E[C(X_2\beta + \gamma) + \varepsilon][C(X_2\beta + \gamma) + \varepsilon]' = CX_2X_2'C'\sigma_b^2 + CC'\sigma_{bt}^2 + \sigma_e^2 I_{bk} = V$.   (4.4.2)

The analysis of variance table for the mixed model is presented in Table 4. The differences between the entries in Table 4 and Table 3 for the fixed effects model are the replacement of the source of variation called blocks (unadjusted) by the source of variation called blocks (adjusted) and the expected mean square expressions. Of course other differences exist in the use of the two tables, namely in the interpretation of the tests of hypotheses.
The expected mean squares in Table 4 were obtained using the identity TABLE 4 Mixed Model Analysis of Variance for Partially Balanced Group Divisible Extended Complete Block Designs Source df Treatments (adjusted) t-1 Blocks (adjusted) b-1 Interaction (t-1)(b-1) Error b (k-t) SS* SST A SSB, SSI SSE Expected Mean Squares** 1 e + t-1 r 'Ax e + (si- 7 + hT<>* 7F (r-k) s,}at2 b-1 V1 r 2' b b-11^ rk ae + (b-i)Tt-l) {S1 k S2 ^*}abt 2J bt Total bk-1 TSS * SSBa = SSTa + SSBy SSTy and the other SS are defined in Table 3 with SSI = SSR . p ** A = rklt NN' s = l l ni- for p = 1, 2, and 3, X** = (cq+c-^Aj. CqCjT and ** = bk(n>^2A2) '1 + ) (kslS2-2slS3+S|) + (X1s22X**s1) } . OJ 64 E(y'Apy) = tr(ApV) + u'Apu involving the expectation of a quadratic form with y ~ N( y, y ) and where tr(A V) denotes the trace of the matrix A V. ~ IT ~ ir ~ To test the hypothesis HQ: = 0, the statistic takes the form of the ratio of MSI to MSE, where MSI equals SSI/(b-l)(t-1) and MSE equals SSE/b(k-t) with SSI and SSE defined in Table 4. The hypothesis is rejected for values of this ratio larger than the appropriate tabular F value. /s _ If the hypothesis is rejected, an estimate of could be found using the analysis of variance approach. An es- timate of the random blocks component of variance may also be obtained using the analysis of variance approach. When the hypothesis HQ: = 0 is rejected, we may still wish to test the hypothesis that the treatment effects are small relative to the magnitude of the inter action variation. For this test the statistic is given by s-,^- (1/k) s2~(j)5 (b-1) (t-lT~ MSTa - t-1 MSI s1~ (1/k) s2-(f)* (b-1)(t-1) t-1 MSE where expressions for calculating the values of s^, s2, and <})* are presented at the bottom of Table 4. 
The statistic $F_T$ can be shown to possess an approximate F distribution with $f$ and $b(k-t)$ degrees of freedom in the numerator and denominator, respectively, where the number $f$ is computed by the procedure given by Satterthwaite (1946) for approximating the distribution of the estimate of a variance component.

If the hypothesis $H_0$: $\sigma_{bt}^2 = 0$ is not rejected, a simpler test on the treatment effects may be performed. For this case, the statistic is the ratio of the mean square for treatments (adjusted) to the pooled mean square consisting of the mean squares for interaction and error.

CHAPTER 5
A PARTIALLY BALANCED ECBD WITH THE L2 ASSOCIATION SCHEME

The partially balanced group divisible extended complete block designs presented in Chapter 4 comprise a large class of partially balanced ECB designs. However, because of its general applicability, still another class of partially balanced ECB designs to be considered is the class of partially balanced extended complete block designs with the $L_2$ (Latin Square) association scheme. The $L_2$ association scheme for $t = n^2$ treatments is characterized by the arrangement of the treatments in a square array. The classification of the treatments to one another in the $L_2$ association scheme is such that the treatments in the same row or same column are first associates and the treatments not in the same row or same column are second associates. For example, with sixteen treatments (denoted by the numbers 1 through 16) an $L_2$ association scheme would be determined from the square array

 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15 16

The first associates of treatment 1 are the treatments 2, 3, 4, 5, 9, and 13, while the second associates of treatment 1 are the treatments 6, 7, 8, 10, 11, 12, 14, 15, and 16.

In this chapter, we shall present only the intrablock analysis of the partially balanced extended complete block designs with the $L_2$ association scheme. The mixed model analysis follows the procedure outlined in Section 4.4.
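The associate classes of the $L_2$ scheme can be enumerated directly from the square array. The following is a small Python sketch under the row-by-row numbering used in the example above; the function name `l2_associates` is hypothetical and is introduced only for illustration.

```python
def l2_associates(i, n):
    """First and second associates of treatment i (1-based) in an n x n L2 array.

    Treatments are numbered row by row, as in the 4 x 4 array of the text.
    """
    row, col = divmod(i - 1, n)
    same_row = {row * n + c + 1 for c in range(n) if c != col}
    same_col = {r * n + col + 1 for r in range(n) if r != row}
    first = sorted(same_row | same_col)
    second = [h for h in range(1, n * n + 1) if h != i and h not in first]
    return first, second

first, second = l2_associates(1, 4)
print(first)   # treatments sharing a row or column with treatment 1
print(second)  # all remaining treatments
```

For any treatment the sketch returns 2(n-1) first associates and (n-1)^2 second associates, which are the values of n1 and n2 for this scheme.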
In fact, the only difference in the final analysis of the mixed model with the two association schemes is that with the partially balanced ECB designs with the $L_2$ association scheme, the expected mean square expressions are slightly more complicated than the corresponding expressions with the partially balanced group divisible ECB designs.

In the intrablock analysis, we shall make use of row sum and column sum operators which we denote by RS( ) and CS( ) respectively. For the square array containing the treatment effects $\tau_1$ through $\tau_9$, the row sums $RS(\tau_1)$, $RS(\tau_5)$, and $RS(\tau_9)$ are given by

$RS(\tau_1) = \tau_1 + \tau_2 + \tau_3 = RS(\tau_2) = RS(\tau_3)$,
$RS(\tau_5) = \tau_4 + \tau_5 + \tau_6$,

and

$RS(\tau_9) = \tau_7 + \tau_8 + \tau_9$.

The column sums for the same treatment effects are given by

$CS(\tau_1) = \tau_1 + \tau_4 + \tau_7 = CS(\tau_4) = CS(\tau_7)$,
$CS(\tau_5) = \tau_2 + \tau_5 + \tau_8$,

and

$CS(\tau_9) = \tau_3 + \tau_6 + \tau_9$.

Note that for the $i$th treatment effect,

$RS(\tau_i) + CS(\tau_i) = S_1(\tau_i) + 2\tau_i$,

where $S_1(\tau_i)$ denotes the sum of the effects of all treatments that are first associates of the $i$th treatment.

For a PBIBD with the $L_2$ association scheme, the matrix $N^*N^{*\prime}$ may be arranged in a particular pattern which will be described in Section 5.1. The particular pattern of the matrix $N^*N^{*\prime}$ facilitates solving the normal equations to obtain the intrablock estimates of the treatment effects. As expected, the pattern of $N^*N^{*\prime}$ is reflected in the matrix $NN'$, where $N$ is the incidence matrix of the ECB design.

5.1 Intrablock Analysis

The definitions and notation presented in Section 4.1 will be used again in this section with the exception that $N^*$ now denotes the incidence matrix of the generating PBIBD with the $L_2$ association scheme.

Let us consider an additive model consisting of the overall mean parameter, a treatment parameter, a block parameter, and a random error term.
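The row sum and column sum operators introduced above, and the identity RS + CS = S1 + 2τ_i, can be illustrated with a short Python sketch. It assumes NumPy and a row-by-row layout of the effects in the square array; the function name `rs_cs` and the numerical values 1 through 9 are hypothetical and serve only as an example.

```python
import numpy as np

def rs_cs(values):
    """Row-sum and column-sum operators for effects laid out in an n x n array.

    values[i] is the effect of treatment i+1, numbered row by row.
    Returns RS and CS evaluated at every treatment.
    """
    arr = np.asarray(values, dtype=float)
    n = int(round(np.sqrt(arr.size)))
    grid = arr.reshape(n, n)
    rs = np.repeat(grid.sum(axis=1), n)   # each treatment gets its row total
    cs = np.tile(grid.sum(axis=0), n)     # each treatment gets its column total
    return rs, cs

tau = np.arange(1.0, 10.0)                # illustrative values for tau_1..tau_9
rs, cs = rs_cs(tau)
s1 = rs + cs - 2 * tau                    # S1: sum over first associates only
print(rs[4], cs[4], s1[0])
```

Subtracting 2τ_i removes the treatment's own effect, which is counted once in its row total and once in its column total.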
Since the interaction effects are all zero, the model in (4.1.8) may be written as

$y = C(\mu 1_{bt} + X_1\tau + X_2\beta) + \varepsilon$   (5.1.1)

Using the same form of the normal equations as detailed in Sections 3.2 and 4.1, we have

$\begin{bmatrix} G/bk \\ kQ \\ rB - N'T \end{bmatrix} = \begin{bmatrix} \hat{\mu} \\ A\hat{\tau} \\ (rkI_b - N'N)\hat{\beta} \end{bmatrix}$   (5.1.2)

where all symbols are defined in Section 3.2.

The solution of $kQ = A\hat{\tau}$ in (5.1.2) depends upon the form of the matrix $NN'$, since $A = rkI_t - NN'$. For a PBIBD with the $L_2$ association scheme, the matrix $N^*N^{*\prime}$ may be arranged in a particular pattern as follows. In the $L_2$ association scheme, let us label or number the treatments from 1 to $n^2$. Then, if the treatments are listed in numerical order in the incidence matrix according to the particular blocking plan used, the matrix $N^*N^{*\prime}$ is of the form

$N^*N^{*\prime} = (r^* - 2\lambda_1^* + \lambda_2^*)(I_n \otimes I_n) + (\lambda_1^* - \lambda_2^*)(I_n \otimes J_n) + (\lambda_1^* - \lambda_2^*)(J_n \otimes I_n) + \lambda_2^*(J_n \otimes J_n)$   (5.1.3)

where $\otimes$ denotes the direct product of two matrices and where, over all blocks in the generating PBIBD, $\lambda_1^*$ and $\lambda_2^*$ are the number of distinct pairs of experimental units which receive any fixed pair of first and second associates, respectively, in the same block.

Since the matrix $N$ can be expressed in the form $N = c_0J + (c_1 - c_0)N^*$, then

$NN' = (c_1 - c_0)^2 N^*N^{*\prime} + c_0(2r - c_0b)(J_n \otimes J_n)$.   (5.1.4)

Now if the form (5.1.3) of the matrix $N^*N^{*\prime}$ is substituted into $NN'$ in (5.1.4), the resulting expression for $NN'$ after simplifying is

$NN' = [(c_1 + c_0)r - c_0c_1b + \lambda_2 - 2\lambda_1](I_n \otimes I_n) + (\lambda_1 - \lambda_2)[(J_n \otimes I_n) + (I_n \otimes J_n)] + \lambda_2(J_n \otimes J_n)$.   (5.1.5)

This expression for $NN'$ can now be substituted into $A = rkI_t - NN'$, so that the intrablock estimates of the treatment effects are obtained by solving the equation

$kQ = \{n[2\lambda_1 + (n-2)\lambda_2](I_n \otimes I_n) - (\lambda_1 - \lambda_2)[(J_n \otimes I_n) + (I_n \otimes J_n)] - \lambda_2(J_n \otimes J_n)\}\hat{\tau}$

of which the $i$th element is

$kQ_i = n[2\lambda_1 + (n-2)\lambda_2]\hat{\tau}_i - (\lambda_1 - \lambda_2)[RS(\hat{\tau}_i) + CS(\hat{\tau}_i)]$   (5.1.6)

since $-\lambda_2\hat{\tau}_. = 0$. In (5.1.6), $RS(\hat{\tau}_i)$ and $CS(\hat{\tau}_i)$ are the row sum and column sum, respectively, of the estimated $i$th treatment effect.
To obtain the expression for $\hat{\tau}_i$ from (5.1.6), we need the row and column sums of $kQ_i$ in (5.1.6). These sums are respectively

$RS(kQ_i) = [n\lambda_1 + n(n-1)\lambda_2]RS(\hat{\tau}_i)$   (5.1.7)

and

$CS(kQ_i) = [n\lambda_1 + n(n-1)\lambda_2]CS(\hat{\tau}_i)$.   (5.1.8)

Replacing $RS(\hat{\tau}_i)$ and $CS(\hat{\tau}_i)$ in (5.1.6) by their respective equivalent expressions in (5.1.7) and (5.1.8), then $kQ_i$ in (5.1.6) may be rewritten as

$kQ_i = n[2\lambda_1 + (n-2)\lambda_2]\hat{\tau}_i - \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}[RS(kQ_i) + CS(kQ_i)]$,

and hence the intrablock estimate of the effect of the $i$th treatment is given by the equation

$n[2\lambda_1 + (n-2)\lambda_2]\hat{\tau}_i = kQ_i + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}[S_1(kQ_i) + 2kQ_i]$   (5.1.9)

where $S_1(kQ_i)$ is the sum of the $kQ_{i'}$, $i \neq i'$, corresponding to the first associates of the $i$th treatment.

The difference between the estimated effects of treatments $i$ and $i'$ can be written as

$\hat{\tau}_i - \hat{\tau}_{i'} = \frac{1}{n[2\lambda_1 + (n-2)\lambda_2]}\left\{k(Q_i - Q_{i'}) + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}[S_1(kQ_i) - S_1(kQ_{i'}) + 2k(Q_i - Q_{i'})]\right\}$.

Under the assumption that the random errors in (5.1.1) are normally distributed with mean zero and variance structure $\sigma_e^2 I_{bk}$, the difference $\hat{\tau}_i - \hat{\tau}_{i'}$ has the properties

$E(\hat{\tau}_i - \hat{\tau}_{i'}) = \tau_i - \tau_{i'}$

and

$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = 2K\left[1 + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}\right]$ if $i$ and $i'$ are 1st associates,
$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = 2K\left[1 + \frac{2(\lambda_1 - \lambda_2)}{n\lambda_1 + n(n-1)\lambda_2}\right]$ if $i$ and $i'$ are 2nd associates,

where $K = k\sigma_e^2 / n[2\lambda_1 + (n-2)\lambda_2]$ and $n = \sqrt{t}$.

The intrablock analysis of variance table for the partially balanced extended complete block designs with the $L_2$ association scheme is presented in Table 5. All symbols in Table 5 are defined as they were defined in Table 3 with the exception of $S_1(kQ_i)$ which has been defined in this section.

5.2 Distributions of the Sums of Squares and Relevant Tests of Hypotheses

As in Section 4.2, before considering any relevant tests of hypotheses, we shall obtain the distributions of the sums of squares in Table 5.
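The solution (5.1.9) can be verified numerically in the same way as for the group divisible case. The following is a minimal Python sketch assuming NumPy; the function names and the parameter values n = 3, λ1 = 5, λ2 = 3 are hypothetical and chosen only for illustration. The matrix H below is an assumed notation for the combined row-sum and column-sum operator, so that the ith element of H(kQ) equals S1(kQ_i) + 2kQ_i.

```python
import numpy as np

def l2_matrix(n, lam1, lam2):
    """A = rk*I - NN' for an ECBD with the L2 scheme, in Kronecker form."""
    I, J = np.eye(n), np.ones((n, n))
    H = np.kron(I, J) + np.kron(J, I)        # row sum plus column sum operator
    a = n * (2 * lam1 + (n - 2) * lam2)      # coefficient of tau_i in (5.1.6)
    A = a * np.eye(n * n) - (lam1 - lam2) * H - lam2 * np.ones((n * n, n * n))
    return A, H, a

def l2_estimate(kQ, n, lam1, lam2):
    """Intrablock estimates from (5.1.9), under the restriction sum(tau) = 0."""
    _, H, a = l2_matrix(n, lam1, lam2)
    c = (lam1 - lam2) / (n * lam1 + n * (n - 1) * lam2)
    return (kQ + c * (H @ kQ)) / a           # H @ kQ gives S1(kQ_i) + 2 kQ_i

# Hypothetical parameter values, chosen only to exercise the algebra.
n, lam1, lam2 = 3, 5.0, 3.0
rng = np.random.default_rng(1)
tau = rng.normal(size=n * n)
tau -= tau.mean()                            # impose sum(tau) = 0
A, H, a = l2_matrix(n, lam1, lam2)
kQ = A @ tau
tau_hat = l2_estimate(kQ, n, lam1, lam2)
print(np.allclose(tau_hat, tau))
```

The same matrices can be used to check the two variance expressions, since the variance of a treatment contrast d'τ̂ is kσ² times d'A⁻d for any generalized inverse of A.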
Resorting once again to the theory of the distributional properties of quadratic forms, the sums of squares formulae in Table 5 are expressed as quadratic forms in the following matrix notation

TABLE 5
Intrablock Analysis of Variance for Partially Balanced ECBD with the $L_2$ Association Scheme

Source | df | Sum of Squares | EMS*
Treatments (adjusted) | $t-1$ | $SST_A = \frac{k}{2n\lambda_1 + n(n-2)\lambda_2}\left\{\sum_i Q_i^2 + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}\sum_i [Q_iS_1(Q_i) + 2Q_i^2]\right\}$ | $E(MST_A)$
Blocks (unadjusted) | $b-1$ | $SSB_U = \frac{1}{k}\sum_j B_j^2 - CM$ |
Remainder | $(t-1)(b-1)$ | $SSR = \sum_i \sum_j \frac{1}{n_{ij}} R_{ij}^2 - SST_A - SSB_U - CM$ | $E(MSR)$
Error | $b(k-t)$ | $SSE = \sum_i \sum_j \sum_\ell y_{ij\ell}^2 - \sum_i \sum_j \frac{1}{n_{ij}} R_{ij}^2$ | $E(MSE)$
Total | $bk-1$ | $TSS = \sum_i \sum_j \sum_\ell y_{ij\ell}^2 - CM$ |

* $E(MST_A) = \sigma_e^2 + \frac{1}{k(t-1)}\tau'A\tau$, $E(MSR) = \sigma_e^2 + \gamma'D\gamma$, and $E(MSE) = \sigma_e^2$, where $A = rkI_t - NN'$ and $D = C'C$.

$SST_A = y'A_1y = y'\left\{\frac{k}{2n\lambda_1 + n(n-2)\lambda_2}\, C\left(I_{bt} - \frac{1}{k}X_2X_2'D\right)X_1\left[I_t + \frac{\lambda_1 - \lambda_2}{n\lambda_1 + n(n-1)\lambda_2}H\right]X_1'\left(I_{bt} - \frac{1}{k}DX_2X_2'\right)C'\right\}y$,   (5.2.1)

$SSB_U = y'A_2y = y'\left(\frac{1}{k}CX_2X_2'C' - \frac{1}{bk}J\right)y$,   (5.2.2)

$SSR = y'A_3y = y'\left(CD^{-1}C' - \frac{1}{bk}J - A_1 - A_2\right)y$,   (5.2.3)

$SSE = y'A_4y = y'(I_{bk} - CD^{-1}C')y$,   (5.2.4)

and

$TSS = y'A_5y = y'\left(I_{bk} - \frac{1}{bk}J\right)y$,   (5.2.5)

where in (5.2.1), the matrix $H$ is given by

$H = (I_n \otimes J_n) + (J_n \otimes I_n)$.

The matrices $A_1$, $A_2$, $A_3$, $A_4$, and $A_5$ are each real, symmetric, and idempotent, and

$\sum_{p=1}^{4} A_p = A_5$.

Furthermore, the ranks of the five matrices, where $r(A_p)$ denotes the rank of the matrix $A_p$, are

$r(A_1) = t-1$,
$r(A_2) = b-1$,
$r(A_3) = (b-1)(t-1)$,
$r(A_4) = b(k-t)$,

and

$r(A_5) = bk-1 = \sum_{p=1}^{4} r(A_p)$.

Hence, by again referring to Theorem 5 in Searle (1971) on the distribution of quadratic forms, we find that when $y \sim N(\boldsymbol{\mu}, \sigma_e^2 I_{bk})$, then

$y'A_py \sim \sigma_e^2\, \chi'^2_{r(A_p)}\left(\boldsymbol{\mu}'A_p\boldsymbol{\mu}/2\sigma_e^2\right)$   (5.2.6)

for $p = 1, 2, 3$, and 4 and the $y'A_py$ are mutually independent. The distributional forms (5.2.6) can now be used to construct statistics for the tests of hypotheses in the usual manner.
The test of the assumption concerning the validity of the additive model corresponds to the test of the hypothesis of zero interaction effects when the non-additive model is considered. To test the validity of the additive model, the test statistic used is the ratio of the mean square for remainder to the mean square for error. If the additive model assumption holds, then the test statistic possesses an F distribution with $(b-1)(t-1)$ and $b(k-t)$ degrees of freedom, and the hypothesis concerning the validity of the model is rejected for large values of this ratio. If the additive model assumption is valid, a test of the hypothesis of equal treatment effects would be performed using the ratio of the mean square for treatments (adjusted) to either the mean square for error or the pooled mean square for remainder plus error. Under the hypothesis of equal treatment effects, this ratio possesses an F distribution. On the other hand, if there is evidence to reject the assumption of the additive model in favor of the non-additive model, then an approximate test on the treatment effects could be performed if desired.

CHAPTER 6
THE GENERAL PARTIALLY BALANCED EXTENDED COMPLETE BLOCK DESIGN

In Chapter 4 the analysis of the fixed effects model as well as the analysis of the mixed model for the class of partially balanced group divisible extended complete block designs was presented in detail. In Chapter 5 the analysis of the fixed effects model was presented in detail and the analysis of the mixed model was mentioned for the class of partially balanced extended complete block designs with the $L_2$ association scheme. These two special cases of the general partially balanced (GPB) extended complete block designs were presented in detail, not only because of their general applicability, but also because the constants (containing the parameters of the designs) were of the same form for both special cases.
In this chapter, we shall present the analysis of the GPB extended complete block designs. For this general class of designs, it will be necessary to introduce new constants to aid in simplifying the forms of the necessary calculating formulae. The introduction of these new constants stems from the desire to conform to the use of the standard notation for the analysis of general partially balanced incomplete block designs. In particular, the new constants are $d_1$, $d_2$, and $\Delta_s$, which correspond to the constants $c_1$, $c_2$, and $\Delta$ as defined and used by Bose and Shimamoto (1952) in the analysis of PBIB designs. We now present the intrablock analysis of variance for the GPB extended complete block designs.

6.1 Intrablock Analysis

Let us consider the model in (4.1.8) where again the interaction effects are all zero. The additive model is written as

$y = C(\mu 1_{bt} + X_1\tau + X_2\beta) + \varepsilon$   (6.1.1)

where the symbols $y$, $C$, $\mu$, $1_{bt}$, $X_1$, $\tau$, $X_2$, $\beta$, and $\varepsilon$ are defined following (4.1.8). The normal equations are set up exactly as presented in Sections 3.2 and 5.1, resulting in

$\begin{bmatrix} G/bk \\ kQ \\ rB - N'T \end{bmatrix} = \begin{bmatrix} \hat{\mu} \\ A\hat{\tau} \\ (rkI_b - N'N)\hat{\beta} \end{bmatrix}$   (6.1.2)

where $G$ is the grand total of the observations, $Q$ is the vector of adjusted treatment totals, $B$ and $T$ are respectively vectors of the unadjusted totals of the blocks and treatments, $N$ is the incidence matrix of the design, $r$ is the number of replications of each treatment in the experiment, and $k$ is the block size. As in the previous sections where the intrablock analysis was discussed, the matrix $A$ is given by

$A = rkI_t - NN'$,

and $\hat{\mu}$, $\hat{\tau}$, and $\hat{\beta}$ are respectively the estimates of the parameters $\mu$, $\tau$, and $\beta$ in (6.1.1).

To find an expression for the intrablock estimate of the effect of the $i$th treatment, we note from the $t \times 1$ vector of equations $kQ = A\hat{\tau}$ in (6.1.2) that the $i$th element can be written as

$kQ_i = rk\hat{\tau}_i - \sum_{j=1}^{b}\sum_{h=1}^{t} n_{ij}n_{hj}\hat{\tau}_h$,   (6.1.3)

where $Q_i$ is the adjusted total for the $i$th treatment and $n_{ij}$ and $n_{hj}$ are elements of the incidence matrix $N$ of the design.
The quantities $\sum_j \sum_h n_{ij}n_{hj}\hat{\tau}_h$ can be expressed in terms of the parameters of the design as follows. The terms of the sum over $h$ fall into three classes,

$\sum_j n_{ij}^2\hat{\tau}_i = (rk - n_1\lambda_1 - n_2\lambda_2)\hat{\tau}_i$ for $i = h$,
$\sum_h \lambda_1\hat{\tau}_h = \lambda_1S_1(\hat{\tau}_i)$ for $i \neq h$, $i$ and $h$ 1st associates,
$\sum_h \lambda_2\hat{\tau}_h = \lambda_2S_2(\hat{\tau}_i)$ for $i \neq h$, $i$ and $h$ 2nd associates,

where $S_1(\hat{\tau}_i)$ is the sum of the estimated treatment effects of all treatments ($n_1$ in number) that are first associates of the $i$th treatment, and likewise, $S_2(\hat{\tau}_i)$ is the sum of the estimated treatment effects of all treatments ($n_2$ in number) that are second associates of the $i$th treatment. By replacing the quantities $\sum_j \sum_h n_{ij}n_{hj}\hat{\tau}_h$ in equation (6.1.3) with their equivalent expressions involving $S_1(\hat{\tau}_i)$ and $S_2(\hat{\tau}_i)$, we may write equation (6.1.3) in the form

$kQ_i = (n_1\lambda_1 + n_2\lambda_2)\hat{\tau}_i - \lambda_1S_1(\hat{\tau}_i) - \lambda_2S_2(\hat{\tau}_i)$.   (6.1.4)

Equation (6.1.4) is now summed over the first associates of the $i$th treatment resulting in the expression

$kS_1(Q_i) = -\lambda_1n_1\hat{\tau}_i + S_1(\hat{\tau}_i)(n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{11}^1 - \lambda_2p_{12}^1) + S_2(\hat{\tau}_i)(-\lambda_1p_{11}^2 - \lambda_2p_{12}^2)$.   (6.1.5)

Summing (6.1.4) over the second associates of the $i$th treatment results in

$kS_2(Q_i) = -\lambda_2n_2\hat{\tau}_i + S_2(\hat{\tau}_i)(n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{12}^2 - \lambda_2p_{22}^2) + S_1(\hat{\tau}_i)(-\lambda_1p_{12}^1 - \lambda_2p_{22}^1)$.   (6.1.6)

In (6.1.5) and (6.1.6), $p_{jk}^i$ is the number of treatments that are both a $j$th associate of treatment $\alpha$ and a $k$th associate of treatment $\beta$ given that treatments $\alpha$ and $\beta$ are $i$th associates. As in Bose and Shimamoto (1952), we write equations (6.1.5) and (6.1.6) in the forms

$kS_1(Q_i) = -n_1\lambda_1\hat{\tau}_i + a_{11}S_1(\hat{\tau}_i) + a_{12}S_2(\hat{\tau}_i)$   (6.1.7)

and

$kS_2(Q_i) = -n_2\lambda_2\hat{\tau}_i + a_{21}S_1(\hat{\tau}_i) + a_{22}S_2(\hat{\tau}_i)$   (6.1.8)

where

$a_{11} = n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{11}^1 - \lambda_2p_{12}^1$,
$a_{12} = -\lambda_1p_{11}^2 - \lambda_2p_{12}^2$,
$a_{21} = -\lambda_1p_{12}^1 - \lambda_2p_{22}^1$,

and

$a_{22} = n_1\lambda_1 + n_2\lambda_2 - \lambda_1p_{12}^2 - \lambda_2p_{22}^2$.

In order to express the intrablock estimate $\hat{\tau}_i$ as a function of $Q_i$, $S_1(Q_i)$, and $S_2(Q_i)$, having arrived at the equations (6.1.7) and (6.1.8), we interrupt the development briefly to see how the sums $S_1(\hat{\tau}_i)$ and $S_2(\hat{\tau}_i)$ can be replaced in (6.1.7) and (6.1.8) with the quantities $S_1(Q_i)$ and $S_2(Q_i)$.
To this end, consider the linear combination

$L = k^2Q_i + d_1kS_1(Q_i) + d_2kS_2(Q_i)$   (6.1.9)

involving only the $Q_i$ and parameters of the design, with $d_1$ and $d_2$ being constants consisting of functions of the $a_{ij}$, $i, j = 1, 2$. If (6.1.4), (6.1.7), and (6.1.8) are substituted into (6.1.9) for $kQ_i$, $kS_1(Q_i)$, and $kS_2(Q_i)$, respectively, the resulting expression for $L$ is

$L = [k(n_1\lambda_1 + n_2\lambda_2) - d_1n_1\lambda_1 - d_2n_2\lambda_2]\hat{\tau}_i + (d_1a_{11} + d_2a_{21} - k\lambda_1)S_1(\hat{\tau}_i) + (d_1a_{12} + d_2a_{22} - k\lambda_2)S_2(\hat{\tau}_i)$.   (6.1.10)

The quantities $d_1$ and $d_2$ in (6.1.10) are now chosen so that upon equating the right-hand side of (6.1.9) to the right-hand side of (6.1.10), and using the restriction $\hat{\tau}_i + S_1(\hat{\tau}_i) + S_2(\hat{\tau}_i) = 0$, the equation (6.1.10) becomes

$k^2Q_i + d_1kS_1(Q_i) + d_2kS_2(Q_i) = k(n_1\lambda_1 + n_2\lambda_2)\hat{\tau}_i$.   (6.1.11)

That is, equation (6.1.11) expresses the estimate $\hat{\tau}_i$ as a function of the quantities $Q_i$, $S_1(Q_i)$, and $S_2(Q_i)$. To obtain the values for $d_1$ and $d_2$ so that equation (6.1.11) is as shown, we require the identities

$k\lambda_1 = d_1(a_{11} + n_1\lambda_1) + d_2(a_{21} + n_2\lambda_2)$   (6.1.12)

and

$k\lambda_2 = d_1(a_{12} + n_1\lambda_1) + d_2(a_{22} + n_2\lambda_2)$.   (6.1.13)

Solving equations (6.1.12) and (6.1.13) simultaneously by the use of determinants, we have $d_1 = D_1/D$ and $d_2 = D_2/D$, where $D$, $D_1$, and $D_2$ are given by

$D = (a_{11} + n_1\lambda_1)(a_{22} + n_2\lambda_2) - (a_{12} + n_1\lambda_1)(a_{21} + n_2\lambda_2)$,   (6.1.14)

$D_1 = k\lambda_1(a_{22} + n_2\lambda_2) - k\lambda_2(a_{21} + n_2\lambda_2)$,   (6.1.15)

and

$D_2 = k\lambda_2(a_{11} + n_1\lambda_1) - k\lambda_1(a_{12} + n_1\lambda_1)$.   (6.1.16)

Substituting for $a_{11}$, $a_{12}$, $a_{21}$, and $a_{22}$ in (6.1.14), (6.1.15), and (6.1.16), simplifying, and writing $D = k^2\Delta_s$, the following equations for $d_1$ and $d_2$ are obtained

$k\Delta_sd_1 = \lambda_1(n_1\lambda_1 + n_2\lambda_2 + \lambda_2) + (\lambda_1 - \lambda_2)(\lambda_2p_{12}^1 - \lambda_1p_{12}^2)$   (6.1.17)

and

$k\Delta_sd_2 = \lambda_2(n_1\lambda_1 + n_2\lambda_2 + \lambda_1) + (\lambda_1 - \lambda_2)(\lambda_2p_{12}^1 - \lambda_1p_{12}^2)$,   (6.1.18)

where

$k^2\Delta_s = (n_1\lambda_1 + n_2\lambda_2 + \lambda_1)(n_1\lambda_1 + n_2\lambda_2 + \lambda_2) + (\lambda_1 - \lambda_2)[(n_1\lambda_1 + n_2\lambda_2)(p_{12}^1 - p_{12}^2) + \lambda_2p_{12}^1 - \lambda_1p_{12}^2]$.

The expressions (6.1.17) and (6.1.18) for the values of $d_1$ and $d_2$, respectively, are now substituted back into (6.1.11). The intrablock estimate of the effect of the $i$th treatment is

$\hat{\tau}_i = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[k^2Q_i + d_1S_1(kQ_i) + d_2S_2(kQ_i)]$   (6.1.19)

which, because $kQ_i + S_1(kQ_i) + S_2(kQ_i) = 0$, has the alternate forms

$\hat{\tau}_i = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[(k - d_2)kQ_i + (d_1 - d_2)S_1(kQ_i)]$   (6.1.20)

and

$\hat{\tau}_i = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[(k - d_1)kQ_i + (d_2 - d_1)S_2(kQ_i)]$.   (6.1.21)

The use of one of the alternate forms would probably be more convenient since only one sum, either $S_1(kQ_i)$ or $S_2(kQ_i)$, for each treatment need be calculated.

The difference of the estimated effects of treatments $i$ and $i'$, $i \neq i'$, using the alternate form in (6.1.20) above is

$\hat{\tau}_i - \hat{\tau}_{i'} = \frac{1}{k(n_1\lambda_1 + n_2\lambda_2)}[(k - d_2)k(Q_i - Q_{i'}) + (d_1 - d_2)k\{S_1(Q_i) - S_1(Q_{i'})\}]$.

Under the assumption that the random errors in (6.1.1) are independent normally distributed random variables with mean zero and variance $\sigma_e^2$, the difference $\hat{\tau}_i - \hat{\tau}_{i'}$ has the properties

$E(\hat{\tau}_i - \hat{\tau}_{i'}) = \tau_i - \tau_{i'}$

and

$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \frac{2(k - d_1)\sigma_e^2}{n_1\lambda_1 + n_2\lambda_2}$ if $i$ and $i'$ are 1st associates,
$\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_{i'}) = \frac{2(k - d_2)\sigma_e^2}{n_1\lambda_1 + n_2\lambda_2}$ if $i$ and $i'$ are 2nd associates.

The analysis of variance table for the general partially balanced extended complete block design is exactly of the same form as Table 5 except that the calculating formula for the sum of squares for treatments (adjusted) is now given by

$SST_A = \frac{1}{k^2(n_1\lambda_1 + n_2\lambda_2)}\sum_i [(k - d_2)(kQ_i)^2 + (d_1 - d_2)(kQ_i)S_1(kQ_i)]$.

Also, the tests of hypotheses usually performed are conducted in the manner described in Section 5.2.

The recovery of the interblock information and a combined estimate of the intrablock and interblock treatment effects can also be obtained with the straightforward application of the maximum likelihood method of Rao (1947).

The utility of the general partially balanced extended complete block designs is at present limited. This limitation is imposed by the complexity of the formulae for the estimates of the treatment effects, in that the constants $d_1$ and $d_2$ must be calculated for any design used.
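The constants $d_1$ and $d_2$ can be computed mechanically once the association scheme is known. The following is a hedged Python sketch, assuming NumPy: it reads the parameters p^i_jk off products of the 0/1 association matrices, solves the pair of equations (6.1.12) and (6.1.13), and then applies the one-sum form of the estimate with coefficient (k - d2) on kQ_i and (d1 - d2) on S1(kQ_i). The function names, the matrix B1 of first-associate indicators, and the numerical values of λ1, λ2, and k are hypothetical and serve only to exercise the algebra on the two schemes treated in Chapters 4 and 5.

```python
import numpy as np

def gpb_constants(B1, lam1, lam2, k):
    """Solve (6.1.12)-(6.1.13) for d1, d2 given the first-associate matrix B1."""
    t = B1.shape[0]
    B2 = np.ones((t, t)) - np.eye(t) - B1          # second-associate indicators
    n1, n2 = B1[0].sum(), B2[0].sum()
    theta = n1 * lam1 + n2 * lam2

    def p(i, Bj, Bk):
        # p^i_jk: count of treatments that are j-th associates of alpha and
        # k-th associates of beta, for one pair (alpha, beta) of i-th associates.
        Bi = B1 if i == 1 else B2
        alpha, beta = np.argwhere(Bi == 1)[0]
        return (Bj @ Bk)[alpha, beta]

    a11 = theta - lam1 * p(1, B1, B1) - lam2 * p(1, B1, B2)
    a12 = -lam1 * p(2, B1, B1) - lam2 * p(2, B1, B2)
    a21 = -lam1 * p(1, B1, B2) - lam2 * p(1, B2, B2)
    a22 = theta - lam1 * p(2, B1, B2) - lam2 * p(2, B2, B2)
    M = np.array([[a11 + n1 * lam1, a21 + n2 * lam2],
                  [a12 + n1 * lam1, a22 + n2 * lam2]])
    d1, d2 = np.linalg.solve(M, np.array([k * lam1, k * lam2]))
    return d1, d2, theta

def gpb_estimate(kQ, B1, lam1, lam2, k):
    """One-sum form of the intrablock estimate, using S1(kQ_i) = (B1 @ kQ)_i."""
    d1, d2, theta = gpb_constants(B1, lam1, lam2, k)
    return ((k - d2) * kQ + (d1 - d2) * (B1 @ kQ)) / (k * theta)

def check(B1, lam1=5.0, lam2=3.0, k=6.0, seed=0):
    """Build A = theta*I - lam1*B1 - lam2*B2 and test recovery of tau."""
    t = B1.shape[0]
    B2 = np.ones((t, t)) - np.eye(t) - B1
    _, _, theta = gpb_constants(B1, lam1, lam2, k)
    A = theta * np.eye(t) - lam1 * B1 - lam2 * B2
    tau = np.random.default_rng(seed).normal(size=t)
    tau -= tau.mean()
    return np.allclose(gpb_estimate(A @ tau, B1, lam1, lam2, k), tau)

# Group divisible scheme (3 groups of 2) and L2 scheme (3 x 3 array).
B1_gd = np.kron(np.eye(3), np.ones((2, 2))) - np.eye(6)
I3, J3 = np.eye(3), np.ones((3, 3))
B1_l2 = np.kron(I3, J3) + np.kron(J3, I3) - 2 * np.eye(9)
print(check(B1_gd), check(B1_l2))
```

Reading the p^i_jk from matrix products avoids tabulating them by hand, at the cost of assuming the supplied matrix really does define a two-class association scheme; the sketch does not validate that.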
A similar difficulty is encountered in the analysis of PBIB designs, unless one has reference to the extensive listing of PBIB designs (with two associate classes) and their associated constants necessary for an analysis as given by Clatworthy (1973).

CHAPTER 7
CONCLUDING REMARKS AND A SENSORY TESTING EXAMPLE

Throughout the development and presentation of this work, it has been necessary to make certain assumptions concerning the model as well as the type of correlation present in the data in order to formulate our methods of analysis. In Chapter 3 for example, in the development concerning the possible presence of the correlation $\rho$ between duplicate responses to the same treatment in the same block, the additive model only was assumed. Without making the assumption that the interaction effects are all zero, the estimate of the magnitude of the correlation would be confounded with the estimates of the interaction effects. In this case, an estimate of the correlation free of the interaction effects would not have been possible by the method we used. Although the assumption of additivity for many realistic situations is somewhat restrictive, the additivity assumption was made as a matter of necessity and to be in line with the assumption employed in the analysis of randomized complete block designs (of which our designs are just extensions).

The assumption that the correlation between duplicate responses to the same treatment in the same block is constant and equal for all treatments and blocks may also appear to be somewhat restrictive. It would seem more appropriate perhaps to assume the correlation is not constant but rather varies over the treatments and blocks. That is, for many practical applications it may be more realistic to consider the correlations $\rho_{ij}$, $i = 1, 2, \ldots, t$ and $j = 1, 2, \ldots, b$.
However, this non-equality of the correlations would give rise to the difficulty of having to estimate a larger number of parameters than the number of observations present in the experiment. Also, all the correlations $\rho_{ij}$ could not be estimated in a given experiment since not all block-treatment combinations would have duplicate responses.

A simplification of the problem of non-equality of the correlations would be to consider the correlations $\rho_j$, $j = 1, 2, \ldots, b$, that is, a different correlation is associated with each block. Such a case might arise when panelists of varying degrees of proficiency are used in sensory testing experiments. The estimation of the $\rho_j$ and the subsequent test on the treatment effects is being considered for future work.

The methods presented for testing the hypothesis of zero correlation and for estimating $\rho$ may appear to be somewhat intuitive. First attempts in finding a likelihood ratio test of the hypothesis of zero correlation and a maximum likelihood estimator of $\rho$ resulted in complex expressions which did not seem to simplify. These likelihood procedures could be investigated in future work. At that time, it may be of interest to compare the likelihood results with the results contained in this work.

The possible lack of utility of the general partially balanced extended complete block designs developed in Chapter 6 was mentioned at the end of that chapter. Investigations into the possibility of expressing our solutions in terms of the parameters and constants already tabulated for PBIB designs by Clatworthy (1973) could be considered. Hopefully, this would enhance the utility of the GPB extended complete block designs with respect to the calculations involved in the analysis.

As mentioned at the end of Section 3.2, we shall now suggest a test of the hypothesis of equal treatment effects in the presence of a non-zero correlation.
The test procedure is just a suggestion since the properties of the procedure have not been studied in detail at this time and remain for future consideration.

The quadratic forms for the sums of squares for treatments (adjusted) in Table 1 of Section 3.2 and for remainder in (3.3.2) are not independently distributed in general. However, the quadratic forms for $SST_A$ and the sum of squares for duplication variation in (3.3.1) are independently distributed. In fact, we have that

$SST_A \sim \sigma^2\left\{1 + \frac{2\rho}{\lambda k}[(k+t)\lambda - 2rk]\right\}\chi^2_{t-1}$

when the hypothesis of equal treatment effects is true, that

$SSDV \sim \sigma^2(1 - \rho)\chi^2_{b(k-t)}$,

and that these random variables are distributed independently of one another. Thus, to test the hypothesis under consideration, we may use the test statistic

$F_t = \frac{MST_A / \left\{1 + \frac{2\rho}{\lambda k}[(k+t)\lambda - 2rk]\right\}}{MSDV/(1 - \rho)}$

which possesses a central F distribution with $t-1$ and $b(k-t)$ degrees of freedom in the numerator and denominator, respectively. When the hypothesis is not true, the test statistic $F_t$ has a non-central F distribution which depends upon the true value of $\rho$ as well as the unknown value of the ratio of $\sum \tau_i^2$ to $\sigma^2$. Thus, the power of the test could be calculated for various values of $\rho$ and the ratio $\sum \tau_i^2/\sigma^2$.

The value of the test statistic $F_t$ depends upon the true value of $\rho$ which is usually unknown. The suggested procedure, therefore, is to replace $\rho$ with $\hat{\rho}$, resulting in the approximate test statistic $F_t^*$ given as

$F_t^* = \frac{MST_A / \left\{1 + \frac{2\hat{\rho}}{\lambda k}[(k+t)\lambda - 2rk]\right\}}{MSDV/(1 - \hat{\rho})}$.

The distribution of the approximate test statistic $F_t^*$ depends upon the unknown distribution of $\hat{\rho}$.
At this point, complications arise in arriving at the exact or an approximate distribution of $F_t^{*}$, since $\hat{\rho}$ is calculated using the value of SSR which, as a random variable, is not independent of the random variable $SST_A$. Hence, until more investigation can be made into the distribution of the estimator $\hat{\rho}$, the distribution of the test statistic $F_t^{*}$ might be approximated by an F distribution with $t-1$ and $b(k-t)$ degrees of freedom (the distribution of $F_t$). The closeness of this approximation to the exact distribution is being considered and will hopefully be reported in later work.

The following is a numerical example of a taste testing experiment. The objectives of the experiment were twofold. First, it was of interest to compare the degree of preference for the treatments by the specific panelists used. Second, it was suspected that correlation would be present in the data, and hence a test for its presence was to be performed. Each of the trained panelists (denoted by the numbers 1 through 10) was asked to evaluate five different treatments (denoted by the letters A, B, C, D, and E) by assigning a numerical value of 1 through 9 according to his or her degree of preference for the treatments. The lower end of this hedonic scale reflects an extreme non-preference, while the upper end reflects an extreme preference for the treatments. Since there was an interest in measuring the consistency of the panelists used, and since each panelist could evaluate six food samples effectively at one sitting, each panelist was asked to evaluate each of the treatments plus a replicate of one of the treatments. The data with some calculations are

[Data table ("Treatments") not recoverable from the source text.]

where the $kQ_i$ are calculated using the formula

$$kQ_i = kT_i - \sum_j n_{ij}B_j ,$$

which follows (3.2.3), and the $\hat{\tau}_i$ are calculated using the formula in (3.2.4).
For treatment A, the necessary calculations are

$$kQ_A = (6)(89) - \{(2)(38+19) + (1)(34+27+23+36+31+42+30+22)\} = 175$$

and

$$\hat{\tau}_A = \frac{175}{(14)(5)} = 2.5 .$$

The sums of squares for treatments (adjusted) and blocks (unadjusted), the total sum of squares, and the sum of squares for residual are calculated using the formulae in Table 1. The results of these calculations are

$$SST_A = 127.919, \quad SSB_U = 83.933, \quad TSS = 255.933, \quad \text{and} \quad SSR_e = 44.081 .$$

The sum of squares for residual is partitioned into the sums of squares for duplication variation and remainder by using the formula prior to equation (3.3.1) and by finding the value of the difference $SSR_e - SSDV$. These results are

$$SSDV = 3.5 \quad \text{and} \quad SSR = 40.581 .$$

To test the hypothesis $H_0\colon \rho = 0$ at the 0.05 level of significance, the ratio of the mean square for remainder to the mean square for duplication variation is compared with the tabular value $F^{36}_{10,\,0.05} = 2.68$. The result of this ratio, denoted by $F_\rho$ in the discussion prior to equation (3.5.3), is $F_\rho = 3.22$. Hence, the data present sufficient evidence at the 0.05 level of significance to indicate that $\rho$ is greater than zero. An estimate of $\rho$ is obtained using formula (3.6.4), which for these data is

$$\hat{\rho} = \frac{40.581 - 3.5}{40.581 + (0.159)(3.5)} = 0.901 ,$$

where 0.159 is the value of the constant given in Table 1. Since $\hat{\rho}$ is very close to 1, the conclusion is that the panelists are consistent in their evaluations of a treatment and its duplicate. Future experimentation using these same panelists could be performed using any appropriate experimental design without specific emphasis placed on duplicating any of the treatment responses for each panelist.

Using the procedure suggested in this chapter for testing the hypothesis of equal treatment effects, the result is

$$F_t^{*} = \frac{0.099}{1.214}\cdot\frac{127.919/4}{3.5/10} = 7.45 .$$

Since the tabular value for $F^{4}_{10,\,0.05}$ is 3.48, the hypothesis is rejected and the conclusion is that the treatment effects are not equal.
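The arithmetic of this example can be retraced with a short script. This is only a sketch under stated assumptions: the value $\lambda = 14$ is read from the divisor $(14)(5)$ used for $\hat{\tau}_A$, the correction factor is taken in the form $1 + (2\hat{\rho}/\lambda k)[(k+t)\lambda - 2rk]$, and $\hat{\rho}$ is rounded to three decimals before use, as the worked figures 0.099 and 1.214 suggest. The variable names are illustrative and do not come from the text.

```python
# Design constants of the taste-testing example.
t, b, k = 5, 10, 6          # treatments, panelists (blocks), samples per panelist
r = b * k // t              # replications per treatment (12)
lam = 14                    # assumption: lambda read from the divisor (14)(5)

# Sums of squares reported in the example.
sst_a, ssr_e, ssdv = 127.919, 44.081, 3.5
c = 0.159                   # constant quoted from Table 1 for formula (3.6.4)

# Partition of the residual sum of squares and the degrees of freedom.
ssr = ssr_e - ssdv                                   # remainder
df_dv = b * (k - t)                                  # duplication variation
df_rem = (b * k - 1) - (b - 1) - (t - 1) - df_dv     # remainder

# Test of H0: rho = 0 and the estimate of rho.
f_rho = (ssr / df_rem) / (ssdv / df_dv)
rho_hat = round((ssr - ssdv) / (ssr + c * ssdv), 3)  # rounded as in the text

# Approximate test of equal treatment effects.
inflation = 1 + (2 * rho_hat / (lam * k)) * ((k + t) * lam - 2 * r * k)
f_star = (sst_a / (t - 1) / inflation) / (ssdv / df_dv / (1 - rho_hat))

print(round(f_rho, 2), rho_hat, round(f_star, 2))
```

Carrying the unrounded $\hat{\rho}$ through instead lowers $F_t^{*}$ slightly, which does not change the conclusion at the 0.05 level.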
A multiple comparison procedure for comparing the treatments could now be used, where the formula for the variance of a specific effect or of the difference between two treatment effects is given in Section 3.2 (in equations (3.2.5) and (3.2.7), respectively).

APPENDIX 1
THE EXACT DISTRIBUTION OF SSR FOR b = 2t AND k = t+1 WHEN ρ > 0

In Section 3.4, the exact distribution of the sum of squares for remainder was developed when $\rho > 0$ for the special case $t = b$ and $r = k = t+1$. In this appendix, we shall extend the development of the exact distribution of SSR when $\rho > 0$ to another special case, namely the case $b = 2t$, $k = t+1$, and $r = 2(t+1)$. In other words, we shall now consider the situation where the number of blocks is twice the number of treatments but the block size remains equal to $t+1$.

Following the procedure outlined in Section 3.4, we note that in this situation the matrix in (3.4.12) contains the non-zero partition $W$, where on the main diagonal of the matrix $W$ there are $t$ matrices of the form

$$\begin{pmatrix} (b-1)(t-1) & 1-t \\ 1-t & (b-1)(t-1) \end{pmatrix}$$

while the number 1 is present in all other positions. If we can find the latent roots of $W$, then the latent roots of the matrix $F^{*}MF^{*}$ in (3.4.10) can easily be obtained, since the roots of $F^{*}MF^{*}$ are simple multiples of the roots of $W$. With the aid of the corollary in Section 3.4, the roots of the matrix $F(F'D^{*}F)^{-1}F'$ in (3.4.6) can be obtained, from which an expression for the minimum value $R^2$ in (3.4.5) and (3.4.3) can be specified. Once $R^2$ is expressed in terms of the design parameters, the distribution of the conditional random variable $SSR \mid u_{ij}$ is specified.
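As a numerical cross-check on the latent roots of $W$ given with (A.1.2), the following sketch builds $W$ exactly as described (a $2t \times 2t$ matrix with $t$ diagonal $2 \times 2$ blocks and ones elsewhere) and counts, by exact rational elimination, the dimension of the null space of $W - dI$ for the two candidate roots $b(t-1)$ and $bt - 2t - b$. Because $W$ is symmetric, this dimension equals the multiplicity of the root. The function names are illustrative and not part of the original development.

```python
from fractions import Fraction


def build_w(t, b):
    """W: t diagonal blocks [[(b-1)(t-1), 1-t], [1-t, (b-1)(t-1)]], ones elsewhere."""
    n = 2 * t
    w = [[Fraction(1)] * n for _ in range(n)]
    for g in range(t):
        i, j = 2 * g, 2 * g + 1
        w[i][i] = w[j][j] = Fraction((b - 1) * (t - 1))
        w[i][j] = w[j][i] = Fraction(1 - t)
    return w


def nullity_shifted(w, d):
    """Dimension of the null space of W - d*I, using exact Gaussian elimination."""
    n = len(w)
    m = [[w[i][j] - (d if i == j else 0) for j in range(n)] for i in range(n)]
    rank = 0
    for col in range(n):
        pivot = next((row for row in range(rank, n) if m[row][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for row in range(rank + 1, n):
            factor = m[row][col] / m[rank][col]
            if factor != 0:
                m[row] = [x - factor * y for x, y in zip(m[row], m[rank])]
        rank += 1
    return n - rank


def root_multiplicities(t):
    """Multiplicities of the roots b(t-1) and bt-2t-b for the case b = 2t."""
    b = 2 * t
    w = build_w(t, b)
    d1, d2 = b * (t - 1), b * t - 2 * t - b
    return {d1: nullity_shifted(w, d1), d2: nullity_shifted(w, d2)}


for t in (3, 4, 5):
    print(t, root_multiplicities(t))
```

If the two multiplicities add up to $2t$, no other latent roots of $W$ remain to be found.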
To obtain the latent roots $d$ of the matrix $W$, we seek the solutions of the equation

$$|W - dI_{2t}| = 0 . \tag{A.1.1}$$

This determinant can be expressed in a simpler form by performing elementary row and column operations on the matrix $W - dI_{2t}$, resulting in the expression

$$[b(t-1)-d]^{t+1}\,[(b-1)(t-1)-(t+1)-d]^{t-1} = 0 , \tag{A.1.2}$$

for which the roots $d$ are

$$d = \begin{cases} b(t-1) & \text{with multiplicity } t+1 \\ bt-2t-b & \text{with multiplicity } t-1 . \end{cases}$$

Since $b = 2t$, the roots $d$ may be written in terms of the single design parameter $t$ in the form

$$d = \begin{cases} 2t(t-1) & \text{with multiplicity } t+1 \\ 2t(t-2) & \text{with multiplicity } t-1 . \end{cases}$$

Thus, the roots $\theta''$ of the matrix $F^{*}MF^{*}$ in (3.4.10) are

$$\theta'' = \begin{cases} (t-1)/t & \text{with multiplicity } t+1 \\ (t-2)/t & \text{with multiplicity } t-1 \\ 0 & \text{with multiplicity } 2t^2-5t+1 , \end{cases}$$

while the solutions $\theta'$ satisfying (3.4.7) are

$$\theta' = \begin{cases} (t+1)/2t & \text{with multiplicity } t+1 \\ (t+2)/2t & \text{with multiplicity } t-1 \\ 1 & \text{with multiplicity } 2t^2-5t+1 . \end{cases}$$

Finally, the roots of the matrix $F(F'D^{*}F)^{-1}F'$ in (3.4.6) are

$$\theta = \begin{cases} 2t/(t+1) & \text{with multiplicity } t+1 \\ 2t/(t+2) & \text{with multiplicity } t-1 \\ 1 & \text{with multiplicity } 2t^2-5t+1 . \end{cases}$$

Using the above values of $\theta$, an expression for $R^2$ in the distribution of the conditional random variable $SSR \mid u_{ij}$ in (3.4.2) may be written as

$$R^2 = \frac{2t}{t+1}\,\chi^2_{t+1} + \frac{2t}{t+2}\,\chi^2_{t-1} + \chi^2_{2t^2-5t+1} .$$

That is, $R^2$ is again a weighted linear combination of independent chi-square random variables. Upon taking the expectation of the conditional random variable $SSR \mid u_{ij}$ with respect to $u_{ij}$ and writing $SSR = E(SSR \mid u_{ij})$, we find that the random variable SSR is distributed in the following manner:

$$SSR/\sigma^2 \sim a_1\chi^2_{\nu_1} + a_2\chi^2_{\nu_2} + a_3\chi^2_{\nu_3} , \tag{A.1.3}$$

where

$$a_1 = 1 + \rho\,\frac{t-1}{t+1}, \qquad a_2 = 1 + \rho\,\frac{t-2}{t+1}, \qquad a_3 = 1 ,$$

and

$$\nu_1 = t+1, \qquad \nu_2 = t-1, \qquad \nu_3 = 2t^2-5t+1 .$$

An approximate distribution for the random variable SSR in (A.1.3) is given in Appendix 2.

APPENDIX 2
AN APPROXIMATE DISTRIBUTION OF SSR FOR b = 2t AND k = t+1 WHEN ρ > 0

In Appendix 1, it was shown that the random variable SSR is distributed in the following way:

$$SSR/\sigma^2 \sim a_1\chi^2_{\nu_1} + a_2\chi^2_{\nu_2} + a_3\chi^2_{\nu_3} , \tag{A.2.1}$$

where $a_j$ and $\nu_j$, $j = 1, 2,$ and $3$, are defined following (A.1.3). In this appendix, we shall consider an approximate distribution of (A.2.1) for the given special case $b = 2t$, $k = t+1$, and $r = 2(t+1)$.
The form of the approximate distribution follows from the theorem stated in Section 3.5 on the approximate distribution of a quadratic form. That is,

$$SSR/\sigma^2 \sim g\,\chi^2_h \tag{A.2.2}$$

approximately, where the values of $g$ and $h$ are computed using (3.5.1) and (3.5.2), respectively.

For the integer values 3, 4, 5, 6, and 7 of $t$ and for some values of $\rho$ between 0 and 1, values of $g$ and $h$ are presented in Table A1. The value of the scale constant $g$ was computed to four decimal places and the approximate degrees of freedom $h$ were expressed to the nearest integer.

TABLE A1
Values of g and h for the Approximate Distribution of SSR

| ρ | t | ν₁ | ν₂ | ν₃ | a₁ | a₂ | a₃ | g | h |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 3 | 4 | 2 | 4 | 1.05 | 1.025 | 1 | 1.0255 | 10 |
| 0.3 | 3 | 4 | 2 | 4 | 1.15 | 1.075 | 1 | 1.0792 | 10 |
| 0.5 | 3 | 4 | 2 | 4 | 1.25 | 1.125 | 1 | 1.1361 | 10 |
| 0.7 | 3 | 4 | 2 | 4 | 1.35 | 1.175 | 1 | 1.1958 | 10 |
| 0.9 | 3 | 4 | 2 | 4 | 1.45 | 1.225 | 1 | 1.2581 | 10 |
| 0.1 | 4 | 5 | 3 | 13 | 1.06 | 1.04 | 1 | 1.0207 | 21 |
| 0.3 | 4 | 5 | 3 | 13 | 1.18 | 1.12 | 1 | 1.0658 | 21 |
| 0.5 | 4 | 5 | 3 | 13 | 1.3 | 1.2 | 1 | 1.1156 | 21 |
| 0.7 | 4 | 5 | 3 | 13 | 1.42 | 1.28 | 1 | 1.1695 | 20 |
| 0.9 | 4 | 5 | 3 | 13 | 1.54 | 1.36 | 1 | 1.2271 | 20 |
| 0.1 | 5 | 6 | 4 | 26 | 1.0667 | 1.05 | 1 | 1.0174 | 36 |
| 0.3 | 5 | 6 | 4 | 26 | 1.2 | 1.15 | 1 | 1.0563 | 36 |
| 0.5 | 5 | 6 | 4 | 26 | 1.3333 | 1.25 | 1 | 1.1004 | 35 |
| 0.7 | 5 | 6 | 4 | 26 | 1.4667 | 1.35 | 1 | 1.1491 | 35 |
| 0.9 | 5 | 6 | 4 | 26 | 1.6 | 1.45 | 1 | 1.2022 | 34 |
| 0.1 | 6 | 7 | 5 | 43 | 1.0714 | 1.0571 | 1 | 1.0150 | 55 |
| 0.3 | 6 | 7 | 5 | 43 | 1.2143 | 1.1714 | 1 | 1.0493 | 55 |
| 0.5 | 6 | 7 | 5 | 43 | 1.3571 | 1.2857 | 1 | 1.0887 | 54 |
| 0.7 | 6 | 7 | 5 | 43 | 1.5 | 1.4 | 1 | 1.1330 | 53 |
| 0.9 | 6 | 7 | 5 | 43 | 1.6429 | 1.5143 | 1 | 1.1818 | 53 |
| 0.1 | 7 | 8 | 6 | 64 | 1.075 | 1.0625 | 1 | 1.0132 | 78 |
| 0.3 | 7 | 8 | 6 | 64 | 1.225 | 1.1875 | 1 | 1.0438 | 78 |
| 0.5 | 7 | 8 | 6 | 64 | 1.375 | 1.3125 | 1 | 1.0795 | 77 |
| 0.7 | 7 | 8 | 6 | 64 | 1.525 | 1.4375 | 1 | 1.1200 | 76 |
| 0.9 | 7 | 8 | 6 | 64 | 1.675 | 1.5625 | 1 | 1.1650 | 74 |

To justify the use of the approximate distribution in (A.2.2) for testing the hypothesis $H_0\colon \rho = 0$ against the alternative hypothesis $H_a\colon \rho > 0$, the approximate 0.05 level of significance for the distribution in (A.2.2) is compared with an exact probability level as well as a probability level obtained using a second approximate distribution. We shall feel confident in using the approximate distribution in (A.2.2) if the exact probability levels are close to 0.05 for reasonable values of $t$ and $\rho$.

To compute the exact probability levels corresponding to the approximate probability 0.05, the method suggested by Box (1954) can be used.
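The entries of Table A1 can be retraced with the sketch below. Two assumptions are made and should be checked against Section 3.5: that (3.5.1) and (3.5.2) are the usual constants obtained by matching the mean and variance of $g\chi^2_h$ to those of $\sum a_i\chi^2_{\nu_i}$, and that the weights are $a_1 = 1 + \rho(t-1)/(t+1)$, $a_2 = 1 + \rho(t-2)/(t+1)$, and $a_3 = 1$, which is what the tabled values of $a_1$, $a_2$, and $a_3$ imply. The function names are illustrative.

```python
def coefficients(t, rho):
    """Weights a_i and degrees of freedom nu_i for the case b = 2t, k = t + 1.

    Assumption: the weights are read off from the a1, a2, a3 columns of Table A1.
    """
    a = (1 + rho * (t - 1) / (t + 1), 1 + rho * (t - 2) / (t + 1), 1.0)
    nu = (t + 1, t - 1, 2 * t * t - 5 * t + 1)
    return a, nu


def scale_and_df(t, rho):
    """Constants g and h chosen so that g*chi2_h matches the first two moments."""
    a, nu = coefficients(t, rho)
    s1 = sum(ai * vi for ai, vi in zip(a, nu))        # proportional to the mean
    s2 = sum(ai * ai * vi for ai, vi in zip(a, nu))   # proportional to the variance
    return s2 / s1, s1 * s1 / s2


for t in (3, 4, 5, 6, 7):
    for rho in (0.1, 0.3, 0.5, 0.7, 0.9):
        g, h = scale_and_df(t, rho)
        print(rho, t, round(g, 4), round(h))
```

Under these assumptions the printed pairs can be compared line by line with the $g$ and $h$ columns of Table A1.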
However, the calculations of the exact probability values are limited to the case $t = 3$ for all values of $\rho$ because of the large number of significant digits necessary to ensure accuracy to the fourth decimal place. This limitation prompted us to consider a second approximation, which is more exact than the distribution specified in (A.2.2), to check the probability levels of the distribution in (A.2.2). The second approximation is now described in detail.

Corresponding to the form of the distribution of the random variable $SSR/\sigma^2$ given in (A.2.1), the moment generating function $m(\theta)$ of $SSR/\sigma^2$ may be written as

$$m(\theta) = (1-2a_1\theta)^{-\nu_1/2}\,(1-2a_2\theta)^{-\nu_2/2}\,(1-2a_3\theta)^{-\nu_3/2} . \tag{A.2.3}$$

Taking the natural logarithm of $m(\theta)$ results in the sum

$$\ln m(\theta) = -\sum_{i=1}^{3} \frac{\nu_i}{2}\,\ln(1-2a_i\theta) . \tag{A.2.4}$$

The quantity $\ln(1-2a_i\theta)$ can be expressed in terms of a new constant $a$ as

$$\ln(1-2a_i\theta) = \ln(1-2a\theta) + \ln\!\left[\frac{1-2a_i\theta}{1-2a\theta}\right] . \tag{A.2.5}$$

The reason for expressing $\ln(1-2a_i\theta)$ in this form will become evident as the development of the second approximation continues.

Using some simple algebraic manipulations and the expansion of $\ln(1+x)$, we may write the second term on the right-hand side of (A.2.5) in the form

$$\ln\!\left[\frac{1-2a_i\theta}{1-2a\theta}\right] = \ln\!\left(\frac{a_i}{a}\right) + \sum_{p=1}^{\infty} \frac{(-1)^{p-1}}{p}\left(\frac{\varepsilon_i}{1-2a\theta}\right)^{p} , \tag{A.2.6}$$

where $\varepsilon_i = (a/a_i) - 1$. Substituting (A.2.6) into (A.2.5) and using the resulting expression in (A.2.4) gives, apart from the additive constant $-\sum_i (\nu_i/2)\ln(a_i/a)$, which does not involve $\theta$,

$$\ln m(\theta) = -\frac{\nu}{2}\,\ln(1-2a\theta) - \sum_{i=1}^{3} \frac{\nu_i}{2} \sum_{p=1}^{\infty} \frac{(-1)^{p-1}}{p}\left(\frac{\varepsilon_i}{1-2a\theta}\right)^{p} , \tag{A.2.7}$$

where $\nu = \nu_1 + \nu_2 + \nu_3$. A value $a_0$ of $a$ is now chosen so that when we set $p = 1$, the second expression on the right-hand side of (A.2.7) is equal to zero.
The value $a_0$ is given by

$$a_0 = \frac{\nu}{\sum_{i=1}^{3} \nu_i/a_i} , \tag{A.2.8}$$

which satisfies the equation

$$\sum_{i=1}^{3} \frac{\nu_i}{2}\,\frac{\varepsilon_i}{1-2a_0\theta} = 0 ,$$

where $\varepsilon_i = (a_0/a_i) - 1$. With this value of $a_0$ substituted into (A.2.7), the resulting expression for $m(\theta)$ in (A.2.3) is given approximately by

$$m(\theta) = (1-2a_0\theta)^{-\nu/2} \exp\!\left[\sum_{p=2}^{\infty} \sum_{i=1}^{3} \frac{\nu_i}{2p}\,(-1)^{p}\left(\frac{\varepsilon_i}{1-2a_0\theta}\right)^{p}\right] . \tag{A.2.9}$$

If we further define $\pi_p$ to be

$$\pi_p = \sum_{i=1}^{3} \frac{\nu_i}{2p}\,(-\varepsilon_i)^{p}$$

and define $\psi$ to be

$$\psi = 1-2a_0\theta ,$$

then $m(\theta)$ in (A.2.9) may be written in the form

$$m(\theta) = (1-2a_0\theta)^{-\nu/2} \exp\!\left[\sum_{p=2}^{\infty} \pi_p\,\psi^{-p}\right] . \tag{A.2.10}$$

Since we require the equality $m(0) = 1$, the approximation in (A.2.10) is altered to satisfy $m(0) = 1$, resulting in

$$m(\theta) \approx (1-2a_0\theta)^{-\nu/2} \exp\!\left[\sum_{p=2}^{\infty} \pi_p\,(\psi^{-p} - 1)\right] . \tag{A.2.11}$$

An approximation to $m(\theta)$ in (A.2.11) is made by using the first three terms of the infinite sum on $p$ and by approximating $e^x$ by the formula $e^x \approx 1 + x + (x^2/2)$. The approximation becomes

$$m(\theta) \approx (1-2a_0\theta)^{-\nu/2}\left\{(1-S+\tfrac{1}{2}S^2) + \pi_2(1-S)\,\psi^{-2} + \pi_3(1-S)\,\psi^{-3} + \left[\pi_4(1-S)+\tfrac{1}{2}\pi_2^2\right]\psi^{-4}\right\} ,$$

where $S = \pi_2 + \pi_3 + \pi_4$. Hence, the second approximation of the required probability is given by

$$\begin{aligned} \Pr[\,SSR/\sigma^2 > \delta\,] = {} & (1-S+\tfrac{1}{2}S^2)\,\Pr[\,a_0\chi^2_{\nu} > \delta\,] + \pi_2(1-S)\,\Pr[\,a_0\chi^2_{\nu+4} > \delta\,] \\ & + \pi_3(1-S)\,\Pr[\,a_0\chi^2_{\nu+6} > \delta\,] + \left[\pi_4(1-S)+\tfrac{1}{2}\pi_2^2\right]\Pr[\,a_0\chi^2_{\nu+8} > \delta\,] . \end{aligned} \tag{A.2.12}$$

The probabilities presented in Table A2 are the values resulting from the use of

$$\Pr[\,SSR/\sigma^2 > g x_0\,] ,$$

where $x_0$ is the tabular value of a chi-square random variable with $h$ degrees of freedom such that the probability of exceeding $x_0$ is equal to 0.05. That is, the value of $g x_0$ was determined by the approximate distribution in (A.2.2), and this value of $g x_0$ was then used to determine the exact probability values using the method given by Box (1954) as well as the second approximate probability values computed using (A.2.12).
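The second approximation can be sketched numerically as follows. The sketch rests on a particular reading of the formulas above, namely $a_0 = \nu/\sum_i(\nu_i/a_i)$, $\varepsilon_i = (a_0/a_i) - 1$, and $\pi_p = \sum_i \nu_i(-\varepsilon_i)^p/(2p)$, and it should be treated as an assumption wherever that reading differs from the typescript. The chi-square tail probability is computed with a standard series and continued-fraction evaluation of the incomplete gamma function, which is an implementation choice made here and is not the method of the text; all function names are illustrative.

```python
import math


def chi2_sf(x, df):
    """Pr[chi2_df > x], from the regularized incomplete gamma function."""
    if x <= 0:
        return 1.0
    s, z = df / 2.0, x / 2.0
    log_pre = -z + s * math.log(z) - math.lgamma(s)
    if z < s + 1.0:
        # Series for the lower tail; the upper tail is its complement.
        term = total = 1.0 / s
        n = s
        for _ in range(10000):
            n += 1.0
            term *= z / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_pre)
    # Continued fraction (modified Lentz method) for the upper tail.
    tiny = 1e-300
    bb = z + 1.0 - s
    c, d = 1.0 / tiny, 1.0 / bb
    h = d
    for i in range(1, 10000):
        an = -i * (i - s)
        bb += 2.0
        d = an * d + bb
        d = tiny if abs(d) < tiny else d
        c = bb + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_pre)


def chi2_upper_quantile(p, df):
    """Value x0 with Pr[chi2_df > x0] = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def second_approx_tail(delta, a, nu):
    """Right-hand side of (A.2.12) as read above, for weights a and d.f. nu."""
    v = sum(nu)
    a0 = v / sum(vi / ai for ai, vi in zip(a, nu))
    eps = [a0 / ai - 1.0 for ai in a]
    pi = {p: sum(vi * (-e) ** p for vi, e in zip(nu, eps)) / (2.0 * p)
          for p in (2, 3, 4)}
    s = pi[2] + pi[3] + pi[4]
    x = delta / a0
    return ((1 - s + 0.5 * s * s) * chi2_sf(x, v)
            + pi[2] * (1 - s) * chi2_sf(x, v + 4)
            + pi[3] * (1 - s) * chi2_sf(x, v + 6)
            + (pi[4] * (1 - s) + 0.5 * pi[2] ** 2) * chi2_sf(x, v + 8))


# Case t = 3, rho = 0.1, with a_i, nu_i, g, and h taken from Table A1.
a, nu = (1.05, 1.025, 1.0), (4, 2, 4)
g, h = 1.0255, 10
x0 = chi2_upper_quantile(0.05, h)
print(round(second_approx_tail(g * x0, a, nu), 4))
```

For this case the printed value should lie close to the Approx II entry of Table A2 for $t = 3$ and $\rho = 0.1$; small differences in the last digit are to be expected from the rounding of $g$ and $h$.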
From the entries in Table A2, we note that the probabilities for the approximation in (A.2.2) are nearly equal to the probabilities corresponding to the exact distribution in (A.2.1) as approximated by the more exact second method. Therefore, the approximate distribution in (A.2.2) can be used with confidence, at least for the values of $t$ included in the table. These values of $t$ cover most of the cases which might be encountered in the application of our problem.

TABLE A2
Comparison of the Exact and Two Approximate Distributions of SSR

| t | ρ | Exact | Approx I | Approx II |
|---|---|---|---|---|
| 3 | 0.1 | 0.0499 | 0.0500 | 0.0499 |
| 3 | 0.3 | 0.0492 | | 0.0492 |
| 3 | 0.5 | 0.0480 | | 0.0480 |
| 3 | 0.7 | 0.0465 | | 0.0465 |
| 3 | 0.9 | 0.0448 | | 0.0448 |
| 4 | 0.1 | | | 0.0494 |
| 4 | 0.3 | | | 0.0480 |
| 4 | 0.5 | | | 0.0455 |
| 4 | 0.7 | | | 0.0578 |
| 4 | 0.9 | | | 0.0532 |
| 5 | 0.1 | | | 0.0497 |
| 5 | 0.3 | | | 0.0476 |
| 5 | 0.5 | | | 0.0554 |
| 5 | 0.7 | | | 0.0498 |
| 5 | 0.9 | | | 0.0546 |
| 6 | 0.1 | | | 0.0497 |
| 6 | 0.3 | | | 0.0470 |
| 6 | 0.5 | | | 0.0508 |
| 6 | 0.7 | | | 0.0533 |
| 6 | 0.9 | | | 0.0434 |
| 7 | 0.1 | | | 0.0497 |
| 7 | 0.3 | | | 0.0465 |
| 7 | 0.5 | | | 0.0479 |
| 7 | 0.7 | | | 0.0463 |
| 7 | 0.9 | | | 0.0492 |

BIBLIOGRAPHY

Anderson, T.W. (1958). An Introduction to Multivariate Statistical Analysis. Wiley, New York, 203-207.
Bose, R.C. and T. Shimamoto (1952). Classification and analysis of designs with two associate classes. Journal of the American Statistical Association, 47, 151-190.
Box, G.E.P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317-346.
Box, G.E.P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. Annals of Mathematical Statistics, 25, 290-302.
Clatworthy, W.H. (1956). Contributions on partially balanced incomplete block designs with two associate classes. National Bureau of Standards, Applied Mathematics Series, No. 47.
Clatworthy, W.H. (1973). Tables of two-associate-class partially balanced designs. National Bureau of Standards, Applied Mathematics Series, No. 63.
Cornell, J.A. and F.W. Knapp (1972). Sensory evaluation using composite complete-incomplete block designs. Journal of Food Science, 37, 876-882.
Cornell, J.A. and F.W. Knapp (1974). Replicated composite complete-incomplete block designs for sensory experiments. Journal of Food Science, 39, 503-507.
Cornell, J.A. (1974). More on extended complete block designs. Biometrics, 30, 179-186.
Fisher, R.A. (1944). Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh, 10th edition.
Gleser, L.J. (1972). On a new class of bounds for the distribution of quadratic forms in normal variates. Journal of the American Statistical Association, 67, 655-659.
Imhof, J.P. (1961). Computing the distribution of quadratic forms in normal variables. Biometrika, 48, 419-426.
John, P.W.M. (1962). Testing two treatments when there are three experimental units in each block. Applied Statistics, 11, 164-169.
John, P.W.M. (1963). Extended complete block designs. Australian Journal of Statistics, 5, 147-152.
John, P.W.M. (1964). Balanced designs with unequal numbers of replicates. Annals of Mathematical Statistics, 35, 897-899.
John, P.W.M. (1971). Statistical Design and Analysis of Experiments. Macmillan, New York.
Jones, R.M. (1959). On a property of incomplete blocks. Journal of the Royal Statistical Society B, 21, 172-179.
Kulshreshtha, A.C., A. Dey and G.M. Saha (1972). Balanced designs with unequal replications and unequal block sizes. Annals of Mathematical Statistics, 43, 1342-1345.
Pearce, S.C. (1960). Supplemented balance. Biometrika, 47, 263-271.
Pearce, S.C. (1963). The use and classification of non-orthogonal designs. Journal of the Royal Statistical Society A, 126, 353-377.
Pearce, S.C. (1964). Experimenting with blocks of natural size. Biometrics, 20, 699-706.
Pearce, S.C. (1965). Biological Statistics: An Introduction. McGraw-Hill, New York.
Pearce, S.C. (1968). The mean efficiency of equi-replicate designs. Biometrika, 55, 251-253.
Pearce, S.C. (1970). The efficiency of block designs in general. Biometrika, 57, 339-346.
Press, J.S. (1966). Linear combinations of non-central chi-square variates. Annals of Mathematical Statistics, 37, 480-487.
Rao, C.R. (1947). General methods of analysis for incomplete block designs. Journal of the American Statistical Association, 42, 541-561.
Rao, P.V. (1961). Analysis of a class of PBIB designs with more than two associate classes. Annals of Mathematical Statistics, 32, 800-808.
Satterthwaite, F.E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110-114.
Scheffé, H. (1959). The Analysis of Variance. Wiley, New York.
Searle, S.R. (1971). Linear Models. Wiley, New York.
Seshadri, V. (1963a). Combining unbiased estimators. Biometrics, 19, 163-170.
Seshadri, V. (1963b). Constructing uniformly better estimators. Journal of the American Statistical Association, 58, 172-175.
Tocher, K.D. (1952). The design and analysis of block experiments (with discussion). Journal of the Royal Statistical Society B, 14, 45-100.
Trail, S.M. and D.L. Weeks (1973). Extended complete block designs generated by BIBD. Biometrics, 29, 565-578.
Vartak, M.N. (1959). The non-existence of certain PBIB designs. Annals of Mathematical Statistics, 30, 1051-1062.

BIOGRAPHICAL SKETCH

Jack Franklyn Schreckengost was born on January 6, 1944, in Summerville, Pennsylvania. After graduation from the Brookville Area High School in 1961, he attended nearby Clarion State College. During his enrollment there, he majored in chemistry and mathematics, receiving the degree of Bachelor of Science in Secondary Education in 1965. While he taught mathematics in the Penn Hills School District in a suburb of Pittsburgh, he studied part time at Bucknell University, Lewisburg, Pennsylvania. Having been awarded the degree of Master of Science in Mathematics (Teaching) in 1968, he assumed a position at Bucknell University teaching freshman calculus and elementary statistics. In 1970, he enrolled in the Graduate School of the University of Florida and received the degree of Master of Statistics in 1972.
While attending the Graduate School, he worked as a teaching assistant and as a research assistant in the Department of Statistics and, until the present time, has pursued his work toward the degree of Doctor of Philosophy.

He is married to the former Donna Rae Leach of Arnold, Pennsylvania, and is the father of two children. He is a member of Theta Chi and Pi Mu Epsilon fraternities and the American Statistical Association.

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
John A. Cornell, Chairman
Associate Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Frank G. Martin
Associate Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
John G. Saw
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
P. V. Rao
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Frederick W. Knapp
Associate Professor of Food Science

This dissertation was submitted to the Graduate Faculty of the Department of Statistics in the College of Arts and Sciences and to the Graduate Council, and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August, 1974
Dean, Graduate School