UFDC Home  myUFDC Home  Help 



Full Text  
OPTIMAL DESIGN AND ANALYSIS OF CLONAL FORESTRY TRIALS USING SIMULATED DATA By SALVADOR ALEJANDRO GEZAN A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2005 Copyright 2005 by Salvador Alejandro Gezan "If a scientific heresy is ignored or denounced by the general public, there is a chance it may be right. If a scientific heresy is emotionally supported by the general public, it is almost certainly wrong." (Isaac Asimov, 1977) Dedicated to: My biological family, Tita, Lincoln, Ivan, Demain, Florencia and Alexandra My political family, Dean, Quena and Becky, Lauren, Anha and Kevin And to an special individual that put us all together, Pincho ACKNOWLEDGMENTS I would like to thank the members of my supervisory committee, Drs. T.L. White, R.C. Littell, D.A. Huber, D. S. Wofford, and R. L. Wu, for their energy, time and help during my program. I thank, in particular to Dr. White, for the opportunity to come to Gainesville and do such an interesting project and also for his patience and wiliness to show me not only science, but also emotional intelligence. I thank to Dr. Huber, for many small but important details, and for showing me a different dimension of thinking, sometimes unreachable but which surprisingly I liked. I thank to Dr. Littell, an unconditional supporter and also model in many senses. This research would not have been done without the financial support of the members of the Cooperative Forest Genetic Research Program (CFGRP). Here, I want to thank Greg Powell for his enormous support in several aspects, and for showing me what being a gator is all about. I also want to thank several friends: The Latino Mafia, Veronica, Alex, Rodrigo, Bernardo, Belkys, Gabriela, and Rossanna; the Trigators Mafia, Terrence, Mathew, Shannon, Josh, Betsy, Mark, and with special love Eugenia. Special thanks go to all members of my family for tolerating my absence, but particularly to my step father Dean W. Pettit, for his unusual vision and for opening doors for me to cross as I choose. TABLE OF CONTENTS ACKNOW LEDGM ENTS ........................................ iv LIST OF TABLES ............... ............. ...................... vii LIST OF FIGURES ................................................. ..............x ABSTRACT ............................... ............. ...............xiii CHAPTER 1 IN TR OD U CTION ............... ..1.................... .................. .......... 2 COMPARISON OF EXPERIMENTAL DESIGNS DOR CLONAL FORESTRY USING SIMULATED DATA....................... .............................7 Introduction ...................... ..... ......................7 M materials an d M eth o d s .......................................................................................... 10 Field Layout. ....... ......... ..... ...... .. ....... ..........10 Genetic Structure and Linear Model .................................. ........12 Spatial Surface..................... ............. ......... 13 Simulation Process ...................... ........ ........ .... ...15 S im u latin g M ortality ...................................................................................... 16 Statistical Analysis ...................... ........ ........ .... ...17 R results and D discussion .............. .................................................................... ............... 19 V ariance C om ponents ...........................................................19 Correlations between True and Predicted Clonal Values...............................22 H eritabilities for Sim ulated D esigns .............................................. ......24 Conclusions....................................... ........ .26 3 ACHIEVING HIGHER HERITABILITIES THROUGH IMPROVED DESIGN AND ANALYSIS OF CLONAL TRIALS................................................ 29 Introduction ...................................... ......... ........... .29 M materials and M methods ............................................................3 1 Field D esign and Sim ulation .......................................................... ... ....31 Statistical M models ............................. ........................... 33 Results and Discussion ......................... ..... ......... ........ 36 Heritability Estimates and Confidence Intervals...................... ...............36 esign and Sim ulation ......................................................................... 3 1 Statistical M models ........................ ................ ................... ..... .... 33 R results and D discussion ....................... ........ .............. ................................... 36 Heritability Estimates and Confidence Intervals......................................36 Surface P aram eter B behavior ......................................... .......................................40 Latinization........................................ 43 N um ber of R am ets per Clone ........................................................ ... .......... 44 Conclusions and R ecom m endations................................... ...................................... 47 4 ACCOUNTING FOR SPATIAL VARIABILITY IN BREEDING TRIALS ...........49 In tro d u ctio n .............. ...... ...... .. ................. ................. ................ 4 9 M materials and M methods ....................................................................... ..................53 Field D esign and Sim ulation ........................................... ......... ............... 53 Statistical A analysis ........................ .............. ...........................55 Results .............. ........................... ........................59 M modeling Global and Local Trends.................................. ...................... 59 N earest N neighbor M odels......................................................... ............... 65 D discussion .................................... .......... .... .......... ......... ............ 68 M modeling Global and Local Trends.................................. ...................... 69 N earest N neighbor M odels......................................................... ...............72 C o n c lu sio n s........................................................................................................... 7 5 5 POSTHOC BLOCKING TO IMPROVE HERITABILITY AND BREEDING V A L U E PR E D IC TIO N ................................................................... .....................77 In tro du ctio n ...................................... ................................................ 7 7 M materials and M methods ....................................................................... ..................79 R e su lts ...................................... .......................................................8 3 D iscu ssio n ...................................... ................................................. 8 8 6 CON CLU SION S .................................. .. .......... .. .............93 APPENDIX A SU PPLEM EN TA L TA B LE S .......................................................... .....................98 B EXTENSIONS IN CONFIDENCE INTERVALS FOR HERITABILITY E S T IM A T E S ...................................................... ................ 10 6 C M A TLAB SOURCE CODE .......................................................... ............... 108 Annotated Matlab Code to Generate Error Surfaces. ............................................108 Annotated Matlab Code to Generate Clonal Trials. ..............................................109 L IST O F R E FE R E N C E S ....................................................................... .................... 112 BIOGRAPH ICAL SKETCH .........................................................................118 LIST OF TABLES Table page 21 Simulated designs and their number of replicates, blocks per replicate, plots per block, and trees per block for singletree plots (STP) and fourtree row plots (4 tree). The number of rows and columns is given for the rowcolumn design. All designs contained 2,048 trees arranged in a rectangular grid of 64 x 32 positions.. 12 22 Linear models fitted for simulated datase s for singletree plots (STP) and four tree row plots (4tree) over all surface patterns, plot and design types. All model effects were considered random. ........ ..... .......... ........................................ 18 31 Linear models fitted for nonLatinized experimental designs over all surface patterns. All model effects other than the mean were considered random. .........33 32 Linear models fitted for Latinized linear experimental designs over all surface patterns. All model effects other than the mean were considered random. ..........33 33 Average individual singlesite broadsense heritability with its standard deviations in parenthesis and relative efficiencies calculated over completely randomized design for experimental designs and linear models that were not Latinized (NoLAT). Note that the parametric HB2 was established for CR design as 0.25. .....................................................37 34 Coefficient of variation (100 std / x) for the estimated variance component for clone and error on 3 surface patterns, different number of ramets per clone and 3 selected experim ental designs. ........................................................ 45 41 Linear models fitted for simulated datasets using classical experimental designs to model global trends: complete randomized (CR), randomized complete block (RCB), incomplete block design with 32 blocks (IB), and rowcolumn (RC) design. All model effects other than the mean were considered random..............56 42 Linear models fitted for simulated datasets using polynomial functions to model global trends: linear (Linear), quadratic (RedPoly), and quadratic model with some interactions (FullPoly). All variables are assumed fixed and the treatment effect is random.................................................56 d om .................................................... ................ 56 43 Average variance components for the surface pattern GRAD obtained from fitting the models in Tables 41 and 42 for the following error structures: a) independent errors (ID); and b) autoregressive without nugget (AR1 AR1). All values are the means from 1,000 simulations and the true values were y2g = 0.25 and 2s + 2ms = 0.75 for CR design.................................. ......................... 62 44 Average variance components for the surface pattern ALL obtained from fitting the models in Tables 41 and 42 for the following error structures: a) independent errors (ID); b) autoregressive without nugget (AR1OAR1); and c) autoregressive with nugget (AR1AR1+rj). All values are the means from 1,000 simulations and the true values were 2g = 0.25 and y2s + 02ms = 0.75 for C R design ............................................................................63 45 Average variance components for the surface pattern PATCH obtained from fitting the models in Tables 41 and 42 for the following error structures: a) independent errors (ID); b) autoregressive without nugget (AR1AR1); and c) autoregressive with nugget (AR1AR1+rj). All values are the means from 1,000 simulations and the true values were 2g = 0.25 and y2s + C2ms = 0.75 for C R design. ...................................................................64 46 Average correlation between true and predicted treatment effects (CORR) and average treatment variance component for iterated Papadakis methods PAP6, PAP8 and PAP11 for the surface patterns ALL, PATCH and GRAD .................68 51 Linear models fitted for simulated datasets using classical experimental designs to model global trends over all surface patterns: randomized complete block (RCB), incomplete block design with x blocks (IB), and a rowcolumn (RC) design. All model effects other than the mean were considered random .................81 52 Average variance components and individual broadsense heritability for all surface patterns obtained from fitting the models in Table 51 from simulated datasets with a randomized complete block designs with 8 ramets per clone per site ........................................................ .................................85 53 Relative frequency of best posthoc blocking fits from 500 datasets for each posthoc design and surface pattern with 8 and 2 ramets per clone per site according to the loglikelihood (logL) and average standard deviation of the difference (SED). The values in bold correspond to the most frequent designs......87 Ai Average variance component estimates for singletree plot with no mortality in all surface patterns and experimental designs simulated. All values are the 2 2 means for 1,000 simulations and the true values were clone = 0.25 and C2 = 0.75 for C R design. ..................... .................... ..................... .. ......98 A2 Average variance component estimates for fourtree plot with no mortality in all surface patterns and experimental designs simulated. All values are the means 2 2 for 1,000 simulations and the true values were clone = 0.25 and 2s = 0.75 for C R design ............................................................................99 A3 Average variance component estimates for singletree plot with 25% mortality in all surface patterns and experimental designs simulated. All values are the 2 2 means for 1,000 simulations and the true values were clone = 0.25 and C2 0.75 for C R design. ........................................... ........................ 100 A4 Average variance component estimates for fourtree plot with 25% mortality in all surface patterns and experimental designs simulated. All values are the means for 1,000 simulations and the true values were clone = 0.25 and C2s 0.75 for C R design. ........................................... ........................ 10 1 A5 Average estimated correlations between true and predicted clonal (CORR) with standard deviations for 0 and 25% mortality obtained from 1,000 simulations for singletree plots (STP) and fourtree plot row plots (4tree) in all surface patterns and experim ental designs. ........................................ ........................................ 102 A6 Average individual broadsense heritability calculated for singletree plot (STP) and fourtree plot (4tree) with their standard deviations in all surface patterns, experimental designs and mortality cases. All values are the means for 1,000 simulations and the base heritability value was 0.25 for CR design....................103 A7 Coefficient of variation (100 std / x) for broadsense individual heritability obtained from 1,000 simulations with 8 ramets per clone for singletree plot in all surface patterns and experimental designs. ................................................ 104 A8 Average correlations between true and predicted treatment effects (CORR) with standard deviations in parenthesis in the ALL surface patterns for 2 and 8 ramets per clone for selected experimental designs analyses and polynomial models fitted for the following error structures: independent errors (ID); autoregressive without nugget (AR1OAR1); and autoregressive with nugget (AR1AR1+r). ..105 A9 Original and posthoc blocking average individual broadsense heritabilities with standard deviations in parenthesis for all surface patterns obtained from fitting the models in Table 51 from simulated datasets with a randomized complete block design with 8 ramets per clone per site............... ... .................105 LIST OF FIGURES Figure page 21 Replicate layout for randomized complete block design simulations for single tree plots (left) and fourtree row plots (right). The same 8 resolvable replicates were used for incomplete block and rowcolumn designs. All simulations had 256 unrelated clones with 8 ramets per clone for a total of 2,048 trees ................. 11 22 Average proportions of restricted maximum likelihood (REML) variance component estimates (after correction so that all components sum to one) for singletree and fourtree row plots for the no mortality case in all surface patterns and experimental designs simulated. ................. ............................... 21 23 Average estimated correlations between true and predicted clonal values (CORR) obtained from 1,000 simulations for singletree and fourtree row plots in the 0% mortality case for all surface patterns and designs. ................................23 24 Average heritabilities obtained for singletree and fourtree row plots in each surface pattern and design type for the no mortality case. The error bars correspond to the upper limit for a 95% confidence interval of the mean .............25 25 Heritability distributions for the no mortality case in the surface pattern ALL for selected completely randomized (CR), randomized complete block (RCB), and rowcolumn (RC) designs in singletree and fourtree row plots ..........................25 31 Plots of means and 95% confidence intervals for estimated individual broad sense heritabilities obtained from simulation runs and by the method proposed by Dickerson (1969) for all design types and surface patterns for nonLatinized designs and analyses (NoLAT). Average heritabilities correspond to the central points in each confidence interval, and all simulations involved 8 ramets per clon e. .............................................................................. 3 8 32 Contours for individual heritabilities obtained by using the lowess smoother (Cleveland 1979) of rowcolumn design for different surface parameters from fitting experimental designs and analyses that were nonLatinized (NoLAT). .....39 33 Average individual heritability estimates for ALL and PATCH surface pattern for simulated surfaces for subsets of simulations with High (px > 0.7 and py > 0.7) and Low (px < 0.3 and py < 0.3) spatial correlations............... ................ 42 34 Distribution of individual broadsense heritability estimates for rowcolumn designs with 2 and 8 ramets per clone obtained from 500 simulations each for the surface patterns ALL and PATCH from fitting experimental designs and analyses that were nonLatinized (NoLAT). ................ .................. ...........46 35 Average values for rowcolumns design in all surface patterns from fitting experimental designs and analyses that were nonLatinized: a) Clonal mean m heritabilities (the whiskers correspond to the 95% confidence intervals); and b) Clonal mean heritability increments. ......................................... 46 41 Neighbor plots and definitions of covariates used in Papadakis (PAP) and Moving Average (MA) methods. Plots with the same numbers indicate a common covariate. ..................... .......... ....... ........ 58 42 Average correlations between true and predicted treatment effects (CORR) in 3 different surface patterns for classical experimental design analyses and polynomial models fitted for the following error structures: independent errors (ID); autoregressive without nugget (AR1OAR1); and autoregressive with nugget (AR1AR1+r). Surface GRAD is not shown in the last graph because few simulations converged. ........... ......... ......................61 43 Average correlations between true and predicted treatment effects (CORR) in 3 different surface patterns for nearest neighbor analyses fitted assuming independent errors. PAP: Papadakis, MA: Moving Average............... ...............67 44 Average correlations between true and predicted treatment effects (CORR) in 3 different surface patterns for selected methods: randomized complete block (RCB), rowcolumn (RC), and Papadakis (PAP) using the following error structures: independent errors (ID), and autoregressive with nugget (AR1OAR1+). ....................................................67 51 Experimental layout for randomized complete block design simulations with 8 resolvable replicates (left) and partitioning of one replicate for incomplete block design layouts (right). All simulations had 256 unrelated clones with 8 ramets per clone per site for a total of 2,048 trees. .....................................81 52 Average correlation between true and predicted clonal values (CORR) for original and posthoc blocking analyses on ALL, PATCH and GRAD surface patterns for simulated surfaces with 8 ramets per clone. .......................................84 53 Average individual heritability (a) and correlation between true and predicted clonal values (b) calculated over 500 simulations with a randomized complete block designs with 8 ramets per clone per site for each surface pattern and post hoc design. ...................................... ......... ........... .86 hoc design ..................................... ........................... .. ..... .. ..... 86 54 Average correlation between true and predicted clonal values (CORR) calculated for ALL and PATCH surface patterns and all posthoc designs for sets with High (px, > 0.7 and py > 0.7) and Low (px < 0.3 and py < 0.3) spatial correlations. The datasets used corresponded to simulated surfaces with a randomized complete block design and 8 ramets per clone per site ......................89 Bl Plots of means and 95% confidence intervals for broadsense heritabilities for singletree plots (STP) and fourtree row plots (4tree) in all designs and surface patterns for nonLatinized designs and analyses (NoLAT) with 25% mortality. The methods correspond to simulation runs and the approximation proposed by Dickerson (1969). Average heritabilities correspond to the central points in each confidence interval. ..................... .. ........................... ...... ... .... ........... 107 Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy OPTIMAL DESIGN AND ANALYSIS OF CLONAL FORESTRY TRIALS USING SIMULATED DATA By Salvador Alejandro Gezan December 2005 Chair: Timothy L. White Cochair: Ramon Littell Major Department: Forest Resources and Conservation Various alternatives for the design and analysis of clonal field trials in forestry were studied to identify "optimal" or "near optimal" experimental designs and statistical techniques for estimating genetic parameters through the use of simulated data for single site analysis. These simulations investigated the consequences of different plot types (singletree or fourtree row), experimental designs, presence of mortality, patterns of environmental heterogeneity, and number of ramets per clone. Approximated confidence intervals, Latinization and posthoc blocking were also studied. Later, spatial techniques such as nearest neighbor methods and modeling of the error structure by specifying an autoregressive covariance fitted with two variants (with and without nugget) were compared. Considerable improvements were obtained through selection of appropriate experimental designs and statistical analyses. A 5% higher correlation between true and predicted clonal values was found for singletree plots as compared to fourtree plots. The best experimental designs were rowcolumn for singletree plots, and incomplete blocks with 32 blocks for fourtree row plots increasing heritability by 10% and 14% over a randomized complete block design, respectively. Larger variability of some variance component estimates was the only effect of 25% mortality. Experiments with more ramets per clone yielded higher clonal mean heritabilities, and using between 4 and 6 ramets per clone per site is recommended. Dickerson's approximate method for estimating the variance of heritability estimates produced reasonable 95% confidence intervals, but an underestimation was detected in the upper confidence limit of complex designs. The effects of implementing Latinization were significant for increasing heritability, but small in practical terms. Also, substantial improvements in statistical efficiency were obtained using posthoc blocking, with negligible differences compared to predesigned local control with no reduction in the genetic variance as the size of the block decreased. For spatial analyses, the incorporation of a separable autoregressive error structure with or without nugget yielded the best results. Differences between experimental designs were almost nonexistent when an error structure was also modeled. Some variants of the Papadakis method were almost as good as models that incorporated the error structure. Further, an iterated Papadakis method did not produce improvements over the non iterated Papadakis. CHAPTER 1 INTRODUCTION Genetic testing is a critical activity for all breeding programs; however, it is time consuming and frequently one of the most expensive processes (Zobel and Talbert 1984, p. 232). Appropriate selection of experimental design and statistical analysis can yield considerable improvements in terms of increased precision of genetic parameter estimates and efficient use of resources. Genetic testing aims to achieve several objectives such as (White 2004) 1) improved genetic gain from selection by evaluation of genetic quality; 2) estimation of genetic parameters; 3) creation of a base population for future selection cycles; and 4) quantification of realized genetic gains. Also, various operational decisions depend on information obtained from genetic tests, and limited quantitative genetic knowledge degrades the efficiency of a breeding program, eventually affecting potential gains. At present, there is extensive research in genetic testing for agricultural crops that can be useful for testing in forestry; nevertheless, for forest species several important differences must be considered. Forest sites tend to be heterogeneous, the individuals of interest (trees) are usually large, and the screening of hundreds or thousands of genetic entities (e.g., families or clones) is common. These elements imply the need for relatively large test sites or the use of fewer replicates per site for each entry. Finding optimal testing sites is difficult, and if areas with high environmental heterogeneity are selected, the residual error variance will be inflated due to the confounding of treetotree variation with other largescale environmental effects. Nevertheless, optimization of genetic testing is possible in any of the following stages: 1) design of experiments; 2) implementation; and 3) statistical analysis. The design of experiments is a critical element and no single design will suit all testing objectives and environmental conditions perfectly, but statistical and computational tools are available to improve efficiency and they should be used as safeguards against site heterogeneity or potential undetected biases. Traditionally, in forest tree improvement, randomized complete block designs have been the design of choice, which is effective when within replicate (or block) variability is relatively small, a situation that is rare in forest sites (Costa e Silva et al. 2001). Nevertheless, more complex and efficient designs can be easily implemented. Incomplete block designs allow for a better control of site heterogeneity by specifying smaller compartments than do randomized complete block design. Also, row and column positions of the experimental units can be utilized to simultaneously implement twoway blocking factors producing rowcolumn designs. Other design options include the utilization of restricted randomization such as Latinization, nested structures and spatial designs (Whitaker et al. 2002). In the second stage of optimizing experimental precision (i.e., implementation), it is important to exercise control at all levels of field testing (installation, maintenance and measurement) with proper care on documentation, labeling, randomization, site uniformity and survival. The objectives of this stage are to control for all possible factors that might increase experimental noise, and, therefore, reduce precision of estimation of parameters of interest. The stage of statistical analysis is also very important. The use of mixed linear models combined with techniques such as restricted maximum likelihood (REML) to estimate variance components and to predict random effects is well understood and broadly used. Spatial analysis (Gilmour et al. 1997) and nearest neighbor methods (Vollmann et al. 1996) are interesting techniques used to improve prediction of genetic values. Spatial analyses are particularly attractive because they incorporate the coordinates of the experimental units (plots, trees or plants) in the linear model to account for physical proximity by modeling the error structure (i.e., environmental heterogeneity). Another relevant tool is posthoc blocking, which consists of superimposing complete or incomplete blocks over the original field design and fitting a modified linear model as if the blocking effects were present in the original design. Clonal forestry is a new practice in many widely used commercial forestry species, e.g., Pinus spp., Eucalyptus spp. and Populus spp. (Carson 1986; Elridge et al. 1994, p. 230; Ritchie 1992). Clonal forestry refers to the use of a relatively small number of tested clones deployed in operational plantations through masspropagation techniques (Bonga and Park 2003). Testing clones is of particular interest because the use of identical genetic entities (ramets from a clone) allows sampling several environmental microsites, thus allowing complete separation of genetic and environmental effects to increase the precision of prediction of breeding values (Shaw and Hood 1985). Other benefits of clonal testing include 1) achieving greater genetic gains than under traditional tree breeding; 2) capturing greater portions of nonadditive genetic variation for deployment; 3) using genotype x environment interaction by selecting clones most suited to specific site conditions; 4) detecting and utilizing correlation breakers for traits with undesirable genetic correlations; 5) preventing inbreeding in populations; 6) increasing plantation and product uniformity; and 7) reducing time between breeding and testing cycles (Libby 1977; Libby and Rauter 1984; Zobel and Talbert 1984, p. 311). There is extensive research relative to field testing of progenies using seedling from half or fullsib families (e.g., White 2004). Some of these guidelines can be used for clonal testing, but differences must be taken into consideration. In the literature, few studies exist that give important guidelines for the design and analysis of clonal experiments. The characteristics of the optimal designs and analyses are influenced by many factors such as magnitudes of heritability, g x e interaction, and ratio of additive to nonadditive genetic variance. Also, several questions of interest remain to be answered. For example, for the design of experiments: 1) are singletree plots more efficient than multipletree plots? is there an optimal incomplete block size? 2) how much better than randomized complete blocks are incomplete blocks or rowcolumn designs? 3) what are the effects of mortality on the estimation of genetic parameters? 4) do the best designs depend on the pattern of environmental heterogeneity? and 5) what is the optimal number of ramets per clone? For statistical analysis we require answers to the following: 1) is there any gain from incorporating the error structure in the model? 2) is it necessary to model gradients? 3) how good are the nearest neighbor methods compared with spatial analysis techniques? and 4) is posthoc blocking useful; and how should we implement posthoc blocking? One of the best ways to answer many of the above questions is with the use of simulation techniques. Statistical simulation or Monte Carlo experiments are widely used in statistical research. These methods are based on the generation of random numbers with a computer to obtain approximate solutions to problems that are difficult to solve analytically, and they can aid understanding and knowledge of the properties of the experiments or methods of interest (Johnson 1987, p. 1). Particularly, for field testing, they allow several alternatives to be evaluated without incurring large expense or having to wait years before practical results are available. The overall goal of this research was to identify "optimal" or "near optimal" experimental designs and statistical analysis techniques for the prediction of clonal values and estimation of genetic parameters in order to achieve maximum genetic gains from clonal testing. This study explored singlesite simulations with sets of unrelated clones "planted" in environments with different patterns of variability. In the first two chapters many alternative designs for clonal experiments are compared. In Chapter 2, the consequences for the estimation of genetic parameters of different design alternatives were studied. The elements compared were 1) singletree plot versus fourtree row plots; 2) several experimental designs; 3) no mortality versus 25% mortality; and 4) three different environmental patterns. For Chapter 3, more detailed work was done with STP experiments. Here, individual heritabilities estimates were examined in detail according to the different parameters used to simulate the environmental patterns. Also, confidence intervals for heritability obtained using the method proposed by Dickerson (1969) were compared with simulated percentile confidence sets; and the effects of using different number of ramets per clone were investigated. Several statistical analysis techniques were studied in the last two chapters using singletree plot experiments only. First, in Chapter 4, multiple statistical tools were used to account for spatial variability. The selected techniques included 1) modeling of global trends through the use of traditional experimental designs or polynomial models; 2) specification of a separable autoregressive error structure (with and without nugget); and 3) variants of nearest neighbor methods (Papadakis and Moving Average). Finally, Chapter 5 aimed to understand consequences of and to define strategies for posthoc blocking. Some of the aspects evaluated included comparing performance of original blocking versus posthoc blocking for several experimental designs; and defining strategies to select an optimal blocking structure within posthoc blocking. CHAPTER 2 COMPARISON OF EXPERIMENTAL DESIGNS FOR CLONAL FORESTRY USING SIMULATED DATA Introduction Genetic field tests of forest trees are critical to tree improvement serving to estimate genetic parameters and evaluate provenances, families, and individuals. Genetic testing of forest trees is a timeconsuming process and is frequently the most expensive activity of an improvement program (Zobel and Talbert 1984, p. 232). Therefore, it is of primary interest to maximize benefits by allocating resources efficiently (Namkoong 1979, p. 117). It is common in forestry trials to study the performance of a large number of genetic entries (e.g., families or clones), and this implies the need for relatively large test sites or the use of fewer replicates per site for each entry. Because forest sites for field experiments are often inherently variable, it is difficult to find optimal sites. The presence of this environmental heterogeneity inflates the residual variance due to the confounding of treetotree variation with other, largerscale environmental effects; and, hence, decreases the benefits of using simple experimental designs (Grondona et al. 1996). Withinsite variability is caused by variation in natural factors such as soil, microclimate, topography, wind and aspect. In addition other forms of heterogeneity might originate from machinery, stock quality or planting technique. Some conditions produced by topography or moisture might be easy to identify but, more commonly, environmental heterogeneity is only recognized after the fact as differential response to environmental conditions. Gradients across the site, local patches, and random microsite variance are the common types of variability, and these sources may appear individually or in combination (Costa e Silva et al. 2001), with some sources more frequent in particular geographical areas. There are three stages during which optimization of genetic testing can occur: 1) experimental design and planning; 2) implementation; and 3) statistical analysis. Forest genetic trials have traditionally employed randomized complete block designs (RCB) or, more recently, incomplete block designs (IB). The RCB is most effective when the site is relatively uniform within replicates, which is rarely the case in forest sites (Costa e Silva et al. 2001). Reducing the size of the unit by incorporating incomplete blocks within a full replicate (IB designs) allows for better control of site heterogeneity because smaller blocks tend to be less variable than larger ones; therefore, IB designs have the potential to increase precision over RCB (Cochran and Cox 1957, p. 386; Williams et al. 2002, p. 120). For simulated forest genetic trials under several environmental conditions, Fu et al. (1998) report a 42% increase in the efficiency for IB over RCB. In a related study Fu et al. (1999a) found that most of the withinsite variation can be controlled using 5 to 20 plots per incomplete block with better results in those cases where square blocks were employed instead of row or column blocks. Another experimental option is the rowcolumn design (John and Williams 1995, p. 87). When the experimental units are located in twodimensional arrays, it is possible to simultaneously implement two blocking factors (instead of one as in IB) corresponding to the rows and columns of the experiment. Greater efficiencies are expected when row column designs are used (Lin et al. 1993; Williams and John 1996). Qiao et al. (2000), using several wheat breeding trials, reported an increase of efficiency over RCB of 11% for rowcolumn designs compared with 8% for IB. Clonal forestry is a new practice in many widely planted commercial tree species, e.g., Pinus spp., Eucalyptus spp., Populus spp. (Carson 1986; Elridge et al. 1993, p. 230; Ritchie 1992). Interest in clonal forestry stems from its additional benefits such as ability to achieve greater genetic gains; potential to capture greater portions of nonadditive genetic variation and genotype x environment interaction; increased plantation and product uniformity; and accelerated use of results from tree improvement by reducing breeding and testing cycles (Libby 1977; Zobel and Talbert 1984, p. 311). In the agronomy and forestry literature, there are several studies that compare design and analysis alternatives for breeding experiments through real or simulated datasets. In forest genetics, the few simulation studies available report some guidelines for future testing. Fu et al. (1998) showed that adesigns were the most efficient arrangement for IB designs. For the majority of the environmental conditions studied they were superior to an RCB design for the estimation of family means, but the benefits of IB over RCB were reduced as the level of missing observations increased (Fu et al. 1999b). In a related study for adesigns, Fu et al. (1999a) reported that smaller incomplete blocks were more efficient for significant patches when singletree plots were established to estimate family and clonal means. These studies give important guidelines for the design of genetic experiments; nevertheless, several questions of interest remain to be answered: Are singletree plots more efficient than multipletree plots? In relation to IB designs, how much better (or worse) are rowcolumn designs? Is there an optimal incomplete block size? How is the phenotypic variance partitioned under different experimental designs? Also, it is common in previous simulation studies to screen a limited number of clones and to treat the genetic entries as fixed effects in the linear model. In this study genetic entries were assumed to be random, allowing estimation of heritabilities and prediction of clonal genetic values for a range of simulated conditions. The present study is focused on identifying "optimal" or "near optimal" experimental designs for estimating genetic parameters and achieving maximum genetic gain from forestry clonal tests. This study explores sets of unrelated clones tested on a single site through the use of simulated data created with different patterns of environmental variability. In particular, the objectives of these simulations are to investigate the consequences for the estimation of genetic parameters of 1) using single tree or fourtree row plots; 2) using completely randomized, randomized complete block, incomplete blocks of various sizes, or rowcolumn designs; 3) experiencing no mortality versus 25% mortality; and 4) planting on sites with different environmental patterns of surface variation (only patches, only gradients, and both patches and gradients). Materials and Methods Field Layout All simulations were based on a single trial of 2,048 trees planted on a contiguous rectangular site of 64 rows and 32 columns with square spacing and 8 ramets for each of the 256 unrelated clones. The trees were "planted" in either singletree plots (STP) or fourtree row plots (4tree), corresponding to 8 and 2 plots per clone, respectively (Figure 21). A completely randomized design (CR) in which clonal plots were randomly assigned to the field site was considered the baseline experimental design for each plot size. A randomized complete block design (RCB), a variety of incomplete block designs 11 (IB), and a rowcolumn design (RC) were also implemented for both STP and 4tree row plots (Table 21). There are many ways in which treatments (or clones in this case) can be allocated to incomplete blocks. For the present work, adesigns were used to obtain IB and RC layouts. These designs can be obtained quickly and efficiently for many design parameters (Williams et al. 1999). All IB and RC designs were generated using the software CycDesigN (Whitaker et al. 2002) that incorporates an algorithm to generate optimal or nearoptimal adesigns. Outputs from 100 different independent runs were used for each design and plot type. CR and RCB layouts were generated using code programmed in MATLAB (MathWorks 2000). 1 5 2 6 64 trees 3 7 2 32 trees 16 trees 4 8 16 trees 32 trees Figure 21. Replicate layout for randomized complete block design simulations for singletree plots (left) and fourtree row plots (right). The same 8 resolvable replicates were used for incomplete block and rowcolumn designs. All simulations had 256 unrelated clones with 8 ramets per clone for a total of 2,048 trees. Table 21. Simulated designs and their number of replicates, blocks per replicate, plots per block, and trees per block for singletree plots (STP) and fourtree row plots (4tree). The number of rows and columns is given for the rowcolumn design. All designs contained 2,048 trees arranged in a rectangular grid of 64 x 32 positions. STP Design a Replicates Blocks/ Rep b Plots/ Block Trees / Block CR 1 1 2048 2048 RCB 8 1 256 256 IB 4 8 4 64 64 1B 8 8 8 32 32 IB 16 8 16 16 16 IB 32 8 32 8 8 RC 8 32 rows 32 columns 4tree Design Replicates Blocks / Rep Plots / Block Trees / Block CR 1 1 512 2048 RCB 2 1 256 1024 IB4 2 4 64 256 IB8 2 8 32 128 IB 16 2 16 16 64 IB 32 2 32 8 32 RC 2 16 rows 16 columns a CR, complete randomized; RCB, randomized complete block; IB x, incomplete block with x blocks per replicate; RC, rowcolumn. b Rep, resolvable replicate. Genetic Structure and Linear Model The linear model used to generate the simulated data was YVk = + Ck + Es(,jk) + Ems(jk) 21 where yijk is the response of the tree located in the ith row and jth column of the kth clone, LI is a fixed population mean which was set equal to 10 units, Ck is the random genetic clonal effect, Es(ijk) is the surface error (or structured residual) and Ems(ijk) corresponds to the microsite random error (or unstructured residual). The total variance for the above linear model (Equation 21) considering all components as random is C2 T clone + G2s + C2ms. For simplicity, but without loss of generality, o2T was fixed to 1. Further, the variance structure for all surfaces and designs was set to c2clone = 0.25 and C2s + c2ms = 0.75; hence, the singlesite biased broadsense heritability H = o2clone / clone + 2s + 2ms) is 0.25 for completely randomized designs. Spatial Surface The spatial surface is a rectangular grid (x and y coordinates) composed of unstructured and structured residuals. The unstructured errors, Ems, correspond to white noise and can originate from measurement errors, planting technique, stock quality, and unstructured microsite variation. The structured residuals, Es, are due to the underlying environmental surface, and were generated from two distinct patterns: gradients and patches. Gradients were modeled as a mean response vector t of size 2,048 x 1, at each of the positions of the 64 x 32 grid, employing the following polynomial function: t, = a (x, +y,) + p8(x, 2y, + X Y 2) 22 where xci and yci correspond to the centered values xci = xi x and yci = yi y for the ith tree located in column x and row y; and a and representt fixed weights on linear and quadratic components, respectively. This function defines a flat plane (i.e., no gradients), when a and flare zero. Also, the primary environmental gradient was oriented along the short axis of the 64 x 32 rectangle to minimize variation within a replicate, as an experiment would be laid out in the field. Patches were modeled by incorporating a covariance structure based on a first order separable autoregressive process (AR1OAR1), which is a variant of the exponential model used in spatial statistics (Littell et al. 1996, p. 305). This error structure has been previously used successfully for analyses of agricultural experiments (Cullis and Gleeson 1991; Zimmerman and Harville 1991; Grondona et al. 1996; Gilmour et al. 1997) and forestry trials (Costa e Silva et al. 2001; Dutkowsky et al. 2002). The AR1OAR1 error structure considers two perpendicular correlations, one for the x direction (px) and the other for y (py). The model defines an anisotropic model, where the covariance (or correlation) between two observations is not only a function of their distance, but also of their direction (Cressie 1993, p. 62). The parameters px and py define the correlation between the structured residuals (Es) of nearby trees, and there is a positive relationship between the magnitude of these parameters and patch size. To generate the patches, a 2,048 x 2,048 variancecovariance matrix (R) was constructed and later used to simulate residuals. The elements of this matrix were obtained as Var(e) = 2s for diagonal elements 23 Cov(e,,e,,) = 2s phx Phy for offdiagonal elements 24 where hx = I xi xi, 1, hy = I yi yi' , i.e., the absolute distance in row and column position respectively, between two trees. The other parameters were previously defined. This covariance structure also includes a nugget parameter (c2ms), which allows the modeling of discontinuities in covariance over very small distances. In real datasets the nugget effect reflects variability that would be found if multiple measurements were made exactly on the same position or the microsite variation of positions very close together (Cressie 1993, p. 59; Young and Young 1998, p. 256). Simulation Process The first stage of the simulation consisted of the independent generation of surfaces over which different experimental designs were superimposed. Each surface or environmental pattern was produced by selecting 5 parameters at random from uniform distributions (a, Px, p, and C2ms). The parameters were restricted to the following ranges: 0 to 0.05 for a, 0 to 0.0005 for /, 0.01 to 0.99 for the correlation parameters px and py, and 0.15 to 0.60 for C2ms. Additionally, these correlations were restricted so their absolute difference was smaller than 0.85; and C2s was calculated from C2s = 0.75 C2ms. All these parameters were then used to generate the vector t and the residual variancecovariance matrix R. The Cholesky decomposition of the R matrix was obtained and used together with t and an independent random vector of standard normal numbers to obtain the correlated residuals (Johnson 1987, p. 5254), producing e ~ N(t, R). Later, a set of normal independent random residuals was incorporated to constitute the white noise (i.e., Ems). After this, all components were added together and a standardization was performed to ensure that the total environmental variance was fixed at 0.75. The remaining variability (0.25) belongs to the clonal component of the linear model (Equation 21). Because all clones are unrelated, each set of 256 clonal values was generated as 500 independent standard normal vectors, which were scaled to have a variance of 0.25. Three surface patterns were implemented to generate 1,000 independent surfaces of each pattern: patches only (PATCH) with a= /= 0 in Equation 22; gradients only (GRAD) where px = py = 0 in Equation 24; and both patches and gradients together (ALL) with none of the parameters set to zero. Because of the standardization that was applied to all surfaces, the comparison between different patterns must be viewed with caution. ALL includes both gradients and patches and so is more heavily "corrected" when the error variance was adjusted to 0.75; hence, the effect of each surface component is reduced when compared with PATCH or GRAD. The process of generating a simulated trial for a particular surface pattern and design consisted of selecting at random a surface and a vector of clonal values. Additionally, a random experimental design (layout) was selected from those generated through CycDesigN or MATLAB code, and a simple partial randomization was performed that consisted of shuffling clone numbers. Finally, all components were added together to produce the response vector (y), which was stored with other relevant variables (x and y positions, replicate, block, etc.) for statistical analysis. Simulating Mortality Two levels of mortality were considered in the present study: 0% and 25%. The process of mortality was modeled using two independent components: clone and microsite. The clonal component assumed that some clones survived better than others because of resistance to disease, adaptability to site, or difference in stock quality. This component was generated as a random vector of standard normal numbers independent of their original genetic values (i.e., no relationship with their original clonal values, Ck). For the microsite component, it was assumed that mortality occurs in clusters or patches; and it was simulated as a patchy surface with exactly the same error structure previously described using Equations 23 and 24 where 20% of the total variation corresponded to white noise or random mortality. The following values were used to generate the mortality surfaces: a= /= 0, px = py = 0.75, o2ms = 0.20 and 2s = 0.80. Finally, the mortality process consisted of calculating an index for each tree based on a weighted average of both components (10% for clonal and 90% for the microsite component), and according to this index the lowest 25% of the trees were eliminated. If all ramets (out of 8) were eliminated for any clone, then the process was repeated. The procedure previously described does not take into account changes in competition due to death of neighboring trees. This effect was not incorporated because it is expect d to be more relevant several years after establishment and to affect some variables more than others. Statistical Analysis The data for each of the simulated trials were analyzed using ASREML (Gilmour et al. 2002) with the linear models specified in Table 22 (all effects were considered random). This software fits mixed linear models producing restricted maximum likelihood (REML) estimates of variance components and best linear unbiased predictions (BLUP) of the random effect (Patterson and Thompson 1971). Altogether there were 84,000 datasets analyzed (3 surface patterns x 7 experimental designs x 2 plot types x 2 levels of mortality x 1,000 simulations each). The output was compiled and summarized to obtain averages and standard deviations for heritability estimates and REML variance components for the response variable y. In addition, empirical correlations (CORR) were calculated between the true and predicted clonal values. Singlesite heritabilities for each simulation were calculated as T2 U clone HB 2 lon+ e for STP, and 25 0 clone r0 e T 2 U clone HB clone =2 plot for 4tree. 26 2 clone + 02 plot + 2 e 18 Because heritability estimates correspond to a ratio of correlated variance component estimates, the simple average is not an unbiased estimator of the first moment. Hence, the following formula was used for STP, and for 4tree Z ('2 clone + plot + 2j e n where the sums are over n=1,000 simulations of each surface pattern, plot and design type. Table 22. Linear models fitted for simulated datasets for singletree plots (STP) and fourtree row plots (4tree) over all surface patterns, plot and design types. All model effects were considered random. Design a STP b CR Yj = p + clone, + r1 RCB y, = + rep, + clone, + , IB Yijk = + rep1 + block(rep), + clonek + ijk RC Yijk1 = + rep, + col(rep), + row(rep)lk + clones + ijk1 Design 4tree CR Yijk = + clone, + plot, + Syk RCB Yijk = + rep1 + clones + plot, + iyk IB YijkI = + rep1 + block(rep), + plotik + clonek + Ejk1 RC Yijklm = + rep1 + col(rep), + row(rep)lk + clone, + plotil + sijklm a CR, complete randomized; RCB, randomized complete block; IB, incomplete block; RC, rowcolumn. b clone, clone effect; rep, resolvable replicate; block(rep), incomplete block nested within replicate; col(rep), column nested within replicate; row(rep), row nested within replicate; plot, plot effect; e, residual. H (2 cone 2 i clone / l TT 2 11B Genetic gain from clonal selection was not estimated in the present study because gain is directly proportional to heritability (HB2) for a fixed selection intensity and phenotypic variance (Falconer and Mackay 1996, p. 189). Thus, any increases in HB2 lead directly to greater genetic gains. Also, the correlation between true and predicted clonal values (CORR) is a direct measure of genetic gain. As CORR approaches 1, clonal values are precisely predicted and genetic gain from clonal selection is maximized. Results and Discussion Variance Components For all surface patterns, designs and plot types the simulated data yielded REML estimates of the genetic variance that averaged close to the parametric values imposed during simulation (i.e., Y2clone = 0.25). Also, for CR (the reference experimental design) the estimated variance components for error averaged almost exactly to their parametric values (i.e., C2s = 0.75 for STP and o2plot + c2 = 0.75 for 4tree) (Figure 22). During the simulation process, the total variance was set to 1, and, after analysis the estimated phenotypic variance (sum of average variance component estimates) were extremely close to 1 for CR for all surface patterns and plot types and for all experimental designs in PATCH (Tables Ai, A2, A3 and A4). Simulation scenarios other than CR designs that contained gradients (ALL and GRAD) and replications resulted in estimated phenotypic variances slightly greater than 1 (2% to 6% overestimation) with higher values for 4tree row plots than STP. This situation was due to an inflation of the error variance that originates when a replicate effect is incorporated in the model. Theoretically, a flat plane is specified for each level of replicate when in fact a nonhorizontal trend should be used to model the field gradients. The presence of a slightly inflated total variance was corroborated in a small simulation study using surfaces that contained only gradient and no errors (i.e., random noise or patches). Upon fitting an RCB to this gradient, residuals were generated and an error variance was estimated when in fact it should not exist. Box and Hay (1953) reported a similar situation with time trends and indicated that this inflated error produced by trends within the replicates can be eliminated by using very small replicates. In this study, we preferred to correct the variance components in all models so that they will sum to 1 for each simulated dataset. For STP in PATCH and ALL surfaces, the variance component for the incomplete block effect (o2block(rep)) increased with the number of incomplete blocks (Figure 22). For surfaces with patches. the error variance component, C2y, decreased for smaller incomplete block sizes because an increasing portion of the variance was explained by incomplete blocks. In general, larger values for the error component Cs2 were found for PATCH compared to GRAD surfaces indicating that some of the variation due to patches tended to be confounded with the microsite error instead of being captured by the incomplete blocks. Fu et al. (1998) reported a similar result where blocks were more efficient in controlling for gradients than for patches. The variance components 2 rep and o2block(rep) for those surfaces with gradients (GRAD and ALL) explained a larger portion of the total variability than for PATCH surfaces. Also, for surfaces with gradients, the variance components for RC indicated that there was more variability among columns than rows, since the main environmental gradient was oriented along the short axis of the experiment. 4Tree row Surface: PATCH CR RCB IB4 IB8 1816 IB 32 RC STP Surface: GRAD 4Tree row Surface: GRAD CR RCB 1B4 1R 16 I 132 RC STPSurface: ALL CR RCB IB4 B 8 B116 B 32 RC 4Tree row Surface: ALL CR RCB IB4 I8 1I16 IR 32 RC Clone Block(rep) I Row(rep) I Rep I Col(rep) S Plot Error Figure 22. Average proportions of restricted maximum likelihood (REML) variance component estimates (after correction so that all components sum to one) for singletree and fourtree row plots for the no mortality case in all surface patterns and experimental designs simulated. STPSurface: PATCH The tendencies for 4tree were similar to STP except that the values of G2rep were smaller because replicates were 4 times larger and more heterogeneous. Also, the proportion of variance explained by incomplete blocks was greater for 4tree than STP for GRAD and ALL surfaces and smaller for PATCH. But, the sum of O2rep and o2block(rep) was very similar between plot types with slightly larger values for STP in PATCH. The plot effect (o2plot) was smaller for GRAD and larger for PATCH indicating that some of the variability of patches was confounded in the plot effect. For both plot types, smaller values of o2, were found on ALL surfaces which had both gradients and patches. Similarly, Fu et al. (1998) found increased efficiencies in the estimation of family means when high levels of gradients and patches occurred simultaneously. The variance components were very similar, comparing 25% mortality to 0% mortality, for all of the plot types, designs, and surface patterns (Tables Ai, A2, A3 and A4). The major impact of mortality was more variation in the variance component estimates. For G2clone, the standard deviation increased from 6.8% to 12.6%, while for G2s the increase was only 0.2% to 2.5%. This result agrees with findings reported by Fu et al. (1999b) that showed for clonal tests a slight decrease in the efficiency of IB over RCB under random mortality. However, in the present study the impact of mortality was small considering that 25% of the observations were missing, and it demonstrates the usefulness of the REML technique to estimate unbiased parameters efficiently. Correlations between True and Predicted Clonal Values Performance of the various designs and plot types for predicting clonal values was assessed by the correlation between true and predicted clonal values (CORR). For all surface patterns the best 4tree row plot design was less precise for predicting clonal values (lower CORR averages) than any STP design indicating that for these simulations the latter did a better job accounting for microsite variation (Figure 23). This finding agrees with results reported by LooDinkins et al. (1990) and Costa e Silva et al. (2001). The difference in average CORR values between the best and worst experimental designs was larger for 4tree than for STP, and this difference increased in surfaces that incorporated gradients (GRAD and ALL). So, while 4tree row plot designs were generally less efficient than STP, experimental designs have more impact on the efficiency of experiments established with multipletree plots. For STP, the best designs were RC followed closely by IB 32. On GRAD surfaces all the incomplete block designs behaved similarly, and on PATCH the RC performed best. For 4tree the best designs for GRAD and ALL were IB 8, IB 16 and IB 32; and R C was clearly inferior, while for PATCH the best designs were IB 32 and RC. STP 0% Mortality 4Tree row 0% Mortality 090 090 088 088 086 086 084 084 / 082 82 0 80 0 80 *"7' 078 078 * PATCH  PATCH 0 GRAD / 0 GRAD 076 ALL 076 ALL 074 I I I I 074 CR RCB IB4 IB8 IB16 IB32 RC CR RCB IB4 IB8 IB16 IB32 RC Figure 23. Average estimated correlations between true and predicted clonal values (CORR) obtained from 1,000 simulations for singletree and fourtree row plots in the 0% mortality case for all surface patterns and designs. The effect of mortality on the average correlations was almost identical for all designs, plot types, and surface patterns. CORR decreased approximately 4% from the loss of 25% of the observations (Table A5). Heritabilities for Simulated Designs As for variance components estimates, the average heritability calculated using Equations 27 and 28 returned the correct parametric value of HB = 0.25 for the CR reference experimental design (see Figure 24). For the majority of all other experimental designs, the estimated average heritabilities were always higher than 0.25 indicating that these designs were effective at reducing the residual variance. For STP, there was a considerable increase in heritability for RCB above CR in GRAD and ALL surfaces (Table A6 and Figure 24). For 4tree row plots, the greater increase occurred when the design changed from RCB to incomplete block designs. In STP, the design producing the highest average HR for any surface pattern was RC followed by IB 32. In the case of 4 tree, the best design was clearly IB 32. For the IB designs, larger heritability values were obtained as the number of incomplete blocks increased for all plot types and surface patterns indicating that smaller incomplete blocks were more efficient than larger blocks in controlling environmental variation. Similar conclusions were obtained from other simulation studies where between 5 and 10 plots (or trees) per incomplete block were recommended (Fu et al. 1999a). For this study, the best incomplete block design (IB 32) had only 8 plots per block for both STP and 4tree row plots. This block size is relatively small, and the use of fewer plots could produce greater heritabilities, but further studies are required. 4Tree row 0% Mortality CR RCB IB4 IB8 IB 16 IB32 RC CR RCB IB4 IB8 IB 16 IB32 RC Figure 24. Average heritabilities obtained for singletree and fourtree row plots in each surface pattern and design type for the no mortality case. The error bars correspond to the upper limit for a 95% confidence interval of the mean. STP 0% Mortality 4Tree row 0% Mortality 150 100 50 / so 50 0.1 2 0.3 0.4 0.5 0.6 0.1 0.2 0.3 0.4 0.5 0.6 HB2 H,2 Figure 25. Heritability distributions for the no mortality case in the surface pattern ALL for selected completely randomized (CR), randomized complete block (RCB), and rowcolumn (RC) designs in singletree and fourtree row plots. 036 034 032 030 M 028 I 026 024 022 020 STP 0% Mortality 034 032 030 I, M 028 I 026 024 022 020 STP 0% Mortality The comparison of the distributions of H2 estimates for different designs showed that as the number of incomplete blocks increased the distribution became asymmetric, with a longer tail and increased spread to the right (Figure 25). This result indicates that assuming a normal distribution for heritability can sometimes be incorrect. A ztest comparing the H, values for 0% and 25% mortality levels for designs of the same plot type and surface patterns was conducted to study if the heritability estimates changed. No statistically significant differences were found (experimentwise a = 0.05) among average HB2 estimates with the exception of a few comparisons within R C indicating that the analyses produced unbiased heritability estimates for all experimental designs and plot types. The main effect of mortality was the increase in the variance among HR estimates. Compared to 0% mortality, Var(HB2) increased 9.8% for STP and 12.0% for 4tree row plots for datasets with 25% missing trees. The minor effect produced by 25% missing values could indicate that 8 ramets per clone was more than necessary. Several authors recommend using between 1 and 6 ramets/clone for single site experiments (Shaw and Hood 1985, Russell and Libby 1986); hence, it is possible that under the conditions of the present study fewer than 8 ramets per clone might be adequate; nevertheless, this topic requires further study. Conclusions The results from these simulations indicate that proper selection of experimental designs can lead to considerable increases in heritability, precision of predicted genetic values and genetic gain from selection. The use of singletree plots instead of 4tree row plots resulted in an average increase in the correlation between true and predicted clonal values of 5%. Hence, singletree plots allow a more effective sampling of the environmental variation and reduce error variance more than 4tree row plots. The best experimental design for singletree plot experiments was the rowcolumn design which fits random effects for both rows and columns within each resolvable replicate. RC designs were followed closely by incomplete block designs with small blocks which were very efficient for STP. For 4tree row plots, an incomplete block design with 32 blocks per replicate is recommended. The use of incomplete blocks (in one or two directions) controlled for an important portion of the total environmental variability and produced unbiased estimates of genetic variance components and clonal values. The increase in the standard deviation of the average pairwise clonal comparison from the use of incomplete blocks was counteracted by the improved precision obtained in the statistical analysis. For singletree plots, the smallest incomplete block under study had 8 trees per block, and it is possible that smaller blocks could produce even greater improvements. For the different simulated surface patterns, the ranking using heritability or correlation from high to low was ALL, GRAD and PATCH. In the latter pattern some of the small patches were confounded with random error; hence, for this surface pattern lower heritability values should be expected. Twentyfive percent mortality produced only slight changes in the statistics studied. The consequences were primarily an increase in the variability of some variance components (variance of 2clione increases about 10%, and a2s about 1%) and, therefore, the variability of heritabilities increased. Thus, the effect of mortality was small, but might have been ameliorated by the relatively large number of ramets used per clone. 28 Lastly, the simulations from this study did not incorporate the effect of competition between neighboring trees; therefore, these results must be interpreted with caution. Trials with strong between tree competition, as occurs in older testing ages, and with large mortality patches, might produce different results. CHAPTER 3 ACHIEVING HIGHER HERITABILITIES THROUGH IMPROVED DESIGN AND ANALYSIS OF CLONAL TRIALS Introduction For any operational breeding program, genetic testing constitutes one of the most important and expensive activities. Several alternatives are available for experimental designs. The widely employed randomized complete block design (RCB) is most effective when blocks (or replicates) are relatively uniform, something that usually occurs only with small replicates (Costa e Silva et al. 2001). With large numbers of treatments (e.g., families and clones) the use of incomplete blocks (IB) can increase efficiency considerably (Fu et al. 1998; Fu et al. 1999a), particularly when there are large amounts of environmental variability or when the orientation of replicates can not be correctly specified (Lin et al. 1993). Also, the row and column positions of the experimental units can be utilized to simultaneously implement two blocking factors and produce row column designs (RC), as described in detail by John and Williams (1995, p. 87). These designs have demonstrated greater efficiencies than other common designs (Lin et al. 1993; Williams and John 1996). Latinization is rarely used as a design technique for increasing the efficiency of estimating treatment effects. A design is said to be "Latinized" when the randomization is restricted so that the position of the experimental units for the same treatment are forced to sample different areas of the experimental area. This is accomplished by defining long blocks (such as row or columns) that span multiple replicates ensuring that the treatments are spread out across the entire test site (John and Williams 1995, p. 8788). The increased interest in clonal forestry is generated by the higher genetic gains that can be obtained from using tested clones for deployment to operational plantations. Some benefits include the possibility of capturing nonadditive genetic effects, using greater amounts of genotype x environment interaction, and increasing plantation and product uniformity (Zobel and Talbert 1984, p. 311). Field experiments for clonal testing are particularly challenging because large numbers of genotypes need to be evaluated, implying the need for large test areas. Relatively homogenous areas are difficult to find because forest sites tend to have high environmental variability usually expressed in the form of patches, gradients or both, together with considerable random microsite noise (Costa e Silva et al. 2001). Therefore, site heterogeneity is an important factor to consider when clonal trials are designed, implemented and analyzed. Selection of an appropriate experimental design together with a correct specification of the linear model could produce considerable improvements in the precision of predicted genetic values, heritabilities and gains from selection. Several recommendations are available in the literature for clonal testing, and in general, optimal designs would employ 1 to 6 ramets per clone per site, and the number of clones tested would be maximized while utilizing as few ramets as possible (Shaw and Hood 1985; Russell and Libby 1986; LooDinkins et al. 1990; Russell and LooDinkins 1993). Still, difficulties remain in defining optimal designs, numbers of families, clones per family and ramets per clone to be planted on single or multiple sites. Also, it is not clear which analytical methods are to be preferred. In Chapter 2, we simulated singlesite clonal trials under several experimental designs and 3 different environmental surface patterns to identify appropriate conditions for estimating genetic parameters. That study showed that singletree plots (STP) experiments were more efficient for predicting clonal values than fourtree row plots across all designs and conditions simulated. This new study considers only STP experiments and aims to determine the effects of different designs and analytical options on the prediction of clonal values and the precision of some genetic components. In particular, the objectives are to determine 1) which experimental designs, including Latinization, maximize broadsense heritability and, hence, gain from clonal selection? 2) which patterns of environmental or spatial variability yield high or low heritabilities? 3) what the effects of using different number of ramets per clone are? and 4) how close Dickerson's approximate method for confidence intervals is to the empirical estimates? Materials and Methods Field Design and Simulation The simulated datasets used in this study are based on a single site trial of 2,048 ramets planted in a rectangular grid of 64 rows and 32 columns with 8 ramets for each of the 256 clones arranged in singletree plots with no missing observations (details in Chapter 2). Briefly, two factors were considered in simulating the environments under which the different experiments were performed a gradient generated with a polynomial function depending on the x and y coordinates of the grid; and patches that were modeled by incorporating a covariance structure based in a firstorder separable autoregressive process or AR1OAR1 (Cressie 1993; Gilmour et al. 1997). The polynomial function employed was t = a(x Y) + P+(xc 2y +xl 2) 31 where xc and yci correspond to the centered values xc = xi x and yci = yi y for the ith tree located in column x and row y; a and representt fixed weights on linear and quadratic components, respectively. The AR1OAR1 error surface employed two perpendicular correlations (px and p,) and was generated through an error variancecovariance matrix defined as Var (e) = cs + cms for diagonal elements 32 Cov (e,, e,) = sphxpyhy for offdiagonal elements 33 where hx = I xi xi' 1, hy = I yi yi' 1, i.e., the absolute distance between two trees in the row and column position, respectively. The variance component c2s describes the surface error or structured residual, and 2yms corresponds to the microsite random error, white noise or unstructured residual. The three surface patterns simulated included only patches (PATCH), only gradients (GRAD), and a combination of both (ALL). Each surface was generated by drawing at random the parameters a, /, px, py and 2yms from uniform distributions with the following ranges: 0 to 0.05 for a, 0 to 0.0005 for /, 0.01 to 0.99 for px and py, and 0.15 to 0.60 for 2Ms. Without loss of generality, the total variance for all simulated datasets was fixed at 1, and the genetic structure was set to have a single site biased broadsense heritability of 0.25; hence, G2clone = 0.25 and C2y + C02m = 0.75. The experimental designs studied were completely randomized (CR), randomized complete block with 8 resolvable replicates (RCB), a variety of incomplete block designs with 4, 8, 16 and 32 incomplete blocks within each resolvable replicate (IB 4, IB 8, IB 16 and IB 32, respectively), and a rowcolumn design (RC). Both nonLatinized and Latinized designs were obtained for IB and RC designs using the software CycDesigN (Whitaker et al. 2002). One thousand simulations were generated for each combination of surface pattern, design type and Latinization option, for a total of 36,000 datasets. Statistical Models All the simulated experiments were analyzed using the software ASREML (Gilmour et al. 2002) fitting the models specified in Tables 31 and 32 for nonLatinized and Latinized design options, respectively. For each set of simulations, the restricted maximum likelihood (REML) variance components estimates were obtained, and single site individual heritabilities for each fitted model were estimated using the following expression: H"= " 34 2 clone 3 e H 2cone + j2e Table 31. Linear models fitted for nonLatinized experimental designs over all surface patterns. All model effects other than the mean were considered random. Design a Model b CR yj = p + clone, + r1 RCB y, = p + rep, + clone, + r, IB Yijk = i + rep + block(rep), + clonek + Eik RC Yijk1 = i + rep + col(rep), + row(rep),k + clones + Eijkl a CR, complete randomized; RCB, randomized complete block; IB, incomplete block; RC, rowcolumn. b clone, clone effect; rep, resolvable replicate; block(rep), incomplete block nested within replicate; col(rep), column nested within replicate; row(rep), row nested within replicate; e, residual. Table 32. Linear models fitted for Latinized linear experimental designs over all surface patterns. All model effects other than the mean were considered random. Design a Model b IB Yijkl = 1i + rep, + lat, + block(rep)lk + clone1 + Eljk RC Yijklmn = .t + repi + lcolj + lrOWk + col(rep),i + row(rep)lm + clone + Sijklmn a IB, incomplete block; RC, rowcolumn. b clone, clone effect; rep, resolvable replicate; lat, Latinized block; block(rep), incomplete block nested within replicate; Icol, Latinized long column; Irow, Latinized long row; col(rep), short column nested within replicate; row(rep), short row nested within replicate; e, residual. To estimate an unbiased empirical expected heritability for each design and surface, the ratio of the mean of the numerator over the mean of the denominator was used (see Equation 34). Also, the relative efficiency (RE) was calculated as the ratio of means of the heritability of a particular design divided by that of an RCB design of the same surface pattern. In this study, Dickerson's method was used to obtain approximate 95% confidence intervals (CI's) on individual heritability estimates. Approximations are required because heritabilities correspond to a ratio between linear functions of random variables, and in general, and particularly for unbalanced data, a closed form expression does not exist to estimate confidence intervals (Dieters 1994, p. 4). Assumptions and procedures suggested by Dickerson (1969) approximate the variance of H B for each data set as (f 2 )= Vr (j 2 clone) 35 Var(H ) = r one) 35 k[ clone + 2 e]2 where the numerator Var(j2 clone) corresponded to the variance of a variance component obtained from ASREML based on asymptotic theory of restricted maximum likelihood estimation (Searle et al. 1992, p. 473). Then, Dickerson's 95% confidence intervals for Hi2 were calculated as HE2 1.96 x std(H2) where std(HB2) is the square root of the variance expression from Equation 35. For comparison, empirical or percentile CI's were obtained by finding the top and bottom 2.5% estimated Hf 2 from the 1,000 individual heritability estimates from each fitted set of simulations. CI's were calculated for each experimental design and environmental surface. The magnitude of fj 2 is affected by different levels of the parameters used to generate the error surfaces (a, ,/, px, py and Cms). Influences of these parameters were studied using heritability estimates obtained from the nonLatinized designs and models fitted without Latinization through the use of simple group means and by producing smoothed threedimensional surfaces using the loess smoother (Cleveland 1979). Two types of comparisons were made to study the efficiency of Latinization. First, the effectiveness of Latinization in the design phase alone was tested using a ztest with the estimated heritabilities that were generated with and without Latinization analyzed with linear models that did not include these Latinization effects (i.e., only models from Table 31). These analyses correspond to PartialLAT (indicating Latinization in design but not in analysis) and NoLAT (no Latinization for both design and analysis). Second, to examine the importance of including Latinization in the analysis phase, Latinized experimental designs were fitted with and without Latinization effects in their respective linear models. These analyses were identified as FullLAT (indicating Latinization in design and analysis) and PartialLAT, and the likelihood ratio test for mixed models described by Wolfinger (1996) was used to compare these fitted models. Finally, the effects of using different numbers of ramets per clone on the efficiency of estimating variance components and predicting clonal values were studied by selecting subsets of each original simulated dataset and fitting their corresponding linear models. The first 500 simulations of three designs (RCB, IB 32 and RC) previously generated for the nonLatinized designs were analyzed by selecting at random 2, 4, 6 and 8 contiguous replicates and using the linear models from Table 31. Comparisons were preformed using both individual broad sense heritability and mean clonal heritability (H,2) from each subset. The Hfi2 formula used was '2 T2 clone 36 cc 36 0 clone +T e I 1m where the variance components are as previously described, and m is the number of ramets (2, 4, 6 or 8). As with Equation, 34 an unbiased mean was estimated as the ratio of the mean of the numerator over the mean of the denominator. In the present study comparisons of methods and designs were made using heritabilities only. Genetic gains are also of interest, but they were not used here because for a given intensity of selection and fixed phenotypic variance, gains are directly proportional to heritability (Falconer and Mackay 1996, p. 189). Results and Discussion Heritability Estimates and Confidence Intervals The RC design produced the highest average individual heritability and relative efficiency (RE) for all surface patterns (Table 33 and Figure 24). This was particularly true for surface pattern ALL where both gradients and patches were simulated together. For IB designs, heritability increased as the size of the incomplete block decreased for all surfaces, and IB designs with 32 incomplete blocks per replicate (8 trees per block) were nearly as efficient as RC designs for all surfaces. In general, for all designs higher relative efficiencies were found for surfaces with simulated patches (ALL and PATCH), indicating the usefulness of one or twoway blocking to account for them. These results agree with other studies that also reported greater efficiencies for IB and RC designs (Lin et al. 1993; Fu et al. 1999a; Qiao et al. 2000). Also, higher average heritabilities were accompanied by an increase in their standard deviation (Table 33 and Figure 25). For example, in PATCH surfaces for RC designs, the standard deviation was more than double that of RCB designs. Hence, the implementation of genetic experiments with RC and IB 32 designs is recommended; nevertheless, they have the drawback of yielding more variable heritability estimates. Table 33. Average individual singlesite broadsense heritability with its standard deviations in parenthesis and relative efficiencies calculated over completely randomized design for experimental designs and linear models that were not 2 Latinized (NoLAT). Note that the parametric H2 was established for CR design as 0.25. HE2 RE Design a ALL b GRAD PATCH ALL GRAD PATCH 0.250 0.252 0.251 CR 0.84 0.86 0.98 (0.026) (0.025) (0.024) 0.296 0.291 0.256 RCB 1.00 1.00 1.00 (0.033) (0.030) (0.026) 0.309 0.305 0.262 IB 4 309 0305 1.04 1.05 1.02 (0.040) (0.033) (0.030) 0.317 0.307 0.269 IB 8 1.07 1.05 1.05 (0.045) (0.034) (0.034) 0.323 0.308 0.273 IB 16 1.09 1.06 1.07 (0.049) (0.035) (0.038) 0.328 0.308 0.284 IB 32 1.11 1.06 1.11 (0.055) (0.035) (0.044) RC 0.336 0.312 0.287 RC 1.13 1.07 1.12 (0.060) (0.035) (0.052) a CR, complete randomized; RCB, randomized complete block; IB x, incomplete block with x blocks per replicate; RC, rowcolumn. b ALL, surfaces with patches and gradients; GRAD, surfaces with only gradients; PATCH, surfaces with only patches. In most cases, Dickerson's approximation and the percentile CI's obtained directly from simulations gave similar estimates for the 95% confidence intervals (Figure 31). Both methods were particularly close for the RCB designs for all surface patterns, and for all designs in the GRAD surfaces. In contrast, Dickerson's approximation almost always yielded smaller CI's than the percentile intervals for surfaces PATCH and ALL, and the upper limit of the CI's was particularly underestimated for RC and IB designs on these surfaces. This was a consequence of Dickerson's approximation assuming a symmetrical distribution when, in fact, the simulated H 2 distributions were skewed to the right (see Figures 25 and 34). Surface: PATCH 0 25 0 20 0 15 0 50 0 45 0 40 0 35 0 30 0 25 0 20 0 15 0 50 0 45 0 40 CR RCB IB 4 IB 8 IB 16 IB 32 RC Surface: GRAD S00 0l CR RCB IB 4 IB 8 IB 16 IB 32 RC Surface: ALL jilt CR RCB IB 4 IB 8 IB 16 IB 32 RC 0 Simulation 4 Dickerson's Figure 31. Plots of means and 95% confidence intervals for estimated individual broad sense heritabilities obtained from simulation runs and by the method proposed by Dickerson (1969) for all design types and surface patterns for non Latinized designs and analyses (NoLAT). Average heritabilities correspond to the central points in each confidence interval, and all simulations involved 8 ramets per clone. : Iii     H2 Surface PATCH 00 02 04 06 08 10 00 02 04 06 08 10 PX Px H2Surface GRAD H2Surface GRAD 00005 0 8 001 002 003 H2 Surface ALL 001 0 02 0 03 0 04 0 05 H2 Surface ALL 00 02 04 06 PX I 02 08 10 00 02 0 25 0 30 0 35 0 40 0 45 0 50 04 06 08 10 Figure 32. Contours for individual heritabilities obtained by using the lowess smoother (Cleveland 1979) of rowcolumn design for different surface parameters from fitting experimental designs and analyses that were nonLatinized (NoLAT). 0 0004  0 0003 0 0002 0 0001  0 0000  0 00 H2Surface PATCH The ideal situation would be to derive the exact probability distribution for heritability, but this is very complex and usually only possible for simple linear models and balanced datasets. Better approximate confidence intervals than Dickerson's can be obtained using Taylor series, which improves the estimation of the heritability variance (more details in Dieters 1994, p. 1720); nevertheless, methods based on Taylor series do not correct the problem of skewness. Another option is to use bootstrap and jackknife techniques to approximate distributions by resampling. These methods have been implemented successfully for specific situations with heritability (Ndlovu 1992) and genetic correlations estimates (Liu et al. 1997), but their statistical properties are not well understood. Surface Parameter Behavior The simulation parameters (a, px, py and C2ms) used to generate the three surface patterns varied considerably and differentially influenced the magnitude of the individual heritability estimates. Gradients (or trends) on simulated surfaces were specified by a and ,8, here, larger values corresponded to steeper gradients. These parameters, in most cases, had little impact on heritability estimates; with larger values of a and /, only small increases in heritabilities were found. Also, as expected, the effect of these gradient parameters was more pronounced in GRAD surfaces (for RC see Figures 32c and 32d). It is well known that smaller amounts of unstructured residual increase the efficiency of analyses of field experiments. For this study, an increase in heritabilities for all designs and surface patterns was noted as the simulated 02ms decreased. This increase was more pronounced for designs with smaller incomplete blocks, where some 2 2 heritability estimates reached values as high as 0.50. Also, increases in H,2 as the 2ms got smaller were more pronounced in surfaces with patches (ALL and PATCH). For the simulated surfaces, using larger spatial correlation parameters (px and py) produced bigger patches. The results of this study indicated that, in general, for all experimental designs there was an improvement in heritabilities as the patch size increased. The effect of the spatial correlation parameters was larger for IB and RC designs than for RCB designs. Also, an interaction between these parameters and &2ms was observed. For example, in RC designs, high values of spatial correlation and low values of white noise yielded larger heritabilities, where the effect of spatial correlations was more pronounced at low values of O2ms (Figures 32b and 32f). In another simulation study Fu et al. (1998) reported that increasing patch size in RCB and IB designs gave smaller variance of family mean contrasts, a result that agrees with our findings. Nevertheless, they also reported that the presence of gradients produced a reduction in precision, which was more pronounced for RCB designs. In the present study, the effect of gradient was almost negligible, but more relevant for those designs that had smaller incomplete blocks. The disparity of results is probably a consequence of using different block sizes. Fu et al. (1998) considered a long rectangular block (10 x 1 tress) perpendicular to the main gradient; therefore, increasing its effect, but in the present study blocks were always considered square or almost square. The larger effect of px and py on IB and RC designs in comparison to RCB occurs because small incomplete blocks were more likely to be completely contained in a larger patch; therefore, reducing the within block variability and explaining a larger portion of the total variance in comparison to using only large blocks or replicates. To examine this fact, the average individual heritabilities for subsets of simulations with low spatial correlations (px < 0.3 and py < 0.3) and large correlations (px, > 0.7 and py > 0.7) were compared. For all surface patterns, within the set of low correlations, no important differences were noted between all designs, but in the case of high correlations, an important increase in average heritability occurred as the size of the incomplete block decreased. For example, in PATCH surfaces, the average heritability between IB 4 and IB 32 was fixed at 0.25 for low correlations, in contrast with a change from 0.29 to 0.35 for the high correlations set (Figure 33). Thus, the use of smaller incomplete blocks should be preferred for field experiments under the presence of moderate to strong spatial correlations. 0.40  0.38  0.36  0.34  0.32  0.30  0.28  0.26  0.24 CR RCB IB4 IB8 IB16 IB32 RC Figure 33. Average individual heritability estimates for ALL and PATCH surface pattern for simulated surfaces for subsets of simulations with High (px > 0.7 and py > 0.7) and Low (px < 0.3 and py < 0.3) spatial correlations. Latinization A ztest was conducted to compare average H2 values for all designs and surface patterns generated from simulations with and without Latinization but always fitted without the Latinization random effects (i.e., PartialLAT versus NoLAT). This test indicated significant differences (experimentwise a= 0.05) for all IB and RC designs on PATCH surfaces only. In this surface pattern, Latinized designs produced an increase up to 0.04 units of heritability, which tended to be larger for simpler designs. It was expected that the incorporation of a restricted randomization, as occurs with Latinized designs, would yield smaller standard deviation for HB: and for the variance component o2block(rep) (Williams et al. 1999); but when PartialLAT was compared to NoLAT very small reductions in heritability standard deviations were found and changes in coefficient of variation were not relevant (for H2: see Table A7). The other comparison considered only simulated datasets originating from Latinized experiments, which were fitted using linear models with and without Latinization effects (i.e., FullLAT and PartialLAT). The results from the likelihood ratio test indicated statistical significant differences (experimentwise a= 0.05) between these two models for all designs and surface patterns with the exception of designs IB 4 and IB 8 for PATCH surfaces. Better model fitting was found on FullLAT, but differences in terms of average heritability, clonal and error variance were very small and not of practical relevance. Also, the largest improvements of including Latinization effects occurred in GRAD surfaces or with IB 32 designs in any surface. The above findings agree with what Qiao et al. (2000) reported for the analysis of several Latinized RC designs, where greater efficiencies were reported in those analyses that included the Latinized effects in the linear model. In summary, once a design is generated with Latinization, the incorporation of its corresponding random effects is important. The small benefits of Latinization found in this study in the design and analysis phase were not important in practice, but could possible be a consequence of the large sample size used per clone (8 ramets). Nevertheless, the use of Latinization in the design stage is recommended to help control potential undetected environmental biases. Number of Ramets per Clone The variance components estimated through REML for all experimental designs and surface patterns were very similar for datasets with different number of ramets per clone. Hence, even with reduced amounts of information, REML yielded unbiased variance component estimates and best linear unbiased predictions of genetic values. The main impact of larger experiments (i.e., more ramets per clone) was a decrease in variability or improved precision on variance component estimates. The greatest overall reduction occurred when the number of ramets changed from 2 to 4, and was more pronounced for PATCH surfaces. For example, within RC designs the coefficient of variation for the clonal variance component on average in all surfaces was reduced from 22.5% for analyses with 2 ramets to 12.0% for the case with 8 ramets (Table 34). As with variance components, for the same design type and surface pattern average heritability estimates changed very little between datasets with different numbers of ramets. Nevertheless, HB distributions for any design type and surface pattern showed a decrease in range (and variance) as the number of ramets increased (for RC design see Figure 34). Table 34. Coefficient of variation (100 std / x) for the estimated variance component for clone and error on 3 surface patterns, different number of ramets per clone and 3 selected experimental designs. Clonal Variance ramets / clone Design a Surface b 2 4 6 8 RCB ALL 21.5 % 14.4 % 13.1% 12.0 % GRAD 23.6 % 14.6 % 12.7 % 11.7 % PATCH 24.5 % 15.4 % 12.7 % 11.8 % IB 32 ALL 21.5 % 14.5 % 12.8 % 11.7 % GRAD 21.3 % 14.4 % 11.9 % 11.3 % PATCH 25.3 % 15.7 % 13.6 % 12.7 % RC ALL 21.5 % 15.3 % 13.0 % 12.1% GRAD 21.2 % 14.3 % 12.6 % 11.9 % PATCH 24.8 % 15.0 % 13.2 % 12.0 % Error Variance ramets / clone Design Surface 2 4 6 8 RCB ALL 15.0% 12.2% 11.0% 9.8% GRAD 15.9% 11.1% 9.9% 8.9% PATCH 10.5% 7.1% 5.0% 4.6% IB 32 ALL 22.0% 20.3% 20.0% 19.6% GRAD 14.9% 12.6% 12.3% 12.2% PATCH 17.1% 15.3% 15.0% 14.7% RC ALL 22.5% 21.3% 20.8% 20.5% GRAD 14.1% 11.9% 11.6% 11.6% PATCH 18.9% 17.8% 17.3% 17.2% a RCB, randomized complete block; IB 32, incomplete block with 32 blocks per replicate; RC, row column. b ALL, surfaces with patches and gradients; GRAD, surfaces with only gradients; PATCH, surfaces with only patches. Surface: ALL 160 140 120 > 100 3 80 LL 60 40 20 0 Surface: PATCH 160 140 120 > 100 a, 3 80 1 60 40 20  0 / 0.0 0.1 Figure 34. Distribution of individual broadsense heritability estimates (H2 ) for row column designs with 2 and 8 ramets per clone obtained from 500 simulations each for the surface patterns ALL and PATCH from fitting experimental designs and analyses that were nonLatinized (NoLAT). 020 018 \ % 016 o > 014 5 012 I w 010 008  C 0 06 004 8 24 ramets / clone 46 ramets / clone Figure 35. Average values for rowcolumns design in all surface patterns from fitting experimental designs and analyses that were nonLatinized: a) Clonal mean heritabilities (the whiskers correspond to the 95% confidence intervals); and b) Clonal mean heritability increments. 0.2 0.3 0.4 0.5 0.6 0.7 HB2 0.3 C HB2 For all studied designs, as the number of ramets increased considerably higher clonal mean heritabilities (H/2) were found; nevertheless, incremental improvements above 6 ramets per clone were small. For RC designs Hf2 averaged over all surfaces for 2, 4, 6 and 8 ramets per clone were 0.44, 0.62, 0.70 and 0.76, respectively (Figure 35a); and the largest change occurred when the number of ramets changed from 2 to 4. Notably, these increments were always larger in PATCH surfaces than any other surface (see Figure 35b). Hence, the results from these simulations (that considered no missing values) indicated that the optimal number of ramets per clone per site should be between 4 and 6. This number of ramets agrees with results obtained in other studies that recommend between 1 and 6 ramets per clone per site (Shaw and Hood 1985; Russell and Libby 1986), and will produce near maximal genetic gains for clonal selection and will allow alternative uses of available resources (for example, increasing the number of clones tested). Conclusions and Recommendations Appropriate selection of experimental designs can yield considerably improvements in clonal testing. Rowcolumn and incomplete block designs with a block size of 8 trees had the higher individual heritability estimates. Unfortunately, these two designs yielded more variable heritability estimates than simpler designs. Approximate 95% confidence intervals on heritability estimates obtained using Dickerson's method were similar to empirical percentile CI's. These methods were particularly close for simple designs (CR and RCB), but Dickerson's approximation presented a bias in the upper confidence limit of more complex designs (IB and RC). The examination of the simulation parameters used to generate the surfaces patterns had larger individual heritabilities when the amount of white noise or unstructured residual 2ms was smaller and the spatial correlations larger, with a greater effect of the latter at low levels of C2ms. Also, the effect of different levels of gradients on Hf2 tended to be almost nonexistent, affecting slightly GRAD surfaces only. In the present study, the use of Latinized designs improved the experimental design efficiency mainly for PATCH surfaces, but the benefits of designing an experiment with Latinization were not important in practical terms. On the other hand, once an experiment is designed with Latinization, the inclusion of its corresponding effects in the linear model yielded better analyses. As expected, experiments with more ramets per clone produced more precise variance component estimates and larger clonal mean heritabilities. Using 4 to 6 ramets per clone per site is recommended. More than 6 ramets produced marginal improvements in precision of clonal means, but for clonal forestry it is important to screen large numbers of clones in one test. Finally, it is important to note, that these findings were produced analyzing datasets that did not have missing values. Nevertheless, in other related studies (see Chapter 2 and Fu et al. 1999b) the effects of mortality were small. Therefore, these results can be extended safely to situations with low mortality levels. CHAPTER 4 ACCOUNTING FOR SPATIAL VARIABILITY IN BREEDING TRIALS Introduction The amount and type of environmental heterogeneity found in agricultural or forestry trials greatly influences the statistical precision obtained in the comparison of treatments. As the homogeneity decreases, the error variance of the treatment effect estimates increases and, consequently, it is harder to detect differences among treatments. Many natural and anthropogenic factors affecting site variability, such as soil, topography, wind, machinery or planting technique are usually detected after test establishment, making it difficult to implement optimal designs and to minimize the portion of the experimental error due to site variation (Fu et al. 1999). This environmental or spatial variability usually is expressed as gradients, patches or a combination of both, together with random microsite differences (Costa e Silva et al. 2001). Proper treatment randomization is enough to ensure that, over repeated testing and sampling, unbiased estimates of treatment effects are obtained, and hypothesis testing is valid (Grondona and Cressie 1991; Brownie et al. 1993). Nevertheless, under environmental heterogeneity, the estimates obtained for the analysis of a particular dataset are highly variable. This situation can be improved by the use of proper experimental designs and statistical analysis. Randomized complete block, incomplete block and rowcolumn designs constitute classical approaches that control for spatial variability and are recognized as "a priori" techniques because they are implemented in the design stage. Another group of techniques ("a posteriori ") deals with statistical analyses that incorporate the x and y coordinates of the experimental units (plots, trees or plants) in the linear model to account for physical proximity. The majority of statistical analyses for spatial data are based on modeling two main components of spatial variability (Grondona et al. 1996): 1) global trends or largescale variation; and 2) local trends or smallscale variation. The first component corresponds to field gradients which can be modeled by: i) incorporating designs effects (e.g., replicate, block, row, column, etc.); ii) using x and y positions as one or multiple fixed effects in the linear model to describe continuous mathematical functions; or, iii) calculating covariates to adjust each observation. Some of the most common options are 1) polynomial regression (Brownie et al. 1993); 2) cubic smoothing spline (Durban et al. 1999; Verbyla et al. 1999); 3) nearest neighbor analysis (Bartlett 1978; Brownie et al. 1993; Vollmann et al. 1996); and 4) moving average (TownleySmith and Hurd 1973). Of particular interest are the nearest neighbor techniques, for which there are multiple variants. Nevertheless, the majority corresponds to modifications of the Papadakis method (Atkinson 1969) which is based on the use of one or more covariates obtained from averaging residuals of neighboring plots; these residuals are obtained from fitting a completely randomized or randomized complete block design. The original method uses the residual average of one plot to the right and one to the left; thus, known as EW adjustment. Several modifications are available which consider different definitions of the covariates with 2, 4, 8 or even 24 adjacent plots (Brownie et al. 1993; Stroup et al. 1994; Vollman et al. 1996). Also, Bartlett (1978) suggested performing an iterated Papadakis method that recalculates residuals iteratively. A method similar to Papadakis is based on the original measurements and not on residuals is the Moving Average analysis. Here, a "local" mean is calculated from a number of adjacent plots and used as a covariate (TownleySmith and Hurd, 1973). The second component of spatial variability (local trends), frequently identified as multiple patches on the field surface, can be modeled through the specification of an error structure that takes into account some form of spatial correlation produced by the physical proximity among experimental units. Most of the techniques for modeling global trends (as presented above) assume that residuals or errors from the linear model are independent and identically distributed with a common variance. Under spatial correlation, the error variancecovariance matrix has nonzero offdiagonal elements, which are assumed to be a function of the distance among experimental units. Several correlated error models are available (Cressie 1993; Littell et al. 1996), allowing substantial flexibility in assumptions and covariance structures; nevertheless, different error structures tend to yield similar results (Zimmerman and Harville 1991). The firstorder separable autoregressive error structure has been used successfully in several agronomic and forestry trials (Grondona et al. 1996; Gilmour et al. 1997; Qiao et al. 2000; Sarker et al. 2001; Costa e Silva et al. 2001; Dutkowski et al. 2002). This pattern considers two perpendicular correlations (one for row and the other for the column direction), and is equivalent to the separable exponential geostatistical model. Sometimes a nugget parameter is included in the error structure to model the expected variability that occurs if repeated measurements were made exactly at the same location (Cressie, 1993) allowing modeling of potential microsite variability or measurement error. Finally, it is possible to fit global and local trends simultaneously for which multiple combinations of techniques exist. Because many of these options yield similar results, it is more difficult to select an adequate model, and the use of some statistical and graphical tools (e.g., sample variograms) seems particularly useful (Gilmour et al. 1997). Field testing of varieties is critical for the progress of genetic improvement programs, and is an expensive and time consuming activity. Therefore, allocation of resources and the use of the best available statistical techniques to obtain precise estimation of genetic parameters, such as heritability and breeding values, are critical. Due to the large number of varieties or treatments to be studied relatively large test areas are required. These areas usually have high amounts of environmental heterogeneity that inflate the residual error variance. Hence, controlling for this variability is important, and requires the use of more sophisticated experimental designs, or the implementation of spatial statistical analyses. The goal of this study is to quantify efficiencies of implementing a range of "a priori" and "a posteriori" statistical techniques to account for spatial variability, and also to discriminate among these techniques for parsimony and global performance for a broad set of conditions. In particular, the selected array of techniques is compared to improvement in the precision for comparing and predicting treatment effects. This is achieved through simulations of field trials with sets of unrelated genotypes (or treatments) tested on a single site with environments having different patterns of heterogeneity (only patchy, only gradient, and both types). These patterns were generated using a polynomial function for the gradients and an AR1OAR1 error structure for patches. The selected techniques included 1) modeling of global trends through the use of traditional experimental designs (complete randomized, randomized complete block, incomplete block and rowcolumn) or polynomial models; 2) fitting AR1AR1 error structure (with and without nugget) individually or in combination with some of the above specifications of global trends; and 3) implementation of several variants of Papadakis and Moving Average methods. Materials and Methods Field Design and Simulation The simulations were based on a single trial of 2,048 plots arranged on a contiguous rectangular site of 64 rows and 32 columns with square spacing and no missing observations. For each site a total of 256 treatments genotypess or varieties) with 8 replicates each were tested in plots composed by single plants. The linear model used to generate the simulated data for the response variable was Y ijk = P+ gk+ Es(jk) + Ems(lJk) 41 where yijk is the response of the plant located in the ith row and jth column of the kth treatment, u is a fixed population mean which was set equal to 10 units, gk is the random treatment effect, Es(ijk) is the surface error (or structured residual) and Ems(ijk) corresponds to the microsite random error (or unstructured residual sometimes called nugget). The total variance for the above linear model is C2T = C2g + C2s + c2ms. For simplicity, but without loss of generality, y2T was fixed to 1, and the variances for all surfaces were set to 02, = 0.25 and 2s + c2ms = 0.75. The base experimental design simulated corresponded to an incomplete block design with 32 blocks in each of the 8 resolvable replications in an alattice design (John and Williams 1995, p. 68), which had previously shown to be superior to completely randomized and randomized complete block designs (Chapter 2). This experimental design was implemented in three surface patterns: only patches (PATCH), only gradients (GRAD), and both types (ALL). These simulated error surfaces included two main components: gradients and patches. The following third degree polynomial function using the x and y coordinates of each plot was used to generate the gradients: t' = a (x +Y,) +y+3(xc 2y + xcly ) 42 where xc and yci correspond to the centered position values for the ith plot (i.e., xci = xi  x) located in column x and row y. Patches were generated using a firstorder separable autoregressive error structure (AR1OAR1) based on two perpendicular spatial correlations (px and py) with an error variancecovariance matrix defined as Var (e) = cs + cms for diagonal elements 43 Cov (e,, e,) = os pxhxp hy for offdiagonal elements 44 where hx = I xi xi' 1, hy = I yi yi' , i.e., the absolute distance in the row and column position, respectively, between two plots. The simulation consisted of the generation of independent surfaces over which different experimental designs were superimposed. Each surface was produced by selecting 5 parameters at random from uniform distributions (a, /f, px, py and c2ms). The parameters were restricted to the following ranges: 0 to 0.05 for a and 0 to 0.0005 for /Y, 0.01 to 0.99 for the correlation parameters px and py, and 0.15 to 0.60 for 2 Ms. Additionally, the two correlations were restricted so their absolute difference was smaller than 0.85 and C2y was calculated from C2y = 0.75 G2Ms. Several layouts of the incomplete block design were generated using the software CycDesigN and randomly selected (Whitaker et al. 2002). For each surface pattern 1,000 simulations were generated and stored for further analysis. More details of the simulation process can be found in Chapter 2. Statistical Analysis For the prediction of treatment effects and estimation of other parameters of interest four sets of statistical analyses were considered based on modeling: 1) only global trends; 2) only local trends; 3) both components simultaneously; and 4) nearest neighbor methods. All linear models were fitted using the software ASREML (Gilmour et al. 2002), which estimates fixed effects, variancecovariance parameters and predicts random effects by REML (Patterson and Thompson 1971). For those analyses that considered only global trends it was assumed that the errors were independent and identically distributed; therefore, no spatial error structure was considered. To model the global trends or mean structure, classical experimental designs and explicit trend modeling were implemented. The classical designs or models corresponded to a completely randomized (CR), randomized complete block (RCB), the original incomplete block design with 32 blocks (IB) which was employed to simulate the data, and a superimposed rowcolumn (RC) design that consisted of short rows and columns within each replicate. All effects with the exception of the overall mean were considered random (Table 41). The original alattice design did not consider row and column effects; nevertheless, it is of interest to include this type of analysis because the short rows and columns allow for some form of spatial modeling (Lin et al. 1993). Explicit trend modeling considered a set of continuous variables to describe polynomial regressions. The independent variables used in the study corresponded to the x and y coordinates of the individual plots, which were assumed fixed. The models used are specified on Table 42. It is important to note that the model FullPoly was the one used to generate the gradients in the error surface simulations (see Equation 42). For modeling the local trends two variants of the AR1AR1 error structure were fitted with and without nugget effect. The variant with nugget is specified by Equations 43 and 44; and for the case of no nugget, the only difference is that the variance component C2 ms was assumed to be zero. These two variants were fitted alone or in combination with different forms of global trends: 1) the classical models specified in Table 41; and 2) the polynomial regressions from Table 42. Table 41. Linear models fitted for simulated datasets using classical experimental designs to model global trends: complete randomized (CR), randomized complete block (RCB), incomplete block design with 32 blocks (IB), and row column (RC) design. All model effects other than the mean were considered random. Model a CR Yj = P + g, + +1 RCB yj = p + rep, + gj + ~1 IB yjk = + rep + block(rep), + gk + Fk RC Yjkl = + rep1 + col(rep), + row(rep)lk + gl + Flkl a g, treatment effect; rep, resolvable replicates; block(rep), incomplete block nested within replicate; col(rep), column nested within replicate (short column); row(rep), row nested within replicate (short row); and E, residual. Table 42. Linear models fitted for simulated datasets using polynomial functions to model global trends: linear (Linear), quadratic (RedPoly), and quadratic model with some interactions (FullPoly). All variables are assumed fixed and the treatment effect is random. Model a Linear Yljk = + PO X, + p1 Yj + gk + k jk RedPoly yljk = + po Xl + p1 YJ + f2 X2 + 3 Yj2 + gk + Eljk FullPoly Yljk = X + Po X1 + P1 Yj + P2 X2 Yj + P3 Yj2 X1 + gk + jk a x, longitudinal position of the plant; y, latitudinal position; g, treatment; and e, residual. For the nearest neighbor methods, a number of different variants of Papadakis (PAP) and Moving Average (MA) methods were implemented. The design effects (replicate, block, etc.) were not considered when these methods were implemented. The residuals (or deviations) used to implement the PAP method and its variants were computed as ,y = y, wk, where y, is the observation of the plot located in the ith row and jth column and yk is the simple treatment average (fitting a CR design). The covariates used to correct for differential yields (X,j) were calculated using the average of these residuals from a number of neighboring plots. In the case of the MA method, the covariate was calculated using the original observations (y,) instead of their residuals (n, ). The following linear model, which considered the covariates as fixed effects and the treatment effects (gk) as random, was used y,jk = + Po X1,1, + P X2,j + ... + m Xp,,j + gk+ ,jk 45 where Xi,,, X2,, ..., Xp,, are p different covariates. All definitions of covariates and linear models considered for this study are detailed in Figure 41. All Papadakis or Moving Average models were fitted with an error structure assuming independent errors. Also, an iterated PAP method was implemented for the models PAP6, PAP8 and PAP11. Three iterations were implemented on these models using the residuals obtained from the previous iteration. In summary, a total of 35 different statistical models where fitted for 3 difference surface patterns (PATCH, GRAD and ALL) each one with 1,000 simulated datasets totaling 105,000 sets of estimated parameters; additionally 24,000 sets were available from the iterative PAP models. 58 The output from each of the fitted models, datasets and surface patterns was compiled and summarized to obtain averages for variance component parameters. Also, empirical correlations between true and predicted treatment effect values (CORR) were calculated to compare different techniques and models. PAP1 PAP4 PAP4 PAP7 / MA1 1 1 PAP2 2 1 2 PAP5 PAP8 / MA2 1 1 1 1 1 1 1 1 l PAP10 PAP3 1 1 PAP11 2 5 1 6 3 3 6 1 5 2 PAP6 2 1 3 3 2 2 PAP9 / MA3 212 2 1 2 1 1 2 1 2 Figure 41. Neighbor plots and definitions of covariates used in Papadakis (PAP) and Moving Average (MA) methods. Plots with the same numbers indicate a common covariate. Results Modeling Global and Local Trends Comparisons of the average correlation values between true and predicted treatment effects showed considerable differences among linear models, error structures, and surface patterns with values that ranged from 0.85 to 0.90 (Figure 42). While this range is relatively narrow, small increments in values closer to 1 correspond to large improvements in terms of precision. In relation to those models that assumed independent errors (ID), all designs yielded higher average CORR values than CR. In general, the best model was RC, but FullPoly was better for surfaces with only global gradients (GRAD). In general, the models with explicit trends (Linear, RedPoly and FullPoly) were very good for GRAD surfaces and unsuccessful for sites with only local patches (PATCH). Nevertheless, for patchy surfaces polynomial models were slightly better than CR, which is an indication that a small portion of the patches can be explained as trends. Approximately 98% and 95% of the fitted models converged for the autoregressive without nugget (AR1OAR1) and with nugget (AR1AR1+r) error structures, respectively. In GRAD surfaces, the results for the latter structure are not reported because very few simulations converged. For all surface patterns, most of the failed convergences occurred in surfaces without patches, i.e., when the spatial correlations (px and py) were close to zero. This situation makes the covariance formula in Equation 44 equal to zero; hence, the variance of the error composed by Cy2 and Y2ms as indicated in Equation 43 cannot be adequately partitioned. Once the errors were assumed correlated (i.e., when the error structure was incorporated in the linear models) several changes occurred (Figure 42). First, AR1OAR1 had for all models on surfaces ALL and PATCH larger average CORR values than model with independent error structure. The largest improvement in CORR was found on PATCH surfaces, which was expected because in this case now the error structure is explicitly modeled. Also, the CORR values in GRAD surfaces almost did not change between AR1IAR1 and ID error structures, with the exception of CR designs that had better CORR values than with ID. Once the nugget effect was incorporated, as with AR1AR1+rj, even larger CORR values were found. Comparisons of this error structure with the other two showed that differences between experimental designs were even smaller. The maximum values for CORR were 0.90 and 0.88 with the FullPoly model for ALL and PATCH, respectively. But it is important to note that this last model corresponds exactly to the one used to simulate the data; therefore, in this study it was used as reference or base model for the maximum performance that can be achieved. As with CORR, the average estimated variance components differed between error structures and models (see Tables 43, 44 and 45). And, as expected, within the same error structure, there was an inverse relationship between the variance of the error and the average CORR value. For all designs, surfaces and error structures, the mean treatment variance component (C2g) after 1,000 simulations was close to its parametric value of 0.25; and, hence, unbiased. In addition, the estimate of the error variance ( 2 or c2 + C2ms) was near its parametric value f 0.75 for all surface patterns and error structures (for models that partitions the error structure into replicates, blocks, rows and columns, these components are included in the total sum of 0.75). ID AR10 AR1 AR1 AR1+Tl 0.90 0.89 0.88 0.86  PATCH 0.85 Figure 42. Average correlations between true and predicted treatment effects (CORR) in 3 different surface patterns for classical experimental design analyses and polynomial models fitted for the following error structures: independent errors (ID); autoregressive without nugget (AR1OAR1); and autoregressive with nugget (AR1AR1+rI). Surface GRAD is not shown in the last graph because few simulations converged. The average variance components for the reference model from this simulation study (FullPoly with AR1AR1+rq) for PATCH surfaces were close to their parametric constants, with the exception of px which was slightly overestimated (0.52 instead of 0.50). A similar situation was found in ALL, but in this case the error variance only sums up to 0.56 (instead of the parametric 0.75) because part of the variability was explained by the polynomial fixed effects. Hence, for this surface explicit trends explained approximately 19% of the total variability. Within AR1AR1+qr important differences were found when a model different than FullPoly was used. In ALL surfaces, larger average px and py were obtained with values as high as 0.88. Because of the lack of trends in PATCH the spatial correlation for 62 other models different than FullPoly were close to 0.50. In the case of GRAD, the correlations for AR1IAR1 should be close to zero, but in almost all models, slightly positive values were found with the exception of FullPoly. This bias was particularly large for CR, possible a consequence of failing to model the underlying surface trends properly. Table 43. Average variance components for the surface pattern GRAD obtained from fitting the models in Tables 41 and 42 for the following error structures: a) independent errors (ID); and b) autoregressive without nugget (AR1 AR1). All values are the means from 1,000 simulations and the true values were C2 =0.25 and G2 + C2ms = 0.75 for CR design. ID Model g b rep block(rep) col(rep) row(rep) px Py 72ms (2s CR 0.25 0.75 RCB 0.25 0.16 0.61 IB 32 0.25 0.16 0.05 0.56 RC 0.25 0.16 0.04 0.02 0.55 Linear 0.25 0.60 RedPoly 0.25 0.60 FullPoly 0.25 0.56 ARIAR1 Model g rep block(rep) col(rep) row(rep) px py 72ms (2s CR 0.25 0.19 0.20 0.74 RCB 0.25 0.16 0.07 0.07 0.61 IB 32 0.25 0.16 0.04 0.03 0.02 0.57 RC 0.25 0.16 0.04 0.02 0.02 0.01 0.56 Linear 0.25 0.07 0.07 0.60 RedPoly 0.25 0.06 0.07 0.60 FullPoly 0.25 0.00 0.00 0.56 a CR, complete randomized; RCB, randomized complete block; IB 32 incomplete block with 32 blocks per replicate; RC, rowcolumn; Linear, linear regression; RedPoly, reduced polynomial regression; Full Poly, full polynomial regression. b g, treatment; rep, resolvable replicate; block(rep), incomplete block nested within replicate; col(rep), column nested within replicate; row(rep), row nested within replicate; px, correlation in the xdirection; py, correlation in the ydirection; ms, microsite error; s, surface error. Table 44. Average variance components for the surface pattern ALL obtained from fitting the models in Tables 41 and 42 for the following error structures: a) independent errors (ID); b) autoregressive without nugget (AR1OAR1); and c) autoregressive with nugget (AR1OAR1+rj). All values are the means from 1,000 simulations and the true values were C2 = 0.25 and 2s + C2ms = 0.75 for CR design. ID Model g b rep block(rep) col(rep) row(rep) px Py (2ms 2s CR 0.25 0.75 RCB 0.25 0.17 0.60 IB 32 0.25 0.17 0.09 0.51 RC 0.25 0.17 0.06 0.05 0.50 Linear 0.25 0.61 RedPoly 0.25 0.60 FullPoly 0.25 0.55 ARIAR1 Model g rep block(rep) col(rep) row(rep) px Py (2ms 2s CR 0.25 0.29 0.28 0.72 RCB 0.25 0.16 0.19 0.18 0.60 IB 32 0.25 0.16 0.02 0.16 0.13 0.58 RC 0.25 0.16 0.04 0.03 0.12 0.12 0.53 Linear 0.25 0.20 0.19 0.60 RedPoly 0.25 0.19 0.18 0.60 FullPoly 0.25 0.15 0.14 0.55 ARIOARI+rT Model g rep block(rep) col(rep) row(rep) px Py (2ms 2s CR 0.25 0.87 0.88 0.44 0.33 RCB 0.25 0.07 0.76 0.78 0.35 0.35 IB 32 0.25 0.06 0.00 0.79 0.79 0.42 0.29 RC 0.25 0.07 0.00 0.00 0.75 0.77 0.35 0.33 Linear 0.25 0.73 0.72 0.41 0.22 RedPoly 0.25 0.73 0.72 0.40 0.22 FullPoly 0.25 0.54 0.51 0.36 0.20 a CR, complete randomized; RCB, randomized complete block; IB 32 incomplete block with 32 blocks per replicate; RC, rowcolumn; Linear, linear regression; RedPoly, reduced polynomial regression; Full Poly, full polynomial regression. b g, treatment; rep, resolvable replicate; block(rep), incomplete block nested within replicate; col(rep), column nested within replicate; row(rep), row nested within replicate; px, correlation in the xdirection; py, correlation in the ydirection; ms, microsite error; s, surface error. Table 45. Average variance components for the surface pattern PATCH obtained from fitting the models in Tables 41 and 42 for the following error structures: a) independent errors (ID); b) autoregressive without nugget (AR1OAR1); and c) autoregressive with nugget (AR1OAR1+rj). All values are the means from 1,000 simulations and the true values were C2 = 0.25 and 2s + C2ms = 0.75 for CR design. ID Model g b rep block(rep) col(rep) row(rep) px Py (2ms 2s CR 0.25 0.75 RCB 0.25 0.02 0.73 IB 32 0.25 0.02 0.10 0.63 RC 0.25 0.02 0.06 0.06 0.62 Linear 0.25 0.74 RedPoly 0.25 0.74 FullPoly 0.25 0.73 ARIAR1 Model g rep block(rep) col(rep) row(rep) px py (2 (72s CR 0.25 0.22 0.21 0.74 RCB 0.25 0.02 0.21 0.20 0.73 IB 32 0.25 0.01 0.02 0.20 0.18 0.71 RC 0.25 0.02 0.04 0.03 0.17 0.16 0.66 Linear 0.25 0.22 0.21 0.73 RedPoly 0.25 0.22 0.21 0.73 FullPoly 0.25 0.21 0.20 0.73 ARIOARI+rT Model g rep block(rep) col(rep) row(rep) px py (2ms 2s CR 0.25 0.52 0.50 0.37 0.39 RCB 0.25 0.00 0.53 0.51 0.33 0.43 IB 32 0.25 0.00 0.00 0.51 0.49 0.36 0.39 RC 0.25 0.00 0.00 0.00 0.51 0.50 0.33 0.42 Linear 0.25 0.52 0.50 0.37 0.40 RedPoly 0.25 0.52 0.50 0.36 0.40 FullPoly 0.25 0.52 0.50 0.37 0.40 a CR, complete randomized; RCB, randomized complete block; IB 32 incomplete block with 32 blocks per replicate; RC, rowcolumn; Linear, linear regression; RedPoly, reduced polynomial regression; Full Poly, full polynomial regression. b g, treatment; rep, resolvable replicate; block(rep), incomplete block nested within replicate; col(rep), column nested within replicate; row(rep), row nested within replicate; px, correlation in the xdirection; py, correlation in the ydirection; ms, microsite error; s, surface error. Considering only models with independent errors, the partition of variance components showed larger values of the replicate variance component in ALL and GRAD in comparison with PATCH. Also, the incomplete block variance components (from IB and RC) were always smaller in GRAD surfaces. Because few simulations converged for GRAD surfaces, to draw inferences comparisons between AR1IAR1 and AR1AR1+rj can only be performed for surfaces ALL and PATCH. Failing to incorporate the nugget effect in these surface patterns produced an under estimation of the spatial correlation parameters for any experimental design. Another consequence was an increase in the magnitude of the variance components for other effects. For example, in ALL surfaces the replicate variance increased from 0.07 to 0.16, indicating that part of the variability that the error structured failed to capture was absorbed by the replicate effect. Nearest Neighbor Models Due to differences in the number of neighbor plots accounted for and covariates considered, a wide variety of average CORR values were found for the nearest neighbor methods (Figure 43), but in all cases, average CORR values greater than 0.86 were found. In general, models that considered more plots and/or covariates produced larger CORR values. When compared with analyses with RCB designs, both the PAP and MA methods gave larger CORR values for PATCH and ALL, but for GRAD surface RCB was in general better than any nearest neighbor methods. This situation is not surprising because PAP and MA were originally designed to control for correlated errors (or patches) and not potential surface trends or gradients. Nevertheless, as occurred before with the incorporation of autoregressive error structures, controlling for patches helped to model a small portion of the trends (see GRAD in Figures 42 and 43). Additionally, for all nearest neighbor methods studied, unbiased estimates of variance components were always obtained, as indicated by values close to 0.25 for all cases. Also, a proportional inverse relation was noted between the average error variance and the CORR values. Generally, PAP6 and PAP11 were superior to any of the MA methods across all surface patterns (Figure 43), where PAP11 was slightly better, particularly for GRAD surfaces. Interestingly, the PAP8 method that only considers one covariate for the 8 surrounding plots was particularly good for GRAD. PAP6, PAP8 and PAP11 were also fitted iteratively using their previously estimated residuals to calculate new covariates. For any given surface, a decrease in the average CORR and the treatment and error variance was noted in PAP6 and PAP11, this decrease was more abrupt in the last iteration (Table 46) and affected mostly surfaces with patches (ALL and PATCH). PAP8 was almost not affected by the iterations showing stable statistics and unbiased variance components. A group of selected models was plotted in Figure 44. The first 4 models assumed that the errors were independent (ID) and the last 2 models corresponded to the best models for the autoregressive error structures. ID models did not achieve the maximum potential CORR values that could be obtained by using the FullPoly with AR1ARl+rq. As indicated before, the classical design RC was very good in ALL and GRAD, but less precise for PATCH surfaces than other models. PAP6 and PAP11 performed very well but had slightly lower CORR values than the maximum potential CORR obtained with the ARlOARl+r. 0.88  0.87  0.86 0 ALL GRAD  PATCH 0.85 i, , Figure 43. Average correlations between true and predicted treatment effects (CORR) in 3 different surface patterns for nearest neighbor analyses fitted assuming independent errors. PAP: Papadakis, MA: Moving Average. RCB RC PAP6 PAP11 FullPoly FullPoly (ID) (ID) (ID) (ID) (AR1gAR1) (AR1gAR1+q1) Figure 44. Average correlations between true and predicted treatment effects (CORR) in 3 different surface patterns for selected methods: randomized complete block (RCB), rowcolumn (RC), and Papadakis (PAP) using the following error structures: independent errors (ID), and autoregressive with nugget (ARIAR1+r). Table 46. Average correlation between true and predicted treatment effects (CORR) and average treatment variance component for iterated Papadakis methods PAP6, PAP8 and PAP11 for the surface patterns ALL, PATCH and GRAD. ALL a PAP6 PAP8 PAP11 Iteration CORR 0gb 2, CORR o a2 CORR o 2, 0 0.89 0.25 0.52 0.89 0.25 0.54 0.89 0.25 0.51 1 0.89 0.25 0.52 0.89 0.25 0.53 0.89 0.26 0.50 2 0.89 0.25 0.47 0.89 0.25 0.53 0.89 0.25 0.44 3 0.87 0.24 0.45 0.89 0.25 0.53 0.87 0.24 0.39 PATCH PAP6 PAP8 PAP11 Iteration CORR og, 2, CORR o, 0G2 CORR o 2, 0 0.88 0.25 0.59 0.87 0.25 0.65 0.88 0.25 0.59 1 0.86 0.26 0.62 0.87 0.25 0.64 0.87 0.26 0.57 2 0.87 0.25 0.49 0.87 0.25 0.64 0.87 0.25 0.47 3 0.83 0.23 0.49 0.87 0.25 0.64 0.85 0.24 0.44 GRAD PAP6 PAP8 PAP11 Iteration CORR 02, a2 CORR 02, a2 CORR o2 02 0 0.88 0.25 0.61 0.88 0.25 0.60 0.88 0.25 0.60 1 0.88 0.25 0.60 0.88 0.25 0.60 0.88 0.25 0.59 2 0.87 0.25 0.59 0.88 0.25 0.60 0.87 0.25 0.56 3 0.87 0.24 0.56 0.88 0.25 0.60 0.86 0.25 0.45 a ALL, surfaces with patches and gradients; GRAD, surfaces with only gradients; PATCH, surfaces with only patches. b g, treatment effect; s, surface error. Discussion Sites for field trials in agronomy or forestry are particularly variable showing spatial heterogeneity that produces positive correlations between plots in close proximity. In this context, classical experimental designs that assume independent errors are still valid due to the randomization of treatments, but they are not optimal. Some of the consequences are (Ball et al. 1993) 1) inflation of the mean square error (MSE); 2) MSE is not the minimumvariance estimator of the error; 3) overestimated standard error of treatment pairwise comparisons; 4) invalid hypothesis testing and confidence intervals; and 5) bias in other statistics calculated from variance components estimates (e.g., heritability and genetic correlations). Nevertheless, controlling for spatial variation by modeling the error structure or using nearest neighbor analysis helps to minimize these problems and to obtain more efficient analyses. A concern in this study was CORR values close to the maximum of 1, which was probably a consequence of using a large number of replicates in the field design. To study this, the first 500 simulated dataset from the ALL surface were fitted using only 2 contiguous replicates and the models from Tables 41 and 42 with all three error structures described previously. The results showed an average CORR for all surfaces that ranged from 0.67 to 0.71 (for ALL see Table A8), which were lower than the ones obtained with 8 replicates (see Figure 42), but both numbers of replicates results were similar in range and behavior. Some relevant differences were a greater difficulty in achieving convergence and an increased variability in the estimates. Modeling Global and Local Trends RC was the best of all experimental design models fitted assuming independent errors (ID). This design which incorporates short rows and columns within replicates was particularly good for patches and in a less extent for gradients. These findings agree with other studies where RC designs showed greater efficiency than RCB (Williams et al. 1999) and IB designs (Qiao et al. 2000). Hence, for all surface patterns tested in this study designing an experiment as RC is recommended. Incorporation of the autoregressive error structure yielded improved precision of treatment estimates as reflected in larger CORR values. The best model, as expected, corresponded to FullPoly with AR1AR1+r which was the reference model used to simulate the surfaces. In contrast, when the autoregressive error structure was fitted without the nugget effect only small decreases in average CORR values was noted, but considerable bias in some of the variance components, particularly an underestimation of the spatial correlation parameters was found. For forestry breeding trials Dutkowski et al. (2002) also reported an overestimation when the nugget effect was not included in the model but in their case for the additive genetic variance component. Zimmerman and Harville (1991) indicated that in many cases this extra parameter might be unnecessary for most field experiments and could produce problems because of increased computation. But in this and several other studies, microsite error has been found to be necessary and often large relative to the total surface error (Ball et al. 1993; Gilmour et al. 1997; Cullis et al. 1998; Costa e Silva et al. 2001; Dutkowski et al. 2002). All these results suggest that the inclusion of the nugget or microsite error might be important. Usually, portions of the gradient variability can be explained by the error structure (Zimmerman and Harville 1991). In this study, this situation was observed when the gradient was modeled incorrectly (e.g., using RedPoly instead of FullPoly) under an autoregressive error structure fro GRAD surfaces some improvements on average CORR values over CR design were found. There is debate about the use of the design effects together with an error structure in the same model. These effects induce artificial discontinuities for adjacent experimental units on the boundaries (replicates or incomplete blocks). In the present study, once the error structure was modeled through autoregressive with or without nugget, differences between experimental designs were considerably reduced, particularly for PATCH surfaces (see Figure 42). Nevertheless, in analyses of real datasets, some authors using autoregressive errors together with design effects have been successful in increasing precision of treatment effect predictions (Sarker et al. 2001; Dutkowski et al. 2002). The concept behind modeling global trends is to represent the field gradient that is generated by gradual changes in topography and soil characteristics. Polynomial functions were used in this study, which under independent errors proved to be suitable only on those simulated surfaces with dominant gradients (ALL and GRAD). Importantly, once the correct autoregressive error structure was incorporated (AR1AR1+r) no differences were found between models with different polynomial functions or design effects. Differences were only noted in the GRAD surface were Full Poly was clearly better than any other alternative. In several studies, the incorporation of fixed or random effects to model global trends together with the autoregressive error structure has resulted in improved precision of treatment estimates or predictions (Brownie et al. 1993; Brownie and Gumpertz 1997; Gilmour et al. 1997). Here, trend or gradient was modeled using polynomial functions, but other options such as smoothing splines (Verbyla et al. 1999) or differencing the data (Cullis and Gleeson 1991) have proved useful. In this study, each simulated dataset was fitted in an automated fashion. Due to the variability between trials, a general model cannot adequately fit all experiments. For routine analyses it is recommended, as suggested by Gilmour et al. (1997), that for each individual datasets several variants of the autoregressive spatial model must be tried, and the linear model extended by the inclusion of other design factors, polynomial functions, or other fixed or random effects to model the gradient. Also, model selection should be always accompanied by the use of diagnostic tools, such as variograms (Cressie 1993) or the use some of the available mixed model information criteria. Nearest Neighbor Models The findings in this study indicated that considerable improvements can be achieved with PAP or MA methods in controlling for spatial variation. As expected, the best results were obtained for patchy surfaces. For the same covariate definition, PAP methods were always better than MA. This situation demonstrates that, as done with PAP, it is necessary to eliminate the treatment effect from the response variables. The performance of PAP methods agrees with other studies in which they were always superior to RCB designs, generally better than IB designs (Kempton and Howes 1981; Brownie et al. 1993; Vollmann et al. 1996), and only slightly inferior to models that fitted the autoregressive error structure (Zimmerman and Harville 1991; Brownie et al. 1993; Brownie and Gumpertz 1997). The best nearest neighbor methods corresponded to PAP6 and PAP11, which are based on 4 and 6 covariates respectively, and yielded better results than the original PAP method based on a single covariate (PAP1 and PAP3). This agrees with other studies where the inclusion of more than one covariate produced better results (Kempton and Howes 1981; Vollmann et al. 1996). With extra covariates, it is possible to use more information about the neighbor plots; therefore, improving efficiency. Nevertheless, for different situations, the ideal set of covariates needs to be investigated. PAP methods and other nearest neighbor methods not studied here have the advantages of simplicity and flexibility. They are simple because only the assumption of independent errors is required so computational requirements are minimal, and they are flexible because several covariates with varying number of plots can be defined and studied. Nevertheless, these methods are not free of difficulties. Testing the inclusion of covariates in a model is difficult because it is not clear how many degrees of freedom should be considered for each covariate due to the fact that a previous estimate of the treatment mean was used. As point of reference, Pearce (1998) in a simulation study recommended considering 2 degrees of freedom for each covariate. Also, it is not clear how to treat covariates obtained from border plots, or what to do if no border plots exist. Ideally, measurements outside the test area should be available as plots that received a treatment with an average response (Scharf and Alley 1993). Nevertheless, it is common to use only existing plots to calculate covariates (Wilkinson et al. 1983), or replace non existent plots by their interior complement (Magnussen 1993). A similar situation occurs with missing information that generates an internal border. Another reported difficulty is that these methods appear to be more sensitive to outliers (Kempton and Howes 1981) and to the presence of abrupt changes in the error surface (Binns 1987; Pearce 1998); and in many cases it is not clear what are the ideal solutions for these problems. The iterative PAP was first suggested by Bartlett (1978) to improve efficiency. Wilkinson et al. (1983) claimed that there is a considerable positive bias for testing significance of treatment effects for a single covariate iterative PAP method; nevertheless, Pearce (1998) in another study found no bias, and showed that some PAP methods were more effective after iteration. The results in this study for iterated PAP methods were not favorable producing a reduction in the efficiency as more iterations were executed. Given the results, noniterated PAP methods appear to be the best option. Nevertheless, is important to consider that for the present simulations a relatively large number of replicates per treatment were available (i.e., 8) that allowed reasonable initial estimates of treatment means to calculate covariates. Apparently, the PAP methods can be improved by including model effects to incorporate gradients or trends (which is where these methods failed to outperform other options, see Figure 44). Design effects, as replicates or blocks have been used in some studies together with some nearest neighbor covariates (Kempton and Howes 1981; Scharf and Alley 1993) with varying outcomes. But, as indicated before, the use of design effects is not recommended because they create artificial discontinuities for neighboring plots that belong to different blocks. Polynomials functions or smoothing splines do not have this problem because they produce a continuous gradient; hence, they are recommended for modeling gradients when employing nearest neighbor methods to account for patchiness. Usually, in spatial analysis only positive correlations due to physical proximity are considered. But, in many real situations, a negative correlation can be generated by competition between plots belonging to different treatments. This competition can disrupt the spatial correlation (Kempton and Howes 1981) and might require explicit modeling. However, this negative correlation is only present once competition for resources (light, water and nutrients) is important, which might take several years to occur. Because the simulations in this study did not consider these effects some caution must be taken in situations with high levels of them. The use of spatial techniques discussed here does not invalidate classical design of experiments. As Sarker et al. (2001) suggest, it is important to design the layout of an experiment according to a preferred design (e.g., IB), which should be first analyzed as designed, but further spatial models should be fitted to capture other relevant information (or noise) present in the field. For any design or analysis, treatment randomization is still critical to neutralize the effect of spatial correlations and provide unbiased estimates and to validate hypothesis testing (Grondona and Cressie 1991; Brownie et al. 1993). Also, specific design of experiments in the presence of spatial correlation might yield improvements, particularly in balancing treatment pairwise comparisons. Little research in this topic has been reported, and for example, Williams (1985) described an algorithm to improve IB designs by considering their spatial arrangement. The results from this study concentrated in singlesite analyses. Combining several sites and other sources of information together can yield to improved estimation of the genetic parameters. Usually, for multiplesite analyses each test is studied individually to determine its significant model effects and error structure, and later, all sites are combined in a unique model using their singlesite specifications obtained previously. Specialized software is required to fit this complex mixed linear model, which commonly presents difficulties in getting variance components that converge. In contrast, the use of nearest neighbor models on multiplesite analyses is considerable easier, particularly because the error structure is simple, and it only requires the definition of site specific covariate parameters. Nevertheless, further studies in these topics are required. Conclusions The main findings of this study indicated that an adequate selection of the experimental design can produce considerable improvements in the prediction of treatment effects and better design options than the widely used RCB are available for field trials in agronomical and forestry studies. Particularly among models assuming independent errors, RC designs gave the best results and IB designs of 32 incomplete blocks were almost as good. Nevertheless, in PATCH surfaces these designs did not perform as well as autoregressive error structures or nearest neighbor methods. Also, those designs that modeled global trends (i.e., polynomial functions) performed well in surfaces with a dominant gradient (GRAD and ALL) but poorly in PATCH. The incorporation of a separable autoregressive error structure with or without nugget yielded the highest CORR values among all models and methods tried for all surface patterns. Here, the inclusion of design effects (replicates and blocks) or polynomial covariates were only useful when the nugget effect was not included. Nevertheless, in this last situation biased spatial correlations parameters were found. Promising results were obtained with the use of nearest neighbor techniques, particularly for surfaces with patches, but in surfaces with gradients some minor improvements were also found. PAP methods were always better than MA, and they were almost as good as methods that modeled the error structure, particularly those variants that considered more plots and/or covariates. PAP11 was the best alternative followed by PAP6, but the use of the latter is recommended because of its simplicity. The use of an iterated PAP method did not produce improvements in this study, generating deteriorated estimates. In summary, if simple analyses are preferred, RC and IB designs with independent errors should be preferred. For further improvements, it is recommended that some form of PAP methods be used, but several covariates must be tried and tested. Finally, spatial analysis incorporating error structures are promising, and they should be used whenever possible, but the computational requirements and some uncertainty about the correct procedures and testing may limit its practical use. CHAPTER 5 POSTHOC BLOCKING TO IMPROVE HERITABILITY AND BREEDING VALUE PREDICTION Introduction Genetic testing of provenances, varieties and families in forestry and agricultural crops is usually a long process and frequently one of the most expensive activities of any genetic improvement program (Zobel and Talbert 1984, p. 232). Appropriate selection of an experimental design and statistical analysis can produce considerable improvement in the prediction of breeding values and efficient use of available resources. Traditionally, in forest trials the randomized complete block (RCB) design has been favored. RCB design is effective when within replicate (or block) variability is relatively small, a situation that is rare in forest sites, or occurs only with relatively small replicates (Costa e Silva et al. 2001). Since a large number of genetic entities are usually tested together, a good alternative is to implement incomplete block designs (IB) which subdivide the full replicate into smaller compartments (i.e., incomplete blocks) that tend to be more homogeneous. For these designs, several authors have reported greater efficiencies in comparison to RCB (Fu et al. 1998; Fu et al. 1999; see Chapter 2). It is also possible to simultaneously implement twoway blocking using rowcolumn designs (RC). These RC designs are described in detail by John and Williams (1995, p. 87) and often yield even greater efficiencies than IB designs (Lin et al. 1993; Qiao et al. 2000; see also Chapter 2). Fitting the best linear model that adequately describes the trial data together with the appropriate specification of random and fixed effects is critical. In the literature several techniques are available for improving statistical analyses, such as 1) spatial models (Gilmour et al. 1997; Chapter 4); 2) nearest neighbor methods (Vollmann et al. 1996; Chapter 4); and 3) inclusion of covariates and other factors to deal with missing values (Cochran 1957) or nuisance effects (Gilmour et al. 1997). These are examples of "a posteriori" analyses. In these options, an extended linear model different than the original experimental design is fitted, which aims to model elements that were unnoticed or not controlled at test establishment. Another "a posteriori" technique, called posthoc blocking, can also be easily implemented. This method consists of superimposing a blocking structure on top of the original field design and a linear model is fitted as if the blocking effects were present in the original design. Usually, an IB design is fitted over a simpler design (e.g., RCB), but an RC can also be superimposed. This technique was originally proposed by Patterson and Hunter (1983) as a tool to inexpensively evaluate the efficiency of potential experimental designs. Nevertheless, several authors have used posthoc blocking successfully to increase heritability and the precision of prediction of genetic effects for final analyses (Ericsson 1997; Dutkowski et al. 2002; Lopez et al. 2002). The primary problem in implementing posthoc blocking is determining the characteristics of the incomplete blocks to be superimposed. If large incomplete blocks are used, the blocks will continue to be highly heterogeneous. On the other hand, small incomplete blocks should considerably reduce the residual variability by capturing smaller areas (or patches) of the environmental surface. But if block size becomes too small, fewer treatments will occur together in the same block; therefore, the standard error of their difference can be considerably larger. Also, according to Ericsson (1997), with small blocks part of the treatment (or genetic) variability can be absorbed by the block variance due to an increased chance that two good (or bad) genotypes occur together in the same block. Another problem is the shape of the incomplete block; nevertheless, general guidelines used for design of experiments can be followed. For example, a shape as close to a square as possible, and orienting the blocks main axis perpendicularly to the principal field gradient should be preferred (Zobel and Talbert 1984, p. 248). The present study used simulated singlesite clonal trials and 3 different environmental patterns to understand consequences of and define future strategies for using posthoc blocking in estimating heritabilities and predicting breeding values. Specifically, the objectives were to 1) study the use of posthoc blocking for several experimental designs, surface patterns and numbers of ramets; 2) verify a potential reduction in the genetic variance as the incomplete blocks become smaller; 3) compare the performance of different strategies or criteria for selecting an appropriate blocking structure; and 4) discuss statistical and practical issues related to implementing posthoc blocking. Materials and Methods Several single site clonal trials were simulated based on a rectangular grid of 64 rows and 32 columns with square spacing and no missing observations (see Chapter 2 for details). Briefly, 256 clones each with 8 ramets were "planted" in singletree plots according to several experimental plans. The simulated environmental or surface patterns were based on two main elements: a gradient or trend generated with a third degree continuous polynomial function, and patches that were incorporated through a covariance structure based in a firstorder separable autoregressive process or AR1AR1 (Cressie 1993; Gilmour et al. 1997). The 3 surface patterns considered were 1) only patches (PATCH); 2) only gradients (GRAD); and 3) both components (ALL). The variance structure was set for a completely randomized design to have a singlesite individual broadsense heritability (Hi2) of 0.25 originating from a total variance fixed at 1.0 and a clonal variance of 2clione = 0.25. The simulated designs were a randomized complete design with 8 resolvable replicates (RCB), incomplete block designs with 4, 8, 16 and 32 incomplete blocks within each of the 8 full replicates (IB 4, IB 8, IB 16 and IB 32), and a rowcolumn design with 16 rows and 16 columns (RC). For each design, several implementations were generated using the software CycDesigN (Whitaker et al. 2002) that produces adesigns. For each combination of surface pattern and design type 1,000 different datasets were generated. The software ASREML (Gilmour et al. 2002) was used to fit all mixed linear models and to obtain restricted maximum likelihood (REML) variance component estimates and best linear unbiased predictions (BLUP) of clonal values (Patterson and Thompson 1971). To compare alternatives, several statistics were calculated: summary statistics of estimated variance components, average empirical correlation between true and predicted clonal BLUP values (CORR) and individual singlesite heritabilities (H2 ). Finally, for each linear model and dataset the loglikelihood value (logL) was recorded together with its average standard error of the difference (SED) between any two clones. 81 Table 51. Linear models fitted for simulated datasets using classical experimental designs to model global trends over all surface patterns: randomized complete block (RCB), incomplete block design with x blocks (IB), and a rowcolumn (RC) design. All model effects other than the mean were considered random. Model a RCB y, = + rep, + clone, + r, IB x Yk = + rep1 + block(rep), + clonek + Ek RC Yljk = + rep, + col(rep), + row(rep)1k + clone1 + skl a clone, clone effect; rep, resolvable replication; block(rep), incomplete block nested within replication; col(rep), column nested within replication (short column); row(rep), row nested within replication (short row); E, residual. IB48x8 B 88x4 TB 164x4 1 5 9 13 2 6 10 14 3 7 11 15 4 8 12 16 IB 324x2 IB642x2 I1 I1 1 1 I IB 128 2 x 1 Figure 51. Experimental layout for randomized complete block design simulations with 8 resolvable replicates (left) and partitioning of one replicate for incomplete block design layouts (right). All simulations had 256 unrelated clones with 8 ramets per clone per site for a total of 2,048 trees. 32 trees _^___________> 161 16 trees The first stage in this study consisted of fitting the original experimental designs (without posthoc blocking) using the corresponding linear models (see Table 51). Later, by employing only the RCB designs for all surface patterns, posthoc blocking was implemented by superimposing 4, 8, 16 and 32 incomplete blocks of sizes 8 x 8, 8 x 4, 4 x 4, and 4 trees x 2 trees, respectively, and a RC design with short rows and columns of length 16 (see Figure 51). These "new" posthoc designs were fitted using the corresponding linear model from Table 51. Statistical comparisons of individual heritabilities between original designs and posthoc blocking were made using a ztest. In a second stage, the first 500 datasets generated for RCB designs were used to perform additional posthoc blocking studies. The same posthoc blocking designs previously used were fitted and two extra incomplete block designs were incorporated to examine the effects of very small incomplete blocks: a layout with 64 (IB 64) and 128 (IB 128) blocks were superimposed with block sizes of 2 x 2 and 2 x 1 respectively (see Figure 51). The linear models from Table 51 were fitted to all datasets, and the previously described statistics were also calculated. Additionally, for each of the 500 datasets, a subset of 2 contiguous replicates was selected at random constituting a "new" trial based on only 2 ramets per clone. This reduced trial size was used because of interest in studying the potential effects of using posthoc blocking in the estimation of the variance components and other statistics for trials with fewer replications. For these datasets, the same incomplete block and rowcolumn designs described previously were fitted (IB 4, IB 8, IB 16, IB 32, IB 64, IB 128 and RC), and summary statistics were calculated. In practical situations, several blocking alternatives (varying numbers of blocks and block shapes) are tested during analysis, and a final model must be selected. In this study, the performance of loglikelihood and SED decision criteria was studied and compared with individual heritabilities. The first criteria, logL, selects the blocking structure that maximizes the loglikelihood of the mixed linear model (Lopez et al. 2002). For SED, the model with the smallest average standard deviation of the difference between treatments is selected. The procedure consisted of selecting and recording the best model fitted in the second stage using both of these criteria for each of the 500 datasets and experimental designs. Here, frequency tables were obtained with all designs together, and only for IB designs. Results When results from fitting the original experimental designs were compared to post hoc blocking designs of the same block size and orientation, small differences were found for all statistics. The largest difference for average HR was 0.003 units (Table A9). Further, ztests comparing individual heritabilities between the original design and post hoc blocking for each design type and surface pattern indicated that no significant differences existed (experimentwise a = 0.05), with the single exception of RC designs on PATCH surfaces. Also, when standard deviations for HE2 were compared between original and posthoc blocking analyses, the differences in magnitude were very small and probably due to sampling variation (Table A9). The average clonal variance did not differ between original and posthoc blocking, and a negligible decrease in the precision of this component was found for posthoc blocking. A similar situation of very little difference between original experimental designs and posthoc blocking was also observed for average correlation between true and predicted clonal values (Figure 52). Average variance components obtained in the second stage employing even smaller posthoc blocks for all designs and surface patterns for the case of 8 ramets per clone are shown in Table 52. The clonal variance was always close to its parametric value of 0.25, with no reduction in its average value as the size of the block decreased, as postulated by Ericsson (1997). Also, the block variance increased and the error variance decreased as the number of incomplete blocks increased. Considerable differences were noted in block variance components between IB 4 and IB 128, indicating a large influence of the experimental design used on partitioning total variance. 0.90 0 .88 . ... .. .... ........ . 0.86  0.86 ALL Original **0 ALL Posthoc 0 4  PATCH Original .A. PATCH Posthoc GRAD Original *V GRAD Posthoc 0.82 1 1 RCB IB4 IB8 IB16 IB32 RC Figure 52. Average correlation between true and predicted clonal values (CORR) for original and posthoc blocking analyses on ALL, PATCH and GRAD surface patterns for simulated surfaces with 8 ramets per clone. Table 52. Average variance components and individual broadsense heritability for all surface patterns obtained from fitting the models in Table 51 from simulated datasets with a randomized complete block designs with 8 ramets per clone per site. ALL a Model b clone rep block(rep) col(rep) row(rep) error Hi RCB 0.25 0.17 0.60 0.296 IB 4 0.25 0.16 0.05 0.56 0.310 IB 8 0.25 0.17 0.07 0.54 0.319 IB 16 0.25 0.17 0.08 0.53 0.323 IB 32 0.25 0.17 0.09 0.51 0.329 IB 64 0.25 0.17 0.10 0.50 0.337 IB 128 0.25 0.17 0.10 0.50 0.336 RC 0.25 0.17 0.06 0.05 0.50 0.336 PATCH Model clone rep block(rep) col(rep) row(rep) error HE2 RCB 0.25 0.02 0.73 0.256 IB 4 0.25 0.01 0.03 0.71 0.263 IB 8 0.25 0.01 0.05 0.68 0.269 IB 16 0.25 0.02 0.07 0.67 0.274 IB 32 0.25 0.02 0.10 0.63 0.284 IB 64 0.25 0.02 0.13 0.60 0.295 IB 128 0.25 0.02 0.15 0.58 0.304 RC 0.25 0.01 0.06 0.06 0.62 0.288 GRAD Model clone rep block(rep) col(rep) row(rep) error HE, RCB 0.25 0.17 0.61 0.292 IB 4 0.25 0.15 0.05 0.57 0.305 IB 8 0.25 0.16 0.05 0.56 0.308 IB 16 0.25 0.16 0.05 0.56 0.308 IB 32 0.25 0.16 0.05 0.56 0.308 IB 64 0.25 0.16 0.06 0.54 0.314 IB 128 0.25 0.17 0.06 0.55 0.313 RC 0.25 0.16 0.04 0.02 0.55 0.312 a ALL, surfaces with patches and gradients; GRAD, surfaces with only gradients; PATCH, surfaces with only patches. b CR, complete randomized; RCB, randomized complete block; IB x, incomplete block with x blocks per replicate; RC, rowcolumn. c rep, resolvable replicate; block(rep), incomplete block nested within replicate; col(rep), column nested within replicate; row(rep), row nested within replicate; clone, clone; e, residual. 86 0.34        0.90 a b 0.32 0.89 5 0.30 / 0.88 V I I *o O b 0 b 0.28 0.87 SO..0 0o'' ..0.0. .0 I  I0 " 0.26 0 ALL 0.86 O 0 ..O0... PATCH  GRAD O 0.24 0.85 Figure 53. Average individual heritability (a) and correlation between true and predicted clonal values (b) calculated over 500 simulations with a randomized complete block designs with 8 ramets per clone per site for each surface pattern and posthoc design. For average individual heritability (Figure 53a) IB 128 was the best design across all 3 surface patterns, but IB 64 and RC were relatively close. In fact, considerable improvements in H,2 can be achieved with posthoc blocking using better designs; average H values of up to 0.34, compared with a base value of 0.25 for the completely randomized design were obtained for smaller block sizes. Nevertheless, a better statistic to use is the average correlation between the true and predicted values (CORR) because it evaluates how well a posthoc blocking design predicts clonal breeding values and, hence, genetic gain from clonal selection. The trends between H,2 and CORR differ considerably (see Figure 53), and a decrease in CORR was present for those designs 