Citation
Optimal mating designs and optimal techniques for analysis of quantitative traits in forest genetics

Material Information

Title:
Optimal mating designs and optimal techniques for analysis of quantitative traits in forest genetics
Creator:
Huber, Dudley Arvle, 1948-
Publication Date:
Language:
English
Physical Description:
ix, 151 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Architectural design ( jstor )
Covariance ( jstor )
Design efficiency ( jstor )
Estimation bias ( jstor )
Estimation methods ( jstor )
Linear models ( jstor )
Matrices ( jstor )
Maximum likelihood estimations ( jstor )
Random variables ( jstor )
Statistical discrepancies ( jstor )
Dissertations, Academic -- Forest Resources and Conservation -- UF
Forest Resources and Conservation thesis Ph. D
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1993.
Bibliography:
Includes bibliographical references (leaves 145-150).
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Dudley Arvle Huber.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
030180493 ( ALEPH )
30335713 ( OCLC )
AJZ0671 ( NOTIS )











OPTIMAL MATING DESIGNS AND OPTIMAL TECHNIQUES FOR ANALYSIS OF
QUANTITATIVE TRAITS IN FOREST GENETICS

















By

DUDLEY ARVLE HUBER


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1993












ACKNOWLEDGEMENTS


I express my gratitude to Drs. T. L. White, G. R. Hodge, R. C. Littell, M. A.

DeLorenzo and D. L. Rockwood for their time and effort in the pursuit of this work. Their

guidance and wisdom proved invaluable to the completion of this project.

I further acknowledge Dr. Bruce Bongarten for his encouragement to continue my

academic career. I am grateful to Dr. T. L. White and the School of Forest Resources and

Conservation at the University of Florida for funding this work.

I extend special thanks to George Bryan and Dr. M. A. DeLorenzo of the Dairy Science

Department and Greg Powell of the School of Forest Resources and Conservation for the use of

computing facilities, programming help and aid in running the simulations required.

Most importantly, I thank my family, Nancy, John and Heather, for their understanding

and encouragement in this endeavor.












TABLE OF CONTENTS



ACKNOWLEDGEMENTS ........................................ ii

LIST OF TABLES ................................. ........... vi

LIST OF FIGURES ............................................. vii

ABSTRACT ................................................ viii

CHAPTER 1
INTRODUCTION .................... 1

CHAPTER 2
THE EFFICIENCY OF HALF-SIB, HALF-DIALLEL
AND CIRCULAR MATING DESIGNS IN THE ESTIMATION
OF GENETIC PARAMETERS WITH VARIABLE NUMBERS OF
PARENTS AND LOCATIONS ................ 4
Introduction ............................................... 4
Methods .............................................. 6
Assumptions Concerning Block Size ...................... 6
The Use of Efficiency (i) ............................. 7
General Methodology .................................. 8
Levels of Genetic Determination .......................... 10
Covariance Matrix for Variance Components ............... 12
Covariance Matrix for Linear Combinations of Variance Components
and Variance of a Ratio .......................... 13
Comparison Among Estimates of Variances of Ratios ............ 14
Results ............................................... 17
Heritability ........................................ 17
Type B Correlation .................................. 18
Dominance to Additive Variance Ratio ...................... 21
Discussion ............................................. 22
Comparison of Mating Designs ........................... 22
A General Approach to the Estimation Problem ................ 23
Use of the Variance of a Ratio Approximation ................. 25
Conclusions ............................................. 26






CHAPTER 3
ORDINARY LEAST SQUARES ESTIMATION OF GENERAL
AND SPECIFIC COMBINING ABILITIES FROM
HALF-DIALLEL MATING DESIGNS
Introduction ............................................. 28
Methods
Linear Model
Ordinary Least Squares Solutions
Sum-to-Zero Restrictions
Components of the Matrix Equation
Estimation of Fixed Effects
Numerical Examples
Balanced Data (Plot-mean Basis)
Missing Plot
Missing Cross
Several Missing Crosses
Discussion
Uniqueness of Estimates
Weighting of Plot Means and Cross Means in Estimating Parameters
Diallel Mean
Variance and Covariance of Plot Means
Comparison of Prediction and Estimation Methodologies
Conclusions

CHAPTER 4
VARIANCE COMPONENT ESTIMATION TECHNIQUES
COMPARED FOR TWO MATING DESIGNS
WITH FOREST GENETIC ARCHITECTURE
THROUGH COMPUTER SIMULATION
Introduction
Methods
Experimental Approach
Experimental Design for Simulated Data
Full-Sib Linear Model
Half-sib Linear Model
Data Generation and Deletion
Variance Component Estimation Techniques
Comparison Among Estimation Techniques
Results and Discussion
Variance Components
Ratios of Variance Components
General Discussion
Observational Unit
Negative Estimates
Estimation Technique
Recommendation







CHAPTER 5
GAREML: A COMPUTER ALGORITHM FOR
ESTIMATING VARIANCE COMPONENTS AND
PREDICTING GENETIC VALUES ............... 82
Introduction ............................................. 82
Algorithm .............................................. 83
Operating GAREML ...................................... 86
Interpreting GAREML Output ................................ 90
Variance Component Estimates ........................... 90
Predictions of Random Variables .......................... 91
Asymptotic Covariance Matrix of Variance Components ........... 92
Fixed Effect Estimates ............................... 93
Error Covariance Matrices ............................. 93
Example ................................................ 94
Data ............................................... 94
Analysis ........................................... 94
Output ............................................. 98
Conclusions .............................................. 103

CHAPTER 6
CONCLUSIONS ..................... 104

APPENDIX
FORTRAN SOURCE CODE FOR GAREML ............ 107

REFERENCE LIST ........................................... 145

BIOGRAPHICAL SKETCH ...................................... 151













LIST OF TABLES


Table 2-1. Parametric variance components ........................... 11

Table 3-1. Data set for numerical examples .......................... 43

Table 3-2. Numerical results for examples ............................. 44

Table 4-1. Abbreviation for and description of variance component estimation
methods ................................................ 60

Table 4-2. Sets of true variance components ............................ 61

Table 4-3. Sampling variance for the estimates .......................... 72

Table 4-4. Bias for the estimates ................................... 74

Table 4-5. Probability of nearness .................................. 75

Table 5-1. Data for example ...................................... 95













LIST OF FIGURES


Figure 2-1. Efficiency (i) for h² ................................... 16

Figure 2-2. Efficiency (i) for r_B .................................. 19

Figure 2-3. Efficiency (i) for γ .................................... 20

Figure 3-1. The overparameterized linear model ......................... 33

Figure 3-2. The linear model for a four-parent half-diallel .................. 33

Figure 3-3. Intermediate result in SCA submatrix generation .................. 39

Figure 3-4. Weights on overall cross means ............................ 49

Figure 4-1. Distribution of 1000 MIVQUE estimates ....................... 77













Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

OPTIMAL MATING DESIGNS AND OPTIMAL TECHNIQUES FOR ANALYSIS OF
QUANTITATIVE TRAITS IN FOREST GENETICS

By

Dudley Arvle Huber
May 1993

Chairperson: Timothy L. White
Major Department: School of Forest Resources and Conservation

First, the asymptotic covariance matrix of the variance component estimates is used to

compare three common mating designs for efficiency (maximizing the variance reducing property

of each observation) for genetic parameters across numbers of parents and locations and varying

genetic architectures. It is determined that the circular mating design is always superior in

efficiency to the half-diallel design. For single-tree heritability, the half-sib design is most

efficient. For estimating type B correlation, maximum efficiency is achieved by either the half-

sib or circular mating design, and which of the two ranks higher is determined by the

underlying genetic architecture.

Another intent of this work is comparing analysis methodologies for determining parental

worth. The first of these investigations examines ordinary least squares assumptions in the estimation

of parental worth for the half-diallel mating design with balanced and unbalanced data. The

conclusion from comparison of ordinary least squares to alternative analysis methodologies is that

best linear unbiased prediction and best linear prediction are more appropriate to the problem of

determining parental worth.







The next analysis investigation contrasts variance component estimation techniques across

levels of imbalance for the half-diallel and half-sib mating designs for the estimation of genetic

parameters with plot means and individuals used as the unit of observation. The criteria for

discrimination are variance of the estimates, mean square error, bias and probability of nearness.

For all estimation techniques individuals as the unit of observation produced estimates with the

most desirable properties. Of the estimation techniques examined, restricted maximum likelihood

is the most robust to imbalance.

The computer program used to produce restricted maximum likelihood estimates of

variance components was modified to form a user friendly analysis package. Both the algorithm

and the outputs of the program are documented. Outputs available from the program include

variance component estimates, generalized least squares estimates of fixed effects, asymptotic

covariance matrix for variance components, best linear unbiased predictions for general and

specific combining abilities and the error covariance matrix for predictions and estimates.












CHAPTER 1
INTRODUCTION


Analysis of quantitative traits in forest genetic experiments has traditionally been

approached as a two-part problem. Parental worth would be estimated as fixed effects and later

considered as random effects for the determination of genetic architecture. While traditional, this

approach is most probably sub-optimal given the proliferation of alternative analysis approaches

with enhanced theoretical properties (White and Hodge 1989).

In this dissertation emphasis is placed on the half-diallel mating design because of its

omnipresence and the uniqueness of the analysis problem this mating design presents. The half-

diallel mating design has been and continues to be used in plant sciences (Sprague and Tatum

1942, Gilbert 1958, Matzinger et al. 1959, Burley et al. 1966, Squillace 1973, Weir and Zobel

1975, Wilcox et al. 1975, Snyder and Namkoong 1978, Hallauer and Miranda 1981, Singh and

Singh 1984, Greenwood et al. 1986, and Weir and Goddard 1986). The unique feature of the

half-diallel mating system which hinders analysis with many statistical packages is that a single

observation contains two levels of the same main effect.

Optimality of mating design for the estimation of commonly needed genetic parameters

(single-tree heritability, type B correlation and dominance to additive variance ratio) is examined

utilizing the asymptotic covariance of the variance components (Kendall and Stuart 1963,

Giesbrecht 1983 and McCutchan et al. 1989). Since genetic field experiments are composed of

both a mating design and a field design, the central consideration in this investigation is which

mating design with what field design (how many parents and across what number of locations








within a randomized complete block design) is most efficient. The criterion for discernment

among designs is the efficiency of the individual observation in reducing the variance of the

estimate (Pederson 1972). This question is considered under a range of genetic architectures

which spans that reported for coniferous growth traits (Campbell 1972, Stonecypher et al. 1973,

Snyder and Namkoong 1978, Foster 1986, Foster and Bridgwater 1986, Hodge and White [in

press]).

The investigation into optimal analysis proceeds by considering the ordinary least squares

(OLS) treatment of estimating parental worth for the half-diallel mating design. OLS assumptions

are examined in detail through the use of matrix algebra for both balanced and unbalanced data.

The use of matrix algebra illustrates both the uniqueness of the problem and the interpretation

of the OLS assumptions. Comparisons among OLS, generalized least squares (GLS), best linear

unbiased prediction (BLUP) and best linear prediction (BLP) are made on a theoretical basis.

Although consideration of field and mating design of future experiments is essential, the

problem of optimal analysis of current data remains. In response to this need, simulated data

with differing levels of imbalance, genetic architecture and mating design is utilized as a basis

for discriminating among variance component estimation techniques in the determination of

genetic architecture. The levels of imbalance simulated represent those commonly seen in forest

genetic data as less than 100% survival, missing crosses for full-sib mating designs and only

subsets of parents in common across location for half-sib mating designs. The two mating

designs are half-sib and half-diallel with a subset of the previously used genetic architectures.

The field design is a randomized complete block with fifteen families per block and six trees per

family per block. The four criteria used to discriminate among variance component estimation

techniques are probability of nearness (Pittman 1937), bias, variance of the estimates and mean

square error (Hogg and Craig 1978).








The techniques compared for variance component estimation are minimum variance

quadratic unbiased estimation (Rao 1971b), minimum norm quadratic unbiased estimation (Rao

1971a), restricted maximum likelihood (Patterson and Thompson 1971), maximum likelihood

(Hartley and Rao 1967) and Henderson's method 3 (Henderson 1953). These techniques are

compared using the individual and plot means as the unit of observation. Further, three

alternatives are explored for dealing with negative variance component estimates: accepting

negative estimates, setting negative estimates to zero, and re-solving the system with

negative components set to zero.

The algorithm used for the method which provided estimates with optimal properties

across experimental levels was converted to a user friendly program. This program providing

restricted maximum likelihood variance component estimates uses Giesbrecht's algorithm (1983).

Documentation of the algorithm and explanation of the program's output are provided along with

the Fortran source code (appendix).













CHAPTER 2
THE EFFICIENCY OF HALF-SIB, HALF-DIALLEL
AND CIRCULAR MATING DESIGNS IN THE ESTIMATION
OF GENETIC PARAMETERS WITH VARIABLE NUMBERS OF
PARENTS AND LOCATIONS


Introduction


In forest tree improvement, genetic tests are established for four primary purposes:

1) ranking parents, 2) selecting families or individuals, 3) estimating genetic parameters, and 4)

demonstrating genetic gain (Zobel and Talbert 1984). While the four purposes are not mutually

exclusive, a test design optimal for one purpose is most probably not optimal for all (Burdon and

Shelbourne 1971, White 1987). A breeder then must prioritize the purposes for which a given

test is established and choose a design based on these priorities. Within a genetic test design

there are two primary components: mating design and field design. There have been several

investigations of optimal designs for these two components either separately or simultaneously

under various criteria. These criteria have included the efficient and/or precise estimation of

heritability (Pederson 1972, Namkoong and Roberds 1974, Pepper and Namkoong 1978,

McCutchan et al. 1985, McCutchan et al. 1989), precise estimation of variance components

(Braaten 1965, Pepper 1983), and efficient selection of progeny (van Buijtenen 1972, White and

Hodge 1987, van Buijtenen and Burdon 1990, Loo-Dinkins et al. 1990).

Incorporated within this body of research has been a wide range of genetic and

environmental variance parameters and field and mating designs. However, the models in

previous investigations have been primarily constrained to consideration of testing in a single







environment with a corresponding limited number of factors in the model, i.e., genotype by

environment interaction and/or dominance variance are usually not considered. This chapter

focuses on optimal mating designs through consideration of three common mating designs (half-

sib, half-diallel, and circular with four crosses per parent) for estimation of genetic parameters

with a field design extending across multiple locations.

In this chapter the approach to the optimal design problem is to maintain the basic field

design within locations as randomized complete block with four blocks and a six-tree row-plot

representing each genetic entry within a block (noted as one of the most common field designs

by Loo-Dinkins et al. 1990). The number of families in a block, number of locations, mating

design and number of parents within a mating design are allowed to change. Since optimality,

besides being a function of the field and mating designs, is also a function of the underlying

genetic parameters, all designs are examined across a range of levels of genetic determination (as

varying levels of heritability, genotype by environment interaction and dominance) reflecting

estimates for many economically important traits in conifers (Campbell 1972, Stonecypher et al.

1973, Snyder and Namkoong 1978, Foster 1986, Foster and Bridgwater 1986, Hodge and White

(in press)).

For each design and level of genetic determination, a Minimum Variance Quadratic

Unbiased Estimation (MIVQUE) technique and an approximation of the variance of a ratio

(Kendall and Stuart 1963, Giesbrecht 1983 and McCutchan et al. 1989) are applied to estimate

the variance of estimates of heritability, additive to additive plus additive by environment variance

ratio, and dominance to additive variance ratio. These techniques use the true covariance matrix

of the variance component estimates (utilizing only the known parameters and the test design and

precluding the need for simulated or real data) and a Taylor series approximation of the variance

of a ratio. The relative efficiencies of different test designs are compared on the basis of i (the








efficiency of an individual observation in reducing the variance of an estimate, Pederson 1972).

Thus this research explores which mating design, number of parents and number of locations is

most efficient per unit of observation in estimating heritability, additive to additive plus additive

by environment variance ratio, and dominance to additive variance ratio for several variance

structures representative of coniferous growth traits.


Methods


Assumptions Concerning Block Size


As opposed to McCutchan et al. (1985), where block sizes were held constant and

including more families resulted in fewer observations per family per block, in this chapter the

blocks are allowed to expand to accommodate increasing numbers of families. This expansion is

allowed without increasing either the variance among blocks or the variance within blocks. For

the three mating designs which are discussed, the addition of one parent to the half-sib design

increases block size by 6 trees (plot for a half-sib family), the addition of a parent to the circular

design increases block size by 12 trees (two plots for full-sib families), and the addition of a

parent to the half-diallel design increases block size by 6p (where p is the number of parents

before the addition or there are p new full-sib families per block). Therefore, block size is

determined by the mating design and the number of parents.
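
The block-size rules above can be sketched in a few lines of Python. This is only an illustration, not code from the dissertation: the function names are hypothetical, and the family counts assume one six-tree row-plot per family per block, four crosses per parent in the circular design, and a half-diallel without selfs or reciprocals.

```python
def trees_per_block(design, n_parents, trees_per_plot=6):
    """Trees in one block when every family gets one row-plot (hypothetical helper)."""
    if design == "half-sib":
        families = n_parents                          # one half-sib family per parent
    elif design == "circular":
        families = 4 * n_parents // 2                 # four crosses per parent, each cross shared by two parents
    elif design == "half-diallel":
        families = n_parents * (n_parents - 1) // 2   # every pair of parents crossed once
    else:
        raise ValueError(f"unknown design: {design}")
    return families * trees_per_plot


def growth_from_one_more_parent(design, n_parents):
    """Increase in block size when one parent is added to a design with n_parents parents."""
    return trees_per_block(design, n_parents + 1) - trees_per_block(design, n_parents)
```

Under these assumptions, adding a parent to a five-parent half-diallel adds 6 x 5 = 30 trees per block, whereas the half-sib and circular designs grow by 6 and 12 trees regardless of the current number of parents.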

All comparisons among mating designs and numbers of locations are for equal block

sizes, i.e., equal numbers of observations per location. This results in comparing mating designs

with unequal numbers of parents in the designs and comparing two location experiments against

five location experiments with equal numbers of observations per location but unequal total

numbers of observations.








The Use of Efficiency (i)


Efficiency is the tool by which comparisons are made and is the efficacy of the individual

observations in an experiment in lowering the variance of parameter estimates. An increasing

efficiency indicates that for increasing experimental size the additional observations have

enhanced the variance reducing property of all observations. Efficiency is calculated as

$i = 1 / (N \cdot \mathrm{Var}(x))$, where N is the total number of observations and Var(x) is the variance of a generic

parameter estimate. Increasing N always results in a reduction of the variance of estimation, all

other things being equal. Yet the change in efficiency with increasing N is dependent on whether

the reduction in variance is adequate to offset the increase in N which caused the reduction.

Comparing a previous efficiency with that obtained by increasing N, i.e., increasing the number

of parents in a mating design or increasing the number of locations in which an experiment is

planted:

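since $i_o = 1 / (N \cdot \mathrm{Var}(x))$,    (2-1)

then $N \cdot \mathrm{Var}(x) = 1 / i_o$

and $(N + \Delta N)(\mathrm{Var}(x) + \Delta\mathrm{Var}(x)) = 1 / i_n$;

if $i_o$ (the old efficiency) $= i_n$ (the new efficiency),

then $\Delta\mathrm{Var}(x) / \mathrm{Var}(x) = -\Delta N / (N + \Delta N)$;

if $i_o < i_n$, then $\Delta\mathrm{Var}(x) / \mathrm{Var}(x) < -\Delta N / (N + \Delta N)$;

and if $i_o > i_n$, then $\Delta\mathrm{Var}(x) / \mathrm{Var}(x) > -\Delta N / (N + \Delta N)$;

where $\Delta$ denotes the change in magnitude of a quantity ($\Delta\mathrm{Var}(x)$ is negative when the variance is reduced).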

Viewing equation 2-1, if N is held constant and one design has a higher efficiency (i), the design

must also produce parameter estimates which have a lower variance.
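
As a small numerical sketch of this comparison, the Python below computes efficiency and the relative change in variance at which efficiency is unchanged. The function names and all numbers are made up for illustration, and the change in variance is treated as a signed quantity (negative for a reduction).

```python
def efficiency(n_obs, variance):
    """Efficiency i = 1 / (N * Var(x)) of an individual observation."""
    return 1.0 / (n_obs * variance)


def break_even_relative_change(n_obs, delta_n):
    """Signed relative change in variance that leaves efficiency unchanged."""
    return -delta_n / (n_obs + delta_n)


# Hypothetical experiment: 100 observations, variance of the estimate 0.04.
n_old, var_old, delta_n = 100, 0.04, 100
i_old = efficiency(n_old, var_old)

# Doubling N keeps efficiency constant only if the variance is halved.
rel = break_even_relative_change(n_old, delta_n)
i_same = efficiency(n_old + delta_n, var_old * (1.0 + rel))

# A larger reduction (to 0.015 instead of 0.02) raises efficiency.
i_better = efficiency(n_old + delta_n, 0.015)
```

Here a reduction of exactly one half leaves the efficiency where it was, and any larger reduction in variance makes the bigger experiment the more efficient one per observation.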









General Methodology


Sets of true variance components are calculated in accordance with a stated level of

genetic control and the design matrix is generated in correspondence with the field and mating

design. Knowing the design matrix and a set of true variance components, a true covariance

matrix of variance component estimates is generated. Once the covariance matrix

of the variance components is in hand, the variance of and covariances between any linear

combinations of the variance component estimates are calculated. From the covariance matrix

for linear combinations, the variance of genetic ratios as ratios of linear combinations of variance

components are approximated by a Taylor series expansion. Since definition of a set of variance

components and formation of the design matrix are dependent on the linear model employed,

discussion of specific methodology begins with linear models.


Linear Models


Half-diallel and circular designs

The scalar linear model employed for half-diallel and circular mating designs is

$y_{ijklm} = \mu + t_i + b_{ij} + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm}$    (2-2)

where $y_{ijklm}$ is the $m$th observation of the $kl$th cross in the $j$th block of the $i$th test;

$\mu$ is the population mean;

$t_i$ is the random variable test environment ~ NID(0, $\sigma^2_t$);

$b_{ij}$ is the random variable block ~ NID(0, $\sigma^2_b$);

$g_k$ is the random variable female general combining ability (gca) ~ NID(0, $\sigma^2_{gca}$);

$g_l$ is the random variable male gca ~ NID(0, $\sigma^2_{gca}$);

$s_{kl}$ is the random variable specific combining ability (sca) ~ NID(0, $\sigma^2_{sca}$);

$tg_{ik}$ is the random variable test by female gca interaction ~ NID(0, $\sigma^2_{tg}$);

$tg_{il}$ is the random variable test by male gca interaction ~ NID(0, $\sigma^2_{tg}$);

$ts_{ikl}$ is the random variable test by sca interaction ~ NID(0, $\sigma^2_{ts}$);

$p_{ijkl}$ is the random variable plot ~ NID(0, $\sigma^2_p$);

$w_{ijklm}$ is the random variable within plot ~ NID(0, $\sigma^2_w$); and

there is no covariance between random variables in the model.

This linear model in matrix notation is (dimensions below model component):

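$y = \mu 1 + Z_T e_T + Z_B e_B + Z_G e_G + Z_S e_S + Z_{TG} e_{TG} + Z_{TS} e_{TS} + Z_P e_P + e_W$    (2-3)

Dimensions, in order: y (n x 1); 1 (n x 1); $Z_T$ (n x t), $e_T$ (t x 1); $Z_B$ (n x b), $e_B$ (b x 1); $Z_G$ (n x g), $e_G$ (g x 1); $Z_S$ (n x s), $e_S$ (s x 1); $Z_{TG}$ (n x tg), $e_{TG}$ (tg x 1); $Z_{TS}$ (n x ts), $e_{TS}$ (ts x 1); $Z_P$ (n x p), $e_P$ (p x 1); $e_W$ (n x 1)

where y is the observation vector;

$Z_i$ is the portion of the design matrix for the $i$th random variable;

$e_i$ is the vector of unobservable random effects for the $i$th random variable;

1 is a vector of 1's; and

n, t, b, g, s, tg, ts, and p are the number of observations, tests, blocks, gca's, sca's,

test by gca interactions, test by sca interactions and plots, respectively.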

Utilizing customary assumptions in half-diallel mating designs (Method 4, Griffing 1956), the

variance of an individual observation is

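$\mathrm{Var}(y_{ijklm}) = \sigma^2_t + \sigma^2_b + 2\sigma^2_{gca} + \sigma^2_{sca} + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w$;    (2-4)

and in matrix notation the covariance matrix for the observations is

$\mathrm{Var}(y) = Z_T Z_T' \sigma^2_t + Z_B Z_B' \sigma^2_b + Z_G Z_G' \sigma^2_{gca} + Z_S Z_S' \sigma^2_{sca} + Z_{TG} Z_{TG}' \sigma^2_{tg} + Z_{TS} Z_{TS}' \sigma^2_{ts} + Z_P Z_P' \sigma^2_p + I_n \sigma^2_w$    (2-5)

where ' indicates the transpose operator, all matrices of the form $Z_i Z_i'$ are n x n, and $I_n$ is an

n x n identity matrix.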
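
A reduced sketch of how such a covariance matrix is assembled may help. The Python below is an illustration only, not the dissertation's code: it keeps just the gca, sca and within terms for a four-parent half-diallel with one observation per cross, and the function name and variance values are assumptions chosen for the example. Each row of the gca design matrix carries two 1's because every observation contains two levels of the same main effect.

```python
from itertools import combinations

import numpy as np


def half_diallel_design(n_parents):
    """Return the crosses and the gca and sca design matrices (hypothetical helper)."""
    crosses = list(combinations(range(n_parents), 2))
    Z_G = np.zeros((len(crosses), n_parents))
    for row, (k, l) in enumerate(crosses):
        Z_G[row, k] = 1.0   # first parent of the cross
        Z_G[row, l] = 1.0   # second parent of the cross
    Z_S = np.eye(len(crosses))  # one sca effect per cross
    return crosses, Z_G, Z_S


crosses, Z_G, Z_S = half_diallel_design(4)

# Illustrative variance components (assumed values).
s2_gca, s2_sca, s2_w = 0.25, 0.25, 8.4281

# Reduced analogue of the covariance matrix: gca, sca and within terms only.
V = s2_gca * (Z_G @ Z_G.T) + s2_sca * (Z_S @ Z_S.T) + s2_w * np.eye(len(crosses))
```

The diagonal of `V` equals 2(0.25) + 0.25 + 8.4281, mirroring the doubled gca term in the variance of an individual observation; two crosses that share one parent have covariance equal to the gca variance, and crosses with no parent in common are uncorrelated.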

Half-sib design

The scalar linear model for half-sib mating designs is

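$y_{ijkm} = \mu + t_i + b_{ij} + g_k + tg_{ik} + p^*_{ijk} + w^*_{ijkm}$    (2-6)

where $y_{ijkm}$ is the $m$th observation of the $k$th half-sib family in the $j$th block of the $i$th test;

$\mu$, $t_i$, $b_{ij}$, $g_k$, and $tg_{ik}$ retain the definition in Eq. 2-2;

$p^*_{ijk}$ is the random variable plot containing different genotype by environment

components than Eq. 2-2 ~ NID(0, $\sigma^2_{p^*}$);

$w^*_{ijkm}$ is the random variable within plot containing different levels of genotypic and

genotype by environment components than Eq. 2-2 ~ NID(0, $\sigma^2_{w^*}$); and

there is no covariance between random variables in the model.

The matrix notation model is

$y = \mu 1 + Z_T e_T + Z_B e_B + Z_G e_G + Z_{TG} e_{TG} + Z_P e_P + e_W$    (2-7)

Dimensions, in order: y (n x 1); 1 (n x 1); $Z_T$ (n x t), $e_T$ (t x 1); $Z_B$ (n x b), $e_B$ (b x 1); $Z_G$ (n x g), $e_G$ (g x 1); $Z_{TG}$ (n x tg), $e_{TG}$ (tg x 1); $Z_P$ (n x p), $e_P$ (p x 1); $e_W$ (n x 1)

The variance of an individual observation in half-sib designs is

$\mathrm{Var}(y_{ijkm}) = \sigma^2_t + \sigma^2_b + \sigma^2_{gca} + \sigma^2_{tg} + \sigma^2_{p^*} + \sigma^2_{w^*}$;    (2-8)

and $\mathrm{Var}(y) = Z_T Z_T' \sigma^2_t + Z_B Z_B' \sigma^2_b + Z_G Z_G' \sigma^2_{gca} + Z_{TG} Z_{TG}' \sigma^2_{tg} + Z_P Z_P' \sigma^2_{p^*} + I_n \sigma^2_{w^*}$    (2-9)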


Levels of Genetic Determination


Eight levels of genetic determination are derived from a factorial combination of two

levels of each of three genetic ratios: heritability ($h^2 = 4\sigma^2_{gca} / (2\sigma^2_{gca} + \sigma^2_{sca} + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w)$ for full-sib models and $h^2 = 4\sigma^2_{gca} / (\sigma^2_{gca} + \sigma^2_{tg} + \sigma^2_{p^*} + \sigma^2_{w^*})$ for half-sib models);

additive to additive plus additive by environment variance ratio ($r_B = \sigma^2_{gca} / (\sigma^2_{gca} + \sigma^2_{tg})$, Type

B correlation of Burdon 1977); and dominance to additive variance ratio ($\gamma = \sigma^2_{sca} / \sigma^2_{gca}$). The

levels employed for each ratio are $h^2$ = 0.1 and 0.25; $r_B$ = 0.5 and 0.8; and $\gamma$ = 0.25 and 1.0.

To generate sets of true variance components (Table 2-1) for half-diallel and circular
mating designs from the factorial combinations of genetic parameters, the denominator of $h^2$ is
set to 10 (arbitrarily, but without loss of generality) which, given the level of $h^2$, leads to the
solution for $\sigma^2_{gca}$. Solving for $\sigma^2_{gca}$ and knowing $\gamma$ yields the value for $\sigma^2_{sca}$. Knowing the level
of $r_B$ and $\sigma^2_{gca}$ allows the equation for $r_B$ to be solved for $\sigma^2_{tg}$. An assumption that $\gamma$
equals the ratio $\sigma^2_{ts} / \sigma^2_{tg}$ permits a solution for $\sigma^2_{ts}$. A further assumption that $\sigma^2_p$ is seven
percent of $\sigma^2_p + \sigma^2_w$ yields a solution for both $\sigma^2_p$ and $\sigma^2_w$. Finally, $\sigma^2_t$ and $\sigma^2_b$ are set to 1.0 and
0.5, respectively, for all treatment levels.

Table 2-1. Parametric variance components for the factorial combination of heritability (.1 and
.25), Type B correlation (.5 and .8) and dominance to additive variance ratio (.25 and 1.0) for
full and half-sib designs. σ²t and σ²b were maintained at 1.0 and .5, respectively, for all levels and
designs. For the half-sib design the last two columns hold σ²p* and σ²w*.

Design  Level     h²    r_B   γ     σ²gca   σ²sca   σ²tg    σ²ts    σ²p     σ²w
Full    1         .1    .8    1.0   .2500   .2500   .0625   .0625   .6344   8.4281
        2         .1    .5    1.0   .2500   .2500   .2500   .2500   .5950   7.9050
        3         .1    .8    .25   .2500   .0625   .0625   .0156   .6508   8.6461
        4         .1    .5    .25   .2500   .0625   .2500   .0625   .6212   8.2538
        5         .25   .8    1.0   .6250   .6250   .1562   .1562   .5359   7.1203
        6         .25   .5    1.0   .6250   .6250   .6250   .6250   .4376   5.8125
        7         .25   .8    .25   .6250   .1562   .1562   .0391   .5769   7.6649
        8         .25   .5    .25   .6250   .1562   .6250   .1562   .5031   6.6844
Half    1 and 3   .1    .8          .2500           .0625           .4844   9.2031
        2 and 4   .1    .5          .2500           .2500           .4750   9.0250
        5 and 7   .25   .8          .6250           .1562           .4609   8.7579
        6 and 8   .25   .5          .6250           .6250           .4375   8.3125
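
The derivation of a full-sib parameter set described above can be sketched as follows. This is a minimal illustration of the stated steps rather than the author's program; the function name, argument names and dictionary keys are hypothetical.

```python
def full_sib_components(h2, r_b, gamma, denominator=10.0, plot_fraction=0.07):
    """Derive one set of true variance components for the full-sib designs."""
    s2_gca = h2 * denominator / 4.0        # from h2 = 4 * s2_gca / denominator
    s2_sca = gamma * s2_gca                # gamma = s2_sca / s2_gca
    s2_tg = s2_gca * (1.0 - r_b) / r_b     # from r_B = s2_gca / (s2_gca + s2_tg)
    s2_ts = gamma * s2_tg                  # assumed: gamma = s2_ts / s2_tg
    # Whatever is left of the heritability denominator is plot plus within-plot variance.
    remainder = denominator - (2 * s2_gca + s2_sca + 2 * s2_tg + s2_ts)
    s2_p = plot_fraction * remainder       # plot variance is seven percent of the remainder
    s2_w = remainder - s2_p
    return {"gca": s2_gca, "sca": s2_sca, "tg": s2_tg, "ts": s2_ts,
            "p": s2_p, "w": s2_w}


level_1 = full_sib_components(h2=0.1, r_b=0.8, gamma=1.0)
```

For level 1 this yields a plot variance of about 0.6344 and a within-plot variance of about 8.4281, in agreement with the first row of Table 2-1.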

In order to facilitate comparisons of half-sib mating designs with full-sib mating designs,

$\sigma^2_{gca}$ and $\sigma^2_{tg}$ retain the same values for given levels of $h^2$ and $r_B$ and the denominator of
heritability again is set to 10. To solve for $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$, the assumption that $\sigma^2_{p^*}$ is five percent
of $\sigma^2_{p^*} + \sigma^2_{w^*}$ permits a solution for $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$ and maintains $\sigma^2_{p^*}$ approximately equal to and
no larger than $\sigma^2_p$ of the full-sib mating designs (Namkoong et al. 1966) for the same levels of
$h^2$ and $r_B$. Under the previous definitions all consideration of differences in $\gamma$ changing the
magnitudes of $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$ is disallowed. Thus, there are only four parameter sets for the half-
sib mating design (Table 2-1).
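
The half-sib parameter sets follow the same pattern and can be sketched the same way. Again this is an illustration under the assumptions stated in the text, with a hypothetical function name and dictionary keys.

```python
def half_sib_components(h2, r_b, denominator=10.0, plot_fraction=0.05):
    """Derive one set of true variance components for the half-sib design."""
    s2_gca = h2 * denominator / 4.0        # same gca variance as the full-sib sets
    s2_tg = s2_gca * (1.0 - r_b) / r_b     # same test by gca variance as the full-sib sets
    # The half-sib heritability denominator holds gca, tg, plot and within-plot variance.
    remainder = denominator - (s2_gca + s2_tg)
    s2_p_star = plot_fraction * remainder  # plot variance is five percent of the remainder
    s2_w_star = remainder - s2_p_star
    return {"gca": s2_gca, "tg": s2_tg, "p_star": s2_p_star, "w_star": s2_w_star}


levels_1_and_3 = half_sib_components(h2=0.1, r_b=0.8)
```

Because the dominance ratio does not enter the calculation, full-sib levels 1 and 3 collapse to a single half-sib set, here with plot variance near 0.4844 and within-plot variance near 9.2031.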


Covariance Matrix for Variance Components


The base algorithm to produce the covariance matrix for variance component estimates

is from Giesbrecht (1983) and was rewritten in Fortran for ease of handling the study data. In

using this algorithm, we assume that all random variables are independent and normally

distributed and that the true variances of the random variables are known. Under these

assumptions, Minimum Norm Quadratic Unbiased Estimation (MINQUE, Rao 1972) using the

true variance components as priors (the starting point for the algorithm) becomes MIVQUE (Rao

1971b), which requires normality and the true variance components as priors (Searle 1987), and

for a given design the covariance matrix of the variance component estimates becomes fixed. A

sketch of the steps from the MIVQUE equation (Eq.2-10, Giesbrecht 1983, Searle 1987) to the

true covariance matrix for variance components estimates is

$$\{\mathrm{tr}(QV_iQV_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\} \tag{2-10}$$

with dimensions $r \times r$, $r \times 1$, and $r \times 1$, respectively;

then $\hat{\sigma}^2 = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\{y'QV_iQy\}$

and $\mathrm{Var}(\hat{\sigma}^2) = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\,\mathrm{Var}(\{y'QV_iQy\})\,\{\mathrm{tr}(QV_iQV_j)\}^{-1}$,

with every matrix of dimension $r \times r$,

where $\{a_{ij}\}$ is a matrix whose elements are $a_{ij}$, where in the full-sib designs i = 1

to 8 and j = 1 to 8, i.e., there is a row and column for every random

variable in the linear model;

tr is the trace operator that is the sum of the diagonal elements of a

matrix;

$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$ for V = the covariance matrix of y and X as

the design matrix for fixed effects;

$V_i = Z_iZ_i'$, where i = the random variables test, block, etc.;

$\hat{\sigma}^2$ is the vector of variance component estimates; and

r is the number of random variables in the model.

The variance of a quadratic form (where A is any non-negative definite matrix of proper

dimension) under normality is $\mathrm{Var}(y'Ay) = 2\,\mathrm{tr}(AVAV) + 4\mu'AVA\mu$ (Searle 1987); however,

MINQUE derivation (Rao 1971) requires that AX = 0, which in our case is $A\mathbf{1} = 0$ and is

equivalent to $4\mu'AVA\mu = 0$, thus

$$\mathrm{Var}(\{y'QV_iQy\}) = 2\{\mathrm{tr}(QV_iQV_j)\}; \tag{2-11}$$

and using Eq.2-10 and Eq.2-11, $\mathrm{Var}(\hat{\sigma}^2) = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\,2\{\mathrm{tr}(QV_iQV_j)\}\,\{\mathrm{tr}(QV_iQV_j)\}^{-1}$

and therefore

$$\mathrm{Var}(\hat{\sigma}^2) = V_v = 2\{\mathrm{tr}(QV_iQV_j)\}^{-1}. \tag{2-12}$$

From Eq.2-12 it is seen that the MIVQUE covariance matrix of the variance component estimates

is dependent only on the design matrix (the result of the field design and mating design) and the

true variance components; a data vector is not needed.
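The observation that no data vector is needed can be illustrated with a minimal Python sketch. It is not the Fortran program used in the study; the function names and the toy design (three unrelated families with two observations each, the mean as the only fixed effect, true components 0.5 and 2.0) are assumptions made purely for illustration. Because X has full column rank here, an ordinary inverse stands in for the generalized inverse in Q.

```python
# Sketch of Eq. 2-12: Var(sigma2_hat) = 2 {tr(Q V_i Q V_j)}^{-1}.
# Everything below (names, toy design, component values) is illustrative.

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (square, non-singular input)."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def mivque_cov(X, Zs, sigma2):
    """Covariance matrix of variance component estimates for known sigma2."""
    n = len(X)
    Vs = [matmul(Z, transpose(Z)) for Z in Zs]               # V_i = Z_i Z_i'
    V = [[sum(s * Vi[r][c] for s, Vi in zip(sigma2, Vs)) for c in range(n)]
         for r in range(n)]                                    # V = sum of sigma2_i V_i
    Vinv = inverse(V)
    VinvX = matmul(Vinv, X)
    middle = inverse(matmul(transpose(X), VinvX))              # (X'V^-1 X)^-1
    proj = matmul(matmul(VinvX, middle), transpose(VinvX))
    Q = [[Vinv[r][c] - proj[r][c] for c in range(n)] for r in range(n)]
    QV = [matmul(Q, Vi) for Vi in Vs]
    T = [[trace(matmul(QVi, QVj)) for QVj in QV] for QVi in QV]
    return [[2.0 * t for t in row] for row in inverse(T)]

families, per_family = 3, 2
n_obs = families * per_family
X = [[1.0] for _ in range(n_obs)]                              # overall mean only
Z_family = [[1.0 if obs // per_family == f else 0.0 for f in range(families)]
            for obs in range(n_obs)]
Z_error = [[1.0 if r == c else 0.0 for c in range(n_obs)] for r in range(n_obs)]
cov = mivque_cov(X, [Z_family, Z_error], sigma2=[0.5, 2.0])
print(cov)
```

For this balanced toy case the result can be checked against the textbook variances of the analysis-of-variance estimators, which is the basis of the assertions used to test the sketch.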


Covariance Matrix for Linear Combinations of Variance Components and Variance of a Ratio


Once the covariance matrix for the variance component estimates (Eq.2-12) is created,

then the covariance matrix of linear combinations of these variance components is formed as

$$V_k = L'V_vL \tag{2-13}$$

with dimensions $2 \times 2 = (2 \times r)(r \times r)(r \times 2)$,








where L specifies the linear combinations of the variance components which are the

combinations of variance components in the denominator and numerator of the genetic ratio being

estimated. A Taylor series expansion (first approximation) for the variance of a ratio using the

variances of and covariance between numerator and denominator is then applied using the

elements of $V_k$ to produce the approximate variance of the three ratio estimates as (Kendall and

Stuart 1963):

$$\mathrm{Var}(\text{ratio}) \approx (1/D)^2\,V_k(1,1) - 2(N/D^3)\,V_k(1,2) + (N^2/D^4)\,V_k(2,2) \tag{2-14}$$

where the generic ratio is N/D and N and D are the parametric values;

$V_k(1,1)$ is the variance of N;

$V_k(1,2)$ is the covariance between N and D; and

$V_k(2,2)$ is the variance of D.
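The two steps (Eq. 2-13 followed by Eq. 2-14) can be sketched in a few lines of Python. The numbers are invented for illustration only: a two-component model with true values 0.1 and 0.9, an assumed covariance matrix of their estimates, and the familiar half-sib form h2 = 4(family variance)/(family variance + error variance) as the ratio. None of these values come from the study.

```python
# Sketch of Eq. 2-13 (V_k = L' V_v L) and Eq. 2-14 (first-order variance of N/D).
# Component values, their covariance matrix, and the ratio form are assumptions.

def linear_combination_cov(L, Vv):
    """Return the 2x2 matrix L' V_v L, with L stored as r rows by 2 columns."""
    r = len(Vv)
    return [[sum(L[a][i] * Vv[a][b] * L[b][j] for a in range(r) for b in range(r))
             for j in range(2)] for i in range(2)]

def ratio_variance(N, D, Vk):
    """First-order Taylor approximation of Var(N/D)."""
    return Vk[0][0] / D**2 - 2.0 * N * Vk[0][1] / D**3 + N**2 * Vk[1][1] / D**4

sigma2 = [0.1, 0.9]                       # (family, error): hypothetical true values
Vv = [[0.010, -0.002],
      [-0.002, 0.004]]                    # hypothetical covariance of the estimates
L = [[4.0, 1.0],                          # column 0: numerator, column 1: denominator
     [0.0, 1.0]]

N = sum(L[a][0] * sigma2[a] for a in range(2))   # 4 * 0.1 = 0.4
D = sum(L[a][1] * sigma2[a] for a in range(2))   # 0.1 + 0.9 = 1.0
Vk = linear_combination_cov(L, Vv)
var_h2 = ratio_variance(N, D, Vk)
print(N / D, var_h2)
```

Note the negative sign on the covariance term: a positive covariance between numerator and denominator reduces the approximate variance of the ratio.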


Comparison Among Estimates of Variances of Ratios


The approximate variances of the three ratio estimates (h2, rB, and γ) are compared across

mating designs with equal (or approximately equal) numbers of observations, across numbers of

locations, and across numbers of parents within a mating design, all within a level of genetic

determination. The standard for comparison is the efficiency i = 1/(N(Var(ratio))), where N is

the total number of observations. Results are presented by the genetic ratio

estimated so that direct comparisons may be made among the mating designs for equal numbers

of observations within a number of locations for varying levels of genetic control. Number of

genetic entries (number of crosses for full-sib designs and number of half-sib families for half-sib

designs) is used as a proxy for number of observations since, for all designs, number of

observations equals twenty-four times the number of locations times the number of genetic

entries. Further, by plotting the two levels of numbers of locations on a single figure, a








comparison is made of the utility of replication of a design across increasing numbers of

locations.

Efficiency plots also permit contrasts of the absolute magnitude of variance of estimation

among designs. For a given number of genetic entries and locations, the design with the highest

efficiency is the most precise (lowest variance of estimation). Increasing the number of genetic

entries or locations always results in greater precision (lower variance of estimation), but is not

necessarily as efficient (the reduction in variance was not sufficient to offset the increase in

numbers of observations). A primary justification for using the efficiency of a design as a

criterion is that a more precise estimate of a genetic ratio is obtained by using the mean of two

estimates from replication of the small design as two disconnected experiments as opposed to the

estimate from a single large design. This is true when 1) the number of observations in the large

design (N) equals twice the number of observations in the small design (n), 2) the small design is

more efficient, and 3) the variances are homogeneous. This is proven below:

Since $N = n_1 + n_2$

and $n_1 = n_2$,

then $N = 2n_1$.

By definition $i = 1/(N \cdot \mathrm{Var}(\text{Ratio}))$;

and $\mathrm{Var}(\text{Ratio}) = 1/(i \cdot N)$.

The proposition is $(\mathrm{Var}_1(\text{Ratio}) + \mathrm{Var}_2(\text{Ratio}))/4.0 < \mathrm{Var}_l(\text{Ratio})$,

where subscripts 1 and 2 denote the two small experiments and l the large experiment;

substitution gives $((1/(n_1 i_s)) + (1/(n_2 i_s)))/4.0 < 1/(N i_l)$.

Simplification yields $1/(2.0\,n_1 i_s) < 1/(N i_l)$;

and multiplication by N produces

$$1/i_s < 1/i_l, \tag{2-15}$$

which is strictly true so long as $i_s > i_l$, where $i_s$ is the efficiency of the smaller experiment and

$i_l$ is the efficiency of the larger experiment.
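A quick numerical check of the argument, as a hedged Python sketch: the observation counts and variances below are invented, and the function names are hypothetical. The small design is given the higher efficiency, so the mean of two independent small-design estimates should have the lower variance.

```python
# Numerical illustration of Eq. 2-15 with invented values.

def efficiency(n_obs, var_ratio):
    """i = 1 / (N * Var(ratio))."""
    return 1.0 / (n_obs * var_ratio)

def variance_of_mean_of_two(var_small):
    """Variance of the mean of two independent estimates with equal variance."""
    return (var_small + var_small) / 4.0

n_small, var_small = 100, 0.020     # one small experiment
n_large, var_large = 200, 0.015     # one large experiment, N = 2n

i_small = efficiency(n_small, var_small)
i_large = efficiency(n_large, var_large)
var_pooled = variance_of_mean_of_two(var_small)

print(i_small > i_large, var_pooled < var_large)
```

Here the large design is more precise than a single small design (0.015 versus 0.020), yet two disconnected small designs with the same total number of observations are more precise still.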

























Results


Heritability


Half-sib designs are almost globally superior to the two full-sib designs in precision of

heritability estimates (results not shown for variance but may be seen from efficiencies in Figure

2-1). For designs of equal size, half-sib designs excel with the exception of genetic level three

(Figure 2-1c, h2 = 0.1, rB = 0.8, and γ = 0.25). In genetic level three, the circular design

provides the most precise estimate of h2 for two location designs; however, when the design is

extended across five locations, the half-sib mating design again provides the most precise

estimates. The circular mating design is superior in precision to the half-diallel design across all

levels of genetic control and location, even with a relatively large number of crosses per parent

(four).

Half-sib designs are, in general (seven genetic control levels out of eight, Figure 2-1),

more efficient; the exception is level three across two locations (Figure 2-1c). For the

circular and half-sib mating designs considered, increasing the number of genetic entries always

improves the efficiency of the design. However, definite optima exist for the half-diallel mating

design for number of genetic entries, i.e., crosses which convert to a specific number of parents.

These optima are not constant but tend to be six parents or less, lower with increasing h2 or

number of locations. The six-parent half-diallel is never far from the half-diallel optima, and

increasing the number of parents past the optimum results in decreased efficiency.

For half-sib designs with h2 = 0.1, five locations are more efficient than two locations;

however, at h2 = 0.25 two locations are most efficient. Further, the number of locations

required to efficiently estimate h2 for half-sib designs is determined only by the level of h2 and

does not depend on the levels of the other ratios. Although estimates over larger numbers of








observations are more precise (five-location estimates are more precise than two-location

estimates), the efficiency (increase in precision per unit observation) declines. So that if h2 =

0.25 and estimates of a certain precision are required, disconnected sets of two-location

experiments are preferred to five-location experiments. The relative efficiency of five locations

versus two locations is enhanced with decreasing rB (increasing genotype by environment

interaction) within a level of h2 (compare Figures 2-1a to 2-1b and 2-1c to 2-1d for h2 = 0.1, and

2-1e to 2-1f and 2-1g to 2-1h for h2 = 0.25). Yet, this enhancement is not sufficient to cause

a change in efficiency ranking between the location levels.

The full-sib designs differ markedly from this pattern (Figure 2-1) in that, for these

parameter levels, it is never more efficient to increase the number of locations from two to five

for heritability estimation. As observed with half-sib designs, for full-sib designs the relative

efficiency status of five locations improves with decreasing rB. To further contrast mating designs,

note that the efficiency status of full-sib designs relative to the half-sib design improves with

decreasing γ and increasing rB (Figures 2-1b versus 2-1c and 2-1f versus 2-1g).


Type B Correlation


As opposed to h2 estimation, no mating design performs at or near the optima for

precision of rB estimates across all levels of genetic control (Figure 2-2). However, the circular

mating designs produce globally more precise estimates than those of the half-diallel mating

design. In general, the utility of full-sib versus half-sib designs is dependent on the level of rB.

The lower rB value favors half-sib designs while the higher rB tends to favor full-sib designs

(compare Figures 2-2a to 2-2b, 2-2c to 2-2d, 2-2e to 2-2f and 2-2g to 2-2h).

Decreasing γ and lowering h2 always improves the relative efficiency of full-sib designs to

half-sib designs (compare Figures 2-2c and 2-2d to 2-2e and 2-2f).























Figure 2-3. Efficiency (i) for γ plotted against number of genetic entries for four levels for
genetic control for circular, half-diallel, and half-sib mating designs across levels of location
where i = 1/(N(Var(γ))) and N = the total number of observations.
Panels: (3a) h2 = 0.1, rB = 0.5, γ = 1.0; (3b) h2 = 0.1, rB = 0.5, γ = 0.25; (3c) h2 = 0.25, rB = 0.8, γ = 1.0; (3d) h2 = 0.25, rB = 0.8, γ = 0.25.
Horizontal axis: genetic entries. Vertical axis: efficiency. Series: circular, 2 locations; circular, 5 locations; half-diallel, 2 locations; half-diallel, 5 locations.

For estimation of rB, full-sib designs are more efficient than half-sib designs except in the

three cases of low rB (0.5) and high γ (1.0) for h2 = 0.1 (Figure 2-2b) and low rB for h2 = 0.25

(Figures 2-2f and 2-2h). Within full-sib designs the circular design is globally superior to the

half-diallel. As with h2 estimation, half-diallel designs have optimal levels for numbers of

parents. The six-parent half-diallel is again close to these optima for all genetic levels and

numbers of locations.

At low h2 for full-sib designs, planting in two locations is always more efficient than five

locations. For half-sib designs at low h2, the relative efficiency of two versus five locations is

dependent on the level of rB, with lower rB favoring replication across more locations. At h2 =

0.25, half-sib designs are more efficient when replicated across five locations. At the higher h2

value, full-sib design efficiency across locations is dependent on the level of rB. With rB = 0.5

and h2 = 0.25, replication of full-sib designs is for the first time more efficient across five

locations than across two locations; however, at the higher rB level two locations is again the

preferred number.


Dominance to Additive Variance Ratio


In comparing the two full-sib designs for relative efficiency in estimating y, the circular

design is always approximately equal to or, for most cases, superior to the half-diallel design

(Figure 2-3). The relative superiority of the circular design is enhanced by decreasing γ and rB

(not shown). The half-diallel design again demonstrates optima for number of parents with the

six-parent design being near optimal. Within a mating design the use of two locations is always

more efficient than the use of five locations. The magnitude of this superiority escalates with

increasing rB and h2 (Figures 2-3a and 2-3b versus 2-3c and 2-3d).








Discussion


Comparison of Mating Designs


A priori knowledge of genetic control is required to choose the optimal mating and field

design for estimation of h2, rB and γ. Given that such knowledge may not be available, the

choices are then based on the most robust mating designs and field designs for the estimation of

certain of the genetic ratios. If h2 is the only ratio desired, then the half-sib mating design is

best. Estimation of both h2 and rB requires a choice between the half-sib and circular designs.

If there is no prior knowledge then the selection of a mating design is dependent on which ratio

has the highest priority. For experiments in which h2 receives highest weighting, the half-sib

design is preferred and in the alternative case the circular design is the better choice. In the last

scenario information on all three ratios is desired from the same experiment and in this case the

circular design is the better selection since the circular design is almost globally more efficient

than the half-diallel design.

After choosing a mating design, the next decision is how many locations per experiment

are required to optimize efficiency. For the half-sib design the number of locations required to

optimize efficiency is dependent on both the ratio being estimated and the level of genetic control.

A broad inference is that for h2 estimation a two location experiment is more efficient and for rB

a five location experiment has the better efficiency. Estimation of any of the three ratios with

a full-sib design is almost globally more efficient in two location experiments.

The disparity between the behavior of the half-sib and full-sib designs with respect to the

efficiency of location levels can be explained in terms of the genetic connectedness offered by the

different designs. Genetic connectedness can be viewed as commonality of parentage among

genetic entries. The more entries having a common parent the more connectedness is present.







The half-sib design is only connected across locations by the one common parent in a half-sib

family in each replication. Full-sib designs are connected across locations in each replication by

the full-sib cross plus the number of parents minus two (half-diallel) or three (circular) for each

of the two parents in a cross. The connectedness in a full-sib design means each observation is

providing information about many other observations. The result of this connectedness is that,

in general, fewer observations (number of locations) are required for maximum efficiency.
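The counts quoted above can be reproduced with a short Python sketch that enumerates crosses and tallies, for each cross, how many other crosses share a parent with it. The half-diallel enumeration follows directly from its definition; the circular construction used here (each parent crossed with its two nearest neighbours on either side of a ring, giving four crosses per parent) is an assumption made for illustration and may not match the layout used in the study.

```python
# Sketch: count crosses that share a parent with a given cross.
# The circular construction below is an assumed layout, not the thesis's.
from itertools import combinations

def half_diallel(p):
    """All p(p-1)/2 crosses among p parents (no selfs, no reciprocals)."""
    return list(combinations(range(p), 2))

def circular(p, offsets=(1, 2)):
    """Parent j crossed with parents j+1 and j+2 (mod p): four crosses per parent."""
    return sorted({tuple(sorted((j, (j + d) % p))) for j in range(p) for d in offsets})

def shared_parent_counts(crosses):
    """For each cross, the number of other crosses having a parent in common."""
    return [sum(1 for other in crosses if other != cross and set(other) & set(cross))
            for cross in crosses]

hd = half_diallel(6)
circ = circular(8)
print(len(hd), set(shared_parent_counts(hd)))      # each cross: 2 * (p - 2) relatives
print(len(circ), set(shared_parent_counts(circ)))  # each cross: 2 * 3 relatives
```

With six parents every half-diallel cross is connected to eight other crosses, and with four crosses per parent every circular cross is connected to six others, whereas a half-sib family is tied to the rest of the experiment only through its one known parent.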


A General Approach to the Estimation Problem


The estimation problems may be viewed in a broader context than the specific solutions

in this chapter. The technique for comparison of mating designs and numbers of locations across

levels of genetic determination may be construed, for the case of h2 estimation, to be the effect

of these factors on the variance of $\sigma^2_{gca}$ estimates. Viewing the variance approximation formula,

the conclusion may be reached that the variance of $\sigma^2_{gca}$ estimates is the controlling factor in the

variance of h2 estimates since the other factors at these heritability levels are multiplied by

constants which reduce their impact dramatically. Given this conclusion, the variance of h2

estimates is essentially the (3,3) element in $2\{\mathrm{tr}(QV_iQV_j)\}^{-1}$ (Eq. 2-12). Further, since the

covariances of the other variance component estimates with $\sigma^2_{gca}$ estimates are small, the variance

of $\sigma^2_{gca}$ estimates is basically determined by the magnitude of the (3,3) element of $\{\mathrm{tr}(QV_iQV_j)\}$,

which is $\mathrm{tr}(QV_{gca}QV_{gca})$. Thus, the variance of h2 estimates is minimized by maximizing

$\mathrm{tr}(QV_{gca}QV_{gca})$, with h2 used as an illustration because this simplification is possible.

Considering the impact of changing levels of genetic control, while holding the mating

and field designs constant, $V_{gca}$ is fixed, the diagonal elements of V are fixed at 11.5 because of

our assumptions, and only the off-diagonal elements of V change with genetic control levels.

Since Q is a direct function of $V^{-1}$, what we observe in Figure 2-1 comparing a design across

levels of genetic control are changes in $V^{-1}$ brought about by changes in the magnitude of the

off-diagonal elements of V (covariances among observations). The effect of positive (the linear

model specifies that all off-diagonal elements in V are zero or positive) off-diagonal elements on

$V^{-1}$ is to reduce the magnitude of the diagonal elements and often also result in negative

off-diagonal elements. If one increases the magnitude of the off-diagonal elements in V, then the

magnitude of the diagonal elements of $V^{-1}$ is reduced and the magnitude of negative off-diagonal

elements is increased. Since $\mathrm{tr}(QV_{gca}QV_{gca})$ is the sum of the squared elements of the product of

a direct function of $V^{-1}$ and a matrix of non-negative constants ($V_{gca}$), as the diagonal elements of

$V^{-1}$ are reduced and the off-diagonal elements become more negative, $\mathrm{tr}(QV_{gca}QV_{gca})$ must become

smaller and the variance of h2 estimates increases.

Mating designs may be compared by the same type of reasoning. Within a constant field

design changes in mating design produce alterations in V. Of the three designs the half-sib

produces a V matrix with the most zero off-diagonal elements, the circular design next, and the

half-diallel the fewest number of zero off-diagonal elements. Knowing the effect of off-diagonal

elements on the variance of h2 estimates, one could surmise that the variance of estimates is

reduced in the order of least to most non-zero off-diagonal elements. This tenet is in basic

agreement with the results in Figures 2-1 through 2-3.

The effects of rB and γ on the variance of h2 estimates can also be interpreted utilizing

the above approach. In the results section of this chapter it is noted that decreasing the magnitude

of rB and/or γ causes full-sib designs to rise in efficiency relative to the half-sib design. In

accordance with our previous arguments this would be expected since decreasing the magnitude

of those two ratios causes a decrease in the magnitude of off-diagonal elements. More precisely,

decreasing γ results in the reduction of off-diagonal elements in V of the full-sib designs while

not affecting the half-sib design, and decreasing rB results in the reduction of off-diagonal

elements in V of full-sib and half-sib designs. Relative increases in efficiency of full-sib designs

result from the elements due to location by additive interaction occurring much less frequently

in the half-sib designs; thus, the relative impact of reduction in rB in half-sib designs is less than

that for full-sibs.


Use of the Variance of a Ratio Approximation


Use of Kendall and Stuart's (1963) first approximation (first-term Taylor series

approximation) of the variance of a ratio has two major caveats. The approximation depends on

large sample properties to approach the true variance of the ratio, i.e., with a small number of

levels for random variables the approximation does not necessarily closely approximate the true

variance of the ratio. Work by Pederson (1972) suggests that for approximating the variance of

h2 at least ten parents are required in diallels before the approximation will converge to the true

variance even after including Taylor series terms past the first derivative. Pederson's work also

suggests that the approximation is progressively worse for increasing heritability with low

numbers of parents. Using the field design in this chapter (two locations, four blocks and six-tree

row-plots), simulation work (10,000 data sets) has demonstrated that, with a heritability of 0.1

and four parents in a half-diallel across two locations, the variance of a ratio approximation

yields a variance estimate for h2 of 0.1 while the convergent value for the simulation was 0.08

(Huber unpublished data). One should remember the dependence of the first approximation of

the variance of a ratio on large sample properties when applying the technique to real data.

The second caveat is that the range of estimates of the denominator of the ratio cannot

pass through zero (Kendall and Stuart 1963). This constraint is of no concern for h2; however,

the structure of rB and γ denominators allows unbiased minimum variance estimates of those

denominators to pass through zero, which means at one point in the distribution of the estimates

of the ratios they are undefined (the distributions of these ratio estimates are not continuous).

Simulation has shown that the variances of rB and γ are much greater than the approximation

would indicate (Huber unpublished data). The discrepancy in variance of the estimates could be

partially alleviated through using a variance component estimation technique which restricts

estimates to the parameter space $0 < \sigma^2 < \infty$. Nevertheless, because of the two caveats,

approximations of the variance of h2, rB and γ estimates should be viewed only on a relative basis

for comparisons among designs and not on an absolute scale.

Additionally, the expectation of a ratio does not equal the ratio of the expectations (Hogg

and Craig 1978). If a value of genetic ratios is sought so that the value equals the ratio of the

expectations, then the appropriate way to calculate the ratio would be to take the mean of

variance components or linear combinations of variance components across many experiments and

then take the ratio. If the value sought for h2 is the expectation of the ratio, then taking the mean

of many h2 estimates is the appropriate approach. Returning to the results from simulated data

(10,000 data sets) where the h2 value was set at 0.1, using the ratio of the means of variance

components rendered a value of 0.1 for h2, the mean of the h2 estimates returned a value of 0.08,

and a Taylor series approximation of the mean of the ratio yielded 0.07 (Pederson 1972).
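The distinction between the two summaries can be seen with a deterministic toy calculation in Python. The three (numerator, denominator) pairs are invented; they only demonstrate that the ratio of means and the mean of ratios generally differ. The direction and size of the gap depend on the joint distribution of the estimates, so this sketch does not reproduce the simulation values quoted above.

```python
# Toy illustration: ratio of means versus mean of ratios (invented numbers).
from statistics import mean

numerators = [1.0, 1.0, 1.0]        # e.g. numerator estimates from three experiments
denominators = [8.0, 10.0, 12.0]    # corresponding denominator estimates

ratio_of_means = mean(numerators) / mean(denominators)
mean_of_ratios = mean(n / d for n, d in zip(numerators, denominators))

print(ratio_of_means, mean_of_ratios)
```

Which summary is appropriate depends, as stated above, on whether the target is the ratio of the expectations or the expectation of the ratio.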


Conclusions


Results from this study should be interpreted as relative comparisons of the levels of the

factors investigated. However, viewing the optimal design problem as illustrated in the

discussion section of this chapter can provide insight to the more general problem.

There is no globally most efficient number of locations, parents or mating design for the

three ratios estimated even within the restricted range of this study; yet, some general conclusions

can be drawn. For estimating h2 the half-sib design is always optimal or close to optimal in







terms of variance of estimation and efficiency. In the estimation of rB and γ, the circular mating

design is always optimal or near optimal in variance reduction and efficiency. Across numbers

of parents within a mating design only the half-diallel shows optima for efficiency. The other

mating designs have non-decreasing efficiency plots over the levels of number of parents; so that

while there is an optimal number of locations for a level of genetic control, the number of genetic

entries per location is limited more by operational than efficiency constraints.

Two locations is a near global optimum over five locations for the full-sib mating designs.

Within the half-sib mating design optimality depends on the levels of h2 and rB: 1) for h2

estimation the optimal number of locations is inversely related to the level of h2, i.e. at the higher

level two tests were optimal and at the lower level five tests were optimal; and 2) for rB

estimation for the half-sib design, the optimal number of locations was also inversely related to

the level of rB.

Means of estimates from disconnected sets provide lower variance of estimation where

the smaller experiments have higher efficiencies. Thus, disconnected sets are preferred according

to number of locations for all mating designs and according to number of parents for the half-

diallel mating design.

In practical consideration of the optimal mating design problem, the results of this study

indicate that if h2 estimation is the primary use of a progeny test then the half-sib mating design

is the proper choice. Further, the circular mating design is an appropriate choice if the

estimation of rB is more important than h2. Finally, if a full-sib design is required to furnish

information about dominance variance, the circular design provides almost globally better

efficiencies for h2, rB, and γ than the half-diallel.












CHAPTER 3
ORDINARY LEAST SQUARES ESTIMATION OF GENERAL
AND SPECIFIC COMBINING ABILITIES FROM
HALF-DIALLEL MATING DESIGNS


Introduction


The diallel mating system is an altered factorial design in which the same individuals (or

lines) are used as both male and female parents. A full diallel contains all crosses, including

reciprocal crosses and selfs, resulting in a total of $p^2$ combinations, where p is the number of

parents. Assumptions that reciprocal effects, maternal effects, and paternal effects are negligible

lead to the use of the half-diallel mating system (Griffing 1956, method 4) which has p(p-1)/2

parental combinations and is the mating system addressed in this chapter.
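As a small illustration of the two counts, the following Python sketch enumerates both mating systems for six hypothetical parents labelled A to F (the labels and function names are assumptions for the example).

```python
# Enumerate a full diallel (p^2 combinations) and a half-diallel (p(p-1)/2 crosses).
from itertools import combinations, product

def full_diallel(parents):
    """Every female x male combination, including reciprocals and selfs."""
    return list(product(parents, repeat=2))

def half_diallel(parents):
    """Each pair of distinct parents once; no selfs, no reciprocals."""
    return list(combinations(parents, 2))

parents = "ABCDEF"                 # six hypothetical parents
full = full_diallel(parents)
half = half_diallel(parents)
print(len(full), len(half))        # p^2 and p(p-1)/2 for p = 6
```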

Half diallels have been widely used in crop and tree breeding (Sprague and Tatum 1942,

Gilbert 1958, Matzinger et al. 1959, Burley et al. 1966, and Squillace 1973) and the widespread

use of this mating system continues today (Weir and Zobel 1975, Wilcox et al. 1975, Snyder and

Namkoong 1978, Hallauer and Miranda 1981, Singh and Singh 1984, Greenwood et al. 1986,

and Weir and Goddard 1986).

Most of the statistical packages available treat fixed effect estimation as the objective of

the program with random variables representing nuisance variation. Within this context a

common analysis of half-diallel experiments is conducted by first treating genetic parameters as

fixed effects for estimation of general (GCA) and specific (SCA) combining abilities and

subsequently as random variables for variance component estimation (used for estimating

heritabilities, genetic correlations, and general to specific combining ability variance ratios for







determining breeding strategies). This chapter focuses on the estimation of GCA's and SCA's

as fixed effects. The treatment of GCA and SCA as fixed effects in OLS (ordinary least squares)

is an entirely appropriate analysis if the comparisons are among parents and crosses in a

particular experiment. If, as forest geneticists often wish to do, GCA estimates from

disconnected experiments are to be compared, then methods such as checklots must be used to

place the estimates on a common basis.

Formulae (Griffing 1956, Falconer 1981, Hallauer and Miranda 1981, and Becker 1975)

for hand calculation of general and specific combining abilities are based on a solution to the OLS

equations for half-diallels created by sum-to-zero restrictions, i.e., the sum of all effect estimates

for an experimental factor equals zero. These formulae will yield correct OLS solutions for sum-

to-zero genetic parameters provided the data have no missing cells. If cell (plot) means are used

as the basis for the estimation of effects, there must be at least one observation per cell (plot)

where a cell is a subclassification of the data defined by one level of every factor (Searle 1987).

An example of a cell is the group of observations denoted by $AB_{ij}$ for a randomized complete

block design with factor A across blocks (B). If the above formulae are applied without

accounting for missing cells, incorrect and possibly misleading solutions can result. The matrix

algebra approach is described in this chapter for these reasons: 1) in forest tree breeding

applications data sets with missing cells are extremely common; 2) many statistical packages do

not allow direct specification of the half-diallel model; 3) the use of a linear model and matrix

algebra can yield relevant OLS solutions for any degree of data imbalance; and 4) viewing the

mechanics of the OLS approach is an aid to understanding the properties of the estimates.

The objectives of this chapter are to (1) detail the construction of ordinary least squares

(OLS) analysis of half-diallel data sets to estimate genetic parameters (GCA and SCA) as fixed

effects, (2) recount the assumptions and mathematical features of this type of analysis, (3)







facilitate the reader's implementation of OLS analyses for diallels of any degree of imbalance and

suggest a method for combining estimates from disconnected experiments, and (4) aid the reader

in ascertaining what method is an appropriate analysis for a given data set.


Methods


Linear Model


Plot means are used as the unit of observation for this analysis with unequal numbers of

observations per plot. Plot (cell) means are always estimable as long as there is one observation

per plot, and linear combinations of these means (least squares means) provide the most efficient

way of estimating OLS fixed effects (Yates 1934). Throughout this chapter, estimates are

denoted by lower case letters while the parameters are designated by upper case letters and

matrices are in bold print.

Using plot means as observations, a common scalar linear model for an analysis of a half-

diallel mating design with p(p-1)/2 crosses planted at a single location in a randomized complete

block design with one plot per block is

$$y_{ijk} = \mu + B_i + GCA_j + GCA_k + SCA_{jk} + e_{ijk} \tag{3-1}$$

where $y_{ijk}$ is the mean of the i-th block for the jk-th cross;

$\mu$ is an overall mean;

$B_i$ is the fixed effect of block i for i = 1 to b;

$GCA_j$ is the fixed general combining ability effect of the j-th female parent or

k-th male parent, j or k = 1, ..., p (j ≠ k);

$SCA_{jk}$ is the fixed specific combining ability effect of parents j and k; and

$e_{ijk}$ is the random error associated with the observation of the jk-th cross in

the i-th block, where $e_{ijk} \sim (0, \sigma^2_e)$.

Cross by block interaction as genotype by environment interaction is treated as confounded with

between plot variation as for contiguous plots.

The model in matrix notation is

$$y = X\beta + e \tag{3-2}$$

where y is the vector of observations (n×1 = n rows and 1 column) where n equals

the number of observations;

X is the design matrix (n×m) whose function is to select the appropriate parameters

for each observation where m equals the number of fixed effect parameters in the

model;

β is the vector (m×1) of fixed effect parameters ordered in a column; and

e is the vector (n×1) of deviations (errors) from the expectation associated with each

observation.
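How the design matrix selects parameters can be made concrete with a Python sketch that builds an overparameterized X for the scalar model above (columns for the mean, each block, each GCA, and each SCA). The choice of four parents and two blocks is an assumption made so that the sizes match the 12 observations and 13 parameters quoted later in this chapter; the column ordering and function names are likewise hypothetical and need not match Figure 3-1.

```python
# Sketch: overparameterized design matrix for a half-diallel in complete blocks.
# Column order assumed here: mean | blocks | GCA per parent | SCA per cross.
from itertools import combinations

def overparameterized_design(n_parents, n_blocks):
    crosses = list(combinations(range(n_parents), 2))
    rows = []
    for block in range(n_blocks):
        for j, k in crosses:
            mean_col = [1.0]
            block_cols = [1.0 if block == i else 0.0 for i in range(n_blocks)]
            gca_cols = [1.0 if parent in (j, k) else 0.0 for parent in range(n_parents)]
            sca_cols = [1.0 if cross == (j, k) else 0.0 for cross in crosses]
            rows.append(mean_col + block_cols + gca_cols + sca_cols)
    return rows

def rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    rows = [list(r) for r in matrix]
    rk = 0
    for c in range(len(rows[0])):
        if rk == len(rows):
            break
        pivot = max(range(rk, len(rows)), key=lambda r: abs(rows[r][c]))
        if abs(rows[pivot][c]) < tol:
            continue                      # column adds no new information
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

X_o = overparameterized_design(n_parents=4, n_blocks=2)
print(len(X_o), len(X_o[0]), rank(X_o))
```

Each row contains exactly five ones (mean, one block, two GCAs, one SCA), and the rank is smaller than the number of columns, which is why a reparameterization or a generalized inverse is needed before solving.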


Ordinary Least Squares Solutions


The matrix representation of an OLS fixed effects solution is

$$b = (X'X)^{-}X'y \tag{3-3}$$

where b is the vector of estimated fixed effect parameters, i.e., an estimate of β, and

X is the design matrix, either made full rank by reparameterization (so that $(X'X)^{-}$ is the

ordinary inverse) or left overparameterized, in which case a generalized inverse of X'X is used.
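A minimal Python sketch of the full-rank case follows. It uses a deliberately simple, hypothetical example (one three-level factor coded with sum-to-zero columns and two observations per level) rather than a half-diallel, and solves the normal equations X'Xb = X'y directly instead of forming the inverse; the data values and function names are invented.

```python
# Sketch: OLS solution of the normal equations for a full-rank, sum-to-zero design.

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [[float(v) for v in A[i]] + [float(rhs[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def ols(X, y):
    """b = (X'X)^{-1} X'y for a design matrix of full column rank."""
    m = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(m)]
    return solve(XtX, Xty)

# Columns: mean, effect 1, effect 2; level 3 is coded as minus the sum of the others.
X_s = [[1, 1, 0], [1, 1, 0],
       [1, 0, 1], [1, 0, 1],
       [1, -1, -1], [1, -1, -1]]
y = [10.0, 12.0, 14.0, 16.0, 21.0, 23.0]

b = ols(X_s, y)
effect_3 = -(b[1] + b[2])          # recovered from the sum-to-zero restriction
print(b, effect_3)
```

With balanced data the estimates equal the grand mean and the deviations of the level means from it, which is the sense in which sum-to-zero solutions agree with the hand-calculation formulae.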

Inherent in this solution is the ordinary least squares assumption that the variance-covariance
matrix ($\mathbf{V}$) of the observations ($\mathbf{y}$) is equal to $\mathbf{I}\sigma^2_e$, where $\mathbf{I}$ is an $n \times n$ identity matrix.
The elements of an identity matrix are 1's on the main diagonal and all other elements are 0.
Multiplying $\mathbf{I}$ by $\sigma^2_e$ places $\sigma^2_e$ on the main diagonal. In the covariance matrix for the
observations, the variance of the observations appears on the main diagonal and the covariance
between observations appears in the off-diagonal elements. Thus, $\mathbf{V} = \mathbf{I}\sigma^2_e$ states that the
variance of the observations is equal to $\sigma^2_e$ for each observation and there are no covariances
between the observations (which is one direct result of considering genetic parameters as fixed
effects).
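As a minimal sketch of equation 3-3, the following computes an OLS solution twice: once with a full-rank (sum-to-zero coded) design matrix and an ordinary inverse, and once with an overparameterized design matrix and a generalized inverse. The four data values and both small design matrices are made up for illustration and are not taken from the text; NumPy's Moore-Penrose inverse stands in for "a generalized inverse".

```python
import numpy as np

# Hypothetical toy data: two observations in each of two groups.
y = np.array([3.0, 5.0, 1.0, 3.0])

# Full-rank design: a mean column and one +1/-1 (sum-to-zero) column.
X = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, -1.0], [1.0, -1.0]])
b_inverse = np.linalg.solve(X.T @ X, X.T @ y)        # (X'X)^{-1} X'y

# The same solution through a generalized (Moore-Penrose) inverse of X'X.
b_ginverse = np.linalg.pinv(X.T @ X) @ X.T @ y

# Overparameterized design: mean plus one indicator column per group.
X_o = np.array([[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 0.0, 1.0]])
b_o = np.linalg.pinv(X_o.T @ X_o) @ X_o.T @ y

print(b_inverse)   # mean and group effect under sum-to-zero coding
print(X_o @ b_o)   # fitted values from the overparameterized model
```

The individual elements of `b_o` depend on the generalized inverse chosen, but the fitted values agree with those of the full-rank model.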


Sum-to-Zero Restrictions


The design matrix presented in this chapter is reparameterized by sum-to-zero restrictions

to (1) reduce the dimension of the matrices to a minimal size, and (2) yield estimates of fixed

effects with the same solution as common formulae in the balanced case. Other restrictions such

as set-to-zero could also be applied so the discussion that follows treats sum-to-zero restrictions

as a specific solution to the more general problem which is finding an inverse for X'X. The

subscripts 'o' and 's' refer to the overparameterized model and the reparameterized model with

sum-to-zero restrictions, respectively.

The matrix $\mathbf{X}_o$ of Figure 3-1 is the design matrix for an overparameterized linear model
(Milliken and Johnson 1984, page 96). Overparameterization means that the equations are written
in more unknowns (parameters, in this case 13) than there are equations (number of observations
minus degrees of freedom for error, in this case 12 - 5 = 7) with which to estimate the
parameters. Reparameterization as a sum-to-zero matrix overcomes this dilemma by reducing
the number of parameters through making some of the parameters linear combinations of others.
Sum-to-zero restrictions make the resulting parameters and estimates sum to zero even though
the unrestricted parameters (for example, the true GCA values as applied to a broader population)
do not necessarily sum to zero within a diallel. This is the problem of comparability of GCA
estimates from disconnected experiments.



         μ   B1  B2   GCA1 GCA2 GCA3 GCA4   SCA12 SCA13 SCA14 SCA23 SCA24 SCA34
y112     1   1   0    1    1    0    0      1     0     0     0     0     0
y113     1   1   0    1    0    1    0      0     1     0     0     0     0
y114     1   1   0    1    0    0    1      0     0     1     0     0     0
y123     1   1   0    0    1    1    0      0     0     0     1     0     0
y124     1   1   0    0    1    0    1      0     0     0     0     1     0
y134     1   1   0    0    0    1    1      0     0     0     0     0     1
y212     1   0   1    1    1    0    0      1     0     0     0     0     0
y213     1   0   1    1    0    1    0      0     1     0     0     0     0
y214     1   0   1    1    0    0    1      0     0     1     0     0     0
y223     1   0   1    0    1    1    0      0     0     0     1     0     0
y224     1   0   1    0    1    0    1      0     0     0     0     1     0
y234     1   0   1    0    0    1    1      0     0     0     0     0     1

$\boldsymbol{\beta}_o$ = [μ, B1, B2, GCA1, GCA2, GCA3, GCA4, SCA12, SCA13, SCA14, SCA23, SCA24, SCA34]'

$\mathbf{y} = \mathbf{X}_o \boldsymbol{\beta}_o$

Figure 3-1. The overparameterized linear model for a four-parent half-diallel planted on a single
site in two blocks displayed as matrices. The design matrix ($\mathbf{X}_o$) and parameter vector ($\boldsymbol{\beta}_o$) are
shown in overparameterized form. 1's and 0's denote the presence or absence of a parameter in
the model for the observed means (data vector, $\mathbf{y}$). The parameters displayed above the design
matrix label the appropriate column for each parameter. Error vector not exhibited.
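The counts quoted for the overparameterized model (13 parameters but only 12 - 5 = 7 independent equations) can be checked numerically. The sketch below rebuilds the matrix of Figure 3-1 from the scalar model of equation 3-1; the variable names are illustrative choices, not notation from the text.

```python
from itertools import combinations
import numpy as np

p, b = 4, 2                                        # parents and blocks, as in Figure 3-1
crosses = list(combinations(range(1, p + 1), 2))   # (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)

rows = []
for i in range(b):
    for (j, k) in crosses:
        mean = [1.0]
        block = [1.0 if i == m else 0.0 for m in range(b)]
        gca = [1.0 if parent in (j, k) else 0.0 for parent in range(1, p + 1)]
        sca = [1.0 if cross == (j, k) else 0.0 for cross in crosses]
        rows.append(mean + block + gca + sca)

X_o = np.array(rows)
print(X_o.shape)                    # rows = observations, columns = parameters
print(np.linalg.matrix_rank(X_o))   # number of independent columns
```

The gap between the number of columns and the rank is what the sum-to-zero reparameterization removes.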


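         μ   B1   GCA1 GCA2 GCA3   SCA12 SCA13
y112     1   1    1    1    0      1     0
y113     1   1    1    0    1      0     1
y114     1   1    0   -1   -1     -1    -1
y123     1   1    0    1    1     -1    -1
y124     1   1   -1    0   -1      0     1
y134     1   1   -1   -1    0      1     0
y212     1  -1    1    1    0      1     0
y213     1  -1    1    0    1      0     1
y214     1  -1    0   -1   -1     -1    -1
y223     1  -1    0    1    1     -1    -1
y224     1  -1   -1    0   -1      0     1
y234     1  -1   -1   -1    0      1     0

$\boldsymbol{\beta}_s$ = [μ, B1, GCA1, GCA2, GCA3, SCA12, SCA13]'

$\mathbf{e}$ = [e112, e113, e114, e123, e124, e134, e212, e213, e214, e223, e224, e234]'

$\mathbf{y} = \mathbf{X}_s \boldsymbol{\beta}_s + \mathbf{e}$

Figure 3-2. The linear model for a four-parent half-diallel planted on a single site in two blocks
displayed as matrices. The design matrix ($\mathbf{X}_s$) and the parameter vector ($\boldsymbol{\beta}_s$) are presented in
sum-to-zero format. The parameters displayed above the design matrix label the appropriate
column for each parameter.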



To illustrate the concept of sum-to-zero estimates versus population parameters, we use
the expectation of a common formula. Becker (1975) gives equation 3-4 (which for balanced
cases is equivalent to $g_j = ((p-1)/(p-2))(\bar{Z}_{j.} - \bar{Z}_{..})$) as the estimate for general combining ability
for the $j$th line with $p$ equalling the number of parents and $Z_{jk}$ equalling the site mean of the $j \times k$
cross; $Z_{j.}$ is the total of the site means of all crosses containing parent $j$, $Z_{..}$ is the total over
all crosses, and bars denote the corresponding means. This equation yields the same solution as the
matrix equations with no missing plots or crosses and with a design matrix which contains the
sum-to-zero restrictions. An evaluation of this formula in a four-parent half-diallel planted in $b$
blocks for the GCA of parent 1 is obtained by substituting the expectation of the linear model
(equation 3-1) for each observation:

$$g_j = \frac{1}{p(p-2)}\left(pZ_{j.} - 2Z_{..}\right) \qquad \text{(3-4)}$$

$$E\{g_1\} = E\left\{\frac{1}{p(p-2)}\left(pZ_{1.} - 2Z_{..}\right)\right\}$$

$$E\{g_1\} = \tfrac{3}{4}GCA_1 - \tfrac{1}{4}(GCA_2 + GCA_3 + GCA_4) + \tfrac{1}{4}(SCA_{12} + SCA_{13} + SCA_{14}) - \tfrac{1}{4}(SCA_{23} + SCA_{24} + SCA_{34}).$$

The result of equation 3-4 is obviously not $GCA_1$ from the unrestricted model (equation
3-1). Thus, $g_1$, an estimable function and an estimate of parameter $GCA_{1s}$ (the estimate of the
GCA of parent 1 given the sum-to-zero restrictions), does not have the same meaning as $GCA_1$
in the unrestricted model. An estimable function is a function of the parameters that equals the
expectation of some linear combination of the observations; but in order for an individual
parameter in a model to be estimable, one must devise a linear combination of the observations
such that the expectation has a weight of one on the parameter one wishes to estimate while
having a weight of zero on all other parameters. A solution such as this does not exist for the
individual parameters in the overparameterized model (equation 3-1). So, although the
sum-to-zero restricted GCA parameters and estimates are forced to sum to zero for the sample of
parents in a given diallel, the unrestricted GCA parameters only sum to zero across the entire
population (Falconer 1981), and an evaluation of $GCA_{1s}$ demonstrates that the estimate contains
other model parameters.
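The expectation of $g_1$ can be reproduced by bookkeeping alone. The sketch below, written under the assumption that $Z_{j.}$ and $Z_{..}$ are totals of cross site means, assigns each cross mean its weight from equation 3-4 and accumulates the coefficient that each parameter of equation 3-1 receives. Block effects are left out because they enter every cross site mean identically, just as $\mu$ does. All names in the code are illustrative.

```python
from fractions import Fraction
from itertools import combinations

p = 4
crosses = list(combinations(range(1, p + 1), 2))

# Weight that equation 3-4 places on each cross site mean Z_jk when j = 1:
# (p - 2) / (p(p - 2)) if the cross contains parent 1, otherwise -2 / (p(p - 2)).
weights = {c: Fraction(p * (1 in c) - 2, p * (p - 2)) for c in crosses}

params = (["mu"] + [f"GCA{j}" for j in range(1, p + 1)]
          + [f"SCA{j}{k}" for j, k in crosses])
expected_g1 = dict.fromkeys(params, Fraction(0))

# E{Z_jk} = mu + GCA_j + GCA_k + SCA_jk, so each weight is added to those four terms.
for (j, k), w in weights.items():
    for name in ("mu", f"GCA{j}", f"GCA{k}", f"SCA{j}{k}"):
        expected_g1[name] += w

for name, coef in expected_g1.items():
    print(name, coef)
```

The printed coefficients are the weights in the final expression for $E\{g_1\}$: the mean cancels, while the other GCA's and all six SCA's remain.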

The result of sum-to-zero restrictions is that the degrees of freedom for a factor equals
the number of columns (parameters) for that factor in $\mathbf{X}_s$ (Figure 3-2). Thus, a generalized
inverse for $\mathbf{X}_s'\mathbf{X}_s$ is not required since the number of columns in the sum-to-zero $\mathbf{X}_s$ matrix for
each factor equals the degrees of freedom for that factor in the model ($\mathbf{X}_s$ is full column rank and
provides a solution to equation 3-3).


Components of the Matrix Equation


The equational components of 3-2 are now considered in greater detail.

Data vector y

Observations (plot means) in the data vector are ordered in the manner demonstrated in

Figure 3-1. For our example Figure 3-1 is the matrix equation of a four parent half-diallel

mating design planted in two randomized complete blocks on a single site. There are six crosses

present in the two blocks for a total of 12 observations in the data vector, y. The observations

are first sorted by block. Second, within each block the observations should be in the same

sequence (for simplicity of presentation only). This sequence is obtained by assigning numbers
1 through p to each of the p parents and then sorting all crosses containing parent 1 (whether as
male or female) as the primary index in ascending numerical order by the other parent of the
cross as the secondary index. Next all crosses containing parent 2 (primary index, as male or
female) in which the other parent in the cross (secondary index) has a number greater than 2 are
also sorted in ascending order by the secondary index. This procedure is followed through
using parent p-1 as the primary index.
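The ordering of Figure 3-1 can be generated mechanically. In this sketch, `itertools.combinations` over the parent numbers is assumed to stand in for the sorting procedure just described, and the label format `y{block}{parent}{parent}` is an illustrative convention.

```python
from itertools import combinations

p, b = 4, 2

# Crosses within a block: primary index 1..p-1, secondary index above the primary.
crosses = list(combinations(range(1, p + 1), 2))

# Observations sorted first by block, then by cross within block.
labels = [f"y{i}{j}{k}" for i in range(1, b + 1) for (j, k) in crosses]
print(labels)
```

For the four-parent, two-block example this reproduces the row order of the data vector in Figure 3-1.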

Design matrix and parameter vector, X and β

The design matrix for a model is conceptually a listing of the parameters present in the
model for each observation (Searle 1987, page 243). In Figure 3-1, $\mathbf{y}$ and $\boldsymbol{\beta}_o$ are exhibited and
the parameters in $\boldsymbol{\beta}_o$ are displayed at the tops of the columns of $\mathbf{X}_o$ (a visually correct
interpretation of the multiplication of a matrix by a vector). For each observation in $\mathbf{y}$, the scalar
model (equation 3-1) may be employed to obtain the listing of parameters for that observation
(the row of the design matrix corresponding to the particular observation). The convention for
design matrices is that the columns for the factors occur in the same order as the factors in the
linear model (equation 3-1 and Figure 3-1). Since design matrices can be devised by first
creating the columns pertinent to each factor in the model (submatrices) and then horizontally
and/or vertically stacking the submatrices, the discussion of the reparameterized design matrix
formulation will proceed by factor.

Mean

The first column of $\mathbf{X}_s$ is for $\mu$ and is a vector of 1's with the number of rows equalling
the number of observations (Figure 3-2). The linear model (equation 3-1) indicates that all
observations contain $\mu$ and the deviation of the observations from $\mu$ is explained in terms of the
factors and interactions in the model plus error.

Block

The number of columns for block is equal to the number of blocks minus one (column
2, $\mathbf{X}_s$). Each row of a block submatrix consists of 1's and 0's or -1's according to the identity
of the observation for which the row is being formed. The normal convention is that the first
column represents block 1 and the second column block 2, etc. through block $b-1$. Since we
have used a sum-to-zero solution ($\sum_i b_i = 0$), the effect due to block $b$ is a linear combination of
the other $b-1$ effects, i.e., $b_b = -\sum_{i=1}^{b-1} b_i$, which in our example is $0 = b_1 + b_2$ and $b_2 = -b_1$.
Thus, the row of the block submatrix for an observation in block $b$ (the last block) has a -1 in
each of the $b-1$ columns signifying that the block $b$ effect is indeed a linear combination of the
other $b-1$ block effects. Columns 2 and 3 of $\mathbf{X}_o$ (Figure 3-1) have become column 2 of $\mathbf{X}_s$
(Figure 3-2).
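A minimal sketch of the block submatrix under the sum-to-zero coding described above. The function name `block_submatrix` and its arguments are hypothetical, and observations are assumed to be sorted by block with the same number of plots in every block.

```python
import numpy as np

def block_submatrix(b, n_per_block):
    """Sum-to-zero block columns: b - 1 columns, rows ordered by block."""
    B = np.zeros((b * n_per_block, b - 1))
    for i in range(b):
        rows = slice(i * n_per_block, (i + 1) * n_per_block)
        if i < b - 1:
            B[rows, i] = 1.0      # blocks 1 .. b-1 get a 1 in their own column
        else:
            B[rows, :] = -1.0     # last block: -1 in every column
    return B

B_example = block_submatrix(b=2, n_per_block=6)   # the four-parent, two-block case
print(B_example.ravel())
```

With two blocks the result is a single column of six 1's followed by six -1's, which is column 2 of the design matrix in Figure 3-2.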








General combining ability

This submatrix of $\mathbf{X}_s$ is slightly more complex than previous factors as a result of having
two levels of a main effect present per observation, i.e., the deviation of an observation from $\mu$
is modeled as the result of the GCA's of both the male and female parents (equation 3-1). Again
we have imposed a restriction, $\sum_j gca_j = 0$. Since GCA has $p-1$ degrees of freedom, the submatrix
for GCA should have $p-1$ columns, i.e., $gca_p = -\sum_{j=1}^{p-1} gca_j$. The GCA submatrix for $\mathbf{X}_s$ (columns
3 through 5 in Figure 3-2) is formed from $\mathbf{X}_o$ (columns 4 through 7 in Figure 3-1) in
the same manner as the block matrix: (1) add minus one to the elements in the other columns
along each row containing a one for $gca_p$ ($p = 4$ in our example); and (2) delete the column from
$\mathbf{X}_o$ corresponding to $gca_p$. The GCA submatrix has $p(p-1)/2$ rows (the number of crosses). This,
with no missing cells (plots), equals the number of observations per block. To form the GCA
factor submatrix for a site, the GCA submatrix is vertically concatenated (stacked on itself) $b$
times. This completes the portion of the $\mathbf{X}_s$ matrix for GCA.
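The two-step rule for the GCA submatrix can be sketched as follows for the four-parent, two-block example. Variable names are illustrative; subtracting the $gca_p$ column from the remaining columns performs step (1) and slicing it off performs step (2).

```python
from itertools import combinations
import numpy as np

p, b = 4, 2
crosses = list(combinations(range(1, p + 1), 2))

# Overparameterized GCA columns within one block: a 1 for each parent of the cross.
G_o = np.array([[1.0 if parent in c else 0.0 for parent in range(1, p + 1)]
                for c in crosses])

# Step 1: add -1 to the other columns wherever the gca_p column holds a 1.
# Step 2: delete the gca_p column.
G_s = G_o[:, :p - 1] - G_o[:, [p - 1]]

# Stack the within-block submatrix b times to obtain the submatrix for the site.
G_site = np.vstack([G_s] * b)
print(G_s)
```

The six rows of `G_s` should match columns 3 through 5 of Figure 3-2 within a block.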

Specific combining ability

In order to facilitate construction of the SCA submatrix, a horizontal direct product
should be defined. A horizontal direct product, as applied to two column vectors, is the element
by element product between the two vectors (SAS/IML¹ User's Guide 1985) such that the
element in the $i$th row of the resulting product vector is the product of the elements in the $i$th rows
of the two initial vectors. The resultant product vector has dimension $n \times 1$. A horizontal direct
product is useful for the formation of interaction or nested factor submatrices where the initial
matrices represent the main factors and the resulting matrix represents an interaction or a nested
factor (product rule, Searle 1987).


¹SAS/IML is the registered trademark of the SAS Institute Inc., Cary, North Carolina.








The SCA submatrix can be formulated from the horizontal direct products of the columns
of the GCA submatrix in $\mathbf{X}_s$ (Figure 3-2). The results from the GCA columns require
manipulation to become the SCA submatrix (since degrees of freedom for SCA do not equal those
of an interaction for a half-diallel analysis), but the GCA column products provide a convenient
starting point. The column of the SCA submatrix representing the cross between the $j$th and the
$k$th parents ($SCA_{jk}$) is formed as the product between the $GCA_j$ and $GCA_k$ columns (Figure 3-3).
The GCA columns in Figure 3-2 are multiplied in this order: column 1 times column 2 forming
the first SCA column, column 1 times column 3 forming the second SCA column, and column
2 times column 3 forming the third SCA column (Figure 3-3). With four parents (six crosses)
there are three degrees of freedom for GCA ($p-1$) and two degrees of freedom for SCA (6 crosses
- 3 for GCA - 1 for the mean). Since SCA has only two degrees of freedom, a sum-to-zero
design matrix can have only two columns for SCA. Imposing the restriction that the sum of the
SCA's across all parents equals zero is equivalent to making the last column for the SCA
submatrix (Figure 3-3) a linear combination of the others (Figure 3-2). The procedure for
deleting the third column product is identical to that for the GCA submatrix: add minus one to
every element in the rows of the remaining SCA columns in which a one appears in the column
which is to be deleted (Figure 3-2, columns 6 and 7). The number of rows in the SCA submatrix
equals the number of observations in a block, and the submatrix must be vertically concatenated
$b$ times to create the SCA submatrix for a site.
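The SCA construction for the four-parent example can be sketched in a few lines. The within-block GCA columns are typed in directly from the sum-to-zero coding described above, NumPy's element-by-element `*` plays the role of the horizontal direct product, and all variable names are illustrative.

```python
import numpy as np

# Sum-to-zero GCA columns within one block (rows: crosses 12, 13, 14, 23, 24, 34).
G_s = np.array([[1, 1, 0], [1, 0, 1], [0, -1, -1],
                [0, 1, 1], [-1, 0, -1], [-1, -1, 0]], dtype=float)

# Horizontal direct products: GCA1 x GCA2, GCA1 x GCA3, GCA2 x GCA3.
pairs = [(0, 1), (0, 2), (1, 2)]
products = np.column_stack([G_s[:, j] * G_s[:, k] for j, k in pairs])

# Delete the last product column after adding -1 to the remaining columns
# in every row where the deleted column holds a 1.
S_s = products[:, :-1] - products[:, [-1]]

b = 2
S_site = np.vstack([S_s] * b)   # SCA submatrix for the site
print(products)
print(S_s)
```

`products` corresponds to the intermediate result of Figure 3-3 and `S_s` to the two SCA columns of Figure 3-2 within a block.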

An algebraic evaluation of SCA sum-to-zero restrictions requires that $\sum_j sca_{jk} = 0$ for
each $k$ and that $\sum_j \sum_k sca_{jk} = 0$; thus, for observations in the $i$th block with $i$ serving to denote the
row of the SCA submatrix in block $i$, $sca_{i14} = -sca_{i12} - sca_{i13}$ and entries in the submatrix row for
$y_{i14}$ are -1's. The estimate for $sca_{i23}$ equals $sca_{i14}$ because $sca_{i23}$ is the negative of the sum of the
independently estimated SCA's ($sca_{i12}$ and $sca_{i13}$) from the restriction that the sum of the SCA's
across all parents equals zero. Similarly, by sum-to-zero definition $sca_{i24} = -sca_{i23} - sca_{i12}$ and by
substitution $sca_{i24} = -(-sca_{i12} - sca_{i13}) - sca_{i12} = sca_{i13}$. By the same protocol, it can be shown that
$sca_{i34} = sca_{i12}$. The elements in the rows of the SCA submatrix are 1's, -1's and 0's in
accordance with the algebraic evaluation. Thus, while it may seem that there should be 6 SCA
values (one for each cross), only 2 can be independently estimated and the remaining 4 are linear
combinations of the independently estimated SCA's. Again the SCA sum-to-zero estimates are
not equal to the parametric population SCA's. An analogous illustration for SCA to that for
GCA would show that the estimable function (linear combination of observations) for a given
SCA contains a variety of other parameters.


OBS.     GCA1 x GCA2     GCA1 x GCA3     GCA2 x GCA3     SCA12   SCA13   SCA23
y_i12    (1)(1)=1        (1)(0)=0        (1)(0)=0        1       0       0
y_i13    (1)(0)=0        (1)(1)=1        (0)(1)=0        0       1       0
y_i14    (0)(-1)=0       (0)(-1)=0       (-1)(-1)=1      0       0       1
y_i23    (0)(1)=0        (0)(1)=0        (1)(1)=1        0       0       1
y_i24    (-1)(0)=0       (-1)(-1)=1      (0)(-1)=0       0       1       0
y_i34    (-1)(-1)=1      (-1)(0)=0       (-1)(0)=0       1       0       0

Figure 3-3. Intermediate result in SCA submatrix generation (SCA columns as horizontal direct
products of GCA1, GCA2, and GCA3 columns within a block). The $SCA_{jk}$ column is the
horizontal direct product of the columns for $GCA_j$ and $GCA_k$.


Estimation of Fixed Effects


GCA parameters

The GCA parameters can be estimated (without mean, block, and SCA in the design
matrix) through the use of equation 3-3, if there are no missing cell means (plots) for any cross
and no missing crosses. The design matrix consists only of the GCA submatrix. This design
matrix has $p-1$ columns for GCA's (the third through the fifth columns of $\mathbf{X}_s$). The $\mathbf{b}$ vector
is an estimate of the GCA portion of $\boldsymbol{\beta}_s$ as in Figure 3-2, and the linear combination for the
estimation of $gca_p$ is $gca_p = -\sum_{j=1}^{p-1} gca_j$. Parameters for any of the factors can be estimated
independently using the pertinent submatrix as long as there are no missing cell means (plots) and
no missing crosses; this uses a property known as orthogonality.

Orthogonality requires that the dot product between two vectors equals zero (Schneider
1987, page 168). The dot product (a scalar) is the sum of the values in a vector obtained from
the horizontal direct product of two vectors. For two factors to be orthogonal, the dot products
of all the column vectors making up the section of the design matrix for one factor with the
column vectors making up the portion of the design matrix for the second must be zero. If all
factors in the model are orthogonal, then the $\mathbf{X}_s'\mathbf{X}_s$ matrix is block diagonal. A block-diagonal
$\mathbf{X}_s'\mathbf{X}_s$ matrix is composed of square factor submatrices (degrees of freedom x degrees of freedom)
along the diagonal with all off-diagonal elements not in one of the square factor submatrices
equalling zero. A property of block-diagonal matrices is that the inverse can be calculated by
inverting each block separately and replacing the original block in the full $\mathbf{X}'\mathbf{X}$ matrix by the
inverted block. Because the blocks can be inverted separately and all other off-diagonal elements
of the inverse are zero, the effects for factors which are orthogonal to all other factors may be
estimated separately, i.e., there are no functions of other sum-to-zero factors in the sum-to-zero
estimates.
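The block-diagonal property can be illustrated on the balanced four-parent, two-block design of Figure 3-2. In this sketch the within-block GCA and SCA columns are typed in from the sum-to-zero coding, the response vector is arbitrary random numbers (not data from the text), and the names are illustrative.

```python
import numpy as np

b = 2
G = np.array([[1, 1, 0], [1, 0, 1], [0, -1, -1],
              [0, 1, 1], [-1, 0, -1], [-1, -1, 0]], dtype=float)   # GCA columns
S = np.array([[1, 0], [0, 1], [-1, -1],
              [-1, -1], [0, 1], [1, 0]], dtype=float)              # SCA columns
n = len(G)                                                         # crosses per block

mean = np.ones((b * n, 1))
block = np.vstack([np.ones((n, 1)), -np.ones((n, 1))])
X_s = np.hstack([mean, block, np.vstack([G] * b), np.vstack([S] * b)])

XtX = X_s.T @ X_s
factor = np.array([0, 1, 2, 2, 2, 3, 3])          # factor to which each column belongs
off_block = XtX[factor[:, None] != factor[None, :]]
print(np.allclose(off_block, 0.0))                # True when all factors are orthogonal

# Orthogonality lets GCA be estimated from its own submatrix alone.
rng = np.random.default_rng(0)
y = rng.normal(size=b * n)                        # arbitrary made-up plot means
b_full = np.linalg.solve(XtX, X_s.T @ y)
G_site = X_s[:, 2:5]
b_gca = np.linalg.solve(G_site.T @ G_site, G_site.T @ y)
print(np.allclose(b_full[2:5], b_gca))
```

Deleting a row of `X_s` (a missing plot) before forming `XtX` is a quick way to see the off-diagonal blocks become nonzero.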

Mean, block, GCA and SCA parameters

All parameters are estimated simultaneously by horizontally concatenating the mean,
block, GCA, and SCA matrices to create $\mathbf{X}_s$. Equation 3-3 is again utilized to solve the system
of equations. The $\mathbf{b}$ vector for the four parent example is an estimate of $\boldsymbol{\beta}_s$ of Figure 3-2.
Again, one parameter is estimated for each column in the $\mathbf{X}_s$ matrix and all parameter estimates
not present are linear combinations of the parameter estimates in the $\mathbf{b}$ vector. So $b_b$ is equal to
$-\sum_{i=1}^{b-1} b_i$ and $gca_p$ is equal to $-\sum_{j=1}^{p-1} gca_j$. The linear combinations for SCA effects can be obtained
by reading along the row of the SCA submatrix associated with the observation containing the
parameter, i.e., in Figure 3-2 the observation $y_{114}$ contains the effect $sca_{14}$ which is estimated as
the linear combination $-sca_{12} - sca_{13}$.

This completes the estimation of fixed effect parameters from a data set which is balanced

on a plot-mean basis. Since field data sets with such completeness are a rarity in forestry

applications, the next step is OLS analysis for various types of data imbalance. Calculations of

solutions based on a complete data set and simulated data sets with common types of imbalance

are demonstrated in numerical examples.



Numerical Examples


The data set analyzed in the numerical examples is from a five-year-old, six-parent
half-diallel slash pine (Pinus elliottii var. elliottii Engelm.) progeny test planted on a single site in

four complete blocks. Each cross is represented by a five-tree row plot within each block. Total

height in meters and diameter at breast height (dbh in centimeters) are the traits selected for

analysis. The data set is presented in Table 3-1 so that the reader may reconstruct the analysis

and compare answers with the examples. The numbers 1 through 6 were arbitrarily assigned to

the parents for analysis. Because of unequal survival within plots, plot means are used as the unit

of observation.


Balanced Data (Plot-mean Basis)


The sum-to-zero design matrix for the balanced data set has (4 blocks) x (15 crosses) = 60
rows (which equals the number of observations in $\mathbf{y}$) and has the following columns: one column
for $\mu$, three columns for blocks ($b-1$), five columns for GCA ($p-1$), and nine columns for SCA
(15 crosses - 5 - 1) for a total of 18 columns. With sixty plot means (degrees of freedom) and
18 degrees of freedom in the model, subtracting 18 from 60 yields 42 degrees of freedom for
error which matches the degrees of freedom for cross by block interaction, thus verifying that
degrees of freedom concur with the number of columns in the sum-to-zero design matrix.

To illustrate the principle of orthogonality in the balanced case, the $\mathbf{X}'\mathbf{X}$ and $(\mathbf{X}'\mathbf{X})^{-1}$ matrices
may be printed to show that they are block diagonal. In further illustration, the effects within
a factor may also be estimated without any other factors in the design matrix and compared to
the estimates from the full design matrix.

The vectors of parameter estimates for height and dbh (Table 3-2) were calculated from the
same $\mathbf{X}_s$ matrix because height and dbh measurements were taken on the same trees. In other
words, if a height measurement was taken on a tree, a dbh measurement was also taken, so the
design matrices are equivalent.


Missing Plot


To illustrate the problem of a missing plot, the cross, parent two by parent three, was
arbitrarily deleted in block one (as if observation $y_{123}$ were missing). This deletion prompts
adjustments to the factor matrices in order to analyze the new data set. The new vector of
observations ($\mathbf{y}$) now has 59 rows. This necessitates deletion of the row of the design matrix ($\mathbf{X}_s$)
in block 1 which would have been associated with cross 2 x 3. This is the only matrix alteration
required for the analysis. Thus, the resultant $\mathbf{X}_s$ matrix has 60 - 1 = 59 rows and 18 columns.
With 59 means in $\mathbf{y}$ and 18 columns in $\mathbf{X}_s$, the degrees of freedom for error is 41.

Comparisons between results of the analyses (Table 3-2) of the full data set and the data
set missing observation $y_{123}$ reveal that for this case the estimates of parameters have been
relatively unaffected by the imbalance (magnitudes of GCA's changed only slightly and rankings
by GCA were unaffected).








Table 3-1. Data set for numerical examples. Five-year-old slash pine progeny test with a 6-
parent half-diallel mating design present on a single site with four randomized complete blocks
and a five-tree row plot per cross per block.

                      Mean     Mean    Within-plot variance   Trees
Block Female Male     Height   DBH     Height    DBH          per plot
                      (m)      (cm)    (m^2)     (cm^2)
1 1 2 2.6899 3.810 0.9800 3.484 4
1 1 3 1.9080 2.134 1.4277 3.893 5
1 1 5 3.1242 4.445 0.4487 1.656 4
1 1 6 2.4933 3.200 0.8488 5.664 5
1 2 5 1.4783 1.588 0.6556 2.167 4
1 2 6 2.7026 3.471 0.1136 0.344 3
1 3 2 3.0480 4.699 0.2341 0.968 4
1 3 5 3.4991 5.131 0.0945 0.271 5
1 3 6 2.4003 2.794 0.5149 1.548 4
1 4 1 3.3955 4.928 0.1489 0.761 5
1 4 2 3.4290 5.144 0.7943 3.285 4
1 4 3 2.5298 2.984 0.9557 4.188 4
1 4 5 2.4155 3.175 0.5936 2.946 4
1 4 6 3.2004 4.521 1.7034 7.594 5
1 5 6 2.2403 2.794 1.0433 6.280 4
2 1 2 3.5662 5.080 0.9560 2.903 5
2 1 3 2.6335 3.353 0.7695 3.497 5
2 1 5 3.6942 5.893 0.0573 0.432 5
2 1 6 3.4808 4.928 0.9222 2.890 5
2 2 5 3.4260 4.877 0.7017 2.432 5
2 2 6 2.4282 3.302 0.0616 0.452 3
2 3 2 3.0480 4.064 0.0192 0.301 4
2 3 5 2.8895 4.013 0.1957 0.690 5
2 3 6 1.9406 1.863 0.0560 0.408 3
2 4 1 3.0114 3.962 1.9753 6.342 5
2 4 2 3.6454 5.283 0.1731 0.787 5
2 4 3 2.9566 3.861 0.0506 0.174 5
2 4 5 2.8118 4.382 1.1336 5.435 4
2 4 6 3.2674 4.318 1.1211 4.354 5
2 5 6 3.7917 5.893 0.0848 0.497 5
3 1 2 2.2961 2.625 0.3914 1.699 3
3 1 3 2.8956 4.128 1.2926 4.532 4
3 1 5 2.5359 3.607 0.8284 4.303 5
3 1 6 2.9032 3.937 0.8252 4.064 4
3 2 5 2.7737 4.064 0.9829 3.226 2
3 2 6 1.2040 0.635 0.4464 0.806 2
3 3 2 2.9870 4.191 0.9049 2.989 4
3 3 5 2.8407 3.962 0.7309 3.632 5
3 3 6 1.3564 0.000 0.1677 0.000 2
3 4 1 2.6746 3.620 0.8463 2.984 4
3 4 2 2.7066 3.353 0.5590 1.787 5
3 4 3 3.4198 4.623 0.3509 0.690 5
3 4 5 3.3299 4.953 0.4102 1.226 4
3 4 6 3.4564 4.978 0.8369 3.503 5
3 5 6 3.2614 4.826 .
4 1 2 1.8974 2.476 1.0160 3.629 4
4 1 3 1.3005 0.508 0.2019 0.774 3
4 1 5 2.0726 2.540 1.2235 5.097 3
4 1 6 1.8821 1.778 0.4728 3.312 4
4 2 5 1. 64 1.334 0.5354 2.382 4
4 2 6 1.5392 0.635 0.0376 0.806 2
4 3 2 1.8898 2.032 0.7364 1.892 4
4 3 5 2.5146 3.620 0.0876 0.446 4
4 3 6 1.8389 2.201 0.0941 0.280 3
4 4 1 2.3348 2.591 0.3816 2.722 5
4 4 2 1.7272 1.693 2.1640 8.602 3
4 4 3 1.6581 1.524 0.0537 0.903 5
4 4 5 2.1184 2.286 0.3137 2.366 4
4 4 6 1.5545 1.422 0.4803 1.019 5
4 5 6 1.4122 1.693 0.0338 0.150 3










Table 3-2. Numerical results for examples of data imbalance using the OLS techniques presented
in the text.


                  Balanced^a         Missing Plot^b     Missing Cross^c    Five Missing Crosses^d
Estimate of^e     Height    DBH      Height    DBH      Height    DBH      Height    DBH
μ                 2.5830    3.362    2.5787    3.346    2.5386    3.260    2.4980    3.149
B1                0.1203    0.292    0.1074    0.245    0.1074    0.245    0.1393    0.309
B2                0.5230    0.976    0.5274    0.992    0.5386    1.023    0.6041    1.140
B3                0.1264    0.205    0.1308    0.220    0.1180    0.187    0.0689    0.087
GCA1              0.0706    0.144    0.0760    0.163    0.1260    0.270    0.1361    0.232
GCA2              -.1077    -.180    -.1186    -.220    -.2186    -.434    -.2371    -.493
GCA3              -.1316    -.347    -.1426    -.386    -.2426    -.601    -.3972    -.952
GCA4              0.2489    0.398    0.2544    0.417    0.3044    0.524    0.4241    0.804
GCA5              0.1265    0.489    0.1320    0.509    0.1820    0.616    0.1746    0.646
SCA12             0.0665    0.172    0.0763    0.208    0.1663    0.400
SCA13             -.3374    -.628    -.3277    -.592    -.2377    -.400
SCA14             -.0484    -.128    -.0550    -.152    -.1150    -.280    -.2041    -.410
SCA15             0.0766    0.126    0.0700    0.102    0.0100    -.026    0.0480    0.094
SCA23             0.3995    0.912    0.3600    0.771
SCA24             0.1528    0.289    0.1627    0.324    0.2527    0.517    0.1920    0.408
SCA25             -.3185    -.706    -.3084    -.670    -.2187    -.478
SCA34             -.0592    0.164    -.0493    0.129    0.0406    0.064    0.1163    0.246
SCA35             0.3580    0.677    0.3679    0.712    0.4793    0.905

^a where (numerical examples are for height)
   b4 = -Σ(i=1..3) b_i = -.7697;
   gca6 = -Σ(j=1..5) gca_j = -.2067;
   sca_p6 = -Σ_j sca_pj for p = 1, 2, 3, then sca16 = .2428,
   sca26 = -.3002, and sca36 = -.3608; sca45 = -Σ_e sca_e = -.2898,
   e = independently estimated sca's 1, ..., 9;
   sca46 = sca12 + sca13 + sca15 + sca23 + sca25 + sca35 = .2446;
   and sca56 = sca12 + sca13 + sca14 + sca23 + sca24 + sca34 = .1737.

^b where the linear combinations for parameter estimates are identical
   to the balanced example.

^c where sca_p6 = -Σ_j sca_pj for p = 1 to 3; sca45 = -Σ_e sca_e,
   e = independently estimated sca's 1, ..., 8;
   sca46 = sca12 + sca13 + sca15 + sca25 + sca35; and
   sca56 = sca12 + sca13 + sca14 + sca24 + sca34.

^d where sca16 = -sca14 - sca15, sca26 = -sca24 - sca25, sca36 = -sca34,
   sca46 = -sca56, sca56 = sca14 + sca24 + sca34, and
   sca25 = the negative of the sum of the four independently
   estimated sca's.

^e where for all cases linear combinations for block and gca are the same as in the balanced case.








Missing Cross


Another common form of imbalance in diallel data sets, the missing cross, is examined
through arbitrary deletion of the 2 x 3 cross from all blocks, i.e., $y_{123}$, $y_{223}$, $y_{323}$, and $y_{423}$ are missing
in the data vector. This type of imbalance is representative of a particular cross that could not
be made and is therefore missing from all blocks. The matrix manipulations required for this
analysis are again presented by factor. For appropriate SCA restrictions, the data vector and
design matrix should be ordered so that the $p$th parent has no missing crosses. Since the labeling
of a parent as parent $p$ is entirely subjective, any parent with all crosses may be designated as
parent $p$. The previous labelling directions are necessary since we generate the SCA submatrix
as horizontal direct products of the columns of the GCA submatrix; and to account for missing
crosses, the horizontal direct product for each missing parental combination is not
calculated, which sets the missing SCA's to zero. If there is a cross missing from those of the
$p$th parent, we cannot account for the missing cross with this technique (Searle 1987, page 479).

For the mean, block, and GCA submatrices, the adjustment for the missing cross dictates
deleting the rows in the submatrices which would have corresponded to the $y_{i23}$ observations. The
SCA submatrix must be reformed since a degree of freedom for SCA and hence a column of the
submatrix has been lost. The SCA submatrix is reinstituted from the GCA horizontal direct
products (remembering that one cross, 2 x 3, no longer exists and therefore that product GCA2 x
GCA3 is inappropriate). Dropping the column for $SCA_{23}$ is equivalent to setting $SCA_{23}$ to zero
(Searle 1987) so that the remaining SCA's will sum to zero. After that, the reformation is
according to the established pattern. With one missing cross there are now 56 observations and
hence 56 degrees of freedom available. The columns of the $\mathbf{X}_s$ matrix are now: one for the
mean, three for block, five for GCA, and eight for SCA for a total of 17 columns. The
remaining degrees of freedom for error is 39, matching the correct degrees of freedom
((14-1) x (4-1) = 39).

For the missing cross example the estimate of $\mu$ is no longer equivalent to the mean of the plot means
since $\hat{\mu} = 2.5386$ and $(\sum_i \sum_{jk} y_{ijk})/N = 2.5715$ where $N = 56$ (number of plot means). This is the
result of GCA effects which are no longer orthogonal to the mean. Check the $\mathbf{X}_s'\mathbf{X}_s$ matrix or
try estimating factors separately and compare to the estimates when all factors are included in $\mathbf{X}_s$.

If formulae for balanced data (Becker 1975, Falconer 1981, and Hallauer and Miranda
1981) are applied to unbalanced data (plot-mean basis), estimates of parameters are no longer
appropriate because factors in the model are no longer independent (orthogonal). Applying
Becker's formula which uses totals of cross means for a site ($y_{.jk}$) to the missing cross example
yields: $gca_1 = .2992$, $gca_2 = -.5649$, $gca_3 = -.5888$, $gca_4 = .4665$, $gca_5 = .3552$, and $gca_6 =
.0219$. These answers are very different in magnitude from those in Table 3-2 for this example
and $gca_6$ also has a different sign. Employing these formulae in the analysis of unbalanced data
is analogous to matrix estimation of GCA's without the other factors in the model which is
inappropriate.


Several Missing Crosses


The concluding example (Table 3-2) is a drastically unbalanced data set resulting from
the arbitrary deletion of five crosses (1 x 2, 1 x 3, 2 x 3, 3 x 5, and 4 x 5). The matrix
manipulation for this example is an extension of the previous one cross deletion example. Rows
corresponding to $y_{i12}$, $y_{i13}$, $y_{i23}$, $y_{i35}$, and $y_{i45}$ are deleted from the mean, block and GCA
submatrices for all blocks. The SCA matrix (now 4 columns = 10 crosses - 5 - 1 = 4 degrees
of freedom) is again reformed with only the relevant products of the GCA columns. Counting
degrees of freedom (columns of the sum-to-zero design matrix), the mean has one, block has
three, GCA has five, and SCA has four degrees of freedom for a total of 13. Error has
(4-1)(10-1) = 27 degrees of freedom. Totaling degrees of freedom for modeled effects and error yields
40 which equals the number of plot means.
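The degrees-of-freedom bookkeeping for the four numerical examples can be checked by building each design matrix and counting rows, columns, and rank. The function below is a sketch under stated assumptions: its name and arguments are hypothetical, parent $p$ is assumed to have no missing crosses, and the SCA column that is dropped is always the last available product, which may differ from the column chosen in the original analysis without changing the column count or the rank.

```python
from itertools import combinations
import numpy as np

def sum_to_zero_design(p, b, missing=()):
    """Sum-to-zero design matrix for a half-diallel in b complete blocks.
    `missing` lists crosses absent from every block."""
    crosses = [c for c in combinations(range(1, p + 1), 2) if c not in set(missing)]
    n = len(crosses)
    G = np.zeros((n, p - 1))
    for r, (j, k) in enumerate(crosses):
        for parent in (j, k):
            if parent == p:
                G[r, :] -= 1.0            # parent p: -1 in every GCA column
            else:
                G[r, parent - 1] += 1.0
    # Horizontal direct products only for crosses that exist among parents 1..p-1.
    pairs = [c for c in crosses if p not in c]
    prod = np.column_stack([G[:, j - 1] * G[:, k - 1] for j, k in pairs])
    S = prod[:, :-1] - prod[:, [-1]]      # drop the last product column
    block = np.vstack([np.tile(np.eye(b - 1)[i], (n, 1)) for i in range(b - 1)]
                      + [-np.ones((n, b - 1))])
    return np.hstack([np.ones((b * n, 1)), block, np.vstack([G] * b), np.vstack([S] * b)])

balanced = sum_to_zero_design(6, 4)
designs = {
    "balanced": balanced,
    "missing plot": np.delete(balanced, 5, axis=0),   # row of cross 2 x 3 in block 1
    "missing cross": sum_to_zero_design(6, 4, missing=[(2, 3)]),
    "five missing crosses": sum_to_zero_design(
        6, 4, missing=[(1, 2), (1, 3), (2, 3), (3, 5), (4, 5)]),
}
for name, X in designs.items():
    rank = np.linalg.matrix_rank(X)
    print(name, X.shape, "rank", rank, "error df", X.shape[0] - rank)
```

Each matrix should be of full column rank, with error degrees of freedom equal to the number of plot means minus the number of columns.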

In increasingly unbalanced cases (Table 3-2), the spread among the GCA estimates tends

to increase with increasing imbalance (loss of information). This is a general feature of OLS

analyses and the basis for the feature is that the spread among the GCA estimates is due to both

the innate spread due to additive genetic effects as well as the error in estimation of the GCA's.

When there is less information, GCA estimates tend to be more widely spread due to the increase

in the error variance associated with their estimation. This feature has been noted (White and

Hodge 1989, page 54) as the tendency to pick as parental winners individuals in a breeding

program which are the most poorly tested.


Discussion


After developing the OLS analysis and describing the inherent assumptions of the
analysis, there are four important factors to consider in the interpretation of sum-to-zero OLS
solutions: (1) the lack of uniqueness of the parameter estimates; (2) the weights given to plot
means ($y_{ijk}$) and in turn site means ($y_{.jk}$) for crosses in data sets with missing crosses in parameter
estimation; (3) the arbitrary nature of using a diallel mean (perforce a narrow genetic base) as
the mean about which the GCA's sum to zero; and (4) the assumption that the covariance matrix
for the observations ($\mathbf{V}$) is $\mathbf{I}\sigma^2_e$.


Uniqueness of Estimates


Sum-to-zero restrictions furnish what would appear to be unique estimates of the
individual parameters, e.g. $GCA_1$, when, in fact, these individual parameters are not estimable
(Graybill 1976, Freund and Littell 1981, and Milliken and Johnson 1984). The lack of
estimability is again analogous to attempting to solve a set of equations in $n$ unknowns with $t$
equations where $n$ is greater than $t$. Therefore, an infinite number of solutions exist for $\boldsymbol{\beta}$.

There are quantities in this system of equations that are unique (estimable), i.e., the
estimate is invariant regardless of the restriction (sum-to-zero or set-to-zero) or generalized
inverse (no restrictions) used (Milliken and Johnson 1984) and the estimable functions include
sum-to-zero GCA and SCA estimates since they are linear combinations of the observations; but,
these estimable quantities do not estimate the individual parametric GCA's and SCA's of the
overparameterized model (equation 3-1) since there is no unique solution for those parameters.


Weighting of Plot Means and Cross Means in Estimating Parameters


With at least one measurement tree in each plot and with plot means as the unit of
observation, use of the matrix approach produces the same results as the basic formulae. The
weight placed on each plot mean in the estimation of a parameter can be determined by
calculating $(X_s'X_s)^{-1}X_s'$, which can be viewed as a matrix of weights W so that
equation 3-3 can be written as b = Wy. The matrix W has these dimensions: the number of rows
equals the number of parameters in $\beta$, and the number of columns equals the number of
plot means in y. The $i^{th}$ row of W contains the weights applied to y to estimate the
$i^{th}$ parameter in b ($\beta_i$). In the discussion which follows gca$_1$ is utilized as
$\beta_i$.
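The weight matrix can be inspected numerically. The following is a minimal sketch, not the analysis used in this chapter: it assumes a six-parent half-diallel, one observation per cross (the overall cross mean), and a simplified model containing only the mean and GCA effects (SCA is omitted). The function names and the choice of cross 2 x 3 as the missing cross are hypothetical. Under these assumptions the balanced weights for gca$_1$ should be 1/6 and -1/12, and the marginal weights should stay at 5/6 and -1/6 when a cross is removed; because SCA is omitted, the individual unbalanced cell weights need not match those of Figure 3-4.

```python
import itertools
import numpy as np

def sum_to_zero_design(crosses, p):
    """Design matrix for y_jk = mu + gca_j + gca_k (SCA omitted, an assumption),
    with gca_p replaced by -(gca_1 + ... + gca_{p-1})."""
    X = np.zeros((len(crosses), p))
    X[:, 0] = 1.0                          # column 0: overall mean
    for row, cross in enumerate(crosses):
        for parent in cross:
            if parent < p:
                X[row, parent] += 1.0      # columns 1..p-1: gca_1..gca_{p-1}
            else:
                X[row, 1:] -= 1.0          # parent p enters through the restriction
    return X

def weight_matrix(X):
    """W = (X'X)^{-1} X', so that b = W y."""
    return np.linalg.solve(X.T @ X, X.T)

def marginal_weights(w, crosses, p):
    """Sum of the weights over all crosses in which each parent appears."""
    return np.array([sum(wi for wi, c in zip(w, crosses) if parent in c)
                     for parent in range(1, p + 1)])

p = 6
balanced = list(itertools.combinations(range(1, p + 1), 2))    # 15 crosses
w_bal = weight_matrix(sum_to_zero_design(balanced, p))[1]       # row of W for gca_1

one_missing = [c for c in balanced if c != (2, 3)]              # hypothetical missing cross
w_miss = weight_matrix(sum_to_zero_design(one_missing, p))[1]

print(np.round(w_bal, 5))
print(np.round(marginal_weights(w_bal, balanced, p), 5))
print(np.round(marginal_weights(w_miss, one_missing, p), 5))
```

The marginal weights follow from WX = I: any weight vector that is unbiased for gca$_1$ under the sum-to-zero restriction must sum to 5/6 over crosses containing parent 1 and to -1/6 over crosses containing each other parent.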

If there are no missing plots, the cross mean in every block ($\bar{y}_{ijk}$) has the same
weighting and weights can be combined across blocks to yield the weight on the overall cross
mean ($\bar{y}_{.jk}$). It can be shown that for the balanced numerical example gca$_1$ is
calculated by weighting the overall cross means containing parent 1 by 1/6 and weighting all
overall cross means not











          GCA1      GCA2      GCA3      GCA4      GCA5      GCA6      Marginal
                                                                      weight

GCA1                 1/6       1/6       1/6       1/6       1/6        5/6
                    .16667    .16667    .16667    .16667    .16667

GCA2     .14583               -1/12     -1/12     -1/12     -1/12      -1/6
         missing             -.08333   -.08333   -.08333   -.08333

GCA3     .14583    missing              -1/12     -1/12     -1/12      -1/6
         missing   missing             -.08333   -.08333   -.08333

GCA4     .18056   -.10417   -.10417               -1/12     -1/12      -1/6
         .22549    .01961   -.11765              -.08333   -.08333

GCA5     .18056   -.10417   -.10417   -.06944               -1/12      -1/6
         .31372   -.27451   missing   missing              -.08333

GCA6     .18056   -.10417   -.10417   -.06944   -.06944                -1/6
         .29412    .08824   -.04902   -.29412   -.20588

Figure 3-4. Weights on overall cross means ($\bar{y}_{.jk}$) for the three numerical examples
for estimation of GCA$_1$. The weights for the balanced example (above the diagonal) are
presented in both fractional and decimal form. The weights for the one-cross missing and the
five-crosses missing are presented as the upper number and lower number, respectively, in
cells below the diagonal. The marginal weights on GCA parameters (right margin) do not change
although cells are missing.








containing parent 1 by -1/12. Figure 3-4 (above the diagonal) demonstrates the weightings on

the overall cross means for the balanced numerical example as well as the marginal weighting on

the GCA parameters. These marginal weightings are obtained by summing along a row and/or

column as one would to obtain the marginal totals for a parent (Becker 1975). One feature of

sum-to-zero solutions is that these marginal weightings will be maintained no matter the

imbalance due to missing crosses, as will be seen by considering the numerical examples for a

missing cross (Figure 3-4 below the diagonal, upper number) and five missing crosses (Figure

3-4 below the diagonal, lower number). The marginal weights have remained the same as in the

balanced case while the weights on the cross means differ among the crosses containing parent

1 and also among the crosses not containing parent 1. In the five missing crosses example,
crosses $\bar{y}_{.24}$ and $\bar{y}_{.26}$ even receive a positive weighting where in the
prior examples they had negative weighting.

The expected value of gca$_1$ in all three examples is GCA$_1$ (for sum-to-zero) despite the
apparently nonsensical weightings to cross means with missing crosses; however, the evaluation
of the estimates in terms of the original model changes with each new combination of missing
cells, e.g., $\bar{y}_{.24}$ and $\bar{y}_{.26}$ have a positive weight in the five missing
crosses example in GCA$_1$

estimation. Whether this type of estimation is desirable with missing cell (cross) means has been

the subject of some discussion (Speed, Hocking and Hackney 1978, Freund 1980, and Milliken

and Johnson 1984). The data analyst should be aware of the manner in which sum-to-zero treats

the data with missing cell means and decide whether that particular linear combination of cross

means estimating the parameter is one of interest, realizing that the meaning of the estimates in

terms of the original model is changing.








Diallel Mean


The use of the mean for a half-diallel as the mean around which GCA's sum-to-zero is

not satisfactory in that the diallel mean is the mean of a rather narrow genetically based

population, and in particular that the comparisons of interest are not usually confined to the

specific parents in a specific diallel on a particular site. A checklot can be employed to represent

a base population against which comparison of half- or full-sib families can be made to provide

for comparison of GCA estimates from other tests (van Buijtenen and Bridgwater 1986).

Mathematically, when effects are forced to sum-to-zero around their own mean, the

absolute value of the GCA's is reflective of their value relative to the mean of the group. Even

if the parents involved in the particular diallel were all far superior to the population mean for

GCA, GCA's calculated on an OLS basis would show that some of these GCA's were negative.

If the GCA's of the diallel parents were in fact all below the population mean, the opposite and

equally undesirable result ensues. For disconnected diallels together on a single site, an OLS

analysis would yield GCA estimates that sum-to-zero within each diallel since parents are nested

within diallels. Unless the comparisons of interest are only in the combination of the parents in

a specific diallel on a specific site, the checklot alternative is desirable.

A method for obtaining the desired goal of comparable GCA's from disconnected experiments,
disregarding the problem of heteroscedasticity, is to form a function from the data which
yields GCA estimates properly located on the number scale. Such a function can be formed
(using GCA$_1$ as an example) from gca$_1$, the diallel mean, and the checklot mean. From
expectations of the scalar linear model (equation 3-1),

$$E\{gca_1\} = \frac{p-1}{p}\,GCA_1 - \frac{1}{p}\sum_{j=2}^{p}GCA_j + \frac{1}{p}\sum_{k=2}^{p}SCA_{1k} - \frac{2}{p(p-2)}\sum_{j=2}^{p-1}\sum_{k=j+1}^{p}SCA_{jk}; \qquad \text{(3-5)}$$

$$E\{\text{diallel mean}\} = \mu + \frac{1}{b}\sum_{i=1}^{b}B_i + \frac{2}{p}\sum_{j=1}^{p}GCA_j + \frac{2}{p(p-1)}\sum_{j=1}^{p-1}\sum_{k=j+1}^{p}SCA_{jk}; \text{ and}$$

$$E\{\text{checklot mean}\} = \mu + \frac{1}{b}\sum_{i=1}^{b}B_i + \tau;$$

where the subscript j on GCA stands for either parent of a cross (j or k) and $\tau$
represents the fixed genetic parameter of the checklot. The function used to properly locate
gca$_1$ (the subscript rel denotes the relocated GCA$_1$) is gca$_{rel}$ = gca$_1$ +
(1/2)(diallel mean - checklot mean). The expectation of gca$_{rel}$ with negligible SCA is
GCA$_{rel}$ = GCA$_1$ - $\tau$/2; and since breeding value equals twice GCA, BV$_{rel}$ =
BV$_1$ - $\tau$. If SCA is non-negligible then the expectation is

$$GCA_{rel} = GCA_1 + \frac{1}{p-1}\sum_{k=2}^{p}SCA_{1k} - \frac{1}{(p-1)(p-2)}\sum_{j=2}^{p-1}\sum_{k=j+1}^{p}SCA_{jk} - \frac{\tau}{2}. \qquad \text{(3-6)}$$

In either case the function provides a reasonable manner by which GCA estimates from
disconnected diallels are centered at the same location on a number scale and are then
comparable.
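The relocation can be checked on noise-free numbers. The following is a small sketch under stated assumptions: six parents, negligible SCA, no block or error effects, and hypothetical values for the population mean (`mu`), the checklot effect (`tau`) and the true GCA's, all of which lie above the population mean. The sum-to-zero estimates are computed from the balanced-data formula gca$_i$ = (p T$_i$ - 2G)/(p(p-2)), where T$_i$ is the total of cross means involving parent i and G is the total of all cross means.

```python
import itertools
import numpy as np

p = 6
mu, tau = 10.0, 0.4                                    # hypothetical values
true_gca = np.array([3.0, 2.5, 2.0, 1.5, 1.0, 0.5])    # every parent above the population mean

# Noise-free cross means with negligible SCA: mu + GCA_j + GCA_k
crosses = list(itertools.combinations(range(p), 2))
cross_mean = {c: mu + true_gca[c[0]] + true_gca[c[1]] for c in crosses}

grand_total = sum(cross_mean.values())
parent_total = np.array([sum(v for c, v in cross_mean.items() if i in c)
                         for i in range(p)])

# Balanced sum-to-zero OLS estimates of GCA
gca_s2z = (p * parent_total - 2.0 * grand_total) / (p * (p - 2))

diallel_mean = 2.0 * grand_total / (p * (p - 1))
checklot_mean = mu + tau

# Relocated estimates: gca_rel = gca + (1/2)(diallel mean - checklot mean)
gca_rel = gca_s2z + 0.5 * (diallel_mean - checklot_mean)

print(np.round(gca_s2z, 3))   # sums to zero, so some values are negative
print(np.round(gca_rel, 3))   # equals true_gca - tau / 2
```

Although every parent in this hypothetical diallel is superior to the population mean, half of the sum-to-zero estimates are negative; the relocated values recover GCA - $\tau$/2 for each parent.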


Variance and Covariance of Plot Means


The variances of plot means with unequal numbers of trees per plot are by definition unequal,
i.e., Var($\bar{y}_{ijk}$) = $\sigma^2_p + \sigma^2_w/n_{ijk}$ where $\sigma^2_p$ is the plot
variance, $\sigma^2_w$ is the within-plot variance and $n_{ijk}$ is the number of observations
per plot. Also, if blocks were considered random, there would be an additional source of
variance for plot means due to blocks (as well as a covariance between plot means in the same
block) and this could be incorporated into the V matrix with Var($\bar{y}_{ijk}$) =
$\sigma^2_b + \sigma^2_p + \sigma^2_w/n_{ijk}$. Since the variances of the means in the
observation vector are not equal and there is a covariance between the means if blocks are
being considered random, best linear unbiased estimates (BLUE) would be secured by weighting
each mean by its true associated variance (Searle 1987, page 316). This is the generalized
least squares (GLS) approach:

$$b = (X_s'V^{-1}X_s)^{-1}X_s'V^{-1}y$$
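As an illustration of the GLS calculation, the sketch below builds V for six plot means (two blocks by three families) and compares GLS with OLS. All numbers (variance components, trees per plot, plot means) and the function names are hypothetical, and the design matrix uses a mean plus two sum-to-zero family contrasts purely for illustration.

```python
import numpy as np

def ols(X, y):
    """b = (X'X)^{-1} X'y"""
    return np.linalg.solve(X.T @ X, X.T @ y)

def gls(X, y, V):
    """b = (X'V^{-1}X)^{-1} X'V^{-1}y for a symmetric positive definite V."""
    Vinv_X = np.linalg.solve(V, X)
    return np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)

# Hypothetical variance components and trees per plot (block 1 then block 2)
var_b, var_p, var_w = 0.5, 0.6, 7.9
n_trees = np.array([6, 6, 2, 5, 1, 6])

# Diagonal: var_b + var_p + var_w / n; off-diagonal: var_b within a block, else 0
same_block = np.kron(np.eye(2), np.ones((3, 3)))
V = var_b * same_block + np.diag(var_p + var_w / n_trees)

# Mean plus two family contrasts (family 3 coded -1, -1), repeated for both blocks
X_block = np.array([[1.0, 1.0, 0.0],
                    [1.0, 0.0, 1.0],
                    [1.0, -1.0, -1.0]])
X = np.vstack([X_block, X_block])
y = np.array([10.2, 11.0, 8.7, 10.9, 12.4, 9.1])   # hypothetical plot means

b_ols = ols(X, y)
b_gls = gls(X, y, V)
print(np.round(b_ols, 4))
print(np.round(b_gls, 4))
```

When V is proportional to an identity matrix the two functions return the same solution, which is the OLS assumption stated earlier.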







The GLS approach relaxes the OLS assumptions of equal variance of and no covariance between
the observations (plot means) while still treating genetic parameters as fixed effects. The
entries along the diagonal of the V matrix are the variances of the plot means
(Var($\bar{y}_{ijk}$)) in the same order as means in the data vector. The off-diagonal
elements of V would be $\sigma^2_b$ (the variance due to the random variable block) for
elements corresponding to observations in the same block and 0 otherwise. BLUE requires exact
knowledge of V; if estimates of $\sigma^2_b$, $\sigma^2_p$, and $\sigma^2_w$ are utilized in
the V matrix, estimable functions of $\beta$ approximate BLUE.

The OLS assumption that SCA and GCA are fixed effects can also be relaxed to allow

for covariances due to genetic relatedness. In particular, the information that means are from the

same half- or full-sib family could be included in the V matrix. Relaxation of the zero covariance

assumption implies that GCA and SCA are random variables. If GCA and SCA are treated as

random variables, then the application of best linear prediction (BLP) or best linear unbiased

prediction (BLUP) to the problem would be more appropriate (White and Hodge 1989, page 64).

The treatment of the genetic parameters as random variables is consistent with that used in

estimating genetic correlations and heritabilities. The V matrix of such an application would

include, in addition to the features of the GLS V matrix, the covariance between full-sib or half-

sib families added to the off-diagonal elements in V, i.e., if the first and second plot means in

the data vector had a covariance due to relationship, then that covariance is inserted twice in the

V matrix. The covariance would appear as the second element in the first row and the first

element in the second row of V (V is a symmetric matrix). Also the diagonal elements of V
would increase by $2\sigma^2_{gca}$ (the variance due to treating GCA as a random variable) +
$\sigma^2_{sca}$ (the variance due to treating SCA as a random variable).








Comparison of Prediction and Estimation Methodologies


Which methodology (OLS, GLS, BLP, or BLUP) to apply to individual data bases is

somewhat a subjective decision. The decision can be based both on the computational or

conceptual complexity of the method and the magnitude of the data base with which the analyst

is working. To aid in this decision, this discussion highlights the differences in the inherent

properties and assumptions of the techniques.

For all practical purposes the answers from the four techniques will never be equal; however,
there are two exceptions. First, OLS estimates equal GLS estimates if all the cell means are
known with the same precision (variance) (Searle 1987, page 490). Otherwise, GLS discounts the
means that are known with less precision in the calculations and different estimates result.
Second, if the amount of data is infinite, i.e., all cross means are known without error, then
all four techniques are equivalent (White and Hodge 1989, pages 104-106). In all other cases
BLP and BLUP shrink predictions toward the location parameter(s) and produce predictions which
are different from OLS or GLS estimates even with balanced data. During calculations GLS, BLP,
and BLUP place less weight on observations known with less precision, which is intuitively
pleasing.
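The shrinkage behavior can be seen in the simplest possible case. The sketch below is an illustration only, assuming independent half-sib family means, a known location parameter `mu`, and known variances; the function name and all numbers are hypothetical. For a family mean based on n trees, Var(mean) = $\sigma^2_{gca}$ + $\sigma^2_e$/n and Cov(GCA, mean) = $\sigma^2_{gca}$, so the best linear prediction is the deviation from `mu` multiplied by the ratio of the two.

```python
import numpy as np

def blp_gca(family_means, n_obs, mu, var_gca, var_err):
    """BLP of GCA from independent half-sib family means (a simplified sketch)."""
    family_means = np.asarray(family_means, dtype=float)
    n_obs = np.asarray(n_obs, dtype=float)
    var_mean = var_gca + var_err / n_obs        # variance of each family mean
    shrinkage = var_gca / var_mean              # Cov(GCA, mean) / Var(mean)
    return shrinkage * (family_means - mu)

# Two families with the same observed mean but different amounts of testing
means = [12.0, 12.0]
n_trees = [5, 50]
fixed_effect_estimate = np.array(means) - 10.0          # deviation used by OLS or GLS
predictions = blp_gca(means, n_trees, mu=10.0, var_gca=0.25, var_err=9.75)

print(fixed_effect_estimate)      # identical for both families
print(np.round(predictions, 4))   # the poorly tested family is shrunk more
```

Under these assumptions the family tested with 5 trees is pulled much closer to zero than the family tested with 50 trees, so the poorly tested family is no longer favored as a parental winner.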

With OLS and GLS forest geneticists treat GCA's and SCA's as fixed effects for

estimation and then as random variables for genetic correlations and heritabilities. BLP and

BLUP provide a consistent treatment of GCA's and SCA's as random variables while differing

in their assumptions about location parameters (fixed effects). In BLP fixed effects are assumed

known without error (although they are usually estimated from the data) while with BLUP fixed

effects are estimated using GLS. BLP and BLUP techniques also contain the assumption that the

covariance matrix of the observations is known without error (most often variances must be

estimated). In many BLUP applications (Henderson 1974), mixed model equations are utilized







iteratively to estimate fixed effects and to predict random variables from a data set. A BLUP

treatment of fixed effects allows any connectedness between experiments to be utilized in the

estimation of the fixed effects. This provides an intuitive advantage of BLUP over BLP in

experimentation where connectedness among genetic experiments is available or where the data

are so unbalanced that treating the fixed effects as known is less desirable than a GLS estimate

of the fixed effects.

An ordering of computational complexity and conceptual complexity from least to most

complex of the four methods is OLS, GLS, BLP and BLUP. The latter three methods require

the estimation of the covariance matrix of the observations either separately (a priori) or

iteratively with the fixed effects. Precise estimation of the covariance matrix for observations

requires a great number of observations and the precision of GLS, BLP and BLUP estimations

or predictions is affected by the error of estimation of the components of V.

Selection of a method can then be based on weighing the computational complexity and

size of the available data base against the advantages offered by each method. Thus, if

complexity of the computational problem is of paramount concern, the analyst necessarily would

choose OLS. With a small data base (one that does not allow reasonable estimates of variances),

the analyst would again choose OLS. With a large data base and no qualms with computational

complexity, the analyst can choose between BLP and BLUP based on whether there is sufficient

connectedness or imbalance among the experiments to make BLUP advantageous.


Conclusions


Methods of solving for GCA and SCA estimates for balanced (plot-mean basis) and

unbalanced data have been presented along with the inherent assumptions of the analysis. The

use of plot means and the matrix equations will produce sum-to-zero OLS estimates for GCA and








SCA for all types of imbalance. Formulae in the literature which yield OLS solutions for
balanced data can yield misleading solutions for unbalanced data because of the loss of
orthogonality and because their weightings on site means for crosses (or totals) are
constants.

GCA's and SCA's obtained through sum-to-zero restriction are not truly estimates of

parametric population GCA's and SCA's. There are an infinite number of solutions for GCA's

and SCA's from the system of equations as a result of the overparameterized linear model. Yet,

if the only comparisons of interest are among the specific parents on a particular site, then the

estimates calculated by sum-to-zero restrictions are appropriate. Checklots may be used to

provide comparability among estimates derived from disconnected sets.

Knowledge of the innate mathematical features of OLS analysis discussed here should help the
data analyst decide if OLS is the most desirable technique for the data at hand. It may be
desirable to relax the OLS assumptions about the covariance matrix of the observations, which
are in all likelihood invalid. This could lead to GLS, BLP or BLUP as better alternatives.












CHAPTER 4
VARIANCE COMPONENT ESTIMATION TECHNIQUES
COMPARED FOR TWO MATING DESIGNS
WITH FOREST GENETIC ARCHITECTURE
THROUGH COMPUTER SIMULATION


Introduction


In many applications of quantitative genetics, geneticists are commonly faced with the

analysis of data containing a multitude of flaws (e.g. non-normality, imbalance, and

heteroscedasticity). Imbalance, as one of these flaws, is intrinsic to quantitative forest genetics

research because of the difficulty in making crosses for full-sib tests and the biological realities

of long term field experiments. Few definitive studies have been conducted to establish optimal

methods for estimation of variance components from unbalanced data. Simulation studies using

simple models (one-way or two-way random models) have been conducted for certain data

structures, i.e., imbalance, experimental design, and variance parameters (Corbeil and Searle

1976, Swallow 1981, Swallow and Monahan 1984, interpretations by Littell and McCutchan

1986). The results from these studies indicate that technique optimality is a function of the data

structure.

In practice (both historically and still commonly), estimation of variance components in
forest genetics applications has been achieved by using sequentially adjusted sums of squares
as an application of Henderson's Method 3 (HM3, Henderson 1953). Under normality and with
balanced data, this technique has the desirable property of being the minimum variance
unbiased estimator. If the data are unbalanced, then the only property retained by HM3
estimation is







unbiasedness (Searle 1971, Searle 1987 pp. 492,493,498). Other estimators have been shown

to be locally superior to HM3 in variance or mean square error properties in certain cases (Klotz

et al. 1969, Olsen et al. 1976, Swallow 1981, Swallow and Monahan 1984).

Over the last 25 years, there has been a proliferation of variance component estimation

techniques including minimum norm quadratic unbiased estimation (MINQUE, Rao 1971a),

minimum variance quadratic unbiased estimation (MIVQUE, Rao 1971b), maximum likelihood

(ML, Hartley and Rao 1967), and restricted maximum likelihood (REML, Patterson and

Thompson 1971). The practical application of these techniques has been impeded by their

computational complexity. However, with continuing advances in computer technology and the

appearance of better computational algorithms, the application of these procedures continues to

become more tractable (Harville 1977, Giesbrecht 1983, Meyer 1989). Whether these methods

of analysis are superior to HM3 for many genetics applications remains to be shown.

With balanced data and disregarding negative estimates, all previously mentioned

techniques except ML produce the same estimates (Harville 1977). With unbalanced data, each

technique produces a different set of variance component estimates. Criteria must then be

adopted to discriminate among techniques. Candidate criteria for discrimination include

unbiasedness (large number convergence on the parametric value), minimum variance (estimator

with the smallest sampling variance), minimum mean square error (minimum of sampling

variance plus squared bias, Hogg and Craig 1978), and probability of nearness (probability that

sample estimates occur in a certain interval around the parametric value, Pitman 1937).
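These four criteria are straightforward to compute once a set of simulated estimates is in hand. The sketch below is one possible way to summarize them; the function name, the dictionary keys, the interval half-width and the example numbers are all hypothetical. The sampling variance is computed with divisor N (not N - 1) so that mean square error equals variance plus squared bias exactly.

```python
import numpy as np

def summarize_estimator(estimates, true_value, half_width):
    """Bias, sampling variance, mean square error and probability of nearness."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_value
    variance = est.var()                                  # divisor N
    mse = np.mean((est - true_value) ** 2)                # variance + bias**2
    nearness = np.mean(np.abs(est - true_value) <= half_width)
    return {"bias": bias, "variance": variance, "mse": mse, "nearness": nearness}

# Hypothetical estimates of a variance component whose true value is 0.25
summary = summarize_estimator([0.10, 0.30, 0.20, 0.40], true_value=0.25, half_width=0.10)
print(summary)
```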

Negative estimates are also problematic in the estimation of variance components. Five

alternatives for dealing with the dilemma of estimates less than zero (outside the natural parameter

space of zero to infinity) are (Searle 1971): 1) accept and use the negative estimate, 2) set the

negative estimate to zero (producing biased estimates), 3) re-solve the system with the offending







component set to zero, 4) use an algorithm which does not allow negative estimates, and 5) use

the negative estimate to infer that the wrong model was utilized.
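The first three alternatives can be illustrated with a model far simpler than those studied in this chapter. The sketch below assumes a balanced one-way random model with method-of-moments (ANOVA) estimation; the function name and the data are hypothetical, and the "re-solve" step shown is only one simple reading of alternative 3 for this model (dropping the among-group component and estimating a single variance from all observations).

```python
import numpy as np

def one_way_anova_components(groups):
    """ANOVA estimates (among-group, within-group) for a balanced one-way random model."""
    data = np.asarray(groups, dtype=float)            # shape: (a groups, n observations)
    a, n = data.shape
    group_means = data.mean(axis=1)
    ms_among = n * np.sum((group_means - data.mean()) ** 2) / (a - 1)
    ms_within = np.sum((data - group_means[:, None]) ** 2) / (a * (n - 1))
    return (ms_among - ms_within) / n, ms_within

# Hypothetical data in which the group means happen to be identical
data = np.array([[1.0, 5.0], [2.0, 4.0], [3.0, 3.0]])

var_among, var_within = one_way_anova_components(data)   # alternative 1: accept as is
var_among_truncated = max(var_among, 0.0)                 # alternative 2: set to zero
var_within_resolved = data.var(ddof=1)                    # alternative 3: refit without the component

print(var_among, var_within)
print(var_among_truncated, var_within_resolved)
```

Note that alternatives 2 and 3 give the same among-group estimate (zero) but different within-group estimates, because only alternative 3 reallocates the variation to the remaining component.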

The purpose of this research was to determine if the criteria of unbiasedness, minimum

variance, minimum mean square error, and probability of nearness discriminated among several

variance component estimation techniques while exploring various alternatives for dealing with

negative variance component estimates. In order to make such comparisons, a large number of

data sets were required for each experimental level. Using simulated data, this chapter compares

variance component estimation techniques for plot-mean and individual observations, two mating

systems (modified half-diallel and half-sib) and two sets of parametric variance components.

Types of imbalance and levels of factors were chosen to reflect common situations in forest

genetics.


Methods


Experimental Approach


For each experimental level 1000 data sets were generated and analyzed by various

techniques (Table 4-1) producing numerous sets of variance component estimates for each data

set. This workload resulted in enormous computational time being associated with each

experimental level. The overall experimental design for the simulation was originally conceived

as a factorial with two types of mating design (half-diallel and half-sib), two sets of true variance

components (Table 4-2), two kinds of observations (individual and plot mean) and three types of

imbalance: 1) survival levels (80% and 60%, with 80% representing moderate survival and 60%
representing poor survival); 2) for full-sib designs three levels of missing crosses (0, 2,
and 5 out of 15 crosses); and 3) for half-sib designs two levels of connectedness among tests
(15 and 10 common families between tests out of 15 families per test). Because of the
computational time









Table 4-1. Abbreviation for and description of variance component estimation methods utilized
for analyses based on individual observations (if utilized for plot-mean analysis the
abbreviation is modified by prefixing a 'P').

Abbreviation   Description                                                       Citation

ML             Maximum Likelihood: estimates not restricted to the parameter     Hartley and Rao 1967;
PML            space (individual and plot-mean analysis).                        Shaw 1987

MODML          Maximum Likelihood: negative estimates set to zero after          Hartley and Rao 1967
               convergence (individual analysis).

NNML           Maximum Likelihood: if negative estimates appeared at             Hartley and Rao 1967;
               convergence, they were set to zero and the system re-solved       Miller 1973
               (individual analysis).

REML           Restricted Maximum Likelihood: estimates not restricted to the    Patterson and Thompson 1971;
PREML          parameter space (individual and plot-mean analysis).              Shaw 1987; Harville 1977

MODREML        Restricted Maximum Likelihood: negative estimates set to zero     Patterson and Thompson 1971
               after convergence (individual analysis).

NNREML         Restricted Maximum Likelihood: if negative estimates appeared     Patterson and Thompson 1971;
PNNREML        at convergence, they were set to zero and the system re-solved    Miller 1983
               (individual and plot-mean analysis).

MIVQUE         Minimum Variance Quadratic Unbiased: non-iterative with true      Rao 1971b
PMIVQUE        (parametric) values of the variance components as priors
               (individual and plot-mean analysis).

MINQUE1        Minimum Norm Quadratic Unbiased: non-iterative with ones as       Rao 1971a
PMINQUE1       priors for all variance components (individual and plot-mean
               analysis).

TYPE3          Sequentially Adjusted Sums of Squares; Henderson's Method 3       Henderson 1953
PTYPE3         (individual and plot-mean analysis).

MIVPEN         MIVQUE with a penalty algorithm to prevent negative estimates     Harville 1977
               (individual analysis).


constraint, the experiment could not be run as a complete factorial and the investigation continued

as a partial factorial. In general, the approach was to run levels which were at opposite ends of

the imbalance spectrum, i.e., 80% survival and no missing crosses versus 60% survival and 5

missing crosses, within a variance component level. If results were consistent across these


treatment combinations, intermediate levels were not run.







A treatment combination is designated by a five-character alphanumeric field. The first
character is either "H" (half-sib) or "D" (half-diallel). The second character denotes the set
of parametric variance components, where "1" designates the set associated with a heritability
of 0.1 and "2" designates the set associated with a heritability of 0.25 (Table 4-2). The
third character is an "S" indicating that the last two characters determine the imbalance
level. The fourth character designates the survival level, either "6" for 60% or "8" for 80%.
The final character specifies the number of missing crosses (half-diallel) or lack of
connectedness (half-sib). For example, the treatment combination 'H1S80' is a half-sib mating
design (H), the set of variance components associated with heritability equalling 0.1 (1),
80% survival (8), and 15 common parents across tests (0).
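For readers who tabulate results by treatment combination, the coding scheme can be decoded mechanically. The helper below is a hypothetical convenience, not part of the original analysis; the function name and dictionary keys are assumptions. Because the text gives the meaning of the final character only by example, the sketch returns it as a raw digit rather than interpreting it.

```python
def parse_treatment(code):
    """Decode a five-character treatment designation such as 'H1S80'."""
    if len(code) != 5 or code[2] != "S":
        raise ValueError(f"not a valid treatment designation: {code!r}")
    designs = {"H": "half-sib", "D": "half-diallel"}
    heritabilities = {"1": 0.1, "2": 0.25}
    survivals = {"6": 60, "8": 80}
    if code[0] not in designs or code[1] not in heritabilities or code[3] not in survivals:
        raise ValueError(f"unrecognized character in designation: {code!r}")
    return {
        "mating_design": designs[code[0]],
        "heritability": heritabilities[code[1]],
        "survival_percent": survivals[code[3]],
        # missing crosses (half-diallel) or lack of connectedness (half-sib), left uninterpreted
        "imbalance_digit": int(code[4]),
    }

print(parse_treatment("H1S80"))
```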


Table 4-2. Sets of true variance components for the half-diallel and half-sib mating designs
generated from specification of two levels of single-tree heritability ($h^2$), type B
correlation ($r_B$), and non-additive to additive variance ratio (d/a).

   Genetic Ratios(a)     Mating                      True Variance Components(b)
 $h^2$  $r_B$   d/a      Design     $\sigma^2_t$  $\sigma^2_b$  $\sigma^2_g$  $\sigma^2_s$  $\sigma^2_{tg}$  $\sigma^2_{ts}$  $\sigma^2_p$  $\sigma^2_w$

 0.1    0.5     1.0      full-sib   1.0           0.5           0.25          0.25          0.25             0.25             .595          7.905
                         half-sib   1.0           0.5           0.25          NA            0.25             NA               .475          7.9964
 0.25   0.8     .25      full-sib   1.0           0.5           0.625         .1562         .1562            .0391            .5769         7.6649

(a) $h^2 = 4\sigma^2_g / \sigma^2_{phenotypic}$; $r_B = 4\sigma^2_g / (4\sigma^2_g + 4\sigma^2_{tg})$; and $\sigma^2_D / \sigma^2_A$ as d/a $= 4\sigma^2_s / 4\sigma^2_g$.
(b) See definitions in equation 4-1.


Experimental Design for Simulated Data


The mating design for the simulation was either a six-parent half-diallel (no selfs) or a

fifteen-parent half-sib. The randomized complete block field design was in three locations (i.e.,

separate field tests) with four complete blocks per location and six trees per family in a block;

where family is a full-sib family for half-diallel or a half-sib family for the half-sib design. This







field design and the mating designs reflect typical designs in forestry applications (Squillace 1973,

Wilcox et al. 1975, Bridgwater et al. 1983, Weir and Goddard 1986, Loo-Dinkins et al. 1991)

and are also commonly used in other disciplines (Matzinger et al. 1959, Hallauer and Miranda

1981, Singh and Singh 1984). The six trees per family could be considered as contiguous or

non-contiguous plots without affecting the results or inferences.


Full-Sib Linear Model


The scalar linear model employed for half-diallel individual observations is

$$y_{ijklm} = \mu + t_i + b_{ij} + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm} \qquad \text{(4-1)}$$

where $y_{ijklm}$ is the $m^{th}$ observation of the $kl^{th}$ cross in the $j^{th}$ block of the $i^{th}$ test;
$\mu$ is the population mean;
$t_i$ is the random variable test location ~ NID(0, $\sigma^2_t$);
$b_{ij}$ is the random variable block ~ NID(0, $\sigma^2_b$);
$g_k$ is the random variable female general combining ability (gca) ~ NID(0, $\sigma^2_g$);
$g_l$ is the random variable male gca ~ NID(0, $\sigma^2_g$);
$s_{kl}$ is the random variable specific combining ability (sca) ~ NID(0, $\sigma^2_s$);
$tg_{ik}$ is the random variable test by female gca interaction ~ NID(0, $\sigma^2_{tg}$);
$tg_{il}$ is the random variable test by male gca interaction ~ NID(0, $\sigma^2_{tg}$);
$ts_{ikl}$ is the random variable test by sca interaction ~ NID(0, $\sigma^2_{ts}$);
$p_{ijkl}$ is the random variable plot ~ NID(0, $\sigma^2_p$);
$w_{ijklm}$ is the random variable within-plot ~ NID(0, $\sigma^2_w$); and
there is no covariance between random variables in the model.

This linear model in matrix notation is (dimensions below model component)

$$\underset{n\times 1}{y} = \mu\underset{n\times 1}{1} + \underset{n\times t}{Z_T}\,\underset{t\times 1}{e_T} + \underset{n\times b}{Z_B}\,\underset{b\times 1}{e_B} + \underset{n\times g}{Z_G}\,\underset{g\times 1}{e_G} + \underset{n\times s}{Z_S}\,\underset{s\times 1}{e_S} + \underset{n\times tg}{Z_{TG}}\,\underset{tg\times 1}{e_{TG}} + \underset{n\times ts}{Z_{TS}}\,\underset{ts\times 1}{e_{TS}} + \underset{n\times p}{Z_P}\,\underset{p\times 1}{e_P} + \underset{n\times 1}{e_W} \qquad \text{(4-2)}$$

where y is the observation vector;
$Z_i$ is the portion of the design matrix for the $i^{th}$ random variable;
$e_i$ is the vector of unobservable random effects for the $i^{th}$ random variable;
1 is a vector of 1's; and
n, t, b, g, s, tg, ts, and p are the number of observations, tests, blocks, gca's, sca's, test by gca interactions, test by sca interactions and plots, respectively.

Utilizing customary assumptions in half-diallel mating designs (Method 4, Griffing 1956), the
variance of an individual observation is

$$Var(y_{ijklm}) = \sigma^2_t + \sigma^2_b + 2\sigma^2_g + \sigma^2_s + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w; \qquad \text{(4-3)}$$

and in matrix notation the covariance matrix for the observations is

$$Var(y) = Z_TZ_T'\sigma^2_t + Z_BZ_B'\sigma^2_b + Z_GZ_G'\sigma^2_g + Z_SZ_S'\sigma^2_s + Z_{TG}Z_{TG}'\sigma^2_{tg} + Z_{TS}Z_{TS}'\sigma^2_{ts} + Z_PZ_P'\sigma^2_p + I_n\sigma^2_w \qquad \text{(4-4)}$$

where ' indicates the transpose operator, all matrices of the form $Z_iZ_i'$ are n×n, and
$I_n$ is an n×n identity matrix.
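To make the structure of equation 4-4 concrete, the sketch below assembles the covariance matrix for a deliberately tiny half-diallel (2 tests, 2 blocks per test, 3 parents, 2 trees per plot) rather than the design simulated in this chapter. The function names and the dictionary keys are hypothetical; the variance components are the first full-sib set of Table 4-2. Each observation has two parents, so each row of the gca and test by gca design matrices carries two 1's, which is why those components appear twice in the variance of an individual observation.

```python
import itertools
import numpy as np

def indicator(labels):
    """n x q incidence matrix with one column per distinct label."""
    levels = sorted(set(labels))
    Z = np.zeros((len(labels), len(levels)))
    for row, label in enumerate(labels):
        Z[row, levels.index(label)] = 1.0
    return Z

def full_sib_covariance(tests, blocks, parents, trees, vc):
    """Assemble Var(y) for a half-diallel following the form of equation 4-4."""
    crosses = list(itertools.combinations(range(parents), 2))
    obs = [(i, j, k, l, m) for i in range(tests) for j in range(blocks)
           for (k, l) in crosses for m in range(trees)]
    n = len(obs)
    Z = {
        "t": indicator([(i,) for i, j, k, l, m in obs]),
        "b": indicator([(i, j) for i, j, k, l, m in obs]),
        "s": indicator([(k, l) for i, j, k, l, m in obs]),
        "ts": indicator([(i, k, l) for i, j, k, l, m in obs]),
        "p": indicator([(i, j, k, l) for i, j, k, l, m in obs]),
        "g": np.zeros((n, parents)),
        "tg": np.zeros((n, tests * parents)),
    }
    for row, (i, j, k, l, m) in enumerate(obs):
        Z["g"][row, [k, l]] = 1.0                                  # female and male gca
        Z["tg"][row, [i * parents + k, i * parents + l]] = 1.0     # test by gca
    V = vc["w"] * np.eye(n)
    for name, Zi in Z.items():
        V += vc[name] * (Zi @ Zi.T)
    return V

# First full-sib set of true variance components from Table 4-2
vc = {"t": 1.0, "b": 0.5, "g": 0.25, "s": 0.25, "tg": 0.25, "ts": 0.25,
      "p": 0.595, "w": 7.905}
V = full_sib_covariance(tests=2, blocks=2, parents=3, trees=2, vc=vc)
print(V.shape, V[0, 0], V[0, 1])
```

The diagonal reproduces equation 4-3, and the covariance between two trees in the same plot is the same sum without the within-plot component.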


Half-sib Linear Model


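The scalar linear model for half-sib individual observations is

$$y_{ijkm} = \mu + t_i + b_{ij} + g_k + tg_{ik} + ph_{ijk} + wh_{ijkm} \qquad \text{(4-5)}$$

where $y_{ijkm}$ is the $m^{th}$ observation of the $k^{th}$ half-sib family in the $j^{th}$ block of the $i^{th}$ test;
$\mu$, $t_i$, $b_{ij}$, $g_k$, and $tg_{ik}$ retain the definition in Eq. 4-1;
$ph_{ijk}$ is the random variable plot containing different genotype by environment components than the corresponding term in Eq. 4-1 ~ NID(0, $\sigma^2_{ph}$);
$wh_{ijkm}$ is the random variable within-plot containing different levels of genotypic and genotype by environment components than the corresponding term in Eq. 4-1 ~ NID(0, $\sigma^2_{wh}$); and
there is no covariance between random variables in the model.

The matrix notation model is (dimensions below model component)

$$\underset{n\times 1}{y} = \mu\underset{n\times 1}{1} + \underset{n\times t}{Z_T}\,\underset{t\times 1}{e_T} + \underset{n\times b}{Z_B}\,\underset{b\times 1}{e_B} + \underset{n\times g}{Z_G}\,\underset{g\times 1}{e_G} + \underset{n\times tg}{Z_{TG}}\,\underset{tg\times 1}{e_{TG}} + \underset{n\times p}{Z_P}\,\underset{p\times 1}{e_P} + \underset{n\times 1}{e_W} \qquad \text{(4-6)}$$

The variance of an individual observation in half-sib designs is

$$Var(y_{ijkm}) = \sigma^2_t + \sigma^2_b + \sigma^2_g + \sigma^2_{tg} + \sigma^2_{ph} + \sigma^2_{wh} \qquad \text{(4-7)}$$

and

$$Var(y) = Z_TZ_T'\sigma^2_t + Z_BZ_B'\sigma^2_b + Z_GZ_G'\sigma^2_g + Z_{TG}Z_{TG}'\sigma^2_{tg} + Z_PZ_P'\sigma^2_{ph} + I_n\sigma^2_{wh}. \qquad \text{(4-8)}$$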

For an observational vector based on plot means, the plot and within-plot random variables
were combined by taking the arithmetic mean across the observations within a plot. The
resulting plot means model has a new plot-level variance term (replacing $\sigma^2_p$ or
$\sigma^2_{ph}$) that is a composite of the plot and within-plot variance terms of the
individual observation model.

Three estimates of ratios among variance components were determined: 1) single tree
heritability adjusted for test location and block as $\hat{h}^2 = 4\hat{\sigma}^2_g /
\hat{\sigma}^2_{phenotypic}$ where $\hat{\sigma}^2_{phenotypic}$ is the estimate of the
variance of an individual observation from equations 4-3 and 4-7 with the variance components
for test location and block deleted; 2) type B correlation as $\hat{r}_B = 4\hat{\sigma}^2_g /
(4\hat{\sigma}^2_g + 4\hat{\sigma}^2_{tg})$; and 3) dominance to additive variance ratio as
d/a $= 4\hat{\sigma}^2_s / 4\hat{\sigma}^2_g$.
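As a worked check of these three ratios, the sketch below applies them to the two full-sib sets of true variance components in Table 4-2; applied to the parametric values, they should return the heritability, type B correlation and d/a used to generate each set. The function name and dictionary keys are hypothetical.

```python
def genetic_ratios(vc):
    """Heritability, type B correlation and d/a for the full-sib model (equation 4-3
    with the test location and block components deleted from the phenotypic variance)."""
    phenotypic = (2 * vc["g"] + vc["s"] + 2 * vc["tg"] + vc["ts"]
                  + vc["p"] + vc["w"])
    h2 = 4 * vc["g"] / phenotypic
    r_b = 4 * vc["g"] / (4 * vc["g"] + 4 * vc["tg"])
    d_over_a = 4 * vc["s"] / (4 * vc["g"])
    return h2, r_b, d_over_a

# Full-sib sets of true variance components from Table 4-2
set_1 = {"g": 0.25, "s": 0.25, "tg": 0.25, "ts": 0.25, "p": 0.595, "w": 7.905}
set_2 = {"g": 0.625, "s": 0.1562, "tg": 0.1562, "ts": 0.0391, "p": 0.5769, "w": 7.6649}

print(genetic_ratios(set_1))
print(genetic_ratios(set_2))
```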


Data Generation and Deletion


Data generation was accomplished by using a Cholesky upper-lower decomposition of the
covariance matrix for the observations (Goodnight 1979) and a vector of pseudo-random standard
normal deviates generated using the Box-Muller transformation with pseudo-random uniform
deviates (Knuth 1981, Press et al. 1989). The upper-lower decomposition creates a matrix (U)
with the property that Var(y) = U'U. The vector of pseudo-random standard normal deviates (z)
has a covariance matrix equal to an identity matrix ($I_n$) where n is the number of
observations. The vector of observations is created as y = U'z. Then Var(y) = U'(Var(z))U and
since Var(z) = $I_n$, Var(y) = U'$I_n$U = U'U.
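The generation step can be sketched as follows. This is an illustration rather than the original program: it uses NumPy's `numpy.linalg.cholesky`, which returns the lower-triangular factor (so U is its transpose), a hypothetical 3 x 3 covariance matrix, and NumPy's uniform generator in place of the uniform generator used in the simulation. The function names are assumptions.

```python
import numpy as np

def box_muller(n, rng):
    """n standard normal deviates from pairs of uniform deviates."""
    m = (n + 1) // 2
    u1 = 1.0 - rng.random(m)            # in (0, 1], which avoids log(0)
    u2 = rng.random(m)
    radius = np.sqrt(-2.0 * np.log(u1))
    z = np.concatenate([radius * np.cos(2.0 * np.pi * u2),
                        radius * np.sin(2.0 * np.pi * u2)])
    return z[:n]

def simulate_observations(V, rng):
    """One observation vector y = U'z with Var(y) = U'U = V."""
    U = np.linalg.cholesky(V).T         # upper-triangular factor
    z = box_muller(V.shape[0], rng)
    return U.T @ z

rng = np.random.default_rng(1993)
V = np.array([[3.0, 1.0, 0.5],          # hypothetical covariance matrix
              [1.0, 2.0, 0.25],
              [0.5, 0.25, 1.5]])
U = np.linalg.cholesky(V).T
y = simulate_observations(V, rng)

# Empirical check over many simulated vectors
draws = np.array([simulate_observations(V, rng) for _ in range(20000)])
print(np.round(np.cov(draws, rowvar=False), 2))
```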

Analyses of survival patterns using data from the Cooperative Forest Genetic Research

Program (CFGRP) at the University of Florida were used to develop survival distributions for

the simulation. The data sets chosen for survival analysis were from full-sib slash pine (Pinus

elliottii var elliottii Engelm) tests planted in randomized complete block designs with the families

in row plots and were selected because the survival levels were either approximately 60% or

80%. Survival levels for most crosses (full-sib families) clustered around the expected value,

i.e., approximately 60% for an average survival level of 60%; however, there were always a few

crosses that had much poorer survival than average and also a small number of crosses that had

much better survival than average. This survival pattern was consistent across the 50 experiments

analyzed. Thus, a lower than average survival level was arbitrarily assigned to certain crosses,

a higher than average survival level was assigned to certain crosses, and the average survival

level assigned to most crosses. This modeling of survival pattern was also extended to the half-

sib mating design. At 80% survival no missing plots were allowed and at 60% survival missing

plots occurred at random.

Full-sib family deletion simulated crosses which could not be made and were therefore

missing from the experiment. When deleting five crosses, the deletion was restricted to a

maximum of four crosses per parent to prevent loss of all the crosses in which a single parent

appeared since this would have resulted in changing a six-parent to a five-parent half-diallel.

Tests having only subsets of the half-sib families in common are a frequent occurrence

in data analysis at CFGRP. This partial connectedness was simulated by generating data in which







only 10 of the 15 families present in a test were common to either one of the other two tests

comprising a data set.


Variance Component Estimation Techniques


Two algorithms were utilized for all estimation techniques: sequentially adjusted sums

of squares (Milliken and Johnson 1984, p 138) for HM3; and Giesbrecht's algorithm (Giesbrecht

1983) for REML, ML, MINQUE and MIVQUE. Giesbrecht's algorithm is primarily a gradient

algorithm (the method of scoring), and as such allows negative estimates (Harville 1977,

Giesbrecht 1983). Negative estimates are not a theoretical difficulty with MINQUE or MIVQUE;

however, for REML and ML, estimates should be confined to the parameter space. For this

reason estimators referred to as REML and ML in this chapter are not truly REML and ML when

negative estimates occur; further, there is the possibility that the iterative solution stopped at a

local maximum rather than the global maximum. These concerns are commonplace in REML and ML

estimation (Corbeil and Searle 1976, Harville 1977, Swallow and Monahan 1984); however,

ignoring these two points, these estimators are still referred to as REML and ML.

The basic equation for variance component estimation under normality (Giesbrecht 1983)

for MIVQUE, MINQUE and REML is

$$\{\mathrm{tr}(QV_iQV_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\} \tag{4-9}$$

with dimensions $r \times r$, $r \times 1$ and $r \times 1$, respectively;

then $$\hat{\sigma}^2 = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\{y'QV_iQy\};$$

and for ML

$$\{\mathrm{tr}(V^{-1}V_iV^{-1}V_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\} \tag{4-10}$$

with the same dimensions,

where $\{\mathrm{tr}(QV_iQV_j)\}$ is a matrix whose elements are $\mathrm{tr}(QV_iQV_j)$, where in the full-sib

designs i = 1 to 8 and j = 1 to 8, i.e., there is a row and column for

every random variable in the linear model;

tr is the trace operator, that is, the sum of the diagonal elements of a matrix;

$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$ for V as the covariance matrix of y and X as

the design matrix for fixed effects;

$V_i = Z_iZ_i'$ where i = the random variables test, block, etc.;

$\hat{\sigma}^2$ is the vector of variance component estimates; and

r is the number of random variables in the model.

The MINQUE estimator used was MINQUE1, i.e., ones as priors for all variance

components, calculated by applying Giesbrecht's algorithm non-iteratively. MINQUE1 was

chosen because of results demonstrating MINQUE0 (prior of 1 for the error term and of 0 for

all others) to be an inferior estimation technique for many cases (Swallow and Monahan 1984,

R.C. Littell unpublished data).
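A non-iterative application of the estimating equation can be sketched with dense matrices. This is an illustration only: it builds Q by direct inversion rather than by the updating scheme of Giesbrecht's algorithm, and the function name, the balanced family layout and the simulated data are hypothetical. Priors of one for every component give MINQUE1.

```python
import numpy as np

def quadratic_estimates(y, X, V_parts, priors):
    """Solve {tr(Q V_i Q V_j)} s = {y'Q V_i Q y} once, with Q built from the priors."""
    V = sum(p * Vi for p, Vi in zip(priors, V_parts))
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    Q = Vinv - VinvX @ np.linalg.pinv(X.T @ VinvX) @ VinvX.T
    QV = [Q @ Vi for Vi in V_parts]
    lhs = np.array([[np.trace(A @ B) for B in QV] for A in QV])
    Qy = Q @ y
    rhs = np.array([Qy @ Vi @ Qy for Vi in V_parts])
    return np.linalg.solve(lhs, rhs)

# Hypothetical balanced layout: 6 families x 4 trees, overall mean as the only fixed effect.
rng = np.random.default_rng(0)
n_fam, n_tree = 6, 4
Z = np.kron(np.eye(n_fam), np.ones((n_tree, 1)))
X = np.ones((n_fam * n_tree, 1))
y = 10 + Z @ rng.normal(0, 1.0, n_fam) + rng.normal(0, 2.0, n_fam * n_tree)

V_parts = [Z @ Z.T, np.eye(len(y))]          # V_i = Z_i Z_i' for family and error
minque1 = quadratic_estimates(y, X, V_parts, priors=[1.0, 1.0])
```

In balanced data such as this example, the solution coincides with the analysis of variance estimates whatever priors are supplied; the choice of priors matters only when the data are unbalanced.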

With normally-distributed uncorrelated random variables, the use of the true values of

the variance components as priors in a non-iterative application of Giesbrecht's algorithm

produced the MIVQUE solutions (equation 4-5). Obtaining true MIVQUE estimation is a luxury

of computer simulation and would not be possible in practice since the true variance components

are required (Swallow and Searle 1978). This estimator was included to provide a standard of

comparison for other estimators. An additional MIVQUE-type estimator, referred to as

MIVPEN, was also included. MIVPEN was also a non-iterative application of the algorithm with

the true variance components as priors; however, this estimator was conditioned on the variance

component parameter space and did not allow negative estimates. The non-negative conditioning

of MIVPEN was accomplished by adding a penalty algorithm to MIVQUE such that no variance

component was allowed to be less than $1 \times 10^{-7}$. Estimates from MIVPEN were equal to MIVQUE

for data sets for which there were no negative MIVQUE variance component estimates. When

negative MIVQUE estimates occurred, the two techniques were no longer equivalent. The penalty

algorithm operated by using $\Delta = \hat{\sigma}^2 - \sigma^2$ and by choosing a scalar weight w such that no element

of $\tilde{\sigma}^2$ is less than $1 \times 10^{-7}$. Then $\tilde{\sigma}^2 = \sigma^2 + w\Delta$, where $\Delta$ is the vector of departure from the

true values ($\sigma^2$), $1 \times 10^{-7}$ is an arbitrary constant and $\tilde{\sigma}^2$ is the vector of estimated variance

components conditioned on non-negativity.
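The penalty step can be sketched as follows. The text does not state how the weight w was chosen, so this sketch assumes the largest w not exceeding one that keeps every component at or above the floor, and it assumes the true values themselves exceed the floor; the function name and the example numbers are hypothetical.

```python
import numpy as np

FLOOR = 1e-7  # arbitrary lower bound quoted in the text

def penalize(estimate, true_values, floor=FLOOR):
    """Shrink an estimate toward the true values until every component >= floor."""
    estimate = np.asarray(estimate, dtype=float)
    true_values = np.asarray(true_values, dtype=float)
    delta = estimate - true_values            # vector of departure from the true values
    w = 1.0                                   # w = 1 leaves the estimate unchanged
    for d, s in zip(delta, true_values):
        if s + d < floor:                     # component would fall below the floor
            w = min(w, (floor - s) / d)       # largest w with s + w*d >= floor
    return true_values + w * delta

true_values = [0.25, 1.0]
penalized = penalize([-0.05, 1.3], true_values)   # first component is negative
unchanged = penalize([0.10, 1.3], true_values)    # nothing to penalize
```

Because a single scalar w multiplies the whole departure vector, the penalty moves every component toward its true value, not only the negative one.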

REML estimates were from repeated application of Giesbrecht's algorithm (equation 4-9)

in which the estimates from the kth iteration become the priors for the (k+1)th iteration. The

iterations were stopped when the difference between the estimates from the kth and (k+1)th

iterations met the convergence criterion; then the estimates of the (k+1)th iteration became the

REML estimates. The convergence criterion utilized was $\sum_{i=1}^{r} |\hat{\sigma}^2_{i(k)} - \hat{\sigma}^2_{i(k+1)}| < 1 \times 10^{-4}$. This

criterion imposed convergence to the fourth decimal place for all variance components. Since

for this experimental workload it was desired that the simulation run with little analyst

intervention and in as few iterations as possible, the robustness of REML solutions obtained from

Giesbrecht's algorithm to priors (or starting points) was explored. The difference in solutions

starting from two distinct points (a vector of ones and the true values) was compared over 2000

data sets of different structures (imbalance, true variance components, and field design). The

results (agreeing with those of Swallow and Monahan 1984) indicated that the difference between

the two solutions was entirely dependent on the stringency of the convergence criterion and not

on the starting point (priors). Also the number of iterations required for convergence was greatly

decreased by using the true values as priors. Thus, all REML estimates were calculated starting

with the true values as priors.
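The iteration and the comparison of starting points can be sketched as below. The sketch uses dense matrix inversion in place of the updating scheme of Giesbrecht's algorithm, and the unbalanced family sizes, the seed and the function names are hypothetical. The stopping rule is the sum of absolute changes described above.

```python
import numpy as np

def giesbrecht_step(y, X, V_parts, priors):
    """One solve of {tr(Q V_i Q V_j)} s = {y'Q V_i Q y} with Q built from the priors."""
    V = sum(p * Vi for p, Vi in zip(priors, V_parts))
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    Q = Vinv - VinvX @ np.linalg.pinv(X.T @ VinvX) @ VinvX.T
    QV = [Q @ Vi for Vi in V_parts]
    lhs = np.array([[np.trace(A @ B) for B in QV] for A in QV])
    Qy = Q @ y
    rhs = np.array([Qy @ Vi @ Qy for Vi in V_parts])
    return np.linalg.solve(lhs, rhs)

def reml(y, X, V_parts, start, tol=1e-4, max_iter=200):
    """Repeat the step, feeding estimates back as priors, until the summed
    absolute change falls below tol. Negative estimates are not constrained."""
    current = np.asarray(start, dtype=float)
    for iteration in range(1, max_iter + 1):
        updated = giesbrecht_step(y, X, V_parts, current)
        if np.sum(np.abs(updated - current)) < tol:
            return updated, iteration
        current = updated
    raise RuntimeError("convergence criterion not met")

# Hypothetical unbalanced data: 8 families of unequal size, mean as the only fixed effect.
rng = np.random.default_rng(3)
sizes = [2, 3, 4, 5, 2, 3, 4, 5]
family = np.repeat(np.arange(len(sizes)), sizes)
Z = np.eye(len(sizes))[family]
X = np.ones((len(family), 1))
true_values = np.array([4.0, 1.0])             # family and error variances
y = 10 + Z @ rng.normal(0, 2.0, len(sizes)) + rng.normal(0, 1.0, len(family))
V_parts = [Z @ Z.T, np.eye(len(y))]

from_ones, iters_ones = reml(y, X, V_parts, start=[1.0, 1.0])
from_truth, iters_truth = reml(y, X, V_parts, start=true_values)
```

Comparing `from_ones` with `from_truth` mirrors the check described in the text: any difference between the two solutions should be of the order of the convergence criterion.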

Three alternatives for coping with negative estimates after convergence were used for

REML solutions: accept and use the negative estimates (Shaw 1987), arbitrarily set negative

estimates to zero, and re-solve the system setting negative estimates to zero (Miller 1973). The

first two alternatives are self-explanatory and the latter is accomplished by re-analyzing those data







sets in which the initial unrestricted REML estimates included one or more negative estimates.

During re-analysis if a variance component became negative, it was set to zero (could never be

any value other than zero) and the iterations continued. This procedure persisted until the

convergence criterion was met with a solution in which all variance components were either

positive or zero.
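The re-solve alternative can be sketched as an iteration in which a component that turns negative is fixed at zero and dropped from the system. This is a dense-matrix illustration under stated assumptions: the function names and the small data set are hypothetical, and the error component is assumed to stay positive so that V remains invertible.

```python
import numpy as np

def step(y, X, V_parts, priors, active):
    """One update for the active components; all other components stay at zero."""
    V = sum(priors[i] * V_parts[i] for i in active)
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    Q = Vinv - VinvX @ np.linalg.pinv(X.T @ VinvX) @ VinvX.T
    QV = [Q @ V_parts[i] for i in active]
    lhs = np.array([[np.trace(A @ B) for B in QV] for A in QV])
    Qy = Q @ y
    rhs = np.array([Qy @ V_parts[i] @ Qy for i in active])
    updated = np.zeros(len(V_parts))
    updated[active] = np.linalg.solve(lhs, rhs)
    return updated

def nonnegative_reml(y, X, V_parts, start, tol=1e-4, max_iter=200):
    current = np.asarray(start, dtype=float)
    active = list(range(len(V_parts)))
    for _ in range(max_iter):
        updated = step(y, X, V_parts, current, active)
        negative = [i for i in active if updated[i] < 0]
        if negative:                              # fix at zero for all later iterations
            updated[negative] = 0.0
            active = [i for i in active if i not in negative]
        elif np.sum(np.abs(updated - current)) < tol:
            return updated
        current = updated
    raise RuntimeError("convergence criterion not met")

# Hypothetical data with identical family means, so the family component is negative
# when unconstrained: 3 families x 3 trees.
Z = np.kron(np.eye(3), np.ones((3, 1)))
X = np.ones((9, 1))
y = np.tile([1.0, 2.0, 3.0], 3)
estimates = nonnegative_reml(y, X, [Z @ Z.T, np.eye(9)], start=[1.0, 1.0])
```

For these data the family component is fixed at zero and the error component converges to the total sum of squares divided by N - 1, that is 6/8 = 0.75.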

Harville (1977) suggested several adaptations of Henderson's mixed model equations

(Henderson et al. 1959) which do not allow variance component estimates to become negative;

however, the estimates can become arbitrarily close to zero. These techniques were tried against

the approach of setting negative estimates to zero after convergence and re-solving the system.

Comparison of results using the same data sets indicated that there is little practical advantage

in using the approach suggested by Harville, although it is more desirable theoretically. The

differences between sets of estimates obtained by the two methods were extremely minor (solving

the system with a variance component set to zero versus arbitrarily close to zero).

ML solutions, as iterative applications of equation 4-6, were calculated from the same

starting points and with the same convergence criterion as REML solutions. The three negative

variance component alternatives explored for ML were to accept and use the negative estimates,

to arbitrarily set negative estimates to zero after the unrestricted solution had converged, and (for

half-sib data only) to re-solve the system setting negative variance components to zero.

The algorithm to calculate solutions for HM3 (sequentially adjusted sums of squares) was

based on the upper triangular G2 sweep (Goodnight 1979) and Hartley's method of synthesis

(Hartley 1967). The equation solved was E{MS} = MS, where MS is the vector of mean

squares and E{MS} is their expectation. The alternative used for negative estimates was to accept

and use the negative estimates.
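Equating mean squares to their expectations can be illustrated for the simplest case. The sketch below uses the textbook expectations for a balanced one-way layout, E{MS_family} = n times the family variance plus the error variance and E{MS_error} = the error variance; the mean squares are hypothetical numbers. In unbalanced data the coefficients would come from the method of synthesis rather than from these closed forms.

```python
import numpy as np

n_per_family = 4

# Rows: mean squares (family, error). Columns: components (family, error).
coefficients = np.array([[n_per_family, 1.0],
                         [0.0,          1.0]])

mean_squares = np.array([9.0, 5.0])                  # hypothetical observed values
components = np.linalg.solve(coefficients, mean_squares)

# A family mean square below the error mean square gives a negative estimate,
# which this technique accepts and uses.
negative_case = np.linalg.solve(coefficients, np.array([3.0, 5.0]))
```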










Comparison Among Estimation Techniques


For the simulation MIVQUE estimates were the basis for all comparisons because

MIVQUE is by definition the minimum variance quadratic unbiased estimator. The results of

comparing the mean of 1000 MIVQUE estimates for an experimental level to the means for other

techniques were termed "apparent bias". "Apparent bias" denotes that 1000 data sets were not

sufficient to achieve complete convergence to the true values of the variance components.

Sampling variances of estimation were calculated from the 1000 observations within an

experimental level and estimation technique for variance components and genetic ratios (single

tree heritability, Type B correlation and dominance to additive variance ratio). Mean square

error then equalled variance plus squared "apparent bias". While mean square error was

investigated, there was never sufficient bias for mean square error to lead to a different decision

concerning techniques than sampling variance of the estimates; so mean square error was deleted

from the remainder of this discussion.
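The three summary statistics can be written compactly. In this sketch the function name and the three example estimates are hypothetical; `reference_mean` stands for the mean of the MIVQUE estimates at the same experimental level, against which "apparent bias" is measured.

```python
import numpy as np

def summarize(estimates, reference_mean):
    """Sampling variance, apparent bias and mean square error of a set of estimates."""
    estimates = np.asarray(estimates, dtype=float)
    variance = estimates.var(ddof=1)
    apparent_bias = estimates.mean() - reference_mean
    mse = variance + apparent_bias ** 2
    return variance, apparent_bias, mse

variance, apparent_bias, mse = summarize([0.2, 0.3, 0.4], reference_mean=0.25)
```

For the three example values the variance is 0.01, the apparent bias is 0.05, and the mean square error is 0.01 + 0.05^2 = 0.0125.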

Probability of nearness is the probability that an estimate will lie within a certain interval

around the true parameter. The three total interval widths utilized were one-half, equal to, and

twice the parameter size. The percentages of 1000 estimates falling within these intervals were

calculated for the different estimation techniques within an experimental level for variance

components and ratios and utilized as an estimate of probability of nearness.
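The calculation can be sketched as the fraction of estimates inside an interval of a given total width. The sketch assumes the interval is centered on the true parameter, which is how "around the true parameter" is read here; the function name and the five example estimates are hypothetical.

```python
import numpy as np

def probability_of_nearness(estimates, parameter, width_factor):
    """Fraction of estimates inside an interval centered on the parameter whose
    total width is width_factor times the parameter."""
    estimates = np.asarray(estimates, dtype=float)
    half_width = 0.5 * width_factor * parameter
    return np.mean(np.abs(estimates - parameter) <= half_width)

estimates = [-0.1, 0.15, 0.2, 0.3, 0.6]
nearness = {factor: probability_of_nearness(estimates, 0.25, factor)
            for factor in (0.5, 1.0, 2.0)}
```

Multiplying the returned fraction by 100 gives the percentage form used in the results.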

Results are presented by variance component or genetic ratio estimated as a percentage

of MIVQUE (except in the case of probability of nearness). MIVQUE estimates represent 100%

with estimates with greater variance having values larger than 100% and "apparently biased"

estimates having values different from 100%. The percentages were calculated as equal to 100

times the estimate divided by the MIVQUE value. For the criterion of variance, the lower the







percentage the better the estimator performed; for bias, values equalling 100% (0 bias) are

preferred; and for probability of nearness, larger percentages (probabilities) are favored since

they are indicative of greater density of estimates near the parametric value.



Results and Discussion


Variance Components


Sampling variance of the estimators

For all variance components estimated, REML and ML estimation techniques were

consistently equal to or less than MIVQUE for sampling variance of the estimator (Table 4-3).

The variance among estimates from these techniques was further reduced by setting the negative

components to zero (MODML and MODREML) or setting negative estimates to zero plus re-

solving the system (NNREML, NNML, and PNNREML). Variance among MINQUE1 estimates

is always equal to or greater than for MIVQUE, as one might expect, since they are, in this

application, the same technique with MIVQUE having perfect priors (the true values). Variances

for HM3 estimators (TYPE3 and PTYPE3) are either equal to or greater than MIVQUE; HM3

estimates have progressively larger relative variance with higher levels of imbalance. MIVPEN,

although impractical because of the need for the true priors, had much more precise estimates of

variance components than other techniques illustrating what could be accomplished given the true

values as priors plus maintaining estimates within the parameter space.

In general, the spread among the percentages for variance of estimation for the estimation

techniques is highly dependent on the degree of imbalance and the type of mating system. With

increasing imbalance the likelihood-based estimators realized greater advantage for sampling

variance of the estimates over HM3 for both mating systems. The most advantageous application









Table 4-3. Sampling variance for the estimates of $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number), and
$h^2$ (third number where calculated) as a percentage of the MIVQUE estimate by type of estimator
and treatment combination; NA is not applied. Values greater than 100 indicate larger variance
among 1000 estimates.

Estimator      D1S80   D1S65   D2S65   H1S80   H1S65

REML            99.9   102.6   101.5    99.6   106.3
               100.2   100.0   104.1    99.7    98.0
               100.0   101.0   101.4    99.6   105.8
ML              77.3    78.2    76.4    95.9   103.9
               106.9   104.8   110.7   100.8    99.1
                82.5    82.9    86.4    96.2   103.8
MINQUE1        100.0   104.2   104.0   104.0   146.7
               101.2   118.8   123.6   112.5   139.7
               100.3   105.8   103.9   104.0   145.8
NNREML          80.8    71.6    95.2    88.0    68.6
                67.9    48.3    54.9    78.7    48.6
                76.8    64.2    92.2    87.3    67.7
NNML              NA      NA      NA    83.3    65.3
                                        79.4    48.9
                                        83.1    64.7
MODML           58.2    50.0    69.5    84.7    74.6
                12.8    81.4    81.6    86.6    68.5
                58.1    46.1    72.0    83.8    71.4
MODREML         81.5    74.5    96.1    88.9    78.1
                89.1    74.0    73.7    85.4    66.9
                76.4    63.5    88.9    87.7    74.3
TYPE3          101.0   101.0   105.5   100.6   121.0
               101.1   101.0   115.5   100.9   125.6
               100.5   108.4   102.9   100.4   121.6
PREML          100.3   106.3   101.7   107.5   146.9
               102.7   113.5   119.8   122.0   150.7
PML             77.6    81.9    77.1   103.6   143.4
               109.7   117.3   127.2   123.3   151.9
PMINQUE1       100.3   107.6   105.4   107.5   179.3
               102.7   129.0   137.3   122.0   180.6
PNNREML         80.9    71.1    93.9    92.7    86.6
                69.8    53.2    60.5    94.0    68.1
PTYPE3         100.3   106.6   105.4   107.5   168.1
               102.7   124.7   133.3   122.0   184.9
               100.6   110.8   104.1   106.9   168.0
MIVPEN            NA    36.2    29.1    80.0    45.6
                        26.6    20.0    74.3    39.6
                        34.7    30.2    79.8    45.4
PMIVQUE        100.3   104.2   102.4   107.5   146.9
               102.7   114.4   117.8   122.0   150.7







of likelihood-based estimators is in the H1S65 case where the imbalance is not only random

deletions of individuals but also incomplete connectedness across locations, i.e. the same families

are not present in each test (akin to incomplete blocks within a test).

An analysis of variance was conducted to determine the importance of the treatment of

negative variance component estimates in the variance of estimation for REML and ML estimates.

The model of sampling variance of the estimates as a result of mating design, imbalance level,

treatment of negative estimates and size of the variance component demonstrated consistently (for

all variance components except error) that treatment of negative estimates is an important

component of the variance of the estimates (p < .05). The model accounted for up to 99% of

the variation in the variance of the variance component estimates with 1) accepting and using

negative estimates producing the highest variance; 2) setting the negative components to zero

being intermediate; and 3) re-solving the system with negative estimates set to zero providing the

lowest variance.

For all estimation techniques, lower variance among estimates was obtained by using

individual observations as compared to plot means. The advantage of individual over plot-mean

observations increased with increasing imbalance.

Bias

The most consistent performance for bias (Table 4-4) across all variance components was

TYPE3, which is known from inherent properties to be unbiased. The consistent convergence of the

TYPE3 value to the MIVQUE value indicated that the number of data sets used (1000 per

technique and experimental level) was suitable for the purpose of examining bias. The other two

consistent performers were REML and MINQUE1. PTYPE3 (HM3 based on plot means) was

unbiased when no plot means were missing, but produced "apparently biased" estimates when

plot means were missing.










Table 4-4. Bias for the estimates of $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number), and $h^2$ (third
number where calculated) as a percentage of the MIVQUE estimate by type of estimator and
experimental combination; NA is not applied. Values different from 100 denote "apparent" bias.

Estimator      D1S80   D1S65   D2S65   H1S80   H1S65

REML            99.9   101.5    98.7    99.9   102.8
                99.9   102.2    99.8    99.9    98.9
                99.9   101.3    98.6    99.9   102.6
ML              74.6    61.6    76.0    96.2    98.2
               106.5   114.6   109.7   101.3   101.8
                75.5    61.8    77.9    96.3    98.2
MINQUE1         99.7    96.4    99.0    99.4   102.0
               100.1   100.8   101.3   100.8    98.3
                99.7    96.6    98.9    99.4   101.3
NNREML         107.9   116.5    98.1   101.9   107.8
                93.1    92.9    92.9   100.5   102.3
               108.7   118.4    98.2   102.2   107.7
NNML              NA      NA      NA   101.9   107.8
                                       100.5   102.3
                                        98.2   103.8
MODML           86.6    90.4    79.0    98.1   114.1
               109.9   129.9   127.4   101.3   122.9
                87.8    91.5    79.4    99.6   112.6
MODREML        109.5   124.2   100.6   103.1   117.8
               103.7   119.8   119.2   104.6   120.6
               109.5   123.2    98.4   102.9   116.2
TYPE3          100.1    99.4    99.6   100.2    99.6
               100.2   101.0   102.4   100.2   100.9
               100.0    99.5    99.3   100.2    99.7
PREML           99.7    98.7    97.7    99.5   110.6
               100.1   103.6   100.2   102.4    98.3
PML             74.2    58.5    73.6    95.9   105.2
               106.9   116.2   111.5   103.2   102.0
PMINQUE1        99.7    95.2    98.8    99.5   106.5
               100.1   102.1   102.9   102.4   114.8
PNNREML        107.9   114.5    96.7   101.8   115.6
                92.9    94.0    95.0   104.5   110.2
PTYPE3          99.7    96.8    99.0    99.5   104.5
               100.1    97.2    96.0   102.4   108.7
                99.8    98.0    98.8    99.6   104.1
MIVPEN            NA   107.5    98.6   102.0   103.2
                        99.0    91.7   101.4   105.1
                       112.6   103.9   102.1   103.4
PMIVQUE         99.7    97.4    99.2    99.5   106.8
               100.1   101.7   100.5   102.4    98.8










Table 4-5. Probability of nearness for $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number), and $h^2$ (third
number where calculated). The probability interval is equal to the magnitude of the parameter.

Estimator      D1S80   D1S65   D2S65   H1S80   H1S65

REML            32.8    24.3    41.8    45.3    28.6
                43.0    26.2    25.7    36.6    27.1
                34.2    25.3    45.4    45.0    28.3

ML              33.6    22.3    40.7    45.4    29.2
                42.9    26.4    24.8    36.2    26.7
                34.6    22.3    45.0    45.7    28.2

MINQUE1         32.6    24.6    41.0    45.1    26.1
                43.1    24.3    25.4    34.2    23.2
                33.7    25.0    44.6    44.7    25.6

NNREML          33.4    23.4    41.7    45.1    29.3
                44.9    28.1    25.6    38.0    28.9
                34.3    24.3    46.1    45.2    29.5

NNML              NA      NA      NA    45.9    29.7
                                        37.9    29.1
                                        46.0    29.0

TYPE3           34.0    23.2    42.5    45.3    27.1
                42.6    27.1    24.8    37.3    25.0
                35.3    23.8    45.8    45.9    27.3

PREML           32.1    20.0    41.6    43.7    24.6
                42.7    26.8    24.6    32.3    20.4

PML             33.5    19.8    39.7    44.0    24.4
                41.0    26.3    23.6    31.6    21.1

PMINQUE1        32.1    21.4    40.4    43.7    24.5
                42.7    24.8    23.1    32.3    21.9

PNNREML         31.9    19.2    41.0    43.4    26.0
                43.3    28.0    23.3    33.1    21.3

PTYPE3          32.1    23.3    41.7    43.7    25.2
                42.7    25.4    24.1    32.3    22.4
                32.6    24.1    46.0    44.6    24.6

MIVQUE          33.6    25.7    43.7    45.1    29.2
                42.9    28.6    26.4    36.9    26.3
                34.8    26.8    47.7    45.4    29.4

MIVPEN            NA    41.1    78.5    48.4    35.6
                        47.0    60.3    39.2    31.2
                        42.4    80.5    48.7    35.3

PMIVQUE         32.1    20.0    41.8    43.7    25.9
                42.7    28.5    26.8    32.3    20.8







Among estimators which displayed bias, maximum likelihood estimators (ML and PML)

were known to be inherently biased (Harville 1977, Searle 1987) with the amount of bias

proportional to the number of degrees of freedom for a factor versus the number of levels for the

factor. Other biases resulted from the method of dealing with negative estimates. Living with

negative estimates produced the estimators with the least bias. Setting negative variance

components to zero resulted in the greatest bias. Intermediate in bias were the estimates resulting

from re-solving the system with negative components set to zero.

Probability of nearness

Results for probability of nearness proved to be largely non-discriminatory among

techniques (Table 4-5). The low levels of probability density near the parametric values are

indicative of the nature of the variance component estimation problem. Figure 4-1 illustrates the

distribution of MIVQUE variance component estimates for $h^2$ (4-1a) and $\sigma^2_g$ (4-1b) for level

D1S80. The distributions for all unconstrained variance component estimates have the appearance

of a chi-square distribution, positively skewed with the expected value (mean) occurring to the

right of the peak probability density and a proportion of the estimates occurring below zero

(except error). With increasing imbalance, the variance among estimates increases and the

probability of nearness decreases for all interval widths.


Ratios of Variance Components


Single tree heritability

Results for estimates of single tree heritability adjusted for locations and blocks are shown

in Tables 4-3 and 4-4 (third number from the top in each cell, if calculated). For these relatively

low heritabilities (0.1 and 0.25), the bias and variance properties of the estimated ratio are similar

to those for $\sigma^2_g$ estimates (Figure 4-1). This implies that knowing the properties of the numerator




























[Figure 4-1: two histogram panels, 4-1a ($h^2$) and 4-1b ($\sigma^2_g$); horizontal axis: MIVQUE estimates, 1000 data sets.]

Figure 4-1. Distribution of 1000 MIVQUE estimates of $h^2$ (4-1a) and $\sigma^2_g$ (4-1b) for experimental
level D1S80 illustrating the positive skew and similarity of the distributions. The true values are
0.1 for $h^2$ and 0.25 for $\sigma^2_g$. The interval width of the bars is one-half the parametric value.







of heritability reveals the properties of the ratio (especially true of ratios with expected values of

0.1 and 0.25, Kendall and Stuart 1963, Ch. 10). Variance component estimation techniques

which performed well for bias and/or variance among estimates for $\sigma^2_g$ also performed well for

$h^2$.

Type B correlation and dominance to additive variance ratio

Type B correlation (Tables 4-3 and 4-4, as $\sigma^2_{tg}$) and dominance to additive variance ratio

(not shown) estimates both proved to be too unstable (extremely large variance among estimates)

in their original formulations to be useful in discrimination among variance component estimation

techniques. This high variance is due to the estimates of the denominators of these ratios

approaching zero and to the high variance of the denominator of ratios (Table 4-2). These ratios

were reformulated with numerators of interest ($4\sigma^2_{tg}$ for additive genetic by test interaction and

$4\sigma^2_s$ for dominance variance, respectively) and a denominator equal to the estimate of the

phenotypic variance. With this reformulation the variance and bias properties of estimates of the

altered ratios are approximated by the properties of estimates of the numerators.

For increasing imbalance maximum-likelihood-based estimation offers an increasing

advantage over HM3, and for all techniques individual observations offer increasing advantage

over plot-mean observations for variance of the estimates of these ratios. Bias, other than

inherently biased methods (ML), is associated with the probability of negative estimates which

is increased by increasing imbalance. This assertion is supported by comparing the biases of

REML, NNREML, and MODREML estimates across imbalance levels.








General Discussion


Observational Unit

Some general conclusions regarding the choice of a variance component estimation

methodology can be drawn from the results of this investigation. For any degree of imbalance

the use of individual observations is superior to the use of plot means for estimation of variance

components or ratios of variance components. If the data are nearly balanced (close to 100%

survival with no missing plots, crosses (full-sib) or lack of connectedness (half-sib)), the

properties of the estimation techniques based on individual and plot-mean observations become

similar; so if departure from balance is nominal, plot means can be used effectively. However,

using individual observations obviates the need for a survey of imbalance in the data since

individual observations produce better results than plot means for any of the estimation techniques

examined.


Negative Estimates


Drawing on the results of this investigation, the discussion of practical solutions for the

negative estimates problem will revolve around two solutions: 1) accept and use the negative

estimates; and 2) re-solving the system with negative estimates set to zero.

Given that the property of interest is the true value of a variance component or genetic

ratio, often estimated as a mean across data sets, then negativity constraints come into play if the

component of interest is small in comparison to other underlying variance components in the data,

or the variance of estimates is high due to an inadequate experimental design for variance

component estimation. These factors lead to an increased number of negative estimates. If the

data structure is such that negative estimates would occur frequently, then accepting negative

estimates is a good alternative.







If negative estimates tend to occur infrequently or bias is of less concern than variance

among estimates, then re-solving the system after convergence yields negative estimates is the

preferable solution. This tactic reduces both bias and variance among estimates below that of

arbitrarily setting negative estimates to zero.


Estimation Technique


The primary competitors among estimation techniques that are practically achievable are

REML and TYPE3 (HM3). Both techniques produce estimates with little or no bias; however,

REML estimates for the most part have slightly less sampling variance than TYPE3 estimates.

If only subsets of the parents are in common across tests as in the case H1S65, REML has a

distinct advantage in variance among estimates over TYPE3.

REML does have three additional advantages over TYPE3 which are 1) REML offers

generalized least squares estimation of fixed effects while TYPE3 offers ordinary least squares

estimation; 2) Best Linear Unbiased Predictions (BLUP) of random variables are inherent in

REML solutions, i.e., gca predictions are available; and thus in solving for the variance

components with REML, fixed effects are estimated and random variables are predicted

simultaneously (Harville 1977); and 3) REML offers greater flexibility in the model specification

both in univariate and multivariate forms as well as heterogeneous or correlated error terms.

Further, although the likelihood equations for common REML applications are based on

normality, the technique has been shown to be robust against the underlying distribution (Westfall

1987, Banks et al. 1985).








Recommendation


If one were to choose a single variance component estimation technique from among

those tested which could be applied to any data set with confidence that the estimates had

desirable properties (variance, MSE, and bias), that technique would be REML and the basic unit

of observation would be the individual. This combination (REML plus individual observations)

performed well across mating design and types and levels of imbalance. Treatment of negative

estimates would be determined by the proposed use of the estimates, that is, whether unbiasedness

(accepting and using the negative estimates) is more important than sampling variance (re-solve

the system setting negative estimates to zero).

A primary disadvantage of REML and individual observations is that they are both

computationally expensive (computer memory and time). HM3 estimation could replace REML

on many data sets and plot means could replace individual observations on some data sets; but

general application of these without regard to the data at hand does result in a loss in desirable

properties of the estimates in many instances.

The computational expense of REML and individual observations ensures that estimates

have desirable properties for a broad scope of applications. With the advent of bigger and faster

computers and the evolution of better REML algorithms, what was not feasible in the past on

most mainframe computers can now be accomplished on personal computers.












CHAPTER 5
GAREML: A COMPUTER ALGORITHM FOR
ESTIMATING VARIANCE COMPONENTS AND
PREDICTING GENETIC VALUES


Introduction


The computer program described in this chapter, called GAREML for Giesbrecht's

algorithm of restricted maximum likelihood estimation (REML), is useful for both estimating

variance components and predicting genetic values. GAREML applies the methodology of

Giesbrecht (1983) to the problems of REML estimation (Patterson and Thompson 1971) and best

linear unbiased prediction (BLUP, Henderson 1973) for univariate (single trait) genetics models.

GAREML can be applied to half-sib (open-pollinated or polymix) and full-sib (partial diallels,

factorials, half-diallels [no selfs] or disconnected sets of half-diallels) mating designs when planted

in single or multiple locations with single or multiple replications per location. When used for

variance component estimation, this program has been shown to provide estimates with desirable

properties across types of imbalance commonly encountered in forest genetics field tests (Huber

et al. in press) and with varying underlying distributions (Banks et al. 1985, Westfall 1987).

GAREML is also useful for determining efficiencies of alternative field and mating designs for

the estimation of variance components.

Utilizing the power of mixed-model methodology (Henderson 1984), GAREML provides

BLUP of parental general (gca) and specific combining abilities (sca) as well as generalized least

squares (GLS) solutions for fixed effects. The application of BLUP to forest genetics problems

has been addressed by White and Hodge (1988, 1989). With certain assumptions, the desirable







properties of BLUP predictions include maximizing the probability of obtaining correct parental

rankings from the data and minimizing the error associated with using the parental values

obtained in future applications. GLS fixed effect estimation weights the observations comprising

the estimates by their associated variances approximating best linear unbiased estimation (BLUE)

for fixed effects (Searle 1987, p 489-490).

The purpose of this chapter is to describe the theory and use of GAREML in enough

detail to facilitate use by other investigators. The program is written in FORTRAN and is not

dependent on other analysis programs. An interactive version of this program can be obtained

as a stand-alone executable file from the senior author; this file will run on any IBM compatible

PC under DOS or WINDOWS2 operating systems. The size of the problem an investigator can

solve will be dependent on the amount of extended memory and hard disk space (for swap files)

available for program use. In addition, the FORTRAN source code can be obtained for analysts

wishing to compile the program for use on alternate systems (e.g. mainframe computers).


Algorithm


GAREML proceeds by reading the data and forming a design matrix based on the number

of levels of factors in the model. Any portions of the design matrix for nested factors or

interactions are formed by horizontal direct product. Columns of zeroes in the design matrix (the

result of imbalance) are then deleted. The design matrix columns are in an order specified by

Giesbrecht's algorithm: columns for fixed effects are first, followed by the data vector, and the

last section of the matrix is for random effects. The design matrix is the only fully formed

matrix in the program. All other matrices are symmetric; therefore, to save computational space


2Windows is the trademark of the Microsoft Corporation, Redmond, WA.







and time, only the diagonal and the above diagonal portions of matrices are formed and utilized

(i.e., half-stored).
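The set-up steps described above can be sketched in Python for illustration; GAREML itself is written in FORTRAN and its internal routines are not shown in this chapter, so every function name, the tiny data set and the row-by-row packing order below are assumptions.

```python
import numpy as np

def indicator(levels, n_levels):
    """Design columns for one factor: one 0/1 column per level."""
    return np.eye(n_levels)[levels]

def horizontal_direct_product(A, B):
    """Row-wise Kronecker product: row i of the result is kron(A[i], B[i])."""
    return np.einsum("ij,ik->ijk", A, B).reshape(A.shape[0], -1)

def half_store(M):
    """Diagonal and above-diagonal elements of a symmetric matrix, row by row."""
    return M[np.triu_indices(M.shape[0])]

def half_index(i, j, n):
    """Position of element (i, j) of an n x n symmetric matrix in the packed vector."""
    if i > j:
        i, j = j, i
    return i * n - i * (i - 1) // 2 + (j - i)

# Hypothetical data: 2 tests, 3 families, family 2 absent from test 1.
tests = [0, 0, 0, 1, 1]
families = [0, 1, 2, 0, 1]
y = np.array([3.0, 4.0, 5.0, 6.0, 7.0])

interaction = horizontal_direct_product(indicator(tests, 2), indicator(families, 3))
kept = interaction[:, interaction.any(axis=0)]       # drop all-zero columns

# Column order: fixed effects (mean), data vector, random effects.
design = np.column_stack([np.ones(5), y, kept])
cross_products = design.T @ design                   # dot products of the columns
packed = half_store(cross_products)
```

Packing an n x n symmetric matrix this way needs n(n + 1)/2 stored values instead of n squared.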

A half-stored matrix of the dot products of the design columns is formed and either kept

in common memory or stored in temporary disk space so that the matrix is available for recall

in the iterative solution process. The algorithm proceeds by modifying the matrix of dot products

such that the inverse of the covariance matrix for the observations (V) is enclosed by the column

specifiers in the dot products as $X'X$ becoming $X'V^{-1}X$. This transfer is completed without

inversion of the total V matrix. The identity used to accomplish this transfer is

if $V_h = \sigma_h Z_h Z_h' + V_{(h+1)}$ where $V_h$ is nonsingular;

then $$V_h^{-1} = V_{(h+1)}^{-1} - \sigma_h V_{(h+1)}^{-1} Z_h \left(I_h + \sigma_h Z_h' V_{(h+1)}^{-1} Z_h\right)^{-1} Z_h' V_{(h+1)}^{-1}. \tag{5-1}$$

A compact form of equation 5-1 is obtained by pre-multiplying by $Z_i'$ and post-multiplying by

$Z_j$, where h = 1, ..., k-1 (k = the total number of random factors), $\sigma_h$ is the prior associated with

random variable h, $V_k = \sigma_k I$, $V_1 = V$ and $Z_i$ is the portion of the design matrix for random

variable i (Giesbrecht 1983). A partitioned matrix is formed in order to update $V_{(h+1)}^{-1}$ until $V_1^{-1}$

or $V^{-1}$ is obtained. This matrix is of the form:

$$\begin{bmatrix} I_h + \sigma_h Z_h' V_{(h+1)}^{-1} Z_h & \sqrt{\sigma_h}\, Z_h' V_{(h+1)}^{-1} (X \mid y \mid Z_1 \mid \cdots \mid Z_{k-1}) \\ \sqrt{\sigma_h}\, (X \mid y \mid Z_1 \mid \cdots \mid Z_{k-1})' V_{(h+1)}^{-1} Z_h & T_{h+1} \end{bmatrix} \tag{5-2}$$

where $T_k = (X \mid y \mid Z_1 \mid \cdots \mid Z_{k-1})' V_k^{-1} (X \mid y \mid Z_1 \mid \cdots \mid Z_{k-1})$.
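The updating identity can be checked numerically. In this sketch the dimensions, the prior of 0.7 and the choice of a scaled identity matrix for the next covariance term are hypothetical; the point is only that the update requires inverting a matrix of order equal to the number of levels of the random factor rather than of order n.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 12, 3                                   # observations, levels of factor h
Z_h = np.eye(q)[rng.integers(0, q, n)]         # design columns for factor h
sigma_h = 0.7                                  # prior for factor h
V_next = 2.0 * np.eye(n)                       # V_(h+1), here only the error term
V_next_inv = np.linalg.inv(V_next)

V_h = sigma_h * Z_h @ Z_h.T + V_next

# Right-hand side of the identity: only a q x q matrix is inverted.
small = np.eye(q) + sigma_h * Z_h.T @ V_next_inv @ Z_h
V_h_inv = (V_next_inv
           - sigma_h * V_next_inv @ Z_h @ np.linalg.inv(small) @ Z_h.T @ V_next_inv)

max_error = np.abs(V_h_inv - np.linalg.inv(V_h)).max()
```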

The sweep operator of Goodnight (1979) is applied to the upper left partition of the

matrix (equation 5-2) and the result of equation 5-1 is obtained. The matrix is sequentially

updated and swept until $T_1 = (X \mid y \mid Z_1 \mid \cdots \mid Z_{k-1})'V^{-1}(X \mid y \mid Z_1 \mid \cdots \mid Z_{k-1})$ is obtained. $T_1$ is then

swept on the columns for fixed effects ($X'V^{-1}X$). This sweep operation produces generalized least

squares estimates for fixed effects, results which can be scaled into predictions of random

variables, the residual sum of squares and all the necessary ingredients for assembling the







equation to solve for the variance components. The equation to be solved for the variance

components is

$\{tr(QV_iQV_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\}$, with dimensions $r \times r$, $r \times 1$ and $r \times 1$, respectively;

then $\hat{\sigma}^2 = \{tr(QV_iQV_j)\}^{-1}\{y'QV_iQy\}$; 5-3

where $\{tr(QV_iQV_j)\}$ is a matrix whose elements are $tr(QV_iQV_j)$ where $i = 1$ to $r$ and

$j = 1$ to $r$, i.e., there is a row and column for every random variable in

the linear model;

tr is the trace operator that is the sum of the diagonal elements of a matrix;

$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$ for $V$ as the covariance matrix of $y$ and $X$ as

the design matrix for fixed effects;

$V_i = Z_iZ_i'$, where the $i$'s are the random variables;

$\hat{\sigma}^2$ is the vector of variance component estimates; and

$r$ is the number of random variables in the model ($k-1$).
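The sweep on the fixed-effect columns, which supplies the quantities entering the equation above, can be illustrated on a small least-squares problem. The Python sketch below sweeps an augmented cross-products matrix on its fixed-effect columns. It assumes V = I purely to keep the example small, so the result is ordinary rather than generalized least squares; the pivot convention follows the usual description of the sweep operator, and the data are hypothetical.

```python
# Illustrative sweep operator on a small symmetric cross-products matrix.
# V = I is assumed, so the estimates are ordinary least squares.

def sweep(a, k):
    """Return a copy of the square matrix a swept on pivot k."""
    n = len(a)
    d = a[k][k]
    out = [row[:] for row in a]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                out[i][j] = 1.0 / d
            elif i == k:
                out[i][j] = a[k][j] / d          # pivot row is scaled
            elif j == k:
                out[i][j] = -a[i][k] / d         # pivot column is scaled and negated
            else:
                out[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return out

# Hypothetical data: intercept plus one covariate, y = 1 + 2x exactly.
x = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]
w = [row + [yi] for row, yi in zip(x, y)]                 # (X | y)
t = [[sum(r[i] * r[j] for r in w) for j in range(3)] for i in range(3)]

swept = sweep(sweep(t, 0), 1)                             # sweep on the X columns
estimates = [swept[0][2], swept[1][2]]                    # fixed-effect estimates
residual_ss = swept[2][2]                                 # residual sum of squares
```

After the two sweeps the upper-left block holds the inverse of X'X, the last column holds the estimates and the lower-right element holds the residual sum of squares, which is the sense in which a single operation yields several results at once.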

The entire procedure from forming $T_1$ to solving for the variance components continues

until the variance component estimates from the last iteration are no more different from the

estimates of the previous iteration than the convergence criterion specifies. The fixed effect

estimates and predictions of random variables are then those of the final iteration. The

asymptotic covariance matrix for the variance components is obtained as

$Var(\hat{\sigma}^2) = 2\{tr(QV_iQV_j)\}^{-1}$ 5-4

by utilizing intermediate results from the solution for the variance components.
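A minimal sketch of this last step is given below: once the matrix of traces and the vector of quadratic forms are in hand, the variance components follow from a linear solve, and twice the inverse of the trace matrix gives the asymptotic covariance matrix. The Python code uses a generic Gauss-Jordan inverse, which is an assumption made for illustration and not the way GAREML reuses its intermediate results; the numeric inputs are hypothetical.

```python
# Illustrative solve for variance components and their asymptotic covariance.
# The trace matrix and quadratic forms below are made-up numbers.

def invert(a):
    """Gauss-Jordan inverse of a small nonsingular matrix with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def solve_components(trace_matrix, quadratic_forms):
    """Return (variance component estimates, asymptotic covariance matrix)."""
    inv = invert(trace_matrix)
    estimates = [sum(inv[i][j] * q for j, q in enumerate(quadratic_forms))
                 for i in range(len(inv))]
    covariance = [[2.0 * v for v in row] for row in inv]
    return estimates, covariance

estimates, covariance = solve_components([[4.0, 1.0], [1.0, 2.0]], [9.0, 4.0])
```

In an iterative run these estimates would serve as the priors for the next pass, and the covariance matrix would be taken from the final iteration.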

The coefficient matrix of Henderson's mixed model equations is formed in order to

calculate the covariance matrix for fixed and random effects. The covariance matrix for







observations is constructed using the variance components estimates from Giesbrecht's algorithm.

The coefficient matrix is

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + D^{-1} \end{bmatrix}$$ 5-5

where $R$ is the error covariance matrix which in this application is $I\sigma^2_w$,

where $\sigma^2_w$ is the variance of random variable $w$ (equations 5-6 and 5-7);

X is the fixed effects design matrix;

Z is the random effects design matrix; and

D is the covariance matrix for the random variables which, in this

application, has variance components on the diagonal and zeroes on the

off-diagonal (no covariance among random variables).

The generalized inverse of the matrix (equation 5-5) is the error covariance matrix of the fixed

effect estimates and random predictions assuming the covariance matrix for observations is known

without error.
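To make the structure of the coefficient matrix in equation 5-5 concrete, the following Python sketch assembles it for a toy model under the conditions stated above: R equal to the identity times the within-plot variance, and D diagonal. The function name, the toy design (one fixed mean, one random factor with two levels) and the variance values are assumptions for illustration only.

```python
# Illustrative assembly of the mixed model equations coefficient matrix
# with R = I * var_w and a diagonal D (hypothetical inputs).

def mme_coefficient_matrix(x, z, d_diagonal, var_w):
    """x: n x p fixed design, z: n x q random design, d_diagonal: q variances."""
    w = [row_x + row_z for row_x, row_z in zip(x, z)]     # (X | Z)
    m = len(w[0])
    p = len(x[0])
    # (X | Z)' R^-1 (X | Z) with R = I * var_w
    c = [[sum(row[i] * row[j] for row in w) / var_w for j in range(m)]
         for i in range(m)]
    # add D^-1 to the random-effects block
    for t, variance in enumerate(d_diagonal):
        c[p + t][p + t] += 1.0 / variance
    return c

x = [[1.0], [1.0], [1.0], [1.0]]                          # overall mean
z = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]      # two levels, two observations each
coefficients = mme_coefficient_matrix(x, z, [2.0, 2.0], 4.0)
```

Only the random-effects block differs from an ordinary least-squares cross-products matrix, through the added inverse of D; inverting the whole matrix then gives the error covariance matrix described above.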


Operating GAREML


While GAREML will run in either batch or interactive mode, we focus on the interactive

PC-version which begins by prompting the analyst to answer questions determining the factors

to be read from the data. Specifically, the analyst answers yes or no to these questions: 1) are

there multiple locations? 2) are there multiple blocks? 3) are there disconnected sets of full-sibs

(usually disconnected half-diallels)? and 4) is the mating design half-sib or full-sib?

The program then determines the proper variables to read from the data as well as the most

complicated (number of main factors plus interactions) scalar linear model allowed.

The most complicated linear model allowed for full-sib observations is







$y_{ijklm} = \mu + t_i + b_{ij} + set + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm}$ 5-6

where $y_{ijklm}$ is the $m$th observation of the $kl$th cross in the $j$th block of the $i$th test;

$\mu$ is the population mean;

$t_i$ is the random or fixed variable test environment;

$b_{ij}$ is the random or fixed variable block;

$set$ is the random or fixed variable set, i.e., a variable created so that

disconnected sets of half-diallels planted in the same experiment can be

analyzed in the same run, or so that provenances and families within

provenance can be analyzed with provenance equal to set; sets are assumed to

extend across test environments and blocks, with families nested within sets, and

interactions with set are assumed unimportant;

$g_k$ is the random variable female general combining ability (gca);

$g_l$ is the random variable male gca;

$s_{kl}$ is the random variable specific combining ability (sca);

$tg_{ik}$ is the random variable test by female gca interaction;

$tg_{il}$ is the random variable test by male gca interaction;

$ts_{ikl}$ is the random variable test by sca interaction;

$p_{ijkl}$ is the random variable plot;

$w_{ijklm}$ is the random variable within-plot; and

there is no covariance between random variables in the model.

The assumptions utilized are that the variances for female and male random variables are equal

($\sigma^2_{g_k} = \sigma^2_{g_l} = \sigma^2_{gca}$) and that female and male environmental interactions are the same

($\sigma^2_{tg_{ik}} = \sigma^2_{tg_{il}} = \sigma^2_{tg}$).

The most complicated scalar linear model allowed for half-sib observations is

$y_{ijkm} = \mu + t_i + b_{ij} + set + g_k + tg_{ik} + p_{ijk} + w_{ijkm}$ 5-7

where $y_{ijkm}$ is the $m$th observation of the $k$th half-sib family in the $j$th block of the $i$th test;

$\mu$, $t_i$, $b_{ij}$, $set$, $g_k$, and $tg_{ik}$ retain the definition in the full-sib equation;

$p_{ijk}$ is the random variable plot containing different genotype by environment

components than the full-sib model;

$w_{ijkm}$ is the random variable within-plot containing different levels of

genotypic and genotype by environment components than the full-sib model;

and there is no covariance between random variables in the model.

The analyst builds the linear model by answering further prompts. If test, block and/or

set are in the model, they must be declared as fixed or random effects. When any of the three

effects is declared random, the analyst must furnish prior values for the variance. If no prior

value is known, 1.0's may be used as priors. Using 1.0's as priors will not affect the values for

resulting variance component estimates within the constraints of the convergence criterion; but

there may be a time penalty due to increasing the number of iterations required for convergence.

All remaining factors in the model are treated as random variables.

To complete the definition of the model, the analyst chooses to include or exclude each

possible factor by answering yes or no when prompted. After each yes answer, the program asks

for a prior value for the variance. Again, if no known priors exist, 1.0's may be substituted.

After the model has been specified, the program counts the number of fixed effects and the

number of random effects and asks if the number fits the model expected. A "yes" answer

proceeds through the program while a "no" returns the program to the beginning.

GAREML is now ready to read the data file (which must be an ASCII data file) in this

order: test, block, set, female, male, and the response variable. The analyst is prompted to

furnish a proper FORTRAN format statement for the data. Test, block, set, female and male are

read as character variables (A fields) with as many as eight characters per field, while the data







vector (response variable) is read as a double precision variable (F field). An example of a

format statement for a full-sib mating design across locations and blocks is "(4A8,F10.5)" which

reads four character variables sequentially occupying 8 columns each and the response variable

beginning in column 33 and ending in column 42 having five decimal places.
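For readers less familiar with FORTRAN formats, the Python sketch below shows how a record would be split under "(4A8,F10.5)". It is an illustration, not part of GAREML: the function name and the example labels are hypothetical, and the numeric field is assumed to contain only digits with an optional sign and decimal point. It follows the standard FORTRAN input rule that an explicit decimal point in the field takes precedence, while a field without one has its last five digits treated as the fractional part.

```python
# Illustrative reader for one record laid out as (4A8,F10.5).
# Labels and values are hypothetical.

def read_record(line, n_char=4, char_width=8, num_width=10, decimals=5):
    """Split a fixed-width record into character fields and one numeric value."""
    labels = [line[i * char_width:(i + 1) * char_width].strip()
              for i in range(n_char)]
    start = n_char * char_width                  # numeric field starts in column 33
    raw = line[start:start + num_width].strip()
    if "." in raw:
        value = float(raw)                       # explicit decimal point wins
    else:
        value = int(raw) / 10 ** decimals        # implied five decimal places
    return labels, value

# Build a record: test, block, female, male (8 columns each), then the response.
record = ("TEST1".ljust(8) + "BLK2".ljust(8) + "P101".ljust(8) + "P207".ljust(8)
          + "12.5".rjust(10))
labels, value = read_record(record)
```

A record whose fields are shifted by even one column would yield truncated labels or a misread response, which is one reason the summary counts printed after the data are read deserve a careful look.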

After reading the data, GAREML begins to furnish information to the analyst. This

information should be scanned to make sure the data read are correct. This information includes

the number of parents, the number of full-sib crosses, the number of observations, the maximum

number of fixed effect design matrix columns, and the maximum number of random effect design

matrix columns. If there is an error at this point, use CTRL-BRK to exit the program. Probable

causes of errors are data that are not in the format specified, missing values, blank lines or

similar problems in the data file, or an incorrectly specified model.

At this point, there are three other prompts concerning the data analysis (number of

iterations, convergence criterion and treatment of negative variance components). The number

of iterations is arbitrarily set to 30 and can be changed at the analyst's discretion. No warning

is issued that the maximum number of iterations has been reached; however, the current iteration

number and variance component estimates are output to the screen at the beginning of each

iteration. The convergence criterion used is the sum of the absolute values of the difference

between variance component estimates for consecutive iterations. The criterion has been set to

$1 \times 10^{-4}$, meaning that convergence is required to the fourth decimal place for all variance

components. The convergence criterion should be modified to suit the magnitude of the variances

under consideration as well as the practical need for enhanced resolution. Enhanced resolution

is obtained at the cost of increasing the number of iterations to convergence.
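The stopping rule described above can be sketched in Python as follows. This is an illustration under stated assumptions: the update function is a stand-in that simply moves the estimates toward fixed targets, not the REML update; the use of "less than or equal to" in the comparison is assumed; and the returned flag is an addition for clarity, since the program itself issues no warning when the iteration limit is reached.

```python
# Illustrative iteration loop with the sum-of-absolute-differences criterion.
# stand_in_update is a placeholder, not the REML update used by GAREML.

def converged(previous, current, criterion=1e-4):
    """True when the summed absolute change in the estimates meets the criterion."""
    return sum(abs(c - p) for p, c in zip(previous, current)) <= criterion

def iterate(update, priors, max_iter=30, criterion=1e-4):
    """Repeat update until convergence or until max_iter iterations are used."""
    estimates = list(priors)
    for iteration in range(1, max_iter + 1):
        new = update(estimates)
        if converged(estimates, new, criterion):
            return new, iteration, True
        estimates = new
    return estimates, max_iter, False

def stand_in_update(estimates):
    """Placeholder update: halves the distance to hypothetical targets."""
    targets = (2.0, 0.5)
    return [0.5 * (e + t) for e, t in zip(estimates, targets)]

final, iterations_used, reached = iterate(stand_in_update, [1.0, 1.0])
```

Because the criterion is a sum over all components, it is sensitive to the scale of the variances: a value suited to components near 1.0 may be far too loose for components near 0.001.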

The analyst must decide whether to accept and use negative estimates or to set negative

estimates to zero and re-solve the system. The latter solution results in variance component







estimates with lower sampling variance and slight bias. If one is interested in unbiased estimates

of variance components that have a high probability of negative estimates, then accepting and

using the negative estimates may be the proper course to take.



Interpreting GAREML Output


Analysis is now underway. The priors for each iteration and the iteration number are

printed out to the screen. GAREML continues to iterate until the convergence criterion is met

or the maximum number of iterations is reached. The next time that analyst intervention is

required is to provide a name for the output file for variance component estimates. The file name

follows normal DOS file naming protocol; however, alternative directories may not be specified,

i.e., all outputs will be found in the same directory as the data file. The program will now quiz

the analyst to determine if additional outputs are desired. These additional outputs are gca

predictions, sca predictions (if applicable), the asymptotic covariance matrix for the variance

components, generalized least squares fixed effect estimates, error covariance matrix of the gca

predictions and error covariance matrix for fixed effects. An answer of yes to the inclusion of

an output will result in a prompt for a file name. In addition, for gca and sca predictions the

analyst may input a different value for $\hat{\sigma}^2_{gca}$ or $\hat{\sigma}^2_{sca}$ with which to scale predictions. The

discussion which follows furnishes more detailed information concerning GAREML outputs.


Variance Component Estimates


Ignoring concerns about convergence to a global maximum and negative values, variance

component estimates are restricted maximum likelihood estimates of Patterson and Thompson

(1971). The estimates are robust against starting values (priors), i.e., the same estimates, within

the limits of the convergence criterion, can be obtained from diverse priors. However, priors







close to the true values will, in general, reduce the number of iterations required to reach

convergence. The value of the convergence criterion must be less than or equal to the desired

precision for the variance components. REML variance component estimates from this program

have been shown to have more desirable properties (variance and bias) than other commonly used

estimation techniques (maximum likelihood, minimum norm quadratic unbiased estimation and

Henderson's Method 3) over a wide range of data imbalance. The properties of the estimates are

further enhanced by using individual observations as data rather than plot means. The output is

labelled by the variance component estimated.


Predictions of Random Variables


The predictions output are for general and specific combining abilities and are approximate

best linear unbiased predictions (BLUP) of the random variables. BLUP predictions have several

optimal properties: 1) the correlation between the predicted and true values is maximized; 2) if

the distribution is multivariate normal then BLUP maximizes the probability of obtaining the

correct rankings (Henderson 1973) and so maximizes the probability of selecting the best

candidate from any pair of candidates (Henderson 1977).

Predictions are of the form:

$\hat{u} = \hat{D}Z'V^{-1}(y - X\hat{\beta})$ 5-8

where $\hat{u}$ is the vector of predictions;

$\hat{D}$ is the estimated covariance matrix for random variables from the REML

variance component estimates, see equation 5-5;

Z' is the transpose of the design matrix for random variables;

y is the data vector;

X is the design matrix for fixed effects;




Full Text

PAGE 2

X 237,0$/ 0$7,1* '(6,*16 $1' 237,0$/ 7(&+1,48(6 )25 $1$/<6,6 2) 48$17,7$7,9( 75$,76 ,1 )25(67 *(1(7,&6 %\ '8'/(< $59/( +8%(5 ',66(57$7,21 35(6(17(' 72 7+( *5$'8$7( 6&+22/ 7+( 81,9(56,7< 2) )/25,'$ ,1 3$57,$/ )8/),//0(17 2) 7+( 5(48,5(0(176 )25 7+( '(*5(( 2) '2&725 2) 3+,/2623+< 81,9(56,7< 2) )/25,'$

PAGE 3

$&.12:/('*(0(176 H[SUHVV P\ JUDWLWXGH WR 'UV 7 / :KLWH 5 +RGJH 5 & /LWWHOO 0 $ 'H/RUHQ]R DQG / 5RFNZRRG IRU WKHLU WLPH DQG HIIRUW LQ WKH SXUVXLW RI WKLV ZRUN 7KHLU JXLGDQFH DQG ZLVGRP SURYHG LQYDOXDEOH WR WKH FRPSOHWLRQ RI WKLV SURMHFW IXUWKHU DFNQRZOHGJH 'U %UXFH %RQJDUWHQ IRU KLV HQFRXUDJHPHQW WR FRQWLQXH P\ DFDGHPLF FDUHHU DP JUDWHIXO WR 'U 7 / :KLWH DQG WKH 6FKRRO RI )RUHVW 5HVRXUFHV DQG &RQVHUYDWLRQ DW WKH 8QLYHUVLW\ RI )ORULGD IRU IXQGLQJ WKLV ZRUN H[WHQG VSHFLDO WKDQNV WR *HRUJH %U\DQ DQG 'U 0 $ 'H/RUHQ]R RI WKH 'DLU\ 6FLHQFH 'HSDUWPHQW DQG *UHJ 3RZHOO RI WKH 6FKRRO RI )RUHVW 5HVRXUFHV DQG &RQVHUYDWLRQ IRU WKH XVH RI FRPSXWLQJ IDFLOLWLHV SURJUDPPLQJ KHOS DQG DLG LQ UXQQLQJ WKH VLPXODWLRQV UHTXLUHG 0RVW LPSRUWDQWO\ WKDQN P\ IDPLO\ 1DQF\ -RKQ DQG +HDWKHU IRU WKHLU XQGHUVWDQGLQJ DQG HQFRXUDJHPHQW LQ WKLV HQGHDYRU

PAGE 4

7$%/( 2) &217(176 $&.12:/('*(0(176 LL /,67 2) 7$%/(6 YL /,67 2) ),*85(6 YLL $%675$&7 YLLL &+$37(5 ,1752'8&7,21 &+$37(5 7+( ()),&,(1&< 2) +$/)6,% +$/)',$//(/ $1' &,5&8/$5 0$7,1* '(6,*16 ,1 7+( (67,0$7,21 2) *(1(7,& 3$5$0(7(56 :,7+ 9$5,$%/( 180%(56 2) 3$5(176 $1' /2&$7,216 ,QWURGXFWLRQ 0HWKRGV $VVXPSWLRQV &RQFHUQLQJ %ORFN 6L]H 7KH 8VH RI (IILFLHQF\ Lf *HQHUDO 0HWKRGRORJ\ /HYHOV RI *HQHWLF 'HWHUPLQDWLRQ &RYDULDQFH 0DWUL[ IRU 9DULDQFH &RPSRQHQWV &RYDULDQFH 0DWUL[ IRU /LQHDU &RPELQDWLRQV RI 9DULDQFH &RPSRQHQWV DQG 9DULDQFH RI D 5DWLR &RPSDULVRQ $PRQJ (VWLPDWHV RI 9DULDQFHV RI 5DWLRV 5HVXOWV +HULWDELOLW\ 7\SH % &RUUHODWLRQ 'RPLQDQFH WR $GGLWLYH 9DULDQFH 5DWLR 'LVFXVVLRQ &RPSDULVRQ RI 0DWLQJ 'HVLJQV $ *HQHUDO $SSURDFK WR WKH (VWLPDWLRQ 3UREOHP 8VH RI WKH 9DULDQFH RI D 5DWLR $SSUR[LPDWLRQ &RQFOXVLRQV LLL

PAGE 5

&+$37(5 25',1$5< /($67 648$5(6 (67,0$7,21 2) *(1(5$/ $1' 63(&,),& &20%,1,1* $%,/,7,(6 )520 +$/)',$//(/ 0$7,1* '(6,*16 ,QWURGXFWLRQ 0HWKRGV /LQHDU 0RGHO 2UGLQDU\ /HDVW 6TXDUHV 6ROXWLRQV 6XPWR=HUR 5HVWULFWLRQV &RPSRQHQWV RI WKH 0DWUL[ (TXDWLRQ (VWLPDWLRQ RI )L[HG (IIHFWV 1XPHULFDO ([DPSOHV %DODQFHG 'DWD 3ORWPHDQ %DVLVf 0LVVLQJ 3ORW 0LVVLQJ &URVV 6HYHUDO 0LVVLQJ &URVVHV 'LVFXVVLRQ 8QLTXHQHVV RI (VWLPDWHV :HLJKWLQJ RI 3ORW 0HDQV DQG &URVV 0HDQV LQ (VWLPDWLQJ 3DUDPHWHUV 'LDOOHO 0HDQ 9DULDQFH DQG &RYDULDQFH RI 3ORW 0HDQV &RPSDULVRQ RI 3UHGLFWLRQ DQG (VWLPDWLRQ 0HWKRGRORJLHV &RQFOXVLRQV &+$37(5 9$5,$1&( &20321(17 (67,0$7,21 7(&+1,48(6 &203$5(' )25 7:2 0$7,1* '(6,*16 :,7+ )25(67 *(1(7,& $5&+,7(&785( 7+528*+ &20387(5 6,08/$7,21 ,QWURGXFWLRQ 0HWKRGV ([SHULPHQWDO $SSURDFK ([SHULPHQWDO 'HVLJQ IRU 6LPXODWHG 'DWD )XOO6LE /LQHDU 0RGHO +DOIVLE /LQHDU 0RGHO 'DWD *HQHUDWLRQ DQG 'HOHWLRQ 9DULDQFH &RPSRQHQW (VWLPDWLRQ 7HFKQLTXHV &RPSDULVRQ $PRQJ (VWLPDWLRQ 7HFKQLTXHV 5HVXOWV DQG 'LVFXVVLRQ 9DULDQFH &RPSRQHQWV 5DWLRV RI 9DULDQFH &RPSRQHQWV *HQHUDO 'LVFXVVLRQ 2EVHUYDWLRQDO 8QLW 1HJDWLYH (VWLPDWHV (VWLPDWLRQ 7HFKQLTXH 5HFRPPHQGDWLRQ ,9

PAGE 6

&+$37(5 *$5(0/ $ &20387(5 $/*25,7+0 )25 (67,0$7,1* 9$5,$1&( &20321(176 $1' 35(',&7,1* *(1(7,& 9$/8(6 ,QWURGXFWLRQ $OJRULWKP 2SHUDWLQJ *$5(0/ ,QWHUSUHWLQJ *$5(0/ 2XWSXW 9DULDQFH &RPSRQHQW (VWLPDWHV 3UHGLFWLRQV RI 5DQGRP 9DULDEOHV $V\PSWRWLF &RYDULDQFH 0DWUL[ RI 9DULDQFH &RPSRQHQWV )L[HG (IIHFW (VWLPDWHV (UURU &RYDULDQFH 0DWULFHV ([DPSOH 'DWD $QDO\VLV 2XWSXW &RQFOXVLRQV &+$37(5 &21&/86,216 $33(1',; )2575$1 6285&( &2'( )25 *$5(0/ 5()(5(1&( /,67 %,2*5$3+,&$/ 6.(7&+ Y

PAGE 7

/,67 2) 7$%/(6 7DEOH 3DUDPHWULF YDULDQFH FRPSRQHQWV 7DEOH 'DWD VHW IRU QXPHULFDO H[DPSOHV 7DEOH 1XPHULFDO UHVXOWV IRU H[DPSOHV 7DEOH $EEUHYLDWLRQ IRU DQG GHVFULSWLRQ RI YDULDQFH FRPSRQHQW HVWLPDWLRQ PHWKRGV 7DEOH 6HWV RI WUXH YDULDQFH FRPSRQHQWV 7DEOH 6DPSOLQJ YDULDQFH IRU WKH HVWLPDWHV 7DEOH %LDV IRU WKH HVWLPDWHV 7DEOH 3UREDELOLW\ RI QHDUQHVV 7DEOH 'DWD IRU H[DPSOH 9,

PAGE 8

/,67 2) ),*85(6 )LJXUH (IILFLHQF\ Lf IRU K )LJXUH (IILFLHQF\ Lf IRU U% )LJXUH (IILFLHQF\ f IRU )LJXUH 7KH RYHUSDUDPHWHUL]HG OLQHDU PRGHO )LJXUH 7KH OLQHDU PRGHO IRU D IRXUSDUHQW KDOIGLDOOHO )LJXUH ,QWHUPHGLDWH UHVXOW LQ 6& $ VXEPDWUL[ JHQHUDWLRQ )LJXUH :HLJKWV RQ RYHUDOO FURVV PHDQV )LJXUH 'LVWULEXWLRQ RI 0,948( HVWLPDWHV YLL

PAGE 9

$EVWUDFW RI 'LVVHUWDWLRQ 3UHVHQWHG WR WKH *UDGXDWH 6FKRRO RI WKH 8QLYHUVLW\ RI )ORULGD LQ 3DUWLDO )XOILOOPHQW RI WKH 5HTXLUHPHQWV IRU WKH 'HJUHH RI 'RFWRU RI 3KLORVRSK\ 237,0$/ 0$7,1* '(6,*16 $1' 237,0$/ 7(&+1,48(6 )25 $1$/<6,6 2) 48$17,7$7,9( 75$,76 ,1 )25(67 *(1(7,&6 %\ 'XGOH\ $UYOH +XEHU 0D\ &KDLUSHUVRQ 7LPRWK\ / :KLWH 0DMRU 'HSDUWPHQW 6FKRRO RI )RUHVW 5HVRXUFHV DQG &RQVHUYDWLRQ )LUVW WKH DV\PSWRWLF FRYDULDQFH PDWUL[ RI WKH YDULDQFH FRPSRQHQW HVWLPDWHV LV XVHG WR FRPSDUH WKUHH FRPPRQ PDWLQJ GHVLJQV IRU HIILFLHQF\ PD[LPL]LQJ WKH YDULDQFH UHGXFLQJ SURSHUW\ RI HDFK REVHUYDWLRQf IRU JHQHWLF SDUDPHWHUV DFURVV QXPEHUV RI SDUHQWV DQG ORFDWLRQV DQG YDU\LQJ JHQHWLF DUFKLWHFWXUHV ,W LV GHWHUPLQHG WKDW WKH FLUFXODU PDWLQJ GHVLJQ LV DOZD\V VXSHULRU LQ HIILFLHQF\ WR WKH KDOIGLDOOHO GHVLJQ )RU VLQJOHWUHH KHULWDELOLW\ WKH KDOIVLE GHVLJQ LV PRVW HIILFLHQW )RU HVWLPDWLQJ W\SH % FRUUHODWLRQ PD[LPXP HIILFLHQF\ LV DFKLHYHG E\ HLWKHU WKH KDOI VLE RU FLUFXODU PDWLQJ GHVLJQ DQG WKDW FKDQJH LQ UDQN IRU HIILFLHQF\ LV GHWHUPLQHG E\ WKH XQGHUO\LQJ JHQHWLF DUFKLWHFWXUH $QRWKHU LQWHQW RI WKLV ZRUN LV FRPSDULQJ DQDO\VLV PHWKRGRORJLHV IRU GHWHUPLQLQJ SDUHQWDO ZRUWK 7KH ILUVW RI WKHVH LQYHVWLJDWLRQV LV RUGLQDU\ OHDVW VTXDUHV DVVXPSWLRQV LQ WKH HVWLPDWLRQ RI SDUHQWDO ZRUWK IRU WKH KDOIGLDOOHO PDWLQJ GHVLJQ ZLWK EDODQFHG DQG XQEDODQFHG GDWD 7KH FRQFOXVLRQ IURP FRPSDULVRQ RI RUGLQDU\ OHDVW VTXDUHV WR DOWHUQDWLYH DQDO\VLV PHWKRGRORJLHV LV WKDW EHVW OLQHDU XQELDVHG SUHGLFWLRQ DQG EHVW OLQHDU SUHGLFWLRQ DUH PRUH DSSURSULDWH WR WKH SUREOHP RI GHWHUPLQLQJ SDUHQWDO ZRUWK YLLL

PAGE 10

7KH QH[W DQDO\VLV LQYHVWLJDWLRQ FRQWUDVWV YDULDQFH FRPSRQHQW HVWLPDWLRQ WHFKQLTXHV DFURVV OHYHOV RI LPEDODQFH IRU WKH KDOIGLDOOHO DQG KDOIVLE PDWLQJ GHVLJQV IRU WKH HVWLPDWLRQ RI JHQHWLF SDUDPHWHUV ZLWK SORW PHDQV DQG LQGLYLGXDOV XVHG DV WKH XQLW RI REVHUYDWLRQ 7KH FULWHULD IRU GLVFULPLQDWLRQ DUH YDULDQFH RI WKH HVWLPDWHV PHDQ VTXDUH HUURU ELDV DQG SUREDELOLW\ RI QHDUQHVV )RU DOO HVWLPDWLRQ WHFKQLTXHV LQGLYLGXDOV DV WKH XQLW RI REVHUYDWLRQ SURGXFHG HVWLPDWHV ZLWK WKH PRVW GHVLUDEOH SURSHUWLHV 2I WKH HVWLPDWLRQ WHFKQLTXHV H[DPLQHG UHVWULFWHG PD[LPXP OLNHOLKRRG LV WKH PRVW UREXVW WR LPEDODQFH 7KH FRPSXWHU SURJUDP XVHG WR SURGXFH UHVWULFWHG PD[LPXP OLNHOLKRRG HVWLPDWHV RI YDULDQFH FRPSRQHQWV ZDV PRGLILHG WR IRUP D XVHU IULHQGO\ DQDO\VLV SDFNDJH %RWK WKH DOJRULWKP DQG WKH RXWSXWV RI WKH SURJUDP DUH GRFXPHQWHG 2XWSXWV DYDLODEOH IURP WKH SURJUDP LQFOXGH YDULDQFH FRPSRQHQW HVWLPDWHV JHQHUDOL]HG OHDVW VTXDUHV HVWLPDWHV RI IL[HG HIIHFWV DV\PSWRWLF FRYDULDQFH PDWUL[ IRU YDULDQFH FRPSRQHQWV EHVW OLQHDU XQELDVHG SUHGLFWLRQV IRU JHQHUDO DQG VSHFLILF FRPELQLQJ DELOLWLHV DQG WKH HUURU FRYDULDQFH PDWUL[ IRU SUHGLFWLRQV DQG HVWLPDWHV ,;

PAGE 11

&+$37(5 ,1752'8&7,21 $QDO\VLV RI TXDQWLWDWLYH WUDLWV LQ IRUHVW JHQHWLF H[SHULPHQWV KDV WUDGLWLRQDOO\ EHHQ DSSURDFKHG DV D WZRSDUW SUREOHP 3DUHQWDO ZRUWK ZRXOG EH HVWLPDWHG DV IL[HG HIIHFWV DQG ODWHU FRQVLGHUHG DV UDQGRP HIIHFWV IRU WKH GHWHUPLQDWLRQ RI JHQHWLF DUFKLWHFWXUH :KLOH WUDGLWLRQDO WKLV DSSURDFK LV PRVW SUREDEO\ VXERSWLPDO JLYHQ WKH SUROLIHUDWLRQ RI DOWHUQDWLYH DQDO\VLV DSSURDFKHV ZLWK HQKDQFHG WKHRUHWLFDO SURSHUWLHV :KLWH DQG +RGJH f ,Q WKLV GLVVHUWDWLRQ HPSKDVLV LV SODFHG RQ WKH KDOIGLDOOHO PDWLQJ GHVLJQ EHFDXVH RI LWV RPQLSUHVHQFH DQG WKH XQLTXHQHVV RI WKH DQDO\VLV SUREOHP WKLV PDWLQJ GHVLJQ SUHVHQWV 7KH KDOI GLDOOHO PDWLQJ GHVLJQ KDV EHHQ DQG FRQWLQXHV WR EH XVHG LQ SODQW VFLHQFHV 6SUDJXH DQG 7DWXP *LOEHUW 0DW]LQJHU HW DO %XUOH\ HW DO 6TXLOODFH :HLU DQG =REHO :LOFR[ HW DO 6Q\GHU DQG 1DPNRRQJ +DOODXHU DQG 0LUDQGD 6LQJK DQG 6LQJK *UHHQZRRG HW DO DQG :HLU DQG *RGGDUG f 7KH XQLTXH IHDWXUH RI WKH KDOIGLDOOHO PDWLQJ V\VWHP ZKLFK KLQGHUV DQDO\VLV ZLWK PDQ\ VWDWLVWLFDO SDFNDJHV LV WKDW D VLQJOH REVHUYDWLRQ FRQWDLQV WZR OHYHOV RI WKH VDPH PDLQ HIIHFW 2SWLPDOLW\ RI PDWLQJ GHVLJQ IRU WKH HVWLPDWLRQ RI FRPPRQO\ QHHGHG JHQHWLF SDUDPHWHUV VLQJOHWUHH KHULWDELOLW\ W\SH % FRUUHODWLRQ DQG GRPLQDQFH WR DGGLWLYH YDULDQFH UDWLRf LV H[DPLQHG XWLOL]LQJ WKH DV\PSWRWLF FRYDULDQFH RI WKH YDULDQFH FRPSRQHQWV .HQGDOO DQG 6WXDUW *LHVEUHFKW DQG 0F&XWFKDQ HW DO f 6LQFH JHQHWLF ILHOG H[SHULPHQWV DUH FRPSRVHG RI ERWK D PDWLQJ GHVLJQ DQG D ILHOG GHVLJQ WKH FHQWUDO FRQVLGHUDWLRQ LQ WKLV LQYHVWLJDWLRQ LV ZKLFK PDWLQJ GHVLJQ ZLWK ZKDW ILHOG GHVLJQ KRZ PDQ\ SDUHQWV DQG DFURVV ZKDW QXPEHU RI ORFDWLRQV

PAGE 12

ZLWKLQ D UDQGRPL]HG FRPSOHWH EORFN GHVLJQf LV PRVW HIILFLHQW 7KH FULWHULRQ IRU GLVFHUQPHQW DPRQJ GHVLJQV LV WKH HIILFLHQF\ RI WKH LQGLYLGXDO REVHUYDWLRQ LQ UHGXFLQJ WKH YDULDQFH RI WKH HVWLPDWH 3HGHUVRQ f 7KLV TXHVWLRQ LV FRQVLGHUHG XQGHU D UDQJH RI JHQHWLF DUFKLWHFWXUHV ZKLFK VSDQV WKDW UHSRUWHG IRU FRQLIHURXV JURZWK WUDLWV &DPSEHOO 6WRQHF\SKHU HW DO 6Q\GHU DQG 1DPNRRQJ )RVWHU )RVWHU DQG %ULGJZDWHU +RGJH DQG :KLWH >LQ SUHVV@f 7KH LQYHVWLJDWLRQ LQWR RSWLPDO DQDO\VLV SURFHHGV E\ FRQVLGHULQJ WKH RUGLQDU\ OHDVW VTXDUHV 2/6f WUHDWPHQW RI HVWLPDWLQJ SDUHQWDO ZRUWK IRU WKH KDOIGLDOOHO PDWLQJ GHVLJQ 2/6 DVVXPSWLRQV DUH H[DPLQHG LQ GHWDLO WKURXJK WKH XVH RI PDWUL[ DOJHEUD IRU ERWK EDODQFHG DQG XQEDODQFHG GDWD 7KH XVH RI PDWUL[ DOJHEUD LOOXVWUDWHV ERWK WKH XQLTXHQHVV RI WKH SUREOHP DQG WKH LQWHUSUHWDWLRQ RI WKH 2/6 DVVXPSWLRQV &RPSDULVRQV DPRQJ 2/6 JHQHUDOL]HG OHDVW VTXDUHV */6f EHVW OLQHDU XQELDVHG SUHGLFWLRQ %/83f DQG EHVW OLQHDU SUHGLFWLRQ %/3f DUH PDGH RQ D WKHRUHWLFDO EDVLV $OWKRXJK FRQVLGHUDWLRQ RI ILHOG DQG PDWLQJ GHVLJQ RI IXWXUH H[SHULPHQWV LV HVVHQWLDO WKH SUREOHP RI RSWLPDO DQDO\VLV RI FXUUHQW GDWD UHPDLQV ,Q UHVSRQVH WR WKLV QHHG VLPXODWHG GDWD ZLWK GLIIHULQJ OHYHOV RI LPEDODQFH JHQHWLF DUFKLWHFWXUH DQG PDWLQJ GHVLJQ LV XWLOL]HG DV D EDVLV IRU GLVFULPLQDWLQJ DPRQJ YDULDQFH FRPSRQHQW HVWLPDWLRQ WHFKQLTXHV LQ WKH GHWHUPLQDWLRQ RI JHQHWLF DUFKLWHFWXUH 7KH OHYHOV RI LPEDODQFH VLPXODWHG UHSUHVHQW WKRVH FRPPRQO\ VHHQ LQ IRUHVW JHQHWLF GDWD DV OHVV WKDQ b VXUYLYDO PLVVLQJ FURVVHV IRU IXOOVLE PDWLQJ GHVLJQV DQG RQO\ VXEVHWV RI SDUHQWV LQ FRPPRQ DFURVV ORFDWLRQ IRU KDOIVLE PDWLQJ GHVLJQV 7KH WZR PDWLQJ GHVLJQV DUH KDOIVLE DQG KDOIGLDOOHO ZLWK D VXEVHW RI WKH SUHYLRXVO\ XVHG JHQHWLF DUFKLWHFWXUHV 7KH ILHOG GHVLJQ LV D UDQGRPL]HG FRPSOHWH EORFN ZLWK ILIWHHQ IDPLOLHV SHU EORFN DQG VL[ WUHHV SHU IDPLO\ SHU EORFN 7KH IRXU FULWHUD XVHG WR GLVFULPLQDWH DPRQJ YDULDQFH FRPSRQHQW HVWLPDWLRQ WHFKQLTXHV DUH SUREDELOLW\ RI QHDUQHVV 3LWWPDQ f ELDV YDULDQFH RI WKH HVWLPDWHV DQG PHDQ 
VTXDUH HUURU +RJJ DQG &UDLJ f

PAGE 13

7KH WHFKQLTXHV FRPSDUHG IRU YDULDQFH FRPSRQHQW HVWLPDWLRQ DUH PLQLPXP YDULDQFH TXDGUDWLF XQELDVHG HVWLPDWLRQ 5DR Ef PLQLPXP QRUP TXDGUDWLF XQELDVHG HVWLPDWLRQ 5DR Df UHVWULFWHG PD[LPXP OLNHOLKRRG 3DWWHUVRQ DQG 7KRPSVRQ f PD[LPXP OLNHOLKRRG +DUWOH\ DQG 5DR f DQG +HQGHUVRQfV PHWKRG +HQGHUVRQ f 7KHVH WHFKQLTXHV DUH FRPSDUHG XVLQJ WKH LQGLYLGXDO DQG SORW PHDQV DV WKH XQLW RI REVHUYDWLRQ )XUWKHU WKUHH DOWHUQDWLYHV DUH H[SORUHG IRU GHDOLQJ ZLWK QHJDWLYH YDULDQFH FRPSRQHQW HVWLPDWHV ZKLFK DUH DFFHSW DQG OLYH ZLWK QHJDWLYH HVWLPDWHV VHW QHJDWLYH HVWLPDWHV WR ]HUR DQG UHVROYH WKH V\VWHP VHWWLQJ QHJDWLYH FRPSRQHQWV WR ]HUR 7KH DOJRULWKP XVHG IRU WKH PHWKRG ZKLFK SURYLGHG HVWLPDWHV ZLWK RSWLPDO SURSHUWLHV DFURVV H[SHULPHQWDO OHYHOV ZDV FRQYHUWHG WR D XVHU IULHQGO\ SURJUDP 7KLV SURJUDP SURYLGLQJ UHVWULFWHG PD[LPXP OLNHOLKRRG YDULDQFH FRPSRQHQW HVWLPDWHV XVHV *LHVEUHFKWfV DOJRULWKP f 'RFXPHQWDWLRQ RI WKH DOJRULWKP DQG H[SODQDWLRQ RI WKH SURJUDPfV RXWSXW DUH SURYLGHG DORQJ ZLWK WKH )RUWUDQ VRXUFH FRGH DSSHQGL[f

PAGE 14

&+$37(5 7+( ()),&,(1&< 2) +$/)6,% +$/)',$//(/ $1' &,5&8/$5 0$7,1* '(6,*16 ,1 7+( (67,0$7,21 2) *(1(7,& 3$5$0(7(56 :,7+ 9$5,$%/( 180%(56 2) 3$5(176 $1' /2&$7,216 ,QWURGXFWLRQ ,Q IRUHVW WUHH LPSURYHPHQW JHQHWLF WHVWV DUH HVWDEOLVKHG IRU IRXU SULPDU\ SXUSRVHV f UDQNLQJ SDUHQWV f VHOHFWLQJ IDPLOLHV RU LQGLYLGXDOV f HVWLPDWLQJ JHQHWLF SDUDPHWHUV DQG f GHPRQVWUDWLQJ JHQHWLF JDLQ =REHO DQG 7DOEHUW f :KLOH WKH IRXU SXUSRVHV DUH QRW PXWXDOO\ H[FOXVLYH D WHVW GHVLJQ RSWLPDO IRU RQH SXUSRVH LV PRVW SUREDEO\ QRW RSWLPDO IRU DOO %XUGRQ DQG 6KHOERXUQH :KLWH f $ EUHHGHU WKHQ PXVW SULRULWL]H WKH SXUSRVHV IRU ZKLFK D JLYHQ WHVW LV HVWDEOLVKHG DQG FKRRVH D GHVLJQ EDVHG RQ WKHVH SULRULWLHV :LWKLQ D JHQHWLF WHVW GHVLJQ WKHUH DUH WZR SULPDU\ FRPSRQHQWV PDWLQJ GHVLJQ DQG ILHOG GHVLJQ 7KHUH KDYH EHHQ VHYHUDO LQYHVWLJDWLRQV RI RSWLPDO GHVLJQV IRU WKHVH WZR FRPSRQHQWV HLWKHU VHSDUDWHO\ RU VLPXOWDQHRXVO\ XQGHU YDULRXV FULWHULD 7KHVH FULWHULD KDYH LQFOXGHG WKH HIILFLHQW DQGRU SUHFLVH HVWLPDWLRQ RI KHULWDELOLW\ 3HGHUVRQ 1DPNRRQJ DQG 5REHUGV 3HSSHU DQG 1DPNRRQJ 0F&XWFKDQ HW DO 0F&XWFKDQ HW DO f SUHFLVH HVWLPDWLRQ RI YDULDQFH FRPSRQHQWV %UDDWHQ 3HSSHU f DQG HIILFLHQW VHOHFWLRQ RI SURJHQ\ YDQ %XLMWHQHQ :KLWH DQG +RGJH YDQ %XLMWHQHQ DQG %XUGRQ /RR'LQNLQV HW DO f ,QFRUSRUDWHG ZLWKLQ WKLV ERG\ RI UHVHDUFK KDV EHHQ D ZLGH UDQJH RI JHQHWLF DQG HQYLURQPHQWDO YDULDQFH SDUDPHWHUV DQG ILHOG DQG PDWLQJ GHVLJQV +RZHYHU WKH PRGHOV LQ SUHYLRXV LQYHVWLJDWLRQV KDYH EHHQ SULPDULO\ FRQVWUDLQHG WR FRQVLGHUDWLRQ RI WHVWLQJ LQ D VLQJOH

PAGE 15

HQYLURQPHQW ZLWK D FRUUHVSRQGLQJ OLPLWHG QXPEHU RI IDFWRUV LQ WKH PRGHO LH JHQRW\SH E\ HQYLURQPHQW LQWHUDFWLRQ DQGRU GRPLQDQFH YDULDQFH DUH XVXDOO\ QRW FRQVLGHUHG 7KLV FKDSWHU IRFXVHV RQ RSWLPDO PDWLQJ GHVLJQV WKURXJK FRQVLGHUDWLRQ RI WKUHH FRPPRQ PDWLQJ GHVLJQV KDOI VLE KDOIGLDOOHO DQG FLUFXODU ZLWK IRXU FURVVHV SHU SDUHQWf IRU HVWLPDWLRQ RI JHQHWLF SDUDPHWHUV ZLWK D ILHOG GHVLJQ H[WHQGLQJ DFURVV PXOWLSOH ORFDWLRQV ,Q WKLV FKDSWHU WKH DSSURDFK WR WKH RSWLPDO GHVLJQ SUREOHP LV WR PDLQWDLQ WKH EDVLF ILHOG GHVLJQ ZLWKLQ ORFDWLRQV DV UDQGRPL]HG FRPSOHWH EORFN ZLWK IRXU EORFNV DQG D VL[WUHH URZSORW UHSUHVHQWLQJ HDFK JHQHWLF HQWU\ ZLWKLQ D EORFN QRWHG DV RQH RI WKH PRVW FRPPRQ ILHOG GHVLJQV E\ /RR'LQNLQV HW DO f 7KH QXPEHU RI IDPLOLHV LQ D EORFN QXPEHU RI ORFDWLRQV PDWLQJ GHVLJQ DQG QXPEHU RI SDUHQWV ZLWKLQ D PDWLQJ GHVLJQ DUH DOORZHG WR FKDQJH 6LQFH RSWLPDOLW\ EHVLGHV EHLQJ D IXQFWLRQ RI WKH ILHOG DQG PDWLQJ GHVLJQV LV DOVR D IXQFWLRQ RI WKH XQGHUO\LQJ JHQHWLF SDUDPHWHUV DOO GHVLJQV DUH H[DPLQHG DFURVV D UDQJH RI OHYHOV RI JHQHWLF GHWHUPLQDWLRQ DV YDU\LQJ OHYHOV RI KHULWDELOLW\ JHQRW\SH E\ HQYLURQPHQW LQWHUDFWLRQ DQG GRPLQDQFHf UHIOHFWLQJ HVWLPDWHV IRU PDQ\ HFRQRPLFDOO\ LPSRUWDQW WUDLWV LQ FRQLIHUV &DPSEHOO 6WRQHF\SKHU HW DO 6Q\GHU DQG 1DPNRRQJ )RVWHU )RVWHU DQG %ULGJZDWHU +RGJH DQG :KLWH LQ SUHVVff )RU HDFK GHVLJQ DQG OHYHO RI JHQHWLF GHWHUPLQDWLRQ D 0LQLPXP 9DULDQFH 4XDGUDWLF 8QELDVHG (VWLPDWLRQ 0,948(f WHFKQLTXH DQG DQ DSSUR[LPDWLRQ RI WKH YDULDQFH RI D UDWLR .HQGDOO DQG 6WXDUW *LHVEUHFKW DQG 0F&XWFKDQ HW DO f DUH DSSOLHG WR HVWLPDWH WKH YDULDQFH RI HVWLPDWHV RI KHULWDELOLW\ DGGLWLYH WR DGGLWLYH SOXV DGGLWLYH E\ HQYLURQPHQW YDULDQFH UDWLR DQG GRPLQDQFH WR DGGLWLYH YDULDQFH UDWLR 7KHVH WHFKQLTXHV XVH WKH WUXH FRYDULDQFH PDWUL[ RI WKH YDULDQFH FRPSRQHQW HVWLPDWHV XWLOL]LQJ RQO\ WKH NQRZQ SDUDPHWHUV DQG WKH WHVW GHVLJQ DQG SUHFOXGLQJ WKH QHHG IRU VLPXODWHG RU UHDO GDWDf DQG D 7D\ORU VHULHV DSSUR[LPDWLRQ RI WKH YDULDQFH RI D UDWLR 7KH UHODWLYH HIILFLHQFLHV RI GLIIHUHQW 
WHVW GHVLJQV DUH FRPSDUHG RQ WKH EDVLV RI L WKH

PAGE 16

HIILFLHQF\ RI DQ LQGLYLGXDO REVHUYDWLRQ LQ UHGXFLQJ WKH YDULDQFH RI DQ HVWLPDWH 3HGHUVRQ f 7KXV WKLV UHVHDUFK H[SORUHV ZKLFK PDWLQJ GHVLJQ QXPEHU RI SDUHQWV DQG QXPEHU RI ORFDWLRQV LV PRVW HIILFLHQW SHU XQLW RI REVHUYDWLRQ LQ HVWLPDWLQJ KHULWDELOLW\ DGGLWLYH WR DGGLWLYH SOXV DGGLWLYH E\ HQYLURQPHQW YDULDQFH UDWLR DQG GRPLQDQFH WR DGGLWLYH YDULDQFH UDWLR IRU VHYHUDO YDULDQFH VWUXFWXUHV UHSUHVHQWDWLYH RI FRQLIHURXV JURZWK WUDLWV 0HWKRGV $VVXPSWLRQV &RQFHUQLQJ %ORFN 6L]H $V RSSRVHG WR 0F&XWFKDQ HW DO f ZKHUH EORFN VL]HV ZHUH KHOG FRQVWDQW DQG LQFOXGLQJ PRUH IDPLOLHV UHVXOWHG LQ IHZHU REVHUYDWLRQV SHU IDPLO\ SHU EORFN LQ WKLV FKDSWHU WKH EORFNV DUH DOORZHG WR H[SDQG WR DFFRPRGDWH LQFUHDVLQJ QXPEHUV RI IDPLOLHV 7KLV H[SDQVLRQ LV DOORZHG ZLWKRXW LQFUHDVLQJ HLWKHU WKH YDULDQFH DPRQJ EORFN RU WKH YDULDQFH ZLWKLQ EORFNV )RU WKH WKUHH PDWLQJ GHVLJQV ZKLFK DUH GLVFXVVHG WKH DGGLWLRQ RI RQH SDUHQW WR WKH KDOIVLE GHVLJQ LQFUHDVHV EORFN VL]H E\ WUHHV SORW IRU D KDOIVLE IDPLO\f WKH DGGLWLRQ RI D SDUHQW WR WKH FLUFXODU GHVLJQ LQFUHDVHV EORFN VL]H E\ WUHHV WZR SORWV IRU IXOOVLE IDPLOLHVf DQG WKH DGGLWLRQ RI D SDUHQW WR WKH KDOIGLDOOHO GHVLJQ LQFUHDVHV EORFN VL]H E\ S ZKHUH S LV WKH QXPEHU RI SDUHQWV EHIRUH WKH DGGLWLRQ RU WKHUH DUH S QHZ IXOOVLE IDPLOLHV SHU EORFNf 7KHUHIRUH EORFN VL]H LV GHWHUPLQHG E\ WKH PDWLQJ GHVLJQ DQG WKH QXPEHU RI SDUHQWV $OO FRPSDULVRQV DPRQJ PDWLQJ GHVLJQV DQG QXPEHUV RI ORFDWLRQV DUH IRU HTXDO EORFN VL]HV LH HTXDO QXPEHUV RI REVHUYDWLRQV SHU ORFDWLRQ 7KLV UHVXOWV LQ FRPSDULQJ PDWLQJ GHVLJQV ZLWK XQHTXDO QXPEHUV RI SDUHQWV LQ WKH GHVLJQV DQG FRPSDULQJ WZR ORFDWLRQ H[SHULPHQWV DJDLQVW ILYH ORFDWLRQ H[SHULPHQWV ZLWK HTXDO QXPEHUV RI REVHUYDWLRQV SHU ORFDWLRQ EXW XQHTXDO WRWDO QXPEHUV RI REVHUYDWLRQV

PAGE 17

7KH 8VH RI (IILFLHQF\ Lf (IILFLHQF\ LV WKH WRRO E\ ZKLFK FRPSDULVRQV DUH PDGH DQG LV WKH HIILFDF\ RI WKH LQGLYLGXDO REVHUYDWLRQV LQ DQ H[SHULPHQW LQ ORZHULQJ WKH YDULDQFH RI SDUDPHWHU HVWLPDWHV $Q LQFUHDVLQJ HIILFLHQF\ LQGLFDWHV WKDW IRU LQFUHDVLQJ H[SHULPHQWDO VL]H WKH DGGLWLRQDO REVHUYDWLRQV KDYH HQKDQFHG WKH YDULDQFH UHGXFLQJ SURSHUW\ RI DOO REVHUYDWLRQV (IILFLHQF\ LV FDOFXODWHG DV L 19DU[ff ZKHUH 1 LV WKH WRWDO QXPEHU RI REVHUYDWLRQV DQG 9DU[f LV WKH YDULDQFH RI D JHQHULF SDUDPHWHU HVWLPDWH ,QFUHDVLQJ 1 DOZD\V UHVXOWV LQ D UHGXFWLRQ RI WKH YDULDQFH RI HVWLPDWLRQ DOO RWKHU WKLQJV EHLQJ HTXDO
PAGE 18

*HQHUDO 0HWKRGRORJ\ 6HWV RI WUXH YDULDQFH FRPSRQHQWV DUH FDOFXODWHG LQ DFFRUGDQFH ZLWK D VWDWHG OHYHO RI JHQHWLF FRQWURO DQG WKH GHVLJQ PDWUL[ LV JHQHUDWHG LQ FRUUHVSRQGHQFH ZLWK WKH ILHOG DQG PDWLQJ GHVLJQ .QRZLQJ WKH GHVLJQ PDWUL[ DQG D VHW RI WUXH YDULDQFH FRPSRQHQWV D WUXH FRYDULDQFH FRYDULDQFHf PDWUL[ RI YDULDQFH FRPSRQHQW HVWLPDWHV LV JHQHUDWHG 2QFH WKH FRYDULDQFH PDWUL[ RI WKH YDULDQFH FRPSRQHQWV LV LQ KDQG WKH YDULDQFH RI DQG FRYDULDQFHV EHWZHHQ DQ\ OLQHDU FRPELQDWLRQV RI WKH YDULDQFH FRPSRQHQW HVWLPDWHV DUH FDOFXODWHG )URP WKH FRYDULDQFH PDWUL[ IRU OLQHDU FRPELQDWLRQV WKH YDULDQFH RI JHQHWLF UDWLRV DV UDWLRV RI OLQHDU FRPELQDWLRQV RI YDULDQFH FRPSRQHQWV DUH DSSUR[LPDWHG E\ D 7D\ORU VHULHV H[SDQVLRQ 6LQFH GHILQLWLRQ RI D VHW RI YDULDQFH FRPSRQHQWV DQG IRUPDWLRQ RI WKH GHVLJQ PDWUL[ DUH GHSHQGHQW RQ WKH OLQHDU PRGHO HPSOR\HG GLVFXVVLRQ RI VSHFLILF PHWKRGRORJ\ EHJLQV ZLWK OLQHDU PRGHOV /LQHDU 0RGHOV +DOIGLDOOHO DQG FLUFXODU GHVLJQV 7KH VFDODU OLQHDU PRGHO HPSOR\HG IRU KDOIGLDOOHO DQG FLUFXODU PDWLQJ GHVLJQV LV \cMNOP + E\ JN J 6X WJLN WJX W6Z SLMNO ZLMNOP ZKHUH \LMNOP LV WKH P REVHUYDWLRQ RI WKH NO FURVV LQ WKH M EORFN RI WKH L WHVW + LV WKH SRSXODWLRQ PHDQ WL LV WKH UDQGRP YDULDEOH WHVW HQYLURQPHQW a 1,'L2Rf EM LV WKH UDQGRP YDULDEOH EORFN a 1,'UEf JN LV WKH UDQGRP YDULDEOH IHPDOH JHQHUDO FRPELQLQJ DELOLW\ JFDf a 1,'UJFJ LV WKH UDQGRP YDULDEOH PDOH JFD a 1,'DJFDf VX LV WKH UDQGRP YDULDEOH VSHFLILF FRPELQLQJ DELOLW\ VHDf a 1,'AIIAf WJcM LV WKH UDQGRP YDULDEOH WHVW E\ IHPDOH JFD LQWHUDFWLRQ a 1,'L2Af

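$tg_{il}$ is the random variable test by male gca interaction ~ NID(0, $\sigma^2_{tg}$);
$ts_{ikl}$ is the random variable test by sca interaction ~ NID(0, $\sigma^2_{ts}$);
$p_{ijkl}$ is the random variable plot ~ NID(0, $\sigma^2_p$);
$w_{ijklm}$ is the random variable within plot ~ NID(0, $\sigma^2_w$); and
there is no covariance between random variables in the model.

This linear model in matrix notation is

\[ y = \mu\mathbf{1} + Z_T e_T + Z_B e_B + Z_G e_G + Z_S e_S + Z_{TG} e_{TG} + Z_{TS} e_{TS} + Z_P e_P + e_W \]

with dimensions $y$: $n \times 1$; $\mathbf{1}$: $n \times 1$; $Z_T$: $n \times t$ and $e_T$: $t \times 1$; $Z_B$: $n \times b$ and $e_B$: $b \times 1$; $Z_G$: $n \times g$ and $e_G$: $g \times 1$; $Z_S$: $n \times s$ and $e_S$: $s \times 1$; $Z_{TG}$: $n \times tg$ and $e_{TG}$: $tg \times 1$; $Z_{TS}$: $n \times ts$ and $e_{TS}$: $ts \times 1$; $Z_P$: $n \times p$ and $e_P$: $p \times 1$; $e_W$: $n \times 1$;

where $y$ is the observation vector;
$Z_i$ is the portion of the design matrix for the $i$th random variable;
$e_i$ is the vector of unobservable random effects for the $i$th random variable;
$\mathbf{1}$ is a vector of 1's; and
$n$, $t$, $b$, $g$, $s$, $tg$, $ts$ and $p$ are the numbers of observations, tests, blocks, gca's, sca's, test by gca interactions, test by sca interactions and plots, respectively.

Utilizing customary assumptions in half-diallel mating designs (Method 4, Griffing), the variance of an individual observation is

\[ \mathrm{Var}(y_{ijklm}) = \sigma^2_t + \sigma^2_b + 2\sigma^2_{gca} + \sigma^2_{sca} + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w \]

and in matrix notation the covariance matrix for the observations is

\[ \mathrm{Var}(y) = Z_T Z_T' \sigma^2_t + Z_B Z_B' \sigma^2_b + Z_G Z_G' \sigma^2_{gca} + Z_S Z_S' \sigma^2_{sca} + Z_{TG} Z_{TG}' \sigma^2_{tg} + Z_{TS} Z_{TS}' \sigma^2_{ts} + Z_P Z_P' \sigma^2_p + I_n \sigma^2_w \]

where $'$ indicates the transpose operator, all matrices of the form $Z_i Z_i'$ are $n \times n$, and $I_n$ is an $n \times n$ identity matrix.

Half-sib design

The scalar linear model for half-sib mating designs is

\[ y_{ijkm} = \mu + t_i + b_{ij} + g_k + tg_{ik} + p^*_{ijk} + w^*_{ijkm} \]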

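where $y_{ijkm}$ is the $m$th observation of the $k$th half-sib family in the $j$th block of the $i$th test;
$\mu$, $t_i$, $b_{ij}$, $g_k$ and $tg_{ik}$ retain the definitions given for the half-diallel and circular model;
$p^*_{ijk}$ is the random variable plot, containing different genotype by environment components than in the full-sib model ~ NID(0, $\sigma^2_{p^*}$);
$w^*_{ijkm}$ is the random variable within plot, containing different levels of genotypic and genotype by environment components than in the full-sib model ~ NID(0, $\sigma^2_{w^*}$); and
there is no covariance between random variables in the model.

The matrix notation model is

\[ y = \mu\mathbf{1} + Z_T e_T + Z_B e_B + Z_G e_G + Z_{TG} e_{TG} + Z_P e_{P^*} + e_{W^*} \]

with dimensions $y$: $n \times 1$; $\mathbf{1}$: $n \times 1$; $Z_T$: $n \times t$ and $e_T$: $t \times 1$; $Z_B$: $n \times b$ and $e_B$: $b \times 1$; $Z_G$: $n \times g$ and $e_G$: $g \times 1$; $Z_{TG}$: $n \times tg$ and $e_{TG}$: $tg \times 1$; $Z_P$: $n \times p$ and $e_{P^*}$: $p \times 1$; $e_{W^*}$: $n \times 1$.

The variance of an individual observation in half-sib designs is

\[ \mathrm{Var}(y_{ijkm}) = \sigma^2_t + \sigma^2_b + \sigma^2_{gca} + \sigma^2_{tg} + \sigma^2_{p^*} + \sigma^2_{w^*} \]

and

\[ \mathrm{Var}(y) = Z_T Z_T' \sigma^2_t + Z_B Z_B' \sigma^2_b + Z_G Z_G' \sigma^2_{gca} + Z_{TG} Z_{TG}' \sigma^2_{tg} + Z_P Z_P' \sigma^2_{p^*} + I_n \sigma^2_{w^*} \]

Levels of Genetic Determination

Eight levels of genetic determination are derived from a factorial combination of two levels of each of three genetic ratios: heritability ($h^2 = 4\sigma^2_{gca} / (2\sigma^2_{gca} + \sigma^2_{sca} + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w)$ for full-sib models and $h^2 = 4\sigma^2_{gca} / (\sigma^2_{gca} + \sigma^2_{tg} + \sigma^2_{p^*} + \sigma^2_{w^*})$ for half-sib models); additive to additive plus additive by environment variance ratio ($r_B = \sigma^2_{gca} / (\sigma^2_{gca} + \sigma^2_{tg})$, the Type B correlation of Burdon); and dominance to additive variance ratio ($\gamma = \sigma^2_{sca} / \sigma^2_{gca}$). The levels employed for each ratio are: $h^2$ = [...] and [...]; $r_B$ = [...] and [...]; and $\gamma$ = [...] and [...]. To generate sets of true variance components (Table [...]) for half-diallel and circular mating designs from the factorial combinations of genetic parameters, the denominator of $h^2$ is set to [...] (arbitrarily, but without loss of generality), which, given the level of $h^2$, leads to the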

solution for $\sigma^2_{gca}$. Solving for $\sigma^2_{gca}$ and knowing $\gamma$ yields the value for $\sigma^2_{sca}$. Knowing the level of $r_B$ and $\sigma^2_{gca}$ allows the equation for $r_B$ to be solved for $\sigma^2_{tg}$. An assumption that the ratio of [...] equals the ratio of $\sigma^2_{ts}$ to $\sigma^2_{tg}$ permits a solution for $\sigma^2_{ts}$. A further assumption that $\sigma^2_p$ is seven percent of $\sigma^2_p + \sigma^2_w$ yields a solution for both $\sigma^2_p$ and $\sigma^2_w$. Finally, $\sigma^2_t$ and $\sigma^2_b$ are set to [...] and [...], respectively, for all treatment levels.

Table [...]. Parametric variance components for the factorial combination of heritability ([...] and [...]), Type B correlation ([...] and [...]) and dominance to additive variance ratio ([...] and [...]) for full- and half-sib designs. $\sigma^2_t$ and $\sigma^2_b$ were maintained at [...] and [...], respectively, for all levels and designs.

Design (Full, Half) | Level | $h^2$ | $r_B$ | $\gamma$ | variance components
[table body not recoverable]

In order to facilitate comparisons of half-sib mating designs with full-sib mating designs, $\sigma^2_{gca}$ and $\sigma^2_{tg}$ retain the same values for given levels of $h^2$ and $r_B$, and the denominator of heritability again is set to [...]. To solve for $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$, the assumption that $\sigma^2_{p^*}$ is five percent of $\sigma^2_{p^*} + \sigma^2_{w^*}$ permits a solution for $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$ and maintains $\sigma^2_{p^*}$ approximately equal to, and no larger than, $\sigma^2_p$ of the full-sib mating designs (Namkoong et al.) for the same levels of

$h^2$ and $r_B$. Under the previous definitions, all consideration of differences in $\gamma$ changing the magnitudes of $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$ is disallowed. Thus, there are only four parameter sets for the half-sib mating design (Table [...]).

Covariance Matrix for Variance Components

The base algorithm to produce the covariance matrix for variance component estimates is from Giesbrecht and was rewritten in Fortran for ease of handling the study data. In using this algorithm, we assume that all random variables are independent and normally distributed and that the true variances of the random variables are known. Under these assumptions, Minimum Norm Quadratic Unbiased Estimation (MINQUE; Rao), using the true variance components as priors (the starting point for the algorithm), becomes MIVQUE (Rao), which requires normality and the true variance components as priors (Searle), and for a given design the covariance matrix of the variance component estimates becomes fixed. A sketch of the steps from the MIVQUE equation (Giesbrecht; Searle) to the true covariance matrix for variance component estimates is

\[ \{\mathrm{tr}(QV_iQV_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\} \]

(dimensions $r \times r$, $r \times 1$ and $r \times 1$); then

\[ \hat{\sigma}^2 = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\{y'QV_iQy\} \]

and

\[ \mathrm{Var}(\hat{\sigma}^2) = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\,\mathrm{Var}(\{y'QV_iQy\})\,\{\mathrm{tr}(QV_iQV_j)\}^{-1} \]

(all four matrices $r \times r$),

where $\{a_{ij}\}$ is a matrix whose elements are $a_{ij}$, where in the full-sib designs $i$ = 1 to [...] and $j$ = 1 to [...], i.e., there is a row and column for every random variable in the linear model;
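
The end point of these steps, $2\{\mathrm{tr}(QV_iQV_j)\}^{-1}$ evaluated at known variance components, can be sketched in Python with NumPy. This is only an illustration under stated assumptions: it is not the Giesbrecht algorithm or the Fortran program used in the study; the function name `mivque_cov` is hypothetical; and the example is a toy balanced one-way random model (four families of three trees with made-up variance components), not one of the mating designs studied here.

```python
import numpy as np


def mivque_cov(X, Z_list, sigma2):
    """Return 2 * {tr(Q V_i Q V_j)}^{-1} for known (true) variance components.

    X       : n x m design matrix for fixed effects
    Z_list  : design matrices Z_i of the random variables (identity for error)
    sigma2  : true variance components, in the same order as Z_list
    """
    V_parts = [Z @ Z.T for Z in Z_list]                    # V_i = Z_i Z_i'
    V = sum(s * Vi for s, Vi in zip(sigma2, V_parts))      # Var(y)
    V_inv = np.linalg.inv(V)
    XtVi = X.T @ V_inv                                     # X'V^{-1}
    # Q = V^{-1} - V^{-1} X (X'V^{-1}X)^{-1} X'V^{-1}
    Q = V_inv - XtVi.T @ np.linalg.solve(XtVi @ X, XtVi)
    QV = [Q @ Vi for Vi in V_parts]
    r = len(V_parts)
    M = np.array([[np.trace(QV[i] @ QV[j]) for j in range(r)]
                  for i in range(r)])
    return 2.0 * np.linalg.inv(M)


# Toy design (assumption): 4 families x 3 trees, overall mean as only fixed effect.
n_fam, n_tree = 4, 3
n = n_fam * n_tree
X = np.ones((n, 1))
Z_fam = np.kron(np.eye(n_fam), np.ones((n_tree, 1)))      # family incidence
Z_err = np.eye(n)                                          # within-family error
V_vc = mivque_cov(X, [Z_fam, Z_err], sigma2=[1.0, 2.0])
print(V_vc)
```

No data vector enters the calculation: only the design matrices and the assumed true variance components are needed. For this balanced toy case the result agrees with the textbook variances of the analysis-of-variance estimators, which gives a quick check on the code.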

tr is the trace operator, that is, the sum of the diagonal elements of a matrix;
$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}$, for $V$ the covariance matrix of $y$ and $X$ the design matrix for fixed effects;
$V_i = Z_iZ_i'$, where $i$ indexes the random variables test, block, etc.;
$\hat{\sigma}^2$ is the vector of variance component estimates; and
$r$ is the number of random variables in the model.

The variance of a quadratic form (where $A$ is any non-negative definite matrix of proper dimension) under normality is

\[ \mathrm{Var}(y'Ay) = 2\,\mathrm{tr}(AVAV) + 4\,\mathrm{E}(y)'AVA\,\mathrm{E}(y) \]

(Searle); however, the MINQUE derivation (Rao) requires that $AX = 0$, which in our case is $A\mathbf{1} = 0$ and is equivalent to $\mu\mathbf{1}'A\mathbf{1}\mu = 0$, so that the second term vanishes; thus

\[ \mathrm{Var}(\{y'QV_iQy\}) = 2\{\mathrm{tr}(QV_iQV_j)\} \]

and, substituting this into the expression for $\mathrm{Var}(\hat{\sigma}^2)$,

\[ \mathrm{Var}(\hat{\sigma}^2) = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\,2\{\mathrm{tr}(QV_iQV_j)\}\,\{\mathrm{tr}(QV_iQV_j)\}^{-1} \]

and therefore

\[ \mathrm{Var}(\hat{\sigma}^2) = V_{vc} = 2\{\mathrm{tr}(QV_iQV_j)\}^{-1} \]

From this result it is seen that the MIVQUE covariance matrix of the variance component estimates is dependent only on the design matrix (the result of the field design and mating design) and the true variance components; a data vector is not needed.

Covariance Matrix for Linear Combinations of Variance Components and Variance of a Ratio

Once the covariance matrix for the variance component estimates ($V_{vc}$) is created, the covariance matrix of linear combinations of these variance components is formed as

\[ V_k = L'V_{vc}L \]

(dimensions $2 \times 2$, $2 \times r$, $r \times r$ and $r \times 2$),

where $L$ specifies the linear combinations of the variance components, which are the combinations of variance components in the denominator and numerator of the genetic ratio being estimated. A Taylor series expansion (first approximation) for the variance of a ratio, using the variances of and covariance between numerator and denominator, is then applied using the elements of $V_k$ to produce the approximate variance of the three ratio estimates as (Kendall and Stuart):

\[ \mathrm{Var}(\text{ratio}) \approx (1/D)^2\,V_k(1,1) - 2(N/D^3)\,V_k(1,2) + (N^2/D^4)\,V_k(2,2) \]

where the generic ratio is $N/D$, and $N$ and $D$ are the parametric values;
$V_k(1,1)$ is the variance of $N$;
$V_k(1,2)$ is the covariance between $N$ and $D$; and
$V_k(2,2)$ is the variance of $D$.

Comparison Among Estimates of Variances of Ratios

The approximate variances of the three ratio estimates ($h^2$, $r_B$ and $\gamma$) are compared across mating designs with equal (or approximately equal) numbers of observations, across numbers of locations, and across numbers of parents within a mating design, all within a level of genetic determination. The standard for comparison is $i$. Results are presented by the genetic ratio estimated so that direct comparisons may be made among the mating designs for equal numbers of observations within a number of locations for varying levels of genetic control. Number of genetic entries (number of crosses for full-sib designs and number of half-sib families for half-sib designs) is used as a proxy for number of observations since, for all designs, number of observations equals twenty-four times the number of locations times the number of genetic entries. Further, by plotting the two levels of numbers of locations on a single figure, a

comparison is made of the utility of replication of a design across increasing numbers of locations.

Efficiency plots also permit contrasts of the absolute magnitude of variance of estimation among designs. For a given number of genetic entries and locations, the design with the highest efficiency is the most precise (lowest variance of estimation). Increasing the number of genetic entries or locations always results in greater precision (lower variance of estimation) but is not necessarily as efficient (the reduction in variance may not be sufficient to offset the increase in numbers of observations).

A primary justification for using the efficiency of a design as a criterion is that a more precise estimate of a genetic ratio is obtained by using the mean of two estimates from replication of the small design (as two disconnected experiments) as opposed to the estimate from a single large design. This is true when 1) the number of observations in the large design ($N$) equals twice the number of observations in the small design ($n_i$); 2) the small design is more efficient; and 3) the variances are homogeneous. This is proven below.

Since $N = n_1 + n_2$ and $n_1 = n_2 = n$, then $N = 2n$. By definition $i = 1/(N\,\mathrm{Var}(\text{Ratio}))$ and $\mathrm{Var}(\text{Ratio}) = 1/(iN)$. The proposition is

\[ (\mathrm{Var}_s(\text{Ratio}) + \mathrm{Var}_s(\text{Ratio}))/4 < \mathrm{Var}_L(\text{Ratio}); \]

substitution gives

\[ (1/(n_1 i_s) + 1/(n_2 i_s))/4 < 1/(N i_L). \]

Simplification yields

\[ 1/(2n\,i_s) < 1/(N i_L) \]

and multiplication by $N$ produces

\[ 1/i_s < 1/i_L \]

which is strictly true so long as $i_s > i_L$, where $i_s$ is the efficiency of the smaller experiment and $i_L$ is the efficiency of the larger experiment.
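
The first-approximation variance of a ratio and the replication argument above can both be checked numerically. The following Python sketch is illustrative only: the function names and every numeric value (parametric numerator and denominator, variances, design sizes and efficiencies) are hypothetical and are not taken from this study.

```python
def var_ratio(num, den, var_num, var_den, cov_nd):
    """First-order Taylor approximation of Var(num/den).

    num, den : parametric values N and D of the ratio N/D
    var_num  : V_k(1,1), variance of the numerator estimate
    cov_nd   : V_k(1,2), covariance between numerator and denominator
    var_den  : V_k(2,2), variance of the denominator estimate
    """
    return (var_num / den**2
            - 2.0 * num * cov_nd / den**3
            + num**2 * var_den / den**4)


def var_from_efficiency(n_obs, eff):
    """Invert i = 1/(N * Var) to obtain Var = 1/(i * N)."""
    return 1.0 / (eff * n_obs)


# Ratio variance for made-up inputs.
approx = var_ratio(num=1.0, den=4.0, var_num=0.04, var_den=0.16, cov_nd=0.02)

# Replication argument: two disconnected small designs versus one large design
# with N = 2n, homogeneous variances, and a more efficient small design.
n_small, i_small = 240, 0.50
n_large, i_large = 2 * n_small, 0.40
var_small = var_from_efficiency(n_small, i_small)
var_mean_two_small = (var_small + var_small) / 4.0   # variance of the mean of two
var_one_large = var_from_efficiency(n_large, i_large)

print(approx, var_mean_two_small, var_one_large)
```

With these made-up efficiencies the mean of the two small-design estimates has the lower variance, as the inequality $1/i_s < 1/i_L$ predicts; reversing the two efficiencies reverses the conclusion.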

[Figure: panels a) through h), one per level of genetic control ($h^2$, $r_B$, $\gamma$); vertical axis: efficiency; horizontal axis: genetic entries; plotted series: circular, half-diallel and half-sib designs, each at two levels of locations.]

Figure [...]. Efficiency ($i$) for $h^2$ plotted against number of genetic entries for levels [...] through [...] of genetic control for circular, half-diallel and half-sib mating designs across levels of location, where $i = 1/(N\,\mathrm{Var}(h^2))$ and $N$ = the total number of observations.

Results

Heritability

Half-sib designs are almost globally superior to the two full-sib designs in precision of heritability estimates (results not shown for variance but may be seen from efficiencies in Figure [...]). For designs of equal size, half-sib designs excel with the exception of genetic level three (Figure 1c; $h^2$ = [...], $r_B$ = [...] and $\gamma$ = [...]). In genetic level three the circular design provides the most precise estimate of $h^2$ for two-location designs; however, when the design is extended across five locations, the half-sib mating design again provides the most precise estimates. The circular mating design is superior in precision to the half-diallel design across all levels of genetic control and location, even with a relatively large number of crosses per parent (four).

Half-sib designs are in general (seven genetic control levels out of eight, Figure [...]) more efficient, with the exception of level three across two locations (Figure 1c). For the circular and half-sib mating designs considered, increasing the number of genetic entries always improves the efficiency of the design. However, definite optima exist for the half-diallel mating design for number of genetic entries, i.e., crosses, which convert to a specific number of parents. These optima are not constant but tend to be six parents or fewer, and are lower with increasing $h^2$ or number of locations. The six-parent half-diallel is never far from the half-diallel optima, and increasing the number of parents past the optimum results in decreased efficiency.

For half-sib designs with $h^2$ = [...], five locations are more efficient than two locations; however, at $h^2$ = [...], two locations are most efficient. Further, the number of locations required to efficiently estimate $h^2$ for half-sib designs is determined only by the level of $h^2$ and does not depend on the levels of the other ratios. Although estimates over larger numbers of

observations are more precise (five-location estimates are more precise than two-location estimates), the efficiency (increase in precision per unit observation) declines. Thus, if $h^2$ = [...] and estimates of a certain precision are required, disconnected sets of two-location experiments are preferred to five-location experiments. The relative efficiency of five locations versus two locations is enhanced with decreasing $r_B$ (increasing genotype by environment interaction) within a level of $h^2$ (compare Figures 1a to 1b and 1c to 1d for $h^2$ = [...], and 1e to 1f and 1g to 1h for $h^2$ = [...]).

[Figure: panels a) through h), one per level of genetic control ($h^2$, $r_B$, $\gamma$); vertical axis: efficiency; horizontal axis: genetic entries; plotted series: circular, half-diallel and half-sib designs, each at two levels of locations.]

Figure [...]. Efficiency ($i$) for $r_B$ plotted against number of genetic entries for levels [...] through [...] of genetic control for circular, half-diallel and half-sib mating designs across levels of location, where $i = 1/(N\,\mathrm{Var}(r_B))$ and $N$ = the total number of observations.

[Figure: panels a) through d), one per level of genetic control ($h^2$, $r_B$, $\gamma$); vertical axis: efficiency; horizontal axis: genetic entries; plotted series: circular and half-diallel designs, each at two levels of locations.]

Figure [...]. Efficiency ($i$) for $\gamma$ plotted against number of genetic entries for four levels of genetic control for circular, half-diallel and half-sib mating designs across levels of location, where $i = 1/(N\,\mathrm{Var}(\gamma))$ and $N$ = the total number of observations.

Type B Correlation

For estimation of $r_B$, full-sib designs are more efficient than half-sib designs except in the three cases of low $r_B$ ([...]) and high $\gamma$ ([...]) for $h^2$ = [...] (Figure [...]b) and low $r_B$ for $h^2$ = [...] (Figures [...]f and [...]h). Within full-sib designs, the circular design is globally superior to the half-diallel. As with $h^2$ estimation, half-diallel designs have optimal levels for numbers of parents. The six-parent half-diallel is again close to these optima for all genetic levels and numbers of locations.

At low $h^2$ for full-sib designs, planting in two locations is always more efficient than five locations. For half-sib designs at low $h^2$, the relative efficiency of two versus five locations is dependent on the level of $r_B$, with lower $r_B$ favoring replication across more locations. At $h^2$ = [...], half-sib designs are more efficient when replicated across five locations. At the higher $h^2$ value, full-sib design efficiency across locations is dependent on the level of $r_B$. With $r_B$ = [...] and $h^2$ = [...], replication of full-sib designs is, for the first time, more efficient across five locations than across two locations; however, at the higher $r_B$ level, two locations is again the preferred number.

Dominance to Additive Variance Ratio

In comparing the two full-sib designs for relative efficiency in estimating $\gamma$, the circular design is always approximately equal to or, for most cases, superior to the half-diallel design (Figure [...]). The relative superiority of the circular design is enhanced by decreasing $\gamma$ and $r_B$ (not shown). The half-diallel design again demonstrates optima for number of parents, with the six-parent design being near optimal. Within a mating design, the use of two locations is always more efficient than the use of five locations. The magnitude of this superiority escalates with increasing $r_B$ and $h^2$ (Figures [...]a and [...]b versus [...]c and [...]d).

Discussion

Comparison of Mating Designs

A priori knowledge of genetic control is required to choose the optimal mating and field design for estimation of $h^2$, $r_B$ and $\gamma$. Given that such knowledge may not be available, the choices are then based on the most robust mating designs and field designs for the estimation of certain of the genetic ratios. If $h^2$ is the only ratio desired, then the half-sib mating design is best. Estimation of both $h^2$ and $r_B$ requires a choice between the half-sib and circular designs. If there is no prior knowledge, then the selection of a mating design is dependent on which ratio has the highest priority. For experiments in which $h^2$ receives the highest weighting, the half-sib design is preferred, and in the alternative case the circular design is the better choice. In the last scenario, information on all three ratios is desired from the same experiment, and in this case the circular design is the better selection since the circular design is almost globally more efficient than the half-diallel design.

After choosing a mating design, the next decision is how many locations per experiment are required to optimize efficiency. For the half-sib design the number of locations required to optimize efficiency is dependent on both the ratio being estimated and the level of genetic control. A broad inference is that for $h^2$ estimation a two-location experiment is more efficient, and for $r_B$ a five-location experiment has the better efficiency. Estimation of any of the three ratios with a full-sib design is almost globally more efficient in two-location experiments.

The disparity between the behavior of the half-sib and full-sib designs with respect to the efficiency of location levels can be explained in terms of the genetic connectedness offered by the different designs. Genetic connectedness can be viewed as commonality of parentage among genetic entries. The more entries having a common parent, the more connectedness is present.

The half-sib design is only connected across locations by the one common parent in a half-sib family in each replication. Full-sib designs are connected across locations in each replication by the full-sib cross plus the number of parents minus two (half-diallel), or three (circular), for each of the two parents in a cross. The connectedness in a full-sib design means each observation is providing information about many other observations. The result of this connectedness is that, in general, fewer observations (number of locations) are required for maximum efficiency.

A General Approach to the Estimation Problem

The estimation problems may be viewed in a broader context than the specific solutions in this chapter. The technique for comparison of mating designs and numbers of locations across levels of genetic determination may be construed, for the case of $h^2$ estimation, to be the effect of these factors on the variance of $\sigma^2_{gca}$ estimates. Viewing the variance approximation formula, the conclusion may be reached that the variance of $\sigma^2_{gca}$ estimates is the controlling factor in the variance of $h^2$ estimates, since the other factors, at these heritability levels, are multiplied by constants which reduce their impact dramatically.

Given this conclusion, the variance of $h^2$ estimates is essentially the element of $2\{\mathrm{tr}(QV_iQV_j)\}^{-1}$ corresponding to $\sigma^2_{gca}$. Further, since the covariances of the other variance component estimates with $\sigma^2_{gca}$ estimates are small, the variance of $\sigma^2_{gca}$ estimates is basically determined by the magnitude of the corresponding element of $\{\mathrm{tr}(QV_iQV_j)\}$, which is $\mathrm{tr}(QV_gQV_g)$. Thus, the variance of $h^2$ estimates is minimized by maximizing $\mathrm{tr}(QV_gQV_g)$, with $h^2$ used as an illustration because this simplification is possible.

Considering the impact of changing levels of genetic control while holding the mating and field designs constant: $V_g$ is fixed; the diagonal elements of $V$ are fixed at [...] because of our assumptions; and only the off-diagonal elements of $V$ change with genetic control levels. Since $Q$ is a direct function of $V^{-1}$, what we observe in Figure [...] comparing a design across

levels of genetic control are changes in $V^{-1}$ brought about by changes in the magnitude of the off-diagonal elements of $V$ (covariances among observations). The effect of positive off-diagonal elements (the linear model specifies that all off-diagonal elements in $V$ are zero or positive) on $V^{-1}$ is to reduce the magnitude of the diagonal elements and often also to result in negative off-diagonal elements. If one increases the magnitude of the off-diagonal elements in $V$, then the magnitude of the diagonal elements of $V^{-1}$ is reduced and the magnitude of negative off-diagonal elements is increased. Since $\mathrm{tr}(QV_gQV_g)$ is the sum of the squared elements of the product of a direct function of $V^{-1}$ and a matrix of non-negative constants ($V_g$), as the diagonal elements of $V^{-1}$ are reduced and the off-diagonal elements become more negative, $\mathrm{tr}(QV_gQV_g)$ must become smaller and the variance of $h^2$ estimates increases.

Mating designs may be compared by the same type of reasoning. Within a constant field design, changes in mating design produce alterations in $V$. Of the three designs, the half-sib produces a $V$ matrix with the most zero off-diagonal elements, the circular design next, and the half-diallel the fewest zero off-diagonal elements. Knowing the effect of off-diagonal elements on the variance of $h^2$ estimates, one could surmise that the variance of estimates is lowest for the design with the fewest non-zero off-diagonal elements and highest for the design with the most. This tenet is in basic agreement with the results in Figures [...] through [...].

The effects of $r_B$ and $\gamma$ on the variance of $h^2$ estimates can also be interpreted utilizing the above approach. In the results section of this chapter it is noted that decreasing the magnitude of $r_B$ and/or $\gamma$ causes full-sib designs to rise in efficiency relative to the half-sib design. In accordance with our previous arguments, this would be expected since decreasing the magnitude of those two ratios causes a decrease in the magnitude of off-diagonal elements. More precisely, decreasing $\gamma$ results in the reduction of off-diagonal elements in $V$ of the full-sib designs while not affecting the half-sib design, and decreasing $r_B$ results in the reduction of off-diagonal

elements in $V$ of full-sib and half-sib designs. Relative increases in efficiency of full-sib designs result from the elements due to location by additive interaction occurring much less frequently in the half-sib designs; thus, the relative impact of reduction in $r_B$ in half-sib designs is less than that for full-sibs.

Use of the Variance of a Ratio Approximation

Use of Kendall and Stuart's first approximation (first-term Taylor series approximation) of the variance of a ratio has two major caveats. The approximation depends on large-sample properties to approach the true variance of the ratio, i.e., with a small number of levels for random variables the approximation does not necessarily closely approximate the true variance of the ratio. Work by Pederson suggests that, for approximating the variance of $h^2$, at least ten parents are required in diallels before the approximation will converge to the true variance, even after including Taylor series terms past the first derivative. Pederson's work also suggests that the approximation is progressively worse for increasing heritability with low numbers of parents. Using the field design in this chapter (two locations, four blocks and six-tree row-plots), simulation work ([...] data sets) has demonstrated that, with a heritability of [...] using four parents in a half-diallel across two locations, the variance of a ratio approximation yields a variance estimate for $h^2$ of [...] while the convergent value for the simulation was [...] (Huber, unpublished data). One should remember the dependence of the first approximation of the variance of a ratio on large-sample properties when applying the technique to real data.

The second caveat is that the range of estimates of the denominator of the ratio cannot pass through zero (Kendall and Stuart). This constraint is of no concern for $h^2$; however, the structure of the $r_B$ and $\gamma$ denominators allows unbiased minimum variance estimates of those denominators to pass through zero, which means at one point in the distribution of the estimates

of the ratios they are undefined (the distributions of these ratio estimates are not continuous). Simulation has shown that the variances of $r_B$ and $\gamma$ are much greater than the approximation would indicate (Huber, unpublished data). The discrepancy in variance of the estimates could be partially alleviated through using a variance component estimation technique which restricts estimates to the parameter space $0 \le \sigma^2 < \infty$. Nevertheless, because of the two caveats, approximations of the variance of $h^2$, $r_B$ and $\gamma$ estimates should be viewed only on a relative basis for comparisons among designs and not on an absolute scale.

Additionally, the expectation of a ratio does not equal the ratio of the expectations (Hogg and Craig). If a value of genetic ratios is sought so that the value equals the ratio of the expectations, then the appropriate way to calculate the ratio would be to take the mean of variance components, or linear combinations of variance components, across many experiments and then take the ratio. If the value sought for $h^2$ is the expectation of the ratio, then taking the mean of many $h^2$ estimates is the appropriate approach. Returning to the results from simulated data ([...] data sets) where the $h^2$ value was set at [...], using the ratio of the means of variance components rendered a value of [...] for $h^2$, the mean of the $h^2$ estimates returned a value of [...], and a Taylor series approximation of the mean of the ratio yielded [...] (Pederson).

Conclusions

Results from this study should be interpreted as relative comparisons of the levels of the factors investigated. However, viewing the optimal design problem as illustrated in the discussion section of this chapter can provide insight into the more general problem.

There is no globally most efficient number of locations, parents or mating design for the three ratios estimated, even within the restricted range of this study; yet some general conclusions can be drawn. For estimating $h^2$, the half-sib design is always optimal or close to optimal in

terms of variance of estimation and efficiency. In the estimation of $r_B$ and $\gamma$, the circular mating design is always optimal or near optimal in variance reduction and efficiency.

Across numbers of parents within a mating design, only the half-diallel shows optima for efficiency. The other mating designs have non-decreasing efficiency plots over the level of number of parents, so that while there is an optimal number of locations for a level of genetic control, the number of genetic entries per location is limited more by operational than efficiency constraints.

Two locations is a near-global optimum over five locations for the full-sib mating designs. Within the half-sib mating design, optimality depends on the levels of $h^2$ and $r_B$: 1) for $h^2$ estimation the optimal number of locations is inversely related to the level of $h^2$, i.e., at the higher level two tests were optimal and at the lower level five tests were optimal; and 2) for $r_B$ estimation for the half-sib design the optimal number of locations was also inversely related to the level of $r_B$.

Means of estimates from disconnected sets provide lower variance of estimation where the smaller experiments have higher efficiencies. Thus, disconnected sets are preferred according to number of locations for all mating designs and according to number of parents for the half-diallel mating design.

In practical consideration of the optimal mating design problem, the results of this study indicate that if $h^2$ estimation is the primary use of a progeny test, then the half-sib mating design is the proper choice. Further, the circular mating design is an appropriate choice if the estimation of $r_B$ is more important than $h^2$. Finally, if a full-sib design is required to furnish information about dominance variance, the circular design provides almost globally better efficiencies for $h^2$, $r_B$ and $\gamma$ than the half-diallel.

CHAPTER [...]
ORDINARY LEAST SQUARES ESTIMATION OF GENERAL AND SPECIFIC COMBINING ABILITIES FROM HALF-DIALLEL MATING DESIGNS

Introduction

The diallel mating system is an altered factorial design in which the same individuals (or lines) are used as both male and female parents. A full diallel contains all crosses, including reciprocal crosses and selfs, resulting in a total of $p^2$ combinations, where $p$ is the number of parents. Assumptions that reciprocal effects, maternal effects and paternal effects are negligible lead to the use of the half-diallel mating system (Griffing, method 4), which has $p(p-1)/2$ parental combinations and is the mating system addressed in this chapter. Half-diallels have been widely used in crop and tree breeding (Sprague and Tatum; Gilbert; Matzinger et al.; Burley et al.; and Squillace), and the widespread use of this mating system continues today (Weir and Zobel; Wilcox et al.; Snyder and Namkoong; Hallauer and Miranda; Singh and Singh; Greenwood et al.; and Weir and Goddard).

Most of the statistical packages available treat fixed effect estimation as the objective of the program, with random variables representing nuisance variation. Within this context a common analysis of half-diallel experiments is conducted by first treating genetic parameters as fixed effects for estimation of general (GCA) and specific (SCA) combining abilities and subsequently as random variables for variance component estimation (used for estimating heritabilities, genetic correlations and general to specific combining ability variance ratios for

determining breeding strategies). This chapter focuses on the estimation of GCA's and SCA's as fixed effects.

The treatment of GCA and SCA as fixed effects in OLS (ordinary least squares) is an entirely appropriate analysis if the comparisons are among parents and crosses in a particular experiment. If, as forest geneticists often wish to do, GCA estimates from disconnected experiments are to be compared, then methods such as checklots must be used to place the estimates on a common basis.

Formulae (Griffing; Falconer; Hallauer and Miranda; and Becker) for hand calculation of general and specific combining abilities are based on a solution to the OLS equations for half-diallels created by sum-to-zero restrictions, i.e., the sum of all effect estimates for an experimental factor equals zero. These formulae will yield correct OLS solutions for sum-to-zero genetic parameters, provided the data have no missing cells. If cell (plot) means are used as the basis for the estimation of effects, there must be at least one observation per cell (plot), where a cell is a subclassification of the data defined by one level of every factor (Searle). An example of a cell is the group of observations denoted by $AB_{ij}$ for a randomized complete block design with factor A across blocks (B). If the above formulae are applied without accounting for missing cells, incorrect and possibly misleading solutions can result. The matrix algebra approach is described in this chapter for these reasons: 1) in forest tree breeding applications, data sets with missing cells are extremely common; 2) many statistical packages do not allow direct specification of the half-diallel model; 3) the use of a linear model and matrix algebra can yield relevant OLS solutions for any degree of data imbalance; and 4) viewing the mechanics of the OLS approach is an aid to understanding the properties of the estimates.

The objectives of this chapter are to 1) detail the construction of ordinary least squares (OLS) analysis of half-diallel data sets to estimate genetic parameters (GCA and SCA) as fixed effects; 2) recount the assumptions and mathematical features of this type of analysis; 3)

facilitate the reader's implementation of OLS analyses for diallels of any degree of imbalance and suggest a method for combining estimates from disconnected experiments; and 4) aid the reader in ascertaining what method is an appropriate analysis for a given data set.

Methods

Linear Model

Plot means are used as the unit of observation for this analysis, with unequal numbers of observations per plot. Plot (cell) means are always estimable as long as there is one observation per plot, and linear combinations of these means (least squares means) provide the most efficient way of estimating OLS fixed effects
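
As a rough illustration of fitting fixed effects to plot means by OLS when a plot is missing, the sketch below builds a sum-to-zero design matrix for a four-parent half-diallel in two blocks and solves the normal equations. It is a sketch under stated assumptions rather than the procedure of this chapter: the function names, the choice of which effects are written as linear combinations of the others, the missing cell and all parameter values are hypothetical, and the plot means are generated without error so that the fit can be checked exactly.

```python
import numpy as np
from itertools import combinations

P, BLOCKS = 4, 2
CROSSES = list(combinations(range(P), 2))      # 6 crosses; parents indexed 0..3


def overparameterized_row(block, cross):
    """Indicator row for mu, B1..B2, GCA1..GCA4, SCA12..SCA34 (13 columns)."""
    row = np.zeros(1 + BLOCKS + P + len(CROSSES))
    row[0] = 1.0                                # mean
    row[1 + block] = 1.0                        # block effect
    for parent in cross:
        row[1 + BLOCKS + parent] = 1.0          # GCA of each parent
    row[1 + BLOCKS + P + CROSSES.index(cross)] = 1.0   # SCA of the cross
    return row


# T maps the 7 sum-to-zero parameters (mu, B1, GCA1, GCA2, GCA3, SCA12, SCA13)
# onto the 13 overparameterized ones; dependent effects are linear combinations.
T = np.array([
    [1, 0, 0, 0, 0, 0, 0],      # mu
    [0, 1, 0, 0, 0, 0, 0],      # B1
    [0, -1, 0, 0, 0, 0, 0],     # B2 = -B1
    [0, 0, 1, 0, 0, 0, 0],      # GCA1
    [0, 0, 0, 1, 0, 0, 0],      # GCA2
    [0, 0, 0, 0, 1, 0, 0],      # GCA3
    [0, 0, -1, -1, -1, 0, 0],   # GCA4 = -(GCA1 + GCA2 + GCA3)
    [0, 0, 0, 0, 0, 1, 0],      # SCA12
    [0, 0, 0, 0, 0, 0, 1],      # SCA13
    [0, 0, 0, 0, 0, -1, -1],    # SCA14 = -(SCA12 + SCA13)
    [0, 0, 0, 0, 0, -1, -1],    # SCA23 = -(SCA12 + SCA13)
    [0, 0, 0, 0, 0, 0, 1],      # SCA24 = SCA13
    [0, 0, 0, 0, 0, 1, 0],      # SCA34 = SCA12
], dtype=float)


def design_sum_to_zero(cells):
    """Sum-to-zero design matrix for the observed (block, cross) cells."""
    X_o = np.array([overparameterized_row(b, c) for b, c in cells])
    return X_o @ T


def ols(X, y):
    """OLS solution b = (X'X)^{-1} X'y for a full-column-rank X."""
    return np.linalg.solve(X.T @ X, X.T @ y)


cells = [(b, c) for b in range(BLOCKS) for c in CROSSES]
cells.remove((1, (2, 3)))                      # one missing plot (hypothetical)
X_s = design_sum_to_zero(cells)
beta_true = np.array([10.0, 0.5, 1.0, -0.4, 0.2, 0.3, -0.1])   # made-up values
y = X_s @ beta_true                            # error-free plot means
b = ols(X_s, y)
all_effects = T @ b                            # back to all 13 effects
print(np.round(b, 6))
```

Because the fit works from the design matrix of the cells actually observed, it needs no special handling of the missing plot; the hand formulae for balanced data offer no equivalent.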

$e_{ijk}$ is the random error associated with the observation of the $jk$th cross in the $i$th block, where $e_{ijk}$ ~ (0, $\sigma^2_e$).

Cross by block interaction, as genotype by environment interaction, is treated as confounded with between-plot variation, as for contiguous plots. The model in matrix notation is

\[ y = X\beta + e \]

where $y$ is the vector of observations ($n \times 1$, i.e., $n$ rows and 1 column), where $n$ equals the number of observations;
$X$ is the design matrix ($n \times m$), whose function is to select the appropriate parameters for each observation, where $m$ equals the number of fixed effect parameters in the model;
$\beta$ is the vector ($m \times 1$) of fixed effect parameters ordered in a column; and
$e$ is the vector ($n \times 1$) of deviations (errors) from the expectation associated with each observation.

Ordinary Least Squares Solutions

The matrix representation of an OLS fixed effects solution is

\[ b = (X'X)^{-1}X'y \]

where $b$ is the vector of estimated fixed effect parameters, i.e., an estimate of $\beta$, and $X$ is the design matrix, either made full rank by reparameterization, or a generalized inverse of $X'X$ may be used. Inherent in this solution is the ordinary least squares assumption that the variance-

PAGE 42

FRYDULDQFH PDWUL[ 9f RI WKH REVHUYDWLRQV \f LV HTXDO WR D ZKHUH LV DQ Q[Q LGHQWLW\ PDWUL[ 7KH HOHPHQWV RI DQ LGHQWLW\ PDWUL[ DUH OfV RQ WKH PDLQ GLDJRQDO DQG DOO RWKHU HOHPHQWV DUH 0XOWLSO\LQJ E\ FUH SODFHV UH RQ WKH PDLQ GLDJRQDO ,Q WKH FRYDULDQFH PDWUL[ IRU WKH REVHUYDWLRQV WKH YDULDQFH RI WKH REVHUYDWLRQV DSSHDUV RQ WKH PDLQ GLDJRQDO DQG WKH FRYDULDQFH EHWZHHQ REVHUYDWLRQV DSSHDUV LQ WKH RIIGLDJRQDO HOHPHQWV 7KXV 9 ,DH VWDWHV WKDW WKH YDULDQFH RI WKH REVHUYDWLRQV LV HTXDO WR DH IRU HDFK REVHUYDWLRQ DQG WKHUH DUH QR FRYDULDQFHV EHWZHHQ WKH REVHUYDWLRQV ZKLFK LV RQH GLUHFW UHVXOW RI FRQVLGHULQJ JHQHWLF SDUDPHWHUV DV IL[HG HIIHFWVf 6XPWR=HUR 5HVWULFWLRQV 7KH GHVLJQ PDWUL[ SUHVHQWHG LQ WKLV FKDSWHU LV UHSDUDPHWHUL]HG E\ VXPWR]HUR UHVWULFWLRQV WR f UHGXFH WKH GLPHQVLRQ RI WKH PDWULFHV WR D PLQLPDO VL]H DQG f \LHOG HVWLPDWHV RI IL[HG HIIHFWV ZLWK WKH VDPH VROXWLRQ DV FRPPRQ IRUPXODH LQ WKH EDODQFHG FDVH 2WKHU UHVWULFWLRQV VXFK DV VHWWR]HUR FRXOG DOVR EH DSSOLHG VR WKH GLVFXVVLRQ WKDW IROORZV WUHDWV VXPWR]HUR UHVWULFWLRQV DV D VSHFLILF VROXWLRQ WR WKH PRUH JHQHUDO SUREOHP ZKLFK LV ILQGLQJ DQ LQYHUVH IRU ;f; 7KH VXEVFULSWV fRf DQG fVf UHIHU WR WKH RYHUSDUDPHWHUL]HG PRGHO DQG WKH UHSDUDPHWHUL]HG PRGHO ZLWK VXPWR]HUR UHVWULFWLRQV UHVSHFWLYHO\ 7KH PDWUL[ ; RI )LJXUH LV WKH GHVLJQ PDWUL[ IRU DQ RYHUSDUDPHWHUL]HG OLQHDU PRGHO 0LOOLNHQ DQG -RKQVRQ SDJH f 2YHUSDUDPHWHUL]DWLRQ PHDQV WKDW WKH HTXDWLRQV DUH ZULWWHQ LQ PRUH XQNQRZQV SDUDPHWHUV LQ WKLV FDVH f WKDQ WKHUH DUH HTXDWLRQV QXPEHU RI REVHUYDWLRQV PLQXV GHJUHHV RI IUHHGRP IRU HUURU LQ WKLV FDVH f ZLWK ZKLFK WR HVWLPDWH WKH SDUDPHWHUV 5HSDUDPHWHUL]DWLRQ DV D VXPWR]HUR PDWUL[ RYHUFRPHV WKLV GLOHPPD E\ UHGXFLQJ WKH QXPEHU RI SDUDPHWHUV WKURXJK PDNLQJ VRPH RI WKH SDUDPHWHUV OLQHDU FRPELQDWLRQV RI RWKHUV 6XPWR]HUR UHVWULFWLRQV PDNH WKH UHVXOWLQJ SDUDPHWHUV DQG HVWLPDWHV VXP WR ]HUR HYHQ WKRXJK

PAGE 43

WKH XQUHVWULFWHG SDUDPHWHUV IRU H[DPSOH WKH WUXH *&$ YDOXHV DV DSSOLHG WR D EURDGHU SRSXODWLRQf GR QRW QHFHVVDULO\ VXPWR]HUR ZLWKLQ D GLDOOHO 7KLV LV WKH SUREOHP RI FRPSDUDELOLW\ RI *&$ HVWLPDWHV IURP GLVFRQQHFWHG H[SHULPHQWV \ \XV \LZ \LD \A \X f§ \rL \L \0 \L \ B\ % % *&$ *&$M JFD *&$f VFD VFDL VFD 6& $\ 6&$M 6&$f n % % *&$ *&$M *&$ L JFD L VFD L VFD L VFD L 6& $\ L 6&$ 6&$r B \ [ S )LJXUH 7KH RYHUSDUDPHWHUL]HG OLQHDU PRGHO IRU D IRXUSDUHQW KDOIGLDOOHO SODQWHG RQ D VLQJOH VLWH LQ WZR EORFNV GLVSOD\HG DV PDWULFHV 7KH GHVLJQ PDWUL[ ;DQG SDUDPHWHU YHFWRU 2f DUH VKRZQ LQ RYHUSDUDPHWHUL]HG IRUP fV DQG fV GHQRWH WKH SUHVHQFH RU DEVHQFH RI D SDUDPHWHU LQ WKH PRGHO IRU WKH REVHUYHG PHDQV GDWD YHFWRU \f 7KH SDUDPHWHUV GLVSOD\HG DERYH WKH GHVLJQ PDWUL[ ODEHO WKH DSSURSULDWH FROXPQ IRU HDFK SDUDPHWHU (UURU YHFWRU QRW H[KLELWHG L % *&$ *&$ *&$ 6&$ 6&$ n HO % HO HOO *&$ HO HO *&$ HO H *&$ H H VFD H H VFD H \ ;Vf6V H )LJXUH 7KH OLQHDU PRGHO IRU D IRXUSDUHQW KDOIGLDOOHO SODQWHG RQ D VLQJOH VLWH LQ WZR EORFNV GLVSOD\HG DV PDWULFHV 7KH GHVLJQ PDWUL[ ;DQG WKH SDUDPHWHU YHFWRU ILM DUH SUHVHQWHG LQ VXPWR]HUR IRUPDW 7KH SDUDPHWHUV GLVSOD\HG DERYH WKH GHVLJQ PDWUL[ ODEHO WKH DSSURSULDWH FROXPQ IRU HDFK SDUDPHWHU 7R LOOXVWUDWH WKH FRQFHSW RI VXPWR]HUR HVWLPDWHV YHUVXV SRSXODWLRQ SDUDPHWHUV ZH XVH WKH H[SHFWDWLRQ RI D FRPPRQ IRUPXOD %HFNHU f JLYHV HTXDWLRQ ZKLFK IRU EDODQFHG

PAGE 44

FDVHV LV HTXLYDOHQW WR J S fSff=M = ff DV WKH HVWLPDWH IRU JHQHUDO FRPELQLQJ DELOLW\ IRU WKH Mf§ OLQH ZLWK S HTXDOOLQJ WKH QXPEHU RI SDUHQWV DQG HTXDOOLQJ WKH VLWH PHDQ RI WKH M [ N FURVV 7KLV HTXDWLRQ \LHOGV WKH VDPH VROXWLRQ DV WKH PDWUL[ HTXDWLRQV ZLWK QR PLVVLQJ SORWV RU FURVVHV DQG ZLWK D GHVLJQ PDWUL[ ZKLFK FRQWDLQV WKH VXPWR]HUR UHVWULFWLRQV $Q HYDOXDWLRQ RI WKLV IRUPXOD LQ D IRXUSDUHQW KDOIGLDOOHO SODQWHG LQ E EORFNV IRU WKH *&$ RI SDUHQW LV REWDLQHG E\ VXEVWLWXWLQJ WKH H[SHFWDWLRQ RI WKH OLQHDU PRGHO HTXDWLRQ f IRU HDFK REVHUYDWLRQ JM OLSLSA;S=M =f 7 (^J` (^OSSfffS= =f` (^J` *&$f *&$ *&$ *&$6&$ 6&$ 6&$8f 6&$ 6&$ 6&$7KH UHVXOW RI HTXDWLRQ LV REYLRXVO\ QRW *&$ IURP WKH XQUHVWULFWHG PRGHO HTXDWLRQ f 7KXV J DQ HVWLPDEOH IXQFWLRQ DQG DQ HVWLPDWH RI SDUDPHWHU *&$6 WKH HVWLPDWH RI WKH *&$ RI SDUHQW JLYHQ WKH VXPWR]HUR UHVWULFWLRQVf GRHV QRW KDYH WKH VDPH PHDQLQJ DV *&$ LQ WKH XQUHVWULFWHG PRGHO $Q HVWLPDEOH IXQFWLRQ LV D OLQHDU FRPELQDWLRQ RI WKH REVHUYDWLRQV EXW LQ RUGHU IRU DQ LQGLYLGXDO SDUDPHWHU LQ D PRGHO WR EH HVWLPDEOH RQH PXVW GHYLVH D OLQHDU FRPELQDWLRQ RI WKH REVHUYDWLRQV VXFK WKDW WKH H[SHFWDWLRQ KDV D ZHLJKW RI RQH RQ WKH SDUDPHWHU RQH ZLVKHV WR HVWLPDWH ZKLOH KDYLQJ D ZHLJKW RI ]HUR RQ DOO RWKHU SDUDPHWHUV $ VROXWLRQ VXFK DV WKLV GRHV QRW H[LVW IRU WKH LQGLYLGXDO SDUDPHWHUV LQ WKH RYHUSDUDPHWHUL]HG PRGHO HTXDWLRQ f 6R DOWKRXJK WKH VXPWR]HUR UHVWULFWHG *&$ SDUDPHWHUV DQG HVWLPDWHV DUH IRUFHG WR VXPWR]HUR IRU WKH VDPSOH RI SDUHQWV LQ D JLYHQ GLDO OHL WKH XQUHVWULFWHG *&$ SDUDPHWHUV RQO\ VXPWR]HUR DFURVV WKH HQWLUH SRSXODWLRQ )DOFRQHU f DQG DQ HYDOXDWLRQ RI *&$6 GHPRQVWUDWHV WKDW WKH HVWLPDWH FRQWDLQV RWKHU PRGHO SDUDPHWHUV 7KH UHVXOW RI VXPWR]HUR UHVWULFWLRQV LV WKDW WKH GHJUHHV RI IUHHGRP IRU D IDFWRU HTXDOV WKH QXPEHU RI FROXPQV SDUDPHWHUVf IRU WKDW IDFWRU LQ ; )LJXUH f 7KXV D JHQHUDOL]HG

PAGE 45

LQYHUVH IRU ;6f;6 LV QRW UHTXLUHG VLQFH WKH QXPEHU RI FROXPQV LQ WKH VXPWR]HUR ; PDWUL[ IRU HDFK IDFWRU HTXDOV WKH GHJUHHV RI IUHHGRP IRU WKDW IDFWRU LQ WKH PRGHO ; LV IXOO FROXPQ UDQN DQG SURYLGHV D VROXWLRQ WR HTXDWLRQ f &RPSRQHQWV RI WKH 0DWUL[ (TXDWLRQ 7KH HTXDWLRQDO FRPSRQHQWV RI DUH QRZ FRQVLGHUHG LQ JUHDWHU GHWDLO 'DWD YHFWRU Y 2EVHUYDWLRQV SORW PHDQVf LQ WKH GDWD YHFWRU DUH RUGHUHG LQ WKH PDQQHU GHPRQVWUDWHG LQ )LJXUH )RU RXU H[DPSOH )LJXUH LV WKH PDWUL[ HTXDWLRQ RI D IRXU SDUHQW KDOIGLDOOHO PDWLQJ GHVLJQ SODQWHG LQ WZR UDQGRPL]HG FRPSOHWH EORFNV RQ D VLQJOH VLWH 7KHUH DUH VL[ FURVVHV SUHVHQW LQ WKH WZR EORFNV IRU D WRWDO RI REVHUYDWLRQV LQ WKH GDWD YHFWRU \ 7KH REVHUYDWLRQV DUH ILUVW VRUWHG E\ EORFN 6HFRQG ZLWKLQ HDFK EORFN WKH REVHUYDWLRQV VKRXOG EH LQ WKH VDPH VHTXHQFH IRU VLPSOLFLW\ RI SUHVHQWDWLRQ RQO\f 7KLV VHTXHQFH LV REWDLQHG E\ DVVLJQLQJ QXPEHUV WKURXJK S WR HDFK RI WKH S SDUHQWV DQG WKHQ VRUWLQJ DOO FURVVHV FRQWDLQLQJ SDUHQW ZKHWKHU DV PDOH RU IHPDOHf DV WKH SULPDU\ LQGH[ LQ GHVFHQGLQJ QXPHULFDO RUGHU E\ WKH RWKHU SDUHQW RI WKH FURVV DV WKH VHFRQGDU\ LQGH[ 1H[W DOO FURVVHV FRQWDLQLQJ SDUHQW SULPDU\ LQGH[ DV PDOH RU IHPDOHf LQ ZKLFK WKH RWKHU SDUHQW LQ WKH FURVV VHFRQGDU\ LQGH[f KDV D QXPEHU JUHDWHU WKDQ DUH WKHQ DOVR VRUWHG LQ GHVFHQGLQJ RUGHU E\ WKH VHFRQGDU\ LQGH[ 7KLV SURFHGXUH LV IROORZHG WKURXJK XVLQJ SDUHQW S DV WKH SULPDU\ LQGH[ 'HVLJQ PDWUL[ DQG SDUDPHWHU YHFWRU ; DQG 7KH GHVLJQ PDWUL[ IRU D PRGHO LV FRQFHSWXDOO\ D OLVWLQJ RI WKH SDUDPHWHUV SUHVHQW LQ WKH PRGHO IRU HDFK REVHUYDWLRQ 6HDUOH SDJH f ,Q )LJXUH \ DQG IW DUH H[KLELWHG DQG WKH SDUDPHWHUV LQ IW DUH GLVSOD\HG DW WKH WRSV RI WKH FROXPQV RI ; D YLVXDOO\ FRUUHFW LQWHUSUHWDWLRQ RI WKH PXOWLSOLFDWLRQ RI D PDWUL[ E\ D YHFWRUf )RU HDFK REVHUYDWLRQ LQ \ WKH VFDODU

PAGE 46

PRGHO HTXDWLRQ f PD\ EH HPSOR\HG WR REWDLQ WKH OLVWLQJ RI SDUDPHWHUV IRU WKDW REVHUYDWLRQ WKH URZ RI WKH GHVLJQ PDWUL[ FRUUHVSRQGLQJ WR WKH SDUWLFXODU REVHUYDWLRQf 7KH FRQYHQWLRQ IRU GHVLJQ PDWULFHV LV WKDW WKH FROXPQV IRU WKH IDFWRUV RFFXU LQ WKH VDPH RUGHU DV WKH IDFWRUV LQ WKH OLQHDU PRGHO HTXDWLRQ DQG )LJXUH f 6LQFH GHVLJQ PDWULFHV FDQ EH GHYLVHG E\ ILUVW FUHDWLQJ WKH FROXPQV SHUWLQHQW WR HDFK IDFWRU LQ WKH PRGHO VXEPDWULFHVf DQG WKHQ KRUL]RQWDOO\ DQGRU YHUWLFDOO\ VWDFNLQJ WKH VXEPDWULFHV WKH GLVFXVVLRQ RI WKH UHSDUDPHWHUL]HG GHVLJQ PDWUL[ IRUPXODWLRQ ZLOO SURFHHG E\ IDFWRU 0HDQ 7KH ILUVW FROXPQ RI ; LV IRU Q DQG LV D YHFWRU RI OfV ZLWK WKH QXPEHU RI URZV HTXDOOLQJ WKH QXPEHU RI REVHUYDWLRQV )LJXUH f 7KH OLQHDU PRGHO HTXDWLRQ f LQGLFDWHV WKDW DOO REVHUYDWLRQV FRQWDLQ Q DQG WKH GHYLDWLRQ RI WKH REVHUYDWLRQV IURP S LV H[SODLQHG LQ WHUPV RI WKH IDFWRUV DQG LQWHUDFWLRQV LQ WKH PRGHO SOXV HUURU %ORFN 7KH QXPEHU RI FROXPQV IRU EORFN LV HTXDO WR WKH QXPEHU RI EORFNV PLQXV RQH FROXPQ ;(DFK URZ RI D EORFN VXEPDWUL[ FRQVLVWV RI OfV DQG fV RU OfV DFFRUGLQJ WR WKH LGHQWLW\ RI WKH REVHUYDWLRQ IRU ZKLFK WKH URZ LV EHLQJ IRUPHG 7KH QRUPDO FRQYHQWLRQ LV WKDW WKH ILUVW FROXPQ UHSUHVHQWV EORFN DQG WKH VHFRQG FROXPQ EORFN HWF WKURXJK EORFN E 6LQFH ZH KDYH XVHG D VXPWR]HUR VROXWLRQ eec f WKH HIIHFW GXH WR EORFN E LV D OLQHDU FRPELQDWLRQ RI WKH RWKHU E HIIHFWV LH EE ( cEc ZKLFK LQ RXU H[DPSOH LV Ec E DQG E E 7KXV WKH URZ RI WKH EORFN VXEPDWUL[ IRU DQ REVHUYDWLRQ LQ EORFN E WKH ODVW EORFNf KDV D LQ HDFK RI WKH E FROXPQV VLJQLI\LQJ WKDW WKH EORFN E HIIHFW LV LQGHHG D OLQHDU FRPELQDWLRQ RI WKH RWKHU E EORFN HIIHFWV &ROXPQV DQG RI ;f )LJXUH f KDYH EHFRPH FROXPQ RI ; )LJXUH f
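The mean and block columns just described can be sketched in a few lines of Python with NumPy. This is an illustrative sketch rather than code from the chapter: the function name `mean_and_block_columns` is hypothetical, and it assumes complete blocks with the observations sorted by block.

```python
import numpy as np

def mean_and_block_columns(n_blocks, n_crosses):
    """Return the mean column and the sum-to-zero block submatrix.

    Hypothetical helper: assumes every cross is present in every block
    and that the observations are sorted by block.
    """
    n_obs = n_blocks * n_crosses
    mean_col = np.ones((n_obs, 1))
    # Overparameterized block indicators: one 0/1 column per block.
    over = np.kron(np.eye(n_blocks), np.ones((n_crosses, 1)))
    # Sum-to-zero: add minus one to the other columns in every row that
    # belongs to the last block, then delete the last block's column.
    block_cols = over[:, :-1] - over[:, [-1]]
    return mean_col, block_cols

# Four-parent half-diallel (six crosses) planted in two blocks.
mean_col, block_cols = mean_and_block_columns(n_blocks=2, n_crosses=6)
print(block_cols.ravel())
```

For the two-block example the single block column holds 1 for the six plot means in block 1 and -1 for the six in block 2, which is the coding $b_2 = -b_1$.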


General combining ability. This submatrix of $X_s$ is slightly more complex than previous factors as a result of having two levels of a main effect present per observation, i.e., the deviation of an observation from $\mu$ is modeled as the result of the GCA's of both the male and female parents. Again we have imposed a restriction, $\sum_j gca_j = 0$. Since GCA has p-1 degrees of freedom, the submatrix for GCA should have p-1 columns, i.e., $gca_p = -\sum_{j=1}^{p-1} gca_j$. The GCA submatrix for $X_s$ (columns 3 through 5 in the sum-to-zero figure) is formed from $X_o$ (columns 4 through 7 in the overparameterized-model figure) in the same manner as the block matrix: (1) add minus one to the elements in the other columns along each row containing a one for $gca_p$ (p = 4 in our example), and (2) delete the column from $X_o$ corresponding to $gca_p$. The GCA submatrix has p(p-1)/2 rows (the number of crosses). This, with no missing cells (plots), equals the number of observations per block. To form the GCA factor submatrix for a site, the GCA submatrix is vertically concatenated (stacked on itself) b times. This completes the portion of the $X_s$ matrix for GCA.

Specific combining ability. In order to facilitate construction of the SCA submatrix, a horizontal direct product should be defined. A horizontal direct product, as applied to two column vectors, is the element-by-element product between the two vectors (SAS/IML* User's Guide) such that the element in the $i$th row of the resulting product vector is the product of the elements in the $i$th rows of the two initial vectors. The resultant product vector has dimension n x 1. A horizontal direct product is useful for the formation of interaction or nested factor submatrices, where the initial matrices represent the main factors and the resulting matrix represents an interaction or a nested factor (product rule, Searle).

* SAS/IML is the registered trademark of the SAS Institute Inc., Cary, North Carolina.
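As a hedged illustration of the two steps for the GCA submatrix and of the horizontal direct product, the sketch below builds both with NumPy. The function name `gca_submatrix` is hypothetical, and the within-block cross order (1x2, 1x3, ..., 3x4) is an assumption made for the example; any fixed order gives an equivalent matrix up to row permutation.

```python
import numpy as np
from itertools import combinations

def gca_submatrix(p, n_blocks):
    """Sum-to-zero GCA submatrix for a complete p-parent half-diallel.

    Hypothetical helper; crosses are listed as (1,2), (1,3), ..., (p-1,p)
    within each block, which is an assumed ordering.
    """
    crosses = list(combinations(range(1, p + 1), 2))
    over = np.zeros((len(crosses), p))
    for row, (j, k) in enumerate(crosses):
        over[row, j - 1] = 1.0   # GCA of one parent of the cross
        over[row, k - 1] = 1.0   # GCA of the other parent
    # (1) add minus one to the other columns wherever gca_p has a one,
    # (2) delete the gca_p column.
    one_block = over[:, :-1] - over[:, [-1]]
    # Stack the per-block submatrix on itself b times.
    return crosses, np.tile(one_block, (n_blocks, 1))

crosses, gca = gca_submatrix(p=4, n_blocks=2)

# Horizontal direct product of the first two GCA columns:
# an element-by-element product giving an n x 1 vector.
hdp = gca[:, 0] * gca[:, 1]
print(crosses)
print(gca[:6])
print(hdp[:6])
```

In this sketch the row for cross 1 x 4 is (0, -1, -1), since $gca_4 = -(gca_1 + gca_2 + gca_3)$ cancels the one in the $gca_1$ column.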


The SCA submatrix can be formulated from the horizontal direct products of the columns of the GCA submatrix in $X_s$ (sum-to-zero figure). The results from the GCA columns require manipulation to become the SCA submatrix (since degrees of freedom for SCA do not equal those of an interaction for a half-diallel analysis), but the GCA column products provide a convenient starting point. The column of the SCA submatrix representing the cross between the $j$th and the $k$th parents ($SCA_{jk}$) is formed as the product between the $GCA_j$ and $GCA_k$ columns. The GCA columns are multiplied in this order: the first GCA column times the second, forming the first SCA column; the first times the third, forming the second SCA column; and the second times the third, forming the third SCA column.

With four parents (six crosses) there are three degrees of freedom for GCA (p-1) and two degrees of freedom for SCA (6 crosses - 3 for GCA - 1 for the mean). Since SCA has only two degrees of freedom, a sum-to-zero design matrix can have only two columns for SCA. Imposing the restriction that the sum of the SCA's across all parents equals zero is equivalent to making the last column for the SCA submatrix a linear combination of the others. The procedure for deleting the third column product is identical to that for the GCA submatrix: add minus one to every element in the rows of the remaining SCA columns in which a one appears in the column which is to be deleted (sum-to-zero figure, columns 6 and 7). The number of rows in the SCA submatrix equals the number of observations in a block, and the submatrix must be vertically concatenated b times to create the SCA submatrix for a site.

An algebraic evaluation of SCA sum-to-zero restrictions requires that $\sum_j sca_{jk} = 0$ for each k and that $\sum_j \sum_k sca_{jk} = 0$; thus for observations in the $i$th block, with i serving to denote the row of the SCA submatrix in block i, $sca_{14} = -sca_{12} - sca_{13}$, and the entries in the submatrix row for $y_{i14}$ are -1's. The estimate for $sca_{23}$ equals $sca_{14}$, because $sca_{23}$ is the negative of the sum of the independently estimated SCA's ($sca_{12}$ and $sca_{13}$) from the restriction that the sum of the SCA's across all parents equals zero. Similarly, by sum-to-zero definition $sca_{24} = -sca_{12} - sca_{23}$, and by substitution $sca_{24} = -sca_{12} - (-sca_{12} - sca_{13}) = sca_{13}$. By the same protocol it can be shown that $sca_{34} = sca_{12}$. The elements in the rows of the SCA submatrix are 1's, -1's and 0's in accordance with the algebraic evaluation. Thus, while it may seem that there should be 6 SCA values (one for each cross), only 2 can be independently estimated and the remaining 4 are linear combinations of the independently estimated SCA's. Again, the SCA sum-to-zero estimates are not equal to the parametric population SCA's. An analogous illustration for SCA to that for GCA would show that the estimable function (linear combination of observations) for a given sum-to-zero SCA contains a variety of other parameters.

[Figure: formation of the SCA submatrix for the four-parent example; column headings are OBS, the three GCA x GCA column products, and the two resulting sum-to-zero SCA columns.]

independently using the pertinent submatrix as long as there are no missing cell means (plots) and no missing crosses; this uses a property known as orthogonality. Orthogonality requires that the dot product between two vectors equals zero (Schneider). The dot product (a scalar) is the sum of the values in a vector obtained from the horizontal direct product of two vectors. For two factors to be orthogonal, the dot products of all the column vectors making up the section of the design matrix for one factor with the column vectors making up the portion of the design matrix for the second must be zero. If all factors in the model are orthogonal, then the $X'X$ matrix is block diagonal. A block-diagonal $X'X$ matrix is composed of square factor submatrices (degrees of freedom x degrees of freedom) along the diagonal, with all off-diagonal elements not in one of the square factor submatrices equalling zero. A property of block-diagonal matrices is that the inverse can be calculated by inverting each block separately and replacing the original block in the full $X'X$ matrix by the inverted block. Because the blocks can be inverted separately and all other off-diagonal elements of the inverse are zero, the effects for factors which are orthogonal to all other factors may be estimated separately, i.e., there are no functions of other sum-to-zero factors in the sum-to-zero estimates.

Mean, block, GCA and SCA parameters. All parameters are estimated simultaneously by horizontally concatenating the mean, block, GCA and SCA matrices to create $X_s$. The OLS equation is again utilized to solve the system of equations. The b vector for the four-parent example is an estimate of $\beta_s$ of the sum-to-zero figure. Again, one parameter is estimated for each column in the $X_s$ matrix, and all parameter estimates not present are linear combinations of the parameter estimates in the b vector. So $b_b$ is equal to $-\sum_{i=1}^{b-1} b_i$ and $gca_p$ is equal to $-\sum_{j=1}^{p-1} gca_j$. The linear combinations for SCA effects can be obtained by reading along the row of the SCA submatrix associated with the observation containing the parameter, e.g., the observation $y_{i14}$ contains the effect $sca_{14}$, which is estimated as the linear combination $-sca_{12} - sca_{13}$.

This completes the estimation of fixed effect parameters from a data set which is balanced on a plot-mean basis. Since field data sets with such completeness are a rarity in forestry applications, the next step is OLS analysis for various types of data imbalance. Calculations of solutions based on a complete data set and simulated data sets with common types of imbalance are demonstrated in numerical examples.

Numerical Examples

The data set analyzed in the numerical examples is from a five-year-old, six-parent half-diallel slash pine (Pinus elliottii var. elliottii Engelm.) progeny test planted on a single site in four complete blocks. Each cross is represented by a five-tree row plot within each block. Total height in meters and diameter at breast height (dbh, in centimeters) are the traits selected for analysis. The data set is presented in the data table so that the reader may reconstruct the analysis and compare answers with the examples. The numbers 1 through 6 were arbitrarily assigned to the parents for analysis. Because of unequal survival within plots, plot means are used as the unit of observation.

Balanced Data (Plot-mean Basis)

The sum-to-zero design matrix for the balanced data set has 60 rows (4 blocks x 15 crosses, which equals the number of observations in y) and has the following columns: one column for $\mu$, three columns for blocks (b-1), five columns for GCA (p-1), and nine columns for SCA (15 crosses - 5 - 1), for a total of 18 columns. With sixty plot means (degrees of freedom) and 18 degrees of freedom in the model, subtracting 18 from 60 yields 42 degrees of freedom for error, which matches the degrees of freedom for cross by block interaction, thus verifying that degrees of freedom concur with the number of columns in the sum-to-zero design matrix.

To illustrate the principle of orthogonality in the balanced case, the $X'X$ and $(X'X)^{-1}$ matrices may be printed to show that they are block diagonal. In further illustration, the effects within a factor may also be estimated without any other factors in the design matrix and compared to the estimates from the full design matrix. The vectors of parameter estimates for height and dbh (results table) were calculated from the same $X_s$ matrix because height and dbh measurements were taken on the same trees. In other words, if a height measurement was taken on a tree, a dbh measurement was also taken, so the design matrices are equivalent.

Missing Plot

To illustrate the problem of a missing plot, the cross parent two by parent three was arbitrarily deleted in block one (as if observation $y_{123}$ were missing). This deletion prompts adjustments to the factor matrices in order to analyze the new data set. The new vector of observations (y) now has 59 rows. This necessitates deletion of the row of the design matrix ($X_s$) in block 1 which would have been associated with cross 2 x 3. This is the only matrix alteration required for the analysis. Thus the resultant $X_s$ matrix has 59 rows and 18 columns. With 59 means in y and 18 columns in $X_s$, the degrees of freedom for error is 41. Comparisons between results of the analyses (results table) of the full data set and the data set missing observation $y_{123}$ reveal that, for this case, the estimates of parameters have been relatively unaffected by the imbalance (magnitudes of GCA's changed only slightly and rankings by GCA were unaffected).
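The balanced and missing-plot cases can be checked numerically with a short NumPy sketch. Several things here are assumptions rather than the chapter's own code: the helper names, the within-block cross order, and the extension of the four-parent SCA rule to p parents (products of all pairs of the p-1 GCA columns, with the last product removed by a sum-to-zero step), which does reproduce the nine SCA columns counted in the text. The response y is synthetic and noise-free, because the values of the data table did not survive extraction.

```python
import numpy as np
from itertools import combinations

def sum_to_zero(cols):
    """Add minus one to the other columns where the last has a one, then drop it."""
    return cols[:, :-1] - cols[:, [-1]]

def design_matrix(p, n_blocks):
    """Sum-to-zero X_s for a complete p-parent half-diallel in n_blocks blocks."""
    crosses = list(combinations(range(1, p + 1), 2))
    c = len(crosses)
    over = np.zeros((c, p))
    for row, (j, k) in enumerate(crosses):
        over[row, [j - 1, k - 1]] = 1.0
    gca = sum_to_zero(over)
    # Horizontal direct products of every pair of GCA columns.
    products = np.column_stack(
        [gca[:, j] * gca[:, k] for j, k in combinations(range(p - 1), 2)]
    )
    sca = sum_to_zero(products)
    block = sum_to_zero(np.kron(np.eye(n_blocks), np.ones((c, 1))))
    mean = np.ones((n_blocks * c, 1))
    X = np.hstack([mean, block, np.tile(gca, (n_blocks, 1)),
                   np.tile(sca, (n_blocks, 1))])
    sizes = [1, n_blocks - 1, p - 1, sca.shape[1]]
    return crosses, X, sizes

def max_off_block(XtX, sizes):
    """Largest absolute element of X'X outside the factor diagonal blocks."""
    edges = np.cumsum([0] + sizes)
    inside = np.zeros(XtX.shape, dtype=bool)
    for lo, hi in zip(edges[:-1], edges[1:]):
        inside[lo:hi, lo:hi] = True
    return np.abs(XtX[~inside]).max()

crosses, X, sizes = design_matrix(p=6, n_blocks=4)
balanced_off = max_off_block(X.T @ X, sizes)       # zero: block diagonal

# Missing plot: delete the row for cross 2 x 3 in block 1.
row = crosses.index((2, 3))
X_miss = np.delete(X, row, axis=0)
missing_off = max_off_block(X_miss.T @ X_miss, sizes)
error_df = X_miss.shape[0] - np.linalg.matrix_rank(X_miss)

# Synthetic noise-free response, used only to show the OLS solve.
rng = np.random.default_rng(0)
beta = rng.normal(size=X.shape[1])
y = X_miss @ beta
b = np.linalg.solve(X_miss.T @ X_miss, X_miss.T @ y)

print(X.shape, sizes, balanced_off)
print(X_miss.shape, error_df, missing_off)
```

Under these assumptions the balanced matrix is 60 x 18 with a block-diagonal $X'X$, and deleting the single row leaves 59 rows, 18 columns and 41 error degrees of freedom, while the factors are no longer mutually orthogonal.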


Table. Data set for numerical examples. Five-year-old slash pine progeny test with a 6-parent half-diallel mating design present on a single site with four randomized complete blocks and a five-tree row plot per cross per block.

Columns: Block; Female; Male; Mean Height (meters); Mean DBH (centimeters); Within-Plot Variance, Height (m^2); Within-Plot Variance, DBH (cm^2); Trees per Plot.


Table. Numerical results for examples of data imbalance using the OLS techniques presented in the text.

Columns: Estimate (e); Balanced (a); Missing Plot (b); Missing Cross (c); Five Missing Crosses (d); each example with Height and DBH subcolumns. Rows: $\mu$, the block effects, the GCA's and the independently estimated SCA's.

(a) Where (numerical examples are for height) $b_4 = -\sum_i b_i$, $gca_6 = -\sum_j gca_j$, $sca_{jk} = -\sum sca_{jk}$ for j or k = p and p = 6, and the last SCA among parents 1 through 5 is the negative of the sum of the 9 independently estimated SCA's.
(b) Where the linear combinations for parameter estimates are identical to the balanced example.
(c) Where $sca_{jk} = -\sum sca_{jk}$ for j or k = p and p = 6, and the last SCA is the negative of the sum of the 8 independently estimated SCA's.
(d) Where the remaining SCA's are linear combinations of the independently estimated SCA's, and the last SCA is the negative of the sum of the four independently estimated SCA's.
(e) Where for all cases linear combinations for block and gca are the same as in the balanced case.


Missing Cross

Another common form of imbalance in diallel data sets, the missing cross, is examined through arbitrary deletion of a single cross from all blocks, i.e., the four plot means for that cross are missing in the data vector. This type of imbalance is representative of a particular cross that could not be made and is therefore missing from all blocks. The matrix manipulations required for this analysis are again presented by factor.

For appropriate SCA restrictions, the data vector and design matrix should be ordered so that the $p$th parent has no missing crosses. Since the labeling of a parent as parent p is entirely subjective, any parent with all crosses may be designated as parent p. The previous labelling directions are necessary since we generate the SCA submatrix as horizontal direct products of the columns of the GCA submatrix and, to account for missing crosses, the horizontal direct product for each particular missing parental combination is not calculated, which sets the missing SCA's to zero. If there is a cross missing from those of the $p$th parent, we cannot account for the missing cross with this technique (Searle).

For the mean, block and GCA submatrices, the adjustment for the missing cross dictates deleting the rows in the submatrices which would have corresponded to the observations on the missing cross. The SCA submatrix must be re-formed since a degree of freedom for SCA, and hence a column of the submatrix, has been lost. The SCA submatrix is reinstituted from the GCA horizontal direct products (remembering that one cross no longer exists and therefore that the GCA x GCA product for that cross is inappropriate). Dropping the column for the SCA of the missing cross is equivalent to setting that SCA to zero (Searle) so that the remaining SCA's will sum to zero. After that, the re-formation is according to the established pattern.

With one missing cross there are now 56 observations and hence 56 degrees of freedom available. The columns of the $X_s$ matrix are now one for the mean, three for block, five for GCA and eight for SCA, for a total of 17 columns. The remaining degrees of freedom for error is 39, matching the correct degrees of freedom ((14-1) x (4-1) = 39).

For the missing cross example, $\hat{\mu}$ is no longer equivalent to the mean of the plot means, $\sum_{ijk} y_{ijk}/N$ (where N = number of plot means). This is the result of GCA effects which are no longer orthogonal to the mean. Check the $X'X$ matrix, or try estimating factors separately and compare to the estimates when all factors are included in X.

If formulae for balanced data (Becker; Falconer; Hallauer and Miranda) are applied to unbalanced data (plot-mean basis), estimates of parameters are no longer appropriate because factors in the model are no longer independent (orthogonal). Applying Becker's formula, which uses totals of cross means for a site ($\bar{y}_{.jk}$), to the missing cross example yields GCA estimates for the six parents that are very different in magnitude from those in the results table for this example, and one of them also has a different sign. Employing these formulae in the analysis of unbalanced data is analogous to matrix estimation of GCA's without the other factors in the model, which is inappropriate.

Several Missing Crosses

The concluding example (results table) is a drastically unbalanced data set resulting from the arbitrary deletion of five crosses. The matrix manipulation for this example is an extension of the previous one-cross deletion example. Rows corresponding to the five deleted crosses are deleted from the mean, block and GCA submatrices for all blocks. The SCA matrix, now 4 columns (10 crosses - 6 degrees of freedom), is again re-formed with only the relevant products of the GCA columns. Counting degrees of freedom (columns of the sum-to-zero design matrix), the mean has one, block has three, GCA has five and SCA has four degrees of freedom, for a total of 13. Error has 27 ((10-1)(4-1)) degrees of freedom. Totaling degrees of freedom for modeled effects and error yields 40, which equals the number of plot means.

In increasingly unbalanced cases (results table), the spread among the GCA estimates tends to increase with increasing imbalance (loss of information). This is a general feature of OLS analyses, and the basis for the feature is that the spread among the GCA estimates is due both to the innate spread from additive genetic effects and to the error in estimation of the GCA's. When there is less information, GCA estimates tend to be more widely spread due to the increase in the error variance associated with their estimation. This feature has been noted (White and Hodge) as the tendency to pick as parental winners individuals in a breeding program which are the most poorly tested.

Discussion

After developing the OLS analysis and describing the inherent assumptions of the analysis, there are four important factors to consider in the interpretation of sum-to-zero OLS solutions: (1) the lack of uniqueness of the parameter estimates; (2) the weights given to plot means ($y_{ijk}$) and in turn site means ($\bar{y}_{.jk}$) for crosses in data sets with missing crosses in parameter estimation; (3) the arbitrary nature of using a diallel mean (perforce a narrow genetic base) as the mean about which the GCA's sum to zero; and (4) the assumption that the covariance matrix for the observations (V) is $I\sigma^2_e$.

Uniqueness of Estimates

Sum-to-zero restrictions furnish what would appear to be unique estimates of the individual parameters, e.g., $GCA_1$, when in fact these individual parameters are not estimable (Graybill; Freund and Littell; Milliken and Johnson). The lack of estimability is again analogous to attempting to solve a set of equations in n unknowns with t equations, where n is greater than t. Therefore an infinite number of solutions exist for $\beta_o$. There are quantities in this system of equations that are unique (estimable), i.e., the estimate is invariant regardless of the restriction (sum-to-zero or set-to-zero) or generalized inverse (no restrictions) used (Milliken and Johnson), and the estimable functions include sum-to-zero GCA and SCA estimates, since they are linear combinations of the observations; but these estimable quantities do not estimate the individual parametric GCA's and SCA's of the overparameterized model, since there is no unique solution for those parameters.

Weighting of Plot Means and Cross Means in Estimating Parameters

With at least one measurement tree in each plot and with plot means as the unit of observation, use of the matrix approach produces the same results as the basic formulae. The weight placed on each plot mean in the estimation of a parameter can be determined by calculating $(X'X)^{-1}X'$, which can be viewed as a matrix of weights W, so that the OLS equation can be written as $b = Wy$. The matrix W has these dimensions: the number of rows equals the number of parameters in $\beta_s$ and the number of columns equals the number of plot means in y. The $i$th row of W contains the weights applied to y to estimate the $i$th parameter in b ($b_i$). In the discussion which follows, $gca_1$ is utilized as $b_i$. If there are no missing plots, the cross mean in every block ($y_{ijk}$) has the same weighting, and weights can be combined across blocks to yield the weight on the overall cross mean ($\bar{y}_{.jk}$). It can be shown that for the balanced numerical example $gca_1$ is calculated by weighting the overall cross means containing parent 1 by 1/6 and weighting all overall cross means not containing parent 1 by -1/12. The weights figure (above the diagonal) demonstrates the weightings on the overall cross means for the balanced numerical example as well as the marginal weighting on the GCA parameters. These marginal weightings are obtained by summing along a row and/or column as one would to obtain the marginal totals for a parent (Becker).

[Figure body: a parent-by-parent array of weights for GCA 1 through GCA 6, with deleted crosses marked "missing" and marginal weights in the right margin.]

Figure. Weights on overall cross means ($\bar{y}_{.jk}$) for the three numerical examples for estimation of $GCA_1$. The weights for the balanced example (above the diagonal) are presented in both fractional and decimal form. The weights for the one-cross missing and the five-crosses missing examples are presented as the upper number and lower number, respectively, in cells below the diagonal. The marginal weights on GCA parameters (right margin) do not change although cells are missing.

One feature of sum-to-zero solutions is that these marginal weightings will be maintained no matter the imbalance due to missing crosses, as will be seen by considering the numerical examples for a missing cross (weights figure, below the diagonal, upper number) and five missing crosses (weights figure, below the diagonal, lower number). The marginal weights have remained the same as in the balanced case, while the weights on the cross means differ among the crosses containing parent 1 and also among the crosses not containing parent 1. In the five missing crosses example, two crosses not containing parent 1 even receive a positive weighting where in the prior examples they had negative weighting. The expected value in all three examples is $GCA_{1s}$ (s for sum-to-zero) despite the apparently nonsensical weightings to cross means with missing crosses; however, the evaluation of the estimates in terms of the original model changes with each new combination of missing cells, i.e., those two cross means have a positive weight in the five missing crosses example in $GCA_1$ estimation. Whether this type of estimation is desirable with missing cell (cross) means has been the subject of some discussion (Speed, Hocking and Hackney; Freund; Milliken and Johnson). The data analyst should be aware of the manner in which sum-to-zero treats the data with missing cell means and decide whether that particular linear combination of cross means estimating the parameter is one of interest, realizing that the meaning of the estimates in terms of the original model is changing.
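The weight matrix $W = (X'X)^{-1}X'$ and the stability of the marginal weights can be explored with the sketch below. It is a hedged illustration, not the chapter's computation: it works directly on overall cross means (one row per cross, which assumes every surviving cross is present in every block so that block columns are orthogonal to the rest), the helper names are hypothetical, and the deleted cross (1 x 2) is chosen arbitrarily because the cross actually deleted in the text could not be recovered.

```python
import numpy as np
from itertools import combinations

def sum_to_zero(cols):
    """Add minus one to the other columns where the last has a one, then drop it."""
    return cols[:, :-1] - cols[:, [-1]]

def cross_mean_design(p, missing=()):
    """Sum-to-zero design on overall cross means; parent p must have all its crosses."""
    crosses = [c for c in combinations(range(1, p + 1), 2) if c not in missing]
    over = np.zeros((len(crosses), p))
    for row, (j, k) in enumerate(crosses):
        over[row, [j - 1, k - 1]] = 1.0
    gca = sum_to_zero(over)
    # Products are formed only for crosses that exist among parents 1..p-1.
    pairs = [(j, k) for j, k in combinations(range(1, p), 2)
             if (j, k) not in missing]
    products = np.column_stack(
        [gca[:, j - 1] * gca[:, k - 1] for j, k in pairs]
    )
    X = np.hstack([np.ones((len(crosses), 1)), gca, sum_to_zero(products)])
    return crosses, X

def gca1_weights(p, missing=()):
    """Weights on cross means for gca_1 and the marginal weight per parent."""
    crosses, X = cross_mean_design(p, missing)
    W = np.linalg.solve(X.T @ X, X.T)          # (X'X)^-1 X'
    w = W[1]                                   # row 0 is the mean, row 1 is gca_1
    marginal = np.array([
        sum(w[i] for i, c in enumerate(crosses) if parent in c)
        for parent in range(1, p + 1)
    ])
    return dict(zip(crosses, w)), marginal

w_bal, m_bal = gca1_weights(6)
w_mis, m_mis = gca1_weights(6, missing=[(1, 2)])   # hypothetical deleted cross

print(w_bal[(1, 2)], w_bal[(2, 3)])
print(m_bal)
print(m_mis)
```

In the balanced case the weights come out as 1/6 for crosses containing parent 1 and -1/12 otherwise, with marginal weights of 5/6 on parent 1 and -1/6 on each other parent; the marginal weights are unchanged after the cross is deleted, even though the individual cross weights shift.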


Diallel Mean

The use of the mean for a half-diallel as the mean around which GCA's sum to zero is not satisfactory, in that the diallel mean is the mean of a rather narrow genetically based population and, in particular, that the comparisons of interest are not usually confined to the specific parents in a specific diallel on a particular site. A checklot can be employed to represent a base population against which comparison of half- or full-sib families can be made to provide for comparison of GCA estimates from other tests (van Buijtenen and Bridgwater).

Mathematically, when effects are forced to sum to zero around their own mean, the absolute value of the GCA's is reflective of their value relative to the mean of the group. Even if the parents involved in the particular diallel were all far superior to the population mean for GCA, GCA's calculated on an OLS basis would show that some of these GCA's were negative. If the GCA's of the diallel parents were in fact all below the population mean, the opposite and equally undesirable result ensues. For disconnected diallels together on a single site, an OLS analysis would yield GCA estimates that sum to zero within each diallel, since parents are nested within diallels. Unless the comparisons of interest are only in the combination of the parents in a specific diallel on a specific site, the checklot alternative is desirable.

A method for obtaining the desired goal of comparable GCA's from disconnected experiments, disregarding the problem of heteroscedasticity, is to form a function from the data which yields GCA estimates properly located on the number scale. Such a function can be formed (using $GCA_1$ as an example) from $gca_{1s}$, the diallel mean and the checklot mean. From expectations of the scalar linear model:

$GCA_{1s} = \frac{p-1}{p}GCA_1 - \frac{1}{p}\sum_{j=2}^{p} GCA_j + \frac{1}{p}\sum_{k=2}^{p} SCA_{1k} - \frac{2}{p(p-2)}\sum_{j=2}^{p-1}\sum_{k=j+1}^{p} SCA_{jk}$

$E\{\text{diallel mean}\} = \mu + \sum_{i=1}^{b} B_i/b + \frac{2}{p}\sum_{j=1}^{p} GCA_j + \frac{2}{p(p-1)}\sum_{j=1}^{p-1}\sum_{k=j+1}^{p} SCA_{jk}$

and

$E\{\text{checklot mean}\} = \mu + \sum_{i=1}^{b} B_i/b + \tau$

where j for GCA is j or k, and $\tau$ represents the fixed genetic parameter of the checklot. The function used to properly locate $GCA_1$ (the subscript rel denotes the relocated $GCA_1$) is

$gca_{1rel} = gca_{1s} + \frac{1}{2}(\text{diallel mean} - \text{checklot mean})$

The expectation of $gca_{1rel}$ with negligible SCA is $GCA_{1rel} = GCA_1 - \tau/2$, and since breeding value equals twice GCA, $BV_{1rel} = BV_1 - \tau$. If SCA is non-negligible, then the expectation is

$GCA_{1rel} = GCA_1 + \frac{1}{p-1}\sum_{k=2}^{p} SCA_{1k} - \frac{1}{(p-1)(p-2)}\sum_{j=2}^{p-1}\sum_{k=j+1}^{p} SCA_{jk} - \tau/2$

In either case, the function provides a reasonable manner by which GCA estimates from disconnected diallels are centered at the same location on a number scale and are then comparable.

Variance and Covariance of Plot Means

The variances of plot means with unequal numbers of trees per plot are by definition unequal, i.e., $Var(\bar{y}_{ijk}) = \sigma^2_p + \sigma^2_w/n_{ijk}$, where $\sigma^2_p$ is plot variance, $\sigma^2_w$ is the within-plot variance, and $n_{ijk}$ is the number of observations per plot. Also, if blocks were considered random there would be an additional source of variance for plot means due to blocks (as well as a covariance between plot means in the same block), and this could be incorporated into the V matrix with $Var(\bar{y}_{ijk}) = \sigma^2_b + \sigma^2_p + \sigma^2_w/n_{ijk}$. Since the variances of the means in the observation vector are not equal, and there is a covariance between the means if blocks are being considered random, best linear unbiased estimates (BLUE) would be secured by weighting each mean by its true associated variance (Searle). This is the generalized least squares (GLS) approach, as

$b = (X'V^{-1}X)^{-1}X'V^{-1}y$
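A minimal sketch of the GLS solution follows, under stated assumptions: the toy design (a mean plus one two-level sum-to-zero factor), the plot means, the tree counts and the variance components are all made-up values chosen only to show the mechanics, and blocks are treated as fixed so that V is diagonal with $Var(\bar{y}_{ijk}) = \sigma^2_p + \sigma^2_w/n_{ijk}$.

```python
import numpy as np

def gls(X, y, V):
    """b = (X'V^-1 X)^-1 X'V^-1 y, solved without forming V^-1 explicitly."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)

# Hypothetical toy data: mean column plus one sum-to-zero factor column.
X = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [1.0, 1.0],
              [1.0, -1.0]])
y = np.array([10.0, 8.0, 14.0, 8.0])       # made-up plot means
n_trees = np.array([1, 5, 5, 5])           # made-up trees per plot
var_plot, var_within = 0.5, 2.0            # assumed variance components

# Diagonal V: plot variance plus within-plot variance over trees per plot.
V = np.diag(var_plot + var_within / n_trees)

b_ols = gls(X, y, np.eye(len(y)))          # V = I reproduces OLS
b_gls = gls(X, y, V)
print(b_ols, b_gls)
```

With V equal to the identity the function returns the OLS solution; with the unequal variances, the plot mean based on a single tree is discounted, so the two solutions differ.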


The GLS approach relaxes the OLS assumptions of equal variance of, and no covariance between, the observations (plot means) while still treating genetic parameters as fixed effects. The entries along the diagonal of the V matrix are the variances of the plot means, $Var(\bar{y}_{ijk})$, in the same order as the means in the data vector. The off-diagonal elements of V would be either 0 or $\sigma^2_b$ (the variance due to the random variable block) for elements corresponding to observations in the same block. BLUE requires exact knowledge of V; if estimates of $\sigma^2_p$, $\sigma^2_b$, and $\sigma^2_w$ are utilized in the V matrix, estimable functions of $\beta$ approximate BLUE.

The OLS assumption that SCA and GCA are fixed effects can also be relaxed to allow for covariances due to genetic relatedness. In particular, the information that means are from the same half- or full-sib family could be included in the V matrix. Relaxation of the zero covariance assumption implies that GCA and SCA are random variables. If GCA and SCA are treated as random variables, then the application of best linear prediction (BLP) or best linear unbiased prediction (BLUP) to the problem would be more appropriate (White and Hodge). The treatment of the genetic parameters as random variables is consistent with that used in estimating genetic correlations and heritabilities.

The V matrix of such an application would include, in addition to the features of the GLS V matrix, the covariance between full-sib or half-sib families added to the off-diagonal elements in V; i.e., if the first and second plot means in the data vector had a covariance due to relationship, then that covariance is inserted twice in the V matrix. The covariance would appear as the second element in the first row and the first element in the second row of V (V is a symmetric matrix). Also, the diagonal elements of V would increase by $2\sigma^2_{gca}$ (the variance due to treating GCA as a random variable) plus $\sigma^2_{sca}$ (the variance due to treating SCA as a random variable).
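The structure of V when block, GCA, and SCA are all treated as random can be sketched by summing incidence-matrix products. This is only an illustration under stated assumptions: a single test, a four-parent half-diallel in two blocks, and made-up variance components; the helper `incidence` and all variable names are hypothetical.

```python
import itertools
import numpy as np

def incidence(labels, n_levels):
    """n x n_levels 0/1 matrix with a 1 where row r has level labels[r]."""
    Z = np.zeros((len(labels), n_levels))
    Z[np.arange(len(labels)), labels] = 1.0
    return Z

n_parents, n_blocks = 4, 2
crosses = list(itertools.combinations(range(n_parents), 2))  # 6 crosses, no selfs
rows = [(blk, c) for blk in range(n_blocks) for c in range(len(crosses))]
block_id = [blk for blk, _ in rows]
cross_id = [c for _, c in rows]

Z_B = incidence(block_id, n_blocks)          # block incidence
Z_S = incidence(cross_id, len(crosses))      # SCA (cross) incidence
Z_G = np.zeros((len(rows), n_parents))       # GCA incidence: two 1's per row
for r, c in enumerate(cross_id):
    Z_G[r, list(crosses[c])] = 1.0

# Hypothetical variance components (block, gca, sca, plot, within-plot).
var_b, var_gca, var_sca, var_p, var_w = 0.5, 1.0, 0.25, 0.5, 3.0
n_trees = np.full(len(rows), 6.0)
n_trees[0] = 3.0                             # one plot with poorer survival

V = (var_b * Z_B @ Z_B.T
     + var_gca * Z_G @ Z_G.T
     + var_sca * Z_S @ Z_S.T
     + np.diag(var_p + var_w / n_trees))
```

Each diagonal element then carries $2\sigma^2_{gca} + \sigma^2_{sca}$ on top of the block, plot, and within-plot terms, plot means of crosses sharing one parent covary by $\sigma^2_{gca}$, and plot means of the same cross in different blocks covary by $2\sigma^2_{gca} + \sigma^2_{sca}$.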


Comparison of Prediction and Estimation Methodologies

Which methodology (OLS, GLS, BLP, or BLUP) to apply to individual data bases is somewhat a subjective decision. The decision can be based both on the computational or conceptual complexity of the method and on the magnitude of the data base with which the analyst is working. To aid in this decision, this discussion highlights the differences in the inherent properties and assumptions of the techniques.

For all practical purposes the answers from the four techniques will never be equal; however, there are two caveats. First, OLS estimates equal GLS estimates if all the cell means are known with the same precision (variance) (Searle). Otherwise, GLS discounts the means that are known with less precision in the calculations, and different estimates result. The second caveat is that if the amount of data is infinite, i.e., all cross means are known without error, then all four techniques are equivalent (White and Hodge). In all other cases BLP and BLUP shrink predictions toward the location parameter(s) and produce predictions which are different from OLS or GLS estimates, even with balanced data.

During calculations GLS, BLP, and BLUP place less weight on observations known with less precision, which is intuitively pleasing. With OLS and GLS, forest geneticists treat GCA's and SCA's as fixed effects for estimation and then as random variables for genetic correlations and heritabilities. BLP and BLUP provide a consistent treatment of GCA's and SCA's as random variables while differing in their assumptions about location parameters (fixed effects). In BLP, fixed effects are assumed known without error (although they are usually estimated from the data), while with BLUP, fixed effects are estimated using GLS. BLP and BLUP techniques also contain the assumption that the covariance matrix of the observations is known without error (most often variances must be estimated). In many BLUP applications (Henderson) mixed model equations are utilized


iteratively to estimate fixed effects and to predict random variables from a data set. A BLUP treatment of fixed effects allows any connectedness between experiments to be utilized in the estimation of the fixed effects. This provides an intuitive advantage of BLUP over BLP in experimentation where connectedness among genetic experiments is available, or where the data are so unbalanced that treating the fixed effects as known is less desirable than a GLS estimate of the fixed effects.

An ordering of computational complexity and conceptual complexity, from least to most complex, of the four methods is OLS, GLS, BLP, and BLUP. The latter three methods require the estimation of the covariance matrix of the observations, either separately (a priori) or iteratively with the fixed effects. Precise estimation of the covariance matrix for observations requires a great number of observations, and the precision of GLS, BLP, and BLUP estimations or predictions is affected by the error of estimation of the components of V.

Selection of a method can then be based on weighing the computational complexity and the size of the available data base against the advantages offered by each method. Thus, if complexity of the computational problem is of paramount concern, the analyst necessarily would choose OLS. With a small data base (one that does not allow reasonable estimates of variances), the analyst would again choose OLS. With a large data base and no qualms with computational complexity, the analyst can choose between BLP and BLUP based on whether there is sufficient connectedness or imbalance among the experiments to make BLUP advantageous.

Conclusions

Methods of solving for GCA and SCA estimates for balanced (plot-mean basis) and unbalanced data have been presented along with the inherent assumptions of the analysis. The use of plot means and the matrix equations will produce sum-to-zero OLS estimates for GCA and


SCA for all types of imbalance. Formulae in the literature which yield OLS solutions for balanced data can yield misleading solutions for unbalanced data because of the loss of orthogonality and also because the weightings on site means for crosses (or totals) are constants.

GCA's and SCA's obtained through the sum-to-zero restriction are not truly estimates of parametric population GCA's and SCA's. There are an infinite number of solutions for GCA's and SCA's from the system of equations as a result of the overparameterized linear model.
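The non-uniqueness of solutions in an overparameterized model can be shown numerically. The sketch below is an illustration only: it assumes a four-parent half-diallel, ignores SCA, and uses made-up cross means; the variable names are hypothetical. It compares the minimum-norm least squares solution with the sum-to-zero solution obtained by moving along the null space of the design matrix.

```python
import itertools
import numpy as np

n_parents = 4
crosses = list(itertools.combinations(range(n_parents), 2))

# Overparameterized design: one column for the mean plus one per parent.
X = np.zeros((len(crosses), 1 + n_parents))
X[:, 0] = 1.0
for r, (j, k) in enumerate(crosses):
    X[r, 1 + j] = X[r, 1 + k] = 1.0

y = np.array([12.0, 11.0, 13.0, 9.0, 11.0, 10.0])   # hypothetical cross means

# One solution of the normal equations (minimum norm).
sol_min_norm = np.linalg.pinv(X) @ y

# X @ null_vec = 0, so adding any multiple gives another solution.
null_vec = np.array([2.0, -1.0, -1.0, -1.0, -1.0])
t = sol_min_norm[1:].sum() / n_parents
sol_sum_to_zero = sol_min_norm + t * null_vec        # GCA solutions now sum to zero

fitted_a = X @ sol_min_norm
fitted_b = X @ sol_sum_to_zero
```

Both solutions give identical fitted values and identical differences between parents, but the individual GCA values differ, which is why only estimable functions (such as contrasts among parents) are interpretable without a restriction or a reference such as a checklot.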

CHAPTER
VARIANCE COMPONENT ESTIMATION TECHNIQUES COMPARED FOR TWO MATING DESIGNS WITH FOREST GENETIC ARCHITECTURE THROUGH COMPUTER SIMULATION

Introduction

In many applications of quantitative genetics, geneticists are commonly faced with the analysis of data containing a multitude of flaws (e.g., non-normality, imbalance, and heteroscedasticity). Imbalance, as one of these flaws, is intrinsic to quantitative forest genetics research because of the difficulty in making crosses for full-sib tests and the biological realities of long-term field experiments. Few definitive studies have been conducted to establish optimal methods for estimation of variance components from unbalanced data. Simulation studies using simple models (one-way or two-way random models) have been conducted for certain data structures, i.e., imbalance, experimental design, and variance parameters (Corbeil and Searle; Swallow; Swallow and Monahan; interpretations by Littell and McCutchan). The results from these studies indicate that technique optimality is a function of the data structure.

In practice (both historically and still commonplace), estimation of variance components in forest genetics applications has been achieved by using sequentially adjusted sums of squares as an application of Henderson's Method 3 (HM3, Henderson). Under normality and with balanced data, this technique has the desirable property of being the minimum variance unbiased estimator. If the data are unbalanced, then the only property retained by HM3 estimation is


unbiasedness (Searle). Other estimators have been shown to be locally superior to HM3 in variance or mean square error properties in certain cases (Klotz et al.; Olsen et al.; Swallow; Swallow and Monahan).

In recent years there has been a proliferation of variance component estimation techniques, including minimum norm quadratic unbiased estimation (MINQUE, Rao a), minimum variance quadratic unbiased estimation (MIVQUE, Rao b), maximum likelihood (ML, Hartley and Rao), and restricted maximum likelihood (REML, Patterson and Thompson). The practical application of these techniques has been impeded by their computational complexity. However, with continuing advances in computer technology and the appearance of better computational algorithms, the application of these procedures continues to become more tractable (Harville; Giesbrecht; Meyer). Whether these methods of analysis are superior to HM3 for many genetics applications remains to be shown.

With balanced data and disregarding negative estimates, all previously mentioned techniques except ML produce the same estimates (Harville). With unbalanced data, each technique produces a different set of variance component estimates. Criteria must then be adopted to discriminate among techniques. Candidate criteria for discrimination include unbiasedness (large number convergence on the parametric value), minimum variance (estimator with the smallest sampling variance), minimum mean square error (minimum of sampling variance plus squared bias; Hogg and Craig), and probability of nearness (probability that sample estimates occur in a certain interval around the parametric value; Pitman).

Negative estimates are also problematic in the estimation of variance components. Five alternatives for dealing with the dilemma of estimates less than zero (outside the natural parameter space of zero to infinity) are (Searle): 1) accept and use the negative estimate, 2) set the negative estimate to zero (producing biased estimates), 3) re-solve the system with the offending


component set to zero, 4) use an algorithm which does not allow negative estimates, and 5) use the negative estimate to infer that the wrong model was utilized.

The purpose of this research was to determine if the criteria of unbiasedness, minimum variance, minimum mean square error, and probability of nearness discriminated among several variance component estimation techniques while exploring various alternatives for dealing with negative variance component estimates. In order to make such comparisons, a large number of data sets were required for each experimental level. Using simulated data, this chapter compares variance component estimation techniques for plot-mean and individual observations, two mating systems (modified half-diallel and half-sib), and two sets of parametric variance components. Types of imbalance and levels of factors were chosen to reflect common situations in forest genetics.

Methods

Experimental Approach

For each experimental level, data sets were generated and analyzed by various techniques (see the table of estimation methods), producing numerous sets of variance component estimates for each data set. This workload resulted in enormous computational time being associated with each experimental level. The overall experimental design for the simulation was originally conceived as a factorial with two types of mating design (half-diallel and half-sib), two sets of true variance components (see the table of true variance components), two kinds of observations (individual and plot mean), and three types of imbalance: 1) two survival levels, one representing moderate survival and one representing poor survival; 2) for full-sib designs, three levels of missing crosses; and 3) for half-sib designs, two levels of connectedness among tests (the number of common families between tests out of the families per test). Because of the computational time


constraint, the experiment could not be run as a complete factorial, and the investigation continued as a partial factorial. In general, the approach was to run levels which were at opposite ends of the imbalance spectrum within a variance component level, i.e., the moderate survival level with no missing crosses versus the poor survival level with missing crosses. If results were consistent across these treatment combinations, intermediate levels were not run.

Table. Abbreviation for and description of variance component estimation methods utilized for analyses based on individual observations (if utilized for plot-mean analysis, the abbreviation is modified by prefixing a "P").

| Abbreviation | Description | Citation |
|---|---|---|
| ML, PML | Maximum Likelihood; estimates not restricted to the parameter space (individual and plot-mean analysis) | Hartley and Rao; Shaw |
| MODML | Maximum Likelihood; negative estimates set to zero after convergence (individual analysis) | Hartley and Rao |
| NNML | Maximum Likelihood; if negative estimates appeared at convergence, they were set to zero and the system re-solved (individual analysis) | Hartley and Rao; Miller |
| REML, PREML | Restricted Maximum Likelihood; estimates not restricted to the parameter space (individual and plot-mean analysis) | Patterson and Thompson; Shaw; Harville |
| MODREML | Restricted Maximum Likelihood; negative estimates set to zero after convergence (individual analysis) | Patterson and Thompson |
| NNREML, PNNREML | Restricted Maximum Likelihood; if negative estimates appeared at convergence, they were set to zero and the system re-solved (individual and plot-mean analysis) | Patterson and Thompson; Miller |
| MIVQUE, PMIVQUE | Minimum Variance Quadratic Unbiased; non-iterative with true (parametric) values of the variance components as priors (individual and plot-mean analysis) | Rao b |
| MINQUE, PMINQUE | Minimum Norm Quadratic Unbiased; non-iterative with ones as priors for all variance components (individual and plot-mean analysis) | Rao a |
| TYPE, PTYPE | Sequentially Adjusted Sums of Squares; Henderson's Method 3 (individual and plot-mean analysis) | Henderson |
| MIVPEN | MIVQUE with a penalty algorithm to prevent negative estimates (individual analysis) | Harville |


Designation of a treatment combination is by a five-character alphanumeric field. The first character is either H (half-sib) or D (half-diallel). The second character is a digit denoting the set of parametric variance components, one digit for each of the two heritability levels (see the table of true variance components). The third character is an S, indicating that the last two characters determine the imbalance level. The fourth character is a digit designating the survival level. The final character specifies the number of missing crosses (half-diallel) or the lack of connectedness (half-sib). Thus, a code beginning with H denotes a half-sib mating design, followed in order by the variance component set (heritability level), the survival level, and the number of common parents across tests.

Table. Sets of true variance components for the half-diallel and half-sib mating designs, generated from specification of two levels of single-tree heritability ($h^2$), type B correlation ($r_B$), and non-additive to additive variance ratio ($d/a$). Columns: genetic ratios (a) $h^2$, $r_B$, and $d/a$; mating design (full-sib or half-sib); and true variance components (b) $\sigma^2_t$, $\sigma^2_b$, $\sigma^2_g$, $\sigma^2_s$, $\sigma^2_{tg}$, $\sigma^2_{ts}$, $\sigma^2_p$, and $\sigma^2_w$, with NA entered where a component does not apply to the half-sib design.

a. $h^2 = 4\sigma^2_g / \sigma^2_{phenotypic}$, $r_B = \sigma^2_g / (\sigma^2_g + \sigma^2_{tg})$, and $d/a = \sigma^2_s / \sigma^2_g$.
b. See definitions in the linear model equations.

Experimental Design for Simulated Data

The mating design for the simulation was either a six-parent half-diallel (no selfs) or a fifteen-parent half-sib. The randomized complete block field design was in three locations (i.e., separate field tests) with four complete blocks per location and six trees per family in a block, where family is a full-sib family for the half-diallel design or a half-sib family for the half-sib design. This


field design and the mating designs reflect typical designs in forestry applications (Squillace; Wilcox et al.; Bridgwater et al.; Weir and Goddard; Loo-Dinkins et al.) and are also commonly used in other disciplines (Matzinger et al.; Hallauer and Miranda; Singh and Singh). The six trees per family could be considered as contiguous or non-contiguous plots without affecting the results or inferences.

Full-Sib Linear Model

The scalar linear model employed for half-diallel individual observations is

$$y_{ijklm} = \mu + t_i + b_{ij} + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm},$$

where
$y_{ijklm}$ is the $m$-th observation of the $kl$-th cross in the $j$-th block of the $i$-th test;
$\mu$ is the population mean;
$t_i$ is the random variable test location, ~ NID$(0, \sigma^2_t)$;
$b_{ij}$ is the random variable block, ~ NID$(0, \sigma^2_b)$;
$g_k$ is the random variable female general combining ability (gca), ~ NID$(0, \sigma^2_g)$;
$g_l$ is the random variable male gca, ~ NID$(0, \sigma^2_g)$;
$s_{kl}$ is the random variable specific combining ability (sca), ~ NID$(0, \sigma^2_s)$;
$tg_{ik}$ is the random variable test by female gca interaction, ~ NID$(0, \sigma^2_{tg})$;
$tg_{il}$ is the random variable test by male gca interaction, ~ NID$(0, \sigma^2_{tg})$;
$ts_{ikl}$ is the random variable test by sca interaction, ~ NID$(0, \sigma^2_{ts})$;
$p_{ijkl}$ is the random variable plot, ~ NID$(0, \sigma^2_p)$;
$w_{ijklm}$ is the random variable within-plot, ~ NID$(0, \sigma^2_w)$; and
there is no covariance between random variables in the model.

This linear model in matrix notation is

$$y = \mu 1 + Z_T e_T + Z_B e_B + Z_G e_G + Z_S e_S + Z_{TG} e_{TG} + Z_{TS} e_{TS} + Z_P e_P + e_W$$


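with dimensions $y$: $n \times 1$; $1$: $n \times 1$; $Z_T$: $n \times t$; $e_T$: $t \times 1$; $Z_B$: $n \times b$; $e_B$: $b \times 1$; $Z_G$: $n \times g$; $e_G$: $g \times 1$; $Z_S$: $n \times s$; $e_S$: $s \times 1$; $Z_{TG}$: $n \times tg$; $e_{TG}$: $tg \times 1$; $Z_{TS}$: $n \times ts$; $e_{TS}$: $ts \times 1$; $Z_P$: $n \times p$; $e_P$: $p \times 1$; and $e_W$: $n \times 1$, where
$y$ is the observation vector;
$Z_i$ is the portion of the design matrix for the $i$-th random variable;
$e_i$ is the vector of unobservable random effects for the $i$-th random variable;
$1$ is a vector of 1's; and
$n$, $t$, $b$, $g$, $s$, $tg$, $ts$, and $p$ are the numbers of observations, tests, blocks, gca's, sca's, test by gca interactions, test by sca interactions, and plots, respectively.

Utilizing customary assumptions in half-diallel mating designs (Griffing), the variance of an individual observation is

$$Var(y_{ijklm}) = \sigma^2_t + \sigma^2_b + 2\sigma^2_g + \sigma^2_s + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w,$$

and in matrix notation the covariance matrix for the observations is

$$Var(y) = Z_T Z_T'\sigma^2_t + Z_B Z_B'\sigma^2_b + Z_G Z_G'\sigma^2_g + Z_S Z_S'\sigma^2_s + Z_{TG} Z_{TG}'\sigma^2_{tg} + Z_{TS} Z_{TS}'\sigma^2_{ts} + Z_P Z_P'\sigma^2_p + I_n\sigma^2_w,$$

where $'$ indicates the transpose operator, all matrices of the form $Z_i Z_i'$ are $n \times n$, and $I_n$ is an $n \times n$ identity matrix.

Half-Sib Linear Model

The scalar linear model for half-sib individual observations is

$$y_{ijkm} = \mu + t_i + b_{ij} + g_k + tg_{ik} + ph_{ijk} + wh_{ijkm},$$

where
$y_{ijkm}$ is the $m$-th observation of the $k$-th half-sib family in the $j$-th block of the $i$-th test;
$\mu$, $t_i$, $b_{ij}$, $g_k$, and $tg_{ik}$ retain the definitions given for the full-sib model;
$ph_{ijk}$ is the random variable plot, containing different genotype by environment components than the corresponding term in the full-sib model, ~ NID$(0, \sigma^2_{ph})$;
$wh_{ijkm}$ is the random variable within-plot, containing different levels of genotypic and genotype by environment components than the corresponding term in the full-sib model,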


~ NID$(0, \sigma^2_{wh})$; and
there is no covariance between random variables in the model.

The matrix notation model is

$$y = \mu 1 + Z_T e_T + Z_B e_B + Z_G e_G + Z_{TG} e_{TG} + Z_P e_P + e_W,$$

with dimensions $y$: $n \times 1$; $1$: $n \times 1$; $Z_T$: $n \times t$; $e_T$: $t \times 1$; $Z_B$: $n \times b$; $e_B$: $b \times 1$; $Z_G$: $n \times g$; $e_G$: $g \times 1$; $Z_{TG}$: $n \times tg$; $e_{TG}$: $tg \times 1$; $Z_P$: $n \times p$; $e_P$: $p \times 1$; and $e_W$: $n \times 1$.

The variance of an individual observation in half-sib designs is

$$Var(y_{ijkm}) = \sigma^2_t + \sigma^2_b + \sigma^2_g + \sigma^2_{tg} + \sigma^2_{ph} + \sigma^2_{wh},$$

and

$$Var(y) = Z_T Z_T'\sigma^2_t + Z_B Z_B'\sigma^2_b + Z_G Z_G'\sigma^2_g + Z_{TG} Z_{TG}'\sigma^2_{tg} + Z_P Z_P'\sigma^2_{ph} + I_n\sigma^2_{wh}.$$

For an observational vector based on plot means, the plot and within-plot random variables were combined by taking the arithmetic mean across the observations within a plot. The resulting plot-means model has a new plot-level variance term (replacing $\sigma^2_p$ in the half-diallel model or $\sigma^2_{ph}$ in the half-sib model) that is a composite of the plot and within-plot variance terms of the individual observation model.

Three estimates of ratios among variance components were determined: 1) single-tree heritability adjusted for test location and block, as $\hat{h}^2 = 4\hat{\sigma}^2_g / \hat{\sigma}^2_{phenotypic}$, where $\hat{\sigma}^2_{phenotypic}$ is the estimate of the variance of an individual observation from the variance equations above with the variance components for test location and block deleted; 2) type B correlation, as $\hat{r}_B = \hat{\sigma}^2_g / (\hat{\sigma}^2_g + \hat{\sigma}^2_{tg})$; and 3) dominance to additive variance ratio, as $\widehat{d/a} = \hat{\sigma}^2_s / \hat{\sigma}^2_g$.

Data Generation and Deletion

Data generation was accomplished by using a Cholesky upper-lower decomposition of the covariance matrix for the observations (Goodnight) and a vector of pseudo-random standard normal deviates generated using the Box-Muller transformation with pseudo-random uniform deviates (Knuth; Press et al.). The upper-lower decomposition creates a matrix ($U$) with the property that $Var(y) = U'U$. The vector of pseudo-random standard normal deviates


($z$) has a covariance matrix equal to an identity matrix ($I_n$), where $n$ is the number of observations. The vector of observations is created as $y = U'z$. Then $Var(y) = U'\,Var(z)\,U$, and since $Var(z) = I_n$, $Var(y) = U'I_nU = U'U$.

Analyses of survival patterns using data from the Cooperative Forest Genetic Research Program (CFGRP) at the University of Florida were used to develop survival distributions for the simulation. The data sets chosen for survival analysis were from full-sib slash pine (Pinus elliottii var. elliottii Engelm.) tests planted in randomized complete block designs with the families in row plots, and were selected because their survival levels were close to one of the two survival levels simulated. Survival levels for most crosses (full-sib families) clustered around the expected value, i.e., near the average survival level of the test; however, there were always a few crosses that had much poorer survival than average and also a small number of crosses that had much better survival than average. This survival pattern was consistent across the experiments analyzed. Thus, a lower than average survival level was arbitrarily assigned to certain crosses, a higher than average survival level was assigned to certain crosses, and the average survival level was assigned to most crosses. This modeling of survival pattern was also extended to the half-sib mating design. At the moderate survival level no missing plots were allowed, and at the poor survival level missing plots occurred at random.

Full-sib family deletion simulated crosses which could not be made and were therefore missing from the experiment. When deleting five crosses, the deletion was restricted to a maximum of four crosses per parent to prevent loss of all the crosses in which a single parent appeared, since this would have resulted in changing a six-parent to a five-parent half-diallel.

Tests having only subsets of the half-sib families in common are a frequent occurrence in data analysis at CFGRP. This partial connectedness was simulated by generating data in which


only a subset of the families present in a test were common to either one of the other two tests comprising a data set.

Variance Component Estimation Techniques

Two algorithms were utilized for all estimation techniques: sequentially adjusted sums of squares (Milliken and Johnson) for HM3, and Giesbrecht's algorithm (Giesbrecht) for REML, ML, MINQUE, and MIVQUE. Giesbrecht's algorithm is primarily a gradient algorithm (the method of scoring) and as such allows negative estimates (Harville; Giesbrecht). Negative estimates are not a theoretical difficulty with MINQUE or MIVQUE; however, for REML and ML, estimates should be confined to the parameter space. For this reason, estimators referred to as REML and ML in this chapter are not truly REML and ML when negative estimates occur; further, there is the possibility that the iterative solution stopped at a local maximum and not the global maximum. These concerns are commonplace in REML and ML estimation (Corbeil and Searle; Harville; Swallow and Monahan); however, ignoring these two points, these estimators are still referred to as REML and ML.

The basic equation for variance component estimation under normality (Giesbrecht) for MIVQUE, MINQUE, and REML is

$$\{tr(QV_iQV_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\},$$

with dimensions $(r \times r)(r \times 1) = (r \times 1)$; then

$$\hat{\sigma}^2 = \{tr(QV_iQV_j)\}^{-1}\{y'QV_iQy\};$$

and for ML,

$$\{tr(V^{-1}V_iV^{-1}V_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\},$$

where
$\{tr(QV_iQV_j)\}$ is a matrix whose elements are $tr(QV_iQV_j)$, where in the full-sib designs $i = 1$ to $8$ and $j = 1$ to $8$, i.e., there is a row and column for every random variable in the linear model;


$tr$ is the trace operator, that is, the sum of the diagonal elements of a matrix;
$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$, for $V$ as the covariance matrix of $y$ and $X$ as the design matrix for fixed effects;
$V_i = Z_iZ_i'$, where $i$ indexes the random variables test, block, etc.;
$\hat{\sigma}^2$ is the vector of variance component estimates; and
$r$ is the number of random variables in the model.

The MINQUE estimator used was MINQUE1, i.e., ones as priors for all variance components, calculated by applying Giesbrecht's algorithm non-iteratively. MINQUE1 was chosen because of results demonstrating MINQUE0 (prior of 1 for the error term and of 0 for all others) to be an inferior estimation technique for many cases (Swallow and Monahan; R.C. Littell, unpublished data).

With normally distributed, uncorrelated random variables, the use of the true values of the variance components as priors in a non-iterative application of Giesbrecht's algorithm produced the MIVQUE solutions. Obtaining true MIVQUE estimation is a luxury of computer simulation and would not be possible in practice, since the true variance components are required (Swallow and Searle). This estimator was included to provide a standard of comparison for other estimators.

An additional MIVQUE-type estimator, referred to as MIVPEN, was also included. MIVPEN was also a non-iterative application of the algorithm with the true variance components as priors; however, this estimator was conditioned on the variance component parameter space and did not allow negative estimates. The non-negative conditioning of MIVPEN was accomplished by adding a penalty algorithm to MIVQUE such that no variance component was allowed to be less than a small positive constant. Estimates from MIVPEN were equal to MIVQUE for data sets for which there were no negative MIVQUE variance component estimates. When negative MIVQUE estimates occurred, the two techniques were no longer equivalent. The penalty


algorithm operated by using $\Delta = \hat{\sigma}^2 - \sigma^2$ and by choosing a scalar weight $w$ such that no element of $\hat{\sigma}^2_c$ is less than the small positive constant. Then $\hat{\sigma}^2_c = \sigma^2 + w\Delta$, where $\Delta$ is the vector of departure from the true values ($\sigma^2$), the small positive constant is arbitrary, and $\hat{\sigma}^2_c$ is the vector of estimated variance components conditioned on non-negativity.

REML estimates were from repeated application of Giesbrecht's algorithm, in which the estimates from the $k$-th iteration become the priors for the $(k+1)$-th iteration. The iterations were stopped when the difference between the estimates from the $k$-th and $(k+1)$-th iterations met the convergence criterion; then the estimates of the $(k+1)$-th iteration became the REML estimates. The convergence criterion required the change in the estimates between successive iterations to fall below a small tolerance, which imposed convergence to the fourth decimal place for all variance components.

Since for this experimental workload it was desired that the simulation run with little analyst intervention and in as few iterations as possible, the robustness of REML solutions obtained from Giesbrecht's algorithm to priors (or starting points) was explored. The difference in solutions starting from two distinct points (a vector of ones and the true values) was compared over data sets of different structures (imbalance, true variance components, and field design). The results (agreeing with those of Swallow and Monahan) indicated that the difference between the two solutions was entirely dependent on the stringency of the convergence criterion and not on the starting point (priors). Also, the number of iterations required for convergence was greatly decreased by using the true values as priors. Thus, all REML estimates were calculated starting with the true values as priors.

Three alternatives for coping with negative estimates after convergence were used for REML solutions: accept and use the negative estimates (Shaw), arbitrarily set negative estimates to zero, and re-solve the system setting negative estimates to zero (Miller). The first two alternatives are self-explanatory, and the latter is accomplished by re-analyzing those data


sets in which the initial unrestricted REML estimates included one or more negative estimates. During re-analysis, if a variance component became negative, it was set to zero (and could never be any value other than zero) and the iterations continued. This procedure persisted until the convergence criterion was met with a solution in which all variance components were either positive or zero.

Harville suggested several adaptations of Henderson's mixed model equations (Henderson et al.) which do not allow variance component estimates to become negative; however, the estimates can become arbitrarily close to zero. After trial of these techniques versus the approach of setting the negative estimates to zero after convergence and re-solving the system, comparison of results using the same data sets indicates that there is little practical advantage (although it is more desirable theoretically) in using the approach suggested by Harville. The differences between sets of estimates obtained by the two methods are extremely minor (solving the system with a variance component set to zero versus arbitrarily close to zero).

ML solutions, as iterative applications of the ML form of the basic equation, were calculated from the same starting points and with the same convergence criterion as REML solutions. The three negative variance component alternatives explored for ML were to accept and use the negative estimates, to arbitrarily set negative estimates to zero after converging to a solution for the former, and (for half-sib data only) to re-solve the system setting negative variance components to zero.

The algorithm to calculate solutions for HM3 (sequentially adjusted sums of squares) was based on the upper triangular sweep (Goodnight) and Hartley's method of synthesis (Hartley). The equation solved was $E\{MS\}\hat{\sigma}^2 = MS$, where $MS$ is the vector of mean squares and $E\{MS\}$ is their expectation. The alternative used for negative estimates was to accept and use the negative estimates.
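The data generation and estimation steps described in this section can be sketched together for a deliberately simple case. This is an illustration under stated assumptions, not the program used in the study: it uses a balanced one-way random model (family and error only) instead of the full half-sib or half-diallel models, NumPy's built-in normal generator instead of a Box-Muller routine, a hypothetical tolerance of 1e-8, and made-up true variance components. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1993)          # arbitrary seed
n_fam, n_per = 15, 6                       # hypothetical balanced layout
fam = np.repeat(np.arange(n_fam), n_per)
n = fam.size
Z = np.zeros((n, n_fam))
Z[np.arange(n), fam] = 1.0
X = np.ones((n, 1))                        # only fixed effect: overall mean
V_list = [Z @ Z.T, np.eye(n)]              # V_i = Z_i Z_i' for family and error
true_var = np.array([0.25, 1.0])           # hypothetical true components

# Data generation: Var(y) = U'U, with U' the lower Cholesky factor, y = mean + U'z.
V_true = sum(s * Vi for s, Vi in zip(true_var, V_list))
U_t = np.linalg.cholesky(V_true)           # lower triangular, V_true = U_t @ U_t.T
y = 10.0 + U_t @ rng.standard_normal(n)

def q_matrix(V, X):
    """Q = V^-1 - V^-1 X (X'V^-1 X)^- X'V^-1."""
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    return Vinv - VinvX @ np.linalg.pinv(X.T @ VinvX) @ VinvX.T

def giesbrecht_step(y, X, V_list, priors):
    """Solve {tr(Q V_i Q V_j)} sigma2 = {y'Q V_i Q y} once, for the given priors."""
    V = sum(s * Vi for s, Vi in zip(priors, V_list))
    Q = q_matrix(V, X)
    QV = [Q @ Vi for Vi in V_list]
    lhs = np.array([[np.trace(A @ B) for B in QV] for A in QV])
    Qy = Q @ y
    rhs = np.array([Qy @ Vi @ Qy for Vi in V_list])
    return np.linalg.solve(lhs, rhs)

def iterate(y, X, V_list, start, tol=1e-8, max_iter=100):
    """Repeat the step, feeding estimates back in as priors, until they stabilize."""
    est = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        new = giesbrecht_step(y, X, V_list, est)
        if np.max(np.abs(new - est)) < tol:
            return new
        est = new
    return est

minque1 = giesbrecht_step(y, X, V_list, np.ones(2))   # ones as priors, one pass
mivque = giesbrecht_step(y, X, V_list, true_var)      # true values as priors, one pass
reml = iterate(y, X, V_list, true_var)                # iterated; negatives allowed
mod_reml = np.maximum(reml, 0.0)                      # negatives set to zero afterwards
```

Because this toy data set is balanced, the one-pass and iterated solutions coincide with the analysis-of-variance estimates, in line with the statement that the techniques other than ML agree for balanced data; differences among them only appear once observations or families are deleted.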


Comparison Among Estimation Techniques

For the simulation, MIVQUE estimates were the basis for all comparisons, because MIVQUE is by definition the minimum variance quadratic unbiased estimator. The results of comparing the mean of the MIVQUE estimates for an experimental level to the means for other techniques were termed "apparent bias". Apparent bias denotes that the number of data sets was not sufficient to achieve complete convergence to the true values of the variance components. Sampling variances of estimation were calculated from the observations within an experimental level and estimation technique for variance components and genetic ratios (single-tree heritability, type B correlation, and dominance to additive variance ratio). Mean square error then equalled variance plus squared apparent bias. While mean square error was investigated, there was never sufficient bias for mean square error to lead to a different decision concerning techniques than sampling variance of the estimates, so mean square error was deleted from the remainder of this discussion.

Probability of nearness is the probability that an estimate will lie within a certain interval around the true parameter. The three total interval widths utilized were one-half, equal to, and twice the parameter size. The percentages of estimates falling within these intervals were calculated for the different estimation techniques within an experimental level for variance components and ratios, and were utilized as an estimate of probability of nearness.

Results are presented by variance component or genetic ratio estimated, as a percentage of MIVQUE (except in the case of probability of nearness). MIVQUE estimates represent 100%, with estimates with greater variance having values larger than 100% and apparently biased estimates having values different from 100%. The percentages were calculated as 100 times the estimate divided by the MIVQUE value. For the criterion of variance, the lower the


percentage, the better the estimator performed; for bias, values equalling 100% (zero bias) are preferred; and for probability of nearness, larger percentages (probabilities) are favored, since they are indicative of greater density of estimates near the parametric value.

Results and Discussion

Variance Components

Sampling variance of the estimators

For all variance components estimated, REML and ML estimation techniques were consistently equal to or less than MIVQUE for sampling variance of the estimator (see the table of sampling variances). The variance among estimates from these techniques was further reduced by setting the negative components to zero (MODML and MODREML) or by setting negative estimates to zero and re-solving the system (NNREML, NNML, and PNNREML). Variance among MINQUE estimates is always equal to or greater than that for MIVQUE, as one might expect, since they are in this application the same technique with MIVQUE having perfect priors (the true values). Variances for HM3 estimators (TYPE and PTYPE) are either equal to or greater than MIVQUE. HM3 estimates have progressively larger relative variance with higher levels of imbalance. MIVPEN, although impractical because of the need for the true priors, had much more precise estimates of variance components than other techniques, illustrating what could be accomplished given the true values as priors plus maintaining estimates within the parameter space.

In general, the spread among the percentages for variance of estimation for the estimation techniques is highly dependent on the degree of imbalance and the type of mating system. With increasing imbalance, the likelihood-based estimators realized greater advantage for sampling variance of the estimates over HM3 for both mating systems. The most advantageous application


Table [...]. Sampling variance for the estimates of $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number) and $h^2$ (third number, where calculated) as a percentage of the MIVQUE estimate, by type of estimator and treatment combination. NA is not applied. Values greater than 100 indicate larger variance among estimates.

Columns: Estimator, followed by the experimental levels DS, DS, DS, HS, HS.
Rows: REML, ML, MINQUE, NNREML, NNML (NA NA NA), MODML, MODREML, TYPE, PREML, PML, PMINQUE, PNNREML, PTYPE, MIVPEN (NA), PMIVQUE.


of likelihood-based estimators is in the HS case, where the imbalance is not only random deletions of individuals but also incomplete connectedness across locations, i.e., the same families are not present in each test (akin to incomplete blocks within a test).

An analysis of variance was conducted to determine the importance of the treatment of negative variance component estimates in the variance of estimation for REML and ML estimates. The model of sampling variance of the estimates as a result of mating design, imbalance level, treatment of negative estimates and size of the variance component demonstrated consistently (for all variance components except error) that treatment of negative estimates is an important component of the variance of the estimates (p < [...]). The model accounted for up to [...]% of the variation in the variance of the variance component estimates, with 1) accepting and using negative estimates producing the highest variance; 2) setting the negative components to zero being intermediate; and 3) re-solving the system with negative estimates set to zero providing the lowest variance.

For all estimation techniques, lower variance among estimates was obtained by using individual observations as compared to plot means. The advantage of individual over plot-mean observations increased with increasing imbalance.

Bias. The most consistent performance for bias (Table [...]) across all variance components was TYPE, known from inherent properties to be unbiased. The consistent convergence of the TYPE value to the MIVQUE value indicated that the number of data sets used ([...] per technique and experimental level) was suitable for the purpose of examining bias. The other two consistent performers were REML and MINQUE. PTYPE (HM based on plot means) was unbiased when no plot means were missing but produced apparently biased estimates when plot means were missing.
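The comparison criteria used in these results (sampling variance and mean expressed as a percentage of the MIVQUE value, mean square error as variance plus squared apparent bias, and probability of nearness for a given total interval width) can be illustrated with a minimal Python sketch. The function names and the small sets of estimates below are hypothetical and serve only to show the arithmetic; they are not taken from the simulation.

```python
from statistics import mean, variance


def compare_to_baseline(estimates, baseline):
    """Summarise one technique relative to a baseline technique (MIVQUE in the text)."""
    apparent_bias = mean(estimates) - mean(baseline)
    return {
        # 100 times the technique's value divided by the baseline value
        "variance_pct": 100.0 * variance(estimates) / variance(baseline),
        "bias_pct": 100.0 * mean(estimates) / mean(baseline),
        # mean square error = variance + squared apparent bias
        "mse": variance(estimates) + apparent_bias ** 2,
    }


def probability_of_nearness(estimates, true_value, width_factor):
    """Percentage of estimates inside an interval centred on the true value.

    width_factor is the total interval width as a multiple of the parameter
    (0.5, 1.0 and 2.0 correspond to the three widths described in the text).
    """
    half_width = 0.5 * width_factor * true_value
    inside = [e for e in estimates if abs(e - true_value) <= half_width]
    return 100.0 * len(inside) / len(estimates)


# Hypothetical estimates of one variance component whose true value is 1.0.
baseline = [0.8, 1.0, 1.2, 1.4, 0.6]   # stands in for the MIVQUE estimates
other = [0.9, 1.0, 1.1, 1.2, 0.8]      # stands in for another technique

summary = compare_to_baseline(other, baseline)
nearness_half = probability_of_nearness(baseline, 1.0, 0.5)
nearness_equal = probability_of_nearness(baseline, 1.0, 1.0)
```

In this made-up case the second technique has the same mean as the baseline but one quarter of its sampling variance, so it would be reported as 100% for bias and 25% for variance.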


Table [...]. Bias for the estimates of $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number) and $h^2$ (third number, where calculated) as a percentage of the MIVQUE estimate, by type of estimator and experimental combination. NA is not applied. Values different from 100 denote apparent bias.

Columns: Estimator, followed by the experimental levels DS, DS, DS, HS, HS.
Rows: REML, ML, MINQUE, NNREML, NNML (NA NA NA), MODML, MODREML, TYPE, PREML, PML, PMINQUE, PNNREML, PTYPE, MIVPEN (NA), PMIVQUE.


Table [...]. Probability of nearness for $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number) and $h^2$ (third number, where calculated). The probability interval is equal to the magnitude of the parameter.

Columns: Estimator, followed by the experimental levels DS, DS, DS, HS, HS.
Rows: REML, ML, MINQUE, NNREML, NNML (NA NA NA), TYPE, PREML, PML, PMINQUE, PNNREML, PTYPE, MIVQUE, MIVPEN (NA), PMIVQUE.


Among estimators which displayed bias, maximum likelihood estimators (ML and PML) were known to be inherently biased (Harville [...], Searle [...]), with the amount of bias proportional to the number of degrees of freedom for a factor versus the number of levels for the factor. Other biases resulted from the method of dealing with negative estimates. Living with negative estimates produced the estimators with the least bias. Setting negative variance components to zero resulted in the greatest bias. Intermediate in bias were the estimates resulting from re-solving the system with negative components set to zero.

Probability of nearness. Results for probability of nearness proved to be largely non-discriminatory among techniques (Table [...]). The low levels of probability density near the parametric values are indicative of the nature of the variance component estimation problem. Figure [...] illustrates the distribution of MIVQUE variance component estimates for $h^2$ (a) and $\sigma^2_g$ (b) for level DS. The distributions for all unconstrained variance component estimates have the appearance of a chi-square distribution, positively skewed with the expected value (mean) occurring to the right of the peak probability density and a proportion of the estimates occurring below zero (except error). With increasing imbalance, the variance among estimates increases and the probability of nearness decreases for all interval widths.

Ratios of Variance Components

Single tree heritability. Results for estimates of single tree heritability adjusted for locations and blocks are shown in Tables [...] and [...] (third number from the top in each cell, if calculated). For these relatively low heritabilities ([...] and [...]), the bias and variance properties of the estimated ratio are similar to those for $\sigma^2_g$ estimates (Figure [...]). This implies that knowing the properties of the numerator


[Figure panels a and b: histograms of MIVQUE estimates across data sets; vertical axis in percent.]

Figure [...]. Distribution of MIVQUE estimates of $h^2$ (a) and $\sigma^2_g$ (b) for experimental level DS, illustrating the positive skew and similarity of the distributions. The true values are [...] for $h^2$ and [...] for $\sigma^2_g$. The interval width of the bars is one-half the parametric value.


of heritability reveals the properties of the ratio (especially true of ratios with expected values of [...] and [...]; Kendall and Stuart [...], Ch. [...]). Variance component estimation techniques which performed well for bias and/or variance among estimates for $\sigma^2_g$ also performed well for $h^2$.

Type B correlation and dominance to additive variance ratio. Type B correlation (Tables [...] and [...], as $\sigma^2_{tg}$) and dominance to additive variance ratio (not shown) estimates both proved to be too unstable (extremely large variance among estimates) in their original formulations to be useful in discrimination among variance component estimation techniques. This high variance is due to the estimates of the denominators of these ratios approaching zero and to the high variance of the denominator of ratios (Table [...]). These ratios were reformulated with numerators of interest ($\sigma^2_{tg}$ for additive genetic by test interaction and $\sigma^2_s$ for dominance variance, respectively) and a denominator equal to the estimate of the phenotypic variance. With this reformulation, the variance and bias properties of estimates of the altered ratios are approximated by the properties of estimates of the numerators.

For increasing imbalance, maximum-likelihood-based estimation offers an increasing advantage over HM, and for all techniques individual observations offer increasing advantage over plot-mean observations for variance of the estimates of these ratios. Bias, other than for inherently biased methods (ML), is associated with the probability of negative estimates, which is increased by increasing imbalance. This assertion is supported by comparing the biases of REML, NNREML and MODREML estimates across imbalance levels.
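The instability described above, a ratio whose estimated denominator can approach zero, and the effect of switching to a phenotypic-variance denominator can be seen in a small Python sketch. The Type B correlation is assumed here to take the common form $\sigma^2_g / (\sigma^2_g + \sigma^2_{tg})$, which this section does not state explicitly, and the four sets of estimates are invented for illustration.

```python
def type_b_correlation(var_g, var_tg):
    """Assumed original formulation: var_g / (var_g + var_tg)."""
    return var_g / (var_g + var_tg)


def interaction_share(var_tg, var_p):
    """Reformulated ratio: numerator of interest over the phenotypic variance."""
    return var_tg / var_p


# Hypothetical (var_g, var_tg, var_p) estimates from four data sets.
# In the second and third sets the sum var_g + var_tg is close to zero.
estimates = [
    (0.30, 0.10, 2.0),
    (0.05, -0.04, 1.9),
    (-0.20, 0.21, 2.1),
    (0.25, 0.05, 2.0),
]

original = [type_b_correlation(g, tg) for g, tg, _ in estimates]
reformulated = [interaction_share(tg, p) for _, tg, p in estimates]

# Range of each ratio across the data sets.
spread_original = max(original) - min(original)
spread_reformulated = max(reformulated) - min(reformulated)
```

With these made-up values the original ratio swings between roughly -20 and 5, while the reformulated ratio stays within a narrow band because its denominator is large and comparatively stable.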


General Discussion

Observational Unit

Some general conclusions regarding the choice of a variance component estimation methodology can be drawn from the results of this investigation. For any degree of imbalance, the use of individual observations is superior to the use of plot means for estimation of variance components or ratios of variance components. If the data are nearly balanced (close to 100% survival with no missing plots, crosses (full-sib) or lack of connectedness (half-sib)), the properties of the estimation techniques based on individual and plot-mean observations become similar, so if departure from balance is nominal, plot means can be used effectively. However, using individual observations obviates the need for a survey of imbalance in the data, since individual observations produce better results than plot means for any of the estimation techniques examined.

Negative Estimates

Drawing on the results of this investigation, the discussion of practical solutions for the negative estimates problem will revolve around two solutions: 1) accepting and using the negative estimates and 2) re-solving the system with negative estimates set to zero. Given that the property of interest is the true value of a variance component or genetic ratio (often estimated as a mean across data sets), negativity constraints come into play if the component of interest is small in comparison to other underlying variance components in the data or the variance of estimates is high due to an inadequate experimental design for variance component estimation. These factors lead to an increased number of negative estimates. If the data structure is such that negative estimates would occur frequently, then accepting negative estimates is a good alternative.
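The trade-off behind this advice can be shown with a few lines of Python. The sketch below covers only the simplest constraint, setting negative estimates to zero; re-solving the system requires the full model and is not attempted here. The estimates are hypothetical values for a small variance component.

```python
from statistics import mean, variance


def truncate_at_zero(estimates):
    """Replace every negative estimate with zero (the simplest constraint)."""
    return [max(0.0, e) for e in estimates]


# Hypothetical unconstrained estimates of a small variance component
# across five data sets; their mean is 0.3.
unconstrained = [-0.5, -0.1, 0.2, 0.6, 1.3]
constrained = truncate_at_zero(unconstrained)

mean_unconstrained = mean(unconstrained)
mean_constrained = mean(constrained)
variance_unconstrained = variance(unconstrained)
variance_constrained = variance(constrained)
```

Truncation raises the mean across data sets (0.42 against 0.30 here) while lowering the variance among estimates, which is why frequent negative estimates favor accepting them when an unbiased mean across data sets is the goal.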


If negative estimates tend to occur infrequently, or bias is of less concern than variance among estimates, then re-solving the system after convergence yields negative estimates is the preferable solution. This tactic reduces both bias and variance among estimates below that of arbitrarily setting negative estimates to zero.

Estimation Technique

The primary competitors among estimation techniques that are practically achievable are REML and TYPE (HM). Both techniques produce estimates with little or no bias; however, REML estimates for the most part have slightly less sampling variance than TYPE estimates. If only subsets of the parents are in common across tests, as in the case HS, REML has a distinct advantage in variance among estimates over TYPE.

REML does have three additional advantages over TYPE, which are: 1) REML offers generalized least squares estimation of fixed effects while TYPE offers ordinary least squares estimation; 2) Best Linear Unbiased Predictions (BLUP) of random variables are inherent in REML solutions, i.e., gca predictions are available, and thus, in solving for the variance components with REML, fixed effects are estimated and random variables are predicted simultaneously (Harville [...]); and 3) REML offers greater flexibility in the model specification, both in univariate and multivariate forms, as well as heterogeneous or correlated error terms. Further, although the likelihood equations for common REML applications are based on normality, the technique has been shown to be robust against the underlying distribution (Westfall [...], Banks et al. [...]).


Recommendation

If one were to choose a single variance component estimation technique from among those tested which could be applied to any data set with confidence that the estimates had desirable properties (variance, MSE and bias), that technique would be REML and the basic unit of observation would be the individual. This combination (REML plus individual observations) performed well across mating design and types and levels of imbalance. Treatment of negative estimates would be determined by the proposed use of the estimates, that is, whether unbiasedness (accepting and using the negative estimates) is more important than sampling variance (re-solving the system, setting negative estimates to zero).

A primary disadvantage of REML and individual observations is that they are both computationally expensive (computer memory and time). HM estimation could replace REML on many data sets, and plot means could replace individual observations on some data sets; but general application of these without regard to the data at hand does result in a loss in desirable properties of the estimates in many instances. The computational expense of REML and individual observations ensures that estimates have desirable properties for a broad scope of applications. With the advent of bigger and faster computers and the evolution of better REML algorithms, what was not feasible in the past on most mainframe computers can now be accomplished on personal computers.


CHAPTER [...]
GAREML: A COMPUTER ALGORITHM FOR ESTIMATING VARIANCE COMPONENTS AND PREDICTING GENETIC VALUES

Introduction

The computer program described in this chapter, called GAREML for Giesbrecht's algorithm of restricted maximum likelihood estimation (REML), is useful for both estimating variance components and predicting genetic values. GAREML applies the methodology of Giesbrecht ([...]) to the problems of REML estimation (Patterson and Thompson [...]) and best linear unbiased prediction (BLUP, Henderson [...]) for univariate (single trait) genetics models. GAREML can be applied to half-sib (open-pollinated or polymix) and full-sib (partial diallels, factorials, half-diallels [no selfs] or disconnected sets of half-diallels) mating designs when planted in single or multiple locations with single or multiple replications per location.

When used for variance component estimation, this program has been shown to provide estimates with desirable properties across types of imbalance commonly encountered in forest genetics field tests (Huber et al. in press) and with varying underlying distributions (Banks et al. [...], Westfall [...]). GAREML is also useful for determining efficiencies of alternative field and mating designs for the estimation of variance components.

Utilizing the power of mixed-model methodology (Henderson [...]), GAREML provides BLUP of parental general (gca) and specific combining abilities (sca) as well as generalized least squares (GLS) solutions for fixed effects. The application of BLUP to forest genetics problems has been addressed by White and Hodge ([...]). With certain assumptions, the desirable


properties of BLUP predictions include maximizing the probability of obtaining correct parental rankings from the data and minimizing the error associated with using the parental values obtained in future applications. GLS fixed effect estimation weights the observations comprising the estimates by their associated variances, approximating best linear unbiased estimation (BLUE) for fixed effects (Searle [...], p. [...]).

The purpose of this chapter is to describe the theory and use of GAREML in enough detail to facilitate use by other investigators. The program is written in FORTRAN and is not dependent on other analysis programs. An interactive version of this program can be obtained as a stand-alone executable file from the senior author; this file will run on any IBM-compatible PC under DOS or WINDOWS operating systems. The size of the problem an investigator can solve will be dependent on the amount of extended memory and hard disk space (for swap files) available for program use. In addition, the FORTRAN source code can be obtained for analysts wishing to compile the program for use on alternate systems (e.g., mainframe computers).

Algorithm

GAREML proceeds by reading the data and forming a design matrix based on the number of levels of factors in the model. Any portions of the design matrix for nested factors or interactions are formed by horizontal direct product. Columns of zeroes in the design matrix, the result of imbalance, are then deleted. The design matrix columns are in an order specified by Giesbrecht's algorithm: columns for fixed effects are first, followed by the data vector, and the last section of the matrix is for random effects. The design matrix is the only fully formed matrix in the program. All other matrices are symmetric; therefore, to save computational space

Windows is the trademark of the Microsoft Corporation, Redmond, WA.


and time, only the diagonal and the above-diagonal portions of matrices are formed and utilized (i.e., half-stored). A half-stored matrix of the dot products of the design columns is formed and either kept in common memory or stored in temporary disk space so that the matrix is available for recall in the iterative solution process.

The algorithm proceeds by modifying the matrix of dot products such that the inverse of the covariance matrix for the observations ($V$) is enclosed by the column specifiers in the dot products, as $X'X$ becoming $X'V^{-1}X$. This transfer is completed without inversion of the total $V$ matrix. The identity used to accomplish this transfer is: if $V_h = \sigma_h Z_h Z_h' + V_{h+1}$, where $V_{h+1}$ is nonsingular, then

$$V_h^{-1} = V_{h+1}^{-1} - \sigma_h V_{h+1}^{-1} Z_h \left(I_h + \sigma_h Z_h' V_{h+1}^{-1} Z_h\right)^{-1} Z_h' V_{h+1}^{-1}.$$

A compact form of equation [...] is obtained by premultiplying by $Z_i'$ and postmultiplying by $Z_j$, where $h = 1, \ldots, k$ ($k$ is the total number of random factors minus one); $\sigma_h$ is the prior associated with random variable $h$; $V_{k+1} = \sigma_{k+1} I$; $V_1 = V$; and $Z_i$ is the portion of the design matrix for random variable $i$ (Giesbrecht [...]).

A partitioned matrix is formed in order to update $V_{h+1}^{-1}$ until $V_1^{-1}$, or $V^{-1}$, is obtained. This matrix is of the form

$$\begin{bmatrix} I_h + \sigma_h Z_h' V_{h+1}^{-1} Z_h & \sqrt{\sigma_h}\, Z_h' V_{h+1}^{-1} (X \mid y \mid Z_1 \mid \cdots \mid Z_k) \\ \sqrt{\sigma_h}\, (X \mid y \mid Z_1 \mid \cdots \mid Z_k)' V_{h+1}^{-1} Z_h & T_{h+1} \end{bmatrix}$$

where $T_{k+1} = (X \mid y \mid Z_1 \mid \cdots \mid Z_k)' V_{k+1}^{-1} (X \mid y \mid Z_1 \mid \cdots \mid Z_k)$. The sweep operator of Goodnight ([...]) is applied to the upper left partition of the matrix (equation [...]) and the result of equation [...] is obtained. The matrix is sequentially updated and swept until $T_1 = (X \mid y \mid Z_1 \mid \cdots \mid Z_k)' V^{-1} (X \mid y \mid Z_1 \mid \cdots \mid Z_k)$ is obtained. $T_1$ is then swept on the columns for fixed effects ($X'V^{-1}X$). This sweep operation produces generalized least squares estimates for fixed effects, results which can be scaled into predictions of random variables, the residual sum of squares, and all the necessary ingredients for assembling the


equation to solve for the variance components. The equation to be solved for the variance components is

$$\underbrace{\{\operatorname{tr}(Q V_i Q V_j)\}}_{r \times r}\ \underbrace{\hat{\sigma}}_{r \times 1} = \underbrace{\{y' Q V_i Q y\}}_{r \times 1}$$

then

$$\hat{\sigma} = \{\operatorname{tr}(Q V_i Q V_j)\}^{-1} \{y' Q V_i Q y\}$$

where
$\{\operatorname{tr}(Q V_i Q V_j)\}$ is a matrix whose elements are $\operatorname{tr}(Q V_i Q V_j)$, where $i = 1$ to $r$ and $j = 1$ to $r$, i.e., there is a row and column for every random variable in the linear model;
tr is the trace operator, that is, the sum of the diagonal elements of a matrix;
$Q = V^{-1} - V^{-1} X (X' V^{-1} X)^{-} X' V^{-1}$, for $V$ as the covariance matrix of $y$ and $X$ as the design matrix for fixed effects;
$V_i = Z_i Z_i'$, where the $i$'s are the random variables;
$\hat{\sigma}$ is the vector of variance component estimates; and
$r$ is the number of random variables in the model ($k + 1$).

The entire procedure, from forming $T$ to solving for the variance components, continues until the variance component estimates from the last iteration are no more different from the estimates of the previous iteration than the convergence criterion specifies. The fixed effect estimates and predictions of random variables are then those of the final iteration.

The asymptotic covariance matrix for the variance components is obtained as

$$\operatorname{Var}(\hat{\sigma}) = 2\,\{\operatorname{tr}(Q V_i Q V_j)\}^{-1}$$

by utilizing intermediate results from the solution for the variance components.

The coefficient matrix of Henderson's mixed model equations is formed in order to calculate the covariance matrix for fixed and random effects. The covariance matrix for


observations is constructed using the variance component estimates from Giesbrecht's algorithm. The coefficient matrix is

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + D^{-1} \end{bmatrix}$$

where
$R$ is the error covariance matrix, which in this application is $I\sigma^2_w$, where $\sigma^2_w$ is the variance of random variable $w$ (equations [...] and [...]);
$X$ is the fixed effects design matrix;
$Z$ is the random effects design matrix; and
$D$ is the covariance matrix for the random variables, which in this application has variance components on the diagonal and zeroes on the off-diagonal (no covariance among random variables).

The generalized inverse of the matrix (equation [...]) is the error covariance matrix of the fixed effect estimates and random predictions, assuming the covariance matrix for observations is known without error.

Operating GAREML

While GAREML will run in either batch or interactive mode, we focus on the interactive PC version, which begins by prompting the analyst to answer questions determining the factors to be read from the data. Specifically, the analyst answers these questions: 1) are there multiple locations? 2) are there multiple blocks? 3) are there disconnected sets of full-sibs (usually referring to disconnected half-diallels)? and 4) is the mating design half-sib or full-sib? The program then determines the proper variables to read from the data as well as the most complicated (number of main factors plus interactions) scalar linear model allowed. The most complicated linear model allowed for full-sib observations is


$$y_{ijklm} = \mu + t_i + b_{ij} + set_c + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm}$$

where
$y_{ijklm}$ is the $m$th observation of the $kl$th cross in the $j$th block of the $i$th test;
$\mu$ is the population mean;
$t_i$ is the random or fixed variable test environment;
$b_{ij}$ is the random or fixed variable block;
$set_c$ is the random or fixed variable set, i.e., a variable is created so that disconnected sets of half-diallels planted in the same experiment can be analyzed in the same run, or to analyze provenances and families within provenance where provenance equals set (sets are assumed to be across test environments and blocks with families nested within sets, and interactions with set are assumed unimportant);
$g_k$ is the random variable female general combining ability (gca);
$g_l$ is the random variable male gca;
$s_{kl}$ is the random variable specific combining ability (sca);
$tg_{ik}$ is the random variable test by female gca interaction;
$tg_{il}$ is the random variable test by male gca interaction;
$ts_{ikl}$ is the random variable test by sca interaction;
$p_{ijkl}$ is the random variable plot;
$w_{ijklm}$ is the random variable within-plot; and
there is no covariance between random variables in the model.

The assumptions utilized are that the variances for female and male random variables are equal ($\sigma^2_{g_k} = \sigma^2_{g_l} = \sigma^2_g$), and that female and male environmental interactions are the same ($\sigma^2_{tg_{ik}} = \sigma^2_{tg_{il}} = \sigma^2_{tg}$).

The most complicated scalar linear model allowed for half-sib observations is

$$y_{ijkm} = \mu + t_i + b_{ij} + set_c + g_k + tg_{ik} + ph_{ijk} + wh_{ijkm}$$


where
$y_{ijkm}$ is the $m$th observation of the $k$th half-sib family in the $j$th block of the $i$th test;
$\mu$, $t_i$, $b_{ij}$, $set_c$, $g_k$ and $tg_{ik}$ retain the definition in the full-sib equation;
$ph_{ijk}$ is the random variable plot, containing different genotype by environment components than the full-sib model;
$wh_{ijkm}$ is the random variable within-plot, containing different levels of genotypic and genotype by environment components than the full-sib model; and
there is no covariance between random variables in the model.

The analyst builds the linear model by answering further prompts. If test, block and/or set are in the model, they must be declared as fixed or random effects. When any of the three effects is declared random, the analyst must furnish prior values for the variance. If no prior value is known, [...]'s may be used as priors. Using [...]'s as priors will not affect the values for resulting variance component estimates, within the constraints of the convergence criterion, but there may be a time penalty due to increasing the number of iterations required for convergence. All remaining factors in the model are treated as random variables.

To complete the definition of the model, the analyst chooses to include or exclude each possible factor by answering yes or no when prompted. After each yes answer, the program asks for a prior value for the variance. Again, if no known priors exist, [...]'s may be substituted. After the model has been specified, the program counts the number of fixed effects and the number of random effects and asks if the number fits the model expected. A yes answer proceeds through the program, while a no returns the program to the beginning.

GAREML is now ready to read the data file (which must be an ASCII data file) in this order: test, block, set, female, male and the response variable. The analyst is prompted to furnish a proper FORTRAN format statement for the data. Test, block, set, female and male are read as character variables (A fields) with as many as eight characters per field, while the data


vector (response variable) is read as a double precision variable (F field). An example of a format statement for a full-sib mating design across locations and blocks is (4A[...],F[...].5), which reads four character variables sequentially occupying [...] columns each and the response variable beginning in column [...] and ending in column [...], having five decimal places.

After reading the data, GAREML begins to furnish information to the analyst. This information should be scanned to make sure the data read are correct. This information includes the number of parents, the number of full-sib crosses, the number of observations, the maximum number of fixed effect design matrix columns and the maximum number of random effect design matrix columns. If there is an error at this point, use CTRL-BRK to exit the program. Probable causes of errors are: the data are not in the format specified; missing values are included; blank lines or other similar errors are in the data file; or the model was not correctly specified.

At this point there are three other prompts concerning the data analysis (number of iterations, convergence criterion and treatment of negative variance components). The number of iterations is arbitrarily set to [...] and can be changed at the analyst's discretion. No warning is issued that the maximum number of iterations has been reached; however, the current iteration number and variance component estimates are output to the screen at the beginning of each iteration. The convergence criterion used is the sum of the absolute values of the difference between variance component estimates for consecutive iterations. The criterion has been set to $1 \times 10^{-4}$, meaning that convergence is required to the fourth decimal place for all variance components. The convergence criterion should be modified to suit the magnitude of the variances under consideration as well as the practical need for enhanced resolution. Enhanced resolution is obtained at the cost of increasing the number of iterations to convergence. The analyst must decide whether to accept and use negative estimates or to set negative estimates to zero and re-solve the system. The latter solution results in variance component


estimates with lower sampling variance and slight bias. If one is interested in unbiased estimates of variance components that have a high probability of negative estimates, then accepting and using the negative estimates may be the proper course to take.

Interpreting GAREML Output

Analysis is now underway. The priors for each iteration and the iteration number are printed out to the screen. GAREML continues to iterate until the convergence criterion is met or the maximum number of iterations is reached. The next time that analyst intervention is required is to provide a name for the output file for variance component estimates. The file name follows normal DOS file naming protocol; however, alternative directories may not be specified, i.e., all outputs will be found in the same directory as the data file.

The program will now quiz the analyst to determine if additional outputs are desired. These additional outputs are: gca predictions, sca predictions (if applicable), the asymptotic covariance matrix for the variance components, generalized least squares fixed effect estimates, the error covariance matrix of the gca predictions and the error covariance matrix for fixed effects. An answer of yes to the inclusion of an output will result in a prompt for a file name. In addition, for gca and sca predictions, the analyst may input a different value for $\sigma^2_{gca}$ or $\sigma^2_{sca}$ with which to scale predictions. The discussion which follows furnishes more detailed information concerning GAREML outputs.

Variance Component Estimates

Ignoring concerns about convergence to a global maximum and negative values, variance component estimates are restricted maximum likelihood estimates of Patterson and Thompson ([...]). The estimates are robust against starting values (priors), i.e., the same estimates, within the limits of the convergence criterion, can be obtained from diverse priors. However, priors


close to the true values will, in general, reduce the number of iterations required to reach convergence. The value of the convergence criterion must be less than or equal to the desired precision for the variance components.

REML variance component estimates from this program have been shown to have more desirable properties (variance and bias) than other commonly used estimation techniques (maximum likelihood, minimum norm quadratic unbiased estimation and Henderson's Method [...]) over a wide range of data imbalance. The properties of the estimates are further enhanced by using individual observations as data rather than plot means. The output is labelled by the variance component estimated.

Predictions of Random Variables

The predictions output are for general and specific combining abilities and approximate best linear unbiased predictions (BLUP) of the random variables. BLUP predictions have several optimal properties: 1) the correlation between the predicted and true values is maximized; and 2) if the distribution is multivariate normal, then BLUP maximizes the probability of obtaining the correct rankings (Henderson [...]) and so maximizes the probability of selecting the best candidate from any pair of candidates (Henderson [...]). Predictions are of the form

$$\hat{u} = \hat{D} Z' \hat{V}^{-1} (y - X\hat{\beta})$$

where
$\hat{u}$ is the vector of predictions;
$\hat{D}$ is the estimated covariance matrix for random variables from the REML variance component estimates (see equation [...]);
$Z'$ is the transpose of the design matrix for random variables;
$y$ is the data vector;
$X$ is the design matrix for fixed effects;


$\hat{\beta}$ is the vector of fixed effect estimates; and
$\hat{V}$ is the estimated covariance matrix for observations from REML variance component estimates.

NOTE: if predictions are desired based on prior values for the variance components, set the number of iterations to [...] after having input the desired values as priors. Predictions are output as a labelled vector.

Asymptotic Covariance Matrix of Variance Components

The output for the asymptotic covariance matrix (AVCM) of variance components is from equation [...]. This output represents the variance of repeated minimum variance quadratic unbiased variance component estimates using the same experimental design, if the estimates are equal to the true values. This technique has been used for simulation work to define optimal mating and field designs (McCutchan et al. [...]). The AVCM is used to create the asymptotic variance of linear combinations of estimates of variance components as

$$\operatorname{Var}(L'\hat{\sigma}) = L' \operatorname{Var}(\hat{\sigma}) L$$

where
$L$ specifies the linear combination(s) of variance components;
$\hat{\sigma}$ is the vector of variance component estimates; and
$\operatorname{Var}(\hat{\sigma})$ is the AVCM from equation [...].

The diagonal elements of $L' \operatorname{Var}(\hat{\sigma}) L$ are the variances of the linear combinations, and the off-diagonal elements are the covariances between the linear combinations. These values are then useful for Taylor series approximation of the variance of a ratio of linear combinations, such as heritability. AVCM is output as a vector (half-stored matrix) and each row of the output is labelled.
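As an illustration of how the AVCM might be used in this way, the following Python sketch forms $L' \operatorname{Var}(\hat{\sigma}) L$ for two linear combinations and then applies the usual first-order Taylor series (delta method) approximation to the variance of their ratio. Everything numeric here is hypothetical: the four-component half-sib layout, the AVCM values, and the heritability definition (four times the gca variance over the sum of the components) are assumptions made for the example, not output from GAREML.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def linear_combination_cov(l, avcm):
    """Return L' Var(sigma) L for L stored with one column per combination."""
    return matmul(matmul(transpose(l), avcm), l)


def ratio_variance(num, den, var_num, var_den, cov_num_den):
    """First-order Taylor series approximation of Var(num / den)."""
    ratio = num / den
    return ratio ** 2 * (
        var_num / num ** 2 + var_den / den ** 2 - 2.0 * cov_num_den / (num * den)
    )


# Hypothetical estimates, ordered as gca, test x gca, plot, within-plot.
sigma = [0.25, 0.10, 0.15, 1.50]
# Hypothetical AVCM (full symmetric form rather than half-stored).
avcm = [
    [0.010, -0.001, 0.000, 0.000],
    [-0.001, 0.004, 0.000, 0.000],
    [0.000, 0.000, 0.002, 0.000],
    [0.000, 0.000, 0.000, 0.003],
]
# Column 1: numerator 4 * gca variance; column 2: sum of all components.
l = [[4.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]

cov = linear_combination_cov(l, avcm)
numerator = sum(w[0] * s for w, s in zip(l, sigma))
denominator = sum(w[1] * s for w, s in zip(l, sigma))
heritability = numerator / denominator
heritability_variance = ratio_variance(
    numerator, denominator, cov[0][0], cov[1][1], cov[0][1]
)
```

For these invented inputs the ratio is 0.5 and its approximate variance is about 0.032. The approximation relies on the denominator being well away from zero, which is the same condition that made the reformulated ratios of the previous chapter better behaved.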


Fixed Effect Estimates

Fixed effect estimates are those of generalized least squares and are in a set-to-zero format. Set-to-zero format (commonly seen in SAS output) is characterized by the last level of a main effect or nested effect being set to zero. These estimates are approximately best linear unbiased estimates (BLUE) of the fixed effects, because the covariance matrix for observations was estimated and not known without error. Kackar and Harville ([...]) have shown, for a broad class of variance estimators, that the fixed effects estimates are still unbiased. The word "Best" in BLUE refers to the property of minimum variance for the class of unbiased estimators. Generalized least squares estimates in set-to-zero format for fixed effects are of the form

$$\hat{\beta} = (X'\hat{V}^{-1}X)^{-} X'\hat{V}^{-1} y$$

where $X$, $\hat{V}$ and $y$ are as defined in equation [...]. Fixed effect estimates are output as a labelled vector.

Error Covariance Matrices

The error covariance matrices for predictions and fixed effect estimates are obtained by producing a generalized inverse of equation [...] (Henderson [...], McLean [...]). Since all covariance matrices are symmetric, the output is in the form of a vector which is equivalent to a half-stored matrix. Output for error of gca predictions is labelled, while the error of fixed effects is not. The labelling on gca errors makes the unlabelled output for fixed effect variance self-explanatory. The error covariance matrix for gca predictions can be converted to the covariance matrix for gca predictions by forming the covariance matrix for the gca random variables and

SAS is the registered trademark of SAS Institute Inc., Cary, North Carolina.


subtracting the error covariance matrix. The covariance matrix for predictions has been denoted as $\operatorname{Var}(\hat{g})$ by White and Hodge ([...]).

Example

The following discussion involves the analysis of a simulated data set in order to further demonstrate the outputs of GAREML.

Data

The data (Table [...]) were generated using a six-parent half-diallel mating design and a randomized complete block field design. The field design is in two locations with four complete blocks per location and two trees per family per block. The underlying genetic parameters for the data are: individual tree heritability equals [...]; Type B correlation equals [...]; dominance to additive variance ratio equals [...]; and the population mean equals [...]. After a balanced data set was generated, the observations were subjected to [...]% random deletion (simulating [...]% survival). The data set is comprised of a small number of observations and, while not an optimal application of GAREML, serves well as an illustration.

Analysis

The analysis was carried out with two different linear models using individual observations as the data. Model 1 contained eight sources of variation and was from equation [...] without the variable set. In model 1, test environment and blocks within test are declared fixed. The subsequent model (model 2) has all random effects except the mean. Variance

PAGE 105

7DEOH 'DWD IRU H[DPSOH RI *$5(0/ RSHUDWLRQ / %O ) 0 7 DQG 59 VWDQG IRU ORFDWLRQ EORFN IHPDOH WUHH DQG UHVSRQVH YDULDEOH UHVSHFWLYHO\ $ SURSHU )2575$1 UHDG IRUPDW ZRXOG EH $7$7$7$7)f / %O ) 0 7 59

PAGE 106

7DEOH /% ) FRQWLQXHG 0 7 59

PAGE 107

7DEOH /% ) FRQWLQXHG 0 7 59

PAGE 108

7DEOH FRQWLQXHG / % ) 0 7 59 FRPSRQHQWV DUH HVWLPDWHG ZLWK PRGHO UHFHLYLQJ WZR GLIIHUHQW WUHDWPHQWV RI QHJDWLYH HVWLPDWHV LH OLYH ZLWK WKH QHJDWLYH HVWLPDWHV PRGHO $f RU UHVROYH WKH V\VWHP VHWWLQJ QHJDWLYH HVWLPDWHV WR ]HUR PRGHO ,%f 7KH GLIIHUHQW PRGHOV DQG PHWKRGV IRU GHDOLQJ ZLWK QHJDWLYH HVWLPDWHV DUH GHPRQVWUDWHG VR WKDW WKH UHDGHU FDQ VHH D UDQJH RI RXWSXWV IURP *$5(0/ 2XWSXW 9DULDQFH FRPSRQHQW HVWLPDWHV 7KH YDULDQFH FRPSRQHQW HVWLPDWHV DUH 0RGHO $ 6,*0$648$5(' *&$ 6,*0$648$5(' 6&$ 6,*0$648$5(' /2&[*&$ 6,*0$648$5(' /2&[6&$ 6,*0$648$5(' %/2&.[)$0 6,*0$648$5(' (5525 0RGHO ,% 6,*0$648$5(' *&$ 6,*0$648$5(' 6&$ 6,*0$648$5(' /2&[*&$

PAGE 109

6,*0$648$5(' /2&[6&$ 6,*0$648$5(' %/2&.[)$0 6,*0$648$5(' (5525 DQG 0RGHO 6,*0$648$5(' /2&$7,21 6,*0$648$5(' %/2&./2&f 6,*0$648$5(' *&$ 6,*0$648$5(' 6&$ 6,*0$648$5(' /2&[*&$ 6,*0$648$5(' /2&[6&$ 6,*0$648$5(' %/2&.[)$0 6,*0$648$5(' (5525 7KHVH YDULDQFH FRPSRQHQW HVWLPDWHV LOOXVWUDWH RXWSXWV IRU WKH UDQGRP PRGHO WKH PL[HG PRGHO DQG WKH DOWHUQDWLYHV IRU GHDOLQJ ZLWK QHJDWLYH HVWLPDWHV )L[HG HIIHFW HVWLPDWHV )L[HG HIIHFW HVWLPDWHV DUH 0RGHO ,% 08 /2&$7,21 /2&$7,21 %/2&./2&f %/2&./2&f %/2&./2&f %/2&./2&f %/2&./2&f %/2&./2&f %/2&./2&f %/2&./2&f DQG 0RGHO 08 7KH LQWHUSUHWDWLRQ RI IL[HG HIIHFW HVWLPDWHV IRU PRGHO ,% LV WKDW EORFNV WKURXJK EHORQJ ZLWK ORFDWLRQ DQG WKH IRXUWK EORFN LV VHW WR ]HUR %ORFNV WKURXJK DUH WKRVH RI ORFDWLRQ DQG WKH HLJKWK EORFN LV VHW WR ]HUR DV ZHOO DV ORFDWLRQ 6HWV RI EORFNV ZLWKLQ ORFDWLRQ FDQ DOZD\V EH GHWHUPLQHG E\ WKH ODVW EORFN ZLWKLQ D ORFDWLRQ EHLQJ VHW WR ]HUR 7KH LQWHUSUHWDWLRQ RI VHW WR ]HUR

PAGE 110

LV 08 LV WKH PHDQ RI WKH IRXUWK EORFN ODEHOOHG EORFN f LQ ORFDWLRQ WZR DQG DQ\ HVWLPDEOH IXQFWLRQ RI WKH IL[HG HIIHFWV FDQ EH JHQHUDWHG IURP WKHVH HVWLPDWHV $Q H[DPSOH RI DQ HVWLPDEOH IXQFWLRQ ZRXOG EH WKH VLWH PHDQ RI ORFDWLRQ 7KLV PHDQ ZRXOG EH HVWLPDWHG DV 08 /2&$7,21 O%/2&./2&f %/2&./2&f %/2&./2&f %/2&./2&f f 08 RI PRGHO LV WKH HVWLPDWH RI WKH JHQHUDO PHDQ DFURVV VLWHV LI DOO RWKHU IDFWRUV DUH UDQGRP $OO RI WKHVH HVWLPDWHV DUH WKH UHVXOW RI JHQHUDOL]HG OHDVW VTXDUHV HVWLPDWLRQ $V\PSWRWLF FRYDULDQFH PDWUL[ IRU WKH YDULDQFH FRPSRQHQWV 7KH DV\PSWRWLF FRYDULDQFH PDWUL[ IRU WKH YDULDQFH FRPSRQHQWV LQ PRGHO ,% ZRXOG DSSHDU DV $6<03727,& 9$5,$1&( &29$5,$1&( 0$75,; *&$ *&$ *&$ 6&$ *&$ /2&[*&$ *&$ /2&[6&$ *&$ %/2&.[)$0 *&$ (5525 6&$ 6&$ 6&$ /2&[*&$ 6&$ /2&[6&$ 6&$ %/2&.[)$0 6&$ (5525 /2&[*&$ /2&[*&$ /2&[*&$ /2&[6&$ /2&[*&$ %/2&.[)$0 /2&[*&$ (5525 /2&[6&$ /2&[6&$ /2&[6&$ %/2&.[)$0 /2&[6&$ (5525 %/2&.[)$0 %/2&.[)$0 %/2&.[)$0 (5525 (5525 (5525 7KLV PDWUL[ DV DUH DOO RWKHU PDWULFHV RXWSXW LV KDOIVWRUHG 7KH RXWSXW LV UHDG DV *&$ *&$ LV WKH DV\PSWRWLF YDULDQFH RI WKH JFD YDULDQFH FRPSRQHQW 7KH QH[W URZ ODEHOOHG *&$ 6&$

PAGE 111

LV WKH DV\PSWRWLF FRYDULDQFH EHWZHHQ WKH HVWLPDWHV RI WKH JFD YDULDQFH FRPSRQHQW DQG WKH VHD YDULDQFH FRPSRQHQW 7KXV WKH QH[W IRXU URZV DUH DV\PSWRWLF FRYDULDQFHV RI JFD YDULDQFH HVWLPDWHV ZLWK WKH RWKHU UDQGRP YDULDEOHV LQ WKH PRGHO 7KH RWKHU URZV DUH UHDG LQ D OLNH PDQQHU DQG LI WKH DQDO\VW ZLVKHG WR DUUD\ WKH RXWSXW DV D PDWUL[ DOO QHFHVVDU\ FRPSRQHQWV DUH DW KDQG 3UHGLFWLRQV RI UDQGRP YDULDEOHV $OO SUHGLFWLRQV RI UDQGRP YDULDEOHV DUH DSSURSULDWHO\ ODEHOOHG DFFRUGLQJ WR WKH FKDUDFWHU QDPH UHDG IURP WKH GDWD DQG IRU PRGHO ,% ZRXOG DSSHDU DV IURP WKH JFD RXWSXWf *&$ *&$ *&$ *&$ *&$ *&$ IURP WKH VHD RXWSXWf 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ 6&$ $OO WKHVH SUHGLFWLRQV DUH DSSUR[LPDWHO\ EHVW OLQHDU XQELDVHG SUHGLFWLRQV DQG DUH DSSUR[LPDWH EHFDXVH WKH YDULDQFH FRPSRQHQWV ZHUH HVWLPDWHG IURP WKH VDPH GDWD
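The half-stored layout described above (one value per labelled pair, upper triangle taken row by row) can be arrayed as a full symmetric matrix with a few lines of code. The following is a minimal sketch in Python rather than the program's own FORTRAN; the function names `half_stored_labels` and `unpack_half_stored` are hypothetical, and the numeric values are placeholders, not GAREML output.

```python
from itertools import combinations_with_replacement


def half_stored_labels(names):
    """Return the row labels of a half-stored symmetric matrix.

    Pairs are listed row by row over the upper triangle, diagonal
    included, which matches the order GCA GCA, GCA SCA, ..., ERROR ERROR.
    """
    return list(combinations_with_replacement(names, 2))


def unpack_half_stored(values, n):
    """Rebuild the full n x n symmetric matrix from its half-stored vector."""
    expected = n * (n + 1) // 2
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    full = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            # Fill both triangles from the single stored value.
            full[i][j] = full[j][i] = values[k]
            k += 1
    return full


# Random effects of model 1B, in output order.
names = ["GCA", "SCA", "LOCxGCA", "LOCxSCA", "BLOCKxFAM", "ERROR"]
labels = half_stored_labels(names)

# Placeholder values only (0.0, 1.0, 2.0, ...); real values come from the output file.
values = [float(k) for k in range(len(labels))]
matrix = unpack_half_stored(values, len(names))

for label, value in zip(labels, values):
    print(label[0], label[1], value)
```

Six random effects give 6(6+1)/2 = 21 stored values, which agrees with the 21 labelled rows of the asymptotic variance covariance matrix.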


Error covariance matrix of the predictions

The error covariance matrix of the predictions is output as a half-stored matrix with each row appropriately labelled. This matrix for model 1B appears under the heading

THE ERROR VARIANCE COVARIANCE MATRIX FOR GCA ARRAYED AS A VECTOR

[numeric values not recoverable]

The labelling of the output is interpreted identically to that for the asymptotic variance covariance matrix for the variance components. Those rows which contain a parental name twice are the error variance for that parental prediction, and those rows containing two different parental names are the error covariance for the two parental predictions. In this unbalanced case the reader will see that some parents have more error associated with their predictions than others (i.e., compare the error for parent [...] with parent [...]). This is true because of the varying number of observations associated with the prediction for each parent and also the varying distribution of those observations across tests and blocks. If one assumes that the estimate for gca variance from the data equals the true variance for gca, then the correlation of the prediction with the true value, $\mathrm{Corr}(g,\hat{g})$ (White and Hodge [...]), for parent [...] is equal to $\sqrt{1 - ([\ldots]/[\ldots])}$, the square root of one minus the ratio of that parent's error variance to the gca variance, or [...].

Error covariance matrix for the fixed effects

The error covariance matrix for the fixed effects is output as a half-stored matrix. The output is not labelled; however, one only has to know the total number of levels for all fixed effects to assign labels, if needed. The primary use of this matrix is to estimate the variance of estimable functions of the fixed effects. If $l$ denotes the vector containing the specification of an estimable function and $V_b$ denotes the error covariance matrix for fixed effects, then the variance of an estimable function is equal to $l'V_b l$ (equation [...]). The vector $l'$ for the mean of test [...] equals [ ... ].

Conclusions

GAREML is an analytical tool for use with models common to forest genetics. The properties of the variance component estimation algorithm have been documented by simulation studies, and the algorithm presents solutions as restricted maximum likelihood estimates. Many other outputs are available from the program, including best linear unbiased predictions, generalized least squares estimates of fixed effects, error covariance matrices of predictions and estimates, and the asymptotic covariance matrix for variance component estimates.

GAREML is not intended to be used as a black box. The program has many potential uses: variance component estimation, parental evaluation, progeny evaluation, and simulated evaluation of mating and field design. However, thoughtful interpretation of the outputs is needed in order to realize the power and utility of the program.
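The two calculations described in this section, the variance of an estimable function $l'V_b l$ and the correlation of a gca prediction with its true value, are short enough to sketch directly. The example below is an illustration in Python under stated assumptions: the helper names are hypothetical, the covariance values are invented for the example, and the vector `l` assumes the set-to-zero ordering MU, two locations, then eight blocks, with the site mean of location 1 as the estimable function.

```python
import math


def unpack_half_stored(values, n):
    """Rebuild a full symmetric matrix from its half-stored (upper triangle) vector."""
    full = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            full[i][j] = full[j][i] = values[k]
            k += 1
    return full


def estimable_function_variance(l, v_b):
    """Variance of the estimable function l'b, computed as l' V_b l."""
    n = len(l)
    return sum(l[i] * v_b[i][j] * l[j] for i in range(n) for j in range(n))


def prediction_accuracy(error_variance, gca_variance):
    """Corr(g, g-hat) = sqrt(1 - error variance / gca variance)."""
    return math.sqrt(1.0 - error_variance / gca_variance)


# Assumed order of the 11 fixed-effect levels:
# MU, LOCATION 1, LOCATION 2, BLOCK(LOC) 1..8.
n_levels = 11

# Invented half-stored error covariance matrix: 0.04 on the diagonal,
# -0.01 between MU and LOCATION 1, zero elsewhere.
half_stored = []
for i in range(n_levels):
    for j in range(i, n_levels):
        if i == j:
            half_stored.append(0.04)
        elif (i, j) == (0, 1):
            half_stored.append(-0.01)
        else:
            half_stored.append(0.0)
v_b = unpack_half_stored(half_stored, n_levels)

# Site mean of location 1: MU + LOCATION 1 + average of its four blocks.
l = [1.0, 1.0, 0.0, 0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0]

site_mean_variance = estimable_function_variance(l, v_b)
print("variance of site mean:", site_mean_variance)
print("standard error:", math.sqrt(site_mean_variance))

# Invented error variance and gca variance for one parent.
accuracy = prediction_accuracy(error_variance=0.3, gca_variance=1.2)
print("Corr(g, g-hat):", accuracy)
```

Because the fixed-effect output is unlabelled, the only information needed to build `l` is the order and number of fixed-effect levels, which is why the text stresses knowing the total number of levels.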


CHAPTER [...]
CONCLUSIONS

Optimal mating design for the determination of genetic architecture was explored. General conclusions were reached through comparison of the half-diallel, half-sib and circular mating designs. In particular, the comparison of the half-diallel and circular designs is pertinent to the establishment of future progeny tests in which full-sib families are desired. Across the experimental levels examined, the circular mating design provides more efficient estimates of parameters for genetic architecture than the half-diallel design. If an estimate of the variance in general combining abilities is required, the half-sib design is more efficient than the circular mating design over most of the experimental levels examined. This pattern of efficiency argues for complementary mating designs involving half-sib designs (open-pollinated or polycross) to estimate general combining ability and a second design (full-sib mating) to generate crosses from which to make selections. Complementary mating designs do require a greater monetary and temporal commitment. If this type of commitment is not justified or possible, then the circular mating design should be used to generate full-sib families and estimate genetic parameters simultaneously.

Considering field design in combination with mating design, full-sib designs reach maximum efficiency for genetic parameter estimation with fewer replicates across locations than half-sib designs. For any specific case of field design and the half-sib mating design, a priori knowledge of the genetic architecture is required to choose the optimal field design for number of locations.


In cases where maximum efficiency of an experimental design is obtained and the precision of genetic parameter estimates is still less than desired, the optimal use of experimental units would be disconnected sets of experiments at maximum efficiency, with the parameter estimate then being a mean of the estimates from the disconnected experiments.

Of the three mating designs, only the half-diallel exhibits efficiency optima for number of parents. The optimum for number of parents in half-diallels is always close to, and never larger than, six parents, with the fluctuation resulting from the genetic architecture. Thus, for maximum efficiency in genetic parameter estimation with half-diallels, the number of parents should not exceed six, and the desired parameter precision should be obtained by using disconnected sets of six parents. Optima for number of locations exist for all mating designs, and maximum efficiency would again be obtained by replicating an experiment only for the optimal number of locations. A parameter estimate of the desired precision would be calculated as a mean of disconnected experiments.

Optimal analysis was dealt with in two stages: estimation of parental worth and estimation of variance components (or genetic architecture). The estimation of parental worth was examined for the half-diallel mating design. It is argued on theoretical grounds, and in generality, that best linear unbiased prediction and best linear prediction are more suited to the problem of parental evaluation than ordinary least squares.

Using simulated data for two mating designs (half-diallel and half-sib), variance component estimation techniques were compared across varying levels of data imbalance and two levels of genetic control. In estimating variance components (or genetic ratios such as heritability), four criteria were adopted for discrimination among estimation techniques (probability of nearness, bias, mean square error and variance of estimation). Of the four, only bias and variance of estimation proved informative. Bias proved useful in discriminating among treatments of negative estimates, with accepting and living with the negative estimates having the least bias, re-solving the system with negative estimates set to zero intermediate in bias, and setting negative estimates to zero producing the most bias. Variance of estimation also was discriminatory among treatments of negative estimates, with accepting and living with negative estimates having the highest variance, setting negative estimates to zero intermediate in variance, and re-solving the system setting negative estimates to zero having the lowest variance.

Variance of estimation was also discriminatory among units of observation and variance component estimation techniques. Of the two units of observation used (individuals and plot means), individual observations produced estimates with better properties across all levels of imbalance, mating designs and variance component estimation techniques. Of the variance component estimation techniques contrasted, restricted maximum likelihood produced estimates with the best properties (bias and variance of estimation) across all mating designs, levels of genetic control and levels of imbalance. Therefore, it is proposed that restricted maximum likelihood estimation with individual observations as data should be utilized.

With the recommendation to use restricted maximum likelihood, the program used to analyze the simulated data was rewritten into a user-friendly format able to analyze both full-sib and half-sib data. Additional outputs (other than variance components) were also added as options. These outputs include general and specific combining ability predictions, the asymptotic covariance matrix for variance components, generalized least squares estimates of fixed effects, and the covariance matrices for predictions and estimates.
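The trade-off reported above between accepting negative estimates (least bias, highest variance) and setting them to zero (more bias, lower variance) can be seen in a much simpler setting than the one studied here. The sketch below is an illustration only: it swaps in the ANOVA (method-of-moments) estimator for a balanced one-way random model in place of restricted maximum likelihood, it does not cover the re-solving treatment, and every parameter value and function name is an assumption chosen for the example.

```python
import random
from statistics import fmean, variance


def anova_between_variance(groups):
    """ANOVA estimate of the between-group variance for balanced one-way data.

    The estimate is (MSB - MSW) / n and can be negative.
    """
    a = len(groups)
    n = len(groups[0])
    means = [fmean(g) for g in groups]
    grand = fmean(means)
    msb = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (a * (n - 1))
    return (msb - msw) / n


def simulate(reps=2000, a=6, n=4, var_between=0.1, var_within=1.0, seed=1):
    """Simulate repeated experiments; return raw and zero-truncated estimates."""
    rng = random.Random(seed)
    raw = []
    for _ in range(reps):
        groups = []
        for _ in range(a):
            effect = rng.gauss(0.0, var_between ** 0.5)
            groups.append(
                [effect + rng.gauss(0.0, var_within ** 0.5) for _ in range(n)]
            )
        raw.append(anova_between_variance(groups))
    # Treatment "set negative estimates to zero".
    truncated = [max(0.0, estimate) for estimate in raw]
    return raw, truncated


raw, truncated = simulate()
print("true between-group variance: 0.1")
print("accept negatives: mean", fmean(raw), "variance", variance(raw))
print("set to zero:      mean", fmean(truncated), "variance", variance(truncated))
```

Truncation can only raise an individual estimate, so the mean of the truncated estimates is never below the mean of the raw ones, and because truncation shrinks the spread between any two estimates, their variance is never above it. This is the same direction of effect the simulation study reports for the two corresponding treatments.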


APPENDIX
FORTRAN SOURCE CODE FOR GAREML

C******THIS PROGRAM PRODUCES REML AND MIVQUE VARIANCE*************
C****COMPONENT ESTIMATES BY STARTING ITERATION FROM THE***********
C****TRUE VALUES OF THE PARAMETERS THROUGH THE USE OF*************
C***************GIESBRECHT'S ALGORITHM***************************
C PARAMETERS DETERMINE THE PROGRAM DIMENSIONS. ANY CHANGE IN
C PARAMETER SIZE DECLARATION SHOULD BE GLOBAL SINCE THEY ARE
C ALSO SPECIFIED IN THE SUBROUTINES
      PROGRAM MAIN
      PARAMETER (
C NOBSER IS THE MAXIMUM NUMBER OF OBSERVATIONS
     N NOBSER=[...],
C NOBL IS THE MAXIMUM NUMBER OF BLOCKS PER LOCATION
     N NOBL=[...],
C NOCR IS THE MAXIMUM NUMBER OF FULL-SIB CROSSES
     N NOCR=[...],
C NOBH IS THE MAXIMUM NUMBER OF FIXED EFFECT LEVELS INCLUDING THE MEAN
     N NOBH=[...],
C NVARBH DIMENSIONS THE VARIANCE COVARIANCE MATRIX FOR FIXED
C EFFECTS
     N NVARBH=(NOBH*(NOBH-1))/2+NOBH,
C NOGCA IS THE MAXIMUM NUMBER OF PARENTS
     N NOGCA=[...],
C NOVARG DIMENSIONS THE VARIANCE COVARIANCE MATRIX FOR GCA
     N NOVARG=(NOGCA*(NOGCA-1))/2+NOGCA,
C NOX IS THE MAXIMUM NUMBER OF COLUMNS FOR FIXED EFFECTS PLUS
C RANDOM EFFECTS
C PLUS ONE FOR THE DATA
     N NOX=[...],
C NOCBS IS THE MAXIMUM NUMBER OF LEVELS FOR THE RANDOM EFFECT
C HAVING THE GREATEST NUMBER, USUALLY CROSS BY BLOCK OR PLOT
C COMBINATIONS
     N NOCBS=[...],
C NTOT IS THE TOTAL NUMBER OF COLUMNS OF NOX PLUS NOCBS
     N NTOT=NOX+NOCBS,
C OTHER PARAMETERS USE THE PREVIOUS DECLARATIONS TO ALLOCATE
C SUFFICIENT SIZE TO SYMMETRIC MATRICES STORED AS VECTORS
     N NIZED=NOX*NOCBS,


1 1,;3; 12;r12; fff 12; 1 16,3 12; 12&%6 1 1,=(3 16,3r16,3 fff 16,3f &20021&01 1&2/71&2/7%1&2/*1&2/61&2/*71&2/6712%6 1 1&2/%1&2/;1&2/&%1&/f125$112),;1&/),; 1 1&/5$11&2/6(15$1f &20021&01 1 <494

35,17 r f:$51,1* <28 +$9( -867 (17(5(' 7+( 7:,/,*+7 =21( 2) 19$5,$1&( &20321(176f 35,17 r f$16:(5 < )25 <(6 25 1 )25 12 72 7+( )2//2:,1* 48(67,216f :5,7(f )250$7 fIFrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr -Arrrrrrrrrrrrrr M $ :5,7(f )250$7f 3/($6( 75< $*$,1ff 35,17rf ),567 7+( )$&7256 72 %( 5($' )520 7+( '$7$ :,// %( '(7( 150,1('f 35,17 r f '2(6 7+( '$7$ +$9( 08/7,3/( /2&$7,216" 5($'f '7(50f ,)'7(50f1(f

,) '7(50f(4f)ff 7+(1 *2 72 (1',) '7(50f f5f 35,17 r f :+$7 ,6 7+( 35,25 )25 /2&$7,21" 5($'f 35,,f )250$7)f ,) '7(50f(4f1ff *2 72 35,17 r f %/2&. ,6 ),;(' 25 5$1'20" f 5($'f '7(50f ,)'7(50f1(f)ff$1''7(50f1(f5fff 7+(1 :5,7(f *2 72 (1',) ,) '7(50f(4f)ff 7+(1 *2 72 (1',) '7(50f f5f 35,17 r f :+$7 ,6 7+( 35,25 )25 %/2&." 5($'f 35,,f ,) '7(50f(4f1ff *2 72 35,17 r f 6(76 $5( ),;(' 25 5$1'20" 5($'f '7(50f ,)'7(50f1(f)ff$1''7(50f1(f5fff 7+(1 :5,7(f *2 72 (1',) ,) '7(50f(4f)ff 7+(1 *2 72 (1',) '7(50f f5f 35,17 r f :+$7 ,6 7+( 35,25 )25 6(76" 5($'f 35,,f 35,17 r f $// 27+(5 )$&7256 $5( &216,'(5(' 5$1'20f 35,17 r f $16:(5 < )25 <(6 25 1 )25 12 )25 ,1&/86,21 2) 7+( )$&72 15 ,1 7+( 02'(/f :5,7(f 35,17 r f ,6 *&$ ,1 7+( 02'(/" f 5($'f '7(50f ,)'7(50f1(f

(1',) ,) '7(50f(4f1ff *2 72 & 35,17 r f *&$ ,6 ),;(' 25 5$1'20" & ,1387 r '7(50f & ,) '7(50f(4f)ff 7+(1 & & *2 72 & (1',) '7(50f f5f 35,17 r f :+$7 ,6 7+( 35,25 )25 *&$" 5($'f 35,,f ,)'80(4f+ff 7+(1 '7(50f f1f *2 72 (1',) 35,17 r f ,6 6& $ ,1 7+( 02'(/" 5($'f '7(50f ,)'7(50f1(f

35,17 r f :+$7 ,6 7+( 35,25 )25 /2&$7,21[*&$" 5($'f 35,,f ,)'80(4f+ff 7+(1 '7(50f f1f *2 72 (1',) 35,17 r f ,6 /2&$7,21[6&$ ,1 7+( 02'(/" 5($'f '7(50f ,)'7(50f1(f

)250$72 7+( 180%(5 2) ),;(' )$&7256 3/86 7+( 0($1 f 1f 7+( 180%(5 2) 5$1'20 )$&7256 3/86 (5525 ff 35,17 r f '2 7+(6( /(9(/6 0$7&+ <285 ,17(1'(' 02'(/" < 25 1 f 5($'f '80'80 ,) '80'80(4f1ff 7+(1 35,17 r f 5(7851,1* 72 ,1,7,$/,=$7,21 2) 02'(/f 35,17 r f 72 (;,7 352*5$0 86( &21752/%5($.f *2 72 (1',) 35,17 r f 7+( ,1387 '$7$ 6(7 1$0( ,6 5($'f )/1$0( )250$7$f :5,7(f )250$72 7+( )250$7 2) 7+( '$7$ ,6 5(0(0%(5,1* 3$5(17+(6(6ff 5($',f )0$7 )250$7$f 23(1 ),/( )/1$0(67$786 f2/'ff 12%6 ,)'80(4f+ff *2 72 ,)'7(50f(4f1ff $1' '7(50f (4f1ff $1' '7(50f 1(4f1fff *2 72 ,)'7(50f(4f1ff$1''7(50f(4f1fff *2 72 ,)'7(50f(4f1ff$1''7(50f(4f1fff *2 72 ,)'7(50f(4f1ff *2 72 ,)'7(50f(4f1ff$1''7(50f(4f1fff *2 72 ,)'7(50f(4f1ff *2 72 ,)'7(50f(4f1ff *2 72 5($' )07 )0$7(1' f 7(6712%6f%/2&.12%6f6(712%6f 1 )12%6f012%6f0($112%6f *2 72 5($' )07 )0$7(1' f 7(6712%6f%/2&.12%6f)12%6f012%6f 1 0($112%6f *2 72 5($' )07 )0$7(1' f %/2&.12%6f6(712%6f)12%6f012%6f 1 0($112%6f *2 72 5($' )07 )0$7(1' f )12%6f012%6f0($112%6f *2 72 5($' )07 )0$7(1' f 6(712%6f)12%6f012%6f0($112%6f *2 72 5($' )07 )0$7(1' f %/2&.12%6f)12%6f012%6f0($112%6f *2 72 5($' )07 )0$7(1' f 7(6712%6f)12%6f012%6f0($112%6f *2 72 5($' )07 )0$7(1' f 7(6712%6f6(712%6f)12%6f012%6f 1 0($112%6f *2 72 ,)'7(50f (4f1ff $1' '7(50f (4f1ff $1' '7(50f


1(4f1fff *2 72 ,)'7(50f(4f1ff$1''7(50f(4f1fff *2 72 ,)'7(50f(4f1ff$1''7(50f(4f1fff *2 72 ,)'7(50f(4f1ff$1''7(50f(41fff *2 72 ,)'7(50f(4f1ff *2 72 ,)'7(50f(4f1ff *2 72 5($' )07 )0$7(1' f 7(6712%6f%/2&.12%6f6(712%6f 1 )12%6f0($112%6f *2 72 5($' )07 )0$7(1' f 7(6712%6f%/2&.12%6f)12%6f 1 0($112%6f *2 72 5($' )07 )0$7(1' f )12%6f0($112%6f *2 72 5($' )07 )0$7(1' f 6(712%6f)12%6f0($112%6f *2 72 5($' )07 )0$7(1' f %/2&.12%6f)12%6f0($112%6f *2 72 5($' )07 )0$7(1' f 7(6712%6f)12%6f0($112%6f *2 72 5($' )07 )0$7(1' f 7(6712%6f6(712%6f)12%6f 1 0($112%6f 12%6 12%6 *2 72 12%6 12%6 &/26(Of :5,7(f 12%6 )250$7f 7+( 180%(5 2) 2%6(59$7,216 ,6 ff ,)'80(4f+ff *2 72 '2 12%6 )0,f ),f0,f &217,18( '2 ,)'7(50,f(4f1ff *2 72 ,)'7(50,f(4f5ff 7+(1 . 5$11$0.f 1$0(,f (1',) &217,18( 5$11$0. f 1$0(f '2 O12&5 )09(&,f f &217,18( '2 ,)35,,f*7f 7+(1 -


6,*-f 35,,f (1',) &217,18( 1&2/7 1&2/% 1&2/6( 1&2/7% 1&2/* 1&2/6 1&2/*7 1&2/67 1&2/&% ,)'7(50f(4f1ff *2 72 &$// 12&2/7(6712%6/2&21&2/7f 1&/f 1&2/7 ,)'7(50f(4f1ff *2 72 &$// 12&2/%/2&.12%65(31&2/%f ,)'7(50f(4f1ff *2 72 &$// 12&2/6(712%6',66(71&2/6(f 1&/f 1&2/6( ,)'7(50f(41ff$1''7(50f(4f

1 '2 11&2/6 ,))09(&.f/7)09(&-ff *2 72 17 )09(&.f )0 9(&.f )0 9(&-f )09(&-f 17 &217,18( ,)'80(4f+ff 1&2/6 1&/f 1&2/6 1&2/67 1&2/6 r1&2/7 1&2/*7 1&2/* r1&2/7 1&2/&% 1&2/6 r1&2/7% ,)'80(4f+ff 1&2/&% 1&2/*r1&2/7% ,)'7(50f(4 f 1ff 1&2/*7 ,)'7(50f(4f1ff 1&2/67 ,)'7(50f(4f1ff 1&2/&% 1&/f 1&2/*7 1&/f 1&2/67 1&/f 1&2/&% :5,7(f 1&2/* )250$72 180%(5 2) 3$5(176 ,6 ff :5,7(f 1&2/6 )250$7f 180%(5 2) )8//6,% &5266(6 ,6 ff 1&/),; 1&/5$1 '2 ,)'7(50,f(4f)ff 7+(1 1&/),; 1&/),; 1&/,f *2 72 (1',) 1&/5$1 1&/5$1 1&/,f &217,18( :5,7(f 1&/),;1&/5$1 )250$72 ),;(' ())(&7 &2/8016 f 1f 5$1'20 ())(&7 &2/8016 ff &9(5* 35,17 rf 7+( &219(5*(1&( &5,7(5,21 )25 9$5,$1&( &20321(176 :+,&+ 1(48$/6f 35,17 rf 7+( 680 2) 7+( $%62/87( '(9,$7,216 ,6 6(7 72 f 35,17 rf ,) <28 :,6+ 72 &+$1*( 7<3( < ,) 127 7<3( 1 f 5($'f '80'80 )250$7$Of ,)'80'80(4f1ff *2 72 35,17rf 7+( &219(5*(1&( &5,7(5,21 ,6 f 5($'f &9(5* 1&2/; 1&/),; 1&/5$1 12,76


35,17rf 7+( 180%(5 2) ,7(5$7,216 $//2:(' ,6 6(7 72 f 35,17rf '2 <28 :,6+ 72 &+$1*( 7+,6" < 25 1f f 5($'f '80'80 ,)'80'80(4f

&$// /6:36/15$115$1 125$1f '2 O125$1 5(0/,f 62/,125$1 f &217,18( =$* '2 O125$1 =$* =$* '$%65(0/f'80,ff &217,18( '2 O125$1 6,*,f 5(0/,f &217,18( ,)=$*/7&9(5*f *2 72 &217,18( ,)'80%(4f1ff *2 72 ,)'80%(4f

,)'80'80(4f

'2 ,)'7(50,f(4f)ff 7+(1 '2 1&/,f ,),(4Of 7+(1 :5,7()07 f /2&2.f%+$7-f (1',) ,)(4f 7+(1 :5,7()07 f ',66(7.f%+$7-f (1',) :5,7()07 f 1$0(,f.%+$7-f &217,18( (1',) &217,18( )250$7$7)f )250$7$,)f &/26(f '2 O129$5* 9$5*,f &217,18( '2 19$5%+ 9$5%+,f &217,18( 35,17 '2 <28 '(6,5( 7+( $6<03727,& 9$5,$1&( &29$5,$1&(f 35,17 0$75,; )25 9$5,$1&( &20321(176" < 25 1f f 5($'f '80'80 ,)'80'80(4f1ff *2 72 35,17 rf :+$7 ,6 7+( ),/(1$0( )25 9$59&f" f 5($'f )/1$0( 23(1),/( )/1$0(67$786 f81.12:1ff :5,7(f )250$7f $6<03727,& 9$5,$1&( &29$5,$1&( 0$75,;ff '2 O125$1 '2 ,125$1 62/,-f 62/,-fr :5,7(f 5$11$0,f5$11$0-f62/,-f &217,18( &217,18( )250$7$ 7$ 7)f 35,17rf'2 <28 '(6,5( 7+( (5525 9$5,$1&( &29$5,$1&( 0$75,; )25 1*&$" < 25 1f f 5($'f '80'80 ,)'80'80(4f1ff *2 72 35,17 rf :+$7 ,6 7+( ),/(1$0( )25 (9$5*+$7f" f 5($'f )/1$0( 23(1),/( )/1$0(67$786 f81.12:1ff &$// 9$5;9$5*9$5%+f :5,7( f


. '2 O1&2/* '2 ,1&2/* . :5,7(f 9$5*.f3$5(17,f3$5(17-f &217,18( &217,18( )250$7f7+( (5525 9$5,$1&( &29$5,$1&( 0$75,; )25 *&$ $55$<(' 1$6 $ 9(&725ff )250$7)7$7$f &/26(,2f 35,17 r f '2 <28 '(6,5( 7+( 9$5,$1&( &29$5,$1&( 0$75,; )25 ),;(' 1 ())(&76" < 25 1f f 5($'f '80% ,)'80%(4f1ff *2 72 ,)'80'80(4f1ff &$// 9$5;9$5*9$5%+f 35,17 r f :+$7 ,6 7+( ),/(1$0( )25 9$5%(7$+$7f" f 5($'f )/1$0( 23(1OO),/( )/1$0(67$786 f81.12:1ff '2 1&/),; '2 ,1&/),; . :5,7(f 9$5%+.f &217,18( &217,18( )250$7)f &/26(OOf 6723 (1' errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr & 68%5287,1( /6:3 6:36 7+( '(6,*1$7(' &2/8016 2) $ 0$75,; ; $1' & 5(78516 7+( 6:(37 0$75,; $6 ; 68%5287,1( /6:3;152:;1&2/;;167$1(1'f ,17(*(5 152:;1&2/;;167$1(1'1727 '28%/( 35(&,6,21 ;f '0,1 % %%f & 16:3 '(),1(6 7+( 3,927 &2/8016 )25 6:3 '0,1 ,( & ,) /(66 7+$1 )8// 5$1. 0$75,&(6 $5( (1&2817(5(' '0,1 0867 %( & (03/2<(' & 72 =(52 7+( 52: $1' &2/801 $662&,$7(' :,7+ 7+( '(3(1'(1&< 72 & 352'8&( $ *(1(5$/,=(' ,19(56( '2 167$1(1' ;..f ,) '/('0,1f 7+(1 '2 O152:; '2 O1&2/;; ;,.f


;.-f &217,18( &217,18( *2 72 (1',) '2 O1&2/;; ;.-f ;.-f' &217,18( '2 O152:; & 6+28/' %( ,1&5(0(17(' 62 7+$7 ,6 127 (48$/ 72 % ;,.f '2 / O1&2/;; ;,/f ;,/f%r;./f &217,18( ;,.f %' &217,18( ;..f & %$&.:$5' (/,0,1$7,21 1727 167$ 1(1' ,)1727(4f *2 72 & 6$9,1* $%29( ',$*21$/ (175,(6 )25 08/7,3/,&$7,21 :(,*+76 .. '2 %%..f ;-.f .. .. &217,18( & =(52,1* $%29( ',$*21$/ (175,(6 )25 ,16(57,21 2) ,19(56( 9$/8(6 '2 .O ;,.f &217,18( & '2,1* 52: 23(5$7,216 72 &5($7( $%29( ',$*21$/ (175,(6 )25 ,19(56( 1 '2 0 % %%1f 1 1 '2 O1&2/;; ;0-f ;0-f%r;.-f &217,18( &217,18( &217,18( 5(7851 (1' & '(6,*1 &5($7(6 '(6,*1 0$75,&(6 )25 0$,1 ())(&76 $1' ,17(5$&7,216 & $1' )2506 7+( 1250$/ (48$7,216 68%5287,1( '(6,*1 3$5$0(7(5


1 12%6(5 1 12%/ 1 12&5 1 12%+ 1 12*&$ 1 12; 1 12&%6 1 1727 12; 12&%6 1 1,=(' 12;r12&%6 1 1,;3; 12;r12; fff 12; 1 16,3 12; 12&%6 1 1,=(3 16,3r16,3 fff 16,3f &20021&01 1&2/71&2/7%1&2/*1&2/61&2/*71&2/6712%6 1 1&2/%1&2/;1&2/&%1&/f125$112),;1&/),; 1 1&/5$11&2/6(15$1f &20021&01 1 <494

'2 1:180 3,f &217,18( errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr & )250,1* 7+( 0$75,; 72 %( 6:3 72 352'8&( <494< $1' 9494 errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr & 7. ;fr,199.fr; &203/(7(' Afr&rrrrrrrrrrWrrrrrrrrrrWrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 167$. 1&/),; 1&/5$1 '2 ,180 O125$1O 12'80 125$1,180 1&2/5' 1&2/; 15$112'80f 167$. 167$.15$112'80f 1(1'. 167$. 15$112'8 0f '2 167$.1(1'. 0 19(&, 1&2/;f ,, ,167$. 1 19(&,,1&2/5'f 11 1 '2 ,1(1'. 0 0 11 11 311f 7.0fr6,*12'80f &217,18( 31 f 31 f &217,18( & 5 6,*,fr=Lfr,199.fr=Lf +$6 %((1 )250(' '2 167$.1(1'. '2 1&2/; . ,)-/7,f 7+(1 '.f *2 72 (1',) 0 19(&, 1&2/;f 0 0 -, '.f 7.0fr64576,*12'80ff &217,18( &217,18( '2 167$.1(1'. 1 19(&,1&2/;f ,, ,167$. 11 1&2/;r,,Of '2 ,1&2/;


R R 0 1-, 11 '.f 7.0fr64576,*12'80ff &217,18( &217,18( eWrWrr_W_WWrrrrWWrrWrrrrrrWWrWWWWrrrrrrrrrrrrrrrrrrrrWrrrrrWrr & =Lfr,199.fr;r64576,*,ff +$6 %((1 )250(' e!rrrrrrrrrrrrr_rrWrrWLrrrrrrrrrrrrrrrrrrrrWWW_rOrrrrrrrrrrrrrr Arrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 7' 'f _Wr_rrrW_rrr_WrrrrWOFWWrrWWrrrrrrrrrr__rrrrrrrWWWrrrrrrrrrrrrIF 1(1' 15$112'80f '2 O15$112'80f 1 19(&,1&2/5'f '2 15$112'80f O1&2/5' . 0 1-, 30f '.f &217,18( &217,18( '2 15$112'80f O1&2/5' 19(&,1&2/5'f ,, ,15$112'80f 0 19(&,,1&2/;f '2 ,1&2/5' . 0 0 3.f 7.0f &217,18( &217,18( & 3 5_ c'f7'c @7.f &$// 9(&6:331&2/5'1&2/5' 15$112'80ff '2 O1&2/; ,, 15$112'80f 0 19(&,,1&2/5'f '2 ,1&2/; . 0 0 7..f 30f &217,18( &217,18( &217,18( '2 O1&2/;


,, 15$1f 0 19(&,,1&2/5'f '2 ,1&2/; . 0 0 7..f 30f &217,18( &217,18( 1'),; 1&/),; &$// 9&6:37.1&2/;1&2/; 1&/),;1'),;f errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr & 3257,216 2) 7. $5( 6(/(&7(' $1' 08/7,3/,(' $1' 7+( 75$&( &$/&8 & /$7(' 72 )250 9494 errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr &rrrrrrrrrrrrrrrrre4/MMMAM>VM M \4 125$1 2) 9494rrrrrrrrrrrrrrrrrrrrr 1(1' 1&/),; '2 O125$1O 167$ 1(1' 1(1' 167$ 15$1-f 75 167$. 1(1' '2 -125$1O ,)D(4-f 7+(1 '2 167$1(1' 1 19(&,,1&2/;f '2 ,,1(1' 0 1 .,, ,),,(4.f 7+(1 75 75 7.0fr7.0f *2 72 (1',) 75 75 r7.0fr7.0f &217,18( &217,18( 9494-,f 75 *2 72 (1',) 1(1'. 167$. 15$1,f 75 '2 / 167$1(1' 1 19(&/1&2/;f '2 167$.1(1'. 0 1 ./ 75 75 7.0fr7.0f &217,18( &217,18( 167$. 1(1'. 9494-,f 75


&217,18( &217,18( ArrrrrrrrrrrrrrrAf4MAAMMA>L\M QRUDQ RI YTYTrrrrrrrrrrrrrrrrrrrrrrrrrrrr '2 125$1 75$&(5p '2 ,125$1O 9494-,f 9494,-f &217,18( &217,18( 167$ 1&/),; '2 O125$1 1(1' 167$ 15$1-f '2 167$1(1' 1 19(&,1&2/;f 1 1 75$&(5-f 75$&(5-f 7.1f &217,18( 167$ 1(1' &217,18( '2 O125$1O 9494,125$1f 75$&(5,f &217,18( 68% '2 O125$1O 68% 68% 75$&(5,fr6,*,f '2 O125$1O 9494IO125$1f 9494,125$1f6,*-fr9494,-ff &217,18( 9494,125$1f 9494,125$1f6,*125$1f &217,18( 167$. 12%61'),; 75 )/2$7167$.f 9494125$1125$1f 7568%f6,*125$1f '2 125$1 9494125$1125$1f 9494125$1125$1f6,*,fr9494,125$1ff &217,18( 9494125$1125$1f 9494125$1125$1f6,*125$1f '2 O125$1O 9494125$1,f 9494,125$1f &217,18( Frrrrrrrrrrrrr)50L1T 9(&725 2) ),;(' ())(&76 (67,0$7(6rrrrrrrrr '2 1&/),; 1 19(&,1&2/;f 1 1 1&/),;, %+$7,f 7.1f &217,18( errrrrrrrrrrrr)45@PMYM4 9(&7256 2) SUHGLFWLRQVrrrrrrrrrrrrrr '2


,)5$11$0,f(4f*&$ff 7+(1 167$ *2 72 (1',) &217,18( *2 72 1(1' '2 167$ 1(1' 1(1' 15$1,f &217,18( / 1(1' 1 19(&1&/),; 1&2/;f / / 1 '2 O1&2/* / / *&$,f 7./f &217,18( '2 ,)5$11$0,f(4f6&$ff 7+(1 167$ *2 72 (1',) &217,18( *2 72 1(1' '2 167$ 1(1' 1(1' 15$1,f &217,18( / 1(1' 1 19(&1&/),; 1&2/;f / / 1 '2 O1&2/6 / / 6&$,f 7./f &217,18( errrWrrrrrrrrrS4-A>\---A4 <494

<494<125$1f 7.167$f '2 O125$1O <494<125$1f <494<125$1f6,*,fr<494<,ff &217,18( <494<125$1f <494<125$1f6,*125$1f '($//2&$7( 7.'3f 5(7851 (1' Frrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr & 7+,6 )81&7,21 &28176 7+( 180%(5 2) (175,(6 )25 $1 ())(&7 68%5287,1( 12&2/9(&2%69(& 1&2/f 3$5$0(7(5 1 12%6(5 f ,17(*(5 2%61&2/ &+$5$&7(5r 9(&12%6(5f9(& rf=;17 '2 2%6 ,)D(4Of 7+(1 Y(FLDf Y(FDf 1&2/ *2 72 (1',) '2 O1&2/ ; 9(&Df = 9(& -f ,);(4=f *272 &217,18( 1&2/ 1&2/ 9(& 1&2/f 9(&,f &217,18( '2 O1&2/O 1 '2 11&2/ ,)9(&.f/79(&-ff *2 72 17 9(&.f 9(& .f 9(& -f 9(&-f 17 &217,18( 5(7851 (1' Frrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr & 7+,6 )81&7,21 &28176 7+( 180%(5 2) (175,(6 )25 3$5(176 68%5287,1( 123$59(&O9(&2%69(&13$5f 3$5$0(7(5 1 12%6(5 1 12*&$ f ,17(*(5 2%613$5 &+$5$&7(5r 9(&O12%6(5f9(&12%6(5f9(&12*&$f<=;17 '2 2%6


,)IO(4Of 7+(1 9(&,f 9(&,f 9(&,f 9(&,f 13$5 *2 72 (1',) '2 13$5 ; 9(&,f = 9(&-f ,);(4=f *2 72 &217,18( 13$5 13$5 9(&13 $5f 9(& ,f '2 13$5 < 9(&f = 9(&.f ,)<(4=f *272 &217,18( 13$5 13$5 9(&13$5f 9(&,f &217,18( '2 13$5 1 '2 113$5 ,)9(&.f/79(&-ff *2 72 17 9(&.f 9(&.f 9(&-f 9(&-f 17 &217,18( 5(7851 (1' \rrrr &rr9(&6:3 352'8&(6 $ ,19(56( 2) $ 6<00(75,& 0$75,; 6725(' $6rr A 9(&725 r rrrrrr r r r r r rrrrr r r r r r r r r r r r r errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 68%5287,1( 9(&6:39(&152:;1&2/;;167$1(1'f 3$5$0(7(5 1 12%6(5 1 12%/ 1 12&5 1 12%+ 1 12*&$ 1 12; 1 12&%6 1 1727 12; 12&%6 1 1,=(' 12;r12&%6 1 1,;3; 12;r12; fff 12; 1 16,3 12; 12&%6


1 1,=(3 16,3r16,3Offf 16,3f ',0(16,21 9(&rf9f $//2&$7$%/( 9 ,17(*(5 152:;1&2/;;167$1(1'180%19(&9180O180 '28%/( 35(&,6,21 9(&'0,1'%& $//2&$7( 91&2/;;ff '0,1 '2 O1&2/;; YDf L &217,18( '2 167$1(1' 180 .r.ff 1&2/;;r.Of 180% 180 9(&180f ,) '$%6'f/('0,1f 7+(1 '2 ,),(4.f 7+(1 180 ,r,ff 1&2/;;r,Of *2 72 (1',) 180 ,r, ff .1&2/;;r,f 9(&180f &217,18( 180 180% '2 O1&2/;; 180 180 9(&180f &217,18( *2 72 (1',) '2 O152:; ,),(4.f *2 72 180 19(&,1&2/;;f ,),/7.f 7+(1 180 180 ., % 9(&180f' *2 72 (1',) 180 180% ,. % )/2$79,ffr)/2$79.ffr9(&180ff' ,)'$%6%f/7'ff *2 72 '2 ,1&2/;; ,)-(4.f *2 72 ,)./7-f 7+(1 180 180% -. & 9(&180f *2 72 (1',)


180 -r-Off 1&2/;;r-Of & )/$79-ffr)/$79.ffr9(&180f ,)'$%6&f/7'ff *2 72 180 180 9(&180f 9(&180f%r&f &217,18( &217,18( '2 .1&2/;; 180 180% -. 9(&18 0f 9(&180f' &217,18( '2 ,),(4.f 7+(1 180 DrDff 1&2/;;r,f *2 72 (1',) 180 ,r,Off 1&2/;;r,Of 9(&180f 9(&180f' &217,18( 9(&180% f 9 .f 9 .f &217,18( '($//2&$7( 9f 5(7851 (1' &rr9&6:3 352'8&(6 $ ,19(56( 2) $ 6<00(75,& 0$75,; 6725(' $6rr A r rr `F IF rrr rrrArr A 9(&725 r r r r r r r r r r rrA rrrrrr r r r r r r rrrrrr r emrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 68%5287,1( 9&6:39(&152:;1&2/;;167$1(1'1')f 3$5$0(7(5 1 12%6(5 1 12%/ 1 12&5 1 12%+ 1 12*&$ 1 12; 1 12&%6 1 1727 12; 12&%6 1 1,=(' 12;r12&%6 1 1,;3; 12;r12; fff 12; 1 16,3 12; 12&%6 1 1,=(3 16,3r16,3Offf 16,3f ',0(16,21 9(&rf9f $//2&$7$%/( 9 ,17(*(5 152:;1&2/;;167$1(1'180%19(&9180O1801') '28%/( 35(&,6,21 9(&'0,1'%& '0,1 '


$//2&$7( 91&2/;;ff '2 O1&2/;; 9,f &217,18( '2 167$1(1' 180 .r.ff 1&2/;;r.f 180% 180 9(& 18 0f ,) '$%6'f/('0,1f 7+(1 1') 1') '2 ,),(4.f 7+(1 180 ,r,ff 1&2/;;r,f *2 72 (1',) 180 ,r,'f 1&2/;;r,Of 9(&180f &217,18( 180 180% '2 .O1&2/;; 180 180 9(& 18 0f &217,18( *2 72 (1',) '2 O152:; ,),(4.f *2 72 180 19(&,1&2/;;f ,),/7.f 7+(1 180 180 ., % 9(&180f' *2 72 (1',) 180 180% ,. % )/2$79,ffr)/2$79.ffr9(&180ff' ,)'$%6%f/7'ff *2 72 '2 ,1&2/;; ,)-(4.f *2 72 ,)./7-f 7+(1 180 180%-. & 9(&180f *2 72 (1',) 180 rff 1&2/;;r-f & )/2$79-ffr)/2$79.ffr9(&180f ,)'$%6&f/7'ff *2 72 180 180 9(&180f 9(&180f%r&f


&217,18( &217,18( '2 .1&2/;; 180 180% -. 9(&180f 9(&180f' &217,18( '2 ,),(4.f 7+(1 180 DrDff 1&2/;;r,Of *2 72 (1',) 180 ,r, ff 1&2/;;r,Of 9(&180f 9(&180f' &217,18( 9(&180% f 9 .f 9 .f &217,18( '($//2&$7( 9f 5(7851 (1' 4rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr &rrrrrr19(& &28176 7+( 3523(5 326,7,21 2) $1 (/(0(17rrrrrrr &rrrrrrrrr,1 7+( +$/) 6725(' 0$75,; $6 $ 9(&725frrrrrrrrrr &rrrrrrr$&&5',1* 72 ,76 1250$/ 52: &2/801 326,7,21rrrrrrrr &rrrrrrrrrrrrrrrrrMQ WKH RULJLQDO PDWUL[rrrrrrrrrrrrrrrrrrr )81&7,21 19(&152:61&2/;;f ,17(*(5 152:61&2/;;19(& 0 '2 O152:6 ,)D(4Of *2 72 0 0 1&2/;; f &217,18( 19(& 0 5(7851 (1' errrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 68%5287,1( ;35,0;7(67%/2&.6(7)0)0f 3$5$0(7(5 1 12%6(5 1 12%/ 1 12&5 1 12%+ 1 12*&$ 1 12; 1 12&%6 1 1727 12; 12&%6 1 1,=(' 12;r12&%6


1 1,;3; 1 2; r 1 2; fff 12; 1 16,3 12; 12&%6 1 1,=(3 16,3r16,3 fff 16,3f &20021&01 1&2/71&2/7%1&2/*1&2/61&2/*71&2/6712%6 1 1&2/%1&2/;1&2/&%1&/f125$112),;1&/),; 1 1&/5$11&2/6(15$1f &20021&01 1 <494

,)7(67,f(4/2&2-ff 7+(1 10/9 *2 72 (1',) &217,18( ;,1-f &217,18( /2&OOf 0/9 0/9 0/9 1&2/7 /2&Of 0/9 & )250,1* '(6,*1 0$75,; )25 %/2&. ,)'7(50Of(4f1ff25'7(50f(4f5fff *2 72 '2 12%6 '2 O1&2/% ,)%/2&.,f‘ (45(3-ff 7+(1 1. *2 72 (1',) &217,18( '%/2&.,1.f &217,18( 167$ /2&OOf 1(1' /2&Of ,)'7(50f(4f1ff 7+(1 167$ 1(1' (1',) '2 12%6 / 0/9 '2 167$1(1' '2 O1&2/% ;,/f ;,-fr'%/2&.,.f / / &217,18( &217,18( &217,18( /2&Of 0/9 0/9 0/9 1&2/7% /2&f 0/9 ,)2'7(50D2(4f1A25&'7(50A(4f5fff *272 '2 12%6 '2 O1&2/6( ,)6(7,f(4',66(7-ff 7+(1 1. 0/9 *2 72 (1',) &217,18( ;,1.f


&217,18( /2&f 0/9 0/9 0/9 1&2/6( /2&f 0/9 0/9 0/9 ,)'7(50OOf(4f1ff25'7(50Of(4f)fff *2 72 '2 12%6 & )250,1* '(6,*1 0$75,; )25 7(67 '2 O1&2/7 ,)7(67,f(4/2&2-ff 7+(1 10/9 *2 72 (1',) &217,18( ;,1-f &217,18( /2&OOf 0/9 0/9 0/9 1&2/7 /2&Of 0/9 & )250,1* '(6,*1 0$75,; )25 %/2&. ,)'7(50Of(4f1ff25'7(50f(4f)fff *2 72 '2 12%6 '2 O1&2/% ,)%/2&.,f(45(3-ff 7+(1 1. *2 72 (1',) &217,18( '%/2&.,1.f &217,18( 167$ /2&OOf 1(1' /2&O f ,)'7(50f(4f1ff 7+(1 1(1' 167$ (1',) '2 12%6 / 0/9 '2 167$1(1' '2 O1&2/% ;,/f ;,-fr'%/2&.,.f / / &217,18( &217,18( &217,18( /2&Of 0/9 0/9 0/9 1&2/7% /2&f 0/9


,)2'7(5022(4f1225&'7(50A(4f)fff *2 72 '2 12%6 '2 O1&2/6( ,)6(7f(4',66(7-ff 7+(1 1. 0/9 *2 72 (1',) &217,18( ;,1.f &217,18( /2&Of 0/9 0/9 0/9 1&2/6( /2&f 0/9 & )250,1* '(6,*1 0$75,; )25 *&$ ,)'7(50f(4f1ff *2 72 '2 12%6 '2 O1&2/* ,)),f(43$5(17-ff 7+(1 1/ 0/9 *2 72 (1',) &217,18( ;,1/f ,)'80(4f+ff *2 72 '2 O1&2/* ,)0,f(43$5(17.ff 7+(1 11 .0/9 *2 72 (1',) &217,18( ;,11f &217,18( /2&Of 0/9 0/9 0/9 1&2/* /2&f 0/9 ,)'7(50f(4f1ff *2 72 167$ 0/9 '2 12%6 '2 O1&2/6 ,))0,f(4)09(&-ff 7+(1 ;,167$f *2 72 (1',) &217,18( &217,18( /2&Of 0/9 0/9 0/9 1&2/6 /2&f 0/9


,)'7(50Of(4f1ff25'7(50OOf(4f1fff *2 72 167$ /2&OOf 1(1' /2&Of 167$. /2&Of 1(1'. /2&f '2 12%6 / 0/9 '2 167$1(1' '2 167$.1(1'. ;,/f ;,-fr;,.f / / &217,18( &217,18( &217,18( 0/9 0/9 1&2/*7 ,)'7(50Of(4f1ff25'7(50OOf(4f1fff *2 72 167$. /2&Of 1(1'. /2&f '2 12%6 / 0/9 '2 167$1(1' '2 167$.1(1'. ;,/f ;,-fr;,.f / / &217,18( &217,18( &217,18( 0/9 0/9 1&2/67 ,)'7(50Of(41f25'7(50Of(4f1fff *2 72 167$ /2&Of 1(1' /2&f 167$. /2&Of 1(1'. /2&f ,)'80(4f+ff 7+(1 167$. /2&Of 1(1'. /2&f (1',) '2 12%6 / 0/9 '2 167$1(1' '2 167$.1(1'. [D/f [DMfr[D.f / / &217,18( &217,18( &217,18( ArrrrrrrrrrrrrrrrrrrrrrrrrrrArrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr & ; 08c _+7_ _7_ c7%c _*_ c6c c *7c c 67c _&% &203/(7('


Arrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr '($//2&$7( '%/2&.f 35,17r frrrrrrr),1,6+(' )250,1* 7+( '(6,*1 0$75,;rrrrrrrrrrf 35,17r frrrrrrr12: &+(&.,1* )25 18// &2/8016rrrrrrrrrrrrrrrf 1(1' 1&/),; 10,66 '2 O125$1O 167$ 1(1' 1(1' 167$ 15$1.f '2 167$1(1' '2 12%6 ,);D-f1(f *2 72 &217,18( 15$1.f 15$1.f 10,66 10,66 18/9(&10,66f &217,18( &217,18( 35,17rfrrrrrrrrrrr),1,6+(' &+(&.,1* )25 18// &2/8016rrrrrrrrrf :5,7(f 10,66 )250$7I 7+(5( :(5( ff 18// &2/8016ff ,)10,66(42f *2 72 35,17 r!rrrrrrrrrrr12: '(/(7,1* 18// &2/8016rrrrrrrrrrrrrrrrf 18/9(&10,66 Of 1&2/; / 18/9(&f '2 10,66 ,)18/9(&, f18/9(&,ff(4 f *2 72 '2 18/9(&,f 18/9(&, f '2 O12%6 ;./f ;.-f &217,18( / / &217,18( &217,18( 1&/5$1 1&/5$110,66 1&2/; 1&2/;10,66 180 1&2/;r1&2/;Off 1&2/; $//2&$7( ;3;180Off '2 180 ;3;Df &217,18( 35,17rfrrrrrrrrrr)250,1* '27 352'8&76 2) '(6,*1 &2/8016rrrrrrrf '2 1&2/; 1 19(&, 1&2/;f '2 ,1&2/; 1 1 '2 O12%6 ;3;1f ;3;1f )/2$7;.,ffr)/2$7;.-fff


&217,18( &217,18( &217,18( 35,17rfrrrrrrrr)250,1* '27 352'8&76 2) '(6,*1 &2/8016 $1' 7+( 1$7$ 9(&725rrrrrrrrf / 1&/),; '2 O1&2/; ,)-/(1&/),;f 7+(1 1 19(&-1&2/;f 1 1 1&/),; (1',) ,) -*71&/),;f 7+(1 1 19(&/1&2/;f 1 1-1&/),; (1',) '2 O12%6 =$3 )/2$7;.-ff =,3 0($1.f ,)-(4/f =$3 0($1.f ;3;1f ;3;1f =,3r= $3f &217,18( &217,18( 35,17rfrrrrrrr$// '27 352'8&76 +$9( 12: %((1 )250('rrrrrrrrf 35,17rfrrr6$9,1* ; 35,0( ; 0$75,; )25 )8785( ,7(5$7,216rrrrf :5,7(f ;3; 35,17rfrrrrrrrrr; 35,0( ; ,6 6725('rrrrrrrrrf '($//2&$7( ;;3;f 5(7851 (1' errrrrrrrrrrrrrrrr$/*25,7+0rrrrrrrrrrrrrrrrrrrrrrrrrrr &rrrrrrrrrrr0',),(' 7 287387 9$5,$1&( &29$5,$1&(rrrrrrrrrrrrrrrr errrrrr rrrrrrrrrrPAnMnOAMe 2) 35(',&7,216 rrrrrrrrrrrrrrrrrrrrrrrrrrrr 68%5287,1( 9$5;9$5*9$5%+f 3$5$0(7(5 1 12%6(5 1 12%/ 1 12&5 1 12%+ 1 19$5%+ 12%+r12%+ ff 12%+ 1 12*&$ 1 129$5* 12*&$r12*&$Off 12*&$ 1 12; 1 1,;3; 12;r12;Off 12; 1 12&%6 1 1727 12; 12&%6 1 1,=(' 12;r12&%6 1 16,3 12; 12&%6 1 1,=(3 16,3r16,3 fff 16,3f


&20021&01 1&2/71&2/7%1&2/*1&2/61&2/*71&2/6712%6 1 1&2/%1&2/;1&2/&%1&/f125$112),;1&/),; 1 1&/5$11&2/6(15$1f &20021&01 1 <494

&217,18( 1 19(&,1&2/;f '2 ,1&2/; ,)-(41&/),;ff *2 72 '2 / O12=(52 ,)-*(16,*/ff$1'-/(16,*/fff *2 72 &217,18( 11 1-, . 7..f ;3;11f6,*125$1f &217,18( &217,18( '2 1&/),; O1&2/7. 19(&,1&2/7.f 1 . 7.1f 7.1f O''.fff &217,18( '($//2&$7( ';3;f errrrrrrrrrrrrreTMMAnMf-216 KDYH QRZ EHHQ IRUPHGrrrrrrrrrrrrrrrrrrrrrrrrrr &$// 9(&6:37.1&2/7.1&2/7.O1&2/7.f '2 ,)5$11$0,f(4f*&$ff 7+(1 167$ *2 72 (1',) &217,18( 1(1' '2 167$ ,)6,*,f(4f *2 72 1(1' 1(1' 15$1,f &217,18( 167$. 1(1' 1&/),; 1(1'. 167$. 15$1167$f 1 '2 167$.1(1'. 19(&,1&2/7.f '2 ,1(1'. .. .-, 1 1 9$5*1f 7...f &217,18( &217,18( 1 '2 1&/),; 19(&,1&2/7.f '2 ,1&/),;


.. .-, 1 1 9$5%+1f 7...f &217,18( &217,18( '($//2&$7( 7.f 5(7851 (1'


REFERENCE LIST
Banks BD, Mao IL & Walter JP. Robustness of the restricted maximum likelihood estimator derived under normality as applied to data with skewed distributions. Dairy Sci.
Becker WA. Manual of Quantitative Genetics. Washington State Univ. Press, Pullman, WA.
Braaten MO. The union of partial diallel mating designs and incomplete block environmental designs. North Carolina State Univ., Inst. of Stat. Mimeo Series.
Bridgwater FE, Talbert JT & Jahromi S. Index selection for increased dry weight in a young loblolly pine population. Silvae Genet.
Burdon RD. Genetic correlation as a concept for studying genotype-environment interaction in forest tree breeding. Silvae Genet.
Burdon RD & Shelbourne CJA. Breeding populations for recurrent selection: Conflicts and possible solutions. N. Z. For. Sci.
Burley, Burrows PM, Armitage FB & Barnes RD. Progeny test designs for Pinus patula in Rhodesia. Silvae Genet.
Campbell. Genetic variability in juvenile height-growth of Douglas-fir. Silvae Genet.
Corbeil RR & Searle SR. A comparison of variance component estimators. Biometrics.
Falconer DS. Introduction to Quantitative Genetics. Longman & Co, New

Freund R & Littell RC. SAS for Linear Models. SAS Institute Inc., Cary, NC.
Giesbrecht FG. Efficient procedure for computing minque of variance components and generalized least squares estimates of fixed effects. Commun. Statist. Theor. Meth.
Gilbert NEG. Diallel cross in plant breeding. Heredity.
Goodnight JH. A tutorial on the sweep operator. Amer. Stat.
Graybill FA. Theory and Application of the Linear Model. Duxbury Press, North Scituate, MA.
Greenwood MS, Lambeth CC & Hunt JL. Accelerated breeding and potential impact upon breeding programs. In Southern Cooperative Series Bulletin. Louisiana Ag. Experiment Station, Baton Rouge, LA.
Griffing B. Concept of general and specific combining ability in relation to diallel crossing systems. Aust. Biol. Sci.
Hallauer AR & Miranda JB. Quantitative Genetics in Maize Breeding. Iowa State Univ. Press, Ames.
Hartley HO. Expectations, variances and covariances of ANOVA mean squares by synthesis. Biometrics.
Hartley HO & Rao JNK. Maximum likelihood estimation for the mixed analysis of variance model. Biometrika.
Harville DA. Maximum likelihood approaches to variance component estimation and to related problems. Amer. Stat. Assoc.
Henderson CR. Estimation of variance and covariance components. Biometrics.
Henderson CR. Sire evaluation and expected genetic advance. In Animal Breeding and Genetics Symposium in Honor of Lush. Animal Sci. Assoc. Amer., Champaign, Ill.
Henderson CR. General flexibility of linear model techniques for sire evaluation. Dairy Sci.
Henderson CR. Best linear unbiased prediction of breeding values not in the model for records. Dairy Sci.
Henderson CR. Applications of Linear Models in Animal Breeding. University of Guelph, Guelph, Ontario, CAN.


Henderson CR, Kempthorne O, Searle SR & Von Krosigk CN. Estimation of environmental and genetic trends from records subject to culling. Biometrics.
Hodge GR & White TL. (in press). Genetic parameter estimates for growth traits at different ages in slash pine. Silvae Genet.
Hogg RV & Craig AT. Introduction to Mathematical Statistics. Fourth edition. Macmillan Publ. Co., New

Miller J. Asymptotic properties and computation of maximum likelihood estimates in the mixed model of the analysis of variance. Tech. Rep., Department of Statistics, Stanford Univ., Stanford, CA.
Milliken GA & Johnson DE. Analysis of Messy Data: Designed Experiments. Lifetime Learning Pub., Belmont, CA.
Namkoong, Snyder EB & Stonecypher RW. Heritability and gain concepts for evaluating breeding systems such as seedling orchards. Silvae Genet.
Namkoong & Roberds JH. Choosing mating designs to efficiently estimate genetic variance components for trees. Silvae Genet.
Olsen A, Seely & Birkes. Invariant quadratic unbiased estimation of two variance components. Ann. Stat.
Patterson HD & Thompson R. Recovery of interblock information when block sizes are unequal. Biometrika.
Pederson DG. A comparison of four experimental designs for the estimation of heritability. Theoret. Appl. Genet.
Pepper WD. Choosing plant-mating design allocations to estimate genetic variance components in the absence of prior knowledge of the relative magnitudes. Biometrics.
Pepper WD & Namkoong. Comparing efficiency of balanced mating design for progeny testing. Silvae Genet.
Pittman EJG. The closest estimates of statistical parameters. Proc. Cambr. Philos. Soc.
Press WH, Flannery BP, Teukolsky SA & Vetterling WT. Numerical Recipes: The Art of Scientific Computing (Fortran version). Cambridge Univ. Press, New

Schneider DM. Linear Algebra: A Concrete Introduction. Macmillan Pub. Co., New

Weir R & Goddard RE. Advanced generation operational breeding programs for loblolly and slash pine. In Southern Coop. Series Bull. Louisiana Agric. Exp. Stn., Baton Rouge, LA.
Weir R & Zobel B. Managing genetic resources for the future, a plan for the N.C. State Industry Cooperative Tree Improvement Program. In Proc. South. For. Tree Improve. Conf., June, Raleigh, NC.
Westfall PH. A comparison of variance component estimates for arbitrary underlying distributions. Amer. Stat. Assoc.
White TL. A conceptual framework for tree improvement programs. New Forests.
White TL & Hodge GR. Practical uses of breeding values in tree improvement programs and their prediction from progeny test data. In Proc. South. For. Tree Improve. Conf., Texas A & M Univ., College Station, TX.
White TL & Hodge GR. Best linear prediction of breeding values in a forest tree improvement program. Theor. Appl. Genet.
White TL & Hodge GR. Predicting Breeding Values with Applications in Forest Tree Improvement. Kluwer Academic Pub., Dordrecht, The Netherlands.
Wilcox MD, Shelbourne CJA & Firth A. General and specific combining ability in eight selected clones of radiata pine. N. Z. For. Sci.

BIOGRAPHICAL SKETCH
Dudley Arvle Huber was born December [...] in Fulton County, Georgia, to Dudley and Dorothy Huber. His basic education was in the Stephens County school system. He entered Georgia Institute of Technology to study chemical engineering and later transferred to the University of Georgia in the forestry program. In [...] he received a Bachelor of Science degree. From [...] to [...] he served in the U. S. Navy and after service reentered the University of Georgia, receiving a Master of Science degree in [...]. After several years of self employment and employment at the University of Georgia, he began a Doctor of Philosophy program in [...]. He is currently employed as operations geneticist for Southern Forest Tree Improvement by Weyerhaeuser Company.


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Timothy L. White, Chairman
Associate Professor of Forest Resources and Conservation
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Michael A. DeLorenzo
Associate Professor of Dairy Science
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Assistant Research Scientist of Forest Resources and Conservation
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Ramon C. Littell
Professor of Statistics
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Donald L. Rockwood
Professor of Forest Resources and Conservation


This dissertation was submitted to the Graduate Faculty of the School of Forest Resources and Conservation in the College of Agriculture and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.
May 1993
Director, Forest Resources and Conservation
Dean, Graduate School

OPTIMAL MATING DESIGNS AND OPTIMAL TECHNIQUES FOR ANALYSIS OF
QUANTITATIVE TRAITS IN FOREST GENETICS
By
DUDLEY ARVLE HUBER
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1993

ACKNOWLEDGEMENTS
I express my gratitude to Drs. T. L. White, G. R. Hodge, R. C. Littell, M. A.
DeLorenzo and D. L. Rockwood for their time and effort in the pursuit of this work. Their
guidance and wisdom proved invaluable to the completion of this project.
I further acknowledge Dr. Bruce Bongarten for his encouragement to continue my
academic career. I am grateful to Dr. T. L. White and the School of Forest Resources and
Conservation at the University of Florida for funding this work.
I extend special thanks to George Bryan and Dr. M. A. DeLorenzo of the Dairy Science
Department and Greg Powell of the School of Forest Resources and Conservation for the use of
computing facilities, programming help and aid in running the simulations required.
Most importantly, I thank my family, Nancy, John and Heather, for their understanding
and encouragement in this endeavor.

TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1
INTRODUCTION
CHAPTER 2
THE EFFICIENCY OF HALF-SIB, HALF-DIALLEL
AND CIRCULAR MATING DESIGNS IN THE ESTIMATION
OF GENETIC PARAMETERS WITH VARIABLE NUMBERS OF
PARENTS AND LOCATIONS
Introduction
Methods
Assumptions Concerning Block Size
The Use of Efficiency (i)
General Methodology
Levels of Genetic Determination
Covariance Matrix for Variance Components
Covariance Matrix for Linear Combinations of Variance Components
and Variance of a Ratio
Comparison Among Estimates of Variances of Ratios
Results
Heritability
Type B Correlation
Dominance to Additive Variance Ratio
Discussion
Comparison of Mating Designs
A General Approach to the Estimation Problem
Use of the Variance of a Ratio Approximation
Conclusions

CHAPTER 3
ORDINARY LEAST SQUARES ESTIMATION OF GENERAL
AND SPECIFIC COMBINING ABILITIES FROM
HALF-DIALLEL MATING DESIGNS
Introduction
Methods
Linear Model
Ordinary Least Squares Solutions
Sum-to-Zero Restrictions
Components of the Matrix Equation
Estimation of Fixed Effects
Numerical Examples
Balanced Data (Plot-mean Basis)
Missing Plot
Missing Cross
Several Missing Crosses
Discussion
Uniqueness of Estimates
Weighting of Plot Means and Cross Means in Estimating Parameters
Diallel Mean
Variance and Covariance of Plot Means
Comparison of Prediction and Estimation Methodologies
Conclusions
CHAPTER 4
VARIANCE COMPONENT ESTIMATION TECHNIQUES
COMPARED FOR TWO MATING DESIGNS
WITH FOREST GENETIC ARCHITECTURE
THROUGH COMPUTER SIMULATION
Introduction
Methods
Experimental Approach
Experimental Design for Simulated Data
Full-Sib Linear Model
Half-sib Linear Model
Data Generation and Deletion
Variance Component Estimation Techniques
Comparison Among Estimation Techniques
Results and Discussion
Variance Components
Ratios of Variance Components
General Discussion
Observational Unit
Negative Estimates
Estimation Technique
Recommendation

CHAPTER 5
GAREML: A COMPUTER ALGORITHM FOR
ESTIMATING VARIANCE COMPONENTS AND
PREDICTING GENETIC VALUES
Introduction
Algorithm
Operating GAREML
Interpreting GAREML Output
Variance Component Estimates
Predictions of Random Variables
Asymptotic Covariance Matrix of Variance Components
Fixed Effect Estimates
Error Covariance Matrices
Example
Data
Analysis
Output
Conclusions
CHAPTER 6
CONCLUSIONS
APPENDIX
FORTRAN SOURCE CODE FOR GAREML
REFERENCE LIST
BIOGRAPHICAL SKETCH

LIST OF TABLES
Table 2-1. Parametric variance components
Table 3-1. Data set for numerical examples
Table 3-2. Numerical results for examples
Table 4-1. Abbreviation for and description of variance component estimation methods
Table 4-2. Sets of true variance components
Table 4-3. Sampling variance for the estimates
Table 4-4. Bias for the estimates
Table 4-5. Probability of nearness
Table 5-1. Data for example

LIST OF FIGURES
Figure 2-1. Efficiency (i) for h²
Figure 2-2. Efficiency (i) for r_B
Figure 2-3. Efficiency (i) for γ
Figure 3-1. The overparameterized linear model
Figure 3-2. The linear model for a four-parent half-diallel
Figure 3-3. Intermediate result in SCA submatrix generation
Figure 3-4. Weights on overall cross means
Figure 4-1. Distribution of 1000 MIVQUE estimates

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
OPTIMAL MATING DESIGNS AND OPTIMAL TECHNIQUES FOR ANALYSIS OF
QUANTITATIVE TRAITS IN FOREST GENETICS
By
Dudley Arvle Huber
May 1993
Chairperson: Timothy L. White
Major Department: School of Forest Resources and Conservation
First, the asymptotic covariance matrix of the variance component estimates is used to
compare three common mating designs for efficiency (maximizing the variance reducing property
of each observation) for genetic parameters across numbers of parents and locations and varying
genetic architectures. It is determined that the circular mating design is always superior in
efficiency to the half-diallel design. For single-tree heritability, the half-sib design is most
efficient. For estimating type B correlation, maximum efficiency is achieved by either the half-
sib or the circular mating design, and which of the two ranks higher is determined by the
underlying genetic architecture.
Another intent of this work is comparing analysis methodologies for determining parental
worth. The first of these investigations examines ordinary least squares assumptions in the estimation
of parental worth for the half-diallel mating design with balanced and unbalanced data. The
conclusion from comparison of ordinary least squares to alternative analysis methodologies is that
best linear unbiased prediction and best linear prediction are more appropriate to the problem of
determining parental worth.

The next analysis investigation contrasts variance component estimation techniques across
levels of imbalance for the half-diallel and half-sib mating designs for the estimation of genetic
parameters with plot means and individuals used as the unit of observation. The criteria for
discrimination are variance of the estimates, mean square error, bias and probability of nearness.
For all estimation techniques, using individuals as the unit of observation produced estimates with the
most desirable properties. Of the estimation techniques examined, restricted maximum likelihood
is the most robust to imbalance.
The computer program used to produce restricted maximum likelihood estimates of
variance components was modified to form a user friendly analysis package. Both the algorithm
and the outputs of the program are documented. Outputs available from the program include
variance component estimates, generalized least squares estimates of fixed effects, asymptotic
covariance matrix for variance components, best linear unbiased predictions for general and
specific combining abilities and the error covariance matrix for predictions and estimates.

CHAPTER 1
INTRODUCTION
Analysis of quantitative traits in forest genetic experiments has traditionally been
approached as a two-part problem. Parental worth would be estimated as fixed effects and later
considered as random effects for the determination of genetic architecture. While traditional, this
approach is most probably sub-optimal given the proliferation of alternative analysis approaches
with enhanced theoretical properties (White and Hodge 1989).
In this dissertation emphasis is placed on the half-diallel mating design because of its
omnipresence and the uniqueness of the analysis problem this mating design presents. The half-
diallel mating design has been and continues to be used in plant sciences (Sprague and Tatum
1942, Gilbert 1958, Matzinger et al. 1959, Burley et al. 1966, Squillace 1973, Weir and Zobel
1975, Wilcox et al. 1975, Snyder and Namkoong 1978, Hallauer and Miranda 1981, Singh and
Singh 1984, Greenwood et al. 1986, and Weir and Goddard 1986). The unique feature of the
half-diallel mating system which hinders analysis with many statistical packages is that a single
observation contains two levels of the same main effect.
Optimality of mating design for the estimation of commonly needed genetic parameters
(single-tree heritability, type B correlation and dominance to additive variance ratio) is examined
utilizing the asymptotic covariance of the variance components (Kendall and Stuart 1963,
Giesbrecht 1983 and McCutchan et al. 1989). Since genetic field experiments are composed of
both a mating design and a field design, the central consideration in this investigation is which
mating design with what field design (how many parents and across what number of locations
within a randomized complete block design) is most efficient. The criterion for discernment
among designs is the efficiency of the individual observation in reducing the variance of the
estimate (Pederson 1972). This question is considered under a range of genetic architectures
which spans that reported for coniferous growth traits (Campbell 1972, Stonecypher et al. 1973,
Snyder and Namkoong 1978, Foster 1986, Foster and Bridgwater 1986, Hodge and White [in
press]).
The investigation into optimal analysis proceeds by considering the ordinary least squares
(OLS) treatment of estimating parental worth for the half-diallel mating design. OLS assumptions
are examined in detail through the use of matrix algebra for both balanced and unbalanced data.
The use of matrix algebra illustrates both the uniqueness of the problem and the interpretation
of the OLS assumptions. Comparisons among OLS, generalized least squares (GLS), best linear
unbiased prediction (BLUP) and best linear prediction (BLP) are made on a theoretical basis.
Although consideration of field and mating design of future experiments is essential, the
problem of optimal analysis of current data remains. In response to this need, simulated data
with differing levels of imbalance, genetic architecture and mating design is utilized as a basis
for discriminating among variance component estimation techniques in the determination of
genetic architecture. The levels of imbalance simulated represent those commonly seen in forest
genetic data as less than 100% survival, missing crosses for full-sib mating designs and only
subsets of parents in common across location for half-sib mating designs. The two mating
designs are half-sib and half-diallel with a subset of the previously used genetic architectures.
The field design is a randomized complete block with fifteen families per block and six trees per
family per block. The four criteria used to discriminate among variance component estimation
techniques are probability of nearness (Pittman 1937), bias, variance of the estimates and mean
square error (Hogg and Craig 1978).
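The four criteria named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample estimates are hypothetical, and probability of nearness is taken in its usual pairwise form (the proportion of data sets in which one technique's estimate lies strictly closer to the true value than the other's), which may differ in detail from the computation used in the simulations.

```python
from statistics import mean, pvariance

def summarize(estimates, true_value):
    """Return (bias, sampling variance, mean square error) of a set of estimates."""
    bias = mean(estimates) - true_value
    variance = pvariance(estimates)  # variance of the estimates around their own mean
    mse = mean((e - true_value) ** 2 for e in estimates)
    return bias, variance, mse

def probability_of_nearness(est_a, est_b, true_value):
    """Fraction of paired data sets in which technique A is strictly closer to the truth than B."""
    closer = sum(abs(a - true_value) < abs(b - true_value)
                 for a, b in zip(est_a, est_b))
    return closer / len(est_a)

# Hypothetical estimates of a variance component whose true value is 1.0
technique_a = [0.9, 1.1, 1.0, 1.2]
technique_b = [0.5, 1.6, 1.0, 0.7]

bias, variance, mse = summarize(technique_a, 1.0)
nearness = probability_of_nearness(technique_a, technique_b, 1.0)
```

Mean square error equals the sampling variance plus the squared bias, so the three summary statistics are not independent; probability of nearness adds a pairwise comparison that the other three do not capture.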

The techniques compared for variance component estimation are minimum variance
quadratic unbiased estimation (Rao 1971b), minimum norm quadratic unbiased estimation (Rao
1971a), restricted maximum likelihood (Patterson and Thompson 1971), maximum likelihood
(Hartley and Rao 1967) and Henderson’s method 3 (Henderson 1953). These techniques are
compared using the individual and plot means as the unit of observation. Further, three
alternatives are explored for dealing with negative variance component estimates: accepting
the negative estimates as they are, setting negative estimates to zero, and re-solving the system with
negative components set to zero.
The algorithm used for the method which provided estimates with optimal properties
across experimental levels was converted to a user friendly program. This program providing
restricted maximum likelihood variance component estimates uses Giesbrecht’s algorithm (1983).
Documentation of the algorithm and explanation of the program’s output are provided along with
the Fortran source code (appendix).

CHAPTER 2
THE EFFICIENCY OF HALF-SIB, HALF-DIALLEL
AND CIRCULAR MATING DESIGNS IN THE ESTIMATION
OF GENETIC PARAMETERS WITH VARIABLE NUMBERS OF
PARENTS AND LOCATIONS
Introduction
In forest tree improvement, genetic tests are established for four primary purposes:
1) ranking parents, 2) selecting families or individuals, 3) estimating genetic parameters, and 4)
demonstrating genetic gain (Zobel and Talbert 1984). While the four purposes are not mutually
exclusive, a test design optimal for one purpose is most probably not optimal for all (Burdon and
Shelbourne 1971, White 1987). A breeder then must prioritize the purposes for which a given
test is established and choose a design based on these priorities. Within a genetic test design
there are two primary components: mating design and field design. There have been several
investigations of optimal designs for these two components either separately or simultaneously
under various criteria. These criteria have included the efficient and/or precise estimation of
heritability (Pederson 1972, Namkoong and Roberds 1974, Pepper and Namkoong 1978,
McCutchan et al. 1985, McCutchan et al. 1989), precise estimation of variance components
(Braaten 1965, Pepper 1983), and efficient selection of progeny (van Buijtenen 1972, White and
Hodge 1987, van Buijtenen and Burdon 1990, Loo-Dinkins et al. 1990).
Incorporated within this body of research has been a wide range of genetic and
environmental variance parameters and field and mating designs. However, the models in
previous investigations have been primarily constrained to consideration of testing in a single
environment with a corresponding limited number of factors in the model, i.e., genotype by
environment interaction and/or dominance variance are usually not considered. This chapter
focuses on optimal mating designs through consideration of three common mating designs (half-
sib, half-diallel, and circular with four crosses per parent) for estimation of genetic parameters
with a field design extending across multiple locations.
In this chapter the approach to the optimal design problem is to maintain the basic field
design within locations as randomized complete block with four blocks and a six-tree row-plot
representing each genetic entry within a block (noted as one of the most common field designs
by Loo-Dinkins et al. 1990). The number of families in a block, number of locations, mating
design and number of parents within a mating design are allowed to change. Since optimality,
besides being a function of the field and mating designs, is also a function of the underlying
genetic parameters, all designs are examined across a range of levels of genetic determination (as
varying levels of heritability, genotype by environment interaction and dominance) reflecting
estimates for many economically important traits in conifers (Campbell 1972, Stonecypher et al.
1973, Snyder and Namkoong 1978, Foster 1986, Foster and Bridgwater 1986, Hodge and White
(in press)).
For each design and level of genetic determination, a Minimum Variance Quadratic
Unbiased Estimation (MIVQUE) technique and an approximation of the variance of a ratio
(Kendall and Stuart 1963, Giesbrecht 1983 and McCutchan et al. 1989) are applied to estimate
the variance of estimates of heritability, additive to additive plus additive by environment variance
ratio, and dominance to additive variance ratio. These techniques use the true covariance matrix
of the variance component estimates (utilizing only the known parameters and the test design and
precluding the need for simulated or real data) and a Taylor series approximation of the variance
of a ratio. The relative efficiencies of different test designs are compared on the basis of i (the
efficiency of an individual observation in reducing the variance of an estimate, Pederson 1972).
Thus this research explores which mating design, number of parents and number of locations is
most efficient per unit of observation in estimating heritability, additive to additive plus additive
by environment variance ratio, and dominance to additive variance ratio for several variance
structures representative of coniferous growth traits.
Methods
Assumptions Concerning Block Size
As opposed to McCutchan et al. (1985), where block sizes were held constant and
including more families resulted in fewer observations per family per block, in this chapter the
blocks are allowed to expand to accommodate increasing numbers of families. This expansion is
allowed without increasing either the variance among blocks or the variance within blocks. For
the three mating designs which are discussed, the addition of one parent to the half-sib design
increases block size by 6 trees (plot for a half-sib family), the addition of a parent to the circular
design increases block size by 12 trees (two plots for full-sib families), and the addition of a
parent to the half-diallel design increases block size by 6p (where p is the number of parents
before the addition, so that there are p new full-sib families per block). Therefore, block size is
determined by the mating design and the number of parents.
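The block-size arithmetic above can be checked with a short Python sketch. The function names are hypothetical; the family counts assume one half-sib family per parent, four crosses per parent for the circular design, and a half-diallel without selfs or reciprocals, each family occupying one six-tree plot per block as described in the text.

```python
TREES_PER_PLOT = 6  # six-tree row-plot per genetic entry per block

def families_per_block(design, parents):
    """Number of families (plots) in one block for a given mating design."""
    if design == "half-sib":
        return parents                       # one half-sib family per parent
    if design == "circular":
        return 4 * parents // 2              # four crosses per parent, two parents per cross
    if design == "half-diallel":
        return parents * (parents - 1) // 2  # every pair of parents crossed once
    raise ValueError(f"unknown mating design: {design}")

def block_size(design, parents):
    """Number of trees in one block."""
    return TREES_PER_PLOT * families_per_block(design, parents)

# Increase in block size when one parent is added to a ten-parent design
increments = {d: block_size(d, 11) - block_size(d, 10)
              for d in ("half-sib", "circular", "half-diallel")}
```

With ten parents already in the design, the increments are 6, 12 and 60 trees, matching the 6, 12 and 6p figures given above.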
All comparisons among mating designs and numbers of locations are for equal block
sizes, i.e., equal numbers of observations per location. This results in comparing mating designs
with unequal numbers of parents in the designs and comparing two location experiments against
five location experiments with equal numbers of observations per location but unequal total
numbers of observations.

The Use of Efficiency (i)
Efficiency is the tool by which comparisons are made and is the efficacy of the individual
observations in an experiment in lowering the variance of parameter estimates. An increasing
efficiency indicates that for increasing experimental size the additional observations have
enhanced the variance reducing property of all observations. Efficiency is calculated as
$i = 1 / (N \cdot \mathrm{Var}(x))$, where N is the total number of observations and Var(x) is the variance of a generic
parameter estimate. Increasing N always results in a reduction of the variance of estimation, all
other things being equal. Yet the change in efficiency with increasing N is dependent on whether
the reduction in variance is adequate to offset the increase in N which caused the reduction.
Comparing a previous efficiency with that obtained by increasing N, i.e., increasing the number
of parents in a mating design or increasing the number of locations in which an experiment is
planted:
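since $i_o = 1 / (N \cdot \mathrm{Var}(x))$, (2-1)
then $N \cdot \mathrm{Var}(x) = 1 / i_o$
and $(N + \Delta N)(\mathrm{Var}(x) + \Delta\mathrm{Var}(x)) = 1 / i_n$;
if $i_o$ (the old efficiency) $= i_n$ (the new efficiency),
then $\Delta\mathrm{Var}(x) / \mathrm{Var}(x) = -\Delta N / (N + \Delta N)$;
if $i_o < i_n$, then $\Delta\mathrm{Var}(x) / \mathrm{Var}(x) < -\Delta N / (N + \Delta N)$;
and if $i_o > i_n$, then $\Delta\mathrm{Var}(x) / \mathrm{Var}(x) > -\Delta N / (N + \Delta N)$;
where $\Delta$ denotes the change in magnitude.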
Viewing equation 2-1, if N is held constant and one design has a higher efficiency (i), the design
must also produce parameter estimates which have a lower variance.
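The break-even condition in equation 2-1 can be illustrated numerically. The sketch below uses hypothetical function names and made-up values of N and Var(x); it only restates the algebra above and is not taken from the dissertation's program.

```python
def efficiency(n_obs, var_estimate):
    """Efficiency i = 1 / (N * Var(x))."""
    return 1.0 / (n_obs * var_estimate)

def break_even_relative_change(n_obs, delta_n):
    """Relative change in Var(x) that leaves efficiency unchanged when N grows by delta_n."""
    return -delta_n / (n_obs + delta_n)

# Hypothetical experiment: 1000 observations, Var(x) = 0.004, then 250 observations added
old_i = efficiency(1000, 0.004)
threshold = break_even_relative_change(1000, 250)       # -0.2
same_i = efficiency(1250, 0.004 * (1 + threshold))      # variance falls by exactly 20%
better_i = efficiency(1250, 0.003)                      # variance falls by 25%
```

A 20% reduction in variance leaves efficiency unchanged, while any larger reduction (here 25%) raises it, which is the third case of the comparison above.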

General Methodology
Sets of true variance components are calculated in accordance with a stated level of
genetic control and the design matrix is generated in correspondence with the field and mating
design. Knowing the design matrix and a set of true variance components, a true covariance
matrix of variance component estimates is generated. Once the covariance matrix
of the variance components is in hand, the variance of and covariances between any linear
combinations of the variance component estimates are calculated. From the covariance matrix
for linear combinations, the variance of genetic ratios as ratios of linear combinations of variance
components are approximated by a Taylor series expansion. Since definition of a set of variance
components and formation of the design matrix are dependent on the linear model employed,
discussion of specific methodology begins with linear models.
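The last two steps of this sequence (linear combinations of variance components, then the variance of their ratio) can be sketched as follows. The code assumes the common first-order Taylor series form, Var(A/B) ≈ (A/B)² [Var(A)/A² − 2 Cov(A,B)/(AB) + Var(B)/B²]; the function names, the two-component example and its covariance matrix are hypothetical and are not values from this study.

```python
def quad_form(u, V, w):
    """Compute u' V w for coefficient lists u, w and a covariance matrix V (list of lists)."""
    return sum(u[i] * V[i][j] * w[j]
               for i in range(len(u)) for j in range(len(w)))

def ratio_variance(sigma, V, num, den):
    """First-order Taylor approximation of Var(A/B), with A = num'sigma and B = den'sigma."""
    A = sum(c * s for c, s in zip(num, sigma))
    B = sum(c * s for c, s in zip(den, sigma))
    var_a = quad_form(num, V, num)   # variance of the numerator combination
    var_b = quad_form(den, V, den)   # variance of the denominator combination
    cov_ab = quad_form(num, V, den)  # covariance between the two combinations
    return (A / B) ** 2 * (var_a / A ** 2 - 2 * cov_ab / (A * B) + var_b / B ** 2)

# Hypothetical example: two variance components and the covariance matrix of their estimates
sigma = [1.0, 9.0]
V = [[0.04, -0.01],
     [-0.01, 0.09]]
# Ratio of 4 * (first component) to the sum of both components
approx_var = ratio_variance(sigma, V, num=[4, 0], den=[1, 1])
```

Because only the true components and the covariance matrix of their estimates enter the calculation, no simulated or real data are needed, which is the point made in the paragraph above.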
Linear Models
Half-diallel and circular designs
The scalar linear model employed for half-diallel and circular mating designs is
$y_{ijklm} = \mu + t_i + b_{ij} + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm}$ (2-2)
where $y_{ijklm}$ is the m-th observation of the kl-th cross in the j-th block of the i-th test;
$\mu$ is the population mean;
$t_i$ is the random variable test environment ~ NID(0, $\sigma^2_t$);
$b_{ij}$ is the random variable block ~ NID(0, $\sigma^2_b$);
$g_k$ is the random variable female general combining ability (gca) ~ NID(0, $\sigma^2_{gca}$);
$g_l$ is the random variable male gca ~ NID(0, $\sigma^2_{gca}$);
$s_{kl}$ is the random variable specific combining ability (sca) ~ NID(0, $\sigma^2_{sca}$);
$tg_{ik}$ is the random variable test by female gca interaction ~ NID(0, $\sigma^2_{tg}$);
$tg_{il}$ is the random variable test by male gca interaction ~ NID(0, $\sigma^2_{tg}$);
$ts_{ikl}$ is the random variable test by sca interaction ~ NID(0, $\sigma^2_{ts}$);
$p_{ijkl}$ is the random variable plot ~ NID(0, $\sigma^2_p$);
$w_{ijklm}$ is the random variable within plot ~ NID(0, $\sigma^2_w$); and
there is no covariance between random variables in the model.
This linear model in matrix notation is (dimensions below model component):
$$y = \mu 1 + Z_T e_T + Z_B e_B + Z_G e_G + Z_S e_S + Z_{TG} e_{TG} + Z_{TS} e_{TS} + Z_P e_P + e_W \tag{2-3}$$
Dimensions: $y$ and $1$, n×1; $Z_T$, n×t; $e_T$, t×1; $Z_B$, n×b; $e_B$, b×1; $Z_G$, n×g; $e_G$, g×1; $Z_S$, n×s; $e_S$, s×1; $Z_{TG}$, n×tg; $e_{TG}$, tg×1; $Z_{TS}$, n×ts; $e_{TS}$, ts×1; $Z_P$, n×p; $e_P$, p×1; $e_W$, n×1;
where $y$ is the observation vector;
$Z_i$ is the portion of the design matrix for the i-th random variable;
$e_i$ is the vector of unobservable random effects for the i-th random variable;
$1$ is a vector of 1's; and
n, t, b, g, s, tg, ts, and p are the number of observations, tests, blocks, gca's, sca's,
test by gca interactions, test by sca interactions and plots, respectively.
Utilizing customary assumptions in half-diallel mating designs (Method 4, Griffing 1956), the
variance of an individual observation is
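$$Var(y_{ijklm}) = \sigma^2_t + \sigma^2_b + 2\sigma^2_{gca} + \sigma^2_{sca} + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w \tag{2-4}$$
and in matrix notation the covariance matrix for the observations is
$$Var(y) = Z_T Z_T' \sigma^2_t + Z_B Z_B' \sigma^2_b + Z_G Z_G' \sigma^2_{gca} + Z_S Z_S' \sigma^2_{sca} + Z_{TG} Z_{TG}' \sigma^2_{tg} + Z_{TS} Z_{TS}' \sigma^2_{ts} + Z_P Z_P' \sigma^2_p + I_n \sigma^2_w \tag{2-5}$$
where " ' " indicates the transpose operator, all matrices of the form $Z_i Z_i'$ are n×n, and $I_n$ is an
n×n identity matrix.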
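
As an illustration of how the covariance matrix of the observations is assembled from the design matrices, the following sketch builds Var(y) for a very small half-diallel. The helper names (`incidence`, `half_diallel_V`) and the small design dimensions are assumptions made for this example; the variance components are those of level 1 in Table 2-1. Each observation loads on both of its parents, which is why the gca and test by gca incidence matrices are sums of two 0/1 matrices.

```python
import itertools
import numpy as np

def incidence(labels, levels):
    """0/1 matrix with one row per observation and one column per level."""
    column = {level: c for c, level in enumerate(levels)}
    Z = np.zeros((len(labels), len(levels)))
    for row, label in enumerate(labels):
        Z[row, column[label]] = 1.0
    return Z

def half_diallel_V(n_tests, n_blocks, n_parents, n_trees, s2):
    """Var(y) for a half-diallel (no selfs, no reciprocals): sum of Z Z' sigma^2."""
    tests, parents = list(range(n_tests)), list(range(n_parents))
    crosses = list(itertools.combinations(parents, 2))
    # One tuple (test, block, parent k, parent l) per tree; trees share the plot label.
    obs = [(i, j, k, l) for i in tests for j in range(n_blocks)
           for (k, l) in crosses for _ in range(n_trees)]
    test_parent = [(i, k) for i in tests for k in parents]
    Z = {
        "t": incidence([i for i, j, k, l in obs], tests),
        "b": incidence([(i, j) for i, j, k, l in obs],
                       sorted({(i, j) for i, j, k, l in obs})),
        "gca": incidence([k for i, j, k, l in obs], parents)
               + incidence([l for i, j, k, l in obs], parents),
        "sca": incidence([(k, l) for i, j, k, l in obs], crosses),
        "tg": incidence([(i, k) for i, j, k, l in obs], test_parent)
              + incidence([(i, l) for i, j, k, l in obs], test_parent),
        "ts": incidence([(i, k, l) for i, j, k, l in obs],
                        sorted({(i, k, l) for i, j, k, l in obs})),
        "p": incidence(obs, sorted(set(obs))),
    }
    V = s2["w"] * np.eye(len(obs))          # within-plot term, I_n sigma^2_w
    for name, Zi in Z.items():
        V += s2[name] * (Zi @ Zi.T)
    return V

# Level 1 of Table 2-1 (test and block variances 1.0 and 0.5).
level1 = {"t": 1.0, "b": 0.5, "gca": 0.25, "sca": 0.25,
          "tg": 0.0625, "ts": 0.0625, "p": 0.6344, "w": 8.4281}
V = half_diallel_V(n_tests=2, n_blocks=2, n_parents=4, n_trees=2, s2=level1)
print(V.shape, V[0, 0])
```

Every diagonal element of the resulting matrix equals the variance of an individual observation, and two trees in the same plot share every component except the within-plot term.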
Half-sib design
The scalar linear model for half-sib mating designs is
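$$y_{ijkm} = \mu + t_i + b_{ij} + g_k + tg_{ik} + p^*_{ijk} + w^*_{ijkm} \tag{2-6}$$
where $y_{ijkm}$ is the m-th observation of the k-th half-sib family in the j-th block of the i-th test;
$\mu$, $t_i$, $b_{ij}$, $g_k$, and $tg_{ik}$ retain the definition in Eq. 2-2;
$p^*_{ijk}$ is the random variable plot containing different genotype by environment
components than Eq. 2-2 ~ NID(0, $\sigma^2_{p^*}$);
$w^*_{ijkm}$ is the random variable within plot containing different levels of genotypic and
genotype by environment components than Eq. 2-2 ~ NID(0, $\sigma^2_{w^*}$); and
there is no covariance between random variables in the model.
The matrix notation model is
$$y = \mu 1 + Z_T e_T + Z_B e_B + Z_G e_G + Z_{TG} e_{TG} + Z_{P^*} e_{P^*} + e_{W^*} \tag{2-7}$$
Dimensions: $y$ and $1$, n×1; $Z_T$, n×t; $e_T$, t×1; $Z_B$, n×b; $e_B$, b×1; $Z_G$, n×g; $e_G$, g×1; $Z_{TG}$, n×tg; $e_{TG}$, tg×1; $Z_{P^*}$, n×p; $e_{P^*}$, p×1; $e_{W^*}$, n×1.
The variance of an individual observation in half-sib designs is
$$Var(y_{ijkm}) = \sigma^2_t + \sigma^2_b + \sigma^2_{gca} + \sigma^2_{tg} + \sigma^2_{p^*} + \sigma^2_{w^*} \tag{2-8}$$
and
$$Var(y) = Z_T Z_T' \sigma^2_t + Z_B Z_B' \sigma^2_b + Z_G Z_G' \sigma^2_{gca} + Z_{TG} Z_{TG}' \sigma^2_{tg} + Z_{P^*} Z_{P^*}' \sigma^2_{p^*} + I_n \sigma^2_{w^*} \tag{2-9}$$
Levels of Genetic Determination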
Eight levels of genetic determination are derived from a factorial combination of two
levels of each of three genetic ratios: heritability ($h^2 = 4\sigma^2_{gca} / (2\sigma^2_{gca} + \sigma^2_{sca} + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w)$); additive to additive plus additive by environment variance ratio ($r_B = \sigma^2_{gca} / (\sigma^2_{gca} + \sigma^2_{tg})$, the Type B correlation of Burdon 1977); and dominance to additive variance ratio ($\gamma = \sigma^2_{sca} / \sigma^2_{gca}$). The
levels employed for each ratio are h2 = 0.1 and 0.25; rB = 0.5 and 0.8; and γ = 0.25 and 1.0.
To generate sets of true variance components (Table 2-1) for half-diallel and circular
mating designs from the factorial combinations of genetic parameters, the denominator of h2 is
set to 10 (arbitrarily, but without loss of generality) which, given the level of h2, leads to the
solution for $\sigma^2_{gca}$. Solving for $\sigma^2_{gca}$ and knowing γ yields the value for $\sigma^2_{sca}$. Knowing the level
of rB and $\sigma^2_{gca}$ allows the equation for rB to be solved for $\sigma^2_{tg}$. An assumption that γ
equals the ratio $\sigma^2_{ts} / \sigma^2_{tg}$ permits a solution for $\sigma^2_{ts}$. A further assumption that $\sigma^2_p$ is a fixed
percentage of $\sigma^2_p + \sigma^2_w$ (7% for the values in Table 2-1) yields a solution for both $\sigma^2_p$ and $\sigma^2_w$. Finally, $\sigma^2_t$ and $\sigma^2_b$ are set to 1.0 and
0.5, respectively, for all treatment levels.

Table 2-1. Parametric variance components for the factorial combination of heritability (.1 and
.25), Type B correlation (.5 and .8) and dominance to additive variance ratio (.25 and 1.0) for
full and half-sib designs. $\sigma^2_t$ and $\sigma^2_b$ were maintained at 1.0 and .5, respectively, for all levels and
designs. For the half-sib rows the last two columns are $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$.

Design  Level     h2    rB    γ      σ²gca   σ²sca   σ²tg    σ²ts    σ²p     σ²w
Full    1         .1    .8    1.0    .2500   .2500   .0625   .0625   .6344   8.4281
Full    2         .1    .5    1.0    .2500   .2500   .2500   .2500   .5950   7.9050
Full    3         .1    .8    .25    .2500   .0625   .0625   .0156   .6508   8.6461
Full    4         .1    .5    .25    .2500   .0625   .2500   .0625   .6212   8.2538
Full    5         .25   .8    1.0    .6250   .6250   .1562   .1562   .5359   7.1203
Full    6         .25   .5    1.0    .6250   .6250   .6250   .6250   .4376   5.8125
Full    7         .25   .8    .25    .6250   .1562   .1562   .0391   .5769   7.6649
Full    8         .25   .5    .25    .6250   .1562   .6250   .1562   .5031   6.6844
Half    1 and 3   .1    .8    -      .2500   -       .0625   -       .4844   9.2031
Half    2 and 4   .1    .5    -      .2500   -       .2500   -       .4750   9.0250
Half    5 and 7   .25   .8    -      .6250   -       .1562   -       .4609   8.7579
Half    6 and 8   .25   .5    -      .6250   -       .6250   -       .4375   8.3125
In order to facilitate comparisons of half-sib mating designs with full-sib mating designs,
$\sigma^2_{gca}$ and $\sigma^2_{tg}$ retain the same values for given levels of h2 and rB and the denominator of
heritability again is set to 10. To solve for $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$, an assumption that $\sigma^2_{p^*}$ is a fixed percentage of $\sigma^2_{p^*} + \sigma^2_{w^*}$ (5% for the values in Table 2-1) permits a solution for $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$ and maintains $\sigma^2_{p^*}$ no larger than $\sigma^2_p$ of the full-sib mating designs (Namkoong et al. 1966) for the same levels of
h2 and rB. Under the previous definitions all consideration of differences in γ changing the
magnitudes of $\sigma^2_{p^*}$ and $\sigma^2_{w^*}$ is disallowed. Thus, there are only four parameter sets for the half-sib
mating design (Table 2-1).
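
The sequence of solutions described above can be sketched in a few lines. The function names are hypothetical, and the plot shares (0.07 for full-sib designs and 0.05 for half-sib designs) are assumptions inferred from the values printed in Table 2-1 rather than quoted from the text.

```python
def full_sib_components(h2, rB, gamma, denom=10.0, plot_share=0.07):
    """Solve for the full-sib variance components from the three genetic ratios."""
    gca = h2 * denom / 4.0              # h2 = 4 gca / denominator
    sca = gamma * gca                   # gamma = sca / gca
    tg = gca * (1.0 - rB) / rB          # rB = gca / (gca + tg)
    ts = gamma * tg                     # gamma = ts / tg (assumption in the text)
    rest = denom - (2 * gca + sca + 2 * tg + ts)   # what is left for plot + within
    return {"gca": gca, "sca": sca, "tg": tg, "ts": ts,
            "p": plot_share * rest, "w": (1.0 - plot_share) * rest}

def half_sib_components(h2, rB, denom=10.0, plot_share=0.05):
    """Same idea for half-sib designs, which carry no sca or test by sca terms."""
    gca = h2 * denom / 4.0
    tg = gca * (1.0 - rB) / rB
    rest = denom - (gca + tg)
    return {"gca": gca, "tg": tg,
            "p*": plot_share * rest, "w*": (1.0 - plot_share) * rest}

level1 = full_sib_components(0.1, 0.8, 1.0)
level8 = full_sib_components(0.25, 0.5, 0.25)
half_1_3 = half_sib_components(0.1, 0.8)
print(level1)
print(half_1_3)
```

Under these assumptions the computed values agree with the corresponding rows of Table 2-1 to the four decimals printed there.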
Covariance Matrix for Variance Components
The base algorithm to produce the covariance matrix for variance component estimates
is from Giesbrecht (1983) and was rewritten in Fortran for ease of handling the study data. In
using this algorithm, we assume that all random variables are independent and normally
distributed and that the true variances of the random variables are known. Under these
assumptions, Minimum Norm Quadratic Unbiased Estimation (MINQUE, Rao 1972) using the
true variance components as priors (the starting point for the algorithm) becomes MIVQUE (Rao
1971b), which requires normality and the true variance components as priors (Searle 1987), and
for a given design the covariance matrix of the variance component estimates becomes fixed. A
sketch of the steps from the MIVQUE equation (Eq.2-10, Giesbrecht 1983, Searle 1987) to the
true covariance matrix for variance component estimates is
$$\{tr(QV_iQV_j)\}\,\hat{\sigma}^2 = \{y'QV_iQy\} \tag{2-10}$$
(dimensions r×r, r×1, and r×1, respectively);
then $\hat{\sigma}^2 = \{tr(QV_iQV_j)\}^{-1}\{y'QV_iQy\}$
and $Var(\hat{\sigma}^2) = \{tr(QV_iQV_j)\}^{-1}\,Var(\{y'QV_iQy\})\,\{tr(QV_iQV_j)\}^{-1}$
(each matrix r×r);
where $\{a_{ij}\}$ is a matrix whose elements are $a_{ij}$, where in the full-sib designs i = 1
to 8 and j = 1 to 8, i.e., there is a row and column for every random
variable in the linear model;
tr is the trace operator, that is, the sum of the diagonal elements of a
matrix;
$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$ for V = the covariance matrix of y and X as
the design matrix for fixed effects;
$V_i = Z_iZ_i'$, where i = the random variables test, block, etc.;
$\hat{\sigma}^2$ is the vector of variance component estimates; and
r is the number of random variables in the model.
The variance of a quadratic form (where A is any non-negative definite matrix of proper
dimension) under normality is $Var(y'Ay) = 2tr(AVAV) + 4\mu 1'AVA1\mu$ (Searle 1987); however,
the MINQUE derivation (Rao 1971) requires that AX = 0, which in our case is A1 = 0 and is
equivalent to $\mu 1'A1\mu = 0$; thus
$$Var(\{y'QV_iQy\}) = 2\{tr(QV_iQV_j)\}; \tag{2-11}$$
and using Eq. 2-10 and Eq. 2-11
$$Var(\hat{\sigma}^2) = \{tr(QV_iQV_j)\}^{-1}\,2\{tr(QV_iQV_j)\}\,\{tr(QV_iQV_j)\}^{-1}$$
and therefore
$$Var(\hat{\sigma}^2) = V_{vc} = 2\{tr(QV_iQV_j)\}^{-1}. \tag{2-12}$$
From Eq.2-12 it is seen that the MIVQUE covariance matrix of the variance component estimates
is dependent only on the design matrix (the result of the field design and mating design) and the
true variance components; a data vector is not needed.
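
A minimal sketch of this computation follows, assuming NumPy and a hypothetical helper name `mivque_cov`; it is not the Fortran program used in the study. The toy design is a balanced one-way random model (4 groups of 3 observations), chosen because in that balanced case the result can be compared with the well-known variances of the analysis-of-variance estimators under normality.

```python
import numpy as np

def mivque_cov(Z_list, sigma2, X):
    """Return 2 * {tr(Q V_i Q V_j)}^{-1} given design matrices and true components."""
    V_parts = [Z @ Z.T for Z in Z_list]                  # V_i = Z_i Z_i'
    V = sum(s * Vi for s, Vi in zip(sigma2, V_parts))    # Var(y)
    Vinv = np.linalg.inv(V)
    # Q = V^-1 - V^-1 X (X' V^-1 X)^- X' V^-1
    Q = Vinv - Vinv @ X @ np.linalg.pinv(X.T @ Vinv @ X) @ X.T @ Vinv
    QV = [Q @ Vi for Vi in V_parts]
    r = len(QV)
    T = np.array([[np.trace(QV[i] @ QV[j]) for j in range(r)] for i in range(r)])
    return 2.0 * np.linalg.inv(T)

# Toy balanced one-way random model: a groups, n observations per group.
a, n = 4, 3
s2_a, s2_e = 0.5, 2.0                                    # hypothetical true components
Z_a = np.kron(np.eye(a), np.ones((n, 1)))                # group incidence matrix
Z_e = np.eye(a * n)                                      # residual
X = np.ones((a * n, 1))                                  # overall mean only
cov = mivque_cov([Z_a, Z_e], [s2_a, s2_e], X)
print(cov)
```

No data vector appears anywhere in the function: the design matrices and the true components are sufficient, as stated above.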
Covariance Matrix for Linear Combinations of Variance Components and Variance of a Ratio
Once the covariance matrix for the variance component estimates (Eq.2-12) is created,
then the covariance matrix of linear combinations of these variance components is formed as
$$V_k = L'V_{vc}L \tag{2-13}$$
(dimensions: $V_k$, 2×2; $L'$, 2×r; $V_{vc}$, r×r; $L$, r×2)
where L specifies the linear combinations of the variance components which are the
combinations of variance components in the denominator and numerator of the genetic ratio being
estimated. A Taylor series expansion (first approximation) for the variance of a ratio using the
variances of and covariance between numerator and denominator is then applied using the
elements of $V_k$ to produce the approximate variance of the three ratio estimates as (Kendall and
Stuart 1963):
$$Var(ratio) \approx (1/D)^2\,V_k(1,1) - 2(N/D^3)\,V_k(1,2) + (N^2/D^4)\,V_k(2,2) \tag{2-14}$$
where the generic ratio is N/D and N and D are the parametric values;
$V_k(1,1)$ is the variance of N;
$V_k(1,2)$ is the covariance between N and D; and
$V_k(2,2)$ is the variance of D.
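
Eq. 2-13 and Eq. 2-14 can be sketched together as follows. The function name `ratio_variance` is hypothetical, and the diagonal covariance matrix of the variance component estimates is made up purely for illustration; the columns of L are the numerator and denominator of h2 for the full-sib model, with components ordered as test, block, gca, sca, test by gca, test by sca, plot, and within plot.

```python
import numpy as np

def ratio_variance(L, V_vc, sigma2):
    """First-approximation variance of the ratio (L[:,0]' s) / (L[:,1]' s)."""
    L = np.asarray(L, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    V_k = L.T @ V_vc @ L                  # 2 x 2 matrix of Eq. 2-13
    N, D = L.T @ sigma2                   # parametric numerator and denominator
    return (V_k[0, 0] / D**2
            - 2.0 * N * V_k[0, 1] / D**3
            + N**2 * V_k[1, 1] / D**4)

# Level 1 components of Table 2-1: t, b, gca, sca, tg, ts, p, w.
sigma2 = [1.0, 0.5, 0.25, 0.25, 0.0625, 0.0625, 0.6344, 8.4281]
L = np.array([[0, 0, 4, 0, 0, 0, 0, 0],      # numerator of h2: 4 gca
              [0, 0, 2, 1, 2, 1, 1, 1]]).T   # denominator of h2
# Made-up covariance matrix of the estimates (illustration only).
V_vc = np.diag([0.5, 0.1, 0.02, 0.01, 0.005, 0.004, 0.003, 0.002])
var_h2 = ratio_variance(L, V_vc, sigma2)
print(var_h2)
```

The same function applies to rB and γ once the two columns of L are changed to the corresponding numerator and denominator.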
Comparison Among Estimates of Variances of Ratios
The approximate variances of the three ratio estimates (h2, rB, and γ) are compared across
mating designs with equal (or approximately equal) numbers of observations, across numbers of
locations, and across numbers of parents within a mating design all within a level of genetic
determination. The standard for comparison is i. Results are presented by the genetic ratio
estimated so that direct comparisons may be made among the mating designs for equal numbers
of observations within a number of locations for varying levels of genetic control. Number of
genetic entries (number of crosses for full-sib designs and number of half-sib families for half-sib
designs) is used as a proxy for number of observations since, for all designs, number of
observations equals twenty-four times the number of locations times the number of genetic
entries. Further, by plotting the two levels of numbers of locations on a single figure, a
comparison is made of the utility of replication of a design across increasing numbers of
locations.
Efficiency plots also permit contrasts of the absolute magnitude of variance of estimation
among designs. For a given number of genetic entries and locations, the design with the highest
efficiency is the most precise (lowest variance of estimation). Increasing the number of genetic
entries or locations always results in greater precision (lower variance of estimation), but is not
necessarily as efficient (the reduction in variance was not sufficient to offset the increase in
numbers of observations). A primary justification for using the efficiency of a design as a
criterion is that a more precise estimate of a genetic ratio is obtained by using the mean of two
estimates from replication of the small design as two disconnected experiments as opposed to the
estimate from a single large design. This is true when 1) the number of observations in the large
design (N) equals twice the number of observations in the small design ($n_1$), 2) the small design is
more efficient, and 3) the variances are homogeneous. This is proven below:
Since
$N = n_1 + n_2$
and
$n_1 = n_2$
then
$N = 2n_1$.
By definition
$i = 1 / (N \cdot Var(Ratio))$;
and
$Var(Ratio) = 1 / (i \cdot N)$.
The proposition is
$(Var_s(Ratio) + Var_s(Ratio)) / 4.0 < Var_l(Ratio)$;
substitution gives
$((1/(n_1 i_s)) + (1/(n_1 i_s))) / 4.0 < 1/(N i_l)$.
Simplification yields
$1/(2.0\, n_1 i_s) < 1/(N i_l)$;
and multiplication by N produces
$$1/i_s < 1/i_l \tag{2-15}$$
which is strictly true so long as $i_s > i_l$, where $i_s$ is the efficiency of the smaller experiment and
$i_l$ is the efficiency of the larger experiment.
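
The inequality above can be checked with a short numerical sketch. The function names and the efficiencies and sample sizes used are hypothetical.

```python
import math

def variance_from_efficiency(i, n_obs):
    """Var(Ratio) = 1 / (i * N)."""
    return 1.0 / (i * n_obs)

def variance_of_mean_of_two(i_small, n_small):
    """Variance of the mean of two independent estimates with equal variance."""
    v = variance_from_efficiency(i_small, n_small)
    return (v + v) / 4.0

n_small = 240
n_large = 2 * n_small
i_small, i_large = 0.6, 0.5            # the smaller experiment is more efficient

two_small = variance_of_mean_of_two(i_small, n_small)
one_large = variance_from_efficiency(i_large, n_large)
print(two_small, one_large)
```

When the two efficiencies are equal the two variances coincide, which is the boundary case of Eq. 2-15.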

[Figure 2-1 is not reproduced here. Panels (1a) through (1h) plot efficiency against number of genetic entries for circular, half-diallel, and half-sib mating designs at 2 and 5 locations.]
Figure 2-1. Efficiency (i) for h2 plotted against number of genetic entries for levels 1 through 8 for genetic control for circular, half-diallel,
and half-sib mating designs across levels of location where i = 1/(N(Var(h2))) and N = the total number of observations.

Results
Heritability
Half-sib designs are almost globally superior to the two full-sib designs in precision of
heritability estimates (results not shown for variance but may be seen from efficiencies in Figure
2-1). For designs of equal size, half-sib designs excel with the exception of genetic level three
(Figure 2-1c, h2 = 0.1, rB = 0.8, and γ = 0.25). In genetic level three, the circular design
provides the most precise estimate of h2 for two location designs; however, when the design is
extended across five locations, the half-sib mating design again provides the most precise
estimates. The circular mating design is superior in precision to the half-diallel design across all
levels of genetic control and location, even with a relatively large number of crosses per parent
(four).
Half-sib designs are, in general, more efficient (seven genetic control levels out of eight, Figure 2-1),
the exception being level three across two locations (Figure 2-1c). For the
circular and half-sib mating designs considered, increasing the number of genetic entries always
improves the efficiency of the design. However, definite optima exist for the half-diallel mating
design for number of genetic entries, i.e., crosses which convert to a specific number of parents.
These optima are not constant but tend to be six parents or less, lower with increasing h2 or
number of locations. The six-parent half-diallel is never far from the half-diallel optima, and
increasing the number of parents past the optimum results in decreased efficiency.
For half-sib designs with h2 = 0.1, five locations are more efficient than two locations;
however, at h2 = 0.25 two locations are most efficient. Further, the number of locations
required to efficiently estimate h2 for half-sib designs is determined only by the level of h2 and
does not depend on the levels of the other ratios. Although estimates over larger numbers of
observations are more precise (five-location estimates are more precise than two-location
estimates), the efficiency (increase in precision per unit observation) declines. Thus, if h2 =
0.25 and estimates of a certain precision are required, disconnected sets of two-location
experiments are preferred to five-location experiments. The relative efficiency of five locations
versus two locations is enhanced with decreasing rB (increasing genotype by environment
interaction) within a level of h2 (compare Figures 2-1a to 2-1b and 2-1c to 2-1d for h2 = 0.1, and
2-1e to 2-1f and 2-1g to 2-1h for h2 = 0.25). Yet, this enhancement is not sufficient to cause
a change in efficiency ranking between the location levels.
The full-sib designs differ markedly from this pattern (Figure 2-1) in that, for these
parameter levels, it is never more efficient to increase the number of locations from two to five
for heritability estimation. As observed with half-sib designs, for full-sib designs the relative
efficiency status of five locations improves with decreasing rB. To further contrast mating designs,
note that the efficiency status of full-sib designs relative to the half-sib design improves with
decreasing γ and increasing rB (Figures 2-1b versus 2-1c and 2-1f versus 2-1g).
Type B Correlation
As opposed to h2 estimation, no mating design performs at or near the optima for
precision of rB estimates across all levels of genetic control (Figure 2-2). However, the circular
mating designs produce globally more precise estimates than those of the half-diallel mating
design. In general, the utility of full-sib versus half-sib designs is dependent on the level of rB.
The lower rB value favors half-sib designs while the higher rB tends to favor full-sib designs
(compare Figures 2-2a to 2-2b, 2-2c to 2-2d, 2-2e to 2-2f and 2-2g to 2-2h).
Decreasing γ and lowering h2 always improve the relative efficiency of full-sib designs to half-sib
designs (compare Figures 2-2c and 2-2d to 2-2e and 2-2f).

[Figure 2-2 is not reproduced here. Panels (2a) through (2h) plot efficiency against number of genetic entries (0 to 50) for circular, half-diallel, and half-sib mating designs at 2 and 5 locations.]
Figure 2-2. Efficiency (i) for rB plotted against number of genetic entries for levels 1 through 8 for genetic control for circular, half-diallel, and
half-sib mating designs across levels of location where i = 1/(N(Var(rB))) and N = the total number of observations.

[Figure 2-3 is not reproduced here. Panels: (3a) h2 = .1, rB = .5, γ = 1.0; (3b) h2 = .1, rB = .5, γ = .25; (3c) h2 = .25, rB = .8, γ = 1.0; (3d) h2 = .25, rB = .8, γ = .25. Each panel plots efficiency against number of genetic entries; the legend lists circular and half-diallel designs at 2 and 5 locations.]
Figure 2-3. Efficiency (i) for γ plotted against number of genetic entries for four levels for
genetic control for circular, half-diallel, and half-sib mating designs across levels of location
where i = 1/(N(Var(γ))) and N = the total number of observations.

For estimation of rB, full-sib designs are more efficient than half-sib designs except in the
three cases of low rB (0.5) and high γ (1.0) for h2 = 0.1 (Figure 2-2b) and low rB for h2 = 0.25
(Figures 2-2f and 2-2h). Within full-sib designs the circular design is globally superior to the
half-diallel. As with h2 estimation, half-diallel designs have optimal levels for numbers of
parents. The six-parent half-diallel is again close to these optima for all genetic levels and
numbers of locations.
At low h2 for full-sib designs, planting in two locations is always more efficient than five
locations. For half-sib designs at low h2, the relative efficiency of two versus five locations is
dependent on the level of rB with lower rB favoring replication across more locations. At h2 =
0.25, half-sib designs are more efficient when replicated across five locations. At the higher h2
value, full-sib design efficiency across locations is dependent on the level of rB. With rB = 0.5
and h2 = 0.25, replication of full-sib designs is for the first time more efficient across five
locations than across two locations; however, at the higher rB level two locations is again the
preferred number.
Dominance to Additive Variance Ratio
In comparing the two full-sib designs for relative efficiency in estimating γ, the circular
design is always approximately equal to or, for most cases, superior to the half-diallel design
(Figure 2-3). The relative superiority of the circular design is enhanced by decreasing γ and rB
(not shown). The half-diallel design again demonstrates optima for number of parents with the
six-parent design being near optimal. Within a mating design the use of two locations is always
more efficient than the use of five locations. The magnitude of this superiority escalates with
increasing rB and h2 (Figures 2-3a and 2-3b versus 2-3c and 2-3d).

Discussion
Comparison of Mating Designs
A priori knowledge of genetic control is required to choose the optimal mating and field
design for estimation of h2, rB and γ. Given that such knowledge may not be available, the
choices are then based on the most robust mating designs and field designs for the estimation of
certain of the genetic ratios. If h2 is the only ratio desired, then the half-sib mating design is
best. Estimation of both h2 and rB requires a choice between the half-sib and circular designs.
If there is no prior knowledge then the selection of a mating design is dependent on which ratio
has the highest priority. For experiments in which h2 received highest weighting, the half-sib
design is preferred and in the alternative case the circular design is the better choice. In the last
scenario information on all three ratios is desired from the same experiment and in this case the
circular design is the better selection since the circular design is almost globally more efficient
than the half-diallel design.
After choosing a mating design, the next decision is how many locations per experiment
are required to optimize efficiency. For the half-sib design the number of locations required to
optimize efficiency is dependent on both the ratio being estimated and the level of genetic control.
A broad inference is that for h2 estimation a two location experiment is more efficient and for rB
a five location experiment has the better efficiency. Estimation of any of the three ratios with
a full-sib design is almost globally more efficient in two location experiments.
The disparity between the behavior of the half-sib and full-sib designs with respect to the
efficiency of location levels can be explained in terms of the genetic connectedness offered by the
different designs. Genetic connectedness can be viewed as commonality of parentage among
genetic entries. The more entries having a common parent the more connectedness is present.

The half-sib design is only connected across locations by the one common parent in a half-sib
family in each replication. Full-sib designs are connected across locations in each replication by
the full-sib cross plus the number of parents minus two (half-diallel) or three (circular) for each
of the two parents in a cross. The connectedness in a full-sib design means each observation is
providing information about many other observations. The result of this connectedness is that,
in general, fewer observations (number of locations) are required for maximum efficiency.
A General Approach to the Estimation Problem
The estimation problems may be viewed in a broader context than the specific solutions
in this chapter. The technique for comparison of mating designs and numbers of locations across
levels of genetic determination may be construed, for the case of h2 estimation, to be the effect
of these factors on the variance of $\sigma^2_{gca}$ estimates. Viewing the variance approximation formula,
the conclusion may be reached that the variance of $\sigma^2_{gca}$ estimates is the controlling factor in the
variance of h2 estimates since the other factors at these heritability levels are multiplied by
constants which reduce their impact dramatically. Given this conclusion, the variance of h2
estimates is essentially determined by the (3,3) element in $2\{tr(QV_iQV_j)\}^{-1}$ (Eq. 2-12). Further, since the
covariances of the other variance component estimates with $\sigma^2_{gca}$ estimates are small, the variance
of $\sigma^2_{gca}$ estimates is basically determined by the magnitude of the (3,3) element of $\{tr(QV_iQV_j)\}$,
which is $tr(QV_gQV_g)$. Thus, the variance of h2 estimates is minimized by maximizing
$tr(QV_gQV_g)$, with h2 used as an illustration because this simplification is possible.
Considering the impact of changing levels of genetic control, while holding the mating
and field designs constant, Vg is fixed, the diagonal elements of V are fixed at 11.5 because of
our assumptions, and only the off-diagonal elements of V change with genetic control levels.
Since Q is a direct function of $V^{-1}$, what we observe in Figure 2-1 comparing a design across
levels of genetic control are changes in $V^{-1}$ brought about by changes in the magnitude of the
off-diagonal elements of V (covariances among observations). The effect of positive off-diagonal
elements (the linear model specifies that all off-diagonal elements in V are zero or positive) on
$V^{-1}$ is to reduce the magnitude of the diagonal elements and often also to produce negative
off-diagonal elements. If one increases the magnitude of the off-diagonal elements in V, then the
magnitude of the diagonal elements of $V^{-1}$ is reduced and the magnitude of negative off-diagonal
elements is increased. Since $tr(QV_gQV_g)$ is the sum of the squared elements of the product of
a direct function of $V^{-1}$ and a matrix of non-negative constants ($V_g$), as the diagonal elements of
$V^{-1}$ are reduced and the off-diagonal elements become more negative, $tr(QV_gQV_g)$ must become
smaller and the variance of h2 estimates increases.
Mating designs may be compared by the same type of reasoning. Within a constant field
design changes in mating design produce alterations in V. Of the three designs the half-sib
produces a V matrix with the most zero off-diagonal elements, the circular design next, and the
half-diallel the fewest number of zero off-diagonal elements. Knowing the effect of off-diagonal
elements on the variance of h2 estimates, one could surmise that the variance of estimates is
reduced in the order of least to most non-zero off-diagonal elements. This tenet is in basic
agreement with the results in Figures 2-1 through 2-3.
The effects of rB and γ on the variance of h2 estimates can also be interpreted utilizing
the above approach. In the results section of this chapter it is noted that decreasing the magnitude
of rB and/or γ causes full-sib designs to rise in efficiency relative to the half-sib design. In
accordance with our previous arguments this would be expected since decreasing the magnitude
of those two ratios causes a decrease in the magnitude of off-diagonal elements. More precisely,
decreasing γ results in the reduction of off-diagonal elements in V of the full-sib designs while
not affecting the half-sib design, and decreasing rB results in the reduction of off-diagonal
elements in V of full-sib and half-sib designs. Relative increases in efficiency of full-sib designs
result from the elements due to location by additive interaction occurring much less frequently
in the half-sib designs; thus, the relative impact of reduction in rB in half-sib designs is less than
that for full-sibs.
Use of the Variance of a Ratio Approximation
Use of Kendall and Stuart’s (1963) first approximation (first-term Taylor series
approximation) of the variance of a ratio has two major caveats. The approximation depends on
large sample properties to approach the true variance of the ratio, i.e., with a small number of
levels for random variables the approximation does not necessarily closely approximate the true
variance of the ratio. Work by Pederson (1972) suggests that for approximating the variance of
h2 at least ten parents are required in diallels before the approximation will converge to the true
variance even after including Taylor series terms past the first derivative. Pederson’s work also
suggests that the approximation is progressively worse for increasing heritability with low
numbers of parents. Using the field design in this chapter (two locations, four blocks and six-tree
row-plots), simulation work (10,000 data sets) has demonstrated that, with a heritability of 0.1
and four parents in a half-diallel across two locations, the variance of a ratio approximation
yields a variance estimate for h2 of 0.1 while the convergent value for the simulation was 0.08
(Huber unpublished data). One should remember the dependence of the first approximation of
the variance of a ratio on large sample properties when applying the technique to real data.
The second caveat is that the range of estimates of the denominator of the ratio cannot
pass through zero (Kendall and Stuart 1963). This constraint is of no concern for h2; however,
the structure of rB and γ denominators allows unbiased minimum variance estimates of those
denominators to pass through zero, which means that at one point in the distribution of the estimates
of the ratios they are undefined (the distributions of these ratio estimates are not continuous).
Simulation has shown that the variances of rB and γ are much greater than the approximation
would indicate (Huber unpublished data). The discrepancy in variance of the estimates could be
partially alleviated through using a variance component estimation technique which restricts
estimates to the parameter space $0 < \sigma^2 < \infty$. Nevertheless, because of the two caveats,
approximations of the variance of h2, rB and γ estimates should be viewed only on a relative basis
for comparisons among designs and not on an absolute scale.
Additionally, the expectation of a ratio does not equal the ratio of the expectations (Hogg
and Craig 1978). If a value of genetic ratios is sought so that the value equals the ratio of the
expectations, then the appropriate way to calculate the ratio would be to take the mean of
variance components or linear combinations of variance components across many experiments and
then take the ratio. If the value sought for h2 is the expectation of the ratio, then taking the mean
of many h2 estimates is the appropriate approach. Returning to the results from simulated data
(10,000 data sets) where the h2 value was set at 0.1, using the ratio of the means of variance
components rendered a value of 0.1 for h2, the mean of the h2 estimates returned a value of 0.08,
and a Taylor series approximation of the mean of the ratio yielded 0.07 (Pederson 1972).
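
The distinction between the ratio of expectations and the expectation of a ratio can be illustrated with a toy Monte Carlo sketch. This is not the simulation referred to above: the two "variance component estimates" are simply independent scaled chi-square draws, and the true values, degrees of freedom, and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experiments = 200_000

# Hypothetical true components giving a true ratio a / (a + e) = 0.1.
true_a, true_e = 1.0, 9.0
df_a, df_e = 3, 1000      # few degrees of freedom for a, many for e

# Unbiased estimates of each component, one pair per simulated experiment.
a_hat = true_a * rng.chisquare(df_a, n_experiments) / df_a
e_hat = true_e * rng.chisquare(df_e, n_experiments) / df_e

# Ratio of the means of the components across experiments.
ratio_of_means = a_hat.mean() / (a_hat.mean() + e_hat.mean())
# Mean of the per-experiment ratios.
mean_of_ratios = np.mean(a_hat / (a_hat + e_hat))

print(ratio_of_means, mean_of_ratios)
```

In this toy setting the ratio of the means stays close to the true ratio, while the mean of the ratios settles at a different value, so which quantity is wanted must be decided before choosing how to summarize results across experiments.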
Conclusions
Results from this study should be interpreted as relative comparisons of the levels of the
factors investigated. However, viewing the optimal design problem as illustrated in the
discussion section of this chapter can provide insight to the more general problem.
There is no globally most efficient number of locations, parents or mating design for the
three ratios estimated even within the restricted range of this study; yet, some general conclusions
can be drawn. For estimating h2 the half-sib design is always optimal or close to optimal in
terms of variance of estimation and efficiency. In the estimation of rB and γ, the circular mating
design is always optimal or near optimal in variance reduction and efficiency. Across numbers
of parents within a mating design only the half-diallel shows optima for efficiency. The other
mating designs have non-decreasing efficiency plots over the levels of number of parents, so that
while there is an optimal number of locations for a level of genetic control, the number of genetic
entries per location is limited more by operational than efficiency constraints.
Two locations is a near global optimum over five locations for the full-sib mating designs.
Within the half-sib mating design optimality depends on the levels of h2 and rB: 1) for h2
estimation the optimal number of locations is inversely related to the level of h2, i.e. at the higher
level two tests were optimal and at the lower level five tests were optimal; and 2) for rB
estimation for the half-sib design, the optimal number of locations was also inversely related to
the level of rB.
Means of estimates from disconnected sets provide lower variance of estimation where
the smaller experiments have higher efficiencies. Thus, disconnected sets are preferred according
to number of locations for all mating designs and according to number of parents for the half-
diallel mating design.
In practical consideration of the optimal mating design problem, the results of this study
indicate that if h2 estimation is the primary use of a progeny test then the half-sib mating design
is the proper choice. Further, the circular mating design is an appropriate choice if the
estimation of rB is more important than h2. Finally, if a full-sib design is required to furnish
information about dominance variance, the circular design provides almost globally better
efficiencies for h2, rB, and γ than the half-diallel.

CHAPTER 3
ORDINARY LEAST SQUARES ESTIMATION OF GENERAL
AND SPECIFIC COMBINING ABILITIES FROM
HALF-DIALLEL MATING DESIGNS
Introduction
The diallel mating system is an altered factorial design in which the same individuals (or
lines) are used as both male and female parents. A full diallel contains all crosses, including
reciprocal crosses and selfs, resulting in a total of p² combinations, where p is the number of
parents. Assumptions that reciprocal effects, maternal effects, and paternal effects are negligible
lead to the use of the half-diallel mating system (Griffing 1956, method 4) which has p(p-1)/2
parental combinations and is the mating system addressed in this chapter.
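
The two counts can be illustrated by enumerating the parental combinations directly. The function names below are hypothetical and the sketch only counts crosses; it says nothing about how they are laid out in the field.

```python
import itertools

def full_diallel_crosses(p):
    """All ordered female x male combinations, including selfs and reciprocals."""
    return list(itertools.product(range(p), repeat=2))

def half_diallel_crosses(p):
    """Unordered parent pairs without selfs or reciprocals (Griffing's method 4)."""
    return list(itertools.combinations(range(p), 2))

p = 6
full = full_diallel_crosses(p)
half = half_diallel_crosses(p)
print(len(full), len(half))
```

For six parents this gives 36 combinations in the full diallel and 15 in the half-diallel.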
Half diallels have been widely used in crop and tree breeding (Sprague and Tatum 1942,
Gilbert 1958, Matzinger et al. 1959, Burley et al. 1966, and Squillace 1973) and the widespread
use of this mating system continues today (Weir and Zobel 1975, Wilcox et al. 1975, Snyder and
Namkoong 1978, Hallauer and Miranda 1981, Singh and Singh 1984, Greenwood et al. 1986,
and Weir and Goddard 1986).
Most of the statistical packages available treat fixed effect estimation as the objective of
the program with random variables representing nuisance variation. Within this context a
common analysis of half-diallel experiments is conducted by first treating genetic parameters as
fixed effects for estimation of general (GCA) and specific (SCA) combining abilities and
subsequently as random variables for variance component estimation (used for estimating
heritabilities, genetic correlations, and general to specific combining ability variance ratios for
determining breeding strategies). This chapter focuses on the estimation of GCA’s and SCA’s
as fixed effects. The treatment of GCA and SCA as fixed effects in OLS (ordinary least squares)
is an entirely appropriate analysis if the comparisons are among parents and crosses in a
particular experiment. If, as forest geneticists often wish to do, GCA estimates from
disconnected experiments are to be compared, then methods such as checklots must be used to
place the estimates on a common basis.
Formulae (Griffing 1956, Falconer 1981, Hallauer and Miranda 1981, and Becker 1975)
for hand calculation of general and specific combining abilities are based on a solution to the OLS
equations for half-diallels created by sum-to-zero restrictions, i.e., the sum of all effect estimates
for an experimental factor equals zero. These formulae will yield correct OLS solutions for sum-
to-zero genetic parameters provided the data have no missing cells. If cell (plot) means are used
as the basis for the estimation of effects, there must be at least one observation per cell (plot)
where a cell is a subclassification of the data defined by one level of every factor (Searle 1987).
An example of a cell is the group of observations denoted by $AB_{ij}$ for a randomized complete
block design with factor A across blocks (B). If the above formulae are applied without
accounting for missing cells, incorrect and possibly misleading solutions can result. The matrix
algebra approach is described in this chapter for these reasons: 1) in forest tree breeding
applications data sets with missing cells are extremely common; 2) many statistical packages do
not allow direct specification of the half-diallel model; 3) the use of a linear model and matrix
algebra can yield relevant OLS solutions for any degree of data imbalance; and 4) viewing the
mechanics of the OLS approach is an aid to understanding the properties of the estimates.
The objectives of this chapter are to (1) detail the construction of ordinary least squares
(OLS) analysis of half-diallel data sets to estimate genetic parameters (GCA and SCA) as fixed
effects, (2) recount the assumptions and mathematical features of this type of analysis, (3)
facilitate the reader’s implementation of OLS analyses for diallels of any degree of imbalance and
suggest a method for combining estimates from disconnected experiments, and (4) aid the reader
in ascertaining what method is an appropriate analysis for a given data set.
Methods
Linear Model
Plot means are used as the unit of observation for this analysis with unequal numbers of
observations per plot. Plot (cell) means are always estimable as long as there is one observation
per plot, and linear combinations of these means (least squares means) provide the most efficient
way of estimating OLS fixed effects (Yates 1934). Throughout this chapter, estimates are
denoted by lower case letters while the parameters are designated by upper case letters and
matrices are in bold print.
Using plot means as observations, a common scalar linear model for an analysis of a half-
diallel mating design with p(p-1)/2 crosses planted at a single location in a randomized complete
block design with one plot per block is
$y_{ijk} = \mu + B_i + GCA_j + GCA_k + SCA_{jk} + e_{ijk}$    3-1
where $y_{ijk}$ is the mean of the i-th block for the jk-th cross;
$\mu$ is an overall mean;
$B_i$ is the fixed effect of block i for i = 1 to b;
$GCA_j$ is the fixed general combining ability effect of the j-th female parent or
k-th male parent, j or k = 1, ..., p ($j \neq k$);
$SCA_{jk}$ is the fixed specific combining ability effect of parents j and k; and
$e_{ijk}$ is the random error associated with the observation of the jk-th cross in
the i-th block where $e_{ijk} \sim (0, \sigma^2_e)$.
Cross by block interaction as genotype by environment interaction is treated as confounded with
between plot variation as for contiguous plots.
The model in matrix notation is
$y = X\beta + e$    3-2
where y is the vector of observations (n x 1 = n rows and 1 column) where n equals
the number of observations;
X is the design matrix (n x m) whose function is to select the appropriate parameters
for each observation where m equals the number of fixed effect parameters in the
model;
$\beta$ is the vector (m x 1) of fixed effect parameters ordered in a column; and
e is the vector (n x 1) of deviations (errors) from the expectation associated with each
observation.
Ordinary Least Squares Solutions
The matrix representation of an OLS fixed effects solution is
$b = (X'X)^{-1}X'y$    3-3
where b is the vector of estimated fixed effect parameters, i.e., an estimate of $\beta$, and
X is the design matrix, either made full rank by reparameterization or left overparameterized,
in which case a generalized inverse of X’X may be used.
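Equation 3-3 can be sketched numerically as follows. This is a minimal illustration using NumPy, not the software used in the original analysis; `ols_solution` is a hypothetical helper, the Moore-Penrose inverse stands in for "a generalized inverse", and the small data set is made up.

```python
import numpy as np

def ols_solution(X, y):
    """Solve b = (X'X)^-1 X'y, falling back to a generalized inverse of X'X
    (here the Moore-Penrose inverse) when X is not of full column rank."""
    XtX = X.T @ X
    if np.linalg.matrix_rank(X) == X.shape[1]:
        return np.linalg.solve(XtX, X.T @ y)
    return np.linalg.pinv(XtX) @ X.T @ y

# Made-up example: a mean column plus one sum-to-zero column for a two-level factor.
X = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, -1.0], [1.0, -1.0]])
y = np.array([3.0, 5.0, 1.0, 3.0])
b = ols_solution(X, y)
print(b)  # overall mean and the effect of level 1; level 2 is its negative
```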
Inherent in this solution is the ordinary least squares assumption that the variance-
covariance matrix (V) of the observations (y) is equal to $I\sigma^2_e$, where I is an n x n identity matrix.
The elements of an identity matrix are 1’s on the main diagonal and all other elements are 0.
Multiplying I by $\sigma^2_e$ places $\sigma^2_e$ on the main diagonal. In the variance-covariance matrix of the
observations, the variance of the observations appears on the main diagonal and the covariance
between observations appears in the off-diagonal elements. Thus, $V = I\sigma^2_e$ states that the
variance of the observations is equal to $\sigma^2_e$ for each observation and there are no covariances
between the observations (which is one direct result of considering genetic parameters as fixed
effects).
Sum-to-Zero Restrictions
The design matrix presented in this chapter is reparameterized by sum-to-zero restrictions
to (1) reduce the dimension of the matrices to a minimal size, and (2) yield estimates of fixed
effects with the same solution as common formulae in the balanced case. Other restrictions such
as set-to-zero could also be applied so the discussion that follows treats sum-to-zero restrictions
as a specific solution to the more general problem which is finding an inverse for X’X. The
subscripts ’o’ and ’s’ refer to the overparameterized model and the reparameterized model with
sum-to-zero restrictions, respectively.
The matrix $X_o$ of Figure 3-1 is the design matrix for an overparameterized linear model
(Milliken and Johnson 1984, page 96). Overparameterization means that the equations are written
in more unknowns (parameters, in this case 13) than there are equations (number of observations
minus degrees of freedom for error, in this case 12 - 5 = 7) with which to estimate the
parameters. Reparameterization as a sum-to-zero matrix overcomes this dilemma by reducing
the number of parameters through making some of the parameters linear combinations of others.
Sum-to-zero restrictions make the resulting parameters and estimates sum to zero even though
the unrestricted parameters (for example, the true GCA values as applied to a broader population)
do not necessarily sum-to-zero within a diallel. This is the problem of comparability of GCA
estimates from disconnected experiments.
Obs.   μ  B1  B2  GCA1  GCA2  GCA3  GCA4  SCA12  SCA13  SCA14  SCA23  SCA24  SCA34
y112   1   1   0    1     1     0     0     1      0      0      0      0      0
y113   1   1   0    1     0     1     0     0      1      0      0      0      0
y114   1   1   0    1     0     0     1     0      0      1      0      0      0
y123   1   1   0    0     1     1     0     0      0      0      1      0      0
y124   1   1   0    0     1     0     1     0      0      0      0      1      0
y134   1   1   0    0     0     1     1     0      0      0      0      0      1
y212   1   0   1    1     1     0     0     1      0      0      0      0      0
y213   1   0   1    1     0     1     0     0      1      0      0      0      0
y214   1   0   1    1     0     0     1     0      0      1      0      0      0
y223   1   0   1    0     1     1     0     0      0      0      1      0      0
y224   1   0   1    0     1     0     1     0      0      0      0      1      0
y234   1   0   1    0     0     1     1     0      0      0      0      0      1

$y = X_o \beta_o$, with $\beta_o$ = (μ, B1, B2, GCA1, GCA2, GCA3, GCA4, SCA12, SCA13, SCA14, SCA23, SCA24, SCA34)'

Figure 3-1. The overparameterized linear model for a four-parent half-diallel planted on a single
site in two blocks displayed as matrices. The design matrix ($X_o$) and parameter vector ($\beta_o$) are
shown in overparameterized form. 1’s and 0’s denote the presence or absence of a parameter in
the model for the observed means (data vector, y). The parameters displayed above the design
matrix label the appropriate column for each parameter. Error vector not exhibited.
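The overparameterized design matrix of Figure 3-1 can be rebuilt programmatically to confirm that it has 13 columns but only rank 7. The sketch below is an illustration in Python with NumPy; `overparameterized_design` is a hypothetical helper name.

```python
import numpy as np
from itertools import combinations

def overparameterized_design(p, b):
    """Columns: mu, B1..Bb, GCA1..GCAp, then one SCA column per cross."""
    crosses = list(combinations(range(1, p + 1), 2))
    rows = []
    for i in range(b):
        for cross in crosses:
            mean = [1]
            block = [1 if i == t else 0 for t in range(b)]
            gca = [1 if parent in cross else 0 for parent in range(1, p + 1)]
            sca = [1 if cross == c else 0 for c in crosses]
            rows.append(mean + block + gca + sca)
    return np.array(rows, dtype=float)

X_o = overparameterized_design(p=4, b=2)
print(X_o.shape)                   # (12, 13)
print(np.linalg.matrix_rank(X_o))  # 7
```

Because the rank (7) is less than the number of columns (13), X’X has no ordinary inverse, which is the dilemma that reparameterization resolves.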
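Obs.   μ   B1  GCA1  GCA2  GCA3  SCA12  SCA13
y112   1    1    1     1     0     1      0
y113   1    1    1     0     1     0      1
y114   1    1    0    -1    -1    -1     -1
y123   1    1    0     1     1    -1     -1
y124   1    1   -1     0    -1     0      1
y134   1    1   -1    -1     0     1      0
y212   1   -1    1     1     0     1      0
y213   1   -1    1     0     1     0      1
y214   1   -1    0    -1    -1    -1     -1
y223   1   -1    0     1     1    -1     -1
y224   1   -1   -1     0    -1     0      1
y234   1   -1   -1    -1     0     1      0

$y = X_s \beta_s + e$, with $\beta_s$ = (μ, B1, GCA1, GCA2, GCA3, SCA12, SCA13)' and
e = (e112, e113, e114, e123, e124, e134, e212, e213, e214, e223, e224, e234)'

Figure 3-2. The linear model for a four-parent half-diallel planted on a single site in two blocks
displayed as matrices. The design matrix ($X_s$) and the parameter vector ($\beta_s$) are presented in
sum-to-zero format. The parameters displayed above the design matrix label the appropriate
column for each parameter.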
To illustrate the concept of sum-to-zero estimates versus population parameters, we use
the expectation of a common formula. Becker (1975) gives equation 3-4 (which for balanced
cases is equivalent to $g_j = ((p-1)/(p-2))(\bar{Z}_{j.} - \bar{Z}_{..})$) as the estimate for general combining ability
for the j-th line with p equalling the number of parents and $Z_{jk}$ equalling the site mean of the j x
k cross. Here $Z_{j.}$ is the total of the site means of all crosses involving parent j and $Z_{..}$ is the
total of all cross site means. This equation yields the same solution as the matrix equations with no missing plots or
crosses and with a design matrix which contains the sum-to-zero restrictions. An evaluation of
this formula in a four-parent half-diallel planted in b blocks for the GCA of parent 1 is obtained
by substituting the expectation of the linear model (equation 3-1) for each observation:
$g_j = (1/(p(p-2)))(pZ_{j.} - 2Z_{..})$    3-4
$E\{g_1\} = E\{(1/(p(p-2)))(pZ_{1.} - 2Z_{..})\}$
$E\{g_1\} = 3/4(GCA_1) - 1/4(GCA_2 + GCA_3 + GCA_4) + 1/4(SCA_{12} + SCA_{13} + SCA_{14}) - 1/4(SCA_{23} + SCA_{24} + SCA_{34})$.
The result of equation 3-4 is obviously not $GCA_1$ from the unrestricted model (equation
3-1). Thus, $g_1$, an estimable function and an estimate of parameter $GCA_{1s}$ (the estimate of the
GCA of parent 1 given the sum-to-zero restrictions), does not have the same meaning as $GCA_1$
in the unrestricted model. An estimable function is a linear combination of the observations; but
in order for an individual parameter in a model to be estimable, one must devise a linear
combination of the observations such that the expectation has a weight of one on the parameter
one wishes to estimate while having a weight of zero on all other parameters. A solution such
as this does not exist for the individual parameters in the overparameterized model (equation 3-1).
So, although the sum-to-zero restricted GCA parameters and estimates are forced to sum-to-zero
for the sample of parents in a given diallel, the unrestricted GCA parameters only sum-to-zero
across the entire population (Falconer 1981) and an evaluation of $GCA_{1s}$ demonstrates that the
estimate contains other model parameters.
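The expectation of equation 3-4 can be checked numerically by writing $g_1$ as a vector of weights on the plot means and multiplying by the overparameterized design matrix. The sketch below assumes Python with NumPy and two blocks; the variable names are illustrative only.

```python
import numpy as np
from itertools import combinations

p, b = 4, 2
crosses = list(combinations(range(1, p + 1), 2))

# Overparameterized design: mu, B1..Bb, GCA1..GCAp, SCA for each cross
X_o = np.array([
    [1]
    + [int(i == t) for t in range(b)]
    + [int(parent in c) for parent in range(1, p + 1)]
    + [int(c == d) for d in crosses]
    for i in range(b) for c in crosses
], dtype=float)

# g_1 = (1/(p(p-2))) (p Z_1. - 2 Z_..), with Z_jk the mean of cross jk over blocks
w_cross = np.array([(p if 1 in c else 0) - 2 for c in crosses]) / (p * (p - 2))
w = np.tile(w_cross / b, b)   # weight applied to each plot mean

coef = w @ X_o                # weight of each parameter in E{g_1}
print(np.round(coef, 2))
```

The resulting weights are 0 on the mean and blocks, 3/4 on $GCA_1$, -1/4 on the other GCA's, 1/4 on the SCA's involving parent 1, and -1/4 on the remaining SCA's, in agreement with the expectation given above.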
The result of sum-to-zero restrictions is that the degrees of freedom for a factor equals
the number of columns (parameters) for that factor in $X_s$ (Figure 3-2). Thus, a generalized
inverse for $X_s'X_s$ is not required since the number of columns in the sum-to-zero $X_s$ matrix for
each factor equals the degrees of freedom for that factor in the model ($X_s$ is full column rank and
provides a solution to equation 3-3).
Components of the Matrix Equation
The equational components of 3-2 are now considered in greater detail.
Data vector y
Observations (plot means) in the data vector are ordered in the manner demonstrated in
Figure 3-1. For our example Figure 3-1 is the matrix equation of a four parent half-diallel
mating design planted in two randomized complete blocks on a single site. There are six crosses
present in the two blocks for a total of 12 observations in the data vector, y. The observations
are first sorted by block. Second, within each block the observations should be in the same
sequence (for simplicity of presentation only). This sequence is obtained by assigning numbers
1 through p to each of the p parents and then sorting all crosses containing parent 1 (whether as
male or female) as the primary index in descending numerical order by the other parent of the
cross as the secondary index. Next all crosses containing parent 2 (primary index, as male or
female) in which the other parent in the cross (secondary index) has a number greater than 2 are
then also sorted in descending order by the secondary index. This procedure is followed through
using parent p-1 as the primary index.
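A sorting routine along these lines might look like the following Python sketch. It reproduces the sequence shown in Figure 3-1 (y112, y113, y114, y123, ...), and it treats a cross the same way whichever parent served as female. The record layout and the name `order_observations` are assumptions made for illustration.

```python
def order_observations(records):
    """Sort (block, parent_a, parent_b, value) records by block, then by the
    lower-numbered parent (primary index), then by the other parent."""
    def key(record):
        block, a, c, _ = record
        return (block, min(a, c), max(a, c))
    return sorted(records, key=key)

# Made-up records; the values are placeholders, not data from the progeny test.
records = [(2, 3, 4, 0.0), (1, 4, 1, 0.0), (1, 1, 2, 0.0), (2, 1, 2, 0.0), (1, 3, 2, 0.0)]
ordered = order_observations(records)
labels = [f"y{blk}{min(a, c)}{max(a, c)}" for blk, a, c, _ in ordered]
print(labels)  # ['y112', 'y114', 'y123', 'y212', 'y234']
```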
Design matrix and parameter vector, X and β
The design matrix for a model is conceptually a listing of the parameters present in the
model for each observation (Searle 1987, page 243). In Figure 3-1, y and $\beta_o$ are exhibited and
the parameters in $\beta_o$ are displayed at the tops of the columns of $X_o$ (a visually correct
interpretation of the multiplication of a matrix by a vector). For each observation in y, the scalar
model (equation 3-1) may be employed to obtain the listing of parameters for that observation
(the row of the design matrix corresponding to the particular observation). The convention for
design matrices is that the columns for the factors occur in the same order as the factors in the
linear model (equation 3-1 and Figure 3-1). Since design matrices can be devised by first
creating the columns pertinent to each factor in the model (submatrices) and then horizontally
and/or vertically stacking the submatrices, the discussion of the reparameterized design matrix
formulation will proceed by factor.
Mean
The first column of $X_s$ is for $\mu$ and is a vector of 1’s with the number of rows equalling
the number of observations (Figure 3-2). The linear model (equation 3-1) indicates that all
observations contain $\mu$ and the deviation of the observations from $\mu$ is explained in terms of the
factors and interactions in the model plus error.
Block
The number of columns for block is equal to the number of blocks minus one (column
2, $X_s$). Each row of a block submatrix consists of 1’s and 0’s or -1’s according to the identity
of the observation for which the row is being formed. The normal convention is that the first
column represents block 1 and the second column block 2, etc. through block b-1. Since we
have used a sum-to-zero solution ($\sum_{i=1}^{b} b_i = 0$), the effect due to block b is a linear combination of
the other b-1 effects, i.e., $b_b = -\sum_{i=1}^{b-1} b_i$ which in our example is $0 = b_1 + b_2$ and $b_2 = -b_1$.
Thus, the row of the block submatrix for an observation in block b (the last block) has a -1 in
each of the b-1 columns signifying that the block b effect is indeed a linear combination of the
other b-1 block effects. Columns 2 and 3 of $X_o$ (Figure 3-1) have become column 2 of $X_s$
(Figure 3-2).
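The block coding just described can be sketched in a few lines of Python with NumPy. The function name `block_submatrix` is hypothetical, and the sketch assumes observations sorted by block with the same number of plots in every block.

```python
import numpy as np

def block_submatrix(b, n_crosses):
    """Sum-to-zero block columns: blocks 1..b-1 get an indicator column each,
    and every row of block b carries -1 in all b-1 columns."""
    codes = np.vstack([np.eye(b - 1), -np.ones((1, b - 1))])  # one row per block
    return np.repeat(codes, n_crosses, axis=0)                # one row per plot

B = block_submatrix(b=2, n_crosses=6)
print(B.ravel())  # six 1's (block 1) followed by six -1's (block 2)
```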
General combining ability
This submatrix of $X_s$ is slightly more complex than previous factors as a result of having
two levels of a main effect present per observation, i.e., the deviation of an observation from $\mu$
is modeled as the result of the GCA’s of both the male and female parents (equation 3-1). Again
we have imposed a restriction, $\sum_{j=1}^{p} gca_j = 0$. Since GCA has p-1 degrees of freedom, the submatrix
for GCA should have p-1 columns, i.e., $gca_p = -\sum_{j=1}^{p-1} gca_j$. The GCA submatrix for $X_s$ (columns
3 through 5 in Figure 3-2) is formed from $X_o$ (columns 4 through 7 in Figure 3-1) in
the same manner as the block matrix: (1) add minus one to the elements in the other columns
along each row containing a one for $gca_p$ (p = 4 in our example); and (2) delete the column from
$X_o$ corresponding to $gca_p$. The GCA submatrix has p(p-1)/2 rows (the number of crosses). This,
with no missing cells (plots), equals the number of observations per block. To form the GCA
factor submatrix for a site, the GCA submatrix is vertically concatenated (stacked on itself) b
times. This completes the portion of the $X_s$ matrix for GCA.
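The two steps for forming the GCA submatrix can be sketched as follows. This is an illustrative Python/NumPy version with a hypothetical function name; it builds the sum-to-zero coding directly rather than editing the columns of the overparameterized matrix, which gives the same result.

```python
import numpy as np
from itertools import combinations

def gca_submatrix(p, b=1):
    """Sum-to-zero GCA submatrix: p(p-1)/2 rows per block and p-1 columns."""
    crosses = list(combinations(range(1, p + 1), 2))
    G = np.zeros((len(crosses), p - 1))
    for row, cross in enumerate(crosses):
        for parent in cross:
            if parent < p:
                G[row, parent - 1] += 1.0   # parents 1..p-1 have their own column
            else:
                G[row, :] -= 1.0            # parent p: gca_p = -(sum of the others)
    return np.tile(G, (b, 1))               # stack once per block

G = gca_submatrix(p=4, b=2)
print(G[:6])   # rows for y112, y113, y114, y123, y124, y134
```

For the four-parent example the first six rows match columns 3 through 5 of the sum-to-zero design matrix: (1, 1, 0), (1, 0, 1), (0, -1, -1), (0, 1, 1), (-1, 0, -1), and (-1, -1, 0).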
Specific combining ability
In order to facilitate construction of the SCA submatrix, a horizontal direct product
should be defined. A horizontal direct product, as applied to two column vectors, is the element
by element product between the two vectors (SAS/IML¹ User’s Guide 1985) such that the
element in the i-th row of the resulting product vector is the product of the elements in the i-th rows
of the two initial vectors. The resultant product vector has dimension n x 1. A horizontal direct
product is useful for the formation of interaction or nested factor submatrices where the initial
matrices represent the main factors and the resulting matrix represents an interaction or a nested
factor (product rule, Searle 1987).
¹SAS/IML is the registered trademark of the SAS Institute Inc., Cary, North Carolina.
The SCA submatrix can be formulated from the horizontal direct products of the columns
of the GCA submatrix in $X_s$ (Figure 3-2). The results from the GCA columns require
manipulation to become the SCA submatrix (since degrees of freedom for SCA do not equal those
of an interaction for a half-diallel analysis), but the GCA column products provide a convenient
starting point. The column of the SCA submatrix representing the cross between the j-th and the
k-th parents ($SCA_{jk}$) is formed as the product between the $GCA_j$ and $GCA_k$ columns (Figure 3-3).
The GCA columns in Figure 3-2 are multiplied in this order: column 1 times column 2 forming
the first SCA column, column 1 times column 3 forming the second SCA column, and column
2 times column 3 forming the third SCA column (Figure 3-3). With four parents (six crosses)
there are three degrees of freedom for GCA (p-1) and two degrees of freedom for SCA (6 crosses
- 3 for GCA - 1 for the mean). Since SCA has only two degrees of freedom, a sum-to-zero
design matrix can have only two columns for SCA. Imposing the restriction that the sum of the
SCA’s across all parents equals zero is equivalent to making the last column for the SCA
submatrix (Figure 3-3) a linear combination of the others (Figure 3-2). The procedure for
deleting the third column product is identical to that for the GCA submatrix: add minus one to
every element in the rows of the remaining SCA columns in which a one appears in the column
which is to be deleted (Figure 3-2, columns 6 and 7). The number of rows in the SCA submatrix
equals the number of observations in a block and must be vertically concatenated b times to create
the SCA submatrix for a site.
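The construction of the SCA submatrix from horizontal direct products can be sketched as below. The code is an illustration in Python with NumPy, where the element-by-element product of two columns plays the role of the horizontal direct product; `gca_block` and `sca_block` are hypothetical helper names.

```python
import numpy as np
from itertools import combinations

def gca_block(p):
    """Sum-to-zero GCA submatrix for one block."""
    crosses = list(combinations(range(1, p + 1), 2))
    G = np.zeros((len(crosses), p - 1))
    for row, cross in enumerate(crosses):
        for parent in cross:
            if parent < p:
                G[row, parent - 1] += 1.0
            else:
                G[row, :] -= 1.0
    return G

def sca_block(G):
    """Column products of the GCA submatrix, then removal of the last product
    by adding -1 to the rows in which that product equals one."""
    pairs = list(combinations(range(G.shape[1]), 2))
    products = np.column_stack([G[:, j] * G[:, k] for j, k in pairs])
    return products[:, :-1] - products[:, -1:]

S = np.tile(sca_block(gca_block(4)), (2, 1))   # two blocks
print(S[:6])
```

For four parents the rows within a block are (1, 0), (0, 1), (-1, -1), (-1, -1), (0, 1), and (1, 0) for crosses 1x2, 1x3, 1x4, 2x3, 2x4, and 3x4.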
An algebraic evaluation of SCA sum-to-zero restrictions requires that $\sum_j sca_{jk} = 0$ for
each k and that $\sum_j \sum_k sca_{jk} = 0$; thus, for observations in the i-th block with i serving to denote the
row of the SCA submatrix in block i, $sca_{i14} = -sca_{i12} - sca_{i13}$ and entries in the submatrix row for
$y_{i14}$ are -1’s. The estimate for $sca_{i23}$ equals $sca_{i14}$ because $sca_{i23}$ is the negative of the sum of the
independently estimated SCA’s ($sca_{i12}$ and $sca_{i13}$) from the restriction that the sum of the SCA’s
across all parents equals zero. Similarly, by sum-to-zero definition $sca_{i24} = -sca_{i23} - sca_{i12}$, and by
substitution $sca_{i24} = -(-sca_{i12} - sca_{i13}) - sca_{i12} = sca_{i13}$. By the same protocol, it can be shown that
$sca_{i34} = sca_{i12}$. The elements in the rows of the SCA submatrix are 1’s, -1’s and 0’s in
accordance with the algebraic evaluation. Thus, while it may seem that there should be 6 SCA
values (one for each cross), only 2 can be independently estimated and the remaining 4 are linear
combinations of the independently estimated SCA’s. Again the SCA sum-to-zero estimates are
not equal to the parametric population SCA’s. An analogous illustration for SCA to that for
GCA would show that the estimable function (linear combination of observations) for a given
$SCA_e$ contains a variety of other parameters.
Obs.    GCA1 x GCA2    GCA1 x GCA3    GCA2 x GCA3    SCA12   SCA13   SCA23
y_i12   (1)(1)=1       (1)(0)=0       (1)(0)=0         1       0       0
y_i13   (1)(0)=0       (1)(1)=1       (0)(1)=0         0       1       0
y_i14   (0)(-1)=0      (0)(-1)=0      (-1)(-1)=1       0       0       1
y_i23   (0)(1)=0       (0)(1)=0       (1)(1)=1         0       0       1
y_i24   (-1)(0)=0      (-1)(-1)=1     (0)(-1)=0        0       1       0
y_i34   (-1)(-1)=1     (-1)(0)=0      (-1)(0)=0        1       0       0

Figure 3-3. Intermediate result in SCA submatrix generation (SCA columns as horizontal direct
products of GCA1, GCA2, and GCA3 columns within a block). The $SCA_{jk}$ column is the
horizontal direct product of the columns for $GCA_j$ and $GCA_k$.
Estimation of Fixed Effects
GCA parameters
The GCA parameters can be estimated (without mean, block, and SCA in the design
matrix) through the use of equation 3-3, if there are no missing cell means (plots) for any cross
and no missing crosses. The design matrix consists only of the GCA submatrix. This design
matrix has p-1 columns for the GCA’s (the third through the fifth columns of $X_s$). The b vector
is an estimate of the GCA portion of $\beta_s$ in Figure 3-2 and the linear combination for the
estimation of $gca_p$ is $gca_p = -\sum_{j=1}^{p-1} gca_j$. Parameters for any of the factors can be estimated
independently using the pertinent submatrix as long as there are no missing cell means (plots) and
no missing crosses; this uses a property known as orthogonality.
Orthogonality requires that the dot product between two vectors equals zero (Schneider
1987, page 168). The dot product (a scalar) is the sum of the values in a vector obtained from
the horizontal direct product of two vectors. For two factors to be orthogonal, the dot products
of all the column vectors making up the section of the design matrix for one factor with the
column vectors making up the portion of the design matrix for the second must be zero. If all
factors in the model are orthogonal, then the $X_s'X_s$ matrix is block diagonal. A block-diagonal
$X_s'X_s$ matrix is composed of square factor submatrices (degrees of freedom x degrees of freedom)
along the diagonal with all off-diagonal elements not in one of the square factor submatrices
equalling zero. A property of block-diagonal matrices is that the inverse can be calculated by
inverting each block separately and replacing the original block in the full X’X matrix by the
inverted block. Because the blocks can be inverted separately and all other off-diagonal elements
of the inverse are zero, the effects for factors which are orthogonal to all other factors may be
estimated separately, i.e., there are no functions of other sum-to-zero factors in the sum-to-zero
estimates.
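Orthogonality in the balanced case can be verified numerically. The sketch below, in Python with NumPy, builds the sum-to-zero submatrices for the four-parent, two-block example, checks that the cross-products between different factors are all zero, and confirms that GCA estimates are the same whether GCA is fitted alone or with every factor. The data vector is random and purely illustrative, and `sum_to_zero_design` is a hypothetical helper.

```python
import numpy as np
from itertools import combinations

def sum_to_zero_design(p, b):
    """Return the mean, block, GCA and SCA submatrices for a balanced half-diallel."""
    crosses = list(combinations(range(1, p + 1), 2))
    G = np.zeros((len(crosses), p - 1))
    for row, cross in enumerate(crosses):
        for parent in cross:
            if parent < p:
                G[row, parent - 1] += 1.0
            else:
                G[row, :] -= 1.0
    pairs = list(combinations(range(p - 1), 2))
    prod = np.column_stack([G[:, j] * G[:, k] for j, k in pairs])
    S = prod[:, :-1] - prod[:, -1:]
    codes = np.vstack([np.eye(b - 1), -np.ones((1, b - 1))])
    return {"mean": np.ones((b * len(crosses), 1)),
            "block": np.repeat(codes, len(crosses), axis=0),
            "gca": np.tile(G, (b, 1)),
            "sca": np.tile(S, (b, 1))}

parts = sum_to_zero_design(p=4, b=2)
X_s = np.hstack(list(parts.values()))
XtX = X_s.T @ X_s

# Largest cross-product between columns belonging to different factors
cross_terms = [np.abs(parts[f].T @ parts[g]).max() for f, g in combinations(parts, 2)]
print(max(cross_terms))   # 0.0 for balanced data, so X'X is block diagonal

rng = np.random.default_rng(1)
y = rng.normal(size=X_s.shape[0])            # arbitrary illustrative data
b_full = np.linalg.solve(XtX, X_s.T @ y)     # all factors together
G = parts["gca"]
b_gca = np.linalg.solve(G.T @ G, G.T @ y)    # GCA submatrix alone
print(np.allclose(b_full[2:5], b_gca))       # True
```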
Mean, block, GCA, and SCA parameters
All parameters are estimated simultaneously by horizontally concatenating the mean,
block, GCA, and SCA matrices to create $X_s$. Equation 3-3 is again utilized to solve the system
of equations. The b vector for the four-parent example is an estimate of $\beta_s$ of Figure 3-2.
Again, one parameter is estimated for each column in the $X_s$ matrix and all parameter estimates
not present are linear combinations of the parameter estimates in the b vector. So $b_b$ is equal to
$-\sum_{i=1}^{b-1} b_i$ and $gca_p$ is equal to $-\sum_{j=1}^{p-1} gca_j$. The linear combinations for SCA effects can be obtained
by reading along the row of the SCA submatrix associated with the observation containing the
parameter, i.e., in Figure 3-2 the observation $y_{i14}$ contains the effect $sca_{i14}$ which is estimated as
the linear combination $-sca_{i12} - sca_{i13}$.
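The full estimation, including recovery of the effects that do not appear in the b vector, can be sketched as follows in Python with NumPy. The plot means here are made up and error free, generated from known sum-to-zero effects so that the recovered values can be checked by eye; none of the numbers come from the slash pine data.

```python
import numpy as np
from itertools import combinations

p, b = 4, 2
crosses = list(combinations(range(1, p + 1), 2))

# Sum-to-zero GCA and SCA submatrices for one block
G = np.zeros((len(crosses), p - 1))
for row, cross in enumerate(crosses):
    for parent in cross:
        if parent < p:
            G[row, parent - 1] += 1.0
        else:
            G[row, :] -= 1.0
prod = np.column_stack([G[:, j] * G[:, k] for j, k in combinations(range(p - 1), 2)])
S = prod[:, :-1] - prod[:, -1:]

codes = np.vstack([np.eye(b - 1), -np.ones((1, b - 1))])
X_s = np.hstack([np.ones((b * len(crosses), 1)),
                 np.repeat(codes, len(crosses), axis=0),
                 np.tile(G, (b, 1)),
                 np.tile(S, (b, 1))])

# Hypothetical effects: mu, b1, gca1..gca3, sca12, sca13
beta_s = np.array([10.0, 1.0, 1.0, 2.0, -0.5, 0.3, -0.1])
y = X_s @ beta_s

b_hat = np.linalg.solve(X_s.T @ X_s, X_s.T @ y)       # equation 3-3
blk, gca, sca = b_hat[1:b], b_hat[b:b + p - 1], b_hat[b + p - 1:]

blocks_all = np.append(blk, -blk.sum())                # b_b = -(sum of the others)
gca_all = np.append(gca, -gca.sum())                   # gca_p = -(sum of the others)
sca_all = dict(zip(crosses, S @ sca))                  # read along the SCA rows
print(blocks_all, gca_all)
print(sca_all)
```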
This completes the estimation of fixed effect parameters from a data set which is balanced
on a plot-mean basis. Since field data sets with such completeness are a rarity in forestry
applications, the next step is OLS analysis for various types of data imbalance. Calculations of
solutions based on a complete data set and simulated data sets with common types of imbalance
are demonstrated in numerical examples.
Numerical Examples
The data set analyzed in the numerical examples is from a five-year-old, six-parent half-
diallel slash pine (Pinus elliottii var. elliottii Engelm.) progeny test planted on a single site in
four complete blocks. Each cross is represented by a five-tree row plot within each block. Total
height in meters and diameter at breast height (dbh in centimeters) are the traits selected for
analysis. The data set is presented in Table 3-1 so that the reader may reconstruct the analysis
and compare answers with the examples. The numbers 1 through 6 were arbitrarily assigned to
the parents for analysis. Because of unequal survival within plots, plot means are used as the unit
of observation.
Balanced Data (Plot-mean Basis)
The sum-to-zero design matrix for the balanced data set has (4 blocks) x (15 crosses) = 60
rows (which equals the number of observations in y) and has the following columns: one column
for $\mu$, three columns for blocks (b-1), five columns for GCA (p-1), and nine columns for SCA
(15 crosses - 5 - 1) for a total of 18 columns. With sixty plot means (degrees of freedom) and
18 degrees of freedom in the model, subtracting 18 from 60 yields 42 degrees of freedom for
error which matches the degrees of freedom for cross by block interaction, thus verifying that
degrees of freedom concur with the number of columns in the sum-to-zero design matrix.
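The column and degrees-of-freedom bookkeeping can be written as a small function. The Python sketch below is illustrative; `design_degrees_of_freedom` is a hypothetical name, and the arithmetic assumes that every parent is still represented, that parent p has all of its crosses, and that the design matrix remains of full column rank.

```python
def design_degrees_of_freedom(p, b, missing_crosses=0, missing_plots=0):
    """Count sum-to-zero design columns by factor and the error degrees of freedom."""
    n_crosses = p * (p - 1) // 2 - missing_crosses
    columns = {
        "mean": 1,
        "block": b - 1,
        "gca": p - 1,
        "sca": n_crosses - (p - 1) - 1,
    }
    n_obs = b * n_crosses - missing_plots
    model_df = sum(columns.values())
    return columns, n_obs, model_df, n_obs - model_df

columns, n_obs, model_df, error_df = design_degrees_of_freedom(p=6, b=4)
print(columns)                    # {'mean': 1, 'block': 3, 'gca': 5, 'sca': 9}
print(n_obs, model_df, error_df)  # 60 18 42
```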
To illustrate the principle of orthogonality in the balanced case, the X’X and $(X'X)^{-1}$ matrices
may be printed to show that they are block diagonal. In further illustration, the effects within
a factor may also be estimated without any other factors in the design matrix and compared to
the estimates from the full design matrix.
The vectors of parameter estimates for height and dbh (Table 3-2) were calculated from the
same $X_s$ matrix because height and dbh measurements were taken on the same trees. In other
words, if a height measurement was taken on a tree, a dbh measurement was also taken, so the
design matrices are equivalent.
Missing Plot
To illustrate the problem of a missing plot, the cross, parent two by parent three, was
arbitrarily deleted in block one (as if observation y123 were missing). This deletion prompts
adjustments to the factor matrices in order to analyze the new data set. The new vector of
observations (y) now has 59 rows. This necessitates deletion of the row of the design matrix ($X_s$)
in block 1 which would have been associated with cross 2 x 3. This is the only matrix alteration
required for the analysis. Thus, the resultant $X_s$ matrix has 60 - 1 = 59 rows and 18 columns.
With 59 means in y and 18 columns in $X_s$, the degrees of freedom for error is 41.
Comparisons between results of the analyses (Table 3-2) of the full data set and the data
set missing observation y123 reveal that for this case the estimates of parameters have been
relatively unaffected by the imbalance (magnitudes of GCA’s changed only slightly and rankings
by GCA were unaffected).
Table 3-1. Data set for numerical examples. Five-year-old slash pine progeny test with a 6-
parent half-diallel mating design present on a single site with four randomized complete blocks
and a five-tree row plot per cross per block.
                       Mean         Mean       Within-plot variance       Trees
Block  Female  Male    height (m)   DBH (cm)   Height (m²)   DBH (cm²)    per plot
  1      1      2      2.6899       3.810      0.9800        3.484        4
  1      1      3      1.9080       2.134      1.4277        3.893        5
  1      1      5      3.1242       4.445      0.4487        1.656        4
  1      1      6      2.4933       3.200      0.8488        5.664        5
  1      2      5      1.4783       1.588      0.6556        2.167        4
  1      2      6      2.7026       3.471      0.1136        0.344        3
  1      3      2      3.0480       4.699      0.2341        0.968        4
  1      3      5      3.4991       5.131      0.0945        0.271        5
  1      3      6      2.4003       2.794      0.5149        1.548        4
  1      4      1      3.3955       4.928      0.1489        0.761        5
  1      4      2      3.4290       5.144      0.7943        3.285        4
  1      4      3      2.5298       2.984      0.9557        4.188        4
  1      4      5      2.4155       3.175      0.5936        2.946        4
  1      4      6      3.2004       4.521      1.7034        7.594        5
  1      5      6      2.2403       2.794      1.0433        6.280        4
  2      1      2      3.5662       5.080      0.9560        2.903        5
  2      1      3      2.6335       3.353      0.7695        3.497        5
  2      1      5      3.6942       5.893      0.0573        0.432        5
  2      1      6      3.4808       4.928      0.9222        2.890        5
  2      2      5      3.4260       4.877      0.7017        2.432        5
  2      2      6      2.4282       3.302      0.0616        0.452        3
  2      3      2      3.0480       4.064      0.0192        0.301        4
  2      3      5      2.8895       4.013      0.1957        0.690        5
  2      3      6      1.9406       1.863      0.0560        0.408        3
  2      4      1      3.0114       3.962      1.9753        6.342        5
  2      4      2      3.6454       5.283      0.1731        0.787        5
  2      4      3      2.9566       3.861      0.0506        0.174        5
  2      4      5      2.8118       4.382      1.1336        5.435        4
  2      4      6      3.2674       4.318      1.1211        4.354        5
  2      5      6      3.7917       5.893      0.0848        0.497        5
  3      1      2      2.2961       2.625      0.3914        1.699        3
  3      1      3      2.8956       4.128      1.2926        4.532        4
  3      1      5      2.5359       3.607      0.8284        4.303        5
  3      1      6      2.9032       3.937      0.8252        4.064        4
  3      2      5      2.7737       4.064      0.9829        3.226        2
  3      2      6      1.2040       0.635      0.4464        0.806        2
  3      3      2      2.9870       4.191      0.9049        2.989        4
  3      3      5      2.8407       3.962      0.7309        3.632        5
  3      3      6      1.3564       0.000      0.1677        0.000        2
  3      4      1      2.6746       3.620      0.8463        2.984        4
  3      4      2      2.7066       3.353      0.5590        1.787        5
  3      4      3      3.4198       4.623      0.3509        0.690        5
  3      4      5      3.3299       4.953      0.4102        1.226        4
  3      4      6      3.4564       4.978      0.8369        3.503        5
  3      5      6      3.2614       4.826                                 1
  4      1      2      1.8974       2.476      1.0160        3.629        4
  4      1      3      1.3005       0.508      0.2019        0.774        3
  4      1      5      2.0726       2.540      1.2235        5.097        3
  4      1      6      1.8821       1.778      0.4728        3.312        4
  4      2      5      1. 64        1.334      0.5354        2.382        4
  4      2      6      1.5392       0.635      0.0376        0.806        2
  4      3      2      1.8898       2.032      0.7364        1.892        4
  4      3      5      2.5146       3.620      0.0876        0.446        4
  4      3      6      1.8389       2.201      0.0941        0.280        3
  4      4      1      2.3348       2.591      0.3816        2.722        5
  4      4      2      1.7272       1.693      2.1640        8.602        3
  4      4      3      1.6581       1.524      0.0537        0.903        5
  4      4      5      2.1184       2.286      0.3137        2.366        4
  4      4      6      1.5545       1.422      0.4803        1.019        5
  4      5      6      1.4122       1.693      0.0338        0.150        3
Table 3-2. Numerical results for examples of data imbalance using the OLS techniques presented
in the text.
             Balanced^a         Missing Plot^b     Missing Cross^c    Five Missing Crosses^d
Estimate     Height    DBH      Height    DBH      Height    DBH      Height    DBH
μ            2.5830    3.362    2.5787    3.346    2.5386    3.260    2.4980    3.149
b1           0.1203    0.292    0.1074    0.245    0.1074    0.245    0.1393    0.309
b2           0.5230    0.976    0.5274    0.992    0.5386    1.023    0.6041    1.140
b3           0.1264    0.205    0.1308    0.220    0.1180    0.187    0.0689    0.087
gca1         0.0706    0.144    0.0760    0.163    0.1260    0.270    0.1361    0.232
gca2         -.1077    -.180    -.1186    -.220    -.2186    -.434    -.2371    -.493
gca3         -.1316    -.347    -.1426    -.386    -.2426    -.601    -.3972    -.952
gca4         0.2489    0.398    0.2544    0.417    0.3044    0.524    0.4241    0.804
gca5         0.1265    0.489    0.1320    0.509    0.1820    0.616    0.1746    0.646
sca12        0.0665    0.172    0.0763    0.208    0.1663    0.400
sca13        -.3374    -.628    -.3277    -.592    -.2377    -.400
sca14        -.0484    -.128    -.0550    -.152    -.1150    -.280    -.2041    -.410
sca15        0.0766    0.126    0.0700    0.102    0.0100    -.026    0.0480    0.094
sca23        0.3995    0.912    0.3600    0.771
sca24        0.1528    0.289    0.1627    0.324    0.2527    0.517    0.1920    0.408
sca25        -.3185    -.706    -.3084    -.670    -.2187    -.478
sca34        -.0592    0.164    -.0493    0.129    0.0406    0.064    0.1163    0.246
sca35        0.3580    0.677    0.3679    0.712    0.4793    0.905
^a where (numerical examples are for height)
$b_4 = -\sum_{i=1}^{3} b_i = -.7697$;
$gca_6 = -\sum_{j=1}^{5} gca_j = -.2067$;
$sca_{p6} = -\sum sca_{jk}$ for j or k = p and p = 1, 2, 3, then $sca_{16} = .2428$,
$sca_{26} = -.3002$, and $sca_{36} = -.3608$; $sca_{45} = -\sum_{e} sca_e = -.2898$,
e = independently estimated SCA’s 1, ..., 9;
$sca_{46} = sca_{12} + sca_{13} + sca_{15} + sca_{23} + sca_{25} + sca_{35} = .2446$;
and $sca_{56} = sca_{12} + sca_{13} + sca_{14} + sca_{23} + sca_{24} + sca_{34} = .1737$.
^b where the linear combinations for parameter estimates are identical
to the balanced example.
^c where $sca_{p6} = -\sum sca_{jk}$ for j or k = p and p = 1 to 3; $sca_{45} = -\sum_{e} sca_e$,
e = independently estimated SCA’s 1, ..., 8;
$sca_{46} = sca_{12} + sca_{13} + sca_{15} + sca_{25} + sca_{35}$; and
$sca_{56} = sca_{12} + sca_{13} + sca_{14} + sca_{24} + sca_{34}$.
dwhere sca16 = -sca14 -scaI5, sca^ = -SC324, sca^ = sca^,
sca^ = sca15, sca^ = sca14 + SC324 + sca^, and
$sca_{25}$ = the negative of the sum of the four independently
estimated SCA’s.
“where for all cases linear combinations for block and gca are the same as in the balanced case.
Missing Cross
Another common form of imbalance in diallel data sets, the missing cross, is examined
through arbitrary deletion of the 2 x 3 cross from all blocks, i.e., $y_{123}$, $y_{223}$, $y_{323}$, $y_{423}$ are missing
in the data vector. This type of imbalance is representative of a particular cross that could not
be made and is therefore missing from all blocks. The matrix manipulations required for this
analysis are again presented by factor. For appropriate SCA restrictions, the data vector and
design matrix should be ordered so that the p-th parent has no missing crosses. Since the labeling
of a parent as parent p is entirely subjective, any parent with all crosses may be designated as
parent p. The previous labelling directions are necessary since we generate the SCA submatrix
as horizontal direct products of the columns of the GCA submatrix; and to account for missing
crosses, the horizontal direct product for each missing parental combination is not
calculated, which sets the missing SCA’s to zero. If there is a cross missing from those of the
p-th parent, we cannot account for the missing cross with this technique (Searle 1987, page 479).
For the mean, block, and GCA submatrices, the adjustment for the missing cross dictates
deleting the rows in the submatrices which would have corresponded to the $y_{i23}$ observations. The
SCA submatrix must be reformed since a degree of freedom for SCA and hence a column of the
submatrix has been lost. The SCA submatrix is reinstituted from the GCA horizontal direct
products (remembering that one cross, 2 x 3, no longer exists and therefore that product $GCA_2$ x
$GCA_3$ is inappropriate). Dropping the column for $SCA_{23}$ is equivalent to setting $SCA_{23}$ to zero
(Searle 1987) so that the remaining SCA’s will sum-to-zero. After that, the reformation is
according to the established pattern. With one missing cross there are now 56 observations and
hence 56 degrees of freedom available. The columns of the $X_s$ matrix are now: one for the
mean, three for block, five for GCA, and eight for SCA for a total of 17 columns. The
remaining degrees of freedom for error is 39, matching the correct degrees of freedom
((14-1) x (4-1) = 39).
For the missing cross example, the estimate of μ is no longer equivalent to the mean of the plot means
since μ = 2.5386 and (Σijk yijk)/N = 2.5715 where N = 56 (the number of plot means). This is the
result of GCA effects which are no longer orthogonal to the mean. This can be verified by checking the Xs’Xs matrix or
by estimating factors separately and comparing the results to the estimates obtained when all factors are included in Xs.
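
The loss of orthogonality can be checked numerically. The following is a minimal sketch, not the analysis code used here: it assumes a cross-mean basis (blocks omitted), codes the sum-to-zero GCA columns so that the GCA of parent p is the negative sum of the others, and uses the hypothetical helper name `gca_sum_to_zero_columns`. The inner product of the mean column (all ones) with each GCA column is simply the column sum.

```python
import itertools
import numpy as np

def gca_sum_to_zero_columns(crosses, p):
    """Sum-to-zero GCA columns for parents 1..p-1 (hypothetical helper).

    The GCA of parent p is coded as minus the sum of the other p-1 GCA's.
    """
    X = np.zeros((len(crosses), p - 1))
    for row, (j, k) in enumerate(crosses):
        for parent in (j, k):
            if parent < p:
                X[row, parent - 1] += 1.0
            else:
                X[row, :] -= 1.0  # gca_p = -(gca_1 + ... + gca_{p-1})
    return X

p = 6
all_crosses = list(itertools.combinations(range(1, p + 1), 2))
one_missing = [c for c in all_crosses if c != (2, 3)]  # the 2 x 3 cross deleted

# Inner product of the mean column (a vector of ones) with each GCA column
balanced = gca_sum_to_zero_columns(all_crosses, p).sum(axis=0)
unbalanced = gca_sum_to_zero_columns(one_missing, p).sum(axis=0)

print(balanced)    # all zeros: GCA columns are orthogonal to the mean
print(unbalanced)  # nonzero for parents 2 and 3: orthogonality is lost
```

Under this coding, only the columns for the two parents of the deleted cross become non-orthogonal to the mean; repeating the rows over complete blocks scales the inner products without changing which are zero.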
If formulae for balanced data (Becker 1975, Falconer 1981, and Hallauer and Miranda
1981) are applied to unbalanced data (plot-mean basis), estimates of parameters are no longer
appropriate because factors in the model are no longer independent (orthogonal). Applying
Becker’s formula, which uses totals of cross means for a site (y.jk), to the missing cross example
yields: gca1 = .2992, gca2 = -.5649, gca3 = -.5888, gca4 = .4665, gca5 = .3552, and gca6 =
.0219. These answers are very different in magnitude from those in Table 3-2 for this example
and gca,, also has a different sign. Employing these formulae in the analysis of unbalanced data
is analogous to matrix estimation of GCA’s without the other factors in the model which is
inappropriate.
Several Missing Crosses
The concluding example (Table 3-2) is a drastically unbalanced data set resulting from
the arbitrary deletion of five crosses (1 x 2, 1 x 3, 2 x 3, 3 x 5, and 4 x 5). The matrix
manipulation for this example is an extension of the previous one-cross deletion example. Rows
corresponding to yi12, yi13, yi23, yi35, and yi45 are deleted from the mean, block, and GCA
submatrices for all blocks. The SCA matrix (now 4 columns, since 10 crosses - 5 - 1 = 4 degrees
of freedom) is again re-formed with only the relevant products of the GCA columns. Counting
degrees of freedom (columns of the sum-to-zero design matrix), the mean has one, block has
three, GCA has five, and SCA has four degrees of freedom for a total of 13. Error has
(4-1)(10-1) = 27 degrees of freedom. Totaling degrees of freedom for modeled effects and error yields
40, which equals the number of plot means.
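
The degrees-of-freedom counts for the missing-cross examples can be cross-checked without constructing the sum-to-zero columns at all. As a hedged sketch (the function name `count_df` and the dummy-variable coding are illustrative assumptions, not the procedure described above), the rank of the overparameterized design matrix, with one dummy column per mean, block, parent, and cross, equals the number of columns of the full-rank sum-to-zero matrix:

```python
import itertools
import numpy as np

def count_df(crosses, n_blocks=4, n_parents=6):
    """Return (plot means, model df, error df) for a half-diallel in complete blocks.

    Hypothetical helper: model df is taken as the rank of an overparameterized
    dummy-variable design matrix (mean, block, GCA, and SCA columns).
    """
    rows = []
    for b in range(n_blocks):
        for cross in crosses:
            mean = [1.0]
            block = [float(b == i) for i in range(n_blocks)]
            gca = [float(parent in cross) for parent in range(1, n_parents + 1)]
            sca = [float(cross == c) for c in crosses]
            rows.append(mean + block + gca + sca)
    X = np.array(rows)
    model_df = int(np.linalg.matrix_rank(X))
    return X.shape[0], model_df, X.shape[0] - model_df

all_crosses = list(itertools.combinations(range(1, 7), 2))
one_missing = [c for c in all_crosses if c != (2, 3)]
deleted = {(1, 2), (1, 3), (2, 3), (3, 5), (4, 5)}
five_missing = [c for c in all_crosses if c not in deleted]

print(count_df(one_missing))   # (56, 17, 39)
print(count_df(five_missing))  # (40, 13, 27)
```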
As the data become increasingly unbalanced (Table 3-2), the spread among the GCA estimates tends
to increase (loss of information). This is a general feature of OLS
analyses: the spread among the GCA estimates is due both to
the innate spread of additive genetic effects and to the error in estimation of the GCA’s.
When there is less information, GCA estimates tend to be more widely spread due to the increase
in the error variance associated with their estimation. This feature has been noted (White and
Hodge 1989, page 54) as the tendency to pick as parental winners those individuals in a breeding
program which are the most poorly tested.
Discussion
With the OLS analysis and its inherent assumptions described,
four important factors must be considered in the interpretation of sum-to-zero OLS
solutions: (1) the lack of uniqueness of the parameter estimates; (2) the weights given to plot
means (yijk) and in turn site means (y.jk) for crosses in parameter estimation when data sets have
missing crosses; (3) the arbitrary nature of using a diallel mean (perforce a narrow genetic base) as
the mean about which the GCA’s sum-to-zero; and (4) the assumption that the covariance matrix
for the observations (V) is Iσ²e.
Uniqueness of Estimates
Sum-to-zero restrictions furnish what would appear to be unique estimates of the
individual parameters, e.g., GCA1, when, in fact, these individual parameters are not estimable
(Graybill 1976, Freund and Littell 1981, and Milliken and Johnson 1984). The lack of
estimability is again analogous to attempting to solve a set of equations in n unknowns with t
equations where n is greater than t. Therefore, an infinite number of solutions exist for β.
There are quantities in this system of equations that are unique (estimable), i.e., the
estimate is invariant regardless of the restriction (sum-to-zero or set-to-zero) or generalized
inverse (no restrictions) used (Milliken and Johnson 1984). The estimable functions include the
sum-to-zero GCA and SCA estimates since they are linear combinations of the observations; however,
these estimable quantities do not estimate the individual parametric GCA’s and SCA’s of the
overparameterized model (equation 3-4) since there is no unique solution for those parameters.
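
A small numerical illustration of this point, offered as a sketch under stated assumptions: the data are hypothetical, error-free cross means from a six-parent half-diallel with a GCA-only model (no block or SCA terms), and the two solutions come from a Moore-Penrose generalized inverse and from a set-to-zero restriction on the last parent. The individual solutions differ, while the fitted cross means and a contrast between two GCA's do not.

```python
import itertools
import numpy as np

p = 6
crosses = list(itertools.combinations(range(p), 2))  # parents indexed 0..5

# Overparameterized design: a mean column plus one dummy column per parent
X = np.zeros((len(crosses), 1 + p))
X[:, 0] = 1.0
for row, (j, k) in enumerate(crosses):
    X[row, 1 + j] = 1.0
    X[row, 1 + k] = 1.0

# Hypothetical, error-free cross means used only for illustration
parent_effect = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([10.0 + parent_effect[j] + parent_effect[k] for j, k in crosses])

# Solution 1: generalized inverse (no restriction)
b_pinv = np.linalg.pinv(X) @ y

# Solution 2: set-to-zero restriction on the last parent
b_set0 = np.zeros(1 + p)
b_set0[:-1] = np.linalg.lstsq(X[:, :-1], y, rcond=None)[0]

# An estimable function: the difference between the first two GCA's
contrast = np.zeros(1 + p)
contrast[1], contrast[2] = 1.0, -1.0

print(b_pinv)
print(b_set0)
print(contrast @ b_pinv, contrast @ b_set0)  # identical under both solutions
```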
Weighting of Plot Means and Cross Means in Estimating Parameters
With at least one measurement tree in each plot and with plot means as the unit of
observation, use of the matrix approach produces the same results as the basic formulae. The
weight placed on each plot mean in the estimation of a parameter can be determined by
calculating (Xs’Xs)^-1 Xs’, which can be viewed as a matrix of weights W so that equation 3-3 can
be written as b = Wy. The matrix W has these dimensions: the number of rows equals the
number of parameters in βs and the number of columns equals the number of plot means in y.
The i-th row of W contains the weights applied to y to estimate the i-th parameter in b (bi). In
the discussion which follows gca1 is utilized as bi.
If there are no missing plots, the cross mean in every block (yijk) has the same weighting
and weights can be combined across blocks to yield the weight on the overall cross mean (y.jk).
It can be shown that for the balanced numerical example gca1 is calculated by weighting the
overall cross means containing parent 1 by 1/6 and weighting all overall cross means not

      GCA1                GCA2                GCA3                GCA4                GCA5                GCA6                Marginal
GCA1  --                  1/6 (.16667)        1/6 (.16667)        1/6 (.16667)        1/6 (.16667)        1/6 (.16667)        5/6
GCA2  .14583 / missing    --                  -1/12 (-.08333)     -1/12 (-.08333)     -1/12 (-.08333)     -1/12 (-.08333)     -1/6
GCA3  .14583 / missing    missing / missing   --                  -1/12 (-.08333)     -1/12 (-.08333)     -1/12 (-.08333)     -1/6
GCA4  .18056 / .22549     -.10417 / .01961    -.10417 / -.11765   --                  -1/12 (-.08333)     -1/12 (-.08333)     -1/6
GCA5  .18056 / .31372     -.10417 / -.27451   -.10417 / missing   -.06944 / missing   --                  -1/12 (-.08333)     -1/6
GCA6  .18056 / .29412     -.10417 / .08824    -.10417 / -.04902   -.06944 / -.29412   -.06944 / -.20588   --                  -1/6
Figure 3-4. Weights on overall cross means (y.jk) for the three numerical examples for
estimation of GCA1. The weights for the balanced example (above the diagonal) are presented
in both fractional and decimal form. The weights for the one-cross missing and the five-crosses
missing are presented as the upper number and lower number, respectively, in cells below the
diagonal. The marginal weights on GCA parameters (right margin) do not change although cells
are missing.

containing parent 1 by -1/12. Figure 3-4 (above the diagonal) demonstrates the weightings on
the overall cross means for the balanced numerical example as well as the marginal weighting on
the GCA parameters. These marginal weightings are obtained by summing along a row and/or
column as one would to obtain the marginal totals for a parent (Becker 1975). One feature of
sum-to-zero solutions is that these marginal weightings will be maintained no matter the
imbalance due to missing crosses, as will be seen by considering the numerical examples for a
missing cross (Figure 3-4 below the diagonal, upper number) and five missing crosses (Figure
3-4 below the diagonal, lower number). The marginal weights have remained the same as in the
balanced case while the weights on the cross means differ among the crosses containing parent
1 and also among the crosses not containing parent 1. In the five missing crosses example,
crosses y.24 and y.26 even receive a positive weighting where in the prior examples they had
negative weighting.
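
The balanced-case weights can be reproduced with a few lines of matrix code. This is a sketch under simplifying assumptions rather than the full analysis: it works on the 15 overall cross means, and it fits only the mean and sum-to-zero GCA columns, relying on the fact that with balanced data the GCA estimates are unaffected by whether SCA is also in the model. The missing-cross weights depend on the SCA coding and are not reproduced here.

```python
import itertools
import numpy as np

p = 6
crosses = list(itertools.combinations(range(1, p + 1), 2))

# Column 0 is the mean; columns 1..p-1 are sum-to-zero GCA columns for
# parents 1..p-1 (the GCA of parent p is minus the sum of the others).
Xs = np.zeros((len(crosses), p))
Xs[:, 0] = 1.0
for row, (j, k) in enumerate(crosses):
    for parent in (j, k):
        if parent < p:
            Xs[row, parent] += 1.0
        else:
            Xs[row, 1:] -= 1.0

W = np.linalg.inv(Xs.T @ Xs) @ Xs.T   # matrix of weights, b = W y
gca1_weights = W[1]                   # row of W that estimates gca1

# Marginal weight for each parent: sum over the crosses containing that parent
marginal = [sum(w for w, c in zip(gca1_weights, crosses) if parent in c)
            for parent in range(1, p + 1)]

for cross, weight in zip(crosses, gca1_weights):
    print(cross, round(weight, 5))
print(np.round(marginal, 5))
```

Crosses containing parent 1 receive a weight of 1/6 and all others -1/12, and the marginal weights are 5/6 for parent 1 and -1/6 for each other parent.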
The expected value in all three examples is GCA1s (for sum-to-zero) despite the
apparently nonsensical weightings to cross means with missing crosses; however, the evaluation
of the estimates in terms of the original model changes with each new combination of missing
cells, i.e., y.24 and y.26 have a positive weight in the five missing crosses example in GCA1
estimation. Whether this type of estimation is desirable with missing cell (cross) means has been
the subject of some discussion (Speed, Hocking and Hackney 1978, Freund 1980, and Milliken
and Johnson 1984). The data analyst should be aware of the manner in which sum-to-zero treats
the data with missing cell means and decide whether that particular linear combination of cross
means estimating the parameter is one of interest, realizing that the meaning of the estimates in
terms of the original model is changing.

Diallel Mean
The use of the mean for a half-diallel as the mean around which GCA’s sum-to-zero is
not satisfactory in that the diallel mean is the mean of a rather narrow genetically based
population, and in particular that the comparisons of interest are not usually confined to the
specific parents in a specific diallel on a particular site. A checklot can be employed to represent
a base population against which comparison of half- or full-sib families can be made to provide
for comparison of GCA estimates from other tests (van Buijtenen and Bridgwater 1986).
Mathematically, when effects are forced to sum-to-zero around their own mean, the
absolute value of the GCA’s is reflective of their value relative to the mean of the group. Even
if the parents involved in the particular diallel were all far superior to the population mean for
GCA, GCA’s calculated on an OLS basis would show that some of these GCA’s were negative.
If the GCA’s of the diallel parents were in fact all below the population mean, the opposite and
equally undesirable result ensues. For disconnected diallels together on a single site, an OLS
analysis would yield GCA estimates that sum-to-zero within each diallel since parents are nested
within diallels. Unless the comparisons of interest are only in the combination of the parents in
a specific diallel on a specific site, the checklot alternative is desirable.
A method for obtaining the desired goal of comparable GCA’s from disconnected
experiments, disregarding the problem of heteroscedasticity, is to form a function from the data
which yields GCA estimates properly located on the number scale. Such a function can be
formed (using GCA1 as an example) from gca1s, the diallel mean, and the checklot mean.
From expectations of the scalar linear model (equation 3-1),

$$\mathrm{GCA}_{1s} = \frac{p-1}{p}\,\mathrm{GCA}_1 - \frac{1}{p}\sum_{j=2}^{p}\mathrm{GCA}_j + \frac{1}{p}\sum_{k=2}^{p}\mathrm{SCA}_{1k} - \frac{2}{p(p-2)}\sum_{j=2}^{p-1}\sum_{k=j+1}^{p}\mathrm{SCA}_{jk}; \tag{3-5}$$

$$E\{\text{diallel mean}\} = \mu + \frac{1}{b}\sum_{i=1}^{b}B_i + \frac{2}{p}\sum_{j=1}^{p}\mathrm{GCA}_j + \frac{2}{p(p-1)}\sum_{j=1}^{p-1}\sum_{k=j+1}^{p}\mathrm{SCA}_{jk}; \text{ and}$$

$$E\{\text{checklot mean}\} = \mu + \frac{1}{b}\sum_{i=1}^{b}B_i + t;$$

where j for GCA is j or k and t represents the fixed genetic parameter of the checklot. The
function used to properly locate GCA1rel (the subscript rel denotes the relocated GCA1s) is gca1rel
= gca1s + (1/2)(diallel mean - checklot mean). The expectation of gca1rel with negligible SCA
is GCA1rel = GCA1 - t/2; and since breeding value equals twice GCA, BV1rel = BV1 - t. If SCA
is non-negligible then the expectation is

$$\mathrm{GCA}_{1rel} = \mathrm{GCA}_1 + \frac{1}{p-1}\sum_{k=2}^{p}\mathrm{SCA}_{1k} - \frac{1}{(p-1)(p-2)}\sum_{j=2}^{p-1}\sum_{k=j+1}^{p}\mathrm{SCA}_{jk} - \frac{t}{2}. \tag{3-6}$$
In either case the function provides a reasonable manner by which GCA estimates from
disconnected diallels are centered at the same location on a number scale and are then
comparable.
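
Equations 3-5 and 3-6 can be checked numerically. The sketch below is illustrative only: the parameter values are arbitrary, expected cross means are used in place of data, block effects are omitted because they cancel in the difference between the diallel and checklot means, and the balanced sum-to-zero estimate is computed with the standard half-diallel formula (p x parent total - 2 x grand total)/(p(p-2)).

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
p = 6
mu, t = 10.0, 0.7                      # hypothetical mean and checklot parameter
gca = rng.normal(size=p)               # parametric GCA's, not forced to sum to zero
pairs = list(itertools.combinations(range(p), 2))   # parent 1 has index 0
sca = {pair: rng.normal() for pair in pairs}

# Expected overall cross means (block effects omitted; they cancel below)
cross_mean = {(j, k): mu + gca[j] + gca[k] + sca[(j, k)] for j, k in pairs}

total = sum(cross_mean.values())
total_parent1 = sum(v for (j, k), v in cross_mean.items() if j == 0)

gca1_s = (p * total_parent1 - 2.0 * total) / (p * (p - 2))  # sum-to-zero gca1
diallel_mean = total / len(pairs)
checklot_mean = mu + t
gca1_rel = gca1_s + 0.5 * (diallel_mean - checklot_mean)    # relocated gca1

sca_with_1 = sum(v for (j, k), v in sca.items() if j == 0)
sca_without_1 = sum(v for (j, k), v in sca.items() if j != 0)

eq_3_5 = ((p - 1) / p * gca[0] - gca[1:].sum() / p
          + sca_with_1 / p - 2.0 * sca_without_1 / (p * (p - 2)))
eq_3_6 = (gca[0] + sca_with_1 / (p - 1)
          - sca_without_1 / ((p - 1) * (p - 2)) - t / 2.0)

print(gca1_s, eq_3_5)    # agree
print(gca1_rel, eq_3_6)  # agree
```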
Variance and Covariance of Plot Means
The variances of plot means with unequal numbers of trees per plot are by definition
unequal, i.e., Var(yijk) = σ²p + σ²w/nijk where σ²p is plot variance, σ²w is within-plot variance,
and nijk is the number of observations per plot. Also, if blocks were considered random, there
would be an additional source of variance for plot means due to blocks (as well as a covariance
between plot means in the same block) and this could be incorporated into the V matrix with
Var(yijk) = σ²b + σ²p + σ²w/nijk. When the variances of the plot means are
not equal and there is a covariance between the means if blocks are being considered random,
best linear unbiased estimates (BLUE) would be secured by weighting each mean by its true
associated variance (Searle 1987, page 316). This is the generalized least squares (GLS)
approach as

$$b = (X_s' V^{-1} X_s)^{-1} X_s' V^{-1} y \tag{3-7}$$

The GLS approach relaxes the OLS assumptions of equal variance of and no covariance between
the observations (plot means) while still treating genetic parameters as fixed effects. The entries
along the diagonal of the V matrix are the variances of the plot means (Var(yijk)) in the same
order as means in the data vector. The off-diagonal elements of V would be either 0 or σ²b (the
variance due to the random variable block) for elements corresponding to observations in the
same block. BLUE requires exact knowledge of V; if estimates of σ²p, σ²w, and σ²b are utilized
in the V matrix, estimable functions of β approximate BLUE.
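
A minimal sketch of equation 3-7 follows. Everything numerical in it is hypothetical: a toy layout of two blocks by three crosses, assumed variance components, made-up plot means, and a design matrix with a mean and two sum-to-zero cross effects. The helper name `gls` is likewise an assumption. It shows how V is assembled from the plot-mean variances and the block covariance, and confirms that GLS reduces to OLS when V is proportional to the identity.

```python
import numpy as np

def gls(X, V, y):
    """b = (X' V^-1 X)^-1 X' V^-1 y, computed without forming V^-1 explicitly."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)

# Hypothetical toy data: 2 blocks x 3 crosses, unequal trees per plot
block = np.array([0, 0, 0, 1, 1, 1])
n_trees = np.array([6, 5, 2, 6, 3, 4])
y = np.array([2.1, 2.6, 3.4, 1.9, 2.4, 2.7])          # plot means
X = np.column_stack([np.ones(6),
                     [1, 0, -1, 1, 0, -1],              # sum-to-zero cross effect 1
                     [0, 1, -1, 0, 1, -1]])             # sum-to-zero cross effect 2

# Assumed variance components (block, plot, within-plot)
var_b, var_p, var_w = 0.5, 0.6, 7.9

# Diagonal: var_b + var_p + var_w / n; off-diagonal: var_b within a block, else 0
same_block = block[:, None] == block[None, :]
V = var_b * same_block + np.diag(var_p + var_w / n_trees)

b_gls = gls(X, V, y)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_gls_identity = gls(X, np.eye(6), y)                   # V = I reproduces OLS

print(b_ols)
print(b_gls)
```

Solving linear systems with V rather than inverting it is a numerical convenience and does not change the estimator.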
The OLS assumption that SCA and GCA are fixed effects can also be relaxed to allow
for covariances due to genetic relatedness. In particular, the information that means are from the
same half- or full-sib family could be included in the V matrix. Relaxation of the zero covariance
assumption implies that GCA and SCA are random variables. If GCA and SCA are treated as
random variables, then the application of best linear prediction (BLP) or best linear unbiased
prediction (BLUP) to the problem would be more appropriate (White and Hodge 1989, page 64).
The treatment of the genetic parameters as random variables is consistent with that used in
estimating genetic correlations and heritabilities. The V matrix of such an application would
include, in addition to the features of the GLS V matrix, the covariance between full-sib or half-
sib families added to the off-diagonal elements in V, i.e., if the first and second plot means in
the data vector had a covariance due to relationship, then that covariance is inserted twice in the
V matrix. The covariance would appear as the second element in the first row and the first
element in the second row of V (V is a symmetric matrix). Also the diagonal elements of V
would increase by 2σ²gca + σ²sca (the additional variance due to treating GCA and SCA as random variables).

Comparison of Prediction and Estimation Methodologies
Which methodology (OLS, GLS, BLP, or BLUP) to apply to an individual data base is
a somewhat subjective decision. The decision can be based both on the computational or
conceptual complexity of the method and on the magnitude of the data base with which the analyst
is working. To aid in this decision, this discussion highlights the differences in the inherent
properties and assumptions of the techniques.
For all practical purposes the answers from the four techniques will never be equal;
however, there are two caveats. First, OLS estimates equal GLS estimates if all the cell means
are known with the same precision (variance) (Searle 1987, page 490). Otherwise, GLS
discounts the means that are known with less precision in the calculations and different estimates
result. Second, if the amount of data is infinite, i.e., all cross means are known
without error, then all four techniques are equivalent (White and Hodge 1989, pages 104-106).
In all other cases BLP and BLUP shrink predictions toward the location parameter(s) and produce
predictions which are different from OLS or GLS estimates even with balanced data. During
calculations GLS, BLP, and BLUP place less weight on observations known with less precision,
which is intuitively pleasing.
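
The shrinkage behavior can be illustrated with the simplest possible case. The sketch below is an assumption-laden toy, not an analysis from this study: three unrelated half-sib family means with the same deviation from a known location parameter but different numbers of trees, hypothetical variance components, and BLP computed as Cov(g, y)V^-1(y - μ).

```python
import numpy as np

var_g, var_e = 0.25, 9.75     # hypothetical GCA and residual variances
mu = 10.0                     # location parameter treated as known (BLP)
n_trees = np.array([2, 20, 200])
family_mean = np.array([11.0, 11.0, 11.0])   # same deviation, different precision

# Var(family mean) = var_g + var_e / n; families assumed unrelated, so V is diagonal
V = np.diag(var_g + var_e / n_trees)
C = var_g * np.eye(3)         # Cov(gca, family mean)

blp = C @ np.linalg.solve(V, family_mean - mu)
ols = family_mean - mu        # unshrunken deviation from the mean

shrinkage = var_g / (var_g + var_e / n_trees)
print(ols)
print(blp)                    # equals shrinkage * (family_mean - mu)
```

The family measured on the fewest trees is pulled furthest toward the location parameter, whereas the unshrunken deviations treat all three families alike.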
With OLS and GLS forest geneticists treat GCA’s and SCA’s as fixed effects for
estimation and then as random variables for genetic correlations and heritabilities. BLP and
BLUP provide a consistent treatment of GCA’s and SCA’s as random variables while differing
in their assumptions about location parameters (fixed effects). In BLP fixed effects are assumed
known without error (although they are usually estimated from the data) while with BLUP fixed
effects are estimated using GLS. BLP and BLUP techniques also contain the assumption that the
covariance matrix of the observations is known without error (most often variances must be
estimated). In many BLUP applications (Henderson 1974), mixed model equations are utilized
iteratively to estimate fixed effects and to predict random variables from a data set. A BLUP
treatment of fixed effects allows any connectedness between experiments to be utilized in the
estimation of the fixed effects. This provides an intuitive advantage of BLUP over BLP in
experimentation where connectedness among genetic experiments is available or where the data
are so unbalanced that treating the fixed effects as known is less desirable than a GLS estimate
of the fixed effects.
An ordering of computational complexity and conceptual complexity from least to most
complex of the four methods is OLS, GLS, BLP and BLUP. The latter three methods require
the estimation of the covariance matrix of the observations either separately (a priori) or
iteratively with the fixed effects. Precise estimation of the covariance matrix for observations
requires a great number of observations and the precision of GLS, BLP and BLUP estimations
or predictions is affected by the error of estimation of the components of V.
Selection of a method can then be based on weighing the computational complexity and
size of the available data base against the advantages offered by each method. Thus, if
complexity of the computational problem is of paramount concern, the analyst necessarily would
choose OLS. With a small data base (one that does not allow reasonable estimates of variances),
the analyst would again choose OLS. With a large data base and no qualms with computational
complexity, the analyst can choose between BLP and BLUP based on whether there is sufficient
connectedness or imbalance among the experiments to make BLUP advantageous.
Conclusions
Methods of solving for GCA and SCA estimates for balanced (plot-mean basis) and
unbalanced data have been presented along with the inherent assumptions of the analysis. The
use of plot means and the matrix equations will produce sum-to-zero OLS estimates for GCA and
SCA for all types of imbalance. Formulae in the literature which yield OLS solutions for
balanced data can yield misleading solutions for unbalanced data because orthogonality is lost
and because the weightings those formulae place on site means for crosses (or totals) are constants.
GCA’s and SCA’s obtained through sum-to-zero restriction are not truly estimates of
parametric population GCA’s and SCA’s. There are an infinite number of solutions for GCA’s
and SCA’s from the system of equations as a result of the overparameterized linear model. Yet,
if the only comparisons of interest are among the specific parents on a particular site, then the
estimates calculated by sum-to-zero restrictions are appropriate. Checklots may be used to
provide comparability among estimates derived from disconnected sets.
Knowledge of the innate mathematical features of OLS analysis discussed here
should help the data analyst decide if OLS is the most desirable technique for the data
at hand. It may be desirable to relax OLS assumptions, which are in all likelihood invalid for
the covariance matrix of the observations. This could lead to GLS, BLP, or BLUP as better
alternatives.

CHAPTER 4
VARIANCE COMPONENT ESTIMATION TECHNIQUES
COMPARED FOR TWO MATING DESIGNS
WITH FOREST GENETIC ARCHITECTURE
THROUGH COMPUTER SIMULATION
Introduction
In many applications of quantitative genetics, geneticists are commonly faced with the
analysis of data containing a multitude of flaws (e.g. non-normality, imbalance, and
heteroscedasticity). Imbalance, as one of these flaws, is intrinsic to quantitative forest genetics
research because of the difficulty in making crosses for full-sib tests and the biological realities
of long term field experiments. Few definitive studies have been conducted to establish optimal
methods for estimation of variance components from unbalanced data. Simulation studies using
simple models (one-way or two-way random models) have been conducted for certain data
structures, i.e., imbalance, experimental design, and variance parameters (Corbeil and Searle
1976, Swallow 1981, Swallow and Monahan 1984, interpretations by Littell and McCutchan
1986). The results from these studies indicate that technique optimality is a function of the data
structure.
In practice (both historically and still commonplace), estimation of variance components
in forest genetics applications has been achieved by using sequentially adjusted sums of squares
as an application of Henderson’s Method 3 (HM3, Henderson 1953). Under normality and with
balanced data, this technique has the desirable properties of being the minimum variance unbiased
estimator. If the data are unbalanced, then the only property retained by HM3 estimation is
unbiasedness (Searle 1971, Searle 1987 pp. 492, 493, 498). Other estimators have been shown
to be locally superior to HM3 in variance or mean square error properties in certain cases (Klotz
et al. 1969, Olsen et al. 1976, Swallow 1981, Swallow and Monahan 1984).
Over the last 25 years, there has been a proliferation of variance component estimation
techniques including minimum norm quadratic unbiased estimation (MINQUE, Rao 1971a),
minimum variance quadratic unbiased estimation (MIVQUE, Rao 1971b), maximum likelihood
(ML, Hartley and Rao 1967), and restricted maximum likelihood (REML, Patterson and
Thompson 1971). The practical application of these techniques has been impeded by their
computational complexity. However, with continuing advances in computer technology and the
appearance of better computational algorithms, the application of these procedures continues to
become more tractable (Harville 1977, Giesbrecht 1983, Meyer 1989). Whether these methods
of analysis are superior to HM3 for many genetics applications remains to be shown.
With balanced data and disregarding negative estimates, all previously mentioned
techniques except ML produce the same estimates (Harville 1977). With unbalanced data, each
technique produces a different set of variance component estimates. Criteria must then be
adopted to discriminate among techniques. Candidate criteria for discrimination include
unbiasedness (large number convergence on the parametric value), minimum variance (estimator
with the smallest sampling variance), minimum mean square error (minimum of sampling
variance plus squared bias, Hogg and Craig 1978), and probability of nearness (probability that
sample estimates occur in a certain interval around the parametric value, Pitman 1937).
Negative estimates are also problematic in the estimation of variance components. Five
alternatives for dealing with the dilemma of estimates less than zero (outside the natural parameter
space of zero to infinity) are (Searle 1971): 1) accept and use the negative estimate, 2) set the
negative estimate to zero (producing biased estimates), 3) re-solve the system with the offending
component set to zero, 4) use an algorithm which does not allow negative estimates, and 5) use
the negative estimate to infer that the wrong model was utilized.
The purpose of this research was to determine if the criteria of unbiasedness, minimum
variance, minimum mean square error, and probability of nearness discriminated among several
variance component estimation techniques while exploring various alternatives for dealing with
negative variance component estimates. In order to make such comparisons, a large number of
data sets were required for each experimental level. Using simulated data, this chapter compares
variance component estimation techniques for plot-mean and individual observations, two mating
systems (modified half-diallel and half-sib) and two sets of parametric variance components.
Types of imbalance and levels of factors were chosen to reflect common situations in forest
genetics.
Methods
Experimental Approach
For each experimental level 1000 data sets were generated and analyzed by various
techniques (Table 4-1) producing numerous sets of variance component estimates for each data
set. This workload resulted in enormous computational time being associated with each
experimental level. The overall experimental design for the simulation was originally conceived
as a factorial with two types of mating design (half-diallel and half-sib), two sets of true variance
components (Table 4-2), two kinds of observations (individual and plot mean) and three types of
imbalance: 1) survival levels (80% and 60%, with 80% representing moderate survival and 60%
representing poor survival); 2) for full-sib designs three levels of missing crosses (0, 2, and 5 out
of 15 crosses); and 3) for half-sib designs two levels of connectedness among tests (15 and 10
common families between tests out of 15 families per test). Because of the computational time

Table 4-1. Abbreviation for and description of variance component estimation methods utilized
for analyses based on individual observations (if utilized for plot-mean analysis the abbreviation
is modified by pre-fixing a ’P’).
Abbreviation    Description                                                        Citation

ML              Maximum Likelihood: estimates not restricted to the parameter      Hartley and Rao 1967;
PML             space (individual and plot-mean analysis).                         Shaw 1987

MODML           Maximum Likelihood: negative estimates set to zero after           Hartley and Rao 1967
                convergence (individual analysis).

NNML            Maximum Likelihood: if negative estimates appeared at              Hartley and Rao 1967;
                convergence, they were set to zero and the system re-solved        Miller 1973
                (individual analysis).

REML            Restricted Maximum Likelihood: estimates not restricted to the     Patterson and Thompson 1971;
PREML           parameter space (individual and plot-mean analysis).               Shaw 1987; Harville 1977

MODREML         Restricted Maximum Likelihood: negative estimates set to zero      Patterson and Thompson 1971
                after convergence (individual analysis).

NNREML          Restricted Maximum Likelihood: if negative estimates appeared      Patterson and Thompson 1971;
PNNREML         at convergence, they were set to zero and the system re-solved     Miller 1983
                (individual and plot-mean analysis).

MIVQUE          Minimum Variance Quadratic Unbiased: non-iterative with true       Rao 1971b
PMIVQUE         (parametric) values of the variance components as priors
                (individual and plot-mean analysis).

MINQUE1         Minimum Norm Quadratic Unbiased: non-iterative with ones as        Rao 1971a
PMINQUE1        priors for all variance components (individual and plot-mean
                analysis).

TYPE3           Sequentially Adjusted Sums of Squares; Henderson’s Method 3        Henderson 1953
PTYPE3          (individual and plot-mean analysis).

MIVPEN          MIVQUE with a penalty algorithm to prevent negative estimates      Harville 1977
                (individual analysis).

constraint, the experiment could not be run as a complete factorial and the investigation continued
as a partial factorial. In general, the approach was to run levels which were at opposite ends of
the imbalance spectrum, i.e., 80% survival and no missing crosses versus 60% survival and 5
missing crosses, within a variance component level. If results were consistent across these
treatment combinations, intermediate levels were not run.

Designation of a treatment combination is by a five-character alphanumeric field. The first
character is either "H" (half-sib) or "D" (half-diallel). The second character denotes the set of
parametric variance components where "1" designates the set of variance components associated
with heritability of 0.1 and "2" designates the set of variance components associated with
heritability of 0.25 (Table 4-2). The third character is an "S" indicating that the last two
characters determine the imbalance level. The fourth character designates the survival level, either
"6" for 60% or "8" for 80%. The final character specifies the number of missing crosses
(half-diallel) or lack of connectedness (half-sib). For example, the treatment combination ’H1S80’ is a half-sib
mating design (H), the set of variance components associated with heritability equalling 0.1 (1),
80% survival (8), and 15 common parents across tests (0).
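
The coding scheme can be expressed as a small decoder. This is only an illustrative sketch; the function name `parse_treatment` and the dictionary keys are hypothetical. Because the text defines only the code "0" for the half-sib connectedness character (15 common parents), the final character of a half-sib code is returned as-is rather than interpreted.

```python
def parse_treatment(code):
    """Decode a five-character treatment designation such as 'H1S80'.

    Hypothetical helper; key names are illustrative only.
    """
    designs = {"H": "half-sib", "D": "half-diallel"}
    heritabilities = {"1": 0.1, "2": 0.25}
    survivals = {"6": 0.60, "8": 0.80}

    if len(code) != 5 or code[2] != "S":
        raise ValueError(f"not a valid treatment designation: {code!r}")
    if (code[0] not in designs or code[1] not in heritabilities
            or code[3] not in survivals):
        raise ValueError(f"unrecognized character in: {code!r}")

    result = {
        "mating_design": designs[code[0]],
        "heritability": heritabilities[code[1]],
        "survival": survivals[code[3]],
    }
    if code[0] == "D":
        # Final character: number of missing crosses
        result["missing_crosses"] = int(code[4])
    else:
        # Final character: connectedness level; only "0" (all 15 parents
        # common across tests) is defined in the text, so keep the raw code.
        result["connectedness_code"] = code[4]
    return result

print(parse_treatment("H1S80"))
print(parse_treatment("D2S65"))
```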
Table 4-2. Sets of true variance components for the half-diallel and half-sib mating designs
generated from specification of two levels of single-tree heritability (h2), type B correlation (rB),
and non-additive to additive variance ratio (d/a).
Genetic Ratios(a)       Mating      True Variance Components(b)
h2      rB      d/a     Design      σ²t     σ²b     σ²g     σ²s     σ²tg    σ²ts    σ²p     σ²w

0.1     0.5     1.0     full-sib    1.0     0.5     0.25    0.25    0.25    0.25    .595    7.905
                        half-sib    1.0     0.5     0.25    NA      0.25    NA      .475    7.9964
0.25    0.8     .25     full-sib    1.0     0.5     0.625   .1562   .1562   .0391   .5769   7.6649

a h2 = 4σ²g / σ²phenotypic; rB = 4σ²g / (4σ²g + 4σ²tg); d/a = 4σ²s / 4σ²g.
b See definitions in equation 4-1.
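
As a worked check of the full-sib rows of Table 4-2, the genetic ratios can be recomputed from the variance components. The sketch assumes the phenotypic variance is the variance of an individual observation with the test and block components removed, i.e., 2σ²g + σ²s + 2σ²tg + σ²ts + σ²p + σ²w; the function name `genetic_ratios` is hypothetical.

```python
def genetic_ratios(g, s, tg, ts, p, w):
    """Return (h2, rB, d/a) for the half-diallel from its variance components."""
    phenotypic = 2 * g + s + 2 * tg + ts + p + w   # test and block excluded
    h2 = 4 * g / phenotypic
    r_b = 4 * g / (4 * g + 4 * tg)
    d_over_a = (4 * s) / (4 * g)
    return h2, r_b, d_over_a

# Full-sib variance components from Table 4-2
set1 = genetic_ratios(g=0.25, s=0.25, tg=0.25, ts=0.25, p=0.595, w=7.905)
set2 = genetic_ratios(g=0.625, s=0.1562, tg=0.1562, ts=0.0391, p=0.5769, w=7.6649)

print([round(x, 3) for x in set1])   # approximately 0.1, 0.5, 1.0
print([round(x, 3) for x in set2])   # approximately 0.25, 0.8, 0.25
```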
Experimental Design for Simulated Data
The mating design for the simulation was either a six-parent half-diallel (no selfs) or a
fifteen-parent half-sib. The randomized complete block field design was in three locations (i.e.,
separate field tests) with four complete blocks per location and six trees per family in a block,
where family is a full-sib family for the half-diallel or a half-sib family for the half-sib design. This
field design and the mating designs reflect typical designs in forestry applications (Squillace 1973,
Wilcox et al. 1975, Bridgwater et al. 1983, Weir and Goddard 1986, Loo-Dinkins et al. 1991)
and are also commonly used in other disciplines (Matzinger et al. 1959, Hallauer and Miranda
1981, Singh and Singh 1984). The six trees per family could be considered as contiguous or
non-contiguous plots without affecting the results or inferences.
Full-Sib Linear Model
The scalar linear model employed for half-diallel individual observations is

$$y_{ijklm} = \mu + t_i + b_{ij} + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm} \tag{4-1}$$

where yijklm is the m-th observation of the kl-th cross in the j-th block of the i-th test;
μ is the population mean;
ti is the random variable test location ~ NID(0, σ²t);
bij is the random variable block ~ NID(0, σ²b);
gk is the random variable female general combining ability (gca) ~ NID(0, σ²g);
gl is the random variable male gca ~ NID(0, σ²g);
skl is the random variable specific combining ability (sca) ~ NID(0, σ²s);
tgik is the random variable test by female gca interaction ~ NID(0, σ²tg);
tgil is the random variable test by male gca interaction ~ NID(0, σ²tg);
tsikl is the random variable test by sca interaction ~ NID(0, σ²ts);
pijkl is the random variable plot ~ NID(0, σ²p);
wijklm is the random variable within-plot ~ NID(0, σ²w); and
there is no covariance between random variables in the model.
This linear model in matrix notation is (dimensions below model component)

$$\underset{n\times 1}{y} = \underset{n\times 1}{\mu\mathbf{1}} + \underset{n\times t}{Z_T}\,\underset{t\times 1}{e_T} + \underset{n\times b}{Z_B}\,\underset{b\times 1}{e_B} + \underset{n\times g}{Z_G}\,\underset{g\times 1}{e_G} + \underset{n\times s}{Z_S}\,\underset{s\times 1}{e_S} + \underset{n\times tg}{Z_{TG}}\,\underset{tg\times 1}{e_{TG}} + \underset{n\times ts}{Z_{TS}}\,\underset{ts\times 1}{e_{TS}} + \underset{n\times p}{Z_P}\,\underset{p\times 1}{e_P} + \underset{n\times 1}{e_W} \tag{4-2}$$

where y is the observation vector;
Zi is the portion of the design matrix for the i-th random variable;
ei is the vector of unobservable random effects for the i-th random variable;
1 is a vector of 1’s; and
n, t, b, g, s, tg, ts, and p are the number of observations, tests, blocks, gca’s, sca’s, test
by gca interactions, test by sca interactions and plots, respectively.
Utilizing customary assumptions in half-diallel mating designs (Method 4, Griffing 1956), the
variance of an individual observation is

$$\mathrm{Var}(y_{ijklm}) = \sigma^2_t + \sigma^2_b + 2\sigma^2_g + \sigma^2_s + 2\sigma^2_{tg} + \sigma^2_{ts} + \sigma^2_p + \sigma^2_w \tag{4-3}$$

and in matrix notation the covariance matrix for the observations is

$$\mathrm{Var}(y) = Z_T Z_T'\sigma^2_t + Z_B Z_B'\sigma^2_b + Z_G Z_G'\sigma^2_g + Z_S Z_S'\sigma^2_s + Z_{TG} Z_{TG}'\sigma^2_{tg} + Z_{TS} Z_{TS}'\sigma^2_{ts} + Z_P Z_P'\sigma^2_p + I_n\sigma^2_w \tag{4-4}$$

where " ’ " indicates the transpose operator, all matrices of the form ZiZi’ are n x n, and In is an
n x n identity matrix.
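
To make the matrix form concrete, the covariance matrix can be assembled for a deliberately small layout. The sketch below is an illustration under stated assumptions, not the simulation design: it uses 2 tests, 2 blocks per test, a four-parent half-diallel, and 2 trees per plot so that the matrices stay small; the helper `one_hot` is hypothetical; and the variance components are the first full-sib set of Table 4-2. The diagonal of the result reproduces the variance of an individual observation.

```python
import itertools
import numpy as np

def one_hot(labels):
    """Design matrix with one column per distinct label (hypothetical helper)."""
    levels = sorted(set(labels))
    Z = np.zeros((len(labels), len(levels)))
    for row, label in enumerate(labels):
        Z[row, levels.index(label)] = 1.0
    return Z

n_tests, n_blocks, n_parents, n_trees = 2, 2, 4, 2   # small illustrative layout
crosses = list(itertools.combinations(range(n_parents), 2))
obs = [(i, j, k, l)
       for i in range(n_tests) for j in range(n_blocks)
       for (k, l) in crosses for _ in range(n_trees)]
n = len(obs)

Z = {
    "t": one_hot([i for i, j, k, l in obs]),
    "b": one_hot([(i, j) for i, j, k, l in obs]),
    # Each row of the gca matrix has two 1's: one for each parent of the cross
    "g": np.array([[float(q in (k, l)) for q in range(n_parents)]
                   for i, j, k, l in obs]),
    "s": one_hot([(k, l) for i, j, k, l in obs]),
    "tg": np.array([[float(i == a and q in (k, l))
                     for a in range(n_tests) for q in range(n_parents)]
                    for i, j, k, l in obs]),
    "ts": one_hot([(i, k, l) for i, j, k, l in obs]),
    "p": one_hot([(i, j, k, l) for i, j, k, l in obs]),
}

# First full-sib set of true variance components from Table 4-2
sigma2 = {"t": 1.0, "b": 0.5, "g": 0.25, "s": 0.25,
          "tg": 0.25, "ts": 0.25, "p": 0.595, "w": 7.905}

V = sigma2["w"] * np.eye(n)
for name, Zi in Z.items():
    V += sigma2[name] * (Zi @ Zi.T)

var_individual = (sigma2["t"] + sigma2["b"] + 2 * sigma2["g"] + sigma2["s"]
                  + 2 * sigma2["tg"] + sigma2["ts"] + sigma2["p"] + sigma2["w"])
print(V.shape, V[0, 0], var_individual)   # diagonal equals the individual variance
print(V[0, 1])                            # two trees in the same plot
```

Two trees in the same plot share every random effect except the within-plot term, so their covariance is the individual variance less σ²w.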
Half-sib Linear Model
The scalar linear model for half-sib individual observations is
$$y_{ijkm} = \mu + t_i + b_{ij} + g_k + tg_{ik} + ph_{ijk} + wh_{ijkm}$$    4-5
where $y_{ijkm}$ is the $m^{th}$ observation of the $k^{th}$ half-sib family in the $j^{th}$ block of the $i^{th}$ test;
$\mu$, $t_i$, $b_{ij}$, $g_k$, and $tg_{ik}$ retain the definition in Eq. 4-1;
$ph_{ijk}$ is the random variable plot containing different genotype by environment
components than the corresponding term in Eq. 4-1 ~ NID(0, $\sigma^2_{ph}$);
$wh_{ijkm}$ is the random variable within-plot containing different levels of genotypic and
genotype by environment components than the corresponding term in Eq. 4-1
~ NID(0, $\sigma^2_{wh}$); and
there is no covariance between random variables in the model.
The matrix notation model is (dimensions below model component)
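$$y = \mu\mathbf{1} + Z_T e_T + Z_B e_B + Z_G e_G + Z_{TG} e_{TG} + Z_P e_P + e_W$$    4-6
Dimensions: y, n×1; $\mu\mathbf{1}$, n×1; $Z_T$, n×t; $e_T$, t×1; $Z_B$, n×b; $e_B$, b×1; $Z_G$, n×g; $e_G$, g×1; $Z_{TG}$, n×tg; $e_{TG}$, tg×1; $Z_P$, n×p; $e_P$, p×1; $e_W$, n×1
The variance of an individual observation in half-sib designs is
$$\mathrm{Var}(y_{ijkm}) = \sigma^2_t + \sigma^2_b + \sigma^2_g + \sigma^2_{tg} + \sigma^2_{ph} + \sigma^2_{wh}$$    4-7
and
$$\mathrm{Var}(y) = Z_T Z_T'\sigma^2_t + Z_B Z_B'\sigma^2_b + Z_G Z_G'\sigma^2_g + Z_{TG} Z_{TG}'\sigma^2_{tg} + Z_P Z_P'\sigma^2_{ph} + I_n\sigma^2_{wh}$$    4-8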
For an observational vector based on plot means, the plot and within-plot random
variables were combined by taking the arithmetic mean across the observations within a plot.
The resulting plot-means model has a single new residual term whose variance combines the plot and within-plot variance terms of the individual observation model.
Three estimates of ratios among variance components were determined: 1) single tree
heritability adjusted for test location and block as $\hat{h}^2 = 4\hat\sigma^2_g$ divided by the estimate of the variance of an individual observation from equations 4-3 and 4-7 with the variance
components for test location and block deleted; 2) type B correlation as $\hat{r}_B = 4\hat\sigma^2_g / (4\hat\sigma^2_g + 4\hat\sigma^2_{tg})$; and 3) dominance to additive variance ratio as $\hat{d}/\hat{a} = 4\hat\sigma^2_s / 4\hat\sigma^2_g$.
Data Generation and Deletion
Data generation was accomplished by using a Cholesky upper-lower decomposition of the
covariance matrix for the observations (Goodnight 1979) and a vector of pseudo-random standard
normal deviates generated using the Box-Muller transformation with pseudo-random uniform
deviates (Knuth 1981, Press et al. 1989). The upper-lower decomposition creates a matrix (U)
with the property that Var(y) = U'U. The vector of pseudo-random standard normal deviates
(z) has a covariance matrix equal to an identity matrix ($I_n$) where n is the number of observations.
The vector of observations is created as y = U'z. Then Var(y) = U'(Var(z))U and since Var(z)
= $I_n$, Var(y) = U'$I_n$U = U'U.
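The generation step just described can be sketched in a few lines. This is only an illustration under stated assumptions: it uses NumPy, the toy covariance matrix, the seed and the function name `generate_observations` are hypothetical, and NumPy's normal generator stands in for the Box-Muller routine used in the study.

```python
import numpy as np

def generate_observations(V, rng):
    """Draw one observation vector y with Var(y) = V using y = U'z."""
    # np.linalg.cholesky returns the lower factor L with V = L L';
    # its transpose is the upper factor U with V = U'U.
    U = np.linalg.cholesky(V).T
    z = rng.standard_normal(V.shape[0])   # Var(z) = I_n
    return U.T @ z

# Hypothetical toy covariance: 2 families x 3 trees,
# family variance 0.25 and within-family variance 1.0.
Z = np.kron(np.eye(2), np.ones((3, 1)))
V = 0.25 * Z @ Z.T + 1.0 * np.eye(6)
rng = np.random.default_rng(1993)
y = generate_observations(V, rng)
```

Repeating the call with fresh deviates gives independent data sets that share the same covariance structure, which is how a run of simulated data sets for one experimental level could be produced.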
Analyses of survival patterns using data from the Cooperative Forest Genetic Research
Program (CFGRP) at the University of Florida were used to develop survival distributions for
the simulation. The data sets chosen for survival analysis were from full-sib slash pine (Pinus
elliottii var. elliottii Engelm.) tests planted in randomized complete block designs with the families
in row plots and were selected because the survival levels were either approximately 60% or
80%. Survival levels for most crosses (full-sib families) clustered around the expected value,
i.e., approximately 60% for an average survival level of 60%; however, there were always a few
crosses that had much poorer survival than average and also a small number of crosses that had
much better survival than average. This survival pattern was consistent across the 50 experiments
analyzed. Thus, a lower than average survival level was arbitrarily assigned to certain crosses,
a higher than average survival level was assigned to certain crosses, and the average survival
level assigned to most crosses. This modeling of survival pattern was also extended to the half-
sib mating design. At 80% survival no missing plots were allowed and at 60% survival missing
plots occurred at random.
Full-sib family deletion simulated crosses which could not be made and were therefore
missing from the experiment. When deleting five crosses, the deletion was restricted to a
maximum of four crosses per parent to prevent loss of all the crosses in which a single parent
appeared since this would have resulted in changing a six-parent to a five-parent half-diallel.
Tests having only subsets of the half-sib families in common are a frequent occurrence
in data analysis at CFGRP. This partial connectedness was simulated by generating data in which
only 10 of the 15 families present in a test were common to either one of the other two tests
comprising a data set.
Variance Component Estimation Techniques
Two algorithms were utilized for all estimation techniques: sequentially adjusted sums
of squares (Milliken and Johnson 1984, p 138) for HM3; and Giesbrecht’s algorithm (Giesbrecht
1983) for REML, ML, MINQUE and MIVQUE. Giesbrecht’s algorithm is primarily a gradient
algorithm (the method of scoring), and as such allows negative estimates (Harville 1977,
Giesbrecht 1983). Negative estimates are not a theoretical difficulty with MINQUE or MIVQUE;
however, for REML and ML, estimates should be confined to the parameter space. For this
reason estimators referred to as REML and ML in this chapter are not truly REML and ML when
negative estimates occur; further, there is the possibility that the iterative solution stopped at a
local maximum rather than the global maximum. These concerns are commonplace in REML and ML
estimation (Corbeil and Searle 1976, Harville 1977, Swallow and Monahan 1984); however,
ignoring these two points, these estimators are still referred to as REML and ML.
The basic equation for variance component estimation under normality (Giesbrecht 1983)
for MIVQUE, MINQUE and REML is
$$\{\mathrm{tr}(QV_iQV_j)\}\,\hat\sigma^2 = \{y'QV_iQy\}$$    4-9
with dimensions r×r, r×1 and r×1, respectively;
then $\hat\sigma^2 = \{\mathrm{tr}(QV_iQV_j)\}^{-1}\{y'QV_iQy\}$;
and for ML
$$\{\mathrm{tr}(V^{-1}V_iV^{-1}V_j)\}\,\hat\sigma^2 = \{y'QV_iQy\}$$    4-10
with dimensions r×r, r×1 and r×1, respectively,
where $\{\mathrm{tr}(QV_iQV_j)\}$ is a matrix whose elements are $\mathrm{tr}(QV_iQV_j)$ where in the full-sib
designs i = 1 to 8 and j = 1 to 8, i.e., there is a row and column for
every random variable in the linear model;
tr is the trace operator, that is, the sum of the diagonal elements of a matrix;
$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$ for V as the covariance matrix of y and X as
the design matrix for fixed effects;
$V_i = Z_iZ_i'$ where i = the random variables test, block, etc.;
$\hat\sigma^2$ is the vector of variance component estimates; and
r is the number of random variables in the model.
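A single pass of Equation 4-9 can be written out directly. The sketch below is not the program used in the study; it assumes NumPy, builds every matrix in full (nothing is half-stored), and the function name `giesbrecht_step` and the toy data are hypothetical.

```python
import numpy as np

def giesbrecht_step(y, X, V_list, priors):
    """One pass of Eq. 4-9: solve {tr(Q V_i Q V_j)} s = {y'Q V_i Q y} for s."""
    V = sum(p * Vi for p, Vi in zip(priors, V_list))   # Var(y) under the priors
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    # Q = V^-1 - V^-1 X (X'V^-1 X)^- X'V^-1 (pinv serves as the generalized inverse)
    Q = Vinv - VinvX @ np.linalg.pinv(X.T @ VinvX) @ VinvX.T
    QV = [Q @ Vi for Vi in V_list]
    lhs = np.array([[np.trace(A @ B) for B in QV] for A in QV])   # r x r
    Qy = Q @ y
    rhs = np.array([Qy @ Vi @ Qy for Vi in V_list])               # r x 1
    return np.linalg.solve(lhs, rhs)

# Hypothetical balanced toy data: 4 families x 3 trees, mean as the only fixed effect
Z = np.kron(np.eye(4), np.ones((3, 1)))
y = Z @ np.array([0.0, 2.0, 4.0, 6.0]) + np.tile([-1.0, 0.0, 1.0], 4)
X = np.ones((12, 1))
V_list = [Z @ Z.T, np.eye(12)]      # family component, then within-family error
minque1 = giesbrecht_step(y, X, V_list, priors=[1.0, 1.0])
```

With priors of one the pass corresponds to a MINQUE1-type calculation; supplying the true components as priors would correspond to the MIVQUE-type calculation. For balanced data such as this toy set the solution does not depend on the priors and equals the analysis-of-variance estimates (19/3 for families and 1 for error here).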
The MINQUE estimator used was MINQUE1, i.e., ones as priors for all variance
components, calculated by applying Giesbrecht's algorithm non-iteratively. MINQUE1 was
chosen because of results demonstrating MINQUE0 (prior of 1 for the error term and of 0 for
all others) to be an inferior estimation technique for many cases (Swallow and Monahan 1984,
R.C. Littell unpublished data).
With normally-distributed uncorrelated random variables, the use of the true values of
the variance components as priors in a non-iterative application of Giesbrecht’s algorithm
produced the MIVQUE solutions (equation 4-9). Obtaining true MIVQUE estimation is a luxury
of computer simulation and would not be possible in practice since the true variance components
are required (Swallow and Searle 1978). This estimator was included to provide a standard of
comparison for other estimators. An additional MIVQUE-type estimator, referred to as
MIVPEN, was also included. MIVPEN was also a non-iterative application of the algorithm with
the true variance components as priors; however, this estimator was conditioned on the variance
component parameter space and did not allow negative estimates. The non-negative conditioning
of MIVPEN was accomplished by adding a penalty algorithm to MIVQUE such that no variance
component was allowed to be less than $1 \times 10^{-7}$. Estimates from MIVPEN were equal to MIVQUE
for data sets for which there were no negative MIVQUE variance component estimates. When
negative MIVQUE estimates occurred the two techniques were no longer equivalent. The penalty
algorithm operated by using $\Delta = \hat\sigma^2 - \sigma^2$ and by choosing a scalar weight w such that no element
of $\sigma^2_{new}$ is less than $1 \times 10^{-7}$. Then $\sigma^2_{new} = \sigma^2 + w\Delta$, where $\Delta$ is the vector of departure from the
true values ($\sigma^2$), $1 \times 10^{-7}$ is an arbitrary constant and $\sigma^2_{new}$ is the vector of estimated variance
components conditioned on non-negativity.
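One way the penalty step could be coded is shown below. The text does not say how the weight w was chosen, so this sketch assumes the largest w not exceeding 1 that keeps every element at or above the floor; the function name `mivpen` and the example values are hypothetical.

```python
import numpy as np

def mivpen(sigma2_true, sigma2_hat, floor=1e-7):
    """Shrink MIVQUE estimates toward the true values until none is below floor.

    Assumes every true component exceeds floor (an assumption of this sketch).
    """
    sigma2_true = np.asarray(sigma2_true, dtype=float)
    sigma2_hat = np.asarray(sigma2_hat, dtype=float)
    delta = sigma2_hat - sigma2_true          # departure from the true values
    w = 1.0                                   # w = 1 reproduces MIVQUE exactly
    low = sigma2_hat < floor
    if np.any(low):
        # For each offending component, the weight that places it on the floor;
        # the smallest such weight keeps every component at or above the floor.
        w = np.min((floor - sigma2_true[low]) / delta[low])
    return sigma2_true + w * delta

constrained = mivpen([0.25, 0.5, 1.0], [-0.25, 0.7, 0.9])
```

Because one scalar weight is applied to the whole vector, every component moves toward its true value whenever any single component would have been negative.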
REML estimates were from repeated application of Giesbrecht’s algorithm (equation 4-9)
in which the estimates from the $k^{th}$ iteration become the priors for the $(k+1)^{th}$ iteration. The
iterations were stopped when the difference between the estimates from the $k^{th}$ and $(k+1)^{th}$
iterations met the convergence criterion; then the estimates of the $(k+1)^{th}$ iteration became the
REML estimates. The convergence criterion utilized was based on $\sum_{i=1}^{r} |\hat\sigma^2_{i(k+1)} - \hat\sigma^2_{i(k)}|$; this criterion imposed convergence to the fourth decimal place for all variance components. Since
for this experimental workload it was desired that the simulation run with little analyst
intervention and in as few iterations as possible, the robustness of REML solutions obtained from
Giesbrecht’s algorithm to priors (or starting points) was explored. The difference in solutions
starting from two distinct points (a vector of ones and the true values) was compared over 2000
data sets of different structures (imbalance, true variance components, and field design). The
results (agreeing with those of Swallow and Monahan 1984) indicated that the difference between
the two solutions was entirely dependent on the stringency of the convergence criterion and not
on the starting point (priors). Also the number of iterations required for convergence was greatly
decreased by using the true values as priors. Thus, all REML estimates were calculated starting
with the true values as priors.
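The iteration described above can be sketched as a loop that feeds each solution back in as the next set of priors. The tolerance of 1e-4 on the summed absolute change is an assumption suggested by the fourth-decimal-place remark, and the function name `reml_giesbrecht`, `max_iter` and the toy data are hypothetical; the inner pass is repeated compactly so that the block stands alone.

```python
import numpy as np

def reml_giesbrecht(y, X, V_list, start, tol=1e-4, max_iter=100):
    """Iterate Eq. 4-9 until successive estimates differ by less than tol."""
    est = np.asarray(start, dtype=float)
    for iteration in range(1, max_iter + 1):
        V = sum(s * Vi for s, Vi in zip(est, V_list))
        Vinv = np.linalg.inv(V)
        VinvX = Vinv @ X
        Q = Vinv - VinvX @ np.linalg.pinv(X.T @ VinvX) @ VinvX.T
        QV = [Q @ Vi for Vi in V_list]
        lhs = np.array([[np.trace(A @ B) for B in QV] for A in QV])
        Qy = Q @ y
        rhs = np.array([Qy @ Vi @ Qy for Vi in V_list])
        new = np.linalg.solve(lhs, rhs)          # negative values are allowed
        converged = np.sum(np.abs(new - est)) < tol
        est = new
        if converged:
            return est, iteration
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical balanced toy data: 4 families x 3 trees
Z = np.kron(np.eye(4), np.ones((3, 1)))
y = Z @ np.array([0.0, 2.0, 4.0, 6.0]) + np.tile([-1.0, 0.0, 1.0], 4)
X = np.ones((12, 1))
V_list = [Z @ Z.T, np.eye(12)]
from_ones, n_ones = reml_giesbrecht(y, X, V_list, start=[1.0, 1.0])
from_other, n_other = reml_giesbrecht(y, X, V_list, start=[5.0, 2.0])
```

The toy data are balanced, so both starting points reach the same solution after the second pass; unbalanced data generally need more passes, which is where good starting values save iterations.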
Three alternatives for coping with negative estimates after convergence were used for
REML solutions: accept and use the negative estimates (Shaw 1987), arbitrarily set negative
estimates to zero, and re-solve the system setting negative estimates to zero (Miller 1973). The
first two alternatives are self-explanatory and the latter is accomplished by re-analyzing those data
sets in which the initial unrestricted REML estimates included one or more negative estimates.
During re-analysis if a variance component became negative, it was set to zero (could never be
any value other than zero) and the iterations continued. This procedure persisted until the
convergence criterion was met with a solution in which all variance components were either
positive or zero.
Harville (1977) suggested several adaptations of Henderson’s mixed model equations
(Henderson et al. 1959) which do not allow variance component estimates to become negative;
however, the estimates can become arbitrarily close to zero. After trial of these techniques
versus the approach of setting negative estimates to zero after convergence and re-solving the system,
comparison of results using the same data sets indicated that there is little practical advantage
(although it is more desirable theoretically) in using the approach suggested by Harville. The
differences between sets of estimates obtained by the two methods are extremely minor (solving
the system with a variance component set to zero versus arbitrarily close to zero).
ML solutions, as iterative applications of equation 4-10, were calculated from the same
starting points and with the same convergence criterion as REML solutions. The three negative
variance component alternatives explored for ML were to accept and use the negative estimates,
to arbitrarily set negative estimates to zero after convergence, and (for
half-sib data only) to re-solve the system setting negative variance components to zero.
The algorithm to calculate solutions for HM3 (sequentially adjusted sums of squares) was
based on the upper triangular G2 sweep (Goodnight 1979) and Hartley’s method of synthesis
(Hartley 1967). The equation solved was $E\{MS\}\,\sigma^2 = MS$ where MS is the vector of mean
squares and E{MS} is their expectation. The alternative used for negative estimates was to accept
and use the negative estimates.
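As a small worked instance of solving $E\{MS\}\,\sigma^2 = MS$, consider a balanced one-way layout, for which the expected mean squares are standard. The numbers are hypothetical; in the unbalanced designs of this study the coefficients would come from the method of synthesis rather than being written down by hand.

```python
import numpy as np

# Rows: mean squares (between families, within families).
# Columns: coefficients of (family variance, within-family variance) in E{MS}.
n_per_family = 3
expected_ms = np.array([[n_per_family, 1.0],
                        [0.0,          1.0]])
ms = np.array([20.0, 1.0])                   # hypothetical observed mean squares
sigma2 = np.linalg.solve(expected_ms, ms)    # the solution may contain negatives
```

Nothing in the solve step constrains the result, so a between-family mean square smaller than the within-family mean square would yield a negative family component, consistent with accepting negative estimates for HM3.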

Comparison Among Estimation Techniques
For the simulation MIVQUE estimates were the basis for all comparisons because
MIVQUE is by definition the minimum variance quadratic unbiased estimator. The results of
comparing the mean of 1000 MIVQUE estimates for an experimental level to the means for other
techniques were termed "apparent bias". "Apparent bias" denotes that 1000 data sets were not
sufficient to achieve complete convergence to the true values of the variance components.
Sampling variances of estimation were calculated from the 1000 observations within an
experimental level and estimation technique for variance components and genetic ratios (single
tree heritability, Type B correlation and dominance to additive variance ratio). Mean square
error then equalled variance plus squared "apparent bias". While mean square error was
investigated, there was never sufficient bias for mean square error to lead to a different decision
concerning techniques than sampling variance of the estimates; so mean square error was deleted
from the remainder of this discussion.
Probability of nearness is the probability that an estimate will lie within a certain interval
around the true parameter. The three total interval widths utilized were one-half, equal to, and
twice the parameter size. The percentages of 1000 estimates falling within these intervals were
calculated for the different estimation techniques within an experimental level for variance
components and ratios and utilized as an estimate of probability of nearness.
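The probability-of-nearness calculation can be sketched as follows. It assumes the interval is centered on the true parameter and includes its end points, which the text does not state explicitly; the function name and the estimates are made up for illustration.

```python
import numpy as np

def probability_of_nearness(estimates, true_value, width_factor):
    """Percentage of estimates inside an interval of total width
    width_factor * true_value centered on the true parameter."""
    half_width = 0.5 * width_factor * abs(true_value)
    inside = np.abs(np.asarray(estimates, dtype=float) - true_value) <= half_width
    return 100.0 * inside.mean()

estimates = np.array([-0.10, 0.05, 0.20, 0.24, 0.30, 0.45, 0.90])  # made-up values
nearness = {f: probability_of_nearness(estimates, 0.25, f) for f in (0.5, 1.0, 2.0)}
```

The three factors correspond to total interval widths of one-half, equal to, and twice the parameter size.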
Results are presented by variance component or genetic ratio estimated as a percentage
of MIVQUE (except in the case of probability of nearness). MIVQUE estimates represent 100%
with estimates with greater variance having values larger than 100% and "apparently biased"
estimates having values different from 100%. The percentages were calculated as equal to 100
times the estimate divided by the MIVQUE value. For the criterion of variance, the lower the
percentage the better the estimator performed; for bias, values equalling 100% (0 bias) are
preferred; and for probability of nearness, larger percentages (probabilities) are favored since
they are indicative of greater density of estimates near the parametric value.
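The percentage summaries can be computed as in the sketch below. It assumes that "apparent bias" is measured against the mean of the MIVQUE estimates and that sampling variances use the n - 1 divisor; the function name and the example values are hypothetical.

```python
import numpy as np

def compare_with_mivque(estimates, mivque_estimates):
    """Express mean and sampling variance as percentages of the MIVQUE values."""
    est = np.asarray(estimates, dtype=float)
    ref = np.asarray(mivque_estimates, dtype=float)
    apparent_bias = est.mean() - ref.mean()
    variance = est.var(ddof=1)
    return {
        "bias_pct": 100.0 * est.mean() / ref.mean(),          # 100 means no apparent bias
        "variance_pct": 100.0 * variance / ref.var(ddof=1),   # below 100 is more precise
        "mse": variance + apparent_bias ** 2,
    }

summary = compare_with_mivque([0.0, 0.2, 0.3, 0.3], [0.1, 0.2, 0.3, 0.4])
```

In this made-up case the technique shows a mean at 80% of the MIVQUE mean and a sampling variance at 120% of the MIVQUE variance.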
Results and Discussion
Variance Components
Sampling variance of the estimators
For all variance components estimated, REML and ML estimation techniques were
consistently equal to or less than MIVQUE for sampling variance of the estimator (Table 4-3).
The variance among estimates from these techniques was further reduced by setting the negative
components to zero (MODML and MODREML) or setting negative estimates to zero plus re-solving
the system (NNREML, NNML, and PNNREML). Variance among MINQUE1 estimates
is always equal to or greater than for MIVQUE, as one might expect, since they are, in this
application, the same technique with MIVQUE having perfect priors (the true values). Variances
for HM3 estimators (TYPE3 and PTYPE3) are either equal to or greater than MIVQUE (HM3
estimates have progressively larger relative variance with higher levels of imbalance). MIVPEN,
although impractical because of the need for the true priors, had much more precise estimates of
variance components than other techniques illustrating what could be accomplished given the true
values as priors plus maintaining estimates within the parameter space.
In general, the spread among the percentages for variance of estimation for the estimation
techniques is highly dependent on the degree of imbalance and the type of mating system. With
increasing imbalance the likelihood-based estimators realized greater advantage for sampling
variance of the estimates over HM3 for both mating systems. The most advantageous application
of likelihood-based estimators is in the H1S65 case where the imbalance is not only random
deletions of individuals but also incomplete connectedness across locations, i.e. the same families
are not present in each test (akin to incomplete blocks within a test).
Table 4-3. Sampling variance for the estimates of $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number), and $h^2$ (third number where calculated) as a percentage of the MIVQUE estimate by type of estimator
and treatment combination; NA is not applied. Values greater than 100 indicate larger variance
among 1000 estimates.
Estimator  Estimate            D1S80   D1S65   D2S65   H1S80   H1S65
REML       $\sigma^2_g$         99.9   102.6   101.5    99.6   106.3
           $\sigma^2_{tg}$     100.2   100.0   104.1    99.7    98.0
           $h^2$               100.0   101.0   101.4    99.6   105.8
ML         $\sigma^2_g$         77.3    78.2    76.4    95.9   103.9
           $\sigma^2_{tg}$     106.9   104.8   110.7   100.8    99.1
           $h^2$                82.5    82.9    86.4    96.2   103.8
MINQUE1    $\sigma^2_g$        100.0   104.2   104.0   104.0   146.7
           $\sigma^2_{tg}$     101.2   118.8   123.6   112.5   139.7
           $h^2$               100.3   105.8   103.9   104.0   145.8
NNREML     $\sigma^2_g$         80.8    71.6    95.2    88.0    68.6
           $\sigma^2_{tg}$      67.9    48.3    54.9    78.7    48.6
           $h^2$                76.8    64.2    92.2    87.3    67.7
NNML       $\sigma^2_g$           NA      NA      NA    83.3    65.3
           $\sigma^2_{tg}$        NA      NA      NA    79.4    48.9
           $h^2$                  NA      NA      NA    83.1    64.7
MODML      $\sigma^2_g$         58.2    50.0    69.5    84.7    74.6
           $\sigma^2_{tg}$      12.8    81.4    81.6    86.6    68.5
           $h^2$                58.1    46.1    72.0    83.8    71.4
MODREML    $\sigma^2_g$         81.5    74.5    96.1    88.9    78.1
           $\sigma^2_{tg}$      89.1    74.0    73.7    85.4    66.9
           $h^2$                76.4    63.5    88.9    87.7    74.3
TYPE3      $\sigma^2_g$        101.0   101.0   105.5   100.6   121.0
           $\sigma^2_{tg}$     101.1   101.0   115.5   100.9   125.6
           $h^2$               100.5   108.4   102.9   100.4   121.6
PREML      $\sigma^2_g$        100.3   106.3   101.7   107.5   146.9
           $\sigma^2_{tg}$     102.7   113.5   119.8   122.0   150.7
PML        $\sigma^2_g$         77.6    81.9    77.1   103.6   143.4
           $\sigma^2_{tg}$     109.7   117.3   127.2   123.3   151.9
PMINQUE1   $\sigma^2_g$        100.3   107.6   105.4   107.5   179.3
           $\sigma^2_{tg}$     102.7   129.0   137.3   122.0   180.6
PNNREML    $\sigma^2_g$         80.9    71.1    93.9    92.7    86.6
           $\sigma^2_{tg}$      69.8    53.2    60.5    94.0    68.1
PTYPE3     $\sigma^2_g$        100.3   106.6   105.4   107.5   168.1
           $\sigma^2_{tg}$     102.7   124.7   133.3   122.0   184.9
           $h^2$               100.6   110.8   104.1   106.9   168.0
MIVPEN     $\sigma^2_g$           NA    36.2    29.1    80.0    45.6
           $\sigma^2_{tg}$        NA    26.6    20.0    74.3    39.6
           $h^2$                  NA    34.7    30.2    79.8    45.4
PMIVQUE    $\sigma^2_g$        100.3   104.2   102.4   107.5   146.9
           $\sigma^2_{tg}$     102.7   114.4   117.8   122.0   150.7
An analysis of variance was conducted to determine the importance of the treatment of
negative variance component estimates in the variance of estimation for REML and ML estimates.
The model of sampling variance of the estimates as a result of mating design, imbalance level,
treatment of negative estimates and size of the variance component demonstrated consistently (for
all variance components except error) that treatment of negative estimates is an important
component of the variance of the estimates (p < .05). The model accounted for up to 99% of
the variation in the variance of the variance component estimates with 1) accepting and using
negative estimates producing the highest variance; 2) setting the negative components to zero
being intermediate; and 3) re-solving the system with negative estimates set to zero providing the
lowest variance.
For all estimation techniques, lower variance among estimates was obtained by using
individual observations as compared to plot means. The advantage of individual over plot-mean
observations increased with increasing imbalance.
Bias
The most consistent performance for bias (Table 4-4) across all variance components was
TYPE3, known from inherent properties to be unbiased. The consistent convergence of the
TYPE3 value to the MIVQUE value indicated that the number of data sets used (1000 per
technique and experimental level) was suitable for the purpose of examining bias. The other two
consistent performers were REML and MINQUE1. PTYPE3 (HM3 based on plot means) was
unbiased when no plot means were missing, but produced "apparently biased" estimates when
plot means were missing.
Table 4-4. Bias for the estimates of $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number), and $h^2$ (third
number where calculated) as a percentage of the MIVQUE estimate by type of estimator and
experimental combination; NA is not applied. Values different from 100 denote "apparent" bias.
Estimator  Estimate            D1S80   D1S65   D2S65   H1S80   H1S65
REML       $\sigma^2_g$         99.9   101.5    98.7    99.9   102.8
           $\sigma^2_{tg}$      99.9   102.2    99.8    99.9    98.9
           $h^2$                99.9   101.3    98.6    99.9   102.6
ML         $\sigma^2_g$         74.6    61.6    76.0    96.2    98.2
           $\sigma^2_{tg}$     106.5   114.6   109.7   101.3   101.8
           $h^2$                75.5    61.8    77.9    96.3    98.2
MINQUE     $\sigma^2_g$         99.7    96.4    99.0    99.4   102.0
           $\sigma^2_{tg}$     100.1   100.8   101.3   100.8    98.3
           $h^2$                99.7    96.6    98.9    99.4   101.3
NNREML     $\sigma^2_g$        107.9   116.5    98.1   101.9   107.8
           $\sigma^2_{tg}$      93.1    92.9    92.9   100.5   102.3
           $h^2$               108.7   118.4    98.2   102.2   107.7
NNML       $\sigma^2_g$           NA      NA      NA   101.9   107.8
           $\sigma^2_{tg}$        NA      NA      NA   100.5   102.3
           $h^2$                  NA      NA      NA    98.2   103.8
MODML      $\sigma^2_g$         86.6    90.4    79.0    98.1   114.1
           $\sigma^2_{tg}$     109.9   129.9   127.4   101.3   122.9
           $h^2$                87.8    91.5    79.4    99.6   112.6
MODREML    $\sigma^2_g$        109.5   124.2   100.6   103.1   117.8
           $\sigma^2_{tg}$     103.7   119.8   119.2   104.6   120.6
           $h^2$               109.5   123.2    98.4   102.9   116.2
TYPE3      $\sigma^2_g$        100.1    99.4    99.6   100.2    99.6
           $\sigma^2_{tg}$     100.2   101.0   102.4   100.2   100.9
           $h^2$               100.0    99.5    99.3   100.2    99.7
PREML      $\sigma^2_g$         99.7    98.7    97.7    99.5   110.6
           $\sigma^2_{tg}$     100.1   103.6   100.2   102.4    98.3
PML        $\sigma^2_g$         74.2    58.5    73.6    95.9   105.2
           $\sigma^2_{tg}$     106.9   116.2   111.5   103.2   102.0
PMINQUE    $\sigma^2_g$         99.7    95.2    98.8    99.5   106.5
           $\sigma^2_{tg}$     100.1   102.1   102.9   102.4   114.8
PNNREML    $\sigma^2_g$        107.9   114.5    96.7   101.8   115.6
           $\sigma^2_{tg}$      92.9    94.0    95.0   104.5   110.2
PTYPE3     $\sigma^2_g$         99.7    96.8    99.0    99.5   104.5
           $\sigma^2_{tg}$     100.1    97.2    96.0   102.4   108.7
           $h^2$                99.8    98.0    98.8    99.6   104.1
MIVPEN     $\sigma^2_g$           NA   107.5    98.6   102.0   103.2
           $\sigma^2_{tg}$        NA    99.0    91.7   101.4   105.1
           $h^2$                  NA   112.6   103.9   102.1   103.4
PMIVQUE    $\sigma^2_g$         99.7    97.4    99.2    99.5   106.8
           $\sigma^2_{tg}$     100.1   101.7   100.5   102.4    98.8

Table 4-5. Probability of nearness for $\sigma^2_g$ (upper number), $\sigma^2_{tg}$ (second number), and $h^2$ (third
number where calculated). The probability interval is equal to the magnitude of the parameter.
Estimator  Estimate            D1S80   D1S65   D2S65   H1S80   H1S65
REML       $\sigma^2_g$         32.8    24.3    41.8    45.3    28.6
           $\sigma^2_{tg}$      43.0    26.2    25.7    36.6    27.1
           $h^2$                34.2    25.3    45.4    45.0    28.3
ML         $\sigma^2_g$         33.6    22.3    40.7    45.4    29.2
           $\sigma^2_{tg}$      42.9    26.4    24.8    36.2    26.7
           $h^2$                34.6    22.3    45.0    45.7    28.2
MINQUE     $\sigma^2_g$         32.6    24.6    41.0    45.1    26.1
           $\sigma^2_{tg}$      43.1    24.3    25.4    34.2    23.2
           $h^2$                33.7    25.0    44.6    44.7    25.6
NNREML     $\sigma^2_g$         33.4    23.4    41.7    45.1    29.3
           $\sigma^2_{tg}$      44.9    28.1    25.6    38.0    28.9
           $h^2$                34.3    24.3    46.1    45.2    29.5
NNML       $\sigma^2_g$           NA      NA      NA    45.9    29.7
           $\sigma^2_{tg}$        NA      NA      NA    37.9    29.1
           $h^2$                  NA      NA      NA    46.0    29.0
TYPE3      $\sigma^2_g$         34.0    23.2    42.5    45.3    27.1
           $\sigma^2_{tg}$      42.6    27.1    24.8    37.3    25.0
           $h^2$                35.3    23.8    45.8    45.9    27.3
PREML      $\sigma^2_g$         32.1    20.0    41.6    43.7    24.6
           $\sigma^2_{tg}$      42.7    26.8    24.6    32.3    20.4
PML        $\sigma^2_g$         33.5    19.8    39.7    44.0    24.4
           $\sigma^2_{tg}$      41.0    26.3    23.6    31.6    21.1
PMINQUE    $\sigma^2_g$         32.1    21.4    40.4    43.7    24.5
           $\sigma^2_{tg}$      42.7    24.8    23.1    32.3    21.9
PNNREML    $\sigma^2_g$         31.9    19.2    41.0    43.4    26.0
           $\sigma^2_{tg}$      43.3    28.0    23.3    33.1    21.3
PTYPE3     $\sigma^2_g$         32.1    23.3    41.7    43.7    25.2
           $\sigma^2_{tg}$      42.7    25.4    24.1    32.3    22.4
           $h^2$                32.6    24.1    46.0    44.6    24.6
MIVQUE     $\sigma^2_g$         33.6    25.7    43.7    45.1    29.2
           $\sigma^2_{tg}$      42.9    28.6    26.4    36.9    26.3
           $h^2$                34.8    26.8    47.7    45.4    29.4
MIVPEN     $\sigma^2_g$           NA    41.1    78.5    48.4    35.6
           $\sigma^2_{tg}$        NA    47.0    60.3    39.2    31.2
           $h^2$                  NA    42.4    80.5    48.7    35.3
PMIVQUE    $\sigma^2_g$         32.1    20.0    41.8    43.7    25.9
           $\sigma^2_{tg}$      42.7    28.5    26.8    32.3    20.8
Among estimators which displayed bias, maximum likelihood estimators (ML and PML)
were known to be inherently biased (Harville 1977, Searle 1987) with the amount of bias
proportional to the number of degrees of freedom for a factor versus the number of levels for the
factor. Other biases resulted from the method of dealing with negative estimates. Living with
negative estimates produced the estimators with the least bias. Setting negative variance
components to zero resulted in the greatest bias. Intermediate in bias were the estimates resulting
from re-solving the system with negative components set to zero.
Probability of nearness
Results for probability of nearness proved to be largely non-discriminatory among
techniques (Table 4-5). The low levels of probability density near the parametric values are
indicative of the nature of the variance component estimation problem. Figure 4-1 illustrates the
distribution of MIVQUE estimates for $h^2$ (4-1a) and $\sigma^2_g$ (4-1b) for experimental level D1S80. The distributions for all unconstrained variance component estimates have the appearance
of a chi-square distribution, positively skewed with the expected value (mean) occurring to the
right of the peak probability density and a proportion of the estimates occurring below zero
(except error). With increasing imbalance, the variance among estimates increases and the
probability of nearness decreases for all interval widths.
Ratios of Variance Components
Single tree heritability
Results for estimates of single tree heritability adjusted for locations and blocks are shown
in Tables 4-3 and 4-4 (third number from the top in each cell, if calculated). For these relatively
low heritabilities (0.1 and 0.25), the bias and variance properties of the estimated ratio are similar
to those for $\sigma^2_g$ estimates (Figure 4-1). This implies that knowing the properties of the numerator
of heritability reveals the properties of the ratio (especially true of ratios with expected values of
0.1 and 0.25, Kendall and Stuart 1963, Ch. 10). Variance component estimation techniques
which performed well for bias and/or variance among estimates for $\sigma^2_g$ also performed well for
$h^2$.
[Figure 4-1: histograms of percent of estimates (vertical axis) against MIVQUE estimates from 1000 data sets (horizontal axis); panel 4-1a, $h^2$; panel 4-1b, $\sigma^2_g$.]
Figure 4-1. Distribution of 1000 MIVQUE estimates of $h^2$ (4-1a) and $\sigma^2_g$ (4-1b) for experimental
level D1S80 illustrating the positive skew and similarity of the distributions. The true values are
.1 for $h^2$ and .25 for $\sigma^2_g$. The interval width of the bars is one-half the parametric value.
Type B correlation and dominance to additive variance ratio
Type B correlation (Tables 4-3 and 4-4 as $\sigma^2_{tg}$) and dominance to additive variance ratio
(not shown) estimates both proved to be too unstable (extremely large variance among estimates)
in their original formulations to be useful in discrimination among variance component estimation
techniques. This high variance is due to the estimates of the denominators of these ratios
approaching zero and to the high variance of the denominator of ratios (Table 4-2). These ratios
were reformulated with numerators of interest ($4\sigma^2_{tg}$ for additive genetic by test interaction and
$4\sigma^2_s$ for dominance variance, respectively) and a denominator equal to the estimate of the
phenotypic variance. With this reformulation the variance and bias properties of estimates of the
altered ratios are approximated by the properties of estimates of the numerators.
For increasing imbalance maximum-likelihood-based estimation offers an increasing
advantage over HM3, and for all techniques individual observations offer increasing advantage
over plot-mean observations for variance of the estimates of these ratios. Bias, other than
inherently biased methods (ML), is associated with the probability of negative estimates which
is increased by increasing imbalance. This assertion is supported by comparing the biases of
REML, NNREML, and MODREML estimates across imbalance levels.
General Discussion
Observational Unit
Some general conclusions regarding the choice of a variance component estimation
methodology can be drawn from the results of this investigation. For any degree of imbalance
the use of individual observations is superior to the use of plot means for estimation of variance
components or ratios of variance components. If the data are nearly balanced (close to 100%
survival with no missing plots, crosses (full-sib) or lack of connectedness (half-sib)), the
properties of the estimation techniques based on individual and plot-mean observations become
similar; so if departure from balance is nominal, plot means can be used effectively. However,
using individual observations obviates the need for a survey of imbalance in the data since
individual observations produce better results than plot means for any of the estimation techniques
examined.
Negative Estimates
Drawing on the results of this investigation, the discussion of practical solutions for the
negative estimates problem will revolve around two solutions: 1) accept and use the negative
estimates; and 2) re-solve the system with negative estimates set to zero.
Given that the property of interest is the true value of a variance component or genetic
ratio, often estimated as a mean across data sets, then negativity constraints come into play if the
component of interest is small in comparison to other underlying variance components in the data,
or the variance of estimates is high due to an inadequate experimental design for variance
component estimation. These factors lead to an increased number of negative estimates. If the
data structure is such that negative estimates would occur frequently, then accepting negative
estimates is a good alternative.
If negative estimates tend to occur infrequently or bias is of less concern than variance
among estimates, then re-solving the system whenever the converged solution contains negative estimates is the
preferable solution. This tactic reduces both bias and variance among estimates below that of
arbitrarily setting negative estimates to zero.
Estimation Technique
The primary competitors among estimation techniques that are practically achievable are
REML and TYPE3 (HM3). Both techniques produce estimates with little or no bias; however,
REML estimates for the most part have slightly less sampling variance than TYPE3 estimates.
If only subsets of the parents are in common across tests as in the case H1S65, REML has a
distinct advantage in variance among estimates over TYPE3.
REML does have three additional advantages over TYPE3 which are 1) REML offers
generalized least squares estimation of fixed effects while TYPE3 offers ordinary least squares
estimation; 2) Best Linear Unbiased Predictions (BLUP) of random variables are inherent in
REML solutions, i.e., gca predictions are available; and thus in solving for the variance
components with REML, fixed effects are estimated and random variables are predicted
simultaneously (Harville 1977); and 3) REML offers greater flexibility in the model specification
both in univariate and multivariate forms as well as heterogeneous or correlated error terms.
Further, although the likelihood equations for common REML applications are based on
normality, the technique has been shown to be robust against the underlying distribution (Westfall
1987, Banks et al. 1985).
Recommendation
If one were to choose a single variance component estimation technique from among
those tested which could be applied to any data set with confidence that the estimates had
desirable properties (variance, MSE, and bias), that technique would be REML and the basic unit
of observation would be the individual. This combination (REML plus individual observations)
performed well across mating design and types and levels of imbalance. Treatment of negative
estimates would be determined by the proposed use of the estimates, that is, whether unbiasedness
(accepting and using the negative estimates) is more important than sampling variance (re-solve
the system setting negative estimates to zero).
A primary disadvantage of REML and individual observations is that they are both
computationally expensive (computer memory and time). HM3 estimation could replace REML
on many data sets and plot means could replace individual observations on some data sets; but
general application of these without regard to the data at hand does result in a loss in desirable
properties of the estimates in many instances.
The computational expense of REML and individual observations ensures that estimates
have desirable properties for a broad scope of applications. With the advent of bigger and faster
computers and the evolution of better REML algorithms, what was not feasible in the past on
most mainframe computers can now be accomplished on personal computers.

CHAPTER 5
GAREML: A COMPUTER ALGORITHM FOR
ESTIMATING VARIANCE COMPONENTS AND
PREDICTING GENETIC VALUES
Introduction
The computer program described in this chapter, called GAREML for Giesbrecht’s
algorithm of restricted maximum likelihood estimation (REML), is useful for both estimating
variance components and predicting genetic values. GAREML applies the methodology of
Giesbrecht (1983) to the problems of REML estimation (Patterson and Thompson 1971) and best
linear unbiased prediction (BLUP, Henderson 1973) for univariate (single trait) genetics models.
GAREML can be applied to half-sib (open-pollinated or polymix) and full-sib (partial diallels,
factorials, half-diallels [no selfs] or disconnected sets of half-diallels) mating designs when planted
in single or multiple locations with single or multiple replications per location. When used for
variance component estimation, this program has been shown to provide estimates with desirable
properties across types of imbalance commonly encountered in forest genetics field tests (Huber
et al. in press) and with varying underlying distributions (Banks et al. 1985, Westfall 1987).
GAREML is also useful for determining efficiencies of alternative field and mating designs for
the estimation of variance components.
Utilizing the power of mixed-model methodology (Henderson 1984), GAREML provides
BLUP of parental general (gca) and specific combining abilities (sca) as well as generalized least
squares (GLS) solutions for fixed effects. The application of BLUP to forest genetics problems
has been addressed by White and Hodge (1988, 1989). With certain assumptions, the desirable
properties of BLUP predictions include maximizing the probability of obtaining correct parental
rankings from the data and minimizing the error associated with using the parental values
obtained in future applications. GLS fixed effect estimation weights the observations comprising
the estimates by their associated variances approximating best linear unbiased estimation (BLUE)
for fixed effects (Searle 1987, p 489-490).
The purpose of this chapter is to describe the theory and use of GAREML in enough
detail to facilitate use by other investigators. The program is written in FORTRAN and is not
dependent on other analysis programs. An interactive version of this program can be obtained
as a stand-alone executable file from the senior author; this file will run on any IBM compatible
PC under DOS or WINDOWS^2 operating systems. The size of the problem an investigator can
solve will be dependent on the amount of extended memory and hard disk space (for swap files)
available for program use. In addition, the FORTRAN source code can be obtained for analysts
wishing to compile the program for use on alternate systems (e.g. mainframe computers).
Algorithm
GAREML proceeds by reading the data and forming a design matrix based on the number
of levels of factors in the model. Any portions of the design matrix for nested factors or
interactions are formed by horizontal direct product. Columns of zeroes in the design matrix (the
result of imbalance) are then deleted. The design matrix columns are in an order specified by
Giesbrecht’s algorithm: columns for fixed effects are first, followed by the data vector, and the
last section of the matrix is for random effects. The design matrix is the only fully formed
matrix in the program. All other matrices are symmetric; therefore, to save computational space
and time, only the diagonal and the above-diagonal portions of matrices are formed and utilized
(i.e., half-stored).
^2 Windows is the trademark of the Microsoft Corporation, Redmond, WA.
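The construction of interaction columns by horizontal direct product, followed by deletion of all-zero columns, can be sketched as below. This is only an illustration in Python (GAREML itself is written in FORTRAN); the function names and the five toy records are hypothetical and are not taken from the program.

```python
def indicator_columns(labels):
    """Return (levels, columns): one 0/1 column per distinct level, in first-seen order."""
    levels = list(dict.fromkeys(labels))
    columns = [[1 if lab == lev else 0 for lab in labels] for lev in levels]
    return levels, columns


def horizontal_direct_product(cols_a, cols_b):
    """Element-wise product of every column of A with every column of B."""
    return [[a * b for a, b in zip(ca, cb)] for ca in cols_a for cb in cols_b]


def drop_zero_columns(columns):
    """Delete columns that are entirely zero (factor combinations absent from the data)."""
    return [c for c in columns if any(c)]


# Hypothetical records: (test, female parent) for five trees; parent B is missing in test 2.
tests = ["1", "1", "1", "2", "2"]
females = ["A", "B", "B", "A", "A"]

_, test_cols = indicator_columns(tests)
_, female_cols = indicator_columns(females)

# Test-by-female interaction: 2 x 2 = 4 candidate columns, one of which is all zeroes.
interaction = horizontal_direct_product(test_cols, female_cols)
interaction = drop_zero_columns(interaction)
```

Columns are held as lists here purely for readability; a production program would store them in whatever layout suits the later dot-product step.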
A half-stored matrix of the dot products of the design columns is formed and either kept
in common memory or stored in temporary disk space so that the matrix is available for recall
in the iterative solution process. The algorithm proceeds by modifying the matrix of dot products
such that the inverse of the covariance matrix for the observations (V) is enclosed by the column
specifiers in the dot products, i.e., $X'X$ becomes $X'V^{-1}X$. This transfer is completed without
inversion of the total V matrix. The identity used to accomplish this transfer is
if $V_h = a_h Z_h Z_h' + V_{h+1}$, where $V_h$ is nonsingular;
then
$$V_h^{-1} = V_{h+1}^{-1} - a_h V_{h+1}^{-1} Z_h \left(I_h + a_h Z_h' V_{h+1}^{-1} Z_h\right)^{-1} Z_h' V_{h+1}^{-1}. \tag{5-1}$$
A compact form of equation 5-1 is obtained by pre-multiplying by $Z_i'$ and post-multiplying by
$Z_j$, where h = 1, ..., k-1 (k = the total number of random factors), $a_h$ is the prior associated with
random variable h, $V_k = \sigma^2_k I$, $V_1 = V$ and $Z_i$ is the portion of the design matrix for random
variable i (Giesbrecht 1983). A partitioned matrix is formed in order to update until $V_1^{-1}$
or $V^{-1}$ is obtained. This matrix is of the form:
$$\begin{bmatrix} I_h + a_h Z_h' V_{h+1}^{-1} Z_h & \sqrt{a_h}\, Z_h' V_{h+1}^{-1} (X|y|Z_1|\ldots|Z_{k-1}) \\ \sqrt{a_h}\, (X|y|Z_1|\ldots|Z_{k-1})' V_{h+1}^{-1} Z_h & T_{h+1} \end{bmatrix} \tag{5-2}$$
where $T_{h+1} = (X|y|Z_1|\ldots|Z_{k-1})' V_{h+1}^{-1} (X|y|Z_1|\ldots|Z_{k-1})$.
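The update expressed by equations 5-1 and 5-2 can be checked numerically on a toy problem. The sketch below is an illustration only, not GAREML code: it assumes one random factor plus residual (k = 2), uses made-up data and priors, works with full rather than half-stored matrices, and uses a textbook form of the sweep operator. Sweeping the upper-left partition of the matrix in 5-2 should leave, in the lower-right partition, the same matrix as forming $(X|y|Z_1)'V^{-1}(X|y|Z_1)$ directly.

```python
import math


def sweep(a, k):
    """Return a copy of square matrix `a` swept on pivot k."""
    d = a[k][k]
    n = len(a)
    out = [row[:] for row in a]
    out[k] = [v / d for v in a[k]]
    for i in range(n):
        if i != k:
            b = a[i][k]
            out[i] = [a[i][j] - b * out[k][j] for j in range(n)]
            out[i][k] = -b / d
    out[k][k] = 1.0 / d
    return out


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Invert a positive definite matrix by sweeping every pivot (used only for the check)."""
    for k in range(len(a)):
        a = sweep(a, k)
    return a


# Hypothetical toy problem: 4 observations, a mean, one random factor with 2 levels.
X = [[1.0], [1.0], [1.0], [1.0]]
y = [[3.0], [5.0], [6.0], [10.0]]
Z1 = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
W = [xr + yr + zr for xr, yr, zr in zip(X, y, Z1)]   # (X | y | Z1)
a1, sigma2_k = 2.0, 1.5                              # assumed priors; V_2 = sigma2_k * I

Zt, Wt = transpose(Z1), transpose(W)
T2 = [[v / sigma2_k for v in row] for row in matmul(Wt, W)]
A = [[(1.0 if i == j else 0.0) + a1 * v / sigma2_k for j, v in enumerate(row)]
     for i, row in enumerate(matmul(Zt, Z1))]
B = [[math.sqrt(a1) * v / sigma2_k for v in row] for row in matmul(Zt, W)]
Bt = transpose(B)

# Partitioned matrix of equation 5-2, then sweep its upper-left partition.
M = [A[i] + B[i] for i in range(len(A))] + [Bt[i] + T2[i] for i in range(len(T2))]
for h in range(len(A)):
    M = sweep(M, h)
T1_swept = [row[len(A):] for row in M[len(A):]]

# Direct computation with the full V, which the algorithm is designed to avoid.
V1 = [[a1 * v + (sigma2_k if i == j else 0.0) for j, v in enumerate(row)]
      for i, row in enumerate(matmul(Z1, Zt))]
T1_direct = matmul(matmul(Wt, inverse(V1)), W)

max_diff = max(abs(s - d) for rs, rd in zip(T1_swept, T1_direct) for s, d in zip(rs, rd))
```

The only matrix swept in the update has one row per level of the random factor, which is the practical point of the identity: the n x n matrix V never has to be inverted.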
The sweep operator of Goodnight (1979) is applied to the upper left partition of the
matrix (equation 5-2) and the result of equation 5-1 is obtained. The matrix is sequentially
updated and swept until $T_1 = (X|y|Z_1|\ldots|Z_{k-1})'V^{-1}(X|y|Z_1|\ldots|Z_{k-1})$ is obtained. $T_1$ is then
swept on the columns for fixed effects ($X'V^{-1}X$). This sweep operation produces generalized least
squares estimates for fixed effects, results which can be scaled into predictions of random
variables, the residual sum of squares and all the necessary ingredients for assembling the
equation to solve for the variance components. The equation to be solved for the variance
components is
$$\underset{r \times r}{\{\operatorname{tr}(QV_iQV_j)\}}\;\underset{r \times 1}{\sigma^2} = \underset{r \times 1}{\{y'QV_iQy\}}$$
then
$$\hat{\sigma}^2 = \{\operatorname{tr}(QV_iQV_j)\}^{-1}\{y'QV_iQy\}; \tag{5-3}$$
where $\{\operatorname{tr}(QV_iQV_j)\}$ is a matrix whose elements are $\operatorname{tr}(QV_iQV_j)$ where i = 1 to r and
j = 1 to r, i.e., there is a row and column for every random variable in
the linear model;
tr is the trace operator that is the sum of the diagonal elements of a matrix;
$Q = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}$ for V as the covariance matrix of y and X as
the design matrix for fixed effects;
$V_i = Z_iZ_i'$ where the i's are the random variables;
$\hat{\sigma}^2$ is the vector of variance component estimates; and
r is the number of random variables in the model (k-1).
The entire procedure from forming $T_1$ to solving for the variance components continues
until the variance component estimates from the last iteration are no more different from the
estimates of the previous iteration than the convergence criterion specifies. The fixed effect
estimates and predictions of random variables are then those of the final iteration. The
asymptotic covariance matrix for the variance components is obtained as
$$\mathrm{Var}(\hat{\sigma}^2) = 2\{\operatorname{tr}(QV_iQV_j)\}^{-1} \tag{5-4}$$
by utilizing intermediate results from the solution for the variance components.
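Equations 5-3 and 5-4 can be illustrated by brute force on a very small data set. The sketch below is not Giesbrecht's algorithm: it forms V and Q explicitly (exactly what GAREML avoids), uses an ordinary inverse because the toy X has full column rank, and treats the residual as one of the components being solved for. The data, the helper names, and the 30-iteration limit paired with a 1e-4 criterion (the defaults mentioned for GAREML) are assumptions made for illustration.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def reml_step(y, X, Vs, sigma2):
    """One pass of equation 5-3 from the current values; also returns 2 * inverse trace matrix (5-4)."""
    n, r = len(y), len(Vs)
    V = [[sum(s * Vi[a][b] for s, Vi in zip(sigma2, Vs)) for b in range(n)] for a in range(n)]
    Vinv = inverse(V)
    XtVinv = matmul(transpose(X), Vinv)
    P = matmul(matmul(transpose(XtVinv), inverse(matmul(XtVinv, X))), XtVinv)
    Q = [[Vinv[a][b] - P[a][b] for b in range(n)] for a in range(n)]
    QV = [matmul(Q, Vi) for Vi in Vs]
    S = [[sum(matmul(QV[i], QV[j])[d][d] for d in range(n)) for j in range(r)] for i in range(r)]
    Qy = matmul(Q, y)
    q = [[matmul(matmul(transpose(Qy), Vi), Qy)[0][0]] for Vi in Vs]
    Sinv = inverse(S)
    return [row[0] for row in matmul(Sinv, q)], [[2.0 * v for v in row] for row in Sinv]


# Hypothetical balanced data: 3 families with 2 observations each, mean as the only fixed effect.
y = [[10.0], [12.0], [14.0], [15.0], [19.0], [21.0]]
X = [[1.0] for _ in range(6)]
Z = [[1.0 if i // 2 == g else 0.0 for g in range(3)] for i in range(6)]
identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
Vs = [matmul(Z, transpose(Z)), identity]     # V_i = Z_i Z_i' for family and residual

sigma2 = [1.0, 1.0]                          # priors of 1.0, as suggested when none are known
for iteration in range(30):
    new_sigma2, avcm = reml_step(y, X, Vs, sigma2)
    change = sum(abs(n - o) for n, o in zip(new_sigma2, sigma2))
    sigma2 = new_sigma2
    if change < 1e-4:
        break
```

Because these toy data are balanced and the analysis-of-variance estimates are positive, the iteration should settle on the ANOVA values (family component (MSB - MSW)/2 and residual MSW), which gives a simple hand check.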
The coefficient matrix of Henderson’s mixed model equations is formed in order to
calculate the covariance matrix for fixed and random effects. The covariance matrix for
observations is constructed using the variance component estimates from Giesbrecht’s algorithm.
The coefficient matrix is
$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + D^{-1} \end{bmatrix} \tag{5-5}$$
where R is the error covariance matrix, which in this application is $I\sigma^2_w$;
X is the fixed effects design matrix;
Z is the random effects design matrix; and
D is the covariance matrix for the random variables which, in this
application, has variance components on the diagonal and zeroes on the
off-diagonal (no covariance among random variables).
The generalized inverse of the matrix (equation 5-5) is the error covariance matrix of the fixed
effect estimates and random predictions assuming the covariance matrix for observations is known
without error.
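For orientation, the coefficient matrix in equation 5-5 is the left-hand side of Henderson's mixed model equations. Written out in full (with $\hat{b}$ for the fixed effect estimates and $\hat{u}$ for the predictions, symbols chosen here for illustration), and then simplified for the case $R = I\sigma^2_w$ used in this application, they are:

```latex
\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + D^{-1} \end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix}
\qquad\Longrightarrow\qquad
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \sigma^2_w D^{-1} \end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
```

The second form follows from multiplying both sides by $\sigma^2_w$. If it is used, the generalized inverse of its coefficient matrix must be multiplied by $\sigma^2_w$ to recover the error covariance matrix described above.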
Operating GAREML
While GAREML will run in either batch or interactive mode, we focus on the interactive
PC-version which begins by prompting the analyst to answer questions determining the factors
to be read from the data. Specifically, the analyst answers yes or no to these questions: 1) are
there multiple locations? 2) are there multiple blocks? 3) are there disconnected sets of full-sibs
(usually disconnected half-diallels)? and 4) is the mating design half-sib or full-sib?
The program then determines the proper variables to read from the data as well as the most
complicated (number of main factors plus interactions) scalar linear model allowed.
The most complicated linear model allowed for full-sib observations is
$$y_{ijklm} = \mu + t_i + b_{ij} + set_o + g_k + g_l + s_{kl} + tg_{ik} + tg_{il} + ts_{ikl} + p_{ijkl} + w_{ijklm} \tag{5-6}$$
where $y_{ijklm}$ is the $m^{th}$ observation of the $kl^{th}$ cross in the $j^{th}$ block of the $i^{th}$ test;
$\mu$ is the population mean;
$t_i$ is the random or fixed variable test environment;
$b_{ij}$ is the random or fixed variable block;
$set_o$ is the random or fixed variable set, i.e., a variable is created so that
disconnected sets of half-diallels planted in the same experiment can be
analyzed in the same run or to analyze provenances and families within
provenance where provenance equals set; sets are assumed to be across test
environments and blocks with families nested within sets, and interactions with
set are assumed unimportant;
$g_k$ is the random variable female general combining ability (gca);
$g_l$ is the random variable male gca;
$s_{kl}$ is the random variable specific combining ability (sca);
$tg_{ik}$ is the random variable test by female gca interaction;
$tg_{il}$ is the random variable test by male gca interaction;
$ts_{ikl}$ is the random variable test by sca interaction;
$p_{ijkl}$ is the random variable plot;
$w_{ijklm}$ is the random variable within-plot; and
there is no covariance between random variables in the model.
The assumptions utilized are that the variances for female and male random variables are equal
($\sigma^2_{g_k} = \sigma^2_{g_l} = \sigma^2_g$) and that female and male environmental interactions are the same
($\sigma^2_{tg_{ik}} = \sigma^2_{tg_{il}} = \sigma^2_{tg}$).
The most complicated scalar linear model allowed for half-sib observations is
$$y_{ijkm} = \mu + t_i + b_{ij} + set_o + g_k + tg_{ik} + ph_{ijk} + wh_{ijkm} \tag{5-7}$$
where $y_{ijkm}$ is the $m^{th}$ observation of the $k^{th}$ half-sib family in the $j^{th}$ block of the $i^{th}$ test;
$\mu$, $t_i$, $b_{ij}$, $set_o$, $g_k$, and $tg_{ik}$ retain the definition in the full-sib equation;
$ph_{ijk}$ is the random variable plot containing different genotype by environment
components than the full-sib model;
$wh_{ijkm}$ is the random variable within-plot containing different levels of
genotypic and genotype by environment components than the full-sib model;
and there is no covariance between random variables in the model.
The analyst builds the linear model by answering further prompts. If test, block and/or
set are in the model, they must be declared as fixed or random effects. When any of the three
effects is declared random, the analyst must furnish prior values for the variance. If no prior
value is known, 1.0’s may be used as priors. Using 1.0’s as priors will not affect the values for
resulting variance component estimates within the constraints of the convergence criterion; but
there may be a time penalty due to increasing the number of iterations required for convergence.
All remaining factors in the model are treated as random variables.
To complete the definition of the model, the analyst chooses to include or exclude each
possible factor by answering yes or no when prompted. After each yes answer, the program asks
for a prior value for the variance. Again, if no known priors exist, 1.0’s may be substituted.
After the model has been specified, the program counts the number of fixed effects and the
number of random effects and asks if the number fits the model expected. A "yes" answer
proceeds through the program while a "no" returns the program to the beginning.
GAREML is now ready to read the data file (which must be an ASCII data file) in this
order: test, block, set, female, male, and the response variable. The analyst is prompted to
furnish a proper FORTRAN format statement for the data. Test, block, set, female and male are
read as character variables (A fields) with as many as eight characters per field, while the data
vector (response variable) is read as a double precision variable (F field). An example of a
format statement for a full-sib mating design across locations and blocks is "(4A8,F10.5)" which
reads four character variables sequentially occupying 8 columns each and the response variable
beginning in column 33 and ending in column 42 having five decimal places.
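To make the column layout implied by "(4A8,F10.5)" concrete, the sketch below reads one such record in Python. It is only an illustration of the layout, not part of GAREML; the function name and the sample record are hypothetical.

```python
def read_record(line):
    """Split a record laid out as four 8-column character fields plus a 10-column number."""
    # Columns 1-32: test, block, female, male (FORTRAN A8 fields).
    labels = [line[i * 8:(i + 1) * 8].strip() for i in range(4)]
    # Columns 33-42: response variable (FORTRAN F10.5 field).
    # Note: FORTRAN would assume five decimals if the field held no decimal point;
    # float() does not, so the decimal point must be present in the data here.
    response = float(line[32:42])
    return labels, response


# Hypothetical record: test "1", block "2", female "P07", male "P12", response 19.07165.
record = "{:<8}{:<8}{:<8}{:<8}{:>10.5f}".format("1", "2", "P07", "P12", 19.07165)
labels, response = read_record(record)
```

Because the parental identifiers are read as character fields, labels such as "P07" need not be numeric.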
After reading the data, GAREML begins to furnish information to the analyst. This
information should be scanned to make sure the data read are correct. This information includes
the number of parents, the number of full-sib crosses, the number of observations, the maximum
number of fixed effect design matrix columns, and the maximum number of random effect design
matrix columns. If there is an error at this point, use CTRL-BRK to exit the program. Probable
causes of errors are the data are not in the format specified, missing values are included, blank
lines or other similar errors are in the data file, or the model was not correctly specified.
At this point, there are three other prompts concerning the data analysis (number of
iterations, convergence criterion and treatment of negative variance components). The number
of iterations is arbitrarily set to 30 and can be changed at the analyst’s discretion. No warning
is issued that the maximum number of iterations has been reached; however, the current iteration
number and variance component estimates are output to the screen at the beginning of each
iteration. The convergence criterion used is the sum of the absolute values of the difference
between variance component estimates for consecutive iterations. The criterion has been set to
$1 \times 10^{-4}$, meaning that convergence is required to the fourth decimal place for all variance
components. The convergence criterion should be modified to suit the magnitude of the variances
under consideration as well as the practical need for enhanced resolution. Enhanced resolution
is obtained at the cost of increasing the number of iterations to convergence.
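The convergence test described above amounts to a single comparison per iteration. A minimal sketch follows; the function name is hypothetical, and whether GAREML uses a strict or non-strict inequality is not stated, so the `<=` here is an assumption.

```python
def converged(previous, current, criterion=1e-4):
    """True when the summed absolute change in all variance components is within the criterion."""
    return sum(abs(c - p) for p, c in zip(previous, current)) <= criterion


# Summed change of about 3e-5: treated as converged under the default criterion.
small_change = converged([1.22144, 0.23328], [1.22143, 0.23330])

# Summed change of 0.2: another iteration is needed.
large_change = converged([1.0, 1.0], [1.2, 1.0])
```

Because the criterion is a sum over components, it bounds the change in every individual component, but it is not scaled: for variances much smaller or larger than 1 the analyst would need to choose the criterion accordingly, as the text advises.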
The analyst must decide whether to accept and use negative estimates or to set negative
estimates to zero and re-solve the system. The latter solution results in variance component
estimates with lower sampling variance and slight bias. If one is interested in unbiased estimates
of variance components that have a high probability of negative estimates, then accepting and
using the negative estimates may be the proper course to take.
Interpreting GAREML Output
Analysis is now underway. The priors for each iteration and the iteration number are
printed out to the screen. GAREML continues to iterate until the convergence criterion is met
or the maximum number of iterations is reached. The next time that analyst intervention is
required is to provide a name for the output file for variance component estimates. The file name
follows normal DOS file naming protocol; however, alternative directories may not be specified,
i.e., all outputs will be found in the same directory as the data file. The program will now quiz
the analyst to determine if additional outputs are desired. These additional outputs are gca
predictions, sca predictions (if applicable), the asymptotic covariance matrix for the variance
components, generalized least squares fixed effect estimates, error covariance matrix of the gca
predictions and error covariance matrix for fixed effects. An answer of yes to the inclusion of
an output will result in a prompting for a file name. In addition, for gca and sca predictions the
analyst may input a different value for $\sigma^2_{gca}$ or $\sigma^2_{sca}$ with which to scale predictions. The
discussion which follows furnishes more detailed information concerning GAREML outputs.
Variance Component Estimates
Ignoring concerns about convergence to a global maximum and negative values, variance
component estimates are restricted maximum likelihood estimates of Patterson and Thompson
(1971). The estimates are robust against starting values (priors), i.e., the same estimates, within
the limits of the convergence criterion, can be obtained from diverse priors. However, priors
close to the true values will, in general, reduce the number of iterations required to reach
convergence. The value of the convergence criterion must be less than or equal to the desired
precision for the variance components. REML variance component estimates from this program
have been shown to have more desirable properties (variance and bias) than other commonly used
estimation techniques (maximum likelihood, minimum norm quadratic unbiased estimation and
Henderson’s Method 3) over a wide range of data imbalance. The properties of the estimates are
further enhanced by using individual observations as data rather than plot means. The output is
labelled by the variance component estimated.
Predictions of Random Variables
The predictions output are for general and specific combining abilities and approximate
best linear unbiased predictions (BLUP) of the random variables. BLUP predictions have several
optimal properties: 1) the correlation between the predicted and true values is maximized; 2) if
the distribution is multivariate normal then BLUP maximizes the probability of obtaining the
correct rankings (Henderson 1973) and so maximizes the probability of selecting the best
candidate from any pair of candidates (Henderson 1977).
Predictions are of the form:
$$\hat{u} = \hat{D}Z'V^{-1}(y - X\hat{b}) \tag{5-8}$$
where $\hat{u}$ is the vector of predictions;
$\hat{D}$ is the estimated covariance matrix for random variables from the REML
variance component estimates, see equation 5-5;
$Z'$ is the transpose of the design matrix for random variables;
y is the data vector;
X is the design matrix for fixed effects;
$\hat{b}$ is the vector of fixed effect estimates; and
V is the estimated covariance matrix for observations from REML variance
component estimates.
NOTE: if predictions are desired based on prior values for the variance components, set the
number of iterations to 1 after having input the desired values as priors.
Predictions are output as a labelled vector.
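The shrinkage behaviour of equation 5-8 is easiest to see in a special case far simpler than the models GAREML fits. As an illustration only, assume a balanced one-way random model $y_{ij} = \mu + u_i + e_{ij}$ with $n$ observations per level, $\mathrm{Var}(u_i) = \sigma^2_u$ and $\mathrm{Var}(e_{ij}) = \sigma^2_e$ (this model and its symbols are not taken from the chapter). Equation 5-8 then reduces to:

```latex
\hat{u}_i \;=\; \frac{n\,\sigma^2_u}{\sigma^2_e + n\,\sigma^2_u}\,\bigl(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\bigr)
```

Each prediction is the deviation of a level mean from the overall mean, multiplied by a factor between 0 and 1 that approaches 1 as $n$ or $\sigma^2_u$ grows. With unbalanced data the factor differs among levels, which is why the rankings of predictions need not match the rankings of raw means.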
Asymptotic Covariance Matrix of Variance Components
The output for the asymptotic covariance matrix (AVCM) of variance components is from
equation 5-4. This output represents the variance of repeated minimum variance quadratic
unbiased variance component estimates using the same experimental design if the estimates are
equal to the true values.
This technique has been used for simulation work to define optimal mating and field
designs (McCutchan et al. 1989). The AVCM is used to create the asymptotic variance of linear
combinations of estimates of variance components as
$$\mathrm{Var}(L'\hat{\sigma}^2) = L'\,\mathrm{Var}(\hat{\sigma}^2)\,L \tag{5-9}$$
where L specifies the linear combination(s) of variance components;
$\hat{\sigma}^2$ is the vector of variance component estimates; and
$\mathrm{Var}(\hat{\sigma}^2)$ is the AVCM from equation 5-4.
The diagonal elements of $L'\mathrm{Var}(\hat{\sigma}^2)L$ are the variances of the linear combinations and the
off-diagonal elements are the covariances between the linear combinations. These values are then
useful for Taylor series approximation of the variance of a ratio of linear combinations such as
heritability. AVCM is output as a vector (half-stored matrix) and each row of the output is
labelled.
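As a sketch of how equation 5-9 feeds a Taylor series (delta method) approximation, the code below uses the Model 1B estimates and AVCM reported in the Example section of this chapter. The heritability definition used (four times the gca variance over a phenotypic variance of 2 gca + sca + 2 loc x gca + loc x sca + plot + error) is an assumption made for illustration and is not stated in this chapter; the function names are likewise hypothetical.

```python
def quad_form(left, matrix, right):
    """Compute left' * matrix * right for plain Python lists."""
    return sum(l * matrix[i][j] * r for i, l in enumerate(left) for j, r in enumerate(right))


def ratio_variance(num, den, var_num, var_den, cov):
    """First-order Taylor series approximation of Var(num / den)."""
    ratio = num / den
    return ratio ** 2 * (var_num / num ** 2 + var_den / den ** 2 - 2.0 * cov / (num * den))


# Model 1B estimates, in the order GCA, SCA, LOCxGCA, LOCxSCA, BLOCKxFAM, ERROR.
sigma2 = [1.160636, 0.003190, 0.0, 0.0, 0.753049, 7.375388]

# Half-stored AVCM for Model 1B (rows of the upper triangle), expanded to a full matrix.
upper = [
    [0.7902569240, -0.0490465017, 0.0, 0.0, 0.0003970615, -0.0001155675],
    [0.2047376344, 0.0, 0.0, -0.1319741909, -0.0020057997],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [1.6336304265, -1.2680804956],
    [2.0069152440],
]
n = len(upper)
avcm = [[upper[min(i, j)][abs(i - j)] for j in range(n)] for i in range(n)]

# Assumed linear combinations: numerator 4*gca; denominator = assumed phenotypic variance.
L_num = [4.0, 0.0, 0.0, 0.0, 0.0, 0.0]
L_den = [2.0, 1.0, 2.0, 1.0, 1.0, 1.0]

num = sum(l * s for l, s in zip(L_num, sigma2))
den = sum(l * s for l, s in zip(L_den, sigma2))
var_num = quad_form(L_num, avcm, L_num)      # equation 5-9, diagonal element
var_den = quad_form(L_den, avcm, L_den)      # equation 5-9, diagonal element
cov_nd = quad_form(L_num, avcm, L_den)       # equation 5-9, off-diagonal element

h2 = num / den
var_h2 = ratio_variance(num, den, var_num, var_den, cov_nd)
```

With so few observations the approximate standard error of the ratio is large relative to the ratio itself, which is consistent with the chapter's remark that the example data set is not an optimal application.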

Fixed Effect Estimates
Fixed effect estimates are those of generalized least squares and are in a set to zero
format. Set to zero format (commonly seen in SAS^3 output) is characterized by the last level of
a main effect or nested effect being set to zero. These estimates are approximately best linear
unbiased estimates (BLUE) of the fixed effects because the covariance matrix for observations
was estimated and not known without error. Kackar and Harville (1981) have shown, for a broad
class of variance estimators, that the fixed effects estimates are still unbiased. The word "Best"
in BLUE refers to the properties of minimum variance for the class of unbiased estimators.
Generalized least squares estimates, in set to zero format, for fixed effects are of the
form:
$$\hat{b} = (X'V^{-1}X)^{-}X'V^{-1}y \tag{5-10}$$
where $\hat{b}$, X, V and y are as defined in equation 5-8.
Fixed effect estimates are output as a labelled vector.
Error Covariance Matrices
The error covariance matrices for predictions and fixed effect estimates are obtained by
producing a generalized inverse of equation 5-5 (Henderson 1984, McLean 1989). Since all
covariance matrices are symmetric, the output is in the form of a vector which is equivalent to
a half-stored matrix. Output for error of gca predictions is labeled while the error of fixed effects
is not. The labeling on gca errors makes the unlabelled output for fixed effect variance self-explanatory.
The error covariance matrix for gca predictions can be converted to the covariance
matrix for gca predictions by forming the covariance matrix for the gca random variables and
subtracting the error covariance matrix. The covariance matrix for predictions has been denoted
as $\mathrm{Var}(\hat{g})$ by White and Hodge (1989).
^3 SAS is the registered trademark of SAS Institute Inc., Cary, North Carolina.
Example
The following discussion involves the analysis of a simulated data set in order to further
demonstrate the outputs of GAREML.
Data
The data (Table 5-1) was generated using a six-parent half-diallel mating design and a
randomized complete block field design. The field design is in two locations with four complete
blocks per location and two trees per family per block. The underlying genetic parameters for
the data are individual tree heritability equals 0.25, Type B correlation equals 0.8, dominance to
additive variance ratio equals 0.25 and the population mean equals 15.0. After a balanced data
set was generated, the observations were subjected to 40% random deletion (simulating 60%
survival). The data set is comprised of a small number of observations and while not an optimal
application of GAREML serves well as an illustration.
Analysis
The analysis was carried out with two different linear models using individual
observations as the data. Each model contained eight sources of variation and was from equation
5-6 without the variable set. In model 1, test environment and blocks within test are declared
fixed. The subsequent model (model 2) has all random effects except the mean. Variance
components are estimated with model 1 receiving two different treatments of negative estimates,
i.e., live with the negative estimates (model 1A) or re-solve the system setting negative estimates
to zero (model 1B). The different models and methods for dealing with negative estimates are
demonstrated so that the reader can see a range of outputs from GAREML.
Table 5-1. Data for example of GAREML operation. L, Bl, F, M, T and RV stand for location,
block, female, male, tree and response variable, respectively. A proper FORTRAN read format would
be (A2,T5,A2,T9,A2,T13,A2,T22,F10.5).
L   Bl  F   M   T            RV
1   1   1   2   1      19.07165
1   1   1   3   1      13.17908
1   1   1   6   1      14.33610
1   1   1   6   2      12.48194
1   1   2   3   1       7.57821
1   1   2   3   2      12.73262
1   1   2   5   1      18.38451
1   1   2   5   2       9.84538
1   1   2   6   1      15.60306
1   1   2   6   2      17.44872
1   1   3   4   1      14.59613
1   1   3   5   1      16.95861
1   1   3   5   2      15.02863
1   1   3   6   1      15.95634
1   1   4   5   1      19.13362
1   1   4   5   2      12.08240
1   1   4   6   1       5.37647
1   1   5   6   1      18.87956
1   2   1   3   2      16.79470
1   2   1   5   1      15.81553
1   2   1   5   2      19.77063
1   2   1   6   1      17.49746
1   2   1   6   2      18.81207
1   2   2   3   1      15.03569
1   2   2   5   1      11.68149
1   2   2   6   2      12.78227
1   2   3   4   1      13.39599
1   2   3   5   1      13.54873
1   2   3   5   2      12.00935
1   2   3   6   1      16.89523
1   2   3   6   2      20.48223
1   2   4   5   1      15.21563
1   2   4   6   1      14.21138
1   2   4   6   2      15.65649
1   2   5   6   1      21.36959
1   2   5   6   2      16.39244
1   3   1   3   1      18.83196
1   3   1   3   2      20.45754
1   3   1   4   1      14.10900
1   3   1   4   2      16.49369
1   3   1   6   2      14.25154
1   3   2   3   1      19.57695
1   3   2   5   2      12.38303
1   3   2   6   2      17.12110
1   3   3   4   1      13.03351
1   3   3   4   2      13.20463
1   3   3   5   2      12.44908
1   3   4   5   1      14.28528
1   3   5   6   1      17.57996
1   3   5   6   2      16.57026
1   4   1   3   1      16.91731
1   4   1   3   2      18.36209
1   4   1   4   2      16.70828
1   4   1   5   2      21.29535
1   4   1   6   1      15.23314
1   4   2   3   1      12.14596
1   4   2   3   2      12.20679
1   4   2   4   1      11.83520
1   4   2   6   1      14.27080
1   4   3   4   1      14.34923
1   4   3   4   2      16.39791
1   4   3   5   1      12.17513
1   4   3   5   2      14.95300
1   4   4   5   2      11.63311
1   4   4   6   1      13.29654
1   4   4   6   2      15.90303
1   4   5   6   1      17.22657
1   4   5   6   2      10.04577
2   1   1   2   2       9.80034
2   1   1   3   1      12.12891
2   1   1   3   2      18.00497
2   1   1   4   1      12.68041
2   1   1   4   2      13.14452
2   1   1   6   1      19.19915
2   1   2   3   1       5.36263
2   1   2   3   2      13.39351
2   1   2   5   2      11.13499
2   1   2   6   1      13.46429
2   1   2   6   2      16.87729
2   1   3   4   2       9.24115
2   1   3   6   2      13.49004
2   1   4   5   1      11.88620
2   1   4   5   2       9.83032
2   1   4   6   1      11.46474
2   1   4   6   2      12.68435
2   1   5   6   1      16.66260
2   1   5   6   2      14.14226
2   2   1   2   1      15.77378
2   2   1   3   1      13.28328
2   2   1   4   1      11.22915
2   2   1   4   2       9.94041
2   2   1   5   2      14.03251
2   2   1   6   2      20.41990
2   2   2   3   1      10.74312
2   2   2   4   1       6.72215
2   2   2   5   1      12.77779
2   2   2   5   2      11.10388
2   2   3   4   1      12.52286
2   2   3   5   1       8.02745
2   2   4   5   1      14.14567
2   2   4   5   2      11.85937
2   2   4   6   2      14.61252
2   2   5   6   1      10.56892
2   2   5   6   2      14.13368
2   3   1   2   1      21.17819
2   3   1   3   1      13.56761
2   3   1   4   1       9.35457
2   3   1   5   1      13.78936
2   3   1   6   1      11.12412
2   3   2   3   1       9.41810
2   3   2   3   2      12.77555
2   3   2   4   1      15.38449
2   3   2   4   2       9.64170
2   3   2   5   2      11.64608
2   3   2   6   1      11.79241
2   3   2   6   2       9.14105
2   3   3   4   1       8.92909
2   3   3   6   1       8.08095
2   3   4   5   1      10.13996
2   3   4   5   2      10.30808
2   3   4   6   1       9.88286
2   3   4   6   2       8.80803
2   3   5   6   1      11.65281
2   3   5   6   2       7.90006
2   4   1   3   1      12.72744
2   4   1   3   2      14.44072
2   4   1   4   1      14.67983
2   4   1   5   1       9.27305
2   4   1   5   2      16.99880
2   4   1   6   1      14.17835
2   4   2   3   1      14.14628
2   4   2   3   2      10.64403
2   4   2   4   1      16.55552
2   4   2   5   1      10.30221
2   4   2   5   2      13.24760
2   4   3   4   2       8.44671
2   4   3   5   1      14.12292
2   4   3   5   2      14.17583
2   4   3   6   1      13.92882
2   4   3   6   2      16.18924
2   4   4   5   1       8.89750
2   4   4   5   2       9.79576
2   4   4   6   1      12.29319
2   4   4   6   2       9.16987
2   4   5   6   1      14.85018
2   4   5   6   2      16.69414
Output
Variance component estimates
The variance component estimates are
Model 1A
SIGMA-SQUARED GCA            1.221435
SIGMA-SQUARED SCA            0.233278
SIGMA-SQUARED LOCxGCA       -0.096850
SIGMA-SQUARED LOCxSCA       -0.548142
SIGMA-SQUARED BLOCKxFAM      1.242110
SIGMA-SQUARED ERROR          7.285051;
Model 1B
SIGMA-SQUARED GCA            1.160636
SIGMA-SQUARED SCA            0.003190
SIGMA-SQUARED LOCxGCA        0.000000
SIGMA-SQUARED LOCxSCA        0.000000
SIGMA-SQUARED BLOCKxFAM      0.753049
SIGMA-SQUARED ERROR          7.375388; and
Model 2
SIGMA-SQUARED LOCATION       3.430921
SIGMA-SQUARED BLOCK(LOC)     0.000000
SIGMA-SQUARED GCA            1.233609
SIGMA-SQUARED SCA            0.000000
SIGMA-SQUARED LOCxGCA        0.000000
SIGMA-SQUARED LOCxSCA        0.000000
SIGMA-SQUARED BLOCKxFAM      0.960168
SIGMA-SQUARED ERROR          7.197284.
These variance component estimates illustrate outputs for the random model, the mixed model
and the alternatives for dealing with negative estimates.
Fixed effect estimates
Fixed effect estimates are
Model 1B
MU                13.085052
LOCATION     1     1.805455
LOCATION     2     0.000000
BLOCK(LOC)   1    -0.475396
BLOCK(LOC)   2     0.856959
BLOCK(LOC)   3     0.844716
BLOCK(LOC)   4     0.000000
BLOCK(LOC)   5    -0.219529
BLOCK(LOC)   6    -0.526635
BLOCK(LOC)   7    -1.682449
BLOCK(LOC)   8     0.000000; and
Model 2
MU                13.809567.
The interpretation of fixed effect estimates for model 1B is that blocks 1 through 4 belong with
location 1 and the fourth block is set to zero. Blocks 5 through 8 are those of location 2 and the
eighth block is set to zero as well as location 2. Sets of blocks within location can always be
determined by the last block within a location being set to zero. The interpretation of set to zero
is that MU is the mean of the fourth block (labelled block 8) in location two; and any estimable
function of the fixed effects can be generated from these estimates. An example of an estimable
function would be the site mean of location 1. This mean would be estimated as MU +
LOCATION 1 + 1/4(BLOCK(LOC) 1 + BLOCK(LOC) 2 + BLOCK(LOC) 3 + BLOCK(LOC) 4).
MU of model 2 is the estimate of the general mean across sites if all other factors are
random. All of these estimates are the result of generalized least squares estimation.
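The site mean described above can be worked through directly from the Model 1B estimates. The sketch below writes the estimable function as a vector of coefficients applied to the estimates in output order (MU, two locations, eight blocks); the variable names are hypothetical, and the second function, for location 2, is added here by analogy.

```python
# Model 1B fixed effect estimates in output order:
# MU, LOCATION 1-2, BLOCK(LOC) 1-8 (set to zero format).
estimates = [13.085052, 1.805455, 0.000000,
             -0.475396, 0.856959, 0.844716, 0.000000,
             -0.219529, -0.526635, -1.682449, 0.000000]

# Coefficients for the mean of location 1: MU + LOCATION 1 + average of blocks 1 to 4.
l_site1 = [1, 1, 0, 0.25, 0.25, 0.25, 0.25, 0, 0, 0, 0]
# By analogy, the mean of location 2: MU + LOCATION 2 + average of blocks 5 to 8.
l_site2 = [1, 0, 1, 0, 0, 0, 0, 0.25, 0.25, 0.25, 0.25]

site_mean_1 = sum(c * e for c, e in zip(l_site1, estimates))
site_mean_2 = sum(c * e for c, e in zip(l_site2, estimates))
```

The difference between the two site means is itself an estimable function, obtained by subtracting the two coefficient vectors.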
Asymptotic covariance matrix for the variance components
The asymptotic covariance matrix for the variance components in model 1B would appear
as
ASYMPTOTIC VARIANCE COVARIANCE MATRIX
GCA         GCA          0.7902569240
GCA         SCA         -0.0490465017
GCA         LOCxGCA      0.0000000000
GCA         LOCxSCA      0.0000000000
GCA         BLOCKxFAM    0.0003970615
GCA         ERROR       -0.0001155675
SCA         SCA          0.2047376344
SCA         LOCxGCA      0.0000000000
SCA         LOCxSCA      0.0000000000
SCA         BLOCKxFAM   -0.1319741909
SCA         ERROR       -0.0020057997
LOCxGCA     LOCxGCA      0.0000000000
LOCxGCA     LOCxSCA      0.0000000000
LOCxGCA     BLOCKxFAM    0.0000000000
LOCxGCA     ERROR        0.0000000000
LOCxSCA     LOCxSCA      0.0000000000
LOCxSCA     BLOCKxFAM    0.0000000000
LOCxSCA     ERROR        0.0000000000
BLOCKxFAM   BLOCKxFAM    1.6336304265
BLOCKxFAM   ERROR       -1.2680804956
ERROR       ERROR        2.0069152440
This matrix, as are all other matrices output, is half-stored. The output is read as "GCA GCA"
is the asymptotic variance of the gca variance component. The next row labelled "GCA SCA"
is the asymptotic covariance between the estimates of the gca variance component and the sca
variance component. Thus the next four rows are asymptotic covariances of gca variance
estimates with the other random variables in the model. The other rows are read in a like manner
and if the analyst wished to array the output as a matrix, all necessary components are at hand.
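Arraying a half-stored output as a full symmetric matrix can be sketched as follows, using the Model 1B asymptotic covariance values listed above. The function name is hypothetical; the only assumption about layout is the one the labels show, namely that the vector runs along each row of the upper triangle in turn.

```python
def unpack_half_stored(values, n):
    """Expand a half-stored (upper triangle, row by row) vector into a full n x n matrix."""
    if len(values) != n * (n + 1) // 2:
        raise ValueError("expected n(n+1)/2 values for an n x n symmetric matrix")
    full = [[0.0] * n for _ in range(n)]
    remaining = iter(values)
    for i in range(n):
        for j in range(i, n):
            full[i][j] = full[j][i] = next(remaining)
    return full


# Model 1B output, in the order GCA, SCA, LOCxGCA, LOCxSCA, BLOCKxFAM, ERROR.
half_stored = [
    0.7902569240, -0.0490465017, 0.0, 0.0, 0.0003970615, -0.0001155675,
    0.2047376344, 0.0, 0.0, -0.1319741909, -0.0020057997,
    0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0,
    1.6336304265, -1.2680804956,
    2.0069152440,
]
avcm = unpack_half_stored(half_stored, 6)
```

The same function applies to the error covariance outputs, since they follow the same half-stored convention.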
Predictions of random variables
All predictions of random variables are appropriately labelled according to the character
name read from the data and for model 1B would appear as
(from the gca output)
GCA 1       1.573253
GCA 2      -0.356262
GCA 3      -0.423469
GCA 4      -1.310747
GCA 5      -0.054977
GCA 6       0.572202;
(from the sca output)
SCA 1 2     0.003806
SCA 1 3     0.002662
SCA 1 4    -0.002028
SCA 1 5     0.001562
SCA 1 6    -0.001678
SCA 2 3    -0.003976
SCA 2 4     0.001827
SCA 2 5    -0.003550
SCA 2 6     0.000914
SCA 3 4    -0.000036
SCA 3 5    -0.002495
SCA 3 6     0.002681
SCA 4 5     0.000656
SCA 4 6    -0.004021
SCA 5 6     0.003676.
All these predictions are approximately best linear unbiased predictions and are approximate
because the variance components were estimated from the same data.

Error covariance matrix of the predictions
The error covariance matrix of the predictions is output as a half-stored matrix with each
row appropriately labelled. This matrix for model 1B appears as
THE ERROR VARIANCE COVARIANCE MATRIX FOR GCA ARRAYED AS A VECTOR
0.3618685934 1 1
0.1692300980 1 2
0.1465129987 1 3
0.1583039830 1 4
0.1713608386 1 5
0.1533590404 1 6
0.3687218966 2 2
0.1382132356 2 3
0.1730487382 2 4
0.1543784409 2 5
0.1570431430 2 6
0.3545855963 3 3
0.1622943256 3 4
0.1744667783 3 5
0.1845626177 3 6
0.3518724881 4 4
0.1567087948 4 5
0.1584072224 4 6
0.3466599143 5 5
0.1570607852 5 6
0.3502027434 6 6.
The labelling of the output is interpreted identically to that for the asymptotic variance covariance
matrix for the variance components. Those rows which contain a parental name twice are the
error variance for that parental prediction and those rows containing two parental names are the
error covariance for the two parental predictions. In this unbalanced case the reader will see that
some parents have more error associated with their predictions than others, i.e., compare the
error for parent 2 with parent 5. This is true because of the varying number of observations
associated with the prediction for each parent and also the varying distribution of those
observations across tests and blocks. If one assumes that the estimate for gca variance from the
data equals the true variance for gca, then the correlation of the prediction with the true value
($\mathrm{Corr}(g,\hat{g})$, White and Hodge 1989) for parent 5 is equal to $\sqrt{1 - (0.347/1.161)}$ or 0.84.
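The same calculation can be repeated for every parent from the diagonal of the error covariance matrix. The sketch below uses the Model 1B values shown above and, like the text, assumes the estimated gca variance equals the true variance; the variable names are hypothetical.

```python
import math

sigma2_gca = 1.160636   # Model 1B estimate of the gca variance

# Error variances of the gca predictions (rows "1 1", "2 2", ..., "6 6" of the output).
error_variance = {
    1: 0.3618685934,
    2: 0.3687218966,
    3: 0.3545855963,
    4: 0.3518724881,
    5: 0.3466599143,
    6: 0.3502027434,
}

# Correlation between predicted and true gca: sqrt(1 - error variance / gca variance).
correlation = {parent: math.sqrt(1.0 - pev / sigma2_gca)
               for parent, pev in error_variance.items()}

# Variance of each prediction: gca variance minus the error variance.
prediction_variance = {parent: sigma2_gca - pev for parent, pev in error_variance.items()}
```

Parents with larger error variance (parent 2 here) have the lower correlation, which reflects the uneven distribution of surviving observations among parents.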
Error covariance matrix for the fixed effects
The error covariance matrix for the fixed effects is output as a half-stored matrix. The
output is not labelled; however, one only has to know the total number of levels for all fixed
effects to assign labels if needed. The primary use of this matrix is to estimate the variance of
estimable functions of the fixed effects. If l denotes the vector containing the specification of an
estimable function and $V_b$ denotes the error covariance matrix for fixed effects, then the variance
of an estimable function is equal to $l'V_bl$. For the mean of test 1, $l'$ equals [1 1 0 1/4 1/4 1/4 1/4
0 0 0 0].
Conclusions
GAREML is an analytical tool for use with models common to forest genetics. The
properties of the variance component estimation algorithm have been documented by simulation
studies and the algorithm presents solutions as restricted maximum likelihood estimates. Many
other outputs are available from the program including best linear unbiased predictions,
generalized least squares estimates of fixed effects, error covariance matrices of predictions and
estimates, and the asymptotic covariance matrix for variance component estimates.
GAREML is not intended to be used as a black box. The program has many potential
uses: variance component estimation, parental evaluation, progeny evaluation and simulated
evaluation of mating and field design. However, thoughtful interpretation of the outputs is
needed in order to realize the power and utility of the program.

CHAPTER 6
CONCLUSIONS
Optimal mating design for the determination of genetic architecture was explored.
General conclusions were reached through comparison of the half-diallel, half-sib and circular
mating designs. In particular, the comparison of the half-diallel and circular designs is pertinent
to the establishment of future progeny tests in which full-sib families are desired. Across the
experimental levels examined, the circular mating design provides more efficient estimates of
parameters for genetic architecture than the half-diallel design. If an estimate of the variance in
general combining abilities is required, the half-sib design is more efficient than the circular
mating design over most of the experimental levels examined. This pattern of efficiency argues
for complementary mating designs involving half-sib designs (open-pollinated or polycross) to
estimate general combining ability and a second design (full-sib mating) to generate crosses
from which to make selections. Complementary mating designs do require a greater monetary
and temporal commitment. If this type of commitment is not justified or possible, then the
circular mating design should be used to generate full-sib families and estimate genetic parameters
simultaneously.
Considering field design in combination with mating design, full-sib designs reach
maximum efficiency for genetic parameter estimation in fewer numbers of replicates across
locations than half-sib designs. For any specific case of field design and the half-sib mating
design, a priori knowledge of the genetic architecture is required to choose the optimal field
design for number of locations.
In cases where maximum efficiency of an experimental design is obtained and the
precision of genetic parameter estimates is still less than desired, the optimal use of experimental
units would be disconnected sets of experiments at maximum efficiency with the parameter
estimate then being a mean of the estimates from the disconnected experiments. Of the three
mating designs only the half-diallel exhibits efficiency optima for number of parents. The
optimum for number of parents in half-diallels is always close to and never larger than six parents
with the fluctuation resulting from the genetic architecture. Thus, for maximum efficiency in
genetic parameter estimation with half-diallels, the number of parents should not exceed six, and
the desired parameter precision should be obtained by using disconnected sets of six parents. Optima for
number of locations exist for all mating designs and maximum efficiency would again be obtained
by replicating an experiment only for the optimal number of locations. A parameter estimate of
the desired precision would be calculated as a mean of disconnected experiments.
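The gain from averaging disconnected experiments can be illustrated with a short sketch. It assumes, as a simplification not stated in the text, that the experiments are independent and that each yields an estimate with the same variance; the function names are hypothetical.

```python
import math


def pooled_estimate(estimates):
    """Mean of the parameter estimates from the disconnected experiments."""
    return sum(estimates) / len(estimates)


def variance_of_mean(single_variance, n_sets):
    """Variance of the mean of n_sets independent, equally precise estimates."""
    return single_variance / n_sets


def sets_needed(single_variance, target_variance):
    """Smallest number of disconnected sets whose mean reaches the target."""
    return math.ceil(single_variance / target_variance)


# Hypothetical numbers: each experiment gives variance 0.5, target is 0.125.
print(pooled_estimate([0.25, 0.5, 0.75]))
print(variance_of_mean(0.5, 4))
print(sets_needed(0.5, 0.125))
```

Under these assumptions, precision improves in proportion to the number of disconnected sets, while each set stays at its own efficiency optimum.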
Optimal analysis was addressed in two stages (estimation of parental worth and estimation
of variance components or genetic architecture). The estimation of parental worth was examined
for the half-diallel mating design. It is argued, on theoretical grounds and in generality, that best
linear unbiased prediction and best linear prediction are more suited to the problem of parental
evaluation than ordinary least squares.
Using simulated data for two mating designs (half-diallel and half-sib), variance component
estimation techniques were compared with varying levels of data imbalance and two levels of genetic
control. In estimating variance components (or genetic ratios such as heritability) four criteria
were adopted for discrimination among estimation techniques (probability of nearness, bias, mean
square error and variance of estimation). Of the four, only bias and variance of estimation
proved informative. Bias discriminated among treatments of negative estimates: accepting
the negative estimates produced the least bias, re-solving the system with negative estimates
set to zero was intermediate in bias, and simply setting negative estimates to zero
produced the most bias. Variance of estimation also discriminated among treatments of
negative estimates: accepting negative estimates had the highest variance,
setting negative estimates to zero was intermediate in variance, and re-solving the system with
negative estimates set to zero had the lowest variance.
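The direction of these effects can be seen in a toy calculation. The sketch below compares only two of the three treatments (accepting negative estimates versus setting them to zero); re-solving the system requires the full model and is not reproduced. The replicate estimates and the true value are hypothetical numbers chosen for illustration, not results from the simulations.

```python
from statistics import mean, pvariance

# Hypothetical replicate estimates of one variance component whose
# true value is assumed to be 0.5.
TRUE_VALUE = 0.5
replicate_estimates = [-0.5, 0.0, 0.5, 1.0, 1.5]


def truncate_at_zero(estimates):
    """Set every negative estimate to zero, leaving the others unchanged."""
    return [max(e, 0.0) for e in estimates]


def bias(estimates, true_value):
    """Mean of the estimates minus the true value."""
    return mean(estimates) - true_value


accepted = replicate_estimates
truncated = truncate_at_zero(replicate_estimates)

bias_accepted = bias(accepted, TRUE_VALUE)
bias_truncated = bias(truncated, TRUE_VALUE)
var_accepted = pvariance(accepted)
var_truncated = pvariance(truncated)

print(bias_accepted, bias_truncated)
print(var_accepted, var_truncated)
```

In this toy case truncation raises the mean of the estimates (more bias) and pulls them closer together (less variance), which is the same trade-off described above.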
Variance of estimation was also discriminatory among units of observation and variance
component estimation techniques. Of the two units of observation used (individuals and plot
means), individual observations produced estimates with better properties across all levels of
imbalance, mating designs and variance component estimation techniques. Of the variance
component estimation techniques contrasted, restricted maximum likelihood produced estimates
with the best properties (bias and variance of estimation) across all mating designs, levels of
genetic control and levels of imbalance. Therefore it is proposed that restricted maximum
likelihood estimation with individual observations as data should be utilized.
With the recommendation to use restricted maximum likelihood, the program used to
analyze the simulated data was rewritten into a user-friendly format able to analyze both full-sib
and half-sib data. Additional outputs (other than variance components) were also added as
options. These outputs include general and specific combining ability predictions, the asymptotic
covariance matrix for variance components, generalized least squares estimates of fixed effects
and the covariance matrices for predictions and estimates.

APPENDIX
FORTRAN SOURCE CODE FOR GAREML
C******THIS PROGRAM PRODUCES REML AND MIVQUE VARIANCE*************
C****COMPONENT ESTIMATES BY STARTING ITERATION FROM THE***********
C****TRUE VALUES OF THE PARAMETERS THROUGH THE USE OF*************
C***************GIESBRECHT’S ALGORITHM****************************
C PARAMETERS DETERMINE THE PROGRAM DIMENSIONS: ANY CHANGE IN
C PARAMETER SIZE DECLARATION SHOULD BE GLOBAL SINCE THEY ARE
C ALSO SPECIFIED IN THE SUBROUTINES
PROGRAM MAIN
PARAMETER (
C NOBSER IS THE MAXIMUM NUMBER OF OBSERVATIONS
N NOBSER = 5000,
C NOBL IS THE MAXIMUM NUMBER OF BLOCKS PER LOCATION
N NOBL=36,
C NOCR IS THE MAXIMUM NUMBER OF FULL-SIB CROSSES
N NOCR = 75,
C NOBH IS THE MAXIMUM NUMBER OF FIXED EFFECT LEVELS INCLUDING THE
C MEAN
N NOBH = 200,
C NVARBH DIMENSIONS THE VARIANCE COVARIANCE MATRIX FOR FIXED
C EFFECTS
N NVARBH = (NOBH*(NOBH-1 ))/2 + NOBH,
C NOGCA IS THE MAXIMUM NUMBER OF PARENTS
N NOGCA=50,
C NOVARG DIMENSIONS THE VARIANCE COVARIANCE MATRIX FOR GCA
N NOVARG = (NOGCA*(NOGCA-1))/2 + NOGCA,
C NOX IS THE MAXIMUM NUMBER OF COLUMNS FOR FIXED EFFECTS PLUS
C RANDOM EFFECTS
C PLUS ONE FOR THE DATA
N NOX =1400,
C NOCBS IS THE MAXIMUM NUMBER OF LEVELS FOR THE RANDOM EFFECT
C HAVING THE GREATEST NUMBER, USUALLY CROSS BY BLOCK OR PLOT
C COMBINATIONS
N NOCBS =1000,
C NTOT IS THE TOTAL NUMBER OF COLUMNS OF NOX PLUS NOCBS
N NTOT = NOX + NOCBS,
C OTHER PARAMETERS USE THE PREVIOUS DECLARATIONS TO ALLOCATE
C SUFFICIENT SIZE TO SYMMETRIC MATRICES STORED AS VECTORS
N NIZED = NOX*NOCBS,
N NIXPX = ((NOX*(NOX-1 ))/2) + NOX,
N NSIP = NOX + NOCBS,
N NIZEP = ((NSIP*(NSIP-1 ))/2) + NSIP)
COMMON/CMN1/ NCOLT,NCOLTB,NCOLG,NCOLS,NCOLGT,NCOLST,NOBS,
N NCOLB,NCOLX,NCOLCB,NCL(9),NORAN,NOFIX,NCLFIX,
N NCLRAN,NCOLSE,NRAN(9)
COMMON/CMN2/
N YQVQY(9),VQVQ(9,9),MEAN(NOBSER),SIG(9),GCA(NOGCA),
N BHAT(NOBH),SCA(NOCR)
COMMON/CMN3/ DTERM(8,2),RANNAM(9),DUM2,FMVEC(NOCR),
N PARENT(NOGCA),LOCO(10),REP(NOBL),DISSET(10)
DIMENSION TEST(NOBSER),BLOCK(NOBSER),F(NOBSER),M(NOBSER),
N FM(NOBSER),REML(9),VARHAT(9),SOL(9,10),DUM(9),PRI(9),
N SET(NOBSER),NAME(9),NUMMY(9),VARG(NOVARG),VARBH(NVARBH)
INTEGER NCOLT,NCOLB,NCOLG,NCOLS,NCOLGT,NCOLST,NCOLCB,NOBS,
N NCOLTB,NCOLX,NCOLSE,NCL,NCLFIX,NCLRAN,NORAN,NOFIX,
N NUMMY,NRAN,NOITS,LEP
DOUBLE PRECISION YQVQY,SIG,REML,ZAG,VARHAT,SOL,MEAN,DUM,VQVQ,
N GCA,BHAT,PRI,SCA,SCALES,SCALEG,VARG,VARBH
REAL CVERG
CHARACTER* 1 DTERM,DUMDUM,DUM2,DUMB
CHARACTER*80 FMAT
CHARACTER* 16 FLNAME,FM,FMVEC,NT,KICK,LICK
CHARACTER* 11 NAME,RANNAM
CHARACTER* 13 SIGMA
CHARACTER*8 TEST,LOCO,F,M,PARENT,BLOCK,SET,DISSET,REP
SIGMA =’SIGMA-SQUARED’
NAME(1) = ’LOCATION’
NAME(2) = ’BLOCK(LOC)’
NAME(3) = ’SET’
NAME(4) = ’GCA’
NAME(5) = ’SCA’
NAME(6) = ’LOCxGCA’
NAME(7) = ’LOCxSCA’
NAME(8) = ’BLOCKxFAM’
NAME(9) =’ERROR’
OPEN(UNIT = 13,STATUS = ’SCRATCH’,FORM = ’UNFORMATTED’)
DO 2031 I=1,8
DO 2032 J = 1,2
DTERM(I,J) = ’ ’
2032 CONTINUE
2031 CONTINUE
PRINT *, ’ REML VARIANCE COMPONENTS ESTIMATED BY THE METHOD OF
NSCORING’
PRINT *, ’ THROUGH THE USE OF GIESBRECHTS ALGORITHM’
PRINT *, ’ WRITTEN BY DUDLEY HUBER UNIVERSITY OF FLORIDA’
PRINT *

PRINT *, ’WARNING YOU HAVE JUST ENTERED THE TWILIGHT ZONE OF
NVARIANCE COMPONENTS’
PRINT *, ’ANSWER Y FOR YES OR N FOR NO TO THE FOLLOWING QUESTIONS’
WRITE(6,2012)
2012 FORMAT(/,’ *******************************************************
N**************’,/)
WRITE(6,2012)
1010 I=0
J = 0
2500 FORMAT(’ PLEASE TRY AGAIN’)
PRINT*,’ FIRST THE FACTORS TO BE READ FROM THE DATA WILL BE DETE
NRMINED’
2501 PRINT *, ’ DOES THE DATA HAVE MULTIPLE LOCATIONS? ’
READ(6,1501) DTERM(1,1)
IF((DTERM(1,1).NE.’Y’).AND.(DTERM(1,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 2501
ENDIF
2502 PRINT *, ’ ARE THERE MULTIPLE BLOCKS(LOCATION) IN THE DATA? ’
READ(6,1501) DTERM(2,1)
IF((DTERM(2,1).NE.’Y’).AND.(DTERM(2,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 2502
ENDIF
2503 PRINT *, ’ ARE THERE DISCONNECTED SETS OF GENETIC ENTRIES IN THE
NDATA? ’
READ(6,1501) DTERM(3,1)
IF((DTERM(3,1).NE.’Y’).AND.(DTERM(3,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 2503
ENDIF
WRITE(6,2012)
7001 PRINT *,’ IS THE ANALYSIS BASED ON HALF-SIB (H) OR FULL-SIB FAMILI
NES (F)? (H OR F) ’
READ(6,1501) DUM2
IF((DUM2.NE.’H’).AND.(DUM2.NE.’F’)) THEN
WRITE(6,2500)
GO TO 7001
ENDIF
PRINT *, ’ NOW TO DETERMINE FIXED OR RANDOM FACTORS AND PRIORS’
PRINT *, ’ ANSWER F FOR FIXED OR R FOR RANDOM TO DETERMINE STATUS’
IF (DTERM(1,1).EQ.’N’) GO TO 1001
2504 PRINT *, ’ LOCATION IS FIXED OR RANDOM? ’
READ(6,1501) DTERM(1,2)
IF((DTERM(1,2).NE.’F’).AND.(DTERM(1,2).NE.’R’)) THEN
WRITE(6,2500)
GO TO 2504
ENDIF

IF (DTERM(1,2).EQ.’F’) THEN
J=J+1
GO TO 1001
ENDIF
DTERM(1,2) = ’R’
PRINT *, ’ WHAT IS THE PRIOR FOR LOCATION? ’
I = I+1
READ(6,1502) PRI(I)
1502 FORMAT(F20.6)
1001 IF (DTERM(2,1).EQ.’N’) GO TO 1002
2505 PRINT *, ’ BLOCK IS FIXED OR RANDOM? ’
READ(6,1501) DTERM(2,2)
IF((DTERM(2,2).NE.’F’).AND.(DTERM(2,2).NE.’R’)) THEN
WRITE(6,2500)
GO TO 2505
ENDIF
IF (DTERM(2,2).EQ.’F’) THEN
J=J+1
GO TO 1002
ENDIF
DTERM(2,2) = ’R’
PRINT *, ’ WHAT IS THE PRIOR FOR BLOCK? ’
I = I+1
READ(6,1502) PRI(I)
1002 IF (DTERM(3,1).EQ.’N’) GO TO 1003
2506 PRINT *, ’ SETS ARE FIXED OR RANDOM? ’
READ(6,1501) DTERM(3,2)
IF((DTERM(3,2).NE.’F’).AND.(DTERM(3,2).NE.’R’)) THEN
WRITE(6,2500)
GO TO 2506
ENDIF
IF (DTERM(3,2).EQ.’F’) THEN
J=J+1
GO TO 1003
ENDIF
DTERM(3,2) = ’R’
PRINT *, ’ WHAT IS THE PRIOR FOR SETS? ’
I = I+1
READ(6,1502) PRI(I)
1003 PRINT *, ’ ALL OTHER FACTORS ARE CONSIDERED RANDOM’
PRINT *, ’ ANSWER Y FOR YES OR N FOR NO FOR INCLUSION OF THE FACTO
NR IN THE MODEL’
WRITE(6,2012)
2507 PRINT *, ’ IS GCA IN THE MODEL? ’
READ(6,1501) DTERM(4,1)
IF((DTERM(4,1).NE.’Y’).AND.(DTERM(4,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 2507

ENDIF
IF (DTERM(4,1).EQ.’N’) GO TO 1004
C PRINT *, ’ GCA IS FIXED OR RANDOM?
C INPUT *, DTERM(4,2)
C IF (DTERM(4,2).EQ.’F’) THEN
C J=J+1
C GO TO 1004
C ENDIF
DTERM(4,2) = ’R’
PRINT *, ’ WHAT IS THE PRIOR FOR GCA? ’
I = I+1
READ(6,1502) PRI(I)
IF(DUM2.EQ.’H’) THEN
DTERM(5,1) = ’N’
GO TO 1005
ENDIF
1004 PRINT *, ’ IS SCA IN THE MODEL? ’
READ(6,1501) DTERM(5,1)
IF((DTERM(5,1).NE.’Y’).AND.(DTERM(5,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 1004
ENDIF
IF ((DTERM(5,1).EQ.’N’).OR.(DUM2.EQ.’H’)) GO TO 1005
C PRINT *, ’ SCA IS FIXED OR RANDOM? ’
C INPUT *, DTERM(5,2)
C IF (DTERM(5,2).EQ.’F’) THEN
C J=J+1
C GO TO 1005
C ENDIF
DTERM(5,2) = ’R’
PRINT *, ’ WHAT IS THE PRIOR FOR SCA? ’
I = I+1
READ(6,1502) PRI(I)
1005 IF(DTERM(1,1).EQ.’N’) GO TO 1007
PRINT *, ’ IS LOCATIONxGCA INTERACTION IN THE MODEL? ’
READ(6,1501) DTERM(6,1)
IF((DTERM(6,1).NE.’Y’).AND.(DTERM(6,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 1005
ENDIF
IF (DTERM(6,1).EQ.’N’) GO TO 1006
C PRINT *, ’ LOCATIONxGCA IS FIXED OR RANDOM? ’
C INPUT *, DTERM(6,2)
C IF (DTERM(6,2).EQ.’F’) THEN
C J=J + 1
C GO TO 1006
C ENDIF
DTERM(6,2) = ’R’

PRINT *, ’ WHAT IS THE PRIOR FOR LOCATIONxGCA? ’
I = I+1
READ(6,1502) PRI(I)
1006 IF(DUM2.EQ.’H’) THEN
DTERM(7,1) = ’N’
GO TO 1007
ENDIF
PRINT *, ’ IS LOCATIONxSCA IN THE MODEL? ’
READ(6,1501) DTERM(7,1)
IF((DTERM(7,1).NE.’Y’).AND.(DTERM(7,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 1006
ENDIF
IF ((DTERM(7,1).EQ.’N’).OR.(DUM2.EQ.’H’)) GO TO 1007
C PRINT *, ’ LOCATIONxSCA IS FIXED OR RANDOM? ’
C INPUT *, DTERM(7,2)
C IF (DTERM(7,2).EQ.’F’) THEN
C J=J+1
C GO TO 1007
C ENDIF
DTERM(7,2) = ’R’
PRINT *, ’ WHAT IS THE PRIOR FOR LOCATIONxSCA? ’
I = I+1
READ(6,1502) PRI(I)
1007 PRINT *, ’ IS PLOT OR FAMILYxBLOCK IN THE MODEL? ’
READ(6,1501) DTERM(8,1)
IF((DTERM(8,1).NE.’Y’).AND.(DTERM(8,1).NE.’N’)) THEN
WRITE(6,2500)
GO TO 1007
ENDIF
IF (DTERM(8,1).EQ.’N’) GO TO 1008
C PRINT *, ’ PLOT OR FAMILYxBLOCK IS FIXED OR RANDOM?
C INPUT *, DTERM(8,2)
C IF (DTERM(8,2).EQ. ’F’) THEN
C J=J+1
C GO TO 1008
C ENDIF
DTERM(8,2) = ’R’
PRINT *, ’ WHAT IS THE PRIOR FOR PLOT OR FAMILYxBLOCK? ’
I = I+1
READ(6,1502) PRI(I)
1008 PRINT *, ’ WHAT IS THE PRIOR FOR ERROR? ’
I = I+1
READ(6,1502) PRI(I)
J=J+1
NOFIX=J
NORAN=I
WRITE(6,1009) NOFIX,NORAN

1009 FORMAT(’ THE NUMBER OF FIXED FACTORS PLUS THE MEAN = ’,I2,/,
N’ THE NUMBER OF RANDOM FACTORS PLUS ERROR = ’,I2)
PRINT *, ’ DO THESE LEVELS MATCH YOUR INTENDED MODEL? Y OR N ’
READ(6,1501) DUMDUM
IF (DUMDUM.EQ.’N’) THEN
PRINT *, ’ RETURNING TO INITIALIZATION OF MODEL’
PRINT *, ’ TO EXIT PROGRAM USE CONTROL-BREAK’
GO TO 1010
ENDIF
PRINT *, ’ THE INPUT DATA SET NAME IS: ’
READ(6,1503) FLNAME
1503 FORMAT(A16)
WRITE(6,1011)
1011 FORMAT(’ THE FORMAT OF THE DATA IS: REMEMBERING PARENTHESES’,/)
READ(6,1012) FMAT
1012 FORMAT(A80)
OPEN (1 ,FILE = FLNAME,STATUS = ’OLD’)
NOBS=1
1 IF(DUM2.EQ.’H’) GO TO 2
IF((DTERM(1,1).EQ.’N’). AND. (DTERM(2,1). EQ.’N’). AND. (DTERM(3,1).
NEQ.’N’)) GO TO 1013
IF((DTERM(1,1).EQ.’N’).AND.(DTERM(2,1).EQ.’N’)) GO TO 1014
IF((DTERM(1,1).EQ.’N’).AND.(DTERM(3,1).EQ.’N’)) GO TO 1015
IF(DTERM(1,1).EQ.’N’) GO TO 1000
IF((DTERM(2,1).EQ.’N’).AND.(DTERM(3,1).EQ.’N’)) GO TO 1016
IF(DTERM(2,1).EQ.’N’) GO TO 1017
IF(DTERM(3,1).EQ.’N’) GO TO 1018
READ( 1 ,FMT = FMAT,END = 3) TEST(NOBS),BLOCK(NOBS),SET(NOBS),
N F(NOBS),M(NOBS),MEAN(NOBS)
GO TO 1019
1018 READ (1,FMT = FMAT,END = 3) TEST(NOBS),BLOCK(NOBS),F(NOBS),M(NOBS),
N MEAN(NOBS)
GO TO 1019
1000 READ (1,FMT = FMAT,END = 3) BLOCK(NOBS),SET(NOBS),F(NOBS),M(NOBS),
N MEAN(NOBS)
GO TO 1019
1013 READ (1 ,FMT = FMAT,END = 3) F(NOBS),M(NOBS),MEAN(NOBS)
GO TO 1019
1014 READ (1 ,FMT=FMAT,END = 3) SET(NOBS),F(NOBS),M(NOBS),MEAN(NOBS)
GO TO 1019
1015 READ (1,FMT=FMAT,END = 3) BLOCK(NOBS),F(NOBS),M(NOBS),MEAN(NOBS)
GO TO 1019
1016 READ (1,FMT = FMAT,END = 3) TEST(NOBS),F(NOBS),M(NOBS),MEAN(NOBS)
GO TO 1019
1017 READ (1,FMT = FMAT,END = 3) TEST(NOBS),SET(NOBS),F(NOBS),M(NOBS),
N MEAN(NOBS)
GO TO 1019
2 IF((DTERM(1,1). EQ.’N’). AND. (DTERM(2,1). EQ.’N’). AND. (DTERM(3,1).

NEQ.’N’)) GO TO 7013
IF((DTERM(1,1).EQ.’N’).AND.(DTERM(2,1).EQ.’N’)) GO TO 7014
IF((DTERM(1,1).EQ.’N’).AND.(DTERM(3,1).EQ.’N’)) GO TO 7015
IF((DTERM(2,1).EQ.’N’).AND.(DTERM(3,1).EQ.’N’)) GO TO 7016
IF(DTERM(2,1).EQ.’N’) GO TO 7017
IF(DTERM(3,1).EQ.’N’) GO TO 7018
READ( 1 ,FMT=FMAT,END = 3) TEST(NOBS),BLOCK(NOBS),SET(NOBS),
N F(NOBS),MEAN(NOBS)
GO TO 1019
7018 READ (1 ,FMT=FMAT,END = 3) TEST(NOBS),BLOCK(NOBS),F(NOBS),
N MEAN(NOBS)
GO TO 1019
7013 READ (1,FMT = FMAT,END = 3) F(NOBS),MEAN(NOBS)
GO TO 1019
7014 READ (1,FMT = FMAT,END = 3) SET(NOBS),F(NOBS),MEAN(NOBS)
GO TO 1019
7015 READ (1,FMT=FMAT,END = 3) BLOCK(NOBS),F(NOBS),MEAN(NOBS)
GO TO 1019
7016 READ (1,FMT=FMAT,END = 3) TEST(NOBS),F(NOBS),MEAN(NOBS)
GO TO 1019
7017 READ (1,FMT=FMAT,END = 3) TEST(NOBS),SET(NOBS),F(NOBS),
N MEAN(NOBS)
1019 NOBS=NOBS+1
GO TO 1
3 NOBS = NOBS-1
CLOSE(1)
WRITE(6,2015) NOBS
2015 FORMAT(’ THE NUMBER OF OBSERVATIONS IS ’,I4)
IF(DUM2.EQ.’H’) GO TO 7019
DO 4 I=1,NOBS
FM(I) = F(I)//M(I)
4 CONTINUE
7019 K=0
DO 5010 I=1,8
IF(DTERM(I,1).EQ.’N’) GO TO 5010
IF(DTERM(I,2).EQ.’R’) THEN
K = K+ 1
RANNAM(K) = NAME(I)
ENDIF
5010 CONTINUE
RANNAM(K+ 1) = NAME(9)
DO 72 I=1,NOCR
FMVEC(I) = ’ ’
72 CONTINUE
J=0
DO 162 I=1,9
IF(PRI(I).GT.0.0) THEN
J=J+1

SIG(J) = PRI(I)
ENDIF
162 CONTINUE
NCOLT=0
NCOLB=0
NCOLSE = 0
NCOLTB=0
NCOLG=0
NCOLS=0
NCOLGT=0
NCOLST=0
NCOLCB=0
IF(DTERM(1,1).EQ.’N’) GO TO 1020
CALL NOCOL(TEST,NOBS,LOCO,NCOLT)
1020 NCL(1) = NCOLT
IF(DTERM(2,1).EQ.’N’) GO TO 1021
CALL NOCOL(BLOCK,NOBS,REP,NCOLB)
1021 IF(DTERM(3,1).EQ.’N’) GO TO 1022
CALL NOCOL(SET,NOBS,DISSET,NCOLSE)
1022 NCL(3) = NCOLSE
IF((DTERM(1,1).EQ.’N’).AND.(DTERM(2,1).EQ.’Y’)) THEN
NCOLTB = NCOLB
GO TO 1023
ENDIF
NCOLTB = NCOLT*NCOLB
1023 IF(DUM2.EQ.’H’) THEN
CALL NOCOL(F,NOBS,PARENT,NCOLG)
GO TO 7022
ENDIF
CALL NOPAR(F,M,NOBS,PARENT,NCOLG)
7022 NCL(2) = NCOLTB
NCL(4) = NCOLG
IF((DUM2.EQ.’H’).OR.(DTERM(5,1).EQ.’N’)) GO TO 7021
DO 32 I=1,NOBS
IF(I.EQ.1) THEN
FMVEC(I) = FM(I)
NCOLS=1
GO TO 32
ENDIF
DO 33 J = 1,NCOLS
KICK=FM(I)
LICK = FMVEC(J)
IF(KICK.EQ.LICK) GO TO 32
33 CONTINUE
NCOLS = NCOLS + 1
FMVEC(NCOLS) = FM(I)
32 CONTINUE
DO 159 K=1,NCOLS-1
N = K +1
DO 159 J = N,NCOLS
IF(FMVEC(K).LT.FMVEC(J)) GO TO 159
NT=FMVEC(K)
FMVEC(K) = FMVEC(J)
FMVEC(J) = NT
159 CONTINUE
7021 IF(DUM2.EQ.’H’) NCOLS = 0
NCL(5) = NCOLS
NCOLST=NCOLS *NCOLT
NCOLGT=NCOLG *NCOLT
NCOLCB = NCOLS *NCOLTB
IF(DUM2.EQ.’H’) NCOLCB = NCOLG*NCOLTB
IF(DTERM(6,1).EQ.’N’) NCOLGT=0
IF(DTERM(7,1).EQ.’N’) NCOLST = 0
IF(DTERM(8,1).EQ.’N’) NCOLCB = 0
NCL(6) = NCOLGT
NCL(7) = NCOLST
NCL(8) = NCOLCB
WRITE(6,5005) NCOLG
5005 FORMAT(’ NUMBER OF PARENTS IS ’,I4)
WRITE(6,5006) NCOLS
5006 FORMAT(’ NUMBER OF FULL-SIB CROSSES IS ’,I4)
NCLFIX = 1
NCLRAN=0
DO 1024 I=1,8
IF(DTERM(I,2).EQ.’F’) THEN
NCLFIX = NCLFIX + NCL(I)
GO TO 1024
ENDIF
NCLRAN = NCLRAN + NCL(I)
1024 CONTINUE
WRITE(6,6001) NCLFIX,NCLRAN
6001 FORMAT(’ FIXED EFFECT COLUMNS = ’,I8,
N’ RANDOM EFFECT COLUMNS = ’,I8)
CVERG = .0001
PRINT *,’ THE CONVERGENCE CRITERION FOR VARIANCE COMPONENTS WHICH
NEQUALS’
PRINT *,’ THE SUM OF THE ABSOLUTE DEVIATIONS IS SET TO .0001.’
PRINT *,’ IF YOU WISH TO CHANGE TYPE Y IF NOT TYPE N. ’
READ(6,1501) DUMDUM
1501 FORMAT(A1)
IF(DUMDUM.EQ.’N’) GO TO 9021
PRINT*,’ THE CONVERGENCE CRITERION IS: ’
READ(6,1502) CVERG
9021 NCOLX = NCLFIX + NCLRAN + 1
NOITS = 30

PRINT*,’ THE NUMBER OF ITERATIONS ALLOWED IS SET TO 30’
PRINT*,’ DO YOU WISH TO CHANGE THIS? (Y OR N) ’
READ(6,1501) DUMDUM
IF(DUMDUM.EQ.’Y’) THEN
PRINT*,’ THE NUMBER OF ITERATIONS DESIRED IS: ’
READ*, NOITS
ENDIF
PRINT *, ’ IF THE SOLUTION AFTER ITERATING TO CONVERGENCE CONTAINS
N ONE OR MORE’
PRINT *, ’ NEGATIVE VARIANCE COMPONENT ESTIMATES!!!!’
PRINT *, ’ DO YOU WISH TO RE-SOLVE THE SYSTEM SETTING NEGATIVE EST
NIMATES TO ZERO?’
PRINT *, ’ TYPE Y OR N ’
READ(6,1501) DUMB
CALL XPRIMX(TEST,BLOCK,SET,F,M,FM)
REWIND(13)
DO 801 I = 1,NORAN
NUMMY(I) = 0
801 CONTINUE
803 DO 50 L=1,NOITS
DO 71 I=1,NORAN
DUM(I) = SIG(I)
71 CONTINUE
WRITE(6,5001) L
5001 FORMAT(’ THIS IS ITERATION NUMBER ’,I3)
DO 8001 I=1,NORAN
WRITE(6,154) SIGMA,RANNAM(I),SIG(I)
8001 CONTINUE
DO 21 I=1,NORAN
DO 22 J = 1,NORAN
VQVQ(I,J)=0.0
22 CONTINUE
YQVQY(I) = 0.0
21 CONTINUE
DO 51 I=1,NORAN
IF(SIG(I).LT.0.0) SIG(I) = 0.0
51 CONTINUE
CALL DESIGN
REWIND(13)
DO 5 I=1,NORAN
SOL(I,NORAN+1) = YQVQY(I)
REML(I) = 0.0
IF(NUMMY(I).EQ.1) YQVQY(I) = 0.0
DO 6 J = 1,NORAN
SOL(I,J) = VQVQ(I,J)
IF(NUMMY(I).EQ.1) SOL(I,J) = 0.0
6 CONTINUE
5 CONTINUE

CALL L2SWP(SOL,NORAN,NORAN+1,1,NORAN)
DO 7 I=1,NORAN
REML(I) = SOL(I,NORAN+1)
7 CONTINUE
ZAG = 0.0
DO 8 I=1,NORAN
ZAG = ZAG + DABS(REML(I)-DUM(I))
8 CONTINUE
DO 9 I=1,NORAN
SIG(I) = REML(I)
9 CONTINUE
IF(ZAG.LT.CVERG) GO TO 11
50 CONTINUE
11 IF(DUMB.EQ.’N’) GO TO 8025
IF(DUMB.EQ.’Y’) THEN
LEP=0
DO 851 I=1,NORAN
IF(SIG(I).LT.0.0) LEP=1
IF(SIG(I).LE.0.0) THEN
SIG(I) = 0.0
NUMMY(I)= 1
ENDIF
851 CONTINUE
ENDIF
IF(LEP.EQ.1) GO TO 803
8025 DO 10 I=1,NORAN
VARHAT(I) = SIG(I)
10 CONTINUE
PRINT *, ’ WHAT IS THE FILENAME FOR THE VARIANCE COMPONENT OUTPUT
N? ’
READ(6,1503) FLNAME
OPEN (2,FILE = FLNAME,STATUS =’UNKNOWN’)
DO 155 J = 1,NORAN
WRITE(2,FMT = 154) SIGMA,RANNAM(J),VARHAT(J)
155 CONTINUE
154 FORMAT(1X,A13,A12,F20.6)
CLOSE(2)
DO 156 I=1,9
IF(RANNAM(I).EQ.’GCA’) SCALEG = VARHAT(I)
IF(RANNAM(I).EQ.’SCA’) SCALES = VARHAT(I)
156 CONTINUE
PRINT *, ’ DO YOU DESIRE GCA PREDICTIONS? (Y OR N) ’
READ(6,1501) DUMDUM
IF(DUMDUM.EQ.’N’) GO TO 704
PRINT *, ’ DO YOU HAVE A PRIOR ESTIMATE OF GCA VARIANCE TO USE INS
NTEAD’
PRINT *, ’ OF THE DATA ESTIMATE? (Y OR N) ’
READ(6,1501) DUMDUM

IF(DUMDUM.EQ.’Y’) THEN
PRINT *, ’ WHAT IS THE GCA VARIANCE ESTIMATE YOU WISH TO USE? ’
READ(6,1502) SCALEG
ENDIF
DO 157 I=1,NCOLG
GCA(I) = SCALEG*GCA(I)
157 CONTINUE
PRINT *, ’ WHAT IS THE FILENAME FOR THE GCA PREDICTION OUTPUT? ’
READ(6,1503) FLNAME
OPEN(4,FILE=FLNAME,STATUS = ’UNKNOWN’)
DO 178 I=1,NCOLG
WRITE(4,FMT=703) PARENT(I),GCA(I)
178 CONTINUE
703 FORMAT(’ GCA’,1X,A8,F20.6)
CLOSE(4)
704 IF(DUM2.EQ.’H’) GO TO 705
IF(DTERM(5,1).EQ.’N’) GO TO 705
PRINT *, ’ DO YOU DESIRE SCA PREDICTIONS? (Y OR N) ’
READ(6,1501) DUMDUM
IF(DUMDUM.EQ.’N’) GO TO 705
PRINT *, ’ DO YOU HAVE A PRIOR ESTIMATE OF SCA VARIANCE TO USE INS
NTEAD’
PRINT *, ’ OF THE DATA ESTIMATE? (Y OR N) ’
READ(6,1501) DUMDUM
IF(DUMDUM.EQ.’Y’) THEN
PRINT *, ’ WHAT IS THE SCA VARIANCE ESTIMATE YOU WISH TO USE? ’
READ(6,1502) SCALES
ENDIF
DO 169 I=1,NCOLS
SCA(I)=SCALES*SCA(I)
169 CONTINUE
PRINT *, ’ WHAT IS THE FILENAME FOR THE SCA PREDICTION OUTPUT? ’
READ(6,1503) FLNAME
OPEN(8,FILE = FLNAME,STATUS = ’UNKNOWN’)
DO 171 I=1,NCOLS
WRITE(8,FMT=707) FMVEC(I),SCA(I)
171 CONTINUE
707 FORMAT(’ SCA’,1X,A16,F20.6)
CLOSE(8)
705 PRINT *, ’ DO YOU DESIRE FIXED EFFECT ESTIMATES? (Y OR N) ’
READ(6,1501) DUMDUM
IF(DUMDUM.EQ.’N’) GO TO 706
PRINT *, ’ WHAT IS THE FILENAME FOR FIXED EFFECTS ESTIMATES? ’
READ(6,1503) FLNAME
OPEN(9,FILE = FLNAME,STATUS = ’UNKNOWN’)
WRITE(9,FMT=708) BHAT(1)
708 FORMAT(’ MU’,T15,F20.6)
J=1

DO 172 I=1,3
IF(DTERM(I,2).EQ.’F’) THEN
DO 173 K=1,NCL(I)
J=J+1
IF(I.EQ.1) THEN
WRITE(9,FMT=711) LOCO(K),BHAT(J)
ENDIF
IF(I.EQ.3) THEN
WRITE(9,FMT=711) DISSET(K),BHAT(J)
ENDIF
WRITE(9,FMT=709) NAME(I),K,BHAT(J)
173 CONTINUE
ENDIF
172 CONTINUE
711 FORMAT(A8,T15,F20.6)
709 FORMAT(A11,I3,F20.6)
CLOSE(9)
DO 726 I=1,NOVARG
VARG(I) = 0.D0
726 CONTINUE
DO 727 I=1,NVARBH
VARBH(I) = 0.D0
727 CONTINUE
706 PRINT *,’ DO YOU DESIRE THE ASYMPTOTIC VARIANCE COVARIANCE’
PRINT *,’ MATRIX FOR VARIANCE COMPONENTS? (Y OR N) ’
READ(6,1501) DUMDUM
IF(DUMDUM.EQ.’N’) GO TO 751
PRINT *,’ WHAT IS THE FILENAME FOR VAR(VC)? ’
READ(6,1503) FLNAME
OPEN(12,FILE = FLNAME,STATUS =’UNKNOWN’)
WRITE(12,755)
755 FORMAT(’ ASYMPTOTIC VARIANCE COVARIANCE MATRIX’,/)
DO 752 I=1,NORAN
DO 753 J = I,NORAN
SOL(I,J) = SOL(I,J)*2.0
WRITE(12,754) RANNAM(I),RANNAM(J),SOL(I,J)
753 CONTINUE
752 CONTINUE
754 FORMAT(A11,T15,A11,T30,F20.10)
751 PRINT*,’DO YOU DESIRE THE ERROR VARIANCE COVARIANCE MATRIX FOR
NGCA? (Y OR N) ’
READ(6,1501) DUMDUM
IF(DUMDUM.EQ.’N’) GO TO 715
PRINT *,’ WHAT IS THE FILENAME FOR EVAR(GHAT)? ’
READ(6,1503) FLNAME
OPEN(10,FILE = FLNAME,STATUS =’UNKNOWN’)
CALL VARX(VARG,VARBH)
WRITE(10,721)
K=0
DO 716 I=1,NCOLG
DO 717 J=I,NCOLG
K=K+1
WRITE(10,718) VARG(K),PARENT(I),PARENT(J)
717 CONTINUE
716 CONTINUE
721 FORMAT(’THE ERROR VARIANCE COVARIANCE MATRIX FOR GCA ARRAYED
NAS A VECTOR’,/)
718 FORMAT(F20.10,T25,A8,T35,A8)
CLOSE(10)
715 PRINT *, ’ DO YOU DESIRE THE VARIANCE COVARIANCE MATRIX FOR FIXED
N EFFECTS? (Y OR N) ’
READ(6,1501) DUMB
IF(DUMB.EQ.’N’) GO TO 719
IF(DUMDUM.EQ.’N’) CALL VARX(VARG,VARBH)
PRINT *, ’ WHAT IS THE FILENAME FOR VAR(BETAHAT)? ’
READ(6,1503) FLNAME
OPEN(11,FILE = FLNAME,STATUS =’UNKNOWN’)
K = 0
DO 723 I=1,NCLFIX
DO 724 J = I,NCLFIX
K = K+ 1
WRITE(11,722) VARBH(K)
724 CONTINUE
723 CONTINUE
722 FORMAT(F20.10)
CLOSE(11)
719 STOP
END
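The control flow of the iteration in the main program above (repeat an update until the sum of absolute changes in the estimates falls below CVERG, default .0001, or NOITS iterations, default 30, have run) can be sketched as follows. This is not a translation of GAREML: the `update` argument is a hypothetical stand-in for one pass of CALL DESIGN followed by CALL L2SWP, and the toy update used in the example is invented purely to exercise the loop.

```python
def iterate_to_convergence(update, start, tol=1e-4, max_iter=30):
    """Repeat current = update(current) until the sum of absolute changes
    is below tol or max_iter passes have run.

    Returns (estimates, iterations_used, converged).
    """
    current = list(start)
    for iteration in range(1, max_iter + 1):
        new = list(update(current))
        change = sum(abs(a - b) for a, b in zip(new, current))
        current = new
        if change < tol:
            return current, iteration, True
    return current, max_iter, False


# Toy update (an assumption for illustration): move each value halfway
# toward a fixed target on every pass.
TARGET = [2.0, 1.0]


def toy_update(values):
    return [(v + t) / 2.0 for v, t in zip(values, TARGET)]


estimates, n_iter, converged = iterate_to_convergence(toy_update, [0.0, 0.0])
print(estimates, n_iter, converged)
```

As in the listing, failing to converge within the iteration limit does not raise an error; the caller receives the last values and can inspect the convergence flag.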
C*******************************************************
C SUBROUTINE L2SWP SWPS THE DESIGNATED COLUMNS OF A MATRIX X AND
C RETURNS THE SWEPT MATRIX AS X
SUBROUTINE L2SWP(X,NROWX,NCOLXX,NSTA,NEND)
INTEGER NROWX,NCOLXX,NSTA,NEND,NTOT
DOUBLE PRECISION X(9,10), DMIN, D, B, BB(10)
C NSWP DEFINES THE PIVOT COLUMNS FOR SWP
DMIN = 1E-8
C IF LESS THAN FULL RANK MATRICES ARE ENCOUNTERED, DMIN MUST BE
C EMPLOYED
C TO ZERO THE ROW AND COLUMN ASSOCIATED WITH THE DEPENDENCY TO
C PRODUCE A GENERALIZED INVERSE
DO 10 K = NSTA,NEND
D = X(K,K)
IF (D.LE.DMIN) THEN
DO 21 I=1,NROWX
DO 22 J=1,NCOLXX
X(I,K) = 0.0
X(K,J) = 0.0
22 CONTINUE
21 CONTINUE
GO TO 10
ENDIF
DO 20 J = 1,NCOLXX
X(K,J) = X(K,J)/D
20 CONTINUE
DO 30 I = K+1,NROWX
C I SHOULD BE INCREMENTED SO THAT I IS NOT EQUAL TO K
B = X(I,K)
DO 40 L=1,NCOLXX
X(I,L) = X(I,L)-B*X(K,L)
40 CONTINUE
X(I,K) = -B/D
30 CONTINUE
X(K,K)= 1/D
C BACKWARD ELIMINATION
NTOT = NSTA + NEND
IF(NTOT.EQ.2) GO TO 61
C SAVING ABOVE DIAGONAL ENTRIES FOR MULTIPLICATION WEIGHTS
KK= 1
DO 12 J = 1,K-1
BB(KK) = X(J,K)
KK=KK+1
12 CONTINUE
C ZEROING ABOVE DIAGONAL ENTRIES FOR INSERTION OF INVERSE VALUES
DO 13 I=1,K-1
X(I,K) = 0.0
13 CONTINUE
C DOING ROW OPERATIONS TO CREATE ABOVE DIAGONAL ENTRIES FOR INVERSE
N= 1
DO 70 M = 1,K-1
B = BB(N)
N = N+1
DO 80 J = 1,NCOLXX
X(M,J) = X(M,J)-B*X(K,J)
80 CONTINUE
70 CONTINUE
10 CONTINUE
61 RETURN
END
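For readers unfamiliar with sweeping, the following is a minimal sketch of a textbook sweep operator on a full (unpacked) matrix. It is not a line-by-line translation of L2SWP, which handles the rows above and below the pivot in separate passes; the function name `sweep` and the example system are assumptions for illustration. Sweeping every pivot of a square block replaces it with its inverse, and any extra right-hand-side column ends up holding the solution. Pivots no larger than `dmin` in absolute value have their row and column zeroed, in the spirit of the DMIN comments in the listing.

```python
def sweep(matrix, pivots, dmin=1e-8):
    """Sweep the given pivot positions of a matrix (list of rows).

    The input is not modified; a swept copy is returned.
    """
    a = [[float(x) for x in row] for row in matrix]
    n_rows, n_cols = len(a), len(a[0])
    for k in pivots:
        d = a[k][k]
        if abs(d) <= dmin:
            # Dependency: zero the row and column (generalized inverse).
            for i in range(n_rows):
                a[i][k] = 0.0
            for j in range(n_cols):
                a[k][j] = 0.0
            continue
        for j in range(n_cols):
            a[k][j] /= d
        for i in range(n_rows):
            if i == k:
                continue
            b = a[i][k]
            for j in range(n_cols):
                a[i][j] -= b * a[k][j]
            a[i][k] = -b / d
        a[k][k] = 1.0 / d
    return a


# Solve 2x + y = 3, x + 2y = 3 by sweeping both pivots of [A | b].
swept = sweep([[2.0, 1.0, 3.0],
               [1.0, 2.0, 3.0]], pivots=[0, 1])
solution = [row[2] for row in swept]
inverse = [row[:2] for row in swept]
print(solution)
print(inverse)
```

This mirrors how the main program uses the swept SOL array: the last column is read off as the new variance component estimates.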
C DESIGN CREATES DESIGN MATRICES FOR MAIN EFFECTS AND INTERACTIONS
C AND FORMS THE NORMAL EQUATIONS
SUBROUTINE DESIGN
PARAMETER (

N NOBSER = 5000,
N NOBL = 36,
N NOCR=75,
N NOBH = 200,
N NOGCA = 50,
N NOX= 1400,
N NOCBS= 1000,
N NTOT = NOX + NOCBS,
N NIZED = NOX*NOCBS,
N NIXPX = ((NOX*(NOX-1 ))/2) + NOX,
N NSIP = NOX + NOCBS,
N NIZEP=((NSIP*(NSIP-1 ))/2) + NSIP)
COMMON/CMN1/ NCOLT,NCOLTB,NCOLG,NCOLS,NCOLGT,NCOLST,NOBS,
N NCOLB,NCOLX,NCOLCB,NCL(9),NORAN,NOFIX,NCLFIX,
N NCLRAN,NCOLSE,NRAN(9)
COMMON/CMN2/
N YQVQY(9),VQVQ(9,9),MEAN(NOBSER),SIG(9),GCA(NOGCA),
N BHAT(NOBH),SCA(NOCR)
COMMON/CMN3/ DTERM(8,2),RANNAM(9),DUM2,FMVEC(NOCR),
N PARENT(NOGCA),LOCO(10),REP(NOBL),DISSET(10)
DIMENSION TK(:),D(:),P(:),TRACER(9)
ALLOCATABLE :: TK,D,P
INTEGER NCOLT,NCOLTB,NCOLG,NCOLS,NOBS,NCOLGT,NCOLST,NVEC,
NNCOLCB,NCOLX,NSTA,NEND,NCOLRD,NSTAK,NENDK,NWNUM1,NWNUM2,
NINUM,
NNWNUM3,NCL,NBIG,NORAN,NOFIX,NRAN,NCLFIX,NCLRAN,NDFIX,NCOLSE,
NNODUM
DOUBLE PRECISION TK,MEAN,D,TR,TRACER,YQVQY,VQVQ,SIG,GCA,
N BHAT,P,SUB,SCA
CHARACTER* 1 DTERM,DUM2
CHARACTER*8 PARENT,LOCO,REP,DISSET
CHARACTER*16 FMVEC
CHARACTER* 11 RANNAM
NBIG = NRAN(NORAN)
DO 1012 I=1,NORAN-1
IF(NRAN(I).GT.NBIG) NBIG = NRAN(I)
1012 CONTINUE
NWNUM1 = (NCOLX*(NCOLX-1))/2 + NCOLX
NWNUM2 = NCOLX*NBIG
NWNUM3 = ((NCOLX + NBIG)*(NCOLX + NBIG-1 ))/2 + NCOLX + NBIG
ALLOCATE (TK(NWNUM1),D(NWNUM2),P(NWNUM3))
READ(13) TK
DO 10 I=1,NWNUM1
TK(I)=TK(I)/SIG(NORAN)
10 CONTINUE
DO 11 I=1,NWNUM2
D(I) = 0.0
11 CONTINUE

DO 12 I=1,NWNUM3
P(I)=0.0
12 CONTINUE
C****************************************************************
C FORMING THE MATRIX TO BE SWP TO PRODUCE YQVQY AND VQVQ
C****************************************************************
C******************************************************************
C TK = X’*INV(VK)*X COMPLETED
C******************************************************************
NSTAK = 2 + NCLFIX + NCLRAN
DO 1300 INUM = 1,NORAN-1
NODUM = NORAN-INUM
NCOLRD = NCOLX + NRAN(NODUM)
NSTAK = NSTAK-NRAN(NODUM)
NENDK = NSTAK + NRAN(NODUM)-1
DO 251 I = NSTAK,NENDK
M = NVEC(I, NCOLX)
II = I-NSTAK+ 1
N = NVEC(II,NCOLRD)
NN = N
DO 252 J = I,NENDK
M = M+ 1
NN=NN+1
P(NN)=TK(M)*SIG(NODUM)
252 CONTINUE
P(N+ 1) = P(N+ 1) + 1.0
251 CONTINUE
C R = I + SIG(I)*(Zi’*INV(VK)*Zi) HAS BEEN FORMED
K = 0
DO 254 J = NSTAK,NENDK
DO 255 I=1,NCOLX
K = K+ 1
IF(J.LT.I) THEN
D(K)=0.0
GO TO 255
ENDIF
M = NVEC(I, NCOLX)
M = M + J-I+ 1
D(K)=TK(M)*SQRT(SIG(NODUM))
255 CONTINUE
254 CONTINUE
DO 222 I = NSTAK,NENDK
N = NVEC(I,NCOLX)
II = I-NSTAK+ 1
NN = NCOLX*(II-1)
DO 223 J = I,NCOLX

M = N+J-I+ 1
K = NN +J
D(K)=TK(M)*SQRT(SIG(NODUM))
223 CONTINUE
222 CONTINUE
C**************************************************************
C D=Zi’*INV(VK)*X*SQRT(SIG(I)) HAS BEEN FORMED
C**************************************************************
C**************************************************************
C TD = D’
C**************************************************************
K=0
NEND = NRAN(NODUM)
DO 22 I=1,NRAN(NODUM)
N = NVEC(I,NCOLRD)
DO 23 J = NRAN(NODUM)+1,NCOLRD
K = K+ 1
M = N+J-I+ 1
P(M) = D(K)
23 CONTINUE
22 CONTINUE
DO 25 I = NRAN(NODUM)+1,NCOLRD
K = NVEC(I,NCOLRD)
II = I-NRAN(NODUM)
M = NVEC(II,NCOLX)
DO 26 J = I,NCOLRD
K = K+ 1
M = M+ 1
P(K)=TK(M)
26 CONTINUE
25 CONTINUE
C P = (R||D)//(TD||TK)
CALL VECSWP(P,NCOLRD,NCOLRD,1,NRAN(NODUM))
K = 0
DO 226 I=1,NCOLX
II = I + NRAN(NODUM)
M = NVEC(II,NCOLRD)
DO 227 J = I,NCOLX
K = K+ 1
M = M+ 1
TK(K) = P(M)
227 CONTINUE
226 CONTINUE
1300 CONTINUE
K=0
DO 826 I=1,NCOLX

II = I + NRAN(1)
M = NVEC(II,NCOLRD)
DO 827 J = I,NCOLX
K = K+ 1
M = M+ 1
TK(K) = P(M)
827 CONTINUE
826 CONTINUE
NDFIX = NCLFIX
CALL VC2SWP(TK,NCOLX,NCOLX, 1,NCLFIX,NDFIX)
C******************************************************************
C PORTIONS OF TK ARE SELECTED AND MULTIPLIED AND THE TRACE
C CALCULATED TO FORM VQVQ
C****************************************************************
C*****************COLUMNS 1 TO NORAN-1 OF VQVQ*********************
NEND=1 +NCLFIX
DO 841 J = 1,NORAN-1
NSTA = NEND+ 1
NEND = NSTA + NRAN(J)-1
TR = 0.0
NSTAK=NEND+1
DO 838 I=J,NORAN-1
IF(I.EQ.J) THEN
DO 828 II = NSTA,NEND
N = NVEC(II,NCOLX)
DO 830 K = II,NEND
M = N + K-II +1
IF(II.EQ.K) THEN
TR=TR + TK(M)*TK(M)
GO TO 830
ENDIF
TR = TR + 2*TK(M)*TK(M)
830 CONTINUE
828 CONTINUE
VQVQ(J,I)=TR
GO TO 838
ENDIF
NENDK = NSTAK + NRAN(I)-1
TR = 0.0
DO 833 L = NSTA,NEND
N = NVEC(L,NCOLX)
DO 835 K = NSTAK,NENDK
M = N + K-L+ 1
TR=TR + TK(M)*TK(M)
835 CONTINUE
833 CONTINUE
NSTAK=NENDK+1
VQVQ(J,I)=TR

838 CONTINUE
841 CONTINUE
C***************COLUMN NORAN OF VQVQ****************************
DO 932 I=1,NORAN-1
TRACER(I) = 0.0
DO 933 J = I,NORAN-1
VQVQ(J,I) = VQVQ(I,J)
933 CONTINUE
932 CONTINUE
NSTA = 2 + NCLFIX
DO 935 J = 1,NORAN-1
NEND = NSTA + NRAN(J)-1
DO 934 I = NSTA,NEND
N = NVEC(I,NCOLX)
N = N+1
TRACER(J)=TRACER(J) + TK(N)
934 CONTINUE
NSTA = NEND +1
935 CONTINUE
DO 938 I=1,NORAN-1
VQVQ(I,NORAN)=TRACER(I)
938 CONTINUE
SUB = 0.0
DO 936 I=1,NORAN-1
SUB = SUB + TRACER(I)*SIG(I)
DO 937 J = 1,NORAN-1
VQVQ(I,NORAN) = VQVQ(I,NORAN)-(SIG(J)*VQVQ(I,J))
937 CONTINUE
VQVQ(I,NORAN) = VQVQ(I,NORAN)/SIG(NORAN)
936 CONTINUE
NSTAK=NOBS-NDFIX
TR = FLOAT(NSTAK)
VQVQ(NORAN,NORAN) = (TR-SUB)/SIG(NORAN)
DO 940 I=1,NORAN-1
VQVQ(NORAN,NORAN) = VQVQ(NORAN,NORAN)-(SIG(I)*VQVQ(I,NORAN))
940 CONTINUE
VQVQ(NORAN,NORAN) = VQVQ(NORAN,NORAN)/SIG(NORAN)
DO 941 I=1,NORAN-1
VQVQ(NORAN,I) = VQVQ(I,NORAN)
941 CONTINUE
C*************FORMING VECTOR OF FIXED EFFECTS ESTIMATES*********
DO 951 I=1,NCLFIX
N = NVEC(I,NCOLX)
N = N + NCLFIX-I+2
BHAT(I)=TK(N)
951 CONTINUE
C*************FORMING VECTORS OF PREDICTIONS**************
DO 952 I = 1,9
IF(RANNAM(I).EQ.’GCA’) THEN
NSTA = I
GO TO 953
ENDIF
952 CONTINUE
GO TO 955
953 NEND = 0
DO 954 I=1,NSTA-1
NEND = NEND + NRAN(I)
954 CONTINUE
L=NEND+1
N = NVEC(NCLFIX+1,NCOLX)
L = L + N
DO 955 I=1,NCOLG
L = L+ 1
GCA(I) = TK(L)
955 CONTINUE
DO 962 I=1,9
IF(RANNAM(I).EQ.’SCA’) THEN
NSTA = I
GO TO 963
ENDIF
962 CONTINUE
GO TO 965
963 NEND=0
DO 964 I=1,NSTA-1
NEND = NEND + NRAN(I)
964 CONTINUE
L=NEND+1
N = NVEC(NCLFIX + 1 ,NCOLX)
L=L + N
DO 965 I = 1,NCOLS
L = L+ 1
SCA(I) = TK(L)
965 CONTINUE
C**************FORMING YQVQY**************************************
NSTA = NCLFIX + 2
NEND = NSTA + NRAN(1)-1
N = NVEC(NCLFIX+1,NCOLX)
DO 926 J=1,NORAN-1
DO 925 I = NSTA,NEND
M = N + I-NCLFIX
YQVQY(J) = YQVQY(J) + TK(M)*TK(M)
925 CONTINUE
NSTA = NEND+ 1
NEND = NSTA + NRAN(J+ 1)-1
926 CONTINUE
NSTA = NVEC(NCLFIX+1,NCOLX)+1
YQVQY(NORAN)=TK(NSTA)
DO 927 I=1,NORAN-1
YQVQY(NORAN) = YQVQY(NORAN)-(SIG(I)*YQVQY(I))
927 CONTINUE
YQVQY(NORAN) = YQVQY(NORAN)/SIG(NORAN)
DEALLOCATE (TK,D,P)
RETURN
END
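DESIGN and VECSWP hold symmetric matrices as vectors, with lengths of the form N*(N-1)/2 + N. The sketch below shows one indexing scheme consistent with the index expression that appears in VECSWP, -(I*(I-1))/2 + K + NCOLXX*(I-1): the upper triangle stored row by row with 1-based positions. The function names are hypothetical, and NVEC itself is not defined in this part of the listing, so this should be read as an inferred illustration rather than the program's own routine.

```python
def packed_index(i, j, n):
    """1-based position of element (i, j) of an n x n symmetric matrix
    whose upper triangle is stored row by row in a vector."""
    if j < i:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one slot
    return n * (i - 1) - (i * (i - 1)) // 2 + j


def pack_upper(matrix):
    """Flatten the upper triangle (including the diagonal) row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]


def lookup(packed, i, j, n):
    """Read element (i, j), 1-based, from the packed vector."""
    return packed[packed_index(i, j, n) - 1]


symmetric = [[1.0, 2.0, 3.0],
             [2.0, 4.0, 5.0],
             [3.0, 5.0, 6.0]]
packed = pack_upper(symmetric)
print(packed)
print(lookup(packed, 3, 2, 3))
```

Storing only the upper triangle roughly halves the memory needed, which matters for the array sizes declared in the PARAMETER statements.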
C****************************************************************
C THIS FUNCTION COUNTS THE NUMBER OF ENTRIES FOR AN EFFECT
SUBROUTINE NOCOL(VEC,OBS,VEC1,NCOL)
PARAMETER (
N NOBSER = 5000)
INTEGER OBS,NCOL
CHARACTER*8 VEC(NOBSER),VEC1(*),Z,X,NT
DO 11 I=1,OBS
IF(I.EQ.1) THEN
VEC1(I)=VEC(I)
NCOL=1
GO TO 11
ENDIF
DO 12 J = 1,NCOL
X = VEC(I)
Z = VEC1(J)
IF(X.EQ.Z) GOTO 11
12 CONTINUE
NCOL=NCOL+1
VEC1(NCOL) = VEC(I)
11 CONTINUE
DO 159 K=1,NCOL-1
N = K +1
DO 159 J = N,NCOL
IF(VEC1(K).LT.VEC1(J)) GO TO 159
NT=VEC1(K)
VEC1(K) = VEC1(J)
VEC1(J) = NT
159 CONTINUE
RETURN
END
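What NOCOL does (collect the distinct labels of an effect, count them, and return them in ascending order) can be sketched compactly. The function name `count_levels` and the example labels are hypothetical; the sketch keeps the same first-seen scan followed by a sort rather than relying on a set, to stay close to the structure of the listing.

```python
def count_levels(labels):
    """Return (sorted distinct labels, number of distinct labels)."""
    levels = []
    for label in labels:
        if label not in levels:  # linear scan, as in the DO 12 loop
            levels.append(label)
    levels.sort()                # ascending order, as in the DO 159 loop
    return levels, len(levels)


# Hypothetical block labels read from a data file.
levels, n_levels = count_levels(["B2", "B1", "B2", "B3", "B1"])
print(levels, n_levels)
```

The sorted order determines which design-matrix column each level occupies, so output labelled by level lines up with the estimates.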
C****************************************************************
C THIS FUNCTION COUNTS THE NUMBER OF ENTRIES FOR PARENTS
SUBROUTINE NOPAR(VEC1,VEC2,OBS,VEC3,NPAR)
PARAMETER (
N NOBSER = 5000,
N NOGCA = 50)
INTEGER OBS,NPAR
CHARACTER*8 VEC1(NOBSER),VEC2(NOBSER),VEC3(NOGCA),Y,Z,X,NT
DO 11 I=1,OBS
IF(I.EQ.1) THEN
VEC3(I) = VEC1(I)
VEC3(I+1) = VEC2(I)
NPAR=2
GO TO 11
ENDIF
DO 12 J = 1,NPAR
X = VEC1(I)
Z = VEC3(J)
IF(X.EQ.Z) GO TO 15
12 CONTINUE
NPAR = NPAR + 1
VEC3(NPAR) = VEC1(I)
15 DO 13 K = 1,NPAR
Y = VEC2(I)
Z = VEC3(K)
IF(Y.EQ.Z) GOTO 11
13 CONTINUE
NPAR = NPAR + 1
VEC3(NPAR) = VEC2(I)
11 CONTINUE
DO 159 K=1,NPAR-1
N = K+ 1
DO 159 J = N,NPAR
IF(VEC3(K).LT.VEC3(J)) GO TO 159
NT=VEC3(K)
VEC3(K) = VEC3(J)
VEC3(J) = NT
159 CONTINUE
RETURN
END
C******************************************************************
C**VECSWP PRODUCES A G2 INVERSE OF A SYMMETRIC MATRIX STORED AS**
C*************************A VECTOR*********************************
C******************************************************************
SUBROUTINE VECSWP(VEC,NROWX,NCOLXX,NSTA,NEND)
PARAMETER (
N NOBSER = 5000,
N NOBL = 36,
N NOCR = 75,
N NOBH = 200,
N NOGCA = 50,
N NOX= 1400,
N NOCBS = 1000,
N NTOT=NOX + NOCBS,
N NIZED = NOX*NOCBS,
N NIXPX = ((NOX*(NOX-1))/2) + NOX,
N NSIP = NOX + NOCBS,
N NIZEP = ((NSIP*(NSIP-1))/2) + NSIP)
DIMENSION VEC(*),V(:)
ALLOCATABLE V
INTEGER NROWX,NCOLXX,NSTA,NEND,NUMB,NVEC,V,NUM1,NUM2
DOUBLE PRECISION VEC,DMIN,D,B,C
ALLOCATE (V(NCOLXX))
DMIN= 1.0D-8
DO 9 I = 1,NCOLXX
V(I)=1
9 CONTINUE
DO 10 K = NSTA,NEND
NUM2 = -(K*(K-3))/2 + NCOLXX*(K-l)
NUMB = NUM2-1
D = VEC(NUM2)
IF (DABS(D).LE.DMIN) THEN
DO 22 I=1,K
IF(I.EQ.K) THEN
NUM2 = -(I*(I-3))/2 + NCOLXX*(I-1)
GO TO 53
ENDIF
NUM2 = -(I*(I-1))/2 + K + NCOLXX*(I-1)
53 VEC(NUM2) = 0.0
22 CONTINUE
NUM2 = NUMB +1
DO 21 J = K+1,NCOLXX
NUM2 = NUM2 + 1
VEC(NUM2) = 0.0
21 CONTINUE
GO TO 10
ENDIF
DO 23 I = 1,NROWX
IF(I.EQ.K) GO TO 23
NUM1 = NVEC(I,NCOLXX)
IF(I.LT.K) THEN
NUM2 = NUM1 +K-I+1
B = VEC(NUM2)/D
GO TO 27
ENDIF
NUM2 = NUMB + I-K+ 1
B = (FLOAT(V(I))*FLOAT(V(K))*VEC(NUM2))/D
27 IF(DABS(B).LT.(1.0D-20)) GO TO 23
DO 24 J = I,NCOLXX
IF(J.EQ.K) GO TO 24
IF(K.LT.J) THEN
NUM2 = NUMB + J-K+1
C = VEC(NUM2)
GO TO 28
ENDIF
NUM2 = -(J*(J-1))/2 + K + NCOLXX*(J-1)
C = FLOAT(V(J))*FLOAT(V(K))*VEC(NUM2)
28 IF(DABS(C).LT.(1.0D-20)) GO TO 24
NUM2 = NUM1 + J -I +1
VEC(NUM2) = VEC(NUM2)-(B*C)
24 CONTINUE
23 CONTINUE
DO 26 J = K,NCOLXX
NUM2=NUMB +J-K+1
VEC(NUM2) = VEC(NUM2)/D
26 CONTINUE
DO 25 I=1,K
IF(I.EQ.K) THEN
NUM2 = -(I*(I-3))/2 + NCOLXX*(I-1)
GO TO 54
ENDIF
NUM2 = -(I*(I-1))/2 + K + NCOLXX*(I-1)
54 VEC(NUM2) = -VEC(NUM2)/D
25 CONTINUE
VEC(NUMB+1) = 1/D
V (K) = -V (K)
10 CONTINUE
DEALLOCATE (V)
RETURN
END
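The pivoting step that VECSWP applies to the half-stored matrix can be illustrated on an ordinary full (dense) matrix. The sketch below is a simplified, hypothetical version written in free-form Fortran: it uses the textbook sweep operator on a full array rather than the packed vector and sign-tracking array V of the listing, and the names sweep_demo, sweep, a and expected are assumptions made for this example. Sweeping every pivot of a nonsingular matrix yields its inverse; a pivot below the tolerance has its row and column zeroed, in the manner of the listing's treatment of near-zero pivots.

```fortran
program sweep_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a(2,2), expected(2,2)
  integer :: k

  ! Hypothetical symmetric test matrix and its known inverse.
  a = reshape([4.0_dp, 2.0_dp, 2.0_dp, 3.0_dp], [2, 2])
  expected = reshape([0.375_dp, -0.25_dp, -0.25_dp, 0.5_dp], [2, 2])

  ! Sweeping on every diagonal element in turn produces the inverse.
  do k = 1, 2
     call sweep(a, 2, k)
  end do

  ! Self-check against the known inverse.
  if (maxval(abs(a - expected)) > 1.0e-12_dp) stop 1
  print *, a

contains

  subroutine sweep(a, n, k)
    integer, intent(in) :: n, k
    real(dp), intent(inout) :: a(n, n)
    real(dp), parameter :: dmin = 1.0e-8_dp   ! same tolerance as DMIN
    real(dp) :: d, b
    integer :: i

    d = a(k, k)
    if (abs(d) <= dmin) then
       ! Near-zero pivot: zero the row and column (generalized inverse).
       a(k, :) = 0.0_dp
       a(:, k) = 0.0_dp
       return
    end if
    a(k, :) = a(k, :) / d              ! scale the pivot row
    do i = 1, n
       if (i == k) cycle
       b = a(i, k)
       a(i, :) = a(i, :) - b * a(k, :) ! eliminate column k from row i
       a(i, k) = -b / d
    end do
    a(k, k) = 1.0_dp / d
  end subroutine sweep

end program sweep_demo
```

Working on the full matrix doubles the storage but avoids the index arithmetic; the listing trades that simplicity for the smaller half-stored vector.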
C**VC2SWP PRODUCES A G2 INVERSE OF A SYMMETRIC MATRIX STORED AS**
C*************************A VECTOR*********************************
C******************************************************************
SUBROUTINE VC2SWP(VEC,NROWX,NCOLXX,NSTA,NEND,NDF)
PARAMETER (
N NOBSER = 5000,
N NOBL = 36,
N NOCR = 75,
N NOBH = 200,
N NOGCA = 50,
N NOX= 1400,
N NOCBS= 1000,
N NTOT=NOX + NOCBS,
N NIZED = NOX*NOCBS,
N NIXPX = ((NOX*(NOX-1))/2) + NOX,
N NSIP = NOX + NOCBS,
N NIZEP = ((NSIP*(NSIP-1))/2) + NSIP)
DIMENSION VEC(*),V(:)
ALLOCATABLE V
INTEGER NROWX,NCOLXX,NSTA,NEND,NUMB,NVEC,V,NUM1,NUM2,NDF
DOUBLE PRECISION VEC,DMIN,D,B,C
DMIN= 1.0D-8
ALLOCATE (V(NCOLXX))
DO 9 I = 1,NCOLXX
V(I)=1
9 CONTINUE
DO 10 K = NSTA,NEND
NUM2 = -(K*(K-3))/2 + NCOLXX*(K-1)
NUMB = NUM2-1
D = VEC(NUM2)
IF (DABS(D).LE.DMIN) THEN
NDF = NDF-1
DO 22 I=1,K
IF(I.EQ.K) THEN
NUM2 = -(I*(I-3))/2 + NCOLXX*(I-1)
GO TO 53
ENDIF
NUM2 = -(I*(I-1))/2 + K + NCOLXX*(I-1)
53 VEC(NUM2) = 0.0
22 CONTINUE
NUM2 = NUMB +1
DO 21 J = K+1,NCOLXX
NUM2 = NUM2 + 1
VEC(NUM2) = 0.0
21 CONTINUE
GO TO 10
ENDIF
DO 23 I = 1,NROWX
IF(I.EQ.K) GO TO 23
NUM1 = NVEC(I,NCOLXX)
IF(I.LT.K) THEN
NUM2 = NUM1 +K-I+1
B = VEC(NUM2)/D
GO TO 27
ENDIF
NUM2 = NUMB + I-K+1
B = (FLOAT(V(I))*FLOAT(V(K))*VEC(NUM2))/D
27 IF(DABS(B).LT.(1.0D-20)) GO TO 23
DO 24 J = I,NCOLXX
IF(J.EQ.K) GO TO 24
IF(K.LT.J) THEN
NUM2 = NUMB+J-K+ 1
C = VEC(NUM2)
GO TO 28
ENDIF
NUM2 = -(J*(J-1))/2 + K + NCOLXX*(J-1)
C = FLOAT(V(J))*FLOAT(V(K))*VEC(NUM2)
28 IF(DABS(C).LT.(1.0D-20)) GO TO 24
NUM2 = NUM1 + J -I +1
VEC(NUM2) = VEC(NUM2)-(B*C)
24 CONTINUE
23 CONTINUE
DO 26 J = K,NCOLXX
NUM2 = NUMB +J-K+1
VEC(NUM2) = VEC(NUM2)/D
26 CONTINUE
DO 25 I = 1,K
IF(I.EQ.K) THEN
NUM2 = -(I*(I-3))/2 + NCOLXX*(I-1)
GO TO 54
ENDIF
NUM2 = -(I*(I-1))/2 + K + NCOLXX*(I-1)
54 VEC(NUM2) = -VEC(NUM2)/D
25 CONTINUE
VEC(NUMB+ 1) = 1/D
V (K) = -V (K)
10 CONTINUE
DEALLOCATE (V)
RETURN
END
C**********************************************************
C******NVEC COUNTS THE PROPER POSITION OF AN ELEMENT*******
C*********IN THE HALF STORED MATRIX (AS A VECTOR)**********
C*******ACCORDING TO ITS NORMAL ROW COLUMN POSITION********
C*****************IN THE ORIGINAL MATRIX*******************
FUNCTION NVEC(NROWS,NCOLXX)
INTEGER NROWS,NCOLXX,NVEC
M = 0
DO 3 I = 1,NROWS
IF(I.EQ.1) GO TO 3
M = M + NCOLXX - (I-2)
3 CONTINUE
NVEC = M
RETURN
END
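The loop in NVEC accumulates the number of stored elements that precede a given row of the upper triangle, so element (i,j) with j not less than i sits at position NVEC(i,n) + j - i + 1 of the vector. The loop appears to reduce to the closed form (i-1)n - (i-1)(i-2)/2; the free-form Fortran sketch below checks that reading for a small matrix. The program name, the function name row_offset and the choice n = 4 are assumptions introduced for this example only.

```fortran
program packed_index_demo
  implicit none
  integer, parameter :: n = 4
  integer :: i, j, pos, expected

  ! Walk the upper triangle row by row. The running counter `expected`
  ! is the position each element should occupy in the packed vector.
  expected = 0
  do i = 1, n
     do j = i, n
        expected = expected + 1
        pos = row_offset(i, n) + j - i + 1
        if (pos /= expected) stop 1
     end do
  end do

  ! The packed vector holds n(n+1)/2 elements in total.
  if (expected /= n * (n + 1) / 2) stop 1
  print *, 'packed length =', expected

contains

  ! Closed form of the NVEC loop: elements stored before row i.
  integer function row_offset(i, n)
    integer, intent(in) :: i, n
    row_offset = (i - 1) * n - ((i - 1) * (i - 2)) / 2
  end function row_offset

end program packed_index_demo
```

With j = i the same expression gives the diagonal position, which matches the form -(K*(K-3))/2 + NCOLXX*(K-1) used for pivots in the sweep routines.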
SUBROUTINE XPRIMX(TEST,BLOCK,SET,F,M,FM)
PARAMETER (
N NOBSER = 5000,
N NOBL = 36,
N NOCR=75,
N NOBH = 200,
N NOGCA = 50,
N NOX= 1400,
N NOCBS= 1000,
N NTOT=NOX + NOCBS,
N NIZED = NOX*NOCBS,
N NIXPX = ((NOX*(NOX-1))/2) + NOX,
N NSIP = NOX + NOCBS,
N NIZEP = ((NSIP*(NSIP-1))/2) + NSIP)
COMMON/CMN1/ NCOLT,NCOLTB,NCOLG,NCOLS,NCOLGT,NCOLST,NOBS,
N NCOLB,NCOLX,NCOLCB,NCL(9),NORAN,NOFIX,NCLFIX,
N NCLRAN,NCOLSE,NRAN(9)
COMMON/CMN2/
N YQVQY(9),VQVQ(9,9),MEAN(NOBSER),SIG(9),GCA(NOGCA),
N BHAT(NOBH),SCA(NOCR)
COMMON/CMN3/ DTERM(8,2),RANNAM(9),DUM2,FMVEC(NOCR),
N PARENT(NOGCA),LOCO(10),REP(NOBL),DISSET(10)
DIMENSION X(:,:),DBLOCK(:,:),LOC(5,2),
N NULVEC(NOBSER),XPX(:)
ALLOCATABLE :: DBLOCK,XPX,X
INTEGER X,DBLOCK,NCOLT,NCOLTB,NCOLG,NCOLS,NCOLGT,NCOLST,
N NOBS,NCOLB,NCOLX,NCOLCB,NUM1,NCL,NORAN,NOFIX,
N NCLFIX,NCLRAN,NCOLSE,MLV,LOC,NRAN,NMISS,NULVEC
DOUBLE PRECISION XPX,YQVQY,VQVQ,MEAN,SIG,ZIP,ZAP,GCA,BHAT,SCA
CHARACTER*1 DTERM,DUM2
CHARACTER*8 PARENT,LOCO,REP,DISSET,TEST(NOBSER),BLOCK(NOBSER),
N SET(NOBSER),F(NOBSER),M(NOBSER)
CHARACTER*16 FMVEC,FM(NOBSER)
CHARACTER*11 RANNAM
ALLOCATE (X(NOBS,NCOLX),DBLOCK(NOBS,NCOLB))
PRINT*, ' ********FORMING THE DESIGN MATRIX**********'
J = 0
DO 1200 I=1,8
IF((NCL(I).GT.0).AND.(DTERM(I,2).EQ.'R')) THEN
J=J+1
NRAN(J) = NCL(I)
ENDIF
1200 CONTINUE
DO 47 I=1,NOBS
DO 127 K=1,NCOLB
DBLOCK(I,K) = 0
127 CONTINUE
DO 48 J = 1,NCOLX
X(I,J)=0
48 CONTINUE
47 CONTINUE
DO 31 I=1,NOBS
X(I,1)=1
31 CONTINUE
MLV = 1
IF((DTERM(1,1).EQ.'N').OR.(DTERM(1,2).EQ.'R')) GO TO 1101
DO 1001 I=1,NOBS
C FORMING DESIGN MATRIX FOR TEST
DO 5504 J = 1,NCOLT
IF(TEST(I).EQ.LOCO(J)) THEN
NJ=J + MLV
GO TO 5505
ENDIF
5504 CONTINUE
5505 X(I,NJ) = 1
1001 CONTINUE
LOC(1,1) = MLV+1
MLV = MLV + NCOLT
LOC(1,2) = MLV
C FORMING DESIGN MATRIX FOR BLOCK
1101 IF((DTERM(2,1).EQ.'N').OR.(DTERM(2,2).EQ.'R')) GO TO 1102
DO 1002 I=1,NOBS
DO 5501 J = 1,NCOLB
IF(BLOCK(I).EQ.REP(J)) THEN
NK=J
GO TO 5502
ENDIF
5501 CONTINUE
5502 DBLOCK(I,NK) = 1
1002 CONTINUE
NSTA = LOC(1,1)
NEND = LOC(1,2)
IF(DTERM(1,1).EQ.'N') THEN
NSTA= 1
NEND=1
ENDIF
DO 136 I=1,NOBS
L = MLV+1
DO 137 J = NSTA,NEND
DO 138 K=1,NCOLB
X(I,L) = X(I,J)*DBLOCK(I,K)
L = L+ 1
138 CONTINUE
137 CONTINUE
136 CONTINUE
LOC(2,1) = MLV+1
MLV = MLV + NCOLTB
LOC(2,2) = MLV
1102 IF((DTERM(3,1).EQ.'N').OR.(DTERM(3,2).EQ.'R')) GO TO 1103
DO 1003 I = 1,NOBS
DO 5506 J = 1,NCOLSE
IF(SET(I).EQ.DISSET(J)) THEN
NK=J + MLV
GO TO 5507
ENDIF
5506 CONTINUE
5507 X(I,NK)=1
1003 CONTINUE
LOC(3,1) = MLV+1
MLV = MLV + NCOLSE
LOC(3,2) = MLV
1103 MLV = MLV+1
IF((DTERM(1,1).EQ.'N').OR.(DTERM(1,2).EQ.'F')) GO TO 2101
DO 2001 I=1,NOBS
C FORMING DESIGN MATRIX FOR TEST
DO 5508 J = 1,NCOLT
IF(TEST(I).EQ.LOCO(J)) THEN
NJ=J + MLV
GO TO 5509
ENDIF
5508 CONTINUE
5509 X(I,NJ) = 1
2001 CONTINUE
LOC(1,1) = MLV+1
MLV = MLV + NCOLT
LOC(1,2) = MLV
C FORMING DESIGN MATRIX FOR BLOCK
2101 IF((DTERM(2,1).EQ.'N').OR.(DTERM(2,2).EQ.'F')) GO TO 2102
DO 2002 I=1,NOBS
DO 5510 J=1,NCOLB
IF(BLOCK(I).EQ.REP(J)) THEN
NK=J
GO TO 5511
ENDIF
5510 CONTINUE
5511 DBLOCK(I,NK) = 1
2002 CONTINUE
NSTA = LOC(1,1)
NEND = LOC(1,2)
IF(DTERM(1,1).EQ.'N') THEN
NEND=1
NSTA= 1
ENDIF
DO 36 I=1,NOBS
L = MLV + 1
DO 37 J = NSTA,NEND
DO 38 K=1,NCOLB
X(I,L) = X(I,J)*DBLOCK(I,K)
L=L+1
38 CONTINUE
37 CONTINUE
36 CONTINUE
LOC(2,1) = MLV+1
MLV = MLV + NCOLTB
LOC(2,2) = MLV
2102 IF((DTERM(3,1).EQ.'N').OR.(DTERM(3,2).EQ.'F')) GO TO 2103
DO 2003 I=1,NOBS
DO 5512 J=1,NCOLSE
IF(SET(I).EQ.DISSET(J)) THEN
NK=J + MLV
GO TO 5513
ENDIF
5512 CONTINUE
5513 X(I,NK)=1
2003 CONTINUE
LOC(3,1) = MLV + 1
MLV = MLV + NCOLSE
LOC(3,2) = MLV
C FORMING DESIGN MATRIX FOR GCA
2103 IF(DTERM(4,1).EQ.'N') GO TO 2104
DO 2004 I=1,NOBS
DO 5514 J=1,NCOLG
IF(F(I).EQ.PARENT(J)) THEN
NL=J + MLV
GO TO 5515
ENDIF
5514 CONTINUE
5515 X(I,NL) = 1
IF(DUM2.EQ.'H') GO TO 2004
DO 5516 K=1,NCOLG
IF(M(I).EQ.PARENT(K)) THEN
NN=K+MLV
GO TO 5517
ENDIF
5516 CONTINUE
5517 X(I,NN) = 1
2004 CONTINUE
LOC(4,1) = MLV+1
MLV = MLV + NCOLG
LOC(4,2) = MLV
2104 IF(DTERM(5,1).EQ.'N') GO TO 2105
NSTA = MLV
DO 34 I=1,NOBS
DO 35 J = 1,NCOLS
IF(FM(I).EQ.FMVEC(J)) THEN
X(I,J + NSTA)= 1
GO TO 34
ENDIF
35 CONTINUE
34 CONTINUE
LOC(5,1) = MLV+1
MLV = MLV + NCOLS
LOC(5,2) = MLV
2105 IF((DTERM(6,1).EQ.'N').OR.(DTERM(1,1).EQ.'N')) GO TO 2106
NSTA = LOC(1,1)
NEND = LOC(1,2)
NSTAK = LOC(4,1)
NENDK = LOC(4,2)
DO 49 I=1,NOBS
L = MLV+ 1
DO 39 J = NSTA,NEND
DO 40 K = NSTAK,NENDK
X(I,L) = X(I,J)*X(I,K)
L = L+ 1
40 CONTINUE
39 CONTINUE
49 CONTINUE
MLV = MLV + NCOLGT
2106 IF((DTERM(7,1).EQ.'N').OR.(DTERM(1,1).EQ.'N')) GO TO 2107
NSTAK = LOC(5,1)
NENDK = LOC(5,2)
DO 41 I=1,NOBS
L = MLV + 1
DO 42 J = NSTA,NEND
DO 43 K = NSTAK,NENDK
X(I,L) = X(I,J)*X(I,K)
L=L+ 1
43 CONTINUE
42 CONTINUE
41 CONTINUE
MLV = MLV + NCOLST
2107 IF((DTERM(8,1).EQ.'N').OR.(DTERM(2,1).EQ.'N')) GO TO 2108
NSTA = LOC(2,1)
NEND = LOC(2,2)
NSTAK = LOC(5,1)
NENDK = LOC(5,2)
IF(DUM2.EQ.'H') THEN
NSTAK = LOC(4,1)
NENDK = LOC(4,2)
ENDIF
DO 44 I=1,NOBS
L = MLV+1
DO 45 J = NSTA,NEND
DO 46 K = NSTAK,NENDK
X(I,L)=X(I,J)*X(I,K)
L = L+1
46 CONTINUE
45 CONTINUE
44 CONTINUE
C*****************************************************************
C X = MU | HT | T | TB | G | S | GT | ST | CB COMPLETED
C*****************************************************************
DEALLOCATE (DBLOCK)
PRINT*, '*******FINISHED FORMING THE DESIGN MATRIX**********'
PRINT*, '*******NOW CHECKING FOR NULL COLUMNS***************'
2108 NEND = NCLFIX + 1
NMISS=0
DO 3001 K = 1,NORAN-1
NSTA = NEND+ 1
NEND = NSTA + NRAN(K)-1
DO 3002 J = NSTA,NEND
DO 3003 I=1,NOBS
IF(X(I,J).NE.0) GO TO 3002
3003 CONTINUE
NRAN(K) = NRAN(K)-1
NMISS = NMISS+ 1
NULVEC(NMISS)=J
3002 CONTINUE
3001 CONTINUE
PRINT*,'***********FINISHED CHECKING FOR NULL COLUMNS*********'
WRITE(6,3006) NMISS
3006 FORMAT(' THERE WERE ',I4,' NULL COLUMNS')
IF(NMISS.EQ.0) GO TO 3011
PRINT *,'***********NOW DELETING NULL COLUMNS****************'
NULVEC(NMISS+1) = NCOLX+1
L = NULVEC(1)
DO 3021 I=1,NMISS
IF((NULVEC(I+1)-NULVEC(I)).EQ.1) GO TO 3021
DO 3022 J = NULVEC(I)+1,NULVEC(I+1)-1
DO 3023 K=1,NOBS
X(K,L)=X(K,J)
3023 CONTINUE
L = L+ 1
3022 CONTINUE
3021 CONTINUE
3011 NCLRAN = NCLRAN-NMISS
NCOLX = NCOLX-NMISS
NUM1 =(NCOLX*(NCOLX-1))/2 + NCOLX
ALLOCATE (XPX(NUM1))
DO 10 I = 1,NUM1
XPX(I) = 0.0
10 CONTINUE
PRINT*,'**********FORMING DOT PRODUCTS OF DESIGN COLUMNS*******'
DO 15 I=1,NCOLX
N = NVEC(I, NCOLX)
DO 16 J = I,NCOLX
N = N+ 1
DO 17 K=1,NOBS
XPX(N) = XPX(N) + (FLOAT(X(K,I))*FLOAT(X(K,J)))
17 CONTINUE
16 CONTINUE
15 CONTINUE
PRINT*,'********FORMING DOT PRODUCTS OF DESIGN COLUMNS AND THE D
NATA VECTOR********'
L=NCLFIX+1
DO 6 J = 1,NCOLX
IF(J.LE.NCLFIX) THEN
N = NVEC(J,NCOLX)
N = N + NCLFIX + 2-J
ENDIF
IF (J.GT.NCLFIX) THEN
N = NVEC(L,NCOLX)
N = N+J-NCLFIX
ENDIF
DO 7 K = 1,NOBS
ZAP=FLOAT(X(K,J))
ZIP = MEAN(K)
IF(J.EQ.L) ZAP = MEAN(K)
XPX(N) = XPX(N) + (ZIP*ZAP)
7 CONTINUE
6 CONTINUE
PRINT*,'*******ALL DOT PRODUCTS HAVE NOW BEEN FORMED********'
PRINT*,'***SAVING X PRIME X MATRIX FOR FUTURE ITERATIONS****'
WRITE(13) XPX
PRINT*,'*********X PRIME X IS STORED*********'
DEALLOCATE (X,XPX)
RETURN
END
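The dot-product step of XPRIMX, which fills the half-stored X'X row by row from an integer indicator matrix, can be sketched on a toy design as follows. This free-form Fortran program is a hypothetical illustration only: the four-observation design, the program name and the variable names are assumptions, and the sketch leaves out the data-vector column and the null-column removal that the listing also handles.

```fortran
program xpx_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nobs = 4, ncol = 3
  integer :: x(nobs, ncol), i, j, k, pos
  real(dp) :: xpx(ncol * (ncol + 1) / 2)

  ! Hypothetical design: column 1 is the mean, columns 2 and 3 are
  ! indicator columns for two levels of a single factor.
  x(:, 1) = 1
  x(:, 2) = [1, 1, 0, 0]
  x(:, 3) = [0, 0, 1, 1]

  ! Fill the upper triangle of X'X row by row into a packed vector.
  xpx = 0.0_dp
  pos = 0
  do i = 1, ncol
     do j = i, ncol
        pos = pos + 1
        do k = 1, nobs
           xpx(pos) = xpx(pos) + real(x(k, i), dp) * real(x(k, j), dp)
        end do
     end do
  end do

  ! Self-check: rows of the upper triangle are (4 2 2), (2 0), (2).
  if (any(abs(xpx - [4.0_dp, 2.0_dp, 2.0_dp, 2.0_dp, 0.0_dp, 2.0_dp]) &
          > 0.0_dp)) stop 1
  print *, xpx
end program xpx_demo
```

Because the design columns hold only zeros and ones, every entry of X'X is a count of observations, which is why the result can be compared exactly.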
£*****************|_J£JSJJ}£J^SJ’J¡ ALGORITHM***************************
C***********MODIFIED TO OUTPUT VARIANCE COVARIANCE****************
C****************MATRIX OF PREDICTIONS****************************
SUBROUTINE VARX(VARG,VARBH)
PARAMETER (
N NOBSER = 5000,
N NOBL=36,
N NOCR = 75,
N NOBH = 200,
N NVARBH = (NOBH*(NOBH-1))/2 + NOBH,
N NOGCA = 50,
N NOVARG = (NOGCA*(NOGCA-1))/2 + NOGCA,
N NOX= 1400,
N NIXPX = (NOX*(NOX-1))/2 + NOX,
N NOCBS = 1000,
N NTOT=NOX + NOCBS,
N NIZED = NOX*NOCBS,
N NSIP = NOX + NOCBS,
N NIZEP = ((NSIP*(NSIP-1))/2) + NSIP)
COMMON/CMN1/ NCOLT,NCOLTB,NCOLG,NCOLS,NCOLGT,NCOLST,NOBS,
N NCOLB,NCOLX,NCOLCB,NCL(9),NORAN,NOFIX,NCLFIX,
N NCLRAN,NCOLSE,NRAN(9)
COMMON/CMN2/
N YQVQY(9),VQVQ(9,9),MEAN(NOBSER),SIG(9),GCA(NOGCA),
N BHAT(NOBH),SCA(NOCR)
COMMON/CMN3/ DTERM(8,2),RANNAM(9),DUM2,FMVEC(NOCR),
N PARENT(NOGCA),LOCO(10),REP(NOBL),DISSET(10)
DIMENSION TK(:),D(:),VARG(NOVARG),VARBH(NVARBH),
N NSIG(9,2),XPX(:)
ALLOCATABLE :: TK,D,XPX
INTEGER NCOLT,NCOLTB,NCOLG,NCOLS,NCOLGT,NCOLST,NOBS,
N NCOLB,NCOLX,NCOLCB,NCL,NORAN,NOFIX,NCLFIX,NSIG,NCOLTK,
N NCLRAN,NCOLSE,NRAN,NSTA,NEND,NSTAK,NENDK,NCOLD,NOZERO,
N NUM1
DOUBLE PRECISION YQVQY,VQVQ,MEAN,SIG,GCA,BHAT,SCA,TK,D,
N VARG,VARBH,XPX
CHARACTER*1 DTERM,DUM2
CHARACTER*16 FMVEC
CHARACTER*11 RANNAM
CHARACTER*8 LOCO,PARENT,DISSET,REP
NUM1 =(NCOLX*(NCOLX-1))/2 + NCOLX
ALLOCATE (XPX(NUM1),D(NCOLX))
READ(13) XPX
K = 0
NOZERO=0
NCOLTK = NCLFIX
NCOLD = NCLFIX+1
DO 22 I = 1,NORAN-1
NCOLD = NCOLD + NRAN(I)
IF(SIG(I).EQ.0.0) THEN
NOZERO = NOZERO+1
NSIG(NOZERO,1) = NCOLD+1-NRAN(I)
NSIG(NOZERO,2) = NCOLD
GO TO 22
ENDIF
NCOLTK = NCOLTK + NRAN(I)
DO 21 J= 1,NRAN(I)
K = K+ 1
D(K) = SIG(I)
21 CONTINUE
22 CONTINUE
ALLOCATE (TK(NUM1))
K = 0
DO 302 I=1,NCOLX
IF(I.EQ.(NCLFIX+1)) GO TO 302
DO 23 L=1,NOZERO
IF((I.GE.NSIG(L,1)).AND.(I.LE.NSIG(L,2))) GO TO 302
23 CONTINUE
N = NVEC(I,NCOLX)
DO 301 J = I,NCOLX
IF(J.EQ.(NCLFIX+1)) GO TO 301
DO 24 L=1,NOZERO
IF((J.GE.NSIG(L,1)).AND.(J.LE.NSIG(L,2))) GO TO 301
24 CONTINUE
NN = N+J-I+1
K = K+1
TK(K) = XPX(NN)/SIG(NORAN)
301 CONTINUE
302 CONTINUE
K=0
DO 28 I = NCLFIX+1,NCOLTK
J = NVEC(I,NCOLTK)
N=J + 1
K = K+1
TK(N)=TK(N)+ (1.D0/(D(K)))
28 CONTINUE
DEALLOCATE (D,XPX)
C**************EQUATIONS HAVE NOW BEEN FORMED**************************
CALL VECSWP(TK,NCOLTK,NCOLTK,1,NCOLTK)
DO 952 I=1,9
IF(RANNAM(I).EQ.'GCA') THEN
NSTA=I
GO TO 953
ENDIF
952 CONTINUE
953 NEND = 0
DO 954 I = 1,NSTA-1
IF(SIG(I).EQ.0.0) GO TO 954
NEND = NEND + NRAN(I)
954 CONTINUE
NSTAK = NEND + NCLFIX +1
NENDK = NSTAK + NRAN(NSTA)-1
N=0
DO 955 I = NSTAK,NENDK
K = NVEC(I,NCOLTK)
DO 956 J = I,NENDK
KK = K+J-I+ 1
N = N+1
VARG(N)=TK(KK)
956 CONTINUE
955 CONTINUE
N = 0
DO 957 I=1,NCLFIX
K = NVEC(I,NCOLTK)
DO 958 J = I,NCLFIX
KK = K+J-I+1
N = N+ 1
VARBH(N)=TK(KK)
958 CONTINUE
957 CONTINUE
DEALLOCATE (TK)
RETURN
END

REFERENCE LIST
Banks, B.D., Mao, I.L. & Walter, J.P. 1985. Robustness of the restricted maximum likelihood
estimator derived under normality as applied to data with skewed distributions. J. Dairy
Sci. 68:1785-1792.
Becker, W.A. 1975. Manual of Quantitative Genetics. Washington State Univ. Press,
Pullman, WA. 170 pp.
Braaten, M.O. 1965. The union of partial diallel mating designs and incomplete block
environmental designs. North Carolina State Univ. Inst. of Stat. Mimeo. Series No.
432, 77pp.
Bridgwater, F.E., Talbert, J.T. & Jahromi, S. 1983. Index selection for increased dry weight in
a young loblolly pine population. Silvae Genet. 32:157-161.
Burdon, R.D. 1977. Genetic correlation as a concept for studying genotype-environment
interaction in forest tree breeding. Silvae Genet. 26:168-175.
Burdon, R.D. & Shelbourne, C.J.A. 1971. Breeding populations for recurrent selection:
Conflicts and possible solutions. N. Z. J. For. Sci. 1:174-193.
Burley, J., Burrows, P.M., Armitage, F.B. & Barnes, R.D. 1966. Progeny test designs for
Pinus patula in Rhodesia. Silvae Genet. 15:166-173.
Campbell, K. 1972. Genetic variability in juvenile height-growth of Douglas-fir. Silvae
Genet. 21:126-129.
Corbeil, R.R. & Searle, S.R. 1976. A comparison of variance component estimators.
Biometrics 32:779-791.
Falconer, D.S. 1981. Introduction to Quantitative Genetics. Longman & Co., New York, NY.
340 pp.
Foster, G.S. 1986. Trends in genetic parameters with stand development and their influence on
early selection for volume growth in loblolly pine. For. Sci. 32:944-959.
Foster, G.S. & Bridgwater, F.E. 1986. Genetic analysis of fifth-year data from a seventeen
parent partial diallel of loblolly pine. Silvae Genet. 35:118-122.
Freund, R.J. 1980. The case of the missing cell. Amer. Stat. 34:94-98.
Freund, R.J. & Littell, R.C. 1981. SAS for Linear Models. SAS Institute, Inc., Cary, NC. 231
pp.
Giesbrecht, F.G. 1983. Efficient procedure for computing minque of variance components and
generalized least squares estimates of fixed effects. Commun. Statist. -Theor. Meth.
12:2169-2177.
Gilbert, N.E.G. 1958. Diallel cross in plant breeding. Heredity 12:477-498.
Goodnight, J.H. 1979. A tutorial on the sweep operator. Amer. Stat. 33(3):149-158.
Graybill, F.A. 1976. Theory and Application of the Linear Model. Duxbury Press, North
Scituate, MA. 704 pp.
Greenwood, M.S., Lambeth, C.C. & Hunt, J.L. 1986. Accelerated breeding and potential
impact upon breeding programs. In: Southern Cooperative Series Bulletin No. 309.
Louisiana Ag. Experiment Station, Baton Rouge, LA. pp. 39-41.
Griffing, B. 1956. Concept of general and specific combining ability in relation to diallel
crossing systems. Aust. J. Biol. Sci. 9:463-493.
Hallauer, A.R. & Miranda, J.B. 1981. Quantitative Genetics in Maize Breeding. Iowa State
Univ. Press, Ames, IA. 468 pp.
Hartley, H.O. 1967. Expectations, variances and covariances of ANOVA mean squares by
"synthesis". Biometrics 21:467-480.
Hartley, H.O. & Rao, J.N.K. 1967. Maximum likelihood estimation for the mixed analysis of
variance model. Biometrika 54:93-108.
Harville, D.A. 1977. Maximum likelihood approaches to variance component estimation and to
related problems. J. Amer. Stat. Assoc. 72:320-338.
Henderson, C.R. 1953. Estimation of variance and covariance components. Biometrics
9:226-252.
Henderson, C.R. 1973. Sire evaluation and expected genetic advance. In: Animal Breeding and
Genetics Symposium in Honor of J. Lush, Animal Sci. Assoc. Amer., Champaign, Ill.
pp 10-41.
Henderson, C.R. 1974. General flexibility of linear model techniques for sire evaluation. J.
Dairy Sci. 57:963-972.
Henderson, C.R. 1977. Best linear unbiased prediction of breeding values not in the model for
records. J. Dairy Sci. 60:783-787.
Henderson, C.R. 1984. Applications of Linear Models in Animal Breeding. University of
Guelph, Guelph, Ontario, CAN. 462 p.
Henderson, C.R., Kempthorne, O., Searle, S.R. & Von Krosigk, C.N. 1959. Estimation of
environmental and genetic trends from records subject to culling. Biometrics 30:583-
588.
Hodge, G.R. & White, T.L. (in press). Genetic parameter estimates for growth traits at
different ages in slash pine. Silvae Genet.
Hogg, R.V. & Craig, A.T. 1978. Introduction to Mathematical Statistics. Fourth edition.
Macmillan Publ. Co. New York, NY. 438 pp.
Kackar, R.N. & Harville, D.A. 1981. Unbiasedness of two-stage estimation and prediction
procedures for mixed linear models. Comm. Stat. A. Theory and Methods 10:1249-1261.
Kendall, M.G. & Stuart, A. 1963. The Advanced Theory of Statistics. Vol. 1. Hafner Publ.
Co., New York. 433 pp.
Klotz, J.H., Milton, R.C. & Zacks, S. 1969. Mean square efficiency of estimators of
variance components. J. Amer. Stat. Assoc. 64:1383-1402.
Knuth, D.E. 1981. Seminumerical Algorithms, 2nd ed., vol. 2 of The art of computer
programming. Addison-Wesley, Reading, MA.
Littell, R.C. & McCutchan, B.G. 1986. Use of SAS for variance component estimation. In:
Statistical considerations in genetic testing of forest trees. South. Coop. Series Bull.
No. 324. pp 75-86.
Loo-Dinkins, J.A., Tauer, C.G. & Lambeth, C.C. 1990. Selection system efficiencies for
computer simulated progeny test field designs in loblolly pine. Theor. Appl. Genet.
79:89-96.
Matzinger, D.F., Sprague, G.F. & Cockerham, C.C. 1959. Diallel crosses of maize in
experiments repeated over locations and years. Crop Sci. 51:346-350.
McCutchan, B.G., Ou, J.X. & Namkoong, G. 1985. A comparison of planned unbalanced
design for estimating heritability in perennial tree crops. Theor. Appl. Genet.
71:536-544.
McCutchan, B.G., Namkoong, G. & Giesbrecht, F.G. 1989. Design efficiencies with planned
and unplanned unbalance for estimating heritability in forestry. For. Sci. 35:801-815.
McLean, R.A. 1989. An introduction to general linear models. In: Applications of Mixed Models
in Agriculture and Related Disciplines, South. Coop. Ser. Bull. No. 343. pp 23-30.
Louisiana Agricultural Experiment Station. Baton Rouge.
Meyer, K. 1989. Restricted maximum likelihood to estimate variance components for animal
models with several random effects using a derivative-free algorithm. Genet. Sel.
Evol. 21:317-340.
Miller, J.J. 1973. Asymptotic properties and computation of maximum likelihood estimates in
the mixed model of the analysis of variance. Tech. Rep. No. 12, Department of
Statistics, Stanford Univ., Stanford, CA.
Milliken, G.A. & Johnson, D.E. 1984. Analysis of Messy Data I, Designed Experiments.
Lifetime Learning Pub., Belmont, CA. 473 pp.
Namkoong, G., Snyder, E.B. & Stonecypher, R.W. 1966. Heritability and gain concepts for
evaluating breeding systems such as seedling orchards. Silvae Genet. 15:76-84.
Namkoong, G. & Roberds, J.H. 1974. Choosing mating designs to efficiently estimate genetic
variance components for trees. Silvae Genet. 23:43-53.
Olsen, A., Seely, J. & Birkes, D. 1976. Invariant quadratic unbiased estimation of two
variance components. Ann. Stat. 4:878-890.
Patterson, H.D. & Thompson, R. 1971. Recovery of interblock information when block sizes are
unequal. Biometrika 58:545-554.
Pederson, D.G. 1972. A comparison of four experimental designs for the estimation of
heritability. Theoret. Appl. Genet. 42:371-377.
Pepper, W.D. 1983. Choosing plant-mating design allocations to estimate genetic variance
components in the absence of prior knowledge of the relative magnitudes. Biometrics
39:511-521.
Pepper, W.D. & Namkoong, G. 1978. Comparing efficiency of balanced mating design for
progeny testing. Silvae Genet. 27:161-169.
Pittman, E.J.G. 1937. The "closest" estimates of statistical parameters. Pro. Cambr. Philos.
Soc. 33:212-222.
Press, W.H., Flannery, B.P., Teukolsky, S.A. & Vetterling, W.T. 1989. Numerical Recipes.
The Art of Scientific Computing (Fortran version). Cambridge Univ. Press, New York
NY. 702 pp.
Rao, C.R. 1971a. Estimation of variance and covariance components-minque theory. J.
Multivar. Anl. 1:257-275.
Rao, C.R. 1971b. Minimum variance quadratic unbiased estimation of variance components. J.
Multivar. Anl. 1:445-456.
Rao, C.R. 1972. Estimation of variance and covariance components in linear models. J.
Amer. Stat. Assoc. 67:112-115.
SAS Institute, Inc. 1985. SAS Interactive Matrix Language Guide for Personal Computers.
SAS Institute, Inc., Cary, NC. 429 pp.
Schneider, D.M. 1987. Linear Algebra, A Concrete Introduction. Macmillan Pub. Co., New
York, NY. 506 pp.
Searle, S.R. 1971. Topics in variance component estimation. Biometrics 27:1-76.
Searle, S.R. 1987. Linear Models for Unbalanced Data. John Wiley and Sons, New York,
NY. 536 pp.
Shaw, R.G. 1987. Maximum-likelihood approaches applied to quantitative genetics of natural
populations. Evolution 41(4):812-826.
Singh, M. & Singh, R.K. 1984. A comparison of different methods of half-diallel analysis.
Theor. Appl. Genet. 67:323-326.
Snyder, E.B. & Namkoong, G. 1978. Inheritance in a diallel crossing experiment with
longleaf pine. In: USDA For. Serv. Res. Pap. SO-140. South. For. Exp. Stn., New
Orleans, LA. 31pp.
Speed, F.M., Hocking, R.R. & Hackney, O.P. 1978. Methods of analysis of linear models
with unbalanced data. J. Amer. Stat. Assoc. 73:105-112.
Sprague, G.F. & Tatum, L.A. 1942. General vs. specific combining ability in single crosses of
corn. J. Amer. Soc. Agron. 34:923-932.
Squillace, A.E. 1973. Comparison of some alternative second-generation breeding plans for
slash pine. In: South. For. Tree Improve. Conf. June 12-13, 1973 Baton Rouge, LA,
pp. 2-13.
Stonecypher, R.W., Zobel, B.J. & Blair, R. 1973. Inheritance patterns of loblolly pines from a
nonselected natural population. Technical Bulletin No. 224, North Carolina Ag. Exp. Stn.
Swallow, W.H. 1981. Variances of locally minimum variance quadratic unbiased estimators
('MIVQUE's') of variance components. Technometrics 23:271-283.
Swallow, W.H. & Monahan, J.F. 1984. Monte Carlo comparison of ANOVA, MIVQUE,
REML, and ML estimators of variance components. Technometrics 26(1):47-57.
van Buijtenen, J.P. 1972. Efficiency of mating designs for second-generation selection. In:
Proceedings, IUFRO Working Party Meeting on Progeny Testing, 25-27 Oct. 1972,
Macon, GA. Edited by John F. Kraus, Ga. For. Res. Council, Macon, pp. 103-126.
van Buijtenen, J.P. & Bridgwater, F. 1986. Mating and genetic test designs. In: Advanced
Generation Breeding of Forest Trees. Southern Coop. Series Bull. 309. Louisiana Ag.
Exp. Stn., Baton Rouge, LA. pp. 5-10.
van Buijtenen, J.P. & Burdon, R.D. 1990. Expected efficiencies of mating designs for
advanced generation selection. Can. J. For. Res. 20:1648-1663.
Weir, R.J. & Goddard, R.E. 1986. Advanced generation operational breeding programs for
loblolly and slash pine. In: Southern Coop. Series Bull. 309. Louisiana Agric. Exp.
Stn., Baton Rouge, LA. pp. 21-26.
Weir, R.J. & Zobel, B.J. 1975. Managing genetic resources for the future a plan for the N.C.
State Industry Cooperative Tree Improvement Program. In: Proc. 13th South. For.
Tree Improve. Conf. June 10-11, Raleigh, NC. pp. 73-82.
Westfall, P.H. 1987. A comparison of variance component estimates for arbitrary underlying
distributions. J. Amer. Stat. Assoc. 82:866-874.
White, T.L. 1987. A conceptual framework for tree improvement programs. New Forests
4:325-342.
White, T.L. & Hodge, G.R. 1987. Practical uses of breeding values in tree improvement
programs and their prediction from progeny test data. P. 276-283 in Proc. 19th South.
For. Tree Improve. Conf. Texas A & M Univ., College Station, TX.
White, T.L. & Hodge, G.R. 1988. Best linear prediction of breeding values in a forest tree
improvement program. Theor. Appl. Genet. 76:719-727.
White, T.L. & Hodge, G.R. 1989. Predicting Breeding Values with Applications in Forest
Tree Improvement. Kluwer Academic Pub., Dordrecht, The Netherlands. 367 pp.
Wilcox, M.D., Shelbourne, C.J.A. & Firth, A. 1975. General and specific combining ability in
eight selected clones of radiata pine. N. Z. J. For. Sci. 5:219-225.
Yates, F. 1934. The analysis of multiple classifications with unequal numbers in the different
classes. J. Amer. Stat. Assoc. 29:51-66.
Zobel, B.J. & Talbert, J. 1984. Applied Forest Tree Improvement. John Wiley and Sons,
New York, NY. 505 pp.

BIOGRAPHICAL SKETCH
Dudley Arvle Huber was born December 13, 1948, in Fulton County, Georgia, to Dudley
and Dorothy Huber. His basic education was in the Stephens County school system. He entered
Georgia Institute of Technology to study chemical engineering and later transferred to the
University of Georgia in the forestry program. In 1970, he received a Bachelor of Science
degree. From 1971 to 1977, he served in the U. S. Navy and after service re-entered the
University of Georgia, receiving a Master of Science degree in 1981. After several years of self-
employment and employment at the University of Georgia, he began a Doctor of Philosophy
program in 1988. He is currently employed as operations geneticist for Southern Forest Tree
Improvement by Weyerhaeuser Company.

I certify that I have read this study and that in my opinion it conforms to acceptable
standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.
Timothy L. White, Chairman
Associate Professor of Forest Resources
and Conservation
I certify that I have read this study and that in my opinion it conforms to acceptable
standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.
Michael A. DeLorenzo
Associate Professor of Dairy Science
I certify that I have read this study and that in my opinion it conforms to acceptable
standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.
Assistant Research Scientist
of Forest Resources and Conservation
I certify that I have read this study and that in my opinion it conforms to acceptable
standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.
Ramon C. Littell
Professor of Statistics
I certify that I have read this study and that in my opinion it conforms to acceptable
standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.
Donald L. Rockwood
Professor of Forest Resources
and Conservation

This dissertation was submitted to the Graduate Faculty of the School of Forest Resources
and Conservation in the College of Agriculture and to the Graduate School and was accepted as
partial fulfillment of the requirements for the degree of Doctor of Philosophy.
May 1993
Director, Forest Resources and
Conservation
Dean, Graduate School
