• TABLE OF CONTENTS
HIDE
 Cover
 Acknowledgement
 Table of Contents
 List of Tables
 List of Figures
 Abstract
 Design of paired comparison experiments:...
 Asymptotic theory of the maximum...
 Minimization of the variance of...
 Designs for fitting a quadratic...
 Designs that protect against...
 Designs for preliminary test...
 Two-stage sampling for model...
 Appendices
 Bibliography
 Biographical sketch














Title: Design of paired comparison experiments with quantitative independent variables /
CITATION PDF VIEWER THUMBNAILS PAGE IMAGE ZOOMABLE
Full Citation
STANDARD VIEW MARC VIEW
Permanent Link: http://ufdc.ufl.edu/UF00098278/00001
 Material Information
Title: Design of paired comparison experiments with quantitative independent variables /
Physical Description: x, 181 leaves : ill. ; 28 cm.
Language: English
Creator: Offen, Walter William, 1953-
Publication Date: 1980
Copyright Date: 1980
 Subjects
Subject: Paired comparisons (Statistics)   ( lcsh )
Statistics thesis Ph. D
Dissertations, Academic -- Statistics -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
 Notes
Thesis: Thesis (Ph. D.)--University of Florida, 1980.
Bibliography: Bibliography: leaves 178-180.
Statement of Responsibility: by Walter William Offen.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00098278
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000099856
oclc - 07188318
notis - AAL5316

Downloads

This item has the following downloads:

designofpairedco00offe ( PDF )


Table of Contents
    Cover
        Page i
        Page ii
    Acknowledgement
        Page iii
    Table of Contents
        Page iv
        Page v
    List of Tables
        Page vi
    List of Figures
        Page vii
    Abstract
        Page viii
        Page ix
        Page x
    Design of paired comparison experiments: A literature review
        Page 1
        Page 2
        Page 3
        Page 4
        Page 5
        Page 6
        Page 7
        Page 8
        Page 9
        Page 10
        Page 11
        Page 12
        Page 13
        Page 14
        Page 15
        Page 16
        Page 17
        Page 18
        Page 19
        Page 20
        Page 21
        Page 22
        Page 23
        Page 24
        Page 25
    Asymptotic theory of the maximum likelihood estimators
        Page 26
        Page 27
        Page 28
        Page 29
        Page 30
        Page 31
        Page 32
        Page 33
    Minimization of the variance of a single parameter estimate
        Page 34
        Page 35
        Page 36
        Page 37
        Page 38
        Page 39
        Page 40
        Page 41
        Page 42
        Page 43
        Page 44
        Page 45
        Page 46
        Page 47
        Page 48
        Page 49
        Page 50
        Page 51
        Page 52
        Page 53
        Page 54
        Page 55
        Page 56
        Page 57
        Page 58
        Page 59
        Page 60
        Page 61
        Page 62
        Page 63
        Page 64
    Designs for fitting a quadratic model
        Page 65
        Page 66
        Page 67
        Page 68
        Page 69
        Page 70
        Page 71
        Page 72
        Page 73
        Page 74
        Page 75
        Page 76
        Page 77
        Page 78
        Page 79
        Page 80
        Page 81
        Page 82
        Page 83
        Page 84
        Page 85
        Page 86
        Page 87
        Page 88
        Page 89
        Page 90
        Page 91
        Page 92
        Page 93
        Page 94
        Page 95
        Page 96
        Page 97
        Page 98
        Page 99
        Page 100
        Page 101
        Page 102
        Page 103
        Page 104
        Page 105
        Page 106
        Page 107
        Page 108
        Page 109
        Page 110
        Page 111
        Page 112
        Page 113
        Page 114
        Page 115
        Page 116
        Page 117
        Page 118
        Page 119
        Page 120
        Page 121
        Page 122
        Page 123
        Page 124
        Page 125
        Page 126
        Page 127
        Page 128
        Page 129
        Page 130
        Page 131
        Page 132
        Page 133
        Page 134
    Designs that protect against bias
        Page 135
        Page 136
        Page 137
        Page 138
        Page 139
        Page 140
        Page 141
        Page 142
        Page 143
        Page 144
    Designs for preliminary test estimators
        Page 145
        Page 146
        Page 147
        Page 148
        Page 149
    Two-stage sampling for model fitting
        Page 150
        Page 151
        Page 152
        Page 153
        Page 154
        Page 155
        Page 156
        Page 157
        Page 158
        Page 159
        Page 160
        Page 161
        Page 162
        Page 163
        Page 164
    Appendices
        Page 165
        Page 166
        Page 167
        Page 168
        Page 169
        Page 170
        Page 171
        Page 172
        Page 173
        Page 174
        Page 175
        Page 176
        Page 177
    Bibliography
        Page 178
        Page 179
        Page 180
    Biographical sketch
        Page 181
        Page 182
        Page 183
Full Text


















DESIGN OF PAIRED COMPARISON EXPERIMENTS
WITH QUANTITATIVE INDEPENDENT VARIABLES












BY

WALTER WILLIAM OFFEN


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY






UNIVERSITY OF FLORIDA


1980



















Digitized by the Internet Archive
in 2009 witiff~hiing from
University of Florida, George A. Smathers Libraries


http://www.archive.org/details/designofpairedco00offe
















ACKNOWLEDGMENTS


I would like to thank Drs. Ramon C. Littell and William Mendenhall

for encouraging me to work past the Master of Statistics degree to pur-

sue a Ph.D. I sincerely appreciate their concern and beneficial advice.

A special thanks goes to Dr. Ramon C. Littell, first of all for

the attention and help he gave to me in my early years at the University

of Florida while working for him at the Institute of Food and Agricul-

tural Sciences. I especially thank him for the immense amount of time

and effort he contributed to this project. His patience and sincere

interest will always be remembered.

Thanks go to my family for their understanding and support during

the many years of my education. I am especially grateful to my wife,

Susie, for her love and caring, which makes it all worthwhile.

Finally, I would like to thank my typists, Susan and Walter Offen,

who both did an outstanding job of typing this manuscript. Also,

thanks go to Drs. John Cornell, Ramon Littell, Frank Martin, and

Kenneth Portier for allowing my typists to use their office type-

writers.


1ii
















TABLE OF CONTENTS


Page
ACKNOWLEDGMENTS . . . . . .. . . . . . iii

LIST OF TABLES . . . . . . . . . . . . vi

LIST OF FIGURES . . . . . . . . ... . . . .vii

ABSTRACT . . . . . . . . ... . . . . . .viii

CHAPTER

1 DESIGN OF PAIRED COMPARISON EXPERIMENTS: A LITERATURE
REVIEW . . . . . . . . . ..... 1

1.1 Introduction .... . . . . . . . 1
1.2 Bradley-Terry Model . . . . . . . 3
1.3 Estimation of the Bradley-Terry Parameters and
Corresponding Asymptotic Distribution . . 6
1.4 Designs of Factorial Paired Comparison
Experiments . . . . . . . . . 11
1.5 Response Surface Designs . . . . ... .22

2 ASYMPTOTIC THEORY OF THE MAXIMUM LIKELIHOOD ESTIMATORS 26

2.1 Introduction . . . . . . . ... 26
2.2 Estimation .. . . . . . . . .27
2.3 Asymptotic Distribution . . . . . . 29

3 MINIMIZATION OF THE VARIANCE OF A SINGLE PARAMETER
ESTIMATE ........ . . . . . . .... 34

3.1 Introduction . . . . . . . . . 34
3.2 Linear Model .... . . . . . .34
3.3 Quadratic Model ... . . . ... 38

Minimization of var (82) . . . . ... 38

Equivalence of El-Helbawy and Bradley's
method and minimizing the variance . . 50
3.4 Cubic Model .. . . . . . . . 58

4 DESIGNS FOR FITTING A QUADRATIC MODEL . . . ... .65

4.1 Introduction .... . ..... ... .... 65









Page

. 68


4.2 D-optimal Designs . . . . . .
Condition 1 .....
Condition 2 . . . . .
Condition 3 .....
4.3 Average-variance Designs ....
Condition 1 .....
Condition 2 . . . . . .
Condition 3 ......
4.4 Minimax Designs ......
Condition 1 ......
Condition 2 . . . . .
Condition 3 . . . .


4.5 Conclusion and Design Recommendations . .. .128

5 DESIGNS THAT PROTECT AGAINST BIAS . . . . ... 135


Introduction . . . . . . .
All Bias Designs . . . . . .
Integrated Mean Square Error Designs .


. . 135
. . . 136
. . . 138


6 DESIGNS FOR PRELIMINARY TEST ESTIMATORS . . . .. .145

6.1 Introduction . . . .. . . . . .145
6.2 Average-variance Designs for Preliminary Test
Maximum Likelihood Estimators . . . ... .146

7 TWO-STAGE SAMPLING FOR MODEL FITTING . . . . .. .150


Introduction . . . . . . .
Optimal Error Rate . . . . . .
Concluding Remarks . . . . . .


. .. 150
. 151
. .. 157


APPENDIX


A COMPUTER PROGRAM THAT FINDS MAXIMUM LIKELIHOOD ESTIMATES
OF 7i ...." t'r . . . . . . . . .. .. 165

B COMPUTER PROGRAMS THAT FIND MAXIMUM LIKELIHOOD ESTIMATES
OF 8 ,..., s . . . . . . . . . . 170

BIBLIOGRAPHY . . . . . . . . . . . . . 178

BIOGRAPHICAL SKETCH . . . . . . . . .. ... .181















LIST OF TABLES


TABLE Page

1.1 Efficiencies of balanced 2 and 2 factorial paired
comparison designs . . . . . . . . ... 18

3.1 Designs which minimize var(B ) . . . . . . 48

3.2 Designs which minimize var( 3) . . . . . . 64

4.1 D-optimal designs with restriction x=-l,0,1 ...... 90

4.2 D-optimal (local) designs with 2=0 . . . . . 94

4.3 D-optimal designs .. . . . . . . . . .95

4.4 Average-variance designs with restriction x=-l,0,l .. 105

4.5 Average-variance (local) designs with B2=0 ...... 109

4.6 Average-variance designs . . . . . . . .. 110

4.7 Minimax designs with restriction x=-l,0,1 . . ... 119

4.8 Minimax (local) designs with 2=0 . . . . ... 123

4.9 Minimax designs ... . . . . . . ... .124

4.10 Efficiency of designs . . .... ... ...... 133

4.11 Bias of designs .... . . . . . . . 134

5.1 J-criterion designs . . . .. . .. . . 141

5.2 Values of J . . . . . . . . . . . .143


















LIST OF FIGURES


FIGURE


1.1 Depiction of Designs I and II . . . . .

3.1 Designs which minimize var(2) . . . .

4.1 Procedure summary . . . . . . .

4.2 Depiction of possible designs for Condition 1 .

4.3 Locally optimal designs when 2=0 . . . .

7.1 Minimization of the average mean square error -
prior . . . . . . . . . .

7.2 Minimization of the average mean square error -
prior . . . . . . . . . . .

7.3 D-optimality Uniform prior . . . .

7.4 D-optimality Normal prior . . ......


Page

. . 12

. . . 49

. . 69

. . . 75

129

Uniform
. . 161

Normal
. . . 162


F
















Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


DESIGN OF PAIRED COMPARISON EXPERIMENTS
WITH QUANTITATIVE INDEPENDENT VARIABLES


By

Walter William Offen

August 1980

Chairman: Ramon C. Littell
Major Department: Statistics


An experiment in which the treatments are compared and ranked

pairwise is called a paired comparison experiment. The Bradley-Terry

model for preference probabilities with ties not allowed is used in

this dissertation. This model defines treatment parameters, ...

t such that the probability that treatment T. is preferred over

-1
treatment T. is equal to i. (T7.+r.) When the treatments are levels

of continuous, quantitative variables, the logarithm of the Bradley-

Terry parameters can be modeled as a regression-type model,

In7T = kxki. There have been a large number of papers on the topic
i k kxi'

of paired comparisons, but only a few have considered experimental

design. Past papers on design are reviewed in Chapter 1. In 1973

Springall presented the asymptotic distribution of the maximum likeli-

hood estimators of S1 ... 3 where s is the number of regression-

type parameters in the model. Chapter 2 of the present dissertation

contains a derivation of this asymptotic distribution. The remainder

viii










of the dissertation is a consideration of designs for situations where

the treatments are levels of a single quantitative variable.

The asymptotic variance-covariance matrix is a function of the

true parameter values. Hence, the optimal designs also depend on the

parameters. In previous papers on design, the Bradley-Terry parameters

were taken to be equal when evaluating the variances and covariances.

The present dissertation considers the design problem for various

parameter values.

Notice that the general design problem is complex. It involves

deciding how many levels and which particular levels should be chosen.

Furthermore, it must be determined how often to compare each pair of

treatments. In some cases this complexity was resolved. In other

cases additional restrictions had to be imposed before optimal designs

could be found.

Chapter 3 considers designs which minimize the variance of a

single parameter estimate, S.. The linear, quadratic, and cubic models
1

are considered. The optimal design in the linear case turns out to be

comparisons of only the smallest level in the experimental region with

the largest, for most values of the linear parameter. The designs for

the quadratic and cubic cases depend more heavily on the parameter

values than for the linear case.

Chapter 4 presents optimal designs for fitting a quadratic model.

The optimality criteria considered are D-optimality, the minimization

of the average variance of Inr., and the minimization of the maximum

value of InT.. These designs also depend heavily on the parameters,
1

although some overall design recommendations are given in the last

section.









The remaining chapters contain a brief discussion of related top-

ics. Chapter 5 is a consideration of designs which protect against the

bias present when the true model is cubic. Chapter 6 is a discussion

of designs for preliminary test estimators. Chapter 7 is a considera-

tion of two-stage sampling. The first stage is a design which results

with a small variance of the quadratic parameter estimate. A test of

hypothesis then determines whether the second stage design should be

one which is optimal for fitting a linear or a quadratic model. A

discussion of the best error rate to use for the test of hypothesis is

included.

The appendices contain computer programs that find maximum likeli-

hood estimates of the Bradley-Terry parameters and 1, .... ,s. These

programs are written in the languages APL and Fortran.
















CHAPTER 1

DESIGN OF PAIRED COMPARISON EXPERIMENTS:
A LITERATURE REVIEW


1.1 Introduction


An experiment in which the treatments are compared and ranked

pairwise is called a paired comparison experiment. Such an experiment

requires a number of subjects which are usually individuals, but could

conceivably be a machine, for instance. Each subject is required to

compare two samples and decide which of the two is preferable for the

attribute under study. For example, each subject could be asked to

state which of two samples tastes better. Paired comparison experi-

ments are often conducted in areas such as food testing where it may be

difficult for an individual to quantify his like or dislike of a sample,

but he may be able to readily decide which of the two samples is pre-

ferred.

Suppose that an experimenter wants to determine which of four

brands of coffee most consumers prefer. There are a number of experi-

mental procedures available. One possibility is that each individual

be required to taste and rank all four brands from least favorable to

most favorable. However, sensory fatigue may be a problem because

after tasting the second or third sample, it may be difficult to per-

ceive any difference between the subsequent sample and the first sample

tasted.









Another possible procedure is to have each individual assign to

each brand a score on some numerical scale. Sensory fatigue may again

be a problem with this experimental procedure since all four brands are

tasted by each individual. Additionally, there may be a degree of ar-

bitrariness to choosing a numerical score for each treatment.

A third possibility is to conduct a paired comparison experiment.

The problems mentioned in the two preceding paragraphs are not present

for paired comparison experiments. That is to say, paired comparison

experiments have a minimal amount of individual guesswork involved.

As pointed out in the first paragraph, the complete experiment involves

a number of individuals, each one choosing out of two samples the one

which is preferable. For the particular example with four brands of

coffee, 24 subjects might be utilized, thereby resulting in 4 repli-

cates of each of the 6 distinct pairs of treatments.

The treatments can have a factorial arrangement. However, facto-

rial paired comparison experiments have often been analyzed ignoring

the factorial arrangement of the treatments. For example, Larmond,

Petrasovits, and Hill (1968) conducted a 2x3 factorial paired compari-

son experiment, but they did not test for interaction or main effects

as normally is done with factorial experiments. Although Abelson and

Bradley (1954) presented a theoretical development, only relatively

recently have tractable methods been available for testing interaction

and main effects in factorial paired comparison experiments.

In the case of quantitative treatments, an analogue of multiple

linear regression analysis is appropriate. For example, instead of

four distinctly different brands of coffee, there may only be one brand










of coffee with four different levels of an additive. The design of

paired comparison experiments for fitting response surfaces with one

independent variable is discussed in Chapters 3-7.


1.2 Bradley-Terry Model


Denote the t treatments as T ,T ... ,Tt. Let ni.. 0 be the

number of times T. and T. are compared, i 1 3
T. T. for "T. is selected over T.", where selection is based on the
1 ] 1 3
particular attribute under study. A general model defines ( ) function-

ally independent parameters, 0 s it.. 1, such that



P(T. T.) = T ., U.. + .. = 1, ij, i,j=l,...,t. (1.2.1)



Each distinct pair of treatment comparisons is a set of independent

Bernoulli trials. The entire experiment is a combination of the (2)

independent sets.

Bradley and Terry (1952) proposed a basic model for paired compar-

isons. A treatment parameter T. is associated with each treatment T.,
l 1
i=, ... ,t. The Bradley-Terry model is then defined to be



P(T. T.) = .. = i./(T. + T.) i7j, i,j=l,...,t. (1.2.2)
1 j 13 1 1 3



The right-hand side of (1.2.2) is invariant under scale changes, so a

single constraint is imposed to make the treatment parameters unique.

Two popular choices are Z rT = 1, or Z In 7. = 0. This constraint is
replaced with another constraint in Chapters 2-7.
replaced with another constraint in Chapters 2-7.










Another model for paired comparisons was presented by Thurstone

(1927). He used the concept of a subjective continuum on which only

order can be perceived. Let p. represent the location point on the

continuum for treatment T., i=l, ... ,t. An individual receives a
1

sensation X. in response to treatment T., with X. distributed
1 1 1

Normal(J.,l). The probability that T. is preferred over T. is then
1 1




P(T. T.) = P(X. > X.) =- e /2 dy i (1.2.3)
1 J3 1 3 i,j=l,...,t.
-(i.-UJ.)



Bradley (1953) noted that replacement of the normal density function

in (1.2.3) by the logistic density function yields




P(T. T) = 1sech2 (y/2) dy = /(+) j (1.2.4)
J sech(y/ i J i,j=l,...,t.
-(InT.-lnT.)
1 J


This is the Bradley-Terry model with U.=lnr., i=l, ... ,t.
1 1

Thompson and Singh (1967) gave a psychophysical consideration by

assuming a treatment T stimulates unobservable signals UI, ... ,U

which are then transmitted to the brain by m sense receptors. The U.

are independent and identically distributed, and m is taken to be fixed

and large. The two variables considered for the response X to treatment

T are


(i) X is the average of the signals U1, ... ,U

(ii) X is the maximum of the signals U1, ... ,U .
1 m









By the Central Limit Theorem, asymptotic normality results for X in

(i), which results with Thurstone's model. The random variable in (ii)

has one of three distributions depending on the distribution of the U..
1

If U., for example, is distributed exponentially, then the asymptotic
1

distribution of X is the extreme value distribution. In any case, how-

ever, the Bradley-Terry model in (1.2.4) results from (ii).

Rao and Kupper (1967) generalized the Bradley-Terry model to allow

for ties. A tie parameter 8 2 1 is introduced, and the model becomes




P(T. T) 1 sechy/2) dy = r./(7T.+7.) i, j
1 3 4 1 1 ] i j=l.
-(lniT.-lnTj)+n



(lnTr.-lnr.)+-f
T 1 2
(T. = T = 1 sech (y/2) dy (1.2.5)
S] 4
(InTT. -InTT.)-n

ij




(7T. + 6T.) (6O s. + 7T.)



where ni=ln6 is the threshold of perception for the judges. That is to

say, if it is hypothesized that the two samples generate a random vari-

able d which measures the difference between the samples, and further-

more that Idl < n, then the judge will not be able to discern a differ-

ence between the two samples, and will thus declare a tie. Note that

the Bradley-Terry model is a special case with 8=1. Another model to

handle ties was proposed by Davidson (1970).










The Bradley-Terry model has received much attention in statistical

literature since it was introduced in 1952. For the remainder of this

dissertation, the Bradley-Terry model with ties not allowed is used.


1.3 Estimation of the Bradley-Terry Parameters
and Corresponding Asymptotic Distribution


When the model was proposed by Bradley and Terry, it was assumed

that all of the n.. were equal. Dykstra (1960) extended the method to
13

permit unequal n...

The parameters are estimated by the maximum likelihood method and

likelihood ratio tests are developed. Assuming independence of the

paired comparisons, the complete likelihood function is



t a. t n..
L = H / H (T. + i .) (1.3.1)
i=l i


where a. is the total number of times T. is selected over any other
1 i

treatment, i=l, ... ,t. The likelihood equations are



t
a. n..
p .i = 0 i=l,...,t,
Pi (Pi + Pj)
J 3
j i

(1.3.2)
t

SPi= 1 ,
i=



where p.i denotes the maximum likelihood estimate of i'', i=l, ... ,t.
1 1










These equations must be solved iteratively (see Appendix A for an

APL computer program which finds maximum likelihood estimates of the

treatment parameters). A brief description of the iterative procedure

follows. Letting p(k) be the kth approximation to pi, then


t
(k) *(k) *(k)
Pi = i / Pi (1.3.3)
i=l

where

t
*(k) a (k-l) (k-l) -- 1.3.
.= / nij/(P + P ) k=1,2,.--. (1.3.4)
Pi

j
j#i

(0)
The iteration can be started with p. =l/t, i=l, ... ,t. Dykstra (1956)

(0)
suggested a method of obtaining good initial parameter estimates p ,

i=l, ... ,t.

Ford (1957) proposed the model independently and proved that the

iterative procedure converged to a unique maximum for L if the following

assumption is true.


Assumption 1.1. For every possible partition of the treatments into

the two sets S1 and S2, some treatment in S1 is preferred at least once

to some treatment in S2'


If Assumption 1.1 does not hold, then a. must be zero for at least one
1
i. If exactly one a. is zero, say a. there may still be a unique
0
solution with p. =0. However, Ford proceeded to explain that if the
o

set S2 which violates Assumption 1.1 contains two or more treatments,

then the corresponding estimates of the Bradley-Terry treatment









parameter must be zero, and consequently individual members of 52 could

not be compared. This fact is also evident by noticing that in (1.3.4),

*(k)
pi would be zero for at least two values of i, and so by (1.3.3),
(k)
Pi would also be zero for the same values of i. Then the next time

in the iterative procedure that (1.3.4) is executed, the denominator

in the summation would be zero for some i and j, and hence the itera-

tive procedure would not converge.

The asymptotic theory and tests of hypothesis require the follow-

ing additional assumption.


Assumption 1.2. For every partition of the treatments into two non-

empty subsets S1 and S2, there exists a treatment T ES1 and a treatment

T.S such that j. >0, where


n..

ij N
ij = N-lim N itj, i,j=l.....t,




N = ni-

i
Bradley (1955) investigated asymptotic properties of the treatment

parameters with each n.,=n. Davidson and Bradley (1970) considered a

multivariate model where each subject chooses one of two samples on more

than one attribute. As a special case of the multivariate model, David-

son and Bradley obtained a more general result, allowing the n.. to be

different. They showed that the asymptotic joint distribution of

vN(p 1-r), .....i./(p -r ) is the singular multivariate normal distribu-

tion of dimensionality (t-1) in a space of t dimensions, with zero mean









vector and variance-covariance matrix Z = ( ..), where
3


7.. = cofactor of A.. in
13 13


13 txt txl



1'xt 0



13 txt tx1



1' 0
-- xt


(1.3.5)


1' = (1,1, ... ,1), the t-dimensional unit row vector, and


t
_ii i= Z 'J i=l,..,t,
ji (7T + T" )

Jti


A = -U i/(i + T)2
13 13 1


It is apparent that the variances and covariances in (1.3.5) depend on

the true treatment parameters. Hence estimates of the variances and

covariances are usually found by substituting the consistent maximum

likelihood estimates for 71' "'. t into (1.3.5).

In the special case with i.=l/t, n..=n for all i,j,
1 13


0.. = 2(t-l) /t3 and a


= -2(t-l)/t3 ij.


(1.3.6)


Use of (1.3.6) is adequate if 7T, ... 7 t are approximately equal.


ij, i,j=l,...,t.








The major test proposed is a test of treatment equality,


H 0 i = = it = 1/t ,

versus

H : T. f j for some i,j, i/j, i,j=l,...,t.
a i 3

The likelihood ratio statistic is



-21nX = 2Nln2 2 [ nijln(p +p ) a In pI.

i
Under the null hypothesis, H -21nA is asymptotically central chi-
o
square with (t-l) degrees of freedom. Bradley and Terry (1952) and

Bradley (1954) provided tables for t=3,4,5, including exact signifi-

cance levels. Comparison of the chi-square significance levels with

the tabled exact significance levels suggests that the former may be

used even for relatively small values of n... This is perhaps compar-

able to using the normal approximation to the binomial.

The asymptotic distribution under the alternative hypothesis is

also available for -21nX. Let



6 t
1 iN
S= -+ = 0 lim 6. = 6.
i=l


Then -21nA is asymptotically noncentral chi-square with (t-l) degrees

of freedom and noncentrality parameter i where



92
1 4 Iij(i 6
i








1.4 Designs of Factorial Paired Comparison Experiments


Littell and Boyett (1977) compared two designs for RxC factorial

paired comparison experiments. The two designs considered are:


Design I: Consists of independent paired comparisons of each of the
RC
( ) pairs of treatments.


Design II: Consists of comparisons between pairs of treatment com-

binations differing only in the levels of one factor, i.e. comparing

two treatments in the same row or column.


Note that there are (R) = -RC(RC-1) distinct pairs of treatments
2 2

for Design I, but only R( ) + C( ) =RC(R+C-2) distinct pairs of treat-

ments for Design II. In the 4x4 case, for example, Design I requires

120 distinct paired comparisons whereas Design II requires only 48

distinct paired comparisons. The two designs could entail the same

total number of paired comparisons, but Design II requires preparation

of fewer distinct pairs of samples.

Design II is also preferable from a computational point of view

because its analysis can be carried out by using a computer program

capable of handling the one-way classification with unequal replication

of pairs. A specialized factorial program is required for Design I to

obtain maximum likelihood estimates under the main effects hypothesis,

H : T., = .B., where T., is the Bradley-Terry treatment parameter
0 13 1 3 1]
given by (1.4.1).

The two designs are illustrated for the 2x2 case in Figure 1.1.

The comparisons made with Design I are indicated by the solid arrows.

The comparisons made with Design II are indicated by the dotted arrows.

















Factor B


high


cE--- --z


Figure 1.1. Depiction of Designs I and II


low


Factor A


high









Designs I and II were compared by Littell and Boyett on the basis

of the asymptotic relative efficiencies, in the Bahadur sense, of their

respective likelihood ratio tests. That is, test statistics are com-

pared on the basis of the ratio of the exponential rates of convergence

to zero of their observed significance levels. In general, the Bahadur

efficiency of test 1 relative to test 2 is defined to be



lim(- -2 log L ())
n n
E = (2) = cl ()/c2 (e)
lim(- log L )
n n
rrto


(i)
where L is the observed significance level of test i. The function

c. () is called the exact slope of test i.

Usually Bahadur efficiency is used to compare two test statistics

that are based on the same sample space. However, its use is equally

appropriate when the sample spaces are different. The two designs

should be compared on the basis of an equivalent number of comparisons

in each. Therefore, a "single observation" with Design I will be taken

as n = R+C-2 comparisons of each pair in the design, and a "single

observation" with Design II will be taken as niI = RC-1 comparisons of

each pair in the design. This gives -RC(RC-1)(R+C-2) total comparisons

in each design.

Let T.. represent the treatment with the first factor at level i

and the second factor at level j. The Bradley-Terry model is then



P(T. Tkl) = T ij i + k ik=l" (1.4.1)
j,1=1,...,C.









The following tests of hypotheses were considered by Littell and

Boyett.


Test 1: H : = 1/RC H : 7. = .B .
o 1i a 13 x]
This is a test for detecting two main effects versus no treatment dif-

ferences. Efficiencies were tabulated for the 2x2 case (Littell and

Boyett (1977), Table 1). The table showed that Design I is more effi-

cient than Design II for most combinations of a and 3.


Test 2: H : T.. = 1/RC, H : .,. = 8./R.
o 13 a 13 3
This is a test for detecting a single main effect versus no treatment

differences. For this test, Design I was shown to be uniformly more

efficient than Design II.


Test 3: H : T.. = ./R H : i. = aB .
o j3 3 a 13 i j
This is a test for detecting two main effects versus a single main ef-

fect. Efficiencies were again tabulated for the 2x2 case. The table

showed that Design I is more efficient than Design II for most combina-

tions of a and B.


Test 4: H : T.. = a.6. H : T.. 0 .
o 13 i 3 a 1]
This is a test for detecting interaction versus two main effects. The

results showed that Design II is uniformly superior to Design I for

detecting interaction.


Summarizing, when testing for main effects (Tests 1-3), Design I

is usually more efficient than II. When testing for interaction,

Design II is uniformly more efficient than I.










Littell and Boyett suggested that it might be a good idea for the

experiment to be conducted in two stages. The first stage is a number

of replicates of Design II followed by a test for interaction. If it

is significant, the second stage is completed by again using Design II,

and simple effects are analyzed by applying one-way analyses within rows

and columns. If interaction is not significant, then the second stage

consists of replicates of pairs in Design I that are not in Design II,

and main effects are analyzed.

The total number of comparisons to be made in the experiment can

be divided into the two parts such that after the experiment is complet-

ed, the end result is either a complete Design I or Design II. For

example, suppose the treatments form a 2x2 factorial, and resources are

available for 24 comparisons. In the first stage, the experimenter runs

4 replicates of the four comparisons in Design II (see Figure 1.1).

If interaction is significant, two more replicates of the four compari-

sons in Design I that are not in Design II are run. The latter then

results in the same comparisons made as if a single stage experiment

using Design I had been conducted.

Quenouille and John (1971) also considered designs for 2n facto-

rials in paired comparisons. However, they assumed an individual is

asked to assign a number on a scale to signify the degree of difference

in the two samples instead of merely deciding which sample possesses

more of the particular attribute under study. This model was developed

by Scheffe (1952).

In particular, Quenouille and John discussed the following 23

factorial. The 28 possible paired comparisons were divided into 7 sets









of 4 blocks as follows:


Set (1) : (I,ABC) (A,BC) (B,AC) (C,AB)

Set (2): (I,AB) (A,B) (C,ABC) (AC,BC)

Set (3) : (I,AC) (A,C) (B,ABC) (AB,BC)

Set (4): (I,BC) (A,ABC) (B,C) (AB,AC) (1.4.2)

Set (5) : (I,A) (B,AB) (C,AC) (BC,ABC)

Set (6) : (I,B) (A,AB) (C,BC) (AC,ABC)

Set (7) : (I,C) (A,AC) (B,BC) (AB,ABC)


The letter "I" represents the treatment where all factors are at their

low levels. If a letter appears, then that factor is at its high level.

Note that each treatment appears once in each set. These sets are con-

structed by first pairing treatment I with each of the remaining treat-

ments. These are called the initial blocks. The remaining blocks in

the sets are constructed by multiplying the initial block by any treat-

ment that has not yet appeared in that particular set, with the usual

2 2 2
rule that A = B = C =1.

To illustrate this technique, we will construct set (1) in (1.4.2).

The initial block is (I,ABC). So far only I and ABC have appeared, so

we may choose any treatment other than these. Suppose we choose BC.

Then the next block in the set would be (BC,A). Note that order is

unimportant. Now we may choose any treatment other than I, ABC, BC, or

A. If we choose B, we get the third block in set (1). To construct the

fourth block, there are only two choices remaining, namely C or AB.

Either one results in the last block of set (1).

Each set lacks information on the effects that have an even number

of letters in common with the initial block. For example, the initial









block for set (1) is (I,ABC), so this set provides no information on

the effects AB, AC, and BC, i.e. on all first-order interactions. Sim-

ilarly, the initial block for set (5) in (1.4.2) is (I,A), so this set

provides no information on B, C, and BC.

This concept of no information on B, C, and BC from data in set

(5) may be viewed as follows. If the responses from comparisons in set

(5) are modeled as


Y(I,A) = (p+a+e (p+e a+E(I,A)


Y(B,AB) = (+a+b+(ab)+eAB) (+b+e = a+(ab)+(BAB)


Y(C,AC) = (+a+c+(ac)+eAC) (U+c+eC) = a+(ac)+E(C,AC)


Y(BC,ABC) = (p+a+b+c+(ab)+(ac)+(bc)+(abc)+e BC) -
(BCABC) ABC
(p+b+c+(bc)+eBC


= a+(ab)+(ac)+(abc)+E(BCABC)


where all of the e's are independent and have expected value zero, then

it is clear that b, c, and (bc), corresponding to the effects B, C, and

BC, are not estimable from the four comparisons in set (5).

Quenouille and John define efficiency as follows. Suppose the

design is made up of r sets, s (s : r) of which give information on some

particular effect. Then s/r is the fraction of the comparisons made

that give information on that particular effect. On the other hand,

suppose that each set is run exactly once, i.e. each pair of treatments

is compared. It can be shown that there are a total of (2n-1) sets,

n- of which give information on any particular effect. In this case,

2n-/(2 -1) is the fraction of the comparisons made that give










Efficiencies of


Table 1.1.
balanced 2 and 2 factorial paired comparison designs


n Design


{2}
2
f1}


{3}
{2}
{1}
3
{3,2}

{3,1}
{2,1}


Number of
blocks


2
4


4
12
12
16
16
24


Efficiency of interactions of order
0 1 2


1.50 0.00
0.75 1.50


1.75
1.17
0.58
1.31
0.88
0.88


0.00
1.17
1.17
0.88
0.88
1.17


1.75
0.00
1.75
0.44
1.75
0.88


Notation: {j}: A design constructed by combining all
block I and a j-factor combination.

{j,k): A design constructed by combining the
(k}.

Source: Quenouille and John (1971, Table 1).


sets with initial


designs {j} and


information on any particular effect. The efficiency defined by Que-

nouille and John is the ratio of these two fractions. In other words,

the efficiency is defined to be


S s/r
E
2n-l/(2n-l)


Table 1.1 gives the efficiencies of some designs in the 2 and 2

cases. These designs are balanced in the sense that all main effects

are estimated with the same efficiency, all first order interactions










are estimated with the same efficiency, and so on. The table indicates

that in the 22 case, Design {1} is efficient for estimating the inter-

action effect. This result is the same as the result Littell and

Boyett obtained using Bahadur efficiency.

El-Helbawy and Bradley (1978) considered treatment contrasts in

paired comparison experiments. The theory developed is completely gen-

eral, but they specifically showed how optimal designs can be found in

the case of a 23 factorial. Their optimality criterion is the maximi-

zation of the chi-square noncentrality parameter from the likelihood

ratio test, which thereby maximizes the asymptotic power of the test.

Let a=(al,a2,a3), where a. represents the level of factor i,

i=1,2,3, and each a. is 0 or 1. Then the treatment parameters are
1

defined by El-Helbawy and Bradley to be


1 = 2 3 12 13 23 123 (1.4.3)
= T T T 7 i T v (1.4.3)
a al a2 a3 ala2 aa3 a2a3 ala2a3


for all a. The factorial parameters are defined by taking logarithms

of the 8 treatment parameters given by (1.4.3). So from (1.4.3),



Ya = logfa


3

= log + log7 + logT123
a a.a alaa3



3

= Ya + Y + Yl23 (1.4.4)
i=l i


The last two expressions in (1.4.4) define the factorial parameters








1 2 3 12 13 23 123
Y' Y2 Y 3 2a, y Y These parameters are
as1 a2 a3 ala aal a a3 3 ala2a3

subject to the usual analysis of variance constraints, namely


= 0 ,


i=1,2,3,


1

Ya a
a.=0
3


= 0 ,


i

(1.4.5)


1
S 123
ala a3
a 2=0


1


12a3
a3=0


The null hypothesis of no two-factor interaction

two factors is now considered. This test may be made


12
Ho: YOO


between the first

by testing


= 0 .


12
The constraints then require that Y 2=0, al,a2=0,1. The null hypoth-


esis is therefore equivalent to


12 12 12 12
S 00 Y O 01+ = 01


12 12 12 12 12
since by (1.4.5), y0 y Y + Y1 = 4y00 Let B denote a

matrix whose rows are orthonormal contrasts. If we let Y' =

(Y000,Y 001,Y010 00 ,Y01 ,lll), then the null hypothesis is

also equivalent to



H : B y = 0,
o -n=l

where B = S (1,1,-1,-1,-1,-,1,1).
"n=1 8


a.=0
1




Ya a.
a. =0
I


1
7123

al=0


= 0.










Let


a if comparing T. and T. give no information (in the

sense discussed by Quenouille and John) on the two-
n .
1-3
N factor interaction between the first two factors,


b otherwise,


where n.. is defined in Section 1.2, and N = n

i Then 12a + 16b = 1, since 16 of the 28 distinct pairs give information

on the two-factor interaction. El-Helbawy and Bradley then proceeded

to show that b should be taken as large as possible in order to maximize

the noncentrality parameter. This gives the result b=1/16, a=0. So

the optimal design is to compare the treatments that give information

on the two-factor interaction between the first two factors. These

pairs of treatments would be the sets (3), (4), (5), and (6) in (1.4.2).

This result is the same as Quenouille and John would obtain if

they wanted to find an optimal design to estimate AB interaction, since

using these 4 sets maximizes the efficiency of the AB interaction using

their definition of efficiency. It is also analogous to the result

Littell and Boyett obtained because the treatments compared in either

case differ either in the level of A, or the level of B, but not both.

El-Helbawy and Bradley noted that the result discussed in the two

preceding paragraphs still holds if other specified factorial effects

are assumed to be null (e.g. three-factor interaction). While the sta-

tistics for the two tests may differ, their limit distributions are the

same for either H or H
o a










El-Helbawy and Bradley also considered simultaneously testing for

the presence of AB, AC, and ABC interactions. In this case the null

hypothesis is


H: B y= 0,
o -n=3-

where


1 1 -1 -1 -1 -1 1 1

B = = 1 -1 1 -1 -1 1 -1 1
-n=3 Y8
1 -1 -1 1 -1 1 1 -1


By the same method they concluded that the optimal design is to make

the following comparisons (using Quenouille and John's notation):

(I,A), (B,AB), (C,AC), and (BC,ABC). Note that this is set (5) in

(1.4.2). Note also that this is the only set that contains information

on all three of the effects AB, AC, and ABC. So this design is also

optimal using Quenouille and John's definition of efficiency.

Beaver (1977) considered the analysis of factorial paired compari-

son experiments using the weighted least squares approach of Grizzle,

Starmer, and Koch (1969). He considered a loglinear transformation on

the T. 's. This approach produces noniterative estimates of the i.'s.

Also, the usual tests for a factorial experiment are easily implemented.

However, Beaver gave no consideration to experimental design.


1.5 Response Surface Designs


Springall (1973) considered response surface fitting using the

Bradley-Terry model with ties allowed as generalized by Rao and Kupper

(1967). The model is given in (1.2.5).










The logarithm of the treatment parameters are defined to be func-

tions of continuous, independent variables, x ,x2, ... ,xs, measurable

without error, i.e.


s

In = k xi i=,...,t. (1.5.1)
k=l


Note that some of the variables can be functions of other variables.
2
For example, s=2 and x 2=xli results in the parameterization
2
In i = 1xli+ 2xli .. The parameters 6, ... are estimated by

the maximum likelihood method. Appendix B has an APL program and a

Fortran program which produce maximum likelihood estimates of ...

I but which assume there are no ties.

Springall presented the following definition, theorem, and example.


Definition 1.1. Analogue designs are defined to be designs in which

the elements of the paired comparison variance-covariance matrix are

proportional to the elements of any classical response surface variance-

covariance matrix with the same design points and an equal number of

replicates at each design point.


The advantage of such designs is that they enable certain desirable

properties found in classical response surface designs (e.g. rotatabi-

lity, orthogonality), which are dependent on the relative sizes of the

elements of the variance-covariance matrix, to be readily produced.
*
For example, consider the classical regression model y = 0 +
*2
81x + 8x Then an experiment can be designed in such a way that the
variance-covariance matrix of the least squares estimators 0' and
"* m*
variance-covariance matrix of the least squares estimators 3 B%, and









82 possesses an optimal form. For instance, the classical design with

an equal number of replicates at each of the four levels x=-2,-1,1,2

can be shown to be rotatable, and so the analogue paired comparison

designs discussed in Example 1.1 below have the same property, i.e.

var(ln ) = var(ln ).
x -x
2
The analagous paired comparison model is In x = x + 2 x An

analogue paired comparison design has a corresponding variance-covari-

ance matrix of the maximum likelihood estimators 81 2, such that each

element is a constant times the corresponding element of the least

squares variance-covariance matrix. For the particular example of a

quadratic model currently under discussion, var(6l) = c.var(B ),

var( 2) = c-var(B2), and cov(S ,62) = c-cov(18 2), for some c.

Springall described a method which uses linear programming tech-

niques that produces an analogue design with minimum elements of the

variance-covariance matrix. The following theorem shows how one can

produce an analogue design, but it does not necessarily have minimum

elements of the variance-covariance matrix.


Theorem 1.1. If a classical response surface design has certain prop-

erties dependent on the relative sizes of the elements of the variance-

covariance matrix, then the analogue paired comparison design will have

the same properties if



n. = N [i kl 11
k

where i.. = i.7./(7. + .) 2
iJ I J J









Example 1.1. Consider the quadratic model In Ti = B x + 2x2, with

design coordinates x=-2,-l,l,2. Assume e=1 and 1= 2=0, and set N=150.

Springall showed that an analogue design is given by n.. = 150/6 = 25,

i
An alternative design may be obtained by using linear programming.

Let Z denote the variance-covariance matrix of the maximum likelihood

estimates 1I and B2. By simultaneously maximizing the elements of ET

subject to the constraint N=150, a design which minimizes the elements

of Z is produced. Springall did this and arrived at the design

{n 2=0, n 1371, n14=9, n23=0, n24=70, n34=0}, after integerization of

the n...
13

The asymptotic distribution of the maximum likelihood estimates

of 8,81 .. ,s was presented by Springall, but he presented few de-

tails. Therefore, the maximum likelihood estimates of the parameters

81, ... ,s and their asymptotic joint distribution are developed in

Chapter 2. The tie parameter, 0, is not considered in the remainder

of this dissertation.
















CHAPTER 2

ASYMPTOTIC THEORY OF THE
MAXIMUM LIKELIHOOD ESTIMATORS


2.1 Introduction


This chapter contains the asymptotic theory for the maximum like-

lihood estimates of the parameters given by (1.5.1), where the logarithms

of the treatment parameters are expressed as a polynomial function of a

single quantitative variable, x. An example of an experiment which had

a single quantitative variable was given in Section 1.1. The treatments

in that example were different levels of an additive for coffee. For

an experiment with quantitative independent variables, it is appropriate

to do a paired comparison analogue to classical regression analysis.

The theory developed in Sections 2.2 and 2.3 assume the general

case of s independent variables xl, ... ,xs. The parameterization of

the Bradley-Terry parameters, given in (1.5.1), is reproduced here:


s
ln = Z kki' i1...
k=l


The Bradley-Terry model with ties not allowed is used, where the r. are

defined by (1.2.2).

In order for all of the parameters 81 ... ,s to be estimable, it

is required that s a (t-l). This is clearly necessary since the Bradley-

Terry parameters have dimension (t-l). Note also that an intercept










parameter, 0 is not estimable without imposing a constraint on the

Bradley-Terry parameters. This becomes apparent when examining the

preference probabilities. For example, suppose the model was


s

in = 0 + k xki
k=l


i=l,...,t.


Then


P(T. T.) = ./(T. + r.)
I 3 1 1 3

s

exp(80 + kki
k=l


s

exp (B0 + Bkxki)
k=l


s

+ exp(B0 + 7 kxkj
k=l


s
exp( kxki)
k=l
s s

exp( kxki) + exp( kxkj)
k=l k=l


isi


The parameter 60 falls out, implying that any choice of B0 results with

the same preference probabilities. Choosing B0=0 replaces the imposi-

tion of a constraint on the treatment parameters.


2.2 Estimation


As was previously mentioned, the parameters B1, ... ,s are esti-

mated by the maximum likelihood method. From (1.3.1), the log-likeli-

hood equation is







t
In L = a.lni. n .(nT(r. +
Si= il
i=l i

t s
= ia Skkik
i=l Lk=l


(2.2.1)


I n ijn exp( S kki) + exp( kxkj
i

The maximum likelihood estimators are found by taking partial

derivatives of the log-likelihood equation, setting them equal to zero,

and solving for the parameters. The likelihood equations are


t
9ln L r
-1n L = a.x n.)
r i=l i r







Xrj
= i m. m
isj ij


s s
Sriexp( kx)ki + x rjexp( kxkj)
k=l k=l
s s
exp(Z kkxki) + exp( 7 kXkj
k=l k=l


s s

iexp( SkXki) + xrj exp( kxkj)
k=l k=l
s s
exp( kxki) + exp( B Skxkj
k=l k=l


i= > 7m. ri r = 0 r=l...s.
13 ] + 7.
i/j *L 3


(2.2.2)


where m.. is the total number of times T. is preferred over T..
1] 3










These equations must be solved iteratively. Appendix B contains

an APL program and a Fortran program which find the estimates of 81,...,

0 The procedure described in the appendix converged for most situa-
s

tions. However, occasionally when an attempt was made to fit a full

model (s=t-l), the procedure did not converge, which remains unexplained.

In this situation one can resort to the program found in Appendix A

which finds estimates of the Bradley-Terry parameters, and then solve

(t-l) equations in (t-l) unknowns to find the estimates of 8 ,... s


2.3 Asymptotic Distribution


First of all, some additional notation and elementary theory is

presented. This will become useful when the asymptotic joint distribu-

tion of ... is derived.
1 s

Let Xik be a random variable which is equal to 1 if T. is selected
i;)k 1
th
over T. for the k comparison of T. and T., 0 otherwise. Then X.
3 1 3 13k
has the Bernoulli distribution with probability of "success" equal to

T./(T. + T.), i,j=l, ... ,t, k=l, ... n.j., and the Bernoulli random

variables are independent of each other. This implies that m. has the
1J

binomial distribution, with sample size n.. and "success" probability

equal to T./(IT + T.), where
1 i 3


13
m. = Xijk i
k=l


Furthermore, the m..'s are also independent of each other, and from the
ii
properties of the binomial distribution we have


Em.. = n. .T./( + T.) ,
1J 1J J 1 3


i

(2.3.1)









2
var(m.ij) = n.ijiT/(ir + .) i 1D 1313 1 3


cov(mij,mkl) = 0 ,


i i,j,k,l=l,...,t.


From (2.3.1), (2.3.2), and (2.3.3), it follows that



n. Tr. nk T
E(m..m. ) = (Em.j) (E ) = k
] k mk iT. + Tj. Tk + T '
1 ] k I





2 2
2 2 in. j.T + n.. i.
Em.. = var(m..) + (Em..)2 1 J l 1
13 13 13 (I + i)2
1 3


(2.3.2)



(2.3.3)


(2.3.4)


(2.3.5)


Under certain regularity conditions verified by Springall (1973),

it follows from a theorem found in Wilks (1962, page 380) that

NI( ) ...s, N( -B ) is asymptotically multivariate normal,
with mean vector ~- and variance-covariance matrix (ab), where
with mean vector 0 and variance-covariance matrix (A ,) where
ab


= 31n L ln L
a = E aJ '
a b b


a,b=l,...,s.


We now proceed to derive the asymptotic variance-covariance matrix

by deriving Xab for general a and b. From (2.2.2) we have



Ix .-x x .-x
Dln L = Xai x a + a a .
~ ~ = L L m^ij + L ^7 m+jT j
a i N 4-V j
3<1


(2.3.6)









- X -X X .
SX i m.. + a] ai (n..- .. .
zi< i z + 1j T2 3 + 1 ijj )
i

m. (x -x ) n. a=l,...,s. (2.3.7)
i< j ( a -

Then from (2.3.4)-(2.3.7), we find Xab to be



A = E m. i (x .-x ) n. i.
ab i








i (ijk or j24)
2 2
n. .T i T n. i T

i
- nij r nkl~Tk
2 n n (x \-x ) (x -x )
z j (zz : .) (T+T, 7+1) I ai aj bk bl





i

+(7C + lT ) kT + T1) (Xai-xaj) kxbk xbl)
i



i< 1 2l
ni ( Z 2+, Zj -r.)2 T aib=l,....s. (2.3.8)
T- < T.T









As previously mentioned, the remaining chapters deal with finding

optimal designs when the logarithm of the treatment parameters is ex-

pressed as a polynomial equation in a single variable. The various

optimality criteria considered include


(i) Minimization of the variance of the maximum likelihood esti-

mator of one parameter,

(ii) D-optimality,

(iii) Minimization of the average variance of the predicted

"response", In iT

(iv) Minimization of the maximum variance of In 7 (minimax),
x
(v) Minimization of the bias,

(vi) Minimization of the average mean square error of In Z when
x
bias is present (J-criterion),

(vii) D-optimality for a preliminary test estimator.


These are all defined in subsequent chapters.

For the case of a single quantitative independent variable, the

model given in (1.5.1) becomes


s

n t. = x i=l,...,t. (2.3.9)
i k ,i k ..
k=l


Similarly, the variance-covariance matrix given in (2.3.8) becomes
S-l
(ab)1, where



\b = nj 2 (x a-x a)(x.b-x.b), a,b,...s.
ab + ( 2 + 1 (2
i









Notice that the complete design problem is quite complex. It

involves deciding how many levels and additionally which particular

levels should be chosen. Furthermore, it must be determined how often

to compare each pair of treatments.

A further complication is apparent by noticing that the variance-

covariance matrix, whose elements are given in (2.3.10), depends on the

true parameter values. Hence the optimal designs also depend on the

parameters. The designs discussed by El-Helbawy and Bradley and Spring-

all were found under the assumption that all Bradley-Terry treatment

parameters are equal, or equivalently that = 2= =B =0. In no

paper of which the author is aware was the dependency of the optimal

design on the true parameter values examined. As will become apparent

in the following chapters, the optimal design for one set of parameter

values can be quite different from the optimal design for another set

of parameter values.















CHAPTER 3

MINIMIZATION OF THE VARIANCE
OF A SINGLE PARAMETER ESTIMATE


3.1 Introduction


The present chapter contains a discussion of the linear, quadra-

tic, and cubic models. The optimality criterion considered is the

minimization of var(S.), where i=1,2, and 3 for the linear, quadratic,

and cubic models, respectively. The linear model is discussed in Sec-

tion 3.2. Designs for the quadratic model are found in Section 3.3 for

32=0, 81 variable. The general expression for var(82), given by

(3.2.6), can be shown to be a continuous function of 2, and hence the

designs in Section 3.3 are locally optimal at the point B2=0. Likewise,

designs for the cubic model are found in Section 3.4 for 3=0, 81 and

2 variable, and are similarly locally optimal.


3.2 Linear Model


From (2.3.9), the linear model is


In f. = 8x. i=l,...,t. (3.2.1)
1 1


Since there is only one parameter, by (2.3.10) the variance-covariance

matrix becomes a simple variance, X-1, where

B(x. + x.)
S TS' e I ]2
i i








(x. x.)2

ni. (exp(8(x.-x.)/2) + exp(B(x.-x.)/2))2
i

a.

-= nij 8d../2 -Bd/2 (3.2.2)
i

where d..=x.-x.. For the remainder of this dissertation, it is assumed
13 i J
without loss of generality that the independent variable, x, is scaled

such that it is in the interval [-1,1].

Although it is not necessary to treat the two cases B=0 and SrO

separately, the special case 8=0 is considered first because the argu-

ment used to derive the optimal designs is easier to follow for this

situation. When 8=0, (3.2.2) reduces to



n = n. (x. x.)2/4 (3.2.3)

i

The expression in (3.2.3) is a maximum when all the "weight" of the n..
1)
2
is placed onto pairs which maximize (x. x.) Clearly this implies
1 3
that the design which minimizes var(B) is a comparison of only the pair

(-1,1).

The designs which maximize A when 85O are found using an argument

identical to the one used for the case $=0. From (3.2.2), all of the

"weight" of the n.. should be placed onto pairs which maximize
13

2 (d/2 -$d/2 2
y = d2(e + e- 2 (3.2.4)


where d is then the optimal difference between the levels in the pairs.

The value of d which maximizes y in (3.2.4) is the solution to









y 2d -^d2(ed/2 e-Bd/2
y = 2d Sd 2(e = 0 (3.2.5)
3d (ed/2 + e-d/2)2 (ed/2 + e-d/2 3



The equality in (3.2.5) implies


coth(Sd/2) = Sd/2 (3.2.6)


The solution to coth(u) = u is a transcendental number, and can be

shown to be approximately u=1.1997. Hence, from (3.2.6) and the fact

that Idl must be less than or equal to 2, the design which minimizes

var(B) for a given value of 8 is


dl = Ix. x.I = min(2,2.3994/ B ) (3.2.7)


It is now shown that the value of Idl given in (3.2.7) is in fact

a maximum for y, and consequently for A. By (3.2.5), the second par-

tial derivative of y with respect to d is given by


2 y Bd/2 + d/2 2
Y (e + e )
Dd

22 322 2
2 4dB-tanh(Bd/2) 2d2/2 + 6 d tanh ($d/2) (3.2.8)


The second partial derivative 3 y/d2 is negative if and only if the

expression in (3.2.8) is negative. From (3.2.6), the equality


2 = Bd-tanh(8d/2) (3.2.9)


holds at d=2.3994/181, which is then substituted into (3.2.8) to give


2
3- = 2 8 B2d2/2 + 6 = -2d2 /2 < 0 (3.2.10)

opt. d










Because there is only one solution to (3.2.6) where d>0, the result in

(3.2.10) implies that the solution of Idl given in (3.2.7) does in fact

minimize the variance of B.

By (3.2.7) the optimal design is a comparison of only the pair

(-1,1) when 8 <1.2. It can be shown that P(l -1)=.92 when 8=1.2.

Because of this relatively large preference probability of choosing one

endpoint over the other, in most practical situations B will be less

than 1.2. So in these cases the optimal design is a comparison of only

the endpoints of the experimental region.

The optimal designs given by (3.2.7) are also optimal for other

criteria. In particular, they are optimal for criteria (ii)-(iv) intro-

duced in Section 2.3. These criteria are formally defined in Chapter 4,

but a brief explanation for their equivalence in the case of a linear

model follows. The equivalence of the criterion being discussed in the

present section to D-optimality follows from the fact that the deter-

minant of a Ixl matrix is simply the only element in the matrix. The

equivalence to the other two criteria follows from the fact that


2
var(ln i ) = x var() .(3.2.11)
x

From (3.2.11), criterion (iii) is, in this case, the minimization of



2 2 "
I
x var(8) dx = -var(8)
-1


and criterion (iv) is the minimization of


2
max x var(B) = var(B)
xEL-l,1]









3.3 Quadratic Model

The quadratic model is

2.
In = xi + x i=, ,t.
1 81xi 2 1 '


From (2.3.10),

mum likelihood


the asymptotic variance-covariance matrix of the maxi-
S2 i -1
estimators a1 and 82 is (A ) where
1 2 abI


A11 =
i


12 -= I
i


2i i

2
ijij(X i x)




*..(X. x.)(x. 2 2
nij ij(xi 2 x



2 2 2
n ijij (xi x. )
13fl 1


(3.3.2)




(3.3.3)




(3.3.4)


and where


.ij = (xi'x) =
iJ 1 3


exp( 81(x. + x.) + 8(x.2 + .) )
S2exp( 2 1
2 2 )2
( exp(8 1xi + 8 2xi ) + exp(81xj + 2xj ) x


(3.3.5)


i


Minimization of var( 2)


Consider the test of the hypothesis


H : 8 = 0 S, unspecified,
o 2 1


versus


H : S2 / 0 1B unspecified.
a 2 2.


(3.3.1)









Minimizing the asymptotic variance of 52 would equivalently be maximiz-

ing the asymptotic power of the test. The variance of 2 is the lower

right-hand element of the 2x2 matrix (lab)- That is,



var(B2) = 1 (3.3.6)
11 22 12


Under the null hypothesis, (3.3.5) becomes


8 (Xi + Xj)
e
J = 1 -2 i (e13 +e )


As was pointed out in Section 3.1, in the present section (3.3.6)

is minimized for 82=0. This produces locally optimal designs. Assuming

52=0 greatly simplifies the design problem as will be seen in the

remainder of the section.

The end result of Lemmas 3.1-3.2 and Theorems 3.1-3.3 is a set of

optimal designs which minimize var(82) under the null hypothesis for

various choices of the linear parameter. Hereafter, "optimal design"

will refer to the design which is optimal for the criterion being dis-

cussed in the section.

The following two definitions are useful in the discussion that

follows.


Definition 3.1. The pair of treatments (-x ,-x2) is defined to be

the symmetric counterpart of the pair (x ,x2).


Definition 3.2. A design is said to be symmetric if each pair in the

design is compared as often as its respective symmetric counterpart.









Lemma 3.1. Let {(x.,x.) denote .ij as given by (3.3.7). Then

(xix .) = (-Xi-x.-x), for any x. and x..

Proof.
8 C(x. + x.) -2 1(x. + x.)
x e e
X i lxj 2 -281(xi + x.)
Se +e ) e 3


81(-xi-x)
e
e1(-x.) + e1 (-x ) 2
Ce^ j' +e


= 0(-Xi,-xj ) +


Theorem 3.1. The optimal design is a symmetric design.

Proof. Notice that both of the following hold:

2 2
(i) (x. x.) = ( (-x.) (-x ) )
3 1 3 1

(ii) (x.2 x 2) = ( (-x.) (-x.)


From (3.3.2), (3.3.4), Lemma 3.1, and the above, it is clear that ll

and X22 are invariant under changes of any number of comparisons of

pairs to comparisons of their symmetric counterparts.

Let D denote any particular design, and let varDC (2) be the var-
th
iance of 2 for Design D Also, let the i pair in D be denoted

(x.,xi). The sets of comparisons {(x.,x ), (-x.,-x)} are considered

in sequence, and without loss of generality, it is assumed that


n ,- n = d. 2 ,
x.,x. -X. ,-X.
1 1 1 1

for all i. Notice that it could be true that n = 0.
-X -X
I I










Design D2 is formed from Design D1 by changing d./2 of the compar-

isons of the pair (x.,xi) to comparisons of the pair (-x.,-x'). This
1 1 1 1
results in (n + n )/2 comparisons of each of the two pairs
x.,x. -x.,-x.
11 1 1
th
in the i set. This change is made for each set of comparisons, there-

by defining Design D2 to be a symmetric design. The total number of

comparisons, N, is left unchanged.

By the first paragraph of this proof, the values of All and A22

are the same for the two designs. By (3.3.3), 12=0 for Design D2,
12
and hence A2 is a minimum. Therefore, by (3.3.6),



varD (2) S varD (2) (3.3.8)



Note that the variances are equal if 12=0 for Design DI, and it

is possible that A 12=0 for a nonsymmetric design. However, (3.3.8)

shows that no design can "do better" than a symmetric design. Also

note that in practice, d. must be an even integer in order for the
1
n. 's to be integers.


So the search for optimal designs in the present section will hence-

forth be restricted to the class of all symmetric designs.

Because 12=0 for a symmetric design, (3.3.6) reduces to


var(B ) = 1/22 (3.3.9)



So it is now necessary to find designs which maximize A22. Notice that

122 depends on 1. As previously mentioned, the optimal design also

depends on the value of S1.









Theorem 3.2. The optimal design is a design in which the only compar-

isons made are comparisons of the pairs (x,l) and (-l,-x) an equal num-

ber of times, for some x. Furthermore, x, 0.


Proof. First of all, the notation used for A22' which is defined by

(3.3.4) and (3.3.7), is changed. The N total comparisons are arbitrar-
.th
ily ordered, and the i pair is then defined to be (x.,x!). The vari-

ables y. and z. are defined to be
1 1


y = x. x z. = x + x i=,...,N.
1 1 1 1 1


Then from (3.3.7),


1(x' + x) -8 l(X + x!)
1 i i 11 1
= e e
i li xx l -Sl(xi + x!)
(e +e ) e



8 y /2 -1+i/2 2 i=l .. N
(e + e


Notice that i. is a function of y. but not of z.. From (3.3.4), 12 is
1 1 12
rewritten as
N

22 = iii zi
i=l


Let D be any design. By holding y. fixed, we examine what changes
2
in z.2 will increase 22. Clearly if 1zil is as large as possible for

every i, then X22 is also as large as possible for the particular set

of y.'s found in D. Without loss of generality, it is assumed that
1
x. 1 1 1 1 1









all comparisons must involve either x=-l or x=l. Additionally, using

an argument similar to one used in Section 3.2, it is clear from the

expression for A22 given in (3.3.4) that x=l should be compared with

one and only one level of x, namely the value of x which maximizes

0(l,x)-(1-x ) This result in conjunction with Theorem 3.1 proves the

first part of the theorem.

We can now rewrite 122 simply as


22
122 = NO(l,x)-(l-x2)2


It remains to be proven that the optimal value of x is such that x L 0.

The variable I(x) is defined to be the positive square root of

X22/N, i.e.


81(1 + x)/2


1 B2x
~(x) ( x2
e1 1x
e + e


(3.3.10)


Since (3.3.10) and the fact that xE[-l,l] imply I(x)

A22 is equivalent to maximizing P(x).

It will now be shown that '(x) > i(-x), x a 0.

inequalities are equivalent:


f(x) a 4(-x)


1 (1 + x)/2
e
1 1x
e + e



13 (l+x)+1 1 (l+x)-0 x
e + e


> 0, maximizing



The following five




(3.3.11)


1e(1 x)/2
e (3.3.12)
e + e



~ (l-x) + 1 1( (1-x)+81x
r e + e (3.3.13)









81(1 + x)/2 -1 (1 + x)/2 81(1 x)/2 -8 (1 x)/2
e + e a e + e (3.3.14)


cosh(u + 81x) cosh(u) (3.3.15)

where

u-u
cosh(u) = (e + e )/2 ,

u = 1 (1 x)/2


Now cosh(u) is a monotonically increasing function of u for u. 2 0.

Furthermore, it can be assumed that 3 1. 0 because Theorem 3.3 proves

that the optimal designs for 1 = 10 are the same. Or, equivalently,

the remainder of this proof can be repeated for the case 1 <0. Then

x ; 0 and B1 > 0 implies B x > 0, and hence by the equivalence of

(3.3.11) and (3.3.15), 4(x) L(-x) for all x O 0. This completes

the proof of the second part. t


The next theorem shows that optimal designs only need to be found

for nonnegative values of 81.


Theorem 3.3. The optimal design for B1=-B10 is the same as the

optimal design for 81=B10.

Proof. Recall that the optimal design is found by finding the value of

x which maximizes P(x) from the proof of Theorem 3.2. Since i(x) is

actually a function of 1 and x, we write


B1(1 + x)/2

(81,x) = e1 x/ (1 2
e + e









The theorem is proved by observing that



81(1 + x)/2 -81(1 + x)
e 2 e
t(81,x) = 1ex (1 x2) e* (l + x)
e +e e


-81(1 + x)/2
=e (1 x2)
-81x -81
1
e + e


= (-1,x) +



All that remains to be done is to find the value of x that maxi-

mizes f(x) for various choices of B1 0. Unfortunately, a closed form

solution of x as a function of B1 does not exist. However when 31=0,

the optimal value of x can be analytically derived, and hence this

situation is considered first.

Suppose B =0. Then from (3.3.10), i(x) = (1 x2)/2. Clearly

this is maximized by x=0, and hence from Theorems 3.1 and 3.2, the

optimal design is an equal number of comparisons of only the pairs

(0,1) and (-1,0). For the case 81 0, the grid method described in the

next paragraph was used to obtain the results presented in Table 3.1

and Figure 3.1.

The grid method utilized first calculated 4(x) for x=0(.l)l,

and the best value of x, say xl, out of these 11 choices was found.

Then t(x) was calculated for the 11 values x=(x -.l),(x -.08), ...

(x +.l), and again the best value of x was found. This procedure was

repeated until the optimal value of x was found accurate to the fourth









decimal place. The entire procedure was done separately for each value

of 16 .

The following lemma proves that the values of x which are found

via the grid search are absolute maximums for i(x).


Lemma 3.2. Exactly one local maximum (and hence it is the absolute

maximum) exists for l (x) in the interval L0,1].

Proof. By Theorem 3.3, only the case 1 >0 needs to be considered.

From (3.3.10), the first partial derivative of l(x) with respect to x

is

1(1 + x)/2 e (81(1 x )/2 2x)
1 1(x) = 1 1x 2 x (3.3.16)
L 2 l 2
( e + e ) e (8B(1 x )/2 + 2x)


It can be seen from (3.3.16) that (0)>0 and 4i'(l)<0. Hence ,0'(x)=0

for at least one xeLo,l1, and it remains to be shown that i'(x)=0 for

exactly one xE[O,lj. (An attempt was made to show 4"(x) is negative

for all 61 and xELO,l2, but it turned out not to be true. In particular,

a graph of '(x) as a function of x for 81=5 indicated ("(x)>0 for values

of x near 0.)

From (3.3.16), W'(x)=0 if and only if

-- (1 x)

(81(1 x2)/2 2x) = e () 1(1 x )/2 + 2x) (3.3.17)


Notice that the functions f(x) = ( 1(1 x2)/2 2x) and

f2(x) = (81(1 x2)/2 + 2x) are reflections of each other about the

axis x=0, and they are quadratic in x. Hence they intersect exactly









once at x=0. Also, for fixed B1, fl and f2 have maximums at -2/81 and

2/81, respectively. Let gl(x) be the right-hand side of (3.3.17). It

was shown in the preceding paragraph that fl(x) and g (x) intersect at

least once in the interval Lo,l]. To determine that they intersect

exactly once, it is sufficient to show that f l(x)/3x < gl (x)/ax.

From (3.3.17),


f l(x)
3x -x 2,


gl(x) -81(1 x) 2 1(1 x)
x = e (1(1 x )/2 + 2x) + e (-8x + 2)


-3 (1 x)
= e 1 ( 2(1 x )/2 + B3x + 2)


For 31>0 and xELO,1], it is clear that fl (x)/2x < 0 and

agl(x)/3x > 0, which completes the proof. t


Table 3.1 presents the optimal value of x found from the grid

search described immediately prior to Lemma 3.2, for various choices

of 1 1. Figure 3.1 is a graph of the optimal value of x versus 1B11.

The table includes relatively large values of B11 to illustrate that

xtl as Is1 B m. Recall that for a given value of B l1, the optimal

design is a comparison of (-l,-x) and (x,l) an equal number of times,

where x is found in Table 3.1 or Figure 3.1.

Table 3.1 indicates that the optimal design when B1=0 involves

only 3 levels, x=-l,0,1. However, when 81O, the optimal design

requires 4 levels, x=-l,-x ,x ,1, for some x In certain experimental
So o
situations it may be significantly more laborious to make four "batches"
















Table 3.1. Designs which minimize var(B2)


Relative
81 Optimal x efficiency

0 0 1
.05 .0003 1
.10 .0012 1
.15 .0028 1
.20 .0050 1
.25 .0077 .9999
.30 .0110 .9998
.35 .0150 .9995
.40 .0194 .9992
.45 .0243 .9988
.50 .0297 .9982
.55 .0356 .9974
.60 .0419 .9963
.65 .0486 .9950
.70 .0556 .9935
.75 .0630 .9915
.80 .0708 .9893
.85 .0787 .9866
.90 .0869 .9836
.95 .0953 .9801
1.0 .1039 .9761
1.1 .1216 .9668
1.2 .1396 .9556
1.3 .1580 .9426
1.4 .1764 .9269
1.5 .1948 .9097
1.6 .2130 .8905
1.7 .2310 .8696
1.8 .2487 .8472
1.9 .2660 .8233
2.0 .2829 .7983
3.0 .4269 .5242
4.0 .5295 .2974
5.0 .6031 .1546









































CN
< ~












Ol
N












4
U
Lf

aD



a.






cC









than it is to make just three. For this reason, and because a priori

information about 1 may not be available, asymptotic relative efficien-

cies were calculated for the design ({n,0 = nl = N/2} to the optimal
-1,0 0,1
designs found in the table. The relative efficiency of the first design

to the second (optimal) design is defined to be


var 2(2)
R.E. = 2
var (B2)


1, 1 2
e -/(e 1 + 1)2
B1(1 + x) ,2 1 x 2
e (1 x )/( e + e )


The relative efficiencies were enumerated for various choices of 81,

and are presented next to the optimal designs in Table 3.1.


Equivalence of El-Helbawy and Bradley's method
and minimizing the variance


An application of El-Helbawy and Bradley's method for finding opti-

mal designs was briefly discussed in Section 1.4. It will now be prov-

en that their optimality criterion is equivalent to minimizing var(8k),

any k. First of all, notation and results of El-Helbawy and Bradley's

paper which are necessary for the proof of Theorem 3.5 are presented.

Recall that El-Helbawy and Bradley considered linear contrasts of

Yi = In i=l, ... ,t. The matrices B and B are defined as mxt and
1 1 -m -f
nxt, respectively, with zero-sum, orthonormal rows, such that

0 s m < m+n S t-l. The null hypothesis is


H : B Y(T) = 0
S -n-- -n










and the alternative hypothesis is


H : B y(r) 0 ,
a -n-- -n


th
where Tr and y(r) are column vectors with i elements 7T. and y., respec-

tively. The rows of B represent the linear contrasts assumed to be
-m

null. Note that B need not exist (i.e. m=0).
-m

The vector T is a value of T which satisfies the null hypothesis
-o -

and the constraints, where r is the true parameter vector.

The matrix B is defined to be the txt orthonormal matrix



[1,'//


B = B (3.3.18)

--m




where B* is any (t-m-l)xt matrix such that BB' = I and l'=(l, ... ,1),
-m t -

the t-dimensional unit vector.

The following theorem is extracted from El-Helbawy and Bradley

(1978, Theorem 2, page 833).



Theorem 3.4. Let p be the maximum likelihood estimator of T. Under

the assumption of connectedness (Assumption 1.2), and by the assumption

of strictly positive elements of T, which has already been inherently

assumed since ln(0) is undefined, v7((p) Y(7)) has a limiting distri-

bution function that is singular, t-variate normal in a space of (t-m-l)

dimensions, with zero mean vector and dispersion matrix


,
E(r) = B*'(C(-)) B* (3.3.19)
-m -- -m


where










C(T) = B*A(7r)B*' (3.3.20)
M- -m


and A(r) is the txt matrix with elements


t

Aii (i) = i ij i /(ni + .), i=l.. t,

j=l
ji (3.3.21)


A..(Tr) = -U. .ir. ./(r. + i .) i j, i,j=l,...,t,



and where U.. is defined in Assumption 1.2.
13


El-Helbawy and Bradley's optimality criterion is the maximization

of the noncentrality parameter, 2, for the asymptotic chi-square dis-

tribution of the likelihood ratio statistic. The noncentrality parame-

ter is


2 -1
2 = 6'E 6 (3.3.22)
-0 -


where 6 is a vector of constants defined in their paper, E is the
o
-1
leading principal matrix of order n of C B provides the first n
-o -n
rows of B*, and


C = B*A(T )B*' (3.3.23)
-o -m- -o -m


A theorem proving the equivalence of the two optimality criteria

is now presented.


Theorem 3.5. Consider the model In iT = 1 x+B 62x2+ +kxk

Suppose we are interested in testing, for some fixed i,


H : 3.=0 versus H : 30 .
o 1 a i









Then the design which minimizes var(Bi) is identical to the design

which maximizes the noncentrality parameter from the likelihood ratio

test. (Note that it does not matter whether or not it is assumed that

some of the other l.'s are 0.)
3

Proof. Orthogonal polynomials can be derived for any spacing of the

independent variable. Let these contrasts be the rows of B. The lxt

matrix B is the orthogonal polynomial corresponding to the parameter
-n=1

B. being tested.

From the method used to find orthogonal polynomials, it follows

that


1 c-,n=1 (TI


where c is a constant. So minimizing var(B.) is equivalent to minimiz-

ing var(B y(p)). But from Theorem 3.4,
-n=1-

-1
var(B n-(p)) = B (B*'(C(T)) B*)B'
-n=l -m - -n=






-1
0
= (1,0, ... ,0) (C(7) 1





S ((C(7)) ) l


2
From the definition of Z E is the leading principal matrix of

-order 1 of C 1
order 1 of C i.e.
-o


Z = ((C())1)-1
o - 11









In this case, 6 is a scalar constant, and so maximizing 12 is equiva-

lent to minimizing Z = var(B). t
o 1


Example 3.1. This example illustrates the equivalence of the two

methods presented in Theorem 3.5. It is assumed that the true model

is a quadratic model, In ix = l x + 5 x Also, the design is re-

stricted to have the three design points x=-l,0,1.

Table 3.1 indicates that the design which minimizes the variance

of O2 is {n-1,0 = nO,1 = N/2}.

The design problem is now approached using the method presented by

El-Helbawy and Bradley. The null hypothesis is


Ho: 2 = 0 unknown.
o 2 1


Using orthogonal polynomials, this hypothesis is equivalent to


H (-1 2 -1)
Ho. y(T) = 0 ,
o /6


where y_' (r) = (In -i1, In O In 1) .

Note that El-Helbawy and Bradley's constraint, that the In ir.'s
1
sum to zero, poses no problem. Instead of constraining T0 to be equal

to 1, an intercept parameter, B0, could be introduced to the model.

The model would then become


2
In = 0 Sx + B2x


with the constraint

SIn rT = 0
x x
x


It can be shown in this particular case that 80 3 .
0 32








For a definition of the following matrices and notation, see

(3.3.18)-(3.3.23). For the example under discussion,


B
-n=1


= (-1 2 -1)//6 ,


2(-1 2 -1)//6
B* =
-m (-1 0 1)//2


12 + 13
A_ ) = T -1

--13


where 7 '=(1,1,1), i.e. B =B =0. Note that T can be any
-o 1 2 -o
satisfies the null hypothesis and the constraint.

The matrix C is
-o


vector which


C = B*A(TT )B*'
-o -M- -- -im



1 2 12 23 Vi2 12 23

S112 -23 23 1223


implying


2 (112 + 4 +13 + 23
C

S12 H23)
712


12 -23)
/12 / -

3 12 + 23
2 12 23


-1
Then since Z is the upper left-hand element of C
o -o


S 12 + 4113 + "23
o 21 C


-W12

+12 + H23

-123


-13

-23

13 + 23_









implying



-1 21 Co
o0 12 + 4 +13 + 23



3 V12V13 + V12 23 + 113 23
S ---V(3.3.24)
8 012 + 4113 + 23 (3 24)



The design which maximizes the chi-square noncentrality parameter from
-l
the likelihood ratio test is the design which maximizes E
o
-1
By the definition of the U ., 3 = 1 12 13 So can be
ij 23 12 13 o
rewritten as



-1 12 13 + (12 13)( 12 13
o 12 + 4 +13 + (1 V2 13)


2
12(1 V12 U13) + (V13 132)
(3.3.25)
31 + 1
13


Suppose that P13 is temporarily held constant. It is then claimed that
131
choosing u12 = (1 13)/2 maximizes o- for the particular V13"

From (3.3.25), the expression 12((1 U 3) 1 2) must be maximized.

Because of the constraint 0 S 12 5 1- ,13' it is clear that

12((1 U13) 12) is maximized when U12 (1 13)/2, which

proves the claim. The constraint among the U..'s in addition to the

above claim implies that the optimal design must be a design such that


1 (13
12 2 V23 (3.3.26)









-I
The scalar ~ is now rewritten as a function of v112 alone. From
o
(3.3.24) and (3.3.26),



-1 3 212(1 212) + 12
o 8 2112 + 4(1 212)


2
3 212 3112
8 4 612
3 1 12


= 3112
16


Clearly o-1 is maximized by choosing 112 as large as possible. By

(3.3.26), this would make 12- 23=.5, and 13=0. So this optimal

design is in fact the same as the design which minimizes var(2).


An attempt was made to find the optimal design for the test of

hypothesis

H : B = = 0 versus H : not H
o 1 2 a o


using El-Helbawy and Bradley's methodology. When simultaneously test-

ing more than one parameter, or equivalently more than one treatment

contrast, a constant value, 6., must be chosen for each parameter. The

theory behind the optimal designs includes an infinite sequence of

values of T satisfying H which are converging to a vector m7 satisfy-
Sa -o

ing H When the treatment contrasts are for factorial treatments,
o

which is the case they discussed extensively, the relative sizes of the

6. are said to be equivalent to the relative importance placed on each
1

contrast being tested in H Hence, El-Helbawy and Bradley usually

chose 6 =6 = = 6 .
1 2 n









In the present situation, choosing 61 and 62 restricts the alter-

native parameter values in the infinite sequence mentioned in the pre-

ceding paragraph to be a proper subset of the alternative parameters

satisfying Ha. For example, 61=62 implies that 68= 2, and hence the
2
alternative models are In T1 = Bx + Sx with B- as N-*. Choosing
x
different 6. can result in a different design because the maximization
1
of the noncentrality parameter depends on the relative sizes of the

6.. Additionally, the optimal designs actually found after choosing

the relative sizes of 61 and 62 were found to be unreasonable (e.g. for

61=62, the optimal design is a comparison of only the pair (-1,0)).

Chapter 4 presents other optimality criteria for fitting a quadratic

model.


3.4 Cubic Model


The cubic model is


2 3
In I. = x. + Bx.2 + x 3 i=l,...,t. (3.4.1)



The asymptotic variance-covariance matrix of the maximum likelihood
-1
estimators, B1, ,' and 63 is ( ab) where lab is given by (2.3.10).

The inverse of the variance-covariance matrix is partitioned as



11 12 13


(A b) = 12 22 23



13 A23 33


Then by a result found in Graybill (1969, page 165), the variance of









83 is

r -1
var( = ab 33



= 33 (A13 23> ( 12 ( 1j
12 22 23



SF 13 132212 23 23 1123- 12 13

33 11 A22 122 (3.4.2)


To simplify (3.4.2), only the case S =0 is considered in the

search for designs which minimize var(3 ). As pointed out in Section

3.1, this gives locally optimal designs. The following theorem shows

that optimal designs only need to be found for linear and quadratic

parameters which are both positive.


Theorem 3.6. Suppose that 3=0, and further that the optimal design

for given 3 and 2 is comparisons of the pairs (x.,x.) a total of

n times, for all pairs. Then
x. ,x.
1

(i) The optimal design for (-81) and (-52) is the same as the

optimal design for 1 and 82'

(ii) The optimal design for (-81) and 2 is comparisons of the

pairs (-x.,-x.) a total of n times, for all pairs,
1 3 x.,x.
(iii) The optimal design for 81 and (-2 ) is the same as in (ii).

Proof. From (3.3.5),


. 2 (xij) =








2 2 2 2
exp(-8 (x.+x.) 2 (x+x. )) exp(2 (x.+x.) + 23 (x. i+x ))
1 1 3 2 l 1 j 2 i
(exp(-1x i-82x ) + exp(-81x -82x )) exp (2B1(x +x ) + 282 (x i+x 2))


2 2
exp(B (x. + x.) + 2(x + x ))

(exp(61xi + B2xi2) + exp( 1xj + 2x 2))2


= 1 (x ,x.) (3.4.3)


2
Notice that since B =0, Tr i./(r. + ir.) from the expression for a
3 13 1 3 ab
given in (2.3.10) is equal to 812 (x,x ). Hence by (2.3.10) and
83,63 i j

(3.4.3), it is clear that Xab is invariant under changing the sign of

both 81 and 2, for all a and b. Therefore the variance of 3 is also

invariant under changing the sign of both 81 and 2. This implies that

the design which minimizes var(3 ) for 81 and 82 is also the design

which minimizes var(3 ) for (- 1) and (-8 ), thereby proving (i).

To prove (ii), note that from (3.3.5),

2 2
exp(-8 (x.+x.) + 3 (x. +x. ))
) = (xi,x.)' =
5-1 1 j 2 2 2
S2 (exp(-1Xi +B2xi ) + exp(-81xj+62x ))


exp(B (-xi-x.) + 8 (x. +x. ))

(exp(B1(-xi)+B2xi ) + exp(B (-x )+B2x2))2



S (-x.,xi'-x) (3.4.4)


Again, by (2.3.10) and (3.4.4), changing 81 to (-81) and the pairs

(x.,x.) to (-x.,-x.), for all pairs in the design, will not change 1 ,

122' 33' nor 113. It changes only the sign of A12 and 123. But









notice that in (3.4.2) all terms involving A12 and A23 are of the form

2 2
112 23, 122, or 23 Hence the negative sign cancels, leaving

var(3 ) invariant under the change described above. This proves (ii).

To prove (iii), simply apply the results of (i) and (ii).


Theorem 3.5 proved the equivalence of minimizing var(83) and

El-Helbawy and Bradley's criterion. A Fortran computer program was

used to find the optimal design using El-Helbawy and Bradley's metho-

dology for 8 = 2=8 =0. The levels allowed were x=-l,-.5,0,.5,1. The

grid search included all ten pairs formed by these five levels, assum-

ing a symmetric design. The optimal design found was {n =

n = .045N, n.5 = .621N, n_, = .298N, all other n.i = 0}.
.5,1 -.5,.5 -1,1 13
For this design, it can be shown that var(3 )=16/N.

It was decided that for unequally spaced levels, employing El-Hel-

bawy and Bradley's methodology would be substantially more laborious

than minimizing var(83). For this reason, the designs subsequently

discussed were found by minimizing the expression found in (3.4.2).

Optimal designs were found by a grid search via a Fortran computer

program for 83=0, 81 2=0(.2)1. The comparisons allowed were (-l,x ),

(x,x2 ), (x 2,1), and (-1,1), where the optimal value of xl and x2 were

also found by the computer. The motivation for the choice of these

four pairs is the result discussed in the two preceding paragraphs for

the case of all three parameters equal to zero, which indicated that

only the pairs (-1,-.5), (-.5,.5), (.5,1),.and (-1,1) need to be com-

pared.

The grid procedure is now briefly described. The initial values

for xl and x2 were x = .7(.1)-.3 and x =.3(.1).7. For each of the 25









combinations of xl and x2, the optimal comparison proportions were

found by another grid. The latter grid began by considering different

comparison proportions, changing by increments of 0.1. This grid

became finer and finer until the proportions were accurate to .0004.

From this the best combination of xl and x2 out of the initial 25

choices was found, and the entire procedure was repeated for a finer

grid about the previous best choice of xl and x2. The entire grid

search stopped when xl and x2 were accurate to .0037.

Notice that it is possible that the designs found are not abso-

lutely optimal, but rather they may only be local minimums for (3.4.2).

This is because only four pairs were considered when it is possible

that a five or six pair design, for example, might offer a slight im-

provement. Also, the grid search restricted xl and x2 rather than

allowing them to take on values in the entire interval [-1,1. The

latter point is not believed to be a problem due to the resulting

design for 1 =2 =B3=0 originally found using El-Helbawy and Bradley's

methodology, and the fact that the optimal designs currently being dis-

cussed all have values of x1 and x2 within 0.02 of 0.5.

The optimal designs and the value of var( 3) for N=l are presented

in Table 3.2. The first two columns of the table give the value of

81 and 2. Columns 5 through 8 give the comparison proportions for the

pairs (-l,x ), (x ,x ), (x2,1), and (-1,1), respectively, where xl and

x2 are given in columns 3 and 4. The last column gives N-var(3 ). The

number of times a pair is compared is found by multiplying the appro-

priate comparison proportion by N. Recall that by Theorem 3.6, the

optimal design for both parameters negative is the same as the optimal

design for !31 and B2 1. If exactly one parameter is negative, then









the optimal design is found by first finding the optimal design for

j8 1 and 1821, and then taking -ix to be the proportion of the total

number of comparisons that the pair (l,-x ) is compared, xl,x2 to be

the proportion for (-xl,-x ), and so on. To calculate var(3 ) for N>1,

the tabled value of N-var(B3) is divided by N.

Notice from Table 3.2 that for 8: .6, the optimal design is very

close to the design {n 1,.5 = n.5,.5 = n.5,1 = .3333N}, regardless of
-i,-.5 -.5,.5 .5,1
the value of 2. This implies that if it is a priori believed that 8

is relatively large, then this design will result in a variance of 83

very close to the tabled minimum. As has been previously mentioned, it

is also of interest that the optimal values of xl and x2 are always

very close to -.5 and .5, respectively. Finally, notice that the

design for =B 2=0 found in Table 3.2 is not the same as the one found

using El-Helbawy and Bradley's methodology, although the value of

var( 3) appears to be the same for both, and hence they are equally

optimal. However, it was not established that the two designs are

mathematically equivalent. Since the two criteria are equivalent,

there may be some problem with local minimums of var(63). It appears

that the surface of the expression for var( 3) is relatively flat, and

so it is possible that a finer initial grid is necessary to guarantee

that the minimum found is in fact the absolute minimum of var(83).












Table 3.2. Designs which minimize var(B3)




1 62 xl x2 -1,xl Xl,X2 X2,1 1,1 N-var(83)


0 0 -.50 .50 .133 .533 .133 .200 16.0000
.2 0 -.51 .50 .338 .331 .331 0 16.0836
.4 0 -.50 .50 .332 .337 .332 0 16.3225
.6 0 -.50 .50 .330 .341 .330 0 16.7322
.8 0 -.50 .50 .327 .346 .327 0 17.3190
1 0 -.50 .50 .323 .353 .323 0 18.0959

0 .2 -.50 .50 0 .666 0 .333 16.0000
.2 .2 -.50 .51 .333 .329 .338 0 16.1447
.4 .2 -.50 .50 .330 .336 .335 0 16.3835
.6 .2 -.50 .50 .329 .338 .333 0 16.7943
.8 .2 -.50 .50 .326 .346 .328 0 17.3819
1 .2 -.50 .50 .321 .353 .327 0 18.1604

0 .4 -.50 .50 0 .666 0 .333 16.0003
.2 .4 -.50 .50 .105 .561 .103 .231 16.3209
.4 .4 -.50 .50 .330 .333 .337 0 16.5664
.6 .4 -.50 .50 .327 .338 .335 0 16.9799
.8 .4 -.51 .50 .324 .342 .334 0 17.5716
1 .4 -.51 .50 .319 .349 .333 0 18.3554

0 .6 -.50 .50 0 .666 0 .333 16.0005
.2 .6 -.50 .50 .004 .661 .001 .334 16.3207
.4 .6 -.51 .50 .333 .328 .340 0 16.8748
.6 .6 -.51 .50 .327 .333 .340 0 17.2927
.8 .6 -.51 .50 .323 .337 .340 0 17.8907
1 .6 -.51 .49 .317 .346 .337 0 18.6825

0 .8 -.50 .50 0 .666 0 .333 16.0010
.2 .8 -.50 .50 .004 .661 0 .334 16.3225
.4 .8 -.51 .47 .039 .633 .078 .320 17.2619
.6 .8 -.51 .50 .328 .327 .344 0 17.7368
.8 .8 -.51 .50 .323 .331 .346 0 18.3436
1 .8 -.52 .49 .319 .338 .343 0 19.1472

0 1 -.50 .50 0 .666 0 .333 16.0015
.2 1 -.50 .50 0 .664 0 .335 16.3233
.4 1 -.51 .47 .034 .633 0 .333 17.2685
.6 1 -.52 .50 .331 .318 .351 0 18.1804
.8 1 -.52 .50 .324 .325 .351 0 18.9372
1 1 -.52 .49 .318 .331 .351 0 19.7558
















CHAPTER 4

DESIGNS FOR FITTING A QUADRATIC MODEL


4.1 Introduction


The present chapter contains optimal designs for fitting a quad-

ratic model under three different conditions, each for three different

optimality criteria. The D-optimality criterion is considered in Sec-

tion 4.2. This criterion minimizes the determinant of the asymptotic

variance-covariance matrix. In the present chapter, this matrix is a

2x2 matrix since there are two maximum likelihood estimators, 81 and B2"

From Section 2.3, these estimators are asymptotically normal. There-

fore, by a result in Box and Lucas (1959), the determinant of the

variance-covariance matrix is proportional to the volume contained with-

in any particular ellipsoidal probability contour for $=(81, 2) about

S, where $ is the true parameter value. The D-optimal design will
-o -o

then minimize the volume of any such probability contour.

Section 4.3 contains designs which minimize the average-variance

of In T = B x + 2 x The average-variance is simply the variance of

ln 't integrated over the experimental region. In this case, the exper-

imental region is xEL-l,l]. The property that these designs have is

obvious. That is to say, a good estimate of all of the preference prob-

abilities is desirable, and the design which minimizes the integral over

x of var(ln ir ) will give estimates of 7 which are good "on the
x x
average".









The criterion discussed in Section 4.4 is similar to the average-

variance criterion. The designs found in this section minimize the

maximum of var(ln TT ), where the maximum is taken over the interval
x
xEi-l,l]. These designs are referred to as minimax designs. The dis-

advantage of this criterion is similar to the disadvantage often attrib-

utable to the minimax rules in the decision theory framework. That is,

minimax designs may be slightly "pessimistic" in the sense that they

minimize the largest value that the variance can be. This minimization

is good, but at the same time the variance of In T may be relatively
x
large for most values of x in the interval [-1,1], hence causing the

minimax design to be unfavorable.

Finally, Section 4.5 presents some design recommendations and

conclusions. A number of designs are compared to determine which

design does best for a wide range of parameter values, using primarily

the D-optimality criterion. Additionally, an examination is performed

on how well these designs protect against the bias present when the

true model is cubic.

Recall that in general, the problem of finding optimal designs is

complex. It involves the determination of the number of levels to be

included in the experimental design, and also which particular levels

are to be chosen. Additionally, the number of times each pair is to

be compared must be determined. In Chapter 3 it was proven that the

design which minimizes var( 2) must be of the form {n_-x = nx,1 = N/2,

thereby implying that the optimal number of levels is four, and only

two pairs are to be compared an equal number of times. All that re-

mained to be resolved was the determination of x for different values

of a1 In the present situation, a similar simplification of the









general problem does not appear to be possible, except by imposing some

restrictions, such as letting $2=0. For this reason, three conditions

are considered under which optimal designs are found. This essentially

results in designs which are optimal for a proper subclass of the class

of all designs. However, as will become apparent in the remainder of

the chapter, there is not much difference in the degree of optimality

among the three conditions presently considered. Hence it is conjec-

tured that the degree of optimality of the designs found in the final

and most general subsection of Sections 4.2-4.4 is very close to if not

equal to the best that can possibly be achieved. The three conditions

are now briefly introduced.

The first condition considered in each of Sections 4.2-4.4 is that

the levels are restricted to be x=-l,0,1. This greatly simplifies the

general problem to one with only two unknowns, namely the comparison

proportions, P-1,0 and p0,1. The value of J-1,1 is found by the rela-

tionship -i, + 0 + -_ = i.
-tionship 1,0 0,1 -1,1

The second condition discussed in each of the following three sec-

tions is to let $2=0 in the enumeration of the variances and covariance.

In this situation the optimal designs are proved to be symmetric, and

hence the design problem is simplified to a choice from the class of

symmetric designs. As was pointed out in Section 2.3, previous papers

on the design of paired comparison experiments have always relied upon

setting all parameters equal to zero. Presently, the linear parameter

is allowed to vary. The three criteria are a continuous function of

the parameters because the variance-covariance matrix is continuous in

the parameters. Therefore, the designs found for the second condition









are locally optimal in the sense that they are close to optimal for

values of 82 which are near zero.

The third condition in each of the following three sections is a

consideration of designs whose comparisons are of the form (-l,x ),

(x2,1), and (-1,1). The optimal choice of the levels xl and x2 are

found in addition to the three comparison proportions. This is a gen-

eralization of the first condition since setting x =x2=0 results in the

first condition.

Figure 4.1 presents a synopsis of the procedures used in Sections

4.2-4.4 to find the optimal designs for each of the three optimality

criteria and the three conditions introduced in the present section.

The first column presents a brief description of the three conditions,

and hence the three rows in the figure correspond to the three condi-

tions. Columns 2-4 contain a brief description of the methodology used

for the three criteria.


4.2 D-optimal Designs


The D-optimal design is defined to be the design which minimizes

the determinant of the asymptotic variance-covariance matrix. Letting

Z denote the variance-covariance matrix, this is equivalent to maximiz-

ing IZ-l. In the case of a quadratic model, the D-optimal design is

the design which maximizes D, where


D = = 11 22 -122 (4.2.1)



and where X 11 122' and X12 are given in (3.3.2)-(3.3.4). Because the

optimal design depends on the true parameter values, designs are found













0

> C0

>- 0
-.4l o j




-,-t
o -o > o
X4 X .O O
I *- m X C) .


















O*H
a )01 '0 a .
4J -,- UT 4 *

04 r 50 -
I ifU 0 0
-C 0-4


'.0 0.
0 Cl C


I 0

"' -ri 11

































CC 0
4- 4 C) 4- CC

-1U m > U


,, >.- X -H 4
Eo 0 i


- u 'o
; U *d 1U X -H -


. .-i ro c n i
S C) U C) CB N U CC





















SH 0 1II 0 > 0 C
SO 4
S0 00 0




Uo ,) a) > o 1- z
Q f q o 0 4J

Ci 0 CC 0
0 0 004
I -H *4 CC
rC U 0 U


I NC O >S 00
i '-Q *H 0 *'C C 'i U


I E X U -. U





14 O R ir U O CC 0 0 ]
0 CC 0 CC N U V 4 CO













I 0 CC -- C: 0- C
i ti *- *H *-l CO ( C0
4 4 34 40 U










I 0 O 10 + 4J


0 H rN N- n Q, I
R 4 ul g g






S 0 0 rc4 r3 H
0 > 0 U a-i 4J C 0




E I U c
0 0 ---- CC e CC H 0




*0 0 C U 0 0 U
--4 0 -4 C 0 -





o H a U .C N 0 X N

4J C C U o HC 0Q 0 U M
0 :a 04 Io C ll -H C o x 'N
0 CC EC 4C rI I -H f X n
'N__ r ,-______< '4m~ C C ~ H ~









for various choices of the parameters. The three conditions intro-

duced in Section 4.1 are now discussed, beginning with Condition 1.


Condition 1

The levels are restricted to be x=-1,0,1. Because of this restric-

tion, and by substituting i.. for n.. in (3.3.2)-(3.3.4), the expression

for All' A12' and 22 can be simplified to



11 = -1,0 -1,0 + 0,1O Ol + 4-1,1 -1,1


12 0,1 0,1 1,0 -1,0 (4.2.2)


22 = P-1,0 -1.0 + 0,10,1


The substitution of Pij for n.. is made throughout the present chapter.

This essentially sets N=l since U.. = lim(n../N).
13 N4- ij

From (4.2.1) and (4.2.2), the expression for D is


D = 4- 1,0 0,1-1,0 0,1 +


4(1-u1,- 0 0,l1 -1,0_-1,0 + 0,1 0,1-'1,1 (4.2.3)



where .. is defined in (3.3.5). The partial derivatives of D with

respect to i -,0 and 10, are, respectively,
-1,0 0,1


3 4 0 40 ( 0 + 0 ) +
U, 0,1 -1,00, -1,1 -1,0-1,0 + 0, 0,1 +
-1,0
(4.2.4)
4(1 U_1,0 ,- ,O -1,1 -1,0 '








aD
-- =, 4U-1,00- ',0o0,1 4 -1,1( -1',0_-1,0 + '0,O10'1) +

4(1 p-1,0 0, -1,1 '


Setting the derivatives in (4.2.4) equal to zero results in


-= 1,0 -1,1 '


(4.2.5)


-iO, (-1-,0 0,1 + -,0 -,1 0,1 -1,1

0+ O,1 20,l ,


In matrix notation, (4.2.5) is equivalent to


-1I,0 -1,0 -1,1
0,1 0,i -i,i


= 0,-1,1


(4.2.6)


implying


ul 0,l -1,1


(4.2.7)


where


S2-1,'0 -1,1
A =

-1,0 0,1 -1,0 -1,1 0, -1,1


-tl,0 O,l% 0,1 -1,1 -1,

2%,itid)


For the solution of (4.2.7) to be a maximum for the expression in

(4.2.3), it needs to be shown that the matrix of second order partial


(4.2.4)


1-1 ,0(2.-_1,0 _I,' +









derivatives is negative definite (see Kaplan, page 128). By (4.2.4),

this matrix is


-1,02 D/-1,O 0,1
= = -4 A (4.2.8)
2 D/-1,0 Ol 2D/O, 12


where, from (3.3.5),


1 1 2
Se= 1/( e + e

-i + B + 82 2
1 2 1 2 2
= e /( e + 1 (4.2.9)

6i -u S -i-S
S= e + 2/(1 + e1 2 )2
0,



It is known from elementary matrix algebra that the determinant of

a square matrix is equal to the product of its eigenvalues. So in this

case, IXI=ele2, where el and e2 are the eigenvalues of X. If jX[>0,

then either both eigenvalues are positive or both are negative, or

equivalently, X is either positive definite or negative definite. But

it is clear from (4.2.8) that



( 1 0 ) X = -8_-1,1 -10 < 0, (4.2.10)



where -1,1 and (,0 are both positive and are given in (4.2.9). The
-1,1 -1,0
inequality in (4.2.10) implies that X is negative definite if |xJ>0.

However, it will be shown by (4.2.12) that the determinant is not nec-

essarily positive for all values of 81 and $2.








To simplify the notation, we write


/ -2a

-(a + c b)


-(a + c b)

-2c


where


a = = e / (e + e )(e + 1)
26 -1 + 2 a3 + )] 2

b = 0-1,010,1 = e22/ (e 1 + 1)(1 + e 1 2 2

81 +82 -81 1 + 2] 2
c = i,0, = e / (e + e )(1 + e )


Then the determinant of X can be written as

IXI = 4ac (a + c b)2

2ac + 2ab2 2 c
= 2ac + 2ab + 2bc a -b -c


(4.2.11)


After resubstituting for a, b, and c, and after some algebra in

(4.2.11), an expression for IXI is found by


482 -B + 382 8 + 8 -8 + 8
e + 4e + 1 + 4e + 3e

1 3e + 382 -381 + 2 -21 + 282 381 + 382
S= 2 +3e + e + 3e + e

281 + 28 28 -381 + 382 38 + 82
+ 3e +3e + e + e


- -21 -8 281 + 482 -48 + 282
e + 2e + e + e

481 + 28 -281 + 482 281 81 + 282
+ e + e + e +2e


(4.2.12)



I,


I









where

282 [(-B, B1 -8 + 2 B1 + 82, ) 2
S= 4e / (e + e ) (e + 1)(1 + e 1 (4.2.13)


Since >>0 for all $1 and 82, IXI>0 if and only if the right-hand side

of (4.2.12) is positive. However, it appears that for large enough

values of B1 and B2, some of the terms with a negative sign will over-

whelm the sum of all of the terms with a positive sign, and hence the

determinant could be negative. For example, 1 and B2 can be made

large enough such that the term exp(481 + 282) in (4.2.12) is larger

than the sum of all the terms with positive signs, thereby making

the expression in (4.2.12) negative. It will be seen shortly that this

is in fact the case.

An APL computer program was utilized to solve equation (4.2.7) for

various values of B1 and B2. The determinant, Ix was also calculated

and found to be positive for all values of 81 and 82 when the solution

fell inside the region of allowable values for the comparison propor-

tions, p ,0' ,1 and -i 1,1 That is, J.. Z 0, for all i and j, and

-1,0 0,1 + -1 ,1 = 1. Figure 4.2 depicts the values that -1I,0 and
-Iu Ui -+,v -1,0
j0,1 can have. They must lie inside or on the solid triangle. The
0,1
dotted triangle in the figure depicts the region that the D-optimal

designs are exclusively located.

When the solution to (4.2.7) falls outside of the solid triangle

in Figure 4.2, the optimal design must lie somewhere on the solid tri-

angle. This can be proven by contradiction since D is a quadratic func-

tion of two unknowns, and hence it can only have one stationary point.

For a point to be on the triangle, one of the following must be true:

1-1,0 = 0, o0,1 or -1,0 0,1= 1.











10,1


1









.5 ---------









0
0.5 1




Figure 4.2. Depiction of possible designs for Condition 1



If i,0 = 0, then from (4.2.3),


D = 40,1l _,10, 1 1 10,12



and 3D/O,1 = 0 implies lOi = 0.5. This clearly maximizes D

because of the negative coefficient on the quadratic term. Therefore

if the optimal design lies on the line segment -i1,0 = 0, then it must

be {VI0, = 1- = 0.5}. If v0,i = 0, then

2
D = 4_1,10-1,0 -, -l,0



and it can similarly be shown that the optimal design is the midpoint









of the line segment, namely {0, = 0, -I,0 = -1, = 0.5}. Finally,
0 / i ~-1,0 -1= /

if 1-1,0 +0, = 1, then


0 = 1 0 -
D = 4~-1,010, -1_,0 0,l



= 4- 1,0 0,11-1,0(1 -1,0'


Again, it can similarly be shown that D is maximized when _,

p0, = 0.5. Therefore, if the D-optimal design is on the solid tri-

angle, it must be one of these three designs.

When the solution to (4.2.7) was outside of the solid triangle in

Figure 4.2, D was calculated for the three designs presented in the

preceding paragraph, and the one with the largest value of D was taken

as D-optimal. Otherwise the D-optimal design was found directly from

(4.2.7). The determinant of X was also calculated. It was occasion-

ally negative as conjectured earlier, but only when the solution was

outside of the region, at which time the sign of IXJ is irrelevant.

The "restricted" D-optimal designs are presented in Table 4.1 for

all combinations of the parameter values B =0(.1)1,2,3 and 2=0(.1)1,2,3.

Recall that -1,0' 0,1' and p-1,1 represent the proportion of the

total available comparisons to be allocated to the pairs (-1,0), (0,1),

and (-1,1), respectively. The number of times one of these three pairs

should be compared is then found by multiplying the corresponding Pij

by N. The values of D are also given for the purpose of comparing the

designs with the designs found in the next two subsections. Addition-

ally, the relative efficiencies are given for the "restricted" D-opti-

mal designs presented in Table 4.1 relative to the D-optimal designs









found for Condition 3 in Table 4.3. The relative efficiency is the

square root of the ratio of the corresponding values of D, since D is a
2
multiple of N

Recall from Section 3.1 that a value of 1 for the parameter 1 is

"large". This is basically true for any parameter since x is in the

interval [-1,1]. For this reason, it is assumed that rarely will S1 or

82 be larger than 1, and hence parameter values of less than or equal to

1 are of most interest. Parameter values of 2 and 3 are included pri-

marily to observe the change in the designs for large parameter values.

Table 4.1 only gives positive parameter values. The following

theorem shows that the table can also be used to find optimal designs

for negative values of 81 or 2. This theorem is similar to Theorem

3.6.


Theorem 4.1. Suppose that the D-optimal design for given 81 and 82

is a comparison of x. with x. a total of n times, for all pairs in
1 ] x.,x.
the design. Then


(i) The D-optimal design for (-81) and (-82) is the same as for

81 and 682

(ii) The D-optimal design for (-81) and 2 is a comparison of

(-x.) with (-x.) a total of n times, for all pairs in
1 ] x.,x.
1 3
the design,

(iii) The D-optimal design for 81 and (-82) is the same as in (ii).


Proof. From (3.4.3),


- 1,- (xix.) = i, 2(x.,x.) ,








where ( 12(x.,x.) is given in (3.3.5). From (3.3.2)-(3.3.4) and the

above result, it is clear that A ll, 22' and A12 are invariant under

changing the sign of both 81 and 2. Therefore, since D = A 1 22- 12

D is invariant under changing the sign of both B1 and B2, and so (i) is

proved.

To prove (ii), note that by (3.4.4),


1,l 82 (xi,xj) = l,0 (- ,-x )



Again, from (3.3.2)-(3.3.4) and the above, changing 81 to (-61) and the

pairs (x.,x.) to (-x.,-x.) will not change XA nor A22 It changes

only the sign of 12' and hence does not change A 12. Therefore (ii)

is proved.

To prove (iii), simply apply the results of (i) and (ii).


So, as in Section 3.3, the optimal design for both parameters nega-

tive is the same as the design for I811 and IB21. When exactly one

parameter is negative, the column headed 1,0 becomes 0,' and vice

versa.

Notice from Table 4.1 that for small parameter values, the optimal

design is inside the solid triangle in Figure 4.2, signifying all three

pairs are to be compared. The optimal design approaches the boundary

as the parameters increase. Also, in all cases the optimal design was

inside or on the dotted triangle in Figure 4.2, signifying that no pair

should be compared more than half of the time.

Notice also from Table 4.1 that most of the relative efficiencies

are approximately 0.95 when 81 and 82 are both less than or equal to 1.









This indicates that the increase in efficiency for the generalization

considered under Condition 3 is slight. If an experimenter warrants

that a minimum number of levels should be used, then the advantage of

the designs found presently under Condition 1 is that they have only

three levels as opposed to four levels for the other two conditions.

The fact that they are also about 95% efficient allows the present

designs to be useful in this situation. However, if there is no signi-

ficant advantage to only having three levels, then perhaps the designs

found under Conditions 2 and 3 are more useful since they are slightly

more efficient.


Condition 2


In this subsection, locally D-optimal designs are found analytical-

ly for 32=0. The designs are local as described in Section 4.1. That

is, by the continuity of the criterion, the optimal designs are close

to optimal for 32 close to zero. It turns out that they are also local

in the sense that the designs found under Condition 3 have slightly

larger values of D for some values of 1. Since the criterion is a

maximization of D, the designs under Condition 3 are therefore occa-

sionally slightly better.

By the same argument that was used in the proof of Theorem 3.1,

the D-optimal design must be a symmetric design when 2=0, which implies

12=0. So the search for a D-optimal design is reduced to a search

for a symmetric design which maximizes D = 1 22. Also, the proof of

Theorem 3.2 can be presently implemented to show that every pair in the

D-optimal design must include either x=-l or x=l.









From (3.3.7), .ij is written as


4i. = ( 4cosh2(1 (x. x.)/2) )
13 i J


where


1 u -u
cosh(u) = (e + e-)
2


1 u -u
sinh(u) = (e e) .
2

The maximization of D is equivalent to the maximization of the follow-

ing redefined D:


D = 4Xi 22 = f(x)g(x)

N/2 2



N/2 (1 x.)
f(x) =
i= cosh2(8(xi 1)/2)



N/2 2 + .)2
g (x ) = .
cosh (8 (x. 1)/2)
i=l 1


x' = (1,X2.x, ... XN/2)


Two useful, easily verified facts are


3cosh(6 (x. 1)/2) 1
x 2- B sinh(8l(x -
x. 2 1
.1


(4.2.14)






(4.2.15)





(4.2.16)


1)/2) ,


(4.2.17)


Dsinh(6(xi 1)/2) 1
= 1 -- cosh(3S x 1)/2)
ix. 2


where









The partial derivative of D with respect to x. is then
1

DD f(x) 9g(x)
x g(x) + f(x) (4.2.18)
dx. ax. x.
1 1 1


where


af(x) -2(1 x.) (1 x)2 sinh(8 (xi 1)/2)
-1 1 1 11
axi cosh2 (6(x 1)/2) cosh 3 (x 1)/2)
(4.2.19)


Dg(x) 4x.3 4x. (1 x.2) 2 sinh(3C(x. 1)/2)
1 1 1 1 1
xi cosh2(8 (x. 1)/2) cosh3(1 (x. 1)/2)

(4.2.20)
i=l,...,N/2.


Because of the complexity of handling these N/2 equations of N/2

unknowns, and because it appeared to be reasonable, the expression in

(4.2.18) was evaluated at the N/2-dimensional point x.=x, i=1,...,N/2,

set equal to zero and solved. This produced a design of two equally

weighted comparisons, namely (-l,-x) and (x,l), for some x. The mechan-

ics of the procedure follows.

The partial derivative of D with respect to x., evaluated at x.=x,
1 1
for all i, is


S -2(l -__x) l (1 x)2sinh (1 2)2

9D L cosh cosh3 cosh


-+ 2(l x2 (1 x) )
2 2 2




cosh cosh cosh2

(4.2.21)

where cosh and sinh are shorthand for cosh(B1(x 1)/2) and








sinh(1B(x 1)/2), respectively. By setting the expression in (4.2.21)

equal to zero, multiplying it by cosh5/(N(l x)3(1 + x)), and combin-

ing terms, we have


(1 + 3x)cosh + B1(1 x2)sinh = 0 (4.2.22)


Because of the transcendental nature of this equation, x can not be

expressed as a closed form function of 1. Consequently, an APL pro-

gram was written to find the solution to (4.2.22) for various choices

of 1. These designs are presented in Table 4.2 and are discussed at

the end of the present subsection.

In order for the solution to (4.2.22) to be a local maximum for D,

it must be shown that


cosa


(cosal, ... ,cos /2) axx. 0

Sx.=x c /
cosaN/2


for any angles ai, such that 0 < a. 2r, and


N/2
cos a = 1
i=l


(see Kaplan, page 128). This is equivalent to showing that the matrix

X is negative definite, where


( 2D
X = (4.2.23)
3x N/2xN/2
i = N/2 x N/2








From (4.2.18), the second order partial derivatives are


2 D 2f(x) 2f(x) Dg(x) 2 g(x)
= g(x) + 2 + f(x)- i=,...,N/2,
2 2 1 (4.2.24)x.
S1 1 (4.2.24)


a2D
2x. x.
1 3


F[2f(x) f(x) 2g(x)] f(x) 2g(x) 2gx)
= x.x. xg( + xx. Sx. + x.) x-
131- 3 iL j 1 3


;f(x) 3g(x) 9f(x) 3g(x)
= 2x x + -. x. x. ij, i,j=l,... N/2,
1 3 3 1


(4.2.25)


where, from (4.2.19) and (4.2.20),


2 2812
2 (-x )2
22 i 1 +
cosh


4(l-x )B sinh
cosh3
cosh


2 1 2 2 2
4(3x. -1) -(1-x )
1 2 c
cosh2


-(l-x ) 21 sinh2

cosh4
(4.2.26)


3
8(x. -xi)S sin]

cosh3


3 1-x 2 2 inh 2
-(1-x ) 8I sinh
cosh 4
cosh


I
i=l,...,N/2.


(4.2.27)


Evaluating the second order partial derivatives at the point x.=x,
for al i, (4.2.23) can be written as
for all i, (4.2.23) can be written as


a b
b a


x =


(4.2.28)


a b
b a


22f (x)
2





92g(x)
ax.2
1





2
2 gx)

ax.2
1









where


= 2D

Dx.
x .=x




= g (x) + 2 f (x) (2 (4.2.29)
2 xx -x.2
3x 3xi
xa.=
Xi=X

and


2
Do
b = Dxx (any ifj)

x.=x
1


Df(x) Dg(x)
= 2 Dx. x. (4.2.30)
1 1i



The complete expressions for a and b can be found from (4.2.29) and

(4.2.30), in addition to (4.2.15), (4.2.16), (4.2.19), and (4.2.20),

by substituting x for x..

The matrix X is of a special form as indicated by (4.2.28). It

can be written as


X = D + yz' ,


where D=(a b) I N/2, =b, and y'=z'=l'=(1, ... ,1). From Graybill

(page 187, Theorem 8.5.3), the characteristic equation is


N/2 N
-1
S d 2
(d + c7 y.z. ) (d ) = 0
i=l









N

N
d + a Zy.z. = (a b) + b. As discussed in the previous subsection,
1 1
1
the matrix X is negative definite if and only if all of the eigenvalues

are negative. An analytic proof that the two eigenvalues are negative

follows.

Consider first the eigenvalue (a-b), where from (4.2.29) and

(4.2.30),


C2x) 2 g(x)
a-b = g(x) + f(x)
3x.i x.

1


cosh2(2 (l1-x)2 ) + sinh-cosh-(4(l-x)1


+ sinh2 (- 2(1-x) 2 2J -22

cosh4 cosh2


2 2 1 2 2
cosh (4(3x -1) -(1-x )B
2 1
(1-x) 2 + sinh-cosh-(8x(l-x 2) ) + sinh ( -(1-x2) 2)

coshcosh 4

(4.2.31)


I.t needs to be shown that (a-b) is negative at the stationary point x,

which is the solution to (4.2.22). This relationship between x and B1

found in (4.2.22) is used in the above expression of (a-b) by substi-

tuting for sinh,


-(1 + 3x)cosh
sinh = (4.2.32)
1l(1 x2)








Upon substituting (4.2.32) into (4.2.31), the eigenvalue (a-b) at the

stationary point, x, is



(2 --x) 2 -4(1+3x) x)2] 1-x) 2 +
2 1 (2(1+x) 2 cosh4
a-b =

+ 2 1 2 2 3 21 4
[ 4(3x -1) 1-x )1) 8x(l+3x) + I1(l+3x) .2sh 3 3

(4.2.33)


After factoring out -(l-x) /cosh and combining terms,


a-b < 0

if and only if (4.2.34)


2 (x 1)(x2 2) ) + ( -22x2 20x 6 ) < 0
1 2

The first term in (4.2.34) has roots x=l and x=+v2, and is nonpositive

for XEL-l,l]. The roots of the second term are imaginary, and clearly

the function is negative at x=0. Hence it is negative for all x. The

result a-b < 0 then follows from (4.2.34).
N
The second eigenvalue is (a-b) + b. From (4.2.19), (4.2.20),

and (4.2.30),



b 2(1-x) (-2cosh (l-x)B sinh) (-4x(l+x)cosh (l-x)(l+x)2 sinh)
cosh6 1 1



2(1-x) 8x(l+x)cosh2 + 2(1-x)(l+x)2 1cosh-sinh
cosh6 + 4x(l-x2S cosh-sinh + (1-x2) 2 2sinh2


2(1-x) (-x2 + 2x 1) (4.2.35)
cosh









the last step produced by again using the relationship in (4.2.32) and

combining terms. Then


N (1 x)
Sb =(2x 2) (4.2.36)
2 4
cosh


By factoring N(1-x) /cosh4 out of the second eigenvalue, it is clear
2
from (4.2.34) and (4.2.36) that


(a b) + b < 0
2
if and only if


2 2(- -(x2-1)(x -2)) + (-22x2-20x-6) + (2x-2) < 0
1 2

if and only if


12(- -()x2-) (x -2)) + (-22x2-18x-8) < 0 (4.2.37)


The first term in (4.2.37) is unchanged from (4.2.34). The roots of

the second term are still imaginary, and hence the eigenvalue is nega-

tive. Therefore X is negative definite, and the solution to (4.2.22)

is in fact a local maximum for D.

The locally D-optimal designs for 81=0(.02)1,2(1)5 are presented

in Table 4.2. The value of the original D=llX 22 is also given for

comparison purposes. By comparing the present values of D to the cor-

responding values of D associated with the designs in Table 4.1, it can

be seen that the present designs offer a slight improvement for 2=0.

Theorem 3.3 is applicable here since B2=0, and so the optimal design

for any particular value of 81 is the same as the optimal design for

(- ). Recall that for a given 31, the optimal design is to make half









of the total comparisons between x and 1, where x is found in the table,

and the other half between (-x) and (-1). The optimal value of x is

graphed as a function of 1811 for 1811 1 in Figure 4.3.


Condition 3


A grid method was used to find D-optimal designs with no restric-

tions on the values of the two parameters. With 2S 0, the argument for

a symmetric design is no longer applicable. As a generalization of the

first subsection, the three comparisons (-l,x ), (x 2,1), and (-1,1)

were allowed. A description of the grid method follows.

In general, if there are k independent unknowns, and the grid is

to have n values for each variable at each stage, then the grid method

is simply an evaluation of the function at each of the n points, and

the determination of which of these points maximizes (or minimizes) the

function. It then repeats the procedure, only with a finer grid cen-

tered at the best point previously found. This procedure is repeated

until the optimal point has been found with sufficient accuracy.

The procedure used for the present condition was essentially a

grid within a grid. At each step, the combination of three values each

of x1 and x2 were taken. Initially these values were -0.5, 0, and 0.5.

For each of these 9 points, a grid method was used on the allocation of

the comparisons to the three pairs. This grid consisted of initial

increments of 0.1 in the allocation fractions. Notice that this is not

a full 113 grid since, for example, { -, = ,l = U-1 = 0.5} is
1X X2 L ,1
not a possible allocation for the three pairs because they must sum to

1. After the best of these possible allocations were found, a finer 52








grid was performed about the current best allocation. Note that it is

a 52 grid rather than a 53 grid, because when two of the allocation

proportions are known, the third is just the sum of the two proportions

subtracted from 1. This was then repeated until the best allocation

fractions were accurate to .0001. The best of the 9 combinations of

x and x2 was taken, and then a finer 32 grid about the current best

combination of xl and x2 was performed using the same procedure des-

cribed in the present paragraph.

The entire procedure described in the preceding paragraph was

repeated until the values of x1 and x2 were also accurate to .0001.

D-optimal designs were found for combinations of 81=0(.1)1,2,3 and

82=0(.1)1,2,3, and are presented in Table 4.3. Once again, the value

of D for N=l is also given in the table. Notice from the table that

when nx,1=0, the value of x2 is immaterial, and therefore an asterisk


appears under the column for x2.

Theorem 4.1 is applicable, so designs for negative 81 or 82 can

also be found from the table. If both S1 and 82 are negative, then the

optimal design is identical to the one for positive B1 and 82. If,

however, exactly one of the parameters is negative, then the optimal

design is the same as the one for 16 I and 1B I, except that nx
1 2 -P1t tx

changes to n and n x2 changes to nx2,. For example, if
l,-x x ,l 2

81=0.6, 82=-0.3, then the D-optimal design would be to run .462N compar-

isons of the pair (-1,.23), .468N of the pair (-.25,1), and .070N of

the pair (-1,1).

Table 4.3 indicates that once again no pair should be compared more

than half the time. Also, the optimal design is inside the region










Table 4.1. D-optimal designs with restriction x=-1,0,l




1 82 -1.,0 p0,l -1,1 D Efficiency

0 0 .333 .333 .333 .083333 .9613
.1 0 .334 .334 .332 .082506 .9613
.2 0 .337 .337 .327 .080098 .9613
.3 0 .341 .341 .318 .076314 .9612
.4 0 .348 .348 .305 .071460 .9608
.5 0 .357 .357 .287 .065895 .9599
.6 0 .369 .369 .263 .059986 .9583
.7 0 .384 .384 .232 .054068 .9555
.8 0 .405 .405 .190 .048426 .9536
.9 0 .433 .433 .135 .043283 .9542
1 0 .470 .470 .060 .038814 .9592
2 0 .500 .500 0 .011024 .9664
3 0 .500 .500 0 .002041 .7260

0 .1 .333 .333 .334 .083056 .9615
.1 .1 .336 .332 .332 .082233 .9617
.2 .1 .340 .333 .327 .079836 .9615
.3 .1 .345 .336 .319 .076070 .9613
.4 .1 .353 .342 .306 .071237 .9610
.5 .1 .362 .350 .288 .065694 .9602
.6 .1 .374 .362 .264 .059805 .9585
.7 .1 .389 .378 .233 .053905 .9558
.8 .1 .409 .399 .192 .048276 .9538
.9 .1 .435 .428 .137 .043141 .9543
1 .1 .470 .467 .063 .038676 .9591
2 .1 .500 .500 0 .011001 .9660
3 .1 .500 .500 0 .002039 .7257

0 .2 .332 .332 .336 .082233 .9619
.1 .2 .336 .330 .334 .081422 .9619
.2 .2 .342 .329 .329 .079060 .9619
.3 .2 .349 .331 .321 .075346 .9618
.4 .2 .357 .335 .308 .070575 .9615
.5 .2 .367 .343 .291 .065097 .9608
.6 .2 .378 .354 .268 .059269 .9594
.7 .2 .393 .370 .237 .053421 .9567
.8 .2 .411 .392 .197 .047833 .9544
.9 .2 .436 .421 .143 .042725 .9545
1 .2 .468 .461 .070 .038270 .9587
2 .2 .500 .500 0 .010931 .9647
3 .2 .500 .500 0 .002034 .7245

0 .3 .331 .331 .338 .080888 .9625
.1 .3 .337 .327 .337 .080098 .9625
.2 .3 .344 .324 .332 .077793 .9626




University of Florida Home Page
© 2004 - 2010 University of Florida George A. Smathers Libraries.
All rights reserved.

Acceptable Use, Copyright, and Disclaimer Statement
Last updated October 10, 2010 - - mvs