Full Citation 
Material Information 

Title: 
Parameter-free designs and confidence regions for nonlinear models

Physical Description: 
viii, 105 leaves : ill. ; 28 cm. 

Language: 
English 

Creator: 
Bumrungsup, Chinnaphong, 1951-

Copyright Date: 
1984 
Subjects 

Subject: 
Parameter estimation (lcsh); Confidence intervals (lcsh); Statistics (lcsh); Statistics thesis Ph. D; Dissertations, Academic -- Statistics -- UF

Genre: 
bibliography (marcgt); nonfiction (marcgt)
Notes 

Statement of Responsibility: 
by Chinnaphong Bumrungsup. 

Thesis: 
Thesis (Ph. D.)--University of Florida, 1984.

Bibliography: 
Bibliography: leaves 101-104.

General Note: 
Typescript. 

General Note: 
Vita. 
Record Information 

Bibliographic ID: 
UF00098622 

Volume ID: 
VID00001 

Source Institution: 
University of Florida 

Holding Location: 
University of Florida 

Rights Management: 
All rights reserved by the source institution and holding location. 

Resource Identifier: 
alephbibnum - 000468635; oclc - 11628537; notis - ACN3321


Full Text 
PARAMETER-FREE DESIGNS AND
CONFIDENCE REGIONS FOR NONLINEAR MODELS
By
CHINNAPHONG BUMRUNGSUP
A DISSERTATION PRESENTED TO THE
GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
Dedicated to
my late father
Mr. Siri Bumrungsup
and
my late mother
Mrs. Varanee Bumrungsup
ACKNOWLEDGMENTS
I would like to express my sincerest appreciation to the chairman
of my supervisory committee, Dr. Andre Khuri, for his tireless efforts,
guidance, suggestions, encouragement, friendship, and help throughout my
graduate program and preparation of this dissertation. I am also
grateful to Dr. Arun Varma, a member of the supervisory committee, for
his helpful and valuable comments and suggestions. The contributions of
Dr. John Cornell, Dr. Frank Martin, and Dr. Gerald Mott as members of
the supervisory committee are acknowledged. Thanks are also extended to
Dr. Michael Conlon for programming assistance and to Dr. Kim Sawyer for
reading this dissertation.
I am most appreciative to Dr. Richard Scheaffer for his
understanding and assistance in matters related to my graduate program.
I would like to express sincere thanks to Thammasat University and
the Thai Government for granting the leave of absence required for the
program. In addition, I am grateful to the Rockefeller Foundation and
the Department of Statistics for providing financial support for my
graduate program.
I am thankful to my family for their constant support: my wife
Kulvadee, for her patience, love and understanding, my daughter Aininta
who has been a constant source of joy, my brother Thanin, and my sisters
Manida and Arirak who were always there with love and encouragement.
Special thanks are due to Ms. Carole Boone who spent many long
hours typing this manuscript.
TABLE OF CONTENTS

ACKNOWLEDGMENTS

ABSTRACT

CHAPTER

ONE INTRODUCTION
1.1 Nonlinear Models
1.2 Design of Experiments for Nonlinear Models
1.3 Confidence Regions for Parameters in Nonlinear Models
1.3.1 The Likelihood Ratio Method
1.3.2 Asymptotic Normality of the Least Squares Estimator
1.3.3 Hartley's Method
1.4 The Purpose of This Research

TWO DESIGNS FOR NONLINEAR MODELS
2.1 Introduction
2.2 Approximation and Interpolation of Functions
2.2.1 Lagrange Interpolation
2.2.2 Spline Interpolation
2.2.2.1 Linear Spline
2.2.2.2 Quadratic Spline
2.2.2.3 Cubic Spline
2.3 Optimal Designs
2.3.1 Designs Obtained through Lagrangian Interpolation of Nonlinear Models in the Case of Two Input Variables
2.3.1.1 Lagrangian Interpolation of the Mean Response
2.3.1.2 Designs Based on Lagrangian Interpolation
2.3.2 Designs Obtained through Spline Interpolation of Nonlinear Models
2.3.2.1 Spline Interpolation of the Mean Response
2.3.2.2 Designs Based on Spline Interpolation
2.4 Examples
2.5 Discussion

THREE CONFIDENCE REGIONS FOR THE PARAMETERS IN NONLINEAR MODELS
3.1 Introduction
3.2 Method for Obtaining Confidence Region on Parameters
3.3 Confidence Region on θ
3.4 Examples
3.5 Discussion

FOUR CONCLUSIONS AND RECOMMENDATIONS

APPENDICES
1 PROOF OF INEQUALITY (2.78)
2 PROOF OF INEQUALITY (2.81)
3 BOX AND LUCAS DESIGN

REFERENCES

BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the
Graduate School of the University of Florida in
Partial Fulfillment of the Requirements for the
Degree of Doctor of Philosophy
PARAMETER-FREE DESIGNS AND
CONFIDENCE REGIONS FOR NONLINEAR MODELS
By
Chinnaphong Bumrungsup
August 1984
Chairman: Andre I. Khuri
Major Department: Statistics
The choice of an optimal design for parameter estimation in a
nonlinear model depends on the values of the model's parameters. This
is an undesirable feature since such a design is used to estimate the
same parameter values which are needed for its derivation. To overcome
this difficulty a method is developed for the construction of a
nonlinear design which does not depend on the nonlinear model's
parameters. Such a design is called a parameter-free design. This
method is an extension of A. I. Khuri's modified D-optimality criterion
(Parameter-Free Designs for Nonlinear Models. Technical Report No. 188,
Department of Statistics, University of Florida, Gainesville, Florida,
1982). It is based on using a proper approximation of the true mean
response function described in the nonlinear model with either Lagrange
or spline interpolating polynomials. Each approximation will be used to
derive a parameter-free design. Optimal designs obtained on the basis
of these interpolating polynomials depend on the size of the error
associated with the approximation of the true mean response.
A method for the construction of a confidence region on the vector
of parameters in a nonlinear model is also developed. Unlike most other
methods which are available for that purpose, this method does not
require specifying initial estimates of the parameters. The
aforementioned confidence region can be used to obtain simultaneous
confidence intervals on the nonlinear model's parameters. These
intervals, however, are conservative in the sense that their joint
confidence coefficient cannot be less than a preset value.
CHAPTER ONE
INTRODUCTION
1.1 Nonlinear Models
To understand what we mean by the term "nonlinear model" we shall
first make the following definition: A linear model is one in which
the parameters, the quantities to be estimated, appear linearly. Such a
model can be represented by
y = β0 + β1z1 + β2z2 + ... + βpzp + ε   (1.1)

where β0, β1, ..., βp are unknown parameters, the zi are any functions
of the input or independent variables x1, ..., xk, y is the dependent or
response variable and ε is an unobservable random error.
Any model which is not linear in the parameters will be called a
nonlinear model. Examples of two such models are
y = exp[θ1 + θ2x + ε]   (1.2)

y = [θ1/(θ1 - θ2)][exp(-θ2x) - exp(-θ1x)] + ε   (1.3)

The models (1.2) and (1.3) are both nonlinear in the parameters θ1 and
θ2. But the model (1.2) can be transformed so that the parameters
appear in linear fashion. If we transform the model by taking the
logarithm to the base e of the quantities on both sides of the equality
sign, we obtain

ln y = θ1 + θ2x + ε   (1.4)

which is of the form (1.1) and is linear in the parameters. We say that
model (1.2) is an intrinsically linear model because it can be expressed
in the linear form (1.1) by a suitable transformation.
However, it is impossible to transform the model (1.3) into the
linear form (1.1). Such a model is said to be intrinsically
nonlinear. Throughout this dissertation all models mentioned will be
intrinsically nonlinear. This research is concerned with the design of
experiments for parameter estimation and the construction of confidence
regions for the parameters in nonlinear models.
1.2 Design of Experiments for Nonlinear Models
Consider the nonlinear model
y = f(x; θ) + ε   (1.5)

where y is the measured response, f(x; θ) is the true mean response at
the point x, where x is the vector of k input variables (x1, ..., xk)',
θ is the vector of p parameters (θ1, θ2, ..., θp)', and ε is a random
error.

When there are n observations of the form y1, y2, ..., yn, with yi
being obtained at the design point xi, i = 1, 2, ..., n, then we can
write model (1.5) in the form

yi = f(xi; θ) + εi   (1.6)

where xi = (xi1, xi2, ..., xik)'. The program of n experiments may be
represented by the n×k design matrix D = [xij]. The jth member of the
ith row of this matrix gives the level of the jth variable for the
ith experiment. We assume that the random errors ε1, ε2, ..., εn are
independently distributed as N(0, σ²).
The design criterion for parameter estimation in a nonlinear model
suggested by Chernoff (1953) is to choose the experimental design that
minimizes the trace of [h(x, θ) h'(x, θ)]⁻¹, where h(x, θ)h'(x, θ)/σ² is
Fisher's information matrix of order p×p and h'(x, θ) is an n×p
derivative matrix with ijth entry ∂f(xi; θ)/∂θj. The problem is that
the Fisher information matrix depends on the unknown parameters θ, and
one cannot determine the optimal design setting for estimating θ without
knowing the actual value of θ. Chernoff suggested to consider locally
optimal designs and to minimize the trace of [h(x, θ0) h'(x, θ0)]⁻¹,
where θ0 is a value of θ in the neighborhood of which f(xi; θ) is
assumed to be approximately linear in θ for i = 1, 2, ..., n. Hamilton
(1982) pointed out that this criterion is not invariant under changes of
scale of the parameters. Box and Lucas (1959) suggested choosing the
experimental design to maximize the determinant |h(x, θ) h'(x, θ)|
with respect to the experimental variable settings where n, the number
of experiments, is equal to p, the number of parameters. This parameter
estimation criterion was called D-optimality by Kiefer and Wolfowitz
(1959). Unlike the minimum trace criterion, the D-optimal design is
invariant to scale changes of the parameters (see Cochran (1973) and
Hamilton (1982)).
Atkinson and Hunter (1968) extended the Box and Lucas results to
cases where n is greater than p. In some cases when n is a multiple of
p, the optimal design simply results in n/p replications of the points
which would be given by the n=p Box and Lucas approach. However,
Atkinson and Hunter gave a counter example to show that this is not
always true. M. J. Box (1968, 1970, 1971) gave a number of additional
results, particularly on the subject of replicated design points.
Several applications and theoretical extensions to D-optimal
designs for nonlinear models have been proposed. Reviews of these are
given by Cochran (1973) and St. John and Draper (1975).
In some cases it may be true that initially θ1, say, will be known
more precisely than θ2 at the outset of experimentation. For example,
suppose we know the standard deviations of their prior distributions are
one and ten, respectively. Draper and Hunter (1967a and 1967b)
developed a procedure for the design of experiments for parameter
estimation when such prior information is available.
For the nonlinear model, it is necessary to have initial estimates
of the parameters in order to apply the previously mentioned design
criterion. If these initial estimates are poor, the efficiency of the
design may be greatly reduced since the design criterion is a function
of the parameter values that are employed. A more efficient approach
would be to treat θ0 as an initial estimate in a sequential procedure
which starts by selecting a p-point optimal design on the basis of this
estimate. A series of p experiments are then performed at the optimal
design settings and an estimate of θ is obtained. Thereafter, the
experiments can be designed one at a time, using the current best
estimates of the parameters, which can be recalculated after each
experiment. This sequential procedure has been discussed by Box and
Hunter (1965) and they showed how to add points one at a time.
M. J. Box (1971) added sequentially sets of p points each. At each
stage a new estimate of θ is obtained and used to generate the next set
of design points. This sequential procedure continues until the values
of the parameter estimates stabilize along with the values of the
coordinates of the design points.
The sequential procedure described above can help improve the
efficiency of the design but it does not eliminate the dependency of the
design on the parameter's values. Furthermore, Atkinson and Hunter
(1968) pointed out that there are situations in which it is not feasible
to proceed sequentially.
Zacks (1977) considered the design problem in a Bayesian framework
which is constructed so that the expected value of the determinant of
Fisher's information matrix with respect to some prior distribution of θ
is maximized. Bayes sequential designs can also be obtained. In this
case an optimal design for a new stage of experimentation is determined
by maximizing the expected value of Fisher's information determinant
with respect to the posterior distribution of θ given the results from
the previous stages. The Bayesian approach (sequential and
nonsequential) eliminates the problem of dependency of the optimal
design on θ, but it introduces dependency on the prior distribution.
Hamilton (1982) introduced a design criterion based on the second
order approximation to the volume of the confidence region of the
parameters in nonlinear models. This criterion is more efficient than
the D-optimality criterion since the latter is based on the first order
approximation of the abovementioned volume. However, Hamilton's
criterion requires good initial estimates of the parameters as well as
of the error variance.
Khuri (1982) considered two types of designs for parameter
estimation which do not depend on the unknown parameters of the
nonlinear model. The first type is called a minimax D-optimal design.
This design is conservative and may be inefficient as pointed out by
Khuri. The second type of design is more efficient than the minimax
D-optimal one. To obtain this second type, an adequate approximation to
the true mean response with a Lagrange interpolating polynomial was
constructed. This approximation was used to derive a parameter-free
design on the basis of a certain modification of the D-optimality
criterion. Khuri noted that this design depends on the size of the
error associated with the approximation of the true mean response. Only
a nonlinear model with a single input variable was considered.
In this dissertation we shall extend Khuri's second design to the
cases where we approximate the mean response function with a Lagrange
interpolating polynomial (for the case of two input variables) and with
spline interpolating polynomials (for the case of a single input
variable).
1.3 Confidence Regions for Parameters
in Nonlinear Models
Beale (1960) introduced approximate confidence regions for the
nonlinear parameters based on the likelihood ratio criterion when the
sample size is large. Williams (1962) presented a procedure for
obtaining what are described as fiducial intervals and regions for the
nonlinear model's parameters. He considered the nonlinear model
yi = α + β f(xi; γ) + εi

where γ is the only nonlinear parameter. Williams' procedure is
applicable for a model with only one parameter appearing nonlinearly.
Halperin (1963) extended Williams' results to a larger class of
nonlinear models with more than one parameter appearing nonlinearly. He
showed that the fiducial intervals and regions discussed by Williams are
also confidence intervals and regions. Hartley (1964) suggested a
procedure for obtaining an exact confidence region about the parameters
in a general nonlinear model. A method for constructing an
approximation of the true mean response was described. This
approximation was used to obtain exact confidence regions for the
parameters. Sundararaj (1978) considered a method for the construction
of joint confidence regions on a vector of parameters in a nonlinear
model. A set of p random variables was constructed, each of which is a
least squares equation for estimating the parameters, where p is the
number of parameters. These p random variables were used to construct
the joint confidence region on the parameters.
Several methods for determining confidence regions for the
parameters of nonlinear models have been proposed. Reviews of these are
given by Gallant (1976) and Shaw and Griffiths (1979). We shall
describe briefly some of the methods for obtaining the confidence
regions on the parameters in nonlinear models. First we define the
error sum of squares for the nonlinear model given in (1.6) as
S(θ) = Σ_{i=1}^{n} [yi - f(xi; θ)]²   (1.7)

We shall denote the least squares estimate of θ by θ̂.
1.3.1 The Likelihood Ratio Method
This method defines the confidence region for θ by the expression

[S(θ) - S(θ̂)]/S(θ̂) ≤ [p/(n-p)] F(p, n-p, α)

which provides approximate 100(1-α) percent confidence regions for θ,
where F(p, n-p, α) is obtained from a table of the F-distribution
corresponding to a significance level α and numerator and denominator
degrees of freedom p and n-p, respectively (see Draper and Smith (1981,
p. 504) and Ratkowsky (1983, p. 30)).
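As a concrete illustration, the following is a minimal Python sketch of this membership test (NumPy and SciPy assumed; the function name in_lr_region, the model function f, and the data arrays are illustrative placeholders, and the least squares estimate θ̂ is assumed to have been computed beforehand):

```python
import numpy as np
from scipy import stats

def in_lr_region(theta, theta_hat, f, x, y, alpha=0.05):
    """Test whether theta lies in the approximate 100(1-alpha) percent
    likelihood ratio region: [S(theta) - S(theta_hat)]/S(theta_hat)
    <= p/(n-p) * F(p, n-p, alpha), with S the error sum of squares (1.7)."""
    theta, theta_hat = np.asarray(theta), np.asarray(theta_hat)
    n, p = len(y), len(theta)
    S = lambda th: np.sum((y - f(x, th)) ** 2)        # S(theta), eq. (1.7)
    lhs = (S(theta) - S(theta_hat)) / S(theta_hat)
    rhs = p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)  # upper alpha point
    return lhs <= rhs
```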
1.3.2 Asymptotic Normality of the Least Squares Estimator
The 100(1-α) percent confidence region for θ can be defined as (see
Gallant (1975a))

(θ - θ̂)' C⁻¹ (θ - θ̂)/S(θ̂) ≤ [p/(n-p)] F(p, n-p, α)

where C = [h(x, θ̂) h'(x, θ̂)]⁻¹ and h(x, θ) h'(x, θ)/σ² is Fisher's
information matrix as defined in Section 1.2.
1.3.3 Hartley's Method
Hartley (1964) developed a method of constructing the so-called
"essentially linear" approximation to the nonlinear model f(xi; θ) of
the form

f(xi; θ) = Σ_{t=1}^{p} wt(θ) uit,

where the wt(θ) are continuous functions of θ and the n×p matrix U of the
uit has rank p and does not depend on θ. An exact 100(1-α) percent
confidence region for θ is defined as

reg SS/res SS ≤ [p/(n-p)] F(p, n-p, α)

where

reg SS = (U'ε)' (U'U)⁻¹ (U'ε),
res SS = ε'ε - reg SS,

and

ε = (ε1, ..., εn)'.
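A hedged sketch of the two sums of squares in Python (NumPy assumed; hartley_sums_of_squares is an illustrative name): here U is the n×p matrix of the uit and e is the n-vector of errors evaluated at a trial value of θ, so the region consists of those θ for which reg SS/res SS does not exceed [p/(n-p)]F(p, n-p, α).

```python
import numpy as np

def hartley_sums_of_squares(U, e):
    """reg SS = (U'e)'(U'U)^{-1}(U'e); res SS = e'e - reg SS."""
    Ue = U.T @ e
    reg_ss = Ue @ np.linalg.solve(U.T @ U, Ue)  # (U'e)'(U'U)^{-1}(U'e)
    res_ss = e @ e - reg_ss
    return reg_ss, res_ss
```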
1.4 The Purpose of This Research
The choice of an optimal design for parameter estimation in a
nonlinear model depends on the values of the model's parameters. The
statistical literature suggests several procedures for obtaining the
designs for parameter estimation as was described in Section 1.2. In
most cases, it is necessary to have initial estimates of the parameters
in order to apply the design criterion. If the estimates upon which the
design is based are poor, the design may be inefficient.
To overcome this difficulty we present a design criterion for
parameter estimation which does not depend on the nonlinear model's
parameters. Our design criterion for parameter estimation, which is an
extension of Khuri's (1982) procedure, is presented in Chapter Two.
Such a design is called a parameter-free design. The approximations to
the true mean response function given in (1.5) will involve a Lagrange
interpolating polynomial (for the case of two input variables) and
spline interpolating polynomials (for the case of a single input
variable) in order to obtain these parameter-free designs. Chapter
Three contains a method for the construction of confidence regions on
the parameters in nonlinear models. This method does not require
specifying initial estimates of the parameters. Using the confidence
intervals on the mean response functions in conjunction with
Bonferroni's inequality, a conservative confidence region on the
nonlinear model's parameters is constructed. In the final chapter, we
make some concluding comments concerning our methods.
CHAPTER TWO
DESIGNS FOR NONLINEAR MODELS
2.1 Introduction
In this chapter we introduce techniques for constructing the
polynomial approximations of a nonlinear model. Each approximation will
be used to obtain a design for parameter estimation which does not
depend on the parameter vector θ. We shall restrict our consideration
to the cases of one and two input variables.
Consider the nonlinear model
y = f(x; θ) + ε   (2.1)

where x = (x1, x2, ..., xk)' is a vector of k input variables, θ =
(θ1, θ2, ..., θp)' is a vector of p unknown parameters, and ε is a
random error. To estimate the elements of θ, a series of n experiments
(n > p) is performed. In each, a value of the response is observed at a
particular setting of x. The program of n such settings is represented
by the n×k design matrix, D = [xij], i = 1, 2, ..., n; j = 1, 2, ...,
k. We assume that the random errors ε1, ε2, ..., εn are independently
distributed as N(0, σ²).
We now discuss the problem of finding the polynomial approximations
to the given nonlinear models of one and two input variables.
2.2 Approximation and Interpolation of Functions
One of the simplest ways to approximate a function defined on an
interval is to obtain a polynomial which takes the same values as the
function at a certain number of points in the domain of the function.
This procedure is called interpolation. Szidarovszky and Yakowitz
(1978) pointed out that polynomials provide a simple way to approximate
or interpolate functions. By way of illustration, let us state a well
known theorem of Weierstrass.
Theorem 2.1 (The Weierstrass Approximation Theorem)
Given any interval [a,b], any real number δ > 0, and any real-valued
continuous function f(x) on [a,b], there exists a polynomial p(x)
such that

|f(x) - p(x)| < δ for all x ∈ [a,b].
This result may be proved in a number of ways: see, for example, Davis
(1963) and Rivlin (1969).
In this dissertation we shall center our attention on the
approximation of a nonlinear model by considering a Lagrange
interpolating polynomial (for the case of two input variables) and
spline interpolating polynomials (for the case of a single input
variable). Each approximation will be used to obtain a parameter-free
design on the basis of a modification of the D-optimality criterion
which was proposed by Khuri (1982).
2.2.1 Lagrange Interpolation
One-Dimensional Problem
The Lagrange polynomials are among the simplest and most practical
of the interpolating polynomials. Given m+1 distinct real numbers
a ≤ y0 < y1 < ... < ym ≤ b and a function f(x) defined on [a,b], let
g(x) be a polynomial of degree m or less for which

g(yi) = f(yi), i = 0, 1, 2, ..., m.   (2.2)

This polynomial can be written in Lagrange form

g(x) = Σ_{i=0}^{m} f(yi) li(x)   (2.3)

where

li(x) = Π_{j=0, j≠i}^{m} (x - yj)/(yi - yj), i = 0, ..., m,   (2.4)

is the ith Lagrange polynomial of degree m (see, for example, Young and
Gregory (1972, Ch. 6), and deBoor (1978, Ch. 1 and 2)). We note that
li(yj) = 0 if i ≠ j and li(yi) = 1. The points y0, y1, ..., ym are
called interpolating points and the polynomial g(x) is called the
Lagrange interpolating polynomial of degree m. Such a polynomial always
exists and is unique, as stated in the following theorem.
Theorem 2.2
There exists a unique polynomial g(x) of the form (2.3) for which
g(y0) = f(y0), g(y1) = f(y1), ..., g(ym) = f(ym).
The proof of this theorem can be found in Prenter (1975), and
Szidarovszky and Yakowitz (1978).
The error resulting from the use of Lagrange interpolation is, by
definition, the difference between f(x) and g(x), where g(x) is the
Lagrange polynomial of degree m interpolating f(x) at the m+1 distinct
real numbers y0, y1, ..., ym. We assume that f(x) has continuous
derivatives up to order m+1 with respect to x over the interval
a ≤ x ≤ b. The error is then given by

e(x) = f(x) - g(x) = [Π_{i=0}^{m} (x - yi)] f^(m+1)(ξ)/(m+1)!   (2.5)

where a < ξ < b, x ∈ [a,b] and f^(m+1)(x) = ∂^{m+1}f(x)/∂x^{m+1} (see, for
example, Prenter (1975, p. 36) or deBoor (1978, p. 8)). As ξ is not
explicitly known, formula (2.5) cannot be used for the calculation of an
exact value of e(x) but, as we shall now see, it can be used for
obtaining an upper bound for the absolute value of e(x). If we let

w(y0, y1, ..., ym) = max_{a≤x≤b} |Π_{i=0}^{m} (x - yi)|

an upper bound for |e(x)| is given by

max_{a≤x≤b} |e(x)| ≤ w(y0, ..., ym) max_{a≤x≤b} |f^(m+1)(x)|/(m+1)!   (2.6)

Notice that the interpolation points influence the size of the upper
bound on the error through the function w(y0, y1, ..., ym). The
function w(y0, y1, ..., ym) can be made small by a suitable choice of the
points yi. In fact, the values of yi which minimize the function
w(y0, y1, ..., ym) can be obtained by using the following result given
in deBoor (1978, p. 26-31): the function w(y0, y1, ..., ym) in (2.6)
attains a minimum value of 2(b - a)^{m+1}/4^{m+1} if the interpolation points
are chosen as the zeros of the Chebyshev polynomial of the first kind
and of degree m+1 for the interval a ≤ x ≤ b, that is, by the formula

yi = (a + b)/2 + [(b - a)/2] cos[(2i + 1)π/(2(m + 1))], i = 0, ..., m.   (2.7)

From (2.6) and the fact that the minimum value of w(y0, y1, ..., ym) is
2(b - a)^{m+1}/4^{m+1}, we obtain

max_{a≤x≤b} |e(x)| ≤ 2(b - a)^{m+1} max_{a≤x≤b} |f^(m+1)(x)|/[4^{m+1} (m+1)!]   (2.8)
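To make (2.3), (2.4) and (2.7) concrete, here is a minimal NumPy sketch (function names are illustrative, not from the dissertation): it generates the Chebyshev points on [a,b] and evaluates the Lagrange interpolating polynomial at a point.

```python
import numpy as np

def chebyshev_points(a, b, m):
    """Zeros of the Chebyshev polynomial of degree m+1 on [a, b], eq. (2.7)."""
    i = np.arange(m + 1)
    return (a + b) / 2 + (b - a) / 2 * np.cos((2 * i + 1) * np.pi / (2 * (m + 1)))

def lagrange_interp(x, nodes, fvals):
    """Evaluate g(x) = sum_i f(y_i) l_i(x), eqs. (2.3)-(2.4)."""
    g = 0.0
    for i, yi in enumerate(nodes):
        li = np.prod([(x - yj) / (yi - yj)
                      for j, yj in enumerate(nodes) if j != i])
        g += fvals[i] * li
    return g

# e.g. interpolate f(x) = exp(-x) on [0, 2] at 6 Chebyshev points (m = 5)
y = chebyshev_points(0.0, 2.0, 5)
value = lagrange_interp(1.0, y, np.exp(-y))
```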
Two-Dimensional Problem
We now look at the problem of approximating functions of two
variables with Lagrange polynomials in two variables. Suppose that we
are given a rectangular region
R2 = (xl, x2): a1 < x1 < bI, a2 < x2 < b2}.
Let 1: aI yo<...< ym
be partitions of the intervals [a.,bl] and [a2,b2],
YO, yl, ., Ym1 and zO, zI, ..., m2 are ml+l and
numbers. The set of points
z1 < ...
respectively, where
m2+1 distinct real
H = (yi z) : O
partitions R2 into a family of subrectangles
x
Rj : yi1 < xl C i; j1 < x2 < z
(i = 1 .., ml; j = 1, .... m2'
YO = al' Y1 = b z0 = a2' zm2 = b2)
Let f(x1, x2) be a function defined on R²_x and let g(x1, x2) be a
polynomial of degree m1 in x1 and m2 in x2 such that

g(yi, zj) = f(yi, zj), 0 ≤ i ≤ m1; 0 ≤ j ≤ m2.   (2.9)

This polynomial can be written in Lagrange form

g(x1, x2) = Σ_{0≤i≤m1} Σ_{0≤j≤m2} f(yi, zj) lij(x1, x2)   (2.10)

where

lij(x1, x2) = li(x1) lj(x2)   (2.11)

li(x1) = Π_{k=0, k≠i}^{m1} (x1 - yk)/(yi - yk), i = 0, 1, ..., m1,

lj(x2) = Π_{u=0, u≠j}^{m2} (x2 - zu)/(zj - zu), j = 0, 1, ..., m2.   (2.12)

li(x1) and lj(x2) are the ith and jth Lagrange polynomials of degree m1
and m2, respectively (see Prenter (1975, p. 119)). We note here that

lij(yk, zu) = 1 if i=k and j=u, and 0 otherwise.
Theorem 2.3
There exists a unique polynomial of degree m1 or less in x1 and m2
or less in x2 which satisfies (2.9).
The proof of this theorem can be found in Prenter (1975, p. 120).
We now look at the error of interpolation. Let us assume that f(x1, x2)
has continuous derivatives up to order m1+1 and m2+1 with respect to x1
and x2, respectively, over the rectangular region R²_x; then an upper
bound for the absolute value of e(x1, x2) is given by

max |e(x1, x2)| = max |f(x1, x2) - g(x1, x2)|

≤ [wx1(y0, ..., ym1)/(m1+1)!] max |fx1^(m1+1)(x1, x2)|

+ [wx2(z0, ..., zm2)/(m2+1)!] max |fx2^(m2+1)(x1, x2)|

+ [wx1(y0, ..., ym1) wx2(z0, ..., zm2)/((m1+1)!(m2+1)!)] max |fx1^(m1+1)(x1, x2) fx2^(m2+1)(x1, x2)|   (2.13)

where each maximum is taken over a1 ≤ x1 ≤ b1, a2 ≤ x2 ≤ b2, and

fx1^(m1+1)(x1, x2) = ∂^{m1+1} f(x1, x2)/∂x1^{m1+1},

fx2^(m2+1)(x1, x2) = ∂^{m2+1} f(x1, x2)/∂x2^{m2+1},

wx1(y0, ..., ym1) = max_{a1≤x1≤b1} |Π_{i=0}^{m1} (x1 - yi)|,

wx2(z0, ..., zm2) = max_{a2≤x2≤b2} |Π_{j=0}^{m2} (x2 - zj)|

(see Prenter (1975, p. 125)).
Consider now the upper bound of the error in (2.13). Suppose
f(x1, x2) is a function of x1 alone, that is, f(x1, x2) is reduced to a
one-dimensional function. Then fx2^(m2+1)(x1, x2) = 0, and the
upper bound in (2.13) is reduced to

[wx1(y0, ..., ym1)/(m1+1)!] max_{a1≤x1≤b1} |fx1^(m1+1)(x1)|

which is the same as the upper bound in (2.6).
Now if y0, y1, ..., ym1 are chosen as the zeros of the Chebyshev
polynomial of the first kind and of degree m1+1 for the interval
[a1,b1], and z0, z1, ..., zm2 are the zeros of the Chebyshev polynomial
of the first kind and of degree m2+1 for the interval [a2,b2], that is,

yi = (a1 + b1)/2 + [(b1 - a1)/2] cos[(2i + 1)π/(2(m1 + 1))], i = 0, 1, ..., m1,
   (2.14)
zj = (a2 + b2)/2 + [(b2 - a2)/2] cos[(2j + 1)π/(2(m2 + 1))], j = 0, 1, ..., m2,

then the functions wx1(y0, ..., ym1) and wx2(z0, ..., zm2) in (2.13)
attain minimum values of 2(b1 - a1)^{m1+1}/4^{m1+1} and 2(b2 - a2)^{m2+1}/4^{m2+1},
respectively. In this case, the upper bound of the error of
interpolation becomes

max |e(x1, x2)| ≤ [2(b1 - a1)^{m1+1}/((m1+1)! 4^{m1+1})] max |fx1^(m1+1)(x1, x2)|

+ [2(b2 - a2)^{m2+1}/((m2+1)! 4^{m2+1})] max |fx2^(m2+1)(x1, x2)|

+ [4(b1 - a1)^{m1+1} (b2 - a2)^{m2+1}/((m1+1)!(m2+1)! 4^{m1+m2+2})] max |fx1^(m1+1)(x1, x2) fx2^(m2+1)(x1, x2)|
   (2.15)

where each maximum is taken over a1 ≤ x1 ≤ b1, a2 ≤ x2 ≤ b2.
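The tensor-product construction (2.10)-(2.12) with Chebyshev nodes (2.14) can be sketched in a few lines of Python (NumPy assumed; the names and the test function are illustrative):

```python
import numpy as np

def cheb(a, b, m):
    """Chebyshev points of eq. (2.14) on [a, b]."""
    i = np.arange(m + 1)
    return (a + b) / 2 + (b - a) / 2 * np.cos((2 * i + 1) * np.pi / (2 * (m + 1)))

def l_basis(x, nodes, i):
    """i-th Lagrange basis polynomial evaluated at x."""
    return np.prod([(x - nodes[j]) / (nodes[i] - nodes[j])
                    for j in range(len(nodes)) if j != i])

def lagrange2d(x1, x2, ynodes, znodes, f):
    """g(x1, x2) = sum_ij f(y_i, z_j) l_i(x1) l_j(x2), eqs. (2.10)-(2.12)."""
    return sum(f(yi, zj) * l_basis(x1, ynodes, i) * l_basis(x2, znodes, j)
               for i, yi in enumerate(ynodes)
               for j, zj in enumerate(znodes))

y, z = cheb(-1, 1, 3), cheb(-1, 1, 2)           # m1 = 3, m2 = 2
approx = lagrange2d(0.2, -0.4, y, z, lambda u, v: np.exp(u) / (1 + v ** 2))
```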
We have described the Lagrange interpolating polynomial for the
cases of one and two input variables. The construction of this
approximation is fairly simple. However, a drawback to this
approximation is that it is not smooth, since it cannot be
differentiated at the interpolating points. In the next section we
introduce a smoother approximation. This approximation is called a
spline interpolating polynomial.
2.2.2 Spline Interpolation
In the last 30 years or so following the pioneering work of
Schoenberg (1946), piecewise polynomial functions have become an
important theoretical tool for the approximation of functions, since in
many cases the behavior of a function in one region may be totally
different from its behavior in another region and one polynomial may not
be appropriate to approximate that function. The most popular class of
approximating functions is the class of spline functions. Therefore,
instead of trying to approximate a function (of possibly several
variables) over the entire region by one polynomial of a certain degree,
one partitions the region into several subregions, and approximates the
function on each of these subregions by (possibly) a different
polynomial. One of the important ideas in using this process is the
method in which the region is subdivided. In one dimension, the method
consists of defining points (called knots) whose positions generally are
not known beforehand. Ideally, we would like to have these knots as
variables that can be shifted into the optimal positions as the
computation proceeds. However, this turns out to be a difficult
nonlinear problem and so the knot positions are usually fixed in
advance. Throughout this dissertation we assume that the knot positions
are preselected, so that the problem is a linear approximation problem;
furthermore, for simplicity we shall restrict our consideration to the
case of equidistant knots. Further discussion of the positioning of the
knots can be found in Powell (1967), Rice (1969) and Wold (1974).
Estimation of the knot locations is discussed by Gallant and Fuller
(1973).
We shall now discuss the spline interpolating polynomials of degrees
1, 2 and 3 (linear, quadratic and cubic splines). We shall consider
only the case of a single input variable.
We consider the interval [a,b] and divide it into h+1 pieces using
h points or "knots" τ1, τ2, ..., τh such that a = τ0 < τ1 < ... < τh <
τh+1 = b. Generally, a function which is equal to a polynomial of
degree at most m on each subinterval (τi, τi+1), i = 0, 1, ..., h, can be
written as

g(x) = Σ_{i=0}^{m} a0i x^i + Σ_{i=0}^{m} a1i (x - τ1)+^i + ... + Σ_{i=0}^{m} ahi (x - τh)+^i   (2.16)

where

(x - τj)+^i = (x - τj)^i if x ≥ τj, and = 0 if x < τj   (i = 1, 2, ...).

A function g(x) is called a spline of degree m if it is a polynomial of
degree (at most) m on each of the h+1 intervals (τi, τi+1), i = 0, 1,
..., h, and has m-1 continuous derivatives at τi, i = 1, 2, ..., h. By
imposing the restriction that g(x) in (2.16) has m-1 continuous
derivatives at τi, i = 1, 2, ..., h, we obtain a spline of degree m of
the form

g(x) = Σ_{i=0}^{m} ai x^i + Σ_{i=1}^{h} bi (x - τi)+^m   (2.17)

(see, for example, Studden and Van Arman (1969), Studden (1971) and Park
(1978)).
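A minimal sketch of evaluating a spline written in the truncated power form (2.17) (NumPy assumed; the coefficient values in the usage lines are arbitrary placeholders):

```python
import numpy as np

def truncated_power_spline(x, a, b, knots, m):
    """g(x) = sum_i a_i x^i + sum_j b_j (x - tau_j)_+^m, eq. (2.17).
    a: coefficients a_0..a_m; b: coefficients b_1..b_h; knots: tau_1..tau_h."""
    x = np.asarray(x, dtype=float)
    g = sum(a[i] * x ** i for i in range(m + 1))
    for bj, tau in zip(b, knots):
        g = g + bj * np.where(x >= tau, (x - tau) ** m, 0.0)  # (x - tau)_+^m
    return g

# a cubic spline (m = 3) with one interior knot at tau = 0.5 on [0, 1]
vals = truncated_power_spline(np.linspace(0, 1, 5),
                              a=[1.0, -0.3, 0.2, 0.1], b=[0.4],
                              knots=[0.5], m=3)
```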
For illustration, suppose cubic-cubic polynomials are adequate for
a response function f given in (2.1) over the interval a ≤ x ≤ b with a
single knot τ:

g(x) = α10 + α11 x + α12 x² + α13 x³,  a ≤ x ≤ τ
     = α20 + α21 x + α22 x² + α23 x³,  τ < x ≤ b.   (2.18)

We can write g(x) in (2.18) as the single equation model

g(x) = a0 + a1 x + a2 x² + a3 x³ + a4 (x - τ)+^0 + a5 (x - τ)+^1
       + a6 (x - τ)+^2 + a7 (x - τ)+^3   (2.19)

where

a0 = α10, a1 = α11, a2 = α12, a3 = α13,
a4 = α20 - α10 + τ(α21 - α11) + τ²(α22 - α12) + τ³(α23 - α13),
a5 = α21 - α11 + 2τ(α22 - α12) + 3τ²(α23 - α13),
a6 = α22 - α12 + 3τ(α23 - α13),
a7 = α23 - α13.

We assume that g(x) in (2.19) is continuous and has continuous first and
second derivatives at x = τ; thus, from the condition of continuity at
x = τ, we obtain from (2.18) that

α10 + α11τ + α12τ² + α13τ³ = α20 + α21τ + α22τ² + α23τ³,

that is,

α20 - α10 + τ(α21 - α11) + τ²(α22 - α12) + τ³(α23 - α13) = 0,

which implies a4 = 0 in (2.19). From the condition of a continuous first
derivative at x = τ, we have from (2.18) that

α21 - α11 + 2τ(α22 - α12) + 3τ²(α23 - α13) = 0,

hence a5 = 0 in (2.19). Finally, we impose the condition that g(x) in
(2.19) has a continuous second derivative at x = τ, and we obtain

α22 - α12 + 3τ(α23 - α13) = 0,

which implies that a6 = 0. Thus g(x) in (2.19) becomes

g(x) = a0 + a1 x + a2 x² + a3 x³ + a7 (x - τ)+^3.
Throughout this dissertation all the splines of degree m have m-1
continuous derivatives at the knots. A spline function of degree m with
h interior knots is represented, in general, by a (possibly) different
polynomial in each of the h+1 intervals into which the h interior knots
divide the real line. As each polynomial involves m+1 parameters, the
spline function involves a total of (m+1)(h+1) parameters. However, the
continuity conditions impose certain constraints on these parameters.
At each interior knot τi, i = 1, 2, ..., h, the spline function has
continuous derivatives of orders 0, 1, ..., m-1. Thus, m constraints
are imposed at each knot. As there are h interior knots, a total of mh
constraints are imposed. The number of free parameters involved in the
spline function is the total number of parameters minus the number of
constraints, that is, (m+1)(h+1) - mh = m+h+1. We note here that the
number of free parameters is equal to the number of a's and b's
in (2.17).
2.2.2.1 Linear Spline
Let a = τ0 < τ1 < ... < τh < τh+1 = b be partition points for the
interval [a,b] and let f(x) be a function defined at τ0, τ1, ...,
τh+1. We construct a linear spline function as follows. On each
interval [τi, τi+1], we choose the interpolant g to f to consist of
linear pieces,

g(x) = Pi(x) = αi + βi x, τi ≤ x ≤ τi+1, i = 0, 1, ..., h.   (2.20)

The ith linear piece Pi is made to satisfy the interpolatory
constraints

Pi(τi) = f(τi), i = 0, 1, 2, ..., h,
Ph(τh+1) = f(τh+1),   (2.21)
Pi(τi+1) = Pi+1(τi+1), i = 0, 1, 2, ..., h-1

(see Ahlberg, Nilson and Walsh (1967, p. 109)). The resulting
piecewise linear function g(x) agrees with f(x) at τ0, ..., τh+1 and
is continuous at τ1, τ2, ..., τh. By using (2.17) with m=1, g(x) in
(2.20) can be written in the form

g(x) = a0 + a1 x + b1(x - τ1)+ + b2(x - τ2)+ + ... + bh(x - τh)+   (2.22)

where a0, a1, b1, ..., bh can be obtained by using the conditions in
(2.21). We note that the number of parameters in (2.22) is equal to the
number of free parameters, which is h+2.
Consider now the error resulting from the use of a linear spline.
We assume that the function f(x) has continuous derivatives up to order
2 with respect to x over the interval a ≤ x ≤ b; an upper bound for
the error is given by

max_{a≤x≤b} |g(x) - f(x)| ≤ (Δτ)² max_{a≤x≤b} |f^(2)(x)|/8   (2.23)

where

Δτ = τi+1 - τi, i = 0, 1, ..., h,

and

f^(2)(x) = ∂²f(x)/∂x²

(see deBoor (1978, p. 40)).
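For the linear spline the conditions (2.21) can be solved in closed form: a0 and a1 in (2.22) come from the first piece, and each bj is the jump in slope at the interior knot τj. A short NumPy sketch (illustrative names):

```python
import numpy as np

def linear_spline_coeffs(taus, fvals):
    """Coefficients a0, a1, b1..bh of the linear spline (2.22) interpolating
    f at tau_0 < tau_1 < ... < tau_{h+1} (function values in fvals)."""
    slopes = np.diff(fvals) / np.diff(taus)  # slope of each linear piece
    a1 = slopes[0]
    a0 = fvals[0] - a1 * taus[0]
    b = np.diff(slopes)                      # slope jumps at interior knots
    return a0, a1, b

taus = np.linspace(-1.0, 1.0, 6)             # h = 4 interior knots
a0, a1, b = linear_spline_coeffs(taus, np.exp(-taus))
```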
2.2.2.2 Quadratic Spline
Let f(x) be a function defined at τ0, τ1, ..., τh+1 with a = τ0 <
τ1 < ... < τh < τh+1 = b. On each interval [τi, τi+1], we choose the
interpolant g to f to consist of quadratic pieces,

g(x) = Pi(x) = αi + βi x + γi x², τi ≤ x ≤ τi+1, i = 0, 1, ..., h.   (2.24)

The ith quadratic piece Pi is made to satisfy the interpolatory
constraints

Pi(τi+1) = Pi+1(τi+1), i = 0, 1, ..., h-1,
Pi^(1)(τi+1) = Pi+1^(1)(τi+1), i = 0, 1, ..., h-1,
P0(τ0) = f(τ0),   (2.25)
Ph(τh+1) = f(τh+1),
Pi(ξi) = f(ξi), i = 0, 1, ..., h,

where

ξi = (τi + τi+1)/2, i = 0, 1, 2, ..., h, and
Pi^(1)(τi+1) = [∂Pi(x)/∂x]_{x=τi+1}, i = 0, 1, ..., h-1

(see Ahlberg, Nilson and Walsh (1967, p. 109) and deBoor (1978)). The
piecewise quadratic function g(x) interpolates the function f(x) at
τ0, τh+1, ξ0, ξ1, ..., ξh and has continuous derivatives of orders 0
and 1 at τ1, τ2, ..., τh. By using (2.17) with m=2, the function g(x)
in (2.24) can be written as

g(x) = a0 + a1 x + a2 x² + b1(x - τ1)+² + b2(x - τ2)+² + ... + bh(x - τh)+²   (2.26)

where a0, a1, a2, b1, b2, ..., bh can be obtained by using the
conditions in (2.25). The number of free parameters is equal to the
number of parameters in (2.26), which is h+3.
To obtain the error |g(x) - f(x)|, we assume that the function f(x)
has continuous derivatives up to order 3 with respect to x over the
interval a ≤ x ≤ b; an upper bound for the error resulting from the
use of a quadratic spline is given by

max_{a≤x≤b} |g(x) - f(x)| ≤ (Δτ)³ max_{a≤x≤b} |f^(3)(x)|/8   (2.27)

where

Δτ = τi+1 - τi, i = 0, 1, ..., h,

and

f^(3)(x) = ∂³f(x)/∂x³

(see deBoor (1978, p. 81)).
2.2.2.3 Cubic Spline
Let a = τ0 < τ1 < ... < τh < τh+1 = b be partition points for the
interval [a,b] and let f(x) be a function defined at τ0, τ1, ...,
τh+1. We construct a piecewise cubic interpolant g to f as follows. On
each interval [τi, τi+1], let g agree with some cubic polynomial of the
form

g(x) = Pi(x) = αi + βi x + γi x² + δi x³, τi ≤ x ≤ τi+1, i = 0, 1, ..., h.
   (2.28)

The ith cubic piece Pi is made to satisfy the conditions

Pi(τi+1) = Pi+1(τi+1), i = 0, 1, ..., h-1,
Pi^(1)(τi+1) = Pi+1^(1)(τi+1), i = 0, 1, ..., h-1,
Pi^(2)(τi+1) = Pi+1^(2)(τi+1), i = 0, 1, ..., h-1,
Pi(τi) = f(τi), i = 0, 1, ..., h,   (2.29)
Ph(τh+1) = f(τh+1),
P0^(1)(τ0) = f^(1)(τ0),
Ph^(1)(τh+1) = f^(1)(τh+1),

where

Pi^(j)(τi+1) = [∂^j Pi(x)/∂x^j]_{x=τi+1}, j = 1, 2,

and

f^(1)(τ) = [∂f(x)/∂x]_{x=τ}
(see Ahlberg, Nilson and Walsh (1967, p. 109) and Prenter (1975,
p. 78)). The piecewise cubic function g(x) interpolates the function
f(x) at τ0, τ1, ..., τh+1 and has continuous derivatives of orders 0, 1,
and 2 at τ1, τ2, ..., τh. By using (2.17) with m=3, the function g(x)
in (2.28) can be expressed as

g(x) = a0 + a1 x + a2 x² + a3 x³ + b1(x - τ1)+³ + b2(x - τ2)+³
       + ... + bh(x - τh)+³   (2.30)

where a0, a1, a2, a3, b1, b2, ..., bh can be obtained by solving the
conditions in (2.29). The number of free parameters is equal to the
number of parameters in (2.30), which is h+4.
We assume that f(x) has continuous derivatives up to order 4 with
respect to x over the interval a ≤ x ≤ b; an upper bound for the error
resulting from the use of a cubic spline is given by

max_{a≤x≤b} |g(x) - f(x)| ≤ 5(Δτ)⁴ max_{a≤x≤b} |f^(4)(x)|/384   (2.31)

where

Δτ = τi+1 - τi, i = 0, 1, 2, ..., h,

and

f^(4)(x) = ∂⁴f(x)/∂x⁴

(see Hall (1968) and deBoor (1978, p. 68)).
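As an illustration, SciPy's CubicSpline with clamped (first-derivative) end conditions matches the endpoint requirements in (2.29); the following hedged sketch interpolates the arbitrary test function exp(-x) and checks the result against the bound (2.31). (This uses SciPy's representation rather than the truncated power form (2.30); the two describe the same interpolant.)

```python
import numpy as np
from scipy.interpolate import CubicSpline

f = lambda x: np.exp(-x)
fp = lambda x: -np.exp(-x)
taus = np.linspace(-1.0, 1.0, 7)        # equidistant knots, h = 5 interior
g = CubicSpline(taus, f(taus),
                bc_type=((1, fp(taus[0])), (1, fp(taus[-1]))))  # as in (2.29)

x = np.linspace(-1.0, 1.0, 2001)
err = np.max(np.abs(g(x) - f(x)))
# bound (2.31): 5 (delta tau)^4 max|f''''| / 384; here f'''' = exp(-x) = f
bound = 5 * (taus[1] - taus[0]) ** 4 * np.max(np.abs(f(x))) / 384
assert err <= bound
```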
2.3 Optimal Designs
Khuri (1982) obtained an adequate approximation to the mean
response function with a Lagrange interpolation polynomial and used this
approximation to derive a parameter-free design on the basis of a
certain modification of the D-optimality criterion. In this
dissertation we shall extend Khuri's design criterion to the cases where
we approximate the mean response function given in (2.1) with spline
interpolating polynomials (with a single input variable) and a Lagrange
interpolating polynomial (with two input variables).
Let R^k_x denote the experimental region for the model

y = f(x; θ) + ε   (2.32)

We suppose that R^k_x is of the form

R^k_x = {(x1, ..., xk): ai ≤ xi ≤ bi; i = 1, 2, ..., k}.

By a change of variables of the form

ti = (2xi - ai - bi)/(bi - ai), i = 1, 2, ..., k,   (2.33)

the region R^k_x is changed to the region R^k_t = {t: -1 ≤ ti ≤ 1, i = 1,
2, ..., k}. If we substitute t for x in model (2.32) using the linear
transformation (2.33) we obtain the model

y = η(t; θ) + ε.   (2.34)
Suppose the interpolating polynomial g(t; θ) is adequate for
approximating the response function η(t; θ) given in (2.34) over the
region R^k_t and for all θ in some parameter space Θ; that is, g(t; θ) is
approximately equal to η(t; θ) with some small error of approximation
over the region R^k_t and for all θ in Θ. Thus, an approximate
representation of model (2.34) is given by

y = g(t; θ) + ε.   (2.35)
A criterion for the selection of a design for estimating θ in model
(2.35) which does not depend on the parameter vector θ is given by Khuri
(1982). This criterion is restricted to the case of a single input
variable. We shall extend Khuri's criterion to the general case where
we have k input variables, and later in this section we consider the
cases k=2 for the Lagrange interpolating polynomial and k=1 for the
spline interpolating polynomials.
Let t1, t2, ..., tn be n points (n > p) with coordinates
(t11, t12, ..., t1k), (t21, t22, ..., t2k), ..., (tn1, tn2, ..., tnk)
over the region R^k_t = {t: -1 ≤ ti ≤ 1, i = 1, ..., k}. Each of these
points corresponds in a one-to-one manner through (2.33) to a point of
the design D = [xij], i = 1, 2, ..., n; j = 1, 2, ..., k, with xi,
i = 1, 2, ..., n, belonging to the experimental region

R^k_x = {x: aj ≤ xj ≤ bj, j = 1, 2, ..., k}.
In the design criterion by Khuri (1982), some function of the
determinant |M(D, θ)| is made as large as possible with respect to t1,
t2, ..., tn, where M(D, θ)/σ² is the information matrix of order p×p with
respect to the model (2.35) and M has the form

M(D, θ) = Σ_{i=1}^{n} r(ti, θ) r'(ti, θ)   (2.36)

where the p×1 matrix r(ti, θ) is the partial derivative of g(ti; θ) with
respect to θ. In matrix form, as will be seen later, we can write
r(ti, θ) as

r(ti, θ) = U(θ) v(ti)   (2.37)

where U(θ) is a p×q matrix which does not depend on the design points
and v(ti) is a q×1 matrix not involving θ; the value of q will be
specified depending on the form of the interpolating polynomial. From
(2.36) and by using (2.37), we obtain

M(D, θ) = U(θ) [Σ_{i=1}^{n} v(ti) v'(ti)] U'(θ).   (2.38)

If we let

A(t1, t2, ..., tn) = Σ_{i=1}^{n} v(ti) v'(ti)   (2.39)

then from (2.38) and (2.39) we have

|M(D, θ)| = |U(θ) A(t1, t2, ..., tn) U'(θ)|.   (2.40)
Let λ0(t1, t2, ..., tn) and λq-1(t1, t2, ..., tn) denote the smallest
and largest eigenvalues of A(t1, t2, ..., tn), respectively. It will
be shown later that A(t1, t2, ..., tn) is nonsingular, and hence
positive definite, so that the eigenvalues are positive. From (2.40) we
have

λ0^p(t1, ..., tn)|U(θ)U'(θ)| ≤ |M(D, θ)| ≤ λq-1^p(t1, ..., tn)|U(θ)U'(θ)|.
   (2.41)

If the p×q matrix U(θ) is of rank p, then the p×p matrix U(θ)U'(θ) has
a nonzero determinant. Thus, we can write (2.41) in the form

|M(D, θ)| = [γ λ0^p(t1, ..., tn) + (1 - γ) λq-1^p(t1, ..., tn)]|U(θ)U'(θ)|
   (2.42)

where 0 ≤ γ ≤ 1. If γ is regarded as a uniform random variable on [0,1]
with density

v(γ) = 1 if 0 ≤ γ ≤ 1, and 0 otherwise,

then from (2.42), the expected value of |M(D, θ)| with respect to γ is
equal to

Eγ[|M(D, θ)|] = ψ(t1, t2, ..., tn)|U(θ)U'(θ)|   (2.43)

where

ψ(t1, t2, ..., tn) = [λ0^p(t1, t2, ..., tn) + λq-1^p(t1, t2, ..., tn)]/2.
   (2.44)

The parameter-free design is the one which maximizes Eγ[|M(D, θ)|] with
respect to t1, t2, ..., tn. This can be carried out by using the
computer program SEARCH, which was written by Conlon (1979) and is based
on Price's (1977) Controlled Random Search procedure, to maximize
ψ(t1, t2, ..., tn) in (2.44).
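The criterion (2.44) is straightforward to evaluate once v(ti) is available (it will be specified in (2.55) for the Lagrange case and (2.72) for the spline case). The following is a minimal Python sketch (NumPy assumed, single input variable for simplicity; for k=2 each ti would be a pair). The uniform random search is only a crude illustrative stand-in for Conlon's SEARCH program, which implements Price's Controlled Random Search and is not reproduced here.

```python
import numpy as np

def psi(tpoints, v, p):
    """psi = [lambda_min^p + lambda_max^p]/2 for A = sum_i v(t_i) v(t_i)',
    eqs. (2.39) and (2.44)."""
    A = sum(np.outer(v(t), v(t)) for t in tpoints)
    lam = np.linalg.eigvalsh(A)            # eigenvalues in ascending order
    return (lam[0] ** p + lam[-1] ** p) / 2

def random_search(v, p, n, iters=5000, seed=0):
    """Keep the best of many candidate designs drawn uniformly on [-1, 1]^n."""
    rng = np.random.default_rng(seed)
    best_t, best_val = None, -np.inf
    for _ in range(iters):
        t = rng.uniform(-1.0, 1.0, size=n)
        val = psi(t, v, p)
        if val > best_val:
            best_t, best_val = t, val
    return best_t, best_val
```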
2.3.1 Designs Obtained through Lagrangian Interpolation of Nonlinear
Models in the Case of Two Input Variables
In this section we develop a method for constructing an adequate
approximation to the mean response function given in (2.32) with a
Lagrange interpolating polynomial. We shall restrict our consideration
of model (2.32) to the case of two input variables.
2.3.1.1 Lagrangian Interpolation of the Mean Response
Consider the model (2.32) for the case of two input variables (k=2):

y = f(x1, x2; θ) + ε   (2.45)

with the experimental region R²_x = {(x1, x2): a1 ≤ x1 ≤ b1, a2 ≤ x2 ≤ b2}.
If we substitute t1, t2 for x1 and x2 in model (2.45) using the linear
transformation given in (2.33) with k=2, we obtain the model

y = η(t1, t2; θ) + ε   (2.46)

with the region R²_t = {(t1, t2): -1 ≤ t1 ≤ 1, -1 ≤ t2 ≤ 1}. We assume that:
(a) η(t1, t2; θ) has continuous partial derivatives up to order m1 + 1
with respect to t1 and order m2 + 1 with respect to t2 over the
region R²_t for all θ, where m1 and m2 are such that (m1 + 1)(m2 +
1) is a positive integer greater than or equal to p, the number of
parameters in model (2.46), and are large enough so that

max |ηt1^(m1+1)(t1, t2; θ)|/[(m1+1)! 2^{m1}]
+ max |ηt2^(m2+1)(t1, t2; θ)|/[(m2+1)! 2^{m2}]
+ max |ηt1^(m1+1)(t1, t2; θ) ηt2^(m2+1)(t1, t2; θ)|/[(m1+1)!(m2+1)! 2^{m1+m2}] ≤ δ
   (2.47)

for all θ in some parameter space Θ, where each maximum is taken over
-1 ≤ t1 ≤ 1, -1 ≤ t2 ≤ 1,

ηt1^(m1+1)(t1, t2; θ) = ∂^{m1+1} η(t1, t2; θ)/∂t1^{m1+1},

ηt2^(m2+1)(t1, t2; θ) = ∂^{m2+1} η(t1, t2; θ)/∂t2^{m2+1},

and δ is a given positive constant.
(b) For any set of (m1+1)(m2+1) distinct points (y0, z0), (y0, z1),
..., (y0, zm2), (y1, z0), ..., (ym1, zm2), such that -1 ≤ y0 <
y1 < ... < ym1 ≤ 1 and -1 ≤ z0 < z1 < ... < zm2 ≤ 1, where m1 and m2
are the positive integers defined in Assumption (a), the
p×[(m1+1)(m2+1)] matrix

U(θ) = [U(y0, z0, θ), U(y0, z1, θ), ..., U(ym1, zm2, θ)]   (2.48)

has rank p, where U(yi, zj, θ) is the partial derivative of
η(yi, zj; θ) with respect to θ (i = 0, 1, ..., m1; j = 0,
1, ..., m2).
We consider the points with coordinates (y0, z0), (y0, z1),
..., (ym1, zm2) described in Assumption (b), with the yi and zj being
of the Chebyshev type given in (2.14) with a1 = -1, b1 = 1, a2 =
-1 and b2 = 1. Substituting t1, t2 for x1 and x2 in (2.10) using
the linear transformation given in (2.33) with k=2, the Lagrange
interpolation polynomial becomes

g(t1, t2; θ) = Σ_{0≤i≤m1} Σ_{0≤j≤m2} η(yi, zj; θ) lij(t1, t2)   (2.49)

where

lij(t1, t2) = li(t1) lj(t2)
= [Π_{k=0, k≠i}^{m1} (t1 - yk)/(yi - yk)] [Π_{u=0, u≠j}^{m2} (t2 - zu)/(zj - zu)],
i = 0, 1, ..., m1; j = 0, 1, ..., m2.   (2.50)
The error resulting from the use of Lagrangian interpolation is, by
definition, the difference between η(t1, t2; θ) and g(t1, t2; θ); the
upper bound of this error can be obtained from (2.15) and is given by

max |e(t1, t2)| ≤ max |ηt1^(m1+1)(t1, t2; θ)|/[(m1+1)! 2^{m1}]
+ max |ηt2^(m2+1)(t1, t2; θ)|/[(m2+1)! 2^{m2}]
+ max |ηt1^(m1+1)(t1, t2; θ) ηt2^(m2+1)(t1, t2; θ)|/[(m1+1)!(m2+1)! 2^{m1+m2}]
   (2.51)

where each maximum is taken over -1 ≤ t1 ≤ 1, -1 ≤ t2 ≤ 1.
2.3.1.2 Designs Based on Lagrangian Interpolation
Consider the Lagrange interpolating polynomial g(t1, t2; θ) given
in (2.49) with interpolation points of the Chebyshev type given in
(2.14) with a1 = -1, b1 = 1, a2 = -1 and b2 = 1. The positive integers
m1 and m2 are determined such that g(t1, t2; θ) is approximately equal
to η(t1, t2; θ) with an error of approximation not exceeding δ over
the region R²_t and for all θ in Θ. An approximate representation of
model (2.46) is given by

y = g(t1, t2; θ) + ε.   (2.52)
Let (t11, t12), (t21, t22), ..., (tn1, tn2) be the points in the
region R²_t = {(t1, t2): -1 ≤ t1 ≤ 1, -1 ≤ t2 ≤ 1} which are in one-to-one
correspondence through (2.33) with the points D = [(x11, x12), (x21, x22),
..., (xn1, xn2)]' in the region R²_x = {(x1, x2): a1 ≤ x1 ≤ b1, a2 ≤ x2 ≤ b2}.
Let us consider r(ti, θ) in (2.36); since it is the partial derivative of
g(ti1, ti2; θ) with respect to θ, from (2.49) we can write r(ti, θ) as

r(ti, θ) = Σ_{0≤u≤m1} Σ_{0≤j≤m2} [∂η(yu, zj; θ)/∂θ] luj(ti1, ti2),
i = 1, 2, ..., n.   (2.53)

From (2.37) and (2.53), the p×[(m1+1)(m2+1)] matrix U(θ) can be written as

U(θ) = [∂η(y0, z0; θ)/∂θ, ∂η(y0, z1; θ)/∂θ, ..., ∂η(ym1, zm2; θ)/∂θ]   (2.54)

and is of rank p by Assumption (b), and v(ti) is an [(m1+1)(m2+1)]×1
matrix of the form

v(ti) = [l00(ti1, ti2), l01(ti1, ti2), ..., lm1m2(ti1, ti2)]'.   (2.55)
If n ≥ (m1+1)(m2+1), at least m1+1 of t11, ..., tn1 are distinct,
and at least m2+1 of t12, ..., tn2 are distinct, then the
[(m1+1)(m2+1)]×[(m1+1)(m2+1)] matrix A(t1, t2, ..., tn) of the form
(2.39), where v(ti) is given in (2.55), should be of full rank and thus
nonsingular. To show this, suppose that A(t1, t2, ..., tn) is of rank
less than (m1+1)(m2+1); then the [(m1+1)(m2+1)]×n matrix
G = [v(t1), v(t2), ..., v(tn)] is of the same rank as A(t1, t2, ...,
tn). In this case there must exist a set of auj, u = 0, 1, ..., m1;
j = 0, 1, ..., m2, not all equal to zero, such that

Σ_{u=0}^{m1} Σ_{j=0}^{m2} auj luj(ti1, ti2) = 0, i = 1, 2, ..., n,   (2.56)

on R²_t = {(t1, t2): -1 ≤ t1 ≤ 1, -1 ≤ t2 ≤ 1}. But luj(ti1, ti2) can be written
as lu(ti1)lj(ti2), so (2.56) becomes Σ_{u=0}^{m1} Σ_{j=0}^{m2} auj lu(ti1)lj(ti2) = 0,
i = 1, 2, ..., n. Let Cj(ti1) = Σ_{u=0}^{m1} auj lu(ti1). It follows that
Σ_{j=0}^{m2} Cj(ti1)lj(ti2) = 0 for each ti1 in [-1,1], i = 1, 2, ..., n. But
the lj's are linearly independent on the interval [-1,1] (see Khuri
(1982)). Thus, Cj(ti1) = 0 for each j = 0, 1, ..., m2 and each ti1, i = 1,
2, ..., n, in [-1,1]. But then for each j, j = 0, 1, 2, ..., m2,
Σ_{u=0}^{m1} auj lu(ti1) = 0, i = 1, 2, ..., n, and by the linear independence of
the lu's, auj = 0 for all u = 0, 1, ..., m1 and j = 0, 1, ..., m2, which is
impossible since we assumed that the auj's (u = 0, 1, ..., m1; j = 0, 1, ..., m2)
are not all equal to zero.
We now have U(θ), a matrix of rank p, and thus U(θ)U'(θ) has a nonzero
determinant. We have also shown that the matrix A(t1, t2, ..., tn)
is of rank (m1+1)(m2+1), and thus nonsingular. From Section 2.3, the
parameter-free design maximizes ψ(t1, t2, ..., tn) with respect to t1,
t2, ..., tn, where ψ(t1, t2, ..., tn) is of the form

ψ(t1, ..., tn) = [λ0^p(t1, ..., tn) + λ(m1+1)(m2+1)-1^p(t1, ..., tn)]/2
   (2.57)

where λ0(t1, ..., tn) and λ(m1+1)(m2+1)-1(t1, ..., tn) denote the
smallest and largest eigenvalues of A(t1, t2, ..., tn).
We have provided the methodology for obtaining the optimal design
based on the Lagrange interpolating polynomial for the case of two input
variables. It was shown that the number of design points is at least
equal to (m1+1)(m2+1). This number tends to grow rapidly as m1 or m2
becomes large. In this case, it becomes very difficult, and practically
impossible, to determine optimal design settings using Price's computer
algorithm, or any other similar algorithm, due to the high
dimensionality of the optimization problem.
2.3.2 Designs Obtained through Spline Interpolation of Nonlinear Models
In this section we develop methods for constructing designs
assuming the true mean response given in (2.32) can be adequately
approximated with a linear, a quadratic or a cubic spline in the case
of a single input variable. Each approximation will be used to obtain a
design for parameter estimation which does not depend on the parameter
vector θ.
2.3.2.1 Spline Interpolation of the Mean Response
Consider the model (2.32) for the case of a single input variable
(k=1):

y = f(x; θ) + ε   (2.58)

with the experimental region R¹_x = {x: a ≤ x ≤ b}. Substituting t for x
in model (2.58) using the linear transformation given in (2.33) with k=1,
we obtain the model

y = η(t; θ) + ε   (2.59)

and the region R¹_x is reduced to the interval -1 ≤ t ≤ 1.

Consider the knots τ0, τ1, ..., τh+1, where -1 = τ0 < τ1 < ... <
τh < τh+1 = 1. A spline of degree m (with m-1 continuous derivatives at
the knots τi, i = 1, 2, ..., h) which approximates η(t; θ) can be
obtained by substituting t for x in (2.17); we obtain

g(t; θ) = Σ_{i=0}^{m} ai t^i + Σ_{i=1}^{h} bi (t - τi)+^m   (2.60)

where

(t - τi)+^j = (t - τi)^j if t ≥ τi, and 0 otherwise, j = 1, 2, ...   (2.61)

The coefficients a0, a1, ..., am, b1, b2, ..., bh in model (2.60) can be
obtained by solving the interpolation problems in (2.21), (2.25) and
(2.29) for the linear, quadratic and cubic splines, respectively. We note
that the coefficients ai, bj, i = 0, 1, 2, ..., m; j = 1, 2, ..., h, are
functions of θ.
Consider the model in (2.59). We assume that

(a) η(t; θ) has continuous derivatives up to order m+1 with respect to
t over the interval -1 ≤ t ≤ 1, where m is the degree of
the spline function and is such that m+h+1 is a positive integer
greater than or equal to p, the number of parameters in model
(2.58), and h is the number of interior knots and is large enough
so that

(Δτ)² max_{-1≤t≤1} |η^(2)(t; θ)|/8 ≤ δ, for the linear spline, i.e., m=1,
   (2.62)

or

(Δτ)³ max_{-1≤t≤1} |η^(3)(t; θ)|/8 ≤ δ, for the quadratic spline, i.e., m=2,
   (2.63)

or

5(Δτ)⁴ max_{-1≤t≤1} |η^(4)(t; θ)|/384 ≤ δ, for the cubic spline, i.e., m=3,
   (2.64)

for all θ in some parameter space Θ, where Δτ = τi+1 - τi,
i = 0, 1, ..., h, is the common distance between the knots,
η^(j)(t; θ) = ∂^j η(t; θ)/∂t^j, j = 2, 3, 4, and δ is a given
positive constant. We need the quantity m+h+1 to be a positive
integer greater than or equal to p for the purpose of parameter
estimation: since m+h+1 is the number of a's and b's in (2.60)
and these a's and b's are functions of θ = (θ1, ..., θp)', if
m+h+1 < p we cannot uniquely estimate θ1, ..., θp. The assumption
that η(t; θ) has continuous derivatives up to order m+1 with
respect to t is needed for the purpose of obtaining the upper
bound of the error of interpolation.

(b) The p×(m+h+1) matrix

U(θ) = [U1(θ), U2(θ), ..., Um+h+1(θ)]   (2.65)

is of rank p, where Ui+1(θ) = ∂ai/∂θ, i = 0, 1, ..., m,
Um+1+j(θ) = ∂bj/∂θ, j = 1, 2, ..., h, and the ai, bj, i = 0, 1, ..., m;
j = 1, 2, ..., h, are the coefficients in model (2.60).
The error resulting from the use of the spline interpolating
polynomial is the difference between η(t; θ) and g(t; θ). Upper
bounds on this error can be obtained by substituting t for x
in (2.23), (2.27) and (2.31) using the linear transformation given
in (2.33) with k=1, resulting in

max_{-1≤t≤1} |g(t; θ) - η(t; θ)| ≤ (Δτ)² max_{-1≤t≤1} |η^(2)(t; θ)|/8,   (2.66)

for the linear spline,

max_{-1≤t≤1} |g(t; θ) - η(t; θ)| ≤ (Δτ)³ max_{-1≤t≤1} |η^(3)(t; θ)|/8,   (2.67)

for the quadratic spline, and

max_{-1≤t≤1} |g(t; θ) - η(t; θ)| ≤ 5(Δτ)⁴ max_{-1≤t≤1} |η^(4)(t; θ)|/384,   (2.68)

for the cubic spline, where Δτ and η^(j)(t; θ), j = 2, 3, 4, are defined
in Assumption (a).
2.3.2.2 Designs Based on Spline Interpolation
Consider the spline interpolating polynomial g(t; θ) given in
(2.60) with h interior knots. The positive integer h is determined by
requiring the upper bound in (2.66), (2.67) or (2.68), for the linear,
quadratic or cubic spline, respectively, to be less than some specified
small value δ > 0 for all θ in some parameter space Θ. Thus, an
approximate representation of model (2.59) is given by

y = g(t; θ) + ε.   (2.69)

Let t1, t2, ..., tn be the points on the interval -1 ≤ t ≤ 1 which
correspond in a one-to-one manner through (2.33) to the points of the
design D = [x1, x2, ..., xn]', with xi (i = 1, 2, ..., n) belonging to the
experimental region R¹_x = {x: a ≤ x ≤ b}. From (2.60) we can write
r(ti, θ), where r(ti, θ) is the partial derivative of g(ti; θ) with
respect to θ, as

r(ti, θ) = Σ_{u=0}^{m} (∂au/∂θ) ti^u + Σ_{u=1}^{h} (∂bu/∂θ)(ti - τu)+^m,
i = 1, 2, ..., n.   (2.70)

From (2.37) and (2.70) we obtain U(θ) and v(ti) of the form

U(θ) = [∂a0/∂θ, ∂a1/∂θ, ..., ∂am/∂θ, ∂b1/∂θ, ..., ∂bh/∂θ]   (2.71)

v(ti) = [1, ti, ti², ..., ti^m, (ti - τ1)+^m, (ti - τ2)+^m, ..., (ti - τh)+^m]'.
   (2.72)

The matrix U(θ) is of order p×(m+h+1) and of rank p by Assumption (b),
and v(ti) is an (m+h+1)×1 matrix.
Now, if n ≥ m+h+1 and at least m+h+1 of the design points are
distinct (this is needed so that the a's and b's in (2.60) are uniquely
estimated), then the (m+h+1)×(m+h+1) matrix

A(t1, t2, ..., tn) = Σ_{i=1}^{n} v(ti) v'(ti)   (2.73)

should be of rank m+h+1 and thus nonsingular. To show this, suppose
that A(t1, t2, ..., tn) is of rank less than m+h+1; then the (m+h+1)×n
matrix G = [v(t1), ..., v(tn)] is of the same rank as A(t1, t2,
..., tn). Therefore, there must exist a set of ai, i = 0, 1, ..., m+h,
not all equal to zero, such that

Σ_{i=0}^{m} ai tj^i + Σ_{i=m+1}^{m+h} ai (tj - τi-m)+^m = 0, j = 1, 2, ..., n.
   (2.74)

From (2.74) we conclude that Σ_{i=0}^{m} ai t^i + Σ_{i=m+1}^{m+h} ai (t - τi-m)+^m,
which on each subinterval is a polynomial of degree m, has n roots,
namely, t1, t2, ..., tn, which is impossible since n ≥ m + h + 1 > m and
at least m+h+1 of t1, t2, ..., tn are distinct (a polynomial of degree m
has at most m distinct roots).

We have thus shown that the (m+h+1)×(m+h+1) matrix
A(t1, t2, ..., tn) is of rank m+h+1. Therefore, it is positive
definite and, by Assumption (b), U(θ) is a matrix of rank p; hence the
p×p matrix U(θ)U'(θ) has a nonzero determinant. Consequently, from
Section 2.3, the parameter-free design is the one which maximizes
ψ(t1, t2, ..., tn) with respect to t1, t2, ..., tn, and
ψ(t1, t2, ..., tn) is of the form

ψ(t1, t2, ..., tn) = [λ0^p(t1, t2, ..., tn) + λm+h^p(t1, t2, ..., tn)]/2
   (2.75)

where λ0(t1, t2, ..., tn) and λm+h(t1, t2, ..., tn) denote the smallest
and largest eigenvalues of A(t1, t2, ..., tn).
We note that in order to be able to generate a satisfactory
distribution of information throughout the interval [-1,1], the optimal
design obtained on the basis of spline interpolation is subject to the
constraint that at least one design point is allocated in each subinterval.
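Putting (2.72) and (2.75) together for, say, a cubic spline with two interior knots gives the hedged sketch below (NumPy assumed; the knots, design points and p are placeholders). In an actual search, candidate designs would also be screened for the constraint that every subinterval receives at least one point.

```python
import numpy as np

def v_spline(t, knots, m):
    """v(t) = [1, t, ..., t^m, (t - tau_1)_+^m, ..., (t - tau_h)_+^m]', eq. (2.72)."""
    powers = [t ** i for i in range(m + 1)]
    trunc = [max(t - tau, 0.0) ** m for tau in knots]
    return np.array(powers + trunc)

def psi_spline(tpoints, knots, m, p):
    """Criterion (2.75) for a candidate design t_1..t_n."""
    A = sum(np.outer(v_spline(t, knots, m), v_spline(t, knots, m))
            for t in tpoints)
    lam = np.linalg.eigvalsh(A)
    return (lam[0] ** p + lam[-1] ** p) / 2

# cubic spline (m = 3), h = 2 interior knots, n = m + h + 1 = 6 points,
# spread so that every subinterval of [-1, 1] contains at least one point
knots = [-1 / 3, 1 / 3]
design = [-1.0, -0.6, -0.1, 0.2, 0.7, 1.0]
value = psi_spline(design, knots, 3, p=2)
```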
2.4 Examples
We now present two examples for obtaining parameter-free designs
based on Lagrangian interpolation (for the case of two input variables)
and spline interpolation polynomials of degrees 1, 2 and 3 (for the case
of a single input variable).
Example 2.1 Consider the model used by Box and Hunter (1965) for a
certain chemical reaction rate which can be described by the nonlinear
model

f(x1, x2; θ) = θ1θ3x1/(1 + θ1x1 + θ2x2)   (2.76)

where f(x1, x2; θ) represents the reaction rate, x1 and x2 are partial
pressures of reactant and product, respectively, θ1 and θ2 are adsorption
equilibrium constants for reactant and product, respectively, and θ3 is
the effective reaction rate constant. Suppose that prior knowledge of
the parameters θ1, θ2 and θ3 is available in the form of known lower and
upper bounds. Thus, vi ≤ θi ≤ wi, i = 1, 2, 3, for known
values vi and wi (i = 1, 2, 3), and suppose that the experimental region
is R²_x = {(x1, x2): 0 ≤ x1 ≤ b1, 0 ≤ x2 ≤ b2}, where b1 and b2 are
positive values specified by the experimenter (in this example we choose
b1 = 3 and b2 = 2). Using the linear transformation given in (2.33) with
k=2, a1 = 0 and a2 = 0, model (2.76) becomes

η(t1, t2; θ) = θ1θ3b1(t1+1)/[2 + θ1b1(t1+1) + θ2b2(t2+1)],   (2.77)

and the region R²_x is transformed to R²_t = {(t1, t2): -1 ≤ t1 ≤ 1, -1 ≤ t2 ≤ 1}. The
Lagrangian interpolation accuracy is determined by using the bound
given in (2.51). It can be shown (see Appendix 1) that for all θ in Θ,
the upper bound for the error is given by

max |e(t1, t2)| ≤ max_{t1} { w1 w3 b1(t1+1) w2^{m2+1} b2^{m2+1} /
   [2^{m2} (2 + v1b1(t1+1) + v2b2(t2+1))^{m2+2}] }

+ max_{t2} { w1^{m1+1} w3 b1^{m1+1} [2 + w2b2(t2+1)] /
   [2^{m1} (2 + v1b1(t1+1) + v2b2(t2+1))^{m1+2}] }

+ max { w1^{m1+2} w3² b1^{m1+2} (t1+1) w2^{m2+1} b2^{m2+1} [2 + w2b2(t2+1)] /
   [2^{m1+m2} (2 + v1b1(t1+1) + v2b2(t2+1))^{m1+m2+4}] }   (2.78)

where each maximum is taken over -1 ≤ t1 ≤ 1, -1 ≤ t2 ≤ 1.
Suppose now we require that the upper bound of the error in (2.78) (with b_1 = 3 and b_2 = 2) be less than or equal to δ = .05. We note here that this upper bound involves the upper and lower limits of θ_1, θ_2, and θ_3. Suppose that the lower and upper limits of θ_1, θ_2, and θ_3 are such that .5 ≤ θ_1 ≤ 1, .7 ≤ θ_2 ≤ 2, and .4 ≤ θ_3 ≤ .8. These values were obtained from the approximate 95% confidence region of θ, which was constructed using ten response values generated according to the model in (2.77) with θ_1, θ_2, θ_3, and the error variance, σ², taken as .6, 1.2, .5, and .0006, respectively. The ten design points were chosen from the region R_t2 = {(t_1, t_2): -1 ≤ t_1 ≤ 1, -1 ≤ t_2 ≤ 1}.
To compute the upper bound of the error we fix m_1 and m_2 and carry out the maximization as in (2.78) over the region R_t2 = {(t_1, t_2): -1 ≤ t_1 ≤ 1, -1 ≤ t_2 ≤ 1}. This maximization is accomplished by the use of the computer program SEARCH, which was written by Conlon (1979). By computing this upper bound for several assignments of m_1 and m_2, we obtain the values of m_1 and m_2 that satisfy the δ = .05 requirement. These values are m_1 = 12 and m_2 = 4, and the value of the upper bound is .04816541. The Chebyshev points corresponding to these values of m_1 and m_2 are obtained from (2.14) with a_1 = -1, a_2 = -1, b_1 = 1, and b_2 = 1. These points are given in Table 1. Using these Chebyshev points to form the coordinates of points in two dimensions, we obtain (m_1+1)(m_2+1) = 65 points. These coordinates are given in Table 2. We choose n, the number of design points, to be equal to 65 (here all sixty-five points are distinct). The vector v(t_i) as well as the matrix A(t_1, ..., t_65) can be obtained from formulas (2.55) and (2.39), respectively, which leads to the maximization of φ(t_1, t_2, ..., t_65) in (2.57) with p = 3. Since this number of design points makes the optimization computationally intractable, the actual determination of the 65-point design was not attempted.
Table 1
Chebyshev Points for Example 2.1

Based on m_1 = 12:  0, ±.23932, ±.46472, ±.66312, ±.82298, ±.93502, ±.99271
Based on m_2 = 4:   0, ±.58779, ±.95106
Table 2
Coordinates of Points in Two Dimensions Obtained from Chebyshev Points for Example 2.1

The 65 points are all pairs (t_1, t_2) with
t_1 in {0, ±.23932, ±.46472, ±.66312, ±.82298, ±.93502, ±.99271} and
t_2 in {0, ±.58779, ±.95106}.
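For reference, the Chebyshev points in Tables 1 and 2 are the zeros of the Chebyshev polynomials of degrees m_1 + 1 = 13 and m_2 + 1 = 5 on [-1, 1]; the short sketch below (an added illustration, not part of the original computation) regenerates them and forms the 65-point grid:

```python
import numpy as np

def chebyshev_points(m):
    """Zeros of the Chebyshev polynomial T_{m+1} on [-1, 1]:
    cos((2k - 1)pi / (2(m + 1))), k = 1, ..., m + 1."""
    k = np.arange(1, m + 2)
    return np.cos((2 * k - 1) * np.pi / (2 * (m + 1)))

t1 = chebyshev_points(12)   # 13 points of Table 1 (m1 = 12)
t2 = chebyshev_points(4)    # 5 points of Table 1 (m2 = 4)
grid = [(a, b) for a in t1 for b in t2]   # (m1+1)(m2+1) = 65 points
print(len(grid), np.round(t1, 5), np.round(t2, 5))
```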
Example 2.2. In this second example the parameter-free designs are obtained through linear, quadratic, and cubic spline interpolating polynomials used for approximating the nonlinear model. We consider the model used by Draper and Smith (1981), taken from an investigation performed at Procter and Gamble of the relationship between Available Chlorine, y, of product A and the length of time of manufacture, x. Product A must have a fraction 0.50 of Available Chlorine at the time of manufacture, and the fraction of Available Chlorine in the product decreases with time. The form of the model is

$$y = \alpha + (.49 - \alpha)\exp(-\beta(x - 8)) + \epsilon \tag{2.79}$$

We suppose that the experimental region is R_1 = {x: a ≤ x ≤ b}, where a and b are positive values specified by the experimenter (in this example a = 10 and b = 40), and let the parameter space Ω consist of values of α and β between known lower and upper bounds (with 0.36 as the lower bound on α), obtained from the approximate 95% confidence region of α and β in the example by Draper and Smith (1981, p. 486). Using the linear transformation given in (2.33) with k = 1, a = 10 and b = 40, model (2.79) becomes

$$y = \alpha + (.49 - \alpha)\exp(-\beta(15t + 17)) + \epsilon \tag{2.80}$$
where -1 ≤ t ≤ 1. For knots with spacing Δτ, the upper bound on the error of approximating the mean response in (2.80) is

$$\max_{-1\le t\le 1} |\eta(t,\theta) - g(t,\theta)| \le \begin{cases} (\Delta\tau)^2(.0679675) & \text{for the linear spline} \\ (\Delta\tau)^3(.1631221197) & \text{for the quadratic spline} \\ (\Delta\tau)^4(.0407805) & \text{for the cubic spline.} \end{cases} \tag{2.81}$$

Suppose now we require that the upper bounds in (2.81) be less than or equal to δ = .05, taking into consideration that we have equidistant knots. Then values of Δτ = τ_{i+1} - τ_i, i = 0, 1, ..., h, that satisfy the δ = .05 requirement are

$$\Delta\tau = \begin{cases} .4 & \text{with maximum error } .0108748 \text{ for the linear spline} \\ .5 & \text{with maximum error } .02039 \text{ for the quadratic spline} \\ 1 & \text{with maximum error } .0407805 \text{ for the cubic spline.} \end{cases} \tag{2.82}$$
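The spacings in (2.82) can be verified by direct arithmetic from the bounds in (2.81); a minimal sketch (the constants are those quoted in (2.81)):

```python
# Verify that the knot spacings chosen in (2.82) meet the delta = .05
# requirement of (2.81), and count the resulting interior knots on [-1, 1].
delta = 0.05
bounds = {"linear":    (0.0679675,    2, 0.4),
          "quadratic": (0.1631221197, 3, 0.5),
          "cubic":     (0.0407805,    4, 1.0)}
for name, (const, power, spacing) in bounds.items():
    max_error = const * spacing**power
    interior_knots = int(round(2 / spacing)) - 1   # equidistant on [-1, 1]
    print(f"{name}: max error {max_error:.7f} <= {delta}, "
          f"h = {interior_knots} interior knots")
```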
From (2.82), the numbers of interior knots on the interval -1 ≤ t ≤ 1 for the linear, quadratic, and cubic splines are h = 4, 3, and 1, respectively. The knot positions for the linear spline are -1, -.6, -.2, .2, .6, 1; for the quadratic spline, -1, -.5, 0, .5, 1; and for the cubic spline, -1, 0, 1. By choosing n, the number of design points, to be equal to m + h + 1, where m is the degree of the spline functions (in our case m = 1, 2, and 3) and h is the number of interior knots, we obtain n = 6, 6, and 5 for the linear, quadratic, and cubic splines, respectively. The vector v(t) as well as the matrix A(t_1, t_2, ..., t_n) can be obtained from formulas (2.72) and (2.73), respectively. The maximum values of φ(t_1, t_2, ..., t_n) in (2.75) with p = 2 for the linear, quadratic, and cubic splines are 178.605, 219.227, and 114.813, and the corresponding optimal values of t_i (i = 1, 2, ..., n), along with the join points (interior knots) τ_i, i = 1, 2, ..., h, are given in Table 3.
For comparison purposes, and to check the efficiency of our optimal design, values of the determinant of the matrix M(D, θ̄) given in (2.38), with θ̄ = (α, β)' = (.39, .10)', were computed at both the optimal design and the Box-Lucas design (see Appendix 3); they are given in Table 4. The efficiency is defined by (Lucas (1976))

$$\text{Efficiency} = \left[\frac{|M(D,\bar{\theta})|_{\text{Optimal Design}}/n_1}{|M(D,\bar{\theta})|_{\text{Box-Lucas}}/n_2}\right]^{1/p}$$

where n_1 and n_2 (in this example n_1 = 6, 6, 5 for the linear, quadratic, and cubic splines and n_2 = 2) are the numbers of design points for the optimal and Box-Lucas designs, respectively, and p is the number of parameters in the model (in our case p = 2). The efficiencies are also given in Table 4.
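The efficiencies in Table 4 follow directly from this formula; a small sketch using the determinants quoted in Table 4 (an added check, not the original computation):

```python
# Efficiency of each optimal design relative to the Box-Lucas design,
# Lucas (1976): [(|M|_opt / n1) / (|M|_BL / n2)]**(1/p), with p = 2.
p, n2 = 2, 2
rows = [("linear",    0.387684347, 0.069146279, 6),
        ("quadratic", 0.247975528, 0.076197813, 6),
        ("cubic",     0.169248231, 0.085737989, 5)]
for name, det_opt, det_bl, n1 in rows:
    eff = ((det_opt / n1) / (det_bl / n2)) ** (1.0 / p)
    print(f"{name}: efficiency {eff:.3f}")   # 1.367, 1.042, .889
```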
From Table 4 we can conclude that, with respect to model (2.79), our optimal designs based on the linear and quadratic splines are 1.367 and 1.042 times as efficient as the Box-Lucas design, while the optimal design based on the cubic spline has a lower efficiency than the Box-Lucas design.
For the purpose of comparison, plots of the linear, quadratic, and cubic splines are produced in Figures 2, 3, and 4. For the true mean response curve in Figure 1, we set α = .39 and β = .10 (these two values are the least squares estimates of α and β obtained by Draper and Smith (1981)). The equations for the linear, quadratic, and cubic splines that provide the approximations to the model in (2.80), with the maximum errors given in (2.82), are of the forms

linear spline:
$$g(t;\theta) = .379523 - .0923504t + .0416674(t+.6)_+ + .0228676(t+.2)_+ + .01255(t-.2)_+ + .00688758(t-.6)_+$$

quadratic spline:
$$g(t;\theta) = .416754 + .00793627t + .0630552t^2 - .0340915(t+.5)_+^2 - .0151525(t-0)_+^2 - .00724385(t-.5)_+^2$$

cubic spline:
$$g(t;\theta) = .408268 - .0261167t + .0157712t^2 - .0217168t^3 + .0178701(t-0)_+^3 \tag{2.83}$$

These splines are constructed under the assumption that each of them has m - 1 continuous derivatives at the knots, where m = 1, 2, and 3, respectively. The mean response values obtained from the model in (2.80) (with α = .39 and β = .10) and the values obtained from the linear, quadratic, and cubic splines in (2.83) are given in Table 5 for certain values of t in the interval [-1,1]; the plots of the response values (for each model) against the t values are given in Figures 1, 2, 3, and 4, respectively.
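As an added check on (2.83), the splines can be evaluated directly; the sketch below reproduces several rows of Table 5 for the true model and the linear and cubic splines (coefficients copied from (2.83); the truncated-power helper is merely a convenience):

```python
import math

def pos(u, k=1):
    """Truncated power (u)_+^k."""
    return max(u, 0.0) ** k

def true_model(t, alpha=0.39, beta=0.10):
    """Mean response of (2.80)."""
    return alpha + (0.49 - alpha) * math.exp(-beta * (15 * t + 17))

def linear_spline(t):
    return (0.379523 - 0.0923504 * t + 0.0416674 * pos(t + 0.6)
            + 0.0228676 * pos(t + 0.2) + 0.01255 * pos(t - 0.2)
            + 0.00688758 * pos(t - 0.6))

def cubic_spline(t):
    return (0.408268 - 0.0261167 * t + 0.0157712 * t**2
            - 0.0217168 * t**3 + 0.0178701 * pos(t, 3))

for t in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(t, round(true_model(t), 6), round(linear_spline(t), 6),
          round(cubic_spline(t), 6))
```

At t = -1, for example, all three functions return .471873, matching the first row of Table 5.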
Table 3
Optimal Designs and Interior Knots. Example 2.2

Linear Spline             Quadratic Spline          Cubic Spline
(Interior Knots at        (Interior Knots at        (Interior Knot at
-.6, -.2, .2, .6)         -.5, 0, .5)               0)

-.6960602                 -.9823732                 -1
-.200152                  -.9458824                 -.0007636
 .1999294                 -.0002855                  .9975816
 .5999959                  .4995994                  .9990895
 .9997785                  .9999732                  1
-.9999283                  .9999965
Table 4
Values of |M(D, θ̄)| from Optimal and Box-Lucas Designs for θ̄ = (α, β)' = (.39, .10)' and the Efficiencies of the Optimal Design. Example 2.2

                    |M(D,θ̄)| Optimal Design   |M(D,θ̄)| Box-Lucas   Efficiency of Optimal Design
Linear Spline       .387684347                 .069146279            1.367
Quadratic Spline    .247975528                 .076197813            1.042
Cubic Spline        .169248231                 .085737989             .889
Table 5
Mean Response Values with Corresponding Values of t. Example 2.2

t         True Model   Linear Spline   Quadratic Spline   Cubic Spline
-1 LQC    .471873      .471873         .471873            .471873
-.9       .460469      .462638         .460686            .460379
-.8       .450653      .453403         .450760            .450374
-.7       .442205      .444168         .442096            .441726
-.6 L     .434933      .434933         .434692            .434306
-.5 Q     .428674      .429865         .428674            .427984
-.4       .423287      .424797         .423327            .422628
-.3       .418650      .419728         .418684            .418109
-.2 L     .414660      .414660         .414621            .414296
-.1       .411255      .411878         .411136            .411059
 0 QC     .408268      .409097         .408268            .408268
 .1       .405724      .406315         .405754            .405810
 .2 L     .403534      .403534         .403533            .403645
 .3       .401648      .402007         .401628            .401749
 .4       .400026      .400481         .399979            .400099
 .5 Q     .398629      .398954         .398629            .398672
 .6 L     .397427      .397427         .397438            .397445
 .7       .396393      .396590         .396400            .396395
 .8       .395502      .395752         .395494            .395499
 .9       .394736      .394914         .394719            .394733
 1 LQC    .394076      .394076         .394076            .394076

L = knot for the linear spline, Q = knot for the quadratic spline,
C = knot for the cubic spline
[Fig. 1. Plot of α + (.49 - α)exp(-β(15t + 17)) on [-1,1] with α = .39 and β = .10 for Example 2.2]

[Fig. 2. Plot of the Linear Spline Interpolant to α + (.49 - α)exp(-β(15t + 17)) with Six Evenly Spaced Knots on [-1,1] for Example 2.2]

[Fig. 3. Plot of the Quadratic Spline Interpolant to α + (.49 - α)exp(-β(15t + 17)) with Five Evenly Spaced Knots on [-1,1] for Example 2.2]

[Fig. 4. Plot of the Cubic Spline Interpolant to α + (.49 - α)exp(-β(15t + 17)) with Three Evenly Spaced Knots on [-1,1] for Example 2.2]
2.5 Discussion
The proposed optimal design is obtained from the interpolating
polynomials, so the efficiency of the design depends on the accuracy of
the interpolations. In Example 2.2 we compared the designs based on the linear, quadratic, and cubic splines to the Box-Lucas design. Table 4 summarizes the design comparisons, showing that the optimal designs based on the linear and quadratic splines have higher efficiencies than the Box-Lucas design, while the optimal design based on the cubic spline has lower efficiency than the Box-Lucas design. This may be explained by the accuracies of the interpolations; that is, the upper bound of the error of interpolation using the cubic spline is larger than those of the linear and quadratic splines (see Eq. 2.82). We also note that the design based on the linear spline has slightly higher efficiency than the design based on the quadratic spline when both are compared to the Box-Lucas design. Based on the efficiencies of the designs, our recommendation for Example 2.2 is to choose the design that is based on the linear spline. The numbers of design points (n) for the linear,
quadratic and cubic splines in this example are 6, 6, and 5,
respectively. This number seems to be nonincreasing as the degree of
the spline interpolating polynomial increases. This is not always true
since the number of design points depends on m and h (n = m+h+1) where m
is the degree of spline interpolating polynomial and h is the number of
interior knots. The value of h can increase, decrease or stay the same
when m increases. Hence, there is the possibility of increasing the
number of design points with an increase in the degree of the spline
interpolating polynomial. The optimal design is also affected by the
position of the design points in the experimental region. A problem
arises of how the design points should be allocated, especially for the
design that is based on spline interpolating polynomial. Box and Draper
(1975) point out that "the design should generate a satisfactory
distribution of information throughout the region of interest." For
this reason, for the optimal design based on spline interpolating
polynomials, we would choose our design so that there is at least one
design point in each subinterval. One other problem is that when the
number of design points becomes large (n > 10), it is almost impossible
to calculate the design points through the Controlled Random Search
Procedure. This was the case when we tried to calculate the design
points for the design that is based on Lagrange interpolating polynomial
with two input variables (Example 2.1). It is hoped that a sequential
procedure, in which points are added one at a time, can be used to solve
the problem of optimization when the number of design points is large.
This is subject to future research.
CHAPTER THREE
CONFIDENCE REGIONS FOR THE PARAMETERS
IN NONLINEAR MODELS
3.1 Introduction
In this chapter we present a method for the construction of a
confidence region on a vector of parameters in a nonlinear model.
Consider the same model as in (1.5),

$$y = f(x; \theta) + \epsilon \tag{3.1}$$

where y is the measured or observed response, f(x; θ) is the true mean response, x = (x_1, x_2, ..., x_k)' is a vector of k input variables, θ = (θ_1, θ_2, ..., θ_p)' is a vector of p unknown parameters, and ε is a random error. If there are n observations (n > p) of the form y_1, y_2, ..., y_n, with y_i being obtained at the design point x_i = (x_{i1}, x_{i2}, ..., x_{ik})', i = 1, 2, ..., n, then model (3.1) can be written in the form

$$y_i = f(x_i; \theta) + \epsilon_i \,, \qquad i = 1, 2, \ldots, n \tag{3.2}$$

We assume that the random errors ε_1, ε_2, ..., ε_n are independently distributed as N(0, σ²), with σ² unknown. We define the n × k design matrix as D = [x_{ij}], i = 1, 2, ..., n; j = 1, 2, ..., k. The ith row of this matrix, with elements x_{i1}, x_{i2}, ..., x_{ik}, provides the levels of the k variables at which the response is to be observed in the ith run. Throughout our development we shall assume that replications can be obtained at each design point. These replications will be used to obtain an estimate of σ² (Draper and Smith (1981)). In the following sections we consider the problem of constructing a confidence region on the vector of parameters, θ.
3.2 Method for Obtaining Confidence Region
on Parameters
Consider the model given in (3.1). For a given value of x, f(x; θ) is a function of θ. When x takes on n values in the experimental region R, we end up having n functions of θ. To construct a confidence region on the parameters θ, we shall proceed as follows.

Step One. For n > p, we select $q = \binom{n}{p}$ sets of points, each consisting of p design points. Let W consist of the collection of the q sets of p points, and let x_1^{(l)}, ..., x_p^{(l)} denote the p points in the lth set (l = 1, 2, ..., q).

Step Two. For the lth set of points in W, l = 1, 2, ..., q, we obtain a set of simultaneous confidence intervals on f(x_1^{(l)}; θ), ..., f(x_p^{(l)}; θ). Let Ω^{(l)} be the cartesian product of these intervals; that is, Ω^{(l)} is a rectangular confidence region on the vector (f(x_1^{(l)}; θ), ..., f(x_p^{(l)}; θ))'. The confidence intervals will be chosen so that the confidence coefficient for this region is equal to 1 - α.

Step Three. Let

$$\eta_i^{(l)} = f(x_i^{(l)}; \theta) \,, \qquad i = 1, 2, \ldots, p$$

We shall solve this system of equations for θ_1, ..., θ_p in terms of η_1^{(l)}, ..., η_p^{(l)} (this can be achieved by assuming certain conditions on the functions f(x_1^{(l)}; θ), ..., f(x_p^{(l)}; θ), l = 1, 2, ..., q).

Step Four. We obtain a rectangular confidence region on the vector θ = (θ_1, ..., θ_p)' with a confidence coefficient ≥ 1 - α. This region, denoted by Γ^{(l)}, is based on the region Ω^{(l)}.

Step Five. Using Bonferroni's inequality, the intersection $\bigcap_{l=1}^{q} \Gamma^{(l)}$ provides a confidence region on the vector θ with a confidence coefficient ≥ 1 - qα. This intersection can be used to obtain simultaneous confidence intervals on any set of functions of θ_1, ..., θ_p.
We shall now give details of Steps One through Five.
3.3 Confidence Region on θ

In this section we first construct the rectangular confidence region Ω^{(l)} on each set of p functions of the form f(x_i^{(l)}; θ), i = 1, 2, ..., p, where x_1^{(l)}, ..., x_p^{(l)} is a set of points in W, l = 1, 2, ..., q. The p-variate Student's t-distribution defined by Dunnett and Sobel (1954) will be used to obtain these confidence regions. For the sake of simplicity, in this section we shall drop the superscript (l) and just use x_i.
Consider the model given in (3.2) and suppose

y_{11}, y_{12}, ..., y_{1n_1} are n_1 repeat observations at x_1,
y_{21}, y_{22}, ..., y_{2n_2} are n_2 repeat observations at x_2,
...
y_{n1}, y_{n2}, ..., y_{nn_n} are n_n repeat observations at x_n,

where y_{iu} is the uth observation at the ith design point. Let us assume that the y_{iu} are independently normally distributed with mean f(x_i; θ) and common unknown variance σ² (i = 1, 2, ..., n; u = 1, 2, ..., n_i). Draper and Smith (1981) pointed out that to estimate σ² we can use the pooled variance of the form

$$s^2 = \frac{\sum_{i=1}^{n}\sum_{u=1}^{n_i}(y_{iu} - \bar{y}_i)^2}{\sum_{i=1}^{n}(n_i - 1)} = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2 + \cdots + (n_n-1)s_n^2}{m} \tag{3.3}$$

where s_i² is the sample variance of the n_i (≥ 2) observed responses at the ith design point, ȳ_i is the average of the observed responses at the ith design point, and n is the number of distinct design points. Here s² has $m = \sum_{i=1}^{n}(n_i - 1)$ degrees of freedom.
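A direct transcription of (3.3) is given below (the replicate data are illustrative placeholders, not values from the examples of this chapter):

```python
import numpy as np

def pooled_variance(groups):
    """Pooled variance s^2 of (3.3) from replicated observations at each
    distinct design point; returns (s^2, degrees of freedom m)."""
    num = sum((len(g) - 1) * np.var(g, ddof=1) for g in groups)
    m = sum(len(g) - 1 for g in groups)
    return num / m, m

# Illustrative: three design points with replicated responses.
groups = [np.array([0.84, 0.79, 0.81]),
          np.array([0.66, 0.70]),
          np.array([0.52, 0.55, 0.58, 0.56])]
s2, m = pooled_variance(groups)
print(s2, m)
```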
Let

$$t_i = \frac{\bar{y}_i - f(x_i; \theta)}{s/\sqrt{n_i}} \,, \qquad i = 1, 2, \ldots, p \tag{3.4}$$

We note here that ȳ_i is normally distributed with mean f(x_i; θ) and variance σ²/n_i. If we write

$$z_i = \sqrt{n_i}\,\left[\bar{y}_i - f(x_i; \theta)\right] \,, \qquad i = 1, 2, \ldots, p \tag{3.5}$$

then (z_1, ..., z_p)' has a p-variate normal distribution with mean 0 and common unknown variance σ², and z_1, ..., z_p are independent by the assumption regarding the random errors stated in Section 3.1. Also, ms²/σ² has a chi-square distribution, independent of the z_i, with m degrees of freedom. By substituting z_i into (3.4) we obtain

$$t_i = \frac{z_i}{s} \,, \qquad i = 1, 2, \ldots, p \tag{3.6}$$
The joint distribution of the t_i, i = 1, 2, ..., p, of the form (3.6) is a central p-variate t-distribution with m degrees of freedom and has the density function

$$h(t_1, \ldots, t_p) = \frac{\Gamma\left[\frac{1}{2}(m+p)\right]}{(m\pi)^{p/2}\,\Gamma\left(\frac{m}{2}\right)}\left[1 + \frac{1}{m}\sum_{i=1}^{p} t_i^2\right]^{-\frac{1}{2}(m+p)} \tag{3.7}$$

where Γ(·) is the gamma function (see Dunnett and Sobel (1954)).
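The density (3.7) is straightforward to code; a sketch using the gamma function from Python's standard library:

```python
import math

def mv_t_density(t, m):
    """Central p-variate t density of Dunnett and Sobel (1954), eq. (3.7):
    independent normal numerators over a common chi-square denominator
    with m degrees of freedom."""
    p = len(t)
    const = (math.gamma((m + p) / 2)
             / ((m * math.pi) ** (p / 2) * math.gamma(m / 2)))
    return const * (1 + sum(x * x for x in t) / m) ** (-(m + p) / 2)

print(mv_t_density([0.0, 0.0], m=44))   # density at the origin, p = 2
```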
Simultaneous confidence intervals on f(x_i; θ), i = 1, 2, ..., p, with confidence coefficient 1 - α can be set up by finding p positive real numbers, say a_1, a_2, ..., a_p, such that

$$P(|t_1| \le a_1, \ldots, |t_p| \le a_p) = \int_{-a_p}^{a_p}\cdots\int_{-a_1}^{a_1} h(t_1, \ldots, t_p)\,dt_1\cdots dt_p = 1 - \alpha \tag{3.8}$$

By substituting the t_i given in (3.4), formula (3.8) can be written in the equivalent form

$$P\left(\bar{y}_i - a_i\frac{s}{\sqrt{n_i}} \le f(x_i;\theta) \le \bar{y}_i + a_i\frac{s}{\sqrt{n_i}}\,,\ i = 1, \ldots, p\right) = 1 - \alpha \tag{3.9}$$

which determines 100(1 - α)% simultaneous confidence intervals on f(x_1; θ), ..., f(x_p; θ). Let Ω denote the cartesian product of these intervals. The region Ω is a rectangular confidence region on the vector (f(x_1; θ), ..., f(x_p; θ))' with a confidence coefficient 1 - α.
Let η_i = f(x_i; θ), i = 1, 2, ..., p; η_i is a function of x_i and θ. For a given value of x_i, f(x_i; θ) is a function of θ only, and we write f(x_i; θ) as f_i(θ). Thus we can express η_i, i = 1, 2, ..., p, as

$$\begin{aligned} \eta_1 &= f_1(\theta_1, \theta_2, \ldots, \theta_p) = f_1(\theta)\\ \eta_2 &= f_2(\theta_1, \theta_2, \ldots, \theta_p) = f_2(\theta)\\ &\ \vdots\\ \eta_p &= f_p(\theta_1, \theta_2, \ldots, \theta_p) = f_p(\theta) \end{aligned} \tag{3.10}$$

Using the expressions in (3.10) we can write (3.9) as

$$P((\eta_1, \eta_2, \ldots, \eta_p) \in \Omega) = 1 - \alpha \tag{3.11}$$

If the Jacobian $\partial(f_1, \ldots, f_p)/\partial(\theta_1, \ldots, \theta_p) \ne 0$, then equations (3.10) can be uniquely solved for θ_1, ..., θ_p in terms of η_1, ..., η_p, and we obtain

$$\begin{aligned} \theta_1 &= g_1(\eta_1, \eta_2, \ldots, \eta_p)\\ \theta_2 &= g_2(\eta_1, \eta_2, \ldots, \eta_p)\\ &\ \vdots\\ \theta_p &= g_p(\eta_1, \eta_2, \ldots, \eta_p) \end{aligned} \tag{3.12}$$

We let $L_i = \min_{\eta \in \Omega} g_i(\eta_1, \ldots, \eta_p)$ and $U_i = \max_{\eta \in \Omega} g_i(\eta_1, \ldots, \eta_p)$; then from (3.11) and (3.12) we can conclude that

$$P(L_1 \le \theta_1 \le U_1, \ldots, L_p \le \theta_p \le U_p) \ge 1 - \alpha \tag{3.13}$$

which gives a rectangular confidence region on θ with a confidence coefficient ≥ 1 - α. We denote this region by Γ. Equation (3.13) can be written in the alternative form

$$P((\theta_1, \theta_2, \ldots, \theta_p) \in \Gamma) \ge 1 - \alpha \tag{3.14}$$
The above method can be applied to every set in W consisting of p points. We shall denote the confidence regions obtained as in (3.14) for the various q sets in W by Γ^{(1)}, ..., Γ^{(q)}. We thus have the following q confidence regions on θ, each having a confidence coefficient ≥ 1 - α:

$$P((\theta_1, \theta_2, \ldots, \theta_p) \in \Gamma^{(l)}) \ge 1 - \alpha \,, \qquad l = 1, 2, \ldots, q \tag{3.15}$$

We now combine the information from these q confidence regions using Bonferroni's inequality (see, for example, Neter and Wasserman (1974, p. 146) or Bickel and Doksum (1977, p. 439)). According to this inequality and from (3.15), we have

$$P\left((\theta_1, \theta_2, \ldots, \theta_p) \in \bigcap_{l=1}^{q} \Gamma^{(l)}\right) \ge 1 - q\alpha \tag{3.16}$$

which gives a confidence region on θ with a confidence coefficient ≥ 1 - qα. We can write (3.16) in the equivalent form

$$P(L_1^* \le \theta_1 \le U_1^*, \ldots, L_p^* \le \theta_p \le U_p^*) \ge 1 - q\alpha \tag{3.17}$$

where $L_i^* = \max(L_i^{(1)}, \ldots, L_i^{(q)})$ and $U_i^* = \min(U_i^{(1)}, \ldots, U_i^{(q)})$, i = 1, 2, ..., p, and L_i^{(l)}, U_i^{(l)}, i = 1, 2, ..., p, l = 1, 2, ..., q, are the lower and upper bounds, respectively, on θ_i obtained as in (3.13) for the lth set of points in W. We note from formula (3.17) that the intervals [L_i^*, U_i^*], i = 1, ..., p, form a set of simultaneous confidence intervals on θ_1, ..., θ_p, respectively, with a joint confidence coefficient greater than or equal to 1 - qα.
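Computationally, Step Five reduces to intersecting rectangles, taking for each θ_i the largest lower and smallest upper bound over the q sets; a minimal sketch with hypothetical bounds:

```python
def intersect_regions(regions):
    """Bonferroni intersection of q rectangular regions as in (3.17).
    Each region is a list of (lower, upper) pairs, one per theta_i."""
    p = len(regions[0])
    return [(max(r[i][0] for r in regions),
             min(r[i][1] for r in regions)) for i in range(p)]

# Illustrative: three rectangular regions on (theta_1, theta_2).
regions = [[(0.30, 1.10), (-2.00, 1.50)],
           [(0.45, 1.06), (-1.00, 1.20)],
           [(0.40, 0.97), ( 0.42, 1.02)]]
print(intersect_regions(regions))  # [(0.45, 0.97), (0.42, 1.02)]
```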
3.4 Examples
We now present two examples to illustrate the technique for obtaining confidence regions on the parameters of nonlinear models.
Example 3.1. Consider the two-parameter exponential growth model used by Gallant (1975b) of the form

$$y = \theta_1 e^{\theta_2 x} + \epsilon \tag{3.18}$$

A total of n = 11 design points are chosen from R = {x: 0 ≤ x ≤ 1}. These points are 0, .1, .2, ..., 1. Each point is replicated five times. The parameter space is taken as Ω = {(θ_1, θ_2): θ_1 > 0, θ_2 > 0}. Observed response values are generated using the parameter values θ_1 = .62, θ_2 = .5, and σ² = .04. These values are shown in Table 6. The mean (ȳ_i) and the variance (s_i²) of the observed responses at the ith design point, for i = 1, ..., 11, are listed in Table 7. The pooled variance s², calculated by using the formula in (3.3), is equal to .0332818 with m = 44 degrees of freedom.
Since n = 11 and p, the number of parameters, is equal to 2, we have $q = \binom{11}{2} = 55$ pairs of functions of the form f(x_i^{(l)}; θ), i = 1, 2; l = 1, 2, ..., 55. For each pair, the simultaneous confidence intervals on f(x_1^{(l)}; θ) and f(x_2^{(l)}; θ) can be obtained as follows. First we use the bivariate t-distribution given in (3.7) with p = 2 to calculate the probability described in (3.8). We assume that the limits of integration a_1, a_2 are equal to a common value a, so that Equation (3.8) can be expressed as

$$P(|t_1| \le a, |t_2| \le a) = \int_{-a}^{a}\int_{-a}^{a} h(t_1, t_2)\,dt_1\,dt_2 = 1 - \alpha \tag{3.19}$$

where h(t_1, t_2) is obtained from (3.7) with p = 2 and m = 44. To calculate the probability value in (3.19) we fix a and evaluate the integral. This is carried out using the computer subprogram DMLIN from IMSL (1982). Suppose we would like the confidence coefficient 1 - qα in (3.16) to be equal to .94. We set 1 - 55α = .94, which implies that α ≈ .0011. By computing the integral in (3.19) for several assignments of a we obtain a = 3.75, which corresponds to α = .001057813 and 1 - 55α = .9418.
Table 6
Observed Responses and Design Points for Example 3.1

Design Point (x)   Observed Responses (y)
0                  .84025, .84356, .66729, .75769, .67426
.1                 .53177, .62303, .75965, .49493, .70555
.2                 .77509, .61030, .79161, .58948, .74782
.3                 .46908, .82377, .91124, .66913, .65852
.4                 .85087, .63028, .56506, .81007, .89356
.5                 1.10607, 1.09510, .64805, .88696, .85727
.6                 1.31870, .73195, .85361, .52289, 1.05243
.7                 1.02124, 1.05300, .82359, .63941, .63028
.8                 .94460, .71885, .89828, 1.00649, .74048
.9                 .96815, 1.08855, .66642, .79255, .88114
1                  1.0697, .56959, .91519, 1.31447, .84882
Table 7
Means and Variances of Observed Responses at Distinct Design Points for Example 3.1

Design Point (x_i)   Mean (ȳ_i)   Variance (s_i²)
0                    .75661        .00733
.1                   .62299        .01255
.2                   .70286        .00913
.3                   .70635        .02893
.4                   .74997        .02073
.5                   .91869        .03606
.6                   .89592        .09277
.7                   .83350        .04062
.8                   .86174        .01607
.9                   .87936        .02613
1                    .94355        .07578
To find the 100(1 - α)% = 99.9% simultaneous confidence intervals on f(x_1^{(l)}; θ) and f(x_2^{(l)}; θ), l = 1, 2, ..., 55, we consider Equation (3.9) with a_1 = a_2 = a = 3.75 and n_1 = n_2 = 5. These intervals are given in Table 8.
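The original computation used the IMSL routine DMLIN; the probability in (3.19) can equally well be approximated with any standard two-dimensional quadrature. A sketch assuming SciPy's dblquad in place of DMLIN:

```python
from math import gamma, pi
from scipy.integrate import dblquad

def h(t1, t2, m=44):
    """Bivariate central t density, (3.7) with p = 2."""
    const = gamma((m + 2) / 2) / ((m * pi) * gamma(m / 2))
    return const * (1 + (t1 * t1 + t2 * t2) / m) ** (-(m + 2) / 2)

def coverage(a, m=44):
    """P(|t1| <= a, |t2| <= a) of (3.19)."""
    val, _ = dblquad(h, -a, a, lambda _: -a, lambda _: a, args=(m,))
    return val

print(coverage(3.75))   # about .99894, i.e. alpha of about .00106
```

Searching over a until coverage(a) reaches the target 1 - α reproduces the choice a = 3.75.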
We now find the confidence region on (θ_1, θ_2). For the sake of simplicity, in this example we shall drop the superscript (l) and just use x_1 and x_2. We assume that x_1 < x_2 and let

$$\eta_1 = \theta_1 e^{\theta_2 x_1} = f(x_1; \theta) \,, \qquad \eta_2 = \theta_1 e^{\theta_2 x_2} = f(x_2; \theta) \tag{3.20}$$

Suppose c_1, d_1 are the lower and upper limits of the confidence interval on f(x_1; θ), and c_2, d_2 are the lower and upper limits of the confidence interval on f(x_2; θ). We solve the two equations in (3.20) for θ_1, θ_2 and obtain

$$\theta_1 = \eta_2^{x_1/(x_1-x_2)}\,\eta_1^{x_2/(x_2-x_1)} \,, \qquad \theta_2 = \frac{\ln \eta_2 - \ln \eta_1}{x_2 - x_1} \tag{3.21}$$

where ln is the natural logarithm. From (3.21) we obtain, by minimizing and maximizing each of θ_1 and θ_2 with respect to (η_1, η_2) over the rectangular region [c_1, d_1] × [c_2, d_2],

$$\theta_{1\min} = d_2^{x_1/(x_1-x_2)}\,c_1^{x_2/(x_2-x_1)} \,, \qquad \theta_{1\max} = c_2^{x_1/(x_1-x_2)}\,d_1^{x_2/(x_2-x_1)}$$

$$\theta_{2\min} = \frac{\ln c_2 - \ln d_1}{x_2 - x_1} \,, \qquad \theta_{2\max} = \frac{\ln d_2 - \ln c_1}{x_2 - x_1}$$
Values of θ_1min, θ_1max and θ_2min, θ_2max for each of the 55 pairs of simultaneous confidence intervals on η_1, η_2 are given in Table 8. From Table 8, if we take the largest of the lower bounds on θ_1 and on θ_2 as well as the smallest of the upper bounds on θ_1 and on θ_2, we obtain the following rectangular confidence region on θ, which has a confidence coefficient of at least 94.18 percent:

θ_1 ∈ [.45065, .96861]
θ_2 ∈ [-.41816, 1.01981]
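The inversion (3.21) and the extremes over [c_1, d_1] × [c_2, d_2] can be scripted directly; the sketch below (an added illustration) reproduces the first row of Table 8 (x_1 = 0, x_2 = .1):

```python
import math

def theta_bounds(x1, x2, c1, d1, c2, d2):
    """Extremes of (3.21) over the rectangle [c1,d1] x [c2,d2], x1 < x2."""
    e1, e2 = x1 / (x1 - x2), x2 / (x2 - x1)
    th1_min = d2**e1 * c1**e2
    th1_max = c2**e1 * d1**e2
    th2_min = (math.log(c2) - math.log(d1)) / (x2 - x1)
    th2_max = (math.log(d2) - math.log(c1)) / (x2 - x1)
    return (th1_min, th1_max), (th2_min, th2_max)

# First row of Table 8: intervals at x1 = 0 and x2 = .1.
print(theta_bounds(0, .1, .45065, 1.06257, .31703, .92894))
# -> ((.45065, 1.06257), (-12.094, 7.2335)), matching Table 8
```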
For comparison purposes we use the data in Table 6 to construct simultaneous confidence intervals on θ_1 and θ_2 by each of three other known methods. These methods are:
Table 8
99.9 Percent Confidence Regions on [f(x_1^{(l)}; θ), f(x_2^{(l)}; θ)], l = 1, 2, ..., 55, and at least 99.9 Percent Confidence Regions on [θ_1, θ_2] for Example 3.1

x_1  x_2   Confidence Region on [f(x_1; θ), f(x_2; θ)]   Confidence Region on [θ_1, θ_2]
0    .1    [.45065,1.06257], [.31703,.92894]    [.45065*,1.06257], [-12.094,7.2335]
0    .2    [.45065,1.06257], [.3969,1.00882]    [.45065,1.06257], [-4.9238,4.02918]
0    .3    [.45065,1.06257], [.40039,1.0123]    [.45065,1.06257], [-3.2533,2.69762]
0    .4    [.45065,1.06257], [.44401,1.05592]   [.45065,1.06257], [-2.1815,2.12369]
0    .5    [.45065,1.06257], [.61273,1.22465]   [.45065,1.06257], [-1.101,1.9992]
0    .6    [.45065,1.06257], [.58996,1.20187]   [.45065,1.06257], [-.98065,1.6349]
0    .7    [.45065,1.06257], [.52755,1.13946]   [.45065,1.06257], [-1.0003,1.32516]
0    .8    [.45065,1.06257], [.55578,1.1677]    [.45065,1.06257], [-.81008,1.19011]
0    .9    [.45065,1.06257], [.57341,1.18532]   [.45065,1.06257], [-.68539,1.07452]
0    1     [.45065,1.06257], [.63759,1.24951]   [.45065,1.06257], [-.51074,1.01981**]
.1   .2    [.31703,.92894], [.3969,1.00882]     [.09963,2.17417], [-8.5036,11.5754]
.1   .3    [.31703,.92894], [.40039,1.0123]     [.17742,1.41495], [-4.208,5.80496]
.1   .4    [.31703,.92894], [.44401,1.05592]    [.21229,1.1881], [-2.4607,4.0106]
.1   .5    [.31703,.92894], [.61273,1.22465]    [.22614,1.03079], [-1.0403,3.37354]
.1   .6    [.31703,.92894], [.58996,1.20187]    [.24236,1.01724], [-.90799,2.66529]
.1   .7    [.31703,.92894], [.52755,1.13946]    [.25615,1.02081], [-.91302,2.1322]
.1   .8    [.31703,.92894], [.55578,1.1677]     [.26315,.99967], [-.73381,1.86256]
.1   .9    [.31703,.92894], [.57341,1.18532]    [.26885,.98669], [-.60307,1.64847]
.1   1     [.31703,.92894], [.63759,1.24951]    [.27222,.96861**], [-.41816*,1.5239]
.2   .3    [.3969,1.00882], [.40039,1.0123]     [.06101,6.40426], [-9.2409,9.36293]
.2   .4    [.3969,1.00882], [.44401,1.05592]    [.14919,2.29209], [-4.1034,4.8924]
.2   .5    [.3969,1.00882], [.61273,1.22465]    [.18727,1.40661], [-1.662,3.75572]
.2   .6    [.3969,1.00882], [.58996,1.20187]    [.22809,1.31919], [-1.3412,2.76986]
.2   .7    [.3969,1.00882], [.52755,1.13946]    [.26030,1.30747], [-1.2966,2.10924]
.2   .8    [.3969,1.00882], [.55578,1.1677]     [.27699,1.23059], [-.99359,1.79849]
.2   .9    [.3969,1.00882], [.57341,1.18532]    [.29035,1.18553], [-.80706,1.56296]
.2   1     [.3969,1.00882], [.63759,1.24951]    [.29797,1.13143], [-.57353,1.43352]
.3   .4    [.40039,1.0123], [.44401,1.05592]    [.02133,11.9968], [-8.2414,9.69731]
.3   .5    [.40039,1.0123], [.61273,1.22465]    [.07435,2.14967], [-2.5103,5.58983]
.3   .6    [.40039,1.0123], [.58996,1.20187]    [.13339,1.737], [-1.79998,3.66398]
.3   .7    [.40039,1.0123], [.52755,1.13946]    [.18274,1.65044], [-1.6294,2.61467]
.3   .8    [.40039,1.0123], [.55578,1.1677]     [.21066,1.45062], [-1.1992,2.14069]
.3   .9    [.40039,1.0123], [.57341,1.18532]    [.23271,1.34504], [-.94732,1.80888]
.3   1     [.40039,1.0123], [.63759,1.24951]    [.24585,1.23411], [-.6604,1.62581]
.4   .5    [.44401,1.05592], [.61273,1.22465]   [.00767,9.31279], [-5.4424,10.1456]
.4   .6    [.44401,1.05592], [.58996,1.20187]   [.06059,3.38264], [-2.9106,4.97893]
.4   .7    [.44401,1.05592], [.52755,1.13946]   [.12637,2.66356], [-2.3131,3.14154]
.4   .8    [.44401,1.05592], [.55578,1.1677]    [.16883,2.00614], [-1.6045,2.41735]
.4   .9    [.44401,1.05592], [.57341,1.18532]   [.20241,1.72096], [-1.2212,1.96384]
.4   1     [.44401,1.05592], [.63759,1.24951]   [.22276,1.47806], [-.846078,1.72443]
.5   .6    [.61273,1.22465], [.58996,1.20187]   [.02110,47.2015], [-7.3035,6.73707]
.5   .7    [.61273,1.22465], [.52755,1.13946]   [.12993,10.0551], [-4.2108,3.10191]
.5   .8    [.61273,1.22465], [.55578,1.1677]    [.20917,4.56934], [-2.6334,2.14953]
.5   .9    [.61273,1.22465], [.57341,1.18532]   [.26858,3.1619], [-1.897,1.64959]
.5   1     [.61273,1.22465], [.63759,1.24951]   [.30047,2.35221], [-1.3054,1.42516]
.6   .7    [.58996,1.20187], [.52755,1.13946]   [.01136,168.052], [-8.234,6.58257]
.6   .8    [.58996,1.20187], [.55578,1.1677]    [.07608,12.154], [-3.8563,3.41368]
.6   .9    [.58996,1.20187], [.57341,1.18532]   [.14615,5.29023], [-2.4668,2.32571]
.6   1     [.58996,1.20187], [.63759,1.24951]   [.19140,3.11048], [-1.5848,1.87614]
.7   .8    [.52755,1.13946], [.55578,1.1677]    [.00203,173.484], [-7.1793,7.94551]
.7   .9    [.52755,1.13946], [.57341,1.18532]   [.03103,12.6057], [-3.4336,4.04765]
.7   1     [.52755,1.13946], [.63759,1.24951]   [.07055,4.41628], [-1.9353,2.87423]
.8   .9    [.55578,1.1677], [.57341,1.18532]    [.00129,345.366], [-7.112,7.57389]
.8   1     [.55578,1.1677], [.63759,1.24951]    [.02176,13.1361], [-3.0254,4.05065]
.9   1     [.57341,1.18532], [.63759,1.24951]   [.00052,314.366], [-6.2006,7.78915]

* is the largest value of the lower bound.
** is the smallest value of the upper bound.
1) the likelihood ratio method,
2) the method based on the asymptotic normality of the least squares estimator,
3) Hartley's method
(see Section 1.3).

The boundaries of the confidence regions on θ_1 and θ_2 obtained from the above three methods are ellipses. In order to compare our simultaneous confidence intervals on θ_1 and θ_2, found earlier, to those obtained by each of the above methods, we project the ellipses onto the θ_1 and θ_2 axes. The resulting intervals form simultaneous, but conservative, confidence intervals on θ_1 and θ_2. The results are given in Table 9.
Table 9
Simultaneous Confidence Intervals on θ_1 and θ_2 with a Confidence Coefficient ≥ .9418, Obtained from Projecting the 94.18 Percent Confidence Regions on (θ_1, θ_2) for Example 3.1

Method                                       θ_1 Lower   θ_1 Upper   θ_2 Lower   θ_2 Upper
Likelihood Ratio                             .585        .790        .09         .55
Asymptotic Normality of the LS Estimator     .602        .768        .129        .506
Hartley                                      .555        .820        .02         .63
We note that the proposed method provides wider confidence intervals than the other three methods. This is attributed to the conservative nature of the proposed method, caused by the use of the confidence intervals in (3.13), which are already conservative, followed by the use of Bonferroni's inequality, which leads to yet another set of conservative intervals as in (3.17).
Example 3.2. In this second example we construct simultaneous confidence intervals on θ_1, θ_2 for a nonlinear model with two input variables. Consider the model used by Draper and Smith (1981) for a certain chemical reaction, which can be described by the nonlinear model

$$y = \exp\{-\theta_1 x_1 \exp[-\theta_2(1/x_2 - 1/620)]\} + \epsilon \tag{3.22}$$

where y is the fraction of original material remaining, x_1 is the reaction time in minutes, and x_2 is the temperature in degrees Kelvin. The data are taken from page 520 of Draper and Smith (1981) and are given in Table 10. The data consist of two replications at x_1 = 120, x_2 = 612; eleven replications at x_1 = 60, x_2 = 620; and three replications at x_1 = 30, x_2 = 639. The mean (ȳ_i) and variance (s_i²) of the observed responses at each setting of x_1 and x_2 are given in Table 11.
Before proceeding any further, we test that the variances of the observed responses from the three sets of replications are not significantly different, because our method uses the pooled variance. To test the null hypothesis of equal variances, Bartlett's (1937) chi-square test is used (see also Brownlee (1965, p. 293)). The test statistic is

$$M = \left[(N - n)\ln s^2 - \sum_{i=1}^{n}(n_i - 1)\ln s_i^2\right]\Big/ C \tag{3.23}$$

where

$$C = 1 + \frac{1}{3(n-1)}\left[\sum_{i=1}^{n}\frac{1}{n_i - 1} - \frac{1}{N - n}\right]$$

s² = .00006222 is the pooled variance defined in (3.3), $N = \sum_{i=1}^{n} n_i = 16$, the variances s_1², s_2², s_3² are given in Table 11, and n is the number of distinct design points. Under the null hypothesis, M has an approximate chi-square distribution with n - 1 = 2 degrees of freedom. The value of M in (3.23) is equal to .988932. The value of chi-square with 2 degrees of freedom at the .05 level of significance is 5.99, which exceeds the value of M. Therefore, we do not reject the null hypothesis of equal variances.
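Bartlett's statistic (3.23) can be recomputed from the group sizes and variances in Table 11; a sketch (an added check, not the original program):

```python
import math

def bartlett_M(sizes, variances):
    """Bartlett's statistic (3.23) for homogeneity of variances."""
    n = len(sizes)                     # number of distinct design points
    N = sum(sizes)
    dofs = [ni - 1 for ni in sizes]
    s2 = sum(d * v for d, v in zip(dofs, variances)) / (N - n)  # pooled, (3.3)
    C = 1 + (sum(1 / d for d in dofs) - 1 / (N - n)) / (3 * (n - 1))
    return ((N - n) * math.log(s2)
            - sum(d * math.log(v) for d, v in zip(dofs, variances))) / C

# Group sizes and variances from Table 11.
print(bartlett_M([2, 11, 3], [0.000018, 0.000054, 0.000124]))  # about .989
```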
We have n = 3 distinct design points with two parameters. There are $\binom{3}{2} = 3$ pairs of functions of the form f(x_{i1}^{(l)}, x_{i2}^{(l)}; θ), i = 1, 2; l = 1, 2, 3. To obtain a confidence region on θ with a confidence coefficient of at least 95 percent, we need α to satisfy 1 - 3α ≥ .95, thus α ≤ .0167. By computing the integrals in (3.19) for several assignments of a we obtain a = 3.15, which makes 1 - α = .984949706 and 1 - 3α = .954849117. The 98.49 percent simultaneous confidence intervals on f(x_{11}^{(l)}, x_{12}^{(l)}; θ) and f(x_{21}^{(l)}, x_{22}^{(l)}; θ), l = 1, 2, 3, can be obtained from Equation (3.9) with p = 2, m = 13, n_1 = 2, n_2 = 11, n_3 = 3, and ȳ_i, s_i² as in Table 11. The results are given in Table 12.
Again, in this example we shall drop the superscript (l) and just use x_{11}, x_{12}, x_{21}, and x_{22}.
Table 10
Observed Responses and Design Points for Example 3.2

Design Points (x_1, x_2)    Observed Responses (y)
Table 11
Means and Variances of Observed Responses at Each Setting of the Design Points for Example 3.2

x_1    x_2    Mean (ȳ_i)   Variance (s_i²)
120    612    .78800        .000018
 60    620    .79627        .000054
 30    639    .65067        .000124
To find simultaneous confidence intervals on θ_1, θ_2, let

$$\begin{aligned} \eta_1 &= f(x_{11}, x_{12}; \theta) = \exp\{-\theta_1 x_{11}\exp[-\theta_2(1/x_{12} - 1/620)]\}\\ \eta_2 &= f(x_{21}, x_{22}; \theta) = \exp\{-\theta_1 x_{21}\exp[-\theta_2(1/x_{22} - 1/620)]\} \end{aligned} \tag{3.24}$$

Let c_1, d_1 be the lower and upper limits of the confidence interval on f(x_{11}, x_{12}; θ), and c_2, d_2 be the lower and upper limits of the confidence interval on f(x_{21}, x_{22}; θ). We solve the two equations in (3.24) for θ_1 and θ_2 and obtain

$$\begin{aligned} \theta_1 &= (-\ln \eta_1)\exp[\theta_2(1/x_{12} - 1/620)]/x_{11}\\[4pt] \theta_2 &= \frac{\ln(-\ln \eta_1) - \ln(-\ln \eta_2) - \ln x_{11} + \ln x_{21}}{(1/x_{22} - 1/620) - (1/x_{12} - 1/620)} \end{aligned} \tag{3.25}$$
We now find the maximum and minimum values of θ_1 and θ_2 of the form (3.25) over the region [c_1, d_1] × [c_2, d_2], where η_1 ∈ [c_1, d_1] and η_2 ∈ [c_2, d_2]. The results are also given in Table 12. The largest of the lower bounds on θ_1 and θ_2 and the smallest of the upper bounds on θ_1 and θ_2 from Table 12 form a confidence region on θ with a confidence coefficient of at least 95.48 percent, given by

θ_1 ∈ [.00364, .00392]
θ_2 ∈ [26558.7, 29617.4]
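The solution (3.25) can be checked numerically; the sketch below (an added illustration) evaluates it at the sample means of Table 11 for the pair (x_{11}, x_{12}) = (120, 612), (x_{21}, x_{22}) = (60, 620), which yields values inside the region above:

```python
import math

def theta_from_eta(eta1, eta2, x11, x12, x21, x22, ref=620.0):
    """Inversion (3.25): solve the two equations in (3.24) for theta."""
    num = (math.log(-math.log(eta1)) - math.log(-math.log(eta2))
           - math.log(x11) + math.log(x21))
    theta2 = num / ((1 / x22 - 1 / ref) - (1 / x12 - 1 / ref))
    theta1 = -math.log(eta1) * math.exp(theta2 * (1 / x12 - 1 / ref)) / x11
    return theta1, theta2

# Sample means from Table 11 as a plausibility check (not a bound):
print(theta_from_eta(.78800, .79627, 120, 612, 60, 620))
# -> roughly (.0038, 30700), consistent with Table 12
```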
For comparison purposes we use the data in Table 10 to construct simultaneous confidence intervals on θ_1, θ_2 by each of the same three methods as in Example 3.1. The boundaries of these confidence regions are ellipses. The simultaneous confidence intervals on θ_1 and θ_2 can be obtained by projecting the ellipses onto the θ_1 and θ_2 axes. These confidence intervals are conservative and are given in Table 13. Our method compares well with the other three methods in terms of the ranges of the parameters.
Table 12
98.49 Percent Confidence Regions on [f(x_{11}^{(l)}, x_{12}^{(l)}; θ), f(x_{21}^{(l)}, x_{22}^{(l)}; θ)], l = 1, 2, 3, and at least 98.49 Percent Confidence Regions on [θ_1, θ_2] for Example 3.2

(x_{11}, x_{12})  (x_{21}, x_{22})  Confidence Region on [f(x_{11},x_{12};θ), f(x_{21},x_{22};θ)]   Confidence Region on [θ_1, θ_2]
(120, 612)   (60, 620)   [.77043,.80557], [.78878,.80376]   [.00364*,.00395], [24470.3,37234.9]
(120, 612)   (30, 639)   [.77043,.80557], [.63632,.66501]   [.00334,.00392**], [26558.7*,30762]
( 60, 620)   (30, 639)   [.78878,.80376], [.63632,.66501]   [.00364,.00395], [25753.9,29617.4**]

* is the largest value of the lower bound.
** is the smallest value of the upper bound.
Table 13
Simultaneous Confidence Intervals on θ_1 and θ_2 with a Confidence Coefficient ≥ .9548, Obtained from Projecting the 95.48 Percent Confidence Regions on (θ_1, θ_2) for Example 3.2

Method                                       θ_1 Lower   θ_1 Upper   θ_2 Lower   θ_2 Upper
Likelihood Ratio                             .00361      .00389      26768       29393
Asymptotic Normality of the LS Estimator     .00363      .00386      27006.3     29158.1
Hartley                                      .00356      .00388      27100       30950
3.5 Discussion
Our confidence region on the parameters is conservative; that is, it has a confidence level of at least 100(1 - qα) percent, where $q = \binom{n}{p}$, n is the number of distinct design points, and p is the number of parameters. This may explain why, in Example 3.1, our method provides wider confidence intervals than the methods of likelihood ratio, asymptotic normality of the least squares estimator, and Hartley. The advantage of the proposed method is that it does not
require specifying the initial estimates of the parameters, unlike the
likelihood ratio and asymptotic normality methods. Furthermore, in
comparison with Hartley's method, the proposed method uses the true
response function rather than the linearized one. In Example 3.2 our
method did quite well by comparison with the other three methods.
The conservative nature of the proposed method means that it guarantees a percent coverage of the parameter vector which is no less than a preset value. It would therefore be desirable to determine how much larger the true percent coverage is than the preset value. Such a determination could help reduce the size of the confidence region obtained by this method; further investigation of this is needed.
CHAPTER FOUR
CONCLUSIONS AND RECOMMENDATIONS
A design criterion for parameter estimation which is not dependent
on the parameter values in a nonlinear model as well as a method for
obtaining a confidence region on the parameter vector in a nonlinear
model have been proposed.
In Chapter Two approximations to the true mean response functions
with a Lagrange interpolating polynomial (for the case of two input
variables) and spline interpolating polynomials (for the case of a
single input variable) were constructed. Each approximation was used to
obtain a parameter-free design on the basis of a modification of the D-optimality criterion which was proposed by Khuri (1982). This design
is found to depend on the size of the error associated with the
approximation of the true mean response. If the error of an
approximation is large, the design determined on the basis of our
procedure may be inefficient.
It is unfortunate that, at the present stage, no computer algorithm
is available for the determination of the optimal design based on the
Lagrange interpolating polynomial for the case of two (or more) input
variables when the number of design points is large. It is hoped that a
sequential procedure for the generation of design points can reduce the
computational difficulty to a manageable level. This will be
investigated in future research (see Example 2.1).
The reason that we chose spline interpolating polynomials to
approximate the mean response function is because of the nice properties
of the splines as approximating functions (Wold (1974)). These
functions seem to be extremely suitable for interpolating or
approximating functions; to quote de Boor and Rice (1968, p. 1): "Spline
functions are the most successful approximating functions in use
today. They combine ease of handling in a computer with great
flexibility, and are therefore particularly suited for the approximation
of experimental data."
Our attention has centered on the construction of an optimal design
based on spline interpolating polynomials for the case of a single input
variable. Extensions of the procedure for constructing optimal designs
using spline interpolating polynomials to more than one input variable
cause no difficulties in terms of methodology. However, in order to
calculate the design point locations, we may have the same difficulty as
in Example 2.1 if the number of design points is large.
Our design criterion was developed under the assumption that the
values of all the parameters are of equal interest to the
experimenter. It is sometimes the case, however, that the experimenter
is more interested in the values of certain parameters than the values
of others. The extension of our design criterion to obtain the design
of experiments for subsets of parameters requires further research.