Title: Parameter-free designs and confidence regions for nonlinear models
Full Citation
Permanent Link: http://ufdc.ufl.edu/UF00098622/00001
 Material Information
Title: Parameter-free designs and confidence regions for nonlinear models
Physical Description: viii, 105 leaves : ill. ; 28 cm.
Language: English
Creator: Bumrungsup, Chinnaphong, 1951-
Copyright Date: 1984
 Subjects
Subject: Parameter estimation (lcsh)
Confidence intervals (lcsh)
Statistics (lcsh)
Statistics thesis, Ph.D.
Dissertations, Academic -- Statistics -- UF
Genre: bibliography (marcgt)
non-fiction (marcgt)
 Notes
Statement of Responsibility: by Chinnaphong Bumrungsup.
Thesis: Thesis (Ph. D.)--University of Florida, 1984.
Bibliography: Bibliography: leaves 101-104.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00098622
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000468635
oclc - 11628537
notis - ACN3321


Full Text














PARAMETER-FREE DESIGNS AND
CONFIDENCE REGIONS FOR NONLINEAR MODELS





By

CHINNAPHONG BUMRUNGSUP


A DISSERTATION PRESENTED TO THE
GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

















Dedicated to


my late father

Mr. Siri Bumrungsup



and



my late mother

Mrs. Varanee Bumrungsup




















ACKNOWLEDGMENTS

I would like to express my sincerest appreciation to the chairman

of my supervisory committee, Dr. Andre Khuri, for his tireless efforts,

guidance, suggestions, encouragement, friendship, and help throughout my

graduate program and preparation of this dissertation. I am also

grateful to Dr. Arun Varma, a member of the supervisory committee, for

his helpful and valuable comments and suggestions. The contributions of

Dr. John Cornell, Dr. Frank Martin, and Dr. Gerald Mott as members of

the supervisory committee are acknowledged. Thanks are also extended to

Dr. Michael Conlon for programming assistance and to Dr. Kim Sawyer for

reading this dissertation.

I am most grateful to Dr. Richard Scheaffer for his

understanding and assistance in matters related to my graduate program.

I would like to express sincere thanks to Thammasat University and

the Thai Government for granting the leave of absence required for the

program. In addition, I am grateful to the Rockefeller Foundation and

the Department of Statistics for providing financial support for my

graduate program.

I am thankful to my family for their constant support: my wife

Kulvadee, for her patience, love and understanding, my daughter Aininta












who has been a constant source of joy, my brother Thanin, and my sisters

Manida and Arirak who were always there with love and encouragement.

Special thanks are due to Ms. Carole Boone who spent many long

hours typing this manuscript.





















TABLE OF CONTENTS

ACKNOWLEDGMENTS

ABSTRACT

CHAPTER

ONE INTRODUCTION

    1.1 Nonlinear Models
    1.2 Design of Experiments for Nonlinear Models
    1.3 Confidence Regions for Parameters in Nonlinear Models
        1.3.1 The Likelihood Ratio Method
        1.3.2 Asymptotic Normality of the Least Squares Estimator
        1.3.3 Hartley's Method
    1.4 The Purpose of This Research

TWO DESIGNS FOR NONLINEAR MODELS

    2.1 Introduction
    2.2 Approximation and Interpolation of Functions
        2.2.1 Lagrange Interpolation
        2.2.2 Spline Interpolation
            2.2.2.1 Linear Spline
            2.2.2.2 Quadratic Spline
            2.2.2.3 Cubic Spline
    2.3 Optimal Designs
        2.3.1 Designs Obtained through Lagrangian Interpolation of
              Nonlinear Models in the Case of Two Input Variables
            2.3.1.1 Lagrangian Interpolation of the Mean Response
            2.3.1.2 Designs Based on Lagrangian Interpolation
        2.3.2 Designs Obtained through Spline Interpolation of
              Nonlinear Models
            2.3.2.1 Spline Interpolation of the Mean Response
            2.3.2.2 Designs Based on Spline Interpolation
    2.4 Examples
    2.5 Discussion

THREE CONFIDENCE REGIONS FOR THE PARAMETERS IN NONLINEAR MODELS

    3.1 Introduction
    3.2 Method for Obtaining Confidence Region on Parameters
    3.3 Confidence Region on θ
    3.4 Examples
    3.5 Discussion

FOUR CONCLUSIONS AND RECOMMENDATIONS

APPENDICES

    1 PROOF OF INEQUALITY (2.78)
    2 PROOF OF INEQUALITY (2.81)
    3 BOX AND LUCAS DESIGN

REFERENCES

BIOGRAPHICAL SKETCH




















Abstract of Dissertation Presented to the
Graduate School of the University of Florida in
Partial Fulfillment of the Requirements for the
Degree of Doctor of Philosophy

PARAMETER-FREE DESIGNS AND
CONFIDENCE REGIONS FOR NONLINEAR MODELS

By

Chinnaphong Bumrungsup

August 1984

Chairman: Andre I. Khuri
Major Department: Statistics

The choice of an optimal design for parameter estimation in a

nonlinear model depends on the values of the model's parameters. This

is an undesirable feature since such a design is used to estimate the

same parameter values which are needed for its derivation. To overcome

this difficulty a method is developed for the construction of a

nonlinear design which does not depend on the nonlinear model's

parameters. Such a design is called a parameter-free design. This

method is an extension of A. I. Khuri's modified D-optimality criterion

(Parameter-Free Designs for Nonlinear Models. Technical Report No. 188,

Department of Statistics, University of Florida, Gainesville, Florida,

1982). It is based on using a proper approximation of the true mean

response function described in the nonlinear model with either Lagrange

or spline interpolating polynomials. Each approximation will be used to












derive a parameter-free design. Optimal designs obtained on the basis

of these interpolating polynomials depend on the size of the error

associated with the approximation of the true mean response.

A method for the construction of a confidence region on the vector

of parameters in a nonlinear model is also developed. Unlike most other

methods which are available for that purpose, this method does not

require specifying initial estimates of the parameters. The

aforementioned confidence region can be used to obtain simultaneous

confidence intervals on the nonlinear model's parameters. These

intervals, however, are conservative in the sense that their joint

confidence coefficient cannot be less than a preset value.




















CHAPTER ONE
INTRODUCTION



1.1 Nonlinear Models

To understand what we mean by the term "nonlinear model" we shall

first make the following definition: a linear model is one in which

the parameters, the quantities to be estimated, appear linearly. Such a

model can be represented by



y = β_0 + β_1 z_1 + β_2 z_2 + ... + β_p z_p + ε (1.1)

where β_0, β_1, ..., β_p are unknown parameters, the z_i are any functions

of the input or independent variables x_1, ..., x_k, y is the dependent or

response variable, and ε is an unobservable random error.

Any model which is not linear in the parameters will be called a

nonlinear model. Examples of two such models are



y = exp(θ_1 + θ_2 x + ε) (1.2)

y = [θ_1/(θ_1 − θ_2)][exp(−θ_2 x) − exp(−θ_1 x)] + ε (1.3)












The models (1.2) and (1.3) are both nonlinear in the parameters θ_1 and

θ_2. But the model (1.2) can be transformed so that the parameters

appear in linear fashion. If we transform the model by taking the

logarithm to the base e of the quantities on both sides of the equality

sign, we obtain

ln y = θ_1 + θ_2 x + ε (1.4)

which is of the form (1.1) and is linear in the parameters. We say that

model (1.2) is an intrinsically linear model because it can be expressed

in the linear form (1.1) by a suitable transformation.

However, it is impossible to transform the model (1.3) into the

linear form (1.1). Such a model is said to be intrinsically

nonlinear. Throughout this dissertation all models mentioned will be

intrinsically nonlinear. This research is concerned with the design of

experiments for parameter estimation and the construction of confidence

regions for the parameters in nonlinear models.



1.2 Design of Experiments for Nonlinear Models

Consider the nonlinear model



y = f(x; θ) + ε (1.5)

where y is the measured response, f(x; θ) is the true mean response at

the point x, where x is the vector of k input variables (x_1, ..., x_k)',

θ is the vector of p parameters (θ_1, θ_2, ..., θ_p)', and ε is a random

error.

When there are n observations of the form y_1, y_2, ..., y_n, with y_i

being obtained at the design point x_i, i = 1, 2, ..., n, then we can

write model (1.5) in the form

y_i = f(x_i; θ) + ε_i (1.6)

where x_i = (x_i1, x_i2, ..., x_ik)'. The program of n experiments may be

represented by the n×k design matrix D = [x_ij]. The j-th member of the

i-th row of this matrix gives the level of the j-th variable for the

i-th experiment. We assume that the random errors ε_1, ε_2, ..., ε_n are

independently distributed as N(0, σ²).

The design criterion for parameter estimation in a nonlinear model

suggested by Chernoff (1953) is to choose the experimental design that

minimizes the trace of [h(x, θ) h'(x, θ)]⁻¹, where h(x, θ) h'(x, θ)/σ² is

Fisher's information matrix of order p×p and h'(x, θ) is an n×p

derivative matrix with ij-th entry ∂f(x_i; θ)/∂θ_j. The problem is that

the Fisher information matrix depends on the unknown parameters θ, and

one cannot determine the optimal design setting for estimating θ without

knowing the actual value of θ. Chernoff suggested to consider locally

optimal designs and to minimize the trace of [h(x, θ_0) h'(x, θ_0)]⁻¹,

where θ_0 is the value of θ in the neighborhood of which f(x_i; θ) is

assumed to be approximately linear in θ for i = 1, 2, ..., n. Hamilton

(1982) pointed out that this criterion is not invariant under changes of













scale of the parameters. Box and Lucas (1959) suggested choosing the

experimental design to maximize the determinant |h(x, θ_0) h'(x, θ_0)|

with respect to the experimental variable settings where n, the number

of experiments, is equal to p, the number of parameters. This parameter

estimation criterion was called D-optimality by Kiefer and Wolfowitz

(1959). Unlike the minimum trace criterion, the D-optimal design is

invariant to scale changes of the parameters (see Cochran (1973) and

Hamilton (1982)).

Atkinson and Hunter (1968) extended the Box and Lucas results to

cases where n is greater than p. In some cases when n is a multiple of

p, the optimal design simply results in n/p replications of the points

which would be given by the n=p Box and Lucas approach. However,

Atkinson and Hunter gave a counterexample to show that this is not

always true. M. J. Box (1968, 1970, 1971) gave a number of additional

results, particularly on the subject of replicated design points.

Several applications and theoretical extensions to D-optimal

designs for nonlinear models have been proposed. Reviews of these are

given by Cochran (1973) and St. John and Draper (1975).

In some cases it may be true that initially θ_1, say, will be known

more precisely than θ_2 at the outset of experimentation. For example,

suppose we know the standard deviations of their prior distributions to be

one and ten, respectively. Draper and Hunter (1967a and 1967b)

developed a procedure for the design of experiments for parameter

estimation when such prior information is available.













For the nonlinear model, it is necessary to have initial estimates

of the parameters in order to apply the previously mentioned design

criteria. If these initial estimates are poor, the efficiency of the

design may be greatly reduced since the design criterion is a function

of the parameter values that are employed. A more efficient approach

would be to treat θ_0 as an initial estimate in a sequential procedure

which starts by selecting a p-point optimal design on the basis of this

estimate. A series of p experiments is then performed at the optimal

design settings and an estimate of θ is obtained.

experiments can be designed one at a time, using the current best

estimates of the parameters, which can be recalculated after each

experiment. This sequential procedure has been discussed by Box and

Hunter (1965) and they showed how to add points one at a time.

M. J. Box (1971) added sequentially sets of p points each. At each

stage a new estimate of θ is obtained and used to generate the next set

of design points. This sequential procedure continues until the values

of the parameter estimates stabilize along with the values of the

coordinates of the design points.

The sequential procedure described above can help improve the

efficiency of the design but it does not eliminate the dependency of the

design on the parameter's values. Furthermore, Atkinson and Hunter

(1968) pointed out that there are situations in which it is not feasible

to proceed sequentially.

Zacks (1977) considered the design problem in a Bayesian framework

which is constructed so that the expected value of the determinant of













Fisher's information matrix with respect to some prior distribution of θ

is maximized. Bayes sequential designs can also be obtained. In this

case an optimal design for a new stage of experimentation is determined

by maximizing the expected value of Fisher's information determinant

with respect to the posterior distribution of θ given the results from

the previous stages. The Bayesian approach (sequential and

nonsequential) eliminates the problem of dependency of the optimal

design on θ, but it introduces dependency on the prior distribution.

Hamilton (1982) introduced a design criterion based on the second

order approximation to the volume of the confidence region of the

parameters in nonlinear models. This criterion is more efficient than

the D-optimality criterion since the latter is based on the first order

approximation of the above-mentioned volume. However, Hamilton's

criterion requires good initial estimates of the parameters as well as

of the error variance.

Khuri (1982) considered two types of designs for parameter

estimation which do not depend on the unknown parameters of the

nonlinear model. The first type is called a minimax D-optimal design.

This design is conservative and may be inefficient as pointed out by

Khuri. The second type of design is more efficient than the minimax D-

optimal one. To obtain this second type, an adequate approximation to

the true mean response with a Lagrange interpolating polynomial was

constructed. This approximation was used to derive a parameter-free

design on the basis of a certain modification of the D-optimality

criterion. Khuri noted that this design depends on the size of the













error associated with the approximation of the true mean response. Only

a nonlinear model with a single input variable was considered.

In this dissertation we shall extend Khuri's second design to the

cases where we approximate the mean response function with a Lagrange

interpolating polynomial (for the case of two input variables) and with

spline interpolating polynomials (for the case of a single input

variable).



1.3 Confidence Regions for Parameters
in Nonlinear Models

Beale (1960) introduced approximate confidence regions for the

nonlinear parameters based on the likelihood ratio criterion when the

sample size is large. Williams (1962) presented a procedure for

obtaining what are described as fiducial intervals and regions for the

nonlinear model's parameters. He considered the nonlinear model



y_i = α + β f(x_i; γ) + ε_i

where γ is the only nonlinear parameter. Williams' procedure is

applicable for a model with only one parameter appearing nonlinearly.

Halperin (1963) extended Williams' results to a larger class of

nonlinear models with more than one parameter appearing nonlinearly. He

showed that the fiducial intervals and regions discussed by Williams are

also confidence intervals and regions. Hartley (1964) suggested a

procedure for obtaining an exact confidence region about the parameters

in a general nonlinear model. A method for constructing an













approximation of the true mean response was described. This

approximation was used to obtain exact confidence regions for the

parameters. Sundararaj (1978) considered a method for the construction

of joint confidence regions on a vector of parameters in a nonlinear

model. A set of p random variables was constructed, each of which is a

least squares equation for estimating the parameters, where p is the

number of parameters. These p random variables were used to construct

the joint confidence region on the parameters.

Several methods for determining confidence regions for the

parameters of nonlinear models have been proposed. Reviews of these are

given by Gallant (1976) and Shaw and Griffiths (1979). We shall

describe briefly some of the methods for obtaining the confidence

regions on the parameters in nonlinear models. First we define the

error sum of squares for the nonlinear model given in (1.6) as

S(θ) = Σ_{i=1}^n [y_i − f(x_i; θ)]² (1.7)

We shall denote the least squares estimate of θ by θ̂.



1.3.1 The Likelihood Ratio Method

This method defines the confidence region for θ by the expression

[S(θ) − S(θ̂)]/S(θ̂) ≤ [p/(n−p)] F(p, n−p, α)

which provides approximate 100(1−α) percent confidence regions for θ,

where F(p, n−p, α) is obtained from a table of the F-distribution

corresponding to a significance level α and numerator and denominator

degrees of freedom p and n−p, respectively (see Draper and Smith (1981,

p. 504) and Ratkowsky (1983, p. 30)).
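As an illustration of this method, the following sketch tests whether a candidate θ lies in the approximate 100(1−α) percent likelihood ratio region. The model function and data shown (model (1.3) with assumed parameter values) are illustrative assumptions only, not results from this dissertation.

```python
# Sketch: membership test for the likelihood ratio confidence region.
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import f as f_dist

def S(theta, x, y, f):
    return np.sum((y - f(x, theta))**2)      # error sum of squares (1.7)

def in_lr_region(theta, theta_hat, x, y, f, alpha=0.05):
    n, p = len(y), len(theta)
    crit = p / (n - p) * f_dist.ppf(1 - alpha, p, n - p)
    return (S(theta, x, y, f) - S(theta_hat, x, y, f)) / S(theta_hat, x, y, f) <= crit

# Illustration with the intrinsically nonlinear model (1.3):
def f(x, th):
    return th[0] / (th[0] - th[1]) * (np.exp(-th[1] * x) - np.exp(-th[0] * x))

rng = np.random.default_rng(0)
x = np.linspace(0.5, 10.0, 12)
y = f(x, (0.7, 0.2)) + rng.normal(0.0, 0.01, x.size)
theta_hat = least_squares(lambda th: y - f(x, th), x0=(1.0, 0.5)).x
print(in_lr_region((0.7, 0.2), theta_hat, x, y, f))
```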


1.3.2 Asymptotic Normality of the Least Squares Estimator

The 100(1−α) percent confidence region for θ can be defined as (see

Gallant (1975a))

(θ − θ̂)' C⁻¹ (θ − θ̂)/S(θ̂) ≤ [p/(n−p)] F(p, n−p, α)

where C = [h(x, θ̂) h'(x, θ̂)]⁻¹ and h(x, θ) h'(x, θ)/σ² is Fisher's

information matrix as defined in Section 1.2.


1.3.3 Hartley's Method

Hartley (1964) developed a method of constructing a so-called

"essentially linear" approximation to the nonlinear model f(x_i; θ) of

the form

f(x_i; θ) = Σ_{t=1}^p w_t(θ) u_it,

where the w_t(θ) are continuous functions of θ and the n×p matrix U of the

u_it has rank p and does not depend on θ. An exact 100(1−α) percent

confidence region for θ is defined as

reg SS/res SS ≤ [p/(n−p)] F(p, n−p, α)

where

reg SS = (U'ε)' (U'U)⁻¹ (U'ε),

res SS = ε'ε − reg SS,

and

ε = (ε_1, ..., ε_n)'.





1.4 The Purpose of This Research

The choice of an optimal design for parameter estimation in a

nonlinear model depends on the values of the model's parameters. The

statistical literature suggests several procedures for obtaining the

designs for parameter estimation as was described in Section 1.2. In

most cases, it is necessary to have initial estimates of the parameters

in order to apply the design criterion. If the estimates upon which the

design is based are poor, the design may be inefficient.

To overcome this difficulty we present a design criterion for

parameter estimation which does not depend on the nonlinear model's

parameters. Our design criterion for parameter estimation, which is an

extension of Khuri's (1982) procedure, is presented in Chapter Two.

Such a design is called a parameter-free design. The approximations to













the true mean response function given in (1.5) will involve a Lagrange

interpolating polynomial (for the case of two input variables) and the

spline interpolating polynomials (for the case of a single input

variable) in order to obtain these parameter-free designs. Chapter

Three contains a method for the construction of confidence regions on

the parameters in nonlinear models. This method does not require

specifying initial estimates of the parameters. Using the confidence

intervals on the mean response functions in conjunction with

Bonferroni's inequality, a conservative confidence region on the

nonlinear model's parameters is constructed. In the final chapter, we

make some concluding comments concerning our methods.





















CHAPTER TWO
DESIGNS FOR NONLINEAR MODELS



2.1 Introduction

In this chapter we introduce techniques for constructing the

polynomial approximations of a nonlinear model. Each approximation will

be used to obtain a design for parameter estimation which does not

depend on the parameter vector θ. We shall restrict our consideration

to the cases of one and two input variables.

Consider the nonlinear model



y = f(x; θ) + ε (2.1)

where x = (x_1, x_2, ..., x_k)' is a vector of k input variables, θ =

(θ_1, θ_2, ..., θ_p)' is a vector of p unknown parameters, and ε is a

random error. To estimate the elements of θ, a series of n experiments

(n ≥ p) is performed. In each, a value of the response is observed at a

particular setting of x. The program of n such settings is represented

by the n×k design matrix, D = [x_ij], i = 1, 2, ..., n; j = 1, 2, ...,

k. We assume that the random errors ε_1, ε_2, ..., ε_n are independently

distributed as N(0, σ²).













We now discuss the problem of finding the polynomial approximations

to the given nonlinear models of one and two input variables.



2.2 Approximation and Interpolation of Functions

One of the simplest ways to approximate a function defined on an

interval is to obtain a polynomial which takes the same values as the

function at a certain number of points in the domain of the function.

This procedure is called interpolation. Szidarovszky and Yakowitz

(1978) pointed out that polynomials provide a simple way to approximate

or interpolate functions. By way of illustration, let us state a well-

known theorem of Weierstrass.



Theorem 2.1 (The Weierstrass Approximation Theorem)

Given any interval [a,b], any real number δ > 0, and any real-

valued continuous function f(x) on [a,b], there exists a polynomial p(x)

such that

|f(x) − p(x)| < δ for all x ∈ [a,b].

This result may be proved in a number of ways: see, for example, Davis

(1963) and Rivlin (1969).

In this dissertation we shall center our attention on the

approximation of a nonlinear model by a Lagrange interpolating

polynomial (for the case of two input variables) and by spline

interpolating polynomials (for the case of a single input variable).

Each approximation will be used to obtain a parameter-free design on

the basis of a modification of the D-optimality criterion which was

proposed by Khuri (1982).



2.2.1 Lagrange Interpolation

One-Dimensional Problem

The Lagrange polynomials are among the simplest and most practical

of the interpolating polynomials. Given m+1 distinct real numbers

a ≤ y_0 < y_1 < ... < y_m ≤ b, let g(x) be a polynomial of degree m or

less for which

g(y_i) = f(y_i), i = 0, 1, 2, ..., m. (2.2)

This polynomial can be written in Lagrange form

g(x) = Σ_{i=0}^m f(y_i) l_i(x) (2.3)

where

l_i(x) = Π_{j=0, j≠i}^m (x − y_j)(y_i − y_j)⁻¹, i = 0, ..., m, (2.4)

is the i-th Lagrange polynomial of degree m (see, for example, Young and

Gregory (1972, Ch. 6), and deBoor (1978, Ch. 1 and 2)). We note that

l_i(y_j) = 0 if i ≠ j and l_i(y_i) = 1. The points y_0, y_1, ..., y_m are













called interpolating points and the polynomial g(x) is called the Lagrange

interpolating polynomial of degree m. Such a polynomial always exists

and is unique, as stated in the following theorem.

Theorem 2.2

There exists a unique polynomial g(x) of the form (2.3) for which

g(y_0) = f(y_0), g(y_1) = f(y_1), ..., g(y_m) = f(y_m).

The proof of this theorem can be found in Prenter (1975), and

Szidarovszky and Yakowitz (1978).

The error resulting from the use of Lagrange interpolation is, by

definition, the difference between f(x) and g(x), where g(x) is the

Lagrange polynomial of degree m interpolating f(x) at the m+1 distinct

real numbers y_0, y_1, ..., y_m. We assume that f(x) has continuous

derivatives up to order m+1 with respect to x over the interval

a ≤ x ≤ b. Then

e(x) = f(x) − g(x) = Π_{i=0}^m (x − y_i) f^(m+1)(ξ)/(m+1)! (2.5)

where a < ξ < b, x ∈ [a,b] and f^(m+1)(x) = ∂^{m+1} f(x)/∂x^{m+1} (see, for

example, Prenter (1975, p. 36) or deBoor (1978, p. 8)). As ξ is not

explicitly known, formula (2.5) cannot be used for the calculation of an

exact value of e(x) but, as we shall now see, it can be used for

obtaining an upper bound for the absolute value of e(x). If we let

w(y_0, y_1, ..., y_m) = max_{a≤x≤b} |Π_{i=0}^m (x − y_i)|,











an upper bound for |e(x)| is given by

max_{a≤x≤b} |e(x)| ≤ w(y_0, ..., y_m) max_{a≤x≤b} |f^(m+1)(x)|/(m+1)! (2.6)

Notice that the interpolation points influence the size of the upper

bound on the error through the function w(y_0, y_1, ..., y_m). The

function w(y_0, y_1, ..., y_m) can be made small by a suitable choice of the

points y_i. In fact the values of y_i which minimize the function

w(y_0, y_1, ..., y_m) can be obtained by using the following result given

in deBoor (1978, p. 26-31): the function w(y_0, y_1, ..., y_m) in (2.6)

attains a minimum value of 2(b − a)^{m+1}/4^{m+1} if the interpolation points

are chosen as the zeros of the Chebyshev polynomial of the first kind

and of degree m+1 for the interval a ≤ x ≤ b, that is, by the formula

y_i = (a + b)/2 + [(a − b)/2] cos[(i + 1/2)π/(m+1)], i = 0, ..., m. (2.7)

From (2.6) and the fact that the minimum value of w(y_0, y_1, ..., y_m) is

2(b − a)^{m+1}/4^{m+1} we obtain

max_{a≤x≤b} |e(x)| ≤ 2(b − a)^{m+1} max_{a≤x≤b} |f^(m+1)(x)|/[4^{m+1} (m+1)!]. (2.8)












Two-Dimensional Problem

We now look at the problem of approximating functions of two

variables with Lagrange polynomials in two variables. Suppose that we

are given a rectangular region

R²_x = {(x_1, x_2): a_1 ≤ x_1 ≤ b_1, a_2 ≤ x_2 ≤ b_2}.

Let y_0 < y_1 < ... < y_m1 and z_0 < z_1 < ... < z_m2 be partitions of the

intervals [a_1,b_1] and [a_2,b_2], respectively, where y_0, y_1, ..., y_m1

and z_0, z_1, ..., z_m2 are m1+1 and m2+1 distinct real numbers. The set

of points

H = {(y_i, z_j) : 0 ≤ i ≤ m1, 0 ≤ j ≤ m2}

partitions R²_x into a family of subrectangles

R_ij : y_{i−1} ≤ x_1 ≤ y_i; z_{j−1} ≤ x_2 ≤ z_j

(i = 1, ..., m1; j = 1, ..., m2; y_0 = a_1, y_m1 = b_1, z_0 = a_2, z_m2 = b_2).


Let f(x_1, x_2) be a function defined on R²_x and let g(x_1, x_2) be a

polynomial of degree m1 in x_1 and m2 in x_2 such that

g(y_i, z_j) = f(y_i, z_j), 0 ≤ i ≤ m1, 0 ≤ j ≤ m2. (2.9)

This polynomial can be written in Lagrange form

g(x_1, x_2) = Σ_{i=0}^{m1} Σ_{j=0}^{m2} f(y_i, z_j) l_ij(x_1, x_2) (2.10)

where

l_ij(x_1, x_2) = l_i(x_1) l_j(x_2), (2.11)

l_i(x_1) = Π_{k=0, k≠i}^{m1} (x_1 − y_k)(y_i − y_k)⁻¹, i = 0, 1, ..., m1,

l_j(x_2) = Π_{u=0, u≠j}^{m2} (x_2 − z_u)(z_j − z_u)⁻¹, j = 0, 1, ..., m2. (2.12)

l_i(x_1) and l_j(x_2) are the i-th and j-th Lagrange polynomials of degree m1 and

m2, respectively (see Prenter (1975, p. 119)). We note here that

l_ij(y_k, z_u) = 1 if i = k and j = u, and 0 otherwise.












Theorem 2.3

There exists a unique polynomial of degree m1 or less in x_1 and m2

or less in x_2 which satisfies (2.9).

The proof of this theorem can be found in Prenter (1975, p. 120).

We now look at the error of interpolation. Let us assume that f(x_1, x_2)

has continuous derivatives up to order m1+1 and m2+1 with respect to x_1

and x_2, respectively, over the rectangular region R²_x; then an upper

bound for the absolute value of e(x_1, x_2) is given by

max |e(x_1, x_2)| = max |f(x_1, x_2) − g(x_1, x_2)|

≤ [w_x1(y_0, ..., y_m1)/(m1+1)!] max |f_x1^(m1+1)(x_1, x_2)|

+ [w_x2(z_0, ..., z_m2)/(m2+1)!] max |f_x2^(m2+1)(x_1, x_2)|

+ [w_x1(y_0, ..., y_m1) w_x2(z_0, ..., z_m2)/((m1+1)!(m2+1)!)] max |f_x1^(m1+1)(x_1, x_2) f_x2^(m2+1)(x_1, x_2)| (2.13)

where all maxima are taken over a_1 ≤ x_1 ≤ b_1, a_2 ≤ x_2 ≤ b_2,

f_x1^(m1+1)(x_1, x_2) = ∂^{m1+1} f(x_1, x_2)/∂x_1^{m1+1},

f_x2^(m2+1)(x_1, x_2) = ∂^{m2+1} f(x_1, x_2)/∂x_2^{m2+1},

w_x1(y_0, ..., y_m1) = max_{a_1≤x_1≤b_1} |Π_{i=0}^{m1} (x_1 − y_i)|,

and

w_x2(z_0, ..., z_m2) = max_{a_2≤x_2≤b_2} |Π_{j=0}^{m2} (x_2 − z_j)|

(see Prenter (1975, p. 125)).

Consider now the upper bound of the error in (2.13). Suppose

f(x_1, x_2) is a function of x_1 alone, that is, f(x_1, x_2) reduces to a

one-dimensional function. Then f_x2^(m2+1)(x_1, x_2) = 0 and the

upper bound in (2.13) is reduced to

[w_x1(y_0, ..., y_m1)/(m1+1)!] max_{a_1≤x_1≤b_1} |f_x1^(m1+1)(x_1)|

which is the same as the upper bound in (2.6).

Now if y_0, y_1, ..., y_m1 are chosen as the zeros of the Chebyshev

polynomial of the first kind and of degree m1+1 for the interval

[a_1,b_1], and z_0, z_1, ..., z_m2 are the zeros of the Chebyshev polynomial

of the first kind and of degree m2+1 for the interval [a_2,b_2], that is,

y_i = (a_1 + b_1)/2 + [(a_1 − b_1)/2] cos[(i + 1/2)π/(m1+1)], i = 0, 1, ..., m1,

z_j = (a_2 + b_2)/2 + [(a_2 − b_2)/2] cos[(j + 1/2)π/(m2+1)], j = 0, 1, ..., m2, (2.14)

then the functions w_x1(y_0, ..., y_m1) and w_x2(z_0, ..., z_m2) in (2.13)

attain minimum values of 2(b_1 − a_1)^{m1+1}/4^{m1+1} and 2(b_2 − a_2)^{m2+1}/4^{m2+1},

respectively. In this case, the upper bound of the error of

interpolation becomes

max |e(x_1, x_2)| ≤ [2(b_1 − a_1)^{m1+1}/((m1+1)! 4^{m1+1})] max |f_x1^(m1+1)(x_1, x_2)|

+ [2(b_2 − a_2)^{m2+1}/((m2+1)! 4^{m2+1})] max |f_x2^(m2+1)(x_1, x_2)|

+ [4(b_1 − a_1)^{m1+1}(b_2 − a_2)^{m2+1}/((m1+1)!(m2+1)! 4^{m1+m2+2})] max |f_x1^(m1+1)(x_1, x_2) f_x2^(m2+1)(x_1, x_2)| (2.15)

where the maxima are taken over a_1 ≤ x_1 ≤ b_1, a_2 ≤ x_2 ≤ b_2.
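As a numerical illustration of (2.15) (with the assumed test function f(x_1, x_2) = exp(x_1 + x_2) on [−1,1]², which is not taken from this dissertation), every partial derivative of f equals f itself, so each single-variable maximum in (2.15) equals e² and the mixed product maximum equals e⁴. With m1 = m2 = 3,

max |e(x_1, x_2)| ≤ [2·2⁴/(4!·4⁴)] e² + [2·2⁴/(4!·4⁴)] e² + [4·2⁴·2⁴/(4!·4!·4⁸)] e⁴

≈ 0.0385 + 0.0385 + 0.0015 ≈ 0.078.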













We have described the Lagrange interpolating polynomial for the

cases of one and two input variables. The construction of this

approximation is fairly simple. However, a drawback of this

approximation is that a single interpolating polynomial, whose degree

grows with the number of interpolating points, may not follow the

function well between those points. In the next section we introduce a

more flexible, piecewise approximation. This approximation is called a

spline interpolating polynomial.



2.2.2 Spline Interpolation

In the last 30 years or so following the pioneering work of

Schoenberg (1946), piecewise polynomial functions have become an

important theoretical tool for the approximation of functions, since in

many cases the behavior of a function in one region may be totally

different from its behavior in another region and one polynomial may not

be appropriate to approximate that function. The most popular class of

approximating functions is the class of spline functions. Therefore,

instead of trying to approximate a function (of possibly several

variables) over the entire region by one polynomial of a certain degree,

one partitions the region into several subregions, and approximates the

function on each of these subregions by (possibly) a different

polynomial. One of the important ideas in using this process is the

method in which the region is subdivided. In one dimension, the method

consists of defining points (called knots) whose positions generally are

not known beforehand. Ideally, we would like to have these knots as

variables that can be shifted into the optimal positions as the














computation proceeds. However, this turns out to be a difficult

nonlinear problem and so the knot positions are usually fixed in

advance. Throughout this dissertation we assume that the knot positions

are preselected, so that the problem is a linear approximation problem;

furthermore, for simplicity we shall restrict our consideration to the

case of equidistant knots. Further discussion of the positioning of the

knots can be found in Powell (1967), Rice (1969) and Wold (1974).

Estimation of the knot locations is discussed by Gallant and Fuller

(1973).

We shall now discuss spline interpolating polynomials of degrees

1, 2 and 3 (linear, quadratic and cubic splines). We shall consider

only the case of a single input variable.

We consider the interval [a,b] and divide it into h+1 pieces using

h points or "knots" τ_1, τ_2, ..., τ_h such that a = τ_0 < τ_1 < ... < τ_h <

τ_{h+1} = b. Generally, a function which is equal to a polynomial of

degree at most m on each subinterval (τ_i, τ_{i+1}), i = 0, 1, ..., h, can be

written as

g(x) = Σ_{i=0}^m a_0i x^i + Σ_{i=0}^m a_1i (x − τ_1)_+^i + ... + Σ_{i=0}^m a_hi (x − τ_h)_+^i (2.16)

where

(x − τ_j)_+^i = (x − τ_j)^i if x ≥ τ_j, and 0 if x < τ_j (i = 1, 2, ...).














A function g(x) is called a spline of degree m if it is a polynomial of

degree (at most) m on each of the h+1 intervals (τ_i, τ_{i+1}), i = 0, 1,

..., h, and has m−1 continuous derivatives at τ_i, i = 1, 2, ..., h. By

imposing the restriction that g(x) in (2.16) has m−1 continuous

derivatives at τ_i, i = 1, 2, ..., h, we obtain a spline of degree m of

the form

g(x) = Σ_{i=0}^m a_i x^i + Σ_{i=1}^h b_i (x − τ_i)_+^m (2.17)

(see, for example, Studden and Van Arman (1969), Studden (1971) and Park

(1978)).

For illustration, suppose cubic-cubic polynomials are adequate for

a response function f given in (2.1) over the interval a ≤ x ≤ b:

g(x) = α_10 + α_11 x + α_12 x² + α_13 x³, a ≤ x ≤ τ,
     = α_20 + α_21 x + α_22 x² + α_23 x³, τ < x ≤ b. (2.18)

We can write g(x) in (2.18) as the single equation model

g(x) = a_0 + a_1 x + a_2 x² + a_3 x³ + a_4 (x − τ)_+^0 + a_5 (x − τ)_+

+ a_6 (x − τ)_+² + a_7 (x − τ)_+³ (2.19)

where

a_0 = α_10, a_1 = α_11, a_2 = α_12, a_3 = α_13,

a_4 = α_20 − α_10 + τ(α_21 − α_11) + τ²(α_22 − α_12) + τ³(α_23 − α_13),

a_5 = α_21 − α_11 + 2τ(α_22 − α_12) + 3τ²(α_23 − α_13),

a_6 = α_22 − α_12 + 3τ(α_23 − α_13),

a_7 = α_23 − α_13.

We assume that g(x) in (2.19) is continuous and has continuous first

and second derivatives at x = τ; thus, from the condition of

continuity at x = τ, we obtain from (2.18) that

α_10 + α_11 τ + α_12 τ² + α_13 τ³ = α_20 + α_21 τ + α_22 τ² + α_23 τ³,

that is,

α_20 − α_10 + τ(α_21 − α_11) + τ²(α_22 − α_12) + τ³(α_23 − α_13) = 0,

which implies a_4 = 0 in (2.19). From the condition of a continuous first

derivative at x = τ, we have from (2.18) that

α_21 − α_11 + 2τ(α_22 − α_12) + 3τ²(α_23 − α_13) = 0,

hence a_5 = 0 in (2.19). Finally, we impose that g(x) in (2.19) has a

continuous second derivative at x = τ, and we obtain

α_22 − α_12 + 3τ(α_23 − α_13) = 0,

which implies that a_6 = 0. Thus g(x) in (2.19) becomes

g(x) = a_0 + a_1 x + a_2 x² + a_3 x³ + a_7 (x − τ)_+³.
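The reduction above can be checked symbolically; the following sketch (using the sympy library, a convenience assumed here rather than anything used in this dissertation) imposes continuity of g and its first two derivatives at x = τ and confirms that a_4 = a_5 = a_6 = 0 while a_7 = α_23 − α_13.

```python
# Sketch: symbolic check of the cubic-cubic reduction (2.18) -> (2.19).
import sympy as sp

x, tau = sp.symbols('x tau')
alpha1 = sp.symbols('alpha10:14')          # alpha_10, ..., alpha_13
alpha2 = sp.symbols('alpha20:24')          # alpha_20, ..., alpha_23
g1 = sum(c * x**i for i, c in enumerate(alpha1))
g2 = sum(c * x**i for i, c in enumerate(alpha2))

# Jump polynomial g2 - g1 rewritten in powers of (x - tau): its ascending
# coefficients are a_4, a_5, a_6, a_7 of (2.19).
jump = sp.Poly(sp.expand(g2 - g1).subs(x, x + tau), x).all_coeffs()[::-1]

# Continuity of the value and the first two derivatives at x = tau:
conds = [sp.Eq(sp.diff(g1, x, k).subs(x, tau),
               sp.diff(g2, x, k).subs(x, tau)) for k in range(3)]
sol = sp.solve(conds, alpha2[:3])          # alpha_20, alpha_21, alpha_22
for name, c in zip(('a4', 'a5', 'a6', 'a7'), jump):
    print(name, '=', sp.simplify(c.subs(sol)))   # prints a4 = a5 = a6 = 0
```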


Throughout this dissertation all the splines of degree m have m−1

continuous derivatives at the knots. A spline function of degree m with

h interior knots is represented, in general, by a (possibly) different

polynomial in each of the h+1 intervals into which the h interior knots

divide the real line. As each polynomial involves m+1 parameters, the

spline function involves a total of (m+1)(h+1) parameters. However, the

continuity conditions impose certain constraints on these parameters.

At each interior knot, τ_i, i = 1, 2, ..., h, the spline function has

continuous derivatives of orders 0, 1, ..., m−1. Thus, m constraints

are imposed at each knot. As there are h interior knots, a total of mh

constraints are imposed. The number of free parameters involved in the

spline function is the total number of parameters minus the number of

constraints, that is, (m+1)(h+1) − mh = m+h+1. We note here that the

number of free parameters is equal to the number of a's and b's

in (2.17).


2.2.2.1 Linear Spline

Let a = τ_0 < τ_1 < ... < τ_h < τ_{h+1} = b be partition points for the

interval [a,b] and let f(x) be a function defined at τ_0, τ_1, ...,

τ_{h+1}. We construct a linear spline function as follows. On each

interval [τ_i, τ_{i+1}], we choose the interpolant g to f to consist of

linear pieces,

g(x) = P_i(x) = α_i + β_i x, τ_i ≤ x ≤ τ_{i+1}, i = 0, 1, ..., h. (2.20)

The i-th linear piece P_i is made to satisfy the interpolatory

constraints

P_i(τ_i) = f(τ_i), i = 0, 1, 2, ..., h,

P_h(τ_{h+1}) = f(τ_{h+1}), (2.21)

P_i(τ_{i+1}) = P_{i+1}(τ_{i+1}), i = 0, 1, 2, ..., h−1

(see Ahlberg, Nilson and Walsh (1967, p. 109)). The resulting

piecewise linear function g(x) agrees with f(x) at τ_0, ..., τ_{h+1} and

is continuous at τ_1, τ_2, ..., τ_h. By using (2.17) with m=1, g(x) in

(2.20) can be written in the form

g(x) = a_0 + a_1 x + b_1 (x − τ_1)_+ + b_2 (x − τ_2)_+ + ... + b_h (x − τ_h)_+ (2.22)

where a_0, a_1, b_1, ..., b_h can be obtained by using the conditions in

(2.21). We note that the number of parameters in (2.22) is equal to the

number of free parameters, which is h+2.

Consider now the error resulting from the use of a linear spline.

We assume that the function f(x) has continuous derivatives up to order

2 with respect to x over the interval a ≤ x ≤ b. An upper bound for

the error is given by

max_{a≤x≤b} |g(x) − f(x)| ≤ (Δτ)² max_{a≤x≤b} |f^(2)(x)|/8 (2.23)

where

Δτ = τ_{i+1} − τ_i, i = 0, 1, ..., h,

and

f^(2)(x) = ∂²f(x)/∂x²

(see deBoor (1978, p. 40)).
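Since np.interp evaluates exactly the piecewise linear interpolant (2.22), the bound (2.23) can be checked numerically; the test function f(x) = sin(πx), for which max |f^(2)| = π², is an assumption for illustration only.

```python
# Sketch: numerical check of the linear-spline error bound (2.23).
import numpy as np

a, b, h = -1.0, 1.0, 9
knots = np.linspace(a, b, h + 2)             # tau_0, ..., tau_{h+1}
dtau = knots[1] - knots[0]

xs = np.linspace(a, b, 2001)
g = np.interp(xs, knots, np.sin(np.pi * knots))
err = np.max(np.abs(g - np.sin(np.pi * xs)))
bound = dtau**2 * np.pi**2 / 8               # (2.23) with max |f^(2)| = pi^2
print(f"observed error {err:.4f} <= bound {bound:.4f}: {err <= bound}")
```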


2.2.2.2 Quadratic Spline

Let f(x) be a function defined at τ_0, τ_1, ..., τ_{h+1} with a = τ_0 <

τ_1 < ... < τ_h < τ_{h+1} = b. We choose the interpolant g to f to consist of

quadratic pieces,

g(x) = P_i(x) = α_i + β_i x + γ_i x², τ_i ≤ x ≤ τ_{i+1}, i = 0, 1, ..., h. (2.24)

The i-th quadratic piece P_i is made to satisfy the interpolant

constraints

P_i(τ_{i+1}) = P_{i+1}(τ_{i+1}), i = 0, 1, ..., h−1,

P_i^(1)(τ_{i+1}) = P_{i+1}^(1)(τ_{i+1}), i = 0, 1, ..., h−1,

P_0(τ_0) = f(τ_0),

P_h(τ_{h+1}) = f(τ_{h+1}), (2.25)

P_i(ξ_i) = f(ξ_i), i = 0, 1, ..., h,

where

ξ_i = (τ_i + τ_{i+1})/2, i = 0, 1, 2, ..., h, and

P_i^(1)(τ_{i+1}) = [∂P_i(x)/∂x]_{x=τ_{i+1}}, i = 0, 1, ..., h−1

(see Ahlberg, Nilson and Walsh (1967, p. 109) and deBoor (1978)). The

piecewise quadratic function g(x) interpolates the function f(x) at

τ_0, τ_{h+1}, ξ_0, ξ_1, ..., ξ_h and has continuous derivatives of orders 0

and 1 at τ_1, τ_2, ..., τ_h. By using (2.17) with m=2, the function g(x)

in (2.24) can be written as

g(x) = a_0 + a_1 x + a_2 x² + b_1 (x − τ_1)_+² + b_2 (x − τ_2)_+² + ... + b_h (x − τ_h)_+² (2.26)

where a_0, a_1, a_2, b_1, b_2, ..., b_h can be obtained by using the

conditions in (2.25). The number of free parameters is equal to the

number of parameters in (2.26), which is h+3.

To obtain the error |g(x) − f(x)|, we assume that the function f(x)

has continuous derivatives up to order 3 with respect to x over the

interval a ≤ x ≤ b. An upper bound for the error resulting from the

use of a quadratic spline is given by

max_{a≤x≤b} |g(x) − f(x)| ≤ (Δτ)³ max_{a≤x≤b} |f^(3)(x)|/8 (2.27)

where

Δτ = τ_{i+1} − τ_i, i = 0, 1, 2, ..., h,

and

f^(3)(x) = ∂³f(x)/∂x³

(see deBoor (1978, p. 81)).


2.2.2.3 Cubic Spline

Let a = τ_0 < τ_1 < ... < τ_h < τ_{h+1} = b be partition points for the

interval [a,b] and let f(x) be a function defined at τ_0, τ_1, ...,

τ_{h+1}. We construct a piecewise cubic interpolant g to f as follows. On

each interval [τ_i, τ_{i+1}], let g agree with some cubic polynomial of the

form

g(x) = P_i(x) = α_i + β_i x + γ_i x² + δ_i x³, τ_i ≤ x ≤ τ_{i+1}, i = 0, 1, ..., h. (2.28)

The i-th cubic piece P_i is made to satisfy the conditions

P_i(τ_{i+1}) = P_{i+1}(τ_{i+1}), i = 0, 1, ..., h−1,

P_i^(1)(τ_{i+1}) = P_{i+1}^(1)(τ_{i+1}), i = 0, 1, ..., h−1,

P_i^(2)(τ_{i+1}) = P_{i+1}^(2)(τ_{i+1}), i = 0, 1, ..., h−1,

P_i(τ_i) = f(τ_i), i = 0, 1, ..., h, (2.29)

P_h(τ_{h+1}) = f(τ_{h+1}),

P_0^(1)(τ_0) = f^(1)(τ_0),

P_h^(1)(τ_{h+1}) = f^(1)(τ_{h+1}),

where

P_i^(j)(τ_{i+1}) = [∂^j P_i(x)/∂x^j]_{x=τ_{i+1}}, j = 1, 2,

and

f^(1)(τ) = [∂f(x)/∂x]_{x=τ}

(see Ahlberg, Nilson and Walsh (1967, p. 109) and Prenter (1975,

p. 78)). The piecewise cubic function g(x) interpolates the function

f(x) at τ_0, τ_1, ..., τ_{h+1} and has continuous derivatives of orders 0, 1,

and 2 at τ_1, τ_2, ..., τ_h. By using (2.17) with m=3, the function g(x)

in (2.28) can be expressed as

g(x) = a_0 + a_1 x + a_2 x² + a_3 x³ + b_1 (x − τ_1)_+³ + b_2 (x − τ_2)_+³

+ ... + b_h (x − τ_h)_+³ (2.30)

where a_0, a_1, a_2, a_3, b_1, b_2, ..., b_h can be obtained by solving the

conditions in (2.29). The number of free parameters is equal to the

number of parameters in (2.30), which is h+4.

We assume that f(x) has continuous derivatives up to order 4 with

respect to x over the interval a ≤ x ≤ b. An upper bound for the error

resulting from the use of a cubic spline is given by

max_{a≤x≤b} |g(x) − f(x)| ≤ 5(Δτ)⁴ max_{a≤x≤b} |f^(4)(x)|/384 (2.31)

where

Δτ = τ_{i+1} − τ_i, i = 0, 1, 2, ..., h,

and

f^(4)(x) = ∂⁴f(x)/∂x⁴

(see Hall (1968) and deBoor (1978, p. 68)).
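The end conditions in (2.29) are those of a complete (clamped) cubic spline, which scipy's CubicSpline reproduces when bc_type specifies the endpoint first derivatives. The sketch below checks the bound (2.31) for the assumed test function f(x) = e^(−x), an illustration only and not an example from this dissertation.

```python
# Sketch: numerical check of the cubic-spline error bound (2.31).
import numpy as np
from scipy.interpolate import CubicSpline

a, b, h = -1.0, 1.0, 9
knots = np.linspace(a, b, h + 2)                 # equidistant knots
dtau = knots[1] - knots[0]

# Clamped end conditions: the first derivative of f(x) = exp(-x) is -exp(-x).
g = CubicSpline(knots, np.exp(-knots),
                bc_type=((1, -np.exp(-a)), (1, -np.exp(-b))))

xs = np.linspace(a, b, 2001)
err = np.max(np.abs(g(xs) - np.exp(-xs)))
bound = 5 * dtau**4 * np.max(np.exp(-xs)) / 384  # f^(4)(x) = exp(-x), (2.31)
print(f"observed error {err:.2e} <= bound {bound:.2e}: {err <= bound}")
```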



2.3 Optimal Designs

Khuri (1982) obtained an adequate approximation to the mean

response function with a Lagrange interpolation polynomial and used this

approximation to derive a parameter-free design on the basis of a

certain modification of the D-optimality criterion. In this

dissertation we shall extend Khuri's design criterion to the cases where

we approximate the mean response function given in (2.1) with spline

interpolating polynomials (with a single input variable) and a Lagrange

interpolating polynomial (with two input variables).

Let R^k_x denote the experimental region for the model

y = f(x; θ) + ε. (2.32)

We suppose that R^k_x is of the form

R^k_x = {(x_1, ..., x_k) : a_i ≤ x_i ≤ b_i; i = 1, 2, ..., k}.

By a change of variables of the form

t_i = (2x_i − a_i − b_i)/(b_i − a_i), i = 1, 2, ..., k, (2.33)

the region R^k_x is changed to the region R^k_t = {t : −1 ≤ t_i ≤ 1; i = 1,

2, ..., k}. If we substitute t for x in model (2.32) using the linear

transformation (2.33) we obtain the model

y = η(t; θ) + ε. (2.34)



Suppose the interpolating polynomial g(t; θ) is adequate for

approximating the response function η(t; θ) given in (2.34) over the

region R^k_t and for all θ in some parameter space Θ; that is, g(t; θ) is

approximately equal to η(t; θ) with some small error of approximation

over the region R^k_t and for all θ in Θ. Thus, an approximate

representation of model (2.34) is given by

y = g(t; θ) + ε. (2.35)

A criterion for the selection of a design for estimating θ in model

(2.35) which does not depend on the parameter vector θ is given by Khuri

(1982). This criterion is restricted to the case of a single input

variable. We shall extend Khuri's criterion to the general case where

we have k input variables, and later in this section we consider the

cases where k=2 for the Lagrange interpolating polynomial and k=1 for the

spline interpolating polynomials.

Let t_1, t_2, ..., t_n be n points (n ≥ p) with coordinates

(t_11, t_12, ..., t_1k), (t_21, t_22, ..., t_2k), ..., (t_n1, t_n2, ..., t_nk)

over the region R^k_t = {t : −1 ≤ t_i ≤ 1; i = 1, 2, ..., k}. Each of

these points corresponds in a one-to-one manner through (2.33) to

a point of the design D = [x_ij], i = 1, 2, ..., n; j = 1, 2, ..., k, with

x_i, i = 1, 2, ..., n, belonging to the experimental region

R^k_x = {x : a_j ≤ x_j ≤ b_j; j = 1, 2, ..., k}.

In the design criterion by Khuri (1982), some function of the

determinant |M(D, θ)| is made as large as possible with respect to t_1,

t_2, ..., t_n, where M(D, θ)/σ² is the information matrix of order p×p with

respect to the model (2.35) and M has the form

M(D, θ) = Σ_{i=1}^n r(t_i, θ) r'(t_i, θ) (2.36)

where the p×1 matrix r(t_i, θ) is the partial derivative of g(t_i; θ) with

respect to θ. In matrix form, as will be seen later, we can write

r(t_i, θ) as

r(t_i, θ) = U(θ) v(t_i) (2.37)

where U(θ) is a p×q matrix which does not depend on the design points

and v(t_i) is a q×1 matrix not involving θ; the value of q will be

specified depending on the form of the interpolating polynomial. From

(2.36) and by using (2.37), we obtain

M(D, θ) = U(θ) [Σ_{i=1}^n v(t_i) v'(t_i)] U'(θ). (2.38)


If we let

A(t_1, t_2, ..., t_n) = Σ_{i=1}^n v(t_i) v'(t_i), (2.39)

then from (2.38) and (2.39) we have

|M(D, θ)| = |U(θ) A(t_1, t_2, ..., t_n) U'(θ)|. (2.40)

Let λ_0(t_1, t_2, ..., t_n) and λ_{q−1}(t_1, t_2, ..., t_n) denote the smallest

and largest eigenvalues of A(t_1, t_2, ..., t_n), respectively. It will

be shown later that A(t_1, t_2, ..., t_n) is nonsingular, and hence

positive definite, so that the eigenvalues are positive. From (2.40) we

have

λ_0^p(t_1, ..., t_n) |U(θ)U'(θ)| ≤ |M(D, θ)| ≤ λ_{q−1}^p(t_1, ..., t_n) |U(θ)U'(θ)|. (2.41)

If the p×q matrix U(θ) is of rank p, then the p×p matrix U(θ)U'(θ) has

a nonzero determinant. Thus, we can write (2.41) in the form

|M(D, θ)| = [γ λ_{q−1}^p(t_1, ..., t_n) + (1 − γ) λ_0^p(t_1, ..., t_n)] |U(θ)U'(θ)| (2.42)

where 0 ≤ γ ≤ 1. If γ is regarded as a uniform random variable with

density

ν(γ) = 1 for 0 ≤ γ ≤ 1, and ν(γ) = 0 otherwise,

then from (2.42), the expected value of |M(D, θ)| with respect to γ is

equal to

E_γ[|M(D, θ)|] = ψ(t_1, t_2, ..., t_n) |U(θ)U'(θ)| (2.43)

where

ψ(t_1, t_2, ..., t_n) = [λ_0^p(t_1, t_2, ..., t_n) + λ_{q−1}^p(t_1, t_2, ..., t_n)]/2. (2.44)

The parameter-free design is the one which maximizes E_γ[|M(D, θ)|] with

respect to t_1, t_2, ..., t_n. This can be carried out by using the

computer program SEARCH, which was written by Conlon (1979) and is based

on Price's (1977) Controlled Random Search procedure, to maximize

ψ(t_1, t_2, ..., t_n) in (2.44).
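The following sketch shows the computation of ψ in (2.44) and a plain random search over candidate designs. It is a simplified stand-in for Conlon's SEARCH program (which implements Price's controlled random search), written for illustration only; basis_v is any user-supplied function returning the q×1 vector v(t) for the interpolating polynomial at hand (sketches of v(t) for the Lagrange and spline cases are given at the ends of Sections 2.3.1.2 and 2.3.2.2 below).

```python
# Sketch: the criterion (2.44) and a simple random search maximizing it.
import numpy as np

def psi(T, basis_v, p):
    """T: n x k array of candidate points t_1, ..., t_n in R_t^k."""
    V = np.array([basis_v(t) for t in T])          # n x q matrix of v(t_i)'
    A = V.T @ V                                    # A(t_1, ..., t_n), (2.39)
    lam = np.linalg.eigvalsh(A)                    # eigenvalues, ascending
    return (lam[0]**p + lam[-1]**p) / 2.0          # psi as in (2.44)

def random_search(basis_v, n, k, p, iters=20000, seed=1):
    rng = np.random.default_rng(seed)
    best_T, best = None, -np.inf
    for _ in range(iters):
        T = rng.uniform(-1.0, 1.0, size=(n, k))    # candidate design in R_t^k
        val = psi(T, basis_v, p)
        if val > best:
            best_T, best = T, val
    return best_T, best
```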


2.3.1 Designs Obtained through Lagrangian Interpolation of Nonlinear
Models in the Case of Two Input Variables

In this section we develop a method for constructing an adequate

approximation to the mean response function given in (2.32) with a

Lagrange interpolating polynomial. We shall restrict our consideration

of model (2.32) to the case of two input variables.












2.3.1.1 Lagrangian Interpolation of the Mean Response

Consider the model (2.32) for the case of two input variables (k=2),

y = f(x_1, x_2; θ) + ε, (2.45)

with the experimental region R²_x = {(x_1, x_2) : a_1 ≤ x_1 ≤ b_1, a_2 ≤ x_2 ≤ b_2}. If

we substitute t_1, t_2 for x_1 and x_2 in model (2.45) using the linear

transformation given in (2.33) with k=2 we obtain the model

y = η(t_1, t_2; θ) + ε (2.46)

with the region R²_t = {(t_1, t_2) : −1 ≤ t_1 ≤ 1, −1 ≤ t_2 ≤ 1}. We assume that

(a) η(t_1, t_2; θ) has continuous partial derivatives up to order m1+1

with respect to t_1 and order m2+1 with respect to t_2 over the

region R²_t for all θ, where m1 and m2 are such that (m1+1)(m2+1)

is a positive integer greater than or equal to p, the number of

parameters in model (2.46), and are large enough so that

max |η_t1^(m1+1)(t_1, t_2; θ)| / [(m1+1)! 2^{m1}]

+ max |η_t2^(m2+1)(t_1, t_2; θ)| / [(m2+1)! 2^{m2}]

+ max |η_t1^(m1+1)(t_1, t_2; θ) η_t2^(m2+1)(t_1, t_2; θ)| / [(m1+1)!(m2+1)! 2^{m1+m2}] ≤ δ (2.47)

for all θ in some parameter space Θ, where the maxima are taken over

−1 ≤ t_1 ≤ 1, −1 ≤ t_2 ≤ 1,

η_t1^(m1+1)(t_1, t_2; θ) = ∂^{m1+1} η(t_1, t_2; θ)/∂t_1^{m1+1},

η_t2^(m2+1)(t_1, t_2; θ) = ∂^{m2+1} η(t_1, t_2; θ)/∂t_2^{m2+1},

and δ is a given positive constant.

(b) For any set of (m1+1)(m2+1) distinct points (y_0, z_0), (y_0, z_1),

..., (y_0, z_m2), (y_1, z_0), ..., (y_m1, z_m2), such that −1 ≤ y_0 < y_1 <

... < y_m1 ≤ 1 and −1 ≤ z_0 < z_1 < ... < z_m2 ≤ 1, where m1 and m2 are the

integers defined in Assumption (a), the p × [(m1+1)(m2+1)] matrix

U(θ) = [u(y_0, z_0, θ), u(y_0, z_1, θ), ..., u(y_m1, z_m2, θ)] (2.48)

has rank p, where u(y_i, z_j, θ) is the partial derivative of

η(y_i, z_j; θ) with respect to θ (i = 0, 1, ..., m1; j = 0,

1, ..., m2).

We consider the points with coordinates (y_0, z_0), (y_0, z_1),

..., (y_m1, z_m2) described in Assumption (b) with y_i and z_j being

of the Chebyshev type given in (2.14) with a_1 = −1, b_1 = 1, a_2 =

−1 and b_2 = 1. Substituting t_1, t_2 for x_1 and x_2 in (2.10) using

the linear transformation given in (2.33) with k=2, the Lagrange

interpolating polynomial becomes

g(t_1, t_2; θ) = Σ_{i=0}^{m1} Σ_{j=0}^{m2} η(y_i, z_j; θ) l_ij(t_1, t_2) (2.49)

where

l_ij(t_1, t_2) = l_i(t_1) l_j(t_2),

l_i(t_1) = Π_{k=0, k≠i}^{m1} (t_1 − y_k)/(y_i − y_k), i = 0, 1, ..., m1,

l_j(t_2) = Π_{u=0, u≠j}^{m2} (t_2 − z_u)/(z_j − z_u), j = 0, 1, ..., m2. (2.50)

The error resulting from the use of Lagrangian interpolation is, by

definition, the difference between η(t_1, t_2; θ) and g(t_1, t_2; θ); the

upper bound of this error can be obtained from (2.15) and is given by

max |e(t_1, t_2)| ≤ max |η_t1^(m1+1)(t_1, t_2; θ)| / [(m1+1)! 2^{m1}]

+ max |η_t2^(m2+1)(t_1, t_2; θ)| / [(m2+1)! 2^{m2}]

+ max |η_t1^(m1+1)(t_1, t_2; θ) η_t2^(m2+1)(t_1, t_2; θ)| / [(m1+1)!(m2+1)! 2^{m1+m2}] (2.51)

where the maxima are taken over −1 ≤ t_1 ≤ 1, −1 ≤ t_2 ≤ 1.












2.3.1.2 Designs Based on Lagrangian Interpolation

Consider the Lagrange interpolating polynomial g(t_1, t_2; θ) given

in (2.49) with interpolation points of the Chebyshev type given in

(2.14) with a_1 = −1, b_1 = 1, a_2 = −1 and b_2 = 1. The positive integers

m1 and m2 are determined such that g(t_1, t_2; θ) is approximately equal

to η(t_1, t_2; θ) with an error of approximation not exceeding δ over

the region R²_t and for all θ in Θ. An approximate representation of

model (2.46) is given by

y = g(t_1, t_2; θ) + ε. (2.52)

Let (t_11, t_12), (t_21, t_22), ..., (t_n1, t_n2) be the points in the

region R²_t = {(t_1, t_2): −1 ≤ t_1 ≤ 1, −1 ≤ t_2 ≤ 1} which are in one-to-one

correspondence through (2.33) with the points D = [(x_11, x_12), (x_21, x_22),

..., (x_n1, x_n2)]' in the region R²_x = {(x_1, x_2): a_1 ≤ x_1 ≤ b_1, a_2 ≤ x_2 ≤ b_2}. We

consider r(t_i, θ) in (2.36); since it is the partial derivative of g(t_i1,

t_i2; θ) with respect to θ, from (2.49) we can write r(t_i, θ)

as

r(t_i, θ) = Σ_{u=0}^{m1} Σ_{j=0}^{m2} [∂η(y_u, z_j; θ)/∂θ] l_uj(t_i1, t_i2), i = 1, 2, ..., n. (2.53)

From (2.37) and (2.53), the p×[(m1+1)(m2+1)] matrix U(θ) can be written

U(θ) = [∂η(y_0, z_0; θ)/∂θ, ∂η(y_0, z_1; θ)/∂θ, ..., ∂η(y_m1, z_m2; θ)/∂θ] (2.54)

and is of rank p by Assumption (b), and v(t_i) is an [(m1+1)(m2+1)] × 1

matrix of the form

v(t_i) = [l_00(t_i1, t_i2), l_01(t_i1, t_i2), ..., l_{m1 m2}(t_i1, t_i2)]'. (2.55)

If n ≥ (m1+1)(m2+1), and at least m1+1 of t_11, ..., t_n1 are distinct,

and at least m2+1 of t_12, ..., t_n2 are distinct, then the

[(m1+1)(m2+1)] × [(m1+1)(m2+1)] matrix A(t_1, t_2, ..., t_n) of the form

(2.39), where v(t_i) is given in (2.55), should be of full rank and thus

nonsingular. To show this, suppose that A(t_1, t_2, ..., t_n) is of rank

less than (m1+1)(m2+1); then the [(m1+1)(m2+1)] × n matrix

G = [v(t_1), v(t_2), ..., v(t_n)] is of the same rank as A(t_1, t_2, ...,

t_n). In this case there must exist a set of a_uj, u = 0, 1, ..., m1;

j = 0, 1, ..., m2, not all equal to zero, such that

Σ_{u=0}^{m1} Σ_{j=0}^{m2} a_uj l_uj(t_i1, t_i2) = 0, i = 1, 2, ..., n, (2.56)

on R²_t = {(t_1, t_2): −1 ≤ t_1 ≤ 1, −1 ≤ t_2 ≤ 1}. Now l_uj(t_i1, t_i2) can be written

as l_u(t_i1) l_j(t_i2), so (2.56) becomes Σ_{u=0}^{m1} Σ_{j=0}^{m2} a_uj l_u(t_i1) l_j(t_i2) = 0,

i = 1, 2, ..., n. Let C_j(t_i1) = Σ_{u=0}^{m1} a_uj l_u(t_i1). It follows that

Σ_{j=0}^{m2} C_j(t_i1) l_j(t_i2) = 0 for each t_i1 in [−1,1], i = 1, 2, ..., n. But

the l_j's are linearly independent on the interval [−1,1] (see Khuri

(1982)). Thus C_j(t_i1) = 0 for each j = 0, 1, ..., m2, and each t_i1, i = 1,

2, ..., n, in [−1,1]. But then for each j, j = 0, 1, 2, ..., m2,

Σ_{u=0}^{m1} a_uj l_u(t_i1) = 0, i = 1, 2, ..., n, and by the linear independence of

the l_u's, a_uj = 0 for all u = 0, 1, ..., m1 and j = 0, 1, ..., m2, which is

impossible since we assumed that the a_uj's (u = 0, 1, ..., m1; j = 0, 1, ..., m2)

are not all equal to zero.

We now have U(θ) a matrix of rank p, and thus U(θ)U'(θ) has a non-

zero determinant. We have also shown that the matrix A(t_1, t_2, ..., t_n)

is of rank (m1+1)(m2+1), and thus nonsingular. From Section 2.3, the

parameter-free design maximizes ψ(t_1, t_2, ..., t_n) with respect to t_1,

t_2, ..., t_n, where ψ(t_1, t_2, ..., t_n) is of the form

ψ(t_1, ..., t_n) = [λ_0^p(t_1, ..., t_n) + λ_{(m1+1)(m2+1)−1}^p(t_1, ..., t_n)]/2 (2.57)

where λ_0(t_1, ..., t_n) and λ_{(m1+1)(m2+1)−1}(t_1, ..., t_n) denote the

smallest and largest eigenvalues of A(t_1, t_2, ..., t_n).

We have provided the methodology for obtaining the optimal design

based on the Lagrange interpolating polynomial for the case of two input

variables. It was shown that the number of design points is at least

equal to (m1+1)(m2+1). This number tends to grow rapidly as m1 or m2

becomes large. In this case, it becomes very difficult, and practically

impossible, to determine optimal design settings using Price's computer

algorithm, or any other similar algorithm, due to the high

dimensionality of the optimization problem.
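For completeness, the following sketch builds v(t_i) in (2.55) for the two-variable Lagrange basis with Chebyshev-type interpolation points; it can be passed as basis_v to the random-search sketch of Section 2.3.

```python
# Sketch: v(t) of (2.55) for the tensor-product Lagrange basis.
import numpy as np

def cheb_points(m):
    i = np.arange(m + 1)
    return -np.cos((i + 0.5) * np.pi / (m + 1))    # (2.14) with a = -1, b = 1

def lagrange_basis(pts, t):
    out = []
    for i, yi in enumerate(pts):
        others = np.delete(pts, i)
        out.append(np.prod((t - others) / (yi - others)))
    return np.array(out)

def make_lagrange_basis_v(m1, m2):
    y, z = cheb_points(m1), cheb_points(m2)
    def basis_v(t):
        l1 = lagrange_basis(y, t[0])               # l_u(t_1), u = 0, ..., m1
        l2 = lagrange_basis(z, t[1])               # l_j(t_2), j = 0, ..., m2
        return np.outer(l1, l2).ravel()            # l_uj(t_1, t_2) as in (2.55)
    return basis_v
```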



2.3.2 Designs Obtained through Spline Interpolation of Nonlinear Models

In this section we develop methods for constructing designs

assuming the true mean response given in (2.32) can be adequately

approximated with a linear, a quadratic or a cubic spline in the case

of a single input variable. Each approximation will be used to obtain a

design for parameter estimation which does not depend on the parameter

vector θ.



2.3.2.1 Spline Interpolation of the Mean Response

Consider the model (2.32) for the case of a single input variable

(k=1),

y = f(x; θ) + ε, (2.58)

with the experimental region R¹_x = {x: a ≤ x ≤ b}. If we substitute t for x

in model (2.58) using the linear transformation given in (2.33) with k=1

we obtain the model

y = η(t; θ) + ε; (2.59)

the region R¹_x is reduced to the interval −1 ≤ t ≤ 1.

Consider the knots τ_0, τ_1, ..., τ_{h+1}, where −1 = τ_0 < τ_1 < ... <

τ_h < τ_{h+1} = 1. A spline of degree m (with m−1 continuous derivatives at

the knots τ_i, i = 1, 2, ..., h) which approximates η(t; θ) can be

obtained by substituting t for x in (2.17); we obtain

g(t; θ) = Σ_{i=0}^m a_i t^i + Σ_{i=1}^h b_i (t − τ_i)_+^m (2.60)

where

(t − τ_j)_+^m = (t − τ_j)^m if t ≥ τ_j, and 0 otherwise, j = 1, 2, ..., h. (2.61)

The coefficients a_0, a_1, ..., a_m, b_1, b_2, ..., b_h in model (2.60) can be

obtained by solving the interpolation problems in (2.21), (2.25) and

(2.29) for the linear, quadratic and cubic splines, respectively. We note

that the coefficients a_i, b_j, i = 0, 1, 2, ..., m; j = 1, 2, ..., h, are

functions of θ.

Consider the model in (2.59). We assume that

(a) η(t; θ) has continuous derivatives up to order m+1 with respect to

t over the interval −1 ≤ t ≤ 1 for all θ, where m is the degree of

the spline function and is such that m+h+1 is a positive integer

greater than or equal to p, the number of parameters in model

(2.58), and h is the number of interior knots and is large enough

so that

(Δτ)² max_{−1≤t≤1} |η^(2)(t; θ)|/8 ≤ δ, for the linear spline, i.e., m=1, (2.62)

or

(Δτ)³ max_{−1≤t≤1} |η^(3)(t; θ)|/8 ≤ δ, for the quadratic spline, i.e., m=2, (2.63)

or

5(Δτ)⁴ max_{−1≤t≤1} |η^(4)(t; θ)|/384 ≤ δ, for the cubic spline, i.e., m=3, (2.64)

for all θ in some parameter space Θ, where Δτ = τ_{i+1} − τ_i,

i = 0, 1, ..., h, is the common distance between the knots,

η^(j)(t; θ) = ∂^j η(t; θ)/∂t^j, j = 2, 3, 4, and δ is a given

positive constant. We need the quantity m+h+1 to be a positive

integer greater than or equal to p for the purpose of parameter esti-

mation: since m+h+1 is the number of a's and b's in (2.60)

and these a's and b's are functions of θ = (θ_1, ..., θ_p)', if

m+h+1 < p we cannot uniquely estimate θ_1, ..., θ_p. The assumption

that η(t; θ) has continuous derivatives up to order m+1 with

respect to t is needed for the purpose of obtaining the upper

bound of the error of interpolation.

(b) The p × (m+h+1) matrix

U(θ) = [u_0(θ), u_1(θ), ..., u_{m+h}(θ)] (2.65)

is of rank p, where u_i(θ) = ∂a_i/∂θ, i = 0, 1, ..., m,

u_{m+j}(θ) = ∂b_j/∂θ, j = 1, 2, ..., h, and a_i, b_j, i = 0, 1, ..., m,

j = 1, 2, ..., h, are the coefficients in model (2.60).

The error resulting from the use of a spline interpolating

polynomial is the difference between η(t; θ) and g(t; θ). The

upper bounds of the errors can be obtained by substituting t for x

in (2.23), (2.27) and (2.31) using the linear transformation given

in (2.33) with k=1, resulting in

max_{−1≤t≤1} |g(t; θ) − η(t; θ)| ≤ (Δτ)² max_{−1≤t≤1} |η^(2)(t; θ)|/8, (2.66)

for the linear spline,

max_{−1≤t≤1} |g(t; θ) − η(t; θ)| ≤ (Δτ)³ max_{−1≤t≤1} |η^(3)(t; θ)|/8, (2.67)

for the quadratic spline,

max_{−1≤t≤1} |g(t; θ) − η(t; θ)| ≤ 5(Δτ)⁴ max_{−1≤t≤1} |η^(4)(t; θ)|/384, (2.68)

for the cubic spline,

where Δτ and η^(j)(t; θ), j = 2, 3, 4, are defined in Assumption (a).












2.3.2.2 Designs Based on Spline Interpolation

Consider the spline interpolation polynomial g(t; 8) given in

(2.60) with h interior knots. The positive integer h is determined by

requiring the upper bounds in (2.66) or (2.67) or (2.68), for linear,

quadratic and cubic spline, to be less than some specified small value

6 > 0 for all in some parameter space e. Thus, an approximate

representation of model (2.59) is given by


y = g(t; e) + e (2.69)



Let t, t2, ..., tn be the points on the interval -1
correspond in a one-to-one manner through (2.33) to the points of the

design 0 = [x1, x2, ..., xn]' with xi (i=1, 2, ..., n) belonging to the

experimental region R1= {x: a
where r(ti, 9) is the partial derivative of g(t; e) with respect to 6,

as

m h
r(ti, ) ( u + U En (-bu)ti u) i = 1,2 ..., n
u=O -- u= -
(2.70)


From (2.37) and (2.70) we obtain U(B) and t(ti) of the form


U(83) [ae o' a 1' 2 am' -a 7 .bh


(2.71)












2 tm (r ,
(ti) = [1, ti, t ..., t (t ..., (t -
..., I + i Inh)
(2.72)


the matrix U(e) is of order px (m+h+1) and of rank p by Assumption (b)

and v(ti) is an (m+h+l)xl matrix.

Now, if n ≥ m+h+1 and at least m+h+1 of the design points are

distinct (this is needed so that the a's and b's in (2.60) are uniquely

estimated), then the (m+h+1) × (m+h+1) matrix

A(t₁, t₂, ..., t_n) = Σ_{i=1}^n v(t_i)v'(t_i)    (2.73)

should be of rank m+h+1 and thus nonsingular. To show this, suppose

that A(t₁, t₂, ..., t_n) is of rank less than m+h+1. Then the (m+h+1) × n

matrix G = [v(t₁), ..., v(t_n)] is of the same rank as A(t₁, t₂,

..., t_n). Therefore, there must exist a set of α_i, i = 0, 1, ..., m+h,

not all equal to zero, such that

Σ_{i=0}^m α_i t_j^i + Σ_{i=m+1}^{m+h} α_i (t_j − τ_{i−m})_+^m = 0,   j = 1, 2, ..., n.
                                                                (2.74)

From (2.74) we conclude that Σ_{i=0}^m α_i t^i + Σ_{i=m+1}^{m+h} α_i (t − τ_{i−m})_+^m, which is

a polynomial of degree m on each of the subintervals defined by the knots,

has n roots, namely, t₁, t₂, ..., t_n. This is impossible, since

n ≥ m + h + 1 > m and at least m+h+1 of t₁, t₂, ..., t_n are distinct

(a polynomial of degree m has at most m distinct roots).

We have thus shown that the (m+h+1) × (m+h+1) matrix

A(t₁, t₂, ..., t_n) is of rank m+h+1. Therefore, it is positive

definite; and since, by Assumption (b), U(θ) is a matrix of rank p, the

p × p matrix U(θ)A(t₁, t₂, ..., t_n)U'(θ) has a nonzero determinant.

Consequently, from Section 2.3, the parameter-free design is the one

which maximizes φ(t₁, t₂, ..., t_n) with respect to t₁, t₂, ..., t_n, where

φ(t₁, t₂, ..., t_n) is of the form

φ(t₁, t₂, ..., t_n) = [λ₀(t₁, t₂, ..., t_n) + λ_{m+h}(t₁, t₂, ..., t_n)]/2,
                                                                (2.75)

where λ₀(t₁, t₂, ..., t_n) and λ_{m+h}(t₁, t₂, ..., t_n) denote the smallest

and largest eigenvalues of A(t₁, t₂, ..., t_n).

We note that in order to generate a satisfactory distribution of

information throughout the interval [−1, 1], the optimal design

obtained on the basis of spline interpolation is subject to the

constraint that at least one design point is allocated in each subinterval.
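
Concretely, the criterion can be evaluated as follows (a minimal sketch in
Python with NumPy, added here for illustration and not part of the original
dissertation; the knot vector and the candidate design are hypothetical):

```python
import numpy as np

def v(t, m, knots):
    """Spline basis vector of (2.72): [1, t, ..., t^m, (t-tau_1)_+^m, ..., (t-tau_h)_+^m]."""
    powers = [t**u for u in range(m + 1)]
    plus = [max(t - tau, 0.0)**m for tau in knots]
    return np.array(powers + plus)

def A_matrix(ts, m, knots):
    """A(t_1, ..., t_n) = sum_i v(t_i) v(t_i)' as in (2.73)."""
    vs = np.array([v(t, m, knots) for t in ts])
    return vs.T @ vs

def phi(ts, m, knots):
    """Criterion (2.75): average of the smallest and largest eigenvalues of A."""
    eig = np.linalg.eigvalsh(A_matrix(ts, m, knots))
    return 0.5 * (eig[0] + eig[-1])

# Illustrative: linear spline (m=1) with interior knots at -.6, -.2, .2, .6,
# evaluated at a candidate 6-point design with one point per subinterval.
knots = [-0.6, -0.2, 0.2, 0.6]
candidate = [-0.8, -0.4, 0.0, 0.4, 0.8, 1.0]
print(phi(candidate, 1, knots))
```

A design search, such as the Controlled Random Search Procedure discussed
later in this chapter, would then maximize phi over t₁, ..., t_n subject to the
one-point-per-subinterval constraint.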



2.4 Examples
We now present two examples for obtaining parameter-free designs

based on Lagrangian interpolation (for the case of two input variables)

and spline interpolation polynomials of degrees 1, 2 and 3 (for the case

of a single input variable).













Example 2.1 Consider the model used by Box and Hunter (1965) for a

certain chemical reaction rate, which can be described by the nonlinear

model

f(x₁, x₂; θ) = θ₁θ₃x₁/(1 + θ₁x₁ + θ₂x₂),    (2.76)

where f(x₁, x₂; θ) represents the reaction rate, x₁ and x₂ are the partial

pressures of reactant and product, respectively, θ₁ and θ₂ are the adsorption

equilibrium constants for reactant and product, respectively, and θ₃ is

the effective reaction rate constant. Suppose that prior knowledge of

the parameters θ₁, θ₂ and θ₃ is available in the form of known lower and

upper bounds. Thus v_i ≤ θ_i ≤ w_i for some known

values v_i and w_i (i = 1, 2, 3), and suppose that the experimental region

is R₂ = {(x₁, x₂): 0 ≤ x₁ ≤ b₁, 0 ≤ x₂ ≤ b₂}, where b₁ and b₂ are

positive values specified by the experimenter (in this example we choose

b₁ = 3 and b₂ = 2). Using the linear transformation given in (2.33) with

k = 2, a₁ = 0 and a₂ = 0, so that x_i = b_i(t_i + 1)/2, model (2.76) becomes

η(t₁, t₂; θ) = θ₁θ₃b₁(t₁+1)/[2 + θ₁b₁(t₁+1) + θ₂b₂(t₂+1)],
                                                                (2.77)


and the region R₂ is transformed to R_{t₂} = {(t₁, t₂): −1 ≤ t₁ ≤ 1, −1 ≤ t₂ ≤ 1}. The

accuracy of the Lagrangian interpolation is determined by using the equation

given in (2.51). It can be shown (see Appendix 1) that for all θ in Θ,

the error is bounded above by a sum of three terms,

max_{−1≤t₁,t₂≤1} |e(t₁, t₂)| ≤ (1/(2^{m₂}(m₂+1)!)) max |∂^{m₂+1}η/∂t₂^{m₂+1}|

                             + (1/(2^{m₁}(m₁+1)!)) max |∂^{m₁+1}η/∂t₁^{m₁+1}|

                             + (1/(2^{m₁+m₂}(m₁+1)!(m₂+1)!)) max |∂^{m₁+m₂+2}η/∂t₁^{m₁+1}∂t₂^{m₂+1}|,
                                                                (2.78)

in which each maximum is taken over −1 ≤ t₁, t₂ ≤ 1 and over Θ; the explicit

expressions for these bounded derivatives of η in (2.77), which involve

w₁, w₂, w₃, b₁ and b₂, are derived in Appendix 1.


Suppose now we require that the upper bound of the error in (2.78)

(with b₁ = 3 and b₂ = 2) be less than or equal to δ = .05. We note here

that this upper bound involves the upper and lower limits of θ₁, θ₂ and

θ₃. Suppose that the lower and upper limits of θ₁, θ₂ and θ₃ are such

that .5 ≤ θ₁ ≤ 1, .7 ≤ θ₂ ≤ 2 and .4 ≤ θ₃ ≤ .8. These values were

obtained from the approximate 95% confidence region of θ, which was

constructed using ten response values generated according

to the model in (2.77) with θ₁, θ₂, θ₃ and the error variance, σ², taken

as .6, 1.2, .5 and .0006, respectively. The ten design points were

chosen from the region

R_{t₂} = {(t₁, t₂): −1 ≤ t₁ ≤ 1, −1 ≤ t₂ ≤ 1}.


To compute the upper bound of the error we fix m₁ and m₂ and carry out

the maximization in (2.78) over the region R_{t₂} = {(t₁, t₂):

−1 ≤ t₁ ≤ 1, −1 ≤ t₂ ≤ 1}. This maximization is accomplished by the

use of the computer program SEARCH, which was written by Conlon

(1979). By computing this upper bound for several assignments of m₁ and

m₂ we obtain the values of m₁ and m₂ that satisfy the δ = .05

requirement. These values are m₁ = 12 and m₂ = 4, and the value of the upper

bound is .04816541. The Chebyshev points corresponding to these values of

m₁ and m₂ are obtained from (2.14) with a₁ = −1, a₂ = −1, b₁ = 1 and b₂ = 1.

These points are given in Table 1. Using these Chebyshev points to

form the coordinates of points in two dimensions, we obtain (m₁+1)(m₂+1)

= 65 points, whose coordinates are given in Table 2. We choose n, the

number of design points, to be equal to 65 (here all sixty-five points

are distinct). The vector v(t_j) as well as the matrix A(t₁, ...,

t₆₅) can be obtained from formulas (2.55) and (2.39), respectively,

which leads to the maximization of φ(t₁, t₂, ..., t₆₅) in (2.57) with

p = 3. Since the maximization over this number of design points is

computationally intractable, the actual determination of the 65-point

design was not attempted.
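
The entries of Table 1 can be regenerated from the standard formula for
Chebyshev points, which is presumably what (2.14) specializes to on
[−1, 1] (a Python sketch, added for illustration):

```python
import numpy as np

def chebyshev_points(m):
    """Zeros of the Chebyshev polynomial of degree m+1 on [-1, 1]:
    cos((2i+1)*pi / (2(m+1))), i = 0, ..., m."""
    i = np.arange(m + 1)
    return np.cos((2 * i + 1) * np.pi / (2 * (m + 1)))

print(np.round(chebyshev_points(4), 5))   # .95106, .58779, 0, -.58779, -.95106
print(np.round(chebyshev_points(12), 5))  # .99271, .93502, ..., -.99271
```

The 65 two-dimensional coordinates of Table 2 are then the Cartesian
product of the 13 points (for t₁) with the 5 points (for t₂).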













Table 1

Chebyshev Points for Example 2.1

Based on m₁ = 12:  .99271, .93502, .82298, .66312, .46472, .23932, 0,
                   −.23932, −.46472, −.66312, −.82298, −.93502, −.99271

Based on m₂ = 4:   .95106, .58779, 0, −.58779, −.95106












Table 2

Coordinates of Points in Two Dimensions
Obtained from the Chebyshev Points for Example 2.1

(.99271, .95106)  (.99271, .58779)  (.99271, 0)  (.99271, −.58779)  (.99271, −.95106)
(.93502, .95106)  (.93502, .58779)  (.93502, 0)  (.93502, −.58779)  (.93502, −.95106)
(.82298, .95106)  (.82298, .58779)  (.82298, 0)  (.82298, −.58779)  (.82298, −.95106)
(.66312, .95106)  (.66312, .58779)  (.66312, 0)  (.66312, −.58779)  (.66312, −.95106)
(.46472, .95106)  (.46472, .58779)  (.46472, 0)  (.46472, −.58779)  (.46472, −.95106)
(.23932, .95106)  (.23932, .58779)  (.23932, 0)  (.23932, −.58779)  (.23932, −.95106)
(0, .95106)       (0, .58779)       (0, 0)       (0, −.58779)       (0, −.95106)
(−.23932, .95106) (−.23932, .58779) (−.23932, 0) (−.23932, −.58779) (−.23932, −.95106)
(−.46472, .95106) (−.46472, .58779) (−.46472, 0) (−.46472, −.58779) (−.46472, −.95106)
(−.66312, .95106) (−.66312, .58779) (−.66312, 0) (−.66312, −.58779) (−.66312, −.95106)
(−.82298, .95106) (−.82298, .58779) (−.82298, 0) (−.82298, −.58779) (−.82298, −.95106)
(−.93502, .95106) (−.93502, .58779) (−.93502, 0) (−.93502, −.58779) (−.93502, −.95106)
(−.99271, .95106) (−.99271, .58779) (−.99271, 0) (−.99271, −.58779) (−.99271, −.95106)













Example 2.2 In this second example the parameter-free designs are

obtained through linear, quadratic and cubic spline interpolating

polynomials used for approximating the nonlinear model. We consider the

model used by Draper and Smith (1981), taken from an investigation

performed at Procter and Gamble. The investigation involved the

relationship between the Available Chlorine, y, of product A and the

length of time of manufacture, x. Product A must have a fraction 0.50

of Available Chlorine at the time of manufacture, and the fraction of

Available Chlorine in the product decreases with time. The form of the

model is

y = α + (.49 − α)exp(−β(x − 8)) + ε.    (2.79)

We suppose that the experimental region is R₁ = {x: a ≤ x ≤ b}, where a and b

are positive values specified by the experimenter (in this example

a = 10 and b = 40), and let the parameter space Θ be such that

0.36 ≤ α ≤ … (these limits on α and β were obtained from the approximate 95%

confidence region of α and β in the example by Draper and Smith (1981,

p. 486)). Using the linear transformation given in (2.33) with k = 1, a = 10

and b = 40, model (2.79) becomes

y = α + (.49 − α)exp(−β(15t + 17)) + ε,    (2.80)

where −1 ≤ t ≤ 1.











The upper bounds of the interpolation errors for model (2.80), obtained

from (2.66)–(2.68), are

max_{−1≤t≤1} |g(t; θ) − η(t; θ)| ≤ (Δτ)²(.0679675)      for the linear spline,
                                 ≤ (Δτ)³(.1631221197)  for the quadratic spline,
                                 ≤ (Δτ)⁴(.0407805)     for the cubic spline.
                                                                (2.81)

Suppose now we require that the upper bounds in (2.81) be less than or

equal to δ = .05, taking into consideration that we have equidistant

knots. Then the values of Δτ = τ_{i+1} − τ_i, i = 0, 1, ..., h, that satisfy

the δ = .05 requirement are

Δτ = .4 with maximum error = .0108748 for the linear spline,
Δτ = .5 with maximum error = .02039 for the quadratic spline,
Δτ = 1 with maximum error = .0407805 for the cubic spline.
                                                                (2.82)

From (2.82), the numbers of interior knots on the interval −1 ≤ t ≤ 1 for the

linear, quadratic and cubic splines are h = 4, 3 and 1, respectively. The

knot positions for the linear spline are −1, −.6, −.2, .2, .6, 1; for the

quadratic spline, −1, −.5, 0, .5, 1; and for the cubic spline, −1,

0, 1. By choosing n, the number of design points, to be equal to m + h +

1, where m is the degree of the spline functions (in our case m = 1, 2

and 3) and h is the number of interior knots, we obtain n = 6, 6 and 5 for the

linear, quadratic and cubic splines, respectively. The vector v(t) as

well as the matrix A(t₁, t₂, ..., t_n) can be obtained from formulas

(2.72) and (2.73), respectively. The maximum values of φ(t₁, t₂, ...,

t_n) in (2.75) with p = 2 for the linear, quadratic and cubic splines are

178.605, 219.227 and 114.813, and the corresponding optimal values of

t_i (i = 1, 2, ..., n), along with the join points (interior knots) τ_i, i = 1,

2, ..., h, are given in Table 3.
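
As a quick numerical check of (2.81) and (2.82), the following sketch
(Python, added for illustration) confirms that the chosen spacings meet the
δ = .05 requirement and yield the stated numbers of interior knots:

```python
# Verify that the chosen knot spacings satisfy the delta = .05 requirement
# using the error-bound coefficients printed in (2.81).
bounds = {
    "linear":    (0.0679675,    2, 0.4),  # (coefficient, power of delta-tau, chosen spacing)
    "quadratic": (0.1631221197, 3, 0.5),
    "cubic":     (0.0407805,    4, 1.0),
}
delta = 0.05
for name, (c, k, dt) in bounds.items():
    max_error = c * dt**k
    h = round(2 / dt) - 1  # interior knots for equidistant spacing on [-1, 1]
    print(f"{name}: max error = {max_error:.7f} <= {delta}: {max_error <= delta}, h = {h}")
```

The printed maximum errors (.0108748, .02039, .0407805) and knot counts
(h = 4, 3, 1) agree with (2.82).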

For comparison purposes, and to check the efficiency of our optimal

designs, values of the determinant of the matrix M(D, θ) given in (2.38)

were computed, using θ = (α, β)' = (.39, .10)', at both the optimal

designs and the Box–Lucas design (see Appendix 3); they are given in

Table 4. The efficiency is defined by (Lucas (1976))

Efficiency = [ (|M(D, θ)|_{Optimal Design}/n₁) / (|M(D, θ)|_{Box–Lucas}/n₂) ]^{1/p},

where n₁ and n₂ (in this example n₁ = 6, 6, 5 for the linear, quadratic and

cubic splines and n₂ = 2) are the numbers of design points for the

optimal and Box–Lucas designs, respectively, and p is the number of

parameters in the model (in our case p = 2). The efficiencies are also

given in Table 4.
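
The efficiencies in Table 4 can be reproduced directly from this definition
(a Python sketch, added for illustration; the determinant values are those
printed in Table 4):

```python
# Reproduce the efficiencies of Table 4 from the per-point determinant ratio.
cases = {
    "linear":    (0.387684347, 6, 0.069146279),  # (|M| optimal, n1, |M| Box-Lucas)
    "quadratic": (0.247975528, 6, 0.076197813),
    "cubic":     (0.169248231, 5, 0.085737989),
}
n2, p = 2, 2  # Box-Lucas design size and number of parameters
for name, (det_opt, n1, det_bl) in cases.items():
    eff = ((det_opt / n1) / (det_bl / n2)) ** (1 / p)
    print(f"{name}: efficiency = {eff:.3f}")  # 1.367, 1.042, 0.889
```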

From Table 4 we can conclude that, with respect to model (2.79), our

optimal designs based on the linear and quadratic splines are 1.367

and 1.042 times as efficient as the Box–Lucas design, while the

optimal design based on the cubic spline has a lower efficiency than the

Box–Lucas design.

For the purpose of comparison, plots of the linear, quadratic

and cubic splines are given in Figures 2, 3 and 4. For the true mean

response curve in Figure 1, we set α = .39 and β = .10 (these two values

are the least squares estimates of α and β obtained by Draper and Smith

(1981)). The equations of the linear, quadratic and cubic splines that

approximate the model in (2.80), with the maximum errors given in (2.82),

are of the forms

linear spline:

g(t; θ) = .379523 − .0923504t + .0416674(t+.6)_+ + .0228676(t+.2)_+
          + .01255(t−.2)_+ + .00688758(t−.6)_+ ,

quadratic spline:

g(t; θ) = .416754 + .00793627t + .0630552t² − .0340915(t+.5)_+²
          − .0151525(t−0)_+² − .00724385(t−.5)_+² ,

cubic spline:

g(t; θ) = .408268 − .0261167t + .0157712t²
          − .0217168t³ + .0178701(t−0)_+³ .
                                                                (2.83)

These splines are constructed under the assumption that each of them has

m−1 continuous derivatives at the knots, where m = 1, 2 and 3,

respectively. The mean response values obtained from the model in

(2.80) (with α = .39 and β = .10), together with the values obtained from the

linear, quadratic and cubic splines in (2.83), are given in Table 5 for

selected values of t in the interval [−1, 1]; the plots of the response

values (for each model) against the t values are given in Figures 1, 2, 3 and

4, respectively.
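
The entries in Table 5 can be regenerated by evaluating (2.80) and (2.83)
directly; a Python sketch (added for illustration, using the coefficients as
printed in (2.83); the quadratic spline is evaluated analogously):

```python
import numpy as np

def pp(t, a, m):
    """Truncated power function (t - a)_+^m used in the spline bases."""
    return max(t - a, 0.0) ** m

def true_model(t, alpha=0.39, beta=0.10):
    """Mean response of (2.80): alpha + (.49 - alpha) exp(-beta(15t + 17))."""
    return alpha + (0.49 - alpha) * np.exp(-beta * (15 * t + 17))

def linear_spline(t):
    return (0.379523 - 0.0923504 * t + 0.0416674 * pp(t, -0.6, 1)
            + 0.0228676 * pp(t, -0.2, 1) + 0.01255 * pp(t, 0.2, 1)
            + 0.00688758 * pp(t, 0.6, 1))

def cubic_spline(t):
    return (0.408268 - 0.0261167 * t + 0.0157712 * t**2
            - 0.0217168 * t**3 + 0.0178701 * pp(t, 0.0, 3))

for t in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"t={t:+.1f}: true={true_model(t):.6f}, "
          f"linear={linear_spline(t):.6f}, cubic={cubic_spline(t):.6f}")
```

At the endpoints t = ±1 all three curves agree at .471873 and .394076,
matching Table 5.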















Table 3

Optimal Designs and Interior Knots. Example 2.2

                  Optimal Designs
Linear Spline with       Quadratic Spline with     Cubic Spline with
Interior Knots at        Interior Knots at         Interior Knots at
−.6, −.2, .2, .6         −.5, 0, .5                0

−.6960602                −.9823732                 −1
−.200152                 −.9458824                 −.0007636
 .1999294                −.0002855                  .9975816
 .5999959                 .4995994                  .9990895
 .9997785                 .9999732                  1
 .9999283                 .9999965








Table 4

Values of |M(D, θ)| from the Optimal and Box–Lucas Designs
for θ = (α, β)' = (.39, .10)'
and the Efficiencies of the Optimal Designs. Example 2.2

                   |M(D,θ)| optimal design   |M(D,θ)| Box–Lucas   Efficiency of Optimal Design

Linear Spline           .387684347                .069146279           1.367

Quadratic Spline        .247975528                .076197813           1.042

Cubic Spline            .169248231                .085737989            .889















Table 5

Mean Response Values with Corresponding Values of t. Example 2.2

                        Mean Response Values

t        True Model   Linear Spline   Quadratic Spline   Cubic Spline

−1 LQC    .471873       .471873          .471873           .471873
−.9       .460469       .462638          .460686           .460379
−.8       .450653       .453403          .450760           .450374
−.7       .442205       .444168          .442096           .441726
−.6 L     .434933       .434933          .434692           .434306
−.5 Q     .428674       .429865          .428674           .427984
−.4       .423287       .424797          .423327           .422628
−.3       .418650       .419728          .418684           .418109
−.2 L     .414660       .414660          .414621           .414296
−.1       .411255       .411878          .411136           .411059
 0 QC     .408268       .409097          .408268           .408268
 .1       .405724       .406315          .405754           .405810
 .2 L     .403534       .403534          .403533           .403645
 .3       .401648       .402007          .401628           .401749
 .4       .400026       .400481          .399979           .400099
 .5 Q     .398629       .398954          .398629           .398672
 .6 L     .397427       .397427          .397438           .397445
 .7       .396393       .396590          .396400           .396395
 .8       .395502       .395752          .395494           .395499
 .9       .394736       .394914          .394719           .394733
 1 LQC    .394076       .394076          .394076           .394076

L = knots for linear spline, Q = knots for quadratic spline,
C = knots for cubic spline


[Figure] Fig. 1. Plot of α + (.49 − α)exp(−β(15t + 17)) on [−1, 1] with α = .39 and
β = .10 for Example 2.2





[Figure] Fig. 2. Plot of the Linear Spline Interpolant to α + (.49 − α)exp(−β(15t + 17))
with Six Evenly Spaced Knots on [−1, 1] for Example 2.2










[Figure] Fig. 3. Plot of the Quadratic Spline Interpolant to α + (.49 − α)exp(−β(15t + 17))
with Five Evenly Spaced Knots on [−1, 1] for Example 2.2





[Figure] Fig. 4. Plot of the Cubic Spline Interpolant to α + (.49 − α)exp(−β(15t + 17))
with Three Evenly Spaced Knots on [−1, 1] for Example 2.2













2.5 Discussion

The proposed optimal design is obtained from the interpolating

polynomials, so the efficiency of the design depends on the accuracy of

the interpolations. In Example 2.2 we compared the designs based on the

linear, quadratic and cubic splines to the Box-Lucas design. Table 4

summarizes the design comparisons, by showing that the optimal designs

based on linear and quadratic splines have higher efficiencies than the

Box-Lucas design but the optimal design based on cubic spline has lower

efficiency than the Box-Lucas design. This may be explained by the

accuracies of the interpolations; that is, the upper bound of the error

of interpolation using cubic spline is larger than those of the linear

and quadratic splines (see Eq. 2.82). We also note here that the design

based on the linear spline has slightly higher efficiency than the

design based on the quadratic spline when comparing both of them to the

Box-Lucas design. Based on the efficiencies of the designs our

recommendation for Example 2.2 is to choose the design that is based on

the linear spline. The number of design points (n) for the linear,

quadratic and cubic splines in this example are 6, 6, and 5,

respectively. This number seems to be nonincreasing as the degree of

the spline interpolating polynomial increases. This is not always true

since the number of design points depends on m and h (n = m+h+1) where m

is the degree of spline interpolating polynomial and h is the number of

interior knots. The value of h can increase, decrease or stay the same

when m increases. Hence, there is the possibility of increasing the

number of design points with an increase in the degree of the spline













interpolating polynomial. The optimal design is also affected by the

position of the design points in the experimental region. A problem

arises of how the design points should be allocated, especially for the

design that is based on spline interpolating polynomial. Box and Draper

(1975) point out that "the design should generate a satisfactory

distribution of information throughout the region of interest." For

this reason, for the optimal design based on spline interpolating

polynomials, we would choose our design so that there is at least one

design point in each subinterval. One other problem is that when the

number of design points becomes large (n > 10), it is almost impossible

to calculate the design points through the Controlled Random Search

Procedure. This was the case when we tried to calculate the design

points for the design that is based on Lagrange interpolating polynomial

with two input variables (Example 2.1). It is hoped that a sequential

procedure, in which points are added one at a time, can be used to solve

the problem of optimization when the number of design points is large.

This is subject to future research.




















CHAPTER THREE
CONFIDENCE REGIONS FOR THE PARAMETERS
IN NONLINEAR MODELS


3.1 Introduction

In this chapter we present a method for the construction of a

confidence region on a vector of parameters in a nonlinear model.

Consider the same model as in (1.5)



y = f(x; θ) + ε    (3.1)



where y is the measured or observed response, f(x; θ) is the true mean

response, x = (x₁, x₂, ..., x_k)' is a vector of k input variables, θ =

(θ₁, θ₂, ..., θ_p)' is a vector of p unknown parameters, and ε is a

random error. If there are n observations (n > p) of the form y₁, y₂,

..., y_n, with y_i obtained at the design point x_i = (x_{i1}, x_{i2}, ...,

x_{ik})', i = 1, 2, ..., n, then model (3.1) can be written in the form

y_i = f(x_i; θ) + ε_i,   i = 1, 2, ..., n.    (3.2)

We assume that the random errors ε₁, ε₂, ..., ε_n are independently

distributed as N(0, σ²), with σ² unknown. We define the n × k design

matrix as D = [x_{ij}], i = 1, 2, ..., n; j = 1, 2, ..., k. The i-th row

of this matrix, with elements x_{i1}, x_{i2}, ..., x_{ik}, provides the levels

of the k variables at which the response is to be observed at the i-th

run. Throughout our development, we shall assume that replications can

be obtained at each design point. These replications will be used to

obtain an estimate of σ² (Draper and Smith (1981)). In the following

sections, we consider the problem of constructing a confidence region on

the vector of parameters, θ.



3.2 Method for Obtaining a Confidence Region on the Parameters

Consider the model given in (3.1). For a given value of x, f(x; θ)

is a function of θ. When x takes on n values in the

experimental region R, we end up with n functions of θ. To construct

a confidence region on the parameters θ, we shall proceed as follows.

Step One. For n > p, we select q = C(n, p) sets of points, each

consisting of p design points. Let W consist of the collection of the q

sets of p points, and let x₁^(l), ..., x_p^(l) denote the p points in the l-th

set (l = 1, 2, ..., q).

Step Two. For the l-th set of points in W, l = 1, 2, ..., q, we

obtain a set of simultaneous confidence intervals on f(x₁^(l); θ), ...,

f(x_p^(l); θ). Let Ω^(l) be the cartesian product of these intervals;

that is, Ω^(l) is a rectangular confidence region on the vector

(f(x₁^(l); θ), ..., f(x_p^(l); θ))'. The confidence intervals will be

chosen so that the confidence coefficient for this region is equal to

1 − α.

Step Three. Let

η₁^(l) = f(x₁^(l); θ)
⋮
η_p^(l) = f(x_p^(l); θ).

We shall solve this system of equations for θ₁, ..., θ_p in

terms of η₁^(l), ..., η_p^(l) (this can be achieved by assuming certain

conditions on the functions f(x₁^(l); θ), ..., f(x_p^(l); θ), l = 1, 2,

..., q).

Step Four. We obtain a rectangular confidence region on the vector

θ = (θ₁, ..., θ_p)' with a confidence coefficient ≥ 1 − α. This region,

denoted by r^(l), is based on the region Ω^(l).

Step Five. Using Bonferroni's inequality, the intersection

∩_{l=1}^q r^(l) provides a confidence region on the vector θ with a

confidence coefficient ≥ 1 − qα. This intersection can be used to

obtain simultaneous confidence intervals on any set of functions of

θ₁, ..., θ_p.

We shall now give details of Steps One through Five.



3.3 Confidence Region on θ

In this section we first construct the rectangular confidence

region Ω^(l) on each set of p functions of the form f(x_i^(l); θ),

i = 1, 2, ..., p, where x₁^(l), ..., x_p^(l) is a set of points in W,

l = 1, 2, ..., q. The p-variate Student t-distribution defined by

Dunnett and Sobel (1954) will be used to obtain these confidence

regions. For the sake of simplicity, in this section we shall drop the

superscript (l) and just use x_i.

Consider the model given in (3.2) and suppose

y₁₁, y₁₂, ..., y_{1n₁} are n₁ repeat observations at x₁,

y₂₁, y₂₂, ..., y_{2n₂} are n₂ repeat observations at x₂,

⋮

y_{n1}, y_{n2}, ..., y_{nn_n} are n_n repeat observations at x_n,

where y_{iu} is the u-th observation at the i-th design point. Let us

assume the y_{iu} are independently normally distributed with mean f(x_i; θ)

and common unknown variance σ² (i = 1, 2, ..., n; u = 1, 2, ..., n_i).

Draper and Smith (1981) pointed out that to estimate σ² we can use the

pooled variance of the form

s² = Σ_{i=1}^n Σ_{u=1}^{n_i} (y_{iu} − ȳ_i)² / Σ_{i=1}^n (n_i − 1)

   = [(n₁−1)s₁² + (n₂−1)s₂² + ... + (n_n−1)s_n²]/m,    (3.3)

where s_i² is the sample variance of the n_i (≥ 2) observed responses at

the i-th design point, ȳ_i is the average of the observed

responses at the i-th design point, and n is the number of distinct

design points. Here s² has m = Σ_{i=1}^n (n_i − 1) degrees of freedom.

Let

t_i = [ȳ_i − f(x_i; θ)] / (s/√n_i),   i = 1, 2, ..., p.    (3.4)

We note here that ȳ_i is normally distributed with mean f(x_i; θ) and

variance σ²/n_i. If we write

z_i = √n_i [ȳ_i − f(x_i; θ)],   i = 1, 2, ..., p,    (3.5)

then z_i, i = 1, 2, ..., p, has a p-variate normal distribution with mean

0 and common unknown variance σ², and z₁, ..., z_p are independent by the

assumption regarding the random errors stated in Section 3.1. Also,

ms²/σ² has a chi-square distribution, independent of the z_i, with m

degrees of freedom. By substituting z_i into (3.4) we obtain

t_i = z_i/s,   i = 1, 2, ..., p.    (3.6)

The joint distribution of t_i, i = 1, 2, ..., p, of the form (3.6), is a

central p-variate t-distribution with m degrees of freedom and has the

density function











h(t₁, ..., t_p) = Γ((m+p)/2) / [(mπ)^{p/2} Γ(m/2)] · [1 + (1/m) Σ_{i=1}^p t_i²]^{−(m+p)/2},
                                                                (3.7)

where Γ(·) is the Gamma function (see Dunnett and Sobel (1954)).


Simultaneous confidence intervals on f(x_i; θ), i = 1, 2, ..., p,

with confidence coefficient 1 − α can be set up by finding p positive real

numbers, say a₁, a₂, ..., a_p, such that

P(|t₁| ≤ a₁, |t₂| ≤ a₂, ..., |t_p| ≤ a_p) = 1 − α.    (3.8)

By substituting the t_i given in (3.4), formula (3.8) can be written in

the equivalent form

P(ȳ_i − a_i s/√n_i ≤ f(x_i; θ) ≤ ȳ_i + a_i s/√n_i, i = 1, ..., p) = 1 − α,    (3.9)

which determines 100(1 − α)% simultaneous confidence intervals on

f(x₁; θ), ..., f(x_p; θ). Let Ω denote the cartesian product of these

intervals. The region Ω is a rectangular confidence region on the












vector (f(x₁; θ), ..., f(x_p; θ))' with a confidence coefficient 1 − α.

Let η_i = f(x_i; θ), i = 1, 2, ..., p; η_i is a function of x_i and

θ. For a given value of x_i, f(x_i; θ) is a function of θ, and we write

f(x_i; θ) as f_i(θ). Thus we can express η_i, i = 1, 2, ..., p, as

η₁ = f₁(θ₁, θ₂, ..., θ_p) = f₁(θ)
η₂ = f₂(θ₁, θ₂, ..., θ_p) = f₂(θ)
⋮
η_p = f_p(θ₁, θ₂, ..., θ_p) = f_p(θ).    (3.10)

Using the expressions in (3.10) we can write (3.9) as

P((η₁, η₂, ..., η_p) ∈ Ω) = 1 − α.    (3.11)

If the Jacobian ∂(f₁, ..., f_p)/∂(θ₁, ..., θ_p) ≠ 0, then equations (3.10) can be

uniquely solved for θ₁, ..., θ_p in terms of η₁, ..., η_p, and we obtain

θ₁ = g₁(η₁, η₂, ..., η_p)
θ₂ = g₂(η₁, η₂, ..., η_p)
⋮
θ_p = g_p(η₁, η₂, ..., η_p).    (3.12)

We let L_i = min_Ω g_i(η₁, ..., η_p) and U_i = max_Ω g_i(η₁, ..., η_p); then

from (3.11) and (3.12) we can conclude that











P(L₁ ≤ θ₁ ≤ U₁, ..., L_p ≤ θ_p ≤ U_p) ≥ 1 − α,    (3.13)

which gives a rectangular confidence region on θ with a confidence

coefficient ≥ 1 − α. We denote this region by r. Equation (3.13) can

be written in the alternative form

P((θ₁, θ₂, ..., θ_p) ∈ r) ≥ 1 − α.    (3.14)

The above method can be applied to every set in W consisting of p

points. We shall denote the confidence regions obtained as in (3.14) for

the various q sets in W by r^(1), ..., r^(q). We thus have the following

q confidence regions on θ, each having a confidence coefficient ≥ 1 − α:

P((θ₁, θ₂, ..., θ_p) ∈ r^(1)) ≥ 1 − α

P((θ₁, θ₂, ..., θ_p) ∈ r^(2)) ≥ 1 − α
⋮
P((θ₁, θ₂, ..., θ_p) ∈ r^(q)) ≥ 1 − α.    (3.15)

We now combine the information from these q confidence regions using

Bonferroni's inequality (see, for example, Neter and Wasserman (1974, p.

146) or Bickel and Doksum (1977, p. 439)). According to this inequality

and from (3.15), we have

P((θ₁, θ₂, ..., θ_p) ∈ ∩_{l=1}^q r^(l)) ≥ 1 − qα,    (3.16)













which gives a confidence region on θ with a confidence coefficient

≥ 1 − qα. We can write (3.16) in the equivalent form

P(L₁* ≤ θ₁ ≤ U₁*, ..., L_p* ≤ θ_p ≤ U_p*) ≥ 1 − qα,    (3.17)

where L_i* = max(L_i^(1), ..., L_i^(q)), U_i* = min(U_i^(1), ..., U_i^(q)), i = 1, 2, ..., p,

and L_i^(l), U_i^(l), i = 1, 2, ..., p, l = 1, 2, ..., q, are the lower

and upper bounds, respectively, on θ_i obtained as in (3.13) for the l-th

set of points in W. We note from formula (3.17) that the intervals

[L_i*, U_i*], i = 1, ..., p, form a set of simultaneous confidence intervals

on θ₁, ..., θ_p, respectively, with a joint confidence coefficient

greater than or equal to 1 − qα.



3.4 Examples

We now present two examples to illustrate the technique for

obtaining confidence regions on the parameters of nonlinear models.

Example 3.1. Consider the two-parameter exponential growth model

used by Gallant (1975b) of the form

y = θ₁exp(θ₂x) + ε.    (3.18)

A total of n = 11 design points are chosen from R = {x: 0 ≤ x ≤ 1}.

These points are 0, .1, .2, ..., 1. Each point is replicated five

times. The parameter space is taken as Θ = {(θ₁, θ₂): θ₁ > 0, θ₂ > 0}.

Observed response values are generated using the parameter values












θ₁ = .62, θ₂ = .5, and σ² = .04. These values are shown in Table 6.

The mean (ȳ_i) and the variance (s_i²) of the observed responses at the

i-th design point, for i = 1, ..., 11, are listed in Table 7. The pooled

variance s² can be calculated by using the formula in (3.3) and is

equal to .0332818 with m = 44 degrees of freedom.
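
Since every design point in this example has n_i = 5 replicates, the pooled
variance (3.3) reduces to the average of the eleven sample variances of
Table 7; a minimal check (Python, added for illustration):

```python
# Pooled variance (3.3) for Example 3.1: n_i = 5 replicates at each of 11 points.
s2_i = [0.00733, 0.01255, 0.00913, 0.02893, 0.02073, 0.03606,
        0.09277, 0.04062, 0.01607, 0.02613, 0.07578]  # Table 7
n_i = [5] * 11
m = sum(n - 1 for n in n_i)                       # 44 degrees of freedom
s2 = sum((n - 1) * s for n, s in zip(n_i, s2_i)) / m
print(m, round(s2, 7))                            # 44 0.0332818
```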

Since n = 11 and p, the number of parameters, is equal to 2, we have

q = C(11, 2) = 55. We thus have 55 pairs of functions of the form

f(x_i^(l); θ), i = 1, 2; l = 1, 2, ..., 55. For each pair, the

simultaneous confidence intervals on f(x₁^(l); θ) and f(x₂^(l); θ) can be

obtained as follows. First we use the bivariate t-distribution given

in (3.7) with p = 2 to calculate the probability described in (3.8). We

assume that the limits of integration a₁, a₂ are equal to a common value a,

so that Equation (3.8) can be expressed as

P(|t₁| ≤ a, |t₂| ≤ a) = ∫_{−a}^{a} ∫_{−a}^{a} h(t₁, t₂) dt₁ dt₂ = 1 − α,
                                                                (3.19)

where h(t₁, t₂) can be obtained from (3.7) with p = 2 and m = 44. To

calculate the probability value in (3.19) we fix a and calculate the

integral. This can be carried out by using the computer subprogram

DMLIN from IMSL (1982). Suppose we would like the confidence

coefficient 1 − qα in (3.16) to be equal to .94. We set

1 − 55α = .94, which implies that α ≈ .001. By computing the

integral in (3.19) for several assignments of a we obtain a = 3.75, which

corresponds to α = .001057813 and 1 − 55α = .941820285.













Table 6

Observed Responses and Design Points for Example 3.1

Design Point (x)    Observed Responses (y)

x = 0:    .84025, .84356, .66729, .75769, .67426
x = .1:   .53177, .62303, .75965, .49493, .70555
x = .2:   .77509, .61030, .79161, .58948, .74782
x = .3:   .46908, .82377, .91124, .66913, .65852
x = .4:   .85087, .63028, .56506, .81007, .89356
x = .5:   1.10607, 1.09510, .64805, .88696, .85727
x = .6:   1.31870, .73195, .85361, .52289, 1.05243
x = .7:   1.02124, 1.05300, .82359, .63941, .63028
x = .8:   .94460, .71885, .89828, 1.00649, .74048
x = .9:   .96815, 1.08855, .66642, .79255, .88114
x = 1:    1.0697, .56959, .91519, 1.31447, .84882












Table 7

Means and Variances of Observed Responses
at Distinct Design Points for Example 3.1

Design Point (x_i)   Mean (ȳ_i)   Variance (s_i²)

0      .75661   .00733
.1     .62299   .01255
.2     .70286   .00913
.3     .70635   .02893
.4     .74997   .02073
.5     .91869   .03606
.6     .89592   .09277
.7     .83350   .04062
.8     .86174   .01607
.9     .87936   .02613
1      .94355   .07578












To find the 100(1 − α)% ≈ 99.9% simultaneous confidence intervals on

f(x₁^(l); θ) and f(x₂^(l); θ), l = 1, 2, ..., 55, we consider Equation

(3.9) with a₁ = a₂ = a = 3.75 and n₁ = n₂ = 5. These intervals are given

in Table 8.
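
The calibration of a in (3.19) can be sketched numerically as follows
(Python with SciPy in place of the IMSL routine DMLIN used in the
dissertation; the integrand is the density (3.7) with p = 2 and m = 44):

```python
import numpy as np
from scipy.integrate import dblquad
from scipy.special import gammaln

m = 44  # degrees of freedom of the pooled variance

def h(t1, t2):
    """Central bivariate t density (3.7) with p = 2."""
    log_c = gammaln((m + 2) / 2) - gammaln(m / 2) - np.log(m * np.pi)
    return np.exp(log_c) * (1 + (t1**2 + t2**2) / m) ** (-(m + 2) / 2)

a = 3.75
prob, _ = dblquad(lambda t2, t1: h(t1, t2), -a, a, lambda t1: -a, lambda t1: a)
alpha = 1 - prob
print(round(alpha, 9), round(1 - 55 * alpha, 9))  # about .001057813 and .9418
```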

We now find the confidence region on (θ₁, θ₂). For the sake of

simplicity, in this example we shall drop the superscript (l) and just

use x₁ and x₂. We assume that x₁ < x₂ and let

η₁ = θ₁exp(θ₂x₁) = f(x₁; θ),

η₂ = θ₁exp(θ₂x₂) = f(x₂; θ).    (3.20)

Suppose c₁, d₁ are the lower and upper limits of the confidence interval

on f(x₁; θ) and c₂, d₂ are the lower and upper limits of the confidence

interval on f(x₂; θ). We solve the two equations in (3.20) for θ₁, θ₂

and obtain

θ₁ = η₁^{x₂/(x₂−x₁)} η₂^{x₁/(x₁−x₂)},

θ₂ = (ln η₂ − ln η₁)/(x₂ − x₁),    (3.21)

where ln is the natural logarithm. From (3.21), by minimizing

and maximizing each of θ₁ and θ₂ with respect to (η₁, η₂) over the rectangular

region [c₁, d₁] × [c₂, d₂], we obtain

θ₁min = c₁^{x₂/(x₂−x₁)} d₂^{x₁/(x₁−x₂)},

θ₁max = d₁^{x₂/(x₂−x₁)} c₂^{x₁/(x₁−x₂)},

θ₂min = (ln c₂ − ln d₁)/(x₂ − x₁),

θ₂max = (ln d₂ − ln c₁)/(x₂ − x₁).

Values of θ₁min, θ₁max and θ₂min, θ₂max for each of the 55 pairs of

simultaneous confidence intervals on η₁, η₂ are given in Table 8. From

Table 8, if we take the largest of the lower bounds on θ₁ and on θ₂ as

well as the smallest of the upper bounds on θ₁ and on θ₂, we obtain the

following rectangular confidence region on θ, which has a confidence

coefficient of at least 94.18 percent:

θ₁ ∈ [.45065, .96861]

θ₂ ∈ [−.41816, 1.01981]
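
The interval inversion of (3.21) for a single pair can be sketched as
follows (Python, added for illustration; the numbers are the first row of
Table 8, for which x₁ = 0 and x₂ = .1):

```python
import math

def theta_bounds(x1, x2, c1, d1, c2, d2):
    """Invert the interval [c1,d1] x [c2,d2] on (eta1, eta2) through (3.21),
    assuming 0 <= x1 < x2 so theta1 is increasing in eta1, decreasing in eta2."""
    A, B = x2 / (x2 - x1), x1 / (x1 - x2)
    t1 = (c1**A * d2**B, d1**A * c2**B)
    t2 = ((math.log(c2) - math.log(d1)) / (x2 - x1),
          (math.log(d2) - math.log(c1)) / (x2 - x1))
    return t1, t2

# First pair of Table 8: x1 = 0, x2 = .1
print(theta_bounds(0.0, 0.1, 0.45065, 1.06257, 0.31703, 0.92894))
# -> theta1 in (.45065, 1.06257), theta2 in about (-12.094, 7.234),
#    matching Table 8 up to rounding of the interval limits
```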



For comparison purposes we use the data in Table 6 to construct

simultaneous confidence intervals on θ₁ and θ₂ by using each of three

other known methods. These methods are:













Table 8

99.9 Percent Confidence Regions on [f(x₁^(l); θ), f(x₂^(l); θ)],
l = 1, 2, ..., 55, and at Least
99.9 Percent Confidence Regions on [θ₁, θ₂] for Example 3.1

x₁   x₂   Confidence Region on [f(x₁; θ), f(x₂; θ)]   Confidence Region on [θ₁, θ₂]

0    .1   [.45065,1.06257],[.31703,.92894]    [.45065*,1.06257],[-12.094,7.2335]
0    .2   [.45065,1.06257],[.3969,1.00882]    [.45065,1.06257],[-4.9238,4.02918]
0    .3   [.45065,1.06257],[.40039,1.0123]    [.45065,1.06257],[-3.2533,2.69762]
0    .4   [.45065,1.06257],[.44401,1.05592]   [.45065,1.06257],[-2.1815,2.12369]
0    .5   [.45065,1.06257],[.61273,1.22465]   [.45065,1.06257],[-1.101,1.9992]
0    .6   [.45065,1.06257],[.58996,1.20188]   [.45065,1.06257],[-.98065,1.6349]
0    .7   [.45065,1.06257],[.52755,1.13946]   [.45065,1.06257],[-1.0003,1.32516]
0    .8   [.45065,1.06257],[.55578,1.1677]    [.45065,1.06257],[-.81008,1.19011]
0    .9   [.45065,1.06257],[.57341,1.18532]   [.45065,1.06257],[-.68539,1.07452]
0    1    [.45065,1.06257],[.63759,1.24951]   [.45065,1.06257],[-.51074,1.01981**]
.1   .2   [.31703,.92894],[.3969,1.00882]     [.09963,2.17417],[-8.5036,11.5754]
.1   .3   [.31703,.92894],[.40039,1.0123]     [.17742,1.41495],[-4.208,5.80496]
.1   .4   [.31703,.92894],[.44401,1.05592]    [.21229,1.1881],[-2.4607,4.0106]
.1   .5   [.31703,.92894],[.61273,1.22465]    [.22614,1.03079],[-1.0403,3.37354]
.1   .6   [.31703,.92894],[.58996,1.20188]    [.24236,1.01724],[-.90799,2.66529]
.1   .7   [.31703,.92894],[.52755,1.13946]    [.25615,1.02081],[-.91302,2.1322]
.1   .8   [.31703,.92894],[.55578,1.1677]     [.26315,.99967],[-.73381,1.86256]
.1   .9   [.31703,.92894],[.57341,1.18532]    [.26885,.98669],[-.60307,1.64847]
.1   1    [.31703,.92894],[.63759,1.24951]    [.27222,.96861**],[-.41816*,1.5239]
.2   .3   [.3969,1.00882],[.40039,1.0123]     [.06101,6.40426],[-9.2409,9.36293]
.2   .4   [.3969,1.00882],[.44401,1.05592]    [.14919,2.29209],[-4.1034,4.8924]
.2   .5   [.3969,1.00882],[.61273,1.22465]    [.18727,1.40661],[-1.662,3.75572]
.2   .6   [.3969,1.00882],[.58996,1.20188]    [.22809,1.31919],[-1.3412,2.76986]
.2   .7   [.3969,1.00882],[.52755,1.13946]    [.26030,1.30747],[-1.2966,2.10924]
.2   .8   [.3969,1.00882],[.55578,1.1677]     [.27699,1.23059],[-.99359,1.79849]
.2   .9   [.3969,1.00882],[.57341,1.18532]    [.29035,1.18553],[-.80706,1.56296]
.2   1    [.3969,1.00882],[.63759,1.24951]    [.29797,1.13143],[-.57353,1.43352]
.3   .4   [.40039,1.0123],[.44401,1.05592]    [.02133,11.9968],[-8.2414,9.69731]
.3   .5   [.40039,1.0123],[.61273,1.22465]    [.07435,2.14967],[-2.5103,5.58983]
.3   .6   [.40039,1.0123],[.58996,1.20188]    [.13339,1.737],[-1.79998,3.66398]
.3   .7   [.40039,1.0123],[.52755,1.13946]    [.18274,1.65044],[-1.5294,2.61467]
.3   .8   [.40039,1.0123],[.55578,1.1677]     [.21066,1.45062],[-1.1992,2.14069]
.3   .9   [.40039,1.0123],[.57341,1.18532]    [.23271,1.34504],[-.94732,1.80888]
.3   1    [.40039,1.0123],[.63759,1.24951]    [.24585,1.23411],[-.6604,1.62581]
.4   .5   [.44401,1.05592],[.61273,1.22465]   [.00767,9.31279],[-5.4424,10.1456]
.4   .6   [.44401,1.05592],[.58996,1.20188]   [.06059,3.38264],[-2.9106,4.97893]
.4   .7   [.44401,1.05592],[.52755,1.13946]   [.12637,2.66356],[-2.3131,3.14154]
.4   .8   [.44401,1.05592],[.55578,1.1677]    [.16883,2.00614],[-1.6045,2.41735]
.4   .9   [.44401,1.05592],[.57341,1.18532]   [.20241,1.72096],[-1.2212,1.96384]
.4   1    [.44401,1.05592],[.63759,1.24951]   [.22276,1.47806],[-.84086,1.72443]
.5   .6   [.61273,1.22465],[.58996,1.20188]   [.02110,47.2015],[-7.3035,6.73707]
.5   .7   [.61273,1.22465],[.52755,1.13946]   [.12993,10.0551],[-4.2108,3.10191]
.5   .8   [.61273,1.22465],[.55578,1.1677]    [.20917,4.56934],[-2.6334,2.14953]
.5   .9   [.61273,1.22465],[.57341,1.18532]   [.26858,3.1619],[-1.897,1.64959]
.5   1    [.61273,1.22465],[.63759,1.24951]   [.30047,2.35221],[-1.3054,1.42516]
.6   .7   [.58996,1.20188],[.52755,1.13946]   [.01136,168.052],[-8.234,6.58257]
.6   .8   [.58996,1.20188],[.55578,1.1677]    [.07608,12.154],[-3.8563,3.41368]
.6   .9   [.58996,1.20188],[.57341,1.18532]   [.14615,5.29023],[-2.4668,2.32571]
.6   1    [.58996,1.20188],[.63759,1.24951]   [.19140,3.11048],[-1.5848,1.87614]
.7   .8   [.52755,1.13946],[.55578,1.1677]    [.00202,173.484],[-7.1793,7.94551]
.7   .9   [.52755,1.13946],[.57341,1.18532]   [.03103,12.6057],[-3.4336,4.04765]
.7   1    [.52755,1.13946],[.63759,1.24951]   [.07055,4.41628],[-1.9353,2.87423]
.8   .9   [.55578,1.1677],[.57341,1.18532]    [.00129,345.366],[-7.112,7.57389]
.8   1    [.55578,1.1677],[.63759,1.24951]    [.02176,13.1361],[-3.0254,4.05065]
.9   1    [.57341,1.18532],[.63759,1.24951]   [.00052,314.366],[-6.2006,7.78915]

* is the largest value of the lower bound.
** is the smallest value of the upper bound.













1) The likelihood ratio method

2) The method based on the asymptotic normality of the least

squares estimator

3) Hartley's method

(see Section 1.3).

The boundaries of the confidence regions on θ₁ and θ₂ obtained from

the above three methods are ellipses. In order to compare

our simultaneous confidence intervals on θ₁ and θ₂, which we found

earlier, with those obtained by each of the above methods, we project the

ellipses onto the θ₁ and θ₂ axes. The resulting intervals form

simultaneous, but conservative, confidence intervals on θ₁ and θ₂. The

results are given in Table 9.





Table 9

Simultaneous Confidence Intervals on θ₁ and θ₂
with a Confidence Coefficient ≥ .9418, Obtained by Projecting the
94.18 Percent Confidence Regions on (θ₁, θ₂), for Example 3.1

                                             θ₁                  θ₂
Method                                   Lower   Upper      Lower   Upper

Likelihood Ratio                          .585    .790       .09     .55

Asymptotic Normality of the LS Estimator  .602    .768       .129    .506

Hartley                                   .555    .820       .02     .63












We note that the proposed method provides wider confidence

intervals than the other three methods. This is attributed to the

conservative nature of the proposed method, caused by the use of the

confidence intervals in (3.13), which are already conservative, followed

by the use of Bonferroni's inequality, which leads to yet another set of

conservative intervals as in (3.17).

Example 3.2. In this second example we construct simultaneous

confidence intervals on θ₁, θ₂ for a nonlinear model with two input

variables. Consider the model used by Draper and Smith (1981) for a

certain chemical reaction, which can be described by the nonlinear model

y = exp{−θ₁x₁exp[−θ₂(1/x₂ − 1/620)]} + ε,    (3.22)

where y is the fraction of original material remaining, x₁ is the

reaction time in minutes, and x₂ is the temperature in degrees Kelvin.

The data are taken from page 520 of Draper and Smith (1981) and are

given in Table 10. The data consist of two replications at x₁ = 120,

x₂ = 612, eleven replications at x₁ = 60, x₂ = 620, and three replica-

tions at x₁ = 30, x₂ = 639. The mean (ȳ_i) and variance (s_i²) of the

observed responses at each setting of x₁ and x₂ are given in Table 11.

Before we proceed any further, we would like to verify that the

variances of the observed responses from the different sets of replica-

tions are not significantly different, because our method uses the

pooled variance. To test the null hypothesis of equal variances,

Bartlett's (1937) chi-square test is used (see also Brownlee

(1965, p. 293)). The test statistic is

M = [(N − n) ln s² − Σ_{i=1}^n (n_i − 1) ln s_i²]/C,    (3.23)

where

C = 1 + (1/(3(n−1)))[Σ_{i=1}^n 1/(n_i − 1) − 1/(N − n)],

s² = .00006222 is the pooled variance defined in (3.3),

N = Σ_{i=1}^n n_i = 16, s₁², s₂², s₃² are given in Table 11, and n is the number of

distinct design points. Under the null hypothesis, M has an approximate

chi-square distribution with n − 1 = 2 degrees of freedom. The value of

M in (3.23) is equal to .988932. The value of chi-square with 2 degrees

of freedom at the significance level .05 is 5.99, which

exceeds the value of M. Therefore, we do not reject the null hypothesis

of equal variances.
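
A sketch of Bartlett's test (3.23) (Python, added for illustration; because
Table 11 reports the variances to limited precision, the computed M matches
the reported .988932 only approximately):

```python
import math

n_i = [2, 11, 3]                          # replications at the three design points
s2_i = [0.000018, 0.000054, 0.000124]     # sample variances from Table 11

N, n = sum(n_i), len(n_i)
m = N - n
s2 = sum((ni - 1) * si for ni, si in zip(n_i, s2_i)) / m   # pooled variance (3.3)
C = 1 + (1 / (3 * (n - 1))) * (sum(1 / (ni - 1) for ni in n_i) - 1 / m)
M = (m * math.log(s2) - sum((ni - 1) * math.log(si)
                            for ni, si in zip(n_i, s2_i))) / C
print(round(s2, 8), round(M, 4))  # pooled variance about .0000620, M close to .99
```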

We have n = 3 distinct design points and p = 2 parameters.

There are q = C(3, 2) = 3 pairs of functions of the form f(x_{i1}^(l), x_{i2}^(l); θ),

i = 1, 2; l = 1, 2, 3. To obtain a confidence region on θ with a

confidence coefficient of at least 95 percent we need α to satisfy

1 − 3α ≥ .95, thus α ≤ .0167. By computing the integrals in (3.19)

for several assignments of a we obtain a = 3.15, which makes 1 − α =

.984949706 and 1 − 3α = .954849117. The 98.49 percent simultaneous

confidence intervals on f(x_{11}^(l), x_{12}^(l); θ) and f(x_{21}^(l), x_{22}^(l); θ), l = 1, 2, 3,

can be obtained from Equation (3.9) with p = 2, m = 13, n₁ = 2, n₂ = 11,

n₃ = 3, and the ȳ_i and s_i² as in Table 11. The results are given in

Table 12.

Again, in this example we shall drop the superscript (l) and just

use x₁₁, x₁₂, x₂₁ and x₂₂.













Table 10

Observed Responses and Design Points for Example 3.2

[The body of this table did not survive extraction; the data are those from
page 520 of Draper and Smith (1981), summarized in Table 11.]


Table 11

Means and Variances of Observed Responses
at Each Replication for Example 3.2

Design Points           Observed Responses
x₁     x₂        Mean (ȳ_i)    Variance (s_i²)

120    612        .78800        .000018

60     620        .79627        .000054

30     639        .65067        .000124


To find simultaneous confidence intervals on θ₁, θ₂, let

η₁ = f(x₁₁, x₁₂; θ) = exp{−θ₁x₁₁exp[−θ₂(1/x₁₂ − 1/620)]},

η₂ = f(x₂₁, x₂₂; θ) = exp{−θ₁x₂₁exp[−θ₂(1/x₂₂ − 1/620)]}.    (3.24)

Let c₁, d₁ be the lower and upper limits of the confidence interval on

f(x₁₁, x₁₂; θ) and c₂, d₂ be the lower and upper limits of the confidence

interval on f(x₂₁, x₂₂; θ). We solve the two equations in (3.24) for θ₁

and θ₂ and obtain

θ₁ = (−ln η₁) exp[θ₂(1/x₁₂ − 1/620)]/x₁₁,
                                                                (3.25)
θ₂ = [ln(−ln η₁) − ln(−ln η₂) − ln x₁₁ + ln x₂₁] / [(1/x₂₂ − 1/620) − (1/x₁₂ − 1/620)].

We then find the maximum and minimum values of θ₁ and θ₂ of the form

(3.25) over the region [c₁, d₁] × [c₂, d₂], where η₁ ∈ [c₁, d₁] and η₂ ∈

[c₂, d₂]. The results are also given in Table 12. The largest of the lower

bounds on θ₁ and θ₂ and the smallest of the upper bounds on θ₁ and θ₂

from Table 12 form a confidence region on θ with a confidence

coefficient of at least 95.48 percent, given by

θ₁ ∈ [.00364, .00392]

θ₂ ∈ [26558.7, 29617.4]
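
Since θ₁ in (3.25) depends on η₂ through θ₂, the extremes over
[c₁, d₁] × [c₂, d₂] are most easily located numerically; a grid-search
sketch (Python, added for illustration, using the second row of Table 12,
i.e., the points (120, 612) and (30, 639)):

```python
import numpy as np

x11, x12 = 120.0, 612.0
x21, x22 = 30.0, 639.0
c1, d1 = 0.77043, 0.80557   # interval on f(x11, x12; theta), Table 12
c2, d2 = 0.63632, 0.66501   # interval on f(x21, x22; theta), Table 12

def invert(e1, e2):
    """Solve (3.24) for (theta1, theta2) as in (3.25)."""
    u1, u2 = 1/x12 - 1/620, 1/x22 - 1/620
    th2 = (np.log(-np.log(e1)) - np.log(-np.log(e2))
           - np.log(x11) + np.log(x21)) / (u2 - u1)
    th1 = (-np.log(e1)) * np.exp(th2 * u1) / x11
    return th1, th2

grid1 = np.linspace(c1, d1, 201)
grid2 = np.linspace(c2, d2, 201)
vals = np.array([invert(e1, e2) for e1 in grid1 for e2 in grid2])
print(vals.min(axis=0), vals.max(axis=0))
# theta1 in about [.00334, .00392], theta2 in about [26558.7, 30762]
# (second row of Table 12, up to rounding of the interval limits)
```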


For comparison purposes we use the data in Table 10 to construct

simultaneous confidence intervals on θ₁, θ₂ by using each of the same

three methods as in Example 3.1. The boundaries of these confidence

regions are ellipses. The simultaneous confidence intervals on

θ₁ and θ₂ can be obtained by projecting the ellipses onto the

θ₁ and θ₂ axes. These confidence intervals are conservative and are

given in Table 13.

Our method compares well with the other three methods in terms of

the ranges of the parameters.















Table 12

98.49 Percent Confidence Regions on
[f(x₁₁^(l), x₁₂^(l); θ), f(x₂₁^(l), x₂₂^(l); θ)], l = 1, 2, 3, and at Least
98.49 Percent Confidence Regions on [θ₁, θ₂] for Example 3.2

(x₁₁, x₁₂)   (x₂₁, x₂₂)   Confidence Region on [f₁, f₂]       Confidence Region on [θ₁, θ₂]

(120, 612)   (60, 620)    [.77043,.80557],[.78878,.80376]    [.00364*,.00395],[24470.3,37234.9]

(120, 612)   (30, 639)    [.77043,.80557],[.63632,.66501]    [.00334,.00392**],[26558.7*,30762]

( 60, 620)   (30, 639)    [.78878,.80376],[.63632,.66501]    [.00364,.00395],[25753.9,29617.4**]

* is the largest value of the lower bound.
** is the smallest value of the upper bound.



Table 13

Simultaneous Confidence Intervals on θ₁ and θ₂ with a Confidence
Coefficient ≥ .9548, Obtained by Projecting the
95.48 Percent Confidence Regions on (θ₁, θ₂), for Example 3.2

                                             θ₁                    θ₂
Method                                   Lower    Upper      Lower     Upper

Likelihood Ratio                         .00361   .00389     26768     29393

Asymptotic Normality of the LS Estimator .00363   .00386     27006.3   29158.1

Hartley                                  .00356   .00388     27100     30950













3.5 Discussion

Our confidence region on the parameters is conservative; that is,

it has a confidence level of at least 100(1 − qα) percent, where

q = C(n, p), n is the number of distinct design points, and p is the number

of parameters. This explains why, in Example 3.1, our

method provides wider confidence intervals than the methods of

likelihood ratio, asymptotic normality of the least squares estimator,

and Hartley. The advantage of the proposed method is that it does not

require initial estimates of the parameters, unlike the

likelihood ratio and asymptotic normality methods. Furthermore, in

comparison with Hartley's method, the proposed method uses the true

response function rather than a linearized one. In Example 3.2 our

method did quite well in comparison with the other three methods.

The conservative nature of the proposed method means that it

guarantees a percent coverage of the parameter vector which is no less

than a preset value. It would, therefore, be desirable to determine how

much larger the true percent coverage is than the preset value. Such a

determination could help reduce the size of the confidence region obtained

by this method. Further investigation of this is needed.





















CHAPTER FOUR
CONCLUSIONS AND RECOMMENDATIONS



A design criterion for parameter estimation that does not depend

on the parameter values of a nonlinear model, as well as a method for

obtaining a confidence region on the parameter vector of a nonlinear

model, have been proposed.

In Chapter Two approximations to the true mean response functions

with a Lagrange interpolating polynomial (for the case of two input

variables) and spline interpolating polynomials (for the case of a

single input variable) were constructed. Each approximation was used to

obtain a parameter-free design on the basis of a modification of the

D-optimality criterion which was proposed by Khuri (1982). This design

is found to depend on the size of the error associated with the

approximation of the true mean response. If the error of an

approximation is large, the design determined on the basis of our

procedure may be inefficient.

It is unfortunate that, at the present stage, no computer algorithm

is available for the determination of the optimal design based on the

Lagrange interpolating polynomial for the case of two (or more) input

variables when the number of design points is large. It is hoped that a

sequential procedure for the generation of design points can reduce the













computational difficulty to a manageable level. This will be

investigated in future research (see Example 2.1).

The reason we chose spline interpolating polynomials to

approximate the mean response function is the attractive properties

of splines as approximating functions (Wold (1974)). These

functions are extremely well suited to interpolating or

approximating functions; to quote de Boor and Rice (1968, p. 1): "Spline

functions are the most successful approximating functions in use

today. They combine ease of handling in a computer with great

flexibility, and are therefore particularly suited for the approximation

of experimental data."

Our attention has centered on the construction of an optimal design

based on spline interpolating polynomials for the case of a single input

variable. Extensions of the procedure for constructing optimal designs

using spline interpolating polynomials to more than one input variable

cause no difficulties in terms of methodology. However, in order to

calculate the design point locations, we may have the same difficulty as

in Example 2.1 if the number of design points is large.

Our design criterion was developed under the assumption that the

values of all the parameters are of equal interest to the

experimenter. It is sometimes the case, however, that the experimenter

is more interested in the values of certain parameters than the values

of others. The extension of our design criterion to obtain the design

of experiments for subsets of parameters requires further research.



