Title: Bayesian modeling of nonstationarity in normal and lognormal processes
Permanent Link: http://ufdc.ufl.edu/UF00098090/00001
 Material Information
Title: Bayesian modeling of nonstationarity in normal and lognormal processes with applications in CVP analysis and life testing models
Physical Description: xiii, 24 leaves : graphs ; 28 cm.
Language: English
Creator: Velez-Arocho, Jorge Ivan, 1947-
Publication Date: 1978
Copyright Date: 1978
 Subjects
Subject: Bayesian statistical decision theory   ( lcsh )
Break-even analysis   ( lcsh )
Economic forecasting -- Mathematical models   ( lcsh )
Management thesis Ph. D   ( lcsh )
Dissertations, Academic -- Management -- UF   ( lcsh )
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
 Notes
Thesis: Thesis--University of Florida.
Bibliography: Bibliography: leaves 198-212.
Statement of Responsibility: by Jorge Ivan Velez-Arocho.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00098090
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000071549
oclc - 04536074
notis - AAH6803














BAYESIAN MODELING OF NONSTATIONARITY IN
NORMAL AND LOGNORMAL PROCESSES WITH
APPLICATIONS IN CVP ANALYSIS AND
LIFE TESTING MODELS













By

JORGE IVAN VELEZ-AROCHO


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY








UNIVERSITY OF FLORIDA


1978

































Copyright 1978

by

Jorge Ivan Velez-Arocho


























This dissertation stands as a symbol of love
to my wife, Angle, and to my daughter,
Angeles Maria, without whose under-
standing, patience and willingness
to accept sacrifice this
investigation would have
been quite impossible.

















ACKNOWLEDGMENTS


I would like to acknowledge my full indebtedness to those

people who gave their interest, time and effort to making this dis-

sertation possible.

To Dr. Christopher B. Barry who has been my advisor and my

friend, I wish to express my gratitude and deepest appreciation for the

support he has given me throughout the development of this study. He

criticized but tolerated my mistakes and encouraged my good performance.

His intelligent guidance, extraordinary competence, and friendly attitude

have been a source of inspiration and encouragement for me.

I am especially grateful to Dr. Antal Majthay for his sincere

advice and assistance during the supervision of my doctoral program and

the preparation of this dissertation. I admire and am inspired by his

unreserved dedication to excellence in education. He will always be

remembered as one of the most valuable models of excellent teaching.

The other members of my committee, Dr. Tom Hodgson and Dr. Zoran

Pop-Stojanovic have each in his own way contributed to the successful

completion of this work. Appreciation is extended to each for his indi-

vidual efforts and expressed concern for my progress. Although not on

my committee, I would also like to express appreciation to Dr. Gary

Koehler, whose support and encouragement came when they were badly

needed.

To Omar Ruiz, Dean of the School of Business Administration

of the University of Puerto Rico at Mayaguez, I am particularly grateful















for his understanding, confidence and cooperation during my leave of

absence from that institution. Completion of this study was only pos-

sible because of the combined financial support of the University of

Puerto Rico, the University of Florida and Peter Eckrich and Sons Co.

Their continuing support is sincerely appreciated.

I am indebted to Dr. Conrad Doenges, Chairman of the Department

of Finance of the University of Texas at Austin, for his interest and

help and to the many members of the Finance faculty for their interest

during my period of research at the University of Texas. Special thanks

go to Nettie Webb for her warm friendship and continuous secretarial

assistance to my wife.

It is difficult to adequately convey the support my family has

provided. My parents, Jorge Velez and Elba Lucrecia Arocho, and my

brothers and sisters provided understanding and moral assistance for

which I will always be grateful. Their high expectations and constant

encouragement have been a powerful factor in shaping my desire to pursue

this degree.

Most of all a gratitude which cannot be expressed in words

goes to my loving wife, Angle, for her patience and persistence in

typing this dissertation and for her wonderful attitude throughout

the entire arduous process.




















TABLE OF CONTENTS

                                                                      Page

ACKNOWLEDGMENTS ...................................................     iv

LIST OF APPENDIX TABLES ...........................................     ix

LIST OF FIGURES ...................................................      x

ABSTRACT ..........................................................     xi

Chapter

ONE      INTRODUCTION

         1.1 Introduction
         1.2 Summary of Results and Overview of Dissertation

TWO      SURVEY OF PERTINENT LITERATURE

         2.1 Cost-Volume-Profit (CVP) Analysis
         2.2 Life Testing Models

              2.2.1 Introduction
              2.2.2 Some Common Life Distributions
              2.2.3 Traditional Approach to Life Testing Inferences
              2.2.4 Bayesian Techniques in Life Testing

         2.3 Modeling of Nonstationary Processes

THREE    NONSTATIONARITY IN NORMAL AND LOGNORMAL PROCESSES

         3.1 Introduction
         3.2 Bayesian Analysis of Normal and Lognormal Processes
         3.3 Nonstationary Model for Normal and Lognormal Means

              3.3.1 μ is Unknown and σ² is Known
              3.3.2 μ and σ² Both Unknown
              3.3.3 Stationary Versus Nonstationary Results

         3.4 Conclusion

FOUR     LIMITING RESULTS AND PREDICTION INTERVALS FOR NONSTA-
         TIONARY NORMAL AND LOGNORMAL PROCESSES

         4.1 Introduction
         4.2 Special Properties and Limiting Results Under
             Nonstationarity ......................................     86

              4.2.1 Limiting Behavior of m' and n' When μ is
                    the Only Unknown Parameter ....................     86
              4.2.2 Limiting Behavior of m', n', v' and d' When
                    Both Parameters μ and σ are Unknown ...........     95

         4.3 Prediction Intervals for Normal, Student, Lognormal
             and LogStudent Distributions .........................    103
         4.4 Conclusion ...........................................    117

FIVE     NONSTATIONARITY IN CVP AND STATISTICAL LIFE ANALYSIS .....    119

         5.1 Introduction .........................................    119
         5.2 Nonstationarity in Cost-Volume-Profit Analysis .......    120

              5.2.1 Existing Analysis .............................    120
              5.2.2 Nonstationary Bayesian CVP Model ..............    122
              5.2.3 Extensions to the Nonstationary Bayesian
                    CVP Model .....................................    136

         5.3 Nonstationarity in Statistical Life Analysis .........    140

              5.3.1 Existing Analysis .............................    140
              5.3.2 A Life Testing Model Under Nonstationarity ....    141

         5.4 Conclusion ...........................................    148

SIX      CONCLUSIONS, LIMITATIONS AND FURTHER STUDY ...............    150

         6.1 Summary ..............................................    150
         6.2 Limitations ..........................................    152
         6.3 Suggestions for Further Research .....................    155

APPENDIXES

I    Bayesian Analysis of Normal and Lognormal Processes ..........    160
II   Nonstationary Models for the Exponential Distribution ........    172
III  Algorithm to Determine Prediction Intervals for Lognormal
     and LogStudent Distributions .................................    185

LIST OF REFERENCES ................................................    198

BIOGRAPHICAL SKETCH ...............................................    213






















LIST OF APPENDIX TABLES

Table                                                                 Page

1. Predictive Intervals for Some Lognormal Predictive
   Distributions ..................................................    191

2. Predictive Intervals for Some LogStudent Predictive
   Distributions ..................................................    192





















LIST OF FIGURES

Figure                                                                Page

1.       Life Characteristics of Some Systems .....................     21

AIII.1   Predictive Distribution ..................................    186

AIII.2   Predictive Distribution ..................................    187

AIII.3   Predictive Distribution ..................................    188

AIII.4   Predictive Distribution ..................................    189















Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy



BAYESIAN MODELING OF NONSTATIONARITY IN
NORMAL AND LOGNORMAL PROCESSES WITH
APPLICATIONS IN CVP ANALYSIS AND
LIFE TESTING MODELS

By

Jorge Ivan Velez-Arocho

June 1978

Chairman: Christopher B. Barry
Major Department: Management

Probability models applied by decision makers in a wide variety of contexts must be able to provide inferences under conditions of change. A stochastic process whose probabilistic properties change through time can be described as a nonstationary process. In this dissertation a model involving normal and lognormal processes is developed for handling a particular form of nonstationarity within a Bayesian framework. Two uncertainty conditions are considered; in one the location parameter, μ, is assumed to be unknown and the spread parameter, σ, is assumed to be known; and in the other both parameters are assumed to be unknown. Comparing the nonstationary model with the stationary one it is shown that:

1. more uncertainty (of a particular definition) is present

under nonstationarity than under stationarity;

2. since the variance of a lognormal distribution, V(x), is a function of μ and σ², nonstationarity in μ means that both mean and variance of the random variable, x, are nonstationary so that the lognormal















case provides a generalization of the normal results;

and

3. as additional observations are collected uncertainty about

stochastically-varying parameters is never entirely eliminated.-

The asymptotic behavior of the model has important implications

for the decision maker. An implication of the stationary Bayesian model

for normal and lognormal processes is that as additional observations are

collected, parameter uncertainty is reduced and (in the limit) eliminated

altogether. In contrast, for the nonstationary model considered in this

dissertation the following inferential results are obtained:

1. for the case of lognormal or normal model, a particular form

of stochastic parameter variation implies a treatment of data involving

the use of all observations in a differential weighting scheme;

and

2. random parameter variation produces important differences in

the limiting behavior of the prior and predictive distributions since

under nonstationarity the limiting values of the parameters of the poste-

rior and predictive distributions cannot be determined clearly.

Practical implications of the results for the areas of Cost-

Volume-Profit Analysis and life testing are discussed with emphasis on

the predictive distribution for the outcome of a future observation from

the data generating process. It is emphasized that a Cost-Volume-Profit

(CVP) and life testing model ideally should include the changing charac-

ter of the process by allowing for changes in the parametric description

of the process through time. Failure to recognize nonstationarity when















it is present has a number of implications in the CVP and life-testing

contexts that are explored in the dissertation. For example, inferences

are improperly obtained if the nonstationarity is ignored, and prediction

interval coverage probabilities are overstated since uncertainty is

greater (in a particular sense) when nonstationarity is present.



















CHAPTER ONE


INTRODUCTION

1.1 Introduction

Uncertainty is an essential and intrinsic part of the human

condition. The opinions we express, the conclusions we reach and the

decisions we make are often based on beliefs concerning the probability

of uncertain events such as the result of an experiment, the future value

of an investment or the number of units to be sold next year. If manage-

ment, for instance, were certain about what circumstances would exist at

a given time, the preparation of a forecast would be a trivial matter.

Virtually all situations faced by management involve uncertainty, however,

and judgments must be made and information must be gathered to reduce

this uncertainty and its effects. One of the functions of applied mathe-

matics is to provide information which may be used in making decisions

or forming judgments about unknown quantities.

Several early studies by econometricians and statisticians

examined the problem of constructing a model whose output is as close as possible to the observed data from the real system and which

reflects all the uncertainty that the decision maker has. Mathematical

models for statistical problems, for instance, have some element of un-

certainty incorporated in the form of a probability measure. The model

usually involves the formulation of a probability distribution of the

uncertain quantities. This element of uncertainty is carried through













the analysis to the inferences drawn. The equations that form the mathe-

matical model are usually specified to within a number of parameters

or coefficients which must be estimated. The unknown parameters are

usually assumed to be constant and the problem of model identification

is reduced to one of constant parameter estimation.

There are several reasons for suspecting that the parameters

of many models constructed by engineers and econometricians are not

constant but in fact time-varying. For instance, it has become increas-

ingly clear that to assume that behavioral and technological relationships

are stable over time is, in many cases, completely untenable on the basis

of economic theory. Several recent studies provide support for the claim

that the parameters of distributions of stock-price-related variables may

change over time [see Barry and Winkler (1976)]. In engineering, particu-

larly in reliability theory, the origins of parameter variation are usually

not very hard to pinpoint. Component wear, variation in inputs or compo-

nent failure are some very common reasons for parameter variations. The

major objective of construction of engineering models is control and regu-

lation of the real system modeled. Therefore, much of the research in

that area has concentrated on devising ways to make the output of the

model insensitive to parameter variation. Similarly, in forecasting models

for economic variables, researchers have had great concern with time varying

parameters of the distributions of interest. In this area the problem of

varying parameters has received increased attention because there is

increasing evidence that the common regression assumption of stable















parameters often appears invalid.

In this dissertation we plan to study a particular type of random parameter variation which is likely to be applicable when nonstationarity

over time is present. The modeling of nonstationarity that we are going to

present assumes that successive values in time of the unknown parameter

are related in a stochastic manner; i.e., the parameter variation includes

a component which is a realization of some random process. For purposes

of estimation we are interested in specific realizations of the random

process. When the process generating the unknown parameter is a nonsta-

tionary process over time the decision maker should be concerned with

a sequence of values of the parameter instead of a single value as in

the usual stationary model; i.e., inferences and decisions concerning the

parameter should reflect the fact that it is changing over time.

If the values of an unknown parameter over time are related

in a stochastic manner, a formal analysis of the situation requires

some assumptions about the stochastic relationship. For the model of

nonstationarity that we develop in this dissertation, the specification

of the stochastic relationship between values of the parameter is suf-

ficient. Moreover it is assumed that this relationship is stationary

(usually referred to as second-order stationarity) in the sense that the

stochastic relationship is the same for any pair of consecutive values

of the unknown parameter.
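To make this concrete, the following minimal sketch (not taken from the dissertation; all numerical values are assumptions for illustration) simulates one simple special case consistent with the description above: a normal data generating process whose mean follows a random walk, μ_t = μ_{t-1} + ε_t, with a stationary relationship between consecutive parameter values.

    # A minimal simulation sketch (illustrative only) of one special case of
    # stochastic parameter variation: the mean of a normal process follows a
    # random walk, mu_t = mu_{t-1} + e_t.
    import numpy as np

    rng = np.random.default_rng(1978)

    T = 200        # number of periods (assumed)
    sigma = 1.0    # known spread of the observation process (assumed)
    tau = 0.3      # spread of the period-to-period parameter shock (assumed)

    mu = np.empty(T)
    x = np.empty(T)
    mu[0] = 10.0
    x[0] = rng.normal(mu[0], sigma)
    for t in range(1, T):
        mu[t] = mu[t - 1] + rng.normal(0.0, tau)  # stochastic parameter variation
        x[t] = rng.normal(mu[t], sigma)           # observation from N(mu_t, sigma^2)

    # There is no single "true" mu to converge to: inferences must track a
    # sequence of parameter values rather than one fixed value.
    print(mu[:3].round(2), x[:3].round(2))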

We want to gain more precise information about the structure

of the time-varying parameters and to obtain estimated relationships













that are suitable for forecasting. The model to be developed makes it

possible to draw inferences about the structure of the relationship at

every point in time. There are problems in accounting, life testing theory,

finance and a variety of other areas that can benefit from nonstationary

parameter estimation techniques.


1.2 Summary of Results and Overview of Dissertation

The goals of this dissertation are to develop a rigorous model

for handling nonstationarity within a Bayesian framework, to compare

inferences from stationary and nonstationary models, and to investigate

inferential applications in the areas of Cost-Volume-Profit Analysis and

life testing models involving nonstationarity. Probably the most important

advantage of the new work to be presented in this dissertation is the

increased versatility it adds to the nonstationary Bayesian model derived

by Winkler and Barry (1973). The new results enlarge the range of real and

important problems involving univariate and multivariate nonstationary

normal and lognormal processes which can be handled. Another advantage

is the simplicity of the updating methods for the efficient handling of

the estimation of unknown parameters and the prediction of the outcome

of a future sample.

A survey of the most relevant literature is provided in Chapter

Two to set the stage for the new developments in the remainder of the dis-

sertation. In this survey we present an overview of probabilistic Cost-

Volume-Profit (CVP) Analysis and discuss the most important articles

that deal with CVP under conditions of uncertainty. The review of the















literature includes a section on life testing models emphasizing the use

of Bayesian techniques used in life testing. It is emphasized that most

of the research done in these two areas neglects the problem of nonsta-

tionarity. A special section is presented to discuss some important

articles about modeling nonstationary processes.

As is mentioned in Chapter Two, most research concerned with

the normal and lognormal distributions has considered only stationary

situations. That is, the parameters and distributions used are assumed

to remain the same in all periods. In Chapter Three we develop a Bayesian

model of nonstationarity for normal and lognormal processes. In it we

describe essential features of the Bayesian analysis of normal and log-

normal processes under nonstationarity, like the prior, posterior and

predictive distributions. Two uncertainty conditions are considered in

this chapter; in one the location parameter, μ, is assumed to be unknown and the spread parameter, σ, is assumed to be known; and in the other,

both parameters are assumed to be unknown. Comparing the nonstationary

model with the stationary one it is shown that:


1. more uncertainty (of a particular definition) is present

under nonstationarity than under stationarity;


2. since the variance of a lognormal distribution, V(x), is a function of μ and σ², nonstationarity in μ means that both mean and variance of the random variable, x, are nonstationary, so that the lognormal case provides a generalization of the normal results;














and,

3. that, as additional observations are collected, uncertainty about

stochastically-varying parameters is never entirely eliminated.

The results discussed in Chapter Three have to do with the period-

to-period effects of random parameter variation upon the posterior and pre-

dictive distributions. However, the asymptotic behavior of the model has

important implications for the decision maker. An implication of the sta-

tionary Bayesian model for normal and lognormal processes is that as addi-

tional observations are collected, parameter uncertainty is reduced and

(in the limit) eliminated altogether. Such an implication is inconsistent

with observed real world behavior largely because the conditions under

which inferences are made typically change across time. The common dictum

[see Dickinson (1974)] has been to eliminate some observations in the case

of changing parameters so that only those most recent observations are

considered. In Chapter Four we show that:


1. for the case of a lognormal or normal model, a particular

form of stochastic parameter variation implies a treatment of data

involving the use of all observations in a differential weighting scheme (see the sketch following this list),

and,

2. random parameter variation produces important differences

in the limiting behavior of the prior and predictive distributions since

under nonstationarity the limiting values of some of the parameters of the posterior and predictive distributions cannot be determined clearly.
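The differential weighting in point 1 can be illustrated with a small sketch. It is not the dissertation's derivation; it assumes the random-walk-mean special case with known variances, for which the standard normal filtering recursions apply. Unrolling those recursions shows the current posterior mean weighting recent observations most heavily while discounting, but never discarding, older ones.

    # Illustrative only: weights implied by a random-walk mean with known
    # variances, via the standard normal filtering recursions.
    import numpy as np

    def posterior_mean_weights(n_obs, sigma2, tau2, prior_var=1e6):
        """Effective weight each past observation gets in the current
        posterior mean of mu_t (most recent observation first)."""
        var = prior_var
        gains = []
        for _ in range(n_obs):
            var = var + tau2                 # parameter drifts between periods
            k = var / (var + sigma2)         # weight given to the new observation
            gains.append(k)
            var = (1 - k) * var              # posterior variance after updating
        weights, carry = [], 1.0
        for k in reversed(gains):            # unroll the recursion backwards
            weights.append(carry * k)
            carry *= (1 - k)
        return np.array(weights)

    # Older observations are discounted geometrically, never dropped entirely,
    # in contrast to equal weights 1/n under stationarity:
    print(posterior_mean_weights(8, sigma2=1.0, tau2=0.25).round(3))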















One objective of this dissertation is to develop Bayesian pre-

diction intervals for future observations that come from normal and log-

normal data generating processes. In Chapter Four we address the problem

of constructing prediction intervals for normal, Student, lognormal and

logStudent distributions. It is pointed out that it is easy to construct

these intervals for the normal and Student distributions but that it is

rather difficult for the lognormal and logStudent distributions. An

algorithm is presented to compute the Bayesian prediction intervals for

the lognormal and logStudent distributions. Bayesian prediction intervals

under nonstationarity are compared with classical, certainty equivalent

and Bayesian stationary intervals.

In Chapter Five we discuss the application of the results of

Chapters Three and Four concerning nonstationarity to the area of CVP

analysis and life testing models. Practical implications of our results

for these two areas are discussed with emphasis on the predictive dis-

tribution for the outcome of a future observation from the data generating

process. It is emphasized that CVP and life testing models ideally

should include the changing character of the process by allowing for

changes in the parametric description of the process through time. It

is shown that, for the case of normal and lognormal data generating

processes under a particular form of stochastic parameter variation, the

presence of nonstationarity produces greater uncertainty for the decision maker. Nonstationarity implies greater uncertainty, which is reflected by an increase in the predictive variance of profits for CVP models,













by an increase in the predictive variance of life length for life testing

models, and by an increase in the width of intervals required to contain

particular coverage probabilities.

Chapter Six provides conclusions, limitations and suggestions

for further research. Since stationarity assumptions are often quite

unrealistic, it is concluded in that chapter that the introduction of

possible nonstationarity greatly increases the realism and applicability

of statistical inference methods, in particular of Bayesian procedures.
















CHAPTER TWO


SURVEY OF PERTINENT LITERATURE

The primary purpose of the research in this dissertation is

to present a Bayesian model of nonstationarity in normal and lognormal

processes with applications in Cost-Volume-Profit analysis and life

testing models. A survey of the most relevant literature is provided

in the chapter and will serve to set the stage for the new developments

in the remainder of the thesis.

In this survey, three areas are covered. In Section 2.1 we pre-

sent an overview of probabilistic Cost-Volume-Profit (CVP) analysis and

discuss the most important articles that deal with CVP under conditions

of uncertainty. In Section 2.2 we discuss life testing models with an

emphasis on the exponential, gamma, Weibull and lognormal models. The

review of the literature includes a special section on Bayesian techniques

used in life testing. Finally in Section 2.3 a survey is presented of

some important articles about modeling nonstationary processes.


2.1 Cost-Volume-Profit (CVP), Analysis

Management requires realistic and accurate information to

aid in decision making. Cost-Volume-Profit (CVP) analysis is a widely

accepted generator of information useful in decision making proces-

ses. CVP analysis essentially consists in examining the relationship between changes in volume (output) and changes in profit. The fundamental assumption in all types of CVP decisions is that the firm, or a department or other type of costing unit, possesses a fixed set













of resources that commits the firm to a certain level of fixed costs

for at least a shortrun period. The decision problem facing a manager

is to determine the most efficient and productive use of this fixed

set of resources relative to output levels and output mixes. The scope

of CVP analysis ranges from determination of the optimal output level

for a single-product department to the determination of optimal output

mix of a large multi-product firm. All these decisions rely on simple

relationships between changes in revenues and costs and changes in

output levels or mixes. All CVP analyses are characterized by their

emphasis on cost and revenue behavior over various ranges of output

levels and mixes.

The determination of the selling price of a product is a

complex matter that is often affected by forces partially or entirely

beyond the control of management. Nevertheless, management must formu-

late pricing policies within the bounds permitted by the market place.

Accounting can play an important role in the development of policy

by supplying management with special reports on the relative profit-

ability of its various products, the probable effects of contemplated

changes in selling price and other CVP relationships.

The unit cost of producing a commodity is affected by such

factors as the inherent nature of the product, the efficiency of oper-

ations, and the volume of production. An increase in the quantity

produced is ordinarily accompanied by a decrease in unit cost, pro-

vided the volume attained remains within the limits of plant capacity.

Quantitative data relating to the effect on income of changes in













unit selling price, sales volume, production volume, production costs,

and operating expenses help management to improve the relationships

among these variables. If a change in selling price appears to be de-

sirable or, because of competitive pressure, unavoidable, the possible

effect of the change on sales volume and product cost needs to be

considered.

A mathematical expression of the profit equation of CVP

analysis is:

(2.1.1) Z = Q(P - V) - F,

where Z = total profits,
      Q = sales volume in units,
      P = unit selling price,
      V = unit variable cost,
and   F = total fixed costs.
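A small worked example of equation (2.1.1), with purely illustrative figures:

    # Illustrative figures only.
    Q = 10_000     # sales volume in units
    P = 25.0       # unit selling price
    V = 15.0       # unit variable cost
    F = 60_000.0   # total fixed costs

    Z = Q * (P - V) - F
    print(Z)               # 40000.0: each unit contributes P - V = 10
    print(F / (P - V))     # break-even volume: 6000 units

Above the break-even volume of F/(P - V) units, profit rises linearly with Q.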

This accounting model of analysis has been traditionally

used by the management accountant in profit planning. This use, how-

wver, typically ignores the uncertainty associated with the firm's oper-

ation, thus severely limiting its applicability. During the past 12

years, accountants have attempted to resolve this problem by intro-

ducing stochastic aspects into the analysis.

The applicability of probabilistic models for this analysis

has been claimed because of the realism of such models, i.e., deci-

sions are always accompanied by uncertainty. Thus, the ideal model

is one that gives a probability distribution of the criterion variable,

profit, and that fully recognizes the uncertainty faced by the firm.












The realism of such a model is dependent on logical assumptions for

the input variables and rigorous methodology in obtaining the output

distribution. Further, we hope that the model can accommodate a wide

range of uses. For example, the capability to handle dependence among

input variables adds a highly useful dimension.

Jaedicke and Robichek (1964) first introduced risk into the

model. They assumed the following relation among the means


(2.1.2) E(Z) = E(Q)[E(P) - E(V)] - E(F),


where E(.) denotes mathematical expectation.

In addition they assumed that the key variables were all normally

distributed and that the resulting profit is also normally distributed.

Thus, by computing the mean value and standard deviation of the re-

sulting profit function, various probabilistic measures of profit

can be obtained. This model has been depicted as a limit analysis,

since the assumptions of the independent model parameters and the

normalcy of the resulting profit function are not true except in

limiting cases. According to Ferrara, Hayya and Nachman (1972), the

product of two normally and independently distributed variables will

approximate normality if the sum of the two coefficients of variation

is less than or equal to .12.
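A quick Monte Carlo sketch of this rule of thumb follows; every parameter value is an assumption chosen only so the coefficients of variation sum to the stated boundary of .12.

    # Monte Carlo sketch of the Jaedicke-Robichek setup with independent
    # normal inputs; parameter values assumed for the demonstration.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    Q = rng.normal(10_000, 800, n)   # sales volume, CV = 0.08
    M = rng.normal(10.0, 0.4, n)     # contribution margin P - V, CV = 0.04
    F = 60_000.0

    Z = Q * M - F
    print(Z.mean(), 10_000 * 10.0 - F)   # simulated vs. analytic E(Z)

    # CV_Q + CV_M = 0.12, the boundary of the Ferrara-Hayya-Nachman rule,
    # so the profit distribution should still be close to normal:
    s = (Z - Z.mean()) / Z.std()
    print((s ** 3).mean())               # sample skewness, near zero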

Others have confronted the same problem of how to identify

the resulting profit distribution when it is not close to a normal

distribution. They have noted that it is often difficult to obtain

analytical expressions for the product of random variables. Because the appropriate distributional forms for the product of the variable














functions may not be known, Buzby (1974) suggests the application of

Tchebycheff's theorem to stochastic Cost-Volume-Profit analysis. This

theorem, however, permits the analyst to derive only some very crude

bounds on the probabilities of interest, so its value as a decision-

making tool is limited. Liao (1975) illustrated how model sampling

(also called distribution sampling) coupled with a curve-fitting

technique can be used to overcome the above problems associated

with stochastic CVP analysis. In his paper, the illustration of the

proposed approach to stochastic CVP analysis is first developed through

a consideration of the Jaedicke-Robichek problem, wherein the model

parameters are independent and normally distributed. After that, the

illustration problem is modified to accommodate dependent and non-normal

variates in the problem.

Hilliard and Leitch (1975) developed a model for CVP analysis

assuming a more tractable distribution for the inputs of the equation.

It allows for dependent relationships and permits a rigorous deriva-

tion of the distribution of profit. The problems of assuming price and

quantity to be independent are pointed out. The authors also pointed

out that assuming sales to be normally distributed implies a positive

probability of negative sales.

Probabilities and tolerance intervals for the Hilliard and

Leitch model are obtained from tables of the normal distribution.

The only assumptions required for the model are (1) quantity and contribution margin are lognormally distributed random variables and (2) fixed costs are deterministic. The assumption that sales














quantity and contribution margin are bivariate lognormally distributed

eliminates the possibility of negative sales and of selling prices

below variable costs, and it has the nice additional property that the

product of two bivariate lognormal random variables is also lognormal.

Thus, we can allow for uncertainty in price and quantity and still

have a closed form expression for the probability distribution of

gross profits. Hilliard and Leitch cannot assume that price and

variable costs are marginally lognormally distributed and have contri-

bution margin also be lognormally distributed. Similarly, if fixed

costs are assumed to be lognormally distributed too, net profits will

not be lognormally distributed.
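The closure property is easy to sketch directly (the parameters of ln Q and ln M below are assumptions for the demonstration): since ln Q and ln M are bivariate normal, ln(QM) = ln Q + ln M is normal, so gross profit QM is lognormal with parameters available in closed form.

    # Sketch of the lognormal closure property; M is the contribution margin.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 200_000

    mu = np.array([9.0, 2.0])            # means of ln Q and ln M (assumed)
    cov = np.array([[0.04, 0.01],
                    [0.01, 0.02]])       # positive price-volume dependence

    lnQ, lnM = rng.multivariate_normal(mu, cov, n).T
    G = np.exp(lnQ) * np.exp(lnM)        # gross profit Q * M

    m = mu.sum()                         # ln G is normal with mean 11.0
    v = cov.sum()                        # and variance 0.08 (includes 2*cov)
    print(np.log(G).mean(), m)
    print(np.log(G).var(), v)
    print(G.mean(), np.exp(m + v / 2))   # lognormal mean in closed form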

Adar, Barnea and Lev (1977) presented a model for CVP

analysis under uncertainty that combines the probability characteristics

of the environment variables with the risk preferences of decision

makers. The approach is based on recently suggested economic models

of the firm's optimal output decision under uncertainty, which were

modified within the mean-standard deviation framework to provide for

a cost-volume-utility analysis allowing management to: (1) determine

optimal output, (2) consider the desirability of alternative plans

involving changes in fixed and variable costs, expected price and

uncertainty of price and technology changes and (3) determine the

economic consequences of fixed cost variances.

Dickinson (1974) addresses the problem of CVP analysis under

uncertainty by examining the reliability of using the usual methods of estimating the means and variances of the past distributions of














sales demand. He emphasized that, when the expectation and variance of profits are estimated from past data, it is important to differentiate between what, in fact, are estimated and what are true values of the parameters. In other words, he pointed out that the estimated expectation of profits, Ê(π), reflects estimation risk and is not equal to E(π). Classical confidence intervals were used for the expected value of profits, E(π), for the variance of profits, Var(π),

and for probabilities of various profit levels. However, Dickinson

misinterpreted the classical confidence intervals that he obtains in

his paper. When a classicist constructs a 90 percent confidence interval

for μ, for example, he would state that in the long run, 90 percent of all such intervals will contain the true value of μ. The classical statement is based on long-run frequency considerations. The classicist is absolutely opposed to the interpretation that the 90 percent refers to the probability that the true universe mean lies within the specified interval. In the eyes of a classicist, a unique true value exists for

the universe mean, and therefore the value of the universe mean can-

not be treated as a random variable. Dickinson's paper also illus-

trates the difficulty of obtaining the probability statements of

greatest interest to management in a classical approach. His analysis

is only able to provide confidence intervals of probabilities of

profit levels rather than the profit level probabilities themselves.

The problem of parameter uncertainty has been neglected by

the people that have studied CVP analysis under uncertainty. In the

Bayesian approach, uncertainty regarding the parameters of probability












models is reflected in prior and posterior probability statements

regarding the parameters. Marginal distributions of variables which

depend on those parameters may be obtained by integrating out the

distribution of the parameters, thereby obtaining predictive distri-

butions [see Roberts (1965) and Zellner (1971)] of the quantities of

interest to the manager. These predictive distributions permit one

to make valid probability statements regarding the important quan-

tities, such as profits.

Nonstationarity is another important aspect related to CVP

analysis that no one has considered. In a world that is continually

changing, it is important to recognize that the parameters that

describe a process at a particular point in time may not do so at

a later point in time. In the case of the variable sales, for instance,

experience shows that it is typically affected by a variety of eco-

nomic and political events. Thus, a CVP model ideally should include

the changing character of the process by allowing for changes in the

parametric description of the process through time. Failure to recog-

nize the nonstationary conditions may result in misleading inferences.

In this dissertation the problem of Cost-Volume-Profit analy-

sis will be considered from a Bayesian viewpoint, and inferences under

a special case of nonstationarity will be considered. Also the Bayesian

results under nonstationarity will be compared with those results

that can be obtained under a stationary Bayesian model, and the Baye-

sian model will be compared with some alternative approaches.















2.2 Life Testing Models


2.2.1 Introduction

The development of recent technology has given special impor-

tance to several problems concerning the improvement of the effective-

ness of devices of various kinds. It is often important to impose

extraordinarily high standards on the performance of these devices,

since a failure in the performance could bring disastrous consequences.

The quality of production plays an important role in today's life. An

interruption in the operation of a regulating device can lead not

only to deterioration in the quality of a manufactured product but

also to damage of the industrial process. From a purely economic view-

point high reliability is desirable to reduce costs. However, since

it is costly to achieve high reliability, there is a tradeoff. The

failure of a part or component results not only in the loss of the

failed item hut often results in the loss (at least temporarily) of

some larger assembly or system of which it is part. There are nu-

merous examples in which failures of components have caused losses

of millions of dollars and personal losses. The space program is an

excellent example where even the lives of some astronauts were lost

due to failure in the system. The following authors have considered

the statistical theory of reliability and provide a good set of re-

ferences on the subject: Mendenhall (1958), Buckland (1960), Birnbaum (1962), Govindarajulu (1964), Mann, Schaefer and Singpurwalla (1973), and Canfield and Borgman (1975).

Reliability theory is the discipline that deals with procedures













to ensure the maximum effectiveness of manufactured articles

and that develops methods of evaluating the quality of systems from

known qualities of their component parts. A large number of problems

in reliability theory have a mathematical character and require the

use of mathematical tools and the development of new ones for their

solution. Areas like probability theory and mathematical statistics

are necessary to solve some of the problems found in reliability

theory. No matter how hard the company works to maintain constant

conditions during a production process, fluctuations in the production

factors lead to a significant variation in the properties of the

finished products. In addition, articles are subjected to different

conditions in the course of their use. To maintain and to increase

the reliability of a system or of an article requires both material

expenditures and scientific research.

Statistical theory and methodology have played an influen-

tial role in the development of reliability theory since the publi-

cation of the paper by Epstein and Sobel (1953). Four statistical

concepts provide the basis for estimating relevant parameters and

testing hypotheses about the life characteristic of the subject

matter. These concepts are:

(i) the distribution function of some variable which is a

direct or indirect measure of the response (life time) to usage in

a particular environment;

(ii) the associated probability density (or frequency) function;













(iii) the survival probability function; and

(iv) the conditional failure rate.

A failure distribution provides a mathematical description

of the length of life of a device, structure or material. Consider

a piece of equipment which has been in a given environment, e. The

fatigue life of this piece of equipment is defined to be the length

of time, T(e), this piece of equipment operates before it fails. Full

information about e would fully determine T(e), so that given e, T(e)

would not be random. One source of randomness in life is in uncertainty

about the environment, i.e., T(e) is a random variable because e is

random. Equipment has different survival characteristics depending on

the conditions under which it is operated, and e provides a statement

of what conditions are but does not determine T(e) fully.

The reliability of an operating system is defined as the

probability that the system will perform satisfactorily within

specified conditions over a given future time period when the system

starts operating at some time origin. Different distributions can

be distinguished according to their failure rate function, which

is known in the literature of reliability as a hazard rate [see

Barlow and Proschan (1965)]. The hazard rate (denoted by h), which

is a function of time, gives the conditional density of failure at

time t, with the hypothesis that the unit has been functioning without failure up to that point in time. The conditional failure rate is defined as

(2.2.1) h(t) = f(t)/[1 - F(t)] = f(t)/R(t),













where

(2.2.2) F(t) = Prob(T ≤ t) = ∫_{-∞}^{t} f(s) ds

is the probability that an observed value of T will be less than or equal to an assigned number t. The reliability function (also called the survival function) of the random variable T gives the probability that T will exceed t and is defined by

(2.2.3) R(t) = 1 - F(t) = Prob(T > t).

The probability density function of the random variable T, f(t), 0 < t < ∞, is known as the failure density function of the device. It can be shown that the conditional failure rate and the distribution function of a random variable are related by

(2.2.4) F(t) = 1 - exp[-∫_0^t h(s) ds].
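Relation (2.2.4) can be checked numerically; the sketch below is illustrative only and assumes a Weibull-type hazard h(t) = (n/s)(t/s)^{n-1} (anticipating Section 2.2.2.3), with arbitrary shape and scale values.

    # Numerical check of (2.2.4): F(t) = 1 - exp[-integral of h].
    import numpy as np

    n_shape, s = 2.0, 1.5                        # assumed shape and scale
    t = np.linspace(1e-6, 5.0, 2001)
    dt = t[1] - t[0]

    h = (n_shape / s) * (t / s) ** (n_shape - 1)     # assumed hazard rate
    H = np.cumsum(h) * dt                            # crude cumulative hazard
    F_from_h = 1.0 - np.exp(-H)
    F_exact = 1.0 - np.exp(-(t / s) ** n_shape)      # known Weibull cdf

    print(np.abs(F_from_h - F_exact).max())          # small discretization error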

The causes of failure can be categorized into three basic

types. It is recognized, however, that there may be more than one

contributing cause to a particular failure and that, in some cases,

there may be no completely clearcut distinction between some of the

causes. The three classes of failure are infant mortalities, or

early failures, random failures and wearout failures. The behavior

of the hazard rate as a function of time is sometimes known as the

hazard function or life characteristic of the system. For a typical

system that may experience any of the three previously described types

of failure, the life characteristic will appear as in Figure 1. The

representation of the life characteristic has been classically referred

to as the "bathtub curve", wherein the three segments of the curve

represent the three time periods of initial, chance and wearout failure.





















[Figure 1. Life characteristics of some systems: hazard rate plotted against time, with segments for the initial failure, random failure, and wearout failure periods.]


The initial failure period is characterized by a high hazard rate

shortly after time x = 0 and a gradual reduction during the initial

period of operation. During the chance failure period, the hazard

rate is constant and generally lower than during the initial period.

The cause of this failure is attributed to unusual and unpredictable

environmental conditions occurring during the operating time of the

system or of the device. The hazard rate increases during the wearout

period. This failure is associated with the gradual depletion of a

material or an accumulation of shocks and so on.

In the following subsections we will consider the general













properties of some widely used life distributions, the assessment

and use of those distributions, and the literature related to Bayesian

methods in life testing.


2.2.2 Some Common Life Distributions


2.2.2.1 The Exponential Distribution

In the case of a constant failure rate the distribution of

life is exponential. This case has received the most emphasis in the

literature, since, in spite of theoretical limitations, it presents

attractive statistical properties and is highly tractable. Data

arising from life tests under laboratory or service conditions are

often found to conform to the exponential distribution.

An acceptable justification for the assumption of an expo-

nential distribution to life studies was initially presented by Davis

(1952). More recently Barlow and Proschan (1965) have advanced a mathe-

matical argument to support the plausibility of the exponential dis-

tribution as the failure law of complex equipment. The random variable

T has an exponential distribution if it has a probability density

function of the form


(2.2.5) f_T(t) = σ⁻¹ exp[-(t - θ)/σ], t > θ, σ > 0.

The mean and variance of T are (θ + σ) and σ², respectively. In most applications θ is taken as zero. For this distribution, the physical interpretation of a constant hazard function is that, irrespective of the time elapsed since the start of operation of a system, the probability that the system fails in the next time interval dt, given that it has survived to time t, is independent of the elapsed time t and is constant.
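This memoryless property is easy to verify by simulation; the scale value below is an assumption for the demonstration (with θ = 0).

    # Simulation of the memoryless property of the exponential distribution.
    import numpy as np

    rng = np.random.default_rng(0)
    t = rng.exponential(scale=2.0, size=1_000_000)  # theta = 0, sigma = 2

    s, d = 1.0, 0.5
    lhs = (t > s + d).mean() / (t > s).mean()   # P(T > s + d | T > s)
    rhs = (t > d).mean()                        # P(T > d)
    print(lhs, rhs)                             # both near exp(-0.5/2) = 0.779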


2.2.2.2 The Gamma Distribution

An extremely useful distribution in fatigue and wearout

studies is the gamma distribution. It also has a very important rela-

tionship to the exponential distribution, namely, that the sum of n

independent and identically distributed (i.i.d.) exponential random

variables with common parameters θ = 0 and σ is a random variable that has a gamma distribution with parameters n and σ. Hence, the exponential distribution is a special case of the gamma with n = 1.

The random variable T has a gamma distribution if its pro-

bability density function is of the form,


(2.2.6) f_T(t) = {(t - θ)^{n-1} exp[-(t - θ)/σ]} / [σ^n Γ(n)]; n > 0, σ > 0, t > θ.

The standard form of the distribution is obtained by putting σ = 1 and θ = 0, giving

(2.2.7) f_T(t) = [t^{n-1} exp(-t)] / Γ(n), t > 0;

where the gamma function, denoted Γ, is a mapping of the interval (0, ∞) into itself and is defined by

(2.2.8) Γ(n) = ∫_0^∞ t^{n-1} exp(-t) dt.

The probability distribution function of (2.2.7) is

(2.2.9) Prob[T ≤ t] = [Γ(n)]⁻¹ ∫_0^t x^{n-1} exp(-x) dx.












Since a distribution of the form given in equation (2.2.6) can be obtained from the standardized distribution, as in equation (2.2.7), by the linear transformation t = (t' - θ)/σ, there is no difficulty in deriving formulas for moments, generating functions, etc., for equation (2.2.6) from those for equation (2.2.7).

One of the most important properties of the distribution is

the reproductive property; if T₁ and T₂ are independent random variables each having a distribution of the form (2.2.7), possibly with different values n', n'' of n but with common values of σ and θ, then (T₁ + T₂) also has a distribution of this form, with the same values of σ and θ, and with n = n' + n''.
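Both the sum-of-exponentials relationship and the reproductive property lend themselves to a quick simulation check; the values below are assumptions for the demonstration, with θ = 0.

    # Check: a sum of n i.i.d. exponential lifetimes matches a gamma with
    # shape n and the same scale.
    import numpy as np

    rng = np.random.default_rng(0)
    n_units, sigma = 5, 2.0                     # assumed shape and scale

    s = rng.exponential(sigma, size=(200_000, n_units)).sum(axis=1)
    g = rng.gamma(shape=n_units, scale=sigma, size=200_000)

    print(s.mean(), g.mean(), n_units * sigma)      # all near 10
    print(s.var(), g.var(), n_units * sigma ** 2)   # all near 20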


2.2.2.3 The Weibull Distribution

The Weibull distribution was developed by W. Weibull (1951)

of Sweden and used for problems involving fatigue lives of materials.

Three parameters are required to uniquely define a particular Weibull

distribution. Those three parameters are the scale parameter σ, the shape parameter n and the location parameter θ.

A random variable T has a Weibull distribution if there are values of the parameters n (> 0), σ (> 0) and θ such that

(2.2.10) Y = [(t - θ)/σ]^n

has the exponential distribution with probability density function

(2.2.11) f_Y(y) = exp(-y), y > 0.

The probability density function of T is given by














(2.2.12) f_T(t) = (n/σ)[(t - θ)/σ]^{n-1} exp{-[(t - θ)/σ]^n}, t > θ.

The standard Weibull distribution is obtained by putting σ = 1 and θ = 0. The value zero for θ is by far the most frequently used, especially in representing distributions of life times.

The Weibull distribution has cumulative distribution function

(2.2.13) F_T(t) = 1 - exp{-[(t - θ)/σ]^n},

and its mean and variance are

(2.2.14) E(t) = σ Γ(1 + [1/n])

and

(2.2.15) Var(t) = σ²{Γ(1 + [2/n]) - Γ²(1 + [1/n])}, respectively.

For the two-parameter Weibull distribution we have that the reliability and hazard functions are

(2.2.16) R_T(t) = exp[-(t/σ)^n]

and

(2.2.17) h_T(t) = n t^{n-1}/σ^n.

When n = 1, the hazard function is a constant. Thus the exponential distribution is a special case of the Weibull distribution with n = 1.
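The moment formulas (2.2.14) and (2.2.15) can be checked against simulated lifetimes; shape and scale values below are assumed, with θ = 0.

    # Simulation check of the Weibull mean and variance formulas.
    import math
    import numpy as np

    rng = np.random.default_rng(0)
    n_shape, sigma = 1.8, 3.0                     # assumed shape and scale

    t = sigma * rng.weibull(n_shape, size=500_000)

    mean_theory = sigma * math.gamma(1 + 1 / n_shape)              # eq. (2.2.14)
    var_theory = sigma ** 2 * (math.gamma(1 + 2 / n_shape)
                               - math.gamma(1 + 1 / n_shape) ** 2)  # eq. (2.2.15)
    print(t.mean(), mean_theory)
    print(t.var(), var_theory)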


2.2.2.4 The Lognormal Distribution

The lognormal distribution is also a very popular distribution

in describing wearout failures. This model was developed as a physical or, more appropriately biological, model associated with the theory of proportionate effects (see Aitchison and Brown (1957) for a full description of the distribution, its properties, and its developments). Briefly, if a random variable is supposed to represent the magnitudes at successive points of time of, for example, a fatigue crack or the growth of biological organisms, and the change between any pair of successive steps or stages is a random proportion of the previous size, then asymptotically the distribution of the random variable is lognormal [see Kapteyn (1903)]. This theoretical result imparted some plausibility to the lognormal distribution for failure problems.

Let t₁ < t₂ < ... < tₙ be a sequence of random variables that denote the sizes of a fatigue crack at successive stages of its growth. It is assumed that the crack growth at stage i, tᵢ - tᵢ₋₁, is randomly proportional to the size of the crack, tᵢ₋₁, and that the item fails when the crack reaches tₙ. Let tᵢ - tᵢ₋₁ = ηᵢ tᵢ₋₁, i = 1, 2, ..., n, where ηᵢ is a random variable. The ηᵢ are assumed to be independently distributed random variables that need not have a common distribution for all i's when n is large but that need to be lognormally distributed otherwise. Thus,

ηᵢ = (tᵢ - tᵢ₋₁)/tᵢ₋₁, i = 1, 2, ..., n.

Mann, Schaefer and Singpurwalla (1973) show that ln tₙ, the life length of the item, for large n, is asymptotically normally distributed, and hence tₙ has a lognormal distribution.

If there is a number γ such that

(2.2.18) Z = ln(t - γ)

is normally distributed, then the distribution of t is said to be lognormal. The distribution of t can be defined by the equation

(2.2.19) U = θ + δ ln(t - γ),
where U is a unit normal variable and θ, δ and γ are parameters. The probability density function of T is defined by

(2.2.20) f_T(t) = δ[(t - γ)√(2π)]⁻¹ exp[-{θ + δ ln(t - γ)}²/2], t > γ.

An alternative, more fashionable notation replaces θ and δ by the expected value μ and standard deviation σ of Z = ln(t - γ). The two sets of parameters are related by the equations

(2.2.21) μ = -θ/δ

and

(2.2.22) σ = δ⁻¹,

so that the distribution of t can be defined by

(2.2.23) U = [ln(t - γ) - μ]/σ

and the probability density function of T by

(2.2.24) f_T(t) = [(t - γ)√(2π) σ]⁻¹ exp[-{ln(t - γ) - μ}²/2σ²], t > γ.

In many applications, γ is known (or assumed) to be zero. This important case has been given the name two-parameter lognormal distribution. The mean and variance of the two-parameter distribution are given by

(2.2.25) E(t) = exp[μ + (σ²/2)],

and

(2.2.26) Var(t) = exp(2μ) ω(ω - 1),

where ω = exp(σ²). In addition, the value t_P such that Pr(t ≤ t_P) = P is related to the corresponding percentile, U_P, of the unit normal distribution by the relation

(2.2.27) t_P = exp(μ + U_P σ).
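Both the proportionate-effect genesis of the distribution and the percentile relation (2.2.27) can be illustrated by simulation; the growth-increment distribution and all values below are assumptions for the demonstration.

    # Multiplicative (proportionate-effect) growth produces an approximately
    # lognormal life length.
    import numpy as np

    rng = np.random.default_rng(0)
    n_stages, n_items = 400, 50_000

    eta = rng.uniform(0.0, 0.02, size=(n_items, n_stages))  # growth proportions
    t_n = np.prod(1.0 + eta, axis=1)        # size after n stages (t_0 = 1)

    z = np.log(t_n)                         # approximately normal for large n
    mu_hat, sigma_hat = z.mean(), z.std()

    # Percentile check against (2.2.27): t_P = exp(mu + U_P * sigma).
    U_95 = 1.6449                           # 95th percentile of unit normal
    print(np.quantile(t_n, 0.95), np.exp(mu_hat + U_95 * sigma_hat))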


Applications of the lognormal distribution have appeared in

many diverse areas, e.g., environmental health [see Dixon (1937) and

Hill (1963)], air pollution control [see Singpurwalla (1971, 1972) and Larsen (1969)], and others like economics and insurance claims [see Wilson and Worcester (1945)]. Application of the distribution is not only based on empirical observation, but in some cases is supported by theoretical arguments.

For example, such arguments have been made in the distribution

of particle sizes in natural aggregates and in the closely related

distribution of dust concentration in industrial atmospheres [see

Tomlinson (1957) and Oldham (1965)]. The lognormal distribution has

also been found to be a serious competitor to the Weibull distribution

in representing life time distributions for manufactured products.

Among our references, Adams (1962), Ansley (1967), Epstein (1947,

1948), Farewell and Prentice (1977), Govindarajulu (1977), Goldthwaite

(1961), Gupta (1962), Hald (1952) and Nowick and Berry (1961) refer

to this topic. Other applications in quality control are described

by Ferrell (1958), Morrison (1958) and Rohn (1959). Many of these

applications are also referenced by Aitchison and Brown (1957),

Finney (1941) and Gupta et al. (1974).














2.2.3 Traditional Approach to Life Testing Inferences

In life testing theory we find a large number of random quan-

tities. In most cases we do not know the distributions and theoretical

characteristics; our aim is to estimate some of these quantities. This

is usually accomplished with the aid of observations on the random

variables. According to the laws of large numbers, an "exact" deter-

mination of a probability, an expected value, etc., would require an

"infinite" number of observations. Having samples of finite size,

we can do no more than estimate the theoretical values in question.

The sample characteristics, or statistics, serve the purpose of sta-

tistical estimation. For a good estimation of theoretical quantities,

a fairly large sample is sometimes needed. In many practical situations

the following two types of estimation problems arise. A certain quan-

tity, say 6, which is, from the statistical point of view, a theo-

retical quantity, has to be determined by means of measurement. Such a

quantity may be, for example, the electrical resistance of a given

device, the life of a given product, etc. The result T of the mea-

suring procedure is a random variable whose distribution depends on

o and perhaps on additional quantities. That is,we have to estimate

the parameter 0 out of a sample T1, T, ... T taken on T. In the
n

other case, tile quantity in question is a random variable itself

and in such cases we are interested in the (theoretical) average

value, or tie dispersion of T, etc. This means that we have to es-

timate the expected value E(T) or Var(T), and perhaps other (constant)

quantities that can be expressed with the aid of the distribution














function of T, like the reliability function. More often for lifetime

distributions, the quantity of interest is a distribution percentile,

also known as the reliable life of the item to be tested, corresponding

to some specified population survival proportion; or it is the pop-

ulation proportion surviving at least a specified time, say S.

For the classical statistician, the unknown parameter θ is

considered to be a constant. In estimating a constant value there

are various aspects to consider. If we wish to have an estimator

whose value can be used instead of the unknown parameter in formulas

[certainty equivalent (CE) approach], then the estimator should

have one given value. In this case we speak of point estimation. But

knowing that our estimator is subject to error, sometimes we would

like to have some information on the average deviation from the

value. In this case we have to construct an interval that contains

the unknown parameter, at least with high probability, or give a

measure of the variability of the estimator (such as the standard

error of the estimate). Most of the literature about the traditional

approach to life testing inferences is focused on two areas; one

relates to point and interval estimation procedures for lifetime

distributions and the other relates to methods of testing statisti-

cal hypotheses in reliability (known as "reliability demonstration

tests").

The classical approach to point estimation in life testing

inferences emphasizes that a good estimator should have properties

like unbiasedness, efficiency, consistency and sufficiency [see














Dubey (1968), Bartlett (1937) and Weiss (1961)]. Two methods, the

method of moments and method of maximum likelihood, are frequently

used to yield estimators with as many as possible of the previously

mentioned properties. Under various sampling assumptions, the maxi-

mum likelihood estimators of the parameters were obtained for the

following distributions; gamma Isee Choi and Wette (1969) and Harter

and Moore (1965) ; Weibull [see Bain (1972), Billman et al. (1971),

Cohen (1965), Englehardt (1975), Haan and Beer (1967), Lemon (1975)

and Rockette et aL. (1973)1; exponential [see Deemer and Votaw (1955),

El-Sayyad (1967) and Epstein (1957)]; and for the normal and lognormal

[see Cohen (1951), Hlarter and Moore (1966), Lambert (1964) and Tallis

and Young (1962)]. The traditional approach also includes some linear

estimation properties like Best Linear Unbiased (BLU) and Best Linear

Invariance (BLI).

Interval estimation procedures have also been developed for the parameters of
the life distributions. Examples include Bain and Englehardt (1973), Epstein
(1961), Harter (1964) and Mann (1968). Point or interval estimators for
functions of the life distributions, such as reliable life, reliability
function, hazard rate, etc., were obtained by substituting for the unknown
parameters the point or interval estimators obtained for them [see Johns and
Lieberman (1966), Bartholomew (1963), Grubbs (1971), Harris and Singpurwalla
(1968, 1969), Lawless (1971, 1972), Likes (1967), Mann (1969-a, 1969-b,
1970), Varde (1969) and Linhart (1965)].













Testing reliability hypotheses is the second major area of research in the
classical approach to life testing. By means of the methods referenced
previously, a test statistic is selected, regions of acceptance and rejection
are set up, and risks of incorrect decisions are calculated. In addition it
is emphasized that the risks of incorrect decisions are specified before the
sample is obtained, and in this case n, the sample size, is generally to be
determined. Some of the references in this area include Epstein (1960),
Epstein and Sobel (1955), Kumar and Patel (1971), Lilliefors (1967, 1969),
Sobel and Tischendorf (1959), Thoman et al. (1969, 1970) and Fercho and
Ringer (1972).

A large part of the statistical problem in reliability in-

volves the estimation of parameters in failure models. Each of the

methods of obtaining point estimates previously referenced has

certain statistical properties that make it desirable, at least

from a theoretical viewpoint. Not surprisingly, point estimates

are often made (particularly in reliability) because decisions are

to be based on them. The consequences of the decisions based on the

estimates often involve money, or, more generally, some form of

utility. Hence the decision maker is more interested in the practi-

cal consequence of the estimate than in its theoretical properties.

In particular, he may be interested in making estimates that mini-

mize the expected loss (cost), but this can not be accomplished in

general with classical methodology because the methodology does not

admit probability distributions of the parameters.















2.2.4 Bayesian Techniques in Life Testing

The non-Bayesian (classical) approach to estimation considers an unknown
parameter as fixed. This means that classical interval estimation and
hypothesis testing must lean on inductive reasoning either through the
likelihood function or the sampling distributions. In point estimation, the
classical approach must depend on estimates the criteria for which often are
not based on the practical consequences of the estimates. On the other hand,
Bayes procedures assume a prior distribution on the parameter space, that is,
consider the parameter as a random variable, and, hence, the posterior
distribution is available. This creates the possibility of a whole new class
of criteria for estimation, namely, minimization of expected loss,
probability intervals and others.

In view of the difficulty in assessing utility or costs of

complex reliability problems, in previous studies Bayesian methods

have been used primarily to provide a means of combining previous

data (expressed as the prior distribution) with observed data

(expressed in the likelihood function) to obtain estimates of parame-

ters by using the posterior density. However, it must be emphasized that

Bayesian methods are perfectly general in providing whatever the

reliability problem demands.

There is a loss function that is rather popular in Bayesian analysis and
gives simple results. Suppose that θ̂ is an estimate of θ and that the loss
function is

(2.2.28) L(θ̂,θ) = (θ̂ - θ)².













This function states that the loss is equal to the square of the distance of
θ̂ from θ. The Bayes approach is to select the estimate of θ that minimizes
the expected loss with respect to the posterior distribution. The estimate
that accomplishes this is the posterior mean, that is,

(2.2.29) θ̂ = E(θ|t₁, t₂, ..., tₙ; P),

where P represents prior information. The above loss function is often called
the quadratic loss function and the posterior mean is termed the Bayes
estimate. If the loss function is of the form

(2.2.30) L(θ̂,θ) = |θ̂ - θ|,

the estimate of θ that minimizes the expected loss is the median of the
posterior distribution. Canfield (1970) developed a Bayesian estimate of
reliability for the exponential case using this loss function. The resulting
estimate is seen to be the MVUE of reliability when the prior is flat. A
third and simple case is the asymmetric linear loss,

(2.2.31) L(θ̂,θ) = kₒ(θ̂ - θ) if θ̂ > θ,
                   kᵤ(θ - θ̂) if θ̂ < θ.

The estimate of θ that minimizes the expected loss is the kᵤ/(kᵤ + kₒ)
fractile [see Raiffa and Schlaifer (1961)]. Beyond these three simple cases,
things become difficult in regard to loss functions for

(i) difficulties in assessing a realistic loss function

and


(ii) mathematical intractability.
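These three optimal estimates are easy to compute once the posterior
distribution is at hand. The following sketch (ours, purely illustrative)
approximates them from posterior draws; the lognormal posterior and the cost
ratio kᵤ:kₒ = 3:1 are assumptions made only for exposition.

    import numpy as np

    # Illustrative sketch: Bayes estimates under the loss functions
    # (2.2.28), (2.2.30) and (2.2.31), approximated from posterior draws.
    rng = np.random.default_rng(1978)
    draws = rng.lognormal(mean=2.0, sigma=0.5, size=100_000)  # hypothetical posterior

    est_quadratic = draws.mean()        # (2.2.28): posterior mean
    est_absolute = np.median(draws)     # (2.2.30): posterior median
    k_u, k_o = 3.0, 1.0                 # assumed costs of under-/overestimation
    est_asymmetric = np.quantile(draws, k_u / (k_u + k_o))  # (2.2.31): fractile

    print(est_quadratic, est_absolute, est_asymmetric)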














The expected loss is generally a random variable a priori since it depends on
the as yet unobserved sample data. The unconditional expectation (with
respect to the sample) of the expected loss is called the "Bayes risk" and is
minimized by the Bayes estimate.

Bayes methods have been used in a variety of areas of reliability. Most uses
can be characterized as point or interval estimation of parameters of life
distributions or of reliability functions.

Examples include Breipohl et al. (1965), who studied the behavior of a family
of Bayesian posterior distributions. In addition, the properties of the mean
of the posterior distribution as a point estimate and a method for
constructing confidence intervals were given. The problem of hypothesis
testing was considered, among others, by MacFarland (1972). He provided a
simple exposition of the rudiments of applying Bayes' equation to hypotheses
concerning reliability.

The Bayesian approach has also been applied to parameter

estimation and reliability estimation of some known distributions

like gamma, Poisson, lognormal and others. Lwin and Singh (1974)

considered a Bayesian analysis of a two-parameter gamma model in a life
testing context with special emphasis on estimation of the

reliability function. The Poisson distribution has received the

attention of Canavos (1972, 1973). In the first article a smooth

empirical Bayes estimator is derived for the hazard rate. The re-

liability function is also estimated either by using the empirical

Bayes estimate of the parameters, or by obtaining the expectation













of the reliability function. Results indicate a significant reduc-

tion in mean squared error of the empirical Bayes estimates over

the maximum likelihood estimates. A similar result was also derived

for the exponential distribution by Lemon (1972) and by Martz (1975).

Next, Canavos developed Bayesian procedures for life testing with

respect to a random intensity parameter. Bayes estimators were

derived for the Poisson parameters and reliability function based

on uniform and gamma prior distributions. Again, as expected, the

Bayes estimators have mean squared errors (MSE) that are appreciably

smaller than those of the minimum variance unbiased estimator (MVUE)

and of the maximum likelihood estimator (MLE).

Zellner (1971) has studied the Bayesian estimation of the parameters of the
lognormal distribution. Employing a flat prior, Zellner found that the
minimum MSE estimators of the parameters are the optimal Bayesian estimators
when a relative squared error loss function is used.

The Weibull and exponential distributions have received most of the attention
of authors who have studied life distributions

parameter is taken as a model by Soland (1968) for Bayesian decision

theory. The family of natural conjugate prior distributions for the

scale parameter is used in prior and posterior analysis. In addition,

preposterior analysis is given for an acceptance sampling problem

with utility linear in the unknown mean of the Weibull process. Soland

(1969) extended the analysis by treating both the shape and scale














parameters as unknown, but as was previously known it is not possible to find
a family of continuous joint distributions on the two parameters that is
closed under sampling, so a family of prior distributions is used that places
continuous distributions on the scale parameter and discrete distributions on
the shape parameter. Prior and posterior analysis are examined and seen to be
no more difficult than for the case in which only the scale parameter is
treated as unknown, but preposterior analysis and determination of optimal
sampling plans are considerably more complicated in this case.

In Bury (1972), a two-parameter Weibull distribution is

assumed to be an appropriate statistical life model. A Bayesian decision

model is constructed around a conjugate probability density function

for the Weibull hazard rate. Since a single sufficient statistic of

fixed dimensionality does not exist for the Weibull model, Bury was

able to consider only two sampling plans in his preposterior analysis:

obtain one further observation or terminate testing. Bury points out

that small sample Bayesian analysis tends to be more accurate than

classical analysis because of the additional prior information utilized

in the analysis. Bayes credible bounds for the scale parameter and

for the reliability function are derived by Papadopoulos and Tsokos

(1975).

Reliability data often include information that the failure event has not yet
occurred for some items, while observations of complete lifetimes are
available for other items. Cozzolino (1974)

addressed this problem from a Bayesian point of view, considering













density functions that have failure rate functions consisting of a known

function multiplied by an unknown scale factor. It is shown that a gamma

family of priors is conjugate for the unknown scale parameter for both

complete and incomplete experiments. A very flexible and convenient model
results from the assumption of a piecewise constant failure rate function.

Life tests that are terminated at preassigned time points or after a
preassigned number of failures are sometimes found in reliability theory.
Bhattacharya (1967) provided a Bayesian analysis of the exponential model
based on this kind of life test. He showed that the reliability estimate for
a diffuse prior (which is uniform over the entire positive line) closely
resembles the classical MVUE, and he considered the role of prior
quasi-densities¹ when a life tester has no prior information. Bhattacharya
points out that the use of a constant density over the positive real line has
been suggested to express ignorance but that it causes problems. For example
it cannot be interpreted as a probability density since it assigns infinite
measure to the parameter space. [See Box and Tiao (1972).]

A paper by Dunsmore (1974) stands out from among the other Bayesian papers in
life testing and is particularly pertinent to the life testing application in
this thesis. This article is an important exception because it carries the
Bayesian approach to its natural conclusions by determining prediction
intervals for future


¹If g(θ) is any non-negative function defined on the parameter space Ω such
that g(θ) ≥ 0 for all θ ∈ Ω, then g(θ) is called a prior quasi-density.














observations in life testing using the concept of the Bayesian predictive
distribution. One objective of prediction is to provide some estimate, either
point or interval, for future observations of an experiment F based on the
results obtained from an informative experiment E. As we mentioned before,
the classical approach to prediction involves the use of tolerance regions
[see Aitchison (1966), Folks and Browne (1975), Guenther et al. (1976) and
Hewett and Moeschberger (1976)]. In these we obtain a prediction interval
only, and the measure of confidence refers to repetitions of the whole
experimental situation. The Bayesian approach, on the other hand, allows us
to incorporate further information which might be available through a prior
distribution and leads to a more natural interpretation.

Let t₁, ..., tₙ be a random sample from a distribution with probability
density function P(t|θ), (t ∈ T; θ ∈ Θ), and let y₁, y₂, ..., yₙ be a second
independent random sample of "future" observations from a distribution with
probability density function P(y|θ), (y ∈ Y; θ ∈ Θ). Our aim is to make
predictions about some function of y₁, y₂, ..., yₙ. The Bayesian approach
assumes that a prior density function P(θ), (θ ∈ Θ), is available that
measures our uncertainty about the value of θ. If the information in E is
summarized by a sufficient statistic t², then a posterior distribution P(θ|t)
is available. Suppose now that we wish to predict some statistic y defined on
y₁, y₂, ..., yₙ. Then




²Such a sufficient statistic will always exist since, for example, t could be
the vector (t₁, t₂, ..., tₙ).













the predictive density function for y is given by

(2.2.32) P(y|t) = ∫_Θ P(y|θ) P(θ|t) dθ.³

A Bayesian prediction interval of cover β is then defined as an interval I
such that

(2.2.33) P(I|t) = ∫_I P(y|t) dy = β.


[See, for example, Aitchison and Sculthorpe (1965), Aitchison (1966) and
Guttman (1970).] It should be emphasized that in the Bayesian approach the
complete inferential statement about y is given by the predictive density
function P(y|t). Any prediction interval is only a summary of the full
description P(y|t).

In general there will be many intervals I that satisfy (2.2.33). Dunsmore
considers most plausible Bayesian prediction intervals [commonly known as
highest posterior density (HPD) intervals] of cover β, which have the form

(2.2.34) I = {y : P(y|t) > λ},

where λ is determined by P(I|t) = β.
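As a concrete (and deliberately simple) instance of (2.2.32)-(2.2.34),
consider exponential lifetimes with a gamma prior on the failure rate; this
is our own illustrative choice, not Dunsmore's general setting, and all
numbers are assumed. The predictive density then has a closed form and, being
decreasing in y, its most plausible interval of cover β runs from zero.

    import numpy as np

    a_prior, b_prior = 2.0, 100.0            # assumed gamma prior parameters
    t = np.array([55.0, 80.0, 120.0, 40.0])  # lifetimes from experiment E

    a_post = a_prior + t.size   # gamma posterior shape
    b_post = b_prior + t.sum()  # gamma posterior rate-type parameter

    def predictive_pdf(y):
        # (2.2.32): integrating the exponential density against the gamma
        # posterior gives a Lomax (Pareto type II) predictive density.
        return a_post * b_post**a_post / (b_post + y) ** (a_post + 1)

    def prediction_interval(beta):
        # (2.2.33)-(2.2.34): the predictive density decreases in y, so the
        # most plausible interval of cover beta is [0, y*], P(y <= y*) = beta.
        return 0.0, b_post * ((1.0 - beta) ** (-1.0 / a_post) - 1.0)

    print(predictive_pdf(50.0))
    print(prediction_interval(0.90))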

In conclusion we might say that the uses of Bayesian methods in life testing
have been limited. However, in those cases where Bayes estimators have been
found, they performed better, according to classical criteria, than the
conventional ones. The use of loss functions has not been analyzed deeply for
the reasons mentioned before; namely,



³It is implicitly assumed in (2.2.32) that, conditional on θ, y and t are
independent.















that the loss function is usually complex and unknown, and that even when the
loss function is known the Bayes estimate is sometimes difficult to find.
Some of these problems will be solved with the development of mathematical
theory and probably with the development of computer systems. Only the
Dunsmore paper fully used the Bayesian methodology to obtain prediction
intervals that consider all available information and fully recognize the
remaining parameter uncertainty.

All of the papers discussed in the previous section considered a stationary
situation. That is, the parameters and the distributions used are assumed to
remain the same across all time periods. It would be of value to study the
nonstationary case, where the parameters are changing in time and possibly
the distributions could also change in time. It is important to recognize,
however, that probably the problems now faced with the stationarity
assumption will be greater when that assumption is relaxed. Nevertheless,
this dynamic system is well worth investigating.


2.3 Modeling of Nonstationary Processes

For many real world data generating processes the assumption of stationarity
is questionable. Take for instance life testing models. When it is assumed
that the life of certain commodities follows a lognormal distribution, for
example, the stationarity assumption could be expected to hold over short
periods of time; but in most cases it would be expected that for a lengthy
period, stationarity would be a doubtful assumption. If the model represents
the life of perishable products, like food for example, then it











would be expected that environmental factors like heat and humidity

could change and affect the characteristics of the life distribution

of the product or affect the input factors used in the manufacturing

process. Furthermore, the wearout of the machines used in the manu-

facture of the products could cause changes in the quality of the pro-

ducts and hence in the parameters of the life distributions.

Random parameter variation is surely a reasonable assumption when we are
concerned with economic variables, like those used in Cost-Volume-Profit
analysis. A wide spectrum of circumstances could be mentioned where the
economic environment is gradually affected. For example, the level of
economic development changes gradually in a country and consequently brings
gradual changes in related variables like income, consumption and price.
Also, consumers' tastes and preferences evolve relatively slowly as social
and economic conditions change and as new marketing channels or techniques
are developed. The gradual increase in technology available to the industry
and to the government may produce changes that are not dramatic but that will
have some influence in any particular period of time. In other words, it
seems reasonable to assume that in at least some situations the distribution
functions of variables, like sales, price or costs, could be gradually
changing in time. It is important to emphasize that we are referring to
gradual changes, the effects of which are not perfectly predictable in
advance for a particular period.

If a data generating process characterized by some parameter θ is
nonstationary, then it is not particularly realistic to make inferences and
decisions concerning θ as if θ only took on a single value. Instead we should
be concerned with a sequence θ₁, θ₂, ... of values of θ corresponding to
different time periods, assuming the characteristics of the process vary
across time but are relatively constant within a given period. Some
researchers have studied this problem with particular stochastic processes.

Chernoff and Zacks (1964) studied what they called a "tracking" problem.
Observations are taken on the successive positions of an object traveling on
a path, and it is desired to estimate its current position. If the path is
smooth, regression estimates seem appropriate. However, if the path is
subjected to occasional changes in direction, regression will give misleading
results. Their objective was to arrive at a simple formula which implicitly
accounts for possible changes in direction and discounts observations taken
before the latest change. Successive observations were assumed to be taken on
n independently and normally distributed random variables with means μ₁, μ₂,
..., μₙ. Each mean is equal to the preceding mean except when an occasional
change takes place. The object is to estimate the current mean μₙ. They
studied the problem from a Bayesian point of view and made the following
assumptions: the time points of change obey an arbitrary specified a priori
probability distribution; the amounts of change in the means (when changes
take place) are independently and normally distributed random variables with
zero mean; and the current mean μₙ is a normally distributed random variable
with zero mean. Using a quadratic loss function and a uniform prior
distribution for μ₁ on













the whole real line they derived a Bayes estimator of μₙ. In addition they
derived the minimum variance linear unbiased (MVLU) estimator of μₙ.
Comparing both estimators they found that although the MVLU estimator is
considerably simpler than the Bayes estimator, when the expected number of
changes in the mean is neither zero nor n-1 the Bayes estimator is more
efficient than the MVLU.

Chernoff and Zacks studied an alternative problem in which the

prior distribution of time points of change is such that there is at

most one change. This problem leads to a relatively simple Bayes esti-

mator. However, difficulties may arise if this estimator is applied

when there are actually two (or more) changes. The suggested technique

starts at the end of a series, searches back for a change in mean and

then estimates the mean value of the series forward from the point at

which such a change is assumed to have occurred. They designed a procedure

to test whether a change in mean has occurred and found a simpler test

than the one used by Page (1954, 1955). Most of the results appearing in

this paper were derived in a previous paper by Barnard (1959) in a some-

what different manner, but the general results are essentially the same.

The previous paper by Chernoff and Zacks motivated some research in the
following years. Mustafi (1968) considered a situation in which a random
variable is observed sequentially over time and the distribution of this
random variable is subjected to a possible change at every point in the
sequence. The study of this problem is centered about the model introduced by
Chernoff and Zacks.














Three aspects of the problem were considered by Mustafi. First he considered
the problem of estimating the current value of the mean on the basis of a set
of observations taken up to the present. Chernoff and Zacks assumed that
certain parameters occurring in the model were known. Mustafi then derives a
procedure for estimating the current value of the mean on the basis of a set
of observations taken at successive time points when nothing is known about
the other parameters occurring in the model. Second, Mustafi estimated the
various points of change in the framework of an empirical Bayes procedure and
used an idea similar to that of Taimiter (1966) to derive a sequence of tests
to be applied at each stage. Third, he considers n independent observations
of a random variable that belong to the one parameter exponential family
taken at successive time points. He examines the problem of testing the
equality of these n parameters against the alternative that the parameter has
changed r times at some unknown points, where r is some finite positive
integer less than n. He developed a test procedure generalizing the
techniques used by Kander and Zacks (1966) and Page (1955).

Hinich and Farley (1966) also studied the problem of estimation models for
time series with nonstationary means. They assumed a model similar to the one
developed by Chernoff and Zacks except that they assumed that the number of
points of change per unit time is Poisson distributed with a known shift rate
parameter. They found an estimator for the mean which is unbiased and
efficient. Also it turned out to be a linear combination of the vector of
observations.













The Farley-Hinich technique attempts to estimate jointly the level of the
mean at the beginning of a series as well as the size of the change (if any).

Farley and Hinich in a later paper (1970) compared the method developed in
(1966) with the one presented by Chernoff and Zacks (1964) and later
generalized by Mustafi (1968). Some ways were examined to systematically
track time series which may contain small stochastic mean shifts as well as
random measurement errors. A "small" shift is one which is small relative to
measurement error. Three approaches were tested with artificial data, by
means of Monte Carlo methods, using mean shifts which were rather small, that
is, mean shifts which were half the magnitude of the random measurement error
variance. Several false starts with actual marketing data showed that there
was an identification problem in providing an adequate test of the
procedures' performance, and artificial data of known configuration provided
a more natural starting point. Two techniques (one developed by the authors
and the other by Chernoff and Zacks) involved formal estimation under the
assumption that there was at most one discrete jump in a data record of fixed
length of the type often stored in an information system. Both techniques
performed reasonably well when the rate of shift occurrence was known, but
both techniques are very sensitive to prior specification of the rate at
which shifts occur in terms of both classes of errors, that is, missing
shifts which occur and identifying "shifts" which do not occur. Knowing the
shift rate precisely and knowing that more than one shift in a record is
extremely unlikely are two very severe restrictions for many applications. A
simpler filter technique was tested similarly with more promising results in
terms of avoiding both classes of errors. The filter approach involved first
smoothing the series and then implementing ad hoc decision rules based on
consecutive occurrences of smoothed values falling outside a predetermined
range around the moving average.
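A rough sketch of a filter of this kind follows; the smoothing constant, band
width and run length are our own illustrative assumptions rather than Farley
and Hinich's settings.

    import numpy as np

    # Sketch of a smoothing-plus-decision-rule filter: smooth the series,
    # then flag a shift when several consecutive smoothed values fall
    # outside a band around a trailing moving average.
    def flag_shifts(x, span=5, band=2.0, run=3):
        x = np.asarray(x, dtype=float)
        alpha = 2.0 / (span + 1.0)
        smooth = np.empty_like(x)
        smooth[0] = x[0]
        for i in range(1, x.size):                 # exponential smoothing
            smooth[i] = alpha * x[i] + (1 - alpha) * smooth[i - 1]
        flags, outside = [], 0
        for i in range(span, x.size):
            center = x[i - span:i].mean()          # trailing moving average
            outside = outside + 1 if abs(smooth[i] - center) > band else 0
            if outside >= run:                     # consecutive exceedances
                flags.append(i)
                outside = 0
        return flags

    print(flag_shifts([10.0] * 12 + [14.0] * 8, band=1.0))  # flags the jump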

Harrison and Stevens have produced two important papers about Bayesian
forecasting using nonstationary models. In the first of these papers (1971),
they described a new approach to short-term forecasting based on Bayesian
principles in conjunction with a multi-state data-generating process. The
various states correspond to the occurrence of transient errors and step
changes in trend and slope. The performance of conventional systems, like the
growth models of Holt (1957), Brown (1963) and Box-Jenkins (1970), is often
upset by the occurrence of changes in trend and slope or transients. In
Harrison and Stevens' approach events of this nature are modelled explicitly,
and successive data points are used to calculate the posterior probabilities
of such events at each instant of time.

In the second paper (1976), Harrison and Stevens describe a more general
approach to forecasting. The principles of Bayesian forecasting are discussed
and the formal inclusion of the "forecaster" in the forecasting system is
emphasized as a major feature. The critical distinction is that between a
statistical forecasting method and a forecasting system. The former
transforms input data into output information in a purely mechanical way. The
latter, however, includes people: the person responsible for the forecast and
all the people concerned with using the forecasts and supplying information
relevant to the resulting actions. It is necessary that people can
communicate their information to the method and that the method clearly
communicates the uncertain information in such a way that it is readily
interpreted and accepted by decision makers. The basic model, called by them
"the dynamic linear model", is defined together with Kalman filter recurrence
relations, and a number of model formulations are given based on their
result. They first phrase the models in terms of their "natural" parameters
and structure, and then translate them into the dynamic linear model form.
Some of the models discussed by them are: a) regression models, b) the steady
model, c) the linear growth model, d) the general polynomial models, e)
seasonal models, f) autoregressive models, and g) moving average models.

Multiprocess models introduce uncertainty as to the underlying model itself,
and this approach is described in a more general fashion than in their 1971
paper. In the 1976 paper they present a Bayesian approach to forecasting
which not only includes many conventional methods, as presented before, but
possesses a remarkable range of additional facilities, not the least being
its ability to respond effectively in the start-up situation where no prior
data history (as distinct from information) is available. The essential
foundations of the method are:

(a) a parametric (or state space) model, as distinct from a functional model;

(b) probabilistic information on the parameters at any given time;

(c) a sequential model definition which describes how the parameters change
in time, both systematically and as a result of random shocks;

and

(d) uncertainty as to the underlying model itself, as between a number of
discrete alternatives.

Kamat (1976) developed a smoothed Bayes control procedure for controlling the
output of a production process when the quality characteristic is continuous
with a linear shift in its basic level. The procedure uses Bayesian
estimation with exponential smoothing for updating the necessary parameter
estimates. The application of the procedure to real life data is illustrated
with an example. Applications of the traditional x̄-chart and the cumulative
sum control chart to the same data are also illustrated for comparison.
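For reference, a minimal sketch of the cumulative sum idea follows; this is
the generic one-sided CUSUM recursion, not Kamat's smoothed Bayes procedure,
and the reference value and decision limit are assumed.

    # Generic one-sided CUSUM sketch: flag when the cumulative sum of
    # deviations from the target exceeds a decision limit h; k is the
    # usual reference value ("allowance") subtracted at each step.
    def cusum_upper(x, target, k=0.5, h=5.0):
        s, alarms = 0.0, []
        for i, xi in enumerate(x):
            s = max(0.0, s + (xi - target) - k)  # accumulate positive drift
            if s > h:
                alarms.append(i)
                s = 0.0                          # restart after an alarm
        return alarms

    print(cusum_upper([0, 1, 0, 2, 3, 2, 3, 4, 3], target=0.0))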

In Chapter Three of this dissertation we develop a Bayesian model of
nonstationarity for normal and lognormal processes. We build our results
directly on two papers, Winkler and Barry (1973) and Barry and Winkler
(1976). In the first paper they developed a Bayesian model for nonstationary
means in a multinormal data-generating process and demonstrated that the
presence of nonstationary means can have an impact upon the uncertainty
associated with a given random variable that has a normal distribution.
Moreover, the nonstationary model considered by them seems to have more
realistic properties than the corresponding stationary model. For example,
they found that in the nonstationary model the recent observations are given
more weight than the distant ones in determining the mean of the distribution
at any given time, and the uncertainty about the parameters of the process is
never completely removed. Barry and Winkler (1976) were concerned with the
effects of nonstationarity on portfolio decisions. The use of a Bayesian
approach to statistical inference and decision provides a convenient
framework for studying the problem of changing parameters, both in terms of
forecasting security prices and in terms of portfolio decision making. In
this thesis a number of extensions to their results are made, thereby
removing some of the restrictiveness of their results, and applications are
considered in the areas of CVP analysis and life testing.
















CHAPTER THREE


NONSTATIONARITY IN NORMAL AND LOGNORMAL PROCESSES

3.1 Introduction

The normal distribution is considered by many persons an important
distribution. The earliest workers regarded the distribution only as a
convenient approximation to the binomial distribution. However, with the work
of Laplace and Gauss its broader theoretical importance spread. The normal
distribution became widely and uncritically accepted as the basis of much
practical statistical work. More recently a more critical spirit has
developed, with more attention being paid to systems of "skew (asymmetric)
frequency curves". This critical spirit has persisted, but is offset by
developments in both theory and practice. The normal distribution has a
unique position in probability theory, and can be used as an approximation to
many other distributions. In real world problems, "normal theory" can
frequently be applied, with small risk of serious error, even when
substantially non-normal distributions correspond more closely to observed
values. This allows us to take advantage of the elegant nature and extensive
supporting numerical tables of normal theory. Most theoretical arguments for
the use of the normal distribution are based on forms of central limit
theorems. These theorems state conditions under which the distribution of
standardized sums of random variables tends to a unit normal distribution as
the number of variables in the sum increases, that is, conditions sufficient
to ensure an asymptotic unit normal distribution.













The normal distribution, for the reasons given before, has been widely used,
and enumerating the fields of application would be lengthy and not really
informative. However, we do emphasize that the normal distribution is almost
always used as an approximation, either to a theoretical or to an unknown
distribution. The normal distribution is well suited to this because its
theoretical analysis is fully worked out and often simple in form. Where
these conditions are not fulfilled, substitutes for normal distributions
should be sought. Even when normal distributions are not used, results
corresponding to "normal theory" are often useful as standards of comparison.

The use of normal distributions when the coefficient of variation is large
presents many difficulties in some applications. For instance, observed
values more than twice the mean would then imply the existence of
observations with negative values. Frequently this is a logical absurdity.
The lognormal distribution, as defined in equation 2.2.20, is in at least one
important respect a more realistic representation of distributions of
characters that cannot assume negative values than is the normal
distribution. A normal distribution assigns positive probability to such
events, while the lognormal distribution does not. The use of the lognormal
distribution has been investigated as a possible solution to this problem
[see Cohen (1951), Galton (1879), Jenkins (1932) and Yuan (1933)]. In a
review of the literature Gaddum (1945) found that the lognormal distribution
could be used to describe several processes. In Chapter Two we presented a
list of some of the applications of this distribution














to real life problems. Among those applications we emphasized its use in
Cost-Volume-Profit analysis and in life testing models. Furthermore, by
taking the spread parameter small enough, it is possible to construct a
lognormal distribution closely resembling any normal distribution. Hence,
even if a normal distribution is felt to be really appropriate, it might be
replaced by a suitable lognormal distribution.

As was mentioned in Chapter Two, most research concerned with the normal and
lognormal distributions has considered only stationary situations. That is,
the parameters (known or assumed to be known) and distributions used are
assumed to remain the same in the future. In this third chapter we intend to
build a nonstationary model for normal and lognormal processes from a
Bayesian point of view. Section 3.2 sets the stage for the development of the
nonstationary model. In it, we describe essential features of the Bayesian
analysis of normal and lognormal processes including prior, posterior and
predictive distributions. Two uncertainty situations are considered in this
section: in one the shift parameter, μ, is assumed to be unknown and the
spread parameter, σ, is assumed to be known; and in the other, both
parameters are assumed to be unknown. In Section 3.3, we develop a particular
nonstationary model for the shift parameter of the lognormal distribution,
again under the same two uncertainty situations, and provide a comparison of
the results with a stationary model.













3.2 Bayesian Analysis of Normal and Lognormal Processes


Before the last decade, most of the Bayesian research dealing with problems
of statistical inference and decisions concerning a parameter θ assumed that
θ takes on a single value; those models are called stationary models. For
example, θ may represent the proportion of defective items produced by a
certain manufacturing process; the mean monthly profits of a given company;
the mean life of a manufactured product; and so on. In each case θ is assumed
to be fixed but not known. A formal Bayesian statistical analysis articulates
the evidence of a sample to be analyzed with evidence other than that of the
sample; it is felt that there usually is prior evidence. The non-sample
evidence is assessed judgmentally or subjectively and is expressed in
probabilistic terms, by means of: (1) a data distribution that specifies the
probability of any sample result conditional on certain parameters; and (2) a
prior distribution that expresses our uncertainty about the parameters. When
judgment in the form of the assessment of a likelihood function to apply to
the data is combined with the evidence of a sample, we have the likelihood
function of the sample. The likelihood function of the sample is combined
with the prior distribution via Bayes' theorem to produce a posterior
distribution for the parameters of the data distribution, and this is the
typical output of a formal Bayesian analysis. If we assume that the prior
distribution for the parameters of the data distribution is continuous, then
we may express Bayes' theorem as















(3.2.1) f(θ|x,τ) = f(θ|τ) f(x|θ) / f(x|τ),

where

x denotes the vector of sample observations,

θ represents all the unknown parameters,

and

τ represents the known parameters of the prior distribution of θ.


We can interpret f(x|θ) in two ways: (1) for given θ, f(x|θ) gives the
distribution of the random vector x̃; (2) for given x, f(x|θ) as a function of
θ, together with all positive multiples, is in the usual usage the likelihood
function of the sample.

The prior probability of the sample f(x|τ) is computed from

(3.2.2) f(x|τ) = ∫_Θ f(θ|τ) f(x|θ) dθ,

from which we see that f(x|τ) can be interpreted as the expected value of the
likelihood in the light of the prior distribution. Alternatively, f(x|τ) can
be interpreted as the marginal distribution of the random vector x̃ with
respect to the joint distribution,

(3.2.3) f(x,θ|τ) = f(θ|τ) f(x|θ).

Since (3.2.2) can be computed in advance of the sample for any x, we shall
frequently refer to the marginal distribution of x̃ as the predictive
distribution implied by the specified prior distribution and data
distribution.











If we have a posterior distribution f(θ|x) and if a future random vector w̃ is
to come from f(w|θ), which may or may not be the same data distribution as in
(3.2.2), we may compute

(3.2.4) f(w|x) = ∫_Θ f(θ|x) f(w|θ) dθ.

We refer to the distribution so defined as the predictive distribution of a
future sample implied by the posterior distribution. It must be understood
that (3.2.2) and (3.2.4) are but two instances of the same relationship;
sometimes it is worth distinguishing the practical problems arising when
predictions refer to the present sample from those arising in connection with
predictions about a future sample, that is, a "not-yet-observed" sample. The
revision of the prior distribution gives the statistician a method for
drawing inferences about θ, the uncertain expression, quantity or parameter
of interest, and for decisions related to θ.

In general, then, we may say that the term Bayesian refers to any use or user
of prior distributions on a parameter space (although there is some
nonparametric Bayesian material also) with the associated application of
Bayes' theorem in the analysis of an inferential or decision problem under
uncertainty. Such an analysis rests on the belief that in most practical
situations the statistician will possess some subjective a priori information
concerning the probable values of the parameter. This information may often
be reasonably summarized and formalized by the choice of a suitable prior
distribution on the parameter space. The fact that the decision maker cannot
specify every detail of his prior distribution by direct assessment means
that there will often be considerable latitude in the choice of the family of
distributions to be used, even though the selection of a particular member
within the chosen family will usually be wholly determined by the decision
maker's expressed beliefs or betting odds. Three characteristics are
particularly desirable for a family of prior distributions:




(i) analytical tractability in three aspects, namely:

a) it should be reasonably easy to determine the posterior distribution
resulting from a given prior and sample,

b) it should be possible to express in convenient form the expectations of
some simple utility functions with respect to any member of it,

and

c) the family should be closed in the sense that if the prior is a member of
it, the posterior will also be a member of it;

(ii) the family should be rich, so that there will exist a member of it
capable of expressing the decision maker's prior beliefs or at least
approximating them well;

and

(iii) it should be parametrizable in a manner which can readily be
interpreted, so that it will be easy to verify that the













chosen member of the family is really in close agreement with the decision
maker's prior judgments about θ and not a mere artifact agreeing with one or
two quantitative summarizations of these judgments.

A family of prior densities which gives rise to posteriors belonging to the
same family is very useful inasmuch as one aspect of mathematical
tractability is maintained, and this property has been termed "closure under
sampling". For densities which admit sufficient statistics of fixed
dimensionality, a concept to be explained later, Raiffa and Schlaifer (1961)
have considered a method of generating prior densities on the parameter space
that possess the "closure under sampling" property. A family of such
densities has been called by them a "natural conjugate family". To define the
concepts of sufficient statistic and sufficient statistic of fixed
dimensionality, consider a statistical problem in which a large amount of
experimental data has been collected. The treatment of the data is often
simplified if the statistician computes a few numerical values, or
statistics, and considers these values as summaries of the relevant
information in the data. In some problems, a statistical analysis that is
based on these few summary values can be just as effective as any analysis
that could be based on all observed values. If the summaries are fully
informative they are known as sufficient statistics. Formally, suppose that θ
is a parameter which takes a value in the space Θ. Also suppose that x is a
random variable, or random vector, which takes values in the














sample space S. We shall let f(·|θ₀) denote the conditional probability
density function (p.d.f.) of x when θ = θ₀ (θ₀ ∈ Θ). It is assumed that the
observed value of x will be available for making inferences and decisions
related to the parameter θ. Any function T of the observations x is called a
statistic. Loosely speaking, a statistic T is called a sufficient statistic
if, for any prior distribution of θ, its posterior distribution depends on
the observed value of x only through T(x). More formally, for any prior
p.d.f. g(θ) and any observed value x ∈ S, let g(·|x) denote the posterior
p.d.f. of θ, assuming for simplicity that for every value of x ∈ S and every
prior p.d.f. g, the posterior g(·|x) exists and is specified by the Bayes
theorem. Then it is said that a statistic T is sufficient for the family of
p.d.f.'s f(·|θ), θ ∈ Θ, if g(·|x₁) = g(·|x₂) for any prior p.d.f. g and any
two points x₁ ∈ S and x₂ ∈ S such that T(x₁) = T(x₂).

Now, consider only data generating processes which generate independent and
identically distributed random variables x̃₁, x̃₂, ... such that, for any n and
any (x₁, x₂, ..., xₙ), there exists a sufficient statistic. Sufficient
statistics of fixed dimensionality are those statistics T such that
T(x₁, x₂, ..., xₙ) = T = (T₁, T₂, ..., Tₛ), where a particular value Tᵢ is a
real number and the dimensionality s of T does not depend on n. Independently
of how many elements we sample, only s statistics are needed.

Raiffa and Schlaifer (1961) present the following method for developing the
natural conjugate prior for a given likelihood function:













(i) Let the density function of θ be g, where g denotes either a prior or a
posterior density, and let k be another function on Θ such that

(3.2.5) g(θ) = k(θ) / ∫_Θ k(θ) dθ.

Then we shall write

(3.2.6) g(θ) ∝ k(θ)

and say that k is a kernel of the density of θ.

(ii) Let the likelihood of x given θ be l(x|θ), and suppose that P and k are
functions on x such that, for all x and θ,

(3.2.7) l(x|θ) = k(x|θ) P(x).

Then we shall say that k(x|θ) is a kernel of the likelihood of x given θ and
that P(x) is a residue of this likelihood.



(iii) Let the prior distribution of the random variable θ have a density g′.
For any x such that l*(x|g′) = ∫_Θ l(x|θ) g′(θ) dθ > 0, it follows from
Bayes' theorem that the posterior distribution of θ has a density g″ whose
value at θ for the given x is

(3.2.8) g″(θ|x) = g′(θ) l(x|θ) N(x),

where

N(x) = [∫_Θ g′(θ) l(x|θ) dθ]⁻¹.














(iv) Now let k′ denote a kernel of the prior density of θ. It follows from
the definitions of k and l and of the symbol ∝ that the Bayes formula can be
written

(3.2.9) g″(θ|x) = g′(θ) l(x|θ) N(x)
               = k′(θ) [∫_Θ k′(θ) dθ]⁻¹ k(x|θ) P(x) N(x),

so that g″(θ|x) ∝ k′(θ) k(x|θ), where the value of the constant of
proportionality for the given x,

(3.2.10) P(x) N(x) [∫_Θ k′(θ) dθ]⁻¹,

can always be determined by the condition

(3.2.11) ∫_Θ g″(θ|x) dθ = 1, whenever the integral exists.
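A small numerical check (ours, with invented numbers) may make
(3.2.9)-(3.2.11) concrete. For a normal process with known σ², the kernel of
the likelihood in μ is exp[-n(μ-m)²/2σ²], so a normal prior kernel of the
same form is a natural conjugate; multiplying kernels on a grid and
normalizing reproduces the closed-form normal posterior derived in Section
3.3.

    import numpy as np

    sigma, m_prior, n_prior = 2.0, 10.0, 4.0       # assumed prior: N(m', sigma^2/n')
    x = np.array([12.1, 9.8, 11.5, 10.9, 12.4])    # invented sample
    n, m = x.size, x.mean()

    mu = np.linspace(5, 15, 4001)
    k_prior = np.exp(-n_prior * (mu - m_prior) ** 2 / (2 * sigma ** 2))
    k_like = np.exp(-n * (mu - m) ** 2 / (2 * sigma ** 2))
    post = k_prior * k_like                        # (3.2.9): product of kernels
    post /= np.trapz(post, mu)                     # normalize per (3.2.11)

    n_post = n_prior + n                           # closed-form update (3.3.1)
    m_post = (n_prior * m_prior + n * m) / n_post  # closed-form update (3.3.2)
    closed = np.exp(-n_post * (mu - m_post) ** 2 / (2 * sigma ** 2))
    closed /= np.trapz(closed, mu)
    print(np.max(np.abs(post - closed)))           # ~0: the two densities agree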


Before we begin our presentation of a basic Bayesian analysis of normal and
lognormal processes we want to emphasize that caution should be exercised in
the application of the method developed by Raiffa and Schlaifer, as is
pointed out by Box and Tiao (1972). According to them it is often appropriate
to analyze data from a scientific investigation on the assumption that the
likelihood dominates the prior, for two reasons:

(i) a scientific investigation is not usually undertaken unless information
supplied by the investigation is likely to be considerably more precise than
information already available, that is, unless it is likely to increase
knowledge by a substantial amount. Therefore analysis with priors which are
dominated by the likelihood often realistically represents the true
inferential situation.

(ii) even when a scientist holds strong prior beliefs about the value of a
parameter θ, nevertheless, in reporting the results it would usually be
appropriate and most convincing to his colleagues if he analyzed the data
against a reference prior which is dominated by the likelihood. He could say
that, irrespective of what he or anyone else believed to begin with, the
posterior distribution represented what someone who a priori knew very little
about θ should believe in the light of the data. Reference priors in general
mean standard priors dominated by the likelihood. [See Dickey (1973) for a
general discussion of Bayesian methods in scientific reporting.]

In general a prior which is dominated by the likelihood is one which does not
change very much over the region in which the likelihood is appreciable and
does not assume large values outside that range. We shall refer to a prior
distribution which has these properties as a locally uniform prior. There are
some difficulties, however, associated with locally uniform priors. The
choice of a prior to characterize a situation where "nothing" (or, more
realistically, little) is known a priori has long been, and still is, a
matter of dispute. Bayes tentatively suggested that where such knowledge was
lacking concerning the nature of the prior distribution, it might be regarded
as uniform. There is an objection to Bayes' postulate. If the distribution of
a continuous parameter θ were taken to be locally uniform, then the
distribution of log θ or some other transformation of θ (which might provide
an equally sensible basis for parametrizing the problem) would not be locally
uniform. Thus, application of Bayes' postulate to different transformations
of θ would lead to posterior distributions from the same data which were
inconsistent with the notion that nothing is known about θ or functions of θ.
This argument is of course correct, but the arbitrariness of the choice of
parametrization does not by itself mean that we should not employ Bayes'
postulate in practice.

Box and Tiao (1972) present an argument for choosing a par-

ticular metric in terms of which a locally uniform prior can be

regarded as noninformative about the parameters. It is important to

bear in mind that one can never be in a state of complete ignorance;

further, the statement "knowing little a priori" can only have mean-

ing relative to the information provided by the experiment. A prior

distribution is supposed to represent knowledge about parameters

before the outcome of a projected experiment is known. Thus, the main

issue is how to select a prior which provides little information rela-

tive to what is expected to be provided by the intended experiment.


3.3 Nonstationary Model for Normal and Lognormal Means

It was emphasized in Section 2.3 that for many real world

data generating processes the assumption of stationarity is question-

able. Random parameter variation could be a reasonable assumption when

we are concerned with life testing models or with economic variables.

For example, in life testing models, when it is assumed that the life

of certain parts follows a lognormal distribution, the stationarity













assumption could be expected to hold over short periods of time; but

in most cases it would be expected that for a lengthy period, statio-

narity would be a doubtful assumption. Similarly in other areas like

Cost-Volume-Profit analysis it is doubtful that the stationarity

assumption will hold over long periods of time. Variables like sales,

costs, and contribution margin are affected by economic, political

and environmental factors. In particular it was pointed out that we

are interested in gradual changes, the effects of which are not perfectly

predictable in advance for a particular period.

If a data generating process characterized by some parameter θ is
nonstationary, then it is potentially misleading to make inferences and
decisions concerning θ as if θ only took on a single value. Instead we should
be concerned with a sequence θ₁, θ₂, ... of values of θ corresponding to
different time periods, assuming the characteristics of the process may vary
across time. Several methods have been proposed to study stochastic parameter
variation [see Chernoff and Zacks (1964) and Harrison and Stevens (1976)].
Some have claimed that a reasonable approach to the effects of gradual change
might be to model the parameters of nonstationary distributions as if they
undergo independent random shifts through time [see Barry (1976), Carter
(1972), and Kamat (1976)]. Specifically they suggest the use of a model that
assumes that the mean of the distribution has a linear shift. In those
papers, it is clearly demonstrated that when it is assumed that the process
represented by the model is normal, this linear random shift model allows
analytical comparisons to be drawn if it is assumed that the successive
increments in the process mean are drawn independently from a normal
population with mean u and variance σₑ². We intend to use the same approach
in this dissertation. Two cases are considered: μ unknown and σ² known; and
both μ and σ² unknown.



3.3.1 μ is Unknown and σ² is Known

For a process that has a normal density function with unknown parameter μ,
Raiffa and Schlaifer (1961) show that the natural conjugate prior is normal
with parameters m′ and σ²/n′. (See Appendix I for the details of their
exposition.) From the prior distribution on μ₀ and with a sequence of n
independent observations (x₁, x₂, ..., xₙ) from the normal process under
consideration [N(μ,σ²)], the posterior distribution in period zero is
obtained. If the sample yields sufficient statistics m and n, then the
posterior distribution is normal with parameters n₀″ and m₀″ given by

(3.3.1) n₀″ = n₀′ + n,

and

(3.3.2) m₀″ = (n₀′m₀′ + nm)/(n₀′ + n).

If the mean of the distribution does not change from period to period except
by the effect of the sample information, then each posterior can be thought
of as a prior with respect to the following sample. Thus, the posterior
distribution on μ₀ is the prior distribution on μ₁; i.e.,

(3.3.3) f″(μ₀|m₀″, σ²/n₀″) = f′(μ₁|m₁′, σ²/n₁′),

where

(3.3.4) m₁′ = m₀″,

and

(3.3.5) n₁′ = n₀″.


In general, if we assume that a fixed sample of size n is employed every time
a sample is taken and if we assume that the mean is stationary except by the
effect of the sample information, then in any given period t the posterior
distribution is normal with parameters nₜ″ and mₜ″ given by

(3.3.6) nₜ″ = nₜ′ + n,

and

(3.3.7) mₜ″ = (nₜ′mₜ′ + nm)/(nₜ′ + n).

This inferential model is called a stationary model since it assumes that
neither the distribution nor the parameters change from period to period. In
this case it assumes that μₜ takes on the same value in every period and that
f′(μₜ) represents the information available about that value as of the start
of the t-th period.
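The cycle described by (3.3.6) and (3.3.7) is easy to trace numerically; in
the following sketch the sample means are invented for illustration, and each
period's posterior serves as the next period's prior.

    # Sketch of the stationary updating cycle (3.3.6)-(3.3.7): n' grows by
    # n every period, so the prior uncertainty sigma^2/n' shrinks steadily.
    def update(m_prior, n_prior, m_sample, n_sample):
        n_post = n_prior + n_sample                                   # (3.3.6)
        m_post = (n_prior * m_prior + n_sample * m_sample) / n_post   # (3.3.7)
        return m_post, n_post

    m_t, n_t = 10.0, 4.0                          # assumed initial prior
    for m_sample in [11.2, 10.7, 11.9, 10.4]:     # one sample mean per period
        m_t, n_t = update(m_t, n_t, m_sample, 5)  # fixed sample size n = 5
        print(m_t, n_t)                           # n_t: 9, 14, 19, 24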

Suppose now that the process generating the observations undergoes a mean
shift between successive periods. In particular, inferences about the mean of
a normal process are considered when the parameter μ shifts from period to
period, with the shifts governed by an independent normal process. Formally,
consider a data generating process that generates n observations xₜ₁, xₜ₂,
..., xₜₙ during time period t according to a normal process with parameters
μₜ and σ². Assume that the parameter σ² is known and does not change over
time, whereas μₜ is not known and may vary over time. In particular, values
of the parameter for successive time periods are related as

(3.3.8) μₜ₊₁ = μₜ + εₜ₊₁, t = 1, 2, ...,

where εₜ₊₁ is a normal "random shock" term independent of μₜ with known mean
u and variance σₑ². That is, μₜ behaves as a random walk. The mean in any
period t is equal to the mean in the previous period plus an increment ε,
which has a normal distribution with known mean and variance.
mean and variance.

Before the sample is taken at time t, we assume that a prior density function
could be assessed that represents judgment (based on past experience, past
information, etc.) concerning the probabilities for the possible values of
μₜ. If the prior distribution of μₜ at the beginning of time period t is
represented by f′(μₜ), and a sample of size nₜ during period t yields
xₜ = (xₜ₁, ..., xₜₙ), then the prior distribution of μₜ can be revised.
Furthermore, at the end of time period t (the beginning of time period t+1),
the data generating process is governed by a new mean μₜ₊₁, so it is
necessary to use the posterior distribution of μₜ and the relation (3.3.8) to
determine the prior distribution of μₜ₊₁.

In order to determine the distribution of the parameter μₜ₊₁ a well known
theorem could be used. It says that the convolution g(z) of two normal
distributions with parameters (μ₁,σ₁²) and (μ₂,σ₂²) gives a distribution
which is normal with mean (μ₁ + μ₂) and variance (σ₁² + σ₂²), i.e.,

(3.3.9) g(z) = f_N(z|μ₁ + μ₂, σ₁² + σ₂²)

[see Mood et al. (1974)]. Thus the distribution of μₜ₊₁ is normal, i.e.,

(3.3.10) f_N(μₜ₊₁|mₜ″ + u, (σ²/nₜ″) + σₑ²), -∞ < μₜ₊₁ < ∞,
         -∞ < mₜ″ + u < ∞, (σ²/nₜ″) + σₑ² > 0.

We could find a simpler expression if we realize that, since σ² and σₑ² are
positive, there must exist nₛ such that

(3.3.11) σₑ² = σ²/nₛ, or nₛ = σ²/σₑ².


In other words, the disturbance variance is a multiple of the pro-

cess variance. The prior distribution of the mean after t periods then

simplifies to


(3.3.12)   f_N(μ_{t+1} | m''_t + u, σ²(n''_t + n_s)/(n''_t n_s)),

or

(3.3.13)   f'_N(μ_{t+1} | m'_{t+1}, σ²/n'_{t+1}),

where

(3.3.14)   m'_{t+1} = m''_t + u,

and

(3.3.15)   n'_{t+1} = n''_t n_s/(n''_t + n_s) < n''_t.


The inequality stated above can be interpreted as showing that the presence of nonstationarity produces greater uncertainty (variance) at the start of period t+1 than would be present under stationarity, because in the stationary case n'_{t+1} = n''_t. If we assume that a change in the mean occurs between every two consecutive periods, then we could repeat the previous procedure each time a change occurs to determine the new prior distribution.
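A sketch of one full period of this cycle in Python (the starting values, drift u, and precision ratio n_s = σ²/σ²_ε are illustrative assumptions, not values from the text):

    # One period of the nonstationary cycle: conjugate revision
    # (3.3.6)-(3.3.7) followed by the random-shock step (3.3.14)-(3.3.15).
    # All numbers are illustrative.

    def one_period(m_prior, n_prior, m_sample, n_sample, u, n_s):
        # Posterior at the end of period t.
        n_post = n_prior + n_sample
        m_post = (n_prior * m_prior + n_sample * m_sample) / n_post
        # Prior for period t+1 after the shock; n_s = sigma^2 / sigma_e^2.
        m_next = m_post + u                          # (3.3.14)
        n_next = n_post * n_s / (n_post + n_s)       # (3.3.15): always < n_post
        return m_next, n_next

    m_next, n_next = one_period(10.0, 2.0, 12.0, 25.0, u=0.0, n_s=50.0)
    print(m_next, n_next)   # n_next < n'': the shock erodes accumulated precision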

For a process that has a lognormal density function as defined in (A1.14), it was shown in Appendix I that, when the unknown parameter is μ, the natural conjugate prior is normal. Thus, the revision of the prior distribution in any given period is identical to the revision in the normal case [see equations (3.3.6) and (3.3.7)] except that m is defined as the sample mean of the natural logarithms of the observed x values. Furthermore, the procedure presented before to represent changes in the mean, μ, of the normal distribution can be used to model changes in the shift parameter μ of the lognormal distribution. The normality of the natural conjugate prior, in this case, allows us to use the formulas (3.3.8)-(3.3.15) to study the behavior of the prior distribution of μ after t periods of time.

Since the variance V(x) of the lognormal random variable x is a function of μ and σ² in the lognormal case, nonstationarity in μ means that both the mean and the variance of x are nonstationary, so that the lognormal case provides a generalization of the normal results.
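A sketch of the lognormal variant in Python (the observations and prior values are invented for illustration); the only change from the normal case is that m is computed from the logged data:

    # Revision for a lognormal process with unknown mu: identical to
    # (3.3.6)-(3.3.7), with m the mean of the natural logarithms.
    import math

    xs = [2.7, 3.1, 4.0, 3.6]                        # illustrative observations
    n = len(xs)
    m = sum(math.log(x) for x in xs) / n             # m = mean of ln(x)
    m_prior, n_prior = 1.0, 2.0                      # illustrative prior
    n_post = n_prior + n                             # (3.3.6)
    m_post = (n_prior * m_prior + n * m) / n_post    # (3.3.7)
    print(m_post, n_post)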


3.3.2 μ and σ² Both Unknown

The results of the previous section can be extended to the case of unknown mean and variance. The joint natural conjugate prior density function for μ and σ² is a normal-gamma-2 function, as was shown in Appendix I, given by


(3.3.16)   f'_{N-γ-2}(μ, σ² | m', v', n', d')
              = [√n'/(σ√(2π))] exp[-(n'/(2σ²))(μ - m')²]
                × [(d'v'/2)^{d'/2}/Γ(d'/2)] (σ²)^{-(d'/2 + 1)} exp[-d'v'/(2σ²)].

Given a prior from this family and assuming that information is available from a normal (or lognormal) process through a sample of observations x₁, x₂, ..., x_n, it is possible to obtain a posterior distribution of the two parameters μ and σ². It was shown in Appendix I that the posterior distribution is also normal-gamma-2, i.e., f''_{N-γ-2}(μ, σ² | m'', v'', n'', d''),
where


(3.3.17)   m'' = (n'm' + nm)/(n' + n),

(3.3.18)   v'' = [d'v' + n'm'² + dv + nm² - n''m''²]/(d' + n),

(3.3.19)   n'' = n' + n,














and

(3.3.20)   d'' = d' + n.


It is clear from (3.3.16) that the joint distribution of μ and σ² is the product of the conditional distribution of μ given σ² and the marginal distribution of σ², i.e.,

(3.3.21)   f''_{N-γ-2}(μ, σ² | m'', v'', n'', d'') = f''_N(μ | σ²; n'', m'') f''_{γ-2}(σ² | v'', d'').

The marginal density of σ² does not depend on μ. Now consider the case of nonstationary μ as in the previous section. The independence of the marginal distribution of σ² from μ will be an important factor in our results below.
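The revision (3.3.17)-(3.3.20) can be sketched in Python as follows (the function and argument names are illustrative; m, n, v, d are the sample statistics as defined in the text):

    # The normal-gamma-2 revision (3.3.17)-(3.3.20); illustrative names only.

    def ng2_update(m_p, n_p, v_p, d_p, m, n, v, d):
        """Posterior (m'', n'', v'', d'') from the prior (m', n', v', d')
        and the sample statistics (m, n, v, d)."""
        n_post = n_p + n                                         # (3.3.19)
        m_post = (n_p * m_p + n * m) / n_post                    # (3.3.17)
        v_post = (d_p * v_p + n_p * m_p**2 + d * v + n * m**2
                  - n_post * m_post**2) / (d_p + n)              # (3.3.18)
        d_post = d_p + n                                         # (3.3.20)
        return m_post, n_post, v_post, d_post

    print(ng2_update(10.0, 2.0, 4.0, 5.0, m=12.0, n=25.0, v=3.5, d=24.0))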

At the end of period t (the beginning of time period t+1), the posterior distribution of μ and σ² could be used in conjunction with the relation between μ_t and the random shock ε_{t+1} to get the joint prior distribution at the beginning of period t+1. As before, the random shock model to be considered is μ_{t+1} = μ_t + ε_{t+1}. We make the assumption that although σ² is unknown, it is known that ε's variance, σ²_ε, is 1/n_s times the unknown process variance, σ². As before, assuming that μ has a posterior distribution with parameters (m''_t, σ²/n''_t) and that ε is distributed normally with parameters (u, σ²/n_s), it was shown in Appendix I that the convolution z (z = μ + ε) has a conditional density given by

(3.3.22)   g(z) = f_N(z | m''_t + u, σ²[(1/n''_t) + (1/n_s)]).












Note that this density is conditional on σ², as is the conjugate prior of μ. Thus, the prior density of μ_{t+1}, at the beginning of period t+1 after the random shock has occurred, is given by

(3.3.23)   f'_N(μ_{t+1} | m''_t + u, σ²[(n_s + n''_t)/(n''_t n_s)]).

Since σ² is assumed constant, f_{γ-2}(σ²) does not change but equals the posterior distribution at the end of period t. Hence, the joint distribution at the beginning of period t+1 is given by

(3.3.24)   f'_{N-γ-2}(μ_{t+1}, σ²) = f_N(μ_{t+1} | m''_t + u, σ²[(n_s + n''_t)/(n''_t n_s)]) f_{γ-2}(σ² | d''_t, v''_t).


If we let

(3.3.25)   m'_{t+1} = m''_t + u,

(3.3.26)   n'_{t+1} = n''_t n_s/(n_s + n''_t),

(3.3.27)   d'_{t+1} = d''_t,

and

(3.3.28)   v'_{t+1} = v''_t,

then the distribution of μ and σ² could be written as

(3.3.29)   f'_{N-γ-2}(μ_{t+1}, σ²) = f'_N(μ_{t+1} | m'_{t+1}, σ²/n'_{t+1}) f'_{γ-2}(σ² | d'_{t+1}, v'_{t+1}).

The revision could be continued since the prior distribution at the beginning of period t+1 is still a normal-gamma-2 distribution. At any time t, the process mean is not known with certainty, but the information from the samples collected up to time t provides an indication of μ_t. Before the sample is taken at time t, we assume that one is capable of assessing a prior density function that represents our judgment (based on past experience, past information, etc.) concerning the probabilities for the possible values of μ_t and σ². In effect, one views (μ_t, σ²) as a pair of random variables to which we have assigned a probability density function, in this case a normal-gamma-2 with parameters m', n', v' and d'. The sample results at time t can be described in terms of the sufficient statistics m_t, n_t, v_t and d_t: sample mean, sample size, sample variance, and the degrees of freedom needed to determine v_t, respectively. Using these sample results, a new posterior distribution could be obtained which is normal-gamma-2. The tractability of the model is maintained when a natural conjugate prior is used and a shift model of the form (3.3.8) is assumed for the changes of the parameter μ between two consecutive periods. Hence, after t periods of time the joint distribution of μ and σ² is normal-gamma-2; that is,


(3.3.30)   f'_{N-γ-2}(μ_{t+1}, σ² | m'_{t+1}, n'_{t+1}, d'_{t+1}, v'_{t+1}),

where

(3.3.31)   d'_{t+1} = d'_1 + tn,

(3.3.32)   n'_{t+1} = (n'_t + n) n_s/[(n'_t + n) + n_s],

(3.3.33)   v'_{t+1} = [d'_t v'_t + n'_t m'_t² + dv + nm² - n''_t m''_t²]/[d'_t + n],

and

(3.3.34)   m'_{t+1} = (n'_t m'_t + nm)/(n'_t + n).


In this manner, a sequence of prior and posterior distributions for successive μ_t may be obtained as successive values of the random vector x_t = (x_{t1}, ..., x_{tn}) are observed.
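The full period-to-period recursion can be sketched as a short simulation (Python; the data generation, starting values, n_s, and the seed are all invented for illustration):

    # Sequential revision of (mu, sigma^2) under the shock model (3.3.8):
    # conjugate update (3.3.17)-(3.3.20), then shock step (3.3.25)-(3.3.28).
    import random

    random.seed(1)
    mu, sigma, n, n_s, u = 10.0, 2.0, 25, 50.0, 0.0
    m_p, n_p, v_p, d_p = 9.0, 2.0, 4.0, 5.0     # prior at the start of period 1

    for t in range(20):
        xs = [random.gauss(mu, sigma) for _ in range(n)]
        m = sum(xs) / n
        d = n - 1                                # degrees of freedom for v
        v = sum((x - m) ** 2 for x in xs) / d
        n_post = n_p + n                                          # (3.3.19)
        m_post = (n_p * m_p + n * m) / n_post                     # (3.3.17)
        v_post = (d_p * v_p + n_p * m_p**2 + d * v + n * m**2
                  - n_post * m_post**2) / (d_p + n)               # (3.3.18)
        d_post = d_p + n                                          # (3.3.20)
        # Shock step: only m and n change; v and d carry over.
        m_p = m_post + u                           # (3.3.25)
        n_p = n_post * n_s / (n_s + n_post)        # (3.3.26)
        d_p, v_p = d_post, v_post                  # (3.3.27)-(3.3.28)
        mu += random.gauss(u, sigma / n_s ** 0.5)  # the mean actually shifts

    print(m_p, n_p, v_p, d_p)   # n_p stays bounded; d_p grows with t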

For the process that has a lognormal density function as defined in (A1.14), it was shown before that when both parameters are unknown the joint natural conjugate prior is normal-gamma-2. Thus, the revision of the prior distribution in any given period is identical to the revision in the normal case. Furthermore, the procedure presented previously to represent changes in the mean, μ, of the normal distribution could be used to model changes in the shift parameter of the lognormal. The fact that both normal and lognormal distributions have a joint natural conjugate prior which is normal-gamma-2 allows us to use the formulas (3.3.30)-(3.3.34) to study the behavior of the prior distribution of μ and σ² after t periods.



3.3.3 Stationary Versus Nonstationary Results

Stationary conditions, in the context of our discussion, imply that there is no shift in the mean, μ, of the distribution; that is, ε_t ≡ 0 and consequently u and σ²_ε are both zero. Successive values of μ are the same across time, i.e., μ₁ = μ₂ = ... = μ_t. For the case when only μ is unknown, this implies that equation (3.3.10) becomes

(3.3.35)   f_N(μ_{t+1} | m''_t + 0, (σ²/n''_t) + 0),

or

(3.3.36)   f_N(μ_{t+1} | m''_t, σ²/n''_t).


Under stationarity, then, the prior distribution of μ_{t+1} at the start of period t+1 is the same as the posterior distribution of μ_t at the end of period t. In the case of nonstationarity with no drift, u = 0; in other words, the distribution of ε is normal with mean 0 and variance σ²_ε. For this case it is clear that for a given posterior distribution of μ_t at time t, the only difference between the prior distribution of μ_{t+1} under stationarity (see equation 3.3.36) and the prior distribution of μ_{t+1} under nonstationarity (see equation 3.3.10) is the variance term. The prior variance of μ_{t+1} under stationarity is

(3.3.37)   Var_S(μ_{t+1}) = σ²/n'_{t+1} = σ²/n''_t,

whereas the prior variance of μ_{t+1} under nonstationarity is

(3.3.38)   Var_N(μ_{t+1}) = σ²/n'_{t+1} = (σ²/n''_t) + (σ²/n_s)
                          = σ²[(1/n''_t) + (1/n_s)].


As expected, the incorporation of the nonstationary condition has caused an increase in the variance of the prior distribution. The variance increased by an amount σ²/n_s; that is, by an amount equal to the variance of the distribution of successive increments in the process mean. For the stationary case

(3.3.39)   [n'_{t+1}]⁻¹ = [1/n''_t],

and for the nonstationary case

(3.3.40)   [n'_{t+1}]⁻¹ = [(1/n''_t) + (1/n_s)].

Thus, equivalently, we could say that for a given posterior distribution of μ_t at time t, the only difference between the prior distribution of μ_{t+1} under stationarity and under nonstationarity is that the term n'_{t+1} is larger under the stationary condition. When u ≠ 0, m'_t is always changing as well and, therefore, there is a difference in both mean and variance.
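Numerically, with invented values of n''_t and n_s, the contrast between (3.3.39) and (3.3.40) looks like this:

    # Comparing (3.3.39) and (3.3.40); n'' and n_s are made-up values.
    n_post, n_s = 40.0, 50.0
    n_next_stationary = n_post                                # (3.3.39)
    n_next_nonstationary = 1.0 / (1.0 / n_post + 1.0 / n_s)   # (3.3.40)
    print(n_next_stationary, n_next_nonstationary)            # 40.0 vs about 22.2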

Stationary conditions, in the case when both μ and σ² are unknown, imply that in any given period t+1 the joint prior density for μ and σ² is a normal-gamma-2 of the form given in equations (3.3.30)-(3.3.34). That is,

(3.3.41)   f'_{N-γ-2}(μ_{t+1}, σ² | m'_{t+1}, v'_{t+1}, n'_{t+1}, d'_{t+1})
              = f'_N(μ_{t+1} | m'_{t+1}, σ²/n'_{t+1}) f'_{γ-2}(σ² | d'_{t+1}, v'_{t+1}),

where

(3.3.42)   m'_{t+1} = m''_t,

(3.3.43)   v'_{t+1} = v''_t,

(3.3.44)   n'_{t+1} = n''_t,

and

(3.3.45)   d'_{t+1} = d''_t.

Under stationarity, then, the joint prior distribution of μ and σ² at the start of period t+1 is the same as the posterior distribution of μ_t and σ² at the end of period t. Since the distribution of σ² does not depend on μ, only on the parameters d and v, we could model changes in μ. These changes in the mean only affect the function f'_N(μ_{t+1} | m'_{t+1}, σ²/n'_{t+1}) in equation (3.3.41). In fact, the effect of the nonstationarity assumption on f'(μ_{t+1}) is identical to the effect of nonstationarity on the prior distribution in the case when only μ was the unknown parameter. In the case of nonstationarity with no drift, i.e., u = 0, for a given posterior distribution of μ_t and σ² at time t, the joint prior density function for μ and σ² is similar to the stationary counterpart, as given in equation (3.3.41), except for the fact that the variance of f'_N(μ_{t+1} | m'_{t+1}, σ²/n'_{t+1}) is larger in the nonstationary case. In other words, σ²/n'_{t+1} in the stationary case is smaller than σ²/n'_{t+1} in the nonstationary case.

The nonstationarity assumption also affects the predictive distribution. For the case when μ is the unknown parameter and the data generating process is normal, assume that after t periods we have a posterior distribution f''(μ_t) which is normal with mean m''_t and variance σ²/n''_t. The predictive distribution at the end of period t was shown in equation (A1.12) to be normal with mean













(3.3.46)   E_t(x_t) = m''_t,

and variance

(3.3.47)   Var_t(x_t) = σ²[(1 + n''_t)/n''_t] = σ²[1 + (1/n''_t)].

If the process is stationary, then the predictive distribution of the random variable of interest at the beginning of period t+1 is the same as the distribution we had at the end of period t, i.e., N(m''_t, σ²[(1 + n''_t)/n''_t]).

However, if we assume the nonstationary condition, the prior distribution of μ at the start of period t+1 has a different mean and a different variance. Consequently, the predictive distribution changes in mean and variance between consecutive time periods. In other words, E_{t+1}(x_{t+1}) is always changing depending on the stochastic change of the mean μ_{t+1}. In the case of nonstationarity with no drift, i.e., u = 0, for a given posterior distribution of μ_t at time t, the only difference between the predictive distribution of x_{t+1} under stationarity and the predictive distribution of x_{t+1} under nonstationarity is the variance term. The variance of x_{t+1} under stationarity, at the start of time period t+1, is

(3.3.48)   Var_{t+1}(x_{t+1}) = σ²[(1 + n'_{t+1})/n'_{t+1}] = σ²[1 + (1/n'_{t+1})].


It was stated previously that the parameter n'_{t+1} is smaller when μ is unknown and nonstationary than when μ is unknown but stationary. Hence, as expected, the variance of the predictive distribution, Var_{t+1}(x_{t+1}), is larger when μ is nonstationary. This has some implications for the determination of prediction intervals, which














we will discuss in detail in Chapter Four. Nonstationarity implies greater uncertainty, which is reflected by an increase in the measure of uncertainty, variance.
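An illustrative computation of the two predictive variances (values invented; the stationary case uses n'_{t+1} = n''_t, the nonstationary case uses (3.3.15)):

    # Predictive variance under stationarity vs nonstationarity,
    # per (3.3.47)-(3.3.48); sigma^2, n'' and n_s are illustrative.
    sigma2, n_post, n_s = 4.0, 40.0, 50.0
    var_stationary = sigma2 * (1.0 + 1.0 / n_post)
    n_next = n_post * n_s / (n_post + n_s)          # (3.3.15)
    var_nonstationary = sigma2 * (1.0 + 1.0 / n_next)
    print(var_stationary, var_nonstationary)        # the latter is larger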

For the case when both μ and σ² are the unknown parameters and the data generating process is normal, assume that after t periods we have a posterior distribution f''(μ_t, σ²) which is normal-gamma-2 with parameters m''_t, n''_t, v''_t and d''_t. The predictive distribution at the end of period t was shown in equation (A1.33) to be Student with mean

(3.3.49)   E_t(x_t) = m''_t,    d''_t > 1,

and variance

(3.3.50)   Var_t(x_t) = [v''_t(n''_t + 1)/n''_t][d''_t/(d''_t - 2)],    d''_t > 2.

Again, if the process is stationary, then the predictive distribution at the beginning of period t+1 is the same as the distribution that we had at the end of period t, i.e., ST(m''_t, [v''_t(n''_t + 1)/n''_t][d''_t/(d''_t - 2)]).

When we assume the nonstationary condition, the joint prior distribution of μ and σ² at the start of period t+1 changes from its original form at the end of period t. The specific random model we are assuming causes the parameters m and n of the distribution of μ to change from the end of period t to the start of period t+1. Therefore, the predictive distribution f'_{t+1}(x_{t+1}) has a different mean and variance than f''_t(x_t). In the case of nonstationarity with no drift, i.e., u = 0, for a given posterior distribution of μ_t and σ²














at time t, the only difference between the predictive distribution of x_{t+1} under stationarity vis-a-vis nonstationarity is the variance term. Observing equation (3.3.50) closely, we note that the effect of nonstationarity is the same as in all previous cases; that is, the parameter n'_{t+1} is smaller when μ is nonstationary and therefore the variance is larger. In this case, since both μ and σ² are unknown, at the end of period t our estimate of the variance is v''_t, which includes all the information that we have available at the time, including sample information.

A comparison of stationary versus nonstationary results when the data generating process is lognormal moves along the same lines as for the normal process. For the case where the unknown parameter is μ, the nonstationarity condition causes an increase in the variance and in the mean of the normal prior distribution, which causes an increase in the mean and variance of the lognormal predictive distribution. Similarly, for the case when both parameters are unknown, the condition causes an increase in mean and variance in the prior distribution of μ and a change in the joint prior distribution of μ and σ², which affects the logStudent predictive distribution. The logStudent predictive distribution has infinite mean and variance, which are not affected by the nonstationary condition.


3.4 Conclusion

In this chapter we modeled nonstationarity in the mean of

normal and lognormal processes under two uncertainty assumptions.














The model is built upon the Bayesian analysis of normal processes of Raiffa and Schlaifer (1961) and upon the analysis of nonstationary means of normal processes, for unknown μ, of Barry (1973). We extended the nonstationary results of Barry (1973) to the lognormal distribution. The variance of the lognormal distribution is given by

(3.4.1)   Var(x) = ω(ω - 1) e^{2μ},

where ω = exp(σ²).
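A quick numerical check (Python, with arbitrary values of μ and σ²) confirms that (3.4.1) agrees with the familiar moment formula Var(x) = (exp(σ²) - 1) exp(2μ + σ²):

    # Verifying (3.4.1) against the standard lognormal variance formula.
    import math

    mu, sigma2 = 1.0, 0.25                  # arbitrary illustrative values
    w = math.exp(sigma2)
    var_341 = w * (w - 1.0) * math.exp(2.0 * mu)                     # (3.4.1)
    var_std = (math.exp(sigma2) - 1.0) * math.exp(2.0 * mu + sigma2)
    print(abs(var_341 - var_std) < 1e-12)   # True: the forms agree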


Since V(x) is a function of μ and σ² in the lognormal case, nonstationarity in μ means that both mean and variance of x are nonstationary, so that the lognormal case provides a generalization of the normal results. Furthermore, we developed the nonstationary model for the mean of normal and lognormal processes for the case when both parameters, μ and σ², are unknown. For each group of assumptions we noted that, in every time period t, the uncertainty is never fully eliminated from the model.

In Chapter Two we emphasized that the exponential distri-

bution was often used to represent life testing models. All the

research in the area of life testing where this distribution has

been used has assumed stationary conditions for the parameters of

the model and for the model itself. Appendix II shows the Bayesian

modeling of nonstationarity for the parameters of an exponential dis-

tribution using random shock models. Only under very trivial as-

sumptions does the analysis yield tractable and consequently useful

results. On the other hand, as was shown in this chapter, the normal














and lognormal distributions provide results that are especially

tractable.

In any given period t, the prior, posterior and predictive distributions depend on the parameters m_t and n_t when only μ is unknown, and on the parameters m_t, n_t, v_t and d_t when both μ and σ² are unknown. Under the nonstationarity conditions, these parameters change from period to period not only because new information becomes available through the sample, but because of the additional uncertainty involving the shifts in the parameter μ. To make better

use of these distributions the decision maker must know how they are

evolving through time. Management requires realistic and accurate

information to aid in decision making. For instance the decision

maker can be interested in knowing how the variance of the distri-

bution of the mean, p, changes across time. Furthermore, since one

of the objectives of the user of the distribution is to construct

prediction intervals for the process variable he can be interested

in knowing how the variance of the predictive distribution behaves

as the number of observed periods increases. We will address this

problem in detail in Chapter Four through the study of the limiting

behavior of the parameters m_t, n_t, v_t and d_t. In addition, attention

will be focused on the methods of constructing prediction intervals

for the normal, Student, lognormal and logStudent distributions

under various uncertainty conditions.


















CHAPTER FOUR


LIMITING RESULTS AND PREDICTION INTERVALS FOR NONSTATIONARY

NORMAL AND LOGNORMAL PROCESSES

4.1 Introduction

In Chapter Three we emphasized that for many real world data

generating processes the assumption of stationarity is questionable and

stochastic parameter variation seems to be a reasonable assumption. If

a data generating process characterized by some parameter is nonstation-

ary, then it is potentially misleading to make inferences and decisions

concerning the parameter as if it only took on a single value. We should

be concerned with a sequence of values of the parameter corresponding to

different time periods. It was shown in Chapter Three that if we use a

particular stochastic model we can model nonstationarity for the shift

parameter of normal and lognormal processes from a Bayesian viewpoint,

under two uncertainty conditions, and that we can obtain tractable

results. In particular, values of the parameter for successive time

periods are assumed to be related as

(4.1.1)   μ_{t+1} = μ_t + ε_{t+1},    t = 1, 2, ...,

where ε_{t+1} is a normal "random shock" term independent of μ_t with known mean u and variance σ²_ε. The mean in any period t is equal to the mean in the previous period plus an increment ε, which has a normal distribution with known mean.

Comparing the stationary with the nonstationary processes we

pointed out that when the data generating process is normal or log-














normal and the unknown parameter is μ, the nonstationary condition causes in any given period t an increase in the variance of the normal prior distribution. This causes an increase in the variance of the normal predictive distribution for normal processes and causes an increase in the mean and variance of the lognormal predictive distribution for lognormal processes. When both parameters, μ and σ², are unknown, a similar result is found for the prior and predictive distributions of the normal and lognormal data generating processes.

The results discussed in Chapter Three have to do with the

period to period effects of random parameter variation upon the prior

and predictive distributions. However, the asymptotic behavior of the

model has important implications for the decision maker. For instance,

when only p is the unknown parameter, under constant parameters uncer-

tainty about p eventually is eliminated since n' increases without

bound and the sequence of prior variances (o2/nd) converges to zero.

Hence the distribution of t eventually will be unaffected by further

samples. On the other hand, shifting parameters could increase the uncer-

tainty under which a decision must be made since it reduces the infor-

mation content that past samples offer for the actual situation. Increases

in uncertainty, caused by stochastic parameter variation, have important

implications for the decision maker since his decisions depend upon

the uncertainty under which they are made. Similarly, random parameter

variation produces important differences in the limiting behavior of

the prior and predictive distributions when μ and σ² are the unknown

parameters. In Section 4.2 we study the limiting behavior of the param-














eters m'_t, v'_t, n'_t, and d'_t of the prior and predictive distributions for

the normal and lognormal data generating processes. In addition we dis-

cuss the implications of these limiting results for the inferences and

decisions based on the posterior and predictive distributions.
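The contrast can be sketched numerically (Python; n, n_s, and the starting precision are invented): under stationarity n'_t grows without bound, while under the shock model the recursion n'_{t+1} = (n'_t + n)n_s/[(n'_t + n) + n_s] from (3.3.32) settles at a finite fixed point.

    # Limiting behavior of n'_t: stationary accumulation vs the shock model.
    n, n_s = 25.0, 50.0
    n_stat = n_shift = 5.0
    for t in range(200):
        n_stat += n                                            # grows without bound
        n_shift = (n_shift + n) * n_s / ((n_shift + n) + n_s)  # per (3.3.32)
    print(n_stat, n_shift)   # 5005.0 vs a fixed point (here 25.0)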

In any period t, all the information contained in the initial

prior distribution and in subsequent samples is fully reflected in the

posterior and the predictive distributions. In some applications, partial

summaries of the information are of special importance. One important

way to partially summarize the information contained in the posterior

distribution is to quote one or more intervals which contain a stated

amount of probability. Often the problem itself will dictate certain

limits which are of special interest. A rather different situation

occurs when there are no limits of special interest, but an interval

is needed to show a range over which "most of the probability lies".

One objective of this thesis is to develop Bayesian prediction

intervals for future observations that come from normal and lognormal

data generating processes. In particular, we are interested in most plau-

sible Bayesian prediction intervals of cover P as were defined in Section

2.2. In Section 4.3 we discuss the problem of constructing prediction

intervals for normal, Student, lognormal and logStudent distributions.

It is pointed out that it is easy to construct these intervals for the

normal and Student distributions but that it is rather difficult for

the lognormal and logStudent distributions. An algorithm is presented

to compute the Bayesian prediction intervals for the lognormal and log-

Student distributions. In addition, we discuss the relationship that












exists between Bayesian prediction intervals under nonstationarity

and classical certainty equivalent and Bayesian stationary intervals.



4.2 Special Properties and Limiting Results Under Nonstationarity

4.2.1 Limiting Behavior of m'_t and n'_t When μ is the Only Unknown Parameter

For a process that has a normal density function with unknown parameter μ, Raiffa and Schlaifer (1961) show that the natural conjugate prior distribution is normal with parameters m' and σ²/n'. In Section 3.3 we pointed out that if the mean, μ, of the data generating process does not change from period to period except by the effect of the sample information, then each posterior can be thought of as a prior with respect to a subsequent sample. In general, if we assume that a sample of size n_t is employed every time a sample is taken [which yields a statistic m_t = (Σ_{i=1}^{n_t} x_{ti})/n_t] and if we assume that the mean μ is stationary, then in any given period t the posterior distribution of μ is normal with parameters n''_t and m''_t given by


(4.2.1)   n''_t = n'_t + n_t,

and

(4.2.2)   m''_t = (n'_t m'_t + n_t m_t)/(n'_t + n_t).


In order to study the limiting values of n'_t and m'_t under stationary conditions, we have to characterize the posterior and predictive distributions after t periods of time have elapsed. Since the limiting results under nonstationary means will be based on a fixed sample size each period, we will make the same assumption for the stationary limiting results, that is, n_t = n for all t. In period one, for a process that has a normal density function with unknown parameter μ, i.e., f_N(x | μ), the natural conjugate prior is normal with mean m'_1 and variance σ²/n'_1, i.e., f_N(μ | m'_1, σ²/n'_1). If a sample of size n from a normal process yields the sufficient statistics m_1 and n, then the posterior and predictive distributions at the end of period one are given by


(4.2.3)   f''_N[μ | (n'_1 m'_1 + n m_1)/(n'_1 + n), σ²/(n'_1 + n)] = f''_N(μ | m''_1, σ²/n''_1)

or

          = f'_N(μ | m'_2, σ²/n'_2),

and

(4.2.4)   f_N(x | m''_1, σ²(1 + n''_1)/n''_1),

respectively.

In period two, if a sample is taken from a normal process that yields the sufficient statistics m_2 and n, then the posterior and predictive distributions at the end of the period are given by

(4.2.5)   f''_N[μ | (n'_1 m'_1 + n(m_1 + m_2))/(n'_1 + 2n), σ²/(n'_1 + 2n)] = f''_N(μ | m''_2, σ²/n''_2)

or

          = f'_N(μ | m'_3, σ²/n'_3),

and

(4.2.6)   f_N(x | m''_2, σ²(1 + n''_2)/n''_2),

respectively.
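A sketch of the accumulation implied by (4.2.1)-(4.2.6) over several periods (Python; the prior values and sample means are invented): each posterior serves as the next period's prior, so after t periods n'' = n'_1 + tn.

    # Stationary accumulation over t periods, per (4.2.1)-(4.2.6).
    def after_t_periods(m1, n1, sample_means, n):
        m_p, n_p = m1, n1
        for m_t in sample_means:
            n_post = n_p + n
            m_post = (n_p * m_p + n * m_t) / n_post
            m_p, n_p = m_post, n_post   # posterior becomes the next prior
        return m_p, n_p

    m_final, n_final = after_t_periods(10.0, 2.0, [11.8, 12.1, 12.4], n=25)
    print(m_final, n_final)   # n_final = 2 + 3*25 = 77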



