
Citation 
 Permanent Link:
 http://ufdc.ufl.edu/AA00032923/00001
Material Information
 Title:
 Reliability analysis applied to modeling of hydrologic processes
 Creator:
 Maalel, Khlifa, 1955-
 Publication Date:
 1983
 Language:
 English
 Physical Description:
 xvi, 358 leaves : illustrations ; 28 cm
Subjects
 Subjects / Keywords:
 Hydrological modeling ( jstor )
 Mathematical independent variables ( jstor )
 Mathematical variables ( jstor )
 Maximum likelihood estimations ( jstor )
 Modeling ( jstor )
 Parametric models ( jstor )
 Rain ( jstor )
 Statistical estimation ( jstor )
 Statistical models ( jstor )
 Statistics ( jstor )
 Hydrologic models -- Reliability ( lcsh )
 Hydrology -- Statistical methods ( lcsh )
 Genre:
 bibliography ( marcgt )
 theses ( marcgt )
 nonfiction ( marcgt )
Notes
 Bibliography:
 Includes bibliographical references (leaves 345-357).
 General Note:
 Typescript.
 General Note:
 Vita.
 Statement of Responsibility:
 by Khlifa Maalel.
Record Information
 Source Institution:
 University of Florida
 Holding Location:
 University of Florida
 Rights Management:
 The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for nonprofit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
 Resource Identifier:
 11826709 ( OCLC )
ocm11826709 30550750 ( Aleph number )

Full Text 
RELIABILITY ANALYSIS APPLIED
TO MODELING OF
HYDROLOGIC PROCESSES
By
Khlifa Maalel
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1983
To my Mother, Father
and my lovely 0
ACKNOWLEDGEMENTS
This research would not have been completed without the valuable
guidance of Dr. Wayne C. Huber, the continuous interest in this topic of Dr. James P. Heaney and encouragement from Dr. Barry A. Benedict, Dr. Clyde Kiker and Dr. Gabriel Bitton. To all of them goes my deep appreciation.
Special thanks go to Janet Wilson for her excellent typing of this dissertation and to Dibbie Dunnam for helping in typing the references.
Appreciation is expressed to Bob Dickinson for his friendship and for helping in solving many of the computer puzzles. Also, I want to
thank all my friends in the Environmental Resources Management program at the University of Florida for their friendship and sense of humor which made me feel at home.
My thanks are also offered to the Tunisian government for the high priority it is giving to scientific research and higher education, without which I would not have been able to pursue my studies.
Part of this research was supported by the South Florida Water Management District and by the U.S. Environmental Protection Agency. Computations were made using the Northeast Regional Data Center computer facilities at the University of Florida.
TABLE OF CONTENTS
Page
ACKNOWLEDGEMENTS..................................................iii
LIST OF TABLES...................................................viii
LIST OF FIGURES....................................................xi
NOTATION..........................................................xii
ABSTRACT..........................................................xv
CHAPTER 1: INTRODUCTION............................................1
1.1. Generalities.............................................1
1.2. Reliability Definitions and Indices......................8
1.2.1. The concept of reliability.......................8
1.2.2. Reliability definitions..........................9
1.2.3. Reliability indices.............................14
1.3. Overview of the Study...................................16
1.4. Overview of the Results.................................19
1.5. Research Uniqueness.....................................22
CHAPTER 2: LITERATURE REVIEW......................................23
2.1. Classification of Hydrologic Models.....................23
2.2. Statistical Parameter Estimation........................26
2.2.1. Introduction....................................26
2.2.2. Properties of estimators........................27
2.2.3. Least squares method............................28
2.2.4. Maximum likelihood..............................29
2.2.5. Weighted least squares..........................31
2.2.6. Method of moments................................34
2.2.7. Order statistic based least squares.............35
2.2.8. Conclusion......................................39
2.3. Probability Distributions................................ 39
2.3.1. Introduction....................................39
2.3.2. Discrete distributions..........................40
2.3.3. Continuous distributions........................40
2.3.4. Logarithmically derived distributions...........40
2.3.5. Other derived distributions.....................44
2.4. Summary.................................................47
CHAPTER 3: GENERALIZED PROBABILITY DISTRIBUTIONS..................49
3.1. Generalized Gamma Distribution..........................49
3.1.1. Introduction....................................49
3.1.2. Historical background...........................49
3.1.3. Linear regression and confidence limits.........56
3.1.4. Summary and implications........................59
3.2. New Parameterization of the GGD.........................62
3.2.1. Generalized normal distribution (GND)...........62
3.2.2. Generalized extreme value distribution (GED)....63
3.2.3. Generalized Rayleigh distribution (GRD).........64
3.2.4. Generalized Pearson distribution (GPD)..........65
3.2.5. Other generalized distributions.................68
3.2.6. Summary.........................................71
3.3. Generalized Probability Distribution Computer
Program (GPDCP).........................................74
3.3.1. Solution algorithm..............................74
3.3.2. Program description.............................77
CHAPTER 4: ILLUSTRATIVE EXAMPLES..................................85
4.1. Introduction............................................85
4.2. Illustrative Example 1..................................85
4.3. Illustrative Example 2.................................101
4.4. Sensitivity Analysis...................................122
4.4.1. Sensitivity to the plotting position
definition.....................................122
4.4.2. Sensitivity to the change of scale.............129
4.4.3. Sensitivity to the objective function
formulation and to the estimation procedure... .133
4.5. Summary and Conclusions.................................138
CHAPTER 5: RELIABILITY ANALYSIS...................................141
5.1. Introduction...........................................141
5.2. Second Moment Reliability Modeling.....................141
5.2.1. Normal case....................................141
5.2.2. Lognormal case.................................143
5.3. Third Order Reliability Modeling.......................144
5.3.1. Introduction...................................144
5.3.2. Approximate relations between moments of
transformed and untransformed variables........145
5.3.3. Third order reliability representation.........148
5.4. Sensitivity Analysis...................................150
5.4.1. Sensitivity of the predicted pth percentile
to the shape of the distribution...............150
5.4.2. Sensitivity of the level of reliability to the
shape of the distribution and to the first
order approximation............................164
5.4.3. Sensitivity of design event magnitude to the
design period and to the shape of the
distribution...................................171
5.5. Summary and Conclusions................................174
CHAPTER 6: CASE STUDY............................................175
6.1. Introduction...........................................175
6.2. Case Study.............................................176
6.2.1. Sites and data description......................176
6.2.2. Temporal and spatial variability of the
recorded data..................................176
6.2.3. Continuous versus single event simulation......186
6.3. Deterministic Models...................................187
6.3.1. Introduction...................................187
6.3.2. Distribution free statistical analysis.........188
6.3.3. Calibration and verification of deterministic
models.........................................197
6.3.4. Water quantity and quality continuous
simulation.....................................199
6.3.5. Summary and conclusions........................212
6.4. Probabilistic Models...................................212
6.4.1. Introduction...................................212
6.4.2. Annual rainfall series.........................214
6.4.3. Event statistics...............................215
6.4.4. Quality Statistics.............................217
6.4.5. Conclusion.....................................220
6.5. Stochastic Models......................................220
6.5.1. Introduction...................................220
6.5.2. Model description..............................222
6.5.3. Parameter estimation and goodness of fit
evaluation.....................................223
6.5.4. Annual ARMA(1,1) model.........................226
6.5.5. Monthly ARMA(1,1) model........................235
6.5.6. Reliability of estimated parameters............243
6.6. Summary and Conclusions.................................246
CHAPTER 7: SUMMARY AND CONCLUSIONS...............................250
7.1. Summary................................................250
7.2. Conclusion.............................................251
7.3. Suggestion for Further Research.........................253
APPENDICES........................................................
A. Linear Regression........................................255
B. Exact Relations Between Moments of the Normal and
Lognormal Distributions..................................258
C. Probability Distributions.................................261
D. GPDCP Source Program.....................................274
E. Modified Kite Program....................................284
F. Coefficient of Reliability as a Function of ALFA and V_y...293
G. Reliability as a Function of COR and V ..................318
H. Monthly and Annual Rainfall at Eight NWS Stations.........333
I. Extension of the Akaike and Bayesian Information
Criteria to the Box-Cox Transformation..................341
REFERENCES........................................................345
BIOGRAPHICAL SKETCH...............................................358
LIST OF TABLES
Table Page
3.1. Special Distributions of the GGD............................58
3.2. Optimal Power Transformation and Information Number Ratio...61
3.3. Special Distributions of the New Parametrized Generalized
Distributions...............................................66
3.4. Other Special Distributions of the GGD Not Included in
This Study..................................................69
3.5. New Generalized Family of Distributions.....................73
4.1. Annual Maximum Daily Runoff and Statistics of Original
and Log Transformed Flows, Example 1........................87
4.2. Standard Errors for Example 1...............................88
4.3. Models, Parameters, and Selection Statistics, Example 1,
Runoff Series...............................................90
4.4. Optimal Selection Statistics and Corresponding
Transformation (a) for Example 1............................92
4.5. Best Model Based on R2 Selection Statistics, Example 1,
Runoff......................................................94
4.6. Detailed Statistics for the R2 Selected Model, Example 1....96
4.7. Best Model Based on STDE Selection Statistic, Example 1.....97
4.8. Detailed Statistics for the STDE Selected Model,
Example 1...................................................98
4.9. Detailed Statistics for the WSS Selected Model, Example 1..103
4.10. Best Model Based on the WSS Selection Statistics,
Example 1..................................................104
4.11. Total Annual Runoff and Statistics of Original and Log
Transformed Flow for Example 2.............................105
4.12. Total Annual Rainfall and Statistics of Original and Log
Transformed Data...........................................106
4.13. Standard Errors for Example 2..............................108
4.14. Models, Parameters, and Selection Statistics, Example 2,
Runoff.....................................................109
4.15. Optimal Selection Statistic and Corresponding
Transformation (a), Example 2, Annual Total Runoff.........110
4.16. Best Model Based on R2 Selection Statistic, Example 2,
Runoff.....................................................112
4.17. Detailed Statistics for the R2 Selected Model, Example 2,
Runoff.....................................................113
4.18. Best Model Based on the STDE Selection Statistic,
Example 2, Runoff..........................................114
4.19. Detailed Statistics for the STDE Selected Model,
Example 2, Runoff........................................115
4.20. Models, Parameters, and Selection Statistics, Example 2,
Rainfall...................................................118
4.21. Optimal Selection Statistics and Corresponding (a),
Example 2, Annual Total Rainfall...........................119
4.22. Best Model Based on R2 Selection Statistics, Example 2,
Rainfall...................................................120
4.23. Detailed Statistics for the R2 Selected Model, Example 2,
Rainfall...................................................121
4.24. Best Model Based on the STDE Selection Statistic,
Example 2, Rainfall........................................123
4.25. Detailed Statistic for the STDE Selected Model,
Example 2, Rainfall........................................124
4.26. Sensitivity of Optimal Selection Statistic (R2) and
Corresponding Transformation (a) to Plotting Position
Definition.................................................127
4.27. Sensitivity of Optimal Selection Statistic STDE and
Corresponding Transformation to Plotting Position
Definition.................................................128
4.28. Selection Statistics for St. Marys Runoff Data Expressed
in cfs and Dimensionless Units.............................130
4.29. Selection Statistics for Kissimmee River Runoff Data
Expressed in Inches, Decimeters and Meters.................132
4.30. Nonlinear Parameter Estimation, Original Variables,
Starting Values AF=0.40, A=13, and B=100...................134
4.31. Nonlinear Parameter Estimation, Original Variables,
Starting Values AF=0.41, A=15, and B=117...................135
4.32. Nonlinear Parameter Estimation, Transformed Variables,
Starting Values AF=0.41, A=13, and B=117...................136
4.33. Nonlinear Parameter Estimation, Transformed Variables,
Starting Values AF=0.40, A=15, B=100.......................137
5.1. Power to Lognormal Percentile Ratio, 0 < a < 1, P=95%......154
5.2. Power to Lognormal Percentile Ratio, 0 < a < 1, P=99%......155
5.3. Power to Lognormal Percentile Ratio, 0 < a < 1, P=99.9%....156
5.4. Power to Lognormal Percentile Ratio, 0 < a < 1, P=99.999%..157
5.5. Power to Lognormal Percentile Ratio, 0 < a < 1,
P=99.9999%.................................................158
5.6. Power to Lognormal Percentile Ratio, 0 > a > -1, P=95%.....159
5.7. Power to Lognormal Percentile Ratio, 0 > a > -1, P=99%.....160
5.8. Power to Lognormal Percentile Ratio, 0 > a > -1, P=99.9%...161
5.9. Power to Lognormal Percentile Ratio, 0 > a > -1,
P=99.999%..................................................162
5.10. Power to Lognormal Percentile Ratio, 0 > a > -1,
P=99.9999%.................................................163
5.11. Sensitivity of the Ratio, Design Event Magnitude/Mean
(1/COR), to the Design Period and to the Shape of the
Distribution...............................................173
6.1. Characteristics of the Four Urban Sites....................179
6.2. Hourly Rainfall Station Identification.....................180
6.3. Time Variability of Rainfall, Runoff, and Runoff
Coefficient.............................................. 182
6.4. Spatial Variability of Rainfall within the Urban Basins....183
6.5. Temporal and Spatial Variability of the Seasonal Yearly
Total Rainfall Over the South of Florida..................185
6.6. Monthly Storm Event Statistics at Miami Airport Station....189
6.7. Annual Storm Event Statistics at Miami Airport Station.....190
6.8. Time and Space Variability of the Storm Event based
Statistics.................................................196
6.9. Highway Basin Calibration by Adjusting the Width...........200
6.10. Calibration results for the Storm Water Management Model
at the South Florida Four Basins...........................201
6.11. Yearly and Monthly Summaries from SWMM for the Multifamily
Urban Basin................................................202
6.12. Ranked Hourly Rainfall for the Multifamily Urban Basin.....203
6.13. Ranked Runoff and Pollutant Concentration for the
Multifamily Basin..........................................204
6.14. Options Selection for STATS Analysis.......................208
6.15. Empirical Frequency Generated by STATS for the Multifamily
Basin......................................................209
6.16. Monthly Event Statistics of Generated Runoff at the
Multifamily Basin..........................................210
6.17. Annual Event Statistics of Generated Runoff at the
Multifamily Basin..........................................211
6.18. GPDCP Selected Transformation and Corresponding Optimal
Statistics for the 8 NWS Stations..........................213
6.19. Sensitivity Analysis of the Optimal MXLF Statistics to
the Plotting Position Definition...........................216
6.20. GPDCP Optimal Selection Statistic Summary, Belle Glade
Storm Event Characteristics................................218
6.21. GPDCP Optimal Selection Statistics for COD Pollutant
Loads, Miami Multifamily Basin.............................219
6.22. ARMA(1,1) Parameter Estimates for Annual Rainfall Series...227
6.23. Skewness Coefficient for Transformed Annual Series.........229
6.24. Residual Variance and Minimum Sum of Squares for Annual
Rainfall Series............................................230
6.25. Akaike Information Criteria for the ARMA(1,1) Models of
the Annual Rainfall Series.................................233
6.26. Bayesian Information Criteria for the ARMA(1,1) Models
of the Annual Rainfall Series..............................234
6.27. First Estimates of ARMA(1,1) Parameters for Three
Transformations of Monthly Rainfall Series.................236
6.28. FTMXL Parameter Estimates of the ARMA(1,1) Monthly Models..237
6.29. First and Final Estimates of ARMA(1,1) Parameters for
Square Root of Monthly Rainfall............................238
6.30. Residual Variance and Minimum Sum of Squares for the
ARMA(1,1) Rainfall Model...................................239
6.31. Sensitivity of the Monthly and Annual Skewness to the
Transformation of Monthly Rainfall Series..................242
6.32. Akaike and Bayesian Information Criteria for the Monthly
ARMA(1,1) Model............................................244
6.33. Monte Carlo Simulation Summary Statistics..................245
LIST OF FIGURES
Figure Page
1.1. Error Distribution About the Fitted Model....................5
1.2. Schematic Representation of the Engineering Design Process.....................................................10
1.3. Reliability and Risk Definitions............................12
2.1. Hydrologic Model Classification.............................24
3.1. Generalized Gamma Distribution..............................53
3.2. Flow Chart for the Generalized Probability Distribution Computer Program (GPDCP)....................................78
4.1. Linear Plot of R2 Selected Model, Example 1.................99
4.2. Linear Plot of STDE Selected Model, Example 1..............100
4.3. Linear Plot of MXLF Selected Model, Example 1..............102
4.4. Linear Plot of R2 Selected Model, Example 2, Runoff........116
4.5. Linear Plot of STDE Selected Model, Example 2, Runoff......117
4.6. Linear Plot of R2 Selected Model, Example 2, Rainfall......125
4.7. Linear Plot of STDE Selected Model, Example 2, Rainfall....126
5.1. Reliability Versus Standardized Mean, a=1.0................166
5.2. Reliability Versus Standardized Mean, a=0.60...............167
5.3. Reliability Versus Standardized Mean, a=0.0................168
5.4. Reliability Versus Standardized Mean, a=-0.6...............169
5.5. Reliability Versus Standardized Mean, a=-1.0...............170
6.1. Location of the Four Urban Basins..........................177
6.2. Location of the Eight NWS Hourly Rainfall Stations.........178
6.3. Seasonal Variability of the Time Between Events............192
6.4. Seasonal Variability of the Average Intensity..............194
6.5. Frequency Plot and Statistical Parameters of Generated
Flows at the Multifamily Basin.............................205
6.6. Frequency Plot and Statistical Parameters of Generated
COD Loads at the Multifamily Basin.........................206
6.7. GPDCP Modeling of SWMM Generated Hourly COD Loads..........221
6.8. Sum of Squares Surface Contours for Belle Glade Annual
Rainfall Series............................................231
6.9. Sum of Squares Surface Contours for Belle Glade Monthly
Rainfall Series............................................245
6.10. Annual Series Autocorrelation Functions, a=0.0.............247
6.11. Annual Series Autocorrelation Functions, a=1.3.............248
NOTATION
A regression parameter
a GGD parameter
ACF Autocorrelation Function
AF transformation parameter
ALFA transformation parameter
AIC Akaike Information Criteria
ARMA Autoregressive Moving Average Model
B regression parameter
b GGD parameter
BIC Bayesian Information Criteria
CDF Cumulative Distribution Function
COD Chemical Oxygen Demand
COR Coefficient of Reliability
CP Multiplicative constant
CT Additive constant
F Cumulative Distribution Function
f Probability density function
fs Standardized probability density
g Fitted model
GED Generalized Extremal Distribution
GGD Generalized Gamma Distribution
GPD Generalized Pearson Distribution
GPDCP Generalized Probability Distribution Computer Program
GRD Generalized Rayleigh Distribution
GRI Generalized Reliability Index
h probability density function
H Cumulative distribution function
IMSL International Mathematical Statistical Library
k, K GGD shape parameters
L likelihood function
ℓ log-likelihood function
MXLF Maximum likelihood selection statistics
n number of observations
P9 0 Probability value
p number of parameters
P confidence level
pdf probability density function
Pr Probability
q GGD shape parameter
R Reliability
R2 Coefficient of determination, selection statistics
SAS Statistical Analysis System
SF Safety Factor
STD Standard Deviation
STDE Standard error, selection statistics
V Coefficient of variation
w weight in least squares regression
WSS Weighted Sum of Squares
x Explanatory variable
y original variable
Y transformed variable
Z standardized variate
z expected order statistics
α transformation parameter
ARMA(1,1) autoregressive parameter
γ skewness coefficient
μ_y mean of original variables
μ_Y mean of transformed variables
ν number of degrees of freedom
ψ digamma function
trigamma function
Π product symbol
test level
Σ summation symbol
θ model parameters
θ1 ARMA(1,1) moving average parameter
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
RELIABILITY ANALYSIS APPLIED
TO MODELING OF
HYDROLOGIC PROCESSES
By
Khlifa Maalel
December 1983
Chairman: Wayne C. Huber
Cochairman: James P. Heaney
Major Department: Environmental Engineering Sciences
Parameter estimation in the modeling of physical (natural) processes in general and of hydrologic processes in particular is one of the most challenging problems still puzzling scientists and engineers. This research is an attempt to shed new light on such problems by suggesting a uniform reliability based approach to the estimation of the parameters of deterministic, probabilistic and stochastic models and to the analysis of model predictions.
New families of generalized probability distributions are defined. These reduce to the simple form of location and scale type distributions when the data are transformed by the Box-Cox transformation. The transformation parameter is introduced as a shape parameter for the new families of distributions. An algorithm for estimating the location and scale parameters separately from the nonlinear parameter is developed and translated into a FORTRAN computer program. This program selects the best model from among four generalized probability distributions from the normal, extremal, Rayleigh, and Pearson families. The selection may be based on any of four criteria: the standard error, the coefficient of determination of the simple linear regression, the weighted sum of squares, and the maximum likelihood function. These four statistics exhibit a high sensitivity to the shape parameter, a small sensitivity to the plotting position definition and no sensitivity to the change of scale on which the data are expressed. For all selection statistics, the four analyzed families of distributions performed equally well in modeling most of the analyzed series.
New relations between the moments of the original and Box-Cox transformed variables are derived, based on a Taylor series expansion of the transformation. These relations allow the extension of the classical second moment reliability theory, based on the assumption of normality, to a third order reliability theory, wherein deviation from normality is explicitly accounted for through the transformation parameter. Generalized expressions for different measures and indices of reliability are derived, and their high sensitivity to the shape parameter is illustrated by generated tables and plots of reliability for several transformation parameters and coefficients of variation.
The potential applicability of the reliability based approach to the modeling of hydrologic processes is illustrated by the analysis of several hydrologic series from southeast Florida.
CHAPTER 1
INTRODUCTION
1.1. Generalities
One of the main tasks of scientists in general and engineers in particular is that of data reduction. An engineer who has compiled tables of data wishes to reduce them to a more convenient and comprehensive form (Bard, 1974). This may be accomplished, for example, by plotting the data, reducing them to a graphical form, and by selecting a class of functions and choosing from this class the one that best fits the data. The functional form may be expressed mathematically as
$$ y = g(x, \theta) \qquad (1.1.1) $$
where $y = (y_1, y_2, \ldots, y_n)$ is the dependent variable, $x = (x_1, x_2, \ldots, x_n)$ is the independent variable (explanatory variable) and $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$ are the parameters; and where n and p are the number of measurements and number of parameters, respectively.
The parameters are usually chosen to give the "best fit" to the data. Several fit criteria may be used, among which the most widely used is the least squares criterion, in which the sum of squares (SS) is minimized:
$$ SS = \sum_{i=1}^{n} [y_i - g(x_i, \theta)]^2 \qquad (1.1.2) $$
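As an illustration of the least squares criterion, the following is a minimal sketch that assumes the simplest possible functional form, a straight line g(x, θ) = θ1 + θ2 x, for which the minimizing parameters have a closed form. The data values and the function names (`sum_of_squares`, `fit_line`, `line`) are hypothetical and are not taken from this study.

```python
def line(x, theta):
    """Assumed functional form: g(x, theta) = theta[0] + theta[1] * x."""
    return theta[0] + theta[1] * x


def sum_of_squares(theta, xs, ys, g):
    """Sum of squared deviations between observations and the model g."""
    return sum((y - g(x, theta)) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Closed-form least squares estimates (intercept, slope) for a line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return (intercept, slope)


# Hypothetical measurements, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

theta_hat = fit_line(xs, ys)
ss_min = sum_of_squares(theta_hat, xs, ys, line)
```

Any other parameter pair, for example (0.0, 2.0), gives a sum of squares at least as large as `ss_min`, which is what "best fit" means under this criterion.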
If the selection of the functional form g is mainly based on computational considerations, then the physical nature of the process generating the data will be reproduced to a minor extent by g. In this case, the parameter estimation procedure is called curve fitting, and
the fitted parameters will usually bear little physical significance if at all. These curves are useful tools for summarizing data and interpolating between tabulated values, but they have limited use outside the range of measurements (e.g., prediction and extrapolation purposes).
On the other hand, if the selection of g is based on some theoretical considerations of the physical laws governing the behavior of the natural system generating the data, then the procedure is called model
fitting (calibration). The parameters of such models usually represent quantities that have physical significance. The estimation of these parameters is a much more complicated problem than simple curve fitting (Bard, 1974). This is so because in addition to fitting the data well, these parameters should preserve their physical meaning by coming fairly
close to the "true" values. Unfortunately, such true values are usually unknown; otherwise there would be no reason for performing the experiment. Thus, due to experimental uncertainties, even if the functional form, g, is correct we can never expect to obtain the true parameters and the consequent model predictions with absolute certainty. This is a consequence of the deterministic nature of g, wherein each variable is represented by a unique value, which is in contradiction to its uncertain (random) nature. Such uncertainty was described by Bevington (1969) in the following words:
It is a well-established rule of scientific investigation that the
first time an experiment is performed the results bear all too
little resemblance to the truth being sought. As the experiment is repeated, with successive refinements of technique and method, the
results gradually and asymptotically approach what we may accept
with some confidence to be a reliable description of events.
This is even more true in hydrologic modeling, where in addition to experimental uncertainty, the modeled processes are highly variable in nature, and where engineering decisions are often based on reliability
analysis. Contrary to deterministic modeling, where the objective is to dampen uncertainty, the objective of reliability analysis is to magnify it in order to represent, predict and understand the uncertainty of
reality. Thus, the functional form of Equation 1.1.1 may be modified to include an error term, e, explicitly, with zero mean and variance $\sigma^2$,
$$ y = g(x, \theta) + e \qquad (1.1.3) $$
It is the analysis of this error that will characterize the performance of the model, measure the deviation of the predicted from observed values and allow the construction of confidence limits about $g(x, \theta)$.
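The error analysis described above can be sketched as follows. This is a simplified illustration, not the procedure developed in this study: it assumes normally distributed errors, ignores the uncertainty of the estimated parameters, and uses hypothetical observed and fitted values. The function names are also assumptions.

```python
import math
from statistics import NormalDist


def residual_summary(ys, fitted, n_params):
    """Return the mean residual and the residual standard deviation.

    The variance uses n - p degrees of freedom, where p is the number
    of fitted parameters.
    """
    residuals = [y - f for y, f in zip(ys, fitted)]
    n = len(residuals)
    mean_resid = sum(residuals) / n
    variance = sum(r ** 2 for r in residuals) / (n - n_params)
    return mean_resid, math.sqrt(variance)


def approximate_limits(fitted_value, sigma, level=0.95):
    """Two-sided limits about a fitted value, assuming normal errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return fitted_value - z * sigma, fitted_value + z * sigma


# Hypothetical observations and fitted model values, for illustration only.
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
fitted = [2.04, 4.03, 6.02, 8.01, 10.0]

mean_resid, sigma = residual_summary(ys, fitted, n_params=2)
lower, upper = approximate_limits(fitted[2], sigma, level=0.95)
```

A mean residual close to zero is consistent with the zero-mean assumption on the error term, and `sigma` sets the width of the band drawn about the fitted model.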
Modeling uncertainty has become an important task of modern engineering analysis (Ditlevsen, 1981). In fact, it was not until the late 1960's that reliability began to interest a wide circle of scientists
and engineers (Lomnicki, 1973). The spread was so fast among different engineering fields that it became hard to formulate a unique and simple definition or measure of reliability. Some of the most common definitions will be given in the next section, but first, applications of reliability concepts to hydrologic modeling are presented through a review of related studies.
Wood and Rodriguez-Iturbe (1975a, 1975b) and Kite (1975) were among the pioneers in the analysis of reliability in flood flow frequency modeling. They based their analysis on Monte Carlo simulation of extreme events using different probability distribution models. Wood and Rodriguez-Iturbe adopted a Bayesian framework to account for parameter and model uncertainties, while Kite assumed that the error term, e, is normally distributed and derived approximate confidence limits for the predicted values. Lwin and Maritz (1980) noted that there is no need for such a normality assumption if the dependent variable y has a distribution from a location and scale family, i.e.,
$$ \Pr(y \le y_0 \mid x = x_0) = F\left(\frac{y_0 - g(x_0, \theta)}{\sigma}\right) $$
where F is the cumulative distribution function of the standardized residuals. The error distribution and confidence limits about the fitted model are illustrated by Figure 1.1.
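As a rough illustration of the location and scale idea, the sketch below evaluates the empirical distribution function F of the standardized residuals at a standardized threshold, with no normality assumption, and reports both F and its complement. The straight-line model, the data and the function names are hypothetical; they are not taken from Lwin and Maritz (1980) or from this study.

```python
from bisect import bisect_right
from statistics import mean, pstdev

def fit_line(x, y):
    """Least-squares fit of a hypothetical model g(x) = a + b*x."""
    xm, ym = mean(x), mean(y)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    sxx = sum((xi - xm) ** 2 for xi in x)
    b = sxy / sxx
    return ym - b * xm, b

def non_exceedance_probability(x, y, x0, y0):
    """Empirical F evaluated at (y0 - g(x0)) / sigma."""
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sigma = pstdev(resid)                    # scale of the error term
    z = sorted(r / sigma for r in resid)     # standardized residuals
    z0 = (y0 - (a + b * x0)) / sigma         # standardized threshold
    return bisect_right(z, z0) / len(z)      # fraction of residuals <= z0

# Illustrative (made-up) observations.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_obs = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
p = non_exceedance_probability(x_obs, y_obs, x0=3.0, y0=6.5)
print("F =", p, " 1 - F =", 1.0 - p)
```

With so few points the empirical F is a coarse step function; a smoothed or fitted F would normally be preferred in practice.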
Klemes and Bulu (1979) investigated by a split sample experiment and Monte Carlo simulation the reliability of various statistical
parameters of three different operational stochastic models. Yen and Tang (1976) and Tung and Mays (1980) included hydraulic and hydrologic uncertainties in the design of hydraulic structures. Cooley (1977,
1979) developed a method for estimating the parameters and assessing the reliability of steady state groundwater flow models. Moss (1979)
treated model and measurement errors as a third dimension in the analysis of the time-space tradeoff in hydrologic data collection for
regional regression model calibrations. Lall and Beard (1981), through a reliability analysis, developed a decision theoretic framework in which the value of data was related to process risk and estimate uncertainty. Hirsh (1979) suggested the use of synthetic hydrology for reliability analysis of water supply systems. After comparison of the performance of six generating methods of synthetic flow sequences he found that it
is operationally superior to perform the analysis on transformed (normalized) values of historical streamflow data, rather than on the data themselves. Also, he found that preservation of statistical moments (mean, standard deviation, and lag correlation coefficients) may be a misleading criterion for judging the ability of the generating model to reproduce the performance of the water supply system. A better measure of this performance was provided by checking the similarity between historical and synthetic cumulative distribution functions of the modeled flows or storages. This criterion was applied later for the comparison of the performance of four streamflow record extension techniques (Hirsh, 1982).
[Figure 1.1. Error Distribution About the Fitted Model. Modified from Kite (1977). Axis labels: y, x, and Return Period, Years (ticks at 5, 10, 20, 50, 100); the fitted curve is labeled g(x,θ) and the error distribution about it is bounded by the limits U and L.]
Stedinger and Taylor (1982a, 1982b) reconsidered the problem of
streamflow generation and compared the performance of five generation models. They did this in two steps: a verification phase to confirm that the model reproduces the statistics of the observed data, and a validation phase to demonstrate that other important characteristics of generated flow sequences are consistent with those of the historical flow, e.g., statistics related to the frequency and severity of droughts. They also showed that by incorporating parameter uncertainty into the streamflow generating model, derived distributions of reservoir reliability and performance will better reflect what is known (or is not) about a basin's hydrology. Among their conclusions they noted that the
impact of parameter uncertainty is much greater than that of the selection between a simple and a relatively complicated model. Thus, streamflow model parameter uncertainty should be incorporated into reservoir simulation studies to obtain realistic and honest estimates of system reliability, given what is actually known about basin hydrology (Stedinger and Taylor, 1982b).
Hashimoto et al. (1982) defined three criteria that can be used to assist in the evaluation and selection of alternative design and operating policies for a wide variety of water resource projects. These
criteria describe how likely a system is not to fail (reliability), how quickly it recovers from failure (resilience) and how severe the
consequences of failure may be (vulnerability). Application of these
criteria was illustrated with a water supply reservoir example for which it was found that there is a tradeoff among expected benefits, reliability, resilience and vulnerability. For instance, high system reliability was accompanied by high system vulnerability.
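The three criteria can be made concrete with a small sketch that computes them from paired series of supply and demand. The specific estimators used here (fraction of non-failure periods, share of failure periods followed by recovery, and mean of the worst deficit in each failure event) are one plausible reading of the verbal definitions above and are assumptions of this example, as are the function name and the data; Hashimoto et al. (1982) should be consulted for their exact formulations.

```python
def performance_criteria(supply, demand):
    """Return (reliability, resilience, vulnerability) for paired series."""
    fail = [s < d for s, d in zip(supply, demand)]
    n = len(fail)
    # Reliability: fraction of periods in which demand is met.
    reliability = 1.0 - sum(fail) / n
    # Resilience: share of failure periods followed by a satisfactory one.
    recoveries = sum(1 for t in range(n - 1) if fail[t] and not fail[t + 1])
    failures_followed = sum(fail[:-1])
    resilience = recoveries / failures_followed if failures_followed else 1.0
    # Vulnerability: mean over failure events of the largest deficit.
    worst, current = [], 0.0
    for s, d, f in zip(supply, demand, fail):
        if f:
            current = max(current, d - s)
        elif current > 0.0:
            worst.append(current)
            current = 0.0
    if current > 0.0:                        # series ends during a failure
        worst.append(current)
    vulnerability = sum(worst) / len(worst) if worst else 0.0
    return reliability, resilience, vulnerability

# Made-up example: constant demand of 9 units over eight periods.
supply = [10, 8, 6, 10, 10, 7, 10, 10]
demand = [9] * 8
rel, res, vul = performance_criteria(supply, demand)
print(rel, res, vul)
```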
Niku et al. (1979, 1981) developed a reliability model for the
evaluation of activated sludge processes of plants under design or under operation. The model was based on the assumption of lognormality of the analyzed effluent and on the relation between the moments of the normal and lognormal distributions. Reliability-based parameter estimation
procedures of rainfall-runoff models have been reported by Sorooshian and Dracup (1980), Sorooshian (1981), Troutman (1982), Sorooshian et al. (1983), Sorooshian and Gupta (1983) and Gupta and Sorooshian (1983) in
which special objective functions and solution techniques were selected on the basis of the stochastic properties of the errors present in the
data and in the model. In all of these studies, the data were transformed in order to comply with the assumptions implied by the estimation methods.
Stedinger (1983a) proposed the use of the noncentral t distribution to construct approximate confidence limits for specified design events
from the normal and lognormal distributions, and suggested an adjustment for skewness for the Pearson type III distribution. Through a Monte
Carlo simulation, the new confidence limits were shown to achieve the desired confidence level better than those based on asymptotic theory (Kite, 1975) or on the U.S. Water Resources Council Guidelines (WRC, 1977, 1981). In another paper Stedinger (1983b) recommended the use of probability weighted moments for estimating the parameters of the
dimensionless flood flow frequency distributions, in which the dimensionless variables are obtained by dividing the data by the sample mean. In a third paper, Stedinger (1983c) assumed that either the annual floods, their logarithms, or some other transformation of the flows are normally distributed, from which he derived an estimate of the pth
percentile of the flood flow distribution. This estimate was based on a Bayesian approach, incorporating hydrologic and geomorphic information
(prior distribution of the parameters) in addition to the sample information (measured data).
As mentioned earlier, and illustrated by this brief review of
literature, the term reliability has no strict (unique) definition since it applies to the analysis of the residuals as well as to the variability of the data themselves. Transition between these two applications may be easily handled by changing the definition of the functional form $g(x,\theta)$. Another type of reliability analysis is related to parameter estimation in which the variance-covariance matrix of the parameters
needs to be estimated along with the parameters themselves. The next section gives some definitions and measures of reliability.
1.2. Reliability Definitions and Indices
1.2.1. The concept of reliability
An introduction to the concept of reliability is given by Hendrenyi (1981):
Reliability is an old concept and a new discipline. For ages,
things and people have been called reliable if they had lived up to certain expectations, and unreliable otherwise. A reliable person
would never (or hardly ever) fail to deliver what he had promised;
a reliable watch would be keeping the time day after day. The
types of expectations to judge reliability by have all been related
to the performance of some function or duty; the reliability of
a device has been considered high if it had repeatedly performed
its function with success and low if it had tended to fail in
repeated trials.
This vague notion of reliability is of limited use in technical applications, as in physics and engineering, where concepts must have numerical measures. A better schematization of the concept of reliability is given by Figure 1.2, in which the engineering design process
is compared to the links of a chain, and reliability is referred to as the strength of the weakest link. However, a suitable definition of
reliability is still required in order to quantify it and utilize one or several measurable quantities.
1.2.2. Reliability definitions
The classical definition which was first employed in the engineering field is not too far from the nontechnical concepts presented above. This definition is the following:
Reliability is the probability of a device or system performing
its function adequately, for the period of time intended under the
operating conditions intended. (Hendrenyi, 1981)
Note that this definition is based on the mathematical concept of
probability, which is fundamentally associated with reliability. This definition will be adapted for this study, wherein the performance of the system or device will be described by the previously defined functional form $g(x,\theta)$. Thus, the reliability will be the probability that the model $g(x,\theta)$ will perform adequately in predicting the behavior of the system over a given range of observations and within specified confidence limits. The concept of reliability therefore remains unchanged and is the "probability of success" or "probability of adequate performance," often referred to as the probability of "no failure" or no risk:
$$ \text{Reliability} = 1 - \text{Risk} = 1 - \Pr(\text{failure}) \tag{1.2.1} $$
[Figure 1.2. Schematic Representation of the Engineering Design Process. After Harr (1977). Labels: Sampling, Testing, Formulas, Experience; "Build with confidence".]
If the performance of the system is described by $g(x,\theta)$ and an interval $\Delta$ is defined about the mean prediction $g(x,\theta)$ within which the observation will be considered as a success, then the reliability may be expressed as
$$ \text{Reliability} = 1 - \Pr(|y - g(x,\theta)| > \Delta) . \tag{1.2.2} $$
From Figure 1.1, the error distribution about a single prediction $g(x_i,\theta)$ defines the reliability as the area under the curve delimited by the upper and lower limits $U$ and $L$, respectively. The risk is defined by the area under the tails of the distribution. Both risk and reliability may be defined with only one limit; the second limit may be set to $+\infty$ or $-\infty$. Figure 1.3a gives one such example in which $y$ and $g(x,\theta)$ are replaced by capacity $C$ and demand $D$, with safety margin $SM = C - D$ (Harr, 1977). The risk of having the demand exceed the capacity (negative $SM$) is given by the shaded area in Figure 1.3a. For this case the lower limit was set equal to zero. Such limits are usually defined by standards or design events which should not be exceeded for a given level of reliability. If these limits are themselves random variables, they will be represented by their probability distributions, and the reliability will be evaluated as illustrated in Figure 1.3b, where the probability associated with a level of demand (fixed limit), say $D_1$, is
$$ \Pr\!\left(|D - D_1| < \tfrac{dD}{2}\right) = f_D(D_1)\,dD \tag{1.2.3} $$
and the probability that the capacity is less than $D_1$ is the shaded area of the same figure,
$$ \Pr(C < D_1) = \int_{-\infty}^{D_1} f_C(C)\,dC . \tag{1.2.4} $$
The probability of failure is the product of these two probabilities or
$$ dP_f = f_D(D_1)\,dD \int_{-\infty}^{D_1} f_C(C)\,dC \tag{1.2.5} $$
[Figure 1.3. Reliability and Risk Definitions. After Harr (1977). Panel (a): density f(SM) of the safety margin SM = C - D. Panel (b): upper tail of the demand distribution D and lower tail of the capacity distribution C, with f(D_1) and dD marked. Panel (c): demand distribution D and capacity distribution C on a common C, D axis.]
Integration of this expression over all possible values of demand will define the total risk represented by the shaded area of Figure 1.3c,
$$ \text{Risk} = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{D} f_C(C)\,dC \right] f_D(D)\,dD \tag{1.2.6} $$
which after evaluation of the inner integral and substitution into Equation 1.2.1 gives
$$ \text{Reliability} = 1 - \int_{-\infty}^{\infty} F_C(D)\, f_D(D)\,dD . \tag{1.2.7} $$
In this equation, FC(D) is the cumulative distribution function of the capacity. Tung and Mays (1980) developed a computer program to evaluate
this expression for different combinations of capacity and demand distributions, including the most widely used distributions in the hydrologic field. Such relations will be dealt with in more detail in Chapter 5.
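As a hedged illustration of Equation 1.2.7, the sketch below integrates the product of the capacity distribution function and the demand density numerically and compares the result with the closed form that holds when capacity and demand are independent and normal. It is not the program of Tung and Mays (1980); the function names, the integration settings and the numerical values are assumptions chosen for the example.

```python
from math import sqrt
from statistics import NormalDist

def reliability_numeric(cap, dem, steps=4000, width=8.0):
    """1 - integral of F_C(D) f_D(D) dD, by the trapezoidal rule."""
    lo = dem.mean - width * dem.stdev
    hi = dem.mean + width * dem.stdev
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        d = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end weights
        total += weight * cap.cdf(d) * dem.pdf(d)
    return 1.0 - total * h

def reliability_normal(cap, dem):
    """Closed form Pr(C - D > 0) for independent normal C and D."""
    z = (cap.mean - dem.mean) / sqrt(cap.variance + dem.variance)
    return NormalDist().cdf(z)

# Hypothetical capacity and demand distributions (arbitrary units).
capacity = NormalDist(mu=120.0, sigma=15.0)
demand = NormalDist(mu=90.0, sigma=10.0)
r_num = reliability_numeric(capacity, demand)
r_exact = reliability_normal(capacity, demand)
print(r_num, r_exact)
```

The numerical route is the one that generalizes: any pair of distributions offering a cdf for capacity and a pdf for demand can be substituted, provided the integration range covers the bulk of the demand distribution.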
Chow (1964) defined two types of reliability, sampling reliability and prediction reliability for flood flow frequency analysis. The
first relates to the lack of fit of events from a given sample to the theoretical functional form, say $g(x,\theta)$, representing the total population of samples. Sampling reliability is usually expressed in terms of confidence limits. Prediction reliability relates to the probability of nonoccurrence of an event of given average recurrence interval ($T$) during a given number of time intervals ($n$),
$$ \Pr(y < y_L) = \left(1 - \frac{1}{T}\right)^{n} \tag{1.2.8} $$
where $n$ is known as the design period or project life. It is a function of the reliability from the above equation for a given design event $y_L$ with average recurrence interval $T$,
$$ n = \frac{\log \Pr(y < y_L)}{\log[(T-1)/T]} \tag{1.2.9} $$
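Equations 1.2.8 and 1.2.9 are inverses of one another, which a short sketch makes easy to check. The function names and the numerical values are illustrative assumptions only.

```python
from math import log

def prediction_reliability(T, n):
    """Probability that the T-year event is not exceeded in n intervals."""
    return (1.0 - 1.0 / T) ** n

def design_period(T, reliability):
    """Number of intervals n that gives the stated reliability (Eq. 1.2.9)."""
    return log(reliability) / log((T - 1.0) / T)

# A 100-year design event over a 25-year project life.
r = prediction_reliability(T=100.0, n=25)
n_back = design_period(T=100.0, reliability=r)
print(r, n_back)
```

Because the logarithms appear as a ratio, any base may be used in Equation 1.2.9.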
This study will focus mainly on sampling reliability, although prediction reliability models will be treated as special cases of the functional form $g(x,\theta)$.
Nix (1982) lists three types of reliability applicable for the measure of the performance of stormwater related systems. These are annual reliability, defined as the probability of no failure within a
year; time reliability, defined as the portion of time with no failure during the operation period; and volume reliability, defined as the portion of the total demand satisfied during the operation period. These three definitions were based on water supply reservoir theory; they will not be considered further, although they may be easily incorporated into the general frame of reliability analysis of this study.
1.2.3. Reliability indices
1.2.3.1. Safety factor (SF). This is a well known index of reliability among engineers; it is usually defined as the ratio of the expected capacity ($C$) to the expected demand ($D$) (Figure 1.3c) of a given structure. Other values of capacity and demand may be substituted for the expected values (e.g., C and D of the same figure). Tung and Mays (1980) discussed six different safety factors, originally proposed by Yen (1978) for hydrologic and hydraulic design. The general expression of the safety factor is
$$ SF = \frac{C}{D} \tag{1.2.10} $$
where $C$ and $D$ are some measure of the capacity and demand, respectively.
1.2.3.2. Coefficient of reliability (COR). Niku et al. (1979, 1981) defined a similar index to measure the performance of a wastewater treatment plant. This index was called coefficient of reliability and defined as the ratio of the average effluent concentration ($C$) for which the plant should be designed (or operated) to the standard ($D$) that must be met p% of the time (i.e., with p% reliability),
$$ COR = \frac{C}{D} \tag{1.2.11} $$
where C and D are the mean and pth percentile variate of the lognormal distribution fitted to the pollutant of interest. For example a COR of 0.2 relates to a percentile equal to 5 times the mean of the lognormal
distribution. Detailed expressions of this coefficient will be given in Chapter 5.
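For a lognormal variate the ratio of the mean to the pth percentile depends only on the coefficient of variation and on p, which the following sketch evaluates. It relies on the standard lognormal moment relations; the function name and the sample values are assumptions of this example, and the expressions developed in Chapter 5 may be written differently.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def coefficient_of_reliability(cv, p):
    """Mean divided by the p-quantile of a lognormal variate.

    cv is the coefficient of variation of the untransformed variate and
    p is the required reliability expressed as a fraction (0 < p < 1).
    """
    s = sqrt(log(1.0 + cv * cv))        # standard deviation of ln(x)
    z = NormalDist().inv_cdf(p)         # standard normal quantile
    # mean = exp(m + s^2/2) and quantile = exp(m + z*s), so m cancels.
    return exp(0.5 * s * s - z * s)

for p in (0.50, 0.90, 0.99):
    print(p, coefficient_of_reliability(cv=1.0, p=p))
```

The coefficient decreases as the required reliability p increases, meaning the design mean must lie further below the standard.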
1.2.3.3. Generalized reliability index (GRI). Ditlevsen (1981, p. 232) defined a generalized reliability index of a system with respect to a limit state $L_y$ dividing the transformed space into a safe set and a failure set by the formula
$$ GRI(y_L) = \Phi^{-1}\!\left[ \int_{y_L} \phi(z_1)\,\phi(z_2) \cdots \phi(z_n)\,dz \right] \tag{1.2.12} $$
where $y_L$ is the confidence limit (safe set) in the normalized space of input variables corresponding to the limit state $L_y$, $\Phi^{-1}$ is the inverse of the cumulative distribution function of $\phi$, the standardized normal density function, and $z_1, z_2, \ldots, z_n$ are the standardized normal variates,
$$ z_i = \frac{y_i - g(x_i,\theta)}{\sigma} . \tag{1.2.13} $$
For the special case where $L_y$ is limited to one variable, $y_L$ will be defined by a single point, and the reliability index of Equation 1.2.12 reduces to
$$ GRI(y_L) = \Phi^{-1}\!\left[ \int_{-\infty}^{z_L} \phi(z)\,dz \right] = \Phi^{-1}[\Phi(z_L)] = z_L \tag{1.2.14} $$
which is equal to the number of standard deviations separating the mean from the safe-set limit $y_L$. The last two equations may be combined to define confidence limits about $g(x,\theta)$ for a given reliability level $z_L$,
$$ y_L = g(x,\theta) + z_L \sigma . \tag{1.2.15} $$
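In the single-variable case the index is simply the standard normal quantile of the target reliability, and Equation 1.2.15 turns it into a limit about the prediction. The sketch below assumes normal residuals and a one-sided limit; the function names and numbers are hypothetical.

```python
from statistics import NormalDist

def reliability_index(reliability):
    """z_L such that the standard normal cdf at z_L equals the reliability."""
    return NormalDist().inv_cdf(reliability)

def safe_set_limit(prediction, sigma, reliability):
    """Limit y_L = g + z_L * sigma (Eq. 1.2.15) for a one-sided reliability."""
    return prediction + reliability_index(reliability) * sigma

# Hypothetical prediction of 100 units with a residual standard deviation of 10.
z_l = reliability_index(0.975)
y_l = safe_set_limit(prediction=100.0, sigma=10.0, reliability=0.975)
print(z_l, y_l)
```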
1.2.3.4. Conclusion. As shown above, reliability indices are often based on the first two moments of the observations, namely the mean and the standard deviation. The use of these two moments is usually associated with the assumption of normality, or lognormality, such as the case of the GRI and COR indices, respectively. The sensitivity of these indices to such assumptions will be investigated in Chapter 5, and approximate new formulae will be derived to account explicitly for deviation from the assumption of normality.
An overview of the objectives of this study and a description of the remaining chapters are given in the next section.
1.3. Overview of the Study
The importance of probability distributions in model classification, parameter estimation, and reliability analysis is illustrated in Chapter 2 where a classification of hydrological models is presented
along with a review of different statistical parameter estimation procedures. Then classical distribution models and their statistical parameters are reviewed. Of special interest is a class of derived distributions in which the variate of a parent distribution is transformed using simple functions such as the logarithmic or power transformation. Chapter 3 starts with a detailed analysis of the generalized gamma distribution, a widely used distribution in reliability studies. This distribution includes most of the distributions
reviewed in Chapter 2 as special cases, but it has the disadvantage of
being so poorly parameterized that no simple procedure for estimating its parameters has been reported in the literature. The estimates are usually highly correlated, and iterative algorithms are required for their evaluation. The convergence of these algorithms is not always
guaranteed, and the final estimates are often dependent on their first estimates. To circumvent such limitations, a new parameterization of
the generalized distribution is suggested. In its new form the generalized distribution includes four families of distributions, each expressible in terms of only two parameters (location and scale), once the data are transformed to the right space through the Box-Cox transformation. The four families of generalized distribution analyzed are the normal, Gumbel, Rayleigh, and Pearson distributions. A computer program is developed to fit up to 100 probability distribution models simultaneously and select the best one according to any of four selection statistics. These statistics are the correlation coefficient, standard error of estimate, weighted sum of squares and maximum likelihood function.
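To make the role of the Box-Cox transformation concrete, the sketch below tries a grid of transformation parameters, obtains the location and scale of a normal parent distribution in closed form for each one, and keeps the value that maximizes the resulting profile log-likelihood. It is only a minimal illustration for the normal family with one selection statistic; it is not the program developed in this study, and the function names, the grid and the synthetic sample are assumptions.

```python
import random
from math import log

def box_cox(y, lam):
    """Box-Cox transform of positive data; the log is the limit as lam -> 0."""
    if abs(lam) < 1e-12:
        return [log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def profile_loglik(y, lam):
    """Normal log-likelihood (up to a constant) maximized over location and scale."""
    z = box_cox(y, lam)
    n = len(z)
    m = sum(z) / n                                  # location estimate
    var = sum((v - m) ** 2 for v in z) / n          # scale estimate (squared)
    jacobian = (lam - 1.0) * sum(log(v) for v in y)
    return -0.5 * n * log(var) + jacobian

def best_lambda(y, grid):
    """Grid value of the transformation parameter with the largest likelihood."""
    return max(grid, key=lambda lam: profile_loglik(y, lam))

# Synthetic lognormal sample, for which the log transform (lam = 0) is the true one.
random.seed(1)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(500)]
grid = [k / 20.0 for k in range(-20, 21)]           # -1.0 to 1.0 in steps of 0.05
lam_hat = best_lambda(sample, grid)
print(lam_hat)
```

A grid search is used here for transparency; a one-dimensional optimizer could replace it without changing the idea.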
The performance of this program and the new generalized distributions is compared in Chapter 4 to that of the classical maximum likelihood procedure used by Kite (1977) for estimating the parameters of six of the most widely used frequency distribution models in hydrology. The comparison is based on two examples. The first is a series of maximum daily discharges from the St. Mary's River at Stillwater, Nova Scotia, analyzed by Kite (1977). The second example has two series of
annual total rainfall and runoff from the Kissimmee River basin analyzed by Huber et al. (1982). These two examples are also used for a
sensitivity analysis of the selection statistics to the definition of the plotting position, change of scale and estimation procedure.
Chapter 5 illustrates the use of the new parameterized general
distributions for reliability analysis. In this chapter, approximate formulae relating the moments of the transformed and untransformed
variables are derived and substituted into expressions for reliability indices. Reliability tables are generated for different probability levels and for different spaces. A sensitivity analysis of the reliability, the predicted percentile and the design period to the shape of the distribution is based on these new formulae.
Chapter 6 begins with a description of four urban basins used as a case study for the deterministic modeling of water quantity and quality. These basins were intensively investigated by the United States Geological Survey (USGS) and most of the data collected during these investigations are available at the University of Florida (Huber et al., 1981a). Then a review of the calibration and verification of deterministic rainfall-runoff models is presented. The basin data are then used for the calibration of the EPA Storm Water Management Model (SWMM). A new procedure for calibrating SWMM using more than one storm event within the same run of the program is developed. The performance of the model is evaluated through a reliability analysis of the parameter estimates and a comparison of the variability in percent error for the total volume and peak flow predicted by the model to the percent error in the inputs (spatial and time variability of rainfall and basin characteristics). Also, within this case study, hourly rainfall data from eight stations located in southeast Florida are used for a detailed analysis of the rainfall spatial and temporal variability over the region
for the period 1956 to 1979. These data are analyzed on a yearly, monthly, and event based discretization of the time scale. The event based analysis is performed using a synoptic statistical analysis
program (SYNOP) originally developed by Hydroscience (1976) and improved by the University of Florida. This program performs statistical analyses on the duration, total volume, average intensity and time between events. A more detailed analysis of these characteristics is performed by the generalized probability distribution computer program (GPDCP) described in Chapter 3. The same analysis is performed on the hourly series of flows and pollutant concentrations for the calibrated basins
using a continuous simulation with SWMM. Statistical analysis of these generated series is performed directly by the statistical block of SWMM (STATS), followed by SYNOP for comparison purposes. The last section of Chapter 6 deals with stochastic modeling of the yearly and monthly rainfall series from the eight stations introduced in section 6.2. The reliability of the estimated parameters is analyzed through a Monte
Carlo simulation. The variability in parameter estimates due to temporal and spatial variations and to transformation of variables is then analyzed by comparison of estimates for the eight stations and for different transformations.
Chapter 7 summarizes the main findings of the research, and gives some recommendations for future investigations along with the conclusions.
1.4. Overview of the Results
Most probability distribution models used in hydrology and in
reliability analysis are shown to reduce to a location and scale type
distribution after an adequate reparameterization and transformation of the data by the Box-Cox transformation. Based on this finding, four
generalized families of probability distributions are derived, with the normal, extremal, Rayleigh, and Pearson as parent distributions. A computer program is developed for estimating their parameters. Separation of the estimation of the location and scale parameters from that of
the transformation parameter, adopted by this program, results in a very efficient algorithm which always converges in efforts where other classical algorithms have failed.
The high performance of these distributions is illustrated by the modeling of three annual series taken from the literature. Four selection statistics are used for the evaluation of such performance. These statistics exhibit a high sensitivity to the transformation parameters,
and no sensitivity to the definition of the plotting position and to the change of scale on which the observed data are expressed. The optimal selection statistics are found to be much less sensitive to the choice
of the parent distribution than to the transformation parameter, e.g., all four generalized distributions perform equally well once the data are optimally transformed.
Relations between the moments of the original and transformed
variables are derived, based on a Taylor series expansion of the Box-Cox transformation. These relations allow the extension of the second moment reliability theory, based on the assumption of normality, to a
third order reliability theory, wherein the deviation from normality is explicitly accounted for through the transformation parameter. Generalized expressions for different measures and indices of reliability
are derived, and their sensitivity to the shape of the distribution is
illustrated by generated tables and plots of these indices for several transformation parameters and coefficients of variation. An unexpected increase of reliability with the coefficient of variation for fixed
extreme events is highlighted by the new generalized reliability indices and definitions.
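The kind of moment relation obtained from a Taylor series expansion can be sketched for the mean. If z is the Box-Cox transformed variable with mean and variance known in the transformed space, the mean of the original variable y = (1 + lam*z)^(1/lam) can be approximated by expanding the inverse transformation to second order about the mean of z. This is a generic second-order approximation written for this example, valid for a nonzero transformation parameter; it is not the set of relations derived in Chapter 5, and the function names are hypothetical.

```python
def inverse_box_cox(z, lam):
    """Back-transform z to the original space (lam must be nonzero)."""
    return (1.0 + lam * z) ** (1.0 / lam)

def approx_mean_original(mu_z, var_z, lam):
    """Second-order Taylor approximation of E[y] about mu_z."""
    base = 1.0 + lam * mu_z
    h = base ** (1.0 / lam)                          # inverse transform at mu_z
    h2 = (1.0 - lam) * base ** (1.0 / lam - 2.0)     # its second derivative at mu_z
    return h + 0.5 * h2 * var_z

# For lam = 0.5 the inverse transform is quadratic in z, so the
# second-order expansion reproduces the mean exactly.
approx = approx_mean_original(mu_z=2.0, var_z=0.25, lam=0.5)
exact = (1.0 + 0.5 * 2.0) ** 2 + 0.25 * 0.25         # (1 + lam*mu)^2 + lam^2 * var
print(approx, exact)
```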
Deterministic models, such as SWMM, are proven to be valuable tools for continuous simulation of hydrologic processes, once calibrated against several storms representative of average hydrological conditions
of the basins under investigation. Simulated series provide valuable information for storm water management and for hydrological planning and design when separated into events and treated by statistical analysis
programs such as the Statistics Block of SWMM or the Synoptic Statistical Analysis Program (SYNOP).
More insight into the modeled hydrologic processes may be gained through a probabilistic and stochastic modeling of these simulated series, e.g., the generalized probability distribution computer program
(GPDCP) is used for the modeling of the event characteristics of the generated series (duration, volume, intensity and time between events) and of generated series of extreme hourly pollutant loads. For all these series the performance of the models fitted to the event data is as good as for the total annual series.
Stochastic modeling of annual and monthly rainfall series is
illustrated by the analysis of data from eight National Weather Service (NWS) stations from southeast Florida. Parameter estimation based on optimal decision criteria, such as the Akaike Information Criterion
(AIC) and Bayesian Information Criterion (BIC) extended to the Box-Cox transformation, performed very poorly in selecting the normalizing
transformation compared to the reliability based approach of the GPDCP program. This conclusion is supported by a Monte Carlo simulation.
1.5. Research Uniqueness
The reliability based approach for estimating the parameters of deterministic, probabilistic and stochastic models, adopted in this study, is unique to this research. Also, the reparameterization of the
generalized gamma distribution, the introduction of the Box-Cox transformation parameter as a shape parameter for the normal, extremal,
Rayleigh, and Pearson families of distributions, and the development of a solution algorithm for estimating the parameters of all four types of distributions are among the unique aspects of this research. Another
contribution of this study is the derivation of new relations between the moments of original and transformed data, and their use for the definition of generalized reliability measures and indices for cases
where the data are normalized by the Box-Cox transformation. This research also extends the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) to the Box-Cox transformation.
Although most of the research applications focus on the southeast
Florida case study, the main objective is not to answer specific questions about the modeled basins or rainfall stations, but rather to
illustrate the theory, highlighting the potential of the reliability based approach in the modeling of hydrologic processes.
CHAPTER 2
LITERATURE REVIEW
2.1. Classification of Hydrologic Models
Hydrological models are mathematical formulations to simulate
natural phenomena. Their classification may be made according to the simulated processes. A hydrologic process was defined by Chow (1964) as a hydrologic phenomenon subject to continuous variation, especially with respect to time. Mathematical models may be classified into three main groups: deterministic, probabilistic and stochastic. A deterministic model is one in which the simulated process is assumed to be chance free. Models that include a random component for the simulation of
chance dependent processes are called probabilistic or stochastic according to their independence or dependence on time. For stochastic models the sequence of occurrences is called a time series. Depending on the correlation structure of the simulated time series elements,
these models are called pure random or nonpure random for uncorrelated and correlated series, respectively.
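The distinction between pure random and nonpure random series can be illustrated with the lag-one serial correlation coefficient, which is near zero for an uncorrelated series and clearly nonzero for a persistent one. The estimator, the first-order autoregressive generator and the parameter values below are assumptions chosen for the example, not a procedure taken from this study.

```python
import random

def lag1_autocorrelation(series):
    """Lag-one serial correlation coefficient of a series."""
    n = len(series)
    m = sum(series) / n
    num = sum((series[t] - m) * (series[t + 1] - m) for t in range(n - 1))
    den = sum((v - m) ** 2 for v in series)
    return num / den

def ar1_series(phi, n, rng):
    """Correlated series x[t] = phi * x[t-1] + standard normal noise."""
    values, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        values.append(prev)
    return values

rng = random.Random(42)
white = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # pure random
persistent = ar1_series(0.8, 2000, rng)              # nonpure random
print(lag1_autocorrelation(white), lag1_autocorrelation(persistent))
```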
The probability distribution of the simulated time series may be time independent or time dependent, requiring a stationary or a nonstationary stochastic model, respectively. Classification of hydrologic models and their interrelation are summarized in Figure 2.1, based on Chow (1964, p. 810).
Hydrologic processes are generally treated as stationary to avoid the mathematical complexity of the stochastic models. But it is well known to hydrologists that these processes are stochastic nonpure random, and that their exact mathematical representation is practically impossible. Thus, all other models shown in Table 2.1 are simple representations that only approximate hydrologic processes. These approximations are based on knowledge of the modeled processes gained by experience.
[Figure 2.1. Hydrologic Model Classification. Modified from Chow (1964, p. 810). Hydrologic models divide into deterministic (chance-independent) and stochastic or probabilistic (chance-dependent) models. Chance-dependent models divide into probabilistic (time-independent, or sequence ignored; pure random) and stochastic (time-dependent, or sequence considered) models. Stochastic models are pure random or non-pure random, and stationary (time-independent distributions) or non-stationary (time-dependent distributions).]
For each model the necessary assumptions should be verified. If
not satisfied, either another model may be tested or the original time series may be treated to comply with the previous assumptions. Among the possible treatments of these series, removal of periodicity, trends and correlation are widely documented in the literature (Yevjevich, 1972b; Lawrence and Kottegoda, 1977; Delleur and Kavvas, 1978).
From the previous classification of hydrologic models emerges the importance of the frequency distribution. The classification of each process was based on some assumptions about the probability distribution of the modeled series. Such assumptions are even more important for estimating the parameters of the models, making inferences about the goodness of fit, and evaluating the reliability of the estimated parameters. As will be shown in the next section, some assumptions about the error distribution have to be made independently of the model and the method used for parameter estimation. Such assumptions are made implicitly or explicitly depending on the method of estimation. Model predictions and the associated confidence limits are shown to be very
sensitive to the assumed distribution (Maalel, 1983a). A review of the method of parameter estimation is given in the next section with emphasis on the importance of the error probability distributions.
2.2. Statistical Parameter Estimation
2.2.1. Introduction
In the previous section hydrologic models were classified into
deterministic, probabilistic and stochastic models, but estimation of
the parameters of these models is usually based on data samples of limited sizes, and estimates of the same parameter from different samples are not expected to be exactly equal. This is due to limited
accuracy of measurement techniques, the idealized conditions for which the model was derived and the unpredicted or unaccounted for disturbances affecting the real system. Such disturbances are as much a part
of physical reality as the underlying exact quantities which appear in the model (Bard, 1974). Thus, disturbances or errors are explicitly included in the mathematical representation of the modeled process. Such representations bear the general form
$$ y = g(x,\theta) + e \tag{2.2.1} $$
where $y = (y_1, \ldots, y_n)$ and $x = (x_1, \ldots, x_n)$ are the dependent and independent variables, respectively, $n$ is the number of data points or experiments, $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$ defines the parameters of the model, and $e = (e_1, e_2, \ldots, e_n)$ is the error or deviation of the model prediction from the observation,
$$ e_i = y_i - g(x_i,\theta) . \tag{2.2.1a} $$
Parameter estimation methods are based on the satisfaction of
Equation 2.2.1 by some measured values of the dependent and independent variable (y0, x0). These measurements constitute a sample (realization) of finite size n from the population of all possible values that may occur in actual physical situations. Such samples are assumed to be representative of the total population, leading to an estimate of the
true value of the parameters. The independent variable x is often referred to as an explanatory variable since it describes the variation in the dependent variable through the model $g(x,\theta)$. The unexplained variability of y will be given by the variance of the residuals, $\sigma_e^2$, denoting the average scatter of the observed data about the fitted curve. The smaller this unexplained variance is, the better is the fit of the model to the observed data. As in Equation 2.2.1, in the presence of explanatory variables the mean of the residual is often assumed equal to zero, reducing the total number of estimated parameters to p + 1. If there are no explanatory variables, as is usually the case in frequency analysis, $g(x,\theta)$ is replaced by the mean of the observations, $\mu_y$, and all the variability in the data will be included in the variance of the residuals,
$$e_i = y_i - \mu_y \qquad (2.2.1b)$$
$$\sigma_e^2 = \sigma_y^2. \qquad (2.2.1c)$$
In this case, only two parameters need to be estimated from the data, namely the mean $\mu_y$ and the variance $\sigma_y^2$.
2.2.2. Properties of estimators
An estimator $\hat{\theta}$ of the parameter $\theta$ can be characterized by the following criteria, taken individually or collectively (Neter and Wasserman, 1974).
1. Unbiasedness
An estimator $\hat{\theta}$ is an unbiased estimator of $\theta$ if
$$E(\hat{\theta} - \theta) = 0 \qquad (2.2.2)$$
where E stands for expected value.
2. Consistency
An estimator $\hat{\theta}$ is a consistent estimator of $\theta$ if the probability that $\hat{\theta}$ differs from $\theta$ by more than an arbitrary constant $\varepsilon$ approaches zero as the sample size (n) approaches infinity,
$$\lim_{n \to \infty} P(|\hat{\theta} - \theta| > \varepsilon) = 0 \quad \text{for any } \varepsilon > 0. \qquad (2.2.3)$$
3. Sufficiency
An estimator $\hat{\theta}$ is a sufficient estimator of $\theta$ if the conditional joint probability density function of the sample observations, given $\hat{\theta}$, does not depend on the parameter $\theta$.
4. Efficiency
An estimator $\hat{\theta}$ is an efficient (minimum variance) estimator of $\theta$ if the variance of any other estimator $\theta'$ is larger than the variance of $\hat{\theta}$,
$$\sigma^2(\hat{\theta}) < \sigma^2(\theta') \quad \text{for all } \theta'.$$
2.2.3. Least squares method
The simplest and most popular method of parameter estimation is the least squares method. This method is based on minimizing the sum of the squares of the differences between model predictions and measurements, e, of Equation 2.2.1,
$$SS = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} [y_i - g(x_i,\theta)]^2. \qquad (2.2.4)$$
This method was first suggested by Legendre (1805) for parameter estimation in linear curve fitting. Curve fitting differs from parameter estimation in that the parameters of the former have no physical meaning. Parameters estimated by the method of least squares are often of this type. Since no assumption about the distribution of errors is made, the estimates cannot be related to the true values of the
parameters and are thus meaningless. Duality between curve fitting and parameter estimation will be dealt with in more detail in Chapter 6.
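Equating to zero the derivatives of Equation 2.2.4 with respect to each of the parameters $\theta_j$, j=1, ..., p leads to a set of p equations with p unknowns. These are the well known normal equations
$$\begin{aligned}
\frac{\partial (SS)}{\partial \theta_1} &= -2 \sum_{i=1}^{n} [y_i - g(x_i,\theta)] \frac{\partial g(x_i,\theta)}{\partial \theta_1} = 0 \\
\frac{\partial (SS)}{\partial \theta_2} &= -2 \sum_{i=1}^{n} [y_i - g(x_i,\theta)] \frac{\partial g(x_i,\theta)}{\partial \theta_2} = 0 \\
&\;\;\vdots \\
\frac{\partial (SS)}{\partial \theta_p} &= -2 \sum_{i=1}^{n} [y_i - g(x_i,\theta)] \frac{\partial g(x_i,\theta)}{\partial \theta_p} = 0
\end{aligned} \qquad (2.2.5)$$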
The ease of the solution of these equations is highly dependent on the form of the function g. If this function is a linear expression of the parameters, then the normal equations are also linear, and the problem reduces to the linear least squares problem, solved readily by multiple
linear regression. On the other hand if g is not a linear expression of the parameters, the normal equations are nonlinear. No direct explicit solution for such equations exists. Nonlinear least squares methods may be used for this solution, usually based on a direct search for the set of parameters minimizing Equation 2.2.4 or on an iterative solution of a linearized form of the normal equations. A detailed description of these methods may be found in Bevington (1969) and Bard (1974).
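As an illustration of the linear case, the following minimal sketch solves the two normal equations of Equation 2.2.5 for the straight-line model $g(x,\theta) = \theta_1 + \theta_2 x$. The function names and the data are hypothetical and are introduced only for this example; they do not come from the cited references.

```python
def fit_line(x, y):
    """Solve the normal equations for g(x, theta) = theta1 + theta2 * x."""
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # Normal equations written out for this model:
    #   n  * theta1 + sx  * theta2 = sy
    #   sx * theta1 + sxx * theta2 = sxy
    det = n * sxx - sx * sx
    theta1 = (sy * sxx - sx * sxy) / det
    theta2 = (n * sxy - sx * sy) / det
    return theta1, theta2


def sum_of_squares(x, y, theta1, theta2):
    """Sum of squared deviations (Equation 2.2.4) for the straight-line model."""
    return sum((yi - (theta1 + theta2 * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up data lying roughly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
theta1, theta2 = fit_line(x, y)
ss_min = sum_of_squares(x, y, theta1, theta2)
```

Any other pair of parameter values gives a larger value of `sum_of_squares`, which is what the normal equations express. For a model that is nonlinear in the parameters this closed-form solution is not available and an iterative or direct search procedure would be needed instead.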
2.2.4. Maximum likelihood Method
By showing that least squares estimates maximized the probability density function for the normal error distribution, Gauss (1809) (referenced in Bard (1974)), laid the statistical foundation for parameter estimation. In this, Gauss anticipated the maximum likelihood method.
This method was revived later by Fisher (1950) who studied its estimator properties such as consistency, efficiency and sufficiency.
Assuming the measured values $y_i$, i=1, ..., n, are independent and that each has a probability density function (pdf) $f_{y_i}$ about its expected value, estimated by $g(x_i,\theta)$ of Equation 2.2.1, the probability that all n observations lie within dy of the predicted value $g(x_i,\theta)$ defines the likelihood function
$$L = \prod_{i=1}^{n} f_{y_i}(e_i). \qquad (2.2.6)$$
Meyer (1975, p. 136) emphasized that the term "likelihood" distinguishes, and is reserved for, a posteriori probabilities. That is, after a result $y_i$ has been observed, $f_{y_i}(e_i)$ states how probable it was that $e_i$ was found. The main distinction between a posteriori and ordinary (a priori) probabilities is that the latter are well defined numbers (even if unknown) whereas the former have the character of being random variables.
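If the errors $e_i$ are normally distributed with mean zero and variance $\sigma_i^2$, then Equation 2.2.6 gives the following expression of the likelihood function:
$$\begin{aligned}
L &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{1}{2}\left(\frac{y_i - g(x_i,\theta)}{\sigma_i}\right)^2\right] \\
  &= \exp\left[-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i - g(x_i,\theta)}{\sigma_i}\right)^2\right] \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i}.
\end{aligned} \qquad (2.2.7)$$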
The maximum likelihood estimator (MLE) of $\theta$, say $\hat{\theta}$, is the value of $\theta$ that maximizes L or equivalently the logarithm of L, $\ell = \log L$ (Meyer, 1975, p. 312). The MLE of $\theta$ is often but not always the solution of the following p equations:
$$\frac{\partial \ell}{\partial \theta_i} = 0, \quad i = 1, \ldots, p. \qquad (2.2.8)$$
Examples requiring alternative procedures to obtain MLE may be found in Mann et al. (1974, p. 113) and Meyer (1975, p. 314). These equations are similar to the previously defined normal equations. Usually they are not linear expressions of the parameters, and their solution requires nonlinear estimation methods. The complexity of the solution increases with that of the form of $g(x,\theta)$ and $f_y$. Maximum likelihood estimates satisfying Equation 2.2.6 will often occur at local maxima (Meyer, 1975, p. 312), but they do have the following important properties (Mann et al., 1974, p. 82): 1) If an efficient estimator, $\hat{\theta}$, of $\theta$ exists, then this estimator is the unique solution of the related likelihood function. 2) If the number of sufficient statistics is equal to the number of unknown parameters, the MLE's are the minimum variance estimators of their respective expected values. 3) Under certain general conditions the MLE's converge probabilistically (Equation 2.2.3) to the true value $\theta$, as $n \to \infty$, and are asymptotically normal and asymptotically efficient estimates of $\theta$. 4) The MLE's are invariant to a transformation with a single-valued inverse. In other words, if $\hat{\theta}$ is the MLE of $\theta$, and $t(\theta)$ is a function of $\theta$ with a single-valued inverse, $t(\hat{\theta})$ is the MLE of $t(\theta)$. This last property is a very important one; it will be used in Chapter 3 for statistical inferences about estimates from transformed variables.
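The likelihood equations can be illustrated with a minimal sketch for a one-parameter model, $g(x,\theta) = \theta x$, with normal errors of common, known standard deviation. The model, the function names and the data are assumptions made only for this example. For this simple case the single likelihood equation has an explicit root, and the code evaluates the logarithm of the normal likelihood so that the root can be compared with neighbouring parameter values.

```python
import math


def log_likelihood(theta, x, y, sigma):
    """Logarithm of the normal likelihood for g(x, theta) = theta * x."""
    n = len(x)
    chi2 = sum(((yi - theta * xi) / sigma) ** 2 for xi, yi in zip(x, y))
    return -0.5 * chi2 - n * math.log(math.sqrt(2.0 * math.pi) * sigma)


def mle_slope(x, y):
    """Root of d(log L)/d(theta) = 0 for this model.

    Differentiating the log-likelihood gives sum(x_i * (y_i - theta * x_i)) = 0,
    hence theta_hat = sum(x_i * y_i) / sum(x_i ** 2).
    """
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Made-up data lying roughly on y = 2x; sigma is assumed known.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
sigma = 0.2
theta_hat = mle_slope(x, y)
```

Because the transformation $t(\theta) = \theta^2$ has a single-valued inverse for positive $\theta$, property 4 implies that `theta_hat ** 2` would be the MLE of $\theta^2$ in this example.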
2.2.5. Weighted least squares
By defining "chi-square"
$$\chi^2 = \sum_{i=1}^{n} \left[\frac{y_i - g(x_i,\theta)}{\sigma_i}\right]^2 \qquad (2.2.9)$$
and substituting it into Equation 2.2.7 the maximum likelihood function will be
$$L = \exp(-\chi^2/2) \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i}. \qquad (2.2.10)$$
The only term in this expression that depends on $\theta$ is $\chi^2$. Therefore maximizing L or its logarithm will be equivalent to minimizing $\chi^2$ (Meyer, 1975, p. 254).
If all observations are normally distributed with the same variance (homoscedastic), $\sigma_i^2 = \sigma^2$, i=1, ..., n, then maximizing L, Equation 2.2.7, will be equivalent to minimizing the simple sum of squares of Equation 2.2.4. This is the previously mentioned relation between simple least squares and maximum likelihood given by Gauss (1809).
Equation 2.2.9 may be viewed as a weighted sum of squares (WSS) when written in the following form:
$$WSS = \sum_{i=1}^{n} w_i\,(y_i - g(x_i,\theta))^2 \qquad (2.2.11)$$
where the weights $w_i$ are equal to the inverse of the variance of the residual ($1/\sigma_i^2$). Weighting by the elements of the inverse of the covariance matrix of the errors was shown to lead to the least-variance estimates if the model $g(x,\theta)$ is linear in the parameters, or if the number of observations is large and the errors are normally distributed (Bard, 1974, p. 57). For the general case of nonlinear models and nonnormal distributions, approximate optimal properties may still be
reached by weighting with the inverse of the covariance matrix. When the covariance matrix is not known Bard suggested either to guess its elements or to estimate them along with the other parameters using a
method such as the maximum likelihood, while Bevington (1969, p. 102) suggested evaluating these weights directly from instrumental uncertainties.
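The weighting scheme can be sketched for a straight-line model with independent errors, so that the covariance matrix is diagonal and each weight is the inverse of an assumed measurement variance. The function name and the data below are hypothetical; the standard deviations would in practice come from instrumental uncertainties or replicated measurements, as discussed above.

```python
def weighted_fit_line(x, y, sigma):
    """Minimize sum(w_i * (y_i - a - b * x_i) ** 2) with w_i = 1 / sigma_i ** 2."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # Weighted normal equations:
    #   sw  * a + swx  * b = swy
    #   swx * a + swxx * b = swxy
    det = sw * swxx - swx * swx
    a = (swy * swxx - swx * swxy) / det
    b = (sw * swxy - swx * swy) / det
    return a, b


# Three reliable points on y = 1 + 2x and one very unreliable outlier.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 20.0]
sigma = [1.0, 1.0, 1.0, 1.0e6]
a, b = weighted_fit_line(x, y, sigma)
```

With equal standard deviations the weights cancel and the result is the ordinary least squares line; with the very large standard deviation assigned to the last point, that observation has almost no influence on the fitted parameters.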
When measurements are made with physical instruments, as is often the case in hydrology, the uncertainty in each measurement generally arises from fluctuations of repeated readings of the instrument scale. Such fluctuations may result from either human imprecision in reading the settings, imperfection in the equipment, or a combination of both. A replication of some of these experiments will give direct estimates of the elements of the covariance matrix. Draper and Smith (1966, p. 77) considered the case where $g(x,\theta)$ is linear in the parameters. When the covariance matrix is not diagonal (correlated observations) or is diagonal with unequal variances (some observations are more reliable than others) they showed that there is always a transformation of the observations y to other variables Y satisfying at least approximately the following properties: 1) Y is linear in the parameters, 2) the covariance matrix is diagonal with constant elements, and 3) the errors are approximately normally distributed with zero mean and variance matrix $I\sigma^2$, where I is the identity matrix. With these properties the transformed variables satisfy all conditions leading to the minimum variance unbiased estimate of $\theta$ by unweighted least squares and allow statistical tests and confidence intervals to be constructed. These results are then reexpressed in terms of the original variables; the optimality of these estimates is guaranteed by the previously mentioned property 4 of the MLE's. From Draper and Smith's approach we see how the problem of finding the right weights may be reduced to that of finding an appropriate transformation of the measured variables. Ditlevsen (1981, p. 232) followed the same approach of normalizing the observations y to $N(0, I\sigma^2)$, and then defining a general reliability index in the normalized space. This index is extended to a more general normalizing transformation in Chapters 3 and 5.
2.2.6. Method of moments
The error term of Equation 2.2.1 has a pdf $f_y(e) = f_y(y, g(x,\theta), \sigma^2)$. The rth moment about zero of $f_y$ is
$$\mu'_r = \int_{-\infty}^{\infty} y^r f_y(y, g(x,\theta), \sigma^2)\,dy. \qquad (2.2.12)$$
The moment $\mu'_r$ is a function of the parameters $\theta_i$, i=1, ..., p and $\sigma^2$, and hence can be written as
$$\mu'_r = \mu'_r(\theta_1, \theta_2, \ldots, \theta_p, \sigma^2). \qquad (2.2.13)$$
From a sample of observations $y_1, y_2, \ldots, y_n$ the first p+1 sample moments, $m'_r$, are
$$m'_r = \frac{1}{n} \sum_{i=1}^{n} y_i^r. \qquad (2.2.14)$$
The moment estimators $\hat{\theta}_i$, i=1, ..., p of the p $\theta$'s and $\hat{\sigma}^2$ of $\sigma^2$ are obtained by solving the system of p+1 equations resulting from equating Equations 2.2.12 and 2.2.14,
$$\mu'_r = m'_r, \quad r = 1, 2, \ldots, p+1.$$
Note that for this case the variance $\sigma^2$ was considered unknown but constant. If $\sigma^2$ is not constant the number of unknowns will jump to p+n, requiring the calculation of the (p+n) first sample moments. This is a main drawback of the method of moments since sample moments of order higher than first are biased. The following properties of moment estimators were originally shown by Cramer (1946) and reported in Mann et al. (1974, p. 80). (1) They are simple and squared error consistent;
(2) they are asymptotically normal, but not, in general, asymptotically efficient or best asymptotically normal.
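The simplest instance of the method can be sketched for the case without explanatory variables, where the only unknowns are a mean $\mu$ and a variance $\sigma^2$. The first two population moments about zero are then $\mu'_1 = \mu$ and $\mu'_2 = \sigma^2 + \mu^2$, and equating them to the sample moments gives the two estimates directly. The function names and the sample are hypothetical.

```python
def raw_moment(sample, r):
    """r-th sample moment about zero, (1/n) * sum(y_i ** r)."""
    return sum(v ** r for v in sample) / len(sample)


def moment_estimates(sample):
    """Solve mu'_1 = m'_1 and mu'_2 = m'_2 for the mean and the variance."""
    m1 = raw_moment(sample, 1)
    m2 = raw_moment(sample, 2)
    mu_hat = m1
    var_hat = m2 - m1 ** 2  # divides by n, so it is biased for small samples
    return mu_hat, var_hat


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up observations
mu_hat, var_hat = moment_estimates(sample)
```

The variance estimate obtained this way uses the divisor n rather than n - 1, which is one concrete form of the bias of higher-order sample moments mentioned above.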
2.2.7. Order statistics based least squares
If the distribution $f_y(y)$ of the errors in Equation 2.2.1 is of known form with location and scale parameters $g(x,\theta)$ and $\sigma_e$, respectively, its parameters may be estimated by applying general least squares theory to the ordered sample of observations. This method was first analyzed by Lloyd (1952) who showed that the resulting estimates are unbiased, linear in the ordered observations, and of minimum variance. He also developed explicit formulae for the estimates and for their variances and covariances.
For a distribution which depends on location and scale parameters only, the cumulative distribution function (CDF), $F_y(y)$, may be reduced to a parameter-free (standardized) distribution $F_z(z)$ with pdf $f_z(z)$, where z is equal to
$$z = \frac{y - g(x,\theta)}{\sigma_e}. \qquad (2.2.15)$$
If the variates (z's) are arranged in ascending order, such that $z_{i-1} \le z_i$, i=2, ..., n, the smallest observation $z_1$ is called the first order statistic whereas the last observation $z_n$ is called the nth order statistic. The small probability $h_i(z)\,dz$ that the ith order statistic, $z_i$, lies in the range $z - dz/2 < z_i < z + dz/2$ will define the probability density function, $h_i(z)$, of the order statistic. The probability that i-1 observations are less than z, n-i observations are greater than z, and exactly one observation is in the range $z - dz/2 < z_i < z + dz/2$ is given by the multinomial distribution,
$$h_i(z)\,dz = \frac{n!}{(i-1)!\,(n-i)!}\,[F_z(z)]^{i-1}\,[1 - F_z(z)]^{n-i}\,f_z(z)\,dz. \qquad (2.2.16)$$
The expected value of the ith smallest order statistic in a sample of size n is
$$\begin{aligned}
E(z_i) &= \int z\,h_i(z)\,dz \\
       &= \frac{n!}{(i-1)!\,(n-i)!} \int z\,[F_z(z)]^{i-1}\,[1 - F_z(z)]^{n-i}\,f_z(z)\,dz.
\end{aligned} \qquad (2.2.17)$$
Substitution of $E(z_i)$ for z into Equation 2.2.15 leads to the relation
$$y_i = g(x_i,\theta) + \sigma_e\,E(z_i). \qquad (2.2.18)$$
If $g(x_i,\theta)$ is replaced by $\mu_y$, the mean of the n observations, then this equation reduces to the classical case of two parameter pdf's with $\sigma_e = \sigma_y$ (Equation 2.2.1c). Equation 2.2.18 illustrates the linear relation between the mean observation and the expected order statistic. This equation is the analytical formulation of the probability paper linear plot. The expected standardized order statistic $E(z_i)$ may be converted
to (and is sometimes termed) the plotting position because it defines the probability at which $y_i$ should be plotted on probability paper. Equation 2.2.17 is usually too complicated for analytical evaluation.
Harter (1961) used numerical integration for n=2 to 400 to obtain values of $E(z_i)$ accurate to five decimal places with a normal parent distribution. Computer algorithms for the evaluation of the expected normal order statistic are included in many statistical packages (Westcott, 1977; Royston, 1982; Statistical Analysis System, 1982). For the extreme value Type I parent distribution Lieblein and Salzer (1953) simplified an expression, originally derived by Kimbal (1946), for Equation 2.2.17:
$$E(z_i) = \frac{n!}{(i-1)!\,(n-i)!} \sum_{r=0}^{n-i} (-1)^r \binom{n-i}{r} \frac{C + \ln(i+r)}{i+r} \qquad (2.2.19)$$
where C = 0.5772 is Euler's constant. Cunnane (1978) found that for n larger than 35, even with double precision FORTRAN, arithmetic roundoff errors overwhelmed the evaluation of $E(z_i)$ and suggested the use of
approximation formulae for the expected order statistic. For an exponential distribution Equation 2.2.17 reduces to the very simple expression
$$E(z_i) = \sum_{j=1}^{i} \frac{1}{n + 1 - j} \qquad (2.2.20)$$
and for a uniform distribution $E(z_i)$ is numerically equal to the plotting position (probability) and is given by the well-known Weibull (1939) formula,
$$E(z_i) = F_y^{-1}\left(\frac{i}{n+1}\right) = \frac{i}{n+1}. \qquad (2.2.21)$$
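The exact expressions for the extreme value Type I, exponential and uniform parents can be evaluated directly for small samples. The following is a minimal sketch, not a reproduction of any of the cited algorithms; the function names are hypothetical, and the extreme value formula is written in the alternating-sum form obtained by expanding $[1 - F_z(z)]^{n-i}$ in Equation 2.2.17 for a parent with CDF $\exp(-e^{-z})$.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant, C in the text


def expected_gumbel_order_stat(i, n):
    """Expected i-th smallest standardized extreme value Type I variate."""
    coeff = math.factorial(n) // (math.factorial(i - 1) * math.factorial(n - i))
    total = 0.0
    for r in range(n - i + 1):
        term = math.comb(n - i, r) * (EULER_C + math.log(i + r)) / (i + r)
        total += term if r % 2 == 0 else -term  # alternating signs (-1)**r
    return coeff * total


def expected_exponential_order_stat(i, n):
    """Expected i-th smallest standard exponential variate."""
    return sum(1.0 / (n + 1 - j) for j in range(1, i + 1))


def expected_uniform_order_stat(i, n):
    """Expected i-th smallest uniform(0, 1) variate (Weibull plotting position)."""
    return i / (n + 1)


n = 5
gumbel_scores = [expected_gumbel_order_stat(i, n) for i in range(1, n + 1)]
exponential_scores = [expected_exponential_order_stat(i, n) for i in range(1, n + 1)]
```

Two simple checks are available: the expected largest extreme value variate equals $C + \ln n$, and the scores of a sample of size n sum to n times the mean of the parent distribution. The alternating sum is only suitable for small n, since it suffers from the cancellation errors reported by Cunnane (1978) as n grows.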
A general approximation formula for $E(z_i)$ was first suggested by Blom (1958),
$$E(z_i) = F_y^{-1}\left(\frac{i - \alpha}{n + 1 - 2\alpha}\right) \qquad (2.2.22)$$
where $F_y^{-1}$ is the inverse of the parent CDF and $\alpha$ is a constant depending only on the form of the probability distribution. Blom recommended $\alpha$ = 0.375 for the normal distribution. Gringorten (1963) suggested $\alpha$ = 0.44 for the extreme value Type I distribution. For $\alpha$ = 0 and a uniform probability distribution Equation 2.2.22 reduces to the exact formula of Equation 2.2.21. The name plotting position is usually given to the
argument of Equation 2.2.22, the probability at which the ith observation should be plotted. Conversion of the plotting position to a standardized variate, $E(z_i)$, allows the estimation of the location and scale parameters in Equation 2.2.18 by simple regression (Lloyd, 1952). The same technique was suggested by Chow (1953) for frequency analysis in hydrology. Kimbal (1960) and Cunnane (1978) compared the performance of several plotting position formulae using the same technique. They used variance and bias of the location and scale parameter estimates as
criteria of goodness of fit for the normal and extreme value Type I distributions. Maalel and Triki (1979) used the same approach to compare the performance of different probability distributions in the
modeling of rainfall intensity-duration-frequency (IDF) structure over Tunisia. Tang (1980) applied linear regression for a Bayesian frequency analysis of annual flood discharge. Like Maalel and Triki, Tang used
the Weibull plotting position to calculate the expected standardized order statistic; independently they presented the same formulation to account for discrepancies between observed data and model predictions, and for the uncertainty of extrapolation from a limited sample of data. An expression for the overall variability of any individual prediction
may be found in many statistical textbooks such as Raiffa and Schlaifer (1961), and Gremy and Salman (1969). From Equation A.10 we have
$$S_{y_i} = S_y \left[1 + \frac{1}{n} + \frac{(z_i - \bar{z})^2}{\sum_{j=1}^{n} (z_j - \bar{z})^2}\right]^{1/2} \qquad (2.2.23)$$
where $S_y$ is the standard deviation denoting the average scatter of the data points about the regression line
$$y = A + Bz, \qquad (2.2.24)$$
where z is the standardized variate (reduced order statistic) with mean $\bar{z}$. A and B are the location and scale parameters, respectively. These parameters are obtained from the regression analysis of Equation 2.2.24. A description of these estimators is given in Appendix A.
The ratio of the difference between expected and predicted observation to the overall standard deviation, Equation 2.2.23, has a Student t distribution (Draper and Smith, 1964, p. 24). The (1 - p/2) percent confidence interval of any prediction $y_i$ is
$$\Delta_i = t(\nu, 1 - p/2)\,S_{y_i} \qquad (2.2.25)$$
where $\nu$ is the number of degrees of freedom and $S_{y_i}$ is the overall standard deviation of Equation 2.2.23.
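The regression on expected order statistics can be sketched as follows for a normal parent, using Blom's approximation with the constant 0.375 for the expected standardized order statistic. The function names are hypothetical, and the use of n - 2 degrees of freedom for the residual standard deviation is an assumption of this sketch (the estimators actually used are those of Appendix A, which is not reproduced here).

```python
import math
from statistics import NormalDist, mean


def blom_scores(n, alpha=0.375):
    """Approximate expected normal order statistics from Blom's formula."""
    nd = NormalDist()
    return [nd.inv_cdf((i - alpha) / (n + 1 - 2 * alpha)) for i in range(1, n + 1)]


def fit_order_statistics(sample, alpha=0.375):
    """Regress the ranked observations on the scores: y = A + B * z."""
    y = sorted(sample)
    n = len(y)
    z = blom_scores(n, alpha)
    zbar, ybar = mean(z), mean(y)
    szz = sum((zi - zbar) ** 2 for zi in z)
    B = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y)) / szz
    A = ybar - B * zbar
    resid = [yi - (A + B * zi) for zi, yi in zip(z, y)]
    S = math.sqrt(sum(e * e for e in resid) / (n - 2))  # assumed n - 2 d.o.f.
    return A, B, S, z


def prediction_std(S, z, z_new):
    """Standard deviation of an individual prediction at the score z_new."""
    n = len(z)
    zbar = mean(z)
    szz = sum((zi - zbar) ** 2 for zi in z)
    return S * math.sqrt(1.0 + 1.0 / n + (z_new - zbar) ** 2 / szz)


sample = [12.1, 9.4, 10.8, 8.7, 11.5, 10.2, 9.9, 13.0]  # made-up observations
A, B, S, z = fit_order_statistics(sample)
half_width_factor = prediction_std(S, z, z[-1])  # multiply by Student t quantile
```

The half width of a confidence interval would be obtained by multiplying `prediction_std` by the appropriate Student t quantile, which is not computed here because the Python standard library does not provide it.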
2.2.8. Conclusion
The regression based on order statistics is extended in this study to the more general case where the location parameter A is allowed to have some explanatory variables, Equation 2.2.18,
$$y_i = g(x_i,\theta) + \sigma_e\,E(z_i).$$
Note that if $g(x,\theta)$ is linear in the parameters ($\theta$'s) this equation is linear, and multiple linear regression will have optimal properties. If $g(x,\theta)$ is not linear, the nonlinear parameters may be estimated separately through a nonlinear procedure, and Equation 2.2.18 will be used for an overall evaluation of the model. This approach will be analyzed in more detail in Chapter 6.
Given the importance of the probability distribution in the classification and parameter estimation of hydrologic models, a review of these distributions is given in the next section.
2.3. Probability Distributions
2.3.1. Introduction
Hydrologic modeling has always involved probabilistic analysis in order to account for the stochastic nature of the processes involved.
Many discrete and continuous distributions have been found to be useful for hydrologic frequency analysis (Chow, 1964). A review of these distributions and their application in hydrology may be found in several of the hydrologic textbooks such as Yevjevich (1972), Haan (1977), Kite (1977), and Viessman et al. (1977).
2.3.2. Discrete distributions
The main discrete distributions used in hydrology are from the Bernoulli and Poisson processes. A detailed description of these distributions along with their interrelations is given in Section C.1 of Appendix C. Waymire and Gupta (1981) present a very good review of the Bernoulli and Poisson models and their application for the mathematical representation of the temporal and spatial distribution of rainfall and rainfall-driven hydrologic processes.
2.3.3. Continuous distributions
A review of the main continuous distributions and their statistical parameters is given in Section C.2 of Appendix C. The emphasis of this review was on the interrelation between different distributions. Such relations are very important in that they allow an objective choice among different probability models through regression analysis. Among these relations, the exponential distribution (Equation C.2.2) was shown to be a special case of the gamma distribution (Equation C.2.3). Other special cases of the gamma distribution are the Pearson Type III and chi-square distributions, Equations C.2.4 and C.2.9, respectively. A logarithmic transformation of the variables allows the transition from a Type II or III to Type I extreme value distribution. Along this line of interrelation, many other distributions have been derived and are widely used in the scientific fields in general and in hydrologic studies in particular. The following sections describe some of these derived distributions.
2.3.4. Logarithmically derived distributions
2.3.4.1. Log-extremal distribution. In Section C.2.9 it is shown that the extreme value Type I distribution can be transformed to a Type II or III distribution by replacing the variate (x - a) by its logarithm, log (x - a). By this transformation Equations C.2.13 and C.2.14 simplify to the more tractable form of Equation C.2.11 where the parameters are of the location and scale type. This feature of variable transformation will be explored in more detail in Chapter 4.
2.3.4.2. Lognormal distribution. The transformation of data may be based on professional knowledge of the modeled processes (Benjamin and Cornell, 1970). For example, if the data are known to result from the product of many small effects, then their logs will be the sum of the logs of these effects. From the central limit theorem, the distribution of this sum is expected to be normal (Haan, 1977), and the original data will be lognormally distributed. The lognormal distribution is also
known as the Galton distribution since Galton (1875) was the first to study it (Chow, 1964, pp. 817). Aitchison and Brown (1957) derived its statistical moments and applied it for economic analysis. Chow (1954) showed that the extreme value Type I distribution is one of its special cases.
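The pdf of the lognormal distribution is
$$f(y) = \frac{1}{\sigma_Y \sqrt{2\pi}\,y} \exp\left[-\frac{1}{2}\left(\frac{Y - \mu_Y}{\sigma_Y}\right)^2\right] \qquad (2.3.1)$$
where Y = log y, and $\mu_Y$ and $\sigma_Y$ are the mean and standard deviation of Y.
The statistical parameters for the variate y are (see Appendix B, Equations B.9 to B.12)
mean $\mu_y = e^{\mu_Y + \sigma_Y^2/2}$
variance $\sigma_y^2 = \mu_y^2\,(e^{\sigma_Y^2} - 1)$
coefficient of variation $V_y = \sigma_y/\mu_y = (e^{\sigma_Y^2} - 1)^{1/2}$
and skewness $\gamma_y = 3V_y + V_y^3$.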
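These relations between the parameters of Y = log y and the statistics of y can be evaluated with a few lines of code. The sketch below assumes natural logarithms; the function name is hypothetical.

```python
import math


def lognormal_statistics(mu_Y, sigma_Y):
    """Mean, variance, coefficient of variation and skewness of y = exp(Y)."""
    mean_y = math.exp(mu_Y + sigma_Y ** 2 / 2.0)
    cv_y = math.sqrt(math.exp(sigma_Y ** 2) - 1.0)
    var_y = (mean_y * cv_y) ** 2          # sigma_y = mu_y * V_y
    skew_y = 3.0 * cv_y + cv_y ** 3
    return mean_y, var_y, cv_y, skew_y


mean_y, var_y, cv_y, skew_y = lognormal_statistics(0.0, 1.0)
```

The skewness is always positive and depends only on the coefficient of variation, so it vanishes only in the limit where $\sigma_Y$ approaches zero.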
A location parameter, a, is often added to the lognormal distribution. Redefining Y of Equation 2.3.1 as log (y - a) leads to the three parameter lognormal distribution. Its pdf is
$$f(y) = \frac{1}{\sigma_Y \sqrt{2\pi}\,(y - a)} \exp\left[-\frac{1}{2}\left(\frac{\log(y - a) - \mu_Y}{\sigma_Y}\right)^2\right] \qquad (2.3.2)$$
with (y - a) > 0, a the location parameter, and $\mu_Y$ and $\sigma_Y$ are as defined above.
Munro and Wixley (1970) showed that the lognormal distribution can be expressed in terms of location, scale and shape parameters $\mu$, $\tau$ and $\sigma$, respectively. This was illustrated by the following reparameterization of Equation 2.3.1,
$$f(y) = \frac{1}{\sqrt{2\pi}\,\tau\,(1 + \sigma z)} \exp\left\{-\frac{1}{2\sigma^2}\,[\log(1 + \sigma z)]^2\right\} \qquad (2.3.3)$$
where
$$z = \frac{y - \mu}{\tau}, \qquad \mu = a + e^{\mu_Y}, \qquad \tau = \sigma\,e^{\mu_Y}, \qquad \text{and} \quad \sigma = \sigma_Y.$$
In this form it is obvious that the lognormal distribution approaches the normal distribution as the shape parameter approaches zero for fixed location ($\mu$) and scale ($\tau$) parameters (given that $\log(1 + \sigma z) \to \sigma z$ as $\sigma \to 0$).
2.3.4.3. Log-Pearson Type III distribution. If the variate (y - a) of the Pearson Type III distribution, Equation C.2.4, is replaced by its logarithm, the resulting equation defines the log-Pearson Type III distribution, after multiplication by the Jacobian of the transformation, of course. This distribution was recommended by the United States Water Resources Council for flood flow frequency analysis, WRC (1976, 1977, 1981). Application of this distribution in hydrology is extensively reported in the literature (Bobee, 1975; Bobee and Robitaille, 1977; Landwehr et al., 1978; Rao, 1980, 1981; Lall and Beard, 1981). The pdf of the log-Pearson Type III distribution is
$$f(y) = \frac{\lambda^k}{(y - a)\,\Gamma(k)}\,Y^{k-1} e^{-\lambda Y} \qquad (2.3.4)$$
with Y = log (y - a).
2.3.4.4. Summary. The logarithmic transformations of the extremal and Pearson distributions have no theoretical basis. Their popularity results mainly from the good fit they show in modeling many hydrological processes. The assumption of an infinite number of small multiplicative effects implied by the lognormal distribution is expected to be seldom if ever satisfied in nature, where effects are usually of finite number. Landwehr et al. (1978) compared flood statistics in real and log space and found that flood sequences are predominantly positively skewed in the real space and dominantly negatively skewed in the log space, indicating an over-transformation of the data. From these findings it is obvious that some intermediate transformation (space) may be much more appropriate for the modeling of these data. The following section gives some alternative transformations which have been reported in hydrologic and meteorologic frequency analysis studies.
2.3.5. Other derived distributions
Stidd (1953, 1968) found a good fit of daily, monthly and annual rainfall to the normal distribution with a cube root transformation of
the data. The choice of this transformation was based on speculation about the nature of the skewness of the untransformed time series. Stidd assumed that rainfall is the product of three meteorological effects, namely atmospheric vertical motion, air moisture content and rainfall duration time. Later, Stidd (1970) noted that no mathematical
proof of the previous concept was found, but that an experiment with synthetic precipitation data confirmed the cube root distribution of precipitation. Also, he generalized this concept to the Nth root normal distribution and developed a straight line plotting method for the
estimation of the parameters. If the Nth root of the observed precipitation y is normally distributed, it may be plotted as a straight line,
$$y^{\alpha} = \sigma_{\alpha}\,(z - z_0) \qquad (2.3.5)$$
where $\alpha$ = 1/N, $\sigma_{\alpha}$ is the standard deviation of $y^{\alpha}$, z is the standard normal variate and $z_0$ is the z value corresponding to the cumulative percentage of zeros in the data (threshold in the truncated normal distribution). Tables of the normal cumulative distribution function (CDF) were used for evaluation of the z values. To enter this table, Stidd used empirical frequencies calculated from the previously defined Weibull plotting position, without even introducing the plotting position concept. (Evaluation of the Weibull plotting position using real data will be given in the next chapter.) Stidd developed an iterative procedure for the estimation of the parameters $\alpha$, $\sigma_{\alpha}$ and $z_0$. The procedure had as an objective function the maximization of the linearity of the log (y) vs log (z - $z_0$) plot. From his experiment with synthetic rainfall Stidd
concluded that the Nth root normal distribution has a more valid theoretical basis when compared to the empirically derived Pearson family of distributions and that the Nth root normal distribution is more appropriate for extrapolation to return periods greater than the period of
observation. Chander et al. (1979) applied the normalizing power transformation suggested by Box and Cox (1964), for the analysis of flood frequency. The Box-Cox transformation is
$$Y = \begin{cases} \dfrac{y^{\alpha} - 1}{\alpha} & \alpha \neq 0 \\ \log y & \alpha = 0. \end{cases} \qquad (2.3.6)$$
This transformation has the advantage of being continuous at $\alpha$ = 0 ($N \to \infty$, Equation 2.3.5) and not restricting $\alpha$ to the reciprocal of positive integers. A more detailed discussion of this transformation will be given in Chapter 3. Chander et al. applied this transformation to annual maximum discharges of fifteen rivers from India. These data were better fitted by the power normal (normal with Equation 2.3.6 transformation) than by the normal, lognormal, Pearson Type III, log-Pearson Type III or Gumbel distribution. The goodness of fit was the closeness (when viewed by eye) of these model predictions to the observed data when plotted on normal probability paper. The Weibull plotting position was used for the evaluation of the probability of nonexceedance with all distributions. Here again no justification was given for such a choice of the plotting position.
Independently, Maalel and Triki (1979) applied a similar transformation with the normal and extreme value distributions for frequency analysis of average rainfall intensity of fixed duration. This transformation was
$$Y = \begin{cases} y^{\alpha} & \alpha \neq 0 \\ \log y & \alpha = 0. \end{cases} \qquad (2.3.7)$$
This transformation has the disadvantage of not being continuous at $\alpha$ = 0, although its performance is the same as the Box-Cox transformation.
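The difference in behavior of the two transformations near a zero exponent can be checked numerically. The sketch below assumes natural logarithms and uses hypothetical function names.

```python
import math


def box_cox(y, alpha):
    """Box-Cox power transformation: (y**alpha - 1) / alpha, or log y at alpha = 0."""
    if alpha == 0.0:
        return math.log(y)
    return (y ** alpha - 1.0) / alpha


def simple_power(y, alpha):
    """Plain power transformation: y**alpha, or log y at alpha = 0."""
    if alpha == 0.0:
        return math.log(y)
    return y ** alpha


y = 5.0
small = 1.0e-6
near_zero_box_cox = box_cox(y, small)      # close to log(5)
near_zero_power = simple_power(y, small)   # close to 1, not to log(5)
at_zero = box_cox(y, 0.0)
```

As the exponent approaches zero the Box-Cox value tends to log y, whereas the plain power tends to 1 for every y, which is the discontinuity noted above. Since the two transformations differ only by a linear rescaling for any fixed nonzero exponent, a location and scale fit is unaffected, which is consistent with their equal performance.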
Rainfall depths recorded at ten meteorological stations from Tunisia were analysed for durations ranging from 5 to 120 minutes. Among their conclusions, Maalel and Triki found that with the power exponent $\alpha$ (Equation 2.3.7) as a third parameter, the normal, extremal, and many other two-parameter distributions of the exponential type fitted most of their data equally well. A direct computer search algorithm was developed to solve for the location and scale parameters and the appropriate transformation $\alpha$. The solution was based on the linear relation between transformed variables and expected standardized order statistic,
$$Y_i = A\,E(z_i) + B \qquad (2.3.8)$$
where Y is defined in Equation 2.3.7, $E(z_i)$ is the standardized order statistic of the distribution of interest, and A, B are model parameters. For the normal distribution this method is similar to that of Stidd (1970) and Equation 2.3.8 is the same as Equation 2.3.5 with $E(z_i) = z$, $A = \sigma_{\alpha}$, and $B = -\sigma_{\alpha} z_0$. The parameters A, B are estimated by simple linear regression for a fixed grid of $\alpha$'s, ranging from -1.20 to 1.20. The optimal $\alpha$ was the one giving the minimum mean square error (MSE) of the untransformed precipitation,
$$MSE = \left[\frac{1}{n} \sum_{i=1}^{n} (y_i - y_i^{o})^2\right]^{1/2} \qquad (2.3.9)$$
where $y_i^{o}$ is the observed value. This method will be analyzed in more detail and its performance compared to other methods in Chapters 3 and 5.
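The search over the power exponent can be sketched as follows for a normal parent. This is only an illustration of the procedure described above, not the original algorithm: the grid spacing of 0.05, the use of the Weibull plotting position to obtain the standardized variate, the skipping of exponents whose back-transformation is undefined, and all function names are assumptions of this sketch.

```python
import math
from statistics import NormalDist, mean


def transform(y, alpha):
    """Power transformation: y**alpha, or log y at alpha = 0."""
    return math.log(y) if alpha == 0.0 else y ** alpha


def back_transform(Y, alpha):
    """Inverse of transform()."""
    return math.exp(Y) if alpha == 0.0 else Y ** (1.0 / alpha)


def fit_for_alpha(y_sorted, z, alpha):
    """Regress Y = A * z + B and return (A, B, error in the original space)."""
    Y = [transform(v, alpha) for v in y_sorted]
    zbar, Ybar = mean(z), mean(Y)
    A = sum((zi - zbar) * (Yi - Ybar) for zi, Yi in zip(z, Y)) / sum(
        (zi - zbar) ** 2 for zi in z
    )
    B = Ybar - A * zbar
    Y_hat = [A * zi + B for zi in z]
    if alpha != 0.0 and min(Y_hat) <= 0.0:
        return None  # back-transformation undefined (assumed: skip this alpha)
    try:
        y_hat = [back_transform(v, alpha) for v in Y_hat]
    except OverflowError:
        return None
    n = len(y_sorted)
    err = math.sqrt(sum((p - o) ** 2 for p, o in zip(y_hat, y_sorted)) / n)
    return A, B, err


def search_alpha(sample):
    """Grid search for alpha in [-1.20, 1.20] with an assumed step of 0.05."""
    y = sorted(sample)
    n = len(y)
    nd = NormalDist()
    z = [nd.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]  # Weibull position
    best = None
    for k in range(-24, 25):
        alpha = round(k * 0.05, 2)
        result = fit_for_alpha(y, z, alpha)
        if result is not None and (best is None or result[2] < best[3]):
            best = (alpha, result[0], result[1], result[2])
    return best  # (alpha, A, B, error)


# Synthetic sample whose square root is exactly linear in the normal scores.
scores = [NormalDist().inv_cdf(i / 21) for i in range(1, 21)]
sample = [(5.0 + zi) ** 2 for zi in scores]
best_alpha, A, B, err = search_alpha(sample)
```

For the synthetic sample the search is expected to select the exponent 0.5, since the square roots of the data lie exactly on a straight line against the normal scores. With real data the minimum is usually flatter, and a finer grid around the selected exponent might be worthwhile.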
Salas et al. (1980, p. 71) suggested a more general transformation for the analysis of annual and monthly rainfall and runoff time series,
$$Y = \begin{cases} (y - b)^{\alpha} & \alpha \neq 0 \\ \log(y - b) & \alpha = 0. \end{cases} \qquad (2.3.10)$$
They found good approximations to the normal distribution for $\alpha$ = 0, b = 0 and $\alpha$ = 1/2, 1/3, and 1/4 for all the data they analyzed. This was in good agreement with Stidd (1970), who found that $\alpha$ should be between 1/2 and 1/3, but stated that it should not become smaller than 1/3. Iyengar (1982) applied the square root transformation ($\alpha$ = 1/2) for monthly rainfall series along with a change of sign from year to year,
$$Y_{ij} = (-1)^{j+1}\,(y_{ij})^{1/2} \qquad (2.3.11)$$
where i = 1, 2, ..., 12 months, and
j = 1, 2, ..., n years.
This transformation resulted in an approximately symmetrical process with a 24 month period. The transformed data from ten Indian rainfall
stations were well fit by a straight line on a normal probability paper plot.
2.4. Summary
Hydrologic models are deterministic, probabilistic or stochastic depending on the probability distribution of the errors. These errors are defined by the deviation of model predictions from the unknown real
values as estimated by sample observations. A uniform approach for evaluating the reliability of these predictions may be accomplished by linear regression analysis on the ranked observations and the expected order statistics. This approach was shown to be a very powerful technique, leading to optimal estimates of expected values and their
associated confidence intervals including statistical uncertainties. Transformations other than the classical logarithmic transformation define a better space on which inferences about model goodness of fit and estimated parameters are possible.
The problem of finding the best probability distribution for hydrologic reliability analysis may be solved following the same approach, with g(x,θ) equal to a constant, say μ, in Equation 2.2.18. As will be seen in the next chapter, most probability distributions may be reduced to a location-scale type distribution when the variables are expressed in the right space. Chapter 3 starts with a review of generalized probability distributions and ends with an even more general distribution with a form as simple as Equation 2.3.8.
CHAPTER 3
GENERALIZED PROBABILITY DISTRIBUTIONS
3.1. Generalized Gamma Distribution
3.1.1. Introduction
A review of different probability distribution models used in hydrology was given in Chapter 2. The review included derived distributions, such as the lognormal, log-extremal, log-gamma and Nth root normal models. This chapter will focus on the relation between these distributions. First, generalized gamma and extreme value distributions are presented. Then, a new parameterization is introduced, leading to an even more general distribution with a much simpler form, of the location and scale parameter type (Equation 2.2.24). The new generalized distribution can be easily fitted to a small sample of data. Confidence intervals for parameter estimates and model predictions are well defined given the linearity in the parameters of these models.
3.1.2. Historical background
A three-parameter generalized gamma distribution (GGD) was discussed by Stacy (1962). The probability density function (pdf) of this distribution is
$$f(y;a,b,k) = \frac{b\,y^{bk-1}}{a^{bk}\,\Gamma(k)}\exp\left[-\left(\frac{y}{a}\right)^{b}\right] \qquad (3.1.1)$$
where a is a scale parameter, b and k define a shape parameter d=bk, and y, a, b and k > 0.
This distribution is widely used in reliability analysis when the observed data fail to follow any of the more familiar probability models described in Chapter 2. Most of these distributions are special cases of the GGD, e.g., the exponential (b=k=1), Weibull (k=1), and gamma (b=1). The GGD is not the same as the familiar three-parameter gamma distribution, where the third parameter is a location parameter (Equation C.2.4).
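The special cases named above can be checked numerically. The sketch below codes the Stacy density of Equation 3.1.1 and compares it with textbook forms of the exponential, Weibull and gamma densities; the function names and the test point are illustrative assumptions, not part of the original study.

```python
import math

def ggd_pdf(y, a, b, k):
    """Generalized gamma density of Equation 3.1.1 (y, a, b, k > 0)."""
    d = b * k  # shape parameter d = bk
    return b * y ** (d - 1.0) / (a ** d * math.gamma(k)) * math.exp(-((y / a) ** b))

def exponential_pdf(y, a):
    return math.exp(-y / a) / a

def weibull_pdf(y, a, b):
    return (b / a) * (y / a) ** (b - 1.0) * math.exp(-((y / a) ** b))

def gamma_pdf(y, a, k):
    return y ** (k - 1.0) * math.exp(-y / a) / (a ** k * math.gamma(k))

y = 1.3  # arbitrary evaluation point
print(ggd_pdf(y, 2.0, 1.0, 1.0), exponential_pdf(y, 2.0))   # b = k = 1
print(ggd_pdf(y, 2.0, 1.7, 1.0), weibull_pdf(y, 2.0, 1.7))  # k = 1
print(ggd_pdf(y, 2.0, 1.0, 2.5), gamma_pdf(y, 2.0, 2.5))    # b = 1
```

Each printed pair should agree to rounding error, which is all the special-case statement claims.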
Maximum likelihood estimates from samples of limited size were
derived by Parr and Webster (1965) for the parameters a, b and k along with their asymptotic variances. Stacy and Mihram (1965) extended the GGD to a more general distribution by allowing the parameter b to have negative values. They also compared parameter estimators obtained by
the method of moments, maximum likelihood, and minimum variance. Harter (1967) introduced a further generalization to the GGD by defining a
fourth parameter (location parameter). He developed an iterative procedure to solve the likelihood equations for the parameter estimates. These equations were obtained by equating to zero the first partial
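derivatives of the logarithm of the likelihood function with respect to each of the parameters (Equation 2.2.6). The asymptotic variance-covariance matrix of the estimates was estimated by the inverse of the information matrix, M(n), which is composed of the expected values of the second partial derivatives of the likelihood function with respect to the parameters,
$$M(n) = -nE\begin{bmatrix} \dfrac{\partial^{2}L}{\partial a^{2}} & \dfrac{\partial^{2}L}{\partial a\,\partial b} & \dfrac{\partial^{2}L}{\partial a\,\partial k} \\ \dfrac{\partial^{2}L}{\partial a\,\partial b} & \dfrac{\partial^{2}L}{\partial b^{2}} & \dfrac{\partial^{2}L}{\partial b\,\partial k} \\ \dfrac{\partial^{2}L}{\partial a\,\partial k} & \dfrac{\partial^{2}L}{\partial b\,\partial k} & \dfrac{\partial^{2}L}{\partial k^{2}} \end{bmatrix} \qquad (3.1.2)$$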
From the analysis of the components of this matrix, Harter (1967) found a high negative correlation between the estimates b and k. This should be expected, since, as stated earlier, they are related to the same parameter through their product. This correlation increased the number of iterations required when b and k were estimated simultaneously. A more stable solution was reached for the estimation of their product d, the shape parameter. This feature of reparameterization will be applied at the end of this chapter for a better mapping of the GGD.
Hager and Bain (1970) and Hager et al. (1971) presented a detailed analysis of the properties of maximum likelihood parameter estimates for the GGD. Their main finding was that these estimates are "ill-behaved," in that the solution to the likelihood equations may not exist, and even when it does exist, the convergence of the iterative procedure is very slow. They also found that $\hat{k}$, $\hat{b}/b$ and $(\hat{a}/a)^{b}$ are distributed independently of a and b, and that the assumed asymptotic normal distribution of the estimate of k was not satisfied even for sample sizes as high as 400. The last result was based on synthetic data generated with k values of 1 and 2.
Prentice (1974) and Farewell and Prentice (1977) introduced a reparameterization and a logarithmic transformation to the GGD. From the new form of the distribution they showed that the lognormal distribution is a special case of the GGD ($k \to \infty$). For a better mapping of the lognormal distribution in the parameter space, they defined a new parameter $q=k^{-1/2}$. Therefore, as q approaches zero the GGD will approach the lognormal distribution. When Equation 3.1.1 is written for the variate Y=log y, the following relationships may be developed:
$$Y = \log a + \frac{z}{b} \qquad (3.1.3a)$$
or
$$z = b\,(Y - \log a) \qquad (3.1.3b)$$
where z is the reduced variate with pdf
$$f(z,k) = \frac{1}{\Gamma(k)}\exp\left(kz - e^{z}\right) . \qquad (3.1.4)$$
This is the log-GGD (LGGD) probability density function; Y follows a location-scale model in which the reduced variate z is the log of a Γ(k) variate (Prentice, 1977).
The LGGD as defined above has the extreme value Type I (k=1) and the lognormal ($k \to \infty$) as special cases. The relation between the lognormal and gamma distributions was established by Bartlett and Kendall (1946). By introducing the reparameterization $q=k^{-1/2}$, Prentice (1974) extended the GGD to negative q; this was accomplished by reflecting the pdf at fixed q about the origin (Figure 3.1). The reduced variate z, Equation 3.1.3b, has a cumulant generating function $\log[\Gamma(k+s)/\Gamma(k)]$ in the argument s, from which the mean and variance of z are
$$\mu_z = \psi(k) = \frac{d \log \Gamma(k)}{dk} \qquad (3.1.5)$$
and
$$\sigma_z^{2} = \psi'(k) = \frac{d\,\psi(k)}{dk} \qquad (3.1.6)$$
where $\psi(k)$ and $\psi'(k)$ are the digamma and trigamma functions, respectively.

Figure 3.1. Generalized Gamma Distribution (density curves for several values of q).

Prentice (1977) showed that the standardized variate
$$z_0 = \frac{z-\mu_z}{q} = (z-\mu_z)\,k^{1/2} \qquad (3.1.7)$$
has mean zero and variance approaching one as k approaches infinity (q=0).
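The limiting behavior of the variance can be checked with a short numerical sketch. The Python standard library has no digamma or trigamma function, so the code below approximates both by central differences of `math.lgamma`; the helper names and the step size h are assumptions made for this illustration only.

```python
import math

def digamma(k, h=1e-3):
    """Central-difference approximation to psi(k) = d log Gamma(k) / dk."""
    return (math.lgamma(k + h) - math.lgamma(k - h)) / (2.0 * h)

def trigamma(k, h=1e-3):
    """Central-difference approximation to psi'(k)."""
    return (math.lgamma(k + h) - 2.0 * math.lgamma(k) + math.lgamma(k - h)) / (h * h)

for k in (1.0, 10.0, 100.0):
    mean_z = digamma(k)        # Equation 3.1.5
    var_z = trigamma(k)        # Equation 3.1.6
    var_z0 = k * var_z         # variance of z0 = (z - mean_z) * k**0.5
    print(k, mean_z, var_z, var_z0)
```

The last printed column is the variance of the standardized variate; it starts near $\pi^2/6$ at k=1 and moves toward one as k grows.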
Equations 3.1.3a and 3.1.7 may be combined to give the following expression for Y
$$Y = m + \sigma z_0 \qquad (3.1.8)$$
where
$$m = \log a + \frac{\mu_z}{b} \qquad (3.1.9)$$
and
$$\sigma = b^{-1}k^{-1/2} . \qquad (3.1.10)$$
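With these parameters, the LGGD pdf given in Equation 3.1.4 is extended to
$$h(Y;m,\sigma,q) = \begin{cases} \dfrac{|q|}{\sigma\,\Gamma(q^{-2})}\exp\left(z\,q^{-2} - e^{z}\right), & q \neq 0 \\[2ex] \dfrac{1}{\sigma\sqrt{2\pi}}\exp\left(-\tfrac{1}{2}z_0^{2}\right), & q = 0 \end{cases} \qquad (3.1.11)$$
where $z = q z_0 + \mu_z$, $z_0 = (Y-m)/\sigma$, and $\mu_z = \psi(q^{-2})$.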
Extreme value Type I distributions for maxima and minima are easily shown to be special cases of the LGGD for q=-1 and q=1, respectively. The normal distribution is now at the center of the parameter space, q=0.
The main feature of this reparameterization is the reduction of all the distributions included in the LGGD to the simple form of a location and scale type distribution. Such parameters are easily estimated due to their linearity with the reduced variates. Prentice (1974) noticed that the number of parameters to estimate may be reduced by one since a maximum likelihood estimate of m is given by the mean of Y=log y for all values of q,
$$\hat{m} = \bar{Y} = \frac{1}{n}\sum_{i=1}^{n}\log y_i . \qquad (3.1.12)$$
The number of parameters estimated iteratively is then reduced to two, namely σ and q. From a simulation study where the estimated parameters were compared to their true values, Prentice found that although true and unbiased starting values for q and σ were used, the iterative procedure failed to converge for 5 out of 400 samples of size 25 from the normal distribution (q=0). The rate of failure was much higher for samples of the same size from the extreme value distribution (|q|=1). The efficiency of the estimate $\hat{\sigma}^{2}$ ranged from 100% to 84% for the normal and extreme value distributions, respectively. Such convergence problems result from the high nonlinearity of the maximum likelihood equations in the parameters q and σ. These are
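$$\sum_{i=1}^{n}\left[(e^{z_i} - q^{-2})\,q\,z_{0i}\right] = n \qquad (3.1.13a)$$
$$\sum_{i=1}^{n}\left[(e^{z_i} - q^{-2})\,q\,z_{0i} + 2q^{-2}(z_i - \mu_z) - 2q^{-2}\,\psi'(q^{-2})\,(e^{z_i} - q^{-2})\right] = n \qquad (3.1.13b)$$
where $z_i = q z_{0i} + \mu_z$, $z_{0i} = (Y_i - \bar{Y})/\sigma$, and n = sample size.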
The asymptotic variance-covariance matrix of the parameter estimates $(\hat{m}, \hat{\sigma}, \hat{q})$ was shown to reduce to a diagonal matrix with elements $\sigma^{2}/n$, $\sigma^{2}/2n$ and $6/n$ for q=0.
3.1.3. Linear regression and confidence limits
Farewell and Prentice (1977) applied a slightly simplified version of Equation 3.1.11 to several data sets from the industrial and medical literature to study the shape of the GGD. The simplified distribution expressed in terms of z and k is
$$f(z,k) = \frac{k^{k-1/2}}{\Gamma(k)}\exp\left[\sqrt{k}\,z - k\,e^{z/\sqrt{k}}\right] \qquad (3.1.14)$$
which, as shown previously, approaches the normal distribution as k approaches infinity.
Farewell and Prentice included explanatory variables $x = (x_1, \ldots, x_p)$. These regression variables were assumed to be linear in the log-transformed variable Y, such that
$$Y = m + x\theta + \sigma z \qquad (3.1.15)$$
with m and σ as defined above, z the reduced variate following the GGD (Equation 3.1.4), and $\theta = (\theta_1, \ldots, \theta_p)$ the regression coefficients of the explanatory variables. This equation is exactly the same as Equation 2.2.18 with g(x,θ)=m+xθ and E(z_i)=z. The maximum likelihood estimate of the 100p-th percentile of Y at a given $x=x_0$ and a specified distribution shape $q=q_0$ is
$$\hat{Y}_p = \hat{m}(q_0) + x_0\,\hat{\theta}(q_0) + \hat{\sigma}(q_0)\,z_p \qquad (3.1.15a)$$
where $\hat{m}$, $\hat{\theta}$ and $\hat{\sigma}$ are the maximum likelihood estimates of the parameters and $z_p$ is the 100p-th percentile of the GGD.
Given the linearity of Equation 3.1.15, the variance of the predicted percentile $\hat{Y}_p$ is estimated directly by general linear regression theory. Such solutions may be found in many statistical textbooks, e.g., Draper and Smith (1964, p. 61),
$$\sigma^{2}_{\hat{Y}_p} = X_0^{T}\,V(q_0)\,X_0 \qquad (3.1.16)$$
where $V(q_0)$ is the variance-covariance matrix of the parameter estimates, and
$$X_0 = (1, x_0, z_p)^{T} \qquad (3.1.16a)$$
is the coefficient matrix derived from Equation 3.1.15. The variance of the estimates and approximate confidence limits for $Y_p$ can be calculated at any confidence level $p_0$,
$$Y_{p,p_0} = \hat{Y}_p + z_{p_0}\,\sigma_{\hat{Y}_p} \qquad (3.1.17)$$
where $z_{p_0}$ is the GGD reduced variate corresponding to its cumulative distribution having a value equal to $p_0$. If there is no explanatory variable, Equation 3.1.15 will be of the type of Equation 3.3.1 (to be discussed later) and Equation 3.1.16 reduces to the simple expression for variance given by Equation A.10.
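The quadratic form of Equation 3.1.16 is easy to evaluate once a variance-covariance matrix is available. The sketch below assumes a single explanatory variable and a purely hypothetical 3 by 3 matrix V for the estimates of (m, θ, σ); neither the numbers nor the function name come from the original study.

```python
import math

def percentile_variance(x0, z_p, V):
    """Variance of the predicted percentile, X0' V X0 (Equation 3.1.16).

    X0 = (1, x0, z_p) is the coefficient vector of Equation 3.1.16a for one
    explanatory variable; V is the 3 x 3 variance-covariance matrix.
    """
    X0 = [1.0, x0, z_p]
    return sum(X0[i] * V[i][j] * X0[j] for i in range(3) for j in range(3))

# Hypothetical variance-covariance matrix, for illustration only.
V = [[0.04, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 0.09]]

variance = percentile_variance(2.0, 1.645, V)
std_error = math.sqrt(variance)
print(variance, std_error)
```

The standard error obtained this way is the quantity multiplied by the reduced variate in Equation 3.1.17 to form the confidence limit.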
Bain and Englehardt (1981) started from Equation 3.1.14 to develop simple procedures for approximating confidence limits for the parameters and the predictions of the Weibull and extreme value distributions. Their results were based on chi-square and Student's t approximations to some functions of the GGD maximum likelihood parameter estimates. The accuracy of their approximations was checked against results of Monte Carlo simulations and was found adequate for most applied problems.
Farewell and Prentice (1977) found no particular shape appropriate for their data and also reported a high variability of the estimated coefficients of the explanatory variables with the parameter q. But they found that addition of q to the GGD was very useful for accommodating a variety of shapes in the tail of the distribution and for identifying outliers.
Table 3.1 gives a summary of the special distributions and corresponding parameters of the GGD.

Table 3.1. Special Distributions of the GGD.

                          Parameters
Distribution        a        b     k     q
Normal              a        1     ∞     0
Extreme I max       a        b     1    -1
Extreme I min       a        b     1     1
Weibull             a        b     1     1
Rayleigh            c√2*     2     1     1
Gamma               a        1     k

*c is the parameter of the Rayleigh distribution, Equation 3.2.11.
3.1.4. Summary and implications
From this review of the GGD historical development, the interrelation between most probability models introduced in Chapter 2 becomes obvious. An adequate reparameterization of these distributions along with a logarithmic transformation of the dependent variable results in very simple relations between reduced variates and the new parameters (Equations 3.1.8 and 3.1.15). But even with these relations, estimation of the original parameters remains very difficult, due first to their dependence (Equations 3.1.9 and 3.1.10) and second to the high nonlinearity of the resulting maximum likelihood equations (Equations 3.1.13a,b). These problems are mainly due to the poor parameterization of the GGD. Because such parameterization was based on the form of classical distributions, the GGD resulted in an overparameterization. A good illustration of this overparameterization is given by the modeling of the distribution shape, described by three parameters: b, k (Equation 3.1.1) and the logarithmic transformation (Equation 3.1.4). The equivalence of the logarithmic transformation to the addition of a shape parameter was given in Chapter 2, Section 2.3.3. Any one of these three parameters would suffice for an adequate modeling of any distribution shape, although transformation of the dependent variable seems to be the most appropriate for this purpose. As will be seen below, incorporation of the shape parameter into the Box-Cox transformation results in much simpler equations that are linear in the location and scale parameters and in the transformed dependent variable.
Tukey (1957) was the first to suggest the power transformation to reduce random variables to approximate normality. The transformation he analyzed was of the type of Equation 2.3.7. Later Box and Cox (1964) modified the Tukey transformation to the transformation defined by Equation 2.3.6 and applied it to linear regression analysis. Box and Cox showed that for this type of analysis assumptions such as (1) linearity of structure of expected values, (2) constancy of error variance, (3) normality of distribution, and (4) independence of observations are better satisfied in terms of transformed than original observations.
Hinkley (1975) applied the same transformation to exponential and gamma random variables to reduce them to approximately symmetrical distributions suitable for statistical inferences based on equi-tailed order statistics.
Hernandez and Johnson (1980) investigated the large-sample behavior of the Box-Cox transformation. They evaluated the closeness of the transformed variable distribution (f) to the normal distribution (φ) using the Kullback-Leibler information number,
$$I(f,\phi) = \int f(u)\,\log\left[\frac{f(u)}{\phi(u)}\right]du . \qquad (3.1.16)$$
The optimal transformation is the one that minimizes this number. For many distributions from the gamma and Weibull families, Hernandez and Johnson derived the optimal transformation leading to the best approximation by the normal distribution. These values, along with the ratio of the information numbers for original data (a=1) and optimally transformed data ($a=a_{opt}$), are given in Table 3.2. Note the improvement in the goodness of fit ($I_{a=a_{opt}}$) and the performance of the transformation ($I_{a=1}/I_{a=a_{opt}}$) with the increase in the shape parameter of the gamma distribution. This should be expected since the higher the value of the shape parameter, the closer the gamma distribution approaches the normal distribution ($k \to \infty$, Equation 3.1.14).
Table 3.2. Optimal Power Transformation and Information Number Ratio. After Hernandez and Johnson (1980).

Distribution           a_opt      I(a=1)     I(a=a_opt)   I(a=1)/I(a=a_opt)
Gamma (a,1,.5)         0.2084     0.98175    0.1205        81.5
Exponential (a,1,1)    0.2654     0.41894    0.00278       150
Gamma (a,1,1.5)        0.2887     0.26070    0.00140       186
Gamma (a,1,2)          0.3006     0.18830    0.00051       369
Gamma (a,1,3)          0.3124     0.12067    0.00019       635
Weibull (a,b,1)        0.2654a    -          -             -
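The untransformed entry for the exponential distribution in Table 3.2 can be reproduced with a few lines of numerical integration. The sketch below assumes that the reference normal density has the same mean and variance as f (the choice that minimizes the information number over normal densities) and uses a simple midpoint rule; the function name, integration limits and step count are illustrative assumptions. For the unit exponential the integral has the closed form $\tfrac{1}{2}\log(2\pi) + \tfrac{1}{2} - 1 \approx 0.4189$.

```python
import math

def kl_to_normal(pdf, lo, hi, steps=200_000):
    """Midpoint-rule estimate of I(f, phi) = integral of f * log(f / phi),
    where phi is the normal density with the same mean and variance as f."""
    h = (hi - lo) / steps
    us = [lo + (i + 0.5) * h for i in range(steps)]
    fs = [pdf(u) for u in us]
    mean = sum(u * f for u, f in zip(us, fs)) * h
    var = sum((u - mean) ** 2 * f for u, f in zip(us, fs)) * h
    total = 0.0
    for u, f in zip(us, fs):
        if f > 0.0:
            log_phi = -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (u - mean) ** 2 / var
            total += f * (math.log(f) - log_phi)
    return total * h

# Unit exponential density, untransformed (a = 1); upper limit 50 truncates a negligible tail.
info = kl_to_normal(lambda u: math.exp(-u), 0.0, 50.0)
print(info)
```

Applying the same function to the density of a power-transformed variable would give the other columns of the table, at the cost of coding that density explicitly.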
As an alternative to the classical form of the GGD, a new class of families of probability distributions based on the Box-Cox transformation will be presented in the following sections. For fixed values of the GGD shape parameters, a family of distributions can be defined simply by transformation of the dependent variable. This generalization of the Box-Cox transformation to the GGD has apparently not been attempted in any previous work. Also, the evaluation of the relative performances of these generalized distributions and comparison to the performance of special cases of the classical GGD based on real-world data are believed to be among the unique aspects of this study.
3.2. New Parameterization of the GGD
3.2.1. Generalized normal distribution (GND)
In the previous section it was seen that the lognormal distribution is a special case of the GGD (q=0), Equation 3.1.7. If the logarithmic transformation is replaced by the more versatile power transformation
$$Y = \begin{cases} \dfrac{y^{a}-1}{a}, & a \neq 0 \\[1.5ex] \log y, & a = 0 \end{cases} \qquad (3.2.1)$$
a more general distribution of the variable y will be defined while the distribution of Y is approximated by the normal pdf
$$f(Y) = \frac{1}{\sigma_Y\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{Y-\mu_Y}{\sigma_Y}\right)^{2}\right] . \qquad (3.2.2)$$
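This distribution has the normal (a=1), lognormal (a=0) and the Nth root normal ($a=N^{-1}$) distributions as special cases. Its cumulative distribution function (CDF) is
$$F(Y) = \frac{1}{\sigma_Y\sqrt{2\pi}}\int_{-\infty}^{Y}\exp\left[-\frac{1}{2}\left(\frac{u-\mu_Y}{\sigma_Y}\right)^{2}\right]du . \qquad (3.2.3)$$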
For a given transformation ($a=a_0$) the only parameters of this distribution are the location and scale, $\mu_Y$ and $\sigma_Y$, respectively. The expected order statistic $E(z_i)$ of Equation 2.2.22 is
$$E(z_i) = F^{-1}\left[\frac{i-\alpha}{n+1-2\alpha}\right] \qquad (3.2.4)$$
where $F^{-1}$ is the inverse of the normal CDF, and the argument is the plotting position, defined in Chapter 2. Equations 3.2.2, 3.2.3 and 3.2.4 cannot be evaluated analytically. Several approximations of these equations may be found in the literature, e.g., Abramowitz and Stegun (1964, pp. 932-933). By equating the expected order statistic to the standardized normal variate the following equation is obtained (for simplicity of notation $E(z_i)$ will be replaced by Z for the rest of the text),
$$Y = \mu_Y + \sigma_Y Z \qquad (3.2.5)$$
which is the same as Equation 2.2.24. The parameters $\mu_Y$ and $\sigma_Y$ can be estimated directly by simple linear regression. From Chapter 2 we saw that in addition to the simplicity of their evaluation these estimates are unbiased and have minimum variance.
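A minimal sketch of this regression for a fixed power a is given below. It assumes the plotting-position constant 0.375 listed for the normal family in Table 3.5, uses `statistics.NormalDist.inv_cdf` from the Python standard library for the inverse normal CDF, and fits Equation 3.2.5 by ordinary least squares. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def box_cox(y, a):
    """Power transformation of Equation 3.2.1 (y > 0)."""
    return math.log(y) if a == 0 else (y ** a - 1.0) / a

def normal_scores(n, alpha=0.375):
    """Approximate expected normal order statistics (Equation 3.2.4)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - alpha) / (n + 1 - 2 * alpha)) for i in range(1, n + 1)]

def fit_line(z, y):
    """Ordinary least squares for y = intercept + slope * z."""
    n = len(z)
    z_bar, y_bar = sum(z) / n, sum(y) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    slope = s_zy / s_zz
    return y_bar - slope * z_bar, slope

def fit_gnd(sample, a):
    """Estimate (mu_Y, sigma_Y) of Equation 3.2.5 for a fixed power a."""
    ranked = sorted(box_cox(v, a) for v in sample)  # transformation preserves ranks
    return fit_line(normal_scores(len(ranked)), ranked)

# Synthetic lognormal-type sample lying exactly on the line Y = 2 + 0.5 Z.
z = normal_scores(10)
sample = [math.exp(2.0 + 0.5 * zi) for zi in z]
print(fit_gnd(sample, 0))
```

With the synthetic sample above the fit should recover a location close to 2 and a scale close to 0.5, since the transformed data fall on the regression line by construction.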
3.2.2. Generalized extreme value distribution (GED)
Reduction of a set of observations to a distribution other than normal by a power transformation was introduced by Maalel and Triki (1979), although the transformation they used (Equation 2.3.14) was not continuous at a=0. Aitkin and Clayton (1980) used the same equation to define a generalized extreme value distribution and recommended the Box-Cox transformation to ensure continuity at zero. The pdf of the GED is
$$f(Y) = \frac{1}{b}\exp\left[-\frac{Y-a}{b} - e^{-(Y-a)/b}\right] \qquad (3.2.6)$$
where Y is defined by Equation 3.2.1, and a and b here denote the location and scale parameters. It can easily be shown that the extreme value distributions and the Weibull distribution are special cases of the GED with a power transformation exponent of absolute value 1 and of 0, respectively. The CDF of Equation 3.2.6 is
$$F(Y) = \exp\left[-\exp\left(-\frac{Y-a}{b}\right)\right] . \qquad (3.2.7)$$
Contrary to the normal distribution, these equations can be evaluated analytically. The inverse of the CDF exists. Letting F(Y) = P, the above equation may be solved for Y to yield
$$Y = a - b\,\log[-\log(P)] . \qquad (3.2.8)$$
By definition of the expected order statistic Z, Equation 3.2.4, this relation may be written as
$$Y = a + bZ \qquad (3.2.9)$$
where Z is evaluated directly for any probability level P by the following equation
$$Z = F^{-1}(P) = -\log(-\log(P)) . \qquad (3.2.10)$$
Equation 3.2.9 is exactly the same as Equation 3.2.5; therefore, the same procedure may be followed for the estimation of its parameters.
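Because the extreme value CDF inverts in closed form, the reduced variate needs no tables. The short sketch below assumes the maxima form of the CDF, $F(Y)=\exp[-\exp(-(Y-a)/b)]$, and checks that the quantile computed from Z returns the starting probability; the function and argument names are hypothetical.

```python
import math

def gumbel_cdf(Y, loc, scale):
    """CDF of the transformed variable, maxima form assumed."""
    return math.exp(-math.exp(-(Y - loc) / scale))

def gumbel_reduced(P):
    """Reduced variate Z = -log(-log(P)) for 0 < P < 1."""
    return -math.log(-math.log(P))

def gumbel_quantile(P, loc, scale):
    """Y = loc + scale * Z, the linear relation between Y and Z."""
    return loc + scale * gumbel_reduced(P)

P = 0.9
Y = gumbel_quantile(P, 3.0, 2.0)
print(Y, gumbel_cdf(Y, 3.0, 2.0))  # the second value should reproduce P
```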
3.2.3. Generalized Rayleigh distribution (GRD)
If the parameters of the GGD defined by Equation 3.1.1 are set to the values $a=c\sqrt{2}$, b=2 and k=1, the resulting distribution is known as the Rayleigh distribution (Benjamin and Cornell, 1970, p. 301; Stacy and Mihram, 1965). The GRD is defined by the power transformation of the observed variable y (Equation 3.2.1). The pdf of Y is
$$f(Y) = \frac{Y-d}{c^{2}}\exp\left[-\frac{1}{2}\left(\frac{Y-d}{c}\right)^{2}\right] \qquad (3.2.11)$$
and its CDF is
$$F(Y) = 1 - \exp\left[-\frac{1}{2}\left(\frac{Y-d}{c}\right)^{2}\right] . \qquad (3.2.12)$$
Here again, analytical evaluation of these equations is straightforward, and the inverse of the CDF exists. Letting F(Y) = P, Equation 3.2.12 may be solved for Y, yielding
$$Y = d + c\,[-2\log(1-P)]^{1/2} \qquad (3.2.13)$$
which in terms of the expected order statistics reduces to
$$Y = d + cZ , \qquad (3.2.14)$$
where Z may be evaluated for any probability level, P, directly by the following equation
$$Z = F^{-1}(P) = [-2\log(1-P)]^{1/2} . \qquad (3.2.15)$$
Equation 3.2.14 is the same as Equation 3.2.9; consequently, the same solution procedure may be followed for the estimation of its parameters.
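The same round-trip check applies to the Rayleigh family. The sketch below takes the CDF as $F(Y) = 1-\exp[-\tfrac{1}{2}((Y-d)/c)^2]$ and the reduced variate as $Z=[-2\log(1-P)]^{1/2}$; the function names and the numerical values of d and c are illustrative assumptions.

```python
import math

def rayleigh_cdf(Y, d, c):
    """Rayleigh CDF of the transformed variable (Y >= d)."""
    return 1.0 - math.exp(-0.5 * ((Y - d) / c) ** 2)

def rayleigh_reduced(P):
    """Reduced variate Z = [-2 log(1 - P)]**0.5 for 0 <= P < 1."""
    return math.sqrt(-2.0 * math.log(1.0 - P))

def rayleigh_quantile(P, d, c):
    """Y = d + c * Z, linear in the location d and scale c."""
    return d + c * rayleigh_reduced(P)

P = 0.75
Y = rayleigh_quantile(P, 1.0, 4.0)
print(Y, rayleigh_cdf(Y, 1.0, 4.0))  # the second value should reproduce P
```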
Table 3.3 lists some of the classical distributions which are special cases of the new families of generalized distributions.
3.2.4. Generalized Pearson distribution (GPD)
The Pearson Type III distribution, Equation C.2.4, was shown to be a special case of the gamma distribution, which itself is a special case of the GGD with b=1. The pdf of the GPD expressed in terms of the power-transformed variable, Y, is
$$f(Y) = \frac{\lambda^{k}}{\Gamma(k)}\,(Y-d)^{k-1}\exp[-\lambda(Y-d)] . \qquad (3.2.16)$$
Definition of the different terms is the same as for Equation 2.3.7b.
Table 3.3. Special Distributions of the New Parameterized Generalized Distributions.

                                       Parameters
Family   Distribution           Location   Scale    Shape
GND      Normal                 μ_y        σ_y       1
         Lognormal              μ_y        σ_y       0
         Square root normal     μ_y        σ_y       0.5
         Nth root normal        μ_y        σ_y       1/N
         Inverse normal         μ_y        σ_y      -1
GED      Extreme I max          a          b         1
         Frechet                a          b         0
         Extreme I min          a          b        -1
GRD      Rayleigh               d          c√2       1
         Log Rayleigh           d          c√2       0
GPD      Pearson III            d          1/λ       1
         Log Pearson III        d          1/λ       0

N = positive integer
This distribution is much more complicated to deal with than the previous ones because it has four parameters. In fact the shape of the distribution is modeled by two parameters, k and the power exponent a. As mentioned earlier, this is an overparameterization of the shape of the distribution; for fixed k the power transformation will introduce enough flexibility in Equation 3.2.16 to fit any shape of the observed data. Thus, the parameter k will be replaced by its moment or maximum likelihood estimate, reducing to three the number of parameters to be estimated. Elimination of the shape parameter from the estimation scheme was reported by Kite (1977) and applied by Siswadi (1981), and Quesenberry and Kent (1982) for the selection among probability distribution models. The moment estimate is
$$\hat{k} = \left(\frac{\mu_Y}{\sigma_Y}\right)^{2} \qquad (3.2.17a)$$
where $\mu_Y$ and $\sigma_Y$ are the mean and standard deviation of the gamma reduced variate (Equation C.2.3c).
Alternatively, the maximum likelihood estimate of k can be evaluated using the polynomial approximation of Greenwood and Durand (1960),
$$\hat{k} = \frac{0.5000876 + 0.1648852\,C - 0.0544274\,C^{2}}{C} \qquad \text{for } 0 < C < 0.5772$$
and
$$\hat{k} = \frac{8.898919 + 9.059950\,C + 0.9775373\,C^{2}}{(17.79728 + 11.968477\,C + C^{2})\,C} \qquad \text{for } 0.5772 < C < 17 \qquad (3.2.17b)$$
where $C = \log(\bar{Y}) - \overline{\log Y}$, the log of the sample mean minus the mean of the logs.
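The two branches of the Greenwood and Durand approximation can be coded directly. The sketch below uses the coefficients as they are commonly quoted (linear coefficient 0.1648852 in the first branch) and treats the boundary C = 0.5772 as belonging to the first branch, which is an arbitrary choice; the function names are hypothetical. Two sanity checks follow from the exact relation $C = \log k - \psi(k)$: C = 0.5772 corresponds to k = 1, and C = 0.270363 corresponds to k = 2.

```python
import math

def gamma_shape_ml(C):
    """Approximate maximum likelihood estimate of the gamma shape k from C."""
    if not 0.0 < C <= 17.0:
        raise ValueError("approximation is stated for 0 < C <= 17")
    if C <= 0.5772:
        return (0.5000876 + 0.1648852 * C - 0.0544274 * C * C) / C
    numerator = 8.898919 + 9.059950 * C + 0.9775373 * C * C
    denominator = C * (17.79728 + 11.968477 * C + C * C)
    return numerator / denominator

def log_mean_minus_mean_log(values):
    """C = log(arithmetic mean) - mean of the logs, for positive values."""
    n = len(values)
    return math.log(sum(values) / n) - sum(math.log(v) for v in values) / n

print(gamma_shape_ml(0.5772))    # should be close to 1
print(gamma_shape_ml(0.270363))  # should be close to 2
```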
The CDF of the GPD is then
$$F(Y) = \frac{\lambda^{k}}{\Gamma(k)}\int_{d}^{Y}(u-d)^{k-1}\exp[-\lambda(u-d)]\,du = \frac{\Gamma_z(k)}{\Gamma(k)} \qquad (3.2.18)$$
where $\Gamma_z(k)$ is the incomplete gamma function with argument $z = \lambda(Y-d)$. The reduced variate, z, is equal to the inverse of the CDF, and to the expected order statistic. Therefore, the transformed variable Y is equal to
$$Y = d + \frac{Z}{\lambda} \qquad (3.2.19)$$
with $Z = F^{-1}(P)$. Note that for the GPD, as for the GND, f, F and $F^{-1}$ cannot be evaluated analytically. Tables of these functions can be found in many statistics books, such as Benjamin and Cornell (1970) and Haan (1977). Abramowitz and Stegun (1964, pp. 940-946) have some approximations for these functions. These are implemented in many computer libraries such as IMSL (1979) and SAS (1982).
3.2.5. Other generalized distributions
As stated earlier, for any fixed values, $b_0$ and $k_0$, of the GGD parameters b and k (Equation 3.1.1) a new generalized distribution can be defined by a power transformation of the dependent variable. Some special distributions from the GGD resulting from such generalizations are given in Table 3.4. These distributions are not considered further in this study; they are listed for completeness. Note, however, that the exponential and Weibull distributions are special cases of the GED, and the chi and chi-square distributions are special cases of the GPD.

Table 3.4. Other Special Distributions of the GGD Not Included in This Study (after Stacy and Mihram, 1965).

                         Parameters
Distribution          a       b     k
Exponential           a       1     1
Weibull               a       b     1
Chi-square            2       1     ν/2
Chi                   √2      2     ν/2
Half normal           √2      2     1/2
Circular normal       √2      2     1
Spherical normal      √2      2     3/2

ν = degrees of freedom

Other distributions not of the location and scale type and/or with more than three parameters have limited application in reliability analysis. Therefore, they have not been considered in this study, although some of these distributions have been shown to be potentially useful in flood frequency analysis. Such distributions include the general lambda, the Wakeby, and the kappa distributions. All three distributions are expressible in inverse form, a property that made them of special interest to Greenwood et al. (1979), who showed that for this type of distribution, probability weighted moments are the most appropriate for parameter estimation.
The inverse forms of the lambda, Wakeby and kappa distributions are, respectively,
$$y = m + aF^{b} - c(1-F)^{d} \qquad (3.2.20)$$
$$y = m + a[1-(1-F)^{b}] - c[1-(1-F)^{-d}] \qquad (3.2.21)$$
$$y = m + a\left[\frac{bF^{b}}{1-F^{b}}\right]^{1/(bc)} \qquad (3.2.22)$$
where m, a, b, c and d are the distribution parameters. The first two equations do not even have an explicit form for F, the cumulative distribution function. Note the nonlinearity in the parameters in addition to the fact that all three distributions require fitting of at least four parameters. Of the same type is the three-parameter generalized extreme value distribution analyzed by Prescott and Walden (1983)
$$x = m + ab\left[-1 + (-\log F)^{-1/b}\right] . \qquad (3.2.23)$$
A generalized lognormal distribution suitable for fitting a wide range of positively skewed hydrologic data was developed by Brakensiek (1958). This distribution was based on Chow's (1954) representation of hydrologic variables in terms of a frequency factor, K,
$$\frac{y}{\mu_y} = 1 + V_y K \qquad (3.2.24)$$
and the relation between the moments of the normal and lognormal distributions (Appendix B). The transformation to lognormality was accomplished through the following location and scale transformation
$$\frac{Y}{\mu_Y} = b\left(\frac{y}{\mu_y}\right) + (1-b) \qquad (3.2.25)$$
where $b = V_Y/V_y$, and $V_Y$ and $V_y$ are the coefficients of variation of Y and y, respectively. The generalized distribution has the form
$$\frac{Y}{\mu_Y} = \exp(cz - c^{2}/2) \qquad (3.2.26)$$
where
$$c = [\log(1+V_Y^{2})]^{1/2} \qquad (3.2.27)$$
and z is the reduced normal variate. This distribution was extended to the normal and extreme value distributions through first order approximations of the two previous equations,
$$\frac{Y}{\mu_Y} \approx 1 + cz \qquad (3.2.28)$$
$$c \approx V_Y . \qquad (3.2.29)$$
Similar approximations will be used in Chapter 5. Note that Equation 3.2.28 is similar to Equation 5.2.2. An iterative procedure was developed to choose the best transformation, b (Equation 3.2.25), satisfying Equation 3.2.28. The power transformation adopted within this dissertation (Equation 3.2.1) may easily be shown to be equivalent to a location-scale transformation in the log space (Equation 2.2.25). For the development of the new generalized distributions, no such approximations were made since the parent distributions are defined in the untransformed space, and the reduced variates have the same distribution as the parent distributions, rather than being equal to the standardized normal variate as in the Brakensiek formulation (Equation 3.2.28).
3.2.6. Summary
By this new parameterization the GGD has a much simpler analytical form: independently of the assumed parent distribution, the transformed variables are related to the expected order statistics by the same type of relation (Equation 3.2.30). The linear parameters are the location and scale parameters, A and B, while the only nonlinear parameter, a, is included in the dependent variable y,
$$Y = \begin{cases} \dfrac{y^{a}-1}{a} = A + BZ, & a \neq 0 \\[1.5ex] \log y = A + BZ, & a = 0 \end{cases} \qquad (3.2.30)$$
where y is the untransformed observation, Z is the expected order statistic, A and B are the linear parameters and a is the shape parameter.
The four generalized distributions are summarized in Table 3.5.

Table 3.5. New Generalized Family of Distributions.

Distribution      Transformed      Expected order             Plotting position
family            variable (Y)     statistic (Z)              (PP)
Normal (GND)      (y^a - 1)/a      Φ^{-1}(PP)                 (i - 0.375)/(n + 1 - 2(0.375))
Gumbel (GED)      (y^a - 1)/a      -log(-log(PP))             (i - 0.44)/(n + 1 - 2(0.44))
Rayleigh (GRD)    (y^a - 1)/a      [-2 log(1 - PP)]^{1/2}     (i - 0.44)/(n + 1 - 2(0.44))
Pearson (GPD)     (y^a - 1)/a      F_G^{-1}(PP)               (i - 0.40)/(n + 1 - 2(0.40))

Φ^{-1} = inverse cumulative normal distribution function.
F_G^{-1} = inverse cumulative gamma distribution function.
i = rank of ordered data.
n = sample size.
PP = plotting position.

The new parameterization has the following advantages over the classical form of the GGD.
1. It covers a larger family of classical distributions with a simpler form in the parameters.
2. The linear parameters A and B may be estimated independently of the nonlinear parameter a by ordinary least squares methods. Lawton and Sylvestre (1971) and Spitzer (1982) showed that such a separation increased the rate of convergence considerably, while Maalel (1983a) found that such a procedure led to less biased estimates than simple nonlinear methods.
3. Confidence intervals and statistical inferences are more easily established in the transformed space. Atkinson (1983) gave some applications of this type of transformation with the normal distribution for eliminating outlier effects and displaying influential observations. Based on the work of McCullagh (1980) and his own experience, Atkinson suggested the use of a linear model along with the GGD given by Prentice (1974) for more appropriate statistical investigations. The procedure outlined in this chapter follows exactly this suggestion, although it has been developed independently.
4. Regression of the transformed variables against the expected order statistics allows a more efficient use of the information contained in the observations, in that the rank (frequency) of the observation contributes to the estimation in addition to the observed values.
Greenwood et al. (1979) showed the merit of using the order statistics to derive analytical expressions for weighted moments of several distributions expressible in inverse form. The derived weighted moments were of simpler analytical structure than the relationships between the conventional moments and the parameters. This is in good agreement with the simplicity of the expressions derived in the previous sections, Equation 3.2.30. Landwehr and Matalas (1979) compared the probability weighted moments parameters and quantile estimates of the Gumbel distribution with the estimates from conventional moments and maximum likelihood methods. Their results showed good agreement between the weighted moments and the other two methods.
5. Inclusion of the order statistics (rank transformation) in the linear regression constitutes a bridge between parametric and nonparametric statistics. Conover and Iman (1981) and Iman and Conover (1979) gave a good illustration of this type of regression showing that it works quite well on monotonic data. This will always be the case for the regression of the transformed variable against the expected order statistics, where the CDF is always a monotonic function.
3.3. Generalized Probability Distribution Computer Program (GPDCP)
3.3.1. Solution algorithm
In the previous section it was shown that the four newly parameterized families of distributions, GNP, GEP, GRD and GPD, reduce to the simple form of Equation 3.2.30. This equation relates the expected
75
order statistics to the transformed variables by a relation linear in the location and scale parameters, A and B, respectively,
Y = A + BZ . (3.3.1)
Given a sample of n observations y=(yl, y2' ...' yn ) the expected order statistics Z=(Z1, Z2' ..'' Zn) are calculated directly from the
approximate formula for the plotting position (Equation 2.2.22). Then for a given a, the data are transformed using Equation 2.3.6. Note that
this transformation is a monotonic function and, consequently, does not change the rank of the original data and the expected order statistics. Box and Cox (1964) replaced the parameters A and B by their maximum likelihood estimates, in the search for the best transformation to normality (GNP family). Also, they noted that the maximum likelihood
estimates of the linear parameters are the least squares estimates for the transformed variable Y, and that the maximum likelihood estimate of a is better given by solving for different transformations and choosing the one that maximizes the likelihood function (Equation 2.2.7). Similar algorithms, where the linear parameters are eliminated from the nonlinear estimation, have proven to be fast and efficient compared to those where all parameters are estimated simultaneously. Such algorithms have been applied by many authors, (Lawton and Sylvestre, 1971;
Hernandez and Johnson, 1980; Spitzer, 1982; and Maalel, 1983a). Like Box and Cox, none of these authors included the expected order statistics among the explanatory (independent) variables in their
models, although parameter estimates based on expected order statistics are known to be more efficient and less biased than those estimated by the more classical procedures (Chapter 2, Section 2.2).
Gupta (1970) developed a general program for the selection among ten theoretical probability distributions from the normal and extremal families. For most of these distributions the estimation of the parameters was through linear regression of the observations on the
expected order statistics. The same procedure was used by Cunnane (1978) in an investigation of unbiased plotting positions. While Cunnane used the goodness of fit by a straight line (visual test from plot) as the best selection criterion, Gupta used the coefficient of determination $R^2$ defined as
$$R^2 = 1 - \frac{SS_e}{SS_t} \qquad (3.3.2)$$
where $SS_e$ is the sum of squares of the predicted minus observed values, and $SS_t$ is the sum of squares of the observed values minus their mean. Thirriot et al. (1981) used regression analysis of transformed variables (Equation 2.3.7) on expected order statistics calculated from the
Weibull plotting position to select among frequency distribution models from the normal and extreme value Type I families. The selection criterion was the sum of squares of the untransformed residuals. Greenwood et al. (1979) and Landwehr et al. (1978) used the expected order statistics to derive probability weighted moments. These moments led to
less biased estimates from small generated data samples than the maximum likelihood and the unweighted moments. Somerville and Bean (1982)
compared maximum likelihood to order statistic based least squares methods and found that the least squares method often gave a substantially better fit to generated data, especially in the presence of outliers or when the underlying distribution was not clearly established. Stedinger (1983b) recommended the use of probability weighted
moments with small samples, normalized by their sample means and logarithmically transformed to obtain consistent and accurate estimates of normalized flood flow distribution parameters.
The procedure outlined in this section will combine the advantages of the regression on the expected order statistics and the separation of the linear parameters from the nonlinear regression. The only nonlinear parameter is then the shape parameter a of the transformation
$$Y = \begin{cases} \dfrac{y^a - 1}{a}, & a \neq 0 \\ \log y, & a = 0 . \end{cases} \qquad (3.3.3)$$
Thus, the selection of the best distribution will be a one-dimensional problem, since for each a the parameters A and B are found from simple regression analysis (see Appendix A). A computer program (GPDCP) was developed especially for the selection among frequency models from the four generalized distributions summarized in Table 3.5. A detailed description of this program is given in the next section. Figure 3.2
lists the main steps of the selection procedure followed by the GPDCP program.
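The one-dimensional search described above can be illustrated with a minimal Python sketch. It is not the GPDCP listing of Appendix D: the function names, the plotting position $(i - c)/(n + 1 - 2c)$ used as a stand-in for Equation 2.2.22, the restriction to the normal family, and the use of $R^2$ as the only selection statistic are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

def box_cox(y, a):
    """Transformation of Equation 3.3.3: (y**a - 1)/a for a != 0, log(y) for a == 0."""
    return math.log(y) if a == 0 else (y ** a - 1.0) / a

def expected_normal_order_stats(n, c=0.0):
    """Reduced normal variates from the assumed plotting position (i - c)/(n + 1 - 2c).
    c = 0 gives the Weibull position; the general form is an assumption."""
    nd = NormalDist()
    return [nd.inv_cdf((i - c) / (n + 1 - 2 * c)) for i in range(1, n + 1)]

def fit_line(z, y_t):
    """Least squares fit of Y = A + B*Z (Equation 3.3.1); returns (A, B, R2)."""
    n = len(z)
    zm = sum(z) / n
    ym = sum(y_t) / n
    szz = sum((zi - zm) ** 2 for zi in z)
    syy = sum((yi - ym) ** 2 for yi in y_t)
    szy = sum((zi - zm) * (yi - ym) for zi, yi in zip(z, y_t))
    b = szy / szz
    return ym - b * zm, b, szy * szy / (szz * syy)

def best_transformation(sample, shapes):
    """For each candidate a, transform the ranked data and regress on Z.
    Returns the tuple (a, A, B, R2) with the largest R2."""
    y = sorted(sample)
    z = expected_normal_order_stats(len(y))
    fits = [(a,) + fit_line(z, [box_cox(v, a) for v in y]) for a in shapes]
    return max(fits, key=lambda f: f[3])
```

Because A and B come from a closed-form regression for every a, only the shape parameter has to be scanned; a finer grid of shapes simply means more calls to `fit_line`.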
3.3.2. Program description
A complete listing of the Generalized Probability Distribution Computer Program (GPDCP) source program is given in Appendix D. This section will be limited to a rather simplified description of the capabilities and options of the program, with more emphasis on the
mathematical formulation of the estimation procedure and selection criteria (statistics).
The GPDCP can handle simultaneously a virtually unlimited number of stations, each with up to 10 samples of observations. The number of observations by sample is limited to 60, but it can easily be increased by simply changing the dimension statement. For each sample the best frequency model may be selected from up to 100 (4 families and 25 transformations) theoretical distributions.

Figure 3.2. Flow Chart for the Generalized Probability Distribution Computer Program (GPDCP).

(START)
1. Read control flags and input data = series of a's, samples information, and observed data.
2. Rank the data in ascending order if they are not already sorted (subroutine ORDER).
3. Calculate empirical frequencies and expected order statistics for each generalized family of distributions according to Table 3.5 (subroutines CDF, GSSI, MDCHI).
4. For each family of distribution and for each a, transform the data (Equation 3.3.3) and perform linear regression analysis (subroutine REGRE).
5. Calculate selection statistics and save them for later comparison. If end of a's and end of distribution families go to next step, if not back to step 4.
6. Choose the best frequency model according to the specified selection statistic and output regression results for all models.
7. Summarize performance of the best model and perform goodness of fit test for predictions and residuals.
(END)

The observed data may optionally be multiplied by a scale factor
(CP) and shifted by a constant (CT) using the transformation
$$y = y \times CP + CT . \qquad (3.3.4)$$
This option was introduced in the program to allow investigation of the effect of change of units and standardization of the observations on the
selection procedure. When the input data are not ranked, subroutine ORDER sorts them in ascending order. The empirical cumulative frequencies are then calculated using plotting formulae listed in Table
3.5; this calculation is performed by subroutine CDF. From these frequencies, the expected order statistics are calculated using the inverse of the cumulative distribution function of each family. The inverse of the cumulative normal distribution, $\Phi^{-1}$, is solved for the reduced normal variate (expected order statistic) using subroutine RNV. This subroutine is based on a rational approximation for $\Phi^{-1}$ (Abramowitz and
Stegun, 1964, Equation 26.2.23). The inverse of the cumulative Pearson distribution is evaluated using IMSL subroutine MDCHI (IMSL, 1979) and the relation between the gamma and chi-square variates (Section C.2.7)
zG = zCH
(3.3.5)
VCH k/2
where $z_G$ and $z_{CH}$ are the reduced variates of the gamma and chi-square distributions, respectively, $\nu_{CH}$ is the number of degrees of freedom of the chi-square distribution, and $k$ is the shape parameter of the gamma distribution (estimated by Equation 3.2.17a). The expected order statistics
for the Gumbel and Rayleigh general distributions are calculated directly from the mathematical expressions for the inverse of their respective cumulative distributions listed in Table 3.5.
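As an illustration of the rational approximation cited above, the following Python sketch evaluates the reduced normal variate with the constants of Abramowitz and Stegun (1964), Equation 26.2.23, whose stated absolute error is below 4.5e-4. The function name and the handling of the lower tail by symmetry are assumptions; this is not the RNV subroutine itself.

```python
import math

def reduced_normal_variate(p):
    """Approximate inverse of the standard normal CDF at probability p (0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = min(p, 1.0 - p)                 # tail probability, q <= 0.5
    t = math.sqrt(-2.0 * math.log(q))   # t = sqrt(ln(1/q**2))
    num = 2.515517 + 0.802853 * t + 0.010328 * t * t
    den = 1.0 + 1.432788 * t + 0.189269 * t * t + 0.001308 * t ** 3
    x = t - num / den                   # upper-tail quantile for probability q
    return -x if p < 0.5 else x         # lower tail follows by symmetry
```

For a plotting position of 0.975 the function returns a value close to 1.96, within the accuracy quoted above.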
Regression analysis is then performed on the transformed observations and expected order statistics, as dependent and independent variables, respectively. The analysis is made according to the solution
procedure outlined in Appendix A by subroutine REGRE. For each family and for all specified transformations (up to 100 models) the program calculates four selection statistics. These are: 1. coefficient of determination, 2. standard error, 3. weighted sum of squares, and 4. maximum likelihood function. The four selection statistics were included in the program to allow the comparison of their performance and to give the user the option to choose his/her own criteria. Definitions of these selection statistics are as follows:
1. Coefficient of Determination ($R^2$)
This coefficient is the same as the one defined by Equation 3.3.2, except that herein the residuals refer to the transformed variable Y. Thus, $R^2$ is the well-known coefficient of determination of the fitted straight line (Equation 3.3.1). It is calculated directly within the subroutine REGRE by the formula
$$R^2 = \frac{[\mathrm{Cov}(Z,Y)]^2}{\mathrm{Var}(Z) \times \mathrm{Var}(Y)} \qquad (3.3.6)$$
where the variables are as defined above.
2. Standard Error (STDE)
This is defined by the square root of the mean square of deviations between observed and predicted untransformed variables. The standard error has exactly the same expression as the mean square error (MSE) of Equation 2.3.9.
$$STDE = \left[ \frac{\sum_{i=1}^{n} (y_{oi} - y_{pi})^2}{n} \right]^{1/2} \qquad (3.3.7)$$
where $y_o$ is the observed variable and $y_p$ is predicted.
3. Weighted Sum of Squares (WSS)
From Section 2.2.5 it was seen that the maximum likelihood method was equivalent to a weighted least squares, where the weights are defined as the inverse of the variances (Equation 2.2.11)
$$w_i = \frac{1}{\sigma_{y_i}^2} \qquad (3.3.8)$$
Following the procedure adapted by Sorooshian and Dracup (1980) and Sorooshian (1981) and first applied by Box and Hill (1974), the weights of this equation are expressed in terms of the variance of the transformed variable Y=f(y). From the relation between these two variances (discussed in Chapter 5, Section 5.3.2) we have
$$\sigma_y^2 = \frac{\sigma_Y^2}{f'(y)^2} \qquad (3.3.9)$$
and as will be shown in Equation 5.3.6
$$f'(y) = y^{a-1} \qquad (3.3.10)$$
where f' is the first derivative of the transformation (Equation 3.3.3). Substitution of these expressions into Equation 3.3.8 gives
$$w_i = \frac{1}{\sigma_{y_i}^2} = \frac{y_i^{2a-2}}{\sigma_Y^2} . \qquad (3.3.11)$$
The weighted sum of squares reduces then to
$$WSS = \sum_{i=1}^{n} \frac{y_i^{2a-2}}{\sigma_Y^2} (y_{pi} - y_{oi})^2 = \sum_{i=1}^{n} \left[ (y_{pi} - y_{oi}) \, y_i^{a-1} \right]^2 / \sigma_Y^2 . \qquad (3.3.12)$$
The last expression is the one used by the GPDCP program for calculation of the WSS. The variance $\sigma_Y^2$ is calculated by subroutine REGRE using Equation A.5. Note that for our case the predicted values $y_p$ are calculated from the predicted transformed variables $Y_p$ (Equation 3.3.1) and the inverse of the Box-Cox transformation (Equation 3.3.3),
$$y_p = \begin{cases} (a Y_p + 1)^{1/a}, & a \neq 0 \\ \exp(Y_p), & a = 0 \end{cases} \qquad (3.3.13)$$
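The second and third selection statistics can be sketched in a few lines of Python. The function names are hypothetical, the weights are evaluated at the observed values (an assumption, since the text does not say which y enters the weight), and `var_Y` stands for the residual variance of the transformed variable that REGRE obtains from Equation A.5.

```python
import math

def inverse_box_cox(Y, a):
    """Equation 3.3.13: back-transform a predicted transformed value."""
    return math.exp(Y) if a == 0 else (a * Y + 1.0) ** (1.0 / a)

def stde(y_obs, y_pred):
    """Equation 3.3.7: root of the mean squared deviation in real space."""
    n = len(y_obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / n)

def wss(y_obs, y_pred, a, var_Y):
    """Equation 3.3.12 with weights y**(2a - 2) / var_Y.
    Weights use the observed y (assumed)."""
    total = sum(((p - o) * o ** (a - 1.0)) ** 2 for o, p in zip(y_obs, y_pred))
    return total / var_Y
```

With a = 1 (no transformation) every weight equals 1/var_Y and the WSS reduces to the ordinary sum of squares divided by the variance.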
4. Maximum Likelihood Function
Expressed in terms of the untransformed variable y, the maximum likelihood function, defined by Equation 2.2.7, reduces to
$$L = \frac{1}{(2\pi)^{n/2} \sigma_Y^n} \exp\left[ -\frac{1}{2} \sum_{i=1}^{n} \left( \frac{Y_i - A - BZ_i}{\sigma_Y} \right)^2 \right] J(a,y) \qquad (3.3.14)$$
where J(a,y) is the Jacobian of the transformation of Equation 3.3.3 (Box and Cox, 1964)
$$J(a,y) = \prod_{i=1}^{n} \left| \frac{dY_i}{dy_i} \right| = \prod_{i=1}^{n} y_i^{a-1} . \qquad (3.3.15)$$
The logarithm of the likelihood function, $\ell$, is, after replacing the Jacobian by its expression into Equation 3.3.14,
$$\ell = (a-1) \sum_{i=1}^{n} \log y_i - \frac{n}{2} \log(2\pi) - n \log(\sigma_Y) - \frac{1}{2\sigma_Y^2} \sum_{i=1}^{n} (Y_i - A - BZ_i)^2 . \qquad (3.3.16a)$$
If in the last term the variance $\sigma_Y^2$ is replaced by its maximum likelihood estimate, Equation A.4, $\ell$ reduces to
$$\ell = (a-1) \sum_{i=1}^{n} \log y_i - \frac{n}{2} \log(2\pi) - n \log(\sigma_Y) - \frac{n}{2} . \qquad (3.3.16b)$$
For a given sample, the second and the last terms are constant; thus, maximizing $\ell$ will be equivalent to maximizing the reduced expression
$$NXLF = (a-1) \sum_{i=1}^{n} \log y_i - n \log(\sigma_Y) . \qquad (3.3.17)$$
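A minimal Python sketch of the reduced log-likelihood of Equation 3.3.17 follows. The function name is hypothetical, and the variance estimate uses the divisor n, which is the usual maximum likelihood form and is assumed here to match Equation A.4.

```python
import math

def reduced_log_likelihood(y_obs, residuals_Y, a):
    """(a - 1) * sum(log y_i) - n * log(sigma_Y), with sigma_Y estimated from
    the residuals of the transformed regression (divisor n, assumed)."""
    n = len(y_obs)
    sigma_Y = math.sqrt(sum(e * e for e in residuals_Y) / n)
    return (a - 1.0) * sum(math.log(v) for v in y_obs) - n * math.log(sigma_Y)
```

The first term is the logarithm of the Jacobian; it is what makes values computed for different transformations a comparable on the scale of the original observations.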
This is the fourth selection statistic calculated by the GPDCP program. The performance of the selected frequency distribution model is then analyzed in more detail; residuals and confidence limits are calculated for predicted values in real and transformed spaces. Statistical tests
including Student t, F test, and Durbin-Watson statistic are performed on the residuals of the linear regression. A simple description of these tests follows:
Student t test. The Student t statistic tests the deviation of each observation from the fitted straight line. The t statistic is calculated from Equation 2.2.25
$$t_i = \frac{\Delta Y_i}{\sigma_{Y_i}} = \frac{e_i}{\sigma_{Y_i}} \qquad (3.3.18)$$
where $\sigma_{Y_i}^2$ is the variance of individual predictions, calculated using Equation 2.2.23. This statistic is then compared to the value $t_0(\nu, 1 - p_0/2)$, where $\nu$ is the number of degrees of freedom (= n-3), and $p_0$ is the percent confidence (reliability) level on which the comparison is made. For given $\nu$ and $p_0$, t (Equation 3.3.18) should be less than or equal to $t_0(\nu, 1 - p_0/2)$ to conclude that observation $Y_i$ is within the $p_0$ percent confidence interval about the fitted line. Tables for $t_0$ are available in many statistics textbooks. The GPDCP program calculates $t_0$ by calling the IMSL subroutine MDSTI (IMSL, 1979).
F test. This is the appropriate test statistic for testing the linearity of the regression function. The F statistic is defined as the ratio of the mean square due to regression to the mean square due to residual variation (Draper and Smith, 1966, p. 24). Each of these two mean squares, when multiplied by its degrees of freedom, follows a $\chi^2$ distribution with 1 and n-2 degrees of freedom, respectively,
$$F = \frac{MS_r}{MS_e} = \frac{\sum (Y_p - \bar{Y})^2}{\sum (Y_o - Y_p)^2 / (n-2)} \sim \frac{\chi^2(1)}{\chi^2(n-2)/(n-2)} \qquad (3.3.19)$$
where $Y_o$ is the observed transformed variable, $Y_p$ is its predicted value, and $\bar{Y}$ is the mean of the observed values.
The ratio of two independent $\chi^2$ distributed random variables, each divided by its degrees of freedom, is known to follow an F distribution with degrees of freedom equal to those of the $\chi^2$ distributions, 1 and n-2 for this case. Draper and Smith noted that this statistic is exactly the same as the Student t test for a zero slope in the case of fitting a straight line.
The F statistic and its corresponding tail area of the F distribution are calculated in the GPDCP program by calling the IMSL subroutine RLONE.
Durbin-Watson statistic. This statistic is used in testing for first-order linear correlation in the residuals. It is defined as
$$DW = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} \qquad (3.3.20)$$
and is always in the interval 0 to 4. A DW value significantly smaller (larger) than 2 indicates the presence of positive (negative) correlation. Durbin and Watson (1951) estimated significance points for the 1, 2.5 and 5 percent levels for DW. This statistic, too, is calculated within the RLONE subroutine.
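The two regression diagnostics of Equations 3.3.19 and 3.3.20 are simple enough to sketch directly in Python. The function names are hypothetical, and the sketch only computes the statistics; the tail areas and significance points that GPDCP obtains from IMSL are not reproduced.

```python
def durbin_watson(e):
    """Equation 3.3.20: sum of squared successive differences over sum of squares."""
    num = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    den = sum(v * v for v in e)
    return num / den

def f_statistic(Y_obs, Y_pred):
    """Equation 3.3.19 for a fitted straight line: regression mean square
    (1 degree of freedom) over residual mean square (n - 2 degrees of freedom)."""
    n = len(Y_obs)
    mean = sum(Y_obs) / n
    ms_reg = sum((p - mean) ** 2 for p in Y_pred)
    ms_res = sum((o - p) ** 2 for o, p in zip(Y_obs, Y_pred)) / (n - 2)
    return ms_reg / ms_res
```

Residuals that alternate in sign, such as 1, -1, 1, -1, give DW = 3 (negative correlation), while constant residuals give DW = 0.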

O, 1 1 . P*.
(2.2.8)
i
Examples requiring alternative procedures to obtain MLE may be found in Mann et al. (1974, p. 113) and Meyer (1975, p. 314). These equations are similar to the previously defined normal equations. Usually they are not linear expressions of the parameters, and their solution requires nonlinear estimation methods. The complexity of the solution increases with that of the form of $g(x,\theta)$ and $f_{y_i}$. Maximum likelihood estimates satisfying Equation 2.2.6 will often occur at local maxima (Meyer, 1975, p. 312), but they do have the following important properties (Mann et al., 1974, p. 82): 1) If an efficient estimator, $\hat{\theta}$, of $\theta$ exists, then this estimator is the unique solution of the related likelihood function. 2) If the number of sufficient statistics is equal to the number of unknown parameters, the MLE's are the minimum variance estimators of their respective expected values. 3) Under certain general conditions the MLE's converge probabilistically (Equation 2.2.3) to the true value $\theta$, as $n \to \infty$, and are asymptotically normal and asymptotically efficient estimates of $\theta$. 4) The MLE's are invariant to a transformation with a single-valued inverse. In other words, if $\hat{\theta}$ is the MLE of $\theta$, and $t(\theta)$ is a function of $\theta$ with a single-valued inverse, $t(\hat{\theta})$ is the MLE of $t(\theta)$. This last property is a very important one; it will be used in Chapter 3 for statistical inferences about estimates from transformed variables.
2.2.5. Weighted least squares
By defining "chi-square"
$$\chi^2 = \sum_i \left[ \frac{y_i - g(x_i,\theta)}{\sigma_i} \right]^2 \qquad (2.2.9)$$
Table 4.26. Sensitivity of Optimal Selection Statistic (R2) and Corresponding Transformations (a) to Plotting Position Definition. Maximum Values of R2 are underlined.

Plotting Position
Constant            Normal (R2/a)   Gumbel (R2/a)   Rayleigh (R2/a)   Pearson (R2/a)
0.00                0.97747/0.10    0.97668/1.0     0.96774/0.70      0.98043/0.10
0.200               0.98015/0.10    0.95773/0.10    0.96401/0.10      0.98245/0.10
0.300               0.98122/0.10    0.95670/0.10    0.96428/0.10      0.98221/0.10
0.375               0.98192/0.10    0.95566/0.10    0.96453/0.10      0.98419/0.10
0.400               0.98220/0.10    0.95522/0.10    0.96456/0.10      0.98347/0.10
0.440               0.98262/0.10    0.95448/0.10    0.96456/0.10      0.98485/0.10
0.600               0.98405/0.10    0.94990/0.10    0.96455/0.10      0.98522/0.10
0.900               0.98231/0.10    0.91732/0.10    0.95699/0.10      0.98299/0.10

PROGRAM MWK FOR GENERATION OF RELIABILITY COEFFICIENTS M!4K 1
ANB SENSITIVITY ANALYSIS HUK 2
HUK 3
UK 4
DIMENSION V(20)ALF(20)ZA(20)>C0RP(202Q)jC0RL(2Q) HUK 5
DIMENSION P20)RATIO(2020)C0RR(20) HUK 6
HUK 7
HZ = NUMBER OF REDUCED VARIATES HUK 8
NA = NUMBER OF TRANSFORMATIONS HUK 9
NV = NUMBER OF COEFFICIENTS OF VARIATION HUK 10
HUK 11
READ(5f10) NZ)NANV HUK 12
10 FORMAT C5I5) HUK 13
READ(5f20) (P(I),ZA(IM=1NZ) HUK 14
20 FORMAT(F100F10*3) HUK 15
C HUK 16
PV0=01 HUK 17
PA0=0.1 HUK 18
AF0=0001 HUK 19
C HUK 20
DO 300 IZ=1NZ HUK 21
Z=ZA(IZ) HUK 22
DO 200 IV=1NV HUK 23
VI=PVOHV HUK 24
V(IV)=VI HUK 25
V2=VI+VI HUK 26
V21=l+V2 HUK 27
V21L=ALOG(V21) HUK 28
C0RL(IV)=SGRT
C HUK 30
DO 100 IA=1jNA HUK 31
AF=PA0*(IA1) HUK 32
IFA.EQ.1) AF=AFO HUK 33
ALF(IA)=AF HUK 34
IFIAF.NE.O.) AIF=1./AF HUK 35
C0R=h+.5IAF(AFl.)*V2tAF*ZWI HUK 36
IFICOR.LT.O.) C0R=0000001 HUK 37
CORP(IVIA)=CORAIF HUK 33
RAT10(IV>IA)=C0RP(IVIA)/C0RL(IV) HUK 39
CORRI IV)=C0RL(IV)/CORL(IV) HUK 40
C HUK 41
100 CONTINUE HUK 42
200 CONTINUE HUK 43
C HUK 44
URITE(6j50) P(IZ)Z(ALF{I)I=1NA> HUK 45
50 FORMAT(lHl>////>40Xf'COEFFICIENT OF RELIABILITY'//>20Xr'FOR A ' HUK 46
.'PROBABILITY OF 'F10.6>' l WITH NORMAL VARIATE Z = ',F6^3 HUK 47
JIh5X' VY '40X' ALFA'>///>14X'LOG'rllF7.3j//) HUK 48
WRITE{6>60) J=1 NA)1=1 NV) HUK 49
60 FORMAT!(F10.312F7.3/)) HUK 50
URITE(690) HUK 51
URITE(6i70) P(IZ)Z(ALF(I)I=lfNA) HUK 52
70 FORMAT<1H1j///j26X'POWER TO LOGNORMAL PREDICTED PTH PERCENTILE'j HUK 53
,' RATIO 'ill;20X>'FOR A ' HUK 54'
.'PROBABILITY OF ',F10.6' l WITH NORMAL VARIATE Z = ',F6.3i HUK 55
,///5X'VY '4OX' ALFA'///14X'L0G'11F7.3//) HUK 56
URITE(6>80) (V(I)>CORRI=lfNV) HUK 57
80 FORMAT((F7l3X12F73/)) HUK 58
URITE'6,90) HUK 59
90 FORMAT'ALFA t POWER TRANSFORMATION EXPONENT' HUK 60
,/,12X'VY 5 COEFFICIENT OF VARIATION') HUK 61
300 CONTINUE HUK 62
WRITE!6400) HUK 63
400 FORMAT(1H1) HUK 64
STOP HUK 65
END HUK 66
in Tables 4.22 and 4.23. The other three selection statistics gave the same model, the Gumbel with no transformation (ALFA=1.0). The performance of this model is shown in Tables 4.24 and 4.25. For this example too, the Student statistics suggest a much better fit based on the optimal $R^2$ statistic than by the other statistics. The plots of these two models in the transformed spaces (Figures 4.6 and 4.7) do not show any visible difference at the 95% confidence levels. From this, it is possible to conclude that the Student statistic is more powerful than the graphical display of the confidence intervals in detecting the goodness of fit of the model.
4.4. Sensitivity Analysis
4.4.1. Sensitivity to the plotting position definition
In the previous section, the series of the two illustrative examples were fitted using the Weibull plotting position (constant a=0.0 in Equation 3.2.4). The sensitivity of the selection statistics to the definition of the plotting position was investigated by varying the constant a from 0.0 to 1.0. This range covers all recommended values for different distributions (Table 3.5 and Section 2.2.7). The optimal values of the statistics $R^2$ and STDE are given by Tables 4.26 and 4.27 for the Kissimmee River rainfall series (Section 4.2). From these tables we note that the shape parameter is practically independent of the definition of the plotting position, and that the sensitivity of the optimal statistics to the change of this definition is very small compared to the sensitivity of these statistics to the change of the shape of the distribution (transformation a). Due to this low sensitivity, no clear conclusion can be made about the appropriate plotting position for each distribution family. But it is interesting to notice
The ARMA(1,1) model is represented by
$$z_t = \phi_1 z_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1} \qquad (6.5.1)$$
where $z_t$ is the standardized transformed variable at time t, $\phi_1$ and $\theta_1$ are the autoregressive and moving average parameters, respectively, and $\varepsilon_t$ is an independent random variable (white noise) with zero mean and variance $\sigma_\varepsilon^2$. A detailed description of the parameter estimation procedure may be found in Box and Jenkins (1976) or in a more simplified form with application to hydrologic time series in Salas et al. (1980). In this second reference two computer programs are given for the analysis of annual and monthly hydrologic time series based on the IMSL subroutines (IMSL, 1979). These programs have been modified to perform the type of analysis described in the following sections.
6.5.3. Parameter estimation and goodness of fit evaluation
6.5.3.1. Parameter Estimation. The parameters are first estimated by the least squares method (Section 2.2.3). The sum of squares of the residuals is calculated for a range of values of $\phi_1$ and $\theta_1$ within the interval [-1,+1] imposed by the stationarity and invertibility conditions (Box and Jenkins, 1976, p. 76),
$$SS(\phi_1,\theta_1) = \sum_{t}^{N} \varepsilon_t^2 = \sum_{t}^{N} (z_t - \phi_1 z_{t-1} + \theta_1 \varepsilon_{t-1})^2 \qquad (6.5.2)$$
where the initial residual is set equal to zero. The optimal parameters are those for which the sum of squares is minimized. More refined estimates of these parameters are then calculated by the IMSL subroutine FTMXL (IMSL, 1979) using the previous estimates as starting values. The nonlinear procedure used by this subroutine is described by Box and Jenkins (1976, Section 7.2). As noted in Chapter 2, the least squares estimates are
and q of the moving average (MA) component of an ARMA(p,q) model; 4) estimation of model parameters; 5) testing goodness of fit of the model; and 6) evaluation of the uncertainty of model predictions based on available data.
The first four stages have been widely investigated (Rao et al., 1982; Salas and Obeysekera, 1982; Ozaki, 1977; Delleur and Kawas, 1978; Klemes and Bulu, 1979; Miller et al., 1981; Hirsh, 1982; Srikanthan and McMahon, 1982; Stedinger and Taylor, 1982a). Although many statistics have been developed to compare the performance of different models, the best model remains overshadowed by natural and parameter uncertainty (Klemes et al., 1981; Stedinger and Taylor, 1982b), making model selection a rather subjective decision. Thus, this section will focus mainly on the last three stages, in an effort to shed more light on parameter estimation, goodness of fit testing and uncertainty evaluation, areas which have not been given enough attention, especially when the analysis is based on transformed data (e.g., Box-Cox transformation, Equation 3.3.3).
6.5.2. Model Description
Since the main objective of this section is reliability analysis emphasizing the last three stages of the modeling process, only one model will be considered. The focus will be on the effect of the Box-Cox transformation (Equation 3.3.3) on the performance of this model and the reliability of its parameter estimates. The model is the ARMA(1,1), a model that has been recommended by many (e.g., Delleur and Kawas, 1978; Salas et al., 1980) for the analysis of rainfall time series similar to those of this case study (Appendix H).
TABLE G1.8
RELIABILITY AS A FUNCTION OF COR, VY, FOR ALFA 0.400

                                        VY
COR      0.2000  0.4000  0.6000  0.8000  1.0000  1.2000  1.4000  1.6000  1.8000  2.0000
0.3000   0.9999  0.9972  0.9599  0.8216  0.5832  0.3371  0.1610  0.0651  0.0228  0.0071
0.4000   0.9994  0.9860  0.9244  0.7921  0.6103  0.4235  0.2675  0.1556  0.0843  0.0430
0.5000   0.9968  0.9702  0.8987  0.7814  0.6368  0.4887  0.3556  0.2469  0.1647  0.1061
0.6000   0.9918  0.9553  0.8830  0.7811  0.6628  0.5417  0.4285  0.3293  0.2469  0.1811
0.7000   0.9856  0.9436  0.8749  0.7867  0.6879  0.5870  0.4902  0.4018  0.3240  0.2575
0.8000   0.9795  0.9356  0.8723  0.7958  0.7123  0.6269  0.5437  0.4656  0.3942  0.3305
0.9000   0.9743  0.9309  0.8736  0.8071  0.7357  0.6627  0.5909  0.5221  0.4577  0.3984
1.0000   0.9704  0.9288  0.8774  0.8195  0.7580  0.6953  0.6330  0.5726  0.5149  0.4607
1.1000   0.9678  0.9289  0.8830  0.8325  0.7793  0.7251  0.6710  0.6179  0.5666  0.5175
1.2000   0.9663  0.9305  0.8897  0.8456  0.7995  0.7525  0.7054  0.6588  0.6133  0.5692
1.3000   0.9658  0.9332  0.8971  0.8586  0.8186  0.7777  0.7366  0.6958  0.6555  0.6162
1.4000   0.9662  0.9367  0.9049  0.8713  0.8365  0.8009  0.7651  0.7293  0.6938  0.6589
1.5000   0.9672  0.9408  0.9128  0.8834  0.8531  0.8222  0.7910  0.7597  0.7285  0.6976
1.6000   0.9687  0.9451  0.9205  0.8950  0.8686  0.8418  0.8146  0.7872  0.7599  0.7326
1.7000   0.9705  0.9496  0.9281  0.9058  0.8830  0.8596  0.8360  0.8121  0.7882  0.7643

COR : COEFFICIENT OF RELIABILITY
VY : COEFFICIENT OF VARIATION

Table 6.11 Yearly and Monthly Summaries from SWMM for the Multifamily Urban Basin.

SUMMARY OF QUANTITY AND QUALITY RESULTS FOR 1974

MONTH    INLT   RAIN    FLOW        COD         TOT. SOL.
                INCH    CU. FEET    POUNDS      POUNDS
JANU     99     0.30    1.950E+04   5.740E+01   7.727E+01
FEBR     99     0.20    1.263E+04   4.527E+01   7.207E+01
MARC     99     1.70    1.194E+05   1.782E+02   1.319E+03
APRI     99     0.70    4.771E+04   1.288E+02   6.523E+02
MAY      99     2.80    1.859E+05   1.727E+02   1.529E+03
JUNE     99     5.10    3.478E+05   1.072E+02   7.303E+02
JULY     99     7.60    5.221E+05   1.405E+02   7.975E+02
AUGU     99     3.50    2.360E+05   1.006E+02   7.022E+02
SEPT     99     5.50    3.792E+05   1.414E+02   9.769E+02
OCTO     99     1.10    7.252E+04   1.813E+01   6.418E+01
NOVE     99     8.20    5.838E+05   2.032E+02   1.193E+03
DECE     99     0.40    2.670E+04   4.062E+01   9.982E+01
YR TOT   99    37.10    2.55E+06    1.33E+03    8.21E+03

SUMMARY OF QUANTITY AND QUALITY RESULTS FOR 1977

MONTH    INLT   RAIN    FLOW        COD         TOT. SOL.
                INCH    CU. FEET    POUNDS      POUNDS
JANU     99     4.20    2.981E+05   8.919E+01   5.805E+02
FEBR     99     1.60    1.066E+05   1.255E+02   5.005E+02
MARC     99     0.0     0.0         0.0         0.0
APRI     99     1.50    9.206E+04   1.979E+02   8.458E+02
MAY      99     7.60    5.193E+05   1.586E+02   1.805E+03
JUNE     99     6.30    4.321E+05   1.093E+02   5.570E+02
JULY     99     4.50    3.168E+05   1.075E+02   7.503E+02
AUGU     99     7.40    4.965E+05   1.521E+02   9.772E+02
SEPT     99     8.50    5.893E+05   9.753E+01   4.884E+02
OCTO     99     3.10    2.158E+05   1.108E+02   7.186E+02
NOVE     99     6.60    4.741E+05   1.189E+02   5.932E+02
DECE     99     5.10    3.594E+05   1.147E+02   8.086E+02
YR TOT   99    56.40    3.90E+06    1.38E+03    8.62E+03

coefficient of variation
$$V_y = \left( e^{\sigma_Y^2} - 1 \right)^{1/2}$$
and skewness
$$\gamma_y = 3 V_y + V_y^3 .$$
A location parameter, a, is often added to the lognormal distribution. Redefining Y of Equation 2.3.1 as log(y-a) leads to the three-parameter lognormal distribution. Its pdf is
$$f(y) = \frac{1}{\sigma_Y \sqrt{2\pi} \, (y-a)} \exp\left[ -\frac{1}{2} \left( \frac{\log(y-a) - \mu_Y}{\sigma_Y} \right)^2 \right] \qquad (2.3.2)$$
with (y-a) > 0, a the location parameter, and $\mu_Y$ and $\sigma_Y$ as defined above.
Munro and Wixley (1970) showed that the lognormal distribution can be expressed in terms of location, scale and shape parameters $\mu$, $\tau$ and $\alpha$, respectively. This was illustrated by the following reparameterization of Equation 2.3.1,
$$f(y) = \frac{1}{\sqrt{2\pi} \, (1+\alpha z) \, \tau} \exp\left\{ -\frac{1}{2\alpha^2} [\log(1+\alpha z)]^2 \right\} \qquad (2.3.3)$$
where
$$z = \frac{y - \mu}{\tau}, \quad \mu = a + e^{\mu_Y}, \quad \tau = \sigma_Y e^{\mu_Y}, \quad \text{and} \quad \alpha = \sigma_Y .$$
In this form it is obvious that the lognormal distribution approaches the normal distribution as the shape parameter approaches zero for fixed location ($\mu$) and scale ($\tau$) parameters (given that $\log(1+\alpha z) \to \alpha z$ as $\alpha \to 0$).
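The equivalence of the two parameterizations, and the limit toward the normal distribution, can be checked numerically with a short Python sketch. The function names are hypothetical, and the mapping $\mu = a + e^{\mu_Y}$, $\tau = \sigma_Y e^{\mu_Y}$, $\alpha = \sigma_Y$ is assumed here; it is the choice that makes the two densities coincide term by term.

```python
import math

def lognormal3_pdf(y, a, mu_Y, sigma_Y):
    """Three-parameter lognormal density (Equation 2.3.2), valid for y > a."""
    u = (math.log(y - a) - mu_Y) / sigma_Y
    return math.exp(-0.5 * u * u) / (sigma_Y * math.sqrt(2.0 * math.pi) * (y - a))

def munro_wixley_pdf(y, mu, tau, alpha):
    """Reparameterized density (Equation 2.3.3) with z = (y - mu)/tau."""
    w = 1.0 + alpha * (y - mu) / tau
    return math.exp(-0.5 * (math.log(w) / alpha) ** 2) / (math.sqrt(2.0 * math.pi) * w * tau)

def normal_pdf(y, mu, tau):
    """Normal density with mean mu and standard deviation tau."""
    z = (y - mu) / tau
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * tau)
```

Evaluating `munro_wixley_pdf` with a very small shape parameter, for example `alpha = 1e-6`, gives values that agree closely with `normal_pdf` for the same location and scale, which is the limiting behavior described above.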
      FUNCTION CDF(X)
C     STANDARD NORMAL CDF BY THE POLYNOMIAL APPROXIMATION OF
C     ABRAMOWITZ AND STEGUN (1964), EQUATION 26.2.17
      P0=0.2316419
      B1=0.31938153
      B2=-0.356563782
      B3=1.781477937
      B4=-1.821255978
      B5=1.330274429
      PI=4.*ATAN(1.)
      F=EXP(-X*X/2.)/SQRT(2.*PI)
      T=1./(1.+P0*ABS(X))
      R=F*((((B5*T+B4)*T+B3)*T+B2)*T+B1)*T
      P=R
      IF(X.GT.0.) P=1.-R
      CDF=P
      RETURN
      END
TABLE H7
ST LUCIE NEW LOCK 1 STATION # 087859 1956-79

                                            MONTH
YEAR       1      2      3      4      5      6      7      8      9     10     11     12   TOTAL
1956    1.00   0.54   0.23   2.72   3.07   5.87   4.65   6.61   4.64   5.73   0.22   2.58   37.86
1957    2.76   3.12   3.17   6.04   5.06   5.17   9.67  10.24   4.55   9.29   1.10   3.81   63.98
1958    9.02   0.69   4.93   2.82   7.17   3.64   4.66   2.63   6.43   7.42   1.50   5.93   56.84
1959    4.34   0.31   7.26   4.48   9.80  12.69   5.65   7.13   9.10   8.23   5.42   3.88   78.29
1960    0.25   5.41   1.66   7.36   4.85  10.07   8.68   4.88  19.42   3.79   1.19   0.48   68.04
1961    3.97   0.64   1.28   1.77   9.50   3.82   1.39   6.97   1.55   4.88   1.64   0.02   37.43
1962    0.98   0.64   3.20   5.30   2.10   7.82   8.90  12.78   9.49   0.76   0.81   0.08   52.86
1963    0.77   4.08   1.36   0.23   4.74   6.43   3.51   3.26   8.82   5.70   2.88   8.11   49.89
1964    2.03   2.53   0.23   2.63   3.42   4.15   7.38  12.79   4.89  11.48   1.18   1.62   54.33
1965    0.82   4.79   1.92   1.26   0.36   9.04   3.34   4.21   6.13   8.16   0.52   0.68   41.23
1966    7.00   4.57   2.53   4.31   3.66  15.08   4.01   4.17   7.64   9.05   2.41   1.19   65.62
1967    1.28   3.45   1.74   0.07   1.30   7.95   5.09   9.74   5.89  14.78   1.10   1.24   53.63
1968    0.33   1.97   0.43   0.25   8.44  15.10   7.15   7.76   7.59   9.45   2.36   0.0    60.83
1969    1.64   1.45   4.74   1.29  13.19   2.59   5.06   4.83   4.50   9.83   2.14   2.33   53.59
1970    5.39   4.11  13.13   0.0    4.57   9.27   4.06   4.33   7.88   5.91   0.0    0.11   58.76
1971    0.47   3.69   1.66   0.19  10.19   3.86   8.89   5.74   7.19   5.09   4.76   2.37   54.10
1972    0.59   2.64   3.11   5.38  14.31  14.16   6.20   3.07   3.03   3.56   3.26   2.26   61.57
1973    2.58   1.99   1.49   1.34   3.48   6.36  11.43   8.29   9.02   5.19   2.19   1.17   54.53
1974    2.16   0.12   2.27   0.93   3.53  10.98  12.26   3.31   4.91   5.43   3.19   1.21   50.30
1975    0.39   2.43   1.01   0.83   8.70  11.27   8.42   2.51   8.73   3.20   1.58   0.57   49.64
1976    0.31   2.65   0.03   2.09  12.39   8.03   4.97   9.22   6.25   1.43   3.64   2.91   53.92
1977    2.59   0.23   0.21   1.29   1.10   6.14   6.74   5.10   9.56   8.70   4.17   4.43   50.26
1978    1.92   1.67   1.82   1.74   3.43   5.85   6.96   2.35   3.92   4.71   3.08   5.33   42.78
1979    5.43   0.25   0.97   3.63   9.12   3.25   3.10   2.55  17.12   2.70   3.26   1.63   53.01

MEAN    2.42   2.25   2.52   2.41   6.14   7.86   6.34   6.02   7.43   6.44   2.23   2.25   54.30
STDV    2.33   1.63   2.84   2.09   4.00   3.80   2.73   3.16   3.99   3.30   1.42   2.09    9.43
C VAR   0.97   0.73   1.13   0.87   0.65   0.48   0.43   0.53   0.54   0.51   0.64   0.93    0.17
SKEW    1.42   0.30   2.61   0.85   0.53   0.57   0.44   0.80   1.66   0.51   0.48   1.23    0.35

TABLE F1.8
COEFFICIENT OF RELIABILITY
FOR A PROBABILITY OF 99.999000 % WITH NORMAL VARIATE Z = 4.275

                                             ALFA
VY       LOG     0.001   0.100   0.200   0.300   0.400   0.500   0.600   0.700   0.800   0.900   1.000
0.100    0.656   0.655   0.661   0.666   0.671   0.676   0.680   0.685   0.689   0.693   0.697   0.701
0.200    0.437   0.434   0.448   0.460   0.472   0.484   0.494   0.504   0.514   0.523   0.531   0.539
0.300    0.298   0.291   0.310   0.329   0.346   0.362   0.376   0.390   0.403   0.416   0.427   0.438
0.400    0.207   0.196   0.219   0.241   0.261   0.280   0.297   0.313   0.328   0.343   0.356   0.369
0.500    0.148   0.134   0.158   0.181   0.202   0.222   0.241   0.258   0.275   0.290   0.305   0.319
0.600    0.109   0.092   0.116   0.139   0.160   0.181   0.200   0.218   0.235   0.251   0.266   0.281
0.700    0.082   0.064   0.087   0.108   0.130   0.150   0.169   0.187   0.204   0.220   0.236   0.250
0.800    0.063   0.045   0.066   0.086   0.106   0.126   0.145   0.162   0.180   0.196   0.211   0.226
0.900    0.050   0.032   0.050   0.069   0.089   0.107   0.126   0.143   0.160   0.176   0.192   0.206
1.000    0.040   0.023   0.039   0.057   0.075   0.093   0.110   0.127   0.144   0.160   0.175   0.190
1.100    0.033   0.017   0.031   0.047   0.064   0.081   0.098   0.114   0.130   0.146   0.161   0.175
1.200    0.028   0.012   0.025   0.039   0.055   0.071   0.087   0.103   0.119   0.134   0.149   0.163
1.300    0.023   0.009   0.020   0.033   0.048   0.063   0.079   0.094   0.109   0.124   0.138   0.152
1.400    0.020   0.007   0.016   0.028   0.042   0.056   0.071   0.086   0.101   0.115   0.129   0.143
1.500    0.017   0.005   0.013   0.024   0.037   0.051   0.065   0.079   0.094   0.108   0.121   0.135

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION

Table 6.25. Akaike Information Criterion for the ARMA(1,1) Models of the Annual Rainfall Series.

                                      Transformation
Station            a_opt    a = a_opt   a = 1.0   a = 0.50   a = 0.0
Miami AP           0.10     169.00       4.54     91.85      188.28
N. N. R. Canal     0.65      58.94       7.03     87.20      181.32
West Palm Beach    0.90      18.79       0.68     96.66      193.95
Belle Glade        1.30      58.79       3.85     87.51      178.55
Ortona Lock 2      1.20      42.20       5.67     85.45      176.22
Port Mayaca        2.00     194.76      11.78     80.00      172.06
St. Lucie          0.45     102.77       1.84     93.25      188.30
Daytona Beach      0.90      12.97       5.40     86.48      178.46

the fitted parameters will usually bear little physical significance, if any at all. These curves are useful tools for summarizing data and interpolating between tabulated values, but they have limited use outside the range of measurements (e.g., prediction and extrapolation purposes).
On the other hand, if the selection of g is based on some theoretical considerations of the physical laws governing the behavior of the natural system generating the data, then the procedure is called model fitting (calibration). The parameters of such models usually represent quantities that have physical significance. The estimation of these parameters is a much more complicated problem than simple curve fitting (Bard, 1974). This is so because in addition to fitting the data well, these parameters should preserve their physical meaning by coming fairly close to the "true" values. Unfortunately, such true values are usually unknown; otherwise there would be no reason for performing the experiment. Thus, due to experimental uncertainties, even if the functional form, g, is correct we can never expect to obtain the true parameters and the consequent model predictions with absolute certainty. This is a consequence of the deterministic nature of g, wherein each variable is represented by a unique value, which is in contradiction to its uncertain (random) nature. Such uncertainty was described by Bevington (1969) in the following words:

It is a well-established rule of scientific investigation that the first time an experiment is performed the results bear all too little resemblance to the truth being sought. As the experiment is repeated, with successive refinements of technique and method, the results gradually and asymptotically approach what we may accept with some confidence to be a reliable description of events.

This is even more true in hydrologic modeling, where in addition to experimental uncertainty, the modeled processes are highly variable in nature, and where engineering decisions are often based on reliability
TABLE F1.9
COEFFICIENT OF RELIABILITY
FOR A PROBABILITY OF 99.999900 % WITH NORMAL VARIATE Z = 4.772

VY \ ALFA  LOG    0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
0.100      0.624  0.624  0.630  0.636  0.642  0.648  0.653  0.658  0.663  0.668  0.673  0.677
0.200      0.396  0.393  0.409  0.423  0.437  0.449  0.461  0.473  0.483  0.493  0.503  0.512
0.300      0.257  0.250  0.272  0.292  0.311  0.328  0.344  0.359  0.373  0.387  0.399  0.411
0.400      0.171  0.161  0.185  0.208  0.229  0.249  0.267  0.284  0.301  0.316  0.330  0.344
0.500      0.117  0.105  0.129  0.152  0.174  0.195  0.214  0.232  0.249  0.266  0.281  0.295
0.600      0.083  0.069  0.092  0.114  0.136  0.156  0.176  0.194  0.212  0.228  0.244  0.259
0.700      0.060  0.045  0.066  0.087  0.108  0.128  0.147  0.165  0.183  0.199  0.215  0.230
0.800      0.045  0.030  0.049  0.068  0.087  0.106  0.125  0.143  0.160  0.177  0.192  0.208
0.900      0.034  0.021  0.036  0.054  0.072  0.090  0.108  0.125  0.142  0.158  0.174  0.189
1.000      0.027  0.014  0.028  0.043  0.060  0.077  0.094  0.111  0.127  0.143  0.158  0.173
1.100      0.021  0.010  0.021  0.035  0.051  0.067  0.083  0.099  0.115  0.130  0.145  0.160
1.200      0.017  0.007  0.016  0.029  0.043  0.058  0.074  0.089  0.105  0.120  0.134  0.149
1.300      0.014  0.005  0.013  0.024  0.037  0.051  0.066  0.081  0.096  0.110  0.125  0.139
1.400      0.012  0.003  0.010  0.020  0.032  0.046  0.060  0.074  0.088  0.102  0.116  0.130
1.500      0.010  0.002  0.008  0.017  0.028  0.041  0.054  0.068  0.082  0.095  0.109  0.123

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 6.12 Ranked Hourly Rainfall for
the Multifamily Urban Basin.

SUMMARY OF THE 50 HIGHEST RAINFALL INTENSITIES FOR THIS SIMULATION.

RANK  DATE      HOUR  RAIN
  1    5/10/75   15   3.90
  2   11/24/77    2   2.60
  3   10/11/76    3   2.10
  4   11/18/74    4   2.10
  5    6/ 3/75   13   1.80
  6    7/23/77   16   1.70
  7   10/21/73    2   1.50
  8    3/ 7/76   19   1.50
  9   12/13/76   22   1.50
 10   11/24/77    1   1.30
 11   12/16/77   13   1.40
 12    5/29/77   16   1.40
 13    6/ 9/77   15   1.40
 14    9/22/77    8   1.30
 15    7/ 3/74    6   1.30
 16    2/28/76    9   1.20
 17    8/19/74   17   1.20
 18    8/24/77    4   1.20
 19    6/ 6/74   18   1.20
 20    3/23/77   21   1.20
 21    7/12/74   15   1.10
 22   12/13/76   21   1.10
 23    1/13/77   13   1.10
 24   11/17/74   21   1.10
 25    3/13/74   22   1.00
 26   10/11/76    2   1.00
 27    9/25/74   23   1.00
 28   10/23/77    8   1.00
 29   11/18/74    3   1.00
 30    6/23/76   23   0.90
 31    3/29/75   14   0.90
 32   12/ 9/77   19   0.80
 33    7/ 6/74   14   0.80
 34    7/ 1/77    8   0.80
 35    9/12/76    6   0.80
 36   10/22/73   16   0.80
 37    3/23/76   16   0.80
 38   11/20/73    3   0.80
 39    5/29/76   18   0.80
 40    1/14/76   14   0.80
 41    7/ 1/74   19   0.80
 42   10/13/73   18   0.70
 43    9/13/76   12   0.70
 44    3/14/74   21   0.70
 45    1/15/77   14   0.70
 46    9/30/74   19   0.70
 47    9/ 1/77   22   0.70
 48    9/ 1/77   23   0.70
 49    6/16/94   10   0.70
 50    6/ 9/77   14   0.70

THERE WERE 764 HOURS WITH PRECIPITATION FOR THIS SIMULATION.

CARD GROUP F1
EVAPORATION RATE (IN/DAY).
JAN.  FEB.  MAR.  APR.  MAY   JUN.  JUL.  AUG.  SEP.  OCT.  NOV.  DEC.
0.12  0.15  0.20  0.25  0.25  0.24  0.23  0.23  0.1?  0.18  0.14  0.13

* * * NO GUTTER OR PIPE NETWORK * * *
TABLE F2.6
COEFFICIENT OF RELIABILITY
FOR A PROBABILITY OF 99.900000 % WITH NORMAL VARIATE Z = 3.090

VY \ ALFA  LOG    -0.001 -0.100 -0.200 -0.300 -0.400 -0.500 -0.600 -0.700 -0.800 -0.900 -1.000
0.100      0.738  0.738  0.735  0.732  0.728  0.725  0.721  0.718  0.714  0.710  0.705  0.701
0.200      0.553  0.549  0.541  0.531  0.521  0.510  0.498  0.486  0.472  0.457  0.440  0.422
0.300      0.421  0.414  0.399  0.383  0.366  0.346  0.325  0.301  0.275  0.244  0.207  0.163
0.400      0.328  0.314  0.295  0.274  0.251  0.225  0.195  0.162  0.123  0.075  0.016  0.000
0.500      0.260  0.241  0.219  0.195  0.168  0.137  0.103  0.064  0.021  0.000  0.000  0.000
0.600      0.210  0.187  0.164  0.137  0.106  0.077  0.043  0.009  0.000  0.000  0.000  0.000
0.700      0.173  0.147  0.123  0.096  0.068  0.039  0.010  0.000  0.000  0.000  0.000  0.000
0.800      0.146  0.116  0.092  0.067  0.041  0.016  0.000  0.000  0.000  0.000  0.000  0.000
0.900      0.124  0.093  0.070  0.046  0.023  0.004  0.000  0.000  0.000  0.000  0.000  0.000
1.000      0.108  0.075  0.053  0.032  0.012  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.100      0.095  0.061  0.041  0.022  0.006  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.200      0.084  0.050  0.032  0.015  0.003  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.300      0.076  0.042  0.025  0.010  0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.400      0.069  0.035  0.020  0.007  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.500      0.063  0.030  0.016  0.005  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
APPENDIX D
GPDCP SOURCE PROGRAM
Table 6.16 Monthly Event Statistics of Generated
Runoff at the Multifamily Basin.
MIAMI WSO AP RUNOFF STATION S 085663 .196463. 3 HRS INTEREVENT TIME
RAINFALL STATISTICS BY MONTH (FOR PERIOD OF RECORD)
MONTH
NUMBER
TOTAL
1
DURATION
134.
0. 696000E+03
INTENSITY
134.
0.464005E+01
VOLUME
134.
0. 266696E+02
DELIA
133.
0. 1664S5E+05
2
DURATION
121.
0.508000E+03
INTENSITY
121.
0. 529963E+01
VOLUME
121.
0. 256795E+02
DELTA
121.
0. 158135E+05
3
DURATION
116.
0. 313000E+03
INTENSITY
116.
0. 477725E+01
VOLUME
116.
0.22329BE+02
DELTA
116.
0. 174120E+05
4
DURATION
99.
0. 467000E+03
INTENSITY
99.
0. 377303E+01
VOLUME
99.
0. 253696E+02
DELTA
97.
0. 1311OOE+03
5
DURATION
211.
0. 943000E+03
INTENSITY
21 1.
0. 176B15E+02
VOLUME
211.
0. 100959E+03
DELTA
211.
0.2234B0E+03
6
DURATION
353.
0. 176100E+O4
INTENSITY
353.
0. 26B700E+02
VOLUME
333.
0. 14643SE + 03
DELTA
353.
0. 175445E+03
7
DURATION
334.
0. 127300E+04
INTENSITY
334.
0. 24B077E+02
VOLUME
354.
0. B3017BE+02
8
DELTA
354.
0. 174530E+05
DURATION
341.
0. 132100E+04
INTENSITY
341.
0. 236573E+02
VOLUME
341.
0. B77277E+02
DELTA
341.
0. 1B0693E+03
9
DURATION
407.
0.1B5900E+04
INTENSITY
407.
0. 221706E+02
VOLUME
407.
0. 109397E+03
10
DELTA
407.
0. 175670E+03
DURATION
345.
0. 160B00E+04
INTENSITY
343.
0. 199409E+02
VOLUME
345.
0.9B747BE+02
11
DELTA
343.
0. 170335E+03
DURATION
172.
0. 841000E+03
INTENSITY
192.
0. 957725E+01
VOLUME
192.
0. 532587E+02
DELTA
192.
0.160310E+03
12
DURATION'
113.
0. 513000E+03
INTENSITY
113.
0. 530993E+01
VOLUME
113.
0. 263297E+02
DELTA
113.
0. 166685E+05
MINIMUM MAXIMUM AVERAGE
0. 100000E+01
0. 355600E02
O. 1OOOOOE01
0. BOOOOOE+Ol
0. lOGOOOE+Oi
O. 66700E02
O. 1 OOOOOEOl
O. 850000E+01
O. lOOOOOE+OI
O. 625100E02
0. lOCOOOEOl
O. 930000E+01
0. lOOOOOE+OI
O. 600100E02
O. 1 OOOOOE01
O. 100000E+02
O. lOOOOOE+OI
O. 42B600E02
0. 1 OOOOOEOl
O. 730000E+01
0. lOOOOOE+OI
O. 371300E02
0. 1 OOOOOE01
O. BOOOOOE+Ol
0. lOOOOOE+OI
O. 332400E02
0. 1 OOOOOEOl
0. BOOOOOE+Ol
0. lOOOOOE+OI
O. 444500E02
0. 1 OOOOOE01
O. 730000E+01
0. lOCOOOE+Ol
O. 373100E02
0. 1 OOOOOE01
O. 750000E+01
O. lOOOOOE+OI
O. 333600E02
0. 1 OOOOOEOl
0. BOOOOOE+Ol
0. 100000E+01
O. 371500EO2
0. 1 OOOOOEOl
O. BOOOOOE+Ol
0. lOOOOOE+OI
O. 500100E02
0.1OOOOOE01
0. 900000E+01
O. 370000E+02
O. 176667E+00
O.lBOOOOE+Ol
O. 901000E+03
O.240000E+02
O. 213001E+00
O. 3B0000E+01
O. 541000E+03
O. 310000E+02
O. 23666BE+00
O. 18000E+01
O. 793500E+03
O. 400000E+02
O. 640002E+00
O. 393000E+01
O. 138600E+04
O. 320000E+02
O. 793333E+00
0. 14 1B00E+02
O. 118000E+04
O. 390000E+02
O. 9B0002E+00
O. 3B3000E+01
O. 425000E+03
O. 240000E+02
O. B80002E+00
O. 2B0000E+01
O. 356300E+03
,0. 320000E+02
O. 365002E+00
O. 605000E+01
O. 363300E+03
O. 360000E+02
O. 900002E+00
O. 613000E+01
O. 290000E+03
O. 460000E+02
O. 743002E+00
O. 439000E+01
O. 270000E+03
O. 340000E+02
O. 323002E+00
O.635000E+01
O.64B300E+03
0. 310000E+02
O.436250E+00
O. 349000E+01
0.925500E+03
O. 319403E+01
O. 346273E01
O. 199027E+00
O. 140214E+03
O. 419B33E+01
O. 4379B8E01
O. 212227E+00
O.130690E+03
O.442241E+01
0. 411B32EC1
O. 192498E+00
0. 1 30103E+03
O.471717E+01
O. 5S3135E01
O. 256239E+00
O. 132626E+03
O. 446919E+01
O. B3798BE01
O.47B47BE+00
0. 10391 3E+03
O. 49BB67E+01
O. 817B47E01
0. 4 14B37E+00
O. 497011E+02
O. 360169E+01
O. 700762E01
O. 234513E+00
O. 493022E+02
O. 387390E+01
O. 693762E01
O. 257266E + 00
O. 329B97E+02
O. 436757E+01
O. 544732E01
O. 2692S1E+00
O. 431622E+02
O. 4660B7E+01
O. 377998E01
O.2B6B05E+00
O.493725E+02
0. 430021 E+Ql
0. 49881 5E01
O.2773B9E+00
O.833990E+02
O. 447B26E+01
O. 461733E01
O. 230693E+00
O. 144943E+03
6TD DEV
O. 3B3343E+01
O. 335B93E01
O. 314090E*00
O. 161740E+03
O. 353379E+01
O. 41B31BE01
O.402422E+00
O. 1142C1E+03
O. 447632E+0I
O. 449B20E01
O. 297742E+00
O. 142493E+03
O. 320996E+01
O.101235E+00
O. 4BB020E+00
O. 183014E+03
O. 49436BE+01
O. 130340E+00
O. 141662E+01
O. 1711S0E+03
0. 319723E+01
0. 11B730E+00
O. 740B46E+00
O. 523072E+02
O. 304956E+01
O. 1 1333BE+00
0.376811E+00
O. 497404E+02
O. 36394SE+01
O. 995B30E01
O. 4B061BE+00
O. 337923E+02
O. 3054B4E+01
O. B59633E01
O. 376770Ei 00
O. 432115E+02
O. 303226E+01
O. B66490E01
O. 52028VE + 00
O. 4B9290E+02
O. 492416E+01
O. 783320E01
O. 683763E+00
O. 932B61E+02
O. 4B2200E + 01
O. 729273E01
O. 4B3766E+00
0. 1396B0E+03
VARIANCE
CDEFVAR
O. 340523E+02
O. 112B24E02
O. 9B6523E01
O. 261599E+05
O. 126437E+02
O. 174990E02
O. 161944E+00
O. 130418E+05
O. 200374E+02
O. 20233BE02
O. BB6305E01
O. 203O41E+O3
0. 271437E+02
O. 102525E01
O. 23B164E+00
O. 334942E+03
O.24439BE+02
O. 169BB6E01
O.2006B2E+01
O.292950E+03
0. 270112E+02
O. 140967E01
O. 348B32E+00
O. 273605E+04
O. 9299B3E+01
O. 13302BE01
O. 1419B7E+00
0. 247411E+04
O. 13243BE+02
O. 99167BE02
O. 230994E+00
0. 3112B0E+04
O. 255515E+02
O. 739004E02
O. 332664E+00
O. 1B6724E+04
O.253236E+02
O. 730B03E02
O. 270701E+00
O.239405E+04
O. 242473E+02
0. 616727E02
O.470271E+00
O. B70230E+04
0. 232317E+02
0. 531839E02
O. 233969E+00
0. 254977E+05
O. 112349E+01
O. 970024E+00
0. 137812E+01
O. 115352E+01
O. B46931E+00
O. 955091E+00
O. 1B961BE+01
O. B73B2BE+00
O. 101219E+01
O. 109224E+01
O. 1 34673E+01
O. 949296E+00
0. 110447E+01
O. 173639E+01
O. 190440E+01
O. 119910E+01
O. 110662E+01
O. 133340E+01
O. 296069E+01
O. 161600E+01
O. 1041B1E+01
O. 14 31 73E+01
O. 17B3B7E+01
O. 105243E+01
O. 846702E+00
O. 1645B4E+01
O. 16067BE+01
O. 100B89E+01
O. 9394BBE+00
O. 143540E+01
O. 1S6B1BE+01
O. 1032B9E+01
0. 11066BE+01
0. 137812E+01
O. 2141B9E+01
0. 100114E+01
O. 10796BE+01
0. 1 4991 2E+01
0. 1B1 409E+01
O. 9V1019E+00
O. 112418E+01
O. 157437E+01
O. 247221E+01
0. Ill 38BE+01
O. 107676E+01
0. 157943E+01
O. 21036BE+01
0. 110167E+01
APPENDIX C
PROBABILITY DISTRIBUTIONS
C.1. Discrete Distributions
C.1.1. Binomial distribution
One distribution arising from a Bernoulli process is the binomial
distribution. A Bernoulli process is characterized by: 1. the
independence of its events; 2. an event may either occur or not occur
within a given trial (location or time) with probabilities p and q = 1 - p,
respectively; and 3. the probability p does not change with time. The
binomial probability mass function is

    p(y) = \binom{n}{y} p^{y} q^{n-y}                                (C.1.1)

where \binom{n}{y} defines the number of combinations of n events taken y
at a time,

    \binom{n}{y} = \frac{n!}{y!\,(n-y)!}

The probability p(y) defines the probability of y successes in n trials.
The binomial distribution has the following moments:

    mean        \mu = np,
    variance    \sigma^{2} = npq,
    skewness    \gamma = (q - p) / \sqrt{npq}.

The binomial distribution is one of the most commonly used discrete
distributions (Chow, 1964).
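
As an illustration of Equation C.1.1, the following minimal Python sketch evaluates the binomial mass function and the three moments listed above. The function names and the values n = 10, p = 0.1 are hypothetical choices made for this example only; they do not come from the GPDCP program.

```python
from math import comb, sqrt


def binomial_pmf(y, n, p):
    """Probability of exactly y successes in n independent trials (Eq. C.1.1)."""
    q = 1.0 - p
    return comb(n, y) * p**y * q**(n - y)


def binomial_moments(n, p):
    """Return (mean, variance, skewness) of the binomial distribution."""
    q = 1.0 - p
    mean = n * p
    variance = n * p * q
    skewness = (q - p) / sqrt(n * p * q)
    return mean, variance, skewness


# Hypothetical example: 10 trials, each with a 10 percent chance of occurrence.
n, p = 10, 0.1
pmf = [binomial_pmf(y, n, p) for y in range(n + 1)]
mean, variance, skewness = binomial_moments(n, p)
```

Summing y p(y) over the mass function reproduces the closed-form mean np, which is a quick check that the combinatorial term has been coded correctly.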
Table 4.9 Detailed Statistics for the
WSS Selected Model, Example 1.

EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
REGRESSION RESULTS : PEARSON FAMILY GIVES OPTIMAL WSS FOR ALFA = 0.10

BASIC DESCRIPTIVE STATISTICS
                                    STANDARD
                        MEAN        DEVIATION   CORRELATION
TRANSFORMED VARIABLE    6.141       0.130
                                                0.994
REDUCED VARIATE         10SS.938    31.330

ANALYSIS OF VARIANCE
SOURCE OF    DEGREE OF   SUMS OF   MEAN                  P(EXCEEDING F
VARIATION    FREEDOM     SQUARES   SQUARES   F-VALUE     UNDER HO)
REGRESSION    1.000      0.982     0.982     4753.984    0.0
RESIDUAL     56.000      0.012     0.000
TOTAL        57.000      0.994

DURBIN-WATSON STATISTIC  0.457

MODEL PARAMETER INFERENCES
                                   LOWER        UPPER
             POINT      STANDARD   CONFIDENCE   CONFIDENCE
             ESTIMATE   ERROR      LIMIT        LIMIT
SCALE        0.004      0.000      0.004        0.004
LOCATION     1.657      0.065      1.527        1.787
COVARIANCE   0.000
   83 CONTINUE
C
      WRITE(IOU,722)
  722 FORMAT(//,20X,' Z REDUCED VARIATE',
     .        /,20X,' YO TRANSFORMED OBSERVATIONS',
     .        /,20X,' YP TRANSFORMED PREDICTION',
     .        /,20X,' YR TRANSFORMED RESIDUAL',
     .        /,20X,' T STUDENT STATISTIC',
     .        /,20X,' TO 95% STUDENT T',//)
C
      SRES=SQRT(SRES/AN)
C
CC    OUTPUT ANOVA STATISTIC
C
      NANOV=0
      IF(NANOV.EQ.1) GO TO 996
      WRITE(IOU,705) SITE
      WRITE(IOU,702) DT(LE),ALF,CT,CP
      WRITE(IOU,706) (NADI(M,I),I=1,3),SLCT(ISL),RF(N,M)
      WRITE(IOU,714) DES(2),DES(4),DES(5),DES(1),DES(3)
      WRITE(IOU,715) ANOVA(1),ANOVA(4),ANOVA(7),ANOVA(9),ANOVA(10),
     .               ANOVA(2),ANOVA(5),ANOVA(8),
     .               ANOVA(3),ANOVA(6),ANOVA(14)
      WRITE(IOU,716) STAT(1),STAT(2),STAT(3),STAT(4),
     .               STAT(5),STAT(6),STAT(7),STAT(8),STAT(9)
C
  714 FORMAT(///,20X,'BASIC DESCRIPTIVE STATISTICS',//,45X,
     .'STANDARD',/,35X,'MEAN',5X,'DEVIATION CORRELATION',//,5X,
     .'TRANSFORMED VARIABLE',5X,2F11.3,/,55X,F10.3,/,5X,
     .'REDUCED VARIATE',8X,2F11.3,//)
  715 FORMAT(//,25X,'ANALYSIS OF VARIANCE',//,5X,
     .'SOURCE OF DEGREE OF SUMS OF MEAN',14X,'P(EXCEEDING F'
     .,/,5X,'VARIATION FREEDOM SQUARES SQUARES F-VALUE',4X,
     .'UNDER HO)',//,5X,'REGRESSION',5F11.3,/,5X,'RESIDUAL',3F11.3,/,
     .5X,'TOTAL',5X,2F11.3,//,5X,'DURBIN-WATSON STATISTIC',F6.3,//)
  716 FORMAT(/,25X,'MODEL PARAMETER INFERENCES',//,48X,'LOWER',7X,
     .'UPPER',/,25X,'POINT STANDARD CONFIDENCE CONFIDENCE',/,
     .23X,'ESTIMATE ERROR LIMIT LIMIT',//,5X,
     .'SCALE',7X,4F12.3,/,5X,
     .'LOCATION',4X,4F12.3,//,5X,
     .'COVARIANCE',4X,F10.3,/)
C
  996 CONTINUE
  997 CONTINUE
  998 CONTINUE
  999 CONTINUE
      STOP
      END
F test. This is the appropriate test statistic for testing the
linearity of the regression function. The F statistic is defined as the
ratio of the mean square due to regression to the mean square due to
residual variation (Draper and Smith, 1966, p. 24). Each of these two
mean squares, when multiplied by its degrees of freedom, follows a \chi^2
distribution with 1 and n-2 degrees of freedom, respectively,

    F = \frac{MS_R}{MS_E}
      = \frac{\sum (\hat{Y} - \bar{Y})^2 / 1}{\sum (Y_o - \hat{Y})^2 / (n-2)}
      \sim \frac{\chi^2(1)/1}{\chi^2(n-2)/(n-2)}                     (3.3.19)

where Y_o is the observed transformed variable, \hat{Y} is its predicted
value, and \bar{Y} is the mean of the observations.
The ratio of two independent \chi^2 distributed random variables, each
divided by its degrees of freedom, is known to follow an F distribution
with degrees of freedom equal to those of the \chi^2 distributions, 1 and
n-2 for this case. Draper and Smith noted that this statistic is exactly
the same as the Student t test for a zero slope in the case of fitting a
straight line.
The F statistic and its corresponding tail area of the F distribution
are calculated in the GPDCP program by calling the IMSL subroutine
RLONE.
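
The equivalence between the F statistic and the squared Student t statistic for the slope can be checked numerically. The Python sketch below is only an illustration under stated assumptions: it does not reproduce the IMSL routine RLONE, and the function name and the six data points are hypothetical.

```python
from math import sqrt


def simple_regression_f(x, y):
    """Return (F, t) for a least-squares straight line fitted to (x, y).

    F is the regression mean square over the residual mean square
    (1 and n-2 degrees of freedom); t is the Student statistic for a
    zero slope.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    ss_reg = sum((fi - y_bar) ** 2 for fi in fitted)           # 1 d.f.
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # n-2 d.f.
    ms_reg = ss_reg / 1.0
    ms_res = ss_res / (n - 2)
    f_stat = ms_reg / ms_res
    t_slope = slope / sqrt(ms_res / sxx)
    return f_stat, t_slope


# Hypothetical data, chosen only to exercise the formulas.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
f_stat, t_slope = simple_regression_f(x, y)
```

For any data set with a nonzero residual, f_stat equals t_slope squared up to rounding, which is the relation noted by Draper and Smith.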
Durbin-Watson statistic. This statistic is used in testing for
first-order linear correlation in the residuals. It is defined as

    DW = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}   (3.3.20)

and is always in the interval 0 to 4. A DW value significantly smaller
(larger) than 2 indicates the presence of positive (negative)
correlation. Durbin and Watson (1951) estimated significance points for the 1,
2.5 and 5 percent levels for DW. This statistic, too, is calculated
within the RLONE subroutine.
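
Equation 3.3.20 can be sketched in a few lines of Python. This is an illustrative implementation only, not the RLONE code; the function name and the two short residual series are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic of a residual series (Eq. 3.3.20)."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Residuals that alternate in sign are negatively correlated: DW above 2.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # 12 / 4 = 3.0

# Residuals that drift slowly are positively correlated: DW well below 2.
dw_trending = durbin_watson([1.0, 2.0, 3.0, 4.0])        # 3 / 30 = 0.1
```

Values far below 2, such as the 0.457 and 0.399 reported in the detailed statistics tables, therefore point to positively correlated residuals.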
TABLE F2.2
COEFFICIENT OF RELIABILITY
FOR A PROBABILITY OF 95.000000 % WITH NORMAL VARIATE Z = 1.645

VY \ ALFA  LOG    -0.001 -0.100 -0.200 -0.300 -0.400 -0.500 -0.600 -0.700 -0.800 -0.900 -1.000
0.100      0.853  0.852  0.852  0.851  0.851  0.850  0.849  0.848  0.848  0.847  0.846  0.845
0.200      0.736  0.733  0.732  0.730  0.728  0.726  0.723  0.721  0.719  0.716  0.714  0.711
0.300      0.644  0.638  0.635  0.631  0.627  0.623  0.619  0.615  0.611  0.606  0.601  0.596
0.400      0.571  0.560  0.556  0.551  0.546  0.540  0.534  0.528  0.522  0.516  0.509  0.502
0.500      0.514  0.498  0.492  0.486  0.479  0.473  0.466  0.459  0.451  0.444  0.436  0.428
0.600      0.468  0.446  0.440  0.433  0.426  0.419  0.412  0.404  0.396  0.389  0.381  0.373
0.700      0.432  0.404  0.397  0.390  0.383  0.377  0.370  0.363  0.356  0.350  0.344  0.339
0.800      0.403  0.369  0.363  0.356  0.350  0.344  0.339  0.334  0.329  0.326  0.324  0.324
0.900      0.379  0.341  0.335  0.330  0.325  0.321  0.318  0.316  0.315  0.317  0.321  0.330
1.000      0.360  0.318  0.314  0.310  0.307  0.305  0.305  0.308  0.313  0.322  0.336  0.355
1.100      0.344  0.300  0.297  0.295  0.295  0.297  0.301  0.310  0.323  0.342  0.367  0.400
1.200      0.330  0.285  0.284  0.285  0.288  0.295  0.306  0.322  0.345  0.376  0.416  0.466
1.300      0.319  0.274  0.276  0.280  0.287  0.300  0.319  0.345  0.381  0.427  0.484  0.551
1.400      0.310  0.266  0.271  0.279  0.292  0.312  0.340  0.379  0.430  0.494  0.570  0.657
1.500      0.302  0.261  0.269  0.282  0.302  0.332  0.372  0.426  0.495  0.579  0.676  0.782

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 4.1 Annual Maximum Daily Runoff (cfs) and Statistics
of Original and Log-transformed Flows, Example 1.
EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
SORTED RECORDED EVENTS
34400. 000
19900. 000
1S300. 000
16100. 000
14300. 000
13100. 000
12400. 000
1 1800. 000
10300. 000
8210. 000
29100. 000
19900. 000
1 6200. 000
16100. 000
14300. 000
13100. 000
12300. 000
11800. 000
10200. 000
8210. 000
20600. 000
19500. 000
13000. 000
16000. 000
13900. 000
13000. 000
12300. 000
13 600. 000
10200. 000
3180. 000
23000. 000
19200. 000
17200. 000
15600. 000
13900. 000
13000. 000
12200. 000
11000. 000
9900. 000
8040. 000
20600. 000
10600. 000
16900. 000
15100. 000
13900. 000
12900. 000
11900. 000
10700. 000
9020. 000
7130. 000
20100. 000
3 8500. 000
16400.000
14500. 000
13600. 000
12700. 000
1 1900. 000
10400. 000
8390. 000
6700. 000
00
MEAN OF Y       VARIANCE OF Y     SKEW OF Y
14554.660       27319370.000      1.889
MEAN OF LN      VARIANCE OF LN    SKEW OF LN
9.523           0.115             0.132
[Plot: KISSIMMEE RIVER RAINFALL DATA, PEARSON R2, ALFA = 0.15]
Figure 4.6 Linear Plot of R2 Selected Model,
Example 2, Rainfall
Table 5.11. Sensitivity of the Ratio, Design Event Magnitude/Mean
(1/COR), to the Design Period and to the Shape of the
Distribution for R_n = 0.95.

                              ALFA
  n    R          Z_L    1.0     0.50    0.00    -0.50
  20   0.997439   2.80   2.96    3.68    4.80     24.09
  50   0.998975   3.09   3.16    4.08    5.77     95.65
 100   0.999487   3.28   3.30    4.35    6.50    782.43
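
The R and Z_L columns of Table 5.11 are numerically consistent with an annual reliability R = R_n^(1/n), that is, with the assumption that the n years of the design period are independent, followed by an inverse normal transform. The Python sketch below illustrates that reading only; it is not taken from the original programs, the function names are hypothetical, and the standard library inverse normal may differ in the second decimal from the approximation used to print the table.

```python
from statistics import NormalDist


def annual_reliability(period_reliability, n_years):
    """Annual reliability R such that R**n_years equals the period reliability.

    Assumes independent years (an assumption of this sketch).
    """
    return period_reliability ** (1.0 / n_years)


def normal_variate(reliability):
    """Standard normal variate Z with cumulative probability `reliability`."""
    return NormalDist().inv_cdf(reliability)


# Design-period reliability of 0.95 for the three design periods of Table 5.11.
rows = []
for n_years in (20, 50, 100):
    r = annual_reliability(0.95, n_years)
    rows.append((n_years, r, normal_variate(r)))
```

Each tuple in rows holds the design period, the annual reliability, and the corresponding normal variate, and can be compared line by line with the first three columns of the table.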
indices are valid for the logarithm of the variables, but they may also
be expressed in terms of the moments of the original data using the
relations derived in Appendix B. Equations B.13 and B.14 define

    \mu_Y = \log(\bar{y}) - \tfrac{1}{2} \log(1 + V_y^2)             (5.2.7)

    \sigma_Y^2 = \log(1 + V_y^2)                                     (5.2.8)

where Y = log y, and V_y is the coefficient of variation of the
untransformed variables. Substitution of these relations into Equations 5.2.1,
5.2.2 and 5.2.6 gives, after some algebraic rearrangement, the following
new expressions for the reliability indices

    Z_L = \frac{\log\left[ (y_L / \bar{y}) (1 + V_y^2)^{1/2} \right]}
               {[\log(1 + V_y^2)]^{1/2}}                             (5.2.9)

or

    y_L = \bar{y} (1 + V_y^2)^{-1/2} e^{Z_L [\log(1 + V_y^2)]^{1/2}} (5.2.9a)

from which

    COR = \frac{\bar{y}}{y_L}
        = (1 + V_y^2)^{1/2} e^{-Z_L [\log(1 + V_y^2)]^{1/2}}         (5.2.10)

and

    ZRI = \frac{\log\left[ \frac{\bar{y}_1}{\bar{y}_2}
                \left( \frac{1 + V_{y_2}^2}{1 + V_{y_1}^2} \right)^{1/2} \right]}
               {\{ \log[(1 + V_{y_1}^2)(1 + V_{y_2}^2)] \}^{1/2}}    (5.2.11)
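
The lognormal coefficient of reliability, COR = (1 + Vy^2)^(1/2) exp(-Z_L [log(1 + Vy^2)]^(1/2)), can be evaluated directly. The Python sketch below is an illustration, not the program that generated the appendix tables; the function name is hypothetical. Its results can be compared with the LOG columns of Tables F2.2 (Z = 1.645) and F2.6 (Z = 3.090).

```python
from math import exp, log, sqrt


def lognormal_cor(v_y, z_l):
    """Coefficient of reliability for a lognormal variable.

    v_y : coefficient of variation of the untransformed variable
    z_l : standard normal variate for the chosen reliability level
    """
    sigma_log = sqrt(log(1.0 + v_y**2))  # standard deviation of log y
    return sqrt(1.0 + v_y**2) * exp(-z_l * sigma_log)


# Vy = 0.1 at the 95 percent and 99.9 percent levels.
cor_95 = lognormal_cor(0.1, 1.645)
cor_999 = lognormal_cor(0.1, 3.090)
```

Rounded to three decimals, these two values agree with the first row of the LOG column in Tables F2.2 and F2.6 (0.853 and 0.738).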
5.3. Third Order Reliability Modeling
5.3.1. Introduction
In the above section expressions for reliability indices were
developed based on the first two moments of the original variables.
Such relations are widely used for analyzing uncertainty, assessing
TABLES F2
0 > ALFA > -1
TABLE G1.3
RELIABILITY AS A FUNCTION OF COR, VY, FOR ALFA 0.600

                                      COR
VY      0.2000  0.4000  0.6000  0.8000  1.0000  1.2000  1.4000  1.6000  1.8000  2.0000
0.3000  0.9999  0.9999  0.9799  0.8040  0.5239  0.3030  0.1696  0.0959  0.0558  0.0336
0.4000  0.9999  0.9991  0.9423  0.7508  0.5319  0.3625  0.2477  0.1726  0.1234  0.0905
0.5000  0.9999  0.9945  0.9024  0.7182  0.5398  0.4031  0.3052  0.2360  0.1866  0.1505
0.6000  0.9999  0.9844  0.8678  0.6978  0.5478  0.4334  0.3491  0.2869  0.2402  0.2046
0.7000  0.9999  0.9703  0.8399  0.6848  0.5557  0.4575  0.3839  0.3281  0.2852  0.2514
0.8000  0.9998  0.9542  0.8178  0.6767  0.5636  0.4777  0.4126  0.3624  0.3231  0.2916
0.9000  0.9993  0.9379  0.8007  0.6719  0.5714  0.4953  0.4370  0.3916  0.3556  0.3263
1.0000  0.9982  0.9224  0.7875  0.6696  0.5793  0.5109  0.4583  0.4170  0.3839  0.3568
1.1000  0.9964  0.9083  0.7774  0.6690  0.5871  0.5251  0.4773  0.4395  0.4089  0.3838
1.2000  0.9938  0.8958  0.7698  0.6697  0.5948  0.5383  0.4945  0.4597  0.4314  0.4080
1.3000  0.9905  0.8848  0.7642  0.6714  0.6026  0.5506  0.5102  0.4781  0.4518  0.4301
1.4000  0.9867  0.8754  0.7602  0.6738  0.6103  0.5622  0.5249  0.4950  0.4706  0.4502
1.5000  0.9825  0.8674  0.7576  0.6769  0.6179  0.5733  0.5386  0.5108  0.4879  0.4689
1.6000  0.9780  0.8607  0.7560  0.6805  0.6255  0.5840  0.5515  0.5255  0.5042  0.4863
1.7000  0.9735  0.8551  0.7554  0.6845  0.6331  0.5942  0.5639  0.5395  0.5194  0.5026

COR : COEFFICIENT OF RELIABILITY
VY : COEFFICIENT OF VARIATION
TABLE G1.5
RELIABILITY AS A FUNCTION OF COR, VY, FOR ALFA 0.200

                                      COR
VY      0.2000  0.4000  0.6000  0.8000  1.0000  1.2000  1.4000  1.6000  1.8000  2.0000
0.3000  0.9999  0.9997  0.9721  0.810S  0.5478  0.316S  0.1674  0.0845  0.0419  0.0208
0.4000  0.9999  0.9963  0.9338  0.7675  0.5636  0.3868  0.2567  0.1682  0.1100  0.0724
0.5000  0.9999  0.9865  0.8990  0.7442  0.5793  0.4372  0.3261  0.2428  0.1816  0.1369
0.6000  0.9997  0.9723  0.8721  0.7325  0.5948  0.4767  0.3812  0.3058  0.2469  0.2008
0.7000  0.9986  0.9570  0.8528  0.7277  0.6103  0.5097  0.4267  0.3591  0.3042  0.2596
0.8000  0.9965  0.9426  0.8395  0.7275  0.6255  0.5383  0.4654  0.4049  0.3545  0.3124
0.9000  0.9932  0.9302  0.8309  0.7302  0.6406  0.5640  0.4994  0.4449  0.3989  0.3597
1.0000  0.9892  0.9201  0.8258  0.7351  0.6554  0.5874  0.5297  0.4806  0.4386  0.4024
1.1000  0.9848  0.9122  0.8235  0.7413  0.6700  0.6092  0.5573  0.5128  0.4744  0.4410
1.2000  0.9804  0.9063  0.8233  0.7486  0.6844  0.6296  0.5827  0.5423  0.5071  0.4763
1.3000  0.9762  0.9021  0.8248  0.7566  0.6985  0.6489  0.6063  0.5694  0.5372  0.5088
1.4000  0.9723  0.8994  0.8275  0.7652  0.7123  0.6672  0.6284  0.5947  0.5651  0.5389
1.5000  0.9690  0.8980  0.8311  0.7740  0.7257  0.6846  0.6492  0.6183  0.5911  0.5669
1.6000  0.9661  0.8977  0.8355  0.7831  0.7389  0.7013  0.6688  0.6404  0.6154  0.5931
1.7000  0.9638  0.8982  0.8405  0.7922  0.7517  0.7172  0.6874  0.6613  0.6383  0.6176

COR : COEFFICIENT OF RELIABILITY
VY : COEFFICIENT OF VARIATION
Table 5.4 Power to Lognormal Percentile Ratio, 0 < α < 1, P = 99.999%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.999000 % WITH NORMAL VARIATE Z = 4.275

VY \ ALFA  LOG    0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
0.1        1.000  0.999  1.007  1.015  1.023  1.030  1.037  1.043  1.050  1.056  1.062  1.068
0.2        1.000  0.993  1.023  1.053  1.080  1.106  1.130  1.153  1.174  1.195  1.214  1.233
0.3        1.000  0.976  1.042  1.104  1.161  1.215  1.265  1.311  1.355  1.396  1.435  1.472
0.4        1.000  0.946  1.057  1.162  1.258  1.348  1.431  1.510  1.583  1.652  1.717  1.779
0.5        1.000  0.903  1.066  1.220  1.364  1.498  1.623  1.741  1.852  1.956  2.055  2.148
0.6        1.000  0.847  1.065  1.274  1.472  1.658  1.833  1.999  2.155  2.302  2.442  2.574
0.7        1.000  0.784  1.055  1.322  1.578  1.823  2.055  2.276  2.485  2.684  2.873  3.052
0.8        1.000  0.715  1.035  1.361  1.679  1.987  2.283  2.566  2.836  3.093  3.339  3.573
0.9        1.000  0.644  1.008  1.390  1.773  2.148  2.512  2.863  3.200  3.523  3.832  4.128
1.0        1.000  0.574  0.974  1.410  1.856  2.301  2.738  3.162  3.571  3.966  4.345  4.710
1.1        1.000  0.507  0.936  1.421  1.930  2.446  2.957  3.458  3.945  4.416  4.871  5.309
1.2        1.000  0.446  0.894  1.425  1.994  2.580  3.167  3.747  4.315  4.867  5.403  5.920
1.3        1.000  0.389  0.852  1.421  2.049  2.703  3.367  4.028  4.679  5.316  5.935  6.535
1.4        1.000  0.339  0.809  1.413  2.094  2.816  3.556  4.298  5.034  5.757  6.463  7.149
1.5        1.000  0.295  0.766  1.400  2.131  2.918  3.733  4.557  5.378  6.188  6.982  7.758

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Here again, it can easily be shown that this expression is a first-order
approximation of Equation 5.2.10 and exactly the same as Equation 5.2.2,
for α equal to zero and one, respectively.
5.3.3.3. Generalized reliability index for the case of two random
variables. By replacing the means and standard deviations of Equation
5.2.6 by their approximate expressions in terms of the original moments
(Equations 5.3.23 and 5.3.24) we have
ZRI = 
i ,
i
+ 2
 1
I (2 Vy.
(y2 V2 + y2 V2 )1/2
yl yl y2 y2
(5.3.30)
This index of reliability allows the estimation of the reliability
(Equation 5.2.4) or risk (dashed area of Figure 1.3c) directly from the
cumulative distribution function of the normal distribution.
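
Reading the reliability off the standard normal distribution can be sketched in Python as follows. The sketch assumes that the reliability is the standard normal cumulative probability of the index and that risk is its complement; the function names are hypothetical and are not part of the original programs.

```python
from statistics import NormalDist


def reliability_from_index(zri):
    """Reliability as the standard normal cumulative probability of ZRI."""
    return NormalDist().cdf(zri)


def risk_from_index(zri):
    """Risk taken as the complement of the reliability (assumed relation)."""
    return 1.0 - reliability_from_index(zri)


# An index of 1.645 corresponds to a reliability of about 95 percent.
reliability_95 = reliability_from_index(1.645)
risk_95 = risk_from_index(1.645)
```

An index of zero gives a reliability of one half, and larger positive indices move the reliability toward one.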
5.4. Sensitivity Analysis
5.4.1. Sensitivity of the predicted pth percentile to the shape
of the distribution
The predicted pth percentile may be the limit of a confidence
interval of an estimated parameter, or the magnitude of an extreme
hydrologic event or water quality standard which should not be exceeded
p percent of the time (i.e., a reliability of p percent). The
exceedance of such limits will define a failure. The equations developed in
the previous sections apply for this characterization, since they define
the relationship between the magnitude of such events and their
probability of occurrence (risk) or nonoccurrence (reliability), which are
related by Equation 1.2.1. Of special interest is Equation 5.3.29,
giving the coefficient of reliability (COR) in terms of the shape of the
APPENDIX G
RELIABILITY AS A FUNCTION
OF COR AND Vy
Page
5.4.3. Sensitivity of design event magnitude to the design period and to the shape of the distribution  171
5.5. Summary and Conclusions  174
CHAPTER 6: CASE STUDY  175
6.1. Introduction  175
6.2. Case Study  176
6.2.1. Sites and data description  176
6.2.2. Temporal and spatial variability of the recorded data  176
6.2.3. Continuous versus single event simulation  186
6.3. Deterministic Models  187
6.3.1. Introduction  187
6.3.2. Distribution free statistical analysis  188
6.3.3. Calibration and verification of deterministic models  197
6.3.4. Water quantity and quality continuous simulation  199
6.3.5. Summary and conclusions  212
6.4. Probabilistic Models  212
6.4.1. Introduction  212
6.4.2. Annual rainfall series  214
6.4.3. Event statistics  215
6.4.4. Quality Statistics  217
6.4.5. Conclusion  220
6.5. Stochastic Models  220
6.5.1. Introduction  220
6.5.2. Model description  222
6.5.3. Parameter estimation and goodness of fit evaluation  223
6.5.4. Annual ARMA(1,1) model  226
6.5.5. Monthly ARMA(1,1) model  235
6.5.6. Reliability of estimated parameters  243
6.6. Summary and Conclusions  246
CHAPTER 7: SUMMARY AND CONCLUSIONS  250
7.1. Summary  250
7.2. Conclusion  251
7.3. Suggestion for Further Research  253
APPENDICES
A. Linear Regression  255
B. Exact Relations Between Moments of the Normal and Lognormal Distributions  258
C. Probability Distributions  261
D. GPDCP Source Program  274
E. Modified Kite Program  284
F. Coefficient of Reliability as a Function of ALFA and Vy  293
CC    GENERALIZED PROBABILITY DISTRIBUTIONS COMPUTER PROGRAM (GPDCP)
C
C
      DIMENSION S(60),P(60),PC(60),F(60),DD(60,4),Y(60),Z(60)
      DIMENSION AA(100),BB(100),BP(60),E2(100),E1(100),R2(100)
      DIMENSION DT(10),NO(10),Y1(60),Y2(60),NADI(4,3),SITE(20)
      DIMENSION RF(60,4),TR(60),AFT(100),ZZ(10),AMZ(10),SS(60)
      DIMENSION U(10),ZV(10),ZM(10),NP(10),RESY(100),RES(100),RT(60)
      DIMENSION TC(10),ON(10),IRG1(60),IRG2(60),XMXL(100),S1(100)
      DIMENSION YM(100),YV(100),YS(100),SLCT(4),ST(60),SC(60),DFN(25)
      DIMENSION XY(60,2),ALBAP(3),DES(5),ANOVA(14),STAT(9),PRED(60,7)
C
C
      DATA IN,IOU/5,6/
      DATA NADI(1,1),NADI(1,2),NADI(1,3)/' NO','RMAL','    '/
      DATA NADI(2,1),NADI(2,2),NADI(2,3)/' GU','MBEL','    '/
      DATA NADI(3,1),NADI(3,2),NADI(3,3)/' RA','YLEI','GH  '/
      DATA NADI(4,1),NADI(4,2),NADI(4,3)/' PE','ARSO','N   '/
      DATA SLCT/' R2 ','STDE','WSS ','MXLF'/
      DATA ALBAP/0.01,0.05,0.05/
C
CC    CONTROL FLAGS
C
      IP0=0
      IP1=1
      IAF=0
C
      NS=1
      NLB=1
      NLE=4
      NLOI=NLE-NLB+1
      ALF=0.00
C
CC    STATION DO LOOP
C
      DO 999 KS=1,NS
C
      CT=0.0
      CP=1.0
      IPRINT=1
C
CC    READ IN TRANSFORMATION REQUEST USING 501 OR 502 FORMAT
C
      IF(IAF.EQ.1) GO TO 10
C
      READ(IN,501) NAF,AFO,PAS
  501 FORMAT(I5,F6.2,F5.3)
      GO TO 20
C
   10 READ(IN,502) NAF,((RF(I,J),I=1,NAF),J=NLB,NLE)
  502 FORMAT(I5,10F6.2)
      PAS=0.001
   20 CONTINUE
C
      READ(IN,503) (SITE(L),L=1,20)
  503 FORMAT(20A4)
      READ(IN,504) NECH,NANNE,INDEX
  504 FORMAT (1
C
      READ(IN,505) (DT(I),I=1,NECH)
  505 FORMAT(10F5.0)
      READ(IN,506) (NO(I),I=1,NECH)
  506 FORMAT(10I5)
C
Table 4.23 Detailed Statistics for the R2
Selected Model, Example 2, Rainfall.
EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RAINFALL,19341931
REGRESSION RESULTS : PEARSON FAMILY GIVES OPTIMAL R2 FOR ALFA = O.15
BASIC DESCRIPTIVE STATISTICS
MEAN
STANDARD
DEVIATION
CORRELATION
TRANSFORMED VARIABLE
2. 952
0. 037
REDUCED VARIATE
570. 602
22. 624
0. 988
ANALYSIS OF VARIANCE
SOURCE OF
VARIATION
DEGREE OF
FREEDOM
SUMS OF
SQUARES
MEAN
SQUARES
P(EXCEEDING F
FVALUE UNDER HO)
REGRESSION
RESIDUAL
TOTAL
1. 000
46. 000
47. 000
0. 347
0. 003
0. 355
0. 347
0. 000
1900. 204 0. 0
DURBINWATSON
STATISTIC
0. 399
MODEL PARAMETER INFERENCES
LOWER
UPPER
POINT
STANDARD
CONFIDENCE
CONFIDENCE
ESTIMATE
ERROR
LIMIT
LIMIT
SCALE
0. 004
0. 000
0. 004
0. 004
LOCATION
0. 755
0. 050
0. 653
0. 856
COVARIANCE
0. 000
Table 6.26. Bayesian Information Criterion for the ARMA(1,1) Models of the
Annual Rainfall Series.

                                       Transformation
Station            α_opt    α = α_opt    α = 1.0    α = 0.50    α = 0.0
Miami AP            0.10      183.33       2.18      127.48     190.64
N. N. R. Canal      0.65       91.58       4.67      122.83     183.68
West Palm Beach     0.90       33.12       1.67      132.29     196.31
Belle Glade         1.30      108.34       1.49      123.14     180.90
Ortona Lock 2       1.20       71.79       3.32      121.07     178.58
Port Mayaca         2.00      458.58       9.42      115.63     174.42
St. Lucie           0.45      138.06       0.51      128.88     190.66
Daytona Beach       0.90       27.30       3.05      122.11     180.81
Table 4.14 Models, Parameters and Selection Statistics, Example 2, Runoff.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF, 1934-...
REGRESSION RESULTS : GUMBEL FAMILY GIVES OPTIMAL R2 FOR ALFA = 0.75

LAW       ALFA  LOCATION    SCALE       R2     STDE      WSS   MXLF   MEAN    STD   SKEW      K
NORMAL    0.00   1.98334  0.68403  0.90539  1.07562  2.08890  17.68  1.983  0.676  1.210
NORMAL    0.10   2.22034  0.82052  0.93615  0.78601  1.62781  11.69  2.220  0.798  0.815
NORMAL    0.20   2.49799  0.98991  0.95712  0.71095  1.30805   6.46  2.496  0.952  0.472
NORMAL    0.30   2.82467  1.20062  0.96920  0.76856  1.16247   3.61  2.825  1.147  0.174
NORMAL    0.40   3.21069  1.46337  0.97378  0.87749  1.20456   4.47  3.211  1.395  0.088
NORMAL    0.50   3.66879  1.79179  0.97183  0.99701  1.43834   8.72  3.669  1.710  0.323
NORMAL    0.60   4.21475  2.20329  0.96455  1.11667  1.85711  14.86  4.215  2.110  0.537
NORMAL    0.70   4.86017  2.72014  0.95293  1.23225  2.44810  21.49  4.868  2.621  0.736
NORMAL    0.80   5.65346  3.37090  0.93781  1.34700  3.44664  29.70  5.653  3.274  0.925
NORMAL    0.90   6.60111  4.19221  0.91908  1.46151  4.03943  33.51  6.601  4.112  1.107
NORMAL    1.00   7.74935  5.23127  0.89971  1.57547  4.52098  36.21  7.749  5.108  1.284
GUMBEL    0.00   1.69668  0.52340  0.81945  2.76635  3.77901  31.91  1.983  0.676  1.210
GUMBEL    0.10   1.87286  0.63444  0.86522  2.01224  3.18185  27.78  2.220  0.798  0.815
GUMBEL    0.20   2.07454  0.77316  0.90260  1.49857  2.57414  22.69  2.498  0.952  0.472
GUMBEL    0.30   2.30609  0.94685  0.93192  1.14133  2.00242  16.66  2.825  1.147  0.174
GUMBEL    0.40   2.57271  1.16485  0.95384  0.89756  1.50699   9.84  3.211  1.395  0.088
GUMBEL    0.50   2.88060  1.43913  0.96917  0.74470  1.11956   2.71  3.669  1.710  0.323
GUMBEL    0.60   3.23711  1.70504  0.97072  0.66038  0.86244   3.55  4.215  2.110  0.537
GUMBEL    0.70   3.65104  2.22230  0.90325  0.65400  0.74907   6.93  4.868  2.621  0.736
GUMBEL    0.80   4.13287  2.77630  0.98348  0.68524  0.78475   5.82  5.653  3.274  0.925
GUMBEL    0.90   4.69511  3.48009  0.97996  0.74890  0.96712   0.DO  6.601  4.112  1.107
GUMBEL    1.00   5.352/1  4.37592  0.97322  0.84000  1.28520   6.02  7.749  5.108  1.284
RAYLEIGH  0.00   0.73812  1.00150  0.84299  1.78402  3.05144  26.77  1.983  0.676  1.210
RAYLEIGH  0.10   0.71461  1.21103  0.80575  1.29444  2.482BD  21.83  2.220  0.798  0.815
RAYLEIGH  0.20   0.66765  1.47211  0.91937  0.90416  1.96227  16.18  2.490  0.952  0.472
RAYLEIGH  0.30   0.58894  1.79815  0.94433  0.81103  1.53131  10.23  2.825  1.147  0.174
RAYLEIGH  0.40   0.46749  2.20630  0.96143  0.74504  1.22177   4.81  3.211  1.395  0.088
RAYLEIGH  0.50   0.28883  2.71844  0.97160  0.75308  1.05486   1.28  3.669  1.710  0.323
RAYLEIGH  0.60   0.03393  3.36256  0.97579  0.80379  1.04181   0.98  4.215  2.110  0.537
RAYLEIGH  0.70   0.32233  4.17463  0.97487  0.87639  1.18509   4.08  4.868  2.621  0.736
RAYLEIGH  0.80   0.81297  5.20083  0.96963  0.96096  1.47951   9.40  5.653  3.274  0.925
RAYLEIGH  0.90   1.48142  6.50064  0.96071  1.05513  1.91256  15.56  6.601  4.112  1.107
RAYLEIGH  1.00   2.38488  8.15078  0.94869  1.16265  2.46216  21.62  7.749  5.IBS  1.284
PEARSON   0.00   0.62612  0.31873  0.82203  2.40725  3.62361  30.90  1.983  0.676  1.210  8.602
PEARSON   0.10   0.66320  0.40648  0.86244  1.01376  3.15184  27.55  2.220  0.798  0.815  7.747
PEARSON   0.20   0.71257  0.52490  0.89478  1.42118  2.69941  23.83  2.498  0.952  0.472  6.888
PEARSON   0.30   0.77602  0.68534  0.91980  1.15912  2.29077  19.89  2.825  1.147  0.174  6.063
PEARSON   0.40   0.85625  0.90319  0.93858  0.99017  1.94352  15.95  3.211  1.395  0.088  5.298
PEARSON   0.50   0.95560  1.20024  0.95232  0.88501  1.66181  12.19  3.669  1.710  0.323  4.605
PEARSON   0.60   1.07794  1.60605  0.96196  0.82495  1.44890   8.90  4.215  2.110  0.537  3.989
PEARSON   0.70   1.22707  2.16245  0.96841  0.79629  1.30225   6.34  4.868  2.621  0.736  3.449
PEARSON   0.80   1.40810  2.92745  0.97244  0.70739  1.21373   4.65  5.653  3.274  0.925  2.981
PEARSON   0.90   1.62Q26  3.98149  0.97463  0.79120  1.17481   3.87  6.601  4.112  1.107  2.578
PEARSON   1.00   1.89502  5.43843  0.97538  0.80538  1.18147   4.00  7.749  5.188  1.284  2.231

R2   : CORRELATION COEFFICIENT
STDE : STANDARD ERROR
WSS  : WEIGHTED SUM OF SQUARES
MXLF : MAXIMUM LIKELIHOOD FUNCTION
STD  : STANDARD DEVIATION
K    : PEARSON SHAPE PARAMETER
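parameters and are thus meaningless. Duality between curve fitting and
parameter estimation will be dealt with in more detail in Chapter 6.
Equating to zero the derivatives of Equation 2.2.4 with respect to
each of the parameters θ_j, j = 1, ..., p, leads to a set of p equations
with p unknowns. These are the well-known normal equations

$$
\begin{aligned}
\frac{\partial(SS)}{\partial\theta_1} &= -2\sum_{i=1}^{n}\,[y_i - g(x_i,\theta)]\,\frac{\partial g(x_i,\theta)}{\partial\theta_1} = 0 \\
\frac{\partial(SS)}{\partial\theta_2} &= -2\sum_{i=1}^{n}\,[y_i - g(x_i,\theta)]\,\frac{\partial g(x_i,\theta)}{\partial\theta_2} = 0 \\
&\;\;\vdots \\
\frac{\partial(SS)}{\partial\theta_p} &= -2\sum_{i=1}^{n}\,[y_i - g(x_i,\theta)]\,\frac{\partial g(x_i,\theta)}{\partial\theta_p} = 0 .
\end{aligned}
\qquad (2.2.5)
$$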
The ease of the solution of these equations is highly dependent on the
form of the function g. If this function is a linear expression of the
parameters, then the normal equations are also linear, and the problem
reduces to the linear least squares problem, solved readily by multiple
linear regression. On the other hand if g is not a linear expression of
the parameters, the normal equations are nonlinear. No direct explicit
solution for such equations exists. Nonlinear least squares methods may
be used for this solution, usually based on a direct search for the set
of parameters minimizing Equation 2.2.4 or on an iterative solution of a
linearized form of the normal equations. A detailed description
of these methods may be found in Bevington (1969) and Bard (1974).
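
The iterative solution of the linearized normal equations can be sketched as a damped Gauss-Newton loop. This is only an illustration under stated assumptions: the two-parameter model g(x) = a exp(b x), the function names, the starting values and the step-halving rule are all hypothetical choices made here, not the procedures of Bevington (1969) or Bard (1974).

```python
import math


def sum_of_squares(x, y, a, b):
    """SS for the assumed model g(x) = a * exp(b * x); inf if exp overflows."""
    try:
        return sum((yi - a * math.exp(b * xi)) ** 2 for xi, yi in zip(x, y))
    except OverflowError:
        return math.inf


def gauss_newton(x, y, a, b, max_iter=100):
    """Minimize SS by repeatedly solving the linearized normal equations."""
    current = sum_of_squares(x, y, a, b)
    for _ in range(max_iter):
        # Accumulate J'J (s11, s12, s22) and J'r (t1, t2) for the two parameters.
        s11 = s12 = s22 = t1 = t2 = 0.0
        for xi, yi in zip(x, y):
            e = math.exp(b * xi)
            r = yi - a * e            # residual y_i - g(x_i, theta)
            ja, jb = e, a * xi * e    # dg/da and dg/db
            s11 += ja * ja
            s12 += ja * jb
            s22 += jb * jb
            t1 += ja * r
            t2 += jb * r
        det = s11 * s22 - s12 * s12
        if det == 0.0:
            break
        # Solve the 2x2 linear system (J'J) delta = J'r.
        da = (s22 * t1 - s12 * t2) / det
        db = (s11 * t2 - s12 * t1) / det
        # Halve the step until SS decreases (simple damping, an assumption here).
        step = 1.0
        trial = sum_of_squares(x, y, a + da, b + db)
        while trial >= current and step > 1e-8:
            step *= 0.5
            trial = sum_of_squares(x, y, a + step * da, b + step * db)
        if trial >= current:
            break  # no further improvement: treat as converged
        a, b, current = a + step * da, b + step * db, trial
    return a, b


# Synthetic, error-free data generated from a = 2, b = 0.5 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * xi) for xi in xs]
a_hat, b_hat = gauss_newton(xs, ys, a=1.5, b=0.4)
print(a_hat, b_hat)
```

When g is linear in the parameters the same system is solved once, without iteration, which is the linear least squares case described above.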
2.2.4. Maximum Likelihood Method
By showing that least squares estimates maximized the probability
density function for the normal error distribution, Gauss (1809)
(referenced in Bard (1974)) laid the statistical foundation for parameter
estimation. In this, Gauss anticipated the maximum likelihood method.
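
The equivalence noted by Gauss can be checked numerically. The sketch below assumes the simplest possible model, a constant g(x, θ) = θ, with independent normal errors of common standard deviation; the data values, the grid and the function names are hypothetical and serve only to show that the least squares and maximum likelihood estimates coincide in that case.

```python
import math


def sum_of_squares(theta, y):
    """Sum of squared deviations from the constant model g = theta."""
    return sum((yi - theta) ** 2 for yi in y)


def neg_log_likelihood(theta, y, sigma):
    """Negative log-likelihood for independent N(theta, sigma^2) observations."""
    n = len(y)
    return (n * math.log(math.sqrt(2.0 * math.pi) * sigma)
            + sum_of_squares(theta, y) / (2.0 * sigma ** 2))


# Hypothetical observations; their arithmetic mean is 5.02.
y_obs = [4.1, 5.3, 4.8, 5.9, 5.0]
grid = [4.0 + 0.01 * i for i in range(201)]

# The likelihood depends on theta only through SS, so both searches agree.
best_ls = min(grid, key=lambda t: sum_of_squares(t, y_obs))
best_ml = min(grid, key=lambda t: neg_log_likelihood(t, y_obs, sigma=0.7))
print(best_ls, best_ml)
```

With unequal error variances the likelihood weights each squared residual by the inverse of its variance, so the equivalence then holds for weighted rather than ordinary least squares.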
consequences of failure may be (vulnerability). Application of these
criteria was illustrated with a water supply reservoir example for which
it was found that there is a tradeoff among expected benefits, reliability,
resilience and vulnerability. For instance, high system reliability
was accompanied by high system vulnerability.
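
As a rough illustration of the three criteria, the sketch below computes them from a supply and demand record. The estimators used here (fraction of satisfactory periods, frequency of recovery in the period following a failure, and mean of the largest deficit in each failure event) are one common reading of the definitions and are assumptions of this example, as are the function name and the data.

```python
def performance_indices(supply, demand):
    """Return (reliability, resilience, vulnerability) for paired series."""
    deficits = [max(d - s, 0.0) for s, d in zip(supply, demand)]
    failed = [d > 0.0 for d in deficits]
    n = len(failed)

    # Reliability: fraction of periods in which demand is fully met.
    reliability = 1.0 - sum(failed) / n

    # Resilience: chance that a failure period is followed by a recovery.
    transitions = [(failed[t], failed[t + 1]) for t in range(n - 1)]
    in_failure = sum(1 for now, _ in transitions if now)
    recovered = sum(1 for now, nxt in transitions if now and not nxt)
    resilience = recovered / in_failure if in_failure else None

    # Vulnerability: average of the worst deficit within each failure event.
    event_maxima, worst = [], 0.0
    for d in deficits:
        if d > 0.0:
            worst = max(worst, d)
        elif worst > 0.0:
            event_maxima.append(worst)
            worst = 0.0
    if worst > 0.0:
        event_maxima.append(worst)
    vulnerability = sum(event_maxima) / len(event_maxima) if event_maxima else 0.0
    return reliability, resilience, vulnerability


# Hypothetical eight-period record with a constant demand of 10 units.
supply = [12, 9, 8, 11, 10, 7, 12, 12]
demand = [10] * 8
rel, res, vul = performance_indices(supply, demand)
print(rel, res, vul)
```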
Niku et al. (1979, 1981) developed a reliability model for the
evaluation of activated sludge processes of plants under design or under
operation. The model was based on the assumption of lognormality of the
analyzed effluent and on the relation between the moments of the normal
and lognormal distributions. Reliability-based parameter estimation
procedures of rainfall-runoff models have been reported by Sorooshian
and Dracup (1980), Sorooshian (1981), Troutman (1982), Sorooshian et al.
(1983), Sorooshian and Gupta (1983) and Gupta and Sorooshian (1983) in
which special objective functions and solution techniques were selected
on the basis of the stochastic properties of the errors present in the
data and in the model. In all of these studies, the data were transformed
in order to comply with the assumptions implied by the estimation
methods.
Stedinger (1983a) proposed the use of the noncentral t distribution
to construct approximate confidence limits for specified design events
from the normal and lognormal distributions, and suggested an adjustment
for skewness for the Pearson type III distribution. Through a Monte
Carlo simulation, the new confidence limits were shown to achieve the
desired confidence level better than those based on asymptotic theory
(Kite, 1975) or on the U.S. Water Resources Council Guidelines (WRC,
1977, 1981). In another paper Stedinger (1983b) recommended the use of
probability weighted moments for estimating the parameters of the

Figure 5.5 Reliability Versus Standardized Mean, α = 1.0

This distribution is much more complicated to deal with than the
previous ones because it has four parameters. In fact the shape of the
distribution is modeled by two parameters, k and the power exponent α.
As mentioned earlier, this is an overparameterization of the shape of
the distribution; for fixed k the power transformation will introduce
enough flexibility in Equation 3.2.16 to fit any shape of the observed
data. Thus, the parameter k will be replaced by its moment or maximum
likelihood estimate, reducing to three the number of parameters to be
estimated. Elimination of the shape parameter from the estimation
scheme was reported by Kite (1977) and applied by Siswadi (1981) and
Quesenberry and Kent (1982) for the selection among probability
distribution models. The moment estimate is
$$ k = (\mu / \sigma)^2 \qquad (3.2.17a) $$
where μ and σ are the mean and standard deviation of the gamma reduced
variate (Equation C.2.3c).
Alternatively, the maximum likelihood estimate of k can be evaluated
using the polynomial approximation of Greenwood and Durand (1960):
$$ k = (0.5000876 + 0.1648852\,C - 0.0544274\,C^2)/C , \qquad 0 < C \le 0.5772 $$
and
$$ k = \frac{8.898919 + 9.059950\,C + 0.9775373\,C^2}{(17.79728 + 11.968477\,C + C^2)\,C} , \qquad 0.5772 < C \le 17 , \qquad (3.2.17b) $$
where $C = \log(\bar{Y}) - \overline{\log(Y)}$, the logarithm of the sample mean minus the mean of the logarithms.
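
A minimal sketch of this approximation is given below. The function names are hypothetical, natural logarithms are assumed, and the two checks mentioned afterwards rely on the standard results that C equals Euler's constant (about 0.5772) for k = 1 and ln 2 - 1 + 0.5772 for k = 2.

```python
import math


def shape_from_c(c):
    """Greenwood-Durand approximation to the ML estimate of the gamma shape k."""
    if not 0.0 < c <= 17.0:
        raise ValueError("approximation is stated for 0 < C <= 17")
    if c <= 0.5772:
        return (0.5000876 + 0.1648852 * c - 0.0544274 * c * c) / c
    return ((8.898919 + 9.059950 * c + 0.9775373 * c * c)
            / (c * (17.79728 + 11.968477 * c + c * c)))


def gamma_shape_ml(values):
    """Estimate k from positive data through C = log(mean) - mean(log)."""
    n = len(values)
    mean = sum(values) / n
    mean_log = sum(math.log(v) for v in values) / n
    return shape_from_c(math.log(mean) - mean_log)


# Hypothetical sample, used only to exercise the function.
k_hat = gamma_shape_ml([1.0, 2.0, 3.0, 4.0])
print(k_hat)
```

Evaluating `shape_from_c` at C = 0.5772 and at C = ln 2 - 1 + 0.5772 should return values very close to 1 and 2 respectively, which is a convenient check on the coefficients.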
The CDF of the GPD is then
$$ F = \frac{\lambda^{k}}{\Gamma(k)} \int_{d}^{y} (u - d)^{k-1} \exp[-\lambda(u - d)]\,du = \Gamma_z(k)/\Gamma(k) \qquad (3.2.18) $$
V=A(XN**2)/B KITE230
U=A KITE231
U=(B/XN)XN/A KITE232
DU=R KITE233
DV=RXN*t3/Btt2 KITE234
Ma1.0+(XN)/A2 KITE235
FPN=XN4TRI4( AXN*DW/U KITE236
AS=GMLFCN/FPN KITE237
DELTA=ABS(EPS*AS> KITE23S
IF(ABS(ASGHL).LTDELTA) GO TO 4 KITE239
IFdCOUNT>GTNHAX) GO TO 6 KITE240
GHL=AS KITE241
GO TO 2 KITE242
4 CONTINUE KITE243
GAHHA=AS KITE244
DO 5 J=1N KITE245
T=SND(J) KITE246
E=BETA(l./3.)1.0/<9.0*BETA(2*/3.))+T/(3*0ttETA(l./6J) KITE247
XT(J)=GAHHA*ALPHA*EW3 KITE24B
5 CONTINUE KITE24?
GO TO 66 KITE250
6 CONTINUE KITE251
C KITE252
IF(IEP5.GTIEPSrt) GO TO 64 KITE253
EPS=EPS10 KITE254
IEPS=IEPSP1 KITE255
GO TO 10 KITE254
66 CONTINUE KITE257
RETURN KITE258
END KITE259
C KITE260
SUBROUTINE EV1 M1H2>TXTBETAALPHADELTAICOUNTEPS) KITE261
C KITE262
DIMENSION X(1)T(1)XT(1) KITE263
REAL Ml.M2 KITE244
XN=N KITE245
C KITE266
NMAX=25 KITE247
ALPHA=12325/(SORT(H2)) KITE268
AML=ALPHA KITE269
IEPS=1 KITE270
IEPSH=7 KITE271
10 CONTINUE KITE272
IC0UNT=0 KITE273
1 IC0UNT=IC0UNTF1 KITE274
A=l*0/(AHLtt2) KITE275
B=H110/AML KITE276
C=0,0 KITE277
D=00 KITE278
E=0,0 KITE279
DO 2 1=1N KITE280
TEHP=EXP(AHLtX(D) KITE281
C=C+TEMP KITE282
D=DFTEMPX(I) KITE283
E=E+TEHP*X(I)2 KITE284
2 CONTINUE KITE285
4. Regression of the transformed variables against the expected order
statistics allows a more efficient use of the information contained
in the observations, in that the rank (frequency) of the observation
contributes to the estimation in addition to the observed values.
Greenwood et al. (1979) showed the merit of using the order statistics
to derive analytical expressions for weighted moments of several
distributions expressible in inverse form. The derived weighted moments were
of simpler analytical structure than the relationships between the
conventional moments and the parameters. This is in good agreement with
the simplicity of the expressions derived in the previous sections,
Equation 3.2.30. Landwehr and Matalas (1979) compared the probability
weighted moments parameters and quantile estimates of the Gumbel
distribution with the estimates from conventional moments and maximum
likelihood methods. Their results showed good agreement between the
weighted moments and the other two methods.
5. Inclusion of the order statistics (ranks transformation) in the
linear regression constitutes a bridge between parametric and
nonparametric statistics. Conover and Iman (1981) and Iman and Conover
(1979) gave a good illustration of this type of regression showing that
it works quite well on monotonic data. This will always be the case for
the regression of the transformed variable against the expected order
statistics, where the CDF is always a monotonic function.
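
The regression described in points 4 and 5 can be sketched for the normal family as follows. Several elements are assumptions of this example rather than the thesis's procedure: the Box-Cox form (x^α - 1)/α with the logarithm at α = 0, the Blom plotting position used as an approximation to the expected normal order statistics, the use of R2 alone as the selection statistic, and all function names.

```python
import math
from statistics import NormalDist


def box_cox(x, alpha):
    """Box-Cox power transformation; alpha = 0 is the logarithmic limit."""
    return math.log(x) if alpha == 0 else (x ** alpha - 1.0) / alpha


def blom_scores(n):
    """Approximate expected normal order statistics (Blom plotting position)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def fit_against_order_statistics(data, alpha):
    """Regress the sorted transformed data on the expected order statistics.

    Returns (location, scale, r2) of the fitted straight line.
    """
    n = len(data)
    t = sorted(box_cox(v, alpha) for v in data)
    z = blom_scores(n)
    mz, mt = sum(z) / n, sum(t) / n
    szz = sum((q - mz) ** 2 for q in z)
    stt = sum((v - mt) ** 2 for v in t)
    szt = sum((q - mz) * (v - mt) for q, v in zip(z, t))
    scale = szt / szz
    location = mt - scale * mz
    r2 = szt * szt / (szz * stt)
    return location, scale, r2


def best_alpha(data, alphas):
    """Pick the exponent giving the largest R2 among the candidates."""
    return max(alphas, key=lambda a: fit_against_order_statistics(data, a)[2])
```

For positive data `x`, a call such as `best_alpha(x, [0, 0.1, 0.2, 0.3])` scans a grid of exponents in the same spirit as the ALFA columns of the output tables; the other families would replace `blom_scores` with their own reduced variates.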
3.3. Generalized Probability Distribution Computer Program (GPDCP)
3.3.1. Solution algorithm
In the previous section it was shown that the four newly
parameterized families of distributions, GNP, GEP, GRD and GPD, reduce to the
simple form of Equation 3.2.30. This equation relates the expected
normal distribution is defined over the entire real domain (−∞, +∞).
However, Yevjevich (1972a) noted that if the mean is larger than 3σ the
probability that the value x reaches the lower boundary is very small
and can be neglected for many practical purposes.
C.2.7. Chi-square distribution
The sum of squares of ν normally distributed standardized variables
has a chi-square distribution with ν degrees of freedom. This is a
special case of the gamma distribution, Equation C.2.3a, with λ = 1/2 and
k = ν/2, where ν is the single parameter of the chi-square distribution.
The pdf of the chi-square distribution is
$$ f(y) = \frac{y^{\nu/2 - 1}\, e^{-y/2}}{2^{\nu/2}\,\Gamma(\nu/2)} $$
and its moments are
$$ \text{mean } \mu = \nu , \qquad \text{variance } \sigma^2 = 2\nu , \qquad (C.2.9) $$
and skewness $\gamma = 2\sqrt{2/\nu}$.
The chi-square distribution is mainly used in statistical inferences
and hypothesis testing.
C.2.8. Student t distribution
A variable y has a Student t distribution with ν degrees of freedom
if it is defined by
$$ y = z / \sqrt{u/\nu} $$
where z is a standardized normal variate and u is a chi-square variate
with ν degrees of freedom. The pdf of the Student distribution is
$$ f(y) = \Gamma\!\left(\frac{\nu+1}{2}\right) \left(1 + y^2/\nu\right)^{-(\nu+1)/2} \Big/ \left[\sqrt{\pi\nu}\;\Gamma(\nu/2)\right] \qquad (C.2.10) $$
Srikanthan, R. and T.A. McMahon. 1981. Log Pearson III Distribution
and Empirically-Derived Plotting Position. Journal of Hydrology,
Vol. 52, pp. 161-163.
Srikanthan, R. and T.A. McMahon. 1982. Stochastic Generation of
Monthly Stream Flows. Journal of the Hydraulics Division, ASCE,
Vol. 108, No. HY3, pp. 419-441.
Stacy, E.W. 1962. A Generalization of the Gamma Distribution. Ann.
Math. Stat., Vol. 33, pp. 1187-1192.
Stacy, E.W. and G.A. Mihram. 1965. Parameter Estimation for a
Generalized Gamma Distribution. Technometrics, Vol. 7, No. 3, pp.
349-358.
Statistical Analysis System. 1982. SAS User's Guide. SAS Institute,
Inc., Statistical Analysis System, Cary, North Carolina.
Stedinger, J.R. 1980. Fitting Lognormal Distribution to Hydrologic
Data. Water Resour. Res., Vol. 16, No. 2, pp. 481-490.
Stedinger, J.R. 1983a. Confidence Intervals for Design Events. Jour.
of Hyd. Eng., ASCE, Vol. 109, No. 1, pp. 13-27.
Stedinger, J.R. 1983b. Estimating a Regional Flood Frequency
Distribution. Water Resour. Res., Vol. 19, No. 2, pp. 503-510.
Stedinger, J.R. 1983c. Design Events with Specified Flood Risk. Water
Resour. Res., Vol. 19, No. 2, pp. 511-522.
Stedinger, J.R. and M.R. Taylor. 1982a. Synthetic Streamflow
Generation: 1. Model Verification and Validation. Water Resour. Res.,
Vol. 18, No. 4, pp. 909-918.
Stedinger, J.R. and M.R. Taylor. 1982b. Synthetic Streamflow
Generation: 2. Effect of Parameter Uncertainty. Water Resour. Res.,
Vol. 18, No. 4, pp. 919-924.
Stidd, C.K. 1953. Cube-Root-Normal Precipitation Distributions.
Trans. Amer. Geophys. Union, Vol. 34, pp. 31-35.
Stidd, C.K. 1968. A Three Parameter Distribution for Precipitation
Data with a Straight-Line Plotting Method. Proc. 1st Statist.
Meteorol. Conf., American Meteorological Society, Hartford,
Connecticut, pp. 158-162.
Stidd, C.K. 1970. The Nth Root Normal Distribution of Precipitation.
Water Resour. Res., Vol. 6, No. 4, pp. 1095-1103.
Tang, W.H. 1980. Bayesian Frequency Analysis. Journal of the
Hydraulics Division, ASCE, Vol. 106, No. HY7, pp. 1203-1217.
of this performance was provided by checking the similarity between
historical and synthetic cumulative distribution functions of the
modeled flows or storages. This criterion was applied later for the
comparison of the performance of four streamflow record extension
techniques (Hirsch, 1982).
Stedinger and Taylor (1982a, 1982b) reconsidered the problem of
streamflow generation and compared the performance of five generation
models. They did this in two steps: a verification phase to confirm
that the model reproduces the statistics of the observed data, and a
validation phase to demonstrate that other important characteristics of
generated flow sequences are consistent with those of the historical
flow, e.g., statistics related to the frequency and severity of droughts.
They also showed that by incorporating parameter uncertainty into the
streamflow generating model, derived distributions of reservoir
reliability and performance will better reflect what is known (or is not)
about a basin's hydrology. Among their conclusions they noted that the
impact of parameter uncertainty is much greater than that of the
selection between a simple and a relatively complicated model. Thus,
streamflow model parameter uncertainty should be incorporated into reservoir
simulation studies to obtain realistic and honest estimates of system
reliability, given what is actually known about basin hydrology
(Stedinger and Taylor, 1982b).
Hashimoto et al. (1982) defined three criteria that can be used to
assist in the evaluation and selection of alternative design and
operating policies for a wide variety of water resource projects. These
criteria describe how likely a system is not to fail (reliability), how
quickly it recovers from failure (resilience) and how severe the
models was illustrated by this case study. The need to comply with the
normality assumption implied by the least squares estimation procedure
in all three types of models was emphasized by a sensitivity analysis of
the selection statistics to the shape of the distribution
(transformation α). The use of deterministic models for continuous simulation of
urban runoff quantity and quality proved to be a very promising tool
for the description of such processes when the models are calibrated
over several storms representative of average hydrological conditions.
Despite the conceptual nature of these models, their estimated
parameters often lose their physical meaning during the calibration and
verification stages. The use of lumped parameters, such as the basin
width for the Runoff block of SWMM, along with calibration for more than
one storm simultaneously, avoided most such losses of meaning and
increased the reliability of the estimates. More insight into the
simulated processes may be gained by a distribution-free analysis of the
generated series using STATS or SYNOP, or by a probabilistic or
stochastic modeling of the generated series.
Statistics such as the AIC and BIC recommended by Rao et al. (1982)
for the selection between the no transformation, square root
transformation and logarithmic transformation were extended to the Box-Cox
transformation, and were shown to select transformations different from
the normalizing transformation. Thus, selection of the best
transformation to normality (or to any other distribution from the types
analyzed in Chapter 3) should be based on a frequency analysis similar
to the one performed by the GPDCP program. The advantage of the
normalizing transformation was illustrated by a Monte Carlo simulation, where
simulated series statistics were found much more reliable than those of
the associated parameters.
Table 5.8 Power to Lognormal Percentile Ratio, 0 ≤ α ≤ 1, P = 99.9%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.900000 % WITH NORMAL VARIATE Z = 3.090

                                        ALFA
 VY     LOG  0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
0.1   1.000  0.999  0.995  0.991  0.986  0.982  0.977  0.972  0.966  0.961  0.955  0.949
0.2   1.000  0.993  0.978  0.961  0.942  0.923  0.901  0.878  0.853  0.826  0.796  0.763
0.3   1.000  0.982  0.947  0.909  0.867  0.822  0.772  0.715  0.652  0.578  0.492  0.387
0.4   1.000  0.960  0.902  0.837  0.766  0.686  0.597  0.494  0.374  0.230  0.049  0.000
0.5   1.000  0.930  0.845  0.750  0.646  0.529  0.397  0.248  0.081  0.000  0.000  0.000
0.6   1.000  0.890  0.778  0.654  0.518  0.368  0.206  0.044  0.000  0.000  0.000  0.000
0.7   1.000  0.845  0.707  0.555  0.393  0.222  0.060  0.000  0.000  0.000  0.000  0.000
0.8   1.000  0.795  0.633  0.460  0.280  0.109  0.000  0.000  0.000  0.000  0.000  0.000
0.9   1.000  0.745  0.562  0.372  0.187  0.036  0.000  0.000  0.000  0.000  0.000  0.000
1.0   1.000  0.692  0.494  0.295  0.115  0.004  0.000  0.000  0.000  0.000  0.000  0.000
1.1   1.000  0.643  0.433  0.230  0.064  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.2   1.000  0.595  0.377  0.177  0.031  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.3   1.000  0.550  0.328  0.134  0.013  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.4   1.000  0.509  0.286  0.101  0.004  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.5   1.000  0.472  0.250  0.075  0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
from a total number of 3494 hours of simulation (Table 6.13). Figure
6.7 gives the plot of the transformed series versus the reduced variates
(expected order statistics) along with the fitted model and the 95%
confidence interval.
6.4.5. Conclusions
This section presented some examples illustrating the potential use
of the GPDCP program in modeling different types of hydrologic series.
The program performed equally well for all the modeled examples. A
sensitivity analysis of the shape parameter to the change of scale
confirmed the results obtained by the illustrative examples of Chapter
4. The selection statistics showed little sensitivity to the plotting
position definition, which was found highly sample dependent, deviating
excessively from the expected value given in the literature (Cunnane,
1978; Royston, 1982).
The next section presents some applications of the reliability-based
approach for stochastic modeling.
6.5. Stochastic Models
6.5.1. Introduction
Time series stochastic modeling for either data generation or
forecasting of hydrologic variables has become an important step in the
planning and operation of water resources systems (Salas and Obeysekera,
1982). The modeling process is generally composed of six main stages
(Salas et al., 1980): 1) identification of model composition, e.g.,
univariate, multivariate or multilevel (disaggregation) models; 2)
identification of model type, e.g., autoregressive-moving average (ARMA),
fractional Gaussian noise (FGN), broken line (BL), etc.; 3)
identification of model form or order, e.g., order p of the autoregressive (AR)
TABLE G1.7
RELIABILITY AS A FUNCTION OF COR AND VY, FOR ALFA = 0.200

                                           VY
   COR  0.2000  0.4000  0.6000  0.8000  1.0000  1.2000  1.4000  1.6000  1.8000  2.0000
0.3000  0.9999  0.9985  0.9640  0.8179  0.5714  0.3303  0.1635  0.0718  0.0288  0.0108
0.4000  0.9999  0.9902  0.9270  0.7839  0.5948  0.4113  0.2643  0.1606  0.0935  0.0528
0.5000  0.9989  0.9758  0.8982  0.7693  0.6179  0.4716  0.3460  0.2465  0.1717  0.1176
0.6000  0.9960  0.9604  0.8789  0.7654  0.6406  0.5201  0.4129  0.3223  0.2484  0.1897
0.7000  0.9915  0.9470  0.8673  0.7678  0.6628  0.5615  0.4692  0.3882  0.3188  0.2604
0.8000  0.9861  0.9366  0.8615  0.7742  0.6844  0.5979  0.5179  0.4459  0.3822  0.3266
0.9000  0.9807  0.9293  0.8598  0.7830  0.7054  0.6307  0.5609  0.4970  0.4392  0.3874
1.0000  0.9759  0.9247  0.8612  0.7934  0.7257  0.6607  0.5995  0.5427  0.4905  0.4430
1.1000  0.9720  0.9224  0.8646  0.8047  0.7454  0.6884  0.6344  0.5839  0.5370  0.4937
1.2000  0.9690  0.9219  0.8696  0.8164  0.7642  0.7141  0.6664  0.6215  0.5793  0.5400
1.3000  0.9670  0.9228  0.8757  0.8284  0.7823  0.7380  0.6958  0.6558  0.6180  0.5824
1.4000  0.9658  0.9248  0.8824  0.8403  0.7995  0.7603  0.7229  0.6872  0.6534  0.6213
1.5000  0.9654  0.9276  0.8895  0.8521  0.8159  0.7812  0.7479  0.7162  0.6859  0.6570
1.6000  0.9656  0.9310  0.8968  0.8635  0.8315  0.8007  0.7711  0.7428  0.7157  0.6898
1.7000  0.9663  0.9348  0.9042  0.8746  0.8461  0.8188  0.7925  0.7673  0.7432  0.7199

COR : COEFFICIENT OF RELIABILITY
VY : COEFFICIENT OF VARIATION
TABLE F1.2
COEFFICIENT OF RELIABILITY
FOR A PROBABILITY OF 95.000000 % WITH NORMAL VARIATE Z = 1.645

                                          ALFA
   VY     LOG  0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
0.100   0.853  0.853  0.853  0.854  0.855  0.855  0.856  0.856  0.857  0.858  0.858  0.859
0.200   0.736  0.734  0.736  0.738  0.740  0.742  0.744  0.746  0.747  0.749  0.751  0.752
0.300   0.644  0.639  0.642  0.645  0.649  0.652  0.655  0.658  0.661  0.664  0.667  0.670
0.400   0.571  0.561  0.566  0.570  0.575  0.579  0.584  0.588  0.592  0.596  0.599  0.603
0.500   0.514  0.498  0.504  0.509  0.515  0.520  0.525  0.530  0.535  0.540  0.544  0.549
0.600   0.468  0.446  0.453  0.459  0.465  0.471  0.477  0.482  0.488  0.493  0.498  0.503
0.700   0.432  0.404  0.411  0.417  0.424  0.430  0.436  0.442  0.448  0.454  0.459  0.465
0.800   0.403  0.370  0.376  0.382  0.389  0.395  0.402  0.408  0.414  0.420  0.426  0.432
0.900   0.379  0.341  0.347  0.353  0.360  0.366  0.372  0.379  0.385  0.391  0.397  0.403
1.000   0.360  0.318  0.323  0.329  0.335  0.341  0.347  0.353  0.359  0.366  0.372  0.378
1.100   0.344  0.300  0.304  0.309  0.314  0.319  0.325  0.331  0.337  0.344  0.350  0.356
1.200   0.330  0.286  0.288  0.291  0.296  0.301  0.306  0.312  0.318  0.324  0.330  0.336
1.300   0.319  0.274  0.275  0.277  0.281  0.285  0.290  0.295  0.301  0.306  0.313  0.319
1.400   0.310  0.266  0.265  0.265  0.268  0.271  0.275  0.280  0.285  0.291  0.297  0.303
1.500   0.302  0.261  0.257  0.256  0.257  0.259  0.262  0.267  0.271  0.277  0.282  0.288

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 3.2. Optimal Power Transformation and Information Number
Ratio. After Hernandez and Johnson (1980).

                                   Information number   Information number
Distribution           α_opt       (α = 1)              (α = α_opt)          Ratio
Gamma (a,1,0.5)        0.2084      0.98175              0.1205               81.5
Exponential (a,1,1)    0.2654      0.41894              0.00278              150
Gamma (a,1,1.5)        0.2887      0.26070              0.00140              186
Gamma (a,1,2)          0.3006      0.18830              0.00051              369
Gamma (a,1,3)          0.3124      0.12067              0.00019              635
Weibull (a,b,1)        0.2654a
      SUBROUTINE RNV(P,X,D,IE)
C     INVERSE OF THE STANDARD NORMAL CDF. FOR A PROBABILITY P, RETURNS
C     THE VARIATE X AND THE ORDINATE D. IE IS NONZERO IF P IS OUTSIDE
C     THE INTERVAL (0,1).
      IE=0
      IF(P) 1,4,2
    1 IE=1
      GO TO 12
    2 IF(P-1.) 7,6,1
    4 X=-0.9E+32
    5 D=0.
      GO TO 12
    6 X=0.9E+32
      GO TO 5
    7 D=P
      IF(D-0.5) 9,9,8
    8 D=1.-D
    9 T2=ALOG(1./(D*D))
      T=SQRT(T2)
      X=T-(2.515517+0.802853*T+0.010328*T2)/(1.+1.432788*T+0.189269*T2
     .+0.001308*T*T2)
      IF(P-0.5) 10,10,11
   10 X=-X
   11 D=0.3989423*EXP(-X*X/2.)
   12 RETURN
      END
      SUBROUTINE REGRE(X,Y,Y2,MMAX,J1,J2,A,B,R2,S,VARX,AMX,VARY)
C     SIMPLE LINEAR REGRESSION OF Y ON X OVER THE INDICES J1 TO J2.
C     RETURNS SLOPE A, INTERCEPT B, R2, STANDARD ERROR S AND THE
C     FITTED VALUES Y2.
      DIMENSION X(60),Y(60),Y2(60)
      AMX=0.
      AMY=0.
      VAX=0.
      VAY=0.
      COV=0.
      DO 1 I=J1,J2
      AMX=AMX+X(I)
      AMY=AMY+Y(I)
      VAX=VAX+X(I)*X(I)
      VAY=VAY+Y(I)*Y(I)
      COV=COV+X(I)*Y(I)
    1 CONTINUE
      AN=FLOAT(J2-J1+1)
      AMX=AMX/AN
      AMY=AMY/AN
      VARX=VAX/AN-AMX*AMX
      VARY=VAY/AN-AMY*AMY
      COVA=COV/AN-AMX*AMY
      A=COVA/VARX
      B=AMY-A*AMX
      R2=COVA**2/VARX/VARY
      S=SQRT((AN/(AN-2.))*(VARY-COVA**2/VARX))
      DO 2 I=J1,J2
      Y2(I)=A*X(I)+B
    2 CONTINUE
      RETURN
      END
c
c
100
200
300
400
500
00
C
C
C
C
1
2
4
5
6
7
8
C
GPDP470
CPDCP SUBROUTINES
GPDP471
GPDP472
SUBROUTINE ORDER(XIRANG1.IRANG2N,ND)
GPDP473
GPDP474
DIMENSION X(ND),IRANG1(NB)IRANG2(ND)
GPDP475
GPDP476
DO 100 1=1jN
GPDP477
IRANG2(I)=I
GPDP478
M=N1
GPDP47?
IF(ii) 600600*200
GPDP480
DO 400 1=1,H
GPDP4B1
L=I+1
GPDP482
DO 400 J=L,N
GPDP483
IF(X(I)X(J)) 400,400,300
GPDP484
T=X(I)
GPDP485
X(I)=X(J)
GPDP483
X(J)=T
GPDP487
IT=IRANG2(I)
GPDP488
IRANG2(I)=IRANG2(J)
GPDP48?
IRAH62 J)=IT
GPDP490
CONTINUE
GPDP491
DO 500 K=1N
GPDP492
L=IRANG2(K)
GPDP493
IRANG1(L)=K
GPDP494
RETURN
GPDP495
END
GPDP496
GPDP497
GPDP498
SUBROUTINE CDF(P,PC,F,NK,N,J1,J2,INDEX,ALF)
GPDP499
GPDP500
DIMENSION P(NM),PC(NM)F
GPDP501
GPDP502
PC(1)=P(1)
GPDP503
IF(INDEX1) 1,3,1
GPDP504
DO 2 1=2,N
GPDP505
PC(I)=PC(I1)+P(I)
GPDP506
GO TO 5
GPDP507
CONTINUE
GPDP503
DO 4 1=2,N
GPDP509
PC(I)=P(I)
GPDP510
CONTINUE
GPDP511
DO 6 1=1,N
GPDP512
PPP=PC(I)
GPDP513
IF(PPP) ,,7
GPDP514
CONTINUE
GPDP515
J1=I
GPDP516
J2=N
GPDP517
I1=N
GPDP518
DO 8 I=J1,J2
GPDP519
F
GPDP520
RETURN
GPDP521
END
GPDP522
GPDP523
illustrated by generated tables and plots of these indices for several
transformation parameters and coefficients of variation. An unexpected
increase of reliability with the coefficient of variation for fixed
extreme events is highlighted by the new generalized reliability indices
and definitions.
Deterministic models, such as SWMM, are proven to be valuable tools
for continuous simulation of hydrologic processes, once calibrated
against several storms representative of average hydrological conditions
of the basins under investigation. Simulated series provide valuable
information for storm water management and for hydrological planning and
design when separated into events and treated by statistical analysis
programs such as the Statistics Block of SWMM or the Synoptic
Statistical Analysis Program (SYNOP).
More insight into the modeled hydrologic processes may be gained
through a probabilistic and stochastic modeling of these simulated
series, e.g., the generalized probability distribution computer program
(GPDCP) is used for the modeling of the event characteristics of the
generated series (duration, volume, intensity and time between events)
and of generated series of extreme hourly pollutant loads. For all
these series the performance of the models fitted to the event data is
as good as for the total annual series.
Stochastic modeling of annual and monthly rainfall series is
illustrated by the analysis of data from eight National Weather Service
(NWS) stations from southeast Florida. Parameter estimation based on
optimal decision criteria, such as the Akaike Information Criterion
(AIC) and Bayesian Information Criterion (BIC) extended to the Box-Cox
transformation, performed very poorly in selecting the normalizing
reviewed in Chapter 2 as special cases, but it has the disadvantage of
being so poorly parameterized that no simple procedure for estimating its
parameters has been reported in the literature. The estimates are
usually highly correlated, and iterative algorithms are required for
their evaluation. The convergence of these algorithms is not always
guaranteed, and the final estimates are often dependent on their first
estimates. To circumvent such limitations, a new parameterization of
the generalized distribution is suggested. In its new form the
generalized distribution includes four families of distributions, each
expressible in terms of only two parameters (location and scale), once the data
are transformed to the right space through the Box-Cox transformation.
The four families of generalized distribution analyzed are the normal,
Gumbel, Rayleigh, and Pearson distributions. A computer program is
developed to fit up to 100 probability distribution models
simultaneously and select the best one according to any of four selection
statistics. These statistics are the correlation coefficient, standard
error of estimate, weighted sum of squares and maximum likelihood
function.
The performance of this program and the new generalized
distributions is compared in Chapter 4 to that of the classical maximum
likelihood procedure used by Kite (1977) for estimating the parameters
of six of the most widely used frequency distribution models in
hydrology. The comparison is based on two examples. The first is a series of
maximum daily discharges from the St. Mary's River at Stillwater, Nova
Scotia, analyzed by Kite (1977). The second example has two series of
annual total rainfall and runoff from the Kissimmee River basin analyzed
by Huber et al. (1982). These two examples are also used for a
Table 3.4. Other Special Distributions of the GGD Not
Included in This Study (after Stacy and
Mihram, 1965).

                          Parameters
Distribution         a        b        k
Exponential          a        1        1
Weibull              a        b        1
Chi-square           2        1        v/2
Chi                  √2       2        v/2
Half normal          √2       2        1/2
Circular normal      √2       2        1
Spherical normal     √2       2        3/2

v = degrees of freedom
This method was revived later by Fisher (1950) who studied its estimator
properties such as consistency, efficiency and sufficiency.
Assuming the measured values $y_i$, $i = 1, \ldots, n$, are independent and
that each has a probability density function (pdf) $f_{y_i}$ about its
expected value, estimated by $g(x_i,\theta)$ of Equation 2.2.1, the
probability that all n observations lie within dy of the predicted value
$g(x_i,\theta)$ defines the likelihood function

$$L = \prod_{i=1}^{n} f_{y_i}(e_i) \tag{2.2.6}$$

Meyer (1975, pp. 136) emphasized that the term "likelihood" distinguishes,
and is reserved for, a posteriori probabilities. That is, after
a result $y_i$ has been observed, $f_{y_i}(e_i)$ states how probable it was that
$e_i$ was found. The main distinction between a posteriori and ordinary (a
priori) probabilities is that the latter are well defined numbers (even
if unknown) whereas the former have the character of being random variables.
If the errors $e_i$ are normally distributed with mean zero and
variance $\sigma_i^2$, then Equation 2.2.6 gives the following expression of the
likelihood function:

$$L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{1}{2}\left(\frac{y_i - g(x_i,\theta)}{\sigma_i}\right)^2\right]
  = \exp\left[-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i - g(x_i,\theta)}{\sigma_i}\right)^2\right] \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i} \tag{2.2.7}$$
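As an illustration of Equation 2.2.7, the logarithm of the likelihood for independent, normally distributed errors may be evaluated as in the short sketch below. The function name and its arguments are hypothetical; the predictions are assumed to have been computed beforehand from whatever model g is under study.

```python
import math


def gaussian_log_likelihood(y, predicted, sigma):
    """Log of Equation 2.2.7 for independent normal errors.

    y         observed values y_i
    predicted model predictions g(x_i, theta), computed elsewhere
    sigma     standard deviations sigma_i of the errors
    """
    total = 0.0
    for yi, gi, si in zip(y, predicted, sigma):
        e = yi - gi  # error e_i
        total += -0.5 * (e / si) ** 2 - math.log(math.sqrt(2.0 * math.pi) * si)
    return total
```

For fixed standard deviations only the sum of squared standardized errors depends on the parameters, so maximizing this quantity is equivalent to weighted least squares.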
The maximum likelihood estimator (MLE) of $\theta$, say $\hat{\theta}$, is the value of $\theta$
that maximizes L or, equivalently, the logarithm of L, $\mathcal{L} = \log L$ (Meyer,
1975, p. 312). The MLE of $\theta$ is often, but not always, the solution of
the following p equations:
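$$\hat{\sigma}_Y^2 = \frac{\sum (Y - \hat{A} - \hat{B}Z)^2}{n - 2} \tag{A.4}$$

which after some rearrangement reduces to

$$\hat{\sigma}_Y^2 = \frac{\sum (Y - \bar{Y})^2 - \hat{B}^2 \sum (Z - \bar{Z})^2}{n - 2} \tag{A.5}$$

Equation A.0 may be rewritten in the following form as a function
of the estimates $\hat{A}$ and $\hat{B}$ to calculate the predicted value of Y, $\hat{Y}$:

$$\hat{Y} = \bar{Y} + \hat{B}(Z - \bar{Z}) \tag{A.6}$$

Since $\bar{Y}$ and $\hat{B}$ are random variables, $\hat{Y}$ is also a random variable, and
its variance is

$$\mathrm{Var}(\hat{Y}) = \mathrm{Var}(\bar{Y}) + (Z - \bar{Z})^2\,\mathrm{Var}(\hat{B}) + 2(Z - \bar{Z})\,\mathrm{Cov}(\bar{Y}, \hat{B}) \tag{A.7}$$

From Draper and Smith (1966, pp. 19-22) we have

$$\mathrm{Var}(\bar{Y}) = \frac{\sigma_u^2}{n}, \qquad \mathrm{Var}(\hat{B}) = \frac{\sigma_u^2}{\sum (Z - \bar{Z})^2}, \qquad \text{and} \qquad \mathrm{Cov}(\bar{Y}, \hat{B}) = 0 .$$

Substitution into Equation A.7 leads to

$$\mathrm{Var}(\hat{Y}) = \left[\frac{1}{n} + \frac{(Z - \bar{Z})^2}{\sum (Z - \bar{Z})^2}\right] \sigma_u^2 \tag{A.8}$$

where $\sigma_u^2$ is a measure of the average scatter of the data about the
regression line. This variance is often estimated from the data by
Equation A.5 after some additional correction for bias. Raiffa and
Schlaifer (1961, Eq. 11.39) used the following estimate

$$\sigma_u^2 = \frac{n - 1}{n - 3}\,\hat{\sigma}_Y^2 \tag{A.9}$$

to account for the uncertainty in the estimate of $\sigma_Y$ (Equation A.5).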
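The variance of the predicted mean can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the original appendix: it fits the line by least squares, estimates the scatter about the line with n - 2 degrees of freedom (Equation A.4), and substitutes that estimate for the unknown variance in Equation A.8. No further bias correction is applied, and the function name `prediction_variance` is hypothetical.

```python
def prediction_variance(z, y, z0):
    """Return (predicted mean at z0, estimated variance of that prediction)."""
    n = len(z)
    zbar = sum(z) / n
    ybar = sum(y) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    b = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y)) / szz
    a = ybar - b * zbar
    # Equation A.4: scatter about the fitted line, n - 2 degrees of freedom
    s2 = sum((yi - a - b * zi) ** 2 for zi, yi in zip(z, y)) / (n - 2)
    # Equation A.8 with the unknown variance replaced by its estimate s2
    var_mean = (1.0 / n + (z0 - zbar) ** 2 / szz) * s2
    return ybar + b * (z0 - zbar), var_mean
```

The variance is smallest at the sample mean of Z and grows with the squared distance from it, which is why extrapolated predictions are less reliable than interpolated ones.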
LIST OF TABLES
Table Page
3.1. Special Distributions of the GGD 58
3.2. Optimal Power Transformation and Information Number Ratio 61
3.3. Special Distributions of the New Parametrized Generalized
Distributions 66
3.4. Other Special Distributions of the GGD Not Included in
This Study 69
3.5. New Generalized Family of Distributions 73
4.1. Annual Maximum Daily Runoff and Statistics of Original
and Log Transformed Flows, Example 1 87
4.2. Standard Errors for Example 1 88
4.3. Models, Parameters, and Selection Statistics, Example 1,
Runoff Series 90
4.4. Optimal Selection Statistics and Corresponding
Transformation (α) for Example 1 92
4.5. Best Model Based on R2 Selection Statistics, Example 1,
Runoff 94
4.6. Detailed Statistics for the R2 Selected Model, Example 1 96
4.7. Best Model Based on STDE Selection Statistic, Example 1 97
4.8. Detailed Statistics for the STDE Selected Model,
Example 1 98
4.9. Detailed Statistics for the WSS Selected Model, Example 1 103
4.10. Best Model Based on the WSS Selection Statistics,
Example 1 104
4.11. Total Annual Runoff and Statistics of Original and Log
Transformed Flow for Example 2 105
4.12. Total Annual Rainfall and Statistics of Original and Log
Transformed Data 106
4.13. Standard Errors for Example 2 108
4.14. Models, Parameters, and Selection Statistics, Example 2,
Runoff 109
4.15. Optimal Selection Statistic and Corresponding
Transformation (α), Example 2, Annual Total Runoff 110
4.16. Best Model Based on R2 Selection Statistic, Example 2,
Runoff 112
4.17. Detailed Statistics for the R2 Selected Model, Example 2,
Runoff 113
4.18. Best Model Based on the STDE Selection Statistic,
Example 2, Runoff 114
4.19. Detailed Statistics for the STDE Selected Model,
Example 2, Runoff 115
4.20. Models, Parameters, and Selection Statistics, Example 2,
Rainfall 118
If the performance of the system is described by $g(x,\theta)$ and an interval
$\Delta$ is defined about the mean prediction $g(x,\theta)$ within which the observation
will be considered a success, then the reliability may be expressed as

$$\text{Reliability} = 1 - \Pr(|y - g(x,\theta)| > \Delta) \tag{1.2.2}$$

From Figure 1.1, the error distribution about a single prediction
$g(x_i,\theta)$ defines the reliability as the area under the curve delimited by
its upper and lower limits. The risk is defined
by the area under the tails of the distribution. Both risk and reliability
may be defined with only one limit; the second limit may be set
to $+\infty$ or $-\infty$. Figure 1.3a gives one such example in which y and $g(x,\theta)$ are
replaced by capacity C and demand D, and safety margin SM (Harr,
1977). The risk of having the demand exceed the capacity (negative SM)
is given by the shaded area in Figure 1.3a. For this case the lower
limit was set equal to zero. Such limits are usually defined by standards
or design events which should not be exceeded for a given level of
reliability. If these limits are themselves random variables, they will
be represented by their probability distributions, and the reliability
will be evaluated as illustrated in Figure 1.3b, where the probability
associated with a level of demand (fixed limit), say $D_1$, is

$$\Pr\left(|D_1 - D| < \frac{dD}{2}\right) = f_D(D_1)\,dD \tag{1.2.3}$$

and the probability that the capacity is less than $D_1$ is the shaded area
of the same figure,

$$\Pr(C < D_1) = \int_{-\infty}^{D_1} f_C(C)\,dC \tag{1.2.4}$$

The probability of failure is the product of these two probabilities or

$$dP_f = f_D(D_1)\,dD \int_{-\infty}^{D_1} f_C(C)\,dC . \tag{1.2.5}$$
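Summing the elementary failure probabilities of Equation 1.2.5 over all demand levels gives the total probability of failure. The sketch below is a minimal numerical illustration under stated assumptions: capacity and demand are taken as independent and normally distributed purely for convenience, the integral is evaluated by the trapezoidal rule over a finite range, and the function name `failure_probability` is hypothetical.

```python
from statistics import NormalDist


def failure_probability(capacity, demand, steps=4000, width=8.0):
    """Integrate f_D(d) * Pr(C < d) over d with the trapezoidal rule.

    capacity, demand  NormalDist objects (an assumption of this sketch)
    width             half-range of integration in demand standard deviations
    """
    lo = demand.mean - width * demand.stdev
    hi = demand.mean + width * demand.stdev
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        d = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * demand.pdf(d) * capacity.cdf(d)
    return total * h


pf = failure_probability(NormalDist(10.0, 2.0), NormalDist(6.0, 1.5))
reliability = 1.0 - pf
```

For two independent normal variables the safety margin C - D is itself normal, so the numerical result can be checked against the closed form given by the standard normal distribution function evaluated at minus the mean of the margin divided by its standard deviation.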
Table 5.2 Power to Lognormal Percentile Ratio, 0 ≤ α ≤ 1, P = 99%
POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.000000 % WITH NORMAL VARIATE Z = 2.326

                                          ALFA
VY     LOG    0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
0.1    1.000  1.000  1.002  1.004  1.005  1.007  1.009  1.011  1.013  1.015  1.016  1.018
0.2    1.000  0.997  1.004  1.011  1.018  1.025  1.031  1.038  1.044  1.050  1.055  1.061
0.3    1.000  0.988  1.003  1.018  1.033  1.047  1.060  1.072  1.084  1.095  1.106  1.117
0.4    1.000  0.973  0.998  1.023  1.046  1.068  1.089  1.109  1.127  1.145  1.162  1.178
0.5    1.000  0.951  0.988  1.024  1.057  1.088  1.117  1.144  1.170  1.195  1.218  1.241
0.6    1.000  0.925  0.973  1.019  1.062  1.102  1.140  1.176  1.209  1.241  1.271  1.300
0.7    1.000  0.894  0.954  1.010  1.063  1.112  1.158  1.202  1.243  1.282  1.319  1.354
0.8    1.000  0.860  0.930  0.997  1.059  1.117  1.171  1.222  1.271  1.316  1.360  1.401
0.9    1.000  0.826  0.905  0.980  1.051  1.116  1.178  1.237  1.292  1.344  1.394  1.442
1.0    1.000  0.791  0.879  0.961  1.039  1.112  1.181  1.246  1.307  1.366  1.421  1.474
1.1    1.000  0.758  0.852  0.941  1.025  1.104  1.179  1.249  1.317  1.381  1.442  1.500
1.2    1.000  0.727  0.826  0.920  1.009  1.093  1.173  1.249  1.321  1.390  1.456  1.519
1.3    1.000  0.699  0.802  0.900  0.993  1.081  1.165  1.245  1.322  1.395  1.465  1.532
1.4    1.000  0.674  0.779  0.879  0.975  1.067  1.154  1.238  1.318  1.395  1.469  1.540
1.5    1.000  0.653  0.758  0.860  0.958  1.052  1.142  1.229  1.312  1.392  1.469  1.544

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
RELIABILITY ANALYSIS APPLIED
TO MODELING OF
HYDROLOGIC PROCESSES
By
Khlifa Maalel
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1983
Table 6.14 Options Selection for STATS Analysis.
THE PERIOD OF TIME FOR WHICH THE STATISTICAL ANALYSIS IS BEING PERFORMED IS:
STARTING DATE: 740101 STARTING TIME: 0.0 HOURS
ENDING DATE: 771231 ENDING TIME: 0.0 HOURS
THE MINIMUM INTEREVENT TIME HAS BEEN DEFINED AS 10 TIME STEPS
THE LOCATION NUMBER REQUESTED FOR STATISTICAL ANALYSIS IS
ENGLISH UNITS ARE USED IN INPUT/OUTPUT
THE NUMBER OF POLLUTANTS REQUESTED FOR STATISTICAL ANALYSIS IS 1
THE BASE FLOW TO SEPARATE EVENTS IS 0.0
THE POLLUTANTS REQUESTED FOR THIS RUN, IDENTIFIED BY NUMBER, ARE AS FOLLOWS: 1

THE STATISTICAL OPTIONS REQUESTED FOR FLOW RATE ARE INDICATED BY A '1'

                                        TOTAL   AVERAGE   PEAK    EVENT      INTEREVENT
                                        FLOW    FLOW      FLOW    DURATION   DURATION
TABLE OF RETURN PERIOD AND FREQUENCY      1        1        1        1           1
GRAPH OF RETURN PERIOD                    1        1        1        1           1
GRAPH OF FREQUENCY                        1        1        1        1           1
MOMENTS                                   1        1        1        1           1

THE STATISTICAL OPTIONS REQUESTED FOR POLLUTANT NUMBER 1 ARE INDICATED BY A '1'

                                        TOTAL   AVERAGE   PEAK    FLOW WEIGHTED   PEAK
                                        LOAD    LOAD      LOAD    AVERAGE CONC    CONC
TABLE OF RETURN PERIOD AND FREQUENCY      1        1        1          1            1
GRAPH OF RETURN PERIOD                    1        1        1          1            1
GRAPH OF FREQUENCY                        1        1        1          1            1
MOMENTS                                   1        1        1          1            1
PROGRAM EXECUTION CONTINUING. DATA WILL BE READ FROM THE INTERFACE FILE AND SEPARATED INTO EVENTS.
******* END OF INTERFACE FILE REACHED
******* LAST DATE AND TIME READ ARE 771227 0.0 HOURS
******* PROGRAM CONTINUING WITH ANALYSIS OF EVENTS
THE NUMBER OF MONTHS WITHIN THE PERIOD OF ANALYSIS, ROUNDED TO THE NEAREST MONTH, IS 40
THE NUMBER OF EVENTS WITHIN THE PERIOD OF ANALYSIS IS 311
Figure 3.1 Generalized Gamma Distribution
These three criteria were used for the selection of the best transformation
among the square root, logarithmic and no transformation.
This statistic is also extended in this study to the Box-Cox transformation.
The following general expression is defined based on the
maximum likelihood function (Equation 3.3.16) and the previous three
special cases (α = 1, 1/2 and 0) of the BIC (Appendix I gives some details
about this derivation):

$$\mathrm{BIC}(p,q,\alpha) = n \log(\hat{\sigma}_a^2) + (p+q)\log(n) - 8n\,\alpha(\alpha-1)\log(2) - 2n(\alpha-1)\,\overline{\log y} \tag{6.5.10}$$

For a fixed p and q (e.g., ARMA(1,1)) the optimal transformation will be
the one minimizing the above equation. The performance of these statistics
is illustrated by the following two examples of annual and
monthly time series from south Florida.
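A transformation-dependent BIC of this kind can be evaluated with a few lines of code. The sketch below assumes the form n log(residual variance) + (p + q) log n, corrected by the two transformation terms -8nα(α - 1) log 2 and -2n(α - 1) times the mean of log y; this form is an assumption of the sketch, chosen because it reduces to the no-transformation, square-root and logarithmic cases at α = 1, 1/2 and 0. The residual variance is assumed to come from an ARMA(p,q) fit of the transformed series performed elsewhere, and the function name `bic_box_cox` is hypothetical.

```python
import math


def bic_box_cox(resid_var, y, p, q, alpha):
    """BIC of an ARMA(p,q) model fitted to a power-transformed series.

    resid_var  residual variance of the fit to the transformed series
    y          original (untransformed, positive) observations
    alpha      transformation exponent (0 taken as the logarithm)
    """
    n = len(y)
    mean_log_y = sum(math.log(v) for v in y) / n
    return (n * math.log(resid_var)
            + (p + q) * math.log(n)
            - 8.0 * n * alpha * (alpha - 1.0) * math.log(2.0)
            - 2.0 * n * (alpha - 1.0) * mean_log_y)
```

In use, the series would be transformed and refitted for each candidate α, and the α giving the smallest value retained.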
6.5.4. Annual ARMA(1,1) Model
The annual series of total rainfall from the eight NWS stations
(Table 6.1) are fitted to the ARMA(1,1) model for four different transformations.
Among these transformations is the normalizing transformation
(α) selected by the GPDCP program (Table 6.18). The first
estimates of the autoregressive and moving average parameters are
independent of α for seven of the eight stations investigated; the only
change is for the log-transformed series at the West Palm Beach station,
where these estimates change from 0.0 to 0.20 for both $\phi_1$ and $\theta_1$.
This change has little effect on the final estimates given by the
FTMXL subroutine (IMSL, 1979). Table 6.22 gives a complete list of the
final parameter estimates. Note the low sensitivity of the final
estimates to the change of α suggested by this table. Although the
difference between the parameters for a given station is small,
APPENDIX A
LINEAR REGRESSION
Equation 2.2.2 is linear in the two parameters A and B,

$$Y = A + BZ \tag{A.0}$$

Least squares estimates of A, B and $\sigma_Y^2$, the intercept, slope and variance,
respectively, may be found in many statistical textbooks, e.g.,
Draper and Smith (1966), Haan (1977). These are the solution of the
normal equations defined by Equation 2.2.5,

$$\frac{\partial SS}{\partial A} = -2 \sum_{i=1}^{n} (Y_i - A - BZ_i) = 0$$
$$\frac{\partial SS}{\partial B} = -2 \sum_{i=1}^{n} (Y_i - A - BZ_i)\,Z_i = 0 \tag{A.1}$$

which after some manipulation reduce to

$$A\,n + B \sum Z_i = \sum Y_i$$
$$A \sum Z_i + B \sum Z_i^2 = \sum Z_i Y_i \tag{A.2}$$

a system of two equations with two unknowns. Equations A.2 may be
easily solved (e.g., by substitution) for the estimates $\hat{A}$ and $\hat{B}$:

$$\hat{A} = \bar{Y} - \hat{B}\bar{Z}$$
$$\hat{B} = \frac{\sum ZY - n\bar{Z}\bar{Y}}{\sum Z^2 - n\bar{Z}^2} \tag{A.3}$$

where $\bar{Z}$, $\bar{Y}$ are sample means of Z and Y, respectively, n the sample size,
and $\Sigma$ denotes the summation from 1 to n.
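The closed-form solution of the normal equations is easily verified numerically. The following sketch computes the slope and intercept from the sums of products and squares, using the sample means as in the solution of the normal equations; the function name `least_squares_line` and the small data set in the usage lines are illustrative only.

```python
def least_squares_line(z, y):
    """Return (A, B), the least squares intercept and slope of Y = A + B Z."""
    n = len(z)
    zbar = sum(z) / n
    ybar = sum(y) / n
    # slope: (sum ZY - n Zbar Ybar) / (sum Z^2 - n Zbar^2)
    b = (sum(zi * yi for zi, yi in zip(z, y)) - n * zbar * ybar) / (
        sum(zi * zi for zi in z) - n * zbar ** 2)
    # intercept: Ybar - B Zbar
    a = ybar - b * zbar
    return a, b


z = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a_hat, b_hat = least_squares_line(z, y)
```

Substituting the estimates back into the two normal equations should reproduce the sums of Y and of ZY, which provides a simple check on the arithmetic.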
An unbiased estimate of the variance about the fitted line is given
by
MIAMI WSMO AP METEOROLOGICAL STATION 193679, 3 HR(S) INTEREVENT
INTENSITY VS MONTH
A=AVERAGE, S=STD. DEV.
Figure 6.4 Seasonal Variability of the Average Intensity
TABLE F2.4
COEFFICIENT OF RELIABILITY
FOR A PROBABILITY OF 98.000000 % WITH NORMAL VARIATE Z = 2.054

                                            ALFA
VY      LOG    0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
0.100   0.819  0.818  0.817  0.816  0.815  0.813  0.812  0.811  0.809  0.808  0.806  0.805
0.200   0.679  0.676  0.673  0.669  0.664  0.660  0.655  0.651  0.646  0.640  0.635  0.629
0.300   0.571  0.565  0.558  0.551  0.543  0.535  0.527  0.517  0.508  0.497  0.486  0.474
0.400   0.488  0.476  0.467  0.457  0.446  0.434  0.421  0.408  0.393  0.377  0.359  0.338
0.500   0.424  0.406  0.394  0.381  0.368  0.353  0.337  0.319  0.299  0.277  0.252  0.223
0.600   0.373  0.349  0.336  0.321  0.305  0.288  0.269  0.248  0.225  0.198  0.166  0.128
0.700   0.334  0.303  0.289  0.273  0.256  0.237  0.216  0.193  0.166  0.136  0.099  0.052
0.800   0.302  0.266  0.251  0.234  0.216  0.197  0.175  0.151  0.123  0.090  0.051  0.000
0.900   0.277  0.236  0.221  0.204  0.186  0.166  0.144  0.120  0.092  0.059  0.019  0.000
1.000   0.256  0.211  0.196  0.179  0.162  0.142  0.121  0.098  0.071  0.040  0.004  0.000
1.100   0.239  0.191  0.176  0.160  0.143  0.125  0.105  0.083  0.059  0.032  0.001  0.000
1.200   0.224  0.174  0.160  0.145  0.129  0.112  0.095  0.076  0.055  0.033  0.008  0.000
1.300   0.213  0.161  0.148  0.134  0.119  0.104  0.089  0.074  0.058  0.043  0.029  0.020
1.400   0.202  0.150  0.138  0.125  0.113  0.100  0.088  0.077  0.069  0.064  0.067  0.084
1.500   0.194  0.141  0.130  0.119  0.109  0.100  0.092  0.087  0.088  0.097  0.122  0.169

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 4.4. Optimal Selection Statistic and Corresponding Transformation
(α) for Example 1.

Distribution    R2/α          STDE/α         WSS/α          MXLF/α
Normal          0.981/0.10    823.7/0.00     0.737/0.10*    9.15/0.10*
Gumbel          0.985/0.60    657.8/0.40     0.876/0.50     3.76/0.50
Rayleigh        0.931/0.30    753.1/0.00     1.139/0.30     3.91/0.30
Pearson         0.985/0.05    767.6/0.10*    0.711/0.10*    10.25/0.10*

*optimal transformation not reached, α_opt < 0.10
Table 4.33 Nonlinear Parameter Estimation, Transformed
Variables, Starting Values, AF=0.40, A=15, and B=100

ST. MARYS RIVER RUNOFF
GUMBEL STDE SAS NLIN PROCEDURE
(Y**AF-1)/AF = A*Z + B

NONLINEAR LEAST SQUARES SUMMARY STATISTICS     DEPENDENT VARIABLE Y

SOURCE               DF    SUM OF SQUARES    MEAN SQUARE
REGRESSION            3      351.57828773    117.19276253
RESIDUAL             57        0.00360836      0.00006330
UNCORRECTED TOTAL    60      351.58189608
(CORRECTED TOTAL)    59        0.00303319

                            ASYMPTOTIC     ASYMPTOTIC 95 % CONFIDENCE INTERVAL
PARAMETER    ESTIMATE       STD. ERROR     LOWER          UPPER
AF           0.40425480     0.11916349     0.64287548     0.16563411
A            0.01221985     0.00311707     0.00597803     0.01846168
B            2.41332039     0.53255656     1.34689466     3.47974612

ASYMPTOTIC CORRELATION MATRIX OF THE PARAMETERS
        AF          A           B
AF      1.000000    0.837643    0.998632
A       0.837643    1.000000    0.865054
B       0.998632    0.865054    1.000000
6.3.5. Summary and conclusions
This section illustrated the use of deterministic models for the
generation of hydrologic time series, after appropriate calibration over
several observed storms representative of the average hydrological
conditions of the basins under investigation. The case study was mainly
used for illustrative purposes rather than to answer specific questions
about the analyzed basins or rainfall stations. The use of the Storm
Water Management Model (SWMM) for continuous simulation, along with
STATS and SYNOP distribution-free statistical analysis, was treated in
some detail, given the importance of such statistics for stormwater
management and for hydrological planning and design.
The next section presents other illustrative examples derived from
this case study for application of probabilistic models.
6.4. Probabilistic Models
6.4.1. Introduction
A first application of probabilistic models was given by the
illustrative examples of Chapter 4. In this section, other examples
derived from the case study are analyzed by the GPDCP program in order
to illustrate its application to different types of hydrological series.
First, the annual series of total rainfall at the eight NWS stations
described in Section 6.2 are analyzed. Then a sample of event-based
statistics composed of four series of event characteristics (duration,
intensity, volume and delta) generated by SYNOP is analyzed by the
GPDCP. The last example uses a series of pollutant loads (COD) generated
by SWMM.
Table 4.11 Total Annual Runoff (Inches) and Statistics of
Original and Log-transformed Flow for Example 2.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF 1934-1981
SORTED RECORDED EVENTS
27.064  19.030  18.983  18.212  17.365  15.461  13.061  12.603
12.375  11.772  11.097  11.026  10.965  10.391  10.364   9.960
 9.593   9.503   9.437   9.401   9.258   8.925   8.142   7.563
 7.495   7.495   7.334   7.301   7.246   6.745   6.507   5.807
 5.602   5.439   5.105   4.962   4.795   4.790   4.592   3.973
 3.705   3.526   3.425   3.217   3.198   3.163   2.499   0.499

MEAN OF Y             8.749
VARIANCE OF Y        26.914
SKEW OF Y             1.205
MEAN OF LN(Y)         1.983
VARIANCE OF LN(Y)     0.457
SKEW OF LN(Y)         1.136
Table 6.19. Sensitivity Analysis of the Optimal MXLF Statistics to the Plotting
Position Definition (Eqn. 2.2.22) for the GND.

                             Plotting Position Constant a
Station            α       0.00    0.10    0.20    0.30    0.375   0.44
Miami AP           0.10   27.16   25.75   24.14   22.32   20.77   19.28
N. New River       0.65   29.43   30.03   30.49   30.75   30.71   30.46
West Palm Beach    0.90    9.40    8.81    8.07    7.13    6.26    5.36
Belle Glade        1.30    8.73    9.34    9.58    9.80    9.92   10.32
Ortona Lock 2      1.20    8.74    8.36    7.89    7.27    7.09    6.49
Port Mayaca        2.00    5.17    5.48    6.69    6.74    6.89    6.92
St. Lucie          0.45    2.45    2.17    1.48    0.74    0.14    0.39
Daytona Beach      0.90   23.31   23.24   23.00   22.53   21.97   21.30
rainfall series of the south Florida stations. The results of this
analysis are summarized in Table 6.19. Here again, as for the illustrative
examples of Chapter 4, the selection statistics show little
sensitivity to the plotting position definition compared to their high
sensitivity to the shape of the distribution, α. The Weibull plotting
position (a=0) (Equation 2.2.22) led to the best fit at four of the
stations, while for three four other stations the optimal plotting posi
tion constant was greater than 0.44. Thus, for these series the optimal
plotting position oscillated between 0.0 and 0.44, originally recommended
for the uniform and extremal distributions, respectively
(Cunnane, 1978), although the fitted distribution is normal and its
expected plotting position is 0.375 (Section 2.2.7). Therefore, it may
be concluded that the plotting position definition is not as important
as that of the shape of the distribution in optimizing the selection
statistics. Furthermore, when an improvement of the fit is sought, the
plotting position constant should be estimated like all the other
parameters since it is sample dependent.
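The role of the plotting position constant can be illustrated with the general formula p_i = (i - a)/(n + 1 - 2a), which gives the Weibull position for a = 0 and the values recommended for the normal and extremal distributions for a = 0.375 and a = 0.44. This general form is assumed here as a stand-in for the plotting position definition used in the study; the function name `plotting_positions` is hypothetical, and i is taken as the rank in ascending order.

```python
def plotting_positions(n, a=0.0):
    """Empirical probabilities (i - a) / (n + 1 - 2a) for ranks i = 1..n."""
    if not 0.0 <= a < 1.0:
        raise ValueError("plotting position constant a must lie in [0, 1)")
    return [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]


weibull = plotting_positions(9, a=0.0)      # uniform (Weibull) positions
blom = plotting_positions(9, a=0.375)       # recommended for the normal
gringorten = plotting_positions(9, a=0.44)  # recommended for the extremal
```

Increasing a pushes the probabilities assigned to the smallest and largest observations toward 0 and 1, so the choice mainly affects the tails of the fitted distribution.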
6.4.3. Event statistics
Empirical frequencies and return periods of event characteristics
such as duration, intensity, volume and delta may be generated by SYNOP
or STATS after separating the continuous hourly series into events.
Distribution-free statistical analysis is performed with these programs.
A more detailed analysis of these event characteristics can be
performed by the GPDCP program starting with the generated empirical
frequencies. In this section, SYNOP generated series from the Belle
Glade hourly rainfall data are used as an illustrative example for the
GPDCP application. Here again, only the GND x
TABLE G1.10
RELIABILITY AS A FUNCTION OF COR/VY FOR ALFA 0.800

                                          VY
COR      0.2000  0.4000  0.6000  0.8000  1.0000  1.2000  1.4000  1.6000  1.8000  2.0000
0.3000   0.9995  0.9925  0.9523  0.8292  0.6064  0.3504  0.1546  0.0514  0.0128  0.0024
0.4000   0.9956  0.9763  0.9205  0.8081  0.6406  0.4480  0.2725  0.1431  0.0647  0.0252
0.5000   0.9881  0.9598  0.9012  0.8048  0.6736  0.5229  0.3736  0.2447  0.1467  0.0803
0.6000   0.9797  0.9476  0.8923  0.8107  0.7054  0.5843  0.4588  0.3406  0.2386  0.1577
0.7000   0.9728  0.9404  0.8905  0.8217  0.7357  0.6367  0.5313  0.4266  0.3292  0.2440
0.8000   0.9679  0.9372  0.8933  0.8353  0.7642  0.6825  0.5938  0.5027  0.4137  0.3307
0.9000   0.9653  0.9372  0.8990  0.8501  0.7910  0.7230  0.6484  0.5699  0.4905  0.4132
1.0000   0.9645  0.9394  0.9065  0.8653  0.8159  0.7592  0.6963  0.6291  0.5594  0.4895
1.1000   0.9651  0.9430  0.9148  0.8802  0.8389  0.7915  0.7386  0.6813  0.6209  0.5588
1.2000   0.9667  0.9475  0.9236  0.8944  0.8599  0.8203  0.7758  0.7273  0.6753  0.6210
1.3000   0.9690  0.9525  0.9322  0.9078  0.8790  0.8459  0.8087  0.7676  0.7233  0.6763
1.4000   0.9717  0.9576  0.9406  0.9201  0.8962  0.8686  0.8375  0.8030  0.7655  0.7252
1.5000   0.9746  0.9627  0.9484  0.9314  0.9115  0.8886  0.8627  0.8339  0.8023  0.7680
1.6000   0.9776  0.9675  0.9556  0.9415  0.9251  0.9061  0.8847  0.8607  0.8342  0.8054
1.7000   0.9804  0.9721  0.9622  0.9505  0.9370  0.9214  0.9037  0.8838  0.8618  0.8377

COR : COEFFICIENT OF RELIABILITY
VY : COEFFICIENT OF VARIATION
2. Among the four selection statistics considered in this study, the
coefficient of determination (R2) performed the best, followed by the
maximum likelihood function (MXLF) and the weighted sum of squares (WSS),
with the standard error (STDE) ranking last. "Best" is used in the
sense that important features, such as the form (linearity) and shape
(skewness) of the selected model, comply with the expected ones (e.g.,
the optimal transformation for the generalized normal distribution
should lead to a linear plot of transformed variables versus reduced
variates, and to a skewness close to zero). None of the four statistics,
nor the shape parameter, α, was sensitive to the scale (magnitude) of
the data, except for the generalized Pearson distribution, for which the
shape of the distribution is modeled by two parameters.
3. Selection statistics and the transformation parameter showed no
sensitivity to the definition of the plotting position for the four
generalized families of distributions.
4. A third order reliability analysis that explicitly includes the
transformation parameter in the formulation of different reliability
indices was made possible by the derivation of new relations between
the moments of the original and transformed variables. The importance
of such relations in reliability assessment is illustrated by the
generalized tables and plots of reliability for different transformations
(Chapter 5). Important decision elements, such as the design
period (project life), were found to have less effect on the design
event magnitude than does the transformation parameter, α.
5. Continuous simulation, via an adequately calibrated deterministic
model, such as SWMM, reproduced most of the observed variability of the
modeled processes. The spatial variability of the rainfall inputs was
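The total variability of any individual prediction is given by
adding $\sigma_u^2$ to Equation A.8,

$$\mathrm{Var}(Y_i) = \left[1 + \frac{1}{n} + \frac{(Z_i - \bar{Z})^2}{\sum (Z - \bar{Z})^2}\right] \frac{n - 1}{n - 3}\,\hat{\sigma}_Y^2 \tag{A.10}$$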
distribution after an adequate reparameterization and transformation of
the data by the Box-Cox transformation. Based on this finding, four
generalized families of probability distributions are derived, with the
normal, extremal, Rayleigh, and Pearson as parent distributions. A
computer program is developed for estimating their parameters. Separation
of the estimation of the location and scale parameters from that of
the transformation parameter, adopted by this program, results in a very
efficient algorithm which always converges, even in cases where other
classical algorithms have failed.
The high performance of these distributions is illustrated by the
modeling of three annual series taken from the literature. Four selection
statistics are used for the evaluation of such performance. These
statistics exhibit a high sensitivity to the transformation parameters,
and no sensitivity to the definition of the plotting position and to the
change of scale on which the observed data are expressed. The optimal
selection statistics are found to be much less sensitive to the choice
of the parent distribution than to the transformation parameter, e.g.,
all four generalized distributions perform equally well once the data are
optimally transformed.
Relations between the moments of the original and transformed
variables are derived, based on a Taylor series expansion of the Box-Cox
transformation. These relations allow the extension of the second
moment reliability theory, based on the assumption of normality, to a
third order reliability theory, wherein the deviation from normality is
explicitly accounted for through the transformation parameter. Generalized
expressions for different measures and indices of reliability
are derived, and their sensitivity to the shape of the distribution is
Table 5.9. Power to Lognormal Percentile Ratio, $0 \le \alpha \le 1$, P = 99.999%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.999000 % WITH NORMAL VARIATE Z = 4.275

                                           ALFA
  VY     LOG   0.001   0.100   0.200   0.300   0.400   0.500   0.600   0.700   0.800   0.900   1.000
 0.1   1.000   0.999   0.990   0.981   0.972   0.962   0.951   0.940   0.928   0.916   0.902   0.888
 0.2   1.000   0.991   0.958   0.921   0.881   0.838   0.789   0.735   0.674   0.604   0.522   0.423
 0.3   1.000   0.974   0.901   0.820   0.731   0.631   0.518   0.388   0.236   0.058   0.000   0.000
 0.4   1.000   0.942   0.821   0.686   0.539   0.377   0.203   0.034   0.000   0.000   0.000   0.000
 0.5   1.000   0.899   0.724   0.535   0.338   0.144   0.004   0.000   0.000   0.000   0.000   0.000
 0.6   1.000   0.842   0.616   0.385   0.167   0.014   0.000   0.000   0.000   0.000   0.000   0.000
 0.7   1.000   0.778   0.507   0.252   0.055   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 0.8   1.000   0.708   0.405   0.148   0.007   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 0.9   1.000   0.636   0.313   0.076   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.0   1.000   0.566   0.235   0.032   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.1   1.000   0.499   0.172   0.011   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.2   1.000   0.437   0.123   0.002   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.3   1.000   0.381   0.086   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.4   1.000   0.331   0.059   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.5   1.000   0.287   0.039   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
which the plant should be designed (or operated) to meet a given
standard (D) p% of the time (i.e., with p% reliability),

$$\mathrm{COR} = \frac{C}{D} \qquad (1.2.11)$$

where C and D are the mean and pth percentile variate of the lognormal
distribution fitted to the pollutant of interest. For example, a COR of
0.2 relates to a percentile equal to 5 times the mean of the lognormal
distribution. Detailed expressions of this coefficient will be given in
Chapter 5.
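
As a hedged illustration of the ratio of mean to pth percentile described above, the following sketch evaluates the COR for a lognormal variate. It assumes the logarithm of the variate has mean `mu` and standard deviation `sigma`; these names, and the function itself, are illustrative and are not the expressions of Chapter 5.

```python
import math
from statistics import NormalDist


def lognormal_cor(mu, sigma, p):
    """Coefficient of reliability: lognormal mean divided by its p-th percentile.

    mu, sigma: mean and standard deviation of log(y) (assumed parameter names).
    p: reliability expressed as a fraction, e.g. 0.95.
    """
    z_p = NormalDist().inv_cdf(p)             # standard normal variate for p
    mean = math.exp(mu + 0.5 * sigma ** 2)    # lognormal mean, C
    percentile = math.exp(mu + z_p * sigma)   # lognormal p-th percentile, D
    return mean / percentile


cor = lognormal_cor(mu=0.0, sigma=1.0, p=0.95)
```

Under these assumptions the ratio reduces to exp(sigma^2/2 - z_p sigma), so it depends on the spread of the logarithms and on the required reliability but not on `mu`.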
1.2.3.3. Generalized reliability index (GRI). Ditlevsen (1981, p. 232)
defined a generalized reliability index of a system with respect to a
limit state $L_y$, dividing the transformed space into a safe set and a
failure set, by the formula

$$\mathrm{GRI}(y) = \Phi^{-1}\left( \int_{y_L} \phi(z_1)\,\phi(z_2) \cdots \phi(z_n)\, dz_1 \cdots dz_n \right) \qquad (1.2.12)$$

where $y_L$ is the confidence limit (safe set) in the normalized space of
input variables corresponding to the limit state $L_y$, $\Phi^{-1}$ is the
inverse of the cumulative distribution function of $\phi$, the
standardized normal density function, and $z_1, z_2, \ldots, z_n$ are the
standardized normal variates,

$$z_i = \frac{y_i - g(x_i,\theta)}{\sigma} \qquad (1.2.13)$$

For the special case where $L_y$ is limited to one variable, $y_L$ will
be defined by a single point, and the reliability index of Equation
1.2.12 reduces to

$$\mathrm{GRI}(y) = \Phi^{-1}\left( \int_{-\infty}^{z_L} \phi(z)\, dz \right) = \Phi^{-1}\left[ \Phi(z_L) \right] = z_L \qquad (1.2.14)$$
Table 6.1. Hourly Rainfall Station Identification

Number  NWS Index  Name                    County      Latitude N  Longitude W  Elevation (ft MSL)
1       085663     Miami AP                Dade        25°49'      80°17'        8
2       086323     N. New River Canal      Palm Beach  26°20'      80°32'       22
3       089525     W. Palm Beach           Palm Beach  26°41'      80°06'       15
4       080616     Belle Glade, HRN G.4    Palm Beach  26°42'      80°43'       31
5       086657     Ortona Lock 2           Glades      26°47'      81°18'       21
6       087293     Port Mayaca S.L. Canal  Martin      26°59'      80°37'       34
7       087859     St. Lucie N. Lock 1     Martin      27°05'      80°18'       15
8       082158     Daytona Beach AP        Volusia     29°11'      81°03'       31

NWS: National Weather Service
MSL: Mean Sea Level
annual series. Table 6.32 gives the AIC and BIC evaluated by Equations
6.5.6 and 6.5.10, respectively. For this case too, these statistics
rapidly decrease with an increase of $\alpha$; therefore, their use for
selecting the best transformation may be misleading.
6.5.6. Reliability of estimated parameters
A Monte Carlo simulation was performed in order to assess the
reliability of the annual ARMA(1,1) parameters when estimated from small
samples of observations and to assess the effect of the normalizing
transformation on this reliability. Using the final estimates of the
autoregressive and moving average parameters, 100 series of the same
length as the original series (24 years) are generated by the IMSL
subroutine FTGEN (IMSL, 1979). Then the autocovariance structure of
these series is analyzed by the IMSL subroutine FTAUTO, and an ARMA(1,1)
model is fitted to each of the 100 generated series. This procedure is
applied twice, first to the log transformed data ($\alpha = 0$), and
second to the optimally transformed data ($\alpha = 1.3$) for the annual
total rainfall series of the Belle Glade station. Reliability of the
estimated parameters is investigated through the statistics of the
generated series parameters. Table 6.33 summarizes these statistics,
giving the mean, standard deviation and skewness coefficient for the AR
and MA parameters, the residual variance and the average of the
generated data, for both transformations. Note the high variability
($\sigma_S$) of the generated parameters compared to their mean,
illustrating the unreliability of such estimates in representing the
population parameters. On the other hand, the advantage of the optimal
transformation over the log transformation is obvious from the
statistics of the average of the generated series (last column of
Table 6.33), where the mean is much closer to zero
Table 6.23. Skewness Coefficient for Transformed Annual Rainfall Series

                                         Transformation
Station          $\alpha_{opt}$   $\alpha = \alpha_{opt}$   $\alpha = 1.0$   $\alpha = 0.50$   $\alpha = 0.0$
Miami AP         0.10             0.004                     1.674            0.74              0.176
N. New River     0.65             0.213                     0.980            0.126             1.25
West Palm Beach  0.90             0.204                     0.001            1.050             2.160
Belle Glade      1.30             0.119                     0.520            1.680             2.970
Ortona Lock 2    1.20             0.121                     0.53             1.676             3.020
Port Mayaca      2.00             0.075                     2.150            3.360             4.480
St. Lucie        0.45             0.067                     +1.632           0.208             1.660
Daytona Beach    0.90             0.098                     0.298            0.700             1.698
analysis. Contrary to deterministic modeling, where the objective is to
dampen uncertainty, the objective of reliability analysis is to magnify
it in order to represent, predict and understand the uncertainty of
reality. Thus, the functional form of Equation 1.1.1 may be modified to
include an error term, $\varepsilon$, explicitly, with zero mean and
variance $\sigma^2$,

$$y = g(x,\theta) + \varepsilon \qquad (1.1.3)$$

It is the analysis of this error that will characterize the performance
of the model, measure the deviation of the predicted from observed
values and allow the construction of confidence limits about
$g(x,\theta)$.
Modeling uncertainty has become an important task of modern
engineering analysis (Ditlevsen, 1981). In fact, it was not until the
late 1960s that reliability began to interest a wide circle of
scientists and engineers (Lomnicki, 1973). The spread was so fast among
different engineering fields that it became hard to formulate a unique
and simple definition or measure of reliability. Some of the most common
definitions will be given in the next section, but first, applications
of reliability concepts to hydrologic modeling are presented through a
review of related studies.
Wood and Rodriguez-Iturbe (1975a, 1975b) and Kite (1975) were among
the pioneers in the analysis of reliability in flood flow frequency
modeling. They based their analysis on Monte Carlo simulation of
extreme events using different probability distribution models. Wood
and Rodriguez-Iturbe adopted a Bayesian framework to account for
parameter and model uncertainties, while Kite assumed that the error
term, $\varepsilon$, is normally distributed and derived approximate
confidence limits for the predicted values. Lwin and Maritz (1980) noted
that there is no
the same as the maximum likelihood estimates provided the residuals are
normally distributed. Sensitivity of the estimated parameters and of
the model performance to deviation from this assumption of normality
will be investigated using monthly and yearly rainfall series from south
Florida along with the Box-Cox transformation.
6.5.3.2. Goodness of fit evaluation. A first evaluation of the
goodness of fit may be given by examination of the behavior of the sum
of squares surfaces for different values of $\alpha$. This may be
accomplished by inspection of tables, of contour lines of equal values,
or of three-dimensional plots of the generated sum of squares.
Portmanteau goodness of fit test. The independence of the residuals
is tested by the Portmanteau goodness of fit test, a commonly applied
test for diagnostic checking of fitted ARMA models (Salas et al., 1980).
The distribution of the statistic

$$Q = n \sum_{k=1}^{L} \rho_k^2(\varepsilon) \qquad (6.5.3)$$

where n is the sample size, L is the maximum lag considered, and
$\rho_k$ is the autocorrelation function of the residuals, is
approximated by a chi-square distribution (Section C.2.7). The adequacy
of the model is checked by comparing Q with the theoretical chi-square
value for $(L - 2)$ degrees of freedom and a given level of reliability
(95% for this study).
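
The Portmanteau statistic described above can be sketched as follows. This is a minimal illustration, assuming the lag-k autocorrelation is estimated from mean-centered residuals with the usual sample estimator; the function names are hypothetical and the chi-square critical value must still be taken from a table or a statistics library.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)                              # lag-0 sum of squares
    ck = sum(dev[t] * dev[t + lag] for t in range(n - lag))   # lag-k cross products
    return ck / c0


def portmanteau_q(residuals, max_lag):
    """Q = n * sum of squared residual autocorrelations for lags 1..max_lag."""
    n = len(residuals)
    return n * sum(autocorrelation(residuals, k) ** 2
                   for k in range(1, max_lag + 1))


# A strongly alternating residual series gives a large Q at lag 1.
alternating = [1.0, -1.0] * 10
q_lag1 = portmanteau_q(alternating, 1)
```

A value of Q larger than the tabulated chi-square value for the chosen degrees of freedom suggests that the residuals are not independent.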
Akaike Information Criterion (AIC). Akaike (1974) proposed the
following information criterion for selecting among different ARMA(p,q)
models

$$\mathrm{AIC}(p,q) = n \log(\hat{\sigma}_\varepsilon^2) + 2(p+q) \qquad (6.5.4)$$

where $\hat{\sigma}_\varepsilon^2$ is the maximum likelihood estimate of
the residual variance $\sigma_\varepsilon^2$, and n is the sample size.
The variance is usually replaced
criteria of goodness of fit for the normal and extreme value Type I
distributions. Maalel and Triki (1979) used the same approach to
compare the performance of different probability distributions in the
modeling of rainfall intensity-duration-frequency (IDF) structure over
Tunisia. Tang (1980) applied linear regression for a Bayesian frequency
analysis of annual flood discharge. Like Maalel and Triki, Tang used
the Weibull plotting position to calculate the expected standardized
order statistic; independently they presented the same formulation to
account for discrepancies between observed data and model predictions,
and for the uncertainty of extrapolation from a limited sample of data.
An expression for the overall variability of any individual prediction
may be found in many statistical textbooks such as Raiffa and Schlaifer
(1961), and Gremy and Salman (1969). From Equation A.10 we have

$$\sigma_{y_i} = S_y \sqrt{1 + \frac{1}{n} + \frac{(z_i - \bar{z})^2}{\sum_{j=1}^{n} (z_j - \bar{z})^2}} \qquad (2.2.23)$$

where $S_y$ is the standard deviation denoting the average scatter of the
data points about the regression line

$$y = A + Bz \qquad (2.2.24)$$

where z is the standardized variate (reduced order statistic) with mean
$\bar{z}$. A and B are the location and scale parameters, respectively.
These parameters are obtained from the regression analysis of Equation
2.2.24. A description of these estimators is given in Appendix A.
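
The prediction standard deviation of Equation 2.2.23 can be sketched in a few lines. The snippet below is an illustration only: it assumes ordinary least squares for A and B and a residual standard deviation computed with n - 2 degrees of freedom, and the sample values are made up for the example.

```python
import math


def fit_line(z, y):
    """Least squares fit of y = A + B z; returns A, B, S_y, mean of z, and Szz."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    b = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / szz
    a = y_bar - b * z_bar
    # Scatter about the regression line; the n - 2 divisor is an assumption.
    s_y = math.sqrt(sum((yi - a - b * zi) ** 2 for zi, yi in zip(z, y)) / (n - 2))
    return a, b, s_y, z_bar, szz


def prediction_std(z_i, n, s_y, z_bar, szz):
    """Overall standard deviation of an individual prediction at z_i."""
    return s_y * math.sqrt(1.0 + 1.0 / n + (z_i - z_bar) ** 2 / szz)


z = [-1.5, -0.5, 0.5, 1.5]      # illustrative reduced variates
y = [1.0, 2.1, 2.9, 4.2]        # illustrative observations
a, b, s_y, z_bar, szz = fit_line(z, y)
sd_center = prediction_std(z_bar, len(z), s_y, z_bar, szz)
sd_tail = prediction_std(3.0, len(z), s_y, z_bar, szz)
```

The standard deviation is smallest at the mean of z and grows toward the tails, which is why extrapolated quantiles carry wider confidence limits.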
The ratio of the difference between expected and predicted
observation to the overall standard deviation (Equation 2.2.23) has a
Student t distribution (Draper and Smith, 1964, p. 24). The
$(1 - p/2)$ percent confidence interval of any prediction $y_i$ is
This vague notion of reliability is of limited use in technical
applications, as in physics and engineering, where concepts must have
numerical measures. A better schematization of the concept of
reliability is given by Figure 1.2, in which the engineering design
process is compared to the links of a chain, and reliability is referred
to as the strength of the weakest link. However, a suitable definition
of reliability is still required in order to quantify it and utilize one
or several measurable quantities.
1.2.2. Reliability definitions
The classical definition which is used in the engineering field is not
too far from the nontechnical concepts presented above. This definition
is the following:
Reliability is the probability of a device or system performing
its function adequately, for the period of time intended under the
operating conditions intended. (Hendrenyi, 1981)
Note that this definition is based on the mathematical concept of
probability, which is fundamentally associated with reliability. This
definition will be adapted for this study, wherein the performance of
the system or device will be described by the previously defined
functional form $g(x,\theta)$. Thus, the reliability will be the
probability that the model $g(x,\theta)$ will perform adequately in
predicting the behavior of the system over a given range of observations
and within specified confidence limits. Thus, the concept of reliability
remains unchanged and is the "probability of success" or "probability of
adequate performance," often referred to as probability of "no failure"
or no risk.

$$\text{Reliability} = 1 - \text{Risk} = 1 - \Pr(\text{failure}) \qquad (1.2.1)$$
with $-\infty < y < \infty$ and $\nu > 0$.
The moments of this distribution are: mean $\mu_y = 0$, variance
$\sigma_y^2 = \nu/(\nu - 2)$ for $\nu > 2$, and skewness $\gamma_y = 0$.
The Student t distribution is widely used for statistical inferences and
calculations of confidence intervals for sample means from normal parent
distributions with unknown variances (Haan, 1977, p. 121).
C.2.9. Extreme value distributions
Extremal distributions have been widely used in hydrology to study
the distribution of maximum and minimum hydrologic events. The
distribution of the m largest or smallest events, each selected from a
sample of n events, approaches an asymptotic form which depends on the
type of the parent distribution of the total number (mn) of events
(Chow, 1964). Three types of extreme value distributions have been
developed for three different types of parent distribution; these are
known as Type I, Type II and Type III extremal distributions. A
rigorous treatment of the theory of extremes is given in Gnedenko (1943).
Extreme value Type I distribution. The Type I extremal distribution
results from any parent distribution of the exponential type, such as
the normal, lognormal, exponential or gamma distribution. The Type I
distribution is also known as the Gumbel distribution, since Gumbel was
the first to apply it to flood frequency analysis and was the author of
most of the works on extremal distributions. A good summary of Gumbel's
work on the distributions of extremes may be found in Gumbel (1958).
The pdf of the extreme value Type I is

$$f(y) = \frac{1}{B} \exp\left\{ \pm\frac{y - A}{B} - \exp\left[ \pm\frac{y - A}{B} \right] \right\} \qquad \text{(C.2.11)}$$
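
A small numerical sketch of the Type I density may help fix the sign convention. It assumes the standard Gumbel form with a location and a scale parameter, the negative sign applying to maxima and the positive sign to minima; the parameter names `location` and `scale` are illustrative, not the notation of this appendix.

```python
import math


def gumbel_pdf(y, location, scale, maxima=True):
    """Extreme value Type I density for maxima (default) or minima."""
    sign = -1.0 if maxima else 1.0
    u = sign * (y - location) / scale
    return math.exp(u - math.exp(u)) / scale


# Crude rectangle-rule check that the density integrates to about one.
step = 0.01
area_max = sum(gumbel_pdf(-20.0 + k * step, 0.0, 1.0) for k in range(6001)) * step
area_min = sum(gumbel_pdf(-40.0 + k * step, 0.0, 1.0, maxima=False)
               for k in range(6001)) * step
```

The two cases are mirror images of each other about the location parameter, so the density at the location equals exp(-1)/scale in both.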
2.2.7. Order statistics based least squares
If the distribution $f(y)$ of the errors in Equation 2.2.1 is of
known form with location and scale parameters $g(x,\theta)$ and
$\sigma_\varepsilon$, respectively, its parameters may be estimated by
applying general least squares theory to the ordered sample of
observations. This method was first analyzed by Lloyd (1952), who showed
that the resulting estimates are unbiased, linear in the ordered
observations, and of minimum variance. He also developed explicit
formulae for the estimates and for their variances and covariances.
For a distribution which depends on location and scale parameters
only, the cumulative distribution function (CDF), $F(y)$, may be reduced
to a parameter-free (standardized) distribution $F_z(z)$ with pdf
$f_z(z)$, where z is equal to

$$z = \frac{y - g(x,\theta)}{\sigma_\varepsilon} \qquad (2.2.15)$$

If the variates (z's) are arranged in ascending order, such that
$z_{i-1} < z_i$, $i = 2, \ldots, n$, the smallest observation $z_1$ is
called the first order statistic whereas the last observation $z_n$ is
called the nth order statistic. The small probability $h_i(z)\,dz$ that
the ith order statistic, $z_i$, lies in the range
$z - dz/2 \le z_i \le z + dz/2$ will define the probability density
function, $h_i(z)$, of the order statistic. The probability that $i - 1$
observations are less than z, $n - i$ observations are greater than z,
and exactly one observation is in the range
$z - dz/2 \le z_i \le z + dz/2$ is given by the multinomial distribution,

$$h_i(z)\,dz = \frac{n!}{(i-1)!\,(n-i)!} \left[ F_z(z) \right]^{i-1} \left[ 1 - F_z(z) \right]^{n-i} f_z(z)\,dz \qquad (2.2.16)$$
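
The order statistic density of Equation 2.2.16 can be evaluated numerically. The sketch below assumes a standard normal parent distribution and a simple trapezoidal integration; both choices, and the integration limits, are assumptions made for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def order_stat_pdf(z, i, n):
    """Density h_i(z) of the i-th order statistic in a sample of size n."""
    coeff = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    cdf = STD_NORMAL.cdf(z)
    return coeff * cdf ** (i - 1) * (1.0 - cdf) ** (n - i) * STD_NORMAL.pdf(z)


def expected_order_stat(i, n, lo=-8.0, hi=8.0, steps=4000):
    """Mean of the i-th order statistic by trapezoidal integration of z * h_i(z)."""
    dz = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * dz
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * z * order_stat_pdf(z, i, n)
    return total * dz


smallest_of_two = expected_order_stat(1, 2)
middle_of_three = expected_order_stat(2, 3)
```

For a sample of two from a standard normal parent the mean of the smaller observation is -1/sqrt(pi), which provides a convenient check of the integration.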
The expected value of the ith smallest order statistic in a sample of
size n is
observations by sample is limited to 60, but it can easily be increased
by simply changing the dimension statement. For each sample the best
frequency model may be selected from up to 100 (4 families and 25
transformations) theoretical distributions.
The observed data may optionally be multiplied by a scale factor
(CP) and shifted by a constant (CT) using the transformation

$$y = y_0 \cdot CP + CT \qquad (3.3.4)$$

This option was introduced in the program to allow investigation of the
effect of change of units and standardization of the observations on the
selection procedure. When the input data are not ranked, subroutine
ORDER sorts them in ascending order. The empirical cumulative
frequencies are then calculated using plotting formulae listed in Table
3.5; this calculation is performed by subroutine CDF. From these
frequencies, the expected order statistics are calculated using the
inverse of the cumulative distribution function of each family. The
inverse of the cumulative normal distribution, $\Phi^{-1}$, is solved for
the reduced normal variate (expected order statistic) using subroutine
RNV. This subroutine is based on a rational approximation for
$\Phi^{-1}$ (Abramowitz and Stegun, 1964, Equation 26.2.23). The inverse
of the cumulative Pearson distribution is evaluated using IMSL
subroutine MDCHI (IMSL, 1979) and the relation between the gamma and
chi-square variates (Section C.2.7)

$$z_G = z_{CH}/2, \qquad k = \nu_{CH}/2 \qquad (3.3.5)$$

where $z_G$ and $z_{CH}$ are the reduced variates of the gamma and
chi-square distributions, respectively, $\nu_{CH}$ is the degree of
freedom of the chi-square distribution and k, the shape parameter of the
gamma distribution (estimated by Equation 3.2.17a). The expected order
statistics
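
The chain from ranked data to reduced normal variates can be sketched as follows. This is not the RNV subroutine itself: it is a minimal Python rendering of the rational approximation of Abramowitz and Stegun (1964, Equation 26.2.23), combined with the Weibull plotting position i/(n+1) as an assumed choice among the plotting formulae, and with hypothetical function names.

```python
import math

# Constants of Abramowitz and Stegun (1964), Equation 26.2.23.
C = (2.515517, 0.802853, 0.010328)
D = (1.432788, 0.189269, 0.001308)


def reduced_normal_variate(p):
    """Approximate z such that the standard normal CDF at z equals p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = p if p < 0.5 else 1.0 - p          # tail probability, 0 < q <= 0.5
    t = math.sqrt(-2.0 * math.log(q))
    x = t - (C[0] + C[1] * t + C[2] * t * t) / (
        1.0 + D[0] * t + D[1] * t * t + D[2] * t ** 3)
    return -x if p < 0.5 else x


def reduced_variates(n):
    """Reduced normal variates for ranks 1..n with the Weibull plotting position."""
    return [reduced_normal_variate(i / (n + 1)) for i in range(1, n + 1)]


z_975 = reduced_normal_variate(0.975)
z_sample = reduced_variates(9)
```

The approximation is accurate to a few parts in ten thousand, which is usually ample for plotting positions but coarser than a library inverse CDF.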
observed frequencies, the return period derived from the plotting
position formula (PERIOD) and the associated reduced variate Z. The
last two columns of this table give the Student's $t$ and $t_0$
statistics. Note that for most of the observations the statistic $t$ is
more than ten times smaller than the 95% Student's variate $t_0$,
indicating that all observations are within the specified confidence
interval. A more detailed analysis of the performance of the selected
model is given in Table 4.6. The statistics included in this table are
generated by the IMSL subroutine RLONE. The correlation coefficient
listed among the basic descriptive statistics is the square root of the
selection statistic $R^2$. The analysis of variance summarizes the
classical ANOVA type statistics. Note the very high value of the F
statistic, leading to a probability of exceedance not discernible from
zero. The Durbin-Watson statistic is much smaller than 2, indicating a
positive correlation between the residuals. Note the high reliability of
the estimated parameters implied by their standard errors and the
corresponding confidence limits. Also note the independence between the
parameter estimates suggested by the low value of their covariance.
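
The Durbin-Watson diagnostic mentioned above is simple to compute from the ordered residuals. The sketch below uses the standard definition of the statistic; the two residual series are invented solely to show values below and above 2.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Slowly drifting residuals: positively correlated, statistic well below 2.
drifting = [1.0, 0.8, 0.5, 0.1, -0.3, -0.6, -0.8, -1.0]
# Sign-alternating residuals: negatively correlated, statistic above 2.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
```

Residuals from a regression on ranked data often drift smoothly along the fitted line, which is one reason a value well below 2 is common in this setting.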
The same type of analysis may be repeated within the same run of
the GPDCP program using other selection statistics. For example, based
on the STDE statistic, the Gumbel family with a transformation of 0.45
was the best distribution. The performance of this model is summarized
in Tables 4.7 and 4.8. Note that even though this model has a smaller
STDE than the previously selected model (Table 4.5), its Student $t$
statistics are larger than $t_0$ for more than one observation at both
tails of the distribution. This is illustrated by Figures 4.1 and 4.2
where the corresponding observations fall outside the 95% confidence

$$\Delta = t(\nu, 1 - p/2)\, \sigma_{y_i} \qquad (2.2.25)$$

where $\nu$ is the number of degrees of freedom.
2.2.8. Conclusion
The regression based on order statistics is extended in this study
to the more general case where the location parameter A is allowed to
have some explanatory variables, Equation 2.2.18,

$$y_i = g(x_i,\theta) + \sigma_\varepsilon E(z_i)$$

Note that if $g(x,\theta)$ is linear in the parameters ($\theta$'s) this
equation is linear, and multiple linear regression will have optimal
properties. If $g(x,\theta)$ is not linear, the nonlinear parameters may
be estimated separately through a nonlinear procedure, and Equation
2.2.18 will be used for an overall evaluation of the model. This
approach will be analyzed in more detail in Chapter 6.
Given the importance of the probability distribution in the
classification and parameter estimation of hydrologic models, a review
of these distributions is given in the next section.
2.3. Probability Distributions
2.3.1. Introduction
Hydrologic modeling has always involved probabilistic analysis in
order to account for the stochastic nature of the processes involved.
Many discrete and continuous distributions have been found to be useful
for hydrologic frequency analysis (Chow, 1964). A review of these
distributions and their application in hydrology may be found in several
of the hydrologic textbooks such as Yevjevich (1972), Haan (1977), Kite
(1977), and Viessman et al. (1977).
Table 5.10. Power to Lognormal Percentile Ratio, $0 \le \alpha \le 1$, P = 99.9999%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.999900 % WITH NORMAL VARIATE Z = 4.772

                                           ALFA
  VY     LOG   0.001   0.100   0.200   0.300   0.400   0.500   0.600   0.700   0.800   0.900   1.000
 0.1   1.000   0.998   0.988   0.976   0.964   0.951   0.938   0.923   0.908   0.891   0.873   0.853
 0.2   1.000   0.990   0.943   0.901   0.850   0.793   0.730   0.658   0.576   0.480   0.364   0.216
 0.3   1.000   0.970   0.878   0.776   0.662   0.536   0.393   0.232   0.056   0.000   0.000   0.000
 0.4   1.000   0.935   0.782   0.614   0.434   0.245   0.065   0.000   0.000   0.000   0.000   0.000
 0.5   1.000   0.886   0.667   0.440   0.218   0.039   0.000   0.000   0.000   0.000   0.000   0.000
 0.6   1.000   0.823   0.545   0.279   0.068   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 0.7   1.000   0.751   0.426   0.152   0.006   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 0.8   1.000   0.674   0.318   0.068   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 0.9   1.000   0.596   0.227   0.023   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.0   1.000   0.519   0.156   0.005   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.1   1.000   0.449   0.102   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.2   1.000   0.384   0.065   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.3   1.000   0.326   0.039   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.4   1.000   0.276   0.023   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
 1.5   1.000   0.232   0.013   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
family has a k equal to infinity). The Pearson distribution gave a
slightly better fit based on all four selection statistics; this should
be expected since the Pearson distribution has four parameters. The
difference between optimal statistics of the four generalized
distributions is very small compared to the variation of these
statistics within each family (Table 4.3). This illustrates the
importance of the transformation of the variables in the optimization of
a given statistic: it is much more important to find the right space in
which to fit a given distribution than to select among classical
distributions within a fixed space (real, logarithmic, etc.). For this
example, the STDE, WSS and MXLF selection statistics gave the lower
limit of the range analyzed ($\alpha = 0.10$) as an optimal
transformation for the normal and Pearson distributions, indicating that
transformations with smaller $\alpha$'s may lead to better selection
statistics. Such values were not considered due to the big change in the
skewness coefficient mentioned earlier. The coefficient of correlation
$R^2$ seems to be the best selection statistic, among the four
considered in this study, since it did not select any negative value for
$\alpha$ as the optimal transformation. Such a negative value would
imply that the transformed variables are limited from above (Hernandez,
1978; Hinkley, 1975), which is usually an unrealistic characteristic for
the type of data analyzed in this study.
The GPDCP program automatically selects the overall best model
according to any of the four selection statistics. Based on $R^2$, the
best model is from the Pearson family with a power transformation ALFA
equal to 0.05. Table 4.5 lists the performance of this model, including
the residuals in the real and transformed spaces along with the
corresponding observed and predicted values. The same table lists the
2.2. Statistical Parameter Estimation
2.2.1. Introduction
In the previous section hydrologic models were classified into
deterministic, probabilistic and stochastic models, but estimation of
the parameters of these models is usually based on data samples of
limited sizes, and estimates of the same parameter from different
samples are not expected to be exactly equal. This is due to limited
accuracy of measurement techniques, the idealized conditions for which
the model was derived and the unpredicted or unaccounted for
disturbances affecting the real system. Such disturbances are as much a
part of physical reality as the underlying exact quantities which appear
in the model (Bard, 1974). Thus, disturbances or errors are explicitly
included in the mathematical representation of the modeled process.
Such representations bear the general form

$$y = g(x,\theta) + \varepsilon \qquad (2.2.1)$$

where $y = (y_1, \ldots, y_n)$ and $x = (x_1, \ldots, x_q)$ are the
dependent and independent variables, respectively, n is the number of
data points or experiments, $\theta = (\theta_1, \theta_2, \ldots)$
defines the parameters of the model, and
$\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)$ is the error or
deviation of the model prediction from the observation

$$\varepsilon = y - g(x,\theta) \qquad (2.2.1a)$$

Parameter estimation methods are based on the satisfaction of
Equation 2.2.1 by some measured values of the dependent and independent
variables (y, x). These measurements constitute a sample (realization)
of finite size n from the population of all possible values that may
occur in actual physical situations. Such samples are assumed to be
representative of the total population, leading to an estimate of the
Table 6.20. GPDCP Optimal Selection Statistic Summary, Belle Glade
Storm Events Characteristics.

Event Characteristic   $\alpha$   $R^2$     STDE      WSS     MXLF
Duration               0.15       0.99366     1.578   0.134   24.12
Intensity              0.05       0.98525     0.083   0.308   14.13
Volume                 0.20       0.99412     0.084   0.200   19.31
Delta                  0.05       0.97708   114.000   0.543    7.33

$R^2$, STDE, WSS and MXLF are defined in Section 3.3.2.
Bobee, B. 1975. The Log Pearson Type 3 Distribution and Its
    Application in Hydrology. Water Resour. Res., Vol. 11, No. 6,
    pp. 851-854.
Bobee, B. and Robitaille, R. 1977. The Use of the Pearson Type 3 and
    Log Pearson Type 3 Distribution Revisited. Water Resour. Res.,
    Vol. 13, No. 2, pp. 427-443.
Box, G.E.P. and D.R. Cox. 1964. An Analysis of Transformations. J.
    Royal Statist. Soc., Vol. B26, No. 2, pp. 211-252.
Box, G.E.P. and W.J. Hill. 1974. Correcting Inhomogeneity of Variance
    with Power Transformation Weighting. Technometrics, Vol. 16, No.
    3, pp. 385-389.
Box, G.E.P. and Jenkins, G.M. 1976. Time Series Analysis:
    Forecasting and Control. Revised Edition, Holden-Day, San
    Francisco, California.
Brakensiek, D.L. 1958. Fitting Generalized Lognormal Distribution to
    Hydrologic Data. Trans. Am. Geophys. Union, Vol. 39, pp. 469-473.
Carrol, J.R. 1980. A Robust Method for Testing Transformations to
    Achieve Approximate Normality. J. Royal Statist. B42, pp. 71-78.
Chander, S., A. Kumar and S.K. Coyal. 1979. Comments on "Stochastic
    Models for Monthly Rainfall Forecasting and Synthetic Generation."
    Journal of Applied Meteorology, Vol. 18, p. 1380.
Chow, V.T. 1953. Frequency Analysis of Hydrological Data with Special
    Application to Rainfall Intensities. Univ. Ill. Eng. Exp. Stn.,
    Bull. Ser. No. 414, pp. 27-29.
Chow, V.T. 1954. The Log-Probability Law and Its Engineering
    Applications. Proc. ASCE, Vol. 80, No. 536, pp. 1-25.
Chow, V.T., ed. 1964. Handbook of Applied Hydrology. McGraw-Hill Book
    Company, New York, NY.
Conover, W.J. and R.L. Iman. 1981. Rank Transformation as a Bridge
    Between Parametric and Nonparametric Statistics. The American
    Statistician, Vol. 35, No. 3, pp. 124-133.
Cooley, R.L. 1977. A Method of Estimating Parameters and Assessing
    Reliability for Models of Steady State Groundwater Flow: 1. Theory
    and Numerical Properties. Water Resour. Res., Vol. 13, No. 2, pp.
    318-324.
Cooley, R.L. 1979. A Method of Estimating Parameters and Assessing
    Reliability for Models of Steady Groundwater Flow: 2. Application
    of Statistical Analysis. Water Resour. Res., Vol. 15, No. 3, pp.
    603-617.
This distribution is widely used in reliability analysis when the
observed data fail to follow any of the more familiar probability models
described in Chapter 2. Most of these distributions are special cases
of the GGD, e.g., the exponential (b = k = 1), Weibull (k = 1) and gamma
(b = 1). The GGD is not the same as the familiar three parameter gamma
distribution, where the third parameter is a location parameter
(Equation C.2.4).
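
The special cases listed above can be checked numerically. The sketch below assumes the Stacy form of the generalized gamma density, with scale a, power b and shape k; this form is an assumption made for illustration, since the density itself is defined on an earlier page, and it covers positive b only.

```python
import math


def ggd_pdf(y, a, b, k):
    """Generalized gamma density (assumed Stacy form) for y > 0 and b > 0."""
    return (b * y ** (b * k - 1.0) * math.exp(-((y / a) ** b))
            / (a ** (b * k) * math.gamma(k)))


def exponential_pdf(y, a):
    return math.exp(-y / a) / a


def weibull_pdf(y, a, b):
    return (b / a) * (y / a) ** (b - 1.0) * math.exp(-((y / a) ** b))


def gamma_pdf(y, a, k):
    return y ** (k - 1.0) * math.exp(-y / a) / (a ** k * math.gamma(k))


y, a = 1.5, 2.0
as_exponential = ggd_pdf(y, a, b=1.0, k=1.0)
as_weibull = ggd_pdf(y, a, b=2.5, k=1.0)
as_gamma = ggd_pdf(y, a, b=1.0, k=3.0)
```

Each call reproduces the corresponding classical density, which is the sense in which those distributions are special cases of the GGD.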
Maximum likelihood estimates from samples of limited size were
derived by Parr and Webster (1965) for the parameters a, b and k along
with their asymptotic variances. Stacy and Mihram (1965) extended the
GGD to a more general distribution by allowing the parameter b to have
negative values. They also compared parameter estimators obtained by
the method of moments, maximum likelihood, and minimum variance. Harter
(1967) introduced a further generalization to the GGD by defining a
fourth parameter (location parameter). He developed an iterative
procedure to solve the likelihood equations for the parameter estimates.
These equations were obtained by equating to zero the first partial
derivatives of the logarithm of the likelihood function with respect to
each of the parameters (Equation 2.2.6). The asymptotic
variance-covariance matrix of the estimates was estimated by the inverse
of the information matrix, M(n), which is composed of the expected
values of the second partial derivatives of the likelihood function with
respect to the parameters.
loads, respectively. Also included in these figures are the statistical
parameters of the generated events. Distribution-free statistical
analysis similar to the one performed by SYNOP (Section 6.3.2) may be
performed by STATS for all the event characteristics listed as options
in Table 6.14. For each of these event characteristics a table of
empirical frequency and return period, such as Table 6.15, can be
generated (optionally) by STATS. Based on such a table, a parametric
frequency analysis following the procedure developed in Chapters 3 and 4
may be performed for a better description of the generated series. This
is also true for the series of the 50 highest hourly rainfall, runoff,
and pollutant concentrations. Examples of such analyses will be given
in the next section.
Generated hourly runoff series may be analyzed by the program
SYNOP, after their transformation to the NWS hourly rainfall tape
format. Goforth (1981) developed a small program to do such a
transformation. Tables 6.16 and 6.17 give monthly and yearly summary
statistics of the generated runoff from the same rainfall series, for
which statistics were summarized in Tables 6.6 and 6.7. A comparison of
these tables reveals that the generated runoff follows exactly the same
seasonal trend as the rainfall, and that the number of runoff events is
smaller than the number of rainfall events for the same month or year.
This should be expected since not all rainfall events generate runoff,
and consecutive rainfall events may be close enough to each other to
produce a common runoff event. This is illustrated by a longer average
duration of the runoff events (Tables 6.16 and 6.17) compared to the
average duration of corresponding rainfall events (Tables 6.6 and 6.7).
Figure 1.3 Reliability and Risk Definitions.
After Harr (1977)
and scale parameters, A and B, while the only nonlinear parameter, α,
is included in the dependent variable Y:

$$
Y = \begin{cases}
\dfrac{y^{\alpha} - 1}{\alpha} = A + BZ, & \alpha \neq 0 \\[6pt]
\log y = A + BZ, & \alpha = 0
\end{cases}
\tag{3.2.30}
$$

where y is the untransformed observation, Z is the expected order
statistic, A and B are the linear parameters and α is the shape parameter.
The four generalized distributions are summarized in Table 3.5.
The new parameterization has the following advantages over the classical
form of the GGD.
1. It covers a larger family of classical distributions with a simpler
form in the parameters.
2. The linear parameters A and B may be estimated independently of the
nonlinear parameter α by ordinary least squares methods. Lawton and
Sylvestre (1971) and Spitzer (1982) showed that such a separation
increased the rate of convergence considerably, while Maalel (1983a)
found that such a procedure led to less biased estimates than simple
nonlinear methods.
3. Confidence intervals and statistical inferences are more easily
established in the transformed space. Atkinson (1983) gave some
applications of this type of transformation with the normal distribution for
eliminating outlier effects and displaying influential observations.
Based on the work of McCullagh (1980) and his own experience, Atkinson
suggested the use of a linear model along with the GGD given by Prentice
(1974) for more appropriate statistical investigations. The procedure
outlined in this chapter follows exactly this suggestion, although it
has been developed independently.
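The separation described in item 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the GPDCP program: the reduced variates are approximated with the Blom plotting position (an assumption; the text does not fix the plotting position here), the shape parameter is searched over a user-supplied grid, the coefficient of determination is the only selection statistic, and the function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def box_cox(y, alpha):
    """Power transformation: (y**alpha - 1)/alpha, or log(y) when alpha is 0."""
    y = np.asarray(y, dtype=float)
    if abs(alpha) < 1e-12:
        return np.log(y)
    return (y ** alpha - 1.0) / alpha

def reduced_variates(n, c=0.375):
    """Approximate expected normal order statistics from the plotting
    position (i - c)/(n + 1 - 2c); c = 0.375 is the Blom choice (assumed)."""
    ranks = np.arange(1, n + 1)
    probs = (ranks - c) / (n + 1 - 2 * c)
    return np.array([NormalDist().inv_cdf(p) for p in probs])

def fit_separable(y, alphas):
    """For each trial alpha, estimate A and B by ordinary least squares of
    the transformed, sorted data on Z; keep the alpha with the largest R2."""
    y_sorted = np.sort(np.asarray(y, dtype=float))
    z = reduced_variates(y_sorted.size)
    best = None
    for alpha in alphas:
        Y = box_cox(y_sorted, alpha)
        B, A = np.polyfit(z, Y, 1)          # slope B, intercept A
        resid = Y - (A + B * z)
        r2 = 1.0 - resid.var() / Y.var()
        if best is None or r2 > best[3]:
            best = (alpha, A, B, r2)
    return best  # (alpha, A, B, R2)
```

Because A and B enter linearly, the nonlinear search is one-dimensional: only the shape parameter is varied, and each trial costs a single linear regression.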
As an alternative to the classical form of the GGD a new class of
families of probability distributions based on the Box-Cox
transformation will be presented in the following sections. For fixed values of
the GGD shape parameters, a family of distributions can be defined simply
by transformation of the dependent variable. This generalization of the
Box-Cox transformation to the GGD has apparently not been attempted in
any previous work. Also, the evaluation of the relative performances of
these generalized distributions and comparison to the performance of
special cases of the classical GGD based on real-world data are believed
to be among the unique aspects of this study.
3.2. New Parameterization of the GGD
3.2.1. Generalized normal distribution (GND)
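In the previous section it was seen that the lognormal distribution
is a special case of the GGD (q=0), Equation 3.1.7. If the logarithmic
transformation is replaced by the more versatile power transformation

$$
Y = \begin{cases}
\dfrac{y^{\alpha} - 1}{\alpha}, & \alpha \neq 0 \\[6pt]
\log y, & \alpha = 0
\end{cases}
\tag{3.2.1}
$$

a more general distribution of the variable y will be defined while the
distribution of Y is approximated by the normal pdf

$$
f(Y) = \frac{1}{\sigma_Y \sqrt{2\pi}} \exp\left[-\frac{(Y - \mu_Y)^2}{2\sigma_Y^2}\right]
\tag{3.2.2}
$$

where μ_Y and σ_Y are the mean and standard deviation of the transformed
variable Y. This distribution has the normal (α=1), lognormal (α=0) and
Nth-root normal (α=1/N) distributions as special cases. Its cumulative
distribution function (CDF) is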
Table 6.29. First and Final Estimates of ARMA(1,1) Parameters
for Square Root of Monthly Rainfall.

                     First estimates     Final estimates
Station              φ1       θ1         φ1       θ1
Miami AP             0.90     0.80       0.916    0.838
N. New River         0.80     0.70       0.851    0.802
West Palm Beach      0.80     0.80       0.790    0.748
Belle Glade          0.10     0.00       0.084    0.022
Ortona Lock 2        0.80     0.70       0.848    0.798
Port Mayaca          0.90     0.90       0.902    0.888
St. Lucie            0.80     0.80       0.805    0.793
Daytona Beach        0.40     0.30       0.400    0.284
Hirsch, R.M. 1979. Synthetic Hydrology and Water Supply Reliability.
Water Resour. Res., Vol. 15, No. 6, pp. 1603-1615.
Hirsch, R.M. 1982. A Comparison of Four Streamflow Record Extension
Techniques. Water Resour. Res., Vol. 18, No. 4, pp. 1081-1088.
Huber, W.C., J.P. Heaney, D.A. Aggidis, R.E. Dickinson, K.J. Smolenyak
and R.W. Wallace. 1981a. Urban Rainfall-Runoff-Quality Data Base.
U.S. Environmental Protection Agency Report, EPA-600/2-81-238
(NTIS PB82-221094).
Huber, W.C., J.P. Heaney, S.J. Nix, R.E. Dickinson and D.J. Polmann.
1981b. Storm Water Management Model User's Manual, Version III.
U.S. Environmental Protection Agency, Washington, D.C.
Huber, W.C., K. Maalel, E. Foufoula and J.P. Heaney. 1982. Long-Term
Rainfall/Runoff Relationships in the Kissimmee River Basin.
Unpublished report to South Florida Water Management District,
75 pp.
Hydroscience, Inc. 1976. Areawide Assessment Procedures Manual, Vol.
III. U.S. Environmental Protection Agency Report, EPA-600/9-76-014.
Hydroscience, Inc. 1979. A Statistical Method for Assessment of Urban
Stormwater, Loads-Impacts-Controls. U.S. Environmental Protection
Agency Report, EPA 440/3-79-093.
Iman, R.L. and W.J. Conover. 1979. The Use of the Rank Transform in
Regressions. Technometrics, Vol. 21, No. 3, pp. 499-509.
International Mathematical Statistical Library. 1979. IMSL Library
Reference Manual. Vol. 1, 2, and 3. IMSL Inc., Houston, Texas.
Iyengar, R.N. 1982. Stochastic Modeling of Monthly Rainfall. Journal
of Hydrology, Vol. 57, pp. 375-387.
Jewell, K.T., T.J. Nummo and D.D. Adrian. 1978. Methodology for
Calibrating Stormwater Models. Journal of the Environmental
Engineering Division, ASCE, Vol. 104, No. EE3, pp. 485-501.
Kashyap, R.L. 1982. Optimal Choice of AR and MA Parts in
Autoregressive Moving-Average Models. IEEE Trans. Pattern Anal. Mach.
Intell., Vol. PAMI-4, No. 2, pp. 99-104.
Kimball, B.F. 1946. Assignment of Frequencies to a Completely Ordered
Set of Sample Data. Trans. Am. Geophys. Union, Vol. 27, No. 6,
pp. 843-846.
Kimball, B.F. 1960. On the Choice of Plotting Positions on Probability
Paper. J. Am. Stat. Assoc., Vol. 55, No. 291, pp. 546-560.
TABLES G1
RELIABILITY TABLES
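$$
z_q = (U - \mu_z)\,k^{1/2}
\tag{3.1.7}
$$

has mean zero and variance approaching one as k approaches ∞ (q=0).
Equations 3.1.3a and 3.1.7 may be combined to give the following
expression for Y

$$
Y = m + \sigma z_q
\tag{3.1.8}
$$

where

$$
m = \log a + \frac{\mu_z}{b}
\tag{3.1.9}
$$

and

$$
\sigma = b^{-1} k^{-1/2}
\tag{3.1.10}
$$

With these parameters, the LGGD pdf given in Equation 3.1.4 is extended
to

$$
h(Y, m, \sigma, q) = \begin{cases}
\dfrac{|q|}{\sigma\,\Gamma(q^{-2})} \exp\left(z\,q^{-2} - e^{z}\right), & q \neq 0 \\[8pt]
\dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\tfrac{1}{2} z_0^{2}\right), & q = 0
\end{cases}
\tag{3.1.11}
$$

where

$$
z = q\,z_q + \mu_z, \qquad z_q = (Y - m)/\sigma, \qquad \mu_z = \psi(q^{-2})
$$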
Extreme value Type I distributions for maxima and minima are easily
shown to be special cases of the LGGD for q=1 and q=-1, respectively. The
normal distribution is now at the center of the parameter space, q=0.
The main feature of this reparameterization is the reduction of all
the distributions included in the LGGD to the simple form of a location
and scale type distribution. Such parameters are easily estimated
due to their linearity with the reduced variates. Prentice (1974)
noticed that the number of parameters to estimate may be reduced by one
Table 6.21. GPDCP Optimal Selection Statistics for COD Pollutant
Loads, Miami Multifamily Basin.

Distribution   α      R2        STDE    WSS     MXLF
Normal         0.30   0.94495   26.07   3.399   30.59
Gumbel         0.10   0.97871   12.33   1.123    2.90
Rayleigh       0.25   0.98270   14.45   0.892    2.86
Pearson        0.30   0.95488   23.90   2.890   26.55

R2, STDE, WSS and MXLF are defined in Section 3.3.2.
I.2. Bayesian Information Criterion (BIC)
Given a set of n observations y=(y_1, y_2, ..., y_n) to be assigned to
one of m hypotheses H_1, ..., H_m, where H_k may be a model or a type
of transformation, Rao et al. (1982) adopted an optimal decision rule
based on Bayesian theory by which a function, d_k, of the posterior
probability of the hypothesis given y is minimized,

$$
\min_k \; d_k = -\log \Pr(H_k \mid y)
\tag{I.4}
$$

From Bayes' theorem the probability Pr(H_k|y) can be expressed in terms of
the likelihood function of the observations L(y,θ_k) (where θ_k is the
vector of parameters corresponding to the kth hypothesis), the prior
probability density f(θ_k) and the prior probability of H_k, Pr(H_k),

$$
\Pr(H_k \mid y) = \frac{\Pr(y \mid H_k)\,\Pr(H_k)}{\sum_{j=1}^{m} \Pr(y \mid H_j)\,\Pr(H_j)}
\tag{I.5}
$$

where

$$
\Pr(y \mid H_k) = \int_{S_k} L(y, \theta_k)\, f(\theta_k)\, d\theta_k
\tag{I.6}
$$

S_k is the set of values that can be assumed by the parameters θ_k
(composed of the AR and MA parameters and the residual variance) and
L(y,θ_k) is the likelihood function of y associated with the hypothesis
H_k. Assuming the residuals e_i, i=1, 2, ..., n, are normally
distributed, with zero mean and variance σ², the likelihood function takes the
form

$$
L(y, \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{e_i^{2}}{2\sigma^{2}}\right)
\tag{I.7}
$$

For the ARMA models the residuals are not independent (e.g., Equation
6.5.2 for the ARMA(1,1) model).
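The normal likelihood and the Bayes decision rule described above can be sketched numerically. The example below is illustrative only: it assumes that the marginal probabilities Pr(y|H_k) have already been evaluated (the integral over the parameter set is not computed), it takes d_k as the negative logarithm of the posterior probability, and the function names are hypothetical.

```python
import numpy as np

def gaussian_log_likelihood(residuals, sigma):
    """Log of the product of normal densities with zero mean and
    standard deviation sigma, evaluated at the residuals."""
    e = np.asarray(residuals, dtype=float)
    n = e.size
    return -n * np.log(np.sqrt(2.0 * np.pi) * sigma) - np.sum(e ** 2) / (2.0 * sigma ** 2)

def posterior_probabilities(log_marginals, priors):
    """Bayes' theorem in log space: Pr(H_k|y) is proportional to
    Pr(y|H_k) * Pr(H_k); log_marginals holds log Pr(y|H_k)."""
    log_post = np.asarray(log_marginals, dtype=float) + np.log(priors)
    log_post = log_post - log_post.max()   # guard against underflow
    post = np.exp(log_post)
    return post / post.sum()

def select_hypothesis(log_marginals, priors):
    """Index of the hypothesis minimizing d_k = -log Pr(H_k|y)."""
    post = posterior_probabilities(log_marginals, priors)
    d = -np.log(post)
    return int(np.argmin(d)), post
```

Working with logarithms avoids the underflow that the raw product of n densities would cause for long series.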
Kashyap (1982) developed a theorem by which Equation I.5 was
simplified. The simpler expression resulted from a Taylor series
Table 4.29. Selection Statistics for Kissimmee River Runoff Data
Expressed in Inches, Decimeters and Meters.

Distribution   Scale    R2        STDE      WSS        MXLF
Normal         in       0.91390   1.22677    2.01476   16.81
               dm       0.91394   1.22681    2.01481   16.81
               m        0.91392   1.22690    2.01471   16.81
Gumbel         in       0.81255   3.45936    4.01365   33.35
               dm       0.81258   3.4931     4.01372   33.35
               m        0.81254   3.45924    4.01344   33.35
Rayleigh       in       0.84102   2.12291    3.13778   27.44
               dm       0.84101   2.12266    3.13771   27.44
               m        0.84093   2.12238    3.13734   27.44
Pearson        in (a)   0.81657   2.95510    3.80621   32.08
               dm (b)   0.52313   4.86004   11.95228   59.54
               m (c)    0.79440   3.22693    4.28136   34.90

in: inches
dm: decimeters
m: meters
(a) k = 8.602
(b) k = 0.822
(c) k = 6.243
APPENDIX H
MONTHLY AND ANNUAL RAINFALL
AT EIGHT NWS STATIONS IN SOUTH FLORIDA
(INCHES)
Figure 4.2 Linear Plot of STDE Selected Model, Example 1
(St. Marys River runoff data; horizontal axis: reduced variate)
The coefficient of variation, defined as the ratio of the standard
deviation to the mean, is equal to 1. The exponential distribution has
a constant positive skewness coefficient γ=2.
C.2.3. Gamma distribution
The gamma distribution was mentioned as the probability
distribution of the time to the kth event in a Poisson process. This time is
the sum of k exponentially distributed independent variables each with
the same parameter λ; the pdf of the gamma distribution takes the form

$$
f(y) = \frac{\lambda^{k}}{\Gamma(k)}\, y^{k-1} e^{-\lambda y}
\tag{C.2.3a}
$$

with y, λ, k > 0, where Γ(k) is the tabulated gamma function defined as

$$
\Gamma(k) = \int_{0}^{\infty} t^{k-1} e^{-t}\, dt
\tag{C.2.3b}
$$

The moments of the gamma distribution are

$$
\text{mean: } \mu = \frac{k}{\lambda}, \qquad
\text{variance: } \sigma^{2} = \frac{k}{\lambda^{2}}, \qquad
\text{skewness: } \gamma = \frac{2}{\sqrt{k}}
\tag{C.2.3c}
$$

The three-parameter gamma distribution is often known in hydrology
as the Pearson Type III distribution. The third parameter is a location
parameter introduced into Equation C.2.3a to yield the pdf of the
Pearson Type III distribution

$$
f(y) = \frac{\lambda^{k}}{\Gamma(k)}\, (y - a)^{k-1} e^{-\lambda (y - a)}
\tag{C.2.4}
$$

with (y - a), λ, k > 0. Only the mean of the distribution is affected by
the addition of the location parameter,
When measurements are made with physical instruments, as is often the
case in hydrology, the uncertainty in each measurement generally arises
from fluctuations of repeated readings of the instrument scale. Such
fluctuations may result from human imprecision in reading the
settings, imperfection in the equipment, or a combination of both. A
replication of some of these experiments will give direct estimates of
the elements of the covariance matrix. Draper and Smith (1966, p. 77)
considered the case where g(x,θ) is linear in the parameters. When
the covariance matrix is not diagonal (correlated observations) or is
diagonal with unequal variances (some observations are more reliable
than others) they showed that there is always a transformation of the
observations y to other variables Y satisfying at least approximately
the following properties: 1) Y is linear in the parameters, 2) the
covariance matrix is diagonal with constant elements, and 3) the errors
are approximately normally distributed with zero mean and variance
matrix Iσ², where I is the identity matrix. With these properties the
transformed variables satisfy all conditions leading to the minimum
variance unbiased estimate of θ by unweighted least squares and allow
statistical tests and confidence intervals to be constructed. These
results are then reexpressed in terms of the original variables; the
optimality of these estimates is guaranteed by the previously mentioned
property 4 of the MLEs. From Draper and Smith's approach we see how
the problem of finding the right weights may be reduced to that of
finding an appropriate transformation of the measured variables.
Ditlevsen (1981, p. 232) followed the same approach of normalizing the
observations y to N(0, Iσ²), and then defining a general reliability
index in the normalized space. This index is extended to a more general
normalizing transformation in Chapters 3 and 5.
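The equivalence between weighting and transforming can be checked with a small numerical sketch for the linear case. It assumes a known covariance matrix V of the observations (up to the constant σ²) and uses its Cholesky factor as the transformation; this is one common choice, not necessarily the one used by Draper and Smith, and the function names are hypothetical.

```python
import numpy as np

def whiten_and_fit(X, y, V):
    """Transform the observations so that their covariance becomes the
    identity (V = P P^T, Y = P^{-1} y), then apply unweighted least squares."""
    P = np.linalg.cholesky(V)        # lower-triangular factor of V
    X_t = np.linalg.solve(P, X)      # transformed explanatory variables
    y_t = np.linalg.solve(P, y)      # transformed observations
    theta, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
    return theta

def gls_direct(X, y, V):
    """Weighted (generalized) least squares in the original variables."""
    V_inv = np.linalg.inv(V)
    return np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
```

Both routes give the same parameter estimates, which is the sense in which the search for weights reduces to the search for a transformation.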
Table 4.12 Total Annual Rainfall (Inches) and Statistics
of Original and Log-transformed Data.
EXAMPLE 2  KISSIMMEE RIVER ANNUAL TOTAL RAINFALL, 1934-1981
SORTED RECORDED EVENTS
72.230  68.231  67.449  64.474  59.693  59.423  58.1S5  56.718
55.740  54.534  53.193  53.158  53.135  52.751  52.158  51.958
51.911  51.704  51.496  51.067  50.911  50.865  50.517  50.270
49.590  49.098  48.307  47.797  47.415  47.382  47.257  47.000
46.090  45.849  45.700  45.565  44.983  44.775  44.619  42.661
42.621  42.240  42.126  41.555  38.340  37.319  36.171  34.447

MEAN OF Y            50.014
VARIANCE OF Y        63.182
SKEW OF Y             0.626
MEAN OF LN(Y)         3.900
VARIANCE OF LN(Y)     0.024
SKEW OF LN(Y)         0.277
TABLE F2.7
COEFFICIENT OF RELIABILITY
FOR A PROBABILITY OF 99.990000 % WITH NORMAL VARIATE Z = 3.726

        ALFA
VY      LOG    0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
0.100   0.693  0.692  0.688  0.683  0.679  0.674  0.668  0.663  0.657  0.651  0.644  0.637
0.200   0.483  0.484  0.472  0.459  0.445  0.429  0.413  0.394  0.374  0.351  0.325  0.295
0.300   0.350  0.342  0.323  0.302  0.280  0.254  0.225  0.193  0.155  0.110  0.053  0.000
0.400   0.256  0.244  0.221  0.195  0.167  0.135  0.099  0.059  0.015  0.000  0.000  0.000
0.500   0.192  0.176  0.150  0.123  0.093  0.060  0.026  0.000  0.000  0.000  0.000  0.000
0.600   0.148  0.128  0.102  0.075  0.047  0.019  0.000  0.000  0.000  0.000  0.000  0.000
0.700   0.116  0.094  0.070  0.045  0.021  0.003  0.000  0.000  0.000  0.000  0.000  0.000
0.800   0.093  0.070  0.047  0.026  0.008  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.900   0.076  0.052  0.032  0.014  0.002  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.000   0.064  0.040  0.022  0.007  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.100   0.054  0.030  0.015  0.004  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.200   0.046  0.023  0.010  0.002  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.300   0.040  0.018  0.007  0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.400   0.035  0.014  0.005  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
1.500   0.032  0.011  0.003  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Figure 6.7 GPDCP Modeling of SWMM Generated
Hourly COD Loads (panel title: Quality Simulation)
RELIABILITY ANALYSIS APPLIED
TO MODELING OF
HYDROLOGIC PROCESSES
By
Khlifa Maalel
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1983
To my Mother, Father
and my lovely [illegible]
ACKNOWLEDGEMENTS
This research would not have been completed without the valuable
guidance of Dr. Wayne C. Huber, the continuous interest in this topic of
Dr. James P. Heaney and encouragement from Dr. Barry A. Benedict, Dr.
Clyde Kiker and Dr. Gabriel Bitton. To all of them goes my deep
appreciation.
Special thanks go to Janet Wilson for her excellent typing of this
dissertation and to Dibbie Dunnam for helping in typing the references.
Appreciation is expressed to Bob Dickinson for his friendship and
for helping in solving many of the computer puzzles. Also, I want to
thank all my friends in the Environmental Resources Management program
at the University of Florida for their friendship and sense of humor
which made me feel at home.
My thanks are also offered to the Tunisian government for the high
priority it is giving to scientific research and higher education,
without which I would not have been able to pursue my studies.
Part of this research was supported by the South Florida Water
Management District and by the U.S. Environmental Protection Agency.
Computations were made using the Northeast Regional Data Center computer
facilities at the University of Florida.
TABLE OF CONTENTS
Page
ACKNOWLEDGEMENTS iii
LIST OF TABLES viii
LIST OF FIGURES xi
NOTATION xii
ABSTRACT xv
CHAPTER 1: INTRODUCTION 1
1.1. Generalities 1
1.2. Reliability Definitions and Indices 8
1.2.1. The concept of reliability 8
1.2.2. Reliability definitions 9
1.2.3. Reliability indices 14
1.3. Overview of the Study 16
1.4. Overview of the Results 19
1.5. Research Uniqueness 22
CHAPTER 2: LITERATURE REVIEW 23
2.1. Classification of Hydrologic Models 23
2.2. Statistical Parameter Estimation 26
2.2.1. Introduction 26
2.2.2. Properties of estimators 27
2.2.3. Least squares method 28
2.2.4. Maximum likelihood 29
2.2.5. Weighted least squares 31
2.2.6. Method of moments 34
2.2.7. Order statistic based least squares 35
2.2.8. Conclusion 39
2.3. Probability Distributions 39
2.3.1. Introduction 39
2.3.2. Discrete distributions 40
2.3.3. Continuous distributions 40
2.3.4. Logarithmically derived distributions 40
2.3.5. Other derived distributions 44
2.4. Summary 47
CHAPTER 3: GENERALIZED PROBABILITY DISTRIBUTIONS 49
3.1. Generalized Gamma Distribution 49
3.1.1. Introduction 49
3.1.2. Historical background 49
3.1.3. Linear regression and confidence limits 56
3.1.4. Summary and implications 59
3.2. New Parameterization of the GGD 62
3.2.1. Generalized normal distribution (GND) 62
3.2.2. Generalized extreme value distribution (GED)....63
3.2.3. Generalized Rayleigh distribution (GRD) 64
3.2.4. Generalized Pearson distribution (GPD) 65
3.2.5. Other generalized distributions 68
3.2.6. Summary 71
3.3. Generalized Probability Distribution Computer
Program (GPDCP) 74
3.3.1. Solution algorithm 74
3.3.2. Program description 77
CHAPTER 4: ILLUSTRATIVE EXAMPLES 85
4.1. Introduction 85
4.2. Illustrative Example 1 85
4.3. Illustrative Example 2 101
4.4. Sensitivity Analysis 122
4.4.1. Sensitivity to the plotting position
definition 122
4.4.2. Sensitivity to the change of scale 129
4.4.3. Sensitivity to the objective function
formulation and to the estimation procedure....133
4.5. Summary and Conclusions 138
CHAPTER 5: RELIABILITY ANALYSIS ...141
5.1. Introduction 141
5.2. Second Moment Reliability Modeling 141
5.2.1. Normal case 141
5.2.2. Lognormal case 143
5.3. Third Order Reliability Modeling 144
5.3.1. Introduction 144
5.3.2. Approximate relations between moments of
transformed and untransformed variables 145
5.3.3. Third order reliability representation 148
5.4. Sensitivity Analysis 150
5.4.1. Sensitivity of the predicted pth percentile
to the shape of the distribution 150
5.4.2. Sensitivity of the level of reliability to
the shape of the distribution and to the
first order approximation 164
5.4.3. Sensitivity of design event magnitude to the
design period and to the shape of the
distribution 171
5.5. Summary and Conclusions 174
CHAPTER 6: CASE STUDY 175
6.1. Introduction 175
6.2. Case Study 176
6.2.1. Sites and data description 176
6.2.2. Temporal and spatial variability of the
recorded data 176
6.2.3. Continuous versus single event simulation 186
6.3. Deterministic Models 187
6.3.1. Introduction 187
6.3.2. Distribution free statistical analysis 188
6.3.3. Calibration and verification of deterministic
models 197
6.3.4. Water quantity and quality continuous
simulation 199
6.3.5. Summary and conclusions 212
6.4. Probabilistic Models 212
6.4.1. Introduction 212
6.4.2. Annual rainfall series 214
6.4.3. Event statistics ....215
6.4.4. Quality Statistics 217
6.4.5. Conclusion 220
6.5. Stochastic Models 220
6.5.1. Introduction.... 220
6.5.2. Model description 222
6.5.3. Parameter estimation and goodness of fit
evaluation 223
6.5.4. Annual ARMA(1,1) model 226
6.5.5. Monthly ARMA(1,1) model 235
6.5.6. Reliability of estimated parameters 243
6.6. Summary and Conclusions 246
CHAPTER 7: SUMMARY AND CONCLUSIONS 250
7.1. Summary 250
7.2. Conclusion 251
7.3. Suggestion for Further Research 253
APPENDICES
A. Linear Regression 255
B. Exact Relations Between Moments of the Normal and
Lognormal Distributions 258
C. Probability Distributions 261
D. GPDCP Source Program 274
E. Modified Kite Program 284
F. Coefficient of Reliability as a Function of ALFA and VY 293
G. Reliability as a Function of COR and V 318
H. Monthly and Annual Rainfall at Eight NWS Stations 333
I. Extension of the Akaike and Bayesian Information
Criteria to the Box-Cox Transformation 341
REFERENCES 345
BIOGRAPHICAL SKETCH 358
LIST OF TABLES
Table Page
3.1. Special Distributions of the GGD 58
3.2. Optimal Power Transformation and Information Number Ratio...61
3.3. Special Distributions of the New Parametrized Generalized
Distributions 66
3.4. Other Special Distributions of the GGD Not Included in
This Study 69
3.5. New Generalized Family of Distributions 73
4.1. Annual Maximum Daily Runoff and Statistics of Original
and Log Transformed Flows, Example 1 87
4.2. Standard Errors for Example 1 88
4.3. Models, Parameters, and Selection Statistics, Example 1,
Runoff Series 90
4.4. Optimal Selection Statistics and Corresponding
Transformation (a) for Example 1 92
4.5. Best Model Based on R2 Selection Statistics, Example 1,
Runoff 94
4.6. Detailed Statistics for the R2 Selected Model, Example 1....96
4.7. Best Model Based on STDE Selection Statistic, Example 1 97
4.8. Detailed Statistics for the STDE Selected Model,
Example 1 98
4.9. Detailed Statistics for the WSS Selected Model, Example 1..103
4.10. Best Model Based on the WSS Selection Statistics,
Example 1 104
4.11. Total Annual Runoff and Statistics of Original and Log
Transformed Flow for Example 2 105
4.12. Total Annual Rainfall and Statistics of Original and Log
Transformed Data 106
4.13. Standard Errors for Example 2 108
4.14. Models, Parameters, and Selection Statistics, Example 2,
Runoff 109
4.15. Optimal Selection Statistic and Corresponding
Transformation (a), Example 2, Annual Total Runoff 110
4.16. Best Model Based on R2 Selection Statistic, Example 2,
Runoff 112
4.17. Detailed Statistics for the R2 Selected Model, Example 2,
Runoff 113
4.18. Best Model Based on the STDE Selection Statistic,
Example 2, Runoff 114
4.19. Detailed Statistics for the STDE Selected Model,
Example 2, Runoff 115
4.20. Models, Parameters, and Selection Statistics, Example 2,
Rainfall 118
4.21. Optimal Selection Statistics and Corresponding (a),
Example 2, Annual Total Rainfall 119
4.22. Best Model Based on R2 Selection Statistics, Example 2,
Rainfall 120
4.23. Detailed Statistics for the R2 Selected Model, Example 2,
Rainfall 121
4.24. Best Model Based on the STDE Selection Statistic,
Example 2, Rainfall 123
4.25. Detailed Statistic for the STDE Selected Model,
Example 2, Rainfall 124
4.26. Sensitivity of Optimal Selection Statistic (R2) and
Corresponding Transformation (a) to Plotting Position
Definition 127
4.27. Sensitivity of Optimal Selection Statistic STDE and
Corresponding Transformation to Plotting Position
Definition 128
4.28. Selection Statistics for St. Marys Runoff Data Expressed in
cfs and Dimensionless Units 130
4.29. Selection Statistics for Kissimmee River Runoff Data
Expressed in Inches, Decimeters and Meters 132
4.30. Nonlinear Parameter Estimation, Original Variables,
Starting Values AF=0.40, A=13, and B=100 134
4.31. Nonlinear Parameter Estimation, Original Variables,
Starting Values AF=0.41, A=15, and B=117 135
4.32. Nonlinear Parameter Estimation, Transformed Variables,
Starting Values AF=0.41, A=13, and B=117 136
4.33. Nonlinear Parameter Estimation, Transformed Variables,
Starting Values AF=0.40, A=15, B=100 137
5.1. Power to Lognormal Percentile Ratio, 0 ≤ α ≤ 1, P=95% 154
5.2. Power to Lognormal Percentile Ratio, 0 ≤ α ≤ 1, P=99% 155
5.3. Power to Lognormal Percentile Ratio, 0 ≤ α ≤ 1, P=99.9% 156
5.4. Power to Lognormal Percentile Ratio, 0 ≤ α ≤ 1, P=99.999% 157
5.5. Power to Lognormal Percentile Ratio, 0 ≤ α ≤ 1,
P=99.9999% 158
5.6. Power to Lognormal Percentile Ratio, 0 ≥ α ≥ -1, P=95% 159
5.7. Power to Lognormal Percentile Ratio, 0 ≥ α ≥ -1, P=99% 160
5.8. Power to Lognormal Percentile Ratio, 0 ≥ α ≥ -1, P=99.9% 161
5.9. Power to Lognormal Percentile Ratio, 0 ≥ α ≥ -1,
P=99.999% 162
5.10. Power to Lognormal Percentile Ratio, 0 ≥ α ≥ -1,
P=99.9999% 163
5.11. Sensitivity of the Ratio, Design Event Magnitude/Mean
(1/COR), to the Design Period and to the Shape of the
Distribution 173
6.1. Characteristics of the Four Urban Sites 179
6.2. Hourly Rainfall Station Identification 180
6.3. Time Variability of Rainfall, Runoff, and Runoff
Coefficient 182
6.4. Spatial Variability of Rainfall within the Urban Basins....183
6.5. Temporal and Spatial Variability of the Seasonal Yearly
Total Rainfall Over the South of Florida 185
6.6. Monthly Storm Event Statistics at Miami Airport Station....189
6.7. Annual Storm Event Statistics at Miami Airport Station 190
6.8. Time and Space Variability of the Storm Event based
Statistics 196
6.9. Highway Basin Calibration by Adjusting the Width 200
6.10. Calibration results for the Storm Water Management Model
at the South Florida Four Basins 201
6.11. Yearly and Monthly Summaries from SWMM for the Multifamily
Urban Basin 202
6.12. Ranked Hourly Rainfall for the Multifamily Urban Basin 203
6.13. Ranked Runoff and Pollutant Concentration for the
Multifamily Basin 204
6.14. Options Selection for STATS Analysis 208
6.15. Empirical Frequency Generated by STATS for the Multifamily
Basin 209
6.16. Monthly Event Statistics of Generated Runoff at the
Multifamily Basin 210
6.17. Annual Event Statistics of Generated Runoff at the
Multifamily Basin 211
6.18. GPDCP Selected Transformation and Corresponding Optimal
Statistics for the 8 NWS Stations 213
6.19. Sensitivity Analysis of the Optimal MXLF Statistics to
the Plotting Position Definition 216
6.20. GPDCP Optimal Selection Statistic Summary, Belle Glade
Storm Event Characteristics 218
6.21. GPDCP Optimal Selection Statistics for COD Pollutant
Loads, Miami Multifamily Basin 219
6.22. ARMA(1,1) Parameter Estimates for Annual Rainfall Series...227
6.23. Skewness Coefficient for Transformed Annual Series 229
6.24. Residual Variance and Minimum Sum of Squares for Annual
Rainfall Series 230
6.25. Akaike Information Criteria for the ARMA(1,1) Models of
the Annual Rainfall Series 233
6.26. Bayesian Information Criteria for the ARMA(1,1) Models
of the Annual Rainfall Series 234
6.27. First Estimates of ARMA(1,1) Parameters for Three
Transformations of Monthly Rainfall Series 236
6.28. FTMXL Parameter Estimates of the ARMA(1,1) Monthly Models..237
6.29. First and Final Estimates of ARMA(1,1) Parameters for
Square Root of Monthly Rainfall 238
6.30. Residual Variance and Minimum Sum of Squares for the
ARMA(1,1) Rainfall Model 239
6.31. Sensitivity of the Monthly and Annual Skewness to the
Transformation of Monthly Rainfall Series 242
6.32. Akaike and Bayesian Information Criteria for the Monthly
ARMA(1,1) Model 244
6.33. Monte Carlo Simulation Summary Statistics 245
LIST OF FIGURES
Figure Page
1.1. Error Distribution About the Fitted Model 5
1.2. Schematic Representation of the Engineering Design
Process 10
1.3. Reliability and Risk Definitions 12
2.1. Hydrologic Model Classification 24
3.1. Generalized Gamma Distribution 53
3.2. Flow Chart for the Generalized Probability Distribution
Computer Program (GPDCP) 78
4.1. Linear Plot of R2 Selected Model, Example 1 ...99
4.2. Linear Plot of STDE Selected Model, Example 1 100
4.3. Linear Plot of MXLF Selected Model, Example 1 102
4.4. Linear Plot of R2 Selected Model, Example 2, Runoff 116
4.5. Linear Plot of STDE Selected Model, Example 2, Runoff 117
4.6. Linear Plot of R2 Selected Model, Example 2, Rainfall 125
4.7. Linear Plot of STDE Selected Model, Example 2, Rainfall....126
5.1. Reliability Versus Standardized Mean, a=1.0 166
5.2. Reliability Versus Standardized Mean, a=0.60 167
5.3. Reliability Versus Standardized Mean, a=0.0 168
5.4. Reliability Versus Standardized Mean, a=-0.6 169
5.5. Reliability Versus Standardized Mean, a=-1.0 170
6.1. Location of the Four Urban Basins 177
6.2. Location of the Eight NWS Hourly Rainfall Stations 178
6.3. Seasonal Variability of the Time Between Events 192
6.4. Seasonal Variability of the Average Intensity 194
6.5. Frequency Plot and Statistical Parameters of Generated
Flows at the Multifamily Basin 205
6.6. Frequency Plot and Statistical Parameters of Generated
COD Loads at the Multifamily Basin 206
6.7. GPDCP Modeling of SWMM Generated Hourly COD Loads 221
6.8. Sum of Squares Surface Contours for Belle Glade Annual
Rainfall Series 231
6.9. Sum of Squares Surface Contours for Belle Glade Monthly
Rainfall Series 245
6.10. Annual Series Autocorrelation Functions, a=0.0 247
6.11. Annual Series Autocorrelation Functions, a=1.3 248
NOTATION
A
a
ACF
AF
ALFA
AIC
ARMA
B
b
BIC
CDF
COD
COR
CP
CT
F
f
F
~S
8
GED
GGD
GPD
GPDCP
regression parameter
GGD parameter
Autocorrelation Function
transformation parameter
transformation parameter
Akaike Information Criteria
Autoregressive Moving Average Model
regression parameter
GGD parameter
Bayesian Information Criteria
Cumulative Distribution Function
Chemical Oxygen Demand
Coefficient of Reliability
Multiplicative constant
Additive constant
Cumulative Distribution Function
Probability density function
Standardized probability density
Fitted model
Generalized Extremal Distribution
Generalized Gamma Distribution
Generalized Pearson Distribution
Generalized Probability Distribution Computer Program
GRD
GRI
h
H
IMSL
k, K
L
MXLF
n
P
o
P
P
pdf
Pr
q
R
R2
SAS
SF
STD
STDE
V
w
wss
x
y
Generalized Rayleigh Distribution
Generalized Reliability Index
probability density function
Cumulative distribution function
International Mathematical Statistical Library
GGD shape parameters
likelihood function
loglikelihood function
Maximum likelihood selection statistics
number of observations
Probability value
number of parameters
confidence level
probability density function
Probability
GGD shape parameter
Reliability
Coefficient of determination, selection statistics
Statistical Analysis System
Safety Factor
Standard Deviation
Standard error, selection statistics
Coefficient of variation
weight in least squares regression
Weighted Sum of Squares
Explanatory variable
original variable
transformed variable
standardized variate
expected order statistics
transformation parameter
ARMA(1,1) autoregressive parameter
skewness coefficient
mean of original variables
mean of transformed variables
number of degrees of freedom
digamma function
trigamma function
product symbol
test level
summation symbol
model parameters
ARMA(1,1) moving average parameter
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
RELIABILITY ANALYSIS APPLIED
TO MODELING OF
HYDROLOGIC PROCESSES
By
Khlifa Maalel
December 1983
Chairman: Wayne C. Huber
Cochairman: James P. Heaney
Major Department: Environmental Engineering Sciences
Parameter estimation in the modeling of physical (natural) processes
in general and of hydrologic processes in particular is one of the
most challenging problems still puzzling scientists and engineers. This
research is an attempt to shed new light on such problems by suggesting
a uniform reliability-based approach to the estimation of the parameters
of deterministic, probabilistic and stochastic models and to the
analysis of model predictions.
New families of generalized probability distributions are defined.
These reduce to the simple form of location and scale type distributions
when the data are transformed by the Box-Cox transformation. The
transformation parameter is introduced as a shape parameter for the new
families of distributions. An algorithm for estimating the location and
scale parameters separately from the nonlinear parameter is developed
and translated into a FORTRAN computer program. This program selects
the best model from among four generalized probability distributions
from the normal, extremal, Rayleigh, and Pearson families. The
selection may be based on any of four criteria: the standard error, the
coefficient of determination of the simple linear regression, the
weighted sum of squares, and the maximum likelihood function. These
four statistics exhibit a high sensitivity to the shape parameter, a
small sensitivity to the plotting position definition and no sensitivity
to the change of scale on which the data are expressed. For all
selection statistics, the four analyzed families of distributions
performed equally well in modeling most of the analyzed series.
New relations between the moments of the original and Box-Cox
transformed variables are derived, based on a Taylor series expansion
of the transformation. These relations allow the extension of the
classical second moment reliability theory, based on the assumption of
normality, to a third order reliability theory, wherein deviation from
normality is explicitly accounted for through the transformation
parameter. Generalized expressions for different measures and indices of
reliability are derived, and their high sensitivity to the shape
parameter is illustrated by generated tables and plots of reliability for
several transformation parameters and coefficients of variation.
The potential applicability of the reliability-based approach to
the modeling of hydrologic processes is illustrated by the analysis of
several hydrologic series from southeast Florida.
CHAPTER 1
INTRODUCTION
1.1. Generalities
One of the main tasks of scientists in general and engineers in
particular is that of data reduction. An engineer who has compiled
tables of data wishes to reduce them to a more convenient and
comprehensive form (Bard, 1974). This may be accomplished, for example, by
plotting the data, reducing them to a graphical form, and by selecting a
class of functions and choosing from this class the one that best fits
the data. The functional form may be expressed mathematically as
$$y = g(x, \theta) \tag{1.1.1}$$
where $y = (y_1, y_2, \ldots, y_n)$ is the dependent variable,
$x = (x_1, x_2, \ldots, x_n)$ is the independent variable (explanatory
variable) and $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$ are the
parameters; and where n and p are the number of measurements and number
of parameters, respectively.
The parameters are usually chosen to give the "best fit" to the
data. Several fit criteria may be used, among which the most widely
used is the least squares criterion, in which the sum of squares (SS)
is minimized:
$$SS = \sum_{i=1}^{n} [y_i - g(x_i, \theta)]^2 \tag{1.1.2}$$
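As an illustration of the least squares criterion of Equation 1.1.2, the following minimal sketch evaluates SS for an arbitrary functional form and minimizes it in closed form for a straight line. The straight-line form, the function names and the four data points are assumptions made for the example only; they are not taken from this study.

```python
def sum_of_squares(g, theta, xs, ys):
    """SS = sum over i of [y_i - g(x_i, theta)]^2 (Equation 1.1.2)."""
    return sum((y - g(x, theta)) ** 2 for x, y in zip(xs, ys))


def line(x, theta):
    """Hypothetical functional form g(x, theta) = theta[0] + theta[1] * x."""
    return theta[0] + theta[1] * x


def fit_line(xs, ys):
    """Closed-form least squares estimates (intercept, slope) for `line`."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return (mean_y - slope * mean_x, slope)


# Illustrative (made-up) measurements.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

theta_hat = fit_line(xs, ys)
ss_min = sum_of_squares(line, theta_hat, xs, ys)
ss_other = sum_of_squares(line, (0.0, 2.0), xs, ys)  # any other parameter choice
```

Any other choice of parameters, such as the pair used for `ss_other`, gives a sum of squares at least as large as `ss_min`.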
If the selection of the functional form g is mainly based on
computational considerations, then the physical nature of the process
generating the data will be reproduced to a minor extent by g. In this
case, the parameter estimation procedure is called curve fitting, and
the fitted parameters will usually bear little physical significance, if
any at all. These curves are useful tools for summarizing data and
interpolating between tabulated values, but they have limited use outside
the range of measurements (e.g., prediction and extrapolation purposes).
On the other hand, if the selection of g is based on some
theoretical considerations of the physical laws governing the behavior of
the natural system generating the data, then the procedure is called model
fitting (calibration). The parameters of such models usually represent
quantities that have physical significance. The estimation of these
parameters is a much more complicated problem than simple curve fitting
(Bard, 1974). This is so because in addition to fitting the data well,
these parameters should preserve their physical meaning by coming fairly
close to the "true" values. Unfortunately, such true values are usually
unknown; otherwise there would be no reason for performing the
experiment. Thus, due to experimental uncertainties, even if the functional
form, g, is correct we can never expect to obtain the true parameters
and the consequent model predictions with absolute certainty. This is a
consequence of the deterministic nature of g, wherein each variable is
represented by a unique value, which is in contradiction to its
uncertain (random) nature. Such uncertainty was described by Bevington
(1969) in the following words:
It is a well-established rule of scientific investigation that the
first time an experiment is performed the results bear all too
little resemblance to the truth being sought. As the experiment is
repeated, with successive refinements of technique and method, the
results gradually and asymptotically approach what we may accept
with some confidence to be a reliable description of events.
This is even more true in hydrologic modeling, where in addition to
experimental uncertainty, the modeled processes are highly variable in
nature, and where engineering decisions are often based on reliability
analysis. Contrary to deterministic modeling, where the objective is to
dampen uncertainty, the objective of reliability analysis is to magnify
it in order to represent, predict and understand the uncertainty of
reality. Thus, the functional form of Equation 1.1.1 may be modified to
include an error term, $\epsilon$, explicitly, with zero mean and
variance $\sigma^2$,
$$y = g(x, \theta) + \epsilon \tag{1.1.3}$$
It is the analysis of this error that will characterize the performance
of the model, measure the deviation of the predicted from observed
values and allow the construction of confidence limits about
$g(x, \theta)$.
Modeling uncertainty has become an important task of modern
engineering analysis (Ditlevsen, 1981). In fact, it was not until the late
1960s that reliability began to interest a wide circle of scientists
and engineers (Lomnicki, 1973). The spread was so fast among different
engineering fields that it became hard to formulate a unique and simple
definition or measure of reliability. Some of the most common
definitions will be given in the next section, but first, applications of
reliability concepts to hydrologic modeling are presented through a
review of related studies.
Wood and Rodriguez-Iturbe (1975a, 1975b) and Kite (1975) were among
the pioneers in the analysis of reliability in flood flow frequency
modeling. They based their analysis on Monte Carlo simulation of
extreme events using different probability distribution models. Wood
and Rodriguez-Iturbe adopted a Bayesian framework to account for
parameter and model uncertainties, while Kite assumed that the error term,
$\epsilon$, is normally distributed and derived approximate confidence
limits for the predicted values. Lwin and Maritz (1980) noted that there
is no need for such a normality assumption if the dependent variable y
has a distribution from a location and scale family, i.e.,
$$\Pr(y \ge y_0 \mid x = x_0) = F\left(\frac{y_0 - g(x_0, \theta)}{\sigma}\right) \tag{1.1.4}$$
where F is the cumulative distribution function of the standardized
residuals. The error distribution and confidence limits about the
fitted model are illustrated by Figure 1.1.
Klemes and Bulu (1979) investigated by a split sample experiment
and Monte Carlo simulation the reliability of various statistical
parameters of three different operational stochastic models. Yen and
Tang (1976) and Tung and Mays (1980) included hydraulic and hydrologic
uncertainties in the design of hydraulic structures. Cooley (1977,
1979) developed a method for estimating the parameters and assessing the
reliability of steady state groundwater flow models. Moss (1979)
treated model and measurement errors as a third dimension in the
analysis of the time-space tradeoff in hydrologic data collection for
regional regression model calibrations. Lall and Beard (1981), through
a reliability analysis, developed a decision theoretic framework in which
the value of data was related to process risk and estimate uncertainty.
Hirsh (1979) suggested the use of synthetic hydrology for reliability
analysis of water supply systems. After comparison of the performance
of six generating methods of synthetic flow sequences he found that it
is operationally superior to perform the analysis on transformed
(normalized) values of historical streamflow data, rather than on the data
themselves. Also, he found that preservation of statistical moments
(mean, standard deviation, and lag correlation coefficients) may be a
misleading criterion for judging the ability of the generating model to
reproduce the performance of the water supply system. A better measure
of this performance was provided by checking the similarity between
historical and synthetic cumulative distribution functions of the
modeled flows or storages. This criterion was applied later for the
comparison of the performance of four streamflow record extension
techniques (Hirsh, 1982).

Figure 1.1 Error Distribution About the Fitted Model.
Modified from Kite (1977). [Axes: design event magnitude versus
return period in years.]
Stedinger and Taylor (1982a, 1982b) reconsidered the problem of
streamflow generation and compared the performance of five generation
models. They did this in two steps: a verification phase to confirm
that the model reproduces the statistics of the observed data, and a
validation phase to demonstrate that other important characteristics of
generated flow sequences are consistent with those of the historical
flow, e.g., statistics related to the frequency and severity of droughts.
They also showed that by incorporating parameter uncertainty into the
streamflow generating model, derived distributions of reservoir
reliability and performance will better reflect what is known (or is not)
about a basin's hydrology. Among their conclusions they noted that the
impact of parameter uncertainty is much greater than that of the
selection between a simple and a relatively complicated model. Thus,
streamflow model parameter uncertainty should be incorporated into reservoir
simulation studies to obtain realistic and honest estimates of system
reliability, given what is actually known about basin hydrology
(Stedinger and Taylor, 1982b).
Hashimoto et al. (1982) defined three criteria that can be used to
assist in the evaluation and selection of alternative design and
operating policies for a wide variety of water resource projects. These
criteria describe how likely a system is not to fail (reliability), how
quickly it recovers from failure (resilience) and how severe the
consequences of failure may be (vulnerability). Application of these
criteria was illustrated with a water supply reservoir example for which
it was found that there is a tradeoff among expected benefits,
reliability, resilience and vulnerability. For instance, high system
reliability was accompanied by high system vulnerability.
Niku et al. (1979, 1981) developed a reliability model for the
evaluation of activated sludge processes of plants under design or under
operation. The model was based on the assumption of lognormality of the
analyzed effluent and on the relation between the moments of the normal
and lognormal distributions. Reliability-based parameter estimation
procedures of rainfall-runoff models have been reported by Sorooshian
and Dracup (1980), Sorooshian (1981), Troutman (1982), Sorooshian et al.
(1983), Sorooshian and Gupta (1983) and Gupta and Sorooshian (1983) in
which special objective functions and solution techniques were selected
on the basis of the stochastic properties of the errors present in the
data and in the model. In all of these studies, the data were
transformed in order to comply with the assumptions implied by the
estimation methods.
Stedinger (1983a) proposed the use of the noncentral t distribution
to construct approximate confidence limits for specified design events
from the normal and lognormal distributions, and suggested an adjustment
for skewness for the Pearson type III distribution. Through a Monte
Carlo simulation, the new confidence limits were shown to achieve the
desired confidence level better than those based on asymptotic theory
(Kite, 1975) or on the U.S. Water Resources Council Guidelines (WRC,
1977, 1981). In another paper Stedinger (1983b) recommended the use of
probability weighted moments for estimating the parameters of the
dimensionless flood flow frequency distributions, in which the
dimensionless variables are obtained by dividing the data by the sample mean.
In a third paper, Stedinger (1983c) assumed that either the annual
floods, their logarithms, or some other transformation of the flows are
normally distributed, from which he derived an estimate of the pth
percentile of the flood flow distribution. This estimate was based on a
Bayesian approach, incorporating hydrologic and geomorphic information
(prior distribution of the parameters) in addition to the sample
information (measured data).
As mentioned earlier, and illustrated by this brief review of
literature, the term reliability has no strict (unique) definition since
it applies to the analysis of the residuals as well as to the
variability of the data themselves. Transition between these two
applications may be easily handled by changing the definition of the
functional form $g(x, \theta)$. Another type of reliability analysis is
related to parameter estimation in which the variance-covariance matrix
of the parameters needs to be estimated along with the parameters
themselves. The next section gives some definitions and measures of
reliability.
1.2. Reliability Definitions and Indices
1.2.1. The concept of reliability
An introduction to the concept of reliability is given by Hendrenyi
(1981):
Reliability is an old concept and a new discipline. For ages,
things and people have been called reliable if they had lived up to
certain expectations, and unreliable otherwise. A reliable person
would never (or hardly ever) fail to deliver what he had promised;
a reliable watch would be keeping the time day after day. The
types of expectations to judge reliability by have all been related
to the performance of some function or duty; the reliability of
a device has been considered high if it had repeatedly performed
its function with success and low if it had tended to fail in
repeated trials.
This vague notion of reliability is of limited use in technical
applications, as in physics and engineering, where concepts must have
numerical measures. A better schematization of the concept of
reliability is given by Figure 1.2, in which the engineering design process
is compared to the links of a chain, and reliability is referred to as
the strength of the weakest link. However, a suitable definition of
reliability is still required in order to quantify it and utilize one or
several measurable quantities.
1.2.2. Reliability definitions
The classical definition in the engineering field is not too far
from the nontechnical concepts presented above. This definition is the
following:
Reliability is the probability of a device or system performing
its function adequately, for the period of time intended under the
operating conditions intended. (Hendrenyi, 1981)
Note that this definition is based on the mathematical concept of
probability, which is fundamentally associated with reliability. This
definition will be adapted for this study, wherein the performance of
the system or device will be described by the previously defined
functional form $g(x, \theta)$. Thus, the reliability will be the
probability that the model $g(x, \theta)$ will perform adequately in
predicting the behavior of the system over a given range of observations
and within specified confidence limits. The concept of reliability
therefore remains unchanged and is the "probability of success" or
"probability of adequate performance," often referred to as the
probability of "no failure" or no risk,
$$\text{Reliability} = 1 - \text{Risk} = 1 - \Pr(\text{failure}) \tag{1.2.1}$$
Figure 1.2 Schematic Representation of the Engineering Design Process.
After Harr (1977)
If the performance of the system is described by $g(x, \theta)$ and an
interval $\Delta$ is defined about the mean prediction $g(x, \theta)$
within which the observation will be considered as a success, then the
reliability may be expressed as
$$\text{Reliability} = 1 - \Pr(|y - g(x, \theta)| > \Delta) \tag{1.2.2}$$
From Figure 1.1, the error distribution about a single prediction
$g(x_i, \theta)$ defines the reliability as the area under the curve
delimited by the upper and lower limits. The risk is defined by the area
under the tails of the distribution. Both risk and reliability may be
defined with only one limit; the second limit may be set to $+\infty$ or
$-\infty$. Figure 1.3a gives one such example in which y and
$g(x, \theta)$ are replaced by capacity C and demand D, with safety
margin SM = C - D (Harr, 1977). The risk of having the demand exceed the
capacity (negative SM) is given by the shaded area in Figure 1.3a. For
this case the lower limit was set equal to zero. Such limits are usually
defined by standards or design events which should not be exceeded for a
given level of reliability. If these limits are themselves random
variables, they will be represented by their probability distributions,
and the reliability will be evaluated as illustrated in Figure 1.3b,
where the probability associated with a level of demand (fixed limit),
say $D_1$, is
$$\Pr\left(|D_1 - D| < \frac{dD}{2}\right) = f_D(D_1)\,dD \tag{1.2.3}$$
and the probability that the capacity is less than $D_1$ is the shaded
area of the same figure,
$$\Pr(C < D_1) = \int_{-\infty}^{D_1} f_C(C)\,dC \tag{1.2.4}$$
The probability of failure is the product of these two probabilities, or
$$dP_f = f_D(D_1)\,dD \int_{-\infty}^{D_1} f_C(C)\,dC \tag{1.2.5}$$
Figure 1.3 Reliability and Risk Definitions. After Harr (1977)
Integration of this expression over all possible values of demand will
define the total risk represented by the shaded area of Figure 1.3c,
$$\text{Risk} = \int_{-\infty}^{\infty} \left[\int_{-\infty}^{D} f_C(C)\,dC\right] f_D(D)\,dD \tag{1.2.6}$$
which after evaluation of the inner integral and substitution into
Equation 1.2.1 gives
$$\text{Reliability} = 1 - \int_{-\infty}^{\infty} F_C(D)\,f_D(D)\,dD \tag{1.2.7}$$
In this equation, $F_C(D)$ is the cumulative distribution function of the
capacity. Tung and Mays (1980) developed a computer program to evaluate
this expression for different combinations of capacity and demand
distributions, including the most widely used distributions in the
hydrologic field. Such relations will be dealt with in more detail in
Chapter 5.
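The integral of Equation 1.2.7 can be evaluated numerically for any pair of capacity and demand distributions. The sketch below is not the program of Tung and Mays (1980); it is a minimal trapezoidal-rule illustration that assumes independent, normally distributed capacity and demand with made-up means and standard deviations, and it compares the numerical result with the known closed form for two independent normal variables.

```python
from statistics import NormalDist


def reliability(capacity, demand, lo, hi, steps=20000):
    """Reliability = 1 - integral of F_C(D) f_D(D) dD (Equation 1.2.7).

    The integral is approximated by the trapezoidal rule on [lo, hi],
    which should cover essentially all of the demand distribution.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        d = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * capacity.cdf(d) * demand.pdf(d)
    return 1.0 - total * h


# Assumed (illustrative) distributions, not data from this study.
capacity = NormalDist(mu=100.0, sigma=15.0)
demand = NormalDist(mu=60.0, sigma=10.0)

r_numeric = reliability(capacity, demand, lo=-40.0, hi=160.0)

# Closed form for independent normal C and D: Pr(C - D > 0).
r_exact = NormalDist().cdf((100.0 - 60.0) / (15.0 ** 2 + 10.0 ** 2) ** 0.5)
```

For other distribution pairs only the `cdf` and `pdf` callables need to change; the integration limits must then be chosen to suit the demand distribution.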
Chow (1964) defined two types of reliability, sampling reliability
and prediction reliability, for flood flow frequency analysis. The
first relates to the lack of fit of events from a given sample to the
theoretical functional form, say $g(x, \theta)$, representing the total
population of samples. Sampling reliability is usually expressed in terms
of confidence limits. Prediction reliability relates to the
probability of nonoccurrence of an event of given average recurrence
interval (T) during a given number of time intervals (n),
$$\Pr(y < y_L)_n = \left(1 - \frac{1}{T}\right)^n \tag{1.2.8}$$
where n is known as the design period or project life. It is a
function of the reliability from the above equation for a given design
event $y_L$ with average recurrence interval T,
$$n = \frac{\log \Pr(y < y_L)_n}{\log[(T - 1)/T]} \tag{1.2.9}$$
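Equations 1.2.8 and 1.2.9 are inverse relations, which the following short sketch makes explicit. The function names and the numerical values (a 100-year event and a 25-year project life) are illustrative assumptions.

```python
import math


def prediction_reliability(T, n):
    """Probability that the T-year event is not exceeded in n intervals (Eq. 1.2.8)."""
    return (1.0 - 1.0 / T) ** n


def design_period(T, reliability):
    """Design period n that yields the given prediction reliability (Eq. 1.2.9)."""
    return math.log(reliability) / math.log((T - 1.0) / T)


# Illustrative values: 100-year design event over a 25-year project life.
r = prediction_reliability(T=100, n=25)
n_back = design_period(T=100, reliability=r)
```

Here `r` is roughly 0.78, so a structure designed for the 100-year event has about a 22 percent chance of seeing that event exceeded at least once in 25 years; `design_period` recovers n = 25 from that reliability.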
This study will focus mainly on sampling reliability, although
prediction reliability models will be treated as special cases of the
functional form $g(x, \theta)$.
Nix (1982) lists three types of reliability applicable for the
measure of the performance of stormwater related systems. These are
annual reliability, defined as the probability of no failure within a
year; time reliability, defined as the portion of time with no failure
during the operation period; and volume reliability, defined as the
portion of the total demand satisfied during the operation period.
These three definitions were based on water supply reservoir theory;
they will not be considered further, although they may be easily
incorporated into the general frame of reliability analysis of this study.
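The three definitions listed by Nix (1982) can be estimated from a record of supply and demand per time step. The sketch below is one possible reading of those definitions, not code from Nix (1982): failure is assumed to mean supply below demand in a time step, annual reliability is estimated as the fraction of years without any failure, and the two-year record with four steps per year is made up.

```python
def time_reliability(supply, demand):
    """Portion of time steps with no failure (supply meets demand)."""
    ok = sum(1 for s, d in zip(supply, demand) if s >= d)
    return ok / len(demand)


def volume_reliability(supply, demand):
    """Portion of the total demand that is satisfied."""
    delivered = sum(min(s, d) for s, d in zip(supply, demand))
    return delivered / sum(demand)


def annual_reliability(supply, demand, steps_per_year):
    """Fraction of complete years in which no time step fails."""
    years = len(demand) // steps_per_year
    good = 0
    for k in range(years):
        block = range(k * steps_per_year, (k + 1) * steps_per_year)
        if all(supply[i] >= demand[i] for i in block):
            good += 1
    return good / years


# Made-up record: two years, four time steps each, one shortfall in year 1.
demand = [10.0] * 8
supply = [10.0, 12.0, 8.0, 10.0, 10.0, 10.0, 10.0, 11.0]

r_time = time_reliability(supply, demand)
r_volume = volume_reliability(supply, demand)
r_annual = annual_reliability(supply, demand, steps_per_year=4)
```

The single shortfall lowers the three measures by different amounts, which illustrates why the choice of reliability definition matters when systems are compared.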
1.2.3. Reliability indices
1.2.3.1. Safety factor (SF). This is a well known index of reliability
among engineers; it is usually defined as the ratio of the expected
capacity (C) to the expected demand (D) (Figure 1.3c) of a given
structure. Other values of capacity and demand may be substituted for
the expected values (e.g., C and D of the same figure). Tung and Mays
(1980) discussed six different safety factors, originally proposed by
Yen (1978) for hydrologic and hydraulic design. The general expression
of the safety factor is
$$SF = \frac{C}{D} \tag{1.2.10}$$
where C and D are some measure of the capacity and demand, respectively.
1.2.3.2. Coefficient of reliability (COR). Niku et al. (1979, 1981)
defined a similar index to measure the performance of a wastewater
treatment plant. This index was called coefficient of reliability and
defined as the ratio of the average effluent concentration (C), for
which the plant should be designed (or operated), to a given standard
(D) that is to be met p% of the time (i.e., with p% reliability),
$$\mathrm{COR} = \frac{C}{D} \tag{1.2.11}$$
where C and D are the mean and pth percentile variate of the lognormal
distribution fitted to the pollutant of interest. For example, a COR of
0.2 relates to a percentile equal to 5 times the mean of the lognormal
distribution. Detailed expressions of this coefficient will be given in
Chapter 5.
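For a lognormal variate the ratio of the mean to the pth percentile depends only on the coefficient of variation and on p. The sketch below uses the standard lognormal moment relations to compute that ratio; it is offered as an illustration of Equation 1.2.11 under the lognormal assumption, not as the expressions derived in Chapter 5, and the function name and numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist


def coefficient_of_reliability(cv, p):
    """COR = mean / pth percentile of a lognormal variate.

    cv is the coefficient of variation of the untransformed variable and
    p is the required reliability as a fraction (e.g. 0.95).
    """
    # Standard deviation of the logarithms, from sigma_ln^2 = ln(1 + cv^2).
    sigma_ln = math.sqrt(math.log(1.0 + cv ** 2))
    # Standard normal variate exceeded with probability 1 - p.
    z_p = NormalDist().inv_cdf(p)
    # mean = exp(mu + sigma_ln^2 / 2), percentile = exp(mu + z_p * sigma_ln);
    # the location mu cancels in the ratio.
    return math.exp(0.5 * sigma_ln ** 2 - z_p * sigma_ln)


# Illustrative inputs: effluent with CV = 0.6 and a 95% reliability target.
cor_95 = coefficient_of_reliability(cv=0.6, p=0.95)
design_mean = cor_95 * 30.0  # hypothetical standard of 30 concentration units
```

A more variable effluent or a stricter reliability target lowers the COR, so the plant must be designed for a mean concentration further below the standard.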
1.2.3.3. Generalized reliability index (GRI). Ditlevsen (1981, p. 232)
defined a generalized reliability index of a system with respect to a
limit state $L_y$ dividing the transformed space into a safe set and a
failure set by the formula
$$\mathrm{GRI}(L_y) = \Phi^{-1}\left[\int_{y_L} \phi(z_1)\,\phi(z_2) \cdots \phi(z_n)\,dz_1\,dz_2 \cdots dz_n\right] \tag{1.2.12}$$
where $y_L$ is the confidence limit (safe set) in the normalized space of
input variables corresponding to the limit state $L_y$, $\Phi^{-1}$ is
the inverse of the cumulative distribution function of $\phi$, the
standardized normal density function, and $z_1, z_2, \ldots, z_n$ are
the standardized normal variates,
$$z_i = \frac{y_i - g(x_i, \theta)}{\sigma} \tag{1.2.13}$$
For the special case where $L_y$ is limited to one variable, $y_L$ will
be defined by a single point, and the reliability index of Equation
1.2.12 reduces to
$$\mathrm{GRI}(L_y) = \Phi^{-1}\left[\int_{-\infty}^{z_L} \phi(z)\,dz\right] = \Phi^{-1}[\Phi(z_L)] = z_L \tag{1.2.14}$$
which is equal to the number of standard deviations separating the mean
from the safe-set limit $y_L$. The last two equations may be combined to
define confidence limits about $g(x, \theta)$ for a given reliability
level $z_L$,
$$y_L = g(x, \theta) + z_L \sigma \tag{1.2.15}$$
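In the one-variable case of Equations 1.2.14 and 1.2.15, the reliability index is simply the standard normal variate associated with the chosen reliability, and the confidence limit follows by scaling with the standard deviation of the error. A minimal sketch, with hypothetical function names and made-up numbers:

```python
from statistics import NormalDist


def reliability_index(reliability):
    """z_L = inverse standard normal CDF of the reliability (Eq. 1.2.14)."""
    return NormalDist().inv_cdf(reliability)


def confidence_limit(prediction, sigma, reliability):
    """y_L = g(x, theta) + z_L * sigma (Eq. 1.2.15)."""
    return prediction + reliability_index(reliability) * sigma


# Illustrative values: predicted value 100 with error standard deviation 10.
z_975 = reliability_index(0.975)
upper = confidence_limit(prediction=100.0, sigma=10.0, reliability=0.975)
lower = confidence_limit(prediction=100.0, sigma=10.0, reliability=0.025)
```

The pair (`lower`, `upper`) brackets the prediction with 95 percent probability, provided the standardized errors are in fact normal; that normality assumption is the one this study sets out to relax.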
1.2.3.4. Conclusion. As shown above, reliability indices are often
based on the first two moments of the observations, namely the mean and
the standard deviation. The use of these two moments is usually
associated with the assumption of normality, or lognormality, such as the
case of the GRI and COR indices, respectively. The sensitivity of these
indices to such assumptions will be investigated in Chapter 5, and
approximate new formulae will be derived to account explicitly for
deviation from the assumption of normality.
An overview of the objectives of this study and a description of
the remaining chapters are given in the next section.
1.3. Overview of the Study
The importance of probability distributions in model classification,
parameter estimation, and reliability analysis is illustrated in
Chapter 2, where a classification of hydrological models is presented
along with a review of different methods of statistical parameter
estimation. Then classical distribution models and their statistical
parameters are reviewed. Of special interest is a class of
derived distributions in which the variate of a parent distribution is
transformed using simple functions such as the logarithmic or power
transformation. Chapter 3 starts with a detailed analysis of the
generalized gamma distribution, a widely used distribution in
reliability studies. This distribution includes most of the distributions
reviewed in Chapter 2 as special cases, but it has the disadvantage of
being so poorly parameterized that no simple procedure for estimating its
parameters has been reported in the literature. The estimates are
usually highly correlated, and iterative algorithms are required for
their evaluation. The convergence of these algorithms is not always
guaranteed, and the final estimates are often dependent on their first
estimates. To circumvent such limitations, a new parameterization of
the generalized distribution is suggested. In its new form the
generalized distribution includes four families of distributions, each
expressible in terms of only two parameters (location and scale), once
the data are transformed to the right space through the Box-Cox
transformation. The four families of generalized distribution analyzed
are the normal, Gumbel, Rayleigh, and Pearson distributions. A computer
program is developed to fit up to 100 probability distribution models
simultaneously and select the best one according to any of four
selection statistics. These statistics are the correlation coefficient,
standard error of estimate, weighted sum of squares and maximum
likelihood function.
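The idea of separating the transformation parameter from the location and scale parameters can be sketched as follows. This is not the FORTRAN program developed in this study: it is a simplified Python illustration that assumes a normal parent distribution only, a coarse grid search over the transformation parameter, the Weibull plotting position i/(n + 1), and the coefficient of determination of the probability plot regression as the single selection statistic. All function names and the synthetic sample are hypothetical.

```python
import math
from statistics import NormalDist


def box_cox(x, lam):
    """Box-Cox transformation of a positive value x."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def fit_normal_family(data, lam):
    """Regress the sorted transformed data on standard normal quantiles.

    Returns (r2, location, scale): for a fixed transformation parameter
    the location and scale follow from a simple linear regression.
    """
    n = len(data)
    ys = sorted(box_cox(x, lam) for x in data)
    # Assumed plotting position: Weibull, i / (n + 1).
    zs = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    szz = sum((z - mean_z) ** 2 for z in zs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    szy = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
    scale = szy / szz
    location = mean_y - scale * mean_z
    return szy ** 2 / (szz * syy), location, scale


def select_lambda(data, grid):
    """Pick the transformation parameter with the highest r2 on the grid."""
    return max(grid, key=lambda lam: fit_normal_family(data, lam)[0])


# Synthetic sample whose logarithms fall exactly on a normal probability plot.
z_plot = [NormalDist().inv_cdf(i / 21) for i in range(1, 21)]
sample = [math.exp(1.0 + 0.5 * z) for z in z_plot]

best = select_lambda(sample, [-1.0, -0.5, 0.0, 0.5, 1.0])
r2, location, scale = fit_normal_family(sample, best)
```

Because only the transformation parameter is searched, each trial reduces to a linear regression, which is the property that makes the separation attractive. The other parent families and selection statistics described above would need their own quantile functions and criteria.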
The performance of this program and the new generalized
distributions is compared in Chapter 4 to that of the classical maximum
likelihood procedure used by Kite (1977) for estimating the parameters
of six of the most widely used frequency distribution models in
hydrology. The comparison is based on two examples. The first is a series
of maximum daily discharges from the St. Mary's River at Stillwater, Nova
Scotia, analyzed by Kite (1977). The second example has two series of
annual total rainfall and runoff from the Kissimmee River basin analyzed
by Huber et al. (1982). These two examples are also used for a
sensitivity analysis of the selection statistics to the definition of
the plotting position, change of scale and estimation procedure.
Chapter 5 illustrates the use of the new parameterized general
distributions for reliability analysis. In this chapter, approximate
formulae relating the moments of the transformed and untransformed
variables are derived and substituted into expressions for reliability
indices. Reliability tables are generated for different probability
levels and for different spaces. A sensitivity analysis of the
reliability, the predicted percentile and the design period to the shape of
the distribution is based on these new formulae.
Chapter 6 begins with a description of four urban basins used as a
case study for the deterministic modeling of water quantity and quality.
These basins were intensively investigated by the United States
Geological Survey (USGS) and most of the data collected during these
investigations are available at the University of Florida (Huber et al.,
1981a). Then a review of the calibration and verification of
deterministic rainfall-runoff models is presented. The basin data are
then used for the calibration of the EPA Storm Water Management Model
(SWMM). A new procedure for calibrating SWMM using more than one storm
event within the same run of the program is developed. The performance
of the model is evaluated through a reliability analysis of the
parameter estimates and a comparison of the variability in percent error
for the total volume and peak flow predicted by the model to the percent
error in the inputs (spatial and time variability of rainfall and basin
characteristics). Also, within this case study, hourly rainfall data
from eight stations located in southeast Florida are used for a detailed
analysis of the rainfall spatial and temporal variability over the
region for the period 1956 to 1979. These data are analyzed on a yearly,
monthly, and event-based discretization of the time scale. The
event-based analysis is performed using a synoptic statistical analysis
program (SYNOP) originally developed by Hydroscience (1976) and improved
by the University of Florida. This program performs statistical
analyses on the duration, total volume, average intensity and time between
events. A more detailed analysis of these characteristics is performed
by the generalized probability distribution computer program (GPDCP)
described in Chapter 3. The same analysis is performed on the hourly
series of flows and pollutant concentrations for the calibrated basins
using a continuous simulation with SWMM. Statistical analysis of these
generated series is performed directly by the statistical block of SWMM
(STATS), followed by SYNOP for comparison purposes. The last section of
Chapter 6 deals with stochastic modeling of the yearly and monthly
rainfall series from the eight stations introduced in section 6.2. The
reliability of the estimated parameters is analyzed through a Monte
Carlo simulation. The variability in parameter estimates due to
temporal and spatial variations and to transformation of variables is then
analyzed by comparison of estimates for the eight stations and for
different transformations.
Chapter 7 summarizes the main findings of the research, and gives
some recommendations for future investigations along with the
conclusions.
1.4. Overview of the Results
Most probability distribution models used in hydrology and in
reliability analysis are shown to reduce to a location and scale type
distribution after an adequate reparameterization and transformation of
the data by the Box-Cox transformation. Based on this finding, four
generalized families of probability distributions are derived, with the
normal, extremal, Rayleigh, and Pearson as parent distributions. A
computer program is developed for estimating their parameters.
Separation of the estimation of the location and scale parameters from
that of the transformation parameter, adopted by this program, results
in a very efficient algorithm which always converges in cases where
other classical algorithms have failed.
The high performance of these distributions is illustrated by the
modeling of three annual series taken from the literature. Four
selection statistics are used for the evaluation of such performance.
These statistics exhibit a high sensitivity to the transformation
parameters, and no sensitivity to the definition of the plotting
position and to the change of scale on which the observed data are
expressed. The optimal selection statistics are found to be much less
sensitive to the choice of the parent distribution than to the
transformation parameter; e.g., all four generalized distributions
perform equally well once the data are optimally transformed.
Relations between the moments of the original and transformed
variables are derived, based on a Taylor series expansion of the Box-Cox
transformation. These relations allow the extension of the second
moment reliability theory, based on the assumption of normality, to a
third order reliability theory, wherein the deviation from normality is
explicitly accounted for through the transformation parameter.
Generalized expressions for different measures and indices of reliability
are derived, and their sensitivity to the shape of the distribution is
illustrated by generated tables and plots of these indices for several
transformation parameters and coefficients of variation. An unexpected
increase of reliability with the coefficient of variation for fixed
extreme events is highlighted by the new generalized reliability indices
and definitions.
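To indicate the kind of relation involved, a generic low-order Taylor (delta method) approximation is sketched here. It is not the set of formulae derived in Chapter 5; the symbols $\lambda$ for the transformation parameter, $\mu_x$ and $\sigma_x$ for the mean and standard deviation of the original variable, and $V = \sigma_x / \mu_x$ for its coefficient of variation are assumed notation. Expanding the Box-Cox transformation about $\mu_x$ gives

```latex
y = \frac{x^{\lambda} - 1}{\lambda}, \qquad
\frac{dy}{dx} = x^{\lambda - 1}, \qquad
\frac{d^{2}y}{dx^{2}} = (\lambda - 1)\,x^{\lambda - 2}

E[y] \approx \frac{\mu_x^{\lambda} - 1}{\lambda}
      + \frac{\lambda - 1}{2}\,\mu_x^{\lambda}\,V^{2}, \qquad
\operatorname{Var}[y] \approx \mu_x^{2\lambda}\,V^{2}
```

For $\lambda = 1$ the transformation is a simple shift and both expressions are exact, which provides a quick check of the approximation.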
Deterministic models, such as SWMM, are proven to be valuable tools
for continuous simulation of hydrologic processes, once calibrated
against several storms representative of average hydrological conditions
of the basins under investigation. Simulated series provide valuable
information for storm water management and for hydrological planning and
design when separated into events and treated by statistical analysis
programs such as the Statistics Block of SWMM or the Synoptic Statistical
Analysis Program (SYNOP).
More insight into the modeled hydrologic processes may be gained
through probabilistic and stochastic modeling of these simulated
series. For example, the generalized probability distribution computer program
(GPDCP) is used for the modeling of the event characteristics of the
generated series (duration, volume, intensity and time between events)
and of generated series of extreme hourly pollutant loads. For all
these series the performance of the models fitted to the event data is
as good as for the total annual series.
Stochastic modeling of annual and monthly rainfall series is
illustrated by the analysis of data from eight National Weather Service
(NWS) stations in southeast Florida. Parameter estimation based on
optimal decision criteria, such as the Akaike Information Criterion
(AIC) and the Bayesian Information Criterion (BIC) extended to the Box-Cox
transformation, performed very poorly in selecting the normalizing
transformation compared to the reliability-based approach of the GPDCP
program. This conclusion is supported by a Monte Carlo simulation.
1.5. Research Uniqueness
The reliability-based approach for estimating the parameters of
deterministic, probabilistic and stochastic models, adopted in this
study, is unique to this research. Also, the reparameterization of the
generalized gamma distribution, the introduction of the Box-Cox
transformation parameter as a shape parameter for the normal, extremal,
Rayleigh, and Pearson families of distributions, and the development of
a solution algorithm for estimating the parameters of all four types of
distributions are among the unique aspects of this research. Another
contribution of this study is the derivation of new relations between
the moments of original and transformed data, and their use for the
definition of generalized reliability measures and indices for cases
where the data are normalized by the Box-Cox transformation. This
research also extends the Akaike Information Criterion (AIC) and the
Bayesian Information Criterion (BIC) to the Box-Cox transformation.
Although most of the research applications focus on the southeast
Florida case study, the main objective is not to answer specific questions
about the modeled basins or rainfall stations, but rather to
illustrate the theory, highlighting the potential of the
reliability-based approach in the modeling of hydrologic processes.
CHAPTER 2
LITERATURE REVIEW
2.1. Classification of Hydrologic Models
Hydrological models are mathematical formulations to simulate
natural phenomena. Their classification may be made according to the
simulated processes. A hydrologic process was defined by Chow (1964) as
a hydrologic phenomenon subject to continuous variation, especially with
respect to time. Mathematical models may be classified into three main
groups: deterministic, probabilistic and stochastic. A deterministic
model is one in which the simulated process is assumed to be
chance-free. Models that include a random component for the simulation of
chance-dependent processes are called probabilistic or stochastic
according to their independence or dependence on time. For stochastic
models the sequence of occurrences is called a time series. Depending
on the correlation structure of the simulated time series elements,
these models are called pure random or non-pure random for uncorrelated
and correlated series, respectively.
The probability distribution of the simulated time series may be
time independent or time dependent, requiring a stationary or a
nonstationary stochastic model, respectively. Classification of hydrologic
models and their interrelation are summarized in Figure 2.1, based on
Chow (1964, p. 810).
Hydrologic processes are generally treated as stationary to avoid
the mathematical complexity of the stochastic models. But it is well
known to hydrologists that these processes are stochastic non-pure
random, and that their exact mathematical representation is practically
impossible. Thus, all other models shown in Table 2.1 are simple
representations that only approximate hydrologic processes. These
approximations are based on knowledge of the modeled processes gained by
experience.
Figure 2.1 Hydrologic Model Classification
Modified from Chow (1964, p. 810)
For each model the necessary assumptions should be verified. If
they are not satisfied, either another model may be tested or the original
time series may be treated to comply with the previous assumptions. Among
the possible treatments of these series, removal of periodicity, trends
and correlation is widely documented in the literature (Yevjevich,
1972b; Lawrence and Kottegoda, 1977; Delleur and Kavvas, 1978).
From the previous classification of hydrologic models emerges the
importance of the frequency distribution. The classification of each
process was based on some assumptions about the probability distribution
of the modeled series. Such assumptions are even more important for
estimating the parameters of the models, making inferences about the
goodness of fit, and evaluating the reliability of the estimated
parameters. As will be shown in the next section, some assumptions about
the error distribution have to be made independently of the model and the
method used for parameter estimation. Such assumptions are made
implicitly or explicitly depending on the method of estimation. Model
predictions and the associated confidence limits are shown to be very
sensitive to the assumed distribution (Maalel, 1983a). A review of the
methods of parameter estimation is given in the next section with
emphasis on the importance of the error probability distributions.
2.2. Statistical Parameter Estimation
2.2.1. Introduction
In the previous section hydrologic models were classified into
deterministic, probabilistic and stochastic models. Estimation of
the parameters of these models is usually based on data samples of
limited size, and estimates of the same parameter from different
samples are not expected to be exactly equal. This is due to the limited
accuracy of measurement techniques, the idealized conditions for which
the model was derived and the unpredicted or unaccounted-for
disturbances affecting the real system. Such disturbances are as much a part
of physical reality as the underlying exact quantities which appear in
the model (Bard, 1974). Thus, disturbances or errors are explicitly
included in the mathematical representation of the modeled process.
Such representations bear the general form
$$y = g(x,\theta) + \varepsilon \tag{2.2.1}$$
where $y = (y_1, \ldots, y_n)$ and $x = (x_1, \ldots, x_q)$ are the dependent and
independent variables, respectively, $n$ is the number of data points or
experiments, $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$ defines the parameters of the model,
and $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)$ is the error or deviation of the model
prediction from the observation,
$$\varepsilon = y - g(x,\theta) \tag{2.2.1a}$$
Parameter estimation methods are based on the satisfaction of
Equation 2.2.1 by some measured values of the dependent and independent
variables $(y, x)$. These measurements constitute a sample (realization)
of finite size $n$ from the population of all possible values that may
occur in actual physical situations. Such samples are assumed to be
representative of the total population, leading to an estimate of the
true value of the parameters. The independent variable $x$ is often
referred to as an explanatory variable since it describes the variation
in the dependent variable through the model $g(x,\theta)$. The unexplained
variability of $y$ will be given by the variance of the residuals, $\sigma_\varepsilon^2$,
denoting the average scatter of the observed data about the fitted
curve. The smaller this unexplained variance is, the better is the fit
of the model to the observed data. As in Equation 2.2.1, in the
presence of explanatory variables the mean of the residual is often assumed
equal to zero, reducing the total number of estimated parameters to $p + 1$.
If there are no explanatory variables, as is usually the case in frequency
analysis, $g(x,\theta)$ is replaced by the mean of the observations, $\mu_y$, and
all the variability in the data will be included in the variance of the
residuals,
$$\varepsilon_i = y_i - \mu_y \tag{2.2.1b}$$
$$\sigma_\varepsilon^2 = \sigma_y^2 \tag{2.2.1c}$$
In this case, only two parameters need to be estimated from the data,
namely the mean $\mu_y$ and the variance $\sigma_y^2$.
2.2.2. Properties of estimators
An estimator $\hat{\theta}$ of the parameter $\theta$ can be characterized by the
following criteria, taken individually or collectively (Neter and
Wasserman, 1974).
1. Unbiasedness
An estimator $\hat{\theta}$ is an unbiased estimator of $\theta$ if
$$E(\hat{\theta} - \theta) = 0 \tag{2.2.2}$$
where $E$ stands for expected value.
2. Consistency
An estimator $\hat{\theta}$ is a consistent estimator of $\theta$ if the
probability that $\hat{\theta}$ differs from $\theta$ by more than an arbitrary constant $\epsilon$
approaches zero as the sample size ($n$) approaches infinity,
$$\lim_{n \to \infty} P(|\hat{\theta} - \theta| \ge \epsilon) = 0 \quad \text{for any } \epsilon > 0. \tag{2.2.3}$$
3. Sufficiency
An estimator $\hat{\theta}$ is a sufficient estimator of $\theta$ if the
conditional joint probability density function of the sample
observations, given $\hat{\theta}$, does not depend on the parameter $\theta$.
4. Efficiency
An estimator $\hat{\theta}$ is an efficient (minimum variance) estimator of
$\theta$ if the variance of any other estimator $\hat{\theta}'$ is no smaller than the
variance of $\hat{\theta}$,
$$\sigma^2(\hat{\theta}) \le \sigma^2(\hat{\theta}') \quad \text{for all } \hat{\theta}'.$$
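The unbiasedness criterion of Equation 2.2.2 can be illustrated numerically. The following is a minimal sketch, not part of the original study: it assumes synthetic standard normal samples and compares the variance estimator with divisor n against the one with divisor n - 1. The function names, sample size, seed and replicate count are all illustrative choices.

```python
import random


def variance_estimates(sample):
    """Return the (divisor n, divisor n-1) variance estimates of a sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((v - mean) ** 2 for v in sample)
    return ss / n, ss / (n - 1)


def average_estimates(n=5, replicates=20000, sigma=1.0, seed=1):
    """Average both estimators over many synthetic normal samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total_n, total_n1 = 0.0, 0.0
    for _ in range(replicates):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        est_n, est_n1 = variance_estimates(sample)
        total_n += est_n
        total_n1 += est_n1
    return total_n / replicates, total_n1 / replicates


# True variance is 1.0; the divisor-n estimator is expected to average
# about (n - 1) / n of it, while the divisor-(n - 1) estimator is unbiased.
biased_mean, unbiased_mean = average_estimates()
print(biased_mean, unbiased_mean)
```

The averages approximate the expected values of the two estimators, so the gap between them is a direct picture of the bias that Equation 2.2.2 rules out.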
2.2.3. Least squares method
The simplest and most popular method of parameter estimation is the
least squares method. This method is based on minimizing the sum of the
squares of the differences between model predictions and measurements,
$\varepsilon_i$ of Equation 2.2.1,
$$SS = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} [y_i - g(x_i,\theta)]^2 \tag{2.2.4}$$
This method was first suggested by Legendre (1805) for parameter
estimation in linear curve fitting. Curve fitting differs from parameter
estimation in that the parameters of the former have no physical
meaning. Parameters estimated by the method of least squares are often of
this type. Since no assumption about the distribution of errors is
made, the estimates cannot be related to the true values of the
parameters and are thus meaningless. Duality between curve fitting and
parameter estimation will be dealt with in more detail in Chapter 6.
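Equating to zero the derivatives of Equation 2.2.4 with respect to
each of the parameters $\theta_j$, $j = 1, \ldots, p$, leads to a set of $p$ equations
with $p$ unknowns. These are the well-known normal equations
$$\begin{aligned}
\frac{\partial (SS)}{\partial \theta_1} &= -2 \sum_{i=1}^{n} [y_i - g(x_i,\theta)] \frac{\partial g(x_i,\theta)}{\partial \theta_1} = 0 \\
\frac{\partial (SS)}{\partial \theta_2} &= -2 \sum_{i=1}^{n} [y_i - g(x_i,\theta)] \frac{\partial g(x_i,\theta)}{\partial \theta_2} = 0 \\
&\;\;\vdots \\
\frac{\partial (SS)}{\partial \theta_p} &= -2 \sum_{i=1}^{n} [y_i - g(x_i,\theta)] \frac{\partial g(x_i,\theta)}{\partial \theta_p} = 0 .
\end{aligned} \tag{2.2.5}$$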
The ease of the solution of these equations is highly dependent on the
form of the function g. If this function is a linear expression of the
parameters, then the normal equations are also linear, and the problem
reduces to the linear least squares problem, solved readily by multiple
linear regression. On the other hand if g is not a linear expression of
the parameters, the normal equations are nonlinear. No direct explicit
solution for such equations exists. Nonlinear least squares methods may
be used for this solution, usually based on a direct search for the set
of parameters minimizing Equation 2.2.4 or on an iterative solution of a
linearized form of the normal equations. A detailed description
of these methods may be found in Bevington (1969) and Bard (1974).
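For the linear case the normal equations of Equation 2.2.5 can be solved in closed form. The sketch below assumes the simplest linear model, a straight line g(x, theta) = theta1 + theta2 x, and a small made-up data set; the function names and data are illustrative only and do not come from the study.

```python
def fit_line_least_squares(x, y):
    """Solve the two normal equations for the line y = theta1 + theta2 * x."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; slope is undefined")
    theta2 = (n * sxy - sx * sy) / det
    theta1 = (sy - theta2 * sx) / n
    return theta1, theta2


def sum_of_squares(x, y, theta1, theta2):
    """Sum of squared residuals, SS of Equation 2.2.4, for the line model."""
    return sum((yi - (theta1 + theta2 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical observations, for illustration only.
x_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
y_obs = [1.1, 2.9, 5.2, 6.8, 9.1]

theta1_hat, theta2_hat = fit_line_least_squares(x_obs, y_obs)
ss_min = sum_of_squares(x_obs, y_obs, theta1_hat, theta2_hat)
print(theta1_hat, theta2_hat, ss_min)
```

At the solution both normal equations hold: the residuals sum to zero and are orthogonal to x. Any perturbation of either parameter increases SS, which is a quick check when adapting the sketch to other linear forms.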
2.2.4. Maximum likelihood method
By showing that least squares estimates maximize the probability
density function for the normal error distribution, Gauss (1809)
(referenced in Bard (1974)) laid the statistical foundation for parameter
estimation. In this, Gauss anticipated the maximum likelihood method.
This method was revived later by Fisher (1950), who studied its estimator
properties such as consistency, efficiency and sufficiency.
Assuming the measured values $y_i$, $i = 1, \ldots, n$, are independent and
that each has a probability density function (pdf) $f_{y_i}$ about its
expected value $\mu_{y_i}$, estimated by $g(x_i,\theta)$ of Equation 2.2.1, the
probability that all $n$ observations lie within $dy$ of the predicted value
$g(x_i,\theta)$ defines the likelihood function
$$L = \prod_{i=1}^{n} f_{y_i}(\varepsilon_i) \tag{2.2.6}$$
Meyer (1975, pp. 136) emphasized that the term "likelihood"
distinguishes, and is reserved for, a posteriori probabilities. That is, after
a result $y_i$ has been observed, $f_{y_i}(\varepsilon_i)$ states how probable it was that
$\varepsilon_i$ was found. The main distinction between a posteriori and ordinary (a
priori) probabilities is that the latter are well defined numbers (even
if unknown) whereas the former have the character of random
variables.
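If the errors $\varepsilon_i$ are normally distributed with mean zero and
variance $\sigma_i^2$, then Equation 2.2.6 gives the following expression of the
likelihood function:
$$\begin{aligned}
L &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[ -\frac{1}{2} \left( \frac{y_i - g(x_i,\theta)}{\sigma_i} \right)^2 \right] \\
  &= \exp\left[ -\frac{1}{2} \sum_{i=1}^{n} \left( \frac{y_i - g(x_i,\theta)}{\sigma_i} \right)^2 \right] \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i}
\end{aligned} \tag{2.2.7}$$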
The maximum likelihood estimator (MLE) of $\theta$, say $\hat{\theta}$, is the value of $\theta$
that maximizes $L$ or, equivalently, the logarithm of $L$, $\ell = \log L$ (Meyer,
1975, p. 312). The MLE of $\theta$ is often but not always the solution of
the following $p$ equations:
$$\frac{\partial \ell}{\partial \theta_j} = 0, \quad j = 1, \ldots, p. \tag{2.2.8}$$
Examples requiring alternative procedures to obtain MLE may be found in
Mann et al. (1974, p. 113) and Meyer (1975, p. 314). These equations
are similar to the previously defined normal equations. Usually they
are not linear expressions of the parameters, and their solution
requires nonlinear estimation methods. The complexity of the solution
increases with that of the form of $g(x,\theta)$ and $f_{y_i}$. Maximum likelihood
estimates satisfying Equation 2.2.6 will often occur at local maxima
(Meyer, 1975, p. 312), but they do have the following important
properties (Mann et al., 1974, p. 82): 1) If an efficient estimator, $\hat{\theta}$, of $\theta$
exists, then this estimator is the unique solution of the related
likelihood function. 2) If the number of sufficient statistics is equal to
the number of unknown parameters, the MLE's are the minimum variance
estimators of their respective expected values. 3) Under certain
general conditions the MLE's converge probabilistically (Equation 2.2.3)
to the true value $\theta$ as $n \to \infty$, and are asymptotically normal and
asymptotically efficient estimates of $\theta$. 4) The MLE's are invariant to a
transformation with a single-valued inverse. In other words, if $\hat{\theta}$ is
the MLE of $\theta$, and $t(\theta)$ is a function of $\theta$ with a single-valued inverse,
$t(\hat{\theta})$ is the MLE of $t(\theta)$. This last property is a very important one; it
will be used in Chapter 3 for statistical inferences about estimates
from transformed variables.
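The invariance property (property 4) can be checked numerically. The sketch below is an illustration under stated assumptions, not the procedure of this study: it assumes a small made-up sample with normal errors and no explanatory variables, fixes the mean at its MLE, and maximizes the log-likelihood once over the standard deviation and once over the variance with a simple golden-section search. The function names and search bounds are hypothetical.

```python
import math


def log_likelihood(sample, mu, sigma):
    """Logarithm of Equation 2.2.7 for a common mean mu and common sigma."""
    n = len(sample)
    ss = sum((v - mu) ** 2 for v in sample)
    return -n * math.log(sigma) - 0.5 * n * math.log(2.0 * math.pi) - ss / (2.0 * sigma ** 2)


def golden_section_max(func, lo, hi, tol=1e-9):
    """Locate the maximum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) > func(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return 0.5 * (a + b)


# Hypothetical observations, for illustration only.
sample = [2.1, 3.4, 1.9, 4.2, 2.8, 3.6]
mu_hat = sum(sample) / len(sample)

# Maximize over sigma, then separately over the variance v = sigma**2.
sigma_hat = golden_section_max(lambda s: log_likelihood(sample, mu_hat, s), 0.1, 5.0)
var_hat = golden_section_max(lambda v: log_likelihood(sample, mu_hat, math.sqrt(v)), 0.01, 25.0)
print(sigma_hat, var_hat, sigma_hat ** 2)
```

Because squaring has a single-valued inverse on positive values, the two searches agree: the square of the estimated standard deviation matches the separately estimated variance, and both match the closed-form value, the mean squared deviation about the sample mean.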
2.2.5. Weighted least squares
By defining "chi-square"
$$\chi^2 = \sum_{i=1}^{n} \left( \frac{y_i - g(x_i,\theta)}{\sigma_i} \right)^2 \tag{2.2.9}$$
and substituting it into Equation 2.2.7, the likelihood function
will be
$$L = \exp(-\chi^2/2) \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i} \tag{2.2.10}$$
The only term in this expression that depends on $\theta$ is $\chi^2$. Therefore
maximizing $L$ or $\ell$ will be equivalent to minimizing $\chi^2$ (Meyer, 1975, p.
254).
If all observations are normally distributed with the same variance
(homoscedastic), $\sigma_i^2 = \sigma^2$, $i = 1, \ldots, n$, then maximizing $L$, Equation 2.2.7,
will be equivalent to minimizing the simple sum of squares of Equation
2.2.4. This is the previously mentioned relation between simple least
squares and maximum likelihood given by Gauss (1809).
Equation 2.2.9 may be viewed as a weighted sum of squares (WSS)
when written in the following form:
$$WSS = \sum_{i=1}^{n} w_i\, [y_i - g(x_i,\theta)]^2 \tag{2.2.11}$$
where the weights $w_i$ are equal to the inverse of the variance of the
residual ($1/\sigma_i^2$). Weighting by the elements of the inverse of the
covariance matrix of the errors was shown to lead to the least-variance
estimates if the model $g(x,\theta)$ is linear in the parameters, or if the
number of observations is large and the errors are normally distributed
(Bard, 1974, p. 57). For the general case of nonlinear models and
nonnormal distributions, approximate optimal properties may still be
reached by weighting with the inverse of the covariance matrix. When
the covariance matrix is not known, Bard suggested either guessing its
elements or estimating them along with the other parameters using a
method such as maximum likelihood, while Bevington (1969, p. 102)
suggested evaluating these weights directly from instrumental uncertainties.
When measurements are made with physical instruments, as is often the
case in hydrology, the uncertainty in each measurement generally arises
from fluctuations of repeated readings of the instrument scale. Such
fluctuations may result from either human imprecision in reading the
settings, imperfection in the equipment, or a combination of both. A
replication of some of these experiments will give direct estimates of
the elements of the covariance matrix. Draper and Smith (1966, p. 77)
considered the case where $g(x,\theta)$ is linear in the parameters. When
the covariance matrix is not diagonal (correlated observations) or is
diagonal with unequal variances (some observations are more reliable
than others) they showed that there is always a transformation of the
observations $y$ to other variables $Y$ satisfying at least approximately
the following properties: 1) $Y$ is linear in the parameters, 2) the
covariance matrix is diagonal with constant elements, and 3) the errors
are approximately normally distributed with zero mean and variance
matrix $I\sigma^2$, where $I$ is the identity matrix. With these properties the
transformed variables satisfy all conditions leading to the minimum
variance unbiased estimate of $\theta$ by unweighted least squares and allow
statistical tests and confidence intervals to be constructed. These
results are then reexpressed in terms of the original variables; the
optimality of these estimates is guaranteed by the previously mentioned
property 4 of the MLE's. From Draper and Smith's approach we see how
the problem of finding the right weights may be reduced to that of
finding an appropriate transformation of the measured variables.
Ditlevsen (1981, p. 232) followed the same approach of normalizing the
observations $y$ to $N(0, I\sigma^2)$, and then defining a general reliability
index in the normalized space. This index is extended to a more general
normalizing transformation in Chapters 3 and 5.
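The equivalence between weighting and transforming can be sketched for the simplest case. The example below assumes a straight-line model, a diagonal covariance matrix with known but unequal standard deviations, and made-up data; it compares the minimization of Equation 2.2.11 with an ordinary (unweighted) least squares fit to observations divided by their standard deviations. All names and values are illustrative assumptions.

```python
def weighted_line_fit(x, y, sigma):
    """Minimize sum of w_i * (y_i - t1 - t2 * x_i)**2 with w_i = 1 / sigma_i**2."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    t2 = (sw * swxy - swx * swy) / det
    t1 = (swy - t2 * swx) / sw
    return t1, t2


def transformed_unweighted_fit(x, y, sigma):
    """Scale each observation by 1 / sigma_i, then apply unweighted least squares.

    The transformed model is Y = t1 * u + t2 * v with u = 1 / sigma,
    v = x / sigma and Y = y / sigma, whose errors have unit variance.
    """
    u = [1.0 / s for s in sigma]
    v = [xi / s for xi, s in zip(x, sigma)]
    big_y = [yi / s for yi, s in zip(y, sigma)]
    suu = sum(a * a for a in u)
    suv = sum(a * b for a, b in zip(u, v))
    svv = sum(b * b for b in v)
    suy = sum(a * c for a, c in zip(u, big_y))
    svy = sum(b * c for b, c in zip(v, big_y))
    det = suu * svv - suv ** 2
    t1 = (suy * svv - svy * suv) / det
    t2 = (svy * suu - suy * suv) / det
    return t1, t2


# Hypothetical observations and measurement standard deviations.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [3.4, 5.2, 6.1, 8.3, 9.4]
sigma_obs = [0.5, 1.0, 2.0, 1.0, 0.5]

print(weighted_line_fit(x_obs, y_obs, sigma_obs))
print(transformed_unweighted_fit(x_obs, y_obs, sigma_obs))
```

Both routes give the same parameter estimates, which is the diagonal, unequal-variance instance of the reduction from choosing weights to choosing a transformation. Correlated observations would need a full covariance matrix and are outside this sketch.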
2.2.6. Method of moments
The error term of Equation 2.2.1 has a pdf $f_\varepsilon(\varepsilon) = f_y(y, g(x,\theta), \sigma^2)$.
The rth moment about zero of $f_y$ is
$$\mu'_r = \int y^r f_y(y, g(x,\theta), \sigma^2)\, dy \tag{2.2.12}$$
The moment $\mu'_r$ is a function of the parameters $\theta_i$, $i = 1, \ldots, p$, and $\sigma^2$,
and hence can be written as
$$\mu'_r = \mu'_r(\theta_1, \theta_2, \ldots, \theta_p, \sigma^2) \tag{2.2.13}$$
From a sample of observations $y_1, y_2, \ldots, y_n$ the first $p+1$ sample
moments, $m'_r$, are
$$m'_r = \frac{1}{n} \sum_{i=1}^{n} y_i^r \tag{2.2.14}$$
The moment estimators $\hat{\theta}_i$, $i = 1, \ldots, p$, of the $p$ $\theta$'s and $\hat{\sigma}^2$ of $\sigma^2$ are
obtained by solving the system of $p+1$ equations resulting from equating
Equations 2.2.12 and 2.2.14,
$$\mu'_r = m'_r, \quad r = 1, 2, \ldots, p+1 .$$
Note that for this case the variance $\sigma^2$ was considered unknown, but
constant. If $\sigma^2$ is not constant the number of unknowns will jump to $p+n$,
requiring the calculation of the first $(p+n)$ sample moments. This is
a main drawback of the method of moments since sample moments of order
higher than first are biased. The following properties of moment
estimators were originally shown by Cramer (1946) and reported in Mann et
al. (1974, p. 80): (1) they are simple and squared error consistent;
(2) they are asymptotically normal, but not, in general, asymptotically
efficient or best asymptotically normal.
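As a minimal sketch of the moment-matching step, consider the case without explanatory variables, where the only unknowns are the mean and the constant variance, so that the first two sample moments suffice. The data below are made up and the function names are illustrative.

```python
def sample_moment(sample, r):
    """rth sample moment about zero, m'_r of Equation 2.2.14."""
    return sum(v ** r for v in sample) / len(sample)


def moment_estimates(sample):
    """Equate the first two population and sample moments.

    With mu'_1 = mean and mu'_2 = variance + mean**2, solving the two
    equations gives mean = m'_1 and variance = m'_2 - m'_1**2.
    """
    m1 = sample_moment(sample, 1)
    m2 = sample_moment(sample, 2)
    return m1, m2 - m1 ** 2


# Hypothetical observations, for illustration only.
observations = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
mean_hat, variance_hat = moment_estimates(observations)
print(mean_hat, variance_hat)
```

The variance obtained this way carries the divisor n, so it is the biased estimator; this is one concrete instance of the bias of sample moments above the first order.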
2.2.7. Order statistics based least squares
If the distribution $f_y(y)$ of the errors in Equation 2.2.1 is of
known form with location and scale parameters $g(x,\theta)$ and $\sigma_\varepsilon$,
respectively, its parameters may be estimated by applying general least
squares theory to the ordered sample of observations. This method was
first analyzed by Lloyd (1952), who showed that the resulting estimates
are unbiased, linear in the ordered observations, and of minimum
variance. He also developed explicit formulae for the estimates and for
their variances and covariances.
For a distribution which depends on location and scale parameters
only, the cumulative distribution function (CDF), $F_y(y)$, may be reduced
to a parameter-free (standardized) distribution $F_z(z)$ with pdf $f_z(z)$,
where $z$ is equal to
$$z = \frac{y - g(x,\theta)}{\sigma_\varepsilon} \tag{2.2.15}$$
If the variates ($z$'s) are arranged in ascending order, such that $z_{i-1}
< z_i$, $i = 2, \ldots, n$, the smallest observation $z_1$ is called the first order
statistic whereas the last observation $z_n$ is called the nth order
statistic. The small probability $h_{z_i}(z)\,dz$ that the ith order statistic,
$z_i$, lies in the range $z - dz/2 \le z_i \le z + dz/2$ will define the
probability density function, $h_{z_i}(z)$, of the order statistic. The
probability that $i-1$ observations are less than $z$, $n-i$ observations are
greater than $z$, and exactly one observation is in the range $z - dz/2
\le z_i \le z + dz/2$ is given by the multinomial distribution,
$$h_{z_i}(z)\, dz = \frac{n!}{(i-1)!\,(n-i)!} [F_z(z)]^{i-1} [1 - F_z(z)]^{n-i} f_z(z)\, dz \tag{2.2.16}$$
The expected value of the ith smallest order statistic in a sample of
size $n$ is
$$E(z_i) = \int_{-\infty}^{\infty} z\, h_{z_i}(z)\, dz = \frac{n!}{(i-1)!\,(n-i)!} \int_{-\infty}^{\infty} z\, [F_z(z)]^{i-1} [1 - F_z(z)]^{n-i} f_z(z)\, dz \tag{2.2.17}$$
Substitution of $E(z_i)$ for $z$ into Equation 2.2.15 leads to the relation
$$y_i = g(x_i,\theta) + \sigma_\varepsilon E(z_i) \tag{2.2.18}$$
If $g(x_i,\theta)$ is replaced by $\mu_y$, the mean of the $n$ observations, then this
equation reduces to the classical case of two-parameter pdf's with $\sigma_\varepsilon = \sigma_y$
(Equation 2.2.1c). Equation 2.2.18 illustrates the linear relation
between the mean observation and the expected order statistic. This
equation is the analytical formulation of the probability paper linear
plot. The expected standardized order statistic $E(z_i)$ may be converted
to (and is sometimes termed) the plotting position because it defines
the probability at which $y_i$ should be plotted on probability paper.
Equation 2.2.17 is usually too complicated for analytical evaluation.
Harter (1961) used numerical integration for n = 2 to 400 to obtain values
of $E(z_i)$ accurate to five decimal places with a normal parent
distribution. Computer algorithms for the evaluation of the expected normal
order statistic are included in many statistical packages (Westcott,
1977; Royston, 1982; Statistical Analysis System, 1982). For the
extreme value Type I parent distribution Lieblein and Salzer (1953)
simplified an expression, originally derived by Kimbal (1946), for
Equation 2.2.17:
$$E(z_i) = i \binom{n}{i} \sum_{r=0}^{n-i} (-1)^r \binom{n-i}{r} \frac{C + \ln(i+r)}{i+r} \tag{2.2.19}$$
where C = 0.5772 is Euler's constant. Cunnane (1978) found that for n
larger than 35, even with double precision FORTRAN arithmetic, roundoff
errors overwhelmed the evaluation of $E(z_i)$ and suggested the use of
approximation formulae for the expected order statistic. For an
exponential distribution Equation 2.2.17 reduces to the very simple
expression
$$E(z_i) = \sum_{j=1}^{i} \frac{1}{n + 1 - j} \tag{2.2.20}$$
and for a uniform distribution $E(z_i)$ is numerically equal to the
plotting position (probability) and is given by the well-known Weibull
(1939) formula,
$$E(z_i) = F_y^{-1}\left( \frac{i}{n+1} \right) = \frac{i}{n+1} \tag{2.2.21}$$
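The two closed-form cases can be checked against a direct numerical evaluation of Equation 2.2.17. The sketch below is illustrative only: it substitutes u = F_z(z), which turns the integral into one over the unit interval, and applies a plain midpoint rule. The function names, the step count and the choice n = 10 are assumptions made for this example.

```python
import math


def expected_order_statistic(i, n, inverse_cdf, steps=50000):
    """Midpoint-rule evaluation of Equation 2.2.17 after the change of variable u = F_z(z)."""
    coeff = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += inverse_cdf(u) * u ** (i - 1) * (1.0 - u) ** (n - i)
    return coeff * total * h


def exponential_expected(i, n):
    """Closed form of Equation 2.2.20 for a standard exponential parent."""
    return sum(1.0 / (n + 1 - j) for j in range(1, i + 1))


def weibull_position(i, n):
    """Closed form of Equation 2.2.21 for a uniform parent on (0, 1)."""
    return i / (n + 1)


n = 10
for i in (1, 5, 10):
    numeric_exp = expected_order_statistic(i, n, lambda u: -math.log(1.0 - u))
    numeric_uni = expected_order_statistic(i, n, lambda u: u)
    print(i, numeric_exp, exponential_expected(i, n), numeric_uni, weibull_position(i, n))
```

The midpoint rule avoids evaluating the exponential inverse CDF at u = 1, where it is unbounded. A production routine would use a quadrature designed for endpoint singularities; this sketch only aims to show that the closed forms and the integral agree to a few decimal places.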
A general approximation formula for $E(z_i)$ was first suggested by Blom
(1953),
$$E(z_i) = F_y^{-1}\left( \frac{i - \alpha}{n + 1 - 2\alpha} \right) \tag{2.2.22}$$
where $F_y^{-1}$ is the inverse of the parent CDF and $\alpha$ is a constant depending
only on the form of the probability distribution. Blom recommended $\alpha$ =
0.375 for the normal distribution. Gringorten (1963) suggested $\alpha$ = 0.44
for the extreme value Type I distribution. For $\alpha$ = 0 and a uniform
probability distribution Equation 2.2.22 reduces to the exact formula of
Equation 2.2.21. The name plotting position is usually given to the
argument of $F_y^{-1}$ in Equation 2.2.22, the probability at which the ith
observation should be plotted. Conversion of the plotting position to a
standardized variate, $E(z_i)$, allows the estimation of the location and scale
parameters in Equation 2.2.18 by simple regression (Lloyd, 1952). The same
technique was suggested by Chow (1953) for frequency analysis in
hydrology. Kimbal (1960) and Cunnane (1978) compared the performance of
several plotting position formulae using the same technique. They used
variance and bias of the location and scale parameter estimates as
criteria of goodness of fit for the normal and extreme value Type I
distributions. Maalel and Triki (1979) used the same approach to
compare the performance of different probability distributions in the
modeling of rainfall intensity-duration-frequency (IDF) structure over
Tunisia. Tang (1980) applied linear regression for a Bayesian frequency
analysis of annual flood discharge. Like Maalel and Triki, Tang used
the Weibull plotting position to calculate the expected standardized
order statistic; independently they presented the same formulation to
account for discrepancies between observed data and model predictions,
and for the uncertainty of extrapolation from a limited sample of data.
An expression for the overall variability of any individual prediction
may be found in many statistical textbooks such as Raiffa and Schlaifer
(1961), and Gremy and Salman (1969). From Equation A.10 we have
$$\sigma_{\hat{y}_i} = S_y \left[ 1 + \frac{1}{n} + \frac{(z_i - \bar{z})^2}{\sum_{j=1}^{n} (z_j - \bar{z})^2} \right]^{1/2} \tag{2.2.23}$$
where $S_y$ is the standard deviation denoting the average scatter of the
data points about the regression line
$$y = A + Bz \tag{2.2.24}$$
where $z$ is the standardized variate (reduced order statistic) with mean
$\bar{z}$. $A$ and $B$ are the location and scale parameters, respectively. These
parameters are obtained from the regression analysis of Equation 2.2.24.
A description of these estimators is given in Appendix A.
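The regression of Equation 2.2.24 on plotting positions can be sketched as follows. This is an illustration under stated assumptions rather than the estimators of Appendix A: it assumes a normal parent, the plotting position formula of Equation 2.2.22 with Blom's constant 0.375 (called `alpha` here), ordinary least squares, and a made-up set of annual totals. `statistics.NormalDist.inv_cdf` is the standard library inverse normal CDF.

```python
from statistics import NormalDist


def plotting_positions(n, alpha=0.375):
    """Plotting position (i - alpha) / (n + 1 - 2 * alpha) for i = 1, ..., n."""
    return [(i - alpha) / (n + 1 - 2 * alpha) for i in range(1, n + 1)]


def probability_plot_fit(observations, alpha=0.375):
    """Regress the ordered observations on approximate expected normal order statistics.

    Returns (A, B), the intercept (location) and slope (scale) of y = A + B * z.
    """
    y = sorted(observations)
    n = len(y)
    standard_normal = NormalDist()
    z = [standard_normal.inv_cdf(p) for p in plotting_positions(n, alpha)]
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
             / sum((zi - z_bar) ** 2 for zi in z))
    intercept = y_bar - slope * z_bar
    return intercept, slope


# Hypothetical annual rainfall totals (mm), for illustration only.
annual_totals = [1210.0, 980.0, 1455.0, 1120.0, 1330.0, 1050.0, 1605.0, 1275.0]
location, scale = probability_plot_fit(annual_totals)
print(location, scale)
```

Setting `alpha=0.0` gives the Weibull positions i/(n + 1). Swapping the inverse CDF for that of another location-scale parent is the only change needed to try a different distribution, which is what makes this form convenient for comparing candidate models.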
The ratio of the difference between expected and predicted
observation to the overall standard deviation (Equation 2.2.23) has a Student t
distribution (Draper and Smith, 1964, p. 24). The $(1 - p/2)$ percent
confidence interval of any prediction $\hat{y}_i$ is
$$\hat{y}_i \pm t(\nu, 1 - p/2)\, \sigma_{\hat{y}_i} \tag{2.2.25}$$
where $\nu$ is the number of degrees of freedom.
2.2.8. Conclusion
The regression based on order statistics is extended in this study
to the more general case where the location parameter $A$ is allowed to
have some explanatory variables, Equation 2.2.18,
$$y_i = g(x_i,\theta) + \sigma_\varepsilon E(z_i) .$$
Note that if $g(x,\theta)$ is linear in the parameters ($\theta$'s) this equation is
linear, and multiple linear regression will have optimal properties. If
$g(x,\theta)$ is not linear, the nonlinear parameters may be estimated
separately through a nonlinear procedure, and Equation 2.2.18 will be used
for an overall evaluation of the model. This approach will be analyzed
in more detail in Chapter 6.
Given the importance of the probability distribution in the
classification and parameter estimation of hydrologic models, a review of
these distributions is given in the next section.
2.3. Probability Distributions
2.3.1. Introduction
Hydrologic modeling has always involved probabilistic analysis in
order to account for the stochastic nature of the processes involved.
Many discrete and continuous distributions have been found to be useful
for hydrologic frequency analysis (Chow, 1964). A review of these
distributions and their application in hydrology may be found in several of
the hydrologic textbooks such as Yevjevich (1972), Haan (1977), Kite
(1977), and Viessman et al. (1977).
2.3.2. Discrete distributions
The main discrete distributions used in hydrology are from the
Bernoulli and Poisson processes. A detailed description of these
distributions along with their interrelations is given in Section C.1 of
Appendix C. Waymire and Gupta (1981) present a very good review of
the Bernoulli and Poisson models and their application for the
mathematical representation of the temporal and spatial distribution of
rainfall and rainfall-driven hydrologic processes.
2.3.3. Continuous distributions
A review of the main continuous distributions and their statistical
parameters is given in section C.2 of Appendix C. The emphasis of this
review was on the interrelation between different distributions. Such
relations are very important in that they allow an objective choice
among different probability models through regression analysis. Among
these relations, the exponential distribution (Equation C.2.2) was shown
to be a special case of the gamma distribution (Equation C.2.3). Other
special cases of the gamma distribution are the Pearson Type III and
chi-square distributions, Equations C.2.4 and C.2.9, respectively. A
logarithmic transformation of the variables allows the transition from a
Type II or III to Type I extreme value distribution. Along this line
of interrelation, many other distributions have been derived and are
widely used in the scientific fields in general and in hydrologic
studies in particular. The following sections describe some of these
derived distributions.
2.3.4. Logarithmically derived distributions
2.3.4.1. Log-extremal distribution. In Section C.2.9 it is shown that
the extreme value Type I distribution can be transformed to a Type II
or III distribution by replacing the variate $(x - a)$ by its logarithm,
$\log(x - a)$. By this transformation Equations C.2.13 and C.2.14 simplify to
the more tractable form of Equation C.2.11 where the parameters are
of the location and scale type. This feature of variable transformation
will be explored in more detail in Chapter 4.
2.3.4.2. Lognormal distribution. The transformation of data may be
based on professional knowledge of the modeled processes (Benjamin and
Cornell, 1970). For example, if the data are known to result from the
product of many small effects, then their logs will be the sum of the logs
of these effects. From the central limit theorem, the distribution of
this sum is expected to be normal (Haan, 1977), and the original data
will be lognormally distributed. The lognormal distribution is also
known as the Galton distribution since Galton (1875) was the first to
study it (Chow, 1964, pp. 817). Aitchison and Brown (1957) derived its
statistical moments and applied it for economic analysis. Chow (1954)
showed that the extreme value Type I distribution is one of its special
cases.
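The pdf of the lognormal distribution is
$$f(y) = \frac{1}{\sigma_Y \sqrt{2\pi}\, y} \exp\left[ -\frac{1}{2} \left( \frac{Y - \mu_Y}{\sigma_Y} \right)^2 \right] \tag{2.3.1}$$
where $Y = \log y$, and $\mu_Y$ and $\sigma_Y$ are the mean and standard deviation of $Y$.
The statistical parameters for the variate $y$ are (see Appendix B,
Equations B.9 to B.12)
mean
$$\mu_y = e^{\mu_Y + \sigma_Y^2/2}$$
variance
$$\sigma_y^2 = \mu_y^2 \left( e^{\sigma_Y^2} - 1 \right)$$
coefficient of variation
$$V_y = \left( e^{\sigma_Y^2} - 1 \right)^{1/2}$$
and skewness
$$\gamma_y = 3 V_y + V_y^3 .$$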
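These relations are easy to evaluate directly. The sketch below assumes natural logarithms (the text writes only "log") and uses illustrative values for the log-space mean and standard deviation; the function name is hypothetical.

```python
import math


def lognormal_parameters(mu_log, sigma_log):
    """Mean, variance, coefficient of variation and skewness of y, given the
    mean and standard deviation of Y = ln(y)."""
    mean = math.exp(mu_log + sigma_log ** 2 / 2.0)
    cv = math.sqrt(math.exp(sigma_log ** 2) - 1.0)
    variance = (mean * cv) ** 2
    skewness = 3.0 * cv + cv ** 3
    return mean, variance, cv, skewness


# Illustrative log-space parameters, not taken from the study.
mean_y, variance_y, cv_y, skew_y = lognormal_parameters(mu_log=0.0, sigma_log=0.5)
print(mean_y, variance_y, cv_y, skew_y)
```

The skewness depends only on the coefficient of variation and is always positive, so a lognormal variate is necessarily right-skewed in the real space.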
A location parameter, $a$, is often added to the lognormal distribution.
Redefining $Y$ of Equation 2.3.1 as $\log(y - a)$ leads to the
three-parameter lognormal distribution. Its pdf is
$$f(y) = \frac{1}{\sigma_Y \sqrt{2\pi}\,(y - a)} \exp\left[ -\frac{1}{2} \left( \frac{\log(y - a) - \mu_Y}{\sigma_Y} \right)^2 \right] \tag{2.3.2}$$
with $(y - a) > 0$, $a$ the location parameter, and $\mu_Y$ and $\sigma_Y$ as defined
above.
Munro and Wixley (1970) showed that the lognormal distribution can
be expressed in terms of location, scale and shape parameters $\mu$, $\tau$
and $\alpha$, respectively. This was illustrated by the following
reparameterization of Equation 2.3.1,
$$f(y) = \frac{1}{\sqrt{2\pi}\,(1 + \alpha z)\,\tau} \exp\left\{ -\frac{1}{2\alpha^2} [\log(1 + \alpha z)]^2 \right\} \tag{2.3.3}$$
where
$$z = \frac{y - \mu}{\tau}, \qquad \mu = a + e^{\mu_Y}, \qquad \text{and} \qquad \alpha = \sigma_Y .$$
In this form it is obvious that the lognormal distribution approaches
the normal distribution as the shape parameter approaches zero for
fixed location ($\mu$) and scale ($\tau$) parameters (since $\log(1 + \alpha z) \approx \alpha z$
as $\alpha \to 0$).
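The limiting behavior can be seen numerically. The sketch below evaluates the reparameterized density of Equation 2.3.3 as read here (location, scale and shape supplied directly as inputs) and compares it with the normal density having the same location and scale. The function names, the evaluation points and the two shape values are illustrative assumptions.

```python
import math


def reparameterized_lognormal_pdf(y, mu, tau, alpha):
    """Density of Equation 2.3.3 with location mu, scale tau and shape alpha."""
    z = (y - mu) / tau
    arg = 1.0 + alpha * z
    if arg <= 0.0:
        return 0.0  # outside the support of the distribution
    u = math.log1p(alpha * z) / alpha
    return math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * tau * arg)


def normal_pdf(y, mu, tau):
    """Normal density with mean mu and standard deviation tau."""
    z = (y - mu) / tau
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * tau)


def max_difference(alpha, mu=0.0, tau=1.0):
    """Largest absolute gap between the two densities over z in [-2, 2]."""
    points = [mu + tau * (k / 10.0) for k in range(-20, 21)]
    return max(abs(reparameterized_lognormal_pdf(y, mu, tau, alpha) - normal_pdf(y, mu, tau))
               for y in points)


print(max_difference(0.5), max_difference(1e-4))
```

The gap shrinks as the shape parameter decreases, which is the numerical counterpart of the first-order expansion of the logarithm used in the text.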
2.3.4.3. Log-Pearson Type III distribution. If the variate $(y - a)$ of
the Pearson Type III distribution, Equation C.2.4, is replaced by its
logarithm, the resulting equation defines the log-Pearson Type III
distribution, after multiplication by the Jacobian of the
transformation, of course. This distribution was recommended by the United States
Water Resources Council for flood flow frequency analysis, WRC (1976,
1977, 1981). Application of this distribution in hydrology is
extensively reported in the literature (Bobee, 1975; Bobee and Robitaille,
1977; Landwehr et al., 1978; Rao, 1980, 1981; Lall and Beard, 1981).
The pdf of the log-Pearson Type III distribution is
$$f(y) = \frac{\lambda^k}{(y - a)\,\Gamma(k)}\, Y^{k-1} e^{-\lambda Y} \tag{2.3.4}$$
with $Y = \log(y - a)$.
2.3.4.4. Summary. The logarithmic transformations of the extremal and
Pearson distributions have no theoretical basis. Their popularity
results mainly from the good fit they show in modeling many hydrological
processes. The assumption of an infinite number of small
multiplicative effects implied by the lognormal distribution is expected to be
seldom if ever satisfied in nature, where effects are usually of finite
number. Landwehr et al. (1978) compared flood statistics in real and
log space and found that flood sequences are predominantly positively
skewed in the real space and predominantly negatively skewed in the log
space, indicating an overtransformation of the data. From these
findings it is obvious that some intermediate transformation (space) may
be much more appropriate for the modeling of these data. The following
section gives some alternative transformations which have been reported
in hydrologic and meteorologic frequency analysis studies.
2.3.4. Other derived distributions
Stidd (1953, 1968) found a good fit of daily, monthly and annual
rainfall to the normal distribution with a cube root transformation of
the data. The choice of this transformation was based on speculation
about the nature of the skewness of the untransformed time series.
Stidd assumed that rainfall is the product of three meteorological
effects, namely atmospheric vertical motion, air moisture content and
rainfall duration time. Later, Stidd (1970) noted that no mathematical
proof of the previous concept was found, but that an experiment with
synthetic precipitation data confirmed the cube root distribution of
precipitation. Also, he generalized this concept to the Nth root normal
distribution and developed a straight line plotting method for the
estimation of the parameters. If the Nth root of the observed
precipitation $y$ is normally distributed, it may be plotted as a straight line,
$$ y^{\alpha} = \sigma (z - z_0) \tag{2.3.5} $$
where $\alpha = 1/N$, $\sigma$ is the standard deviation of $y^{\alpha}$, $z$ is the
standard normal variate and $z_0$ is the z value corresponding to the
cumulative percentage of zeros in the data (threshold in the truncated
normal distribution). Tables of the normal cumulative distribution
function (CDF) were used for evaluation of the z values. To enter this
table, Stidd used empirical frequencies calculated from the previously
defined Weibull plotting position, without even introducing the plotting
position concept. (Evaluation of the Weibull plotting position using real
data will be given in the next chapter.) Stidd developed an iterative
procedure for the estimation of the parameters $\alpha$, $\sigma$ and $z_0$. The
procedure had as an objective function the maximization of the linearity
of the $\log(y)$ vs. $\log(z - z_0)$ plot. From his experiment with synthetic
rainfall Stidd
concluded that the Nth root normal distribution has a more valid
theoretical basis when compared to the empirically derived Pearson family of
distributions and that the Nth root normal distribution is more
appropriate for extrapolation to return periods greater than the period of
observation. Chander et al. (1979) applied the normalizing power
transformation suggested by Box and Cox (1964) to the analysis of flood
frequency. The Box-Cox transformation is
$$ Y = \begin{cases} \dfrac{y^{\alpha} - 1}{\alpha}, & \alpha \neq 0 \\ \log y, & \alpha = 0 . \end{cases} \tag{2.3.6} $$
This transformation has the advantage of being continuous at $\alpha = 0$
($N \to \infty$, Equation 2.3.5) and not restricting $\alpha$ to the reciprocal of
positive integers. A more detailed discussion of this transformation will
be given in Chapter 3. Chander et al. applied this transformation to
annual maximum discharges of fifteen rivers from India. These data were
better fitted by the power normal (normal with the Equation 2.3.6
transformation) than by the normal, lognormal, Pearson Type III,
log-Pearson Type III or Gumbel distribution. The goodness of fit was the
closeness (when viewed by eye) of these model predictions to the observed
data when plotted on normal probability paper. The Weibull plotting
position was used for the evaluation of the probability of nonexceedance
with all distributions. Here again no justification was given for such a
choice of the plotting position.
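The continuity of Equation 2.3.6 at $\alpha = 0$ can be illustrated with a short sketch. It is an illustration only, under the assumption of strictly positive observations; the function names are hypothetical and the use of `math.expm1` is an implementation choice made here for numerical stability near zero.

```python
import math


def box_cox(y, alpha):
    """Equation 2.3.6 for a positive observation y."""
    if y <= 0.0:
        raise ValueError("the transformation requires y > 0")
    if alpha == 0.0:
        return math.log(y)
    # expm1 keeps precision when alpha is close to zero.
    return math.expm1(alpha * math.log(y)) / alpha


def inverse_box_cox(Y, alpha):
    """Back-transform Y to the original scale (requires 1 + alpha*Y > 0)."""
    if alpha == 0.0:
        return math.exp(Y)
    return (1.0 + alpha * Y) ** (1.0 / alpha)


if __name__ == "__main__":
    y = 5.0
    # Values for small alpha approach log(y), showing the continuity at zero.
    for alpha in (0.5, 0.1, 0.001, 0.0):
        print(alpha, box_cox(y, alpha))
```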
Independently, Maalel and Triki (1979) applied a similar
transformation with the normal and extreme value distributions for frequency
analysis of average rainfall intensity of fixed duration. This
transformation was
$$ Y = \begin{cases} y^{\alpha}, & \alpha \neq 0 \\ \log y, & \alpha = 0 \end{cases} \tag{2.3.7} $$
This transformation has the disadvantage of not being continuous at
$\alpha = 0$, although its performance is the same as the Box-Cox transformation.
Rainfall depths recorded at ten meteorological stations from Tunisia were
analysed for durations ranging from 5 to 120 minutes. Among their
conclusions, Maalel and Triki found that with the power exponent $\alpha$
(Equation 2.3.7) as a third parameter, the normal, extremal, and many
other two-parameter distributions of the exponential type fitted most
of their data equally well. A direct computer search algorithm was
developed to solve for the location and scale parameters and the
appropriate transformation $\alpha$. The solution was based on the linear
relation between transformed variables and expected standardized order
statistic,
$$ Y = A\, E(z_i) + B \tag{2.3.8} $$
where $Y$ is defined in Equation 2.3.7, $E(z_i)$ is the standardized order
statistic of the distribution of interest, and $A$, $B$ are model
parameters. For the normal distribution this method is similar to that of
Stidd (1970) and Equation 2.3.8 is the same as Equation 2.3.5 with
$E(z_i) = z$, $A = \sigma$, and $B = -\sigma z_0$. The parameters $A$, $B$ are estimated by
simple linear regression for a fixed grid of $\alpha$'s, ranging from -1.20 to
1.20. The optimal $\alpha$ was the one giving the minimum mean square error
(MSE) of the untransformed precipitation,
$$ \mathrm{MSE} = \left[ \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \right]^{1/2} \tag{2.3.9} $$
where $y_i$ is the observed value and $\hat{y}_i$ the corresponding model
prediction. This method will be analyzed in more
detail and its performance compared to other methods in Chapters 3 and 5.
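The grid search built on Equations 2.3.7 to 2.3.9 can be sketched as follows. This is a minimal illustration rather than the original algorithm: the function names are hypothetical, the expected order statistic $E(z_i)$ is approximated (as an assumption) by the standard normal quantile at the Weibull plotting position $i/(n+1)$, and the error of Equation 2.3.9 is computed with a divisor of $n$.

```python
import math
from statistics import NormalDist


def power_transform(y, alpha):
    """Equation 2.3.7: y**alpha for alpha != 0, log(y) for alpha == 0."""
    return math.log(y) if alpha == 0.0 else y ** alpha


def inverse_power_transform(Y, alpha):
    """Back-transform to the original scale (Y must be positive if alpha != 0)."""
    return math.exp(Y) if alpha == 0.0 else Y ** (1.0 / alpha)


def linear_fit(x, y):
    """Simple least-squares slope A and intercept B of y = A*x + B."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    A = sxy / sxx
    return A, my - A * mx


def fit_power_normal(sample, alphas):
    """Return (error, alpha, A, B) minimizing Equation 2.3.9 over the grid."""
    ys = sorted(sample)
    n = len(ys)
    z = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    best = None
    for alpha in alphas:
        Y = [power_transform(y, alpha) for y in ys]
        A, B = linear_fit(z, Y)
        pred = [A * zi + B for zi in z]
        if alpha != 0.0 and min(pred) <= 0.0:
            continue  # prediction cannot be mapped back to the original scale
        y_hat = [inverse_power_transform(p, alpha) for p in pred]
        err = math.sqrt(sum((a - b) ** 2 for a, b in zip(ys, y_hat)) / n)
        if best is None or err < best[0]:
            best = (err, alpha, A, B)
    return best


if __name__ == "__main__":
    alphas = [round(-1.2 + 0.05 * i, 2) for i in range(49)]  # -1.20 ... 1.20
    sample = [12.1, 30.5, 8.2, 19.9, 25.3, 15.0, 42.7, 10.4, 17.6, 22.8]
    print(fit_power_normal(sample, alphas))
```

The sample values in the usage block are made-up numbers for demonstration only. Measuring the error on the untransformed scale, as in Equation 2.3.9, keeps the fits for different $\alpha$ comparable.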
Salas et al. (1980, p. 71) suggested a more general transformation
for the analysis of annual and monthly rainfall and runoff time series,
a (yb)01 a ^ 0
Y =
(2.3.10)
log (yb) a = 0 .
They found good approximations to the normal distribution for a=0,
b=0 and a=l/2, 1/3, and 1/4 for all the data they analyzed. This was in
good agreement with Stidd (1970), who found that $\alpha$ should be between 1/2
and 1/3, but stated that it should not become smaller than 1/3. Iyengar
(1982) applied the square root transformation ($\alpha = 1/2$) to monthly
rainfall series along with a change of sign from year to year,
$$ Y_{ij} = (-1)^{j+1} (y_{ij})^{1/2} \tag{2.3.11} $$
where i = 1, 2, ..., 12 months, and
j = 1, 2, ..., n years.
This transformation resulted in an approximately symmetrical process with
a 24-month period. The transformed data from ten Indian rainfall
stations were well fit by a straight line on a normal probability paper
plot.
2.4. Summary
Hydrologic models are deterministic, probabilistic or stochastic
depending on the probability distribution of the errors. These errors
are defined by the deviation of model predictions from the unknown real
values as estimated by sample observations. A uniform approach for
evaluating the reliability of these predictions may be accomplished by
linear regression analysis on the ranked observations and the expected
order statistics. This approach was shown to be a very powerful
technique, leading to optimal estimates of expected values and their
associated confidence intervals including statistical uncertainties.
Transformations other than the classical logarithmic transformation
define a better space on which inferences about model goodness of fit
and estimated parameters are possible.
The problem of finding the best probability distribution for
hydrologic reliability analysis may be solved following the same approach,
with $g(x,\theta)$ equal to a constant, say $\mu$, in Equation 2.2.18. As will be
seen in the next chapter, most probability distributions may be reduced
to a location-scale parameter type distribution when the variables are
expressed in the right space. Chapter 3 starts with a review of
generalized probability distributions and ends with an even more general
distribution with a form as simple as Equation 2.3.8.
CHAPTER 3
GENERALIZED PROBABILITY DISTRIBUTIONS
3.1. Generalized Gamma Distribution
3.1.1. Introduction
A review of different probability distribution models used in
hydrology was given in Chapter 2. The review included derived
distributions, such as lognormal, log-extremal, log-gamma and Nth root normal
models. This chapter will focus on the relation between these
distributions. First, generalized gamma and extreme value distributions are
presented. Then, a new parameterization is introduced, leading to an
even more general distribution with a much simpler form, of
the location and scale parameter type (Equation 2.2.24). The new
generalized distribution can be easily fitted to a small sample of data.
Confidence intervals for parameter estimates and model predictions are
well defined given the linearity in the parameters of these models.
3.1.2. Historical background
A three-parameter generalized gamma distribution (GGD) was
discussed by Stacy (1962). The probability density function (pdf) of this
distribution is
$$ f(y; a, b, k) = \frac{b\, y^{bk-1}}{a^{bk}\, \Gamma(k)} \exp\left[ -\left( \frac{y}{a} \right)^{b} \right] \tag{3.1.1} $$
where a is a scale parameter, b and k define a shape parameter d = bk,
and with y, a, b and k > 0.
This distribution is widely used in reliability analysis when the
observed data fail to follow any of the more familiar probability models
described in Chapter 2. Most of these distributions are special cases
of the GGD, e.g., the exponential (b = k = 1), Weibull (k = 1) and gamma
(b = 1). The GGD is not the same as the familiar three-parameter gamma
distribution, where the third parameter is a location parameter
(Equation C.2.4).
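The special cases named above can be verified numerically from Equation 3.1.1. The sketch below is illustrative only: the function names are hypothetical, and the exponential, Weibull and gamma densities are written in their usual textbook forms with scale parameter a.

```python
import math


def ggd_pdf(y, a, b, k):
    """Generalized gamma density of Equation 3.1.1 (y, a, b, k > 0)."""
    if y <= 0.0:
        return 0.0
    log_f = (
        math.log(b)
        + (b * k - 1.0) * math.log(y)
        - b * k * math.log(a)
        - math.lgamma(k)
        - (y / a) ** b
    )
    return math.exp(log_f)


def exponential_pdf(y, a):
    return math.exp(-y / a) / a


def weibull_pdf(y, a, b):
    return (b / a) * (y / a) ** (b - 1.0) * math.exp(-((y / a) ** b))


def gamma_pdf(y, a, k):
    return y ** (k - 1.0) * math.exp(-y / a) / (a ** k * math.gamma(k))


if __name__ == "__main__":
    y, a = 1.7, 2.0
    print(ggd_pdf(y, a, 1.0, 1.0), exponential_pdf(y, a))   # b = k = 1
    print(ggd_pdf(y, a, 1.5, 1.0), weibull_pdf(y, a, 1.5))  # k = 1
    print(ggd_pdf(y, a, 1.0, 2.5), gamma_pdf(y, a, 2.5))    # b = 1
```

Each printed pair should agree to rounding error, since the special cases are obtained by direct substitution into Equation 3.1.1.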
Maximum likelihood estimates from samples of limited size were
derived by Parr and Webster (1965) for the parameters a, b and k along
with their asymptotic variances. Stacy and Mihram (1965) extended the
GGD to a more general distribution by allowing the parameter b to have
negative values. They also compared parameter estimators obtained by
the method of moments, maximum likelihood, and minimum variance. Harter
(1967) introduced a further generalization to the GGD by defining a
fourth parameter (location parameter). He developed an iterative
procedure to solve the likelihood equations for the parameter estimates.
These equations were obtained by equating to zero the first partial
derivatives of the logarithm of the likelihood function with respect to
each of the parameters (Equation 2.2.6). The asymptotic
variance-covariance matrix of the estimates was estimated by the inverse of the
information matrix, M(n), which is composed of the expected values of
the second partial derivatives of the log-likelihood function, L, with
respect to the parameters,
$$ M(n) = -E \begin{bmatrix}
\dfrac{\partial^2 L}{\partial a^2} & \dfrac{\partial^2 L}{\partial a\,\partial b} & \dfrac{\partial^2 L}{\partial a\,\partial k} \\[2ex]
\dfrac{\partial^2 L}{\partial a\,\partial b} & \dfrac{\partial^2 L}{\partial b^2} & \dfrac{\partial^2 L}{\partial b\,\partial k} \\[2ex]
\dfrac{\partial^2 L}{\partial a\,\partial k} & \dfrac{\partial^2 L}{\partial b\,\partial k} & \dfrac{\partial^2 L}{\partial k^2}
\end{bmatrix} \tag{3.1.2} $$
From the analysis of the components of this matrix, Harter (1967)
found a high negative correlation between the estimates $\hat{b}$ and $\hat{k}$. This
should be expected, since, as stated earlier, they are related to the
same parameter through their product. This correlation increased the
number of iterations required when b and k were estimated
simultaneously. A more stable solution was reached for the estimation of their
product d, the shape parameter. This feature of reparameterization will
be applied at the end of this chapter for a better mapping of the GGD.
Hager and Bain (1970) and Hager et al. (1971) presented a detailed
analysis of the properties of maximum likelihood parameter estimates for
the GGD. Their main finding was that these estimates are "ill-behaved,"
in that the solution to the likelihood equations may not exist, and
even when it does exist, the convergence of the iterative procedure is
very slow. Also they found that $\hat{k}$, $\hat{b}/b$ and $(\hat{a}/a)^{b}$ are distributed
independently of a and b, and that the assumed asymptotic normal
distribution of the parameter k was not satisfied even for sample sizes as
high as 400. The last result was based on synthetic data generated with
k values of 1 and 2.
Prentice (1974) and Farewell and Prentice (1977) introduced a
reparameterization and a logarithmic transformation to the GGD. From the
new form of the distribution they showed that the lognormal distribution
is a special case of the GGD ($k \to \infty$). For a better mapping of the
lognormal distribution in the parameter space, they defined a new parameter
$q = k^{-1/2}$. Therefore, as q approaches zero the GGD will approach the
lognormal distribution. When Equation 3.1.1 is written for the variate
$Y = \log y$, the following relationships may be developed:
$$ Y = \log a + \frac{z}{b} \tag{3.1.3a} $$
or
$$ z = b\,(Y - \log a) \tag{3.1.3b} $$
where z is the reduced variate with pdf
$$ f_z(z, k) = \frac{1}{\Gamma(k)} \exp(kz - e^{z}) \tag{3.1.4} $$
This is the log-GGD (LGGD) probability density function: Y follows a
location-scale model whose reduced variate z is the logarithm of a
$\Gamma(k)$ variate (Prentice, 1977).
The LGGD as defined above has the extreme value Type I (k = 1) and
the lognormal ($k \to \infty$) as special cases. The relation between the
lognormal and gamma distribution was established by Bartlett and Kendall
(1946). By introducing the reparameterization $q = k^{-1/2}$ Prentice (1974)
extended the GGD to negative q; this was accomplished by reflecting the
pdf at fixed q about the origin (Figure 3.1). The reduced variate z,
Equation 3.1.3b, has a cumulant generating function $\log(\Gamma(k+x)/\Gamma(k))$,
from which the mean and variance of z are
$$ \mu_z = \psi(k) = \frac{d \log \Gamma(k)}{dk} \tag{3.1.5} $$
and
$$ \sigma_z^2 = \psi'(k) = \frac{d\psi(k)}{dk} \tag{3.1.6} $$
where $\psi(k)$ and $\psi'(k)$ are the digamma and trigamma functions,
respectively. Prentice (1977) showed that the standardized variate
$$ z_0 = \frac{z - \mu_z}{q} = (z - \mu_z)\, k^{1/2} \tag{3.1.7} $$
has mean zero and variance approaching one as k approaches $\infty$ (q = 0).

Figure 3.1. Generalized Gamma Distribution.

Equations 3.1.3a and 3.1.7 may be combined to give the following
expression for Y
$$ Y = m + \sigma z_0 \tag{3.1.8} $$
where
$$ m = \log a + \frac{\mu_z}{b} \tag{3.1.9} $$
and
$$ \sigma = \frac{1}{b}\, k^{-1/2} \tag{3.1.10} $$
With these parameters, the LGGD pdf given in Equation 3.1.4 is extended
to
$$ h(Y, m, \sigma, q) = \begin{cases}
\dfrac{|q|}{\sigma\, \Gamma(q^{-2})} \exp\left( z q^{-2} - e^{z} \right), & q \neq 0 \\[2ex]
\dfrac{1}{\sqrt{2\pi}\, \sigma} \exp\left( -\dfrac{1}{2} z_0^2 \right), & q = 0
\end{cases} \tag{3.1.11} $$
where
$$ z = q\, \frac{(Y - m)}{\sigma} + \mu_z , \qquad z_0 = (Y - m)/\sigma , \qquad \text{and} \qquad \mu_z = \psi(q^{-2}) . $$
Extreme value Type I distributions for maxima and minima are easily
shown to be special cases of the LGGD for q=l and 1, respectively. The
normal distribution is now at the center of the parameter space, q=0.
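The moments of the reduced variate (Equations 3.1.5 and 3.1.6) and the limiting unit variance of the standardized variate (Equation 3.1.7) can be checked numerically. The sketch below is an illustration under stated assumptions: the digamma and trigamma functions are not in the Python standard library, so they are approximated here with the standard upward recurrence followed by a truncated asymptotic series, and the function names are hypothetical.

```python
import math


def digamma(x):
    """psi(x) for x > 0: upward recurrence, then an asymptotic series."""
    total = 0.0
    while x < 6.0:
        total -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = (
        math.log(x)
        - 0.5 / x
        - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    )
    return total + series


def trigamma(x):
    """psi'(x) for x > 0: upward recurrence, then an asymptotic series."""
    total = 0.0
    while x < 6.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv + 0.5 * inv2 + inv * inv2 * (
        1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0)
    )
    return total + series


def standardized_variance(k):
    """Variance of z0 = (z - mu_z) * sqrt(k), i.e. k * psi'(k)."""
    return k * trigamma(k)


if __name__ == "__main__":
    for k in (1.0, 10.0, 100.0, 10000.0):
        # mean of z, variance of z, variance of the standardized variate
        print(k, digamma(k), trigamma(k), standardized_variance(k))
```

The last column decreases toward one as k grows, in line with the statement that the standardized variate has unit variance in the limit q = 0.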
The main feature of this reparameterization is the reduction of all
the distributions included in the LGGD to the simple form of a location
and scale type distribution. Such parameters are easily estimated
due to their linearity with the reduced variates. Prentice (1974)
noticed that the number of parameters to estimate may be reduced by one
since a maximum likelihood estimate of m is given by the mean of $Y = \log y$
for all values of q,
$$ \hat{m} = \bar{Y} = \frac{1}{n} \sum_{i=1}^{n} \log y_i \tag{3.1.12} $$
The number of parameters estimated iteratively is then reduced to two,
namely $\sigma$ and q. From a simulation study where the estimated parameters
were compared to their true values, Prentice found that although true
and unbiased starting values for q and $\sigma$ were used, the iterative
procedure failed to converge for 5 out of 400 samples of size 25 from the
normal distribution (q = 0). The rate of failure was much higher for
samples of the same size from the extreme value distribution (q = 1).
The efficiency of the estimate $\hat{\sigma}^2$ ranged from 100% to 84% for the normal
and extreme value distributions, respectively. Such convergence
problems result from the high nonlinearity of the maximum likelihood
equations in the parameters q and $\sigma$. These are
$$ \sum_{i=1}^{n} \left[ (e^{z_i} - q^{-2})\, q z_{0i} \right] = n \tag{3.1.13a} $$
$$ \sum_{i=1}^{n} \left[ (e^{z_i} - q^{-2})\, q z_{0i} \right] - 2 \sigma_z^2 q^{-2} \sum_{i=1}^{n} (e^{z_i} - q^{-2}) = n \tag{3.1.13b} $$
where
$$ z_i = q\, \frac{Y_i - \bar{Y}}{\sigma} + \mu_z , \qquad z_{0i} = \frac{Y_i - \bar{Y}}{\sigma} , $$
and n = sample size.
The asymptotic variance-covariance matrix of the parameter
estimates (m, $\sigma$, q) was shown to reduce to a diagonal matrix with elements
$\sigma^2/n$, $\sigma^2/2n$ and $6/n$ for q = 0.
3.1.3. Linear regression and confidence limits
Farewell and Prentice (1977) applied a slightly simplified version
of Equation 3.1.11 to several data sets from the industrial and medical
literature to study the shape of the GGD. The simplified
distribution expressed in terms of z and k is
$$ f(y, z, k) = \frac{k^{k-1/2}}{\Gamma(k)} \exp\left[ \sqrt{k}\, z - k e^{z/\sqrt{k}} \right] \tag{3.1.14} $$
which, as shown previously, approaches the normal distribution as k
approaches infinity.
Farewell and Prentice included explanatory variables $x = (x_1, ..., x_p)$.
These regression variables were assumed to be linear in the
log-transformed variables Y, such that
$$ Y = m + x\theta + \sigma z \tag{3.1.15} $$
with m and $\sigma$ as defined above, z the reduced variate following the GGD
(Equation 3.1.4), and $\theta = (\theta_1, ..., \theta_p)$ = regression coefficients of the
explanatory variables. This equation is exactly the same as Equation
2.2.18 with $g(x,\theta) = m + x\theta$, and $E(z_i) = z$. The maximum likelihood estimate
of the 100 pth percentile of Y at a given $x = x_0$ and a specified
distribution shape $q = q_0$ is
$$ \hat{Y}_p = \hat{m}(q_0) + x_0\, \hat{\theta}(q_0) + \hat{\sigma}(q_0)\, z_p \tag{3.1.15a} $$
where $\hat{m}$, $\hat{\theta}$, and $\hat{\sigma}$ are the maximum likelihood estimates of the parameters
and $z_p$ is the 100 pth percentile of the GGD.
Given the linearity of Equation 3.1.15 the variance of the
predicted percentile $\hat{Y}_p$ is estimated directly by the general linear
regression. Such solutions may be found in many statistical textbooks, e.g.,
Draper and Smith (1964, p. 61),
$$ \sigma^2_{\hat{Y}_p} = X_0\, V(q_0)\, X_0^{T} \tag{3.1.16} $$
where $V(q_0)$ is the variance-covariance matrix of the parameter
estimates, and
$$ X_0 = (1, x_0, z_p) \tag{3.1.16a} $$
is the coefficient matrix derived from Equation 3.1.15. The variance of
the estimates, and approximate confidence limits for $\hat{Y}_p$ can be
calculated at any confidence level $p_0$,
$$ Y_{p,p_0} = \hat{Y}_p + z_{p_0}\, \sigma_{\hat{Y}_p} \tag{3.1.17} $$
where $z_{p_0}$ is the GGD reduced variate corresponding to its cumulative
distribution having a value equal to $p_0$. If there is no explanatory
variable, Equation 3.1.15 will be of the type of Equation 3.3.1 (to be
discussed later) and Equation 3.1.16 reduces to the simple expression
for variance given by Equation A.10.
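The variance of a predicted percentile (Equation 3.1.16) is a quadratic form in the coefficient vector, and the confidence limit (Equation 3.1.17) follows from it. The sketch below only illustrates that arithmetic: the function names are hypothetical, the covariance matrix and reduced variates in the usage block are made-up numbers, and the reduced variate for the confidence level is passed in as an argument rather than computed from the GGD.

```python
import math


def prediction_variance(x0, V):
    """Quadratic form X0 V X0^T of Equation 3.1.16."""
    n = len(x0)
    return sum(x0[i] * V[i][j] * x0[j] for i in range(n) for j in range(n))


def confidence_limit(y_hat_p, x0, V, z_p0):
    """Equation 3.1.17: predicted percentile plus z_p0 standard errors."""
    return y_hat_p + z_p0 * math.sqrt(prediction_variance(x0, V))


if __name__ == "__main__":
    # Hypothetical case with one explanatory variable: parameters (m, theta, sigma).
    V = [
        [0.040, 0.002, 0.000],
        [0.002, 0.010, 0.001],
        [0.000, 0.001, 0.020],
    ]
    x0 = [1.0, 0.8, 1.5]  # (1, x_0, z_p)
    print(prediction_variance(x0, V))
    print(confidence_limit(3.2, x0, V, 1.645))
```

With no explanatory variable the coefficient vector reduces to (1, z_p) and V to a 2 x 2 matrix, so the same function covers the simple location-scale case.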
Bain and Englehardt (1981) started from Equation 3.1.14 to develop
simple procedures for approximating confidence limits for the parameters
and the predictions of the Weibull and extreme value distributions.
Their results were based on chi-square and Student's t approximations
to some functions of the GGD maximum likelihood parameter estimates.
The accuracy of their approximations was checked against results of
Monte Carlo simulations and was found adequate for most applied problems.
Farewell and Prentice (1977) found no particular shape appropriate
for their data and also reported a high variability of the estimated
coefficients of the explanatory variables with the parameter q. But
they found that addition of q to the GGD was very useful for
accommodating a variety of shapes in the tail of the distribution and for
identifying outliers.
Table 3.1 gives a summary of the special distributions and
corresponding parameters of the GGD.
Table 3.1. Special Distributions of the GGD.
Distribution
Parameters
a
b
k
q
Normal
0
1
CO
0
Extreme I max
a
b
1
i
Extreme I min
a
b
1
i
Weibull
a
b
1
i
Rayleigh
*c/2~
2
1
i
Gamma
a
1
k
*c is the parameter of the Rayleigh distribution
Equation 3.2.11.
3.1.4. Summary and implications
From this review of the GGD historical development, the
interrelation between most probability models introduced in Chapter 2 becomes
obvious. An adequate reparameterization of these distributions along
with a logarithmic transformation of the dependent variable results in
very simple relations between reduced variates and the new parameters
(Equations 3.1.8 and 3.1.15). But even with these relations, estimation
of the original parameters remains very difficult, due first to their
dependence (Equations 3.1.9 and 3.1.10) and second to the high
nonlinearity of the resulting maximum likelihood equations (Equation
3.1.13a,b). These problems are mainly due to the poor parameterization
of the GGD. Because such parameterization was based on the form of
classical distributions, the GGD resulted in an overparameterization. A
good illustration of this overparameterization is given by the modeling
of the distribution shape, described by three parameters, b, k (Equation
3.1.1) and the logarithmic transformation (Equation 3.1.4). The
equivalence of the logarithmic transformation to the addition of a shape
parameter was given in Chapter 2, Section 2.3.3. Any one of these three
parameters would suffice for an adequate modeling of any distribution
shape, although transformation of the dependent variable seems to be the
most appropriate for this purpose. As will be seen below, incorporation
of the shape parameter into the Box-Cox transformation results in much
simpler equations that are linear in the location and scale parameters
and in the transformed dependent variable.
Tukey (1957) was the first to suggest the power transformation to
reduce random variables to approximate normality. The transformation he
analyzed was of the type of Equation 2.3.7. Later Box and Cox (1964)
modified the Tukey transformation to the transformation defined by
Equation 2.3.6 and applied it to linear regression analysis. Box and
Cox showed that for this type of analysis assumptions such as (1)
linearity of structure of expected values, (2) constancy of error variance,
(3) normality of distribution, and (4) independence of observations are
better satisfied in terms of transformed than original observations.
Hinkley (1975) applied the same transformation to exponential and gamma
random variables to reduce them to approximately symmetrical distributions
suitable for statistical inferences based on equi-tailed order
statistics.
Hernandez and Johnson (1980) investigated the large sample behavior
of the Box-Cox transformation. They evaluated the closeness of the
transformed variable distribution ($f_\alpha$) to the normal distribution ($\phi_{\mu\sigma}$)
using the Kullback-Leibler information number,
$$ I(f_\alpha, \phi_{\mu\sigma}) = \int f_\alpha(u) \log\left[ \frac{f_\alpha(u)}{\phi_{\mu\sigma}(u)} \right] du \tag{3.1.16} $$
The optimal transformation is the one that minimizes this number. For
many distributions from the gamma and Weibull families, Hernandez and
Johnson derived the optimal transformation leading to the best
approximation by the normal distribution. These values along with the ratio of
the information numbers for original data ($\alpha = 1$) and optimally
transformed data ($\alpha = \alpha_{opt}$) are given in Table 3.2. Note the improvement in
the goodness of fit ($I_{opt}$) and the performance of the transformation
($I_1/I_{opt}$) with the increase in the shape parameter of the gamma
distribution. This should be expected since the higher the value of the
shape parameter, the closer the gamma distribution approaches the normal
distribution ($k \to \infty$, Equation 3.1.14).
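For the unit exponential distribution the information number can be written in closed form, which allows a quick check of the exponential row of Table 3.2. The sketch below is an illustration, not the procedure of Hernandez and Johnson: it assumes that the closest normal distribution matches the mean and variance of the transformed variable, and it uses the identities H(X) = 1 and E[log X] = -0.5772... (Euler's constant) for the unit exponential. The function names and the search grid are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def kl_exponential_to_normal(alpha):
    """Information number for a Box-Cox transformed unit exponential variate.

    I = 0.5*log(2*pi*e*Var(Y)) - H(Y), with Y = (X**alpha - 1)/alpha, alpha != 0.
    """
    # Var(Y) from the moments E[X**r] = Gamma(r + 1) of the unit exponential.
    var_Y = (math.gamma(2.0 * alpha + 1.0) - math.gamma(alpha + 1.0) ** 2) / alpha ** 2
    # H(Y) = H(X) + E[log |dY/dX|] = 1 + (alpha - 1) * E[log X].
    entropy_Y = 1.0 + (alpha - 1.0) * (-EULER_GAMMA)
    return 0.5 * math.log(2.0 * math.pi * math.e * var_Y) - entropy_Y


def optimal_alpha(lo=0.05, hi=1.0, step=0.0005):
    """Grid search for the exponent minimizing the information number."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return min(grid, key=kl_exponential_to_normal)


if __name__ == "__main__":
    a_opt = optimal_alpha()
    i_1 = kl_exponential_to_normal(1.0)
    i_opt = kl_exponential_to_normal(a_opt)
    print(a_opt, i_1, i_opt, i_1 / i_opt)
```

The value at $\alpha = 1$ reproduces the tabulated 0.41894, and the grid minimum falls close to the tabulated exponent 0.2654.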
Table 3.2. Optimal Power Transformation and Information Number
Ratio. After Hernandez and Johnson (1980).

Distribution           $\alpha_{opt}$   $I_1$ ($\alpha = 1$)   $I_{opt}$ ($\alpha = \alpha_{opt}$)   $I_1/I_{opt}$
Gamma (a,1,0.5)        0.2084      0.98175         0.01205               81.5
Exponential (a,1,1)    0.2654      0.41894         0.00278               150
Gamma (a,1,1.5)        0.2887      0.26070         0.00140               186
Gamma (a,1,2)          0.3006      0.18830         0.00051               369
Gamma (a,1,3)          0.3124      0.12067         0.00019               635
Weibull (a,b,1)        0.2654b     -               -                     -
As an alternative to the classical form of the GGD a new class of
families of probability distributions based on the Box-Cox
transformation will be presented in the following sections. For fixed values of
the GGD shape parameters a family of distributions can be defined simply
by transformation of the dependent variable. This generalization of the
Box-Cox transformation to the GGD has apparently not been attempted in
any previous work. Also, the evaluation of the relative performances of
these generalized distributions and comparison to the performance of
special cases of the classical GGD based on real-world data are believed
to be among the unique aspects of this study.
3.2. New Parameterization of the GGD
3.2.1. Generalized normal distribution (GND)
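In the previous section it was seen that the lognormal distribution
is a special case of the GGD (q = 0), Equation 3.1.7. If the logarithmic
transformation is replaced by the more versatile power transformation
$$ Y = \begin{cases} \dfrac{y^{\alpha} - 1}{\alpha}, & \alpha \neq 0 \\ \log y, & \alpha = 0 \end{cases} \tag{3.2.1} $$
a more general distribution of the variable y will be defined while the
distribution of Y is approximated by the normal pdf
$$ f(Y) = \frac{1}{\sigma_Y \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{Y - \mu_Y}{\sigma_Y} \right)^2 \right] \tag{3.2.2} $$
This distribution has the normal ($\alpha = 1$), lognormal ($\alpha = 0$) and the Nth
root normal ($\alpha = N^{-1}$) distributions as special cases. Its cumulative
distribution function (CDF) is
$$ F(Y) = \int_{-\infty}^{Y} \frac{1}{\sigma_Y \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{u - \mu_Y}{\sigma_Y} \right)^2 \right] du . \tag{3.2.3} $$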
For a given transformation ($\alpha = \alpha_0$) the only parameters of this
distribution are the location and scale, $\mu_Y$ and $\sigma_Y$, respectively. The
expected order statistic $E(z_i)$ of Equation 2.2.22 is
$$ E(z_i) = F^{-1}\left( \frac{i}{n+1} \right) \tag{3.2.4} $$
where $F^{-1}$ is the inverse of the normal CDF, and the argument is the
plotting position, defined in Chapter 2. Equations 3.2.2, 3.2.3 and
3.2.4 cannot be evaluated analytically. Several approximations of these
equations may be found in the literature, e.g., Abramowitz and Stegun
(1964, pp. 932-933). By equating the expected order statistic to the
standardized normal variate the following equation is obtained (for
simplicity of notation $E(z_i)$ will be replaced by Z for the rest of the
text),
$$ Y = \mu_Y + \sigma_Y Z \tag{3.2.5} $$
which is the same as Equation 2.2.24. The parameters $\mu_Y$ and $\sigma_Y$ can be
estimated directly by simple linear regression. From Chapter 2 we saw
that in addition to the simplicity of their evaluation these estimates
are unbiased and have minimum variance.
3.2.2. Generalized extreme value distribution (GED)
Reduction of a set of observations to a distribution other than
normal by a power transformation was introduced by Maalel and Triki
(1979), although the transformation they used (Equation 2.3.14) was not
continuous at $\alpha = 0$. Aitkin and Clayton (1980) used the same equation to
define a generalized extreme value distribution and recommended the
Box-Cox transformation to ensure continuity at zero. The pdf of the GED
is
$$ f(Y) = \frac{1}{b} \exp\left[ -\left( \frac{Y - a}{b} \right) - e^{-(Y-a)/b} \right] \tag{3.2.6} $$
where Y is defined by Equation 3.2.1. It can easily be shown that the
extreme value distributions and the Weibull distribution are special
cases of the GED with $|\alpha| = 1$ and $\alpha = 0$, respectively. The CDF of Equation
3.2.6 is
$$ F(Y) = \exp\left[ -\exp\left( -\frac{Y - a}{b} \right) \right] \tag{3.2.7} $$
Contrary to the normal distribution, these equations can be evaluated
analytically. The inverse of the CDF exists. Letting F(Y) = P, the
above equation may be solved for Y to yield
$$ Y = a - b \log\left[ -\log(P) \right] \tag{3.2.8} $$
By definition of the expected order statistic Z, Equation 3.2.4, this
relation may be written as
$$ Y = a + b Z \tag{3.2.9} $$
where Z is evaluated directly for any probability level P by the
following equation
$$ Z = F^{-1}(P) = -\log\left( -\log(P) \right) \tag{3.2.10} $$
Equation 3.2.9 is exactly the same as Equation 3.2.5; therefore, the
same procedure may be followed for the estimation of its parameters.
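Because the CDF of the GED inverts in closed form, the reduced variate needs no tables. The sketch below illustrates this with the extreme value Type I form for maxima, F(Y) = exp(-exp(-(Y - a)/b)), which is the form assumed here for Equations 3.2.7 to 3.2.10; the function names and the parameter values are hypothetical.

```python
import math


def ged_cdf(Y, a, b):
    """CDF of the transformed variable, extreme value Type I (maxima) form."""
    return math.exp(-math.exp(-(Y - a) / b))


def ged_reduced_variate(P):
    """Reduced variate Z = -log(-log(P)) for 0 < P < 1."""
    return -math.log(-math.log(P))


def ged_quantile(P, a, b):
    """Transformed variable at probability level P: Y = a + b*Z."""
    return a + b * ged_reduced_variate(P)


if __name__ == "__main__":
    a, b = 4.0, 1.5  # hypothetical location and scale
    for P in (0.1, 0.5, 0.9, 0.99):
        Y = ged_quantile(P, a, b)
        # The CDF evaluated at the quantile returns the probability level.
        print(P, Y, ged_cdf(Y, a, b))
```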
3.2.3. Generalized Rayleigh distribution (GRD)
If the parameters of the GGD defined by Equation 3.1.1 are set to
the following values $a = c\sqrt{2}$, b = 2 and k = 1, the resulting distribution is
known as the Rayleigh distribution (Benjamin and Cornell, 1970, p. 301;
Stacy and Mihram, 1965). The GRD is defined by the power transformation
of the observed variable y (Equation 3.2.1). The pdf of Y is
$$ f(Y) = \frac{(Y - d)}{c^2} \exp\left[ -\frac{1}{2} \left( \frac{Y - d}{c} \right)^2 \right] \tag{3.2.11} $$
and its CDF is
$$ F(Y) = 1 - \exp\left[ -\frac{1}{2} \left( \frac{Y - d}{c} \right)^2 \right] \tag{3.2.12} $$
Here again, analytical evaluation of these equations is
straightforward, and the inverse of the CDF exists. Letting F(Y) = P, Equation
3.2.12 may be solved for Y, yielding
$$ Y = d + c \left[ -2 \log(1 - P) \right]^{1/2} \tag{3.2.13} $$
which in terms of the expected order statistics reduces to
$$ Y = d + c Z \tag{3.2.14} $$
where Z may be evaluated for any probability level, P, directly by the
following equation
$$ Z = F^{-1}(P) = \left[ -2 \log(1 - P) \right]^{1/2} \tag{3.2.15} $$
Equation 3.2.14 is the same as Equation 3.2.9; consequently, the same
solution procedure may be followed for the estimation of its parameters.
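The estimation procedure shared by these location-scale forms can be sketched for the Rayleigh case: compute the reduced variate at each plotting position, then regress the ranked transformed observations on it. The sketch is illustrative only. It assumes the Rayleigh CDF F(Y) = 1 - exp[-((Y - d)/c)^2 / 2], the Weibull plotting position i/(n+1), and hypothetical function names.

```python
import math


def rayleigh_reduced_variate(P):
    """Reduced variate Z = [-2 log(1 - P)]**(1/2) for 0 <= P < 1."""
    return math.sqrt(-2.0 * math.log(1.0 - P))


def fit_location_scale(Y_sorted, Z):
    """Least-squares estimates (d, c) of the line Y = d + c*Z."""
    n = len(Z)
    z_bar = sum(Z) / n
    y_bar = sum(Y_sorted) / n
    szz = sum((z - z_bar) ** 2 for z in Z)
    szy = sum((z - z_bar) * (y - y_bar) for z, y in zip(Z, Y_sorted))
    c = szy / szz
    return y_bar - c * z_bar, c


if __name__ == "__main__":
    n = 20
    Z = [rayleigh_reduced_variate(i / (n + 1)) for i in range(1, n + 1)]
    # Synthetic ranked values lying exactly on the line Y = 2 + 3*Z.
    Y = [2.0 + 3.0 * z for z in Z]
    print(fit_location_scale(Y, Z))
```

Swapping in a different reduced-variate function gives the corresponding fit for another family, since only Z changes between them.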
Table 3.3 lists some of the classical distributions which are
special cases of the new families of generalized distributions.
3.2.4. Generalized Pearson distribution (GPD)
The Pearson Type III distribution, Equation C.2.4, was shown to be
a special case of the gamma distribution, which itself is a special case
of the GGD with b = 1. The pdf of the GPD expressed in terms of the power
transformed variable, Y, is
$$ f(Y) = \frac{\lambda^{k}}{\Gamma(k)} (Y - d)^{k-1} \exp\left[ -\lambda (Y - d) \right] \tag{3.2.16} $$
Definition of the different terms is the same as for Equation 2.3.7b.
Table 3.3. Special Distributions of the New Parameterized Generalized
Distributions.

                                         Parameters
Family   Distribution          Location    Scale          Shape
GND      Normal                $\mu_y$       $\sigma_y$        1
         Lognormal             $\mu_Y$       $\sigma_Y$        0
         Square root normal    $\mu_Y$       $\sigma_Y$        0.5
         Nth root normal       $\mu_Y$       $\sigma_Y$        1/N
         Inverse normal        $\mu_Y$       $\sigma_Y$        -1
GED      Extreme I max         a           b              1
         Frechet               a           b              0
         Extreme I min         a           b              -1
GRD      Rayleigh              d           $c\sqrt{2}$     1
         Log Rayleigh          d           $c\sqrt{2}$     0
GPD      Pearson III           d           $1/\lambda$     1
         Log Pearson III       d           $1/\lambda$     0

N = positive integer
This distribution is much more complicated to deal with than the
previous ones because it has four parameters. In fact the shape of the
distribution is modeled by two parameters, k and the power exponent $\alpha$.
As mentioned earlier, this is an overparameterization of the shape of
the distribution; for fixed k the power transformation will introduce
enough flexibility in Equation 3.2.16 to fit any shape of the observed
data. Thus, the parameter k will be replaced by its moment or maximum
likelihood estimate, reducing to three the number of parameters to be
estimated. Elimination of the shape parameter from the estimation
scheme was reported by Kite (1977) and applied by Siswadi (1981) and
Quesenberry and Kent (1982) for the selection among probability
distribution models. The moment estimate is
$$ \hat{k} = \left( \frac{\mu_z}{\sigma_z} \right)^2 \tag{3.2.17a} $$
where $\mu_z$ and $\sigma_z$ are the mean and standard deviation of the gamma reduced
variate (Equation C.2.3c).
Alternatively, the maximum likelihood estimate of k can be
evaluated using the polynomial approximation of Greenwood and Durand (1960),
$$ \hat{k} = (0.5000876 + 0.1648852\, C - 0.0544274\, C^2)/C , \qquad 0 < C \le 0.5772 $$
and
$$ \hat{k} = \frac{8.898919 + 9.059950\, C + 0.9775373\, C^2}{(17.79728 + 11.968477\, C + C^2)\, C} , \qquad 0.5772 < C \le 17 \tag{3.2.17b} $$
where $C = \log(\bar{Y}) - \overline{\log(Y)}$.
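The two-branch approximation can be coded directly. The sketch below assumes the coefficients of the published Greenwood and Durand approximation, with a linear coefficient of 0.1648852 in the first branch (the value for which the two branches agree near C = 0.5772), and takes C as the log of the sample mean minus the mean of the logs. The function names are hypothetical.

```python
import math


def gamma_shape_mle(C):
    """Approximate maximum likelihood estimate of the gamma shape k from C."""
    if not 0.0 < C <= 17.0:
        raise ValueError("the approximation is defined for 0 < C <= 17")
    if C <= 0.5772:
        return (0.5000876 + 0.1648852 * C - 0.0544274 * C * C) / C
    return (8.898919 + 9.059950 * C + 0.9775373 * C * C) / (
        (17.79728 + 11.968477 * C + C * C) * C
    )


def c_statistic(sample):
    """C = log(mean of the sample) - mean of log(sample), for positive data."""
    n = len(sample)
    mean = sum(sample) / n
    mean_log = sum(math.log(v) for v in sample) / n
    return math.log(mean) - mean_log


if __name__ == "__main__":
    # For an exact gamma variate C = log(k) - psi(k); k = 2 gives C = 0.270363.
    print(gamma_shape_mle(0.270363))
    sample = [1.2, 0.7, 3.4, 2.1, 0.9, 1.8, 2.6, 1.1]  # made-up values
    print(gamma_shape_mle(c_statistic(sample)))
```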
The CDF of the GPD is then
$$ F(Y) = \int_{d}^{Y} \frac{\lambda^{k}}{\Gamma(k)} (u - d)^{k-1} \exp\left[ -\lambda (u - d) \right] du = \Gamma_z(k)/\Gamma(k) \tag{3.2.18} $$
where $\Gamma_z(k)$ is the incomplete gamma function with argument $z = \lambda(Y - d)$.
The reduced variate, z, is equal to the inverse of the CDF, and to the
expected order statistic. Therefore, the transformed variable Y is
equal to
$$ Y = d + \frac{1}{\lambda} Z \tag{3.2.19} $$
with $Z = F^{-1}(P)$. Note that for the GPD as for the GND, f, F and $F^{-1}$
cannot be evaluated analytically. Tables of these functions can be
found in many statistics books, such as Benjamin and Cornell (1970) and
Haan (1977). Abramowitz and Stegun (1964, pp. 940-946) have some
approximations for these functions. These are implemented in many computer
libraries such as IMSL (1979) and SAS (1982).
3.2.5. Other generalized distributions
As stated earlier, for any fixed values, $b_0$ and $k_0$, of the GGD
(Equation 3.1.1) a new generalized distribution can be defined by a
power transformation of the dependent variable. Some special
distributions from the GGD, resulting from such generalizations are given in
Table 3.4. These distributions are not considered further in this
study; they are listed for completeness. Note, however, that the
exponential and Weibull distributions are special cases of the GED, and the
chi and chi-square distributions are special cases of the GPD.
Other distributions not of the location and scale type and/or with
more than three parameters have limited application in reliability
analysis. Therefore, they have not been considered in this study,
although some of these distributions have been shown to be potentially
useful in flood frequency analysis. Such distributions include the
general lambda, the Wakeby, and the kappa distributions. All three
distributions are expressible in inverse form, a property that made
them of special interest to Greenwood et al. (1979), who showed that for
this type of distribution, probability weighted moments are the most
appropriate for parameter estimation.

Table 3.4. Other Special Distributions of the GGD Not
Included in This Study (after Stacy and Mihram, 1965).

                           Parameters
Distribution          a            b     k
Exponential           a            1     1
Weibull               a            b     1
Chi-square            2            1     $\nu/2$
Chi                   $\sqrt{2}$     2     $\nu/2$
Half normal           $\sqrt{2}$     2     1/2
Circular normal       $\sqrt{2}$     2     1
Spherical normal      $\sqrt{2}$     2     3/2

$\nu$ = degrees of freedom
The inverse forms of the lambda, Wakeby and kappa distributions are,
respectively,
$y = m + aF^{b} - c(1 - F)^{c}$    (3.2.20)
$y = m + a[1 - (1 - F)^{b}] - c[1 - (1 - F)^{-d}]$    (3.2.21)
$y = m + a[bF^{b}/(1 - F^{b})]^{1/(bc)}$    (3.2.22)
where m, a, b, c and d are the distribution parameters. The first two
equations do not even have an explicit form for F, the cumulative distribution
function. Note the nonlinearity in the parameters in addition
to the fact that all three distributions require fitting of at least
four parameters. Of the same type is the three parameter generalized
extreme value distribution analyzed by Prescott and Walden (1983)
x = m + ab[1 + (log F)1/b] (3.2.23)
A generalized lognormal distribution suitable for fitting a wide
range of positively skewed hydrologic data was developed by Brakensiek
(1958). This distribution was based on Chow's (1954) representation of
hydrologic variables in terms of a frequency factor, K,
$y / \mu_y = 1 + V_y K$    (3.2.24)
and the relation between the moments of the normal and lognormal distributions
(Appendix B). The transformation to lognormality was accomplished
through the following location and scale transformation
$Y / \mu_Y = b (y / \mu_y) + (1 - b)$    (3.2.25)
where $b = V_Y / V_y$, and $V_Y$ and $V_y$ are the coefficients of variation of Y and
y, respectively. The generalized distribution has the form
$Y / \mu_Y = \exp(cz - c^2/2)$    (3.2.26)
where $c = [\log(1 + V_Y^2)]^{1/2}$    (3.2.27)
and z is the reduced normal variate. This distribution was extended to
the normal and extreme value distributions through first order approximations
of the two previous equations,
$Y / \mu_Y \approx 1 + cz$    (3.2.28)
$c \approx V_Y$    (3.2.29)
Similar approximations will be used in Chapter 5. Note that Equation
3.2.28 is similar to Equation 5.2.2. An iterative procedure was
developed to choose the best transformation, b (Equation 3.2.25), satisfying
Equation 3.2.28. The power transformation adopted within this
dissertation (Equation 3.2.1) may easily be shown to be equivalent to a
location-scale transformation in the log space (Equation 2.2.25). For
the development of the new generalized distributions, no such approximations
were made since the parent distributions are defined in the
untransformed space, and the reduced variates have the same distribution
as the parent distributions, rather than being equal to the standardized
normal variate as in the Brakensiek formulation (Equation 3.2.28).
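As a rough numerical illustration of the first order approximation in Equations 3.2.28 and 3.2.29, the short Python sketch below compares the lognormal ratio Y/mu_Y with its linearized form. The function names are hypothetical and the coefficient of variation of 0.1 is an arbitrary illustrative value, not one taken from this study.

```python
import math


def lognormal_ratio(z, cv):
    """Y / mu_Y for a lognormal variate with coefficient of variation cv,
    written as exp(c z - c^2 / 2) with c = sqrt(log(1 + cv^2))."""
    c = math.sqrt(math.log(1.0 + cv * cv))
    return math.exp(c * z - 0.5 * c * c)


def first_order_ratio(z, cv):
    """First order approximation Y / mu_Y ~ 1 + c z, taking c ~ cv."""
    return 1.0 + cv * z


# Compare the two forms for a small coefficient of variation.
for z in (-1.0, 0.0, 1.0):
    print(z, lognormal_ratio(z, 0.1), first_order_ratio(z, 0.1))
```

The two forms agree closely only when the coefficient of variation is small; the gap widens as cv or |z| grows.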
3.2.6 Summary
By this new parameterization the GGD has a much simpler analytical
form: independently of the assumed parent distribution, the transformed
variables are related to the expected order statistics by the same type
of relation (Equation 3.2.30). The linear parameters are the location
and scale parameters, A and B, while the only nonlinear parameter, a,
is included in the dependent variable Y,
$Y = (y^{a} - 1)/a = A + BZ,  a \neq 0$
$Y = \log y = A + BZ,  a = 0$    (3.2.30)
where y is the untransformed observation, Z is the expected order statistic,
A and B are the linear parameters and a is the shape parameter.
The four generalized distributions are summarized in Table 3.5.
The new parameterization has the following advantages over the classical
form of the GGD.
1. It covers a larger family of classical distributions with a simpler
form in the parameters.
2. The linear parameters A and B may be estimated independently of the
nonlinear parameter a by ordinary least squares methods. Lawton and
Sylvestre (1971) and Spitzer (1982) showed that such a separation
increased the rate of convergence considerably, while Maalel (1983a)
found that such a procedure led to less biased estimates than simple
nonlinear methods.
3. Confidence intervals and statistical inferences are more easily
established in the transformed space. Atkinson (1983) gave some appli
cations of this type of transformation with the normal distribution for
eliminating outlier effects and displaying influential observations.
Based on the work of McCullagh (1980) and his own experience, Atkinson
suggested the use of a linear model along with the GGD given by Prentice
(1974) for more appropriate statistical investigations. The procedure
outlined in this chapter follows exactly this suggestion, although it
has been developed independently.
Table 3.5. New Generalized Family of Distributions.

Distribution     Transformed       Expected order            Plotting position
family           variable (Y)      statistic (Z)             (PP)
Normal (GND)     (y_i^a - 1)/a     Φ^{-1}(PP)                (i - 0.375)/(n + 1 - 2*0.375)
Gumbel (GED)     (y_i^a - 1)/a     -log(-log(PP))            (i - 0.44)/(n + 1 - 2*0.44)
Rayleigh (GRD)   (y_i^a - 1)/a     [-2 log(1 - PP)]^{1/2}    (i - 0.44)/(n + 1 - 2*0.44)
Pearson (GPD)    (y_i^a - 1)/a     F_G^{-1}(PP)              (i - 0.40)/(n + 1 - 2*0.40)

Φ^{-1} = inverse cumulative normal distribution function,
F_G^{-1} = inverse cumulative gamma distribution function,
i = rank of ordered data,
n = sample size,
PP = plotting position.
4. Regression of the transformed variables against the expected order
statistics allows a more efficient use of the information contained
in the observations, in that the rank (frequency) of the observation
contributes to the estimation in addition to the observed values.
Greenwood et al. (1979) showed the merit of using the order statistics
to derive analytical expressions for weighted moments of several distri
butions expressible in inverse form. The derived weighted moments were
of simpler analytical structure than the relationships between the
conventional moments and the parameters. This is in good agreement with
the simplicity of the expressions derived in the previous sections,
Equation 3.2.30. Landwehr and Matalas (1979) compared the probability
weighted moments parameters and quantile estimates of the Gumbel dis
tribution with the estimates from conventional moments and maximum
likelihood methods. Their results showed good agreement between the
weighted moments and the other two methods.
5. Inclusion of the order statistics (ranks transformation) in the
linear regression constitutes a bridge between parametric and non
parametric statistics. Conover and Iman (1981) and Iman and Conover
(1979) gave a good illustration of this type of regression showing that
it works quite well on monotonic data. This will always be the case for
the regression of the transformed variable against the expected order
statistics, where the CDF is always a monotonic function.
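The ingredients summarized in Table 3.5 can be sketched in a few lines of Python. This is a minimal illustration and not the GPDCP source: the function names are hypothetical, and the Pearson family is left out because its expected order statistics require an inverse gamma distribution function that is not available in the Python standard library.

```python
import math
from statistics import NormalDist


def box_cox(y, a):
    """Power transformation of Equation 3.2.30 (log y when a = 0)."""
    return math.log(y) if a == 0 else (y ** a - 1.0) / a


def plotting_positions(n, c):
    """PP_i = (i - c) / (n + 1 - 2c) for ranks i = 1..n."""
    return [(i - c) / (n + 1 - 2.0 * c) for i in range(1, n + 1)]


def expected_order_statistics(n, family):
    """Expected order statistics Z for three of the four families."""
    if family == "normal":
        return [NormalDist().inv_cdf(p) for p in plotting_positions(n, 0.375)]
    if family == "gumbel":
        return [-math.log(-math.log(p)) for p in plotting_positions(n, 0.44)]
    if family == "rayleigh":
        return [math.sqrt(-2.0 * math.log(1.0 - p))
                for p in plotting_positions(n, 0.44)]
    raise ValueError("unsupported family: " + family)


print(expected_order_statistics(5, "normal"))
print(expected_order_statistics(5, "gumbel"))
```

Because the transformation is monotonic for positive data, ranking the raw observations once is enough; only the transformed values change with a.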
3.3. Generalized Probability Distribution Computer Program (GPDCP)
3.3.1. Solution algorithm
In the previous section it was shown that the four newly parameterized
families of distributions, GND, GED, GRD and GPD, reduce to the
simple form of Equation 3.2.30. This equation relates the expected
order statistics to the transformed variables by a relation linear in
the location and scale parameters, A and B, respectively,
$Y = A + BZ$    (3.3.1)
Given a sample of n observations $y = (y_1, y_2, ..., y_n)$, the expected
order statistics $Z = (Z_1, Z_2, ..., Z_n)$ are calculated directly from the
approximate formula for the plotting position (Equation 2.2.22). Then
for a given a, the data are transformed using Equation 2.3.6. Note that
this transformation is a monotonic function and, consequently, does not
change the rank of the original data and the expected order statistics.
Box and Cox (1964) replaced the parameters A and B by their maximum
likelihood estimates, in the search for the best transformation to
normality (GND family). Also, they noted that the maximum likelihood
estimates of the linear parameters are the least squares estimates for
the transformed variable Y, and that the maximum likelihood estimate of
a is better given by solving for different transformations and choosing
the one that maximizes the likelihood function (Equation 2.2.7). Sim
ilar algorithms, where the linear parameters are eliminated from the
nonlinear estimation, have proven to be fast and efficient compared to
those where all parameters are estimated simultaneously. Such algorithms
have been applied by many authors (Lawton and Sylvestre, 1971;
Hernandez and Johnson, 1980; Spitzer, 1982; Maalel, 1983a). Like
Box and Cox, none of these authors included the expected order statistics
among the explanatory variables (independent variables) in their
models, although parameter estimates based on expected order statistics
are known to be more efficient and less biased than those estimated by
the more classical procedures (Chapter 2, Section 2.2).
Gupta (1970) developed a general program for the selection among
ten theoretical probability distributions from the normal and extremal
families. For most of these distributions the estimation of the
parameters was through linear regression of the observations on the
expected order statistics. The same procedure was used by Cunnane
(1978) in an investigation of unbiased plotting positions. While
Cunnane used the goodness of fit by a straight line (visual test from
plot) as the best selection criterion, Gupta used the coefficient of
determination $R^2$, defined as
$R^2 = 1 - SS_e / SS_t$    (3.3.2)
where $SS_e$ is the sum of squares of the predicted minus observed values,
and $SS_t$ is the sum of squares of the observed values minus their mean.
Thirriot et al. (1981) used regression analysis of transformed variables
(Equation 2.3.7) on expected order statistics calculated from the
Weibull plotting position to select among frequency distribution models
from the normal and extreme value Type I families. The selection cri
terion was the sum of squares of the untransformed residuals. Greenwood
et al. (1979) and Landwehr et al. (1978) used the expected order sta
tistics to derive probability weighted moments. These moments led to
less biased estimates from small generated data samples than the maximum
likelihood and the unweighted moments. Somerville and Bean (1982)
compared maximum likelihood to order statistic based least squares
methods and found that the least squares method often gave a substan
tially better fit to generated data, especially in the presence of
outliers or when the underlying distribution was not clearly estab
lished. Stedinger (1983b) recommended the use of probability weighted
moments with small samples, normalized by their sample means and loga
rithmically transformed to obtain consistent and accurate estimates of
normalized flood flow distribution parameters.
The procedure outlined in this section will combine the advantages
of the regression on the expected order statistics and the separation of
the linear parameters from the nonlinear regression. The only nonlinear
parameter is then the shape parameter a of the transformation
$Y = (y^{a} - 1)/a,  a \neq 0$
$Y = \log y,  a = 0$    (3.3.3)
Thus, the selection of the best distribution will be a one-dimensional
problem, since for each a the parameters A and B are found from simple
regression analysis (see Appendix A). A computer program (GPDCP) was
developed especially for the selection among frequency models from the
four generalized distributions summarized in Table 3.5. A detailed
description of this program is given in the next section. Figure 3.2
lists the main steps of the selection procedure followed by the GPDCP
program.
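The one-dimensional search described above can be sketched as follows for the normal family. This is an illustrative outline under stated assumptions, not the GPDCP code: the function names are hypothetical, the plotting position uses the 0.375 constant of Table 3.5, the coefficient of determination is used as the only selection criterion, and the sample is synthetic.

```python
import math
from statistics import NormalDist


def transform(y, a):
    """Power transformation of Equation 3.3.3."""
    return math.log(y) if a == 0 else (y ** a - 1.0) / a


def fit_line(z, Y):
    """Least squares fit of Y = A + B z; returns A, B and R^2."""
    n = len(z)
    zbar, Ybar = sum(z) / n, sum(Y) / n
    szz = sum((u - zbar) ** 2 for u in z)
    syy = sum((v - Ybar) ** 2 for v in Y)
    szy = sum((u - zbar) * (v - Ybar) for u, v in zip(z, Y))
    B = szy / szz
    return Ybar - B * zbar, B, szy * szy / (szz * syy)


def search_normal_family(sample, alphas):
    """For each trial shape a, regress the transformed data on the
    expected normal order statistics and keep the a with largest R^2."""
    y = sorted(sample)
    n = len(y)
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    best = None
    for a in alphas:
        A, B, r2 = fit_line(z, [transform(v, a) for v in y])
        if best is None or r2 > best[3]:
            best = (a, A, B, r2)
    return best


# Synthetic sample lying exactly on a lognormal line: log y = 2 + 0.5 z.
n = 20
z_true = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
sample = [math.exp(2.0 + 0.5 * zi) for zi in z_true]
a_best, A_best, B_best, r2_best = search_normal_family(sample, [-0.5, 0.0, 0.5, 1.0])
print(a_best, A_best, B_best, r2_best)
```

Only the shape a is searched; A and B come from a closed-form regression at every trial value, which is what makes the problem one-dimensional.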
3.3.2. Program description
A complete listing of the Generalized Probability Distribution
Computer Program (GPDCP) source program is given in Appendix D. This
section will be limited to a rather simplified description of the
capabilities and options of the program, with more emphasis on the
mathematical formulation of the estimation procedure and selection
criteria (statistics).
The GPDCP can handle simultaneously a virtually unlimited number of
stations, each with up to 10 samples of observations. The number of
observations per sample is limited to 60, but it can easily be increased
by simply changing the dimension statement. For each sample the best
frequency model may be selected from up to 100 (4 families and 25 transformations)
theoretical distributions.

Figure 3.2. Flow Chart for the Generalized Probability Distribution
Computer Program (GPDCP).
The observed data may optionally be multiplied by a scale factor
(CP) and shifted by a constant (CT) using the transformation
$y = y_0 \cdot CP + CT$    (3.3.4)
This option was introduced in the program to allow investigation of the
effect of change of units and standardization of the observations on the
selection procedure. When the input data are not ranked, subroutine
ORDER sorts them in ascending order. The empirical cumulative frequencies
are then calculated using the plotting formulae listed in Table
3.5; this calculation is performed by subroutine CDF. From these frequencies,
the expected order statistics are calculated using the inverse
of the cumulative distribution function of each family. The inverse of
the cumulative normal distribution, $\Phi^{-1}$, is solved for the reduced
normal variate (expected order statistic) using subroutine RNV. This
subroutine is based on a rational approximation for $\Phi^{-1}$ (Abramowitz and
Stegun, 1964, Equation 26.2.23). The inverse of the cumulative Pearson
distribution is evaluated using IMSL subroutine MDCHI (IMSL, 1979) and
the relation between the gamma and chi-square variates (Section C.2.7)
$z_G = z_{CH}/2,  v_{CH} = 2k$    (3.3.5)
where $z_G$ and $z_{CH}$ are the reduced variates of the gamma and chi-square
distributions, respectively, $v_{CH}$ is the degree of freedom of the chi-square
distribution and k, the shape parameter of the gamma distribution
(estimated by Equation 3.2.17a). The expected order statistics
for the Gumbel and Rayleigh general distributions are calculated
directly from the mathematical expressions for the inverse of their
respective cumulative distributions listed in Table 3.5.
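For readers without access to the original subroutine, the rational approximation of Abramowitz and Stegun (Equation 26.2.23) can be sketched in Python as below. This is an independent illustration, not the RNV source: the function name is hypothetical, and the comparison against the standard library inverse is included only as a check.

```python
import math
from statistics import NormalDist


def reduced_normal_variate(p):
    """Approximate inverse of the cumulative normal distribution,
    after Abramowitz and Stegun (1964), Equation 26.2.23."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = p if p <= 0.5 else 1.0 - p          # tail probability, 0 < q <= 0.5
    t = math.sqrt(-2.0 * math.log(q))       # t = sqrt(ln(1 / q^2))
    num = 2.515517 + 0.802853 * t + 0.010328 * t * t
    den = 1.0 + 1.432788 * t + 0.189269 * t * t + 0.001308 * t ** 3
    x = t - num / den                       # upper-tail quantile
    return -x if p < 0.5 else x


# Compare with the exact inverse from the standard library.
for p in (0.01, 0.25, 0.5, 0.9, 0.999):
    print(p, reduced_normal_variate(p), NormalDist().inv_cdf(p))
```

The approximation is adequate for plotting purposes; where tighter accuracy matters, an exact inverse such as the standard library routine used in the comparison is the safer choice.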
Regression analysis is then performed on the transformed observations
and expected order statistics, as dependent and independent variables,
respectively. The analysis is made according to the solution
procedure outlined in Appendix A by subroutine REGRE. For each family
and for all specified transformations (up to 100 models) the program
calculates four selection statistics. These are: 1. coefficient of
determination, 2. standard error, 3. weighted sum of squares, and 4.
maximum likelihood function. The four selection statistics were included
in the program to allow the comparison of their performance and
to give the user the option to choose his/her own criteria. Definitions
of these selection statistics are as follows:
1. Coefficient of Determination ($R^2$)
This coefficient is the same as the one defined by Equation 3.3.2,
except that herein the residuals refer to the transformed variable Y.
Thus, $R^2$ is the well known coefficient of determination of the fitted
straight line (Equation 3.3.1). It is calculated directly within the
subroutine REGRE by the formula
$R^2 = [\mathrm{Cov}(Z,Y)]^2 / [\mathrm{Var}(Z) \times \mathrm{Var}(Y)]$    (3.3.6)
where the variables are as defined above.
2. Standard Error (STDE)
This is defined by the square root of the mean square of deviations
between observed and predicted untransformed variables. The standard
error has exactly the same expression as the mean square error (MSE) of
Equation 2.3.9,
$STDE = [ \frac{1}{n} \sum_{i=1}^{n} (y_i - y_{p,i})^2 ]^{1/2}$    (3.3.7)
where $y_i$ is the observed variable and $y_{p,i}$ is predicted.
3. Weighted Sum of Squares (WSS)
From Section 2.2.5 it was seen that the maximum likelihood method
was equivalent to a weighted least squares, where the weights are
defined as the inverse of the variances (Equation 2.2.11),
$w_i = 1/\sigma_{y_i}^2$    (3.3.8)
Following the procedure adopted by Sorooshian and Dracup (1980) and
Sorooshian (1981), and first applied by Box and Hill (1974), the weights
of this equation are expressed in terms of the variance of the transformed
variable Y = f(y). From the relation between these two variances
(discussed in Chapter 5, Section 5.3.2) we have
$\sigma_y^2 = \sigma_Y^2 / [f'(y)]^2$    (3.3.9)
and, as will be shown in Equation 5.3.6,
$f'(y) = y^{a-1}$    (3.3.10)
where f' is the first derivative of the transformation (Equation 3.3.3).
Substitution of these expressions into Equation 3.3.8 gives
$w_i = 1/\sigma_{y_i}^2 = y_i^{2a-2} / \sigma_Y^2$    (3.3.11)
The weighted sum of squares then reduces to
$WSS = \sum_{i=1}^{n} \frac{y_i^{2a-2}}{\sigma_Y^2} (y_i - y_{p,i})^2 = \sum_{i=1}^{n} [(y_i - y_{p,i}) \, y_i^{a-1}]^2 / \sigma_Y^2$    (3.3.12)
The last expression is the one used by the GPDCP program for calculation
of the WSS. The variance $\sigma_Y^2$ is calculated by subroutine REGRE using
Equation A.5. Note that for our case the predicted values $y_p$ are calculated
from the predicted transformed variables $Y_p$ (Equation 3.3.1) and
the inverse of the Box-Cox transformation (Equation 3.3.3),
$y_p = (a Y_p + 1)^{1/a},  a \neq 0$
$y_p = \exp(Y_p),  a = 0$    (3.3.13)
4. Maximum Likelihood Function
Expressed in terms of the untransformed variable y, the maximum
likelihood function defined by Equation 2.2.7 reduces to
$L = \frac{J(a,y)}{(2\pi)^{n/2} \sigma_Y^{n}} \exp[ -\frac{1}{2} \sum_{i=1}^{n} ( \frac{Y_i - A - BZ_i}{\sigma_Y} )^2 ]$    (3.3.14)
where J(a,y) is the Jacobian of the transformation of Equation 3.3.3
(Box and Cox, 1964),
$J(a,y) = \prod_{i=1}^{n} | dY_i / dy_i | = \prod_{i=1}^{n} y_i^{a-1}$    (3.3.15)
The logarithm of the likelihood function, $\ell$, is, after substituting the
Jacobian into Equation 3.3.14,
$\ell = (a-1) \sum_{i=1}^{n} \log y_i - \frac{n}{2} \log(2\pi) - n \log(\sigma_Y) - \frac{1}{2\sigma_Y^2} \sum_{i=1}^{n} (Y_i - A - BZ_i)^2$    (3.3.16a)
If in the last term the variance is replaced by its maximum likelihood
estimate, Equation A.4, $\ell$ reduces to
$\ell = (a-1) \sum \log y_i - \frac{n}{2} \log(2\pi) - n \log(\sigma_Y) - \frac{n}{2}$    (3.3.16b)
For a given sample, the second and the last terms are constant; thus,
maximizing $\ell$ will be equivalent to maximizing the reduced expression
$MXLF = (a-1) \sum \log y_i - n \log(\sigma_Y)$    (3.3.17)
This is the fourth selection statistic calculated by the GPDCP program.
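The four selection statistics can be computed together once the straight line has been fitted. The Python sketch below is a hedged illustration rather than the REGRE subroutine: the function name and the small data set are hypothetical, and because Appendix A is not reproduced here, the variance of the transformed variable is assumed to be the residual sum of squares divided by n.

```python
import math


def selection_statistics(y, z, a):
    """R^2, STDE, WSS and MXLF for ranked positive observations y,
    expected order statistics z and shape parameter a."""
    n = len(y)
    Y = [math.log(v) if a == 0 else (v ** a - 1.0) / a for v in y]
    zbar, Ybar = sum(z) / n, sum(Y) / n
    szz = sum((u - zbar) ** 2 for u in z)
    syy = sum((v - Ybar) ** 2 for v in Y)
    szy = sum((u - zbar) * (v - Ybar) for u, v in zip(z, Y))
    B = szy / szz
    A = Ybar - B * zbar
    Yp = [A + B * u for u in z]
    # Back-transform the predictions to the untransformed space.
    yp = [math.exp(v) if a == 0 else (a * v + 1.0) ** (1.0 / a) for v in Yp]
    r2 = szy * szy / (szz * syy)
    stde = math.sqrt(sum((o - p) ** 2 for o, p in zip(y, yp)) / n)
    # ASSUMPTION: variance of Y taken as the residual mean square with divisor n.
    sigma2 = sum((o - p) ** 2 for o, p in zip(Y, Yp)) / n
    wss = sum(((o - p) * o ** (a - 1.0)) ** 2 for o, p in zip(y, yp)) / sigma2
    mxlf = (a - 1.0) * sum(math.log(v) for v in y) - n * math.log(math.sqrt(sigma2))
    return {"R2": r2, "STDE": stde, "WSS": wss, "MXLF": mxlf, "SIGMA2": sigma2}


# Small illustrative data set (not from the study).
stats = selection_statistics([1.0, 2.0, 4.0, 5.0, 9.0],
                             [-1.2, -0.5, 0.0, 0.5, 1.2], 1.0)
print(stats)
```

With this particular variance assumption and a = 1 the weights are all equal and WSS collapses to n, so the statistic only discriminates between models once the variance estimate or the transformation differs.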
The performance of the selected frequency distribution model is then
analyzed in more detail; residuals and confidence limits are calculated
for predicted values in real and transformed spaces. Statistical tests
including the Student t test, the F test, and the Durbin-Watson statistic
are performed on the residuals of the linear regression. A simple description of
these tests follows:
Student t test. The Student t statistic tests the deviation of
each observation from the fitted straight line. The t statistic is
calculated from Equation 2.2.25,
$t = \Delta Y_i / \sigma_{Y_i} = e_i / \sigma_{Y_i}$    (3.3.18)
where $\sigma_{Y_i}^2$ is the variance of individual predictions, calculated using
Equation 2.2.23. This statistic is then compared to the value $t_0(v, 1 - p_0/2)$,
where v is the number of degrees of freedom (= n - 3) and $p_0$ is the
percent confidence (reliability) level at which the comparison is made.
For given v and $p_0$, t (Equation 3.3.18) should be less than or equal to
$t_0(v, 1 - p_0/2)$ to conclude that the observation is within the $p_0$ percent
confidence interval about the fitted line. Tables for $t_0$ are available in
many statistics textbooks. The GPDCP program calculates $t_0$ by calling
the IMSL subroutine MDSTI (IMSL, 1979).
F test. This is the appropriate test statistic for testing the
linearity of the regression function. The F statistic is defined as the
ratio of the mean square due to regression to the mean square due to
residual variation (Draper and Smith, 1966, p. 24). Each of these two
means, when multiplied by its degrees of freedom, follows a $\chi^2$ distribution
with 1 and n-2 degrees of freedom, respectively,
$F = \frac{MS_R}{MS_e} = \frac{\sum (Y_p - \bar{Y})^2 / 1}{\sum (Y_o - Y_p)^2 / (n-2)}$    (3.3.19)
where $Y_o$ is the observed transformed variable, $Y_p$ its predicted value
and $\bar{Y}$ the mean of the observed values.
The ratio of two independent $\chi^2$ distributed random variables, each divided
by its degrees of freedom, is known to follow
an F distribution with degrees of freedom equal to those of the $\chi^2$
distributions, 1 and n-2 for this case. Draper and Smith noted that
this statistic is exactly the same as the Student t test for a zero
slope in the case of fitting a straight line.
The F statistic and its corresponding tail area of the F distribution
are calculated in the GPDCP program by calling the IMSL subroutine
RLONE.
Durbin-Watson statistic. This statistic is used in testing for
first order linear correlation in the residuals. It is defined as
$DW = \sum_{i=2}^{n} (e_i - e_{i-1})^2 / \sum_{i=1}^{n} e_i^2$    (3.3.20)
and is always in the interval 0 to 4. A DW value significantly smaller
(larger) than 2 indicates the presence of positive (negative) correlation.
Durbin and Watson (1951) estimated significance points for the 1,
2.5 and 5 percent levels for DW. This statistic, too, is calculated
within the RLONE subroutine.
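Both regression diagnostics are simple to compute from the fitted values and residuals. The following Python sketch is illustrative only and does not reproduce the IMSL routine RLONE; the function names and the three-point example are hypothetical.

```python
def f_statistic(Y_obs, Y_pred):
    """Ratio of the regression mean square (1 degree of freedom) to the
    residual mean square (n - 2 degrees of freedom) for a fitted line."""
    n = len(Y_obs)
    Ybar = sum(Y_obs) / n
    ms_regression = sum((p - Ybar) ** 2 for p in Y_pred) / 1.0
    ms_residual = sum((o - p) ** 2 for o, p in zip(Y_obs, Y_pred)) / (n - 2)
    return ms_regression / ms_residual


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Least squares line through (0, 0), (1, 1), (2, 3): slope 1.5, intercept -1/6.
Y_obs = [0.0, 1.0, 3.0]
Y_pred = [-1.0 / 6.0 + 1.5 * x for x in (0.0, 1.0, 2.0)]
print(f_statistic(Y_obs, Y_pred))
print(durbin_watson([o - p for o, p in zip(Y_obs, Y_pred)]))
```

Residuals that alternate in sign push DW toward 4, while residuals that drift slowly push it toward 0, which matches the interpretation given above.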
CHAPTER 4
ILLUSTRATIVE EXAMPLES
4.1. Introduction
The Generalized Probability Distribution Computer Program (GPDCP)
is first compared to a modified version of the maximum likelihood based
program, developed by Kite (1977) for hydrologic frequency and risk
analysis. In this program six classical probability distribution models
are fitted at the same time, and the best model is selected on the basis
of the minimum standard error (Equation 3.3.7) criterion. This was the
criterion adopted by Kite to compare the performance of different
models.
Two examples are analyzed by both programs. The first is the St.
Marys river annual maximum daily runoff series, used by Kite; the second
is from a study of droughts in the Kissimmee River basin (Huber et al.,
1982). For the latter example two series of total annual rainfall and
runoff were considered. In addition to the validation of the new
program, the use of the GPDCP and Kite programs for the analysis of
these two examples allows the illustration of the advantages of the new
parameterized general distributions over the most popular frequency
models in the hydrologic field, fitted by the widely accepted method of
maximum likelihood.
4.2. Illustrative Example 1
This example consists of the annual maximum daily discharges of the
St. Marys river at Stillwater, Nova Scotia for the period 1915 to 1974,
originally analyzed by Kite (1977). These data are listed in Table 4.1
along with their mean, variance and skewness coefficient in the real and
logarithmic spaces. (Note that the maximum discharge for the year 1915
was missing in Kite's Table 2.2, but was set equal to 19900 cfs in the
sorted recorded events of page 167 of the same reference.)
The six frequency distribution models tested by the modified
version of Kite's program are the normal (N), lognormal (LN), three
parameter lognormal (LN3), Gumbel (EVI), Pearson Type III (PT3), and
logPearson Type III (LP3) distributions. These distributions were
introduced in Chapter 2, along with their statistical parameters. A
detailed description of the maximum likelihood solution procedure for
estimating the parameters of these models is given in Kite (1977). The
modified version of the original program is included in Appendix E. The
main change to Kite's program is the addition of a new "DO loop" to
subroutines EVI, LN3 and PT3. This change allowed automatic relaxation
of the convergence criterion when the limiting number of iterations is
exceeded for a prespecified testing level (EPS).
The standard errors (STDE) calculated by the modified program along
with those obtained by its original version are listed in Table 4.2.
Note the improvement introduced by the automatic relaxation of the
convergence criterion for the LN3, PT3 and LP3 models, especially for
the LP3 model, where the original program does not converge after 25
iterations. The last column of Table 4.2 gives the standard errors
obtained by the GPDCP program for the corresponding distributions which
are special cases of the investigated general distributions. This
program gives an improved solution over the previous two for all dis
tributions except the first one, for which the three programs give the
Table 4.1. Annual Maximum Daily Runoff (cfs) and Statistics
           of Original and Log-transformed Flows, Example 1.

EXAMPLE 1  ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 2-2)

SORTED RECORDED EVENTS
34400.000   29100.000   25600.000   23000.000   20600.000   20100.000
19900.000   19900.000   19500.000   19200.000   18600.000   18500.000
18300.000   18200.000   18000.000   17200.000   16900.000   16400.000
16100.000   16100.000   16000.000   15600.000   15100.000   14500.000
14300.000   14300.000   13900.000   13900.000   13900.000   13600.000
13100.000   13100.000   13000.000   13000.000   12900.000   12700.000
12400.000   12300.000   12300.000   12200.000   11900.000   11900.000
11800.000   11800.000   11600.000   11000.000   10700.000   10400.000
10300.000   10200.000   10200.000    9900.000    9020.000    8390.000
 8210.000    8210.000    8180.000    8040.000    7130.000    6700.000

MEAN OF Y       VARIANCE OF Y       SKEW OF Y
14554.660       27319370.000        1.889

MEAN OF LN      VARIANCE OF LN      SKEW OF LN
9.528           0.115               0.132
Table 4.2. Standard Errors for Example 1, St. Mary's River
           Annual Maximum Daily Discharges. Minimum Values
           are Underlined.

                        Solution Procedure
Distribution     Kite               New Kite     GPDCP
N                1637.3             1637.3       1637.3
LN               1014.4             1005.7        902.1
LN3               872.1              849.3        823.7
EVI              1029.2             1029.2        857.4
PT3               952.9              951.8        939.7
LP3              No convergence     1711.6        824.9
same STDE. This should be expected since, 1) the normal distribution
has only two parameters (locationscale) for which maximum likelihood
and least squares estimates were shown to be the same (section 2.2.5),
and 2) the same plotting position (Weibull, Equation 2.2.21) was used in
all three programs.
The improvement in the fit introduced by the GPDCP program over
classical maximum likelihood solutions (Kite's programs) results mainly
from the new parameterization of these models and the simple solution
procedure of estimating these parameters. Among the six frequency
models of Table 4.2, the three parameter lognormal distribution has the
minimum standard error; thus it is the best distribution according to
Kite's selection criterion (STDE). But, this is not necessarily true,
first, because many other distributions analyzed by the GPDCP program
may have a smaller STDE than those listed in Table 4.2. Second, other
selection statistics such as R2, WSS and MXLF (defined in the previous
chapter) may lead to a different conclusion. A complete list of the
models fitted by the GPDCP program to the St. Marys runoff data is given
in Table 4.3. Each model is described by its parent distribution (LAW),
location, scale, and shape (ALFA) parameters. Along with these param
eters are listed the four selection statistics: correlation coefficient
(R2), standard error (STDE), weighted sum of squares (WSS), and maximum
likelihood function (MXLF). Also included in this table for comparative
purposes are the mean, standard deviation (STD) and skewness coefficient
of the transformed data. For the Pearson family of distributions, an
extra column gives the moment estimate of the shape parameter k (Equa
tion 3.2.17a). Note that for the normal family, the location and scale
parameters are equal to the mean and standard deviation of the
Table 4.3. Models, Parameters, and Selection
           Statistics, Example 1, Runoff Series.

EXAMPLE 1  ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 2-2)

REGRESSION RESULTS:
PEARSON FAMILY GIVES OPTIMAL R2 FOR ALFA = 0.05

LAW        ALFA     LOCATION        SCALE       R2         STDE       WSS      MXLF        MEAN         STD     SKEW          K
NORMAL    -0.10      6.14139      0.13595  0.96079    823.68921   0.73710      9.15       6.141       0.132   -2.029
NORMAL     0.00      9.52840      0.35270  0.97965    902.08574   0.88941      3.52       9.528       0.338    0.179
NORMAL     0.10     15.94529      0.91712  0.98090    979.52930   1.10779     -3.07      15.945       0.879    0.260
NORMAL     0.20     28.69591      2.38666  0.97749   1054.83276   1.40111    -10.12      28.696       2.292    0.386
NORMAL     0.30     55.08219      6.21749  0.97301   1127.62231   1.77110    -17.15      55.082       5.985    0.528
NORMAL     0.40    111.55782     16.21521  0.96544   1198.10701   2.21018    -23.77     111.558      15.670    0.622
NORMAL     0.50    235.82118     42.33614  0.95771   1267.30376   2.71907    -30.01     235.821      41.077    0.742
NORMAL     0.60    515.47461    110.65771  0.94862   1335.97119   3.28903    -35.72     515.475     107.880    0.860
NORMAL     0.70   1156.56641    289.55290  0.93826   1405.45190   3.91221    -40.92    1156.568     283.838    0.981
NORMAL     0.80   2648.86890    758.51605  0.92659   1477.29785   4.57794    -45.64    2648.872     748.211    1.103
NORMAL     0.90   6167.36320   1989.27710  0.91394   1553.58496   5.27493    -49.89    6167.371    1975.795    1.228
NORMAL     1.00  14553.5391    5222.86719  0.90021   1637.26123   5.98688    -53.69   14553.559    5226.836    1.356
GUMBEL    -0.10      6.08215      0.10729  0.93278   1073.97534   2.25509    -24.40       6.141       0.132   -2.029
GUMBEL     0.00      9.37387      0.27991  0.95993    929.65332   1.92141    -19.59       9.528       0.338    0.179
GUMBEL     0.10     15.54165      0.73111  0.97019    809.52173   1.59400    -13.99      15.945       0.879    0.260
GUMBEL     0.20     27.64043      1.91192  0.97621    727.67163   1.31584     -8.23      28.696       2.292    0.386
GUMBEL     0.30     52.31917      5.00476  0.98124    678.80981   1.10034     -2.87      55.082       5.985    0.528
GUMBEL     0.40    104.31650     13.11645  0.98318    657.80766   0.95119      1.50     111.558      15.670    0.622
GUMBEL     0.50    216.82228     34.41342  0.98489    659.13379   0.87597      3.97     235.821      41.077    0.742
GUMBEL     0.60    465.57104     90.39207  0.98516    677.61304   0.87635      3.96     515.475     107.880    0.860
GUMBEL     0.70   1025.33087    237.69698  0.98409    709.00090   0.95390      1.42    1156.568     283.838    0.981
GUMBEL     0.80   2303.39993    625.76270  0.98152    750.34668   1.10849     -3.09    2648.872     748.211    1.103
GUMBEL     0.90   5256.94766   1649.24951  0.97773    799.97803   1.33083     -8.75    6167.371    1975.795    1.228
GUMBEL     1.00  12151.0781    4351.66016  0.97265    857.37402   1.64174    -14.87   14553.559    5226.836    1.356
RAYLEIGH  -0.10      5.88643      0.20480  0.94689    773.90308   1.53401    -12.84       6.141       0.132   -2.029
RAYLEIGH   0.00      8.86446      0.53334  0.97093    753.09082   1.35495     -9.11       9.528       0.338    0.179
RAYLEIGH   0.10     14.21443      1.39039  0.97756    760.36499   1.21404     -5.82      15.945       0.879    0.260
RAYLEIGH   0.20     24.17816      3.62908  0.98001    786.37720   1.13936     -3.91      28.696       2.292    0.386
RAYLEIGH   0.30     43.27780      9.48241  0.98136    824.01733   1.13911     -3.91      55.082       5.985    0.528
RAYLEIGH   0.40     80.67847     24.80525  0.97966    868.60270   1.21299     -5.79     111.558      15.670    0.622
RAYLEIGH   0.50    154.95543     64.95912  0.97768    917.39551   1.36529     -9.34     235.821      41.077    0.742
RAYLEIGH   0.60    303.40340    170.27153  0.97413    968.95306   1.59401    -13.99     515.475     107.880    0.860
RAYLEIGH   0.70    600.22437    446.90747  0.96919   1022.67065   1.89733    -19.21    1156.568     283.838    0.981
RAYLEIGH   0.80   1187.09100   1174.23950  0.96209   1078.56835   2.27161    -24.61    2648.872     748.211    1.103
RAYLEIGH   0.90   2322.51562   3088.55005  0.95530   1137.26343   2.71278    -29.94    6167.371    1975.795    1.228
RAYLEIGH   1.00   4429.40234   8132.67578  0.94645   1199.75073   3.21475    -35.03   14553.559    5226.836    1.356
PEARSON   -0.10      1.61415      0.00416  0.97848    767.57153   0.71065     10.25       6.141       0.132   -2.029   2177.955
PEARSON    0.00      2.49030      0.01776  0.98537    824.93096   0.79260      6.97       9.528       0.338    0.179    792.694
PEARSON    0.10      4.16701      0.07164  0.98486    861.99146   0.88702      3.60      15.945       0.879    0.260    328.870
PEARSON    0.20      7.50000      0.27060  0.98330    884.72217   0.98684      0.40      28.696       2.292    0.386    156.734
PEARSON    0.30     14.39125      0.96164  0.98272    897.05640   1.09046     -2.60      55.082       5.985    0.528     84.704
PEARSON    0.40     29.21857      3.25395  0.98045    907.86255   1.19424     -5.33     111.558      15.670    0.622     50.684
PEARSON    0.50     61.94199     10.57530  0.97868    915.69678   1.30876     -8.07     235.821      41.077    0.742     32.959
PEARSON    0.60    135.60767     33.30441  0.97603    920.77001   1.42615    -10.65     515.475     107.880    0.860     22.832
PEARSON    0.70    304.66284    103.07581  0.97472    924.99260   1.55011    -13.15    1156.568     283.838    0.981     16.604
PEARSON    0.80    690.59351    313.04443  0.97236    927.56372   1.68301    -15.62    2648.872     748.211    1.103     12.534
PEARSON    0.90   1628.59375    938.60823  0.96979    934.34375   1.82357    -18.02    6167.371    1975.795    1.228      9.744
PEARSON    1.00   3846.46484   2708.10384  0.96713    939.68994   1.97212    -20.37   14553.559    5226.836    1.356      7.753

R2    CORRELATION COEFFICIENT
STDE  STANDARD ERROR
WSS   WEIGHTED SUM OF SQUARES
MXLF  MAXIMUM LIKELIHOOD FUNCTION
STD   STANDARD DEVIATION
K     PEARSON SHAPE PARAMETER
transformed variables, respectively. The small difference between scale
parameter and standard deviation is due to the correction for bias of
the STD,
STD = SCALE x
(4.2.1)
The skewness coefficients of Table 4.3 are larger than those of Table
4.1 because they are corrected for bias, while Kite's program gives the
maximum likelihood estimates with no correction. The correction factor
is
$SKEW_u = SKEW_b \times \sqrt{n(n-1)} / (n-2)$    (4.2.2)
where u and b stand for unbiased and biased estimate, respectively.
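The size of the skewness bias correction is easy to evaluate. The short Python sketch below assumes the correction takes the standard form sqrt(n(n-1))/(n-2) and evaluates it for the sample size of this example; the function name is hypothetical.

```python
import math


def unbiased_skew(skew_biased, n):
    """Apply the bias correction sqrt(n (n - 1)) / (n - 2) to a
    biased (moment) skewness estimate from a sample of size n."""
    return skew_biased * math.sqrt(n * (n - 1)) / (n - 2)


# Correction factor for a sample of 60 observations.
factor = unbiased_skew(1.0, 60)
print(factor)
```

For 60 observations the correction raises the skewness estimate by only a few percent, and it shrinks further as the sample grows.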
For this example, the search for the optimal transformation was
made in the range -0.10 to 1.0 with an increment of 0.05, a total of 25
transformations for each of the four families of distributions. Note
that only one out of every two models is listed in Table 4.3. The lower
limit of this range was fixed at this level because of the big change in
the skewness coefficient between ALFA equal to 0 and -0.10; with a highly
negative skewness, the transformed variables deviate excessively from
the normal distribution, which is the target of the transformation. The
optimal statistics for each of the four families of distributions are
summarized in Table 4.4. Along with these statistics are listed the
corresponding transformations (a). Note that the WSS and MXLF statistics
always have their optimal values at the same transformation.
This should be expected since the equivalence between these two methods
was illustrated in Chapter 2. The Pearson and normal families have the
closest result to each other, which may be explained by the very high
values of the shape parameter k of the Pearson distribution (the normal
Table 4.4. Optimal Selection Statistic and Corresponding Transformation
(α) for Example 1.

Distribution    R2/α          STDE/α         WSS/α          MXLF/α
Normal          0.981/0.10    823.7/0.00     0.737/0.10*    9.15/0.10*
Gumbel          0.985/0.60    657.8/0.40     0.876/0.50     3.76/0.50
Rayleigh        0.931/0.30    753.1/0.00     1.139/0.30     3.91/0.30
Pearson         0.985/0.05    767.6/0.10*    0.711/0.10*    10.25/0.10*

*optimal transformation not reached, α_opt < 0.10
family has a k equal to infinity). The Pearson distribution gave a
slightly better fit based on all four selection statistics; this is
expected, since the Pearson distribution has four parameters. The
difference between the optimal statistics of the four generalized
distributions is very small compared to the variation of these
statistics within each family (Table 4.3). This illustrates the
importance of the transformation of the variables in the optimization of
a given statistic: it is much more important to find the right space in
which to fit a given distribution than to select among classical
distributions within a fixed space (real, logarithmic, etc.). For this
example, the STDE, WSS and MXLF selection statistics gave the lower
limit of the range analyzed (α = 0.10) as the optimal transformation for
the normal and Pearson distributions, indicating that transformations
with smaller values of α may lead to better selection statistics. Such
values were not considered because of the large change in the skewness
coefficient mentioned earlier. The coefficient of correlation R2 seems
to be the best selection statistic among the four considered in this
study, since it did not select any negative value of α as the optimal
transformation. Such a negative value would imply that the transformed
variables are bounded from above (Hernandez, 1978; Hinkley, 1975), which
is usually an unrealistic characteristic for the type of data analyzed
in this study.
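The search over transformations described above can be illustrated with a small sketch. Several things here are assumptions rather than the GPDCP procedure: the power transformation is taken in the Box-Cox form y = (x^α - 1)/α (with y = ln x at α = 0), only the normal family is treated, the plotting position is m/(n+1), and R2 is the squared correlation between the ordered transformed data and the normal reduced variate. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def box_cox(x, alpha):
    """Power transformation (assumed Box-Cox form); log at alpha = 0."""
    if alpha == 0.0:
        return math.log(x)
    return (x ** alpha - 1.0) / alpha


def r_squared(xs, ys):
    """Squared linear correlation coefficient between two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy * sxy / (sxx * syy)


def best_alpha(data, alphas):
    """Return the alpha with the highest probability-plot R2, and all scores."""
    n = len(data)
    ordered = sorted(data)
    # Normal reduced variate at plotting positions m / (n + 1), m = 1..n.
    z = [NormalDist().inv_cdf(m / (n + 1)) for m in range(1, n + 1)]
    scores = {}
    for alpha in alphas:
        transformed = [box_cox(v, alpha) for v in ordered]
        scores[alpha] = r_squared(z, transformed)
    best = max(scores, key=scores.get)
    return best, scores
```

A grid such as `[round(0.10 + 0.05 * i, 2) for i in range(19)]` covers the range 0.10 to 1.0 in steps of 0.05; the same loop could be repeated with another reduced variate for the other families.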
The GPDCP program automatically selects the overall best model
according to any of the four selection statistics. Based on R2, the
best model is from the Pearson family with a power transformation ALFA
equal to 0.05. Table 4.5 lists the performance of this model, including
the residuals in the real and transformed spaces along with the
corresponding observed and predicted values. The same table lists the
Table 4.5 Best Model Based on R2 Selection Statistics, Example 1.
EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
(table body illegible in source)
observed frequencies, the return period derived from the plotting
position formula (PERIOD) and the associated reduced variate Z. The
last two columns of this table give the Student's t and t0 statistics.
Note that for most of the observations the statistic t is more than ten
times smaller than the 95% Student's variate t0, indicating that all
observations are within the specified confidence interval. A more
detailed analysis of the performance of the selected model is given in
Table 4.6. The statistics included in this table are generated by the
IMSL subroutine RLONE. The correlation coefficient listed among the
basic descriptive statistics is the square root of the selection
statistic R2. The analysis of variance summarizes the classical
ANOVA-type statistics. Note the very high value of the F statistic,
leading to a probability of exceedance not discernible from zero. The
Durbin-Watson statistic is much smaller than 2, indicating a positive
correlation between the residuals. Note the high reliability of the
estimated parameters implied by their standard errors and the
corresponding confidence limits. Also note the independence between the
parameter estimates suggested by the low value of their covariance.
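As a reminder of why a Durbin-Watson value well below 2 signals positively correlated residuals, the statistic can be sketched from its standard definition, the sum of squared successive differences of the residuals divided by their sum of squares. The function name is hypothetical and the code is not taken from the IMSL routine; it assumes the residuals are supplied in the order of the ranked observations.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e[t] - e[t-1])**2) / sum(e[t]**2)."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

Residuals that change slowly from one observation to the next give small successive differences and hence a value near 0; residuals that alternate in sign give a value approaching 4; uncorrelated residuals give a value near 2.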
The same type of analysis may be repeated within the same run of
the GPDCP program using other selection statistics. For example, based
on the STDE statistic, the Gumbel family with a transformation of 0.45
was the best distribution. The performance of this model is summarized
in Tables 4.7 and 4.8. Note that even though this model has a smaller
STDE than the previously selected model (Table 4.5), its Student t
statistics are larger than t0 for more than one observation at both
tails of the distribution. This is illustrated by Figures 4.1 and 4.2,
where the corresponding observations fall outside the 95% confidence
Table 4.6 Detailed Statistics for the R2 Selected Model, Example 1.

EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
REGRESSION RESULTS : PEARSON FAMILY GIVES OPTIMAL R2 FOR ALFA = 0.05

BASIC DESCRIPTIVE STATISTICS
                                      STANDARD
                            MEAN     DEVIATION    CORRELATION
TRANSFORMED VARIABLE      12.211         0.544          0.993
REDUCED VARIATE          251.003        15.040

ANALYSIS OF VARIANCE
SOURCE OF     DEGREE OF    SUMS OF       MEAN                  P(EXCEEDING F
VARIATION       FREEDOM    SQUARES    SQUARES     F-VALUE          UNDER H0)
REGRESSION        1.000     17.243     17.243    4211.035                0.0
RESIDUAL         58.000      0.237      0.004
TOTAL            59.000     17.480

DURBIN-WATSON STATISTIC 0.429

MODEL PARAMETER INFERENCES
                                        LOWER         UPPER
               POINT    STANDARD   CONFIDENCE    CONFIDENCE
            ESTIMATE       ERROR        LIMIT         LIMIT
SCALE          0.036       0.001        0.034         0.037
LOCATION       3.188       0.139        2.910         3.467

COVARIANCE     0.000
Table 4.7 Best Model Based on STDE Selection Statistics, Example 1.

EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
REGRESSION RESULTS : GUMBEL FAMILY GIVES OPTIMAL 'STDE' FOR ALFA = 0.45

FREQUENCY   PERIOD   OBS. VOL  PRED. VOL  RESIDUAL       Z       YO       YP      YR      T     TO
       1.   61.000   6700.000   7326.832   626.832  -1.414  114.868  119.677   4.809  1.426  1.672
       2.   30.500   7130.000   7861.098   731.098  -1.229  118.192  123.599   5.407  1.610  1.672
       3.   20.333   8040.000   8238.410   198.410  -1.103  124.880  126.282   1.402  0.418  1.672
       4.   15.250   8180.000   8545.270   365.270  -1.002  125.871  128.414   2.543  0.760  1.672
       5.   12.200   8210.000   8811.355   601.355  -0.917  126.082  130.229   4.147  1.242  1.672
       6.   10.167   8210.000   9050.789   840.789  -0.841  126.082  131.837   5.755  1.725  1.672
       7.    8.714   8390.000   9271.449   881.449  -0.772  127.340  133.298   5.958  1.788  1.672
       8.    7.625   9020.000   9478.309   458.309  -0.709  131.631  134.650   3.019  0.907  1.672
       9.    6.778   9900.000   9674.656   225.344  -0.649  137.358  135.919   1.439  0.433  1.672
      10.    6.100  10200.000   9862.879   337.121  -0.592  139.245  137.122   2.123  0.639  1.672
      11.    5.545  10200.000  10044.715   155.285  -0.538  139.245  138.272   0.973  0.293  1.672
      12.    5.083  10300.000  10221.516    78.484  -0.486  139.868  139.380   0.488  0.147  1.672
      13.    4.692  10400.000  10394.301     5.699  -0.436  140.487  140.452   0.035  0.011  1.672
      14.    4.357  10700.000  10564.020   135.980  -0.386  142.325  141.496   0.829  0.250  1.672
      15.    4.067  11000.000  10731.332   268.668  -0.338  144.135  142.515   1.619  0.489  1.672
      16.    3.813  11600.000  10896.863   703.137  -0.291  147.675  143.516   4.159  1.256  1.672
      17.    3.588  11800.000  11061.164   738.836  -0.245  148.832  144.501   4.332  1.309  1.672
      18.    3.389  11800.000  11224.684   575.316  -0.199  148.832  145.473   3.359  1.015  1.672
      19.    3.211  11900.000  11387.906   512.094  -0.154  149.407  146.435   2.972  0.898  1.672
      20.    3.050  11900.000  11551.156   348.844  -0.109  149.407  147.391   2.016  0.610  1.672
      21.    2.905  12200.000  11714.848   485.152  -0.064  151.116  148.341   2.775  0.839  1.672
      22.    2.773  12300.000  11879.305   420.695  -0.020  151.680  149.289   2.391  0.724  1.672
      23.    2.652  12300.000  12044.867   255.133   0.025  151.680  150.235   1.445  0.437  1.672
      24.    2.542  12400.000  12211.918   188.082   0.070  152.242  151.183   1.059  0.321  1.672
      25.    2.440  12700.000  12380.699   319.301   0.114  153.912  152.134   1.779  0.539  1.672
      26.    2.346  12900.000  12551.594   348.406   0.159  155.014  153.089   1.925  0.583  1.672
      27.    2.259  13000.000  12724.910   275.090   0.205  155.561  154.050   1.511  0.458  1.672
      28.    2.179  13000.000  12901.062    98.938   0.250  155.561  155.020   0.541  0.164  1.672
      29.    2.103  13100.000  13080.340    19.660   0.296  156.106  155.999   0.107  0.032  1.672
      30.    2.033  13100.000  13263.121   163.121   0.343  156.106  156.991   0.885  0.268  1.672
      31.    1.968  13600.000  13449.879   150.121   0.390  158.798  157.996   0.802  0.243  1.672
      32.    1.906  13900.000  13640.984   259.016   0.438  160.387  159.016   1.370  0.416  1.672
      33.    1.848  13900.000  13836.926    63.074   0.487  160.387  160.054   0.332  0.101  1.672
      34.    1.794  13900.000  14038.230   138.230   0.537  160.387  161.112   0.726  0.220  1.672
      35.    1.743  14300.000  14245.434    54.566   0.588  162.476  162.193   0.283  0.086  1.672
      36.    1.694  14300.000  14459.176   159.176   0.640  162.476  163.298   0.823  0.249  1.672
      37.    1.649  14500.000  14680.098   180.098   0.693  163.508  164.432   0.924  0.280  1.672
      38.    1.605  15100.000  14909.000   191.000   0.748  166.560  165.596   0.964  0.292  1.672
      39.    1.564  15600.000  15146.770   453.230   0.804  169.052  166.795   2.257  0.684  1.672
      40.    1.525  16000.000  15394.340   605.660   0.863  171.015  168.033   2.982  0.904  1.672
      41.    1.488  16100.000  15652.895   447.105   0.923  171.501  169.314   2.188  0.663  1.672
      42.    1.452  16100.000  15923.699   176.301   0.986  171.501  170.643   0.858  0.260  1.672
      43.    1.419  16400.000  16208.277   191.723   1.051  172.951  172.026   0.924  0.280  1.672
      44.    1.386  16900.000  16508.391   391.609   1.119  175.334  173.471   1.863  0.564  1.672
      45.    1.356  17200.000  16826.176   373.824   1.190  176.746  174.985   1.761  0.533  1.672
      46.    1.326  18000.000  17164.180   835.820   1.265  180.445  176.578   3.867  1.169  1.672
      47.    1.298  18200.000  17525.395   674.605   1.344  181.355  178.262   3.094  0.935  1.672
      48.    1.271  18300.000  17913.676   386.324   1.428  181.808  180.050   1.758  0.531  1.672
      49.    1.245  18500.000  18333.695   166.305   1.518  182.711  181.961   0.750  0.226  1.672
      50.    1.220  18600.000  18791.461   191.461   1.615  183.160  184.017   0.857  0.258  1.672
      51.    1.196  19200.000  19294.781    94.781   1.720  185.828  186.245   0.417  0.126  1.672
      52.    1.173  19500.000  19853.930   353.930   1.835  187.144  188.683   1.539  0.462  1.672
      53.    1.151  19900.000  20483.145   583.145   1.962  188.882  191.383   2.500  0.749  1.672
      54.    1.130  19900.000  21202.531  1302.531   2.105  188.882  194.413   5.531  1.654  1.672
      55.    1.109  20100.000  22041.934  1941.934   2.268  189.744  197.879   8.135  2.425  1.672
      56.    1.089  20600.000  23048.430  2448.430   2.459  191.879  201.940  10.062  2.987  1.672
      57.    1.070  23000.000  24302.402  1302.402   2.691  201.747  206.866   5.119  1.512  1.672
      58.    1.052  25600.000  25958.074   358.074   2.987  211.818  213.160   1.342  0.393  1.672
      59.    1.034  29100.000  28372.070   727.930   3.401  224.524  221.953   2.570  0.744  1.672
      60.    1.017  34400.000  32734.254  1665.746   4.103  242.255  236.855   5.400  1.527  1.672

Z  = REDUCED VARIATE
YO = TRANSFORMED OBSERVATION
YP = TRANSFORMED PREDICTION
YR = TRANSFORMED RESIDUAL
T  = STUDENT STATISTIC
TO = 95% STUDENT T
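The PERIOD, Z and YO columns of Table 4.7 can be reproduced approximately with the sketch below. It rests on assumptions inferred from the tabulated values rather than stated in the text at this point: observations ranked in ascending order, a plotting position m/(n+1), the Gumbel reduced variate -ln(-ln(p)), and a power transformation of the form (x^ALFA - 1)/ALFA. The function names are hypothetical.

```python
import math


def return_period(rank, n):
    """Return period (n + 1) / m for the observation of ascending rank m."""
    return (n + 1) / rank


def gumbel_reduced_variate(rank, n):
    """Gumbel reduced variate -ln(-ln(p)) at plotting position p = m/(n+1)."""
    p = rank / (n + 1)
    return -math.log(-math.log(p))


def power_transform(x, alfa):
    """Assumed power transformation; reduces to ln(x) when alfa is zero."""
    if alfa == 0.0:
        return math.log(x)
    return (x ** alfa - 1.0) / alfa
```

With n = 60, rank 1 gives a period of 61 and a reduced variate close to -1.414, and `power_transform(6700.0, 0.45)` is close to the tabulated YO of 114.868 for the smallest observation.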
Table 4.8 Detailed Statistics for the STDE Selected Model, Example 1.

EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
REGRESSION RESULTS : GUMBEL FAMILY GIVES OPTIMAL 'STDE' FOR ALFA = 0.45

BASIC DESCRIPTIVE STATISTICS
                                      STANDARD
                            MEAN     DEVIATION    CORRELATION
TRANSFORMED VARIABLE     161.433        25.360          0.992
REDUCED VARIATE            0.552         1.135

ANALYSIS OF VARIANCE
SOURCE OF     DEGREE OF      SUMS OF         MEAN                  P(EXCEEDING F
VARIATION       FREEDOM      SQUARES      SQUARES     F-VALUE          UNDER H0)
REGRESSION        1.000    37358.172    37358.172    3699.145                0.0
RESIDUAL         58.000      585.750       10.099
TOTAL            59.000    37943.922

DURBIN-WATSON STATISTIC 0.306

MODEL PARAMETER INFERENCES
                                        LOWER         UPPER
               POINT    STANDARD   CONFIDENCE    CONFIDENCE
            ESTIMATE       ERROR        LIMIT         LIMIT
SCALE         21.242       0.349       20.312        22.173
LOCATION     149.706       0.453      148.798       150.613

COVARIANCE     0.067
Figure 4.1 Linear Plot of R2 Selected Model, Example 1 (St. Marys River
runoff data; Pearson family, ALFA = 0.05; abscissa: reduced variate).
Figure 4.2 Linear Plot of STDE Selected Model, Example 1 (St. Marys River
runoff data; abscissa: reduced variate).
interval. Thus, the selection based on the R2 statistic seems to be
more reliable than the one based on the STDE. A comparison of the
corresponding F values and Durbin-Watson statistics (Tables 4.6 and 4.8)
supports this conclusion, since both are smaller for the STDE case,
indicating a weaker linear fit for the regression line and more strongly
correlated residuals.
The WSS and MXLF selection statistics gave the same optimal model
(Pearson with α = 0.10). This model performs slightly better than the
Pearson model with α = 0.05 selected by the R2 statistic: its F value
and Durbin-Watson statistic are larger (Table 4.9) and its Student t
values are smaller for most of the observations (Table 4.10). Figure
4.3 gives the fitted model along with the observations and the
associated 95% confidence intervals.
4.3. Illustrative Example 2
Annual total rainfall and runoff from the Kissimmee River basin,
Florida, for the period 1934 to 1981 were taken from a study by Huber et
al. (1982). These data are listed in Tables 4.11 and 4.12, along with
the mean, variance and skewness in the real and logarithmic spaces.
Note that under the logarithmic transformation the skewness of the
runoff series goes from positive (γ = 1.205) to negative (γ = -1.136),
illustrating the overtransformation effect mentioned in Section 2.3.4.
On the other hand, the logarithmic transformation of the rainfall series
is an undertransformation, since the skewness coefficient is simply
reduced from 0.626 to 0.277.
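The distinction between over- and undertransformation can be checked numerically by comparing the skewness before and after taking logarithms. The sketch below uses the plain moment skewness (without the bias correction) and made-up illustrative numbers, not the Kissimmee data; the function names are hypothetical.

```python
import math


def moment_skew(values):
    """Moment skewness coefficient m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skew_real_and_log(values):
    """Skewness of the data and of their natural logarithms."""
    logs = [math.log(v) for v in values]
    return moment_skew(values), moment_skew(logs)


# Illustrative series with one very small and one large value: positive
# skewness in the real space, negative skewness after the log transform.
real_skew, log_skew = skew_real_and_log([0.5, 7.0, 8.0, 9.0, 10.0, 20.0])
```

A sign change of this kind indicates that the logarithm is too strong a transformation for the series, whereas a skewness that is reduced but keeps its sign indicates that it is too weak.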
These two series were first fitted to the same six classical
probability distributions described in the previous example. The
estimation of the parameters of these models was performed by the
modified version
Figure 4.3 Linear Plot of MXLF Selected Model, Example 1 (St. Marys River
runoff data; abscissa: reduced variate).
Table 4.9 Detailed Statistics for the WSS Selected Model, Example 1.

EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
REGRESSION RESULTS : PEARSON FAMILY GIVES OPTIMAL 'WSS' FOR ALFA = 0.10

BASIC DESCRIPTIVE STATISTICS
                                      STANDARD
                            MEAN     DEVIATION    CORRELATION
TRANSFORMED VARIABLE       6.141         0.130          0.994
REDUCED VARIATE         1088.938        31.330

ANALYSIS OF VARIANCE
SOURCE OF     DEGREE OF    SUMS OF       MEAN                  P(EXCEEDING F
VARIATION       FREEDOM    SQUARES    SQUARES     F-VALUE          UNDER H0)
REGRESSION        1.000      0.982      0.982    4753.984                0.0
RESIDUAL         58.000      0.012      0.000
TOTAL            59.000      0.994

DURBIN-WATSON STATISTIC 0.457

MODEL PARAMETER INFERENCES
                                        LOWER         UPPER
               POINT    STANDARD   CONFIDENCE    CONFIDENCE
            ESTIMATE       ERROR        LIMIT         LIMIT
SCALE          0.004       0.000        0.004         0.004
LOCATION       1.657       0.065        1.527         1.787

COVARIANCE     0.000
Table 4.10 Best Model Based on the WSS Selection Statistics, Example 1.
EXAMPLE 1 ST. MARYS RIVER AT STILLWATER (KITE, 1977, TABLE 22)
(table body illegible in source)
Table 4.11 Total Annual Runoff (Inches) and Statistics of
Original and Log-transformed Flow for Example 2.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF, 1934-1981

SORTED RECORDED EVENTS
  27.064   19.030   18.983   18.212   17.365   15.461   13.061   12.603
  12.375   11.772   11.097   11.026   10.965   10.391   10.364    9.960
   9.593    9.503    9.437    9.401    9.258    8.925    8.142    7.563
   7.495    7.495    7.334    7.301    7.246    6.745    6.507    5.807
   5.602    5.439    5.105    4.962    4.795    4.790    4.592    3.973
   3.705    3.526    3.425    3.217    3.198    3.163    2.499    0.499

MEAN OF Y              8.749
VARIANCE OF Y         26.914
SKEW OF Y              1.205
MEAN OF LN(Y)          1.983
VARIANCE OF LN(Y)      0.457
SKEW OF LN(Y)         -1.136
Table 4.12 Total Annual Rainfall (Inches) and Statistics
of Original and Log-transformed Data.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RAINFALL, 1934-1981

SORTED RECORDED EVENTS
  72.230   68.231   67.449   64.474   59.693   59.423   58.185   56.718
  55.740   54.534   53.193   53.158   53.135   52.751   52.158   51.958
  51.911   51.704   51.496   51.067   50.911   50.865   50.517   50.270
  49.590   49.098   48.307   47.797   47.415   47.382   47.257   47.000
  46.090   45.849   45.700   45.565   44.983   44.775   44.619   42.661
  42.621   42.240   42.126   41.555   38.340   37.319   36.171   34.447

MEAN OF Y             50.014
VARIANCE OF Y         63.182
SKEW OF Y              0.626
MEAN OF LN(Y)          3.900
VARIANCE OF LN(Y)      0.024
SKEW OF LN(Y)          0.277
of Kite's program (Appendix E). They were then analyzed by the GPDCP
program for a wide range of distributions, including the previous six as
special cases. To compare the performance of the new method of
parameter estimation (GPDCP) with the classical maximum likelihood
method, the standard errors of estimate (STDE) for each of the six
models are listed in Table 4.13. Here again, the GPDCP program gave a
significant improvement over the maximum likelihood method, especially
for the PT3 and LP3 models, for which no convergence was achieved using
the original version of Kite's program and for which even the new
version performed poorly compared to the GPDCP program.
A list of the models fitted to the runoff series is given in Table
4.14. Note that only one out of every two fitted models is included in
this table. For each family of distributions the exponent ALFA of the
transformation was varied from zero to one. Over this range of ALFA,
the skewness coefficient of the transformed data varied from -1.210 to
1.284. The optimal selection statistics for the four generalized
distributions, along with the corresponding transformations, are
summarized in Table 4.15. For this example too, the differences
between the optimal statistics for different generalized distributions
are very small compared to their variation within the same family for
different transformations.
The WSS and MXLF selection statistics have the same optimal
transformation for all distributions. For this example, contrary to the
previous one, the Pearson distribution deviated significantly from the
normal distribution; this may be explained by the relatively lower shape
coefficient k for this case (Table 4.14).
Table 4.13. Standard Errors for Example 2, Kissimmee River Basin
Annual Total Rainfall and Runoff.

                      Runoff                 Rainfall
Distribution    New Kite    GPDCP      New Kite    GPDCP
N                  1.626    1.575         1.717    1.694
LN                 0.997    1.075         1.418    1.324
LN3                0.874    0.711         1.368    1.267
EVI                1.087    0.840         1.220    1.200
PT3                1.089    0.805         2.697    1.350
LP3                5.916    2.400         1.742    1.263
Table 4.14 Models, Parameters and Selection Statistics, Example 2, Runoff.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF, 1934-1981
REGRESSION RESULTS
GUMBEL FAMILY GIVES OPTIMAL R2 FOR ALFA = 0.75

LAW       ALFA  LOCATION     SCALE       R2      STDE       WSS    MXLF    MEAN     STD    SKEW       K
NORMAL    0.00   1.98334   0.68403  0.90539   1.07562   2.08890   17.68   1.983   0.676   1.210
NORMAL    0.10   2.22034   0.82052  0.93615   0.78601   1.62781   11.69   2.220   0.798   0.815
NORMAL    0.20   2.49799   0.98991  0.95712   0.71095   1.30805    6.46   2.498   0.952   0.472
NORMAL    0.30   2.82467   1.20062  0.96920   0.76856   1.16247    3.61   2.825   1.147   0.174
NORMAL    0.40   3.21069   1.46337  0.97378   0.87749   1.20456    4.47   3.211   1.395   0.088
NORMAL    0.50   3.66879   1.79179  0.97183   0.99701   1.43834    8.72   3.669   1.710   0.323
NORMAL    0.60   4.21475   2.20329  0.96455   1.11667   1.85711   14.86   4.215   2.110   0.537
NORMAL    0.70   4.86817   2.72014  0.95293   1.23225   2.44810   21.49   4.868   2.621   0.736
NORMAL    0.80   5.65346   3.37090  0.93781   1.34700   3.44664   29.70   5.653   3.274   0.925
NORMAL    0.90   6.60111   4.19221  0.91908   1.46151   4.03943   33.51   6.601   4.112   1.107
NORMAL    1.00   7.74935   5.23127  0.89971   1.57547   4.52098   36.21   7.749   5.188   1.284
GUMBEL    0.00   1.69668   0.52340  0.81945   2.76635   3.77901   31.91   1.983   0.676   1.210
GUMBEL    0.10   1.87286   0.63444  0.86522   2.01224   3.18185   27.78   2.220   0.798   0.815
GUMBEL    0.20   2.07454   0.77316  0.90260   1.49857   2.57414   22.69   2.498   0.952   0.472
GUMBEL    0.30   2.30609   0.94685  0.93192   1.14133   2.00242   16.66   2.825   1.147   0.174
GUMBEL    0.40   2.57271   1.16485  0.95384   0.89756   1.50699    9.84   3.211   1.395   0.088
GUMBEL    0.50   2.88060   1.43913  0.96917   0.74470   1.11956    2.71   3.669   1.710   0.323
GUMBEL    0.60   3.23711   1.78504  0.97872   0.66038   0.86244    3.55   4.215   2.110   0.537
GUMBEL    0.70   3.65104   2.22230  0.98325   0.65400   0.74907    6.93   4.868   2.621   0.736
GUMBEL    0.80   4.13287   2.77630  0.98348   0.68524   0.78475    5.82   5.653   3.274   0.925
GUMBEL    0.90   4.69511   3.48009  0.97996   0.74890   0.96712    0.80   6.601   4.112   1.107
GUMBEL    1.00   5.35271   4.37592  0.97322   0.84000   1.28520    6.02   7.749   5.188   1.284
RAYLEIGH  0.00   0.73812   1.00150  0.84299   1.78402   3.05144   26.77   1.983   0.676   1.210
RAYLEIGH  0.10   0.71461   1.21103  0.88575   1.29444   2.48280   21.83   2.220   0.798   0.815
RAYLEIGH  0.20   0.66765   1.47211  0.91937   0.90416   1.96227   16.18   2.498   0.952   0.472
RAYLEIGH  0.30   0.58894   1.79815  0.94433   0.81103   1.53131   10.23   2.825   1.147   0.174
RAYLEIGH  0.40   0.46749   2.20630  0.96143   0.74504   1.22177    4.81   3.211   1.395   0.088
RAYLEIGH  0.50   0.28883   2.71844  0.97160   0.75308   1.05486    1.28   3.669   1.710   0.323
RAYLEIGH  0.60   0.03393   3.36256  0.97579   0.80379   1.04181    0.98   4.215   2.110   0.537
RAYLEIGH  0.70   0.32233   4.17463  0.97487   0.87639   1.18509    4.08   4.868   2.621   0.736
RAYLEIGH  0.80   0.81297   5.20083  0.96963   0.96096   1.47951    9.40   5.653   3.274   0.925
RAYLEIGH  0.90   1.48142   6.50064  0.96071   1.05513   1.91256   15.56   6.601   4.112   1.107
RAYLEIGH  1.00   2.38488   8.15078  0.94869   1.16265   2.46216   21.62   7.749   5.188   1.284
PEARSON   0.00   0.62612   0.31873  0.82203   2.40725   3.62361   30.90   1.983   0.676   1.210   8.602
PEARSON   0.10   0.66320   0.40648  0.86244   1.81376   3.15184   27.55   2.220   0.798   0.815   7.747
PEARSON   0.20   0.71257   0.52490  0.89478   1.42118   2.69941   23.83   2.498   0.952   0.472   6.888
PEARSON   0.30   0.77602   0.68534  0.91980   1.15912   2.29077   19.89   2.825   1.147   0.174   6.063
PEARSON   0.40   0.85625   0.90319  0.93858   0.99017   1.94352   15.95   3.211   1.395   0.088   5.298
PEARSON   0.50   0.95560   1.20024  0.95232   0.88501   1.66181   12.19   3.669   1.710   0.323   4.605
PEARSON   0.60   1.07794   1.60605  0.96196   0.82495   1.44890    8.90   4.215   2.110   0.537   3.989
PEARSON   0.70   1.22707   2.16245  0.96841   0.79629   1.30225    6.34   4.868   2.621   0.736   3.449
PEARSON   0.80   1.40810   2.92745  0.97244   0.78739   1.21373    4.65   5.653   3.274   0.925   2.981
PEARSON   0.90   1.62826   3.98149  0.97463   0.79120   1.17481    3.87   6.601   4.112   1.107   2.578
PEARSON   1.00   1.89502   5.43843  0.97538   0.80538   1.18147    4.00   7.749   5.188   1.284   2.231

R2    CORRELATION COEFFICIENT
STDE  STANDARD ERROR
WSS   WEIGHTED SUM OF SQUARES
MXLF  MAXIMUM LIKELIHOOD FUNCTION
STD   STANDARD DEVIATION
K     PEARSON SHAPE PARAMETER
Table 4.15. Optimal Selection Statistics and Corresponding
Transformation (α), Example 2, Annual Total Runoff.

Distribution    R2/α          STDE/α        WSS/α         MXLF/α
Normal          0.974/0.40    0.711/0.20    1.162/0.30    3.61/0.30
Gumbel          0.984/0.75    0.654/0.70    0.749/0.75    6.93/0.75
Rayleigh        0.976/0.60    0.745/0.40    1.042/0.60    0.98/0.60
Pearson         0.975/1.00    0.787/0.80    1.175/0.90    3.87/0.90
The overall best model according to each of the four selection
statistics was from the Gumbel family. Based on the R2 statistic, a
transformation with ALFA equal to 0.75 defined the best model. Detailed
performance of this model is shown in Tables 4.16 and 4.17. The same
model was selected using the WSS and MXLF criteria. The STDE statistic
gave a slightly different transformation, with ALFA equal to 0.70. The
performance of this model is shown in Tables 4.18 and 4.19. Note that
for both models the Student statistic exceeded the critical t value at
several data points in both tails of the distribution, but that this
exceedance was smaller for the first model. This illustrates the
superiority of the R2 and the other selection statistics over the STDE
statistic. The same conclusion is supported by comparison of the F values
and Durbin-Watson statistics (Tables 4.17 and 4.19). Figures 4.4 and 4.5
give the plots of these two models in the transformed spaces, along with
the observations and the 95% confidence limits. It is hardly possible to
distinguish between the two models from the plots.
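The fitting procedure behind Tables 4.14 through 4.19 can be sketched as follows. This is a minimal illustration, not the program used in this study. It assumes that the transformation has the Box-Cox form (x^ALFA - 1)/ALFA, with the logarithm at ALFA = 0, and that the Gumbel reduced variate is computed from the Weibull plotting position m/(N + 1); both assumptions reproduce the magnitudes of the YO and Z values in the first and last rows of Table 4.16. All function names are hypothetical.

```python
import math


def box_cox(x, alfa):
    """Power transformation (assumed Box-Cox form); natural log when alfa is 0."""
    if alfa == 0.0:
        return math.log(x)
    return (x ** alfa - 1.0) / alfa


def gumbel_reduced_variate(rank, n):
    """Gumbel reduced variate for ascending rank (1..n), Weibull plotting position."""
    p = rank / (n + 1.0)
    return -math.log(-math.log(p))


def fit_line(z, y):
    """Least-squares fit y = location + scale * z; returns (location, scale, r2)."""
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    szz = sum((zi - z_mean) ** 2 for zi in z)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    scale = szy / szz
    location = y_mean - scale * z_mean
    r2 = szy * szy / (szz * syy)  # squared correlation coefficient
    return location, scale, r2


def select_alfa_by_r2(observations, alfas):
    """Return (alfa, r2, location, scale) for the candidate alfa with the largest r2.

    Observations must be strictly positive.
    """
    xs = sorted(observations)
    n = len(xs)
    z = [gumbel_reduced_variate(m, n) for m in range(1, n + 1)]
    best = None
    for alfa in alfas:
        y = [box_cox(x, alfa) for x in xs]
        location, scale, r2 = fit_line(z, y)
        if best is None or r2 > best[1]:
            best = (alfa, r2, location, scale)
    return best
```

Replacing the R2 criterion by the standard error of the back-transformed predictions would give the STDE selection; that variant is not shown because its exact definition is not restated in this section.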
The models fitted to the rainfall series are listed in Table 4.20.
For the range of transformations considered, the optimal selection
statistics, along with the corresponding α's, are summarized in Table 4.21.
For this example too, notice the closeness of the optimal statistics of
different families of distributions compared to their variation within
the same family. Also notice that the values of the Pearson shape
parameter are very high, and that the Pearson and normal distributions
gave the same optimal transformations. This is another illustration of
the equivalence of the two distributions at such high values of k.
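The convergence of the Pearson family toward the normal can be illustrated numerically. The sketch below assumes that k is the shape parameter of a gamma (Pearson type 3) density, for which the theoretical skewness is 2/sqrt(k); this identification is an assumption made for illustration and is not stated in this section. The k values are the smallest and largest ones listed in Tables 4.14 and 4.20.

```python
import math


def gamma_skewness(k):
    """Theoretical skewness of a gamma density with shape parameter k (assumed meaning of K)."""
    return 2.0 / math.sqrt(k)


# Smallest and largest K values printed in Table 4.14 (runoff) and Table 4.20 (rainfall).
cases = [
    ("runoff, smallest k", 2.231),
    ("runoff, largest k", 8.602),
    ("rainfall, smallest k", 38.026),
    ("rainfall, largest k", 1429.815),
]
for label, k in cases:
    print(f"{label}: k = {k:9.3f}, implied skewness = {gamma_skewness(k):.3f}")
```

Under this assumption the implied skewness falls from about 1.34 for the runoff fit with the smallest k to about 0.05 for the rainfall fit with the largest k, so the rainfall Pearson models are nearly symmetric.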
Based on the R2 statistic, the optimal model is from the Pearson
family with ALFA equal to -0.15. The performance of this model is shown
Table 4.16 Best Model Based on R2 Selection Statistic, Example 2, Runoff.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF, 1934-1981
REGRESSION RESULTS
GUMBEL FAMILY GIVES OPTIMAL R2 FOR ALFA = 0.75

FREQUENCY   PERIOD  OBS. VOL  PRED. VOL  RESIDUAL        Z       YO       YP      YR       T      TO
    1.      49.000     0.499      1.539     1.040   -1.359   -0.542    0.509   1.051   2.636   1.679
    2.      24.500     2.499      2.104     0.395   -1.163    1.317    0.996   0.321   0.809   1.679
    3.      16.333     3.163      2.519     0.644   -1.027    1.829    1.332   0.497   1.256   1.679
    4.      12.250     3.198      2.864     0.334   -0.918    1.856    1.602   0.253   0.642   1.679
    5.       9.800     3.217      3.169     0.048   -0.825    1.870    1.834   0.036   0.091   1.679
    6.       8.167     3.425      3.448     0.023   -0.742    2.024    2.041   0.017   0.043   1.679
    7.       7.000     3.526      3.708     0.182   -0.666    2.098    2.230   0.132   0.336   1.679
    8.       6.125     3.705      3.955     0.250   -0.595    2.227    2.406   0.179   0.457   1.679
    9.       5.444     3.973      4.192     0.220   -0.527    2.419    2.573   0.155   0.395   1.679
   10.       4.900     4.592      4.422     0.170   -0.463    2.849    2.733   0.117   0.299   1.679
   11.       4.455     4.790      4.646     0.144   -0.401    2.984    2.886   0.098   0.250   1.679
   12.       4.083     4.795      4.866     0.071   -0.341    2.987    3.035   0.048   0.124   1.679
   13.       3.769     4.962      5.083     0.121   -0.283    3.100    3.181   0.081   0.207   1.679
   14.       3.500     5.105      5.299     0.194   -0.225    3.195    3.323   0.128   0.329   1.679
   15.       3.267     5.439      5.513     0.075   -0.169    3.415    3.464   0.049   0.125   1.679
   16.       3.063     5.602      5.727     0.126   -0.113    3.522    3.603   0.081   0.209   1.679
   17.       2.882     5.807      5.942     0.135   -0.057    3.654    3.741   0.087   0.224   1.679
   18.       2.722     6.507      6.158     0.349   -0.001    4.099    3.879   0.220   0.565   1.679
   19.       2.579     6.745      6.377     0.368    0.054    4.247    4.017   0.230   0.592   1.679
   20.       2.450     7.246      6.597     0.649    0.110    4.555    4.155   0.400   1.030   1.679
   21.       2.333     7.301      6.821     0.480    0.166    4.589    4.294   0.295   0.758   1.679
   22.       2.227     7.334      7.048     0.286    0.222    4.609    4.434   0.175   0.449   1.679
   23.       2.130     7.495      7.280     0.215    0.279    4.707    4.576   0.130   0.336   1.679
   24.       2.042     7.495      7.517     0.022    0.337    4.707    4.720   0.013   0.034   1.679
   25.       1.960     7.563      7.761     0.198    0.396    4.747    4.866   0.119   0.306   1.679
   26.       1.885     8.142      8.010     0.132    0.456    5.094    5.015   0.078   0.202   1.679
   27.       1.815     8.925      8.268     0.658    0.518    5.552    5.168   0.384   0.989   1.679
   28.       1.750     9.258      8.534     0.724    0.581    5.743    5.324   0.419   1.081   1.679
   29.       1.690     9.401      8.810     0.591    0.645    5.825    5.485   0.340   0.876   1.679
   30.       1.633     9.437      9.097     0.340    0.712    5.845    5.651   0.195   0.502   1.679
   31.       1.581     9.503      9.396     0.107    0.781    5.883    5.823   0.061   0.157   1.679
   32.       1.531     9.593      9.710     0.117    0.853    5.935    6.001   0.066   0.171   1.679
   33.       1.485     9.960     10.040     0.080    0.928    6.142    6.187   0.045   0.116   1.679
   34.       1.441    10.364     10.389     0.024    1.007    6.369    6.382   0.013   0.035   1.679
   35.       1.400    10.391     10.759     0.368    1.089    6.383    6.587   0.204   0.524   1.679
   36.       1.361    10.965     11.154     0.188    1.177    6.701    6.804   0.103   0.265   1.679
   37.       1.324    11.026     11.578     0.552    1.270    6.735    7.035   0.301   0.772   1.679
   38.       1.289    11.097     12.037     0.940    1.369    6.773    7.283   0.510   1.307   1.679
   39.       1.256    11.772     12.538     0.766    1.477    7.140    7.551   0.410   1.050   1.679
   40.       1.225    12.375     13.090     0.716    1.595    7.464    7.843   0.379   0.968   1.679
   41.       1.195    12.603     13.706     1.104    1.725    7.585    8.165   0.580   1.478   1.679
   42.       1.167    13.061     14.405     1.343    1.870    7.827    8.525   0.698   1.775   1.679
   43.       1.140    15.461     15.211     0.249    2.035    9.063    8.937   0.126   0.319   1.679
   44.       1.114    17.365     16.168     1.197    2.229   10.009    9.417   0.591   1.492   1.679
   45.       1.089    18.212     17.344     0.867    2.463   10.421    9.999   0.422   1.059   1.679
   46.       1.065    18.983     18.873     0.109    2.762   10.792   10.740   0.052   0.130   1.679
   47.       1.043    19.030     21.057     2.027    3.178   10.815   11.773   0.958   2.348   1.679
   48.       1.021    27.064     24.880     2.184    3.882   14.487   13.520   0.967   2.305   1.679

Z   REDUCED VARIATE
YO  TRANSFORMED OBSERVATION
YP  TRANSFORMED PREDICTION
YR  TRANSFORMED RESIDUAL
T   STUDENT STATISTIC
TO  95% STUDENT T
Table 4.17 Detailed Statistics for the R2
Selected Model, Example 2, Runoff.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF, 1934-1981
REGRESSION RESULTS : GUMBEL FAMILY GIVES OPTIMAL R2 FOR ALFA = 0.75

BASIC DESCRIPTIVE STATISTICS
                                    STANDARD
                          MEAN     DEVIATION   CORRELATION
TRANSFORMED VARIABLE     5.243       2.928        0.992
REDUCED VARIATE          0.548       1.170

ANALYSIS OF VARIANCE
SOURCE OF     DEGREE OF     SUMS OF       MEAN                   P(EXCEEDING F
VARIATION     FREEDOM       SQUARES      SQUARES      F-VALUE      UNDER HO)
REGRESSION      1.000       396.327      396.327     2806.357        0.0
RESIDUAL       46.000         6.496        0.141
TOTAL          47.000       402.823

DURBIN-WATSON STATISTIC   1.309

MODEL PARAMETER INFERENCES
                                       LOWER         UPPER
              POINT      STANDARD    CONFIDENCE    CONFIDENCE
             ESTIMATE      ERROR       LIMIT         LIMIT
SCALE          2.483       0.047       2.357         2.609
LOCATION       3.883       0.060       3.762         4.004
COVARIANCE     0.001
ALFA           0.75
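The F value in the analysis of variance above can be checked from the sums of squares, and a Durbin-Watson statistic can be computed from the residuals taken in rank order. The sketch below is illustrative only: the function names are hypothetical, and the Durbin-Watson formula is the standard textbook definition, assumed here to be the one used by the program.

```python
def f_value(ss_regression, ss_residual, df_regression, df_residual):
    """Ratio of the regression mean square to the residual mean square."""
    return (ss_regression / df_regression) / (ss_residual / df_residual)


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Sums of squares and degrees of freedom as printed in Table 4.17.
f = f_value(396.327, 6.496, 1, 46)
print(f"F computed from rounded sums of squares: {f:.1f}")
```

With the rounded sums of squares the ratio comes out near 2806, in line with the tabulated F value. A Durbin-Watson value well below 2, such as the 1.309 reported here, indicates positively correlated successive residuals.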
Table 4.18 Best Model Based on the STDE
Selection Statistic, Example 2, Runoff.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF, 1934-1981
REGRESSION RESULTS
GUMBEL FAMILY GIVES OPTIMAL 'STDE' FOR ALFA = 0.70

FREQUENCY   PERIOD  OBS. VOL  PRED. VOL  RESIDUAL        Z       YO       YP      YR       T      TO
    1.      49.000     0.499      1.687     1.188   -1.359   -0.551    0.631   1.182   3.250   1.679
    2.      24.500     2.499      2.219     0.280   -1.163    1.284    1.067   0.216   0.598   1.679
    3.      16.333     3.163      2.611     0.552   -1.027    1.770    1.368   0.402   1.114   1.679
    4.      12.250     3.198      2.939     0.259   -0.918    1.795    1.610   0.185   0.515   1.679
    5.       9.800     3.217      3.230     0.012   -0.825    1.808    1.817   0.009   0.024   1.679
    6.       8.167     3.425      3.496     0.071   -0.742    1.953    2.002   0.049   0.136   1.679
    7.       7.000     3.526      3.745     0.219   -0.666    2.023    2.172   0.148   0.415   1.679
    8.       6.125     3.705      3.982     0.277   -0.595    2.145    2.330   0.185   0.517   1.679
    9.       5.444     3.973      4.210     0.237   -0.527    2.323    2.479   0.155   0.435   1.679
   10.       4.900     4.592      4.431     0.161   -0.463    2.724    2.622   0.102   0.287   1.679
   11.       4.455     4.790      4.648     0.142   -0.401    2.848    2.759   0.089   0.251   1.679
   12.       4.083     4.795      4.860     0.066   -0.341    2.851    2.892   0.041   0.115   1.679
   13.       3.769     4.962      5.071     0.109   -0.283    2.956    3.023   0.067   0.188   1.679
   14.       3.500     5.105      5.280     0.175   -0.225    3.043    3.150   0.107   0.300   1.679
   15.       3.267     5.439      5.489     0.050   -0.169    3.246    3.276   0.030   0.085   1.679
   16.       3.063     5.602      5.698     0.096   -0.113    3.344    3.401   0.057   0.160   1.679
   17.       2.882     5.807      5.907     0.100   -0.057    3.465    3.524   0.059   0.166   1.679
   18.       2.722     6.507      6.119     0.388   -0.001    3.871    3.648   0.224   0.630   1.679
   19.       2.579     6.745      6.332     0.413    0.054    4.006    3.771   0.235   0.662   1.679
   20.       2.450     7.246      6.548     0.698    0.110    4.286    3.895   0.391   1.103   1.679
   21.       2.333     7.301      6.768     0.533    0.166    4.316    4.019   0.297   0.837   1.679
   22.       2.227     7.334      6.992     0.342    0.222    4.334    4.145   0.190   0.535   1.679
   23.       2.130     7.495      7.220     0.275    0.279    4.423    4.272   0.151   0.426   1.679
   24.       2.042     7.495      7.454     0.041    0.337    4.423    4.400   0.022   0.063   1.679
   25.       1.960     7.563      7.695     0.132    0.396    4.460    4.531   0.072   0.202   1.679
   26.       1.885     8.142      7.942     0.200    0.456    4.772    4.665   0.107   0.303   1.679
   27.       1.815     8.925      8.197     0.728    0.518    5.184    4.801   0.382   1.080   1.679
   28.       1.750     9.258      8.462     0.797    0.581    5.355    4.941   0.414   1.170   1.679
   29.       1.690     9.401      8.736     0.665    0.645    5.428    5.085   0.343   0.969   1.679
   30.       1.633     9.437      9.022     0.415    0.712    5.446    5.234   0.213   0.601   1.679
   31.       1.581     9.503      9.321     0.182    0.781    5.480    5.387   0.093   0.263   1.679
   32.       1.531     9.593      9.634     0.041    0.853    5.526    5.547   0.021   0.059   1.679
   33.       1.485     9.960      9.965     0.004    0.928    5.711    5.714   0.002   0.006   1.679
   34.       1.441    10.364     10.314     0.050    1.007    5.913    5.888   0.025   0.070   1.679
   35.       1.400    10.391     10.686     0.295    1.089    5.926    6.072   0.146   0.410   1.679
   36.       1.361    10.965     11.084     0.118    1.177    6.208    6.266   0.058   0.162   1.679
   37.       1.324    11.026     11.512     0.486    1.270    6.238    6.473   0.235   0.661   1.679
   38.       1.289    11.097     11.976     0.879    1.369    6.272    6.694   0.422   1.186   1.679
   39.       1.256    11.772     12.484     0.712    1.477    6.597    6.934   0.337   0.945   1.679
   40.       1.225    12.375     13.044     0.670    1.595    6.883    7.195   0.313   0.875   1.679
   41.       1.195    12.603     13.672     1.069    1.725    6.990    7.484   0.494   1.380   1.679
   42.       1.167    13.061     14.385     1.323    1.870    7.203    7.806   0.603   1.682   1.679
   43.       1.140    15.461     15.211     0.250    2.035    8.285    8.174   0.110   0.306   1.679
   44.       1.114    17.365     16.194     1.171    2.229    9.107    8.605   0.503   1.390   1.679
   45.       1.089    18.212     17.407     0.805    2.463    9.464    9.125   0.339   0.932   1.679
   46.       1.065    18.983     18.991     0.008    2.762    9.785    9.789   0.003   0.009   1.679
   47.       1.043    19.030     21.266     2.236    3.178    9.805   10.713   0.908   2.441   1.679
   48.       1.021    27.064     25.284     1.780    3.882   12.945   12.277   0.668   1.746   1.679

Z   REDUCED VARIATE
YO  TRANSFORMED OBSERVATION
YP  TRANSFORMED PREDICTION
YR  TRANSFORMED RESIDUAL
T   STUDENT STATISTIC
TO  95% STUDENT T
Table 4.19 Detailed Statistics for the STDE
Selected Model, Example 2, Runoff.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RUNOFF, 1934-1981
REGRESSION RESULTS : GUMBEL FAMILY GIVES OPTIMAL 'STDE' FOR ALFA = 0.70

BASIC DESCRIPTIVE STATISTICS
                                    STANDARD
                          MEAN     DEVIATION   CORRELATION
TRANSFORMED VARIABLE     4.868       2.621        0.992
REDUCED VARIATE          0.548       1.170

ANALYSIS OF VARIANCE
SOURCE OF     DEGREE OF     SUMS OF       MEAN                   P(EXCEEDING F
VARIATION     FREEDOM       SQUARES      SQUARES      F-VALUE      UNDER HO)
REGRESSION      1.000       317.503      317.503     2702.015        0.0
RESIDUAL       46.000         5.405        0.118
TOTAL          47.000       322.908

DURBIN-WATSON STATISTIC   1.250

MODEL PARAMETER INFERENCES
                                       LOWER         UPPER
              POINT      STANDARD    CONFIDENCE    CONFIDENCE
             ESTIMATE      ERROR       LIMIT         LIMIT
SCALE          2.222       0.043       2.107         2.337
LOCATION       3.651       0.055       3.541         3.761
COVARIANCE     0.001
ALFA           0.70
Figure 4.4 Linear Plot of R2 Selected Model, Example 2, Runoff.
(Plot title: Kissimmee River Runoff Data, Gumbel, R2, ALFA = 0.75;
horizontal axis: reduced variate.)

Figure 4.5 Linear Plot of STDE Selected Model, Example 2, Runoff.
(Plot title: Kissimmee River Runoff Data, Gumbel, STDE, ALFA = 0.70;
horizontal axis: reduced variate.)
Table 4.20 Models, Parameters and Selection Statistics, Example 2, Rainfall.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RAINFALL, 1934-1981
REGRESSION RESULTS
PEARSON FAMILY GIVES OPTIMAL 'R2' FOR ALFA = -0.15

LAW        ALFA  LOCATION     SCALE       R2      STDE       WSS    MXLF     MEAN     STD    SKEW          K
NORMAL    -0.20   2.70699   0.07514  0.97485   1.26700   1.12398    2.81    2.707   0.072   0.0
NORMAL    -0.10   3.22880   0.11101  0.97747   1.29439   1.14548    3.26    3.229   0.106   0.373
NORMAL     0.00   3.90027   0.16403  0.97619   1.32377   1.17662    3.90    3.900   0.156   0.248
NORMAL     0.10   4.77197   0.24244  0.97501   1.35482   1.22201    4.81    4.772   0.231   0.211
NORMAL     0.20   5.91316   0.35843  0.97234   1.38741   1.27969    5.92    5.913   0.342   0.189
NORMAL     0.30   7.41906   0.53002  0.97024   1.42127   1.35193    7.24    7.419   0.506   0.249
NORMAL     0.40   9.42111   0.78396  0.96841   1.45635   1.43029    8.72    9.421   0.749   0.314
NORMAL     0.50  12.10142   1.159U5  0.96729   1.49263   1.53947   10.35   12.101   1.109   0.389
NORMAL     0.60  15.71303   1.71639  0.96518   1.53000   1.65251   12.06   15.713   1.643   0.446
NORMAL     0.70  20.60889   2.54066  0.96289   1.56800   1.77832   13.82   20.609   2.436   0.503
NORMAL     0.80  27.28258   3.76160  0.96002   1.60891   1.91590   15.60   27.283   3.611   0.558
NORMAL     0.90  36.42635   5.57061  0.95691   1.65053   2.06532   17.41   36.426   5.357   0.611
NORMAL     1.00  49.01389   8.25167  0.95363   1.69386   2.22625   19.21   49.014   7.948   0.669
GUMBEL    -0.20   2.67440   0.05951  0.94520   1.63248   2.39979   21.01    2.707   0.072   0.0
GUMBEL    -0.10   3.18054   0.08811  0.95206   1.56705   2.24200   19.38    3.229   0.106   0.373
GUMBEL     0.00   3.82879   0.13050  0.95516   1.50860   2.08438   17.63    3.900   0.156   0.248
GUMBEL     0.10   4.66609   0.19332  0.95835   1.45649   1.93725   15.87    4.772   0.231   0.211
GUMBEL     0.20   5.75627   0.28646  0.96013   1.41022   1.79809   14.08    5.913   0.342   0.189
GUMBEL     0.30   7.18652   0.42458  0.96248   1.36925   1.67140   12.33    7.419   0.506   0.249
GUMBEL     0.40   9.07638   0.62944  0.96507   1.33312   1.55664   10.62    9.421   0.749   0.314
GUMBEL     0.50  11.59023   0.93336  0.96837   1.30153   1.45483    9.00   12.101   1.109   0.389
GUMBEL     0.60  14.95482   1.38439  0.97067   1.27415   1.36361    7.44   15.713   1.643   0.446
GUMBEL     0.70  19.48402   2.05384  0.97274   1.25072   1.28455    6.01   20.609   2.436   0.503
GUMBEL     0.80  25.61336   3.04777  0.97427   1.23105   1.21751    4.72   27.283   3.611   0.558
GUMBEL     0.90  33.94870   4.52381  0.97556   1.21495   1.16314    3.63   36.426   5.357   0.611
GUMBEL     1.00  45.33540   6.71640  0.97668   1.20231   1.12165    2.76   49.014   7.948   0.669
RAYLEIGH  -0.20   2.56676   0.11278  0.95389   1.42538   2.05519   17.29    2.707   0.072   0.0
RAYLEIGH  -0.10   3.02133   0.16686  0.95929   1.40786   1.95379   16.07    3.229   0.106   0.373
RAYLEIGH   0.00   3.59325   0.24693  0.96085   1.39488   1.85649   14.85    3.900   0.156   0.248
RAYLEIGH   0.10   4.31754   0.36549  0.96244   1.38593   1.77200   13.73    4.772   0.231   0.211
RAYLEIGH   0.20   5.24030   0.54117  0.96275   1.38067   1.69783   12.70    5.913   0.342   0.189
RAYLEIGH   0.30   6.42255   0.80147  0.96362   1.37858   1.63785   11.84    7.419   0.506   0.249
RAYLEIGH   0.40   7.94491   1.18728  0.96475   1.37935   1.59153   11.15    9.421   0.749   0.314
RAYLEIGH   0.50   9.91418   1.75916  0.96650   1.38272   1.55978   10.67   12.101   1.109   0.389
RAYLEIGH   0.60  12.47144   2.60715  0.96726   1.38846   1.53987   10.36   15.713   1.643   0.446
RAYLEIGH   0.70  15.80368   3.86475  0.96774   1.39640   1.53325   10.26   20.609   2.436   0.503
RAYLEIGH   0.80  20.15771   5.73041  0.96769   1.40642   1.53946   10.35   27.283   3.611   0.558
RAYLEIGH   0.90  25.85942   8.49870  0.96741   1.41838   1.55892   10.66   36.426   5.357   0.611
RAYLEIGH   1.00  33.33830  12.60759  0.96693   1.43226   1.59170   11.16   49.014   7.948   0.669
PEARSON   -0.20   0.69771   0.00281  0.97499   1.23424   1.12738    2.88    2.707   0.072   0.0     1429.815
PEARSON   -0.10   0.82895   0.00514  0.98043   1.24908   1.13291    2.99    3.229   0.106   0.373    934.586
PEARSON    0.00   1.00018   0.00930  0.97954   1.26285   1.14134    3.17    3.900   0.156   0.248    623.750
PEARSON    0.10   1.22488   0.01662  0.97839   1.27634   1.15625    3.48    4.772   0.231   0.211    426.898
PEARSON    0.20   1.52485   0.02935  0.97581   1.28903   1.17444    3.86    5.913   0.342   0.189    299.088
PEARSON    0.30   1.91270   0.05128  0.97484   1.29987   1.19734    4.32    7.419   0.506   0.249    214.849
PEARSON    0.40   2.42805   0.08854  0.97422   1.30974   1.22390    4.85    9.421   0.749   0.314    158.058
PEARSON    0.50   3.11804   0.15109  0.97420   1.31883   1.25409    5.43   12.101   1.109   0.389    119.006
PEARSON    0.60   4.05163   0.25537  0.97347   1.32692   1.28508    6.02   15.713   1.643   0.446     91.419
PEARSON    0.70   5.32021   0.42757  0.97263   1.33426   1.31757    6.62   20.609   2.436   0.503     71.603
PEARSON    0.80   7.04657   0.71021  0.97188   1.34080   1.35070    7.22   27.283   3.611   0.558     57.075
PEARSON    0.90   9.41202   1.17064  0.97118   1.34692   1.38482    7.81   36.426   5.357   0.611     46.242
PEARSON    1.00  12.69844   1.91448  0.97051   1.35438   1.42331    8.47   49.014   7.948   0.669     38.026

R2    CORRELATION COEFFICIENT
STDE  STANDARD ERROR
WSS   WEIGHTED SUM OF SQUARES
MXLF  MAXIMUM LIKELIHOOD FUNCTION
STD   STANDARD DEVIATION
K     PEARSON SHAPE PARAMETER
Table 4.21. Optimal Selection Statistics and Corresponding (α),
Example 2, Annual Total Rainfall.

Distribution    R2/α           STDE/α         WSS/α          MXLF/α
Normal          0.978/-0.15    1.267/-0.20    1.124/-0.20    2.81/-0.20
Gumbel          0.977/1.00     1.202/1.00     1.122/1.00     2.76/1.00
Rayleigh        0.968/0.70     1.379/0.30     1.533/0.70     10.26/0.70
Pearson         0.981/-0.15    1.234/-0.20    1.127/-0.20    2.88/-0.20
Table 4.22 Best Model Based on R2 Selection Statistic, Example 2, Rainfall.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RAINFALL, 1934-1981
EVENT'S AVERAGE DURATION = 1. YEAR(S)
PLOTTING POSITION CONSTANT = 0.0
ADDITIVE CONSTANT = 0.0
MULTIPLICATIVE CONSTANT = 1.000
REGRESSION RESULTS
PEARSON FAMILY GIVES OPTIMAL 'R2' FOR ALFA = -0.15

FREQUENCY   PERIOD  OBS. VOL  PRED. VOL  RESIDUAL         Z      YO      YP      YR       T      TO
    1.      49.000    34.447     35.822     1.376   530.512   2.746   2.769   0.023   0.103   1.679
    2.      24.500    36.171     37.478     1.307   537.446   2.775   2.795   0.021   0.091   1.679
    3.      16.333    37.319     38.604     1.285   541.963   2.793   2.813   0.020   0.086   1.679
    4.      12.250    38.340     39.496     1.156   545.434   2.809   2.826   0.017   0.074   1.679
    5.       9.800    41.555     40.253     1.302   548.310   2.855   2.837   0.018   0.079   1.679
    6.       8.167    42.126     40.922     1.204   550.801   2.863   2.846   0.017   0.071   1.679
    7.       7.000    42.240     41.529     0.711   553.021   2.864   2.855   0.010   0.041   1.679
    8.       6.125    42.621     42.091     0.530   555.042   2.869   2.862   0.007   0.030   1.679
    9.       5.444    42.661     42.617     0.044   556.908   2.870   2.869   0.001   0.002   1.679
   10.       4.900    44.619     43.116     1.503   558.652   2.895   2.876   0.019   0.082   1.679
   11.       4.455    44.775     43.593     1.181   560.299   2.897   2.882   0.015   0.064   1.679
   12.       4.083    44.983     44.053     0.930   561.866   2.900   2.888   0.012   0.050   1.679
   13.       3.769    45.565     44.499     1.066   563.366   2.907   2.894   0.013   0.056   1.679
   14.       3.500    45.700     44.933     0.768   564.812   2.909   2.899   0.010   0.040   1.679
   15.       3.267    45.849     45.357     0.491   566.211   2.911   2.905   0.006   0.025   1.679
   16.       3.063    46.090     45.775     0.315   567.572   2.914   2.910   0.004   0.016   1.679
   17.       2.882    47.000     46.187     0.813   568.901   2.925   2.915   0.010   0.041   1.679
   18.       2.722    47.257     46.595     0.662   570.204   2.928   2.920   0.008   0.033   1.679
   19.       2.579    47.382     47.000     0.382   571.484   2.929   2.925   0.005   0.019   1.679
   20.       2.450    47.415     47.404     0.011   572.747   2.930   2.930   0.000   0.001   1.679
   21.       2.333    47.797     47.807     0.010   573.996   2.934   2.934   0.000   0.000   1.679
   22.       2.227    48.307     48.211     0.096   575.236   2.940   2.939   0.001   0.005   1.679
   23.       2.130    49.098     48.617     0.481   576.469   2.949   2.944   0.005   0.023   1.679
   24.       2.042    49.590     49.025     0.565   577.699   2.955   2.948   0.006   0.026   1.679
   25.       1.960    50.270     49.438     0.832   578.929   2.962   2.953   0.009   0.038   1.679
   26.       1.885    50.517     49.855     0.662   580.161   2.965   2.958   0.007   0.030   1.679
   27.       1.815    50.865     50.279     0.586   581.403   2.969   2.962   0.006   0.026   1.679
   28.       1.750    50.911     50.711     0.200   582.653   2.969   2.967   0.002   0.009   1.679
   29.       1.690    51.067     51.151     0.084   583.916   2.971   2.972   0.001   0.004   1.679
   30.       1.633    51.496     51.603     0.107   585.197   2.976   2.977   0.001   0.005   1.679
   31.       1.581    51.704     52.066     0.363   586.500   2.978   2.982   0.004   0.016   1.679
   32.       1.531    51.911     52.544     0.634   587.829   2.980   2.987   0.007   0.027   1.679
   33.       1.485    51.958     53.039     1.081   589.190   2.981   2.992   0.011   0.046   1.679
   34.       1.441    52.158     53.552     1.394   590.587   2.983   2.997   0.015   0.058   1.679
   35.       1.400    52.751     54.088     1.337   592.030   2.989   3.003   0.014   0.055   1.679
   36.       1.361    53.135     54.650     1.515   593.523   2.993   3.008   0.015   0.061   1.679
   37.       1.324    53.158     55.242     2.084   595.079   2.993   3.014   0.021   0.084   1.679
   38.       1.289    53.193     55.870     2.678   596.710   2.994   3.021   0.027   0.107   1.679
   39.       1.256    54.534     56.543     2.009   598.432   3.007   3.027   0.020   0.078   1.679
   40.       1.225    55.740     57.267     1.528   600.260   3.019   3.034   0.015   0.058   1.679
   41.       1.195    56.718     58.059     1.341   602.227   3.029   3.041   0.013   0.050   1.679
   42.       1.167    58.185     58.933     0.748   604.365   3.043   3.050   0.007   0.027   1.679
   43.       1.140    59.423     59.916     0.493   606.726   3.054   3.059   0.004   0.017   1.679
   44.       1.114    59.693     61.049     1.356   609.392   3.057   3.069   0.012   0.047   1.679
   45.       1.089    64.474     62.396     2.078   612.489   3.098   3.080   0.018   0.068   1.679
   46.       1.065    67.449     64.080     3.369   616.254   3.122   3.095   0.027   0.105   1.679
   47.       1.043    68.231     66.373     1.858   621.201   3.128   3.114   0.015   0.056   1.679
   48.       1.021    72.230     70.133     2.097   628.903   3.158   3.143   0.016   0.058   1.679

Z   REDUCED VARIATE
YO  TRANSFORMED OBSERVATION
YP  TRANSFORMED PREDICTION
YR  TRANSFORMED RESIDUAL
T   STUDENT STATISTIC
TO  95% STUDENT T
Table 4.23 Detailed Statistics for the R2
Selected Model, Example 2, Rainfall.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RAINFALL, 1934-1981
REGRESSION RESULTS : PEARSON FAMILY GIVES OPTIMAL R2 FOR ALFA = -0.15

BASIC DESCRIPTIVE STATISTICS
                                    STANDARD
                          MEAN     DEVIATION   CORRELATION
TRANSFORMED VARIABLE     2.952       0.087        0.988
REDUCED VARIATE        570.602      22.624

ANALYSIS OF VARIANCE
SOURCE OF     DEGREE OF     SUMS OF       MEAN                   P(EXCEEDING F
VARIATION     FREEDOM       SQUARES      SQUARES      F-VALUE      UNDER HO)
REGRESSION      1.000         0.347        0.347     1900.204        0.0
RESIDUAL       46.000         0.008        0.000
TOTAL          47.000         0.355

DURBIN-WATSON STATISTIC   0.399

MODEL PARAMETER INFERENCES
                                       LOWER         UPPER
              POINT      STANDARD    CONFIDENCE    CONFIDENCE
             ESTIMATE      ERROR       LIMIT         LIMIT
SCALE          0.004       0.000       0.004         0.004
LOCATION       0.755       0.050       0.653         0.856
COVARIANCE     0.000
in Tables 4.22 and 4.23. The other three selection statistics gave the
same model, the Gumbel with no transformation (ALFA = 1.0). The
performance of this model is shown in Tables 4.24 and 4.25. For this
example too, the Student statistics suggest a much better fit for the
model selected by the optimal R2 statistic than for the model selected by
the other statistics. The plots of these two models in the transformed
spaces (Figures 4.6 and 4.7) do not show any visible difference at the
95% confidence level. From this, it is possible to conclude that the
Student statistic is more powerful than the graphical display of the
confidence intervals in assessing the goodness of fit of the model.
4.4. Sensitivity Analysis
4.4.1. Sensitivity to the plotting position definition
In the previous section, the series of the two illustrative examples
were fitted using the Weibull plotting position (constant a = 0.0 in
Equation 3.2.4). The sensitivity of the selection statistics to the
definition of the plotting position was investigated by varying the
constant a from 0.0 to 1.0. This range covers all recommended values
for different distributions (Table 3.5 and Section 2.2.7). The optimal
values of the R2 and STDE statistics are given in Tables 4.26 and 4.27 for
the Kissimmee River rainfall series (Section 4.2). From these tables
we note that the shape parameter is practically independent of the
definition of the plotting position, and that the sensitivity of the
optimal statistics to a change in this definition is very small
compared to their sensitivity to a change in the shape of the
distribution (transformation α). Due to this low sensitivity, no clear
conclusion can be drawn about the appropriate plotting
position for each distribution family. But it is interesting to notice
Table 4.24 Best Model Based on the STDE Selection Statistic, Example 2, Rainfall.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RAINFALL, 1934-1981
EVENT'S AVERAGE DURATION = 1. YEAR(S)
PLOTTING POSITION CONSTANT = 0.0
ADDITIVE CONSTANT = 0.0
MULTIPLICATIVE CONSTANT = 1.000
REGRESSION RESULTS
GUMBEL FAMILY GIVES OPTIMAL 'STDE' FOR ALFA = 1.00

FREQUENCY   PERIOD  OBS. VOL  PRED. VOL  RESIDUAL        Z       YO       YP      YR       T      TO
    1.      49.000    34.447     37.209     2.762   -1.359   33.446   36.209   2.762   2.122   1.679
    2.      24.500    36.171     38.526     2.355   -1.163   35.171   37.526   2.355   1.819   1.679
    3.      16.333    37.319     39.437     2.117   -1.027   36.319   38.437   2.118   1.641   1.679
    4.      12.250    38.340     40.167     1.827   -0.918   37.340   39.167   1.827   1.419   1.679
    5.       9.800    41.555     40.793     0.762   -0.825   40.555   39.793   0.762   0.593   1.679
    6.       8.167    42.126     41.352     0.774   -0.742   41.126   40.352   0.774   0.603   1.679
    7.       7.000    42.240     41.864     0.376   -0.666   41.240   40.864   0.376   0.293   1.679
    8.       6.125    42.621     42.342     0.279   -0.595   41.621   41.342   0.279   0.218   1.679
    9.       5.444    42.661     42.793     0.132   -0.527   41.661   41.793   0.132   0.103   1.679
   10.       4.900    44.619     43.224     1.395   -0.463   43.619   42.224   1.395   1.092   1.679
   11.       4.455    44.775     43.640     1.135   -0.401   43.775   42.640   1.135   0.890   1.679
   12.       4.083    44.983     44.043     0.940   -0.341   43.983   43.043   0.940   0.738   1.679
   13.       3.769    45.565     44.436     1.129   -0.283   44.565   43.436   1.129   0.886   1.679
   14.       3.500    45.700     44.822     0.878   -0.225   44.700   43.822   0.878   0.690   1.679
   15.       3.267    45.849     45.202     0.646   -0.169   44.849   44.202   0.646   0.508   1.679
   16.       3.063    46.090     45.579     0.511   -0.113   45.090   44.579   0.511   0.402   1.679
   17.       2.882    47.000     45.953     1.047   -0.057   46.000   44.953   1.047   0.824   1.679
   18.       2.722    47.257     46.326     0.932   -0.001   46.257   45.326   0.932   0.734   1.679
   19.       2.579    47.382     46.699     0.683    0.054   46.382   45.699   0.683   0.538   1.679
   20.       2.450    47.415     47.072     0.342    0.110   46.415   46.072   0.342   0.270   1.679
   21.       2.333    47.797     47.448     0.349    0.166   46.797   46.448   0.349   0.275   1.679
   22.       2.227    48.307     47.828     0.479    0.222   47.307   46.828   0.479   0.378   1.679
   23.       2.130    49.098     48.211     0.886    0.279   48.098   47.211   0.886   0.699   1.679
   24.       2.042    49.590     48.600     0.990    0.337   48.590   47.600   0.989   0.781   1.679
   25.       1.960    50.270     48.996     1.274    0.396   49.270   47.996   1.274   1.005   1.679
   26.       1.885    50.517     49.399     1.118    0.456   49.517   48.399   1.118   0.882   1.679
   27.       1.815    50.865     49.811     1.053    0.518   49.865   48.811   1.053   0.831   1.679
   28.       1.750    50.911     50.234     0.677    0.581   49.911   49.234   0.677   0.534   1.679
   29.       1.690    51.067     50.669     0.398    0.645   50.067   49.669   0.398   0.314   1.679
   30.       1.633    51.496     51.118     0.378    0.712   50.496   50.118   0.378   0.298   1.679
   31.       1.581    51.704     51.583     0.121    0.781   50.704   50.583   0.121   0.096   1.679
   32.       1.531    51.911     52.065     0.155    0.853   50.910   51.065   0.155   0.122   1.679
   33.       1.485    51.958     52.569     0.611    0.928   50.957   51.569   0.611   0.482   1.679
   34.       1.441    52.158     53.096     0.938    1.007   51.158   52.096   0.938   0.739   1.679
   35.       1.400    52.751     53.651     0.900    1.089   51.751   52.651   0.900   0.709   1.679
   36.       1.361    53.135     54.238     1.103    1.177   52.135   53.238   1.103   0.868   1.679
   37.       1.324    53.158     54.864     1.705    1.270   52.158   53.864   1.705   1.341   1.679
   38.       1.289    53.193     55.534     2.341    1.369   52.193   54.534   2.341   1.838   1.679
   39.       1.256    54.534     56.257     1.723    1.477   53.534   55.257   1.724   1.352   1.679
   40.       1.225    55.740     57.047     1.307    1.595   54.739   56.047   1.308   1.024   1.679
   41.       1.195    56.718     57.918     1.201    1.725   55.718   56.918   1.201   0.938   1.679
   42.       1.167    58.185     58.894     0.709    1.870   57.185   57.894   0.709   0.552   1.679
   43.       1.140    59.423     60.006     0.583    2.035   58.423   59.006   0.583   0.453   1.679
   44.       1.114    59.693     61.307     1.614    2.229   58.693   60.307   1.614   1.247   1.679
   45.       1.089    64.474     62.880     1.595    2.463   63.474   61.880   1.595   1.225   1.679
   46.       1.065    67.449     64.885     2.564    2.762   66.449   63.885   2.564   1.952   1.679
   47.       1.043    68.231     67.679     0.552    3.178   67.231   66.679   0.552   0.414   1.679
   48.       1.021    72.230     72.405     0.175    3.882   71.230   71.405   0.175   0.128   1.679

Z   REDUCED VARIATE
YO  TRANSFORMED OBSERVATION
YP  TRANSFORMED PREDICTION
YR  TRANSFORMED RESIDUAL
T   STUDENT STATISTIC
TO  95% STUDENT T
Table 4.25 Detailed Statistics for the STDE
Selected Model, Example 2, Rainfall Series.

EXAMPLE 2 KISSIMMEE RIVER ANNUAL TOTAL RAINFALL, 1934-1981
REGRESSION RESULTS : GUMBEL FAMILY GIVES OPTIMAL 'STDE' FOR ALFA = 1.00

BASIC DESCRIPTIVE STATISTICS
                                    STANDARD
                          MEAN     DEVIATION   CORRELATION
TRANSFORMED VARIABLE    49.014       7.948        0.988
REDUCED VARIATE          0.548       1.170

ANALYSIS OF VARIANCE
SOURCE OF     DEGREE OF     SUMS OF       MEAN                   P(EXCEEDING F
VARIATION     FREEDOM       SQUARES      SQUARES      F-VALUE      UNDER HO)
REGRESSION      1.000      2900.031     2900.031     1923.143        0.0
RESIDUAL       46.000        69.366        1.508
TOTAL          47.000      2969.397

DURBIN-WATSON STATISTIC   0.431

MODEL PARAMETER INFERENCES
                                       LOWER         UPPER
              POINT      STANDARD    CONFIDENCE    CONFIDENCE
             ESTIMATE      ERROR       LIMIT         LIMIT
SCALE          6.716       0.153       6.305         7.128
LOCATION      45.336       0.196      44.941        45.730
COVARIANCE     0.013
ALFA           1.00
124
KISSIMMEE RIVER RAINFALL DATA
PEARSON R2 ALFA = 0.15
Figure 4.6 Linear Plot of R2 Selected Model, Example 2, Rainfall

KISSIMMEE RIVER RAINFALL DATA
GUMBEL STDE ALFA = 1.00
Figure 4.7 Linear Plot of STDE Selected Model, Example 2, Rainfall
Table 4.26. Sensitivity of Optimal Selection Statistic (R2) and Corresponding
Transformations (a) to Plotting Position Definition. Maximum
Values of R2 are underlined. Entries are R2/a.

Plotting
Position
Constant    Normal          Gumbel          Rayleigh        Pearson
0.00        0.97747/0.10    0.97668/1.0     0.96774/0.70    0.98043/0.10
0.200       0.98015/0.10    0.95773/0.10    0.96401/0.10    0.98245/0.10
0.300       0.98122/0.10    0.95670/0.10    0.96428/0.10    0.98221/0.10
0.375       0.98192/0.10    0.95566/0.10    0.96453/0.10    0.98419/0.10
0.400       0.98220/0.10    0.95522/0.10    0.96456/0.10    0.98347/0.10
0.440       0.98262/0.10    0.95448/0.10    0.96456/0.10    0.98485/0.10
0.600       0.98405/0.10    0.94990/0.10    0.96455/0.10    0.98522/0.10
0.900       0.98231/0.10    0.91732/0.10    0.95699/0.10    0.98299/0.10
Table 4.27. Sensitivity of Optimal Selection Statistic (STDE) and Corresponding
Transformations (a) to the Plotting Position Definition. Minimum
Values of STDE are Underlined. Entries are STDE/a.

Plotting
Position
Constant    Normal         Gumbel         Rayleigh       Pearson
0.00        1.2670/0.20    1.2023/1.00    1.3786/1.00    1.2342/0.20
0.200       1.138/0.70     1.510/0.10     1.364/0.10     1.146/0.50
0.300       1.120/0.70     1.552/0.10     1.356/0.10     1.124/0.50
0.375       1.196/0.70     1.593/0.10     1.354/0.10     1.109/0.50
0.400       1.105/0.50     1.609/0.10     1.353/0.10     1.103/0.50
0.440       1.098/0.50     1.637/0.10     1.354/0.10     1.099/0.50
0.600       1.079/0.50     1.794/0.10     1.374/0.10     1.092/0.50
0.900       1.293/0.50     2.631/0.10     1.679/0.10     1.344/0.50
that these data are centered about the value 0.40, which was recommended
by Cunnane (1978) as a general value for all distributions. Although
the optimal plotting position for each distribution was overshadowed by
roundoff errors in the values of the selection statistics, their positions
relative to each other are conserved. Thus, the Gumbel distribution
has the smallest value of a, followed by the Rayleigh distribution,
with the normal and Pearson coming last. The same order is followed by
the expected values given in Table 3.5.
Trends similar to those exhibited in Tables 4.26 and 4.27 were
shown by the WSS and MXLF selection statistics and by the other series
of the illustrative examples. More on the sensitivity of these statistics
to the definition of the plotting position will be given with the
case study of Chapter 6.
4.4.2. Sensitivity to the change of scale
The sensitivity of the selection statistics and of the optimal
transformation parameter (shape parameter, $\alpha$) to the scale (magnitude)
in which the data are expressed was investigated by converting the series
of the illustrative examples into different scales (units) and repeating
the same frequency analysis with the GPDCP program.
First, dimensionless series were considered, in which the original data
were standardized through division by the sample mean. Frequency
analysis of such standardized series was suggested by Stedinger (1983b),
who discussed its advantage over the use of nonstandardized
distributions. For the normal, Gumbel and Rayleigh distributions the
selection statistics were exactly the same in both scales, except
for minor fluctuations due to roundoff errors. Table 4.28 illustrates this
by listing the selection statistics for the St. Marys runoff series.
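One way to see why R2 is unaffected by standardization is that, for a fixed $\alpha$, dividing the data by a constant changes the Box-Cox transformed series only by an affine map, which leaves its correlation with the reduced variate unchanged. The following is a minimal sketch of that argument, not the GPDCP program; the sample values, the reduced variates and the function names `box_cox` and `r_squared` are hypothetical.

```python
import math

def box_cox(y, alpha):
    """Box-Cox transform of a sequence; alpha = 0 is the logarithmic limit."""
    if alpha == 0:
        return [math.log(v) for v in y]
    return [(v ** alpha - 1.0) / alpha for v in y]

def r_squared(x, z):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz ** 2 / (sxx * szz)

# Hypothetical ranked sample and reduced variates (not data from this study).
y = [3.1, 4.0, 4.6, 5.9, 7.2, 9.8, 14.5]
z = [-1.2, -0.7, -0.3, 0.1, 0.6, 1.3, 2.4]
alpha = 0.4

# Standardize by the sample mean to obtain a dimensionless series.
mean_y = sum(y) / len(y)
y_nd = [v / mean_y for v in y]

r2_raw = r_squared(box_cox(y, alpha), z)
r2_nd = r_squared(box_cox(y_nd, alpha), z)
print(r2_raw, r2_nd)  # the two values agree up to roundoff
```

The sketch covers only R2 at a fixed $\alpha$; it says nothing about statistics that depend on a second, scale-dependent shape parameter.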
Table 4.28. Selection Statistics for St. Marys Runoff Data Expressed
in cfs and Dimensionless Units.

Distribution    Scale      R2         STDE       WSS       MXLF
Normal          cfs        0.90021    1637.3     5.986     53.69
                ND         0.90022    1637.3     5.986     53.69
Gumbel          cfs        0.97265     857.37    1.6417    14.87
                ND         0.97264     857.37    1.6417    14.87
Rayleigh        cfs        0.94645    1199.75    3.214     35.03
                ND         0.94640    1199.75    3.214     35.03
Pearson         cfs (a)    0.96713     939.68    1.972     20.37
                ND (b)     0.84743    2024.52    9.154     66.43

cfs: cubic feet per second
ND: dimensionless
(a) k = 7.753
(b) k = 0.0001
The table includes statistics calculated from the original series (cfs
units) and the standardized series (dimensionless). On the other hand,
these statistics are highly sensitive to the change of scale for the
Pearson family of distributions (Table 4.28). This high sensitivity can
be explained by the dependence of the Pearson shape parameter on the
scale of the data, illustrated by the variation of the parameter k from
7.753 in the first scale to 0.0001 in the standardized scale. Since
this parameter along with the transformation ($\alpha$) describes the
distribution shape, which is independent of scale (Siswadi, 1981), the
parameter $\alpha$ will be highly sensitive to the change of scale too.
Therefore, as mentioned in Section 3.2.4, it is wise to give the parameter k a
fixed value (e.g., k = , normal; k = 1, Gumbel, or any of the values
listed in Table 3.4) and leave only one parameter ($\alpha$) for the
description of the shape of the distribution.
A second example illustrating the sensitivity to the change of scale
is given by performing the same type of analysis with the total annual
runoff series expressed in U.S. customary units (inches) and metric
units (meters and decimeters). Selection statistics for all three scales
for the four families of distributions are listed in Table 4.29. Here
again, the first three distributions produce nearly constant statistics,
in contrast to the high sensitivity of the Pearson distribution to the
change of scale. As in the previous example, this high sensitivity is
directly related to the dependence of the parameter k on the scale (see
corresponding values of k in the same table). These two examples
illustrate the advantage of the new parameterization over the classical
parameterization of the generalized gamma distribution anticipated in
Section 3.2.
Table 4.29. Selection Statistics for Kissimmee River Runoff Data
Expressed in Inches, Decimeters and Meters.

Distribution    Scale     R2         STDE       WSS         MXLF
Normal          in        0.91390    1.22677     2.01476    16.81
                dm        0.91394    1.22681     2.01481    16.81
                m         0.91392    1.22690     2.01471    16.81
Gumbel          in        0.81255    3.45936     4.01365    33.35
                dm        0.81258    3.4931      4.01372    33.35
                m         0.81254    3.45924     4.01344    33.35
Rayleigh        in        0.84102    2.12291     3.13778    27.44
                dm        0.84101    2.12266     3.13771    27.44
                m         0.84093    2.12238     3.13734    27.44
Pearson         in (a)    0.81657    2.95510     3.80621    32.08
                dm (b)    0.52313    4.86004    11.95228    59.54
                m (c)     0.79440    3.22693     4.28136    34.90

in: inches
dm: decimeters
m: meters
(a) k = 8.602
(b) k = 0.822
(c) k = 6.243
4.4.3. Sensitivity to the objective function formulation and to the
estimation procedure
The advantage of including the shape parameter with the dependent
variable and estimating it separately from the location and scale
parameters was first illustrated by comparing the performance of the
GPDCP program to that of Kite's program for six classical frequency
distributions (Tables 4.2 and 4.12). In this section the performance of
the GPDCP program will be compared to that of the nonlinear estimation
procedure (NLIN) using the Marquardt method (SAS, 1982). This method
combines the Gauss and the steepest descent algorithms to solve a
linearized form of the normal equations (Equation 2.2.5). Detailed
descriptions of this method may be found in Bevington (1969) and Bard
(1974). Two programs were developed to solve for the location, scale
and shape parameters using the following two models, respectively,

$$ Y = \frac{y^{\alpha} - 1}{\alpha} = AZ + B \qquad (4.4.1) $$

$$ y = [\alpha(AZ + B) + 1]^{1/\alpha} \qquad (4.4.2) $$

The program using Equation 4.4.1 minimizes the sum of squares of
the transformed variables, while the program using the second formulation
minimizes the sum of squares of the original variables. The inputs
for both models are the same: a series of ranked observations (y) and
the corresponding expected order statistics generated by the GPDCP
program.
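The difference between the two formulations can be sketched as two residual sums of squares evaluated for the same trial parameters. This is only an illustration of Equations 4.4.1 and 4.4.2, not the SAS NLIN programs used in the study; the function names, the reduced variates and the synthetic observations (generated so that the model holds exactly) are assumptions.

```python
def box_cox(y, alpha):
    """Forward transform of Equation 4.4.1 (alpha != 0)."""
    return (y ** alpha - 1.0) / alpha

def inverse_box_cox(t, alpha):
    """Back-transform of Equation 4.4.2 (alpha != 0)."""
    return (alpha * t + 1.0) ** (1.0 / alpha)

def sse_transformed(params, y, z):
    """Sum of squared residuals in the transformed space (Equation 4.4.1)."""
    alpha, a, b = params
    return sum((box_cox(yi, alpha) - (a * zi + b)) ** 2 for yi, zi in zip(y, z))

def sse_original(params, y, z):
    """Sum of squared residuals in the original space (Equation 4.4.2)."""
    alpha, a, b = params
    return sum((yi - inverse_box_cox(a * zi + b, alpha)) ** 2 for yi, zi in zip(y, z))

# Hypothetical reduced variates and synthetic observations that satisfy the
# model exactly for the assumed parameters (alpha, A, B) = (0.4, 2.0, 10.0).
z = [-1.2, -0.7, -0.3, 0.1, 0.6, 1.3, 2.4]
true_params = (0.4, 2.0, 10.0)
y = [inverse_box_cox(2.0 * zi + 10.0, 0.4) for zi in z]

trial_params = (0.4, 2.5, 10.0)
print(sse_transformed(true_params, y, z), sse_original(true_params, y, z))
print(sse_transformed(trial_params, y, z), sse_original(trial_params, y, z))
```

Both objectives vanish at the generating parameters, but away from them they weight the residuals differently, since one measures misfit in transformed units and the other in the units of the data.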
With both programs the estimated parameters were highly correlated
(Tables 4.30 and 4.31) and dependent on the first estimates (Tables
4.32 and 4.33). For the example illustrated by these tables the
second formulation (Equation 4.4.2) gave closer estimates to those given
by the GPDCP program (Table 4.8). This should be expected since both
Table 4.30 Nonlinear Parameter Estimation, Original
Variables, Starting Values AF=0.40, A=13, and B=100

ST. MARYS RIVER RUNOFF
GUMBEL SAS NLIN PROCEDURE
Y = (AF*(A*Z+B)+1)**1/AF

NONLINEAR LEAST SQUARES SUMMARY STATISTICS     DEPENDENT VARIABLE X

SOURCE               DF    SUM OF SQUARES          MEAN SQUARE
REGRESSION            3    14296525305.778380      4765508435.259459
RESIDUAL             57       25676294.221619          450461.302134
UNCORRECTED TOTAL    60    14322201600.000000
(CORRECTED TOTAL)    59     1611902293.333334

                             ASYMPTOTIC      ASYMPTOTIC 95 % CONFIDENCE INTERVAL
PARAMETER    ESTIMATE        STD. ERROR      LOWER            UPPER
AF             0.41596749     0.09010642       0.25555721       0.57637778
A             15.13660227    11.82707595      -8.54670026      38.81990481
B            117.10819082    67.32801753     -17.71379006     251.93017170

ASYMPTOTIC CORRELATION MATRIX OF THE PARAMETERS
       AF          A           B
AF     1.000000    0.999799    0.999978
A      0.999799    1.000000    0.999695
B      0.999973    0.999695    1.000000
Table 4.31 Nonlinear Parameter Estimation, Original
Variables, Starting Values AF=0.41, A=15, and B=117

ST. MARYS RIVER RUNOFF
GUMBEL SAS NLIN PROCEDURE
Y = (AF*(A*Z+B)+1)**1/AF

NONLINEAR LEAST SQUARES SUMMARY STATISTICS     DEPENDENT VARIABLE X

SOURCE               DF    SUM OF SQUARES          MEAN SQUARE
REGRESSION            3    14296524020.727828      4765500006.909276
RESIDUAL             57       25677579.272170          450483.846880
UNCORRECTED TOTAL    60    14322201600.000000
(CORRECTED TOTAL)    59     1611902293.333334

                             ASYMPTOTIC      ASYMPTOTIC 95 % CONFIDENCE INTERVAL
PARAMETER    ESTIMATE        STD. ERROR      LOWER            UPPER
AF             0.41694401     0.05265029       0.31151367       0.52237436
A             15.27891403    11.25623936      -7.26130835      37.81913641
B            117.91865914    57.54455297       2.68772104     233.14959724

ASYMPTOTIC CORRELATION MATRIX OF THE PARAMETERS
       AF          A           B
AF     1.000000    0.631616    0.553295
A      0.611616    1.000000    0.997424
B      0.553295    0.997424    1.000000
Table 4.32 Nonlinear Parameter Estimation, Transformed
Variables, Starting Values AF=0.41, A=13, and B=117

ST. MARYS RIVER RUNOFF
GUMBEL STDE SAS NLIN PROCEDURE

NONLINEAR LEAST SQUARES SUMMARY STATISTICS     DEPENDENT VARIABLE Y

SOURCE               DF    SUM OF SQUARES      MEAN SQUARE
REGRESSION            3    46535.10073892      15511.70024631
RESIDUAL             57        7.28962752          0.12788820
UNCORRECTED TOTAL    60    46542.39036644
(CORRECTED TOTAL)    59      279.71725364

                            ASYMPTOTIC     ASYMPTOTIC 95 % CONFIDENCE INTERVAL
PARAMETER    ESTIMATE       STD. ERROR     LOWER           UPPER
AF            0.19469392    0.03033220      0.13295351      0.25643432
A             1.84081238    0.29232564      1.25544060      2.42618416
B            26.63992792    2.78308687     21.06689461     32.21296123

ASYMPTOTIC CORRELATION MATRIX OF THE PARAMETERS
       AF          A           B
AF     1.000000    0.307384    0.786046
A      0.307384    1.000000    0.829857
B      0.786046    0.829857    1.000000
Table 4.33 Nonlinear Parameter Estimation, Transformed
Variables, Starting Values AF=0.40, A=15, and B=100

ST. MARYS RIVER RUNOFF
GUMBEL STDE SAS NLIN PROCEDURE
(Y**AF-1)/AF = A*Z + B

NONLINEAR LEAST SQUARES SUMMARY STATISTICS     DEPENDENT VARIABLE Y

SOURCE               DF    SUM OF SQUARES    MEAN SQUARE
REGRESSION            3    351.57828773      117.19276253
RESIDUAL             57      0.00360836        0.00006330
UNCORRECTED TOTAL    60    351.58189608
(CORRECTED TOTAL)    59      0.00303319

                            ASYMPTOTIC     ASYMPTOTIC 95 % CONFIDENCE INTERVAL
PARAMETER    ESTIMATE       STD. ERROR     LOWER           UPPER
AF           -0.40425480    0.11916349     -0.64287548     -0.16563411
A             0.01221985    0.00311707      0.00597803      0.01846168
B             2.41332039    0.53255656      1.34689466      3.47974612

ASYMPTOTIC CORRELATION MATRIX OF THE PARAMETERS
       AF          A           B
AF     1.000000    0.837643    0.998632
A      0.837643    1.000000    0.865054
B      0.998632    0.865054    1.000000
models used the same selection statistic (STDE). Even so, the GPDCP
performance is superior, since its parameter estimates have much smaller
standard deviations and are less correlated (e.g., covariance (A,B) =
0.067, Table 4.8 versus 0.999, Table 4.30). Thus, they are more
reliable.
4.5. Summary and Conclusions
The new parameterization of the GGD resulted in four families of
distributions, all of which are expressible in terms of only two
parameters of the location and scale type once the data are transformed to
the right space. Thus, the parameters of the four families of
distributions are related to the expected order statistics by the same type of
expression (Equation 3.3.1). Based on this relation, an algorithm was
developed and applied for the estimation of the parameters of the four
generalized distributions. Two illustrative examples were analyzed by
the generalized probability distribution computer program (GPDCP) based
on the above algorithm. The performance of this program was compared to
that of the classical maximum likelihood and nonlinear solution
procedures. Based on the same examples, a sensitivity analysis of the
selection statistics to the plotting position definition, change of scale and
formulation of the objective function was performed. Results of these
investigations are summarized below.
1. The GPDCP program performed much better than the maximum
likelihood based program (Kite, 1977) and the nonlinear program
(SAS, 1982) in fitting most of the distributions tested in this
study.
2. The GPDCP offers a much wider range of distributions from
which to choose, and the option to use selection statistics other
than the STDE on which Kite based his analysis and which is the
objective function of the SAS nonlinear procedure.
3. The STDE selection statistic gave the poorest performance
compared to the other three selection statistics. The WSS and the
MXLF statistics selected the same models for all the cases
considered. The R2 selection statistic was the best measure of goodness
of fit of the model's functional form and predictions.
4. The sensitivity of the selection statistics to the
definition of the plotting position was very low, especially over the
range 0.0 to 0.44 covering most of the recommended definitions for
different distributions.
5. The new shape parameter ($\alpha$) was invariant to change of
scale for the normal, Gumbel and Rayleigh general distributions.
But it was highly sensitive to such a change for the Pearson
family of distributions. It is the second shape parameter,
replaced by its moment estimate, that causes this high sensitivity,
illustrating the inadequacy of the classical parameterization of
the GGD.
6. The selection statistics were highly sensitive to the
change of space (transformation) for the four generalized
distribution models. The difference between optimal statistics from
different families was very small compared to the variability due
to the transformation within a given family of distributions.
Thus, all four parent distributions have equivalent performance
once the data are transformed to the right space.
7. Due to the high sensitivity of the selection statistics to
the shape parameter, a quantitative evaluation of the effect of
this parameter on the tail of the distribution is required, since
this is a critical region for estimating percentiles, constructing
confidence intervals and assessing reliability. This will be the
theme of the next chapter.
CHAPTER 5
RELIABILITY ANALYSIS
5.1. Introduction
In the previous chapter the importance of the shape parameter in
selecting the best probability model was illustrated by the analysis of
sensitivity of four selection statistics. In this chapter the
sensitivity of extreme events (tails of the distributions) to this parameter
is investigated. Approximate relations that include the shape parameter
explicitly are developed into indices and formulae for measuring
reliability. A quantitative evaluation of the sensitivity of these
expressions to the change in the shape parameter is performed through the
generation of reliability tables for different shape parameters at many
probability levels.
5.2. Second Moment Reliability Modeling
5.2.1. Normal case
In Chapter 1 it was shown that reliability analysis and measurement
are usually based on a second moment representation of the data. Such a
representation is often associated with the assumption of normality in
order to apply the normal theory for the definition of reliability
indices and confidence intervals. For that purpose only the first two
moments are required. Among indices of reliability that are based on
these moments is the generalized reliability index ($z_L$) defined by
Equation 1.2.14. Working in the transform domain and replacing
$g(x,\theta)$ by the mean ($\mu_Y$) of the transformed variables (Y) gives
$$ z_L = \frac{Y_L - \mu_Y}{\sigma_Y} = \frac{Y_L/\mu_Y - 1}{V_Y} \qquad (5.2.1) $$

where $\sigma_Y$ is the standard deviation of Y, $V_Y$ is the coefficient of
variation and $Y_L$ is the limit defining the reliability level. The
coefficient of reliability (Equation 1.2.11) may also be expressed in terms
of the transformed variables. From the above equation we have

$$ COR = \frac{\mu_Y}{Y_L} = \frac{1}{1 + z_L V_Y} \qquad (5.2.2) $$

If the level $Y_L$ is itself a random variable, then a third index of
reliability may be derived from the definition of reliability given by
Equation 1.2.1

$$ \text{Reliability} = R = 1 - \Pr(\text{failure}) \qquad (5.2.3) $$

If failure occurs when a variable $Y_1$ exceeds a variable $Y_2$ (both
variables being independent, random, and normally distributed with means and
standard deviations $\mu_{Y_1}$, $\sigma_{Y_1}$ and $\mu_{Y_2}$, $\sigma_{Y_2}$,
respectively) then Equation 5.2.3 takes the form

$$ R = 1 - \Pr[(Y_2 - Y_1) < 0] \qquad (5.2.4) $$

It is a well known property of normally distributed variables that
their difference is also a normally distributed variable with mean equal
to the difference of the means and variance equal to the sum of the variances
(Benjamin and Cornell, 1970; Haan, 1977; Harr, 1977). Thus $(Y_2 - Y_1)$
follows a normal distribution with mean $(\mu_{Y_2} - \mu_{Y_1})$ and
variance $(\sigma_{Y_2}^2 + \sigma_{Y_1}^2)$.
Substitution of these parameters into Equation 5.2.4 leads to

$$ R = 1 - \int_{-\infty}^{0} \frac{1}{\sqrt{2\pi}\,(\sigma_{Y_2}^2 + \sigma_{Y_1}^2)^{1/2}} \exp\left\{ -\frac{[t - (\mu_{Y_2} - \mu_{Y_1})]^2}{2(\sigma_{Y_2}^2 + \sigma_{Y_1}^2)} \right\} dt $$

$$ = 1 - \left\{ \Phi\left[ \frac{\mu_{Y_1} - \mu_{Y_2}}{(\sigma_{Y_2}^2 + \sigma_{Y_1}^2)^{1/2}} \right] - \Phi(-\infty) \right\} $$

$$ R = 1 - \Phi\left[ \frac{\mu_{Y_1} - \mu_{Y_2}}{(\sigma_{Y_2}^2 + \sigma_{Y_1}^2)^{1/2}} \right] \qquad (5.2.5) $$

where $\Phi$ is the CDF of the normal distribution.
A new reliability index may be defined as the argument of $\Phi$,

$$ ZRI = \frac{\mu_{Y_1} - \mu_{Y_2}}{(\sigma_{Y_2}^2 + \sigma_{Y_1}^2)^{1/2}} \qquad (5.2.6) $$

ZRI corresponds to the reduced normal variate with cumulative
probability equal to the probability of failure (risk) represented by the
shaded area of Figure 1.3c. It is an important index since it allows a
direct evaluation of risk or reliability by using just the first two
moments of each distribution and a table of the cumulative normal
distribution.
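The evaluation described above can be sketched in a few lines, with the error function standing in for a table of the cumulative normal distribution. The sketch assumes ZRI is taken as $(\mu_{Y_1} - \mu_{Y_2})/(\sigma_{Y_2}^2 + \sigma_{Y_1}^2)^{1/2}$, so that $\Phi(ZRI)$ is the risk; the function names and the numerical values are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def zri(mu1, sigma1, mu2, sigma2):
    """Reliability index for failure defined as Y1 exceeding Y2."""
    return (mu1 - mu2) / math.sqrt(sigma1 ** 2 + sigma2 ** 2)

def reliability(mu1, sigma1, mu2, sigma2):
    """R = 1 - Pr(failure), with Pr(failure) = Phi(ZRI)."""
    return 1.0 - normal_cdf(zri(mu1, sigma1, mu2, sigma2))

# Hypothetical moments: Y1 (mean 10, sd 3) must stay below Y2 (mean 20, sd 4).
index = zri(10.0, 3.0, 20.0, 4.0)
risk = normal_cdf(index)
rel = reliability(10.0, 3.0, 20.0, 4.0)
print(index, risk, rel)
```

Only the two means and the two standard deviations enter the calculation, which is the practical appeal of the index.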
5.2.2. Lognormal case
Usually the data deviate excessively from normality, especially in
engineering fields where most measurements are positive quantities
limited from below by zero or some small value and unlimited from above.
Such data tend to be positively skewed and thus are better fitted by the
lognormal distribution. The previous expressions for reliability
indices are valid for the logarithm of the variables, but they may also
be expressed in terms of the moments of the original data using the
relations derived in Appendix B. Equations B.13 and B.14 define

$$ \mu_Y = \log(\mu_y) - \frac{1}{2}\log(1 + V_y^2) \qquad (5.2.7) $$

$$ \sigma_Y^2 = \log(1 + V_y^2) \qquad (5.2.8) $$

where Y = log y, and $V_y$ is the coefficient of variation of the
untransformed variables. Substitution of these relations into Equations 5.2.1,
5.2.2 and 5.2.6 gives, after some algebraic rearrangement, the following
new expressions for the reliability indices

$$ z_L = \frac{\log\left[ \frac{y_L}{\mu_y} (1 + V_y^2)^{1/2} \right]}{[\log(1 + V_y^2)]^{1/2}} \qquad (5.2.9) $$

or

$$ y_L = \mu_y (1 + V_y^2)^{-1/2} \, e^{z_L [\log(1 + V_y^2)]^{1/2}} \qquad (5.2.9a) $$

from which

$$ COR = \frac{\mu_y}{y_L} = (1 + V_y^2)^{1/2} \, e^{-z_L [\log(1 + V_y^2)]^{1/2}} \qquad (5.2.10) $$

and

$$ ZRI = \frac{\log\left[ \frac{\mu_{y_1}}{\mu_{y_2}} \left( \frac{1 + V_{y_2}^2}{1 + V_{y_1}^2} \right)^{1/2} \right]}{\{\log[(1 + V_{y_1}^2)(1 + V_{y_2}^2)]\}^{1/2}} \qquad (5.2.11) $$
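As a consistency check on the algebra, the two-variable index can be computed in two ways: from the log-space moments of Equations 5.2.7 and 5.2.8 inserted into the normal-case index, and from the closed form of Equation 5.2.11. The sketch below assumes the index is defined with $\mu_{Y_1} - \mu_{Y_2}$ in the numerator; the function names and the input moments are hypothetical.

```python
import math

def log_moments(mu, v):
    """Mean and variance of log y from the mean and coefficient of
    variation of y (Equations 5.2.7 and 5.2.8)."""
    var_log = math.log(1.0 + v * v)
    return math.log(mu) - 0.5 * var_log, var_log

def zri_from_log_moments(mu1, v1, mu2, v2):
    """Normal-case index applied to the log-transformed variables."""
    m1, s1 = log_moments(mu1, v1)
    m2, s2 = log_moments(mu2, v2)
    return (m1 - m2) / math.sqrt(s1 + s2)

def zri_closed_form(mu1, v1, mu2, v2):
    """Closed form in terms of the original moments (Equation 5.2.11)."""
    num = math.log((mu1 / mu2) * math.sqrt((1.0 + v2 * v2) / (1.0 + v1 * v1)))
    den = math.sqrt(math.log((1.0 + v1 * v1) * (1.0 + v2 * v2)))
    return num / den

# Hypothetical original-space moments for the two variables.
a = zri_from_log_moments(12.0, 0.3, 30.0, 0.4)
b = zri_closed_form(12.0, 0.3, 30.0, 0.4)
print(a, b)  # the two routes agree up to roundoff
```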
5.3. Third Order Reliability Modeling
5.3.1. Introduction
In the above section expressions for reliability indices were
developed based on the first two moments of the original variables.
Such relations are widely used for analyzing uncertainty, assessing
reliability and constructing confidence intervals (Ditlevsen, 1981;
Harr, 1977; Chow, 1964). Unfortunately, no similar relations exist for
the general case, in which the data are best fitted by distributions
other than the normal or the lognormal. An important class of such
distributions is the generalized normal distribution (GND) analyzed in
Chapter 3, for which the data are fitted to a normal distribution after
being transformed by the Box-Cox transformation (Equation 3.2.30). The
lack of such relations has been due mainly to the nonexistence of exact
relations between the moments of the transformed and untransformed
variables (Salas et al., 1980, p. 73). Approximate relations between these
moments are developed in the next section and will be substituted into
the reliability indices formulae later.
5.3.2. Approximate relations between moments of transformed and
untransformed variables
Given a function f(y) defined on an interval containing $\mu_y$, and
which is twice differentiable at $\mu_y$, its Taylor series expansion about
$\mu_y$ is

$$ f(y) = f(\mu_y) + f'(\mu_y)(y - \mu_y) + \frac{1}{2} f''(\mu_y)(y - \mu_y)^2 + \cdots \qquad (5.3.1) $$

The mean, variance and covariance of f(y) are approximated by the
following expressions calculated from the above equation (Ditlevsen, 1981,
p. 98):

$$ E[f(y)] = f(\mu_y) + \frac{1}{2} f''(\mu_y)\,\sigma_y^2 \qquad (5.3.2) $$

$$ Var[f(y)] = f'(\mu_y)^2\,\sigma_y^2 \qquad (5.3.3) $$

$$ Cov(y, f(y)) = f'(\mu_y)\,\sigma_y^2 \qquad (5.3.4) $$

If for f(y) we substitute the Box-Cox transformation, which satisfies
the conditions of continuity and differentiability imposed on f, the
following relations result:
$$ f(y) = Y = \frac{y^{\alpha} - 1}{\alpha} \qquad (5.3.5) $$

$$ f'(y) = y^{\alpha - 1} \qquad (5.3.6) $$

$$ f''(y) = (\alpha - 1)\,y^{\alpha - 2} \qquad (5.3.7) $$

which, when substituted into Equations 5.3.2 to 5.3.4, lead to the
following relations between the moments of the original variables (y)
and the transformed variables (Y)

$$ \mu_Y = E(Y) = \frac{\mu_y^{\alpha} - 1}{\alpha} + \frac{1}{2}(\alpha - 1)\,\mu_y^{\alpha} V_y^2 \qquad (5.3.8) $$

$$ \sigma_Y^2 = Var(Y) = \mu_y^{2\alpha} V_y^2 \qquad (5.3.9) $$

$$ Cov(y, Y) = \mu_y^{\alpha + 1} V_y^2 \qquad (5.3.10) $$

The coefficient of variation of the transformed variables, expressed in
terms of the original moments, is then

$$ V_Y = \frac{\sigma_Y}{\mu_Y} = \frac{\mu_y^{\alpha} V_y}{\frac{\mu_y^{\alpha} - 1}{\alpha} + \frac{1}{2}(\alpha - 1)\,\mu_y^{\alpha} V_y^2} \qquad (5.3.11) $$

To check the validity of the above relations, they are compared to
the exact relations derived for the logarithmic transformation (Appendix
B). As $\alpha$ approaches zero, Equations 5.3.5 to 5.3.11 approach the
following equations:

$$ f(y) = \log y \qquad (5.3.12) $$

$$ f'(y) = y^{-1} \qquad (5.3.13) $$

$$ f''(y) = -y^{-2} \qquad (5.3.14) $$

which are exact relations, and

$$ \mu_Y = \log(\mu_y) - \frac{1}{2} V_y^2 \qquad (5.3.15) $$

$$ \sigma_Y^2 = V_y^2 \qquad (5.3.16) $$

$$ Cov(y, Y) = \mu_y V_y^2 \qquad (5.3.17) $$

$$ V_Y = \frac{V_y}{\log(\mu_y) - \frac{1}{2} V_y^2} \qquad (5.3.18) $$

which are approximate relations. These all compare very well to the
exact relations of Section 5.2.2,

$$ \mu_Y = \log(\mu_y) - \frac{1}{2}\log(1 + V_y^2) \qquad (5.3.19) $$

$$ \sigma_Y^2 = \log(1 + V_y^2) \qquad (5.3.20) $$

$$ V_Y = \frac{[\log(1 + V_y^2)]^{1/2}}{\log(\mu_y) - \frac{1}{2}\log(1 + V_y^2)} \qquad (5.3.21) $$

In fact, if in these relations the coefficient of variation is assumed
small, then a first order approximation of the expression

$$ \log(1 + V_y^2) \approx V_y^2 \qquad (5.3.22) $$

leads to the same approximate relations (Equations 5.3.15 to 5.3.18)
obtained by the Taylor series expansion. Thus, the general approximate
relations (Equations 5.3.8 to 5.3.11) will be assumed as good as the
special case ($\alpha$ = 0) for estimating the exact but unknown true relations.
The approximate relations are used to define reliability indices that
include the shape parameter ($\alpha$) explicitly to account for the deviation
of the original variables from normality. These relations are believed
to be among the unique aspects of this study.
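The comparison made above can also be carried out numerically. The sketch below evaluates the approximate mean and variance of the transformed variable (Equations 5.3.8 and 5.3.9, with Equations 5.3.15 and 5.3.16 as the $\alpha = 0$ limit) against the exact lognormal relations (Equations 5.3.19 and 5.3.20). The function names and the chosen mean and coefficients of variation are illustrative assumptions, not values from the study.

```python
import math

def approx_moments(mu_y, v_y, alpha):
    """Approximate mean and variance of the Box-Cox transformed variable.
    alpha != 0: Equations 5.3.8 and 5.3.9; alpha == 0: Equations 5.3.15, 5.3.16."""
    if alpha == 0:
        return math.log(mu_y) - 0.5 * v_y ** 2, v_y ** 2
    mean = (mu_y ** alpha - 1.0) / alpha \
        + 0.5 * (alpha - 1.0) * mu_y ** alpha * v_y ** 2
    var = mu_y ** (2.0 * alpha) * v_y ** 2
    return mean, var

def exact_lognormal_moments(mu_y, v_y):
    """Exact mean and variance of log y (Equations 5.3.19 and 5.3.20)."""
    var_log = math.log(1.0 + v_y ** 2)
    return math.log(mu_y) - 0.5 * var_log, var_log

# Hypothetical mean; the gap between approximate and exact values grows with V_y.
for v_y in (0.1, 0.3, 0.5):
    approx = approx_moments(100.0, v_y, 0)
    exact = exact_lognormal_moments(100.0, v_y)
    print(v_y, approx, exact)
```

For a coefficient of variation of 0.1 the two sets of moments differ only in the fifth decimal place, and the difference widens as the coefficient of variation increases, in line with the small-$V_y$ assumption behind Equation 5.3.22.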
Note that unlike the lognormal case, no assumption about the
distribution of either the transformed or the original variables was made
to derive the new relations. Thus, they may be applied to any type of
model using the Box and Cox transformation, including the four
generalized distributions analyzed in Chapters 3 and 4.
Another area where these new relations may play an important role
is the area of synthetic hydrology. It is well known in the hydrologic
field that generated data based on the moments estimated directly from
the transformed variables do not preserve the moments of the original
variables (Salas et al., 1981, p. 73 and Haan et al., 1982, p. 77).
Matalas (1967) and Fiering and Jackson (1971) showed by Monte Carlo
simulation that when the moments of the log-transformed series used for
synthetic data generation are calculated from the moments of the
original data (Equations 5.2.7 and 5.2.8), the moments of the original data
are well reproduced. The new relations are expected to generalize this
finding to the more general Box-Cox transformation.
5.3.3. Third order reliability representation
5.3.3.1. Generalized reliability index. The main equations developed
in the previous section are 5.3.8 and 5.3.9, relating the mean and the
variance of the transformed variables to the mean and coefficient of
variation of the original data and to the shape parameter ($\alpha$) of the
normalizing transformation,

$$ \mu_Y = \frac{\mu_y^{\alpha} - 1}{\alpha} + \frac{1}{2}(\alpha - 1)\,\mu_y^{\alpha} V_y^2 \qquad (5.3.23) $$

$$ \sigma_Y = \mu_y^{\alpha} V_y \qquad (5.3.24) $$

Following substitution into Equation 5.2.1 these equations give, after
some algebraic simplifications, the confidence limit (or magnitude of the
event) associated with the generalized reliability index $z_L$, in terms of
the untransformed variables,

$$ z_L = \frac{\left[ \left( \frac{y_L}{\mu_y} \right)^{\alpha} - 1 \right] / \alpha - \frac{1}{2}(\alpha - 1) V_y^2}{V_y} \qquad (5.3.25) $$

or

$$ y_L = \mu_y \left[ 1 + \frac{\alpha(\alpha - 1)}{2} V_y^2 + \alpha z_L V_y \right]^{1/\alpha} \qquad (5.3.25a) $$

It may be shown that these expressions approach Equations 5.2.9 and
5.2.9a, based on the exact relations for the logarithmic transformation,
as $\alpha$ approaches zero, and they approach Equation 5.2.1 for the normal
case ($\alpha$ = 1). To show this, consider the rearranged expression for
Equation 5.3.25

$$ \frac{\left( \frac{y_L}{\mu_y} \right)^{\alpha} - 1}{\alpha} = \frac{1}{2}(\alpha - 1) V_y^2 + z_L V_y \qquad (5.3.26) $$

and take the limit as $\alpha \to 0$ to obtain

$$ \log\left( \frac{y_L}{\mu_y} \right) = -\frac{1}{2} V_y^2 + z_L V_y \qquad (5.3.27) $$

from which

$$ y_L = \mu_y \, e^{-\frac{1}{2} V_y^2 + z_L V_y} \qquad (5.3.28) $$

This equation is exactly the same as Equation 5.2.9a when combined with
the approximation of Equation 5.3.22. A simple substitution of 1 for $\alpha$
in Equation 5.3.26 leads to Equation 5.2.1 exactly.
5.3.3.2. Generalized coefficient of reliability. A generalized
coefficient of reliability including, as special cases, those defined by
Equations 5.2.2 and 5.2.10 for the normal and lognormal distributions,
respectively, may be derived from the results of the previous section.
From Equation 5.3.25a we have

$$ COR = \frac{\mu_y}{y_L} = \left[ 1 + \frac{1}{2}\alpha(\alpha - 1) V_y^2 + \alpha z_L V_y \right]^{-1/\alpha} \qquad (5.3.29) $$

Here again, it can easily be shown that this expression is a first order
approximation of Equation 5.2.10 and exactly the same as Equation 5.2.2,
for $\alpha$ equal to zero and one, respectively.
5.3.3.3. Generalized reliability index for the case of two random
variables. By replacing the means and standard deviations of Equation
5.2.6 by their approximate expressions in terms of the original moments
(Equations 5.3.23 and 5.3.24) we have

$$ ZRI = \frac{\frac{\mu_{y_1}^{\alpha} - \mu_{y_2}^{\alpha}}{\alpha} + \frac{1}{2}(\alpha - 1)\left( \mu_{y_1}^{\alpha} V_{y_1}^2 - \mu_{y_2}^{\alpha} V_{y_2}^2 \right)}{\left( \mu_{y_1}^{2\alpha} V_{y_1}^2 + \mu_{y_2}^{2\alpha} V_{y_2}^2 \right)^{1/2}} \qquad (5.3.30) $$

This index of reliability allows the estimation of the reliability
(Equation 5.2.4) or risk (dashed area of Figure 1.3c) directly from the
cumulative distribution function of the normal distribution.
5.4. Sensitivity Analysis
5.4.1. Sensitivity of the predicted pth percentile to the shape
of the distribution
The predicted pth percentile may be the limit of a confidence
interval of an estimated parameter, or the magnitude of an extreme
hydrologic event or water quality standard which should not be exceeded
p percent of the time (i.e., a reliability of p percent). The
exceedance of such limits will define a failure. The equations developed in
the previous sections apply for this characterization, since they define
the relationship between the magnitude of such events and their
probability of occurrence (risk) or nonoccurrence (reliability), which are
related by Equation 1.2.1. Of special interest is Equation 5.3.29,
giving the coefficient of reliability (COR) in terms of the shape of the
distribution ($\alpha$), the coefficient of variation $V_y$ and the reduced normal
variate $z_L$ corresponding to a cumulative probability p

$$ COR = \begin{cases} \left[ 1 + \frac{1}{2}\alpha(\alpha - 1) V_y^2 + \alpha z_L V_y \right]^{-1/\alpha} & \alpha \neq 0 \\ (1 + V_y^2)^{1/2} \, e^{-z_L [\log(1 + V_y^2)]^{1/2}} & \alpha = 0 \end{cases} \qquad (5.4.1) $$

If known, COR allows the direct estimation of the confidence limit or
pth percentile from the mean of the original variables,

$$ y_L = \frac{\mu_y}{COR} \qquad (5.4.2) $$

For a given coefficient of variation $V_y$ and shape parameter $\alpha$, the
coefficient of reliability may be either calculated directly from
Equation 5.4.1 (which can be easily programmed on a desk top calculator)
or interpolated from the tables of Appendix F that relate COR to $\alpha$ and
$V_y$ for different reliability percentiles. These tables were generated
by a small computer program (MWK) listed at the end of the same
appendix. This program may be used for the generation of similar tables for
other reliability levels or ranges of $V_y$ and $\alpha$.
Results from the illustrative examples of the previous chapter will be
used here to illustrate the application of the coefficient of reliability
for estimating the magnitude of the 100 year event. From example 1, the St.
Marys River runoff data have a mean $\mu_y$ = 14554.66 cfs and a coefficient
of variation $V_y$ = 0.359 (from Table 4.1). The optimal transformation to
normality, based on the R2 selection statistic, is $\alpha$ = 0.10 (Table
4.4). The coefficient of reliability for the 100 year flow is then
obtained from Table F.1.5 or directly calculated from Equation 5.4.1.
For an $\alpha$ of 0.0, 0.10 and 1.0 the coefficients of reliability are 0.473,
0.473 and 0.545, respectively. When substituted into Equation 5.4.2
these values lead to the following estimates of the 100 year flow:
30788.96, 30760.69 and 26708.30 cfs, respectively. We notice that
for this example the normal distribution underestimates the 100 year
flow, while the lognormal distribution slightly overestimates it. From
example 2, the annual total runoff at the Kissimmee River has a mean
$\mu_y$ = 8.749 in. and a coefficient of variation $V_y$ = 0.593 (from Table
4.11), and the optimal transformation to normality is $\alpha$ = 0.40 (Table
4.15). From Table F.1.5 the 100 year coefficients of reliability for $\alpha$
= 0, 0.40 and 1.0 are 0.324, 0.357 and 0.420, respectively, and the
corresponding 100 year runoff volumes are 26.979, 24.493 and 20.816
inches. Here again, the difference between the predictions by the
lognormal, optimal and normal models is small. Similar conclusions are
reached from the rainfall series of the same example, for which $\mu_y$ =
50.014 inches and $V_y$ = 0.159. For $\alpha$ = 0.0, -0.10 and 1.0, the COR is
0.701, 0.696 and 0.730, respectively, and the corresponding rainfall
totals are 71.322, 71.849 and 68.503 inches.
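Equation 5.4.1 is simple enough to evaluate directly, and the runoff examples above can be reproduced to three decimals. The following is a minimal sketch, not the MWK program of Appendix F; the function names are hypothetical, and the 100 year event is assumed to correspond to a cumulative probability of 0.99, that is, a reduced normal variate of about 2.326.

```python
import math

def cor(alpha, v_y, z_l):
    """Coefficient of reliability, Equation 5.4.1."""
    if alpha == 0:
        var_log = math.log(1.0 + v_y ** 2)
        return math.sqrt(1.0 + v_y ** 2) * math.exp(-z_l * math.sqrt(var_log))
    base = 1.0 + 0.5 * alpha * (alpha - 1.0) * v_y ** 2 + alpha * z_l * v_y
    return base ** (-1.0 / alpha)

def percentile(mu_y, alpha, v_y, z_l):
    """Predicted percentile from the mean, Equation 5.4.2."""
    return mu_y / cor(alpha, v_y, z_l)

Z_100 = 2.326  # assumed reduced normal variate for p = 0.99

# St. Marys River runoff: mean 14554.66 cfs, V_y = 0.359.
for alpha in (0.0, 0.10, 1.0):
    print(alpha, round(cor(alpha, 0.359, Z_100), 3),
          round(percentile(14554.66, alpha, 0.359, Z_100), 1))

# Kissimmee River runoff: mean 8.749 in., V_y = 0.593.
for alpha in (0.0, 0.40, 1.0):
    print(alpha, round(cor(alpha, 0.593, Z_100), 3),
          round(percentile(8.749, alpha, 0.593, Z_100), 2))
```

Small differences in the last digits of the percentiles relative to the values quoted in the text are to be expected, since the rounding of the reduced variate used in the original calculations is not stated.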
The sensitivity of the predicted pth percentile to the shape of the
fitted distribution may be assessed more directly from the tables of
Appendix F, by evaluating the change of the coefficient
of reliability with $\alpha$. For example, if a given series of data is
best described by a model with a shape parameter $\alpha_1$, but the lognormal
distribution ($\alpha_0$ = 0) was assumed and used for predicting the pth
percentile event, then a measure of the consequence of such an assumption
is given by the ratio of the predicted percentiles by each model. This
ratio is directly related to the ratio of the corresponding coefficients
of reliability,

$$ \text{Ratio} = \frac{y_L(\alpha_0)}{y_L(\alpha_1)} = \frac{COR(\alpha_1)}{COR(\alpha_0)} \qquad (5.4.3) $$

since they are both expressed in terms of the same mean of the
untransformed variables. Tables of this ratio have been generated, along with
those of Appendix F, by the program MWK. Tables 5.1 to 5.10 are
samples of such tables giving the value of this ratio for different $\alpha$
and $V_y$ for ten probability levels. Note that the deviation from one
(lognormal predicted percentile) increases with the deviation of $\alpha$ from
zero. Also note that this deviation is more important at high
coefficients of variation. For example, for the 100 year event (Table 5.2),
for $\alpha$ = 1, this ratio is 1.241 for a coefficient of variation $V_y$ = 0.5 and
1.544 for $V_y$ = 1.5, implying that the 100 year event predicted by the
lognormal distribution will be 24.1 and 54.4 percent higher than the
real one when the true distribution is normal. The sensitivity of the
predicted percentiles is much higher toward the end of the tail of the
distribution (extreme events). For example, for a probability of 99.999
percent (Table 5.4) the ratios corresponding to the previous coefficients of
variation and for $\alpha$ = 1 are 2.148 and 7.758, respectively, illustrating
the high sensitivity to the change of $\alpha$ at such probability levels. On
the other hand, for $\alpha$ less than zero (Tables 5.6 to 5.10) the
sensitivity to the shape of the distribution is even higher: the more $\alpha$ deviates
from zero in the negative direction, the more negatively skewed is the
true distribution and the higher is the deviation from the assumed
lognormal distribution (positively skewed). This is well illustrated by
the variation of the pth percentile ratio, for example, the 100 year
Table 5.1 Power to Lognormal Percentile Ratio, 0 ≤ a ≤ 1, P = 95%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 95.000000 % WITH NORMAL VARIATE Z = 1.645

VY\ALFA    LOG  0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
    0.1  1.000  1.000  1.000  1.001  1.002  1.003  1.003  1.004  1.005  1.005  1.006  1.007
    0.2  1.000  0.997  1.000  1.003  1.005  1.008  1.010  1.013  1.015  1.017  1.020  1.022
    0.3  1.000  0.992  0.997  1.002  1.007  1.012  1.017  1.022  1.026  1.031  1.035  1.039
    0.4  1.000  0.982  0.990  0.998  1.006  1.014  1.021  1.028  1.035  1.042  1.049  1.055
    0.5  1.000  0.969  0.980  0.991  1.001  1.012  1.022  1.031  1.041  1.050  1.059  1.067
    0.6  1.000  0.953  0.966  0.980  0.993  1.005  1.018  1.030  1.041  1.053  1.064  1.074
    0.7  1.000  0.936  0.951  0.966  0.980  0.995  1.009  1.023  1.037  1.050  1.063  1.076
    0.8  1.000  0.918  0.933  0.950  0.966  0.982  0.997  1.013  1.028  1.043  1.058  1.072
    0.9  1.000  0.901  0.916  0.932  0.949  0.966  0.982  0.999  1.016  1.032  1.048  1.064
    1.0  1.000  0.885  0.900  0.915  0.931  0.948  0.965  0.983  1.000  1.017  1.035  1.052
    1.1  1.000  0.873  0.884  0.898  0.913  0.930  0.947  0.964  0.982  1.000  1.018  1.036
    1.2  1.000  0.864  0.872  0.882  0.896  0.911  0.927  0.944  0.962  0.981  0.999  1.018
    1.3  1.000  0.859  0.861  0.868  0.879  0.892  0.907  0.924  0.941  0.960  0.979  0.998
    1.4  1.000  0.859  0.854  0.856  0.863  0.874  0.887  0.903  0.920  0.938  0.957  0.976
    1.5  1.000  0.865  0.851  0.846  0.849  0.857  0.868  0.882  0.898  0.916  0.935  0.954

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.2 Power to Lognormal Percentile Ratio, 0 ≤ a ≤ 1, P = 99%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.000000 % WITH NORMAL VARIATE Z = 2.326

VY\ALFA    LOG  0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
    0.1  1.000  1.000  1.002  1.004  1.005  1.007  1.009  1.011  1.013  1.015  1.016  1.018
    0.2  1.000  0.997  1.004  1.011  1.018  1.025  1.031  1.038  1.044  1.050  1.055  1.061
    0.3  1.000  0.988  1.003  1.018  1.033  1.047  1.060  1.072  1.084  1.095  1.106  1.117
    0.4  1.000  0.973  0.998  1.023  1.046  1.068  1.089  1.109  1.127  1.145  1.162  1.178
    0.5  1.000  0.951  0.988  1.024  1.057  1.088  1.117  1.144  1.170  1.195  1.218  1.241
    0.6  1.000  0.925  0.973  1.019  1.062  1.102  1.140  1.176  1.209  1.241  1.271  1.300
    0.7  1.000  0.894  0.954  1.010  1.063  1.112  1.158  1.202  1.243  1.282  1.319  1.354
    0.8  1.000  0.860  0.930  0.997  1.059  1.117  1.171  1.222  1.271  1.316  1.360  1.401
    0.9  1.000  0.826  0.905  0.980  1.051  1.116  1.178  1.237  1.292  1.344  1.394  1.442
    1.0  1.000  0.791  0.879  0.961  1.039  1.112  1.181  1.246  1.307  1.366  1.421  1.474
    1.1  1.000  0.758  0.852  0.941  1.025  1.104  1.179  1.249  1.317  1.381  1.442  1.500
    1.2  1.000  0.727  0.826  0.920  1.009  1.093  1.173  1.249  1.321  1.390  1.456  1.519
    1.3  1.000  0.699  0.802  0.900  0.993  1.081  1.165  1.245  1.322  1.395  1.465  1.532
    1.4  1.000  0.674  0.779  0.879  0.975  1.067  1.154  1.238  1.318  1.395  1.469  1.540
    1.5  1.000  0.653  0.758  0.860  0.958  1.052  1.142  1.229  1.312  1.392  1.469  1.544

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.3 Power to Lognormal Percentile Ratio, 0 ≤ a ≤ 1, P = 99.9%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.900000 % WITH NORMAL VARIATE Z = 3.090

VY\ALFA    LOG  0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
    0.1  1.000  1.000  1.003  1.007  1.011  1.015  1.018  1.022  1.025  1.028  1.031  1.035
    0.2  1.000  0.995  1.010  1.024  1.038  1.051  1.063  1.075  1.087  1.097  1.108  1.118
    0.3  1.000  0.983  1.015  1.045  1.073  1.100  1.125  1.149  1.171  1.192  1.212  1.231
    0.4  1.000  0.962  1.015  1.065  1.112  1.155  1.196  1.234  1.269  1.303  1.335  1.366
    0.5  1.000  0.932  1.009  1.082  1.149  1.212  1.270  1.324  1.376  1.424  1.470  1.513
    0.6  1.000  0.894  0.997  1.093  1.182  1.266  1.343  1.416  1.484  1.549  1.609  1.667
    0.7  1.000  0.849  0.977  1.098  1.210  1.315  1.413  1.505  1.591  1.673  1.750  1.823
    0.8  1.000  0.800  0.952  1.096  1.231  1.357  1.476  1.588  1.693  1.793  1.887  1.976
    0.9  1.000  0.749  0.922  1.089  1.245  1.393  1.533  1.664  1.789  1.907  2.018  2.124
    1.0  1.000  0.697  0.890  1.076  1.254  1.422  1.582  1.733  1.876  2.012  2.142  2.265
    1.1  1.000  0.647  0.856  1.060  1.257  1.444  1.623  1.793  1.955  2.109  2.256  2.396
    1.2  1.000  0.600  0.821  1.041  1.255  1.461  1.657  1.845  2.025  2.197  2.360  2.517
    1.3  1.000  0.555  0.787  1.021  1.250  1.471  1.685  1.890  2.087  2.275  2.455  2.628
    1.4  1.000  0.515  0.753  0.999  1.241  1.478  1.707  1.928  2.140  2.344  2.540  2.728
    1.5  1.000  0.478  0.722  0.976  1.230  1.480  1.723  1.959  2.186  2.405  2.616  2.819

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.4 Power to Lognormal Percentile Ratio, 0 ≤ a ≤ 1, P = 99.999%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.999000 % WITH NORMAL VARIATE Z = 4.275

VY\ALFA    LOG  0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
    0.1  1.000  0.999  1.007  1.015  1.023  1.030  1.037  1.043  1.050  1.056  1.062  1.068
    0.2  1.000  0.993  1.023  1.053  1.080  1.106  1.130  1.153  1.174  1.195  1.214  1.233
    0.3  1.000  0.976  1.042  1.104  1.161  1.215  1.265  1.311  1.355  1.396  1.435  1.472
    0.4  1.000  0.946  1.057  1.162  1.258  1.348  1.431  1.510  1.583  1.652  1.717  1.779
    0.5  1.000  0.903  1.066  1.220  1.364  1.498  1.623  1.741  1.852  1.956  2.055  2.148
    0.6  1.000  0.847  1.065  1.274  1.472  1.658  1.833  1.999  2.155  2.302  2.442  2.574
    0.7  1.000  0.784  1.055  1.322  1.578  1.823  2.055  2.276  2.485  2.684  2.873  3.052
    0.8  1.000  0.715  1.035  1.361  1.679  1.987  2.283  2.566  2.836  3.093  3.339  3.573
    0.9  1.000  0.644  1.008  1.390  1.773  2.148  2.512  2.863  3.200  3.523  3.832  4.128
    1.0  1.000  0.574  0.974  1.410  1.856  2.301  2.738  3.162  3.571  3.966  4.345  4.710
    1.1  1.000  0.507  0.936  1.421  1.930  2.446  2.957  3.458  3.945  4.416  4.871  5.309
    1.2  1.000  0.446  0.894  1.425  1.994  2.580  3.167  3.747  4.315  4.867  5.403  5.920
    1.3  1.000  0.389  0.852  1.421  2.049  2.703  3.367  4.028  4.679  5.316  5.935  6.535
    1.4  1.000  0.339  0.809  1.413  2.094  2.816  3.556  4.298  5.034  5.757  6.463  7.149
    1.5  1.000  0.295  0.766  1.400  2.131  2.918  3.733  4.557  5.378  6.188  6.982  7.758

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.5 Power to Lognormal Percentile Ratio, 0 ≤ a ≤ 1, P = 99.9999%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.999900 % WITH NORMAL VARIATE Z = 4.772

VY\ALFA    LOG  0.001  0.100  0.200  0.300  0.400  0.500  0.600  0.700  0.800  0.900  1.000
    0.1  1.000  0.999  1.009  1.019  1.028  1.037  1.046  1.054  1.062  1.070  1.077  1.084
    0.2  1.000  0.992  1.031  1.068  1.102  1.134  1.164  1.192  1.219  1.244  1.268  1.291
    0.3  1.000  0.972  1.057  1.135  1.208  1.275  1.338  1.397  1.452  1.504  1.553  1.599
    0.4  1.000  0.940  1.081  1.214  1.333  1.453  1.560  1.660  1.755  1.844  1.927  2.007
    0.5  1.000  0.891  1.099  1.297  1.484  1.659  1.824  1.979  2.125  2.263  2.393  2.517
    0.6  1.000  0.830  1.107  1.379  1.639  1.887  2.123  2.346  2.557  2.758  2.949  3.130
    0.7  1.000  0.758  1.104  1.454  1.798  2.131  2.450  2.756  3.047  3.325  3.590  3.842
    0.8  1.000  0.682  1.090  1.522  1.956  2.383  2.799  3.200  3.586  3.956  4.310  4.649
    0.9  1.000  0.604  1.066  1.578  2.108  2.639  3.162  3.672  4.166  4.643  5.101  5.542
    1.0  1.000  0.529  1.035  1.624  2.252  2.893  3.534  4.164  4.779  5.376  5.953  6.510
    1.1  1.000  0.458  0.997  1.659  2.385  3.141  3.907  4.668  5.416  6.146  6.855  7.542
    1.2  1.000  0.393  0.955  1.683  2.506  3.381  4.278  5.177  6.069  6.943  7.797  8.627
    1.3  1.000  0.335  0.911  1.698  2.616  3.610  4.642  5.687  6.730  7.760  8.769  9.754
    1.4  1.000  0.285  0.865  1.705  2.714  3.826  4.997  6.193  7.395  8.587  9.761 10.911
    1.5  1.000  0.241  0.819  1.704  2.800  4.030  5.340  6.691  8.057  9.420 10.767 12.091

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.6 Power to Lognormal Percentile Ratio, 0 ≥ a ≥ -1, P = 95%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 95.000000 % WITH NORMAL VARIATE Z = 1.645

VY\ALFA    LOG -0.001 -0.100 -0.200 -0.300 -0.400 -0.500 -0.600 -0.700 -0.800 -0.900 -1.000
    0.1  1.000  0.999  0.999  0.998  0.997  0.996  0.996  0.995  0.994  0.993  0.992  0.991
    0.2  1.000  0.996  0.994  0.992  0.989  0.986  0.982  0.979  0.976  0.973  0.969  0.966
    0.3  1.000  0.991  0.986  0.980  0.974  0.968  0.962  0.955  0.948  0.941  0.934  0.926
    0.4  1.000  0.981  0.973  0.964  0.955  0.945  0.935  0.925  0.914  0.903  0.891  0.878
    0.5  1.000  0.968  0.957  0.945  0.932  0.920  0.906  0.892  0.878  0.863  0.848  0.832
    0.6  1.000  0.952  0.939  0.924  0.909  0.894  0.879  0.863  0.846  0.830  0.813  0.796
    0.7  1.000  0.935  0.919  0.904  0.888  0.872  0.856  0.840  0.825  0.810  0.796  0.784
    0.8  1.000  0.916  0.901  0.885  0.870  0.855  0.841  0.829  0.818  0.810  0.805  0.805
    0.9  1.000  0.900  0.885  0.871  0.858  0.847  0.838  0.833  0.832  0.836  0.848  0.870
    1.0  1.000  0.884  0.872  0.861  0.853  0.849  0.849  0.856  0.871  0.896  0.934  0.987
    1.1  1.000  0.873  0.864  0.858  0.858  0.863  0.877  0.902  0.940  0.995  1.069  1.166
    1.2  1.000  0.863  0.860  0.863  0.873  0.893  0.926  0.975  1.045  1.139  1.261  1.411
    1.3  1.000  0.858  0.863  0.876  0.900  0.940  0.998  1.081  1.192  1.337  1.515  1.727
    1.4  1.000  0.859  0.873  0.899  0.942  1.006  1.098  1.223  1.388  1.593  1.839  2.119
    1.5  1.000  0.864  0.891  0.934  1.001  1.097  1.231  1.410  1.639  1.916  2.237  2.589

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.7 Power to Lognormal Percentile Ratio, 0 ≥ a ≥ -1, P = 99%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.000000 % WITH NORMAL VARIATE Z = 2.326

VY\ALFA    LOG -0.001 -0.100 -0.200 -0.300 -0.400 -0.500 -0.600 -0.700 -0.800 -0.900 -1.000
    0.1  1.000  0.999  0.997  0.995  0.993  0.991  0.988  0.986  0.983  0.981  0.978  0.976
    0.2  1.000  0.995  0.988  0.979  0.971  0.961  0.951  0.941  0.930  0.919  0.906  0.893
    0.3  1.000  0.987  0.970  0.952  0.932  0.912  0.889  0.865  0.839  0.810  0.779  0.744
    0.4  1.000  0.971  0.944  0.913  0.880  0.844  0.805  0.761  0.713  0.658  0.596  0.522
    0.5  1.000  0.950  0.910  0.865  0.817  0.763  0.704  0.638  0.562  0.473  0.367  0.233
    0.6  1.000  0.922  0.870  0.811  0.747  0.675  0.595  0.504  0.399  0.273  0.116  0.000
    0.7  1.000  0.892  0.826  0.754  0.674  0.586  0.486  0.373  0.241  0.085  0.000  0.000
    0.8  1.000  0.857  0.781  0.696  0.603  0.500  0.384  0.253  0.105  0.000  0.000  0.000
    0.9  1.000  0.823  0.737  0.641  0.537  0.422  0.295  0.154  0.013  0.000  0.000  0.000
    1.0  1.000  0.788  0.694  0.590  0.477  0.354  0.220  0.080  0.000  0.000  0.000  0.000
    1.1  1.000  0.756  0.654  0.545  0.426  0.298  0.162  0.031  0.000  0.000  0.000  0.000
    1.2  1.000  0.724  0.619  0.505  0.383  0.254  0.120  0.006  0.000  0.000  0.000  0.000
    1.3  1.000  0.696  0.588  0.472  0.349  0.220  0.092  0.000  0.000  0.000  0.000  0.000
    1.4  1.000  0.672  0.562  0.445  0.323  0.197  0.075  0.000  0.000  0.000  0.000  0.000
    1.5  1.000  0.650  0.541  0.425  0.306  0.184  0.068  0.000  0.000  0.000  0.000  0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.8 Power to Lognormal Percentile Ratio, 0 ≥ a ≥ -1, P = 99.9%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.900000 % WITH NORMAL VARIATE Z = 3.090

VY\ALFA    LOG -0.001 -0.100 -0.200 -0.300 -0.400 -0.500 -0.600 -0.700 -0.800 -0.900 -1.000
    0.1  1.000  0.999  0.995  0.991  0.986  0.982  0.977  0.972  0.966  0.961  0.955  0.949
    0.2  1.000  0.993  0.978  0.961  0.942  0.923  0.901  0.878  0.853  0.826  0.796  0.763
    0.3  1.000  0.982  0.947  0.909  0.867  0.822  0.772  0.715  0.652  0.578  0.492  0.387
    0.4  1.000  0.960  0.902  0.837  0.766  0.686  0.597  0.494  0.374  0.230  0.049  0.000
    0.5  1.000  0.930  0.845  0.750  0.646  0.529  0.397  0.248  0.081  0.000  0.000  0.000
    0.6  1.000  0.890  0.778  0.654  0.518  0.368  0.206  0.044  0.000  0.000  0.000  0.000
    0.7  1.000  0.845  0.707  0.555  0.393  0.222  0.060  0.000  0.000  0.000  0.000  0.000
    0.8  1.000  0.795  0.633  0.460  0.280  0.109  0.000  0.000  0.000  0.000  0.000  0.000
    0.9  1.000  0.745  0.562  0.372  0.187  0.036  0.000  0.000  0.000  0.000  0.000  0.000
    1.0  1.000  0.692  0.494  0.295  0.115  0.004  0.000  0.000  0.000  0.000  0.000  0.000
    1.1  1.000  0.643  0.433  0.230  0.064  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.2  1.000  0.595  0.377  0.177  0.031  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.3  1.000  0.550  0.328  0.134  0.013  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.4  1.000  0.509  0.286  0.101  0.004  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.5  1.000  0.472  0.250  0.075  0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.9 Power to Lognormal Percentile Ratio, 0 ≥ a ≥ -1, P = 99.999%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.999000 % WITH NORMAL VARIATE Z = 4.275

VY\ALFA    LOG -0.001 -0.100 -0.200 -0.300 -0.400 -0.500 -0.600 -0.700 -0.800 -0.900 -1.000
    0.1  1.000  0.999  0.990  0.981  0.972  0.962  0.951  0.940  0.928  0.916  0.902  0.888
    0.2  1.000  0.991  0.958  0.921  0.881  0.838  0.789  0.735  0.674  0.604  0.522  0.423
    0.3  1.000  0.974  0.901  0.820  0.731  0.631  0.518  0.388  0.236  0.058  0.000  0.000
    0.4  1.000  0.942  0.821  0.686  0.539  0.377  0.203  0.034  0.000  0.000  0.000  0.000
    0.5  1.000  0.899  0.724  0.535  0.338  0.144  0.004  0.000  0.000  0.000  0.000  0.000
    0.6  1.000  0.842  0.616  0.385  0.167  0.014  0.000  0.000  0.000  0.000  0.000  0.000
    0.7  1.000  0.778  0.507  0.252  0.055  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    0.8  1.000  0.708  0.405  0.148  0.007  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    0.9  1.000  0.636  0.313  0.076  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.0  1.000  0.566  0.235  0.032  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.1  1.000  0.499  0.172  0.011  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.2  1.000  0.437  0.123  0.002  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.3  1.000  0.381  0.086  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.4  1.000  0.331  0.059  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.5  1.000  0.287  0.039  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
Table 5.10 Power to Lognormal Percentile Ratio, 0 ≥ a ≥ -1, P = 99.9999%

POWER TO LOGNORMAL PREDICTED PTH PERCENTILE RATIO
FOR A PROBABILITY OF 99.999900 % WITH NORMAL VARIATE Z = 4.772

VY\ALFA    LOG -0.001 -0.100 -0.200 -0.300 -0.400 -0.500 -0.600 -0.700 -0.800 -0.900 -1.000
    0.1  1.000  0.998  0.988  0.976  0.964  0.951  0.938  0.923  0.908  0.891  0.873  0.853
    0.2  1.000  0.990  0.943  0.901  0.850  0.793  0.730  0.658  0.576  0.480  0.364  0.216
    0.3  1.000  0.970  0.878  0.776  0.662  0.536  0.393  0.232  0.056  0.000  0.000  0.000
    0.4  1.000  0.935  0.782  0.614  0.434  0.245  0.065  0.000  0.000  0.000  0.000  0.000
    0.5  1.000  0.886  0.667  0.440  0.218  0.039  0.000  0.000  0.000  0.000  0.000  0.000
    0.6  1.000  0.823  0.545  0.279  0.068  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    0.7  1.000  0.751  0.426  0.152  0.006  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    0.8  1.000  0.674  0.318  0.068  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    0.9  1.000  0.596  0.227  0.023  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.0  1.000  0.519  0.156  0.005  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.1  1.000  0.449  0.102  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.2  1.000  0.384  0.065  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.3  1.000  0.326  0.039  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.4  1.000  0.276  0.023  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
    1.5  1.000  0.232  0.013  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000

ALFA : POWER TRANSFORMATION EXPONENT
VY : COEFFICIENT OF VARIATION
event for a = -1 and Vy = 0.5 and 1.5. This ratio is 0.233 and less than
10^-3, respectively, implying an underestimation of the magnitude of this
event by factors of as much as 4.3 and 1000 relative to the true 100
year event. Note that for low probability levels, the over- and
underprediction of the corresponding percentiles are reversed for the
positive and negative a's at high coefficients of variation and absolute
values of a (Tables 5.1 and 5.6). More investigation is needed to assess
whether this is a consequence of the first order approximation (Equation
5.3.22) or simply a consequence of the effect of an increasing
coefficient of variation on the tail of the distribution.
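The ratios quoted above from Tables 5.2, 5.4 and 5.7 can be checked numerically. The sketch below assumes that the normal variate z and the standardized mean are linked by z = (COR^(-a) - 1)/(a Vy) - (a - 1) Vy/2 for a ≠ 0, and by the exact lognormal relation for a = 0. This form is an assumption that happens to reproduce the quoted table entries; the function names are illustrative and are not those of the program MWK. For negative a, cases where the power relation has no positive solution are treated as an unbounded percentile, which is assumed here to correspond to the 0.000 entries of the tables.

```python
import math


def inverse_cor(z: float, vy: float, a: float) -> float:
    """Percentile divided by the mean (1/COR) for normal variate z.

    Assumed relation: z = (COR**(-a) - 1)/(a*vy) - (a - 1)*vy/2 for a != 0,
    and the exact lognormal relation for a == 0.
    """
    if a == 0.0:
        s2 = math.log(1.0 + vy * vy)  # variance of log(y)
        return math.exp(z * math.sqrt(s2) - 0.5 * s2)
    # base is (percentile / mean) ** a under the assumed relation
    base = 1.0 + a * vy * z + 0.5 * a * (a - 1.0) * vy * vy
    if base <= 0.0:
        if a < 0.0:
            return math.inf  # no finite percentile: treated as unbounded
        raise ValueError("no positive percentile for these arguments")
    return base ** (1.0 / a)


def percentile_ratio(z: float, vy: float, a: float) -> float:
    """Lognormal-predicted percentile divided by the power-model percentile."""
    return inverse_cor(z, vy, 0.0) / inverse_cor(z, vy, a)


if __name__ == "__main__":
    for z, vy, a in [(2.326, 0.5, 1.0), (2.326, 1.5, 1.0), (2.326, 0.5, -1.0)]:
        print(f"z={z}, Vy={vy}, a={a}: ratio = {percentile_ratio(z, vy, a):.3f}")
```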
A more detailed analysis of the sensitivity to the shape of the
distribution used for reliability analysis is given in the next section,
where for different transformations the reliability is calculated for a
range of values of Vy and COR. The following analysis is a
generalization of the work of Niku et al. (1979, 1981) to distribution
models other than the lognormal, on which they based their analysis.
5.4.2. Sensitivity of the level of reliability to the shape of the
distribution and to the first order approximation
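The reliability level associated with a given COR (standardized
mean), coefficient of variation (Vy) and normalizing transformation
(a) is evaluated in two steps. First, the generalized reliability index
(Equations 5.2.9 and 5.3.25) is calculated as

$$ z_L = \begin{cases} \dfrac{\mathrm{COR}^{-a} - 1}{a V_y} - \dfrac{(a - 1) V_y}{2}, & a \neq 0 \\[2ex] \dfrac{\log(1/\mathrm{COR}) + \tfrac{1}{2}\log(1 + V_y^2)}{[\log(1 + V_y^2)]^{1/2}}, & a = 0 \end{cases} \tag{5.4.4} $$
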
Then the corresponding reliability is calculated using a rational
approximation formula (Abramowitz and Stegun, 1964, p. 932, Eqn. 26.2.17)
for the cumulative normal distribution (Equation 3.2.3). A small
computer program (KWM) was developed to perform these two steps and
generate tables of reliability level as a function of the coefficient of
variation and the standardized mean. Samples of these tables along with
a listing of the source program are given in Appendix G. Tables G.1.1
to G.1.11 give the reliability as a function of the standardized mean
and coefficient of variation for eleven distribution models from the GND
family, ranging from the normal (a = 1) to the inverse normal (a = -1).
Other tables for different models from the same family or other ranges
of Vy and COR may be easily generated by the program KWM (Table G.2.1).
These tables generalize the results of Niku et al. (1979, Table VI) for
the lognormal distribution to the more general case in which the
normalization is accomplished through the Box-Cox transformation. Table
G.1.6 (a = 0) compares very well with the results of Niku et al. Plots
of the reliability versus the standardized mean for different
coefficients of variation are given for a in the range 1 to -1 (Figures
5.1 to 5.5). These figures, too, are inspired by the work of Niku et al.
for the lognormal case. Note that Figure 5.3 reproduces the curves of
Figure 7 in Niku et al. (1979) very well. For the normal distribution
(Figure 5.1) the reliability is equal to 0.5 for a percentile equal to
the mean (COR = 1), independently of the coefficient of variation. For a
standardized mean less than 1 (percentiles larger than the mean) the
reliability is a decreasing function of the coefficient of variation,
but for values larger than 1 (percentiles smaller than the mean) the
reliability increases with the coefficient of variation. As a decreases
the reliability becomes an
Figure 5.1 Reliability Versus Standardized Mean, a = 1.0
[Plot: RELIABILITY VERSUS COR; reliability (0.0 to 1.0) against coefficient of reliability, MU/YL (0.0 to 2.0)]
Figure 5.2 Reliability Versus Standardized Mean, a = 0.60
Figure 5.3 Reliability Versus Standardized Mean, a = 0.0
Figure 5.4 Reliability Versus Standardized Mean, a = -0.6
Figure 5.5 Reliability Versus Standardized Mean, a = -1.0
increasing function of the coefficient of variation at smaller and
smaller values of the standardized mean, although the transition point
is not unique as for the normal case but is highly dependent on the
value of Vy (Figures 5.2 to 5.5). Note the smooth transition between
the different figures as a changes from 1 to -1, indicating that no
significant error is introduced by the first order approximation used
for calculating the generalized reliability index (Equation 5.3.34).
Even for high values of the coefficient of variation, the curves follow
the same trends as those for small Vy, where the approximation holds
best. Also, the curves for the lognormal case (a = 0) fit very well as a
transition between those for a > 0 and a < 0. Thus, the increase of
reliability with the coefficient of variation explains the change of
trend in the coefficient of reliability (Tables 5.1 and 5.6) noticed in
the previous section. From the above observation, it becomes clear that
such a change is mainly due to the effect of the coefficient of
variation on the tail of the distribution.
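The two-step calculation of this section can be sketched as follows. This is not the listing of the program KWM given in Appendix G; it is a minimal illustration that assumes the generalized reliability index takes the form z = (COR^(-a) - 1)/(a Vy) - (a - 1) Vy/2 for a ≠ 0 (with the exact lognormal relation for a = 0) and that uses the coefficients of the Abramowitz and Stegun rational approximation 26.2.17 for the cumulative normal distribution. All function names are hypothetical.

```python
import math


def normal_cdf_as26217(x: float) -> float:
    """Cumulative normal distribution by Abramowitz and Stegun Eqn. 26.2.17."""
    b = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)
    t = 1.0 / (1.0 + 0.2316419 * abs(x))
    # Horner evaluation of b1*t + b2*t**2 + ... + b5*t**5
    poly = t * (b[0] + t * (b[1] + t * (b[2] + t * (b[3] + t * b[4]))))
    tail = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) * poly
    return 1.0 - tail if x >= 0.0 else tail


def reliability_index(cor: float, vy: float, a: float) -> float:
    """Generalized reliability index z for standardized mean cor (assumed form)."""
    if a == 0.0:
        s2 = math.log(1.0 + vy * vy)
        return (-math.log(cor) + 0.5 * s2) / math.sqrt(s2)
    return (cor ** (-a) - 1.0) / (a * vy) - 0.5 * (a - 1.0) * vy


def reliability(cor: float, vy: float, a: float) -> float:
    """Reliability level: step 1 gives z, step 2 the cumulative normal of z."""
    return normal_cdf_as26217(reliability_index(cor, vy, a))


if __name__ == "__main__":
    for vy in (0.3, 0.7, 1.5):
        row = [reliability(cor, vy, 1.0) for cor in (0.25, 0.5, 1.0, 1.5)]
        print(vy, [f"{r:.4f}" for r in row])
```

For the normal case (a = 1) the printed rows show the behavior described in the text: the reliability equals 0.5 at COR = 1 for every coefficient of variation.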
5.4.3. Sensitivity of design event magnitude to the design period
and to the shape of the distribution
The risk or probability that a given extreme event, standard or
design event will be exceeded at least once in the next n years,
Pr(y>yL)n, is often fixed by some design policy. This probability is
related to the probability of such an event or standard being exceeded
in any year by Equation 1.2.8, which may be written as

$$ \Pr(y > y_L)_n = 1 - [1 - \Pr(y > y_L)]^n \tag{5.4.5} $$

The probability of no occurrence of such an event defines the
reliability R (for a single year), which may be evaluated from Rn, the
reliability over the n-year period, by

$$ R = R_n^{1/n} \tag{5.4.6} $$

where

$$ R = 1 - \Pr(y > y_L) \tag{5.4.7} $$

and

$$ R_n = 1 - \Pr(y > y_L)_n \tag{5.4.8} $$

The purpose of this section is the analysis of the sensitivity of
the magnitude of the design event yL to the design period n and to the
shape parameter a of the distribution used for evaluating the
reliability R. Assume a reliability Rn = 0.95 is desired over the next
20, 50 and 100 years. The corresponding reliabilities for any year are
given by Equation 5.4.6, and the standardized normal variates zL are
interpolated from cumulative normal distribution tables. The ratio of
the magnitude of the design event to the mean may be evaluated directly
from Equation 5.4.1 as 1/COR for different values of the transformation
parameter a. Assuming a coefficient of variation Vy = 0.70, values of
this ratio for the above three design periods and for a = 1, 0.5, 0.0
and -0.5 are given in Table 5.11. Note the high sensitivity of the
design event magnitude to the shape of the distribution, especially at
long design periods. For example, if the lognormal distribution (a = 0)
were used for estimating the magnitude of the design event which will
not be exceeded in the next 20 years with 95% reliability, its value
would be 4.80 times the mean. But if the true distribution is normal,
the magnitude of such an event would be only 2.96 times the mean. This
would result in an overprediction of more than 60%, which is about six
times the percentage increase that would result from use of a design
period of 100 years, for which the design event magnitude would be 3.30
times the mean. Thus, designing for a 20 or 100 year period would have
less impact than using the wrong distribution.
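The calculation behind Table 5.11 can be sketched in a few lines. The example below is an illustration under stated assumptions, not the author's program: the single-year reliability follows Equation 5.4.6, the normal variate is obtained from the inverse cumulative normal of Python's standard library rather than by table interpolation, and the ratio 1/COR is assumed to follow from z = (COR^(-a) - 1)/(a Vy) - (a - 1) Vy/2 for a ≠ 0 and from the exact lognormal relation for a = 0. Because the variates are not rounded to two decimals here, the computed ratios differ slightly from the tabulated ones, most visibly for a = -0.5.

```python
import math
from statistics import NormalDist


def annual_reliability(r_n: float, n: int) -> float:
    """Single-year reliability R = R_n**(1/n) (Equation 5.4.6)."""
    return r_n ** (1.0 / n)


def design_ratio(z: float, vy: float, a: float) -> float:
    """Design event magnitude divided by the mean (1/COR), assumed relation."""
    if a == 0.0:
        s2 = math.log(1.0 + vy * vy)
        return math.exp(z * math.sqrt(s2) - 0.5 * s2)
    base = 1.0 + a * vy * z + 0.5 * a * (a - 1.0) * vy * vy
    if base <= 0.0:
        raise ValueError("no positive design event for these arguments")
    return base ** (1.0 / a)


def sensitivity_table(r_n=0.95, vy=0.70, periods=(20, 50, 100),
                      shapes=(1.0, 0.5, 0.0, -0.5)):
    """Rows of (n, R, z, [1/COR for each shape parameter])."""
    rows = []
    for n in periods:
        r = annual_reliability(r_n, n)
        z = NormalDist().inv_cdf(r)  # exact variate, not table interpolation
        rows.append((n, r, z, [design_ratio(z, vy, a) for a in shapes]))
    return rows


if __name__ == "__main__":
    for n, r, z, ratios in sensitivity_table():
        print(n, f"{r:.6f}", f"{z:.2f}", [f"{x:.2f}" for x in ratios])
```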
Table 5.11. Sensitivity of the Ratio, Design Event Magnitude/Mean
(1/COR), to the Design Period and to the Shape of the
Distribution for Rn = 0.95.

                                    ALFA
   n         R     ZL     1.0    0.50    0.00   -0.50
  20  0.997439   2.80    2.96    3.68    4.80   24.09
  50  0.998975   3.09    3.16    4.08    5.77   95.65
 100  0.999487   3.28    3.30    4.35    6.50  782.43

A similar conclusion was reached by Tung and Mays (1980) during
their consideration of the effect of the choice of the appropriate
frequency distribution on the evaluation of overall risk. That is, the
overall risk is most sensitive to the choice of the distribution.
5.5. Summary and Conclusions
Approximate relations between the moments of the Box-Cox
transformed variables and the original variables were derived. These
relations allowed the expression of many indices of reliability in terms
of the original moments and the shape parameter (a). Although
approximate, these relations performed very well in estimating the
reliability; their performance was compared to that of the exact
relations of the logarithmic transformation.
A sensitivity analysis illustrated the high sensitivity of the
predicted pth percentile, the reliability level and the design event
magnitude to the shape of the distribution. An unexpected increase of
reliability with the coefficient of variation for a fixed extreme event
was highlighted by the new generalized reliability indices and
definitions.
The next chapter will give some illustrations, based on a case
study, of the applicability of the reliability-based generalized
approach to parameter estimation for the three types of hydrologic
models listed in Chapter 2.
CHAPTER 6
CASE STUDY
6.1. Introduction
The importance of reliability analysis in hydrological modeling
will be illustrated in this chapter by a case study in southeast Florida.
In an effort to better assess the reliability of parameter estimates and
model predictions, based on the general approach developed in the
previous chapters, the three types of hydrological models described in
Chapter 2 are investigated. Each of these models is applied at some
level of the case study for the description of the hydrological
processes under investigation. Deterministic models are used for
continuous simulation of the rainfall-runoff process in order to fill in
the gaps in the continuous monitoring of runoff and associated pollutant
loads. Probabilistic models are applied for the frequency analysis of
the generated series, the input series and the series of residuals in
the calibration phase. Stochastic models are applied to the monthly and
yearly rainfall series for a better description of the correlation
structure within these series. The focus of this chapter will be mainly
on the applicability of the reliability-based, uniform approach
suggested by this research for estimating the parameters of all three
types of models.
6.2. Case Study
6.2.1. Sites and data description
This study will focus on southeast Florida. Eight rainfall
stations monitored continuously (hourly basis) by the United States
National Weather Service (NWS), and four urban basins monitored by the
United States Geological Survey (USGS) will be the basis of this study.
The exact locations of these basins and stations are given in Figures
6.1 and 6.2. The rainfall stations were selected from many in the
region for the completeness of their records for the period 1956-1979.
Detailed information regarding these stations is given in Table 6.1.
The four urban basins were investigated for rainfall and runoff water
quantity and quality. The data collected at these basins are among the
best data of the Nationwide Urban Rainfall-Runoff Quality Data Base
(Huber et al., 1981a). The good quality of these data has been
confirmed by many users of the data base (Doyle, 1981; Remain, 1980;
Voorhees and Wenzel, 1981; and Maalel, 1983b). Some of the
characteristics of these basins are given in Table 6.2; more details
about the basins are given by Miller (1979).
The time and space variability of the measured data at the four
urban basins and the eight rainfall stations will be analyzed in the
next section in an effort to assess the inherent variability of the
observed processes. This variability will then be included in the
evaluation of the reliability of the mathematical models at the
calibration and verification phases.
6.2.2. Temporal and spatial variability of the recorded data
The time variability of the total volume and duration of the
rainfall and runoff storm events for the four urban basins is summarized in
Figure 6.1 Location of the Four Urban Basins. After Doyle (1981)
Figure 6.2 Location of the Eight NWS Hourly Rainfall Stations
Table 6.1. Hourly Rainfall Station Identification

Number  NWS Index  Name                    County      Latitude N  Longitude W  Elevation MSL (ft)
1       085663     Miami AP                Dade        25°49'      80°17'        8
2       086323     N. New River Canal      Palm Beach  26°20'      80°32'       22
3       089525     W. Palm Beach           Palm Beach  26°41'      80°06'       15
4       080616     Belle Glade, HRN G.4    Palm Beach  26°42'      80°43'       31
5       086657     Ortona Lock 2           Glades      26°47'      81°18'       21
6       087293     Port Mayaca S.L. Canal  Martin      26°59'      80°37'       34
7       087859     St. Lucie N. Lock 1     Martin      27°05'      80°18'       15
8       082158     Daytona Beach AP        Volusia     29°11'      81°03'       31

NWS: National Weather Service
MSL: Mean Sea Level
Table 6.2. Characteristics of the Four Urban Basins.

Basin                      Area, acre (km²)  % imperviousness  Total number of  Number of events
                                                               storm events     with quality
Highway                    58.3 (0.236)      18.30             108              42
Single Family Residential  40.80 (0.165)      5.92              74              31
Commercial                 20.40 (0.083)     97.90             113              30
Multifamily Residential    14.70 (0.059)     43.90              43              15

Table 6.3, in which the mean and the coefficient of variation are
calculated for each of the basins. Included in the same table are the
mean and the coefficient of variation of the corresponding runoff
coefficients. Note that while the rainfall has a duration variability
greater than the variability of the runoff duration, its total volume is
less variable than that of the runoff. This should be expected, given
the damping effect introduced by the pervious area on the runoff
duration, and the increased variability of the total runoff volume that
results from the change of the basins' hydrologic characteristics (e.g.,
initial moisture deficit) between storms. This is well illustrated by
the variability of the runoff coefficients, which decreases as the
percentage of imperviousness increases.
Most of the storm events over the Highway, the Single Family and
Multifamily basins were recorded by a secondary raingage in addition to
the main raingage. Hardee (1979) gives a detailed description of the
instrumentation used in these basins. The two raingages were within 200
meters (660 feet) of each other. A tentative evaluation of the
magnitude of the spatial variability of rainfall and of the degree to
which the point rainfall represents the areal rainfall is made through
the analysis of the difference between the records of the two gages for
the same storms. The means of these differences and of their absolute
values are summarized in Table 6.4. Note that the mean of the absolute
differences is much higher than that of the simple differences,
indicating no systematic instrumental error. Thus, most of the
variability is attributable to spatial variation of the rainfall. The
last column of Table 6.4 gives the mean absolute difference as a
percentage of the
Table 6.3. Time Variability of Rainfall, Runoff and Coefficient of Runoff Over
Four Urban Basins in South Florida, for the 1956-1979 Period.

                 ------------ Rainfall ------------   ------------- Runoff -------------   Runoff
                 Duration (hr)    T. volume (in)      Duration (hr)    T. volume (in)      Coefficient
Basin            μ       V        μ       V           μ       V        μ       V           μ       V
Highway          3.45    0.64     0.729   1.01        3.87    0.56     0.135   1.07        0.156   0.44
Single Family    2.14    0.86     0.589   1.06        2.23    0.72     0.075   1.77        0.082   0.72
Commercial       3.83    0.77     0.709   1.00        3.74    0.64     0.639   1.14        0.848   0.32
Multifamily      3.15    0.56     0.988   1.12        2.91    0.79     0.547   1.41        0.540   0.41

μ: mean
V: coefficient of variation
Table 6.4. Spatial Variability of Rainfall Within the Urban Basins.

Basin            Mean (G1 - G2)   Mean |G1 - G2|   % of mean
                 (inches)         (inches)         runoff
Highway          0.004            0.064             47.4
Single Family    0.040            0.113            150.7
Multifamily      0.001            0.077             14.1

G1 = primary gage
G2 = secondary gage
mean runoff volume by storm to illustrate the amount of error in model
prediction that may be explained by the spatial variability of the
rainfall inputs, when the latter are represented by point measurements.
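As a check on the last column of Table 6.4, the percentages can be recomputed from the mean absolute gage differences (Table 6.4) and the mean runoff volumes per storm (Table 6.3). The short Python sketch below does this; the helper `gage_difference_stats` is a hypothetical illustration of how the two means in Table 6.4 would be obtained from paired gage records, and is not the procedure documented in the study.

```python
def gage_difference_stats(primary, secondary):
    """Mean difference and mean absolute difference between paired gage totals."""
    diffs = [g1 - g2 for g1, g2 in zip(primary, secondary)]
    n = len(diffs)
    return sum(diffs) / n, sum(abs(d) for d in diffs) / n


def percent_of_mean_runoff(mean_abs_diff, mean_runoff):
    """Mean absolute gage difference expressed as a percentage of mean runoff."""
    return 100.0 * mean_abs_diff / mean_runoff


# Values in inches, taken from Tables 6.4 and 6.3 respectively.
mean_abs_diff = {"Highway": 0.064, "Single Family": 0.113, "Multifamily": 0.077}
mean_runoff = {"Highway": 0.135, "Single Family": 0.075, "Multifamily": 0.547}

percentages = {
    basin: round(percent_of_mean_runoff(mean_abs_diff[basin], mean_runoff[basin]), 1)
    for basin in mean_abs_diff
}
```

The recomputed values agree with the tabulated 47.4, 150.7 and 14.1 percent.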
Rainfall records from the eight NWS stations are first analyzed on
a monthly and yearly basis for time and space variability. The time
variability is assessed through a statistical analysis of the annual and
monthly totals over the 24 years of record. The spatial variability is
assessed through comparison of the statistics of the eight stations. A
complete list of the monthly and annual rainfall totals is given in
Tables H.1 to H.8 of Appendix H. Also included in these tables are the
mean, standard deviation (STDV), coefficient of variation and
coefficient of skewness for each month and for the annual totals. All
eight stations exhibited the same seasonal trend: a wet season extends
from May to October, with an average monthly total of 162.5 mm
(6.32 inches), and a dry season covers the rest of the year, with an
average monthly total of only 56.6 mm (2.23 in), about one third of the
wet season average. But the average coefficient of variation of the dry
season is about twice that of the wet season (93%). Table 6.5
summarizes the seasonal and annual statistics along with their temporal
and spatial coefficients of variation. Notice the importance of the
temporal variability compared to the spatial variability, as illustrated
by the ratio of the corresponding coefficients of variation (last column
of Table 6.5). From the same table we see that the skewness coefficient
has the largest coefficient of variation, especially for the annual
totals, where it is equal to 6.84. Although highly variable, the
skewness coefficient of the annual totals has a lower mean than those of
the wet and dry seasons.
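The statistics listed above, and the ratio reported in the last column of Table 6.5, can be sketched as follows. The exact estimators used for Appendix H are not stated in this section, so the sample (n - 1) standard deviation and the moment coefficient of skewness used here are assumptions, as are the function names.

```python
from statistics import mean, stdev


def rainfall_summary(totals):
    """Mean, standard deviation, coefficient of variation and skewness.

    Assumptions: sample (n - 1) standard deviation; skewness taken as the
    moment coefficient m3 / m2**1.5.
    """
    n = len(totals)
    avg = mean(totals)
    sd = stdev(totals)
    m2 = sum((x - avg) ** 2 for x in totals) / n
    m3 = sum((x - avg) ** 3 for x in totals) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"mean": avg, "stdv": sd, "cv": sd / avg, "skew": skew}


def spatial_to_temporal_ratio(spatial_cv, temporal_cv):
    """Spatial coefficient of variation as a percentage of the temporal one."""
    return 100.0 * spatial_cv / temporal_cv


# Wet-season MEAN row of Table 6.5: spatial V = 0.09, time V = 0.27.
wet_mean_ratio = round(spatial_to_temporal_ratio(0.09, 0.27))
```

For the wet-season mean this gives the tabulated ratio of 33 percent; the other ratios in Table 6.5 follow in the same way.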
Table 6.5. Temporal and Spatial Variability of the Seasonal and
Yearly Total Rainfall Over the Southeast of Florida.

Season          Statistic   Mean (in)   Spatial V   Time V   Ratio (%)
Wet* Season     MEAN         6.32       0.09        0.27     33
                STDV         0.53       0.05        0.21     24
                SKEW         0.70       0.25        0.49     51
Dry** Season    MEAN         2.23       0.16        0.42     38
                STDV         0.91       0.03        0.19     16
                SKEW         1.25       0.15        0.61     25
Yearly          MEAN        52.14       0.10
                STDV         0.20       0.13
                SKEW         0.04       6.84

*: May to October.
**: November to April.
The above statistics showed no statistical differences between the
eight stations for either monthly or annual rainfall. Even when a small
difference exists, the spatial variability is overwhelmed by the
temporal variability. Thus, any of the eight stations may represent the
average rainfall patterns over southeast Florida fairly well. Usually,
the monthly, seasonal, and yearly data are of little use in studies of
small watersheds such as those monitored by the USGS, for which storm
event data are much more important for water quantity and quality
investigations. Unfortunately, such data are seldom collected, either
because of their great expense or simply because they are needed only
for design purposes and post-construction monitoring is not conducted.
In both cases, mathematical modeling of the hydrologic behavior of
structures and basins offers an alternative.
6.2.3. Continuous versus single event simulation
Continuous simulation is one of the tools most recommended for
storm water management model application. McPherson (1978), among many
others, discussed the high variability of hydrologic indicators derived
from long series of field observations. Such variability is illustrated
for the case of this study by Table 6.3. McPherson criticized the
classical approach of fitting hydrological models to some observed
hydrographs and pollutographs in the following words:
Let the modelers lament observed-calculated differences and let the
field engineers sermonize on instrument error, but always remember
that the degree of data network adequacy is a much more important
measure of the reliability of planning or design conclusions based
on simulations founded on data from the network.
The spatial variability of rainfall data given by Table 6.4 is a
good illustration of the origin of these criticisms. Similar comments
on deterministic modeling studies in which exact reproduction of some
measurements of the response of the modeled processes was sought were
made by Delleur and Dendrou (1981), Sautier and Delleur (1980), Voorhees
and Wenzel (1981), and Maalel (1983b). They all recommended the use of
continuous simulation and the assessment of the reliability of the
results on the basis of a comparison of the frequency distributions of
the observations and the predictions. This approach follows the
recommendation of McPherson (1978):
If reliability of the results of analysis is a genuine issue, the
only recourse is a time-series synthesis of runoff-quality events
via simulation using hydrologic models.
More details about this approach and its application for this case
study will be given in the next section.
6.3. Deterministic Models
6.3.1. Introduction
In the previous section, it was shown that the spatial variability
of the monthly and yearly rainfall totals is overwhelmed by the
corresponding time variability. In this section a similar analysis will
be performed on the hourly data after their separation into events, in
order to derive event-based statistics comparable to those listed in
Table 6.3. Such a comparison will be the basis for judging how well the
continuous simulation represents the storm events. The separation of the
hourly data into events and a distribution-free statistical analysis of
these events are performed by a synoptic computer program (SYNOP). A
short description of this program and a summary of its results are given
in the next section. Then, an overview of the problems associated with
the calibration and verification of deterministic models is presented,
with emphasis on the potential application of the reliability-based
generalized approach to such problems.
6.3.2. Distribution free statistical analysis
The SYNOP program was originally developed by Hydroscience Inc.
(1976). It has since been improved and updated by the University of
Florida. The hourly data are summarized by storm events, each with an
associated volume, duration, average intensity, and time since the
previous event (delta), measured between the midpoints of successive
storms. Storm events are defined as rainfall periods separated by a
minimum number of consecutive hours without rainfall (MIT). Heaney et
al. (1977) suggested two definitions for the MIT. The first is based on
the lag time for which the autocorrelation function of hourly data
vanishes. The second definition, a less precise one, is based on a plot
of the total number of storms per year versus the MIT; a "significant"
change in the slope of the resulting curve defines the adequate MIT. A
third definition assumes that the time between storms (delta) is
exponentially distributed; the adequate MIT is then the one leading to a
coefficient of variation equal to one for delta (Hydroscience, Inc.,
1979). For this study, an MIT of 5 hours was arbitrarily selected,
independently of the above definitions, for all eight stations in order
to have a common base for comparison of their statistics, which are very
sensitive to such a definition. Sample outputs from SYNOP are given in
Tables 6.6 and 6.7, summarizing the event statistics for the 24 years
of record on a monthly and yearly basis, respectively. For each of the
four characteristics of the storm events (duration, intensity, volume
and delta) the following statistics are calculated by SYNOP: number,
total, minimum, maximum, average, standard deviation, variance and
coefficient of variation. The same seasonal trend exhibited by the
monthly rainfall totals in the previous section is reproduced by the
Table 6.6 Monthly Storm Event Statistics at Miami Airport Station.

MIAMI WSMO AP METEOROLOGICAL STATION 1956-79. 5 HR(S) INTEREVENT
RAINFALL STATISTICS BY MONTH (FOR PERIOD OF RECORD)
(Duration and delta in hours, intensity in in/hr, volume in inches.)

MONTH              NUMBER  TOTAL         MINIMUM       MAXIMUM       AVERAGE       STD DEV       VARIANCE      COEF-VAR
  1    DURATION    179.    0.763000E+03  0.100000E+01  0.330000E+02  0.426257E+01  0.541567E+01  0.293295E+02  0.127052E+01
       INTENSITY   179.    0.128520E+02  0.571500E-02  0.470005E+00  0.717987E-01  0.840083E-01  0.705740E-02  0.117005E+01
       VOLUME      179.    0.529090E+02  0.100000E-01  0.304000E+01  0.295581E+00  0.458903E+00  0.210592E+00  0.155255E+01
       DELTA       178.    0.185315E+05  0.600000E+01  0.677500E+03  0.104110E+03  0.116198E+03  0.135020E+05  0.111611E+01
  2    DURATION    147.    0.701000E+03  0.100000E+01  0.340000E+02  0.476871E+01  0.578977E+01  0.335215E+02  0.121412E+01
       INTENSITY   147.    0.108765E+02  0.333400E-02  0.473335E+00  0.739897E-01  0.821787E-01  0.675333E-02  0.111067E+01
       VOLUME      147.    0.514493E+02  0.100000E-01  0.573000E+01  0.349995E+00  0.602972E+00  0.363575E+00  0.172280E+01
       DELTA       147.    0.156295E+05  0.650000E+01  0.585500E+03  0.106323E+03  0.107008E+03  0.114507E+05  0.100644E+01
  3    DURATION    148.    0.546000E+03  0.100000E+01  0.250000E+02  0.368919E+01  0.384717E+01  0.148007E+02  0.104282E+01
       INTENSITY   148.    0.144904E+02  0.600100E-02  0.145000E+01  0.979078E-01  0.158850E+00  0.252332E-01  0.162244E+01
       VOLUME      148.    0.488994E+02  0.100000E-01  0.267000E+01  0.330401E+00  0.493305E+00  0.243350E+00  0.149305E+01
       DELTA       148.    0.177300E+05  0.600000E+01  0.704000E+03  0.119797E+03  0.120914E+03  0.146202E+05  0.100932E+01
  4    DURATION    132.    0.506000E+03  0.100000E+01  0.280000E+02  0.383333E+01  0.458535E+01  0.210254E+02  0.119618E+01
       INTENSITY   132.    0.156905E+02  0.400100E-02  0.870002E+00  0.118867E+00  0.158425E+00  0.250986E-01  0.133279E+01
       VOLUME      132.    0.694193E+02  0.100000E-01  0.162400E+02  0.525904E+00  0.151995E+01  0.231025E+01  0.289017E+01
       DELTA       132.    0.171405E+05  0.700000E+01  0.773500E+03  0.129852E+03  0.156287E+03  0.244258E+05  0.120358E+01
  5    DURATION    306.    0.126400E+04  0.100000E+01  0.360000E+02  0.413072E+01  0.457963E+01  0.209730E+02  0.110868E+01
       INTENSITY   306.    0.379997E+02  0.333400E-02  0.710002E+00  0.124183E+00  0.146758E+00  0.215378E-01  0.118179E+01
       VOLUME      306.    0.170948E+03  0.100000E-01  0.115700E+02  0.558653E+00  0.116192E+01  0.135006E+01  0.207986E+01
       DELTA       306.    0.198465E+05  0.600000E+01  0.636500E+03  0.648573E+02  0.962579E+02  0.926558E+04  0.148414E+01
  6    DURATION    439.    0.182900E+04  0.100000E+01  0.430000E+02  0.416629E+01  0.501023E+01  0.251024E+02  0.120256E+01
       INTENSITY   439.    0.624127E+02  0.500100E-02  0.103001E+01  0.142171E+00  0.179375E+00  0.321756E-01  0.126167E+01
       VOLUME      439.    0.239267E+03  0.100000E-01  0.770000E+01  0.545027E+00  0.884240E+00  0.781880E+00  0.162238E+01
       DELTA       439.    0.175060E+05  0.600000E+01  0.287500E+03  0.398770E+02  0.414381E+02  0.171712E+04  0.103915E+01
  7    DURATION    402.    0.118600E+04  0.100000E+01  0.170000E+02  0.295025E+01  0.284995E+01  0.812221E+01  0.966003E+00
       INTENSITY   402.    0.479210E+02  0.500100E-02  0.108500E+01  0.119206E+00  0.147561E+00  0.217743E-01  0.123786E+01
       VOLUME      402.    0.130827E+03  0.100000E-01  0.268000E+01  0.325441E+00  0.453965E+00  0.206084E+00  0.139492E+01
       DELTA       402.    0.179190E+05  0.600000E+01  0.318000E+03  0.445746E+02  0.481390E+02  0.231736E+04  0.107996E+01
  8    DURATION    481.    0.160800E+04  0.100000E+01  0.330000E+02  0.334303E+01  0.377558E+01  0.142550E+02  0.112939E+01
       INTENSITY   481.    0.573843E+02  0.333400E-02  0.135000E+01  0.119302E+00  0.161119E+00  0.259594E-01  0.135052E+01
       VOLUME      481.    0.175287E+03  0.100000E-01  0.690000E+01  0.364421E+00  0.586441E+00  0.343913E+00  0.160924E+01
       DELTA       481.    0.180905E+05  0.600000E+01  0.227000E+03  0.376102E+02  0.348556E+02  0.121491E+04  0.926759E+00
  9    DURATION    482.    0.186300E+04  0.100000E+01  0.360000E+02  0.386514E+01  0.486923E+01  0.237094E+02  0.125978E+01
       INTENSITY   482.    0.554424E+02  0.400100E-02  0.202000E+01  0.115026E+00  0.167573E+00  0.280807E-01  0.145683E+01
       VOLUME      482.    0.200577E+03  0.100000E-01  0.843000E+01  0.416135E+00  0.787288E+00  0.619820E+00  0.189190E+01
       DELTA       482.    0.171450E+05  0.600000E+01  0.237500E+03  0.355705E+02  0.373567E+02  0.139552E+04  0.105021E+01
 10    DURATION    386.    0.167200E+04  0.100000E+01  0.450000E+02  0.433161E+01  0.562821E+01  0.316768E+02  0.129934E+01
       INTENSITY   386.    0.356763E+02  0.333400E-02  0.110000E+01  0.924256E-01  0.126892E+00  0.161016E-01  0.137291E+01
       VOLUME      386.    0.161257E+03  0.100000E-01  0.570000E+01  0.417765E+00  0.762885E+00  0.581993E+00  0.182611E+01
       DELTA       386.    0.170715E+05  0.600000E+01  0.345000E+03  0.442267E+02  0.488202E+02  0.238341E+04  0.110386E+01
 11    DURATION    206.    0.712000E+03  0.100000E+01  0.360000E+02  0.345631E+01  0.443880E+01  0.197030E+02  0.128426E+01
       INTENSITY   206.    0.168349E+02  0.470600E-02  0.660001E+00  0.817228E-01  0.100780E+00  0.101567E-01  0.123320E+01
       VOLUME      206.    0.685487E+02  0.100000E-01  0.852000E+01  0.332761E+00  0.803943E+00  0.646327E+00  0.241597E+01
       DELTA       206.    0.163430E+05  0.600000E+01  0.648500E+03  0.793347E+02  0.934153E+02  0.872643E+04  0.117748E+01
 12    DURATION    153.    0.638000E+03  0.100000E+01  0.470000E+02  0.416993E+01  0.612565E+01  0.375235E+02  0.146900E+01
       INTENSITY   153.    0.107900E+02  0.333400E-02  0.547500E+00  0.705230E-01  0.971691E-01  0.944183E-02  0.137783E+01
       VOLUME      153.    0.477393E+02  0.100000E-01  0.438000E+01  0.312022E+00  0.598583E+00  0.358302E+00  0.191840E+01
       DELTA       153.    0.169055E+05  0.600000E+01  0.776000E+03  0.110493E+03  0.121987E+03  0.148807E+05  0.110402E+01
Table 6.7 Annual Storm Event Statistics at Miami Airport Station.

MIAMI WSMO AP METEOROLOGICAL STATION 1956-79. 5 HR(S) INTEREVENT
RAINFALL STATISTICS BY YEAR (FOR PERIOD OF RECORD)
YEAR: 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
NUMBER
DURATION
113
intensity
113.
VOLUME
1 13
DELTA
1 12.
DURATION
139.
INTENSITY
139.
VOLUME
139.
DELTA
139.
DURATION
142.
INTENSITY
142.
VOLUME
142.
delta
142.
DURATION
133.
INTENSITY
133.
VOLUME
183.
DELTA
1 B3.
DURATION
140
INTENSITY
140
VOLUME
140.
DELTA
140.
DURATION
1 16.
INTENSITY
116.
VOLUME
116.
DELTA
116.
DURATION
121.
INTENSITY
121.
VOLUME
121.
DELTA
121.
DURATION
119.
INTENSITY
119.
VOLUME
119.
DELTA
119.
DURATION
146.
INTENSITY
146.
VOLUME
146.
DELTA
146
DURATION
124.
INTENSITY
124.
VOLUME
124.
DELTA
124
DURATION
164.
INTENSITY
164.
VOLUME
164.
DELTA
164.
DURATION
141.
INTENSITY
141.
VOLUME
141.
DELTA
141.
DURATION
131.
INTENSITY
131.
VOLUME
131.
DELTA
131.
DURATION
165.
INTENSITY
163.
VOLUME
163.
DELTA
163.
DURATION
131.
INTENSITY
131.
VOLUME
131.
DELTA
131.
DURATION
137.
INTENSITY
13/.
VOLUME
137.
DELTA
137.
DURATION
176.
INTENSITY
176.
VOLUME
176.
DELTA
176.
DURATION
137.
INTENSITY
137.
VOLUME
137.
DKLTA
137.
DURATION
137.
INTENSITY
137.
VOLUME
137.
delta
137.
DURATION
133.
INTENSITY
133.
VOLUME
133.
DELTA
133.
DURATION
133.
INTENSITY
133.
VOLUME
133.
DELTA
133.
IX/R AT I ON
134.
INTENSITY
134.
VOLUME
134
DELTA
134.
DURATION
167.
INTENSITY
167.
VOLUME
167.
DELTA
167.
DURATION
132.
INTENSITY
132.
VOLUME *
132.
DELTA
132.
0. 331 000E+03
0. 1 18332E+02
0. 369994E+02
O. 823630E+04
O. 37100OE+03
O. 1B1919E+02
O. 708690E+02
O. B0O63OE+O4
O. 640OOOE03
O. 139479E+02
O. 719192E02
O. 066OOOE+O4
O 7Z2000E03
0. 233409E+02
O. B93290E +02
0. 881 130E+04
O. 1OOOOOE+Ol
O. 300100E02
O.I00000E01
O. 6OOOOOE+Ol
O. 1 OOOOOE+Ol
O. 333400E02
O. 1 00000E01
O. 600000E+01
o. ioooooe+oi
O. 400100E02
O. 100000E01
O. 630000E 01
o. ioooooe+oi
0. 730100E02
O. 100000E01
O. 600000E+01
O 270000E +02
O. B90002E+00
0. 342000E+O1
O. 7O4000E*03
O. 1BOOOOE+02
O. 1OOOOOE+Ol
O. 699000E+01
O. 64B300E+03
O. 470000E+02
0. 660001E+O0
O. 1030006+02
O. 3*33006+03
O. 360000^^02
O. 8B0C03E+O0
0. 032OOOEO1
O. 3793006+03
O. 292920E+0I
O. 104896E+00
O. 327429E+00
O. 733402E+02
O. 3391 19E+01
O. 1 14414E+00
0. 44371 7E+00
O. 33386BE+02
O. 43633BE+01
0. 1 12309E+00
O. 3064736+00
0. 609939E02
O. 394333E+01
O. 127346E00
0. 4081 36E *00
0. 481303E+02
O. 3411426+01
O. 1343486+00
O. 31 4709E+00
O. 1064 1 6E+03
O. 332636E+01
O. 1 476B3E+00
O. 776B70E+00
O. 0O869OE+O2
O. 640362601
O. 127349E00
O. 1026B0E01
O. 660346E+02
0. 302163601
O. 1 60934E *00
O. 923996E00
O. 647370E02
VARIANCE
O. 1 1637BC02
O.10JO33EO1
O. 264923E00
0. 1 13243E03
O.110660602
O. 21S109c01
O 6O3327E00
O.633979E04
O. 42O633E02
0.162639501
O.103432E01
O. 436717E04
O.232168602
O.233996E01
O.B3746BEOO
O. 4 1 9346Â£*04
CQ6FVAR
0. 116462601
O. 1 2B269EQ 1
O. 137197601
O. 144704E01
0. 9263106OO
O. 129079E01
O. 174297E01
O. 14600BE+01
O. 142123601
0. 1 1337CE01
0. 202736601
0. 1033*06*01
0. 127230601
0. 126 1776*01
0. 1397006+01
0. 134489E01
0. 633000603
O. 160233E02
O. 7023925+02
O. B7D630E04
O. 1 000006*01
O. 3001 00E02
O. 1 00000E01
O. 630000E01
0. 3600006*02
0. llOOOCEOl
O. 3430005+01
O. 4043006+03
0. 433371 E+01
0. 1144 326+00
O. 301331E+OO
O. 623464E+02
O. 3B9404E+01
O. 133363E+00
O. 1OBO1 OE+Ol
O. 797338E02
0.3473976+02
0.241332601
O.116662E+0I
O.63606B6+04
O. 12994 7E + 01
0. 1337466 + 01
O. 213223E+01
O. 4000006+03
0. 1 1 1 4306+02
0. 4 1 6994E + 02
0. 830130604
O. 413000603
O. 123473E02
0. 41 3493602
O. 886000604
O. 333000E03
O. 1 33723E02
O. 4*0793602
O B92300E+04
0. 433000603
O 133943E+02
0. 601 992E02
O. B69200E+04
O. 437000E03
O. 171363E+02
0. 333974502
O. 8B1300E04
O 67400CE03
O. 1924 13E02
O. 820390E02
O. B4790CE04
0. 391 OOOE03
0. 1601 20E02
O. 6S2193E+02
O. 901030E04
O. 766000E03
O. 1823B4E+02
O. B33991E02
O. B43100E04
0. 64 3000E03
O. 1 771 7 7E02
0. 71 4 B ?0E + 0 2
O. B6B100604
O. 46B000603
0. 124431602
0. 447 193E02
O. 9129306+04
Q. 490000603
0. 131243602
O. 307193E02
O. 870700604
O. 61 3000E03
0. 1931 32E02
0. 631 090E02
0. 873130604
0. 3710006+03
O. 1498626+02
O. 3323926 + 02
O. 8037306+04
O. 43B000E03
0. 1677006+02
O. 4399726+02
0. 883200604
O. 404000E03
0. 173366602
0. 3909936+02
O. 041 300E04
0. 3740006+03
O. 173313602
O. 33B993E02
O. 91 3930604
O. 627000603
O. 122166602
0. 6494926+02
O. 0737006+04
O. 6640006+03
0. 133302E+02
O. 4DB291E+02
O. 071 3306+04
0. 3100006+03
0. 1293906+02
O. 601 091E +02
O. 0772OOE+O4
O. 100000E+01
O. 300100E02
O. 100000E01
O. 700000E+01
O. 1 OOOOOE+Ol
O. 300100E02
O. 100000E01
O. 600000601
O. 1 OOOOOE+Ol
O.600100E02
O. 100000E01
O. 700000E+01
O. 1 OOOOOE+Ol
O. 100010E01
O. 1 OOOOOE01
O. 6000006+01
0. 1OOOOOE+Ol
O.333400E02
O. 100000E01
O. 600000E+01
O. 1 OOOOOE+Ol
O. 428600E02
O. 100000E01
O. 600000E+01
0. 1000006+01
O.371300E02
0. 1 OOOOOEOl
O. 600000E+01
O. 'OOOOOE+Ol
O. 666800E02
O.1OOOOOEO1
O.630000E+01
O.1OOOOOE+Ol
O.300100602
O. 100000601
O. 600000E+01
O. 1OOOOOE+Ol
O.666800E02
O. 100000E01
O. 600000E+01
0. 1OOOOOE+Ol
O. 470600602
O. 100000601
O. 630000E+01
O. 1000006+01
O. 434600E02
O. 1 OOOOOEO 1
O. 600000E+01
O.1OOOOOE+Ol
O. 400100E02
O.100000601
O.63OO0CC+01
0. 1 OOOOOE+Ol
0. 1 00030E01
0. 1 OOOOOEOl
O. 600000E+01
O. 1 OOOOOE+Ol
O.333400E02
O. 1 OOOOOEOl
O. 6000006+01
O. 1 OOOOOE+Ol
O. 371 300E02
O. 1 OOOOOEOl
O. 730000E+01
O. 1 OOOOOE+Ol
O. 730100E02
0. 1 OOOOOEOl
O. 7OOOOOE +01
O. 1 OOOOOE+Ol
O. 333400E02
O. 1 OOOOOEOl
O. 600000E+01
0. 1000006+01
O. 100010E01
O. 1 OOOOOEOl
O. 600000E +01
O. 3100OOE+O2
O. 643002E+00
O. 413000E+O1
O. 3B2300E+O3
O. 2OOOOOE +02
O. 710003E+00
O. 337000E+01
O. D67000E+03
O. 38O000E+02
0. 9600035+00
O. 4 1 4000E+01
O. 3703006+03
O. 430000E02
O. 7J0002E00
O. 6900005O1
0. 4663006+03
O. 360000E02
O. 1 43000E+O1
O. 6020006+01
O. 3360005*03
0. 3400006+02
O. 202000E+01
0.7700006+01
0. 491 OOOE03
O. 230000E+02
O. 1 OOOOOE+01
O. 4740006+01
O. 677300E+O3
O. 430000E+02
O. 134000E+01
O. 433000E+01
O. 7760005+03
0. 3300005*02
O. 133000E+O1
0. 3470006+01
O. 316300E03
0. 1900005*02
O. 380002E+00
O. 308000E+01
O. 373000E+03
O. 210000E+02
O.1030006*01
0 309000E*01
O. 6363005*03
O 220000E+02
O'. 1O0OOOE+O1
O. 3B3000E+01
O. 377000E+03
0. 190000E+02
O. 830001E00
O. 272000E+O1
O. 470OOOE+O3
O. 320000E+02
0. 9300036+00
O. 422000E01
O. 3B3300E03
O. 130000E+02
O 940003E+00
O.1330006+01
O. 432000E*O3
0. 3900006*02
O. 1093006*01
O. 343000E+01
O. 491 OOOE+OD
O. 3100006*02
0. 7100026+00
O. 1 1 D900E+O2
O. 773300E+O3
O. 2000006*02
0. 6400035+00 '
O. 197000E+01
O. 343300E+03
O. 20OOOOE+O2
0. 9466606*00
O. 1 624006*02
O. 616300E+O3
O. 331724E+01
O. 961033E01
O. 339477E00
O. 739734E+02
O. 341322E+01
O. 102043E00
O. 3433B4E+00
0. 732231E+02
0. 4663376*01
0. 1 14034E + OO
O. 387223E+00
O. 730000E+02
0. 332192E+01
O. 10M41E + 00
O. 412324E+00
0. 3933426+02
0. 3324196+01
0. 1 331 96E+00
0. 4709636+00
O. 710726E+02
0. 4231716+01
0. 1 17326E+00
0. 3003606+00
O. 317012E+02
0. 4191 49E01
0. 1 13360E+00
O. 469640E + 00
O. 639043E+02
O. 307283E+01
O. 120784E+00
0. 3322465+00
O. 33B344E+02
O. 390909E+01
0. 1073B0E+00
O. 433267E+00
O. 326121E+02
0. 3372326+01
0. 93001 OE01
O.34136BE+00
O. 696908E+02
O. 337664E+01
0. 1 10400E00
0. 37021 4E+00
O. 633347E+02
O. 34B293E+01
0. 1 10B82E+00
O. 33B374E + 00
O. 497244E+02
O. 363694E+01
O. 934336E01
O. 339103E + 00
O. 3642TTE+02
0. 291720E01
0. 10681 3E+00
0. 31 20976+00
0. 3638226+02
O. 264032E+01
0. 1 1331 1E+00
0. 2333 31 E+00
0. 3498696 + 02
O. 431 3796+01
0. 1303126+00
O. 420296E+00
O. 688684E+02
O. 467910EOl
O. 911687E01
O. 4B4693E+00
O. 633307E+02
O. 397603E+01
0. 8101 90E01
O.274426E+00
O. 321766E+02
O 392424E+01
0. 981799E01
O. 433372E+00
O. 664343E+02
O.4B0303E+O1
O. 1204 1 3E+00
O. 642006E+00
O. 021 B79E+02
O. 333334E+01
O. 123306E+00
O. 376336E00
O. 9744886+02
O. 670330E+01
O. 160128E+00
O. 6280486+00
' O. 102321E+03
O. 4616216+01
O. 1269295+00
O.B39018600
O. 7434BBE+02
O. 480424E+01
O. 18I047E00
O. B62329E00
O. B81033E02
O. 341141E+01
0. 1904 47E00
0. 943498600
0. 636111E02
0. 43338*601
O. 144867E+00
O. 73781 6E00
O. 94491 4E02
O. 687032EO1
O. 1B193BE00
O. B34033E00
O. 936126E+02
O.4414236+01
0. 1 34023E00
0. 718938600
O. 7077636+02
O.3338B6E+01
O. 110349E+00
O. 321100E+00
O. 9B7B31E+02
O. 3613I7E+01
O. 137338E00
O. 339402600
O. 976419E02
O. 364370E+01
O. 144377E+00
O.376062E00
O. 622321 E02
O.334667E0I
O. 1066336 +00
O. 490337E + 00
0. 736016E+02
0. 393319E+01
0. 1461 03E+00
O.368730600
O. 783447E02
O. 22873IE01
O. 163937E00
0. 3642396+00
O. 704703E02
O.3294B9E01
O. 203037EOO
O. 612988E00
O. 797343E02
O. 336372E01
O. 121617E00
O. 122733E + 01
O. 100227E03
O. 398633E01
O. 100723E00
O. 33671 3E00
O. 397323E02
O. 463037E+0I
O. 1 442225+00
O. 140437E + O1
O.B42916E+02
O.230693E+02
O.144997E01
O.412171E+00
0. 1363376 + 01
O. 123297E+01
O. 17B394E+01
0.
673484E+04
0.
Ill097E+01
O.
1111126+02
O.
9763965+00
0.
133278E01
0.
121324601
0.
332196E+O0
0.
167046EOI
0.
949627E+04
0.
1330B3E01
0.
449369E+02
0.
143733601
0.
236409EO1
0.
140396601
0.
394 444E+00
0.
1621936+01
0.
1O4676E+03
0.
13642B501
0.
213094502
0.
138962E01
o.
161109E01
0.
1203796+01
o.
737912E+00
0.
20B336E+01
0.
333732E04
0.
123220E+01
o.
230907E02
. 0.
1363226+01
o.
327780E01
0.
1310086+01
0.
743611E+00
0.
183099601
0.
776239E04
0.
123966E01
0.
292934E02
0.
127B7BE01
0.
362701E01
0.
162323c01
0.
890IB3E00
0.
108364601
0.
43043IE04
0.
126904EO1
0.
203339602
0.
108168601
0.
209863601
0.
127360E+O1
0.
374293E00
0.
1613616+01
0.
B92863E04
0.
1470646+01
0.
4720136+02
0.
1334336+01
0.
331016601
0.
1306316+01
o.
7Z9407E+00
0.
1 34631E +01
0.
7M177E+04
0.
171243E+01
0.
194336E+02
0.
112923E01
0.
237231E01
0.
1 434376+01
0.
3160716+00
0.
1639346+01
0.
300931E+04
0.
1343236+01
0.
1232335+02
0.
9903796+00
0.
122211601
0.
1163666+01
o.
271343E00
0.
1326306+01
0.
973809E+O4
0.
1417436+01
0.
130693502
0.
1010776+01
o.
1B8673E01
0.
1244196+01
0.
290934E+O0
0.
1 43700E01
0.
933393E04
0.
133634E+01
0.
132911E02
0.
104673EO1
0.
209026E01
0.
130389EO1
0.
331047EOO
0.
1606346+01
0.
387332E04
0.
123194E+01
0.
123799602
0.
9731806+00
o.
1 13706E01
0.
1117126+01
0.
240921E+00
0.
144746E+01
0.
342597204
0.
1303722+01
0.
136277E02
0.
1333136+01
0.
213462E01
0.
1367026+01
0.
323477E00
0.
1822336+01
o.
616927E04
0.
1393086+01
0.
323177E+01
0.
8662326+00
0.
260919601
0.
1446966+01
0.
1326836+00
0.
1423396+01
0.
496606c 04
0.
1291386+01
o.
200339502
0.
1226865+01
o.
420403601
0.
137344E+01
o.
373734EO0
0.
143047601
0.
636073E04
0.
113B07E+01
0.
2B7910E02
0.
114674601
o.
1 47907E01
0.
133393601
o.
13O607E01
0.
233261E0I
0.
1004 34 E03
0.
13336/EO1
0.
130910E02
0.
100239601
o.
101433E01
0.
124323601
0.
1272^4 E + 00
0.
1299036+01
0.
336777E04
0.
1 14481E01
o.
216270E+O2
0.
1103096+0!
0.
208000501
0.
146896E01
0.
220393601
0
32*011E+M*
0.
710338E04
0.
126B26E+01
event-based statistics. This is illustrated in Table 6.6 by the large
difference between the monthly total number of storms within the two
seasons; the minimum number for the wet season is 306 (May), while the
maximum number for the dry season is 206 (November). Another
illustration of this seasonality is given by Figure 6.3, which presents
plots of the average delta and its standard deviation versus the
months of the year. Note the similarity between the plots for the Miami
and North New River stations and the closeness of the standard deviation
to the mean, supporting the assumption of the exponential distribution
for delta. Figure 6.4 gives similar plots for the average intensity;
the seasonal trend remains, but the transition between the two seasons
is more gradual than for delta.
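The event definition used in this section (rainfall periods separated by at least MIT consecutive dry hours, with delta measured between the midpoints of successive storms) can be sketched as follows. This is an illustrative reconstruction from the description in the text, not the SYNOP source code; the function names and the short hourly series in the usage note are hypothetical.

```python
from statistics import mean, stdev


def _close_event(start, last_wet, volume):
    """Build the summary record of one storm event (times in hours)."""
    duration = last_wet - start + 1
    return {
        "start": start,
        "duration": duration,
        "volume": volume,
        "intensity": volume / duration,
        "midpoint": start + duration / 2.0,
        "delta": None,  # time since the midpoint of the previous event
    }


def separate_events(hourly, mit=5):
    """Split hourly rainfall depths into events separated by >= mit dry hours."""
    events = []
    start = last_wet = None
    volume = 0.0
    for hour, depth in enumerate(hourly):
        if depth <= 0:
            continue
        # A dry gap of at least `mit` hours closes the current event.
        if start is not None and hour - last_wet - 1 >= mit:
            events.append(_close_event(start, last_wet, volume))
            start = None
        if start is None:
            start, volume = hour, 0.0
        volume += depth
        last_wet = hour
    if start is not None:
        events.append(_close_event(start, last_wet, volume))
    for previous, current in zip(events, events[1:]):
        current["delta"] = current["midpoint"] - previous["midpoint"]
    return events


def delta_cv(events):
    """Coefficient of variation of delta; near 1 if delta is exponential."""
    deltas = [e["delta"] for e in events if e["delta"] is not None]
    return stdev(deltas) / mean(deltas)


# Hypothetical hourly depths (inches): two events separated by 5 dry hours.
sample = [0.1, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.1]
sample_events = separate_events(sample, mit=5)
```

With an MIT of 5 hours, two one-hour storms separated by exactly 5 dry hours have midpoints 6 hours apart, which is consistent with the minimum delta of 6.0 hours appearing in Table 6.6. Repeating `delta_cv` over a range of candidate MIT values would be one way to apply the third MIT definition cited above.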
A detailed analysis of the time and space variability of the storm
event characteristics for these two seasons and for the yearly totals is
summarized in Table 6.8. Here again note that the spatial variability
remains very small compared to the temporal variability for all four
storm event characteristics. Thus, rainfall over southeast Florida may
be considered homogeneous for purposes of this study, and data from any
of the investigated stations may be used for continuous hydrologic
simulation within the region. Based on the results of this
investigation, continuous simulation over the four urban basins
described above will be made with hourly data from the closest of the
eight NWS stations. But first, a review of the calibration-verification
procedure and of the potential application of reliability-based
parameter estimation to the deterministic model used for the simulation
is given in the next section.
Figure 6.3 Seasonal Variability of the Time Between Events
[Line-printer plot. DELTA VS MONTH, A = average, S = std. dev.; MIAMI WSMO AP METEOROLOGICAL STATION 1956-79, 5 HR(S) INTEREVENT. Vertical axis: delta (hr), 0 to about 156; horizontal axis: month, 0 to 12.]
Figure 6.3 (continued) Seasonal Variability of the Time Between Events
[Line-printer plot. DELTA VS MONTH, A = average, S = std. dev.; NORTH NEW RIVER CANAL 1, STATION 086323, 1956-79, 5 HRS INTEREVENT. Vertical axis: delta (hr), 0 to about 186; horizontal axis: month, 0 to 12.]
Figure 6.4 Seasonal Variability of the Average Intensity
[Line-printer plot. INTENSITY VS MONTH, A = average, S = std. dev.; MIAMI WSMO AP METEOROLOGICAL STATION 1956-79, 5 HR(S) INTEREVENT. Vertical axis: intensity (in/hr), 0 to about 0.18; horizontal axis: month, 0 to 12.]
Figure 6.4 (continued) Seasonal Variability of the Average Intensity
[Line-printer plot. INTENSITY VS MONTH, A = average, S = std. dev.; WEST PALM BEACH WSO AIRPORT STATION, 1956-79, 5 HR(S) INTEREVENT. Vertical axis: intensity (in/hr), 0 to about 0.18; horizontal axis: month.]
Table 6.8. Time and Space Variability of the Storm Event Based
Statistics. Data are for the eight NWS stations.

Season    Statistic    Duration   Intensity   Volume   Delta
                       (hr)       (in/hr)     (in)     (hr)
Wet       Mean          4.15      0.1191      0.46      55
          Vt            1.30      1.20        1.50       1.05
          Vs            0.11      0.03        0.07       0.12
          Ratio (%)     8.46      2.50        4.67      11.43
Dry       Mean          5.08      0.0754      0.39     130
          Vt            1.20      1.24        1.40       1.10
          Vs            0.15      0.10        0.07       0.17
          Ratio (%)    12.50      8.06        5.00      15.45
Yearly    Mean          4.62      0.0973      0.43      92
          Vt            1.25      1.22        1.45       1.07
          Vs            0.13      0.05        0.05       0.15
          Ratio (%)    10.40      4.10        3.45      14.02

Vt: Temporal coefficient of variation
Vs: Spatial coefficient of variation
Ratio: (Vs/Vt) x 100
6.3.3. Calibration and verification of deterministic models
This section will focus mainly on computer models for predicting the
quantity and quality of urban stormwater runoff. These models may be
cost-effective and reasonably accurate substitutes for extensive data
collection programs when calibrated and verified with data
representative of the real conditions of the basins under investigation
(Jewell et al., 1978). Among the models developed for this purpose, the
USGS model DSA (Dawdy et al., 1978; Alley and Smith, 1982) has been
widely applied, in particular to the modeling of the four urban basins
from south Florida (Doyle and Miller, 1980; and Doyle, 1981). A second
model, which will be used for continuous simulation at these four basins
in the next section, is the Storm Water Management Model, SWMM (Huber et
al., 1981b).
The calibration of these models involves minimizing some function of
the difference between measured field conditions and model predictions
by adjusting parameters within the models. Verification is the process
of checking the model calibration using an independent set of data.
Dawdy et al. (1978) and Troutman (1982) used the sum of squares of the
differences between the logarithms of the measured flows and those
predicted by the DSA model as a parameter selection criterion. Each of
the parameters was optimized independently for many storm events, then
averaged to represent hydrological conditions of more than one storm.
These parameters were constrained to prespecified ranges, based on
prior knowledge of hydrological conditions of the basins, in order to
avoid unrealistic parameter estimates. Moore and Clarke (1981) gave a
very good critique of the estimation of such parameters, stressing the
fact that it is almost impossible to predict true parameters from
finite samples of observations. Troutman (1982) suggested the use of
the Box-Cox transformation (Equation 3.3.3) instead of the logarithmic
transformation along with the nonlinear estimation procedure of the
USGS-DSA model. This approach was suggested by Sorooshian and Dracup
(1980) and applied by Sorooshian (1981) for the estimation of a
four-parameter rainfall-runoff model. This method reduces to the
weighted least squares method, wherein the weights are dependent on the
power transformation parameter (a) as in Equation 3.3.12. The parameter
estimation algorithm is similar to the one given in Section 3.3.1,
except that the parameters of the model (represented by g(x,θ)) are
estimated by a nonlinear procedure (Section 2.2.3) instead of simple
linear regression. Gupta and Sorooshian (1983) discussed the problem of
uniqueness of parameter estimates in the calibration of conceptual
rainfall-runoff models. Their main finding was that even under ideal
conditions (simulation studies) it was often impossible to obtain unique
estimates for the parameters. Thus, they suggested appropriate
reparameterization of the pertinent model equations. The efficiency of
such reparameterization was shown in Chapter 4 for the case of
generalized probability distribution models. The same concept will be
extended to stochastic models in Section 6.5.
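The two calibration criteria discussed above can be sketched in a few lines. This is a minimal illustration, assuming the standard Box-Cox form (q^lam - 1)/lam with the logarithm as the limiting case lam = 0; Equation 3.3.3 itself is not reproduced here, and the function names and the symbol lam are hypothetical.

```python
import math


def box_cox(q, lam):
    """Standard Box-Cox power transformation (assumed form); log when lam == 0."""
    if q <= 0.0:
        raise ValueError("flows must be positive for a power or log transform")
    if lam == 0.0:
        return math.log(q)
    return (q ** lam - 1.0) / lam


def transformed_sse(measured, predicted, lam=0.0):
    """Sum of squared differences between transformed measured and predicted flows.

    lam = 0 gives the log-flow criterion; lam = 1 gives ordinary least squares.
    """
    if len(measured) != len(predicted):
        raise ValueError("series must have the same length")
    return sum(
        (box_cox(m, lam) - box_cox(p, lam)) ** 2
        for m, p in zip(measured, predicted)
    )


# Hypothetical flows (cfs), for illustration only.
measured = [12.0, 30.0, 55.0]
predicted = [10.0, 33.0, 50.0]
print(transformed_sse(measured, predicted, lam=0.0))   # log criterion
print(transformed_sse(measured, predicted, lam=0.5))   # intermediate power
```

In a calibration run, a nonlinear optimizer would adjust the model parameters to minimize this quantity; smaller values of lam give relatively more weight to errors in low flows.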
A methodology for calibrating stormwater models was recommended by
Jewell et al. (1978), based on experimentation with the SWMM model. They
suggested calibration for some average conditions across several storms,
in order to reduce predictive errors and increase the reliability of the
results. Although SWMM was designed as a deterministic model for which
calibration on any storm event should yield similar parameter estimates
(Huber et al., 1981b), Maalel (1983b), like Jewell et al., found that
different storm events resulted in different calibrations and different
predictions. Thus, he developed a new technique for calibrating SWMM
for more than one storm at the same time, an efficient method that makes
it possible to detect systematic errors in the inputs and to avoid
estimation of unrealistic parameters. The objective was to minimize the
average differences between measured and predicted total runoff and
peak flows. This procedure was applied to the four urban basins
described in Section 6.2.1. Table 6.9 illustrates the advantages of
such a procedure in estimating the basin width, a main calibration
parameter. The calibration results are summarized in Table 6.10. These
results are used in the next section for runoff quantity and quality
simulation at the four urban basins.
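The multi-storm idea can be illustrated with the peak-flow entries of Table 6.9. The sketch below is an assumption-laden illustration, not the procedure actually coded for SWMM: it uses the mean absolute percent difference across storms as the selection criterion (the text specifies only "average differences"), works with magnitudes only, and the names are hypothetical.

```python
def percent_error(measured, predicted):
    """Percent difference as defined under Table 6.9."""
    return 100.0 * (measured - predicted) / measured


# Magnitudes of the % Peak entries of Table 6.9, one list per trial basin
# width (ft); storms are in the table's row order.
PEAK_ERROR_PCT = {
    8000: [2729.00, 87.54, 208.50, 104.70, 602.60, 29.10, 173.40, 84.11, 34.75],
    2000: [2381.00, 86.06, 179.90, 43.50, 385.30, 25.99, 56.96, 74.30, 35.04],
    1000: [1534.00, 63.99, 75.01, 16.30, 231.10, 1.51, 22.60, 20.35, 40.10],
}


def mean_abs_error(errors):
    """Average magnitude of the percent differences across storms."""
    return sum(abs(e) for e in errors) / len(errors)


def best_width(errors_by_width):
    """Trial width with the smallest average error over all storms."""
    return min(errors_by_width, key=lambda w: mean_abs_error(errors_by_width[w]))


for width, errors in PEAK_ERROR_PCT.items():
    print(f"width {width:5d} ft: mean |% peak error| = {mean_abs_error(errors):7.1f}")
print("selected width:", best_width(PEAK_ERROR_PCT), "ft")
```

Judging all storms together also makes an outlier such as storm 3 stand out, which is the kind of systematic input error the multi-storm procedure is meant to expose.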
6.3.4. Water quantity and quality - continuous simulation
As noted in Section 6.2.3, continuous simulation is the best tool
for synthesis of runoff and quality events once the models are
calibrated with real data. Table 6.11 gives samples of the yearly and
monthly summaries of total rainfall, runoff, and pollutant loads for the
multifamily urban basin, using hourly rainfall from the Miami airport
meteorological station. Within the same run, SWMM also produces ranked
series of the 50 highest hourly rainfall intensities (Table 6.12) and
the 50 highest hourly runoff and pollutant concentrations (Table 6.13).
These tables are very helpful for the selection of critical periods
suitable for more detailed analysis of extreme events and associated
responses of the basin under investigation. An event-based statistical
analysis of the generated hourly series of runoff quantity and quality
can be performed directly by SWMM, through its statistical block, STATS
(Huber et al., 1981b). Figures 6.5 and 6.6 are sample plots from STATS,
giving the empirical frequencies of the generated flows and total
Table 6.9. Highway Basin Calibration by Adjusting the Width

                                        Basin Width
Storm  Date             8000 ft              2000 ft              1000 ft
  #    YR/MO/DA    % Peak   % Volume    % Peak   % Volume    % Peak   % Volume
  3    75/5/9      2729.00   1216.00    2381.00   1491.00    1534.00   1151.00
 19    75/8/23       87.54      5.87      86.06      8.27      63.99      5.41
 20    75/8/29      208.50     22.40     179.90     20.24      75.01     16.65
 23    75/9/17      104.70    +18.10      43.50    +19.10     +16.30    +20.70
 29    75/10/22     602.60     87.30     385.30     88.80     231.10     82.20
 39    76/5/15       29.10    +17.61      25.99    +18.32      +1.51    +19.30
 40    76/5/17      173.40    +11.82      56.96    +13.64     +22.60    +16.50
 48    76/6/7        84.11    +17.37      74.30    +18.61      20.35    +20.10
 63    76/8/16      +34.75    +24.20     +35.04    +24.68     +40.10    +25.17

% = (Measured - Predicted) / Measured x 100
Table 6.10. Calibration Results for the Storm Water Management Model at the
Four South Florida Basins.

Basin           Area     Width   Slope   Manning   % imp.   ---- Infiltration ----
                (acres)  (ft)            n                   ψ       K       IMD
Highway         58.26    1000    0.030   0.020     18.10    15.0    0.20    0.04
Single Family   40.80    6000    0.027   0.015      5.92    15.0    0.99    0.03
Commercial      20.40    5000    0.010   0.015     97.90     8.0    0.30    0.01
Multifamily     14.70    2000    0.030   0.015     43.92    10.0    0.40    0.02

ψ: capillary suction at wetting front, ft.
K: saturated hydraulic conductivity, ft/sec.
IMD: initial moisture deficit, ft/ft.
Table 6.11 Yearly and Monthly Summaries from
SWMM for the Multifamily Urban Basin.

SUMMARY OF QUANTITY AND QUALITY RESULTS FOR 1974

MONTH    INLT   RAIN     FLOW        COD         TOT. SOL.
                INCH     CU. FEET    POUNDS      POUNDS
JANU      99     0.30    1.950E+04   5.740E+01   7.727E+01
FEBR      99     0.20    1.263E+04   4.527E+0    7.207E+01
MARC      99     1.70    1.194E+05   1.782E+02   1.319E+03
APRI      99     0.70    4.771E+04   1.288E+02   6.523E+02
MAY       99     2.80    1.859E+05   1.727E+02   1.529E+03
JUNE      99     5.10    3.47?E+05   1.072E+02   7.303E+02
JULY      99     7.60    5.221E+05   1.405E+02   7.975E+02
AUGU      99     3.50    2.360E+05   1.006E+02   7.022E+02
SEPT      99     5.50    3.792E+05   1.414E+02   9.769E+02
OCTO      99     1.10    7.252E+04   1.813E+01   6.418E+01
NOVE      99     8.20    3.?38E+05   2.032E+02   1.193E+03
DECE      99     0.40    2.670E+04   4.062E+01   9.982E+01
YR TOT    99    37.10    2.55E+06    1.33E+03    8.21E+03

SUMMARY OF QUANTITY AND QUALITY RESULTS FOR 1977

MONTH    INLT   RAIN     FLOW        COD         TOT. SOL.
                INCH     CU. FEET    POUNDS      POUNDS
JANU      99     4.20    2.981E+05   8.919E+01   5.805E+02
FEBR      99     1.60    1.066E+05   1.255E+02   5.005E+02
MARC      99     0.0     0.0         0.0         0.0
APRI      99     1.50    9.206E+04   1.979E+02   8.458E+02
MAY       99     7.60    5.193E+05   1.586E+02   1.805E+03
JUNE      99     6.30    4.321E+05   1.093E+02   5.570E+02
JULY      99     4.50    3.16?E+05   1.075E+02   7.503E+02
AUGU      99     7.40    4.965E+05   1.521E+02   9.772E+02
SEPT      99     8.50    5.?93E+05   9.753E+01   4.884E+02
OCTO      99     3.10    2.15?E+05   1.108E+02   7.1?6E+02
NOVE      99     6.60    4.741E+05   1.189E+02   5.932E+02
DECE      99     5.10    3.594E+05   1.147E+02   8.086E+02
YR TOT    99    56.40    3.90E+06    1.38E+03    8.62E+03
Table 6.12 Ranked Hourly Rainfall for
the Multifamily Urban Basin.

SUMMARY OF THE 50 HIGHEST RAINFALL INTENSITIES FOR THIS SIMULATION.

RANK   DATE       HOUR   RAIN
  1     5/10/75    15    3.90
  2    11/24/77     2    2.60
  3    10/11/76     3    2.10
  4    11/18/74     4    2.10
  5     6/ 3/75    13    1.80
  6     7/23/77    16    1.70
  7    10/21/73     2    1.50
  8     3/ 7/76    19    1.50
  9    12/13/76    22    1.50
 10    11/24/77     1    1.30
 11    12/16/77    13    1.40
 12     5/29/77    16    1.40
 13     6/ 9/77    15    1.40
 14     9/22/77     8    1.30
 15     7/ 3/74     6    1.30
 16     2/28/76     9    1.20
 17     8/19/74    17    1.20
 18     8/24/77     4    1.20
 19     6/ 6/74    18    1.20
 20     3/23/77    21    1.20
 21     7/12/74    15    1.10
 22    12/13/76    21    1.10
 23     1/13/77    13    1.10
 24    11/17/74    21    1.10
 25     3/13/74    22    1.00
 26    10/11/76     2    1.00
 27     9/25/74    23    1.00
 28    10/23/77     8    1.00
 29    11/18/74     3    1.00
 30     6/23/76    23    0.90
 31     3/29/75    14    0.90
 32    12/ 9/77    19    0.80
 33     7/ 6/74    14    0.80
 34     7/ 1/77     8    0.80
 35     9/12/76     6    0.80
 36    10/22/73    16    0.80
 37     3/23/76    16    0.80
 38    11/20/73     3    0.80
 39     5/29/76    18    0.80
 40     1/14/76    14    0.80
 41     7/ 1/74    19    0.80
 42    10/13/73    18    0.70
 43     9/13/76    12    0.70
 44     3/14/74    21    0.70
 45     1/15/77    14    0.70
 46     9/30/74    19    0.70
 47     9/ 1/77    22    0.70
 48     9/ 1/77    23    0.70
 49     6/16/94    10    0.70
 50     6/ 9/77    14    0.70

THERE WERE 764 HOURS WITH PRECIPITATION FOR THIS SIMULATION.

CARD GROUP F1
EVAPORATION RATE (IN/DAY).
JAN.  FEB.  MAR.  APR.  MAY   JUN.  JUL.  AUG.  SEP.  OCT.  NOV.  DEC.
0.12  0.15  0.20  0.25  0.25  0.24  0.23  0.23  0.1?  0.18  0.14  0.13

* * * NO GUTTER OR PIPE NETWORK * * *
Table 6.13 Ranked Runoff and Pollutant Concentration
for the Multifamily Basin.

RANKED OUTPUT FOR CONTINUOUS SWMM
TOTAL TIME STEPS (E.G. HOURS) OF SIMULATION = 34744
TIME STEPS WITH NONZERO RUNOFF = 1218
TIME STEPS WITH NONZERO PRECIPITATION = 762
HIGHEST 50 (HOURLY) RUNOFF AND COD LOADS, SUM OF ALL 1 INLETS.

1. HIGHEST BY RUNOFF VALUES.

RANK   DATE       HOUR   RUNOFF    RUNOFF      COD          PRECIP.
       MO/DY/YR          (CFS)     (IN./HR)    CONC*FLOW    (IN./HR)
  1     5/10/75    15    70.36     3.420        32.33       3.70
  2    11/24/77     2    55.36     2.691        25.44       2.60
  3    11/18/74     4    43.76     2.127        20.11       2.10
  4    10/11/76     3    41.95     2.039        17.43       2.10
  5    12/13/76    22    33.05     1.607        15.58       1.50
  6     6/ 3/75    15    30.86     1.500        17.45       1.80
  7     6/ 7/77    15    29.14     1.416        14.24       1.40
  8     7/23/77    16    29.02     1.411        16.53       1.70
  9    10/21/75     2    26.76     1.301        17.33       1.50
 10    12/16/77    15    25.93     1.260        22.57       1.40
 11    11/24/77     1    25.41     1.235        55.38       1.50
 12     3/ 7/76    19    25.37     1.233        31.12       1.50
 13     5/29/77    16    23.49     1.142        22.72       1.40
 14     7/22/77     8    22.45     1.071        18.32       1.30
 15     2/28/76     7    22.23     1.081        12.16       1.20
 16     5/25/77    21    22.13     1.076        53.36       1.20
 17     7/ 3/74     6    21.69     1.054        20.78       1.30
 18     5/10/75    16    21.51     1.045         9.88       0.60
 19     8/24/77     4    21.22     1.031        18.29       1.20
 20     3/13/74    22    20.05     0.975        87.87       1.00
 21     8/17/74    17    19.86     0.765        30.32       1.20
 22     6/ 6/74    18    19.86     0.765        32.87       1.20
 23     1/15/77    13    19.17     0.932        20.28       1.10
 24    10/11/76     2    18.57     0.903        81.47       1.00
 25    10/23/77     8    18.14     0.882        14.64       1.00
 26    12/13/76    21    18.14     0.882        57.37       1.10
 27    11/17/74    21    18.12     0.881        10.19       1.10
 28     7/12/74    15    18.06     0.878        50.77       1.10
 29     7/25/74    23    17.64     0.857        18.85       1.00
 30     1/15/77    14    17.61     0.856         7.25       0.70
 31     6/23/76    23    16.76     0.815         8.52       0.90
 32     7/ 1/77    23    16.41     0.778         7.70       0.70
 33    11/18/74     3    16.32     0.794         7.47       1.00
 34    11/24/77     3    16.27     0.771         7.48       0.70
 35    12/ 9/77    19    15.78     0.767        30.26       0.80
 36     6/ 3/75    16    15.78     0.767         7.55       0.50
 37     1/14/76    15    15.19     0.738        15.92       0.60
 38     5/27/75    14    14.45     0.702        44.24       0.90
 39    10/22/75    16    14.06     0.683        14.31       0.80
 40     5/14/74    21    13.86     0.674        57.20       0.70
 41    11/15/74     6    13.56     0.659        36.52       0.60
 42    10/23/77     7    13.52     0.657        37.05       0.60
 43     9/26/74    24    13.27     0.645         7.02       0.50
 44     1/14/76    14    12.76     0.620       140.55       0.80
 45    11/20/75     5    12.75     0.620       102.27       0.80
 46     9/12/76     6    12.71     0.618        34.20       0.80
 47     7/ 1/77     8    12.68     0.617        66.97       0.80
 48     7/ 1/74    17    12.68     0.617       107.8?       0.80
 49     7/ 6/74    14    12.68     0.616        16.57       0.80
 50     5/23/76    16    12.67     0.616        14.77       0.80

* * * RUNOFF SIMULATION ENDED NORMALLY * * *
2. HIGHEST BY COD VALUES.

RANK   DATE       HOUR   RUNOFF    RUNOFF      COD          PRECIP.
       MO/DY/YR          (CFS)     (IN./HR)    CONC*FLOW    (IN./HR)
  1     3/13/74    21     5.78     0.281       562.20       0.40
  2     4/15/74     6     4.11     0.200       324.06       0.30
  3     8/18/76    12    10.92     0.531       315.29       0.70
  4     5/ 5/74    18     7.43     0.361       270.45       0.50
  5    10/15/75    18    10.95     0.532       247.61       0.70
  6    10/ 4/75     1     5.41     0.263       233.56       0.30
  7     2/10/75    15     2.59     0.126       232.99       0.20
  8     8/20/75    22     7.44     0.362       227.80       0.50
  9    10/23/77     6     7.47     0.363       219.71       0.50
 10    10/ 3/75    17     2.57     0.125       218.26       0.20
 11    11/15/74     5     7.50     0.365       216.85       0.50
 12     4/ 6/76    10     8.11     0.374       215.17       0.50
 13     4/21/75    21     2.53     0.123       205.09       0.20
 14     5/ 1/76    10     4.11     0.200       185.85       0.30
 15     7/24/74    20     7.47     0.363       172.52       0.50
 16    12/ 5/75    20     5.83     0.283       171.31       0.40
 17     5/28/74    20     7.43     0.361       168.56       0.50
 18     2/10/75    16     3.05     0.148       165.90       0.10
 19     4/13/77    23     2.53     0.123       165.37       0.20
 20     5/13/75    23     5.33     0.257       161.67       0.30
 21     7/ 7/75     9     7.44     0.362       161.60       0.50
 22     5/ 8/75    11     2.53     0.123       157.11       0.20
 23     7/18/77     8     7.44     0.362       153.53       0.50
 24     4/16/74    22     4.11     0.200       149.22       0.30
 25     1/14/76    14    12.76     0.620       140.55       0.80
 26    11/ 5/77    10     9.23     0.447       134.14       0.60
 27     7/31/75    12     7.44     0.362       133.42       0.50
 28     2/11/75     1     4.65     0.226       130.04       0.20
 29     8/18/74    12     5.76     0.280       126.11       0.40
 30     5/ 1/76    13     7.43     0.361       120.52       0.50
 31     4/14/77    24     2.96     0.144       118.19       0.10
 32     6/21/77    17     7.44     0.362       117.45       0.50
 33     2/28/77     3     4.17     0.203       116.99       0.30
 34    11/ 2/74     4     2.60     0.126       116.38       0.20
 35     4/14/77     1     4.31     0.210       113.89       0.30
 36    10/18/76    16     7.48     0.363       111.38       0.50
 37     7/ 1/74    19    12.68     0.617       107.89       0.80
 38     4/ 6/76     9     2.15     0.105       106.97       0.10
 39    11/20/75     5    12.75     0.620       102.27       0.80
 40     5/14/74    20     4.11     0.200       102.07       0.30
 41     2/25/76     9     4.17     0.203       100.26       0.30
 42    11/17/76     6     7.97     0.388        79.16       0.40
 43     7/23/77    14     9.17     0.446        98.80       0.60
 44     6/ 5/74    16     5.75     0.280        98.67       0.40
 45     3/ 8/75    19     3.77     0.183        96.45       0.20
 46     2/11/75    24     2.59     0.126        94.73       0.20
 47     5/ 4/77    13     4.56     0.222        93.28       0.20
 48     2/ 1/76    11     2.57     0.126        72.87       0.20
 49     2/ 4/75    16     1.10     0.054        70.59       0.10
 50     7/19/77     7     4.13     0.202        90.52       0.30
[STATS line-printer plot: TOTAL Q (INCHES), 0.0 to 10.0, vs. PERCENT OF OCCURRENCES LESS THAN/EQUAL TO GIVEN MAGNITUDE, 0.0 to 100.0]
FREQUENCY: TOTAL FLOW   LOC NO. 99

MOMENTS
CONSTITUENT ANALYZED: FLOW
EVENT PARAMETER ANALYZED: TOTAL FLOW
MEAN                  0.542
VARIANCE              0.679
STD. DEVIATION        0.824
COEFF OF VARIATION    1.519
COEFF OF SKEWNESS     3.077

Figure 6.5 Frequency Plot and Statistical Parameters of
Generated Flows at the Multifamily Basin
[STATS line-printer plot: TOT LOAD (POUNDS), 0.0 to 150.0, vs. PERCENT OF OCCURRENCES LESS THAN/EQUAL TO GIVEN MAGNITUDE, 0.0 to 100.0]
FREQUENCY: TOTAL LOAD COD   LOC NO. 99

MOMENTS
CONSTITUENT ANALYZED: COD
EVENT PARAMETER ANALYZED: TOTAL LOAD
MEAN (POUNDS)         17.0
VARIANCE              477.
STD. DEVIATION        21.9
COEFF OF VARIATION    1.226
COEFF OF SKEWNESS     2.037

Figure 6.6 Frequency Plot and Statistical Parameters of
Generated COD Loads at the Multifamily Basin
loads, respectively. Also included in these figures are the statistical
parameters of the generated events. A distribution-free statistical
analysis similar to the one performed by SYNOP (Section 6.3.2) may be
performed by STATS for all the event characteristics listed as options
in Table 6.14. For each of these event characteristics a table of
empirical frequency and return period, such as Table 6.15, can be
generated (optionally) by STATS. Based on such a table, a parametric
frequency analysis following the procedure developed in Chapters 3 and 4
may be performed for a better description of the generated series. This
is also true for the series of the 50 highest hourly rainfall, runoff,
and pollutant concentrations. Examples of such analyses will be given
in the next section.
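A table of empirical frequency and return period of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the Weibull plotting position m/(n + 1) for the exceedance frequency and converts it to a return period through the mean spacing between events; the plotting position actually used by STATS is not given in the text, and the function and field names are hypothetical.

```python
def frequency_table(magnitudes, record_length):
    """Rank events and attach empirical frequency and return period.

    magnitudes    -- one value per event (e.g. total flow in inches)
    record_length -- length of the simulated record, in the time unit
                     wanted for the return period (e.g. months)
    Assumes the Weibull plotting position m / (n + 1).
    """
    n = len(magnitudes)
    if n == 0:
        raise ValueError("at least one event is required")
    mean_spacing = record_length / n          # average time between events
    table = []
    for rank, value in enumerate(sorted(magnitudes, reverse=True), start=1):
        exceedance = rank / (n + 1)           # fraction of events >= value
        table.append({
            "magnitude": value,
            "rank": rank,
            "pct_less_or_equal": 100.0 * (1.0 - exceedance),
            "return_period": mean_spacing / exceedance,
        })
    return table


# Hypothetical event totals (inches) over a 3-year record.
for row in frequency_table([3.0, 1.0, 2.0], record_length=3.0):
    print(row)
```

The `pct_less_or_equal` column corresponds to the horizontal axis of the frequency plots, and the ranked magnitudes are the starting point for fitting a parametric distribution.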
Generated hourly runoff series may be analyzed by the program
SYNOP, after their transformation to the NWS hourly rainfall tape
format. Goforth (1981) developed a small program to do such a
transformation. Tables 6.16 and 6.17 give monthly and yearly summary
statistics of the runoff generated from the same rainfall series for
which statistics were summarized in Tables 6.6 and 6.7. A comparison of
these tables reveals that the generated runoff follows the same
seasonal trend as the rainfall, and that the number of runoff events is
smaller than the number of rainfall events for the same month or year.
This should be expected since not all rainfall events generate runoff,
and consecutive rainfall events may be close enough to each other to
produce a common runoff event. This is illustrated by a longer average
duration of the runoff events (Tables 6.16 and 6.17) compared to the
average duration of corresponding rainfall events (Tables 6.6 and 6.7).
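The merging of rainfall events into a common runoff event can be illustrated with a simple event-separation routine. The sketch below is not the SYNOP or STATS code; it assumes that two wet periods belong to different events only when they are separated by at least a minimum interevent time of dry hours, and the series used are hypothetical.

```python
def separate_events(series, min_interevent):
    """Split an hourly series into events.

    A new event starts when a nonzero value follows at least
    `min_interevent` consecutive zero hours (assumed definition).
    Returns one dict per event: start index, duration (hr), volume.
    """
    events = []
    start = last_wet = None
    volume = 0.0
    for i, x in enumerate(series):
        if x <= 0.0:
            continue
        dry_gap = i - last_wet - 1 if last_wet is not None else 0
        if start is not None and dry_gap >= min_interevent:
            # Gap is long enough: close the current event.
            events.append({"start": start,
                           "duration": last_wet - start + 1,
                           "volume": volume})
            start = None
        if start is None:
            start, volume = i, 0.0
        volume += x
        last_wet = i
    if start is not None:
        events.append({"start": start,
                       "duration": last_wet - start + 1,
                       "volume": volume})
    return events


# Hypothetical hourly rainfall (in) and the lagged runoff it might produce.
rain = [0.2, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]
runoff = [0.10, 0.05, 0.02, 0.0, 0.05, 0.02, 0.0, 0.0, 0.0, 0.0]

rain_events = separate_events(rain, min_interevent=3)
runoff_events = separate_events(runoff, min_interevent=3)
print(len(rain_events), "rainfall events;", len(runoff_events), "runoff event(s)")
```

With a 3-hour interevent time the two rainfall bursts count as separate events, while the runoff recession bridges the gap and yields a single, longer runoff event.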
Table 6.14 Options Selection for STATS Analysis.

THE PERIOD OF TIME FOR WHICH THE STATISTICAL ANALYSIS IS BEING PERFORMED IS:
STARTING DATE: 740101   STARTING TIME: 0.0 HOURS
ENDING DATE:   771231   ENDING TIME:   0.0 HOURS
THE MINIMUM INTEREVENT TIME HAS BEEN DEFINED AS 10 TIME STEPS
THE LOCATION NUMBER REQUESTED FOR STATISTICAL ANALYSIS IS
ENGLISH UNITS ARE USED IN INPUT/OUTPUT
THE NUMBER OF POLLUTANTS REQUESTED FOR STATISTICAL ANALYSIS IS 1
THE BASE FLOW TO SEPARATE EVENTS IS 0.0
THE POLLUTANTS REQUESTED FOR THIS RUN, IDENTIFIED BY NUMBER, ARE AS FOLLOWS: 1

THE STATISTICAL OPTIONS REQUESTED FOR FLOW RATE ARE INDICATED BY A '1'

                                       TOTAL   AVERAGE   PEAK   EVENT      INTEREVENT
                                       FLOW    FLOW      FLOW   DURATION   DURATION
TABLE OF RETURN PERIOD AND FREQUENCY     1       1        1        1          1
GRAPH OF RETURN PERIOD                   1       1        1        1          1
GRAPH OF FREQUENCY                       1       1        1        1          1
MOMENTS                                  1       1        1        1          1

THE STATISTICAL OPTIONS REQUESTED FOR POLLUTANT NUMBER 1 ARE INDICATED BY A '1'

                                       TOTAL   AVERAGE   PEAK   FLOW WEIGHTED   PEAK
                                       LOAD    LOAD      LOAD   AVERAGE CONC    CONC
TABLE OF RETURN PERIOD AND FREQUENCY     1       1        1          1            1
GRAPH OF RETURN PERIOD                   1       1        1          1            1
GRAPH OF FREQUENCY                       1       1        1          1            1
MOMENTS                                  1       1        1          1            1

PROGRAM EXECUTION CONTINUING. DATA WILL BE READ FROM THE INTERFACE FILE AND SEPARATED INTO EVENTS.
******* END OF INTERFACE FILE REACHED
******* LAST DATE AND TIME READ ARE 771227 0.0 HOURS
******* PROGRAM CONTINUING WITH ANALYSIS OF EVENTS
THE NUMBER OF MONTHS WITHIN THE PERIOD OF ANALYSIS, ROUNDED TO THE NEAREST MONTH, IS 40
THE NUMBER OF EVENTS WITHIN THE PERIOD OF ANALYSIS IS 311
[Table 6.15: STATS table of empirical frequency and return period; rotated printout, contents not recoverable.]
Table 6.16 Monthly Event Statistics of Generated
Runoff at the Multifamily Basin.

MIAMI WSO AP RUNOFF STATION # 085663, 196463, 3 HRS INTEREVENT TIME
RAINFALL STATISTICS BY MONTH (FOR PERIOD OF RECORD)

MONTH  STATISTIC  NUMBER  TOTAL         MINIMUM       MAXIMUM       AVERAGE       STD DEV       VARIANCE      COEF-VAR
  1    DURATION   134.    0.696000E+03  0.100000E+01  0.370000E+02  0.319403E+01  0.383343E+01  0.340523E+02  0.112349E+01
       INTENSITY  134.    0.464005E+01  0.355600E-02  0.176667E+00  0.346273E-01  0.335893E-01  0.112824E-02  0.970024E+00
       VOLUME     134.    0.266696E+02  0.100000E-01  0.180000E+01  0.199027E+00  0.314090E+00  0.986523E-01  0.137812E+01
       DELTA      133.    0.1664?5E+05  0.800000E+01  0.901000E+03  0.140214E+03  0.161740E+03  0.261599E+05  0.115352E+01
  2    DURATION   121.    0.508000E+03  0.100000E+01  0.240000E+02  0.419833E+01  0.353379E+01  0.126437E+02  0.846931E+00
       INTENSITY  121.    0.529963E+01  0.66700E-02   0.213001E+00  0.437988E-01  0.418318E-01  0.174990E-02  0.955091E+00
       VOLUME     121.    0.256795E+02  0.100000E-01  0.380000E+01  0.212227E+00  0.402422E+00  0.161944E+00  0.189618E+01
       DELTA      121.    0.158135E+05  0.850000E+01  0.541000E+03  0.130690E+03  0.114201E+03  0.130418E+05  0.873828E+00
  3    DURATION   116.    0.313000E+03  0.100000E+01  0.310000E+02  0.442241E+01  0.447632E+01  0.200374E+02  0.101219E+01
       INTENSITY  116.    0.477725E+01  0.625100E-02  0.236668E+00  0.411832E-01  0.449820E-01  0.202338E-02  0.109224E+01
       VOLUME     116.    0.223298E+02  0.100000E-01  0.18000E+01   0.192498E+00  0.297742E+00  0.886305E-01  0.134673E+01
       DELTA      116.    0.174120E+05  0.930000E+01  0.793500E+03  0.130103E+03  0.142493E+03  0.203041E+03  0.949296E+00
  4    DURATION    99.    0.467000E+03  0.100000E+01  0.400000E+02  0.471717E+01  0.320996E+01  0.271437E+02  0.110447E+01
       INTENSITY   99.    0.377303E+01  0.600100E-02  0.640002E+00  0.5?3135E-01  0.101235E+00  0.102525E-01  0.173639E+01
       VOLUME      99.    0.253696E+02  0.100000E-01  0.393000E+01  0.256239E+00  0.488020E+00  0.238164E+00  0.190440E+01
       DELTA       97.    0.131100E+03  0.100000E+02  0.138600E+04  0.132626E+03  0.183014E+03  0.334942E+03  0.119910E+01
  5    DURATION   211.    0.943000E+03  0.100000E+01  0.320000E+02  0.446919E+01  0.494368E+01  0.244398E+02  0.110662E+01
       INTENSITY  211.    0.176815E+02  0.428600E-02  0.793333E+00  0.837988E-01  0.130340E+00  0.169886E-01  0.133340E+01
       VOLUME     211.    0.100959E+03  0.100000E-01  0.141800E+02  0.478478E+00  0.141662E+01  0.200682E+01  0.296069E+01
       DELTA      211.    0.223480E+03  0.730000E+01  0.118000E+04  0.103913E+03  0.1711?0E+03  0.292950E+03  0.161600E+01
  6    DURATION   353.    0.176100E+04  0.100000E+01  0.390000E+02  0.498867E+01  0.319723E+01  0.270112E+02  0.104181E+01
       INTENSITY  353.    0.268700E+02  0.371300E-02  0.980002E+00  0.817847E-01  0.118730E+00  0.140967E-01  0.143173E+01
       VOLUME     333.    0.14643?E+03  0.100000E-01  0.383000E+01  0.414837E+00  0.740846E+00  0.348832E+00  0.178387E+01
       DELTA      353.    0.175445E+03  0.800000E+01  0.425000E+03  0.497011E+02  0.523072E+02  0.273605E+04  0.105243E+01
  7    DURATION   334.    0.127300E+04  0.100000E+01  0.240000E+02  0.360169E+01  0.304956E+01  0.929983E+01  0.846702E+00
       INTENSITY  334.    0.248077E+02  0.332400E-02  0.880002E+00  0.700762E-01  0.113338E+00  0.133028E-01  0.164584E+01
       VOLUME     354.    0.830178E+02  0.100000E-01  0.280000E+01  0.234513E+00  0.376811E+00  0.141987E+00  0.160678E+01
       DELTA      354.    0.174530E+05  0.800000E+01  0.356300E+03  0.493022E+02  0.497404E+02  0.247411E+04  0.100889E+01
  8    DURATION   341.    0.132100E+04  0.100000E+01  0.320000E+02  0.387390E+01  0.36394?E+01  0.132438E+02  0.939488E+00
       INTENSITY  341.    0.236573E+02  0.444500E-02  0.365002E+00  0.693762E-01  0.995830E-01  0.991678E-02  0.143540E+01
       VOLUME     341.    0.877277E+02  0.100000E-01  0.605000E+01  0.257266E+00  0.480618E+00  0.230994E+00  0.1?6818E+01
       DELTA      341.    0.180693E+03  0.730000E+01  0.363300E+03  0.329897E+02  0.337923E+02  0.311280E+04  0.103289E+01
  9    DURATION   407.    0.185900E+04  0.100000E+01  0.360000E+02  0.436757E+01  0.305484E+01  0.255515E+02  0.110668E+01
       INTENSITY  407.    0.221706E+02  0.373100E-02  0.900002E+00  0.544732E-01  0.859633E-01  0.739004E-02  0.137812E+01
       VOLUME     407.    0.109397E+03  0.100000E-01  0.613000E+01  0.2692?1E+00  0.376770E+00  0.332664E+00  0.214189E+01
       DELTA      407.    0.175670E+03  0.750000E+01  0.290000E+03  0.431622E+02  0.432115E+02  0.186724E+04  0.100114E+01
 10    DURATION   345.    0.160800E+04  0.100000E+01  0.460000E+02  0.466087E+01  0.303226E+01  0.253236E+02  0.107968E+01
       INTENSITY  343.    0.199409E+02  0.333600E-02  0.743002E+00  0.377998E-01  0.866490E-01  0.730803E-02  0.149912E+01
       VOLUME     345.    0.987478E+02  0.100000E-01  0.439000E+01  0.286805E+00  0.52028?E+00  0.270701E+00  0.181409E+01
       DELTA      343.    0.170335E+03  0.800000E+01  0.270000E+03  0.493725E+02  0.489290E+02  0.239405E+04  0.9?1019E+00
 11    DURATION   172.    0.841000E+03  0.100000E+01  0.340000E+02  0.430021E+01  0.492416E+01  0.242473E+02  0.112418E+01
       INTENSITY  192.    0.957725E+01  0.371500E-02  0.323002E+00  0.498815E-01  0.783320E-01  0.616727E-02  0.157437E+01
       VOLUME     192.    0.532587E+02  0.100000E-01  0.635000E+01  0.277389E+00  0.683763E+00  0.470271E+00  0.247221E+01
       DELTA      192.    0.160310E+03  0.800000E+01  0.648300E+03  0.833990E+02  0.932861E+02  0.870230E+04  0.111388E+01
 12    DURATION   113.    0.513000E+03  0.100000E+01  0.310000E+02  0.447826E+01  0.482200E+01  0.232317E+02  0.107676E+01
       INTENSITY  113.    0.530993E+01  0.500100E-02  0.436250E+00  0.461733E-01  0.729273E-01  0.531839E-02  0.157943E+01
       VOLUME     113.    0.263297E+02  0.100000E-01  0.349000E+01  0.230693E+00  0.483766E+00  0.233969E+00  0.210368E+01
       DELTA      113.    0.166685E+05  0.900000E+01  0.925500E+03  0.144943E+03  0.139680E+03  0.254977E+05  0.110167E+01
Table 6.17 Annual Event Statistics of Generated
Runoff at the Multifamily Basin.
MIAMI WSO AP RUNOFF STATION # 003643 ,1T367?, 3 HRS INTEREVENT TIME
RAINFALL STATISTICS BY YEAR (FOR PERIOD OF RECORD)
NUMBER
TDTAL
!>
DURATION
7?.
0. 307000E+03
INTENSITY
7?.
0. 30371 0E
VOLUME
77.
0. 30377BE<02
DELTA
78.
0. 771 400E <04
DURATION
121.
0. 346000E<03
INTENSITY
121.
0. 712201 E <01
VOLUME
121.
0. 422674E+02
DELTA
121.
0. 706700E<04
DURATION
117.
0. 573000E<03
INTENSITY
117.
0. 824337E <0 1
VOLUME
117.
0. 406773E <02
DELTA
117.
0. 0723OOE<04
DURATION
137.
0. 630000E<03
INTENSITY
137.
0. 124 731E <02
VOLUME
137.
0. 3442?3E<02
DELTA
137.
0.B77300E<04
DURATION
117.
0. 371OOOE<03
INTENSITY
11 7.
0. 661170E<01
VOLUME
117.
0. 41 33rT6E <02
DELTA
117.
0.B77200E<04
DURATION
72.
0. 360000E+03
INTENSITY
72.
0. 476632E<01
VOLUME
72.
0. 227176E<02
DELTA
72.
0. 85B200E<04
DURATION
102.
0. 3?7000E<03
INTENSITY
102.
0. 314217E<01
VOLUME
102.
0. 23627BE<02
DELTA
102.
0. 8B3900E+04
DURATION
101.
0. 4B2000Â£<03
INTENSITY
101.
0. 336472E<01
VOLUME
101.
0. 234770E<02
DELTA
101.
0. 870430E+04
DURATION
1 IB.
0. 430000E<03
INTENSITY
1 IB.
0. 7?21B7E<01
VOLUME
1 1 B.
0. 390976E<02
DELTA
11 B.
0. 973230E<04
DURATION
103.
0. 433000E<03
INTENSITY
103.
0 717364E01
VOLUME
103.
0. 337377E <02
DELTA
103.
0. B61600+04
DURATION
131.
0. 647000E<03
INTENSITY
131.
0. 101 46 IE <02
VOLUME
131.
0 4B7972E<02
DELTA
131.
0. B60BOOE
DURATION
111.
0. 316000Â£<03
INTENSITY
1 1 1.
0. 770437E<01
VOLUME
I 11.
0. 3B7095E+02
DELTA
1 1 1.
0.B77230E*04
DURATION
127.
0. 733000E <03
INTENSITY
127.
0. 061777E <01
VOLUME
127.
0. 40 7 173E<02
DELTA
137.
0. B074OOE
DURATION
136.
0. 611000E+03
INTENSITY
136.
0. B3036OE <01
VOLUME
136.
0. 377374E403
d;:lta
136.
3. fJri(13CE<04
DURATION
74.
0. 444000E<03
INTENSITY
74.
0. 302263E<01
VOLUME
74.
0. 240796E<02
DELTA
94._
0. 80343OE <04
DURATION
106.
0. 447000E<03
INTENSITY
106.
0. 390537E<0 1
VOLUME
106.
0. 233996E <02
DELTA
106.
0. B6603OE+O4
DURATION
13?.
0. 613000E<03
INTENSITY
139.
0. 778703E<0 1
VOLUME
137.
0. 338372Â£<02
DELTA
139.
0. 903250E<04
DURATION
131.
0. 333000E <03
INTENSITY
131.
0. 627333E + 0 1
VOLUME
131.
0. 270995E <02
DELTA
131.
0. B63130E+04
DURATION
12?.
0. 491000E<03
INTENSITY
129.
0. 7376B?E<0 1
VOLUME
129.
0. 2?2973E<02
DELTA
13?.
0. B70600E<04
DURATION
119.
0. 370000Â£<03
INTENSITY
11?.
0. 72406DE<0 1
VOLUME
11?.
0. 207498E<02
DELTA
11?.
0. B70300E<04
DURATION
113.
0. 31 4000E<03
INTENSITY
1 13.
0. 774062E
VOLUME
113.
0. 278696E<02
DELTA
113.
0. 0651OOE
DURATION
101.
0. 3B3000E+03
INTENSITY
101.
0. 63701 2E<01
VOLUME
101.
0. 394 1 ?3E<02
DELTA
101.
0. B97330E<04
DURATION
133.
0. 3370O0E <03
INTENSITY
133.
0. 360030E <01
VOLUME
133.
0. 217277E <02
DELTA
133.
0. B68530E<04
DURATION
109.
0. 437000E <03
INTENSITY
107.
0. 3B4?47E<0 1
VOLUME
10?.
0. 361 37JE <02
DELTA
107.
0. B63600E<04
O.200000E+01
O. 1OOOOOE01
O.200000E01
O. 0OOOOOE+O1
O. 1 OOOOOE *01
O. 623100E02
O. lOOOOOEOI
O. 730000E<01
O. 100000E+01
O. 623100E02
0. lOOOOOEOI
O. 0OOOOOE+O1
O.100000E+0I
O. 933400E02
o. ioooooeoi
O. 800000E+QI
O. 100000E *01
O. 444300E02
O. 100000E01
O. B00000E+01
0. 1 OOOOOE <01
O. 730100E02
O. 100000E01
0. B30000E<01
0. 100000E+01
O. 300100E02
0. 1OOOOOE01
O. BOOOOOE+Ol
0.100000E+01
O. 373100E02
O.100000E01
O.700000E+01
O. 100000E+01
O. 6231OOE02
O. 1OOOOOEO1
O. 0OOOOOE+O1
0. 200000E<01
O. 100010E01
O.200000E01
0. B00000E<01
O. 100000E+01
O. 33D600E02
0. 1 OOOOOEO 1
O. B30000E<01
0.100000E<01
O. 666700E02
O. 100000E01
0.BOOOOOE+Ol
O. 100000E+01
0. 42B600E02
0. 1 OOOOOEO 1
0. 800000E<01
0. 1 00000E<01
O.600100E02
0. 1 OOOOOEO 1
1. 800000E
O. IOOOOOE
0. 600100E02
O.I00000E01
o. 70ooooe
O. 100000F<01
O. 333600E02
0.1OOOOOEO1
O.BOOOOOE+Ol
O. 100000E+01
O. 333400E02
O. 1 OOOOOEO 1
O. 730000E+01
0. 1OOOOOE+Ol
0. 371 300E02
0.1OOOOOEO1
O.800000E+01
O.100000E+01
O. 333600E02
0. 100000E01
0.OOOOOOE<01
0. 1 OOOOOE+Ol
0. 300100E02
0.1OOOOOE01
O. 700000E<01
O. 1OOOOOE>01
0. 371 300E02
0. 1 OOOOOEO 1
O. 730000E *01
O. 1OOOOOE+O1
O. 714400E02
O. 1 OOOOOEO 1
O. BOOOOOE+Ol
O. 1 OOOOOE <01
O. 642900E02
O. 1 OOOOOEO 1
O. 000000E+01
0. 1 OOOOOE+O1
O. 300100E02
0. 1 OOOOOEO 1
O.000000E<01
0. 1 OOOOOE *02
0. 313333E<0O
O. 266000E <01
O. 1 39600E+O4
O. 270000E+02
O. 61 3000E+00
0. 61 3000E<01
O. B66000E+03
O. 370000E<02
O. 430003E+00
O. 7B600GE<0I
O. 363000E+03
0. 340Q00E<02
O. 7B0002E+00
O. 63500E *01
O. 5B0000E+03
0. 400000E<02
0. 66300?Â£<00
O.61J000E+01
O. 403000E+03
O. 260000E+02
O. 463003E+00
O. 297000E+01
O. 307OOOE+O3
O. 200000E *02
O. 4S666BE+00
O.273000E+01
O. 374OO0E+O3
O. 310000E<02
O. 495003E*00
O. 203000E<01
O.377300E+03
O. 270000E+02
O. 730002E+00
O. 605000E<01
O. 34I000E+03
O. 330000E<02
O. 32D002E<00
O.43900CE+01
0. 337000E<03
0. 340000E<02
O. 700002E<00
O. 5B00O0E+01
O. 473000E<03
O. 2B0000E<02
O. 080002E<00
O. 3??000E<01
0. B26000E<03
O. 460000E*02
O. 600001E+00
0. 41 3000E<01
O. 7B0300E<03
O. 220000E <02
O. 322301E*00
O 336000E <01
O. 31 600Â£ 33
O. 10OOOOE
O. 370002E <00
0. 746000E<01
0. 100OOOE *04
O. 2IOOOOF*f)2
0. 70000 1E+00
O. 2D0000E<01
0. 701000E+03
O. 210000E+02
0. 3737 1 5E+00
O. 263000E +01
O.379000E *03
O. 230000E + 02
0. 71 3335E+00
0. 21 4000E+01
O. 4 73300E+03
0. 300000E<02
O. 630002E<00
O. 344000E+01
0. 773300E<03
0. 180000E+02
O. 6B3002E+00
0. 1 37000E+01
O. 442000E+03
O. 370000E+02
O. 743002E<00
0. 21 4000E *01
O. 489000E<03
O. 320000E<02
O. 793333E + 00
O. 732000E+01
0. 773000E+03
0. 23000OE<02
O. 303OO2E<00
0. 137000E+01
0. 437300E<03
O.330000E+02
0. 5/730 I E+OO
0. 1 A 1 BOOE <02
O. 1 1 BOOOE <04
0. 38B60BE<0I
O 637B70E0I
O. 2D6706E<0O
O. 101 462E<03
O. 431240E+01
O. 7330B3EO1
0. 34?334E<00
O. 747337E<02
0. 471 433E<01
O. 704732E0 I
O. 3476B6E*00
O. 745726E<02
0. 41 4013E<01
O. 774462E01
O. 346693E<00
0. 560064E<02
O. 400O34E+OI
O.363103E0I
0.33332?Â£<00
O. 747743E+02
0. 371304E<0I
O.337B37E01
0. 246933E+00
O. 732826E+02
0. 371 176E+01
O. 304 133E01
O. 231664E+00
O. 86B431E+02
0. 47722BE<01
O.330962E01
O. 232473E+00
O. B81634E+02
O. 3BI336E<0l
O. 671343E0
O. 322773E + 00
O. 740042E+02
0. 4 1 4286E+01
O. 6B3203E0I
0. 321 32i E+00
O. B2037 1 E<02
0. 473420E+0
O. 77431 IE01
0. 372313E<00
O. 637079E+02
0. 4 64 963E+01
O 694103E01
O. 340734Â£<00
O. 7721 17E+02
0. 372713E<01
O. 67B3B2E01
0. 30361 7E<00
0. 67B740E+02
O 447263E<01
0. 616441 E01
O. 292349E<00
0 641066E+02
O. 472340E <0 I
O. 334323E01
O. 236 1 66E <00
O. 937B10E <02
O. 421 67BE<0 1
O 5571 1 IE01
0.239619E+00
0. 01 7783E+02
0. 441007E<01
O. 360273E01
0.257980E+00
O. 649B20E+02
O.406B70E+0I
O. 4D0373E01
O. 206967E+00
O. 65BB93E+02
O. 37206BE+O 1
O. 307336EO1
O. 227 1 2BE<00
O.6703B9E+02
0. 327731E+01
0. 600430E01
O. 1 74368E+00
O. 731 344E<02
0. 454067E+OI
O. 683719E01
O.264333Â£<00
0. 7A3573E<02
0.37720BE<01
O.630704E01
O. 390272E<00
O. 07O445E+O2
O. 403737E+0I
O 4210B2E0I
O. 1648B5Â£<00
O. 6D3045E+02
0. 421101E+01
O. 336649E0 1
0. 331 737E+00
0. 774 12BE+02
0. 302771E+01
O. 001 323E01
O. 396287E+00
O. 172616E+03
O 423268E+01
O. 1 1 3673E<00
0. 676106E<00
O. 111732E+03
O.334339E<0I
0. 93761 3E01
0. B 1B630E <00
O. 730452E<02
O. 44092BE<01
0. 1 31 31 BE <00
0. 7 71 237E <00
O 707272E<02
0. 6004 34E<01
O. B03193E01
0. 9303Q?E<00
0. 874361E<02
0. 452017E<01
0. 78303IE01
0. 4 73736E+00
O. 101722E+03
0. 33751 IE
O. 763901 E01
O. 462301E+OO
O. 1 12177E+03
O. 323427E<01
0. 717?2BE01
O. 394033E<0O
O. 1 13173E<03
O. 7177I6E+01
O. 642119E02
O. I37043E<00
O.2?7761E<03
O. 1B0933E+02
O. 1330O0EO1
O. 4B4364E + 00
O. 124B4IE<03
O.307313E<02
O.B92976E02
O. 670153E<00
O. 563178E+04
0. 17441 8E+02
O.172771E01
O. 594906E<00
O. 500234E<04
0. 360343E<02
0. 64 3126E02
0. 697743E+00
O. 764B37E+04
O. 204317E+02
0. 617529E02
O. 224426E + 00
0.103474E+03
O. 11326BE<02
O. 58331 3E02
O. 213701E+OO
O. 123BBlE<03
O. 273776E+02
O. 31 3420E02
O. 133263E+00
O. 13264BE<05
COEFVAR
O 779633E+0O
O. 123625E<01
O. 134374E<0I
O. 1 70127E<01
O. ?42443E<00
O. 133437E<01
0. 19?267E<0I
O. 1471O0E+O1
O. 112936E<01
0. 1 33329E <01
O. 23343IE<01
O.100634E<01
0. 106301E<01
O.163341E+01
O. 222161 E<0 I
0. 126204Â£ o1
0. 1 23033E <0l
O. 1421 33E <01
0.23D032E<01
0. 1 I6649E<01
0. 1 13313E<01
0. 1 43367Â£<0l
0.1?1833E<01
O. 10704 7E+01
0. 867724E<00
O. 131 323E+01
0. 1 77677E+01
O.12?1?3E<01
0.10?681E<01
0. 1 30304E<01
0.169477E<01
0. 130636E<01
0.
33129BE+01
0. 1 234 10E<02
0.
9211B1E<00
0.
110S35E<00
0. 122B43E01
0.
16D073E <01
0.
7 3345BE <00
0. 567697E<00
0.
23341BE<01
0.
?27B85E<02
0. B60971E<04
0.
1253B3E<01
0.
468314E<0I
0. 21 9303E<02
0.
I13090E<01
0.
722334E01
0. 93 1106E02
0.
133034E<01
0.
633344Â£<00
0. 427119E<00
O.
203266E <01
0.
?63071E<02
0. 731 400E<04
0.
117612E+01
0.
37034JE+01
0. 323317E+07
0.
113163E<01
0.
131?35E<00
0. 174122E01
0.
170373E <01
0.
7377BCE+00
0. D44320E+00
0.
170OD5E <01
0.
B07187E<02
O. 631331E<04
0.
122841E<01
0.
473901E<01
0.2264B1E+02
0.
102374E+01
0.
10BB35E<00
0. 1 1043OEO1
0.
i36779E <01
0.
62262GE<00
0. 397666E + 00
0.
17B339E+01
0.
120B?5E<03
0. 1 461 36E+03
0.
132623E+01
0.
70?236E<01
0.303043E<0?
0.
11?622E<01
0.
10073DE<00
0. 101473E01
0.
14044?E
0.
631330E<00
0. 373930E<00
0.
164623E<01
0.
109767E <03
O. 120472E<03
0.
137096E<01
0.
43B346E<01
O. 1721 4BE<02
0.
?736?BE<00
0.
B76774E01
0. 768733E02
0.
142232E<01
0.
309606E <00
0. 237679E + 00
0
174314E<01
0.
937 1 4PE* 02
r 7000 6E + 04
0
1303B7E4 01
o.
3903145<01
0. 152343E<02
o.
02634OE <00
0.
7207B3E01
0. 517D29E02
0.
I34977E<0
0.
4 273B0E <00
0. 1 82923E + 00
0
166?13E <01
0.
1640/0E<03
0. 26721 7E<03
0.
1743B1E<01
0
331576G <01
o i
o
B33763E <00
0.
B20328E01
0. 673267E02
0.
1472BJfc+01
0.
410919E+00
0. 1&B933E+00
0.
171499E<01
0.
13B3O0E+O3
0.171843E+03
0.
167370E+01
0.
403007E<01
0.164031E+02
0.
91B36?E<00
0.
719?19E01
0. 518284E02
0.
12B474E+01
0.
430336E+00
0.193206E<00
0.
16601BE<01
0.
747B67E+02
0.539304E<04
0.
113088E+01
0.
360B09E+01
0.130183E+02
0.
BB67?2E<00
0.
742863E0I
0. 531 846E02
0.
134379E<01
0.
349725E<00
0. 121609E400
0.
16B375E<01
0.
B39B00E+02
0. 7O33B3E+04
0.
127304E <01
0.
404B03E<01
O. 163B67E<02
0.
10B363E<01
0.
936332E01
0.714734E02
0.
167834E<01
0.
4 30660E + 00
0. 203 102E<00
0.
17B420E+0
0.
770133E<02
0.980364E+04
o,
143417E+01
0. 27?201E<01
0. 101 323E+00
0. 233771 E<00
O. B33333E<02
0. 311997E<01
O.126236E<00
O. 409337E+00
O. B23733E+02
O. 37 3033E+01
O.119272E+00
O. 1 1302?E
O. 122661E<03
0. 3?0300Â£<01
0. 321 321 E01
0. 22?455E<00
O. 706763E<02
O.493677E+01
0. 877162E01
0. 1 3031 ?E <01
O. 134470E<03
O. 777333E+01
O.103072E01
O.643793E01
O. 731773E+04
O. 262141E<02
O. 139336E01
0. 1 66733E<00
O. 681933E<04
0. 33O663E<02
O. 142239E01
O. 129370E<01
O. 1 504 36E<03
O.132334E<02
O.271704EO?
O. 526477E01
0. 49931 4E<04
O. 243717E<02
O.767412E02
0. !?1876E<01
O.1BOB44E<03
0. B31?21E<00
0.166936E+01
O. 1 43337E+0I
O. 1169B4E<01
O.12360E<01
0. 1B4093E <01
0. 1344B6E<01
0. 10733BE *01
O. 972772E<00
O.107110E<01
O. 2? 1630Â£<01
O. 137732E+01
O. 766663E<00
0. 123B33E<01
0. 139161 E<01
O. 100226E+01
0. 117233EK)!
0. 1634 32E<01
0. 4 1 733BE<01
0 167341E+01
212
6.3.5. Summary and conclusions
This section illustrated the use of deterministic models for the
generation of hydrologic time series, after appropriate calibration over
several observed storms representative of the average hydrological
conditions of the basins under investigation. The case study was mainly
used for illustrative purposes rather than to answer specific questions
about the analyzed basins or rainfall stations. The use of the Storm
Water Management Model (SWMM) for continuous simulation, along with
STATS and SYNOP distribution-free statistical analysis, was treated in
some detail, given the importance of such statistics for stormwater
management and for hydrological planning and design.
The next section presents other illustrative examples derived from
this case study for application of probabilistic models.
6.4. Probabilistic Models
6.4.1. Introduction
A first application of probabilistic models was given by the
illustrative examples of Chapter 4. In this section, other examples
derived from the case study are analyzed by the GPDCP program in order
to illustrate its application to different types of hydrological series.
First, the annual series of total rainfall at the eight NWS stations
described in Section 6.2 are analyzed. Then a sample of event-based
statistics composed of four series of event characteristics (duration,
intensity, volume and delta) generated by SYNOP is analyzed by the
GPDCP. The last example uses a series of pollutant loads (COD)
generated by SWMM.
6.4.2. Annual rainfall series
The illustrative examples of Chapter 4 showed that the four
generalized probability distribution models fitted the three analyzed
series almost equally well. Thus, the annual series of total rainfall
(listed in Appendix H) are analyzed only by the generalized normal
distribution (GND). The normal family was selected because of its
popularity, especially for reliability analysis, in which the assumption
of normality is the rule rather than the exception, as was shown in
Chapter 5. The results of this analysis are summarized in Table 6.18,
giving the optimal transformation (α) for each of the eight stations and
the corresponding selection statistics. Note the high variability of α,
ranging from 0.1 for the Miami airport station to 2.0 for the Port
Mayaca station. The best fit to the GND was at the North New River
station, while the worst fit was at the St. Lucie station. This
classification is based on the four selection statistics, which all led
to the same order of goodness of fit. Thus, for a between-station
comparison, any of these statistics will be sufficient. Note that even
for the worst fit, the coefficient of determination R² (Equation 3.3.6)
is greater than 0.95.
A sensitivity analysis of the estimated shape parameter (α) of the
GND to a change of scale confirmed the conclusion reached by the
illustrative examples of Chapter 4: for all eight stations, exactly the
same α's as those reported in Table 6.18 were estimated with the
rainfall series expressed in inches, meters, and centimeters.
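
The scale invariance reported above can be illustrated with a minimal
sketch. It assumes, purely for illustration, that the transformation is
the Box-Cox form and that α is selected by maximizing the R² of a normal
probability plot; GPDCP itself uses four selection statistics and is not
reproduced here. The function names, the rainfall values, and the
plotting position formula p_i = (i - a)/(n + 1 - 2a) are assumptions of
this sketch, not taken from the thesis.

```python
import math
from statistics import NormalDist


def box_cox(values, alpha):
    """Box-Cox transform: log(x) for alpha = 0, else (x**alpha - 1) / alpha."""
    if alpha == 0:
        return [math.log(v) for v in values]
    return [(v ** alpha - 1.0) / alpha for v in values]


def plotting_positions(n, a=0.375):
    """Assumed general form p_i = (i - a) / (n + 1 - 2a), i = 1..n."""
    return [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy * sxy / (sxx * syy)


def probability_plot_r2(sample, alpha, a=0.375):
    """R^2 between the sorted transformed sample and standard normal quantiles."""
    y = sorted(box_cox(sample, alpha))
    z = [NormalDist().inv_cdf(p) for p in plotting_positions(len(y), a)]
    return r_squared(z, y)


def best_alpha(sample, alphas):
    """Grid value of alpha giving the highest probability-plot R^2."""
    return max(alphas, key=lambda al: probability_plot_r2(sample, al))


# Hypothetical annual rainfall totals in inches (made up, not from Appendix H).
rain_in = [48.2, 61.5, 55.0, 72.3, 39.8, 66.1, 58.4, 50.7, 81.9, 44.6]
rain_cm = [2.54 * v for v in rain_in]
rain_m = [0.0254 * v for v in rain_in]
alphas = [round(0.05 * k, 2) for k in range(0, 41)]  # 0.00, 0.05, ..., 2.00

print(best_alpha(rain_in, alphas), best_alpha(rain_cm, alphas),
      best_alpha(rain_m, alphas))
```

Under these assumptions the invariance follows because rescaling x by a
constant c changes the Box-Cox variate only by a positive linear map, so
its correlation with the normal quantiles is unchanged apart from
floating-point rounding.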

Table 6.18. GPDCP Selected Transformation and Corresponding Optimal
Statistics for the 8 NWS Stations.

Station            α      R²        STDE    WSS     MXLF
Miami AP           0.10   0.98134   1.988   0.450   20.77
N. New River       0.65   0.99043   0.916   0.228   30.71
West Palm Beach    0.90   0.97215   2.029   0.666    6.26
Belle Glade        1.30   0.97974   1.235   0.482    9.92
Ortona Lock 2      1.20   0.97615   1.517   0.573    7.09
Port Mayaca        2.00   0.98213   1.104   0.431    6.89
St. Lucie          0.45   0.95569   1.914   1.069    0.14
Daytona Beach      0.90   0.98679   1.079   0.345   21.97

R², STDE, WSS and MXLF are defined in Section 3.3.2.

Due to the importance of the estimated shape parameter in
reliability analysis (Chapter 5), its sensitivity to the plotting
position definition for the GND case is analyzed in more detail using
the annual rainfall series of the south Florida stations. The results of
this analysis are summarized in Table 6.19. Here again, as for the
illustrative examples of Chapter 4, the selection statistics show little
sensitivity to the plotting position definition compared to their high
sensitivity to the shape of the distribution, α. The Weibull plotting
position (a = 0) (Equation 2.2.22) led to the best fit at four of the
stations, while for three other stations the optimal plotting position
constant was greater than 0.44. Thus, for these series the optimal
plotting position constant oscillated between 0.0 and 0.44, the values
originally recommended for the uniform and extremal distributions,
respectively (Cunnane, 1978), although the fitted distribution is normal
and its expected plotting position constant is 0.375 (Section 2.2.7).
Therefore, it may be concluded that the plotting position definition is
not as important as the shape of the distribution in optimizing the
selection statistics. Furthermore, when an improvement of the fit is
sought, the plotting position constant should be estimated like all the
other parameters, since it is sample dependent.
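
For reference, the role of the plotting position constant a can be
written out explicitly. Equation 2.2.22 is not reproduced in this
section, so the general form below is an assumption (it is the form
commonly associated with Cunnane, 1978), shown with the three values of
a discussed above.

```latex
% Assumed general plotting position formula for the i-th smallest of n values
p_i = \frac{i - a}{n + 1 - 2a}, \qquad i = 1, \dots, n
% Special cases for the constants discussed in the text
a = 0:      \quad p_i = \frac{i}{n + 1}              \quad \text{(Weibull)}
a = 0.375:  \quad p_i = \frac{i - 0.375}{n + 0.25}   \quad \text{(normal, Blom)}
a = 0.44:   \quad p_i = \frac{i - 0.44}{n + 0.12}    \quad \text{(extremal, Gringorten)}
```

With this form, changing a mainly moves the probabilities assigned to
the smallest and largest observations, which is consistent with the
small effect on the selection statistics noted above.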

Table 6.19. Sensitivity Analysis of the Optimal MXLF Statistics to the
Plotting Position Definition (Eqn. 2.2.22) for the GND.

                          Plotting Position Constant a
Station            α      0.00    0.10    0.20    0.30    0.375   0.44
Miami AP           0.10   27.16   25.75   24.14   22.32   20.77   19.28
N. New River       0.65   29.43   30.03   30.49   30.75   30.71   30.46
West Palm Beach    0.90    9.40    8.81    8.07    7.13    6.26    5.36
Belle Glade        1.30    8.73    9.34    9.58    9.80    9.92   10.32
Ortona Lock 2      1.20    8.74    8.36    7.89    7.27    7.09    6.49
Port Mayaca        2.00    5.17    5.48    6.69    6.74    6.89    6.92
St. Lucie          0.45    2.45    2.17    1.48    0.74    0.14    0.39
Daytona Beach      0.90   23.31   23.24   23.00   22.53   21.97   21.30

6.4.3. Event statistics
Empirical frequencies and return periods of event characteristics
such as duration, intensity, volume and delta may be generated by SYNOP
or STATS after separating the continuous hourly series into events.
Distribution-free statistical analysis is performed within these
programs. A more detailed analysis of these event characteristics can be
performed by the GPDCP program starting with the generated empirical
frequencies. In this section, SYNOP-generated series from the Belle
Glade hourly rainfall data are used as an illustrative example for the
GPDCP application. Here again, only the GND was used in the analysis.
The results from the GPDCP are summarized in Table 6.20. Note the good
fit of the four series to the GND implied by the high values of the R²
statistic. Based on this statistic, the series of storm volumes gives
the best fit compared to the other characteristics, while according to
the WSS and MXLF statistics, the series of durations has the best fit.
No such comparison between different samples is possible with the STDE
statistic, since it has the units of the analyzed characteristic.

Table 6.20. GPDCP Optimal Selection Statistic Summary, Belle Glade
Storm Events Characteristics.

Event
Characteristic     α      R²        STDE      WSS     MXLF
Duration           0.15   0.99366     1.578   0.134   24.12
Intensity          0.05   0.98525     0.083   0.308   14.13
Volume             0.20   0.99412     0.084   0.200   19.31
Delta              0.05   0.97708   114.000   0.543    7.33

R², STDE, WSS and MXLF are defined in Section 3.3.2.

6.4.4. Quality statistics
Total pollutant loads and concentrations can be generated by SWMM
on a single-event or continuous simulation basis. Statistical analysis
of the generated pollutants may be performed within SWMM (STATS block)
on a distribution-free basis or by the GPDCP program to select the best
probability distribution model.
In this section a series of the highest 50 hourly COD (Chemical
Oxygen Demand) loads generated by SWMM at the Miami Multifamily basin
(Table 6.13) is analyzed by the GPDCP program. The four generalized
probability distribution models are fitted to this series with
transformations (α) ranging from -0.50 to +0.50 with 0.05 increments.
The optimal transformations and their corresponding selection statistics
are listed in Table 6.21. The GRD (generalized Rayleigh distribution)
gives the best fit according to the R², WSS and MXLF statistics,
followed by the Gumbel and Pearson, with the normal being last. However,
based on the STDE, the Gumbel distribution has a better performance than
the Rayleigh distribution. The clear advantage of the Gumbel and
Rayleigh distributions over the normal and Pearson distributions should
be expected given the extremal character of the analyzed series, wherein
the 50 highest hourly loads were selected from a total number of 3494
hours of simulation (Table 6.13). Figure 6.7 gives the plot of the
transformed series versus the reduced variates (expected order
statistics) along with the fitted model and the 95% confidence interval.

Table 6.21. GPDCP Optimal Selection Statistics for COD Pollutant
Loads, Miami Multifamily Basin.

Distribution   α      R²        STDE    WSS     MXLF
Normal         0.30   0.94495   26.07   3.399   -30.59
Gumbel         0.10   0.97871   12.33   1.123    -2.90
Rayleigh       0.25   0.98270   14.45   0.892    -2.86
Pearson        0.30   0.95488   23.90   2.890   -26.55

R², STDE, WSS and MXLF are defined in Section 3.3.2.
6.4.5. Conclusions
This section presented some examples illustrating the potential use
of the GPDCP program in modeling different types of hydrologic series.
The program performed equally well for all the modeled examples. A
sensitivity analysis of the shape parameter to the change of scale
confirmed the results obtained by the illustrative examples of Chapter
4. The selection statistics showed little sensitivity to the plotting
position definition, which was found highly sample dependent, deviating
excessively from the expected value given in the literature (Cunnane,
1978; Royston, 1982).
The next section presents some applications of the reliability-based
approach for stochastic modeling.
6.5. Stochastic Models
6.5.1. Introduction
Time series stochastic modeling for either data generation or
forecasting of hydrologic variables has become an important step in the
planning and operation of water resources systems (Salas and Obeysekera,
1982). The modeling process is generally composed of six main stages
(Salas et al., 1980): 1) identification of model composition, e.g.,
univariate, multivariate or multilevel (disaggregation) models; 2)
identification of model type, e.g., autoregressive-moving average
(ARMA), fractional Gaussian noise (FGN), broken line (BL), etc.; 3)
identification of model form or order, e.g., order p of the
autoregressive (AR) and q of the moving average (MA) component of an
ARMA(p,q) model; 4) estimation of model parameters; 5) testing goodness
of fit of the model; and 6) evaluation of the uncertainty of model
predictions based on available data.

[Figure 6.7. GPDCP Modeling of SWMM Generated Hourly COD Loads. Plot
title: QUALITY SIMULATION.]

The first four stages have been largely investigated (Rao et al.,
1982; Salas and Obeysekera, 1982; Ozaki, 1977; Delleur and Kavvas, 1978;
Klemes and Bulu, 1979; Miller et al., 1981; Hirsch, 1982; Srikanthan and
McMahon, 1982; Stedinger and Taylor, 1982a). Although many statistics
have been developed to compare the performance of different models, the
best model remains overshadowed by natural and parameter uncertainty
(Klemes et al., 1981; Stedinger and Taylor, 1982b), making model
selection a rather subjective decision. Thus, this section will focus
mainly on the last three stages, in an effort to shed more light on
parameter estimation, goodness of fit testing and uncertainty
evaluation, areas which have not been given enough attention, especially
when the analysis is based on transformed data (e.g., Box-Cox
transformation, Equation 3.3.3).
6.5.2. Model Description
Since the main objective of this section is reliability analysis
emphasizing the last three stages of the modeling process, only one
model will be considered. The focus will be on the effect of the
Box-Cox transformation (Equation 3.3.3) on the performance of this model
and the reliability of its parameter estimates. The model is the
ARMA(1,1), a model that has been recommended by many (e.g., Delleur and
Kavvas, 1978; Salas et al., 1980) for the analysis of rainfall time
series similar to those of this case study (Appendix H).
The ARMA(1,1) model is represented by

$$ z_t = \phi_1 z_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1} \qquad (6.5.1) $$

where $z_t$ is the standardized transformed variable at time t, $\phi_1$
and $\theta_1$ are the autoregressive and moving average parameters,
respectively, and $\varepsilon_t$ is an independent random variable
(white noise) with zero mean and variance $\sigma_\varepsilon^2$. A
detailed description of the parameter estimation procedure may be found
in Box and Jenkins (1976) or, in a more simplified form with application
to hydrologic time series, in Salas et al. (1980). In this second
reference two computer programs are given for the analysis of annual and
monthly hydrologic time series based on the IMSL subroutines (IMSL,
1979). These programs have been modified to perform the type of analysis
described in the following sections.
6.5.3. Parameter estimation and goodness of fit evaluation
6.5.3.1. Parameter Estimation. The parameters are first estimated by
the least squares method (Section 2.2.3). The sum of squares of the
residuals is calculated for a range of values of $\phi_1$ and $\theta_1$
within the interval [-1, +1] imposed by the stationarity and
invertibility conditions (Box and Jenkins, 1976, p. 76),

$$ SS(\phi_1, \theta_1) = \sum_{t=1}^{N} \varepsilon_t^2 = \sum_{t=1}^{N} \left( z_t - \phi_1 z_{t-1} + \theta_1 \varepsilon_{t-1} \right)^2 \qquad (6.5.2) $$

where $\varepsilon_0$ is set equal to zero. The optimal parameters are
those for which the sum of squares is minimized. More refined estimates
of these parameters are then calculated by the IMSL subroutine FTMXL
(IMSL, 1979) using the previous estimates as starting values. The
nonlinear procedure used by this subroutine is described by Box and
Jenkins (1976, Section 7.2). As noted in Chapter 2, the least squares
estimates are the same as the maximum likelihood estimates provided the
residuals are normally distributed. Sensitivity of the estimated
parameters and of the model performance to deviation from this
assumption of normality will be investigated using monthly and yearly
rainfall series from south Florida along with the Box-Cox
transformation.
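
The first-estimate step described above (a sum of squares evaluated over
a grid of parameter values) can be sketched as follows. This is a
minimal illustration under stated assumptions, not the modified Salas et
al. program: the function names, the 0.2 grid step, the example series,
and the choice to start the recursion at the second observation (so that
no value before the record is needed) are all hypothetical. The FTMXL
refinement is not reproduced.

```python
def arma11_sum_of_squares(z, phi, theta):
    """Conditional sum of squared residuals of an ARMA(1,1) model.

    Residuals follow e_t = z_t - phi * z_{t-1} + theta * e_{t-1},
    with the starting residual set to zero.
    """
    prev_resid = 0.0
    total = 0.0
    for t in range(1, len(z)):
        resid = z[t] - phi * z[t - 1] + theta * prev_resid
        total += resid * resid
        prev_resid = resid
    return total


def grid_search(z, step=0.2):
    """Return (SS, phi, theta) minimizing SS over a grid inside (-1, +1)."""
    k = round(1.0 / step)
    grid = [i * step for i in range(-k + 1, k)]
    return min((arma11_sum_of_squares(z, phi, theta), phi, theta)
               for phi in grid for theta in grid)


# Hypothetical standardized series (made up for illustration).
z = [0.3, -1.1, 0.8, 1.4, -0.2, -0.9, 0.5, 1.0, -1.6, 0.1, 0.7, -0.4]
ss, phi0, theta0 = grid_search(z)
print(ss, phi0, theta0)
```

The grid minimum would then serve only as a starting value for a
nonlinear refinement, which is why a coarse step is adequate in this
sketch.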
6.5.3.2. Goodness of fit evaluation. A first evaluation of the
goodness of fit may be given by examination of the behavior of the sum
of squares surfaces for different values of α. This may be accomplished
by inspection of tables, of contour lines of equal values, or of
three-dimensional plots of the generated sum of squares.
Portmanteau goodness of fit test. The independence of the residuals
is tested by the Portmanteau goodness of fit test, a commonly applied
test for diagnostic checking of fitted ARMA models (Salas et al., 1980).
The distribution of the statistic

$$ Q = n \sum_{k=1}^{L} \rho_k^2(\varepsilon) \qquad (6.5.3) $$

where n is the sample size, L is the maximum lag considered, and
$\rho_k(\varepsilon)$ is the autocorrelation function of the residuals,
is approximated by a chi-square distribution (Section C.2.7). The
adequacy of the model is checked by comparing Q with the theoretical
chi-square value for (L - 2) degrees of freedom and a given level of
reliability (95% for this study).
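
The test statistic of Equation 6.5.3 can be computed with a short
sketch. The function names and the residual values are hypothetical, and
the sample autocorrelation estimator used here (lagged products divided
by the total sum of squares about the mean) is an assumption, since the
thesis does not spell it out in this section. The critical value is
passed in by the caller; 15.507 is the tabulated 95% chi-square value
for 8 degrees of freedom (L = 10 for an ARMA(1,1) model).

```python
def autocorrelation(resid, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(resid)
    mean = sum(resid) / n
    c0 = sum((e - mean) ** 2 for e in resid)
    ck = sum((resid[t] - mean) * (resid[t + lag] - mean)
             for t in range(n - lag))
    return ck / c0


def portmanteau_q(resid, max_lag):
    """Q = n * sum of squared residual autocorrelations for lags 1..max_lag."""
    n = len(resid)
    return n * sum(autocorrelation(resid, k) ** 2
                   for k in range(1, max_lag + 1))


def passes_portmanteau(resid, max_lag, chi2_critical):
    """True if Q is below the supplied chi-square critical value."""
    return portmanteau_q(resid, max_lag) < chi2_critical


CHI2_95_DF8 = 15.507  # tabulated 95% chi-square value, 8 degrees of freedom

# Hypothetical ARMA(1,1) residuals (made up for illustration).
resid = [0.42, -0.91, 0.15, 1.20, -0.33, -0.58, 0.77, 0.05, -1.10, 0.64,
         0.28, -0.47, 0.93, -0.12, -0.80, 0.36, 0.51, -0.25, 1.05, -0.69,
         0.08, -0.39, 0.71, -0.54]
print(portmanteau_q(resid, 10), passes_portmanteau(resid, 10, CHI2_95_DF8))
```

Because Q only accumulates squared autocorrelations, it checks
independence of the residuals and says nothing about their distribution
or scale.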
Akaike Information Criterion (AIC). Akaike (1974) proposed the
following information criterion for selecting among different ARMA(p,q)
models

$$ AIC(p,q,1) = n \log(\sigma_M^2) + 2(p+q) \qquad (6.5.4) $$

where $\sigma_M^2$ is the maximum likelihood estimate of the residual
variance $\sigma_\varepsilon^2$, and n is the sample size. The variance
$\sigma_M^2$ is usually replaced by the least squares estimate, which as
noted above is the same as the maximum likelihood estimate if the data
are normally distributed. The 1 in Equation (6.5.4) stands for (α = 1),
or no transformation of the data. Rao et al. (1982) extended this
criterion to the case where the logarithms (α = 0) of the data are
normally distributed, by defining

$$ AIC(p,q,0) = n \log(\sigma_M^2) + 2(p+q) + 2n\,\overline{\log y} \qquad (6.5.5) $$

This criterion is extended in this study to the case where the fitted
data are transformed by the Box-Cox transformation (Equation 3.3.3)

$$ AIC(p,q,\alpha) = n \log(\sigma_M^2) + 2(p+q) - 2n(\alpha - 1)\,\overline{\log y} \qquad (6.5.6) $$

Derivation of this equation is based on the maximum likelihood function
(Equation 3.3.16) and the previous definitions of the AIC, which may be
easily shown to be special cases of the new expression (Appendix I).
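
A minimal sketch of Equation 6.5.6 follows. The function names and the
numerical inputs are hypothetical, natural logarithms are assumed, and
the mean-log term is taken here to be the mean of the logarithms of the
untransformed data, which is an assumption since its definition lies
outside this section.

```python
import math


def mean_log(y):
    """Mean of the natural logarithms of the (positive) original data."""
    return sum(math.log(v) for v in y) / len(y)


def aic_box_cox(resid_var, y, alpha, p=1, q=1):
    """AIC(p, q, alpha) = n log(var) + 2(p + q) - 2n(alpha - 1) mean(log y)."""
    n = len(y)
    return (n * math.log(resid_var) + 2 * (p + q)
            - 2 * n * (alpha - 1.0) * mean_log(y))


# Hypothetical annual rainfall series and residual variance (made up).
y = [48.2, 61.5, 55.0, 72.3, 39.8, 66.1, 58.4, 50.7, 81.9, 44.6]
for alpha in (1.0, 0.5, 0.0):
    print(alpha, aic_box_cox(0.70, y, alpha))
```

Setting alpha to 1 removes the last term and recovers Equation 6.5.4,
while alpha equal to 0 gives the extra term of Equation 6.5.5.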
Bayesian Information Criterion (BIC). Rao et al. criticized the
AIC, noting that it does not minimize the average value of any criterion
function, and that it is not consistent (Section 2.2.2). Thus, they
proposed an alternative criterion based on Bayesian decision theory to
compare different types of ARMA models. This Bayesian Information
Criterion (BIC) minimizes the probability of selecting the wrong model
by minimizing the following expression

$$ BIC(p,q,1) = n \log(\sigma_M^2) + (p+q) \log(n) \qquad (6.5.7) $$

where all terms are as defined previously. For the case where the
square root (α = 1/2) or the logarithm (α = 0) of the data are fitted to
the ARMA models, Rao et al. extended the above criterion to

$$ BIC(p,q,\tfrac{1}{2}) = n \log(\sigma_M^2) + (p+q) \log(n) + 2n \log(2) + n\,\overline{\log y} \qquad (6.5.8) $$

and

$$ BIC(p,q,0) = n \log(\sigma_M^2) + (p+q) \log(n) + 2n\,\overline{\log y} \qquad (6.5.9) $$

These three criteria were used for the selection of the best
transformation among the square root, logarithmic and no transformation.
This statistic is also extended in this study to the Box-Cox
transformation. The following general expression is defined based on the
maximum likelihood function (Equation 3.3.16) and the previous three
special cases (α = 1, 1/2 and 0) of the BIC (Appendix I gives some
details about this derivation)

$$ BIC(p,q,\alpha) = n \log(\sigma_M^2) + (p+q) \log(n) - 8n\,\alpha(\alpha - 1) \log(2) - 2n(\alpha - 1)\,\overline{\log y} \qquad (6.5.10) $$

For a fixed p and q (e.g., ARMA(1,1)) the optimal transformation will be
the one minimizing the above equation. The performance of these
statistics is illustrated by the following two examples of annual and
monthly time series from south Florida.
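
As a quick check on the general BIC expression (Equation 6.5.10), the
two α-dependent terms can be evaluated at the three special values of α.
This is only a substitution exercise, not the Appendix I derivation, and
the shorthand $T(\alpha)$ is introduced here for convenience.

```latex
% alpha-dependent part of Equation 6.5.10
T(\alpha) = -8n\,\alpha(\alpha - 1)\log 2 \;-\; 2n(\alpha - 1)\,\overline{\log y}

% alpha = 1: both terms vanish, recovering Equation 6.5.7
T(1) = 0

% alpha = 1/2: recovers the extra terms of Equation 6.5.8
T(\tfrac{1}{2}) = -8n\left(\tfrac{1}{2}\right)\left(-\tfrac{1}{2}\right)\log 2
                  - 2n\left(-\tfrac{1}{2}\right)\overline{\log y}
                = 2n\log 2 + n\,\overline{\log y}

% alpha = 0: recovers the extra term of Equation 6.5.9
T(0) = 0 - 2n(-1)\,\overline{\log y} = 2n\,\overline{\log y}
```
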
6.5.4. Annual ARMA(1,1) Model
The annual series of total rainfall from the eight NWS stations
(Table 6.1) are fitted to the ARMA(1,1) model for four different
transformations. Among these transformations is the normalizing
transformation (α_opt) selected by the GPDCP program (Table 6.18). The
first estimates of the autoregressive and moving average parameters are
independent of α for seven of the eight stations investigated; the only
change is for the log-transformed series at the West Palm Beach station,
where these estimates change from 0.0 to 0.20 for both $\phi_1$ and
$\theta_1$. This change is not that important for the final estimates
given by the FTMXL subroutine (IMSL, 1979). Table 6.22 gives a complete
list of the final parameter estimates. Note the low sensitivity of the
final estimates to the change of α suggested by this table. Although the
difference between the parameters for a given station is small,
associated statistics are highly sensitive to the change of α. For
example, Table 6.23 lists the skewness coefficients of the eight series
for the four analyzed transformations. Note the high negative skewness
of the log-transformed series (α = 0), the positive skewness of the
original series (α = 1) and the relatively small skewness of the
normalized series (α = α_opt).

Table 6.22. ARMA(1,1) Parameter Estimates for the Annual Rainfall Series

                   First           FTMXL Final Estimates
                   Estimates       α = α_opt       α = 1           α = 0.5         α = 0.0
Station            φ1     θ1       φ1     θ1       φ1     θ1       φ1     θ1       φ1     θ1
Miami              0.20   0.60     0.156  0.602    0.150  0.560    0.154  0.585    0.156  0.606
N. New River       0.20   0.60     0.316  0.673    0.333  0.708    0.309  0.658    0.289  0.609
West Palm Beach    0.00   0.00     0.088  0.070    0.086  0.065    0.091  0.080    0.104  0.118
Belle Glade        0.20   0.40     0.162  0.357    0.151  0.332    0.137  0.297    0.125  0.265
Ortona Lock 2      0.20   0.40     0.778  0.521    0.272  0.513    0.255  0.492    0.237  0.470
Port Mayaca        0.20   0.40     0.223  0.453    0.222  0.440    0.211  0.435    0.195  0.428
St. Lucie          0.40   0.40     0.432  0.433    0.458  0.475    0.434  0.437    0.412  0.399
Daytona Beach      0.00   0.60     0.015  0.584    0.021  0.587    0.007  0.572    0.030  0.551

2
The residual variance, and the minimum sum of squares asso
ciated with the four transformations are listed in Table 6.24. These
statistics are expressed in terms of the variance of the transformed
data. Thus, the residual variance may be compared to one (the variance
of the standardized transformed series), and the variance explained by
2
the AKMA(1,1) model will be (laÂ£). If the maximization of this ex
plained variance is taken as a selection criterion, the logarithmic
transformation will be best for six of the eight stations. But this is
contradictory since Table 6.23 indicates that for these stations the
logarithmic transformation is least likely to reduce the annual series
to normality, since it leads to the highest negative skewness. This
apparent discrepency between selection of the best transformation based
on skewness and residual variance may be attributed to the fact that the
latter has magnitudes depending on the transformation. Thus, such a
statistic should not be used for the selection among different trans
formations. The closeness of the minimum sum of squares from different
transformations results from the small variability of the variance
explained by the ARMA(1,1) model, which averaged more than 30% for most
of the stations. A better visualization of the transformation effects
on the shape of the sum of squares surface and the optimal parameters is
given by plots of contours of equal SS values for different a's. Figure
Table 6.23. Skewness Coefficient for Transformed Annual Rainfall Series
Station
a ..
opt
Transformation
a = a
opt
a = 1.0
a = 0.50
a = 0.0
Miami AP
0.10
0.004
1.674
0.74
0.176
N. New River
0.65
0.213
0.980
0.126
1.25
West Palm Beach
0.90
0.204
0.001
1.050
2.160
Belle Glade
1.30
0.119
0.520
1.680
2.970
Ortona Lock 2
1.20
0.121
0.53
1.676
3.020
Port Mayaca
2.00
0.075
2.150
3.360
4.480
St. Lucie
0.45
0.067
+1.632
0.208
1.660
Daytona Beach
0.90
0.098
0.298
0.700
1.698
229
Table 6.24. Residual Variance and Minimum Sum of Squares (σ_ε²/SS) for the
Transformed and Standardized Annual Rainfall Series. Values
are in Terms of the Unit Variance of the Standardized Series.

                                            Transformation
Station            α_opt    α = α_opt      α = 1.0        α = 0.50       α = 0.0
Miami AP           0.10     0.740/17.76    0.721/17.30    0.684/16.42    0.641/15.38
N. New River       0.65     0.634/15.22    0.631/15.14    0.634/15.22    0.633/15.19
West Palm Beach    0.90     0.819/19.66    0.823/19.75    0.802/19.25    0.782/18.77
Belle Glade        1.30     0.740/17.76    0.721/17.30    0.684/16.42    0.641/15.38
Ortona Lock 2      1.20     0.681/16.34    0.668/16.03    0.632/15.17    0.589/14.14
Port Mayaca        2.00     0.539/12.94    0.518/12.43    0.514/12.34    0.516/12.38
St. Lucie          0.45     0.769/18.46    0.784/18.82    0.770/18.48    0.755/18.12
Daytona Beach      2.90     0.819/19.66    0.823/19.75    0.802/19.25    0.782/18.77
Figure 6.8 Sum of Squares Surface Contours for
Belle Glade Annual Rainfall Series (annual ARMA(1,1) model;
recoverable panel titles: α = 1.3 and α = 1.0)
6.8 gives contour plots for the four analyzed transformations for the
Belle Glade annual rainfall series. Note the subtle tendency for the
sum of squares surface to flatten around its minimum as α moves away
from the optimal transformation (α = 1.3), which makes the associated
parameter estimates less sensitive to the data and thus less reliable.
For all four cases, however, the uniqueness of the optimal model is
obvious from these plots.
All 32 models (8 stations and 4 transformations) fitted to the
annual rainfall series passed the Portmanteau goodness of fit test
(Equation 6.5.3) for the ARMA(1,1) model. This illustrates the
inadequacy of this statistic for choosing and comparing stochastic
models. A similar conclusion was reached by Chander et al. (1979), who
recommended the AIC (Equation 6.5.4) as an alternative selection
statistic. Table 6.25 lists this statistic resulting from the general
expression of Equation 6.5.6. Note the drastic decrease of this
statistic with the increase of α; this decrease is not an indication of
a better fit to the transformed series, since the highest α is always
selected. Thus, the AIC may be useful for the selection among different
models for a fixed α, as applied by Rao et al. (1982) and others, but
its use for choosing the best transformation, recommended by Rao et al.,
needs more careful scrutiny. The net superiority of models fitted to
log transformed monthly data (shown later) may not be an indication of a
better fit but just a consequence of the change of magnitude, similar to
the case discussed above for the residual variance.
The Bayesian Information Criterion (Equation 6.5.10) closely
follows the AIC (Table 6.26). The same decrease with an increase of α
may be noticed.
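As a rough illustration of how these criteria respond to the residual
variance, the sketch below uses the common textbook forms
AIC = n ln(σ_ε²) + 2k and BIC = n ln(σ_ε²) + k ln(n). These forms are
assumed here and are not necessarily identical to Equations 6.5.4 to
6.5.10, which are not reproduced in this section. The function names and
the choice k = 2 (one AR and one MA parameter) are illustrative.

```python
import math

def aic(resid_var, n, k=2):
    """Textbook AIC for a model with k parameters fitted to n observations."""
    return n * math.log(resid_var) + 2 * k

def bic(resid_var, n, k=2):
    """Textbook BIC (Schwarz form) for the same model."""
    return n * math.log(resid_var) + k * math.log(n)

# Belle Glade, untransformed annual series (alpha = 1.0): residual variance
# 0.721 taken from Table 6.24, n = 24 years of record.
print(round(aic(0.721, 24), 2), round(bic(0.721, 24), 2))
```

With these inputs the two functions return about -3.85 and -1.49, whose
magnitudes agree with the α = 1.0 entries for Belle Glade in Tables 6.25
and 6.26. Both criteria depend on the data only through ln(σ_ε²), so
anything that changes the scale of the residual variance changes them as
well.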
Table 6.25. Akaike Information Criterion for the ARMA(1,1) Models of the
Annual Rainfall Series.

                                        Transformation
Station            α_opt    α = α_opt    α = 1.0    α = 0.50    α = 0.0
Miami AP           0.10     169.00       4.54       91.85       188.28
N. N. R. Canal     0.65     58.94        7.03       87.20       181.32
West Palm Beach    0.90     18.79        0.68       96.66       193.95
Belle Glade        1.30     58.79        3.85       87.51       178.55
Ortona Lock 2      1.20     42.20        5.67       85.45       176.22
Port Mayaca        2.00     194.76       11.78      80.00       172.06
St. Lucie          0.45     102.77       1.84       93.25       188.30
Daytona Beach      0.90     12.97        5.40       86.48       178.46
Table 6.26. Bayesian Information Criterion for the ARMA(1,1) Models of the
Annual Rainfall Series.

                                        Transformation
Station            α_opt    α = α_opt    α = 1.0    α = 0.50    α = 0.0
Miami AP           0.10     183.33       2.18       127.48      190.64
N. N. R. Canal     0.65     91.58        4.67       122.83      183.68
West Palm Beach    0.90     33.12        1.67       132.29      196.31
Belle Glade        1.30     108.34       1.49       123.14      180.90
Ortona Lock 2      1.20     71.79        3.32       121.07      178.58
Port Mayaca        2.00     458.58       9.42       115.63      174.42
St. Lucie          0.45     138.06       0.51       128.88      190.66
Daytona Beach      0.90     27.30        3.05       122.11      180.81
6.5.5. Monthly ARMA(1,1) model
An ARMA(1,1) model for three different transformations is fitted to
the monthly rainfall series of the eight stations analyzed above for
annual total rainfall. The first estimates of the autoregressive (AR)
and moving average (MA) parameters, φ₁ and θ₁, respectively, are listed
in Table 6.27. Note that, for most of the series, these parameters are
independent of the transformation, α. Also note the closeness between
the AR and MA parameters and their equality for many series. This is a
very undesirable property for processes modeled by an ARMA(1,1) model
since it is an indication of white noise (Box & Jenkins, 1976, p. 249).
The following analysis illustrates the difficulties associated with the
estimation of such a process by an ARMA(1,1) model and the reliability
of such estimates. The final estimates, calculated by the FTMXL (IMSL,
1979) subroutine, are listed in Table 6.28. Note the small difference
between φ₁ and θ₁ for all the models, and the low sensitivity of these
estimates to the transformation, α.
Unlike the annual series, the monthly series parameter estimates are
highly dependent on the first estimates. This is illustrated by Table
6.29, where first and final estimates for the square root series are
listed. These first estimates are selected from the range 0.0 to 1.0
instead of −1.0 to 1.0 as was the case for Table 6.27. Note the big
change in the final estimates of φ₁ and θ₁ at the Belle Glade station,
relative to the other stations. This is due mainly to the fact that the
previous first estimate of θ₁ (Table 6.27) was outside the range 0 to
1.0.
The variance of the residuals
Table 6.27. First Estimates of ARMA(1,1) Parameters for Three Transformations of
Monthly Rainfall Series.

                     α = 1.0          α = 0.50         α = 0.0
Station            φ₁      θ₁       φ₁      θ₁       φ₁      θ₁
Miami AP           0.80    0.80     0.80    0.60     0.80    0.60
N. New River       0.80    0.80     0.80    0.80     0.20    0.40
West Palm Beach    0.80    0.80     0.80    0.80     0.80    0.80
Belle Glade        0.00    0.20     0.00    0.20     0.20    0.20
Ortona Lock 2      0.80    0.80     0.80    0.80     0.80    0.80
Port Mayaca        0.80    0.80     0.80    0.80     0.80    0.80
St. Lucie          0.80    0.80     0.80    0.80     0.80    0.80
Daytona Beach      0.40    0.20     0.40    0.20     0.20    0.00
Table 6.28. FTMXL Parameter Estimates of the ARMA(1,1) Monthly Models

                     α = 1.0           α = 0.50          α = 0.0
Station            φ₁       θ₁       φ₁       θ₁       φ₁       θ₁
Miami AP           0.924    0.856    0.910    0.831    0.883    0.789
N. New River       0.850    0.805    0.847    0.797    0.271    0.428
West Palm Beach    0.807    0.782    0.790    0.748    0.750    0.686
Belle Glade        0.052    0.156    0.046    0.153    0.237    0.147
Ortona Lock 2      0.851    0.810    0.845    0.974    0.830    0.769
Port Mayaca        0.901    0.886    0.893    0.879    0.801    0.806
St. Lucie          0.813    0.806    0.805    0.793    0.793    0.813
Daytona Beach      0.364    0.258    0.372    0.255    0.190    0.031
Table 6.29. First and Final Estimates of ARMA(1,1) Parameters
for Square Root of Monthly Rainfall.

                   First estimates    Final estimates
Station            φ₁      θ₁        φ₁       θ₁
Miami AP           0.90    0.80      0.916    0.838
N. New River       0.80    0.70      0.851    0.802
West Palm Beach    0.80    0.80      0.790    0.748
Belle Glade        0.10    0.00      0.084    0.022
Ortona Lock 2      0.80    0.70      0.848    0.798
Port Mayaca        0.90    0.90      0.902    0.888
St. Lucie          0.80    0.80      0.805    0.793
Daytona Beach      0.40    0.30      0.400    0.284
Table 6.30. Residual Variance and Minimum Sum of Squares for the ARMA(1,1) Monthly
Rainfall Model.

                     α = 1.00           α = 0.50           α = 0.00
Station            σ_ε²     SS        σ_ε²     SS        σ_ε²     SS
Miami AP           0.921    265.2     0.922    265.5     0.925    266.4
N. New River       0.938    270.1     0.936    269.6     0.934    269.0
West Palm Beach    0.945    272.2     0.939    270.4     0.934    269.0
Belle Glade        0.948    273.0     0.947    272.7     0.949    273.3
Ortona Lock 2      0.941    271.0     0.939    270.4     0.941    271.0
Port Mayaca        0.950    273.6     0.949    273.3     0.953    275.6
St. Lucie          0.950    273.6     0.953    274.5     0.953    274.5
Daytona Beach      0.945    272.2     0.939    270.4     0.934    269.0
σ_ε² and the minimum sum of squares SS of these models are listed in
Table 6.30. Note the low value of the variance explained by the
ARMA(1,1) model (1 − σ_ε²), less than 8% for all the stations and
transformations, compared to 30% for the annual series (Table 6.24).
Here again, the use of the σ_ε² and SS statistics for selecting the best
transformation may be misleading for the same reasons discussed in the
previous section.
The effect of the transformation on the sum of squares surface is
illustrated in Figure 6.9 by the contour plots for three
transformations. Note that there is only a weakly optimal pair (φ₁, θ₁)
that minimizes the sum of squares. The contours are parallel to the
white noise line (Box and Jenkins, 1976, Fig. 3.11), and so is the
optimal solution. The three plots of Figure 6.9 show that the
transformation of the variable has little effect on the shape of these
contours and the location of the optimal line. Among the stations
analyzed, only the North New River station failed to pass the
Portmanteau goodness of fit test, and it failed for all three
transformations.
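The flat valley along the white noise line can be reproduced with a
short sketch. The function below computes the conditional sum of squares
of an ARMA(1,1) model in the Box and Jenkins sign convention,
z_t = φ₁ z_{t−1} + a_t − θ₁ a_{t−1}, with the starting values z_0 and
a_0 set to zero. This is an assumption made for illustration and is not
the FTMXL algorithm; the names `css_arma11` and `ss_surface` are
hypothetical, and the input is synthetic white noise rather than the
rainfall data.

```python
import numpy as np

def css_arma11(z, phi, theta):
    """Conditional SS of the residuals a_t = z_t - phi*z_{t-1} + theta*a_{t-1}."""
    a_prev, z_prev, ss = 0.0, 0.0, 0.0
    for zt in z:
        a = zt - phi * z_prev + theta * a_prev
        ss += a * a
        a_prev, z_prev = a, zt
    return ss

def ss_surface(z, grid):
    """SS on a square (phi, theta) grid; rows index phi, columns index theta."""
    return np.array([[css_arma11(z, p, t) for t in grid] for p in grid])

rng = np.random.default_rng(0)
z = rng.standard_normal(288)             # white noise, 24 years x 12 months
grid = np.linspace(-0.9, 0.9, 19)
surface = ss_surface(z, grid)
diagonal = np.diag(surface)              # grid points with phi == theta
print(diagonal.max() - diagonal.min())   # spread of SS along the white noise line
```

Along φ₁ = θ₁ the AR and MA terms cancel, the residuals equal the data,
and SS is constant up to rounding, so no single point on that line is
preferred by the least squares criterion.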
The closeness of the distribution of the transformed series to
normality is investigated through the average monthly skewness values
for the eight stations. These are listed in Table 6.31 along with the
skewness of the annual totals of the monthly transformed series. Note
the positive monthly skewness of the untransformed data and the high
negative monthly skewness of the log-transformed data. The square root
transformation did not completely reduce the skewness to zero. Thus,
the optimal transformation should be somewhere between 0.5 and 0.0,
which is in good agreement with the finding of Stidd (1953, 1970), who
recommended the cubic root (α = 1/3) transformation. Note that the
normalization of the monthly series results in an even more skewed
Figure 6.9 Sum of Squares Surface Contours for
Belle Glade Monthly Rainfall Series (monthly ARMA(1,1) model;
panels for α = 1.0, 0.5 and 0.0)
Table 6.31. Sensitivity of the Monthly and Annual Skewness to the Transformation
of Monthly Rainfall Series.

                     α = 1.00         α = 0.50         α = 0.00
Station            γ_m     γ_y      γ_m     γ_y      γ_m     γ_y
Miami AP           1.32    0.36     0.44    0.12     0.65    0.30
N. New River       1.09    0.21     0.19    0.07     1.05    0.62
West Palm Beach    1.06    0.00     0.35    0.20     0.61    0.26
Belle Glade        0.87    0.11     0.18    0.60     0.85    0.74
Ortona Lock 2      0.87    0.11     0.28    0.10     0.97    0.83
Port Mayaca        0.94    0.46     0.21    0.25     0.89    0.39
St. Lucie          0.95    0.35     0.15    0.59     1.41    1.86
Daytona Beach      1.06    0.00     0.35    0.20     0.61    0.26

γ_m: average monthly skewness
γ_y: yearly skewness
annual series. Table 6.32 gives the AIC and BIC evaluated by Equations
6.5.6 and 6.5.10, respectively. For this case too, these statistics
decrease rapidly with an increase of α; therefore, their use for
selecting the best transformation may be misleading.
6.5.6. Reliability of estimated parameters
A Monte Carlo simulation was performed in order to assess the
reliability of the annual ARMA(1,1) parameters when estimated from small
samples of observations and to assess the effect of the normalizing
transformation on this reliability. Using the final estimates of the
autoregressive and moving average parameters, 100 series of the same
length as the original series (24 years) are generated by the IMSL
subroutine FTGEN (IMSL, 1979). Then the autocovariance structure of
these series is analyzed by the IMSL subroutine FTAUTO, and an ARMA(1,1)
model is fitted to each of the 100 generated series. This procedure is
applied twice, first to the log transformed data (α = 0), and second to
the optimally transformed data (α = 1.3) for the annual total rainfall
series of the Belle Glade station. Reliability of the estimated
parameters is investigated through the statistics of the generated
series parameters. Table 6.33 summarizes these statistics, giving the
mean, standard deviation and skewness coefficient for the AR and MA
parameters, the residual variance and the average of the generated data,
for both transformations. Note the high variability (σ_g) of the
generated parameters compared to their mean, illustrating the
unreliability of such estimates in representing the population
parameters. On the other hand, the advantage of the optimal
transformation over the log transformation is obvious from the
statistics of the average of the generated series (last column of
Table 6.33), where the mean is much closer to zero
Table 6.32. Akaike and Bayesian Information Criteria for the Monthly ARMA(1,1)
Model.

                     α = 1.0                α = 0.50             α = 0.00
Station            AIC      BIC            AIC       BIC        AIC       BIC
Miami AP           19.61    12.28          300.62    707.20     621.84    629.16
N. New River       14.33    (illegible)    241.41    647.99     497.04    504.37
West Palm Beach    12.38    5.90           335.57    724.15     684.09    691.41
Belle Glade        11.34    4.01           223.61    630.19     419.42    466.75
Ortona Lock 2      14.32    6.10           215.95    622.53     446.91    454.24
Port Mayaca        10.83    5.68           222.21    628.78     457.90    465.23
St. Lucie          10.61    4.76           267.85    669.43     536.83    544.16
Daytona Beach      12.08    4.75           263.49    670.07     536.37    543.69
Table 6.33. Monte Carlo Simulation, Summary Statistics for the Log
and Optimally Transformed Data.

α      Statistic    φ₁        θ₁         σ_ε²      Average
0.0    x_o          0.1623    0.3574     0.7400    0.0000
       μ_g          0.0251    0.1217     0.6003    0.0080
       σ_g          0.6634    0.7777     0.4393    0.4663
       γ_g          1.5241    2.61079    5.5238    2.7939
1.3    x_o          0.1254    0.2649     0.6410    0.0000
       μ_g          0.0507    0.0814     0.5090    0.0006
       σ_g          0.6782    0.7895     0.3843    0.4184
       γ_g          1.7813    0.1782     0.4099    1.7095

x_o: original parameter
μ_g: mean of the generated parameters
σ_g: standard deviation of generated parameters
γ_g: skewness coefficient of generated parameters
with less variability and smaller skewness. Also note the relatively
lower skewness of the MA parameter and of the residual variance for the
optimal transformation. Thus, for simulation purposes and for a better
description of the analyzed hydrologic series, optimally transformed
data should be used for estimating the parameters, and the performance
of a fitted model should be evaluated from the statistics of the
generated data rather than from the parameters of the model. This
conclusion is reinforced by the autocorrelation analysis of these series.
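The structure of such an experiment can be sketched in a few lines. The
code below is a minimal stand-in under stated assumptions: it generates
Gaussian ARMA(1,1) series with NumPy instead of FTGEN, and it re-estimates
the parameters by a grid search on the conditional sum of squares instead
of FTAUTO and FTMXL. All function names are hypothetical, and the
parameter values are the magnitudes printed in Table 6.33, so the output
is not expected to reproduce that table.

```python
import numpy as np

def simulate_arma11(phi, theta, n, rng, burn=50):
    """Generate z_t = phi*z_{t-1} + a_t - theta*a_{t-1} with standard normal a_t."""
    a = rng.standard_normal(n + burn)
    z = np.zeros(n + burn)
    for t in range(1, n + burn):
        z[t] = phi * z[t - 1] + a[t] - theta * a[t - 1]
    return z[burn:]                       # drop the warm-up values

def fit_arma11_grid(z, grid):
    """Grid-search conditional least squares; returns (phi, theta, residual variance)."""
    phi, theta = np.meshgrid(grid, grid, indexing="ij")
    a_prev = np.zeros_like(phi)
    ss = np.zeros_like(phi)
    z_prev = 0.0
    for zt in z:
        a = zt - phi * z_prev + theta * a_prev    # residual at every grid node
        ss += a ** 2
        a_prev, z_prev = a, zt
    i, j = np.unravel_index(np.argmin(ss), ss.shape)
    return grid[i], grid[j], ss[i, j] / len(z)

def summarize(x):
    """Mean, standard deviation and skewness coefficient of a sample."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return x.mean(), x.std(ddof=1), np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

def monte_carlo(phi, theta, n=24, n_series=100, seed=1):
    """Generate n_series series, refit each one, and summarize the estimates."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-0.95, 0.95, 39)
    fits = np.array([fit_arma11_grid(simulate_arma11(phi, theta, n, rng), grid)
                     for _ in range(n_series)])
    return {name: summarize(fits[:, c])
            for c, name in enumerate(["phi", "theta", "resid_var"])}

stats = monte_carlo(phi=0.1254, theta=0.2649)
for name, (mean, sd, skew) in stats.items():
    print(f"{name:9s} mean={mean:7.3f} sd={sd:6.3f} skew={skew:7.3f}")
```

Comparing the standard deviation of the refitted parameters with their
mean gives the same kind of reliability measure as the σ_g and μ_g rows
of Table 6.33.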
Figures 6.10 and 6.11 give plots of the autocorrelation functions of
the observed data (sample), the average over the 100 generated series,
and the theoretical function (ACF) for α = 0 and α = 1.3, respectively.
The theoretical autocorrelation is evaluated by the following expression
(Box and Jenkins, 1976, p. 76)

$$ \mathrm{ACF}(k) = \frac{(1 - \phi_1\theta_1)(\phi_1 - \theta_1)}{1 + \theta_1^2 - 2\phi_1\theta_1}\,\phi_1^{\,k-1}, \qquad k = 1, 2, \ldots \tag{6.5.11} $$

where φ₁ and θ₁ are the original parameters used for the generation.
Note the rapid decay of the theoretical ACF, the periodic trend of the
sample ACF and the closeness of the average ACF to the theoretical ACF,
although all three ACFs are within the 95% confidence limits after lag 1.
These limits are approximated by ±2/√n, where n is the sample size
(Box and Jenkins, 1976, p. 35, 178; Salas et al., 1980, p. 49). From
these two figures it is hardly possible to tell the difference between
the ACF for the log and optimal transformations; thus, model selection
should not be based on the comparison of such statistics.
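The comparison described above can be sketched as follows. The
theoretical function uses the standard ARMA(1,1) result of Box and
Jenkins, ρ₁ = (1 − φ₁θ₁)(φ₁ − θ₁)/(1 + θ₁² − 2φ₁θ₁) and ρ_k = φ₁ρ_{k−1}
for k ≥ 2; the ±2/√n band is the usual large-sample approximation to the
95% limits. The function names are hypothetical, and the parameter values
are the magnitudes printed in Table 6.33.

```python
import numpy as np

def arma11_acf(phi, theta, max_lag):
    """Theoretical ACF of an ARMA(1,1) process for lags 1..max_lag."""
    rho1 = (1 - phi * theta) * (phi - theta) / (1 + theta ** 2 - 2 * phi * theta)
    return rho1 * phi ** np.arange(max_lag)        # rho_k = rho_1 * phi**(k-1)

def sample_acf(z, max_lag):
    """Sample autocorrelation coefficients r_1..r_max_lag."""
    z = np.asarray(z, dtype=float)
    d = z - z.mean()
    c0 = np.sum(d * d)
    return np.array([np.sum(d[k:] * d[:-k]) / c0 for k in range(1, max_lag + 1)])

n = 24
limit = 2 / np.sqrt(n)                             # approximate 95% limits
rho = arma11_acf(0.1254, 0.2649, 5)
print(np.abs(rho) < limit)                         # which lags fall inside the band
```

Two limiting cases are useful checks: with θ₁ = 0 the function reduces to
the AR(1) result φ₁^k, and with φ₁ = θ₁ it returns zero at every lag,
which is the white noise case discussed in Section 6.5.5.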
6.6. Summary and Conclusions
The potential of applying the reliability based approach to the
estimation of parameters for deterministic, probabilistic and stochastic
Figure 6.10 Annual Series Autocorrelation Functions, α = 0.0
Figure 6.11 Annual Series Autocorrelation Functions, α = 1.3
models was illustrated by this case study. The need to comply with the
normality assumption implied by the least squares estimation procedure
in all three types of models was emphasized by a sensitivity analysis of
the selection statistics to the shape of the distribution
(transformation α). The use of deterministic models for continuous
simulation of urban runoff quantity and quality proved to be a very
promising tool for the description of such processes when the models are
calibrated over several storms representative of average hydrological
conditions. Despite the conceptual nature of these models, their
estimated parameters often lose their physical meaning during the
calibration and verification stages. The use of lumped parameters, such
as the basin width for the Runoff block of SWMM, along with calibration
for more than one storm simultaneously, avoided most such losses of
meaning and increased the reliability of the estimates. More insight
into the simulated processes may be gained by a distribution free
analysis of the generated series using STATS or SYNOP, or by a
probabilistic or stochastic modeling of the generated series.
Statistics such as the AIC and BIC, recommended by Rao et al. (1982)
for the selection between no transformation, the square root
transformation and the logarithmic transformation, were extended to the
Box-Cox transformation, and were shown to select transformations
different from the normalizing transformation. Thus, selection of the
best transformation to normality (or to any other distribution from the
types analyzed in Chapter 3) should be based on a frequency analysis
similar to the one performed by the GPDCP program. The advantage of the
normalizing transformation was illustrated by a Monte Carlo simulation,
where simulated series statistics were found to be much more reliable
than those of the associated parameters.
CHAPTER 7
SUMMARY AND CONCLUSIONS
7.1. Summary
Hydrologic models were classified into deterministic, probabilistic
and stochastic models, according to the processes involved.
Independently of the type of model, an assumption about the probability
distribution of the modeled variables is always implied, if not
explicitly stated, for classifying the models and estimating their
parameters. Reliability analysis of their parameter estimates requires
some probabilistic statement based on probability distribution models.
Problems in estimating the parameters of such models were shown to
result mainly from a poor parameterization of the probability
distributions. A new parameterization, in which the shape of the
distribution is modeled by the Box-Cox transformation, performed much
better than the classical three parameter probability distributions.
The algorithm used by the Generalized Probability Distribution Computer
Program (GPDCP) for estimating the new generalized distributions, in
which linear parameters are estimated separately from the nonlinear
parameters, showed no convergence problems, while other algorithms, such
as Kite's (1977) maximum likelihood or the SAS nonlinear procedure,
failed to converge.
Modeling of the shape of the distribution by the Box-Cox
transformation parameter (α) allowed the derivation of simple relations
between the moments of transformed and original variables. Such
relations allowed the extension of the second moment reliability theory
to a third order reliability theory by accounting explicitly for the
deviation from normality through the parameter α. Generalized
reliability indices were derived and reliability tables and plots were
generated for different probability levels and transformations.
Application of the reliability based approach to the three types of
hydrologic models was illustrated by a case study in south Florida for
which annual, monthly and hourly rainfall series at eight NWS stations,
and storm event rainfall, runoff and quality data from four urban basins
were the basis of the analysis.
7.2. Conclusions
The main conclusions of this research may be summarized as
follows:
1. The new generalized probability distributions performed much better
than most classical probability models known in the hydrology and
reliability fields. This was a result of their simple form (Equation
7.2.1) and of the separation of the estimation of the linear parameters
A and B from the estimation of the nonlinear parameter, α (Equation
7.2.2),

$$ Y = A + BZ \tag{7.2.1} $$

$$ Y = \begin{cases} \dfrac{y^{\alpha} - 1}{\alpha}, & \alpha \neq 0 \\[1ex] \log y, & \alpha = 0 \end{cases} \tag{7.2.2} $$

Thus, they may be a good substitute for the log-Pearson Type III
distributions recommended by the U.S. Water Resources Council and for
the generalized gamma distribution used in reliability analysis, both of
which exhibited poor performance due to their inadequate
parameterization.
2. Among the four selection statistics considered in this study, the
coefficient of determination (R²) performed the best, followed by the
maximum likelihood function (MXLF) and the weighted sum of squares
(WSS), with the standard error (SIDE) ranking last. "Best" is used in
the sense that important features, such as the form (linearity) and
shape (skewness) of the selected model, comply with the expected ones
(e.g., the optimal transformation for the generalized normal
distribution should lead to a linear plot of transformed variables
versus reduced variates, and to a skewness close to zero). None of the
four statistics, nor the shape parameter, α, was sensitive to the scale
(magnitude) of the data, except for the generalized Pearson
distribution, for which the shape of the distribution is modeled by two
parameters.
3. Selection statistics and the transformation parameter showed no
sensitivity to the definition of the plotting position for the four
generalized families of distributions.
4. A third order reliability analysis that explicitly includes the
transformation parameter in the formulation of different reliability
indices was made possible by the derivation of new relations between
the moments of the original and transformed variables. The importance
of such relations in reliability assessment is illustrated by the
generalized tables and plots of reliability for different
transformations (Chapter 5). Important decision elements, such as the
design period (project life), were found to have less effect on the
design event magnitude than does the transformation parameter, α.
5. Continuous simulation, via an adequately calibrated deterministic
model, such as SWMM, reproduced most of the observed variability of the
modeled processes. The spatial variability of the rainfall inputs was
overshadowed by their time variability, confirming the validity of the
use of point rainfall for continuous simulation for the south Florida
examples.
6. Event based statistics generated from the continuous simulation may
give more insight into the modeled hydrologic processes, since they
allow the analysis of the event durations, magnitudes (volumes),
intensities and times between events. Each of these characteristics may
be analyzed separately according to the problem at hand.
7. Stochastic model selection based on optimal decision criteria, such
as the Akaike Information Criterion (AIC) and Bayesian Information
Criterion (BIC) extended to the Box-Cox transformation, performed very
poorly in selecting the normalizing transformation compared to the
reliability based approach of the GPDCP program. This conclusion was
confirmed by Monte Carlo simulation.
8. The reliability based approach, followed in this research for
parameter estimation and reliability analysis, is not confined to the
modeling of hydrologic processes. It may be applied to other
engineering fields such as material sciences, electrical engineering,
etc., and to economic analysis.
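The separation of linear and nonlinear parameters mentioned in
conclusion 1 can be sketched as follows: for each trial α the data are
transformed by Equation 7.2.2, A and B follow from ordinary least
squares on Equation 7.2.1, and the search over α is one-dimensional.
This is only an illustration, not the GPDCP algorithm. It assumes a
normal reduced variate Z, Weibull plotting positions i/(n + 1) and R² as
the selection statistic, and all function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def box_cox(y, alpha):
    """Box-Cox transformation (Equation 7.2.2)."""
    y = np.asarray(y, dtype=float)
    if abs(alpha) < 1e-12:
        return np.log(y)
    return (y ** alpha - 1.0) / alpha

def normal_reduced_variates(n):
    """Standard normal quantiles at the Weibull plotting positions i/(n+1)."""
    return np.array([NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)])

def fit_for_alpha(y, alpha):
    """Linear step: least squares A, B in Y = A + B*Z for fixed alpha; returns (A, B, R2)."""
    Y = box_cox(np.sort(y), alpha)
    Z = normal_reduced_variates(len(Y))
    B, A = np.polyfit(Z, Y, 1)
    resid = Y - (A + B * Z)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((Y - Y.mean()) ** 2)
    return A, B, r2

def select_alpha(y, alphas):
    """Nonlinear step: one-dimensional search over alpha, here by largest R2."""
    fits = [(fit_for_alpha(y, a)[2], a) for a in alphas]
    return max(fits)[1]

# Synthetic check: exact lognormal quantiles are straightened by alpha = 0.
y = np.exp(2.0 + 0.5 * normal_reduced_variates(30))
alphas = np.linspace(-1.0, 2.0, 31)
print(select_alpha(y, alphas))
```

Because the inner step is a closed-form regression, the only iterative
part is the scan over α, which is one plausible reason such a scheme
avoids the convergence problems of a joint nonlinear search.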
7.3. Suggestions for Further Research
As a continuation of this research, the following points may be
investigated further:
1. Relations between the transformation parameter α and the skewness
coefficient of the original data may be derived directly from the
relation between the first three moments, or by simulation. Preliminary
results based on the annual rainfall series at the eight NWS stations of
the case study suggested the following empirical relation

γ = [coefficient illegible] (1 − α)    (7.3.1)

More investigation is needed to confirm such a result.
2. Reliability analysis based on other than the normal generalized
distribution may be performed, starting with Equation 3.1.17 and the
relations between the moments of the transformed and original data
(Equations 5.3.8 and 5.3.11), which are independent of the distribution.
Thus, such an analysis will be straightforward once the location and
scale parameters are related to the moments of the transformed
variables. Such a relation would be even more practical for the
generalized extreme value distribution (GED) and generalized Rayleigh
distribution (GRD), for which explicit expressions for the reduced
variate z do exist (Table 3.5).
3. The reliability based approach may be applied for the automatic
calibration of deterministic models, such as SWMM, by searching for the
optimal transformation along with the model parameters.
4. The apparent poor performance of the AIC and BIC in selecting the
optimal transformation may be investigated by Monte Carlo simulation in
order to assess their reliability.
APPENDIX A
LINEAR REGRESSION
Equation 2.2.2 is linear in the two parameters A and B,

$$ Y = A + BZ \tag{A.0} $$

Least squares estimates of A, B and σ_Y², the intercept, slope and
variance, respectively, may be found in many statistical textbooks, e.g.,
Draper and Smith (1966), Haan (1977). These are the solution of the
normal equations defined by Equation 2.2.5,

$$ \frac{\partial SS}{\partial A} = -2\sum_{i=1}^{n} (Y_i - A - BZ_i) = 0, \qquad \frac{\partial SS}{\partial B} = -2\sum_{i=1}^{n} (Y_i - A - BZ_i)\,Z_i = 0 \tag{A.1} $$

which after some manipulation reduce to

$$ A\,n + B\sum Z_i = \sum Y_i, \qquad A\sum Z_i + B\sum Z_i^2 = \sum Z_i Y_i \tag{A.2} $$

a system of two equations with two unknowns. Equations A.2 may be
easily solved (e.g., by substitution) for the estimates A and B,

$$ A = \bar{Y} - B\bar{Z}, \qquad B = \frac{\sum ZY - n\bar{Z}\bar{Y}}{\sum Z^2 - n\bar{Z}^2} \tag{A.3} $$

where Z̄ and Ȳ are the sample means of Z and Y, respectively, n is the
sample size, and Σ denotes the summation from 1 to n.
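An unbiased estimate of the variance about the fitted line is given
by

$$ \hat{\sigma}_Y^2 = \frac{\sum (Y - A - BZ)^2}{n - 2} \tag{A.4} $$

which after some rearrangement reduces to

$$ \hat{\sigma}_Y^2 = \frac{\sum (Y - \bar{Y})^2 - B^2 \sum (Z - \bar{Z})^2}{n - 2} \tag{A.5} $$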
Equation A.0 may be rewritten in the following form as a function
of the estimates A and B to calculate the predicted value of Y, Ŷ,

$$ \hat{Y} = \bar{Y} + B(Z - \bar{Z}) \tag{A.6} $$

Since Ȳ and B are random variables, Ŷ is also a random variable, and
its variance is

$$ \mathrm{Var}(\hat{Y}) = \mathrm{Var}(\bar{Y}) + (Z - \bar{Z})^2\,\mathrm{Var}(B) + 2(Z - \bar{Z})\,\mathrm{Cov}(\bar{Y}, B) \tag{A.7} $$

From Draper and Smith (1966, pp. 19-22) we have

$$ \mathrm{Var}(\bar{Y}) = \frac{\sigma_u^2}{n}, \qquad \mathrm{Var}(B) = \frac{\sigma_u^2}{\sum (Z - \bar{Z})^2}, \qquad \mathrm{Cov}(\bar{Y}, B) = 0 . $$

Substitution into Equation A.7 leads to

$$ \mathrm{Var}(\hat{Y}) = \left[\frac{1}{n} + \frac{(Z - \bar{Z})^2}{\sum (Z - \bar{Z})^2}\right] \sigma_u^2 \tag{A.8} $$

where σ_u² is a measure of the average scatter of the data about the
regression line. This variance is often estimated from the data by
Equation A.5 after some additional correction for bias. Raiffa and
Schlaifer (1961, Eq. 11.39) used the following estimate

$$ \hat{\sigma}_u^2 = \frac{n - 1}{n - 3}\,\hat{\sigma}_Y^2 \tag{A.9} $$

to account for the uncertainty in the estimate of σ_Y² (Equation A.5).
The total variability of any individual prediction is given by
adding σ_u² to Equation A.8,

$$ \mathrm{Var}(Y_i) = \left[1 + \frac{1}{n} + \frac{(Z_i - \bar{Z})^2}{\sum (Z - \bar{Z})^2}\right] \frac{n - 1}{n - 3}\,\hat{\sigma}_Y^2 \tag{A.10} $$
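A minimal numerical sketch of the formulas in this appendix is given
below. It implements the closed-form estimates of Equation A.3, the
residual variance of Equation A.4 and the bracketed terms of Equations
A.8 and A.10, using the residual variance itself as the estimate of the
scatter about the line; the additional Raiffa and Schlaifer correction
factor is deliberately left out. The function names and the small data
set are invented for illustration.

```python
import numpy as np

def fit_line(z, y):
    """Least squares A, B (Equation A.3) and residual variance (Equation A.4)."""
    z, y = np.asarray(z, dtype=float), np.asarray(y, dtype=float)
    n = len(z)
    zbar, ybar = z.mean(), y.mean()
    B = (np.sum(z * y) - n * zbar * ybar) / (np.sum(z * z) - n * zbar ** 2)
    A = ybar - B * zbar
    s2 = np.sum((y - A - B * z) ** 2) / (n - 2)
    return A, B, s2

def prediction_variances(z, y, z_new):
    """Variance of the fitted mean and of an individual prediction at z_new."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    _, _, s2 = fit_line(z, y)
    leverage = 1.0 / n + (z_new - z.mean()) ** 2 / np.sum((z - z.mean()) ** 2)
    return leverage * s2, (1.0 + leverage) * s2

z = [0.0, 1.0, 2.0, 3.0, 4.0]      # illustrative reduced variates
y = [1.1, 2.9, 5.2, 6.8, 9.1]      # illustrative transformed observations
print(fit_line(z, y))
print(prediction_variances(z, y, 2.0))
```

Both variances are smallest at the mean of Z and grow quadratically away
from it, which is why confidence bands about a fitted line widen toward
the extremes.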
APPENDIX B
EXACT RELATIONS BETWEEN MOMENTS OF THE NORMAL
AND LOGNORMAL DISTRIBUTIONS
Relations between the moments of the normal (N) and the lognormal
(LN) distributions are very important for confidence limit calculations
and reliability analysis. These relations have been derived by Aitchison
and Brown (1957). They are rederived here using the moment generating
function of the normal distribution. Assuming Y is normally
distributed, its pdf is

$$ f_Y(Y) = \frac{1}{\sigma_Y\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{Y - \mu_Y}{\sigma_Y}\right)^2\right] \tag{B.1} $$

and its moment generating function is by definition

$$ M_Y(t) = \int_{-\infty}^{\infty} e^{tY} f_Y(Y)\,dY \tag{B.2} $$

Replacing f_Y(Y) by its value (Equation B.1) yields after some
simplifications

$$ M_Y(t) = e^{t\mu_Y + t^2\sigma_Y^2/2}\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(z - \sigma_Y t)^2}\,dz \tag{B.3} $$

with z = (Y − μ_Y)/σ_Y. The integral in the above equation is of the
standard form

$$ \int_{-\infty}^{\infty} e^{-\frac{1}{2}u^2}\,du \tag{B.4} $$

where u = z − σ_Y t, and its value is √(2π) (Beyer, 1978, p. 380,
Eqn. 663). So Equation B.3 reduces to

$$ M_Y(t) = e^{t\mu_Y + t^2\sigma_Y^2/2} \tag{B.5} $$
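For α = 0 (Y = log y), the expected moment of order n about the origin
of y is

$$ E(y^n) = \int_{-\infty}^{\infty} y^n f_Y(Y)\,dY \tag{B.6} $$

Replacing y by e^Y leads to

$$ E(y^n) = \int_{-\infty}^{\infty} e^{nY} f_Y(Y)\,dY \tag{B.7} $$

which is the same as Equation B.2 with t = n, thus

$$ E(y^n) = M_Y(n) = e^{n\mu_Y + n^2\sigma_Y^2/2} \tag{B.8} $$

From this equation, relations between the moments of y and Y (log y)
are straightforward,

$$ \mu_y = E(y) = e^{\mu_Y + \sigma_Y^2/2} \tag{B.9} $$

$$ \sigma_y^2 = E(y^2) - E(y)^2 = \mu_y^2\left(e^{\sigma_Y^2} - 1\right) \tag{B.10} $$

$$ V_y = \frac{\sigma_y}{\mu_y} = \left(e^{\sigma_Y^2} - 1\right)^{1/2} \tag{B.11} $$

$$ \gamma_y = \frac{E(y^3) - 3E(y^2)E(y) + 2E(y)^3}{\sigma_y^3} = \frac{e^{3\sigma_Y^2} - 3e^{\sigma_Y^2} + 2}{V_y^3} = 3V_y + V_y^3 \tag{B.12} $$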
The moments of the transformed variable (Y) may be expressed in
terms of the original moments. In particular, from Equation B.11 we
have

$$ \sigma_Y^2 = \log(1 + V_y^2) \tag{B.13} $$

Substitution of this expression into Equation B.9 gives

$$ \mu_Y = \log(\mu_y) - \tfrac{1}{2}\log(1 + V_y^2) \tag{B.14} $$

The coefficient of variation is then

$$ V_Y = \frac{\sigma_Y}{\mu_Y} = \frac{\left[\log(1 + V_y^2)\right]^{1/2}}{\log\left[\mu_y / (1 + V_y^2)^{1/2}\right]} \tag{B.15} $$
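The relations of this appendix are easy to check numerically. The
sketch below evaluates Equations B.9 to B.12 for given μ_Y and σ_Y and
inverts them through Equations B.13 and B.14; the function names and the
test values 1.0 and 0.5 are illustrative choices, not values from the
case study.

```python
import math

def lognormal_moments(mu_Y, sigma_Y):
    """Mean, variance, CV and skewness of y = exp(Y) with Y ~ N(mu_Y, sigma_Y**2)."""
    mean = math.exp(mu_Y + sigma_Y ** 2 / 2)        # Equation B.9
    cv = math.sqrt(math.exp(sigma_Y ** 2) - 1)      # Equation B.11
    var = (mean * cv) ** 2                          # Equation B.10
    skew = 3 * cv + cv ** 3                         # Equation B.12
    return mean, var, cv, skew

def normal_parameters(mean, cv):
    """Inverse relations: mu_Y and sigma_Y from the mean and CV of y (B.13, B.14)."""
    sigma2_Y = math.log(1 + cv ** 2)
    mu_Y = math.log(mean) - 0.5 * sigma2_Y
    return mu_Y, math.sqrt(sigma2_Y)

mean, var, cv, skew = lognormal_moments(1.0, 0.5)
print(mean, var, cv, skew)
print(normal_parameters(mean, cv))
```

The second call recovers the starting values 1.0 and 0.5 up to rounding,
confirming that Equations B.13 and B.14 are the inverse of Equations B.9
and B.11.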
APPENDIX C
PROBABILITY DISTRIBUTIONS
C.1. Discrete Distributions
C.1.1. Binomial distribution
One distribution arising from a Bernoulli process is the binomial
distribution. A Bernoulli process is characterized by: 1. the
independence of its events; 2. an event may either occur or not occur
within a given trial (location or time) with probabilities p and
q = 1 − p, respectively; and 3. the probability p does not change with
time. The binomial probability mass function is

$$ p(y) = \binom{n}{y} p^{y} q^{\,n-y} \tag{C.1.1} $$

where the binomial coefficient defines the number of combinations of n
events taken y at a time,

$$ \binom{n}{y} = \frac{n!}{y!\,(n-y)!} $$

The probability p(y) defines the probability of y successes in n trials.
The binomial distribution has the following moments,
mean μ = np,
variance σ² = npq,
and skewness γ = (q − p)/√(npq).
The binomial distribution is one of the most commonly used discrete
distributions (Chow, 1964).
C.1.2. Geometric distribution
The probability that the first success of a Bernoulli process
occurs at the yth trial is given by the geometric distribution. The
geometric probability mass function is

$$ p(y) = (1 - p)^{y-1}\,p \tag{C.1.2} $$

and its moments are
mean μ = 1/p,
variance σ² = (1 − p)/p²,
and skewness γ = (2 − p)/(1 − p)^{1/2}.
The geometric distribution may be used to describe the probability
distribution of the length of time between occurrences, by noting that
the probability that y trials elapse between occurrences is the same as
the probability that the first occurrence is at the (y + 1)st time step,

$$ p(y+1) = (1 - p)^{y}\,p $$
C.1.3. Negative binomial distribution
The probability that the kth success occurs on the yth trial of a
Bernoulli process is given by the negative binomial distribution. Its
probability mass function is

$$ p(y; k, p) = \binom{y-1}{k-1} p^{k} (1 - p)^{y-k} \tag{C.1.3} $$

with y = k, k+1, ... The mean and variance of the negative binomial
distribution are μ = k/p and σ² = k(1 − p)/p², respectively.
C.1.4. Poisson distribution
If in the binomial process n gets very large and p gets very small
such that np, the expected number of successes, remains constant and
equal to λ, the binomial distribution approaches the Poisson
distribution, with mass function

$$ p(y) = \frac{e^{-\lambda}\lambda^{y}}{y!} \tag{C.1.4} $$

where y = 0, 1, 2, ...; λ > 0. The mean and variance of this
distribution are μ = λ and σ² = λ, respectively, and the skewness
coefficient is

$$ \gamma = \lambda^{-1/2} $$

showing that as the expected number of successes λ increases, the
skewness of the Poisson distribution decreases, approaching symmetry as
λ → ∞.
The Poisson process is a discrete process on a continuous time
scale (Haan, 1977). Therefore, the number of events in a fixed time has
a discrete distribution, while the time between events and the time to
the yth event have continuous probability distributions. These are the
exponential and gamma distributions, which are introduced in the next
section.
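The limiting behavior described above can be checked numerically. The
sketch below compares the binomial mass function with p = λ/n against
the Poisson mass function for increasing n; the value λ = 3 and the
function names are arbitrary illustrative choices.

```python
import math

def binomial_pmf(y, n, p):
    """Binomial probability of y successes in n trials."""
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)

def poisson_pmf(y, lam):
    """Poisson probability of y events with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def max_gap(n, lam, y_max=10):
    """Largest absolute difference between the two mass functions for y = 0..y_max."""
    return max(abs(binomial_pmf(y, n, lam / n) - poisson_pmf(y, lam))
               for y in range(y_max + 1))

lam = 3.0
for n in (10, 100, 10000):
    print(n, max_gap(n, lam))
```

The printed gap shrinks as n grows with np held at λ, which is the sense
in which the binomial distribution approaches the Poisson distribution.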
C.2. Continuous Distributions
C.2.1. Uniform distribution
The uniform distribution gives the probability of a continuous
random process defined over an interval a to b. Its probability density
function (pdf) is given by

$$ f(y) = \begin{cases} \dfrac{1}{b - a}, & a < y < b \\[1ex] 0, & \text{elsewhere.} \end{cases} \tag{C.2.1} $$

The moments of the uniform distribution are
mean μ = (a + b)/2, and
variance σ² = (b − a)²/12.
The uniform distribution is used mainly for the generation of
random observations from a given probability distribution. The
cumulative distribution function (CDF) of any continuous distribution is
uniformly distributed between 0 and 1, thus the generation problem
reduces to finding the inverse of the CDF.
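The generation idea in the last paragraph can be sketched for a
distribution whose CDF inverts in closed form. The example assumes the
exponential CDF F(y) = 1 − e^{−λy}, whose inverse is y = −ln(1 − u)/λ;
the function names, the seed and the value λ = 2 are illustrative.

```python
import math
import random

def exponential_inverse_cdf(u, lam):
    """Inverse of F(y) = 1 - exp(-lam*y) for 0 <= u < 1."""
    return -math.log(1.0 - u) / lam

def sample_exponential(n, lam, seed=0):
    """Draw n exponential variates by transforming uniform(0, 1) numbers."""
    rng = random.Random(seed)
    return [exponential_inverse_cdf(rng.random(), lam) for _ in range(n)]

draws = sample_exponential(10000, lam=2.0)
print(sum(draws) / len(draws))   # should be close to the theoretical mean 1/lam
```

For distributions without a closed-form inverse, the same scheme applies
with a numerical inversion of the CDF in place of the formula.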
C.2.2. Exponential distribution
The exponential distribution was mentioned in the previous section
in connection with the Poisson process as the probability distribution
of the time between occurrences of events. The exponential distribution
is also widely used in reliability analysis as the distribution of the
time to failure (Haan, 1977). This distribution is used in Chapter 6 in
the analysis of the storm event based statistics.
The exponential density function is

$$ f(y) = \lambda e^{-\lambda y} \tag{C.2.2} $$

with y, λ > 0. The mean and variance of the exponential distribution
are μ = 1/λ and σ² = 1/λ².
The coefficient of variation, defined as the ratio of the standard
deviation to the mean, is equal to 1. The exponential distribution has
a constant positive skewness coefficient γ = 2.
C.2.3. Gamma distribution
The gamma distribution was mentioned as the probability distribution
of the time to the kth event in a Poisson process. This time is
the sum of k exponentially distributed independent variables each with
the same parameter $\lambda$; the pdf of the gamma distribution takes the form
\[
f(y) = \frac{\lambda^k y^{k-1} e^{-\lambda y}}{\Gamma(k)} \tag{C.2.3a}
\]
with $y, \lambda, k > 0$, where $\Gamma(k)$ is the tabulated gamma function defined as
\[
\Gamma(k) = \int_0^\infty t^{k-1} e^{-t}\, dt . \tag{C.2.3b}
\]
The moments of the gamma distribution are
\[
\text{mean} \quad \mu = \frac{k}{\lambda}, \qquad
\text{variance} \quad \sigma^2 = \frac{k}{\lambda^2}, \qquad
\text{and skewness} \quad \gamma = \frac{2}{\sqrt{k}} . \tag{C.2.3c}
\]
The three parameter gamma distribution is often known in hydrology
as the Pearson Type III distribution. The third parameter is a location
parameter introduced into Equation C.2.3a to yield the pdf of the
Pearson Type III distribution
\[
f(y) = \frac{\lambda^k (y-a)^{k-1} e^{-\lambda (y-a)}}{\Gamma(k)} \tag{C.2.4}
\]
with $(y-a), \lambda, k > 0$. Only the mean of the distribution is affected by
the addition of the location parameter,
\[
\mu = a + \frac{k}{\lambda} .
\]
The variance and skewness are the same as those of the gamma distribution
(Equation C.2.3c).
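The description of the gamma variate as a sum of k independent exponential waiting times can be sketched in Python as follows. This is a minimal illustration restricted to integer k; the function names and the use of Python's random module are assumptions made for the example, and the moment formulas are the standard ones for the two-parameter gamma distribution.

```python
import math
import random


def sample_gamma_integer_k(k, lam, rng):
    """Time to the k-th Poisson event as a sum of k exponential waits."""
    if k < 1 or int(k) != k:
        raise ValueError("this sketch handles positive integer k only")
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    # Each wait is an inverse-CDF transform of a uniform(0, 1) number.
    return sum(-math.log(1.0 - rng.random()) / lam for _ in range(int(k)))


def gamma_moments(k, lam):
    """Mean, variance and skewness of the two-parameter gamma distribution."""
    return k / lam, k / lam**2, 2.0 / math.sqrt(k)
```

Comparing the sample mean and variance of a large number of simulated values with gamma_moments(k, lam) gives a quick consistency check of the construction.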
C.2.5. Beta distribution
The beta distribution is limited at both ends. It is generally
defined over the interval 0 to 1, but can easily be transformed to any
interval a to b. The limits a and b are usually estimated separately
based on prior knowledge of the modeled process, reducing the number of
parameters to two. The pdf of the beta distribution is
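\[
f(y) = \frac{1}{B(\alpha,\beta)}\, y^{\alpha-1} (1-y)^{\beta-1}
\]
with $0 < y < 1$ and $\alpha, \beta > 0$, where
\[
B(\alpha,\beta) = \int_0^1 x^{\alpha-1} (1-x)^{\beta-1}\, dx
\]
is the beta function. The beta function can be evaluated from its
relation to the gamma function
\[
B(\alpha,\beta) = \frac{\Gamma(\alpha)\, \Gamma(\beta)}{\Gamma(\alpha+\beta)} . \tag{C.2.6}
\]
The moments of the beta distribution are
\[
\text{mean} \quad \mu = \frac{\alpha}{\alpha+\beta}, \quad \text{and} \quad
\text{variance} \quad \sigma^2 = \frac{\alpha\beta}{(1+\alpha+\beta)(\alpha+\beta)^2} .
\]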
C.2.6. Normal distribution
The normal distribution is the most important and most widely used
continuous probability distribution. Yevjevich (1972a) gives five areas
of application of the normal distribution in hydrology:
"(1) fitting symmetrical empirical frequency distributions of
hydrologic random variables;
(2) analysis of random errors distribution;
(3) a bench mark distribution for comparison with other distributions;
(4) statistical inferences about the assumption of normality of
many hydrologic statistical parameters; and
(5) for Monte Carlo simulation where normally distributed random
numbers are first generated before being transformed to other types
of distributions."
The probability density function of the normal distribution is
\[
f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y-\mu}{\sigma} \right)^2 \right] \tag{C.2.7}
\]
where y is a continuous random variable defined over the interval $-\infty$ to
$+\infty$ with mean $\mu$ and variance $\sigma^2$. A common notation of the normal
distribution is $N(\mu,\sigma^2)$. The standardized form of the normal distribution has
zero mean and unit variance and is symbolized by N(0,1); its pdf is
\[
f_s(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \tag{C.2.8}
\]
with $z = (y-\mu)/\sigma$, $-\infty < z < +\infty$,
where $\mu$ and $\sigma$ are the location and scale parameters, respectively (Mann
et al., 1974, p. 74).
The normal distribution is symmetrical with skewness equal to zero.
Although it is popular, the normal distribution cannot be fitted (in a
strict theoretical sense) to most hydrological variables, such as rainfall,
streamflow, reservoir storage, etc. These variables assume only
positive values with finite mean $\mu$ and lower bound $x_0$; even their
standardized form has a lower boundary equal to $-(\mu - x_0)/\sigma$, while the
normal distribution is defined over the entire real domain $(-\infty, +\infty)$.
However, Yevjevich (1972a) noted that if the mean is larger than $3\sigma$ the
probability that the value x reaches the lower boundary is very small
and can be neglected for many practical purposes.
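The size of the neglected probability can be checked with a short Python sketch. It assumes a lower bound of zero by default and evaluates the normal CDF through the complementary error function; the function names are illustrative only.

```python
import math


def normal_cdf(z):
    """Standard normal CDF expressed through the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def prob_below_bound(mu, sigma, x0=0.0):
    """Probability that a N(mu, sigma^2) variate falls below the bound x0."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return normal_cdf((x0 - mu) / sigma)
```

For a mean exactly three standard deviations above a zero lower bound, prob_below_bound(3.0, 1.0) is about 0.00135, which is the order of magnitude behind the remark that the bound can often be ignored in practice.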
C.2.7. Chi-square distribution
The sum of squares of $\nu$ independent, normally distributed standardized
variables has a chi-square distribution with $\nu$ degrees of freedom. This is a
special case of the gamma distribution, Equation C.2.3a, with $\lambda = 1/2$ and
$k = \nu/2$, where $\nu$ is the single parameter of the chi-square distribution.
The pdf of the chi-square distribution is
\[
f(y) = \frac{y^{\nu/2-1}\, e^{-y/2}}{2^{\nu/2}\, \Gamma(\nu/2)} \tag{C.2.9}
\]
and its moments are
\[
\text{mean} \quad \mu = \nu , \qquad
\text{variance} \quad \sigma^2 = 2\nu , \qquad
\text{and skewness} \quad \gamma = 2\sqrt{2/\nu} .
\]
The chi-square distribution is mainly used in statistical inferences
and hypothesis testing.
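As a worked check, the chi-square moments follow directly from the gamma moments $\mu = k/\lambda$, $\sigma^2 = k/\lambda^2$ and $\gamma = 2/\sqrt{k}$ once $\lambda = 1/2$ and $k = \nu/2$ are substituted. The lines below are only this substitution written out:

\[
\mu = \frac{\nu/2}{1/2} = \nu , \qquad
\sigma^2 = \frac{\nu/2}{(1/2)^2} = 2\nu , \qquad
\gamma = \frac{2}{\sqrt{\nu/2}} = 2\sqrt{\frac{2}{\nu}} .
\]

The same substitution in the gamma pdf gives $\lambda^k = 2^{-\nu/2}$ and $e^{-\lambda y} = e^{-y/2}$, which is the form of the chi-square pdf.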
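C.2.8. Student t distribution
A variable y has a Student t distribution with $\nu$ degrees of freedom
if it is defined by
\[
y = \frac{z}{\sqrt{u/\nu}}
\]
where z is a standardized normal variate and u is a chi-square variate
with $\nu$ degrees of freedom. The pdf of the Student distribution is
\[
f(y) = \frac{\Gamma\left( \frac{\nu+1}{2} \right) \left( 1 + y^2/\nu \right)^{-(\nu+1)/2}}{\sqrt{\pi\nu}\; \Gamma(\nu/2)} \tag{C.2.10}
\]
with $-\infty < y < \infty$ and $\nu > 0$.
The moments of this distribution are
\[
\text{mean} \quad \mu = 0 , \qquad
\text{variance} \quad \sigma^2 = \frac{\nu}{\nu-2} \quad \text{for } \nu > 2 , \qquad
\text{and skewness} \quad \gamma = 0 .
\]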
The Student t distribution is widely used for statistical inferences and
calculations of confidence intervals for sample means from normal parent
distributions with unknown variances (Haan, 1977, p. 121).
C.2.9. Extreme value distributions
Extremal distributions have been widely used in hydrology to study
the distribution of maximum and minimum hydrologic events. The distribution
of the m largest or smallest events, each selected from one of m samples
of n events, approaches an asymptotic form which depends on the
type of the parent distribution of the total number (mn) of events
(Chow, 1964). Three types of extreme value distributions have been
developed for three different types of parent distribution; these are
known as Type I, Type II and Type III extremal distributions. A rigorous
treatment of the theory of extremes is given in Gnedenko (1943).
Extreme value Type I distribution. The Type I extremal distribution
results from any parent distribution of the exponential type,
such as the normal, lognormal, exponential or gamma distribution. The Type
I distribution is also known as the Gumbel distribution, since Gumbel was the
first to apply it to flood frequency analysis and was the author of most
of the works on extremal distributions. A good summary of Gumbel's work
on the distributions of extremes may be found in Gumbel (1958).
The pdf of the extreme value Type I is
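\[
f(y) = \frac{1}{b} \exp\left\{ \pm\frac{y-a}{b} - \exp\left[ \pm\frac{y-a}{b} \right] \right\} . \tag{C.2.11}
\]
The + and - signs apply to the minimum and maximum value distributions,
respectively, and the variate y is unlimited, $-\infty \le y \le +\infty$. The
two distributions are symmetrical with each other about the mode a.
Equation C.2.11 may be standardized to
\[
f_s(z) = \exp[\pm z - \exp(\pm z)] \tag{C.2.12}
\]
where $z = (y-a)/b$, showing that a and b are of the location and scale type
of parameters.
The moments of the Type I extreme value distribution are
\[
\text{mean} \quad \mu =
\begin{cases}
a + Cb & \text{(maximum)} \\
a - Cb & \text{(minimum)}
\end{cases}
\]
where C = 0.5772 is the Euler constant,
\[
\text{variance} \quad \sigma^2 = \frac{\pi^2}{6}\, b^2 \quad \text{(both)}
\]
\[
\text{skewness} \quad \gamma =
\begin{cases}
+1.1396 & \text{(maximum)} \\
-1.1396 & \text{(minimum).}
\end{cases}
\]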
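The mean and variance relations above can be inverted to give method-of-moments estimates of a and b. The Python sketch below is one way to write that inversion, together with the quantile of the maximum-value form for a given return period; the function names and the return-period convention F = 1 - 1/T are assumptions made for the illustration, not a transcription of any routine in the appendices.

```python
import math

EULER_C = 0.5772  # Euler constant, as quoted in the text


def gumbel_moment_estimates(mean, variance, kind="maximum"):
    """Solve mean = a +/- C*b and variance = pi^2 * b^2 / 6 for (a, b)."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    b = math.sqrt(6.0 * variance) / math.pi
    if kind == "maximum":
        a = mean - EULER_C * b
    elif kind == "minimum":
        a = mean + EULER_C * b
    else:
        raise ValueError("kind must be 'maximum' or 'minimum'")
    return a, b


def gumbel_max_quantile(a, b, return_period):
    """Value with non-exceedance probability 1 - 1/T for the maximum form."""
    if return_period <= 1.0:
        raise ValueError("return_period must exceed 1")
    p = 1.0 - 1.0 / return_period
    # Invert F(y) = exp(-exp(-(y - a) / b)).
    return a - b * math.log(-math.log(p))
```

For example, a, b = gumbel_moment_estimates(10.0, 4.0) followed by gumbel_max_quantile(a, b, 100.0) gives the moment-based estimate of the 100-year event for a sample with mean 10 and variance 4.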
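Extreme value Type II distribution. The Type II extremal distribution
results from a parent distribution of the Cauchy type (Chow,
1964) and is unlimited in the tail of interest. The Type II pdf for
largest value is
\[
f(y) = \frac{k}{b} \left( \frac{b}{y-a} \right)^{k+1} \exp\left[ -\left( \frac{b}{y-a} \right)^{k} \right] \tag{C.2.13}
\]
with $y \ge a$. For this distribution moments of order greater than k do
not exist (Benjamin and Cornell, 1970, p. 279).
The moments of the Type II extremal distribution are
\[
\text{mean} \quad \mu = b\, \Gamma\left( 1 - \frac{1}{k} \right) + a , \qquad k > 1
\]
\[
\text{variance} \quad \sigma^2 = b^2 \left[ \Gamma\left( 1 - \frac{2}{k} \right) - \Gamma^2\left( 1 - \frac{1}{k} \right) \right] , \qquad k > 2
\]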
where the lower bound, a, is estimated independently from the data.
If y has a Type II extremal distribution with parameters a, b and k,
it can be easily shown, by using the relationship between derived
distributions, that log(y - a) has a Type I extremal distribution with
parameters log b and 1/k (Benjamin and Cornell, 1970, p. 280).
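The step can be written out as a short derivation. It assumes natural logarithms and the Type II CDF for largest values, $F_Y(y) = \exp\{-[b/(y-a)]^k\}$ for $y \ge a$, and compares the result with the Type I CDF for largest values, $\exp\{-\exp[-(z-a')/b']\}$, where $a'$ and $b'$ denote its location and scale parameters:

\[
\begin{aligned}
F_Z(z) &= P[\log(Y-a) \le z] = P[Y \le a + e^{z}] \\
       &= \exp\left[ -\left( b\, e^{-z} \right)^{k} \right] \\
       &= \exp\left\{ -\exp\left[ -k\,(z - \log b) \right] \right\} \\
       &= \exp\left\{ -\exp\left[ -\frac{z - \log b}{1/k} \right] \right\} .
\end{aligned}
\]

Matching terms gives $a' = \log b$ and $b' = 1/k$, in agreement with the parameters quoted above.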
The Type II extremal distribution for largest values has been
widely applied for the description of many hydrological and meteorological
phenomena (Gumbel, 1958).
Extreme value Type III distribution. The Type III extremal distribution
results from a parent distribution limited in the tail of
interest (Haan, 1977). The Type III for smallest values is known as the
Weibull distribution, since Weibull (1939) was the first to apply it for
the study of strength of materials. Later Gumbel (1954) used it for the
analysis of droughts.
The pdf of the Weibull distribution is
\[
f(y) = \frac{k}{b-a} \left( \frac{y-a}{b-a} \right)^{k-1} \exp\left[ -\left( \frac{y-a}{b-a} \right)^{k} \right]
\]
with $y \ge a$.
The moments of the Type III extremal distribution are
\[
\text{mean} \quad \mu = a + (b-a)\, \Gamma\left( 1 + \frac{1}{k} \right)
\]
\[
\text{variance} \quad \sigma^2 = (b-a)^2 \left[ \Gamma\left( 1 + \frac{2}{k} \right) - \Gamma^2\left( 1 + \frac{1}{k} \right) \right] .
\]
The skewness is a complex function of k alone, from which a method of
moments estimate of k may be obtained (Gumbel, 1958, pp. 282-284). Here
again, the Type III distribution with parameters a, b and k may be
reduced to a Type I distribution (by the transformation y = log(x - a) and
the relation between derived distributions) with parameters log(b - a)
and k.
APPENDIX D
GPDCP SOURCE PROGRAM
CC    GENERALIZED PROBABILITY DISTRIBUTIONS COMPUTER PROGRAM (GPDCP)
C
C
      DIMENSION S(60),P(60),PC(60),F(60),DD(60,4),Y(60),Z(60)
      DIMENSION AA(100),BB(100),DP(60),E2(100),E1(100),R2(100)
      DIMENSION DT(10),NO(10),Y1(60),Y2(60),NADI(4,3),SITE(20)
      DIMENSION RF(60,4),TR(60),AFT(100),ZZ(10),AMZ(10),SS(60)
      DIMENSION U(10),ZV(10),ZM(10),NP(10),RESY(100),RES(100),RT(60)
      DIMENSION TC(10),ON(10),IRG1(60),IRG2(60),XMXL(100),S1(100)
      DIMENSION YME(100),YV(100),YS(100),SLCT(4),ST(60),SC(60),DFN(25)
      DIMENSION XY(60,2),ALBAP(3),DES(5),ANOVA(14),STAT(9),PRED(60,7)
C
C
      DATA IN,IOU/5,6/
      DATA NADI(1,1),NADI(1,2),NADI(1,3)/' NO','RMAL','    '/
      DATA NADI(2,1),NADI(2,2),NADI(2,3)/' GU','MBEL','    '/
      DATA NADI(3,1),NADI(3,2),NADI(3,3)/' RA','YLEI','GH  '/
      DATA NADI(4,1),NADI(4,2),NADI(4,3)/' PE','ARSO','N   '/
      DATA SLCT/' R2 ','STDE','WSS ','MXLF'/
      DATA ALBAP/0.01,0.05,0.05/
C
CC    CONTROL FLAGS
C
      IPO=0
      IP1=1
      IAF=0
C
      NS=1
      NLB=1
      NLE=4
      NLOI=NLE-NLB+1
      ALF=0.00
c
GPDP 34
cc
STATION DO LOOP
GPDP 35
c
GPDP 36
DO 999 KS=liNS
GPDP 37
c
GPDP 38
CT::0,D
GPDP 39
CP=1.0
GPDP 40
IPRINT=1
GPDP 41
c
GPDP 42
cc
READ IN TRANSFORMATION REQUEST USING 501 OR 502 FORMAT
GPDP 43
c
GPDP 44
IFIAFEQ1) GO TO 10
GPDP 45
c
GPDP 46
READ(IN>501) NAF,AFO>PAS
GPDP 47
501
F0RMAT(I5>F6.2>F5.3)
GPDP 48
GO TO 20
GPDP 49
c
GPDP 50
10
READ(IN>502)NAF>((RF(I>J)>I=1>NAF)>J=NLB>NLE)
GPDP 51
502
FORMAT(I510F6*2)
GPDP 52
PAS=0.001
GPDP 53
20
CONTINUE
GPDP 54
C
GPDP 55
READ(IN>503) L=1>20)
GPDP 56
503
FORMAT(20A4)
GPDP 57
READN.504) NECH>NANNE>INDEX
GPDP 58
504
FORMAT (1
GPDP 59
C
GPDP 60
REAB(IN>505) (DT(I)>I=1,NECH)
GPDP 61
505
FORMAT(10F50)
GPDP 62
READdN.506) (NO(I)>I=1,NECH)
GPDP 63
506
FORMAT(1015)
GPDP 64
C
GPDP 65
cc
SAMPLE DO LOOP
GPDP 66
c
GPDP 67
DO 998 LE=1NECH
GPDP 68
c
GPDP 69
NSEUI=NO(LE)
GPDP 70
RTC=(NANNEF1)/(NSEUIF1)
GPDP 71
c
GPDP 72
READ(IN507)
GPDP 73
507
FORMAT(8F10*2)
GPDP 74
C
GPDP 75
CALL STATY NSEUI60)
GPDP 76
C
GPDP 77
CC !
STANDARDIZE BY DIVIDING BY THE SAMPLE MEAN IF IRED=1
GPDP 78
C
GPDP 79
IRED=0
GPDP 80
IF(IREDfNE.O) CP=1/SM
GPDP 81
C
GPDP 82
CC
RANK THE DATA IF IRNG EQUAL 1
GPDP 83
C
GPDP B4
IRNG=1
GPDP 85
IF(IRNG.EQ.i) CALL ORDER(SIRG1/IRG2HSEUIf60)
GPDP 86
c
GPDP 87
IFS(1)*LT*1*0) CT=10
GPDP 88
c
GPDP 89
DO 111 1=1jNSEUI
GPDP 90
SC(I)=S(I)$CP+CT
GPDP 91
111
P(I)=FLOAT(I)
GPDP 92
c
GPDP 93
cc
PRINT INPUT DATA IF IPO NOT EQUAL ZERO
GPDP 94
c
GPDP 95
IF(IPO.EQ.O) GO TO 30
GPDP 96
WRITE!IOU701) SITE
GPDP 97
WRITE!I0Uj702> DT(LE)
GPDP 98
701
FORMAT(1H1t10X20A4///)
GPDP 99
VRITE(I0U)702) DT(LE)ALFCTCP
GPDP100
702
FORMAT<15X'EVENT''S AVERAGE DURATION ='F5>0' YEAR(S)' >/
GPDP101
,//M5X>'PLOTTING POSITION CONSTANT = 'F6.3>
GPDP102
/>5X 'ADDITIVE CONSTANT = SF6.3
GPDP103
/15Xi 'MULTIPLICATIVE CONSTANT = 'F6>3,//)
GPDP104
WRITE!I0Ut703)
GPDP105
703
FORMAT<35X' VOLUME '>7X>'FREQUENCY'>/37X'UNIT '8X,'* OBS <
=' GPDP106
,//)
GPDP107
WRITE!I0Ui704) <
GPDP108
704
FORMAT < 37X> F5 *210XF5 0)
GPDP109
WRITE!I0Ui754) SMSVSK
GPDPllO
754
FORMA f!//? 5Xf 'MEAN ='F10>3/5Xj'STD =',F10,3,/5X>
GPDP111
,'SKEH ='jF10>3>//)
GPDP112
30
CONTINUE
GPDP113
C
GPDP114
CC EMPERICAL FREQUENCIES CALCULATION
GPBP115
C
GPDP116
CALL CBF(PPCF50NSEUIJ1J2INDEXALF)
GPDP117
c
GPDP118
NY=J2J1I1
GPDP119
AN=NY
GPDP120
AN3=AN3,
GPDP121
DF=(AN3t)/2>
GPDP122
c
GPDP123
cc
CALCULATE STUDENT TO FOR 95Z CONFIDENCE LEVEL
GPDP124
c
GPDP125
CALL HDSTI!0.10AN3STOIER)
GPDP126
c
GPDP127
CC REDUCED VARIABLES CALCULATION
GPDP128
C
GPDP12?
DO 222 I=J1J2
GPDP130
C
GPDP131
FA=F(I)
GPDP132
CALL RNV(FAZNDDDIF)
GPDP133
DD I i 1)=ZN
GPDP134
DD 2)=ALOG(ALOG(FA))
GPDP135
DD(I3)=SQRT(2>AL0G(1,FA))
GPDP136
RT(I)=RTC/FA
GPDP137
222
CONTINUE
GPDP138
C
CC    REGRESSION COEFFICIENTS EVALUATION
C
      K2=0
C
CC    PARENT DISTRIBUTION DO LOOP
C
      DO 888 J=NLB,NLE
C
CC    TRANSFORMATION DO LOOP
C
      DO 777 N=1,NAF
C
      RF(N,J)=AFO+(N-1)*PAS
C
      AF=RF(N,J)
      AFM1=AF-1.0
      IF(AF.NE.0.) AIF=1./AF
      BF=2.*ABS(AF)-PAS
      K2=N+(J-1)*NAF
C
      SY=0.
      SY2=0.
      DO 333 I=J1,J2
      IF(BF) 41,41,40
   40 Y(I)=(SC(I)**AF-1.0)/AF
      GO TO 42
   41 Y(I)=ALOG(SC(I))
   42 CONTINUE
      SY=SY+Y(I)
      SY2=SY2+Y(I)*Y(I)
  333 CONTINUE
      YM=SY/AN
      YS2=SY2/AN-YM*YM
C
CC    MOMENT ESTIMATE OF THE GAMMA SHAPE PARAMETER
C
      DF=YM*YM/YS2*(AN-1.)/AN
      IF(J.EQ.4) DFN(N)=DF
      DF=ABS(DF)
      IF(DF.GT.200000.) DF=200000.
      IF(DF.LT.0.50) DF=0.50
C
CC    INDEPENDENT VARIABLE DEFINITION
C
      DO 444 I=J1,J2
      IF(J.NE.4) GO TO 43
      FA=F(I)
      CALL MDCHI(FA,DF,XG,IER)
      Z(I)=XG/2.
      DD(I,4)=Z(I)
      GO TO 444
   43 CONTINUE
      Z(I)=DD(I,J)
  444 CONTINUE
C GPDP194
CALL REGRE(ZiY>Y250J1J2>AA(K2)>BB(K2)>R2(K2)fSl(K2)>ZZ(J)r GPDP195
AMZ( JbVARY) GPDP19&
C GPDP197
CALL STATY (YfYHE(K2),YV(K2)YS(K2)>NY>60> GPDP198
C GPDP199
EEB=0. GPDP200
EEC=0 GPDP201
F'XT=Of GPDP202
A1=0 GPDP203
C GPDP204
DO 50 I=J1>J2 GPDP205
C GPDP203
A1=A1F(S(I)AFM1(I))2 GPDP207
C GPDP208
GPDP209
C
IF(BF) 45,45,46
45 DPI)=EXP
GO TO 48
46 CONTINUE
AIY=AFIY2(I)+1.0
IF (AIY.GT,0.0) GO TO 47
C
CC PRINT MESSAGE FOR NEGATIVE TRANSFORMED VARIABLE IF IHES=1
C
IHES=0
IF(IHESEQ.1)WRITE<6>766) I,AIY,AF,(NADI(J,IK),IK=1,3)
766 F0RMAT(/,5X,'FOR 0BS.4',I3,' PREDICTED AIY = ',F10.3,
. FOR A POWER TRANSF. EXP. = ',F6.2,5X,3A4)
C
dp(I)=i.e5
GO TO 48
47 DP(I)=AIY**AIF
PXT=PXT+ALOG(DPI))
48 CONTINUE
BPI=(BP(I)CT)/CP
C
CC RESIDUALS AND SELECTION STATISTIC CALCULATION
C
EEB=EEBM S (I) DPI) tt2
EEC=EEC+ ((DP) SC (I)) *DP (I) AFH1) 2/VARY
50 CONTINUE
C
XHXL(K2)=AN/2,*AL0G(EEC)
ElK2)SQRTEEB/AN)
E2(K2)=EEC
C
777 CONTINUE
888 CONTINUE
C
CC SEARCH FOR THE OPTIMAL SELECTION STATISTIC
C
ISB=1
ISE=4
C
CC USE AS SELECTION CRITERIA 'SLCT' ISB TO ISE
C
DO 997 ISL=ISB,ISE
Hl=l
MN=NAF*NLOI
EEA=R2(1)
IF(ISL.EQ.2) EEA=E1(1)
IF(ISL.EQ.3) EEA=E2(1)
IF(ISL,EQ.4) EEA=XMXL(1>
C
DO 555 J=2,HN
IF(ISL.EQ.l) EET=R2(J)
IFSL.EQ.2) EET=E1(J)
IF(ISL.E0.3) EET=E2(J)
IF(ISL.EQ.4) EET=XMXL(J)
IF(EEAEET) 61,60,60
60 GO TO 555
61 EEA=EET
Hl=J
555 CONTINUE
C
CC SEARCH FOR THE CORRESPONDING DISTRIBUTION
C
N1=M1/NAF
H2=H1N1*NAF
IFM2) 62,63,62
62 N=H2
MrNlFl
GO TO 64
63 N=NAF
H=N1
64 CONTINUE
C
CC
C
705
OUTPUT REGRESSION RESULTS
706
URITEI0U705) SITE
FORMAT(1H1> 5Xj 20A4/)
URITEIOU702) BT(LE),ALF,CT,CP
URITEIOU706) NADIMI)1=13)>SLCT
F0RHAT1OX'REGRESSION RESULTS Â¡S3A4,
707
IFIPRINSEQ.O) GO TO 72
URITE(I0U,707)
FORMAT 8X >'LAW',7X >'ALFA', 4X >'LOCATION
,4Xr
u2X>'
R2
MEAN
STD SKEW',//)
'5X'SCALE'
HXLF '
DO 71 I=NLB>NLE
IF.EQ.4) URITEIOU78?)
789 FORMAT(1H+125X>' K '/)
IPLT=0
IFIPLTEQ1) 9RITE(7?720)
C
DO 70 NZ=1>NAF
K=NZ+(I1)*NAF
(NADI(IJbJ=i3),RF(NZ>I>BB(K)>AA(K),R2El(K)>GPDP313
(NADIIJ)>J=13)
NPK=2
C
IF(NZ/NPK*NPK.EQ.NZ) GO TO 70
IF(IPLT.EQ>1) URITE7787) RFNZI),R2K)E1K)E2K)XHXLiK)
*YSK)
787 F0RMAT6F12.3)
C
URITEIOU708)
.E2K)XHXLK),YMEK)>YyK)jYSK)
708 F0RMAT<4X>3A4jF>24(1XiF11>5)2XjF8t5F10t2)2X>3F9>3)
IFIEQ4) WRITEI0U788) DFNNZ)
788 FORMAT(IHFj120XF10.3)
C
70 CONTINUE
URITEIOU 709)
709 FORMAT!1H )
71 CONTINUE
799
72
URITEI0U>79?)
F0RMAT(//j20Xi'
/>20X'
/j20X>'
/j20X>'
/>20Xj '
/20X)'
CGNTINUE
R2 CORRELATION COEFFICIENT ',
STDE STANDARD ERROR ',
USS WEIGHTED SUM OF SQUARES '
MXLF MAXIMUM LIKELIHOOD FUNCTION'>
STD STANDARD DEVIATION ',
K PEARSON SHAPE PARAMETERS/)
CC PERFORMANCE ANALYSIS OF THE SELECTED PDF
C
AF=RF(N>M)
IFAF.NE.O) AIF=1./AF
BF=2,*ABS(AF)PAS
SRES=00
C
CC
C
CALCULATE TRANSFORMED AND REDUCED VARIABLE FOR THE SELECTED MODEL
NX=INT(AN)
DO 77 I=J1>J2
IF(M>NE,4) GO TO 73
BF=BFN(N)
FA=F(I)
CALL MDCHI(FA>DF>XG)IER)
DD(IiM)=XG/2*
73 CONTINUE
IF(BF) 75,75,74 GPDP351
74 Y
GO TO 76 GPDP353
75 Y(I)=ALOG(SC(I)) GPHP354
76 CONTINUE GPDP355
C GPDP356
XY(I,1)=DD(I,H) GPBP357
XY(I2)=Y(I) GPDP358
PREB(I,1)=DB(I,H) GPBP359
C GPBP360
77 CONTINUE GPBP361
C GPBP362
CALL RL0NE(XY6Q,NX0,1ALBAP>BES,ANGVA>STATPRED,60NXIER) GPBP363
C GPBP364
CC PRINT HEADING FOR SELECTEB HODEL PERFORHANCE GPBP365
C GPBP366
IF0) GO TO 996 GPBP367
IF(IPRINT.EQ.O) GO TO 78 GPBP368
IPRINT=0 GPBP369
WRITE(IQU705) SITE GPBP370
URITE(IGUj702) DT(LS),ALF,CT,CP GPBP371
URITE(IOUf706) (NABI(H>I),I=1,3),SLCT{ISL)RF(N,H) GPBP372
78 CONTINUE GPBP373
URITE
710 FORMAT'FREQUENCY'> 4X >' PERIOD'5X>'QBS^UOL'5X,'PRED.VOL' GPDP375
4X,'RESIDUAL',8X,'Z',7X' YO '2X'YP ', GPDP376
YR T TO'/) GPDP377
C GPBP378
CC RESIDUAL AND CONFIDENCE LIMITS FOR THE SELECTED HODEL GPDP379
C GPBP380
IDISC=0 GPDP381
C GPDP382
IF(IDISC.EQ.l) yRITE(7720)(NABI(HI)>I=l>3)SLCT(ISL),RF(NH) GPDP383
720 FQRHAT(3X>3A42X,A4>F10>3) GPDP384
C GPDP385
DO 83 I=J1,J2 GPDP386
Y2
SS(I)=Si(HMSQRT((l.+
*(ANl)/(AN3)) GPDP389
IF(BF) 8181,79 GPDP390
79 CONTINUE GPDP391
AIY=AF*Y2(I)H0 GPDP392
IF(AIY,GT00) GO TO 80 GPDP393
BP(I)=lE5 GPDP394
IF(IHE3EG.1)URITC(6>766) :AIY>Ar,(NADI(};,IK),IK=l,3) 0PDP393
GO TO 82 GPDP39
80 DP(I)=AIYWAIF GPDP397
GO TO 82 GPDP398
81 CONTINUE GPDP399
DP(I)=EXP(Y2(I)) GPDP400
82 CONTINUE GPDP401
C GPDP402
DP(I)=(DP)~CT)/CP GPDP403
RES)=BP(I)S(I) GPDP404
RESY
SRES=SRES+RES(I)iRES(I) GPDP406
C GPDP407
CC STUDENT STATISTIC GPDP408
C GPDP409
ST (I) =ABS(RESY(I))/SS(I) GPDP410
C GPDP411
URITEDD(I,H)Y(I),Y2(I) GPDP412
>RESY(I)ST(I),STO GPDP413
711 F0RHAT(8X,F5,0,6F12.3,2F8,3,2F7>3) GPDP414
C GPDP415
CC PRINT SELECTED RESULTS ON TCP FILE FOR PLOTTING GPDP416
CC IF IDISC NOT EQUAL ZERO GPDP417
C GPDP413
IF(IDISC.EG>1) GPDP419
,URITE<7,712) P(I) RT(I)S(I)BP
712 F0RHAT(1X,F5>0,F61>5F123) GPDP421
83 CONTINUE GPDP422
C GPDP423
WRITE!IOU,722) GPDP424
722 FDRHAT(//,20X' Z REDUCED VARIATE S 6PDP425
. /20X' YO TRANSFORMED OBSERVATIONS GPDP426
. /,20X,' YP TRANSFORMED PREDICTION S GPDP427
/,20X,' YR TRANSFORMED RESIDUAL S GPDP428
/>20X' T STUDENT STATISTIC S GPDP429
/,20X,' TO 957. STUDENT T 't/I) GPDP430
C GPDP431
SRES=SQRT(SRES/AN) GPDP432
C GPDP433
CC OUTPUT ANOVA STATISTIC GPDP434
C GPDP435
NANDV=0 GPDP436
IF(NANOV.EQ.l) GO TO 996 GPDP437
WRITE!IOU,705) SITE GPDP438
WRITE!IOU,702) DT(LE),ALF,CT,CP GPDP439
WRITE(I0U,706) (NADI!H,I),I=1,3),SLCT(ISL),RF(N,H) GPDP440
WRITE(I0U,714) DES(2),DES(4),DES(5),DES(1),DES(3) GPDP441
WRITE!I0U,715) AN0VA!1),AN0VA!4),AN0VA(7),AN0VA(9),AN0VA(0), GPDP442
. AN0VA(2),AN0VA(5),AN0VA8), GPDP443
. ANOVA(3),ANOVA!6),AN0VA(14) GPDP444
WRITE!IOU,716) STAT(1),STAT(2),STAT(3),STAT<4), GPDP445
. STAT(5),STAT!),STAT(7)iSTAT(8),STAT(9) GPDP446
C GPDP447
714 F0RHAT(///,20X,'BASIC DESCRIPTIVE STATISTICS',//,45X, GPDP448
.'STANDARDS/,35X,'MEAN S5X,'DEVIATION CORRELATIONS//,5X, GPDP449
.'TRANSFORMED VARIABLES5X,2F11.3,/,55X,FIO.3,/5X, GPDP450
.'REDUCED VARIATE S8X,2F11.3,//) GPDP451
715 FORMAT!//,25X,'ANALYSIS OF VARIANCE',//5X, GPDP452
.'SOURCE OF DEGREE OF SUMS OF HEANS14X,'P!EXCEEDING F' GPDP453
.,/,5X, 'VARIATION FREEDOM SQUARES SQUARES FVALUES4X, GPDP454
.'UNDER HO)',//,5X,'REGRESSI0NS5F11.3,/,5X,'RESIDUAL S3F11.3,/, GPDP455
,5X,'TOTALS5X,2F11.3,//,5X,'DURBINUATS0N STATISTIC SF6.3,//) GPDP456
716 FORMAT!/,25X,'MODEL PARAMETER INFERENCES',//,48X,'L0WERS7X, GPDP457
.'UPPERS/,25X,'POINT STANDARD CONFIDENCE CONFIDENCES/, GPDP458
.23X,'ESTIMATE ERROR LIMIT LIMIT',//,5X, GPDP459
.'SCALE S 7X,4F12.3,/,5X, GPDP460
, 'LOCATION'4X,4F12.3,//5X, GPDP461
. 'C0VARIANCES4X.F10.3,/) GPDP462
C GPDPT63
996 CONTINUE GPDP464
997 CONfINUE GPDP465
998 CONTINUE GPDP466
999 CONTINUE GPDP467
STOP GPDP468
END GPDP469
C
CC    GPDCP SUBROUTINES
C
      SUBROUTINE ORDER(X,IRANG1,IRANG2,N,ND)
C
      DIMENSION X(ND),IRANG1(ND),IRANG2(ND)
C
      DO 100 I=1,N
  100 IRANG2(I)=I
      M=N-1
      IF(M) 600,600,200
  200 DO 400 I=1,M
      L=I+1
      DO 400 J=L,N
      IF(X(I)-X(J)) 400,400,300
  300 T=X(I)
      X(I)=X(J)
      X(J)=T
      IT=IRANG2(I)
      IRANG2(I)=IRANG2(J)
      IRANG2(J)=IT
  400 CONTINUE
      DO 500 K=1,N
      L=IRANG2(K)
  500 IRANG1(L)=K
  600 RETURN
      END
C
C
      SUBROUTINE CDF(P,PC,F,NM,N,J1,J2,INDEX,ALF)
C
      DIMENSION P(NM),PC(NM),F(NM)
C
      PC(1)=P(1)
      IF(INDEX-1) 1,3,1
    1 DO 2 I=2,N
    2 PC(I)=PC(I-1)+P(I)
      GO TO 5
    3 CONTINUE
      DO 4 I=2,N
    4 PC(I)=P(I)
    5 CONTINUE
      DO 6 I=1,N
      PPP=PC(I)
      IF(PPP) 6,6,7
    6 CONTINUE
    7 J1=I
      J2=N
      I1=N
      DO 8 I=J1,J2
    8 F
      RETURN
      END
C
      SUBROUTINE RNV(P,X,D,IE)
C
C
      IE=0
      IF(P) 1,4,2
    1 IE=-1
      GO TO 12
    2 IF(P-1.) 7,6,1
    4 X=-0.9E+32
    5 D=0.
      GO TO 12
    6 X=0.9E+32
      GO TO 5
    7 D=P
      IF(D-0.5) 9,9,8
    8 D=1.-D
    9 T2=ALOG(1./(D*D))
      T=SQRT(T2)
      X=T-(2.515517+0.802853*T+0.010328*T2)/(1.+1.432788*T+0.189269*T2
     .+0.001308*T*T2)
      IF(P-0.5) 10,10,11
   10 X=-X
   11 D=0.3989423*EXP(-X*X/2.)
   12 RETURN
      END
      SUBROUTINE REGRE(X,Y,Y2,MMAX,J1,J2,A,B,R2,S,VARX,AMX,VARY)
C
C
      DIMENSION X(60),Y(60),Y2(60)
C
      AMX=0.
C
      AMY=0.
      VAX=0.
      VAY=0.
      COV=0.
      DO 1 I=J1,J2
      AMX=AMX+X(I)
      AMY=AMY+Y(I)
      VAX=VAX+X(I)*X(I)
      VAY=VAY+Y(I)*Y(I)
      COV=COV+X(I)*Y(I)
    1 CONTINUE
      AN=FLOAT(J2-J1+1)
      AMX=AMX/AN
      AMY=AMY/AN
      VARX=VAX/AN-AMX*AMX
      VARY=VAY/AN-AMY*AMY
      COVA=COV/AN-AMX*AMY
      A=COVA/VARX
      B=AMY-A*AMX
      R2=COVA**2/VARX/VARY
      S=SQRT((AN/(AN-2.))*(VARY-COVA**2/VARX))
      DO 2 I=J1,J2
      Y2(I)=A*X(I)+B
    2 CONTINUE
      RETURN
      END
APPENDIX E
MODIFIED KITE SOURCE PROGRAM
C
C     MODIFIED VERSION OF KITE'S (1977,PP.164-167) PROGRAM
C
C     PROGRAM TO COMPUTE THE STANDARD ERRORS OF EVENTS COMPUTED
C     FROM VARIOUS PROBABILITY DISTRIBUTIONS COMPARED TO THE
C     OBSERVED EVENT MAGNITUDES
C     INPUT
C     TITLE
C     N     NUMBER OF ANNUAL MAXIMUM EVENTS
C     X     SERIES OF EVENTS
C
C
      DIMENSION X(100),Y(100),Z(100),P(100),RP(100),T(100),TITLE(80)
      REAL L1,L2,L3,LG
      REAL K,M1,M2,M3
      REAL MU
      E=2.515517
      C1=0.802853
      C2=0.010328
      D1=1.432788
      D2=0.189269
      D3=0.001308
   66 READ (5,6,END=99) TITLE
      READ (5,7) N
      XN=N
      READ(5,8) (X(I),I=1,N)
      WRITE(6,9) TITLE
C
      A=0.0
      B=0.0
      C=0.0
      DO 1 I=1,N
      A=A+X(I)
      B=B+X(I)*X(I)
      C=C+X(I)**3
    1 CONTINUE
C
      M1=A/XN
      M2=(B/XN)-(A/XN)**2
      M3=(C/XN)+2.*M1**3-3.*M1*(B/XN)
      M2=M2*XN/(XN-1.0)
      G=M3/(M2**1.5)
      CALL SORTX(N,X)
      WRITE (6,10)
      WRITE (6,19) (X(I),I=1,N)
      WRITE (6,17)
C
      A=0.0
      B=0.0
      C=0.0
      DO 2 I=1,N
      Y(I)=ALOG(X(I))
      A=A+Y(I)
      B=B+Y(I)**2
      C=C+Y(I)**3
    2 CONTINUE
      L1=A/XN
      L2=(B/XN)-(A/XN)**2
      L3=(C/XN)+2.*L1**3-3.*L1*(B/XN)
      L2=L2*XN/(XN-1.0)
      LG=L3/(L2**1.5)
      WRITE (6,11) M1
      WRITE (6,12) M2
      WRITE (6,13) G
      WRITE (6,14) L1
      WRITE (6,15) L2
      WRITE (6,16) LG
C
      DO 3 I=1,N
      XI=I
      P(I)=1.0-XI/(XN+1.0)
      RP(I)=1.0/(1.0-P(I))
      D=P(I)
      IF(D.GT.0.5) D=1.0-D
      W=SQRT(ALOG(1.0/D**2))
      T(I)=W-(E+C1*W+C2*W**2)/(1.0+D1*W+D2*W**2+D3*W**3)
      IF(P(I).LT.0.5) T(I)=-T(I)
    3 CONTINUE
C
URITE (6,18)
KITE 79
KITE 80
CALL TN (N,H1,H2,T,Z)
KITE 81
WRITE(6,19) (2(1),1=1,N)
KITE 82
CALL SSQ(N,X,Z)
KITE 83
WRITE (6,20)
KITE 84
CALL TN (N,L1,L2,T,Z)
KITE 85
C
DO 4 1=1,N
KITE 86
KITE 87
4
Z(I)=EXP(Z(I))
KITE 88
WRITE (6,19)
KITE 89
CALL SSQ (N,X,Z)
KITE 90
WRITE (6,21)
KITE 91
C
EPQ=l>0E7
KITE 92
KITE 93
CALL LN3 (N,X,T,Z,HU,SD,AHL,IC,EPO)
KITE 94
WRITE (6,33) EPO, HU,SB,AHL,IC
KITE 95
33
FORMAT(/r2X,' EPS,HU,SD,AHL = ',4E10,3,' 4ITR = ',12,/)
KITE 96
WRITE (6,19) (Z(I),I=1,N)
KITE 97
CALL SSQ (N,X,Z)
KITE 98
WRITE (6,22)
KITE 99
EPS=l0E7
KITE100
C
CALL EV1 (N,X,H1,H2,RP,Z,BET,ALF,DLT,ICT,EPS)
KITE101
KITE102
C
WRITE(6,34) EPS,BET,DLT,ALF,ICT
KITE103
KITE104
34
FORMAT(/,2XEPS,BET,DLT,ALF ',4E10,3,' i ITR ',12,//)
KITE105
WRITE (6,19) (Z(I),I=1,N)
KITE106
CALL SSQ(N,X,Z)
KITE107
WRITE (6,24)
KITE108
C
EP2=10Â£3
KITE109
KITE110
CALL PT3 (N,Y,T,Z,DEL,GAH,ALF,IC,EP2)
KITE111
C
WRITE (6,44) DEL,EP2,GAH,ALF,IC
KITE112
KITE113
c
KITE114
DO 5 1=1,N
KITE115
5
Z(I)=EXP(Z(I))
KITE116
WRITE (6)19) (Z(I))I=1)N)
KITE117
CALL SSQ
KITE118
C
KITE119
WRITE (6)23)
KITE120
EP1=10E7
KITE121
CALL PT3
KITE122
c
KITE123
WRITE(6,44) DL)EP1)GH)AF)IC
KITE124
44
F0RriAT(/)2X)' DLT)EPS)GAMA)ALPHA = ',4E10,3)' ITR = ',I2)//>
KITE125
WRITE (6)19) (Z(D)I=l)fO
KITE126
CALL SSQ (N,XZ)
KITE127
C
KITE128
GO TO 66
KITE129
9?
CONTINUE
KITE130
STOP
KITE131
C
KITE132
6
FORMAT(80A1)
KITE133
7
FORMAT(15)
KITE134
8
FORMAT(10F8 >0)
KITE135
9
FORMAT <1H1,/80A1)//)
KITE136
10
FORMAT(3X)' SORTED RECORDED EVENTS ',/)
KITE137
11
FORMAT(20Xi'MEAN OF Y',16XiF12.3)
KITE138
12
FORMAT20X) 'VARIANCE OF Y')12X)F12.3)
KITE139
13
FORMAT(20X)'SKEW OF Y'16X)F12*3>/)
KITE140
14
FORMAT(20X)'MEAN GF LN(Y) M0X)F123)
KITE141
15
F0RMAT(20X)'VARIANCE OF LN(Y) ')6X,F12.3)
KITE142
16
FORM AT (20X)'SKEW OF LN(Y) M0X)F12.3/,1H1)
KITE143
17
FORMAT(///)
KITE144
18
FORMAT(3X'TRONCATED NORMAL EVENTS')/)
KITE145
19
F0RNAT(3X) 6F12.3)
KITE146
20
FORMAT(3X)'2 PARAMETER LOGNORMAL EVENTS')/)
KITE147
21
FORMAT(3X,'3 PARAMETER LOGNORMAL EVENTS')/)
KIYE148
22
F0RHAT(3X)'TYPE 1 EXTREMAL EVENTS')./)
KITE149
23
FORMAT(3X,'PEARSON TYPE 3 EVENTS')/)
KITE150
24
FORMAT(3X)'LOGPEARSON TYPE 3 EVENTS')/)
KITE151
C
KITE152
END
KITE153
C
      SUBROUTINE SSQ (N,X,Z)
      DIMENSION X(1),Z(1)
      SUM=0.0
      DO 1 I=1,N
    1 SUM=SUM+(Z(I)-X(I))**2
      XN=N
      SUM=SQRT(SUM/XN)
      WRITE (6,2) SUM
    2 FORMAT(/,3X,'STANDARD ERROR IS ',F12.3,//)
      RETURN
      END
C
      SUBROUTINE TN (N,XBAR,XVAR,T,X)
      DIMENSION X(1),T(1)
      XSTD=SQRT(XVAR)
      DO 1 I=1,N
    1 X(I)=XBAR+T(I)*XSTD
      RETURN
      END
C
      SUBROUTINE SORTX (N,X)
C
C     SORTS IN DECREASING ORDER X(1)=LARGEST
C
      DIMENSION X(1)
      K=N-1
      DO 2 L=1,K
      M=N-L
      DO 2 J=1,M
      IF(X(J)-X(J+1)) 1,1,2
    1 XT=X(J)
      X(J)=X(J+1)
      X(J+1)=XT
    2 CONTINUE
      RETURN
      END
C
SUBROUTINE PT3 < NXSND,XTDELTA,GAHMA,ALPHAICOUNT>EPS) KITE192
C KITE193
DIMENSION X(1)SND(1)XT(1) KITE194
XN=N KITE195
XMIN=1.0E7 KITE196
DO 1 1=1,N KITE197
1 IF
C KITE199
IEPSM=7 KITE200
IEPS=1 KITE201
NHAX=25 KITE202
C KITE203
10 CONTINUE KITE204
GML=XMIN*0.99 KITE205
IC0UNT=0 KITE20
2 IC0UNT=IC0UNT+1 KITE207
A=0*0 KITE20S
B=0*0 KITE209
C=00 KITE210
R=0,0 KITE211
C KITE212
DO 3 1=1,N KITE213
XIGML=ABS(X(I)GML) KITE214
A=ATlfO/XIGML KITE215
B=BFXIGHL KITE2
C=CTALOG(XIGML) KITE217
R=R+10/XIGHL#XIGHL KITE218
3 CONTINUE KITE219
C KITE220
BETA=A/
ALPHA=B/(XN*BETA) KITE222
D=BETA+2,0 KITE223
PSI=ALOG(D)(1.0/(2,0*D(1.0/(12,0*Dtt2m(l,0/(120,0*D4 KITE224
,( 1.0/(252.0*Dtt6))(l,0/(BETA+l,0))(l, O/BETA) KITE225
FCN=XNTPSI+CXNJALQG(ALPHA) KITE226
TRI=(1*0/D>f(l.0/(2.0m*2m(l,0/(6,0*Dtt3>)(l.0/(30,0*D5m KITE227
,<1,0/(42,0*D7))(l,0/(30,0*D9))+(l,0/((BETA+l,0)2))t(l,0/ KITE228
,BETA**2) KITE229
V=A(XN**2)/B KITE230
U=A KITE231
U=(B/XN)XN/A KITE232
DU=R KITE233
DV=RXN*t3/Btt2 KITE234
Ma1.0+(XN)/A2 KITE235
FPN=XN4TRI4( AXN*DW/U KITE236
AS=GMLFCN/FPN KITE237
DELTA=ABS(EPS*AS> KITE23S
IF(ABS(ASGHL).LTDELTA) GO TO 4 KITE239
IFdCOUNT>GTNHAX) GO TO 6 KITE240
GHL=AS KITE241
GO TO 2 KITE242
4 CONTINUE KITE243
GAHHA=AS KITE244
DO 5 J=1N KITE245
T=SND(J) KITE246
E=BETA(l./3.)1.0/<9.0*BETA(2*/3.))+T/(3*0ttETA(l./6J) KITE247
XT(J)=GAHHA*ALPHA*EW3 KITE24B
5 CONTINUE KITE24?
GO TO 66 KITE250
6 CONTINUE KITE251
C KITE252
IF(IEP5.GTIEPSrt) GO TO 64 KITE253
EPS=EPS10 KITE254
IEPS=IEPSP1 KITE255
GO TO 10 KITE254
66 CONTINUE KITE257
RETURN KITE258
END KITE259
C
      SUBROUTINE EV1 (N,X,M1,M2,T,XT,BETA,ALPHA,DELTA,ICOUNT,EPS)
C
      DIMENSION X(1),T(1),XT(1)
      REAL M1,M2
      XN=N
C
      NMAX=25
      ALPHA=1.2825/(SQRT(M2))
      AML=ALPHA
      IEPS=1
      IEPSM=7
   10 CONTINUE
      ICOUNT=0
    1 ICOUNT=ICOUNT+1
      A=1.0/(AML**2)
      B=M1-1.0/AML
      C=0.0
      D=0.0
      E=0.0
      DO 2 I=1,N
      TEMP=EXP(-AML*X(I))
      C=C+TEMP
      D=D+TEMP*X(I)
      E=E+TEMP*X(I)**2
    2 CONTINUE
      FCN=D-B*C
      FPN=B*D-E-A*C
      AS=AML-(FCN/FPN)
      DELTA=ABS(EPS*AS)
      IF(ABS(AS-AML).LT.DELTA) GO TO 3
      IF(ICOUNT.GT.NMAX) GO TO 5
      AML=AS
      GO TO 1
    3 CONTINUE
      ALPHA=AS
      BETA=(1.0/ALPHA)*ALOG(XN/C)
      DO 4 J=1,N
      YM=-ALOG(-ALOG(1.0-1.0/T(J)))
      XT(J)=BETA+YM/ALPHA
    4 CONTINUE
      GO TO 6
    5 CONTINUE
      IF(IEPS.GT.IEPSM) GO TO 6
      IEPS=IEPS+1
      EPS=EPS*10
      GO TO 10
    6 CONTINUE
      RETURN
      END
C
SUBROUTINE LN3
C
      DIMENSION X(1),SND(1),XT(1)
      REAL MU
      XN=N
      XMIN=1.0E7
      DO 1 I=1,N
    1 IF(X(I).LT.XMIN) XMIN=X(I)
C
      IEPSM=7
      NMAX=25
C
      IEPS=1
   10 CONTINUE
      AML=XMIN*0.80
      ICOUNT=0
    2 ICOUNT=ICOUNT+1
C
      A=0.0
      B=0.0
      C=0.0
      D=0.0
      E=0.0
      F=0.0
      P=0.0
C
ICOUNT jEPS) KITE311