
Artificial Neural Network Model to Predict Financial Contingency for Transportation Construction Projects

Permanent Link: http://ufdc.ufl.edu/UFE0041093/00001

Material Information

Title: Artificial Neural Network Model to Predict Financial Contingency for Transportation Construction Projects
Physical Description: 1 online resource (170 p.)
Language: English
Creator: Lhee, Sang
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2009

Subjects

Subjects / Keywords: Design, Construction, and Planning -- Dissertations, Academic -- UF
Genre: Design, Construction, and Planning Doctorate thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Construction projects involve many uncertainties and risks in all phases. As a result, all types of construction projects, including transportation projects, have historically experienced significant cost increases. Project contingencies are important items in the cost estimate for compensating unforeseen risks against underestimating project costs and budget overruns. Generally, a contingency is represented as a fixed percentage of project cost. However, it is not appropriate to apply this deterministic method to all construction projects because it provides an arbitrary percentage value based on only the project costs. The purpose of this study was to develop an artificial neural network model that was able to predict the required contingency amount on transportation construction projects for project owners or sponsors like DOT (Department of Transportation). In order to obtain this ultimate goal, factors that affect the owner's contingency were identified, their weights in contributing to the contingency were discovered, and an appropriate form of the owner's contingency was found with FDOT (Florida Department of Transportation) project data from projects that were completed from 2004 to 2006. The best artificial neural network model to predict the owner's contingency using the NeuroShell Predictor software was discovered and a predictive tool was developed using a Microsoft Excel spreadsheet. The more accurate predictions of the contingency from this study can be used to better manage project contingency requirements and allow for additional projects to be brought online at a faster pace.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Sang Lhee.
Thesis: Thesis (Ph.D.)--University of Florida, 2009.
Local: Adviser: Issa, R. Raymond.
Local: Co-adviser: Flood, Ian.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2011-12-31

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2009
System ID: UFE0041093:00001


Full Text

PAGE 1

ARTIFICIAL NEURAL NETWORK MODEL TO PREDICT FINANCIAL CONTINGENCY FOR TRANSPORTATION CONSTRUCTION PROJECTS

By SANG CHOON LHEE

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2009

PAGE 2

© 2009 Sang Choon Lhee

PAGE 3

To my parents, Dr. Joongwoo Lhee and Kyungok Choi, my wife Yeojin, and my son Joonhee, for their love and support

PAGE 4

ACKNOWLEDGMENTS

Many people deserve acknowledgment for their contributions to this doctoral dissertation. First of all, I would like to express my deepest gratitude to Dr. R. Raymond Issa, my advisor and committee chair, for his constant support and continuous encouragement. I would like to thank Dr. Ian Flood, committee co-chair, for introducing me to the artificial neural network methodology and helping me understand this new method with generous guidance and constructive comments during this research. My appreciation goes to Dr. Edward Minchin for his useful ideas and comments on transportation construction projects, as well as to Dr. Yuan Chow for his interest in this research.

I am very grateful to my parents, Dr. Joongwoo Lhee and Kyungok Choi, for their unconditional love and endless support. Without their constant encouragement, belief, and prayer, this achievement would not have been possible. My appreciation for them is indescribable. I also thank my parents-in-law, Yongwon Choi and Soonchun Lee, for their love and support.

I would like to express my deepest thanks to my beloved wife, Yeojin, for her endless love and emotional support during my entire doctoral study. I also thank my son, David Joonhee, for giving me a great smile and lots of pleasure every day. All I would like to say is that I love you so much.

Finally, I would like to dedicate this dissertation to my grandparents, who passed away before I came to the United States and during this study. I will never forget their warm smiles and boundless love.

PAGE 5

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS ... 4
LIST OF TABLES ... 9
LIST OF FIGURES ... 13
ABSTRACT ... 18

CHAPTER

1 INTRODUCTION ... 20
  Research Backgrounds ... 20
  Research Objectives ... 24
  Research Questions ... 24
  Research Scope ... 25

2 LITERATURE REVIEWS ... 26
  Contingency ... 26
  Previous Methodologies for Setting Contingency ... 28
    Traditional Fixed Percentage Approach ... 28
    Expert Judgment ... 29
    Itemized Allocation Method ... 29
    Probabilistic Itemized Allocation Method ... 32
    PERT (Program Evaluation and Review Technique) Method ... 33
    Monte Carlo Simulation (MCS) ... 36
    Fuzzy Set Theory ... 37
    Regression Analysis ... 38
    Artificial Neural Network (ANN) Method ... 38
    Other Probabilistic Methods ... 40
  Limitation of Current Methods for Estimating Contingency on FDOT Projects ... 41

3 ARTIFICIAL NEURAL NETWORK METHODOLOGY ... 45
  Overview ... 45
  Structure of Artificial Neural Network ... 47
  Steps for Creating Artificial Neural Network ... 48
  Types of Artificial Neural Network ... 50
    Supervised Network ... 50
    Unsupervised Network ... 51
  Training and Testing (Validation) ... 51
  Development Tools for Artificial Neural Network ... 53

PAGE 6

4 RESEARCH METHODOLOGY ... 55
  Methodology Overview ... 55
  Strategizing ... 56
    Desired Contingency as Output Variable ... 56
    Potential Input Variables ... 58
  Data Collection ... 60
    Data Collection Method ... 60
    Descriptions of Accessible Input Variables ... 63
      Project work type ... 63
      Project delivery method type ... 63
      Project contract agreement type ... 66
      Bid award method type ... 67
      Number of bidders ... 68
      Project size (contract amount) ... 69
      Project duration ... 69
      Project letting (delivery) type ... 71
      Project geographical location ... 72
      Project letting year ... 73
    Summary of Accessible Input Variables ... 73

5 ARTIFICIAL NEURAL NETWORK MODEL ... 75
  Artificial Neural Network (ANN) Methodology ... 75
    Development Tool for Artificial Neural Network Model ... 75
    Type and Structure of Artificial Neural Network Model ... 76
    Development of Artificial Neural Network Model ... 78
    Implementation of Artificial Neural Network Model ... 78
    Identification of Effective Input Variables ... 81
    Enhanced Generalization for Testing Set ... 81
  Artificial Neural Network Model Development ... 82
    ANN Model Type 1 ... 82
      Finding the best network for model type 1 ... 82
      Finding the optimal number of hidden neurons and importance of input variables ... 83
    ANN Model Type 2 ... 86
      Finding the best network for model type 2 ... 86
      Finding the optimal number of hidden neurons and importance of input variables ... 87
    ANN Model Type 3 ... 90
      Finding the best network for ANN model type 3 ... 90
      Finding the optimal number of hidden neurons and importance of input variables ... 91
    ANN Model Type 4 ... 94
      Finding the best network for ANN model type 4 ... 94
      Finding the optimal number of hidden neurons and importance of input variables ... 95

PAGE 7

    ANN Model Type 5 ... 98
      Finding the best network for ANN model type 5 ... 98
      Finding the optimal number of hidden neurons and importance of input variables ... 99
    ANN Model Type 6 ... 102
      Finding the best network for model type 6 ... 102
      Finding the optimal number of hidden neurons and importance of input variables ... 103
    ANN Model Type 7 ... 106
      Finding the best network for model type 7 ... 106
      Finding the optimal number of hidden neurons and importance of input variables ... 107
    ANN Model Type 8 ... 110
      Finding the best network for model type 8 ... 110
      Finding the optimal number of hidden neurons and importance of input variables ... 111
    ANN Model Type 9 ... 114
      Finding the best network for model type 9 ... 114
      Finding the optimal number of hidden neurons and importance of input variables ... 115
  Summary of the Best Neural Network for Each ANN Model Type ... 118

6 MODEL VALIDATION ... 121
  Validation for ANN Model Type 1 ... 121
  Validation for ANN Model Type 2 ... 121
  Validation for ANN Model Type 3 ... 122
  Validation for ANN Model Type 4 ... 124
  Validation for ANN Model Type 5 ... 125
  Validation for ANN Model Type 6 ... 125
  Validation for ANN Model Type 7 ... 126
  Validation for ANN Model Type 8 ... 127
  Validation for ANN Model Type 9 ... 128
  Summary ... 129

7 PREDICTION TOOL ... 131
  NeuroShell Fire MS Excel Add-in ... 132
    Dialog Box ... 132
    FireNet Function Call ... 132
  NeuroShell Run-Time Excel Add-in ... 133
  Development of a Prediction Tool for Predicting Contingency ... 133

8 CONCLUSIONS ... 136
  Conclusions ... 136
  Research Limitations ... 139

PAGE 8

  Recommendations for Future Research ... 140

APPENDIX

A FDOT PROJECT DATA ... 142
B DISTRIBUTION GRAPH BETWEEN INPUT VARIABLES ... 154
C EVALUATION GRAPH FOR TESTING ERROR ... 156
D EXPERIMENT ON THE NEUROSHELL 2 SOFTWARE ... 158

LIST OF REFERENCES ... 166
BIOGRAPHICAL SKETCH ... 170

PAGE 9

LIST OF TABLES

Table    page

2-1 AACE International cost estimate stages and contingencies ... 27
2-2 Standard deviation for normal and lognormal estimates ... 40
2-3 Funding decisions of initial contingency amount pay item and contingency supplemental agreements on FDOT Projects ... 41
2-4 Summary of methodologies for contingency estimation ... 42
4-1 Accessible/inaccessible data from FDOT construction project database ... 61
4-2 Construction project delivery methods ... 64
4-3 Candidate projects for Design-Build contracting ... 65
4-4 Construction project contract agreements ... 66
4-5 Candidate projects for Lump-Sum contracting ... 66
4-6 Summary of accessible input variables ... 74
5-1 Classification of each ANN model type ... 79
5-2 Three datasets for each ANN model type ... 80
5-3 Performance for training set of model type 1 for predicting contingency amount ... 83
5-4 Performance for testing set of model type 1 for predicting contingency amount ... 83
5-5 Performance for training set of model type 1 for predicting contingency rate ... 83
5-6 Performance for testing set of model type 1 for predicting contingency rate ... 83
5-7 Performance for training set of model type 2 for predicting contingency amount ... 87
5-8 Performance for testing set of model type 2 for predicting contingency amount ... 87
5-9 Performance for training set of model type 2 for predicting contingency rate ... 87
5-10 Performance for testing set of model type 2 for predicting contingency rate ... 87
5-11 Performance for training set of model type 3 for predicting contingency amount ... 91
5-12 Performance for testing set of model type 3 for predicting contingency amount ... 91

PAGE 10

5-13 Performance for training set of model type 3 for predicting contingency rate ... 91
5-14 Performance for testing set of model type 3 for predicting contingency rate ... 91
5-15 Performance for training set of model type 4 for predicting contingency amount ... 95
5-16 Performance for testing set of model type 4 for predicting contingency amount ... 95
5-17 Performance for training set of model type 4 for predicting contingency rate ... 95
5-18 Performance for testing set of model type 4 for predicting contingency rate ... 95
5-19 Performance for training set of model type 5 for predicting contingency amount ... 99
5-20 Performance for testing set of model type 5 for predicting contingency amount ... 99
5-21 Performance for training set of model type 5 for predicting contingency rate ... 99
5-22 Performance for testing set of model type 5 for predicting contingency rate ... 99
5-23 Performance for training set of model type 6 for predicting contingency amount ... 103
5-24 Performance for testing set of model type 6 for predicting contingency amount ... 103
5-25 Performance for training set of model type 6 for predicting contingency rate ... 103
5-26 Performance for testing set of model type 6 for predicting contingency rate ... 103
5-27 Performance for training set of model type 7 for predicting contingency amount ... 107
5-28 Performance for testing set of model type 7 for predicting contingency amount ... 107
5-29 Performance for training set of model type 7 for predicting contingency rate ... 107
5-30 Performance for testing set of model type 7 for predicting contingency rate ... 107
5-31 Performance for training set of model type 8 for predicting contingency amount ... 111
5-32 Performance for testing set of model type 8 for predicting contingency amount ... 111
5-33 Performance for training set of model type 8 for predicting contingency rate ... 111
5-34 Performance for testing set of model type 8 for predicting contingency rate ... 111
5-35 Performance for training set of model type 9 for predicting contingency amount ... 115
5-36 Performance for testing set of model type 9 for predicting contingency amount ... 115
5-37 Performance for training set of model type 9 for predicting contingency rate ... 115

PAGE 11

5-38 Performance for testing set of model type 9 for predicting contingency rate ... 115
5-39 The best neural network for each ANN model type ... 118
6-1 Performance for validation set of the best model for type 1 ... 121
6-2 Performance for validation set of the best model for type 2 ... 122
6-3 Performance for validation set of the best model for type 3 ... 124
6-4 Performance for validation set of the best model for type 4 ... 124
6-5 Performance for validation set of the second best model for type 4 ... 124
6-6 Performance for validation set of the best model for type 5 ... 125
6-7 Performance for validation set of the second best model for type 5 ... 125
6-8 Performance for validation set of the best model for type 6 ... 126
6-9 Performance for validation set of the second best model for type 6 ... 126
6-10 Performance for validation set of the best model for type 7 ... 127
6-11 Performance for validation set of the second best model for type 7 ... 127
6-12 Performance for validation set of the best model for type 8 ... 128
6-13 Performance for validation set of the best model for type 9 ... 129
7-1 FireNet function for contingency amount on each ANN model type ... 134
A-1 FDOT project data for ANN model type 1 ... 142
A-2 FDOT project data for ANN model type 2 ... 145
A-3 FDOT project data for ANN model type 3 ... 146
A-4 FDOT project data for ANN model type 4 ... 147
A-5 FDOT project data for ANN model type 5 ... 148
A-6 FDOT project data for ANN model type 6 ... 150
A-7 FDOT project data for ANN model type 7 ... 151
A-8 FDOT project data for ANN model type 8 ... 152
A-9 FDOT project data for ANN model type 9 ... 153

PAGE 12

D-1 Performance of the networks for ANN model type 1 ... 160
D-2 Contribution Factor of the networks for ANN model type 1 ... 161
D-3 Performance of the networks under alternative sets of input variables at the optimal number of hidden neurons ... 162
D-4 Contribution Factor of input variables on the networks under alternative sets at the optimal number of hidden neurons ... 163
D-5 Performance of the networks under alternative numbers of input variables ... 164
D-6 Performance of the networks forgetting previous weights ... 164
D-7 Performance of the networks memorizing previous weights ... 164

LIST OF FIGURES

Figure
1-1 Rates of under/overrunning budget on past FDOT projects
2-1 Triangle distribution with three known values
2-2 Pareto model
2-3 Beta distribution of each cost item
2-4 Project cost curve
2-5 Inputs and outputs of neural network for markup estimation
2-6 Maximum funding limits for contingencies in the FDOT contracts
3-1 Process of artificial neural network
3-2 Basic architecture of artificial neural network
3-3 Steps for creating artificial neural network
4-1 Flowchart for proposed methodology
4-2 Calculations for contingency amount and rate
4-3 Potential influencing factors on the contingency item
4-4 Example of FDOT quarterly cost & time report
4-5 Example of FDOT bid tabulation
4-6 Histogram of project work type
4-7 Histogram of project delivery method type
4-8 Histogram of project contract agreement type
4-9 Histogram of bid award method type
4-10 Histogram of number of bidders
4-11 Histogram of project size
4-12 Histogram of project duration
4-13 Histogram of project letting type

4-14 Definition of urban and rural in the state of Florida
4-15 Histogram of project geographical location
4-16 Histogram of project letting year
5-1 NeuroShell Predictor software
5-2 Basic structure of artificial neural network model to predict contingency item
5-3 Neural network learning panel
5-4 Enhanced generalization panel
5-5 Input configuration for the model type 1
5-6 Optimal number of hidden neurons for the model type 1
5-7 Importance of input variables at the optimal number of hidden neurons
5-8 Actual/predicted output value on training dataset of the model type 1
5-9 Actual/predicted output value on testing dataset of the model type 1
5-10 Input configuration for the model type 2
5-11 Optimal number of hidden neurons for the model type 2
5-12 Importance of input variables at the optimal number of hidden neurons
5-13 Actual/predicted output value on training dataset of the model type 2
5-14 Actual/predicted output value on testing dataset of the model type 2
5-15 Input configuration for the model type 3
5-16 Optimal number of hidden neurons for the model type 3
5-17 Importance of input variables at the optimal number of hidden neurons
5-18 Actual/predicted output value on training dataset for the model type 3
5-19 Actual/predicted output value on testing dataset of the model type 3
5-20 Input configuration for the model type 4
5-21 Optimal number of hidden neurons for the model type 4
5-22 Importance of input variables at the optimal number of hidden neurons

5-23 Actual/predicted output value on training dataset of the model type 4
5-24 Actual/predicted output value on testing dataset of the model type 4
5-25 Input configuration for the model type 5
5-26 Optimal number of hidden neurons for the model type 5
5-27 Importance of input variables at the optimal number of hidden neurons
5-28 Actual/predicted output value on training dataset of the model type 5
5-29 Actual/predicted output value on testing dataset of the model type 5
5-30 Input configuration for the model type 6
5-31 Optimal number of hidden neurons for the model type 6
5-32 Importance of input variables at the optimal number of hidden neurons
5-33 Actual/predicted output value on training dataset of the model type 6
5-34 Actual/predicted output value on testing dataset of the model type 6
5-35 Input configuration for the model type 7
5-36 Optimal number of hidden neurons for the model type 7
5-37 Importance of input variables at the optimal number of hidden neurons
5-38 Actual/predicted output value on training dataset of the model type 7
5-39 Actual/predicted output value on testing dataset of the model type 7
5-40 Input configuration for the model type 8
5-41 Optimal number of hidden neurons for the model type 8
5-42 Importance of input variables at the optimal number of hidden neurons
5-43 Actual/predicted output value on training dataset of the model type 8
5-44 Actual/predicted output value on testing dataset of the model type 8
5-45 Input configuration for the model type 9
5-46 Optimal number of hidden neurons for the model type 9
5-47 Importance of input variables at the optimal number of hidden neurons

5-48 Actual/predicted output value on training dataset of the model type 9
5-49 Actual/predicted output value on testing dataset of the model type 9
5-50 Comparison of R-squared value on the best network for each model type
5-51 Comparison of Correlation value on the best network for each model type
6-1 Actual/predicted output value on validation dataset of the model type 1
6-2 Actual/predicted output value on validation dataset of the model type 2
6-3 Actual/predicted output value on validation dataset of the model type 3
6-4 Actual/predicted output value on validation dataset of the model type 8
6-5 Actual/predicted output value on validation dataset of the model type 9
6-6 Performance of the best networks on the testing and validation set
6-7 Message in the NeuroShell Predictor
7-1 NeuroShell MS Excel Add-in panel
7-2 Dialog Box panel in MS Excel spreadsheet
7-3 Read First MS Excel panel in the prediction tool
7-4 TOOL MS Excel panel on the prediction tool
B-1 Distribution graphs between input variables on the best neural network
B-2 Distribution graphs between input variables on the worst neural network
C-1 Evaluation graphs for testing error on the best neural network
C-2 Evaluation graphs for testing error on the worst neural network
D-1 NeuroShell 2 software
D-2 Learning process on the NeuroShell 2 software
D-3 Searching for an optimal number of hidden neurons minimizing the testing error
D-4 Contribution Factor on the network at the optimal number of hidden neurons
D-5 Searching for a set of input variables for the network minimizing the testing error
D-6 Contribution Factor for ANN model under alternative sets of input variables

D-7 Searching for a number of input variables for the network minimizing the testing error
D-8 Performance of the networks under alternative datasets

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ARTIFICIAL NEURAL NETWORK MODEL TO PREDICT FINANCIAL CONTINGENCY FOR TRANSPORTATION CONSTRUCTION PROJECTS

By
Sang Choon Lhee
December 2009

Chair: Raymond Issa
Cochair: Ian Flood
Major: Design, Construction, and Planning

Construction projects involve many uncertainties and risks in all phases. As a result, all types of construction projects, including transportation projects, have historically experienced significant cost increases. Project contingencies are important items in the cost estimate: they compensate for unforeseen risks and guard against underestimated project costs and budget overruns. Generally, a contingency is represented as a fixed percentage of project cost. However, it is not appropriate to apply this deterministic method to all construction projects because it provides an arbitrary percentage value based only on the project costs.

The purpose of this study was to develop an artificial neural network model able to predict the required contingency amount on transportation construction projects for project owners or sponsors like DOT (Department of Transportation). To achieve this goal, factors that affect the owner's contingency were identified, their weights in contributing to the contingency were discovered, and an appropriate form of the owner's contingency was found using FDOT (Florida Department of Transportation) data from projects that were completed from 2004 to 2006.

The best artificial neural network model to predict the owner's contingency was found using the NeuroShell Predictor software, and a predictive tool was developed using a Microsoft Excel spreadsheet. The more accurate predictions of the contingency from this study can be used to better manage project contingency requirements and allow additional projects to be brought online at a faster pace.

CHAPTER 1
INTRODUCTION

Research Backgrounds

"In this world, nothing is certain but death and taxes." (Benjamin Franklin, 1789)

All construction projects, whether residential, commercial, heavy civil, or industrial, and whether small or large, involve many uncertainties and risks throughout all construction phases from start-up to completion. Baccarini (2004) mentioned that construction projects are notorious for budget overruns due to uncertainties. Historically, many cases of cost estimate failures due to uncertainties and risks can easily be found all over the world (Flyvbjerg et al. 2002). The Suez Canal, which was completed in 1869, had actual construction costs more than 20 times the initial estimate at the time of the decision to start construction. The Panama Canal, completed in 1914, experienced cost escalations in the range of 70 to 200%. And the Sydney Opera House was completed with actual costs approximately 15 times higher than the initial projected estimate in 1963.

Cost estimate failures have been inevitable in past transportation construction projects. Transportation construction projects have historically experienced significant increases in project cost from conceptual planning estimates to final completion (Molenaar 2005). Through a sample study of 258 transportation infrastructure projects representing different project types, geographical regions, and historical periods, Flyvbjerg et al. (2002) discovered that project costs are underestimated in approximately 90% of the projects and that actual costs are on average 28% higher than estimated costs. They also found that underestimated costs were wrong by a substantially larger margin than overestimated costs.

Given these historical facts about cost failures, the rate of overrunning/underrunning the budget was investigated among 592 transportation construction projects that were sponsored by the Florida Department of Transportation (FDOT) and completed in FY (Fiscal Year) 2004/2005 and 2005/2006. Three hundred and ten (310) out of 592 transportation projects (52.4%) experienced cost overruns against their original contract amounts, while the rate of underrunning the budget was only 6.4% (38 out of 592), as seen in Figure 1-1.

Figure 1-1. Rates of under/overrunning budget on past FDOT projects

As shown by these historical cases, a common error in the economic analysis and budgeting for construction projects is the underestimation of construction costs. As a provision against this error, contingency funding has been used for managing the risk of cost escalation and covering potential cost estimate shortfalls. In other words, construction participants, including owners and contractors, have been using construction contingencies as a cost item in the estimate to compensate for invisible and unforeseen risks and to guard against underestimated project costs and budget overruns. Popescu et al. (2003) described some reasons for adding the contingency item to the estimate, to cover one or possibly more of the following:
• Unpredictable price escalation for materials, labor, and installed equipment for projects with an estimated duration greater than 12 months
• Project complexity
• Incomplete working drawings at the time of detail estimate

• Incomplete design in the fast-track or design-build contracting approach
• Soft spots in the detail estimate due to possible estimating errors, to balance an estimate that is biased low
• Abnormal construction methods and startup requirements
• Estimator personal concerns regarding project, unusual construction risk, and difficulties to build
• Unforeseen safety and environmental requirements
• Preparation of a form of insurance that the contractors will stay within bid prices

Since the contingency item is included within an estimate that is prepared before the commencement of a project, it is directly related to the accuracy of the base estimate. Therefore, an accurate and adequate contingency estimate is important for the financial success of projects. The contingency may also have tremendous impacts on project outcomes for project owners and sponsors (Dey et al. 1994; Baccarini 2004). A large contingency might result in poor or unattractive cost management, uneconomic completion of projects, and a lack of available funds for other organizational activities, while a low contingency might give rise to inadequate funding for projects, unrealistic financial environments, and unsatisfactory performance outcomes.

According to the Federal Highway Administration (FHWA) (2007b), contingency fund management for transportation construction projects is currently handled in similar ways throughout the country, and contingency for construction projects is typically established and adjusted on a sliding scale or based on the risk in exposure to cost escalations. The FDOT has also established an initial contingency amount pay item and contingency supplemental agreements for compensating uncertainties and risks from work orders and funding additional work. Since the initial contingency amount pay item is included in its transportation project contracts prior to bid, it is an alternate method of obtaining funds for performing additional work orders without delay. In other words, the FDOT creates the initial contingency pay item, which can be included in a contract prior to bid, in order to avoid the delay caused by obtaining certification of availability of funds and by preparing and executing the initial contingency supplemental agreement, and to provide a means to perform additional work as soon as the first day of a project. Contingency supplemental agreements are also used for funding additional work in case the amount of such work exceeds the amount not committed against the initial contingency pay item.

According to the FDOT Construction Project Administration Manual (2002), the initial contingency amount pay item and all contingency supplemental agreements are established with the following funding limits.
• If the original contract amount is $5,000,000 or less, the amount authorized shall not exceed five percent (5%) of the original contract amount or $50,000, whichever is less.
• If the original contract amount is more than $5,000,000, the amount shall not exceed one percent (1%) of the original contract amount or $150,000, whichever is less.

The FDOT method of adding a fixed percentage of the estimated original contract amount for the contingency item is one of the most traditional methods and is commonly used because of its simplicity. However, under this contingency assessment method, over 50% of past transportation projects sponsored by the FDOT overran their original contract amounts, as seen in Figure 1-1. Therefore, in order to estimate more realistic and sufficient total project costs and to cover costs incurred due to uncertainties, the contingency item should be properly and effectively estimated by considering other factors influencing the contingency, not just the original contract amount.
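
As a minimal sketch, the funding limits quoted above from the FDOT Construction Project Administration Manual (2002) can be written as a small function. The function name and the sample contract amounts are hypothetical and chosen only for illustration; the thresholds, percentages, and caps are those stated in the text.

```python
def fdot_contingency_limit(original_contract_amount: float) -> float:
    """Maximum contingency funding (in dollars) under the limits quoted above.

    Hypothetical helper for illustration; only the thresholds, percentages,
    and dollar caps come from the text.
    """
    if original_contract_amount <= 5_000_000:
        # 5% of the original contract amount or $50,000, whichever is less
        return min(0.05 * original_contract_amount, 50_000.0)
    # 1% of the original contract amount or $150,000, whichever is less
    return min(0.01 * original_contract_amount, 150_000.0)


if __name__ == "__main__":
    # Sample contract amounts are assumptions, not FDOT project data.
    for amount in (400_000, 2_000_000, 10_000_000, 27_000_000):
        print(f"${amount:,} -> limit ${fdot_contingency_limit(amount):,.0f}")
```

Under these limits the cap depends only on the original contract amount, which is the limitation the preceding paragraph points out: for any contract between $1,000,000 and $5,000,000 the limit is the same $50,000.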

Research Objectives

The overall goal of this study is to develop an Artificial Neural Network (ANN) model that predicts the owner contingency in the cost estimate for transportation construction projects as accurately as possible before the project starts. The objectives are:
• To collect data related to the owner contingency item on transportation construction projects
• To identify potential factors that have an impact on the owner contingency and measure the impact of each input variable on the owner contingency item
• To determine and validate the accuracy of owner contingency items on transportation projects
• To determine an appropriate form of the owner contingency as the output variable for the ANN models
• To demonstrate the viability of the developed approach in predicting the owner contingency
• To develop a prediction tool with the best results from the artificial neural network models

With the proposed contingency prediction model, owners or sponsors like the FDOT will be able to determine an accurate contingency to guard against uncertain risks on their transportation construction projects at the start of construction.

Research Questions

To motivate this study, the following questions should be answered in developing a model to predict the owner contingency item for transportation construction projects.
• How accurate are the owner contingency amounts for transportation projects?
• What factors influence the owner contingency amount?
• How much impact do the influencing factors have on the owner contingency item?

• How can all effective factors be integrated in an artificial neural network model to predict the owner contingency amount?

Research Scope

This study focuses on transportation construction projects sponsored by the FDOT. The data for the research will be selected based on the following criteria:
• All projects that were practically completed in FY2004/2005, FY2005/2006, and FY2006/2007.
• The type of projects varies as follows: asphalt resurfacing, asphalt paving, bridge work, a combination of asphalt paving and bridge work, and other work.
• The time span for projects is from 2000 to 2006, based on the availability of data.
• The location of projects covers all regions of the state of Florida.
• The range of dollar value for projects is from $0.01 million to $27 million.
• Inflation adjustment will not be considered, for the simplicity of the prediction model. However, the project letting year will be considered as a potential influencing factor on the contingency item.

CHAPTER 2
LITERATURE REVIEWS

The objective of this chapter is to describe previous research related to contingency amounts for construction projects. The literature reviewed for this study includes an overview of contingencies, previous approaches for setting the contingency and their limitations, and a new direction for establishing a more rational approach to setting contingencies for transportation construction projects.

Contingency

Patrascu (1988) found no standard definition of contingency and mentioned that contingency is one of the most misunderstood, misinterpreted, and misapplied words in project execution. Contingency can and does mean different things to different people. Each project manager, each engineer, and often each estimator has their own definition of contingency. The definition depends on the trade association or scholar offering it, as the following definitions show:
• An amount of money or time added to the base estimated amount to (1) achieve a specific confidence level, or (2) allow for changes that experience shows will likely be required. This may be derived through statistical analyses of past projects, by applying experience, or through a probabilistic assessment of what may occur. (AACE 2000)
• The amount of money or time needed above the estimate to reduce the risk of overruns of project objectives to a level acceptable to the organization (PMI 2000)
• Contingency covers the costs that may result from incomplete design, unforeseen and unpredictable conditions, or uncertainties within the defined project scope. The amount of the contingency will depend on the status of design, procurement, and construction; and the complexities and uncertainties of the component parts of the project. (U.S. DOE 1994)
• Costs that are unforeseen but are expected based on experience and risk analysis. These costs do not include scope changes, escalation due to inflation, or allowances which are known but undefined. (Ahuja 1994)

• A markup applied to account for substantial uncertainties in quantities, unit costs, and the possibility of currently unforeseen risk events related to quantities, work elements, or other project requirements. (Molenaar 2005)

Depending on the project phase and the parties involved, there are three types of contingencies (Günhan and Arditi 2007). Of the three, the contractor contingency and the owner contingency can be identified as the construction contingency, since these two contingencies target changes during the construction process.
• Designer Contingency: A cost item of the preliminary budget controlled by the estimating party for potential cost increases during the preconstruction phase
• Contractor Contingency: A cost item of the construction budget controlled by the general contractor for covering unforeseen conditions during the construction phase
• Owner Contingency: A cost item of the owner's project budget controlled by the owner for covering all changes in the project scope and all costs for changes requested by the owner during the construction phase

The Association for the Advancement of Cost Engineering (AACE) International (1997) recommends expected accuracy ranges of the contingency item according to project cost estimate stages. Table 2-1 shows the expected accuracy ranges and suggested values of contingency for five cost estimate classifications.

Table 2-1. AACE International cost estimate stages and contingencies
AACE International project stage | Level of project definition | AACE International expected accuracy range (Low & High) | AACE International suggested contingency
Concept screening | 0% to 2% | L: -20% to -50%; H: +30% to +100% | 50%
Feasibility study | 1% to 15% | L: -15% to -30%; H: +20% to +50% | 30%
Authorization or control | 10% to 40% | L: -10% to -20%; H: +10% to +30% | 20%
Control or bid/tender | 30% to 70% | L: -5% to -15%; H: +5% to +20% | 15%
Check estimate or bid/tender | 50% to 100% | L: -3% to -10%; H: +3% to +15% | 5%

Previous Methodologies for Setting Contingency

Burroughs and Juntima (2004) mentioned that relatively little has been published on contingency-calculating techniques that improve the accuracy of project estimates. However, some methodologies for predicting the contingency item have been suggested, using probabilistic concepts, artificial intelligence techniques, or historic cost data analysis.

Traditional Fixed Percentage Approach

This is a subjective method based on gut feeling and intuition. It attempts to quantify the risk associated with a project based on past experience from similar types of project. A contingency estimated by this method usually ranges between 1 and 5 percent and rarely exceeds 10 percent (Jelen and Black 1983). The purpose of this approach is to ensure that the project budget is sufficient to contain any cost incurred by risks. Although this predetermined percentage method is popular and easy to use, it has several weaknesses (Karlsen and Lereim 2005):
• This method is in danger of being overly simplistic and heavily dependent on an estimator's faith in their own experiences
• The percentage figure is arbitrarily arrived at and not appropriate for a specific project
• There may be a tendency to double count risk
• A percentage addition still results in a single figure prediction of estimated costs, implying a degree of certainty that is simply not justified

Due to these weaknesses, this approach tends to underestimate the contingency on complex and poorly-defined projects and to overestimate the contingency on simple or well-defined projects (Burroughs and Juntima 2004). Therefore, this method should be used with caution on construction projects, since it may produce large variations from project to project in the probability of over-running or under-running the budget.

Expert Judgment This method uses the well-experienced and well-educated judgments of experts for assessing a contingency level. In this technique, skilled estimators and project team members use their own experiences and expertis e in order to assign an appropria te level of contingency for the project at hand. Unlike the traditional fixed percen tage method, it considers specific risk factors and base estimate competitiven ess on setting contingency. For example, the experts select predetermined contingencies for discrete risk leve ls (15% for high risk, 10% for average, and 5% for low risk). By using the specificity and the subjectivity in setting each projects contingency level, a project will more likely have more accu rate estimates. However, the main disadvantages of this method are (1) the subjectivity in that the skill, knowledge, and motivations of the experts may vary widely and (2) the difficulty of transf erring expertise in that only a few experts are available whose understanding of pr oject cost risk and estimates competitiveness can be relied on (Burroughs and Juntima 2004). Itemized Allocation Method Yeo (1990) observed that the c onventional approach of assign ing an overall contingency to the bottom-line estimates is overly simplistic and not easily verifiable. So, it is reasonable to consider several work packages to allocate co ntingency instead of a pplying it on the entire project (Ahmad 1992). Subdividing a project into work packages helps organize the estimating process and the same subdivisions can also help in analyzing the risk and uncertainty related with each subdivision or work package. The itemized allocation method was developed based on this fact. Drigani (1988) discussed this methodology of allocating contingencies for each project element and then summing them up to the project le vel. It is similar to the predetermined fixed percentage approach, but the each cost item (Ci) is allocated into an estimated percent 29

PAGE 30

contingency (Ti) and the project overall contingency (PT) is then estimated as a weighted average as seen in the fo llowing equation (Moselhi 1997). ) n 1i C(T TC 1 PTii (Equation 2-1) Where: n: The total number of cost items TC: The estimated target cost of the project This method is more rational than the traditiona l predetermined fixed percentage method and could lead to a more reliable estimate. It allo ws the estimator to take a closer look at each item and to examine the anatomy of the project co st. However, it still has the same weakness as the traditional method since the results h eavily depend on the estim ators knowledge and experience (Karlsen and Lereim 2005). Ahmad (1992) developed a simulation-based model for allocating contingency under the assumption that historic cost information on cost elements (work packages) is available to a companys estimating department and a frequency of the ratio of actual cost to estimated cost for each work package follows a triangle distributi on for simplicity. So this assumption needs only three values [rl (the lowest or optimistic ratio), rm (the most frequent or the most likely ratio), and rh (the highest or pessimistic ra tio)] on the frequency distributi on curve as shown in Figure 2-1. The contingency for each cost elements can be calculated by using E quations 2-2, 2-3, 2-4, and 2-5. lh lmrr rr 100A (Equation 2-2) R/A)r(rrrlmle if R A (Equation 2-3) 30

\[ r_e = r_h - (r_h - r_m) \sqrt{(100 - R) / (100 - A)} \quad \text{if } R > A \]   (Equation 2-4)

\[ c_e = 100 \, (r_e - 1.0) \]   (Equation 2-5)

Where:
A: Area covered between r_l and r_m
R: Random number between 0 and 99 produced by a random number generator
r_e: Ratio selected at random by means of the random number
c_e: Contingency of each work package

Figure 2-1. Triangle distribution with three known values

For each work package with a given set of three ratios, a simulated distribution can be obtained by generating random numbers and using Equations 2-2, 2-3, 2-4, and 2-5. The next step is to use the mean and standard deviation of the generated distribution to obtain probable outcomes through many cycles of calculation with computer-generated random numbers. Assuming that this simulated distribution curve approximates a normal distribution curve, a number of outcomes with different probabilities can be obtained from the mean and standard deviation of each work package. Finally, Equation 2-6 is used for each item's contingency.

\[ \text{Contingency} = m_c + sd_c \times (Z \text{ value}) \]   (Equation 2-6)

Where:
m_c: The generated mean of the work package
sd_c: The generated standard deviation of the work package

This method is simple and can be easily computerized using spreadsheet programs. Because it is based on statistical information for each work package, it helps prevent the practice of arbitrarily allocating the contingency to the bottom-line estimate.

Probabilistic Itemized Allocation Method

This method is similar to the itemized allocation method except that (1) it uses Pareto's Law, the law of the vital few and the trivial many, also known as the 80/20 rule, and (2) it examines closely each work item considered significant (SC_i) and allocates a probability value (P_i) rather than a percent contingency to each work item. According to Pareto's Law, 80% of the risk is associated with 20% of the cost items in the process of contingency estimation, as seen in Figure 2-2 (Moselhi 1997).

Figure 2-2. Pareto model (Moselhi 1997)
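The simulation-based allocation of Ahmad (1992) summarized in Equations 2-2 to 2-6 can be sketched in a few lines of Python. This is only an illustrative sketch, not Ahmad's original program: the function names, the number of draws, the fixed seed, and the use of a continuous random number on [0, 100] (the text describes R as a random number between 0 and 99) are all assumptions made here for illustration.

```python
import math
import random
import statistics


def sample_ratio(r_l, r_m, r_h, rng):
    """Draw one actual/estimated cost ratio from the triangle distribution."""
    area_left = 100.0 * (r_m - r_l) / (r_h - r_l)        # Equation 2-2
    r = rng.uniform(0.0, 100.0)                          # random number R (assumed continuous)
    if area_left > 0.0 and r <= area_left:               # Equation 2-3
        return r_l + (r_m - r_l) * math.sqrt(r / area_left)
    # Equation 2-4
    return r_h - (r_h - r_m) * math.sqrt((100.0 - r) / (100.0 - area_left))


def work_package_contingency(r_l, r_m, r_h, z_value, n_draws=10_000, seed=0):
    """Return (contingency %, generated mean, generated std. dev.) for one work package."""
    rng = random.Random(seed)                            # fixed seed: an assumption for repeatability
    # Equation 2-5: percent contingency implied by each simulated ratio
    draws = [100.0 * (sample_ratio(r_l, r_m, r_h, rng) - 1.0) for _ in range(n_draws)]
    m_c = statistics.fmean(draws)
    sd_c = statistics.stdev(draws)
    return m_c + sd_c * z_value, m_c, sd_c               # Equation 2-6


# Hypothetical work package: ratios of 0.9 (optimistic), 1.0 (most likely), 1.3 (pessimistic)
contingency, mean_pct, sd_pct = work_package_contingency(0.9, 1.0, 1.3, z_value=1.28)
print(f"mean={mean_pct:.2f}%  sd={sd_pct:.2f}%  contingency={contingency:.2f}%")
```

The three ratios in the usage lines are hypothetical; in practice they would come from the historic ratio of actual to estimated cost for the work package.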

Accordingly, the contingency can be estimated from Equation 2-7. The probability values used in this equation can either be derived from historical data or simply be assigned based on the subjective judgment of estimators.

\[ PT = \frac{1}{TC} \sum_{i=1}^{m} (1 - P_i) \, (SC)_i \]   (Equation 2-7)

Where:
m: The number of significant cost items being considered
P_i: The probability value for each significant cost item
(SC)_i: The potential risk or cost overrun associated with the ith significant cost item

PERT (Program Evaluation and Review Technique) Method

The PERT method is a well-known approach developed in the 1950s to simplify the planning and scheduling of large and complex projects. Its objective is to evaluate the risk of not meeting the time goals of projects whose activity durations are estimated with a high level of uncertainty. It is based on the central limit theorem of probability. In this method, three-point estimates of cost are necessary for each item being considered: a most-likely, an optimistic, and a pessimistic cost, or equivalently the target, lowest, and highest costs, respectively. These estimates can be made either quantitatively, based on historic data collected from previous projects, or qualitatively, based on judgment and experience. The method also requires some judgment about the probability density function that describes each cost item as a random variable taking on values between its estimated lowest and highest costs. This function is commonly assumed to follow a beta distribution (Moselhi 1997; Karlsen and Lereim 2005).

Figure 2-3. Beta distribution of each cost item

The expected value and the variance for any cost item can be calculated as:

\[ \text{Mean} = \frac{\text{Maximum Value} + 4 \times \text{Modal Value} + \text{Minimum Value}}{6} \]   (Equation 2-8)

\[ \text{Variance} = \left[ \frac{\text{Maximum Value} - \text{Minimum Value}}{6} \right]^2 \]   (Equation 2-9)

The project target cost, based on the central limit theorem, follows a normal distribution. Once the expected or target project cost (TC) and the variance (V) of the total cost have been determined, one can find the required project cost (X) for a desired probability of completing the project within that cost, using the Z-score table for the normal distribution, as seen in Figure 2-4. Finally, one can calculate the contingency amount that satisfies the desired probability using Equations 2-10 and 2-11.

\[ Z \text{ score} = \frac{X - TC}{\sqrt{V}} \]   (Equation 2-10)

\[ \text{Contingency} = X - TC = (Z \text{ score}) \times \sqrt{V} \]   (Equation 2-11)
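As a worked illustration of Equations 2-8 to 2-11, the following Python sketch computes the target cost, the total variance, and the contingency for a desired probability. It is a minimal sketch under stated assumptions: the function names and the three cost items are hypothetical, the item variances are simply summed (the independence assumption of the traditional PERT method), and the Z score is taken from the standard library's normal distribution rather than from a printed table.

```python
import math
from statistics import NormalDist


def pert_item(minimum, modal, maximum):
    """Mean and variance of one cost item from its three-point estimate."""
    mean = (maximum + 4.0 * modal + minimum) / 6.0       # Equation 2-8
    variance = ((maximum - minimum) / 6.0) ** 2          # Equation 2-9
    return mean, variance


def pert_contingency(items, probability):
    """Return (target cost TC, variance V, contingency) for the desired probability."""
    stats = [pert_item(*item) for item in items]
    target_cost = sum(mean for mean, _ in stats)
    variance = sum(var for _, var in stats)              # assumes independent cost items
    z_score = NormalDist().inv_cdf(probability)          # Z score for the desired probability
    contingency = z_score * math.sqrt(variance)          # Equation 2-11
    return target_cost, variance, contingency


# Hypothetical cost items as (lowest, most likely, highest) costs
items = [(90.0, 100.0, 130.0), (45.0, 50.0, 70.0), (180.0, 200.0, 240.0)]
tc, v, cont = pert_contingency(items, probability=0.90)
print(f"TC={tc:.2f}  V={v:.2f}  contingency={cont:.2f}  required cost X={tc + cont:.2f}")
```

The required project cost X of Equation 2-10 is then the target cost plus the contingency.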

Figure 2-4. Project cost curve

This method is a simple and economical approach that does not require sophisticated and complex computer systems. It is adequate for assigning the contingency of each cost item when the estimate is of the unit cost/quantity type and is broken down by item (Aquino 1992). However, since it assumes that all cost items are independent, it tends to underestimate the variance of the project cost and accordingly leads to an erroneous probability of occurrence (Moselhi 1997).

Nassar (2002) proposed a quantitative approach, based on the PERT method, for performing contingency analysis for construction projects using spreadsheet techniques in Microsoft Excel. This analysis provides a more definitive perception of the overall risk of a construction project and a more rational basis for contingency planning and evaluation by visualizing risks in spreadsheets.

Moselhi (1997) also proposed a direct probabilistic PERT method that considers any correlation that may exist among the project cost items. Based on the marginal probability distribution used for each cost item, the mean and the variance are calculated for each item. The

mean of the project cost is simply the sum of the means calculated for the individual items, as in the PERT method. However, unlike the traditional PERT method, the variance of the project cost is calculated so as to account for the correlation among the cost items, based on historical data or subjective judgment, as seen in Equation 2-12.

\[ V(TC) = \sum_{i=1}^{n} V(C_i) + 2 \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j > i}}^{n} \rho_{ij} \sqrt{V(C_i)} \sqrt{V(C_j)} \]   (Equation 2-12)

Where:
V(TC): The variance of the total cost of the project
V(C_i), V(C_j): The variance of cost items i and j
\rho_{ij}: The correlation between cost items i and j

Monte Carlo Simulation (MCS)

Since the development of Monte Carlo Simulation techniques in the late 1940s and their continuous improvement, this method has been regarded as an alternative to the PERT method and accepted as a frequently used tool for estimating contingency (Moselhi 1997; Lorance 1992). Monte Carlo Simulation is a quantitative method that is usually probabilistic in nature and provides a statistical confidence level for cost outcomes.

Like the PERT approach, Monte Carlo Simulation needs either three estimates for each item or a range defined by the lowest and highest cost of the item, together with an assumed probability density function. A triangle distribution is generally assumed. In this method, random numbers are used to generate a set of artificial cost values for each item within its cost range. The generated cost sets, or simulations, whose number varies from 100 to 10,000, are used to calculate the mean or target project cost and its variance. The probability of the project cost exceeding the calculated target by a specified amount is then calculated as in the PERT method. However, the assumption of a triangle distribution has been tested and was found

invalid because the distribution produces unacceptable systematic errors (Chau 1995). In addition, since the method assumes cost items to be independent random variables, it does not allow for correlation among them. It therefore tends to underestimate the variance, and hence the standard deviation, of the estimated project cost, which leads to an erroneous and misleading estimate of the probability for an assumed level of risk. To use this method, one should therefore identify an appropriate distribution, consider correlations among cost items, and develop a revised Monte Carlo Simulation approach based on these considerations.

Fuzzy Set Theory

The fuzzy set approach, proposed by Zadeh (1965), is useful for uncertainty analysis where a probabilistic database is not available or when the values of input variables are uncertain. Using the concept of a membership function, the fuzzy set approach has been widely applied to represent the uncertainties of real-life situations, because fuzziness or uncertainty represents situations where membership in a set cannot be defined on a yes/no basis due to the vague boundaries of the set (Paek et al. 1992). Paek et al. (1993) characterized the uncertainty in risk-associated consequences and its impact on results by applying fuzzy set theory and incorporated it into the bidding price decision process using software based on a risk-pricing algorithm. They also pointed out some drawbacks of probabilistic analysis and interval analysis, two other methods used for uncertainty analysis. In probabilistic analysis, statistical measures such as the mean and standard deviation of the input variables are used to estimate the mean and standard deviation of the results. However, this method is inappropriate if the input variables are non-probabilistic in nature. Interval analysis uses ranges for the input variables to estimate possible ranges of the results. However, it is also difficult to define the ranges of the input variables when the boundaries of the ranges are uncertain.
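Both the PERT and the Monte Carlo Simulation discussions above note that treating cost items as independent underestimates the variance of the project cost. The effect can be seen numerically with a small Python sketch of Equation 2-12. The function name, the item variances, and the uniform correlation of 0.5 are hypothetical values chosen only for illustration, and the cross terms are summed over pairs with i < j.

```python
import math


def total_cost_variance(variances, correlation):
    """Equation 2-12: item variances plus correlated cross terms over pairs i < j."""
    n = len(variances)
    total = sum(variances)
    for i in range(n):
        for j in range(i + 1, n):
            total += (2.0 * correlation[i][j]
                      * math.sqrt(variances[i]) * math.sqrt(variances[j]))
    return total


# Hypothetical item variances (standard deviations of 10, 5, and 20 cost units)
variances = [100.0, 25.0, 400.0]

# No correlation: reduces to the plain sum used by the traditional PERT method
independent = [[0.0] * 3 for _ in range(3)]
# Assumed positive correlation of 0.5 between every pair of items
correlated = [[1.0 if i == j else 0.5 for j in range(3)] for i in range(3)]

print(total_cost_variance(variances, independent))
print(total_cost_variance(variances, correlated))
```

With positively correlated items the total variance is larger than the independent sum, so a contingency computed from the independent variance would be too small for the same confidence level.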

Regression Analysis

Regression analysis is an empirical, objective statistical technique for finding the equation that best fits sets of observations of a response variable (output variable) and multiple explanatory variables (input variables), in order to represent the true underlying relationships between the variables. A quantitative regression model produces consistent results no matter who applies it, because it brings expert knowledge into the prediction of contingency on the basis of actual historic data, without the need for a skilled expert on every project. To use this technique, sufficiently detailed project cost data must be collected, which is an important factor in successfully implementing regression analysis. Burroughs and Juntima (2004) suggested that regression analysis is a good and viable alternative to traditional methods such as the percentage method and expert judgment for estimating contingency.

Artificial Neural Network (ANN) Method

Artificial neural networks gain analogy-based problem solving capabilities by learning from a number of input patterns and their associated output patterns. The estimation of optimum markup is considered a major risk assessment measure that substitutes for or supplements the allocation of contingencies based on the risk associated with each item being estimated (De Neufville and King 1991). On this basis, Moselhi et al. (1993) developed a decision-support system (DSS) that helps contractors prepare competitive bids for building projects. The system uses Back Propagation neural networks (BPNN) for markup estimation and derives solutions for new bid situations by analogy with past projects. The neural networks were trained to generalize project knowledge obtained through surveys of current bidding practice and of past bidding experiences of the participating contractors, and to be able to predict the project

outcomes when fed with the contractor's assessment of various project risks, as shown in Figure 2-5.

Figure 2-5. Inputs and outputs of neural network for markup estimation (Moselhi, Hegazy, and Fazio 1993)

Chen and Hartman (2000) also developed an artificial neural network (ANN) based model to predict the total contingency cost and time allowance, or variances, at the front-end stage of project development. They chose an ANN-based technique because real-world systems are often nonlinear, and problems involving complex nonlinear relationships can be solved better by neural networks than by the traditional percentage method. They randomly divided the project samples into training, test, and production subsets in the ratio of 60% to 20% to 20%. In the development of the ANN models, BPNN and General Regression Neural Network (GRNN) were used for the proposed cost model, and BPNN and Probabilistic Neural Network (PNN) were used for the time model. A thorough comparison between the results from the ANN and those from the Multiple Linear Regression (MLR) technique showed that the neural network outperformed MLR and could be used satisfactorily as an estimation model for predicting cost performance.

However, they also noted some disadvantages of the ANN application: (1) the difficulty of tracing the outputs back through the internal structure of the neural network due to its black box nature and (2) the requirement for extensive data collection.

Other Probabilistic Methods

Rothwell (2005) found that the cost contingency can be approximately equal to the standard deviation of the cost estimate if the cost estimate has a normal or lognormal distribution. If the cost estimate is normally distributed, the standard deviation is σ = X/Z, where X is the level of accuracy and Z depends on the confidence level. For example, the level of accuracy for a preliminary estimate is about ±30%. If the cost estimator has 80% confidence in this range of accuracy, Z = 1.28. So, the standard deviation (σ) is 23.4% (= 30%/1.28). In the same way, one can calculate the standard deviations of the detailed and finalized estimates and compare them with the contingency rates suggested by AACE International. As with the normal distribution, cost estimates with lognormal distributions can also be assigned a contingency equal to their standard deviation, calculated by using the LOGNORMDIST function in the MS Excel spreadsheet. The lognormal distribution is more realistic for many cost estimates, since many cost estimate accuracy ranges are non-symmetric. As shown in Table 2-2, the standard deviation of the cost estimate, whether the distribution is normal or lognormal, is approximately equal to the contingency suggested by AACE International.

Table 2-2. Standard deviation for normal and lognormal estimates (Rothwell 2005)

Estimate type          Accuracy range   Std. dev. under normal distribution   Std. dev. under lognormal distribution   Suggested contingency by AACE
Preliminary estimate   ±30%             23.4%                                 18.3%                                    20%
Detailed estimate      ±20%             15.6%                                 13.1%                                    15%
Finalized estimate     ±10%             7.8%                                  7.0%                                     5%
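The normal-distribution column of Table 2-2 can be reproduced with the relation σ = X/Z. The short Python sketch below is illustrative only: the function name is hypothetical, and it assumes that the stated confidence is two-sided, so that 80% confidence corresponds to the 90th percentile of the standard normal distribution (Z of about 1.28, as in the text).

```python
from statistics import NormalDist


def std_dev_from_accuracy(accuracy_pct, confidence):
    """sigma = X / Z for a normally distributed cost estimate.

    Assumes a two-sided confidence level, e.g. 0.80 -> Z at the 90th percentile.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return accuracy_pct / z


# Accuracy ranges of the preliminary, detailed, and finalized estimates
for label, accuracy in [("Preliminary", 30.0), ("Detailed", 20.0), ("Finalized", 10.0)]:
    print(f"{label}: sigma = {std_dev_from_accuracy(accuracy, 0.80):.1f}%")
```

The lognormal column is not reproduced here, since the text only states that it was calculated with the LOGNORMDIST spreadsheet function.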

Touran (2003) proposed a probabilistic contingency model that considers the random nature of change orders and their impacts on project cost and schedule. The model incorporates uncertainties in project cost and schedule and calculates contingency based on the level of confidence specified by the owner. It considers the effect of schedule delay in increasing the project cost and the effect of correlation between change order costs. This model assumes that change orders occur randomly in time according to a Poisson process. It can be used for budgeting purposes at the early stages of project planning and development. Table 2-4 summarizes the methodologies for setting contingencies reviewed above.

Limitation of Current Methods for Estimating Contingency on FDOT Projects

Currently, the FDOT uses two methods to determine contingencies for work orders. The first is the Initial Contingency Amount Pay Item. Since the initial contingency pay item is included in a contract prior to bid, it can be used for performing additional work as early as the first day of a project. The second is the Contingency Supplemental Agreement, which is a method of obtaining additional funds for work orders whose cost exceeds the initial contingency amount. However, both methods are subject to maximum funding limits that depend on the original contract amount, as seen in Table 2-3 (FDOT 2002).

Table 2-3. Funding decisions of initial contingency amount pay item and contingency supplemental agreements on FDOT projects

Original contract amount   Maximum funding limit, Option 1      Maximum funding limit, Option 2   Decision
$5,000,000 or less         5% of the original contract amount   $50,000                           Whichever is less
More than $5,000,000       1% of the original contract amount   $150,000                          Whichever is less
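The funding rule of Table 2-3 can be written as a small function. This is a sketch, not FDOT's own procedure: the function name is hypothetical, and it assumes that the "whichever is less" decision applies to both contract-size tiers.

```python
def max_contingency_funding(original_contract_amount):
    """Maximum funding limit (in dollars) according to Table 2-3.

    Assumes the lesser of the percentage limit (Option 1) and the fixed
    dollar limit (Option 2) governs in both tiers.
    """
    if original_contract_amount <= 5_000_000:
        return min(0.05 * original_contract_amount, 50_000.0)
    return min(0.01 * original_contract_amount, 150_000.0)


# Hypothetical contract amounts
for amount in (500_000, 2_000_000, 10_000_000, 20_000_000):
    print(f"${amount:,}: limit ${max_contingency_funding(amount):,.0f}")
```

Under this reading the limit is a steadily shrinking share of the contract value for large contracts, which is the behavior the following discussion criticizes in fixed, contract-amount-based contingencies.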

Table 2-4. Summary of methodologies for contingency estimation

Methodology: Fixed percentage method
Description:
- Subjective method based on gut feeling and intuition
- Usually ranges between 1% and 5%
Comparison:
- Overly simplistic and dependent on the estimator's faith
- Inappropriate for specific projects
- Double counting of risk
- Implication of an unjustified degree of certainty

Methodology: Expert judgment
Description:
- Use of the educated judgments of experts (skilled estimators and project team members)
- Various degrees of structure in the contingency setting process
Comparison:
- Subjectivity in skills, knowledge, and motivations of the experts
- Difficulty of transferring expertise

Methodology: Itemized allocation method
Description:
- Subdivision of a project into work packages
- Allocation of contingency to each cost item
Comparison:
- More rational and reliable method
- Dependence on the estimator's knowledge and experience

Methodology: Probabilistic itemized allocation method
Description:
- Use of Pareto's Law
- Allocation of probability values for the contingency of each item
Comparison:
- Probability values derived from historical data or subjective judgment

Methodology: Program Evaluation and Review Technique (PERT)
Description:
- Based on the Central Limit Theorem
- Use of three estimates of cost (most-likely, optimistic, and pessimistic cost)
- Based on quantitative data and qualitative judgment and experience
- Use of beta distribution for the function of cost
Comparison:
- Simple and economical approach
- Adequate for assigning contingency in cost estimates of the unit cost/quantity type
- Underestimation of variance due to the assumption of independence
- Erroneous probability of occurrence

Methodology: Monte Carlo Simulation (MCS)
Description:
- Use of random numbers for a set of artificial cost values for each item
- Based on triangle distribution
Comparison:
- Viable alternative to PERT
- Invalid assumption of triangle distribution
- Independent random variables (no correlations)
- Estimation of variances
- Need for an appropriate distribution assumption and for correlation among cost items

Methodology: Fuzzy set theory
Description:
- Useful for uncertainty analysis when probabilistic data are unavailable and input variables are uncertain
- Characterization of the uncertainty in risk-associated consequences and incorporation into the bidding price decision process
Comparison:
- Compensation for the disadvantages of probabilistic and interval analysis

Methodology: Regression analysis
Description:
- Use of best fitting equations for sets of response and explanatory variables
- Representation of underlying relationships between variables
Comparison:
- Empirical and objective approach
- Consistent results regardless of users
- Need for detailed project cost data

Methodology: Artificial Neural Network (ANN)
Description:
- Use of analogy-based problem solving capabilities by learning from input and output patterns
Comparison:
- Solution for nonlinear relationships
- Difficulty of tracing output through the internal structure due to its black box nature
- Need for extensive data collection

Methodology: Rothwell's standard deviation method
Description:
- Approximation on normal and lognormal distribution

Methodology: Touran's probabilistic method
Description:
- Consideration of the random nature of change orders
- Incorporation of uncertainties in cost and time
- Based on Poisson distribution
Comparison:
- Use for budgeting purposes at early stages of project planning and development

Figure 2-6 represents the maximum funding limits for the initial contingency amount pay item and the contingency supplemental agreements as a function of the original contract amount.

Figure 2-6. Maximum funding limits for contingencies in the FDOT contracts

This method of adding a uniform fixed percentage of the original contract amount as the contingency item has been regarded as one of the traditional approaches for predicting contingency. Generally, the contingency percentage reported in modern estimating textbooks is around 5-10% of the contract value (Smith and Bohn 1999). Baccarini (2004) also found that the average construction contingency was 5.24% of the awarded contract value for 48 road construction projects completed by the Australian Government. This value was calculated by a traditional percentage approach. In addition, the FHWA (2007b) recommends a contingency of around 5% to 10% for growth during construction, to allow for the likelihood that additional construction work will be identified after the design is completed and the project is awarded. Traditionally, contingencies are assessed as a single amount, usually a specific and fixed percentage, and are added to the base estimate typically derived from intuition, past experience and historical data in the construction

industry for a long time for budgeting purposes (Baccarini 2004; Mak and Picken 2000). Ahuja et al. (1994) mentioned that the most reliable assessment of contingency is based on the experience of the people involved. However, Thomson and Perry (1992) asserted that methods which simply add a specific percentage of contingency to the estimated cost of a project are arbitrary and unscientific, saying that all too often risk is either ignored or dealt with in an arbitrary way. They pointed out several weaknesses of using a predetermined percentage of contingency, as follows:

- The percentage figure is, most likely, arbitrarily arrived at and not appropriate for the specific project
- There is a tendency to double count risk because some estimators are inclined to include contingencies in their best estimate
- A percentage addition still results in a single-figure prediction of estimated cost, implying a degree of certainty that is simply not justified
- The added percentage indicates the potential for detrimental or downside risk. It does not indicate any potential for cost reduction and may therefore hide poor management of the execution of the project
- Because the percentage allows for all risks in terms of a cost contingency, it tends to direct attention away from time, performance, and quality risks
- It does not encourage creativity in estimating practice, allowing it to become routine and mundane, which can propagate oversights

Therefore, adding a fixed percentage of contingency to the estimate based only on the contract amount is an unscientific approach and one of the reasons why so many construction projects have experienced budget overruns.

CHAPTER 3
ARTIFICIAL NEURAL NETWORK METHODOLOGY

This chapter provides an overview of the artificial neural network (ANN) technique, the main modeling tool used in this study for predicting the contingency item, and describes its application. In addition, powerful development tools for implementing artificial neural network models are suggested.

Overview

In recent years, a new approach known as the artificial neural network has grown in popularity in science and engineering. Since the late 1980s, artificial neural networks have been applied to various topics in civil engineering (Flood and Kartam 1994a). These topics cover process optimization, construction simulation, cost estimation, prediction, and classification and selection problems. The ANN method has been described as follows:

- A neural network is a system composed of many simple processing elements operating in parallel whose function is determined by the network structure, the connection strengths, and the processing performed at computing elements or nodes. (DARPA Neural Network Study 1988)
- Artificial neural systems, or neural networks, are physical cellular systems which can acquire, store, and utilize experiential knowledge. (Zurada 1992)
- A neural network is a circuit composed of a very large number of simple processing elements that are neurally based. Each element operates only on local information. Furthermore each element operates asynchronously; thus there is no overall system clock. (Nigrin 1993)
- A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. (Haykin 1994)

Figure 3-1. Process of artificial neural network
[Figure labels: Input, Adaptive System (w), Output, Real Output, Difference, Error, Training Algorithm, Change Parameters]

Figure 3-1 shows the general process of the artificial neural network approach. The artificial neural network is a powerful data modeling technique that is able to represent complex input/output relationships through a learning process. Learning can be defined as a self-organizing process, a mapping process, an optimization process, or a decision-making process (Adeli and Wu 1998). This methodology mimics the human biological nervous system in its problem solving process in the following two ways:

- Knowledge is acquired by the network through a trial and error learning process
- The network's knowledge is stored within interneuron connection strengths as synaptic weights

Basically, a human biological neuron receives inputs from other sources, combines them in some way, performs a nonlinear operation on the result, and then outputs a final result. Like the human brain's problem solving process, neural networks look for patterns in training sets of data, learn these patterns, and develop the ability to find relationships between input and output data and to obtain outputs in decision, classification, prediction and forecasting problems

where data contains non-linear characteristics. So, the neural network is a powerful methodology for nonlinear modeling problems where the functional relationships between input and output variables are vague and poorly understood.

The ANN is a type of information-processing system that gains analogy-based problem solving capabilities. The neural network therefore has a strong ability, referred to as generalization, to infer answers to questions from knowledge. With the following advantages and capabilities, this methodology has been applied in almost all areas of science and engineering (Flood and Kartam 1994a).

- Development of generalized solutions to a nonlinear problem from a set of samples (data driven approach)
- Continuous development and adaptability to changing conditions from new variations of a problem
- Flexible production of valid solutions in case of errors in the training data or in the description of an example of a problem

However, the following factors are important in order to implement artificial neural networks successfully (Flood and Kartam 1994b).

- The quality of the data used for the training process
- The type and structure of the neural network adopted
- The method of training
- The way input and output data are structured and interpreted

Structure of Artificial Neural Network

As seen in Figure 3-2, the basic architecture of an ANN consists of three layers: the input, hidden, and output layers. The input layer, which holds an example of the problem to be solved, directly receives data from external sources such as input files and serves to introduce the values of the input variables. The output layer, which provides the corresponding answer or solution to the presented pattern, is connected to all of the units in the preceding layer and either feeds outputs back to the input layer for further processing or sends information directly to the outside world.
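The three-layer structure described above can be illustrated with a minimal forward pass through one hidden layer. The sketch below is not the network used in this study: the function names, the bias terms, the sigmoid activation at both layers, the single output neuron, and the randomly initialized weights are assumptions made only to show how values flow from the input layer through the hidden layer to the output layer.

```python
import math
import random


def sigmoid(x):
    """Sigmoid activation (assumed here), mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate one input pattern through the hidden layer to a single output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return hidden, output


# Hypothetical network: 3 inputs (I1..I3), 2 hidden neurons (H1, H2), 1 output
rng = random.Random(0)
hidden_weights = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(2)]
hidden_biases = [0.0, 0.0]
output_weights = [rng.uniform(-1.0, 1.0) for _ in range(2)]

hidden, output = forward([0.2, 0.7, 0.1], hidden_weights, hidden_biases, output_weights, 0.0)
print(hidden, output)
```

Training consists of adjusting the weights in `hidden_weights` and `output_weights` so that the output approaches the target for each pattern; adding hidden neurons adds rows to `hidden_weights` and therefore more adjustable parameters.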

Between the input and output layers, there can be an arbitrary number of hidden layers, chosen according to design experience and expertise. These internal layers have no contact with the outside and contain many neurons in various interconnected structures. The number of hidden layers is important in determining the generalization capability and processing speed of the ANN. The number of hidden neurons should also be considered and determined with caution (Flood and Kartam 1994a; Flood 1999). A larger number of hidden neurons gives a neural network more degrees of freedom for modeling the data set and provides a literal interpretation of the data, while a smaller number of hidden neurons produces a generalized interpretation. So, an appropriate number of hidden neurons should be chosen according to the desired degree of generalization.

Figure 3-2. Basic architecture of artificial neural network
[Figure labels: INPUT LAYER (I1, I2, I3, ..., In), HIDDEN LAYER (H1, H2, ..., Hn), OUTPUT LAYER (Output)]

Steps for Creating Artificial Neural Network

In order to apply artificial neural networks successfully, the steps shown in Figure 3-3 are followed. Before a neural network is designed and applied, a target or objective for creating it is established. The objective defines the variables to be predicted or classified in the neural network analysis. These variables will be the outputs for the

network. Next, the variables that may influence the outputs are determined. This step selects the input factors for the network. It is better to include too many input variables than too few, since neural networks detect subtle differences among all potential inputs through the learning process.

Figure 3-3. Steps for creating artificial neural network

After the output and input variables for the network have been decided, the next step is to collect the data, or patterns, that provide the network with examples from which to generalize. It is important to gather enough data for training the network. A good rule of thumb is that the number of training patterns should equal 10 times the number of inputs. Based on knowledge and information about networks, the collected data need to be encoded to simplify the input data and increase the performance of the network.

The next step is to design a neural network. This means choosing an appropriate network type from among the Backpropagation Neural Network (BPNN), Probabilistic Neural Network (PNN), General Regression Neural Network (GRNN), Recurrent Network, and other network types.

Using neural network software packages or tools programmed in a specific computer language, the neural network is then trained until it generalizes well, that is, until it produces the best possible answer or pattern for future data. One thing that requires careful attention during the training process is the danger of overtraining, which degrades the performance of the network after a certain point in time. So, one of the most important steps in generating successful neural networks is to know when to stop the training procedure.

The next step is to validate the ANN using the test data set. The number of data records in the test set is approximately 10 percent of the number in the training set.

The final step is to design the network again if it does not reach a satisfactory level of performance. The corrective actions for this step are as follows:

- Change the type, structure, or size of the network
- Add more hidden neurons or change the input variables
- Change some parameters of the network, such as the learning rate or smoothing factor

Types of Artificial Neural Network

Depending on whether a specific desired (targeted) output is needed for the learning patterns, neural networks can be divided into two types: supervised networks and unsupervised networks.

Supervised Network

Supervised networks build models that classify patterns or make predictions from other patterns of inputs and outputs through a learning (training) process involving a desired output. They produce the most reasonable answer from the learned patterns. Types of supervised network

include Backpropagation Neural Networks (BPNN), Recurrent Networks, General Regression Neural Networks (GRNN), and Probabilistic Neural Networks (PNN).

Among supervised networks, the three-layer feedforward BPNN has been the most widely used for civil engineering applications (Flood and Kartam 1997). The BPNN is trained based on the Generalized Delta Rule (Rumelhart et al. 1986). It consists of an input layer, an output layer, and one to three hidden layers, and it uses the sigmoid function as the activation function. The connection scheme is feedforward: all units in one layer connect to all units in subsequent layers in a forward direction.

Unsupervised Network

Unsupervised networks build models that classify a set of training patterns into specified categories without being shown in advance how to categorize them, since the patterns do not include any desired output data. Because the training patterns do not include targeted outputs, the network adapts only in response to its inputs and is left to determine the categories automatically during the training process. The Kohonen Self-Organizing Map network (1984) is one of the typical unsupervised networks. The network learns to classify patterns by making its own clustering scheme for the patterns.

Training and Testing (Validation)

The training process is necessary for a neural network to represent properly the problem being solved. It determines an appropriate set of weights that enables the network to do so. Each input pattern flows through the network, and the resultant output is compared with the targeted output. The difference between the two values provides a measure of the total error in the network. The weights are then adjusted in the direction that decreases the error, according to a prescribed rule. The most popular rule for training is the Generalized Delta Rule (Rumelhart et al. 1986). This process is repeated many times until the

network reproduces acceptable outputs, or corresponding solutions to the problem, within a specified tolerance (threshold value). In the Generalized Delta Rule, the first derivative of the total error with respect to a weight determines the extent to which that weight is adjusted. In other words, the rule continuously modifies the strengths of the input connections to reduce the difference between the actual output and the desired output. It changes the synaptic (input) weights so as to minimize the mean squared error of the network, or until an acceptable accuracy is reached. When using this rule, it is important to ensure that the input data set is well randomized, since well-ordered or structured presentations of the training set might lead to a significantly biased network that cannot converge to the desired accuracy (Anderson and McNeil 1992).

Validation is another important aspect of constructing neural network models, as with any other statistical model. After the weights have been adjusted through repeated training, neural network models need to demonstrate the validity of their results before they can be used with confidence, since they are purely data-driven and empirical. Although there does not seem to be a well-formulated theoretical methodology for neural network model validation, the usual practice is to evaluate network performance with a set of test data that was not used for training. The targeted outputs and the outputs produced by the network can be compared qualitatively, by plotting the points for visual comparison, or quantitatively, by a statistical test of the correlation coefficient (Flood and Kartam 1994a).

For evaluating and validating neural network models, continuous error metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), or Root Mean

Squared Error (RMSE) are generally used (Twomey and Smith 1997). The definitions of MAE, MSE, and RMSE are given in Equations 3-1, 3-2, and 3-3.

\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m} \left| T_{ij} - O_{ij} \right| \] (Equation 3-1)

\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m} \left( T_{ij} - O_{ij} \right)^{2} \] (Equation 3-2)

\[ \mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m} \left( T_{ij} - O_{ij} \right)^{2} } \] (Equation 3-3)

Where:
T: The target for the single neuron j, and i is each input pattern
O: The output of a single neuron j
m: The number of components in the output vector
n: The number of patterns in the test set

Development Tools for Artificial Neural Network

There are currently many computer languages and tools for building neural network models. These tools can be divided into two groups. The first group uses traditional computer languages to write a program. While this approach provides the maximum flexibility for matching the objectives of a neural network model, it requires an experienced programmer to minimize training time. Flood et al. (2001) proposed a Radial-Gaussian neural network model to accurately predict the maximum strength of externally reinforced concrete beams. Gagarin et al. (1994) used a programmed tool to apply neural networks to the problem of determining truck attributes such as velocity, axle spacing, and axle loads only from strain-response readings taken

from the structure. Adeli and Wu (1998) proposed a regularization neural network model for estimating highway construction costs in the MATLAB programming language, which was selected because of the availability of many built-in numerical analysis functions.

The second group uses network simulators. Many commercial software packages for neural network models are currently available worldwide, including the following:

- BrainMaker (California Scientific Software, USA)
- NeuralWorks (NeuralWare, USA)
- MATLAB Neural Network Toolbox (The MathWorks, USA)
- Propagator (ARD Corporation, USA)
- NeuroShell (Ward Systems Group, USA)
- NeuroSolutions (NeuroDimension, USA)
- NeuroGenetic Optimizer (BioComp Systems, USA)
- Neuralyst (Cheshire Engineering Corporation, USA)
- Dendronic Learning Engine (Dendronic Decisions Limited, Canada)
- NeuNet Pro (CorMac Technologies, Canada)
- Neuframe (Neusciences, UK)
- Trajan Neural Network Simulator (Trajan Software, UK)
- DataEngine (MIT GmbH, Germany)
- NNetView (Neuronic, Switzerland)
- NeuroForecaster (Accel Infotech Pte, Republic of Singapore)

These software tools do not require complex programming skills and are easy to use, because they are window-based and provide a total package for designing, building, training, testing, and running neural networks. Issa et al. (1998) used the NeuroShell 2 software to run a neural network model that predicts the energy performance index for residential buildings. Hanna et al. (1997) employed the BrainMaker software to develop a neural network model for owner-contractor prequalification decisions.
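As an illustration of the first group of tools (a program written in a general-purpose language), the error metrics of Equations 3-1 to 3-3 can be sketched in a few lines of Python. This is a minimal sketch, not part of the study: the function name `error_metrics` and the sample values are hypothetical, and the division by n (the number of test patterns) is assumed from the symbol definitions given with the equations.

```python
import math


def error_metrics(targets, outputs):
    """Return (MAE, MSE, RMSE) over n patterns, each with m output components.

    targets[i][j] plays the role of T_ij and outputs[i][j] of O_ij.
    Sums run over every pattern and component and are divided by n,
    the number of patterns (an assumption about the normalization).
    """
    n = len(targets)
    if n == 0 or n != len(outputs):
        raise ValueError("targets and outputs must be non-empty and equal in length")

    abs_sum = 0.0
    sq_sum = 0.0
    for t_row, o_row in zip(targets, outputs):
        if len(t_row) != len(o_row):
            raise ValueError("each pattern needs as many outputs as targets")
        for t, o in zip(t_row, o_row):
            diff = t - o
            abs_sum += abs(diff)
            sq_sum += diff * diff

    mae = abs_sum / n
    mse = sq_sum / n
    return mae, mse, math.sqrt(mse)


# Hypothetical contingency rates: targets vs. network outputs for three test patterns.
mae, mse, rmse = error_metrics([[0.10], [0.05], [0.08]], [[0.12], [0.05], [0.05]])
print(f"MAE={mae:.4f} MSE={mse:.6f} RMSE={rmse:.4f}")
```

With a single output neuron (m = 1), as in a model that predicts one contingency value, the inner sum has one term and the metrics reduce to their familiar single-output forms.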

CHAPTER 4
RESEARCH METHODOLOGY

Methodology Overview

This study is to be conducted in five main steps, as shown in Figure 4-1: strategizing, data collection and evaluation, model development and evaluation, model validation, and implementation of a prediction tool.

The first step is to determine the form of the output variable, which is the application objective of the artificial neural network (ANN) model, and to identify through literature review the potential input factors (variables) that might influence the contingency.

The second step is data collection. Once the desired output variable and the potential input variables have been determined, FDOT (Florida Department of Transportation) project data for the potential input variables and the contingency will be collected. From the collected data, the desired contingency rates or amounts, which serve as output variables, will be calculated from the difference between the final present cost (contract amount) and the estimated cost (original contract amount).

The next phase is to develop an artificial neural network model to predict the owner contingency using the NeuroShell Predictor software. Sensitivity analysis will help in discovering the optimal number of hidden neurons and in identifying which input variables are important for predicting the output variable. After the best neural networks have been found by training and testing the ANN models, a final evaluation and validation of the models will be performed with additional project contingency data.

Finally, a prediction tool based on the best ANN model will be developed for end users to predict the owner contingency. The prediction tool will run interactively in a Microsoft Excel spreadsheet.

Figure 4-1. Flowchart for proposed methodology (process stages and objectives: Strategizing, establish the familiarity of the ANN model through literature reviews, output variable, and potential input variables; Data Collection, collect the project database for contingency from the FDOT database; Model Development & Evaluation, develop an ANN model to predict contingency; Model Validation, validate the developed ANN model with additional project contingency data; Implementation, develop a prediction tool for end users)

Strategizing

The first step in developing the ANN model is to establish as much familiarity as possible with the problem at hand: determining the form of the output variable and finding the potential input variables that are likely to be significant.

Desired Contingency as Output Variable

The output variable for the proposed neural network models is the desired contingency amount and rate for construction projects. The contingency in this study is defined as the cost item that compensates for all unforeseen work orders and related risks, including the initial contingency amount. It can be called the Total Contingency Amount. It is therefore an amount of money set aside to cater for the uncertainty associated with the delivery of the project.
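The calculation procedure that this section gives for Figure 4-2 (steps 1 to 8) can be illustrated with a short Python sketch. The names `Contingency` and `desired_contingency` and all dollar figures are hypothetical and are used only to show the arithmetic; they are not FDOT data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    original_contract: float      # step (3) = (1) + (2)
    initial_rate: float           # step (4) = (2) / (1)
    contractual_variation: float  # step (6) = (5) - (3)
    desired_amount: float         # step (7) = (2) + (6)
    desired_rate: float           # step (8) = (7) / (1)


def desired_contingency(adjusted_original, initial_contingency, final_contract):
    """Apply steps (3)-(8) to inputs (1), (2), and (5)."""
    if adjusted_original <= 0:
        raise ValueError("adjusted original contract amount must be positive")
    original_contract = adjusted_original + initial_contingency
    variation = final_contract - original_contract
    desired_amount = initial_contingency + variation
    return Contingency(
        original_contract=original_contract,
        initial_rate=initial_contingency / adjusted_original,
        contractual_variation=variation,
        desired_amount=desired_amount,
        desired_rate=desired_amount / adjusted_original,
    )


# Hypothetical project: $1,000,000 adjusted original amount, $50,000 initial
# contingency, and a final (present) contract amount of $1,120,000.
result = desired_contingency(1_000_000.0, 50_000.0, 1_120_000.0)
print(result)
```

In this hypothetical case the contractual variation is $70,000, so the desired contingency is $120,000, or 12 percent of the adjusted original contract amount, compared with an initial contingency rate of 5 percent.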
To obtain the desired total contingency amount, the difference between the original contract amount and the final contract amount will be calculated as shown in Figure 4-2. Mak

and Picken (2000) and Baccarini (2004) also state that contingency can be calculated in the same way, from a comparison between the predicted (original) cost and the actual (present) final cost.

Figure 4-2. Calculations for contingency amount and rate

The procedure for calculating the total desired contingency amount and rate is as follows:

(1) Adjusted Original Contract Amount
(2) Initial Contingency Amount
(3) Original Contract Amount = (1) + (2)
(4) Initial Contingency Rate = (2) / (1)
(5) Final (Present) Contract Amount
(6) Contractual Variation = (5) - (3)
(7) Desired Contingency Amount = (2) + (6)
(8) Desired Contingency Rate = (7) / (1)

Therefore, the desired contingency in this study is the accurate cost item in the estimate for a project that finishes on budget, neither over budget nor under budget. With accurate contingencies,

sponsors or owners like the FDOT can manage financial plans effectively, without tying up excess allowance that could be used for other projects in the future.

Potential Input Variables

Before data are collected, the potential input variables that might influence the contingency item should be identified. According to the FHWA (Federal Highway Administration) Major Project Program Cost Estimating Guidance manual (2007a), the following factors may have an impact on contingencies.

Design-Build Contracts

Design-Build contracts on major construction projects have shown little cost increase from the start to the final completion of projects under a negotiated contract amount. They may therefore require a smaller contingency, because far fewer construction claims arise from design errors and omissions.

Number of Concurrent Contracts and Contract Interfaces

On projects where multiple contracts are underway at the same time, close coordination of construction activities and schedules may be required. The potential for one contractor to affect another contractor's activities is higher and may result in additional delay or coordination costs during construction. A higher contingency may therefore be required on these multiple-contract projects.

Contractor Proposed Construction Changes

Contracts include specifications such as Value Engineering Change Proposals that allow contractors to propose construction changes that benefit both the contractor and the owner. Contracts that restrict the opportunity for contractors to make these changes may limit the ability to contain costs once construction starts. An increased contingency may be appropriate in these situations.

Construction Time

On projects with longer durations, there is a greater risk of impacts to the schedule, and therefore the contingency amount should be higher. Construction scheduled in winter or rainy seasons should be accounted for appropriately in the contingency amount, because unforeseen weather delays create a higher risk of missing the construction schedule.

Transportation Management Plans for Work Zones

Major projects often have complex construction traffic controls and may have multiple construction contracts underway at the same time. The cost of implementing the Transportation Management Plan for work zones must be included in the estimate. Costs may also include incident management, public information and communication efforts, transit demand management, and improvements to the local area network that help improve safety and traffic flow during construction.

Environmental Impacts

Major projects go through a thorough NEPA (National Environmental Policy Act) process. Because of the size and complexity of most major projects, there is often greater public and resource agency scrutiny during construction. This attention results in a greater likelihood that additional environmental mitigation will be required once construction begins.

Other Factors

Other potential impacts on the construction contingency include the risks of encountering underground utilities and other obstructions, differing site conditions, contaminated soil, and multi-agency involvement.

Popescu et al. (2003) also noted that the magnitude of contingency items depends on the type of contract agreement, the type of construction, and the project location. Based on the FHWA's guidance and Popescu's recommendation, contingency-related data from transportation

construction projects sponsored and completed by the FDOT will be collected. The project data to be collected for the potential influencing factors are listed in Figure 4-3.

Figure 4-3. Potential influencing factors on the contingency item

Data Collection

Data Collection Method

Contingency-related project data are available from the FDOT quarterly construction project database reports, which cover all sponsored projects delivered and completed over the past 5 years and are published on the official FDOT website. The FDOT also provides bidding information and tabulations for all construction projects.

The quarterly time and cost reports provide project information such as project delivery method type, project size, duration, letting type, geographical location, and project year. The bid tabulations for each project provide information such as project work type, contract agreement type, bid award type, geographical location, and number of bidders (please see Figure

4-4 and 4-5). Data for 772 FDOT projects completed from FY 2004 to the third quarter of FY 2006 were collected for this study.

Table 4-1. Accessible/inaccessible data from FDOT construction project database

Accessible data                  Inaccessible data
Project work type                Number of concurrent projects
Project delivery method type     Project site conditions
Project contract agreement       Possibility of construction changes
Bid award type                   Environmental impacts
Number of bidders                Transportation management plans
Project size
Project duration
Project letting type
Project geographical location
Project letting year

Figure 4-4. Example of FDOT quarterly cost & time report

Figure 4-5. Example of FDOT bid tabulation

Descriptions of Accessible Input Variables

Project work type

The first potential input variable is the project work type. Project work can be roadway, rehabilitation, structure, signage, signalization, lighting, landscape, or other construction activities. Uncertainties and risks vary with the type of work. In general, roadway, rehabilitation, and structural work carry more uncertainty and risk than signing, signalization, lighting, and landscape work because they are more complex. Project work types also involve many variations, since every construction project is unique.

For this study, project work was categorized into the following five types. Figure 4-6 shows the frequency distribution of the five main project work types. Asphalt resurfacing and paving account for approximately 60 percent of the collected projects.

- Asphalt Resurfacing (Type 1)
- Asphalt Paving (Type 2)
- Bridge Work (Type 3)
- Combination of Bridge Work and Asphalt Paving (Type 4)
- Other Works (Type 5)

Project delivery method type

A construction delivery method is defined as the set of relationships, roles, and responsibilities of project members and the sequence of activities; it varies from project to project according to the project objectives. There are five primary delivery methods for construction projects, as seen in Table 4-2. Among them, the FDOT has been using the following two delivery methods for transportation construction projects. Of the two, the FDOT has been using the DB (Design-Build) method, which combines design, construction, and even right-of-way

services into a single contract, in order to reduce costs and expedite construction through speedy and coordinated communication.

- Design-Bid-Build (DBB) Method
- Design-Build (DB) Method

Figure 4-6. Histogram of project work type (count by work type: Type 1 = 366, Type 2 = 97, Type 3 = 78, Type 4 = 53, Type 5 = 178)

Table 4-2. Construction project delivery methods

Project delivery method    Description
Design-Bid-Build (DBB)     Traditional sequential contract with separate awards to a designer and a contractor
Design-Build (DB)          Single contract for all aspects of design, engineering, procurement, and construction using a team concept
Fast Track                 Concurrent contract for expediting the schedule with overlapping construction processes
Construction Management    Performance-based owner-agent contract for management of contractors and construction contracts using a negotiated GMP
Owner Builders             Contract for all construction activities by the owner's full staff

The FDOT Design-Build Guidelines (2007) recommend strong consideration of Design-Build contracting for the following types of projects:

- Projects with an expedited schedule and a high possibility of early completion
- Projects with minimum right-of-way acquisition and utility relocation
- Projects with a well-defined scope for all parties (design and construction)

- Projects with room for innovation in the design/construction effort
- Projects with a low risk of unforeseen conditions
- Projects with a low possibility of significant change during all phases of work

Table 4-3 shows examples of good and poor candidate projects for the Design-Build contracting method, as suggested by the FDOT Guidelines. However, only the DBB (Design-Bid-Build) project delivery method was considered in this study, because of the small number of DB (Design-Build) projects, as seen in Figure 4-7.

Table 4-3. Candidate projects for Design-Build contracting

Good Design-Build contracting candidates    Not Good Design-Build contracting candidates
Major bridge and minor bridge               Urban construction/reconstruction
ITS (Computer signalized traffic)           Rehabilitation of movable bridges
Intersection improvements                   Major bridge rehabilitation/repair projects
Office buildings and rest areas             Mill and resurfacing
Interstate widening and rural widening
Fencing
Landscaping
Lighting
Sidewalks
Signing and signalization
Guardrail

Figure 4-7. Histogram of project delivery method type (count by delivery type: DB (Design-Build) = 26, DBB (Design-Bid-Build) = 746)

Project contract agreement type

The following three types of contract agreement are generally used for construction projects:

- Lump-Sum (LS) Contract Agreement
- Unit-Price (UP) Contract Agreement
- Cost-plus-Fee Contract Agreement

Table 4-4. Construction project contract agreements

Contract agreement        Description
Lump-Sum contract         A contract in which a fixed price is determined without a cost breakdown for a well-defined project
Unit-Price contract       A contract in which the contractor is paid for the measured quantity of work at a fixed price for each item under a cost breakdown
Cost-plus-Fee contract    A contract in which the contractor is reimbursed for all construction costs and is paid an additional agreed-upon fee (profit)

The FDOT has been employing lump-sum and unit-price contract agreements for transportation projects. In particular, lump-sum contracting has been used as an innovative method to reduce the costs of design and contract administration associated with quantity calculation, verification, and measurement on simple projects that have a well-defined scope for all parties, a low risk of unforeseen conditions, and a low possibility of work changes during all construction phases.

Table 4-5. Candidate projects for Lump-Sum contracting

Good Lump-Sum contracting candidates    Not Good Lump-Sum contracting candidates
Bridge painting                         Urban construction/reconstruction
Bridge projects                         Rehabilitation of movable bridges
Fencing                                 Projects with subsoil earthwork
Guardrail                               Concrete pavement rehabilitation projects
Intersection improvements               Major bridge rehabilitation/repair projects
Landscaping
Lighting
Mill/resurface
Minor road widening
Sidewalks
Signing
Signalization

The FDOT Lump Sum Project Guidelines (2001) give examples of good candidate projects for lump-sum contracting, as shown in Table 4-5. Figure 4-8 shows the number of projects under each of the two contract agreement types. Of the two, the unit-price contract agreement is more common on FDOT projects.

Figure 4-8. Histogram of project contract agreement type (count by contract agreement type: LS (Lump Sum) = 316, UP (Unit Price) = 456)

Bid award method type

The FDOT has been using the following three bid award methods for construction projects:

- Lowest Bid (LB) Method
- A+B (Cost + Time) Bidding Method
- Bid Averaging Method (BAM)

The A+B (Cost + Time) Bidding method lets contractors determine a reasonable duration for project completion and includes a time bid item, with an associated cost for the completion duration, in determining the lowest bid. The cost bid is the standard cost, and the time bid is (Days × Predetermined daily road-user cost).
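The A+B comparison described above can be sketched as follows. This is only an illustration of the arithmetic: the function name `a_plus_b_total`, the contractors, and all dollar and day values are hypothetical, and the assumption is that the comparison value is simply the cost bid plus the number of days multiplied by the daily road-user cost.

```python
def a_plus_b_total(cost_bid, days_bid, daily_road_user_cost):
    """Return the A+B comparison value: cost bid (A) plus priced time bid (B)."""
    if days_bid < 0 or daily_road_user_cost < 0:
        raise ValueError("days and daily road-user cost must be non-negative")
    time_bid = days_bid * daily_road_user_cost
    return cost_bid + time_bid


DAILY_ROAD_USER_COST = 5_000  # hypothetical owner-set value, dollars per day

# Hypothetical bids: (cost bid in dollars, proposed duration in days)
bids = {
    "Contractor X": (5_000_000, 300),
    "Contractor Y": (5_200_000, 250),
}

totals = {
    name: a_plus_b_total(cost, days, DAILY_ROAD_USER_COST)
    for name, (cost, days) in bids.items()
}
lowest = min(totals, key=totals.get)
print(lowest, totals[lowest])
```

In this hypothetical case Contractor Y has the higher cost bid but the lower combined value (6,450,000 against 6,500,000), which shows how a shorter proposed duration can outweigh a higher price under this method.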

The Bid Averaging Method (BAM) is used to get contractors to bid a true and reasonable project cost. However, this method may be used only on 100% State Funded or Local Projects, not on Federally Funded Projects. Only one project in the data for this study used this bidding method. In this study, only the LB (Lowest Bid) type was considered in developing the ANN model, because it accounts for the largest number of data records, as shown in Figure 4-9.

Figure 4-9. Histogram of bid award method type (count by bid award type: A+B (Cost+Time) Bidding = 25, BAM (Bid Averaging Method) = 1, LB (Lowest Bid) = 746)

Number of bidders

The number of bidders participating in the bidding phase varies from project to project, and the profitability of a project varies with it. For owners or sponsors, the participation of many bidders results in a lower bid price because of greater competition, but consequently in more risk and a higher probability of overrunning the project estimate. Figure 4-10 shows the distribution of the number of bidders in the collected data. In this study, irregular bids were not counted as received; only confirming bids were considered. Five or fewer bidders participated in most of the construction projects used in this study.

Figure 4-10. Histogram of number of bidders (count by number of bidders: 1 = 61, 2 = 158, 3 = 197, 4 = 151, 5 = 97, 6 = 39, 7 = 26, 8 = 25, 9 = 10, 10 = 3, 11 = 2, 12 = 2, 13 = 0, 14 = 1)

Project size (contract amount)

Project size can be measured by the contract dollar amount of each project. Projects with high contract amounts tend to have more complex and difficult scopes, and therefore involve more uncertainty and risk relative to their original contract budgets. Figure 4-11 shows the distribution of project amounts for the FDOT projects collected for this study.

Project duration

The project duration is the total number of construction days from contract award to practical completion. Every construction project has a different duration, set when the plans are developed with scheduling tools. Longer durations imply more complex work and a higher probability of unforeseen weather delays before completion. The risk of impacts on the schedule is greater, and therefore the contingency amount should be higher. As seen in Figure 4-12, about 80 percent of FDOT projects were completed in 300 work days or less.

Figure 4-11. Histogram of project size (x-axis: Original Contract Amount ($), tick labels from 0.5M to 9.5M; y-axis: Count; bar labels in order: 214, 126, 87, 65, 53, 55, 22, 25, 22, 12, 19, 17, 10, 12, 7, 6, 4, 3, 4, 9)

Figure 4-12. Histogram of project duration (x-axis: Original Contract Duration (Day), tick labels from 100 to 1500; y-axis: Count; bar labels in order: 223, 235, 147, 62, 20, 27, 21, 11, 12, 4, 5, 1, 2, 1, 0, 1)

Project letting (delivery) type

FDOT projects can be divided into two letting types depending on the procurement agency: Central Letting (CO) and District Letting (DO). The central office is responsible for purchasing professional services, contractual services, and commodities related to the state highway system. Each district office, a decentralized agency of the FDOT, is responsible for acquiring commodities, contractual services, road and bridge construction and maintenance, and professional services. Normally, central letting projects sponsored by the FDOT main office are more complex in scope and higher in contract amount than district letting projects, so they will include a higher contingency to cover these uncertainties. Figure 4-13 shows the number of projects of each letting type in the collected FDOT projects. Central Letting (CO) projects account for about twice as many projects as District Letting (DO) projects.

Figure 4-13. Histogram of project letting type (count by letting type: CO (Central Letting) = 518, DO (District Letting) = 254)

Project geographical location

Geographical location is important in executing construction projects. Projects near urban areas can readily obtain labor, materials, and equipment, and transportation time for them can also be estimated accurately. The location may therefore influence the amount of risk and thus the level of contingency and its variation. Based on the definition of the United States Office of Management and Budget (OMB), project locations were divided into urban and rural groups, as seen in Figure 4-14. Figure 4-15 shows the number of projects under this classification of geographical location. Most of the projects were executed in urban areas.

Figure 4-14. Definition of urban and rural in the state of Florida (USDA 2000)

Figure 4-15. Histogram of project geographical location (count by geographical location: Urban = 632, Rural = 140)

Project letting year

The project letting year is related to economic market conditions during the construction phases. As economic conditions change over the years, they may influence the level of contingency and its variation. The project letting year in this study was defined as the starting date of the contract award for the project. Figure 4-16 shows the distribution of letting years for the collected projects. Most of the projects started between 2002 and 2005 because of their availability in the FDOT database.

Summary of Accessible Input Variables

Based on the histograms of the accessible input variables, the categorical variables project work type, contract agreement type, bid award type, letting type, and geographical location will be considered in the ANN model, as seen in Table 4-6. In addition, only the DBB (Design-Bid-Build) type for project delivery method and the LB (Lowest Bid) type for bid award method will be considered in this study.

Figure 4-16. Histogram of project letting year (count by project year: 95 = 1, 96 = 2, 97 = 3, 98 = 6, 99 = 5, 00 = 21, 01 = 35, 02 = 94, 03 = 181, 04 = 198, 05 = 185, 06 = 41)

Table 4-6. Summary of accessible input variables

Accessible input variable        Type of variable    Remark
Project work type                Categorical         5 Focusing groups
Project delivery method type     Categorical         DBB
Contract agreement type          Categorical         LS/UP
Bid award type                   Categorical         LB
Project letting type             Categorical         CO/DO
Project geographical location    Categorical         Urban/Rural
Number of bidders                Numerical
Project amount                   Numerical
Project duration                 Numerical
Project letting year             Numerical
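The variables summarized in Table 4-6 could be held in a simple record type, with the study scope (DBB delivery and LB bid award) applied as a filter. The sketch below is an assumption about how such data might be organized in Python; the class name `Project`, its field names, and the three sample records are hypothetical and do not reproduce the FDOT database layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    # Categorical variables (Table 4-6)
    work_type: int            # 1 to 5
    delivery_method: str      # "DBB" or "DB"
    contract_agreement: str   # "LS" or "UP"
    bid_award: str            # "LB", "A+B", or "BAM"
    letting_type: str         # "CO" or "DO"
    location: str             # "URBAN" or "RURAL"
    # Numerical variables (Table 4-6)
    num_bidders: int
    amount: float             # original contract amount, dollars
    duration_days: int
    letting_year: int


def in_study_scope(project):
    """Keep only Design-Bid-Build projects awarded by Lowest Bid."""
    return project.delivery_method == "DBB" and project.bid_award == "LB"


# Hypothetical records for illustration only.
projects = [
    Project(1, "DBB", "UP", "LB", "CO", "URBAN", 3, 2_400_000.0, 180, 2004),
    Project(3, "DB", "LS", "A+B", "CO", "URBAN", 2, 9_100_000.0, 540, 2003),
    Project(5, "DBB", "LS", "BAM", "DO", "RURAL", 6, 600_000.0, 90, 2005),
]

selected = [p for p in projects if in_study_scope(p)]
print(len(selected), "of", len(projects), "projects are in scope")
```

Keeping the categorical fields separate from the numerical ones makes it straightforward to group records by category later and to pass only the numerical fields to a model.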

CHAPTER 5
ARTIFICIAL NEURAL NETWORK MODEL

Artificial Neural Network (ANN) Methodology

The main modeling tool for this study is the artificial neural network technique, a form of Artificial Intelligence (AI) that mimics the learning process of the human brain in order to extract patterns from historical data. It is useful for solving problems with nonlinear relationships between the input variables and the output variable, since there may be unknown combined effects. Successful implementation of the neural network method depends on the quantity and quality of the training dataset, the type and structure of the network, the method of training, and the way the input and output data are structured and interpreted. Through repeated learning with randomly selected training data, the method will find the most suitable neural network model to predict the contingency item on transportation construction projects. The validation procedure evaluates the performance of the network model on the test dataset, which is not used in the training process.

Development Tool for Artificial Neural Network Model

Many commercial neural network software tools can be used to design, build, train, test, and run neural networks as a total package. The NeuroShell Predictor software developed by Ward Systems Group, Inc. was used in this study. This software is a Microsoft Windows-based system that is easy to use and is suited to forecasting and estimating numerical values. It offers a simple step-by-step process that uses recognized forecasting methods to find predictive patterns in existing data, and it contains state-of-the-art prediction algorithms, yet it is designed to be extremely fast and effective with minimal intervention by the user.

Figure 5-1. NeuroShell Predictor software

Type and Structure of Artificial Neural Network Model

In this study, a supervised learning network will be used, since the data contain inputs with corresponding desired outputs and the network can learn to infer the relationship between the two from the training data. There are many structures and classification frameworks in the neural network method. One of the most common structures is the Multilayer Perceptron (MLP), in which units are organized in several layers: the input layer, the hidden layer, and the output layer. For this study, the input layer is composed of neurons for the potential input variables affecting the owner contingency, and the output layer consists of one neuron for the contingency amount or rate. Between the input and output layers, there is only one hidden layer in the NeuroShell Predictor software. The number of hidden neurons in the hidden layer is important in determining the generalization ability of the network model. The number of hidden neurons

directly increases the number of connection weights, because each hidden neuron is connected to every neuron in the other layers. Training with many hidden neurons is therefore more computationally intensive than training with fewer. The optimal number of hidden neurons is provided automatically by the NeuroShell Predictor software.

Figure 5-2 shows the basic three-layer structure of the artificial neural network models in this study. Neurons in the input layer represent the potential input variables for predicting the contingency item. Only the numerical input variables (number of bidders, project duration, project amount, and project letting year) were available and employed in developing the ANN prediction model, since ANN models are not capable of interpreting categorical variables that cannot be put in some meaningful order. The neuron in the output layer is the desired contingency amount or rate, derived from the difference between the original and final contract amounts.

Figure 5-2. Basic structure of artificial neural network model to predict contingency item
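To make the relationship between hidden neurons and connection weights concrete, the weights of a fully connected three-layer network can be counted directly. This is a generic sketch, not a description of NeuroShell Predictor internals: the function name `connection_weights` is hypothetical, and the inclusion of one bias weight per hidden and output neuron is an assumption.

```python
def connection_weights(n_inputs, n_hidden, n_outputs=1, with_bias=True):
    """Count weights in a fully connected input-hidden-output network.

    Each hidden neuron receives one weight per input, and each output neuron
    receives one weight per hidden neuron. A bias weight per hidden and output
    neuron is included when with_bias is True (an assumption).
    """
    if min(n_inputs, n_hidden, n_outputs) < 1:
        raise ValueError("every layer needs at least one neuron")
    bias = 1 if with_bias else 0
    input_to_hidden = (n_inputs + bias) * n_hidden
    hidden_to_output = (n_hidden + bias) * n_outputs
    return input_to_hidden + hidden_to_output


# Four numerical inputs (bidders, duration, amount, letting year), one output.
for hidden in (1, 2, 5, 10, 20):
    print(hidden, connection_weights(n_inputs=4, n_hidden=hidden))
```

With four inputs and one output, each additional hidden neuron adds six weights under these assumptions (four input weights, one bias, and one weight to the output), so the training effort grows with the size of the hidden layer.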

Development of Artificial Neural Network Model

A good method to handle unordered categorical input variables is to develop a separate ANN model for each category. In this study, separate ANN models were developed for each combination of categorical input variables. Because the ANN methodology is data-driven, ANN models are more effective when they are run with enough data. After setting problem domains (5%~95% distribution) for each input variable and removing significant outliers of the output variables, nine effective ANN models were developed, as shown in Table 5-1. The cutoff for applying an ANN model was set at 20 data records. All ANN model types fall into the DBB (Design-Bid-Build) type as project delivery method and the LB (Lowest Bid) type as bid award method. The total number of DBB and LB projects is 746 before defining the problem domain.

Implementation of Artificial Neural Network Model

Data for the nine effective ANN models were randomly divided into the following three groups in the ratio of 60% to 20% to 20%.

Training Set: the dataset used to fit the model so that it replicates or approximates the data during the development phase
Testing Set: the dataset used to evaluate the performance of the model during development and to help select the best model among competing ones
Validation Set: the dataset used to make the final assessment of the performance of the final model

After the input variables, output variables, and training strategy are selected for each neural network model, the process of training a network involves adding hidden neurons until the network is able to make good predictions. The optimal number of hidden neurons is the number that provides the best prediction of the network.
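The random 60%/20%/20% division described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_dataset`, the fixed seed, and the rounding rule are hypothetical and are not taken from the NeuroShell Predictor software. The group sizes reported in Table 5-2 deviate slightly from exact 60/20/20 proportions, so this sketch should not be expected to reproduce them.

```python
import random


def split_dataset(records, ratios=(0.6, 0.2, 0.2), seed=0):
    """Randomly split records into training, testing, and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * ratios[0])
    n_test = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


# Example with 106 record identifiers (the size of model Type 1).
train, test, validation = split_dataset(range(106))
print(len(train), len(test), len(validation))
```

Giving the remainder to the validation set guarantees that every record lands in exactly one group, whatever the rounding.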

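Table 5-1. Classification of each ANN model type

| Letting type | Contract agreement | Location | Work type | Data number (before setting problem domain) | Data number (after setting problem domain) | ANN type |
|---|---|---|---|---|---|---|
| CO | UP | URBAN | 1 | 129 | 106 | TYPE 1 |
| CO | UP | URBAN | 2 | 39 | 27 | TYPE 2 |
| CO | UP | URBAN | 3 | 10 | | |
| CO | UP | URBAN | 4 | 40 | 23 | TYPE 3 |
| CO | UP | URBAN | 5 | 33 | 23 | TYPE 4 |
| CO | UP | RURAL | 1 | 16 | | |
| CO | UP | RURAL | 2 | 5 | | |
| CO | UP | RURAL | 3 | 2 | | |
| CO | UP | RURAL | 4 | 5 | | |
| CO | UP | RURAL | 5 | 11 | | |
| CO | LS | URBAN | 1 | 90 | 71 | TYPE 5 |
| CO | LS | URBAN | 2 | 7 | | |
| CO | LS | URBAN | 3 | 7 | | |
| CO | LS | URBAN | 4 | 0 | | |
| CO | LS | URBAN | 5 | 50 | 32 | TYPE 6 |
| CO | LS | RURAL | 1 | 34 | 25 | TYPE 7 |
| CO | LS | RURAL | 2 | 4 | | |
| CO | LS | RURAL | 3 | 0 | | |
| CO | LS | RURAL | 4 | 2 | | |
| CO | LS | RURAL | 5 | 11 | | |
| DO | UP | URBAN | 1 | 34 | 23 | TYPE 8 |
| DO | UP | URBAN | 2 | 20 | 15 | |
| DO | UP | URBAN | 3 | 33 | 18 | |
| DO | UP | URBAN | 4 | 4 | | |
| DO | UP | URBAN | 5 | 36 | 30 | TYPE 9 |
| DO | UP | RURAL | 1 | 7 | | |
| DO | UP | RURAL | 2 | 1 | | |
| DO | UP | RURAL | 3 | 3 | | |
| DO | UP | RURAL | 4 | 0 | | |
| DO | UP | RURAL | 5 | 3 | | |
| DO | LS | URBAN | 1 | 25 | 17 | |
| DO | LS | URBAN | 2 | 13 | | |
| DO | LS | URBAN | 3 | 17 | | |
| DO | LS | URBAN | 4 | 0 | | |
| DO | LS | URBAN | 5 | 28 | 19 | |
| DO | LS | RURAL | 1 | 15 | | |
| DO | LS | RURAL | 2 | 5 | | |
| DO | LS | RURAL | 3 | 4 | | |
| DO | LS | RURAL | 4 | 0 | | |
| DO | LS | RURAL | 5 | 3 | | |

Note. Work type description: 1: Asphalt Resurfacing, 2: Asphalt Paving, 3: Bridge Work, 4: Combination of Bridge and Paving, 5: Other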
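The selection of feasible model types can be retraced with a short sketch. It enumerates the category combinations (two letting types, two contract agreements, two locations, five work types) and applies the cutoff of 20 records to the counts that Table 5-1 reports after the problem domain was set. The variable names are hypothetical, and only combinations with a reported "after" count are included.

```python
from itertools import product

CUTOFF = 20  # minimum number of records required to build an ANN model

# Every combination of letting type, contract agreement, location, and work type.
all_types = list(product(("CO", "DO"), ("UP", "LS"), ("Urban", "Rural"), range(1, 6)))

# Record counts after setting the problem domain, as reported in Table 5-1.
after_domain = {
    ("CO", "UP", "Urban", 1): 106,
    ("CO", "UP", "Urban", 2): 27,
    ("CO", "UP", "Urban", 4): 23,
    ("CO", "UP", "Urban", 5): 23,
    ("CO", "LS", "Urban", 1): 71,
    ("CO", "LS", "Urban", 5): 32,
    ("CO", "LS", "Rural", 1): 25,
    ("DO", "UP", "Urban", 1): 23,
    ("DO", "UP", "Urban", 2): 15,
    ("DO", "UP", "Urban", 3): 18,
    ("DO", "UP", "Urban", 5): 30,
    ("DO", "LS", "Urban", 1): 17,
    ("DO", "LS", "Urban", 5): 19,
}

# Keep only the combinations that meet the cutoff.
feasible = [key for key, count in after_domain.items() if count >= CUTOFF]
print(len(all_types), len(feasible))  # prints "40 9"
```

The result matches the nine model types labelled TYPE 1 through TYPE 9 out of forty possible combinations.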

Table 5-2 shows the descriptions and the number of data records in the three sets for each ANN model type. The software automatically finds the optimal number of hidden neurons out of the number of hidden neurons trained during the learning process.

Table 5-2. Three datasets for each ANN model type

| ANN type | Description | Data number | Training set | Testing set | Validation set |
|---|---|---|---|---|---|
| Type 1 | CO, UP, Urban, Work 1 | 106 | 60 | 22 | 24 |
| Type 2 | CO, UP, Urban, Work 2 | 27 | 16 | 6 | 5 |
| Type 3 | CO, UP, Urban, Work 4 | 23 | 14 | 5 | 4 |
| Type 4 | CO, UP, Urban, Work 5 | 23 | 13 | 7 | 3 |
| Type 5 | CO, LS, Urban, Work 1 | 71 | 43 | 15 | 13 |
| Type 6 | CO, LS, Urban, Work 5 | 32 | 21 | 6 | 5 |
| Type 7 | CO, LS, Rural, Work 1 | 25 | 16 | 5 | 4 |
| Type 8 | DO, UP, Urban, Work 1 | 23 | 13 | 6 | 4 |
| Type 9 | DO, UP, Urban, Work 5 | 30 | 17 | 8 | 5 |

Note. CO: Central Letting; DO: District Letting; UP: Unit Price; LS: Lump Sum. Work Type 1: Asphalt Resurfacing, 2: Asphalt Paving, 3: Bridge Work, 4: Combination of Bridge and Paving, 5: Other

Several accepted best net statistics serve both as the objective of network training and as parameters for comparing the performance of ANN models. Among them, the Average Error was used for comparing the performance of the neural networks developed. However, the R-squared and the Correlation were also used to compare networks whose output variables take different forms, where the average error is not usable for the comparison (i.e., contingency amount vs. contingency rate). The net statistics parameters considered were:

R-squared: an indicator measuring the adequacy of the ANN model
Average Error: the sum of the absolute values of the actual values minus the predicted values, divided by the number of patterns
Correlation (r): the measure of how the actual value and predicted value correlate to each other in terms of direction
MSE (Mean Squared Error): a statistical measure of the differences between the actual output value and the predicted output value
RMSE (Root Mean Squared Error): the square root of the MSE
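The net statistics listed above can be written out as a short sketch. The formulas are standard textbook definitions and are an assumption here: R-squared is taken as one minus the ratio of the squared prediction error to the squared deviation of the actual values from their mean, and Correlation as the Pearson coefficient. The function name `net_statistics` is hypothetical, and the software's exact formulas are not confirmed by this text.

```python
import math


def net_statistics(actual, predicted):
    """Compute Average Error, MSE, RMSE, R-squared, and Pearson correlation."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    avg_error = sum(abs(e) for e in errors) / n  # mean absolute error
    mse = sum(e * e for e in errors) / n
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    ss_total = sum((a - mean_a) ** 2 for a in actual)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    covariance = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    return {
        "avg_error": avg_error,
        "mse": mse,
        "rmse": math.sqrt(mse),
        # Negative when the predictions are worse than always predicting the mean.
        "r_squared": 1.0 - sum(e * e for e in errors) / ss_total,
        "correlation": covariance / math.sqrt(ss_total * ss_pred),
    }


# Toy values only (not project data): predictions in reverse order of the actuals.
stats = net_statistics([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
print(stats)
```

Under this definition, R-squared falls below zero whenever a network predicts worse than the mean of the actual values, which is one possible reading of the negative testing-set values reported in several of the tables that follow.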

Identification of Effective Input Variables

The study was initially implemented in the NeuroShell Predictor software with all four numerical input variables that influence the contingency item. In the neural network, the impact of each input variable is identified by the Importance of Inputs feature of the software, as seen in Figure 5-3. This feature indicates the relative significance of each input in predicting the output value. The relative importance numbers range from 0 to 1, and they are normalized so that the sum of the values for all inputs is approximately 1. These numbers can be considered as each input's percent contribution to the model. However, the values should not be compared across different ANN models, since the measure of importance for each input is relative to the current network model. In addition, a sensitivity analysis was performed to find the best configuration of input variables for the contingency item: each input variable was excluded in turn, and the resulting network was compared with the initial network that included all of the input variables.

Figure 5-3. Neural network learning panel

Enhanced Generalization for Testing Set

In the NeuroShell Predictor software, the Enhanced Generalization feature can be used if the data for a neural network is noisy, as seen in Figure 5-4. It is a method of applying the network that tends to smooth out the predictions for testing data. In other words, choosing an appropriate level of generalization from 0% (No Enhanced Generalization) to 100% (Over-Generalization) will give the neural network better results for data not included in the

training set where the data is noisy. This feature was applied to the testing dataset during the development of the ANN model for better prediction of the contingency item.

Figure 5-4. Enhanced generalization panel

Artificial Neural Network Model Development

ANN Model Type 1

Among forty ANN model types, there were nine feasible types for developing neural network models. Type 1 of the ANN models represents asphalt resurfacing as project work type, central letting as letting type, unit price as contract agreement type, and urban as geographical location.

Finding the best network for model type 1

There are ten potential networks for model Type 1, from five configurations of input variables and two forms of output variable, as seen in Tables 5-3, 5-4, 5-5, and 5-6. Among these networks, the best is the one with three input variables (all except Project Duration) that predicts the contingency amount as the output variable with the generalization level set at 100%, based on the average error of the testing set as seen in Table 5-4 and Figure 5-5. In this network, the Average Error is $67,323 for the training set and $76,227 for the testing set.
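The selection rule used here, picking the input configuration with the lowest average error on the testing set, can be sketched in a few lines. The numbers are the testing-set average errors reported in Table 5-4 for the contingency amount networks of model Type 1; the variable names are hypothetical.

```python
# Testing-set average error (in dollars) by excluded input variable, from Table 5-4.
test_avg_error = {
    "None": 82271.3,
    "No. of Bidders": 76315.5,
    "PY": 84654.9,
    "Duration": 76226.7,
    "Amount": 84083.0,
}

# The best configuration is the one with the smallest testing-set average error.
best_excluded = min(test_avg_error, key=test_avg_error.get)
print(best_excluded, test_avg_error[best_excluded])
```

The same rule applies to each of the other model types, using that type's testing-set table.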

Table 5-3. Performance for training set of model type 1 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 54 | 55 | 55 | 55 | 55 |
| Optimal number of hidden neurons | 16 | 4 | 10 | 36 | 21 |
| R-squared | 0.729 | 0.514 | 0.655 | 0.693 | 0.706 |
| Avg. Error | 65758.3 | 75865.9 | 70598.9 | 67323.0 | 64904.2 |
| Correlation | 0.854 | 0.717 | 0.809 | 0.833 | 0.840 |

Table 5-4. Performance for testing set of model type 1 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 100% | 30% | 100% | 100% | 100% |
| R-squared | 0.378 | 0.472 | 0.291 | 0.457 | 0.407 |
| Avg. Error | 82271.3 | 76315.5 | 84654.9 | 76226.7 | 84083.0 |
| Correlation | 0.644 | 0.794 | 0.541 | 0.698 | 0.690 |

Table 5-5. Performance for training set of model type 1 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 43 | 35 | 24 | 31 | 25 |
| Optimal number of hidden neurons | 41 | 24 | 23 | 30 | 23 |
| R-squared | 0.750 | 0.174 | 0.378 | 0.591 | 0.514 |
| Avg. Error | 0.017 | 0.028 | 0.025 | 0.020 | 0.023 |
| Correlation | 0.866 | 0.417 | 0.615 | 0.769 | 0.717 |

Table 5-6. Performance for testing set of model type 1 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 100% | 100% | 100% | 100% | 100% |
| R-squared | 0.264 | 0.183 | -0.009 | 0.357 | 0.241 |
| Avg. Error | 0.026 | 0.026 | 0.026 | 0.024 | 0.026 |
| Correlation | 0.549 | 0.460 | 0.072 | 0.652 | 0.528 |

Finding the optimal number of hidden neurons and importance of input variables

Through implementations of the NeuroShell Predictor software, the best ANN model for Type 1 was found to be the network with three input variables including Number of Bidders,

Project Year, and Project Amount and the output variable of the contingency amount. The optimal number of hidden neurons for the learning process is 36 out of the total number of hidden neurons trained (55), based on the average error as seen in Table 5-3 and Figure 5-6. Table 5-3 also shows the best net statistics at the optimal number of hidden neurons.

[Chart: average error of test set by input variable excluded from model (None, No. of Bidders, PY, Duration, Amount)]
Figure 5-5. Input configuration for the model type 1

Figure 5-6. Optimal number of hidden neurons for the model type 1

Out of the three input variables, the Number of Bidders is the most significant one for predicting the contingency amount, as seen in Figure 5-7. The importance value of the Number of Bidders among the three variables is 0.584. Figures 5-8 and 5-9 show comparisons between the actual and predicted output values on the training and testing sets for the best network for ANN model Type 1.

Figure 5-7. Importance of input variables at the optimal number of hidden neurons

Figure 5-8. Actual/predicted output value on training dataset of the model type 1

Figure 5-9. Actual/predicted output value on testing dataset of the model type 1

ANN Model Type 2

Among forty ANN model types, there were nine feasible types for developing neural network models. Type 2 of the ANN models represents asphalt paving as project work type, central letting as letting type, unit price as contract agreement type, and urban as geographical location.

Finding the best network for model type 2

There are ten candidate networks for ANN model Type 2, based on five configurations of input variables and two forms of output variable, as shown in Tables 5-7, 5-8, 5-9, and 5-10. Among them, the best network consists of all four input variables in the input layer and the contingency amount in the output layer. At the 70% generalization level, this network produced the least average error for the testing set, as shown in Table 5-8 and Figure 5-10.

Table 5-7. Performance for training set of model type 2 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 10 | 11 | 11 | 11 | 11 |
| Optimal number of hidden neurons | 8 | 5 | 11 | 9 | 11 |
| R-squared | 0.740 | 0.724 | 0.780 | 0.442 | 0.869 |
| Avg. Error | 105024.9 | 103956.6 | 92505.8 | 138420.7 | 55461.4 |
| Correlation | 0.860 | 0.851 | 0.883 | 0.665 | 0.932 |

Table 5-8. Performance for testing set of model type 2 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 70% | 70% | 80% | 70% | 90% |
| R-squared | 0.713 | 0.713 | 0.698 | 0.323 | 0.636 |
| Avg. Error | 87117.8 | 88869.5 | 107384.6 | 128395.1 | 116249.1 |
| Correlation | 0.888 | 0.880 | 0.909 | 0.631 | 0.833 |

Table 5-9. Performance for training set of model type 2 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 10 | 11 | 11 | 11 | 11 |
| Optimal number of hidden neurons | 10 | 11 | 11 | 11 | 11 |
| R-squared | 0.766 | 0.697 | 0.901 | 0.676 | 0.791 |
| Avg. Error | 0.017 | 0.016 | 0.010 | 0.024 | 0.021 |
| Correlation | 0.875 | 0.835 | 0.949 | 0.822 | 0.889 |

Table 5-10. Performance for testing set of model type 2 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 70% | 100% | 100% | 80% | 80% |
| R-squared | -3.456 | -1.266 | -2.803 | -2.369 | -2.391 |
| Avg. Error | 0.053 | 0.042 | 0.055 | 0.047 | 0.050 |
| Correlation | -0.357 | 0.028 | -0.615 | -0.271 | -0.333 |

Finding the optimal number of hidden neurons and importance of input variables

The best neural network for Type 2 was found to be the network with all four input variables (Number of Bidders, Project Year, Project Duration, and Project Amount) and the

contingency amount as output variable. The optimal number of hidden neurons for the learning process is 8 out of the total number of hidden neurons trained (10), based on the average error shown in Figure 5-11 and Table 5-7.

[Chart: average error of test set by input variable excluded from model (None, No. of Bidders, PY, Duration, Amount)]
Figure 5-10. Input configuration for the model type 2

Figure 5-11. Optimal number of hidden neurons for the model type 2

Project Duration (Original Contract Day) is the most significant of the four input variables for predicting the contingency amount, as seen in Figure 5-12. The importance value of Project Duration is 0.772. Figures 5-13 and 5-14 show comparisons between the actual and predicted output values on the training and testing sets for the best network for ANN model Type 2.

Figure 5-12. Importance of input variables at the optimal number of hidden neurons

Figure 5-13. Actual/predicted output value on training dataset of the model type 2

Figure 5-14. Actual/predicted output value on testing dataset of the model type 2

ANN Model Type 3

Among forty ANN model types, there were nine feasible types for developing neural network models. ANN model Type 3 represents a combination of bridge work and asphalt paving as project work type, central letting as letting type, unit price as contract agreement type, and urban as geographical location type.

Finding the best network for ANN model type 3

There are ten networks for ANN model Type 3, from five configurations of input variables and two forms of output variable, as seen in Tables 5-11, 5-12, 5-13, and 5-14. Among them, the best network is the one with three input variables (all except Project Amount) that predicts the contingency amount as the output variable at the 0% generalization level, based on the average error of the testing set shown in Table 5-12 and Figure 5-15.

Table 5-11. Performance for training set of model type 3 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 8 | 9 | 9 | 9 | 9 |
| Optimal number of hidden neurons | 8 | 5 | 7 | 5 | 9 |
| R-squared | 0.883 | 0.860 | 0.862 | 0.793 | 0.946 |
| Avg. Error | 274626.2 | 248816.3 | 286045.4 | 327628.8 | 185176.9 |
| Correlation | 0.940 | 0.927 | 0.929 | 0.890 | 0.973 |

Table 5-12. Performance for testing set of model type 3 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 60% | 0% | 70% | 90% | 0% |
| R-squared | 0.367 | 0.415 | 0.506 | 0.290 | 0.838 |
| Avg. Error | 641826.8 | 524160.3 | 560244.9 | 601502.3 | 264287.1 |
| Correlation | 0.742 | 0.962 | 0.880 | 0.870 | 0.996 |

Table 5-13. Performance for training set of model type 3 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 8 | 9 | 9 | 9 | 9 |
| Optimal number of hidden neurons | 8 | 9 | 9 | 9 | 9 |
| R-squared | 0.480 | 0.355 | 0.466 | 0.764 | 0.550 |
| Avg. Error | 0.018 | 0.021 | 0.019 | 0.015 | 0.019 |
| Correlation | 0.693 | 0.596 | 0.683 | 0.874 | 0.741 |

Table 5-14. Performance for testing set of model type 3 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 100% | 80% | 0% | 100% | 60% |
| R-squared | -0.789 | -0.804 | -0.200 | -0.794 | -0.460 |
| Avg. Error | 0.036 | 0.037 | 0.026 | 0.034 | 0.031 |
| Correlation | -0.775 | -0.414 | 0.277 | -0.751 | 0.219 |

Finding the optimal number of hidden neurons and importance of input variables

The best ANN model for Type 3 is the network with three input variables including Number of Bidders, Project Year, and Project Duration and the contingency amount as the output

variable. In the learning process for this network, the optimal number of hidden neurons is 9 out of the total number of hidden neurons trained (9), based on the average error as shown in Table 5-11 and Figure 5-16.

[Chart: average error of test set by input variable excluded from model (None, No. of Bidders, PY, Duration, Amount)]
Figure 5-15. Input configuration for the model type 3

Figure 5-16. Optimal number of hidden neurons for the model type 3

Project Duration (Original Contract Day) is the most significant of the three input variables for predicting the contingency amount, as seen in Figure 5-17. The importance value of Project Duration is 0.594. Figures 5-18 and 5-19 show comparisons between the actual and predicted output values on the training and testing sets for the best network for ANN model Type 3.

Figure 5-17. Importance of input variables at the optimal number of hidden neurons

Figure 5-18. Actual/predicted output value on training dataset of the model type 3

Figure 5-19. Actual/predicted output value on testing dataset of the model type 3

ANN Model Type 4

Among forty ANN model types, there were nine feasible types for developing neural network models. Type 4 of the ANN models represents other work as project work type, central letting as letting type, unit price as contract agreement type, and urban as geographical location type.

Finding the best network for ANN model type 4

There are ten potential networks for ANN model Type 4, based on five configurations of input variables and two forms of output variable, as shown in Tables 5-15, 5-16, 5-17, and 5-18. Among them, the best network consists of three input variables (all except Project Letting Year) in the input layer and the contingency amount in the output layer at the 100% generalization level. This network produced the least average error ($17,887) for the testing set, as shown in Table 5-16 and Figure 5-20.

Table 5-15. Performance for training set of model type 4 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 7 | 8 | 8 | 8 | 8 |
| Optimal number of hidden neurons | 7 | 8 | 8 | 8 | 8 |
| R-squared | 0.775 | 0.508 | 0.949 | 0.871 | 0.673 |
| Avg. Error | 33899.0 | 41830.3 | 13287.4 | 19402.6 | 32020.7 |
| Correlation | 0.880 | 0.713 | 0.974 | 0.933 | 0.821 |

Table 5-16. Performance for testing set of model type 4 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 90% | 0% | 100% | 90% | 90% |
| R-squared | 0.558 | -0.450 | 0.853 | 0.432 | -0.293 |
| Avg. Error | 35226.9 | 56933.9 | 17887.4 | 37585.5 | 58663.7 |
| Correlation | 0.962 | 0.573 | 0.976 | 0.951 | 0.340 |

Table 5-17. Performance for training set of model type 4 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 7 | 8 | 8 | 8 | 8 |
| Optimal number of hidden neurons | 7 | 8 | 8 | 8 | 8 |
| R-squared | 0.512 | 0.538 | 0.860 | 0.737 | 0.556 |
| Avg. Error | 0.020 | 0.019 | 0.013 | 0.017 | 0.025 |
| Correlation | 0.715 | 0.733 | 0.928 | 0.859 | 0.746 |

Table 5-18. Performance for testing set of model type 4 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 70% | 80% | 100% | 100% | 80% |
| R-squared | 0.493 | 0.261 | 0.250 | 0.468 | 0.453 |
| Avg. Error | 0.023 | 0.024 | 0.030 | 0.024 | 0.022 |
| Correlation | 0.713 | 0.686 | 0.586 | 0.794 | 0.676 |

Finding the optimal number of hidden neurons and importance of input variables

From the implementation of the NeuroShell Predictor software, the best ANN model for Type 4 was found to be the network with three input variables including Number of Bidders,

Project Duration, and Project Amount and the contingency amount as the output variable. The optimal number of hidden neurons for the learning process is 8 out of the total number of hidden neurons trained (8), based on the average error as shown in Table 5-15 and Figure 5-21.

[Chart: average error of test set by input variable excluded from model (None, No. of Bidders, PY, Duration, Amount)]
Figure 5-20. Input configuration for the model type 4

Figure 5-21. Optimal number of hidden neurons for the model type 4

Project Duration (Original Contract Day) is the most significant of the three input variables for predicting the contingency amount, as seen in Figure 5-22. The importance value of Project Duration is 0.644. Figures 5-23 and 5-24 show comparisons between the actual and predicted output values on the training and testing sets for the best network for ANN model Type 4.

Figure 5-22. Importance of input variables at the optimal number of hidden neurons

Figure 5-23. Actual/predicted output value on training dataset of the model type 4

Figure 5-24. Actual/predicted output value on testing dataset of the model type 4

ANN Model Type 5

Among forty ANN model types, there were nine feasible types for developing neural network models. ANN model Type 5 represents asphalt resurfacing as project work type, central letting as letting type, lump sum as contract agreement type, and urban as geographical location type.

Finding the best network for ANN model type 5

There are ten potential networks for model Type 5, from five configurations of input variables and two forms of output variable, as shown in Tables 5-19, 5-20, 5-21, and 5-22. The best network is the one with all four input variables that predicts the contingency amount as the output variable at the 80% generalization level, based on the average error for the testing set as shown in Table 5-20 and Figure 5-25.

Table 5-19. Performance for training set of model type 5 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 37 | 38 | 38 | 38 | 38 |
| Optimal number of hidden neurons | 25 | 12 | 31 | 23 | 13 |
| R-squared | 0.460 | 0.306 | 0.430 | 0.337 | 0.253 |
| Avg. Error | 81292.7 | 81825.9 | 75500.9 | 88233.4 | 81437.2 |
| Correlation | 0.679 | 0.554 | 0.656 | 0.580 | 0.503 |

Table 5-20. Performance for testing set of model type 5 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 80% | 20% | 90% | 90% | 60% |
| R-squared | 0.820 | 0.722 | 0.723 | 0.385 | 0.449 |
| Avg. Error | 24919.0 | 36219.5 | 35971.5 | 53158.8 | 46456.3 |
| Correlation | 0.914 | 0.851 | 0.878 | 0.662 | 0.688 |

Table 5-21. Performance for training set of model type 5 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 37 | 34 | 37 | 36 | 24 |
| Optimal number of hidden neurons | 37 | 33 | 35 | 34 | 23 |
| R-squared | 0.901 | 0.867 | 0.831 | 0.842 | 0.695 |
| Avg. Error | 0.010 | 0.011 | 0.014 | 0.011 | 0.018 |
| Correlation | 0.949 | 0.931 | 0.912 | 0.918 | 0.834 |

Table 5-22. Performance for testing set of model type 5 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 90% | 80% | 80% | 100% | 100% |
| R-squared | 0.525 | 0.707 | 0.632 | -0.022 | -0.044 |
| Avg. Error | 0.019 | 0.016 | 0.017 | 0.026 | 0.025 |
| Correlation | 0.778 | 0.862 | 0.834 | 0.189 | 0.160 |

Finding the optimal number of hidden neurons and importance of input variables

The best ANN model for Type 5 is the network with all four input variables including Number of Bidders, Project Year, Project Duration, and Project Amount and with contingency

amount as the output variable. The optimal number of hidden neurons for the learning process for this network is 25 out of the total number of hidden neurons trained (37), based on the average error as shown in Table 5-19 and Figure 5-26.

[Chart: average error of test set by input variable excluded from model (None, No. of Bidders, PY, Duration, Amount)]
Figure 5-25. Input configuration for the model type 5

Figure 5-26. Optimal number of hidden neurons for the model type 5

Project Duration (Original Contract Day) is the most significant of the four input variables for predicting the contingency amount, as seen in Figure 5-27. The value of

importance of Project Duration is 0.392. Figures 5-28 and 5-29 show comparisons between the actual and predicted output values on the training and testing sets for the best network for ANN model Type 5.

Figure 5-27. Importance of input variables at the optimal number of hidden neurons

Figure 5-28. Actual/predicted output value on training dataset of the model type 5

Figure 5-29. Actual/predicted output value on testing dataset of the model type 5

ANN Model Type 6

Among forty ANN model types, there were nine feasible types for developing neural network models. ANN model Type 6 represents other work as work type, central letting as letting type, lump sum as contract agreement type, and urban as geographical location type.

Finding the best network for model type 6

There are ten candidate networks for model Type 6, based on five configurations of input variables and two forms of output variable, as shown in Tables 5-23, 5-24, 5-25, and 5-26. The best network among them consists of three input variables (all except Number of Bidders) in the input layer and the contingency rate in the output layer at the 90% generalization level. This network produced the least average error for the testing set, as shown in Table 5-26 and Figure 5-30.

Table 5-23. Performance for training set of model type 6 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 15 | 16 | 16 | 16 | 16 |
| Optimal number of hidden neurons | 11 | 7 | 11 | 10 | 15 |
| R-squared | 0.817 | 0.786 | 0.791 | 0.592 | 0.755 |
| Avg. Error | 8156.1 | 11581.5 | 9829.6 | 15426.8 | 15487.1 |
| Correlation | 0.904 | 0.887 | 0.889 | 0.769 | 0.869 |

Table 5-24. Performance for testing set of model type 6 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 10% | 10% | 90% | 100% | 70% |
| R-squared | -3.411 | -0.515 | -3.911 | 0.258 | -2.451 |
| Avg. Error | 21611.6 | 12600.7 | 24877.3 | 9421.3 | 24878.1 |
| Correlation | 0.878 | 0.920 | 0.781 | 0.924 | 0.679 |

Table 5-25. Performance for training set of model type 6 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 15 | 16 | 16 | 16 | 16 |
| Optimal number of hidden neurons | 15 | 16 | 16 | 16 | 16 |
| R-squared | 0.747 | 0.977 | 0.881 | 0.534 | 0.447 |
| Avg. Error | 0.012 | 0.004 | 0.008 | 0.010 | 0.017 |
| Correlation | 0.864 | 0.988 | 0.938 | 0.731 | 0.669 |

Table 5-26. Performance for testing set of model type 6 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 50% | 90% | 90% | 60% | 70% |
| R-squared | 0.514 | 0.505 | 0.216 | 0.361 | -0.482 |
| Avg. Error | 0.019 | 0.017 | 0.023 | 0.020 | 0.033 |
| Correlation | 0.759 | 0.721 | 0.479 | 0.633 | -0.418 |

Finding the optimal number of hidden neurons and importance of input variables

The best ANN model for Type 6 is the network with three input variables including Project Year, Project Duration, and Project Amount and the contingency rate as the output variable. The

optimal number of hidden neurons for the learning process is 16 out of the total number of hidden neurons trained (16), based on the average error as shown in Table 5-25 and Figure 5-31.

[Chart: average error of test set by input variable excluded from model (None, No. of Bidders, PY, Duration, Amount)]
Figure 5-30. Input configuration for the model type 6

Figure 5-31. Optimal number of hidden neurons for the model type 6

Project Amount is the most significant of the three input variables for predicting the contingency rate, as seen in Figure 5-32. The importance value of Project Amount is 0.415.

Figures 5-33 and 5-34 show comparisons between the actual and predicted output values on the training and testing sets for the best network for ANN model Type 6.

Figure 5-32. Importance of input variables at the optimal number of hidden neurons

Figure 5-33. Actual/predicted output value on training dataset of the model type 6

Figure 5-34. Actual/predicted output value on testing dataset of the model type 6

ANN Model Type 7

Among forty ANN model types, there were nine feasible types for developing neural network models. ANN model Type 7 represents asphalt resurfacing as work type, central letting as letting type, lump sum as contract agreement type, and rural as geographical location type.

Finding the best network for model type 7

There are ten potential networks for ANN model Type 7, from five configurations of input variables and two forms of output variable, as seen in Tables 5-27, 5-28, 5-29, and 5-30. The best network is the one with three input variables (all except Project Duration) that predicts the contingency rate as the output variable at the 10% generalization level, based on the average error of the testing set as shown in Table 5-30 and Figure 5-35. In this network, the Average Error is 0.014 for the training set and 0.010 for the testing set.

Table 5-27. Performance for training set of model type 7 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 10 | 11 | 11 | 11 | 11 |
| Optimal number of hidden neurons | 8 | 4 | 4 | 11 | 6 |
| R-squared | 0.450 | 0.196 | 0.264 | 0.286 | 0.434 |
| Avg. Error | 40796.3 | 57496.6 | 56482.3 | 46251.3 | 47680.5 |
| Correlation | 0.671 | 0.443 | 0.514 | 0.535 | 0.659 |

Table 5-28. Performance for testing set of model type 7 for predicting contingency amount

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 90% | 100% | 100% | 80% | 100% |
| R-squared | 0.051 | 0.117 | 0.246 | 0.484 | -0.222 |
| Avg. Error | 52851.3 | 39481.8 | 40785.9 | 35112.8 | 52469.2 |
| Correlation | 0.717 | 0.608 | 0.759 | 0.858 | 0.575 |

Table 5-29. Performance for training set of model type 7 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Number of hidden neurons trained | 10 | 11 | 11 | 11 | 11 |
| Optimal number of hidden neurons | 10 | 11 | 8 | 11 | 11 |
| R-squared | 0.417 | 0.436 | 0.432 | 0.478 | 0.426 |
| Avg. Error | 0.013 | 0.012 | 0.013 | 0.014 | 0.013 |
| Correlation | 0.645 | 0.660 | 0.657 | 0.692 | 0.652 |

Table 5-30. Performance for testing set of model type 7 for predicting contingency rate

| Input variable excluded | None | No. of Bidders | PY | Duration | Amount |
|---|---|---|---|---|---|
| Level of generalization | 60% | 100% | 100% | 10% | 10% |
| R-squared | -0.074 | -0.321 | -0.106 | 0.760 | 0.144 |
| Avg. Error | 0.021 | 0.027 | 0.025 | 0.010 | 0.022 |
| Correlation | 0.471 | -0.208 | 0.168 | 0.986 | 0.422 |

Finding the optimal number of hidden neurons and importance of input variables

The best ANN model for Type 7 was found to be the network with three input variables including Number of Bidders, Project Year, and Project Amount and with the contingency rate as the

output variable. The optimal number of hidden neurons for the learning process is 11 out of the total number of hidden neurons trained (11), based on the average error as shown in Table 5-29 and Figure 5-36.

[Chart: average error of test set by input variable excluded from model (None, No. of Bidders, PY, Duration, Amount)]
Figure 5-35. Input configuration for the model type 7

Figure 5-36. Optimal number of hidden neurons for the model type 7

The Number of Bidders is the most significant of the three input variables for predicting the contingency rate, as shown in Figure 5-37. The importance value of the Number of Bidders is 0.958. Figures 5-38 and 5-39 show comparisons between the actual and predicted output values on the training and testing sets for the best network for ANN model Type 7.

Figure 5-37. Importance of input variables at the optimal number of hidden neurons

Figure 5-38. Actual/predicted output value on training dataset of the model type 7


Figure 5-39. Actual/predicted output value on testing dataset of the model type 7

ANN Model Type 8

Among forty ANN model types, there were nine feasible types for developing neural network models. ANN model Type 8 represents asphalt resurfacing as work type, district letting as letting type, unit price as contract agreement type, and urban as geographical location type.

Finding the best network for model type 8

There are ten candidate networks for ANN model Type 8, based on five configurations of input variables and two forms of output variable, as shown in Tables 5-31, 5-32, 5-33, and 5-34. Among them, the best network has three input variables (all except Project Amount) in the input layer and the contingency amount in the output layer. At the 60% generalization level, this network produced the least average error for the testing set, as shown in Table 5-32 and Figure 5-40.


Table 5-31. Performance for training set of model type 8 for predicting contingency amount

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Number of hidden neurons trained  7         8              8         8         8
Optimal number of hidden neurons  7         7              4         8         7
R-squared                         0.863     0.842          0.181     0.907     0.811
Avg. Error                        20977.5   22011.8        36295.7   14989.8   23736.7
Correlation                       0.929     0.918          0.426     0.953     0.901

Table 5-32. Performance for testing set of model type 8 for predicting contingency amount

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Level of generalization           70%       90%            100%      90%       60%
R-squared                         -0.143    -0.209         -0.869    -0.051    0.745
Avg. Error                        63832.4   67057.7        83113.1   60693.9   31252.9
Correlation                       0.535     0.764          0.589     0.775     0.916

Table 5-33. Performance for training set of model type 8 for predicting contingency rate

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Number of hidden neurons trained  7         8              8         8         8
Optimal number of hidden neurons  7         8              8         8         8
R-squared                         0.567     0.776          0.682     0.630     0.694
Avg. Error                        0.023     0.011          0.017     0.018     0.015
Correlation                       0.753     0.881          0.826     0.794     0.833

Table 5-34. Performance for testing set of model type 8 for predicting contingency rate

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Level of generalization           10%       20%            90%       80%       20%
R-squared                         -1.158    -0.514         -2.174    -0.810    -1.366
Avg. Error                        0.041     0.038          0.056     0.037     0.051
Correlation                       -0.007    0.129          0.079     0.086     -0.086

Finding the optimal number of hidden neurons and importance of input variables

The best ANN model for Type 8 is the network with three input variables (Number of Bidder, Project Year, and Project Duration) and with the contingency amount as the output variable. The optimal number of hidden neurons for the learning process on the best network is 7 out of the total number of hidden neurons trained (8), based on the average error as shown in Table 5-31 and Figure 5-41.

Figure 5-40. Input configuration for the model type 8 (average error of the test set by input variable excluded from the model)

Figure 5-41. Optimal number of hidden neurons for the model type 8
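The selection among input configurations described for Type 8 can be sketched in a few lines of Python. This is only an illustration of the "least average error on the testing set" rule using the Avg. Error row of Table 5-32; the names `test_avg_error`, `ALL_INPUTS`, and `best_configuration` are hypothetical, and the sketch compares configurations within one output form only, because average errors for contingency amount and contingency rate are on different scales.

```python
# Average error on the testing set for model Type 8 (contingency amount),
# keyed by the input variable excluded. Values are copied from Table 5-32.
test_avg_error = {
    "None": 63832.4,
    "No. of Bidder": 67057.7,
    "PY": 83113.1,
    "Duration": 60693.9,
    "Amount": 31252.9,
}

# The four numerical input variables (PY = project letting year).
ALL_INPUTS = ["No. of Bidder", "PY", "Duration", "Amount"]


def best_configuration(errors):
    """Return (excluded variable, remaining inputs) with the lowest test error."""
    excluded = min(errors, key=errors.get)
    remaining = [name for name in ALL_INPUTS if name != excluded]
    return excluded, remaining


excluded, inputs = best_configuration(test_avg_error)
print("Excluded:", excluded)
print("Inputs kept:", inputs)
```

With the Table 5-32 values, the rule picks the configuration that drops Project Amount, which matches the three inputs reported for the best Type 8 network.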


Number of Bidder is the most significant of the three input variables for predicting contingency amount, as seen in Figure 5-42. The value of importance of Number of Bidder is 0.938. Figures 5-43 and 5-44 show comparisons between the actual output value and the predicted output value for the training and testing sets for the best network for ANN model Type 8.

Figure 5-42. Importance of input variables at the optimal number of hidden neurons

Figure 5-43. Actual/predicted output value on training dataset of the model type 8


Figure 5-44. Actual/predicted output value on testing dataset of the model type 8

ANN Model Type 9

Among forty ANN model types, there were nine feasible types for developing neural network models. ANN model Type 9 represents other works as work type, district letting as letting type, unit price as contract agreement type, and urban as geographical location type.

Finding the best network for model type 9

There are ten potential networks for model Type 9, from five configurations of input variables and two forms of output variable, as shown in Tables 5-35, 5-36, 5-37, and 5-38. The best network has three input variables (all except Project Letting Year) and predicts the contingency amount as output variable at the 70% generalization level, based on the average error for the testing set as shown in Table 5-36 and Figure 5-45.


Table 5-35. Performance for training set of model type 9 for predicting contingency amount

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Number of hidden neurons trained  11        12             12        12        12
Optimal number of hidden neurons  11        4              12        8         8
R-squared                         0.719     0.188          0.559     0.478     0.520
Avg. Error                        6705.0    11807.2        9329.4    10594.2   8899.6
Correlation                       0.848     0.433          0.748     0.692     0.721

Table 5-36. Performance for testing set of model type 9 for predicting contingency amount

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Level of generalization           80%       30%            70%       90%       100%
R-squared                         0.649     0.598          0.643     0.637     0.305
Avg. Error                        7432.6    8687.3         6872.8    7429.0    11436.7
Correlation                       0.826     0.833          0.825     0.827     0.818

Table 5-37. Performance for training set of model type 9 for predicting contingency rate

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Number of hidden neurons trained  11        12             12        12        12
Optimal number of hidden neurons  11        12             12        12        12
R-squared                         0.684     0.691          0.781     0.821     0.885
Avg. Error                        0.021     0.017          0.012     0.008     0.011
Correlation                       0.827     0.831          0.884     0.906     0.940

Table 5-38. Performance for testing set of model type 9 for predicting contingency rate

Input variable excluded           None      No. of Bidder  PY        Duration  Amount
Level of generalization           100%      90%            100%      100%      100%
R-squared                         -0.081    -0.016         -0.135    -0.028    -0.275
Avg. Error                        0.019     0.018          0.021     0.017     0.023
Correlation                       0.122     0.192          0.196     0.196     -0.155

Finding the optimal number of hidden neurons and importance of input variables

The best ANN model for Type 9 was found to be the network with three input variables (Number of Bidder, Project Duration, and Project Amount) and with contingency amount as output variable. The optimal number of hidden neurons for the learning process is 12 out of the total number of hidden neurons trained (12), based on the average error as shown in Table 5-35 and Figure 5-46.

Figure 5-45. Input configuration for the model type 9 (average error of the test set by input variable excluded from the model)

Figure 5-46. Optimal number of hidden neurons for the model type 9


Project Amount is the most significant of the three input variables for predicting the contingency amount, as shown in Figure 5-47. The value of importance of Project Amount is 0.624. Figures 5-48 and 5-49 show comparisons between the actual output value and the predicted output value for the training and testing sets for the best network for ANN model Type 9.

Figure 5-47. Importance of input variables at the optimal number of hidden neurons

Figure 5-48. Actual/predicted output value on training dataset of the model type 9


Figure 5-49. Actual/predicted output value on testing dataset of the model type 9

Summary of the Best Neural Network for Each ANN Model Type

Table 5-39 summarizes the optimal number of hidden neurons during the learning process and the appropriate level of generalization for the testing dataset for each ANN model type. It also shows the best configuration of input variables and the appropriate form of the contingency item as output variable for the best network of each ANN model type.

Table 5-39. The best neural network for each ANN model type

ANN model  Optimal number of  Generalization  Configuration of input variables                                  Output variable
type       hidden neurons     level           No. of Bidder  Project Year  Project Duration  Project Amount  Contingency Amount  Contingency Rate
Type 1     36                 100%            X              X             -                 X               X                   -
Type 2     8                  70%             X              X             X                 X               X                   -
Type 3     9                  0%              X              X             X                 -               X                   -
Type 4     8                  100%            X              -             X                 X               X                   -
Type 5     33                 80%             X              X             X                 X               X                   -
Type 6     16                 90%             -              X             X                 X               -                   X
Type 7     11                 10%             X              X             -                 X               -                   X
Type 8     7                  60%             X              X             X                 -               X                   -
Type 9     12                 70%             X              -             X                 X               X                   -

X = included in the best network; - = not included.


For input variables, all model types except Types 2 and 5 perform better when one of the four input variables is excluded. For the output variable, the contingency amount is the better form of the contingency item for Types 1, 2, 3, 4, 5, 8, and 9, while the contingency rate is the better form for Types 6 and 7.

Figure 5-50 and Figure 5-51 show the performance of the best neural network for the nine ANN model types on the training and testing datasets. Based on the R-squared and correlation values on the testing set, the network for model Type 4 has the best performance and the network for model Type 1 has the worst performance among the best networks for the nine ANN model types. Appendices B and C present distribution graphs between input variables and evaluation graphs for testing error for the best and worst ANN model types.

Figure 5-50. Comparison of R-squared value on the best network for each model type (training set and testing set, by ANN model type)


Figure 5-51. Comparison of Correlation value on the best network for each model type (training set and testing set, by ANN model type)


CHAPTER 6
MODEL VALIDATION

After selecting the best Artificial Neural Network (ANN) model for each type, it is necessary to reevaluate the performance of the model with the validation dataset. This provides a final assessment of the model's performance and confirms its validity. For model validation, the best net statistics from the NeuroShell Predictor software (R-squared, Average Error, and Correlation values) were checked.

Validation for ANN Model Type 1

The best ANN model for Type 1 consisted of Number of Bidder, Project Year, and Project Amount as input variables and the contingency amount as output variable under 100% of generalization level. It was evaluated with the validation dataset and compared with the performance of the testing set, as seen in Table 6-1. The performance with the validation set is better than that with the testing set in terms of the R-squared and correlation values. The best network has an R-squared value of 0.530 and a correlation value of 0.855 on the validation set. Figure 6-1 also shows comparisons between the actual output values and the predicted output values on the validation set for the best ANN model for Type 1.

Validation for ANN Model Type 2

The best ANN model for Type 2, with all input variables (Number of Bidder, Project Year, Project Duration, and Project Amount) and with contingency amount as the output variable under 70% of generalization level, was evaluated with the validation dataset. The performance of the validation set was compared with that of the testing set, as seen in Table 6-2.

Table 6-1. Performance for validation set of the best model for type 1

             Validation set  Testing set
R-squared    0.530           0.457
Avg. Error   114735.2        76226.7
Correlation  0.855           0.698


Figure 6-1. Actual/predicted output value on validation dataset of the model type 1

The performance of the validation set is better than that of the testing set in terms of the R-squared and correlation values. The value of R-squared is 0.798 and the value of the correlation is 0.935 on the validation set. Figure 6-2 also shows comparisons between the actual output value and the predicted output value on the validation set for the best ANN model for Type 2.

Table 6-2. Performance for validation set of the best model for type 2

             Validation set  Testing set
R-squared    0.798           0.713
Avg. Error   98326.1         87117.8
Correlation  0.935           0.888

Figure 6-2. Actual/predicted output value on validation dataset of the model type 2

Validation for ANN Model Type 3

The best ANN model for Type 3 was found to be the network with input variables including Number of Bidder, Project Year, and Project Duration and with contingency amount as output variable at 0% of generalization level. The network was evaluated with the validation dataset and compared with the performance of the testing set, as seen in Table 6-3. The performance of the validation set is good, although it is not better than that of the testing set in terms of the R-squared and correlation values. Figure 6-3 shows comparisons between the actual output value and the predicted output value for the validation set for the best ANN model for Type 3.

Figure 6-3. Actual/predicted output value on validation dataset of the model type 3


Table 6-3. Performance for validation set of the best model for type 3

             Validation set  Testing set
R-squared    0.590           0.838
Avg. Error   921113.3        264287.1
Correlation  0.990           0.996

Validation for ANN Model Type 4

The best ANN model for Type 4, consisting of Number of Bidder, Project Duration, and Project Amount as input variables and the contingency amount as output variable under 100% of generalization level, was evaluated with the validation dataset. It was compared with the performance of the testing set, as seen in Table 6-4. However, the performance on the validation set is worse than that on the testing set for the values of R-squared and correlation.

Table 6-4. Performance for validation set of the best model for type 4

             Validation set  Testing set
R-squared    -2.386          0.853
Avg. Error   40445.6         17887.4
Correlation  -0.393          0.976

The second best network for ANN model Type 4 was therefore used in the validation phase. The second best model for Type 4 is the network consisting of all four input variables and contingency amount as output variable under 90% of the generalization level. It was evaluated and compared with the R-squared and correlation values on the testing set, as shown in Table 6-5. The second best model also did not perform well on the validation set. Accordingly, the ANN model for Type 4 will be excluded from the development of a prediction tool.

Table 6-5. Performance for validation set of the second best model for type 4

             Validation set  Testing set
R-squared    -2.082          0.558
Avg. Error   34879.0         35226.9
Correlation  0.089           0.962


Validation for ANN Model Type 5

The best ANN model for Type 5, consisting of all four input variables (Number of Bidder, Project Year, Project Duration, and Project Amount) and the contingency amount as output variable under 80% of generalization level, was evaluated on the validation dataset and compared with the performance of the testing set, as shown in Table 6-6. The performance on the validation set is not better than that on the testing set, with a lower R-squared value.

Table 6-6. Performance for validation set of the best model for type 5

             Validation set  Testing set
R-squared    0.233           0.820
Avg. Error   39786.3         24919.0
Correlation  0.586           0.914

As in the case of ANN model Type 4, the second best network was tested with the validation set. The second best model for Type 5 was found to be the ANN with three input variables (Project Year, Project Duration, and Project Amount) and the contingency rate as the output variable at 80% of the generalization level. It was evaluated and compared with the performance on the testing set, as shown in Table 6-7. However, the second best model also did not perform well on the validation set, since the R-squared value is 0.052 and the correlation value is 0.433. Therefore, the ANN model for Type 5 will be excluded from the development of a prediction tool because of its lack of validity.

Table 6-7. Performance for validation set of the second best model for type 5

             Validation set  Testing set
R-squared    0.052           0.707
Avg. Error   0.014           0.016
Correlation  0.433           0.862

Validation for ANN Model Type 6

The best ANN model for Type 6, consisting of Project Year, Project Duration, and Project Amount as input variables and the contingency rate as the output variable under 90% of generalization level, was evaluated with the validation dataset. It was compared with the performance of the testing set, as shown in Table 6-8. However, the performance with the validation set was not good enough to show a successful validation because of the negative R-squared and correlation values.

Table 6-8. Performance for validation set of the best model for type 6

             Validation set  Testing set
R-squared    -0.196          0.505
Avg. Error   0.034           0.017
Correlation  -0.150          0.721

The second best ANN model for Type 6 was the network with all four input variables and with contingency rate as the output variable at 50% of the generalization level. It was evaluated and compared with the R-squared and correlation values on the testing set, as shown in Table 6-9. However, the second best model also did not perform well on the validation set, with negative R-squared and correlation values. The ANN model for Type 6 will therefore also be excluded from the development of a prediction tool due to insufficient validation.

Table 6-9. Performance for validation set of the second best model for type 6

             Validation set  Testing set
R-squared    -0.305          0.514
Avg. Error   0.041           0.019
Correlation  -0.288          0.759

Validation for ANN Model Type 7

The best ANN model for Type 7 was the network consisting of Number of Bidder, Project Year, and Project Amount as input variables and contingency rate as the output variable at 10% of generalization level. The network was evaluated with the validation dataset and compared with the performance of the testing set, as shown in Table 6-10. However, the performance with the validation set is not good enough to show a successful validation because of the negative value of R-squared.


Table 6-10. Performance for validation set of the best model for type 7

             Validation set  Testing set
R-squared    -1.583          0.760
Avg. Error   0.035           0.010
Correlation  0.645           0.986

The second best ANN for model Type 7 was used in the validation. The second best model for Type 7 consisted of three input variables (Number of Bidder, Project Year, and Project Amount) with the contingency amount as the output variable under 80% of the generalization level. It was evaluated and compared with the performance on the testing set, as shown in Table 6-11. However, since the second best network also does not perform well on the validation set, with a low value of R-squared, ANN model Type 7 will be excluded from the development of a prediction tool.

Table 6-11. Performance for validation set of the second best model for type 7

             Validation set  Testing set
R-squared    0.219           0.484
Avg. Error   46418.9         35112.8
Correlation  0.653           0.858

Validation for ANN Model Type 8

The best ANN model for Type 8 was found to be the network with three input variables (Number of Bidder, Project Year, and Project Duration) in the input layer and with contingency amount in the output layer under 60% of generalization level. It was evaluated with the validation dataset and compared with the performance of the testing set, as shown in Table 6-12. The performance of the validation set is good, since the value of R-squared is 0.702 and the value of correlation is 0.899. Figure 6-4 shows comparisons between the actual output value and the predicted output value on the validation set for the best ANN model for Type 8.
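Table 6-10 pairs a negative R-squared (-1.583) with a positive correlation (0.645), which can look contradictory. The sketch below illustrates how this can happen. It assumes the usual textbook definitions (R-squared as one minus the ratio of residual to total sum of squares, average error as mean absolute error, correlation as Pearson's r); whether NeuroShell Predictor computes its statistics exactly this way is an assumption, and the data values are hypothetical.

```python
import math


def r_squared(actual, predicted):
    """1 - SS_res / SS_tot; negative when the model is worse than the mean."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def average_error(actual, predicted):
    """Mean absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical contingency rates: predictions follow the right trend
# but are shifted upward by a constant offset of 0.04.
actual = [0.01, 0.02, 0.03, 0.04]
predicted = [0.05, 0.06, 0.07, 0.08]

print("R-squared:  ", r_squared(actual, predicted))
print("Avg. error: ", average_error(actual, predicted))
print("Correlation:", correlation(actual, predicted))
```

Under these definitions, correlation only measures whether predictions move with the actual values, while R-squared also penalizes bias and scale errors, so a network can track the trend and still predict worse than the mean of the actual values.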


Table 6-12. Performance for validation set of the best model for type 8

             Validation set  Testing set
R-squared    0.702           0.745
Avg. Error   66968.9         31252.9
Correlation  0.899           0.916

Figure 6-4. Actual/predicted output value on validation dataset of the model type 8

Validation for ANN Model Type 9

The best ANN model for Type 9 consists of Number of Bidder, Project Duration, and Project Amount as input variables and the contingency amount as the output variable under 70% of generalization level. It was evaluated with the validation dataset and compared with the performance of the testing set, as shown in Table 6-13. The performance for the validation set is better than that for the testing set in that the value of R-squared is 0.683 and the value of correlation is 0.872. Figure 6-5 shows comparisons between the actual output value and the predicted output value on the validation set for the best ANN model for Type 9.


Table 6-13. Performance for validation set of the best model for type 9

             Validation set  Testing set
R-squared    0.683           0.643
Avg. Error   14255.2         6872.8
Correlation  0.872           0.825

Figure 6-5. Actual/predicted output value on validation dataset of the model type 9

Summary

Among the nine ANN model types, five ANN models (Type 1, Type 2, Type 3, Type 8, and Type 9) were successfully validated with good performance. In particular, for Type 1, Type 2, and Type 9, the validation sets performed better than the testing sets, with higher R-squared and correlation values, as seen in Figure 6-6.

However, for ANN model Types 4, 5, 6, and 7, the best ANN model for each type did not perform well on the validation set. As an alternative, the second best network for each of these ANN types was tested for validation. These second best models also did not perform well on the validation set, so these ANN types are excluded from the development of the prediction tool for the contingency item in the next chapter.
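The screening summarized above can be expressed as a small Python sketch. The validation R-squared values are copied from Tables 6-1 through 6-13 (best network first, second best network where one was tried). The cutoff `R2_THRESHOLD = 0.5` is a hypothetical value chosen only because it separates the reported outcomes; the study does not state a numerical acceptance criterion.

```python
# Assumed cutoff for "good performance" on the validation set.
# This threshold is illustrative and is not taken from the study.
R2_THRESHOLD = 0.5

# Validation-set R-squared per ANN model type: [best, second best (if tried)].
validation_r2 = {
    1: [0.530],
    2: [0.798],
    3: [0.590],
    4: [-2.386, -2.082],
    5: [0.233, 0.052],
    6: [-0.196, -0.305],
    7: [-1.583, 0.219],
    8: [0.702],
    9: [0.683],
}


def screen(candidates, threshold=R2_THRESHOLD):
    """Split model types into validated and excluded lists.

    A type is validated if any of its candidate networks reaches the
    threshold on the validation set; otherwise it is excluded.
    """
    validated, excluded = [], []
    for model_type, r2_values in sorted(candidates.items()):
        if any(r2 >= threshold for r2 in r2_values):
            validated.append(model_type)
        else:
            excluded.append(model_type)
    return validated, excluded


validated, excluded = screen(validation_r2)
print("Validated types:", validated)
print("Excluded types: ", excluded)
```

Under this assumed cutoff the sketch reproduces the split reported in the summary: Types 1, 2, 3, 8, and 9 pass, and Types 4, 5, 6, and 7 are excluded.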


Figure 6-6. Performance of the best networks on the testing and validation set (R-squared and correlation values for ANN model Types 1, 2, 3, 8, and 9)

The main reason for these failures in the validation phase is likely the number and quality of data for each ANN model type. While training the networks in the NeuroShell Predictor software, the message "No further training is possible unless more rows of different examples are provided", shown in Figure 6-7, was often displayed on the screen. This message was displayed because the number of training data relative to the number of inputs and hidden neurons that had been added was too small to permit further training without overfitting.

Figure 6-7. Message in the NeuroShell Predictor
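The relationship between training rows and network size can be made concrete with a rough count of adjustable weights. The sketch below assumes a plain fully connected network with one hidden layer and bias terms; this is an illustrative assumption and not a description of the internal architecture of NeuroShell Predictor. The function name `count_weights` and the figure of 20 rows are hypothetical.

```python
def count_weights(n_inputs, n_hidden, n_outputs=1):
    """Number of weights in a one-hidden-layer network with bias terms.

    Each hidden neuron has one weight per input plus a bias, and each
    output neuron has one weight per hidden neuron plus a bias.
    """
    hidden_weights = (n_inputs + 1) * n_hidden
    output_weights = (n_hidden + 1) * n_outputs
    return hidden_weights + output_weights


# Example sized like the best Type 9 network: 3 inputs, 12 hidden neurons.
weights = count_weights(n_inputs=3, n_hidden=12)

# Hypothetical number of training rows for a small model type.
training_rows = 20

print("Adjustable weights:", weights)
print("Training rows per weight:", training_rows / weights)
```

When the number of adjustable weights exceeds the number of training rows, as in this example, the network can fit the training data closely without generalizing, which is consistent with the weak validation results reported for several model types.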


CHAPTER 7
PREDICTION TOOL

After the developed ANN model for each type has been validated, an easy-to-use prediction tool helps end-users apply ANN techniques to predict the contingency item. The NeuroShell Run-Time Server program allows researchers to fire and apply networks created with the NeuroShell Predictor in MS Excel spreadsheets.

Among the several programs in the NeuroShell Run-Time Server, the NeuroShell Fire MS Excel Add-in and NeuroShell Run-Time MS Excel Add-in programs are useful for developing and applying a prediction tool in an MS Excel spreadsheet. The NeuroShell Fire MS Excel Add-in program, which uses the file named NSFIRE.XLA, can fire the networks within an MS Excel spreadsheet. The NeuroShell Run-Time MS Excel Add-in program, which uses the file NSRUN.XLA, allows end-users who do not own the NeuroShell Predictor software to call a network trained in the NeuroShell Predictor from an MS Excel spreadsheet.

Figure 7-1. NeuroShell MS Excel Add-in panel


NeuroShell Fire MS Excel Add-in

Through the NeuroShell Fire MS Excel Add-in program, the neural networks created can be fired within an MS Excel spreadsheet. There are two ways to fire the networks: Dialog Box or FireNet Function.

Dialog Box

The Dialog Box helps place network calls into an MS Excel spreadsheet. To use the dialog box, the NeuroShell Run-Time option is selected in the MS Excel Tools menu. The screen shown in Figure 7-2 is then displayed. After selecting the trained network file in the dialog box, the user can specify the range containing the values of the input variables and the column where the output values will be placed. Additionally, the column containing actual output values and the level of enhanced generalization can be selected for network problems.

Figure 7-2. Dialog Box panel in MS Excel spreadsheet

FireNet Function Call

Instead of starting the Add-in from the MS Excel Tools menu, a call to the function FireNet is placed in each cell that will contain results from the networks. The function FireNet specifies the range of input variables, the trained network file, and the level of Enhanced Generalization, as seen in the example below. The output value is placed in the cell containing the FireNet function. In this example, A2:D2 is the range of input variables, C:\AMOUNT1.net is the location of the trained network file, and 0.5 is the level of enhanced generalization for the best output results.

    =NSFIRE.XLA!FireNet(A2:D2, C:\AMOUNT1.net,0.5)

NeuroShell Run-Time Excel Add-in

The NeuroShell Run-Time MS Excel Add-in program is the version of the program that may be sent to users who do not own the NeuroShell Predictor software. The software distributed to users should include the file NSRUN.XLA, which the users load as an MS Excel Add-in. After loading the NSRUN.XLA file as an MS Excel Add-in, the users can use the function FireNet for prediction problems, as with the NeuroShell Fire MS Excel Add-in program.

Development of a Prediction Tool for Predicting Contingency

In Chapter 6, the final networks for each ANN model type were found. Among the nine ANN types, ANN models for five types (Type 1, Type 2, Type 3, Type 8, and Type 9) were successfully validated. A prediction tool in an MS Excel spreadsheet was developed for the five effective neural networks using the FireNet Function Call in the NeuroShell Fire MS Excel Add-in program, since it is applicable even to users who do not own the NeuroShell Predictor software. The output variable of all networks for the five ANN model types is the contingency amount. All models fall within Design-Bid-Build (DBB) as delivery method type, Lowest Bid (LB) as bid award type, Unit Price (UP) as contract agreement type, and Urban as geographical location type. Table 7-1 shows the FireNet functions for predicting contingency amount as the output value of each type.


Table 7-1. FireNet function for contingency amount on each ANN model type

ANN type  FireNet function
Type 1    =NSFIRE.XLA!FireNet(B5:D5, C:\DATA\AMOUNT1DURATION.net,1)
Type 2    =NSFIRE.XLA!FireNet(B11:E11, C:\DATA\AMOUNT2.net,0.7)
Type 3    =NSFIRE.XLA!FireNet(B17:D17, C:\DATA\AMOUNT3AMOUNT.net,0)
Type 8    =NSFIRE.XLA!FireNet(B23:D23, C:\DATA\AMOUNT8AMOUNT.net,0.6)
Type 9    =NSFIRE.XLA!FireNet(B29:D29, C:\DATA\AMOUNT9PY.net,0.7)

The prediction tool in the MS Excel software consists of two spreadsheets. The first spreadsheet, Read First, explains how to install the tool as an MS Excel Add-in, gives instructions for using the prediction tool, and describes the feasible ANN model types for predicting contingency amount, as seen in Figure 7-3.

Figure 7-3. Read First MS Excel panel in the prediction tool

The second spreadsheet, TOOL, contains the implementation program that predicts the contingency amount for the selected ANN model type, as seen in Figure 7-4. The prediction tool interactively executes the selected network as different values are entered for the input variables.


Figure 7-4. TOOL MS Excel panel on the prediction tool


CHAPTER 8
CONCLUSIONS

This chapter provides a brief summary of the study and explains the conclusions obtained from it. Research limitations and recommendations for future research are also discussed.

Conclusions

This study focuses on the development of an Artificial Neural Network (ANN) model for predicting the contingency item on transportation construction projects. The factors influencing the contingency item were identified as input variables for the ANN models, and the calculation and appropriate form of the desired contingency item as the output variable were determined. Contingency-related data were collected from the FDOT (Florida Department of Transportation) database. With the accessible numerical input variables, ANN models for combinations of categorical input variables were developed and validated in the NeuroShell Predictor software. Finally, a prediction tool in Microsoft Excel spreadsheets was developed for end-users.

The potential input factors for the contingency item are project work type, project delivery method type, project contract agreement type, project bid award type, number of bidders, project amount, project duration, project letting type, project geographical location, project site conditions, possibility of construction changes, number of concurrent projects for owners or sponsors, project letting year, environmental impacts, and transportation management plans. Among them, the input factors accessible from the FDOT database were found to be project work type, project delivery method type, project contract agreement type, project bid award type, number of bidders, project amount, project duration, project letting type, project geographical location, and project letting year.


The output variable is the contingency item, defined as the cost item that compensates for all unforeseen work orders and related risks, including the initial contingency amount. It was calculated as the difference between the original contract amount and the final contract amount. There are two forms of the contingency item: contingency amount and contingency rate.

Separate ANN models were developed for each combination of categorical input variables, since ANN models cannot interpret categorical variables that cannot be put in some meaningful order. There are nine effective ANN model types, each with at least 20 data records. Each ANN model consists of four numerical input variables (number of bidder, project letting year, project duration, and project amount) in the input layer and one output variable (contingency amount or contingency rate) in the output layer. Among the nine ANN model types implemented in the NeuroShell Predictor software, five were successfully validated.

The best network for ANN model Type 1, which represents asphalt resurfacing as project work type, central letting as letting type, unit price as contract agreement type, and urban as geographical location type, consists of number of bidder, project year, and project amount as input variables and contingency amount as the output variable under 100% of generalization level. It has an R-squared value of 0.530 and a correlation value of 0.855 on the validation set.

The best network for ANN model Type 2, which represents asphalt paving as project work type, central letting as letting type, unit price as contract agreement type, and urban as geographical location type, is the one with all input variables (number of bidder, project year, project duration, and project amount) and with contingency amount as output variable at 70% of generalization level. The model performs well on the validation set, with an R-squared value of 0.798 and a correlation value of 0.935.


The best network for ANN model Type 3, which represents the combined work of bridge and asphalt paving as project work type, central letting as letting type, unit price as contract agreement type, and urban as geographical location type, consists of number of bidder, project year, and project duration as input variables and contingency amount as the output variable under 0% of generalization level. This network performs well on the validation set, with an R-squared value of 0.590 and a correlation value of 0.990.

The best network for ANN model Type 8, which represents asphalt resurfacing as project work type, district letting as letting type, unit price as contract agreement type, and urban as geographical location type, consists of number of bidder, project year, and project duration as input variables and contingency amount as the output variable at 60% of generalization level. The network performs well on the validation set, with an R-squared value of 0.702 and a correlation value of 0.899.

The best network for ANN model Type 9, which represents other works as work type, district letting as letting type, unit price as contract agreement type, and urban as geographical location type, is composed of three input variables (number of bidder, project duration, and project amount) and contingency amount as the output variable at 70% of generalization level. It has an R-squared value of 0.683 and a correlation value of 0.872 for the validation set.

With the best networks for the five ANN model types, an easy-to-use prediction tool in an MS Excel spreadsheet was developed for end-users. The program interactively predicts the contingency item for the selected ANN model type before a transportation construction project starts.

Research Limitations

The developed prediction tool is valid only for FDOT transportation projects because all ANN models were trained and validated on data from the FDOT database. Because the ANN methodology used as the main modeling tool is a data-driven technique, not all ANN model types arising from combinations of categorical input variables could be implemented, owing to insufficient data records. Of the fifteen potential input factors for the contingency item identified in the literature review, only ten input variables were considered in the study because sufficient data were not available for the others.

In addition, the NeuroShell Predictor software used in this research has some limitations in designing neural networks. Although it is easy to use and useful for predicting numerical values, it does not provide many powerful options for designing and training the networks, such as choosing architectures and training criteria, memorizing weights during the learning process, or changing parameters such as the learning rate or smoothing factors. It therefore gives the user little flexible control over training the networks in experimental settings.

Some experiments on ANN model Type 1 were carried out with the NeuroShell 2 ANN software and are reported in Appendix D. NeuroShell 2, developed by the Ward Systems Group, is a Microsoft Windows-based system with an easy-to-use, icon-driven user interface. It gives the user extensive control over training because it combines powerful neural network architectures with a variety of options, providing a flexible experimental environment for running neural networks. It contains 16 traditional neural network architectures for prediction problems, with flexible controls for experimentation.

Recommendations for Future Research

The prediction tool for the five feasible ANN model types will be a useful program for FDOT project financial planners because it predicts the contingency item before FDOT-sponsored transportation construction projects start. However, it is not a complete tool for all FDOT projects, since the available project data were limited for some types. If further developed with more project data, the prediction tool could greatly improve the accuracy of the predicted contingency item and cover all FDOT transportation projects.

For better prediction of the contingency item, the problem domain should be appropriately defined in the development phase of the ANN models, and models that perform sufficiently well in all regions of the problem domain should be developed for generalization.

In addition, an experiment with a two-step ANN model can be considered as future research for predicting the contingency more accurately. In this study, contingency amount or contingency rate was used as the output variable and was obtained in one step, from the difference between the original contract amount and the final contract amount, when the neural networks were implemented. However, contingency rate was not the better form of the contingency item as output variable in most ANN models. Therefore, the contingency rate needs to be transformed into a contingency amount for better neural network performance. Two-step ANN models can be implemented and compared with one-step ANN models using the following procedure:

1. Obtain the predicted contingency rate from the NeuroShell Predictor software.
2. Compute Predicted Contingency Amount = Predicted Contingency Rate × Original Contract Amount.
3. Manually calculate R-squared, Avg. Error, and Correlation in Excel spreadsheets.
4. Compare the performance with that of one-step ANN models for predicting contingency amount.
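The two-step procedure can be sketched in a few lines of Python instead of a spreadsheet. The sketch assumes that R-squared is the coefficient of determination (1 - SSE/SST), that Avg. Error is the mean absolute error, and that Correlation is the Pearson coefficient; the software's exact definitions may differ. The project and contingency amounts are the validation rows of Table A-2, with Project Amount treated as the original contract amount, and the predicted rates are invented placeholders rather than model output.

```python
import math


def two_step_amounts(predicted_rates, original_amounts):
    """Step 2: predicted amount = predicted rate x original contract amount."""
    return [rate * amount for rate, amount in zip(predicted_rates, original_amounts)]


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SSE/SST (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def average_error(actual, predicted):
    """Mean absolute difference between actual and predicted values (assumed)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Validation rows of Table A-2: project amounts and desired contingency amounts.
original_amounts = [7_759_612, 8_508_042, 326_278, 7_785_827, 6_309_105]
actual_amounts = [384_467, 129_718, 16_000, 310_291, 752_649]
# Hypothetical predicted rates, made up for illustration (not model output).
predicted_rates = [0.055, 0.030, 0.050, 0.045, 0.100]

predicted_amounts = two_step_amounts(predicted_rates, original_amounts)
print("R-squared:", r_squared(actual_amounts, predicted_amounts))
print("Avg. Error:", average_error(actual_amounts, predicted_amounts))
print("Correlation:", correlation(actual_amounts, predicted_amounts))
```

The same three functions can be applied to the amounts predicted directly by a one-step model, which gives the comparison called for in step 4.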

A complete prediction tool should be developed for accurate prediction of the contingency item through extensive data collection from the state of Florida and other states, as well as through more detailed data classification. Finally, as a result of these research efforts, the prediction tool for the contingency item can become a more refined program for use by owners or sponsors of transportation construction projects.

APPENDIX A
FDOT PROJECT DATA

Table A-1. FDOT project data for ANN model type 1
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
7  2001  335  $10,331,699  $983,508  0.095  Training
3  2003  210  $2,308,136  $411,012  0.178  Training
3  2003  350  $3,215,727  $248,154  0.077  Training
3  2003  250  $5,915,484  $72,304  0.012  Training
4  2003  235  $4,160,971  $107,843  0.026  Training
2  2003  150  $2,910,373  $561,001  0.193  Training
5  2003  415  $2,679,935  $129,495  0.048  Training
4  2003  215  $4,373,073  $98,925  0.023  Training
4  2003  175  $2,291,630  $59,814  0.026  Training
6  2003  250  $3,149,289  $100,000  0.032  Training
5  2003  180  $1,591,861  $60,087  0.038  Training
3  2003  160  $1,924,925  $210,810  0.110  Training
3  2003  100  $1,206,126  $53,793  0.045  Training
5  2003  335  $4,358,460  $129,700  0.030  Training
5  2003  150  $2,233,935  $245,052  0.110  Training
5  2003  80  $1,899,690  $117,664  0.062  Training
4  2003  155  $1,811,845  $50,000  0.028  Training
5  2003  175  $2,878,906  $141,800  0.049  Training
4  2003  120  $410,845  $20,000  0.049  Training
5  2003  360  $2,818,720  $200,000  0.071  Training
6  2003  160  $1,278,966  $91,000  0.071  Training
3  2003  160  $1,407,458  $50,000  0.036  Training
3  2004  150  $2,723,487  $186,143  0.068  Training
4  2004  245  $2,407,419  $102,394  0.043  Training
3  2004  240  $1,807,477  $50,000  0.028  Training
3  2005  180  $1,885,414  $50,000  0.027  Training
3  2005  180  $1,165,515  $50,000  0.043  Training
3  2005  115  $446,825  $25,200  0.056  Training
6  2005  260  $1,524,786  $50,000  0.033  Training
3  2005  167  $5,064,490  $277,731  0.055  Training
4  2005  143  $2,490,732  $50,000  0.020  Training
4  2005  210  $3,197,071  $72,375  0.023  Training
5  2005  170  $2,883,938  $50,000  0.017  Training
5  2005  160  $2,352,855  $50,000  0.021  Training
4  2005  205  $1,917,778  $62,753  0.033  Training
6  2005  125  $1,450,830  $50,000  0.034  Training
6  2005  90  $484,673  $25,902  0.053  Training
6  2002  150  $2,388,224  $175,074  0.073  Training
3  2002  255  $3,720,351  $403,616  0.108  Training
2  2002  350  $6,331,166  $212,406  0.034  Training
3  2002  280  $1,500,641  $50,000  0.033  Training
4  2002  160  $2,971,036  $134,000  0.045  Training
5  2002  235  $3,911,454  $172,575  0.044  Training
3  2003  175  $1,703,619  $50,000  0.029  Training
4  2003  120  $1,038,381  $150,011  0.144  Training
3  2004  210  $6,579,703  $187,968  0.029  Training
3  2004  140  $866,795  $97,070  0.112  Training
3  2004  95  $1,507,327  $50,000  0.033  Training
3  2004  184  $3,429,882  $85,754  0.025  Training
4  2004  147  $1,598,713  $50,000  0.031  Training
2  2004  150  $2,342,641  $62,000  0.026  Training
2  2004  200  $3,649,990  $613,202  0.168  Training
6  2004  120  $301,174  $15,000  0.050  Training
5  2004  90  $393,211  -$11,607  -0.030  Training
5  2004  160  $1,845,358  $50,000  0.027  Training
4  2004  105  $737,320  $121,227  0.164  Training
3  2004  95  $798,974  $50,000  0.063  Training
5  2004  175  $2,769,391  $38,593  0.014  Training
4  2005  215  $2,499,648  -$98,287  -0.039  Training
2  2005  80  $1,477,944  $50,000  0.034  Training
4  2003  445  $6,701,881  $778,141  0.116  Testing
2  2003  210  $1,432,336  $81,461  0.057  Testing
5  2003  250  $3,188,275  $99,052  0.031  Testing
5  2003  185  $2,127,049  $198,500  0.093  Testing
3  2003  110  $1,722,725  $105,107  0.061  Testing
5  2003  80  $375,975  $17,000  0.045  Testing
3  2004  365  $3,966,926  $205,288  0.052  Testing
4  2004  125  $1,677,202  $50,000  0.030  Testing
7  2004  120  $568,653  $25,000  0.044  Testing
6  2004  140  $2,052,486  $50,000  0.024  Testing
3  2005  175  $4,921,187  -$179,277  -0.036  Testing
3  2005  240  $4,415,274  $73,716  0.017  Testing
4  2005  190  $3,654,272  $150,000  0.041  Testing
6  2005  200  $2,762,760  $70,000  0.025  Testing
4  2005  90  $397,218  $21,000  0.053  Testing
3  2005  190  $1,577,389  $49,060  0.031  Testing
4  2005  60  $597,633  $27,000  0.045  Testing
3  2002  160  $986,197  $180,000  0.183  Testing
3  2004  375  $5,007,900  $331,796  0.066  Testing
3  2004  210  $3,380,990  $266,697  0.079  Testing
3  2004  210  $3,058,503  $218,827  0.072  Testing
5  2004  90  $995,953  $50,000  0.050  Testing
5  2001  350  $10,018,068  $1,564,681  0.156  Validation
7  2001  245  $5,169,754  $439,288  0.085  Validation
3  2003  210  $2,395,328  $422,961  0.177  Validation
2  2003  235  $6,015,633  $418,490  0.070  Validation
3  2004  200  $2,684,204  $336,781  0.125  Validation
6  2004  540  $10,027,161  $371,751  0.037  Validation
4  2004  234  $3,451,412  $108,000  0.031  Validation
2  2005  170  $2,393,198  $50,000  0.021  Validation
4  2005  150  $2,629,334  $50,000  0.019  Validation
3  2005  90  $1,934,142  $50,000  0.026  Validation
4  2005  75  $1,279,861  $47,721  0.037  Validation
6  2005  150  $1,403,402  $65,000  0.046  Validation
6  2005  120  $1,281,741  $50,000  0.039  Validation
4  2005  115  $667,889  $31,000  0.046  Validation
4  2005  80  $1,535,043  $50,000  0.033  Validation
4  2005  80  $596,125  $66,424  0.111  Validation
4  2002  190  $1,291,435  $248,095  0.192  Validation
3  2002  150  $3,384,311  $387,120  0.114  Validation
4  2004  75  $638,557  $11,088  0.017  Validation
3  2004  85  $524,940  $15,575  0.030  Validation
3  2004  180  $3,689,402  $507,943  0.138  Validation
4  2004  70  $596,981  $30,000  0.050  Validation
5  2004  160  $2,661,588  $127,229  0.048  Validation
2  2005  95  $1,019,070  $40,000  0.039  Validation

Table A-2. FDOT project data for ANN model type 2
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
3  2000  620  $9,846,159  $251,522  0.026  Training
6  2001  310  $1,968,697  $54,828  0.028  Training
6  2001  530  $10,134,000  $718,545  0.071  Training
5  2001  715  $6,516,937  $940,409  0.144  Training
4  2003  80  $572,211  $69,894  0.122  Training
8  2003  600  $5,582,824  $297,961  0.053  Training
5  2003  580  $12,729,611  $381,858  0.030  Training
5  2003  120  $1,077,645  $62,800  0.058  Training
5  2005  120  $916,005  $46,000  0.050  Training
6  2002  600  $10,592,245  $311,250  0.029  Training
4  2002  394  $3,656,466  $193,703  0.053  Training
5  2002  840  $6,681,715  $673,102  0.101  Training
2  2002  233  $1,514,456  $214,584  0.142  Training
4  2002  270  $1,744,623  $334,375  0.192  Training
6  2004  380  $8,099,184  $250,000  0.031  Training
3  2004  120  $210,927  $36,373  0.172  Training
2  2003  300  $2,778,477  $106,576  0.038  Testing
5  2003  190  $2,125,679  $133,955  0.063  Testing
8  2004  141  $661,906  $57,135  0.086  Testing
4  2005  280  $4,561,111  $114,186  0.025  Testing
5  2002  440  $3,590,131  $91,911  0.026  Testing
6  2003  675  $7,036,181  $754,209  0.107  Testing
5  2003  680  $7,759,612  $384,467  0.050  Validation
7  2003  550  $8,508,042  $129,718  0.015  Validation
4  2003  125  $326,278  $16,000  0.049  Validation
4  2004  550  $7,785,827  $310,291  0.040  Validation
6  2002  840  $6,309,105  $752,649  0.119  Validation

Table A-3. FDOT project data for ANN model type 3
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
6  2000  490  $20,920,636  $740,599  0.035  Training
5  2001  975  $31,346,535  $1,539,574  0.049  Training
4  2001  800  $47,537,401  $1,587,698  0.033  Training
2  2001  1100  $50,659,533  $4,608,985  0.091  Training
5  2001  730  $11,612,855  $443,590  0.038  Training
3  2001  500  $5,637,803  $413,802  0.073  Training
7  2001  445  $3,374,746  $240,407  0.071  Training
2  2003  250  $4,581,747  $80,221  0.018  Training
4  2003  390  $2,195,124  $126,140  0.057  Training
7  2003  340  $4,559,465  $765,010  0.168  Training
3  2002  195  $3,666,961  $95,225  0.026  Training
8  2002  870  $25,514,852  $1,366,803  0.054  Training
5  2002  510  $9,341,555  $167,257  0.018  Training
2  2002  580  $5,850,000  $588,613  0.101  Training
8  2001  375  $3,425,530  $281,534  0.082  Testing
5  2003  480  $9,795,006  $670,351  0.068  Testing
7  2000  1065  $27,390,196  $3,299,714  0.120  Testing
8  2002  720  $9,627,040  $312,566  0.032  Testing
7  2002  360  $7,181,136  $739,787  0.103  Testing
6  2000  1060  $32,842,684  $4,757,798  0.145  Validation
4  2003  270  $2,714,403  $206,380  0.076  Validation
7  2002  580  $10,897,884  $530,312  0.049  Validation
4  2002  445  $5,398,782  $597,744  0.111  Validation

Table A-4. FDOT project data for ANN model type 4
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
5  2003  600  $2,695,908  $241,022  0.089  Training
4  2003  200  $795,258  $73,100  0.092  Training
2  2004  310  $5,154,296  $85,673  0.017  Training
4  2005  180  $1,084,921  $99,894  0.092  Training
2  2005  300  $1,446,680  $252,771  0.175  Training
3  2005  300  $2,650,770  -$21,460  -0.008  Training
4  2005  120  $658,947  $31,257  0.047  Training
5  2002  105  $1,464,603  $220,147  0.150  Training
4  2004  135  $619,152  $41,421  0.067  Training
5  2004  33  $351,162  $26,700  0.076  Training
7  2004  315  $1,505,555  $236,479  0.157  Training
4  2004  320  $748,983  $50,000  0.067  Training
5  2004  685  $214,169  $9,176  0.043  Training
3  2003  100  $254,731  $31,000  0.122  Testing
2  2003  240  $1,587,973  $50,000  0.031  Testing
3  2005  50  $93,900  $4,430  0.047  Testing
3  2005  280  $587,696  $47,800  0.081  Testing
3  2005  60  $195,841  $7,500  0.038  Testing
5  2004  360  $2,371,879  $189,407  0.080  Testing
4  2002  65  $273,204  $38,184  0.140  Testing
2  2005  160  $950,000  $50,000  0.053  Validation
4  2005  120  $166,508  -$5,298  -0.032  Validation
4  2004  135  $619,152  $41,421  0.067  Validation

Table A-5. FDOT project data for ANN model type 5
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
3  2003  275  $6,000,295  -$196,805  -0.033  Training
4  2003  75  $470,390  $26,190  0.056  Training
4  2003  180  $3,650,306  $294,782  0.081  Training
2  2003  75  $888,500  $40,650  0.046  Training
3  2003  165  $2,117,400  $50,000  0.024  Training
4  2003  90  $1,016,630  $50,000  0.049  Training
2  2003  275  $4,804,641  $645,158  0.134  Training
3  2003  250  $5,825,884  $557,971  0.096  Training
2  2003  130  $1,013,504  $18,974  0.019  Training
3  2003  100  $1,231,520  $50,000  0.041  Training
3  2003  90  $1,361,976  $50,000  0.037  Training
3  2003  150  $2,115,600  $184,213  0.087  Training
3  2003  95  $1,098,541  $41,405  0.038  Training
2  2003  60  $757,647  $49,000  0.065  Training
4  2003  145  $4,330,497  $114,628  0.026  Training
2  2004  190  $6,225,399  -$80,187  -0.013  Training
2  2004  75  $860,000  $39,000  0.045  Training
2  2005  174  $6,598,701  $154,726  0.023  Training
3  2005  90  $847,989  $50,000  0.059  Training
2  2005  120  $1,156,950  $25,370  0.022  Training
4  2005  90  $1,148,002  $50,000  0.044  Training
2  2005  150  $4,260,000  $101,623  0.024  Training
2  2005  320  $7,605,245  $175,508  0.023  Training
2  2005  75  $1,371,400  $50,000  0.036  Training
3  2005  150  $1,532,247  $17,439  0.011  Training
4  2002  210  $2,297,835  $28,698  0.012  Training
3  2002  230  $1,719,441  $320,007  0.186  Training
4  2002  148  $1,698,231  $122,480  0.072  Training
3  2002  80  $1,264,000  $60,144  0.048  Training
2  2004  135  $1,916,379  $101,179  0.053  Training
2  2004  73  $1,029,000  $50,000  0.049  Training
3  2004  200  $3,210,500  $206,074  0.064  Training
4  2004  150  $1,769,814  $50,000  0.028  Training
2  2004  225  $5,555,815  $116,507  0.021  Training
4  2004  210  $2,740,885  $50,000  0.018  Training
3  2004  210  $4,034,423  $620,825  0.154  Training
4  2004  90  $473,000  $12,000  0.025  Training
4  2004  120  $655,000  $33,000  0.050  Training
2  2005  145  $2,377,000  $108,843  0.046  Training
2  2005  80  $1,180,000  $50,000  0.042  Training
2  2005  75  $1,927,650  -$10,810  -0.006  Training
3  2005  65  $1,305,429  $133,574  0.102  Training
3  2005  115  $2,137,500  $109,320  0.051  Training
3  2002  110  $1,320,873  $38,510  0.029  Testing
2  2003  250  $2,591,271  $395,432  0.153  Testing
2  2003  150  $4,425,559  $179,163  0.040  Testing
3  2003  85  $586,800  $30,476  0.052  Testing
4  2003  150  $931,000  $38,792  0.042  Testing
3  2003  150  $1,935,000  $27,560  0.014  Testing
2  2004  100  $1,045,650  $80,316  0.077  Testing
2  2005  70  $1,014,122  $23,880  0.024  Testing
3  2005  80  $1,299,000  $41,000  0.032  Testing
2  2002  170  $2,910,458  $153,851  0.053  Testing
3  2002  240  $7,616,300  $72,872  0.010  Testing
2  2004  60  $560,000  $36,000  0.064  Testing
3  2004  100  $1,852,504  $50,000  0.027  Testing
3  2004  90  $245,000  $7,500  0.031  Testing
3  2005  95  $1,998,493  $50,000  0.025  Testing
2  2003  80  $565,600  $22,587  0.040  Validation
4  2003  150  $590,000  $29,750  0.050  Validation
4  2003  65  $807,406  $50,000  0.062  Validation
3  2005  90  $783,885  $38,952  0.050  Validation
3  2005  220  $4,249,990  $100,000  0.024  Validation
2  2005  50  $276,200  $14,800  0.054  Validation
2  2002  90  $1,162,700  $80,671  0.069  Validation
3  2002  275  $7,199,000  $127,369  0.018  Validation
2  2004  80  $1,285,000  $28,700  0.022  Validation
3  2004  245  $6,980,270  $233,314  0.033  Validation
3  2004  160  $709,961  $36,000  0.051  Validation
3  2004  120  $910,094  $30,000  0.033  Validation
3  2005  150  $2,699,900  $50,000  0.019  Validation

Table A-6. FDOT project data for ANN model type 6
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
11  2003  210  $2,001,270  $50,000  0.025  Training
6  2003  154  $419,166  $31,540  0.075  Training
6  2003  177  $384,430  $17,500  0.046  Training
3  2003  100  $434,000  $36,596  0.084  Training
2  2005  190  $952,931  $160,875  0.169  Training
7  2005  135  $2,692,000  $130,756  0.049  Training
2  2005  90  $350,000  $38,504  0.110  Training
4  2005  75  $91,000  $8,574  0.094  Training
3  2005  60  $220,108  $12,238  0.056  Training
3  2005  120  $2,918,859  $50,000  0.017  Training
2  2005  150  $947,696  $24,230  0.026  Training
3  2005  60  $69,600  $5,000  0.072  Training
5  2004  135  $1,826,188  $65,374  0.036  Training
7  2004  150  $461,759  $28,068  0.061  Training
3  2004  45  $247,873  $15,305  0.062  Training
3  2004  90  $132,094  $11,667  0.088  Training
7  2004  60  $167,643  $11,210  0.067  Training
5  2004  240  $1,574,312  $102,507  0.065  Training
3  2004  50  $101,000  $10,000  0.099  Training
4  2004  90  $169,000  $11,233  0.066  Training
5  2005  120  $727,266  $31,479  0.043  Training
8  2003  44  $101,590  $9,846  0.097  Testing
4  2003  240  $2,185,000  $53,038  0.024  Testing
5  2004  106  $207,376  $19,483  0.094  Testing
3  2004  150  $111,177  $11,046  0.099  Testing
4  2004  60  $796,474  $42,000  0.053  Testing
3  2005  85  $535,152  $23,195  0.043  Testing
9  2003  75  $109,323  $6,970  0.064  Validation
7  2003  60  $151,044  $9,285  0.061  Validation
10  2003  70  $258,384  $35,399  0.137  Validation
2  2005  70  $499,542  $64,153  0.128  Validation
3  2005  60  $515,434  $22,000  0.043  Validation

Table A-7. FDOT project data for ANN model type 7
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
1  2003  200  $7,906,706  $189,360  0.024  Training
4  2003  155  $1,188,317  $50,000  0.042  Training
1  2003  180  $2,879,998  $100,000  0.035  Training
3  2004  220  $5,640,000  $223,658  0.040  Training
3  2005  210  $6,409,710  $423,168  0.066  Training
1  2005  180  $3,480,130  $50,000  0.014  Training
2  2005  210  $11,182,236  $58,669  0.005  Training
2  2005  120  $641,000  $98,662  0.154  Training
3  2002  180  $3,498,235  $131,767  0.038  Training
2  2004  200  $2,802,100  $50,000  0.018  Training
1  2004  185  $2,028,500  $50,000  0.025  Training
3  2004  140  $2,528,000  $119,919  0.047  Training
3  2004  225  $7,305,048  $67,640  0.009  Training
1  2004  182  $1,499,000  $50,000  0.033  Training
1  2005  145  $2,675,000  $50,000  0.019  Training
4  2005  60  $1,036,000  $43,989  0.042  Training
3  2003  120  $2,049,725  $197,732  0.096  Testing
3  2005  100  $573,030  $64,143  0.112  Testing
1  2004  60  $406,000  $19,565  0.048  Testing
2  2004  120  $654,126  $35,250  0.054  Testing
1  2004  100  $598,113  $25,238  0.042  Testing
3  2003  200  $2,298,497  $199,401  0.087  Validation
2  2003  200  $3,396,600  $100,000  0.029  Validation
2  2004  100  $1,157,464  $92,667  0.080  Validation
3  2004  63  $401,520  $15,000  0.037  Validation

Table A-8. FDOT project data for ANN model type 8
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
4  2003  210  $3,333,098  $292,353  0.088  Training
3  2003  140  $1,479,172  $50,000  0.034  Training
4  2003  120  $1,962,758  $52,942  0.027  Training
3  2004  180  $1,325,985  $74,480  0.056  Training
3  2005  75  $615,437  $37,686  0.061  Training
3  2005  170  $674,722  $30,000  0.044  Training
3  2005  60  $250,281  $49,646  0.198  Training
3  2005  260  $5,165,912  $70,529  0.014  Training
3  2004  255  $3,865,881  $29,232  0.008  Training
3  2004  200  $2,776,125  $50,000  0.018  Training
6  2004  160  $1,419,179  $50,000  0.035  Training
6  2004  120  $618,975  $27,000  0.044  Training
3  2004  100  $748,527  $22,000  0.029  Training
5  2003  250  $2,405,152  $254,825  0.106  Testing
4  2003  180  $1,415,643  $234,109  0.165  Testing
6  2003  180  $2,187,136  $126,994  0.058  Testing
4  2005  220  $926,352  $80,711  0.087  Testing
3  2004  250  $2,197,039  $117,000  0.053  Testing
3  2004  170  $615,862  $34,621  0.056  Testing
3  2003  270  $2,512,429  $385,259  0.153  Validation
5  2003  150  $706,534  $33,750  0.048  Validation
3  2004  90  $316,178  $12,500  0.040  Validation
3  2004  130  $1,233,247  $84,684  0.069  Validation

Table A-9. FDOT project data for ANN model type 9
No. of Bidders  Project Year  Project Duration  Project Amount  Desired Contingency Amount  Desired Contingency Rate  Remark
1  2003  80  $103,485  $10,383  0.100  Training
1  2003  200  $546,438  $64,725  0.118  Training
4  2003  270  $967,575  $3,900  0.004  Training
1  2005  100  $925,493  $43,000  0.046  Training
4  2005  90  $205,300  $10,700  0.052  Training
2  2005  75  $217,289  $10,000  0.046  Training
3  2005  300  $987,711  $45,000  0.046  Training
1  2005  320  $332,862  $20,225  0.061  Training
2  2005  130  $434,759  $25,000  0.058  Training
3  2005  240  $252,009  $10,920  0.043  Training
4  2004  100  $387,358  $71,651  0.185  Training
4  2004  125  $426,971  $27,500  0.064  Training
3  2004  80  $258,035  $21,900  0.085  Training
2  2004  60  $118,542  $0  0.000  Training
4  2004  100  $191,395  $28,905  0.151  Training
4  2006  90  $294,616  $21,971  0.075  Training
5  2006  75  $105,274  $7,851  0.075  Training
2  2005  100  $321,500  $39,500  0.123  Testing
5  2005  60  $167,082  $10,848  0.065  Testing
2  2005  60  $78,825  $6,660  0.084  Testing
2  2005  45  $321,626  $15,000  0.047  Testing
1  2005  50  $383,250  $15,750  0.041  Testing
3  2004  100  $637,285  $37,500  0.059  Testing
3  2004  250  $941,712  $50,000  0.053  Testing
3  2006  60  $200,793  $13,000  0.065  Testing
3  2003  85  $95,534  $5,000  0.052  Validation
1  2003  180  $1,348,059  $84,917  0.063  Validation
5  2005  175  $695,122  $42,000  0.060  Validation
5  2005  90  $474,760  $22,500  0.047  Validation
2  2005  60  $661,064  $7,373  0.011  Validation

APPENDIX B
DISTRIBUTION GRAPH BETWEEN INPUT VARIABLES

[Figure: six panels, each plotting training, testing, and validation data. Panels: Project Year vs. Number of Bidders; Project Duration vs. Number of Bidders; Project Amount vs. Number of Bidders; Project Duration vs. Project Year; Project Amount vs. Project Year; Project Amount vs. Project Duration.]

Figure B-1. Distribution graphs between input variables on the best neural network (Type 4)

[Figure: six panels, each plotting training, testing, and validation data. Panels: Project Year vs. Number of Bidders; Project Duration vs. Number of Bidders; Project Duration vs. Project Year; Project Amount vs. Number of Bidders; Project Amount vs. Project Duration; Project Amount vs. Project Year.]

Figure B-2. Distribution graphs between input variables on the worst neural network (Type 1)

APPENDIX C
EVALUATION GRAPH FOR TESTING ERROR

[Figure: four panels plotting Testing Error against Number of Bidders, Project Year, Project Duration, and Project Amount.]

Figure C-1. Evaluation graphs for testing error on the best neural network (Type 4)

[Figure: four panels plotting Testing Error against Number of Bidders, Project Year, Project Duration, and Project Amount.]

Figure C-2. Evaluation graphs for testing error on the worst neural network (Type 1)

APPENDIX D
EXPERIMENT ON THE NEUROSHELL 2 SOFTWARE

The networks of ANN model type 1 were implemented using feedforward backpropagation and the sigmoid activation function in the NeuroShell 2 software. The data for ANN model type 1 were randomly divided into the same three independent sets used with the NeuroShell Predictor software. Figure D-2 shows the learning process of the networks for model type 1 in the NeuroShell 2 software.

For the learning process, the level of Complexity was set to "Complex and very noisy" because other unpredictable input factors affecting the contingency item (weather, site condition, and environmental impacts) were not included in this research, and highway construction costs are therefore very noisy. When this level of Complexity is chosen, the learning rate and the momentum factor are automatically set to 0.05 and 0.5, respectively. Pattern Selection was set to Random because the data were coded in chronological and district order, and the network was automatically saved at the best test-set result during approximately 20,000 learning epochs.

Figure D-1. NeuroShell 2 software

The optimal number of hidden neurons was found through comparative analysis (1 to 20 hidden neurons), although NeuroShell 2 provides a default number of hidden neurons from the following formula:

Number of hidden neurons = (Inputs + Outputs) + Sqrt(Number of Patterns)

The Net-Perfect setting specifies how often the test set is evaluated. It optimizes the network by applying the current network to an independent test set during training, which helps the network generalize well and give good results on new data. In this application, the Net-Perfect interval range was set to [...]

Figure D-2. Learning process on the NeuroShell 2 software

Through the comparative analysis, an optimal number of hidden neurons was found, as shown in Table D-1 and Figure D-3. The test data were used to evaluate the performance of the models and to select the best one among the 20 competing models based on the value of Average Error. Compared with the best network for ANN model type 1 in the NeuroShell Predictor software, the network at the optimal number of hidden neurons (7) in the NeuroShell 2 software performs better in terms of R-squared, Average Error, and Correlation Coefficient on the testing and validation sets.
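The training setup described above can be sketched in plain Python. This is not the NeuroShell 2 implementation. It is a minimal one-hidden-layer sigmoid network trained by backpropagation with a learning rate of 0.05, a momentum factor of 0.5, and random pattern selection, which keeps a copy of the weights that gave the lowest average error on an independent test set. The function names, the synthetic data, the epoch count, and the `check_every` interval are all assumptions made for illustration.

```python
import copy
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, x):
    """Return (hidden activations, output) for one input pattern."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1])
              for row in net["hidden"]]
    out = net["out"]
    y = sigmoid(sum(w * h for w, h in zip(out, hidden)) + out[-1])
    return hidden, y


def avg_error(net, data):
    """Mean absolute error of the network over (inputs, target) pairs."""
    return sum(abs(t - forward(net, x)[1]) for x, t in data) / len(data)


def train(train_set, test_set, n_hidden, epochs=200, lr=0.05, momentum=0.5,
          check_every=10, seed=0):
    rng = random.Random(seed)
    n_in = len(train_set[0][0])
    # Each weight row ends with a bias term.
    net = {"hidden": [[rng.uniform(-0.3, 0.3) for _ in range(n_in + 1)]
                      for _ in range(n_hidden)],
           "out": [rng.uniform(-0.3, 0.3) for _ in range(n_hidden + 1)]}
    prev_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    prev_out = [0.0] * (n_hidden + 1)
    best_net, best_err = copy.deepcopy(net), avg_error(net, test_set)
    for epoch in range(1, epochs + 1):
        for _ in range(len(train_set)):
            x, t = rng.choice(train_set)  # random pattern selection
            hidden, y = forward(net, x)
            d_out = (t - y) * y * (1.0 - y)
            # Hidden deltas use the output weights from before this update.
            d_hidden = [d_out * net["out"][j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
            for j, h in enumerate(hidden + [1.0]):
                prev_out[j] = lr * d_out * h + momentum * prev_out[j]
                net["out"][j] += prev_out[j]
            for j in range(n_hidden):
                for i, xi in enumerate(list(x) + [1.0]):
                    prev_hidden[j][i] = (lr * d_hidden[j] * xi
                                         + momentum * prev_hidden[j][i])
                    net["hidden"][j][i] += prev_hidden[j][i]
        if epoch % check_every == 0:
            err = avg_error(net, test_set)
            if err < best_err:  # keep the weights that did best on the test set
                best_net, best_err = copy.deepcopy(net), err
    return best_net, best_err


# Synthetic, illustrative data only: two inputs in [0, 1], target in [0.2, 0.8].
data_rng = random.Random(1)
patterns = []
for _ in range(60):
    a, b = data_rng.random(), data_rng.random()
    patterns.append(((a, b), 0.2 + 0.6 * a * b))
train_set, test_set = patterns[:40], patterns[40:]

best_net, best_err = train(train_set, test_set, n_hidden=3)
print(f"best test-set average error: {best_err:.4f}")
```

Because the saved network is chosen on the test set, a separate validation set is still needed for an unbiased performance estimate, which is why the data were divided into three sets.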

Table D-1. Performance of the networks for ANN model type 1
Number of hidden neurons  R-squared (Train)  R-squared (Test)  R-squared (Validation)  Avg. Error (Train)  Avg. Error (Test)  Avg. Error (Validation)  Correlation Coefficient (Train)  Correlation Coefficient (Test)  Correlation Coefficient (Validation)
1  0.451  0.490  0.528  81010  73438  103822  0.676  0.781  0.877
2  0.432  0.513  0.501  82781  71213  109050  0.661  0.777  0.861
3  0.436  0.513  0.514  81820  72071  106352  0.664  0.771  0.868
4  0.449  0.526  0.548  82363  71541  104752  0.673  0.794  0.875
5  0.429  0.498  0.552  87820  74897  101063  0.672  0.766  0.873
6  0.445  0.522  0.549  84266  71511  105976  0.672  0.774  0.869
7  0.453  0.536  0.570  83490  69786  104652  0.678  0.783  0.873
8  0.441  0.531  0.544  84782  70677  106715  0.670  0.784  0.866
9  0.450  0.531  0.569  84219  70560  103699  0.677  0.779  0.874
10  0.443  0.536  0.554  84844  69953  106359  0.673  0.779  0.866
11  0.444  0.528  0.567  85355  71197  103114  0.676  0.781  0.874
12  0.427  0.511  0.588  89997  75261  98416  0.678  0.779  0.876
13  0.445  0.520  0.574  86079  71591  102553  0.678  0.771  0.875
14  0.451  0.515  0.586  85922  73127  101251  0.681  0.769  0.879
15  0.422  0.520  0.579  90529  74566  100254  0.676  0.782  0.868
16  0.446  0.520  0.584  86835  72116  101602  0.679  0.769  0.875
17  0.423  0.514  0.552  88820  74991  102104  0.667  0.783  0.871
18  0.449  0.507  0.590  86832  72533  100428  0.684  0.762  0.879
19  0.450  0.503  0.591  86904  72477  100323  0.685  0.758  0.880
20  0.432  0.506  0.567  88536  74630  100966  0.675  0.773  0.875

[Figure: Avg. Error of Test Set plotted against No. of Hidden Neurons (1 to 20).]

Figure D-3. Searching for an optimal number of hidden neurons minimizing the testing error
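The selection rule behind Table D-1 and Figure D-3, choosing the number of hidden neurons with the lowest test-set Avg. Error, can be reproduced with a short sketch. The values are the test-set Avg. Error column of Table D-1; the variable names are illustrative.

```python
# Test-set Avg. Error from Table D-1, keyed by number of hidden neurons.
test_avg_error = {
    1: 73438, 2: 71213, 3: 72071, 4: 71541, 5: 74897,
    6: 71511, 7: 69786, 8: 70677, 9: 70560, 10: 69953,
    11: 71197, 12: 75261, 13: 71591, 14: 73127, 15: 74566,
    16: 72116, 17: 74991, 18: 72533, 19: 72477, 20: 74630,
}

# Pick the network size whose test-set error is smallest.
best_hidden = min(test_avg_error, key=test_avg_error.get)
print(best_hidden, test_avg_error[best_hidden])
```

The margin over the runner-up (10 hidden neurons, 69953) is small, so the choice may be sensitive to the particular test set.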

The Contribution Factor is a rough measure of the importance of each input variable in predicting the network's output relative to the other input variables in the same network. Table D-2 and Figure D-4 show the Contribution Factor of each input variable on the networks.

Table D-2. Contribution Factor of the networks for ANN model type 1

Hidden
neurons  Number of Bidder  PY     Duration  Amount
1        0.31              1.19   0.11      1.80
2        0.39              1.63   0.56      2.43
3        0.68              1.74   0.41      2.21
4        0.56              1.86   0.64      3.05
5        1.09              2.21   0.58      3.47
6        0.77              2.47   1.07      3.82
7        0.94              2.64   1.46      4.14
8        1.39              2.89   1.26      4.62
9        1.85              2.84   1.34      4.26
10       1.44              2.74   1.67      4.67
11       1.87              3.22   1.81      4.25
12       1.62              3.25   2.15      5.39
13       1.77              3.36   1.83      5.21
14       1.33              3.93   1.76      5.49
15       2.16              3.52   2.65      6.11
16       2.65              4.12   2.17      6.02
17       1.97              4.35   3.06      5.73
18       2.09              4.16   2.39      6.46
19       2.12              4.57   3.11      6.82
20       2.82              4.33   3.28      6.33

[Figure: x-axis, Input Variable (No. of Bidder, PY, Duration, Amount); y-axis, Contribution Factor]
Figure D-4. Contribution Factor on the network at the optimal number of hidden neurons
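Because the Contribution Factor is meaningful only relative to the other inputs of the same network, one way to read a row of Table D-2 is to rescale it so the factors sum to one. The sketch below does this for the network with 7 hidden neurons. The rescaling is an illustrative assumption, not an output of NeuroShell 2, and the function name `relative_shares` is hypothetical.

```python
# Contribution Factors at 7 hidden neurons, copied from Table D-2.
contribution = {
    "No. of Bidder": 0.94,
    "PY": 2.64,
    "Duration": 1.46,
    "Amount": 4.14,
}

def relative_shares(factors):
    """Rescale the factors of one network so that they sum to 1."""
    total = sum(factors.values())
    return {name: value / total for name, value in factors.items()}

shares = relative_shares(contribution)
ranking = sorted(shares, key=shares.get, reverse=True)
for name in ranking:
    print(f"{name:<14} {100.0 * shares[name]:5.1f} %")
```

On this scale Amount accounts for roughly 45% of the total, PY about 29%, Duration about 16%, and No. of Bidder about 10%. The raw factors grow with the number of hidden neurons, which is why rows of Table D-2 should not be compared directly with each other.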

For sensitivity analysis, each ANN model was implemented by excluding one input variable from the network at the optimal number of hidden neurons. The performance of these networks was compared with the performance of the network with all input variables, as seen in Table D-3 and Figure D-5. Table D-4 and Figure D-6 show the contribution factor of the input variables on each network under alternative sets of input variables at the optimal number of hidden neurons.

Table D-3. Performance of the networks under alternative sets of input variables at the optimal number of hidden neurons

Input variable  R-squared                   Avg. Error                  Correlation Coefficient
excluded        Train   Test    Validation  Train   Test    Validation  Train   Test    Validation
None            0.453   0.536   0.570       83490   69786   104652      0.678   0.783   0.873
No. of Bidder   0.434   0.538   0.523       84386   71624   110178      0.661   0.797   0.853
PY              0.377   0.434   0.400       84165   79775   131039      0.617   0.661   0.756
Duration        0.475   0.505   0.617       83649   70421   101125      0.698   0.750   0.879
Amount          0.257   0.391   0.360       95451   89403   121670      0.519   0.668   0.816

[Figure: x-axis, Input Variable Excluded from Model (None, No. of Bidder, PY, Duration, Amount); y-axis, Avg. Error of Test Set]
Figure D-5. Searching for a set of input variables for the network minimizing the testing error
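The leave-one-variable-out comparison in Table D-3 can be summarized as the percentage change in test-set Average Error relative to the network with all four inputs. The sketch below uses the values from the table; expressing the result as a percentage increase is an assumption made for illustration, and the function name `percent_increase` is hypothetical.

```python
# Test-set Average Error from Table D-3, keyed by the excluded input variable.
baseline_error = 69786  # all four input variables included ("None" row)
error_when_excluded = {
    "No. of Bidder": 71624,
    "PY": 79775,
    "Duration": 70421,
    "Amount": 89403,
}

def percent_increase(excluded_error, baseline):
    """Percentage change in error caused by dropping one input variable."""
    return 100.0 * (excluded_error - baseline) / baseline

increase = {
    name: percent_increase(error, baseline_error)
    for name, error in error_when_excluded.items()
}
ranking = sorted(increase, key=increase.get, reverse=True)
for name in ranking:
    print(f"{name:<14} {increase[name]:6.2f} %")
```

By this measure, dropping Amount raises the test error by about 28% and dropping PY by about 14%, while dropping No. of Bidder or Duration changes it by less than 3%.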

Table D-4. Contribution Factor of input variables on the networks under alternative sets at the optimal number of hidden neurons

Input variable
excluded        No. of Bidder  PY     Duration  Amount
None            0.94           2.64   1.46      4.14
No. of Bidder   -              2.72   1.48      4.54
PY              1.10           -      2.17      5.16
Duration        1.53           2.40   -         5.46
Amount          1.27           4.07   2.75      -

[Figure: x-axis, Input Variable Excluded from Model (None, No. of Bidder, PY, Duration, Amount); y-axis, Contribution Factor; series, No. of Bidder, PY, Duration, Amount]
Figure D-6. Contribution Factor for ANN model under alternative sets of input variables

Table D-5 and Figure D-7 show the performance of the network at the optimal number of hidden neurons under alternative numbers of input variables. Input variables were excluded from the network in descending order of contribution factor. Table D-6, Table D-7, and Figure D-8 show the performance of the networks under alternative datasets with different training, testing, and validation sets. They also compare the performance of networks that memorize the previous weights with that of networks that forget the previous weights.

Table D-5. Performance of the networks under alternative numbers of input variables

No. of input  R-squared                   Avg. Error                  Correlation Coefficient
variables     Train   Test    Validation  Train   Test    Validation  Train   Test    Validation
4             0.453   0.536   0.570       83490   69786   104652      0.678   0.783   0.873
3             0.479   0.415   0.442       93984   66979   121221      0.713   0.645   0.777
2             0.463   0.503   0.607       85319   73655   101785      0.690   0.757   0.864
1             0.477   0.393   0.543       80819   85605   136192      0.692   0.657   0.766

[Figure: x-axis, Number of Input Variables to Model (1 to 4); y-axis, Avg. Error of Test Set]
Figure D-7. Searching for a number of input variables for the network minimizing the testing error

Table D-6. Performance of the networks forgetting previous weights

         R-squared                   Avg. Error                  Correlation Coefficient
Trial    Train   Test    Validation  Train   Test    Validation  Train   Test    Validation
1        0.453   0.536   0.570       83490   69786   104652      0.678   0.783   0.873
2        0.624   0.450   0.643       78745   64445   113318      0.798   0.689   0.824
3        0.414   0.634   0.748       112046  51891   69335       0.685   0.800   0.917
4        0.729   0.487   0.348       89064   65020   56139       0.854   0.698   0.599
5        0.628   0.669   0.412       76979   129799  74410       0.795   0.826   0.685

Table D-7. Performance of the networks memorizing previous weights

         R-squared                   Avg. Error                  Correlation Coefficient
Trial    Train   Test    Validation  Train   Test    Validation  Train   Test    Validation
1        0.453   0.536   0.570       83490   69786   104652      0.678   0.783   0.873
2        0.626   0.465   0.462       80970   71400   115451      0.791   0.700   0.862
3        0.508   0.661   0.902       102162  46719   48899       0.720   0.815   0.962
4        0.572   0.692   0.462       95294   53817   51041       0.759   0.836   0.772
5        0.654   0.562   0.451       61191   137480  66210       0.810   0.777   0.736
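The three measures reported in the tables above can be sketched as follows. The definitions are assumptions based on common usage (R-squared as the coefficient of determination, Average Error as the mean absolute error, and Correlation Coefficient as Pearson's r); the thesis does not restate them in this appendix, and the function names are hypothetical.

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((actual - predicted) ** 2)
    sst = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - sse / sst)

def average_error(actual, predicted):
    """Mean absolute difference between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

def correlation_coefficient(actual, predicted):
    """Pearson linear correlation between actual and predicted values."""
    return float(np.corrcoef(actual, predicted)[0, 1])

# Made-up values: every prediction is 50 too high.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [150.0, 250.0, 350.0, 450.0]
print(r_squared(actual, predicted))                # 0.8
print(average_error(actual, predicted))            # 50.0
print(correlation_coefficient(actual, predicted))  # 1.0 up to rounding
```

Under these definitions a constant bias lowers R-squared but leaves the correlation at 1, which is consistent with the tables reporting an R-squared well below the square of the Correlation Coefficient.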

[Figure: x-axis, Trial (1 to 5 and Average); y-axis, Avg. Error of Test Set; series, Forgetting Previous Weights and Memorizing Previous Weights]
Figure D-8. Performance of the networks under alternative datasets
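The "Average" category on the trial axis of Figure D-8 is assumed here to be the arithmetic mean of the test-set Average Error over the five trials. Under that assumption it can be recomputed from Tables D-6 and D-7:

```python
from statistics import mean

# Test-set Average Error for trials 1 to 5.
forgetting = [69786, 64445, 51891, 65020, 129799]   # Table D-6
memorizing = [69786, 71400, 46719, 53817, 137480]   # Table D-7

mean_forgetting = mean(forgetting)
mean_memorizing = mean(memorizing)
print(f"Forgetting previous weights: {mean_forgetting:.1f}")   # 76188.2
print(f"Memorizing previous weights: {mean_memorizing:.1f}")   # 75840.4
```

The two means differ by less than 1%, while the error within either strategy varies by more than a factor of two across trials, so the choice of dataset split matters far more here than whether the previous weights are retained.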

LIST OF REFERENCES Adeli, H. and Wu, M. (1998). Regularization ne ural network for construction cost estimation. J. Constr. Eng. Manage., 124(1), 18-24. American Association of Cost Engineers (AACE)s Risk Management Committee (2000). AACE Internationals risk management dictionary. Cost Eng., 42(4), 28-31. Anderson, D. and McNeil, G. (1992). Artificial neural ne tworks technology, Data & Analysis Center for Software ( DACS), Rome, New York. Ahmad, I. (1992). Contingency alloca tion: A computer-aided approach. AACE International Transactions, F.5.1-F.5.5. Ahuja, H. N., Dozzi, S. P., and Abourizk, S. M. (1994). Project management, 2nd Ed., John Wiley & Sons, New York. Aquino, P. (1992). A PERT approach to cost risk analysis. AACE International Transactions, F.4.1-F.4.7. Association for the Advancement of Cost Engineering (AACE) International (1997). Cost Estimate Classification System, AACE International Recommended Practice No. 18R-97, Morgantown, WV. Baccarini, D. (2004). Accuracy in estimating project cost construction con tingency: A statistical analysis. The International Construction Research Conference of the Ro yal Institution of Chartered Surveyors, Headingley Cricket Club, Leeds. Australia. Burger, R. (2003). Contingency, quantifying the uncertainty. Cost Eng., 45(8), 12-17. Burroughs, S. E. and Juntima, G. (2004). E xploring techniques for contingency setting. AACE International Transactions, EST.03.1-EST.03.6. Chau, K. W. (1995). The validity of the tria ngle distribution assumption in monte carlo simulation of construction costs: Empirical evidence from Hong Kong. Constr. Manage. Econom., 13, 15-21. Chen, D. and Hartman, F. T. (2000). A neur al network approach to risk assessment and contingency allocation. AACE International Transactions, RISK.07.01-RISK.07.06. DARPA (1988). DARPA neural network study, AFCEA Press, Washington, DC. De Neufvill, R. and King, D. (1991). Risk and n eed for work premiums in contractor bidding. J. Constr. Eng. 
Manage., 117(4), 659-673. Dey, P., Tabucanon, M. T. and Ogunlana, S. O. (1994). Planning for project control through risk analysis: A petroleu m pipelaying project. Int. J. Proj. Manage., 12(1), 23-33. Drigani, F. (1988). Computerized project control, Marcel Dekker, New York. 166

PAGE 167

Federal Highway Administration (FHWA) (2007a). Major project program cost estimating guidance, Washington, DC. Federal Highway Administration (FHWA) (2007b). Contingency fund management for major projects, Washington, DC. Flood, I. and Kartam, N. (1994a). Neural networ ks in civil engineeri ng I: Principles and understanding. J. Comput. Civ. Eng., 8(2), 131-148. Flood, I. and Kartam, N. (1994b). Neural netw orks in civil engineering II: Systems and applications. J. Comput. Civ. Eng., 8(2), 149-162. Flood, I. and Kartam, N. (1997). System. Artificial neural networks for civil engineers: Fundamentals and applications, N. Kartam, I. Flood, and J. H. Garrett, Jr., Eds., ASCE, New York, 19-43. Flood, I. (1999). Modeling dynamic engineerin g processes using radialgaussian neural networks. J. Intel. Fuzzy Sys., 7, 373-385. Flood, I., Muszynski, L., and Nandy, S. (2001). Rapid analysis of externally reinforced concrete beams using neural networks. Comp. Struct., 79, 1553-1559. Florida Department of Transportation (FDOT) (2007). Design-Build guidelines, Tallahassee, FL. < http://www.dot.state.fl.us/construction/DesignBuild/DB%20Rules/DesignBuildGuideline s.doc > (Jan. 03, 2008). Florida Department of Transportation (FDOT) (2001). Lump sum project guidelines, Tallahassee, FL. (Jan. 06, 2008). Florida Department of Transportation (FDOT) (2002). Construction project administration manual, Tallahassee, FL. Flyvberg, B., Holm, M. S., and Buhl, S. (2002). U nderestimating costs in public works projects: Error or lie? J. Amer. Plan. Assoc., 68(3), 279-295. Gagarin, N., Flood, I., and Albrech t, P. (1994). Computing truck attr ibutes with artificial neural networks. J. Comput. Civ. Eng., 8(2), 179-200. Gnhan, S. and Arditi, D. (2007). Budge ting owners construction contingency. J. Constr. Eng. Manage., 133(7), 492-497. Hanna, A. S., Russell, J. S., Taha, M. A., and Park S. C. (1997). Application of neural networks to owner-contractor prequalification. 
Artificial neural networks for civil engineers: Fundamentals and applications, N. Kartam, I. Flood, and J. H. Garrett, Jr., Eds., ASCE, New York., 124-136. Haykin, S. (1994). Neural networks: A comprehensive foundation, Macmillan, New York. 167

PAGE 168

Healey, J. F. (1999). Statistics: A tool for social research, Wadsworth Publishing Company, Belmont, CA. Issa, R. R., Flood, I., and Martini, A. (1998). Estimating the energy performance index of buildings. Artificial neural networks for civi l engineers: Advanced features and applications, I. Flood and N. Kartam., Eds., ASCE, Reston, VA., 260-272. Jelen, F. C. and Black, J. H. (1983). Cost and optimization engineering, McGraw Hill, New York. Karlsen, J. and Lereim, J. (2005). Managem ent of project conti ngency and allowance. Cost Eng., 47(9), 24-29. Kohonen, T. (1984). Self-organization and associative memory, Springer-Verlag KG, Berlin, Germany. Lorance, R. B. (1992). Contingency draw-down using risk analysis. AACE International Transactions, F.6.1-F.6.6. Mak, S. and Picken, D. (2000). Using risk analysis to determine construction project contingencies. J. Constr. Eng. Manage., 126(2), 130-136. Moselhi, O. (1997). Risk assessm ent and contingency estimating. AACE International Transactions, A.06.1-A.06.6. Moselhi, O., Hegazy, T., and Fazio, P. ( 1993). DBID: Analogy-based DSS for bidding in construction. J. Constr. Eng. Manage., 119(3), 466-479. Molenaar, K. R. (2005). Programmatic cost risk analysis for highway megaprojects. J. Constr. Eng. Manage., 131(3), 343-353. Nassar, K. (2002). Cost contingency analysis fo r construction projects using spreadsheets. Cost Eng., 44(9), 26-31. Nigrin, A. (1993). Neural networks for pattern recognition, MIT Press, Cambridge, MA. Paek, J. H., Lee, Y. W., and Ock, J. H. (1993). Pricing construction risk: Fuzzy set application. J. Constr. Eng. Manage., 119(4), 743-756. Paek, J. H., Lee, Y. W., and Napier, T. R. (1992). Selection of desi gn/build proposal using a fuzzy logic system. J. Constr. Eng. Manage., 118(2), 303-317. Patrascue, A. (1988). Construction cost engineering handbook, Marcel Dekker, New York. Project Management Institute (PMI) (2000). A guide to the project management body of knowledge, Upper Darby, PA. 
Popescu, C. M., Phaobunjong, K., and Ovararin, N. (2003). Estimating building costs, Marcel Dekker, New York. 168

PAGE 169

Rothwell, G. (2005). Cost contingency as the standard deviation of the cost estimate. Cost Eng., 47(7), 22-25. Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning internal representations by error propagation. Parallel Distributed Processing, D.E. Rumelhart and J. McClelland, Eds., MIT Press, Cambridge, MA. Scheaffer, R. L. and McClave, J. T. (1990). Probability and statistics for engineers, 3rd Ed., PWS-KENT Publishing Company, Boston, MA. Smith, G. R. and Bohn, C. M. (1999). Small to medium contractor contingency and assumption of risk. J. Constr. Eng. Manage., 125(2), 101-108. Thompson, P. A. and Perry, J. G. (1992). Engineering construction risks: A guide to project risk analysis and risk management, Thomas Telford, London. U.K. Touran, A. (2003). Calculation of con tingency in construction projects. IEEE Trans. Eng. Manage., 50(2), 135-140. Twomey, J. M. and Smith, A. E. ( 1997). Validation and Verification. Artificial neural networks for civil engineers: Fundamentals and applications, N. Kartam, I. Flood, and J. H. Garrett, Jr., Eds., ASCE, New York. 44-64. U.S. Department of Agriculture (US DA) Economic Research Services (2000), Rural Definitions: State-Level Map (Florida), Washington, DC. < http://www.ers.usda.gov/data/ruraldefinitions/maps.htm > (Oct. 19, 2008). U.S. Department of Energy (DOE) (1994). Cost guide, Washington, DC. Yeo, K. T. (1990). Risks, classification of estimates, and contingency management. J. Manage. Eng., 6(4). 458-470. Zurada, J. M. (1992). Introduction to artificia l neural systems, PWS Publishing Company, Boston, MA. 169

BIOGRAPHICAL SKETCH

Sang Choon Lhee was born in Daegu, Korea in 1974. He received his Bachelor of Science from the Department of Industrial Engineering at Hanyang University, Seoul, Korea. After developing a strong interest in the construction engineering and management field, he attended Yonsei University, Seoul, Korea and was awarded his second bachelor's degree by the Department of Architectural Engineering. After graduation, he obtained the First-level Architectural Engineer certificate and worked as an assistant supervisor for a construction engineering company in Korea for eighteen months.

In order to broaden his knowledge and experience in the field of construction management, he came to the United States and enrolled in the Construction Management Program in the School of Civil Engineering at Georgia Institute of Technology. After receiving his master's degree from Georgia Institute of Technology, he joined the Rinker School of Building Construction at the University of Florida to pursue the Doctor of Philosophy degree. During his doctoral studies, he worked as a teaching assistant for several undergraduate courses, such as Timber and Formwork Design, Steel Design, and Soils and Concrete Construction, and passed the USGBC (United States Green Building Council) LEED (Leadership in Energy and Environmental Design) Accredited Professional exam.