POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER

By ROOP KISHORE R. MATAVALAM

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA 2004

Copyright 2004 By Roop Kishore R. Matavalam

To my parents and my sister

ACKNOWLEDGMENTS

First and foremost, I thank my advisor, Dr. Alexander Domijan, for having confidence in me and allowing me to pursue this research. The work presented in this thesis would not have been possible without his consistent support. I am also very thankful to Dr. Khai D.T. Ngo and Dr. Antonio A. Arroyo for serving as members of my thesis committee. I sincerely express my gratitude to my colleague and friend William S. Wilcox for his thorough discussions of statistics and his suggestions, without which this thesis would not have been as exciting. I am also grateful to my colleague and friend Alejandro Montenegro for his suggestions and for answering my questions without hesitation. I am very grateful to my friend Raj Vignesh Thogulua for helping me understand neural networks. I am also thankful to Dr. Tao Lin for his helpful suggestions. I also thank my colleague and friend Ajay Karthik for being enthusiastic about my work.

I would like to express my special thanks to all the personnel of the FPL Distribution Reliability group, especially Mr. J. R. "Pepe" Diaz, Ms. Lee Davis and Ms. Jessica D'Agostini, for their valuable suggestions and financial support of this project; without their consistent support the project would not have been finished. I am also thankful to FPL members including Mr. Val Miklausich, Mr. Santiageo Cocina, Mr. Luis Delfom, Mr. Manny Miranda, and Ms. Martha Caneia for their assistance. I would like to express my gratitude to all the undergraduate students who worked on the FPL project and made it more lively and interesting.
I also extend my gratitude to all my great friends for their support and encouragement. Finally, yet most importantly, I am indebted to my wonderful parents and sister for believing in my goals and aspirations, and for their love, encouragement, and constant support in all my endeavors.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Importance of Power Reliability
  1.2 Understanding Power Reliability Indices
  1.3 Purpose and Importance of this Thesis
  1.4 Organization of Thesis

2 UNDERSTANDING FLORIDA WEATHER
  2.1 Air Density, Temperature and Pressure
  2.2 Humidity
  2.3 Rain
  2.4 Dew or Condensation of the Humidity
  2.5 Pollution
  2.6 Wind
  2.7 Lightning

3 SYSTEM UNDER STUDY FPL
  3.1 Description of the FPL distribution system
  3.2 Interruption Data from FPL
  3.3 Meteorological Weather Data from NCDC
  3.4 Weather Parameters of interest

4 CORRELATION OF WEATHER AND INTERRUPTIONS
  4.1 Importance of Statistical Tools
  4.2 Probabilistic Characteristics of Data Distributions
  4.3 Correlation Analysis between Weather Parameters and Interruptions
    4.3.1 Impact of Temperature on N
    4.3.2 Impact of Wind on N
    4.3.3 Impact of Rain on N
    4.3.4 Effect of Rain and Wind Together on N

5 PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS
  5.1 Introduction to Artificial Neural Networks
    5.1.1 Benefits of ANNs over statistical methods
    5.1.2 Architecture of ANN
    5.1.3 Functioning of ANN
    5.1.4 Back Propagation Learning Rule
  5.2 Steps to Enhance the performance of ANN
  5.3 ANN Simulation Output
    5.3.1 Detailed Observation
    5.3.2 Dominant Weather Parameters Preliminary Observations
  5.4 Analysis of ANN Simulation Output
  5.5 Pitfalls and Suggestions to FPL
    5.5.1 Weather Data
    5.5.2 Interruption Data
  5.6 Proving Localization of Weather Improves the Accuracy in Prediction
    Case 1: Localized Weather Data
    Case 2: Scattered Weather Data
  5.7 Comparison of Statistical Model and ANN Model
  5.8 Possible Software Development to Predict Power Interruptions Using ANNs

6 LIMITATIONS, CONCLUSIONS AND FUTURE WORK
  6.1 Limitations of Approach
    6.1.1 Weather Data
    6.1.2 Unknown Variables
    6.1.3 Outliers
    6.1.4 Hourly Data
  6.2 Conclusions
  6.3 Future Work
    6.3.1 Data Collection and Creating New Variables
    6.3.2 Improving the Accuracy and Developing New ANN Models

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table
2-1 Type of Contaminant and Atmospheric Conditions at the Time of Contamination Flashover (UHV Project)
3-1 FPL Power Sales by Sectors
3-2 FPL Distribution Management Areas along with Their Dispatch Centers
3-3 FPL Cause Codes (102) Table
4-1 The Frequency of Interruptions due to Tree Limbs (Cause Codes 20 & 21)
4-2 Prediction of N Using Maximum Temperature of All the MAs
5-1 Summary Table of Covariance for All the Input Variables Considered in the Principal Component Analysis
5-2 Performance Comparison Between Statistical Model and ANN Model

LIST OF FIGURES

Figure
2-1 Regions with strong contamination (UHV project)
2-2 Swing angle as a function of instantaneous wind speed at tower
2-3 Vegetation effects on power interruptions
2-4 Number of days with thunderstorms in Florida (US Weather Bureau)
2-5 Cumulative frequency distribution of peak current amplitudes in downward negative flashes
3-1 Snap shot of Florida map
3-2 FPL's historical SAIFI performance
3-3 Frequency charts of interruptions and customers affected by interruptions
4-1 N for all management areas from 1998 to 2001 using the previous filters
4-2 Rain (inches) for all management areas from 1998 to 2001
4-3 Wind 2-minute maximum speed (mph) for all management areas from 1998 to 2001
4-4 Variation of average N due to transformer failures vs. maximum temperature
4-5 Variation of average N due to transformer failures vs. maximum temperature (averaged per month per year)
4-6 Variation of average N due to transformer failures vs. max temperature (averaged per month)
4-7 Variation of N vs. wind
4-8 Mean of 2 minutes wind speed vs. average number of interruptions
4-9 Variation of N vs. rain
4-10 Variation of N vs. rain in the interval [0 2]
4-11 Impact of rain and wind together on N
5-1 ANN structures
5-2 A back propagation ANN model
5-3 Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year 2002
5-4 Predicted N and the actual N for a few of the cases in North Dade MA for 2002
5-5 Numerical values of weather and interruption data under consideration
5-6 Histogram plot of predicted interruptions
5-7 Prediction results of ANN using the original N (not shift adjusted)
5-8 Prediction results of ANN using the adjusted N (shift adjusted)
5-9 Mean and standard deviation of actual N
5-10 Mean and standard deviation of adjusted N
5-11 Mean squared error vs. training epochs
5-12 Graphical user interface developed to predict interruptions
5-13 Predicted interruptions vs. actual interruptions for Central Dade
6-1 Average precipitation difference

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER

By Roop Kishore R. Matavalam

August 2004

Chair: Alexander Domijan, Jr.
Major Department: Electrical and Computer Engineering

The system principles that are used to design and maintain electric power distribution grids (distinct from transmission grids) are intended to minimize the number and duration of power disturbances, including interruptions. Many of these principles, such as load flow and load prediction, are well understood and have been refined over many years. However, the impact of local weather conditions on power distribution grids has not been well researched.
The current thesis is intended to improve our understanding of the effects of weather on power distribution systems and to develop tools for the prediction of weather-related interruptions. Developing and disseminating this information will allow electric power engineers to ultimately improve our nation's power distribution capabilities. The current research also presents the novel concept of predicting the number of power interruptions in a distribution system using weather parameters. Preliminary results show that it is possible to define and build a model, using artificial neural networks (ANNs), which can use weather parameters as inputs and predict the number of interruptions with reasonable accuracy. The accuracy of the prediction depends, in part, on the accuracy of the weather data that are used in the model and, in part, on the precision of the model. It is expected that the use of real-time surface weather data, such as can be collected from well-sited weather stations, will eliminate the uncertainty inherent in weather data collected from geographically distant sources.

CHAPTER 1 INTRODUCTION

For any power company, providing reliable electric service is the number-one priority. Unfortunately, power interruptions are sometimes simply unavoidable. Weather contributes significantly to these interruptions, as this thesis will show. The terms "power outage" and "power interruption" are often used interchangeably, even by many people in the power industry, to mean a loss of power supply. There is, however, a fine difference between the two terms, as stated by the IEEE 1366 standard [1]:

An outage is defined in IEEE 1366-2001 as: The state of a component when it is not available to perform its intended function due to some event directly associated with that component. Notes: (1) An outage may or may not cause an interruption of service to customers, depending on system configuration.
(2) This definition derives from transmission and distribution applications and does not apply to generation outages.

An interruption is defined in IEEE 1366-2001 as: The loss of service to one or more customers connected to the distribution portion of the system. Note: It is the result of one or more component outages, depending on system configuration.

FPL standards define the loss of power supply in two ways, based on the duration of the power disturbance to the customer:

Momentary Interruption: Single operation (Open-Close) of an interrupting device which results in zero voltage for a period of time of 59 seconds or less.

Sustained Interruption: Power loss to the customer that has lasted at least one minute.

Note: In the current thesis, "interruptions" means sustained interruptions, and N represents the total number of sustained interruptions per day.

1.1 Importance of Power Reliability

Power interruptions cause huge financial losses to our economy, in addition to a loss of customer satisfaction, so power reliability is one of the most important concerns for electric utilities. Power reliability modeling and indexing are among the tools used by utilities to manage costs and monitor equipment performance, and improvements in the flexibility of reliability assessment models will ultimately result in increased savings. According to Contingency Planning Research Company's annual study [2], downtime caused by power disturbances results in major financial losses, as shown in Figure 1-1.

Figure 1-1. Average hourly impact of downtime and data loss by business sector (vertical axis in dollars, from $10 to $10,000,000)

Traditionally, the reliability of an electric distribution system is measured in terms of the total number of power interruptions that occurred. Currently there are a variety of indices available to measure the reliability of any power utility.
These indices serve as a yardstick within a utility to see how well it is doing each year, and as a tool for comparing its performance with that of other utilities of different sizes across the country.

1.2 Understanding Power Reliability Indices

Among the many power distribution reliability indices available [3], the following are some of the most widely used customer-based indices. In the equations below, N_i is the number of customers interrupted by sustained interruption event i, r_i is the restoration time for that event, and N_T is the total number of customers served.

SAIFI: System average interruption frequency index (sustained interruptions). This index is designed to give information about the average number of sustained interruptions per customer over a predefined area. In words, the definition is

SAIFI = (Total number of customer interruptions) / (Total number of customers served)

Mathematically it is represented as

$$\mathrm{SAIFI} = \frac{\sum N_i}{N_T}$$

SAIDI: System average interruption duration index. This index is commonly referred to as customer minutes of interruption or customer hours, and is designed to provide information about the average time the customers are interrupted. In words, the definition is

SAIDI = (Sum of customer interruption durations) / (Total number of customers served)

Mathematically it is represented as

$$\mathrm{SAIDI} = \frac{\sum r_i N_i}{N_T}$$

CAIDI: Customer average interruption duration index. CAIDI represents the average time required to restore service to the average customer per sustained interruption. In words, the definition is

CAIDI = (Sum of customer interruption durations) / (Total number of customer interruptions)

Mathematically it is written as

$$\mathrm{CAIDI} = \frac{\sum r_i N_i}{\sum N_i} = \frac{\mathrm{SAIDI}}{\mathrm{SAIFI}}$$

ASAI: Average service availability index. This index represents the fraction of time (often expressed as a percentage) that a customer has power provided during one year or the defined reporting period. In words, the definition is

ASAI = (Customer hours of service availability) / (Customer hours of service demand)

To calculate the index, the following mathematical equation is used, with r_i expressed in hours:

$$\mathrm{ASAI} = \frac{N_T \times (\text{No. of hours/year}) - \sum r_i N_i}{N_T \times (\text{No. of hours/year})}$$

Note that there are 8760 hours in a regular year and 8784 in a leap year.
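The four definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Event` record, the function name `reliability_indices`, and the sample numbers are assumptions made for the example and do not come from the FPL data. Restoration times are assumed to be in minutes, so they are converted to hours for ASAI.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760  # regular year; pass 8784 for a leap year


@dataclass
class Event:
    """One sustained interruption event (hypothetical record layout)."""
    customers: int   # N_i: customers interrupted by this event
    minutes: float   # r_i: restoration time in minutes


def reliability_indices(events, customers_served, hours=HOURS_PER_YEAR):
    """Compute SAIFI, SAIDI (minutes), CAIDI (minutes) and ASAI."""
    customer_interruptions = sum(e.customers for e in events)          # sum of N_i
    customer_minutes = sum(e.customers * e.minutes for e in events)    # sum of r_i * N_i

    saifi = customer_interruptions / customers_served
    saidi = customer_minutes / customers_served
    # CAIDI is undefined when there were no interruptions; report 0.0 in that case.
    caidi = customer_minutes / customer_interruptions if customer_interruptions else 0.0

    demand_hours = customers_served * hours
    asai = (demand_hours - customer_minutes / 60.0) / demand_hours
    return {"SAIFI": saifi, "SAIDI": saidi, "CAIDI": caidi, "ASAI": asai}


# Made-up example: two events on a system serving 1000 customers.
example = reliability_indices([Event(100, 90.0), Event(50, 30.0)], customers_served=1000)
print(example)
```

In this made-up example SAIFI is 0.15 interruptions per customer, SAIDI is 10.5 minutes, and CAIDI is 70 minutes, which equals SAIDI divided by SAIFI as the CAIDI equation requires.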
Some of the other customer-based reliability indices in use are CTAIDI, ASAI, etc. Load-based sustained indices include ASIFI, ASIDI, etc. Momentary indices include CEMSMI_n, AMAIFI and MAIFI_E. The newest indices are CEMI_n (customers experiencing multiple interruptions) and CEMSMI_n (customers experiencing multiple sustained interruptions and momentary interruption events). The use of CEMI_n as a basic performance measure is under consideration [3] by many states in the USA. In Florida, the use of the CEMI_5 index is under consideration. If this value is exceeded, the commission is considering fines that would be paid to the customers who experienced poor performance. A snapshot of the percentage usage of different reliability indices, from the IEEE working group [1] on system design, is shown in Figure 1-2. Surveys showed that the most commonly used indices are SAIFI, SAIDI, CAIDI, and ASAI.

Figure 1-2. Percentage of companies using a given index reporting in 1990 (indices shown: SAIFI, SAIDI, CAIDI, ASAI, ASIFI, ASIDI, MAIFI, CTAIDI, CAIFI, and other)

1.3 Purpose and Importance of this Thesis

The indices discussed above make it clear that the total number of interruptions must be reduced in order to improve the reliability of a distribution system. Although many parameters and conditions are responsible for these interruptions, weather remains a major contributor. Research carried out by the Electric Power Research Institute [4] showed the effects of weather components such as lightning, rain and wind on transmission lines (345 kV and above). However, little comparable work has been done for distribution lines, which drew our attention to this new direction of research. The purpose of the current thesis work was to study the impacts of normal weather conditions on distribution power interruptions and to develop correlation models.
The study also examines how this correlation knowledge can be used, by incorporating artificial neural networks (ANNs), to reduce power interruptions. Using weather and interruption data, novel prediction tools and modeling methods were developed, and a similar approach can be applied to any electric utility in the country. With the tools developed in this thesis, any power utility would be able to estimate the approximate total number of interruptions that might happen in the future due to weather. The accuracy of the prediction of interruptions depends, in part, on the accuracy of the forecast weather data that serve as input to the model and, in part, on the accuracy of the model. It is expected that the use of real-time surface weather data, such as can be collected from well-sited weather stations, will eliminate the uncertainty inherent in weather data collected from geographically distant sources.

Despite a thorough search of the available literature, examples of the use of surface weather data in the construction of power reliability models have not been found. It is expected that this project will contribute significantly to the existing literature by providing predictive models, as well as background in a previously unexplored area. Moreover, this research is valuable for the exploration of proper ANN structure, internal parameters and feature pattern extraction methods for application to power systems.

1.4 Organization of Thesis

The current thesis comprises six chapters. The current chapter gives the motivation, the literature survey and the importance of the project. The second chapter explains the behavior of the weather conditions prevalent in the state of Florida and gives a broad overview of the different weather parameters. The third chapter deals with the Florida Power & Light (FPL) system and with the weather and interruption data considered in the analysis.
The fourth chapter presents the approach followed in developing correlation models between weather parameters and power interruptions using statistical tools. The fifth chapter presents a novel model using neural networks that can be used to predict power interruptions based on forecast weather parameters. The last chapter discusses the limitations of the research results and the conclusions of the current thesis, followed by future work.

CHAPTER 2 UNDERSTANDING FLORIDA WEATHER

Weather in Florida varies widely. This chapter describes the weather parameters that are most common in Florida and explains their impact on power distribution lines. Florida is also known as the lightning capital of the world. The natural weather phenomena that can reduce the strength [4] of insulators in the state of Florida are air density, temperature, pressure, humidity, rain, dew or condensation of humidity, pollution, wind and lightning.

2.1 Air Density, Temperature and Pressure

The temperature in the state of Florida varies between 20°F and 100°F, and the pressure does not change significantly. Absent abnormal conditions such as fire, the resulting influence on the strength of the insulators is around 5%.

2.2 Humidity

The humidity in Florida changes significantly throughout the year and throughout the day. Usually, the humidity is very high between midnight and 6 or 7 A.M., after which it starts to decrease. It starts to increase again at night. Humidity without condensation can affect the strength of the insulator by around 16%.

2.3 Rain

Rain can be classified as weak rain (mist or drizzle) or strong rain (rainfall). The influence of rain on insulator strength varies and depends on its intensity and direction. The maximum influence of rain [4] on clean insulators is around 30%.
Mist or drizzle can have a stronger impact on flashover when it combines with pollution on the insulator strings.

2.4 Dew or Condensation of the Humidity

When the temperature decreases to the dew point, condensation starts taking place on the surface of the insulators and a thin layer of water appears. Condensation can have the same effect on an insulator as mist.

2.5 Pollution

Pollution can reduce the strength of an insulator. Its influence on flashover depends on the type of contamination and its concentration on the surface of the insulators. We are concerned here with two types of contamination: spot contamination and area contamination. The distribution of contamination in Florida [4] is shown in Figure 2-1. Black dots refer to spot contamination and shaded regions refer to area contamination.

Figure 2-1. Regions with strong contamination (UHV project)

Table 2-1 shows the types of contamination that cause flashover. In dry conditions most of them are not good conductors; in wet conditions, however, condensation, drizzle, mist or rain increases their conductivity substantially.

Table 2-1. Type of Contaminant and Atmospheric Conditions at the Time of Contamination Flashover (UHV Project)

Source table: TABLE 10.3.5 TYPE OF CONTAMINANT, WEATHER, AND ATMOSPHERIC CONDITIONS AT TIME OF CONTAMINATION FLASHOVER

Column order: Fog, Dew, Drizzle/Mist, Ice, Rain, No Wind, High Wind, Wet Snow, Fair. Entries are listed in that order with blank cells omitted.

Sea salt: 14, 11, 22, 1, 12, 3, 12, 3
Cement: 12, 10, 16, 2, 11, 4, 1, 4
Fertilizer: 7, 5, 8, 1, 1, 4
Fly ash: 11, 6, 19, 1, 6, 3, 1, 3, 1
Road salt: 8, 2, 6, 4, 2, 6
Potash: 3, 3
Cooling tower: 2, 2, 2, 2
Chemicals: 9, 5, 7, 1, 1, 1, 1
Gypsum: 2, 1, 2, 2, 2
Mixed contamination: 32, 19, 37, 13, 1, 1
Limestone: 2, 1, 2, 4, 2, 2
Phosphate and sulfate: 4, 1, 4, 3
Paint: 1, 1, 1
Paper mill: 2, 2, 4, 2, 1
Dried milk: 1, 1, 1, 1, 1
Acid exhaust: 2, 3
Bird droppings: 2, 2, 3, 1, 2, 2
Zinc industry: 2, 1, 2, 1, 1
Carbon: 5, 4, 5, 4, 3, 3
Soap: 2, 2, 1, 1
Steel works: 6, 5, 3, 2, 2, 1
Carbide residue: 2, 1, 1, 1, 1
Sulphur: 3, 2, 2, 1, 1
Copper and nickel salt: 2, 2, 2, 2, 1
Wood fiber: 1, 1, 1, 1
Bulldozing dust: 2, 1, 1
Aluminum plant: 2, 2, 1, 1
Sodium plant: 1, 1
Active dump: 1, 1, 1
Rock crusher: 3, 3, 5, 1
Total: 146, 93, 166, 8, 68, 26, 19, 37, 4
Percent weather: 25.75, 16.4, 29.3, 1.4, 12, 4.58, 3.36, 6.52, 0.71

Pollution combined with condensation and rain can be considered the worst condition for the reduction of insulator strength. Moreover, fog, dew, and drizzle or mist are the most important weather components at the time of flashover, accounting for 72% of the cases [4]. Rain has two opposing effects. On one hand, it reduces the withstand values of the insulators. On the other hand, it cleans the surface of the insulators, thereby protecting the system from new flashovers due to dew (condensation), mist and pollution.

2.6 Wind

Special weather conditions, such as storms, thunderstorms, hurricanes and tornados, have a direct effect on power interruptions. However, these events are a combination of rain, wind and lightning. Therefore, each of these weather components must be analyzed individually in order to study the real cause behind a flashover. The influence of wind speed on swing angle, and therefore on the minimum clearance required to avoid possible flashover, can be evaluated. Wind can cause catastrophic mechanical damage through asynchronous movement of the cables and/or insulators.
These damages are more significant in transmission lines, where the span is larger. Figure 2-2 shows the swing angle of a single conductor vs. wind speed [4].

Figure 2-2. Swing angle as a function of instantaneous wind speed at tower (horizontal axis: wind speed in meters per second)

Distribution lines have short spans, so asynchronous movement of the conductors usually causes insignificant disturbance, except at high wind speeds. The most significant wind-related disturbance comes from the movement of trees and their branches. Untrimmed trees can touch the lines and cause a flashover, leading to an outage. Figure 2-3 shows uncut trees and branches touching distribution lines in Gainesville, Florida.

Figure 2-3. Vegetation effects on power interruptions (panels a to d)

2.7 Lightning

The number of days with thunderstorms in Florida is between 80 and 100 per year, as shown in Figure 2-4 (isokeraunic level). As Figure 2-4 shows, Florida is a state strongly affected by atmospheric discharges. The number of strokes to earth per square mile per year can be found through the expression

$$N = 0.25\,I$$

where I is the local isokeraunic level. If a line has a shadow width of W, the number of lightning strokes expected to hit it per year is

$$N_L = \frac{0.25\,I\,L\,W}{5280}, \qquad W = b + 4h$$

where L is the length of the line in miles, h is the height of the shield (ground) wires and b is the distance between shield wires, with W, b and h in feet. If the line does not have shield wires, h is the height of the conductors and b is the distance between the external conductors. For strokes on transmission or distribution lines with shield wires, a back flashover is expected. The voltage across the insulator string in this case depends on the tower foot resistance, the current through the tower and the coefficient of coupling between the shield wire and the phase conductor.

Figure 2-4. Number of days with thunderstorms in Florida (US Weather Bureau)

Thus, flashovers or interruptions due to lightning depend on the tower foot resistance and also on the intensity of the current. It is not possible to control the current, so all the control must be done through the tower foot resistance. Distribution lines without shield wires are directly affected by lightning. The voltage across the string or insulator depends on the intensity of the current and the magnitude of the surge impedance of the line. Figure 2-5 shows the crest amplitude of the strokes against the probability of occurrence. The probability that a stroke has a current of 5 kA or more is almost 1. This current divides in two while passing through the conductor, so 2.5 kA flows in each direction. Thus, for distribution lines with a surge impedance between 150 Ω and 250 Ω, the voltage across the insulators or string will be roughly between 375 kV and 625 kV. In most cases, therefore, distribution lines up to 69 kV (BIL up to 350 kV) will flash over for practically every stroke that hits them.

Figure 2-5. Cumulative frequency distribution of peak current amplitudes in downward negative flashes (horizontal axis: crest current in kA)

CHAPTER 3 SYSTEM UNDER STUDY FLORIDA POWER & LIGHT

The preliminary results cited in this thesis were obtained while working on a project sponsored by Florida Power & Light (FPL). The concept, however, can be applied to any system and can be enhanced as explained in the current thesis. FPL's power distribution interruption data were used to study the effects of weather on the occurrence of interruption patterns at FPL.

3.1 Description of the FPL distribution system

FPL is among the largest and fastest growing electric utilities in the United States. As of December 2002, FPL had 9,612 employees serving nearly 8 million people, or about half the state of Florida.
Power is delivered (Table 3-1) safely and reliably from 86 generating units with a total generation capability of 26,203 MW, through more than 500 substations and more than 69,000 miles of transmission and distribution lines [5].

Table 3-1. FPL Power Sales by Sector

Sector         Number of Accounts    Share of Total Sales (kWh)
Residential    3,521,146             50.9%
Commercial     430,471               40.6%
Industrial     15,248                4.4%
Other**        2,746                 4.1%

* Monthly average as of December 2001
** Includes public authorities, railway, wholesale and interchange.

The FPL distribution system is divided into 16 management areas grouped under two regions, urban and suburban (Table 3-2). Figure 3-1 shows the location and area covered by each of the FPL distribution management areas (MAs).

Table 3-2. FPL Distribution Management Areas along with Their Dispatch Centers

Dispatch center                    Management areas
West Palm Beach Dispatch Center    TC Treasure Coast; WB West Palm; BR Boca Raton
South Florida Dispatch Center      PM Pompano; WG Wingate; GS Gulfstream; ND North Dade; WD West Dade; CE Central; SD South Dade
Daytona Dispatch Center            NF North Florida; CF Central Florida; BV Brevard
Sarasota Dispatch Center           MS Manasota; TB Toledo Blade; GC Gulf Coast

Figure 3-1. Snapshot of Florida map. (a) FPL distribution management areas. (b) Weather stations chosen within the FPL area

3.2 Interruption Data from FPL

Power interruption data was obtained primarily from Florida Power & Light (FPL). FPL has divided its service territory into 16 sections called Management Areas (MAs). An interruption data file was created for each of these 16 MAs. Interruption data was made available to us in a data storage program known as "PowerPlay." For clarity, interruptions are classified into groups (Figure 3-4) based on the type of cause responsible for them; for example, interruptions caused by squirrels are recorded under the squirrel cause code (007).
Basically, a cause code indicates the principal cause of an interruption. With this facility, we are able to see the interruptions that occurred due to any specific cause, as shown in Table 3-3.

Table 3-3. FPL Cause Codes
(102) Table, REV 6/16/98

SC CAUSE CODES (Required for all interruptions)

Natural Causes
  001E  Lightning, with equip. damage
  002   Storm w/no equip. damage
  003E  Fire
  004E  Salt Spray Corrosion
  007   Squirrel
  009   Bird
  011   Other Animal
  013   Tornado
  014   Hurricane
  015   Ice on Lines
  020   Tree/Limb Preventable
  021   Tree/Limb Unpreventable
  023E  Decay/Deterioration
  024E  Corrosion (Non Salt Spray)
  025   Vines/Grass
  026E  Contamination (Non S/S)
  187E  Equipt. Failed, Cause Unk
  190   Unknown

Other Causes
  170   Wrong size fuse
  171   Overloaded Device
  178   Nonstandard Construction
  183   Improper Installation
  191E  Vandalism
  193   Customer Request
  195   Crew Request (Planned)
  196   Slack Conductors
  197   Other (explain)
  202E  Loose Connection

Accidental Causes
  040   Vehicle
  041   Accidental Contact
  046   Switching Error
  079   Dig-In (Proper Depth)

Support and Follow-Up Codes (codes to be used as support or follow-up only)
        No Animal Guard
  022   Palm Tree
  050   Foreign Crew or Customer
  066   FPL Crew
  067   FPL Distribution Contractor
  068   FPL Line Clearing Contractor
  069   Transmission Contractor
  075   Improper Depth
  100   Inadequate/No Ground
  192   Crew Request (Forced Outage)
  199   Defective Material UPR
  222   Power Temp Used
  240   Injection Elbow (Installed)
  241   Injection Elbow (Not Installed)
  242   Flow (Positive)
  243   Flow (None)
  244   Injection Comp
  245   Injection Job Pndng
  250   Cable Replace Job Pending
  251   Cable Replace Job Comp
  260   Fault Locator Used
  265   Cleared by Phone
  271   Injected (8/96 on)
  272   Replaced (8/96 on)
  299   Data Corrected
  999   Named Storm Exclusions

Note: The suffix "E" denotes that an Equipment Code is required.
Equipment Codes

Overhead
  080   Down Guy or Anchor
  081   Pole
  082   Cross Arm
  083   Insulator
  084   Pole Top Pin
  087   Tie Wire
  088   Jumper
  089   Stirrup
  090   Hot Line Clamp
  092   Disconnect Switch
  093   Fuse Switch
  096   Line OCR
  097   Line Capacitor
  098   Line Regulator
  104   Conductor Down
  105   Conductor Damaged

Underground
  110   Terminator
  111   Cable
  113   Elbow
  114   Tx Fuse Switch
  115   Tx Blade Switch
  116   Bayonet Switch
  121   Padmount Switch
  122   Oil Fuse Cutout
  123   RA Switch
  124   Mech for Throwover Sw.
  125   PT Fuse
  126   ConductCKT Fuse
  127   Control Cable
  132   Handhole
  134   Bushing
  135   Pothead
  211   Injected Cable

Overhead or Underground
  085   Arrester
  091   Connector
  094   Transformer
  095   Step Down Transformer
  102   Other Equipment
  103   Splice
  106   Automated Switch (DA)

Meter
  160   Meter
  161   Meter Blocks, Repairable
  162   CT's
  163   PT's
  164   Other Meter Equipment
  165   Meter Blocks, Not Repairable

Substation
  140   OCB (Feeder Breaker)
  141   Regulator
  142   Reactor
  143   Relay
  147   Step Down Transformer
  148   Other Substation Equip.
  150   SCADA
  151   Telecommunications

  200   Transmission related

Weather Related Codes
                          EQUIP DAMAGE    NO EQUIP DAMAGE
  LIGHTNING PRESENT       001E            002
  NO LIGHTNING PRESENT    187E            190

The interruption data are daily totals broken down by cause code. For example, a specific substation may experience three interruptions on September 29, 1998, due to code 093 (fuse switch). Each data point is small, but the compilation of many years and many areas provides a statistically significant sample. The interruptions of interest to us were further defined by the following filters:
* With Exclusions: This filter suppresses interruption data that is defined as exclusionary by FPL, including hurricane and tornado damage. We use this filter because we are interested in the effects of normal weather conditions.
* Overhead: This filter includes only those interruptions that are caused by faults in overhead equipment or lines.
Underground lines were considered immune to most weather conditions.
* Internal Distribution: Only interruptions located in the distribution system were taken into account.
* Primary: Only primary systems (feeders, laterals and oil circuit breakers) are within the scope of this research.
* Substation: Each substation reports all the interruptions occurring in the secondary distribution system it supplies.
* Cause code: FPL uses numeric cause codes to specify the causes of interruption. General categories include natural causes, equipment and accident.
* Dates: All relevant days from January 1, 1998 through December 31, 2001 were considered. Up-to-date interruption data can be obtained from FPL on request.

Assumptions about the FPL system: An important consideration in choosing FPL is that they have assured us that we can assume their equipment is homogeneous throughout their area of operations (AO). Homogeneity of equipment is a necessary condition for statistically significant results.

Scope of the current thesis: As FPL personnel have already carried out a correlation analysis between lightning strikes and power interruptions, they are interested in the indirect effects of weather, including wind, temperature and rain, on the total number of power interruptions. The scope of the current research work is therefore limited to these parameters.

Although interruptions represent only 3% to 5% [6-8] of all disturbances, a common method for measuring the reliability of an electric distribution system is based on the number of customers interrupted, which is proportional to the number of interruptions, as explained in Chapter 1. Let us revisit the definition of SAIFI, the reliability index that FPL uses most often.
IEEE Standard 1366 defines the System Average Interruption Frequency Index (SAIFI) with the following formula:

$$\mathrm{SAIFI} = \frac{\sum_i N_i\,C_i}{C_b}$$

where
$N_i$ = number of interruptions (sustained interruptions lasting over 1 minute)
$C_i$ = customers interrupted for each interruption
$C_b$ = customer base, or customers served

[Plot: FPL SAIFI from D95 through D01; x-axis: years (J = January, D = December)]
Figure 3-2. FPL's historical SAIFI performance

SAIFI indicates how often the average customer experiences a sustained interruption (> 1 min) over a predetermined period of time, and it has special importance in decision making for engineers working in distribution reliability. A typical breakdown, by significance, of the major causes of customer interruptions and of the number of interruptions in the FPL distribution system over a period of 12 months is shown in Figure 3-3.

[Charts: (a) Customers Interrupted, 2001, by major cause; (b) Number of Interruptions, 2001, by major cause; each with a count axis and a percentage axis]
Figure 3-3. Frequency charts of interruptions and customers affected by interruptions. (a) Number of customers interrupted vs. causes. (b) Number of interruptions

These graphs show the relative importance of the direct effects of weather (storms and lightning) on the interruptions, but not the indirect effects of temperature, rain, wind and so on. From these charts it is not possible to determine whether interruptions associated with, for example, vegetation or equipment are indirectly affected by weather conditions such as temperature, rain or wind.
The current thesis shows that normal weather conditions do have effects on interruptions and that those effects can be quantified. The benefits of this type of study are the ability to explain trends in SAIFI due to weather conditions and an aid in developing indicators that could be used to anticipate interruptions.

3.3 Meteorological Weather Data from NCDC

We collected daily average weather data for rain, temperature, wind speed and other parameters from Automated Surface Observation Stations (ASOSs) located at airports throughout the area of operations (AO). Construction of these stations began in 1981 as an aid to air navigation, and they have since become the most comprehensive source of weather data in the United States. For the stations we are interested in, we use data from 1996 through 2002. As we are an educational institution, the National Climatic Data Center (NCDC), a department of the National Oceanic and Atmospheric Administration [9], makes this data available to us free of charge.

The greatest difficulty in collecting these data is their sheer volume. Six years of data from one ASOS generates a file containing 24 columns and 2190 lines. Stacking the files from all the ASOSs in the AO generates a composite file with more than 20,000 lines. To add to this problem, there are missing days, missing data points and formatting that cannot be imported by the statistical program of our choice. To address this, we wrote programs in C/C++ that correct the omissions and convert the NCDC files to a generic text file that can be imported by any commercial spreadsheet program presently in use. A final editing of the data was done by brushing (taking out) those data points that do not make sense: points with zero barometric sea-level pressure, zero average temperature, a two-minute maximum wind gust of 100 mph, and so on.
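The brushing step described above can be sketched in a few lines. This is only an illustration in Python, not the C/C++ programs written for the project; the field names `sea_pressure`, `avg_temp` and `gust_2min` are hypothetical and do not correspond to actual NCDC column labels, and treating a gust of 100 mph or more as implausible is an assumption about how the rule was applied.

```python
def is_plausible(record):
    """Return True if a daily weather record passes the brushing rules."""
    # Field names are hypothetical placeholders, not NCDC column labels.
    if record["sea_pressure"] == 0:   # zero barometric sea-level pressure
        return False
    if record["avg_temp"] == 0:       # zero average temperature (deg F)
        return False
    if record["gust_2min"] >= 100:    # assumed reading of the 100 mph rule
        return False
    return True


def brush(records):
    """Keep only the records that pass every plausibility rule."""
    return [r for r in records if is_plausible(r)]


# Made-up sample records, for illustration only.
sample = [
    {"sea_pressure": 30.02, "avg_temp": 78, "gust_2min": 14},
    {"sea_pressure": 0.0, "avg_temp": 80, "gust_2min": 12},
    {"sea_pressure": 29.95, "avg_temp": 0, "gust_2min": 9},
    {"sea_pressure": 29.90, "avg_temp": 75, "gust_2min": 100},
]
cleaned = brush(sample)
print(len(cleaned))
```

Keeping each rule as a separate test makes it easy to add or relax a rule when a different station or climate is studied.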
Since we anticipate the use of ASOS data by any power engineer using our methods, file conversion is required by our objective.

3.4 Weather Parameters of Interest

Although many weather parameters are available in the weather files downloaded from the NCDC website, we used only those that are of interest. A weather data file was created for each of the 15 ASOSs (one particular ASOS covered two regions). The following daily weather parameters were downloaded from the NCDC database:
* Average temperature
* Maximum temperature
* Minimum temperature
* Average dew point
* Significant observations
* Total rainfall
* Barometric pressure (sea level and station)
* Average wind speed
* Two-minute maximum sustained wind gust
* Five-second maximum sustained wind gust

Weather is a complex combination of many parameters including, but not limited to, wind, lightning, condensation, temperature, rain, pressure, humidity, cosmic dust, solar storms, hurricanes and storms, and the list goes on if all the meteorological terms are included, some of which we are not even aware of. If only the daily prevailing weather conditions are considered, however, many parameters can be set aside under the category of extreme weather conditions, that is, conditions that are not a common daily weather parameter, such as hurricanes, storms and lightning. The major focus was therefore given to weather parameters such as wind, temperature, rain, pressure and humidity, which are not extreme weather conditions. Among these common weather parameters, only wind, rain and temperature are investigated, because of their dominant role [10] in power distribution interruptions.

CHAPTER 4
CORRELATION OF WEATHER AND INTERRUPTIONS

Consequences of power interruptions can range from mild inconvenience, such as missing a favorite TV show, through severe inconvenience, such as losing critical data, to life-threatening situations, such as the failure of traffic signals.
Less obvious consequences include increased cost to the customer due to increased maintenance and repair costs for the provider. Because of these consequences, power engineers are always researching methods to reduce the number of interruptions. The first step in reducing interruptions is to define the causes and quantify their effects. Accidents, human error and aging equipment contribute a great deal, but weather is still the largest single cause, although its effects are not as well understood as we would like to think.

We can all agree that adverse weather conditions cause power interruptions. The evidence is apparent. When a bad thunderstorm hits or a hurricane arrives, many people experience power interruptions, and those who do not hear about it on the news. Lightning strikes, especially common in Florida, create transients that overload transformers and trip fused circuit breakers, both conditions requiring a repair crew to restore power. High winds blow down trees, damaging conductors.

Less apparent is the effect of normal weather on the frequency of power interruptions. Several days of moderate rain can saturate the ground, invading buried lines and causing short circuits (FPL study). An unexpectedly warm season can promote vegetation growth, causing interruptions due to contact between tree limbs and conductors. These and other effects of normal weather are not easily defined because there is no one-to-one relationship such as "lightning hit the line so the transformer blew." In fact, many preventable interruptions occur that are not properly attributed to weather because of that lack of a one-to-one relationship.

4.1 Importance of Statistical Tools

Part of the interest of this project is to find the relationship between the number of interruptions and normal weather conditions. Both interruptions and weather conditions in the future are random.
To obtain a complete prediction of the number of interruptions in the future, we would need to predict future weather conditions and then predict the number of interruptions based on the predicted weather conditions. However, because of limited resources and the difficulty of weather prediction, we carry out a conditional prediction of the number of interruptions, assuming that the weather condition is known. In this subsection, we describe the probabilistic characteristics of daily interruption frequencies and of the sums of daily interruption frequencies, i.e., monthly interruptions or sums of interruptions when it rains and when it does not. An explanation of plausible statistical data analysis techniques for each case then follows.

The daily outage frequencies take only nonnegative integer values and are strongly skewed to the right. For example, the daily interruptions caused by tree limbs (cause codes 020 and 021) range from 0 to 58, but 99.5% of the frequencies are less than or equal to 5 (Table 4-1). Therefore, statistical techniques based on the normal distribution, such as the t-test, normal linear regression and ANOVA, introduce large biases in the confidence intervals of estimates and lead to wrong conclusions in the search for significant weather effects. As a reminder, the normal distribution is symmetric and has the range $(-\infty, +\infty)$.

Table 4-1. The Frequency of Interruptions due to Tree Limbs (Cause Codes 020 & 021)

Number of outages    Frequency    Percent    Cumulative percent
0                    15890        81.16      81.16
1                    2582         13.19      94.35
2                    632          3.23       97.57
3                    223          1.14       98.71
4                    102          0.52       99.23
5                    53           0.27       99.50
6                    38           0.19       99.70
>6                   59           0.30       100.00

The nature of the weather data sets is first evaluated to understand the behavioral patterns of the weather parameters. The parameters of interest are wind, temperature and rain.
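The percentages in the tree-limb frequency table can be rechecked directly from the raw counts. The short Python sketch below recomputes the percent and cumulative-percent columns; the counts are copied from the table, while the variable names are chosen here purely for illustration.

```python
# Daily tree-limb interruption counts (cause codes 020 and 021), from the table.
counts = {"0": 15890, "1": 2582, "2": 632, "3": 223,
          "4": 102, "5": 53, "6": 38, ">6": 59}

total = sum(counts.values())          # number of day/area observations

percent = {}
cumulative = {}
running = 0
for outages, freq in counts.items():  # dicts preserve insertion order
    running += freq
    percent[outages] = 100.0 * freq / total
    cumulative[outages] = 100.0 * running / total

for outages in counts:
    print(f"{outages:>3}  {counts[outages]:>6}  "
          f"{percent[outages]:6.2f}  {cumulative[outages]:6.2f}")
```

More than four out of five observations are zero and the tail extends far to the right, which is the pattern that makes normal-theory methods unsuitable for these counts.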
4.2 Probabilistic Characteristics of Data Distributions

As a prelude to presenting results, I will describe the probabilistic characteristics of the data set we are dealing with, in order to determine which statistical data analysis techniques and models should be used to correlate weather parameters with power interruptions. Daily interruption frequencies (for all cause codes) take only nonnegative integer values, from 0 to 200, and are skewed to the right, as can be seen in Figure 4-1. Rain data is even more strongly skewed to the right (Figure 4-2), while the wind speed histogram (Figure 4-3) is close to a normal distribution.

[Histogram: frequency vs. N]
Figure 4-1. N for all management areas from 1998 to 2001 using the previous filters

[Histogram: frequency vs. rain]
Figure 4-2. Rain (inches) for all management areas from 1998 to 2001

[Histogram: frequency vs. 2-minute maximum wind speed]
Figure 4-3. Wind: 2-minute maximum speed (mph) for all management areas from 1998 to 2001

Because the normal distribution is symmetric and a normal random variable is continuous over the range $(-\infty, \infty)$, whereas these data are nonnegative counts skewed to the right, their probabilistic characteristics must be described using the Poisson distribution.

4.3 Correlation Analysis between Weather Parameters and Interruptions

The statistical correlation models between the weather parameters of interest (wind, temperature and rain) and the total daily number of power interruptions (N) were studied.

4.3.1 Impact of Temperature on N

In this section, the impact of daily temperature variations on the power interruptions due to transformer failures is studied. The monthly averages (means) of the maximum temperatures were plotted on the X axis and the monthly means of the total number of interruptions due to transformer failures on the Y axis, for 4 years (1998-2001) and all the MAs. Under these conditions, the total number of monthly data points came to about 567.

[Regression plot of N vs. maximum temperature, with 95% CI (confidence interval) and 95% PI (prediction interval) limits; x-axis: mean of maximum temperature (°F)]

$$Y = 202.803 - 5.09380\,X + 0.0328080\,X^2$$

S = 2.62145, R-Sq = 42.0%, R-Sq(adj) = 41.8%

Figure 4-4. Variation of average N due to transformer failures vs. maximum temperature

It can be observed from Figure 4-4 that the plot rises toward both ends of the X axis. This can be attributed to the heavy load on the transformers caused by maximum power usage at these temperatures: customers tend to switch on their air-conditioning at once when the temperature is at either extreme. Between about 75°F and 80°F there is little change in the number of transformer failure interruptions, so this range is optimal. Above approximately 80°F the curve rises increasingly steeply. The asymmetry of the graph indicates that the effects of higher temperatures are more pronounced than those of lower ones; this is to be expected for Florida, especially its southern part, where it is sunny for most of the year.

[Regression plot of N vs. maximum temperature; x-axis: mean of maximum temperature (°F)]

$$Y = 208.210 - 5.22174\,X + 0.0335747\,X^2$$

S = 1.11736, R-Sq = 80.9%, R-Sq(adj) = 80.1%

Figure 4-5. Variation of average N due to transformer failures vs. maximum temperature (averaged per month per year)

Figure 4-5 was plotted with exactly the same information used for Figure 4-4, but the data of the corresponding months of the 4 years were averaged over all the MAs, giving a total of 48 points. Similarly, in Figure 4-6 the data of the corresponding months of all 4 years were averaged to give 12 data points.
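The temperature at which the fitted curve bottoms out can be checked from the first regression equation of this section. The sketch below assumes that equation reads Y = 202.803 - 5.09380 X + 0.0328080 X^2 (the minus sign on the linear term is an assumption, since it is not legible in every copy of the text) and locates the vertex of the parabola analytically. The function names are illustrative only.

```python
# Coefficients of the quadratic fit of transformer interruptions (Y)
# against mean maximum temperature X in deg F. The sign of the linear
# term is assumed to be negative.
A, B, C = 202.803, -5.09380, 0.0328080


def fitted_n(x):
    """Evaluate the fitted quadratic at temperature x (deg F)."""
    return A + B * x + C * x * x


def vertex_temperature():
    """Temperature at which the fitted quadratic is minimal: -B / (2C)."""
    return -B / (2.0 * C)


t_opt = vertex_temperature()
n_min = fitted_n(t_opt)
print(f"minimum of the fit near {t_opt:.1f} F, N about {n_min:.2f}")
```

Under these assumptions the minimum falls between 77°F and 78°F, which is consistent with the flat region between about 75°F and 80°F noted in the discussion of the plot.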
The important point to observe is that as the number of data points decreases, the plot becomes smoother and the R² value increases, but at the cost of losing the finest details of the data, because all the variations within each month are averaged out. Averaging the data points in this way gives us an opportunity to see the hidden pattern between the variables by suppressing the disturbances (noise) in the data set.

[Regression plot of N vs. maximum temperature; x-axis: mean of maximum temperature (°F)]

$$Y = 273.490 - 6.81286\,X + 0.0432362\,X^2$$

S = 0.597484, R-Sq = 95.1%, R-Sq(adj) = 94.0%

Figure 4-6. Variation of average N due to transformer failures vs. maximum temperature (averaged per month)

The correlation equation for the X and Y variables considered is given above each plot; R² represents the proportion of variability in the Y variable accounted for by the X variable. Based on the correlation model developed between the transformer interruptions and the maximum temperature, it is possible to predict the total number of interruptions due to transformers for any day and MA if the maximum temperature for that day and MA is known. Table 4-2 shows the prediction of transformer interruptions and the associated error for each of the MAs of FPL.

Table 4-2. Prediction of N Using Maximum Temperature of All the MAs

MA                Airport   Tmax      N Tx      N Tx pred.   Error (%)   N Tx pred.   Error (%)
                            (avg.)    (avg.)    all-FPL      all-FPL     individual   individual
                                      actual    equation     equation    MA's eq.     MA's eq.
Central Florida   DAB       66.5000   7.36364   9.31668      26.523      8.170        10.951
Wingate           FLL       74.4091   3.45455   5.40181      56.368      4.898        41.784
Gulf Coast        FMY       72.5909   6.13636   5.91181      3.659       *
Treasure Coast    FPR       71.3182   6.86364   6.40587      6.669       *
Wingate           FXE       74.6364   3.45455   5.35414      54.988      4.850        40.395
Gulf Stream       HWO       75.3636   4.04545   5.22544      29.168      *
Central Dade      MIA       75.2727   3.50000   5.23954      49.701      5.020        43.429
Brevard           MLB       69.8182   7.22727   7.13454      1.283       *
North Dade        OPF       75.4545   4.81818   5.21190      8.172       *
West Palm         PBI       74.0000   4.90909   5.49660      11.968      *
Toledo            PGD       71.3182   4.18182   6.40587      53.184      5.429        29.824
Pompano           PMP       73.8636   2.31818   5.53076      138.582     3.720        60.471
Central Florida   SFB       68.3636   7.36364   7.99378      8.558       *
Manasota          SRQ       68.0000   6.95455   8.23225      18.372      7.360        5.830
South Dade        TMB       76.3636   5.40909   5.10758      5.574       *
ALL FPL                     72.485    5.2       5.9          13.4

Description:
* MA: Management area considered
* Airport: The airport nearest to the MA, used as the source of the weather data
* Tmax (avg.): The average of the maximum temperatures that occurred in January 2002
* N Tx (avg.): Average number of interruptions (N) due to transformer failures
* N Tx prediction, all-FPL equation: Predicted N Tx (avg.) using the common equation of all MAs
* N Tx prediction, individual equation: Predicted N Tx (avg.) using the local equation of the individual MA

It can be observed from Table 4-2 that the prediction error varied over a wide range, from 1.28% to 138%. There were only 5 cases where the error was about 50% or more, with the others within satisfactory limits. The large errors are due to the use of the common equation developed from the data of all the MAs. When the individual MA equations, developed from the local MA data, were used instead, those large errors were drastically reduced.
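The error columns of the prediction table are consistent with a relative error of the form |predicted - actual| / actual, expressed in percent. The sketch below recomputes a few of the tabulated values on that assumption; the row labels and numbers are copied from the table, and `percent_error` is a name introduced here for illustration.

```python
def percent_error(predicted, actual):
    """Relative prediction error in percent, measured against the actual value."""
    return abs(predicted - actual) / actual * 100.0


# (actual N Tx, prediction from the all-FPL equation), taken from the table.
rows = {
    "Central Florida (DAB)": (7.36364, 9.31668),
    "Wingate (FLL)": (3.45455, 5.40181),
    "Brevard (MLB)": (7.22727, 7.13454),
    "Pompano (PMP)": (2.31818, 5.53076),
}

errors = {name: percent_error(pred, actual)
          for name, (actual, pred) in rows.items()}

for name, err in errors.items():
    print(f"{name:<24} {err:8.3f} %")
```

Because the error is measured relative to the actual count, areas with few transformer interruptions, such as Pompano, show very large percentages even when the absolute difference is only about three interruptions.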
There were cases where the common equation gave better results than the local equations of the MAs; hence local equations are used only for those MAs where the common equation gave a large error.

4.3.2 Impact of Wind on N

Among all the weather parameters, the role of wind is very significant. There is a very good correlation between wind and the total number of interruptions (N). When the daily 2-minute maximum wind gust (TMMG) was plotted against N, the scatter was chaotic and no pattern could be seen, because many different values of N occurred for any given TMMG speed. The values of N that occurred at each TMMG speed were therefore averaged and then plotted; the plot can be seen in Figure 4-7.

[Regression plot of N vs. wind; x-axis: mean of 2-minute wind gust speed (mph)]
Cubic fit, coefficient magnitudes as printed (the signs are not legible in this copy): 4.57617; 0.692087 X; 0.0355799 X^2; 0.0004352 X^3
S = 2.76351, R-Sq = 39.8%, R-Sq(adj) = 35.2%

Figure 4-7. Variation of N vs. wind

There appears to be a pattern up to 40 mph, but beyond that the pattern becomes distorted. If only those wind speeds with at least 30 observations are used when calculating the averages, the correlation obtained is very high, R² = 99.3%, and reveals a strong cubic relationship between N and TMMG (Figure 4-8). In doing so, only 1.5% of the data points were neglected, keeping 98.5% of the whole data.

[Regression plot of N vs. wind; x-axis: mean of 2-minute wind gust speed (mph)]

$$Y = 0.613754 + 0.0647258\,X - 0.0065040\,X^2 + 0.0002598\,X^3$$

S = 0.0900810, R-Sq = 99.3%, R-Sq(adj) = 99.2%

Figure 4-8. Mean of 2-minute wind speed vs. average number of interruptions

It can be observed from Figure 4-8 that the average total number of interruptions rises rapidly above about 20 mph. Power distribution poles and overhead equipment must therefore be designed so that they do not break down under wind gusts of more than 20 mph.
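The averaging procedure used for the wind analysis (group the daily observations by gust speed, average N within each group, and discard speeds with fewer than 30 observations) can be sketched as follows. The observations in the example are synthetic and serve only to exercise the function; the cubic coefficients are those of the second wind regression, with the sign of the quadratic term assumed to be negative.

```python
from collections import defaultdict


def mean_n_by_gust(observations, min_count=30):
    """Average N for each gust speed, keeping speeds with >= min_count days.

    observations: iterable of (gust_mph, n_interruptions) pairs.
    """
    groups = defaultdict(list)
    for gust, n in observations:
        groups[gust].append(n)
    return {gust: sum(values) / len(values)
            for gust, values in sorted(groups.items())
            if len(values) >= min_count}


def cubic_fit(x):
    """Cubic regression of mean N on gust speed x (mph); signs partly assumed."""
    return 0.613754 + 0.0647258 * x - 0.0065040 * x**2 + 0.0002598 * x**3


# Synthetic data: 30 days at 10 mph (N alternating 0 and 2), 5 days at 55 mph.
synthetic = [(10, 0 if i % 2 == 0 else 2) for i in range(30)]
synthetic += [(55, 20)] * 5

averaged = mean_n_by_gust(synthetic)
print(averaged)                       # the sparsely observed 55 mph bin is dropped
print(round(cubic_fit(20), 2), round(cubic_fit(30), 2))
```

Dropping sparsely populated speeds removes the high-gust bins whose averages rest on only a handful of days, which is why the fit improves so sharply while almost all of the data is retained.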
Care also has to be taken that the vegetation and other objects near the distribution lines are kept at a proper distance and will not lean on the lines during these wind gusts.

4.3.3 Impact of Rain on N

The impact of rain on the mean value of N can be observed in the following figures.

[Regression plot of N vs. rain; x-axis: mean of rain (inch)]

$$Y = 1.43350 + 3.04671\,X - 1.09160\,X^2 + 0.137744\,X^3$$

S = 1.24028, R-Sq = 42.8%, R-Sq(adj) = 37.2%

Figure 4-9. Variation of N vs. rain

There are more days without rain than days with rain. As it is the impact of rain on N that is under consideration, the non-rainy days were excluded from the data set. The data points of N were averaged in the same way as in the analysis of the wind impact on N: the different occurrences of N for each level of rain were averaged and then plotted in Figure 4-9. The whole graph can be divided into 3 piecewise linear segments: 0.1 to 1 inch, 1 to 3 inches, and more than 3 inches. In the first segment the relationship between N and rain looks linear (Figure 4-10), so the initial small amount of rain plays a vital role in the number of interruptions.

[Regression plot of N vs. rain; x-axis: mean of rain (inch)]

$$Y = 1.47309 + 2.11652\,X + 0.662153\,X^2 - 0.498268\,X^3$$

S = 0.568862, R-Sq = 78.9%, R-Sq(adj) = 75.0%

Figure 4-10. Variation of N vs. rain in the interval [0, 2]

N remains roughly constant in the second segment, showing a constant effect of rain, but in the third segment N increases drastically as rain rises above 3 inches. A small amount of rain, such as a light shower, settles on the insulators. The droplets of water act as a solvent for the salts and the atmospheric dust deposited on the insulators and form a conducting layer for the current, thereby causing a flashover that leads to power interruptions, as explained in Chapter 2.
On the other hand, rain from 1 to 3 inches is heavy enough to clean the insulators, because the water runs off them instead of settling. Finally, rain over 3 inches is accompanied by extreme weather conditions, again leading to a large N.

4.3.4 Effect of Rain and Wind Together on N

The three-dimensional Figure 4-11 shows the combined effect of wind and rain on the average value of N. It can be seen that the predicted (calculated) $N_{avg}$ tracks the actual N very well for lower values of N. The regression equation is given by

$$N_{avg} = 1.05 - 0.04\,W_{\text{wind speed}} + 6.82\,R_{\text{rain average}}$$

The coefficient of determination, R², is 85.6%.

Figure 4-11. Impact of rain and wind together on N

The usual methods of statistical analysis rely, in part, on knowing in advance what the researcher is looking for. An example is a study done by FPL that provides a linear equation describing the number of interruptions caused by lightning as a function of the number of lightning strikes. In this case, the cause of the outage is known (a one-to-one relationship) and the result is expected. The data required for this type of analysis is also circumscribed by the limited scope of the question. In addition, there are limited strategies for dealing with the problem, since lightning is a random and unpredictable event.

Analyzing the effects of normal weather requires a different approach. We need to be open to unexpected results rather than expected ones. We need to consider a body of data much larger than that required to investigate a single phenomenon. We need to consider nonlinear relationships and relationships that imply a confluence of conditions. We need to apply every statistical tool we can think of, and then learn some more. Most of these features are available in a tool called Artificial Neural Networks (ANNs). The application of ANNs to our current problem is therefore discussed in the next chapter.
4.3.5 Effect of Lightning Strikes on N

FPL has already carried out a correlation analysis between cause codes 001 (lightning, with equipment damage) and 002 (storm with no equipment damage) and lightning strikes for all the MAs over the years 1998-2001. Cause codes 001 and 002 represent the direct effect of weather on service interruptions. The following plot, copied from the information slides presented by FPL during their visit to the University of Florida, shows the impact of lightning strikes on storm interruptions. The correlation is very high and linear, meaning that the more lightning strikes there are, the more storm interruptions occur.

[Plot: monthly correlation of lightning strikes vs. storm interruptions (codes 01 & 02); fitted line y = 0.0376x + 91.956; x-axis: lightning strikes, 0 to 140,000]
Figure 4-12. Lightning strikes vs. storm interruptions during 1998-2001 for all MAs

CHAPTER 5
PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS

Although the previous chapter may give the impression that the effect of all the weather parameters on power interruptions can be quantified using standard mathematical functions and statistical techniques, this is not always true. It may be neither practical nor feasible to find a function for certain complex correlations between weather and interruptions. This is where neural networks are needed, to analyze and generalize the hidden relationship. We need a tool that is powerful when applied to problems whose solutions require knowledge that is difficult to specify, but for which there is an abundance of examples. Artificial neural networks are one of the best tools for this kind of problem.

5.1 Introduction to Artificial Neural Networks

Neural networks, or artificial neural networks (ANNs) to be more precise, represent a technology that is rooted in many disciplines: neuroscience, mathematics, statistics, physics, computer science, and engineering.
ANNs find applications in such diverse fields as modeling, time series analysis, pattern recognition, signal processing, and control by virtue of an important property: the ability to learn from input data with a teacher (supervised) or without one (unsupervised). The most common training scenarios use supervised learning. An ANN is a very useful tool for predicting the interruptions of a power distribution system with reasonable accuracy. The accuracy of the prediction depends directly on the accuracy of the historical power interruption and weather data used to train the ANN. This project provides a methodology for predicting interruptions in advance, for forecast weather conditions, using ANNs.

5.1.1 Benefits of ANNs over Statistical Methods

The ANN is an alternative to conventional methods [11]. It is an approach that combines the time series and regression approaches: it learns from previous interruption and weather patterns and predicts the interruptions for the current conditions, and it also performs a nonlinear regression between interruptions and weather patterns. It shows superior accuracy when compared to statistical methods [12]. An ANN derives its computing power from, first, its massively parallel distributed structure and, second, its ability to learn and therefore generalize. Generalization refers to the neural network producing reasonable outputs for inputs not encountered during training (learning). These two information-processing capabilities make it possible for neural networks to solve complex (large-scale) problems that are currently intractable. The main reasons for using neural networks for prediction, rather than statistical techniques or classical time series analysis, are [13]:
* They are self-monitoring (i.e., they learn how to make accurate predictions).
* They are able to cope with nonlinearity and nonstationarity of input processes.
* They are adaptive, nonlinear and highly parallel.
* They can generalize.
* They are computationally at least as fast as, if not faster than, most available statistical techniques.

Multilayered ANNs are capable of performing just about any linear or nonlinear computation, and can approximate any reasonable function arbitrarily well.

5.1.2 Architecture of ANN

Figure 5-1(a) shows the basic model of a single neuron, while Figure 5-1(b) shows a one-layer network with R input elements and S neurons. In this network, each element of the input vector p is connected to each neuron input through the weight matrix W. The ith neuron has a summer that gathers its weighted inputs and bias to form its own scalar output n(i), Figure 5-1(b). The various n(i) taken together form an S-element net input vector n. Finally, the neuron layer outputs form a column vector a, where $a = f(Wp + b)$.

[Figure 5-1 panels: (a) input signals, synaptic weights, and activation function of a single neuron; (b) input and layer of neurons, $a = f(Wp + b)$, where R = number of elements in input vector and S = number of neurons in layer; (c) input, hidden layer, and output layer, with $a^1 = \mathrm{tansig}(IW^{1,1} p + b^1)$ and $a^2 = \mathrm{purelin}(LW^{2,1} a^1 + b^2)$]

Figure 5-1. ANN structures: (a) basic nonlinear model of a neuron, (b) one layer network of neurons, and (c) 3 layer feed forward back propagation network

Figure 5-1(c) shows the ANN model used in the current project. IW represents the input weight matrix, having a source 1 (second index) and a destination 1 (first index). Also, elements of layer one, such as its bias, net input, and output, have a superscript 1 to indicate that they are associated with the first layer. LW represents layer weights [14].

The data is presented to the input nodes. Each input node is connected to several nodes in the second layer. The second layer is called the hidden layer, since its nodes are not accessible to the outer environment. The hidden layer acts as a layer of abstraction, pulling features from the inputs. Determining the proper number of nodes for the hidden layer is difficult and is often done by trial and error.
Generally, network performance increases with the number of hidden nodes and then reaches a saturation level [15]. Adding more hidden nodes beyond that point may actually degrade performance because the network becomes more difficult to train. Following this commonly accepted rule helps train the ANN efficiently and also helps the solution converge. The last layer is referred to as the output layer, since the network's output is the response of the nodes on this layer. The number of output nodes of an ANN is determined by the requirements of the problem.

5.1.3 Functioning of ANN

In general, the operation of this feed forward network consists of passing weighted and summed input signals through a chosen nonlinearity. It presumes knowledge of the network's bias functions and weighted links. Once the activation and output functions are chosen, an ANN is completely described by its weights and biases. Since a given ANN solves a specific problem, or function, finding the weights and biases for the network is equivalent to finding the input/output relationship that describes the function. In the current ANN model, Figure 5-1(c), the activation functions chosen in the hidden layer and the output layer are "tansig" and "purelin" respectively. The two-layer sigmoid/linear network can represent any functional relationship between inputs and outputs if the sigmoid layer has enough neurons.

There are many training algorithms and performance functions to choose from when training the network model. For the present problem the back propagation network (BPN) algorithm was chosen, as it is the best-known algorithm for multilayer perceptron (MLP) networks, and the 'trainbfg' training function was used to train the BPN. The term back propagation refers to the manner in which the gradient is computed for nonlinear MLP networks. Properly trained back propagation networks tend to give reasonable answers when presented with inputs that they have never seen.
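The tansig/purelin network and the idea of back propagating the output error can be illustrated with a minimal sketch. It is not the thesis implementation: the project used the MATLAB toolbox with the 'trainbfg' (quasi-Newton) training function, whereas this sketch swaps in plain per-sample gradient descent to keep the code short. The toy data set, the learning rate, the number of epochs, and the function names (`forward`, `train`, `mse`) are all assumptions made for illustration; only the layer equations and the names IW and LW follow Figure 5-1(c).

```python
import math
import random


def forward(p, IW, b1, LW, b2):
    """Forward pass: a1 = tansig(IW p + b1), a2 = purelin(LW a1 + b2)."""
    a1 = [math.tanh(sum(w * x for w, x in zip(row, p)) + b)
          for row, b in zip(IW, b1)]
    a2 = sum(w * a for w, a in zip(LW, a1)) + b2  # purelin is the identity
    return a1, a2


def mse(data, IW, b1, LW, b2):
    """Mean squared error of the network over (input, target) pairs."""
    return sum((t - forward(p, IW, b1, LW, b2)[1]) ** 2
               for p, t in data) / len(data)


def train(data, hidden=5, epochs=200, lr=0.05, seed=0):
    """Plain gradient-descent back propagation (assumed stand-in for trainbfg)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Start from small random weights and biases.
    IW = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    LW = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = rng.uniform(-0.5, 0.5)
    history = [mse(data, IW, b1, LW, b2)]
    for _ in range(epochs):
        for p, t in data:
            a1, a2 = forward(p, IW, b1, LW, b2)
            e = a2 - t  # error at the linear output node
            # Propagate the error back through LW and the tanh derivative.
            d1 = [e * LW[j] * (1.0 - a1[j] ** 2) for j in range(hidden)]
            for j in range(hidden):
                LW[j] -= lr * e * a1[j]
                b1[j] -= lr * d1[j]
                for i in range(n_in):
                    IW[j][i] -= lr * d1[j] * p[i]
            b2 -= lr * e
        history.append(mse(data, IW, b1, LW, b2))
    return (IW, b1, LW, b2), history


# Hypothetical toy data: two scaled inputs and one smooth target.
toy = [((x / 5.0, y / 5.0), 0.6 * (x / 5.0) + 0.3 * (y / 5.0) ** 2)
       for x in range(-5, 6, 2) for y in range(-5, 6, 2)]
params, history = train(toy)
print(f"MSE before training: {history[0]:.4f}, after: {history[-1]:.4f}")
```

The hidden-layer size of 5 mirrors the small comparison network described later in the thesis, but any value could be used here; the point is only that the error at the output node is what drives the updates of both weight sets.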
Typically, a new input leads to an output similar to the correct output for input vectors used in training that are similar to the new input being presented. This generalization makes it possible to train a network on a representative set of input/target pairs and get good results without training the network on all possible input/output pairs.

5.1.4 Back Propagation Learning Rule

The back propagation learning rule [16] is an iterative gradient algorithm designed to minimize the mean square error between the actual output of a multilayer feed forward network and the desired output. An essential component of the rule is the iterative method that propagates the error terms required to adapt the weights back from nodes in the output layer to nodes in lower layers. At the beginning, all weights and node offsets are set to small random values. The input values are presented and the desired outputs are specified. Then the network, Figure 5-2, is used to calculate the actual outputs. A recursive algorithm, starting at the output nodes and working back to the hidden layer, adjusts the weights until they converge and the cost function is reduced to an acceptable value. The training process is repeated by presenting different sets of input data to the ANN.

Figure 5-2. A back propagation ANN model

5.2 Steps to Enhance the Performance of ANN

It is a misconception that one can simply feed all the available variables into the ANN to predict the solution. The more input variables the ANN has, the harder it becomes to track the correlations among them. To enhance the performance of the ANN, the input data has to be preprocessed. The ANN toolbox of MATLAB 6.0 has some functions which can perform these operations. The following are some of the techniques that can help enhance the quality of the input datasets before they are given to the ANN:

* Eliminate the unnecessary variables which don't have significant contribution to the output.
* Scale the inputs and targets so that they always fall within a specified range.
* Reduce the dimensions of the input data without much loss in variance, e.g. by Principal Component Analysis, as explained below.

As weather is a combination of many parameters such as wind, temperature, rain, pressure, dew, lightning, etc., the next question is which of these parameters are the predominant ones that contribute significantly to the daily interruptions. One way to answer this question is to look at the variance of all the weather parameters with respect to each other. The parameters with more variance are more responsible for N than the ones with less variance. Less variance in a variable means fewer changes in its value, which means the variable has less effect on the changes in N. For investigations involving a large number of observed variables, it is often useful to consider a smaller number of linear combinations of the original variables.

Principal Component Analysis (PCA)

PCA is a popular and easy-to-use tool for reducing the dimensions of the input variables. Principal component analysis [13] finds a set of standardized linear combinations called the principal components, which are orthogonal and, taken together, explain all the variance of the original data. The following analysis shows the variance of the 8 inputs considered in the ANN model:

Table 5-1.
Summary Table of Covariance for All the Input Variables Considered in the Principal Component Analysis

Importance of components:

Component   Standard deviation   Proportion of Variance   Cumulative Proportion
Comp.1      10.9577574           0.5498531                0.5498531
Comp.2      7.0728909            0.2290853                0.7789384
Comp.3      5.5105074            0.1390550                0.9179934
Comp.4      3.22619511           0.04766335               0.96565672
Comp.5      2.01380874           0.01857119               0.98422792
Comp.6      1.5672211            0.0112477                0.9954756
Comp.7      0.936582778          0.004016943              0.999492561
Comp.8      0.3328819326         0.0005074389             1.0000000000

Loadings (small loadings are blank in the output, so each row lists only the printed values, from left to right):

MaxTemp     0.515  0.138  0.197  0.722  0.336  0.203
MinTemp     0.738  0.363  0.516  0.216
Rain        0.995
AvgWindS    0.313  0.359  0.841  0.229
5SecWindS   0.727  0.230  0.187  0.614
2MinWindS   0.585  0.151  0.787
HeatDays    0.107  0.277  0.949
CoolDays    0.415  0.903

In Table 5-1, if component 1 (comp.1) alone is considered, it explains 54.9% of the total variance in the data set using the following linear combination of only 4 weather variables:

Comp.1 = 0.515(MaxTemp) + 0.738(MinTemp) - 0.107(HeatDays) + 0.415(CoolDays)

Similarly, comp.2 alone explains 22.9% of the total variance in the data set with the following linear combination:

Comp.2 = 0.197(MaxTemp) + 0.313(AvgWindS) + 0.727(5SecWindS) + 0.585(2MinWindS)

When comp.1 and comp.2 are considered together, 77.8% of the total variance of the data set can be explained. From the table it can be observed that by considering components up to comp.3, around 91.7% of the total variance can be explained, and by considering components up to comp.4, around 96.5% can be explained. The number of components to consider depends on the amount of variance that is of interest. Hence in this case the total number of dimensions, 8, is reduced to 4 if we want to retain 96.5% of the total variance by considering components up to comp.4.
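The proportions in Table 5-1 follow directly from the component standard deviations: each component's share is its variance (standard deviation squared) divided by the sum of all eight variances. The short sketch below reproduces that arithmetic from the tabulated values; the helper name `components_needed` and the variable names are hypothetical and are introduced only for this illustration.

```python
# Standard deviations of the eight principal components, copied from Table 5-1.
std_dev = [10.9577574, 7.0728909, 5.5105074, 3.22619511,
           2.01380874, 1.5672211, 0.936582778, 0.3328819326]

# Variance of each component and its share of the total variance.
variances = [s ** 2 for s in std_dev]
total = sum(variances)
proportion = [v / total for v in variances]

# Running sum of the shares, i.e. the "Cumulative Proportion" column.
cumulative = []
running = 0.0
for share in proportion:
    running += share
    cumulative.append(running)


def components_needed(target):
    """Smallest number of leading components whose cumulative share reaches target."""
    for k, c in enumerate(cumulative, start=1):
        if c >= target:
            return k
    return len(cumulative)


for k, (share, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"Comp.{k}: proportion = {share:.4f}, cumulative = {c:.4f}")
print("components needed for 96% of variance:", components_needed(0.96))
```

Choosing a different variance target changes only the argument to `components_needed`; for example, a 90% target is already met by the first three components.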
Notably, rain in the above table makes no contribution to the variance if we consider components up to comp.4; hence this variable can be eliminated. The 8 variables can therefore be reduced to 4 components while preserving 96.56% of the total variance in the data set. Each component acts like a new variable, but is a linear combination of the actual weather variables.

5.3 ANN Simulation Output

Three management areas (Wingate, North Dade, and Gulf Stream) were chosen as pilot areas in the current artificial neural network (ANN) project. These 3 MAs are adjacent to each other and small enough to justify the assumption that the variation in weather due to geographical differences is slight. Also, they are all urban MAs and appear to have a similar distribution of customer types. Two years of weather and interruption data, 2000 and 2001, were chosen to train the ANN, while the 2002 weather and interruption data were used to evaluate the performance of the trained ANN model. One entry in either the training or evaluation dataset is composed of one day's weather and interruptions for one MA, so the one-year evaluation dataset had 1060 entries (allowing for missing data). The output of the ANN is two columns of data: a prediction for each entry in the evaluation dataset and the actual number of interruptions for each entry in the evaluation dataset. Figure 5-3 is a graph of the predicted values superimposed over the actual values for the evaluation year 2002. The MAs are in sequence (Gulf Stream, North Dade, Wingate), and it can be seen that the predicted values follow the seasonal trends for interruptions.

Figure 5-3. Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year 2002

5.3.1 Detailed Observation

Figure 5-4 is an expanded segment (North Dade) of the above graph to highlight the details. It can be seen that where the actual number of interruptions is large, the predictions match the pattern of maxima and minima, but are not always close in magnitude.
Where the actual number of interruptions is small, the pattern matching breaks down, but large spikes in the predictions do not occur. The following is a segment of a time series plot of the predicted and actual values of the total daily number of interruptions (N) for the North Dade MA. It can be clearly seen that during some periods (rectangles) the predicted values match the pattern of the actual values, if not the magnitude, while other periods (ovals) do not show any such pattern matching although the magnitudes are small.

Figure 5-4. Predicted N and the actual N for a few of the cases in North Dade MA for 2002

Some of the interesting observations from the above plot, Figure 5-4, are explained below.

Case 1: Predicted N less than actual N. The actual values of N at points 529 and 530 in Figure 5-4 correspond to 6/17/2002 and 6/19/2002 in the North Dade MA. The interruption data for 6/17/2002 indicate that up to 18 of the 28 interruptions that occurred on that day may not be related to common daily weather (Corrosion/Decay = 10, Improper Process = 6, Request = 2), which suggests that as few as 12 may be weather related interruptions, which is just the same as the predicted value. Similarly, 15 out of the 29 interruptions that occurred on 6/19/2002 may not be related to daily weather, which is close to our predicted value of 12. Though the weather conditions for these days were relatively mild, there is a significant increase in the number of interruptions, as shown in the green box in Figure 5-5.

Case 2: Predicted N more than actual N. Points 549, 550, 551 and 552 correspond to 7/8/2002, 7/9/2002, 7/10/2002 and 7/11/2002 in the North Dade MA, outlined in red in Figure 5-5. The number of interruptions on these days was nearly the same although the weather conditions varied over a wide range. This large change in weather conditions forces the ANN model to predict N in proportion to the weather.
The question that should therefore be asked is why N changes so little when the weather conditions differ significantly. Were some precautionary measures, e.g. tree trimming, taken a few days before these interruptions occurred?

[Table image: daily weather and interruption records, rows 527 to 559 (June and July 2002); columns: Call Sign, Date, MaxTemp, MinTemp, AvgTemp, HeatDays, CoolDays, Rain, AvgWdS, 5SMaxS, 2MMaxS, WGLS, WG N]

Figure 5-5. Numerical values of weather and interruption data under consideration

Case 3: In some of the cases there was a very small increase in N though there were large variations in the weather conditions.
5.3.2 Dominant Weather Parameters: Preliminary Observations

A series of ANN simulations with different weather parameters removed has been carried out, and the relative accuracy of each simulation has been compared to determine which weather parameters are the most significant. The preliminary results show that only a few of the many weather parameters account for most of the variation in the number of interruptions. It is expected that the importance of individual weather parameters will vary with location. The following list gives the weather variables in order of importance for the pilot area:

1. Two Minute Sustained Wind Gust (mph)
2. Rain (inches)
3. Lightning Strikes (# of strikes/day)
4. Temperature, Average or Max. & Min. (°F)

On the other hand, the following parameters account for the least variation in the number of interruptions:

1. Pressure
2. Heat Days
3. Cool Days
4. Dew Point
5. Population (of MA)

5.4 Analysis of ANN Simulation Output

Based on the actual number of interruptions and the predictions during the evaluation year, probability graphs (PGs) have been created to represent the range of interruptions that actually occurred in the evaluation dataset for each predicted value. For example, if every number between 1 and 40 interruptions were predicted at some point in the evaluation year, there would be 40 PGs. This is done by subsetting the evaluation dataset into 40 sets, one per predicted value, and creating a histogram of the actual values for each predicted value in the evaluation set. By dividing each frequency column by the total count of observations that make up the histogram, a probability graph such as the one shown below for a prediction of 11 can be created.

[Plots: (a) histogram and (b) probability graph of actual N for a prediction of 11; x-axis: N Actual]

Figure 5-6.
Histogram plot of predicted interruptions

From the histograms, it can be seen that the actual values for each predicted value follow a generally normal distribution, so it is justified to use the normally calculated mean and standard deviation to gauge the accuracy and precision of a prediction. The accuracy is determined by the closeness of the prediction to the mean actual number of interruptions. The precision is determined by the magnitude of the percent standard deviation. Percent standard deviation was chosen to make the standard deviations comparable between lower and higher predictions. Outliers provide clues to elements of the model that either are missing or should not be there.

Test data is just a single day's weather data. It can be a real-time updated weather parameter (max, min or total) from a weather station, a known day's values, or a theoretical set of weather data. The first is used in real-time prediction. A known day's values can only be used after the fact and provide no predictive benefit, aside from inclusion in the historical data set. A theoretical set, however, can be used for research, such as modeling a system's robustness to weather. Test data that shows a very low prediction can be used as a base, and the parameter values can be varied either individually or in groups to model the response to those parameters.

5.5 Pitfalls and Suggestions to FPL

GIGO is an acronym from the pre-dawn of computing: garbage in, garbage out. The accuracy and precision of the ANN are limited by the accuracy and precision of the input. Although there will always be error inherent in the data collected, significant improvements may be possible.

5.5.1 Weather Data

The error inherent in the ASOS weather data may be geographical, and ASOS data is only available for historic and not real-time use. The installation of dedicated weather stations that is now under way at FPL service centers will reduce that inherent error and allow real-time forecasting.
5.5.2 Interruption Data

Although the FPL data cubes are thorough, the reporting procedures are not designed for a detailed, time-dependent study such as this, nor are they always sensitive to the role of weather. Because of this, the accuracy and precision of the prediction suffer. An example is that a day on which an interruption may be reported runs three shifts from 7 AM to 7 AM. In the last random sample made, the last shift, from 11 PM to 7 AM, reported about 12% of the day's interruptions, meaning that interruptions from midnight to 7 AM were being reported on the previous day. This can be largely accounted for by taking data from the cube by shifts and summing; however, that still leaves 11 PM to midnight, or maybe 12% of the interruptions, reported on the wrong day. Because the data was only shifted in time, the average difference after adjustment was only 0.05 interruptions; however, because of many instances where a large number of interruptions were reported on the previous day, the average percent difference was 14%.

To determine the effect of this error on the output, two sets of data were taken from the same time period and location: an original one with 24 hours of interruption data taken from the cube on the day it was reported, and an adjusted one with 24 hours of interruption data taken from the beginning of the third shift on the day before it was reported. Both were run through the ANN and the results compared. The following detailed graphs of the same time period in the same MA show an improvement in the pattern and magnitude matching after the interruption data were adjusted for the shift differences.

Figure 5-7. Prediction results of ANN using the original N (not shift adjusted)

Mean and standard deviation plots for the actual N vs. predicted N and for the adjusted N vs. predicted N were plotted as shown in Figure 5-9 and Figure 5-10.
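The mean and percent standard deviation curves, and the probability graphs of Section 5.4, can all be computed by grouping the actual values by their predicted value. The sketch below shows one way to do this. It assumes that percent standard deviation means the standard deviation expressed as a percentage of the group mean, and it uses the population standard deviation so that a group with a single entry is still defined; neither choice is stated in the text. The function names and the five (predicted, actual) pairs are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_actuals(pairs):
    """Collect the actual N values observed for each predicted N."""
    groups = defaultdict(list)
    for predicted, actual in pairs:
        groups[predicted].append(actual)
    return groups


def probability_graph(actuals):
    """Histogram of actual values divided by the number of observations."""
    counts = defaultdict(int)
    for a in actuals:
        counts[a] += 1
    n = len(actuals)
    return {a: c / n for a, c in sorted(counts.items())}


def summarize(pairs):
    """Mean and percent standard deviation of actual N per predicted N."""
    out = {}
    for predicted, actuals in sorted(group_actuals(pairs).items()):
        m = mean(actuals)
        sd = pstdev(actuals)  # population form (assumption, see lead-in)
        pct = 100.0 * sd / m if m else float("nan")
        out[predicted] = (m, pct)
    return out


# Hypothetical (predicted, actual) pairs, for illustration only.
pairs = [(11, 9), (11, 11), (11, 13), (12, 12), (12, 12)]
summary = summarize(pairs)
pg_11 = probability_graph(group_actuals(pairs)[11])
for predicted, (m, pct) in summary.items():
    print(f"prediction {predicted}: mean actual = {m:.2f}, % std dev = {pct:.1f}")
print("probability graph for a prediction of 11:", pg_11)
```

Running the same summary on the original and on the shift-adjusted interruption counts would give the two sets of curves being compared here.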
It can be observed that adjusting the data to include the correct shifts on the correct days improves the fit of the prediction to the mean actual number of interruptions. It also shows that the fit deteriorates as the prediction increases, indicating unknown factors. The graphs of the original (Figure 5-9) and shift-adjusted (Figure 5-10) percent standard deviation show a reduction in the adjusted percent standard deviation at lower predictions, while the higher predictions are not much improved, similar to the graphs of the means.

Figure 5-8. Prediction results of ANN using the adjusted N (shift adjusted)

Figure 5-9. Mean and standard deviation of actual N

Figure 5-10. Mean and standard deviation of adjusted N

These results suggest that there are other improvements that can be made in the data reporting. No single change would perhaps have as visible an effect, but taken together they could alter the results significantly. Some possibilities suggest themselves:

* Report interruption requests due to weather related damage repair on the day the causative weather condition occurred.
* Maintain an hourly database for interruptions, since hourly weather is available. This would be especially useful if dedicated weather stations existed.
* Update cause codes to be more sensitive to the possible role of weather.
* Report the age of equipment.

Simulations that have been run with different cause codes subtracted from the interruption data, such as accident, animal, improper process and crew request (planned), have shown similar improvements in differing regions of the graphs.

5.6 Proving Localization of Weather Improves the Accuracy in Prediction

Case 1: Localized Weather Data

Three areas (Wingate, North Dade, and Gulf Stream), which are adjacent to each other, were chosen for study as pilot areas.
Weather and interruption data for each of the MAs for the years 2000 and 2001 were used to train the ANN model, while the 2002 weather data of the North Dade area was used for prediction with the trained ANN. The mean percentage error (MPE) of the predicted values is approximately 25%. The MPE is calculated using the following formula:

$$\mathrm{MPE} = \frac{1}{M} \sum_{i=1}^{M} \frac{\left| N_{\mathrm{actual},i} - N_{\mathrm{predicted},i} \right|}{N_{\mathrm{actual},i}}$$

where
M = total number of cases considered
N_actual = actual number of interruptions (N) that occurred
N_predicted = predicted number of interruptions (N)

Case 2: Scattered Weather Data

Instead of taking weather data from each of the weather stations, only one weather station was chosen for the weather data, while the interruption data was taken from all 3 management areas. The mean percentage error of the predicted values is 35% (>25%). This shows that improving the accuracy of the weather data, by considering smaller areas, can enhance the performance of the model.

5.7 Comparison of Statistical Model and ANN Model

A comparison of the prediction performance of the statistical and ANN models was made using the 2000 and 2001 weather and interruption data of the Gulf Stream (GS), North Dade (ND), and Wingate (WG) management areas of FPL. Three variables (Rain, 2-Minute Maximum Wind Gust, and Average Temperature) were considered in building the two models. A multiple regression equation was developed for these three variables, as given below:

N = 16.6 + 0.174 * AvgTemp + 4.71 * Rain + 0.852 * 2MMaxS

The 2002 weather data of ND is used to predict N using the above equation. Along similar lines, an ANN model was developed with 3 input variables, 5 hidden nodes and 1 output node. The same data set that is used for the statistical model is used in evaluating the ANN model. The results of both models are tabulated in Table 5-2.

Table 5-2.
Performance Comparison Between Statistical Model and ANN Model

                            Statistical Model   ANN Model
Mean % Error                67                  45
Prediction with 30% Error   46                  54

The above results show that the prediction accuracy of the ANN model is better than that of the statistical model. Though the accuracy figures of both models are low, since only a few variables were considered to keep the problem simple, the point here is to show that the ANN model performs better.

[Plot: mean squared error vs. epoch for the training, validation, and test sets]

Figure 5-11. Mean squared error vs. training epochs

Figure 5-11 shows that the mean square error gradually decreases as the ANN is trained over each epoch (one pass through the complete set of training data). The progress of training is diagnosed by looking at the training, validation and test errors. The training stopped after 40 epochs because the validation error increased. The result here is reasonable, since the test set error and the validation set error have similar characteristics, and it does not appear that any significant overfitting has occurred.

5.8 Possible Software Development to Predict Power Interruptions Using ANNs

Figure 5-12 is a snapshot of the graphical user interface (GUI) developed for the ANN that was trained to predict the interruptions.

Figure 5-12. Graphical user interface developed to predict interruptions

Using the interface shown in Figure 5-12 is simple: first load the training and testing data files (ASCII format) using the option buttons provided, and then click the "Run Simulation" button to see the plots. The development of custom software to predict power distribution interruptions, based on the idea provided in this thesis, is currently in progress. The proposed prediction model is under test at FPL management areas.
The custom software can be installed on the user's desktop just like any other software, and makes the expected power interruptions available in advance with a single click. In the future, the software will be distributed to other power utilities in the USA.

Using the model shown in Figure 5-12, it is also possible to obtain predictions of the kind shown in Figure 5-13. The Central Dade management area (MA) was chosen as one of the pilot areas to see how well the developed model can predict the interruptions. It can be seen that the coefficient of determination R² is around 90%, which means the model does a good job of predicting the interruptions close to the actual number of interruptions. The x-axis of Figure 5-13(a) shows the predicted sum of monthly interruptions, while the y-axis shows the actual sum of monthly interruptions that occurred. This estimate of predicted interruptions will help the utilities know in advance how many personnel they need to deploy to manage the interruptions.

[Plots: (a) Central Dade 2001-2003 monthly total N, fitted line Sum Actual = 1.84 + 1.060 Sum Predicted, S = 32.3396, R-Sq = 90.2%, R-Sq(adj) = 89.9%; (b) scatterplot of Sum Actual and Sum Predicted vs. Month for Central Dade, panel variable: Year]

Figure 5-13. Predicted interruptions vs. actual interruptions for Central Dade (a) 3 years plotted together (b) each year plotted separately

CHAPTER 6
LIMITATIONS, CONCLUSIONS AND FUTURE WORK

The research results presented in this thesis are not without their own limitations. Some of the hurdles that need to be overcome to get better results are discussed in this chapter.
The conclusions of the thesis are followed by the future work, which explains the steps to be taken from the current state of the project.

6.1 Limitations of Approach

6.1.1 Weather Data

There are two types of weather parameter measurement error. First, we found that the weather parameter measurements at an airport are not as accurate as expected. Second, the distance between the location of the outages and the airport where the weather parameters are measured is up to 10 miles. The weather conditions at two locations can be significantly different for certain weather parameters (Figure 6-1). However, the rain difference follows a normal distribution with a mean close to zero. Therefore, rain data can be used for nearby locations without changing the results.

[Figure 6-1: (a) time series plot, 11/20/2000 to 2/13/2002; (b) weather station locations, showing less than 975 miles distance between each weather station]

6.1.2 Unknown Variables

Some of these are different weather parameters, but other variables are most likely specific to a system and would best be identified by utility employees who are familiar with the system.

6.1.3 Outliers

It is possible to find aberrant observations in the data set without any clear explanation of the cause. These outliers must be studied independently.

6.1.4 Hourly Data

In the interruption database that FPL provided, the finest available level is the daily basis. However, for any interruption studied it is necessary to know the exact time, at the hour level or even at the minute level, as shown in [10-11]. This is needed to study the different weather parameters at the time of the outage, since the weather varies from hour to hour.
6.2 Conclusions

The ANN and the statistical analysis of the ANN output have the potential to provide powerful modeling tools, and can be used to provide limited real-time prediction. The accuracy and precision of the model depend as much on the input as on the ANN model. The graphical output of the ANN can be used by itself or in conjunction with the statistical analysis to compare the accuracy and precision of the ANN model with different variable selections, principal components, study areas or times. In some cases, the graphical representation can provide better clues to the performance of the ANN than the graphs of means and percent standard deviations.

With the demand for electricity increasing every year, the need to find better ways of preventing interruptions due to overloading of power distribution equipment has drawn much attention. ANNs have already been applied in power systems in the areas of economic load dispatch, optimization and loss reduction, fault detection and diagnosis, frequency control, load forecasting, contingency analysis, static security assessment, and voltage and reactive power control. However, not much research can be found, either online or in the IEEE publications, on the application of ANNs to the prediction of power distribution interruptions. This novel idea seems very promising for alerting the utilities in advance to the number of interruptions that are going to happen. This helps them optimize their crews by mobilizing them to the location of interest, and take proper action more effectively to avoid interruptions or respond quickly in restoring power after interruptions. This further helps in reducing the SAIFI value. The utilities can predict SAIFI, as they would be able to predict the total number of interruptions, and can use it in their internal calculations.
The developed ANN model can be further enhanced to predict additional information, such as the time slot and location of these interruptions, besides their approximate number. All that is needed is to provide this extra information as input columns when training the model. The accuracy of the predicted results depends directly on the accuracy of the information in the training data used to train the model.

A basic methodology that is easily automated has been proposed. The methodology promises to be easy to use and flexible enough to perform in both a real-time predictive mode and a theoretical modeling mode.

6.3 Future Work

The following steps are expected to improve the accuracy of the current analysis.

6.3.1 Data Collection and Creating New Variables

Additional data collection is necessary. It is suspected that a change in power usage or equipment density might cause outage trends over time. To verify this idea, we need to develop a scaling factor and collect usage data. This scaling factor would consist of information such as equipment density and length of lines; daily power usage data would be an additional explanatory variable. This new data might also be useful in comparing management areas, because the probability of interruptions occurring may be proportional to the scaling factor. The more the new input variables of the system

6.3.2 Improving the Accuracy and Developing New ANN Models

Other types of ANN, such as RBF, LVQ, SOM or their combinations, need to be tested to see which model gives better prediction results. The dimension of the input feature space (input feature pattern) needs to be reduced to improve performance, such as speed and prediction accuracy. If there is more than one prediction variable (multi-output rather than single-output), the architecture of the whole system may be either a multi-input multi-output ANN or a composition of several multi-input single-output ANNs.
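The input-dimension reduction mentioned above could take many forms. As a minimal sketch, and not the principal components referred to in Section 6.2, the following Python drops any input column that is almost perfectly correlated with a column already kept. The technique shown is a simple correlation filter; the column names, the values and the 0.95 threshold are hypothetical choices made only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / (spread_x * spread_y)

def drop_correlated_inputs(columns, threshold=0.95):
    """Keep a column only if |r| with every kept column is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical daily inputs: temperature in two units carries the same
# information twice, so one of the two columns is redundant.
inputs = {
    "temp_f": [70, 75, 80, 85, 90],
    "temp_c": [21.1, 23.9, 26.7, 29.4, 32.2],
    "rain_in": [0.0, 1.2, 0.1, 0.0, 0.8],
}
selected = drop_correlated_inputs(inputs)
```

A filter of this kind only removes redundancy between pairs of inputs; whether it improves ANN speed or accuracy for the interruption data would have to be tested.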
The training method and its performance should be further investigated and compared. An enhanced custom software tool with a graphical user interface should be developed, in which the user can select new input and output datasets to train the ANN and develop a model to predict the output. The user can then reuse this tool whenever a new model is to be created or an existing model is to be updated.

LIST OF REFERENCES

1. IEEE Trial-Use Guide for Electric Power Distribution Reliability Indices, IEEE Std 1366-2001, IEEE, New York, 1999.
2. 2001 Cost of Downtime, Contingency Planning Research (CPR) and Contingency Planning & Management Magazine (CPM). Web site http://www.contingencyplanningresearch.com (accessed December 2001).
3. C. A. Warren, Overview of 1366-2001 the Full Use Guide on Electric Power Distribution Reliability Indices, Power Engineering Society Summer Meeting, IEEE, Volume 2, 2002.
4. Transmission Line Reference Book, 345 kV and Above, Second Edition. Electric Power Research Institute, Palo Alto, CA, 1982.
5. Florida Power and Light. Web site http://www.fpl.com (accessed June 2004).
6. Thomas E. Grebe, D. Daniel Sabin, and Mark F. McGranaghan, An Assessment of Distribution System Power Quality, Volume 1: Executive Summary. EPRI Report TR-106294-V1, Electric Power Research Institute, Palo Alto, CA, May 1996.
7. D. Daniel Sabin, An Assessment of Distribution System Power Quality, Volume 2: Statistical Summary Report. EPRI Report TR-106294-V2, Electric Power Research Institute, Palo Alto, CA, May 1996.
8. Daniel L. Brooks and D. Daniel Sabin, An Assessment of Distribution System Power Quality, Volume 3: The Library of Distribution System Power Quality Monitoring Case Studies. EPRI Report TR-106294-V3, Electric Power Research Institute, Palo Alto, CA, May 1996.
9. National Climatic Data Center. Web site http://nndc.noaa.gov/?http://ols.ncdc.noaa.gov/cgibin/nndc/buyOL002.cgi (accessed June 2004).
10. A. Domijan, Jr., R. K. Matavalam, A. Montenegro, W. S. Willcox, Y. S. Joo, L. Delforn, J. R. Diaz, L. Davis, and J. D'Agostini, Effects of Normal Weather Conditions on Interruptions in Distribution Systems, International Journal of Power and Energy Systems, Publication No: 2033453.
11. J. M. Zurada, Introduction to Artificial Neural Systems. West Publishing Company, St. Paul, MN, 1992.
12. L. F. Garcia and O. A. Mohammed, Forecasting Peak Loads with Neural Networks, Southeast Conference. Creative Technology Transfer - A Global Affair, Proceedings of the 1994 IEEE, pp. 351-356, 10-13 April 1994.
13. S. I. Wu, Mirroring our Thought Processes. IEEE Potentials 14, pp. 36-41, 1995.
14. Neuron Model & Network Architectures, Neural Networks Toolbox, MATLAB 6.0 Manual, Chapter 2.
15. W. M. Huang and R. P. Lippmann, Comparisons Between Neural Networks and Conventional Classifiers, Proc. IEEE Int. Conference on Neural Networks, pp. 485-493, 1987.
16. J. L. Chen and S. H. Chang, A Neural Network Approach to Evaluate Distribution Systems Engineering, IEEE International Conference on Neural Networks, pp. 487-490, 17-19 Sept. 1992.

BIOGRAPHICAL SKETCH

Roop Kishore R. Matavalam was born in Tirupati, Andhra Pradesh, India. He received his Bachelor of Technology (B.Tech) degree in 2001, specializing in electrical and electronics engineering, from Sri Venkateswara University, India. Since Fall 2001 he has been pursuing his Master of Science degree in electrical and computer engineering at the University of Florida, Gainesville. He has been working as a research assistant since 2001 in the Florida Power Affiliates and Power Quality Laboratory, University of Florida. His fields of interest include power reliability, power electronics, analog circuit design and RF microelectronics.