
Power Distribution Reliability as a Function of Weather


PAGE 1

POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER

By ROOP KISHORE R. MATAVALAM

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA 2004

PAGE 2

Copyright 2004 By Roop Kishore R. Matavalam

PAGE 3

To my parents and my sister

PAGE 4

ACKNOWLEDGMENTS

First and foremost, I thank my advisor, Dr. Alexander Domijan, for having confidence in me and allowing me to pursue this research. The work presented in this thesis would not have been possible without his consistent support. I am also very thankful to Dr. Khai D.T. Ngo and Dr. Antonio A. Arroyo for serving as members of my thesis committee. I sincerely express my gratitude to my colleague and friend William S. Wilcox for his thorough discussions of statistics and his suggestions, without which this thesis would not have been as exciting. I am also grateful to my colleague and friend Alejandro Montenegro for his suggestions and for answering my questions without hesitation. I am also very grateful to my friend Raj Vignesh Thogulua for helping me understand neural networks. I am also thankful to Dr. Tao Lin for his helpful suggestions. I also thank my colleague and friend Ajay Karthik for being enthusiastic about my work.

I would like to express my special thanks to all the personnel of the FPL Distribution Reliability group, especially Mr. J. R. Pepe Diaz, Ms. Lee Davis and Ms. Jessica D'Agostini, for their valuable suggestions and financial support of this project; without their consistent support the project would not have been finished. I am also thankful to FPL members including Mr. Val Miklausich, Mr. Santiageo Cocina, Mr. Luis Delforn, Mr. Manny Miranda, and Ms. Martha Caneia for their assistance.

PAGE 5

I would like to express my gratitude to all the undergraduate students who worked on the FPL project and made it more lively and interesting. I also extend my gratitude to all my great friends for their support and encouragement. Finally, yet most importantly, I am indebted to my wonderful parents and sister for believing in my goals and aspirations, and for their love, encouragement, and constant support in all my endeavors.

PAGE 6

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Importance of Power Reliability
  1.2 Understanding Power Reliability Indices
  1.3 Purpose and Importance of this Thesis
  1.4 Organization of Thesis

2 UNDERSTANDING FLORIDA WEATHER
  2.1 Air Density, Temperature and Pressure
  2.2 Humidity
  2.3 Rain
  2.4 Dew or Condensation of the Humidity
  2.5 Pollution
  2.6 Wind
  2.7 Lightning

3 SYSTEM UNDER STUDY: FPL
  3.1 Description of the FPL distribution system
  3.2 Interruption Data from FPL
  3.3 Meteorological Weather Data from NCDC
  3.4 Weather Parameters of interest

4 CORRELATION OF WEATHER AND INTERRUPTIONS
  4.1 Importance of Statistical Tools
  4.2 Probabilistic Characteristics of Data Distributions
  4.3 Correlation Analysis between Weather Parameters and Interruptions
    4.3.1 Impact of Temperature on N
    4.3.2 Impact of Wind on N
    4.3.3 Impact of Rain on N
    4.3.4 Effect of Rain and Wind Together on N

PAGE 7

5 PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS
  5.1 Introduction to Artificial Neural Networks
    5.1.1 Benefits of ANNs over statistical methods
    5.1.2 Architecture of ANN
    5.1.3 Functioning of ANN
    5.1.4 Back Propagation Learning Rule
  5.2 Steps to Enhance the performance of ANN
  5.3 ANN Simulation Output
    5.3.1 Detailed Observation
    5.3.2 Dominant Weather Parameters: Preliminary Observations
  5.4 Analysis of ANN Simulation Output
  5.5 Pitfalls and Suggestions to FPL
    5.5.1 Weather Data
    5.5.2 Interruption Data
  5.6 Proving Localization of Weather Improves the Accuracy in Prediction
    Case 1: Localized Weather Data
    Case 2: Scattered Weather Data
  5.7 Comparison of Statistical Model and ANN Model
  5.8 Possible Software Development to Predict Power Interruptions Using ANNs

6 LIMITATIONS, CONCLUSIONS AND FUTURE WORK
  6.1 Limitations of Approach
    6.1.1 Weather Data
    6.1.2 Unknown Variables
    6.1.3 Outliers
    6.1.4 Hourly Data
  6.2 Conclusions
  6.3 Future Work
    6.3.1 Data Collection and Creating New Variables
    6.3.2 Improving the Accuracy and Developing New ANN Models

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

PAGE 8

LIST OF TABLES

Table
2-1 Type of Contaminant and Atmospheric Conditions at the Time of Contamination Flashover (UHV Project)
3-1 FPL Power Sales by Sectors
3-2 FPL Distribution Management Areas along with Their Dispatch Centers
3-3 FPL Cause Codes (102) Table
4-1 The Frequency of Interruptions due to Tree Limbs (Cause Codes 20 & 21)
4-2 Prediction of N Using Maximum Temperature of All the MAs
5-1 Summary Table of Covariance for All the Input Variables Considered in the Principal Component Analysis
5-2 Performance Comparison Between Statistical Model and ANN Model

PAGE 9

LIST OF FIGURES

Figure
2-1 Regions with strong contamination (UHV project)
2-2 Swing angle as a function of instantaneous wind speed at tower
2-3 Vegetation effects on power interruptions
2-4 Number of days with thunderstorms in Florida (US Weather Bureau)
2-5 Cumulative frequency distribution of peak current amplitudes in downward negative flashes
3-1 Snap shot of Florida map
3-2 FPL's historical SAIFI performance
3-3 Frequency charts of interruptions and customers affected by interruptions
4-1 N for all management areas from 1998 to 2001 using the previous filters
4-2 Rain (inches) for all management areas from 1998 to 2001
4-3 Wind-2 minutes maximum speed (mph) for all management areas from 1998 to 2001
4-4 Variation of average N due to transformer failures vs. maximum temperature
4-5 Variation of average N due to transformer failures vs. maximum temperature (averaged per month per year)
4-6 Variation of average N due to transformer failures vs. max temperature (averaged per month)
4-7 Variation of N vs. wind
4-8 Mean of 2 minutes wind speed vs. average number of interruptions

PAGE 10

4-9 Variation of N vs. rain
4-10 Variation of N vs. rain in the interval [0 2]
4-11 Impact of rain and wind together on N
5-1 ANN structures
5-2 A back propagation ANN model
5-3 Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year 2002
5-4 Predicted N and the actual N for a few of the cases in North Dade MA for 2002
5-5 Numerical values of weather and interruption data under consideration
5-6 Histogram plot of predicted interruptions
5-7 Prediction results of ANN using the original N (not shift adjusted)
5-8 Prediction results of ANN using the adjusted N (shift adjusted)
5-9 Mean and standard deviation of actual N
5-10 Mean and standard deviation of adjusted N
5-11 Mean squared error vs. training epochs
5-12 Graphical user interface developed to predict interruptions
5-13 Predicted interruptions vs. actual interruptions for Central Dade
6-1 Average precipitation difference

PAGE 11

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER

By Roop Kishore R. Matavalam

August 2004

Chair: Alexander Domijan, Jr.
Major Department: Electrical and Computer Engineering

The system principles that are used to design and maintain electric power distribution grids (distinct from transmission grids) are intended to minimize the number and duration of power disturbances, including interruptions. Many of these principles, such as load flow and load prediction, are well understood and have been refined over many years. However, the impact of local weather conditions on power distribution grids has not been well researched. The current thesis is intended to improve our understanding of the effects of weather on power distribution systems and to develop tools for the prediction of weather-related interruptions. Developing and disseminating this information will allow electric power engineers to ultimately improve our nation's power distribution capabilities.

The current research also presents the novel concept of predicting the number of power interruptions in a distribution system using weather parameters. Preliminary results show that it is possible to define and build a model, using artificial neural networks (ANNs), which can use weather parameters as inputs and predict the number of

PAGE 12

interruptions with reasonable accuracy. The accuracy of the prediction depends, in part, on the accuracy of the weather data that are used in the model and, in part, on the precision of the model. It is expected that the use of real-time surface weather data, such as can be collected from well-sited weather stations, will eliminate the uncertainty inherent in weather data collected from geographically distant sources.

PAGE 13

CHAPTER 1 INTRODUCTION

For any power company, providing reliable electric service is the number-one priority. Unfortunately, power interruptions are sometimes unavoidable. The contribution of weather to these interruptions is significant, as this thesis will show.

The terms "power outage" and "power interruption" are often used interchangeably to mean loss of power supply, even by many people in the power industry. There is, however, a fine difference between the two terms, as stated in the IEEE 1366 standard [1].

An outage is defined in IEEE 1366-2001 as: "The state of a component when it is not available to perform its intended function due to some event directly associated with that component." Notes: (1) An outage may or may not cause an interruption of service to customers, depending on system configuration. (2) This definition derives from transmission and distribution applications and does not apply to generation outages.

An interruption is defined in IEEE 1366-2001 as: "The loss of service to one or more customers connected to the distribution portion of the system." Note: It is the result of one or more component outages, depending on system configuration.

In FPL standards, the loss of power supply is defined in two ways, based on the duration of the power disturbance to the customer:

Momentary Interruption: Single operation (open-close) of an interrupting device which results in zero voltage for a period of 59 seconds or less.

PAGE 14

Sustained Interruption: Power loss to the customer that has lasted for at least one minute.

Note: In the current thesis, "interruptions" means sustained interruptions, and N represents the total number of sustained interruptions per day.

1.1 Importance of Power Reliability

Because of the huge financial losses to the economy associated with power interruptions, in addition to the loss of customer satisfaction, power reliability is one of the most important concerns for electric utilities. Power reliability modeling and indexing are among the tools used by utilities to manage costs and monitor equipment performance, and improvements in the flexibility of reliability assessment models will ultimately result in increased savings. According to Contingency Planning Research Company's annual study [2], downtime caused by power disturbances results in major financial losses, as shown in Figure 1-1.

[Figure 1-1: chart on a logarithmic dollar axis from $1 to $10,000,000; business sectors shown: retail brokerage, credit card sales authorisation, home shopping channels, airline reservation centers, package shipping service, manufacturing, banking, transport]

Figure 1-1. Average hourly impact of downtime and data loss by business sector

Traditionally, the reliability of an electric distribution system is measured in terms of the total number of power interruptions that occurred.
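The momentary/sustained distinction and the daily count N defined earlier in this chapter can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the record layout (date, duration in seconds) and the sample records are assumptions made for the example, not the format of the FPL data.

```python
from collections import Counter
from datetime import date

# Threshold taken from the FPL definitions quoted in this chapter:
# 59 seconds or less is momentary, one minute or more is sustained.
SUSTAINED_THRESHOLD_S = 60


def is_sustained(duration_s):
    """Return True if an interruption lasted at least one minute."""
    return duration_s >= SUSTAINED_THRESHOLD_S


def daily_sustained_counts(records):
    """Count sustained interruptions per day (the quantity called N).

    `records` is an iterable of (day, duration_s) pairs; this layout
    is hypothetical and chosen only for the example.
    """
    return Counter(day for day, duration_s in records if is_sustained(duration_s))


# Made-up sample records, for illustration only.
records = [
    (date(2001, 7, 4), 12),     # momentary
    (date(2001, 7, 4), 59),     # momentary
    (date(2001, 7, 4), 60),     # sustained
    (date(2001, 7, 4), 5400),   # sustained
    (date(2001, 7, 5), 300),    # sustained
]
n_per_day = daily_sustained_counts(records)
```

With these sample records, two of the four events on the first day count toward N and the two momentary events are ignored.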

PAGE 15

Currently a variety of indices are available to measure the reliability of a power utility. These indices serve as a yardstick within a utility to see how well it is doing each year, and as a tool to compare its performance with that of other utilities of different sizes in the country.

1.2 Understanding Power Reliability Indices

Among the many power distribution reliability indices available [3], the following are some of the most widely used customer-based indices. In the equations, N_i is the number of customers interrupted by interruption event i, r_i is the restoration time for that event, and N_T is the total number of customers served.

SAIFI: System average interruption frequency index (sustained interruptions). This index is designed to give information about the average number of sustained interruptions per customer over a predefined area. In words, the definition is

$$\mathrm{SAIFI} = \frac{\text{Total number of customer interruptions}}{\text{Total number of customers served}}$$

Mathematically it is represented as

$$\mathrm{SAIFI} = \frac{\sum N_i}{N_T}$$

SAIDI: System average interruption duration index. This index is commonly referred to as customer minutes of interruption or customer hours, and is designed to provide information about the average time the customers are interrupted.

$$\mathrm{SAIDI} = \frac{\sum \text{Customer interruption durations}}{\text{Total number of customers served}}$$

Mathematically it is represented as

$$\mathrm{SAIDI} = \frac{\sum r_i N_i}{N_T}$$

CAIDI: Customer average interruption duration index. CAIDI represents the average time required to restore service to the average customer per sustained interruption. In words, the definition is

$$\mathrm{CAIDI} = \frac{\sum \text{Customer interruption durations}}{\text{Total number of customer interruptions}}$$

PAGE 16

Mathematically it is written as

$$\mathrm{CAIDI} = \frac{\sum r_i N_i}{\sum N_i} = \frac{\mathrm{SAIDI}}{\mathrm{SAIFI}}$$

ASAI: Average service availability index. This index represents the fraction of time (often expressed as a percentage) that a customer has power provided during one year or the defined reporting period. In words, the definition is

$$\mathrm{ASAI} = \frac{\text{Customer hours service availability}}{\text{Customer hours service demand}}$$

To calculate the index, the following equation is used:

$$\mathrm{ASAI} = \frac{N_T \times (\text{No. of hours/year}) - \sum r_i N_i}{N_T \times (\text{No. of hours/year})}$$

Note that there are 8760 hours in a regular year and 8784 in a leap year.

Some of the other customer-based reliability indices in use are CTAIDI, ASAI etc. Load-based sustained indices include ASIFI, ASIDI etc. Momentary indices include CEMSMIn, AMAIFI and MAIFIE. The newest indices are CEMIn (customers experiencing multiple interruptions) and CEMSMIn (customers experiencing multiple sustained interruptions and momentary interruption events). Use of CEMIn as a basis for performance measurement is under consideration [3] in many states in the USA. In Florida, use of the CEMI5 index is under consideration. If this value is exceeded, the commission is considering fines that would be paid to the customers who experienced poor performance.

A snapshot of the percentage usage of different reliability indices, compiled by the IEEE working group on system design [1], is shown in Figure 1-2. Surveys showed that the most commonly used indices are SAIFI, SAIDI, CAIDI, and ASAI.
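To make the four most common indices concrete, the following minimal Python sketch computes SAIFI, SAIDI, CAIDI and ASAI from a list of interruption events. The class name, function name and sample numbers are hypothetical, and restoration times are assumed to be recorded in minutes, so they are converted to hours before entering the ASAI formula.

```python
from dataclasses import dataclass


@dataclass
class Interruption:
    customers: int    # N_i: customers interrupted by event i
    minutes: float    # r_i: restoration time of event i, in minutes (assumed unit)


def reliability_indices(events, customers_served, hours_in_period=8760):
    """Compute SAIFI, SAIDI (minutes), CAIDI (minutes) and ASAI."""
    customer_interruptions = sum(e.customers for e in events)          # sum of N_i
    customer_minutes = sum(e.customers * e.minutes for e in events)    # sum of r_i * N_i

    saifi = customer_interruptions / customers_served
    saidi = customer_minutes / customers_served
    # CAIDI is undefined when nothing was interrupted; report 0.0 in that case.
    caidi = customer_minutes / customer_interruptions if customer_interruptions else 0.0

    demand_hours = customers_served * hours_in_period
    # Convert customer-minutes to customer-hours for the ASAI numerator.
    asai = (demand_hours - customer_minutes / 60.0) / demand_hours
    return {"SAIFI": saifi, "SAIDI": saidi, "CAIDI": caidi, "ASAI": asai}


# Made-up events for a system serving 10,000 customers, for illustration only.
events = [
    Interruption(customers=200, minutes=90.0),
    Interruption(customers=50, minutes=30.0),
    Interruption(customers=1000, minutes=120.0),
]
indices = reliability_indices(events, customers_served=10_000)
```

For the sample data, SAIFI is 0.125 interruptions per customer and SAIDI is 13.95 minutes per customer, and CAIDI equals SAIDI divided by SAIFI, as the definition requires.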

PAGE 17

Figure 1-2. Percentage of companies using a given index reporting in 1990

1.3 Purpose and Importance of this Thesis

It is clear from the indices discussed above that the total number of interruptions must be reduced in order to improve the reliability of a distribution system. Although many parameters and conditions are responsible for these interruptions, weather is a major contributor. Research carried out by the Electric Power Research Institute [4] showed the effects of weather components such as lightning, rain and wind on transmission lines (345 kV and above). However, little comparable work has been done for distribution lines, which motivated the research in this direction.

The purpose of this thesis was to study the impact of normal weather conditions on distribution power interruptions and to develop correlation models. The study also covers how this correlation knowledge can be used to reduce power interruptions by incorporating artificial neural networks (ANNs). Using weather and interruption data, novel prediction tools and modeling methods were developed; a similar approach can be applied to any electric utility in the country. With the

PAGE 18

provision of the novel tools developed in this thesis, any power utility would be able to estimate the approximate total number of interruptions that might happen in the future due to weather. The accuracy of the prediction of interruptions depends, in part, on the accuracy of the forecast weather data that serve as input to the model and, in part, on the accuracy of the model. It is expected that the use of real-time surface weather data, such as can be collected from well-sited weather stations, will eliminate the uncertainty inherent in weather data collected from geographically distant sources.

Despite a thorough search of the available literature, examples of the use of surface weather data in the construction of power reliability models have not been found. It is expected that this project will contribute significantly to the existing literature by providing predictive models, as well as background in a previously unexplored area. Moreover, this research is valuable for the exploration of proper ANN structure, internal parameters and feature pattern extraction methods for application to power systems.

1.4 Organization of Thesis

The thesis comprises six chapters. The current chapter gives the motivation, the literature survey and the importance of the project. The second chapter explains the behavior of the weather conditions prevalent in the state of Florida and gives a broad overview of the different weather parameters. The third chapter deals with the Florida Power & Light (FPL) system and with the weather and interruption data considered in the analysis. The fourth chapter presents the approach followed in developing correlation models between weather parameters and power interruptions using statistical tools. The fifth chapter presents a novel model using neural networks that can be used to predict power interruptions based on the forecasted

PAGE 19

weather parameters. The last chapter presents the limitations of the research results and the conclusions of the thesis, followed by future work.

PAGE 20

CHAPTER 2 UNDERSTANDING FLORIDA WEATHER

Because weather in Florida is highly variable, this chapter explains the weather parameters most common in Florida and their impact on power distribution lines. Florida is also known as the lightning capital of the world. The natural weather phenomena that can reduce the strength [4] of insulators in the state of Florida are air density, temperature, pressure, humidity, rain, dew or condensation of humidity, pollution, wind and lightning.

2.1 Air Density, Temperature and Pressure

Temperature in the state of Florida varies between 20 °F and 100 °F, and pressure does not change significantly. Their influence on the strength of the insulators, absent abnormal conditions such as fire, is around 5%.

2.2 Humidity

Humidity in Florida changes significantly throughout the year and throughout the day. Usually, humidity is very high between midnight and 6 or 7 A.M., after which it starts to decrease; it starts to increase again at night. Humidity without condensation can affect the strength of the insulator by around 16%.

2.3 Rain

Rain can be classified into weak rain (mist or drizzle) and strong rain (rainfall). The influence of rain on insulator strength depends on its intensity and direction. The maximum influence of rain [4] on clean insulators is around 30%.

PAGE 21

Mist or drizzle can have a stronger impact on flashover when it combines with pollution on the insulator strings.

2.4 Dew or Condensation of the Humidity

When the temperature decreases to the dew point, condensation starts on the surface of the insulators and a thin layer of water appears. Condensation can have the same effect as mist on an insulator.

2.5 Pollution

Pollution can reduce the strength of an insulator. Its influence on flashover depends on the type of contamination and its concentration on the surface of the insulators. Two types of contamination are of concern here: spot contamination and area contamination. In Florida the distribution of contamination [4] is as shown in Figure 2-1. Black dots refer to spot contamination and shaded regions refer to area contamination.

Figure 2-1. Regions with strong contamination (UHV project)

Table 2-1 shows the types of contamination that cause flashover. In dry conditions most of them are not good conductors; in wet conditions due to condensation, drizzle, mist or rain, however, their conductivity increases substantially.

PAGE 22

Table 2-1. Type of Contaminant and Atmospheric Conditions at the Time of Contamination Flashover (UHV Project)

Pollution combined with condensation and rain can be considered the worst condition for reduction of insulator strength. Moreover, dew, drizzle and mist are the most important weather components at the time of flashover, accounting for 72% of the cases [4]. Rain has two opposing effects. On one hand, it reduces the insulators' withstand values. On the other hand, it cleans the surface of the insulators, thereby protecting the system from new flashovers due to dew (condensation), mist and pollution.

2.6 Wind

Special weather conditions such as storms, thunderstorms, hurricanes and tornadoes have a direct effect on power interruptions. However, these events are a combination of rain, wind and lightning. Therefore, individual analysis of each of these weather

PAGE 23

components is required in order to study the real cause behind the flashover. The influence of wind speed on swing angle, and therefore on the minimum clearance required to avoid flashover, can be evaluated. Wind can cause catastrophic mechanical damage due to asynchronous movement of the cables and/or insulators. This damage is more significant in transmission lines, where the span is larger. Figure 2-2 shows the swing angle of a single conductor vs. wind speed [4].

Figure 2-2. Swing angle as a function of instantaneous wind speed at tower

Distribution lines have small spans, so the asynchronous movement of the conductors usually causes insignificant disturbance except at high wind speeds. The most significant wind-related disturbance comes from the movement of trees and their branches. Untrimmed trees can touch the lines and cause a flashover leading to an outage. Figure 2-3 shows uncut trees and branches touching distribution lines in Gainesville, Florida.

PAGE 24

Figure 2-3. Vegetation effects on power interruptions (panels a to d)

2.7 Lightning

The number of days with thunderstorms per year in Florida is between 80 and 100, as shown in Figure 2-4 (isokeraunic level). As Figure 2-4 shows, Florida is strongly affected by atmospheric discharges. The number of lightning strokes to earth per square mile per year can be found through the expression

$$N = 0.25\,I$$

where I is the local isokeraunic level.
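As a quick worked example, applying this expression to the two ends of the 80 to 100 thunderstorm-day range quoted above gives the corresponding range of stroke densities, in strokes per square mile per year:

$$N = 0.25 \times 80 = 20, \qquad N = 0.25 \times 100 = 25$$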

PAGE 25

If a line has a shadow width W, the number of lightning strokes expected to hit it per year is

$$N_L = 0.25\,I\,L\,W/5280$$

where W = b + 4h, L is the length of the line in miles, h is the height of the shield (ground) wires and b is the distance between shield wires. If the line does not have shield wires, h is the height of the conductors and b the distance between the outer conductors. Strokes to transmission or distribution lines with shield wires are expected to cause a back flashover. The voltage across the insulator string in this case depends on the tower footing resistance, the current through the tower and the coupling coefficient between shield wire and phase conductor.

Figure 2-4. Number of days with thunderstorms in Florida (US Weather Bureau)

Thus, flashovers or interruptions due to lightning depend on the tower footing resistance and on the intensity of the current. The current cannot be controlled, so all control must be exercised through the tower footing resistance. Distribution lines without shield wires are directly affected by lightning. The voltage across the string or insulator depends on the intensity of the current and the magnitude of the surge

PAGE 26

impedance of the line. Figure 2-5 shows the crest amplitude of the strokes against the probability of occurrence. The probability that a stroke has a current of at least 5 kA is almost 1. On striking the conductor, this current divides into two halves. Thus, the voltage across the insulators or strings of distribution lines with surge impedance between 150 Ω and 250 Ω will be roughly between 375 kV and 625 kV. Thus, in most cases distribution lines up to 69 kV (BIL up to 350 kV) will flash over for practically every stroke to them.

Figure 2-5. Cumulative frequency distribution of peak current amplitudes in downward negative flashes
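The lightning expressions in this section can be checked numerically with a short Python sketch. It is a sketch under stated assumptions: the function names are hypothetical, the example line dimensions are made up, and W, b and h are assumed to be in feet, which is what the division by 5280 (feet per mile) implies.

```python
FEET_PER_MILE = 5280.0


def strokes_to_line_per_year(isokeraunic_level, length_miles, spacing_ft, height_ft):
    """Expected strokes per year to a line: N_L = 0.25 * I * L * W / 5280.

    W = b + 4h is the shadow width; b and h are assumed to be in feet.
    """
    shadow_width_ft = spacing_ft + 4.0 * height_ft
    return 0.25 * isokeraunic_level * length_miles * shadow_width_ft / FEET_PER_MILE


def insulator_voltage_kv(stroke_current_ka, surge_impedance_ohm):
    """Voltage across the insulator for a direct stroke to a conductor.

    The stroke current splits into two halves travelling in opposite
    directions, so V = (I / 2) * Z; kA times ohm gives kV.
    """
    return (stroke_current_ka / 2.0) * surge_impedance_ohm


# Illustrative line: 10 miles long, outer conductors 8 ft apart, 30 ft high,
# in an area with an isokeraunic level of 90 (all values are made up).
expected_strokes = strokes_to_line_per_year(90, 10.0, 8.0, 30.0)

# The 5 kA stroke and the 150 to 250 ohm surge impedance range from the text.
v_low_kv = insulator_voltage_kv(5.0, 150.0)
v_high_kv = insulator_voltage_kv(5.0, 250.0)
```

The two voltages reproduce the 375 kV and 625 kV figures quoted in the text, both of which exceed the 350 kV BIL mentioned for distribution lines up to 69 kV.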

PAGE 27

CHAPTER 3
SYSTEM UNDER STUDY: FLORIDA POWER & LIGHT

The preliminary results cited in this thesis were obtained while working on a project sponsored by Florida Power & Light (FPL). The concept, however, can be applied to any system and can be enhanced as explained in this thesis. FPL's power distribution interruption data was used to study the effects of weather on the occurrence of interruptions.

3.1 Description of the FPL distribution system

FPL is among the largest and fastest growing electric utilities in the United States. As of December 2002, FPL had 9,612 employees serving nearly 8 million people, or about half the state of Florida. Power is delivered (Table 3-1) safely and reliably from 86 generating units with a total generation capability of 26,203 MW, through more than 500 substations and more than 69,000 miles of transmission and distribution lines [5].

Table 3-1. FPL Power Sales by Sectors

Sector        Number of Accounts*   Share of Total Sales (kWh)
Residential   3,521,146             50.9%
Commercial    430,471               40.6%
Industrial    15,248                4.4%
Other**       2,746                 4.1%

* Monthly average as of December 2001
** Includes public authorities, railway, wholesale and interchange.

The FPL distribution system is divided into 16 management areas grouped under two regions, urban and suburban (Table 3-2). Figure 3-1 shows the location and area covered by each of the FPL distribution management areas (MAs).

PAGE 28

Table 3-2. FPL Distribution Management Areas along with Their Dispatch Centers

REGION    URBAN                                                           SUBURBAN
DISPATCH  West Palm Beach Dispatch Center  South Florida Dispatch Center  Daytona Dispatch Center  Sarasota Dispatch Center
AREAS     TC-Treasure Coast                PM-Pompano                     NF-North Florida         MS-Manasota
          WB-West Palm                     WG-Wingate                     CF-Central Florida       TB-Toledo Blade
          BR-Boca Raton                    GS-Gulfstream                  BV-Brevard               GC-Gulf Coast
                                           ND-North Dade
                                           WD-West Dade
                                           CE-Central
                                           SD-South Dade

(a)

PAGE 29

(b)
Figure 3-1. Snapshot of the Florida map. (a) FPL distribution management areas. (b) Weather stations chosen within the FPL area

3.2 Interruption Data from FPL

Power interruption data was obtained primarily from Florida Power & Light (FPL). FPL has divided its service territory into 16 sections called Management Areas (MAs), and an interruption data file was created for each of these 16 MAs. The interruption data was made available to us in a data storage program known as Power-Play. For clarity, interruptions are classified into groups (Figure 3-4) based on the causes responsible for them; for example, interruptions caused by squirrels are represented under the category of squirrel cause

PAGE 30

code (007). A cause code indicates the principal cause of an interruption. This makes it possible to isolate the interruptions due to any specific cause, as listed in Table 3-3.

Table 3-3. FPL Cause Codes
(102) Table, REV 6/16/98, SC CAUSE CODES

Natural Causes
  001-E Lightning, with equip. damage; 002 Storm w/no equip. damage; 003-E Fire; 004-E Salt Spray Corrosion; 007 Squirrel; 009 Bird; 011 Other Animal; 013 Tornado; 014 Hurricane; 015 Ice on Lines; 020 Tree/Limb Preventable; 021 Tree/Limb Unpreventable; 023-E Decay/Deterioration; 024-E Corrosion (Non Salt Spray); 025 Vines/Grass; 026-E Contamination (Non S/S)
(no heading in source)
  187-E Equipt. Failed, Cause Unk; 190 Unknown
Other Causes
  170 Wrong size fuse; 171 Overloaded Device; 178 Non-standard Construction; 183 Improper Installation; 191-E Vandalism; 193 Customer Request; 195 Crew Request (Planned); 196 Slack Conductors; 197 Other (explain); 202-E Loose Connection
Accidental Causes
  040 Vehicle; 041 Accidental Contact; 046 Switching Error; 079 Dig-In (Proper Depth)
Equipment Codes (Required for all interruptions)
  Overhead: 080 Down Guy or Anchor; 081 Pole; 082 Cross Arm; 083 Insulator; 084 Pole Top Pin; 087 Tie Wire; 088 Jumper; 089 Stirrup; 090 Hot Line Clamp; 092 Disconnect Switch; 093 Fuse Switch; 096 Line OCR; 097 Line Capacitor; 098 Line Regulator; 104 Conductor Down; 105 Conductor Damaged
  Underground: 110 Terminator; 111 Cable; 113 Elbow; 114 Tx Fuse Switch; 115 Tx Blade Switch; 116 Bayonet Switch; 121 Padmount Switch; 122 Oil Fuse Cutout; 123 RA Switch; 124 Mech for Throwover Sw.; 125 PT Fuse; 126 Conduct CKT Fuse; 127 Control Cable; 132 Handhole; 134 Bushing; 135 Pothead; 211 Injected Cable
  Overhead or Underground: 085 Arrester; 091 Connector; 094 Transformer; 095 Step Down Transformer; 102 Other Equipment; 103 Splice; 106 Automated Switch (DA)
  Substation: 140 OCB (Feeder Breaker); 141 Regulator; 142 Reactor; 143 Relay; 147 Step Down Transformer; 148 Other Substation Equip.; 150 SCADA; 151 Telecommunications
  Meter: 160 Meter; 161 Meter Blocks, Repairable; 162 CT's; 163 PT's; 164 Other Meter Equipment; 165 Meter Blocks-Not Reparable
  200 Transmission related
Support and Follow-Up Codes (Codes to be used as Support or Follow-up Only)
  012 No Animal Guard; 022 Palm Tree; 050 Foreign Crew or Customer; 066 FPL Crew; 067 FPL Distribution Contractor; 068 FPL Line Clearing Contractor; 069 Transmission Contractor; 075 Improper Depth; 100 Inadequate/No Ground; 192 Crew Request (Forced Outage); 199 Defective Material UPR; 222 Power Temp Used; 240 Injection Elbow (Installed); 241 Injection Elbow (Not Installed); 242 Flow (Positive); 243 Flow (None); 244 Injection Comp; 245 Injection Job Pndng; 250 Cable Replace Job Pending; 251 Cable Replace Job Comp; 260 Fault Locator Used; 265 Cleared by Phone; 271 Injected (8/96 on); 272 Replaced (8/96 on); 299 Data Corrected; 999 Named Storm Exclusions

Note: The suffix "E" denotes that an Equipment Code is required. Do Not Enter "E" on TCMS.

Weather Related Codes
                   LIGHTNING PRESENT   NO LIGHTNING PRESENT
EQUIP DAMAGE       001-E               187-E
NO EQUIP DAMAGE    002                 190

The interruption data consists of daily totals broken down by cause code. For example, a specific substation may experience three interruptions on September 29, 1998, due to cause code 093 (fuse switch). Each data point is small, but the compilation of many years and many areas provides a statistically significant sample. The interruptions of interest to us were further defined by the following filters:

  With Exclusions: This filter suppresses interruption data that FPL defines as exclusionary, including hurricane and tornado damage. We use this filter because we are interested in the effects of normal weather conditions.
  Overhead: This filter includes only those interruptions that are caused by faults in overhead equipment or lines. Underground lines were considered immune to most weather conditions.

PAGE 31

  Internal Distribution: Only interruptions located in the distribution system were taken into account.
  Primary: Only primary systems (feeders, laterals and oil circuit breakers) are within the scope of this research.
  Substation: Each substation reports all the interruptions occurring in the secondary distribution system it supplies.
  Cause code: FPL uses numeric cause codes to specify the causes of interruption. General categories include natural causes, equipment and accident.
  Dates: All relevant days from January 1, 1998 through December 31, 2001 were considered. Up-to-date interruption data can be obtained on request from FPL.

Assumptions about the FPL system: An important consideration in choosing FPL is that they have assured us that their equipment can be assumed to be homogeneous throughout their area of operations (AO). Homogeneity of equipment is a necessary condition for statistically significant results.

Scope of the current thesis: Since FPL personnel have already carried out a correlation analysis between lightning strikes and power interruptions, they are interested in the indirect effects of weather, including wind, temperature and rain, on the total number of power interruptions. The scope of the current research is therefore limited to these parameters.

Although interruptions represent only 3% to 5% [6-8] of the frequency of disturbances, a common method for measuring the reliability of an electric distribution system is based on the number of customers interrupted, which is proportional to the number of interruptions, as explained in Chapter 1. Let us revisit the definition of SAIFI, a reliability index that FPL uses frequently. IEEE Standard 1366 defines the

PAGE 32

System Average Interruption Frequency Index (SAIFI) with the following formula:

$$\mathrm{SAIFI} = \frac{\sum_{i=1}^{N_i} C_i}{C_b}$$

where
  $N_i$ = number of interruptions (sustained interruptions lasting over 1 minute)
  $C_i$ = customers interrupted for each interruption
  $C_b$ = customer base or customers served

Figure 3-2. FPL's historical SAIFI performance (x-axis: years, J = January, D = December; y-axis: SAIFI)

SAIFI indicates how often the average customer experiences a sustained interruption (> 1 min) over a predetermined period of time, and it has special importance in decision making for engineers working in distribution reliability. A typical breakdown

PAGE 33

by significance of the major causes of customer interruptions and of the number of interruptions in the FPL distribution system under study, for a period of 12 months, is shown in Figure 3-3.

[Figure 3-3 charts: (a) Customers Interrupted, 2001; (b) Number of Interruptions, 2001. Each chart shows counts and percentages by major cause: Weather (Storm + Lightning), Equipment Failure, Vegetation, Unknown, Request, Corrosion/Decay, Animal, Improper Process, Other, Accident.]

Figure 3-3. Frequency charts of interruptions and customers affected by interruptions. (a) number of customers interrupted vs. causes and, (b) number of interruptions
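
The SAIFI definition quoted earlier from IEEE Standard 1366 can be sketched in a few lines of Python. This is only an illustration: the function name and the customer counts in the example are hypothetical and are not FPL data.

```python
def saifi(customers_interrupted, customers_served):
    """SAIFI = (sum of C_i over all sustained interruptions) / C_b."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    return sum(customers_interrupted) / customers_served

if __name__ == "__main__":
    # Hypothetical period: three sustained interruptions on a 10,000-customer system.
    example_interruptions = [1200, 300, 4500]
    print(saifi(example_interruptions, 10_000))
```

Because the numerator sums customers over interruptions, a single event affecting many customers weighs as much as many small events, which is why the customer-based and count-based charts of Figure 3-3 can rank causes differently.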

PAGE 34

The previous graphs show the relative importance of the direct effects of weather (storms and lightning) on interruptions, but not the indirect effects of temperature, rain, wind, etc. From these charts it is not possible to determine whether interruptions associated with, for example, vegetation or equipment are indirectly affected by weather conditions such as temperature, rain or wind. The current thesis shows that normal weather conditions do have effects on interruptions and that those effects can be quantified. The benefits of this type of study are the ability to explain trends in SAIFI due to weather conditions and support for the development of indicators that could be used to anticipate interruptions.

3.3 Meteorological Weather Data from NCDC

We collected daily average weather data for rain, temperature, wind speed and other parameters from Automated Surface Observation Stations (ASOSs) located at airports throughout the area of operations (AO). Construction of these stations began in 1981 as an aid to air navigation, and they have since become the most comprehensive source of weather data in the United States. For the stations we are interested in, we use data from 1996 through 2002. As we are an educational institution, the National Climatic Data Center (NCDC), a department of the National Oceanic and Atmospheric Administration [9], makes this data available to us free of charge.

The greatest difficulty in collecting these data is their sheer volume. Six years of data from one ASOS generate a file containing 24 columns and 2190 lines. Stacking the files from all the ASOSs in the AO generates a composite file with more than 20,000 lines. To add to this problem, there are missing days, missing data points and formatting that cannot be imported by the statistical program of our choice. A final editing of the data was done by brushing (taking out) those data points that do not make sense: data points

PAGE 35

with zero sea-level barometric pressure, zero average temperature, a two-minute maximum wind gust of 100 mph, etc. To address these problems, we wrote programs in C/C++ that correct the omissions and convert the NCDC files to a generic text file that can be imported by any commercial spreadsheet program presently in use. Since we anticipate the use of ASOS data by any power engineer using our methods, file conversion is required by our objective.

3.4 Weather Parameters of Interest

Although many weather parameters are available in the weather files downloaded from the NCDC website, we used only those parameters that are of interest. A weather data file was created for each of the 15 ASOSs (one particular ASOS covered two regions). The following daily weather parameters were downloaded from the NCDC database:

  Average temperature
  Maximum temperature
  Minimum temperature
  Average dew point
  Significant observations
  Total rainfall
  Barometric pressure (sea level and station)
  Average wind speed
  Two-minute maximum sustained wind gust
  Five-second maximum sustained wind gust

Weather is a complex combination of many parameters including, but not limited to, wind, lightning, condensation, temperature, rain, pressure, humidity, cosmic dust, solar storms, hurricanes and storms, and the list grows further if all meteorological terms are included, some of which we are not even aware of. Fortunately, if only the daily prevailing weather conditions are considered, many parameters can be neglected by placing

PAGE 36

them under the category of extreme weather conditions, i.e., conditions that are not common daily weather parameters, such as hurricanes, storms and lightning. Therefore, the major focus was on weather parameters such as wind, temperature, rain, pressure and humidity, which are not extreme weather conditions. Among these common weather parameters, only wind, rain and temperature are investigated, because of their dominant role [10] in power distribution interruptions.
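
The data brushing described in Section 3.3 can be sketched as follows. The thesis programs were written in C/C++ and are not reproduced here; this Python version is only a minimal illustration. The record field names (`sea_level_pressure`, `avg_temp`, `gust_2min_mph`) are hypothetical stand-ins for the corresponding NCDC columns, and dropping records with missing fields is an assumption, since the original programs may have handled omissions differently. The rejection rules mirror the examples given in the text.

```python
def is_plausible(record):
    """Return False for daily records matching the implausible patterns named in the text."""
    if record.get("sea_level_pressure") in (None, 0):
        return False  # missing or zero barometric pressure
    if record.get("avg_temp") in (None, 0):
        return False  # missing or zero average temperature
    gust = record.get("gust_2min_mph")
    if gust is None or gust >= 100:
        return False  # missing gust, or 100 mph and above (threshold is an assumption)
    return True

def brush(records):
    """Keep only the plausible daily records."""
    return [r for r in records if is_plausible(r)]

if __name__ == "__main__":
    # Hypothetical daily records, not actual ASOS data.
    days = [
        {"sea_level_pressure": 30.01, "avg_temp": 78, "gust_2min_mph": 14},
        {"sea_level_pressure": 0, "avg_temp": 80, "gust_2min_mph": 10},
        {"sea_level_pressure": 29.90, "avg_temp": 0, "gust_2min_mph": 12},
        {"sea_level_pressure": 29.80, "avg_temp": 75, "gust_2min_mph": 100},
    ]
    print(len(brush(days)), "of", len(days), "records kept")
```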

PAGE 37

CHAPTER 4
CORRELATION OF WEATHER AND INTERRUPTIONS

The consequences of power interruptions can range from mild or severe inconvenience, such as missing a favorite TV show or losing critical data, to life-threatening situations, such as the failure of traffic signals. Less obvious consequences include increased cost to the customer due to increased maintenance and repair costs for the provider. Because of these consequences, power engineers are always researching methods to reduce the number of interruptions.

The first step in reducing interruptions is to define the causes and quantify their effects. Accidents, human error and aging equipment contribute a great deal, but weather is still the largest single cause, although its effects are not as well understood as we would like to think. We can all agree that adverse weather conditions cause power interruptions. The evidence is apparent: when a bad thunderstorm hits or a hurricane arrives, many people experience power interruptions, and those who do not hear about them on the news. Lightning strikes, especially common in Florida, create transients that overload transformers and trip fused circuit breakers, both conditions requiring a repair crew to restore power. High winds blow down trees, damaging conductors.

Less apparent is the effect of normal weather on the frequency of power interruptions. Several days of moderate rain can saturate the ground, invading buried lines and causing short circuits (FPL study). An unexpectedly warm season can promote

PAGE 38

vegetation growth, causing interruptions due to contact between tree limbs and conductors. These and other effects of normal weather are not easily defined because there is no one-to-one relationship such as "lightning hit the line, so the transformer blew." In fact, many preventable interruptions are not properly attributed to weather because of the lack of such one-to-one relationships.

4.1 Importance of Statistical Tools

Part of the interest of this project is to find the relationship between the number of interruptions and normal weather conditions. Both future interruptions and future weather conditions are random. A complete prediction of the number of future interruptions would require predicting the future weather conditions and then predicting the number of interruptions based on them. However, because of limited resources and the difficulty of weather prediction, we restrict ourselves to a conditional prediction of the number of interruptions, assuming that the weather conditions are known.

In this subsection, we describe the probabilistic characteristics of daily interruption frequencies and of sums of daily interruption frequencies, i.e., monthly interruptions or the sums of interruptions on days when it rains and on days when it does not. An explanation of plausible statistical data analysis techniques for each case follows.

The daily outage frequencies take only nonnegative integer values and are strongly skewed to the right. For example, the daily number of interruptions caused by tree limbs (Cause Codes 20 and 21) ranges from 0 to 58, but 99.5% of the frequencies are less than or equal to 5 (Table 4-1). Therefore, statistical techniques based on the normal distribution, such as the t-test, normal linear regression and ANOVA, generate large biases in the confidence intervals of estimates and lead to wrong conclusions in the search for

PAGE 39

significant weather effects. As a reminder, the normal distribution is symmetric and has the range $(-\infty, +\infty)$.

Table 4-1. The Frequency of Interruptions due to Tree Limbs (Cause Codes 20 & 21)

Number of Outages   Frequency   Percent   Cumulative Percent
0                   15890       81.16     81.16
1                   2582        13.19     94.35
2                   632         3.23      97.57
3                   223         1.14      98.71
4                   102         0.52      99.23
5                   53          0.27      99.50
6                   38          0.19      99.70
>6                  59          0.30      100.00

The nature of the weather data sets is first evaluated to understand the behavioral patterns of the weather parameters. The parameters of interest are wind, temperature and rain.

4.2 Probabilistic Characteristics of Data Distributions

As a prelude to presenting results, I describe the probabilistic characteristics of the data set in order to determine which statistical data analysis techniques and models should be used to correlate weather parameters with power interruptions. Daily interruption frequencies (for all cause codes) take only nonnegative integer values, from 0 to 200, and are skewed to the right, as can be seen in Figure 4-1. Rain data is even more strongly skewed to the right (Figure 4-2), while the wind speed histogram (Figure 4-3) is close to a normal distribution.
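
The percentage columns of Table 4-1 can be reproduced from the raw frequencies with a short Python sketch. The variable and function names are illustrative; the counts are those listed in the table, with the last bucket standing for more than 6 outages.

```python
# Daily tree-limb interruption counts from Table 4-1 (last bucket is ">6").
FREQUENCIES = [("0", 15890), ("1", 2582), ("2", 632), ("3", 223),
               ("4", 102), ("5", 53), ("6", 38), (">6", 59)]

def percent_table(frequencies):
    """Return rows of (label, count, percent, cumulative percent)."""
    total = sum(count for _, count in frequencies)
    rows, running = [], 0
    for label, count in frequencies:
        running += count
        rows.append((label, count, 100.0 * count / total, 100.0 * running / total))
    return rows

if __name__ == "__main__":
    for label, count, pct, cum in percent_table(FREQUENCIES):
        print(f"{label:>3} {count:>6} {pct:6.2f} {cum:7.2f}")
```

The cumulative column makes the right skew explicit: more than 81% of the days have no tree-limb interruption at all, and about 99.5% have five or fewer.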

PAGE 40

Figure 4-1. N for all management areas from 1998 to 2001 using the previous filters (histogram; x-axis: N, y-axis: frequency)

Figure 4-2. Rain (inches) for all management areas from 1998 to 2001 (histogram; x-axis: rain, y-axis: frequency)

PAGE 41

Figure 4-3. Wind: two-minute maximum speed (mph) for all management areas from 1998 to 2001 (histogram; x-axis: 2MMaxS, y-axis: frequency)

Because a normal distribution is symmetric and a normal random variable is continuous over the range $(-\infty, +\infty)$, whereas the interruption counts are skewed nonnegative integers, these probabilistic characteristics must be described using the Poisson distribution.

4.3 Correlation Analysis between Weather Parameters and Interruptions

The statistical correlation models between the weather parameters of interest (wind, temperature and rain) and the total daily number of power interruptions (N) were studied.

4.3.1 Impact of Temperature on N

In this section, the impact of daily temperature variations on the power interruptions due to transformer failures is studied. The monthly averages (means) of the maximum temperatures were plotted on the X-axis and the monthly means of the total number of interruptions due to transformer failures on the Y-axis, for 4 years (1998-2001) and all the MAs. Under these conditions, the total number of monthly data points came to about 567.

PAGE 42

[Regression plot of N vs. maximum temperature, with 95% confidence interval (CI) and 95% prediction interval (PI) limits; x-axis: mean of max. temperature (°F), y-axis: mean of N. Fitted equation: $Y = 202.803 - 5.09380\,X + 0.0328080\,X^2$; S = 2.62145, R-Sq = 42.0%, R-Sq(adj) = 41.8%]

Figure 4-4. Variation of average N due to transformer failures vs. maximum temperature

It can be observed from Figure 4-4 that the plot rises toward both ends of the X-axis. This can be attributed to the heavy load on the transformers caused by high power usage at these temperatures, partly because customers tend to switch on their air conditioning at the same time when the temperature is at either extreme. Between about 75 °F and 80 °F there is little increase in transformer failure interruptions, so this range can be regarded as optimal. Above approximately 80 °F, the curve rises steeply. The right skewness of the graph indicates that the effects of higher temperatures are more pronounced than those of lower temperatures; this is true for Florida, especially the southern part, where it is sunny for most of the year.
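
To illustrate the observation about the 75 °F to 80 °F range, the sketch below evaluates the quadratic fit printed on Figure 4-4 and locates its minimum. The coefficients are read from the figure; the minus sign on the linear term is an assumption (the sign is not legible in the extracted text, but a U-shaped curve with a positive intercept and positive curvature requires it). The function names are illustrative.

```python
# Coefficients of the Figure 4-4 fit, Y = A + B*X + C*X**2.
# B is taken as negative: an assumption consistent with the U-shaped plot.
A, B, C = 202.803, -5.09380, 0.0328080

def predicted_n(tmax_f):
    """Predicted mean number of transformer interruptions at a mean max temperature (deg F)."""
    return A + B * tmax_f + C * tmax_f ** 2

def temperature_of_minimum():
    """Vertex of the parabola, where dY/dX = B + 2*C*X = 0."""
    return -B / (2.0 * C)

if __name__ == "__main__":
    t_min = temperature_of_minimum()
    print(f"minimum near {t_min:.1f} F, predicted N = {predicted_n(t_min):.2f}")
    for t in (65, 75, 85, 95):
        print(t, round(predicted_n(t), 2))
```

Under these assumptions the minimum of the fitted curve falls between 75 °F and 80 °F, in line with the discussion of Figure 4-4, and the predicted value climbs faster on the hot side of the plotted range.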

PAGE 43

[Regression plot of N vs. maximum temperature; x-axis: mean of max. temperature (°F), y-axis: mean of N. Fitted equation: $Y = 208.210 - 5.22174\,X + 0.0335747\,X^2$; S = 1.11736, R-Sq = 80.9%, R-Sq(adj) = 80.1%]

Figure 4-5. Variation of average N due to transformer failures vs. maximum temperature (averaged per month per year)

Figure 4-5 was plotted with exactly the same information used for Figure 4-4, but the data for corresponding months of the 4 years was averaged over all the MAs, giving a total of 48 points. Similarly, in Figure 4-6 the data for corresponding months across all 4 years was averaged to give 12 data points. The important observation is that as the number of data points decreases, the plot becomes smoother and the $R^2$ value increases, but at the cost of losing the finer details of the data, because the variations within each month are averaged out. Averaging the data points in this way makes it possible to see the hidden pattern between the variables by suppressing the disturbances/noise in the data set.

PAGE 44

[Regression plot of N vs. maximum temperature; x-axis: mean of max. temperature (°F), y-axis: mean of N. Fitted equation: $Y = 273.490 - 6.81286\,X + 0.0432362\,X^2$; S = 0.597484, R-Sq = 95.1%, R-Sq(adj) = 94.0%]

Figure 4-6. Variation of average N due to transformer failures vs. max temperature (averaged per month)

The correlation equation for the X and Y variables considered is given at the top of each plot; $R^2$ represents the proportion of variability in the Y variable accounted for by the X variable. Based on the correlation model developed between transformer interruptions and maximum temperature, it is possible to predict the total number of interruptions due to transformers for any day/MA if the maximum temperature of that day/MA is known. Table 4-2 shows the prediction of transformer interruptions and the associated error for each of the MAs of FPL.

PAGE 45

Table 4-2. Prediction of N Using Maximum Temperature of All the MAs

MA               Airport  Tmax     N Tx     N Tx pred.  Error (%)  N Tx pred.  Error (%)
                          (Avg.)   (Avg.)   ALL FPL     ALL FPL    INDIVIDUAL  INDIVIDUAL
                                   actual   EQUATION    EQUATION   MA EQU.     MA EQU.
Central Florida  DAB      66.5000  7.36364  9.31668     26.523     8.170       10.951
Wingate          FLL      74.4091  3.45455  5.40181     56.368     4.898       41.784
Gulf Coast       FMY      72.5909  6.13636  5.91181     3.659      *
Treasure Coast   FPR      71.3182  6.86364  6.40587     6.669      *
Wingate          FXE      74.6364  3.45455  5.35414     54.988     4.850       40.395
Gulf Stream      HWO      75.3636  4.04545  5.22544     29.168     *
Central Dade     MIA      75.2727  3.50000  5.23954     49.701     5.020       43.429
Brevard          MLB      69.8182  7.22727  7.13454     1.283      *
North Dade       OPF      75.4545  4.81818  5.21190     8.172      *
West Palm        PBI      74.0000  4.90909  5.49660     11.968     *
Toledo           PGD      71.3182  4.18182  6.40587     53.184     5.429       29.824
Pompano          PMP      73.8636  2.31818  5.53076     138.582    3.720       60.471
Central Florida  SFB      68.3636  7.36364  7.99378     8.558      *
Manasota         SRQ      68.0000  6.95455  8.23225     18.372     7.360       5.830
South Dade       TMB      76.3636  5.40909  5.10758     5.574      *
ALL FPL                   72.485   5.2      5.9         13.4

Description:
  MA: Management Area considered
  Airport: The nearest airport to the MA considered in getting the weather data
  Tmax (Avg.): The average value of the maximum temperatures that occurred in January 2002
  N Tx (Avg.): Average number of interruptions (N) due to transformer failures
  N Tx prediction, All FPL equation: Predicted N Tx (Avg.) using the common equation of all MAs
  N Tx prediction, Individual equation: Predicted N Tx (Avg.) using the local equation of individual MAs

It can be observed from Table 4-2 that the prediction error varied over a wide range, from 1.28% to 138%. There were only 5 cases where the error was about 50% or higher, with

PAGE 46

the others within satisfactory limits. The large errors are due to the use of the common equation developed from the data of all the MAs. With the individual MA equations, which were developed from local MA data, those large errors were reduced (for example, from 138.6% to 60.5% for Pompano). There were cases where the common equation gave better results than the local equations; hence local equations are used only for those MAs where the common equation gave a large error.

4.3.2 Impact of Wind on N

The role of wind is very significant among all the weather parameters, and there is a very good correlation between wind and the total number of interruptions (N). When the daily two-minute maximum wind gust (TMMG) was plotted directly against N, the result was chaotic and no pattern could be seen, because different levels of N occurred for any given TMMG speed. The values of N occurring at each TMMG speed were therefore averaged and then plotted; the result can be seen in Figure 4-7.

[Regression plot of N vs. wind; x-axis: mean of two-minute wind gust speed (mph), y-axis: mean of N. Fitted equation: $Y = 4.57617 - 0.692087\,X + 0.0355799\,X^2 - 0.0004352\,X^3$; S = 2.76351, R-Sq = 39.8%, R-Sq(adj) = 35.2%]

Figure 4-7. Variation of N vs. wind

PAGE 47

There appears to be a pattern up to 40 mph, but beyond that the pattern is distorted. If only averages computed from at least 30 points are considered, the correlation obtained is very high, $R^2$ = 99.3%, and reveals a strong cubic relationship between N and TMMG (Figure 4-8). In doing so, only 1.5% of the data points were neglected, keeping 98.5% of the whole data.

[Regression plot of N vs. wind; x-axis: mean of two-minute wind gust speed (mph), y-axis: mean of N. Fitted equation: $Y = 0.613754 + 0.0647258\,X - 0.0065040\,X^2 + 0.0002598\,X^3$; S = 0.0900810, R-Sq = 99.3%, R-Sq(adj) = 99.2%]

Figure 4-8. Mean of two-minute wind speed vs. average number of interruptions

It can be observed from Figure 4-8 that the average total number of interruptions increases rapidly beyond about 20 mph. Power distribution poles and overhead equipment must therefore be designed so that they do not fail at wind gusts above 20 mph. Care must also be taken that vegetation and other objects near the distribution lines are kept at a proper distance and cannot lean on the lines during these wind gusts.

4.3.3 Impact of Rain on N

The impact of rain on the mean value of N can be observed in the following figures.
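
The averaging procedure used for Figures 4-7 and 4-8 (group the daily values of N by wind speed, average each group, and keep only groups with at least 30 observations) can be sketched as below. The function name, the `min_count` parameter, the rounding of speeds to whole mph and the tiny data set are illustrative assumptions, not the thesis code or FPL data.

```python
from collections import defaultdict

def mean_n_by_speed(observations, min_count=30):
    """Average N per wind speed, dropping speeds observed on too few days.

    observations: iterable of (wind_speed_mph, n_interruptions) pairs.
    Returns ({speed: mean N}, fraction of observations kept).
    """
    groups = defaultdict(list)
    total = 0
    for speed, n in observations:
        groups[int(round(speed))].append(n)  # bin by whole mph (assumption)
        total += 1
    kept = {s: v for s, v in groups.items() if len(v) >= min_count}
    means = {s: sum(v) / len(v) for s, v in sorted(kept.items())}
    kept_fraction = sum(len(v) for v in kept.values()) / total if total else 0.0
    return means, kept_fraction

if __name__ == "__main__":
    # Synthetic example: 40 days at 10 mph, 35 days at 20 mph, 2 days at 45 mph.
    data = [(10, 1)] * 20 + [(10, 2)] * 20 + [(20, 3)] * 35 + [(45, 12)] * 2
    means, kept = mean_n_by_speed(data)
    print(means, round(kept, 3))
```

In the synthetic example the two sparsely observed high-wind days are discarded, which mirrors how the thesis drops a small share of the data to obtain a smooth curve.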

PAGE 48

[Regression plot of N vs. rain; x-axis: mean of rain (inch), y-axis: mean of N. Fitted equation: $Y = 1.43350 + 3.04671\,X - 1.09160\,X^2 + 0.137744\,X^3$; S = 1.24028, R-Sq = 42.8%, R-Sq(adj) = 37.2%]

Figure 4-9. Variation of N vs. rain

There are more days without rain than days with rain. As the impact of rain on N is under consideration, the non-rainy days have been excluded from the data set. The data points of N were averaged in the same way as in the analysis of wind: the different values of N occurring for each level of rain were averaged and then plotted in Figure 4-9. The graph can be divided into three approximately linear segments: 0.1 to 1 inch, 1 to 3 inches, and more than 3 inches. In the first segment the relationship between N and rain appears linear (Figure 4-10), so an initial small amount of rain plays a vital role in the number of interruptions.

PAGE 49

[Regression plot of N vs. rain; x-axis: mean of rain (inch), y-axis: mean of N. Fitted equation: $Y = 1.47309 + 2.11652\,X + 0.662153\,X^2 - 0.498268\,X^3$; S = 0.568862, R-Sq = 78.9%, R-Sq(adj) = 75.0%]

Figure 4-10. Variation of N vs. rain in the interval [0, 2]

N remains nearly constant in the second segment, showing a constant effect of rain, but in the third segment N increases drastically as rain exceeds 3 inches. A small amount of rain, such as light showers, settles on the insulators. The water droplets act as a solvent for the salts and atmospheric dust deposited on the insulators and form a conducting layer, thereby causing a flashover that leads to power interruptions, as explained in Chapter 2. On the other hand, rain of 1 to 3 inches is heavy enough to clean the insulators, since the water runs off instead of being deposited. Finally, rain over 3 inches is accompanied by extreme weather conditions, which again lead to a large N.

4.3.4 Effect of Rain and Wind Together on N

The three-dimensional Figure 4-11 shows the combined effect of wind and rain on the average N. It can be seen that the predicted (calculated) $N_{avg}$ tracks the actual N very well for lower values of N. The regression equation is given by

$$N_{avg} = 1.05 - 0.04\,W_{\text{wind speed}} + 6.82\,R_{\text{rain average}}$$

The coefficient of determination is $R^2$ = 85.6%.
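
A regression of this two-variable linear form can be obtained by ordinary least squares. The sketch below solves the normal equations in plain Python on synthetic, noise-free points generated from the coefficients 1.05, 0.04 and 6.82 of the regression equation, with the wind coefficient taken as negative (an assumption, since the sign is not legible in the extracted text). It illustrates the fitting technique only and does not reproduce the thesis data or the statistical software used.

```python
def fit_plane(samples):
    """Least-squares fit of n = c0 + c1*w + c2*r via the 3x3 normal equations.

    samples: list of (w, r, n) tuples. Returns [c0, c1, c2].
    """
    # Augmented matrix [X^T X | X^T y] for design rows x = (1, w, r).
    m = [[0.0] * 4 for _ in range(3)]
    for w, r, n in samples:
        x = (1.0, w, r)
        for i in range(3):
            for j in range(3):
                m[i][j] += x[i] * x[j]
            m[i][3] += x[i] * n
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(3):
            if k != col:
                factor = m[k][col] / m[col][col]
                for j in range(col, 4):
                    m[k][j] -= factor * m[col][j]
    return [m[i][3] / m[i][i] for i in range(3)]

if __name__ == "__main__":
    # Synthetic grid: wind 5..35 mph and rain 0..0.8 inch (hypothetical sampling).
    truth = (1.05, -0.04, 6.82)  # sign of the wind term is an assumption
    grid = [(w, r / 10.0) for w in range(5, 36, 5) for r in range(0, 9, 2)]
    data = [(w, r, truth[0] + truth[1] * w + truth[2] * r) for w, r in grid]
    print([round(c, 4) for c in fit_plane(data)])
```

On noise-free input the fit recovers the generating coefficients; with real daily averages the residual scatter is what the reported $R^2$ summarizes.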

PAGE 50

[Three-dimensional plot of $N_{avg}$ against wind (mph) and rain (inch), comparing field data with the prediction]

Figure 4-11. Impact of rain and wind together on N

Usual methods of statistical analysis rely, in part, on knowing in advance what the researcher is looking for. An example is a study done by FPL that provides a linear equation describing the number of interruptions caused by lightning as a function of the number of lightning strikes. In this case, the cause of the outage is known (one-to-one relationship) and the result is expected. The data required for this type of analysis is also limited by the narrow scope of the question. In addition, there are limited strategies for dealing with the problem, since lightning is a random and unpredictable event.

Analyzing the effects of normal weather requires a different approach. We need to be open to unexpected results rather than expected ones. We need to consider a body of data much larger than that required to investigate a single phenomenon. We need to consider non-linear relationships and relationships that imply a confluence of conditions. We need to apply every statistical tool we can think of, and then learn some more. Most of these features are available in a tool called Artificial Neural Networks (ANNs). Hence the application of ANNs to our problem is discussed in the next chapter.

PAGE 51

4.3.5 Effect of Lightning Strikes on N

FPL has already carried out a correlation analysis between cause codes 01 (lightning, with equipment damage) and 02 (storm with no equipment damage) and lightning strikes for all the MAs for the years 1998-2001. Cause codes 01 and 02 represent the direct effect of weather on service interruptions. The following plot, taken from the slides FPL presented during their visit to the University of Florida, shows the impact of lightning strikes on storm interruptions: the correlation is very high and linear, meaning that the more lightning strikes there are, the more storm interruptions occur.

Figure 4-12. Lightning strikes vs. storm interruptions during 1998-2001 for all MAs

PAGE 52

CHAPTER 5
PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS

Although the previous chapter may give the impression that the effect of all weather parameters on power interruptions can be quantified using standard mathematical functions or statistical techniques, this is not always true. It may be neither practical nor feasible to find a function for certain complex correlations between weather and interruptions. This is where neural networks are needed to analyze and generalize the hidden relationship. We need a tool that is powerful when applied to problems whose solutions require knowledge that is difficult to specify, but for which there is an abundance of examples; artificial neural networks are among the best tools for this kind of problem.

5.1 Introduction to Artificial Neural Networks

Neural networks, or artificial neural networks (ANNs) to be more precise, represent a technology that is rooted in many disciplines: neuroscience, mathematics, statistics, physics, computer science and engineering. ANNs find applications in such diverse fields as modeling, time series analysis, pattern recognition, signal processing and control by virtue of an important property: the ability to learn from input data with a teacher (supervised) or without one (unsupervised). The most common training scenarios use supervised learning.

An ANN is a very useful tool for predicting the interruptions of a power distribution system with reasonable accuracy. The accuracy of prediction depends directly on the accuracy of the historical power interruption and weather data used to train the ANN.

PAGE 53

This project provides the methodology for predicting interruptions in advance, for forecast weather conditions, using ANNs.

5.1.1 Benefits of ANNs over Statistical Methods

An ANN is an alternative to conventional methods [11]. It combines the time series and regression approaches: it learns from previous interruption and weather patterns and predicts a pattern for the current conditions, and it also performs a non-linear regression between interruptions and weather patterns. It shows superior accuracy when compared with statistical methods [12]. An ANN derives its computing power from, first, its massively parallel distributed structure and, second, its ability to learn and therefore generalize. Generalization refers to the neural network producing reasonable outputs for inputs not encountered during training (learning). These two information-processing capabilities make it possible for neural networks to solve complex (large-scale) problems that are currently intractable.

The main reasons for using neural networks for prediction, rather than statistical techniques or classical time series analysis, are [13]:
- They are self-monitoring (i.e., they learn how to make accurate predictions).
- They are able to cope with nonlinearity and nonstationarity of input processes.
- They are adaptive, non-linear and highly parallel.
- They can generalize.
- They are computationally at least as fast as, if not faster than, most available statistical techniques.

Multi-layered ANNs are capable of performing just about any linear or nonlinear computation, and can approximate any reasonable function arbitrarily well.

5.1.2 Architecture of ANN

Figure 5-1(a) shows the basic model of a single neuron, while Figure 5-1(b) shows a one-layer network with R input elements and S neurons. In this network, each element of

PAGE 54

the input vector p is connected to each neuron input through the weight matrix W. The ith neuron has a summer that gathers its weighted inputs and bias to form its own scalar output n(i), Figure 5-1(b). The various n(i) taken together form an S-element net input vector n. Finally, the neuron layer outputs form a column vector a, where a = f(Wp + b).

Figure 5-1. ANN structures: (a) basic nonlinear model of a neuron, (b) one-layer network of neurons, and (c) 3-layer feed forward back propagation network

Figure 5-1(c) shows the ANN model used in the current project. IW represents the input weight matrix, which has a source of 1 (second index) and a destination of 1 (first index). Also, the elements of layer one, such as its bias, net input, and output, carry a superscript 1 to indicate that they are associated with the first layer. LW represents the layer weights [14].

PAGE 55

The data is presented to the input nodes. Each input node is connected to several nodes in the second layer. The second layer is called the hidden layer, since its nodes are not accessible to the outer environment. The hidden layer acts as a layer of abstraction, pulling features from the inputs. Determining the proper number of nodes for the hidden layer is difficult and is often done by trial and error. Generally, network performance increases with the number of hidden nodes and then reaches a saturation level [15]. Adding more hidden nodes beyond that point may actually degrade performance because the network becomes harder to train. Following this commonly accepted rule helps to train the ANN efficiently and also helps the solution to converge. The last layer is referred to as the output layer, since the network's output is the response of the nodes on this layer. The number of output nodes of an ANN is determined by the requirement.

5.1.3 Functioning of ANN

In general, the operation of this feed forward network consists of passing weighted and summed input signals through a chosen nonlinearity. It presumes knowledge of the network's bias functions and weighted links. Once the activation and output functions are chosen, an ANN is completely described by its weights and biases. Since a given ANN solves a specific problem, or function, finding the weights and biases for the network is equivalent to finding the input/output relationship that describes the function. In the current ANN model, Figure 5-1(c), the activation functions chosen for the hidden layer and the output layer are tansig and purelin respectively. The two-layer sigmoid/linear network can represent any functional relationship between inputs and outputs if the sigmoid layer has enough neurons.
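The forward pass of such a two-layer tansig/purelin network can be sketched in a few lines. The sketch below is only an illustration, written in Python rather than with the MATLAB toolbox used in this project; the weight values and the three-input, two-hidden-node layout are hypothetical and are not the trained model.

```python
import math

def tansig(n):
    """Hyperbolic tangent sigmoid, the hidden-layer activation named in the text."""
    return math.tanh(n)

def purelin(n):
    """Linear (identity) activation used on the output layer."""
    return n

def layer(weights, bias, inputs, activation):
    """Compute a = f(W p + b); weights holds one row per neuron."""
    return [activation(sum(w * p for w, p in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]

def forward(p, IW, b1, LW, b2):
    """Pass input vector p through the hidden layer, then the output layer."""
    a1 = layer(IW, b1, p, tansig)
    return layer(LW, b2, a1, purelin)

# Hypothetical weights: 3 inputs, 2 hidden neurons, 1 output.
IW = [[0.1, -0.2, 0.05], [0.3, 0.1, -0.1]]
b1 = [0.0, 0.1]
LW = [[1.5, -0.5]]
b2 = [2.0]
prediction = forward([1.0, 0.5, 2.0], IW, b1, LW, b2)
```

Once the weights and biases are fixed, the output is fully determined by the input vector, which is the sense in which the network is completely described by its weights and biases.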

PAGE 56

There are many training algorithms and performance functions to choose from when training the network model. For the present problem the back propagation network (BPN) algorithm was chosen, as it is the best-known algorithm for multi-layer perceptron (MLP) networks, and the trainbfg training function was used to train the BPN. The term back propagation refers to the manner in which the gradient is computed for nonlinear MLP networks. Properly trained back propagation networks tend to give reasonable answers when presented with inputs that they have never seen. Typically, a new input leads to an output similar to the correct output for those training input vectors that resemble the new input. This generalization makes it possible to train a network on a representative set of input/target pairs and get good results without training the network on all possible input/output pairs.

5.1.4 Back Propagation Learning Rule

The back propagation learning rule [16] is an iterative gradient algorithm designed to minimize the mean square error between the actual output of a multilayer feed forward network and the desired output. An essential component of the rule is the iterative method that propagates the error terms required to adapt the weights back from the nodes in the output layer to the nodes in the lower layers. At the beginning, all weights and node offsets are set to small random values. The input values are presented and the desired outputs are specified. Then the network, Figure 5-2, is used to calculate the actual outputs. A recursive algorithm, starting at the output nodes and working back to the hidden layer, adjusts the weights until they converge and the cost function is reduced to an acceptable value. The training process is repeated by presenting different sets of input data to the ANN.
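The learning rule described above can be illustrated with a minimal sketch. It assumes plain batch gradient descent on the mean square error, which is not the quasi-Newton trainbfg function used in this project, and it runs on hypothetical noise-free data; the function name, learning rate, and network size are illustrative choices only.

```python
import math
import random

def train(samples, hidden=3, rate=0.05, epochs=500, seed=0):
    """Batch gradient descent on mean square error for a tanh/linear network.

    Returns the mean square error measured at the start of every epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    m = len(samples)
    # Small random initial weights and offsets, as the rule prescribes.
    IW = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    LW = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = rng.uniform(-0.5, 0.5)
    history = []
    for _ in range(epochs):
        g_IW = [[0.0] * n_in for _ in range(hidden)]
        g_b1 = [0.0] * hidden
        g_LW = [0.0] * hidden
        g_b2 = 0.0
        squared_error = 0.0
        for x, target in samples:
            # Forward pass: hidden tanh layer, then linear output.
            a1 = [math.tanh(sum(w * xi for w, xi in zip(IW[j], x)) + b1[j])
                  for j in range(hidden)]
            err = sum(LW[j] * a1[j] for j in range(hidden)) + b2 - target
            squared_error += err * err
            # Backward pass: the output error is propagated to the hidden layer.
            g_b2 += err
            for j in range(hidden):
                g_LW[j] += err * a1[j]
                delta = err * LW[j] * (1.0 - a1[j] ** 2)  # tanh'(n) = 1 - tanh(n)^2
                g_b1[j] += delta
                for i in range(n_in):
                    g_IW[j][i] += delta * x[i]
        history.append(squared_error / m)
        step = 2.0 * rate / m  # gradient of the MSE is (2/m) * accumulated sums
        b2 -= step * g_b2
        for j in range(hidden):
            LW[j] -= step * g_LW[j]
            b1[j] -= step * g_b1[j]
            for i in range(n_in):
                IW[j][i] -= step * g_IW[j][i]
    return history

# Hypothetical noise-free data on a 5 x 5 grid: target = 0.5*x1 - 0.25*x2.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
samples = [((x1, x2), 0.5 * x1 - 0.25 * x2) for x1 in grid for x2 in grid]
history = train(samples)
```

The returned history plays the role of the cost function in the description: training can stop once it has fallen to an acceptable value.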

PAGE 57

Figure 5-2. A back propagation ANN model

5.2 Steps to Enhance the Performance of ANN

There is a mistaken notion that one can dump all the available variables into the ANN as inputs and expect it to predict the solution. The more input variables the ANN has, the more complex it becomes to track the correlations among them. To enhance the performance of the ANN, the input data has to be pre-processed. The ANN toolbox of MATLAB 6.0 has functions that can perform some of these operations. The following techniques can help to improve the quality of the input datasets before they are given to the ANN:
- Eliminate the unnecessary variables which do not contribute significantly to the output.
- Scale the inputs and targets so that they always fall within a specified range.
- Reduce the dimensions of the input data without much loss in the variance, e.g. by Principal Component Analysis, as explained below.

As weather is a combination of many parameters such as wind, temperature, rain, pressure, dew, lightning, etc., the next question that comes to mind is: which of these parameters are the predominant ones that contribute significantly to the daily interruptions? One way to find a solution to this problem is to see the

PAGE 58

variance of all the weather parameters with respect to each other. The ones which have more variance are more responsible for N than the ones with less variance. Less variance in a variable means fewer changes in its value, which means this variable has less effect on the changes in N. For investigations involving a large number of observed variables, it is often useful to consider a smaller number of linear combinations of the original variables.

Principal Component Analysis (PCA)

PCA is one of the popular, easy-to-use tools for reducing the dimensions of the input variables. Principal component analysis [13] finds a set of standardized linear combinations called the principal components, which are orthogonal and, taken together, explain all the variance of the original data. The following analysis shows the variance of the 8 inputs considered in the ANN model:

Table 5-1. Summary Table of Covariance for All the Input Variables Considered in the Principal Component Analysis

In Table 5-1, if component 1 (comp.1) alone is considered, it explains 54.9% of the total variance in the data set by using the following linear combination of only 4 weather variables:

Comp.1 = 0.515(MaxTemp) + 0.738(MinTemp) - 0.107(HeatDays) + 0.415(CoolDays)
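As a hedged illustration of how a leading component and its share of the total variance can be obtained, the sketch below applies power iteration to a sample covariance matrix. It is written in Python rather than with the tools used in this project, it only recovers the first component, and the three-column data set (maximum temperature, minimum temperature, rain) is hypothetical.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of observations given as equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]

def first_component(cov, iterations=500):
    """Leading unit-length eigenvector and eigenvalue of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, eigenvalue

# Hypothetical daily observations: (MaxTemp, MinTemp, Rain).
rows = [(88, 75, 0.1), (90, 77, 0.0), (85, 72, 0.2), (92, 79, 0.1), (80, 68, 0.0)]
cov = covariance_matrix(rows)
loadings, eigenvalue = first_component(cov)
# Fraction of total variance explained by the first component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
```

The loadings are the coefficients of the linear combination, in the same sense as the coefficients of comp.1, and the explained fraction corresponds to the percentage of total variance quoted for a component.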

PAGE 59

Similarly, comp.2 alone explains 22.9% of the total variance in the data set with the following linear combination:

Comp.2 = -0.197(MaxTemp) + 0.313(AvgwindS) + 0.727(5SecWindS) + 0.585(2MinWindS)

When comp.1 and comp.2 are considered together, 77.8% of the total variance of the data set can be explained. From the table it can be observed that considering up to comp.3 explains around 91.7% of the total variance, and considering up to comp.4 explains around 96.5%. The number of components to retain depends on the amount of variance that is of interest. In this case, the total number of dimensions, 8, is reduced to 4 if we want to retain 96.5% of the total variance by considering up to comp.4. Interestingly, rain makes no contribution at all to the variance in the table if we consider only up to comp.4, so this variable can be eliminated. Hence the 8 variables can be reduced to 4 components while preserving 96.56% of the total variance in the data set. Each component acts as a new variable, but is a linear combination of the actual weather variables.

5.3 ANN Simulation Output

Three management areas, Wingate, North Dade, and Gulf Stream, were chosen as pilot areas in the current artificial neural network (ANN) project. These 3 MAs are adjacent to each other and small enough to assume that the variation in weather due to geographical differences is slight. Also, they are all urban MAs and appear to have a similar distribution of customer types.

Two years of weather and interruption data, 2000 and 2001, were chosen to train the ANN, while the 2002 weather and interruption data were used to evaluate the performance of the trained ANN model. One entry in either the training or evaluation datasets is

PAGE 60

composed of one day's weather and interruptions for one MA, so the one-year evaluation dataset had 1060 entries (allowing for missing data). The output of the ANN is two columns of data: a prediction for each entry in the evaluation dataset and the actual number of interruptions for each entry in the evaluation dataset. Figure 5-3 is a graph of the predicted values superimposed over the actual values for the evaluation year 2002. The MAs are in sequence, and it can be seen that the predicted values follow the seasonal trends for interruptions.

[Figure 5-3 panel labels: 2002-Gulfstream, 2002-North Dade, 2002-Wingate]

Figure 5-3. Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year 2002

5.3.1 Detailed Observation

Figure 5-4 is an expanded segment (North Dade) of the above graph to highlight the details. It can be seen that where the actual number of interruptions is large, the predictions match the pattern of maxima and minima but are not always close in magnitude. Where the actual number of interruptions is small, the pattern matching breaks down, but large spikes in the predictions do not occur.

The following is a segment of a time series plot of predicted values and actual values of the total daily number of interruptions (N) for North Dade MA. It can be clearly seen that during some periods (rectangles) the predicted values match the pattern of the

PAGE 61

actual values, if not the magnitude, while other periods (ovals) do not show any such pattern matching although the magnitudes are small.

Figure 5-4. Predicted N and the actual N for a few of the cases in North Dade MA for 2002

Some of the interesting observations from the above plot, Figure 5-4, are explained below.

Case 1: Predicted N less than actual N. The actual values of N at points 529 and 530 in Figure 5-4 correspond to 6/17/2002 and 6/19/2002 in the North Dade MA. The interruption data for 6/17/2002 indicates that up to 18 among 28 interruptions that occurred on 6/17/2002 may not be related to daily common weather (Corrosion/Decay = 10, Improper Process = 6, Request = 2) which suggests that as few as 12 may be weather related interruptions, which is just the same as the predicted value. Similarly, 15 out of 29 interruptions that occurred on 6/19/2002 may not be related to daily weather, which leaves a number close to our predicted value of 12. Though the weather conditions for these days were relatively mild, there was a significant increase in the number of interruptions, as shown in the green box in Figure 5-5.

PAGE 62

Case 2: Predicted N more than actual N. Points 549, 550, 551 and 552 correspond to 7/8/2002, 7/9/2002, 7/10/2002 and 7/11/2002 in the North Dade MA, outlined in red in Figure 5-5. The number of interruptions on these days was much the same, though the weather conditions varied over a wide range. This large change in weather conditions forces the ANN model to predict N in proportion to the weather. So the question to ask is why N changed so little when the weather conditions differed so significantly. Had some precautionary measures, e.g. tree trimming, been taken a few days before these interruptions occurred?

Figure 5-5. Numerical values of weather and interruption data under consideration

PAGE 63

Case 3: In some of the cases there was a very small increase in N though there were large variations in the weather conditions.

5.3.2 Dominant Weather Parameters: Preliminary Observations

A series of ANN simulations with different weather parameters removed has been run, and the relative accuracy of each simulation has been compared to determine which weather parameters are the most significant. The preliminary results show that only a few of the many weather parameters account for most of the variation in the number of interruptions. It is expected that the importance of individual weather parameters will vary with location. The following list gives the weather variables in order of their importance for the pilot area:
1. Two Minute Sustained Wind Gust (mph)
2. Rain (inches)
3. Lightning Strikes (# of strikes/day)
4. Temperature Average or Max. & Min. (K)

On the other hand, the following parameters account for the least variation in the number of interruptions:
1. Pressure
2. Heat Days
3. Cool Days
4. Dew Point
5. Population (of MA)

5.4 Analysis of ANN Simulation Output

Based on the actual number of interruptions and the predictions during the evaluation year, probability graphs (PGs) have been created to represent the range of interruptions that actually occurred in the evaluation dataset for each predicted value. For example, if every number between 1 and 40 interruptions were predicted at some point in

PAGE 64

the evaluation year, there would be 40 PGs. This is done by sub-setting the evaluation dataset into 40 sets and creating a histogram of actual values for each predicted value in the evaluation set. By dividing each frequency column by the total count of the observations that make up the histogram, a probability graph such as the one shown below for a prediction of 11 can be created.

[Figure 5-6 panels: (a) Histogram, frequency vs. N actual; (b) Probability Graph, percent probability vs. N actual. Percent bar labels in the order extracted: 1.5267, 3.8168, 3.8168, 5.3435, 9.9237, 8.3969, 9.1603, 12.2137, 5.3435, 7.6336, 6.8702, 6.8702, 3.0534, 1.5267, 3.8168, 3.0534, 3.8168, 1.5267, 0.7634, 0.7634, 0.7634]

Figure 5-6. Histogram plot of predicted interruptions

From the histograms, it can be seen that the actual values for each predicted value follow a generally normal distribution, so it is justified to use the conventionally calculated mean and standard deviation to gauge the accuracy and precision of a prediction. The accuracy is determined by the closeness of the prediction to the mean actual number of interruptions. The precision is determined by the magnitude of the percent standard deviation. Percent standard deviation was chosen so that the standard deviations of lower and higher predictions are comparable. Outliers provide clues to elements of the model that either are missing or should not be there.

Test data is just a single day's weather data: a real-time updated weather parameter maximum, minimum or total from a weather station, a known day's values, or a theoretical set of weather data. The first is used in real-time prediction, but the latter can only be used after the fact and does not provide any predictive benefits, aside from inclusion in the

PAGE 65

historical data set. However, the latter can be used for research, such as modeling a system's robustness to weather. Test data that shows a very low prediction can be used as a base, and the parameter values can be varied either individually or in groups to model the response to those parameters.

5.5 Pitfalls and Suggestions to FPL

GIGO is an acronym from the predawn of computing: garbage in, garbage out. The accuracy and precision of the ANN are limited by the accuracy and precision of the input. Although there will always be error inherent in the data collected, significant improvements may be possible.

5.5.1 Weather Data

The error inherent in the ASOS weather data may be geographical, and ASOS data is only available for historic and not real-time use. The installation of dedicated weather stations that is now under way at FPL service centers will reduce that inherent error and allow real-time forecasting.

5.5.2 Interruption Data

Although the FPL data cubes are thorough, the reporting procedures are not designed for a detailed, time-dependent study such as this, nor are they always sensitive to the role of weather. Because of this, the accuracy and precision of the prediction suffer. An example is that a day on which an interruption may be reported runs three shifts from 7 AM to 7 AM. In the last random sample made, the last shift, from 11 PM to 7 AM, reported about 12% of the day's interruptions, meaning that the interruptions from midnight to 7 AM were being reported on the previous day. This can be largely accounted for by taking data from the cube by shifts and summing; however, that still leaves 11 PM to

PAGE 66

midnight, or maybe 1-2% of the interruptions, reported on the wrong day. Because the data was only shifted in time, the average difference after adjustment was only 0.05 interruptions; however, because of many instances where a large number of interruptions were reported on the previous day, the average percent difference was 14%.

To determine the effect of this error on the output, two sets of data were taken from the same time period and location: an original one with 24 hours of interruption data taken from the cube on the day it was reported, and an adjusted one with 24 hours of interruption data taken from the beginning of the third shift on the day before it was reported. Both were run through the ANN and the results compared. The following detailed graphs of the same time period in the same MA show an improvement in the pattern and magnitude matching after the interruption data were adjusted for the shift differences.

Figure 5-7. Prediction results of ANN using the original N (not shift adjusted)

Mean and standard deviation plots for the actual N vs. predicted N and the adjusted N vs. predicted N are shown in Figure 5-9 and Figure 5-10. It can be observed that adjusting the data to include the correct shifts on the correct days improves the fit of the prediction to the mean actual number of interruptions. It also shows that the fit

PAGE 67

deteriorates as the prediction increases, indicating unknown factors. The graphs of the original, Figure 5-9, and shift-adjusted percent standard deviation, Figure 5-10, show a reduction in the adjusted %Standard Deviation at lower predictions while the higher predictions are not much improved, similar to the graphs of the means.

Figure 5-8. Prediction results of ANN using the adjusted N (shift adjusted)

[Figure 5-9 panels: (a) mean actual N vs. prediction; (b) actual %StDev vs. prediction]

Figure 5-9. Mean and standard deviation of actual N

[Figure 5-10 panels: adjusted mean actual N vs. prediction; adjusted %StDev vs. prediction]

Figure 5-10. Mean and standard deviation of adjusted N
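The probability graphs and the mean and percent standard deviation plots discussed above can be computed by grouping the actual values by predicted value. The following Python sketch is an illustration only: the function names and sample numbers are hypothetical, and it assumes that percent standard deviation means the sample standard deviation divided by the mean actual value, times 100, which is not spelled out in the text.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

def group_actuals(predicted, actual):
    """Collect the actual N observed for each distinct predicted N."""
    groups = defaultdict(list)
    for p, a in zip(predicted, actual):
        groups[p].append(a)
    return groups

def probability_graph(actuals):
    """Percent of cases at each actual N: frequency divided by the total count."""
    counts = Counter(actuals)
    total = len(actuals)
    return {n: 100.0 * c / total for n, c in sorted(counts.items())}

def mean_and_percent_sd(actuals):
    """Mean actual N and the assumed percent standard deviation (None if undefined)."""
    m = mean(actuals)
    if len(actuals) < 2 or m == 0:
        return m, None
    return m, 100.0 * stdev(actuals) / m

# Hypothetical evaluation entries: three days predicted as 11, one as 12.
groups = group_actuals([11, 11, 11, 12], [10, 12, 14, 9])
pg_11 = probability_graph(groups[11])
mean_11, pct_sd_11 = mean_and_percent_sd(groups[11])
```

Comparing the mean with the prediction gauges accuracy, and the percent standard deviation gauges precision, in the sense used in Section 5.4.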

PAGE 68

These results suggest that there are other improvements that can be made in the data reporting. No single change would, perhaps, have as visible an effect, but taken together they could alter the results significantly. Some possibilities suggest themselves:
- Report interruption requests due to weather related damage repair on the day the causative weather condition occurred.
- Maintain an hourly database for interruptions, since hourly weather is available. This would be especially useful if dedicated weather stations existed.
- Update cause codes to be more sensitive to the possible role of weather.
- Report the age of equipment.

Simulations that have been run with different cause codes subtracted from the interruption data, such as accident, animal, improper process and crew request (planned), have shown similar improvements in differing regions of the graphs.

5.6 Proving Localization of Weather Improves the Accuracy in Prediction

Case 1: Localized Weather Data

Three adjacent areas, Wingate, North Dade, and Gulf Stream, were chosen for study as pilot areas. Weather and interruption data for each of the MAs for the years 2000 and 2001 were used to train the ANN model, and the 2002 weather data of the North Dade area was then used for prediction with the trained ANN. The mean percentage error (MPE) of the predicted values is approximately 25%. The MPE is calculated using the following formula:

$$\mathrm{MPE} = \frac{1}{M} \sum \mathrm{Mod}\left(\frac{N_{actual} - N_{predicted}}{N_{actual}}\right)$$

where Mod denotes the absolute value and
M = total number of cases considered
N_actual = actual number of interruptions (N) that occurred
N_predicted = predicted number of interruptions (N)
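A minimal Python sketch of this error measure is given below. It makes two assumptions that are not stated in the text: the result is multiplied by 100 so that it reads as a percentage, and cases with zero actual interruptions are skipped because the ratio is undefined for them. The function name and the sample values are hypothetical.

```python
def mean_percentage_error(actual, predicted):
    """Mean of |N_actual - N_predicted| / N_actual over all cases, in percent.

    Assumption: cases with N_actual == 0 are skipped, since the ratio is undefined.
    """
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("no cases with a non-zero actual value")
    return 100.0 * sum(terms) / len(terms)

# Hypothetical daily values: errors of 20%, 25% and 0% average to 15%.
mpe = mean_percentage_error([10, 20, 40], [12, 15, 40])
```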

PAGE 69

Case 2: Scattered Weather Data

In contrast to taking weather data from each of the weather stations, only one weather station was used for weather data while the interruption data was taken from all 3 management areas. The mean percentage error of the predicted values is 35% (>25%). This shows that increasing the accuracy of the weather data, by considering smaller areas, offers a chance to enhance the performance of the model.

5.7 Comparison of Statistical Model and ANN Model

The prediction performance of the statistical model and the ANN model was compared using the 2000 and 2001 weather and interruption data of the Gulf Stream (GS), North Dade (ND), and Wingate (WG) areas of FPL. Three variables, Rain, 2-Minute Maximum Wind Gust and Average Temperature, were considered in building the two models. A multiple regression equation was developed for these three variables as given below:

N = -16.6 + 0.174 * AvgTemp + 4.71 * Rain + 0.852 * 2MMaxS

The 2002 weather data of ND is used to predict N using the above equation. Along similar lines, an ANN model was developed with 3 input variables, 5 hidden nodes and 1 output node. The same data set that is used for the statistical model is used in evaluating the ANN model. The results of both models are tabulated in Table 5-2.

Table 5-2. Performance Comparison Between Statistical Model and ANN Model

                              Statistical Model    ANN Model
Mean % Error                  67                   45
Prediction with 30% Error     46                   54

The above results show that the prediction accuracy of the ANN model is better than that of the statistical model. Though the actual predicted figures of accuracy from both the

PAGE 70

models are low, as only a few variables were considered to keep the problem simple, the point here is to show that the ANN model is better.

Figure 5-11. Mean squared error vs. training epochs

Figure 5-11 shows that the mean square error decreases gradually as the ANN is trained over successive epochs (an epoch being one pass through the complete set of training data). The progress of training is diagnosed by examining the training, validation and test errors. The training stopped after 40 epochs because the validation error increased. The result here is reasonable, since the test set error and the validation set error have similar characteristics, and it does not appear that any significant overfitting has occurred.

5.8 Possible Software Development to Predict Power Interruptions Using ANNs

Figure 5-12 is a snapshot of the graphical user interface (GUI) developed for the ANN that had been trained to predict the interruptions.

PAGE 71

Figure 5-12. Graphical user interface developed to predict interruptions

Using the interface shown in Figure 5-12 is simple: first load the training and testing data files (ASCII format) using the option buttons provided, and then click the Run Simulation button to see the above plots.

Currently, the development of custom software to predict power distribution interruptions, based on the idea provided in the current thesis, is in progress. The proposed prediction model is under test at FPL management areas. The custom software can be installed on the user's desktop just like any other software, and the expected power interruptions are then only a click away. In the future, the software will be distributed to other power utilities in the USA.

PAGE 72

Using the model shown in Figure 5-12, it is also possible to obtain predictions of the kind shown in Figure 5-13. The Central Dade management area (MA) was chosen as one of the pilot areas to see how well the developed model can predict the interruptions. It can be seen that the coefficient of determination R² is around 90%, which means the model does a good job of predicting interruptions close to the actual number of interruptions. The X-axis of Figure 5-13(a) shows the predicted sum of monthly interruptions, while the Y-axis shows the actual sum of monthly interruptions that occurred. This estimate of predicted interruptions will help the utilities to know in advance how many personnel they need to deploy to manage the interruptions.

[Figure 5-13(a) plot: Central Dade 2001-2003 Monthly Total N, Sum Actual vs. Sum Predicted. Fitted line: Sum Actual = 1.84 + 1.060 Sum Predicted; S = 32.3396, R-Sq = 90.2%, R-Sq(adj) = 89.9%]

PAGE 73

[Figure 5-13(b) plot: Scatterplot of Sum Actual, Sum Predicted vs Month for Central Dade; panel variable: Year (2001, 2002, 2003)]

Figure 5-13. Predicted interruptions vs. actual interruptions for Central Dade (a) 3 years plotted together (b) each year plotted separately
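A fitted line and R² of the kind reported for Figure 5-13(a) can be obtained by ordinary least squares. The Python sketch below is an illustration under stated assumptions: the monthly totals are hypothetical, not the Central Dade data, and the function name is not part of the software described in this chapter.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y = intercept + slope * x, with R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Hypothetical monthly totals: predicted sums on x, actual sums on y.
sum_predicted = [180, 220, 260, 310, 400, 450]
sum_actual = [175, 240, 270, 300, 430, 470]
intercept, slope, r_squared = linear_fit(sum_predicted, sum_actual)
```

A slope near 1 and an intercept near 0 would indicate that the predicted monthly totals track the actual totals without systematic bias.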

PAGE 74

CHAPTER 6 LIMITATIONS, CONCLUSIONS AND FUTURE WORK

The research results presented in this thesis are not without their own limitations. Some of the hurdles that need to be overcome to get better results are discussed in this chapter. The conclusions of the current thesis are followed by the future work, which explains the steps to be taken from the current state of the project.

6.1 Limitations of Approach

6.1.1 Weather Data

There are two types of weather parameter measurement errors. First, we found that the weather parameter measurements at an airport are not as accurate as expected. Second, the distance between the location of the outages and the airport where the weather parameters are measured is up to 10 miles. For certain weather parameters, the weather conditions at two locations can be significantly different (Figure 6-1). However, the rain difference follows a normal distribution with a mean close to zero. Therefore, rain data can be used for nearby locations without changing the results.

PAGE 75

[Figure 6-1(a) plot: average rain difference vs. time, 11/20/2000 to 2/13/2002, where Avg diff = (|R1-R2| + |R1-R3| + |R2-R3|) / 3]

Figure 6-1. Average precipitation difference (a) for 3 weather stations in Fort Lauderdale (b) showing less than 9 miles distance between each weather station

6.1.2 Unknown Variables

There are many explanatory variables that would contribute to the response. Some of these are different weather parameters, but other variables are most likely specific to a

PAGE 76

system and would best be identified by utility employees who are familiar with the system.

6.1.3 Outliers

It is possible to find aberrant observations in the data set without any clear explanation of the cause. These sorts of outliers must be studied independently.

6.1.4 Hourly Data

In the interruption database that FPL provided, the finest level available is the daily basis. However, for any interruption studied it is necessary to know the exact time, at the hour level or even at the minute level, as shown in [10-11]. This is needed to study the different weather parameters at the given outage time, since the weather varies from hour to hour.

6.2 Conclusions

The ANN and the statistical analysis of the ANN output have the potential to provide powerful modeling tools, and can be used to provide limited real-time prediction. The accuracy and precision of the model depend as much on the input as on the ANN model. The graphical output of the ANN can be used by itself or in conjunction with the statistical analysis to compare the accuracy and precision of the ANN model with different variable selections, principal components, study areas or times. In some cases, the graphical representation can provide better clues to the performance of the ANN than the graphs of means and percent standard deviations.

With the ever increasing demand for more electricity every year, the need to look for better ways of preventing interruptions due to overloading of the power distribution equipment has drawn much attention.

PAGE 77

ANNs have already been applied in power systems in areas such as:
- Economic Load Dispatch, Optimization and Loss Reduction
- Fault Detection and Diagnosis
- Frequency Control
- Load Forecasting
- Contingency analysis, static security assessment
- Voltage and Reactive Power Control

However, not much research work can be found either online or in the IEEE publications on the application of ANNs to the prediction of power distribution interruptions. This novel idea seems very promising for alerting the utilities in advance to the number of interruptions that are going to happen. This helps them to optimize their crews by mobilizing them to the location of interest and to act more effectively, either to avoid interruptions or to restore power quickly after them. This further helps in reducing the SAIFI value. The utilities can predict SAIFI, as they would be able to predict the total number of interruptions, and can use it in their internal calculations.

The developed ANN model can be further enhanced to predict extra information, such as the time slot and location of these interruptions, besides their approximate number; all that is needed is to provide the extra information as input columns while training the model. The accuracy of the predicted results depends directly on the accuracy of the information provided in the training data used to train the model.

A basic methodology that is easily automated has been proposed. The methodology promises to be easy to use and flexible enough to perform in both a real-time predictive mode and a theoretical modeling mode.

6.3 Future Work

The following future steps will improve the accuracy of the current analysis.


6.3.1 Data Collection and Creating New Variables

Additional data collection is necessary. It is suspected that a change in power usage or equipment density might cause outage trends over time. To verify this idea, we need to develop a scaling factor and collect usage data. This scaling factor would consist of information such as equipment density and length of lines; daily power usage data would be an additional explanatory variable. This new data might also be useful in comparing management areas, because the probability of interruptions occurring may be proportional to the scaling factor.

The more the new input variables of the system

6.3.2 Improving the Accuracy and Developing New ANN Models

Other types of ANN, such as RBF, LVQ, SOM or their combinations, need to be tested to see which model gives better prediction results.

The dimension of the input feature space (input feature pattern) needs to be reduced to improve performance, such as speed and prediction accuracy.

If there is more than one prediction variable (multi-output rather than single-output), the architecture of the whole system may be either a multi-input multi-output ANN or a composition of several multi-input single-output ANNs. The training method as well as the performance should be further investigated and compared.

An enhanced custom software model with a graphical user interface should be developed, in which the user can select new input and output data sets to train the ANN and develop a model to predict the output. The user can then reuse this tool whenever a new model is to be created or an existing model renovated.


LIST OF REFERENCES

1. IEEE Trial-Use Guide for Electric Power Distribution Reliability Indices, IEEE Std 1366-2001, IEEE, New York, 1999.

2. 2001 Cost of Downtime, Contingency Planning Research (CPR) and Contingency Planning & Management Magazine (CPM). Website http://www.contingencyplanningresearch.com (accessed December 2001).

3. C. A. Warren, Overview of 1366-2001 the Full Use Guide on Electric Power Distribution Reliability Indices, Power Engineering Society Summer Meeting, IEEE, Volume 2, 2002.

4. Transmission Line Reference Book, 345kV and Above/Second Edition. Electric Power Research Institute, Palo Alto, CA, 1982.

5. Florida Power and Light website http://www.fpl.com (accessed June 2004).

6. Thomas E. Grebe, D. Daniel Sabin, and Mark F. McGranaghan, An Assessment of Distribution System Power Quality: Volume 1: Executive Summary. EPRI Report TR-106294-V1, Electric Power Research Institute, Palo Alto, California, May 1996.

7. D. Daniel Sabin, An Assessment of Distribution System Power Quality, Volume 2: Statistical Summary Report. EPRI Report TR-106294-V2, Electric Power Research Institute, Palo Alto, CA, May 1996.

8. Daniel L. Brooks and D. Daniel Sabin, An Assessment of Distribution System Power Quality: Volume 3: The Library of Distribution System Power Quality Monitoring Case Studies. EPRI Report TR-106294-V3, Electric Power Research Institute, Palo Alto, California, May 1996.

9. National Climatic Data Center (accessed June 2004), Website http://nndc.noaa.gov/?http://ols.ncdc.noaa.gov/cgi-bin/nndc/buyOL-002.cgi

10. A. Domijan, Jr., R. K. Matavalam, A. Montenegro, W. S. Willcox, Y. S. Joo, L. Delforn, J. R. Diaz, L. Davis, and J. D'Agostini, Effects of Normal Weather Conditions on Interruptions in Distribution Systems, International Journal of Power and Energy Systems, Publication No: 203-3453.

11. J. M. Zurada, Introduction to Artificial Neural Systems. West Publishing Company, St. Paul, MN, 1992.


12. L. F. Garcia and O. A. Mohammed, Forecasting Peak Loads with Neural Networks, Southeast Conference. Creative Technology Transfer: A Global Affair, Proceedings of the 1994 IEEE, pp. 351-356, 10-13 April, 1994.

13. S. I. Wu, Mirroring our Thought Processes. IEEE Potentials 14, 36-41, 1995.

14. Neuron Model & Network Architectures, Neural Networks Toolbox, MATLAB 6.0 Manual, Chapter 2.

15. W. M. Huang and R. P. Lippmann, Comparisons Between Neural Networks and Conventional Classifiers, Proc. IEEE Int. Conference on Neural Networks, pp. 485-493, 1987.

16. J. L. Chen and S. H. Chang, A Neural Network Approach to Evaluate Distribution Systems Engineering, IEEE International Conference on Neural Networks, pp. 487-490, 17-19 Sept. 1992.


BIOGRAPHICAL SKETCH

Roop Kishore R. Matavalam was born in Tirupati city, Andhra Pradesh state, India. He received his Bachelor of Technology (B.Tech) degree in 2001, specializing in electrical and electronics engineering, from Sri Venkateswara University, India. Since Fall 2001 he has been pursuing his Master of Science degree in electrical and computer engineering at the University of Florida, Gainesville. He has been working as a research assistant since 2001 in the Florida Power Affiliates and Power Quality Laboratory, University of Florida. His fields of interest include power reliability, power electronics, analog circuit design and RF microelectronics.


Permanent Link: http://ufdc.ufl.edu/UFE0006668/00001

Material Information

Title: Power Distribution Reliability as a Function of Weather
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0006668:00001













POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER


By

ROOP KISHORE R. MATAVALAM


















A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE

UNIVERSITY OF FLORIDA


2004


































Copyright 2004

By

Roop Kishore R. Matavalam


































To my parents and my sister
















ACKNOWLEDGMENTS

First and foremost, I thank my advisor, Dr. Alexander Domijan, for having

confidence in me and allowing me to pursue this research. The work presented in this

thesis would not be possible without his consistent support. I am also very thankful to Dr.

Khai D.T. Ngo and Dr. Antonio A. Arroyo for serving as committee members in my

thesis.

I sincerely express my gratitude to my colleague and friend William S. Wilcox for

his thorough discussions in statistics and suggestions, without which the current thesis

would not have been exciting. I am also grateful to my colleague and friend Alejandro

Montenegro for his suggestions and answers to my questions without any hesitation.

I am also very grateful to my friend Raj Vignesh Thogulua for helping me in

understanding the neural networks. I am also thankful to Dr. Tao Lin for his helpful

suggestions. I also thank my fellow colleague and friend Ajay Karthik for being

enthusiastic about my work.

I would like to express my special thanks to all the personnel of FPL Distribution

Reliability group especially Mr. J. R. "Pepe" Diaz, Ms. Lee Davis and Ms. Jessica

D'Agostini for their valuable suggestions and financial support of this project, without

their consistent support the current project would not have been finished. I am also

thankful to FPL members including Mr. Val Miklausich, Mr. Santiageo Cocina, Mr. Luis

Delforn, Mr. Manny Miranda, and Ms. Martha Caneia for their assistance.









I would like to express my gratitude to all the undergraduate students who worked

in FPL project and made it more lively and interesting. I also extend my gratitude to all

my great friends for their support and encouragement.

Finally, yet most importantly, I am indebted to my wonderful parents and sister for

believing in my goals, aspirations, for their love, encouragement, and constant support in

all my endeavors.















TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER

1 INTRODUCTION
1.1 Importance of Power Reliability
1.2 Understanding Power Reliability Indices
1.3 Purpose and Importance of this Thesis
1.4 Organization of Thesis

2 UNDERSTANDING FLORIDA WEATHER
2.1 Air Density, Temperature and Pressure
2.2 Humidity
2.3 Rain
2.4 Dew or Condensation of the Humidity
2.5 Pollution
2.6 Wind
2.7 Lightning

3 SYSTEM UNDER STUDY FPL
3.1 Description of the FPL distribution system
3.2 Interruption Data from FPL
3.3 Meteorological Weather Data from NCDC
3.4 Weather Parameters of interest

4 CORRELATION OF WEATHER AND INTERRUPTIONS
4.1 Importance of Statistical Tools
4.2 Probabilistic Characteristics of Data Distributions
4.3 Correlation Analysis between Weather Parameters and Interruptions
4.3.1 Impact of Temperature on N
4.3.2 Impact of Wind on N
4.3.3 Impact of Rain on N
4.3.4 Effect of Rain and Wind Together on N

5 PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS
5.1 Introduction to Artificial Neural Networks
5.1.1 Benefits of ANNs over statistical methods
5.1.2 Architecture of ANN
5.1.3 Functioning of ANN
5.1.4 Back Propagation Learning Rule
5.2 Steps to Enhance the performance of ANN
5.3 ANN Simulation Output
5.3.1 Detailed Observation
5.3.2 Dominant Weather Parameters Preliminary Observations
5.4 Analysis of ANN Simulation Output
5.5 Pitfalls and Suggestions to FPL
5.5.1 Weather Data
5.5.2 Interruption Data
5.6 Proving Localization of Weather Improves the Accuracy in Prediction
Case 1: Localized Weather Data
Case 2: Scattered Weather Data
5.7 Comparison of Statistical Model and ANN Model
5.8 Possible Software Development to Predict Power Interruptions Using ANNs

6 LIMITATIONS, CONCLUSIONS AND FUTURE WORK
6.1 Limitations of Approach
6.1.1 Weather Data
6.1.2 Unknown Variables
6.1.3 Outliers
6.1.4 Hourly Data
6.2 Conclusions
6.3 Future Work
6.3.1 Data Collection and Creating New Variables
6.3.2 Improving the Accuracy and Developing New ANN Models

LIST OF REFERENCES
BIOGRAPHICAL SKETCH
















LIST OF TABLES


Table

2-1 Type of Contaminant and Atmospheric Conditions at the Time of Contamination Flashover (UHV Project)

3-1 FPL Power Sales by Sectors

3-2 FPL Distribution Management Areas along with Their Dispatch Centers

3-3 FPL Cause Codes (102) Table

4-1 The Frequency of Interruptions due to Tree Limbs (Cause Codes 20 & 21)

4-2 Prediction of N Using Maximum Temperature of All the MAs

5-1 Summary Table of Covariance for All the Input Variables Considered in the Principal Component Analysis

5-2 Performance Comparison Between Statistical Model and ANN Model
















LIST OF FIGURES


Figure

2-1 Regions with strong contamination (UHV project)

2-2 Swing angle as a function of instantaneous wind speed at tower

2-3 Vegetation effects on power interruptions

2-4 Number of days with thunderstorms in Florida (US Weather Bureau)

2-5 Cumulative frequency distribution of peak current amplitudes in downward negative flashes

3-1 Snap shot of Florida map

3-2 FPL's historical SAIFI performance

3-3 Frequency charts of interruptions and customers affected by interruptions

4-1 N for all management areas from 1998 to 2001 using the previous filters

4-2 Rain (inches) for all management areas from 1998 to 2001

4-3 Wind-2 minutes maximum speed (mph) for all management areas from 1998 to 2001

4-4 Variation of average N due to transformer failures vs. maximum temperature

4-5 Variation of average N due to transformer failures vs. maximum temperature (averaged per month per year)

4-6 Variation of average N due to transformer failures vs. max temperature (averaged per month)

4-7 Variation of N vs. wind

4-8 Mean of 2 minutes wind speed vs. average number of interruptions

4-9 Variation of N vs. rain

4-10 Variation of N vs. rain in the interval [0 2]

4-11 Impact of rain and wind together on N

5-1 ANN structures

5-2 A back propagation ANN model

5-3 Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year 2002

5-4 Predicted N and the actual N for a few of the cases in North Dade MA for 2002

5-5 Numerical values of weather and interruption data under consideration

5-6 Histogram plot of predicted interruptions

5-7 Prediction results of ANN using the original N (not shift adjusted)

5-8 Prediction results of ANN using the adjusted N (shift adjusted)

5-9 Mean and standard deviation of actual N

5-10 Mean and standard deviation of adjusted N

5-11 Mean squared error vs. training epochs

5-12 Graphical user interface developed to predict interruptions

5-13 Predicted interruptions vs. actual interruptions for Central Dade

6-1 Average precipitation difference














Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER

By

Roop Kishore R Matavalam

August 2004

Chair: Alexander Domijan, Jr.
Major Department: Electrical and Computer Engineering

The system principles that are used to design and maintain electric power

distribution grids (distinct from transmission grids) are intended to minimize the number

and duration of power disturbances, including interruptions. Many of these principles,

such as load flow and load prediction are well understood and have been refined over

many years. However, the impact of local weather conditions on power distribution grids

has not been well researched. The current thesis is intended to improve our understanding

of the effects of weather on power distribution systems and to develop tools for the

prediction of weather related interruptions. Developing and disseminating this

information will allow electric power engineers to ultimately improve our nation's power

distribution capabilities.

The current research also presents the novel concept of predicting the number of

power interruptions in a distribution system using weather parameters. Preliminary

results show that it is possible to define and build a model, using artificial neural

networks (ANNs), which can use weather parameters as inputs and predict the number of









interruptions with reasonable accuracy. The accuracy of the prediction depends, in part,

on the accuracy of the weather data that are used in the model and, in part, on the

precision of the model. It is expected that the use of real-time surface weather data, such

as can be collected from well-sited weather stations, will eliminate the uncertainty

inherent in weather data collected from geographically distant sources.















CHAPTER 1
INTRODUCTION

For any power company, providing reliable electric service is the number-one priority. Unfortunately, power interruptions are sometimes unavoidable. The contribution of weather to these interruptions is significant, as this thesis will show. The terms power outage and power interruption are often used interchangeably, even by many people in the power industry, to mean the same thing: loss of power supply. There is, however, a fine difference between the two terms, as stated in the IEEE 1366 standard [1]:

An outage is defined in IEEE 1366-2001 as:

The state of a component when it is not available to perform its intended function
due to some event directly associated with that component. Notes: (1) An outage
may or may not cause an interruption of service to customers, depending on system
configuration. (2) This definition derives from transmission and distribution
applications and does not apply to generation outages.

An interruption is defined in IEEE 1366-2001 as:

The loss of service to one or more customers connected to the distribution portion
of the system. Note: It is the result of one or more component outages, depending
on system configuration.

From FPL standards, the loss of power supply is defined in two ways based on the duration of the power disturbance to the customer:

Momentary Interruption: Single operation (open-close) of an interrupting device which results in zero voltage for a period of 59 seconds or less.

Sustained Interruption: Power loss to the customer that has lasted for at least one minute.

Note: In the current thesis, "interruptions" means sustained interruptions, and N represents the total number of sustained interruptions per day.
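The two FPL definitions and the daily count N can be illustrated with a minimal Python sketch. The record layout (a date paired with a duration in whole seconds), the function names and the sample values are assumptions made for illustration only; they are not taken from the FPL database.

```python
from collections import Counter
from datetime import date

# Threshold from the FPL definition: 59 seconds or less is momentary.
MOMENTARY_LIMIT_SECONDS = 59


def classify(duration_seconds):
    """Label an interruption as momentary or sustained (whole seconds assumed)."""
    if duration_seconds <= MOMENTARY_LIMIT_SECONDS:
        return "momentary"
    return "sustained"


def daily_sustained_counts(records):
    """Return N, the number of sustained interruptions, for each day."""
    counts = Counter()
    for day, duration_seconds in records:
        if classify(duration_seconds) == "sustained":
            counts[day] += 1
    return counts


# Hypothetical records: (day, duration in seconds).
records = [
    (date(2002, 6, 1), 12),
    (date(2002, 6, 1), 95),
    (date(2002, 6, 1), 3600),
    (date(2002, 6, 2), 59),
    (date(2002, 6, 2), 60),
]
n_per_day = daily_sustained_counts(records)
```

Only the records lasting one minute or longer contribute to N, which matches the convention stated in the note above.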

1.1 Importance of Power Reliability

Because of the huge financial losses to the economy, as well as the loss of customer satisfaction, associated with power interruptions, power reliability is one of the most important concerns for electric utilities. Power reliability modeling and indexing are among the tools used by utilities to manage costs and monitor equipment performance, and improvements in the flexibility of reliability assessment models will ultimately result in increased savings. According to Contingency Planning Research Company's annual study [2], downtime caused by power disturbances results in major financial losses, as shown in Figure 1-1.



Figure 1-1. Average hourly impact of downtime and data loss by business sector



Traditionally, the reliability of an electric distribution system is measured in terms of the total number of power interruptions that occurred.

Currently, a variety of indices are available to measure the reliability of a power utility. These indices are used as a yardstick within a utility to see how well it is doing each year, and they also serve as a tool for comparing its performance with that of other utilities of different sizes in the country.

1.2 Understanding Power Reliability Indices

Among many power distribution reliability indices available [3], the following are

some of the widely used customer based indices:

SAIFI:
System average interruption frequency index (sustained interruptions). This index is designed to give information about the average frequency of sustained interruptions per customer over a predefined area. In words, the definition is

SAIFI = (Total number of customer interruptions) / (Total number of customers served)

Mathematically it is represented as

$$\mathrm{SAIFI} = \frac{\sum_i N_i}{N_T}$$

where $N_i$ is the number of customers interrupted by sustained interruption event $i$ and $N_T$ is the total number of customers served.

SAIDI:
System average interruption duration index. This index is commonly referred to as customer minutes of interruption or customer hours, and is designed to provide information about the average time the customers are interrupted.

SAIDI = (Customer interruption durations) / (Total number of customers served)

Mathematically it is represented as

$$\mathrm{SAIDI} = \frac{\sum_i r_i N_i}{N_T}$$

where $r_i$ is the restoration time for interruption event $i$.

CAIDI:
Customer average interruption duration index. CAIDI represents the average time required to restore service to the average customer per sustained interruption. In words, the definition is

CAIDI = (Customer interruption durations) / (Total number of customer interruptions)

Mathematically it is written as

$$\mathrm{CAIDI} = \frac{\sum_i r_i N_i}{\sum_i N_i} = \frac{\mathrm{SAIDI}}{\mathrm{SAIFI}}$$

ASAI:
Average service availability index. This index represents the fraction of time (often expressed as a percentage) that a customer has power provided during one year or the defined reporting period. In words, the definition is

ASAI = (Customer hours service availability) / (Customer hours service demand)

To calculate the index, the following mathematical equation is used, with $r_i$ expressed in hours:

$$\mathrm{ASAI} = \frac{N_T \times (\text{No. of hours/year}) - \sum_i r_i N_i}{N_T \times (\text{No. of hours/year})}$$

Note that there are 8760 hours in a regular year and 8784 in a leap year.
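As a worked illustration of the four definitions above, the following minimal Python sketch computes SAIFI, SAIDI, CAIDI and ASAI from a list of sustained interruption events. The `Interruption` record, the function name and the sample figures are hypothetical and chosen only to make the arithmetic easy to follow; restoration times are assumed to be in hours.

```python
from dataclasses import dataclass


@dataclass
class Interruption:
    customers: int  # N_i: customers interrupted by this event
    hours: float    # r_i: restoration time in hours


def reliability_indices(events, customers_served, hours_in_period=8760):
    """Compute SAIFI, SAIDI, CAIDI and ASAI for one reporting period."""
    customer_interruptions = sum(e.customers for e in events)    # sum of N_i
    customer_hours = sum(e.customers * e.hours for e in events)  # sum of r_i * N_i
    saifi = customer_interruptions / customers_served
    saidi = customer_hours / customers_served
    # CAIDI is undefined when no customer was interrupted; report 0.0 here.
    caidi = customer_hours / customer_interruptions if customer_interruptions else 0.0
    demand = customers_served * hours_in_period
    asai = (demand - customer_hours) / demand
    return {"SAIFI": saifi, "SAIDI": saidi, "CAIDI": caidi, "ASAI": asai}


# Hypothetical year: two events on a system serving 10,000 customers.
events = [Interruption(customers=500, hours=2.0),
          Interruption(customers=1500, hours=0.5)]
indices = reliability_indices(events, customers_served=10_000)
```

With these sample values there are 2,000 customer interruptions and 1,750 customer hours of interruption, so SAIFI is 0.2, SAIDI is 0.175 hours and CAIDI is 0.875 hours, which also equals SAIDI divided by SAIFI.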

Other customer-based reliability indices in use include CTAIDI and CAIFI. Load-based sustained indices include ASIFI and ASIDI. Momentary indices include CEMSMIn, MAIFI and MAIFIE. The newest indices are CEMIn (customers experiencing multiple interruptions) and CEMSMIn (customers experiencing multiple sustained interruptions and momentary interruption events). The use of CEMIn as a basic performance measure is under consideration [3] by many states in the USA. In Florida, use of the CEMI5 index is under consideration. If this value is exceeded, the commission is considering fines that would be paid to the customers who experienced poor performance.
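For readers unfamiliar with CEMIn, the sketch below shows how such an index could be computed, assuming the usual IEEE 1366 reading in which CEMIn is the fraction of customers served who experienced more than n sustained interruptions in the reporting period. The customer identifiers, counts and function name are hypothetical.

```python
def cemi(interruptions_per_customer, customers_served, n):
    """Fraction of customers with more than n sustained interruptions."""
    affected = sum(1 for count in interruptions_per_customer.values() if count > n)
    return affected / customers_served


# Hypothetical counts for the customers that had at least one interruption,
# on a system serving 10 customers in total.
counts = {"A": 7, "B": 2, "C": 6, "D": 5}
cemi5 = cemi(counts, customers_served=10, n=5)
```

In this example only customers A and C exceed five interruptions, so CEMI5 is 2 out of 10 customers.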

Figure 1-2 shows a snapshot of the percentage usage of the different reliability indices, as reported by the IEEE working group on system design [1]. Surveys showed that the most commonly used indices are SAIFI, SAIDI, CAIDI, and ASAI.










Figure 1-2. Percentage of companies using a given index reporting in 1990 (horizontal axis: index, covering SAIFI, SAIDI, CAIDI, ASAI, ASIFI, ASIDI, MAIFI, OTHER, CTAIDI and CAIFI; vertical axis: percentage of companies)

1.3 Purpose and Importance of this Thesis

It is evident from the indices discussed above that the total number of interruptions must be reduced in order to improve the reliability of a distribution system. Although many parameters and conditions are responsible for these interruptions, weather makes a significant contribution. Research carried out by the Electric Power Research Institute [4] showed the effects of weather components such as lightning, rain and wind on transmission lines (345 kV and above). However, little comparable work has been done for distribution lines, which prompted the research in this new direction.

The purpose of the current thesis was to study the impact of normal weather conditions on distribution power interruptions and to develop correlation models. The study also examines how this correlation knowledge can be used to reduce power interruptions by incorporating artificial neural networks (ANNs). Using weather and interruption data, novel prediction tools and modeling methods were developed; a similar approach can be applied to any electric utility in the country. With the tools developed in this thesis, any power utility would be able to estimate the approximate total number of interruptions that might occur in the future due to weather. The accuracy of the prediction depends, in part, on the accuracy of the forecast weather data that serve as input to the model and, in part, on the accuracy of the model. It is expected that the use of real-time surface weather data, such as can be collected from well-sited weather stations, will eliminate the uncertainty inherent in weather data collected from geographically distant sources.

Despite a thorough search of the available literature, examples of the use of surface

weather data in the construction of power reliability models have not been found. It is

expected that this project will contribute significantly to the existing literature by

providing predictive models, as well as background in a previously unexplored area.

Moreover, this research is valuable for the exploration of proper ANN structure, internal

parameters and feature pattern extraction methods for application to power systems.

1.4 Organization of Thesis

The current thesis comprises six chapters. The current chapter gives the motivation, literature survey and importance of the project. The second chapter describes the weather conditions prevalent in the state of Florida and gives a broad overview of the different weather parameters. The third chapter deals with the Florida Power & Light (FPL) system and the weather and interruption data considered in the analysis. The fourth chapter presents the approach followed in developing correlation models between weather parameters and power interruptions using statistical tools. The fifth chapter presents a novel model using neural networks that can be used to predict power interruptions from forecasted weather parameters. The last chapter discusses the limitations of the research results and presents the conclusions of the thesis, followed by future work.














CHAPTER 2
UNDERSTANDING FLORIDA WEATHER

Because weather in Florida varies widely, this chapter explains the weather parameters that are most common in Florida and their impact on power distribution lines. Florida is also known as the lightning capital of the world.

The natural weather phenomena that can reduce the strength of insulators [4] in the state of Florida are air density, temperature, pressure, humidity, rain, dew or condensation of humidity, pollution, wind and lightning.

2.1 Air Density, Temperature and Pressure

The temperature in the state of Florida varies between 20°F and 100°F, and the pressure does not change significantly. In the absence of abnormal conditions such as fire, the influence of these factors on the strength of the insulators is around 5%.

2.2 Humidity

The humidity in Florida changes significantly throughout the year and throughout the day. Usually, the humidity can be very high between midnight and 6 or 7 A.M., after which it starts to decrease. It starts to increase again at night. Humidity without condensation can affect the strength of the insulator by around 16%.

2.3 Rain

Rain can be classified as weak rain (mist or drizzle) or strong rain (rainfall). The influence of rain on insulator strength depends on its intensity and direction. The maximum influence of rain [4] on clean insulators is around 30%.

Mist or drizzle can have a stronger impact on flashover when it combines with pollution on the insulator strings.

2.4 Dew or Condensation of the Humidity

When the temperature decreases to the dew point, condensation starts to take place on the surface of the insulators and a thin layer of water appears. Condensation can have the same effect on an insulator as mist.

2.5 Pollution

Pollution can reduce the strength of an insulator. Its influence on flashover depends

on the type of contamination and its concentration on the surface of the insulators.

Two types of contamination are of concern here: spot contamination and area contamination. The distribution of contamination in Florida [4] is shown in Figure 2-1. Black dots refer to spot contamination and shaded regions refer to area contamination.












Figure 2-1. Regions with strong contamination (UHV project)

Table 2-1 shows the types of contamination that cause flashover. In dry conditions
most of them are not good conductors; however, in wet conditions due to condensation,
drizzle, mist or rain, their conductivity increases substantially.











Table 2-1. Type of Contaminant and Atmospheric Conditions at the Time of
Contamination Flashover (UHV Project)

TABLE 10.3.5 TYPE OF CONTAMINANT, WEATHER, AND ATMOSPHERIC CONDITIONS AT TIME OF
CONTAMINATION FLASHOVER
Type of Drizzle, No High Wet
Contaminant Fog Dew Mist Ice Rain Wind Wind Snow Fair
Sea salt 14 11 22 1 12 3 12 3
Cement 12 10 16 2 11 4 1 4
Fertilizer 7 5 8 1 1 4
Flyash 11 6 19 1 6 3 1 3 1
Road salt 8 2 6 4 2 6
Potash 3 3
Cooling tower 2 2 2 2
Chemicals 9 5 7 1 1 1 1
Gypsum 2 1 2 2 2
Mixed contamination 32 19 37 13 1 1
Limestone 2 1 2 4 2 2
Phosphate and sulfate 4 1 4 3
Paint 1 1 1
Paper mill 2 2 4 2 1
Dried milk 1 1 1 1 1
Acid exhaust 2 3
Bird droppings 2 2 3 1 2 2
Zinc industry 2 1 2 1 1
Carbon 5 4 5 4 3 3
Soap 2 2 1 1
Steel works 6 5 3 2 2 1
Carbide residue 2 1 1 1 1
Sulphur 3 2 2 1 1
Copper and nickel salt 2 2 2 2 1
Wood fiber 1 1 1 1
Bulldozing dust 2 1 1
Aluminum plant 2 2 1 1
Sodium plant 1 1
Active dump 1 1 1
Rock crusher 3 3 5 1
Total 146 93 166 8 68 26 19 37 4
Percent weather 25.75 16.4 29.3 1.4 12 4.58 3.36 6.52 0.71
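
The "Percent weather" row can be checked against the "Total" row. The following is a minimal sketch, assuming the nine totals map onto the nine column headings in the order printed; the dictionary and function names are illustrative only.

```python
# Column totals from the "Total" row of Table 2-1 (UHV project data).
# Assumption: the totals follow the printed column order.
TOTALS = {
    "Fog": 146, "Dew": 93, "Drizzle, Mist": 166, "Ice": 8, "Rain": 68,
    "No Wind": 26, "High Wind": 19, "Wet Snow": 37, "Fair": 4,
}

def percent_by_weather(totals):
    """Return each weather condition's share of all recorded flashovers, in percent."""
    grand_total = sum(totals.values())
    return {name: 100.0 * count / grand_total for name, count in totals.items()}

shares = percent_by_weather(TOTALS)
for name, share in shares.items():
    print(f"{name:14s} {share:6.2f} %")
```

Each share is simply the column total divided by the sum of all column totals.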






Pollution combined with condensation and rain can be considered the worst
condition for the reduction of insulator strength. Moreover, dew, drizzle and mist are
considered the most important weather components at the time of flashover, accounting
for 72% of the cases [4].

Rain has two opposing effects. On one hand, it reduces the withstand values of the
insulators. On the other hand, it cleans the surface of the insulators, thereby
protecting the system from new flashovers due to dew (condensation), mist and pollution.


2.6 Wind


Special weather conditions such as storms, thunderstorms, hurricanes and tornadoes
have a direct effect on power interruptions. However, these events are a combination of
rain, wind and lightning. Therefore, each of these weather components must be analyzed
individually in order to study the real cause behind the flashover. The influence of
wind speed on the swing angle, and therefore on the minimum clearance required to avoid
possible flashover, can be evaluated. Wind can cause catastrophic mechanical damage due
to asynchronous movements of the cables and/or insulators. This damage is more
significant in transmission lines, where the span is larger. Figure 2-2 shows the swing
angle of a single conductor vs. wind speed [4].




[Plot: swing angle as a function of mean wind speed; x-axis: wind speed (meter/sec), 5 to 35]
Figure 2-2. Swing angle as a function of instantaneous wind speed at tower

Distribution lines have small spans, so the asynchronous movement of the
conductors usually causes insignificant disturbance except at high wind speeds. The
most significant disturbance due to wind comes from the movement of trees and their
branches. Untrimmed trees can touch the lines and cause a flashover leading to an
outage.

Figure 2-3 shows uncut trees and branches touching the distribution lines
in Gainesville, Florida.


























(a) (b) (c) (d)
Figure 2-3. Vegetation effects on power interruptions

2.7 Lightning

The number of days with thunderstorms per year in Florida is between 80 and 100, as
shown in Figure 2-4 (isokeraunic level). As Figure 2-4 shows, Florida is a state
strongly affected by atmospheric discharges. The number of lightning strokes to earth
per square mile per year can be found through the expression:

N = 0.25 I

where I is the local isokeraunic level.









If a line has a shadow width of W (in feet), the number of lightning strokes expected
to hit it per year is

NL = 0.25 I L W / 5280

where W = b + 4h, L is the length of the line in miles, h is the height of the shield
(ground) wires and b is the distance between shield wires. If the line does not have
shield wires, h is the height of the conductors and b is the distance between the
external conductors.
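
The two expressions above can be evaluated directly. The following is a minimal sketch, assuming h and b are given in feet (consistent with the division by 5280 feet per mile); the function names and the example line are hypothetical and not taken from the FPL system.

```python
FEET_PER_MILE = 5280.0

def strokes_per_sq_mile(isokeraunic_level):
    """N = 0.25 * I: strokes to earth per square mile per year."""
    return 0.25 * isokeraunic_level

def strokes_to_line(isokeraunic_level, length_miles, height_ft, spacing_ft):
    """NL = 0.25 * I * L * W / 5280, with shadow width W = b + 4h in feet."""
    shadow_width_ft = spacing_ft + 4.0 * height_ft
    return (strokes_per_sq_mile(isokeraunic_level) * length_miles
            * shadow_width_ft / FEET_PER_MILE)

# Hypothetical example: I = 90, a 10-mile line, conductors 40 ft high and 8 ft apart.
example = strokes_to_line(90, 10.0, 40.0, 8.0)
print(f"Expected strokes to the line per year: {example:.2f}")
```

With these illustrative inputs the shadow width is 168 ft, so the result scales linearly with both line length and isokeraunic level.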

Strokes on transmission or distribution lines with shield wires are expected to
cause back flashover. The voltage across the insulator string in this case depends on
the tower foot resistance, the current through the tower and the coefficient of
coupling between the shield wire and the phase conductor.


[Map of Florida with isokeraunic contour lines]
Figure 2-4. Number of days with thunderstorms in Florida (US Weather Bureau)

Thus, flashover or interruptions due to lightning depend on the tower foot
resistance and also on the intensity of the current. It is not possible to control the
current, so all the control must be exercised through the tower foot resistance.
Distribution lines without shield wires are directly affected by lightning. The level
of the voltage across the string or insulator depends on the intensity of the current
and the magnitude of the surge impedance of the line. Figure 2-5 shows the crest
amplitude of the strokes against the probability of occurrence. It can be seen that the
probability of a stroke with a crest current of at least 5 kA is almost 1. On striking
the conductor, this current divides into two halves. Thus, the voltage across the
insulators or strings of distribution lines with surge impedance between 150 Ω and
250 Ω will be roughly between 375 kV and 625 kV. Thus, in most cases, distribution
lines up to 69 kV (BIL up to 350 kV) will experience a flashover for practically every
stroke that hits them.
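
The 375 kV and 625 kV figures follow from half the stroke current flowing through the surge impedance. The following is a minimal sketch of that arithmetic; the function name is illustrative, and the 350 kV BIL is the value quoted in the text.

```python
def insulator_voltage_kv(stroke_current_ka, surge_impedance_ohm):
    """Voltage across the insulator for a direct stroke to an unshielded line.

    The stroke current splits into two halves travelling in opposite directions,
    so V = (I / 2) * Z. With I in kA and Z in ohms the result is in kV.
    """
    return 0.5 * stroke_current_ka * surge_impedance_ohm

BIL_KV = 350.0  # basic insulation level quoted in the text for lines up to 69 kV

for z in (150.0, 250.0):
    v = insulator_voltage_kv(5.0, z)
    print(f"Z = {z:.0f} ohm -> V = {v:.0f} kV, exceeds BIL: {v > BIL_KV}")
```

Both ends of the surge impedance range give a voltage above the quoted BIL, which is why nearly every direct stroke is expected to cause a flashover.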



[Plot: probability of occurrence vs. crest current (kA); x-axis ticks: 10, 20, 50, 100, 200, 500]


Figure 2-5. Cumulative frequency distribution of peak current amplitudes in downward
negative flashes














CHAPTER 3
SYSTEM UNDER STUDY FLORIDA POWER & LIGHT

The preliminary results cited in this thesis were obtained while working on a
project sponsored by Florida Power & Light (FPL). However, the concept can be applied
to any system and can be enhanced as explained in this thesis. The power distribution
interruption data of FPL were used to study the effects of weather on FPL's
interruption patterns.

3.1 Description of the FPL distribution system

FPL is among the largest and fastest growing electric utilities in the United States.
As of December 2002, FPL had 9,612 employees serving nearly 8 million people, or
about half the state of Florida. Power is delivered (Table 3-1) safely and reliably
from 86 generating units with a total generation capability of 26,203 MW, through more
than 500 substations and more than 69,000 miles of transmission and distribution
lines [5].

Table 3-1. FPL Power Sales by Sectors
Sector Number of Accounts Total Sales (kWh)
Residential 3,521,146 50.9%
Commercial 430,471 40.6%
Industrial 15,248 4.4%
Other** 2,746 4.1%
* Monthly average as of December 2001
** Includes public authorities, railway, wholesale and interchange.

The FPL distribution system is divided into 16 management areas grouped under two
regions, urban and suburban (Table 3-2). Figure 3-1 shows the location and area covered
by each of the FPL distribution management areas (MAs).









Table 3-2. FPL Distribution Management Areas along with Their Dispatch Centers

DISPATCH CENTER                    AREAS
West Palm Beach Dispatch Center    TC- Treasure Coast, WB- West Palm, BR- Boca Raton
South Florida Dispatch Center      PM- Pompano, WG- Wingate, GS- Gulfstream, ND- North Dade,
                                   WD- West Dade, CE- Central, SD- South Dade
Daytona Dispatch Center            NF- North Florida, CF- Central Florida, BV- Brevard
Sarasota Dispatch Center           MS- Manasota, TB- Toledo Blade, GC- Gulf Coast


(a) (b)
Figure 3-1. Snapshot of Florida map (a) FPL distribution management areas. (b) Weather
stations chosen within the FPL area

3.2 Interruption Data from FPL

Power interruption data were obtained primarily from Florida Power & Light (FPL).
FPL has divided its service territory into 16 sections called Management Areas
(MAs). An interruption data file was created for each of these 16 MAs. The interruption
data were made available to us in a data storage program known as "Power-Play." For
clarity, interruptions are classified into different groups (Figure 3-4) based on the
type of cause responsible for them. For example, interruptions caused by squirrels are
recorded under the squirrel cause code (007). A cause code indicates the principal
cause of an interruption. With this facility, we are able to see the interruptions that
occurred due to any specific cause, as shown in Table 3-3.


Table 3-3. FPL Cause Codes (102) Table


REV 6/16/98 SC


CAUSE CODES
(Required for all interruptions)


Natural Causes
001-E Lightning, with equip. damage
002 Storm w/no equip.damage
003-E Fire
004-E Salt Spray Corrosion
007 Squirrel
009 Bird
011 Other Animal
013 Tornado
014 Hurricane
015 Ice on Lines
020 Tree/Limb Preventable
021 Tree/Limb Unpreventable
023-E Decay/Deterioration
024-E Corrosion (Non Salt Spray)
025 Vines/Grass
026-E Contamination (Non S /S)


187-E Equipt. Failed, Cause Unk
190 Unknown
Other Causes
170 Wrong size fuse
171 Overloaded Device
178 Non-standard Construction
183 Improper Installation
191-E Vandalism
193 Customer Request
195 Crew Request (Planned)
196 Slack Conductors
197 Other (explain)
202-E Loose Connection
Accidental Causes
040 Vehicle
041 Accidental Contact
046 Switching Error
079 Dig-In (Proper Depth)


Support and Follow-Up Codes
(Codes to be used as Support or Follow-up Only)
No Animal Guard 241 Injection Elbow (Not Installed)


022 Palm Tree
050 Foreign Crew or Customer
066 FPL Crew
067 FPL Distribution Contractor
068 FPL Line Clearing Contractor
069 Transmission Contractor
075 Improper Depth
100 Inadequate/No Ground
192 Crew Request (Forced Outage)
199 Defective Material-UPR
222 Power Temp Used
240 Injection Elbow (Installed)


242 Flow (Positive)
243 Flow (None)
244 Injection Comp
245 Injection Job Pndng
250 Cable Replace Job Pending
251 Cable Replace Job Comp
260 Fault Locator Used
265 Cleared by Phone
271 Injected (8/96 on)
272 Replaced (8/96 on)
299 Data Corrected
999 Named Storm Exclusions


Note: The suffix "E" denotes that an Equipment Code is required.
nn Nnt untar .*C" an T Mu


Overhead
080 Down Guy or Anchor
081 Pole
082 Cross Arm
083 Insulator
084 Pole Top Pin
087 Tie Wire
088 Jumper
089 Stirrup
090 Hot Line Clamp
092 Disconnect Switch
093 Fuse Switch
096 Line OCR
097 Line Capacitor
098 Line Regulator
104 Conductor Down
105 Conductor Damaged


Equipment Codes
Underground
110 Terminator
111 Cable
113 Elbow
114 Tx Fuse Switch
115 Tx Blade Switch
116 Bayonet Switch
121 Padmount Switch
122 Oil Fuse Cutout
123 RA Switch
124 Mech for Throwover Sw.
125 PT Fuse
126 ConductCKT Fuse
127 Control Cable
132 Handhole
134 Bushing
135 Pothead
211 Injected Cable


Overhead or Underground
085 Arrester 102 Other Equipment
091 Connector 103 Splice
094 Transformer 106 Automated Switch (DA)


095 Step Down Transformer
Meter
160 Meter
161 Meter Blocks, Repairable
162 CT's
163 PT's
164 Other Meter Equipment
165 Meter Blocks-Not Reparable
200 Transmission related


Substation
140 OCB (Feeder Breaker)
141 Regulator
142 Reactor
143 Relay
147 Step Down Transformer
148 Other Substation Equip.
150 SCADA
151 Telecommunications


Weather Related Codes
                   LIGHTNING PRESENT   NO LIGHTNING PRESENT
EQUIP DAMAGE       001-E               187-E
NO EQUIP DAMAGE    002                 190


The interruption data will be daily totals broken down by cause code. For example, a



specific substation may experience three interruptions on September 29, 1998, due to



cause code 093 fuse switch. Each data point will be small, but the compilation of many



years and many areas will provide a statistically significant sampling.



The interruptions of interest to us were further defined by the use of the following



filters:



* With Exclusions This filter suppresses interruption data that is defined as

exclusionary by FPL, including hurricane and tornado damage. We use this filter

because we are interested in the effects of normal weather conditions.



* Overhead This filter includes only those interruptions that are caused by faults in

overhead equipment or lines. Underground lines were considered immune to most

weather conditions.











* Internal Distribution Interruptions located at the distribution system only were
taken into account.

* Primary Only primary systems (feeders, laterals and oil circuit breakers) are
within the scope of this research.

* Substation Each substation reports all the interruptions occurring in the secondary
distribution system it supplies.

* Cause code FPL uses numeric cause codes to specify the causes of interruption.
General categories include natural causes, equipment and accident.

* Dates All relevant days from January 1, 1998 through December 31, 2001 were
considered. Up-to-date interruption data can be requested from FPL.
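
The filters listed above can be pictured as a predicate applied to each interruption record before the daily totals per cause code are formed. The following is a minimal sketch under stated assumptions: the record layout, field names and sample records are hypothetical, since the text does not describe the Power-Play export format.

```python
from collections import Counter
from datetime import date

# Hypothetical record layout; the real Power-Play export is not described in the text.
records = [
    {"day": date(1998, 9, 29), "cause": "093", "overhead": True, "primary": True, "excluded": False},
    {"day": date(1998, 9, 29), "cause": "093", "overhead": True, "primary": True, "excluded": False},
    {"day": date(1998, 9, 29), "cause": "014", "overhead": True, "primary": True, "excluded": True},
    {"day": date(1998, 9, 30), "cause": "111", "overhead": False, "primary": True, "excluded": False},
    {"day": date(2002, 1, 5), "cause": "020", "overhead": True, "primary": True, "excluded": False},
]

START, END = date(1998, 1, 1), date(2001, 12, 31)

def keep(rec):
    """Apply the exclusion, overhead, primary and date filters to one record."""
    return (not rec["excluded"] and rec["overhead"] and rec["primary"]
            and START <= rec["day"] <= END)

def daily_totals(recs):
    """Count the retained interruptions per (day, cause code)."""
    return Counter((r["day"], r["cause"]) for r in recs if keep(r))

totals = daily_totals(records)
print(totals)
```

Only the two fuse switch (093) records on September 29, 1998 survive all of the filters in this illustrative sample.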

Assumptions about the FPL system: An important consideration in choosing FPL is
that they have assured us that we can assume their equipment is homogeneous throughout
their area of operations (AO). Homogeneity of equipment is a necessary condition for
statistically significant results.

Scope of the current thesis: As FPL personnel have already carried out correlation
analysis between lightning strikes and power interruptions, they are interested in
knowing the indirect effects of weather, including wind, temperature, rain, etc., on
the total number of power interruptions. So the scope of the current research work is
limited to these parameters only.

Although interruptions represent only 3% to 5% [6-8] of all disturbances, a common
method for measuring the reliability of an electric distribution system is based on the
number of customers interrupted, which is proportional to the number of interruptions,
as explained in Chapter 1. Let us revisit the definition of SAIFI, the reliability
index that FPL uses most often. IEEE Standard 1366 defines the










System Average Interruption Frequency Index (SAIFI) with the following formula:

\[ \mathrm{SAIFI} = \frac{\sum N_i C_i}{C_b} \]

where

N_i = Number of interruptions (sustained interruptions lasting over 1 minute)

C_i = Customers interrupted for each interruption

C_b = Customer base or customers served
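
As a rough illustration of the index, the following sketch divides the total number of customers interrupted over all sustained events by the customer base. It assumes that one count of customers interrupted is available per event; the function name and the numbers are hypothetical, not FPL data.

```python
def saifi(customers_interrupted_per_event, customers_served):
    """Total customers interrupted over all sustained events, divided by the customer base."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    return sum(customers_interrupted_per_event) / customers_served

# Hypothetical year: three sustained interruptions on a system serving 10,000 customers.
events = [1200, 300, 2500]
index = saifi(events, 10_000)
print(f"SAIFI = {index:.2f} interruptions per customer")
```

A value of 0.40 would mean that the average customer experienced 0.4 sustained interruptions during the period.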

[Plot: FPL SAIFI by year, from D-95 to D-01 (J = January, D = December)]

Figure 3-2. FPL's historical SAIFI performance


SAIFI indicates how often the average customer experiences a sustained
interruption (> 1 min) over a predetermined period of time, and it has special
importance in decision making for engineers working in distribution reliability. A
typical breakdown, by significance, of the major causes of customer interruptions and
of the number of interruptions in the FPL distribution system under study over a
period of 12 months is shown in Figure 3-3.




[Bar charts: Customers Interrupted, 2001, by major cause; Number of Interruptions, 2001, by major cause]

Figure 3-3. Frequency charts of interruptions and customers affected by interruptions. (a)

number of customers interrupted vs. causes and, (b) number of interruptions a









The previous graphs show the relative importance of the direct effects of weather
(storms and lightning) on interruptions, but not the indirect effects, i.e.,
temperature, rain, wind, etc. From these charts it is not possible to determine whether
interruptions associated with, for example, vegetation or equipment are indirectly
affected by weather conditions such as temperature, rain or wind.

This thesis shows that normal weather conditions do affect interruptions and that
those effects can be quantified. The benefits of this type of study are the ability to
explain trends in SAIFI due to weather conditions and an aid in the development of
indicators for possible use in anticipating interruptions.

3.3 Meteorological Weather Data from NCDC

We collected daily average weather data for rain, temperature, wind speed and
other parameters from Automated Surface Observation Stations (ASOSs) located at
airports throughout the area of operations (AO). Construction of these stations began
in 1981 as an aid to air navigation, and they have since become the most comprehensive
source of weather data in the United States. For the stations we are interested in, we
use data from 1996 through 2002. As we are an educational institution, the National
Climatic Data Center (NCDC), a department of the National Oceanic and Atmospheric
Administration [9], makes these data available to us free of charge.

The greatest difficulty in collecting these data is their sheer volume. Six years of
data from one ASOS generate a file containing 24 columns and 2190 lines. Stacking the
files from all the ASOSs in the AO generates a composite file with more than 20,000
lines. To add to this problem, there are missing days, missing data points and
formatting that cannot be imported into the statistical program of our choice. A final
editing of the data was done by brushing (taking out) those data points that do not
make sense: data points with zero sea-level barometric pressure, zero average
temperature, a 2-minute maximum wind gust of 100 mph, etc.
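
The brushing step described above amounts to a plausibility filter on each daily observation. The following is a minimal sketch, assuming hypothetical field names (the actual NCDC column labels are not given in the text) and treating a 2-minute gust of 100 mph or more as implausible.

```python
# Hypothetical field names; the actual NCDC column labels are not given in the text.
def is_plausible(obs):
    """Reject observations that the text describes as not making sense."""
    return (obs["sea_level_pressure"] != 0
            and obs["avg_temp_f"] != 0
            and obs["max_2min_gust_mph"] < 100)

def brush(observations):
    """Return only the plausible daily observations, leaving the input untouched."""
    return [obs for obs in observations if is_plausible(obs)]

sample = [
    {"sea_level_pressure": 30.02, "avg_temp_f": 78, "max_2min_gust_mph": 17},
    {"sea_level_pressure": 0.0, "avg_temp_f": 80, "max_2min_gust_mph": 12},
    {"sea_level_pressure": 29.95, "avg_temp_f": 0, "max_2min_gust_mph": 9},
    {"sea_level_pressure": 29.90, "avg_temp_f": 75, "max_2min_gust_mph": 100},
]
cleaned = brush(sample)
print(len(cleaned), "of", len(sample), "observations kept")
```

Keeping the rules in one predicate makes it easy to add further checks for other implausible values.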

To address this, we wrote programs in C/C++ that correct the omissions and
convert the NCDC files to a generic text file that can be imported by any commercial
spreadsheet program presently in use. Since we anticipate the use of ASOS data by any
power engineer using our methods, file conversion is required by our objective.
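
The conversion programs themselves are not reproduced in the text. The following Python sketch shows one way such a step could work, assuming that "correcting the omissions" means emitting an explicit row with empty fields for every missing day; the function names, column names and sample values are hypothetical.

```python
import csv
import io
from datetime import date, timedelta

def fill_missing_days(rows_by_day, start, end, columns):
    """Yield one row per calendar day; days absent from the input get empty fields."""
    day = start
    while day <= end:
        values = rows_by_day.get(day, {})
        yield [day.isoformat()] + [values.get(col, "") for col in columns]
        day += timedelta(days=1)

def to_tab_text(rows_by_day, start, end, columns):
    """Write the completed series as tab-delimited text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["date"] + list(columns))
    writer.writerows(fill_missing_days(rows_by_day, start, end, columns))
    return buffer.getvalue()

# Hypothetical three-day window in which January 2 is missing from the source file.
data = {
    date(1998, 1, 1): {"avg_temp_f": 61, "rain_in": 0.0},
    date(1998, 1, 3): {"avg_temp_f": 64, "rain_in": 0.12},
}
text = to_tab_text(data, date(1998, 1, 1), date(1998, 1, 3), ["avg_temp_f", "rain_in"])
print(text)
```

Making the gaps explicit keeps every station file the same length, which simplifies stacking the files from several stations.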

3.4 Weather Parameters of interest

Although many weather parameters are available in the weather files downloaded
from the NCDC website, we used only those parameters that are of interest. A weather
data file was created for each of the 15 ASOSs (one particular ASOS covered two
regions). The following daily weather parameters were downloaded from the NCDC
database:

* Average temperature
* Maximum temperature
* Minimum temperature
* Average dew point
* Significant observations
* Total rainfall
* Barometric pressure (sea level and station)
* Average wind speed
* Two-minute maximum sustained wind gust
* Five-second maximum sustained wind gust


Weather is a complex combination of many parameters including, but not limited to,
wind, lightning, condensation, temperature, rain, pressure, humidity, cosmic dust,
solar storms, hurricanes, storms, etc., and the list goes on if all the meteorological
terms are included, some of which we are not even aware of. Fortunately, if only the
daily prevailing weather conditions are considered, many parameters can be neglected by
placing them under the category of extreme weather conditions, i.e., parameters that
are not part of common daily weather, e.g., hurricanes, storms, lightning, etc.
Therefore, the major focus was placed on weather parameters such as wind, temperature,
rain, pressure, humidity, etc., which are not extreme weather conditions. Among these
common weather parameters, only wind, rain and temperature are investigated, because of
their dominant role [10] in power distribution interruptions.















CHAPTER 4
CORRELATION OF WEATHER AND INTERRUPTIONS


The consequences of power interruptions can range from mild or severe inconvenience,
such as missing a favorite TV show or losing critical data, to life-threatening
situations, such as the failure of traffic signals. Less obvious consequences include
increased cost to the customer due to increased maintenance and repair costs for the
provider. Because of these consequences, power engineers are always researching methods
to reduce the number of interruptions.

The first step in reducing interruptions is to define the causes and quantify their
effects. Accidents, human error and aging equipment contribute a great deal, but
weather is still the largest single cause, although its effects are not as well
understood as we would like to think. We can all agree that adverse weather conditions
cause power interruptions. The evidence is apparent. When a bad thunderstorm hits, or a
hurricane arrives, many people experience power interruptions, and those who do not
hear about it on the news.

Lightning strikes, especially common in Florida, create transients that overload

transformers and trip fused circuit breakers, both conditions requiring a repair crew to

restore power. High winds blow down trees, damaging conductors.

Less apparent is the effect of normal weather on the frequency of power
interruptions. Several days of moderate rain can saturate the ground, invading buried
lines and causing short circuits (FPL study). An unexpectedly warm season can promote
vegetation growth, causing interruptions due to tree limb/conductor contact. These and
other effects of normal weather are not easily defined because there is no one-to-one
relationship such as 'lightning hit the line so the transformer blew.' In fact, many
preventable interruptions occur that are not properly attributed to weather because of
that lack of a one-to-one relationship.

4.1 Importance of Statistical Tools

Part of the interest of this project is to find the relationship between the number
of interruptions and normal weather conditions. Both future interruptions and future
weather conditions are random. To obtain a complete prediction of the number of
interruptions in the future, we need to predict future weather conditions and then
predict the number of interruptions based on the predicted weather conditions.

However, because of limited resources and the difficulty of weather prediction, we
will make a conditional prediction of the number of interruptions, assuming that the
weather condition is known. In this subsection, we describe the probabilistic
characteristics of daily interruption frequencies and of sums of daily interruption
frequencies, i.e., monthly interruptions or sums of interruptions when it rains and
when it does not. An explanation of plausible statistical data analysis techniques for
each case then follows.

The daily outage frequencies take only nonnegative integer values and are strongly
skewed to the right. For example, the daily interruptions caused by tree limbs (Cause
Codes 20 and 21) range from 0 to 58, but 99.5% of the frequencies are less than or
equal to 5 (Table 4-1). Therefore, statistical techniques based on the normal
distribution, such as the t-test, normal linear regression, and ANOVA, generate large
biases in calculating the confidence intervals of estimates and lead to wrong
conclusions in the search for significant weather effects. As a reminder, the normal
distribution is symmetric and has the range (-∞, +∞).

Table 4-1. The Frequency of Interruptions due to Tree Limbs (Cause Codes 20 & 21)
Number of Outages Frequency Percent Cumulative Percent
0 15890 81.16 81.16
1 2582 13.19 94.35
2 632 3.23 97.57
3 223 1.14 98.71
4 102 0.52 99.23
5 53 0.27 99.50
6 38 0.19 99.70
>6 59 0.30 100.00
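
The Percent and Cumulative Percent columns of Table 4-1 follow directly from the Frequency column. The following is a minimal sketch of that calculation; the function and variable names are illustrative only.

```python
# Frequency column of Table 4-1 (daily tree-limb interruptions, cause codes 20 and 21).
FREQUENCIES = {"0": 15890, "1": 2582, "2": 632, "3": 223,
               "4": 102, "5": 53, "6": 38, ">6": 59}

def percent_table(freqs):
    """Return (label, percent, cumulative percent) rows for a frequency table."""
    total = sum(freqs.values())
    rows, running = [], 0
    for label, count in freqs.items():
        running += count
        rows.append((label, 100.0 * count / total, 100.0 * running / total))
    return rows

table = percent_table(FREQUENCIES)
for label, pct, cum in table:
    print(f"{label:>3s} {pct:6.2f} {cum:7.2f}")
```

The cumulative column makes the right skew visible: more than 99% of the days fall in the first six rows.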


The nature of the weather data sets is evaluated first to understand the behavioral
patterns of the weather parameters. The parameters of interest are wind, temperature
and rain.

4.2 Probabilistic Characteristics of Data Distributions

As a prelude to presenting results, I will describe the probabilistic characteristics
of the data set we are dealing with, in order to determine which statistical data
analysis techniques and models should be used to correlate weather parameters with
power interruptions.

Daily interruption frequencies (for all cause codes) take only nonnegative integer
values, from 0 to 200, and are skewed to the right, as can be seen in Figure 4-1. Rain
data show an even stronger right skew (Figure 4-2), while the wind speed histogram
(Figure 4-3) is close to a normal distribution.














[Histogram: frequency (0 to 9000) vs. N (0 to 80)]


Figure 4-1. N for all management areas from 1998 to 2001 using the previous filters


[Histogram: frequency vs. Rain]

Figure 4-2. Rain (inches) for all management areas from 1998 to 2001












[Histogram: frequency vs. 2MMaxS]

Figure 4-3. Wind-2 minutes maximum speed (mph) for all management areas from 1998
to 2001

Because a normal distribution is symmetric and a normal random variable is
continuous over the range (-∞, +∞), it cannot represent these skewed, nonnegative
integer counts; these probabilistic characteristics are better explained using a
Poisson distribution.
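
To illustrate why a Poisson distribution suits such counts, the following sketch evaluates its probability mass function for a small mean. The mean of 0.3 interruptions per day is a hypothetical value chosen only for illustration; it is not estimated from the FPL data.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_cdf(k, mean):
    """P(X <= k), summed directly from the probability mass function."""
    return sum(poisson_pmf(j, mean) for j in range(k + 1))

# Hypothetical mean of 0.3 interruptions per day, chosen only for illustration.
MEAN = 0.3
for k in range(4):
    print(f"P(X = {k}) = {poisson_pmf(k, MEAN):.4f}")
print(f"P(X <= 5) = {poisson_cdf(5, MEAN):.6f}")
```

For a small mean, most of the probability sits at zero and decays quickly for larger counts, which is the same right-skewed shape seen in Table 4-1.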

4.3 Correlation Analysis between Weather Parameters and Interruptions

The statistical correlation models between the weather parameters of interest (wind,
temperature and rain) and the total daily number of power interruptions (N) were
studied.

4.3.1 Impact of Temperature on N

In this section, the impact of daily temperature variations on the power
interruptions due to transformer failures is studied.

The monthly averages (means) of the maximum temperatures were plotted on the X-axis
and the monthly means of the total number of interruptions due to transformer
failures on the Y-axis, for 4 years (1998-2001) and all the MAs. Under these
conditions, the total number of monthly data points came to around 567.











Regression Plot of N vs. Max. Temp
Y = 202.803 - 5.09380 X + 0.0328080 X**2
S = 2.62145   R-Sq = 42.0%   R-Sq(adj) = 41.8%
[Plot: regression curve with 95% CI (confidence interval) and 95% PI (prediction interval) limits; x-axis: Mean of Max. Temperature (F), 65 to 95]
Mean of Max. Temperature (F)


Figure 4-4. Variation of average N due to transformer failures vs. maximum temperature


It can be observed from Figure 4-4 that the plot peaks at the two edges of the
X-axis. This can be attributed to the heavy load on the transformers caused by the
maximum usage of power at these temperatures. Part of the reason is that all the
customers switch on their air-conditioning at once when the temperature is at either
its maximum or its minimum. Around 75°F to 80°F there is little increase in
transformer failure interruptions, and hence this is an optimal temperature range.
Beyond approximately 80°F, the curve rises rapidly. The right skewness of the graph
indicates that the effects of higher temperatures are more predominant than those of
lower ones; this is true for Florida, especially the southern part, where it is sunny
for most of the year.
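
The temperature at which the fitted curve bottoms out can be read from the regression coefficients printed with Figure 4-4. The following is a minimal sketch, assuming the linear coefficient is negative (its sign is not legible in the figure and is taken by analogy with the equation printed on Figure 4-5); the names are illustrative.

```python
# Coefficients of the fitted quadratic reported with Figure 4-4
# (N = A + B*T + C*T**2, T = monthly mean of maximum temperature in F).
# Assumption: the sign of B is negative, as on Figure 4-5.
A, B, C = 202.803, -5.09380, 0.0328080

def predicted_n(temp_f):
    """Evaluate the fitted quadratic at a given mean maximum temperature."""
    return A + B * temp_f + C * temp_f ** 2

def minimum_temperature():
    """Temperature at the vertex of the parabola, where dN/dT = 0."""
    return -B / (2.0 * C)

t_min = minimum_temperature()
print(f"Fitted minimum near {t_min:.1f} F, N = {predicted_n(t_min):.2f}")
```

Under this assumption the vertex falls between 75°F and 80°F, in line with the optimal range noted above.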










Regression Plot of N vs. Max. Temperature
Y = 208.210 - 5.22174 X + 0.0335747 X**2
S = 1.11736   R-Sq = 80.9%   R-Sq(adj) = 80.1%
[Plot: regression curve; x-axis: Mean of Max. Temperature (F)]


Figure 4-5. Variation of average N due to transformer failures vs. maximum temperature
(averaged per month per year)

Figure 4-5 was plotted with exactly the same information used to plot Figure 4-4, but
the data for all the MAs were averaged for each month of each of the 4 years, giving
a total of 48 points. Similarly, in Figure 4-6 the data for the corresponding months of
all 4 years were averaged to give 12 data points. The important observation is that as
the number of data points decreases, the plot becomes smoother and the R² value
increases, but at the cost of losing the finest details of the data, because all the
variations within each month are averaged out. This method of averaging provides an
opportunity to see the hidden pattern between the variables by suppressing the
disturbances/noise in the data set.











Regression Plot for N vs. Max. Temp
Y = 273.490 - 6.81286 X + 0.0432362 X**2
S = 0.597484   R-Sq = 95.1%   R-Sq(adj) = 94.0%
[Plot: regression curve; x-axis: Mean of Max. Temperature (F), 75 to 90]


Figure 4-6. Variation of average N due to transformer failures vs. max temperature
(averaged per month)

The regression equation for the X and Y variables considered is given at the top of
each plot; R² represents the proportion of the variability in the Y variable accounted
for by the X variable.
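
As a reminder of how this proportion is computed, the following sketch evaluates R² as one minus the ratio of the residual sum of squares to the total sum of squares. The observed and fitted values are hypothetical and serve only to show the calculation.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical values, only to show the calculation.
y_obs = [5.0, 6.0, 8.0, 11.0]
y_fit = [5.5, 6.0, 7.5, 11.0]
print(f"R-Sq = {100 * r_squared(y_obs, y_fit):.1f}%")
```

Averaging the data before fitting reduces the total scatter that the model has to explain, which is one reason R² rises as the number of points falls.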


Based on the correlation model developed between the transformer interruptions
and the maximum temperature, it is possible to predict the total number of
interruptions due to transformers for any day/MA if the maximum temperature of that
day/MA is known. Table 4-2 shows the prediction of transformer interruptions and the
associated error for each of the MAs of FPL.









Table 4-2. Prediction of N Using Maximum Temperature of All the MAs
MA               Airport  Tmax     N Tx     N Tx prediction  Error (%)  N Tx prediction  Error (%)
                          (Avg.)   (Avg.)   ALL FPL          ALL FPL    INDIVIDUAL       MA's EQU
                                   actual   EQUATION         EQUATION   MA's EQU
Central Florida  DAB      66.5000  7.36364  9.31668          26.523     8.170            10.951
Wingate          FLL      74.4091  3.45455  5.40181          56.368     4.898            41.784
Gulf Coast       FMY      72.5909  6.13636  5.91181          3.659      *
Treasure Coast   FPR      71.3182  6.86364  6.40587          6.669      *
Wingate          FXE      74.6364  3.45455  5.35414          54.988     4.850            40.395
Gulf Stream      HWO      75.3636  4.04545  5.22544          29.168     *
Central Dade     MIA      75.2727  3.50000  5.23954          49.701     5.020            43.429
Brevard          MLB      69.8182  7.22727  7.13454          1.283      *
North Dade       OPF      75.4545  4.81818  5.21190          8.172      *
West Palm        PBI      74.0000  4.90909  5.49660          11.968     *
Toledo           PGD      71.3182  4.18182  6.40587          53.184     5.429            29.824
Pompano          PMP      73.8636  2.31818  5.53076          138.582    3.720            60.471
Central Florida  SFB      68.3636  7.36364  7.99378          8.558      *
Manasota         SRQ      68.0000  6.95455  8.23225          18.372     7.360            5.830
South Dade       TMB      76.3636  5.40909  5.10758          5.574      *
ALL FPL                   72.485   5.2      5.9              13.4%


Description:

* MA Management Area considered

* Airport The nearest airport to the MA considered in getting the weather data

* Tmax (Avg.) The average value of the maximum temperatures that occurred in
January 2002

* N Tx (Avg.) Average number of interruptions (N) due to transformer
failures

* N Tx prediction All FPL equation Predicted N Tx (Avg.) using the common
equation of all MAs

* N Tx prediction Individual equation Predicted N Tx (Avg.) using local equation
of individual MAs

It can be observed from Table 4-2 that the prediction error varied over a wide range,
from 1.28% to 138%. There were only 5 cases where the error exceeded 50%, with the
others within satisfactory limits. The large errors are due to the use of the common
equation developed from the data of all the MAs. Using the individual MAs' equations,
which were developed from each MA's local data, drastically reduced those large
errors. There were cases where the common equation gave better results than the local
equations of the MAs; hence local equations are used only for those MAs where the
common equation gave a large error.
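
The error columns of Table 4-2 are consistent with the absolute difference between the predicted and actual values, expressed as a percentage of the actual value. The following is a minimal sketch that reproduces three rows of the table; the function name is illustrative.

```python
def percent_error(actual, predicted):
    """Absolute prediction error as a percentage of the actual value."""
    return 100.0 * abs(predicted - actual) / actual

# (MA, actual N Tx, common-equation prediction, individual-equation prediction)
# taken from three rows of Table 4-2.
ROWS = [
    ("Central Florida (DAB)", 7.36364, 9.31668, 8.170),
    ("Pompano (PMP)", 2.31818, 5.53076, 3.720),
    ("Manasota (SRQ)", 6.95455, 8.23225, 7.360),
]

for name, actual, common, individual in ROWS:
    print(f"{name:22s} common: {percent_error(actual, common):7.3f}%  "
          f"individual: {percent_error(actual, individual):7.3f}%")
```

Because the error is relative to the actual value, MAs with few interruptions, such as Pompano, show large percentage errors even for modest absolute differences.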

4.3.2 Impact of Wind on N

The role of wind is very significant among all the weather parameters. There is a
very good correlation between wind and the total number of interruptions (N).

When the daily 2-minute maximum wind gust (TMMG) was plotted against N, the plot was
chaotic and no pattern could be seen, because different levels of N occurred for a
given value of the TMMG speed. So the values of N that occurred at each TMMG speed were
averaged and then plotted; the plot can be seen in Figure 4-7.


[Plot: regression plot of N vs. wind. Fitted cubic: Y = 4.57617 0.692087 X - 0.0355799 X**2 0.0004352 X**3; S = 2.76351, R-Sq = 39.8%, R-Sq(adj) = 35.2%. X-axis: mean of 2-minute wind gust speed (mph).]

Figure 4-7. Variation of N vs. wind











There appears to be a pattern up to about 40 mph, but beyond that the pattern becomes distorted. If only the wind speeds with at least 30 data points are used when calculating the averages, the resulting correlation is very high, R^2 = 99.3%, and reveals a strong cubic relationship between N and TMMG (Figure 4-8). Doing so neglects only 1.5% of the data points and keeps 98.5% of the whole data set.

[Plot: regression plot of N vs. wind. Fitted cubic: Y = 0.613754 + 0.0647258 X - 0.0065040 X**2 + 0.0002598 X**3; S = 0.0900810, R-Sq = 99.3%, R-Sq(adj) = 99.2%. X-axis: mean of 2-minute wind gust speed (mph).]

Figure 4-8. Mean of 2 minutes wind speed vs. average number of interruptions

It can be observed from Figure 4-8 that the average number of interruptions rises steeply beyond about 20 mph. Power distribution poles and overhead equipment must therefore be designed to withstand wind gusts of more than 20 mph without breakdown. Care must also be taken that vegetation and other objects near the distribution lines are kept at a proper distance, so that they do not lean on the lines during these wind gusts.
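The averaging step and the cubic fit described above can be sketched as follows. This is only an illustration in Python with NumPy under stated assumptions, not the procedure actually used to produce Figure 4-8: the function names `bin_means` and `predict_fig_4_8` are hypothetical, gust speeds are assumed to be whole numbers of mph, and the demonstration data is synthetic. Only the cubic coefficients are taken from Figure 4-8.

```python
import numpy as np

def bin_means(gust_mph, n_daily, min_count=30):
    """Average N over all days that share the same (integer) gust speed.

    Speeds observed on fewer than `min_count` days are dropped, mirroring
    the 30-point rule in the text. Returns (speeds, mean_n, kept_fraction).
    """
    gust = np.asarray(gust_mph, dtype=int)
    n = np.asarray(n_daily, dtype=float)
    speeds, counts = np.unique(gust, return_counts=True)
    keep = speeds[counts >= min_count]
    mean_n = np.array([n[gust == s].mean() for s in keep])
    kept_fraction = counts[counts >= min_count].sum() / gust.size
    return keep, mean_n, kept_fraction

# Cubic reported in Figure 4-8 (coefficients in increasing power of X).
FIG_4_8 = np.array([0.613754, 0.0647258, -0.0065040, 0.0002598])

def predict_fig_4_8(speed_mph):
    # np.polyval expects the highest power first, hence the reversal.
    return np.polyval(FIG_4_8[::-1], speed_mph)

# Synthetic demonstration data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
gust = rng.integers(5, 31, size=3000)
n_daily = predict_fig_4_8(gust) + rng.normal(0.0, 2.0, size=gust.size)

speeds, mean_n, kept = bin_means(gust, n_daily)
coeffs = np.polyfit(speeds, mean_n, deg=3)  # fitted cubic, highest power first
```

Fitting the averages rather than the raw daily values is what removes the day-to-day scatter; the price is that the fit describes the mean behaviour at each speed, not individual days.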

4.3.3 Impact of Rain on N

The impact of rain on the mean value of N can be observed in the following figures.












[Plot: regression plot of N vs. rain. Fitted cubic: Y = 1.43350 + 3.04671 X 1.09160 X**2 + 0.137744 X**3; S = 1.24028, R-Sq = 42.8%, R-Sq(adj) = 37.2%. X-axis: mean of rain (inch).]

Figure 4-9. Variation of N vs. rain


There were more days without rain than days with rain. Since the impact of rain on N is under consideration, the non-rainy days were excluded from the data set. The data points of N were averaged in the same way as in the wind analysis: the different values of N that occurred at each level of rain were averaged and then plotted in Figure 4-9.


The whole graph can be divided into three piecewise linear segments: 0.1 to 1 inch, 1 to 3 inches, and more than 3 inches. In the first segment there appears to be a linear relationship between N and rain (Figure 4-10); hence the initial small amount of rain plays a vital role in the number of interruptions.











[Plot: regression plot of N vs. rain. Fitted cubic: Y = 1.47309 + 2.11652 X + 0.662153 X**2 0.498268 X**3; S = 0.568862, R-Sq = 78.9%, R-Sq(adj) = 75.0%. X-axis: mean of rain (inch).]

Figure 4-10. Variation of N vs. rain in the interval [0 2]

N remains nearly constant in the second segment, showing a constant effect of rain, but in the third segment N increases drastically as rain exceeds 3 inches. A small amount of rain, such as a light shower, settles on the insulators. These water droplets act as a solvent for the salts and atmospheric dust deposited on the insulators and form a conducting layer, thereby causing a flashover that leads to power interruptions, as explained in Chapter 2.


On the other hand, rain of 1 to 3 inches is heavy enough to clean the insulators, because the water runs off them instead of settling. Finally, rain over 3 inches is accompanied by extreme weather conditions, which again lead to a large N.


4.3.4 Effect of Rain and Wind Together on N

The three-dimensional Figure 4-11 shows the combined effect of wind and rain on the average N. It can be seen that the predicted (calculated) Navg tracks the actual N very well for lower values of N.


The regression equation is given by


Navg = 1.05 0.04*Wwind speed + 6.82*Rrain average


The coefficient of determination, R^2, is 85.6%.

























Figure 4-11. Impact of rain and wind together on N

Usual methods of statistical analysis rely, in part, on knowing in advance what the researcher is looking for. An example is a study done by FPL that provides a linear equation describing the number of interruptions caused by lightning as a function of the number of lightning strikes. In this case, the cause of the outage is known (a one-to-one relationship) and the result is expected. The data required for this type of analysis is also circumscribed by the limited scope of the question. In addition, there are limited strategies for dealing with the problem, since lightning is a random and unpredictable event. Analyzing the effects of normal weather requires a different approach. We need to be open to unexpected results rather than expected ones. We need to consider a body of data much larger than that required to investigate a single phenomenon. We need to consider non-linear relationships and relationships that imply a confluence of conditions. We need to apply every statistical tool we can think of, and then learn some more. Most of these features are available in a tool called Artificial Neural Networks (ANNs). Hence the application of ANNs to the current problem is discussed in the next chapter.











4.3.5 Effect of Lightning Strikes on N



FPL has already carried out a correlation analysis between lightning strikes and cause codes 01 (lightning, with equipment damage) and 02 (storm, with no equipment damage) for all the MAs over the years 1998-2001. Cause codes 01 and 02 represent the direct effect of weather on service interruptions. The following plot, taken from the slides FPL presented during their visit to the University of Florida, shows a very high linear correlation between lightning strikes and storm interruptions: the more lightning strikes, the more storm interruptions.



[Plot: monthly correlation of lightning strikes vs. storm interruptions (codes 01 and 02), 1998 to 2001. Fitted line: y = 0.0376x + 91.956. X-axis: lightning strikes.]

Figure 4-12. Lightning strikes vs. storm interruptions during 1998-2001 for all MAs














CHAPTER 5
PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS

Although the previous chapter may give the impression that the effect of every weather parameter on power interruptions can be quantified using standard mathematical functions or statistical techniques, this is not always true. It may be neither practical nor feasible to find a function for certain complex correlations between weather and interruptions. This is where neural networks are needed, to analyze and generalize the hidden relationship.

We need a tool that is powerful when applied to problems whose solutions require knowledge that is difficult to specify, but for which there is an abundance of examples. Artificial neural networks are one of the best tools for this kind of problem.

5.1 Introduction to Artificial Neural Networks

Neural networks, or artificial neural networks (ANN) to be more precise, represent

a technology that is rooted in many disciplines: neuroscience, mathematics, statistics,

physics, computer science, and engineering. ANNs find applications in such diverse

fields as modeling, time series analysis, pattern recognition, signal processing, and

control by virtue of an important property: the ability to learn from input data with a teacher (supervised) or without one (unsupervised). The most common training scenarios use supervised learning.

An ANN is a very useful tool for predicting the interruptions of a power distribution system with reasonable accuracy. The accuracy of the prediction depends directly on the accuracy of the historical power interruption and weather data used to train the ANN. This project provides a methodology for predicting interruptions in advance from forecast weather conditions using ANNs.

5.1.1 Benefits of ANNs over Statistical Methods

An ANN is an alternative to conventional methods [11]. It combines the time series and regression approaches: it learns from previous interruption and weather patterns and predicts the interruptions for the current conditions, and it also performs a non-linear regression between interruptions and weather patterns. It shows superior accuracy when compared to statistical methods [12].

ANN derives its computing power through, first, its massively parallel distributed

structure and, second, its ability to learn and therefore generalize. Generalization refers to

the neural network producing reasonable outputs for inputs not encountered during

training (learning). These two information-processing capabilities make it possible for

neural networks to solve complex (large-scale) problems that are currently intractable.

The main reasons for using neural networks, for prediction, rather than statistical

techniques/ classical time series analysis are [13]

* They are self-monitoring (i.e., they learn how to make accurate predictions).
* They are able to cope with nonlinearity and nonstationarity of input processes.
* They are adaptive, non-linear and highly parallel.
* They can generalize.
* They are computationally at least as fast as, if not faster than, most available statistical
techniques.

Multi-layered ANNs are capable of performing just about any linear or nonlinear

computation, and can approximate any reasonable function arbitrarily well.

5.1.2 Architecture of ANN

Figure 5-1(a) shows the basic model of a single neuron, while Figure 5-1(b) shows a one-layer network with R input elements and S neurons. In this network, each element of the input vector p is connected to each neuron input through the weight matrix W. The ith neuron has a summer that gathers its weighted inputs and bias to form its own scalar output n(i), Figure 5-1(b). The various n(i) taken together form an S-element net input vector n. Finally, the neuron layer outputs form a column vector a, where a = f(Wp + b).
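Written out element by element, the layer equation a = f(Wp + b) reads as follows. This is only a restatement of the relation above; the symbol w_{i,j} for the entry of W connecting input j to neuron i is an index convention assumed here.

```latex
n_i = \sum_{j=1}^{R} w_{i,j}\, p_j + b_i, \qquad
a_i = f(n_i), \qquad i = 1, \dots, S
```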


[Figure panels: (a) single neuron with input signals, synaptic weights, and activation function; (b) input and layer of neurons, a = f(Wp + b); (c) input, hidden layer, and output layer, a1 = tansig(IW1,1 p1 + b1), a2 = purelin(LW2,1 a1 + b2)]

Figure 5-1. ANN structures: (a) basic nonlinear model of a neuron, (b) one layer network of
neurons, and (c) 3 layer feed forward back propagation network

Figure 5-1(c) shows the ANN model used in the current project. IW represents the input weight matrix, with source 1 (second index) and destination 1 (first index). Elements of layer one, such as its bias, net input, and output, carry a superscript 1 to indicate that they are associated with the first layer. LW represents the layer weights [14].


Where R = number of elements in input vector and S = number of neurons in layer (Figure 5-1(b)).









The data is presented to the input nodes. Each input node is connected to several nodes in the second layer. The second layer is called the hidden layer, since its nodes are not accessible to the outer environment. The hidden layer acts as a layer of abstraction, pulling features from the inputs. Determining the proper number of nodes for the hidden layer is difficult and is often done by trial and error. Generally, network performance increases with the number of hidden nodes and then reaches a saturation level [15]. Adding more hidden nodes beyond that point may actually degrade performance, because the network becomes more difficult to train. Following this commonly accepted rule helps train the ANN efficiently and also helps the solution converge. The last layer is referred to as the output layer, since the network's output is the response of the nodes on this layer. The number of output nodes of an ANN is determined by the requirements of the problem.

5.1.3 Functioning of ANN

In general, the operation of this feed forward network consists of passing weighted

and summed input signals through a chosen nonlinearity. It presumes knowledge of the

network's bias functions and weighted links. Once activation and output functions are

chosen, an ANN is completely described by its weights and biases. Since a given ANN

solves a specific problem, or function, finding weights and biases for the network is

equivalent to finding the input/output relationship that describes the function. In the

current ANN model, Figure 5-1(c), the activation functions chosen in the hidden layer

and the output layer are "tansig" and "purelin" respectively. The two layer sigmoid/linear

network can represent any functional relationship between inputs and outputs if the

sigmoid layer has enough neurons.
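A forward pass through the tansig/purelin network of Figure 5-1(c) can be sketched in a few lines. This is a minimal illustration in Python with NumPy rather than the MATLAB toolbox used in the project; the layer sizes and the random weights below are hypothetical, chosen only to make the example run.

```python
import numpy as np

def tansig(n):
    # tansig is mathematically the hyperbolic tangent.
    return np.tanh(n)

def purelin(n):
    # Linear (identity) transfer function.
    return n

def forward(p, IW, b1, LW, b2):
    """Two-layer pass: a1 = tansig(IW p + b1), a2 = purelin(LW a1 + b2)."""
    a1 = tansig(IW @ p + b1)
    a2 = purelin(LW @ a1 + b2)
    return a1, a2

# Hypothetical sizes: R = 8 weather inputs, S = 5 hidden neurons, 1 output (N).
rng = np.random.default_rng(1)
R, S = 8, 5
IW, b1 = rng.normal(size=(S, R)), rng.normal(size=S)
LW, b2 = rng.normal(size=(1, S)), rng.normal(size=1)
p = rng.normal(size=R)
a1, a2 = forward(p, IW, b1, LW, b2)
```

Once the transfer functions are fixed, the four arrays `IW`, `b1`, `LW`, and `b2` are the complete description of the network, which is the point made in the paragraph above.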









There are many training algorithms and performance functions to choose from when training the network model. For the present problem the back propagation network (BPN) algorithm was chosen, as it is the best-known algorithm for multi-layer perceptron (MLP) networks, and the 'trainbfg' training function was used to train the BPN. The term back propagation refers to the manner in which the gradient is computed for nonlinear MLP networks. Properly

trained back propagation networks tend to give reasonable answers when presented with

inputs that they have never seen. Typically, a new input leads to an output similar to the

correct output for input vectors used in training that are similar to the new input being

presented. This generalization makes it possible to train a network on a representative set

of input/target pairs and get good results without training the network on all possible

input/output pairs.

5.1.4 Back Propagation Learning Rule

The back propagation learning rule [16] is an iterative gradient algorithm designed

to minimize the mean square error between the actual output of a multilayer feed forward

network and the desired output. An essential component of the rule is the iterative

method that propagates error terms required to adapt weights back from nodes in the

output layer to nodes in lower layers.

At the beginning, all weights and node offsets are set to small random values. The input values are presented and the desired outputs are specified. The network, Figure 5-2, is then used to calculate the actual outputs. A recursive algorithm, starting at the output nodes and working back to the hidden layer, adjusts the weights until they converge and the cost function is reduced to an acceptable value. The training process is repeated by presenting different sets of input data to the ANN.
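The steps above can be sketched as a short training loop. Note the substitution: the project trained its network with MATLAB's 'trainbfg' function, whereas this sketch uses plain batch gradient descent, which is simpler to show but is not the same optimizer. The function name `train_backprop`, the hidden-layer size, the learning rate, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def train_backprop(P, T, hidden=4, lr=0.05, epochs=2000, seed=0):
    """Train a tanh-hidden / linear-output network by batch gradient descent.

    P: inputs, shape (R, Q), one column per example.
    T: targets, shape (1, Q).
    Returns the weights and the mean square error recorded at every epoch.
    """
    rng = np.random.default_rng(seed)
    R, Q = P.shape
    # Step 1: small random initial weights, zero biases.
    W1 = rng.normal(scale=0.1, size=(hidden, R))
    b1 = np.zeros((hidden, 1))
    W2 = rng.normal(scale=0.1, size=(1, hidden))
    b2 = np.zeros((1, 1))
    history = []
    for _ in range(epochs):
        # Step 2: forward pass to get the actual outputs.
        A1 = np.tanh(W1 @ P + b1)
        A2 = W2 @ A1 + b2
        E = A2 - T
        history.append(float(np.mean(E ** 2)))
        # Step 3: propagate the error from the output layer back to the hidden layer.
        d2 = 2.0 * E / Q                      # gradient of MSE w.r.t. the linear output
        d1 = (W2.T @ d2) * (1.0 - A1 ** 2)    # tanh'(n) = 1 - tanh(n)^2
        # Step 4: gradient-descent update of weights and biases.
        W2 -= lr * (d2 @ A1.T)
        b2 -= lr * d2.sum(axis=1, keepdims=True)
        W1 -= lr * (d1 @ P.T)
        b1 -= lr * d1.sum(axis=1, keepdims=True)
    return (W1, b1, W2, b2), history

# Hypothetical toy problem: learn y = x1 - 0.5 * x2 from 200 samples.
rng = np.random.default_rng(42)
P = rng.uniform(-1, 1, size=(2, 200))
T = (P[0] - 0.5 * P[1]).reshape(1, -1)
weights, history = train_backprop(P, T)
```

The list `history` plays the role of the cost function mentioned above: training would normally stop once it falls below an acceptable value rather than after a fixed number of epochs.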
























Figure 5-2. A back propagation ANN model

5.2 Steps to Enhance the Performance of ANN

It is a common misconception that all the available variables can simply be fed to the ANN as inputs to predict the solution. The larger the number of input variables, the more complex the task of tracking the correlations among them. To enhance the performance of the ANN, the input data has to be pre-processed. The ANN toolbox of MATLAB 6.0 has functions that can perform some of these operations. The following are some of the techniques that can help improve the quality of the input datasets before they are given to the ANN:

* Eliminate the unnecessary variables which don't have a significant contribution to
the output.

* Scale the inputs and targets so that they always fall within a specified range.

* Reduce the dimensions of the input data without much loss in the variance, e.g. by
Principal Component Analysis, as explained below.
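The scaling step in the list above can be illustrated with a simple min-max mapping. This is a sketch in Python with NumPy, not the MATLAB toolbox routine used in the project; the target range of [-1, 1], the function names, and the sample rows are assumptions made for illustration.

```python
import numpy as np

def fit_minmax(X):
    """Record per-column minima and maxima from the training data."""
    X = np.asarray(X, dtype=float)
    return X.min(axis=0), X.max(axis=0)

def apply_minmax(X, lo, hi):
    """Map each column linearly so that lo -> -1 and hi -> +1."""
    X = np.asarray(X, dtype=float)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant columns
    return 2.0 * (X - lo) / span - 1.0

# Hypothetical rows: [max temp (F), rain (inch), 2-minute gust (mph)].
train = np.array([[79.0, 1.11, 25.0],
                  [87.0, 0.00, 13.0],
                  [93.0, 0.28, 18.0]])
lo, hi = fit_minmax(train)
scaled = apply_minmax(train, lo, hi)
```

The minima and maxima are taken from the training data only and then reused for the evaluation data, so that both pass through the same mapping.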

Weather is a combination of many parameters, such as wind, temperature, rain, pressure, dew point, and lightning. The next question is which of these parameters are the predominant ones that contribute significantly to the daily interruptions. One way to approach this problem is to look at the variance of all the weather parameters with respect to each other. The parameters with more variance are taken to be more responsible for N than those with less variance: less variance in a variable means fewer changes in its value, which means that the variable has less effect on the changes of N. For investigations involving a large number of observed variables, it is often useful to consider a smaller number of linear combinations of the original variables.


Principal Component Analysis (PCA)

PCA is a popular and easy-to-use tool for reducing the dimensions of the input variables. Principal component analysis [13] finds a set of standardized linear combinations, called the principal components, which are orthogonal and which, taken together, explain all the variance of the original data. The following analysis shows the variance of the 8 inputs considered in the ANN model:


Table 5-1. Summary Table of Covariance for All the Input Variables Considered in the
Principal Component Analysis

Importance of components:
                        Comp.1      Comp.2     Comp.3     Comp.4      Comp.5      Comp.6     Comp.7       Comp.8
Standard deviation      10.9577574  7.0728909  5.5105074  3.22619511  2.01380874  1.5672211  0.936582778  0.3328819326
Proportion of Variance  0.5498531   0.2290853  0.1390550  0.04766335  0.01857119  0.0112477  0.004016943  0.0005074389
Cumulative Proportion   0.5498531   0.7789384  0.9179934  0.96565672  0.98422792  0.9954756  0.999492561  1.0000000000

Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
MaxTemp 0.515 -0.138 0.197 -0.722 0.336 0.203
MinTemp 0.738 0.363 0.516 -0.216
Rain -0.995
AvgWindS 0.313 0.359 0.841 0.229
5SecWindS 0.727 -0.230 -0.187 0.614
2MinWindS 0.585 -0.151 -0.787
HeatDays -0.107 -0.277 0.949
CoolDays 0.415 -0.903


In Table 5-1, if component 1 (Comp.1) alone is considered, it explains 55.0% of the total variance in the data set using the following linear combination of only 4 weather variables:

Comp.1 = 0.515(MaxTemp) + 0.738(MinTemp) - 0.107(HeatDays) + 0.415(CoolDays)









Similarly, Comp.2 alone explains 22.9% of the total variance in the data set with the following linear combination:

Comp.2 = -0.197(MaxTemp) + 0.313(AvgWindS) + 0.727(5SecWindS) + 0.585(2MinWindS)

When Comp.1 and Comp.2 are considered together, 77.9% of the total variance of the data set can be explained. From Table 5-1, it can be observed that considering up to Comp.3 explains around 91.8% of the total variance, and considering up to Comp.4 explains around 96.6%. The choice of how many components to consider depends on the amount of variance that is of interest. Interestingly, rain makes no contribution to the variance in Table 5-1 if components up to Comp.4 are considered; hence this variable can be eliminated. In this case, therefore, the 8 variables can be reduced to 4 components while preserving 96.6% of the total variance in the data set. Each component acts as a new variable that is a linear combination of the actual weather variables.
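The variance bookkeeping behind Table 5-1 can be reproduced from the component standard deviations alone, since each component's share is its squared standard deviation divided by the sum of all squared standard deviations. The sketch below, in Python with NumPy, does this and also shows one common way to compute component scores from raw data. The function names are hypothetical, and `pca_scores` uses an eigendecomposition of the covariance matrix as an assumed implementation, not necessarily the routine that produced the table.

```python
import numpy as np

# Component standard deviations as listed in Table 5-1.
SD = np.array([10.9577574, 7.0728909, 5.5105074, 3.22619511,
               2.01380874, 1.5672211, 0.936582778, 0.3328819326])

def variance_summary(sd):
    """Proportion and cumulative proportion of variance from component std devs."""
    var = np.asarray(sd, dtype=float) ** 2
    prop = var / var.sum()
    return prop, np.cumsum(prop)

def components_needed(sd, target):
    """Smallest number of leading components whose cumulative share reaches `target`."""
    _, cum = variance_summary(sd)
    return int(np.searchsorted(cum, target) + 1)

def pca_scores(X, k):
    """Project the rows of X onto the first k principal components of its covariance."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigval)[::-1]          # eigh returns ascending eigenvalues
    return Xc @ eigvec[:, order[:k]]

prop, cum = variance_summary(SD)
k = components_needed(SD, 0.95)   # components needed to keep at least 95% of the variance
```

The 95% target is an arbitrary example; as noted above, the cut-off depends on how much variance is of interest.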


5.3 ANN Simulation Output

Three management areas (Wingate, North Dade, and Gulf Stream) were chosen as pilot areas in the current artificial neural network (ANN) project. These 3 MAs are adjacent to each other and small enough to assume that the variation in weather due to geographical differences is slight. They are also all urban MAs and appear to have a similar distribution of customer types.

Two years of weather and interruption data, 2000 and 2001, were chosen to train the ANN, while the 2002 weather and interruption data were used to evaluate the performance of the trained ANN model. One entry in either the training or the evaluation dataset is composed of one day's weather and interruptions for one MA, so the one-year evaluation dataset had 1060 entries (allowing for missing data).

The output of the ANN is two columns of data; a prediction for each entry in the

evaluation dataset and the actual number of interruptions for each entry in the evaluation

dataset. Figure 5-3 is a graph of the predicted values superimposed over the actual values

for the evaluation year 2002. The MAs are in sequence and it can be seen that the

predicted values follow the seasonal trends for interruptions.





[Plot: predicted N overlaid on actual N for 2002, in sequence: Gulf Stream, North Dade, Wingate]

Figure 5-3. Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year
2002

5.3.1 Detailed Observation

Figure 5-4 is an expanded segment (North Dade) of the above graph that highlights the details. It can be seen that where the actual number of interruptions is large, the predictions match the pattern of maxima and minima but are not always close in magnitude. Where the actual number of interruptions is small, the pattern matching breaks down, but large spikes in the predictions do not occur.

The following is a segment of a time series plot of predicted and actual values of the total daily number of interruptions (N) for the North Dade MA. It can be clearly seen that during some periods (rectangles) the predicted values match the pattern of the actual values, if not the magnitude, while during other periods (ovals) no such pattern matching is seen, although the magnitudes are small.


[Plot legend: Predicted N, Actual N]

Figure 5-4. Predicted N and the actual N for a few of the cases in North Dade MA for
2002


Some of the interesting observations from the above plot, Figure 5-4 are explained below.

Case 1: Predicted N less than actual N. The actual values of N at points 529 and 530 in Figure 5-4 correspond to 6/17/2002 and 6/19/2002 in the North Dade MA. The interruption data for 6/17/2002 indicates that up to 18 of the 28 interruptions that occurred on that day may not be related to daily common weather (Corrosion/Decay = 10, Improper Process = 6, Request = 2), which suggests that as few as 12 may be weather related interruptions, which is just the same as the predicted value. Similarly, 15 of the 29 interruptions that occurred on 6/19/2002 may not be related to daily weather, leaving 14 that may be weather related, which is close to our predicted value of 12.

Though the weather conditions for these days were relatively mild, we have a

significant increase in the number of interruptions, as shown in Figure 5-5 in the green

box.











Case 2: Predicted N more than actual N. Points 549, 550, 551 and 552 correspond to 7/8/2002, 7/9/2002, 7/10/2002, and 7/11/2002 in the North Dade MA, outlined in red in Figure 5-5. The number of interruptions on these days was nearly the same although the weather conditions varied over a wide range. This large change in weather conditions forces the ANN model to predict N in proportion to the weather. The question that should be asked is why N changed so little when the weather conditions differed significantly. Had some precautionary measures, e.g. tree trimming, been taken a few days before these interruptions occurred?



Call Sign_1 Date_1 MaxTemp_1 MinTemp_1 AvgTemp_1 HeatDays CoolDays_1 Rain_1 AvgWdS_1 5SMaxS_1 2MMaxS_1 WG-LS_1 WG N_1
527 OFF 06/15/202 79 73 73 6 Ol 11 1 18 108 316 25 3; 33



531OFF 06O/2B]2002 B6 71 79 0 0 215: 97 26: 22 0 14
532 OOPF 0B/21/2002 77 71 74 6I D 1196 62 26 22 0 13

534 OFF 06/23/2B002 B 73 80 I 0 04W BO 36 29 5 14
535 OPF 0O/24/2002 87 74 81 OI 0 28 5 28 20 18 0 20
536 OPF 06/25/202 83 74 79 0| D 037 39 16 14 0 23
537 OPFF 06/2/2002 B 72 72 9 [I 112 5B 20 18 7 18
538 OPF 0/27/2002 87 75 81 O 0 000 45 16 13 0 20
539 OPF 06/2/2002 88 74 81 0 0' 003 48 22" 17 0 6
540 OFF 06/29/2002 B9 75 82 0E 0 002 7B 25 22 0, 23
541 OPF 0O/30/2002 B5 76 80 OI 0 016 67 25 21 2 14

543 OPF 07102/2002 B7 71 79 0 0 O 30B 43 24, 21 56 44

545 PF 07/04/202 89 74 82. 034 36 26 22 3 24
546 OPF 07105/2002 90 73 82 0 0 0 09 32 29: 21 51 42
547 OPF 07/0/2002 B8 73 8 1I 0 012 26 17' 15 3 12
548 OFF 07/07/202 B8 74 81 I 0 066 62 31 28 2 27




553 OPF 07/12/202: 90 73 82; 0 17 0 156 56 2 18 0 17
554 OPF 07/13/202 91 73 82 0 17 0 01 69 22 18 0 17
555 OPF 07/14/202 93 77 85' 0; 20 00 62 17 14 0 20
556 OPF 07/15/2002 92 78 85 0 0 08 50 17 16 3 18
557 OFF 07/16/2002 91 76 84 O 0 0 65 44 25 20 16 24

559 OFPF 07/1 /2002 90 77 84 OW 0 0 96 36 23' 18 78 18

Figure 5-5. Numerical values of weather and interruption data under consideration









Case 3: In some of the cases there was a very small increase in N though there

were large variations in the weather conditions.

5.3.2 Dominant Weather Parameters: Preliminary Observations

A series of ANN simulations with different weather parameters removed has been

done, and the relative accuracy of each simulation has been compared to determine which

weather parameters are the most significant.

The preliminary results show that only a few of the many weather parameters

account for most of the variation in the number of interruptions. It is expected that the

importance of individual weather parameters will vary with location.

The following list gives the weather variables in order of their importance for the pilot area:

1. Two Minute Sustained Wind Gust (mph)
2. Rain (inches)
3. Lightning Strikes (# of strikes/day)
4. Temperature Average or Max. & Min.(K')

On the other hand, the following parameters account for the least variation in the number of interruptions:

1. Pressure
2. Heat Days
3. Cool Days
4. Dew Point
5. Population (of MA)

5.4 Analysis of ANN Simulation Output

Based on the actual number of interruptions and the predictions during the evaluation year, probability graphs (PGs) have been created to represent the range of interruptions that actually occurred in the evaluation dataset for each predicted value. For example, if every number between 1 and 40 interruptions were predicted at some point in the evaluation year, there would be 40 PGs. This is done by sub-setting the evaluation dataset into 40 sets and creating a histogram of actual values for each predicted value in the evaluation set. By dividing each frequency column by the sum of the interruptions that comprise the histogram, a probability graph such as the one shown below for a prediction of 11 can be created.
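The construction of the probability graphs can be sketched as follows, in plain Python. Two assumptions are made here: predictions are rounded to whole numbers so that entries can be grouped by predicted value, and each histogram is normalised by the number of entries in it so that the probabilities for one predicted value sum to 1. The function name and the sample records are hypothetical.

```python
from collections import Counter, defaultdict

def probability_graphs(predicted, actual):
    """For each predicted value, return {actual N: relative frequency}."""
    groups = defaultdict(list)
    for p, a in zip(predicted, actual):
        groups[p].append(a)                  # subset the evaluation data by prediction
    graphs = {}
    for p, actuals in groups.items():
        counts = Counter(actuals)            # histogram of actual values
        total = sum(counts.values())         # number of entries with this prediction
        graphs[p] = {a: c / total for a, c in sorted(counts.items())}
    return graphs

# Hypothetical evaluation records: (rounded prediction, actual N).
predicted = [11, 11, 11, 11, 12, 12]
actual = [9, 11, 11, 14, 12, 15]
pgs = probability_graphs(predicted, actual)
```

Each entry of `pgs` corresponds to one PG, so a year in which every value from 1 to 40 is predicted at least once yields 40 entries.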

[Plots: (a) histogram and (b) probability graph of the actual N for a prediction of 11. X-axis: N Actual.]

Figure 5-6. Histogram plot of predicted interruptions

From the histograms, it can be seen that the actual values for each predicted value follow an approximately normal distribution, so it is justified to use the conventionally calculated mean and standard deviation to gauge the accuracy and precision of a prediction. The accuracy is determined by the closeness of the prediction to the mean actual number of interruptions. The precision is determined by the magnitude of the percent standard deviation. Percent standard deviation was chosen so that the standard deviations are comparable between lower and higher predictions. Outliers provide clues to elements of the model that either are missing or should not be there.
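The accuracy and precision measures described above can be computed per predicted value as in the following sketch, in Python with NumPy. The function name is hypothetical, predictions are assumed to be whole numbers, and the sample standard deviation (ddof=1) is an assumption, since the text does not say which form was used.

```python
import numpy as np

def accuracy_precision(predicted, actual):
    """Per predicted value: (mean actual N, percent standard deviation)."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual, dtype=float)
    out = {}
    for p in np.unique(predicted):
        a = actual[predicted == p]
        mean = a.mean()
        if a.size > 1 and mean > 0:
            pct_sd = 100.0 * a.std(ddof=1) / mean   # standard deviation as % of the mean
        else:
            pct_sd = float("nan")                   # undefined for a single entry
        out[int(p)] = (float(mean), float(pct_sd))
    return out

# Hypothetical records: three days predicted as 10, one day predicted as 20.
stats = accuracy_precision([10, 10, 10, 20], [8, 10, 12, 20])
```

Accuracy is then read off by comparing each key with its mean, and precision by the size of the percent standard deviation.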

Test data is just a single day's weather data. It can be a real-time updated weather parameter (max, min or total) from a weather station, a known day's values, or a theoretical set of weather data. The first is used in real-time prediction, whereas a known day's values can only be used after the fact and provide no predictive benefit aside from inclusion in the historical data set. A theoretical set, however, can be used for research, such as modeling a system's robustness to weather. Test data that shows a very low prediction can be used as a base, and the parameter values can be varied either individually or in groups to model the response to those parameters.

5.5 Pitfalls and Suggestions to FPL

GIGO is an acronym from the early days of computing: garbage in, garbage out. The accuracy and precision of the ANN are limited by the accuracy and precision of the input. Although there will always be error inherent in the data collected, significant improvements may be possible.

5.5.1 Weather Data

The error inherent in the ASOS weather data may be geographical and ASOS data

is only available for historic and not real-time use. The installation of dedicated weather

stations that is now occurring at FPL service centers will reduce that inherent error and

allow real-time forecasting.

5.5.2 Interruption Data

Although the FPL data cubes are thorough, the reporting procedures are not designed for a detailed, time-dependent study such as this, nor are they always sensitive to the role of weather. Because of this, the accuracy and precision of the prediction suffer.

For example, the reporting day for an interruption runs over three shifts, from 7 AM to 7 AM. In the last random sample taken, the last shift, from 11 PM to 7 AM, reported about 12% of the day's interruptions, meaning that interruptions occurring from midnight to 7 AM were being reported on the previous day. This can be largely corrected by taking data from the cube by shifts and summing; however, that still leaves the interruptions from 11 PM to midnight, perhaps 1-2% of the total, reported on the wrong day. Because the data was only shifted in time, the average difference after adjustment was only 0.05 interruptions; however, because of the many instances where a large number of interruptions were reported on the previous day, the average percent difference was 14%.

To determine the effect on the output of this error, two sets of data were taken from

the same time period and location, an original one with 24 hours of interruption data

taken from the cube on the day it was reported and an adjusted one with 24 hours of

interruption data taken from the beginning of the third shift on the day before it was

reported. Both were run through the ANN and the results compared. The following

detailed graphs of the same time period in the same MA show an improvement in the

pattern and magnitude matching after the interruption data were adjusted for the shift

differences.
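The shift adjustment described above can be sketched as follows, in plain Python. Only the third shift (11 PM to 7 AM) is specified in the text; that per-shift counts are available for every reporting day, and that the first day has no preceding third shift, are assumptions of this sketch. The function name and the sample counts are hypothetical.

```python
def shift_adjust(shift_counts):
    """Re-total interruptions so each day starts with the previous day's third shift.

    shift_counts: list of (shift1, shift2, shift3) counts per reporting day,
    in date order. Returns the adjusted daily totals.
    """
    adjusted = []
    prev_third = 0  # assumption: no data before the first reporting day
    for s1, s2, s3 in shift_counts:
        adjusted.append(prev_third + s1 + s2)
        prev_third = s3
    return adjusted

# Hypothetical counts for two consecutive reporting days.
reported = [(5, 4, 3), (6, 2, 1)]
original = [sum(day) for day in reported]
adjusted = shift_adjust(reported)
```

The totals over the whole period differ only by the shifts at the two ends, which is consistent with the small average difference but large percent difference noted above.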

[Plot legend: Predicted N]

Figure 5-7. Prediction results of ANN using the original N (not shift adjusted)

Figure 5-8. Prediction results of ANN using the adjusted N (shift adjusted)

Mean and standard deviation plots for the actual N vs. predicted N and for the adjusted N vs. predicted N are shown in Figure 5-9 and Figure 5-10. It can be observed that adjusting the data to include the correct shifts on the correct days improves the fit of the prediction to the mean actual number of interruptions. It also shows that the fit deteriorates as the prediction increases, indicating unknown factors. The graphs of the original (Figure 5-9) and shift-adjusted (Figure 5-10) percent standard deviation show a reduction in the adjusted percent standard deviation at lower predictions, while the higher predictions are not much improved, similar to the graphs of the means.


[Plots: (a) mean of actual N vs. prediction; (b) percent standard deviation of actual N vs. prediction]

Figure 5-9. Mean and standard deviation of actual N

[Plots: (c) mean of adjusted N vs. prediction; (d) percent standard deviation of adjusted N vs. prediction]

Figure 5-10. Mean and standard deviation of adjusted N









These results suggest that other improvements can be made in the data reporting. No single change would perhaps have as visible an effect, but taken together they could alter the results significantly. Some possibilities suggest themselves:

* Report interruption requests due to weather related damage repair on the day the
causative weather condition occurred.

* Maintain an hourly database for interruptions since hourly weather is available.
This would be especially useful if dedicated weather stations existed.

* Update cause codes to be more sensitive to the possible role of weather.

* Report the age of equipment.

Simulations that have been run with different cause codes subtracted from the

interruption data, such as accident, animal, improper process and crew request (planned)

have shown similar improvements in differing regions of the graphs.

5.6 Proving Localization of Weather Improves the Accuracy in Prediction

Case 1: Localized Weather Data

Three adjacent areas, Wingate, North Dade, and Gulf Stream, were chosen as pilot
areas for study. Weather and interruption data for each of the MAs for the years 2000
and 2001 were used to train the ANN model. The 2002 weather data of the North Dade
area were then used for prediction with the trained ANN.

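The mean percentage error (MPE) of the predicted value is approximately 25%.

The MPE is calculated using the following formula:

$$\mathrm{MPE} = \frac{1}{M} \sum_{i=1}^{M} \frac{\left| N_{\mathrm{actual},i} - N_{\mathrm{predicted},i} \right|}{N_{\mathrm{actual},i}}$$

where M = total number of cases considered,
N_actual = actual number of interruptions (N), and
N_predicted = predicted number of interruptions (N). The result is expressed as a percentage.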
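
The MPE calculation can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the sample counts are hypothetical, and skipping cases where the actual count is zero is an assumption, since the thesis does not say how such cases were handled.

```python
def mean_percentage_error(actual, predicted):
    """Mean of |actual - predicted| / actual over all cases, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Assumption: cases with zero actual interruptions are skipped,
    # because the ratio is undefined there.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no cases with a nonzero actual value")
    return 100.0 * sum(abs(a - p) / a for a, p in pairs) / len(pairs)


# Hypothetical daily interruption counts, not thesis data.
actual = [10, 20, 40]
predicted = [8, 25, 40]
mpe = mean_percentage_error(actual, predicted)
print(f"MPE = {mpe:.1f}%")
```

With these made-up counts the three relative errors are 0.20, 0.25, and 0.00, so the MPE is their mean expressed in percent.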









Case 2: Scattered Weather Data

In contrast to taking weather data from each of the weather stations, weather data
were taken from only one weather station, while the interruption data were taken from
all three management areas.

The mean percentage error of the predicted value is 35% (>25%).

This shows that increasing the accuracy of the weather data, by considering smaller
areas, can enhance the performance of the model.

5.7 Comparison of Statistical Model and ANN Model

The prediction performance of the statistical and ANN models was compared using
the 2000 and 2001 weather and interruption data of Gulf Stream (GS), North Dade (ND),
and Wingate (WG) of FPL. Three variables, Rain, 2-Minute Maximum Wind Gust, and
Average Temperature, were considered in building the two models. A multiple regression
equation was developed for these three variables, as given below:

N = -16.6 + 0.174 * AvgTemp + 4.71 * Rain + 0.852 * 2MMaxS
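
As an illustration, the regression equation can be evaluated directly, and coefficients of this form can be obtained by ordinary least squares. The sketch below is not the thesis's fitting procedure: the function name, the input values, and the synthetic weather ranges are all hypothetical, and the synthetic targets are generated from the equation itself only to show that a least-squares fit recovers the coefficients.

```python
import numpy as np


def predict_interruptions(avg_temp, rain, max_gust_2min):
    """Evaluate the regression equation for N given the three weather inputs."""
    return -16.6 + 0.174 * avg_temp + 4.71 * rain + 0.852 * max_gust_2min


# Hypothetical single-day inputs; units are assumed to match the training data.
n_hat = predict_interruptions(avg_temp=80.0, rain=0.5, max_gust_2min=20.0)

# Synthetic weather table (assumed ranges), columns: AvgTemp, Rain, 2MMaxS.
rng = np.random.default_rng(0)
weather = rng.uniform([60.0, 0.0, 5.0], [95.0, 3.0, 40.0], size=(50, 3))
n_synth = predict_interruptions(weather[:, 0], weather[:, 1], weather[:, 2])

# Ordinary least squares with an intercept column.
design = np.column_stack([np.ones(len(weather)), weather])
coef, *_ = np.linalg.lstsq(design, n_synth, rcond=None)
print(n_hat, coef)
```

Because the synthetic targets contain no noise, the fitted coefficients match those of the equation; with real interruption data the fit would only approximate the observations.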

The 2002 weather data of ND are used to predict N using the above equation. Along
similar lines, an ANN model was developed with 3 input variables, 5 hidden nodes, and 1
output node. The same data set used for the statistical model is used to evaluate the
ANN model. The results of both models are tabulated in Table 5-2.

Table 5-2. Performance Comparison Between Statistical Model and ANN Model
                            Statistical Model    ANN Model
Mean % Error                67                   45
Prediction with 30% Error   46                   54

The above results show that the prediction accuracy of the ANN model is better
than that of the statistical model. Although the accuracy of both models is low, because
only a few variables were considered to keep the problem simple, the point here is to
show that the ANN model performs better.
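
For orientation, the 3-5-1 network structure described above can be sketched as a single forward pass in NumPy. This is a minimal sketch under stated assumptions: the tanh hidden activation, the linear output node, the random (untrained) weights, and the scaled input values are all hypothetical and are not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Untrained, randomly initialized parameters: 3 inputs -> 5 hidden -> 1 output.
W1 = rng.normal(size=(5, 3))
b1 = np.zeros(5)
W2 = rng.normal(size=(1, 5))
b2 = np.zeros(1)


def forward(x):
    """One forward pass: tanh hidden layer (assumed), linear output (assumed)."""
    hidden = np.tanh(W1 @ x + b1)
    return W2 @ hidden + b2


# Hypothetical scaled inputs: AvgTemp, Rain, 2MMaxS.
x = np.array([0.5, 0.1, 0.3])
y = forward(x)

# Weights plus biases: 3*5 + 5 + 5*1 + 1.
n_params = W1.size + b1.size + W2.size + b2.size
print(y, n_params)
```

Even this small network has more free parameters than the four-coefficient regression equation, which is one reason a held-out validation set is used during training.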


[Plot of training, validation, and test error curves; x-axis: Epoch]

Figure 5-11. Mean squared error vs. training epochs

Figure 5-11 shows that the mean squared error gradually decreases as the ANN is
trained over successive epochs (an epoch is one complete pass through the training
data). The progress of training is diagnosed by examining the training, validation, and
test errors. The training stopped after 40 epochs because the validation error increased.
The result here is reasonable, since the test set error and the validation set error have
similar characteristics, and it does not appear that any significant overfitting has occurred.
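
The stopping rule described above, halting when the validation error starts to rise, is commonly implemented as early stopping with a patience counter. The following is a minimal sketch of that rule, not the toolbox routine used in the thesis; the function name, the patience value, and the validation error sequence are hypothetical.

```python
def early_stopping_epoch(val_errors, patience=5):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training halts at the first epoch that is `patience` epochs past the
    best validation error without any improvement.
    """
    if not val_errors:
        raise ValueError("val_errors must not be empty")
    best_err = float("inf")
    best_epoch = 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Validation error never stalled long enough: ran to the last epoch.
    return len(val_errors) - 1, best_epoch


# Hypothetical validation errors: improving, then rising.
val = [2.0, 1.5, 1.2, 1.0, 1.1, 1.2, 1.3, 1.4]
stop, best = early_stopping_epoch(val, patience=3)
print(stop, best)
```

In practice the weights from the best epoch, rather than from the stopping epoch, are the ones kept for prediction.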

5.8 Possible Software Development to Predict Power Interruptions Using ANNs

Figure 5-12 is a snapshot of the graphical user interface (GUI) developed for the
ANN that was trained to predict the interruptions.





















[Screenshot: FPL Interface window]


Figure 5-12. Graphical user interface developed to predict interruptions


Using the interface shown in Figure 5-12 is simple: first load the training and testing
data files (ASCII format) using the option buttons provided, and then click the
"Run Simulation" button to see the above plots.
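
The file-loading step can be illustrated with a short Python sketch. The thesis states only that the data files are in ASCII format, so the layout assumed here (whitespace-separated numeric columns, with the weather inputs first and the interruption count N in the last column) and the function name are hypothetical.

```python
import io

import numpy as np


def load_ascii_dataset(source):
    """Read a whitespace-separated numeric table and split inputs from target.

    Assumed layout: every column except the last is an input variable,
    and the last column is the number of interruptions N.
    """
    data = np.loadtxt(source, ndmin=2)
    return data[:, :-1], data[:, -1]


# In-memory stand-in for a training file; the values are made up.
sample = io.StringIO("80.0 0.5 20.0 17\n75.0 0.0 12.0 9\n")
X, y = load_ascii_dataset(sample)
print(X.shape, y)
```

The same function accepts a file path in place of the in-memory buffer, so the training and testing files could be loaded the same way.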


Currently, custom software to predict power distribution interruptions, based on the
idea presented in this thesis, is under development. The proposed prediction model is
under test in FPL management areas. The custom software can be installed on the user's
desktop like any other software, and predicting power interruptions in advance is then
just a click away. In the future, the software will be distributed to other power utilities
in the USA.




Using the model shown in Figure 5-12, predictions similar to those shown in
Figure 5-13 can be obtained. Central Dade management area (MA) was chosen as one of
the pilot areas to see how well the developed model can predict the interruptions. The
coefficient of determination (R-squared) is around 90%, which means the model does a
good job of predicting a number of interruptions close to the actual number. The X-axis
of Figure 5-13(a) shows the predicted sum of monthly interruptions, while the Y-axis
shows the actual sum of monthly interruptions. This estimate of predicted interruptions
will help utilities know in advance how many personnel they need to deploy to manage
the interruptions.
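
A fitted line and R-squared value of the kind reported for Figure 5-13(a) can be computed as in the sketch below. The monthly totals here are hypothetical and are not the Central Dade data; the function name is likewise an assumption.

```python
import numpy as np


def fit_line_r2(predicted, actual):
    """Fit actual = intercept + slope * predicted; return (intercept, slope, R^2)."""
    slope, intercept = np.polyfit(predicted, actual, 1)
    fitted = intercept + slope * predicted
    ss_res = np.sum((actual - fitted) ** 2)          # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)   # total sum of squares
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical monthly totals of predicted and actual interruptions.
predicted = np.array([150.0, 200.0, 300.0, 400.0, 500.0])
actual = np.array([160.0, 215.0, 310.0, 390.0, 530.0])
intercept, slope, r2 = fit_line_r2(predicted, actual)
print(f"Sum Actual = {intercept:.2f} + {slope:.3f} Sum Predicted, R-Sq = {100 * r2:.1f}%")
```

A slope near 1 and an intercept near 0 indicate that the predictions are unbiased as well as correlated with the actual totals; R-squared alone measures only the latter.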


(a) Central Dade 2001-2003 Monthly Total N. Fitted line: Sum Actual = 1.84 + 1.060 Sum Predicted;
S = 32.3396, R-Sq = 90.2%, R-Sq(adj) = 89.9%. X-axis: Sum Predicted.

(b) Scatterplot of Sum Actual, Sum Predicted vs Month for Central Dade. X-axis: Month;
panel variable: Year.



Figure 5-13. Predicted interruptions vs. actual interruptions for Central Dade (a) 3 years
plotted together (b) each year plotted separately















CHAPTER 6
LIMITATIONS, CONCLUSIONS AND FUTURE WORK

The research results presented in this thesis are not without their own limitations.
Some of the hurdles that need to be overcome to get better results are discussed in this
chapter. The conclusions of the thesis are followed by the future work, which explains
the steps to be taken from the current state of the project.

6.1 Limitations of Approach

6.1.1 Weather Data

There are two sources of error in the weather parameter measurements. First, we
found that the weather measurements taken at an airport are not as accurate as expected.
Second, the distance between the outage location and the airport where the weather
parameters are measured can be up to 10 miles, and for certain weather parameters the
conditions at the two locations can differ significantly (Figure 6-1). However, the
difference in rain between locations follows a normal distribution with a mean close to
zero. Therefore, rain data can be used for nearby locations without changing the results.








(a) Time-series plot (x-axis: Time, 11/20/2000 to 2/13/2002)


(b) showing less than 975 miles distance between each weather station

6.1.2 Unknown Variables
















of these are different weather parameters, but other variables are most likely specific to a









system and would best be identified by utility employees who are familiar with the

system.

6.1.3 Outliers

It is possible to find aberrant observations in the data set without any clear
explanation of their cause. Outliers of this sort must be studied independently.

6.1.4 Hourly Data

In the interruption database that FPL provided, the finest available resolution is
daily. However, for any interruption studied it is necessary to know the exact time, at
the hour level or even at the minute level, as shown in [10-11]. This is needed to study
the different weather parameters at the time of the outage, since the weather varies from
hour to hour.

6.2 Conclusions

The ANN and the statistical analysis of the ANN output have the potential to provide
powerful modeling tools, and can be used to provide limited real-time prediction. The
accuracy and precision of the model depend as much on the input as on the ANN model
itself.

The graphical output of the ANN can be used by itself or in conjunction with the

statistical analysis to compare the accuracy and precision of the ANN model with

different variable selections, principal components, study areas or times. In some cases,

the graphical representation can provide better clues to the performance of the ANN than

the graphs of means and percent standard deviations.

With the demand for electricity increasing every year, the need to find better ways of
preventing interruptions due to overloading of the power distribution equipment has
drawn much attention.









ANNs have already been applied in power systems in areas such as economic load
dispatch, optimization and loss reduction, fault detection and diagnosis, frequency
control, load forecasting, contingency analysis, static security assessment, and voltage
and reactive power control. However, little research can be found, either online or in
IEEE publications, on the application of ANNs to the prediction of power distribution
interruptions. This novel idea seems very promising for alerting utilities in advance to
the number of interruptions that are likely to occur. This helps them optimize their crews
by mobilizing them to the locations of interest, so that they can act more effectively to
avoid interruptions or respond quickly to restore power after an interruption. This in turn
helps reduce the SAIFI value. Because utilities would be able to predict the total number
of interruptions, they can also predict SAIFI and use it in their internal calculations. The
developed ANN model can be further enhanced to predict additional information, such as
the time slot and location of the interruptions, besides their approximate number; all that
is needed is to provide the extra information as input columns when training the model.
The accuracy of the predicted results depends directly on the accuracy of the information
in the training data used to train the model.
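
For reference, SAIFI (System Average Interruption Frequency Index) is the total number of customer interruptions divided by the total number of customers served. The sketch below illustrates that ratio only; the function name and the per-event customer counts are hypothetical, and turning a predicted number of interruptions N into SAIFI would additionally require an estimate of the customers affected per interruption, which is an assumption not covered by the model described here.

```python
def saifi(customers_interrupted_per_event, customers_served):
    """SAIFI = total customer interruptions / total customers served."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    return sum(customers_interrupted_per_event) / customers_served


# Hypothetical events: customers interrupted by each of three interruptions.
events = [120, 45, 300]
value = saifi(events, customers_served=10000)
print(f"SAIFI = {value:.4f} interruptions per customer")
```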

A basic methodology that is easily automated has been proposed. The methodology

promises to be easy to use and flexible enough to perform in both a real-time predictive

and a theoretical modeling mode.

6.3 Future Work

The following future steps will improve the accuracy of the current analysis.









6.3.1 Data Collection and Creating New Variables

Additional data collection is necessary. It is suspected that a change in power usage
or equipment density might cause outage trends over time. To verify this idea, we need to
develop a scaling factor and collect usage data. This scaling factor would consist of
information such as equipment density and length of lines; daily power usage data would
be an additional explanatory variable. Also, this new data might be useful in comparing
management areas because the probability of interruptions occurring may be proportional

to the scaling factor. The more the new input variables of the system

6.3.2 Improving the Accuracy and Developing New ANN Models

* Other types of ANN, such as RBF, LVQ, SOM, or combinations of them, need to be
tested to see which model gives better prediction results.

* The dimension of the input feature space (input feature pattern) needs to be reduced
to improve performance, such as speed and prediction accuracy.

* If there is more than one prediction variable (multi-output rather than single
output), the architecture of the whole system may be either a multi-input multi-output
ANN or a composition of several multi-input single-output ANNs. The training
method as well as the performance should be further investigated and compared.

* Develop enhanced custom software with a graphical user interface, in which the
user can select new input and output datasets to train the ANN and develop a model
to predict the output. The user can then reuse this tool whenever a new model is to
be created or the existing model updated.
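
One common way to reduce the dimension of the input feature space, as suggested in the list above, is principal component analysis (PCA). The sketch below is illustrative only: the function name and the synthetic three-column data set (in which the third column is deliberately redundant) are hypothetical, and PCA is named here as one possible technique rather than the method chosen for this project.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its first principal components.

    Returns the reduced data and the fraction of variance each kept
    component explains.
    """
    Xc = X - X.mean(axis=0)                      # center each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return Xc @ Vt[:n_components].T, explained[:n_components]


# Synthetic inputs: the third column is the sum of the first two (redundant).
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.column_stack([base[:, 0], base[:, 1], base[:, 0] + base[:, 1]])
Z, ratio = pca_reduce(X, n_components=2)
print(Z.shape, ratio.sum())
```

Because the third column carries no independent information, two components retain essentially all of the variance, so the network could be trained on two inputs instead of three.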















LIST OF REFERENCES


1. IEEE Trial-Use Guide for Electric Power Distribution Reliability Indices,
IEEE Std 1366-2001, IEEE, New York, 1999.

2. 2001 Cost of Downtime, Contingency Planning Research (CPR) and
Contingency Planning & Management Magazine (CPM). Web site
http://www.contingencyplanningresearch.com (accessed on December 2001).

3. C. A. Warren, Overview of 1366-2001 the Full Use Guide on Electric Power
Distribution Reliability Indices, Power Engineering Society Summer Meeting,
IEEE, Volume 2, 2002.

4. Transmission Line Reference Book, 345kV and Above/Second Edition. Electric
Power Research Institute, Palo Alto, CA, 1982.

5. Florida Power and Light website http://www.fpl.com (accessed June 2004)

6. Thomas E. Grebe, D. Daniel Sabin, and Mark F. McGranaghan, An Assessment of
Distribution System Power Quality: Volume 1: Executive Summary. EPRI Report
TR-106294-V1, Electric Power Research Institute, Palo Alto, California, May
1996.

7. D. Daniel Sabin, An Assessment of Distribution System Power Quality, Volume 2:
Statistical Summary Report. EPRI Report TR-106294-V2, Electric Power Research
Institute, Palo Alto, CA, May 1996.

8. Daniel L. Brooks and D. Daniel Sabin, An Assessment of Distribution System
Power Quality: Volume 3: The Library of Distribution System Power Quality
Monitoring Case Studies. EPRI Report TR-106294-V3, Electric Power Research
Institute, Palo Alto, California, May 1996.

9. National Climatic Data Center (accessed June 2004), Website
http://nndc.noaa.gov/?http://ols.ncdc.noaa.gov/cgi-bin/nndc/buyOL-002.cgi.

10. A. Domijan, Jr., R. K. Matavalam, A. Montenegro, W. S. Willcox, Y. S. Joo, L.
Delforn, J. R. Diaz, L. Davis, and J. D'Agostini, Effects of Normal Weather
Conditions on Interruptions in Distribution Systems, International Journal of Power
and Energy Systems, Publication No: 203-3453.

11. J. M. Zurada. Introduction to Artificial Neural Systems. West Publishing
Company, St. Paul, MN, 1992.









12. L. F. Garcia and O. A. Mohammed, Forecasting Peak Loads with Neural Networks,
Southeast Conference. Creative Technology Transfer-A Global Affair,
Proceedings of the 1994 IEEE, pp. 351-356, 10-13 April, 1994.

13. S. I. Wu, Mirroring our Thought Processes. IEEE Potentials 14, 36-41, 1995

14. Neuron Model & Network Architectures, Neural Networks Toolbox, MATLAB 6.0
Manual, Chapter 2.

15. W.M. Huang and R.P. Lippmann. Comparisons Between Neural Networks and
Conventional Classifiers, Proc. IEEE Int. Conference on Neural Networks, pp.
485-493, 1987.

16. J. L. Chen and S. H. Chang, A Neural Network Approach to Evaluate Distribution
Systems Engineering, IEEE International Conference on Neural Networks,
pp. 487-490, 17-19 Sept. 1992.















BIOGRAPHICAL SKETCH

Roop Kishore R. Matavalam was born in Tirupati, Andhra Pradesh, India. He
received his Bachelor of Technology (B.Tech) degree in electrical and electronics
engineering from Sri Venkateswara University, India, in 2001. Since Fall 2001 he has
been pursuing his Master of Science degree in electrical and computer engineering at the
University of Florida, Gainesville. He has been working as a research assistant, since
2001, in the Florida Power Affiliates and Power Quality Laboratory, University of
Florida. His fields of interest include power reliability, power electronics, analog circuit
design, and RF microelectronics.