POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER
ROOP KISHORE R. MATAVALAM
A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
UNIVERSITY OF FLORIDA
Roop Kishore R. Matavalam
To my parents and my sister
First and foremost, I thank my advisor, Dr. Alexander Domijan, for having
confidence in me and allowing me to pursue this research. The work presented in this
thesis would not be possible without his consistent support. I am also very thankful to Dr.
Khai D.T. Ngo and Dr. Antonio A. Arroyo for serving as committee members in my
I sincerely express my gratitude to my colleague and friend William S. Wilcox for
his thorough discussions in statistics and suggestions, without which the current thesis
would not have been exciting. I am also grateful to my colleague and friend Alejandro
Montenegro for his suggestions and answers to my questions without any hesitation.
I am also very grateful to my friend Raj Vignesh Thogulua for helping me in
understanding the neural networks. I am also thankful to Dr. Tao Lin for his helpful
suggestions. I also thank my fellow colleague and friend Ajay Karthik for being
enthusiastic about my work.
I would like to express my special thanks to all the personnel of the FPL Distribution
Reliability group, especially Mr. J. R. "Pepe" Diaz, Ms. Lee Davis and Ms. Jessica
D'Agostini, for their valuable suggestions and financial support of this project; without
their consistent support the current project would not have been finished. I am also
thankful to FPL members including Mr. Val Miklausich, Mr. Santiageo Cocina, Mr. Luis
Delfom, Mr. Manny Miranda, and Ms. Martha Caneia for their assistance.
I would like to express my gratitude to all the undergraduate students who worked
on the FPL project and made it more lively and interesting. I also extend my gratitude to all
my great friends for their support and encouragement.
Finally, yet most importantly, I am indebted to my wonderful parents and sister for
believing in my goals, aspirations, for their love, encouragement, and constant support in
all my endeavors.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
1 INTRODUCTION
1.1 Importance of Power Reliability
1.2 Understanding Power Reliability Indices
1.3 Purpose and Importance of this Thesis
1.4 Organization of Thesis
2 UNDERSTANDING FLORIDA WEATHER
2.1 Air Density, Temperature and Pressure
2.2 Humidity
2.3 Rain
2.4 Dew or Condensation of the Humidity
2.5 Pollution
2.6 Wind
2.7 Lightning
3 SYSTEM UNDER STUDY FPL
3.1 Description of the FPL distribution system
3.2 Interruption Data from FPL
3.3 Meteorological Weather Data from NCDC
3.4 Weather Parameters of interest
4 CORRELATION OF WEATHER AND INTERRUPTIONS
4.1 Importance of Statistical Tools
4.2 Probabilistic Characteristics of Data Distributions
4.3 Correlation Analysis between Weather Parameters and Interruptions
4.3.1 Impact of Temperature on N
4.3.2 Impact of Wind on N
4.3.3 Impact of Rain on N
4.3.4 Effect of Rain and Wind Together on N
5 PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS
5.1 Introduction to Artificial Neural Networks
5.1.1 Benefits of ANNs over statistical methods
5.1.2 Architecture of ANN
5.1.3 Functioning of ANN
5.1.4 Back Propagation Learning Rule
5.2 Steps to Enhance the performance of ANN
5.3 ANN Simulation Output
5.3.1 Detailed Observation
5.3.2 Dominant Weather Parameters Preliminary Observations
5.4 Analysis of ANN Simulation Output
5.5 Pitfalls and Suggestions to FPL
5.5.1 Weather Data
5.5.2 Interruption Data
5.6 Proving Localization of Weather Improves the Accuracy in Prediction
Case 1: Localized Weather Data
Case 2: Scattered Weather Data
5.7 Comparison of Statistical Model and ANN Model
5.8 Possible Software Development to Predict Power Interruptions Using ANNs
6 LIMITATIONS, CONCLUSIONS AND FUTURE WORK
6.1 Limitations of Approach
6.1.1 Weather Data
6.1.2 Unknown Variables
6.1.3 Outliers
6.1.4 Hourly Data
6.2 Conclusions
6.3 Future Work
6.3.1 Data Collection and Creating New Variables
6.3.2 Improving the Accuracy and Developing New ANN Models
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES
2-1 Type of Contaminant and Atmospheric Conditions at the Time of Contamination Flashover (UHV Project)
3-1 FPL Power Sales by Sectors
3-2 FPL Distribution Management Areas along with Their Dispatch Centers
3-3 FPL Cause Codes (102) Table
4-1 The Frequency of Interruptions due to Tree Limbs (Cause Codes 20 & 21)
4-2 Prediction of N Using Maximum Temperature of All the MAs
5-1 Summary Table of Covariance for All the Input Variables Considered in the Principal Component Analysis
5-2 Performance Comparison Between Statistical Model and ANN Model
LIST OF FIGURES
2-1 Regions with strong contamination (UHV project)
2-2 Swing angle as a function of instantaneous wind speed at tower
2-3 Vegetation effects on power interruptions
2-4 Number of days with thunderstorms in Florida (US Weather Bureau)
2-5 Cumulative frequency distribution of peak current amplitudes in downward negative flashes
3-1 Snap shot of Florida map
3-2 FPL's historical SAIFI performance
3-3 Frequency charts of interruptions and customers affected by
4-1 N for all management areas from 1998 to 2001 using the previous filters
4-2 Rain (inches) for all management areas from 1998 to 2001
4-3 Wind-2 minutes maximum speed (mph) for all management areas from 1998 to 2001
4-4 Variation of average N due to transformer failures vs. maximum temperature
4-5 Variation of average N due to transformer failures vs. maximum temperature (averaged per month per year)
4-6 Variation of average N due to transformer failures vs. max temperature (averaged per month)
4-7 Variation of N vs. wind
4-8 Mean of 2 minutes wind speed vs. average number of interruptions
4-9 Variation of N vs. rain
4-10 Variation of N vs. rain in the interval [0 2]
4-11 Impact of rain and wind together on N
5-1 ANN structures
5-2 A back propagation ANN model
5-3 Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year 2002
5-4 Predicted N and the actual N for a few of the cases in North Dade MA for 2002
5-5 Numerical values of weather and interruption data under consideration
5-6 Histogram plot of predicted interruptions
5-7 Prediction results of ANN using the original N (not shift adjusted)
5-8 Prediction results of ANN using the adjusted N (shift adjusted)
5-9 Mean and standard deviation of actual N
5-10 Mean and standard deviation of adjusted N
5-11 Mean squared error vs. training epochs
5-12 Graphical user interface developed to predict interruptions
5-13 Predicted interruptions vs. actual interruptions for Central Dade
6-1 Average precipitation difference
Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science
POWER DISTRIBUTION RELIABILITY AS A FUNCTION OF WEATHER
Roop Kishore R Matavalam
Chair: Alexander Domijan, Jr.
Major Department: Electrical and Computer Engineering
The system principles that are used to design and maintain electric power
distribution grids (distinct from transmission grids) are intended to minimize the number
and duration of power disturbances, including interruptions. Many of these principles,
such as load flow and load prediction are well understood and have been refined over
many years. However, the impact of local weather conditions on power distribution grids
has not been well researched. The current thesis is intended to improve our understanding
of the effects of weather on power distribution systems and to develop tools for the
prediction of weather related interruptions. Developing and disseminating this
information will allow electric power engineers to ultimately improve our nation's power
The current research also presents the novel concept of predicting the number of
power interruptions in a distribution system using weather parameters. Preliminary
results show that it is possible to define and build a model, using artificial neural
networks (ANNs), which can use weather parameters as inputs and predict the number of
interruptions with reasonable accuracy. The accuracy of the prediction depends, in part,
on the accuracy of the weather data that are used in the model and, in part, on the
precision of the model. It is expected that the use of real-time surface weather data, such
as can be collected from well-sited weather stations, will eliminate the uncertainty
inherent in weather data collected from geographically distant sources.
INTRODUCTION
For any power company, providing reliable electric service is the number-one
priority. Unfortunately, power interruptions are sometimes unavoidable. The
contribution of weather to these interruptions is significant, as will be shown
in this thesis. The terms power outage and power interruption are often used
interchangeably, even by many people in the power industry, to mean the same
thing: loss of power supply. There is, however, a fine difference between the two
terms, as stated by the IEEE 1366 standard:
An outage is defined in IEEE 1366-2001 as:
The state of a component when it is not available to perform its intended function
due to some event directly associated with that component. Notes: (1) An outage
may or may not cause an interruption of service to customers, depending on system
configuration. (2) This definition derives from transmission and distribution
applications and does not apply to generation outages.
An interruption is defined in IEEE 1366-2001 as:
The loss of service to one or more customers connected to the distribution portion
of the system. Note: It is the result of one or more component outages, depending
on system configuration.
FPL standards define the loss of power supply in two ways, based on the
duration of the power disturbance to the customer:
Momentary Interruption: Single operation (Open-Close) of an
interrupting device which results in zero voltage for a period of 59
seconds or less.
Sustained Interruption: Power loss to the customer that has lasted for at least
one minute.
Note: In this thesis, "interruptions" means sustained interruptions, and N
represents the total number of sustained interruptions per day.
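The two FPL duration classes and the daily count N can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names and the sample durations are hypothetical, and the treatment of durations strictly between 59 and 60 seconds (which the quoted definitions do not cover) is an assumption.

```python
# Minimal sketch: classify interruptions by duration and count N for one day.
SUSTAINED_THRESHOLD_S = 60  # "at least one minute" in the definition quoted above


def classify_interruption(duration_s: float) -> str:
    """Return 'momentary' or 'sustained' for a zero-voltage period given in seconds."""
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    # Assumption: durations strictly between 59 s and 60 s, which the quoted
    # definitions leave open, are treated as momentary.
    return "sustained" if duration_s >= SUSTAINED_THRESHOLD_S else "momentary"


def count_sustained(durations_s) -> int:
    """N for one day: the number of sustained interruptions among the durations."""
    return sum(1 for d in durations_s if classify_interruption(d) == "sustained")


sample_day = [5, 59, 60, 1800, 45]  # made-up durations in seconds
n_for_day = count_sustained(sample_day)
print(n_for_day)
```

Only the events lasting one minute or longer contribute to N in this sketch; momentary events are ignored, in line with the note above.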
1.1 Importance of Power Reliability
Power interruptions cause huge financial losses to the economy, in addition to lost
customer satisfaction, so power reliability is one of the most important
concerns for electric utilities. Power reliability modeling and indexing are among the
tools utilities use to manage costs and monitor equipment performance, and
improvements in the flexibility of reliability assessment models will ultimately result in
increased savings. According to Contingency Planning Research Company's annual
study, downtime caused by power disturbances results in major financial losses, as
shown in Figure 1-1.
Figure 1-1. Average hourly impact of downtime and data loss by business sector
Traditionally, the reliability of an electric distribution system is measured in terms
of the total number of power interruptions that occurred.
A variety of indices are currently available to measure the reliability of a
power utility. A utility uses these indices as a yardstick to track how well it is
doing each year and to compare its performance with that of other utilities of
different sizes in the country.
1.2 Understanding Power Reliability Indices
Among the many power distribution reliability indices available, the following are
some of the most widely used customer based indices. In the equations below, $N_i$ is
the number of customers interrupted by interruption event $i$, $r_i$ is the restoration
time for that event, and $N_T$ is the total number of customers served.
System average interruption frequency index (sustained interruptions). This index
is designed to give information about the average number of sustained interruptions per
customer over a predefined area. In words, the definition is
SAIFI = (Total number of customer interruptions) / (Total number of customers served)
Mathematically it is represented as
$$\mathrm{SAIFI} = \frac{\sum_i N_i}{N_T}$$
System average interruption duration index. This index is commonly referred to as
customer minutes of interruption or customer hours, and is designed to provide
information about the average time the customers are interrupted.
SAIDI = (Sum of customer interruption durations) / (Total number of customers served)
Mathematically it is represented as
$$\mathrm{SAIDI} = \frac{\sum_i r_i N_i}{N_T}$$
Customer average interruption duration index. CAIDI represents the average time
required to restore service to the average customer per sustained interruption. In words,
the definition is
CAIDI = (Sum of customer interruption durations) / (Total number of customer interruptions)
Mathematically it is written as
$$\mathrm{CAIDI} = \frac{\sum_i r_i N_i}{\sum_i N_i} = \frac{\mathrm{SAIDI}}{\mathrm{SAIFI}}$$
Average service availability index. This index represents the fraction of time (often
expressed as a percentage) that a customer has power provided during one year or the
defined reporting period. In words, the definition is
ASAI = (Customer hours service availability) / (Customer hours service demand)
To calculate the index, the following mathematical equation is used:
$$\mathrm{ASAI} = \frac{N_T \times (\text{No. of hours/year}) - \sum_i r_i N_i}{N_T \times (\text{No. of hours/year})}$$
Note that there are 8760 hours in a regular year and 8784 in a leap year.
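As a worked illustration of the four word definitions above, the following sketch computes SAIFI, SAIDI, CAIDI and ASAI from a small list of interruption events. The event list, the number of customers served, and the function name are made-up assumptions for illustration only; restoration times are taken in hours so that they are consistent with the hours-per-year term in ASAI.

```python
# Minimal sketch: customer based reliability indices from a list of events.
HOURS_PER_YEAR = 8760  # regular year; use 8784 for a leap year


def reliability_indices(events, customers_served, hours=HOURS_PER_YEAR):
    """events: iterable of (customers_interrupted, restoration_hours) pairs."""
    events = list(events)
    total_interruptions = sum(n for n, _ in events)   # total customer interruptions
    customer_hours = sum(n * r for n, r in events)    # customer interruption durations
    saifi = total_interruptions / customers_served
    saidi = customer_hours / customers_served          # hours per customer served
    caidi = customer_hours / total_interruptions       # equals SAIDI / SAIFI
    demand = customers_served * hours                  # customer hours service demand
    asai = (demand - customer_hours) / demand
    return {"SAIFI": saifi, "SAIDI": saidi, "CAIDI": caidi, "ASAI": asai}


# Hypothetical data: three sustained interruption events in one year.
sample_events = [(500, 2.0), (1200, 0.5), (300, 4.0)]
indices = reliability_indices(sample_events, customers_served=10_000)
for name, value in indices.items():
    print(f"{name}: {value:.6f}")
```

Because CAIDI is the ratio of the SAIDI and SAIFI numerators, reducing the number of interruptions lowers SAIFI and SAIDI together but does not by itself change CAIDI.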
Some of the other customer based reliability indices in use are CTAIDI, ASAI etc.
Load based sustained indices include ASIFI, ASIDI etc. Momentary indices include
CEMSMIn, AMAIFI and MAIFIE. The newest indices are CEMIn (customers
experiencing multiple interruptions) and CEMSMIn (customers experiencing multiple
sustained interruptions and momentary interruption events). The use of CEMIn as a basis
for performance measurement is under consideration by many states in the USA. In Florida, usage
of CEMI5 index is under consideration. If this value is exceeded, the commission is
considering fines that would be paid to the customers who experienced poor
Figure 1-2 shows a snapshot of the percentage usage of the different reliability
indices, as reported by the IEEE working group on system design. Surveys found that
the most commonly used indices are SAIFI, SAIDI, CAIDI, and ASAI.
[Figure 1-2 image: bar chart of index usage; categories: SAIFI, SAIDI, CAIDI, ASAI, ASIFI, ASIDI, MAIFI, OTHER, CTAIDI, CAIFI]
Figure 1-2. Percentage of companies using a given index reporting in 1990
1.3 Purpose and Importance of this Thesis
It is evident from the indices discussed above that the total number of
interruptions must be reduced in order to improve the reliability of a distribution system.
Although many parameters and conditions are responsible for these
interruptions, weather remains a significant contributor. Research
carried out by the Electric Power Research Institute showed the effects of weather
components such as lightning, rain and wind on transmission lines (345 kV and above).
However, little comparable work has been done for distribution lines, which prompted
us to pursue research in this new direction.
The purpose of this thesis was to study the impact of normal weather
conditions on distribution power interruptions and to develop correlation models. The
study also covers how this correlation knowledge can be used to reduce power
interruptions by incorporating Artificial Neural Networks (ANN). Using weather and
interruption data, novel prediction tools and modeling methods were developed; a
similar approach can be applied to any electric utility in the country. With the
tools developed in this thesis, a power utility would be able to
estimate the approximate total number of interruptions that might happen in the
future due to weather. The accuracy of the prediction of interruptions depends, in part, on
the accuracy of the forecast weather data which serve as input to the model and, in part,
on the accuracy of the model. It is expected that the use of real-time surface weather data,
such as can be collected from well-sited weather stations, will eliminate the uncertainty
inherent in weather data collected from geographically distant sources.
Despite a thorough search of the available literature, examples of the use of surface
weather data in the construction of power reliability models have not been found. It is
expected that this project will contribute significantly to the existing literature by
providing predictive models, as well as background in a previously unexplored area.
Moreover, this research is valuable for the exploration of proper ANN structure, internal
parameters and feature pattern extraction methods for application to power systems.
1.4 Organization of Thesis
This thesis comprises six chapters. The current chapter gives the
motivation, literature survey and importance of the project. The second chapter
describes the weather conditions prevalent in the state of Florida and gives
a broad overview of the different weather parameters. The third chapter deals
with the Florida Power & Light (FPL) system and with the weather and
interruption data considered in the analysis. The fourth chapter presents the approach
followed in developing correlation models between weather parameters and power
interruptions using statistical tools. The fifth chapter presents a novel model using neural
networks that can be used to predict power interruptions based on forecasted
weather parameters. The last chapter discusses the limitations of the research
results and the conclusions of this thesis, followed by future work.
UNDERSTANDING FLORIDA WEATHER
Because weather in Florida is highly variable, this chapter explains the weather
parameters that are most common in Florida and their impact on power distribution
lines. Florida is also known as the lightning capital of the world.
The natural weather phenomena that can reduce the strength of
insulators in the state of Florida are air density, temperature, pressure, humidity, rain,
dew or condensation of humidity, pollution, wind and lightning.
2.1 Air Density, Temperature and Pressure
The temperature in the state of Florida varies between 20°F and 100°F,
and the pressure does not change significantly. Their influence on the strength of the
insulators, absent abnormal conditions such as fire, is around 5%.
2.2 Humidity
The humidity in Florida changes significantly throughout the year and throughout
the day. Usually, the humidity is very high between midnight and 6 or 7 A.M., after
which it starts to decrease. It starts to increase again at night. Humidity
without condensation can change the strength of the insulator by around 16%.
2.3 Rain
Rain can be classified into weak rain (mist or drizzle) or strong rain
(rainfall). The influence of rain on insulator strength varies and depends on
its intensity and direction. The maximum influence of rain on clean insulators is
around 30%.
Mist or drizzle can have a stronger impact on flashover
when it combines with pollution on the insulator strings.
2.4 Dew or Condensation of the Humidity
When the temperature decreases to the dew point, condensation starts taking place
on the surface of the insulators and a thin layer of water appears. Condensation
can have the same effect on an insulator as mist.
2.5 Pollution
Pollution can reduce the strength of an insulator. Its influence on flashover depends
on the type of contamination and its concentration on the surface of the insulators.
We are concerned here with two types of contamination: spot
contamination and area contamination. The distribution of contamination in Florida
is shown in Figure 2-1. Black dots refer to spot contamination and shaded regions refer to
area contamination.
Figure 2-1. Regions with strong contamination (UHV project)
Table 2-1 shows the types of contamination that cause flashover. In dry conditions
most of them are not good conductors; in wet conditions, however, condensation,
drizzle, mist or rain increases their conductivity substantially.
Table 2-1. Type of Contaminant and Atmospheric Conditions at the Time of
Contamination Flashover (UHV Project)
Type of Contaminant   Fog   Dew   Drizzle, Mist   Ice   Rain   No Wind   High Wind   Wet Snow   Fair
Sea-salt 14 11 22 1 12 3 12 3
Cement 12 10 16 2 11 4 1 4
Fertilizer 7 5 8 1 1 4
Flyash 11 6 19 1 6 3 1 3 1
Road salt 8 2 6 4 2 6
Potash 3 3
Cooling tower 2 2 2 2
Chemicals 9 5 7 1 1 1 1
Gypsum 2 1 2 2 2
Mixed contamination 32 19 37 13 1 1
Limestone 2 1 2 4 2 2
Phosphate and sulfate 4 1 4 3
Paint 1 1 1
Paper mill 2 2 4 2 1
Dried milk 1 1 1 1 1
Acid exhaust 2 3
Bird droppings 2 2 3 1 2 2
Zinc industry 2 1 2 1 1
Carbon 5 4 5 4 3 3
Soap 2 2 1 1
Steel works 6 5 3 2 2 1
Carbide residue 2 1 1 1 1
Sulphur 3 2 2 1 1
Copper and nickel salt 2 2 2 2 1
Wood fiber 1 1 1 1
Bulldozing dust 2 1 1
Aluminum plant 2 2 1 1
Sodium plant 1 1
Active dump 1 1 1
Rock crusher 3 3 5 1
Total 146 93 166 8 68 26 19 37 4
Percent weather 25.75 16.4 29.3 1.4 12 4.58 3.36 6.52 0.71
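The "Percent weather" row of Table 2-1 can be checked against the "Total" row with a few lines of arithmetic. The sketch below is only a consistency check; the column names are read from the table header and their order is an assumption, since the extracted header is damaged.

```python
# Minimal sketch: recompute the "Percent weather" row from the "Total" row.
conditions = ["Fog", "Dew", "Drizzle/Mist", "Ice", "Rain",
              "No wind", "High wind", "Wet snow", "Fair"]  # assumed column order
totals = [146, 93, 166, 8, 68, 26, 19, 37, 4]  # "Total" row of Table 2-1

grand_total = sum(totals)
percent = {c: round(100 * t / grand_total, 2) for c, t in zip(conditions, totals)}

for condition, share in percent.items():
    print(f"{condition}: {share}%")
```

The recomputed shares agree with the tabulated row to within rounding, which supports reading the sixth entry of that row as 4.58.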
Pollution combined with condensation and rain can be considered the worst
condition for reduction of insulator strength. Moreover, dew, drizzle and mist are
considered the most important weather components at the time of flashover, accounting
for 72% of the cases.
Rain has two different effects. On one hand, it reduces the insulators'
withstand values. On the other hand, it cleans the surface of the insulators, thereby
protecting the system from new flashovers due to dew (condensation), mist and pollution.
2.6 Wind
Special weather conditions, such as storms, thunderstorms, hurricanes and tornados,
have a direct effect on power interruptions. However, these events are a combination of
rain, wind and lightning. Therefore, each of these weather components must be
analyzed individually in order to study the real cause behind a flashover. The
influence of wind speed on swing angle, and therefore the minimum
clearance required to avoid possible flashover, can be evaluated. Wind can cause
catastrophic mechanical damage due to asynchronous movements of the cables and/or
insulators. This damage is more significant in transmission lines, where the span is
larger. Figure 2-2 shows the swing angle of a single conductor vs. wind speed.
[Figure 2-2 image: swing angle as a function of mean wind speed; horizontal axis: wind speed (meter/sec)]
Figure 2-2. Swing angle as a function of instantaneous wind speed at tower
Distribution lines have short spans, so the asynchronous movement of the
conductors usually causes insignificant disturbance, except at high wind
speeds. The most significant wind-related disturbance comes from the movement of
trees and their branches. Untrimmed trees can touch the lines and cause a
flashover, leading to an outage.
Figure 2-3 shows uncut trees and branches touching distribution lines
in Gainesville, Florida.
Figure 2-3. Vegetation effects on power interruptions
2.7 Lightning
The number of days with thunderstorms in Florida is between 80 and 100, as
shown in Figure 2-4 (isokeraunic level). As Figure 2-4 shows, Florida is
strongly affected by atmospheric discharges. The number of lightning strokes to earth
per square mile per year can be found through the expression
$$N = 0.25\, I$$
where $I$ is the local isokeraunic level.
If a line has a shadow width of $W$, the number of lightning strokes expected to hit it per year is
$$N_L = \frac{0.25\, I\, L\, W}{5280}$$
where $W = b + 4h$, $L$ is the length of the line in miles, $h$ is the height of the shield (ground)
wires and $b$ is the distance between shield wires. If the line does not have shield wires, $h$ is the
height of the conductors and $b$ is the distance between the external conductors.
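The two expressions above can be evaluated for a sample line as follows. This is a sketch under stated assumptions: the line dimensions are made up, the function names are hypothetical, and b and h are assumed to be in feet, which is what the division by 5280 (feet per mile) suggests.

```python
# Minimal sketch: expected lightning strokes to a line per year.
FEET_PER_MILE = 5280


def ground_flash_density(isokeraunic_level):
    """Strokes to earth per square mile per year: N = 0.25 * I."""
    return 0.25 * isokeraunic_level


def shadow_width_ft(b_ft, h_ft):
    """Shadow width W = b + 4h (b and h assumed to be in feet)."""
    return b_ft + 4 * h_ft


def strokes_to_line_per_year(isokeraunic_level, length_miles, b_ft, h_ft):
    """N_L = 0.25 * I * L * W / 5280."""
    w = shadow_width_ft(b_ft, h_ft)
    return ground_flash_density(isokeraunic_level) * length_miles * w / FEET_PER_MILE


# Hypothetical 10-mile line without shield wires: conductors 30 ft high,
# outer conductors 8 ft apart, isokeraunic level 90 (within Florida's 80-100 range).
n_l = strokes_to_line_per_year(isokeraunic_level=90, length_miles=10, b_ft=8, h_ft=30)
print(f"Expected strokes to the line per year: {n_l:.2f}")
```

The result scales linearly with line length and isokeraunic level, so the same sketch can be reused for other lines by changing the inputs.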
Strokes on transmission or distribution lines with shield wires are expected to
cause a back flashover. The voltage across the insulator string in this case depends on
the tower foot resistance, the current through the tower and the coefficient of coupling
between the shield wire and the phase conductor.
Figure 2-4. Number of days with thunderstorms in Florida (US Weather Bureau)
Thus, flashover or interruption due to lightning depends on the tower foot
resistance and on the intensity of the current. Since the current cannot be controlled,
all control must be exercised through the tower foot resistance. Distribution lines without
shield wires are directly affected by lightning. The voltage across the
string or insulator depends on the intensity of the current and the magnitude of the surge
impedance of the line. Figure 2-5 shows the crest amplitude of the strokes against the
probability of occurrence. It shows that the probability of a stroke having
a current of at least 5 kA is almost 1. On striking the conductor, this current
divides into two halves of 2.5 kA each. Thus, the voltage across the insulators or strings of distribution
lines with surge impedance between 150 Ω and 250 Ω will be roughly between 375 kV
(2.5 kA × 150 Ω) and 625 kV (2.5 kA × 250 Ω). Thus, in most cases, distribution lines up to 69 kV (BIL up to 350
kV) will experience a flashover for practically every stroke that hits them.
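The arithmetic behind this conclusion can be reproduced with a short sketch. The function name is hypothetical; the model is simply the one described in the paragraph above, in which half of the stroke current flows into the line surge impedance.

```python
# Minimal sketch: insulator voltage for a direct stroke on an unshielded line.
BIL_KV = 350  # basic insulation level quoted above for lines up to 69 kV


def insulator_voltage_kv(stroke_current_ka, surge_impedance_ohm):
    """Voltage (kV) when the stroke current splits in two along the conductor."""
    # kA multiplied by ohm gives kV directly.
    return (stroke_current_ka / 2) * surge_impedance_ohm


for z_ohm in (150, 250):
    v_kv = insulator_voltage_kv(5, z_ohm)
    print(f"Z = {z_ohm} ohm: V = {v_kv} kV, exceeds BIL: {v_kv > BIL_KV}")
```

Both ends of the surge impedance range give a voltage above the 350 kV BIL for a 5 kA stroke, which is the basis for the statement that practically every stroke leads to a flashover.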
Figure 2-5. Cumulative frequency distribution of peak current amplitudes in downward
negative flashes
SYSTEM UNDER STUDY FLORIDA POWER & LIGHT
The preliminary results cited in this thesis were obtained while
working on a project sponsored by Florida Power & Light (FPL). The concept,
however, can be applied to any system and can be enhanced as explained in this
thesis. FPL's power distribution interruption data were used to study the
effects of weather on FPL's interruption patterns.
3.1 Description of the FPL distribution system
FPL is among the largest and fastest growing electric utilities in the United States.
As of December 2002, FPL had 9,612 employees serving nearly 8 million people, or
about half the state of Florida. Power is delivered (Table 3-1) safely and reliably from 86
generating units with a total generation capability of 26,203 MW, through more than 500
substations and more than 69,000 miles of transmission and distribution lines.
Table 3-1. FPL Power Sales by Sectors
Sector   Number of Accounts   Total Sales (kWh)
Residential 3,521,146 50.9%
Commercial 430,471 40.6%
Industrial 15,248 4.4%
Other** 2,746 4.1%
* Monthly average as of December 2001
** Includes public authorities, railway, wholesale and interchange.
The FPL distribution system is divided into 16 management areas (MAs) grouped under two
regions, urban and suburban (Table 3-2). Figure 3-1 shows the location and area covered
by each of the FPL distribution management areas.
Table 3-2. FPL Distribution Management Areas along with Their Dispatch Centers
DISPATCH   West Palm Beach    South Florida      Daytona         Sarasota
           Dispatch           Dispatch Center    Dispatch        Dispatch Center
AREAS      TC - Treasure      PM - Pompano       NF - North      MS - Manasota
           WB - West Palm     WG - Wingate       CF - Central    TB - Toledo
           BR - Boca Raton    GS - Gulf Stream   BV - Brevard    GC - Gulf Coast
Figure 3-1. Snap shot of Florida map (a) FPL distribution management areas. (b) Weather
stations chosen within the FPL area
3.2 Interruption Data from FPL
Power interruption data were obtained primarily from Florida Power & Light (FPL).
FPL has divided its service territory into 16 sections called Management Areas
(MAs). An interruption data file was created for each of these 16 MAs. The interruption
data were made available to us in a data storage program known as "Power-Play." For
clarity, interruptions are classified into different groups (Figure 3-4) based
on the cause responsible for them. For example, interruptions caused by squirrels are
recorded under the squirrel cause code (007). A cause code indicates the principal cause
of an interruption. With this facility, it is possible to view the interruptions
attributable to any specific cause, as shown in Table 3-3.
Table 3-3. FPL Cause Codes (102) Table
REV 6/16/98 SC
(Required for all interruptions)
001-E Lightning, with equip. damage
002 Storm w/no equip. damage
004-E Salt Spray Corrosion
011 Other Animal
015 Ice on Lines
020 Tree/Limb Preventable
021 Tree/Limb Unpreventable
024-E Corrosion (Non Salt Spray)
026-E Contamination (Non S/S)
187-E Equipt. Failed, Cause Unk
170 Wrong size fuse
171 Overloaded Device
178 Non-standard Construction
183 Improper Installation
193 Customer Request
195 Crew Request (Planned)
196 Slack Conductors
197 Other (explain)
202-E Loose Connection
041 Accidental Contact
046 Switching Error
079 Dig-In (Proper Depth)
Support and Follow-Up Codes
(Codes to be used as Support or Follow-up Only)
No Animal Guard
241 Injection Elbow (Not Installed)
022 Palm Tree
050 Foreign Crew or Customer
066 FPL Crew
067 FPL Distribution Contractor
068 FPL Line Clearing Contractor
069 Transmission Contractor
075 Improper Depth
100 Inadequate/No Ground
192 Crew Request (Forced Outage)
199 Defective Material-UPR
222 Power Temp Used
240 Injection Elbow (Installed)
242 Flow (Positive)
243 Flow (None)
244 Injection Comp
245 Injection Job Pndng
250 Cable Replace Job Pending
251 Cable Replace Job Comp
260 Fault Locator Used
265 Cleared by Phone
271 Injected (8/96 on)
272 Replaced (8/96 on)
299 Data Corrected
999 Named Storm Exclusions
Note: The suffix "E" denotes that an Equipment Code is required.
080 Down Guy or Anchor
082 Cross Arm
084 Pole Top Pin
087 Tie Wire
090 Hot Line Clamp
092 Disconnect Switch
093 Fuse Switch
096 Line OCR
097 Line Capacitor
098 Line Regulator
104 Conductor Down
105 Conductor Damaged
114 Tx Fuse Switch
115 Tx Blade Switch
116 Bayonet Switch
122 Oil Fuse Cutout
123 RA Switch
124 Mech for Throwover Sw.
125 PT Fuse
126 ConductCKT Fuse
127 Control Cable
211 Injected Cable
Overhead or Underground
085 Arrester
091 Connector
094 Transformer
102 Other Equipment
103 Splice
106 Automated Switch (DA)
095 Step Down Transformer
161 Meter Blocks, Repairable
164 Other Meter Equipment
165 Meter Blocks-Not Repairable
200 Transmission related
Weather Related Codes
NO EQUIP DAMAGE
140 OCB (Feeder Breaker)
147 Step Down Transformer
148 Other Substation Equip.
The interruption data are daily totals broken down by cause code. For example, a
specific substation may experience three interruptions on September 29, 1998, due to
cause code 093 (fuse switch). Each data point is small, but the compilation of many
years and many areas provides a statistically significant sample.
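A minimal sketch of how such daily totals could be built from individual interruption records is given here. The record layout (day, substation, cause code) and the substation label `SUB-A` are hypothetical; the actual Power-Play export format is not described in this text.

```python
from collections import Counter
from datetime import date

# Hypothetical raw records: one tuple per interruption (day, substation, cause code).
records = [
    (date(1998, 9, 29), "SUB-A", "093"),
    (date(1998, 9, 29), "SUB-A", "093"),
    (date(1998, 9, 29), "SUB-A", "093"),
    (date(1998, 9, 29), "SUB-A", "020"),
    (date(1998, 9, 30), "SUB-A", "093"),
]

def daily_totals(rows):
    """Count interruptions per (day, substation, cause code)."""
    return Counter(rows)

totals = daily_totals(records)
for key, count in sorted(totals.items()):
    print(key, count)
```

With these sample records, the entry for September 29, 1998, cause code 093, holds three interruptions, matching the example in the text.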
The interruptions of interest to us were further defined by the use of the following
filters:
* With Exclusions: This filter suppresses interruption data that FPL defines as
exclusionary, including hurricane and tornado damage. We use this filter
because we are interested in the effects of normal weather conditions.
* Overhead: This filter includes only those interruptions that are caused by faults in
overhead equipment or lines. Underground lines were considered immune to most
* Internal Distribution: Only interruptions located in the distribution system were
taken into account.
* Primary: Only primary systems (feeders, laterals and oil circuit breakers) are
within the scope of this research.
* Substation: Each substation reports all the interruptions occurring in the secondary
distribution system it supplies.
* Cause code: FPL uses numeric cause codes to specify the causes of interruption.
General categories include natural causes, equipment and accident.
* Dates: All relevant days from January 1, 1998 through December 31, 2001 were
considered. Up-to-date interruption data can be requested from FPL.
Assumptions about FPL system: An important consideration in choosing FPL is
that FPL has assured us that its equipment can be assumed to be
homogeneous throughout its area of operations (AO). Homogeneity of equipment is a
necessary condition for statistically significant results.
Scope of current Thesis: Because FPL personnel have already performed a correlation
analysis between lightning strikes and power interruptions, they are interested in the
indirect effects of weather, including wind, temperature, and rain, on the total number of
power interruptions. So the scope of the current research work is limited to these
Although interruptions represent between 3% and 5% [6-8] of the frequency of
disturbances, a common method for measuring the reliability of an electric distribution
system is based on the number of customers interrupted, which is proportional to the
number of interruptions, as explained in chapter 1. Let us revisit the definition of SAIFI,
the reliability index that FPL uses most often. IEEE Standard 1366 defines the
System Average Interruption Frequency Index (SAIFI) with the following formula:
Ni = Number of interruptions (sustained interruptions lasting over 1 minute)
Ci = Customers interrupted for each interruption
Cb = Customer base or customers served
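The formula itself is not legible in this copy. As a hedged sketch, IEEE Standard 1366 defines SAIFI as the total number of customers interrupted (summed over all sustained interruptions) divided by the total number of customers served. How the symbols Ni, Ci and Cb are arranged in the thesis formula is an assumption here; the function and argument names are illustrative.

```python
def saifi(customers_interrupted, customers_served):
    """System Average Interruption Frequency Index.

    customers_interrupted: customers affected by each sustained interruption (Ci)
    customers_served: customer base (Cb)
    """
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    return sum(customers_interrupted) / customers_served

# Three sustained interruptions on a system serving 10,000 customers.
print(saifi([1200, 300, 500], 10000))
```

A SAIFI of 0.2 in this toy case means that the average customer experienced 0.2 sustained interruptions over the period considered.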
Figure 3-2. FPL's historical SAIFI performance (x-axis: years from December 1995 (D-95)
to December 2001 (D-01); J = January, D = December)
SAIFI indicates how often the average customer experiences a sustained
interruption (> 1 min) over a predetermined period of time, and it has special importance
in decision making for engineers working in distribution reliability. A typical breakdown
of the major causes, ranked by customers interrupted and by number of interruptions,
for the FPL distribution system over a period of 12 months is shown in Figure 3-3.
(Panel title: Number of Interruptions-2001)
Figure 3-3. Frequency charts of interruptions and customers affected by interruptions. (a)
number of customers interrupted vs. causes and, (b) number of interruptions a
The previous graphs show the relative importance of the direct effects of weather
(storms and lightning) on the interruptions, but not the indirect effects, i.e.,
temperature, rain, and wind. From this chart it is not possible to determine whether
interruptions associated with, for example, vegetation or equipment are indirectly
affected by weather conditions such as temperature, rain or wind.
The current thesis shows that normal weather conditions do have effects on
interruptions and that those effects can be quantified. The benefits of this type of study
are the ability to explain trends in the SAIFI due to weather conditions and an aid in
the development of indicators for possible use in anticipating interruptions.
3.3 Meteorological Weather Data from NCDC
We collected daily average weather data for rain, temperature, wind speed and
other parameters from Automated Surface Observation Stations (ASOSs) located at
airports throughout the area of operations (AO). Construction of these stations began
in 1981 as an aid to air navigation, and they have since become the most comprehensive
source of weather data in the United States. For the stations we are interested in, we
use data from 1996 through 2002. As we are an educational institution, the National
Climatic Data Center (NCDC), a department of the National Oceanic and Atmospheric
Administration, is making these data available to us free of charge.
The greatest difficulty in collecting these data is their sheer volume. Six years of data
from one ASOS generate a file containing 24 columns and 2190 lines. Stacking files
from all the ASOSs in the AO generates a composite file with more than 20,000 lines.
To add to this problem, there are missing days, missing data points and formatting that
cannot be imported into the statistical program of our choice. A final editing of the data
was done by brushing (taking out) those data points that do not make sense: data points
with zero barometric sea-level pressure, zero average temperature, a 2-minute maximum
wind gust of 100 mph, etc.
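The brushing step can be sketched as a simple plausibility filter. The field names and the choice to reject gusts of 100 mph or more are assumptions made for illustration; the thesis only lists the three kinds of implausible values.

```python
def is_plausible(obs, max_gust_mph=100.0):
    """Return True if a daily observation passes the three brushing checks."""
    return (
        obs["sea_level_pressure"] > 0            # zero pressure is a recording gap
        and obs["avg_temp_f"] != 0               # zero average temperature is a gap
        and obs["max_gust_2min_mph"] < max_gust_mph  # assumed cutoff for bad gusts
    )

# Hypothetical daily observations.
observations = [
    {"sea_level_pressure": 30.02, "avg_temp_f": 78.0, "max_gust_2min_mph": 17.0},
    {"sea_level_pressure": 0.0, "avg_temp_f": 80.0, "max_gust_2min_mph": 12.0},
    {"sea_level_pressure": 29.95, "avg_temp_f": 0.0, "max_gust_2min_mph": 9.0},
    {"sea_level_pressure": 29.90, "avg_temp_f": 75.0, "max_gust_2min_mph": 100.0},
]

cleaned = [obs for obs in observations if is_plausible(obs)]
print(len(cleaned), "of", len(observations), "observations kept")
```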
To address this, we wrote programs in C/C++ that correct the omissions and
convert the NCDC files to a generic text file that can be imported by any commercial
spreadsheet program presently in use. Since we anticipate the use of ASOS data by any
power engineer using our methods, file conversion is required by our objective.
3.4 Weather Parameters of interest
Although many weather parameters are available in the weather files downloaded
from the NCDC website, we used only those parameters that are of interest. A weather
data file was created for each of the 15 ASOSs (one particular ASOS covered two
regions). The following daily weather parameters were downloaded from the NCDC
* Average temperature
* Maximum temperature
* Minimum temperature
* Average dew point
* Significant observations
* Total rainfall
* Barometric pressure (sea level and station)
* Average wind speed
* Two-minute maximum sustained wind gust
* Five-second maximum sustained wind gust
Weather is a complex combination of many parameters including, but not limited to,
wind, lightning, condensation, temperature, rain, pressure, humidity, cosmic dust, solar
storms, hurricanes, and storms; the list grows further if every meteorological term is
included. When only the prevailing daily weather conditions are considered, however, many
of these parameters can be set aside as extreme weather conditions, i.e., parameters that
are not part of common daily weather, such as hurricanes, storms, and lightning. The major
focus was therefore on weather parameters such as wind, temperature, rain, pressure, and
humidity, which are not extreme weather conditions. Among these common weather
parameters, only wind, rain, and temperature are investigated, because of their dominant
role in power distribution interruptions.
CORRELATION OF WEATHER AND INTERRUPTIONS
Consequences of power interruptions can range from mild to severe inconvenience
such as missing your favorite TV show or losing critical data, to life threatening, such as
the failure of traffic signals. Less obvious consequences include increased cost to the
customer due to increased maintenance and repair costs for the provider. Because of these
consequences, power engineers are always researching methods to reduce the number of
The first step to reducing interruptions is to define the causes and quantify their
effects. Accidents, human error and aging equipment contribute a great deal, but weather
is still the largest single cause, although its effects are not as well understood as we
would like to think. We can all agree that adverse weather conditions cause power
interruptions. The evidence is apparent. When a bad thunderstorm hits, or a
hurricane arrives, many people experience power interruptions, and those who don't hear
about it on the news.
Lightning strikes, especially common in Florida, create transients that overload
transformers and trip fused circuit breakers, both conditions requiring a repair crew to
restore power. High winds blow down trees, damaging conductors.
Less apparent is the effect of normal weather on the frequency of power
interruptions. Several days of moderate rain can saturate the ground, invading buried
lines and causing short circuits (FPL study). An unexpectedly warm season can promote
vegetation growth, causing interruptions due to tree limb/conductor contact. These and
other effects of normal weather are not easily defined because there is not a one-to-one
relationship such as 'lightning hit the line so the transformer blew.' In fact, many
preventable interruptions occur that are not properly attributed to weather because of
that lack of a one-to-one relationship.
4.1 Importance of Statistical Tools
Part of the interest of this project is to find the relationship between the number of
interruptions and normal weather conditions. Both interruptions and weather conditions
in the future are random. To gain a complete prediction of the number of interruptions in
the future, we need to predict future weather conditions and predict the number of
interruptions based on the predicted weather conditions.
However, because of limited resources and the difficulty of weather predictions, we
will process the conditional prediction for the number of interruptions assuming that the
weather condition is known. In this subsection, we will describe the probabilistic
characteristics of daily interruption frequencies and the sums of daily interruption
frequencies, i.e., monthly interruptions or sums of interruptions when it rains and when it
does not. Then the explanation on plausible statistical data analysis techniques for each
The daily outage frequencies take only nonnegative integer values and are strongly
skewed to the right. For example, the daily interruptions caused by tree limbs (Cause
Codes 20 and 21) range from 0 to 58, but 99.5% of the frequencies are less than or
equal to 5 (Table 4-1). Therefore, statistical techniques based on the normal distribution,
such as the t-test, normal linear regression, and ANOVA, introduce large biases in the
confidence intervals of estimates and lead to wrong conclusions in the search for
significant weather effects. As a reminder, the normal distribution is symmetric and has
the range (-∞, +∞).
Table 4-1. The Frequency of Interruptions due to Tree Limbs (Cause Codes 20 & 21)
Number of        Frequency   Percent   Cumulative
interruptions                          percent
0                15890       81.16     81.16
1                2582        13.19     94.35
2                632         3.23      97.57
3                223         1.14      98.71
4                102         0.52      99.23
5                53          0.27      99.50
6                38          0.19      99.70
>6               59          0.30      100.00
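The percent and cumulative columns of Table 4-1 follow directly from the frequency column, as the short sketch below shows. Only the frequencies come from the table; the variable names are illustrative.

```python
from itertools import accumulate

# Frequency column of Table 4-1 (daily tree-limb interruptions).
counts = {"0": 15890, "1": 2582, "2": 632, "3": 223,
          "4": 102, "5": 53, "6": 38, ">6": 59}

total_days = sum(counts.values())
percent = {k: 100.0 * v / total_days for k, v in counts.items()}
# Accumulate the integer counts first, then convert, to avoid summing rounded values.
cumulative = {k: 100.0 * c / total_days
              for k, c in zip(counts, accumulate(counts.values()))}

for k in counts:
    print(f"{k:>2}  {counts[k]:>6}  {percent[k]:6.2f}  {cumulative[k]:7.2f}")
```

The cumulative value at 5 interruptions reproduces the 99.5% figure quoted in the text, and the 81% share of zero-interruption days makes the right skew explicit.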
The nature of the weather data sets is first evaluated to know the behavioral
patterns of weather parameters. Some of the parameters of our interest are wind,
temperature and rain.
4.2 Probabilistic Characteristics of Data Distributions
As a prelude to presenting results, I will describe the probabilistic characteristics of
the data set we are dealing with, in order to determine what statistical data analysis
techniques and models should be used to correlate weather parameters with power
interruptions.
Daily interruption frequencies (for all cause codes) take only nonnegative integer
values, from 0 to 200, and are skewed to the right, as can be seen in Figure 4-1. Rain data
show an even stronger right skew (Figure 4-2), while the wind speed histogram
(Figure 4-3) is close to a normal distribution.
Figure 4-1. N for all management areas from 1998 to 2001 using the previous filters
Figure 4-2. Rain (inches) for all management areas from 1998 to 2001
Figure 4-3. Wind-2 minutes maximum speed (mph) for all management areas from 1998
Because a normal distribution is symmetric and the normal random variable is
continuous within the range (-∞, +∞), it cannot describe skewed, nonnegative counts;
these probabilistic characteristics must instead be explained using the Poisson
distribution.
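As a small illustration of why a Poisson model suits such counts, the sketch below evaluates the Poisson probability mass function, which is defined only on nonnegative integers and is right-skewed for small rates. The rate is chosen here so that the probability of zero matches the 81.16% share of zero-interruption days in Table 4-1; this is purely an illustrative choice, not the fit used in the thesis.

```python
import math

def poisson_pmf(k, lam):
    """P(K = k) for a Poisson random variable with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Illustrative rate: P(K = 0) = exp(-lam) set to 0.8116.
lam = -math.log(0.8116)

for k in range(4):
    print(k, round(poisson_pmf(k, lam), 4))
```

All probability mass sits on 0, 1, 2, ..., and it falls off quickly to the right, unlike a normal density, which would place mass on negative and fractional values.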
4.3 Correlation Analysis between Weather Parameters and Interruptions
The statistical correlation models between weather parameters of interest wind,
temperature and rain, and the total daily number of power interruptions (N) were studied.
4.3.1 Impact of Temperature on N
In this section, the impact of daily temperature variations on the power
interruptions due to transformer failures is studied.
The monthly averages (means) of the maximum temperatures were plotted on the X-axis
and the monthly means of the total number of interruptions due to transformer
failures on the Y-axis for 4 years (1998-2001) and all the MAs. Under these
conditions, the total number of monthly data points was about 567.
Regression Plot of N Vs. Max. Temp
Y = 202.803 - 5.09380 X + 0.0328080 X**2
S = 2.62145   R-Sq = 42.0%   R-Sq(adj) = 41.8%
(Plot legend: regression curve with 95% CI, confidence interval limits, and 95% PI,
prediction interval limits. X-axis: Mean of Max. Temperature (F), 65 to 95.)
Figure 4-4. Variation of average N due to transformer failures vs. maximum temperature
It can be observed from Figure 4-4 that the curve peaks at the two ends of the
X-axis. This can be attributed to the heavy load on the transformers caused by
maximum power usage at these temperatures, partly because all the
customers switch on their air-conditioning at once when the temperature is either very
high or very low. Between about 75°F and 80°F the transformer failure interruptions
change little, so this range can be regarded as an optimal
temperature. Above approximately 80°F, the curve rises increasingly steeply. The
right skewness of the graph indicates that the higher temperature effects are more
predominant than the lower ones; this is true for Florida, especially the southern part,
where it is sunny most of the year.
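The location of the minimum can be checked from the fitted quadratics. The sketch below uses the coefficients printed on the regression plots of Figures 4-4 to 4-6; where the sign of the linear term did not survive in this copy, it is assumed negative, which is the only choice that gives a curve with a minimum inside the plotted temperature range. Function names are illustrative.

```python
def quadratic(x, c0, c1, c2):
    """Evaluate c0 + c1*x + c2*x**2."""
    return c0 + c1 * x + c2 * x * x

def vertex_x(c1, c2):
    """Temperature at which the quadratic reaches its minimum (c2 > 0)."""
    return -c1 / (2.0 * c2)

# (c0, c1, c2) as read from the plots; negative linear term is an assumption.
FITS = {
    "Figure 4-4": (202.803, -5.09380, 0.0328080),
    "Figure 4-5": (208.210, -5.22174, 0.0335747),
    "Figure 4-6": (273.490, -6.81286, 0.0432362),
}

for name, (c0, c1, c2) in FITS.items():
    t_min = vertex_x(c1, c2)
    print(f"{name}: minimum near {t_min:.1f} F, N = {quadratic(t_min, c0, c1, c2):.2f}")
```

All three fits place the minimum between 75°F and 80°F, consistent with the optimal temperature range described above.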
Regression Plot of N Vs. Max. Temperature
Y = 208.210 - 5.22174 X + 0.0335747 X**2
S = 1.11736   R-Sq = 80.9%   R-Sq(adj) = 80.1%
(X-axis: Mean of Max. Temperature (F))
Figure 4-5. Variation of average N due to transformer failures vs. maximum temperature
(averaged per month per year)
Figure 4-5 was plotted with exactly the same information used to plot Figure 4-4, but
the data of the corresponding months of the 4 years were averaged over all the MAs, giving
a total of 48 points. Similarly, in Figure 4-6 the data of the corresponding months of all
4 years were averaged to give 12 data points. The important observation is that
as the number of data points decreases, the plot becomes
smoother and the R² value increases, but at the cost of losing the finest details of the
data, because all the variations within each month are averaged out. This method
of averaging provides an opportunity to see the hidden pattern
between the variables by suppressing the disturbances/noise in the data set.
Regression Plot for N Vs. Max. Temp
Y = 273.490 - 6.81286 X + 0.0432362 X**2
S = 0.597484   R-Sq = 95.1%   R-Sq(adj) = 94.0%
(X-axis: Mean of Max. Temperature (F), 75 to 90)
Figure 4-6. Variation of average N due to transformer failures vs. max temperature
(averaged per month)
The regression equation for the X and Y variables considered is given at the top of
each of the plots; R² represents the proportion of variability in the Y variable accounted
for by the X variable.
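The R-Sq(adj) values printed on the three temperature plots are consistent with the usual adjusted R² formula applied with two predictors (X and X²) and the stated numbers of data points (about 567, 48 and 12). The sketch below is a check under that assumption; it is not taken from the thesis, and the function name is illustrative.

```python
def adjusted_r2(r2, n_points, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_points - 1) / (n_points - n_predictors - 1)

# (R-Sq, number of data points) for Figures 4-4, 4-5 and 4-6.
cases = {"Figure 4-4": (0.420, 567), "Figure 4-5": (0.809, 48), "Figure 4-6": (0.951, 12)}

for name, (r2, n) in cases.items():
    print(f"{name}: R-Sq(adj) = {100 * adjusted_r2(r2, n, 2):.1f}%")
```

The adjustment matters most for the 12-point fit, where two predictors consume a noticeable share of the available degrees of freedom.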
Based on the regression model developed between the transformer interruptions
and maximum temperature, it is possible to predict the total number of interruptions due
to transformers for any day/MA if the maximum temperature of that day/MA is known.
Table 4-2 shows the prediction of transformer interruptions and the
error associated with it for each of the MAs of FPL.
Table 4-2. Prediction of N Using Maximum Temperature of All the MAs
MA               Airport  Tmax     N Tx     N Tx prediction  Error (%)  N Tx prediction  Error (%)
                          (Avg.)   (Avg.)   ALL FPL          ALL FPL    INDIVIDUAL       MA's
                                   actual   EQUATION         EQUATION   MA's EQU         EQU
Central Florida  DAB      66.5000  7.36364  9.31668          26.523     8.170            10.951
Wingate          FLL      74.4091  3.45455  5.40181          56.368     4.898            41.784
Gulf Coast       FMY      72.5909  6.13636  5.91181          3.659      *
Treasure Coast   FPR      71.3182  6.86364  6.40587          6.669      *
Wingate          FXE      74.6364  3.45455  5.35414          54.988     4.850            40.395
Gulf Stream      HWO      75.3636  4.04545  5.22544          29.168     *
Central Dade     MIA      75.2727  3.50000  5.23954          49.701     5.020            43.429
Brevard          MLB      69.8182  7.22727  7.13454          1.283      *
North Dade       OPF      75.4545  4.81818  5.21190          8.172      *
West Palm        PBI      74.0000  4.90909  5.49660          11.968     *
Toledo           PGD      71.3182  4.18182  6.40587          53.184     5.429            29.824
Pompano          PMP      73.8636  2.31818  5.53076          138.582    3.720            60.471
Central Florida  SFB      68.3636  7.36364  7.99378          8.558      *
Manasota         SRQ      68.0000  6.95455  8.23225          18.372     7.360            5.830
South Dade       TMB      76.3636  5.40909  5.10758          5.574      *
ALL FPL                   72.485   5.2      5.9              13.4
* MA Management Area considered
* Airport The nearest airport to the MA considered in getting the weather data
* Tmax (Avg.) The average value of the maximum temperatures occurred in
* N Tx (Avg.) Average number of interruptions (N) happened due to Transformer
* N Tx prediction All FPL equation Predicted N Tx (Avg.) using the common
equation of all MAs
* N Tx prediction Individual equation Predicted N Tx (Avg.) using local equation
of individual MAs
It can be observed from Table 4-2 that the prediction error of the common equation varied
over a wide range, from 1.28% to 138%. The error reached about 50% or more in only 5 cases
(49.7% to 138.6%), with the others within satisfactory limits. These large errors arise
because the common equation was developed from the data of all the MAs. Using the
individual MA equations, which were developed from the local MA's data, reduced those
large errors drastically. There were cases where the common equation gave better results
than the local equations of the MAs; hence local equations are used only for those MAs
where the common equation gave a large error.
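The error columns of Table 4-2 can be reproduced as the absolute difference between predicted and actual averages, expressed as a percentage of the actual value. The sketch below checks this reading against three rows of the table; the function name is illustrative.

```python
def percent_error(actual, predicted):
    """Absolute prediction error as a percentage of the actual value."""
    return abs(predicted - actual) / actual * 100.0

# (actual N Tx, predicted N Tx) taken from Table 4-2.
rows = {
    "Central Florida (DAB), common equation": (7.36364, 9.31668),
    "Pompano (PMP), common equation": (2.31818, 5.53076),
    "Pompano (PMP), local equation": (2.31818, 3.720),
}

for label, (actual, predicted) in rows.items():
    print(f"{label}: {percent_error(actual, predicted):.3f}%")
```

For Pompano, switching from the common to the local equation cuts the error from about 138.6% to about 60.5%, which is the kind of reduction described above.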
4.3.2 Impact of Wind on N
The role of wind is very significant among all the weather parameters. There is a
very good correlation between wind and the total number of interruptions (N).
When N was plotted against the daily 2-minute maximum wind gust (TMMG), the scatter
was chaotic and showed no pattern, because many different values of N occurred for any
given TMMG speed. The values of N that occurred at each TMMG speed were therefore
averaged and then plotted; the result can be seen in Figure 4-7.
Regression Plot of N Vs. Wind
Y = 4.57617 [sign illegible] 0.692087 X - 0.0355799 X**2 [sign illegible] 0.0004352 X**3
S = 2.76351   R-Sq = 39.8%   R-Sq(adj) = 35.2%
(X-axis: Mean of 2 minutes Wind Gust speed (mph), 0 to 60)
Figure 4-7. Variation of N vs. wind
There appears to be a pattern up to 40 mph, but beyond that the pattern becomes distorted.
If only the averages computed from at least 30 data points are retained, the correlation
obtained by this process is very high, R² = 99.3%, and reveals a strong
cubic relationship (Figure 4-8) between N and TMMG. Doing so neglects only 1.5% of the
data points and keeps 98.5% of the whole data set.
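The averaging procedure can be sketched as follows: group the daily values of N by gust speed, average each group, and keep only the groups with at least 30 observations. The sample data are synthetic and the function name is illustrative; only the grouping rule and the 30-point threshold come from the text.

```python
from collections import defaultdict

def mean_n_by_level(pairs, min_count=30):
    """Average N for each weather level, dropping sparsely populated levels."""
    groups = defaultdict(list)
    for level, n in pairs:
        groups[level].append(n)
    return {
        level: sum(values) / len(values)
        for level, values in sorted(groups.items())
        if len(values) >= min_count
    }

# Synthetic daily pairs (gust speed in mph, number of interruptions).
daily = [(10, 1)] * 30 + [(10, 3)] * 30 + [(55, 20)] * 2

print(mean_n_by_level(daily))
```

The 55 mph level has only two observations and is dropped, which mirrors how the noisy high-wind tail is removed before fitting.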
Regression Plot of N Vs. Wind
Y = 0.613754 + 0.0647258 X - 0.0065040 X**2 + 0.0002598 X**3
S = 0.0900810   R-Sq = 99.3%   R-Sq(adj) = 99.2%
(X-axis: Mean of 2 minutes Wind gust speed (mph), 10 to 30)
Figure 4-8. Mean of 2 minutes wind speed vs. average number of interruptions
It can be observed from Figure 4-8 that the average number of interruptions
increases steeply above about 20 mph. Power distribution poles and overhead
equipment must therefore be designed so that they do not break down under wind
gusts of more than 20 mph. Care must also be taken that vegetation and other objects
near the distribution lines are kept at a proper distance and will not lean
on the lines during these wind gusts.
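Evaluating the cubic fit printed on Figure 4-8 at a few gust speeds shows how quickly the average number of interruptions grows above 20 mph. The coefficients are those of the plot; the function name and the chosen evaluation points are illustrative.

```python
def wind_fit(gust_mph):
    """Cubic fit of Figure 4-8, evaluated with Horner's scheme."""
    return 0.613754 + gust_mph * (
        0.0647258 + gust_mph * (-0.0065040 + gust_mph * 0.0002598)
    )

for speed in (10, 20, 30):
    print(speed, "mph ->", round(wind_fit(speed), 3), "interruptions on average")
```

Going from 10 to 20 mph adds roughly half an interruption on average, while going from 20 to 30 mph adds more than two. The fit was built from gust speeds up to about 30 mph, so values beyond that range would be extrapolation.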
4.3.3 Impact of Rain on N
The impact of rain on the mean number of N can be observed in the following
Regression Plot of N Vs. Rain
Y = 1.43350 + 3.04671 X - 1.09160 X**2 + 0.137744 X**3
S = 1.24028   R-Sq = 42.8%   R-Sq(adj) = 37.2%
(X-axis: Mean of Rain (inch), 0 to 5)
Figure 4-9. Variation of N vs. rain
Dry days outnumber rainy days. Because the impact
of rain on N is under consideration, the non-rainy days have been excluded from the data
set. The data points of N were averaged in the same way as in the analysis of
wind impacts on N; the different values of N for each level of rain were averaged and
then plotted in Figure 4-9.
The whole graph can be divided into 3 piecewise linear segments: 0.1 to 1 inch, 1
to 3 inches, and more than 3 inches. In the first segment the relationship between N and
rain appears linear (Figure 4-10); hence the initial small amount of rain plays a vital
role in the number of interruptions.
Regression Plot of N Vs. Rain
Y = 1.47309 + 2.11652 X + 0.662153 X**2 - 0.498268 X**3
S = 0.568862   R-Sq = 78.9%   R-Sq(adj) = 75.0%
(X-axis: Mean of Rain (inch), 0 to 2)
Figure 4-10. Variation of N vs. rain in the interval [0, 2]
N remains roughly constant in the second segment, showing a constant effect of
rain, but in the third segment N increases drastically as rain exceeds 3 inches. A
small amount of rain, such as light showers, settles on the insulators. These droplets
of water act as a solvent for the salts and the atmospheric dust deposited on the
insulators and form a conducting layer for the current, thereby causing a flashover that
leads to power interruptions, as explained in chapter 2.
On the other hand, rain from 1 to 3 inches is heavy enough to clean the insulators, as
the water runs off instead of settling on them. Finally, rain over 3 inches is
accompanied by extreme weather conditions, again leading to a large value of N.
4.3.4 Effect of Rain and Wind Together on N
The three-dimensional Figure 4-11 shows the combined effect of wind and rain on the
average value of N. It can be seen that the
predicted (calculated) Navg tracks the actual N very well for lower values of N.
The regression equation is given by
Navg = 1.05 [sign illegible] 0.04*W(wind speed) + 6.82*R(rain average)
The coefficient of determination, R², is 85.6%.
Figure 4-11. Impact of rain and wind together on N
Usual methods of statistical analysis rely, in part, on knowing in advance what the
researcher is looking for. An example is a study done by FPL that provides a linear
equation describing the number of interruptions caused by lightning as a function of the
number of lightning strikes. In this case, the cause of the outage is known (one-to-one
relationship) and the result is expected. The data required for this type of analysis are
also circumscribed by the limited scope of the question. Also, there are limited strategies
for dealing with the problem, since lightning is a random and unpredictable event.
Analyzing the effects of normal weather requires a different approach. We need to be open
to unexpected results rather than expected ones. We need to consider a body of data much
larger than that required to investigate a single phenomenon. We need to consider
non-linear relationships and relationships that imply a confluence of conditions. We need
to apply every statistical tool we can think of, and then learn some more. Most of these
features are available in a tool called Artificial Neural Networks (ANNs). Hence the
application of ANNs to our current problem is discussed in the next chapter.
4.3.5 Effect of Lightning Strikes on N
FPL has already performed a correlation analysis between cause codes 01 (Lightning, with
equipment damage) and 02 (Storm with no equipment damage) and lightning strikes for
all the MAs for the years 1998-2001. Cause codes 01 and 02 represent the direct
weather effect on service interruptions. The following plot, copied from the slides FPL
presented during their visit to the University of Florida, shows the impact of
lightning strikes on the storm interruptions. The correlation is very high and the
relation is linear: the more lightning strikes, the more storm interruptions.
Monthly Correlation: Lightning Strikes vs. Storm Interruptions (Codes 01 & 02)
y = 0.0376x + 91.956
(X-axis: Lightning Strikes, 0 to 140,000)
Figure 4-12. Lightning strikes vs. storm interruptions during 1998-2001 for all MAs
PREDICTION OF INTERRUPTIONS USING ARTIFICIAL NEURAL NETWORKS
Although the previous chapter may give the impression that the effect of all the
weather parameters on power interruptions can be quantified using standard mathematical
functions or statistical techniques, this is not always true. It may be neither practical
nor feasible to find a function for certain complex correlations between weather and
interruptions. This is where neural networks are needed, to analyze and
generalize the hidden relationship.
We need a tool that is powerful when applied to problems whose solutions
require knowledge that is difficult to specify, but for which there is an abundance of
examples. Artificial neural networks are among the best tools for this kind of problem.
5.1 Introduction to Artificial Neural Networks
Neural networks, or artificial neural networks (ANNs) to be more precise, represent
a technology that is rooted in many disciplines: neuroscience, mathematics, statistics,
physics, computer science, and engineering. ANNs find applications in such diverse
fields as modeling, time series analysis, pattern recognition, signal processing, and
control by virtue of an important property: the ability to learn from input data
with a teacher (supervised) or without one (unsupervised). The most common training
scenarios use supervised learning.
An ANN is a very useful tool for predicting the interruptions of a power distribution
system with reasonable accuracy. The accuracy of prediction depends directly on
the accuracy of the historical power interruption and weather data used to train the ANN.
This project provides the methodology for predicting the interruptions beforehand for the
forecast weather conditions using ANNs.
5.1.1 Benefits of ANNs over statistical methods
An ANN is an alternative to conventional methods. It is an approach that
combines the time series and regression approaches: it learns from the previous
interruption and weather patterns and predicts one for the current conditions, and it also
performs a non-linear regression between interruptions and weather patterns. It shows
superior performance in terms of accuracy when compared to statistical methods.
ANN derives its computing power through, first, its massively parallel distributed
structure and, second, its ability to learn and therefore generalize. Generalization refers to
the neural network producing reasonable outputs for inputs not encountered during
training (learning). These two information-processing capabilities make it possible for
neural networks to solve complex (large-scale) problems that are currently intractable.
The main reasons for using neural networks for prediction, rather than statistical
techniques or classical time series analysis, are:
* They are self-monitoring (i.e., they learn how to make accurate predictions).
* They are able to cope with nonlinearity and nonstationarity of input processes.
* They are adaptive, non-linear and highly parallel.
* They can generalize.
* They are computationally at least as fast, if not faster than most available Statistical
Multi-layered ANNs are capable of performing just about any linear or nonlinear
computation, and can approximate any reasonable function arbitrarily well.
5.1.2 Architecture of ANN
Figure 5-1(a) shows the basic model of a single neuron, while Figure 5-1(b) shows a
one-layer network with R input elements and S neurons. In this network, each element of
the input vector p is connected to each neuron input through the weight matrix W. The ith
neuron has a summer that gathers its weighted inputs and bias to form its own scalar
net input n(i), Figure 5-1(b). The various n(i) taken together form an S-element net input
vector n. Finally, the neuron layer outputs form a column vector a, where a = f(Wp + b).
[Figure 5-1 labels: (b) Input, Layer of Neurons; (c) Input, Hidden Layer, Output Layer, with $a^1 = \mathrm{tansig}(IW^{1,1} p^1 + b^1)$ and $a^2 = \mathrm{purelin}(LW^{2,1} a^1 + b^2)$]
Figure 5-1. ANN structures: (a) basic nonlinear model of a neuron, (b) one layer network of
neurons, and (c) 3 layer feed forward back propagation network
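The layer relation a = f(Wp + b) can be illustrated with a short numerical sketch. The sizes (R = 3 inputs, S = 2 neurons), the weight values, and the function name `layer_forward` are hypothetical and chosen only for illustration; the hyperbolic tangent stands in for the transfer function f.

```python
import numpy as np

def layer_forward(W, p, b, f=np.tanh):
    """Return a = f(W p + b) for one layer of S neurons with R inputs."""
    n = W @ p + b          # net input vector, one entry per neuron
    return f(n)

# Hypothetical sizes: R = 3 inputs, S = 2 neurons
W = np.array([[0.2, -0.5, 0.1],
              [0.7,  0.3, -0.4]])   # weight matrix, shape (S, R)
p = np.array([1.0, 2.0, 3.0])       # input vector, shape (R,)
b = np.array([0.1, -0.2])           # bias vector, shape (S,)

a = layer_forward(W, p, b)          # layer output, shape (S,)
```

Each row of W holds the weights of one neuron, so the matrix product computes all S net inputs at once.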
Figure 5-1(c) shows the ANN model used in the current project. IW represents the
input weight matrix, with source 1 (second index) and destination 1 (first index).
Elements of layer one, such as its bias, net input, and output, carry a superscript 1 to
show that they are associated with the first layer. LW represents the layer weights, and S
denotes the number of neurons in a layer.
The data is presented to the input nodes. Each input node is connected to several
nodes in the second layer. The second layer is called the hidden layer, since its nodes are not
accessible to the outer environment. The hidden layer acts as a layer of abstraction,
pulling features from the inputs. Determining the proper number of nodes for the hidden
layer is difficult and is often done by trial and error. Generally, network
performance increases with the number of hidden nodes and then reaches a saturation
level. Adding more hidden nodes beyond that point may actually degrade performance because
training becomes more difficult. Following this commonly accepted rule
helps train the ANN efficiently and helps the solution converge. The
last layer is referred to as the output layer, since the network's output is the response of
nodes on this layer. The number of output nodes of an ANN is determined by the
5.1.3 Functioning of ANN
In general, the operation of this feed forward network consists of passing weighted
and summed input signals through a chosen nonlinearity. It presumes knowledge of the
network's bias functions and weighted links. Once activation and output functions are
chosen, an ANN is completely described by its weights and biases. Since a given ANN
solves a specific problem, or function, finding weights and biases for the network is
equivalent to finding the input/output relationship that describes the function. In the
current ANN model, Figure 5-1(c), the activation functions chosen in the hidden layer
and the output layer are "tansig" and "purelin" respectively. The two layer sigmoid/linear
network can represent any functional relationship between inputs and outputs if the
sigmoid layer has enough neurons.
Many training algorithms and performance functions are available to
train the network model. For the present problem the back propagation network (BPN) algorithm was
chosen, as it is the best-known algorithm for multi-layer perceptron (MLP) networks, and the
'trainbfg' training function was used to train the BPN. The term back propagation refers to
the manner in which the gradient is computed for nonlinear MLP networks. Properly
trained back propagation networks tend to give reasonable answers when presented with
inputs that they have never seen. Typically, a new input leads to an output similar to the
correct output for input vectors used in training that are similar to the new input being
presented. This generalization makes it possible to train a network on a representative set
of input/target pairs and get good results without training the network on all possible
input/output pairs.
5.1.4 Back Propagation Learning Rule
The back propagation learning rule is an iterative gradient algorithm designed
to minimize the mean square error between the actual output of a multilayer feed forward
network and the desired output. An essential component of the rule is the iterative
method that propagates error terms required to adapt weights back from nodes in the
output layer to nodes in lower layers.
At the beginning, all weights and node offsets are set to small random values. The
input values are presented and the desired outputs are specified. Then the network, Figure
5-2, is used to calculate the actual outputs. A recursive algorithm, starting at the output nodes
and working back to the hidden layer, adjusts the weights until they converge and the cost
function is reduced to an acceptable value. The training process is repeated by presenting
different sets of input data to the ANN.
Figure 5-2. A back propagation ANN model
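The training loop described above can be sketched for a network with a tanh hidden layer and a linear output. This is a minimal illustration under stated assumptions, not the procedure used in this project: it applies plain batch gradient descent rather than the 'trainbfg' function, and the toy data, the learning rate, the number of epochs, and the function name `train_two_layer` are all hypothetical.

```python
import numpy as np

def train_two_layer(X, t, hidden=5, lr=0.05, epochs=2000, seed=0):
    """Train a tanh-hidden / linear-output network by batch gradient descent.

    X has shape (samples, inputs) and t has shape (samples,).
    Returns the parameters and the mean square error recorded at every epoch.
    """
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # Small random initial weights, zero biases
    W1 = rng.normal(0.0, 0.1, (hidden, n_in))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, hidden)
    b2 = 0.0
    history = []
    m = len(t)
    for _ in range(epochs):
        # Forward pass
        a1 = np.tanh(X @ W1.T + b1)          # hidden outputs, (samples, hidden)
        y = a1 @ W2 + b2                     # network outputs, (samples,)
        err = y - t
        history.append(float(np.mean(err ** 2)))
        # Backward pass: error terms flow from the output back to the hidden layer
        g_y = 2.0 * err / m                  # derivative of the MSE w.r.t. y
        g_W2 = a1.T @ g_y
        g_b2 = g_y.sum()
        g_n1 = np.outer(g_y, W2) * (1.0 - a1 ** 2)   # tanh'(n) = 1 - a^2
        g_W1 = g_n1.T @ X
        g_b1 = g_n1.sum(axis=0)
        # Gradient-descent weight update
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), history

# Toy data (hypothetical): learn t = x^2 on [-1, 1]
X = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
t = X[:, 0] ** 2
params, history = train_two_layer(X, t)
```

The list `history` plays the role of the cost function in the text: training is acceptable once its last values are small enough for the application.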
5.2 Steps to Enhance the Performance of ANN
It is a common misconception that one can feed all the available variables to the
ANN and expect a good prediction. The more input variables the ANN has, the more
complex it becomes to track the correlations among them.
To enhance the performance of the ANN, the input data has to be pre-processed. The ANN
toolbox of MATLAB 6.0 provides functions that perform these operations.
The following techniques can help improve the quality of
the input datasets before they are given to the ANN:
* Eliminate the unnecessary variables which don't have significant contribution to
* Scale the inputs and targets so that they always fall within a specified range.
* Reduce the dimensions of the input data without much loss in the variance, e.g. by
Principal Component Analysis, as explained below.
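The scaling step in the list above can be sketched as follows. The target range [-1, 1], the sample values, and the function names are assumptions made for illustration; the sketch is written in Python rather than with the MATLAB toolbox functions used in the project.

```python
import numpy as np

def fit_minmax(X):
    """Record the per-column minimum and maximum of the training data."""
    return X.min(axis=0), X.max(axis=0)

def scale_minmax(X, lo, hi):
    """Map each column linearly so that the interval [lo, hi] becomes [-1, 1]."""
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero for constant columns
    return 2.0 * (X - lo) / span - 1.0

# Hypothetical training rows: temperature, rain, wind speed
X_train = np.array([[79.0, 0.00, 18.0],
                    [93.0, 2.15, 36.0],
                    [86.0, 0.50, 25.0]])
lo, hi = fit_minmax(X_train)
X_scaled = scale_minmax(X_train, lo, hi)
```

The minimum and maximum are taken from the training data only, so that the same mapping can later be applied to evaluation data.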
As weather is a combination of many parameters such as wind, temperature, rain,
pressure, dew, and lightning, the next question is which of
these parameters contribute most significantly to
the daily interruptions. One way to approach this problem is to examine the
variance of all the weather parameters with respect to each other. The ones with
more variance are more responsible for changes in N than the ones with less variance. Less
variance in a variable means fewer changes in its value, which means this variable has
less effect on the changes of N. For investigations involving a large number of observed
variables, it is often useful to consider a smaller number of linear combinations of
Principal Component Analysis (PCA)
PCA is a widely used, accessible tool for reducing the dimensions of the input
variables. Principal component analysis finds a set of standardized linear
combinations called the principal components, which are orthogonal and taken together
explain all the variance of the original data. The following analysis shows the variance of
the 8 inputs considered in the ANN model:
Table 5-1. Summary Table of Covariance for All the Input Variables Considered in the
Principal Component Analysis
Importance of components:
                        Comp.1      Comp.2     Comp.3     Comp.4      Comp.5      Comp.6     Comp.7       Comp.8
Standard deviation      10.9577574  7.0728909  5.5105074  3.22619511  2.01380874  1.5672211  0.936582778  0.3328819326
Proportion of Variance  0.5498531   0.2290853  0.1390550  0.04766335  0.01857119  0.0112477  0.004016943  0.0005074389
Cumulative Proportion   0.5498531   0.7789384  0.9179934  0.96565672  0.98422792  0.9954756  0.999492561  1.0000000000
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
MaxTemp 0.515 -0.138 0.197 -0.722 0.336 0.203
MinTemp 0.738 0.363 0.516 -0.216
AvgWindS 0.313 0.359 0.841 0.229
5SecWindS 0.727 -0.230 -0.187 0.614
2MinWindS 0.585 -0.151 -0.787
HeatDays -0.107 -0.277 0.949
CoolDays 0.415 -0.903
In Table 5-1, if component 1 (Comp.1) alone is considered, it explains about 55.0% of
the total variance in the data set by using the following linear combination of only 4
variables:
Comp.1 = 0.515(MaxTemp) + 0.738(MinTemp) - 0.107(HeatDays) + 0.415(CoolDays)
Similarly, Comp.2 alone explains 22.9% of the total variance in the data set with the
following linear combination:
Comp.2 = -0.197(MaxTemp) + 0.313(AvgWindS) + 0.727(5SecWindS) + 0.585(2MinWindS)
When Comp.1 and Comp.2 are considered together, 77.9% of the total variance of the
data set can be explained. From the table, it can be observed that considering up to
Comp.3 explains around 91.8% of the total variance, and considering up to
Comp.4 explains around 96.6%. The choice of how many
components to consider depends on the amount of variance that is of
interest. In this case, the total number of dimensions, 8, is reduced to 4 if
we want to retain 96.6% of the total variance by considering up to Comp.4. Interestingly,
rain makes no contribution at all towards the variance in the table if we consider
components only up to Comp.4, so this variable can be eliminated. Each
component acts as a new variable that is a linear combination of the actual weather variables.
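The proportion and cumulative proportion rows of Table 5-1 follow directly from the component standard deviations, as the short sketch below shows. The 95% target and the helper name `components_needed` are assumptions chosen for illustration.

```python
import numpy as np

# Component standard deviations as listed in Table 5-1
std_dev = np.array([10.9577574, 7.0728909, 5.5105074, 3.22619511,
                    2.01380874, 1.5672211, 0.936582778, 0.3328819326])

variance = std_dev ** 2
proportion = variance / variance.sum()     # share of total variance per component
cumulative = np.cumsum(proportion)         # share explained by the leading components

def components_needed(cumulative, target):
    """Smallest number of leading components whose cumulative share reaches target."""
    return int(np.searchsorted(cumulative, target) + 1)

k = components_needed(cumulative, 0.95)    # hypothetical 95% retention target
```

With a 95% target the sketch selects the first four components, which agrees with the choice made in the text.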
5.3 ANN Simulation Output
Three management areas (MAs), Wingate, North Dade, and Gulf Stream, were chosen as
pilot areas in the current artificial neural network (ANN) project. These 3 MAs are
adjacent to each other and small enough to make the assumption that the variation in
weather due to geographical differences is slight. Also, they are all urban MAs and
appear to have a similar distribution of customer types.
Two years, 2000 and 2001, of weather and interruption data were chosen to train
the ANN while 2002 weather and interruption data was used to evaluate the performance
of the trained ANN model. One entry in either the training or evaluation datasets is
composed of one day's weather and interruptions for one MA, so the one year evaluation
dataset had 1060 entries (allowing for missing data).
The output of the ANN is two columns of data: a prediction for each entry in the
evaluation dataset and the actual number of interruptions for each entry in the evaluation
dataset. Figure 5-3 is a graph of the predicted values superimposed over the actual values
for the evaluation year 2002. The MAs are in sequence and it can be seen that the
predicted values follow the seasonal trends for interruptions.
[Figure 5-3 panels: 2002-Gulfstream, 2002-North Dade, 2002-Wingate]
Figure 5-3. Prediction patterns of N overlaid on actual patterns of N of 3 MAs for year 2002
5.3.1 Detailed Observation
Figure 5-4 is an expanded segment (North Dade) of the above graph to highlight
the details. It can be seen that where the actual number of interruptions is large, the
predictions match the pattern of maxima and minima, but are not always close in
magnitude. Where the actual number of interruptions is small, the pattern matching
breaks down, but large spikes in the predictions do not occur.
The figure is a segment of a time series plot of predicted values and actual
values of the total daily number of interruptions (N) for the North Dade MA. It can be clearly
seen that during some periods (rectangles) the predicted values match the pattern of the
actual values, if not the magnitude, while other periods (ovals) do not show any such
pattern matching, although the magnitudes are small.
Figure 5-4. Predicted N and the actual N for a few of the cases in North Dade MA for
Some of the interesting observations from Figure 5-4 are explained below.
Case 1: Predicted N less than actual N. The actual values of N at points 529 and
530 in Figure 5-4 correspond to 6/17/2002 and 6/19/2002 in the North Dade MA. The
interruption data for 6/17/2002 indicates that up to 18 of the 28 interruptions that
occurred on 6/17/2002 may not be related to daily common weather (Corrosion/Decay =
10, Improper Process = 6, Request = 2), which suggests that as few as 12 may be weather
related interruptions, which is just the same as the predicted value. Similarly, 15 out of 29
interruptions that occurred on 6/19/2002 may not be related to daily weather, which is
close to our predicted value of 12.
Though the weather conditions for these days were relatively mild, we have a
significant increase in the number of interruptions, as shown in Figure 5-5 in the green
Case 2: Predicted N more than actual N. Points
549, 550, 551 and 552 correspond to 7/8/2002, 7/9/2002, 7/10/2002 and
7/11/2002 in the North Dade MA, outlined in red in Figure 5-5. The number of
interruptions for these days was nearly the same even though their weather conditions varied
over a wide range. This large change in weather conditions forces the ANN model to
predict N in proportion to the weather. The question that arises is why
N changed so little when the weather
conditions differed significantly. Were some precautionary measures, e.g. tree trimming, taken a few days
before these interruptions occurred?
[Figure 5-5: spreadsheet excerpt, rows 527 to 559 (June and July 2002), with columns Call Sign, Date, MaxTemp, MinTemp, AvgTemp, HeatDays, CoolDays, Rain, AvgWdS, 5SMaxS, 2MMaxS, WG-LS, WG N]
Figure 5-5. Numerical values of weather and interruption data under consideration
Case 3: In some of the cases there was a very small increase in N though there
were large variations in the weather conditions.
5.3.2 Dominant Weather Parameters: Preliminary Observations
A series of ANN simulations, each with different weather parameters removed, was
run, and the relative accuracy of each simulation was compared to determine which
weather parameters are the most significant.
The preliminary results show that only a few of the many weather parameters
account for most of the variation in the number of interruptions. It is expected that the
importance of individual weather parameters will vary with location.
The following list ranks the weather variables by their importance
for the pilot area:
1. Two Minute Sustained Wind Gust (mph)
2. Rain (inches)
3. Lightning Strikes (# of strikes/day)
4. Temperature Average or Max. & Min. (°F)
On the other hand, the following parameters account for the least
variation in the number of interruptions:
2. Heat Days
3. Cool Days
4. Dew Point
5. Population (of MA)
5.4 Analysis of ANN Simulation Output
Based on the actual number of interruptions and the predictions during the
evaluation year, probability graphs (PGs) have been created to represent the range of
interruptions that actually occurred in the evaluation dataset for each predicted value. For
example, if every number between 1 and 40 interruptions were predicted at some point in
the evaluation year, there would be 40 PGs. This is done by sub-setting the evaluation
dataset into 40 sets and creating a histogram of actual values for each predicted value in
the evaluation set. By dividing each frequency column by the sum of the interruptions
that comprise the histogram, a probability graph such as the one for a prediction of 11
shown below can be created.
[Figure 5-6 panels: Histogram; Probability Graph. Horizontal axis in both panels: N Actual]
Figure 5-6. Histogram plot of predicted interruptions
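The construction of the probability graphs can be sketched as below. The sample data and the function name `probability_graphs` are hypothetical, predictions are assumed to be rounded to whole numbers before grouping, and each histogram is normalized by its number of entries, which is one reading of the normalization described above.

```python
from collections import Counter, defaultdict

def probability_graphs(predicted, actual):
    """Group actual counts by rounded predicted value and normalize each histogram.

    Returns {predicted_value: {actual_value: probability}}.
    """
    histograms = defaultdict(Counter)
    for p, a in zip(predicted, actual):
        histograms[int(round(p))][a] += 1
    graphs = {}
    for p, hist in histograms.items():
        total = sum(hist.values())          # number of days with this prediction
        graphs[p] = {a: count / total for a, count in sorted(hist.items())}
    return graphs

# Hypothetical evaluation entries
predicted = [11, 11, 11, 11, 12]
actual = [9, 11, 11, 14, 12]
graphs = probability_graphs(predicted, actual)
```

For the sample data, `graphs[11]` gives the probability of each actual value on days when 11 interruptions were predicted.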
From the histograms, it can be seen that the actual values for each predicted value
follow a generally normal distribution, so it is justified to apply normally calculated mean
and standard deviation to gauge the accuracy and precision of a prediction. The accuracy
would be determined by the closeness of prediction to the mean actual number of
interruptions. The precision would be determined by the magnitude of the percent
standard deviation. Percent standard deviation was chosen to equalize the standard
deviations for lower to higher predictions. Outliers provide clues to elements of the
model that either are missing or should not be there.
Test data is just a single day's weather data, which can be one of three kinds: a real-time updated weather parameter
max, min or total from a weather station; a known day's values; or a theoretical set of
weather data. The first is used in real-time prediction, but the other two can only be used
after the fact and do not provide any predictive benefit, aside from inclusion in the
historical data set. However, they can be used for research, such as modeling a
system's robustness to weather. Test data that shows a very low prediction can be used as
a base, and the parameter values can be varied either individually or in groups to model
the response to those parameters.
5.5 Pitfalls and Suggestions to FPL
GIGO is an acronym from the predawn of computing: garbage in, garbage out. The
accuracy and precision of the ANN is limited by the accuracy and precision of the input.
Although there will always be error inherent in the data collected, significant
improvements may be possible.
5.5.1 Weather Data
The error inherent in the ASOS weather data may be geographical and ASOS data
is only available for historic and not real-time use. The installation of dedicated weather
stations that is now occurring at FPL service centers will reduce that inherent error and
allow real-time forecasting.
5.5.2 Interruption Data
Although the FPL data cubes are thorough, the reporting procedures are not
designed for a detailed, time-dependent study such as this, nor are they always sensitive
to the role of weather. Because of this, the accuracy and precision of the prediction
For example, the day on which an interruption is reported runs three shifts
from 7 AM to 7 AM. In the last random sample made, the last shift, from 11 PM to 7 AM,
reported about 12% of the day's interruptions, meaning that interruptions from midnight to 7 AM
were being reported on the previous day. This can be largely accounted for
by taking data from the cube by shifts and summing; however, that still leaves 11 PM to
midnight, or maybe 1-2% of the interruptions, reported on the wrong day. Because the
data was only shifted in time, the average difference after adjustment was only 0.05
interruptions; however, because of many instances where a large number of interruptions
were reported on the previous day, the average percent difference was 14%.
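The effect of the 7 AM to 7 AM reporting day, and of the adjustment by shifts, can be illustrated with a small sketch. It assumes three 8-hour shifts starting at 7 AM, 3 PM and 11 PM; the function names and the sample timestamps are hypothetical and do not reflect the structure of the FPL data cubes.

```python
from datetime import datetime, timedelta

SHIFT_START_HOUR = 7          # the reporting day starts at 7 AM
SHIFT_LENGTH_HOURS = 8        # assumed: three equal shifts per reporting day

def reporting_day(ts):
    """Day under which an interruption at time ts is filed (7 AM to 7 AM)."""
    return (ts - timedelta(hours=SHIFT_START_HOUR)).date()

def shift_index(ts):
    """0, 1 or 2 for the first, second or third shift of the reporting day."""
    hours_into_day = (ts.hour - SHIFT_START_HOUR) % 24
    return hours_into_day // SHIFT_LENGTH_HOURS

def shift_adjusted_day(ts):
    """Move third-shift interruptions (11 PM to 7 AM) to the following day."""
    day = reporting_day(ts)
    if shift_index(ts) == 2:
        day += timedelta(days=1)
    return day

early = datetime(2002, 6, 17, 2, 0)     # 2 AM: filed under the previous day
late = datetime(2002, 6, 16, 23, 30)    # 11:30 PM: still misplaced after adjustment
```

After the adjustment, the 2 AM interruption lands on its calendar day, while the 11:30 PM interruption is moved one day too far, which corresponds to the residual 11 PM to midnight error noted in the text.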
To determine the effect on the output of this error, two sets of data were taken from
the same time period and location, an original one with 24 hours of interruption data
taken from the cube on the day it was reported and an adjusted one with 24 hours of
interruption data taken from the beginning of the third shift on the day before it was
reported. Both were run through the ANN and the results compared. The following
detailed graphs of the same time period in the same MA show an improvement in the
pattern and magnitude matching after the interruption data were adjusted for the shift
Figure 5-7. Prediction results of ANN using the original N (not shift adjusted)
Figure 5-8. Prediction results of ANN using the adjusted N (shift adjusted)
Mean and standard deviation plots for the actual N vs predicted N and adjusted N
vs predicted N are shown in Figure 5-9 and Figure 5-10. It can be observed
that adjusting the data to include the correct shifts on the correct days improves the fit of
the prediction to the mean actual number of interruptions. It also shows that the fit
deteriorates as the prediction increases, indicating unknown factors. The graphs of the
original, Figure 5-9, and shift-adjusted percent standard deviation, Figure 5-10, show a
reduction in the adjusted % standard deviation at lower predictions, while the higher
predictions are not much improved, similar to the graphs of the means.
Figure 5-9. Mean and standard deviation of actual N
Figure 5-10. Mean and standard deviation of adjusted N
These results suggest that there are other improvements that can be made in the
data reporting. No single change would perhaps have as visible an effect, but taken
together they could alter the results significantly. Some possibilities suggest themselves:
* Report interruption requests due to weather related damage repair on the day the
causative weather condition occurred.
* Maintain an hourly database for interruptions since hourly weather is available.
This would be especially useful if dedicated weather stations existed.
* Update cause codes to be more sensitive to the possible role of weather.
* Report the age of equipment.
Simulations that have been run with different cause codes subtracted from the
interruption data, such as accident, animal, improper process and crew request (planned)
have shown similar improvements in differing regions of the graphs.
5.6 Proving Localization of Weather Improves the Accuracy in Prediction
Case 1: Localized Weather Data
Three areas, Wingate, North Dade, and Gulf Stream, which are adjacent to each other, were chosen for study as pilot
areas. Weather and interruption data for each of the
MAs for years 2000 and 2001 were used to train the ANN model,
while the year 2002 weather data of the North Dade area was used to make predictions with the
trained ANN.
The mean percentage error (MPE) of the predicted value is approximately 25%.
The MPE is calculated using the following formula:
$$\mathrm{MPE} = \frac{1}{M} \sum_{i=1}^{M} \frac{\left| N_{actual,i} - N_{predicted,i} \right|}{N_{actual,i}} \times 100$$
where M = total number of cases considered,
N_actual = actual number of interruptions (N) that occurred, and
N_predicted = predicted number of interruptions (N).
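A minimal sketch of the MPE calculation follows. It assumes that each absolute error is divided by the actual value before averaging, and that cases with zero actual interruptions are skipped to avoid division by zero; both points, along with the sample values, are assumptions made for illustration.

```python
def mean_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed in percent.

    Cases with zero actual interruptions are skipped (an assumption, to
    avoid dividing by zero).
    """
    terms = [abs(a - p) / a for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("no cases with a non-zero actual value")
    return 100.0 * sum(terms) / len(terms)

# Hypothetical daily counts
mpe = mean_percentage_error(actual=[20, 10, 40], predicted=[15, 12, 40])
```

For the sample values the individual errors are 25%, 20% and 0%, so the MPE is 15%.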
Case 2: Scattered Weather Data
Instead of taking weather data from each of the weather stations, only one weather
station was chosen for weather data, while the interruption data was taken from all 3
MAs.
The mean percentage error of the predicted value is 35% (>25%).
This shows that improving the accuracy of the weather data, by considering
smaller areas, can enhance the performance of the model.
5.7 Comparison of Statistical Model and ANN Model
A comparison of the prediction performance of the statistical and ANN models
was done using the 2000 and 2001 weather and interruption data of Gulf Stream (GS),
North Dade (ND), and Wingate (WG) of FPL. Three variables, Rain,
2 Minute Maximum Wind Gust and Average Temperature, were considered in building
the two models. A multiple regression equation was developed for these three
variables as given below:
N = -16.6 + 0.174 * AvgTemp + 4.71 * Rain + 0.852 * 2MMaxS
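The regression equation can be evaluated for a single day as in the sketch below. The coefficients are those of the equation above; the function name and the input values are hypothetical.

```python
def regression_n(avg_temp, rain, wind_2min_max):
    """Evaluate N = -16.6 + 0.174*AvgTemp + 4.71*Rain + 0.852*2MMaxS."""
    return -16.6 + 0.174 * avg_temp + 4.71 * rain + 0.852 * wind_2min_max

# Hypothetical day: average temperature 80, rain 0.5, 2-minute maximum wind 20
n_hat = regression_n(avg_temp=80.0, rain=0.5, wind_2min_max=20.0)
```

For these inputs the equation gives about 16.7 interruptions.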
The 2002 weather data of ND is used to predict N using the above equation. Along
similar lines, an ANN model was developed with 3 input variables, 5 hidden nodes and 1
output node. The same data set that is used for the statistical model is used in
evaluating the ANN model. The results of both models are tabulated in Table 5-2.
Table 5-2. Performance Comparison Between Statistical Model and ANN Model
                            Statistical Model   ANN Model
Mean % Error                67                  45
Prediction with 30% Error   46                  54
The above results show that the prediction accuracy of the ANN model is better
than that of the statistical model. Although the accuracy figures from both
models are low, because only a few variables were considered to keep the problem simple, the point
here is to show that the ANN model is better.
Figure 5-11. Mean squared error vs. training epochs
Figure 5-11 shows that the mean square error decreases gradually as
the ANN is trained over each epoch (a complete pass through the training data). The progress of
training is diagnosed by looking at the training, validation and test errors. The training
stopped after 40 epochs because the validation error increased. The result here is
reasonable, since the test set error and the validation set error have similar characteristics,
and it doesn't appear that any significant overfitting has occurred.
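The stopping criterion described above, halting when the validation error starts to rise, can be sketched as follows. The `patience` parameter (the number of consecutive non-improving epochs tolerated), the function name, and the sample error values are assumptions for illustration and are not taken from the toolbox settings used in this project.

```python
def early_stopping_epoch(val_errors, patience=5):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the validation error has failed to improve on its best
    value for `patience` consecutive epochs; best_epoch is where that best
    value occurred.
    """
    best_error = float("inf")
    best_epoch = -1
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_error, best_epoch, fails = err, epoch, 0
        else:
            fails += 1
            if fails >= patience:
                return epoch, best_epoch
    return len(val_errors) - 1, best_epoch

# Hypothetical validation errors per epoch
stop, best = early_stopping_epoch([5, 4, 3, 2, 2.5, 2.6, 2.7], patience=3)
```

The weights from `best` would normally be kept, since they correspond to the lowest validation error seen before the rise.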
5.8 Possible Software Development to Predict Power Interruptions Using ANNs
Figure 5-12 is a snapshot of the graphical user interface (GUI)
developed for the ANN that was trained to predict the interruptions.
Figure 5-12. Graphical user interface developed to predict interruptions
Using the interface, Figure 5-12, is simple: first load the training
and testing data files (ASCII format) using the option buttons provided, and then click on
the "Run Simulation" button to see the plots.
Currently, the development of custom software to predict power distribution
interruptions, based on the idea provided in the current thesis, is in progress. The
proposed prediction model is under test at FPL management areas. The custom software
can be installed on the user's desktop like any other software and provides, with a single
click, an advance estimate of power interruptions. In the future, the software will be
distributed to other power utilities in the USA.
Using the model shown in Figure 5-12, it is also possible to get predictions of the kind
shown in Figure 5-13. The Central Dade management area (MA) was
chosen as one of the pilot areas to see how well the developed model can predict the
interruptions. It can be seen that the coefficient of determination R2 is around 90%, which means
the model does a good job of predicting interruptions close to the actual
number of interruptions. The X-axis of Figure 5-13(a) shows the predicted sum of
monthly interruptions, while the Y-axis shows the actual sum of monthly interruptions
that occurred. This estimate of predicted interruptions will help the utilities to know in
advance how many personnel they need to deploy to manage the interruptions.
[Figure 5-13(a): Central Dade 2001-2003 Monthly Total N; fitted line: Sum Actual = 1.84 + 1.060 Sum Predicted; S = 32.3396]
[Figure 5-13(b): Scatterplot of Sum Actual, Sum Predicted vs Month for Central Dade; panel variable: Year]
Figure 5-13. Predicted interruptions vs. actual interruptions for Central Dade (a) 3 years
plotted together (b) each year plotted separately
LIMITATIONS, CONCLUSIONS AND FUTURE WORK
The research results presented in this thesis are not without their own limitations.
Some of the hurdles that need to be overcome to get better results are discussed in this
chapter. The conclusions of the current thesis are followed by the future work, which explains
the steps to be taken from the current state of the project.
6.1 Limitations of Approach
6.1.1 Weather Data
There are two types of weather parameter measurement errors. First, we found that
the weather parameter measurement at an airport is not as accurate as expected. Second, the
distance between the location of outages and the airport, where weather parameters are
measured, is up to 10 miles. The weather conditions in two locations for certain weather
parameters can be significantly different (Figure 6-1). However, the rain difference
presents a normal distribution with a mean close to zero. Therefore, rain data can be used
for nearby locations without changing the results.
(b) showing less than 975 miles distance between each weather station
6.1.2 Unknown Variables
of these are different weather parameters, but other variables are most likely specific to a
system and would best be identified by utility employees who are familiar with the
It is possible to find aberrant observations in the data set without any clear
explanation of the cause. Outliers of this sort must be studied independently.
6.1.4 Hourly Data
In the interruption database FPL provided, the finest available resolution
is daily. However, for any interruption studied it is necessary to know the exact
time, at the hour level or even at the minute level, as shown in [10-11]. This is needed to
study the different weather parameters at the given outage time, since the weather varies
from hour to hour.
The ANN and the statistical analysis of the ANN output have the potential to provide
powerful modeling tools, and can be used to provide limited real-time prediction. The
accuracy and precision of the model is dependent as much on the input as the ANN
The graphical output of the ANN can be used by itself or in conjunction with the
statistical analysis to compare the accuracy and precision of the ANN model with
different variable selections, principal components, study areas or times. In some cases,
the graphical representation can provide better clues to the performance of the ANN than
the graphs of means and percent standard deviations.
With the ever increasing demand for electricity every year, the need to
look for better ways of preventing interruptions due to overloading of the power
distribution equipment has drawn much attention.
ANNs have already been applied in power systems in the areas of Economic Load
Dispatch, Optimization and Loss Reduction, Fault Detection and Diagnosis, Frequency
Control, Load Forecasting, Contingency Analysis, Static Security Assessment, and Voltage and
Reactive Power Control. However, not much research work can be found either online or in
the IEEE publications regarding the application of ANNs to the prediction of power
distribution interruptions. This novel idea seems very promising for alerting the utilities
in advance to the number of interruptions that are likely to
occur. This helps them optimize their crews by mobilizing them to the location of
interest and to act more effectively to avoid interruptions or respond quickly in
restoring power after interruptions. This further helps in reducing the SAIFI value.
The utilities can predict SAIFI, as they would be able to predict the total number of
interruptions, and can use it in their internal calculations. The developed ANN model can
be further enhanced to predict extra information, such as the time slot and location of
these interruptions, besides their approximate number; all that is needed
is to provide the extra information as input columns while training the
model. The accuracy of the predicted results depends directly on the accuracy of
the information provided in the training data used to train the model.
A basic methodology that is easily automated has been proposed. The methodology
promises to be easy to use and flexible enough to perform in both a real-time predictive
and a theoretical modeling mode.
6.3 Future Work
The following future steps will improve the accuracy of the current analysis.
6.3.1 Data Collection and Creating New Variables
Additional data collection is necessary. It is suspected that changes in power usage
or equipment density might cause outage trends over time. To verify this idea, we need to
develop a scaling factor and collect usage data. The scaling factor would consist of
information such as equipment density and length of lines; daily power usage data would
be an additional explanatory variable. The new data might also be useful in comparing
management areas, because the probability of interruptions occurring may be proportional
to the scaling factor. The more the new input variables of the system
6.3.2 Improving the Accuracy and Developing New ANN Models
Other types of ANN, such as radial basis function (RBF) networks, learning vector
quantization (LVQ), self-organizing maps (SOM), or their combinations, need to be
tested to see which model gives better prediction results.
The dimension of the input feature space (input feature pattern) needs to be reduced
to improve performance measures such as speed and prediction accuracy.
If there is more than one prediction variable (multi-output rather than single
output), the architecture of the whole system may be either a single multi-input
multi-output ANN or a composition of several multi-input single-output ANNs. The
training method and the performance of each should be further investigated and compared.
An enhanced custom software tool with a graphical user interface should be developed,
in which the user can select new input and output datasets to train the ANN and
develop a model to predict the output. The user can then reuse this tool whenever a
new model is to be created or an existing model updated.
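One simple way to reduce the dimension of the input feature space mentioned above is to drop input columns that are nearly duplicates of columns already kept. The sketch below uses a Pearson-correlation filter for this; it is an illustrative assumption, not the method used in this thesis, and the variable names, sample values, and the 0.95 threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def drop_redundant_inputs(columns, threshold=0.95):
    """Return the names of input columns to keep.

    columns: dict mapping a column name to its list of values.
    A column is kept only if its absolute correlation with every
    previously kept column does not exceed the threshold.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical daily weather inputs: the two temperature columns move
# together, so only the first one is retained.
inputs = {
    "max_temp_f": [70, 75, 80, 85, 90],
    "avg_temp_f": [60, 65, 70, 75, 80],
    "rainfall_in": [0.0, 1.2, 0.1, 0.9, 0.0],
}
print(drop_redundant_inputs(inputs))
```

A filter of this kind only removes linearly redundant inputs; whether it improves training speed or prediction accuracy for the interruption model would still have to be checked by retraining and comparing results.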
LIST OF REFERENCES
1. IEEE Trial-Use Guide for Electric Power Distribution Reliability Indices,
IEEE Std 1366-2001, IEEE, New York, 1999.
2. 2001 Cost of Downtime, Contingency Planning Research (CPR) and
Contingency Planning & Management Magazine (CPM). Web site
http://www.contingencyplanningresearch.com (accessed on December 2001).
3. C. A. Warren, Overview of 1366-2001 the Full Use Guide on Electric Power
Distribution Reliability Indices, Power Engineering Society Summer Meeting,
IEEE, Volume 2, 2002.
4. Transmission Line Reference Book, 345kV and Above/Second Edition. Electric
Power Research Institute, Palo Alto, CA, 1982.
5. Florida Power and Light website http://www.fpl.com (accessed June 2004)
6. Thomas E. Grebe, D. Daniel Sabin, and Mark F. McGranaghan, An Assessment of
Distribution System Power Quality: Volume 1: Executive Summary. EPRI Report
TR-106294-V1, Electric Power Research Institute, Palo Alto, California, May
7. D. Daniel Sabin, An Assessment of Distribution System Power Quality, Volume 2:
Statistical Summary Report. EPRI Report TR-106294-V2, Electric Power Research
Institute, Palo Alto, CA, May 1996.
8. Daniel L. Brooks and D. Daniel Sabin, An Assessment of Distribution System
Power Quality: Volume 3: The Library of Distribution System Power Quality
Monitoring Case Studies. EPRI Report TR-106294-V3, Electric Power Research
Institute, Palo Alto, California, May 1996.
9. National Climatic Data Center (accessed June 2004), Website
10. A. Domijan, Jr., R. K. Matavalam, A. Montenegro, W. S. Willcox, Y. S. Joo, L.
Delforn, J.R.Diaz, L.Davis, and J. D'Agostini, Effects of Normal Weather
Conditions on Interruptions in Distribution Systems, International Journal of Power
and Energy Systems, Publication No: 203-3453.
11. J. M. Zurada. Introduction to Artificial Neural Systems. West Publishing
Company, St. Paul, MN, 1992.
12. L. F. Garcia, and O.A Mohammed, Forecasting Peak Loads with Neural Networks,
Southeast Conference. Creative Technology Transfer-A Global Affair,
Proceedings of the 1994 IEEE, pp. 351 356, 10-13 April, 1994.
13. S. I. Wu, Mirroring our Thought Processes. IEEE Potentials 14, 36-41, 1995
14. Neuron Model & Network Architectures, Neural Networks Toolbox, MATLAB 6.0
Manual, Chapter 2.
15. W.M. Huang and R.P. Lippmann. Comparisons Between Neural Networks and
Conventional Classifiers, Proc. IEEE Int. Conference on Neural Networks, pp.
16. J.L Chen, and Chang, S.H, A Neural Network Approach to Evaluate Distribution
Systems Engineering, IEEE International Conference on Neural Networks,
pp. 487 490, 17-19 Sept. 1992.
Roop Kishore R. Matavalam was born in the city of Tirupati, Andhra Pradesh, India.
He received his Bachelor of Technology (B.Tech) degree in 2001, specializing in
electrical and electronics engineering, from Sri Venkateswara University, India. Since
Fall 2001 he has been pursuing his Master of Science degree in electrical and computer
engineering at the University of Florida, Gainesville. He has been working as a research
assistant since 2001 in the Florida Power Affiliates and Power Quality Laboratory,
University of Florida. His fields of interest include power reliability, power
electronics, analog circuit design, and RF microelectronics.