Reliability assessment of bulk power systems using neural networks


Material Information

Title: Reliability assessment of bulk power systems using neural networks (technical report)
Physical Description: viii, 111 leaves : ill. ; 28 cm.
Author: Ebron, Sonja, 1963-
Genre: bibliography; theses; non-fiction
Notes: Thesis (Ph. D.)--University of Florida, 1992. Includes bibliographical references (leaves 105-110).
Statement of Responsibility: by Sonja Ebron.

Record Information

Source Institution: University of Florida
Rights Management: All applicable rights reserved by the source institution and holding location.
Resource Identifiers: aleph 001796813; notis AJM0526; oclc 27483529
Full Text







Copyright 1992


Sonja Ebron

to the Fon of Dahomey

homage to the Ancestors, and peace


The author wishes to thank Lisa, Kim, and Camille for inspiration and

motivation. The generous financial and moral support of the author's family

and the Florida Endowment Fund are greatly appreciated. Special thanks go to

Dr. Dennis P. Carroll for the use of computational facilities, to Dr. Khai T.

Ngo for the use of office space, and to Mr. Roger A. Westphal of Gainesville

Regional Utilities for technical assistance. Finally, the author wishes to thank

the cochairs of her supervisory committee, Drs. Robert L. Sullivan and Jose C.

Principe, whose guidance, support, and patience are to be credited with this work.


ACKNOWLEDGEMENTS .............................................................. iv
ABSTRACT ..................................................................... vii
1 INTRODUCTION ................................................................. 1
   Bulk Power Systems .......................................................... 1
   Reliability Assessment ...................................................... 3
   Neural Networks ............................................................. 6
   About the Thesis ............................................................ 8
2 POWER SYSTEM RELIABILITY ..................................................... 9
   Philosophy of Reliability ................................................... 9
   Classical Reliability ...................................................... 11
   Load Flow Studies .......................................................... 18
   Reliability of Bulk Power Systems .......................................... 26
3 NEURAL NETWORKS ............................................................. 36
   Philosophy of Connectionism ................................................ 36
   Description and Operation .................................................. 38
   Training Algorithms ........................................................ 43
   Principal Components Analysis .............................................. 50
4 RESEARCH METHODOLOGY ........................................................ 54
   Objectives of Research ..................................................... 54
   Research Procedures ........................................................ 55
   Training Issues ............................................................ 60
   Results for a 3-Bus System ................................................. 66
5 SIMULATION RESULTS .......................................................... 71
   Computational Tasks ........................................................ 71
   RBTS Results ............................................................... 74
   GRU System Results ......................................................... 77
   10-Bus System Results ...................................................... 81
6 CONCLUSION .................................................................. 85
   Summary .................................................................... 85
   Limitations ................................................................ 87
   Future Work ................................................................ 89
A ANNUAL LOAD MODEL ........................................................... 92
B SIMULATION DATA ............................................................. 95
REFERENCES ................................................................... 105
BIOGRAPHICAL SKETCH .......................................................... 111

Abstract of Dissertation Presented to the Graduate School of the University of Florida
in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy




August 1992

Chair: Robert L. Sullivan
Cochair: Jose C. Principe
Major Department: Electrical Engineering

Assessing the reliability of a bulk power system traditionally involves performing

load flow studies, at several load levels, on a small subset of possible contingencies.

Results are used to calculate systemwide adequacy indices. The conventional practice

of limiting the set of contingencies has an adverse effect on the accuracy of the chosen

indices, as does consideration of only a few load levels.

This work is motivated by two basic objectives: to determine if a neural network

can qualitatively replace the load flow study and to determine whether adequacy indices

computed using a neural network strategy are as accurate as those conventionally com-

puted using load flow studies. The dissertation presents research demonstrating that an

artificial neural network can be trained to assess the acceptability of an arbitrary contin-

gency at an arbitrary load level. The network inputs for a given contingency are selected

bus power injections and generator voltage magnitudes at a given load level, as well as

principal components of the associated node admittance matrix.

The network is trained using the inputs for a preselected set of contingencies at a

few load levels and their respective outputs, which are determined by load flow studies.


The neural net can then be used to determine the acceptabilities of other system states

without further recourse to computationally intensive load flow studies, and reliability

indices can be quickly computed using all possible contingencies at as many load levels

as desired. For the systems used in this work, the proposed neural network strategy

affords a higher degree of accuracy in the computation of indices when compared to

conventional techniques.


1.1 Bulk Power Systems

Supplying the demand for electric energy involves three principal functions -

generating, transmitting, and distributing electric power over time. Electricity is generated

at several remote locations using fossil fuels, nuclear fission, natural gas, dammed water,

and alternative resources such as solar insolation and wind. The generators are connected

to transmission lines which transport electricity to several distribution centers, or loads.

Electricity is then transported along distribution lines to homes, offices, industrial plants,

and other users of electric power. A functional diagram of a typical electric power

system is shown in Figure 1.1.

Figure 1.1 Typical Electric Power System

Besides generators, lines, and loads, a power system includes such components as

transformers, relays, transducers, shunt capacitors, and other elements. The power system



engineer, when analyzing a system, usually chooses a portion of the system to represent

with as much detail as is appropriate for the analysis. For instance, if the engineer wishes

to analyze power transients along a certain distribution route, the most proper model may

be one composed of all elements from the main distribution point to the end user.

On the other hand, if low voltages at several distribution centers are of interest, then

a low-detail model consisting of generators, transmission lines, and aggregate load points

may be most appropriate. This model, called a bulk power system, is normally used for

large-scale studies of entire systems and is the most common model in use at electric

utilities. Figure 1.2 shows the bulk power system model for the system of Figure 1.1.

Figure 1.2 Bulk Power System Model

Besides controlling the supply of an ever-present electric demand, the power system

engineer must plan for the supply of future demand. Hence, one large-scale study

that uses the bulk power system model is expansion planning. Like operation and

control, expansion planning also involves three principal activities: forecasting demand,

estimating reliability, and evaluating expansion alternatives. Since the expanded power

system must be capable of meeting the demand for electricity at some designated future


time, the demand at that time must be estimated. The growth in energy usage is dependent

on several factors, such as population growth, industrial growth, the economy, and energy-

efficient technologies. Some or all of these factors may be used to forecast load growth.

The expanded power system must be capable of meeting the forecasted demand in

a reliable fashion. That is, the system should provide dependable, high-quality electric

service. Since all components of the system are susceptible to failure at any time, the

power system engineer must ensure that failures of system components result in minimal

service interruptions. Hence, after forecasting, possible contingencies of the bulk power

system are analyzed by the engineer to determine the system's reliability under forecasted

demand. This aspect of power system planning is the focus of this work.

If reliability indices for the system fall short of desired values, then the system should

be expanded in some way(s). Additional generators or lines can be added to improve

the system's reliability. However, increased reliability must be balanced by cost-benefit

analyses in order to choose the best expansion alternative.

1.2 Reliability Assessment

Interest in the reliability of power systems began in the 1930s, but reliability remained

a largely philosophical field of study for many years. The major questions at that time

were twofold: what did reliability mean, and how was it to be measured? The classical

definition emphasized "adequate" functioning:

Reliability is the probability of a device or system performing its function
adequately, for the period of time intended, under the operating conditions
intended. [1: p. 1]

Not until after the second world war did theoretical and practical study of reliability

begin. The boom of the aeronautical and space industries in the '60s and '70s added

impetus and maturity to the study of reliability in several fields, including power systems.


A broad definition of bulk power system reliability, as given by the Institute of

Electrical and Electronics Engineers (IEEE), combines the issues of "adequacy" and

"security," as follows:

Adequacy is the degree of assurance that a bulk power system has sufficient
capability to meet the aggregate loads of all customers.
Security is the degree of assurance of a bulk power system in avoid-
ing uncontrolled, cascading tripouts which may result in widespread power
interruptions. [2: p. 3443]

The security of a bulk power system is determined by short-term, dynamic control; its

adequacy is more involved with long-term, steady-state operation. For this reason, and

because of the classical definition noted above, the terms "reliability" and "adequacy" are,

by tradition, used interchangeably. This work is concerned exclusively with the adequacy

of bulk power systems, and further references to "reliability" strictly imply "adequacy."

Several measures of adequacy are currently in use, ranging from a system's loss-of-

load probability to a customer's expected number of interruptions per year. In general,

reliability indices are either probabilities, frequencies, or expectations. The probabilistic

nature of these measures indicates that statistics and probability theory play a large role

in adequacy assessment, as the classical definition suggests.

Reliability studies may be performed on a utility's generation system, transmis-

sion system, distribution system, or bulk power system. In all cases, the general

procedure [1: p. 113] is the same. First, the system to be analyzed must be defined;

that is, the components to be included in the model must be listed, and the necessary

failure data for each component must be assembled. Also, any assumptions made in

using the model should be delineated, such as prescribed operator responses following

component failures. Second, the criteria for system failure must be defined. Usually,

each of these criteria implies an inability to meet the system load. Third, the set of


component failures to be considered must be chosen, and a "failure effects analysis,"

or determination of system failure, must be performed for each contingency. Finally,

reliability indices must be chosen, computed, and analyzed.

In the earlier days of power system adequacy assessment, only generation studies

were performed. Decades passed before transmission systems were studied, and large-

scale studies of distribution networks were never considered. The infancy of computer

technology, combined with the lack of algorithms for studying complete systems, pro-

hibited analyses of bulk power systems.

Only recently has reliability assessment of bulk power systems become feasible. Fast

algorithms for studying power flows in bulk power systems, coupled with computers of

sufficient speed and memory, make these studies possible. The will to perform these

studies comes from an understanding among utility engineers that separate analyses of

generation and transmission systems require highly unrealistic assumptions. Indeed,

studying a generation (or transmission) system requires assuming that the connected

transmission (or generation) system never fails.

Though bulk power system reliability assessment is becoming common in the utility

industry [3], serious problems remain. The large number of generating units and trans-

mission lines that are part of modern power systems, as well as the need to consider

overlapping component failures, means that thousands of contingencies must be analyzed

in order to determine a system's adequacy. Furthermore, failure effects analyses for a

bulk power system involve performing load flow studies at several load levels for each

contingency. The necessity for large numbers of load flow studies compels utility engi-

neers to sharply reduce both the number of contingencies and the number of load levels

considered, with a consequent decrease in the accuracy of reliability indices.

1.3 Neural Networks

Neural networks are highly interconnected systems of simple nonlinear elements

which have shown astounding pattern recognition abilities within the last few years.

Interest in this field of artificial intelligence peaked during the 1940s after a neurophysi-

ologist named Warren McCulloch and an electrical engineer named Walter Pitts developed

a network that could be trained to mimic simple logic functions. This network, shown

in Figure 1.3, motivated cognitive and neurological scientists to develop extended net-

works and training algorithms for many years. Indeed, at the height of this era, a neural

network called the "perceptron" was taught to mimic a retina and could perform visual

recognition tasks.

Figure 1.3 The McCulloch & Pitts Model
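In modern terms, the McCulloch & Pitts unit computes a weighted sum of binary inputs and fires when that sum reaches a threshold. A minimal sketch of how such a unit can mimic simple logic functions (the weights and thresholds below are illustrative choices, not values taken from the original model):

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: outputs 1 when the weighted sum of
    binary inputs reaches the threshold, and 0 otherwise."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Two simple logic functions realized by choice of weights and threshold:
def logic_and(x1, x2):
    return mp_neuron([x1, x2], [1, 1], threshold=2)

def logic_or(x1, x2):
    return mp_neuron([x1, x2], [1, 1], threshold=1)
```

Training such a network amounts to finding weights and thresholds that reproduce the desired input/output examples.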

Interest waned in the late '60s, however, after researchers in a rival artificial intelli-

gence community proved that neural networks with only input/output layers had severe

limitations [4]. This state of affairs continued until the late '70s, when scientists from sev-

eral fields developed a training algorithm for a neural network with intermediate layers

and showed that networks of this type had much more power than earlier models. These


networks sparked a revolution [5] in the communication and signal processing fields and

have been used in many other fields of study, including power systems.

The operation of a neural network involves repeatedly summing weighted input

signals and passing the results through simple nonlinearities. A network is trained to

recognize classes of patterns through the use of examples. That is, a training algorithm

chooses a set of weights which maps example input signals to example outputs. The

power of these networks is not, however, in simply memorizing sample patterns. Neural

networks are capable of generalizing to new patterns and classifying them based on

certain intrinsic characteristics. These networks, given an adequate range of examples,

are capable of learning the characteristics which distinguish one class of patterns from another.
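The summing-and-squashing operation described above can be sketched as a single forward pass through a small two-layer network. The layer sizes, weights, and input values here are arbitrary placeholders for illustration:

```python
import math

def sigmoid(v):
    """A simple nonlinearity that squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, w_hidden, w_out):
    """One forward pass: weighted sums passed through simple
    nonlinearities, layer by layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

# Arbitrary example weights: 3 inputs -> 2 hidden units -> 1 output.
w_hidden = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
w_out = [1.0, -1.0]
y = forward([0.2, 0.7, 0.1], w_hidden, w_out)
```

A training algorithm would adjust `w_hidden` and `w_out` so that outputs like `y` match the example outputs for the training patterns.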


A major difficulty in applying neural networks stems from the typically huge amount

of concurrent data that are available for use as input. This is especially true in power

system applications. The size of a neural net depends on the number of its input nodes,

and large networks are generally more difficult to train. The iterative algorithms used to

train neural nets may take an inordinate amount of time or even fail to converge.

It is advantageous, therefore, to find a representative set of features of the available

variables; the condensed set of features can then be used as input signals, instead of the

actual variables. In the parlance of neural networks, numerical computation of the features

of a set of variables is called "preprocessing." Finding features to use as input, however,

usually requires a great deal of knowledge about the particular application; even with

sufficient expertise, heuristic preprocessing necessarily involves a measure of guesswork.

Principal components analysis is a statistical method that transforms a highly cor-

related set of variables into a linearly independent set. The transformed variables are

called "principal components." Many of the variables in power system applications are

highly interdependent; for instance, a ten-to-one ratio may exist between the reactance

and resistance in a transmission line, and similar lines are used throughout a power sys-

tem. For this reason, principal components analysis is used in this work to reduce the

dimensionality of a chosen set of input variables.
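As a toy illustration of the technique (not the preprocessing pipeline used in this work), the principal components of two correlated variables, such as a line resistance and a reactance in a roughly ten-to-one ratio, can be computed in closed form from their 2x2 covariance matrix; the variable names and sample values below are hypothetical:

```python
def pca_2d(xs, ys):
    """Principal components of two variables via the eigenvalues of
    their 2x2 sample covariance matrix (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] give the variance
    # carried by each principal component.
    tr, det = sxx + syy, sxx * syy - sxy ** 2
    disc = (tr ** 2 / 4 - det) ** 0.5
    return tr / 2 + disc, tr / 2 - disc

# Hypothetical line data: reactance roughly ten times the resistance.
r = [0.010, 0.012, 0.011, 0.013, 0.009]
x = [0.100, 0.121, 0.109, 0.131, 0.088]
lam1, lam2 = pca_2d(r, x)
```

Because the two variables are nearly proportional, almost all of the variance falls on the first component, and the second can be discarded with little loss of information.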

1.4 About the Thesis

This thesis demonstrates that a neural network, using a chosen set of power system

variables, can be trained to specify the acceptability of an arbitrary contingency at an

arbitrary load level. The neural net can then be used to determine the acceptabilities of

other system states so that reliability indices can be quickly computed, using all possible

contingencies at as many load levels as desired.

Reporting the results of this research requires a review of background material.

Chapter 2 contains a thorough examination of power system reliability theory, while

Chapter 3 provides the necessary elements of neural network training and operation, as

well as a review of principal components analysis. In Chapter 4, the objectives and

methodology of the research are presented, using a small power system as an example.

Chapter 5 consists of simulation results for several larger systems and a comparison of

adequacy assessment techniques. Finally, conclusions are given in Chapter 6, along with

limitations of the research and suggestions for future work.


2.1 Philosophy of Reliability

The philosophical objective of a power system adequacy assessment is the develop-

ment of the most reliable system at the least cost. The general methodology for perform-

ing a reliability evaluation is simple, yet several difficulties exist in practice. Difficulties

range from conceptual and modeling issues to computational and data requirements [6].

These problems, and the assortment of simplifying assumptions used to solve them, ac-

count for the wide variability in adequacy assessment techniques. Indeed, depending

on the assumptions made in evaluating a system, the resulting indices may range from

near-exact to trivial values [7].

Since the structure of a reliability procedure depends on its purpose, one conceptual

issue involves the prospective uses of the results. An adequacy assessment may be

performed to compare alternative designs, to measure a system's reliability against

available standards, or to find a balance between costs and benefits of increased reliability.

The purpose of the study may affect the system model, computations, and required data.

Another conceptual problem involves criteria for system failure. Depending on the

level of detail in the system being evaluated, the set of failure events may consist of a

lack of sufficient generation capacity, an inability of the transmission system to carry any

or all of the load, high or low bus voltages, and/or other problems. Also, the choice of

adequacy indices must reflect the intended uses of the assessment. These indices may be

systemwide, load-point, or customer-related measures. In addition, standards by which


to assess computed indices must be found; in many cases, though, such standards are

simply unavailable.

Modeling issues include the types of system components to represent and the modes

of component failure. The representations of generation dispatch and load demand must

also be decided. Also, whether and how to model dependent failures, weather effects, and

energy constraints, as well as preventive maintenance of components and the response of

system operators following failures, must be determined. Modeling issues are resolved

by making appropriate assumptions about the significance (or insignificance) of these

factors to the reliability study.

Computational difficulties are especially onerous. Unless all possible contingencies

of a power system are to be used in assessing its adequacy, a contingency selection

process must be chosen. Generally, only the most probable contingencies are used;

however, more sophisticated ranking procedures [8-10] based on the anticipated severity

of contingencies are also available. In either case, compilation of the relevant data

requires significant calculation.

After a contingency set is chosen, it must be analyzed for failures. Depending on the

model of the system under study, evaluation of contingencies can be the most computer-

intensive part of reliability assessment; this is certainly true for bulk power system studies.

If operator responses are modeled, failed contingencies must be reevaluated following

supposed corrective actions, causing additional computational burdens. Hence, numerous

assumptions are made regarding system parameters in order to simplify contingency

analysis. Calculation of indices also requires significant computer time and memory, and

other assumptions are often made in order to simplify this task.

Data requirements for a reliability study may be extensive. Generally, a forecast

of the hourly demand and failure rates for all components in the system model are


necessary. If dependent failures and weather effects are modeled, data pertaining to them

are also required. The same is true for energy constraints and preventive maintenance

of components. In many cases, the lack of access to certain types of data determines

the structure and compass of a reliability assessment. The practical philosophy in power

system reliability studies is, therefore, to make those assumptions which give the best

tradeoff between accuracy of indices and computational effort.

2.2 Classical Reliability

As shown in Section 1.2, the adequacy of a bulk power system is its ability to meet

the load demand at any time within component ratings and voltage limits. Determination

of a system's adequacy begins with delineation of the system model. All elements (e.g.,

generators, lines) of a bulk power system operate under several assumptions [11], as follows:


1. Each element resides in one of two possible states at a given time, either "up" or

"down." Hence, the elements are binary-state, repairable components.

2. Failure and repair of distinct elements can occur simultaneously; that is, element x

can fail at the moment element y is repaired.¹ Transitions are, therefore, independent.

3. Transition from an "up" state to a "down" state, or vice-versa, occurs randomly and

is instantaneous for each element. Since changes in state do not depend on previous

transitions, the state of an element over time is a Markov process.

4. Average operating and repair times are constant for each element; that is, the means

are stationary. Each element operates as a homogeneous Markov process.

In addition, several variables are associated with each element, including

¹ Note, however, that no element can fail and be repaired in any one time step.


m    mean "up" time, or mean time to failure (MTTF), of an element,

r    mean "down" time, or mean time to repair (MTTR), of an element,

p    availability of an element, or probability that an element is available (equals m/(m + r)),

q    unavailability of an element, or probability that an element is unavailable (equals r/(m + r), or 1 − p); also called the "forced outage rate" (FOR) of an element.

A system with n binary-state elements can reside in any one of 2ⁿ states. Thus, a

contingency is defined by the up/down status of each element. Since the elements are

independent, the probability of occurrence for a particular contingency is the product of

probabilities (p or q) that each element has the indicated status.
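Under these assumptions, the probability of any contingency follows directly from the element availabilities. A short sketch (the element data are illustrative, not drawn from a particular system):

```python
def contingency_probability(availabilities, statuses):
    """Probability of a system state: the product of p for each
    element that is up and q = 1 - p for each element that is down."""
    prob = 1.0
    for p, up in zip(availabilities, statuses):
        prob *= p if up else 1.0 - p
    return prob

# Three independent elements with illustrative availabilities:
p = [0.98, 0.97, 0.95]
all_up = contingency_probability(p, [True, True, True])
one_down = contingency_probability(p, [True, False, True])
```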

Given a load forecast, the procedure for measuring the reliability of a power system is

straightforward. First, all possible system states are listed with associated probabilities of

occurrence. Next, the set of unacceptable, or "failed," states is determined and, depending

on the reliability measures desired, certain parameters are specified for each unacceptable

state. For instance, if the expected value of unserved demand for the system is a chosen

index of reliability, then the amount of unserved demand for each failed state must be

determined. Finally, the chosen reliability indices are computed from parameters of the

set of unacceptable states.

As an example, consider the 2-unit generating system² shown in Figure 2.1. The

first generating unit has a maximum capacity of 30 MW. It has a mean "up" time (m₁)

of 100 days and a mean "down" time (r₁) of 2.04 days. The second unit has a capacity

of 20 MW, mean "up" time (m₂) of 50 days, and mean "down" time (r₂) of 1.55 days.

² In assessing the adequacy of a generation system, the transmission system is assumed fully reliable. That is, failures of transmission
lines are not considered in the analysis. Hence, lines are omitted, all generators are connected to one bus, and the entire system
load is lumped at the same bus.

The availabilities of each unit are, then,

p₁ = m₁/(m₁ + r₁) = 100/(100 + 2.04) = 0.98,    p₂ = m₂/(m₂ + r₂) = 50/(50 + 1.55) = 0.97,

q₁ = 1 − p₁ = 0.02,    q₂ = 1 − p₂ = 0.03.

Figure 2.1 Example Generating System

Table 2.1 Probabilities of Generating States

Generating State | Unit 1 Status | Unit 2 Status | Capacity Available | State Probability
        1        |      up       |      up       |       50 MW        |  p₁p₂ = 0.9506
        2        |      up       |     down      |       30 MW        |  p₁q₂ = 0.0294
        3        |     down      |      up       |       20 MW        |  q₁p₂ = 0.0194
        4        |     down      |     down      |        0 MW        |  q₁q₂ = 0.0006
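The arithmetic above is easily checked. A short sketch reproducing the unit availabilities and the state probabilities of Table 2.1:

```python
def availability(mttf, mttr):
    """p = m / (m + r); the forced outage rate is q = 1 - p."""
    return mttf / (mttf + mttr)

p1 = round(availability(100, 2.04), 2)   # Unit 1: 30 MW, m1 = 100 d, r1 = 2.04 d
p2 = round(availability(50, 1.55), 2)    # Unit 2: 20 MW, m2 = 50 d, r2 = 1.55 d
q1, q2 = 1 - p1, 1 - p2

# State probabilities for the four up/down combinations (Table 2.1):
states = {
    "50 MW": p1 * p2,
    "30 MW": p1 * q2,
    "20 MW": q1 * p2,
    "0 MW":  q1 * q2,
}
```

Since the four states are exhaustive and disjoint, their probabilities sum to one.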

Generating states and associated probabilities are listed in Table 2.1. The 2-unit

system has 4 combinations of available capacities, from zero, where both units are

down (state #4), to the sum of unit capacities, where both units are up (state #1). A

generating system's performance depends on its expected load. Using historical load

data, the projected load on a power system for a given year can be expressed as a

function of time. This function can be approximated by discrete load levels over a

normalized time variable, such that the probability of having a particular discrete load



can be specified [12]. Discrete load levels and associated probabilities for the example

system are given in Table 2.2.
Table 2.2 Probabilities of Load States

State | Load Level | Probability
  1   |   10 MW    |    0.60
  2   |   30 MW    |    0.08
  3   |   35 MW    |    0.20
  4   |   40 MW    |    0.12

The 4-state generation model and 4-state load model combine to yield 16 system

states, as shown in Figure 2.2. Each state is defined by a capacity margin, which is the

difference between available capacity and load for the state, and by the probability that the

system resides in that state. Events in the generation and load models are independent,

so changes in one do not affect the probabilities of change in the other. Hence, the

probability associated with a particular "margin state" is the product of generation and

load state probabilities.

For instance, the first state, comprising the first generation and load states in

Tables 2.1 and 2.2, respectively, has an available capacity of 50 MW and a load of

10 MW. Thus, the capacity margin is 40 MW. The probability of a 50-MW capacity

is 0.9506; the probability of a 10-MW load is 0.60. So, the probability of a 40-MW

margin is the product, or 0.5704, as shown in Figure 2.2. For a generating system, the

failed states are those with negative capacity margins, since these states have insufficient

capacity to meet the load demand. Since only parameters of unacceptable states are re-

quired for computation of reliability indices, information pertaining to acceptable states

can be ignored. The bold line in Figure 2.2 indicates the boundary between acceptable

(PM, or positive-margin) states and unacceptable (NM, or negative-margin) states.

Load:       10 MW       30 MW       35 MW       40 MW
Capacity
50 MW      40 MW       20 MW       15 MW       10 MW
           0.570360    0.076048    0.190120    0.114072
30 MW      20 MW        0 MW       -5 MW      -10 MW
           0.017640    0.002352    0.005880    0.003528
20 MW      10 MW      -10 MW      -15 MW      -20 MW
           0.011640    0.001552    0.003880    0.002328
0 MW      -10 MW      -30 MW      -35 MW      -40 MW
           0.000360    0.000048    0.000120    0.000072

Figure 2.2 Capacity Margins and Associated Probabilities

One of the most common indices of reliability is the "loss-of-load probability," which

is the long-term probability of having inadequate generation at any given time. Since

the system endures a loss of load when it resides in any one of the disjoint NM states,

the loss-of-load probability (LOLP) is the sum of probabilities associated with negative-

margin states. For the example system (using the data in Figure 2.2), the LOLP is 0.0178.

Using a conversion factor of 365 days per year, this value indicates that a loss-of-load

condition can be expected to occur on 7 days of the year.

Another common index is the "expected unserved demand" (EUD), or the average

value of load not met in a failed state. For a negative-margin state, note that the magnitude

of the capacity margin is the unserved demand for that state. Hence, the EUD for the

system can be computed as the cumulative product of capacity margins and probabilities

associated with these states, divided by the loss-of-load probability.³ For the example
³ In the computation of EUD, the cumulative product is divided by LOLP because the expectation is conditioned on the existence
of a failed state.


system, the EUD is 11.1 MW. In short, the system in Figure 2.1 will fail to serve an

average of 11.1 MW of load on each of 7 days in the year. These reliability indices can

be compared to those of other generating systems or to those of expansion alternatives.
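The two index computations described above can be sketched in a few lines of code. The capacity margin and probability pairs below are hypothetical, chosen only to exercise the formulas; they are not the values of Figure 2.2.

```python
# Sketch of the LOLP and EUD computations, using a hypothetical list of
# (capacity margin in MW, probability) pairs for the system states.
states = [
    (30, 0.6000),    # positive-margin (acceptable) states
    (10, 0.1900),
    (5,  0.1160),
    (-5, 0.0760),    # negative-margin (failed) states
    (-10, 0.0176),
    (-40, 0.0004),
]

failed = [(m, p) for m, p in states if m < 0]

# Loss-of-load probability: sum of the disjoint negative-margin probabilities.
lolp = sum(p for _, p in failed)

# Expected unserved demand: probability-weighted mean of the margin
# magnitudes, conditioned on residing in a failed state (divide by LOLP).
eud = sum(-m * p for m, p in failed) / lolp

print(lolp)          # long-term probability of system failure
print(365 * lolp)    # expected loss-of-load days per year
print(eud)           # average unserved MW during failures
```

The conditioning step mirrors footnote 3: the cumulative product is divided by the LOLP because the expectation applies only when a failed state exists.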

In contrast, the reliability evaluation of transmission systems focuses on load points

instead of supply. Hence, the indices relate to the continuity of service (or whether a

load is served) rather than to its quality (or amount of load served), so a probabilistic

load model is unnecessary. The generation system is assumed fully reliable and situated

at one bus, and each transmission line terminates on at least one load point. The 4-line

system in Figure 2.3 can be used for illustration [13].

Figure 2.3 Example Transmission System

[Figure 2.3 shows a four-line system: lines 1 and 2 run in parallel from the generation bus to load A, line 4 runs from the generation bus to load B, and line 3 connects loads A and B.]

Historical observation of the system indicates that, on average, lines 1 and 2 are

out of service 0.5 days per year, while lines 3 and 4 are out 0.1 and 0.6 days per year,

respectively. The unavailability of line i, or probability that it is out of service, is simply

q_i = (average number of days per year line i is out) / (365 days per year),

and these probabilities are listed in Table 2.3.


Table 2.3 Forced Outage Rates for Transmission Lines

Line Unavailability
1 q1 = 0.001370
2 q2 = 0.001370
3 q3 = 0.000274
4 q4 = 0.001644

There are three series paths for supplying load A, expressed as line connections.

They are line 1, line 2, and line 4 → line 3. Load A can be supplied if any one of

these paths is available; alternately, load A is interrupted when all three paths to it are

unavailable. The Average Annual Customer Interruption Rate (AACIR) for a load, or

large group of customers, is defined as the expected number of days per year that the

continuity of supply to the load is interrupted. For load A, then,

AACIR_A = 365 days/year × P[(line 1 out) ∩ (line 2 out) ∩ ((line 3 out) ∪ (line 4 out))]

        = 365 days/year × q1 q2 (q3 + q4 − q3 q4)

        = 1.32 × 10⁻⁶ days/year, or about 1 second every 10 years.
The paths to load B are line 1 → line 3, line 2 → line 3, and line 4. Hence,

AACIR_B = 365 days/year × P[((line 1 out) ∪ (line 3 out)) ∩ ((line 2 out) ∪ (line 3 out)) ∩ (line 4 out)]

        = 365 days/year × P[((line 3 out) ∪ ((line 1 out) ∩ (line 2 out))) ∩ (line 4 out)]

        = 365 days/year × (q3 + q1 q2 − q3 q1 q2) q4

        = 1.655 × 10⁻⁴ days/year, or about 14 seconds/year.
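The two interruption rates above can be reproduced directly from the forced outage rates of Table 2.3:

```python
# AACIR computation from the forced outage rates of Table 2.3.
q1 = q2 = 0.5 / 365      # lines 1 and 2: out 0.5 days/year on average
q3 = 0.1 / 365           # line 3: out 0.1 days/year
q4 = 0.6 / 365           # line 4: out 0.6 days/year

# Load A is lost when lines 1 and 2 are both out and the line 4 -> line 3
# path is also broken (line 3 or line 4 out).
aacir_a = 365 * q1 * q2 * (q3 + q4 - q3 * q4)        # days/year

# Load B is lost when line 4 is out and both series paths through line 3
# are broken: line 3 out, or lines 1 and 2 both out.
aacir_b = 365 * (q3 + q1 * q2 - q3 * q1 * q2) * q4   # days/year

print(f"{aacir_a:.3e} days/year")   # about 1 second every 10 years
print(f"{aacir_b:.3e} days/year")   # about 14 seconds per year
```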
The interruption rate for any load point in a realistic transmission system can be

determined by this method; the only difficulty stems from listing the numerous paths

to each load. Like the loss-of-load probability for generating systems, the AACIR uses


historical failure data and elementary probability theory. However, flow-graph theory is

also a necessary component of transmission system reliability assessment. Other indices,

such as the frequency and expected duration of load interruptions, are derived from

simple extensions to this basic algorithm [14].

2.3 Load Flow Studies

In a power system model without transmission lines, the set of unacceptable states

can be determined simply by comparing generation and load levels. The inclusion of

transmission lines in the model means that the load is augmented by transmission losses;

that is, generation levels must be compared to levels of load plus losses in order to

determine failed states4 in a bulk power system model. In effect, the boundary between

positive- and negative-margin states (see Figure 2.2) is augmented, by the addition of

losses, towards the upper left-hand corner. Hence, failure effects analyses of bulk power

systems require performance of load flow studies.

A load flow study is the determination of voltage at all points (and power through

all paths) in a bulk power system under a given load and contingency. It is essential to

bulk power system reliability assessment because satisfactory future operation depends

on knowing the effects of contingencies and new loads before they occur. Certain system

variables must be known before a load flow study can be performed. They include the

voltage magnitude at all generator buses, the real power generated at all but one of the

generator buses, the real and reactive load at each bus, and the admittance and shunt

capacitance of each line.

A bulk power system with two generators and two lines is shown in Figure 2.4.

Line losses for a given load are unknown and can only be determined by finding the bus

4 On average, losses account for approximately 5% of the total generation, or 33% of a system's reserves and, therefore, are
significant. Also, even if total generation exceeds load plus losses, the transfer capability of the lines may be insufficient to deliver
power to the loads.


voltages at the ends of each line. This requires solving a set of nonlinear equations for

the system, which is generally performed using an iterative procedure like the Newton-

Raphson method [15: pp. 193-226].



Figure 2.4 Example Bulk Power System

Four variables are associated with each bus in a power system; they are the voltage

magnitude (V) and angle (6) at the bus, and the real (P) and reactive (Q) power injections

to the bus, which are the differences between generation and load at the bus. There are

three categories of buses in a power flow study, and two of the four bus variables are

specified for each. The categories are

- a swing (or slack) bus, usually numbered '1', which is a set of generators that supplies

the difference between the power provided to the system by other generators and the

total system load plus losses. The voltage magnitude and angle (usually set to zero)

are specified for this bus.

- voltage-controlled buses, consisting of all other generator buses in the system. The

voltage magnitude and real power injection are specified for these buses.

- load buses, consisting of all other nodes in the system. The real and reactive power

injections, which are negative for loads, are specified for these buses.


The complex voltages and injected currents at the buses are related by the node

admittance matrix, denoted Y_bus [16]. This matrix is a shorthand description of the

system's topology which includes the admittances of each line. Each diagonal element

(or driving-point admittance) is formed by summing all admittances connected to the

associated node, or bus, including all shunt admittances. Off-diagonal elements (or

transfer admittances) are formed by taking the negative of the admittance between the

two associated buses, excluding the shunt admittance.


Figure 2.5 Line Admittances for the Sample Bulk Power System

Line and shunt admittances for the system of Figure 2.4 are shown in Figure 2.5.

The two line admittances, denoted y, are shown with associated shunt admittances, b. If

the number of buses in the system is N, then the node admittance elements are formally

computed as

Y_ii = Y_ii∠θ_ii = Σ_{j≠i} y_ij + (1/2) Σ_{j≠i} b_ij,                (2.1)

Y_ij = Y_ij∠θ_ij = −y_ij,   i ≠ j.                                   (2.2)

For the sample system, the node admittance matrix is

         | Y11∠θ11   Y12∠θ12   Y13∠θ13 |
Y_bus =  | Y21∠θ21   Y22∠θ22   Y23∠θ23 |
         | Y31∠θ31   Y32∠θ32   Y33∠θ33 |

         | y13 + b13/2        0              −y13                  |
      =  |      0        y23 + b23/2         −y23                  |.   (2.3)
         |    −y13          −y23       y13 + y23 + (b13 + b23)/2   |
The symmetry of the node admittance matrix allows for use of only the upper

(or lower) triangular portion. This is important for large systems because computer

memory requirements can be greatly reduced by omitting the redundant portion of the

matrix. In addition, the node admittance matrix is typically very sparse; that is, the

percentage of trivial entries generally increases with the size of the system. This means

that computer storage requirements can be further reduced by techniques of sparcity

programming [17-18]. A load flow study begins with formation of the node admittance

matrix, which is independent of the system load.
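As a sketch, the formation rules above translate directly into code. The function below assumes each line is described by (from bus, to bus, series admittance y, total line-charging admittance b, half allocated to each terminal); the numerical values are illustrative, not taken from the text.

```python
# Minimal sketch of node admittance matrix formation, in the spirit of
# Figure 2.5. Bus numbering is 1-based; values are illustrative.
import numpy as np

def build_ybus(n_bus, lines):
    Y = np.zeros((n_bus, n_bus), dtype=complex)
    for i, j, y, b in lines:
        i -= 1; j -= 1
        Y[i, i] += y + b / 2      # driving-point admittances: sum of all
        Y[j, j] += y + b / 2      # admittances connected to the bus
        Y[i, j] -= y              # transfer admittances: negative of the
        Y[j, i] -= y              # line admittance, shunt excluded
    return Y

lines = [(1, 3, 2 - 6j, 0.03j),   # line 1-3: series y, total charging b
         (2, 3, 1 - 3j, 0.02j)]   # line 2-3
Ybus = build_ybus(3, lines)
```

Note that the matrix comes out symmetric, which is what allows the upper-triangle storage savings mentioned above.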

Note, however, that the matrix changes under any line contingency [15: pp. 166-192].

If, for instance, line 1-3 is removed from service, then the matrix must be updated by

the addition of a line parallel to line 1-3. The new line has an admittance and shunt sus-

ceptance that effectively cancels that of line 1-3. That is, the new line has an admittance

of -y13 and a shunt susceptance of -b13, and the new admittance matrix is
             | 0         0             0         |
Y_bus,new =  | 0    y23 + b23/2      −y23        |,   (2.4)
             | 0       −y23       y23 + b23/2    |
describing a system with only the second line5. Since line 1-3 is connected to buses

1 and 3, only matrix elements 1-1, 1-3, 3-1, and 3-3 are affected, and only by the
5 Here, the first row and column of the matrix can be deleted because bus 1 is effectively isolated from the system.




admittance and shunt susceptance of the missing line. Hence, a new admittance matrix

need not be formed under a line contingency; only the affected elements need be altered.
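A sketch of this in-place update, with illustrative admittance values: removing line 1-3 alters only elements 1-1, 1-3, 3-1, and 3-3.

```python
# In-place admittance matrix update for a line outage (1-based buses);
# y is the line's series admittance, b its total line-charging admittance.
import numpy as np

def remove_line(Y, i, j, y, b):
    i -= 1; j -= 1
    Y[i, i] -= y + b / 2   # equivalent to adding a parallel line with
    Y[j, j] -= y + b / 2   # admittance -y and shunt susceptance -b
    Y[i, j] += y
    Y[j, i] += y

y13, b13 = 2 - 6j, 0.03j
y23, b23 = 1 - 3j, 0.02j
Y = np.array([[y13 + b13/2, 0, -y13],
              [0, y23 + b23/2, -y23],
              [-y13, -y23, y13 + y23 + (b13 + b23)/2]])

remove_line(Y, 1, 3, y13, b13)
# Row and column 1 are now zero: bus 1 is isolated, and the remaining
# 2x2 block describes the system with line 2-3 only.
```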


Figure 2.6 Node Model

Since all line power flows can be computed using node voltages, the load flow study

is considered solved when the complex voltage at each bus is known. Hence, power

injection equations, which are functions of node voltages, must be written for each bus.

With reference to the node model shown in Figure 2.6, the complex conjugate of the

power injected at bus k, from generation and/or load at the bus, is6

S_k* = P_k − jQ_k = V_k* I_k

     = (V_k∠−δ_k) Σ_{n=1}^{N} (Y_kn∠θ_kn)(V_n∠δ_n)

     = Σ_{n=1}^{N} V_k V_n Y_kn ∠(θ_kn + δ_n − δ_k)

     = Σ_{n=1}^{N} V_k V_n Y_kn [cos(θ_kn + δ_n − δ_k) + j sin(θ_kn + δ_n − δ_k)].   (2.5)
6 By Kirchhoff's current law, the power injected from generation and/or load at bus k must flow to all connected buses, including ground bus 0.


Thus, for a system with N buses, the power injections at the kth bus are calculated as

Re{S_k*} = P_k = Σ_{n=1}^{N} V_k V_n Y_kn cos(θ_kn + δ_n − δ_k),                (2.6)

Im{S_k*} = −Q_k = Σ_{n=1}^{N} V_k V_n Y_kn sin(θ_kn + δ_n − δ_k),               (2.7)
yielding a system of 2N nonlinear equations. Half of the 4N bus variables are specified.

The 2N unknown variables are the real and reactive power injection at the slack bus,

the voltage angle and reactive power injection at other generator buses, and the voltage

magnitude and angle at the loads.
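In complex form the two injection equations collapse to S_k = P_k + jQ_k = V_k (Σ_n Y_kn V_n)*, which is convenient to evaluate numerically. The admittances and bus voltages below are illustrative, not values from the text.

```python
# Evaluating the power injections of Equations (2.6)-(2.7) in complex form.
import numpy as np

y13, b13 = 2 - 6j, 0.03j
y23, b23 = 1 - 3j, 0.02j
Ybus = np.array([[y13 + b13/2, 0, -y13],
                 [0, y23 + b23/2, -y23],
                 [-y13, -y23, y13 + y23 + (b13 + b23)/2]])

# Assumed complex bus voltages in per unit (magnitude times e^{j*angle}).
V = np.array([1.02,
              1.01 * np.exp(-0.02j),
              0.98 * np.exp(-0.05j)])

S = V * np.conj(Ybus @ V)   # complex injections, S_k = P_k + jQ_k
P, Q = S.real, S.imag
```

Expanding each term of S recovers exactly the cosine and sine sums of Equations (2.6) and (2.7).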

Yet, the system of equations represented by Equations (2.6) and (2.7) is a function of only voltage magnitudes and angles. Let the number of load buses in the power system equal ℓ and the number of generator buses, including the slack bus, equal g (such that N = g + ℓ); then the unknown quantities in Equations (2.6) and (2.7) are voltage angles at all buses except the slack bus (g − 1 + ℓ unknowns) and voltage magnitudes at load buses (ℓ unknowns). Thus, Equations (2.6) and (2.7) represent a system of 2(g + ℓ) equations in g − 1 + 2ℓ unknowns. By removing all Equations (2.6) for which P_k is not specified and all Equations (2.7) for which Q_k is not specified, the nonlinear system has g − 1 + 2ℓ equations in g − 1 + 2ℓ unknowns. That is, Equation (2.6) is written for all generator buses except the slack node and for all loads, yielding g − 1 + ℓ equations; Equation (2.7) is written for all load buses to give ℓ equations.

A Taylor-series expansion of the remaining injection equations about voltage magnitudes and angles, neglecting second- and higher-order gradients, gives7

ΔP_k = P_k^calc − P_k^spec = Σ_{n=2}^{N} (∂P_k/∂δ_n) Δδ_n + Σ_{n=g+1}^{N} (∂P_k/∂V_n) ΔV_n,   k = 2, ..., N,      (2.8)

ΔQ_k = Q_k^calc − Q_k^spec = Σ_{n=2}^{N} (∂Q_k/∂δ_n) Δδ_n + Σ_{n=g+1}^{N} (∂Q_k/∂V_n) ΔV_n,   k = g + 1, ..., N,  (2.9)

where P_k^spec denotes the specified real injection at the kth bus and P_k^calc is the calculated real injection of Equation (2.6).

7 Since the voltage angle at the slack bus and voltage magnitudes at all generators are known quantities, Δδ_1 = 0 and ΔV_k = 0 for k = 1, ..., g.

The sensitivities of real power injections to changes in voltage magnitudes are
negligible, as are the sensitivities of reactive power injections to changes in voltage
angles. This means that the nonlinear system of equations represented by Equations (2.8)
and (2.9) can be decoupled into a subsystem for voltage angles and a subsystem for
voltage magnitudes. Thus, the differences between specified and calculated real power
injections can be approximated by

ΔP_k = P_k^calc − P_k^spec ≈ (∂P_k/∂δ_2) Δδ_2 + ··· + (∂P_k/∂δ_N) Δδ_N,   k = 2, ..., N.   (2.10)

In like manner, the differences between specified and calculated reactive power injections are approximated by

ΔQ_k = Q_k^calc − Q_k^spec ≈ (∂Q_k/∂V_{g+1}) ΔV_{g+1} + ··· + (∂Q_k/∂V_N) ΔV_N,   k = g + 1, ..., N.   (2.11)

These equations yield the two linear systems

| ΔP_2 |     | ∂P_2/∂δ_2  ···  ∂P_2/∂δ_N |   | Δδ_2 |
|  ⋮   |  =  |     ⋮       ⋱       ⋮     |   |  ⋮   |                       (2.12)
| ΔP_N |     | ∂P_N/∂δ_2  ···  ∂P_N/∂δ_N |   | Δδ_N |

| ΔQ_{g+1} |     | ∂Q_{g+1}/∂V_{g+1}  ···  ∂Q_{g+1}/∂V_N |   | ΔV_{g+1} |
|    ⋮     |  =  |         ⋮           ⋱         ⋮       |   |    ⋮     |   (2.13)
| ΔQ_N     |     | ∂Q_N/∂V_{g+1}      ···  ∂Q_N/∂V_N     |   | ΔV_N     |

where gradients are computed as follows:

∂P_k/∂δ_k =  Σ_{n≠k} V_k V_n Y_kn sin(θ_kn + δ_n − δ_k),
∂P_k/∂δ_n = −V_k V_n Y_kn sin(θ_kn + δ_n − δ_k),   n ≠ k,                    (2.14)

∂Q_k/∂V_k = −Σ_{n≠k} V_n Y_kn sin(θ_kn + δ_n − δ_k) − 2 V_k Y_kk sin(θ_kk),
∂Q_k/∂V_n = −V_k Y_kn sin(θ_kn + δ_n − δ_k),   n ≠ k.                        (2.15)


Unknown voltage magnitudes are initially set to either that of the slack bus or 1.0 per

unit. Unknown voltage angles are initially set to zero. Solving the linear systems in

Equations (2.12) and (2.13) gives changes to initial estimates, such that δ_k^new = δ_k^old + Δδ_k and V_k^new = V_k^old + ΔV_k. After updating complex voltages, real and reactive power

injections, and gradients, the linear systems are solved iteratively8 until voltage changes

are negligible.

When the complex voltages are known, all other system variables can be computed.

Real injection at the slack bus is found using Equation (2.6) for k = 1, and reactive

injection at all generators is computed by using Equation (2.7) for k = 1,..., g. Line

power flows are found by

S_kn = P_kn + jQ_kn = V_k I_kn*

     = (V_k∠δ_k) [(−Y_kn∠θ_kn)(V_k∠δ_k − V_n∠δ_n)]*,   k ≠ n.

Line power losses are computed either as the difference between total generator output

and total load or as the sum of individual line losses. This technique is called Fast

8 Experience with numerous power systems has shown that the gradients in Equations (2.14) and (2.15) are nearly invariant with voltage changes; hence, a common practice involves using the original gradient calculations in all iterations.



Decoupled AC Load Flow [19] and, like all Newton-based methods, converges only

when initial voltage estimates are "near" the actual values.
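For a concrete, if simplified, picture of the iterative solution, the sketch below solves a hypothetical two-bus system with a Gauss-Seidel update rather than the fast-decoupled Newton scheme described above; the structure (guess voltages, evaluate injections, correct, and repeat until the mismatch vanishes) is the same. All numerical values are illustrative.

```python
# Minimal two-bus iterative load flow: bus 1 is the slack (V1 = 1.0 p.u.),
# bus 2 a load bus with specified complex injection S2. Gauss-Seidel is
# used here for brevity in place of the fast-decoupled Newton method.
import numpy as np

y = 2 - 6j                        # series admittance of the single line
Y = np.array([[y, -y], [-y, y]])  # node admittance matrix (no shunts)

V1 = 1.0 + 0j                     # slack bus voltage (specified)
S2 = -(0.5 + 0.2j)                # load at bus 2 (negative injection)

V2 = 1.0 + 0j                     # flat start
for _ in range(50):
    # From S2* = V2*(Y21 V1 + Y22 V2), solve for V2 and iterate:
    V2 = (np.conj(S2) / np.conj(V2) - Y[1, 0] * V1) / Y[1, 1]

V = np.array([V1, V2])
mismatch = V[1] * np.conj(Y[1] @ V) - S2   # injection mismatch at bus 2
```

After convergence the mismatch is negligible and the load-bus voltage has sagged below 1.0 per unit, as expected for a net load.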

In bulk power system reliability assessment, this procedure must be performed for

each contingency that is not obviously unacceptable9. A system state is unacceptable

under the following conditions:

- the voltage magnitude at any load bus exceeds a lower or upper voltage limit (usually 0.95 and 1.05 per unit, respectively),

- the real or reactive power generation at any voltage-controlled bus exceeds total rated capacity at the bus, or

- the magnitude of complex power flowing on any line exceeds the thermal capacity of the line.

Switched capacitors (which alter reactive power injections) or changes in transformer taps

(which alter line and shunt admittances) may be used to correct unacceptable conditions,

but a new load flow study must be performed on the contingency for each change. A

reduction in load may be necessary to make a failed state acceptable, in which case a

loss-of-load condition exists. Clearly, the number of power flow studies necessary to

assess the adequacy of a power system can be prohibitive.

2.4 Reliability of Bulk Power Systems

As shown in Section 2.2, the state-space approach to adequacy assessment relies

heavily on the theory of homogeneous Markov processes [20]. If a "system state" is

defined as a power system contingency (in which lines and/or generators may be out

of service) that occurs at a given level of system load, and if the system's load and its

9 Clearly, a system state with no generators and/or no lines is unacceptable. A load flow study is not required to determine
acceptability of this sort of contingency. Indeed, the iterative technique diverges in these cases.

contingencies occur randomly, then the state of a bulk power system varies randomly

and its characteristics can be modeled by probability theory.

Formally, a stochastic process is a set of random variables which are ordered

sequentially; for example, the state of a power system over time is a set of joint random

variables with contingencies and loads as independent random variables, so a power

system operates as a stochastic process. A Markov process is a stochastic process in

which the probability associated with a random variable depends only on the state of

the variable at the previous time step, but not on earlier states. A Markov process is

homogeneous when the probability that a random variable changes state does not depend

on time; that is, the probability of a change in state is constant. Hence, if reliability is

a measurable characteristic of a system, then it can be quantified by application of this

statistical theory.

Several assumptions make the theory applicable to bulk power systems. First, the

probability density function of an element's time in a given state is exponential; that is,

the probability that an element will stay in a given state (failed or working) for a time t' is

P(t') = e^(−λt'),   (2.16)

where A is the rate of transitions out of the state. The same assumption is made for load

levels, which are not confined to two states. Next, events of failure or repair of distinct

elements are independent, as are changes of state for the set of elements and changes

of state for the load. Also, the probability of more than one of the following events

occurring in one time step is negligible:

- failure or repair of element 'a',

- failure or repair of element 'b',

- change in the load level;


that is, the possibility of simultaneous transitions is ignored.

Figure 2.7 A Two-State Homogeneous Markov Process

The theory is best illustrated using a two-state process10, described in Figure 2.7,

such as the failure/repair process of an element. Here, the probability that the element's

state does not change in one time step, Δt, given that the element resides in state i, is denoted p_ii(Δt). The probability that the state changes to j in Δt, given that the element is currently in i, is denoted p_ij(Δt). Since state j is the complement of state i,

p_ii(Δt) = 1 − p_ij(Δt).   (2.17)

The duration in state i has an exponential probability density function with transition rate λ_i, so the mean duration in state i is

m_i = 1/λ_i.   (2.18)

The rate of transitions out of state i is defined as

λ_i = lim_{Δt→0} p_ij(Δt)/Δt,

so that, if Δt is sufficiently small, the conditional probability of leaving state i in Δt is

p_ij(Δt) = λ_i Δt.   (2.19)

10 From the perspective of a particular state, all processes involve only two states: existence in the state and existence outside it.

Let pi(t) denote the probability that the element resides in state i at time t. If pi(t) is

known, then the state probability at the next time step is simply the conditional probability

that the element remains in state i if it currently resides in state i, or that it moves to

state i if it currently resides outside i; that is,

p_i(t + Δt) = p_ii(Δt) p_i(t) + p_ji(Δt) p_j(t).   (2.20)

Substituting Equations (2.17) and (2.19), Equation (2.20) becomes

p_i(t + Δt) = (1 − λ_i Δt) p_i(t) + λ_j Δt p_j(t),

or

[p_i(t + Δt) − p_i(t)] / Δt = −λ_i p_i(t) + λ_j p_j(t).   (2.21)

As At approaches zero, Equation (2.21) becomes the time-derivative of the state

probability which, because the process is homogeneous, is zero. Hence,

dp_i(t)/dt = lim_{Δt→0} [p_i(t + Δt) − p_i(t)] / Δt = −λ_i p_i(t) + λ_j p_j(t) = 0.   (2.22)

Also, since states i and j are complements,

p_i(t) + p_j(t) = 1.   (2.23)

Equations (2.22) and (2.23) form a linear system whose solution gives the long-run state

probabilities for the element; that is,

| −λ_i   λ_j | | p_i |   | 0 |
|   1     1  | | p_j | = | 1 |,   giving   p_i = λ_j/(λ_i + λ_j),   p_j = λ_i/(λ_i + λ_j).   (2.24)

Alternately, by substitution of Equation (2.18),

p_i = m_i / (m_i + m_j)   and   p_j = m_j / (m_i + m_j).   (2.25)


The frequency of encountering a state is defined as the average number of arrivals

into (or departures from) the state per unit time. The probability of a given state and the

frequency of encountering it are related by the mean duration of stays in the state. That

is, if the mean duration of stays in each state of the process is known, then the mean

cycle time of the process (i/j or i/i) is

T_i = m_i + m_ī = T_ī,   (2.26)

where the mean duration outside state i, m_ī, is the mean duration in state j, m_j. In the long run, the frequency is the reciprocal of the mean cycle time, or

f_i = f_ī = 1/T_i.   (2.27)

From Equation (2.25), however,

p_i = m_i f_i.   (2.28)

The fundamental relation in Equation (2.28) shows that any two of the three state

parameters define the third. Indeed, Equations (2.18)-(2.19) and (2.25)-(2.28) provide

a means of determining all three parameters from historical observations.11

The application of this method to bulk power system reliability begins at the level

of elements, or transmission lines and generating units. An element is either in service

or out. Let i and ī represent the element's working and failed states, respectively. The mean duration in each element state is assumed known; that is, m_i and m_ī are given. Then, the cycle time of the working/failed process is T_i = m_i + m_ī and its frequency, the frequency of encountering either state, is f_i = 1/T_i. The probability that the element resides in state i is p_i = m_i f_i, and the rate of transitions out of i is λ_i = 1/m_i.

11 As an illustration, consider an automobile which, on average, runs for 4 years before requiring 6 months (1/2 year) of repair. Here, the mean duration of a state of repair, m_r = 1/2 year, is given, as is the mean duration of a working state, m_w = 4 years. The probability of the car being in a state of repair is, then, m_r/(m_w + m_r) = 1/9. Though the rate of breakdown (or departure from the working state) is once every 4 years, the frequency of encountering a state of repair is once every 4.5 years.
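The automobile illustration above (4 years running, 6 months of repair) can be worked through Equations (2.18) and (2.26)-(2.28) in a few lines:

```python
# Two-state Markov parameters for the automobile illustration.
m_work = 4.0      # mean years in the working state
m_repair = 0.5    # mean years in the repair state

T = m_work + m_repair        # mean cycle time, Eq. (2.26)
f = 1.0 / T                  # frequency of encountering either state, Eq. (2.27)
p_repair = m_repair * f      # probability of being under repair, Eq. (2.28): 1/9
lam_work = 1.0 / m_work      # breakdown (transition) rate, Eq. (2.18)

print(p_repair)   # = 1/9
print(f)          # one repair state encountered every 4.5 years
```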


With this foundation, an outage state is defined as a collection of element states. With n elements in the system, each with two possible states, the sample space of contingencies contains 2^n states. If the sample space is limited to those states in which no more than two elements are out of service, then the number of states in the space becomes (n choose 0) + (n choose 1) + (n choose 2) = 1 + n + (1/2)n(n − 1). Let the elements in the system be denoted e_1, ..., e_n; also, the probability that element e is in service is denoted p_{e,i} and the probability that it is out of service is denoted p_{e,ī}. The probability that the system resides in outage state o, denoted p_o, is the probability that each element has the status indicated by o; that is,

p_o = P(e_1 ∩ e_2 ∩ ··· ∩ e_n) = (Π_{e,in} p_{e,i}) (Π_{e,out} p_{e,ī}),   (2.29)

where e,in denotes the set of elements in service for o.

From the perspective of the state o, the process has only the two states o and ō. Let the transition rate out of the working state for element e be denoted λ_{e,i} and its transition rate out of the failed state be denoted λ_{e,ī}. Since the duration of stays in each element state is exponential, and since element state transitions are independent, the distribution of durations of stay in an outage state, t_o, is described by

P(t_o) = (Π_{e,in} e^(−λ_{e,i} t_o)) (Π_{e,out} e^(−λ_{e,ī} t_o)) = e^(−(Σ_{e,in} λ_{e,i} + Σ_{e,out} λ_{e,ī}) t_o),   (2.30)
such that the mean duration in o is12

m_o = E{t_o} = ∫₀^∞ t_o P(t_o) dt_o = 1 / (Σ_{e,in} λ_{e,i} + Σ_{e,out} λ_{e,ī}).   (2.31)

12 The derivation of mean duration in Equation (2.31) uses the similarity between the gamma distribution and the distribution of t_o in Equation (2.30). The gamma distribution is P(x) = λ^α x^(α−1) e^(−λx) / Γ(α) where, if α is a positive integer, Γ(α + 1) = α!. The integrand in Equation (2.31), t_o P(t_o), is transformed to a gamma function by setting α = 2 and λ = Σ_{e,in} λ_{e,i} + Σ_{e,out} λ_{e,ī}. Then, since the integral of any distribution over all values of a random variable is unity, the integration leaves only the constant, m_o. The derivation shows that the mean value of any exponentially distributed random variable is simply the reciprocal of its transition rate.

The frequency of encountering state o is, then, f_o = p_o/m_o.
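Equations (2.29) and (2.31) can be sketched for a hypothetical three-element system; the mean durations below are illustrative, in hours.

```python
# Outage-state parameters from hypothetical element data, per
# Equations (2.29) and (2.31).
m_up = [2000.0, 1500.0, 3000.0]   # mean hours in the working state
m_dn = [10.0, 8.0, 24.0]          # mean hours in the failed state

p_up = [u / (u + d) for u, d in zip(m_up, m_dn)]   # in-service probabilities
p_dn = [d / (u + d) for u, d in zip(m_up, m_dn)]   # out-of-service probabilities
lam_up = [1.0 / u for u in m_up]  # rate out of the working state
lam_dn = [1.0 / d for d in m_dn]  # rate out of the failed state

# Outage state o: element 2 (index 1) out of service, elements 1 and 3 in.
in_service, out = [0, 2], [1]

p_o = 1.0
for e in in_service:
    p_o *= p_up[e]
for e in out:
    p_o *= p_dn[e]                                   # Eq. (2.29)

m_o = 1.0 / (sum(lam_up[e] for e in in_service)
             + sum(lam_dn[e] for e in out))          # Eq. (2.31), hours
f_o = p_o / m_o                                      # frequency of state o
```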

The 8760 hourly loads on the system for a projected year are assumed given. The

annual load curve (given as a function of time) can be discretized into several load levels,

such as increments of 5% or 25% of the annual peak load. A load state is defined as any

discretized load level in which the system may reside, and if the increments are 5% of

the peak load, then the sample space for loads contains the 20 load levels 5%, 10%,...,

100% of peak load. Let the number of occurrences13 of load state ℓ in the discretized annual load curve be denoted n_ℓ. If t_ℓ^k is the duration in ℓ at the kth occurrence, then the mean duration of stays in ℓ is m_ℓ = (1/n_ℓ) Σ_k t_ℓ^k. Also, the probability that the system resides in load state ℓ is p_ℓ = (1/8760) Σ_k t_ℓ^k. Since the duration in state ℓ is exponentially distributed, the transition rate out of ℓ (or into ℓ̄) is λ_ℓ = 1/m_ℓ, and the frequency of encountering ℓ is f_ℓ = p_ℓ/m_ℓ.
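The load-state bookkeeping can be sketched as follows, using a synthetic hourly curve (loads as fractions of annual peak) in place of real data:

```python
# Load-state statistics from an hourly load curve: occurrences, mean
# duration, probability, and frequency. The curve below is synthetic.
import math

hourly = [0.55 + 0.35 * math.sin(2 * math.pi * h / 24) ** 2
          for h in range(8760)]

def load_state(x, step=0.05):
    """Discretize a load to the nearest 5% level at or above it."""
    return min(20, math.ceil(x / step))

# Group consecutive hours into occurrences of each load state.
occurrences = {}          # state -> list of occurrence durations (hours)
prev, run = None, 0
for x in hourly:
    s = load_state(x)
    if s == prev:
        run += 1
    else:
        if prev is not None:
            occurrences.setdefault(prev, []).append(run)
        prev, run = s, 1
occurrences.setdefault(prev, []).append(run)

state = max(occurrences)                  # characterize one load state
n_l = len(occurrences[state])             # number of occurrences
m_l = sum(occurrences[state]) / n_l       # mean duration of stays, hours
p_l = sum(occurrences[state]) / 8760.0    # long-run probability
f_l = p_l / m_l                           # frequency = n_l / 8760
```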
A system state is an independent combination of an outage state and a load state. With 20 possible load states, the sample space contains 20(1 + n + (1/2)n(n − 1)) system states. If system state s consists of outage state o and load state ℓ, then the probability that the system resides in s is14

p_s = P(o ∩ ℓ) = p_o p_ℓ = (Π_{e,in} p_{e,i}) (Π_{e,out} p_{e,ī}) p_ℓ.   (2.32)

From the perspective of state s, the process has only the states s and s̄. Both the distribution of durations of stay in o and that of ℓ are exponential, so the distribution of durations of stay in s, t_s, is

P(t_s) = e^(−λ_ℓ t_s) e^(−(Σ_{e,in} λ_{e,i} + Σ_{e,out} λ_{e,ī}) t_s) = e^(−(λ_ℓ + Σ_{e,in} λ_{e,i} + Σ_{e,out} λ_{e,ī}) t_s),   (2.33)

13 An occurrence of load state ℓ may last for several hours.
14 Alternately, since a system state is also an independent combination of element states and a load state,
p_s = P(e_1 ∩ e_2 ∩ ··· ∩ e_n ∩ ℓ) = (Π_{e,in} p_{e,i}) (Π_{e,out} p_{e,ī}) p_ℓ = p_o p_ℓ.

such that the mean duration of stays in s is

m_s = 1 / (λ_ℓ + Σ_{e,in} λ_{e,i} + Σ_{e,out} λ_{e,ī}).   (2.34)

The frequency of encountering system state s is f_s = p_s/m_s.

State-space reliability indices for the system are the probability, mean duration, and frequency of encountering a failed (or unacceptable) system state [1: pp. 194-200]. For the purpose of computing them, an acceptability state is either the set of failed system states or the set of working system states, and a system state is classified as working or failed by performing a load flow study. Let the sets z and z̄ represent a system's failed and working states, respectively. If n_z is the number of failed states and z = {s_1, s_2, ..., s_{n_z}} is the set of unacceptable states, then the probability that the system resides in a failed state, or the long-term probability of system failure, is

P_z = P(s_1 ∪ s_2 ∪ ··· ∪ s_{n_z}) = p_{s_1} + p_{s_2} + ··· + p_{s_{n_z}} = Σ_{s∈z} p_s,   (2.35)

and the expected duration of system failure is15

m_z = (1/P_z) Σ_{s∈z} m_o p_s, in hours.   (2.36)

Finally, the frequency of encountering system failure is

f_z = 8760 P_z / m_z, in failures/year.   (2.37)

15 A system failure event is assumed to last as long as the associated outage state (instead of ending when the load changes), so
the expected duration of system failure involves mo instead of m,. That is, the load level present at the onset of the outage
is assumed to be present throughout the repair time. This assumption, though unrealistic, is necessary to a fair estimation of
system failure duration. Also, the cumulative product in Equation (2.36) is divided by the failure probability because the expected
duration is conditioned on system failure.
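Given a classified set of failed states, the three indices of Equations (2.35)-(2.37) follow in a few lines; the (p_s, m_o) pairs below are hypothetical.

```python
# System-level reliability indices from a hypothetical set of failed
# states, each with probability p_s and the mean duration m_o (hours)
# of its associated outage state.
failed_states = [
    (2.0e-4, 8.0),     # (p_s, m_o), illustrative values
    (5.0e-5, 24.0),
    (1.2e-5, 12.0),
]

P_z = sum(p for p, _ in failed_states)              # Eq. (2.35)
m_z = sum(m * p for p, m in failed_states) / P_z    # Eq. (2.36), hours
f_z = 8760.0 * P_z / m_z                            # Eq. (2.37), failures/year
```

As in footnote 15, the duration sum is divided by P_z because the expectation is conditioned on system failure.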


After determining the set of unacceptable system states, an index relating to loss

of load can be computed. Table 2.4 lists the acceptabilities of the 20 system states

associated with arbitrary outage o. Load levels in the table are in increasing order, so

that unacceptable system states for this outage are those with the highest loads. The

loss of load for an unacceptable state, denoted ds, can be estimated as the sum of load

increments from it to an acceptable state.

Table 2.4 Acceptabilities of System States Associated With Outage o

Load Level          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
Acceptability of
System State      yes yes yes yes yes yes yes yes yes yes yes yes yes yes yes  no  no  no  no  no

For example, since the last acceptable state in Table 2.4 is associated with load level

15, the loss of load for the first unacceptable state at load level 16 is 5% of peak load; the

loss of load at level 17 is 10% of peak load. In this way, d, can be determined for each

unacceptable system state, and the expected unserved demand during system failure is

d_z = (1/P_z) Σ_{s∈z} d_s p_s, in % of peak load.   (2.38)

Given historical or projected data on element failures and loads, the state-space

approach to reliability assessment is straightforward. The only difficulty arises from the

number of system states under consideration and the need to classify them as acceptable or

failed. For realistic power systems, which may contain hundreds of elements, restricting

the outage space to first- and second-level contingencies still leaves a large number of

contingencies to consider. Coupling the outage space with a realistic discretized load

space, such as the 20 levels used here, means that failure effects analysis of the system

space is a prohibitive task.


The most common method of further truncating the state space [21-23] is to de-

termine, based on access to computational facilities, the maximum number of load flow

studies allowable. The number of load levels considered can be reduced by increasing the

increments of peak load which form the discretized load curve; for example, if increments

of 25% are used, the number of load states is reduced from 20 to 4. Then, the number

of outage states considered becomes the allowable number of load flow studies divided

by the chosen number of load levels. In an attempt to maximize the accuracy of relia-

bility indices, only those contingencies with the highest probabilities of occurrence are

included. Though this conventional technique suffers from reduced accuracy of the four

reliability indices, it is much less computationally intensive than the more exact method.


3.1 Philosophy of Connectionism

The philosophy of connectionism is motivated by a desire to model the information

processing tasks of humans. Modern computers accomplish strictly regimented tasks

with much greater speed and precision than humans, but the capabilities of the human

brain to analyze sound, sight, smell, touch, taste, and knowledge are far superior to those

of conventional computers. Artificial neural systems attempt to replicate these tasks

by using theories from physics, mathematics, neurobiology, linguistics, psychology, and

other disciplines.

The cerebral cortex [24] is composed of billions of brain cells, or neurons, each

connected to as many as 10,000 other cells. Each neuron consists of a nucleus called

a soma, which processes electrochemical signals passed to it from the cell's dendrites

on a time scale of 10-20 ms. Processed signals are then passed to other neurons through the

cell's axon, which is connected to the dendrites of other cells by synapses. Each neuron

lies on one of four layers and may be connected to cells on any layer. These layers,

rising from the hypothalamus at the top of the spinal cord to the brain's outer surface,

differ in the density and size of neurons and in the type of connections (excitatory or

inhibitory) to other neurons.

Human learning relies on the formation of representations that generalize from

the details of specific examples. While specific experiences are retained in memory,

generalization depends on the congruity between these examples [25]. The brain retains


knowledge in the synapses between neurons, and learning occurs by adjusting these

synapses at the presentation of new information. Apparently, synaptic strength between

two neurons increases when the cells have similar responses to input, and it decreases

when the responses are very different. Just as artificial neural models are motivated by

an understanding of brain structure, many learning algorithms are based on this simple

rule of synaptic adjustment.

The precision and rigidity of conventional digital computers allow for superior

performance on algorithmic tasks, but functions such as learning require the adaptability

and resiliency of neural systems. The most glaring differences between the two are the

speed and order of processes. A modern computer processes information a million times

faster than a brain, yet the processing is done in a serial fashion. In contrast, the brain

processes information in parallel with a massive number of neurons.

Neural systems and computers also differ in information storage and retrieval. A

computer stores data in addressed memory locations, so that old information is destroyed

when new data is assigned the same address. The brain stores information in its distributed

synapses, which are simply adjusted when new information is stored. The distributed

nature of knowledge in the brain means that partial degradation has a minimal effect on

the brain's performance. In contrast, failure of a few processing elements in a computer

leads to general failure. This fault-tolerant characteristic of the brain is also due to local

control of processing; that is, while the central processing unit (CPU) in a computer

dictates all processes, control of the processes in a neural system lies with each neuron.

These differences give artificial neural systems the ability to perform well on certain

nonalgorithmic tasks.

Connectionist theory is an emerging science which seeks to model the parallelism

and interconnectivity of the human brain. Recent advances in applications of the theory


range from visual and adaptive pattern recognition to motion detection. In fact, VLSI

implementations of neural networks are the focus of ongoing research [26-28].

Power system engineers have also recognized the facility of using artificial neural

networks. The three IEEE journals dealing exclusively with power systems show that

one application of neural networks was published in 1989, three were published in 1990,

and six were published in 1991 [29-38]. These applications ranged from incipient fault

detection on distribution lines and in synchronous machines to security assessment of

transmission systems. The wide range of problems and techniques indicates that power

systems will provide a vast area of application for neural networks.

3.2 Description and Operation

A neural network consists of a set of nodes, a connection topology, propagation and

learning rules, and an input space. A general node model is represented in Figure 3.1.

The node, or neuron, receives its input through weighted links. This input may come

from other nodes in the network or from outside stimuli. An activation function, usually

a summation, acts on the input; the node's internal bias is then added to the summed and

weighted input. The result is called node activation.


Figure 3.1 Neuron Model

The node's output is determined by an output function, which responds to the

activation. An example is the S-shaped sigmoid shown in Figure 3.1. The transfer


function of the node consists of its activation and output functions. The node's output

travels along the links, or synapses, either to other nodes or to the output of the

system [39].
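The node model just described can be written directly as a short sketch; the function name is illustrative, and the sigmoid of Figure 3.1 is assumed as the output function:

```python
import math

def node_output(inputs, weights, bias):
    """Transfer function of a single neuron (Figure 3.1): the activation
    is the weighted sum of inputs plus the internal bias, and the output
    is the sigmoid response to that activation."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-activation))
```

With zero activation the sigmoid output is 0.5, so a node with zero weighted input and zero bias is "uncommitted" to being on or off.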

Figure 3.2 Neural Network Model

A neural network is simply an interconnected collection of these nodes. An example

of a network is shown in Figure 3.2. Nodes in a network differ only in the values

of internal biases and connected weights. In general, then, the propagation rule of the

network is, collectively, the transfer function of any single node. Signal propagation

may occur in either direction in the network; in fact, the output of a node may propagate

directly to the node's input.

Figure 3.3 Bias Vector and Weight Matrix for Neural Network

The sample network shown in Figure 3.2 has three layers, with a total of eight

neurons. It accepts four inputs and produces one output. The nodes are connected by

links of varying weight. An n-node network with a given propagation rule is fully


described by an n-dimensional internal bias vector and an n-dimensional square weight

matrix, which gives its connection pattern. Figure 3.3 shows a possible representation

for the network above.

Since each node in the foregoing model can connect to any other, the network is

far too complex for many applications. This research focuses on feedforward layered

networks, with each node's output determined by the sigmoid, or logistic, function. That

is, all input is received on one layer, and the resulting signals propagate forward, one

layer at a time, until the signals reach the last layer. An example of such a neural network

is shown in Figure 3.4, with a corresponding bias vector and weight matrix.


Figure 3.4 Feedforward Layered Network

This particular network contains five nodes in three layers, and only six weighted

links are required to fully connect the network. Each node on the first layer is connected

to each node on the second, and each node on the second layer is connected to the node

on the third. The last layer is referred to as the output layer, since the network's output

is the response of neurons on this layer. The first layer is referred to as the input layer.


Nodes on this layer differ from others in a feedforward network; they are strictly linear

in that the output of a node is simply its one allowable input. Nodes on the input layer

have no internal biases.

All other layers in the network are referred to as hidden layers, since they are not

accessible to the outer environment. Nodes in these hidden layers serve as detectors

of relationships between the input and the corresponding output1. The AND logic gate

mapping of Figure 3.5 serves as a simple example of a function which can be described

by hidden nodes in a network. The network's input space is partially determined by the

number of nodes in the first layer; for an m-node input layer, it is a subspace of Rm.

The input space for a network solving the AND function consists of all pairs of numbers

having values between zero and one.

x1  x2 | t
 0   0 | 0
 0   1 | 0
 1   0 | 0
 1   1 | 1

Figure 3.5 Definition of the AND Function

The operation of a neural network consists of the presentation of a set of input and

subsequent propagation of this input through the network. The 5-node network shown

in Figure 3.4 can be used to illustrate the forward propagation of input signals. In this

example, inputs x1 and x2 are presented to Node 1 and Node 2, respectively. Since

nodes on the input layer have linear transfer rules,

1 No more than two hidden layers are needed for the most complex mappings [40].


y1 = x1   (3.1)

y2 = x2.   (3.2)

Activation for nodes on the other layers is the sum of the node's internal bias and

weighted outputs passed to it; that is,

x3 = w13 y1 + w23 y2 + b3   (3.3)

x4 = w14 y1 + w24 y2 + b4.   (3.4)

Output for all nodes not on the input layer is determined by a sigmoid function2, such that

y3 = 1 / (1 + e^(−x3))   (3.5)

y4 = 1 / (1 + e^(−x4))   (3.6)

x5 = w35 y3 + w45 y4 + b5   (3.7)

y5 = 1 / (1 + e^(−x5)).   (3.8)

Hence, y5 is the network's response to inputs xi and x2. The method is called "forward

propagation" because node responses on a given layer can only be calculated after those

on the preceding layer are found.

2 For sigmoid function output, note that y_n → 0 as x_n → −∞ and y_n → 1 as x_n → +∞.
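The forward pass of Equations (3.1)-(3.8) can be sketched as follows; the dictionary containers for the weights and biases are conveniences of this sketch, and any numerical values supplied to it are illustrative rather than those of Figure 3.4:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x1, x2, w, b):
    """Forward propagation through the 5-node network of Figure 3.4.
    w maps (from_node, to_node) pairs to link weights; b maps node
    numbers 3-5 to internal biases."""
    y1, y2 = x1, x2                            # input nodes are linear: Eqs. (3.1)-(3.2)
    x3 = w[(1, 3)]*y1 + w[(2, 3)]*y2 + b[3]    # Eq. (3.3)
    x4 = w[(1, 4)]*y1 + w[(2, 4)]*y2 + b[4]    # Eq. (3.4)
    y3, y4 = sigmoid(x3), sigmoid(x4)          # Eqs. (3.5)-(3.6)
    x5 = w[(3, 5)]*y3 + w[(4, 5)]*y4 + b[5]    # Eq. (3.7)
    return sigmoid(x5)                         # Eq. (3.8): network response y5
```

Node responses on a layer are computed only after those on the preceding layer, which is the "forward" order the text describes.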


In general, then, forward propagation consists of passing weighted and summed

input signals through a chosen nonlinearity. It presumes knowledge of the network's

bias vector and weight matrix. A node's internal bias can be thought of as the weight

of a link connecting the node to one whose output is always unity, so the procedure for

finding biases is identical to that for finding weights. Any further reference to network

weights shall be understood to include biases.

Once activation and output functions are chosen, a neural network is completely

described by its weights. Since a given neural network solves a specific problem, or

function, finding the weights for the network is equivalent to finding the input/output

relationship that describes the function. For example, the weights defining the network

that solves the AND gate of Figure 3.5 describe the function

f(x1, x2) = 1 if x1 > 1/2 and x2 > 1/2; 0 else,

though not explicitly. Because the weights of a network can represent a given function,

neural networks are especially appropriate and powerful when used to find relationships

that are difficult to describe explicitly, such as that between power system variables and

state acceptability.

3.3 Training Algorithms

A neural network can be categorized by its methods of learning and recall. Recall

can be either feedforward, as illustrated in Section 3.2, or feedback, where node responses

are recurrent. Learning can be supervised, given sample inputs and desired outputs, or

unsupervised, using only inputs.

The Hopfield net [41] is typical of unsupervised-learning / feedback-recall networks.

This network consists of n neurons on one layer, with each neuron connected to all

others and to itself. The net receives and produces n-dimensional, binary input and


output vectors. Its most prominent application is as an associative memory, where a

noisy signal is presented and the most appropriate memorized signal is produced. Given

S patterns to be learned by the net, the n x n symmetric weight matrix is computed using

the sum of outer products of the pattern vectors.

During recall, an input vector is presented and the neurons produce output in discrete

time until a stable response occurs. At any time step, the activation for each neuron is

the sum of weighted outputs from all neurons at the previous time step; this sum is then

passed through a step threshold function to produce the node's new output. The storage

capacity of a Hopfield net decreases in a logarithmic fashion with the number of neurons.
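A minimal sketch of the weight computation and recall just described, assuming bipolar (+1/-1) pattern vectors and synchronous updates of all nodes at each discrete time step:

```python
def hopfield_weights(patterns):
    """Symmetric weight matrix as the sum of outer products of the
    bipolar pattern vectors, with each neuron connected to all others
    and to itself, as in the description above."""
    n = len(patterns[0])
    return [[sum(p[i] * p[j] for p in patterns) for j in range(n)]
            for i in range(n)]

def recall(w, state, max_steps=10):
    """Discrete-time recall: each node passes the weighted sum of all
    node outputs from the previous step through a step threshold,
    until the response is stable."""
    n = len(state)
    for _ in range(max_steps):
        new = [1 if sum(w[i][j] * state[j] for j in range(n)) >= 0 else -1
               for i in range(n)]
        if new == state:
            return new
        state = new
    return state
```

Presenting a noisy version of a stored pattern drives the net toward the memorized pattern, which is the associative-memory behavior noted in the text.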

The Brain-State-in-a-Box, or BSB [42], is an example of supervised-learning /

feedback-recall networks. It has the same topology as the Hopfield net and is used

primarily for pattern completion. Given S patterns to be learned, synaptic weights are

determined iteratively by the learning algorithm. Briefly, starting from initial random

values and the first pattern, the weight from one node to another (or to itself) is changed

by the product of the first node's output and the difference between the second node's

input and output. When all weights are updated, the next pattern is presented and the

process is repeated. The set of patterns is presented until all differences between node

inputs and outputs are negligible.

Recall is performed in the same manner as that in the Hopfield net. The only

difference is that a ramp threshold function is used to give a node's output. The storage

capacity of the BSB increases with the number of neurons, but this advantage over the

Hopfield net is gained at the expense of a greatly increased learning time.

The self-organizing feature map [43] is an unsupervised-learning / feedforward-recall

network that is used primarily for pattern classification. This fully connected neural net

has an n-node input layer, which accepts n-dimensional patterns, and an m-node output


layer which indicates the one class to which a pattern belongs. For each of S patterns

to be learned by the net, the learning algorithm alters only those weights leading to the

output node whose Euclidean distance from the inputs is smallest.

This competitive learning rule allows only one of the m output nodes to fire for

each pattern. When new patterns are presented, the output node whose weights best

correspond to the inputs is caused to fire, while all other node responses are muted. The

feature map can also be trained to give continuous-valued responses at all output nodes;

in this case, the net serves as a data quantizer and the outputs represent a set of features,

or characteristics, of the input patterns3.
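The competitive rule described above can be sketched as a single training step; the learning rate eta and the in-place update are assumptions of this sketch (reference [43] also adapts a neighborhood around the winning node, which is omitted here):

```python
def train_som_step(weights, x, eta=0.5):
    """One competitive-learning step: only the weights leading to the
    output node whose weight vector has the smallest Euclidean distance
    from the input are altered, moving toward the input."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    winner = dists.index(min(dists))
    weights[winner] = [wi + eta * (xi - wi)
                       for wi, xi in zip(weights[winner], x)]
    return winner
```

On recall, the same distance comparison fires the one output node whose weights best correspond to the input, while all other responses are muted.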

Finally, the elementary perceptron [44] is a supervised-learning / feedforward-recall

network that is also used for pattern classification. It has the same topology as the self-

organizing feature map and is the precursor to the feedforward layered network described

in Section 3.2. Given S n-dimensional input patterns, each with an m-dimensional desired

output, the learning rule randomly assigns bipolar (±1) values to the weights. The first

pattern is presented to the input nodes, and each output node has an activation equal to

the cumulative product of inputs and weights leading to the output node. This activation

is then passed through a bipolar threshold function and compared to the desired node

response for this pattern, and the difference is used to adjust the connected weights. The

next pattern causes further adjustment to the weights, and the pattern set is presented

until all differences between actual and desired responses are negligible.

On recall, a new pattern is presented, and output nodes form activations and bipolar

responses. The output indicates the one of 2^m classes to which the new pattern belongs.

If the classes defined by the initial patterns are linearly separable, then an optimal set of

weights can be found to map the patterns.

3 See the network in Section 3.4.


In order for a neural net to learn the "rules" for solving a problem, data sets describing

the problem must be given. In supervised learning, these data sets normally consist of

input vectors and desired, or "target," output vectors for each. The truth table shown in

Figure 3.5 contains four such input/output vectors. These vectors form a full training

set for a neural network with supervised learning; that is, they describe the full range

of expected inputs and associated, desired outputs. The multilayered perceptrons4 used

in this research (and described in Section 3.2) are trained by a procedure called the

Backpropagation Learning Algorithm [45], alternately known as the Generalized Delta

Rule. It is an extension of the perceptron convergence procedure5 that allows for the

inclusion of hidden layers in the network.

The information content of a network, relating all possible vectors to their associated

outputs, is contained in the network's weight matrix, W. If respective outputs for a set of

input vectors is known, then a weight matrix can be found that relates each input vector

in this training set to its associated output; that is, the weights minimize the squared error

between the network's outputs and the known outputs. Hence, assuming the set of input

vectors is representative of the input space, the resulting weights can be used to find the

output associated with an arbitrary input vector.

Finding an appropriate weight matrix for a multilayered perceptron is equivalent to

training the network to specify the correct output for a given input vector. Thus, the set

of input vectors and known outputs is used by the training algorithm to iteratively change

the weights (from random initial guesses) until the network has learned the input/output

relationships implicit in the set, if such relationships exist.

4 Although other neural networks are described in this section, further references to "neural networks" are meant to describe
"multilayered perceptrons" only.
5 Unlike the perceptron convergence procedure, the Backpropagation learning rule is not proven to converge, even if vectors in the
training set fall into linearly separable regions.

The total output error of a multilayered perceptron with one output, y_out, is defined as

E = Σ_p E^p = Σ_p (1/2)(t^p − y_out^p)²,   (3.9)

where t^p denotes the target output for the pth training vector, and E^p denotes its error.

Using the network of Figure 3.4 as an example, small random weights6 are first assigned

to the network links. These weights are

w_sr and b_r for s = 1,...,4 and r = 3,...,5,

with the weights of nonexistent links (w15, for instance) set to zero. There are two sets

of weights in this network: those leading to the output node from the hidden layer,

and those leading to the hidden layer from the inputs. For optimal weights, the gradient

of the total error function with respect to each weight is zero. Hence, for a weight, w_k5,

leading from hidden node k to the output node,

∂E/∂w_k5 = Σ_p (dE^p/dy5)(dy5/dx5)(∂x5/∂w_k5)

         = Σ_p [(t^p − y5)(−1)] [e^(−x5) / (1 + e^(−x5))²] [y_k]

         = −Σ_p (t^p − y5) y5 (1 − y5) y_k.   (3.10)
Defining the "assigned error" at the output node for vector p as

δ5^p = y5^p (1 − y5^p)(t^p − y5^p),   (3.11)

a necessary condition for local optimality of w_k5 is

Σ_p δ5^p y_k^p = 0,   k = 3, 4.   (3.12)

6 Small initial weights correspond to low levels of node activation. A low activation level initially places each node in the linear
region of its output function, so that nodes are appropriately "uncommitted" to being on or off.


Similarly, for a weight, wjk, leading from input node j to hidden node k,

∂E/∂w_jk = Σ_p (dE^p/dy5)(dy5/dx5)(∂x5/∂y_k)(dy_k/dx_k)(∂x_k/∂w_jk)

         = Σ_p (−δ5^p) [w_k5] [e^(−x_k) / (1 + e^(−x_k))²] [y_j]

         = −Σ_p δ5^p w_k5 y_k (1 − y_k) y_j.   (3.13)

Defining the assigned error at hidden node k for vector p as

δ_k^p = y_k^p (1 − y_k^p) w_k5 δ5^p,   (3.14)

a necessary condition for local optimality of wjk is

Σ_p δ_k^p y_j^p = 0,   j = 1, 2,   k = 3, 4.   (3.15)

This analysis can be easily extended for perceptrons with multiple hidden layers and multiple output nodes.


From the initial random values, weights are found using the well-known Steepest

Descent algorithm of nonlinear programming [4]. In short, for the pth training vector,

the input and output of each node is computed using Equations (3.1)-(3.8). The

network's output is compared to the known output, and the assigned error for the output

node is found using Equation (3.11). Next, assigned errors for hidden nodes7 are found

by Equation (3.14), and weight updates are computed by

Δw_sr^p = η y_s^p δ_r^p,   (3.16)

where η is a "learning rate" between 0 and 1. This procedure is followed for each

training vector, in turn.

7 Note that no error is attributable to input nodes, whose transfer rules are linear.


When weight changes for each vector have been computed, the weights are updated,

in the hth iteration, by

w_sr^(h+1) = w_sr^(h) + Σ_p Δw_sr^p(h) + α Σ_p Δw_sr^p(h−1),   (3.17)
p p

where α is an "acceleration rate," also between 0 and 1, that adds momentum to the search

procedure and helps to avoid local minima. Weights are updated in this manner, with

iterative presentation of the training set, until Equations (3.12) and (3.15) are satisfied.
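The procedure of Equations (3.9)-(3.17) can be sketched for the 2-2-1 network of Figure 3.4. The learning rate, acceleration rate, epoch count, and random seed below are illustrative choices, and biases are folded in as weights from a constant-unity node, as suggested in the text:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(patterns, eta=0.5, alpha=0.5, epochs=5000, seed=1):
    """Batch Backpropagation with momentum for a 2-2-1 perceptron.
    patterns is a list of ((x1, x2), target) pairs; biases are index 2
    of each weight row, fed by a constant input of 1.0."""
    rnd = random.Random(seed)
    # w[k][j]: weight from input j (j = 2 is the bias input) to hidden node k
    w = [[rnd.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
    # v[k]: weight from hidden node k (k = 2 is the bias input) to the output
    v = [rnd.uniform(-0.5, 0.5) for _ in range(3)]
    dw_prev = [[0.0] * 3 for _ in range(2)]
    dv_prev = [0.0] * 3
    for _ in range(epochs):
        dw = [[0.0] * 3 for _ in range(2)]
        dv = [0.0] * 3
        for (x1, x2), t in patterns:
            xin = (x1, x2, 1.0)
            yh = [sigmoid(sum(w[k][j] * xin[j] for j in range(3)))
                  for k in range(2)]
            yo = yh + [1.0]
            y5 = sigmoid(sum(v[k] * yo[k] for k in range(3)))
            d5 = y5 * (1.0 - y5) * (t - y5)            # Eq. (3.11)
            for k in range(3):
                dv[k] += eta * d5 * yo[k]              # Eq. (3.16)
            for k in range(2):
                dk = yh[k] * (1.0 - yh[k]) * v[k] * d5  # Eq. (3.14)
                for j in range(3):
                    dw[k][j] += eta * dk * xin[j]
        # Eq. (3.17): batch update with momentum from the previous iteration
        for k in range(3):
            v[k] += dv[k] + alpha * dv_prev[k]
        for k in range(2):
            for j in range(3):
                w[k][j] += dw[k][j] + alpha * dw_prev[k][j]
        dw_prev, dv_prev = dw, dv

    def net(x1, x2):
        xin = (x1, x2, 1.0)
        yo = [sigmoid(sum(w[k][j] * xin[j] for j in range(3)))
              for k in range(2)] + [1.0]
        return sigmoid(sum(v[k] * yo[k] for k in range(3)))
    return net
```

Trained on the four vectors of Figure 3.5, the returned network separates the (1, 1) input from the other three, illustrating the full training set for the AND mapping.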

Equations (3.12) and (3.15) are known as "soft" criteria for stopping the iterative

procedure because they only ensure that a sum of weighted node errors is negligible. An

alternative, known as "hard" criteria, ensures that all node errors are negligible for each

training vector. That is, for each training vector p, δ_k^p = 0, k = 3, 4, 5. Clearly, "soft"

criteria are met by satisfying the "hard" criteria, but this alternative guarantees that the

neural net will give the correct result for each vector in the training set.

Note that the assigned error for the output node is necessarily computed before those

for hidden nodes; that is, unlike the forward operation of a neural net, training occurs

in a backward fashion. The Backpropagation Learning Algorithm consists of repeatedly

passing the training set through the perceptron until its weights minimize the output

errors over the entire set.

During recall, the first hidden layer effects a nonlinear transformation of the input

space into an h1-dimensional "image space," where h1 is the number of nodes in the

first hidden layer. The input space is partitioned into at most 2^h1 decision regions by

h1 hyperplanes (some of which may be redundant), and sets of similar input vectors

are assigned to separate regions. In this way, the hidden layer performs intermediate

classification of the input patterns [47].


If the network contains a second hidden layer with h2 nodes, then a second

transformation partitions the space defined by the first hidden layer into an h2-dimensional space,

and patterns are reassigned to at most 2^h2 regions based on similarities. The output node

performs a final transformation, partitioning the space into two regions, and each input

pattern is placed into one of the two classes (corresponding to outputs of either 0 or 1).

Geometrically, the learning process determines the best successive partitions of the input

space for mapping the training vectors to their associated targets.

In addition, the decision regions described by optimal weights are related to several

discriminant functions that specify the class to which an arbitrary input vector belongs

[48]. Generally, a pattern is assigned to a specific region if the associated discriminant

function is greater at that pattern than the discriminant functions for all other regions. The

projection of these functions onto the input space is a set of decision surfaces that separate

classes of input patterns into regions. Hence, the training of a neural net is the iterative

formation of discriminant functions and decision regions associated with the application.

3.4 Principal Components Analysis

Figure 3.6 Principal Components Analyzer

Principal components analysis [49] is a statistical method that can be used to reduce

the dimensionality of a highly correlated set of variables by linearly transforming it


into an uncorrelated set of "principal components." A network that performs a linear

transformation of variables is shown in Figure 3.6.

A set of S input vectors, each denoted y = [y1 y2 ... yN], can be "standardized" so

that each element, yi, has a standard normal probability distribution. That is, if means

and variances of the elements are

μi = (1/S) Σ_p y_i^p   (3.18)

σi² = (1/S) Σ_p (y_i^p)² − μi²,   (3.19)

respectively, then standardized inputs are

y_i^p ← (y_i^p − μi) / σi,   p = 1,...,S.   (3.20)

The covariance between standard normal random variables yi and yj is the correlation

between them, so the input correlation matrix is R = {rij }, where

r_ij = (1/S) Σ_p y_i^p y_j^p,   (3.21)

and each r_ij is bounded by ±1. Diagonal terms of R are unity because these terms are

autocorrelations. The trace of R, or sum of its diagonal terms, is the total variance of

each standardized input vector, y; that is, tr{R} = number of input elements = N.

The objective of principal components analysis is to linearly transform the elements of

y so that they are uncorrelated; that is, the correlation matrix of the principal components

should be diagonal. The transformed vectors, z, contain the same energy as the input

vectors. However, because the principal components are linearly independent, most of

the input energy is compacted into a few variables with much higher variances than the

others. Hence, most of the energy in the original variables can be extracted from a few

principal components, such as the M variables (M < N) shown as output in Figure 3.6.


Again, the objective is a linear transformation of the input correlation matrix,

Λ = U^T R U, so that Λ is diagonal; that is, Λ = diag{λi}. Matrix U can then serve

to represent the weights of the network in Figure 3.6. In addition, because Λ should

preserve the total variance (or energy content) in R, U is required to be orthonormal. In

other words, if the columns of U are orthogonal and normalized, then

U U^T = I_N  ⟹  tr{Λ} = tr{R} = N,   (3.22)

where IN denotes the N x N identity matrix.

Let uk denote the kth column of the transformation matrix; then U is derived as



R u_k = λ_k u_k,   k = 1,...,N

⟹  (R − λ_k I_N) u_k = 0,   k = 1,...,N,   (3.23)

which implies that λ1,...,λN are eigenvalues of R, and u1,...,uN are associated

eigenvectors, all orthogonal. If, in addition, ||u_k|| = 1, then U = [u1 ... u_k ... u_N]

is orthonormal. The kth column of U is the vector of weights from all inputs to the kth

output, zk; that is,

z_k = u_k^T y.   (3.24)

The eigensystem of a correlation matrix can be computed using the Singular-Value

Decomposition procedure of linear algebra [50] or newer training algorithms for performing

principal components analysis [51-53].
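For the two-variable case, where the eigensystem of R has a closed form, the whole procedure of Equations (3.18)-(3.24) can be sketched as follows; the function names are illustrative:

```python
import math

def standardize(samples):
    """Eqs. (3.18)-(3.20): subtract each column mean and divide by the
    column standard deviation, giving zero-mean, unit-variance inputs."""
    S, N = len(samples), len(samples[0])
    mu = [sum(row[i] for row in samples) / S for i in range(N)]
    sd = [math.sqrt(sum((row[i] - mu[i]) ** 2 for row in samples) / S)
          for i in range(N)]
    return [[(row[i] - mu[i]) / sd[i] for i in range(N)] for row in samples]

def principal_components_2d(samples):
    """Eigensystem of the 2x2 correlation matrix [[1, r], [r, 1]], whose
    eigenvalues are 1 + |r| and 1 - |r|, and projection onto the first
    eigenvector per Eq. (3.24)."""
    y = standardize(samples)
    S = len(y)
    r12 = sum(p[0] * p[1] for p in y) / S      # Eq. (3.21); r11 = r22 = 1
    lam = [1.0 + abs(r12), 1.0 - abs(r12)]     # eigenvalues, largest first
    s = 1.0 if r12 >= 0 else -1.0
    u1 = [1.0 / math.sqrt(2.0), s / math.sqrt(2.0)]
    z1 = [u1[0] * p[0] + u1[1] * p[1] for p in y]  # first principal component
    return lam, z1
```

Perfectly correlated inputs give λ1 = 2 and λ2 = 0, so all of the input energy is compacted into the first component; this is the data-reduction behavior described below.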

Since R is real and symmetric, its eigenvalues are all real and positive. Hence, Λ and

U can be arranged such that λ1 ≥ λ2 ≥ ... ≥ λN. Then, the first principal component,


z1, has the highest portion of the total input variance (or energy), and σ²_z1 = λ1; the

second principal component, z2, has the next highest portion of the total input variance,

and σ²_z2 = λ2; and so on. Recall that the total variance is

σ²_total = N = tr{R} = tr{Λ} = λ1 + λ2 + ... + λN,   (3.25)

so σ²_z1/σ²_total = λ1/N, σ²_z2/σ²_total = λ2/N, etc. Hence, the average amount of input

energy extracted by the first M principal components can be computed. Alternately,

the number of principal components needed to extract a given percentage of the input

energy, on average, can be determined. Generally, for highly correlated input variables,

the required number of components is much smaller than the number of variables, which

makes principal components analysis a valuable data-reduction tool.


4.1 Objectives of Research

This work is motivated by two basic objectives: to determine if a multilayered

perceptron can qualitatively replace the load flow study in certain applications1 and to

determine whether adequacy indices computed using a neural network strategy are as

accurate as those conventionally computed using load flow studies. In other words, this

research seeks to answer the following questions:

- Can a perceptron dependably specify the acceptability of an arbitrary system

state that is not presented during training?

- For this application, does the use of principal components analysis aid or

hamper the performance of a perceptron?

- Does adequacy assessment using a perceptron compare favorably with the

conventional technique in terms of accuracy of indices?

The strategy by which these objectives are achieved involves choosing variables for

use as neural net inputs, finding a training set, and training and testing a neural net

for determining state acceptability. The strategy also involves determining the proper

number of principal components, training and testing another neural net, and computing

reliability indices.

1 Although the output of a neural net in this work is taken as a binary (yes/no) number, it can be used as a probability if certain
training conditions are satisfied. Hence, neural nets can be used quantitatively.

4.2 Research Procedures

A power system failure is defined as an unacceptable system state, which occurs

under any of the following conditions:

- voltage magnitude at a load bus is out of bounds,

- power flow magnitude on a line is out of bounds,

- real generation at the swing bus is out of bounds,

- reactive generation at the swing bus or other voltage-controlled bus is out of bounds,

- a bus or set of buses is isolated, or

- the system operating point is unstable (i.e., voltage collapse is likely).

From Section 2.4, the adequacy of a power system can be described in terms of the

probability of system failure, the mean duration of system failure, the frequency of

system failure, and the expected unserved demand during a system failure. Calculation

of these indices requires, as data, the historical mean-time-to-failure and mean-time-to-

repair for all transmission lines and generating units, as well as forecasted hourly loads

for the year in question.

The objectives stated in the previous section can be achieved by computing exact

values and conventional estimates for these indices, then comparing them to values

obtained using a neural network with, and without, the use of principal components.

A truly exact calculation of these indices, for a given power system, requires testing all

possible system states for acceptability. As a practical matter, contingencies with more

than two failed elements may be omitted from the study, and no more than 20 evenly

spaced load levels are necessary. Practically, then, an "exact" calculation of the indices

begins with a power flow study of each single- and double-element contingency at 20

discrete load levels.


For a typical power system, the number of states which must be studied to give exact

indices is still very large. As a consequence, the conventional practice involves further

truncating the contingency space to a manageable size and reducing the number of load

levels studied. In this work, conventional estimates of the indices are based on 100 of the

most probable contingencies, which are studied at four load levels. Hence, 400 load-flow

routines (as opposed to thousands) are executed during a conventional estimation of the

chosen reliability indices.

Regardless of whether exact values or reduced-state estimates are sought, the accept-

ability of each system state being considered must be determined before computing the

indices. Also, an annual load curve for the system must be formed. From Section 2.4,

the procedure for computing exact indices, given a system load curve and mean times to

failure (mei) and repair (me,) for each element, is as follows:

1. Discretize the annual load curve into load states, ℓ. If n_ℓ is the number of occur-

rences of ℓ and t_ℓk is the duration of ℓ at its kth occurrence, find transition rates

λ_ℓ = n_ℓ / Σ_k t_ℓk and probabilities p_ℓ = (Σ_k t_ℓk) / 8760. In this work, discrete load states are

5%, 10%, 15%, ..., 100% of peak load.

2. Compute transition rates λ_ei = 1/m_ei and λ_er = 1/m_er, and probabilities

p_ei = m_ei / (m_ei + m_er) and p_er = 1 − p_ei, for each element.

3. Determine the acceptability of each system state, s, which is a combination of a load

state and a set of element states, by performing a load flow study. For unacceptable

system states, determine the loss of load, ds, as the sum of load increments to an

acceptable state.

4. For each unacceptable system state, compute the state probability,

P_s = p_ℓ (Π_{e,in} p_ei) (Π_{e,out} p_er),   (4.1)


and the mean duration in the associated outage state,

m_o = 1 / (Σ_{e,in} λ_ei + Σ_{e,out} λ_er).   (4.2)

5. Let z denote the set of unacceptable system states. The probability of system failure is

P_z = Σ_{s∈z} P_s,   (4.3)
the expected duration of system failure is

m_z = (1/P_z) Σ_{s∈z} m_o P_s, in hours,   (4.4)

the frequency of encountering a system failure is

f_z = 8760 P_z / m_z, in failures/year,   (4.5)

and the expected unserved demand during system failure is

d_z = (1/P_z) Σ_{s∈z} d_s P_s, in % of peak load.   (4.6)
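Given the per-state quantities P_s, m_o, and d_s for the unacceptable states, step 5 reduces to a few sums. A sketch, with illustrative function and variable names:

```python
def reliability_indices(states):
    """Step 5 of the procedure: states is a list of (P_s, m_o, d_s)
    tuples, one per unacceptable system state. Returns the probability,
    mean duration (hours), frequency (failures/year), and expected
    unserved demand (% of peak load) of system failure."""
    Pz = sum(p for p, _, _ in states)            # probability of failure
    mz = sum(p * m for p, m, _ in states) / Pz   # mean failure duration
    fz = 8760.0 * Pz / mz                        # failure frequency
    dz = sum(p * d for p, _, d in states) / Pz   # expected unserved demand
    return Pz, mz, fz, dz
```

With a single unacceptable state of probability 0.01 and mean outage duration 20 hours, the frequency works out to 8760 x 0.01 / 20 = 4.38 failures per year.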

Conventional estimates of these indices use the same procedure, with two exceptions:

* load states, ℓ, are restricted to 25%, 50%, 75%, and 100% of peak load, so step 1

involves only four loads, and

* steps 3-5 are completed only for the 100 most probable contingencies, which are

determined in step 2 by computing probability P_o = (Π_{e,in} p_ei)(Π_{e,out} p_er) for

each outage state.

As an example, consider the 3-bus power system of Figure 4.1. The swing generator

at bus 1, element 'a,' is rated at 45 MW and supplies a local peak load of 10 MW and

5 MVAR; it can consume up to 10 MVAR or provide up to 25 MVAR. The bus has a

nominal voltage magnitude of 1.02 per unit on a 23-kV base. The two generators at bus

2, elements 'b' and 'c,' each supply 25 MW at a per-unit voltage of 1.01. Combined,


they can consume up to 20 MVAR or provide up to 40 MVAR. Bus 2 has a local peak

load of 20 MW and 10 MVAR.

Figure 4.1 3-Bus Sample Power System

A peak load at bus 3 of 50 MW and 20 MVAR is supplied through two transmission

lines connected to buses 1 and 2. Element 'd,' the line connecting bus 1 to this load,

has a resistance of 0.01, a reactance of 0.10, and a shunt susceptance of 0.005, all in

per unit on a 100-MVA base. The line connecting bus 2 to the load, element 'e,' has

a resistance of 0.02, a reactance of 0.18, and a susceptance of 0.009, also in per unit.

Each line can carry a load of up to 100 MVA.

The outage space consists of single- and double-element failures of elements 'a'-'e,'

as well as the base case. Element 'a' has a mean-time-to-failure (mttf) of 1550 hours

and a mean-time-to-repair (mttr) of 35 hours. Element 'b' has mttf of 1630 hours and

mttr of 25 hours. Element 'c' has mttf of 1710 hours and mttr of 30 hours. Element 'd'

has a mttf of 12,200 hours and a mttr of 10 hours, and element 'e' has mttf of 14,100

hours and mttr of 12 hours.

Coincident hourly loads on the system for one year are given in the Appendix2. From
this curve, probabilities3 associated with load states of 5%, 10%, ..., 100% of peak load are shown
2 With the exception of the GRU system, the load model used for all systems in this work is taken from the IEEE Reliability Test
System [54-55]. Although this system was developed as a benchmark for comparing reliability evaluation techniques, it is an
ill-conditioned system [56] and its size is too large, given available computing facilities, to be considered here.
3 Note that the system must supply, at all times, a load greater than 30% of peak load. This means that some unacceptable system
states may have zero probabilities of occurrence and may be ignored.


in Table 4.1. Next, failure probabilities for elements 'a' through 'e' are 0.0221, 0.0151,

0.0172, 0.0008, and 0.0009. Transition rates to failed element states are 0.0006, 0.0006,

0.0006, 0.0001, and 0.0001, respectively; transition rates to working states are 0.0286,

0.0400, 0.0333, 0.1000, and 0.0833.
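These element figures follow from the standard two-state (working/failed) model: the failure probability is mttr/(mttf + mttr), the transition rate into the failed state is 1/mttf, and the transition rate back to the working state is 1/mttr, all per hour. A small sketch reproduces the values quoted above:

```python
def element_stats(mttf, mttr):
    """Two-state element model: steady-state failure probability and hourly
    transition rates from mean times to failure and repair (in hours)."""
    q = mttr / (mttf + mttr)  # probability of residing in the failed state
    lam = 1.0 / mttf          # transition rate into the failed state
    mu = 1.0 / mttr           # transition rate back to the working state
    return q, lam, mu

# Element 'a' of the 3-bus system: mttf 1550 hours, mttr 35 hours.
q_a, lam_a, mu_a = element_stats(1550, 35)
```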

Table 4.1 Probabilities Associated with Load States

Load Level   Probability     Load Level   Probability
     1         0.0000            11         0.1188
     2         0.0000            12         0.0959
     3         0.0000            13         0.1223
     4         0.0000            14         0.1102
     5         0.0000            15         0.0818
     6         0.0000            16         0.0840
     7         0.0015            17         0.0755
     8         0.0349            18         0.0358
     9         0.0976            19         0.0110
    10         0.1285            20         0.0022

In Table 4.2, the acceptabilities of the 16 outage states at 20 load levels are shown.

(In the table, dashes represent failed elements, and the 20th load level represents 100% of

peak load.) For this system, 320 system states must be considered for an exact calculation

of reliability indices, while only 64 need be considered using the conventional technique.

(Since the number of elements is so small, no truncation of the outage space is necessary

in computing a conventional estimate.)

Unacceptable system states in Table 4.2 are those represented by 'n,' or 'no.' From

the table, the number of failed system states is 257; 52 of the 64 states considered by

the conventional technique are unacceptable. The exact probability of system failure is

0.0277, with a conventional estimate of 0.0304. The expected duration of a failure event


is 30.5 hours, while the estimate gives 30.2 hours. The frequency of failure is 8 times

per year; the estimate is 9 times per year. The expected unserved demand during failure

is 44.7 MW, with an estimate of 50.4 MW. For this example, only slight errors occur

from reducing the number of load levels considered.

Table 4.2 Acceptabilities of 3-Bus System States

Outage                      Acceptability by Load Level
State   Elements   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
  1     abcde      y y y y y y y y y  y  y  y  y  y  y  y  y  y  y  y
  2     -bcde      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
  3     --cde      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
  4     -b-de      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
  5     a-cde      y y y y y y y y y  y  y  y  y  y  y  y  n  n  n  n
  6     a--de      y y y y y y y y y  y  y  n  n  n  n  n  n  n  n  n
  7     ab-de      y y y y y y y y y  y  y  y  y  y  y  y  n  n  n  n
  8     -bc-e      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
  9     -bcd-      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
 10     a-c-e      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
 11     a-cd-      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
 12     ab--e      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
 13     ab-d-      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
 14     abc-e      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
 15     abc--      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n
 16     abcd-      n n n n n n n n n  n  n  n  n  n  n  n  n  n  n  n

4.3 Training Issues

The neural network approach to adequacy assessment involves determining the

acceptability of each power system state using a trained perceptron instead of a load

flow routine. In order to compare the actual indices and conventional estimates to

those obtained using perceptrons, issues regarding the training of such networks must

be resolved. Before training a neural net, a training set must be formed which is

representative of the entire spectrum of system states. First, a set of power system

variables must be chosen for input to the neural net, then pseudorandom "snapshots" of


the system must be gathered for determining the system's decision regions. The variables

selected to represent a system state should form a consistent set in that no variable, or

set of variables, should duplicate information provided by another set of variables. Such

a set is readily available in the variables used as input to a load flow routine.

With reference to Section 2.3, inputs to the load flow routine are voltage magnitudes

at the generator buses, real power injected at all but the swing bus, reactive power injected

at all load buses, and complex admittance matrix elements. Limitations on bus voltages,

generator injections, and line flows are not part of a routine's input but are used after
its execution to determine violations. Also, since sparsity programming allows the load
flow routine to store only the nonzero admittance elements, only nonzero matrix
elements need be provided as input to a neural net.

Hence, the inputs to a perceptron that determines acceptable states of the system in

Figure 4.1 are

- voltage magnitudes at generator buses 1 and 2 (2 inputs),

- real power injected at buses 2 and 3 (2 inputs),

- reactive power injected at load bus 3 (1 input),

- real parts of nonzero admittance matrix elements (5 inputs), and

- imaginary parts of nonzero admittance matrix elements (5 inputs).
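The admittance inputs follow directly from the line data of Figure 4.1. A sketch of the bus admittance matrix construction (assuming the quoted susceptance is each line's total charging, split evenly between its two ends):

```python
def ybus(n, lines):
    """Build the bus admittance matrix from line data.
    lines: (from_bus, to_bus, r, x, b_shunt) with 1-based buses and b_shunt
    the line's total charging susceptance."""
    Y = [[0j] * n for _ in range(n)]
    for i, j, r, x, b in lines:
        y = 1.0 / complex(r, x)               # series admittance
        Y[i-1][i-1] += y + complex(0, b / 2)  # each diagonal term picks up
        Y[j-1][j-1] += y + complex(0, b / 2)  # half the shunt charging
        Y[i-1][j-1] -= y                      # off-diagonal terms are -y
        Y[j-1][i-1] -= y
    return Y

# Lines 'd' (bus 1 to 3) and 'e' (bus 2 to 3) of the 3-bus system:
Y = ybus(3, [(1, 3, 0.01, 0.10, 0.005), (2, 3, 0.02, 0.18, 0.009)])
# Distinct nonzero elements (diagonal plus upper triangle): Y11, Y22, Y33,
# Y13, Y23 -- five complex values, i.e. 5 real and 5 imaginary inputs.
nonzero = [(i, j) for i in range(3) for j in range(i, 3) if Y[i][j] != 0]
```

Because no line connects buses 1 and 2, Y12 is zero, leaving exactly the five nonzero elements counted above.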

The task of choosing states of the power system appropriate for training is more

complicated. The set of states should also be consistent in not providing redundant

information to the neural net, but it must include patterns from various parts of the state

space if the network is to generalize its knowledge to new states. Ideally, training patterns

should lie on either side of each decision region in the space4. A neural net could perfectly

4 See, for instance, the margin states along the PM/NM boundary in Figure 2.2.


replicate the decision surfaces using these patterns but, obviously, a priori knowledge of

these surfaces is required for selecting the patterns.

Alternately, categories of states can be described, such as line failures at heavy loads

or generator failures at light loads, and the training set may consist of patterns from

each category. Thus, a simple scheme for forming a training set is to include the base

case, all single-generator failures, a double-generator failure involving each generator, all

single-line failures, and a double-line failure involving each line5, at four load levels. If

n is the number of elements in the system, then exactly 2n - 1 outage states are selected

using this scheme. Referring to Table 4.2, the training set for the system of Figure 4.1

consists of outage states 1, 2, 3, 5, 6, 7, 14, 15, and 16, at four load levels. Hence, 36

load flow routines must be executed in order to train a perceptron for this system.

Another issue that must be resolved before training is the normalization of the input

space. If components of the mean input pattern vary by several orders of magnitude,

then the smaller inputs have little effect on the choice of weights, regardless of their

significance to the application. In addition, if the range of values for any input node

is very large, then all nodes in subsequent layers become saturated6, and learning is

hampered. Since the training algorithm normally pushes node responses to a saturation

level (either zero or one), the network is "tricked" into believing the initial weights are

accurate. Thus, the input range for each vector in the training set must be contracted

(or expanded) to the range [0,1].

For example, if S is the number of training cases, the jth element of every training
vector constitutes a collection of signals, {x_j1, ..., x_jp, ..., x_jS}, for the jth node in the

5 Although the neural network is expected to accurately specify the acceptabilities of contingencies involving both generators and
lines, these types of contingencies are not used in training. A neural net is capable of generalizing to these cases from the given
training vectors.
6 Large input values initially place each node output near one, while very small values place node outputs near zero; node outputs are
thereby predetermined, and assigned node errors are negligible. See Equations (3.11) and (3.14). An alternative to normalization
of inputs is the addition of a small constant to the gradient of the output function at all nodes; this addition prevents the sigmoid
from saturating, even at convergence of the training algorithm.


neural net's input layer. The values of these signals range from min_j to max_j. The
input space [min_j, max_j] is transformed to [0,1] by letting

x'_jp = (x_jp - min_j) / (max_j - min_j) (4.7)

for the pth training case. If N is the number of input nodes, the transformation in matrix
notation is

| x'_11 ... x'_1p ... x'_1S |   | 1/(max_1 - min_1)        0          ...        0          |
| x'_j1 ... x'_jp ... x'_jS | = |        0          1/(max_j - min_j)  ...        0          |
| x'_N1 ... x'_Np ... x'_NS |   |        0                 0          ...  1/(max_N - min_N) |

    | x_11 - min_1 ... x_1p - min_1 ... x_1S - min_1 |
  x | x_j1 - min_j ... x_jp - min_j ... x_jS - min_j | (4.8)
    | x_N1 - min_N ... x_Np - min_N ... x_NS - min_N |
Also, the configuration of a neural net must be determined before training it. The

size of the input layer is specified by the number of inputs in each training vector, and

only one output node is needed to classify the acceptability of a system state. Still, the

number of hidden layers, and the number of nodes on these layers, must be specified. As

shown in Section 3.3, the composition of intermediate layers in a neural net affects the

transformation and partitions of the input space. The degrees of freedom with which the

network forms decision regions are set by the number of hidden layers and the number

of nodes in each. If the network is "overfed," or has too many degrees of freedom, then

the process of assigning patterns to decision regions is made difficult, and the network

may simply memorize training patterns. On the other hand, if a network is "underfed," or

has too few degrees of freedom, then it cannot accurately replicate the decision surfaces

and will fail to map training patterns to their targets.


By the same token, the number of hidden layers and nodes affects the total number

of weights in the network. In addition to increasing the training time, numerous weights

require a large number of training vectors to describe decision surfaces of moderate order;
the iterative training process is therefore further lengthened. Heuristically, the number

of training patterns required is at least twice the number of weights. Except for very

complex problems, one hidden layer usually suffices to classify patterns, but no consensus

exists regarding the number of hidden nodes. In this work, an arbitrarily chosen ratio of

one hidden unit for every four inputs is used, and all hidden nodes lie on one layer.

During training, the learning and acceleration factors determine the speed at which

optimal weights are found. An acceleration factor of 0.9 is applicable to a wide variety

of problems and is used for all simulations in this work. The best learning factor, on the

other hand, varies with the shape of the performance surface, a function of the weight

space which relates a network's total output error to its weights7. The shape of this

curve is determined by the correlations between inputs. In general, uncorrelated inputs

produce smooth performance curves with global minima that can be found rapidly. Highly

correlated inputs produce very rough curves with many local minima, so searching for a

global minimum must be performed at a slow pace.

In other words, the shape of the performance curve cannot be altered, but the rate

at which weights are changed can be set. The learning rate should be set to reduce the

network's total output error at each iteration of the training algorithm. If the total output

error oscillates during the initial stages of training, the learning rate should be reduced.

A learning rate of 0.5 is used for the 3-bus system, but the systems in Chapter 5 all

require a rate of 0.2.

7 This function is obtained by substituting for the network outputs in Equation (3.9), using the forward propagation equations and the training set.


Stopping criteria for the training algorithm must also be chosen. As shown in Section

3.3, either "hard" or "soft" criteria may be used to halt the iterative training process. The

"hard" criteria generally require several times as many iterations as the "soft" ones and,

in many cases, only cause a neural net to memorize a training set instead of learning

generalizations. Hence, only "soft" stopping criteria are used in this work. Also, the

error gradient tolerance determines the allowable distance from the minimum total output

error for convergence. Obviously, the number of iterations varies inversely with the

tolerance. In this work, a tolerance of 0.001 is used.

After training, a decision must be made regarding the distinction between zero and

one when observing continuous-valued network responses; that is, a threshold must

be chosen such that responses above it are "high," and values below it are "low." In

classifying power system states as failed or working, a high threshold is desirable because

false "lows" are less detrimental than false "highs." This is especially true for online

applications where operator responses might depend on a network's output. However,

since the set of unacceptable states found by a perceptron will be larger with the use of a

high threshold, the reliability estimates may be adversely affected. In this work, outputs

greater than or equal to 0.5 represent 1, while numbers less than 0.5 connote 0.
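A minimal sketch of this thresholding (the function name is illustrative; the 0.5 cut is the one used in this work):

```python
def classify(outputs, threshold=0.5):
    """Map continuous perceptron outputs to binary acceptabilities.
    Raising the threshold trades false 'highs' (missed failures) for a
    larger, more conservative set of flagged states."""
    return [1 if y >= threshold else 0 for y in outputs]
```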

The use of principal components brings additional training issues. No bus variables

are reduced by the principal components analyzer (PCA), so only line contingencies are

needed to find appropriate PCA weights. However, the choice of line contingencies used

to find weights remains to be specified. Like the training set for the neural network,

the training set for the PCA must be representative of all line contingencies without

being redundant. The training set for the perceptron contains a representative set of line

outages, but it must be stripped of bus variables in order to be used for the PCA.

For instance, the training set associated with the system in Figure 4.1 has vectors


containing 5 bus variables and 10 admittance variables. Note that admittance variables

are specific to outage states and are independent of load levels; thus, only one load level

for each outage state need be considered. When the remaining vectors are stripped of the

5 bus variables, redundancies are evident. The original training set contains outage states

1, 2, 3, 5, 6, 7, 14, 15, and 16, but states 2, 3, 5, 6, and 7 duplicate the line variables

in state 1 and may be omitted. Hence, the training set for the PCA8 consists only of the

10 admittance variables for outage states 1, 14, 15, and 16. In short, a PCA training set

should contain the admittance variables for each of the line contingencies (including the

base case) represented in the original training set.

Once weights are found for the PCA, a determination must be made as to how many

principal components to retain for training a reduced neural network (called a PCANN).

The total number of principal components is the same as the number of admittance

variables but, since they have no linear correlation, the amount of energy in the first few

components rivals that of the entire set of admittances.

Inspection of the eigenvalues of the admittance correlation matrix is instructive.
Obviously, principal components associated with trivial eigenvalues are useless. Eigenvalues

less than unity are associated with principal components that contain less energy than any

one of the admittances; hence, those components can also be neglected. This method of

choosing principal components is used throughout this work. For the system of Figure

4.1, using the four line contingencies noted above, only two nontrivial eigenvalues are

found. Thus, only two principal components are retained.
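This retention rule can be sketched with a standard eigensystem routine (NumPy is assumed here; the thesis's own programs are separate, and the function names are illustrative):

```python
import numpy as np

def pca_weights(A, min_eigenvalue=1.0):
    """Eigenvalues/eigenvectors of the correlation matrix of A (rows are
    patterns, columns are variables), keeping only the components whose
    eigenvalue is at least min_eigenvalue, largest first."""
    A = np.asarray(A, dtype=float)
    std = A.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)      # guard constant variables
    Z = (A - A.mean(axis=0)) / std            # standardized variables
    C = (Z.T @ Z) / len(A)                    # correlation matrix
    vals, vecs = np.linalg.eigh(C)            # ascending eigenvalues
    keep = [i for i in range(len(vals)) if vals[i] >= min_eigenvalue]
    keep = keep[::-1]                         # largest eigenvalue first
    return vals[keep], vecs[:, keep]

def project(A, W):
    """Transform patterns to their retained principal components."""
    A = np.asarray(A, dtype=float)
    std = np.where(A.std(axis=0) == 0.0, 1.0, A.std(axis=0))
    return ((A - A.mean(axis=0)) / std) @ W
```

On two perfectly correlated variables, for instance, the correlation matrix has eigenvalues 0 and 2, so a single component carrying all of the energy is retained, mirroring the behavior observed for the 3-bus admittances.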

4.4 Results for a 3-Bus System

With these issues resolved, actual indices and conventional estimates for the system

8 Since the original training set in this example contains all possible line contingencies, the PCA training set also contains all four
line outages. This is not the case for larger systems; generally, the PCA only estimates principal components of admittances.

of Figure 4.1 can be compared to estimates from a neural net (NN) and a PCANN.

Load flow analyses are performed on outage states 1, 2, 3, 5, 6, 7, 14, 15, and 16, at

loads of 25%, 50%, 75%, and 100% of peak load. The resulting acceptability of each

of these cases, as well as the associated bus and admittance variables, are used to form

a 36-vector training set for a neural network.

A perceptron is configured with 15 input nodes, 1 output node, and 4 nodes on one

hidden layer. This 20-node network finds an optimal set of weights in 1156 iterations of

the training set. When tested on all 16 outage states at 20 load levels (the system states

in Table 4.2), the neural net exhibits an agreement with load flow results of 98.75%.

Table 4.3 lists the load levels for which each outage state is acceptable and shows where

the perceptron errs. (See NN Acceptability in the table.)

Table 4.3 Comparison of Acceptable Load Levels for 3-Bus Outage States

State  Elements  Training  Actual             NN                 PCANN
                 Set?      Acceptability      Acceptability      Acceptability
  1    abcde     yes       all loads          all loads          all loads
  2    -bcde     yes       no loads           no loads           no loads
  3    --cde     yes       no loads           no loads           no loads
  4    -b-de     no        no loads           no loads           no loads
  5    a-cde     yes       <85% of peak load  <90% of peak load  <90% of peak load
  6    a--de     yes       <65% of peak load  <75% of peak load  <75% of peak load
  7    ab-de     yes       <85% of peak load  <90% of peak load  <90% of peak load
  8    -bc-e     no        no loads           no loads           no loads
  9    -bcd-     no        no loads           no loads           no loads
 10    a-c-e     no        no loads           no loads           no loads
 11    a-cd-     no        no loads           no loads           no loads
 12    ab--e     no        no loads           no loads           no loads
 13    ab-d-     no        no loads           no loads           no loads
 14    abc-e     yes       no loads           no loads           no loads
 15    abc--     yes       no loads           no loads           no loads
 16    abcd-     yes       no loads           no loads           no loads


Next, weights for a PCA are determined from an eigensystem analysis of the

admittance variables for outage states 1, 14, 15, and 16. Linear dependence among

the 10 admittances is such that only two of the 10 principal components are retained, and

they represent fully 100% of the energy contained in the admittances. The admittance

variables for the four line contingencies are passed through the PCA, and values of the

first principal component for these contingencies are shown in Figure 4.2. Notice that, for

this example, the first component alone places line contingencies into linearly separable

classes. Hence, a perceptron may more easily classify a given state by using the two9

principal components as input instead of the 10 admittance variables.

(First principal component plotted on an axis from -4 to 4: the double contingency with both lines out lies near -4, the two single-line contingencies near the middle, and the base case with no lines out near 4.)

Figure 4.2 First Principal Component for Line Contingencies on the 3-Bus System

Another perceptron is configured for use with the PCA. For input, this network uses

the same voltage and power variables as the original net, but the 10 admittance variables

are replaced by the two principal components. Hence, this network has only 7 inputs, 1

output, and 2 nodes on one hidden layer. The same 36 system states used to train the

original net are used in training this 10-node network, which requires 1287 iterations to

find optimal weights. When tested on all 320 system states, the PCANN also shows an

agreement with load flow results of 98.75%. (See PCANN Acceptability in Table 4.3.)

9 Since the first principal component is able to clearly distinguish between line contingencies, the second is not needed. It is included
for conformity with the method presented here.


The results in Table 4.3 indicate that both networks are capable of determining the

acceptability of an arbitrary system state10. For both networks, errors occur through

overestimating the levels of load that are acceptable for a particular contingency. The

PCANN takes a few more iterations to train, but it uses much less data. The two principal

components contain the same information as the admittance variables, so the simpler

structure of the PCANN input variables makes this network equally accurate in testing.


(Plot of total output error versus training iteration for the NN and the PCANN.)
Figure 4.3 Learning Curves of Multilayered Perceptrons for the 3-Bus System

A fast learning rate of 0.5 and an acceleration rate of 0.9 are used for both perceptrons.

The learning curve, a plot of total output error during training, is shown for both networks

in Figure 4.3. The plot shows that the training algorithm proceeds smoothly to the

minimum error in both cases.

Once the set of unacceptable states is specified by the neural networks, adequacy

indices can be estimated and compared. Using the procedure outlined in Section 4.2, as

well as the element and load parameters for the 3-bus system, the original perceptron

10 The number of weights in the first perceptron is 64, while the second has only 16 weights. Only 36 patterns are used in training
each net, so the rule specifying twice the number of training states as weights is violated in the first case. The high accuracy of
this network indicates that classes of patterns are easily separable and that the decision surfaces associated with the 3-bus system
are of low order. That is, the two classes of system states cluster well.


gives a probability of failure of 0.0253, a mean duration of 30.9 hours, a failure frequency

of 7 times per year, and an unserved demand of 48.3 MW. Because the PCANN makes

exactly the same errors as the original net, its estimates are the same. These results are

tabulated in Table 4.4.
Table 4.4 Comparison of Adequacy Estimates for the 3-Bus System

Adequacy     Actual   Conventional           NN                     PCANN
Index        Value    Estimate   % Error     Estimate   % Error     Estimate   % Error
Probability  0.0277    0.0304     -9.5        0.0253      8.7        0.0253      8.7
Duration     30.53     30.18       1.2        30.93      -1.3        30.93      -1.3
Frequency     7.96      8.82     -10.8         7.17       9.8         7.17       9.8
Unserved     44.68     50.41     -12.8        48.26      -8.0        48.26      -8.0

Note that the estimates obtained using the perceptrons are somewhat more accurate

than the conventional estimate on three of the four indices, and slightly less accurate on

the other. Errors using the conventional technique are less than 15% in all categories,

and errors using either neural network strategy are less than 10%. Since the PCANN

shows no increase in accuracy over the original net, principal components analysis is not

useful when applied to this system (unless speedier training is highly valued). In any

event, the indices suggest that the 3-bus system is available over 97% of the time. It

is expected to fail 8 times per year, with each failure resulting in a loss of almost 45

MW of load for over 30 hours.


5.1 Computational Tasks

Results for the 3-bus system listed in Chapter 4 are obtained by passing descriptive

input files through a series of thirteen computer programs1. One input file lists the real

and reactive loads at each bus, voltage magnitudes and power capabilities of all generator

buses, and impedances, shunt capacitances, and thermal limits of all lines. For the 3-bus

system, this file is called "ex3.dat."

The first program, called drive, uses this input file to produce files which describe

each of the system contingencies at 20 load levels. A subroutine, dacdrive, performs a

decoupled load flow study on each system state and lists the resulting acceptability (either

0 or 1) in a file called "tester.ans." A second input file, called "lines.dat," lists only the

line data for the system. This file is used by program pattern to form the admittance

matrix for each line contingency; the nonzero matrix elements are placed in a file called
"fullpatt.dat."

At this point, program maketesta uses input file "ex3.dat" and the line patterns in

"fullpatt.dat" to form the neural net inputs for each system contingency at 20 load levels.

These vectors are listed in a file called "testera.dat." Next, program maketraina uses

selected vectors in "testera.dat" and the associated acceptabilities in "tester.ans" to form

a neural net training set, listed in file "traina.dat." Training vectors are selected, at four

load levels, in the following manner:

1 A technical report containing programming code and sample files for the 3-bus system is available from Dr. Jose C. Principe,
Department of Electrical Engineering, University of Florida.

* base case,

* failure of first generating unit,

* failure of first and second generating units,

* failure of second generating unit,

* failure of second and third generating units,

* failure of last generating unit,

* failure of first line,

* failure of first and second lines,

* failure of second line,

* failure of second and third lines,

* failure of last line.
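The selection above can be sketched generically; for n elements split into generator and line groups, it yields exactly 2n - 1 outage states (the function name is illustrative, not part of the thesis's program suite):

```python
def training_outages(generators, lines):
    """Base case, every single-element failure, and a double failure pairing
    each consecutive element of the same type: 2n - 1 outage states for
    n = len(generators) + len(lines) elements."""
    states = [()]                               # base case: nothing out
    for group in (generators, lines):
        for i, e in enumerate(group):
            states.append((e,))                 # single failure
            if i + 1 < len(group):
                states.append((e, group[i+1]))  # consecutive double failure
    return states
```

For the 3-bus system (generators 'a'-'c', lines 'd'-'e') this yields the 9 outage states listed in Chapter 4; for the RBTS (11 units, 9 lines) it yields 39 states, or 156 training cases at four load levels.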

Weights for the neural net are computed by program traina, which performs the back-

propagation routine on the vectors in "traina.dat." Weights are stored in file "weightsa.dat"

and used by program testera to compute neural net responses to each of the vectors in

"testera.dat." These responses are compared to the acceptabilities in "tester.ans" and

listed in file "testera.otp."

Next, program pcaweight uses the admittance elements of selected line contingencies

in "fullpatt.dat" to form an admittance correlation matrix, from which eigenvalues and

eigenvectors are found. Eigenvectors associated with eigenvalues less than unity are

discarded, and the remaining vectors are stored in file "pcaweight.dat." Program pca

then passes all admittance patterns in "fullpatt.dat" through a principal components

analyzer formed from the vectors in "pcaweight.dat;" the transformed vectors are stored

in "fullpca.dat."


Using input file "ex3.dat" and the reduced line patterns in "fullpca.dat," program

maketestb forms neural net inputs for each system contingency at 20 load levels. These

vectors are stored in file "testerb.dat." Program maketrainb forms a training set from

selected vectors and associated acceptabilities in "tester.ans." Training states are the same

as those used to train the first neural net, and these vectors are stored in "trainb.dat."

Weights for the neural net are computed by program trainb, which performs the

training routine on the vectors in "trainb.dat." Weights are stored in file "weightsb.dat"

and used by testerb to compute network responses to each vector in "testerb.dat." These

responses are compared to the acceptabilities in "tester.ans" and listed in file "testerb.otp."

Finally, reliability indices for the system are computed by program reliab. A third

input file, called "reliab.dat," contains load data for the system and mean times to failure

and repair for each element. Using this file, reliab forms the annual hourly load curve,

then computes probabilities and transition rates associated with each of 20 load levels

and with each of four load levels. Probabilities and transition rates for each outage state

are then computed from element failure data.
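The load-level computation can be sketched as a simple bucketing of the hourly curve (this is a hypothetical illustration, not the reliab code itself; band boundaries are an assumption):

```python
def load_level_probabilities(hourly_loads, n_levels=20):
    """Bucket an annual hourly load curve into n_levels equal bands of peak
    load; each level's probability is its share of the total hours. Index i
    covers loads in [i/n, (i+1)/n) of peak, so index k-1 is 'level k', with
    the peak hour clamped into the top band."""
    peak = max(hourly_loads)
    counts = [0] * n_levels
    for load in hourly_loads:
        level = min(n_levels - 1, int(load / peak * n_levels))
        counts[level] += 1
    return [c / len(hourly_loads) for c in counts]
```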

Using the actual acceptabilities in "tester.ans," the probability, duration, and unserved

demand for each unacceptable system state are accumulated, and these quantities are

used to compute actual reliability indices. In the same manner, the acceptabilities in

"testera.otp" and "testerb.otp" are used to compute indices for the two neural nets. This

procedure is also used to obtain conventional estimates, but only the 100 most probable

outages at four load levels are considered. The indices obtained by each of these strategies

are listed in file "reliab.otp."

5.2 RBTS Results

Figure 5.1 The Roy Billinton Test System

The Roy Billinton Test System (RBTS) is a small bulk power system model used in

graduate reliability studies at the University of Saskatchewan, Canada [57]. The system,

shown in Figure 5.1, consists of 6 buses and operates on a base of 230 kV and 100 MVA.

Two of the buses house a total of 11 generating units, and both buses have a voltage

magnitude of 1.05. Generating data is listed in Table 5.1.

Table 5.1 RBTS Generation Data

Bus   Rating, MW   MVAR Limits   MTTF, hrs   MTTR, hrs
 1        40       -15 to 17       1460         45
 1        40       -15 to 17       1460         45
 1        20        -7 to 12       1752         45
 1        10         0 to 7        2190         45
 2        40       -15 to 17       2920         60
 2        20        -7 to 12       3650         55
 2        20        -7 to 12       3650         55
 2        20        -7 to 12       3650         55
 2        20        -7 to 12       3650         55
 2         5         0 to 5        4380         45
 2         5         0 to 5        4380         45


Total rated generation of 240 MW is more than sufficient to supply a peak load of

185 MW. The load curve is described in the Appendix and is the same as that for the

3-bus system of Chapter 4. Coincident loads are assumed, and peak bus loads are listed

in Table 5.2. Voltage limits for all load buses are, in per unit, 0.97 and 1.05.

Table 5.2 RBTS Peak Bus Loads

Bus MW Load MVAR Load
1 0 0
2 20 4
3 85 17
4 40 8
5 20 4
6 20 4

The transmission system consists of nine lines. Per-unit line parameters are given in

Table 5.3, along with failure and repair data.

Table 5.3 RBTS Transmission Data

Line   From Bus   To Bus   Resistance   Reactance   Shunt Admittance   MVA Limit   MTTF, hrs   MTTR, hrs
  1       1         2        0.0912       0.4800         0.0564           71          2180         10
  2       1         3        0.0342       0.1800         0.0212           85          5830         10
  3       1         3        0.0342       0.1800         0.0212           85          5830         10
  4       2         4        0.1140       0.6000         0.0704           71          1742         10
  5       2         4        0.1140       0.6000         0.0704           71          1742         10
  6       3         4        0.0228       0.1200         0.0142           71          8750         10
  7       3         5        0.0228       0.1200         0.0142           71          8750         10
  8       4         5        0.0228       0.1200         0.0142           71          8750         10
  9       5         6        0.0228       0.1200         0.0142           71          8750         10

There are 211 outage states for this system, or 4220 system states. Forty-five of the

outage states are line outages, while 66 concern only generating units. In any event,

4220 load flow studies must be performed, each for a 6-bus system, in order to collect


the data required for calculating actual values of reliability indices. A combination of

this data and the load curve yields the values shown in Table 5.4. For this system, the

number of failed states is 1581. The actual indices show that the RBTS has a 99.75%

availability. It undergoes three failures every two years, each lasting almost 15 hours

and resulting in over 91 MW of lost load.
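The counts above follow from simple combinatorics over the 20 elements (11 generating units and 9 lines); a sketch:

```python
from math import comb

def outage_space(n_generators, n_lines):
    """Size of the base-plus-single-plus-double outage space, along with the
    line-only and generator-only subsets."""
    n = n_generators + n_lines
    total = 1 + n + comb(n, 2)                  # base + singles + doubles
    line_only = n_lines + comb(n_lines, 2)
    gen_only = n_generators + comb(n_generators, 2)
    return total, line_only, gen_only
```

For the RBTS this gives 211 outage states (45 line-only, 66 generator-only), or 4220 system states at 20 load levels.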

Table 5.4 RBTS Reliability Indices

Adequacy     Actual   Conventional           NN                     PCANN
Index        Value    Estimate   % Error     Estimate   % Error     Estimate   % Error
Probability  0.0025    0.0064    -157.4       0.0038     -53.8       0.0024      5.9
Duration     14.6      14.4         1.2       17.6       -20.6       10.7       26.4
Frequency     1.5       3.9      -160.6        1.9       -27.5        1.9      -27.9
Unserved     91.2      82.1        10.0       82.5         9.6       82.3        9.8

The conventional estimates, using only the 100 most probable contingencies, give the

most inaccurate values for three of the four indices. The estimate of failure duration is

unique in having the least error with the conventional technique, but the probability and

frequency estimates have errors approaching 200%. Of the 400 system states involved

in conventional estimates, 120 are failed.
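The actual indices reported throughout this chapter follow the standard frequency-and-duration relationships: loss-of-load probability is the sum of failed-state probabilities, duration is probability times 8760 hours divided by frequency, and unserved demand is the probability-weighted curtailment per failure. A minimal sketch of that bookkeeping, with illustrative numbers rather than the RBTS state data (the `adequacy_indices` helper is hypothetical):

```python
# Sketch: frequency-and-duration adequacy indices accumulated over the
# set of failed states. Inputs per failed state: steady-state probability,
# frequency of encounter (occurrences/yr), and load curtailed in MW.
# The state data below are illustrative, not the thesis results.

HOURS_PER_YEAR = 8760

def adequacy_indices(failed_states):
    """failed_states: list of (probability, frequency_per_yr, mw_lost)."""
    prob = sum(p for p, f, c in failed_states)                  # loss-of-load probability
    freq = sum(f for p, f, c in failed_states)                  # failures per year
    duration = prob * HOURS_PER_YEAR / freq                     # hours per failure
    unserved = sum(p * c for p, f, c in failed_states) / prob   # MW lost per failure
    return prob, freq, duration, unserved

states = [(0.0015, 0.9, 95.0), (0.0010, 0.6, 85.0)]
p, f, d, u = adequacy_indices(states)
print(round(p, 4), round(f, 2), round(d, 1), round(u, 1))
```

With these stand-in states the sketch reproduces the kind of figures quoted above: an outage probability near 0.0025 and a mean failure duration near 14.6 hours.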

There are two voltage magnitudes, five real bus injections, and four reactive injections

available as neural network inputs. In addition, there are 13 nonzero, complex admittance

elements, for a total of 37 available inputs. A neural net is configured with 37 input nodes

and ten nodes on a hidden layer. It learns the 156 training cases in 2473 iterations and

has an 84.5% agreement with load flow results when tested on all system states. The

number of unacceptable states found by this network is 1848. This neural net gives the

most accurate estimate of unserved demand, but the error is only slightly less than either

of the other two strategies.
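The classifier used in these experiments is a single-hidden-layer perceptron trained by error back-propagation to label a state acceptable or unacceptable. A toy sketch of that training loop follows; the dimensions, learning rate, iteration count, and data are illustrative stand-ins, not the 37-input RBTS configuration:

```python
# Toy back-propagation training of a one-hidden-layer perceptron that
# flags a "state" as acceptable (1) or unacceptable (0). All sizes and
# data here are illustrative assumptions, not the thesis setup.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# 4 inputs standing in for state variables; surrogate acceptability rule.
X = rng.uniform(-1, 1, size=(40, 4))
y = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)

W1 = rng.normal(0, 0.5, size=(4, 2)); b1 = np.zeros(2)   # hidden layer
W2 = rng.normal(0, 0.5, size=(2, 1)); b2 = np.zeros(1)   # output layer
eta = 1.0                                                # learning rate

for _ in range(10000):                     # training iterations
    h = sigmoid(X @ W1 + b1)               # hidden activations
    out = sigmoid(h @ W2 + b2)             # network output in (0, 1)
    d_out = (out - y) * out * (1 - out)    # output delta (squared error)
    d_h = (d_out @ W2.T) * h * (1 - h)     # back-propagated hidden delta
    W2 -= eta * h.T @ d_out / len(X); b2 -= eta * d_out.mean(axis=0)
    W1 -= eta * X.T @ d_h / len(X);   b1 -= eta * d_h.mean(axis=0)

# A state is deemed acceptable when the output exceeds a 0.5 threshold.
accuracy = ((out > 0.5) == (y > 0.5)).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The thresholding in the last step mirrors the "agreement with load flow results" figures quoted for each system.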

The 26 admittance variables are transformed to seven principal components by

passing them through a PCA that retains 100% of the energy in the admittances. A neural

net with 18 inputs and five hidden nodes learns the training set in 1781 iterations and

agrees with load flow results on 87.0% of the system states. The number of states deemed

unacceptable by this net is 1891. It performs better than the conventional technique on

all but one index, while its performance is slightly worse on three indices than that of

the original network. However, this network gives a much better estimate of probability

than the original net, so the PCA is quite useful for this system.
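The input-reduction step can be sketched as an eigendecomposition of the input covariance, keeping the leading components until a chosen energy (variance) fraction is retained. The synthetic data and the `pca_reduce` helper below are illustrative assumptions, not the thesis admittance variables:

```python
# Sketch of principal components analysis for input reduction: project
# correlated variables onto the fewest components holding a chosen
# fraction of total variance ("energy"). Data here are synthetic.
import numpy as np

rng = np.random.default_rng(1)

def pca_reduce(X, energy=0.963):
    """Return X projected onto the fewest components retaining `energy`."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)            # ascending eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]      # sort descending
    frac = np.cumsum(vals) / vals.sum()         # cumulative energy fraction
    k = int(np.searchsorted(frac, energy) + 1)  # components needed
    return Xc @ vecs[:, :k], k

# 8 correlated variables built from 3 underlying factors plus small noise:
factors = rng.normal(size=(200, 3))
X = factors @ rng.normal(size=(3, 8)) + 0.01 * rng.normal(size=(200, 8))
Z, k = pca_reduce(X)
print(f"{X.shape[1]} variables reduced to {k} components")
```

Because the synthetic data have only three underlying factors, the reduction recovers a handful of components, mirroring the 26-to-7 compression reported for the RBTS admittances.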

5.3 GRU System Results

Figure 5.2 The GRU System

The Gainesville Regional Utilities (GRU) system, shown in Figure 5.2, has a

reputation for bulk system reliability throughout its service territory. The system has 11 buses

operating on a base of 138 kV and 100 MVA. Nine generating units reside on two of

the buses, which have per-unit voltages of 1.02 and 1.01, respectively. Table 5.5 lists

the generation data for this system.

Table 5.5 GRU System Generation Data

Bus   MW Rating   MVAR Limits   MTTF, hrs   MTTR, hrs
1     218         -45 to 91     8730        30
1     81          -10 to 25     8730        30
1     18          -6 to 9       8730        30
1     18          -6 to 9       8730        30
2     41          -17 to 19     8730        30
2     20          -10 to 12     8730        30
2     14          -8 to 10      8730        30
2     14          -8 to 10      8730        30
2     14          -8 to 10      8730        30

Table 5.6 GRU System Peak Bus Loads

Bus MW Load MVAR Load
1 13.2 2.6
2 89.1 17.8
3 28.1 5.6
4 40.7 8.1
5 7.0 1.4
6 21.1 4.2
7 7.9 1.6
8 6.1 1.2
9 39.3 7.9
10 31.1 6.2
11 9.4 1.9


The generating capacity of 442 MW supplies a peak load of 293 MW.² Again,

coincident loads are assumed, and voltage limits for the load buses are 0.95 and 1.05,

respectively. Peak bus loads are listed in Table 5.6. There are 14 lines in the transmission

system, providing 105 line contingencies to be considered in the adequacy assessment.

Line parameters, in per-unit, are given with failure and repair data in Table 5.7.

Table 5.7 GRU System Transmission Data

Line  From Bus  To Bus  Resistance  Reactance  Shunt Admittance  MVA Limit  MTTF, hrs  MTTR, hrs
1 1 3 0.0057 0.0325 0.0096 245.7 87590 10
2 1 4 0.0024 0.0138 0.0041 245.7 87590 10
3 1 5 0.0030 0.0223 0.0082 313.0 87590 10
4 2 3 0.0055 0.0313 0.0092 245.7 87590 10
5 2 4 0.0088 0.0500 0.0148 245.7 87590 10
6 2 10 0.0051 0.0288 0.0085 245.7 87590 10
7 2 11 0.0032 0.0182 0.0054 205.6 87590 10
8 5 6 0.0046 0.0339 0.0125 313.0 87590 10
9 6 7 0.0014 0.0077 0.0023 245.7 87590 10
10 6 8 0.0045 0.0257 0.0076 245.7 87590 10
11 7 8 0.0031 0.0179 0.0053 245.7 87590 10
12 8 9 0.0061 0.0350 0.0103 245.7 87590 10
13 8 11 0.0066 0.0377 0.0111 205.6 87590 10
14 9 10 0.0015 0.0087 0.0026 245.7 87590 10

The 277 outage states for this system, when combined with the 20-level load model,

yield 5440 system states. Hence, 5440 load flow studies are necessary for an actual

calculation of reliability indices. These indices and their estimates are shown in Table

5.8. There are only 368 failed states in the GRU system, and it has an availability of

99.89%. The system fails once every three years for over 28 hours, resulting in 50 MW

of lost load.

² The 1991 annual load curve for this system was obtained from the Strategic Planning staff at GRU. Scheduled power interchanges
with connected utilities were not included in the system model.


The conventional estimates, using only the 100 most probable outages, give the most

inaccurate values for three of the four indices. The best conventional estimate, failure

duration, is "off" by less than 1%, while the other estimates contain 70-90% error. Of

the 400 system states involved in conventional estimates, 48 are failed.

Table 5.8 GRU System Reliability Indices

Adequacy      Actual    Conventional          NN                    PCANN
Index         Value     Estimate   % Error    Estimate   % Error    Estimate   % Error
Probability   0.0011    0.0018     -75.2      0.0010     7.4        0.0010     7.8
Duration      28.4      28.6       -0.8       28.8       -1.3       28.3       0.3
Frequency     0.3       0.6        -73.8      0.3        -4.9       0.3        5.6
Unserved      49.5      92.0       -85.7      93.7       -89.3      62.4       -26.1

There are two voltage magnitudes, ten real bus injections, and nine reactive injections

available as neural network inputs. In addition, there are 25 nonzero, complex, admittance

elements, for a total of 71 available inputs. A neural net is configured with 71 input nodes

and 18 nodes on a hidden layer. It learns the 180 training cases in 3621 iterations and has

an 89.1% agreement with load flow results when tested on all system states. The number

of unacceptable states found by this network is 523. This neural net gives the most

accurate estimates of two indices, probability and frequency; the estimate of unserved

demand is actually worse than the conventional estimate.

The 50 admittance variables are transformed to 12 principal components by passing

them through a PCA that retains 96.3% of the energy in the admittances. A neural

net with 33 inputs and eight hidden nodes learns the training set in 4443 iterations and

agrees with load flow results on 88.6% of the system states. The number of states deemed

unacceptable by this net is 487. In comparison to the other techniques, it performs best


on the duration and unserved demand indices, but its performance is slightly worse

than the NN on the probability and frequency indices. Principal components analysis is

worthwhile here primarily due to its increased accuracy on the loss-of-load index.

5.4 10-Bus System Results

Figure 5.3 The 10-Bus System

This fictitious system, as its name suggests, uses 15 lines to supply its 10 bus loads.

Three of the buses house the eight generating units, which are described in Table 5.9.

Voltage magnitudes at the three generator buses are 1.02, 1.04, and 1.02, respectively.

The generation system has a maximum capacity of 530 MW to supply a peak load

of 480 MW. The load curve described in the Appendix is also used for this system.

Coincident peak bus loads are listed in Table 5.10. Per-unit voltage limits for load buses

are 0.95 and 1.05. The system operates on a base of 23 kV and 100 MVA. Per-unit

line parameters are given in Table 5.11, which also contains failure and repair data for

the lines.

Table 5.9 10-Bus Generation Data

Bus   MW Rating   MVAR Limits   MTTF, hrs   MTTR, hrs
1     100         -25 to 40     2100        30
1     100         -25 to 40     2100        30
2     65          -30 to 30     1630        25
2     50          -10 to 25     1630        25
2     50          -10 to 25     1580        40
3     65          -30 to 30     1790        15
3     50          -10 to 25     1790        15
3     50          -10 to 25     1710        30

Table 5.10 10-Bus System Peak Bus Loads

Bus MW Load MVAR Load
1 40 13
2 30 10
3 40 13
4 60 20
5 40 13
6 70 23
7 50 17
8 30 10
9 50 17
10 70 23


Table 5.11 10-Bus System Transmission Data

Line  From Bus  To Bus  Resistance  Reactance  Shunt Admittance  MVA Limit  MTTF, hrs  MTTR, hrs
1 1 4 0.0360 0.1440 0.0170 110 12200 10
2 1 5 0.0420 0.1680 0.0210 110 14100 12
3 1 7 0.0540 0.2310 0.0260 110 16500 14
4 2 5 0.0310 0.1260 0.0160 110 18700 16
5 2 6 0.0310 0.1260 0.0160 110 16500 18
6 2 9 0.0840 0.3360 0.0410 110 14100 20
7 3 7 0.0530 0.2100 0.0260 110 18700 19
8 3 8 0.0750 0.3010 0.0360 110 16500 17
9 3 10 0.0420 0.1680 0.0210 110 14100 15
10 4 5 0.0630 0.2520 0.0310 110 12200 13
11 4 6 0.0420 0.1680 0.0210 110 12200 11
12 6 7 0.0310 0.1260 0.0160 110 14100 12
13 8 9 0.0840 0.3360 0.0410 110 16500 14
14 8 10 0.0750 0.3010 0.0360 110 18700 16
15 9 10 0.0630 0.2520 0.0310 110 16500 18

The 277 outage states for the 10-bus system correspond to 5440 system states, so

5440 load flow studies are required for actual calculation of the indices. This system has

1490 failed states. Actual indices and estimates are shown in Table 5.12. The 10-bus

system has a 98.01% availability, fails 8 times per year for over 22 hours, and has a

lost load of 77 MW per failure.

The conventional technique, using only the 100 most probable outages, gives the

most inaccurate values for all of the four indices. The estimate of duration is unique

among conventional estimates in giving reasonable accuracy, but each of the other indices

contains approximately 100% error. Of the 400 system states involved in conventional

estimates, 163 are failed.


Table 5.12 10-Bus System Reliability Indices

Adequacy      Actual    Conventional          NN                    PCANN
Index         Value     Estimate   % Error    Estimate   % Error    Estimate   % Error
Probability   0.0199    0.0427     -114.4     0.0207     -3.8       0.0204     -2.7
Duration      22.3      23.3       -4.3       22.2       0.5        22.0       1.5
Frequency     7.8       16.1       -105.6     8.1        -4.3       8.1        -4.2
Unserved      76.9      146.7      -90.8      80.2       -4.4       81.0       -5.3

There are three voltage magnitudes, nine real bus injections, and seven reactive

injections available as neural network inputs. In addition, there are 25 nonzero, complex,

admittance elements, for a total of 69 available inputs. A neural net is configured with

69 input nodes and 18 nodes on a hidden layer. It learns the 180 training cases in 7410

iterations and has an 89.0% agreement with load flow results when tested on all system

states. The number of unacceptable states found by this network is 1476. This neural net

gives the most accurate estimate of the duration and demand indices; the other estimates

are slightly less accurate than the reduced network.

The 50 admittance variables are transformed to 14 principal components by passing

them through a PCA that retains 98.3% of the energy in the admittances. A neural

net with 33 inputs and eight hidden nodes learns the training set in 7968 iterations and

agrees with load flow results on 86.1% of the system states. The number of states deemed

unacceptable by this net is 1968. In comparison to the other techniques, it performs best

on the probability and frequency indices, but its performance relative to the original net

does not merit the added computation of the PCA.


6.1 Summary

The reliability assessment of an electric power system is an integral part of expansion

planning for the system. Given a model of a power system, the planner measures the

effect of system contingencies on the overall reliability of the system. Traditionally, due

to the complexity involved in dealing with composite systems, planners have evaluated

the generation, transmission, and distribution components of a power system separately.

Within recent years, however, engineers have realized that changes in one part of a system

can significantly affect the reliability of the entire system. Hence, modern reliability

assessment involves evaluation of the bulk power system model of an electric utility.

This model consists of all generators and transmission lines in the power grid, with the

distribution component represented by load equivalents at major points of service.

Until the early 1980's, reliability assessment was centered on evaluating a utility's

generation system. Measuring the effect of a generating system contingency involved

simply comparing available capacity with various load levels. In contrast, measuring the

effect of a bulk power system contingency involves satisfying constraints on generation,

bus voltages, and line flows, as well as taking system losses into account. In short,

evaluation of a bulk power system requires the performance of several load flow studies on

each contingency. Load flow analysis is time-consuming and costly in terms of computer

resources; in fact, these studies form the most expensive component of bulk power system

reliability assessment. For this reason, a fast, noniterative method of evaluating system


contingencies, such as that presented here, may greatly facilitate planning for electric utilities.


Neural networks have been successfully applied to a number of problems, including

handwriting analysis, detection of speech patterns, logic gate mappings, and other

applications, given only examples of patterns characteristic of the specific problem.

Relatively little research has been performed in the application of these networks to power

systems. Yet, the need for online computation in power system operations, combined with

the iterative nature of many solution techniques (and the sheer intractability of certain

problems), means the supply and distribution of electric energy may provide a vast area

of application for this form of artificial intelligence.

The problem considered in this work is the determination of state acceptability using

a multilayered perceptron trained by error propagation. After choosing input variables

that describe a system state, principal components analysis is used to reduce the number

of inputs to the neural net. A set of patterns and desired outputs is selected for training

and, when network weights are found, the entire state space is passed through the net.

The resulting network outputs are used to estimate chosen adequacy indices.
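The final estimation step described above can be sketched as follows, assuming a trained network callable that returns an output in (0, 1) and hypothetical per-state data; states whose output falls below the acceptability threshold are treated as failed:

```python
# Sketch: sweep the state space through a trained net and accumulate
# adequacy indices over states the net flags as failed. `trained_net`,
# the threshold, and the state data are illustrative assumptions.
def estimate_indices(states, trained_net, threshold=0.5):
    """states: list of (input_vector, probability, frequency_per_yr, mw_lost)."""
    prob = freq = lost = 0.0
    for x, p, f, c in states:
        if trained_net(x) < threshold:        # low output => unacceptable state
            prob += p
            freq += f
            lost += p * c
    duration = prob * 8760 / freq if freq else 0.0
    unserved = lost / prob if prob else 0.0
    return prob, freq, duration, unserved

# Toy stand-in net: flags states whose first variable is negative.
net = lambda x: 1.0 if x[0] >= 0 else 0.0
states = [([-1.0], 0.002, 1.0, 80.0), ([1.0], 0.998, 0.0, 0.0)]
print(estimate_indices(states, net))
```

Each network evaluation here replaces one load flow study, which is the source of the speed advantage claimed for the approach.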

For the three systems described in Chapter 5, the results show that a neural network

can determine the acceptability of a system state with high probability. In addition,

principal components analysis is shown to be useful in reducing the dimension of an input

vector. In some cases, the use of principal components improves the results; in others, the

accuracy of indices is reduced. Because of this ambiguity, principal components analysis

should only be used with large systems whose input vectors would otherwise result in

unreasonable network training times. Nevertheless, the reliability indices found using

perceptrons are in much closer agreement with actual values than those found using the

conventional technique.


This work offers several contributions to the study of electric utility systems. The

methodology provided for the use of neural networks is both systematic and robust.

Economies of scale are highly favorable in that even large systems can give results

similar to those obtained for the small systems used here.

For operation and control, the research offers a way to monitor a system continuously

instead of performing load flow studies hourly, as is the conventional practice. It is

important to emphasize that, once a neural net has been trained on a power system,

the acceptability of any state of the system can be determined in a fraction of the time

required to perform a load flow study.

For planning, this work shows that a "middle ground" exists between the intensive

computation of actual indices and the easily computed but inaccurate conventional

technique. The neural network approach to adequacy estimation is both faster than the

actual computation and more accurate than the conventional technique. Also, considering

the ongoing nature of the planning process, the computational intensity involved in

training a perceptron is possibly a one-time cost. After the first planning year, network

training for an expanded system can begin with previously computed weights, requiring

far fewer iterations to include new data in the mapping. This work offers a way to

drastically reduce the number of load flow studies necessary in designing and maintaining

a reliable system.

6.2 Limitations

The conclusions of this research are based on studies of only three small power

systems. Though this work is primarily concerned with developing a methodology for

studying the application of perceptrons to reliability and state acceptability, the results

cannot be generalized without testing on an extensive set of realistic systems.


In addition, this work focuses only on systemwide adequacy indices; that is, no

provision is made for local indices such as the probability of failure at a given bus.

However, since a perceptron specifies the set of unacceptable states, load flow studies

can be performed on the states to obtain information related to buses. Alternately, a

neural net with multiple output nodes may be trained to give failed buses for system

states and, if particular local indices are desired, load flow studies may be performed on

only those states which indicate specific system failures.

Furthermore, time comparisons between reliability assessment methods are little more

than informed guesses based on the author's experience with the simulations in Chapters

4 and 5. Though time measurements are accessible for each technique, any resulting

statements are flawed due to coding inefficiency in all methods, especially in coding

the training algorithms. Also, though issues related to employing principal components

are discussed, few specific recommendations on the use of this technique are given;

for instance, an analysis of the effects of reduced input correlation on the performance

surfaces is lacking for this application.

In addition, neural network processing requires making judicious choices of several

parameters. From the random numbers initially assigned to network weights to the

learning and acceleration factors of the training procedure, the parameters determine

whether the network accurately models the training set and whether it can generalize to

new patterns. Many of these parameters have been tested on a variety of applications

and have generally accepted heuristics. Unfortunately, no heuristic exists for the number

of hidden units to use.

A 4:1 ratio of inputs to hidden nodes is used in this work, but it is somewhat arbitrary

and only selected for consistent comparison of results. Indeed, some of the network

simulations described in Chapter 5 have higher agreement with load flow studies when


different ratios are used. Hence, the 4:1 ratio is not recommended as a general heuristic.

Although a great deal of research in neural networks is devoted to finding a rule for

this parameter [58-59], it will be necessary to experiment with many power systems

before any rule becomes acceptable for this application. In the context of presenting

a methodology, the lack of an acceptable heuristic for the number of hidden units is a

major limitation of this work. Hopefully, the discussions in Chapters 3 and 4 may guide

other researchers in this matter.

6.3 Future Work

With the methodology presented here, it is possible to formulate an exhaustive

study of power systems to validate the results. Along with such a study, faithful

time comparisons of reliability assessment techniques can be obtained. This requires

professionally developed load flow routines, as well as "state-of-the-art" programming

code for training multilayered perceptrons [60]. In addition, a multiple-year planning

study with yearly expansions might be simulated for various systems in order to quantify

the benefits of using previously computed weights in retraining a perceptron for an

evolving system. This study would assume a set of expansions limited to the addition of

generating units at existing voltage-controlled buses, alteration of line admittances along

existing rights-of-way, and increments in the annual peak load; however, these additions

form the bulk of power system expansions.

Also, a study of the effects on reliability indices of varying the threshold between

"high" and "low" network outputs would be a valuable addition to this work, as would a

study on the utilization of principal components. Plotting the effects of diverse sizes of

hidden layers on a perceptron's accuracy in this application would also prove beneficial,

using similar systems of various dimensions.


Knowing the decision regions formed by the hidden layer may impact both the

operation and planning for a power system. With unacceptable regions known to the

dispatcher, continuous monitoring of system variables may help avoid failed states

by prompting preventive operations. In the planning process, finding commonalities

between states in each unacceptable region may suggest specific expansions for improved reliability.


A study on the specific benefits of analyzing decision regions for realistic systems

would be the most suitable progeny of this work. Research questions include, among others:


1. the sensitivity of decision regions to particular weights (i.e., high sensitivity to certain

weights may suggest the importance of associated power system variables), and

2. the implications of similarities between clustered states (i.e., regions filled with

identifiably similar contingencies might be eliminated by reducing the effects of

such contingencies).

This study might also provide clues to such vexing problems as transmission "bottlenecks"

and appropriate amounts of generation reserves. Further research in this field should be

aimed at discovering the implications of a direct mapping of power system variables to

state acceptability.

Finally, this work has shown the manner in which a neural network can be trained

to specify the acceptability of a given power system state. Although the intended

application is contingency screening for reliability assessment, many tasks related to

power system maintenance and control can benefit from this work. Examples are the

timing of preventive maintenance for generators, noncritical line repairs, and feeder

reconfiguration. These applications currently require load flow analyses although, like


contingency screening, they do not require detailed knowledge of bus voltages or line

flows. Hence, these and many other power system tasks provide good applications for

further research.


With the exception of the GRU system, all test systems use an annual load curve

taken from the IEEE RTS. Load data is given in three stages:

- weekly peak load as a percentage of the annual peak,

- daily peak load as a percentage of the weekly peak, and

- hourly peak load as a percentage of the daily peak.

In this way, each hourly load of the year is computed as the product of the associated

weekly, daily, and hourly peak loads.
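The three-stage product above can be sketched directly; the percentage tables below are tiny illustrative stubs, not the IEEE RTS data:

```python
# Sketch of the three-stage load model: each hour's load is the annual
# peak scaled by its week's, day's, and hour's percentage factors.
def hourly_loads(annual_peak_mw, weekly_pct, daily_pct, hourly_pct):
    """Yield one load value per modeled hour."""
    for wk in weekly_pct:                  # percent of annual peak
        for day in daily_pct:              # percent of weekly peak
            for hr in hourly_pct:          # percent of daily peak
                yield annual_peak_mw * (wk / 100) * (day / 100) * (hr / 100)

loads = list(hourly_loads(185.0,
                          weekly_pct=[86.2, 90.0],
                          daily_pct=[93.0, 100.0],
                          hourly_pct=[60.0, 100.0]))
print(len(loads), round(max(loads), 1))
```

With the full 52-week, 7-day, 24-hour tables, the same loop yields all 8736 modeled hourly loads for the year.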

Table A.1 lists the peak load for each week in the projected year. The annual peak

occurs in the 51st week of the year, though a lower seasonal peak occurs in the 23rd

week. Either a winter- or summer-peaking system can be described by assigning dates

to the first week.

Table A.1 Weekly Peak Load in Percent of Annual Peak

Week Peak Load Week Peak Load
1 86.2 27 75.5
2 90.0 28 81.6
3 87.8 29 80.1
4 83.4 30 88.0
5 88.0 31 72.2
6 84.1 32 77.6
7 83.2 33 80.0
8 80.6 34 72.9
9 74.0 35 72.6
10 73.7 36 70.5
11 71.5 37 78.0
12 72.7 38 69.5