RELIABILITY ASSESSMENT OF BULK POWER SYSTEMS USING NEURAL NETWORKS: TECHNICAL REPORT

by SONJA EBRON
UNIVERSITY OF FLORIDA
1992

Copyright 1992 by Sonja Ebron

to the Fon of Dahomey
homage to the Ancestors, and peace

ACKNOWLEDGEMENTS

The author wishes to thank Lisa, Kim, and Camille for inspiration and motivation. The generous financial and moral support of the author's family and the Florida Endowment Fund are greatly appreciated. Special thanks go to Dr. Dennis P. Carroll for the use of computational facilities, to Dr. Khai T. Ngo for the use of office space, and to Mr. Roger A. Westphal of Gainesville Regional Utilities for technical assistance. Finally, the author wishes to thank the cochairs of her supervisory committee, Drs. Robert L. Sullivan and Jose C. Principe, whose guidance, support, and patience are to be credited with this work.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
CHAPTERS
1 INTRODUCTION
  Bulk Power Systems
  Reliability Assessment
  Neural Networks
  About the Thesis
2 POWER SYSTEM RELIABILITY
  Philosophy of Reliability
  Classical Reliability
  Load Flow Studies
  Reliability of Bulk Power Systems
3 NEURAL NETWORKS
  Philosophy of Connectionism
  Description and Operation
  Training Algorithms
  Principal Components Analysis
4 RESEARCH METHODOLOGY
  Objectives of Research
  Research Procedures
  Training Issues
  Results for a 3-Bus System
5 SIMULATION RESULTS
  Computational Tasks
  RBTS Results
  GRU System Results
  10-Bus System Results
6 CONCLUSION
  Summary
  Limitations
  Future Work
APPENDICES
A ANNUAL LOAD MODEL
B SIMULATION DATA
REFERENCES
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

RELIABILITY ASSESSMENT OF BULK POWER SYSTEMS USING NEURAL NETWORKS

by SONJA EBRON
August 1992
Chair: Robert L. Sullivan
Cochair: Jose C. Principe
Major Department: Electrical Engineering

Assessing the reliability of a bulk power system traditionally involves performing load flow studies, at several load levels, on a small subset of possible contingencies. Results are used to calculate systemwide adequacy indices. The conventional practice of limiting the set of contingencies has an adverse effect on the accuracy of the chosen indices, as does consideration of only a few load levels. This work is motivated by two basic objectives: to determine whether a neural network can qualitatively replace the load flow study, and to determine whether adequacy indices computed using a neural network strategy are as accurate as those conventionally computed using load flow studies. The dissertation presents research demonstrating that an artificial neural network can be trained to assess the acceptability of an arbitrary contingency at an arbitrary load level.
The network inputs for a given contingency are selected bus power injections and generator voltage magnitudes at a given load level, as well as principal components of the associated node admittance matrix. The network is trained using the inputs for a preselected set of contingencies at a few load levels and their respective outputs, which are determined by load flow studies. The neural net can then be used to determine the acceptabilities of other system states without further recourse to computationally intensive load flow studies, and reliability indices can be quickly computed using all possible contingencies at as many load levels as desired. For the systems used in this work, the proposed neural network strategy affords a higher degree of accuracy in the computation of indices when compared to conventional techniques.

CHAPTER 1
INTRODUCTION

1.1 Bulk Power Systems

Supplying the demand for electric energy involves three principal functions: generating, transmitting, and distributing electric power over time. Electricity is generated at several remote locations using fossil fuels, nuclear fission, natural gas, dammed water, and alternative resources such as solar insolation and wind. The generators are connected to transmission lines which transport electricity to several distribution centers, or loads. Electricity is then transported along distribution lines to homes, offices, industrial plants, and other users of electric power. A functional diagram of a typical electric power system is shown in Figure 1.1.

[Figure 1.1: Typical Electric Power System, showing generators feeding transmission lines, distribution centers, and end users (homes, offices, plants)]

Besides generators, lines, and loads, a power system includes such components as transformers, relays, transducers, shunt capacitors, and other elements. The power system
engineer, when analyzing a system, usually chooses a portion of the system to represent with as much detail as is appropriate for the analysis. For instance, if the engineer wishes to analyze power transients along a certain distribution route, the most proper model may be one composed of all elements from the main distribution point to the end user. On the other hand, if low voltages at several distribution centers are of interest, then a low-detail model consisting of generators, transmission lines, and aggregate load points may be most appropriate. This model, called a bulk power system, is normally used for large-scale studies of entire systems and is the most common model in use at electric utilities. Figure 1.2 shows the bulk power system model for the system of Figure 1.1.

[Figure 1.2: Bulk Power System Model, with generators, transmission lines, and aggregate loads]

Besides controlling the supply of an ever-present electric demand, the power system engineer must plan for the supply of future demand. Hence, one large-scale study that uses the bulk power system model is expansion planning. Like operation and control, expansion planning also involves three principal activities: forecasting demand, estimating reliability, and evaluating expansion alternatives. Since the expanded power system must be capable of meeting the demand for electricity at some designated future time, the demand at that time must be estimated. The growth in energy usage is dependent on several factors, such as population growth, industrial growth, the economy, and energy-efficient technologies. Some or all of these factors may be used to forecast load growth. The expanded power system must be capable of meeting the forecasted demand in a reliable fashion. That is, the system should provide dependable, high-quality electric service.
Since all components of the system are susceptible to failure at any time, the power system engineer must ensure that failures of system components result in minimal service interruptions. Hence, after forecasting, possible contingencies of the bulk power system are analyzed by the engineer to determine the system's reliability under forecasted demand. This aspect of power system planning is the focus of this work. If reliability indices for the system fall short of desired values, then the system should be expanded in some way(s). Additional generators or lines can be added to improve the system's reliability. However, increased reliability must be balanced by cost-benefit analyses in order to choose the best expansion alternative.

1.2 Reliability Assessment

Interest in the reliability of power systems began in the 1930s, but reliability remained a largely philosophical field of study for many years. The major questions at that time were twofold: what did reliability mean, and how was it to be measured? The classical definition emphasized "adequate" functioning:

  Reliability is the probability of a device or system performing its function adequately, for the period of time intended, under the operating conditions intended. [1: p. 1]

Not until after the second world war did theoretical and practical study of reliability begin. The boom of the aeronautical and space industries in the '60s and '70s added impetus and maturity to the study of reliability in several fields, including power systems. A broad definition of bulk power system reliability, as given by the Institute of Electrical and Electronics Engineers (IEEE), combines the issues of "adequacy" and "security," as follows:

  Adequacy is the degree of assurance that a bulk power system has sufficient capability to meet the aggregate loads of all customers. Security is the degree of assurance of a bulk power system in avoiding uncontrolled, cascading trip-outs which may result in widespread power interruptions.
  [2: p. 3443]

The security of a bulk power system is determined by short-term, dynamic control; its adequacy is more involved with long-term, steady-state operation. For this reason, and because of the classical definition noted above, the terms "reliability" and "adequacy" are, by tradition, used interchangeably. This work is concerned exclusively with the adequacy of bulk power systems, and further references to "reliability" strictly imply "adequacy." Several measures of adequacy are currently in use, ranging from a system's loss-of-load probability to a customer's expected number of interruptions per year. In general, reliability indices are either probabilities, frequencies, or expectations. The probabilistic nature of these measures indicates that statistics and probability theory play a large role in adequacy assessment, as the classical definition suggests. Reliability studies may be performed on a utility's generation system, transmission system, distribution system, or bulk power system. In all cases, the general procedure [1: p. 113] is the same. First, the system to be analyzed must be defined; that is, the components to be included in the model must be listed, and the necessary failure data for each component must be assembled. Also, any assumptions made in using the model should be delineated, such as proscribed operator responses following component failures. Second, the criteria for system failure must be defined. Usually, each of these criteria implies an inability to meet the system load. Third, the set of component failures to be considered must be chosen, and a "failure effects analysis," or determination of system failure, must be performed for each contingency. Finally, reliability indices must be chosen, computed, and analyzed. In the earlier days of power system adequacy assessment, only generation studies were performed.
Decades passed before transmission systems were studied, and large-scale studies of distribution networks were never considered. The infancy of computer technology, combined with the lack of algorithms for studying complete systems, prohibited analyses of bulk power systems. Only recently has reliability assessment of bulk power systems become feasible. Fast algorithms for studying power flows in bulk power systems, coupled with computers of sufficient speed and memory, make these studies possible. The will to perform these studies comes from an understanding among utility engineers that separate analyses of generation and transmission systems require highly unrealistic assumptions. Indeed, studying a generation (or transmission) system requires assuming that the connected transmission (or generation) system never fails. Though bulk power system reliability assessment is becoming common in the utility industry [3], serious problems remain. The large number of generating units and transmission lines that are part of modern power systems, as well as the need to consider overlapping component failures, means that thousands of contingencies must be analyzed in order to determine a system's adequacy. Furthermore, failure effects analyses for a bulk power system involve performing load flow studies at several load levels for each contingency. The necessity for large numbers of load flow studies compels utility engineers to sharply reduce both the number of contingencies and the number of load levels considered, with a consequent decrease in the accuracy of reliability indices.

1.3 Neural Networks

Neural networks are highly interconnected systems of simple nonlinear elements which have shown astounding pattern recognition abilities within the last few years.
Interest in this field of artificial intelligence peaked during the 1940s after a neurophysiologist named Warren McCulloch and an electrical engineer named Walter Pitts developed a network that could be trained to mimic simple logic functions. This network, shown in Figure 1.3, motivated cognitive and neurological scientists to develop extended networks and training algorithms for many years. Indeed, at the height of this era, a neural network called the "perceptron" was taught to mimic a retina and could perform visual recognition tasks.

[Figure 1.3: The McCulloch & Pitts Model (two inputs feeding elements 1-3, which produce a single output)]

Interest waned in the late '60s, however, after researchers in a rival artificial intelligence community proved that neural networks with only input/output layers had severe limitations [4]. This state of affairs continued until the late '70s, when scientists from several fields developed a training algorithm for a neural network with intermediate layers and showed that networks of this type had much more power than earlier models. These networks sparked a revolution [5] in the communication and signal processing fields and have been used in many other fields of study, including power systems. The operation of a neural network involves repeatedly summing weighted input signals and passing the results through simple nonlinearities. A network is trained to recognize classes of patterns through the use of examples. That is, a training algorithm chooses a set of weights which maps example input signals to example outputs. The power of these networks is not, however, in simply memorizing sample patterns. Neural networks are capable of generalizing to new patterns and classifying them based on certain intrinsic characteristics. These networks, given an adequate range of examples, are capable of learning the characteristics which distinguish one class of patterns from others.
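The weighted-sum-and-nonlinearity operation described above can be illustrated with a minimal Python sketch. The two-input unit, sigmoid activation, and the particular weights are illustrative assumptions only, not the network topologies or training results presented later in this work.

```python
import math

def sigmoid(x):
    # Simple smooth nonlinearity squashing any real value into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, weights, bias):
    # One processing element: a weighted sum of the input signals plus
    # a bias, passed through the nonlinearity.
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(s)

# Illustrative two-input unit; the weights and bias are arbitrary examples.
out = forward([1.0, 0.0], weights=[2.0, -1.0], bias=-0.5)
```

Training, in this picture, amounts to an algorithm adjusting `weights` and `bias` until the unit's outputs match example outputs for example inputs.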
A major difficulty in applying neural networks stems from the typically huge amount of concurrent data that are available for use as input. This is especially true in power system applications. The size of a neural net depends on the number of its input nodes, and large networks are generally more difficult to train. The iterative algorithms used to train neural nets may take an inordinate amount of time or even fail to converge. It is advantageous, therefore, to find a representative set of features of the available variables; the condensed set of features can then be used as input signals, instead of the actual variables. In the parlance of neural networks, numerical computation of the features of a set of variables is called "preprocessing." Finding features to use as input, however, usually requires a great deal of knowledge about the particular application; even with sufficient expertise, heuristic preprocessing necessarily involves a measure of guesswork. Principal components analysis is a statistical method that transforms a highly correlated set of variables into a linearly independent set. The transformed variables are called "principal components." Many of the variables in power system applications are highly interdependent; for instance, a ten-to-one ratio may exist between the reactance and resistance in a transmission line, and similar lines are used throughout a power system. For this reason, principal components analysis is used in this work to reduce the dimensionality of a chosen set of input variables.

1.4 About the Thesis

This thesis demonstrates that a neural network, using a chosen set of power system variables, can be trained to specify the acceptability of an arbitrary contingency at an arbitrary load level. The neural net can then be used to determine the acceptabilities of other system states so that reliability indices can be quickly computed, using all possible contingencies at as many load levels as desired.
Reporting the results of this research requires a review of background material. Chapter 2 contains a thorough examination of power system reliability theory, while Chapter 3 provides the necessary elements of neural network training and operation, as well as a review of principal components analysis. In Chapter 4, the objectives and methodology of the research are presented, using a small power system as an example. Chapter 5 consists of simulation results for several larger systems and a comparison of adequacy assessment techniques. Finally, conclusions are given in Chapter 6, along with limitations of the research and suggestions for future work.

CHAPTER 2
POWER SYSTEM RELIABILITY

2.1 Philosophy of Reliability

The philosophical objective of a power system adequacy assessment is the development of the most reliable system at the least cost. The general methodology for performing a reliability evaluation is simple, yet several difficulties exist in practice. Difficulties range from conceptual and modeling issues to computational and data requirements [6]. These problems, and the assortment of simplifying assumptions used to solve them, account for the wide variability in adequacy assessment techniques. Indeed, depending on the assumptions made in evaluating a system, the resulting indices may range from near-exact to trivial values [7]. Since the structure of a reliability procedure depends on its purpose, one conceptual issue involves the prospective uses of the results. An adequacy assessment may be performed to compare alternative designs, to measure a system's reliability against available standards, or to find a balance between costs and benefits of increased reliability. The purpose of the study may affect the system model, computations, and required data. Another conceptual problem involves criteria for system failure.
Depending on the level of detail in the system being evaluated, the set of failure events may consist of a lack of sufficient generation capacity, an inability of the transmission system to carry any or all of the load, high or low bus voltages, and/or other problems. Also, the choice of adequacy indices must reflect the intended uses of the assessment. These indices may be system-wide, load-point, or customer-related measures. In addition, standards by which to assess computed indices must be found; in many cases, though, such standards are simply unavailable. Modeling issues include the types of system components to represent and the modes of component failure. The representations of generation dispatch and load demand must also be decided. Also, whether and how to model dependent failures, weather effects, and energy constraints, as well as preventive maintenance of components and the response of system operators following failures, must be determined. Modeling issues are resolved by making appropriate assumptions about the significance (or insignificance) of these factors to the reliability study. Computational difficulties are especially onerous. Unless all possible contingencies of a power system are to be used in assessing its adequacy, a contingency selection process must be chosen. Generally, only the most probable contingencies are used; however, more sophisticated ranking procedures [8-10] based on the anticipated severity of contingencies are also available. In either case, compilation of the relevant data requires significant calculation. After a contingency set is chosen, it must be analyzed for failures. Depending on the model of the system under study, evaluation of contingencies can be the most computer-intensive part of reliability assessment; this is certainly true for bulk power system studies.
If operator responses are modeled, failed contingencies must be reevaluated following supposed corrective actions, causing additional computational burdens. Hence, numerous assumptions are made regarding system parameters in order to simplify contingency analysis. Calculation of indices also requires significant computer time and memory, and other assumptions are often made in order to simplify this task. Data requirements for a reliability study may be extensive. Generally, a forecast of the hourly demand and failure rates for all components in the system model are necessary. If dependent failures and weather effects are modeled, data pertaining to them are also required. The same is true for energy constraints and preventive maintenance of components. In many cases, the lack of access to certain types of data determines the structure and compass of a reliability assessment. The practical philosophy in power system reliability studies is, therefore, to make those assumptions which give the best tradeoff between accuracy of indices and computational effort.

2.2 Classical Reliability

As shown in Section 1.2, the adequacy of a bulk power system is its ability to meet the load demand at any time within component ratings and voltage limits. Determination of a system's adequacy begins with delineation of the system model. All elements (e.g., generators, lines) of a bulk power system operate under several assumptions [11], as follows:

1. Each element resides in one of two possible states at a given time, either "up" or "down." Hence, the elements are binary-state, repairable components.

2. Failure and repair of distinct elements can occur simultaneously; that is, element x can fail at the moment element y is repaired. Transitions are, therefore, independent.

3. Transition from an "up" state to a "down" state, or vice versa, occurs randomly and is instantaneous for each element.
Since changes in state do not depend on previous transitions, the state of an element over time is a Markov process.

4. Average operating and repair times are constant for each element; that is, the means are stationary. Each element operates as a homogeneous Markov process.

(Note, however, that no element can fail and be repaired in any one time step.)

In addition, several variables are associated with each element, including

  m: mean "up" time, or mean time to failure (MTTF), of an element,
  r: mean "down" time, or mean time to repair (MTTR), of an element,
  p: availability of an element, or probability that an element is available (equals m/(m + r)), and
  q: unavailability of an element, or probability that an element is unavailable (equals r/(m + r), or 1 - p); also called the "forced outage rate" (FOR) of an element.

A system with n binary-state elements can reside in any one of 2^n states. Thus, a contingency is defined by the up/down status of each element. Since the elements are independent, the probability of occurrence for a particular contingency is the product of probabilities (p or q) that each element has the indicated status. Given a load forecast, the procedure for measuring the reliability of a power system is straightforward. First, all possible system states are listed with associated probabilities of occurrence. Next, the set of unacceptable, or "failed," states is determined and, depending on the reliability measures desired, certain parameters are specified for each unacceptable state. For instance, if the expected value of unserved demand for the system is a chosen index of reliability, then the amount of unserved demand for each failed state must be determined. Finally, the chosen reliability indices are computed from parameters of the set of unacceptable states. As an example, consider the 2-unit generating system shown in Figure 2.1. The first generating unit has a maximum capacity of 30 MW.
It has a mean "up" time (m1) of 100 days and a mean "down" time (r1) of 2.04 days. The second unit has a capacity of 20 MW, a mean "up" time (m2) of 50 days, and a mean "down" time (r2) of 1.55 days. (In assessing the adequacy of a generation system, the transmission system is assumed fully reliable; that is, failures of transmission lines are not considered in the analysis. Hence, lines are omitted, all generators are connected to one bus, and the entire system load is lumped at the same bus.) The availabilities of each unit are, then,

  p1 = m1 / (m1 + r1) = 100 / (100 + 2.04) = 0.98,
  p2 = m2 / (m2 + r2) = 50 / (50 + 1.55) = 0.97,

and the unavailabilities are

  q1 = 1 - p1 = 0.02,
  q2 = 1 - p2 = 0.03.

[Figure 2.1: Example Generating System (Units 1 and 2 supplying a single load)]

Table 2.1 Probabilities of Generating States

  State   Unit 1 Status   Unit 2 Status   Capacity Available   Probability
  1       up              up              50 MW                p1p2 = 0.9506
  2       up              down            30 MW                p1q2 = 0.0294
  3       down            up              20 MW                q1p2 = 0.0194
  4       down            down            0 MW                 q1q2 = 0.0006

Generating states and associated probabilities are listed in Table 2.1. The 2-unit system has 4 combinations of available capacities, from zero, where both units are down (state #4), to the sum of unit capacities, where both units are up (state #1). A generating system's performance depends on its expected load. Using historical load data, the projected load on a power system for a given year can be expressed as a function of time. This function can be approximated by discrete load levels over a normalized time variable, such that the probability of having a particular discrete load can be specified [12]. Discrete load levels and associated probabilities for the example system are given in Table 2.2.

Table 2.2 Probabilities of Load States

  Load State   Load Level   Probability
  1            10 MW        0.60
  2            30 MW        0.08
  3            35 MW        0.20
  4            40 MW        0.12

The 4-state generation model and 4-state load model combine to yield 16 system states, as shown in Figure 2.2.
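As a rough illustration, the availability arithmetic and the state probabilities of Table 2.1 can be reproduced with a short Python sketch; the function and variable names are illustrative only, using the example system's data.

```python
from itertools import product

def availability(mttf, mttr):
    # p = m / (m + r); the complementary unavailability is q = 1 - p.
    return mttf / (mttf + mttr)

p1 = availability(100.0, 2.04)   # unit 1: 30 MW capacity
p2 = availability(50.0, 1.55)    # unit 2: 20 MW capacity
caps = [30.0, 20.0]
ps = [p1, p2]

# Enumerate all 2^n up/down combinations; the probability of a
# contingency is the product of p (up) or q (down) for each element.
states = []
for status in product([1, 0], repeat=2):   # 1 = up, 0 = down
    prob = 1.0
    for up, p in zip(status, ps):
        prob *= p if up else (1.0 - p)
    cap = sum(c for c, up in zip(caps, status) if up)
    states.append((status, cap, prob))
```

Enumerating the full product space this way scales as 2^n, which is exactly why contingency selection becomes necessary for realistic systems.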
Each state is defined by a capacity margin, which is the difference between available capacity and load for the state, and by the probability that the system resides in that state. Events in the generation and load models are independent, so changes in one do not affect the probabilities of change in the other. Hence, the probability associated with a particular "margin state" is the product of generation and load state probabilities. For instance, the first state, comprising the first generation and load states in Tables 2.1 and 2.2, respectively, has an available capacity of 50 MW and a load of 10 MW. Thus, the capacity margin is 40 MW. The probability of a 50-MW capacity is 0.9506; the probability of a 10-MW load is 0.60. So, the probability of a 40-MW margin is the product, or 0.5704, as shown in Figure 2.2. For a generating system, the failed states are those with negative capacity margins, since these states have insufficient capacity to meet the load demand. Since only parameters of unacceptable states are required for computation of reliability indices, information pertaining to acceptable states can be ignored. The boundary in Figure 2.2 separates acceptable (PM, or positive-margin) states from unacceptable (NM, or negative-margin) states.

Figure 2.2 Capacity Margins and Associated Probabilities (margin / probability for each combination of capacity and load)

                    Load 10 MW         Load 30 MW         Load 35 MW         Load 40 MW
  Capacity 50 MW    40 MW / 0.570360   20 MW / 0.076048   15 MW / 0.190120   10 MW / 0.114072
  Capacity 30 MW    20 MW / 0.017640    0 MW / 0.002352   -5 MW / 0.005880  -10 MW / 0.003528
  Capacity 20 MW    10 MW / 0.011640  -10 MW / 0.001552  -15 MW / 0.003880  -20 MW / 0.002328
  Capacity 0 MW    -10 MW / 0.000360  -30 MW / 0.000048  -35 MW / 0.000120  -40 MW / 0.000072

One of the most common indices of reliability is the "loss-of-load probability," which is the long-term probability of having inadequate generation at any given time.
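A short Python sketch reproduces this margin-state enumeration and the resulting indices from the data of Tables 2.1 and 2.2: the loss-of-load probability as the sum of negative-margin probabilities, and the expected unserved demand discussed next. The variable names are illustrative only.

```python
# Generation states (available capacity in MW, probability), from Table 2.1.
gen = [(50, 0.9506), (30, 0.0294), (20, 0.0194), (0, 0.0006)]
# Load states (level in MW, probability), from Table 2.2.
load = [(10, 0.60), (30, 0.08), (35, 0.20), (40, 0.12)]

lolp = 0.0      # loss-of-load probability
unserved = 0.0  # cumulative product of |margin| and state probability
for cap, pg in gen:
    for lvl, pl in load:
        margin = cap - lvl
        if margin < 0:             # negative-margin ("failed") state
            prob = pg * pl         # independence of generation and load
            lolp += prob
            unserved += -margin * prob

# Expectation conditioned on residing in a failed state.
eud = unserved / lolp
```

Run on the example system, this yields a LOLP of about 0.0178 and an EUD of about 11.1 MW, matching the values derived in the text.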
Since the system endures a loss of load when it resides in any one of the disjoint NM states, the loss-of-load probability (LOLP) is the sum of probabilities associated with negative-margin states. For the example system (using the data in Figure 2.2), the LOLP is 0.0178. Using a conversion factor of 365 days per year, this value indicates that a loss-of-load condition can be expected to occur on 7 days of the year. Another common index is the "expected unserved demand" (EUD), or the average value of load not met in a failed state. For a negative-margin state, note that the magnitude of the capacity margin is the unserved demand for that state. Hence, the EUD for the system can be computed as the cumulative product of capacity margins and probabilities associated with these states, divided by the loss-of-load probability. (In the computation of EUD, the cumulative product is divided by LOLP because the expectation is conditioned on the existence of a failed state.) For the example system, the EUD is 11.1 MW. In short, the system in Figure 2.1 will fail to serve an average of 11.1 MW of load on each of 7 days in the year. These reliability indices can be compared to those of other generating systems or to those of expansion alternatives. In contrast, the reliability evaluation of transmission systems focuses on load points instead of supply. Hence, the indices relate to the continuity of service (or whether a load is served) rather than to its quality (or amount of load served), so a probabilistic load model is unnecessary. The generation system is assumed fully reliable and situated at one bus, and each transmission line terminates on at least one load point. The 4-line system in Figure 2.3 can be used for illustration [13].
[Figure 2.3: Example Transmission System — a generator bus supplying load A through lines 1 and 2 and load B through line 4, with line 3 connecting loads A and B.]

Historical observation of the system indicates that, on average, lines 1 and 2 are out of service 0.5 days per year, while lines 3 and 4 are out 0.1 and 0.6 days per year, respectively. The unavailability of line i, or the probability that it is out of service, is simply

    q_i = (days per year that line i is out) / (365 days per year),

and these probabilities are listed in Table 2.3.

Table 2.3 Forced Outage Rates for Transmission Lines

    Line    Unavailability
    1       q1 = 0.001370
    2       q2 = 0.001370
    3       q3 = 0.000274
    4       q4 = 0.001644

There are three series paths for supplying load A, expressed as line connections. They are line 1, line 2, and line 4 + line 3. Load A can be supplied if any one of these paths is available; alternately, load A is interrupted when all three paths to it are unavailable. The Average Annual Customer Interruption Rate (AACIR) for a load, or large group of customers, is defined as the expected number of days per year that the continuity of supply to the load is interrupted. For load A, then,

    AACIR_A = 365 days/year × P( line 1 out ∩ line 2 out ∩ (line 3 out ∪ line 4 out) )
            = 365 days/year × q1 q2 (q3 + q4 − q3 q4)
            = 1.32 × 10⁻⁶ days/year, or about 1 second every 10 years.

The paths to load B are line 1 - line 3, line 2 - line 3, and line 4. Hence,

    AACIR_B = 365 days/year × P( (line 3 out ∪ (line 1 out ∩ line 2 out)) ∩ line 4 out )
            = 365 days/year × (q3 + q1 q2 − q3 q1 q2) q4
            = 1.655 × 10⁻⁴ days/year, or about 14 seconds per year.

The interruption rate for any load point in a realistic transmission system can be determined by this method; the only difficulty stems from listing the numerous paths to each load. Like the loss-of-load probability for generating systems, the AACIR uses historical failure data and elementary probability theory.
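The same arithmetic can be scripted directly; the sketch below reproduces the two interruption rates from the unavailabilities of Table 2.3.

```python
# AACIR for loads A and B of the 4-line example system (Figure 2.3).
q1 = q2 = 0.5 / 365   # lines 1 and 2: out of service 0.5 days per year
q3 = 0.1 / 365        # line 3: 0.1 days per year
q4 = 0.6 / 365        # line 4: 0.6 days per year

# Load A is interrupted when lines 1 and 2 are out AND line 3 or line 4 is out.
aacir_a = 365 * q1 * q2 * (q3 + q4 - q3 * q4)
# Load B is interrupted when line 4 is out AND (line 3 is out, or lines 1 and 2 are).
aacir_b = 365 * (q3 + q1 * q2 - q3 * q1 * q2) * q4

print(f"{aacir_a:.2e} days/year")  # on the order of 1.3e-06
print(f"{aacir_b:.2e} days/year")  # on the order of 1.7e-04
```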
However, flow-graph theory is also a necessary component of transmission system reliability assessment. Other indices, such as the frequency and expected duration of load interruptions, are derived from simple extensions to this basic algorithm [14].

2.3 Load Flow Studies

In a power system model without transmission lines, the set of unacceptable states can be determined simply by comparing generation and load levels. The inclusion of transmission lines in the model means that the load is augmented by transmission losses; that is, generation levels must be compared to levels of load plus losses in order to determine failed states4 in a bulk power system model. In effect, the boundary between positive- and negative-margin states (see Figure 2.2) is shifted, by the addition of losses, towards the upper left-hand corner. Hence, failure effects analyses of bulk power systems require performance of load flow studies. A load flow study is the determination of voltage at all points (and power through all paths) in a bulk power system under a given load and contingency. It is essential to bulk power system reliability assessment because satisfactory future operation depends on knowing the effects of contingencies and new loads before they occur. Certain system variables must be known before a load flow study can be performed. They include the voltage magnitude at all generator buses, the real power generated at all but one of the generator buses, the real and reactive load at each bus, and the admittance and shunt capacitance of each line. A bulk power system with two generators and two lines is shown in Figure 2.4. Line losses for a given load are unknown and can only be determined by finding the bus voltages at the ends of each line. This requires solving a set of nonlinear equations for the system, which is generally performed using an iterative procedure like the Newton-Raphson method [15: pp. 193-226].

4 On average, losses account for approximately 5% of the total generation, or 33% of a system's reserves, and are therefore significant. Also, even if total generation exceeds load plus losses, the transfer capability of the lines may be insufficient to deliver power to the loads.

[Figure 2.4: Example Bulk Power System — two generator buses supplying a load bus through two transmission lines.]

Four variables are associated with each bus in a power system: the voltage magnitude (V) and angle (δ) at the bus, and the real (P) and reactive (Q) power injections to the bus, which are the differences between generation and load at the bus. There are three categories of buses in a power flow study, and two of the four bus variables are specified for each. The categories are

- a swing (or slack) bus, usually numbered '1', which is a set of generators that supplies the difference between the power provided to the system by other generators and the total system load plus losses. The voltage magnitude and angle (usually set to zero) are specified for this bus.
- voltage-controlled buses, consisting of all other generator buses in the system. The voltage magnitude and real power injection are specified for these buses.
- load buses, consisting of all other nodes in the system. The real and reactive power injections, which are negative for loads, are specified for these buses.

The complex voltages and injected currents at the buses are related by the node admittance matrix, denoted Ybus [16]. This matrix is a shorthand description of the system's topology which includes the admittances of each line. Each diagonal element (or driving-point admittance) is formed by summing all admittances connected to the associated node, or bus, including all shunt admittances. Off-diagonal elements (or transfer admittances) are formed by taking the negative of the admittance between the two associated buses, excluding the shunt admittance.
[Figure 2.5: Line Admittances for the Sample Bulk Power System — the two line admittances, y13 and y23, with their associated shunt admittances, b13 and b23.]

Line and shunt admittances for the system of Figure 2.4 are shown in Figure 2.5. The two line admittances, denoted y, are shown with associated shunt admittances, b, half of which is lumped at each end of the line. If the number of buses in the system is N, then the node admittance elements are formally computed as

    Y_kk∠θ_kk = Σ_{n≠k} ( y_kn + ½b_kn )   (2.1)

and

    Y_kn∠θ_kn = −y_kn,  k ≠ n.   (2.2)

For the sample system, the node admittance matrix is

    Ybus = [ Y11∠θ11  Y12∠θ12  Y13∠θ13 ]
           [ Y21∠θ21  Y22∠θ22  Y23∠θ23 ]
           [ Y31∠θ31  Y32∠θ32  Y33∠θ33 ]

         = [ y13 + ½b13       0            −y13              ]
           [ 0                y23 + ½b23   −y23              ]   (2.3)
           [ −y13             −y23         y13 + y23 + ½(b13 + b23) ].

The symmetry of the node admittance matrix allows for use of only the upper (or lower) triangular portion. This is important for large systems because computer memory requirements can be greatly reduced by omitting the redundant portion of the matrix. In addition, the node admittance matrix is typically very sparse; that is, the percentage of trivial entries generally increases with the size of the system. This means that computer storage requirements can be further reduced by techniques of sparsity programming [17-18]. A load flow study begins with formation of the node admittance matrix, which is independent of the system load. Note, however, that the matrix changes under any line contingency [15: pp. 166-192]. If, for instance, line 1-3 is removed from service, then the matrix must be updated by the addition of a line parallel to line 1-3. The new line has an admittance and shunt susceptance that effectively cancels that of line 1-3. That is, the new line has an admittance of −y13 and a shunt susceptance of −b13, and the new admittance matrix is

    Ybus_new = [ 0    0            0          ]
               [ 0    y23 + ½b23   −y23       ]   (2.4)
               [ 0    −y23         y23 + ½b23 ],

describing a system with only the second line.5
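The construction of Equations (2.1)-(2.4) can be sketched in code. The per-unit line data below is hypothetical, and lumping half of each line's shunt susceptance at each end is an assumption consistent with the matrix above.

```python
# Node admittance matrix for a 3-bus, 2-line system, plus the update that
# removes line 1-3 by adding a parallel line with admittance -y13 and shunt -b13.
import numpy as np

def build_ybus(n_bus, lines):
    """lines: (i, j, y, b) tuples; series admittance y, total shunt susceptance b."""
    Y = np.zeros((n_bus, n_bus), dtype=complex)
    for i, j, y, b in lines:
        Y[i, i] += y + 1j * b / 2   # driving-point admittances (diagonal)
        Y[j, j] += y + 1j * b / 2
        Y[i, j] -= y                # transfer admittances (off-diagonal)
        Y[j, i] -= y
    return Y

# Hypothetical per-unit data for lines 1-3 and 2-3 (buses are 0-indexed here).
lines = [(0, 2, 2 - 8j, 0.10), (1, 2, 1 - 4j, 0.08)]
Y = build_ybus(3, lines)

# Line 1-3 outage: only elements (1,1), (1,3), (3,1), and (3,3) change.
Y_out = Y.copy()
i, j, y, b = lines[0]
Y_out[i, i] -= y + 1j * b / 2
Y_out[j, j] -= y + 1j * b / 2
Y_out[i, j] += y
Y_out[j, i] += y
```

Note that the outage update touches only four elements, so a full rebuild of the matrix is never needed for a single-line contingency.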
Since line 1-3 is connected to buses 1 and 3, only matrix elements 11, 13, 31, and 33 are affected, and only by the admittance and shunt susceptance of the missing line. Hence, a new admittance matrix need not be formed under a line contingency; only the affected elements need be altered.

5 Here, the first row and column of the matrix can be deleted because bus 1 is effectively isolated from the system.

[Figure 2.6: Node Model — bus k with voltage V_k∠δ_k, its injected current I_k, and its connections to neighboring buses and to the ground bus.]

Since all line power flows can be computed using node voltages, the load flow study is considered solved when the complex voltage at each bus is known. Hence, power injection equations, which are functions of node voltages, must be written for each bus. With reference to the node model shown in Figure 2.6, the current injected at bus k is I_k = Σ_n Y_kn∠θ_kn · V_n∠δ_n, so the complex conjugate of the power injected at bus k, from generation and/or load at the bus, is6

    S_k* = P_k − jQ_k = V_k∠−δ_k · I_k
         = V_k∠−δ_k · Σ_n Y_kn∠θ_kn · V_n∠δ_n
         = Σ_n V_k V_n Y_kn ∠(θ_kn + δ_n − δ_k)
         = Σ_n V_k V_n Y_kn [ cos(θ_kn + δ_n − δ_k) + j sin(θ_kn + δ_n − δ_k) ].   (2.5)

Thus, for a system with N buses, the power injections at the kth bus are calculated as

    P_k = Re{S_k*} = Σ_{n=1}^{N} V_k V_n Y_kn cos(θ_kn + δ_n − δ_k)   (2.6)

and

    Q_k = −Im{S_k*} = −Σ_{n=1}^{N} V_k V_n Y_kn sin(θ_kn + δ_n − δ_k),   (2.7)

yielding a system of 2N nonlinear equations.

6 By Kirchhoff's current law, the power injected from generation and/or load at bus k must flow to all connected buses, including ground bus 0.

Half of the 4N bus variables are specified. The 2N unknown variables are the real and reactive power injection at the slack bus, the voltage angle and reactive power injection at other generator buses, and the voltage magnitude and angle at the loads. Yet, the system of equations represented by Equations (2.6) and (2.7) is a function of only voltage magnitudes and angles.
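In rectangular coordinates, Equations (2.5)-(2.7) collapse to S_k = V_k · conj(Σ_n Y_kn V_n). A sketch with a hypothetical two-bus line:

```python
# Bus power injections P_k + jQ_k = V_k * conj(I_k), with I = Y @ V.
import numpy as np

def injections(Y, V):
    """Y: N x N node admittance matrix; V: complex bus voltages (per unit)."""
    S = V * np.conj(Y @ V)   # elementwise V_k * conj(I_k)
    return S.real, S.imag

# Hypothetical two-bus system joined by one line of admittance y = 1 - 5j p.u.
y = 1 - 5j
Y = np.array([[y, -y], [-y, y]])
V = np.array([1.0 + 0j, 0.97 * np.exp(-0.05j)])
P, Q = injections(Y, V)

# With no shunts, the injections must balance to the series loss |I|^2 * R.
I = y * (V[0] - V[1])
loss = (abs(I) ** 2) * (1 / y).real
print(abs(P.sum() - loss) < 1e-12)  # -> True
```

The balance check at the end is the statement from the text that line losses equal the difference between total generation and total load.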
Let the number of load buses in the power system equal ℓ and the number of generator buses, including the slack bus, equal g (such that N = g + ℓ); then the unknown quantities in Equations (2.6) and (2.7) are the voltage angles at all buses except the slack bus (g − 1 + ℓ unknowns) and the voltage magnitudes at the load buses (ℓ unknowns). Thus, Equations (2.6) and (2.7) represent a system of 2(g + ℓ) equations in g − 1 + 2ℓ unknowns. By removing all Equations (2.6) for which P_k is not specified and all Equations (2.7) for which Q_k is not specified, the nonlinear system has g − 1 + 2ℓ equations in g − 1 + 2ℓ unknowns. That is, Equation (2.6) is written for all generator buses except the slack node and for all loads, yielding g − 1 + ℓ equations; Equation (2.7) is written for all load buses to give ℓ equations. A Taylor-series expansion of the remaining injection equations about the voltage magnitudes and angles, neglecting second- and higher-order gradients, gives7

    ΔP_k = P_k^spec − P_k^calc = Σ_{n=2}^{N} (∂P_k/∂δ_n) Δδ_n + Σ_{n=g+1}^{N} (∂P_k/∂V_n) ΔV_n,  k = 2, ..., N   (2.8)

and

    ΔQ_k = Q_k^spec − Q_k^calc = Σ_{n=2}^{N} (∂Q_k/∂δ_n) Δδ_n + Σ_{n=g+1}^{N} (∂Q_k/∂V_n) ΔV_n,  k = g + 1, ..., N,   (2.9)

where P_k^spec denotes the specified real injection at the kth bus and P_k^calc is the calculated real injection of Equation (2.6). The sensitivities of real power injections to changes in voltage magnitudes are negligible, as are the sensitivities of reactive power injections to changes in voltage angles. This means that the nonlinear system of equations represented by Equations (2.8) and (2.9) can be decoupled into a subsystem for voltage angles and a subsystem for voltage magnitudes. Thus, the differences between specified and calculated real power injections can be approximated by

    ΔP_k = P_k^spec − P_k^calc ≈ (∂P_k/∂δ_2) Δδ_2 + ... + (∂P_k/∂δ_N) Δδ_N,  k = 2, ..., N.   (2.10)

7 Since the voltage angle at the slack bus and the voltage magnitudes at all generators are known quantities, Δδ_1 = 0 and ΔV_k = 0 for k = 1, ..., g.

In like manner, the differences between specified and calculated reactive power injections are approximated by

    ΔQ_k = Q_k^spec − Q_k^calc ≈ (∂Q_k/∂V_{g+1}) ΔV_{g+1} + ... + (∂Q_k/∂V_N) ΔV_N,  k = g + 1, ..., N.   (2.11)

These equations yield the two linear systems

    [ ∂P_2/∂δ_2  ...  ∂P_2/∂δ_N ] [ Δδ_2 ]   [ ΔP_2 ]
    [    :                :     ] [  :   ] = [  :   ]   (2.12)
    [ ∂P_N/∂δ_2  ...  ∂P_N/∂δ_N ] [ Δδ_N ]   [ ΔP_N ]

and

    [ ∂Q_{g+1}/∂V_{g+1}  ...  ∂Q_{g+1}/∂V_N ] [ ΔV_{g+1} ]   [ ΔQ_{g+1} ]
    [        :                      :       ] [    :     ] = [    :     ]   (2.13)
    [ ∂Q_N/∂V_{g+1}      ...  ∂Q_N/∂V_N    ] [ ΔV_N     ]   [ ΔQ_N     ],

where the gradients are computed as follows:

    ∂P_k/∂δ_k = Σ_{n≠k} V_k V_n Y_kn sin(θ_kn + δ_n − δ_k),
    ∂P_k/∂δ_n = −V_k V_n Y_kn sin(θ_kn + δ_n − δ_k),  n ≠ k,
    ∂Q_k/∂V_k = −Σ_{n≠k} V_n Y_kn sin(θ_kn + δ_n − δ_k) − 2 V_k Y_kk sin(θ_kk),
    ∂Q_k/∂V_n = −V_k Y_kn sin(θ_kn + δ_n − δ_k),  n ≠ k.   (2.14)

Unknown voltage magnitudes are initially set to either that of the slack bus or 1.0 per unit. Unknown voltage angles are initially set to zero. Solving the linear systems in Equations (2.12) and (2.13) gives changes to the initial estimates, such that δ_k^new = δ_k^old + Δδ_k and V_k^new = V_k^old + ΔV_k. After updating complex voltages, real and reactive power injections, and gradients, the linear systems are solved iteratively8 until the voltage changes are negligible. When the complex voltages are known, all other system variables can be computed. Real injection at the slack bus is found using Equation (2.6) for k = 1, and reactive injection at all generators is computed by using Equation (2.7) for k = 1, ..., g. Line power flows are found by

    S_kn = P_kn + jQ_kn = V_k∠δ_k · I_kn* = V_k∠δ_k [ (−Y_kn∠θ_kn)(V_k∠δ_k − V_n∠δ_n) ]*,  k ≠ n.   (2.15)

Line power losses are computed either as the difference between total generator output and total load or as the sum of individual line losses. This technique is called Fast Decoupled AC Load Flow [19] and, like all Newton-based methods, converges only when the initial voltage estimates are "near" the actual values.

8 Experience with numerous power systems has shown that the gradients in Equation (2.14) are nearly invariant with voltage changes; hence, a common practice involves using the original gradient calculations in all iterations.
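A toy version of the decoupled iteration can be sketched for a two-bus system (slack bus plus one load bus). The line data and loads are hypothetical, and the two gradients are taken by finite differences rather than Equation (2.14), so this only mirrors the decoupled structure of Equations (2.10)-(2.13).

```python
# Decoupled load flow sketch: a slack bus (V = 1 p.u., angle 0) feeds one load
# bus through a line of admittance y; the angle is corrected from the real-power
# mismatch and the magnitude from the reactive-power mismatch.
import numpy as np

y = 1 - 5j                      # hypothetical series admittance, p.u.
Y21, Y22 = -y, y                # row 2 of the node admittance matrix
P_spec, Q_spec = -0.5, -0.2     # specified load-bus injections (negative = load)

def s2(v2, d2):
    """Complex injection S_2 = V_2 * conj(I_2) at the load bus."""
    V2 = v2 * np.exp(1j * d2)
    return V2 * np.conj(Y21 * 1.0 + Y22 * V2)

v2, d2, h = 1.0, 0.0, 1e-6      # flat start and finite-difference step
for _ in range(100):
    dP = P_spec - s2(v2, d2).real
    dQ = Q_spec - s2(v2, d2).imag
    if abs(dP) < 1e-10 and abs(dQ) < 1e-10:
        break
    dP_dd = (s2(v2, d2 + h).real - s2(v2, d2 - h).real) / (2 * h)
    dQ_dv = (s2(v2 + h, d2).imag - s2(v2 - h, d2).imag) / (2 * h)
    d2 += dP / dP_dd            # angle subsystem, as in Equation (2.12)
    v2 += dQ / dQ_dv            # magnitude subsystem, as in Equation (2.13)
```

A production fast-decoupled solver would build the gradient matrices once and reuse them (footnote 8); here they are simply re-estimated on each pass for clarity.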
In bulk power system reliability assessment, this procedure must be performed for each contingency that is not obviously unacceptable.9 A system state is unacceptable under any of the following conditions:

- the voltage magnitude at any load bus exceeds a lower or upper voltage limit (usually 0.95 and 1.05 per unit, respectively),
- the real or reactive power generation at any voltage-controlled bus exceeds the total rated capacity at the bus, or
- the magnitude of the complex power flowing on any line exceeds the thermal capacity of the line.

Switched capacitors (which alter reactive power injections) or changes in transformer taps (which alter line and shunt admittances) may be used to correct unacceptable conditions, but a new load flow study must be performed on the contingency for each change. A reduction in load may be necessary to make a failed state acceptable, in which case a loss-of-load condition exists. Clearly, the number of power flow studies necessary to assess the adequacy of a power system can be prohibitive.

9 Clearly, a system state with no generators and/or no lines is unacceptable. A load flow study is not required to determine the acceptability of this sort of contingency. Indeed, the iterative technique diverges in these cases.

2.4 Reliability of Bulk Power Systems

As shown in Section 2.2, the state-space approach to adequacy assessment relies heavily on the theory of homogeneous Markov processes [20]. If a "system state" is defined as a power system contingency (in which lines and/or generators may be out of service) that occurs at a given level of system load, and if the system's load and its contingencies occur randomly, then the state of a bulk power system varies randomly and its characteristics can be modeled by probability theory.
Formally, a stochastic process is a set of random variables which are ordered sequentially; for example, the state of a power system over time is a set of joint random variables with contingencies and loads as independent random variables, so a power system operates as a stochastic process. A Markov process is a stochastic process in which the probability associated with a random variable depends only on the state of the variable at the previous time step, but not on earlier states. A Markov process is homogeneous when the probability that a random variable changes state does not depend on time; that is, the probability of a change in state is constant. Hence, if reliability is a measurable characteristic of a system, then it can be quantified by application of this statistical theory. Several assumptions make the theory applicable to bulk power systems. First, the probability density function of an element's time in a given state is exponential; that is, the probability that an element will stay in a given state (failed or working) for a time t' is

    P(t') = e^(−λt'),   (2.16)

where λ is the rate of transitions out of the state. The same assumption is made for load levels, which are not confined to two states. Next, events of failure or repair of distinct elements are independent, as are changes of state for the set of elements and changes of state for the load. Also, the probability of more than one of the following events occurring in one time step is negligible:

- failure or repair of element 'a',
- failure or repair of element 'b',
- change in the load level;

that is, the possibility of simultaneous transitions is ignored.

[Figure 2.7: A Two-State Homogeneous Markov Process — states i and j with transition rates λ_i and λ_j between them.]

The theory is best illustrated using a two-state process,10 described in Figure 2.7, such as the failure/repair process of an element. Here, the probability that the element's state does not change in one time step, Δt, given that the element resides in state i, is denoted p_ii(Δt), or p_ii.
The probability that the state changes to j in Δt, given that the element is currently in i, is denoted p_ij(Δt). Since state j is the complement of state i,

    p_ii(Δt) = 1 − p_ij(Δt).   (2.17)

The duration in state i has an exponential probability density function with transition rate λ_i, so the mean duration in state i is

    m_i = 1/λ_i.   (2.18)

The rate of transitions out of state i is defined as

    λ_i = lim_{Δt→0} p_ij(Δt)/Δt,

so that, if Δt is sufficiently small, the conditional probability of leaving state i in Δt is

    p_ij(Δt) = λ_i Δt.   (2.19)

10 From the perspective of a particular state, all processes involve only two states: existence in the state and existence outside it.

Let p_i(t) denote the probability that the element resides in state i at time t. If p_i(t) is known, then the state probability at the next time step is simply the conditional probability that the element remains in state i if it currently resides in state i, or that it moves to state i if it currently resides outside i; that is,

    p_i(t + Δt) = p_ii(Δt) p_i(t) + p_ji(Δt) p_j(t).   (2.20)

Substituting Equations (2.17) and (2.19), Equation (2.20) becomes

    p_i(t + Δt) = (1 − λ_i Δt) p_i(t) + λ_j Δt p_j(t),

or

    [ p_i(t + Δt) − p_i(t) ] / Δt = −λ_i p_i(t) + λ_j p_j(t).   (2.21)

As Δt approaches zero, Equation (2.21) becomes the time-derivative of the state probability which, because the process is homogeneous, is zero. Hence,

    dp_i(t)/dt = lim_{Δt→0} [ p_i(t + Δt) − p_i(t) ] / Δt = −λ_i p_i(t) + λ_j p_j(t) = 0.   (2.22)

Also, since states i and j are complements,

    p_i(t) + p_j(t) = 1.   (2.23)

Equations (2.22) and (2.23) form a linear system whose solution gives the long-run state probabilities for the element; that is,

    [ −λ_i   λ_j ] [ p_i ]   [ 0 ]
    [   1     1  ] [ p_j ] = [ 1 ],   (2.24)

so that p_i = λ_j/(λ_i + λ_j) and p_j = λ_i/(λ_i + λ_j). Alternately, by substitution of Equation (2.18),

    p_i = m_i/(m_i + m_j)  and  p_j = m_j/(m_i + m_j).   (2.25)

The frequency of encountering a state is defined as the average number of arrivals into (or departures from) the state per unit time. The probability of a given state and the frequency of encountering it are related by the mean duration of stays in the state.
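The long-run solution can be verified numerically; a minimal sketch of Equations (2.22)-(2.25), with hypothetical transition rates:

```python
# Long-run state probabilities of the two-state homogeneous Markov process:
# solve -λ_i p_i + λ_j p_j = 0 together with p_i + p_j = 1 (Equations (2.22)-(2.23)).
import numpy as np

lam_i, lam_j = 0.25, 2.0      # hypothetical transition rates out of states i and j
A = np.array([[-lam_i, lam_j],
              [1.0, 1.0]])
p_i, p_j = np.linalg.solve(A, [0.0, 1.0])

# Equation (2.25): the same probabilities from the mean durations m = 1/λ
m_i, m_j = 1 / lam_i, 1 / lam_j
print(abs(p_i - m_i / (m_i + m_j)) < 1e-12)  # -> True
```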
That is, if the mean duration of stays in each state of the process is known, then the mean cycle time of the process (i/j or i/ī) is

    T_i = m_i + m_ī = T_j,   (2.26)

where the mean duration outside state i, m_ī, is the mean duration in state j, m_j. In the long run, the frequency is the reciprocal of the mean cycle time, or

    f_i = 1/T_i = f_j.   (2.27)

From Equation (2.25), however,

    p_i = m_i f_i.   (2.28)

The fundamental relation in Equation (2.28) shows that any two of the three state parameters define the third. Indeed, Equations (2.18)-(2.19) and (2.25)-(2.28) provide a means of determining all three parameters from historical observations.11 The application of this method to bulk power system reliability begins at the level of elements, or transmission lines and generating units. An element is either in service or out. Let i and ī represent the element's working and failed states, respectively. The mean duration in each element state is assumed known; that is, m_i and m_ī are given. Then, the cycle time of the working/failed process is T_i = m_i + m_ī, and its frequency, the frequency of encountering either state, is f_i = 1/T_i. The probability that the element resides in state i is p_i = m_i f_i, and the rate of transitions out of i is λ_i = 1/m_i.

11 As an illustration, consider an automobile which, on average, runs for 4 years before requiring 6 months (½ year) of repair. Here, the mean duration of a state of repair, m_r = ½ year, is given, as is the mean duration of a working state, m_w = 4 years. The probability of the car being in a state of repair is, then, m_r/(m_r + m_w) = 1/9. Though the rate of breakdown (or departure from the working state) is once every 4 years, the frequency of encountering a state of repair is once every 4½ years.

With this foundation, an outage state is defined as a collection of element states.
With n elements in the system, each with two possible states, the sample space of contingencies contains 2^n states. If the sample space is limited to those states in which no more than two elements are out of service, then the number of states in the space becomes

    C(n,0) + C(n,1) + C(n,2) = 1 + n + ½n(n − 1).

Let the elements in the system be denoted e_1, ..., e_n; also, the probability that element e is in service is denoted p_ei, and the probability that it is out of service is p_eī. The probability that the system resides in outage state o, denoted p_o, is the joint probability that each element has the status indicated by o; that is,

    p_o = ( ∏_{e,in} p_ei ) ( ∏_{e,out} p_eī ),   (2.29)

where e,in denotes the set of elements in service for o and e,out the set of elements out of service. From the perspective of the state o, the process has only the two states o and ō. Let the transition rate out of the working state for element e be denoted λ_ei and its transition rate out of the failed state be λ_eī. Since the duration of stays in each element state is exponential, and since element state transitions are independent, the probability that no element changes state for a time t_o is ∏_{e,in} e^(−λ_ei t_o) · ∏_{e,out} e^(−λ_eī t_o), so the distribution of durations of stay in an outage state is

    P(t_o) = ( Σ_{e,in} λ_ei + Σ_{e,out} λ_eī ) e^( −( Σ_{e,in} λ_ei + Σ_{e,out} λ_eī ) t_o ),   (2.30)

such that the mean duration in o is12

    m_o = E{t_o} = ∫₀^∞ t_o P(t_o) dt_o = 1 / ( Σ_{e,in} λ_ei + Σ_{e,out} λ_eī ).   (2.31)

12 The derivation of the mean duration in Equation (2.31) uses the similarity between the gamma distribution and the distribution of t_o in Equation (2.30). The gamma distribution is P(x) = λ^a x^(a−1) e^(−λx) / Γ(a), where, if a is a positive integer, Γ(a + 1) = a!. The integrand in Equation (2.31), t_o P(t_o), is transformed to a gamma function by setting a = 2 and λ = Σ_{e,in} λ_ei + Σ_{e,out} λ_eī. Then, since the integral of any distribution over all values of a random variable is unity, the integration leaves only the constant, m_o.
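Equations (2.29)-(2.31) translate directly into code; the element availabilities and per-hour transition rates below are hypothetical.

```python
# Probability and mean duration of an outage state, per Equations (2.29)-(2.31).
# Each element: (status in this outage state, p_in_service, λ_work, λ_fail),
# where λ_work is the rate out of the working state (failures per hour) and
# λ_fail is the rate out of the failed state (repairs per hour); values assumed.
elements = [
    ("in",  0.999, 0.0002, 0.02),
    ("in",  0.998, 0.0003, 0.03),
    ("out", 0.995, 0.0004, 0.04),
]

p_o, rate_out = 1.0, 0.0
for status, p_in, lam_work, lam_fail in elements:
    if status == "in":
        p_o *= p_in           # element must be in service...
        rate_out += lam_work  # ...and can leave o by failing
    else:
        p_o *= 1 - p_in       # element must be out of service...
        rate_out += lam_fail  # ...and can leave o by being repaired
m_o = 1 / rate_out            # mean duration in o (hours), Equation (2.31)
f_o = p_o / m_o               # frequency of encountering o
```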
The derivation shows that the mean value of any exponentially distributed random variable is simply the reciprocal of its transition rate. The frequency of encountering state o is, then, f_o = p_o/m_o. The 8760 hourly loads on the system for a projected year are assumed given. The annual load curve (given as a function of time) can be discretized into several load levels, such as increments of 5% or 25% of the annual peak load. A load state is defined as any discretized load level in which the system may reside, and if the increments are 5% of the peak load, then the sample space for loads contains the 20 load levels 5%, 10%, ..., 100% of peak load. Let the number of occurrences13 of load state ℓ in the discretized annual load curve be denoted n_ℓ. If t_ℓ^k is the duration in ℓ at the kth occurrence, then the mean duration of stays in ℓ is m_ℓ = (1/n_ℓ) Σ_k t_ℓ^k. Also, the probability that the system resides in load state ℓ is p_ℓ = (1/8760) Σ_k t_ℓ^k. Since the duration in state ℓ is exponentially distributed, the transition rate out of ℓ (or into ℓ̄) is λ_ℓ = 1/m_ℓ, and the frequency of encountering ℓ is f_ℓ = p_ℓ/m_ℓ. A system state is an independent combination of an outage state and a load state. With 20 possible load states, the sample space contains 20(1 + n + ½n(n − 1)) system states. If system state s consists of outage state o and load state ℓ, then the probability that the system resides in s is14

    p_s = P(o ∩ ℓ) = p_o p_ℓ = ( ∏_{e,in} p_ei ) ( ∏_{e,out} p_eī ) p_ℓ.   (2.32)

From the perspective of state s, the process has only the states s and s̄. Both the distribution of durations of stay in o and that of ℓ are exponential, so the distribution of durations of stay in s, t_s, is

    P(t_s) = ( λ_ℓ + Σ_{e,in} λ_ei + Σ_{e,out} λ_eī ) e^( −( λ_ℓ + Σ_{e,in} λ_ei + Σ_{e,out} λ_eī ) t_s ),   (2.33)

such that the mean duration of stays in s is

    m_s = 1 / ( λ_ℓ + Σ_{e,in} λ_ei + Σ_{e,out} λ_eī ).   (2.34)

The frequency of encountering system state s is f_s = p_s/m_s. State-space reliability indices for the system are the probability, mean duration, and frequency of encountering a failed (or unacceptable) system state [1: pp. 194-200]. For the purpose of computing them, an acceptability state is either the set of failed system states or the set of working system states, and a system state is classified as working or failed by performing a load flow study. Let the sets z and z̄ represent a system's failed and working states, respectively. If n_z is the number of failed states and z = {s_1, s_2, ..., s_nz} is the set of unacceptable states, then the probability that the system resides in a failed state, or the long-term probability of system failure, is

    P_z = P(s_1 ∪ s_2 ∪ ... ∪ s_nz) = p_s1 + p_s2 + ... + p_snz = Σ_{s∈z} p_s,   (2.35)

and the expected duration of system failure is15

    m_z = (1/P_z) Σ_{s∈z} m_o p_s, in hours.   (2.36)

Finally, the frequency of encountering system failure is

    f_z = 8760 P_z / m_z, in failures/year.   (2.37)

13 An occurrence of load state ℓ may last for several hours.

14 Alternately, since a system state is also an independent combination of element states and a load state, p_s = P(e_1 ∩ e_2 ∩ ... ∩ e_n ∩ ℓ) = ( ∏_{e,in} p_ei ) ( ∏_{e,out} p_eī ) p_ℓ = p_o p_ℓ.

15 A system failure event is assumed to last as long as the associated outage state (instead of ending when the load changes), so the expected duration of system failure involves m_o instead of m_s. That is, the load level present at the onset of the outage is assumed to be present throughout the repair time. This assumption, though unrealistic, is necessary to a fair estimation of system failure duration. Also, the cumulative product in Equation (2.36) is divided by the failure probability because the expected duration is conditioned on system failure.

After determining the set of unacceptable system states, an index relating to loss of load can be computed. Table 2.4 lists the acceptabilities of the 20 system states associated with arbitrary outage o.
Load levels in the table are in increasing order, so that unacceptable system states for this outage are those with the highest loads. The loss of load for an unacceptable state, denoted d_s, can be estimated as the sum of load increments from it to an acceptable state.

Table 2.4 Acceptabilities of System States Associated With Outage o

    Load Level:     1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20
    Acceptability:  yes yes yes yes yes yes yes yes yes yes yes yes yes yes yes no  no  no  no  no

For example, since the last acceptable state in Table 2.4 is associated with load level 15, the loss of load for the first unacceptable state, at load level 16, is 5% of peak load; the loss of load at level 17 is 10% of peak load. In this way, d_s can be determined for each unacceptable system state, and the expected unserved demand during system failure is

    d_z = (1/P_z) Σ_{s∈z} d_s p_s, in % of peak load.   (2.38)

Given historical or projected data on element failures and loads, the state-space approach to reliability assessment is straightforward. The only difficulty arises from the number of system states under consideration and the need to classify them as acceptable or failed. For realistic power systems, which may contain hundreds of elements, restricting the outage space to first- and second-level contingencies still leaves a large number of contingencies to consider. Coupling the outage space with a realistic discretized load space, such as the 20 levels used here, means that failure effects analysis of the system space is a prohibitive task. The most common method of further truncating the state space [21-23] is to determine, based on access to computational facilities, the maximum number of load flow studies allowable. The number of load levels considered can be reduced by increasing the increments of peak load which form the discretized load curve; for example, if increments of 25% are used, the number of load states is reduced from 20 to 4.
Then, the number of outage states considered becomes the allowable number of load flow studies divided by the chosen number of load levels. In an attempt to maximize the accuracy of the reliability indices, only those contingencies with the highest probabilities of occurrence are included. Though this conventional technique suffers from reduced accuracy of the four reliability indices, it is much less computationally intensive than the more exact method.

CHAPTER 3
NEURAL NETWORKS

3.1 Philosophy of Connectionism

The philosophy of connectionism is motivated by a desire to model the information processing tasks of humans. Modern computers accomplish strictly regimented tasks with much greater speed and precision than humans, but the capabilities of the human brain to analyze sound, sight, smell, touch, taste, and knowledge are far superior to those of conventional computers. Artificial neural systems attempt to replicate these tasks by using theories from physics, mathematics, neurobiology, linguistics, psychology, and other disciplines. The cerebral cortex [24] is composed of billions of brain cells, or neurons, each connected to as many as 10,000 other cells. Each neuron consists of a nucleus called a soma, which processes electrochemical signals passed to it from the cell's dendrites in 10-20 ms. Processed signals are then passed to other neurons through the cell's axon, which is connected to the dendrites of other cells by synapses. Each neuron lies on one of four layers and may be connected to cells on any layer. These layers, rising from the hypothalamus at the top of the spinal cord to the brain's outer surface, differ in the density and size of neurons and in the type of connections (excitatory or inhibitory) to other neurons. Human learning relies on the formation of representations that generalize from the details of specific examples.
While specific experiences are retained in memory, generalization depends on the congruity between these examples [25]. The brain retains knowledge in the synapses between neurons, and learning occurs by adjusting these synapses at the presentation of new information. Apparently, synaptic strength between two neurons increases when the cells have similar responses to input, and it decreases when the responses are very different. Just as artificial neural models are motivated by an understanding of brain structure, many learning algorithms are based on this simple rule of synaptic adjustment. The precision and rigidity of conventional digital computers allow for superior performance on algorithmic tasks, but functions such as learning require the adaptability and resiliency of neural systems. The most glaring differences between the two are the speed and order of processes. A modern computer processes information a million times faster than a brain, yet the processing is done in a serial fashion. In contrast, the brain processes information in parallel with a massive number of neurons. Neural systems and computers also differ in information storage and retrieval. A computer stores data in addressed memory locations, so that old information is destroyed when new data is assigned the same address. The brain stores information in its distributed synapses, which are simply adjusted when new information is stored. The distributed nature of knowledge in the brain means that partial degradation has a minimal effect on the brain's performance. In contrast, failure of a few processing elements in a computer leads to general failure. This fault-tolerant characteristic of the brain is also due to local control of processing; that is, while the central processing unit (CPU) in a computer dictates all processes, control of the processes in a neural system lies with each neuron.
These differences give artificial neural systems the ability to perform well on certain nonalgorithmic tasks. Connectionist theory is an emerging science which seeks to model the parallelism and interconnectivity of the human brain. Recent advances in applications of the theory range from visual and adaptive pattern recognition to motion detection. In fact, VLSI implementations of neural networks are the focus of ongoing research [26-28]. Power system engineers have also recognized the utility of artificial neural networks. The three IEEE journals dealing exclusively with power systems show that one application of neural networks was published in 1989, three were published in 1990, and six were published in 1991 [29-38]. These applications ranged from incipient fault detection on distribution lines and in synchronous machines to security assessment of transmission systems. The wide range of problems and techniques indicates that power systems will provide a vast area of application for neural networks.

3.2 Description and Operation

A neural network consists of a set of nodes, a connection topology, propagation and learning rules, and an input space. A general node model is represented in Figure 3.1. The node, or neuron, receives its input through weighted links. This input may come from other nodes in the network or from outside stimuli. An activation function, usually a summation, acts on the input; the node's internal bias is then added to the summed and weighted input. The result is called node activation.

[Figure 3.1 Neuron Model: inputs arrive on weighted synapses, an activation function sums them with the internal bias, and a transfer (output) function produces the node's output.]

The node's output is determined by an output function, which responds to the activation. An example is the S-shaped sigmoid shown in Figure 3.1. The transfer function of the node consists of its activation and output functions. The node's output travels along the links, or synapses, either to other nodes or to the output of the system [39].
[Figure 3.2 Neural Network Model]

A neural network is simply an interconnected collection of these nodes. An example of a network is shown in Figure 3.2. Nodes in a network differ only in the values of internal biases and connected weights. In general, then, the propagation rule of the network is, collectively, the transfer function of any single node. Signal propagation may occur in either direction in the network; in fact, the output of a node may propagate directly to the node's input.

[Figure 3.3 Bias Vector and Weight Matrix for Neural Network]

The sample network shown in Figure 3.2 has three layers, with a total of eight neurons. It accepts four inputs and produces one output. The nodes are connected by links of varying weight. An n-node network with a given propagation rule is fully described by an n-dimensional internal bias vector and an n-dimensional square weight matrix, which gives its connection pattern. Figure 3.3 shows a possible representation for the network above.

Since each node in the foregoing model can connect to any other, the network is far too complex for many applications. This research focuses on feedforward layered networks, with each node's output determined by the sigmoid, or logistic, function. That is, all input is received on one layer, and the resulting signals propagate forward, one layer at a time, until the signals reach the last layer. An example of such a neural network is shown in Figure 3.4, with a corresponding bias vector and weight matrix.

[Figure 3.4 Feedforward Layered Network]

This particular network contains five nodes in three layers, and only six weighted links are required to fully connect the network. Each node on the first layer is connected to each node on the second, and each node on the second layer is connected to the node on the third.
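The bias-vector/weight-matrix description translates directly into code. The sketch below builds B and W for a five-node feedforward network of the kind just described; the numerical values are invented for illustration (the entries of Figures 3.3 and 3.4 are not reproduced here), and absent links are simply left at zero.

```python
import numpy as np

# Hypothetical 5-node feedforward network (2 inputs, 2 hidden, 1 output).
# W[s-1, r-1] is the weight on the link from node s to node r; links that
# do not exist in the layered topology remain zero.
n = 5
W = np.zeros((n, n))
W[0, 2], W[1, 2] = 0.5, -1.2   # inputs 1, 2 -> hidden node 3
W[0, 3], W[1, 3] = 2.0, 0.7    # inputs 1, 2 -> hidden node 4
W[2, 4], W[3, 4] = 1.1, -0.4   # hidden nodes 3, 4 -> output node 5

# Internal biases; input-layer nodes (1 and 2) have none.
B = np.array([0.0, 0.0, 0.3, -0.6, 0.1])

# The six weighted links that fully connect this layered network:
print(int(np.count_nonzero(W)))  # -> 6
```

The square matrix is wasteful for a layered network (most entries are zero), but it accommodates the fully general topology in which any node may connect to any other.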
The last layer is referred to as the output layer, since the network's output is the response of neurons on this layer. The first layer is referred to as the input layer. Nodes on this layer differ from others in a feedforward network; they are strictly linear in that the output of a node is simply its one allowable input. Nodes on the input layer have no internal biases. All other layers in the network are referred to as hidden layers, since they are not accessible to the outer environment. Nodes in these hidden layers serve as detectors of relationships between the input and the corresponding output (1). The AND logic gate mapping of Figure 3.5 serves as a simple example of a function which can be described by hidden nodes in a network. The network's input space is partially determined by the number of nodes in the first layer; for an m-node input layer, it is a subspace of R^m. The input space for a network solving the AND function consists of all pairs of numbers having values between zero and one.

[Figure 3.5 Definition of the AND Function: the output is 1 only when x_1 = x_2 = 1, and 0 otherwise.]

The operation of a neural network consists of the presentation of a set of inputs and the subsequent propagation of those inputs through the network. The 5-node network shown in Figure 3.4 can be used to illustrate the forward propagation of input signals. In this example, inputs x_1 and x_2 are presented to Node 1 and Node 2, respectively. Since nodes on the input layer have linear transfer rules,

    y_1 = x_1                                            (3.1)

and

    y_2 = x_2.                                           (3.2)

Activation for nodes on the other layers is the sum of the node's internal bias and the weighted outputs passed to it; that is,

    x_3 = w_13 y_1 + w_23 y_2 + b_3                      (3.3)

and

    x_4 = w_14 y_1 + w_24 y_2 + b_4.                     (3.4)

Footnote 1: No more than two hidden layers are needed for the most complex mappings [40].
Output for all nodes not on the input layer is determined by a sigmoid function (2), such that

    y_3 = 1 / (1 + e^(-x_3))                             (3.5)

and

    y_4 = 1 / (1 + e^(-x_4)).                            (3.6)

Finally,

    x_5 = w_35 y_3 + w_45 y_4 + b_5                      (3.7)

and

    y_5 = 1 / (1 + e^(-x_5)).                            (3.8)

Hence, y_5 is the network's response to inputs x_1 and x_2. The method is called "forward propagation" because node responses on a given layer can only be calculated after those on the preceding layer are found.

Footnote 2: For sigmoid function output, note that y_n -> 0 as x_n -> -infinity and y_n -> 1 as x_n -> +infinity.

In general, then, forward propagation consists of passing weighted and summed input signals through a chosen nonlinearity. It presumes knowledge of the network's bias vector and weight matrix. A node's internal bias can be thought of as the weight of a link connecting the node to one whose output is always unity, so the procedure for finding biases is identical to that for finding weights. Any further reference to network weights shall be understood to include biases. Once activation and output functions are chosen, a neural network is completely described by its weights. Since a given neural network solves a specific problem, or function, finding the weights for the network is equivalent to finding the input/output relationship that describes the function. For example, the weights defining the network that solves the AND gate of Figure 3.5 describe the function

    y = 1 if x_1 = x_2 = 1, and y = 0 otherwise,

though not explicitly. Because the weights of a network can represent a given function, neural networks are especially appropriate and powerful when used to find relationships that are difficult to describe explicitly, such as that between power system variables and state acceptability.

3.3 Training Algorithms

A neural network can be categorized by its methods of learning and recall. Recall can be either feedforward, as illustrated in Section 3.2, or feedback, where node responses are recurrent.
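Returning for a moment to Section 3.2, the forward pass of Equations (3.1)-(3.8) can be sketched in a few lines of code. The weights and biases below are invented for illustration (they are not those of Figure 3.4); they are chosen so that the network approximates the AND gate of Figure 3.5.

```python
import math

def sigmoid(x):
    # Logistic output function of Equations (3.5), (3.6), and (3.8)
    return 1.0 / (1.0 + math.exp(-x))

def forward(x1, x2, w, b):
    """Forward propagation through the 5-node network of Figure 3.4;
    w maps a (source, destination) node pair to a link weight."""
    y1, y2 = x1, x2                                # linear input nodes
    x3 = w[(1, 3)] * y1 + w[(2, 3)] * y2 + b[3]    # Eq. (3.3)
    x4 = w[(1, 4)] * y1 + w[(2, 4)] * y2 + b[4]    # Eq. (3.4)
    y3, y4 = sigmoid(x3), sigmoid(x4)              # Eqs. (3.5)-(3.6)
    x5 = w[(3, 5)] * y3 + w[(4, 5)] * y4 + b[5]    # Eq. (3.7)
    return sigmoid(x5)                             # Eq. (3.8)

# Invented weights that make the network behave like the AND gate
w = {(1, 3): 6.0, (2, 3): 6.0, (1, 4): 6.0, (2, 4): 6.0,
     (3, 5): 8.0, (4, 5): 8.0}
b = {3: -9.0, 4: -9.0, 5: -12.0}

print(forward(1, 1, w, b) > 0.5, forward(0, 1, w, b) > 0.5)  # -> True False
```

Note that the node responses on layer two must be computed before the output node's response, exactly as the text describes.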
Learning can be supervised, given sample inputs and desired outputs, or unsupervised, using only inputs.

The Hopfield net [41] is typical of unsupervised-learning/feedback-recall networks. This network consists of n neurons on one layer, with each neuron connected to all others and to itself. The net receives and produces n-dimensional, binary input and output vectors. Its most prominent application is as an associative memory, where a noisy signal is presented and the most appropriate memorized signal is produced. Given S patterns to be learned by the net, the n x n symmetric weight matrix is computed using the sum of outer products of the pattern vectors. During recall, an input vector is presented and the neurons produce output in discrete time until a stable response occurs. At any time step, the activation for each neuron is the sum of weighted outputs from all neurons at the previous time step; this sum is then passed through a step threshold function to produce the node's new output. The storage capacity of a Hopfield net decreases in a logarithmic fashion with the number of neurons.

The Brain-State-in-a-Box, or BSB [42], is an example of supervised-learning/feedback-recall networks. It has the same topology as the Hopfield net and is used primarily for pattern completion. Given S patterns to be learned, synaptic weights are determined iteratively by the learning algorithm. Briefly, starting from initial random values and the first pattern, the weight from one node to another (or to itself) is changed by the product of the first node's output and the difference between the second node's input and output. When all weights are updated, the next pattern is presented and the process is repeated. The set of patterns is presented until all differences between node inputs and outputs are negligible. Recall is performed in the same manner as that in the Hopfield net. The only difference is that a ramp threshold function is used to give a node's output.
The storage capacity of the BSB increases with the number of neurons, but this advantage over the Hopfield net is gained at the expense of a greatly increased learning time.

The self-organizing feature map [43] is an unsupervised-learning/feedforward-recall network that is used primarily for pattern classification. This fully connected neural net has an n-node input layer, which accepts n-dimensional patterns, and an m-node output layer which indicates the one class to which a pattern belongs. For each of S patterns to be learned by the net, the learning algorithm alters only those weights leading to the output node whose Euclidean distance from the inputs is smallest. This competitive learning rule allows only one of the m output nodes to fire for each pattern. When new patterns are presented, the output node whose weights best correspond to the inputs is caused to fire, while all other node responses are muted. The feature map can also be trained to give continuous-valued responses at all output nodes; in this case, the net serves as a data quantizer and the outputs represent a set of features, or characteristics, of the input patterns (3).

Finally, the elementary perceptron [44] is a supervised-learning/feedforward-recall network that is also used for pattern classification. It has the same topology as the self-organizing feature map and is the precursor to the feedforward layered network described in Section 3.2. Given S n-dimensional input patterns, each with an m-dimensional desired output, the learning rule randomly assigns bipolar (+/-1) values to the weights. The first pattern is presented to the input nodes, and each output node has an activation equal to the sum of products of the inputs and the weights leading to the output node. This activation is then passed through a bipolar threshold function and compared to the desired node response for this pattern, and the difference is used to adjust the connected weights.
The next pattern causes further adjustment to the weights, and the pattern set is presented until all differences between actual and desired responses are negligible. On recall, a new pattern is presented, and the output nodes form activations and bipolar responses. The output indicates the one of 2^m classes to which the new pattern belongs. If the classes defined by the initial patterns are linearly separable, then an optimal set of weights can be found to map the patterns. See the network in Section 3.4.

In order for a neural net to learn the "rules" for solving a problem, data sets describing the problem must be given. In supervised learning, these data sets normally consist of input vectors and a desired, or "target," output vector for each. The truth table shown in Figure 3.5 contains four such input/output vectors. These vectors form a full training set for a neural network with supervised learning; that is, they describe the full range of expected inputs and associated, desired outputs.
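The elementary perceptron's error-correction rule, applied to a training set like that of Figure 3.5, might be sketched as follows. The sketch assumes one output node and bipolar (+/-1) inputs and targets, and the learning rate and epoch count are arbitrary choices.

```python
import random

def step(x):
    # Bipolar threshold function
    return 1 if x >= 0 else -1

def train_perceptron(patterns, targets, lr=0.5, epochs=100):
    """Elementary perceptron with one output node: weights start at
    random bipolar values and move by the input times the difference
    between the desired and actual responses."""
    random.seed(0)
    w = [random.choice((-1, 1)) for _ in patterns[0]]
    bias = random.choice((-1, 1))
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            y = step(sum(wi * xi for wi, xi in zip(w, x)) + bias)
            w = [wi + lr * (t - y) * xi for wi, xi in zip(w, x)]
            bias += lr * (t - y)
    return w, bias

# Bipolar AND: output +1 only when both inputs are +1 (linearly separable)
patterns = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
targets = [-1, -1, -1, 1]
w, bias = train_perceptron(patterns, targets)
print([step(sum(wi * xi for wi, xi in zip(w, x)) + bias) for x in patterns])
```

Because the two classes here are linearly separable, the procedure is guaranteed to converge to weights that classify every training pattern correctly.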
Finding an appropriate weight matrix for a multilayered perception is equivalent to training the network to specify the correct output for a given input vector. Thus, the set of input vectors and known outputs is used by the training algorithm to iteratively change the weights (from random initial guesses) until the network has learned the input/output relationships implicit in the set, if such relationships exist 4 Although other neural networks are described in this section, further references to "neural networks" are meant to describe "multilayered perceptions" only. 5 Unlike the perception convergence procedure, the Backpropagation learning rule is not proven to converge, even if vectors in the training set fall into linearly separable regions. 47 The total output error of a multilayered perception with one output, yout, is defined as SE= EP=ZE 1(tP yu)2, (3.9) p p where tP denotes the target output for the pth training vector, and EP denotes its error. Using the network of Figure 3.4 as an example, small random weights6 are first assigned to the network links. These weights are wr and br for s = 1,...,4 and r = 3,...,5, with the weights of nonexistent links (wl15, for instance) set to zero. There are two sets of weights in this network those leading to the output node from the hidden layer, and those leading to the hidden layer from the inputs. For optimal weights, the gradient of the total error function with respect to each weight is zero. Hence, for a weight, w,5, leading from hidden node k to the output node, OE Z dEP dy Ox wk5 d dy d o wk5 = [(tp y)(1)] ([ +ez)2][ (1 + eX)2 = E(t y.)y(1 y~). (3.10) p Defining the "assigned error" at the output node for vector p as dEP 5 = y(1 y)(tp P), (3.11) a necessary condition for local optimality of wk is E6y = 0, k = 3,4. (3.12) P 6 Small initial weights correspond to low levels of node activation. 
A low activation level initially places each node in the linear region of its output function, so that nodes are appropriately "uncommitted" to being on or off.

Similarly, for a weight, w_jk, leading from input node j to hidden node k,

    dE/dw_jk = Σ_p (dE^p/dy_5)(dy_5/dx_5)(dx_5/dy_k)(dy_k/dx_k)(dx_k/dw_jk)
             = -Σ_p delta_5^p w_k5 y_k(1 - y_k) y_j.     (3.13)

Defining the assigned error at hidden node k for vector p as

    delta_k^p = y_k(1 - y_k) w_k5 delta_5^p,             (3.14)

a necessary condition for local optimality of w_jk is

    Σ_p delta_k^p y_j^p = 0,  j = 1, 2,  k = 3, 4.       (3.15)

This analysis can be easily extended for perceptrons with multiple hidden layers and outputs. From the initial random values, weights are found using the well-known Steepest Descent algorithm of nonlinear programming [4]. In short, for the pth training vector, the input and output of each node is computed using Equations (3.1)-(3.8). The network's output is compared to the known output, and the assigned error for the output node is found using Equation (3.11). Next, assigned errors for hidden nodes (7) are found by Equation (3.14), and weight updates are computed by

    Delta w_sr^p = eta y_s^p delta_r^p,                  (3.16)

where eta is a "learning rate" between 0 and 1. This procedure is followed for each training vector, in turn.

Footnote 7: Note that no error is attributable to input nodes, whose transfer rules are linear.

When weight changes for each vector have been computed, the weights are updated, in the hth iteration, by

    w_sr^(h+1) = w_sr^(h) + Σ_p Delta w_sr^p + alpha (w_sr^(h) - w_sr^(h-1)),   (3.17)

where alpha is an "acceleration rate," also between 0 and 1, that adds momentum to the search procedure and helps to avoid local minima. Weights are updated in this manner, with iterative presentation of the training set, until Equations (3.12) and (3.15) are satisfied.

Equations (3.12) and (3.15) are known as "soft" criteria for stopping the iterative procedure because they only ensure that a sum of weighted node errors is negligible.
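The procedure of Equations (3.9)-(3.17) can be sketched as follows for the 2-2-1 network of Figure 3.4, with the AND table of Figure 3.5 as the training set. The learning and acceleration rates are arbitrary choices within (0, 1), and training is stopped here after a fixed number of iterations rather than by a formal stopping criterion.

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(patterns, targets, eta=0.9, alpha=0.5, epochs=5000):
    """Backpropagation with momentum for the 2-2-1 network: assigned
    errors follow Eqs. (3.11) and (3.14); updates follow Eqs. (3.16)
    and (3.17). Biases are handled like weights, as in the text."""
    random.seed(1)
    keys = [(1, 3), (2, 3), (1, 4), (2, 4), (3, 5), (4, 5)]
    w = {k: random.uniform(-0.5, 0.5) for k in keys}   # small random start
    b = {r: random.uniform(-0.5, 0.5) for r in (3, 4, 5)}
    prev_w = {k: 0.0 for k in w}                       # previous step, for momentum
    prev_b = {r: 0.0 for r in b}
    for _ in range(epochs):
        dw = {k: 0.0 for k in w}
        db = {r: 0.0 for r in b}
        for (x1, x2), t in zip(patterns, targets):
            y = {1: x1, 2: x2}                         # forward pass, Eqs. (3.1)-(3.8)
            for r in (3, 4):
                y[r] = sigmoid(w[(1, r)] * y[1] + w[(2, r)] * y[2] + b[r])
            y[5] = sigmoid(w[(3, 5)] * y[3] + w[(4, 5)] * y[4] + b[5])
            d = {5: y[5] * (1 - y[5]) * (t - y[5])}    # Eq. (3.11)
            for r in (3, 4):                           # Eq. (3.14)
                d[r] = y[r] * (1 - y[r]) * w[(r, 5)] * d[5]
            for s, r in keys:                          # Eq. (3.16), summed over p
                dw[(s, r)] += eta * y[s] * d[r]
            for r in b:
                db[r] += eta * d[r]
        for k in keys:                                 # Eq. (3.17): momentum step
            prev_w[k] = dw[k] + alpha * prev_w[k]
            w[k] += prev_w[k]
        for r in b:
            prev_b[r] = db[r] + alpha * prev_b[r]
            b[r] += prev_b[r]
    return w, b

def recall(x1, x2, w, b):
    y3 = sigmoid(w[(1, 3)] * x1 + w[(2, 3)] * x2 + b[3])
    y4 = sigmoid(w[(1, 4)] * x1 + w[(2, 4)] * x2 + b[4])
    return sigmoid(w[(3, 5)] * y3 + w[(4, 5)] * y4 + b[5])

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
w, b = train(patterns, targets)
print([round(recall(x1, x2, w, b)) for (x1, x2) in patterns])
```

Note how the assigned error at the output node is computed before those at the hidden nodes: the error signal travels backward while the input signal travels forward.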
An alternative, known as "hard" criteria, ensures that all node errors are negligible for each training vector. That is, for each training vector p,

    delta_k^p = 0,  k = 3, 4, 5.

Clearly, the "soft" criteria are met by satisfying the "hard" criteria, but this alternative guarantees that the neural net will give the correct result for each vector in the training set. Note that the assigned error for the output node is necessarily computed before those for hidden nodes; that is, unlike the forward operation of a neural net, training occurs in a backward fashion. The Backpropagation Learning Algorithm consists of repeatedly passing the training set through the perceptron until its weights minimize the output errors over the entire set.

During recall, the first hidden layer effects a nonlinear transformation of the input space into an h_1-dimensional "image space," where h_1 is the number of nodes in the first hidden layer. The input space is partitioned into at most 2^h_1 decision regions by h_1 hyperplanes (some of which may be redundant), and sets of similar input vectors are assigned to separate regions. In this way, the hidden layer performs intermediate classification of the input patterns [47]. If the network contains a second hidden layer with h_2 nodes, then a second transformation partitions the space defined by the first hidden layer into an h_2-dimensional space, and patterns are reassigned to at most 2^h_2 regions based on similarities. The output node performs a final transformation, partitioning the space into two regions, and each input pattern is placed into one of the two classes (corresponding to outputs of either 0 or 1). Geometrically, the learning process determines the best successive partitions of the input space for mapping the training vectors to their associated targets. In addition, the decision regions described by optimal weights are related to several discriminant functions that specify the class to which an arbitrary input vector belongs [48].
Generally, a pattern is assigned to a specific region if the associated discriminant function is greater at that pattern than the discriminant functions for all other regions. The projection of these functions onto the input space is a set of decision surfaces that separate classes of input patterns into regions. Hence, the training of a neural net is the iterative formation of the discriminant functions and decision regions associated with the application.

3.4 Principal Components Analysis

[Figure 3.6 Principal Components Analyzer]

Principal components analysis [49] is a statistical method that can be used to reduce the dimensionality of a highly correlated set of variables by linearly transforming it
However, because the principal components are linearly independent, most of the input energy is compacted into a few variables with much higher variances than the others. Hence, most of the energy in the original variables can be extracted from a few principal components, such as the M variables (M < N) shown as output in Figure 3.6. 52 Again, the objective is a linear transformation of the input correlation matrix, A = UTRU, so that A is diagonal; that is, A = {A,}. Matrix U can then serve to represent the weights of the network in Figure 3.6. In addition, because A should preserve the total variance (or energy content) in R, U is required to be orthonormal. In other words, if the columns of U are orthogonal and normalized, then UUT = IN tr{A} = tr{R} = N, (3.22) where IN denotes the N x N identity matrix. Let uk denote the kth column of the transformation matrix; then U is derived as follows: A = UTRU UA = RU = ukAk = Ruk, k =1,...,N = (R AkIN)uk= 0, k = 1,...,N, (3.23) which implies that A1,...,AN are eigenvalues of R, and Ul,...,uN are associated eigenvectors, all orthogonal. If, in addition, Ilukll = 1, then U = [ul uk ... UN] is orthonormal. The kth column of U is the vector of weights from all inputs to the kth output, zk; that is, zk = uky. (3.24) The eigensystem of a correlation matrix can be computed using the SingularValue De composition procedure of linear algebra [50] or newer training algorithms for performing principal components analysis [5153]. Since R is real and symmetric, its eigenvalues are all real and positive. Hence, A and U can be arranged such that A1 > A2 > ... > AN. Then, the first principal component, 53 zl, has the highest portion of the total input variance (or energy), and a = A1; the second principal component, z2, has the next highest portion of the total input variance, and a2o = A2; and so on. Recall that the total variance is a2ot. = N = tr{R} = tr{A} = A1 + A2 +' + AN, (3.25) so aOl/2ota, = 1i/N, O2/2total = A2/N, etc. 
Hence, the average amount of input energy extracted by the first M principal components can be computed. Alternately, the number of principal components needed to extract a given percentage of the input energy, on average, can be determined. Generally, for highly correlated input variables, the required number of components is much smaller than the number of variables, which makes principal components analysis a valuable data-reduction tool.

CHAPTER 4
RESEARCH METHODOLOGY

4.1 Objectives of Research

This work is motivated by two basic objectives: to determine if a multilayered perceptron can qualitatively replace the load flow study in certain applications (1), and to determine whether adequacy indices computed using a neural network strategy are as accurate as those conventionally computed using load flow studies. In other words, this research seeks to answer the following questions:

- Can a perceptron dependably specify the acceptability of an arbitrary system state that is not presented during training?
- For this application, does the use of principal components analysis aid or hamper the performance of a perceptron?
- Does adequacy assessment using a perceptron compare favorably with the conventional technique in terms of accuracy of indices?

The strategy by which these objectives are achieved involves choosing variables for use as neural net inputs, finding a training set, and training and testing a neural net for determining state acceptability. The strategy also involves determining the proper number of principal components, training and testing another neural net, and computing reliability indices.

Footnote 1: Although the output of a neural net in this work is taken as a binary (yes/no) number, it can be used as a probability if certain training conditions are satisfied. Hence, neural nets can be used quantitatively.
4.2 Research Procedures

A power system failure is defined as an unacceptable system state, which occurs under any of the following conditions:

- voltage magnitude at a load bus is out of bounds,
- power flow magnitude on a line is out of bounds,
- real generation at the swing bus is out of bounds,
- reactive generation at the swing bus or another voltage-controlled bus is out of bounds,
- a bus or set of buses is isolated, or
- the system operating point is unstable (i.e., voltage collapse is likely).

From Section 2.4, the adequacy of a power system can be described in terms of the probability of system failure, the mean duration of system failure, the frequency of system failure, and the expected unserved demand during a system failure. Calculation of these indices requires, as data, the historical mean-time-to-failure and mean-time-to-repair for all transmission lines and generating units, as well as forecasted hourly loads for the year in question. The objectives stated in the previous section can be achieved by computing exact values and conventional estimates for these indices, then comparing them to values obtained using a neural network with, and without, the use of principal components.

A truly exact calculation of these indices, for a given power system, requires testing all possible system states for acceptability. As a practical matter, contingencies with more than two failed elements may be omitted from the study, and no more than 20 evenly spaced load levels are necessary. Practically, then, an "exact" calculation of the indices begins with a power flow study of each single- and double-element contingency at 20 discrete load levels.

For a typical power system, the number of states which must be studied to give exact indices is still very large. As a consequence, the conventional practice involves further truncating the contingency space to a manageable size and reducing the number of load levels studied.
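One way to carry out this truncation is to enumerate the single- and double-element outage states, rank them by probability, and keep only the most probable. A sketch follows; the element names and unavailability values are invented placeholders.

```python
from itertools import combinations

def rank_contingencies(unavail, max_order=2, keep=100):
    """Rank single- and double-element outage states by probability.
    unavail maps element name -> probability the element is failed;
    an outage state's probability is the product of the unavailabilities
    of its failed elements and the availabilities of the rest."""
    states = []
    elements = sorted(unavail)
    for order in range(1, max_order + 1):
        for out in combinations(elements, order):
            p = 1.0
            for e in elements:
                p *= unavail[e] if e in out else (1.0 - unavail[e])
            states.append((out, p))
    states.sort(key=lambda sp: sp[1], reverse=True)
    return states[:keep]

# Placeholder unavailabilities for a hypothetical 5-element system
unavail = {'a': 0.022, 'b': 0.015, 'c': 0.017, 'd': 0.001, 'e': 0.001}
top = rank_contingencies(unavail, keep=3)
for out, p in top:
    print(out, round(p, 5))
```

With realistic element data, single-element outages dominate the ranking, and double-element outages involving the least reliable elements follow.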
In this work, conventional estimates of the indices are based on 100 of the most probable contingencies, which are studied at four load levels. Hence, 400 load-flow routines (as opposed to thousands) are executed during a conventional estimation of the chosen reliability indices. Regardless of whether exact values or reduced-state estimates are sought, the acceptability of each system state being considered must be determined before computing the indices. Also, an annual load curve for the system must be formed. From Section 2.4, the procedure for computing exact indices, given a system load curve and mean times to failure (m_ei) and repair (m_er) for each element, is as follows:

1. Discretize the annual load curve into load states, l. If n_l is the number of occurrences of l and t_lk is the duration of l at its kth occurrence, find transition rates lambda_l = n_l / Σ_k t_lk and probabilities p_l = (Σ_k t_lk) / 8760. In this work, discrete load states are 5%, 10%, 15%, ..., 100% of peak load.

2. Compute transition rates lambda_ei = 1/m_ei and lambda_er = 1/m_er, and probabilities p_ei = m_ei / (m_ei + m_er) and p_er = 1 - p_ei, for each element.

3. Determine the acceptability of each system state, s, which is a combination of a load state and a set of element states, by performing a load flow study. For unacceptable system states, determine the loss of load, d_s, as the sum of load increments to an acceptable state.

4. For each unacceptable system state, compute the state probability,

    p_s = p_l (Π_{e,in} p_ei) (Π_{e,out} p_er),          (4.1)

and the mean duration in the associated outage state,

    m_o = 1 / (Σ_{e,in} lambda_ei + Σ_{e,out} lambda_er).   (4.2)

5. Let z denote the set of unacceptable system states. The probability of system failure is

    P_z = Σ_{s in z} p_s,                                (4.3)

the expected duration of system failure is

    m_z = (1/P_z) Σ_{s in z} m_o p_s, in hours,          (4.4)

the frequency of encountering a system failure is

    f_z = (P_z / m_z) x 8760, in failures/year,          (4.5)

and the expected unserved demand during system failure is

    d_z = (1/P_z) Σ_{s in z} d_s p_s, in % of peak load.   (4.6)
Conventional estimates of these indices use the same procedure, with two exceptions:

- load states, l, are restricted to 25%, 50%, 75%, and 100% of peak load, so step 1 involves only four loads, and
- steps 3-5 are completed only for the 100 most probable contingencies, which are determined in step 2 by computing the probability p_o = (Π_{e,in} p_ei)(Π_{e,out} p_er) for each outage state.

As an example, consider the 3-bus power system of Figure 4.1. The swing generator at bus 1, element 'a,' is rated at 45 MW and supplies a local peak load of 10 MW and 5 MVAR; it can consume up to 10 MVAR or provide up to 25 MVAR. The bus has a nominal voltage magnitude of 1.02 per unit on a 23-kV base. The two generators at bus 2, elements 'b' and 'c,' each supply 25 MW at a per-unit voltage of 1.01. Combined, they can consume up to 20 MVAR or provide up to 40 MVAR. Bus 2 has a local peak load of 20 MW and 10 MVAR.

[Figure 4.1 3-Bus Sample Power System]

A peak load at bus 3 of 50 MW and 20 MVAR is supplied through two transmission lines connected to buses 1 and 2. Element 'd,' the line connecting bus 1 to this load, has a resistance of 0.01, a reactance of 0.10, and a shunt susceptance of 0.005, all in per unit on a 100-MVA base. The line connecting bus 2 to the load, element 'e,' has a resistance of 0.02, a reactance of 0.18, and a susceptance of 0.009, also in per unit. Each line can carry a load of up to 100 MVA.

The outage space consists of single- and double-element failures of elements 'a'-'e,' as well as the base case. Element 'a' has a mean-time-to-failure (mttf) of 1550 hours and a mean-time-to-repair (mttr) of 35 hours. Element 'b' has a mttf of 1630 hours and a mttr of 25 hours. Element 'c' has a mttf of 1710 hours and a mttr of 30 hours. Element 'd' has a mttf of 12,200 hours and a mttr of 10 hours, and element 'e' has a mttf of 14,100 hours and a mttr of 12 hours. Coincident hourly loads on the system for one year are given in the Appendix (2).
From this curve, the probabilities (3) associated with load states l = 5%, 10%, ..., 100% are shown in Table 4.1. Next, the failure probabilities for elements 'a' through 'e' are 0.0221, 0.0151, 0.0172, 0.0008, and 0.0009. Transition rates to failed element states are 0.0006, 0.0006, 0.0006, 0.0001, and 0.0001, respectively; transition rates to working states are 0.0286, 0.0400, 0.0333, 0.1000, and 0.0833.

Footnote 2: With the exception of the GRU system, the load model used for all systems in this work is taken from the IEEE Reliability Test System [54-55]. Although this system was developed as a benchmark for comparing reliability evaluation techniques, it is an ill-conditioned system [56] and its size is too large, given available computing facilities, to be considered here.

Footnote 3: Note that the system must supply, at all times, a load greater than 30% of peak load. This means that some unacceptable system states may have zero probabilities of occurrence and may be ignored.

Table 4.1 Probabilities Associated with Load States

    Load Level  Probability    Load Level  Probability
        1         0.0000          11         0.1188
        2         0.0000          12         0.0959
        3         0.0000          13         0.1223
        4         0.0000          14         0.1102
        5         0.0000          15         0.0818
        6         0.0000          16         0.0840
        7         0.0015          17         0.0755
        8         0.0349          18         0.0358
        9         0.0976          19         0.0110
       10         0.1285          20         0.0022

In Table 4.2, the acceptabilities of the 16 outage states at 20 load levels are shown. (In the table, dashes represent failed elements, and the 20th load level represents 100% of peak load.) For this system, 320 system states must be considered for an exact calculation of reliability indices, while only 64 need be considered using the conventional technique. (Since the number of elements is so small, no truncation of the outage space is necessary in computing a conventional estimate.) Unacceptable system states in Table 4.2 are those represented by 'n,' or 'no.'

Table 4.2 Acceptabilities of 3-Bus System States

    Outage  In-Service        Acceptability by Load Level
    State   Elements    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
      1     a b c d e   y y y y y y y y y y  y  y  y  y  y  y  y  y  y  y
      2     - b c d e   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
      3     - - c d e   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
      4     - b - d e   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
      5     a - c d e   y y y y y y y y y y  y  y  y  y  y  y  n  n  n  n
      6     a - - d e   y y y y y y y y y y  y  n  n  n  n  n  n  n  n  n
      7     a b - d e   y y y y y y y y y y  y  y  y  y  y  y  n  n  n  n
      8     - b c - e   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
      9     - b c d -   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
     10     a - c - e   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
     11     a - c d -   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
     12     a b - - e   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
     13     a b - d -   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
     14     a b c - -   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
     15     a b c - e   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
     16     a b c d -   n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n

From the table, the number of failed system states is 257; 52 of the 64 states considered by the conventional technique are unacceptable.
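The element figures above, and the index formulas of Equations (4.2)-(4.6), can be exercised with a short sketch. The function names are mine, and the two unacceptable states passed to system_indices at the end are invented purely to demonstrate the formulas, not taken from the 3-bus study.

```python
def element_rates(mttf, mttr):
    """Step 2 of the procedure: failure rate 1/mttf, repair rate
    1/mttr, and failure probability mttr / (mttf + mttr)."""
    return 1.0 / mttf, 1.0 / mttr, mttr / (mttf + mttr)

def system_indices(unacceptable):
    """Eqs. (4.3)-(4.6): `unacceptable` lists (p_s, m_o, d_s) triples
    of state probability, mean outage duration, and unserved demand."""
    Pz = sum(p for p, _, _ in unacceptable)               # Eq. (4.3)
    mz = sum(p * m for p, m, _ in unacceptable) / Pz      # Eq. (4.4)
    fz = Pz / mz * 8760                                   # Eq. (4.5)
    dz = sum(p * d for p, _, d in unacceptable) / Pz      # Eq. (4.6)
    return Pz, mz, fz, dz

# Element 'a' of the 3-bus example: mttf 1550 h, mttr 35 h
lam_f, lam_r, p_f = element_rates(1550, 35)
print(round(lam_f, 4), round(lam_r, 4), round(p_f, 4))  # -> 0.0006 0.0286 0.0221

# Invented unacceptable states, only to exercise Eqs. (4.3)-(4.6)
Pz, mz, fz, dz = system_indices([(0.02, 30.0, 40.0), (0.01, 24.0, 60.0)])
print(round(Pz, 3), round(mz, 1), round(fz, 1), round(dz, 1))
```

Note that the rates and failure probability printed for element 'a' reproduce the values quoted in the text (0.0006, 0.0286, and 0.0221).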
The exact probability of system failure is 0.0277, with a conventional estimate of 0.0304. The expected duration of a failure event is 30.5 hours, while the estimate gives 30.2 hours. The frequency of failure is 8 times per year; the estimate is 9 times per year. The expected unserved demand during failure is 44.7 MW, with an estimate of 50.4 MW. For this example, only slight errors occur from reducing the number of load levels considered.

Table 4.2 Acceptabilities of 3-Bus System States

Outage                       Acceptabilities by Load Level
State   Elements       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
  1     a b c d e      y y y y y y y y y y  y  y  y  y  y  y  y  y  y  y
  2     - b c d e      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
  3     - - c d e      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
  4     - b - d e      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
  5     a - c d e      y y y y y y y y y y  y  y  y  y  y  y  n  n  n  n
  6     a - - d e      y y y y y y y y y y  y  y  n  n  n  n  n  n  n  n
  7     a b - d e      y y y y y y y y y y  y  y  y  y  y  y  n  n  n  n
  8     - b c - e      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
  9     - b c d -      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
 10     a - c - e      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
 11     a - c d -      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
 12     a b - - e      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
 13     a b - d -      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
 14     a b c - e      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
 15     a b c - -      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n
 16     a b c d -      n n n n n n n n n n  n  n  n  n  n  n  n  n  n  n

4.3 Training Issues

The neural network approach to adequacy assessment involves determining the acceptability of each power system state using a trained perceptron instead of a load flow routine. In order to compare the actual indices and conventional estimates to those obtained using perceptrons, issues regarding the training of such networks must be resolved. Before training a neural net, a training set must be formed which is representative of the entire spectrum of system states.
First, a set of power system variables must be chosen for input to the neural net; then pseudorandom "snapshots" of the system must be gathered for determining the system's decision regions. The variables selected to represent a system state should form a consistent set in that no variable, or set of variables, should duplicate information provided by another set of variables. Such a set is readily available in the variables used as input to a load flow routine. With reference to Section 2.3, inputs to the load flow routine are voltage magnitudes at the generator buses, real power injected at all but the swing bus, reactive power injected at all load buses, and complex admittance matrix elements. Limitations on bus voltages, generator injections, and line flows are not part of a routine's input but are used after its execution to determine violations. Also, since sparsity programming minimizes the number of blank admittance elements used by the load flow routine, only nonzero matrix elements need be provided as input to a neural net. Hence, the inputs to a perceptron that determines acceptable states of the system in Figure 4.1 are
* voltage magnitudes at generator buses 1 and 2 (2 inputs),
* real power injected at buses 2 and 3 (2 inputs),
* reactive power injected at load bus 3 (1 input),
* real parts of nonzero admittance matrix elements (5 inputs), and
* imaginary parts of nonzero admittance matrix elements (5 inputs).
The task of choosing states of the power system appropriate for training is more complicated. The set of states should also be consistent in not providing redundant information to the neural net, but it must include patterns from various parts of the state space if the network is to generalize its knowledge to new states. Ideally, training patterns should lie on either side of each decision region in the space.⁴ A neural net could perfectly

⁴ See, for instance, the margin states along the PM/NM boundary in Figure 2.2.
replicate the decision surfaces using these patterns but, obviously, a priori knowledge of these surfaces is required for selecting the patterns. Alternately, categories of states can be described, such as line failures at heavy loads or generator failures at light loads, and the training set may consist of patterns from each category. Thus, a simple scheme for forming a training set is to include the base case, all single-generator failures, a double-generator failure involving each generator, all single-line failures, and a double-line failure involving each line,⁵ at four load levels. If n is the number of elements in the system, then exactly 2n − 1 outage states are selected using this scheme. Referring to Table 4.2, the training set for the system of Figure 4.1 consists of outage states 1, 2, 3, 5, 6, 7, 14, 15, and 16, at four load levels. Hence, 36 load flow routines must be executed in order to train a perceptron for this system.

Another issue that must be resolved before training is the normalization of the input space. If components of the mean input pattern vary by several orders of magnitude, then the smaller inputs have little effect on the choice of weights, regardless of their significance to the application. In addition, if the range of values for any input node is very large, then all nodes in subsequent layers become saturated,⁶ and learning is hampered. Since the training algorithm normally pushes node responses to a saturation level (either zero or one), the network is "tricked" into believing the initial weights are accurate. Thus, the input range for each vector in the training set must be contracted (or expanded) to the range [0,1].
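This per-feature rescaling can be sketched as follows (a minimal modern Python/NumPy illustration, not the report's code; the function name, the guard for constant features, and the toy array are ours):

```python
import numpy as np

def rescale_to_unit_range(patterns):
    """Map each input feature (row j) of a training matrix onto [0, 1].

    `patterns` is an N x S array: N input nodes, S training cases.  Each
    feature is shifted by its minimum and divided by its range, i.e.
    x_jp -> (x_jp - min_j) / (max_j - min_j)."""
    mins = patterns.min(axis=1, keepdims=True)
    maxs = patterns.max(axis=1, keepdims=True)
    span = np.where(maxs > mins, maxs - mins, 1.0)  # guard constant features
    return (patterns - mins) / span

# Example: three features whose scales differ by orders of magnitude
# (per-unit voltages, megawatts, per-unit susceptances).
X = np.array([[1.02, 1.01, 1.00],
              [25.0, 50.0, 75.0],
              [0.005, 0.009, 0.007]])
Xn = rescale_to_unit_range(X)
assert Xn.min() == 0.0 and Xn.max() == 1.0
```

After rescaling, every feature spans [0, 1], so no single input dominates the weight updates purely by virtue of its units.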
For example, if S is the number of training cases, the jth element of every training vector constitutes a collection of signals, {x_j1, ..., x_jp, ..., x_jS}, for the jth node in the neural net's input layer. The values of these signals range from min_j to max_j. The input space [min_j, max_j] is transformed to [0,1] by letting

    x̂_jp = (x_jp − min_j) / (max_j − min_j)                              (4.7)

for the pth training case. If N is the number of input nodes, the transformation in matrix notation is

    X̂ = D (X − M),                                                       (4.8)

where X is the N × S matrix of training signals x_jp, M is the N × S matrix whose jth row repeats min_j, and D is the N × N diagonal matrix with entries 1/(max_j − min_j).

⁵ Although the neural network is expected to accurately specify the acceptabilities of contingencies involving both generators and lines, these types of contingencies are not used in training. A neural net is capable of generalizing to these cases from the given training vectors.

⁶ Large input values initially place each node output near one, while very small values place node outputs near zero; node outputs are thereby predetermined, and assigned node errors are negligible. See Equations (3.11) and (3.14). An alternative to normalization of inputs is the addition of a small constant to the gradient of the output function at all nodes; this addition prevents the sigmoid from saturating, even at convergence of the training algorithm.

Also, the configuration of a neural net must be determined before training it. The size of the input layer is specified by the number of inputs in each training vector, and only one output node is needed to classify the acceptability of a system state. Still, the number of hidden layers, and the number of nodes on these layers, must be specified. As shown in Section 3.3, the composition of intermediate layers in a neural net affects the transformation and partitions of the input space.
The degrees of freedom with which the network forms decision regions are set by the number of hidden layers and the number of nodes in each. If the network is "overfed," or has too many degrees of freedom, then the process of assigning patterns to decision regions is made difficult, and the network may simply memorize training patterns. On the other hand, if a network is "underfed," or has too few degrees of freedom, then it cannot accurately replicate the decision surfaces and will fail to map training patterns to their targets.

By the same token, the number of hidden layers and nodes affects the total number of weights in the network. In addition to increasing the training time, numerous weights require a large number of training vectors to describe decision surfaces of moderate order; the iterative training process is therefore further lengthened. Heuristically, the number of training patterns required is at least twice the number of weights. Except for very complex problems, one hidden layer usually suffices to classify patterns, but no consensus exists regarding the number of hidden nodes. In this work, an arbitrarily chosen ratio of one hidden unit for every four inputs is used, and all hidden nodes lie on one layer.

During training, the learning and acceleration factors determine the speed at which optimal weights are found. An acceleration factor of 0.9 is applicable to a wide variety of problems and is used for all simulations in this work. The best learning factor, on the other hand, varies with the shape of the performance surface, a function of the weight space which relates a network's total output error to its weights.⁷ The shape of this curve is determined by the correlations between inputs. In general, uncorrelated inputs produce smooth performance curves with global minima that can be found rapidly. Highly correlated inputs produce very rough curves with many local minima, so searching for a global minimum must be performed at a slow pace.
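The training loop described above, back-propagation with an acceleration (momentum) term, can be sketched compactly. This is a modern Python/NumPy illustration, not the report's training program: the toy data, network size, function name, and iteration count are ours; only the learning factor of 0.5 and acceleration factor of 0.9 echo the values used in this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, t, hidden, lr=0.5, momentum=0.9, iters=5000):
    """Batch back-propagation for a single-hidden-layer perceptron with
    one sigmoid output node.  X: S x N inputs in [0, 1]; t: S targets in
    {0, 1}.  The momentum ("acceleration") term reuses a fraction of the
    previous weight change to smooth descent on a rough error surface."""
    S, N = X.shape
    W1 = rng.uniform(-0.5, 0.5, (N, hidden)); b1 = np.zeros(hidden)
    W2 = rng.uniform(-0.5, 0.5, (hidden, 1)); b2 = np.zeros(1)
    vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
    vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
    for _ in range(iters):
        h = sigmoid(X @ W1 + b1)                    # forward pass
        y = sigmoid(h @ W2 + b2)[:, 0]
        d2 = (y - t) * y * (1 - y)                  # output delta
        d1 = np.outer(d2, W2[:, 0]) * h * (1 - h)   # hidden deltas
        vW2 = momentum * vW2 - lr * (h.T @ d2[:, None])
        vb2 = momentum * vb2 - lr * d2.sum() * np.ones(1)
        vW1 = momentum * vW1 - lr * (X.T @ d1)
        vb1 = momentum * vb1 - lr * d1.sum(axis=0)
        W2 += vW2; b2 += vb2; W1 += vW1; b1 += vb1
    return lambda Xq: sigmoid(sigmoid(Xq @ W1 + b1) @ W2 + b2)[:, 0]

# Toy acceptability problem: a state is "acceptable" only when both
# (normalized) inputs are low -- a stand-in for light-load conditions.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([1., 0., 0., 0.])
net = train_mlp(X, t, hidden=2)
assert ((net(X) >= 0.5) == (t == 1)).all()
```

The momentum update is what the text calls the acceleration factor: each weight change carries 0.9 of the previous change, damping oscillations on rough performance surfaces.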
In other words, the shape of the performance curve cannot be altered, but the rate at which weights are changed can be set. The learning rate should be set to reduce the network's total output error at each iteration of the training algorithm. If the total output error oscillates during the initial stages of training, the learning rate should be reduced. A learning rate of 0.5 is used for the 3-bus system, but the systems in Chapter 5 all require a rate of 0.2.

⁷ This function is obtained by substituting for the network outputs in Equation (3.9), using the forward propagation equations and the training set.

Stopping criteria for the training algorithm must also be chosen. As shown in Section 3.3, either "hard" or "soft" criteria may be used to halt the iterative training process. The "hard" criteria generally require several times as many iterations as the "soft" ones and, in many cases, only cause a neural net to memorize a training set instead of learning generalizations. Hence, only "soft" stopping criteria are used in this work. Also, the error gradient tolerance determines the allowable distance from the minimum total output error for convergence. Obviously, the number of iterations varies inversely with the tolerance. In this work, a tolerance of 0.001 is used.

After training, a decision must be made regarding the distinction between zero and one when observing continuous-valued network responses; that is, a threshold must be chosen such that responses above it are "high," and values below it are "low." In classifying power system states as failed or working, a high threshold is desirable because false "lows" are less detrimental than false "highs." This is especially true for on-line applications where operator responses might depend on a network's output. However, since the set of unacceptable states found by a perceptron will be larger with the use of a high threshold, the reliability estimates may be adversely affected.
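The effect of the threshold choice can be seen in a small sketch (illustrative Python only; the response values and function name are ours):

```python
def classify(outputs, threshold=0.5):
    """Map continuous network responses to 0/1 acceptability labels.
    Responses at or above the threshold are read as 'acceptable' (1);
    responses below it are read as 'unacceptable' (0)."""
    return [1 if y >= threshold else 0 for y in outputs]

responses = [0.92, 0.61, 0.48, 0.07]
print(classify(responses))                 # threshold 0.5 -> [1, 1, 0, 0]
print(classify(responses, threshold=0.8))  # conservative  -> [1, 0, 0, 0]
```

Raising the threshold converts borderline responses such as 0.61 into "unacceptable," enlarging the set of failed states and biasing the estimated indices toward pessimism, which is the trade-off described above.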
In this work, outputs greater than or equal to 0.5 represent 1, while numbers less than 0.5 connote 0.

The use of principal components brings additional training issues. No bus variables are reduced by the principal components analyzer (PCA), so only line contingencies are needed to find appropriate PCA weights. However, the choice of line contingencies used to find weights remains to be specified. Like the training set for the neural network, the training set for the PCA must be representative of all line contingencies without being redundant. The training set for the perceptron contains a representative set of line outages, but it must be stripped of bus variables in order to be used for the PCA. For instance, the training set associated with the system in Figure 4.1 has vectors containing 5 bus variables and 10 admittance variables. Note that admittance variables are specific to outage states and are independent of load levels; thus, only one load level for each outage state need be considered. When the remaining vectors are stripped of the 5 bus variables, redundancies are evident. The original training set contains outage states 1, 2, 3, 5, 6, 7, 14, 15, and 16, but states 2, 3, 5, 6, and 7 duplicate the line variables in state 1 and may be omitted. Hence, the training set for the PCA⁸ consists only of the 10 admittance variables for outage states 1, 14, 15, and 16. In short, a PCA training set should contain the admittance variables for each of the line contingencies (including the base case) represented in the original training set.

Once weights are found for the PCA, a determination must be made as to how many principal components to retain for training a reduced neural network (called a PCANN). The total number of principal components is the same as the number of admittance variables but, since they have no linear correlation, the amount of energy in the first few components rivals that of the entire set of admittances.
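The principal-components step, an eigendecomposition of the admittance correlation matrix that retains only components with eigenvalues of at least unity, can be sketched as follows (a modern Python/NumPy illustration; the function name and the toy admittance patterns are ours, not the report's data):

```python
import numpy as np

def pca_weights(patterns):
    """Return principal directions whose eigenvalues are >= 1.

    `patterns` is a K x M array of K line-contingency patterns with M
    admittance variables.  Variables are standardized so that the
    covariance of the result is the correlation matrix; components with
    eigenvalues below unity carry less energy than any single variable
    and are discarded."""
    Z = patterns - patterns.mean(axis=0)
    std = Z.std(axis=0)
    Z = Z / np.where(std > 0, std, 1.0)
    corr = (Z.T @ Z) / len(Z)            # admittance correlation matrix
    evals, evecs = np.linalg.eigh(corr)  # symmetric eigendecomposition
    keep = evals >= 1.0
    return evecs[:, keep], evals[keep]

# Toy patterns: 4 line contingencies, 3 strongly correlated admittances.
P = np.array([[1.0, 2.0, 0.5],
              [0.9, 1.8, 0.6],
              [0.0, 0.1, 0.9],
              [0.1, 0.0, 1.0]])
W, lams = pca_weights(P)
components = (P - P.mean(axis=0)) / P.std(axis=0) @ W  # reduced NN inputs
```

Because the toy admittances are nearly collinear, almost all of their energy concentrates in one or two retained components, which then replace the raw admittance variables as perceptron inputs.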
Inspection of the eigenvalues of the admittance correlation matrix is instructive. Obviously, principal components associated with trivial eigenvalues are useless. Eigenvalues less than unity are associated with principal components that contain less energy than any one of the admittances; hence, those components can also be neglected. This method of choosing principal components is used throughout this work. For the system of Figure 4.1, using the four line contingencies noted above, only two nontrivial eigenvalues are found. Thus, only two principal components are retained.

4.4 Results for a 3-Bus System

With these issues resolved, actual indices and conventional estimates for the system of Figure 4.1 can be compared to estimates from a neural net (NN) and a PCANN. Load flow analyses are performed on outage states 1, 2, 3, 5, 6, 7, 14, 15, and 16, at loads of 25%, 50%, 75%, and 100% of peak load. The resulting acceptability of each of these cases, as well as the associated bus and admittance variables, are used to form a 36-vector training set for a neural network. A perceptron is configured with 15 input nodes, 1 output node, and 4 nodes on one hidden layer. This 20-node network finds an optimal set of weights in 1156 iterations of the training set. When tested on all 16 outage states at 20 load levels (the system states in Table 4.2), the neural net exhibits an agreement with load flow results of 98.75%. Table 4.3 lists the load levels for which each outage state is acceptable and shows where the perceptron errs. (See NN Acceptability in the table.)

⁸ Since the original training set in this example contains all possible line contingencies, the PCA training set also contains all four line outages. This is not the case for larger systems; generally, the PCA only estimates principal components of admittances.

Table 4.3 Comparison of Acceptable Load Levels for 3-Bus Outage States

State  Elements    Training Set?  Actual Acceptability  NN Acceptability   PCANN Acceptability
  1    a b c d e   yes   all loads           all loads           all loads
  2    - b c d e   yes   no loads            no loads            no loads
  3    - - c d e   yes   no loads            no loads            no loads
  4    - b - d e   no    no loads            no loads            no loads
  5    a - c d e   yes   <85% of peak load   <90% of peak load   <90% of peak load
  6    a - - d e   yes   <65% of peak load   <75% of peak load   <75% of peak load
  7    a b - d e   yes   <85% of peak load   <90% of peak load   <90% of peak load
  8    - b c - e   no    no loads            no loads            no loads
  9    - b c d -   no    no loads            no loads            no loads
 10    a - c - e   no    no loads            no loads            no loads
 11    a - c d -   no    no loads            no loads            no loads
 12    a b - - e   no    no loads            no loads            no loads
 13    a b - d -   no    no loads            no loads            no loads
 14    a b c - e   yes   no loads            no loads            no loads
 15    a b c - -   yes   no loads            no loads            no loads
 16    a b c d -   yes   no loads            no loads            no loads

Next, weights for a PCA are determined from an eigensystem analysis of the admittance variables for outage states 1, 14, 15, and 16. Linear dependence among the 10 admittances is such that only two of the 10 principal components are retained, and they represent fully 100% of the energy contained in the admittances. The admittance variables for the four line contingencies are passed through the PCA, and values of the first principal component for these contingencies are shown in Figure 4.2. Notice that, for this example, the first component alone places line contingencies into linearly separable classes. Hence, a perceptron may more easily classify a given state by using the two⁹ principal components as input instead of the 10 admittance variables.

Figure 4.2 First Principal Component for Line Contingencies on the 3-Bus System

(The figure plots the first principal component on an axis from −4 to 4; the double contingency, the two single-line contingencies, and the base case fall at distinct, separable values.)

Another perceptron is configured for use with the PCA. For input, this network uses the same voltage and power variables as the original net, but the 10 admittance variables are replaced by the two principal components.
Hence, this network has only 7 inputs, 1 output, and 2 nodes on one hidden layer. The same 36 system states used to train the original net are used in training this 10-node network, which requires 1287 iterations to find optimal weights. When tested on all 320 system states, the PCANN also shows an agreement with load flow results of 98.75%. (See PCANN Acceptability in Table 4.3.)

⁹ Since the first principal component is able to clearly distinguish between line contingencies, the second is not needed. It is included for conformity with the method presented here.

The results in Table 4.3 indicate that both networks are capable of determining the acceptability of an arbitrary system state.¹⁰ For both networks, errors occur through overestimating the levels of load that are acceptable for a particular contingency. The PCANN takes a few more iterations to train, but it uses much less data. The two principal components contain the same information as the admittance variables, so the simpler structure of the PCANN input variables makes this network equally accurate in testing.

Figure 4.3 Learning Curves of Multilayered Perceptrons for the 3-Bus System

(The figure plots total output error against training iterations for the NN and the PCANN.)

A fast learning rate of 0.5 and an acceleration rate of 0.9 are used for both perceptrons. The learning curve, a plot of total output error during training, is shown for both networks in Figure 4.3. The plot shows that the training algorithm proceeds smoothly to the minimum error in both cases. Once the set of unacceptable states is specified by the neural networks, adequacy indices can be estimated and compared. Using the procedure outlined in Section 4.2, as well as the element and load parameters for the 3-bus system, the original perceptron

¹⁰ The number of weights in the first perceptron is 64, while the second has only 16 weights. Only 36 patterns are used in training each net, so the rule specifying twice the number of training states as weights is violated in the first case. The high accuracy of this network indicates that classes of patterns are easily separable and that the decision surfaces associated with the 3-bus system are of low order. That is, the two classes of system states cluster well.
gives a probability of failure of 0.0253, a mean duration of 30.9 hours, a failure frequency of 7 times per year, and an unserved demand of 48.3 MW. Because the PCANN makes exactly the same errors as the original net, its estimates are the same. These results are tabulated in Table 4.4.
Table 4.4 Comparison of Adequacy Estimates for the 3-Bus System

Adequacy Index   Actual Value  Conventional Estimate  % Error  NN Estimate  % Error  PCANN Estimate  % Error
Probability      0.0277        0.0304                  9.5     0.0253        8.7     0.0253           8.7
Duration         30.53         30.18                   1.2     30.93         1.3     30.93            1.3
Frequency        7.96          8.82                   10.8     7.17          9.8     7.17             9.8
Unserved Demand  44.68         50.41                  12.8     48.26         8.0     48.26            8.0

Note that the estimates obtained using the perceptrons are somewhat more accurate than the conventional estimate on three of the four indices, and slightly less accurate on the other. Errors using the conventional technique are less than 15% in all categories, and errors using either neural network strategy are less than 10%. Since the PCANN shows no increase in accuracy over the original net, principal components analysis is not useful when applied to this system (unless speedier training is highly valued). In any event, the indices suggest that the 3-bus system is available over 97% of the time. It is expected to fail 8 times per year, with each failure resulting in a loss of almost 45 MW of load for over 30 hours.

CHAPTER 5
SIMULATION RESULTS

5.1 Computational Tasks

Results for the 3-bus system listed in Chapter 4 are obtained by passing descriptive input files through a series of thirteen computer programs.¹ One input file lists the real and reactive loads at each bus, voltage magnitudes and power capabilities of all generator buses, and the impedances, shunt capacitances, and thermal limits of all lines. For the 3-bus system, this file is called "ex3.dat." The first program, called drive, uses this input file to produce files which describe each of the system contingencies at 20 load levels. A subroutine, dacdrive, performs a decoupled load flow study on each system state and lists the resulting acceptability (either 0 or 1) in a file called "tester.ans." A second input file, called "lines.dat," lists only the line data for the system.
This file is used by program pattern to form the admittance matrix for each line contingency; the nonzero matrix elements are placed in a file called "fullpatt.dat." At this point, program maketesta uses input file "ex3.dat" and the line patterns in "fullpatt.dat" to form the neural net inputs for each system contingency at 20 load levels. These vectors are listed in a file called "testera.dat." Next, program maketraina uses selected vectors in "testera.dat" and the associated acceptabilities in "tester.ans" to form a neural net training set, listed in file "traina.dat." Training vectors are selected, at four load levels, in the following manner:
* base case,
* failure of first generating unit,
* failure of first and second generating units,
* failure of second generating unit,
* failure of second and third generating units,
* failure of last generating unit,
* failure of first line,
* failure of first and second lines,
* failure of second line,
* failure of second and third lines,
* failure of last line.
Weights for the neural net are computed by program traina, which performs the back-propagation routine on the vectors in "traina.dat." Weights are stored in file "weightsa.dat" and used by program testera to compute neural net responses to each of the vectors in "testera.dat." These responses are compared to the acceptabilities in "tester.ans" and listed in file "testera.otp." Next, program pcaweight uses the admittance elements of selected line contingencies in "fullpatt.dat" to form an admittance correlation matrix, from which eigenvalues and eigenvectors are found. Eigenvectors associated with eigenvalues less than unity are discarded, and the remaining vectors are stored in file "pcaweight.dat."

¹ A technical report containing programming code and sample files for the 3-bus system is available from Dr. Jose C. Principe, Department of Electrical Engineering, University of Florida.
Program pca then passes all admittance patterns in "fullpatt.dat" through a principal components analyzer formed from the vectors in "pcaweight.dat;" the transformed vectors are stored in "fullpca.dat." Using input file "ex3.dat" and the reduced line patterns in "fullpca.dat," program maketestb forms neural net inputs for each system contingency at 20 load levels. These vectors are stored in file "testerb.dat." Program maketrainb forms a training set from selected vectors and associated acceptabilities in "tester.ans." Training states are the same as those used to train the first neural net, and these vectors are stored in "trainb.dat." Weights for the neural net are computed by program trainb, which performs the training routine on the vectors in "trainb.dat." Weights are stored in file "weightsb.dat" and used by testerb to compute network responses to each vector in "testerb.dat." These responses are compared to the acceptabilities in "tester.ans" and listed in file "testerb.otp."

Finally, reliability indices for the system are computed by program reliab. A third input file, called "reliab.dat," contains load data for the system and mean times to failure and repair for each element. Using this file, reliab forms the annual hourly load curve, then computes probabilities and transition rates associated with each of 20 load levels and with each of four load levels. Probabilities and transition rates for each outage state are then computed from element failure data. Using the actual acceptabilities in "tester.ans," the probability, duration, and unserved demand for each unacceptable system state are accumulated, and these quantities are used to compute actual reliability indices. In the same manner, the acceptabilities in "testera.otp" and "testerb.otp" are used to compute indices for the two neural nets. This procedure is also used to obtain conventional estimates, but only the 100 most probable outages at four load levels are considered.
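The accumulation performed by reliab can be sketched as follows. This is a minimal modern Python illustration under the standard frequency-and-duration conventions, not the report's program; the state data, field names, and function name are toy values of ours (the report's exact formulas appear in Section 4.2).

```python
HOURS_PER_YEAR = 8760

def adequacy_indices(failed_states):
    """Estimate adequacy indices from per-state data.

    `failed_states` is a list of dicts, one per unacceptable system state,
    each holding its probability `p`, its total departure rate `rate`
    (transitions out of the state per hour), and its unserved demand `mw`.
    Failure probability and frequency accumulate over the failed states;
    mean duration is probability over frequency, and unserved demand is
    probability-weighted."""
    p_fail = sum(s["p"] for s in failed_states)
    freq_per_hr = sum(s["p"] * s["rate"] for s in failed_states)
    return {
        "probability": p_fail,
        "frequency_per_year": freq_per_hr * HOURS_PER_YEAR,
        "duration_hours": p_fail / freq_per_hr,
        "unserved_mw": sum(s["p"] * s["mw"] for s in failed_states) / p_fail,
    }

# Two toy unacceptable states.
toy = [
    {"p": 0.010, "rate": 0.030, "mw": 40.0},
    {"p": 0.005, "rate": 0.020, "mw": 60.0},
]
idx = adequacy_indices(toy)
assert abs(idx["probability"] - 0.015) < 1e-12
```

Note the consistency check these relations provide: frequency per year equals probability times 8760 divided by mean duration, which the 3-bus results of Chapter 4 satisfy (0.0277 × 8760 / 30.53 ≈ 7.96 failures per year).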
The indices obtained by each of these strategies are listed in file "reliab.otp."

5.2 RBTS Results

Figure 5.1 The Roy Billinton Test System

The Roy Billinton Test System (RBTS) is a small bulk power system model used in graduate reliability studies at the University of Saskatchewan, Canada [57]. The system, shown in Figure 5.1, consists of 6 buses and operates on a base of 230 kV and 100 MVA. Two of the buses house a total of 11 generating units, and both buses have a voltage magnitude of 1.05. Generating data is listed in Table 5.1.

Table 5.1 RBTS Generation Data

Bus  MW Rating  MVAR Limits  MTTF, hrs  MTTR, hrs
1    40         −15 / +17    1460       45
1    40         −15 / +17    1460       45
1    20         −7 / +12     1752       45
1    10         0 / +7       2190       45
2    40         −15 / +17    2920       60
2    20         −7 / +12     3650       55
2    20         −7 / +12     3650       55
2    20         −7 / +12     3650       55
2    20         −7 / +12     3650       55
2    5          0 / +5       4380       45
2    5          0 / +5       4380       45

Total rated generation of 240 MW is more than sufficient to supply a peak load of 185 MW. The load curve is described in the Appendix and is the same as that for the 3-bus system of Chapter 4. Coincident loads are assumed, and peak bus loads are listed in Table 5.2. Voltage limits for all load buses are, in per unit, 0.97 and 1.05.

Table 5.2 RBTS Peak Bus Loads

Bus  MW Load  MVAR Load
1    0        0
2    20       4
3    85       17
4    40       8
5    20       4
6    20       4

The transmission system consists of nine lines. Per-unit line parameters are given in Table 5.3, along with failure and repair data.

Table 5.3 RBTS Transmission Data

Line  From Bus  To Bus  Resistance  Reactance  Shunt Admittance  MVA Limit  MTTF, hrs  MTTR, hrs
1     1         2       0.0912      0.4800     0.0564            71         2180       10
2     1         3       0.0342      0.1800     0.0212            85         5830       10
3     1         3       0.0342      0.1800     0.0212            85         5830       10
4     2         4       0.1140      0.6000     0.0704            71         1742       10
5     2         4       0.1140      0.6000     0.0704            71         1742       10
6     3         4       0.0228      0.1200     0.0142            71         8750       10
7     3         5       0.0228      0.1200     0.0142            71         8750       10
8     4         5       0.0228      0.1200     0.0142            71         8750       10
9     5         6       0.0228      0.1200     0.0142            71         8750       10

There are 211 outage states for this system, or 4220 system states.
Forty-five of the outage states are line outages, while 66 concern only generating units. In any event, 4220 load flow studies must be performed, each for a 6-bus system, in order to collect the data required for calculating actual values of reliability indices. A combination of this data and the load curve yields the values shown in Table 5.4. For this system, the number of failed states is 1581. The actual indices show that the RBTS has a 99.75% availability. It undergoes three failures every two years, each lasting almost 15 hours and resulting in over 91 MW of lost load.

Table 5.4 RBTS Reliability Indices

Adequacy Index   Actual Value  Conventional Estimate  % Error  NN Estimate  % Error  PCANN Estimate  % Error
Probability      0.0025        0.0064                 157.4    0.0038       53.8     0.0024           5.9
Duration         14.6          14.4                     1.2    17.6         20.6     10.7             26.4
Frequency        1.5           3.9                    160.6    1.9          27.5     1.9              27.9
Unserved Demand  91.2          82.1                    10.0    82.5          9.6     82.3              9.8

The conventional estimates, using only the 100 most probable contingencies, give the most inaccurate values for three of the four indices. The estimate of failure duration is unique in having the least error with the conventional technique, but the probability and frequency estimates have errors approaching 200%. Of the 400 system states involved in conventional estimates, 120 are failed.

There are two voltage magnitudes, five real bus injections, and four reactive injections available as neural network inputs. In addition, there are 13 nonzero, complex admittance elements, for a total of 37 available inputs. A neural net is configured with 37 input nodes and ten nodes on a hidden layer. It learns the 156 training cases in 2473 iterations and has an 84.5% agreement with load flow results when tested on all system states. The number of unacceptable states found by this network is 1848. This neural net gives the most accurate estimate of unserved demand, but the error is only slightly less than either of the other two strategies.
The 26 admittance variables are transformed to seven principal components by passing them through a PCA that retains 100% of the energy in the admittances. A neural net with 18 inputs and five hidden nodes learns the training set in 1781 iterations and agrees with load flow results on 87.0% of the system states. The number of states deemed unacceptable by this net is 1891. It performs better than the conventional technique on all but one index, while its performance is slightly worse on three indices than that of the original network. However, this network gives a much better estimate of probability than the original net, so the PCA is quite useful for this system.

5.3 GRU System Results

Figure 5.2 The GRU System

The Gainesville Regional Utilities (GRU) system, shown in Figure 5.2, has a reputation for bulk system reliability throughout its service territory. The system has 11 buses operating on a base of 138 kV and 100 MVA. Nine generating units reside on two of the buses, which have per-unit voltages of 1.02 and 1.01, respectively. Table 5.5 lists the generation data for this system.

Table 5.5 GRU System Generation Data

Bus  MW Rating  MVAR Limits  MTTF, hrs  MTTR, hrs
1    218        −45 / +91    8730       30
1    81         −10 / +25    8730       30
1    18         −6 / +9      8730       30
1    18         −6 / +9      8730       30
2    41         −17 / +19    8730       30
2    20         −10 / +12    8730       30
2    14         −8 / +10     8730       30
2    14         −8 / +10     8730       30
2    14         −8 / +10     8730       30

The generating capacity of 442 MW supplies a peak load of 293 MW.² Again, coincident loads are assumed, and voltage limits for the load buses are 0.95 and 1.05, respectively. Peak bus loads are listed in Table 5.6.

Table 5.6 GRU System Peak Bus Loads

Bus  MW Load  MVAR Load
1    13.2     2.6
2    89.1     17.8
3    28.1     5.6
4    40.7     8.1
5    7.0      1.4
6    21.1     4.2
7    7.9      1.6
8    6.1      1.2
9    39.3     7.9
10   31.1     6.2
11   9.4      1.9

There are 14 lines in the transmission system, providing 105 line contingencies to be considered in the adequacy assessment. Line parameters, in per unit, are given with failure and repair data in Table 5.7.
Table 5.7 GRU System Transmission Data

      From   To                               Shunt        MVA     MTTF,   MTTR,
Line  Bus    Bus   Resistance   Reactance    Admittance   Limit   hrs     hrs
1     1      3     0.0057       0.0325       0.0096       245.7   87590   10
2     1      4     0.0024       0.0138       0.0041       245.7   87590   10
3     1      5     0.0030       0.0223       0.0082       313.0   87590   10
4     2      3     0.0055       0.0313       0.0092       245.7   87590   10
5     2      4     0.0088       0.0500       0.0148       245.7   87590   10
6     2      10    0.0051       0.0288       0.0085       245.7   87590   10
7     2      11    0.0032       0.0182       0.0054       205.6   87590   10
8     5      6     0.0046       0.0339       0.0125       313.0   87590   10
9     6      7     0.0014       0.0077       0.0023       245.7   87590   10
10    6      8     0.0045       0.0257       0.0076       245.7   87590   10
11    7      8     0.0031       0.0179       0.0053       245.7   87590   10
12    8      9     0.0061       0.0350       0.0103       245.7   87590   10
13    8      11    0.0066       0.0377       0.0111       205.6   87590   10
14    9      10    0.0015       0.0087       0.0026       245.7   87590   10

The 277 outage states for this system, when combined with the 20-level load model, yield 5440 system states. Hence, 5440 load flow studies are necessary for an actual calculation of reliability indices. These indices and their estimates are shown in Table 5.8. There are only 368 failed states in the GRU system, and it has an availability of 99.89%. The system fails once every three years for over 28 hours, resulting in 50 MW of lost load. (The 1991 annual load curve for this system was obtained from the Strategic Planning staff at GRU. Scheduled power interchanges with connected utilities were not included in the system model.)

The conventional estimates, using only the 100 most probable outages, give the most inaccurate values for three of the four indices. The best conventional estimate, failure duration, is "off" by less than 1%, while the other estimates contain 70-90% error. Of the 400 system states involved in conventional estimates, 48 are failed.
Table 5.8 GRU System Reliability Indices

Adequacy      Actual   Conventional         NN                   PCA-NN
Index         Value    Estimate   % Error   Estimate   % Error   Estimate   % Error
Probability   0.0011   0.0018      75.2     0.0010      7.4      0.0010      7.8
Duration      28.4     28.6         0.8     28.8        1.3      28.3        0.3
Frequency     0.3      0.6         73.8     0.3         4.9      0.3         5.6
Unserved
Demand        49.5     92.0        85.7     93.7       89.3      62.4       26.1

There are two voltage magnitudes, ten real bus injections, and nine reactive injections available as neural network inputs. In addition, there are 25 nonzero, complex admittance elements, for a total of 71 available inputs. A neural net is configured with 71 input nodes and 18 nodes on a hidden layer. It learns the 180 training cases in 3621 iterations and has an 89.1% agreement with load flow results when tested on all system states. The number of unacceptable states found by this network is 523. This neural net gives the most accurate estimates of two indices, probability and frequency; the estimate of unserved demand is actually worse than the conventional estimate.

The 50 admittance variables are transformed to 12 principal components by passing them through a PCA that retains 96.3% of the energy in the admittances. A neural net with 33 inputs and eight hidden nodes learns the training set in 4443 iterations and agrees with load flow results on 88.6% of the system states. The number of states deemed unacceptable by this net is 487. In comparison to the other techniques, it performs best on the duration and unserved demand indices, but its performance is slightly worse than the NN on the probability and frequency indices. Principal components analysis is worthwhile here primarily due to its increased accuracy on the loss-of-load index.

5.4 10-Bus System Results

Figure 5.3 The 10-Bus System

This fictitious system, as its name suggests, uses 15 lines to supply its ten bus loads. Three of the buses house the eight generating units, which are described in Table 5.9.
Voltage magnitudes at the three generator buses are 1.02, 1.04, and 1.02, respectively. The generation system has a maximum capacity of 530 MW to supply a peak load of 480 MW. The load curve described in the Appendix is also used for this system. Coincident peak bus loads are listed in Table 5.10. Per-unit voltage limits for load buses are 0.95 and 1.05. The system operates on a base of 23 kV and 100 MVA. Per-unit line parameters are given in Table 5.11, which also contains failure and repair data for the lines.

Table 5.9 10-Bus Generation Data

      MW       MVAR      MTTF,   MTTR,
Bus   Rating   Limits    hrs     hrs
1     100      25 / 40   2100    30
1     100      25 / 40   2100    30
2     65       30 / 30   1630    25
2     50       10 / 25   1630    25
2     50       10 / 25   1580    40
3     65       30 / 30   1790    15
3     50       10 / 25   1790    15
3     50       10 / 25   1710    30

Table 5.10 10-Bus System Peak Bus Loads

Bus   MW Load   MVAR Load
1     40        13
2     30        10
3     40        13
4     60        20
5     40        13
6     70        23
7     50        17
8     30        10
9     50        17
10    70        23

Table 5.11 10-Bus System Transmission Data

      From   To                              Shunt        MVA     MTTF,   MTTR,
Line  Bus    Bus   Resistance   Reactance   Admittance   Limit   hrs     hrs
1     1      4     0.0360       0.1440      0.0170       110     12200   10
2     1      5     0.0420       0.1680      0.0210       110     14100   12
3     1      7     0.0540       0.2310      0.0260       110     16500   14
4     2      5     0.0310       0.1260      0.0160       110     18700   16
5     2      6     0.0310       0.1260      0.0160       110     16500   18
6     2      9     0.0840       0.3360      0.0410       110     14100   20
7     3      7     0.0530       0.2100      0.0260       110     18700   19
8     3      8     0.0750       0.3010      0.0360       110     16500   17
9     3      10    0.0420       0.1680      0.0210       110     14100   15
10    4      5     0.0630       0.2520      0.0310       110     12200   13
11    4      6     0.0420       0.1680      0.0210       110     12200   11
12    6      7     0.0310       0.1260      0.0160       110     14100   12
13    8      9     0.0840       0.3360      0.0410       110     16500   14
14    8      10    0.0750       0.3010      0.0360       110     18700   16
15    9      10    0.0630       0.2520      0.0310       110     16500   18

The 277 outage states for the 10-bus system correspond to 5440 system states, so 5440 load flow studies are required for an actual calculation of the indices. This system has 1490 failed states. Actual indices and estimates are shown in Table 5.12.
The 10-bus system has a 98.01% availability; it fails eight times per year, each failure lasting over 22 hours and resulting in 77 MW of lost load. The conventional technique, using only the 100 most probable outages, gives the most inaccurate values for all four indices. The estimate of duration is unique among conventional estimates in giving reasonable accuracy, but each of the other indices contains approximately 100% error. Of the 400 system states involved in conventional estimates, 163 are failed.

Table 5.12 10-Bus System Reliability Indices

Adequacy      Actual   Conventional         NN                   PCA-NN
Index         Value    Estimate   % Error   Estimate   % Error   Estimate   % Error
Probability   0.0199   0.0427     114.4     0.0207      3.8      0.0204      2.7
Duration      22.3     23.3         4.3     22.2        0.5      22.0        1.5
Frequency     7.8      16.1       105.6     8.1         4.3      8.1         4.2
Unserved
Demand        76.9     146.7       90.8     80.2        4.4      81.0        5.3

There are three voltage magnitudes, nine real bus injections, and seven reactive injections available as neural network inputs. In addition, there are 25 nonzero, complex admittance elements, for a total of 69 available inputs. A neural net is configured with 69 input nodes and 18 nodes on a hidden layer. It learns the 180 training cases in 7410 iterations and has an 89.0% agreement with load flow results when tested on all system states. The number of unacceptable states found by this network is 1476. This neural net gives the most accurate estimates of the duration and demand indices; the other estimates are slightly less accurate than those of the reduced network.

The 50 admittance variables are transformed to 14 principal components by passing them through a PCA that retains 98.3% of the energy in the admittances. A neural net with 33 inputs and eight hidden nodes learns the training set in 7968 iterations and agrees with load flow results on 86.1% of the system states. The number of states deemed unacceptable by this net is 1968.
In comparison to the other techniques, it performs best on the probability and frequency indices, but its performance relative to the original net does not merit the added computation of the PCA.

CHAPTER 6
CONCLUSION

6.1 Summary

The reliability assessment of an electric power system is an integral part of expansion planning for the system. Given a model of a power system, the planner measures the effect of system contingencies on the overall reliability of the system. Traditionally, due to the complexity involved in dealing with composite systems, planners have evaluated the generation, transmission, and distribution components of a power system separately. Within recent years, however, engineers have realized that changes in one part of a system can significantly affect the reliability of the entire system. Hence, modern reliability assessment involves evaluation of the bulk power system model of an electric utility. This model consists of all generators and transmission lines in the power grid, with the distribution component represented by load equivalents at major points of service.

Until the early 1980s, reliability assessment was centered on evaluating a utility's generation system. Measuring the effect of a generating system contingency involved simply comparing available capacity with various load levels. In contrast, measuring the effect of a bulk power system contingency involves satisfying constraints on generation, bus voltages, and line flows, as well as taking system losses into account. In short, evaluation of a bulk power system requires the performance of several load flow studies for each contingency. Load flow analysis is time-consuming and costly in terms of computer resources; in fact, these studies form the most expensive component of bulk power system reliability assessment. For this reason, a fast, noniterative method of evaluating system contingencies, such as that presented here, may greatly facilitate planning for electric utilities.
Neural networks have been successfully applied to a number of problems, including handwriting analysis, detection of speech patterns, and logic gate mappings, given only examples of patterns characteristic of the specific problem. Relatively little research has been performed on the application of these networks to power systems. Yet the need for on-line computation in power system operations, combined with the iterative nature of many solution techniques (and the sheer intractability of certain problems), means the supply and distribution of electric energy may provide a vast area of application for this form of artificial intelligence.

The problem considered in this work is the determination of state acceptability using a multilayered perceptron trained by error back-propagation. After choosing input variables that describe a system state, principal components analysis is used to reduce the number of inputs to the neural net. A set of patterns and desired outputs is selected for training and, when network weights are found, the entire state space is passed through the net. The resulting network outputs are used to estimate chosen adequacy indices.

For the three systems described in Chapter 5, the results show that a neural network can determine the acceptability of a system state with high probability. In addition, principal components analysis is shown to be useful in reducing the dimension of an input vector. In some cases, the use of principal components improves the results; in others, the accuracy of indices is reduced. Because of this ambiguity, principal components analysis should only be used with large systems whose input vectors would otherwise result in unreasonable network training times. Nevertheless, the reliability indices found using perceptrons are in much closer agreement with actual values than those found using the conventional technique.

This work offers several contributions to the study of electric utility systems.
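As a concrete illustration of the approach, the sketch below trains a one-hidden-layer perceptron by error back-propagation to classify states as acceptable or not. The layer sizes, learning rate, and squared-error criterion are assumptions chosen for demonstration; this is not the network or data of Chapter 5.

```python
import numpy as np

def train_mlp(X, y, n_hidden=4, lr=2.0, epochs=5000, seed=0):
    """Train a one-hidden-layer perceptron with sigmoid units by
    batch error back-propagation on a squared-error criterion.
    Returns a function mapping inputs to 0/1 acceptability labels.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    m = len(y)
    for _ in range(epochs):
        h = sig(X @ W1 + b1)                        # hidden activations
        out = sig(h @ W2 + b2)                      # output in (0, 1)
        d2 = (out - y[:, None]) * out * (1 - out)   # output-layer delta
        d1 = (d2 @ W2.T) * h * (1 - h)              # back-propagated delta
        W2 -= lr * h.T @ d2 / m
        b2 -= lr * d2.mean(axis=0)
        W1 -= lr * X.T @ d1 / m
        b1 -= lr * d1.mean(axis=0)
    # threshold the continuous output at 0.5 ("high" vs. "low")
    return lambda Xq: (sig(sig(Xq @ W1 + b1) @ W2 + b2).ravel() >= 0.5).astype(int)
```

Thresholding the continuous output at 0.5 mirrors the "high"/"low" split whose sensitivity is raised as an open question in Section 6.3.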
The methodology provided for the use of neural networks is both systematic and robust. Economies of scale are highly favorable in that even large systems can give results similar to those obtained for the small systems used here. For operation and control, the research offers a way to monitor a system continuously instead of performing load flow studies hourly, as is the conventional practice. It is important to emphasize that, once a neural net has been trained on a power system, the acceptability of any state of the system can be determined in a fraction of the time required to perform a load flow study.

For planning, this work shows that a "middle ground" exists between the intensive computation of actual indices and the easily computed but inaccurate conventional technique. The neural network approach to adequacy estimation is both faster than the actual computation and more accurate than the conventional technique. Also, considering the ongoing nature of the planning process, the computational intensity involved in training a perceptron is possibly a one-time cost. After the first planning year, network training for an expanded system can begin with previously computed weights, requiring far fewer iterations to include new data in the mapping. This work offers a way to drastically reduce the number of load flow studies necessary in designing and maintaining a reliable system.

6.2 Limitations

The conclusions of this research are based on studies of only three small power systems. Though this work is primarily concerned with developing a methodology for studying the application of perceptrons to reliability and state acceptability, the results cannot be generalized without testing on an extensive set of realistic systems. In addition, this work focuses only on system-wide adequacy indices; that is, no provision is made for local indices such as the probability of failure at a given bus.
However, since a perceptron specifies the set of unacceptable states, load flow studies can be performed on those states to obtain bus-related information. Alternatively, a neural net with multiple output nodes may be trained to give failed buses for system states and, if particular local indices are desired, load flow studies may be performed on only those states which indicate specific system failures.

Furthermore, time comparisons between reliability assessment methods are little more than informed guesses based on the author's experience with the simulations in Chapters 4 and 5. Though time measurements are accessible for each technique, any resulting statements are flawed due to coding inefficiency in all methods, especially in coding the training algorithms. Also, though issues related to employing principal components are discussed, few specific recommendations on the use of this technique are given; for instance, an analysis of the effects of reduced input correlation on the performance surfaces is lacking for this application.

In addition, neural network processing requires making judicious choices of several parameters. From the random numbers initially assigned to network weights to the learning and acceleration factors of the training procedure, the parameters determine whether the network accurately models the training set and whether it can generalize to new patterns. Many of these parameters have been tested on a variety of applications and have generally accepted heuristics. Unfortunately, no heuristic exists for the number of hidden units to use. A 4:1 ratio of inputs to hidden nodes is used in this work, but it is somewhat arbitrary and was selected only for consistent comparison of results. Indeed, some of the network simulations described in Chapter 5 have higher agreement with load flow studies when different ratios are used. Hence, the 4:1 ratio is not recommended as a general heuristic.
Although a great deal of research in neural networks is devoted to finding a rule for this parameter [58-59], it will be necessary to experiment with many power systems before any rule becomes acceptable for this application. In the context of presenting a methodology, the lack of an acceptable heuristic for the number of hidden units is a major limitation of this work. Hopefully, the discussions in Chapters 3 and 4 may guide other researchers in this matter.

6.3 Future Work

With the methodology presented here, it is possible to formulate an exhaustive study of power systems to validate the results. Along with such a study, faithful time comparisons of reliability assessment techniques can be obtained. This requires professionally developed load flow routines, as well as "state-of-the-art" programming code for training multilayered perceptrons [60]. In addition, a multiple-year planning study with yearly expansions might be simulated for various systems in order to quantify the benefits of using previously computed weights in retraining a perceptron for an evolving system. This study would assume a set of expansions limited to the addition of generating units at existing voltage-controlled buses, alteration of line admittances along existing rights-of-way, and increments in the annual peak load; however, these additions form the bulk of power system expansions.

Also, a study of the effects on reliability indices of varying the threshold between "high" and "low" network outputs would be a valuable addition to this work, as would a study on the utilization of principal components. Plotting the effects of diverse sizes of hidden layers on a perceptron's accuracy in this application would also prove beneficial, using similar systems of various dimensions. Knowing the decision regions formed by the hidden layer may impact both the operation of and planning for a power system.
With unacceptable regions known to the dispatcher, continuous monitoring of system variables may help avoid failed states by prompting preventive operations. In the planning process, finding commonalities between states in each unacceptable region may suggest specific expansions for improved reliability. A study on the specific benefits of analyzing decision regions for realistic systems would be the most suitable progeny of this work. Research questions include, among others,

1. the sensitivity of decision regions to particular weights (i.e., high sensitivity to certain weights may suggest the importance of associated power system variables), and
2. the implications of similarities between clustered states (i.e., regions filled with identifiably similar contingencies might be eliminated by reducing the effects of such contingencies).

This study might also provide clues to such vexing problems as transmission "bottlenecks" and appropriate amounts of generation reserves. Further research in this field should be aimed at discovering the implications of a direct mapping of power system variables to state acceptability.

Finally, this work has shown the manner in which a neural network can be trained to specify the acceptability of a given power system state. Although the intended application is contingency screening for reliability assessment, many tasks related to power system maintenance and control can benefit from this work. Examples are the timing of preventive maintenance for generators, noncritical line repairs, and feeder reconfiguration. These applications currently require load flow analyses although, like contingency screening, they do not require detailed knowledge of bus voltages or line flows. Hence, these and many other power system tasks provide good applications for further research.

APPENDIX A
ANNUAL LOAD MODEL

With the exception of the GRU system, all test systems use an annual load curve taken from the IEEE RTS.
Load data is given in three stages:

- weekly peak load as a percentage of the annual peak,
- daily peak load as a percentage of the weekly peak, and
- hourly peak load as a percentage of the daily peak.

In this way, each hourly load of the year is computed as the product of the associated weekly, daily, and hourly peak loads. Table A.1 lists the peak load for each week in the projected year. The annual peak occurs in the 51st week of the year, though a lower seasonal peak occurs in the 23rd week. Either a winter- or summer-peaking system can be described by assigning dates to the first week.

Table A.1 Weekly Peak Load in Percent of Annual Peak

Week   Peak Load   Week   Peak Load
1      86.2        27     75.5
2      90.0        28     81.6
3      87.8        29     80.1
4      83.4        30     88.0
5      88.0        31     72.2
6      84.1        32     77.6
7      83.2        33     80.0
8      80.6        34     72.9
9      74.0        35     72.6
10     73.7        36     70.5
11     71.5        37     78.0
12     72.7        38     69.5
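The three-stage composition described above can be expressed directly. The function below is a small illustrative sketch, not code from this work; the example percentages for the daily and hourly stages are hypothetical, since only the weekly table is reproduced here.

```python
def hourly_load(annual_peak_mw, weekly_pct, daily_pct, hourly_pct):
    """Hourly load from the three-stage load model: each percentage
    scales the peak of the level above it (annual -> week -> day -> hour).
    """
    return (annual_peak_mw
            * (weekly_pct / 100.0)
            * (daily_pct / 100.0)
            * (hourly_pct / 100.0))
```

For example, with a 100 MW annual peak in week 1 (86.2% of the annual peak), a day peaking at 95% of the weekly peak and an hour at 80% of the daily peak give 100 x 0.862 x 0.95 x 0.80, or about 65.5 MW.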