UFDC Home  myUFDC Home  Help  RSS 
Title Page  
Acknowledgement  
Table of Contents  
Abstract  
Problem formulation  
System assumptions  
Model formulation  
Optimal parameter selection  
Applications  
Appendix A: Orthogonal signal...  
Appendix B: Hilbert space...  
Appendix C: Sdomain derivation...  
Appendix D: Squared error stationarity...  
References  
Biographical sketch 
CITATION
THUMBNAILS
PAGE IMAGE
ZOOMABLE


Full Citation  
STANDARD VIEW
MARC VIEW


Table of Contents  
Title Page
Page i Acknowledgement Page ii Table of Contents Page iii Page iv Abstract Page v Page vi Problem formulation Page 1 Page 2 Page 3 Page 4 Page 5 Page 6 Page 7 Page 8 Page 9 System assumptions Page 10 Page 11 Page 12 Page 13 Page 14 Page 15 Page 16 Page 17 Page 18 Page 19 Page 20 Page 21 Model formulation Page 22 Page 23 Page 24 Page 25 Page 26 Page 27 Page 28 Page 29 Page 30 Page 31 Page 32 Page 33 Page 34 Optimal parameter selection Page 35 Page 36 Page 37 Page 38 Page 39 Page 40 Page 41 Page 42 Page 43 Page 44 Page 45 Page 46 Page 47 Page 48 Page 49 Page 50 Page 51 Page 52 Page 53 Page 54 Page 55 Page 56 Page 57 Page 58 Page 59 Page 60 Page 61 Page 62 Page 63 Page 64 Page 65 Page 66 Page 67 Page 68 Page 69 Page 70 Page 71 Page 72 Page 73 Page 74 Page 75 Page 76 Page 77 Page 78 Page 79 Page 80 Applications Page 81 Page 82 Page 83 Page 84 Page 85 Page 86 Page 87 Page 88 Page 89 Page 90 Page 91 Page 92 Page 93 Page 94 Page 95 Page 96 Page 97 Page 98 Page 99 Page 100 Page 101 Page 102 Page 103 Page 104 Page 105 Page 106 Page 107 Page 108 Page 109 Page 110 Page 111 Page 112 Page 113 Page 114 Page 115 Page 116 Page 117 Page 118 Page 119 Page 120 Page 121 Page 122 Page 123 Page 124 Appendix A: Orthogonal signal sets Page 125 Appendix B: Hilbert space concepts Page 126 Page 127 Appendix C: Sdomain derivation of minimum J Page 128 Page 129 Appendix D: Squared error stationarity using the Laguerre model Page 130 Page 131 References Page 132 Page 133 Page 134 Page 135 Page 136 Page 137 Page 138 Page 139 Biographical sketch Page 140 Page 141 Page 142 Page 143 

Full Text  
SYSTEM MODELING USING GENERALIZED LINEAR FEEDFORWARD NETWORKS By DOUGLAS G. JONES A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1999 ACKNOWLEDGE MENTS There are many people who made the completion of this work possible. First and foremost I would like to thank my research professor, Dr. Jose C. Principe. He has taught me the valuable skills of rigor and completeness. Also, his patience with my progress while I juggled both research and a busy career was greatly appreciated. I would like to thank my supervisory committee members Dr. Yunmei Chen, Dr. William W. Edmonson, Dr. John G. Harris, and Dr. Jian Li for their encouragement and direction. Many thanks go to the management within the 46 Test Wing, Eglin AFB, FL, who made it possible for me to attend the University of Florida through the longterm full time training program. This organization fosters an atmosphere that encourages academic achievement, and it has continually provided me with the opportunity to further my education. There are no words to describe the wisdom, guidance, and support I have received from my parents. Their love to this day is endless. Above all I would like to give the highest recognition to my loving wife, Donna, for her sacrificing devotion throughout my education. To my precious children, Lindsey and Mark, you are the sunshine in my life. Lastly, I dedicate this work to my Lord and Savior Jesus Christ. I can do all things through him who gives me strength. TABLE OF CONTENTS page ACKNOW LEDGM ENTS ......................................................................................... .. ii ABSTRACT .................................................................................................................. v CHAPTERS 1 PROBLEM FORM ULATION ................................................................................. 1 1.1 Problem D efinition......................................................................................... 2 1.2 M easurement Objective ......... ............... ........................... ..... ................... 3 1.3 Model Classifications................................................................................... 4 1.4 H historical R eview ............................................................................................... 5 1.5 Modeling Methodology................................................................................ 7 1.6 Research Contribution.................................................................................. 8 2 SYSTEM ASSUM PTIONS...................................................................... .... ........ 10 2.1 Linear, TimeInvariant, Causal Systems............................................................ 10 2.2 R ealizability ................................................................................................. 14 2.3 L2 Signals and Systems ...................................................................................... 14 2.4 System Synthesis.......................................................................................... 15 2.5 Continuous/DiscreteTime Transformation................................... ............ 16 2.6 E xam ples...................... ................................. .............................................. 18 2.7 Conclusion (System Assumptions)............................. ............ ............... 21 3 MODEL FORMULATION....................................................................................... 22 3.1 General Model Form M(p)......................................................................... 22 3.2 D iscreteTim e M odels................................................................................... 23 3.2.1 FIR M odel............................................................................................... .. 23 3.2.2 R M odel............................................................................................... 24 3.2.3 G LFF M odel ........................................................................................... 24 3.2.4 FIR, IIR, GLFF Model Block Diagrams ................................................. 25 3.3 GLFF M odel Kernels ............ .... ................... ................................................ 27 3.3.1 Popular DiscreteTime GLFF Kernels.................................................... 27 3.3.2 Popular ContinuousTime GLFF Kernels............................................... 27 3.4 GLFF Kernel Derivations and Properties..................................... ............ 30 3.4.1 ContinuousTime Laguerre Kernel.............................................................. 30 3.4.2 ContinuousTime Kautz Kernel (Complex Time Scale Vector) ...............30 3.4.3 ContinuousTime Kautz Kernel (Real Time Scale Vector).........................31 3.4.4 ContinuousTime Legendre and Jacobi Kernels..................................... 32 3.5 Impulse and Step Response Modeling.......................................................... 32 iii 3.6 Conclusion (Model Formulation)............................................................... 33 4 OPTIMAL PARAMETER SELECTION............................................................ 35 4.1 M odel D im ension.......................................................................................... 35 4.1.1 C om pleteness ............................................................................... .......... 36 4.1.2 O rthogonal Projection............................................................................... 37 4.1.3 "O ptim al" Selection of N ............................................................................ 38 4.2 W eight V sector ............................................................................................ ...... 39 4.2.1 Optimal Selection of w ........................................................................ 39 4.2.2 Relationship to WienerHopf Equations...................................................41 4.2.3 Minimum Squared Error Performance Function ......................... .............. 42 4.2.4 Computing w Using Step Response Data..................................................43 4.2.5 Adaptation of w .......... ............................................................................46 4.3 T im e Scale V ector............................................................................................ 46 4.3.1 Optimal Selection of A (Historical Review).................................... ..48 4.3.2 Optimal Selection of A (Complex Error Analysis).................................... 56 4.3.3 Further Analysis of the Complex Performance Function ....................... 66 4.3.4 Time Domain Synthesis Using GLFF Networks....................................... 76 4.3.5 Conclusion of the Complex Performance Function.................................. 78 4.4 Conclusion (Optimal Parameter Selection)................... ........... .......... .... 80 5 A PPLIC ATIO N S ........................................... ......................................................... 81 5.1 Examples of GLFF Networks Using Simulated Systems ................................83 5.2 Laboratory Testing of Guided Weapon Systems ............................................. 89 5.2.1 O objectives .............................................................................................. 91 5.2.2 Functional Breakdown of a Missile System ........................................... 92 5.2.3 HITL ClosedLoop Missile Simulations....................................................94 5.3 Modeling the Step Response of a Carco 5Axis FMS ....................................95 5.3.1 DiscreteTime GLFF Models of the FMS .................................................98 5.3.2 Step Response Modeling Procedure ........................................................ 100 5.3.3 Results (FMS Step Response Modeling)................................................ 103 5.3.4 ContinuousTime GLFF Models of the FMS .......................................... 118 5.3.5 Conclusion (Step Reponse Modeling)..................................................... 123 5.4 O their A applications ........................................................................................... 123 5.5 Conclusion (Applications) ............................................................................... 124 APPENDICES A: ORTHOGONAL SIGNAL SETS ......................................................................... 125 B: HILBERT SPACE CONCEPTS ........ ................................................................. 126 C: SDOMAIN DERIVATION OF MINIMUM J................................................... 128 D: SQUARED ERROR STATIONARITY USING THE LAGUERRE MODEL .... 130 R EFER E N C E S ........................................................................................................ 132 BIOGRAPHICAL SKETCH............................................................................... 140 Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy SYSTEM MODELING USING GENERALIZED LINEAR FEEDFORWARD NETWORKS By Douglas G. Jones August 1999 Chairman: Jose C. Principe Major Department: Electrical and Computer Engineering The focus of this work is the derivation and development of a class of models called Generalized Linear FeedForward (GLFF) net%%orks. Included is a description of the a priori assumptions concerning the systems under study, a development and formulation of the models (networks) to be used. and a definition of a measurement criteria used to judge goodnessoffit. The models are constructed from a weighted sum of parametrically dependent kernels. These kernels represent basis vectors that span a vector space with dimension equal to the order of the network realization. Optimization of the model parameters is discussed and a new approach is given; one that extends the conventional squared error analysis to the complex domain. After presentation of GLFF model theory and parametric optimization methodologies, network realizations are constructed to address the system modeling problem. GLFF networks can be formulated to model discretetime, continuous time, and sampleddata systems. In this study, it is shown how a flight motion simulator, used in the laboratory test and evaluation of guided weapon systems, can be accurately modeled using GLFF networks comprised of Gamma, Laguerre, and Legendre kernels. Advantages of GLFF networks are their ability to accurately represent linear time invariant systems coupled with the simplicity of the procedures employed to optimally determine their parameters. The network structures have efficient and practical implementations and applications are offered as proof. CHAPTER 1 PROBLEM FORMULATION In this chapter an overview of the entire content of the research is gi\en. beginning with a definition of the problem to be solved; namely, modeling the impulse and step responses of causal, linear, timeinvariant systems. The system/model configuration is illustrated and the primary measurement objective stated. The model architecture is briefly described and prefaced with a review of the major contributors leading to the proposed formulation. Finally, the contribution of this research is summarized. It will be developed throughout the remaining chapters. It will be shown that the models developed arc capable of approximating systems with real and/or complex conjugate pair poles, even those \which have slowly decaying impulse responses (large time constants) and/or oscillatory modes (light damping). This is difficult to achieve with finite impulse response (FIR) structures. The models studied are considered parametric since they are functions of parameter vectors that must be determined. These parameter vectors are fairly easy to optimize and unlike infinite impulse response (IIR) structures stability can always be guaranteed. The networks under consideration in this research are comprised of linear combinations of special basis filctions. These basis functions are not arbitrarily chosen but are derived from assumptions of the systems being modeled. The 2 networks therefore become good approximates since they obtain the natural information of the unknown system. This information resides in a parameter vector built into the model structure. The parameters are optimally chosen so that the models approximate the system in a least squares sense. Hence, the proposed theory is an implementation of parametric least squares error modeling [Cad90]. The modeling methodology can also be described using geometric concepts. Considering signals as vectors in vector spaces, the models are linear combinations of parametric basis vectors. They span a subspace of the space spanned by the unknown system. The model output is a projection of the system vector onto the model subspace. This space has an orientation that is parametrically established. The error is a vector that lies outside the model subspace. Projecting orthogonally the system onto the model space minimizes the error. 1.1 Problem Definition The objective is to approximate a causal, linear timeinvariant (LTI) system, H, using a predefined model, M(p), such that a measure of the quality of the approximation is minimized. Models developed in this study will be called generalized linear feedforward (GLFF) networks and will be defined in Chapter 3. The basic system modeling configuration is shown in Figure 11 below. Arguments are not included with the functions {x, d, y, e, H} to illustrate that the formulation will apply in both a continuoustime setting {x(t), d(t), y(t), e(t), H(s)} and a sampleddata or discretetime setting {x(k), d(k), y(k), e(k), H(z)}. Quite often in this thesis we switch backandforth between continuoustime and discretetime concepts. This is intentionally done to prove the broad applicability of this work. Most of the theory however will be derived in continuoustime. It would have been just as appropriate to develop the concepts in discretetime. d H x e M(p) Figure 11: System modeling configuration. 1.2 Measurement Objecti\e How is it determined whether a model has achieved a good representation of the system being approximated? For this a measurement of the error or a test for goodnessoffit of the model is needed. In this thesis the integrated squared error criteria will be employed; namely, for a given model M(p), the parameter set p is found such that the energy in the error signal e is minimized. This is one of the simplest error criteria to work with since the measurement is basically quadratic in form and its derivatives are linear in certain elements of the parameter set. So the objective is to minimize J = e(t)dt= (d(t) y(t)) dt (continuoustime) or (1.1) J = e2(k) = J(d(k) y(k))2 (discretetime) k=O k=O 1.3 Model Classifications To provide motivation for the GLFF networks (a complete formulation will be given in Chapter 3), consider the discretetime setting. In the past discretetime system models have been largely divided into two classes: FIR and IIR. A new class of models (filters) known as the generalized feedforward (GFF) filters was popularized with the development of the Gamma filter [Pri93]. This filter has a structure similar to the tapdelay line (FIR filter) with the delays replaced by a "more general" transfer function known as the Gamma kernel. Extending this development can be accomplished by considering other popular kernels, most of which are orthogonal, such as the Laguerre, Legendre, Jacobi, and Kautz [Lee67], [Arm59], [Kau54]. The extended formulation with a more comprehensive set of kernels is called the GLFF model class because its members are used in the context of modeling linear systems and they obey the superposition principle. How does the GLFF model class compare to FIR and IIR models? Figure 12 relates these three classes based upon important issues that are usually considered when deciding which modeling tool to pull from the signal processing toolbox. GLFF models can be said to fall somewhere between FIR and IIR models in capability, complexity, stability and computability. FIR GLFF IIR low complexity high low * { } * high l capability J f stability 1 trivial < stability difficult FcomputabilityI Figure 12: FIR/GLFF/IIR comparison. This thesis exploits the strengths of GLFF models. They offer an alternative to the two extremes in the modeling process. Their advantages are that they have a reduced complexity compared to IIR models while offering greater capability over FIR models. Unlike IIR modeling techniques, stability can be easily guaranteed and their optimal or nearoptimal solutions can be computed with much less difficulty. The most difficult task is in selection of the time scale r sector A. This is one element of the parameter set p. The complete parameter set will be defined in Chapter 3. Optimization of its elements comprises all of Chapter 4. 1.4 Historical Review Much of the theory on which the GLFF model class is founded began with the work of Lee [Lee33], [Lee67]. Lee showed how to use orthogonal functions to synthesize transient networks and how this concept is employed to build electrical devices that can realize physical systems. Another major contributor is Kautz [Kau54]. He provided a convenient way to develop the time domain response of a system using a most general orthogonal signal set. Young and Huggins extended Kautz' procedure to discretetime [You62a] as did Broome [Bro65]. Nurges and Yaaksoo [Nur81] formulated state space equations that represent discretetime linear systems using the Laguerre model. Roberts and Mullis provide a generalized state variable formulation of orthogonal allpass network structures [Rob87]. These structures keep data paths local. The fundamental processing elements are alike except perhaps for local scaling parameters. This simplifies the design and allows for distributing the computations in a connected array. These orthogonal structures also have the advantage that higher order models can be constructed by simply adding new blocks or terms without effecting or forcing a recalculation of the existing lower order terms. The concept of a generalized feedforward filter was proposed by Principe [Pri93] with the Gamma model. This idea of creating a more general transversal filter using the earlier work with orthogonal signal sets leads to the modeling methodology here described; namely, the generalized linear feedforward networks. What is the basic objective? Given a desired impulse response, h(t), of a causal LTI system along with a prescribed tolerance on the allowable error, find an approximating impulse response, h(t), that is simple in structure and easy to compute. Let the system and model transfer functions be H(s) and H(s), respectively. Construct H(s) with the form m (s) (S ,) ( A, H(s) A '= (1.2) x(s) (s ) = ( p) i=1 Assuming no coincident poles the impulse response is n h(t) = '(H(s)) = A eP (1.3) i1 where Re(pi) < 0, V i=1, ..., n. It is desired that h(t) and h(t) be as close as possible under prescribed tolerances. There are two wellknown approaches to performing this approximation. The first is to transform h(t) into H(s) and then approximate H(s) in the sdomain by using a ratio of polynomials. This approach is generally less difficult and several techniques for accomplishing this approximation have been presented in the literature. These include Prony's method, Pad6 approximants, continuedfraction expansions, and even analog filter design [Su71], [Bak81], [Opp89]. The problem with this approach is that the error in the time domain is difficult to control. The second approach is to approximate h(t) with a finite sum of exponentially damped sinusoids and damped exponentials (as described above) and then transform the approximation to find H(s). Here the modeling error may be controlled directly in the time domain. One way of accomplishing this is to implement the procedure described by Kautz [Kau54] which involves approximating h(t) using a weighted sum of special basis functions. These basis functions are themselves weighted sums of exponentials carefully constructed so that they are orthonormal. This provides for great simplification when searching for the optimal parameter set p and gives direct control over the associated error. 1.5 Modeling Methodology In this thesis the method used to implement GLFF network approximations of linear systems is based on the work done by Kautz. This is carried out by considering a construction of h(t) using an infinite sum of weighted time domain basis functions {(,,,(t)} 0". Since the infinite sum must be truncated to a finite number of terms an approximation results N h(t) =h (t)= w,4,,(tA ) (1.4) n=0 The weights w,,, w, ... ,WN and basis functions {0,(t,)}I are chosen with the following goals in mind: 1. For a fixed number of terms, N, make the convergence as rapid as possible. This is controlled using a time scale vector A. 2. Make the optimal weights independent of the number of taps, N+1. This is accomplished by using orthogonal basis functions. 3. For a fixed number of taps, N+1, minimize the energy in the error signal. This requires determining optimal values for all of the elements of the parameter set p in such a way that J (the integrated squared error) is minimized. The above procedure is applicable when modeling the impulse response of a desired system. It may also be of interest to approximate the step response. In fact the applications given in this thesis involve finding system models using step response data. If the network input is a step function, this method is employed by approximating not the desired system output but its derivative or by approximating the desired output and taking the derivative of the result. This will give an analytic expression for the impulse response. When only a numerical or tabulated form of the impulse response is desired, use the step response data to generate a system model. Once the model is established, use an impulse (or white noise) input to generate the impulse response data. 1.6 Research Contribution This study extends the above concepts and contributes to the system modeling theory in several ways. First, a unified system modeling framework centered around the GLFF model class is provided. As will be seen in Chapter 3 this formulation not only includes models that employ orthogonal signal sets like the Laguerre and Kautz but non orthogonal ones as well, such as the Gamma. In addition to these generalized models, the basic framework can also be used to represent FIR and IIR structures. Optimal parameter selection is a big part of the modeling procedure. The most difficult task is locating or approximating the optimal time scale vector. A new approach that extends traditional squared error analysis to the complex domain is developed. It will be shown how the information in this new space can be used to locate the optimal time scale parameters. Finally an application of the theory is given by modeling the step response of a Carco 5axis flight motion simulator (FMS). This system is used in hardwareinthe loop (HITL) testing of guided weapon systems. GLFF modeling of other systems and subsystems used in HITL testing is also considered. It is demonstrated how this class of models can be employed to accurately represent the dynamics of physical systems and add value to the task of weapon system test and evaluation. CHAPTER 2 SYSTEM ASSUMPTIONS In the previous chapter, a brief discussion was given concerning the assumed systems to be modeled. A more complete analysis of all the system assumptions is now offered. Mathematical and physical arguments are used to validate their necessity. The basic assumptions are that input/output signals have finite energy and systems are stable, causal, linear, timeinvariant, and realizable. After stating the assumptions, the technique employed to synthesis these systems by simulation is given. Higher order systems will be simulated by cascading together 1st and 2nd order subsystems. This is done in a continuoustime framework. The transformation used when a digital system is desired is also discussed. Finally some simulation examples of the kinds of real systems to be modeled are offered. 2.1 Linear, TimeInvariant, Causal Systems There are many ways to classify systems; linear vs. nonlinear, timeinvariant vs. time varying, causal vs. noncausal, stochastic vs. deterministic, continuous vs. discrete, etc. One important step in the study and analysis of systems theory is to derive a structural or mathematical description of processes and systems. This assumed structure is essential if plausible analytical models are sought to approximate them. One could argue that if we had a mathematical description of the process why would we need additional models. The primary reason for the model is to approximate input/output behavior of the process. The assumed mathematical description of the process could be used to give very specific information about its internal composition (time constants, damping ratios, etc.); however, a component level specification may be unavailable. In fact even if these system parameters are known they may not be operating within their stated specification. In this research an understanding of the internal composition is not the objective, only a model of the input/output behavior when viewing the system as a black box. The theory presented in this section does however give the foundation used to formulate and validate this new class of models and serves to illustrate the derivation of these concepts. Many systems studied in engineering can be well represented by mathematical relations that describe their physical makeup or behavior. Quite often systems are very complex and a realistic characterization may require nonlinear and/or time varying equations that are difficult to solve. For simplicity, practicality, and in order to establish an applicable set of tools which can be used in the design and analysis of systems, approximations and assumptions are made about their physical nature. This is done primarily to allow for the use of linear systems theory. This reasoning is often justified in one of two ways: 1. The system is assumed to be linear by nature or it at least operates in a linear region of its capabilities so that the linearity conditions are satisfied. 2. The system is assumed to be nonlinear by nature and in order to apply linear systems theory we conceptualize a linearization around a nominal operating point. With this approach precautions must be taken to ensure the domain of the linear assumption is not exceeded. Why is the use of linear systems theory desirable? In addition to the availability of a wellstudied set of tools, most often a description of a system via the impulse response and transfer function is desired. These functional system descriptions are established by employing the theory of linear systems [DeC89]. Consider a linear timeinvariant (LTI) system having input x(t) and output d(t). The impulse response, h(t), is defined as the output when the input is a unit impulse function 8(t). The transfer function, H(s), is defined as the Laplace transform of h(t) with zero initial conditions. This is equivalent to assuming h(t) = 0 for t < 0 (causal). The above definition of the transfer function is in terms of the impulse response. However a description of the transfer function taking a different, more mathematical approach is also available. In practice, a large class of signals and systems in engineering can be formulated through the use of differential equations. The LTI, causal assumptions equivalently translate into the requirement that systems be representable by linear, constant coefficient, ordinary differential equations. Namely, an /th order system can be written as d (t) + an_d "l)(t) + + ad(' (t) + a0d(t) = (2.1) b,,x(m(t)+ b,_lx('(t)+. .+ bx("(t) + box(t) where ao, a,, ..., a,_, bo, b\, ..., b,, are all real constants with n 2 m. Systems described by ordinary differential equations of this form are said to contain lumped parameters rather than distributed parameters [Kuo87]. Partial differential equations such as the elliptic equations, hyperbolic equations, and parabolic equations describe phenomena such as fluid flow, wave propagation, and heat conduction all of which are considered to be distributed parameter systems [Pow79]. To obtain the transfer function of an LTI system described by the differential equation above assume zero initial conditions (causality) and take the Laplace transform H(s) = D(s) b,,s" + b,, s' +b s+ (2b2) X(s) s" + a,_,s"'+ +ags + ao Re%\ writing this in factored form gives (71sz) H(s)= A ' (2.3) S(s p,) i=1 and in partial fraction form (assuming no repeated poles) H(s) = I (2.4) I=, (S p,) To obtain the impulse response, take the inverse Laplace transform. Assuming no coincident poles this yields h(t)= AI'l"t (2.5) So the LTI, causal assumption leads to the following result: LTI causal systems can be represented by constant coefficient ordinary differential equations whose solutions (impulse responses) are comprised of a sum of exponential functions that are real (when pi is real) and/or complex (when Pi is complex). If some of the poles are repeated then there are terms with multiplicative factors t, t2, t3, ..., tk1 for a kth order pole. This representation is the motivation for the modeling structure considered in this thesis. 2.2 Realizability It was shown that the LTI, causal system assumption leads to a transfer function representation of the form H(s)= A '=' (2.6) H(s p,) i=1 An answer is now given to the following question "What are the constraints on this form that ensures it is a description of a physically realizable system?" The realizability requirement of H(s) is satisfied with the following constraints: 1. The constant A must be a real. 2. The number of poles, n, must be greater than the number of zeros, m. 3. Complex poles and zeros must occur in conjugate form. That is if pi is complex then there must be a pj, for some j, such that pj = pi where denotes complex conjugation. When H(s) is realizable, it is easy to show that the impulse response is represented by a weighted sum of exponentials and exponentially decaying sinusoids. 2.3 L2 Signals and Systems The final constraint imposed is that the signals must belong to the vector space L2(R), the squareintegrable functions. Here integrable is meant in the Lebesgue sense [Deb90]. Assume the signal xe L2(R) and x: R R. then x2 (t)dt < (2.7) Hence it is said that x has finite energy. Let h(t) be the impulse response of a system. Since h(t) = 0 for t < 0, and he L2((R) then Jh'(t)dt In this case it is said that h is a stable system. So by operating in L (R) a restriction is imposed to work only with finite energy signals and stable systems. 2.4 System Synthesis Recall the system realizability requirement results in an impulse response representation that is a weighted sum of exponentials and exponentially decaying sinusoids. From the transfer function perspective then a system can be thought of as consisting of a cascade of 1st and 2nd order subsystems. The most general forms of these subsystem structures are D, ( s) b Hi (s) = D =(s) b__ X(s) s+a (2.9) D, (s) b~s + bo Hz(s) = = X(s) s +a,,s +a2o resulting in differential equations d (t)+ ald,1() = box(t) (2.10) i, (t) + a,,d(t) + ad(t) = b,,x(t) + bzx(t) During this thesis, examples will be given whereby simulated systems are created. For such examples, the synthesized systems will consist of these 1st and 2nd order subsystem blocks. Most often slightly less general forms will be used to keep the synthesis more in line with engineering (rather than mathematical) concepts. These special forms convey the engineering information of usual interest: lime constants, damping ratios, and undamped natural frequencies. These subsystems have the following transfer functions H,(s)D, (s) 1 =_, X(s) rs+l s+Y 2 (2.11) H, (s) co, X(s) s2 + 2 ws + Wo and corresponding differential equations rd (t)+ d,(t)= x(t) (2.12) d,(t) + 2 (),d2(t) + A d(t)= =ox(t) where r = time constant S= damping ratio t, = undamped natural frequency 2.5 Continuous/DiscreteTime Transformation The goal of this thesis is to define a model class useful for approximating discrete time, continuoustime, and sampleddata systems. The above system formulation was carried out in the continuoustime domain. The same principles however could have been derived in the discretetime domain. All the concepts discussed have equivalent discretetime representations. Sometimes a continuoustime system in analytic form is given when a discretetime representation is desired. This can be resolved by sampling. In this case, the discretetime system is called a sampleddata system. Hence a sampling time, T, will necessarily be specified. Having a continuoustime transfer function, there are several methods used to map to the discretetime domain. The most popular are the bilinear, impulse or step invariant, and matchedZ transform [Str88]. There are tradeoffs among these mappings and some work well where others do not. The most widely used is the bilinear transform. The transformation (or mapping) is carried out as follows H(z)=H(SIs 2z1 (2.13) T z+l where T is the sampling interval. This is also known as Tustin's method in control systems circles and is derived by considering the problem of numerical integration of differential equations [Fra90]. A derivation of the bilinear transform is as follows. Beginning with the transfer function H(s), determine its differential equation representation. A difference equation is then derived whose solution matches the solution of the differential equation at the sample points. The specific transformation depends on the method with which the differential equation solution is computed. Consider the 1s order system D,(s) _ H,(s) = () (2.14) X (s) s + or associated differential equation d,(t)+ d,(t)= x(t) (2.15) The solution, d\(t), can be written in integral form as I d,(t)= J[d,(u)+x(v)]du (2.16) 0 For the sampling interval of time Tthis can be rewritten as nTT nT d,(n)= d,(nT)= j[d,(v)+x(u)]dv+ j[d,(v)+x(v)]dv 0 Trr (2.17) nT = d(n1)+r I[d,(v)+x(u)]dv nTT The method in which the integral is computed over the region [nTT, nT] determines the discretetime representation, d(n), of d(t) and hence the transformation used to map H(s) to H(z). Two of the most common integration schemes are Euler's rule (backward difference method) and the trapezoidal rule. For these two cases the following mappings result. Table 21: Continuoustime to discretetime mappings. Integration Rule Mapping H(s) I4 H(z) Euler's 1 fzl s=\  T z Trapezoidal 2(z1) T _z+l) So the bilinear transform is derived by computing the integral over [nTT, nT] using the trapezoidal rule. In Steiglitz's work [Ste65] a more theoretical treatment and significance of the bilinear transform is developed. In fact, it is shown that the bilinear transform is an isomorphism connecting the continuoustime vector space L2(R) and the discretetime (or digital) vector space of doubleended sequences of complex numbers {x,, }_ that are square summable. This connection proves that the signal processing theories with respect to LTI causal signals and systems are identical in both the continuoustime domain and discretetime domain. In this thesis, when a continuoustodiscrete transformation is performed the bilinear transform will always be used. 2.6 Examples Several examples of the 1st and 2nd order systems discussed in Chapter 2.4 are shown on the next two pages in Figures 21 to 24. These figures illustrate the impulse and step response behavior that must be approximated by the class of models proposed in this thesis. 19 2 18 16 1.4 S0T= 5 1.2 T1 1 = 0.8 06 r=2 04 02 0 0 0.5 1 1.5 2 25 3 3.5 4 45 5 t (sec) Figure 21: 1st order impulse responses, r0.5, 1, 2, 3. 0.9 T=05 r= 1 0.8 0.7 T =2 '=3 0.6 o 0.5 04 0.3 02 0.1 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 t (sec) Figure 22: lst order step responses, r=0.5, 1, 2, 3. 0 1 2 3 4 5 6 7 8 9 10 t (sec) Figure 23: 2nd order impulse responses with at=l. a) =0.4, under damped; b) &1.0, critically damped; c) 4=1.5, over damped.  0.6 0.4 0.2 1 2 3 4 5 t (sec) 6 7 8 9 10 Figure 24: 2nd order step responses with a=l1. a) =0.4, under damped; b) =1.0, critically damped; c) =1.5, over damped. 0.2 0 2.7 Conclusion (System Assumptions) In this chapter the assumptions made about the systems to be modeled have been stated; causality, linearity, timeinvariance, realizability. This discussion was to provide the mathematical foundation and engineering principles that motivate the development of the system models to be introduced. For the purposes of simulating real systems, 1st and 2nd order subsystems are used as the basic building blocks. These subsystems can be cascaded together to create high order systems. Methods were discussed that allow for transforming continuoustime system descriptions, H(s), into discretetime descriptions, H(z). A good model is one with a structure that can capture the dynamics of a system (time constants, damping coefficients, natural frequencies) without explicit knowledge of its parameters. For systems with large time constants (slowly decaying impulse response) and/or small damping coefficients oscillatoryy modes) a good approximation is more difficult to achieve. The models presented in this thesis are good approximates of systems even with these characteristics. CHAPTER 3 MODEL FORMULATION In this chapter our first major contribution to the problem of system modeling is presented; namely, a comprehensive framework given by the GLFF model class. Although many of the components have been around for quite some time it is the unifying approach taken here that offers new insight. This new formulation encompasses not only GLFF networks but FIR and IIR models (for the discretetime case) as well. 3.1 General Model Form M(p) Recall the general system modeling configuration, shown in Figure 31 below where H is the system to be approximated using the model M(p). The specific form of the model M(p) is now discussed. d H + M(p) Figure 31: System modeling configuration. p is defined as a parameter set. Its elements are vectors called the parameter vectors of the model M(p). The parameter set contains three elements (vectors), p= {N,w,2 } where (3.1) N dimension vector w weight vector A time scale vector The model can be written in the form M(p)= M({N,w,A )= f(N,w,H,(.,2)) (3.2) where H,(*, A) is called the model kernel at stage n, 0 < n < Ni, and Ni is a component of the vector N. So M(p) is a function, f, of the parameter vectors and the model kernel. The kernel argument is used to imply that the domain of operation (s or z) is universal. When solving specific problems the notation H,(s,A) is used in the continuoustime case and H,(z,2) in the discretetime case. 3.2 DiscreteTime Models Consider the definitions above in the discretetime domain. This unifying representation can be used to illustrate many specific model structures. In this section it is shown how to describe the FIR, IIR, and GLFF model classes using this formulation. 3.2.1 FIR Model In the case of the FIR models N M(p)= M({N,w, })= w,,H,(z,A) (3.3) n=O where 2=0 n =nH,(z) i=0 The functions H,(z) are given by i=O i>O 3.2.2 IIR Model In the case of the IIR models M(p)= M({N,w,A ) N, b,, H,, (z,t ) n=0 N1 l=aH,(z,A) n=l where N = (N1,N2)T w= al,az,...,al ,bob,,..., b 2=0 H(Z,)= H,(z)= H (z) i=0 Just as in the FIR case the functions H,(z) are given by H(z) I ZI i=0 i>0 3.2.3 GLFF Model In the case of the GLFF models M(p)= M({N,w,A])= wnH,,(z,A) n=0 where N=(N,) (3.4) (3.5) (3.6) (3.7) (3.8) (3.9) H, (z, A) = H, (z) H,(z)= { , z1 N =(N,) w=(Wo Wi1...I^ WN)T 2 (2...L)T (3.10) H,(z, )= Hn2(z,' )l A(zA) i=0 The functions H,,(z,A) and H,,(z,A) could be any transfer function. In this thesis a restriction is made to consider special cases that 1. are founded on the system model assumptions described in Chapter 2. 2. have kernels that are (typically) orthogonal. 3. have kernels that are cascaded lowpass and allpass subs) stems. 4. will allow for modeling the unknown system with real and/or complex poles via the time scale vector A. 5. have localized feedback. 6. guarantee stability. 3.2.4 FIR, IR, GLFF Model Block Diagrams To compare the structures of these models examine their block diagrams in Figures 32, 33, and 34. As can be seen from the FIR model, its lack of any feedback component limits its applicability. The IR model offers the greatest capability but at a significant cost and loss of stability control. The GLFF model offers a practical alternative that rivals the IIR model in performance while maintaining simple computational procedures similar to the FIR case. x(k) x(k) x(k)  Figure 32: FIR model block diagram. y(k) Figure 33: IIR model block diagram. y(k) Figure 34: GLFF model block diagram. y(k) 3.3 GLFF Model Kernels The GLFF kernels can be comprised of almost any transfer function. Some of the most popular discretetime and continuoustime kernels are given in Tables 31 and 3 3, respectively. These tables include each kernel's common name, symbol, and n'h tap transfer function, H,,(., A). Most GLFF networks are derived from orthogonal signal sets (see Appendix A). This orthogonality of a kernel is also indicated in the tables. An important part of a GLFF network is the time scale parameter vector 2. It allows for local feedback and is responsible for the popularity and usefulness of this entire class of models. The dimension of the time scale vector, dim(A), determines the number of degreesoffreedom a kernel offers. dim(A) is also given in the tables. When fitting a model to an unknown system, the parameter set must be optimally selected to achieve the minimum integrated squared error. 3.3.1 Popular DiscreteTime GLFF Kernels The most popular discretetime GLFF kernels are the delay, Gamma, Laguerre, Legendre, and Kautz. They are listed in Table 31, and restrictions on the time scale parameters are given in Table 32. Some of the earliest discussions of these kernels are found in the work by Young [You62a], Perez [Per88], and Principe [Pri93]. 3.3.2 Popular ContinuousTime GLFF Kernels The most popular continuoustime GLFF kernels are the Gamma, Laguerre, Legendre, Jacobi, and Kautz. They are listed in Table 33, and restrictions on the time scale parameters are given in Table 34. Some of first to describe these kernels are Lee [Lee33], Kautz [Kau54], Armstrong Ar[Arm59], and Principe [Pri93]. Table 31: Discretetime GLFF kernels. Table 32: Restrictions on discretetime time scale parameters. Kernel Valid time scale Delay none Gamma A real, 0 < < 2 Laguerre A real, 1A <1 Legendre Ak real, Ak =k, 0 < 1 Kautz A2 complex, Ak <1 Kernel n htap H2 (z,A) Hk,(z,) H,(z,A) ortho dim(2) symbol gonal Delay H,(z) I z' z no 0 Gamma Gn(z,A) I AZ' A1 no 1 ___1(12)z 1(1A)z Laguerre Ln(z,A) _2 z 22 _ I yes 1 1A2z 1Az1 12 1 2z) __ Legendre P,(z,A) 1 z1 1 z A yes >1 12jz 12z 1 1 1'z Kautz K,(z,A) 1 /1 1 n 1 yes >1 1 I Z1 1 Z i= 1A iz Z_ i=O Table 33: Continuoustime GLFF kernels. Table 34: Restrictions on continuoustime time scale parameters. Kernel Valid time scale Gamma A real, A > 0 Laguerre A real, A> 0 Legendre A, real, A, = 2, A > 0 Jacobi 2, real, A, = k + 1 Kautz Ak complex, A, = 2AR + jA/, ARk > 0 Kernel n'h tap H H(s,A) Hk(s,) H,(s,A) ortho dim(A) ___ symbol__ gonal Gamma G,(s,A) 1 A 2 "A no 1 s s+A [s+A _ Laguerre L,(s,A) s A s s A "" yes 1 s+A s+A s+A s+A Legendre P,(s,A) 2j s Ak n2 fSA yes 1 s+A s+k s+ A, =0 S+ Jacobi J,(s,A) s 2 2_ 22 s A, yes >1 s+A_ s++ +k s+ i0 s+ Kautz K,(s,A) 2(sA) (sAkX(s) 2f(sI)( sXs) yes >1 (s+AlXs+A) (s+AkXs+ ) (s+2Xs+x ) s+2+Xs+4) 3.4 GLFF Kernel Derivations and Properties The Laguerre, Legendre, Jacobi, and Kautz GLFF networks have the interesting property that their kernels are orthonormal. Hence, we call these orthonormal generalized linear feedforward (OGLFF) networks. Next these kernels will be derived from first principles. This is done in the continuoustime domain. The derivations will make evident their relationship to causal LTI system. The connection between the network structures and the mathematical description of causal LTI systems is the primary reason they are able to approximate these systems so well. 3.4.1 ContinuousTime Laguerre Kernel Consider the sequence I(At) e" }. Regarding this as the solution to an ordinary differential equation that represents a causal LTI system, the system would consist of one pole with multiplicity n+l. Orthogonalizing this sequence yields a new sequence {l((t, )} given by 1,(tA)=V2 I e,nA> 2>0,2>0, t0 (3.11) Now taking the Laplace transform gives the Laguerre model kernel L,(s,)= I {l(t)} S=_ An(3.12) SA S+A 3.4.2 ContinuousTime Kautz Kernel (Complex Time Scale Vector) Consider the sequence {et '} where the A,'s are distinct complex scalars (i.e., components of the time scale vector 2). Since the poles are complex they must exist in conjugate form in order to satisfy the realizability requirement. Orthogonalizing this sequence yields a new sequence {k,(t,A)} given by k,,(t,A)= Re{A,,e1}, n>0, t> 0, Re{A,}>0, Vi (3.13) i=0 It is now shown how to obtain the coefficients A,i. Taking the Laplace transform of the new sequence gives the Kautz model kernel 2A(s+In) (s (sA,)(sA) n1 K2,,(t,A)=I {k, (t, )}= ( + m 0,1,..., 2 Sr2AR (sIA) (s1(A,)(sAX)' n1 Kz,,, (t, )= L(k,,(t,A )}= s_+ +,), ,yn i m=0,1,..., m+(tA)= km+(tA) (s+A +X)(s+ ) ,U (s+A,)(s+A ) 2 Express these functions in partial fraction form. Grouping each pair of time scale elements (2, ,2 ) gives the kernel expression K,,(t,)= s+2A, + n=2mandn=2m+1 (3.15) Hence the coefficients Ani are twice the residue at the pole A, for the nth tap kernel. 3.4.3 ContinuousTime Kautz Kernel (Real Time Scale Vector) Consider the sequence {e"')} where the A,'s are distinct real scalars (i.e., components of the time scale vector A). Orthogonalizing this sequence results in a new sequence {k,(t, )} given by k,(tA) = Aie', n >0, t > 0,A, > ,Vi (3.16) i=0 It is now shown how to obtain the coefficients Ai. Taking the Laplace transform of the new sequence gives the Kautz model kernel K,(t, )=I k,,(2)}= 22 k( sA =, A+, (3.17) S + n i= S +i i=0 S + Ai where the last expression is simply the partial fraction expansion of the product form of K,(t, 2). Hence A,i is the residue of the pole at A, for the Inth tap kernel. It will be real since the time domain function k,(t, 2) is real. 3.4.4 ContinuousTime Legendre and Jacobi Kernels Consider again the sequence {(e"'} where the An's are distinct real scalars. This is the same sequence from which the Kautz kernel with a real time scale vector is derived. This kernel is described by equation (3.17). Now consider the kernel with a "special" time scale parameter. Namely, if A = (AA,,2.,2,)T with k = 2A, A > 0 then we obtain the Legendre kernel. Likewise, if Ak = k +1 we obtain the Jacobi kernel. 3.5 Impulse and Step Response Modeling The models discussed in this chapter can accurately represent any signal or system belonging to L 2(R). We will be most interested in modeling causal LTI systems using sampled step response data. In this case, a step function is applied to a continuous time system to generate its step response. The step response is sampled to obtain a discretetime representation of the system. Optimal continuoustime and discrete time GLFF models are determined using this data (optimization will be discussed in Chapter 4). For example using an N+I tap Kautz filter the approximating transfer function is N H(z) = H(z)= w,K,(z,A) (3.18) n=0 Suppose the discretetime impulse response is desired. How can it be computed? Three simple methods are now given. 1. Using the discretetime model /H(z), pass the impulse x(k)=J(k) through this model to generate an approximation to h(k). 2. Since x(t) is a unit step then X(s) = 1/s. If g(t) is the step response of the system then g(t) = h(t) x(t) or G(s) = H(s) / s. So H(s) = sG(s) = D(s)G(s) (3.19) or H(z) = D(s)G(s)I, = D(z)G(z) where D(z) = (3.20) 1 T z+1 Using the sampled step response data g(k), pass this through D(z) to generate h(k). dg(t) 3. Since h(t)= d use a numerical approximation to the derivative (backward dt difference perhaps) to obtain h(k). In this case dg(t) g(kT) g((k)T) g(k) g(k 1)21) dt ,kT T T 3.6 Conclusion (Model Formulation) This chapter has dealt with structures of models useful for approximating causal LTI systems. The generic model M(p) was stated. It was shown how this formulation can be used to represent FIR, IIR, and GLFF networks. Several GLFF kernels were derived, and discussion was provided to illustrate their connection to the assumptions of the systems under study. Tables of discretetime and continuoustime GLFF model 34 kernels were included. These kernels are the most popular in the literature and as the derivations of the Laguerre, Kautz, Legendre, and Jacobi revealed they are not theoretical abstractions, but orthogonalized versions of solutions to differential equation representations of several important classes of systems. Now that the model structure M(p) has been defined, optimization of the parameter set, p, becomes the focus. CHAPTER 4 OPTIMAL PARAMETER SELECTION In this chapter the selection of the optimal (or near optimal) parameter set for GLFF networks is discussed. Recall p = {N, w,A } where N is the dimension vector, w is the weight vector, and A is the time scale vector. Optimization for these three vectors is handled separately below. In fact, the ability to separate the parameters and their optimization techniques is another benefit of the GLFF formulation. 4.1 Model Dimension Consider a GLFF model consisting of a set of time domain basis functions {o, (t, )}'o with frequency representation {I,(s,A2)}o The model structure is a weighted sum of these basis functions (kernels). In the sdomain Y(s) M(p) = w,(s,A)= WTc(s,2) (4.1) X(s) ,,=O where w= (WoWI...WN) (4.2) D(s, 2 )= (o0(s, 2>, ,(s,2),..., N (s,2))T In the time domain y(t) = wA(t, )* x(t) = [w O(t,2 )*x(t) (4.3) where 0(t,2) = (o0(t,2 ),1 0(t,2),..., ON(t, ))T (4.4) In this thesis, the theory will focus on modeling the impulse response of a continuoustime system. Given sampled step response data, continuoustime and discretetime GLFF networks will be constructed to approximate the desired system. Hence the input, x(t), will be considered an impulse and the desired signal, d(t), will be the system impulse response. The selection of N is now discussed. N is a scalar for the GLFF model class. Suppose N is a fixed finite positive integer. Then d(t) can be approximated by N d(t) =y(t)= w,O,(t, A) (4.5) n=0 4.1.1 Completeness To ensure that y(t)  d(t) as N  oa the property of completeness is employed. A set of functions {O,(t,2)}) is called complete [Su71] on the space L2([a,b]) if there exists no function d(t)e L2([a,b]) such that b d(t)O(t, A)dt = 0, Vn = 0,1,2,... (4.6) a An equivalent definition is as follows. Given a piecewise continuous function d(t)E L2([a,b]) and an E > 0 there exists an integer N > 0 such that b J(d(t) y(t))2dt < e (4.7) a where y(t) is given by the equation (4.5). All of the GLFF kernel sets are complete on [0,oo). An example of an incomplete set of orthonormal functions {n0(t,2)}) o defined on [0,2;r) is (t,=) = 0,(t)= sin(nt) (4.8) The function d(t)=cos(mt) is nonzero on [0,2r) however it is orthogonal to {0,(t,')} 0on this interval. Namely, 21r Jcos(mt)O,(t,2 )dt = 0, Vn,m (4.9) o 4.1.2 Orthogonal Projection When {\,(t,2)}Nois an orthonormal set then N y(t)= wOn(t,2A) (4.10) n=0 spans an N+1dimensional Hilbert Space, S (see Appendix B). The Orthogonal Projection Theorem states that for every d(t) in a Hilbert space H, it can be decomposed as d(t) = y(t) + e(t) where y(t) S, e(t)e SI. S is the orthogonal complement of S and H = S E S So y(t) can be considered as the orthogonal projection of d(t)E H onto the space S. Namely y(t)= Projs(d(t)) (4.11) and {, (t, )}N is an orthonormal basis of S. n=O 4.1.3 "Optimal" Selection of N Completeness is important because it guarantees that if the approximation of d(t) using N+1 weighted terms of {0((t,2 )}n0 is not within a prescribed tolerance then more terms can be included to eventually reach the tolerance requirement. Increasing N may not improve the approximation of d(t) when a set of incomplete functions is used. Having the completeness property provides the following choices when a GLFF network realization for a given N is inadequate: 1. Increase N until the prescribed tolerance is reached. 2. Select other kernels from the GLFF set and examine the approximation error. For a given N, if none of the GLFF kernels chosen provide satisfactory results then N must be increased. If this is done using several kernels the rate of convergence for each kernel as N is increased could be monitored. Choose from this set the kernel that yields the minimum N needed to achieve the required tolerance. Other options are to either raise the tolerance level or choose a different performance criteria. One suggestion for setting the tolerance is to choose the acceptable error power to be a certain percent (5% perhaps) of the system power. Davila and Chiang [Dav95] suggest another approach to estimating the dimension vector N = (N1, N2) for a general IIR structure. This method examines the eigenvectors of the input/output data covariance matrix. When the model orders are overestimated, zeros will appear in the noise subspace eigenvectors. The number of zeros found can be used to estimate the model orders (dimension vector components). This procedure could be applied to GLFF networks. 4.2 Weight Vector One of the greatest advantages to working with a feedforward model is that the calculation of the optimal weight vector w (in the squared error sense) is simple. This is true even for GLFF structures. Although the realizability requirement for GLFF networks demands the weight vector be real, the optimal solution for w is now derived assuming w, d(t), and {0,(t,2)} N are generally complex. Realizable optimal solutions are obtained by setting the imaginary components equal to zero. 4.2.1 Optimal Selection of w Assume d(t) is the impulse response of the desired system (i.e., d(t) = h(t)) and approximate it with y(t) given by N y(t)= w ,(t) n=0 (4.12) The parameter set p= {N,w,A } is to be chosen so the error e(t) = d(t) y(t) is minimum in the integrated squared error sense. Namely, choose p to minimize If Wn = a, + jbn then J= f d(t) y(t) ldt 0 N = ld(t)ldt dw d'(t) (t, A)dt(4 0 i=0 0 E wf d(t)0*(t,A)dt + w,I2 ji,(t,A)2dt i=0 0 i=0 0 dJ dJ set = 0 and = 0 to find w, that minimizes J. Hence da, db, S= (t),(t, A)t d(t)o0(t, A)t + 2a, f (t,)1dt =0 da, 0 0 0 (4 .13) .14) 40 so jRe[d(t)*(t,A)]dt a= 0o (4.15) j On (t, A)2dt 0 When {(O(t,2 )}n is an orthonormal set, the bottom integral equals unity yielding a, = Re[d(t)O (t,A )]dt (4.16) 0 Likewise b, can be derived Sj d(t)o((t,2 )dt jld(t) (t,A)dt +2b (t, )dt = 0 (4.17) n 0 0 0 so 0 Im[d(t)*,(t, )]dt b. = 0(4.18) Again, if {(,(t,A)}N is orthonormal then equation (4.18) reduces to b, = Im[d(t)o*(t, A )]dt (4.19) 0 Hence, combining the above results gives w, = a + jb = Id(t)o(t, A)dt (4.20) 0 When {(Z,(tr,2 )} =is real then w, is real and can be written as w, = d(t)O,(t, A)dt (4.21) 0 So w= d(t)O(t, A)dt (4.22) 0 is the weight vector that minimizes the energy of e(t). From this point forward only real weight vectors will be considered. 4.2.2 Relationship to WienerHopf Equations It is easy to show that w is in fact the solution to the WienerHopf (or Normal) equations. The performance function can be expanded as follows J= e (t)dt = (d(t) y(t))2dt = d(t) w'(t, A ))dt o o 0 0 0 0 (4.23) = d2(t)dt 2wTjd(t)o,(t,A)dt +wT jO(t,2A) (t,A)dt w 0 0 0) now define the following terms d,= d(t)dt o p jd(t)O(t, )dt (4.24) 0 R O(t, )O(t,A)dt 0 Then the performance function J becomes J = d 2wrp+w Rw (4.25) To minimize this function take the gradient of J with respect to the weight vector w and equate to zero VJ = 2p + 2Rw = 0 (4.26) Rw = p (4.27) These are known as the WienerHopf or Normal equations. Solving for w gives the optimal solution w = Rp (4.28) Now the OGLFF kernels {jO(t,/N)} form an orthonormal set so 0 Hence the weight vector becomes 0 which is the solution derived in the previous section. 4.2.3 Minimum Squared Error Performance Function Now that the optimal weight vector has been derived, the minimum squared error can be computed. Recall J = d, 2wrp + wrRw (4.31) For the optimal weights it was found that (using OGLFF kernels) w = p and R = I. Plugging these into the equation for J yields Jopt = de wTW (4.32) The above optimal weight vector w and minimum squared error Jopt was derived using a fixed time scale A. Notice that w and Jopt are functions of 2. Hence Jopt can be written (4.33) Jopt(A ) = de wT(A)w( A) to illustrate the importance of A on the optimal solution. Appendix C presents a derivation of equation (4.32) in the sdomain. 4.2.4 Computing w Using Step Response Data In Chapter 5 applications of the GLFF networks are given. The systems there are continuoustime systems. The available information of these systems is sampled step response data. GLFF networks will be used to approximate the input/output step response behavior. Methods for computing the optimal weight vector w using sampled step response data are now given. Least Squares Solution. [Hay91] Consider the model dimension N and time scale vector A fixed or previously chosen to be optimal. Discussion of model dimension was given in Chapter 4.1. Discussion of the optimal time scale will be given in Chapter 4.3. Since the information is sampled (discretetime) step response data the following definitions are needed. Let the input, x(t), be the unit step and g(t) be the response of the system under question. If the step response is sampled at interval T then define g(k) = g(kT), k=0,1, ..., K as the desired signal. Define also xi(k) = kth sample at the ith tap of the GLFF network (4.34) y(k) = kth output of the GLFF network In this discretetime case, the matrix R and vector p are given by K R=[r,] where r = (xi,,x) = x(k)xj(k) (4.35) k=0 and SPO K p= : where p, = (g,x,)= g(k)x,(k) (4.36) k=0 \PN) This gives the normal equations Rw = p with solution w = R'p. This is the Wiener Hopf solution. Notice in this case that R will not be the identity matrix. Only when the input is an impulse (or white noise) will R = I. Nevertheless, the WienerHopf solution provides the optimal weight vector (for fixed N and A) regardless of the input. Integral Equation Solution. [Clu91] A more direct evaluation of the weight vector solution given in Chapter 4.2.1 is now considered. Recall w,=fd(t)0,(t,A)dt, n= 0,1,...,N (4.37) 0 is the optimal weight solution assuming an orthonormal basis. For the system modeling problem this is accomplished by applying an impulse (or white noise) into the system. In which case d(t)=h(t), the impulse response. Consider, for example, the Laguerre kernels given by {,(tA)},0={l,(t',)} N0. Some of its properties can be utilized to derive a simplified integral equation solution. Again work with g(k), k=0,1 ,..., K samples of the continuoustime step response g(t). Since the continuoustime impulse response is h(t) = dg(t)/dt, the optimal weights can be written as w, = h(t)l,(t, )dt 0 = l,(t,A)dt (4.38) 0 dt = (g(t)1n(t,A )I j g(t) dt 00 dt where integration by parts has been used. For a stable system g(t) it must have lg(oo) < oo. Also using the property of the Laguerre functions [Par71] i,(t,A2)=Z ei (t"e"2) (4.39) n i t" yields lim l (t,) = 0 (4.40) Hence w =g(t) t dt (4.41) dt 0 t Let T., be the steady state response time of g(t); namely, T,. is the time which for all t > T, the output can be consider constant, g(t) = gs, = constant. In this case the integral can be broken up as T', In(tA) ddgll( dt w. = g(t) dt dt g dt ST (4.42) = g(t) (t, ) dt + g,J (T,I A) 0ldt From the Laguerre function property above it is easy to compute the derivative relationship S A i(t, A) 2A 1(t,A) (4.43) dt i=0 The weight computation becomes n1 I T , w.= 22ZJ g(t),(t, A)dt + A (t)l(t, )dt + gsl (T,I,A) (4.44) i=0 0 0 Up to now the continuoustime step response g(t) has been assumed. To compute w, using the sampled step response g(k) an integral approximation technique must be used such as Euler integration r,, K Sg(t)1,(t,A)dt = A g(k)1,(k, A) (4.45) o k=O where T A =  (integration interval) (4.46) K Finally this gives the integral equation weight solution n1 K K w, = 2/A _g(k)l((k,A)+AAE8g(k)l,(k,2)+ g,sl,(T,) (4.47) i=0 k=0 k=0 4.2.5 Adaptation of w Note that the weights can be computed by solving a set of linear (WienerHopf) equations. This linear relationship among the weights corresponds to a convex performance function with one minimum point. So the weight vector can be easily solved in an adaptive manner using a gradient search technique such as Newton's method or LMS [Wid85]. Unfortunately, this nice relationship does not hold for the time scale vector. It is (in general) nonconvex with multiple stationary points. The number of which depends upon the dimension, N, of the model used. 4.3 Time Scale Vector Optimization of the parameter set elements N and w have been discussed to this point. Their solutions are fairly easily found for GLFF networks and are the only elements of p that are required for FIR and IIR models. As has been previously stated, the power of GLFF networks is in local feedback. This feedback is controlled by the time scale vector. Although IIR structures also have feedback, it can not be separated and optimized independently from the taptotap weights like the GLFF structures. Unfortunately, the elements of the time scale vector are not linearly related; hence, the performance function J is usually nonconvex w.r.t. these parameters. As such, the bulk of the work in finding an optimal p involves finding an optimal 2. This section is devoted to this issue. It begins by examining several methods that currently exist for either determining exactly or approximately the optimal A. Each method usually addresses a particular GLFF kernel, most often the Laguerre. A few papers have considered the nonorthogonal Gamma and a few discuss optimization of the more complex Kautz kernel. Although the exact optimal solution is sought, Kautz has pointed out that nonoptimal solutions (which may be derived in a simplified manner) can still produce excellent results. His comments suggest that one should not search to exhaustion for optimality if it is possible to get close with much less work. After the overview of optimization methods found in the literature, we propose a new technique relying heavily on complex variable theory. This is a method that extends the concepts of squared error analysis to the complex domain. Under certain assumptions, the optimal time scale can be fairly easily determined by simply looking for zero crossings. 4.3.1 Optimal Selection of A (Historical Review) In this section a summary is given of current methods found in the literature to either locate or approximate the optimal time scale A. Each method usually addresses a particular GLFF kernel, most often the Laguerre. A few papers have considered the nonorthogonal Gamma and a few discuss optimization of the more complex Kautz kernel. The methods presented in the literature can be grouped into the following categories according to the employed strategy or desired objective: 1. Approximation via the impulse response. 2. Asymptotic solutions. 3. Achieving prescribed measurement objectives (other than minimum squared error). 4. Matched filter approach. 5. Moment matching. 6. Satisfying necessary conditions for stationarity of the squared error function. 7. Iterative and adaptive search techniques. Approximation via the impulse response. These methods approximate A given a priori knowledge about the impulse response h(t) of the system. The required information is different for each approach. Either h(t) is needed in analytic form or tabular form (samples of h(t) at discrete points, t = tk, k = 0,1,2, ... ). These methods appeal to the necessary requirement for all stable and realizable linear systems; namely, its transfer function H(s) must be representable as a rational function with no righthalf plane poles or multiple jaxis poles. Under this assumption, h(t) can be represented as a sum of exponentials of the type (A + At+...+Atm")eA' (4.48) where Ak = ARk + k, Rk < 0 (4.49) A frequency domain method for approximating A is as follows [Su 71]. Suppose h(t) is known (or approximation to it) and its Laplace transform H(s) is determined. Note that the form of h(t) that is given may not produce a rational H(s). Now find a rational approximation Ha(s). For instance use the expression for H(s) and expand it into a Taylor series about s = 0 to obtain H(s) = ao + as+a2s +* (4.50) Then a rational function can be found to approximate H(s). One way to accomplish this is to use a Pad6 approximant. The poles of this approximation can be used as an initial guess for the poles of H(s). These pole locations could be incrementally adjusted to locally optimize. A time domain approach is to use point matching. The basis of this technique is attributed to Prony [Ham73]. The strategy is to force an approximate impulse response h(t) to match h(t) at evenly spaced points. Assume h(t) is known at points separated by the spacing At. Using the approximation N (t)= Ae"' (4.51) it is possible to create N linear equations (using the known points of h(t)) whose solution gives coefficients ao, a,, ..., aN. of an Nh order characteristic polynomial xN + a_,XNI + aN_2x + + axx + a0 = 0 (4.52) Once the coefficients are known, find the roots of this polynomial x, xz, ..., XN. They are related to the time scale parameters by .i = nx, (4.53) At Another time domain approach involves the use of an orthogonal signal set [Kau54]. Assume h(t) and its first N derivatives are known. Using these functions, generate a set of orthogonal functions {(0(t)}N0 satisfying the relationships o(t)= h(t) 01(t) = h(t)+ b o0o(t) (t = (t) + b, ,(t) + bz20o(t) (4.54) N(t) = hI(N(t)+bN(N N_ l(t)+**+ bNlOl(t)+ bNOo(t) where the b,,n's can be found such that 0 J^.(t)O (tdt= (4.55) Since each 0~(t) is a linear combination of the derivatives (from the 0th to the nth) of h(t) then 0n(t) can be rewritten in terms of h)(t), j = 0,1,...,n1. Replace each h()(t) by A' to get the characteristic equation A + an(nl)/ + an(n_2),2) +" + +a + an0 = 0 (4.56) The roots of these equations give the time scale parameters Ani, i=1,2,...,n for model order n for each n = 1,2,...,N. Asymptotic solutions. Several methods are offered that are based on power series representations of the Laguerre model. Schetzen [Sch70], [Sch71] presents an analytic method for determining the asymptotic optimal time scale LA where 2 = lim AN (4.57) N) This is based on the region of convergence (ROC) of the power series equivalence of the Laguerre model. Justification for A~. in the place of AN is made on the basis that most engineering applications require the use of a large number of terms N in the Laguerre expansion to achieve accurate results. In this case, /L is viewed as a good approximation to the optimal time scale AN. Bruni [Bru64] offers another approach involving optimization around the ROC of a Laguerre model expansion. Assuming h(t) is the impulse response of an LTI lumped parameter system M h(t)= Ae p', p,=a,+jb,,a,>0 (4.58) i=0 decompose a Laguerre model representation of h(t) into M infinite series h(t) = wnt,l(t, 2) = W"l,(t,2 ) (4.59) n=0 n=O i=1 where w,i is the nth coefficient of the expansion of the ith exponential in h(t). With this formulation the Fourier transform of h(t) can be represented as a sum of M infinite complex geometric series. Mi gives the magnitude of the term in the ith infinite series. Mi should be minimized for each i in order to achieve the most rapid convergence of the series. Since M, is a function of A, there is an optimal A (denoted Ai) for each Mi. Choose as the optimal A for the entire model, the Ai that minimizes the largest Mi. Methods similar to this one are also given in [War54] and [Hea55]. Achieving prescribed measurement objectives. In addition to locating A that minimizes the squared error, approaches that minimize other measurement criteria have also been considered. One such method employed when using the Laguerre signal set {l(t,2A)}N0is given in [Par71]. Consider the approximation of d(t) using an N+1 tap Laguerre model N d(t) =y(t)= wl, (t, ) (4.60) n=0 Now define the following quantities ftd2(t)dt td2(t)dt M, o and M2 0 (4.61) Sd2(t)dt d2(t)dt 0 0 M1 is a measure of how rapidly d(t) decays. M2 is a measure of how smoothly d(t) decays. Parks show that the time scale A given by A = M1/M2 minimizes the maximum error over the class of all signals with these measurements. The advantage to this approach is that a complete knowledge of the signal to be approximated is not required. In [Fu93] a similar approach is taken when using the discretetime Laguerre model. In this case, the optimal A is found that minimizes the measurement J =tnw2 (4.62) n=0 This performance measure is chosen because it linearly increases the cost of each additional term used in the Laguerre expansion. This enforces a fast convergence rate. The optimal A is obtained using a discretetime formulation of M1 and M2 above. A relationship for J is derived that depends only on M1, M2, and A. The optimal 2 is then determined to be S2M= 1M2 (4.63) 2M, 1 + 4M ,M M 2M2 Matched filter approach. In [Kuo94] a matched filter bank is used to estimate the optimal 2 for the Gamma model. Recall that {g,(t,, )}N0 is a nonorthogonal GLFF signal set. Let d(t) be approximated by N d(t)= y(t)= w,g,(t, A) (4.64) n=O A matched filter is constructed based on the optimal A condition (g ,(t,2),d(t) = 0 where g ,(t,A) is the component of gN+i(t, 2) that is orthogonal to the previous basis components. A bank of such filters is created where each bank uses a different A. To optimize 2 over the region (0, ,,ax) an Mbank filter structure is created where the mth filter uses the time scale A = A,,x / M. When the desired signal d(t) is applied to the filter bank, a change in sign of the M outputs will occur between filter m and m+1 for a value of A that satisfies the optimality condition. When there are multiple occurrences, the associated A for each case must be used to determine which one achieves the minimum squared error. Moment matching. The method presented in [Cel95] uses a Gamma model to approximate a signal d(t). The Gamma model is conceptualized as a rotating manifold in space. The kernels {g,(trA)} 0 are the basis vectors that define the N+I dimensional space of operation and the time scale parameter A is the feature that allows for rotation. The optimal A is estimated using a moment matching method. This method attempts to estimate 2 by equating a finite number of time moments of y(t) (the output of the Gamma model) to those of d(t). Namely, it is required that mn(y(t))= m,(d(t)), n = ,...,N where mt(y(t)) t"y(t)dt (4.65) 0 Using these functions an N+1 order polynomial in A, p(2), is formulated. The roots of p(A) are estimates of the local minima of the squared error performance function J. The root that minimizes J is chosen as the optimal A. Satisfying necessary conditions for stationarity of the squared error. Some of the most popular methods for optimizing the time scale of the Laguerre model center around the necessary conditions for stationarity of the squared error performance function J. For instance, approximating d(t) with an N+1 tap Laguerre network N d(t) = w,(A)1,(t, A) (4.66) n=0 where the weight vector components wn(A) have been given an argument in 2 to emphasis the dependence of these weights on the time scale. For a fixed 2, the squared error is minimized when the weights are computed as w,(2) = d(t)l,(t,A)dt (4.67) 0 The squared error function reduces to N J= d2(t)dt w,(A) (4.68) 0 n=0 To minimize J (over A) the evaluation of the X must be maximized. This requires that dN 2 NA = N d w,() )2w,(A) w,(A2)= 0 (4.69) dn=O n=O d which yields the interesting relation WN(A)WN+(2A)=0 (4.70) A proof of this assertion is given in Appendix D. Hence the optimal value of 2 is either a root of WN(2) = 0 or WN+il() = 0. Solving for A analytically is a tedious task because WN(A) = 0 is a polynomial in 2 of degree (N+M) if d(t) is a system with M poles. The first proof of this (using mathematical induction) was given by Clowes [Clo65] and was restricted to the case where d(t) had only simple poles. A more concise and less restrictive proof was given in [Kin69]. This paper also derives expressions for the 2nd derivative d2N [ 1 )[(N+ 4() NWN+,(A)wN(A)] for wN(A)= 0 dt2w( E W (4.71) dt n=0 [N (N+2)w2(A)w,(A)(N+l)()] for wN+,(a)=0 These equations may be used to determine whether a stationary point is a maximum or minimum. In [Mas90] a discretetime formulation of the stationary conditions is formulated and the authors use numerical techniques to find the roots of the polynomials WN(A) = 0 and WN+I(A) = 0. [Sil93] derives these conditions for the Gamma model. The derivation given by Clowes was valid only for systems with rational transfer functions. [Wan94b] extends the results to approximating any L2(R) stable linear system. [Sil95] describes another efficient way to compute the dA weights WN(A) and WN+I(A) via Taylor series and Pad6 approximants. This procedure is illustrated in both continuoustime and discretetime when approximating a system excited by an arbitrary input. Iterative and adaptive search techniques. When representing a system with a discretetime GLFF network, the elements of the time scale vector are restricted to bounded regions to guarantee stability. In particular, the discretetime Laguerre model requires 2e (1,1). One method of optimization is to employ exhaustive search over this interval, seeking the 2 that minimizes the squared error performance function. Since this would be done in discrete steps, the only way to get continually closer to the optimal A is to increase the number of points in the search region. [Kin77] and [Mai85] discuss minimizing the squared error by employing a Fibonacci search technique over 2. This is an interval elimination procedure wherein the region in which the minimum lies is sequentially reduced using a predetermined number of steps. [Nur87] uses an adaptive technique whereby the optimal A is chosen to be the center of gravity of the presumed poles of the system. The method works as follows. Initialize 2, say Ao. Using the Laguerre model, with this 2, the parameters of a difference equation representation of the system are estimated. From the characteristic equation determine the poles of the estimated system. Choose as Ai, i=1,2,... the center of gravity of these poles. Repeat this process until A2+ ,i < where Sis a small number. Once the center of gravity converges to a fixed point stop the adaptation. 4.3.2 Optimal Selection of A (Complex Error Analysis) A new approach to estimate the optimal time scale is now offered. The methodology used here is one of extending the squared error performance function to the complex domain. This is accomplished by defining a new complex performance function. It will be shown how this function is derived, how it relates to the conventional squared error function, and what additional information it offers. The optimal time scale can be determined by looking for zero crossings of the imaginary component of the new complex performance function. For systems consisting of a cascade of 2nd order subsystems containing pairs of complex conjugate poles, examples will be presented that illustrate how the imaginary component of the new complex performance function can be used to obtain the optimal (in the minimum squared error sense) time scale parameter. These examples will solve the system identification problem with a GLFF model containing the Laguerre kernel. 4.3.2.1 Derivation of the complex performance function. In Appendix C an s domain description of the squared error performance function is discussed. It is given by J = E(s)E(s)ds (4.72) C where C is a contour taken in the CCW direction enclosing the entire lefthand plane (LHP) in s, E(s)= D(s) WT (s,), and w is a real weight vector. For a fixed time scale A the optimal weight vector is w= fD(s)(s, A)ds (4.73) C yielding the performance function J = de w'w (4.74) where d, D(s)D(s)ds (4.75) C Now consider restricting evaluation of the performance function to the upper LHP only. Namely, create the complex performance function J , E(s)E(s)ds (4.76) c. Make the following definitions: due, = D(s)D(s)ds (complex energy) c, (4.77) wu = D(s)4(s, A)ds (complex weight vector) The contour Cu is taken in the CCW direction enclosing the upper LHP in s. Now if F(s) is a function which has real and/or complex conjugate pair poles then f F(s)ds= 2Re F(s)ds (4.78) c c, hence w = 2 Re[wf] w 2 "J(4.79) d, =2Re[d,,] By making the above definitions, the problem of locating the optimal A using Ju is studied. By operating in the complex space there is information that can be used to offer additional insight into the optimal solution of the time scale parameter 2. This information is not available using squared error analysis because the imaginary component of Ju is canceled when calculating J. This is due to the contribution from the poles of D(s) in the lower LHP in s. The squared error performance function can be considered as a mapping onto R and the complex performance function as a mapping onto C. A simplified expression for J, is derived as follows Ju = 2 E(s)E(s)ds c= 7 f [D1s) WTq (sA )][D(s) WTD(s,2 )]ds C9 , (4.sU) = D(s)D(s)ds w' D(s,(s, A)ds C. c. w D(s)((s, A)ds +w ~t(s, A)Q~(s,)dsw Q c, Using the definition of the complex energy, due, and complex weight vector, w,, as well as the following relations and WT j (s, A)T(s,A)ds w = W'w C, jD(s)D(s,A)ds =w c. (4.81) (4.82) gives Ju = 4 ~WTW WTW+ WTW = de WT WU Note that 2Re[J,] = 2Re[d.,] w2 Re[w,] = d, WTW = J which is what is expected since (4.83) (4.84) J = E(s)E(s)ds= 2 Re  c C = 2Re[J ] (4.85) 60 4.3.2.2 Complex performance function examples and experimental results. Three test cases have been chosen to study the complex performance function, Ju. One 2nd order system and two 4th order systems are modeled using a GLFF network with a Laguerre kernel. The transfer function for each example follows: Example 1: (2nd order system, rapidly decaying impulse response) 1 1 H(s) = + (4.86) s+2j s+2+j Example 2: (4th order system, rapidly decaying impulse response with slight oscillation) 1 1 1 1 H(s) = + + + (4.87) s+2j s+2+j s+l2j s+1+2j Example 3: (4th order system, slowly decaying impulse response with oscillation) 2 2 3 3 H(s) = +  + + (4.88) s+0.4j s+0.4+j s+0.62.8j s+0.6+2.8j Experimental results for each example are tabulated and graphed on the next few pages. In Figures 41 and 42 the squared error performance function, J (solid curve), and twice the imaginary component of the complex performance function, 2Im[Ju] (dash curve), are plotted for the first four model orders for example 1, Figures 43 and 44 for example 2, and Figures 45 and 46 for example 3. Notice the imaginary component has zero crossings near the stationary points of the squared error performance function. Im[Jj] crosses negativetopositive near local minima of J. So the complex performance function provides an estimate of the optimal time scale through the imaginary component. In fact Im[J,] can be considered an estimate of V,J(A) that is best near Aop.t Numerical results are provided in Tables 41, 42, and 43 to illustrate how close the nearest zero crossing (nearest ZCu) of 2Im[Ju] is to the optimal time scale as the model order increases. Also the squared error is computed at the optimal, J(Aopt), and nearest ZCu, J(Anzcu), time scales. For this analysis, Im[Ju] was scaled by 2 to provide similar scaling to J since J=2Re[Ju]; however, the zero crossings are unaffected by this scaling. By examining J(Aopt) and J(Anzcu) it is found that when using a Laguerre network with sufficient order to approximate the unknown system, no degradation occurs by choosing A at the nearest ZCu. In fact the small difference in ,,op, and Anzc, is partly due to numerical approximation error when searching for minima of J and roots of Im[J,]. 10 o 20 =L a, 30 C 40 1 a) 0 z 60 70 0 1 2 3 4 5 Time Scale, X Figure 41: J for example 1 (orders = 0 to 3). 0.1 ' 0.05  E 0O. 0.05 0.1 0 3 x10 1 r (order=0) 1 2 3 4 5 (order=2) 05 =  d 0.5 . E 0 1 S 0 1 2 3 4 5 0.01 0.005 (order=1) 0 .005 O 4 t.............. .................... ................... . 0.01 S 0 1 2 3 4 5 4 x10 4 r x (order=3) 0 1 2 3 4 5 Figure 42: J and 2Im[Ju] for example 1. 0 5 0 10 o S15 u a) 2 20 CO 25 30 S35 40 45 (order=0) 2 (order=2) 2 4 S0.2 S0.1  E0.  0.1 * 0.2 6 0 0.1 0.05 0 0.05 0.1 Figure 44 J and 2Im[J for example 2. Figure 44: J and 2Im[Ju] for example 2. (order=1) 2 (order=3) 2 4 6 1 2 3 4 Time Scale, X Figure 43: J for example 2 (orders = 0 to 7). 0.4 0.2 0 0.2 0.4 0.1 r 0.05 0 0.05 0.1 0 1 2 3 4 5 6 Time Scale, X Figure 45: J for example 3 (orders 0 to 11). (order=O)  0 2 4 6 (order=2) . i (order=1)    (order=3) 0 2 4 Figure 46: J and 2Im[J,] for example 3. CO 0 c 10 U 0 15 " 20 z25 30 1 j'_0.5 E 0 0.6 0.4 E 0.2 0. 0 0.2 Table 41: Performance function characteristics for example 1. N oApr A2nzcu J(Ao,) in dB J(Azcu) in dB 0 2.47 2.48 24.7 24.6 1 1.60 1.58 33.8 33.7 2 2.31 2.31 49.9 49.9 3 1.92 1.92 60.7 60.7 Table 42: Performance function characteristics for example 2. N Aopt Anzcu J(A,,) in dB J(RzcU) in dB 0 2.99 3.12 11.6 11.6 1 1.50 1.43 15.6 15.5 2 2.58 2.60 20.8 20.8 3 1.81 1.78 24.8 24.7 4 2.45 2.47 29.3 29.3 5 1.93 1.92 33.3 33.3 6 2.39 2.40 37.7 37.7 7 2.00 2.00 41.8 41.8 Table 43: Performance function characteristics for example 3. N Ao, 2nzcu J( A,) in dB J(,,zcu) in dB 0 0.82 0.34 0.5 0.2 1 2.30 2.46 2.7 2.7 2 1.29 1.21 5.2 5.1 3 2.12 2.12 9.9 9.9 4 2.87 2.89 12.7 12.7 5 3.56 3.59 14.1 14.1 6 4.21 4.22 14.7 14.7 7 4.82 4.76 14.9 14.9 8 2.24 2.24 17.1 17.1 9 2.62 2.64 19.6 19.5 10 2.25 2.24 22.9 22.9 11 2.57 2.56 25.5 25.5 4.3.3 Further Analysis of the Complex Performance Function In the previous section, it was shown that the squared error and complex performance functions can be expressed as J = d w w Sd (4.89) Ju = due w where d, = D(s)D(s)ds du, = D(s)D(s)ds c c" (4.90) w= fD(s)(s,A)ds w, = D(s)(s,A)ds C C. c c, and D(s) = desired system cp(s,2) = L(s,/) the Laguerre kernel C = contour enclosing the entire left hand plane (LHP) in s Cu = contour enclosing the upper left hand plane (LHP) in s It was also shown that d, = 2Re[d,] d = 2Re[de (4.91) w= 2Re[w,] Writing J and Ju in normalized form (relative to the total energy of the system) gives J = (d, WTW)d (4.92) (4.92) Ju = (de wT'w)/d Decomposing due and Wu into their complex component forms due = dueR + jdue (4.93) wu = wR + jW.u yielding de = 2Re[due] = 2dueR w = 2RWu(4.94) w= 2Re[w,] = 2w, So we can write J and J, in terms of components of the complex energy signal and complex weight vector. J = (2due (2wR)T(2wUR))/(2duR) = (dR 2w uR) dueR Ju = (due (2wuR)T w)/(2d,R) =(i ww)/U(4.96) Now we are interested in 2Im[J,], twice the imaginary component of the complex performance function. This has the expression 2Im[J,] = (due, 2wWU)/du, (4.97) This extension to the complex domain has shed light on new information. The imaginary component of Ju has been shown to have roots (or zero crossings) near the stationary points of J, one of which provides an approximation to the optimal time scale solution. In an effort to capitalize on this discovery, the analytic performance function, Ja, will be defined and studied as a method of achieving the same information while requiring less a priori knowledge of the unknown system D(s). As will be seen, Ja is composed of an energy signal and a weight vector that is the analytic signal representation of the associated squared error components. 4.3.3.1 Analytic performance function. Consider the components of Ju. The complex energy signal, due, can be written as du = 2 D(s)D(s)ds o0 = J D(o)D(c)da + f D( jw)D( ji)do (4.98) 0 0 ~0 = [,D(jw)D(jw)dw] j D(o)D()d7] Likewise, the complex weight vector, wu, can be written as w = fD(s)(s,2 )ds C. o (4.99) = f D(a)((c~,A)do + f D(jw)( jw, )dw 00 0 Now collecting real and imaginary components gives wU= Re D( jo)W( jo,A)do Re [ f D(a)(c, )da + 0 1 1 (4.100) j Im [i D( jo)(j, A)d J Im J D(o)O(a, ,)d Consider the components of Wu along the +jm axis only. This is equivalent to the spectrum of the analytic signal constructed from w. Namely, for a signal f(t), the analytic signal is specified by fa(t) f(t) + j(t) (4.101) where f(t)= H{f(t)}, H is the Hilbert Transform. The spectrum of f,(t) is equivalent to the spectrum of f(t) for 0 < < oo and zero for oo < m < 0. Now we define the following functions: d,,, [real part of d., along + jw axis] + j[imaginary part of du along + jo axis] = D(joI)D(j))da) + [0] (4.102) = f D(jo)D(jwo)dw 0 and w,, =[real part of w, along + jw axis] + j[imaginary part of w, along + jw axis] = Re 1 D( jo)0(jo, A)dj + j Im [ D( jw)(jo, l)do] (4.103) 0 0 = JD( j)w)( jo,)do 0 dae is called the analytic energy signal and wa is called the analytic weight vector. As in the case of the complex performance function we define the analytic performance function, Ja, Ja d w'w (4.104) or in normalized form J= (da, wT a)/de (4.105) Now decomposing the components of Ja into their complex form gives dae = dae + Jde' (4.106) a = WR + Jal where doeR = [ D(j))D(j)d (4.107) dae =0 and 70 a = Re D(j)(iw4(jo9,A)d]o (4.108) w = Im [ D( j)W (jI),)dA) 0 The goal is to rewrite J and Ja in terms of analytic energy signal and analytic weight vector components only. Recall o J w=; D(s)W(s, A)ds C I Tr f D(jw4)(jw,A2)dw 0 = D(jo)O( jo,)d(+ + D(jmo))( jo,A)do 0 0 = 2 iD( jo)D(jw )dw) + D( j))(jwj,A))dw (4.109) 0 1 0 =2Re[2 i D(jo)o(jtWA)dw] =2Re[w] = 2w,, Likewise d,= 2Re [ D(jow)D(jm)dw ] 0 = 2Re[de] (4.110) = 2daeR So we can write J and Ja in terms of components of dae and Wa as follows J = (2de (2WaR)T(2waR))(2daeR) S((4.111) = (daeR 2w WaR W)/daeR S=(dae (2WaRT a)/(2daeR) (4.112) We are interested in 2Im[Ja], twice the imaginary component of the analytic performance function. This has expression 2Im[Ja] = 2wW waT /daeR (4.113) 4.3.3.2 Analytic performance function examples and experimental results. Again consider the three examples used to study the complex performance function. Figures 47, 48, and 49 compare the complex and analytic performance functions. For each of the three examples the squared error performance function, J (solid curve), twice the imaginary component of the complex performance function, 2Im[Ju] (dash curve), and twice the imaginary component of the analytic performance function, 2Im[Ja] (dotdash curve), are plotted for the first four model orders. Numerical results are provided in Tables 45, 46, and 47 to illustrate how close the nearest zero crossing (nearest ZCu) of 2Im[Ju] and the nearest zero crossing (nearest ZCa) of 2Im[Ja] are to the optimal time scale as the model order increases. Also the squared error is computed at the optimal, J(A,,p), nearest ZCu, J(Anzcu), and nearest ZCa, J(nzcaa), time scales. It is interesting to compare Ju and Ja for these three examples. The difference in these functions is that Ju includes the line integral along the negative aoaxis. Consider the complex energy due for each example. From Table 44 we find that duel is 22% of Table 44: Complex energy signal characteristics. Example due= dueR + duei % complex 1 0.45+0.10/ 22 2 1.98+0.63/ 32 3 9.91+0.31/ 3 dueR for example 1, 32% for example 2, and only 3% for example 3. Observe from Figure 49 that J,(A)) = J,(A), for all A, for example 3. Figures 47 and 48 show that Ju and Ja are significantly different (but have close zero crossings near Aopt) for examples 1 and 2. So the line integral along the negative o axis contributes significantly to the evaluation of J for examples 1 and 2 but not example 3. This results in the similarity of Ju and Ja for the last case. Clearly, the relative values of dueR and duc, are application dependent. We have shown that even though Ju and Ja are different functions, their imaginary components have roots that are similar near 2op,. Since Ja can be computed directly from a sampling of d(t), it could be used to provide the optimal time scale estimate in lieu of Ju. (order=0) 0.1 _U 0.05  S 0 ... ., ......' ....... .........  ..... ............. E 0.1 0.1 I  j  0 1 2 3 4 5 3 x10 (order=2) 1 0.5 .... \ E i I I t i I 0 1 2 3 4 5 (order=1) 0.00  0.00 0.C i ! 4 ..... ......... i ............ ....... .................. 5  n5  SI 0 1 2 3 4 0 1 2 3 4 5 S Figure 47: J, 2Im[Jy], and 2Im[Ja] for example 1. Table 45: Performance function characteristics for example 1. N Aopt 2Azcu Azca J(A,,)p in dB J(Azcu) in dB J(Anzca) in dB 0 2.47 2.48 2.48 24.7 24.6 24.6 1 1.60 1.58 1.49 33.8 33.7 31.1 2 2.31 2.31 2.29 49.9 49.9 49.5 3 1.92 1.92 2.62 60.7 60.7 54.6 0.0 (order=1) , ...... .............. ................................... ..... ..................... i/ . ...... .. . ........; 0.3 0.2 M0.1 4 0 E 0.1 S0.2 0.4 v 0.2 E ;i0 E ,0.2 0.4 0.1 7 0.05 E  0 E , 0.05 0.1 0 2 4  0.05 E " 0 E i 0.05 (order=3) (order=3) 2 4 Figure 48: J, 2Im[J,], and 2Im[Ja] for example 2. Table 46: Performance function characteristics for example 2. N Ap A,,zcu Azc J(Aopt) in dB J(AZzcu) in dB J(Azcn) in dB 0 2.99 3.12 3.08 11.6 11.6 11.6 1 1.50 1.43 1.31 15.6 15.5 14.8 2 2.58 2.60 2.19 20.8 20.8 19.0 3 1.81 1.78 3.16 24.8 24.7 22.5 4 2.45 2.47 2.40 29.3 29.3 29.2 5 1.93 1.92 1.86 33.3 33.3 32.8 6 2.39 2.40 2.14 37.7 37.7 34.8 7 2.00 2.00 1.96 41.8 41.8 41.5 0 2 4 (order=2) (order=O) (order=0) i i 0 1 2 3 (order=2) 2 4 Figure 49: J, 2Im[J.], and (order=1) 0.8 0.6 E S0.4 E 0.2 0 0.2 0.8 0.6 _E  0.4 E 0.2 0 0.2 0.6 " 0.4 E M 0.2 4 0 0.2 2 4 orderr) (order=3) 2 4 2Im[J] for example 3. 21m[Ja] for example 3. Table 47: Performance function characteristics for example 3. N Aopi 2zcu Anzc, J(2op,) in dB J(A zc,) in dB J(Anzca) in dB 0 0.82 0.34 0.55 0.5 0.2 0.4 1 2.30 2.46 2.43 2.7 2.7 2.7 2 1.29 1.21 1.21 5.2 5.1 5.0 3 2.12 2.12 2.13 9.9 9.9 9.9 4 2.87 2.89 2.89 12.7 12.7 12.7 5 3.56 3.59 3.60 14.1 14.1 14.1 6 4.21 4.22 4.22 14.7 14.7 14.7 7 4.82 4.76 4.79 14.9 14.9 14.9 8 2.24 2.24 2.22 17.1 17.1 17.0 9 2.62 2.64 2.60 19.6 19.5 19.6 10 2.25 2.24 2.28 22.9 22.9 22.8 11 2.57 2.56 2.62 25.5 25.5 25.3 0.6 " 0.4 E 4 0.2 E  o0 0.2 dl ............... ..................... I ..............  ........... 4.3.4 Time Domain Synthesis Using GLFF Networks For the discretetime setting, one of the fundamental limitations of an FIR model is its lack of memory depth; namely, its inability to approximate (with low order) a system with an infinitely decaying impulse response. GLFF networks do not have this problem due to the presence of local feedback in their structure. How well a Laguerre network approximates the impulse responses of the three example systems given by equations (4.86), (4.87), and (4.88) is now illustrated. Here the system impulse response, d(t), (h(t) = d(t) since the input is an impulse), the Laguerre network output when using the optimal time scale yo(t), and the Laguerre network output when using the nearest ZCu time scale yz(t) are plotted. In example 1 the impulse response decays rapidly and has no significant oscillation. yo(t) and yz(t) are plotted in Figure 410 using a 1st order Laguerre model. In example 2 the impulse response is also rapidly decaying but with a slight oscillation. Plots of yo(t) and yz(t) are given in Figure 411 for a 5th order model. Example 3 is a system having a slowly decaying impulse response with a strong oscillation. Plots of yo(t) and yz(t) are given in Figure 412 for a 10th order model. This oscillation is difficult to approximate with a Laguerre model since only a single real pole (time scale) is used in the approximation. Increasing the order of the network could further reduce the error. Another option is to use a Kautz model with a pair of complex conjugate poles. For all three examples we observe that yo(t) and yz(t) are inseparable. The point to be made in this section is that when using A obtained from the appropriate zero Example 1: d(t),yo(t),yz(t) 1 2 3 4 Figure 410: Impulse response and approximations using a 1st order Laguerre model. Example 2: d(t),yo(t),yz(t) 3 4 5 Figure 411: Impulse response and approximations using a 5th order Laguerre model. Example 3: d(t),yo(t),yz(t) 5 6 7 8 Figure 412: Impulse response and approximations using a 10th order Laguerre model. 2 1.5 1 0.5 0 %I crossing of Im[J1u (or Im[Ja]) as the time scale for the model, the same result can be achieved as when using A obtained by searching J for the global minimum. Also the squared error performance function can always be computed using J = 2Re[Ju]. 4.3.5 Conclusion of the Complex Performance Function For systems composed of sets of real and/or complex conjugate pair poles, a new complex performance function, J,, was derived when using a GLFF network as a system model. The imaginary component of Ju has zero crossings near the stationary points of the squared error performance function J. Experimentally it is determined that as the order of the model is increased, the zero crossings of Im[J,] near minima of J converge to these stationary points, one of which will be the optimal time scale, Aopt. Rather than searching the nonconvex function J for a global minimum, locate the roots of Im[Ju(A)], A,, and evaluate 2Re[Ju(Ar)]. One of these will be close to Aopt; namely, = arg( min{2 Re[Ju (A,)]}j (4.114) where 2r= {roots of Im[Ju(X)]}. In addition, we have defined the new analytic performance function and derived its closed form solution. It was shown that J could be written in terms of components of Ju and Ja. Expressions for the imaginary components of Ju and Ja were also given. The value of these functions in determining the optimal time scale of GLFF kernels was illustrated using three example systems. The important equations from the above sections are now summarized: Squared Error Performance Function (de wTw)/de J= (deR 2WTRWuR,)/dueR (daeR 2WRWaR )/daeR {Squared error components} {Complex error components} {Analytic error components} Complex Performance Function Ju = (+due WuRWu)/dueR Analytic Performance Function Ja = ( dae Wa)/daeR where d, = ;D(s)D(s)ds d,= D(s)D(s)ds Cu o w= D(s(Ds(s,A)ds C w = D(s)4(s, A)ds wa = D(ji)D( jw,A)dw 0 If D(s) is known analytically or in closed form we can employ the residue theorem to compute de, due, dae, w, Wu, wa. If these functions must be determined numerically then the analytic performance function can be easily computed. The only requirement is a sampled data set from an impulse or step response test. In addition, J can always be computed from Ju or Ja. This illustrates that the squared error solution is completely embedded within the complex and analytic performance functions. (4.115) (4.116) (4.117) (4.118) 4.4 Conclusion (Optimal Parameter Selection) This chapter has addressed optimization techniques for GLFF models, M(p). Optimization of each element of the parameter set p = {N,w, A} is handled separately. Model dimension element optimization generally involves increasing N until a prescribed error tolerance is achieved. Monitoring the rate of convergence for several GLFF kernels could also aid in choosing the model yielding the smallest N. The optimal weight vector minimizes the integrated squared error and is found by solving a set of WienerHopf equations. A direct integral evaluation method was also given that takes advantage of properties of the model kernels. The weight vector solution is a function of the time scale vector 2. Optimizing A is the most difficult task because the minimum squared error solution is nonlinear w.r.t. these parameters. A number of optimization approaches exist in the literature and they were discussed. Finally, a new method was proposed that extends the squared error analysis to the complex domain. Simply locating zero crossings and choosing the one that minimizes J gives the optimal A. Examples were provided to demonstrate the theory. CHAPTER 5 APPLICATIONS Applications of the theory presented in the previous chapters will now be discussed. In this thesis, GLFF networks will be used to approximate sampleddata systems. In particular, the step response of a system used in laboratory test and evaluation of guided weapon systems will be modeled. Other areas where GLFF models have proven useful are first surveyed. The GLFF model class encompasses many structures that have been independently formulated and studied in the past. This includes the tapdelay line, Gamma, Laguerre, Legendre, Jacobi, and Kautz models defined in both discretetime and continuoustime domains. The unification of these structures is one contributory component of this thesis; however, it is worth reviewing the previous history of these models to gain insight into the many applications that they have been able to address. Network synthesis. The earliest work dealt primarily with orthogonal model structures in an effort to synthesize networks and transient responses of linear systems [Arm57], [Arm59], [Kau54], [Hea55], [Bru64], [Clo65], [Kin77], [War53]. In [Lee33] and [Hug56] synthesis of electrical circuits was the focus. System identification and modeling. Similarly as the tools became more developed the nomenclature and approach shifted to system identification and system modeling [Bod94], [Eko94], [Kin79], [Lin93], [Mas90], [Mas91], [Nur87], [Nin94], [Oli87], [Oli94], [Per91], [Si194], [Sil95], [Van94], [Wah91], [Wah94], [Moh88]. The basic principles are the same as in network synthesis only now the modeling effort involves more than just the network impulse response. A further refinement includes modeling noncausal systems [Eko95], nonlinear biological systems [Mam93], probability density functions, [Pab92], [Tia89]. State space formulations appeared [Dum86], [Gus93], [Heu90], [Heu95], and modeling using step response data [Wan94b] and frequency response data [Clu92], [Wan95] have been considered. Function approximation. A representation of an underlying system is not always desired. Various GLFF models were also useful in function approximation, signal representation, time series analysis [Bro65], [Cel95], [Con94], [Cle82], [Hwa82], [Kor91], [Mar69], [Mai85], [McD68], [Ros64], [Sch71], [Wah93], [You62b], [Yen62]. Particular examples include representation of seismic data [Dea64] and electrocardiograms [You63]. Speech. Applications of GLFF models used in speech synthesis/analysis [Man63] and speech compression [A1J94] have appeared. Control theory. GLFF models are also useful in control theory. Examples include adaptive controllers [Dum86], [Dum90], [Zer88a], adaptive predictor controllers [Els94], [Guo94], optimal control [Hag91], [Nur81], robust control [Clu91] and robust stability tests [Aga92]. Stochastic processes. Stochastic concepts have been addressed including approximating cross correlations [den93a], correlations and power density spectra [Lee67], and probability density functions [Tia89], [Pab92]. (Adaptive) filter theory. GLFF models have been used as filters [Kab94] including adaptive filters [den93a], [den94], [Per88], [Pri93]. Systematic procedures to design FIR filters have been introduced [Fri81], [Mas90], [Mas91]. Recursive identification of time varying signals [Gun90], 2D filtering [Tsi94] and hardware implementations have also been studied [Ji95]. Other applications. Other examples where GLFF networks have been used include solutions to boundary value problems [Ira92], separation, enhancement, and tracking multiple sinusoids [Ko94], and approximation of time delays [Lam94], [Par91], [Moh95]. Although the list of applications above is fairly extensive, it is by no means exhaustive. Nevertheless, the applicability of the GLFF modeling theory developed in this research is clearly illustrated. 5.1 Examples of GLFF Networks Using Simulated Systems The application focus of this thesis is system modeling; namely, it is desired to approximate (model) LTI systems using GLFF networks. The objective is to obtain an approximation to the system impulse response. Quite often this is accomplished by optimizing the parameters of the model using step response data. In this section several GLFF network realizations are constructed that approximate simulated LTI systems. Optimization is performed using the step response of the desired systems. Real (nonsimulated) systems and their associated step response data will be considered in Chapters 5.3 and 5.4. Three simulated systems will be studied: 1st order, 2nd order, and 4th order. Example 1: (1st order) H(s)= r=1 (5.1) s + 1 Example 2: (2nd order) (02 H(s) = w, = 1, = 0.4 (underdamped) (5.2) s + 2 os + n. Example 3: (4th order) H(s)= H,(s)H2(s) (5.3) where H (s) = + 2, ,, =1, I = 0.4 (underdamped) s2 + 241,00s + t (5 2 (5.4) H,(s)= n2 2 ,2 = 2, 2 = 0.25 (underdamped) s2 + 22W2s + on2 In each case, the step response is generated and sampled. The sampled step response data is modeled with Laguerre, Gamma, and Legendre models. The results are given below in Figures 51 through 56. In each case, the optimal time scale parameter was determined and used to compute and plot the step response and impulse response for 6tap (N = 5) networks. Figures 51, 52, and 53 compare the system and GLFF network step responses. Figures 54, 55, and 56 show the resulting impulse responses. The optimal time scale parameters of the 6tap models and associated minimum squared error, J(Aopt), are provided in Tables 51, 52, 53. 1 0.9 0.8 0.7 0.6 0.5 0.4 System + + + + Laguerre model 0.3 f Gamma model x x x x Legendre model 0.2 0.1  0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 t (sec) Figure 51: 1st order step response and model data. 14 1.2 1 0.8 0.6 04 System + + + + Laguerre model Gamma model 0.2 x x x x Legendre model 0 1 2 3 4 5 t (sec) 6 7 8 9 10 Figure 52: 2nd order step response and model data. 1.6 1.4 1.2 1  0.8 0.6 0.4 0.2 0.2 0 1 2 3 4 5 6 7 8 9 10 t (sec) Figure 53: 4h order step response and model data. 1  0.9 0.8 0.7 0.6 0.5  0.4 0.3 0.2 0.1 0 0 0.5 System + + + + Laguerre model * Gamma model x x x x Legendre model Figure 54: 1st order impulse response and model data. System + + + + Laguerre model * Gamma model x x x x Legendre model 1 1.5 2 2.5 3 3.5 4 4.5 5 t (sec) 0.8 0.7 xxxx 0.6  System + + + + Laguerre model 0.5 Gamma model f x x x x Legendre model 0.4 x 0.3 x X 0.2 x x 0.1 0.1 0 2 ,i 1 i i I I i1 0 1 2 3 4 5 6 7 8 9 10 t (sec) Figure 55: 2nd order impulse response and model data. 0.8 x X+. 0.7 x . x 0.6 x x System 0.5 + ++ + Laguerre model 0 x\ Gamma model 0.4 x \ x x x x Legendre model x 0.3 0.2 0.1 +* / ;xxxxxxxxxxx> 0 x x +x x Ixx 0.1 \.x + +. x .. (*', x 0.2 \ *+*.x 0 1 2 3 4 5 6 7 8 9 10 t (sec) Figure 56: 4th order impulse response and model data. Table 51: 1st order model results. Model N Aot J(2o) in dB FIR 101 13.7 Laguerre 5 0.88 57.1 Gamma 5 0.05 oo Legendre 5 Aop = e0.05 56.0 Table 52: 2nd order model results. Model N A__ pt J(A opt) in dB FIR 101 13.0 Laguerre 5 0.95 57.1 Gamma 5 0.08 40.7 Legendre 5 opt,i = e0.03i 37.5 Table 53: 4th order model results. Model N 2opt J(/p,) in dB FIR 101 10.9 Laguerre 5 0.89 37.7 Gamma 5 0.09 31.2 Legendre 5 opt,i = e003i 28.8 The three examples considered above are typical of the subsystem makeup of guided weapon systems. With fairly low order models (N=5) the GLFF networks have demonstrated (in the simulated examples) they can approximate these systems well. This includes systems having slowly decaying impulse responses and oscillatory modes. The most difficult task of the modeling process is locating the optimal time scale. This was done using a numerical search technique in the above examples. For a given model, if the resulting error is too large the model order can be increased. This process can be continued until the prescribed error tolerance is reached. 5.2 Laboratory Testing of Guided Weapon Systems Guided weapon systems are as important in a military tactical mission as the aircraft and ships that carry them. They are the tools with which a pilot can carry out both offensive and defensive plans. As such, their capability and reliability must be well understood so that they can be operated within the envelope for which they were developed. To ensure their success it is necessary that the operational envelope is understood. This knowledge is acquired through extensive test and evaluation of these systems. The objective is to produce a smart dependable weapon that operates as expected when launched from an aircraft or ship. Test and evaluation (T&E) of guided weapon subsystems can be broken down into three phases [Eic89]. This breakdown is typical when a new weapon or modification of an existing weapon is under consideration. These phases are 1. Research/Concept Exploration a) analytic study b) openloop laboratory and field tests 2. Simulation a) software simulation and modeling b) HITL simulation 3. Flight Test a) captive carry b) launch Research and concept exploration are the very beginning of the T&E effort. Proposals must be carefully considered to determine which (current) weapons could possibly meet the objective being considered. Can an existing weapon be modified or does it call for a new design? All these scenarios must be analytically explored. Flight tests are the final hurdle. It is here where the weapon system's performance is measured against its expectation. Before flight testing however, many problems can be worked out by evaluating a weapon system in a simulated environment created in the laboratory. There are many methods for evaluating the performance of components of a weapon system such as parametric tests and openloop tests. Another class of tests studies the behavior of these systems in a closedloop configuration and involves a simulation of the realworld environment and the weapon's flight in this environment. This class of tests can be broken down into two types: alldigital and hardwareintheloop (HITL). In an alldigital closedloop flight, the entire weapon system is simulated mathematically. If the weapon subsystem models are accurate, alldigital simulations can give us a good idea of how the weapon will perform as well as what its limitations and vulnerabilities are. Once subsystems have been hardware fabricated, these can replace the models in the closedloop simulation. Ultimately it is desired to have as much of the weapon hardware in the loop as possible. 5.2.1 Objectives Several reasons why laboratory HITL testing is an integral part of a weapons development test program are now given. 1. Asset Availability During the development of a weapon system, often answers are needed to key questions about the system to determine the limits of its capability. Rather than fabricating an entire system and conducting a flight test, many times these questions can be answered by using simple prototypes or only subsystem components. This precludes having to completely fabricate a system and conduct open air flight tests. Flight tests are expensive and consume lots of resources (fully developed weapon system, aircraft/ships, openair range time/space, money). Here is where laboratory testing offers a great advantage. 2. Planning and Analysis Requirements As a missile system reaches development stage and can be flight tested, valid test conditions (test matrix) must be formulated. Having laboratory results (alldigital and/or HITL) as a guide, the test matrix can be formulated with intelligence and unrealistic scenarios can be avoided. Laboratory tests can not only help define the launch envelope of a weapon, but they can be used to predict its performance when operating under valid conditions. 3. Repeatability Another advantage to the laboratory environment is repeatability testing. Once a weapon system simulation has been developed, it can be used to conduct hundreds and thousands of tests. Weapon performance when launched under varying release conditions and "what if" scenarios can be examined. The laboratory environment is a major contributor to this kind of testing because real assets do not have to be expended to conduct many tests. In "live fire" testing assets can usually be used only once. 4. Security/Safety Finally there is the security advantage. As long as testing is being conducted in a laboratory, external security vulnerability is reduced. Range safety costs and concerns that occur in openair testing are also eliminated. To accomplish valid and accurate laboratory testing, careful attention must be paid to the test configuration. For HITL tests it must be ensured that the laboratory equipment can recreate the dynamics and conditions that will be experienced in the real world. Often it is easy to switch between configurations involving hardware and software. This requires accurate software models of the hardware being simulated. This is where GLFF networks discussed in this work enters the application front. To gain a better understanding of the systems the GLFF networks must model, a more detailed breakdown of a typical missile system is now given. The functional components of the missile are discussed as well as how they are integrated into a HITL test configuration. The GLFF networks can then be used to approximate some of these components. 5.2.2 Functional Breakdown of a Missile System This section provides a discussion of components (subsystems) most often found in missile systems. Figure 57 below illustrates a highlevel functional break down of a typical missile system [Gar80]. Figure 57: Missile functional block diagram. Seeker/Guidance Modules. It can be said that the seeker is the eye and the guidance module is the brain of a weapon system. It is here the missile trajectory is commanded and controlled. For missiles that are considered lockon after launch, the guidance module may begin in a target search mode. In this case it drives the seeker to search for a target in some preprogrammed pattern. During this phase commands may be sent to the autopilot to have it execute a midcourse maneuver such as pitching the weapon up in order to gain altitude for target search and future end game engagement. If target energy is found the guidance module may transition into an acquisition mode to narrow the search field of the seeker and attempt to select a single target. At this point it begins target track and will try to keep the seeker pointed in the direction of the target. During track mode, guidance (acceleration) commands are sent to the autopilot in accordance with the guidance law being executed (proportional or pursuit navigation). All the while this module performs seeker head stabilization and directional pointing to maintain a continuous track of the target until impact. Autopilot. The job of the autopilot is to maintain a stable flight and deliver on the commands received from the guidance module. To perform its task, the autopilot accepts input from various sources and sensors. The three most common are: guidance module, accelerometers, gyros. The guidance module tells the autopilot that a certain amount of acceleration is required to maintain the desired trajectory. Upon receiving a command for a specific acceleration, the autopilot issues commands to the actuator to drive the control surfaces (fins or canards) in the proper direction. Measurements are taken from accelerometers to determine whether the weapon has achieved the desired acceleration. Gyro rotational rate and angle measurements are taken to ensure proper damping and to keep the motion of the missile in a stable condition. Actuator and Control Surfaces. The purpose of the actuator is to accept commands from the autopilot and in turn drive the control surfaces to achieve the dynamics needed. As the fins are controlled into the wind the missile body will rotate and accelerate hopefully in a stable manner sufficient to maintain a target track. Track angles, Steering commands Fin commands Fin rates rates positions Target Seeker 4 Guidance Autopilot Actuator signature Mode, Stabilization "> I Flight Motion Simulator (FMS) IAa T arget .............................................. .......................... ... ............................ A ir ram e Target A Forces, ............... ....... K inem atics Missile body angles, rates, Target Missile/Target 4 accelerations Geometry Target profile Figure 58: Closedloop missile simulation. 5.2.3 HITL ClosedLoop Missile Simulations In this section HITL closedloop missile simulations will be discussed. Figure 58 above illustrates a typical test configuration. Usually the seeker, guidance module, and autopilot are hardware components and the functionality of these systems is the subject of the test. Special laboratory equipment is used to generate the target scene. This is dependent upon the kind of sensor in the seeker head. The remaining blocks are simulated in software. For a HITL closedloop simulation, the seeker, guidance 