SYSTEM MODELING USING
GENERALIZED LINEAR FEEDFORWARD NETWORKS














By

DOUGLAS G. JONES


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1999













ACKNOWLEDGMENTS

There are many people who made the completion of this work possible. First and

foremost I would like to thank my research professor, Dr. Jose C. Principe. He has

taught me the valuable skills of rigor and completeness. Also, his patience with my

progress while I juggled both research and a busy career was greatly appreciated. I

would like to thank my supervisory committee members Dr. Yunmei Chen, Dr.

William W. Edmonson, Dr. John G. Harris, and Dr. Jian Li for their encouragement

and direction.

Many thanks go to the management within the 46 Test Wing, Eglin AFB, FL, who

made it possible for me to attend the University of Florida through the long-term full-

time training program. This organization fosters an atmosphere that encourages

academic achievement, and it has continually provided me with the opportunity to

further my education.

There are no words to describe the wisdom, guidance, and support I have received

from my parents. Their love to this day is endless.

Above all I would like to give the highest recognition to my loving wife, Donna,

for her sacrificing devotion throughout my education. To my precious children,

Lindsey and Mark, you are the sunshine in my life.

Lastly, I dedicate this work to my Lord and Savior Jesus Christ. I can do all

things through him who gives me strength.













TABLE OF CONTENTS

ACKNOWLEDGMENTS

ABSTRACT

CHAPTERS

1 PROBLEM FORMULATION
1.1 Problem Definition
1.2 Measurement Objective
1.3 Model Classifications
1.4 Historical Review
1.5 Modeling Methodology
1.6 Research Contribution

2 SYSTEM ASSUMPTIONS
2.1 Linear, Time-Invariant, Causal Systems
2.2 Realizability
2.3 L^2 Signals and Systems
2.4 System Synthesis
2.5 Continuous/Discrete-Time Transformation
2.6 Examples
2.7 Conclusion (System Assumptions)

3 MODEL FORMULATION
3.1 General Model Form M(p)
3.2 Discrete-Time Models
3.2.1 FIR Model
3.2.2 IIR Model
3.2.3 GLFF Model
3.2.4 FIR, IIR, GLFF Model Block Diagrams
3.3 GLFF Model Kernels
3.3.1 Popular Discrete-Time GLFF Kernels
3.3.2 Popular Continuous-Time GLFF Kernels
3.4 GLFF Kernel Derivations and Properties
3.4.1 Continuous-Time Laguerre Kernel
3.4.2 Continuous-Time Kautz Kernel (Complex Time Scale Vector)
3.4.3 Continuous-Time Kautz Kernel (Real Time Scale Vector)
3.4.4 Continuous-Time Legendre and Jacobi Kernels
3.5 Impulse and Step Response Modeling
3.6 Conclusion (Model Formulation)

4 OPTIMAL PARAMETER SELECTION
4.1 Model Dimension
4.1.1 Completeness
4.1.2 Orthogonal Projection
4.1.3 "Optimal" Selection of N
4.2 Weight Vector
4.2.1 Optimal Selection of w
4.2.2 Relationship to Wiener-Hopf Equations
4.2.3 Minimum Squared Error Performance Function
4.2.4 Computing w Using Step Response Data
4.2.5 Adaptation of w
4.3 Time Scale Vector
4.3.1 Optimal Selection of λ (Historical Review)
4.3.2 Optimal Selection of λ (Complex Error Analysis)
4.3.3 Further Analysis of the Complex Performance Function
4.3.4 Time Domain Synthesis Using GLFF Networks
4.3.5 Conclusion of the Complex Performance Function
4.4 Conclusion (Optimal Parameter Selection)

5 APPLICATIONS
5.1 Examples of GLFF Networks Using Simulated Systems
5.2 Laboratory Testing of Guided Weapon Systems
5.2.1 Objectives
5.2.2 Functional Breakdown of a Missile System
5.2.3 HITL Closed-Loop Missile Simulations
5.3 Modeling the Step Response of a Carco 5-Axis FMS
5.3.1 Discrete-Time GLFF Models of the FMS
5.3.2 Step Response Modeling Procedure
5.3.3 Results (FMS Step Response Modeling)
5.3.4 Continuous-Time GLFF Models of the FMS
5.3.5 Conclusion (Step Response Modeling)
5.4 Other Applications
5.5 Conclusion (Applications)

APPENDICES
A: ORTHOGONAL SIGNAL SETS
B: HILBERT SPACE CONCEPTS
C: S-DOMAIN DERIVATION OF MINIMUM J
D: SQUARED ERROR STATIONARITY USING THE LAGUERRE MODEL

REFERENCES

BIOGRAPHICAL SKETCH












Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

SYSTEM MODELING USING
GENERALIZED LINEAR FEEDFORWARD NETWORKS

By

Douglas G. Jones

August 1999


Chairman: Jose C. Principe
Major Department: Electrical and Computer Engineering

The focus of this work is the derivation and development of a class of models

called Generalized Linear FeedForward (GLFF) networks. Included is a description

of the a priori assumptions concerning the systems under study, a development and

formulation of the models (networks) to be used, and a definition of a measurement

criterion used to judge goodness-of-fit. The models are constructed from a weighted

sum of parametrically dependent kernels. These kernels represent basis vectors that

span a vector space with dimension equal to the order of the network realization.

Optimization of the model parameters is discussed and a new approach is given; one

that extends the conventional squared error analysis to the complex domain.

After presentation of GLFF model theory and parametric optimization

methodologies, network realizations are constructed to address the system modeling

problem. GLFF networks can be formulated to model discrete-time, continuous-







time, and sampled-data systems. In this study, it is shown how a flight motion

simulator, used in the laboratory test and evaluation of guided weapon systems, can

be accurately modeled using GLFF networks comprised of Gamma, Laguerre, and

Legendre kernels.

Advantages of GLFF networks are their ability to accurately represent linear time-

invariant systems coupled with the simplicity of the procedures employed to optimally

determine their parameters. The network structures have efficient and practical

implementations and applications are offered as proof.












CHAPTER 1
PROBLEM FORMULATION


In this chapter an overview of the entire content of the research is given,

beginning with a definition of the problem to be solved; namely, modeling the

impulse and step responses of causal, linear, time-invariant systems. The

system/model configuration is illustrated and the primary measurement objective

stated. The model architecture is briefly described and prefaced with a review of the

major contributors leading to the proposed formulation. Finally, the contribution of

this research is summarized. It will be developed throughout the remaining chapters.

It will be shown that the models developed are capable of approximating systems

with real and/or complex conjugate pair poles, even those which have slowly

decaying impulse responses (large time constants) and/or oscillatory modes (light

damping). This is difficult to achieve with finite impulse response (FIR) structures.

The models studied are considered parametric since they are functions of parameter

vectors that must be determined. These parameter vectors are fairly easy to optimize

and unlike infinite impulse response (IIR) structures stability can always be

guaranteed.

The networks under consideration in this research are comprised of linear

combinations of special basis functions. These basis functions are not arbitrarily

chosen but are derived from assumptions of the systems being modeled. The






networks therefore become good approximates since they obtain the natural

information of the unknown system. This information resides in a parameter vector

built into the model structure. The parameters are optimally chosen so that the

models approximate the system in a least squares sense. Hence, the proposed theory

is an implementation of parametric least squares error modeling [Cad90].

The modeling methodology can also be described using geometric concepts.

Considering signals as vectors in vector spaces, the models are linear combinations of

parametric basis vectors. They span a subspace of the space spanned by the unknown

system. The model output is a projection of the system vector onto the model

subspace. This space has an orientation that is parametrically established. The error

is a vector that lies outside the model subspace. Projecting orthogonally the system

onto the model space minimizes the error.


1.1 Problem Definition

The objective is to approximate a causal, linear time-invariant (LTI) system, H,

using a pre-defined model, M(p), such that a measure of the quality of the

approximation is minimized. Models developed in this study will be called

generalized linear feedforward (GLFF) networks and will be defined in Chapter 3.

The basic system modeling configuration is shown in Figure 1-1 below. Arguments

are not included with the functions {x, d, y, e, H} to illustrate that the formulation will

apply in both a continuous-time setting {x(t), d(t), y(t), e(t), H(s)} and a sampled-data

or discrete-time setting {x(k), d(k), y(k), e(k), H(z)}. Quite often in this thesis we

switch back-and-forth between continuous-time and discrete-time concepts. This is







intentionally done to prove the broad applicability of this work. Most of the theory

however will be derived in continuous-time. It would have been just as appropriate to

develop the concepts in discrete-time.



[Figure 1-1: System modeling configuration.]


1.2 Measurement Objective

How is it determined whether a model has achieved a good representation of the

system being approximated? For this a measurement of the error or a test for

goodness-of-fit of the model is needed. In this thesis the integrated squared error

criterion will be employed; namely, for a given model M(p), the parameter set p is

found such that the energy in the error signal e is minimized. This is one of the

simplest error criteria to work with since the measurement is basically quadratic in

form and its derivatives are linear in certain elements of the parameter set. So the

objective is to minimize

    J = \int_0^{\infty} e^2(t)\,dt = \int_0^{\infty} (d(t) - y(t))^2\,dt \qquad \text{(continuous-time)}
or                                                                                              (1.1)
    J = \sum_{k=0}^{\infty} e^2(k) = \sum_{k=0}^{\infty} (d(k) - y(k))^2 \qquad \text{(discrete-time)}
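For illustration, the discrete-time form of (1.1) can be evaluated directly from sampled data. The following is a minimal sketch (assuming NumPy; the function name is ours):

    import numpy as np

    def integrated_squared_error(d, y):
        # J = sum_k (d(k) - y(k))^2, the discrete-time criterion of (1.1)
        e = np.asarray(d, dtype=float) - np.asarray(y, dtype=float)
        return float(np.sum(e ** 2))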








1.3 Model Classifications

To provide motivation for the GLFF networks (a complete formulation will be

given in Chapter 3), consider the discrete-time setting. In the past discrete-time

system models have been largely divided into two classes: FIR and IIR. A new class

of models (filters) known as the generalized feedforward (GFF) filters was

popularized with the development of the Gamma filter [Pri93]. This filter has a

structure similar to the tap-delay line (FIR filter) with the delays replaced by a "more

general" transfer function known as the Gamma kernel. Extending this development

can be accomplished by considering other popular kernels, most of which are

orthogonal, such as the Laguerre, Legendre, Jacobi, and Kautz [Lee67], [Arm59],

[Kau54]. The extended formulation with a more comprehensive set of kernels is

called the GLFF model class because its members are used in the context of modeling

linear systems and they obey the superposition principle. How does the GLFF model

class compare to FIR and IIR models? Figure 1-2 relates these three classes based

upon important issues that are usually considered when deciding which modeling tool

to pull from the signal processing toolbox. GLFF models can be said to fall

somewhere between FIR and IIR models in capability, complexity, stability and

computability.


[Figure 1-2: FIR/GLFF/IIR comparison. Complexity and capability run from low (FIR) through GLFF to high (IIR); stability and computability run from trivial (FIR) to difficult (IIR).]







This thesis exploits the strengths of GLFF models. They offer an alternative to

the two extremes in the modeling process. Their advantages are that they have a

reduced complexity compared to IIR models while offering greater capability over

FIR models. Unlike IIR modeling techniques, stability can be easily guaranteed and

their optimal or near-optimal solutions can be computed with much less difficulty.

The most difficult task is in selection of the time scale vector λ. This is one element

of the parameter set p. The complete parameter set will be defined in Chapter 3.

Optimization of its elements comprises all of Chapter 4.


1.4 Historical Review

Much of the theory on which the GLFF model class is founded began with the

work of Lee [Lee33], [Lee67]. Lee showed how to use orthogonal functions to

synthesize transient networks and how this concept is employed to build electrical

devices that can realize physical systems. Another major contributor is Kautz

[Kau54]. He provided a convenient way to develop the time domain response of a

system using a most general orthogonal signal set. Young and Huggins extended

Kautz' procedure to discrete-time [You62a] as did Broome [Bro65]. Nurges and

Yaaksoo [Nur81] formulated state space equations that represent discrete-time linear

systems using the Laguerre model. Roberts and Mullis provide a generalized state

variable formulation of orthogonal all-pass network structures [Rob87]. These

structures keep data paths local. The fundamental processing elements are alike

except perhaps for local scaling parameters. This simplifies the design and allows for

distributing the computations in a connected array. These orthogonal structures also






have the advantage that higher order models can be constructed by simply adding new

blocks or terms without affecting or forcing a recalculation of the existing lower order

terms. The concept of a generalized feedforward filter was proposed by Principe

[Pri93] with the Gamma model. This idea of creating a more general transversal filter

using the earlier work with orthogonal signal sets leads to the modeling methodology

here described; namely, the generalized linear feedforward networks.

What is the basic objective? Given a desired impulse response, h(t), of a causal

LTI system along with a prescribed tolerance on the allowable error, find an

approximating impulse response, \hat{h}(t), that is simple in structure and easy to compute.

Let the system and model transfer functions be H(s) and \hat{H}(s), respectively.

Construct \hat{H}(s) with the form

    \hat{H}(s) = A \frac{\prod_{i=1}^{m}(s - z_i)}{\prod_{i=1}^{n}(s - p_i)} = \sum_{i=1}^{n} \frac{A_i}{s - p_i}    (1.2)
i=1

Assuming no coincident poles the impulse response is

    \hat{h}(t) = \mathcal{L}^{-1}\{\hat{H}(s)\} = \sum_{i=1}^{n} A_i e^{p_i t}    (1.3)

where Re(p_i) < 0, ∀ i = 1, ..., n. It is desired that h(t) and \hat{h}(t) be as close as possible

under prescribed tolerances.

There are two well-known approaches to performing this approximation. The first

is to transform h(t) into H(s) and then approximate H(s) in the s-domain by using a

ratio of polynomials. This approach is generally less difficult and several techniques

for accomplishing this approximation have been presented in the literature. These







include Prony's method, Padé approximants, continued-fraction expansions, and even

analog filter design [Su71], [Bak81], [Opp89]. The problem with this approach is

that the error in the time domain is difficult to control. The second approach is to

approximate h(t) with a finite sum of exponentially damped sinusoids and damped

exponentials (as described above) and then transform the approximation to find H(s).

Here the modeling error may be controlled directly in the time domain. One way of

accomplishing this is to implement the procedure described by Kautz [Kau54] which

involves approximating h(t) using a weighted sum of special basis functions. These

basis functions are themselves weighted sums of exponentials carefully constructed so

that they are orthonormal. This provides for great simplification when searching for

the optimal parameter set p and gives direct control over the associated error.


1.5 Modeling Methodology

In this thesis the method used to implement GLFF network approximations of

linear systems is based on the work done by Kautz. This is carried out by considering

a construction of h(t) using an infinite sum of weighted time domain basis functions

{φ_n(t)}_{n=0}^∞. Since the infinite sum must be truncated to a finite number of terms an

approximation results

    h(t) \approx \hat{h}(t) = \sum_{n=0}^{N} w_n \phi_n(t,\lambda)    (1.4)

The weights w_0, w_1, \ldots, w_N and basis functions \{\phi_n(t,\lambda)\}_{n=0}^{N} are chosen with the

following goals in mind:

1. For a fixed number of terms, N, make the convergence as rapid as possible.
This is controlled using a time scale vector λ.









2. Make the optimal weights independent of the number of taps, N+1. This is
accomplished by using orthogonal basis functions.
3. For a fixed number of taps, N+1, minimize the energy in the error signal. This
requires determining optimal values for all of the elements of the parameter
set p in such a way that J (the integrated squared error) is minimized.

The above procedure is applicable when modeling the impulse response of a

desired system. It may also be of interest to approximate the step response. In fact

the applications given in this thesis involve finding system models using step

response data. If the network input is a step function, this method is employed by

approximating not the desired system output but its derivative or by approximating

the desired output and taking the derivative of the result. This will give an analytic

expression for the impulse response. When only a numerical or tabulated form of the

impulse response is desired, use the step response data to generate a system model.

Once the model is established, use an impulse (or white noise) input to generate the

impulse response data.


1.6 Research Contribution

This study extends the above concepts and contributes to the system modeling

theory in several ways.

First, a unified system modeling framework centered around the GLFF model

class is provided. As will be seen in Chapter 3 this formulation not only includes

models that employ orthogonal signal sets like the Laguerre and Kautz but non-

orthogonal ones as well, such as the Gamma. In addition to these generalized models,

the basic framework can also be used to represent FIR and IIR structures.







Optimal parameter selection is a big part of the modeling procedure. The most

difficult task is locating or approximating the optimal time scale vector. A new

approach that extends traditional squared error analysis to the complex domain is

developed. It will be shown how the information in this new space can be used to

locate the optimal time scale parameters.

Finally an application of the theory is given by modeling the step response of a

Carco 5-axis flight motion simulator (FMS). This system is used in hardware-in-the-

loop (HITL) testing of guided weapon systems. GLFF modeling of other systems and

subsystems used in HITL testing is also considered. It is demonstrated how this class

of models can be employed to accurately represent the dynamics of physical systems

and add value to the task of weapon system test and evaluation.















CHAPTER 2
SYSTEM ASSUMPTIONS


In the previous chapter, a brief discussion was given concerning the assumed

systems to be modeled. A more complete analysis of all the system assumptions is

now offered. Mathematical and physical arguments are used to validate their

necessity. The basic assumptions are that input/output signals have finite energy and

systems are stable, causal, linear, time-invariant, and realizable. After stating the

assumptions, the technique employed to synthesize these systems by simulation is

given. Higher order systems will be simulated by cascading together 1st and 2nd order

subsystems. This is done in a continuous-time framework. The transformation used

when a digital system is desired is also discussed. Finally some simulation examples

of the kinds of real systems to be modeled are offered.


2.1 Linear, Time-Invariant, Causal Systems

There are many ways to classify systems; linear vs. nonlinear, time-invariant vs.

time varying, causal vs. non-causal, stochastic vs. deterministic, continuous vs.

discrete, etc. One important step in the study and analysis of systems theory is to

derive a structural or mathematical description of processes and systems. This

assumed structure is essential if plausible analytical models are sought to approximate

them. One could argue that if we had a mathematical description of the process, why

would we need additional models? The primary reason for the model is to







approximate input/output behavior of the process. The assumed mathematical

description of the process could be used to give very specific information about its

internal composition (time constants, damping ratios, etc.); however, a component

level specification may be unavailable. In fact even if these system parameters are

known they may not be operating within their stated specification. In this research an

understanding of the internal composition is not the objective, only a model of the

input/output behavior when viewing the system as a black box. The theory presented

in this section does however give the foundation used to formulate and validate this

new class of models and serves to illustrate the derivation of these concepts.

Many systems studied in engineering can be well represented by mathematical

relations that describe their physical makeup or behavior. Quite often systems are

very complex and a realistic characterization may require nonlinear and/or time-

varying equations that are difficult to solve. For simplicity, practicality, and in order

to establish an applicable set of tools which can be used in the design and analysis of

systems, approximations and assumptions are made about their physical nature. This

is done primarily to allow for the use of linear systems theory. This reasoning is often

justified in one of two ways:

1. The system is assumed to be linear by nature or it at least operates in a linear
region of its capabilities so that the linearity conditions are satisfied.
2. The system is assumed to be nonlinear by nature and in order to apply linear
systems theory we conceptualize a linearization around a nominal operating
point. With this approach precautions must be taken to ensure the domain of
the linear assumption is not exceeded.

Why is the use of linear systems theory desirable? In addition to the availability

of a well-studied set of tools, most often a description of a system via the impulse










response and transfer function is desired. These functional system descriptions are

established by employing the theory of linear systems [DeC89].

Consider a linear time-invariant (LTI) system having input x(t) and output d(t).

The impulse response, h(t), is defined as the output when the input is a unit impulse

function 8(t). The transfer function, H(s), is defined as the Laplace transform of h(t)

with zero initial conditions. This is equivalent to assuming h(t) = 0 for t < 0 (causal).

The above definition of the transfer function is in terms of the impulse response.

However a description of the transfer function taking a different, more mathematical

approach is also available. In practice, a large class of signals and systems in

engineering can be formulated through the use of differential equations. The LTI,

causal assumptions equivalently translate into the requirement that systems be

representable by linear, constant coefficient, ordinary differential equations. Namely,

an nth order system can be written as

    d^{(n)}(t) + a_{n-1} d^{(n-1)}(t) + \cdots + a_1 d^{(1)}(t) + a_0 d(t) =
        b_m x^{(m)}(t) + b_{m-1} x^{(m-1)}(t) + \cdots + b_1 x^{(1)}(t) + b_0 x(t)    (2.1)

where a_0, a_1, \ldots, a_{n-1}, b_0, b_1, \ldots, b_m are all real constants with n ≥ m. Systems

described by ordinary differential equations of this form are said to contain lumped

parameters rather than distributed parameters [Kuo87]. Partial differential equations

such as the elliptic equations, hyperbolic equations, and parabolic equations describe

phenomena such as fluid flow, wave propagation, and heat conduction all of which

are considered to be distributed parameter systems [Pow79].







To obtain the transfer function of an LTI system described by the differential

equation above assume zero initial conditions (causality) and take the Laplace

transform

    H(s) = \frac{D(s)}{X(s)} = \frac{b_m s^m + b_{m-1} s^{m-1} + \cdots + b_1 s + b_0}{s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + a_0}    (2.2)

Rewriting this in factored form gives

    H(s) = A \frac{\prod_{i=1}^{m}(s - z_i)}{\prod_{i=1}^{n}(s - p_i)}    (2.3)

and in partial fraction form (assuming no repeated poles)

    H(s) = \sum_{i=1}^{n} \frac{A_i}{s - p_i}    (2.4)

To obtain the impulse response, take the inverse Laplace transform. Assuming no

coincident poles this yields


    h(t) = \mathcal{L}^{-1}\{H(s)\} = \sum_{i=1}^{n} A_i e^{p_i t}    (2.5)


So the LTI, causal assumption leads to the following result: LTI causal systems

can be represented by constant coefficient ordinary differential equations whose

solutions (impulse responses) are comprised of a sum of exponential functions that

are real (when p_i is real) and/or complex (when p_i is complex). If some of the poles

are repeated then there are terms with multiplicative factors t, t^2, t^3, \ldots, t^{k-1} for a kth

order pole. This representation is the motivation for the modeling structure

considered in this thesis.
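To make this result concrete, the partial fraction coefficients of (2.4) and the exponential sum of (2.5) can be computed numerically from the polynomials of a transfer function. A minimal sketch (assuming SciPy and distinct poles; the example system is arbitrary):

    import numpy as np
    from scipy.signal import residue

    # example: H(s) = (s + 3) / (s^2 + 3s + 2), poles at -1 and -2
    num = [1.0, 3.0]
    den = [1.0, 3.0, 2.0]
    A, p, _ = residue(num, den)          # residues A_i and poles p_i of (2.4)

    def h(t):
        # impulse response h(t) = sum_i A_i exp(p_i t) of (2.5)
        return np.real(sum(Ai * np.exp(pi * t) for Ai, pi in zip(A, p)))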








2.2 Realizability

It was shown that the LTI, causal system assumption leads to a transfer function

representation of the form



    H(s) = A \frac{\prod_{i=1}^{m}(s - z_i)}{\prod_{i=1}^{n}(s - p_i)}    (2.6)

An answer is now given to the following question "What are the constraints on

this form that ensure it is a description of a physically realizable system?" The

realizability requirement of H(s) is satisfied with the following constraints:

1. The constant A must be real.
2. The number of poles, n, must be greater than the number of zeros, m.
3. Complex poles and zeros must occur in conjugate form. That is, if p_i is
complex then there must be a p_j, for some j, such that p_j = \bar{p}_i, where the bar
denotes complex conjugation.

When H(s) is realizable, it is easy to show that the impulse response is represented

by a weighted sum of exponentials and exponentially decaying sinusoids.


2.3 L2 Signals and Systems

The final constraint imposed is that the signals must belong to the vector space

L2(R), the square-integrable functions. Here integrable is meant in the Lebesgue

sense [Deb90]. Assume the signal x ∈ L^2(R) and x: R → R; then

    \int_{-\infty}^{\infty} x^2(t)\,dt < \infty    (2.7)

Hence it is said that x has finite energy. Let h(t) be the impulse response of a system.

Since h(t) = 0 for t < 0, and h ∈ L^2(R), then

    \int_0^{\infty} h^2(t)\,dt < \infty    (2.8)

In this case it is said that h is a stable system. So by operating in L^2(R) a restriction is

imposed to work only with finite energy signals and stable systems.


2.4 System Synthesis

Recall the system realizability requirement results in an impulse response

representation that is a weighted sum of exponentials and exponentially decaying

sinusoids. From the transfer function perspective then a system can be thought of as

consisting of a cascade of 1st and 2nd order subsystems. The most general forms of

these subsystem structures are

    H_1(s) = \frac{D_1(s)}{X(s)} = \frac{b_{10}}{s + a_{10}}
                                                                         (2.9)
    H_2(s) = \frac{D_2(s)}{X(s)} = \frac{b_{21} s + b_{20}}{s^2 + a_{21} s + a_{20}}

resulting in differential equations

    \dot{d}_1(t) + a_{10} d_1(t) = b_{10} x(t)
                                                                         (2.10)
    \ddot{d}_2(t) + a_{21} \dot{d}_2(t) + a_{20} d_2(t) = b_{21} \dot{x}(t) + b_{20} x(t)

During this thesis, examples will be given whereby simulated systems are created.

For such examples, the synthesized systems will consist of these 1st and 2nd order

subsystem blocks. Most often slightly less general forms will be used to keep the

synthesis more in line with engineering (rather than mathematical) concepts. These

special forms convey the engineering information of usual interest: time constants,

damping ratios, and undamped natural frequencies. These subsystems have the

following transfer functions










    H_1(s) = \frac{D_1(s)}{X(s)} = \frac{1}{\tau s + 1} = \frac{1/\tau}{s + 1/\tau}
                                                                         (2.11)
    H_2(s) = \frac{D_2(s)}{X(s)} = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}

and corresponding differential equations

    \tau \dot{d}_1(t) + d_1(t) = x(t)
                                                                         (2.12)
    \ddot{d}_2(t) + 2\zeta\omega_n \dot{d}_2(t) + \omega_n^2 d_2(t) = \omega_n^2 x(t)

where
    τ = time constant
    ζ = damping ratio
    ω_n = undamped natural frequency
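For simulation, these blocks translate directly into transfer function objects that can be cascaded by polynomial multiplication. A minimal sketch (assuming SciPy; the τ, ζ, ω_n values are arbitrary examples):

    import numpy as np
    from scipy.signal import TransferFunction

    def first_order(tau):
        # H1(s) = 1 / (tau*s + 1), per (2.11)
        return ([1.0], [tau, 1.0])

    def second_order(zeta, wn):
        # H2(s) = wn^2 / (s^2 + 2*zeta*wn*s + wn^2), per (2.11)
        return ([wn ** 2], [1.0, 2.0 * zeta * wn, wn ** 2])

    def cascade(*blocks):
        # multiply numerator and denominator polynomials of the subsystems
        num, den = [1.0], [1.0]
        for n, d in blocks:
            num, den = np.polymul(num, n), np.polymul(den, d)
        return TransferFunction(num, den)

    H = cascade(first_order(0.5), second_order(0.4, 2.0))   # a 3rd order system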


2.5 Continuous/Discrete-Time Transformation

The goal of this thesis is to define a model class useful for approximating discrete-

time, continuous-time, and sampled-data systems. The above system formulation was

carried out in the continuous-time domain. The same principles however could have

been derived in the discrete-time domain. All the concepts discussed have equivalent

discrete-time representations. Sometimes a continuous-time system in analytic form

is given when a discrete-time representation is desired. This can be resolved by

sampling. In this case, the discrete-time system is called a sampled-data system.

Hence a sampling time, T, will necessarily be specified.

Having a continuous-time transfer function, there are several methods used to map

to the discrete-time domain. The most popular are the bilinear, impulse- or step-

invariant, and matched-Z transform [Str88]. There are tradeoffs among these

mappings and some work well where others do not. The most widely used is the

bilinear transform. The transformation (or mapping) is carried out as follows







    H(z) = H(s)\big|_{s = \frac{2}{T}\,\frac{z-1}{z+1}}    (2.13)

where T is the sampling interval. This is also known as Tustin's method in control

systems circles and is derived by considering the problem of numerical integration of

differential equations [Fra90]. A derivation of the bilinear transform is as follows.

Beginning with the transfer function H(s), determine its differential equation

representation. A difference equation is then derived whose solution matches the

solution of the differential equation at the sample points. The specific transformation

depends on the method with which the differential equation solution is computed.

Consider the 1st order system

    H_1(s) = \frac{D_1(s)}{X(s)} = \frac{\lambda}{s + \lambda}    (2.14)

or associated differential equation

    \dot{d}_1(t) + \lambda d_1(t) = \lambda x(t)    (2.15)

The solution, d_1(t), can be written in integral form as

    d_1(t) = \lambda \int_0^{t} [-d_1(v) + x(v)]\,dv    (2.16)

For the sampling interval of time T this can be rewritten as

    d_1(n) \equiv d_1(nT) = \lambda \int_0^{nT-T} [-d_1(v) + x(v)]\,dv + \lambda \int_{nT-T}^{nT} [-d_1(v) + x(v)]\,dv
                          = d_1(n-1) + \lambda \int_{nT-T}^{nT} [-d_1(v) + x(v)]\,dv    (2.17)

The method in which the integral is computed over the region [nT-T, nT]

determines the discrete-time representation, d(n), of d(t) and hence the transformation

used to map H(s) to H(z). Two of the most common integration schemes are Euler's








rule (backward difference method) and the trapezoidal rule. For these two cases the

following mappings result.


Table 2-1: Continuous-time to discrete-time mappings.

Integration Rule | Mapping H(s) → H(z)
Euler's          | s = (1/T)·(z-1)/z
Trapezoidal      | s = (2/T)·(z-1)/(z+1)


So the bilinear transform is derived by computing the integral over [nT-T, nT]

using the trapezoidal rule.
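In practice this mapping is a library call; a minimal sketch (assuming SciPy) applies the bilinear (Tustin) transform of (2.13) to a 1st order system:

    from scipy.signal import bilinear

    tau, T = 0.5, 0.01                    # time constant and sampling interval
    num_s, den_s = [1.0], [tau, 1.0]      # H(s) = 1 / (tau*s + 1)

    # s -> (2/T)(z-1)/(z+1); SciPy's bilinear takes the sampling rate fs = 1/T
    num_z, den_z = bilinear(num_s, den_s, fs=1.0 / T)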

In Steiglitz's work [Ste65] a more theoretical treatment and significance of the

bilinear transform is developed. In fact, it is shown that the bilinear transform is an

isomorphism connecting the continuous-time vector space L2(R) and the discrete-time

(or digital) vector space of double-ended sequences of complex numbers {x_n}_{n=-∞}^{∞} that

are square summable. This connection proves that the signal processing theories with

respect to LTI causal signals and systems are identical in both the continuous-time

domain and discrete-time domain. In this thesis, when a continuous-to-discrete

transformation is performed the bilinear transform will always be used.


2.6 Examples

Several examples of the 1st and 2nd order systems discussed in Chapter 2.4 are

shown below in Figures 2-1 to 2-4. These figures illustrate the

impulse and step response behavior that must be approximated by the class of models

proposed in this thesis.





[Figure 2-1: 1st order impulse responses, τ = 0.5, 1, 2, 3.]

[Figure 2-2: 1st order step responses, τ = 0.5, 1, 2, 3.]

[Figure 2-3: 2nd order impulse responses with ω_n = 1. a) ζ = 0.4, under damped; b) ζ = 1.0, critically damped; c) ζ = 1.5, over damped.]

[Figure 2-4: 2nd order step responses with ω_n = 1. a) ζ = 0.4, under damped; b) ζ = 1.0, critically damped; c) ζ = 1.5, over damped.]
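Curves like these are easy to regenerate; a minimal sketch (assuming SciPy and Matplotlib) reproduces the 2nd order family of Figures 2-3 and 2-4:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.signal import TransferFunction, impulse, step

    wn = 1.0
    t = np.linspace(0.0, 10.0, 1001)
    for zeta in (0.4, 1.0, 1.5):          # under, critically, over damped
        H = TransferFunction([wn ** 2], [1.0, 2.0 * zeta * wn, wn ** 2])
        _, h_t = impulse(H, T=t)          # impulse responses (Figure 2-3)
        _, g_t = step(H, T=t)             # step responses (Figure 2-4)
        plt.plot(t, h_t, label="impulse, zeta=%.1f" % zeta)
        plt.plot(t, g_t, "--", label="step, zeta=%.1f" % zeta)
    plt.xlabel("t (sec)"); plt.legend(); plt.show()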







2.7 Conclusion (System Assumptions)

In this chapter the assumptions made about the systems to be modeled have been

stated; causality, linearity, time-invariance, realizability. This discussion was to

provide the mathematical foundation and engineering principles that motivate the

development of the system models to be introduced. For the purposes of simulating

real systems, 1st and 2nd order subsystems are used as the basic building blocks. These

subsystems can be cascaded together to create high order systems. Methods were

discussed that allow for transforming continuous-time system descriptions, H(s), into

discrete-time descriptions, H(z).

A good model is one with a structure that can capture the dynamics of a system

(time constants, damping coefficients, natural frequencies) without explicit

knowledge of its parameters. For systems with large time constants (slowly decaying

impulse response) and/or small damping coefficients (oscillatory modes) a good

approximation is more difficult to achieve. The models presented in this thesis are

good approximates of systems even with these characteristics.












CHAPTER 3
MODEL FORMULATION


In this chapter our first major contribution to the problem of system modeling is

presented; namely, a comprehensive framework given by the GLFF model class.

Although many of the components have been around for quite some time it is the

unifying approach taken here that offers new insight. This new formulation

encompasses not only GLFF networks but FIR and IIR models (for the discrete-time

case) as well.


3.1 General Model Form M(p)

Recall the general system modeling configuration, shown in Figure 3-1 below

where H is the system to be approximated using the model M(p). The specific form

of the model M(p) is now discussed.



[Figure 3-1: System modeling configuration.]







p is defined as a parameter set. Its elements are vectors called the parameter

vectors of the model M(p). The parameter set contains three elements (vectors),

    p = \{N, w, \lambda\}    (3.1)

where N is the dimension vector, w is the weight vector, and λ is the time scale vector.

The model can be written in the form

    M(p) = M(\{N, w, \lambda\}) = f(N, w, H_n(\cdot, \lambda))    (3.2)

where H_n(·, λ) is called the model kernel at stage n, 0 ≤ n ≤ N_1, and N_1 is a component

of the vector N. So M(p) is a function, f, of the parameter vectors and the model

kernel. The kernel argument (·) is used to imply that the domain of operation (s or z)

is universal. When solving specific problems the notation H_n(s,λ) is used in the

continuous-time case and H_n(z,λ) in the discrete-time case.
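In code the parameter set is naturally a small container; a minimal sketch (assuming NumPy; the class and field names are ours, not part of the formulation):

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class GLFFParameters:
        N: int            # dimension: the model uses taps 0..N
        w: np.ndarray     # weight vector (w_0, ..., w_N)^T
        lam: np.ndarray   # time scale vector, written lambda in the text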


3.2 Discrete-Time Models

Consider the definitions above in the discrete-time domain. This unifying

representation can be used to illustrate many specific model structures. In this section

it is shown how to describe the FIR, IIR, and GLFF model classes using this

formulation.


3.2.1 FIR Model

In the case of the FIR models

    M(p) = M(\{N, w, \lambda\}) = \sum_{n=0}^{N} w_n H_n(z,\lambda)    (3.3)

where

    N = (N_1)
    w = (w_0, w_1, \ldots, w_N)^T
    \lambda = 0    (3.4)
    H_n(z,\lambda) = H_n(z) = \prod_{i=0}^{n} H_i(z)

The functions H_i(z) are given by

    H_i(z) = \begin{cases} 1, & i = 0 \\ z^{-1}, & i > 0 \end{cases}    (3.5)


3.2.2 IIR Model

In the case of the IIR models

    M(p) = M(\{N, w, \lambda\}) = \frac{\sum_{n=0}^{N_2} b_n H_n(z,\lambda)}{1 - \sum_{n=1}^{N_1} a_n H_n(z,\lambda)}    (3.6)

where

    N = (N_1, N_2)^T
    w = (a_1, a_2, \ldots, a_{N_1}, b_0, b_1, \ldots, b_{N_2})^T
    \lambda = 0    (3.7)
    H_n(z,\lambda) = H_n(z) = \prod_{i=0}^{n} H_i(z)

Just as in the FIR case the functions H_i(z) are given by

    H_i(z) = \begin{cases} 1, & i = 0 \\ z^{-1}, & i > 0 \end{cases}    (3.8)


3.2.3 GLFF Model

In the case of the GLFF models

    M(p) = M(\{N, w, \lambda\}) = \sum_{n=0}^{N} w_n H_n(z,\lambda)    (3.9)

where

    N = (N_1)
    w = (w_0, w_1, \ldots, w_N)^T
    \lambda = (\lambda_1, \ldots, \lambda_L)^T    (3.10)
    H_n(z,\lambda) = H_{n0}(z,\lambda) \prod_{i=0}^{n-1} H_{ni}(z,\lambda)

The functions H_{n0}(z,λ) and H_{ni}(z,λ) could be any transfer functions. In this thesis a

restriction is made to consider special cases that

1. are founded on the system model assumptions described in Chapter 2.
2. have kernels that are (typically) orthogonal.
3. have kernels that are cascaded lowpass and allpass subsystems.
4. will allow for modeling the unknown system with real and/or complex poles
via the time scale vector λ.
5. have localized feedback.
6. guarantee stability.
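As one concrete GLFF instance satisfying these restrictions, the discrete-time Gamma model (see Table 3-1 below) cascades identical first order stages with local feedback. A minimal sketch (assuming NumPy/SciPy; per Table 3-2 the time scale must satisfy 0 < λ < 2):

    import numpy as np
    from scipy.signal import lfilter

    def gamma_model_output(x, w, lam):
        # y(k) = sum_n w_n [G_n(z,lam) x](k), G(z) = lam z^-1 / (1 - (1-lam) z^-1)
        tap = np.asarray(x, dtype=float)          # tap 0: G_0(z) = 1
        y = w[0] * tap
        for wn in w[1:]:
            tap = lfilter([0.0, lam], [1.0, -(1.0 - lam)], tap)  # one Gamma stage
            y = y + wn * tap
        return y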


3.2.4 FIR, IIR, GLFF Model Block Diagrams

To compare the structures of these models examine their block diagrams in

Figures 3-2, 3-3, and 3-4. As can be seen from the FIR model, its lack of any

feedback component limits its applicability. The IIR model offers the greatest

capability but at a significant cost and loss of stability control. The GLFF model

offers a practical alternative that rivals the IIR model in performance while

maintaining simple computational procedures similar to the FIR case.










[Figure 3-2: FIR model block diagram.]

[Figure 3-3: IIR model block diagram.]

[Figure 3-4: GLFF model block diagram.]







3.3 GLFF Model Kernels

The GLFF kernels can be comprised of almost any transfer function. Some of the

most popular discrete-time and continuous-time kernels are given in Tables 3-1 and 3-3,

respectively. These tables include each kernel's common name, symbol, and nth tap

transfer function, H_n(·,λ). Most GLFF networks are derived from orthogonal signal

sets (see Appendix A). The orthogonality of a kernel is also indicated in the tables.

An important part of a GLFF network is the time scale parameter vector λ. It allows

for local feedback and is responsible for the popularity and usefulness of this entire

class of models. The dimension of the time scale vector, dim(λ), determines the

number of degrees-of-freedom a kernel offers; dim(λ) is also given in the tables.

When fitting a model to an unknown system, the parameter set must be optimally

selected to achieve the minimum integrated squared error.


3.3.1 Popular Discrete-Time GLFF Kernels

The most popular discrete-time GLFF kernels are the delay, Gamma, Laguerre,

Legendre, and Kautz. They are listed in Table 3-1, and restrictions on the time scale

parameters are given in Table 3-2. Some of the earliest discussions of these kernels

are found in the work by Young [You62a], Perez [Per88], and Principe [Pri93].


3.3.2 Popular Continuous-Time GLFF Kernels

The most popular continuous-time GLFF kernels are the Gamma, Laguerre,

Legendre, Jacobi, and Kautz. They are listed in Table 3-3, and restrictions on the time

scale parameters are given in Table 3-4. Some of the first to describe these kernels are

Lee [Lee33], Kautz [Kau54], Armstrong [Arm59], and Principe [Pri93].
















Table 3-1: Discrete-time GLFF kernels.

Kernel   | nth tap symbol | H_{n0}(z,λ)                | H_{ni}(z,λ)                  | H_n(z,λ)                                                              | orthogonal | dim(λ)
Delay    | H_n(z)         | 1                          | z^{-1}                       | z^{-n}                                                                | no         | 0
Gamma    | G_n(z,λ)       | 1                          | λz^{-1}/(1-(1-λ)z^{-1})      | [λz^{-1}/(1-(1-λ)z^{-1})]^n                                           | no         | 1
Laguerre | L_n(z,λ)       | √(1-λ²)/(1-λz^{-1})        | (z^{-1}-λ)/(1-λz^{-1})       | [√(1-λ²)/(1-λz^{-1})] [(z^{-1}-λ)/(1-λz^{-1})]^n                      | yes        | 1
Legendre | P_n(z,λ)       | √(1-λ_n²)/(1-λ_n z^{-1})   | (z^{-1}-λ_i)/(1-λ_i z^{-1})  | [√(1-λ_n²)/(1-λ_n z^{-1})] ∏_{i=0}^{n-1}(z^{-1}-λ_i)/(1-λ_i z^{-1})   | yes        | >1
Kautz    | K_n(z,λ)       | √(1-|λ_n|²)/(1-λ_n z^{-1}) | (z^{-1}-λ̄_i)/(1-λ_i z^{-1})  | [√(1-|λ_n|²)/(1-λ_n z^{-1})] ∏_{i=0}^{n-1}(z^{-1}-λ̄_i)/(1-λ_i z^{-1}) | yes        | >1


Table 3-2: Restrictions on discrete-time time scale parameters.

Kernel   | Valid time scale
Delay    | none
Gamma    | λ real, 0 < λ < 2
Laguerre | λ real, |λ| < 1
Legendre | λ_k real, λ_k = λ^k, 0 < λ < 1
Kautz    | λ_k complex, |λ_k| < 1
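The Laguerre row of Table 3-1 is a normalizing lowpass first stage followed by identical allpass stages, which maps directly onto recursive filtering. A minimal sketch (assuming SciPy; |λ| < 1 per Table 3-2):

    import numpy as np
    from scipy.signal import lfilter

    def laguerre_taps(x, lam, N):
        # tap signals u_n(k) = [L_n(z,lam) x](k), n = 0..N
        g = np.sqrt(1.0 - lam ** 2)
        u = lfilter([g], [1.0, -lam], np.asarray(x, dtype=float))    # lowpass stage
        taps = [u]
        for _ in range(N):
            u = lfilter([-lam, 1.0], [1.0, -lam], u)   # allpass (z^-1 - lam)/(1 - lam z^-1)
            taps.append(u)
        return taps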














Table 3-3: Continuous-time GLFF kernels.

Kernel   | nth tap symbol | H_{n0}(s,λ)                           | H_{ni}(s,λ)                       | H_n(s,λ)                                                                              | orthogonal | dim(λ)
Gamma    | G_n(s,λ)       | 1                                     | λ/(s+λ)                           | [λ/(s+λ)]^n                                                                           | no         | 1
Laguerre | L_n(s,λ)       | √(2λ)/(s+λ)                           | (s-λ)/(s+λ)                       | [√(2λ)/(s+λ)] [(s-λ)/(s+λ)]^n                                                         | yes        | 1
Legendre | P_n(s,λ)       | √(2λ_n)/(s+λ_n)                       | (s-λ_i)/(s+λ_i)                   | [√(2λ_n)/(s+λ_n)] ∏_{i=0}^{n-1}(s-λ_i)/(s+λ_i)                                        | yes        | 1
Jacobi   | J_n(s,λ)       | √(2λ_n)/(s+λ_n)                       | (s-λ_i)/(s+λ_i)                   | [√(2λ_n)/(s+λ_n)] ∏_{i=0}^{n-1}(s-λ_i)/(s+λ_i)                                        | yes        | >1
Kautz    | K_n(s,λ)       | √(2λ_{Rn})(s±|λ_n|)/((s+λ_n)(s+λ̄_n))  | (s-λ_i)(s-λ̄_i)/((s+λ_i)(s+λ̄_i))   | [√(2λ_{Rn})(s±|λ_n|)/((s+λ_n)(s+λ̄_n))] ∏_{i=0}^{n-1}(s-λ_i)(s-λ̄_i)/((s+λ_i)(s+λ̄_i))  | yes        | >1

(For the Kautz kernel the + sign applies to even n and the − sign to odd n; see equation (3.14).)


Table 3-4: Restrictions on continuous-time time scale parameters.

Kernel   | Valid time scale
Gamma    | λ real, λ > 0
Laguerre | λ real, λ > 0
Legendre | λ_k real, λ_k = (2k+1)λ, λ > 0
Jacobi   | λ_k real, λ_k = k + 1
Kautz    | λ_k complex, λ_k = λ_{Rk} + jλ_{Ik}, λ_{Rk} > 0








3.4 GLFF Kernel Derivations and Properties

The Laguerre, Legendre, Jacobi, and Kautz GLFF networks have the interesting

property that their kernels are orthonormal. Hence, we call these orthonormal

generalized linear feedforward (OGLFF) networks. Next these kernels will be

derived from first principles. This is done in the continuous-time domain. The

derivations will make evident their relationship to causal LTI systems. The connection

between the network structures and the mathematical description of causal LTI

systems is the primary reason they are able to approximate these systems so well.


3.4.1 Continuous-Time Laguerre Kernel

Consider the sequence {(λt)^n e^{-λt}}_{n=0}^∞. Regarding this as the solution to an ordinary

differential equation that represents a causal LTI system, the system would consist of

one pole with multiplicity n+1. Orthogonalizing this sequence yields a new sequence

{l_n(t,λ)} given by

    l_n(t,\lambda) = \sqrt{2\lambda}\, e^{-\lambda t} \sum_{k=0}^{n} \binom{n}{k} \frac{(-2\lambda t)^k}{k!}, \quad n \ge 0,\ \lambda > 0,\ t \ge 0    (3.11)

(the sum is the nth Laguerre polynomial evaluated at 2λt). Now taking the Laplace

transform gives the Laguerre model kernel

    L_n(s,\lambda) = \mathcal{L}\{l_n(t,\lambda)\} = \frac{\sqrt{2\lambda}}{s+\lambda} \left(\frac{s-\lambda}{s+\lambda}\right)^n    (3.12)
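The orthonormality of (3.11) is easy to check numerically; a minimal sketch (assuming NumPy and SciPy, whose eval_laguerre evaluates the Laguerre polynomial appearing in (3.11)):

    import numpy as np
    from scipy.special import eval_laguerre

    lam = 2.0
    t = np.linspace(0.0, 40.0, 200001)

    def l_n(n):
        # l_n(t, lam) = sqrt(2*lam) * exp(-lam*t) * L_n(2*lam*t), per (3.11)
        return np.sqrt(2.0 * lam) * np.exp(-lam * t) * eval_laguerre(n, 2.0 * lam * t)

    gram = np.array([[np.trapz(l_n(i) * l_n(j), t) for j in range(4)]
                     for i in range(4)])
    # gram is close to the 4x4 identity matrix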


3.4.2 Continuous-Time Kautz Kernel (Complex Time Scale Vector)

Consider the sequence {e^{-λ_n t}} where the λ_n's are distinct complex scalars (i.e.,

components of the time scale vector λ). Since the poles are complex they must exist

in conjugate form in order to satisfy the realizability requirement. Orthogonalizing

this sequence yields a new sequence {k_n(t,λ)} given by

    k_n(t,\lambda) = \sum_{i=0}^{n} \mathrm{Re}\{A_{ni} e^{-\lambda_i t}\}, \quad n \ge 0,\ t \ge 0,\ \mathrm{Re}\{\lambda_i\} > 0,\ \forall i    (3.13)

It is now shown how to obtain the coefficients A_{ni}. Taking the Laplace transform of

the new sequence gives the Kautz model kernel

    K_{2m}(s,\lambda) = \mathcal{L}\{k_{2m}(t,\lambda)\} = \frac{\sqrt{2\lambda_{Rm}}\,(s+|\lambda_m|)}{(s+\lambda_m)(s+\bar{\lambda}_m)} \prod_{i=0}^{m-1} \frac{(s-\lambda_i)(s-\bar{\lambda}_i)}{(s+\lambda_i)(s+\bar{\lambda}_i)}, \quad m = 0, 1, \ldots, \frac{n-1}{2}
                                                                                                    (3.14)
    K_{2m+1}(s,\lambda) = \mathcal{L}\{k_{2m+1}(t,\lambda)\} = \frac{\sqrt{2\lambda_{Rm}}\,(s-|\lambda_m|)}{(s+\lambda_m)(s+\bar{\lambda}_m)} \prod_{i=0}^{m-1} \frac{(s-\lambda_i)(s-\bar{\lambda}_i)}{(s+\lambda_i)(s+\bar{\lambda}_i)}, \quad m = 0, 1, \ldots, \frac{n-1}{2}

Express these functions in partial fraction form. Grouping each pair of time scale

elements (λ_i, λ̄_i) gives the kernel expression

    K_n(s,\lambda) = \sum_{i=0}^{m} \left[ \frac{A_{ni}/2}{s+\lambda_i} + \frac{\bar{A}_{ni}/2}{s+\bar{\lambda}_i} \right], \quad n = 2m \text{ and } n = 2m+1    (3.15)

Hence the coefficients A_{ni} are twice the residue at the pole −λ_i for the nth tap kernel.


3.4.3 Continuous-Time Kautz Kernel (Real Time Scale Vector)

Consider the sequence {e^{-λ_n t}} where the λ_n's are distinct real scalars (i.e.,

components of the time scale vector λ). Orthogonalizing this sequence results in a

new sequence {k_n(t,λ)} given by

    k_n(t,\lambda) = \sum_{i=0}^{n} A_{ni} e^{-\lambda_i t}, \quad n \ge 0,\ t \ge 0,\ \lambda_i > 0,\ \forall i    (3.16)

It is now shown how to obtain the coefficients A_{ni}. Taking the Laplace transform of

the new sequence gives the Kautz model kernel

    K_n(s,\lambda) = \mathcal{L}\{k_n(t,\lambda)\} = \frac{\sqrt{2\lambda_n}}{s+\lambda_n} \prod_{i=0}^{n-1} \frac{s-\lambda_i}{s+\lambda_i} = \sum_{i=0}^{n} \frac{A_{ni}}{s+\lambda_i}    (3.17)

where the last expression is simply the partial fraction expansion of the product form

of K_n(s,λ). Hence A_{ni} is the residue of the pole at −λ_i for the nth tap kernel. It will be

real since the time domain function k_n(t,λ) is real.
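The orthogonalization step can be mimicked numerically by Gram-Schmidt on sampled exponentials; a minimal sketch (assuming NumPy; the λ_i values are arbitrary examples):

    import numpy as np

    lam = [1.0, 2.0, 3.0]                     # distinct real time scale elements
    t = np.linspace(0.0, 20.0, 200001)

    basis = []
    for li in lam:
        v = np.exp(-li * t)                   # raw element e^{-lam_i t}
        for k in basis:                       # remove projections onto earlier k_n
            v = v - np.trapz(v * k, t) * k
        v = v / np.sqrt(np.trapz(v * v, t))   # normalize in L^2[0, inf)
        basis.append(v)                       # sampled version of k_n(t, lam)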


3.4.4 Continuous-Time Legendre and Jacobi Kernels

Consider again the sequence {e^{-λ_n t}} where the λ_n's are distinct real scalars. This is

the same sequence from which the Kautz kernel with a real time scale vector is

derived. This kernel is described by equation (3.17). Now consider the kernel with a

"special" time scale parameter. Namely, if λ = (λ_0, λ_1, ..., λ_n)^T with λ_k = (2k+1)λ, λ > 0,

then we obtain the Legendre kernel. Likewise, if λ_k = k + 1 we obtain the Jacobi

kernel.


3.5 Impulse and Step Response Modeling

The models discussed in this chapter can accurately represent any signal or system

belonging to L^2(R). We will be most interested in modeling causal LTI systems using

sampled step response data. In this case, a step function is applied to a continuous-

time system to generate its step response. The step response is sampled to obtain a

discrete-time representation of the system. Optimal continuous-time and discrete-

time GLFF models are determined using this data (optimization will be discussed in

Chapter 4). For example, using an N+1 tap Kautz filter the approximating transfer

function is

    H(z) \approx \hat{H}(z) = \sum_{n=0}^{N} w_n K_n(z,\lambda)    (3.18)

Suppose the discrete-time impulse response is desired. How can it be computed?

Three simple methods are now given.

1. Using the discrete-time model \hat{H}(z), pass the impulse x(k) = δ(k) through this

model to generate an approximation to h(k).

2. Since x(t) is a unit step then X(s) = 1/s. If g(t) is the step response of the system

then g(t) = h(t) * x(t) or G(s) = H(s)/s. So

    H(s) = sG(s) = D(s)G(s)    (3.19)

or

    \hat{H}(z) = D(s)G(s)\big|_{s=\frac{2}{T}\frac{z-1}{z+1}} = D(z)G(z), \quad \text{where } D(z) = \frac{2}{T}\,\frac{z-1}{z+1}    (3.20)

Using the sampled step response data g(k), pass this through D(z) to generate h(k).

3. Since h(t) = dg(t)/dt, use a numerical approximation to the derivative (backward

difference perhaps) to obtain h(k). In this case

    \left.\frac{dg(t)}{dt}\right|_{t=kT} \approx \frac{g(kT) - g((k-1)T)}{T} = \frac{g(k) - g(k-1)}{T}    (3.21)
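Method 3 is a one-liner in practice; a minimal sketch (assuming NumPy) recovers h(k) from sampled step response data g(k) via (3.21):

    import numpy as np

    def impulse_from_step(g, T):
        # h(k) ~ (g(k) - g(k-1)) / T, the backward difference of (3.21)
        g = np.asarray(g, dtype=float)
        h = np.empty_like(g)
        h[0] = g[0] / T        # takes g(-1) = 0: causal system, step applied at k = 0
        h[1:] = np.diff(g) / T
        return h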


3.6 Conclusion (Model Formulation)

This chapter has dealt with structures of models useful for approximating causal

LTI systems. The generic model M(p) was stated. It was shown how this formulation

can be used to represent FIR, IIR, and GLFF networks. Several GLFF kernels were

derived, and discussion was provided to illustrate their connection to the assumptions

of the systems under study. Tables of discrete-time and continuous-time GLFF model









kernels were included. These kernels are the most popular in the literature and as the

derivations of the Laguerre, Kautz, Legendre, and Jacobi revealed they are not

theoretical abstractions, but orthogonalized versions of solutions to differential

equation representations of several important classes of systems. Now that the model

structure M(p) has been defined, optimization of the parameter set, p, becomes the

focus.












CHAPTER 4
OPTIMAL PARAMETER SELECTION


In this chapter the selection of the optimal (or near optimal) parameter set for

GLFF networks is discussed. Recall p = {N, w, λ} where N is the dimension vector,

w is the weight vector, and λ is the time scale vector. Optimization for these three

vectors is handled separately below. In fact, the ability to separate the parameters and

their optimization techniques is another benefit of the GLFF formulation.


4.1 Model Dimension

Consider a GLFF model consisting of a set of time domain basis functions

{φ_n(t,λ)}_{n=0}^{N} with frequency representation {Φ_n(s,λ)}_{n=0}^{N}. The model structure is a

weighted sum of these basis functions (kernels). In the s-domain

    M(p) = \frac{Y(s)}{X(s)} = \sum_{n=0}^{N} w_n \Phi_n(s,\lambda) = w^T \Phi(s,\lambda)    (4.1)

where

    w = (w_0, w_1, \ldots, w_N)^T
    \Phi(s,\lambda) = (\Phi_0(s,\lambda), \Phi_1(s,\lambda), \ldots, \Phi_N(s,\lambda))^T    (4.2)

In the time domain

    y(t) = w^T [\phi(t,\lambda) * x(t)] = [w^T \phi(t,\lambda)] * x(t)    (4.3)

where

    \phi(t,\lambda) = (\phi_0(t,\lambda), \phi_1(t,\lambda), \ldots, \phi_N(t,\lambda))^T    (4.4)

In this thesis, the theory will focus on modeling the impulse response of a

continuous-time system. Given sampled step response data, continuous-time and

discrete-time GLFF networks will be constructed to approximate the desired system.

Hence the input, x(t), will be considered an impulse and the desired signal, d(t), will

be the system impulse response.

The selection of N is now discussed. N is a scalar for the GLFF model class.

Suppose N is a fixed finite positive integer. Then d(t) can be approximated by

$d(t) \approx y(t) = \sum_{n=0}^{N} w_n \phi_n(t,\lambda)$   (4.5)


4.1.1 Completeness

To ensure that $y(t) \to d(t)$ as $N \to \infty$ the property of completeness is employed. A

set of functions $\{\phi_n(t,\lambda)\}_{n=0}^{\infty}$ is called complete [Su71] on the space $L^2([a,b])$ if there

exists no function $d(t) \in L^2([a,b])$ such that

$\int_a^b d(t)\,\phi_n(t,\lambda)\,dt = 0, \quad \forall n = 0,1,2,\ldots$   (4.6)

An equivalent definition is as follows. Given a piecewise continuous function

$d(t) \in L^2([a,b])$ and an $\epsilon > 0$, there exists an integer $N > 0$ such that

$\int_a^b \big(d(t) - y(t)\big)^2\,dt < \epsilon$   (4.7)







where y(t) is given by equation (4.5). All of the GLFF kernel sets are complete on

$[0,\infty)$. An example of an incomplete set of orthonormal functions $\{\phi_n(t,\lambda)\}_{n=0}^{\infty}$

defined on $[0,2\pi)$ is

$\phi_n(t,\lambda) = \phi_n(t) = \sin(nt)$   (4.8)

The function $d(t) = \cos(mt)$ is nonzero on $[0,2\pi)$; however, it is orthogonal to

$\{\phi_n(t,\lambda)\}_{n=0}^{\infty}$ on this interval. Namely,

$\int_0^{2\pi} \cos(mt)\,\phi_n(t,\lambda)\,dt = 0, \quad \forall n,m$   (4.9)


4.1.2 Orthogonal Projection

When $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ is an orthonormal set then

$y(t) = \sum_{n=0}^{N} w_n \phi_n(t,\lambda)$   (4.10)

spans an (N+1)-dimensional Hilbert space, S (see Appendix B). The Orthogonal

Projection Theorem states that every d(t) in a Hilbert space H can be

decomposed as $d(t) = y(t) + e(t)$ where $y(t) \in S$ and $e(t) \in S^{\perp}$. $S^{\perp}$ is the orthogonal

complement of S and $H = S \oplus S^{\perp}$. So y(t) can be considered as the orthogonal

projection of $d(t) \in H$ onto the space S. Namely

$y(t) = \mathrm{Proj}_S(d(t))$   (4.11)

and $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ is an orthonormal basis of S.









4.1.3 "Optimal" Selection of N

Completeness is important because it guarantees that if the approximation of d(t)

using N+1 weighted terms of $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ is not within a prescribed tolerance then

more terms can be included to eventually reach the tolerance requirement. Increasing

N may not improve the approximation of d(t) when a set of incomplete functions is

used.

Having the completeness property provides the following choices when a GLFF

network realization for a given N is inadequate:

1. Increase N until the prescribed tolerance is reached.
2. Select other kernels from the GLFF set and examine the approximation error.

For a given N, if none of the GLFF kernels chosen provide satisfactory results

then N must be increased. If this is done using several kernels, the rate of convergence

for each kernel as N is increased could be monitored. Choose from this set the kernel

that yields the minimum N needed to achieve the required tolerance. Other options

are to either raise the tolerance level or choose a different performance criterion. One

suggestion for setting the tolerance is to choose the acceptable error power to be a

certain percent (5% perhaps) of the system power.

Davila and Chiang [Dav95] suggest another approach to estimating the dimension

vector N = (N1, N2) for a general IIR structure. This method examines the

eigenvectors of the input/output data covariance matrix. When the model orders are

overestimated, zeros will appear in the noise subspace eigenvectors. The number of

zeros found can be used to estimate the model orders (dimension vector components).

This procedure could be applied to GLFF networks.







4.2 Weight Vector

One of the greatest advantages to working with a feedforward model is that the

calculation of the optimal weight vector w (in the squared error sense) is simple. This

is true even for GLFF structures. Although the realizability requirement for GLFF

networks demands the weight vector be real, the optimal solution for w is now

derived assuming w, d(t), and $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ are generally complex. Realizable optimal

solutions are obtained by setting the imaginary components equal to zero.


4.2.1 Optimal Selection of w

Assume d(t) is the impulse response of the desired system (i.e., d(t) = h(t)) and

approximate it with y(t) given by


$y(t) = \sum_{n=0}^{N} w_n \phi_n(t,\lambda)$   (4.12)


The parameter set p = {N, w, λ} is to be chosen so the error $e(t) = d(t) - y(t)$ is

minimum in the integrated squared error sense. Namely, choose p to minimize


$J = \int_0^{\infty} |d(t) - y(t)|^2\,dt$
$\;\;= \int_0^{\infty} |d(t)|^2\,dt - \sum_{n=0}^{N} w_n \int_0^{\infty} d^*(t)\,\phi_n(t,\lambda)\,dt$   (4.13)
$\;\;\;\;- \sum_{n=0}^{N} w_n^* \int_0^{\infty} d(t)\,\phi_n^*(t,\lambda)\,dt + \sum_{n=0}^{N} |w_n|^2 \int_0^{\infty} |\phi_n(t,\lambda)|^2\,dt$

If $w_n = a_n + jb_n$, set $\frac{\partial J}{\partial a_n} = 0$ and $\frac{\partial J}{\partial b_n} = 0$ to find the $w_n$ that minimizes J. Hence

$\frac{\partial J}{\partial a_n} = -\int_0^{\infty} d^*(t)\,\phi_n(t,\lambda)\,dt - \int_0^{\infty} d(t)\,\phi_n^*(t,\lambda)\,dt + 2a_n \int_0^{\infty} |\phi_n(t,\lambda)|^2\,dt = 0$   (4.14)







so


$a_n = \dfrac{\int_0^{\infty} \mathrm{Re}\big[d(t)\,\phi_n^*(t,\lambda)\big]\,dt}{\int_0^{\infty} |\phi_n(t,\lambda)|^2\,dt}$   (4.15)

When $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ is an orthonormal set, the denominator integral equals unity, yielding

$a_n = \int_0^{\infty} \mathrm{Re}\big[d(t)\,\phi_n^*(t,\lambda)\big]\,dt$   (4.16)

Likewise $b_n$ can be derived:

$\frac{\partial J}{\partial b_n} = -j\int_0^{\infty} d^*(t)\,\phi_n(t,\lambda)\,dt + j\int_0^{\infty} d(t)\,\phi_n^*(t,\lambda)\,dt + 2b_n \int_0^{\infty} |\phi_n(t,\lambda)|^2\,dt = 0$   (4.17)

so

$b_n = \dfrac{\int_0^{\infty} \mathrm{Im}\big[d(t)\,\phi_n^*(t,\lambda)\big]\,dt}{\int_0^{\infty} |\phi_n(t,\lambda)|^2\,dt}$   (4.18)

Again, if $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ is orthonormal then equation (4.18) reduces to

$b_n = \int_0^{\infty} \mathrm{Im}\big[d(t)\,\phi_n^*(t,\lambda)\big]\,dt$   (4.19)

Hence, combining the above results gives


$w_n = a_n + jb_n = \int_0^{\infty} d(t)\,\phi_n^*(t,\lambda)\,dt$   (4.20)

When $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ is real then $w_n$ is real and can be written as

$w_n = \int_0^{\infty} d(t)\,\phi_n(t,\lambda)\,dt$   (4.21)







So


$\mathbf{w} = \int_0^{\infty} d(t)\,\boldsymbol{\phi}(t,\lambda)\,dt$   (4.22)

is the weight vector that minimizes the energy of e(t). From this point forward only

real weight vectors will be considered.
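As a numerical illustration of equation (4.21), the following sketch (hypothetical code, assuming samples of d(t) on a grid t and a callable phi(n, t, lam) that evaluates the n-th orthonormal kernel) approximates the weights by quadrature:

    import numpy as np

    def glff_weights(d, t, phi, N, lam):
        # w_n = integral of d(t) phi_n(t, lam) over [0, inf), eq. (4.21),
        # approximated with the trapezoidal rule on the sample grid
        return np.array([np.trapz(d * phi(n, t, lam), t) for n in range(N + 1)])

    def glff_reconstruct(w, t, phi, lam):
        # y(t) = sum_n w_n phi_n(t, lam), the orthogonal projection of d(t)
        return sum(wn * phi(n, t, lam) for n, wn in enumerate(w))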


4.2.2 Relationship to Wiener-Hopf Equations

It is easy to show that w is in fact the solution to the Wiener-Hopf (or Normal)

equations. The performance function can be expanded as follows


$J = \int_0^{\infty} e^2(t)\,dt = \int_0^{\infty} \big(d(t) - y(t)\big)^2\,dt = \int_0^{\infty} \big(d(t) - \mathbf{w}^T\boldsymbol{\phi}(t,\lambda)\big)^2\,dt$
$\;\;= \int_0^{\infty} d^2(t)\,dt - 2\mathbf{w}^T \int_0^{\infty} d(t)\,\boldsymbol{\phi}(t,\lambda)\,dt + \mathbf{w}^T \left(\int_0^{\infty} \boldsymbol{\phi}(t,\lambda)\,\boldsymbol{\phi}^T(t,\lambda)\,dt\right) \mathbf{w}$   (4.23)

Now define the following terms:

$d_e = \int_0^{\infty} d^2(t)\,dt$
$\mathbf{p} = \int_0^{\infty} d(t)\,\boldsymbol{\phi}(t,\lambda)\,dt$   (4.24)
$\mathbf{R} = \int_0^{\infty} \boldsymbol{\phi}(t,\lambda)\,\boldsymbol{\phi}^T(t,\lambda)\,dt$

Then the performance function J becomes

$J = d_e - 2\mathbf{w}^T\mathbf{p} + \mathbf{w}^T\mathbf{R}\mathbf{w}$   (4.25)

To minimize this function take the gradient of J with respect to the weight vector w

and equate to zero

$\nabla_{\mathbf{w}} J = -2\mathbf{p} + 2\mathbf{R}\mathbf{w} = \mathbf{0}$   (4.26)








Rw = p (4.27)

These are known as the Wiener-Hopf or Normal equations. Solving for w gives the

optimal solution

$\mathbf{w} = \mathbf{R}^{-1}\mathbf{p}$   (4.28)

Now the OGLFF kernels $\{\phi_n(t,\lambda)\}_{n=0}^{N}$ form an orthonormal set, so

$\mathbf{R} = \int_0^{\infty} \boldsymbol{\phi}(t,\lambda)\,\boldsymbol{\phi}^T(t,\lambda)\,dt = \mathbf{I}$   (4.29)

Hence the weight vector becomes

$\mathbf{w} = \mathbf{p} = \int_0^{\infty} d(t)\,\boldsymbol{\phi}(t,\lambda)\,dt$   (4.30)

which is the solution derived in the previous section.


4.2.3 Minimum Squared Error Performance Function

Now that the optimal weight vector has been derived, the minimum squared error

can be computed. Recall

$J = d_e - 2\mathbf{w}^T\mathbf{p} + \mathbf{w}^T\mathbf{R}\mathbf{w}$   (4.31)

For the optimal weights it was found that (using OGLFF kernels) w = p and R = I.

Plugging these into the equation for J yields

$J_{opt} = d_e - \mathbf{w}^T\mathbf{w}$   (4.32)

The above optimal weight vector w and minimum squared error Jopt was derived using

a fixed time scale λ. Notice that w and $J_{opt}$ are functions of λ. Hence $J_{opt}$ can be

written

$J_{opt}(\lambda) = d_e - \mathbf{w}^T(\lambda)\,\mathbf{w}(\lambda)$   (4.33)







to illustrate the influence of λ on the optimal solution. Appendix C presents a

derivation of equation (4.32) in the s-domain.


4.2.4 Computing w Using Step Response Data

In Chapter 5 applications of the GLFF networks are given. The systems there are

continuous-time systems. The available information for these systems is sampled step

response data. GLFF networks will be used to approximate the input/output step

response behavior. Methods for computing the optimal weight vector w using

sampled step response data are now given.

Least Squares Solution. [Hay91] Consider the model dimension N and time scale

vector A fixed or previously chosen to be optimal. Discussion of model dimension

was given in Chapter 4.1. Discussion of the optimal time scale will be given in

Chapter 4.3.

Since the information is sampled (discrete-time) step response data the following

definitions are needed. Let the input, x(t), be the unit step and g(t) be the response of

the system in question. If the step response is sampled at interval T then define

g(k) = g(kT), k=0,1, ..., K as the desired signal. Define also

xi(k) = kth sample at the ith tap of the GLFF network (4.34)

y(k) = kth output of the GLFF network

In this discrete-time case, the matrix R and vector p are given by

$\mathbf{R} = [r_{ij}] \quad\text{where}\quad r_{ij} = \langle x_i, x_j\rangle = \sum_{k=0}^{K} x_i(k)\,x_j(k)$   (4.35)


and










$\mathbf{p} = (p_0, p_1, \ldots, p_N)^T \quad\text{where}\quad p_i = \langle g, x_i\rangle = \sum_{k=0}^{K} g(k)\,x_i(k)$   (4.36)

This gives the normal equations $\mathbf{R}\mathbf{w} = \mathbf{p}$ with solution $\mathbf{w} = \mathbf{R}^{-1}\mathbf{p}$. This is the Wiener-

Hopf solution. Notice in this case that R will not be the identity matrix. Only when

the input is an impulse (or white noise) will R = I. Nevertheless, the Wiener-Hopf

solution provides the optimal weight vector (for fixed N and A) regardless of the

input.
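In code, the computation reduces to two matrix products and a linear solve. The sketch below is hypothetical (it assumes X is the (K+1) x (N+1) matrix whose columns are the tap signals x_i(k) obtained by running the unit step through the GLFF filter bank):

    import numpy as np

    def wiener_hopf_weights(X, g):
        # R: r_ij = <x_i, x_j>, eq. (4.35); p: p_i = <g, x_i>, eq. (4.36)
        R = X.T @ X
        p = X.T @ g
        # Solve Rw = p directly; numerically preferable to forming R^-1
        return np.linalg.solve(R, p)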

Integral Equation Solution. [Clu91] A more direct evaluation of the weight vector

solution given in Chapter 4.2.1 is now considered. Recall


$w_n = \int_0^{\infty} d(t)\,\phi_n(t,\lambda)\,dt, \quad n = 0,1,\ldots,N$   (4.37)

is the optimal weight solution assuming an orthonormal basis. For the system

modeling problem this is accomplished by applying an impulse (or white noise) to

the system, in which case d(t) = h(t), the impulse response. Consider, for example, the

Laguerre kernels given by $\{\phi_n(t,\lambda)\}_{n=0}^{N} = \{l_n(t,\lambda)\}_{n=0}^{N}$. Some of their properties can be

utilized to derive a simplified integral equation solution. Again work with g(k), k=0,1

,..., K samples of the continuous-time step response g(t). Since the continuous-time

impulse response is h(t) = dg(t)/dt, the optimal weights can be written as


$w_n = \int_0^{\infty} h(t)\,l_n(t,\lambda)\,dt$
$\;\;\;= \int_0^{\infty} \frac{dg(t)}{dt}\,l_n(t,\lambda)\,dt$   (4.38)
$\;\;\;= g(t)\,l_n(t,\lambda)\Big|_0^{\infty} - \int_0^{\infty} g(t)\,\frac{d\,l_n(t,\lambda)}{dt}\,dt$







where integration by parts has been used. For a stable system it must have

$|g(\infty)| < \infty$. Also using the property of the Laguerre functions [Par71]

$l_n(t,\lambda) = \sqrt{2\lambda}\,\frac{e^{\lambda t}}{n!}\,\frac{d^n}{dt^n}\big(t^n e^{-2\lambda t}\big)$   (4.39)

yields

$\lim_{t\to\infty} l_n(t,\lambda) = 0$   (4.40)

Hence


$w_n = -\int_0^{\infty} g(t)\,\frac{d\,l_n(t,\lambda)}{dt}\,dt$   (4.41)

Let $T_{ss}$ be the steady state response time of g(t); namely, $T_{ss}$ is the time for which, for all

$t > T_{ss}$, the output can be considered constant, $g(t) = g_{ss}$. In this case the

integral can be broken up as

$w_n = -\int_0^{T_{ss}} g(t)\,\frac{d\,l_n(t,\lambda)}{dt}\,dt - g_{ss}\int_{T_{ss}}^{\infty} \frac{d\,l_n(t,\lambda)}{dt}\,dt$   (4.42)
$\;\;\;= -\int_0^{T_{ss}} g(t)\,\frac{d\,l_n(t,\lambda)}{dt}\,dt + g_{ss}\,l_n(T_{ss},\lambda)$

From the Laguerre function property above it is easy to compute the derivative

relationship

$\frac{d\,l_n(t,\lambda)}{dt} = -\lambda\,l_n(t,\lambda) - 2\lambda\sum_{i=0}^{n-1} l_i(t,\lambda)$   (4.43)

The weight computation becomes

$w_n = 2\lambda\sum_{i=0}^{n-1}\int_0^{T_{ss}} g(t)\,l_i(t,\lambda)\,dt + \lambda\int_0^{T_{ss}} g(t)\,l_n(t,\lambda)\,dt + g_{ss}\,l_n(T_{ss},\lambda)$   (4.44)










Up to now the continuous-time step response g(t) has been assumed. To compute w,

using the sampled step response g(k) an integral approximation technique must be

used such as Euler integration

$\int_0^{T_{ss}} g(t)\,l_n(t,\lambda)\,dt \approx \Delta\sum_{k=0}^{K} g(k)\,l_n(k\Delta,\lambda)$   (4.45)

where

$\Delta = \frac{T_{ss}}{K}$ (integration interval)   (4.46)

Finally this gives the integral equation weight solution

$w_n = 2\lambda\Delta\sum_{i=0}^{n-1}\sum_{k=0}^{K} g(k)\,l_i(k\Delta,\lambda) + \lambda\Delta\sum_{k=0}^{K} g(k)\,l_n(k\Delta,\lambda) + g_{ss}\,l_n(T_{ss},\lambda)$   (4.47)
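Equation (4.47) can be transcribed almost directly. The sketch below is hypothetical code; it builds the Laguerre functions from the identity $l_n(t,\lambda) = \sqrt{2\lambda}\,e^{-\lambda t}L_n(2\lambda t)$ (a standard form assumed here, with $L_n$ the Laguerre polynomial):

    import numpy as np
    from scipy.special import eval_laguerre

    def laguerre_fn(n, t, lam):
        # Laguerre function l_n(t, lam) = sqrt(2 lam) e^{-lam t} L_n(2 lam t)
        return np.sqrt(2 * lam) * np.exp(-lam * t) * eval_laguerre(n, 2 * lam * t)

    def laguerre_weights_from_step(g, Tss, lam, N):
        # Integral-equation weight solution, eq. (4.47)
        K = len(g) - 1
        delta = Tss / K                     # integration interval, eq. (4.46)
        t = delta * np.arange(K + 1)
        gss = g[-1]                         # steady-state value g_ss
        sums = np.array([delta * np.sum(g * laguerre_fn(i, t, lam))
                         for i in range(N + 1)])
        w = np.empty(N + 1)
        for n in range(N + 1):
            w[n] = (2 * lam * sums[:n].sum() + lam * sums[n]
                    + gss * laguerre_fn(n, Tss, lam))
        return w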



4.2.5 Adaptation of w

Note that the weights can be computed by solving a set of linear (Wiener-Hopf)

equations. This linear relationship among the weights corresponds to a convex

performance function with one minimum point. So the weight vector can be easily

solved in an adaptive manner using a gradient search technique such as Newton's

method or LMS [Wid85]. Unfortunately, this nice relationship does not hold for the

time scale vector. The performance function is (in general) non-convex with multiple stationary

points, the number of which depends upon the dimension, N, of the model used.
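For reference, a minimal LMS iteration for the weights might look as follows (hypothetical code; X holds the tap signals, d the desired samples, and mu is a user-chosen step size that must be small enough for convergence):

    import numpy as np

    def lms_weights(X, d, mu=0.01, epochs=1):
        # Gradient-descent adaptation toward the Wiener-Hopf solution
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for x_k, d_k in zip(X, d):
                e_k = d_k - w @ x_k       # instantaneous error
                w += 2 * mu * e_k * x_k   # LMS update
        return w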


4.3 Time Scale Vector

Optimization of the parameter set elements N and w has been discussed to this

point. Their solutions are fairly easily found for GLFF networks and are the only








elements of p that are required for FIR and IIR models. As has been previously

stated, the power of GLFF networks is in local feedback. This feedback is controlled

by the time scale vector. Although IIR structures also have feedback, it cannot be

separated and optimized independently from the tap-to-tap weights like the GLFF

structures.

Unfortunately, the elements of the time scale vector are not linearly related; hence,

the performance function J is usually non-convex w.r.t. these parameters. As such,

the bulk of the work in finding an optimal p involves finding an optimal λ. This

section is devoted to this issue. It begins by examining several methods that currently

exist for determining the optimal λ, either exactly or approximately. Each method

usually addresses a particular GLFF kernel, most often the Laguerre. A few papers

have considered the non-orthogonal Gamma and a few discuss optimization of the

more complex Kautz kernel. Although the exact optimal solution is sought, Kautz

has pointed out that non-optimal solutions (which may be derived in a simplified

manner) can still produce excellent results. His comments suggest that one should

not search to exhaustion for optimality if it is possible to get close with much less

work.

After the overview of optimization methods found in the literature, we propose a

new technique relying heavily on complex variable theory. This is a method that

extends the concepts of squared error analysis to the complex domain. Under certain

assumptions, the optimal time scale can be fairly easily determined by simply looking

for zero crossings.










4.3.1 Optimal Selection of A (Historical Review)

In this section a summary is given of current methods found in the literature to

either locate or approximate the optimal time scale λ. As noted above, each method

usually addresses a particular GLFF kernel, most often the Laguerre.

The methods presented in the literature can be grouped into the following

categories according to the employed strategy or desired objective:

1. Approximation via the impulse response.
2. Asymptotic solutions.
3. Achieving prescribed measurement objectives (other than minimum squared
error).
4. Matched filter approach.
5. Moment matching.
6. Satisfying necessary conditions for stationarity of the squared error function.
7. Iterative and adaptive search techniques.

Approximation via the impulse response. These methods approximate λ given a

priori knowledge about the impulse response h(t) of the system. The required

information is different for each approach. Either h(t) is needed in analytic form or

tabular form (samples of h(t) at discrete points, t = tk, k = 0,1,2, ... ). These methods

appeal to the necessary requirement for all stable and realizable linear systems;

namely, the transfer function H(s) must be representable as a rational function with no

right-half-plane poles or multiple jω-axis poles. Under this assumption, h(t) can be

represented as a sum of exponentials of the type

$\big(A_{k0} + A_{k1}t + \cdots + A_{km_k}t^{m_k}\big)e^{\lambda_k t}$   (4.48)


where








$\lambda_k = \lambda_{Rk} + j\lambda_{Ik}, \qquad \lambda_{Rk} < 0$   (4.49)

A frequency domain method for approximating λ is as follows [Su71]. Suppose

h(t) is known (or an approximation to it) and its Laplace transform H(s) is determined.

Note that the form of h(t) that is given may not produce a rational H(s). Now find a

rational approximation Ha(s). For instance use the expression for H(s) and expand it

into a Taylor series about s = 0 to obtain

$H(s) = a_0 + a_1 s + a_2 s^2 + \cdots$   (4.50)

Then a rational function can be found to approximate H(s). One way to accomplish

this is to use a Padé approximant. The poles of this approximation can be used as an

initial guess for the poles of H(s). These pole locations could be incrementally

adjusted to locally optimize.

A time domain approach is to use point matching. The basis of this technique is

attributed to Prony [Ham73]. The strategy is to force an approximate impulse

response $\hat{h}(t)$ to match h(t) at evenly spaced points. Assume h(t) is known at points

separated by the spacing At. Using the approximation

$\hat{h}(t) = \sum_{i=1}^{N} A_i e^{\lambda_i t}$   (4.51)


it is possible to create N linear equations (using the known points of h(t)) whose

solution gives the coefficients $a_0, a_1, \ldots, a_{N-1}$ of an Nth order characteristic polynomial

$x^N + a_{N-1}x^{N-1} + a_{N-2}x^{N-2} + \cdots + a_1 x + a_0 = 0$   (4.52)

Once the coefficients are known, find the roots of this polynomial, $x_1, x_2, \ldots, x_N$. They

are related to the time scale parameters by










$\lambda_i = \frac{\ln x_i}{\Delta t}$   (4.53)
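A compact sketch of this point-matching (Prony) procedure follows. This is hypothetical code; it uses a least-squares fit when more samples than unknowns are available.

    import numpy as np

    def prony_time_scales(h, N, dt):
        # Linear prediction: h[k+N] = -(a_{N-1} h[k+N-1] + ... + a_0 h[k])
        K = len(h) - N
        A = np.column_stack([h[i:i + K] for i in range(N)])
        b = -h[N:N + K]
        a = np.linalg.lstsq(A, b, rcond=None)[0]            # a_0 ... a_{N-1}
        # Roots of x^N + a_{N-1} x^{N-1} + ... + a_0 = 0, eq. (4.52)
        roots = np.roots(np.concatenate(([1.0], a[::-1])))
        return np.log(roots.astype(complex)) / dt           # eq. (4.53)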

Another time domain approach involves the use of an orthogonal signal set

[Kau54]. Assume h(t) and its first N derivatives are known. Using these functions,

generate a set of orthogonal functions $\{\theta_n(t)\}_{n=0}^{N}$ satisfying the relationships

$\theta_0(t) = h(t)$
$\theta_1(t) = h^{(1)}(t) + b_{10}\,\theta_0(t)$
$\theta_2(t) = h^{(2)}(t) + b_{21}\,\theta_1(t) + b_{20}\,\theta_0(t)$   (4.54)
$\;\vdots$
$\theta_N(t) = h^{(N)}(t) + b_{N,N-1}\,\theta_{N-1}(t) + \cdots + b_{N1}\,\theta_1(t) + b_{N0}\,\theta_0(t)$

where the $b_{ni}$'s can be found such that

$\int_0^{\infty} \theta_n(t)\,\theta_m(t)\,dt = 0, \quad n \neq m$   (4.55)

Since each $\theta_n(t)$ is a linear combination of the derivatives (from the 0th to the nth) of

h(t), $\theta_n(t)$ can be rewritten in terms of $h^{(j)}(t)$, $j = 0, 1, \ldots, n$. Replace each $h^{(j)}(t)$

by $\lambda^j$ to get the characteristic equation

$\lambda^n + a_{n(n-1)}\lambda^{n-1} + a_{n(n-2)}\lambda^{n-2} + \cdots + a_{n1}\lambda + a_{n0} = 0$   (4.56)

The roots of these equations give the time scale parameters $\lambda_{ni}$, $i = 1,2,\ldots,n$, for model

order n, for each $n = 1,2,\ldots,N$.

Asymptotic solutions. Several methods are offered that are based on power series

representations of the Laguerre model. Schetzen [Sch70], [Sch71] presents an

analytic method for determining the asymptotic optimal time scale $\lambda_\infty$ where

$\lambda_\infty = \lim_{N\to\infty} \lambda_N$   (4.57)







This is based on the region of convergence (ROC) of the power series equivalence of

the Laguerre model. Justification for $\lambda_\infty$ in place of $\lambda_N$ is made on the basis that

most engineering applications require the use of a large number of terms N in the

Laguerre expansion to achieve accurate results. In this case, $\lambda_\infty$ is viewed as a good

approximation to the optimal time scale $\lambda_N$.

Bruni [Bru64] offers another approach involving optimization around the ROC of

a Laguerre model expansion. Assuming h(t) is the impulse response of an LTI

lumped parameter system

$h(t) = \sum_{i=1}^{M} A_i e^{-p_i t}, \qquad p_i = a_i + jb_i,\; a_i > 0$   (4.58)

decompose a Laguerre model representation of h(t) into M infinite series


$h(t) = \sum_{n=0}^{\infty} w_n\,l_n(t,\lambda) = \sum_{i=1}^{M}\sum_{n=0}^{\infty} w_{ni}\,l_n(t,\lambda)$   (4.59)

where $w_{ni}$ is the nth coefficient of the expansion of the ith exponential in h(t). With

this formulation the Fourier transform of h(t) can be represented as a sum of M

infinite complex geometric series. $M_i$ gives the magnitude of the terms in the ith

infinite series. $M_i$ should be minimized for each i in order to achieve the most rapid

convergence of the series. Since $M_i$ is a function of λ, there is an optimal λ (denoted

$\lambda_i$) for each $M_i$. Choose as the optimal λ for the entire model the $\lambda_i$ that minimizes

the largest $M_i$. Methods similar to this one are also given in [War54] and [Hea55].

Achieving prescribed measurement objectives. In addition to locating the λ that

minimizes the squared error, approaches that minimize other measurement criteria

have also been considered. One such method employed when using the Laguerre









signal set $\{l_n(t,\lambda)\}_{n=0}^{N}$ is given in [Par71]. Consider the approximation of d(t) using

an N+1 tap Laguerre model

$d(t) \approx y(t) = \sum_{n=0}^{N} w_n\,l_n(t,\lambda)$   (4.60)

Now define the following quantities


ftd2(t)dt td2(t)dt
M, o and M2 0 (4.61)
Sd2(t)dt d2(t)dt
0 0

M1 is a measure of how rapidly d(t) decays. M2 is a measure of how smoothly d(t)

decays. Parks shows that the time scale λ given by λ = M1/M2 minimizes the

maximum error over the class of all signals with these measurements. The advantage

to this approach is that a complete knowledge of the signal to be approximated is not

required.
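A sketch of this computation (hypothetical code; the derivative term in M2 follows the "smoothness" interpretation assumed in the reconstruction of equation (4.61)):

    import numpy as np

    def parks_time_scale(d, t):
        # Moment measurements M1, M2 of eq. (4.61); lam = M1/M2
        energy = np.trapz(d**2, t)
        d_dot = np.gradient(d, t)
        M1 = np.trapz(t * d**2, t) / energy      # how rapidly d(t) decays
        M2 = np.trapz(t * d_dot**2, t) / energy  # how smoothly d(t) decays
        return M1 / M2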

In [Fu93] a similar approach is taken when using the discrete-time Laguerre

model. In this case, the optimal λ is found that minimizes the measurement

$J = \sum_{n=0}^{\infty} n\,w_n^2$   (4.62)

This performance measure is chosen because it linearly increases the cost of each

additional term used in the Laguerre expansion. This enforces a fast convergence

rate. The optimal λ is obtained using a discrete-time formulation of M1 and M2

above. A relationship for J is derived that depends only on M1, M2, and λ. The

optimal λ is then determined to be








S2M= -1-M2 (4.63)
2M, 1 + 4M ,M- M 2M2

Matched filter approach. In [Kuo94] a matched filter bank is used to estimate the

optimal λ for the Gamma model. Recall that $\{g_n(t,\lambda)\}_{n=0}^{N}$ is a non-orthogonal GLFF

signal set. Let d(t) be approximated by

$d(t) \approx y(t) = \sum_{n=0}^{N} w_n\,g_n(t,\lambda)$   (4.64)

A matched filter is constructed based on the optimal λ condition $\langle g_{N+1}^{\perp}(t,\lambda), d(t)\rangle = 0$,

where $g_{N+1}^{\perp}(t,\lambda)$ is the component of $g_{N+1}(t,\lambda)$ that is orthogonal to the previous basis

components. A bank of such filters is created where each filter uses a different λ. To

optimize λ over the region $(0, \lambda_{max})$, an M-bank filter structure is created where the mth

filter uses the time scale $\lambda_m = m\,\lambda_{max}/M$. When the desired signal d(t) is applied to the

filter bank, a change in sign of the M outputs will occur between filter m and m+1 for

a value of λ that satisfies the optimality condition. When there are multiple

occurrences, the associated λ for each case must be used to determine which one

achieves the minimum squared error.

Moment matching. The method presented in [Cel95] uses a Gamma model to

approximate a signal d(t). The Gamma model is conceptualized as a rotating manifold

in space. The kernels $\{g_n(t,\lambda)\}_{n=0}^{N}$ are the basis vectors that define the N+1

dimensional space of operation, and the time scale parameter λ is the feature that

allows for rotation. The optimal λ is estimated using a moment matching method.

This method attempts to estimate λ by equating a finite number of time moments of










y(t) (the output of the Gamma model) to those of d(t). Namely, it is required that

$m_n(y(t)) = m_n(d(t)),\; n = 0,\ldots,N$, where


$m_n(y(t)) = \int_0^{\infty} t^n\,y(t)\,dt$   (4.65)

Using these functions an (N+1)th order polynomial in λ, p(λ), is formulated. The roots of

p(λ) are estimates of the local minima of the squared error performance function J.

The root that minimizes J is chosen as the optimal λ.

Satisfying necessary conditions for stationarity of the squared error. Some of the

most popular methods for optimizing the time scale of the Laguerre model center

around the necessary conditions for stationarity of the squared error performance

function J. For instance, approximating d(t) with an N+1 tap Laguerre network

$d(t) \approx \sum_{n=0}^{N} w_n(\lambda)\,l_n(t,\lambda)$   (4.66)

where the weight vector components $w_n(\lambda)$ have been given an argument in λ to

emphasize the dependence of these weights on the time scale. For a fixed λ, the

squared error is minimized when the weights are computed as


$w_n(\lambda) = \int_0^{\infty} d(t)\,l_n(t,\lambda)\,dt$   (4.67)

The squared error function reduces to

$J = \int_0^{\infty} d^2(t)\,dt - \sum_{n=0}^{N} w_n^2(\lambda)$   (4.68)

To minimize J (over λ) the sum $\sum_{n=0}^{N} w_n^2(\lambda)$ must be maximized. This requires that

$\frac{d}{d\lambda}\sum_{n=0}^{N} w_n^2(\lambda) = 2\sum_{n=0}^{N} w_n(\lambda)\,\frac{d}{d\lambda}w_n(\lambda) = 0$   (4.69)







which yields the interesting relation

$w_N(\lambda)\,w_{N+1}(\lambda) = 0$   (4.70)

A proof of this assertion is given in Appendix D. Hence the optimal value of λ is

either a root of $w_N(\lambda) = 0$ or of $w_{N+1}(\lambda) = 0$. Solving for λ analytically is a tedious task

because $w_N(\lambda) = 0$ is a polynomial equation in λ of degree (N+M) if d(t) is a system with M

poles. The first proof of this (using mathematical induction) was given by Clowes

[Clo65] and was restricted to the case where d(t) had only simple poles. A more

concise and less restrictive proof was given in [Kin69]. This paper also derives

expressions for the 2nd derivative


$\frac{d^2}{d\lambda^2}\sum_{n=0}^{N} w_n^2(\lambda) = \begin{cases} \frac{1}{\lambda^2}\big[(N+1)\,w_{N+1}^2(\lambda) - N\,w_{N+1}(\lambda)\,w_{N-1}(\lambda)\big] & \text{for } w_N(\lambda) = 0 \\[4pt] \frac{1}{\lambda^2}\big[(N+2)\,w_N(\lambda)\,w_{N+2}(\lambda) - (N+1)\,w_N^2(\lambda)\big] & \text{for } w_{N+1}(\lambda) = 0 \end{cases}$   (4.71)


These equations may be used to determine whether a stationary point is a maximum

or minimum. In [Mas90] a discrete-time version of the stationarity conditions is

formulated, and the authors use numerical techniques to find the roots of the

polynomials $w_N(\lambda) = 0$ and $w_{N+1}(\lambda) = 0$. [Sil93] derives these conditions for the

Gamma model. The derivation given by Clowes was valid only for systems with

rational transfer functions. [Wan94b] extends the results to approximating any L2(R)

stable linear system. [Sil95] describes another efficient way to compute the

weights $w_N(\lambda)$ and $w_{N+1}(\lambda)$ via Taylor series and Padé approximants. This procedure

is illustrated in both continuous-time and discrete-time when approximating a system

excited by an arbitrary input.
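Numerically, condition (4.70) suggests a simple search: scan a grid of time scales and record where $w_N(\lambda)$ or $w_{N+1}(\lambda)$ changes sign. The sketch below is hypothetical code; weight_fn(d, t, n, lam) is assumed to evaluate $w_n(\lambda)$ per equation (4.67):

    import numpy as np

    def stationary_time_scales(d, t, lam_grid, N, weight_fn):
        # Stationary points of J lie near sign changes of w_N or w_{N+1}
        candidates = []
        for idx in (N, N + 1):
            vals = np.array([weight_fn(d, t, idx, lam) for lam in lam_grid])
            flips = np.where(np.diff(np.sign(vals)) != 0)[0]
            candidates.extend(lam_grid[flips])   # near roots of w_idx(lam) = 0
        return sorted(candidates)

Each candidate is then checked by evaluating J, and the minimizer is retained.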








Iterative and adaptive search techniques. When representing a system with a

discrete-time GLFF network, the elements of the time scale vector are restricted to

bounded regions to guarantee stability. In particular, the discrete-time Laguerre

model requires $\lambda \in (-1,1)$. One method of optimization is to employ exhaustive search

over this interval, seeking the λ that minimizes the squared error performance

function. Since this would be done in discrete steps, the only way to get continually

closer to the optimal λ is to increase the number of points in the search region.

[Kin77] and [Mai85] discuss minimizing the squared error by employing a Fibonacci

search technique over λ. This is an interval elimination procedure wherein the region

in which the minimum lies is sequentially reduced using a predetermined number of

steps.

[Nur87] uses an adaptive technique whereby the optimal λ is chosen to be the

center of gravity of the presumed poles of the system. The method works as follows.

Initialize λ, say $\lambda_0$. Using the Laguerre model with this λ, the parameters of a

difference equation representation of the system are estimated. From the

characteristic equation determine the poles of the estimated system. Choose as $\lambda_i$,

i = 1, 2, ..., the center of gravity of these poles. Repeat this process until $|\lambda_{i+1} - \lambda_i| < \delta$,

where δ is a small number. Once the center of gravity converges to a fixed point, stop

the adaptation.


4.3.2 Optimal Selection of A (Complex Error Analysis)

A new approach to estimate the optimal time scale is now offered. The

methodology used here is one of extending the squared error performance function to







the complex domain. This is accomplished by defining a new complex performance

function. It will be shown how this function is derived, how it relates to the

conventional squared error function, and what additional information it offers. The

optimal time scale can be determined by looking for zero crossings of the imaginary

component of the new complex performance function.

For systems consisting of a cascade of 2nd order subsystems containing pairs of

complex conjugate poles, examples will be presented that illustrate how the imaginary

component of the new complex performance function can be used to obtain the

optimal (in the minimum squared error sense) time scale parameter. These examples

will solve the system identification problem with a GLFF model containing the

Laguerre kernel.

4.3.2.1 Derivation of the complex performance function. In Appendix C an s-

domain description of the squared error performance function is discussed. It is given

by

$J = \frac{1}{2\pi j}\oint_C E(s)E(-s)\,ds$   (4.72)

where C is a contour taken in the CCW direction enclosing the entire left-hand plane

(LHP) in s, $E(s) = D(s) - \mathbf{w}^T\boldsymbol{\Phi}(s,\lambda)$, and w is a real weight vector. For a fixed time

scale λ the optimal weight vector is

$\mathbf{w} = \frac{1}{2\pi j}\oint_C D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$   (4.73)

yielding the performance function

$J = d_e - \mathbf{w}^T\mathbf{w}$   (4.74)


where









$d_e = \frac{1}{2\pi j}\oint_C D(s)D(-s)\,ds$   (4.75)

Now consider restricting evaluation of the performance function to the upper LHP

only. Namely, create the complex performance function

$J_u = \frac{1}{2\pi j}\int_{C_u} E(s)E(-s)\,ds$   (4.76)

Make the following definitions:

$d_{ue} = \frac{1}{2\pi j}\int_{C_u} D(s)D(-s)\,ds$ (complex energy)
$\mathbf{w}_u = \frac{1}{2\pi j}\int_{C_u} D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$ (complex weight vector)   (4.77)


The contour Cu is taken in the CCW direction enclosing the upper LHP in s. Now if

F(s) is a function which has real and/or complex conjugate pair poles then


$\frac{1}{2\pi j}\oint_C F(s)\,ds = 2\,\mathrm{Re}\left[\frac{1}{2\pi j}\int_{C_u} F(s)\,ds\right]$   (4.78)

hence

$\mathbf{w} = 2\,\mathrm{Re}[\mathbf{w}_u]$
$d_e = 2\,\mathrm{Re}[d_{ue}]$   (4.79)

By making the above definitions, the problem of locating the optimal λ using $J_u$ is

studied. By operating in the complex space there is information that can be used to

offer additional insight into the optimal solution of the time scale parameter λ. This

information is not available using squared error analysis because the imaginary

component of Ju is canceled when calculating J. This is due to the contribution from

the poles of D(s) in the lower LHP in s. The squared error performance function can







be considered as a mapping onto R and the complex performance function as a

mapping onto C.

A simplified expression for $J_u$ is derived as follows:

$J_u = \frac{1}{2\pi j}\int_{C_u} E(s)E(-s)\,ds$
$\;\;\;= \frac{1}{2\pi j}\int_{C_u}\big[D(s) - \mathbf{w}^T\boldsymbol{\Phi}(s,\lambda)\big]\big[D(-s) - \mathbf{w}^T\boldsymbol{\Phi}(-s,\lambda)\big]\,ds$
$\;\;\;= \frac{1}{2\pi j}\int_{C_u} D(s)D(-s)\,ds \;-\; \mathbf{w}^T\,\frac{1}{2\pi j}\int_{C_u} D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$   (4.80)
$\;\;\;\;\;\;-\; \mathbf{w}^T\,\frac{1}{2\pi j}\int_{C_u} D(-s)\,\boldsymbol{\Phi}(s,\lambda)\,ds \;+\; \mathbf{w}^T\left[\frac{1}{2\pi j}\int_{C_u}\boldsymbol{\Phi}(s,\lambda)\,\boldsymbol{\Phi}^T(-s,\lambda)\,ds\right]\mathbf{w}$

Using the definition of the complex energy, $d_{ue}$, and complex weight vector, $\mathbf{w}_u$, as

well as the following relations

$\mathbf{w}^T\left[\frac{1}{2\pi j}\int_{C_u}\boldsymbol{\Phi}(s,\lambda)\,\boldsymbol{\Phi}^T(-s,\lambda)\,ds\right]\mathbf{w} = \tfrac{1}{2}\,\mathbf{w}^T\mathbf{w}$   (4.81)

and

$\frac{1}{2\pi j}\int_{C_u} D(-s)\,\boldsymbol{\Phi}(s,\lambda)\,ds = \tfrac{1}{2}\,\mathbf{w}$   (4.82)

gives

$J_u = d_{ue} - \mathbf{w}^T\mathbf{w}_u - \tfrac{1}{2}\,\mathbf{w}^T\mathbf{w} + \tfrac{1}{2}\,\mathbf{w}^T\mathbf{w} = d_{ue} - \mathbf{w}^T\mathbf{w}_u$   (4.83)

Note that

$2\,\mathrm{Re}[J_u] = 2\,\mathrm{Re}[d_{ue}] - \mathbf{w}^T\,2\,\mathrm{Re}[\mathbf{w}_u] = d_e - \mathbf{w}^T\mathbf{w} = J$   (4.84)

which is what is expected since

$J = \frac{1}{2\pi j}\oint_C E(s)E(-s)\,ds = 2\,\mathrm{Re}\left[\frac{1}{2\pi j}\int_{C_u} E(s)E(-s)\,ds\right] = 2\,\mathrm{Re}[J_u]$   (4.85)







4.3.2.2 Complex performance function examples and experimental results. Three

test cases have been chosen to study the complex performance function, Ju. One 2nd

order system and two 4th order systems are modeled using a GLFF network with a

Laguerre kernel. The transfer function for each example follows:



Example 1: (2nd order system, rapidly decaying impulse response)


$H(s) = \frac{1}{s+2-j} + \frac{1}{s+2+j}$   (4.86)



Example 2: (4th order system, rapidly decaying impulse response with slight
oscillation)


$H(s) = \frac{1}{s+2-j} + \frac{1}{s+2+j} + \frac{1}{s+1-2j} + \frac{1}{s+1+2j}$   (4.87)



Example 3: (4th order system, slowly decaying impulse response with oscillation)


$H(s) = \frac{2}{s+0.4-j} + \frac{2}{s+0.4+j} + \frac{-3}{s+0.6-2.8j} + \frac{-3}{s+0.6+2.8j}$   (4.88)



Experimental results for each example are tabulated and graphed on the next few

pages. In Figures 4-1 and 4-2 the squared error performance function, J (solid curve),

and twice the imaginary component of the complex performance function, 2Im[Ju]

(dash curve), are plotted for the first four model orders for example 1, Figures 4-3

and 4-4 for example 2, and Figures 4-5 and 4-6 for example 3. Notice the imaginary

component has zero crossings near the stationary points of the squared error







performance function. $\mathrm{Im}[J_u]$ crosses negative-to-positive near local minima of J. So

the complex performance function provides an estimate of the optimal time scale

through the imaginary component. In fact $\mathrm{Im}[J_u]$ can be considered an estimate of

$\nabla_{\lambda} J(\lambda)$ that is best near $\lambda_{opt}$.

Numerical results are provided in Tables 4-1, 4-2, and 4-3 to illustrate how close

the nearest zero crossing (nearest ZCu) of 2Im[Ju] is to the optimal time scale as the

model order increases. Also the squared error is computed at the optimal, $J(\lambda_{opt})$, and

nearest ZCu, $J(\lambda_{nzcu})$, time scales. For this analysis, $\mathrm{Im}[J_u]$ was scaled by 2 to provide

scaling similar to J since $J = 2\,\mathrm{Re}[J_u]$; however, the zero crossings are unaffected by

this scaling. By examining $J(\lambda_{opt})$ and $J(\lambda_{nzcu})$ it is found that when using a Laguerre

network with sufficient order to approximate the unknown system, no degradation

occurs by choosing λ at the nearest ZCu. In fact the small difference in $\lambda_{opt}$ and $\lambda_{nzcu}$

is partly due to numerical approximation error when searching for minima of J and

roots of $\mathrm{Im}[J_u]$.















Figure 4-1: J for example 1 (orders = 0 to 3). [Plot of normalized squared error (dB) versus time scale λ, 0 to 5.]

Figure 4-2: J and 2Im[Ju] for example 1. [Four panels, orders 0 to 3, versus time scale λ, 0 to 5.]










Figure 4-3: J for example 2 (orders = 0 to 7). [Plot of normalized squared error (dB) versus time scale λ.]

Figure 4-4: J and 2Im[Ju] for example 2. [Four panels, orders 0 to 3, versus time scale λ.]








































Figure 4-5: J for example 3 (orders 0 to 11). [Plot of normalized squared error (dB) versus time scale λ, 1 to 6.]

Figure 4-6: J and 2Im[Ju] for example 3. [Four panels, orders 0 to 3, versus time scale λ.]













Table 4-1: Performance function characteristics for example 1.

N    λopt    λnzcu    J(λopt) in dB    J(λnzcu) in dB
0 2.47 2.48 -24.7 -24.6
1 1.60 1.58 -33.8 -33.7
2 2.31 2.31 -49.9 -49.9
3 1.92 1.92 -60.7 -60.7



Table 4-2: Performance function characteristics for example 2.

N    λopt    λnzcu    J(λopt) in dB    J(λnzcu) in dB
0 2.99 3.12 -11.6 -11.6
1 1.50 1.43 -15.6 -15.5
2 2.58 2.60 -20.8 -20.8
3 1.81 1.78 -24.8 -24.7
4 2.45 2.47 -29.3 -29.3
5 1.93 1.92 -33.3 -33.3
6 2.39 2.40 -37.7 -37.7
7 2.00 2.00 -41.8 -41.8



Table 4-3: Performance function characteristics for example 3.

N    λopt    λnzcu    J(λopt) in dB    J(λnzcu) in dB
0 0.82 0.34 -0.5 -0.2
1 2.30 2.46 -2.7 -2.7
2 1.29 1.21 -5.2 -5.1
3 2.12 2.12 -9.9 -9.9
4 2.87 2.89 -12.7 -12.7
5 3.56 3.59 -14.1 -14.1
6 4.21 4.22 -14.7 -14.7
7 4.82 4.76 -14.9 -14.9
8 2.24 2.24 -17.1 -17.1
9 2.62 2.64 -19.6 -19.5
10 2.25 2.24 -22.9 -22.9
11 2.57 2.56 -25.5 -25.5








4.3.3 Further Analysis of the Complex Performance Function

In the previous section, it was shown that the squared error and complex

performance functions can be expressed as

$J = d_e - \mathbf{w}^T\mathbf{w}$
$J_u = d_{ue} - \mathbf{w}^T\mathbf{w}_u$   (4.89)

where

$d_e = \frac{1}{2\pi j}\oint_C D(s)D(-s)\,ds \qquad\; d_{ue} = \frac{1}{2\pi j}\int_{C_u} D(s)D(-s)\,ds$
$\mathbf{w} = \frac{1}{2\pi j}\oint_C D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds \qquad \mathbf{w}_u = \frac{1}{2\pi j}\int_{C_u} D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$   (4.90)
and

D(s) = desired system
$\boldsymbol{\Phi}(s,\lambda) = \mathbf{L}(s,\lambda)$, the Laguerre kernel
C = contour enclosing the entire left-hand plane (LHP) in s
Cu = contour enclosing the upper LHP in s

It was also shown that

$d_e = 2\,\mathrm{Re}[d_{ue}]$
$\mathbf{w} = 2\,\mathrm{Re}[\mathbf{w}_u]$   (4.91)

Writing J and $J_u$ in normalized form (relative to the total energy of the system) gives

$J = (d_e - \mathbf{w}^T\mathbf{w})/d_e$
$J_u = (d_{ue} - \mathbf{w}^T\mathbf{w}_u)/d_e$   (4.92)

Decomposing $d_{ue}$ and $\mathbf{w}_u$ into their complex component forms

$d_{ue} = d_{ueR} + j\,d_{ueI}$
$\mathbf{w}_u = \mathbf{w}_{uR} + j\,\mathbf{w}_{uI}$   (4.93)

yields

$d_e = 2\,\mathrm{Re}[d_{ue}] = 2d_{ueR}$
$\mathbf{w} = 2\,\mathrm{Re}[\mathbf{w}_u] = 2\mathbf{w}_{uR}$   (4.94)







So we can write J and $J_u$ in terms of components of the complex energy signal and

complex weight vector:

$J = \big(2d_{ueR} - (2\mathbf{w}_{uR})^T(2\mathbf{w}_{uR})\big)/(2d_{ueR}) = \big(d_{ueR} - 2\mathbf{w}_{uR}^T\mathbf{w}_{uR}\big)/d_{ueR}$   (4.95)

$J_u = \big(d_{ue} - (2\mathbf{w}_{uR})^T\mathbf{w}_u\big)/(2d_{ueR}) = \big(d_{ue} - 2\mathbf{w}_{uR}^T\mathbf{w}_u\big)/(2d_{ueR})$   (4.96)

Now we are interested in $2\,\mathrm{Im}[J_u]$, twice the imaginary component of the complex

performance function. This has the expression

$2\,\mathrm{Im}[J_u] = \big(d_{ueI} - 2\mathbf{w}_{uR}^T\mathbf{w}_{uI}\big)/d_{ueR}$   (4.97)

This extension to the complex domain has shed light on new information. The

imaginary component of Ju has been shown to have roots (or zero crossings) near the

stationary points of J, one of which provides an approximation to the optimal time

scale solution.

In an effort to capitalize on this discovery, the analytic performance function, Ja,

will be defined and studied as a method of achieving the same information while

requiring less a priori knowledge of the unknown system D(s). As will be seen, Ja is

composed of an energy signal and a weight vector that are the analytic-signal

representations of the associated squared error components.

4.3.3.1 Analytic performance function. Consider the components of Ju. The

complex energy signal, due, can be written as











$d_{ue} = \frac{1}{2\pi j}\int_{C_u} D(s)D(-s)\,ds$
$\;\;\;= \frac{1}{2\pi j}\int_{-\infty}^{0} D(\sigma)D(-\sigma)\,d\sigma + \frac{1}{2\pi}\int_{0}^{\infty} D(j\omega)D(-j\omega)\,d\omega$   (4.98)
$\;\;\;= \left[\frac{1}{2\pi}\int_{0}^{\infty} D(j\omega)D(-j\omega)\,d\omega\right] - j\left[\frac{1}{2\pi}\int_{-\infty}^{0} D(\sigma)D(-\sigma)\,d\sigma\right]$

Likewise, the complex weight vector, $\mathbf{w}_u$, can be written as

$\mathbf{w}_u = \frac{1}{2\pi j}\int_{C_u} D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$
$\;\;\;= \frac{1}{2\pi j}\int_{-\infty}^{0} D(\sigma)\,\boldsymbol{\Phi}(-\sigma,\lambda)\,d\sigma + \frac{1}{2\pi}\int_{0}^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega$   (4.99)

Now collecting real and imaginary components gives


$\mathbf{w}_u = \mathrm{Re}\left[\frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega\right] + \mathrm{Re}\left[\frac{1}{2\pi j}\int_{-\infty}^{0} D(\sigma)\,\boldsymbol{\Phi}(-\sigma,\lambda)\,d\sigma\right]$
$\;\;\;+ j\,\mathrm{Im}\left[\frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega\right] + j\,\mathrm{Im}\left[\frac{1}{2\pi j}\int_{-\infty}^{0} D(\sigma)\,\boldsymbol{\Phi}(-\sigma,\lambda)\,d\sigma\right]$   (4.100)


Consider the components of $\mathbf{w}_u$ along the +jω axis only. This is equivalent to the

spectrum of the analytic signal constructed from w. Namely, for a signal f(t), the

analytic signal is specified by

$f_a(t) \equiv f(t) + j\hat{f}(t)$   (4.101)

where $\hat{f}(t) = \mathcal{H}\{f(t)\}$, $\mathcal{H}$ being the Hilbert transform. The spectrum of $f_a(t)$ is

equivalent to the spectrum of f(t) for $0 < \omega < \infty$ and zero for $-\infty < \omega < 0$. Now we

define the following functions:









$d_{ae} \equiv$ [real part of $d_{ue}$ along the +jω axis] + j[imaginary part of $d_{ue}$ along the +jω axis]
$\;\;\;= \frac{1}{2\pi}\int_0^{\infty} D(j\omega)D(-j\omega)\,d\omega + j[0]$   (4.102)
$\;\;\;= \frac{1}{2\pi}\int_0^{\infty} D(j\omega)D(-j\omega)\,d\omega$

and

$\mathbf{w}_a \equiv$ [real part of $\mathbf{w}_u$ along the +jω axis] + j[imaginary part of $\mathbf{w}_u$ along the +jω axis]
$\;\;\;= \mathrm{Re}\left[\frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega\right] + j\,\mathrm{Im}\left[\frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega\right]$   (4.103)
$\;\;\;= \frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega$

$d_{ae}$ is called the analytic energy signal and $\mathbf{w}_a$ is called the analytic weight vector. As

in the case of the complex performance function we define the analytic performance

function, $J_a$,

$J_a \equiv d_{ae} - \mathbf{w}^T\mathbf{w}_a$   (4.104)

or in normalized form

$J_a = (d_{ae} - \mathbf{w}^T\mathbf{w}_a)/d_e$   (4.105)

Now decomposing the components of $J_a$ into their complex form gives

$d_{ae} = d_{aeR} + j\,d_{aeI}$
$\mathbf{w}_a = \mathbf{w}_{aR} + j\,\mathbf{w}_{aI}$   (4.106)

where

$d_{aeR} = \frac{1}{2\pi}\int_0^{\infty} D(j\omega)D(-j\omega)\,d\omega, \qquad d_{aeI} = 0$   (4.107)


and







$\mathbf{w}_{aR} = \mathrm{Re}\left[\frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega\right]$
$\mathbf{w}_{aI} = \mathrm{Im}\left[\frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega\right]$   (4.108)

The goal is to rewrite J and Ja in terms of analytic energy signal and analytic weight

vector components only. Recall


$\mathbf{w} = \frac{1}{2\pi j}\oint_C D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$
$\;\;\;= \frac{1}{2\pi}\int_{-\infty}^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega$
$\;\;\;= \frac{1}{2\pi}\int_{-\infty}^{0} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega + \frac{1}{2\pi}\int_{0}^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega$
$\;\;\;= 2\,\mathrm{Re}\left[\frac{1}{2\pi}\int_{0}^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega\right]$   (4.109)
$\;\;\;= 2\,\mathrm{Re}[\mathbf{w}_a]$
$\;\;\;= 2\mathbf{w}_{aR}$

Likewise

$d_e = 2\,\mathrm{Re}\left[\frac{1}{2\pi}\int_0^{\infty} D(j\omega)D(-j\omega)\,d\omega\right] = 2\,\mathrm{Re}[d_{ae}] = 2d_{aeR}$   (4.110)

So we can write J and $J_a$ in terms of components of $d_{ae}$ and $\mathbf{w}_a$ as follows:

$J = \big(2d_{aeR} - (2\mathbf{w}_{aR})^T(2\mathbf{w}_{aR})\big)/(2d_{aeR}) = \big(d_{aeR} - 2\mathbf{w}_{aR}^T\mathbf{w}_{aR}\big)/d_{aeR}$   (4.111)

$J_a = \big(d_{ae} - (2\mathbf{w}_{aR})^T\mathbf{w}_a\big)/(2d_{aeR}) = \big(d_{ae} - 2\mathbf{w}_{aR}^T\mathbf{w}_a\big)/(2d_{aeR})$   (4.112)









We are interested in 2Im[Ja], twice the imaginary component of the analytic

performance function. This has expression

$2\,\mathrm{Im}[J_a] = -2\,\mathbf{w}_{aR}^T\mathbf{w}_{aI}/d_{aeR}$   (4.113)


4.3.3.2 Analytic performance function examples and experimental results. Again

consider the three examples used to study the complex performance function. Figures

4-7, 4-8, and 4-9 compare the complex and analytic performance functions. For each

of the three examples the squared error performance function, J (solid curve), twice

the imaginary component of the complex performance function, 2Im[Ju] (dash curve),

and twice the imaginary component of the analytic performance function, 2Im[Ja]

(dot-dash curve), are plotted for the first four model orders. Numerical results are

provided in Tables 4-5, 4-6, and 4-7 to illustrate how close the nearest zero crossing

(nearest ZCu) of 2Im[Ju] and the nearest zero crossing (nearest ZCa) of 2Im[Ja] are to

the optimal time scale as the model order increases. Also the squared error is

computed at the optimal, $J(\lambda_{opt})$, nearest ZCu, $J(\lambda_{nzcu})$, and nearest ZCa, $J(\lambda_{nzca})$, time

scales.

It is interesting to compare Ju and Ja for these three examples. The difference in

these functions is that $J_u$ includes the line integral along the negative σ axis. Consider

the complex energy $d_{ue}$ for each example. From Table 4-4 we find that $d_{ueI}$ is 22% of


Table 4-4: Complex energy signal characteristics.

Example    due = dueR + j·dueI    % complex
1          0.45 + 0.10j          22
2          1.98 + 0.63j          32
3          9.91 + 0.31j          3








$d_{ueR}$ for example 1, 32% for example 2, and only 3% for example 3. Observe from

Figure 4-9 that $J_u(\lambda) \approx J_a(\lambda)$ for all λ for example 3. Figures 4-7 and 4-8 show that

$J_u$ and $J_a$ are significantly different (but have close zero crossings near $\lambda_{opt}$) for

examples 1 and 2. So the line integral along the negative σ axis contributes

significantly to the evaluation of $J_u$ for examples 1 and 2 but not example 3. This

results in the similarity of $J_u$ and $J_a$ for the last case. Clearly, the relative values of

$d_{ueR}$ and $d_{ueI}$ are application dependent.

We have shown that even though Ju and Ja are different functions, their imaginary

components have roots that are similar near $\lambda_{opt}$. Since $J_a$ can be computed directly

from a sampling of d(t), it could be used to provide the optimal time scale estimate in

lieu of Ju.
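Because $d_{ae}$ and $\mathbf{w}_a$ involve only the +jω axis, $2\,\mathrm{Im}[J_a]$ can be estimated from a sampled d(t). The sketch below is hypothetical code; it assumes the continuous-time Laguerre transfer functions $\Phi_n(s,\lambda) = \sqrt{2\lambda}\,(s-\lambda)^n/(s+\lambda)^{n+1}$, a standard form not quoted from the text:

    import numpy as np

    def laguerre_tf(n, s, lam):
        # Phi_n(s, lam) for the Laguerre kernel (assumed form)
        return np.sqrt(2 * lam) * (s - lam)**n / (s + lam)**(n + 1)

    def two_im_Ja(d, t, lam, N, w_max=50.0, n_w=2000):
        # 2 Im[Ja] = -2 w_aR^T w_aI / d_aeR, eqs. (4.103), (4.107), (4.113)
        w_grid = np.linspace(1e-3, w_max, n_w)
        D = np.array([np.trapz(d * np.exp(-1j * w * t), t) for w in w_grid])
        d_aeR = np.trapz((D * np.conj(D)).real, w_grid) / (2 * np.pi)
        wa = np.array([np.trapz(D * laguerre_tf(n, -1j * w_grid, lam), w_grid)
                       for n in range(N + 1)]) / (2 * np.pi)
        return -2 * np.dot(wa.real, wa.imag) / d_aeR

Scanning λ and locating the negative-to-positive zero crossings of this quantity gives the candidate time scales discussed above.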











Figure 4-7: J, 2Im[Ju], and 2Im[Ja] for example 1. [Four panels, orders 0 to 3, versus time scale λ, 0 to 5.]


Table 4-5: Performance function characteristics for example 1.


N    λopt    λnzcu    λnzca    J(λopt) in dB    J(λnzcu) in dB    J(λnzca) in dB
0 2.47 2.48 2.48 -24.7 -24.6 -24.6
1 1.60 1.58 1.49 -33.8 -33.7 -31.1
2 2.31 2.31 2.29 -49.9 -49.9 -49.5
3 1.92 1.92 2.62 -60.7 -60.7 -54.6


Figure 4-8: J, 2Im[Ju], and 2Im[Ja] for example 2. [Four panels, orders 0 to 3, versus time scale λ.]




Table 4-6: Performance function characteristics for example 2.


N    λopt    λnzcu    λnzca    J(λopt) in dB    J(λnzcu) in dB    J(λnzca) in dB
0 2.99 3.12 3.08 -11.6 -11.6 -11.6
1 1.50 1.43 1.31 -15.6 -15.5 -14.8
2 2.58 2.60 2.19 -20.8 -20.8 -19.0
3 1.81 1.78 3.16 -24.8 -24.7 -22.5
4 2.45 2.47 2.40 -29.3 -29.3 -29.2
5 1.93 1.92 1.86 -33.3 -33.3 -32.8
6 2.39 2.40 2.14 -37.7 -37.7 -34.8
7 2.00 2.00 1.96 -41.8 -41.8 -41.5


Figure 4-9: J, 2Im[Ju], and 2Im[Ja] for example 3. [Four panels, orders 0 to 3, versus time scale λ.]


Table 4-7: Performance function characteristics for example 3.


N    λopt    λnzcu    λnzca    J(λopt) in dB    J(λnzcu) in dB    J(λnzca) in dB
0 0.82 0.34 0.55 -0.5 -0.2 -0.4
1 2.30 2.46 2.43 -2.7 -2.7 -2.7
2 1.29 1.21 1.21 -5.2 -5.1 -5.0
3 2.12 2.12 2.13 -9.9 -9.9 -9.9
4 2.87 2.89 2.89 -12.7 -12.7 -12.7
5 3.56 3.59 3.60 -14.1 -14.1 -14.1
6 4.21 4.22 4.22 -14.7 -14.7 -14.7
7 4.82 4.76 4.79 -14.9 -14.9 -14.9
8 2.24 2.24 2.22 -17.1 -17.1 -17.0
9 2.62 2.64 2.60 -19.6 -19.5 -19.6
10 2.25 2.24 2.28 -22.9 -22.9 -22.8
11 2.57 2.56 2.62 -25.5 -25.5 -25.3













4.3.4 Time Domain Synthesis Using GLFF Networks

For the discrete-time setting, one of the fundamental limitations of an FIR model

is its lack of memory depth; namely, its inability to approximate (with low order) a

system with an infinite-duration impulse response. GLFF networks do not have

this problem due to the presence of local feedback in their structure.

How well a Laguerre network approximates the impulse responses of the three

example systems given by equations (4.86), (4.87), and (4.88) is now illustrated.

Here the system impulse response, d(t), (h(t) = d(t) since the input is an impulse), the

Laguerre network output when using the optimal time scale yo(t), and the Laguerre

network output when using the nearest ZCu time scale yz(t) are plotted. In example 1

the impulse response decays rapidly and has no significant oscillation. yo(t) and yz(t)

are plotted in Figure 4-10 using a 1st order Laguerre model. In example 2 the impulse

response is also rapidly decaying but with a slight oscillation. Plots of yo(t) and yz(t)

are given in Figure 4-11 for a 5th order model. Example 3 is a system having a slowly

decaying impulse response with a strong oscillation. Plots of yo(t) and yz(t) are given

in Figure 4-12 for a 10th order model. This oscillation is difficult to approximate with

a Laguerre model since only a single real pole (time scale) is used in the

approximation. Increasing the order of the network could further reduce the error.

Another option is to use a Kautz model with a pair of complex conjugate poles.

For all three examples we observe that yo(t) and yz(t) are inseparable. The point

to be made in this section is that when using λ obtained from the appropriate zero












Figure 4-10: Impulse response and approximations using a 1st order Laguerre model. [Example 1: d(t), yo(t), yz(t).]

Figure 4-11: Impulse response and approximations using a 5th order Laguerre model. [Example 2: d(t), yo(t), yz(t).]

Figure 4-12: Impulse response and approximations using a 10th order Laguerre model. [Example 3: d(t), yo(t), yz(t).]









crossing of $\mathrm{Im}[J_u]$ (or $\mathrm{Im}[J_a]$) as the time scale for the model, the same result can be

achieved as when using λ obtained by searching J for the global minimum. Also the

squared error performance function can always be computed using J = 2Re[Ju].


4.3.5 Conclusion of the Complex Performance Function

For systems composed of sets of real and/or complex conjugate pair poles, a new

complex performance function, $J_u$, was derived when using a GLFF network as a

system model. The imaginary component of $J_u$ has zero crossings near the stationary

points of the squared error performance function J. Experimentally it is determined

that as the order of the model is increased, the zero crossings of $\mathrm{Im}[J_u]$ near minima

of J converge to these stationary points, one of which will be the optimal time scale,

$\lambda_{opt}$. Rather than searching the non-convex function J for a global minimum, locate

the roots $\lambda_r$ of $\mathrm{Im}[J_u(\lambda)]$ and evaluate $2\,\mathrm{Re}[J_u(\lambda_r)]$. One of these will be close to $\lambda_{opt}$;

namely,
namely,


$\hat{\lambda} = \arg\min_{\lambda_r}\big\{2\,\mathrm{Re}[J_u(\lambda_r)]\big\}$   (4.114)

where

$\lambda_r \in \{\text{roots of } \mathrm{Im}[J_u(\lambda)]\}$.

In addition, we have defined the new analytic performance function and derived its

closed form solution. It was shown that J could be written in terms of components of

Ju and Ja. Expressions for the imaginary components of Ju and Ja were also given.

The value of these functions in determining the optimal time scale of GLFF kernels

was illustrated using three example systems. The important equations from the above

sections are now summarized:











Squared Error Performance Function

$J = (d_e - \mathbf{w}^T\mathbf{w})/d_e$  {squared error components}
$\;\;= (d_{ueR} - 2\mathbf{w}_{uR}^T\mathbf{w}_{uR})/d_{ueR}$  {complex error components}   (4.115)
$\;\;= (d_{aeR} - 2\mathbf{w}_{aR}^T\mathbf{w}_{aR})/d_{aeR}$  {analytic error components}

Complex Performance Function

$J_u = (d_{ue} - 2\mathbf{w}_{uR}^T\mathbf{w}_u)/(2d_{ueR})$   (4.116)

Analytic Performance Function

$J_a = (d_{ae} - 2\mathbf{w}_{aR}^T\mathbf{w}_a)/(2d_{aeR})$   (4.117)

where

$d_e = \frac{1}{2\pi j}\oint_C D(s)D(-s)\,ds \qquad\quad\; \mathbf{w} = \frac{1}{2\pi j}\oint_C D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$
$d_{ue} = \frac{1}{2\pi j}\int_{C_u} D(s)D(-s)\,ds \qquad\; \mathbf{w}_u = \frac{1}{2\pi j}\int_{C_u} D(s)\,\boldsymbol{\Phi}(-s,\lambda)\,ds$   (4.118)
$d_{ae} = \frac{1}{2\pi}\int_0^{\infty} D(j\omega)D(-j\omega)\,d\omega \qquad \mathbf{w}_a = \frac{1}{2\pi}\int_0^{\infty} D(j\omega)\,\boldsymbol{\Phi}(-j\omega,\lambda)\,d\omega$

If D(s) is known analytically or in closed form we can employ the residue theorem

to compute $d_e$, $d_{ue}$, $d_{ae}$, $\mathbf{w}$, $\mathbf{w}_u$, $\mathbf{w}_a$. If these functions must be determined numerically

then the analytic performance function can be easily computed. The only requirement

is a sampled data set from an impulse or step response test. In addition, J can always

be computed from Ju or Ja. This illustrates that the squared error solution is

completely embedded within the complex and analytic performance functions.












4.4 Conclusion (Optimal Parameter Selection)

This chapter has addressed optimization techniques for GLFF models, M(p).

Optimization of each element of the parameter set p = {N,w, A} is handled

separately. Model dimension element optimization generally involves increasing N

until a prescribed error tolerance is achieved. Monitoring the rate of convergence for

several GLFF kernels could also aid in choosing the model yielding the smallest N.

The optimal weight vector minimizes the integrated squared error and is found by

solving a set of Wiener-Hopf equations. A direct integral evaluation method was also

given that takes advantage of properties of the model kernels. The weight vector

solution is a function of the time scale vector λ. Optimizing λ is the most difficult

task because the minimum squared error solution is nonlinear w.r.t. these parameters.

A number of optimization approaches exist in the literature and they were discussed.

Finally, a new method was proposed that extends the squared error analysis to the

complex domain. Simply locating zero crossings and choosing the one that

minimizes J gives the optimal λ. Examples were provided to demonstrate the theory.














CHAPTER 5
APPLICATIONS


Applications of the theory presented in the previous chapters will now be

discussed. In this thesis, GLFF networks will be used to approximate sampled-data

systems. In particular, the step response of a system used in laboratory test and

evaluation of guided weapon systems will be modeled. Other areas where GLFF

models have proven useful are first surveyed.

The GLFF model class encompasses many structures that have been

independently formulated and studied in the past. This includes the tap-delay line,

Gamma, Laguerre, Legendre, Jacobi, and Kautz models defined in both discrete-time

and continuous-time domains. The unification of these structures is one contributory

component of this thesis; however, it is worth reviewing the previous history of these

models to gain insight into the many applications that they have been able to address.

Network synthesis. The earliest work dealt primarily with orthogonal model

structures in an effort to synthesize networks and transient responses of linear systems

[Arm57], [Arm59], [Kau54], [Hea55], [Bru64], [Clo65], [Kin77], [War53]. In

[Lee33] and [Hug56] synthesis of electrical circuits was the focus.

System identification and modeling. Similarly as the tools became more

developed the nomenclature and approach shifted to system identification and system

modeling [Bod94], [Eko94], [Kin79], [Lin93], [Mas90], [Mas91], [Nur87], [Nin94],









[Oli87], [Oli94], [Per91], [Sil94], [Sil95], [Van94], [Wah91], [Wah94], [Moh88].

The basic principles are the same as in network synthesis only now the modeling

effort involves more than just the network impulse response. A further refinement

includes modeling non-causal systems [Eko95], nonlinear biological systems

[Mam93], and probability density functions [Pab92], [Tia89]. State space formulations

appeared [Dum86], [Gus93], [Heu90], [Heu95], and modeling using step response

data [Wan94b] and frequency response data [Clu92], [Wan95] have been considered.

Function approximation. A representation of an underlying system is not always

desired. Various GLFF models have also been useful in function approximation, signal

representation, and time series analysis [Bro65], [Cel95], [Con94], [Cle82], [Hwa82],

[Kor91], [Mar69], [Mai85], [McD68], [Ros64], [Sch71], [Wah93], [You62b],

[Yen62]. Particular examples include representation of seismic data [Dea64] and

electrocardiograms [You63].

Speech. Applications of GLFF models used in speech synthesis/analysis [Man63]

and speech compression [AlJ94] have appeared.

Control theory. GLFF models are also useful in control theory. Examples include

adaptive controllers [Dum86], [Dum90], [Zer88a], adaptive predictor controllers

[Els94], [Guo94], optimal control [Hag91], [Nur81], robust control [Clu91] and

robust stability tests [Aga92].

Stochastic processes. Stochastic concepts have been addressed including

approximating cross correlations [den93a], correlations and power density spectra

[Lee67], and probability density functions [Tia89], [Pab92].







(Adaptive) filter theory. GLFF models have been used as filters [Kab94]

including adaptive filters [den93a], [den94], [Per88], [Pri93]. Systematic procedures

to design FIR filters have been introduced [Fri81], [Mas90], [Mas91]. Recursive

identification of time varying signals [Gun90], 2D filtering [Tsi94] and hardware

implementations have also been studied [Ji95].

Other applications. Other examples where GLFF networks have been used

include solutions to boundary value problems [Ira92], separation, enhancement, and

tracking multiple sinusoids [Ko94], and approximation of time delays [Lam94],

[Par91], [Moh95].

Although the list of applications above is fairly extensive, it is by no means

exhaustive. Nevertheless, the applicability of the GLFF modeling theory developed in

this research is clearly illustrated.


5.1 Examples of GLFF Networks Using Simulated Systems

The application focus of this thesis is system modeling; namely, it is desired to

approximate (model) LTI systems using GLFF networks. The objective is to obtain

an approximation to the system impulse response. Quite often this is accomplished

by optimizing the parameters of the model using step response data. In this section

several GLFF network realizations are constructed that approximate simulated LTI

systems. Optimization is performed using the step response of the desired systems.

Real (non-simulated) systems and their associated step response data will be

considered in Chapters 5.3 and 5.4.










Three simulated systems will be studied: 1st order, 2nd order, and 4th order.


Example 1: (1st order)

$H(s) = \dfrac{1}{\tau s + 1}, \qquad \tau = 1$   (5.1)

Example 2: (2nd order)

$H(s) = \dfrac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}, \qquad \omega_n = 1,\; \zeta = 0.4$ (underdamped)   (5.2)

Example 3: (4th order)

$H(s) = H_1(s)\,H_2(s)$   (5.3)

where

$H_1(s) = \dfrac{\omega_{n1}^2}{s^2 + 2\zeta_1\omega_{n1} s + \omega_{n1}^2}, \qquad \omega_{n1} = 1,\; \zeta_1 = 0.4$ (underdamped)
$H_2(s) = \dfrac{\omega_{n2}^2}{s^2 + 2\zeta_2\omega_{n2} s + \omega_{n2}^2}, \qquad \omega_{n2} = 2,\; \zeta_2 = 0.25$ (underdamped)   (5.4)


In each case, the step response is generated and sampled. The sampled step

response data is modeled with Laguerre, Gamma, and Legendre models. The results

are given below in Figures 5-1 through 5-6. In each case, the optimal time scale

parameter was determined and used to compute and plot the step response and

impulse response for 6-tap (N = 5) networks. Figures 5-1, 5-2, and 5-3 compare the

system and GLFF network step responses. Figures 5-4, 5-5, and 5-6 show the

resulting impulse responses. The optimal time scale parameters of the 6-tap models

and associated minimum squared error, $J(\lambda_{opt})$, are provided in Tables 5-1, 5-2, and 5-3.
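The experimental pipeline can be summarized in a short sketch (hypothetical code, shown here for a continuous-time Laguerre model; scipy.signal.step generates the sampled step response, the weights come from the normal equations of Chapter 4.2.4, and λ is chosen by grid search):

    import numpy as np
    from scipy import signal

    def laguerre_lti(n, lam):
        # Phi_n(s, lam) = sqrt(2 lam) (s - lam)^n / (s + lam)^(n+1)  (assumed form)
        num = np.sqrt(2 * lam) * (np.poly1d([1.0, -lam]) ** n).coeffs
        den = (np.poly1d([1.0, lam]) ** (n + 1)).coeffs
        return signal.lti(num, den)

    def fit_laguerre_to_step(num, den, N=5, lam_grid=np.linspace(0.1, 3.0, 60)):
        t = np.linspace(0, 10, 1001)
        _, g = signal.step(signal.lti(num, den), T=t)
        u = np.ones_like(t)                  # unit step input
        best = None
        for lam in lam_grid:
            # Tap signals: the step passed through each Laguerre section
            X = np.column_stack([signal.lsim(laguerre_lti(i, lam), u, t)[1]
                                 for i in range(N + 1)])
            w = np.linalg.solve(X.T @ X, X.T @ g)   # eqs. (4.35)-(4.36)
            J = np.sum((g - X @ w) ** 2)
            if best is None or J < best[0]:
                best = (J, lam, w)
        return best                          # (J, lam_opt, w_opt)

For example 1, fit_laguerre_to_step([1.0], [1.0, 1.0]) fits the 6-tap Laguerre model to the sampled step response.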










Figure 5-1: 1st order step response and model data. [System vs. Laguerre, Gamma, and Legendre models, t = 0 to 5 sec.]

Figure 5-2: 2nd order step response and model data. [System vs. Laguerre, Gamma, and Legendre models, t = 0 to 10 sec.]

Figure 5-3: 4th order step response and model data. [System vs. Laguerre, Gamma, and Legendre models, t = 0 to 10 sec.]

Figure 5-4: 1st order impulse response and model data. [System vs. Laguerre, Gamma, and Legendre models, t = 0 to 5 sec.]

Figure 5-5: 2nd order impulse response and model data. [System vs. Laguerre, Gamma, and Legendre models, t = 0 to 10 sec.]

Figure 5-6: 4th order impulse response and model data. [System vs. Laguerre, Gamma, and Legendre models, t = 0 to 10 sec.]









Table 5-1: 1st order model results.

Model      N      λopt                 J(λopt) in dB
FIR        101                         -13.7
Laguerre   5      0.88                 -57.1
Gamma      5      0.05                 -∞
Legendre   5      λopt,i = e^{0.05i}   -56.0




Table 5-2: 2nd order model results.

Model      N      λopt                 J(λopt) in dB
FIR        101                         -13.0
Laguerre   5      0.95                 -57.1
Gamma      5      0.08                 -40.7
Legendre   5      λopt,i = e^{0.03i}   -37.5




Table 5-3: 4th order model results.

Model      N      λopt                 J(λopt) in dB
FIR        101                         -10.9
Laguerre   5      0.89                 -37.7
Gamma      5      0.09                 -31.2
Legendre   5      λopt,i = e^{0.03i}   -28.8







The three examples considered above are typical of the subsystem makeup of

guided weapon systems. With fairly low order models (N=5) the GLFF networks

have demonstrated (in the simulated examples) they can approximate these systems

well. This includes systems having slowly decaying impulse responses and

oscillatory modes. The most difficult task of the modeling process is locating the

optimal time scale. This was done using a numerical search technique in the above

examples. For a given model, if the resulting error is too large the model order can be

increased. This process can be continued until the prescribed error tolerance is

reached.


5.2 Laboratory Testing of Guided Weapon Systems

Guided weapon systems are as important in a military tactical mission as the

aircraft and ships that carry them. They are the tools with which a pilot can carry out

both offensive and defensive plans. As such, their capability and reliability must be

well understood so that they can be operated within the envelope for which they were

developed. To ensure their success it is necessary that the operational envelope is

understood. This knowledge is acquired through extensive test and evaluation of

these systems. The objective is to produce a smart dependable weapon that operates

as expected when launched from an aircraft or ship.

Test and evaluation (T&E) of guided weapon subsystems can be broken down into three phases [Eic89]. This breakdown is typical when a new weapon or a modification of an existing weapon is under consideration. The phases are









1. Research/Concept Exploration
a) analytic study
b) open-loop laboratory and field tests

2. Simulation
a) software simulation and modeling
b) HITL (hardware-in-the-loop) simulation

3. Flight Test
a) captive carry
b) launch

Research and concept exploration are the very beginning of the T&E effort. Proposals must be carefully considered to determine which current weapons could meet the objective under consideration: can an existing weapon be modified, or does the objective call for a new design? All of these scenarios must be explored analytically. Flight tests are the final hurdle; it is here that the weapon system's performance is measured against expectations. Before flight testing, however, many problems can be worked out by evaluating a weapon system in a simulated environment created in the laboratory. There are many methods for evaluating the performance of weapon system components, such as parametric tests and open-loop tests. Another class of tests studies the behavior of these systems in a closed-loop configuration and involves a simulation of the real-world environment and of the weapon's flight in that environment. This class of tests can be broken down into two types: all-digital and HITL. In an all-digital closed-loop flight, the entire weapon system is simulated mathematically. If the weapon subsystem models are accurate, all-digital simulations can give a good indication of how the weapon will perform as well as of its limitations and vulnerabilities. Once subsystems have been fabricated in hardware, they can replace the corresponding models in the closed-loop simulation. Ultimately it is desirable to have as much of the weapon hardware in the loop as possible.


5.2.1 Objectives

Several reasons why laboratory HITL testing is an integral part of a weapon development test program are given below.

1. Asset Availability
During the development of a weapon system, answers are often needed to key questions about the limits of its capability. Rather than fabricating an entire system and conducting a flight test, these questions can many times be answered using simple prototypes or subsystem components alone, avoiding the need to fully fabricate a system and conduct open-air flight tests. Flight tests are expensive and consume substantial resources: a fully developed weapon system, aircraft or ships, open-air range time and space, and money. Here laboratory testing offers a great advantage.

2. Planning and Analysis Requirements
As a missile system reaches the development stage and can be flight tested, valid test conditions (a test matrix) must be formulated. With laboratory results (all-digital and/or HITL) as a guide, the test matrix can be formulated intelligently and unrealistic scenarios can be avoided. Laboratory tests not only help define the launch envelope of a weapon; they can also be used to predict its performance when operating under valid conditions.

3. Repeatability
Another advantage of the laboratory environment is repeatability. Once a weapon system simulation has been developed, it can be used to conduct hundreds or thousands of tests, examining weapon performance under varying release conditions and "what if" scenarios. The laboratory environment is a major contributor to this kind of testing because real assets need not be expended; in "live fire" testing, assets can usually be used only once.

4. Security/Safety
Finally, there is the security advantage. As long as testing is conducted in a laboratory, external security vulnerability is reduced. The range safety costs and concerns of open-air testing are also eliminated.









To accomplish valid and accurate laboratory testing, careful attention must be paid to the test configuration. For HITL tests it must be ensured that the laboratory equipment can recreate the dynamics and conditions that will be experienced in the real world. It is often convenient to switch between configurations involving hardware and software, which requires accurate software models of the hardware being simulated. This is where the GLFF networks discussed in this work enter the application. To gain a better understanding of the systems the GLFF networks must model, a more detailed breakdown of a typical missile system is now given. The functional components of the missile are discussed, as well as how they are integrated into a HITL test configuration. The GLFF networks can then be used to approximate some of these components; a sketch of one such network follows.
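To make this modeling role concrete, the sketch below implements one common GLFF instance, a discrete-time Laguerre network: a normalized first-order stage followed by identical all-pass sections, with the output formed as a weighted sum of the tap signals. This is a minimal illustration assuming a discrete-time setting; the function name and interface are illustrative, and the tap weights would be fit by least squares with the time scale found by the numerical search described earlier.

```python
import numpy as np

def laguerre_network(u, lam, weights):
    """Discrete-time Laguerre feedforward network (one GLFF instance).

    Stage 0 is the normalized low-pass sqrt(1-lam^2)/(1 - lam z^-1);
    each subsequent stage is the all-pass (z^-1 - lam)/(1 - lam z^-1).
    The output is the weighted sum of the tap signals."""
    n = len(weights)
    taps = np.zeros(n)   # current tap outputs x_i[k]
    prev = np.zeros(n)   # previous tap outputs x_i[k-1]
    gain = np.sqrt(1.0 - lam**2)
    y = np.zeros(len(u))
    for k, uk in enumerate(u):
        prev[:] = taps
        taps[0] = lam * prev[0] + gain * uk
        for i in range(1, n):
            # all-pass recursion: x_i[k] = lam*x_i[k-1] + x_{i-1}[k-1] - lam*x_{i-1}[k]
            taps[i] = lam * prev[i] + prev[i - 1] - lam * taps[i - 1]
        y[k] = float(np.dot(weights, taps))
    return y
```

With the time scale lam from the search and weights fit by least squares, laguerre_network(u, lam, w) approximates the response of the modeled subsystem to an input sequence u.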


5.2.2 Functional Breakdown of a Missile System

This section provides a discussion of the components (subsystems) most often found in missile systems. Figure 5-7 below illustrates a high-level functional breakdown of a typical missile system [Gar80].


Figure 5-7: Missile functional block diagram.







Seeker/Guidance Modules. It can be said that the seeker is the eye and the guidance module the brain of a weapon system. It is here that the missile trajectory is commanded and controlled. For missiles that are lock-on after launch, the guidance module may begin in a target search mode, driving the seeker to search for a target in some pre-programmed pattern. During this phase, commands may be sent to the autopilot to execute a midcourse maneuver, such as pitching the weapon up to gain altitude for target search and the future endgame engagement. If target energy is found, the guidance module may transition into an acquisition mode to narrow the seeker's search field and attempt to select a single target. At that point it begins target track and tries to keep the seeker pointed in the direction of the target. During track mode, guidance (acceleration) commands are sent to the autopilot in accordance with the guidance law being executed (proportional or pursuit navigation). All the while, this module performs seeker head stabilization and directional pointing to maintain a continuous track of the target until impact.
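The proportional navigation law mentioned above admits a compact statement: the commanded lateral acceleration is the navigation gain times the closing velocity times the line-of-sight rate. The sketch below illustrates this classical form; the numbers in the example are only representative, and gains of roughly 3 to 5 are typical.

```python
def pro_nav_command(nav_gain, closing_velocity, los_rate):
    """Classical proportional navigation: commanded lateral acceleration
    equals the navigation gain times the closing velocity times the
    line-of-sight rate."""
    return nav_gain * closing_velocity * los_rate

# Example: N = 4, Vc = 600 m/s, LOS rate = 0.02 rad/s -> 48 m/s^2 commanded.
print(pro_nav_command(4.0, 600.0, 0.02))
```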

Autopilot. The job of the autopilot is to maintain stable flight and deliver on the commands received from the guidance module. To perform its task, the autopilot accepts input from various sources and sensors, the three most common being the guidance module, accelerometers, and gyros. The guidance module tells the autopilot that a certain amount of acceleration is required to maintain the desired trajectory. Upon receiving a command for a specific acceleration, the autopilot issues commands to the actuator to drive the control surfaces (fins or canards) in the proper direction. Measurements are taken from the accelerometers to determine whether the weapon has achieved the desired acceleration, and gyro rotational rate and angle measurements are used to ensure proper damping and to keep the motion of the missile stable.
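As an illustration of this loop, the sketch below shows one update of a simple acceleration autopilot: a proportional-integral term on the acceleration error measured by the accelerometers, plus gyro rate feedback for damping. The structure and gains are assumptions for illustration, not the design of any particular autopilot.

```python
def autopilot_step(a_cmd, a_meas, body_rate, integ, kp=0.5, ki=0.1, kd=0.2, dt=0.01):
    """One update of a simple acceleration autopilot sketch: PI control on
    the acceleration error with gyro rate feedback for damping. Gains and
    structure are illustrative assumptions.

    integ holds the error integrator state; returns (fin_command, integ)."""
    err = a_cmd - a_meas                              # accelerometer feedback
    integ += err * dt                                 # integrate the error
    fin_cmd = kp * err + ki * integ - kd * body_rate  # gyro rate feedback damps motion
    return fin_cmd, integ
```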

Actuator and Control Surfaces. The purpose of the actuator is to accept commands from the autopilot and in turn drive the control surfaces to achieve the needed dynamics. As the fins are deflected into the airstream, the missile body rotates and accelerates, ideally in a stable manner sufficient to maintain target track.



[Figure: closed-loop block diagram. The target signature drives the Seeker; track angles and rates pass to the Guidance module (which also handles mode control and seeker stabilization); steering commands pass to the Autopilot; fin commands pass to the Actuator; fin positions drive the Airframe; airframe forces feed the Kinematics; and missile body angles, rates, and accelerations feed the Missile/Target Geometry block, which closes the loop back to the seeker via the target profile. A Flight Motion Simulator (FMS) is shown spanning the hardware components.]

Figure 5-8: Closed-loop missile simulation.
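The signal flow of Figure 5-8 can be summarized as a simulation skeleton. In the sketch below each component is a hypothetical callable standing in for a block of the diagram; in a HITL run, some of these would wrap laboratory hardware rather than software models.

```python
def closed_loop_run(seeker, guidance, autopilot, actuator, airframe, geometry,
                    n_steps, dt):
    """Skeleton of the closed-loop simulation in Figure 5-8. Each argument
    is a hypothetical callable standing in for a hardware or software
    component."""
    rel = geometry.initial_state()                 # missile/target relative state
    for _ in range(n_steps):
        track = seeker(rel)                        # track angles and rates
        a_cmd = guidance(track)                    # steering (acceleration) commands
        fin_cmd = autopilot(a_cmd)                 # fin commands
        fins = actuator(fin_cmd)                   # fin positions
        body = airframe(fins, dt)                  # body angles, rates, accelerations
        rel = geometry.update(body, dt)            # close the loop back to the seeker
    return rel
```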


5.2.3 HITL Closed-Loop Missile Simulations


In this section, HITL closed-loop missile simulations are discussed. Figure 5-8 above illustrates a typical test configuration. Usually the seeker, guidance module, and autopilot are hardware components, and the functionality of these systems is the subject of the test. Special laboratory equipment, whose type depends on the kind of sensor in the seeker head, is used to generate the target scene. The remaining blocks

are simulated in software. For a HITL closed-loop simulation, the seeker, guidance