Realization of invariant system descriptions from Markov sequences

Material Information

Realization of invariant system descriptions from Markov sequences
Candy, James Vincent, 1944-
Publication Date:
Physical Description:
viii, 124 leaves : ; 28 cm.


Subjects / Keywords:
Algebra ( jstor )
Canonical forms ( jstor )
Covariance ( jstor )
Factorization ( jstor )
Integers ( jstor )
Linear systems ( jstor )
Mathematical vectors ( jstor )
Matrices ( jstor )
Stochastic models ( jstor )
Transfer functions ( jstor )
Dissertations, Academic -- Electrical Engineering -- UF
Electrical Engineering thesis Ph. D
Invariant subspaces ( lcsh )
Markov processes ( lcsh )
bibliography ( marcgt )
non-fiction ( marcgt )


Thesis--University of Florida.
Bibliography: leaves 114-123.
General Note:
General Note:
Statement of Responsibility:
by James Vincent Candy.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
000180838 ( ALEPH )
03197863 ( OCLC )
AAU7374 ( NOTIS )


This item has the following downloads:

Full Text







To my wife, Patricia,and daughter, Kirstin,for unending faith,

encouragement and understanding. To my mother, Anne, for her constant

support and my mother-in-law, Ruth, for her encouragement. To "big" Ed,

my father-in-law, whose sense of humor often lifted my sometimes low


:' .. ~'~~-~P1"*-~qnq8-.s~-rrarrm~mtin*a-s--- --il----~---~i~a~rCI~AI~~-I- -. --~-


I would like to express my sincere appreciation to the members

of my supervisory committee: Dr. Thomas E. Bullock, Chairman, and

Dr. Michael E. Warren, Cochairman, Dr. Donald G. Childers,

Dr. Z.R. Pop-Stojanovic and Dr. V.M. Popov. A special thanks to

Dr. Thomas E. Bullock and Dr. Michael E. Warren for their constant

encouragement, unending patience, and invaluable suggestions in the

course of this research.

I would also like to thank my fellow students and friends,

Zuonhua Luo, Arun Majumdar, Jose DeQueiroz, and Jaime Roman, for

many fruitful discussions and suggestions.

-- iii


ACKNOWLEDGMENTS ........................................... iii

LIST OF SYMBOLS .......... .................. ... ....... ... vi

ABSTRACT .............................. ..... ..... ............. vii

CHAPTER 1: INTRODUCTION ................................. ...... 1

1.1 Survey of Previous Work in Canonical Forms
for Linear Systems ............................ 2
1.2 Survey of Previous Work in Realization Theory.. 5
1.3 Purpose and Chapter Outline ................... 10
1.4 Notation ..................................... 11


2.1 Realization Theory ............................ 12
2.2 Invariant System Descriptions ................. 18
2.3 Canonical Realization Theory .................. 33
2.4 Some New Realization Algorithms .............. 45

CHAPTER 3: PARTIAL REALIZATIONS ............................. 54

3.1 Nested Algorithm ............................. 54
3.2 Minimal Extension Sequences ................... 64
3.3 Characteristic Polynomial Determination by
Coordinate Transformation ..................... 74

DESCRIPTIONS ...................................... 78

4.1 Stochastic Realization Theory ................ 81
4.2 Invariant System Description of the
Stochastic Realization ........................ 87
4.3 Stochastic Realizations Via Trial and Error ... 97
4.4 Stochastic Realization Via the Kalman Filter .. 104

CHAPTER 5: CONCLUSIONS ....................................... 111

5.1 Summary ....................................... 111
5.2 Suggestions for Future Research ............... 112

~1_1__1 __~~__ j


REFERENCES ................................................ 114

BIOGRAPHICAL SKETCH ................................. ....... 124

_ ____r___ll___ Cr ~_I_ _~__ 1__1_1 _



A aT
IA| or det A
diag A


dim X



First Usage



Transpose of A, a
Inverse of A
Inverse of AT
Rank of A
Determinant of A
Diagonal elements of A
x is not equal to y
x is greater than y
X is contained in or a
subset of Y
x is an element of X
Map (set X into set Y)
x is defined by
Abstract group operation
Sequence or set with
End of proof
Group action operator
Dimension of vector
space X
if and only if
Eigenvalues ,of A
Square root of x
Maximum value of
Positive integers










~~1~1_ _I_ I

Abstract of Dissertation Presented to the Graduate Council
of the university of Florida in Partial Fulfillment of
the Requirements for the Degree of Doctor of Philosophy



James Vincent Candy

March, 1976

Chairman: Dr. Thomas E. Bullock
Cochairman: Dr. Michael E. Warren
Major Department: Electrical Engineering

The realization of infinite and finite Markov sequences for multi-

dimensional systems is considered, and an efficient algorithm to extract

the invariants of the sequence under a change of basis in the state

space is developed. Knowledge of these invariants enables the deter-

mination of the corresponding canonical form, and an invariant system

description under this transformation group. For the partial realization

problem, it is shown that this algorithm possesses some attractive

nesting properties. If the realization is not unique, the class of

all possible solutions is found.

The stochastic version of the realization problem is also examined.

It is shown that the transformation group which must be considered is

richer than the general linear group of the deterministic problem. The

invariants under this group are specified and it is shown that they can

be determined from a realization of the measurement covariance sequence.

Knowledge of these invariants is sufficient to specify an invariant

system description for the stochastic problem. The link between the

realization from the measurement covariance sequence, the white noise

model and the steady state Kalman filter is established.




Special state space representations of linear dynamic systems have

long been the motivation for extensive research. These models are

generally used to simplify a problem, such as pole placement, by

introducing arrays which require the fewest number of parameters while

exhibiting the most pertinent information. In general, system represen-

tations have been studied in literature as the problem of determining

canonical forms; Canonical forms have been used in observer design,

exact model matching methods, feedback system design, and Kalman filtering

techniques. In realization theory, canonical forms for linear multi-

variable systems are important. Since it is only possible to model a

system within an equivalence class, the ability to characterize the class

by a unique element is beneficial.

The problem of determining a canonical form has its roots in

invariant theory. Over the past decade many so-called "canonical" system

representations have evolved in the literature, but unfortunately these

representations were obtained from a particular application or from

computational considerations and not derived from the invariant theory

point of view. Generally, these representations are not even unique and

therefore cannot be called a canonical form. Representations derived

in this manner have generally been a source of confusion as evidenced by

the ambiguity surrounding the word canonical itself. In this dissertation

we follow an algebraic procedure to obtain unique system representations,

i.e., we insure that these representations dre in fact canonical forms.

In simple terms this approach seeks the determination of certain entities

called invariants obtained by applying particular transformation groups

*(e.g., change of basis in the state space) to a well-defined set

representing a system parameterization. The invariants are the basic

structural elements of a system which do not change under this trans-

formation and are used to specify the corresponding canonical form. This

approach insures that the ambiguities prevalent in earlier work are

removed. Initially, we develop a simple solution to the problem of

determining a state space model from the unit pulse response of a given

linear system and then extend these results to the stochastic case where

the system is driven by a random input. The technique developed to

extract the invariants from this (response) sequence not only provides

a simple solution to the realization problem, but also gives more insight

into the system structure.

1.1 Survey of Previous Work in Canonical Forms for Linear Systems

The study of canonical forms for linear dynamic systems evolved

slowly in the sixties. The main impetus of investigation was initiated

by Kalman (1962,1963) when he compared two different methods for describing

linear dynamic systems: (1) internally by the state space representation

denoted by the triple (F,G,H), or (2) externally by the transfer

function--the input/output description. Development over the past

decade in such areas as optimal control, decoupling theory, estimation

and filtering, identification theory, etc., have relied heavily on the

tThis defines (simply) the realization problem.


state space representation for analysis and design. In early literature,

however, transfer function representations were used. For highly

complex systems it is much easier to determine external behavior rather

than internal,since the state variables are normally not available for

measurement. As pointed out by Kalman (1963) the language of these

representations may be different, but both describe the same problems and

are related. Many researchers have investigated the relationship between

both representations, but always with one common goal--to obtain a

state space model which specifies the external description directly by

inspection. Kalman (1963) and later Johnson and Wonham (1964),

Silverman (1966) have shown that there exists a canonical form (under

change of basis in the state space) in the scalar case for the triple

(F,g,h) where F is in companion matrix form (see Hoffman and Kunze (1971))
and g is a unit column vector. It was shown that there exists a one to

one correspondence between the non-zero/non-unit elements of the triple

and the transfer function. This representation was used by Bass and Gura

(1965) to solve the pole-positioning problem and recently by Wolovich

(1972b)in solving the exact model matching problem.

The progress in determining a canonical form for the internal

description of multivariable systems came more slowly. The earliest work

appears to be that of Langenhop (1964) in which he develops a representation

to study system stability. Brunovsky (1966,1970) was probably the first

to recognize the invariant properties of the canonical form for the

controllable pair (F,G). Tuel (1966,1967) developed canonical forms for

multivariable systems in his investigation of the quadratic optimization

problem. Subsequently, Luenberger (1967) proposed certain sets of

canonical forms for controllable pairs; however, his development allowed

the possibility of nonuniqueness of these representations. Bucy (1968)

extended the results of Langenhop and Luenberger when he developed a

canonical form for certain subclasses of observable systems, but he

too was unaware of its invariant properties. Proceeding from the

external system description many researchers began to realize the

usefulness in the development of canonical forms. Popov (1969) developed

a canonical form for the transfer function in his investigation of

irreducible system representations. Gilbert (1969) examined the invariant

properties of a system with feedback applied to solve the decoupling

problem. Dickinson et al. (1974a) discuss the construction and appli-

cation of these canonical forms for the transfer function matrix in a

recent survey. The properties of canonical forms were not fully

understood initially. In fact, the basic question of their uniqueness

posed many doubts as to their usefulness. This issue wasn't resolved

until the work of Rosenbrock, Kalman, and Popov in the early seventies.

The properties of the Luenberger forms were clarified by the

results of Rosenbrock (1970) and Kalman (1971a) in their studies of the

minimal column indices (or Kronecker indices) of the matrix pencil

[Iz-F,G], or more commonly, the indices of the pair (F,G). These indices

were shown to be invariants under the following transformations: change

of basis in the state space, input change of basis, and state feedback.

These results precisely resolve the question of what can (or cannot)

be altered by applying feedback to a linear multivariable system. At

the same time Popov (1972) examined the properties of the controllable

pair (F,G) under the same transformations in a very precise, step-by-step,

algebraic procedure todetermine the corresponding invariants. He shows

clearly that obtaining the invariants under a particular transformation

group is the only information required to specify the corresponding

canonical form. Wonham and Morse (1972) obtained the feedback invariants

of the controllable pair from the not as lucid geometric viewpoint.

Their results were identical to those of Brunovsky and-4osenbrock.

Morse (1973) examined the invariants of the triple (F,G,H) under a lars

group of transformations which includes output change of basis. A

complete set of feedback invariants of this triple still remains an

open problem, but some fragmentary results were presented by Wang and

Davison (1972) when they investigated certain sets of restricted triples.

Along these lines Rissanen (1974), Caines and Rissanen (1974), .

Mayne (1972a,b),Weinert and Anton (1972), Tse and Weinert (1973,1975),

Glover and Willems' (1974) examined the identification problem from the

invariant theory viewpoint and obtained some rather interesting results.

Recent results in decoupling theory were obtained by Warren and Eckberg

(1973), Concheiro (1973), and Forney (1975) using the Kronecker invariants.

Probably the most extensive survey of these results has been compiled

by Denham (1974) and we refer the interested reader to this paper.

We temporarily leave this area to consider one specific application of

these results--the realization problem.

1.2 Survey of Previous Work in Realization Theory

The first realization problem proposed for control systems was

the determination of a state space model (internal description) from

a given transfer function (external description). Gilbert (1963) and

Zadeh and Desoer (1963) describe realization procedures based on the

determination of the rank of the residue matrices of the given transfer

function matrix, but unfortunately these procedures only apply to the


case of simple poles. Kalman (1963) proposed an algorithm whereby

the given transfer function is realized as a parallel combination of

single input, controllable subsystems in companion form, and then

applied the "canonical structure theory" (Kalman (1962)) to delete the
uncontrollable dynamics. This technique handles simple as well as

multiple transfer function poles. Later Kalman (1965) showed the

equivalence of the realization problem of control theory to the

corresponding network theory formulation.

A significant advance in realization theory was given by Ho and
Kalman (1966). They showed that the state space model could be found

from the impulse response sequence provided the system under investi-

gation is finite dimensional. They also developed an algorithm based

on forming the generalized Hankel array from the given sequence and

then extracted the state space triple from it. Shortly after the pub-

lication of Ho's algorithm, Youla and Tissi (1966) working in network

synthesis and Silverman and Meadows (1966) in control theory developed

similar realization techniques again based on the impulse response sequence.

Ho's algorithm gave new impetus to realization theory. Several

authors have provided alternate or improved realization algorithms based

on the Hankel array formulation. Mayne (1968), Panda and Chen (1969),

Roveda and Schmid (1970), Rosenbrock (1970), Lal et al. (1972) and even

more recently Huang (1974), Rozsa and Sinha (1975) among others,
considered the older transfer function matrix approach, while Rissanen

(1971,1974), Silverman (1971), Ackermannand Bucy (1971), Chen and Mital

(1972), Mital and Chen (1973), and Bonivento et al. (1973) approached

the problem from the Hankel array formulation.

Rissanen (1974), Furata and Paquet (1975), Roman (1975),
Dickinson et al. (1974a,b) have recently considered the problem of

realizing a given infinite impulse response matrix sequence with a
polynomial matrix pair. Such a pair is referred to as a matrix-
fraction description of the system and is becoming well known in

control literature largely due to the ground work established by

Popov (1969), Rosenbrock (1970), Wolovich (1972a,b, 1973a,b) and
Kalman (1971b), Tether (1970), and Godbole (1972) later

considered the more realistic case where only a finite number of

terms of the impulse response sequence are specified. This is
commonly known as the partial realization problem and corresponds
in the scalar case to the classical Pade approximation problem.

Generally most realization altorithms can be used to process partial

data, but usually at a loss of efficiency and even more seriously
the possibility of yielding misleading results. A wealth of new
techniques have recently been published to handle this very special,
yet realistic variant of the realization problem. Rissanen (1972a,b),

Ackermann (1972), Dickinson et al. (1974a), Roman and Bullock (1975a),
Anderson et al. (1975) published some efficient and improved algorithms
to solve this problem.
Also of recent interest is the development of algorithms which

realize a system directly in a canonical form (under a change of basis

in the state space), i.e., algorithms which solve the canonical realiza-

tion problem. The algorithms of Ackermann (1972), Bonivento et al.
(1973), Rissanen (1974), Dickinson et al. (1974a), Rozsa and Sinha

(1975), Luo (1975), and Roman and Bullock (1975a) solve this problem.

One of the main contributions of this dissertation is to use the

results developed from invariant theory to solve the realization and

partial realization problems in the deterministic as well as stochastic

cases. The realization of a system directly in a canonical form actually

reduces to first determining which transformation groups are present,

specifying the corresponding invariants, and then developing a method to

extract these invariants from the given unit pulse response sequence.

This philosophy is basic to any canonical realization scheme and actually

provides an explicit formula which is applied throughout this dissertation.

In the last few years, several interesting extensions have emerged

from the original concept of realization theory. The major motivation

evolved just after the development of the Kalman filter (see Kalman (1961))

in estimation theory because a priori knowledge of the state space

model and noise statistics are required to begin data processing. The

link between the filtering and realization problem was established by

Kalman (1965) just prior to the advent of Ho's algorithm. The work of

Gopinath (1969), Budin (1971,1972), Bonivento et al. (1973), and Audley

and Rugh (1973,1975) were concerned with the more general problem of.

obtaining a state space representation given a general input/output

sequence of the system in both deterministic and stochastic cases. The

stochastic version of the realization problem has not received quite

as much attention as the deterministic case mainly due to its g-eater

complexity and high dependence on the adequacy of covariance estimators.

The realization of stochastic systems was studied by Faurre (1967,1970)

and more recently by Rissanen and Kailath (1972), Gupta and Fairman (1974)

tThe Hankel array formulation is used exclusively in this dissertation;

and Akaike (1974a,b). From the transfer function viewpoint this problem

has been solved using spectral factorization as originally introduced

by Wiener (1955,1959) and studied by others such as Gokhberg and Krein

(1960), Youla (1961), Davis (1963), Motyka and Cadizow (1967), and

Strintzis (1972). The link between the stochastic realization problem

and spectral factorization evolved from the work in stability theory by

Popov (1961,1964), Yakubovich (1963), Kalman (1963), Szeg6 and

Kalman (1963). The equations establishing this link were derived in

the Kalman-Yakubovich-Popov lemma for continuous systems and the Kalman-

Szego-Popov lemma for discrete time systems. Newcomb (1966), Anderson

(1967a,b,1969), and Denham (1975) extended these results and provided

techniques to solve these equations. Defining the invariants of these

problems is still an area of active research as evidenced by the recent

work of Denham (1974), Glover (1973), and Dickinson et al. (1974b).

This is one area developed in this dissertation. It will be shown that

the invariants of the stochastic realization problem not only lends more

insight into the structure of the problem, but also yields some new

Research in realization theory and its applications continues as

evidenced by the recent results of Rissanen (1975) in estimation theory,

Ackermann (1975) in feedback system design,De Jong (1975) in the

numerical aspects of the problem and Roman and Bullock (1975b) in

observer theory. The results presented in this dissertation tie together

some previously well-known results in stochastic realization and filtering

theory as well as provide a technique which can be used to study

other problems.

1.3 Statement of Purpose and Chapter Outline

It is the purpose of this dissertation to provide an extensive

discussion of the realization problem in both the deterministic and

stochastic cases as well as specify the invariants under particular

transformation groups in each case. It is also desired to develop a

simple and efficient algorithm to solve the canonical realization problem.

This algorithm is to be modified to process data sequentially such that

only the pertinent information--the invariants, are extracted from the

given sequence. In the case of a fixed finite unit pulse response

sequence (the partial realization problem), the solution is to be

obtained such that all possible degrees of freedom are specified. The

relationship between the stochastic realization and steady state Kalman

filtering problems are discussed by again examining the corresponding

invariants. In so doing, a considerable amount of knowledge about the

existence and structure of realizations and the steady state filter is


The basic theoretical essentials of realization and invariant theory

are reviewed in Chapter 2. A "formula" essentially outlined in Popov

(1973) and Kalman (1974) is developed which will be applied to various

realization problems throughout the text. Some new theoretical results

in canonical realization theory are established and used to develop a

new canonical realization algorithm.

In Chapter 3 the algorithm is modified to handle sequentially

the case of partial data and also that of a fixed finite sequence. New

results evolve which completely characterize the class of all minimal
partial realizations and extension sequences as well as determining the

characteristic equation in a simple manner.

The stochastic case of the canonical realization problem is in-

vestigated in Chapter 4. A complete set of independent invariants is

found to characterize the corresponding solution. Equivalent solutions

to this problem as well as to the steady state Kalman filtering problem

are studied and it is shown that the filter parameters can be specified

by solving an analogous realization problem.

The specific contributions of this research and further research

possibilities are outlined in Chapter 5.

Examples are used generously throughout this work to illustrate the

various algorithms discussed and to point out significant details that

are otherwise difficult to see. A comment on notation to be used through-

out this dissertation closesdthis chapter.

1.4 Notation

Uppercase letters denote matrices, and vectors are represented by

underlined lowercase letters. Lowercase letters are used to represent

scalars and integers. All matrices and vectors appearing in this work

are assumed to be real and constant. A = [a..] is an nxm matrix with
m ijmis an nxm matrix with
elements a.i; On is the nxm null matrix with row and column vectors

given by T and 0; In represents the nxn identity matrix, and eT or

stands for its j-th row or j-th column; jem means j=1,2,...,m.



In this chapter we present a brief review of the major results

in realization theory. We establish a basic "formula" and apply it to

various system representations. It is shown that this approach greatly

simplifies the realization problem. Two new algorithms for realization

are developed which appear to be more efficient than previous techniques

because they extract only the minimal information necessary to specify

a system from the given input/output sequence in an extremely simple

fashion. All of the essential theory is developed and a multivariable

example is presented.

2.1 Realization Theory

A real finite dimensional linear constant dynamic system has

internal description given by the state variable equations in discrete

time as,

x+l = Fx + Guk

Y-= Hk (2.1-1)

where keZ xeKn=X, ueKm=U, eKP=Y and F, G, H are nxn, nxm, pxn matrices

over the field K. X,U,Y are the state, input, and output spaces,


The external system description may be given either in rational

form as,

T(z) = H(Iz-F)-IG (z complex) (2.1-2)

or equivalently as an infinite matrix power series

T(z) = E Azk (2.1-3)

where the sequence {Ak} is the unit pulse response or Markov sequence

of (2.1-1). The Markov parameters are

Ak = HFkG k=1,2,... (2.1-4)

The problem of determining the internal description (F,G,H) from the
external description (T(z) or {Ak}) is the realization problem. Out of

all possible realizations, E:=(F,G,H) having the same Markov parameters,

those of smallest dimension are minimal realizations.

Prior to stating some of the significant results from realization

theory several useful definitions will be given. The j-controllability

and j-observability matrices are the nxmj and pjxn arrays,

W = [G j1G] and = [HT (HF'-1)T]. The pair (F,G)

is completely controllable if p(W )=n and the pair (F,H) is completely

observable if p(V )=n. Throughout this dissertation we will only be

concerned with systems possessing these properties. For a completely

controllable and observable system, the controllability index, u, and

the observability index, v, are the least positive integers such that the

rank of W and V is n.
1I v


If two minimal realizations E, 7 are equivalent under a change of
basis in X, then there exists a nonsingular T such that
(Fj,,~,)t = (TFT-1,TG,HT-1). It also follows by direct substitution that
the controllability and observability indices of these realizations are
identical and

W = TW. for j = 1,2,...
j J
Vi = ViT-1 for i = 1,2,...

The generalized NxN' block submatrix of the doubly infinite Hankel
array is given by


SN,N' :


... AN'

.. AN+N'-1

Implicit in the realization problem is determining when a finite
dimensional realization exists and, if so, its corresponding minimal
dimension. The following proposition by Silverman gives the necessary
and sufficient conditions for {Ak} to have finite dimensional realiza-

Proposition. (2.1-5)

An infinite sequence {Ak} is realizable iff there
exist positive integers p,v,n such that

p(S ) = (S + ) = (S +j, +1)=n. or j=0,1,...

Further, if {Ak} is realizable, then p,v are the
controllability and observability indices and n is
the dimension of the minimal realization.

tThis notation means F = TFT", G = TG, and H = HT-1

Proof. See Silverman (1971).

Note that the essential point established in Ho and Kalman (1966), which

is used in the proof of the above proposition is that Z is a minimal

realization iff it is completely controllable and observable. Since
S =V W it follows for dim: = n that: p(S ) min[p(V ),p(W )]=n.

This result is essential to construct any realization algorithm. In
(2.1-5) the crucial point of finite dimensionality is carefully woven

into necessary and sufficient conditions for an infinite sequence to be

realizable. What if only partial information about the system is

available in the form of a finite Markov sequence? Is this sequence

realizable? What is the relationship between the minimal realization

and one based only on partial data? These are only a few of the questions

which must be resolved when we are limited to partial data.

Intrinsic in the realization from a finite Markov sequence is the
fact that enough data are contained in S to recover the infinite

sequence, i.e., knowledge of {A1,...,Av,-1} is sufficient to determine

{Ak}, k=1,2,.... But in reality the only way to be sure of this is
knowledge of the actual system dimension (or at least an upper bound). A
minimal partial realization is a realization of smallest dimension
determined from a finite Markov sequence {Ak},keM which realizes the

sequence up to M terms. The order of the partial realization is M and

the realization is denoted by Z(M). The realization induces an extension
of {Ak}, i.e., Ak=HF k-G for k>M. The following basic result analogous
to (2.1-5) answers the realizability question when only partial data are

given. For a proof, see Kalman (1971).

Proposition. (2.1-6) (Realizability Criterion) The minimal partial

realization problem of order M possesses a

solution, E(M) iff there exist positive integers..

v,v, M = v~-pM, such that

(R) p( 5~) = p(SV+1,) = p(Sv,+1)

where dimE(M) = p(S ) = n.

In this proposition (R) is designated the rank condition. Also, it

is important to note that when (R) is satisfied the minimal extension

(of S(M)) is unique (see Tether (1970) for proof), but E(M) is not

unique because there exist other minimal partial realizations equiv-

alent to E(M) under a change of basis in X.

We must consider three possible cases when only partial data is

available. In the first case enough data is available such that M>M

for known n; thus, a minimal realization is found. Second, v and p

are available such that (R) is satisfied. In this case a minimal par-

tial realization can be found, but this in no way insures it is also a

minimal realization of the infinite sequence, since the rank of S

may increase as v,u increase. Third, the rank condition does not hold.

How can a realization be found when no more data is available? The

only possibility in this case is to extend the sequence until (R) is

satisfied, but there can exist many extensions satisfying (R) while

giving nonminimal realizations. For this reason define a minimal

extension as any that corresponds to a minimal (partial) realization.

To obtain minimality we must somehow select the right extension among

the many possible.

Prior to summarizing the main results of Kalman (1971) and Tether

(1970), define the incomplete Hankel array associated with a given
partial sequence {Ak}, keM as

1 A2 .. AM
A2 A3 A.... *
S(M,M) := .

AM *

where the asterisks denote positions where no data is available. The

rank of S(M,M) is the number of linearly independent rows (columns)

determined by comparing only the data specified elements in each row

(column) with the preceding rows (columns) with the cognizance that

upon the availability of more data this number can only remain the same

or increase. Thus, the rank is a lower bound for any extension when the

* are filled in-consistent with the preservation of the Hankel pattern.

Both Kalman and Tether show that there are three pertinent integers
associated with the incomplete Hankel array. They are defined as: n(M),

v(M), p(M) and correspond to the rank of S(M,M), the observability index,

and the controllability index of the given data. The latter two are

lower bounds (separately) for v and p. Knowledge of either v(M), or

y(M) enables us to construct extensions, since they are the least integers
such that (R) holds for all minimal extensions.

It should also be noted that the integers n,v,p,... are actually

non-decreasing functions of the amount of data available, M, and should

be written, n(M), v(M), p(M) etc. to be precise. However, the argument

tIt also follows from this that the p(S(M,M)) is a lower bound for dim.E
(see Kalman (1971)).

M will be understood throughout this dissertation in order to maintain

notational simplicity.

There is one more variant of the partial realization problem that

must be considered. A sequence of minimal partial realizations such

that each lower order realization is contained in one of higher order

will be called a nested realization. Symbolically, this is given by

...S(M)EE(M)I... for M
appear as submatrices of the corresponding matrices in Z(M). The solution

to this problem is most desirable from the computational viewpoint,

since each higher order model can be realized by calculating just a few

new elements in the corresponding realization. Rissanen (1971) has

given an efficient recursive algorithm to determine this solution.

Another related problem of interest is determining a unique member of

equivalent systems under similarity and is discussed in the following


2.2 Invariant System Descriptions

In this section we review some of the fundamental ideas encountered

when examining the invariants of multivariable linear systems. The

framework developed here will be used throughout this dissertation in

formulating and solving various realization problems. Not only does

this formulation enable the determination of unique system representations

under some well-known transformations, but it also provides insight into

the structure of the systems considered. First, we briefly define the

essential terminology and then use it to describe some of the more common

sets of canonical forms employed in many recent applications (e.g., Roman

and Bullock (1975a,b), Tse and Weinert (1975)).

For any two sets X and Y, a subset R = X x Yt is called a binary

relation on X to Y (or, a relation "between" X and Y). Then (x,y)eR

is usually written as xRy and is read: "x stands in the relation R to

y". If for X=Y this relation is reflexive, symmetric, and transitive,
then it is an equivalence relation E on X given by xEy for x,ysX. The

set of all elements z equivalent to x is denoted by E(x).= {zeXfxEz} and

is called the equivalence class or orbit of x for the equivalence relation

E. The set of all such equivalence classes is called the quotient set

or orbit space and is given by X/E. Thus, the relation E of X partitions

the set X into a family of mutually disjoint subsets or orbits by sending

elements which are related into the same equivalence class.

Consider a fixed group Gtt of transformations acting on a set X.

Then the elements xl,x2 of X are equivalent under the action of G iff

there exists a transformation TeG which maps x1 into x2. This is basically

the "formula" we will apply throughout, i.e., we first formulate the set

of elements (generally the internal system description), then define a

transformation group; and finally determine the orbits under the action
of G. To be more precise, let us first define the function f mapping

a set X into Y as an invariant for E if for x1,x2 X, x1Ex2 implies

f(x i)f(x2). In addition if f(xl)=f(x2) implies xlEx2, then f is a

This is the standard Cartesian product, XxY = {(x,y)|xeX, yeY}

tHere we mean "group" in the standard algebraic sense, i.e., (Go)
where G is a closed set of elements each possessing an inverse and
the identify element; o is an associative binary operation. When
o is understood, the group is merely denoted by G.
tNote that an invariant is actually a function, but common usage
refers to its image as the invariant. We will also use this terminology
throughout this dissertation.

complete invariant. In general we will be interested in a complete
system of invariants for E given by the.set of invariants {fi} where
f:t X + Y1xY2x...xY fi is an invariant for E, and fl(x,)= (x2),...'

fn(xl)=fn(x2) imply xEx2. Completeness of this set of invariants
means that the set is sufficient to specify the orbit of x, i.e., there

is a one to one correspondence between the equivalence classes in X
and the image of f. If the set of complete invariants is independent,
then the map f: X-YlX...xYn is surjective. This property means that
corresponding to every set of values of the invariants there always exists
an n-tuple in Y specified by this set. A complete system of independent
invariants will be called an algebraic basis.
Generally, we consider a subset of X (e.g., in system theory a
controllable system). Correspondingly, let fo be a function mapping the
subset X of X into set Y, then fo is a restriction of f if f (x)=f(x)
for each xeXo. We can uniquely characterize an equivalence class E(x)
by means of the set of values of the functions fi(x), ien where the {fi}
constitute a complete set of invariants for E on X. If the corresponding
complete invariant f is restricted such that its image is itself a
subset of X, then we have specified a set of canonical forms C for
E on X. To be more precise, a canonical form C for X under E is a
member of a subset CcX such that: (1) for every xeX there exists one and
only one cEC for which xEc, and since C is the image of a complete
invariant f, then (2) for any xeX and cl, c2EC, xEcl,and xEc2 implies
f(x)=f(cl)=f(c2)=cl=c2 (invariance); (3) for any ceC if f(x1)=c and
f(x2)=c, then x1Ex2 (completeness). Thus, c=f(x) is a unique member of

This notation is actually f=(fl,...,f ):x+Y1...xYn but it is
shortened when the set {fi} is clearly understood.

FR~-"YI i

E(x) for every xeX. With these definitions in mind, our "formula"


(i) Formulate the set of elements;

(ii) Define the transformation group;

(iii) Determine a set of complete invariants under this

transformation group; and

(iv) Develop the canonical form in terms of the corresponding

invariants. (2.2-1)

We now apply (2.2-1) to various restricted sets related to multivariable

systems. This approach is essentially given in Kalman (1971a),Popov

(1972), Rissanen (1974), or Denham (1974). In this sequel we review the

main results of Popov. First, define the set of matrix pairs (F,G)


X = {(F,G)IFeKnxn, GeKnxm; (F,G)controllable}

The general linear group, which corresponds to a change of basis

in the state space, is specified by the set

GL(n):= {TITeKnxn; det TO} (2.2-2)

with the group operation standard matrix multiplication, i.e.,

To T = T T.

In order to determine the orbits of X under the action of GL(n),

it is first necessary to specify the action operator "+"

T + (F,G):= (TFT-1,TG)

In general the problem of determining a canonical form is quite
difficult. However in this dissertation we consider restricted sets
which make the problem much simpler. For a thorough discussion of this
problem see Kalman (1973).


or alternately we can say that the action of GL(n) on Xo induces

F + TFT-1
G + TG

The action of GL(n) induces an equivalence relation on Xo. We

indicate (F,G)ET(F,G) if there exists TeGL(n) such that (T,G)=T+(F,G).

Dual results are defined for the observable pair (F,H) and the
analogous set denoted by X .
The third step of (2.2-1) is established in Popov (1972), but
first consider the following definitions. For a controllable pair (F,G)

define the j-th controllability index p ., jEm as the smallest

positive integer such that the vector F 3gj is a linear combination of
its predecessors, where a predecessor of Figj is any vector Frgs where
we have assumed p(G) = m. Throughout this dissertation we use the

following definition of predecessor independence: a row or column vec-
tor of a given array is independent if it is not a linear combination of
its regular predecessors. The following results were established by

Popov (1972)

Proposition. (2.2-3) (1) The regular vectors are linearly independent;
(2) The controllability indices satisfy the
following relationship, E j = n; (3) Ti.ere exists
exactly one set of ordered scalars, ajkscK defined

for jem, kej-l, s = O,l,...,min(pj ,k-1) and for jEm,

k = j,...,m,s = 0,l,...,min(pj,pk) 1 such that

tThroughout this dissertation we use the overbar on a set to denote the
dual set.
ttThese indices are also called the Kronecker indices.

j-1 min(sj.,k-l) m min(,ljpk)-1
F jg. Z Z ajks gk + E ajks Fk'
3 k=1 s=O k=j s=O

This proposition follows directly from the controllability of (F,G) and
indicates that the regular vectors forma basis where the a's are the
coefficients of linear dependencies. The set [{Pj},{ajks}], j,kem,
s=0,...,j.-I are defined as the controllability invariants of (F,G),
and v=max(pj). The main result of Popov is:

Proposition. (2.2-4) The controllability invariants are a complete
set of independent invariants for (F,G)eX under
the action of GL(n).

The proof of this proposition is given in Popov (1972) and consists of
verifying the invariance, completeness, and independence of [{1j},{ajks}].
Invariance follows directly from Proposition (2.2-3), since (F,G)ET(F,G),
then T'gk can be replaced by TFSgk in the given recursion and the
controllability invariants remain unchanged. Completeness is shown by
constructing a TeGL(n) such that for two pairs of matrices (F,G),
(F,G)eX with identical controllability invariants, (F,G) = (TFT1, TG)
or (F,G)ET(F,G). Independence of the controllability invariants is
obtained by constructing a canonical form determined only in terms of
these invariants. Thus, by introducing a finite set of indices {pj},
Popov shows that this set along with the {ajks} are invariants under the
action of GL(n). The main reason for specifying a set of complete and in-
dependent invariants is that it enables us to uniquely characterize the
orbit of (F,G). It should also be noted that dual results hold for the observ-
able pair (F,H), and it follows that the observability invariants are the

~~I __L


set [{vi},{ ist}], i,sep, t=O,...,v.i- where the {vi}are the observa-

bility indices.

The last step of (2.2-1) is to specify the corresponding canonical

forms under GL(n). These forms are commonly called the Luenberger

forms and are specified by the controllability and observability

invariants. They are defined by the pairs (FC,Gc), (FR,HR) where the

subscripts C,R reference the fact that the regular vectors span either

the columns of W+,1 or the rows of V +l'
p + 1 H+ 1

FC = [t2 l1 2 .. emm] (2.2-5)

GC = el e ql+l1 e l+l

qj = s s jem
Ss=1 s

~ T

e1 T

T -J

FR R = (2.2-6)
er P-+2 __r 0+1


ri = Vs iE2

-i---------*-------i--_~-_ --

where aj, i are n column, n row vectors containing {aI {Bi s}

respectively over appropriate indices and zeros in the other places.

Luenberger (1967) shows that the transformation, TC, required to

obtain the pair (FC,Gc) is determined from the columns of W as

TC =[T1 T2 ... Tm (2.2-7)
where Tj =[gj Fg lF1j] jFm

pCG) = m, and gj is the j-th column of G.

Similar results hold for the pair (FR,HR) and is specified by TR

constructed from the rows of V .

Unfortunately Luenberger (1967) in attempting to develop multi-

variable system representations did not determine the invariants under

GL(n). It is essential to use the approach outlined in (2.2-1) in

order to obtain the corresponding canonical forms or else it is possible

to obtain erroneous results. The following example due to Denham (1974),

shows that the Luenberger form, as originally stated is not canonical.

If we are given the pair (F,G) as

0 0 1 1 1 0

1 0 2 1 0 0
F = G
0 1 2 1 0 0

0 0 1 1_ 0 1

These matrices are in the form of (2.2-5), but it is easliy verified

by constructing TC that the controllability invariants are in fact

pl=2, P2=2 and 21 = [-1 -1 -1 1], a2 = [-2 0 -2 4T1 The problem

with the Luenberger forms is that the maps 7: X0+ XO/E are not well

defined. Thus, the image of the maps are indeed canonical forms, but

as shown here for (F,G)eXO/E, we need not have rT(F,G)=(F,G), i.e., the

mapping does not leave the canonical forms unchanged. The point to
remember is that the invariants are the necessary entities of interest

which must be determined.

The procedure to construct the transformation matrix TC of (2.2-7)
is called the Luenberger second plan. The first Luenberger plan
consists of examining the columns of ,n' given by

n = Wn = [gl ... Fn-g1 "' gm Fn- m] (2.2-8)

where Y is an nmxnm permutation matrix, for predecessor independence.
Thus, we can define a new set of invariants (under GL(n)) [{f }, {ajks}]
completely analogous to the controllability invariants. The canonical

forms associated with the invariants obtained in this fashion have

tThis procedure amounts to examining the column vectors of Wn for
predecessor independence, i.e., examine g1 ... g Fg1 ... Fg .
Fn-191 ... Fn-lgm.

i I

become known as the Bucy forms which were derived directly from the
results of Langenhop (1964), Luenberger (1967), and Bucy (1968). We
refer the interested reader to these references as well as the recent
survey by Denham (1974). Here we will be satisfied to note that the

procedure of (2.2-1) applies with the set of controllable pairs (F,G)

restricted to the {j} invariants rather than {pj}. Analogous to the
Luenberger forms, we define the row and column Bucy forms as (FBR,HBR),

(FBC,GBC) respectively. The row form is given by

L eT
11 2-1
L21 L22 He +1
FBR BR (2.2-9)

Lpl Lp L eT + +..+p+1


I ..-1 V
L.. = ---- ; L.. = ]
11 ST 13 -T"
L Bii ij

v. > 0 and satisfy E v =n ;

B, j are v.,vj row vectors containing ist } invariants.

The transformation, TBR,required to obtain the pair (FBR,HBR) is

BR = [T T F B T (2.2-10)

B i i

The importance of the Bucy form is that the characteristic equation can

be found by inspection of the block diagonal arrays of FBRt. Since FB
is block lower triangular, the characteristic equation is given as

XFBR(z) = det(Iz-FBR) = (z) ... XL (z) (2.2-11)

where the Lii are the companion matrices of (2.2-9). Similar results

hold for the pair (FBCGBc) and the transformation is specified by TBC

constructed from the columns of Wn.
This completes the discussion of invariants and canonical forms for
controllable or observable pairs. To extend these results to matrix

triples (internal system description), it is more convenient to examine
an alternate characterization of the corresponding equivalence class--
the Markov sequence of (2.1-4). This approach was used by Mayne (1972b)
and Rissanen (1974), in order to determine the orbits of Z under GL(n).

It is obvious that the sequence is invariant under this group action

Aj = (HT-1)(TFT-1)j-1(TG) = HFj-1G (2.2-12)

Consequently every element of Aj can be considered an invariant of Z

with respect to GL(n); therefore, two systems which are equivalent under

GL(n) possess identical Markov sequences. The converse is also true, i.e.,

any two systems with identical Markov sequences are equivalent.
The standard approach to investigate a system characterized by its
Markov sequence is to form the Hankel array, SN,N, where we define S ,

It should be noted that the Bucy form is not a canonical form if the
transformation group includes a change of basis in either input or output
spaces, while the Luenberger form is still a canonical form.

iAN and S., jeN' as the block rows and columns of SN,N, and the
block column and row vectors, a or aT denote the r-th column of S
.r s. ,.
or the s-th row of S,1 for remN', spN. Rissanen (1974) has shown
that by examining the set
X1 = {2 I E controllable and observable with {ii} invariants}
under the action of GL(n) that

Proposition. (2.2-13) The set of controllability invariants and block
column vectors, [{1j},{ajks},{a.i}] for the

appropriate indices constitute an algebraic basis
for any SeX1 under the action of GL(n).

The proof of this proposition is given in Rissanen (1974) and consists of
showing that any two members of X1 with identical Markov sequences
are equivalent under GL(n). Thus, invariance follows by showing that a
dependent column vector of the Hankel array can be uniquely represented
in terms of the set [{CI-},{acjks}]. These parameters remain unchanged
under GL(n); therefore, they are invariants. The block column vectors,
a.t satisfy a recursion analogous to (2.2-3), i.e.,

n-I min(j,Pk-1) m min(lj,Pk)-l

a.j+m. = aksa.j+ms + zE jksa.j+ms
k=l s=O k=j s=O

Thus, all dependent block columns can be generated directly from the set,
{a.t} of regular block column vectors. These vectors are invariants under
GL(n), since every column vector of Aj is an invariant as shown in
(2.2-12). Completeness follows immediately from the above recursion,
since any two members of X1 possessing identical invariants satisfy the
above recursion and therefore have identical Markov sequences.

Independence is shown by constructing the Luenberger form of (2.2-5) and (2.2-14)

below* directly from these invariants.

The dual result yields another basis on X1, [{i},{},{a ].

The corresponding canonical forms for ZeX1 or X1 are given by the

Luenberger pairs of (2.2-5,2.2-6) and

HC = [a.1 ... a.(l )m+ll.. la.m .. a. mm (2.2-14)






and the canonical triples are denoted by EC and ZR respectively.
Rissanen (1974) also shows that a canonical form for the transfer

function can be constructed from the invariants of (2.2-13). This is

possible because the determination of canonical forms for Z based on the

Markov parameters is independent of the origin of Ak's. Rissanen defines
the (left) matrix fraction description (MFD) as

T(z) := B-1(z)D(z) (2.2-15)

where B(z) = z Bi.z for IB \f 0
i=O '
D(z) E Diz .
i=0 I

The relationship of the MFD to the Hankel array, S +1,+l ,follows by
writing (2.2-15) as

B(z)T(z) = D(z) (2.2-16)

and equating coefficients of the negative powers of z to obtain the


BAj + B1Aj+I + + BA =j j=1 ,2...

expanding over j gives the relation over the block Hankel rows as

[B ... B] B S T p ) (2.2-17)

v+ ,.

where the pxp(v+l) matrix of Bi's is called the coefficient matrix of

B(z). Similarly equating coefficients of the positive powers of z
gives the recursion

Dk = Bk+A + Bk+A + ... + BA-k k=0,l,...,v-1

or expanding over k gives the relation in terms of the first block Hankel
column as

v-1 Bv
D2 B B
Dv-2 v- v S (2.2-18)

DO Bl B2 ... B

The canonical forms for both left and right MFD's are defined by the
polynomial pairs (BR(z),DR(z)) and (BC(z),DC(z)) respectively, where
R and C have the same meaning as in (2.2-5,2.2-6) and the former is
given by

BR(z) := :




; bT K
ki Kp



b = OTk

where k=i+pvi and Bkj ar

S {ist

Bkj : 0

1 k2 *"'* k(i+pvi) l0-i] i

e given by




and DR(z) is determined from (2.2-18).

Dual results hold for the corresponding column vectors, bj, jem of the
coefficient array of IC,(z) in terms of the controllability invariants.
This completes the discussion of canonical forms for Z or T(z).
Note that analogous forms can easily be determined for the Bucy forms

if X1 is restricted to {vj}. Henceforth, when we refer to an invariant

system description, we will mean any representation completely specified
by an algebraic basis. In the next section we develop the theory
necessary to realize these representations directly from the Markov

2.3 Canonical Realization Theory

In this section we develop the theory necessary to solve the

canonical realization problem, i.e., the determination of a minimal

realization from an infinite Markov sequence, directly in a canonical

form for the action of GL(n). Obviously from the previous discussion,

this solution has an advantage over other techniques which do not obtain

Z in any specific form. From the computational viewpoint, the simplest

realization technique would be to extract only the most essential

information from the Markov sequence--the invariants under GL(n). Not

only do the invariants provide the minimal information required to

completely specify the orbit of E, but they simultaneously specify a

unique representation of this orbit--the corresponding canonical form.

Thus, subsequent theory is developed with one goal in mind--to extract

the invariants from the given sequence.

The following lemma provides the theoretical core of the subsequent


Lemma. (2.3-1) Let VN and WN, be any full rank factors of SN,N, = VN WN'

Then each row (column) of SNN,, is dependent iff it is a

dependent row (column) of VN (WN,).

Proof. From the factorization SN,N, = VN WN, it follows if the j-th

row of SN,N, is dependent, then there exists an eKpN, iaj0

such that
SN,N' = OmN'
Since p(WN,)=n, i.e., WN, is of full row rank, it follows that

or aTVN(WN' N) = OT N'

but det (WNWN' ) 0; thus, a = Om0N, i.e., a dependent row
of SN,N, is a dependent row of VN. Conversely assume that there.
exists a nonzero a as before such that
aV = 0 T
VN -mN'
Since p(WN,)=n, it follows that this expression remains unaltered
if post-multiplied by WN,, i.e.,
aVNN, = iN'

and the desired result follows immediately.V

The significance of this lemma is that examining the Hankel rows

(columns) for dependencies is equivalent to examining the rows (columns)

of the observability (controllability) matrix. When these rows (columns)

are examined for predecessor independence, then the corresponding

indices and coefficients of linear dependence have special meaning--

they are the observability (controllability) invariants. Thus, the

obvious corrollary to this lemma is

Corollary. (2.3-2)

If the rows of the Hankel array are examined for

predecessor independence, then the j-th (dependent)

row, where j=i+pvi, iep is given by
i-l min(vi,vs-1) p min(v,v s)-l
-J = ist-s+pt + s istys+pt
s=1 t=0 s=l t=0

where{Bist an:d{vi} are the observability invariants

and -, kcpN is the k-th row vector of S N,N'

Proof. The proof is immediate from Proposition (2.2-3) and Lemma (2.3-1).V

Note that similar results hold for the columns of the Hankel array when

examined for predecessor independence.

In the solution to some problems knowledge of both controllability

and observability indices are required. Moore and Silverman (1972)

require both indices to design dynamic compensators in order to solve

the exact model matching problem. Similarly the requirement exists in

the design of pole placement compensators and also stable observers as

indicated in Brausch and Pearson (1970) and more recently Roman and

Bullock (1975b). In an on-line application Saridis and Lobbia (1972)

require the controllability invariants as well as the observability

indices to solve the problem of parameter identification and control.

The latter case exemplifies the fact that in some instances it is first

necessary to determine the structural properties of a system from its

external description prior to compensation.

The need for an algorithm which determines both sets of controllability

and observability invariants from an external system description is

apparent. Computationally the simplest and most efficient technique to

determine these invariants would be some type of Gaussian elimination

scheme which utilizes elementary operations (e.g., see Faddeeva (1959)).

If we perform elementary row operations on VN such that the predecessor

dependencies of PVN are identical to those of VN and perform column

operations on WN, so that WNE and WN, have the same dependencies then

examination of SN,N' = PSN,NE is equivalent to the examination of SN,N'

We define SN,N as the structural array of SN,N,. This array is

specified by the indices {vi} and {p.} which are the least integers such

that the row and column vectors of SNN are respectively,
ii **i

rnonzero fa=Q,...,v-1

zero J la=, ..,N- J

Snonzero b=0,...,p -1
a = for 1
S zero b= ..,N-

These results follow since SN,N has identical

as SN,N,, then




for s=j+mb

predecessor dependencies

where T= 0 if it depends on its predecessors. To find the observability

indices, let a be the index of the last nonzero row of 6i+p t=0,...,N-.
Then if T6 = 0 v = 0 otherwise v. = (a-i)/p+l. Similar results
follow when SNN is expressed in terms of the g The following theorem
specifies the matrices P and E required to obtain SN,N.

Theorem. (2.3-3) There exist elementary matrices P and E, respectively

lower and upper triangular with unit diagonal elements,

such that SN,N=PSN,N.E has identical predecessor

dependencies as SNN,N

Proof. Let PS N,N=Q where Q is row equivalent to SN,N, and P=[prs].

If the j-th row of SN,N, is dependent on its predecessors, i.e.,

T j-1 T T
T. + Z a = 0
3 k=l jk-k -

then selecting P lower triangular such that


0O r Prs = 1 r=s
ajk r>s

gives this relation. From this choice of P it follows that
dependent rows of SN,N, are zero rows of Q. If the j-th row
of SN,N, is regular, then P unit diagonal-lower triangular

insures that the corresponding row of Q is nonzero and regular.

Similar results hold for the columns of SNN with E unit diagonal-
upper triangular.
This choice of P does not alter the column dependencies of SN,N;
for if the i-th column of SN,N, is dependent on its predecessors,
then from Corollary (2.3-2) .i is uniquely represented as a
linear combination of regular vectors in terms of the control-
lability invariants. Since P is unit diagonal-lower triangular,

it is the matrix representation of a nonsingular linear

transformation, Pir=q. where _q is the i-th column vector of Q.

Thus, multiplying on the left every vector Li in (2.3-2) with
this P gives for i=j+mpj
j-1 min(P ,k"-1) m min(j ,pk)-1
=i = Z Z ajksq.k+ms + E Z j ksqk+ms
k=l s=O k=j s=O

Thus, we have shown that selecting P with the given structure
does not alter the predecessor column dependencies of SN,N' or

equivalently Q. Since the column vectors of Q satisfy the

above recursion, SN,N, and Q have identical predecessor column
dependencies, therefore, performing column operations on Q is
analogous to performing them on SN,N, and so we have S =

(PSN,N,)E = QE or the predecessor dependencies of SN,N, and SN,N,
are identical.V

This theorem shows that the indices can be found by performing a sequence

of elementary lower triangular row and upper triangular column operations

in a specified manner on the Hankel array and examining the nonzero rows

and columns of SNN., the structural array of SN,N,. The {ajks} and

({ist} are also easily found by inspection from the proper rows of P and
columns of E as given by

Corollary. (2.3-4) The sets of invariants {8ist',{jiks} or more compactly

the sets of n vectors {. )},{a} are given by the rows

of P and columns of E in (2.3-3) respectively as

i = [qrqr+p "' Pqr+p(v-l)' 1 i+i ir

a. [estes+mt ... es+m(.j. )tT t=mtj+j, j,sm

1 q=r, r=s
Pqr'est qt

Proof. The proof of this corollary is immediate from Theorem (2.3-3).V

We can also easily extract the set of invariant block row or column

vectors, {a },{a k from the Hankel array and therefore, we have a

solution to the canonical realization problem.

Theorem. (2.3-5) If the generalized Hankel submatrix of rank n is

transformed by elementary row operations to obtain a row

equivalent array, then by proper choice of P the matrix Q

is given by:

TG I TFG ... TF TH,,,
S= --------------------- __ --
nPn-N I pn-N
LmN' L mN'

where (F,G) is a controllable pair and det TO.

Proof. .If x is a minimal realization, then it is well-known that

p(VN)=P(WN,)=n. Since P is an elementary array, then it follows
[PVN]-min[p(P),p(VN)]=n; thus P can be chosen such that

PV -- and det TfO.
N pN-n
Post multiplication by WN, gives

"VN"''N' pRn N' 1
PVN N' =FG] PSN,N' := Q
Multiplication of the arrays gives the desired results.V

Corollary. (2.3-6) If P is selected such that Q is as in (2.3-5) with the

pair (F,G) in Luenberger column form, then the set

of invariants {a.}, jem is given by the columns of
W N wk, kemN' with

-k = k=pjm+j

Proof. If P is selected in Theorem (2.3-5) such that T=T then it
follows that each column of WN, corresponding to the (j+mpj)-th
for each jem contains the {ajks} invariants.V
J ks

The method of selecting P is given in the ensuing algorithm.

Theorem. (2.3-10) Given the infinite realizable Markov sequence
from an unknown system, then EC=(FC,GC,H)n is a
minimal canonical realization of {Ak} with

Fc = [W* I W* 1 ... I W]
C 1 2 Wm
GC is a submatrix of (W+1i)C given by the first m
HC = [a.1 ... al+m(ll) I ... I a .. a m

and W = [wj+ ... w jm], jsm, w is a column
vector of (Wy+1)C.

Proof. Since the sequence is realizable, there exist integers, n,v,p,
satisfying Proposition (2.1-5). If Q is given as in Corollary
(2.3-6), then

(W k)C GC ... FclGC
Q = -------- = -------------------- for k>p+l
0Pv-n pv-n
L mk Omk
Thus, GC is obtained immediately from the first m columns of

(Wk)c. Form two nxn arrays, A and A each constructed by
selecting n regular columns of (Wk)C starting with the first
column for A and the (l+m) column for A The independent
columns of (Wk)C are indexed by the pj and satisfy (2.3-8);
thus, they are unit columns and A is a permutation matrix, i.e.,

A = [w ... w I Wl+m ". 2m I .. I .. +m(p.-) ...], jem

Theorem. (2.3-10) Given the infinite realizable Markov sequence
from an unknown system, then EC=(FC,GC,HC)n is a
minimal canonical realization of {Ak) with

FC [ W I W I ... I Wm

GC is a submatrix of (W.+1)C given by the first m
HC = [a.l ... a.l+m(Pll ) I ... a ... am ]

and W = [j+m ... j+m], jm, w is a column
vector of (W +1)C.

Proof. Since the sequence is realizable, there exist integers, n,v,p,
satisfying Proposition (2.1-5). If Q is given as in Corollary
(2.3-6), then

F(Wk)CGC | ... | FC GC
Q = ------- = ----------------- for k>i+l
mk mk
Thus, GC is obtained immediately from the first m columns of

(Wk)C. Form two nxn arrays, A and A each constructed by
selecting n regular columns of (Wk)C starting with the first
column for A and the (l+m) column for A The independent
columns of (Wk)C are indexed by the pj and satisfy (2.3-8);
thus, they are unit columns and A is a permutation matrix, i.e.,

A = [w1 ... w~ l+m 2m ... j+m( 1) jem

where it follows from (2.3-8) that the columns of A form chains
[w. ...w. j = [eq 1 ... e ] for qj= E P..
3 -34+m1( y1) -qj-1+ll -s=1

Since A is A shifted m columns to the right, each chain of A
is given by [wAj+ ... w J+m ] and again each column is unit
-j- -j+mp.
3 T
except wj+m = aj from Corollary (2.3-6). Thus, FC := A

gives the matrix of (2.2-5). HC is obtained directly from

H G I ... I F G = [a.1 ... a.m I .. a.l+m ... am(k+l)]

since multiplication by the unit columns of (Fc,Gc) select the
n columns of H .V

Analogous results hold for the dual ZR. It should also be noted that
if the Hankel array is transformed to SN,N' and both rows and columns
examined for predecessor independence as before, i.e.,

NN := N,N' U N (2.3-11)

where WN is given in (2.2-8) and T is a permutation array, then all of the
previous theory is applicable. The only exception in this case is that
the Bucy invariants and forms given by IBR and IBC are obtained instead of
the Luenberger forms. These results follow directly from (2.2-1).
In many applications the characteristic polynomial XF(z) is required.
Many efficient classical methods (e.g., see Faddeeva (1959)) exist to
determine XF(z) from the system matrix. Even more recently some
techniques have been developed to extract the characteristic polynomial


from the Markov sequence, but in general they are only valid in the cyclic

case (see Candy et al. (1975)). An alternate solution to this problem

is to obtain the Bucy form and use (2.2-11) to find XF(Z) by inspection.

It is possible to realize the system directly in Bucy form as mentioned in

the previous paragraph, but in this dissertation we prefer to take

advantage of the structure of the Luenberger form to construct TBR or

TBC. Superficially, this method does not appear simple because the

transformation matrix and its inverse must be constructed, but the

following lemma shows that TBR can almost entirely be written by inspection

from the observability invariants after the {I.} are known.

Lemma. (2.3-12) The transformation matrix TBR, such that

-1 -1

is given by

T = T T
BR B= [T .. B I
1 p

If the given triple is in Luenberger form, ZR' then the

(.ixn) submatrices TB are

eT ei-+1

S-r+' +2


T > i-V

T-i> R


v'.~ are the observability invariants of ER

vi are the invariants associated with EBR and
recall ri = E v ro=O.
1 s s o
Proof. This lemma is proved by direct construction of the T 's.
B i
Since each T satisfies for v.'v.
F 1 1

TB. = F R-1
i h1F'

then analogous to

and therefore
hiFR =

h.F 'i
h FR

i-l T
property (2.3-8), it follows that h.F =e

i^R RR -r

(hiFRi )FR = eFR = i

v v.-V.-l1 v.-v.-l'
= (hiFR )FR = ~.FR .

In order to construct TBR it is first necessary to find the {ui} from the

rows of [V ]R, but in this case the i 's can generally be found by

inspection while simultaneously building TBR. Also, TBR is generally a

sparse matrix with unit row vectors; therefore, the inverse can easily be

found by solving T RT-B = n directly for the unknown elements of TB .

In the next section we develop some new algorithms which utilize
the theory developed here.

2.4 Some New Realization Algorithms
In this section we present two new algorithms which can be used to
extract both observability and controllability invariants from the given
Markov sequence. Recall from Theorem (2.3-3) that performing row operations
on the Hankel array does not alter the column dependencies, however, it

is possible to obtain the row equivalent array, Q in a form such that
the controllability invariants can easily be found.
The first part of the algorithm consists of performing a restricted
Gaussian elimination (see Faddeeva (1959) for details) procedure on the
Hankel array. This procedure is restricted because there is no row or
column interchange and the leading element or first nonzero element of
each row is not necessarily a one. Define the natural order as 1,2,....

Algorithm. (2.4-1)
(1) Form the augmented array: [IpN I SN,N' I ImN
(2) Perform the following row operations on SN,N. to obtain

EP Q I ImN,:

(i) Set the first row of Q equal to the first Hankel row.
(ii) Search the first column of SN,N, by examining the rows in
their natural order to obtain the first leading element.
This element is qjl.
(iii) Perform row operations (with interchange) to obtain qkl=0,k>j.

tAlternately it is possible to extract the Bucy invariants from Q by
reordering the columns as in (2.2-8) to obtain (=QU and examining
the columns for predecessor dependencies.

(iv) Repeat (ii) and (iii) by searching the columns in their
natural order for leading elements.

(v) Terminate the procedure after all the leading elements have
been determined.

(vi) Check that at least the last p rows of Q are zero. This assures
that the rank condition, (R) is satisfied.

(3) Obtain the observability and controllability indicest as in

Theorem (2.3-3).

(4) Obtain T iep from the appropriate rows of P as in Corollary (2.3-4)
T *
and bi as in (2.2-19) where Bij-pij

(5) Perform the following column operations on Q to obtain [P SN,, E]:

(i) Select the leading element in the first column of Q, qjl.
(ii) Perform column operations (with interchange) to obtain
qjs=0 for s>l.

(iii) Repeat (i) and (ii) until the only nonzero elements in each row
are leading elements.

(6) Obtain cj, jem from the appropriate columns of E as in Corollary

(2.3-4) and -b from the dual of (2.2-19).
(7) From the invariants construct the Luenberger and MFD forms as in

Section (2.2).

If we also require the characteristic polynomial, then we must include:
(8) Determine the {.}, iep and (simultaneously) construct TBR as in
Lemma (2.3-12).
-1 1
(9) Find TBR by solving for the non unit rows in TT = In

Note that the leading elements have been selected from the rows by examining
the columns in their natural order; therefore, the dependent columns are
not zero as in (2.3-3), but are easily found from this form of Q by
inspection. It should also be noted that the leading elements could have
been selected in the j, (j+m), (j+2m)... columns; therefore, facilitating
the determination of the Bucy invariants and forms.
Alternately the {(3v}, jcm and TBC could be used. These indices can be
found easily from the columns of Q.


If we consider the alternate method implied in Corollary (2.3-6), then

the following modifications to the preceding steps are required:

(1)* Start with the following augmented array:

[IpN I SN,N.]

(2)* Obtain [P I Q] as before.

(5)* Perform additional row operations on Q to obtain unit

columns for each column possessing a leading row element, and

perform row interchanges such that (2.3-8) is satisfied

for each jem, i.e., obtain

Q :------n

(6)* Obtain the aj, jem, as in (2.3-6).

It should be noted that these algorithms are directly related to

those developed by Ho and Kalman (1966), Silverman (1971), or Rissanen

(1971). As in Ho's algorithm, the basis of the first technique is

performing the special equivalence transformation of Theorem (2.3-3)
on SN,N, to obtain SN,N,. The second technique accomplishes the same

objectives by restricting the operations to only the rows of SN,N, which

is analogous to either the Silverman or Rissanen method. The initial

storage requirements in the first method are greater than the second if

mN'>pN, since P and E can be stored in the same locations due to their

lower and upper triangular structure; and (2) P will be altered in the

second method, since row interchanges must be performed in (5)*; whereas,

it remains unaltered in the first method which may be important in some

applications. Consider the following example which is solved using both



Example. (2.4-2)

Let m=2, p=3, and the Hankel array be given as, S4,4

1 2 2 4 4 8 8 16
1 2 2 4 6 10 13 22

1 0 1 0 3 2 6 6

2 4 4 8 8 16 16 32
2 4 6 10 13 22 28 48

1 0 3 2 6 6 13 16
S = --------------------------
S 4,4
4 8 8 16 16 32 32 64

6 10 13 22 28 48 58 102

3 2 6 6 13 16 27 38
8 16 16 32 32 64 64 128
13 22 28 48 58 102 19 214

6 6 13 16 27 38 56 86

(1) [112 I S4,4 I 8

(2) Performing the row operations as in (2.4-1), obtain [P I Q I 8
where the leading elements are circled,

i] O 2 2 4 4 88 16:
-1 .1 00 00 2 5 6
1 0 -1 -4 0 -5 -7
-2 0 1 0 0 0 0 0 0 0 0
S0 D 1 I 0( 2, 0 1 1
11 1 0 -] 1
-4 0 0 0 0 0 1
-3 0 -1 0 -1 0 0 1
0 1 -2 0 -1 0 D 0 1 07
-8 0 ,0 0 0 0 0 0 0 1
-8 1 -2 0 -2 0 0 0 0 0 1
-1 2 -3 0 -2 0 0 0 0 0 0 1

(3) The indices are obtained by inspection from the independent rows
and columns of Q in accordance with Theorem (2.3-3) as:

v1 : 1

v2 2

v3 =1

P1 = 3

112 = 1

and p(S2,3) = p(S3,3) = p(S2,4) = 4 satisfying (R).

(4) The Ti and bi are determined from the appropriate rows and columns
of P as:

P45 P43] =

P85I P83 =

P65 P633 =

[2 I 0 0]

[3 I 0 1 1]

(-1 -1 1 I 1]

b = 1 P41 P42 P43 P44 I 0 [0 1-2 0 0 1 ]

- = Cp81 P82 P83 P84 P85 P86 P87 P881 01] -3 0-1 0 -1 0 0110]

T = [ IP61 62 P63 P64 P65 P66] = [1 0 -1 1]
-i" 3 i6 p6 p6 1 ]

(5) Performing the column operations, obtain the structural array
SN,N and E as:

[P I S4,4 ) E] where the leading elements are circled,,

= -EP41
= -[P81

= -[p61



I P62

0 000 0 0 0

0 0 0 0 0 0 0 0

0 o0 ( 0 0 0 0 0


1 -2 -1 1 -4 7

S- 2 4 4

1 -1 T -
1 0 0 0 0

1 -1 -- -3

0 1 0 0

1 0


(6) The a. and b. are determined from the appropriate rows and columns

of E as:



















' 2




















1 = N


(7) The canonical forms of ZR, BR(z), DR(z) and EC, BC(z), iC(Z) are:









1 2

1 2

2 4

1 0

z2-2z 0

-3 z2+z

z -z2+z




; DR() =

z 2z

z+1 2z+2

0 -2z

FC = e2 e3 -1 a 2

GC = [el 4]

HC = [a.1 a.3 a5 1 a 2]
C a1.

z.- [ z2 + 'Z + 4
BC(z) =


_Z3, 2
-z +z

2 4 2

2 6 2
1 3 0

; c(z) =

(8) The {vi} and TBR are determined simultaneously as:

and TBR





1 0 0 0

0 1 0 0

0 0 1 0

3 0 1 1

BR(z) [

- I-
+ i-

+ 3




z2- Iz

z2- 2z




(9) TB1 is given by solving

1 0
T-1 0 1
BR 0 o0
-3 0

(10) Find FBR and XF(z) as



the equations for the last row as:

0 0
0 0
1 0



XF(z) = (z-2)(z3-2z2+1) = z4_4z3+4z2+z-2



completes the first method. If the second method is used instead,

only (5)*, (6)*, and (8)* differ.

(5)* Performing the additional row operations and interchanges to

satisfy (2.3-8) gives:

5 1 1 0 5 7
-- 1 0 0 0 -I 0 -1 ~

5 3 1 1 1
-- 1 0 0 0 0 0 0 1 0 1 1 3
O- o- o o o o -$
0 -4

X 08

are determined

i; =w- '


the appropriate


0 -4


columns of Q as:

(6)* The


a =



This completes the algorithms. In the next chapter the first method is

modified to develop a nested algorithm from finite Markov sequences.



One of the main objectives of this research is to provide an

efficient algorithm to solve the realization problem when only partial

data is given. As new data is made available (e.g., an on-line

application, Mehra (1971)), it must be concatenated with the old

(previous) data and the entire algorithm re-run. What if the rank of

the Hankel array does not change? Effort is wasted, since the previous

solution remains valid. An algorithm which processes only the new data

and augments these results (when required) to the solution is desirable.

Algorithms of this type are nested algorithms.

In this chapter we show how to modify the algorithm of (2.4-1)

to construct a nested algorithm which processes data sequentially.

The more complex case of determining a partial realization from a fixed

number of Markov parameters arises when the rank' condition, abbreviated

(R), is not satisfied. It is shown not only how to determine the minimal

partial realization in this case, but also how to describe the entire

class of partial realizations. In addition, a new recursive technique

is presented to obtain the corresponding class of minimal extensions and

the determination of the characteristic equation is also considered.

3.1 Nested Algorithm

Prior to the work of Rissanen (1971) no earlier recursive methods

appeared in the realization theory literature. Rissanen uses a


factorization technique to solve the partial realization problem when

(R) is satisfied. His algorithm not only solves the problem in a
simple manner, but also provides a method for checking (R) simultaneously.

In the scalar case, Rissanen obtains the partial realizations, Z(K),

K=1,2,... imbedded in the nested problem of (2.1), but unfortunately

this is not true in the multivariable case. Also, neither set of

invariants is obtained.

The development of a nested algorithm to solve the partial

realization problem given in this dissertation follows directly from

(2.4-1) with minor modification. There are two cases of interest when

only a finite Markov sequence is available.

Case I. (R) is satisfied assuring that a unique partial

realization exists; or

Case II. (R) is not satisfied and an extension sequence

must be constructed.

The nested algorithm will be given under the assumption that Case I

holds in order to avoid the unnecessary complications introduced in

the second case. The modified algorithm is given below. The corresponding

row or column operations are performed only on the data specified


Partial Realization Algorithm. (3.1-1)

(1) Same as (1) and (2) of Algorithm (2.4-1) except (iv) is qkjfO

k>j if k is a row whose leading element has been specified.

(2) If (R) is satisfied for some M*=v+p, obtain the invariants as

before in (3), (4) of (2.4-1) and go to (5).If not, continue.

(3) Add the next piece of data, AM+1 and form S(M+1,M+1).
(4) Multiply S(M+1,M+1) by P. Perform row operations (if necessary)

using old leading elements to obtain Q (M+1,M+1). If (R) is

satisfied, continue. If not, go to 3.
(5) Perform column operations as in (5) of (2.4-1) and obtain the
invariants and canonical forms as in (6), (7). Go to 3.

Example (2.4-2) will be processed to demonstrate the modified algorithm

for comparison. Assume that the Markov parameters are sequentially

available at discrete times, i.e., Al is received, then A2, etc., and
the system is to be realized.

Example. (3.1-2) Let the Markov sequence be given

1 2] 2 4 [4 8
A = 1 -2 A = 2 4 A3 6 10 A4
_1 0 1 0 3 2-


S8 161 [16 32-
13 22 A5= 28 48
6 6 13 16

and apply the algorithm of (3.1-1). It is found that the rank condition

is first satisfied when A1, A2 are processed, i.e.,

(1) [I6 I S(2,2) 1 14

(2) Performing first row and then
obtain [P I S*(2,2) J E] or

1 0 0
-l 1 O

-2 0 0 1 0 0
-2 0 0 0 1 0 0
0 0 -1 0 0 1 0 0

column operation as in (3.1-1),

1 -2 -1
1 -
0 1

I _

(3) Indices are: v = 1 = 1
V2 =0 2 = 1
V =1

(4) :Invariants are: = -[P41 P43] = [2 ) 0]
__ -1'P 4 = [0 1 0]
S=-[P61 63 = 1

-N = [P41 P42 P43 P44 0] [-2 0 0 1 0 0]

b2 = [0 P21 P22 ] = [0 0 0-1 1 0]

S[P61 P62 P63 P64 P65 P66]= 0 0 -1 0 0 1]

e --1 e4 0
-i l13 114
3 1 e23 e24 -2
22 = ~ e 1 e
S24 2 33 34
-0 O_ e 44 1

(5) Canonical forms are:
1 2
FC I.1 2 GC = [ 1 I e2 HC = [a.1 I 2] = 1 2

z-1 0 1 2
Fc(z) = f C(z) =
-2" z-2 1 0

T P T -] -

FR =R IR 1
T 3 .
3 ~ ~_eT2_"3- -

where wTt = -[P 21 P23] =[1 0]

z-2 0 0 1 2
BR(z) = -2 z 0; DR(Z) = 0 0
0 0 z-1- 1 0

The rank condition is next satisfied when A1,...,A5 are processed,
.e, M =5 and we obtain [P (5,5) E as:
i.e., M =5 and we obtain [P | S (5,5) | E] as:

[P I Q(5,5)] =















() 2 2 4 4 8 8 16 16.32

0 0. 0 ( 2 5 61216

0 -1 -4 -1 -6 -2 -10 -3 -16
0 0 0 0 0 0 0 0

0 0 (J) 2 5 6 12 16
0 0 0 0 0 0 0 0

0 0 0 0 0 0

0 0 0 0 0 0

0 0 0 0 0 0

0 0 0 0

0 0 0 0

0 0 0 0

0 0

0 0

0 0

and performing the column operations give [S (5,5) I E~

wT is found easily from HRGR=A1 or solving for the second row of Hp,
wTGR [1 2].


0 1

0 0 1

0 0 0 1

0 0 0 0


0 1

0 0

0 0

1 -1

0 0

0 -1

1 -2

0 0

1 -2

2 -3

0 0
0 -4

0 -5


0 1

0 -1

0 0

0 -1

0 -1

0 0

0 -2

0 -2

0 0
0 0

0 0

) 0 0 0 0 0 0 0 0 0 1 -2 -1 -1 -i- 4 10 -12
4 2
0 0 0 0 @ 0 0 0 0 0 1 i 1 -L' -6 -14
0o( 0 0 0 0 0 0 0 1 -1 -- -3 -- -- 15 20

0 0 0 0 0 0 0 01 0 0 0 0 0 0

0 0 () 0 0 0 0 0 1 -1 -3 -6 -8

0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0. 1 0 0

0 0 0 0 0 0 1 0

0 0 0 0 1

0 0 0 0
0 0 0 0
a 0

a a

The results in this case are identical to those of Example (2.4-2).

Let us examine the nesting properties of this realization algorithm.

Temporarily, we resort to using data dependent notation for this

discussion with the same symbols as defined previously in the previous
sections, e.g., the minimal partial realization of order M is given by

Z(M) :=- (F(M),G(M),H(M)). Thus, Z(M+k) is a (M+k)-order partial

realization. We also assume for this discussion that E(M) is in row

canonical form; therefore, it can be expressed in terms of the set of
invariants, [{vi(M)},{ (M)},{aT(M)}]. If E(M) is an n dimensional,

minimal partial realization specified by these invariants, then there

are n regular vectors, -+pt(M) spanning the rows of S(M,M). Furthermore,

each dependent row vector, J.(M) is uniquely represented as a linear
combination of regular vectors in terms of the observability invariants

and it can be generated from the recursion of Corollary. (2.3-2). Similarly,

it follows from Proposition (2.2-13) that the dependent block row

vectors, aj (M) satisfy an analogous recursion. The following lemma
describes the nesting properties of minimal partial canonical realizations.

Recall that M is the integer of Proposition (2.1-6) given by M =v+p.

Lemma. (3.1-3) Let there exist an integer M*(M)IM such that the rank

condition is satisfied and let Z(M) be the corresponding
minimal partial canonical realization of {Ar}, reM
specified by the set of invariants [{v.(M)},{ist(M)},

{a (M)}]. Then

vi(M) = vi(M+k)
Bist(M) = ... = ist(M+k)

aT (M) = = aT (M+k)
j. J.

iff p(S(M,M))=p(S(M+1,M+1) = ... = p(S(M+k,M+k))
for the given k.

Proof. If v.(M) = ... = v.(M+k), etc., then the minimal canonical
1 1
partial realizations are identical, E(M)=E(M+1)= ... =E(M+k).

It follows that p(S(M,M))=dimE(M)=p(S(M+1,M+I))=dimE(M+I)=

Conversely, P(S(M,M))=P(S(M+1,M+1))= ... =P(S(M+k,M+k)) implies
dimZ(M)=dimE(M+1)=... =dimE(M+k). Since E(M) is a unique minimal

canonical partial realization, so is Z(M*). Furthermore, since

each realization has the same dimension, each realization has

has M*(M)=M*(M+1)= ... = M*(M+k) so that each canonical
realization is equal to Z(M*); therefore, Z(M)=Z(M+1)= ... =Z(M+k).V

Next we examine the case where E(M) and z(M+k) are of different
dimension. The nesting properties are given in the following lemma.

Lemma. (3.1-4) Let there exist integers, M*(M)M, M (M+k)-M+k such
that the rank condition is satisfied (separately) and
E(M), Z(M+k) are minimal partial canonical realizations
of {Ar} whenreM andrsM+k, respectively, for given k.
If p(S(M+k,M+k))>p(S(M,M)), then vi(M+k)-vi(M), iep.
Furthermore, a .(M+k)=a. (M), j=i,i+p,...,i+p(vi(M)-l).
j. j. 1
Proof. Since p(S(M+k,M+k))>p(S(M,M)), M*(M+k)>M*(M) and therefore,

Sv(M),(M) is a submatrix of Sv(M+k),,(M+k). If the j-th row

of Sv(M),p(M) is regular, it follows that the j-th row of

Sv(M+k),i(M+k) is also regular by the nature of the Hankel
pattern, i.e., the rows of Sv(M),p(M) are subrows of

Sv(M+k),p(M+k). The addition of more data (AM+1,...,AM+k) to
S(M,M) makes previously dependent rows become independent rows
but previously independent rows remain independent; thus, the
v.(M) can only increase or remain the same, i.e., v.(M+k) -
vi(M), iep. The set of regular {a (M+k)} are specified by the
vi(M+k)'s; therefore aT (M+k)=a (M), j=i,i+p,...,i+p(v.(M)-1),
( k'. 1
since vi(M+k)Qvi(M), iep.V

The results of these two lemmas are directly related to the nesting
properties of the partial realization algorithm. First, define JM as the
set of indices of regular Hankel row vectors based on M Markov parameters


available, i.e., J = {1,1+p,... ,1+p( .(M)-l),...,p,2p,... ,pv (M)}
and similarly denote the row vectors of P1 the elementary row matrix
of the previous chapter, by T(M). From Lemma (3.1-3), it follows
S* T T
that JM J+k and i+p(M)(M) =... i+ (M+k)(M+k) since

the observability invariants are identical. The vi specify the
elements in J and along with the Bist, they specify the elements of
Pi+p. (M)(M) (see Corollary (2.3-4)). From Lemma (3.1-4) it is clear

that JMJM+k since vi(M+k)Zvi(M).
Reconsider Example (3.1-2), to see these properties. In this
case we have M=2, k=3, M (2)=2, M*(5)=5, and p(S(5,5))>p(S(2,2)) as
in Lemma (3.1-4); therefore, J*cJ*, since J2 = {1,3} and J* = {1,3,2,5}.

The observability indices are identical except for v2(5)>v2(2); thus,
{a (2),a, (2)}c{aT (5),aT (5),aT (5),aT (5)} since aT (2) = aj (5)
1 3. I 3. 2. 5. J. j.
for j=1,3. We also know from Example (2.4-2) that Z(5) is the solution
to the realization problem and therefore the properties of Lemma (3.1-3)
will hold for {AM}, M>5. Table (3.1-5) summarizes these properties.
The results for the dual case also follow directly. We now proceed to
the case of constructing minimal partial realizations when.(R) is
not satisfied, i.e., the construction of minimal extensions.


Table. (3.1-5) Nesting Properties of Algorithm (3.1-1)

Augment M-4+k







where (R) is satisfied for some k and C means that the

corresponding invariants, vectors, or indices are nested or

contained in a set of higher order.


3.2 Minimal Extension Sequences

In this section we discuss the more common and difficult

problem of obtaining a minimal partial realization from a finite

Markov sequence when (R) is not satisfied. Two different approaches

for the solution of this problem have evolved. The first is based

on constructing an extension sequence so that (R) is satisfied

and the second is based on extracting a set of invariants from

the given data. We will show that these methods are equivalent

in the sense that they may both lead to the same solution. In order

to do this the existing algorithm is extended to obtain the more

general results of Roman and Bullock (1975a). Also a new recursive

method for obtaining the entire class of minimal extensions is

presented. It is shown that the existing algorithm does in fact

yield a particular solution to this problem which is valuable in

many modeling applications.

In the first approach, Kalman (1971b),Tether (1970), and

subsequently Godbole (1972) examine the incomplete Hankel array

to determine if (R) is satisfied. If so, the corresponding minimal

partial realization is found. If not, a minimal extension is con-

structed such that (R) holdsand a realization is found as before.

They show that a minimal extension can always be found, but in

general it will be arbitrary. They also show that this extension

must be constructed so that the rank of S(M,M) remains constant

and the existing row or column dependencies are unaltered.

Considerable confusion has resulted from the degrees of freedom

available in the choice of minimal extensions. In fact, initially,

the major motivation for construction an extension was that it

was necessary in order to be able to apply Ho's algorithm. Un-

fortunately, these approaches obscure the possible degrees of

freedom and may lead to the construction of non-minimal extensions

as shown by Godbole (1972).

Roman and Bullock (1975a)developed the second approach to

the solution of this problem. They show that examining the columns

or rows of the Hankel array for predecessor independence yields

a systematic procedure for extracting either set of invariants

imbedded in the data. They also show that some of these would-be

invariants are actually free parameters which can be used to

describe the entire class of minimal partial realizations. These

results precisely specify the number and intrinsic relationship

between these free parameters. Unfortunately Roman and Bullock

did not attempt to connect their results precisely with those in

Kalman (1971b),Tether (1970). It will be shown that this connection

offers further insight into the problem as well as new results

which completely describe the corresponding class of minimal extensions.

Before we state the algorithm to extract all invariants available

in the data, let us first motivate the technique. When operating

on the incomplete Hankel array, only the elements specified by the

data are used. It is assumed that the as yet unspecified elements

will not alter the existing predecessor dependencies when they are

specified by an extension sequence. Since the predecessor dependencies

are found by examining only the data in S(M,M), we must examine

complete submatrices of S(M,M) in order to extract the invariants

associated with a particular chain (see Roman and Bullock (1975a)).

Therefore, it is possible that a dependent vector, say !T of a sub-
matrix of S(M,M) later corresponds to an independent vector in S(M,M).

When representing any other dependent vector in this submatrix

in terms of regular predecessors, T. must be included, since it is
a regular vector of S(M,M) under the above assumption. In this represen-

tation the coefficient of linear dependence corresponding to Y
is arbitrary. Reconsider Example (3.1-2) for{Ai}, i=1,2,3 where we

only consider the (row) map P.

Example. (3.2-1) For A1, A2, A3 of (3.1-2) we have P: S(3,3)+Q(3,3) or

1 2 2 414 8~ -@2 2 414 8

1 2 2 416 10 0 0 0 0)2

1.0 1 03 2 .0_1-4-1-6

2448 0 0 0 0
2 4 6 10 -- 0 0 2

1 0 3 2 0 0 0 0

48 00

6 10 0 0

3 2 _0 0
The indices are {vl,v2,v3} = {1,2,1}. Since v1=1, the fourth row of

S(3,3) (or equivalently Q(3,3) ) is dependent on its regular predecessors

as shown in the corresponding 3x4 submatrix (in dashed lines) of S(3,3)
(or Q(3,3) ). The second row, say 2 in this submatrix is dependent,
yet it is an independent row of S(3,3) (or Q(3,3) ). Now, expand T of

this submatrix, i.e.,

^T =T + T ^T
4 110 1 120 130 3 121'0)

[2 4 4 8] = B110 [1 2 2 4] + B120 [1 2 2 4] + B130[1 0 1 0]

The solution is B110 = 2 120' g130=0; thus, the coefficient 8120 is
ah arbitrary parameter. Note that this recursion is essentially the

technique given in Roman and Bullock (1975a).

Clearly, if (R) is satisfied as in the previous section, then there
exists a complete submatrix (data is available for each element) of S(M*,M*)
in which every regular vector of S(M,M) is always a regular vector

of the submatrix corresponding to a particular chain; thus, there

will be no arbitrary or free parameters.

The algorithm for the case when (R) is not satisfied may be
illustrated by considering row operations on S(M,M) to obtain Q(M,M),

since the identical technique can be applied to obtain S*(M,M). The

arbitrary (column) parameters are found by performing additional

column operations to Q(M,M). As in Example (3.2-1), we must find
the largest submatrix of Q(M,M) for each chain, i.e., if we define
k as the index of the block row of S(M,M) containing the: vector ,
- -1i+p vi
then the largest submatrix of data specified elements corresponding

to the i-th chain is given by the first (i+pvi-l) rows and m(M+1-ki)
columns of Q(M,M). Also, we define Ji,iep as the sets of Hankel row
indices corresponding to each dependent (zero) row of the

(i+pu-1l)x (m(M+1-ki) submatrix of Q(M,M) which becomes independent, i.e.,

it contains a leading element. In Example (3.2-1) for i=l, we have

(l+pv-l)3 and k1=2; thus, m(M+l-kl)=4 and the corresponding submatrix is
given by the first 3 rows and columns of Q(3,3), and. of course, J1={2}.

I_ I

Arbitrary Parameter Partial Realization Algorithm. (3.2-2)

(1) Perform (1) of Algorithm (3.1-1) to obtain [P I Q(M,M)]
(2) For each iep, determine the largest (i+pwl)xm(M+1-k.) sub-
1 1
matrix of Q(M,M) of data specified elements and form the set Ji
(3) For each iep, replace by + b, ba scalar.

(4) Determine the corresponding canonical forms incorporating
these free parameters.

Dual results hold for the columns. The free-parameters are
fund in analogous fashion by examining the zero columns of the
submatrices of S*(M,M).

Example. (3.2-3) The following example is from Tether (1970).
For m=p=2 and

1 1 2 4 3 10 7 22 15]
A 0 0 0 0 1 3 3
(1) [ P r Q(4,4) ] =

1 ( 1 4 3 10 7 22 15
0 1 0 0 0, 0 ( 3
-4 0 1 0o o-6 -5-18-13
0 0 0 1 0 0 06 1 0 0
2 0 -3 0 1 0 01 0
-1 0 0 1 0 1 0 0o 0o
6 0 -7 0 0 0 1 0 0
-3 0 0 0 0 0 0 1 0 0

It should be noted that when (R) is not satisfied, some of the v. may not
be defined, i.e., the last independent row of a.chain is in the last block
Hankel row. In this case all would-be invariants are arbitrary.


The indices are: v1=2, v2=3

(2) For i=1, (1+p -1)=4, kl=3, m(M+1-k,)=4; thus, the corresponding
submatrix is constructed from the first 4 rows and columns of Q(4,4)
(small dashes). J1={2}.

For i=2, (2+pv2-1)=7, k2=4, m(M+1-k2)=2; thus, the corresponding
submatrix of Q(4,4) is given by the first 7 rows and 2 columns (large
ra'shes). J2={2,4,6}.

(3) Replacing the fifth and eighth rows of P with p + bPj2 and
p + cP2 + dp + ePg where b,c,d,e are real scalars gives
S= [ 2 b -3 0 1 0 0 0]

S[-3-e c 0 d+e 0 e 0 1 ]
p8 = [-3-e c 0 d+e 0 e 0 1

The T are:

_-= [ -2

T = [ 3+e

3 -b

0 0]

-c -(d+e) -e ]

(4) The canonical form is


F T1
FR= eT




Corresponding to these realizations is a minimal extension sequence

which can be found by determining the Markov parameters. These parameters






0 0

0 0

0 0

are cumbersome to obtain due to the general complexity of the expressions

in zR or ZC; therefore, a technique to determine these extensions
without forming the Markov parameters directly (or the realization) was
developed. This method consists of recursively solving simple linear
equations (one unknown) to obtain the minimal extension. Extensions
constructed in this manner not only eliminate the possibility of non-

minimality as expressed in Godbole (1972), but also describe the entire

class of minimal extensions. The method of constructing the minimal
extension sequence evolves easily from the lower triangular-unit diagonal
structure of P. Since a dependent row of Q(M,M) is a zero row, it

follows from Theorem (2.3-3) that

+pvi j qi+pi j 0 for jemM (3.2-4)

where recall that p i+i=0 for j>i+pv.. Thus, by inserting the
1ipq,j 1
unknown extension parameters, x..(r) for

rx11(r) *** xm(r)


x pl(r) *.. pm(r)

into S(M,M) a system of linear equations is established in terms of the
xi (r)'s by (3.2-4). Due to the structure of P, this system of equations
is decoupled and therefore easily solved.

Example. (3.2-5) Reconsider (3.1-2) for Al, A2. Since (R) is satisfied,
the extension A., j>2 is unique. We would like to obtain

x11(3) x12(3)

x21(3) x22(3)

x31(3) x32(3)

Since P maps S(2,2) into Q(2,2), we


1 2 2 4
1 2 2 4
1 0 1 0

r-" -
2 4 x11(3)
2 4 x21(3)

_1 0 x31(3)

X -



2 2 4
0 0 0

0 -1
0 0 0
0 0 0
0 0 0


and in this case,{vl,v2,v3} ={1,0,1}. Thus, using (3.2-4), we

solving 0 = P4T 13 = [-2

0 0 1 o


for x11(3) gives x11(3)=4

Similarily solving:

p44 = 0


for x12(3) gives x12(3) = 8

for x31(3) gives x31(3) = 1

for x32(3) gives x32(3) = 0

In this example, x21(3)=x11(3) and x22(3)=x12(3), since v2=0.


Thus, this example shows that the minimal extension sequence can be
found recursively due to the structure of P. Of course, the problem
of real interest is when (R) is not satisfied and (as in Ho's algorithm)
a minimal extension with arbitrary parameters must be constructed.

Minimal Extension Algorithm. (3.2-6)
(1) Perform (1), (2), (3) of Algorithm (3.2-2).
(2) Determine M* = v+-p. (The values of v,p are determined by the partial data)
(3) Recursively construct the minimal extension {Ar}, r = M+1, ... ,M*
where Ar [xij(r)] P by solving the set of equations for xij(r)
given by
+v = 0, j = m(M+l-k)+l, ... ,m(M*+1-ki), for each iep.
p-i +pv 1i)

and recall that ki is the index of the block row of S(M,M) containing
the row vector, Y

Example. (3.2-7) Reconsider (3.2-3) for illustrative purposes.

(1) These results are given in Example (3.2-3)
(2) M*=6; thus, find

A5 = Fxll(5) x12(5)1
x21(5) x22(5)

(3) Recursively solve: Pi2vi r = .
i=2, j=3,4,5,6.

p5 5 = 0 gives x11(5);

p T3 = 0 gives x21(5);

= 0 gives

= 0 gives

A6 = xl'(6) x 2(6)-
x21(6) x22(6)

for i=l, j=5,6,7,8 and for


= 0 gives x12(5)

= 0 gives x22(5)

= 0 gives

= 0 gives'


x21(6) ;



and therefore

46-b 31-b 94-6b 63-6b
A5 = A6
12-d 9-d-e 30-c-3d-5e+de 21-c-3d-5e+de-e2

By solving for the x i's in A5, A6 we obtain the extension as

A x11(5) x11(5)-15~ A 6x11(5)-182 6x11(5)-213
x21(5) x22(5) x21(6) x21(6)+(21(5)- (5-3)-9

The number of degrees of freedom is 4,i.e.,{x11(5),x21 (5),x22(5),x2(6)}.

The technique used to solve the partial realization problem when (R)
is not satisfied was to extract the most pertinent information from the

given data in the form of the invariants, which completely described
the class of minimal partial realizations. A recursive method to obtain
the corresponding class of minimal extensions was also presented in (3.2-6).
This method is equivalent to that of Kalman (1971b) or Tether (1970) for

if the minimal extension is recursively constructed and Ho's algorithm
is applied to the resulting Hankel array the corresponding partial real-
ization will belong to the same class. Note that if the extension is not
constructed in this fashion, it is possible that all degrees of freedom

available may not be found (see Roman (1975)). It should be noted that
the integers v and i are determined from the given data,i.e., knowledge
of the invariants enables the construction of a minimal extension such
that v and p can be found. The approach completely resolves the ambiguity
pointed out by Godbole (1972) arising in the Kalman or Tether technique.

The results given above correspond directly to those presented in

Kalman (1971b) and Tether (1970). They have shown, when (R) is satisfied,

there exists no arbitrary parameters in the minimal partial realization

or corresponding extension. Therefore, the existence of arbitrary

parameters can be used as a check to see if the rank condition holds.

Although it is not essential to construct both sets of invariants, it is

necessary to determine M* which requires v and i; thus, the algorithm

presented has definite advantages over others, since these integers are

simultaneously determined.

In practical modeling applications, the prediction of model

performance is normally necessary; therefore, knowledge of a minimal

extension is required. Also in some of the applications the number of

degrees of freedom may not be of interest, if only one partial realization

is required rather than the entire class. In this case such a model is

easily found by setting all free parameters to zero which corresponds to

merely applying the Algorithm (3.1-1) directly to the data and obtaining

the corresponding canonical forms as before.

Describing the class of minimal extensions offers some advantages

over the state space representation in that it is coordinate free and

indicates the number of degrees of freedom available without compensation.

3.3 Characteristic Polynomial Determination by Coordinate Transformation

In this section we obtain the characteristic equation of the entire

class of minimal.partial realizations described by FR or FC of the

previous section. It is easily obtained by transforming the realized

FR or FC into the Bucy form as before. Recall that the advantage of

this representation over the Luenberger form is that it is possible to

find the characteristic polynomial directly by inspection of FBR in (2.2-11).


Even though it is possible to realize the system directly in Bucy form
as implied in the discussion of (2.3-12), it has been found that this

method has serious deficiencies when dealing with finite Markov sequences.

If (R) is satisfied, the partial realization is unique. When (R) is not

satisfied, this technique does not yield all degrees of freedom. For

example, reconsider the arbitrary parameter realization of Example (3.2-3).

This realization is given in Ackermann (1972) as

-0 1 0 0 0 0 1 0 0 0

-2 3 -b 0 0 -2 3 0 0 0
FR = 0 0 0 1 0 FAck = 0 0 0 1 0

0 0 0 0 1 0 0 0 0 1

3+e 0 -c -(d+e) -e 3+e 0 -c -(d+e) -e

Note that one degree of freedom (b=0) has been lost. Similarily

Ledwich and Fortmann (1974) have shown by example that this technique

can also lead to non-minimal realizations. These deficiencies arise due

to the procedure used for the determination of the Bucy invariants. This

procedure does not account for the possibility that an independent row

vector of a particular chain may actually be dependent if it is compared

with portions of the same length of vectors in different chains. To cir-

cumvent the problem, the previous technique will be used,i.e., the system

is realized directly in Luenberger form and transformed to Bucy form. Not

only does this assure minimality as well as the determination of all possible

degrees of freedom, but TBR is almost found by inspection as shown in

(2.3-12). Reconsider the example of the previous section.

Example. (3.3-1) Recall that in (3.2-3) m=p=2, n=5, and v1=2, v2=3,

T = [ -2 3 -b 0 0
.k = [ -2 3 -b 0 0 ]

(1) Simultaneously construct TBR

for predecessor independence

-1 R

-1 R.

(2) Determine T -1 from







j2= [3+e 0 -c -(d+e) -e i

from (3.3-4) while examining the rows

3 -b

7 -3b

-14 15 -7b -3b' -b

TR R= I which gives










3/b -1/b

(3) Determine FBR:


(4) Find the characteristic polynomial by inspection.


























XFBR(z) = z+(e-3)z4+(d-2e+2)z'+(c-3d-e)z2+(-3c+2d+2e)z+(b+2c+be)

This example points out some very interesting points. When this

technique is combined with the algorithm of (3.2-2), it offers a method

which can be used to obtain the solution to the stable realization

problem developed in Roman and Bullock (1975b). Also, if the system
were realized directly in Bucy form, then b=O and a degree of freedom is

lost; thus, in Ackermann's example 91=1, while ours is Z1=5. It is

critical that all degrees of freedom are obtained as shown in this case,

since the system is observable from a single output.
This section concludes the discussion of the deterministic case of

the realization problem. In the next chapter we examine the stochastic

version of the realization problem.



In this chapter the stochastic realization problem is examined

by specifying an invariant system description under suitable trans-

formation groups for the realization. Superficially, this may appear

to be a direct extension of results previously developed, but this is

not the case. It will be shown that the general linear group used in

the deterministic case is not the only group action which must be

considered when examining the Markov sequence for the corresponding

stochastic case.

Analogous to the deterministic realization problem there are

basically two approaches to consider (see Figure 1): (1) realization

from the matrix power spectral density (frequency domain) by performing

the classical spectral factorization; or (2) realization from the

measurement covariance sequence (time domain) and the solution of a set

of algebraic equations. Direct factorization of the power spectral

density (PSD) matrix is inefficient and may not be very accurate.

Recently developed methods of factoring Toeplitz matrices by using fast

algorithms offer some hope, but are quite tedious. Alternately,

realization from the covariance sequence is facilitated by efficient

realization algorithms and solutions of the Kalman-Szeg6-Popov equations.



Figure 1. Techniques of Solution to the Stochastic
Realization Problem.

The problem considered in this chapter is the determination of a

minimal realization from the output sequence of a linear constant

system driven by white noise. The solution to this problem is well known

(e.g.. see Mehra (1971)) as diagrammed below in Figure 2. The output

sequence of an assumed linear system driven by white noise is correlated

and a realization algorithm is applied to obtain a model whose unit

pulse response is the measurement covariance sequence. A set of algebraic

equations is solved in order to determine the remaining parameters of

the white-noise system.

This problem is further complicated by the fact that the covariance

sequence must be estimated from the measurements. From the practical

viewpoint, the realization is highly dependent on the adequacy of the

estimates. Although in realistic situations the covariance-estimation

problem cannot be ignored, it will be assumed throughout this chapter

that perfect estimates are made in order to concentrate on the realization

portion of the problem.t

In this chapter we present a brief review of the major results

necessary to solve the stochastic realization problem. We use the

Majumdar (1976) has shown in the scalar case that even if imperfect
estimates are made realization theory can successfully be applied.

*White Noise Input

Linear Constant

Solve Algebraic

Output Sequence

Measurement Covariance Sequence

Stochastic Realization

Figure 2. A Solution to the Stochastic Realization Problem




__ ___

algebraic structure of a transformation group acting on a set to obtain

an invariant system description for this problem. A new realization

algorithm is developed to extract this description from the covariance

sequence. Recently published results establishing an alternate approach
to the solution of this problem are also considered.

4.1 Stochastic Realization Theory

Analogous to the deterministic model of (2.1-1) consider a white-
noise (WN) model given by

SFx + w (4.1-1)

Xk =Hx

where x and y are the real, zero mean, n state and p output vectors,
and w is a real, zero mean, white Gaussian noise sequence. The noise

is uncorrelated with the state vector, A., j 5 k and

Cov(w ,w.):=E[(w.-Ew.)(w.-Ewj)] = Q6i

where 6ij is the Kronecker delta. This model is defined by the triple,
WN :=(F,I,H) of compatible dimensions with (F,H) observable and F a

nonsingular,t stability matrix, i.e., the eigenvalues of F have magnitude

less than 1. The transfer function.of (4.1-1) is denoted by TWN(z).

tIn the discussion that follows the WN model parameters will be used to
obtain a solution to the stochastic realization problem. Denham (1975)
has shown that if the spectral factors of the PSD are of least degree,
i.e., they possess no poles at the origin, then F is a nonsingular matrix.

The corresponding measurement process is given by

-k -k + Y-k (4.1-2)

where z is the p measurement vector and v is a zero mean, white
Gaussian noise sequence, uncorrelated with x., j 5 k with

Cov(vi,.j) = R6ij

Cov(wA,.) = S6ij

for R a pxp positive definite, covariance matrix and S a nxp cross
covariance matrix. Thus, a model of this measurement process is
completely specified by the quintuplet, (F,H,Q,R,S).
When a correlation technique is applied to the measurement process,
it is necessary to consider the state covariance defined by

Sk: =Cov(x, )

We assume that the processes are wide sense stationary; therefore,

k = T, a constant here. It is easily shown from (4.1-1) that the
state covariance satisfies the Lyapunov equation (LE)

I = FIFT + Q (4.1-3)

It is well known (e.g. see Faurre (1967)) that since F is a stability
matrix, corresponding to any positive semidefinite covariancee) matrix Q,
there exists a unique, positive semidefinite solution I to the (LE).
The measurement covariance is given (in terms of lag j) by

Cj:= Cov(k+j'k) = Cov(yk+j,yk)+Cov(yk+j'vk)+Cov!(vk+j'yk)+Cov(k+j ',k)


and from (4.1-1) it may be shown that

C. = HFj-I(F~HT+S) j > 0 (4.1-5)

C = HnHT + R

The PSD matrix of the measurement process is obtained by taking the

bilateral z-transform of the sequence C. defined in (4.1-4) which gives

DZ(z) = H(Iz-F) Q(Iz1-FT)l HT+H(Iz-F)-1S+ST(Iz-1-FT) -HT+R
It is important to note that this expression is the frequency domain

representation of the measurement process which can alternately be

expressed directly in terms of the measurement covariance sequence as

00 -
4Z(Z) = Z C.z
j=-o, J

Since the measurement process is stationary and z is real, C =CT and
-k k
therefore the PSD can be decomposed as

00 .z
(z) = Z C.z" + C + Z CTz. (4.1-7)
Sj=l J j=l J

Note that {C.} is analogous to the Markov sequence of the deterministic

realization problem. We define the problem of determining a quintuplet,

(F,H,Q,R,S) in (4.1-6) from Z(z) or {C,} as the stochastic realization


In this chapter we are only concerned with the realization from the

measurement covariance sequence. When a realization algorithm is applied

to the covariance sequence, we define the resulting realization as the

Kalman-Szegi-Popov (KSP) model because of the parameter constraints

(to follow) which evolve from the generalized Kalman-Szegb-Popov lemma

(see Popov (1973)t). Thus, we specify the KSP model as the realization

of {C .}t defined by the quadruple, KSP:=(A,B,C,D) of appropriate

dimension with transfer function, TKSP(z)=C(Iz-A) -B+D. Note that since
the unit pulse response of the KSP model is simply related to the
measurement covariance sequence, then (4.1-7) can be written as the

sum decomposition.

S(z) = TKSP(z)+TKp(z-1) = C(Iz-A)-1B+D+DT+BT(Iz'1-AT)-1CT (4.1-8)

The relationship between the KSP model and the stochastic realization
of the measurement process is shown in the following proposition by

Glover (1973).

Proposition (4.1-9) Let ZKSP=(A,B,C,D) be a minimal realization of {Cj}.

Then the quintuplet (F,H,Q,R,S) is a minimal stochastic

realization of the measurement process specified

by (4.1-1) and (4.1-2), if there exists a positive
definite, symmetric matrix H and TeGL(n) such that
the following KSP equations are satisfied:

n-AlIAT =Q


where A=T-1FT and C=HT.

The proof of this proposition is given in Glover (1973) and

This book was published in Romanian in 1966, but the English version
became available in 1973.
tNote that the sequence, {C.}1 is related to the measurement covariance
sequence as Co =Co and C.=C. for j > 0.
o o 33

corresponds directly to the results presented by Anderson (1969) in
the continuous case. The proof follows by comparing the two distinct
representations of 0Z(z) given by (4.1-6) and (4.1-8). Minimality of
(F,H,Q,R,S) is obtained directly from Theorem (3.7-2) of Rosenbrock
(1970). The KSP equations are obtained by equating the sum decomposition
of (4.1-8) to (4.1-6).
This proposition gives an indirect method to check whether a given

EKSP and stochastic realization, (F,H,Q,R,S) correspond to the same
covariance sequence. Attempts to use the KSP equations to construct
all realizations, (F,H,Q,R,S) with identical {C.} from ZKSP and T by
choice of all possible symmetric, positive definite matrices, I will
not work in general because all I's do not correspond to Q,R,S matrices
that have the properties of a covariance matrix, i.e.,

A:= Cov( [w v) Q S (4.1-10)
T '6
Li R

First, it is necessary to question if the stochastic realization problem
always has a solution, or equivalently, when is there a i so that
(4.1-10) holds. Fortunately, the well-known PSD property, on the unit circle (see e.g. Gokhberg and Krein (1960) and Youla (1961))
is sufficient to insure the existence of a solution. This result is
available in the generalized Kalman-Szegb-Popov lemma (see Popov (1973)).

Proposition (4.1-11) If (F,H) is completely observable, then DZ(Z) Z 0
on the unit circle is equivalent to the existence
of a quintuplet, ( ,,R,,) such that

z(z) = [f(Iz-?)-1 I p[Q (Iz-1-FT)-1"T

i" Rj- p


The proof of this proposition is given in Popov (1973) and essentially
consists of showing there exists a spectral factorization of the given

PSD. Thus, this proposition assures us that there exists at least one
solution to the stochastic realization problem.
Proposition (4.1-9) shows that once ZKSP, T, and n are determined

then a stochastic realization, (F,H,Q,R,S) may be specified; however, it
does not show how to determine I. Recently many researchers (e.g. Glover
(1973), Denham (1974,1975), Tse and Weinert (1975)) have studied this
problem. They were interested in obtaining only those solutions to the
KSP equations of (4.1-9) which correspond to a stochastic realization
such that AO of (4.1-11). Denham (1975) has shown that any solution,
n*, of the KSP equations which corresponds to a factorization as in

(4.1-11) with V=KN, W=N for K=Knxp, NeKPxP, K full rank and N symmetric
positive definite, is in fact a solution of a discrete Riccati equation.
This can readily be seen by substituting, (Q,R,S) =(KNN KNN KNN)
of (4.1-11) into (4.1-9)

I*-AI*AT = KNNTKT (4.1-12)


B-AJ*CT = KNNT for A = T- FT, C=HT, TeGL(n).

Solving the last equation for K and substituting for NNT yields

K = (B-AI*CT )(D+DT-C*CT)-1 (4.1-13)

Now substituting (4.1-13) and NNT in the first equation shows that I*


= AI*AT-(B-An*cT)(D+DT-_*cT) (B-AH*CT)T (4.1-14)

a discrete Riccati equation. Thus, in this case the stochastic

realization problem can be solved by (1) obtaining a realization,

EKSP from {C.}; (2) solving (4.1-14) for 1*; (3) determining NNT from
(4.1-12) and K from (4.1-13); and (4) determining Q,R,S from K and NN.

A quintuplet specified by T and I* obtained in this manner is guaranteed

to be a stochastic realization, but at the computational expense of solving

a discrete Riccati equation. Note that solutions of the Riccati equation

are well known and it has been shown that there exists a unique, I*,

which gives a stable, minimum phase, spectral factor (e.g. see Faurre

(1970), Willems (1971), Denham (1975), Tse and Weinert (1975)). We
will examine this approach more closely in a subsequent section, but
first we must find an invariant system description for the stochastic

4.2 Invariant System Description of the Stochastic Realization

Suppose we obtain two stochastic realizations by different methods
from the same PSD. We would like to know whether or not there is any

way to distinguish between these realizations. To be more precise,

we would like to know whether or not it is possible to uniquely

characterize the class of all realizations possessing the same PSD.
We first approach this problem from a purely algebraic viewpoint.

We define a set of quintuplets more general than the stochastic

realizations, then consider only those transformation groups acting

on this set which leave the PSD or equivalently {C invariant, and

finally specify various invariant system descriptions under these

groups which subsequently prove useful in specifying a stochastic

realization algorithm. The groups employed were first presented by

Popov (1973) in his study of hyperstability. The results we obtain

are analogous to those of Popov as well as those obtained in the

quadratic optimization problem (e.g. see Willems (1971)).

Define the set

X2 = {(F,H,Q,R,S)I FeKnxn,HeKPxn,QeKnxnReKPxPSeKnxp; Q,R symmetric}

and consider the following transformation group specified by the set

GK := {L ILEKnxn; L symmetric}

and the operation of matrix addition. Let the action of GK on X2 be
n b
defined by

L (F,H,Q,R,S) := (F,H,Q-FLFT+L,R-HLHT,S-FLHT) (4.2-1)

This action induces an equivalence relation on X2 written for each pair

(F,H,Q,-R,S), (F,H,Q,R,S)eX2 as (F,H,Q,R,S)EL(F,H,Q,R,S) iff there exists
a LEGKn such that (F,H,Q,-R,S) = L +(F,H,Q,R,S).

This group and GL(n) are essential to this discussion, but we must

consider their composite action. Therefore, we define the transformation

group, GRn which is the cartesian product of GL(n) and GK ,

GRn := GL(n)xGK The following proposition specifies GR .
nn n


Proposition. (4.2-2) The closed set GRn and operation o form a group
GRn = {(T,L) IT GL(n);LeGK n

and the group operation is given by

(T,)o(T,L) = (TT,L+T- LTT).

Proof. This proof of this proposition follows by verifying the standard
group axioms with respective identity and inverse elements

(In,0O) and (T-1,-TLTT).v

Let the action of GRn on X2 be defined by

(T,L) + (F,H,Q,R,S):=(TFT-1,HT-',T(Q-FLFT+L)TT,R-HLHT,T(S-FLHT)) (4.2-3)

An element (F,HQ,R,<) of the set X2 is said to be equivalent to the
element (F,H,Q,R,S) of X2 if there exists a (T,L)eGR such that
(F,H,Q,R,S)=(T,L)+(F,H,Q,R,S). This relation is reflexive

(F,H,Q,R,S) = (InOn) +(F,H,Q,R,S)

and symmetric

(T-1,TLTT)+(F,H,Q,R,S)=(T-1,-TLTT)+((T,L)+(F,H,Q,R,S)) =
((T-1,TLTT)o(T,L))+(F,H,Q,R,S)=(nn0 )+(F,H,Q,R,S) .

Transitivity follows from (F,H,Q,R,S)=(T,L)+(F,H,Q,R,S) and

(F,H,Q,R,S)= T,T)+(T,H,Qs,,S)=(T,f)+((T,L)+(F,H,Q,R,S))=(f,L)+ (F,H,Q,R,S).

Thus, GRn induces an equivalence relation on X2 which we denote by
ETL and (4.2-3) defines the partitioning of X2 into classes. Note that
our first objective has been satisifed, i.e., two ETL-equivalent quin-
tuplets have the same PSD; for if we let the pair (F,H,Q,R,S),

(F,H,Q,R,S)EX2 then if (T,L)eGRn

_ i __

z(z)=H(Iz-F)-Q(Iz-FT)- +H(IzF) S+S(Iz-- -H+R

=(HT-')T(Iz-F)-1T-1(T(Q-FLFT+LT)TT)TT(Iz -FT )ITT(HT1 )T

+(HT-I)T(Iz-F)-1T-1T(S-FLHT)+(ST-HLFT)TTT-T(Iz-1-FT)1 TT(HT 1)T

+R-HLHT (4.2-4)

D (z)=Z (z)+H(Iz-F)I [L-FLFT-FL(Iz-1-FT)-(Iz-F)LFT-(Iz-F)L -FT)(Iz I1-FT)-1HT

which gives QZ(z) = DZ(z). The measurement covariance sequence is also
invariant under the action of GRn on X2 because the PSD is also given by

DZ(z) = Z C.z-J. Thus, we will call any two systems represented by the
j=- j

quintuplet of X2 covariance equivalent, if they are ETL-equivalent.
Clearly, any two covariance equivalent systems have identical PSD's

(or measurement covariance sequences). Conversely, any two systems with
identical PSD's are covariance equivalent (see Popov (1973) for proof).

In order to uniquely characterize the class of covariance equi-

valent quintuplets we must determine an invariant system description
for X2 under the action of GR The number of invariants may be found
by counting the parameters. If we define, M :=dim(F,H,Q,R,S) and
M2:=dim(T,L), then there are M1=n2+np+n(n+l)+p(p+l)+np parameters

specifying this quintuplet and GR acts on M2=n2 +n(n+l) of them; thus,

there exist M1-M2=2np+'p(p+l) invariants. If we consider the transfor-

mation, (TR,L)EGR, to the Luenberger row coordinates, then np of these
invariants specify the canonical pair (FR,HR) of (2.2-6). The action

of (TR,L) on Q,R,S is given by

a_ _~ __ ~I ~

R = TR(Q-(FTR -)(TRLTR )(FTR- ) +L)TR Q-FRLRFR +L (4.2-5)


R = TR(S-(FTR-1)(TRLTRT)(HTR-1)) = SR-FRLRHRT (4.2-7)

The transformation LR acts on n(n+l) parameters of the total
n(n+l)+'-p(p+l)+np parameters available in QRR,SR as shown above for the
given (FR,HR). Once this action is completed the remaining np+p(p+l)
parameters are invariants. There are only four possible ways that LR
can act on the triple, (Q,R,S):

(i) LR acts only on QR; (4.2-8)
(ii) LR acts first to specify SR with the remaining elements
of LR acting on QR;
(iii) LR acts first to specify R with the remaining elements of

LR free to act on QR or SR or both; and
(iv) LR acts on any combination of elements in Q,R,S.

If we choose to restrict the action of LR to only the kn(n+l)
elements of QR, then for any choice of QR (given (FR,HR) and any QR)'
the transformation LR is uniquely defined. Since FR is a nonsingular
stability matrix, then it is well -known (Gantmacher (1959)) from (4.2-5)
that LR is the unique solution of L-FL F = Q* for Q* = RQ It
is important to note that the elements of QR are completely free, but
once they are selected, LR is fixed by (4.2-5) for any QR and therefore
the np elements of SR and the p(p+l) elements of RR are the invariants.
Thus, for a particular choice of QR we can uniquely specify the equivalence

_I_ ______ ___I I _

class of X2 under the action of GR i.e., (FRHR'R,RR,SR) is a
canonical form for ETL-equivalence on X2.
On the other hand, if we choose to let LR act on the np elements
of SR, then from (4.2-7) only np-p(p-l) elements of LR are uniquely

specified, i.e., since LR is symmetric and

LRH R rl+ l rp-1+1

there are p(p+l) redundant elements in the R.'s. Thus, for any choice
of SR (given (FR,HR) and any SR), np-p(p-l) elements of LR are uniquely
defined by (4.2-7) and the remaining elements of LR are free to act on QR.
In other words np-p(p-l) elements of QR are invariants,t as well as
the elements of RR, since any choice of SR specifies the elements of
LR in (4.2-6).

Similarly restricting the action of LR to act on the elements of R
specifies ;p(p+l) elements of LR from (4.2-6) and we are free to allow
the remaining elements of LR to act exclusively on QR or SR or both.

Clearly, there are many choices available to distribute the action of LR

on QR,R,SR; however, the important point is that once the choice is made,
the invariants are specified.
Any choice of symmetric QR is acceptable, since LR is uniquely
determined from (4.2-5) for given QRFR, but this is not the case when

an SR is selected. First recall that FR is nonsingular (see footnote

p.81). Then if we define SR:=SR-'R it follows from (4.2-7) that

FR-1S LRHRT (4.2-9)

and then

tThis was pointed out by Luo (1975) and Majumdar (1975).


Full Text
xml record header identifier 2009-02-09setSpec [UFDC_OAI_SET]metadata oai_dc:dc xmlns:oai_dc http:www.openarchives.orgOAI2.0oai_dc xmlns:dc http:purl.orgdcelements1.1 xmlns:xsi http:www.w3.org2001XMLSchema-instance xsi:schemaLocation http:www.openarchives.orgOAI2.0oai_dc.xsd dc:title Realization of invariant system descriptions from Markov sequencesdc:creator Candy, James Vincentdc:publisher James Vincent Candydc:date 1976dc:type Bookdc:identifier (oclc)000180838 (alephbibnum)dc:source University of Floridadc:language English