REALIZATION OF INVARIANT SYSTEM DESCRIPTIONS FROM MARKOV SEQUENCES
By
JAMES VINCENT CANDY
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL
OF THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1976
To my wife, Patricia, and daughter, Kirstin, for unending faith,
encouragement and understanding. To my mother, Anne, for her constant
support, and my mother-in-law, Ruth, for her encouragement. To "big" Ed,
my father-in-law, whose sense of humor often lifted my sometimes low
spirit.
ACKNOWLEDGMENTS
I would like to express my sincere appreciation to the members
of my supervisory committee: Dr. Thomas E. Bullock, Chairman, and
Dr. Michael E. Warren, Co-chairman, Dr. Donald G. Childers,
Dr. Z.R. Pop-Stojanovic, and Dr. V.M. Popov. A special thanks to
Dr. Thomas E. Bullock and Dr. Michael E. Warren for their constant
encouragement, unending patience, and invaluable suggestions in the
course of this research.
I would also like to thank my fellow students and friends,
Zuonhua Luo, Arun Majumdar, Jose DeQueiroz, and Jaime Roman, for
many fruitful discussions and suggestions.
TABLE OF CONTENTS
ACKNOWLEDGMENTS ............................................ iii
LIST OF SYMBOLS ............................................ vi
ABSTRACT ................................................... vii
CHAPTER 1: INTRODUCTION .................................... 1
     1.1 Survey of Previous Work in Canonical Forms
         for Linear Systems ................................ 2
     1.2 Survey of Previous Work in Realization Theory ..... 5
     1.3 Purpose and Chapter Outline ....................... 10
     1.4 Notation .......................................... 11
CHAPTER 2: REALIZATION OF INVARIANT SYSTEM DESCRIPTIONS .... 12
     2.1 Realization Theory ................................ 12
     2.2 Invariant System Descriptions ..................... 18
     2.3 Canonical Realization Theory ...................... 33
     2.4 Some New Realization Algorithms ................... 45
CHAPTER 3: PARTIAL REALIZATIONS ............................ 54
     3.1 Nested Algorithm .................................. 54
     3.2 Minimal Extension Sequences ....................... 64
     3.3 Characteristic Polynomial Determination by
         Coordinate Transformation ......................... 74
CHAPTER 4: STOCHASTIC REALIZATION VIA INVARIANT SYSTEM
           DESCRIPTIONS .................................... 78
     4.1 Stochastic Realization Theory ..................... 81
     4.2 Invariant System Description of the
         Stochastic Realization ............................ 87
     4.3 Stochastic Realizations Via Trial and Error ....... 97
     4.4 Stochastic Realization Via the Kalman Filter ...... 104
CHAPTER 5: CONCLUSIONS ..................................... 111
     5.1 Summary ........................................... 111
     5.2 Suggestions for Future Research ................... 112
REFERENCES ................................................. 114
BIOGRAPHICAL SKETCH ........................................ 124
LIST OF MATHEMATICAL SYMBOLS

Symbol            Meaning                                  First Usage
A^T, a^T          Transpose of A, a                        pg. 13
A^{-1}            Inverse of A                             pg. 13
A^{-T}            Inverse of A^T                           pg. 89
ρ(A)              Rank of A                                pg. 13
|A| or det A      Determinant of A                         pg. 30, 21
diag A            Diagonal elements of A                   pg. 103
x ≠ y             x is not equal to y                      pg. 30
x > y             x is greater than y                      pg. 16
X ⊂ Y             X is contained in or a subset of Y       pg. 18
x ∈ X             x is an element of X                     pg. 12
X → Y             Map (set X into set Y)                   pg. 20
x :=              x is defined by                          pg. 13
x ∘ y             Abstract group operation                 pg. 19
{·}               Sequence or set with elements            pg. 13
Σ                 Summation                                pg. 13
∞                 Infinity                                 pg. 13
†                 Footnote                                 pg. 2
                  End of proof                             pg. 34
                  Group action operator                    pg. 21
dim X             Dimension of vector space X              pg. 15
iff               if and only if                           pg. 14
λ(A)              Eigenvalues of A                         pg. 96
√x                Square root of x                         pg. 99
max(·)            Maximum value of (·)                     pg. 23
Z                 Positive integers                        pg. 12
K                 Field                                    pg. 12
Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of
the Requirements for the Degree of Doctor of Philosophy
REALIZATION OF INVARIANT SYSTEM DESCRIPTIONS
FROM MARKOV SEQUENCES
By
James Vincent Candy
March, 1976
Chairman: Dr. Thomas E. Bullock
Co-chairman: Dr. Michael E. Warren
Major Department: Electrical Engineering
The realization of infinite and finite Markov sequences for multidimensional
systems is considered, and an efficient algorithm to extract the invariants
of the sequence under a change of basis in the state space is developed.
Knowledge of these invariants enables the determination of the corresponding
canonical form and an invariant system description under this transformation
group. For the partial realization problem, it is shown that this algorithm
possesses some attractive nesting properties. If the realization is not
unique, the class of all possible solutions is found.
The stochastic version of the realization problem is also examined.
It is shown that the transformation group which must be considered is
richer than the general linear group of the deterministic problem. The
invariants under this group are specified and it is shown that they can
be determined from a realization of the measurement covariance sequence.
Knowledge of these invariants is sufficient to specify an invariant
system description for the stochastic problem. The link between the
realization from the measurement covariance sequence, the white noise
model and the steady state Kalman filter is established.
CHAPTER 1
INTRODUCTION
Special state space representations of linear dynamic systems have
long been the motivation for extensive research. These models are
generally used to simplify a problem, such as pole placement, by
introducing arrays which require the fewest number of parameters while
exhibiting the most pertinent information. In general, system representations
have been studied in the literature as the problem of determining
canonical forms. Canonical forms have been used in observer design,
exact model matching methods, feedback system design, and Kalman filtering
techniques. In realization theory, canonical forms for linear multivariable
systems are important. Since it is only possible to model a
system within an equivalence class, the ability to characterize the class
by a unique element is beneficial.
The problem of determining a canonical form has its roots in
invariant theory. Over the past decade many so-called "canonical" system
representations have evolved in the literature, but unfortunately these
representations were obtained from a particular application or from
computational considerations and not derived from the invariant theory
point of view. Generally, these representations are not even unique and
therefore cannot be called a canonical form. Representations derived
in this manner have generally been a source of confusion as evidenced by
the ambiguity surrounding the word canonical itself. In this dissertation
we follow an algebraic procedure to obtain unique system representations,
i.e., we insure that these representations are in fact canonical forms.
In simple terms this approach seeks the determination of certain entities
called invariants obtained by applying particular transformation groups
(e.g., change of basis in the state space) to a well-defined set
representing a system parameterization. The invariants are the basic
structural elements of a system which do not change under this transformation
and are used to specify the corresponding canonical form. This
approach insures that the ambiguities prevalent in earlier work are
removed. Initially, we develop a simple solution to the problem of
determining a state space model from the unit pulse response of a given
linear system,† and then extend these results to the stochastic case where
the system is driven by a random input. The technique developed to
extract the invariants from this (response) sequence not only provides
a simple solution to the realization problem, but also gives more insight
into the system structure.
1.1 Survey of Previous Work in Canonical Forms for Linear Systems
The study of canonical forms for linear dynamic systems evolved
slowly in the sixties. The main impetus of investigation was initiated
by Kalman (1962,1963) when he compared two different methods for describing
linear dynamic systems: (1) internally by the state space representation
denoted by the triple (F,G,H), or (2) externally by the transfer
function, i.e., the input/output description. Developments over the past
decade in such areas as optimal control, decoupling theory, estimation
and filtering, identification theory, etc., have relied heavily on the
†This defines (simply) the realization problem.
state space representation for analysis and design. In early literature,
however, transfer function representations were used. For highly
complex systems it is much easier to determine external behavior rather
than internal, since the state variables are normally not available for
measurement. As pointed out by Kalman (1963), the language of these
representations may be different, but both describe the same problems and
are related. Many researchers have investigated the relationship between
both representations, but always with one common goal: to obtain a
state space model which specifies the external description directly by
inspection. Kalman (1963), and later Johnson and Wonham (1964) and
Silverman (1966), have shown that there exists a canonical form (under
change of basis in the state space) in the scalar case for the triple
(F,g,h) where F is in companion matrix form (see Hoffman and Kunze (1971))
and g is a unit column vector. It was shown that there exists a one-to-one
correspondence between the nonzero/nonunit elements of the triple
and the transfer function. This representation was used by Bass and Gura
(1965) to solve the pole-positioning problem and recently by Wolovich
(1972b) in solving the exact model matching problem.
The progress in determining a canonical form for the internal
description of multivariable systems came more slowly. The earliest work
appears to be that of Langenhop (1964) in which he develops a representation
to study system stability. Brunovsky (1966,1970) was probably the first
to recognize the invariant properties of the canonical form for the
controllable pair (F,G). Tuel (1966,1967) developed canonical forms for
multivariable systems in his investigation of the quadratic optimization
problem. Subsequently, Luenberger (1967) proposed certain sets of
canonical forms for controllable pairs; however, his development allowed
the possibility of nonuniqueness of these representations. Bucy (1968)
extended the results of Langenhop and Luenberger when he developed a
canonical form for certain subclasses of observable systems, but he
too was unaware of its invariant properties. Proceeding from the
external system description, many researchers began to recognize the
usefulness of developing canonical forms. Popov (1969) developed
a canonical form for the transfer function in his investigation of
irreducible system representations. Gilbert (1969) examined the invariant
properties of a system with feedback applied to solve the decoupling
problem. Dickinson et al. (1974a) discuss the construction and application
of these canonical forms for the transfer function matrix in a
recent survey. The properties of canonical forms were not fully
understood initially. In fact, the basic question of their uniqueness
posed many doubts as to their usefulness. This issue wasn't resolved
until the work of Rosenbrock, Kalman, and Popov in the early seventies.
The properties of the Luenberger forms were clarified by the
results of Rosenbrock (1970) and Kalman (1971a) in their studies of the
minimal column indices (or Kronecker indices) of the matrix pencil
[Iz - F, G], or more commonly, the indices of the pair (F,G). These indices
were shown to be invariants under the following transformations: change
of basis in the state space, input change of basis, and state feedback.
These results precisely resolve the question of what can (or cannot)
be altered by applying feedback to a linear multivariable system. At
the same time Popov (1972) examined the properties of the controllable
pair (F,G) under the same transformations in a very precise, step-by-step,
algebraic procedure to determine the corresponding invariants. He shows
clearly that obtaining the invariants under a particular transformation
group is the only information required to specify the corresponding
canonical form. Wonham and Morse (1972) obtained the feedback invariants
of the controllable pair from the less transparent geometric viewpoint.
Their results were identical to those of Brunovsky and Rosenbrock.
Morse (1973) examined the invariants of the triple (F,G,H) under a larger
group of transformations which includes output change of basis. A
complete set of feedback invariants of this triple still remains an
open problem, but some fragmentary results were presented by Wang and
Davison (1972) when they investigated certain sets of restricted triples.
Along these lines Rissanen (1974), Caines and Rissanen (1974),
Mayne (1972a,b), Weinert and Anton (1972), Tse and Weinert (1973,1975),
and Glover and Willems (1974) examined the identification problem from the
invariant theory viewpoint and obtained some rather interesting results.
Recent results in decoupling theory were obtained by Warren and Eckberg
(1973), Concheiro (1973), and Forney (1975) using the Kronecker invariants.
Probably the most extensive survey of these results has been compiled
by Denham (1974) and we refer the interested reader to this paper.
We temporarily leave this area to consider one specific application of
these results: the realization problem.
1.2 Survey of Previous Work in Realization Theory
The first realization problem proposed for control systems was
the determination of a state space model (internal description) from
a given transfer function (external description). Gilbert (1963) and
Zadeh and Desoer (1963) describe realization procedures based on the
determination of the rank of the residue matrices of the given transfer
function matrix, but unfortunately these procedures only apply to the
case of simple poles. Kalman (1963) proposed an algorithm whereby
the given transfer function is realized as a parallel combination of
single input, controllable subsystems in companion form, and then
applied the "canonical structure theory" (Kalman (1962)) to delete the
uncontrollable dynamics. This technique handles simple as well as
multiple transfer function poles. Later Kalman (1965) showed the
equivalence of the realization problem of control theory to the
corresponding network theory formulation.
A significant advance in realization theory was given by Ho and
Kalman (1966). They showed that the state space model could be found
from the impulse response sequence provided the system under investigation
is finite dimensional. They also developed an algorithm based
on forming the generalized Hankel array from the given sequence and
then extracting the state space triple from it. Shortly after the publication
of Ho's algorithm, Youla and Tissi (1966), working in network
synthesis, and Silverman and Meadows (1966), in control theory, developed
similar realization techniques, again based on the impulse response sequence.
Ho's algorithm gave new impetus to realization theory. Several
authors have provided alternate or improved realization algorithms based
on the Hankel array formulation. Mayne (1968), Panda and Chen (1969),
Roveda and Schmid (1970), Rosenbrock (1970), Lal et al. (1972) and even
more recently Huang (1974), Rozsa and Sinha (1975) among others,
considered the older transfer function matrix approach, while Rissanen
(1971,1974), Silverman (1971), Ackermann and Bucy (1971), Chen and Mital
(1972), Mital and Chen (1973), and Bonivento et al. (1973) approached
the problem from the Hankel array formulation.
Rissanen (1974), Furuta and Paquet (1975), Roman (1975),
Dickinson et al. (1974a,b) have recently considered the problem of
realizing a given infinite impulse response matrix sequence with a
polynomial matrix pair. Such a pair is referred to as a matrix
fraction description of the system and is becoming well known in
control literature largely due to the ground work established by
Popov (1969), Rosenbrock (1970), Wolovich (1972a,b, 1973a,b) and
others.
Kalman (1971b), Tether (1970), and Godbole (1972) later
considered the more realistic case where only a finite number of
terms of the impulse response sequence are specified. This is
commonly known as the partial realization problem and corresponds
in the scalar case to the classical Padé approximation problem.
Generally most realization algorithms can be used to process partial
data, but usually at a loss of efficiency and, even more seriously,
with the possibility of yielding misleading results. A wealth of new
techniques have recently been published to handle this very special,
yet realistic variant of the realization problem. Rissanen (1972a,b),
Ackermann (1972), Dickinson et al. (1974a), Roman and Bullock (1975a),
Anderson et al. (1975) published some efficient and improved algorithms
to solve this problem.
Also of recent interest is the development of algorithms which
realize a system directly in a canonical form (under a change of basis
in the state space), i.e., algorithms which solve the canonical realization
problem. The algorithms of Ackermann (1972), Bonivento et al.
(1973), Rissanen (1974), Dickinson et al. (1974a), Rozsa and Sinha
(1975), Luo (1975), and Roman and Bullock (1975a) solve this problem.
One of the main contributions of this dissertation is to use the
results developed from invariant theory to solve the realization and
partial realization problems in the deterministic as well as stochastic
cases. The realization of a system directly in a canonical form actually
reduces to first determining which transformation groups are present,
specifying the corresponding invariants, and then developing a method to
extract these invariants from the given unit pulse response sequence.
This philosophy is basic to any canonical realization scheme and actually
provides an explicit formula which is applied throughout this dissertation.
In the last few years, several interesting extensions have emerged
from the original concept of realization theory. The major motivation
evolved just after the development of the Kalman filter (see Kalman (1961))
in estimation theory because a priori knowledge of the state space
model and noise statistics are required to begin data processing. The
link between the filtering and realization problem was established by
Kalman (1965) just prior to the advent of Ho's algorithm. The work of
Gopinath (1969), Budin (1971,1972), Bonivento et al. (1973), and Audley
and Rugh (1973,1975) were concerned with the more general problem of
obtaining a state space representation given a general input/output
sequence of the system in both deterministic and stochastic cases. The
stochastic version of the realization problem has not received quite
as much attention as the deterministic case, mainly due to its greater
complexity and high dependence on the adequacy of covariance estimators.
The realization of stochastic systems was studied by Faurre (1967,1970)
and more recently by Rissanen and Kailath (1972), Gupta and Fairman (1974)
†The Hankel array formulation is used exclusively in this dissertation.
and Akaike (1974a,b). From the transfer function viewpoint this problem
has been solved using spectral factorization as originally introduced
by Wiener (1955,1959) and studied by others such as Gokhberg and Krein
(1960), Youla (1961), Davis (1963), Motyka and Cadzow (1967), and
Strintzis (1972). The link between the stochastic realization problem
and spectral factorization evolved from the work in stability theory by
Popov (1961,1964), Yakubovich (1963), Kalman (1963), and Szegö and
Kalman (1963). The equations establishing this link were derived in
the Kalman-Yakubovich-Popov lemma for continuous systems and the
Kalman-Szegö-Popov lemma for discrete time systems. Newcomb (1966), Anderson
(1967a,b,1969), and Denham (1975) extended these results and provided
techniques to solve these equations. Defining the invariants of these
problems is still an area of active research as evidenced by the recent
work of Denham (1974), Glover (1973), and Dickinson et al. (1974b).
This is one area developed in this dissertation. It will be shown that
the invariants of the stochastic realization problem not only lend more
insight into the structure of the problem, but also yield some new
results.
Research in realization theory and its applications continues as
evidenced by the recent results of Rissanen (1975) in estimation theory,
Ackermann (1975) in feedback system design, De Jong (1975) in the
numerical aspects of the problem, and Roman and Bullock (1975b) in
observer theory. The results presented in this dissertation tie together
some previously wellknown results in stochastic realization and filtering
theory as well as provide a technique which can be used to study
other problems.
1.3 Statement of Purpose and Chapter Outline
It is the purpose of this dissertation to provide an extensive
discussion of the realization problem in both the deterministic and
stochastic cases as well as specify the invariants under particular
transformation groups in each case. It is also desired to develop a
simple and efficient algorithm to solve the canonical realization problem.
This algorithm is to be modified to process data sequentially such that
only the pertinent information, the invariants, is extracted from the
given sequence. In the case of a fixed finite unit pulse response
sequence (the partial realization problem), the solution is to be
obtained such that all possible degrees of freedom are specified. The
relationship between the stochastic realization and steady state Kalman
filtering problems is discussed by again examining the corresponding
invariants. In so doing, a considerable amount of knowledge about the
existence and structure of realizations and the steady state filter is
gained.
The basic theoretical essentials of realization and invariant theory
are reviewed in Chapter 2. A "formula" essentially outlined in Popov
(1973) and Kalman (1974) is developed which will be applied to various
realization problems throughout the text. Some new theoretical results
in canonical realization theory are established and used to develop a
new canonical realization algorithm.
In Chapter 3 the algorithm is modified to handle sequentially
the case of partial data and also that of a fixed finite sequence. New
results evolve which completely characterize the class of all minimal
partial realizations and extension sequences, as well as determine the
characteristic equation in a simple manner.
The stochastic case of the canonical realization problem is investigated
in Chapter 4. A complete set of independent invariants is
found to characterize the corresponding solution. Equivalent solutions
to this problem as well as to the steady state Kalman filtering problem
are studied and it is shown that the filter parameters can be specified
by solving an analogous realization problem.
The specific contributions of this research and further research
possibilities are outlined in Chapter 5.
Examples are used generously throughout this work to illustrate the
various algorithms discussed and to point out significant details that
are otherwise difficult to see. A comment on the notation to be used
throughout this dissertation closes this chapter.
1.4 Notation
Uppercase letters denote matrices, and vectors are represented by
underlined lowercase letters. Lowercase letters are used to represent
scalars and integers. All matrices and vectors appearing in this work
are assumed to be real and constant. A = [a_ij] is an n×m matrix with
elements a_ij; 0_{n×m} is the n×m null matrix, with null row and column
vectors given by 0^T and 0; I_n represents the n×n identity matrix, and
e_j^T or e_j stands for its jth row or jth column; j ∈ m means
j = 1,2,...,m.
CHAPTER 2
REALIZATION OF INVARIANT SYSTEM DESCRIPTIONS
In this chapter we present a brief review of the major results
in realization theory. We establish a basic "formula" and apply it to
various system representations. It is shown that this approach greatly
simplifies the realization problem. Two new algorithms for realization
are developed which appear to be more efficient than previous techniques
because they extract only the minimal information necessary to specify
a system from the given input/output sequence in an extremely simple
fashion. All of the essential theory is developed and a multivariable
example is presented.
2.1 Realization Theory
A real finite dimensional linear constant dynamic system has
internal description given by the state variable equations in discrete
time as,

    x_{k+1} = F x_k + G u_k
    y_k     = H x_k                                        (2.1-1)

where k ∈ Z, x ∈ K^n = X, u ∈ K^m = U, y ∈ K^p = Y, and F, G, H are
n×n, n×m, p×n matrices over the field K. X, U, Y are the state, input,
and output spaces, respectively.
The external system description may be given either in rational
form as,
    T(z) = H(Iz - F)^{-1} G      (z complex)               (2.1-2)

or equivalently as an infinite matrix power series

    T(z) = Σ_{k=1}^{∞} A_k z^{-k}                          (2.1-3)
where the sequence {A_k} is the unit pulse response or Markov sequence
of (2.1-1). The Markov parameters are

    A_k = H F^{k-1} G,   k = 1,2,...                       (2.1-4)

The problem of determining the internal description (F,G,H) from the
external description (T(z) or {A_k}) is the realization problem. Out of
all possible realizations, Σ := (F,G,H), having the same Markov parameters,
those of smallest dimension are minimal realizations.
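As a quick illustration of (2.1-1) and (2.1-4), the following sketch (in Python with NumPy; the triple shown is a hypothetical example chosen for illustration, not one taken from the text) verifies that the unit pulse response of a state space model reproduces the Markov parameters:

```python
import numpy as np

# Hypothetical triple (F, G, H); any small real system serves here.
F = np.array([[0.0, 1.0], [-0.5, 1.0]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])

def markov(F, G, H, count):
    """Markov parameters A_k = H F^(k-1) G, k = 1..count (eq. 2.1-4)."""
    out, Fk = [], np.eye(F.shape[0])
    for _ in range(count):
        out.append(H @ Fk @ G)
        Fk = Fk @ F
    return out

def unit_pulse_response(F, G, H, count):
    """Simulate x_{k+1} = F x_k + G u_k, y_k = H x_k with a unit pulse at k = 0."""
    x, ys = np.zeros((F.shape[0], G.shape[1])), []
    for k in range(count + 1):
        u = np.eye(G.shape[1]) if k == 0 else np.zeros((G.shape[1], G.shape[1]))
        ys.append(H @ x)
        x = F @ x + G @ u
    return ys[1:]   # y_1, y_2, ... equal A_1, A_2, ...

assert all(np.allclose(a, y) for a, y in zip(markov(F, G, H, 5),
                                             unit_pulse_response(F, G, H, 5)))
```

The assertion simply confirms the identity A_k = H F^{k-1} G that underlies everything that follows.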
Prior to stating some of the significant results from realization
theory, several useful definitions will be given. The j-controllability
and j-observability matrices are the n×mj and pj×n arrays

    W_j = [G  FG ... F^{j-1}G]   and   V_j = [H^T  (HF)^T ... (HF^{j-1})^T]^T.

The pair (F,G) is completely controllable if ρ(W_μ) = n, and the pair
(F,H) is completely observable if ρ(V_ν) = n. Throughout this dissertation
we will only be concerned with systems possessing these properties. For a
completely controllable and observable system, the controllability index,
μ, and the observability index, ν, are the least positive integers such
that the rank of W_μ and V_ν is n.
If two minimal realizations Σ, Σ̄ are equivalent under a change of
basis in X, then there exists a nonsingular T such that
(F̄, Ḡ, H̄)† = (TFT^{-1}, TG, HT^{-1}). It also follows by direct substitution
that the controllability and observability indices of these realizations
are identical and

    W̄_j = T W_j       for j = 1,2,...
    V̄_i = V_i T^{-1}   for i = 1,2,...
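These definitions can be exercised numerically. The sketch below (Python/NumPy, using a hypothetical triple for illustration) builds W_j and V_i, finds the indices μ and ν, and checks the transformation rules W̄_j = TW_j and V̄_i = V_iT^{-1}:

```python
import numpy as np

# Hypothetical controllable and observable realization.
F = np.array([[0.0, 1.0], [-0.5, 1.0]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])

def ctrb(F, G, j):
    """W_j = [G  FG ... F^(j-1)G]."""
    blocks, B = [], G
    for _ in range(j):
        blocks.append(B)
        B = F @ B
    return np.hstack(blocks)

def obsv(F, H, i):
    """V_i = [H; HF; ...; HF^(i-1)] stacked vertically."""
    blocks, C = [], H
    for _ in range(i):
        blocks.append(C)
        C = C @ F
    return np.vstack(blocks)

n = F.shape[0]
# Least j, i giving full rank: the controllability and observability indices.
mu = next(j for j in range(1, n + 1) if np.linalg.matrix_rank(ctrb(F, G, j)) == n)
nu = next(i for i in range(1, n + 1) if np.linalg.matrix_rank(obsv(F, H, i)) == n)

# Under a change of basis T, the arrays transform as T W_j and V_i T^{-1}.
T = np.array([[1.0, 1.0], [0.0, 1.0]])
Fb, Gb, Hb = T @ F @ np.linalg.inv(T), T @ G, H @ np.linalg.inv(T)
assert np.allclose(ctrb(Fb, Gb, mu), T @ ctrb(F, G, mu))
assert np.allclose(obsv(Fb, Hb, nu), obsv(F, H, nu) @ np.linalg.inv(T))
```

The two assertions are the stated transformation rules, verified by direct substitution exactly as in the text.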
The generalized N×N' block submatrix of the doubly infinite Hankel
array is given by

    S_{N,N'} :=  [ A_1   A_2  ...  A_{N'}
                   A_2   A_3  ...  A_{N'+1}
                    :               :
                   A_N        ...  A_{N+N'-1} ]
Implicit in the realization problem is determining when a finite
dimensional realization exists and, if so, its corresponding minimal
dimension. The following proposition by Silverman gives the necessary
and sufficient conditions for {Ak} to have finite dimensional realiza
tion.
Proposition. (2.1-5)
An infinite sequence {A_k} is realizable iff there
exist positive integers μ, ν, n such that

    ρ(S_{ν,μ}) = ρ(S_{ν+j,μ}) = ρ(S_{ν,μ+j}) = n   for j = 0,1,...

Further, if {A_k} is realizable, then μ, ν are the
controllability and observability indices and n is
the dimension of the minimal realization.
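As a concrete check of this rank test, the following sketch (Python/NumPy; the second-order system is a hypothetical illustration) forms the block Hankel arrays from a Markov sequence and confirms that bordering the array by extra block rows or columns leaves the rank unchanged:

```python
import numpy as np

def hankel_block(A, N, Np):
    """Block Hankel submatrix S_{N,N'} built from Markov parameters A_1, A_2, ..."""
    return np.vstack([np.hstack([np.atleast_2d(A[i + j]) for j in range(Np)])
                      for i in range(N)])

# Markov sequence A_k = H F^(k-1) G of a hypothetical 2nd-order system.
F = np.array([[0.0, 1.0], [-0.5, 1.0]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
A = [H @ np.linalg.matrix_power(F, k) @ G for k in range(8)]

nu = mu = 2
n = np.linalg.matrix_rank(hankel_block(A, nu, mu))
# Bordered arrays have the same rank, so n is the minimal dimension.
assert n == np.linalg.matrix_rank(hankel_block(A, nu + 1, mu))
assert n == np.linalg.matrix_rank(hankel_block(A, nu, mu + 1))
```

Here the ranks stabilize at n = 2, the dimension of the generating system, as the proposition predicts.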
†This notation means F̄ = TFT^{-1}, Ḡ = TG, and H̄ = HT^{-1}.
Proof. See Silverman (1971).
Note that the essential point established in Ho and Kalman (1966), which
is used in the proof of the above proposition, is that Σ is a minimal
realization iff it is completely controllable and observable. Since
S_{ν,μ} = V_ν W_μ, it follows for dim Σ = n that
ρ(S_{ν,μ}) = min[ρ(V_ν), ρ(W_μ)] = n.
This result is essential to construct any realization algorithm. In
(2.1-5) the crucial point of finite dimensionality is carefully woven
into necessary and sufficient conditions for an infinite sequence to be
realizable. What if only partial information about the system is
available in the form of a finite Markov sequence? Is this sequence
realizable? What is the relationship between the minimal realization
and one based only on partial data? These are only a few of the questions
which must be resolved when we are limited to partial data.
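Before turning to partial data, note that the factorization S_{ν,μ} = V_ν W_μ already suggests how a triple can be recovered from the Hankel array. The sketch below uses a standard Ho-Kalman-type construction via the singular value decomposition (Python/NumPy; this is a textbook procedure given for orientation, not the new algorithm developed later in this dissertation):

```python
import numpy as np

def hankel_block(A, N, Np):
    """Block Hankel array S_{N,N'} from Markov parameters A_1, A_2, ..."""
    return np.vstack([np.hstack([np.atleast_2d(A[i + j]) for j in range(Np)])
                      for i in range(N)])

def ho_kalman(A, nu, mu, tol=1e-10):
    """Recover a minimal (F, G, H) from the sequence via S_{nu,mu} = V W."""
    p, m = np.atleast_2d(A[0]).shape
    U, s, Vt = np.linalg.svd(hankel_block(A, nu, mu))
    n = int(np.sum(s > tol))                   # minimal dimension = rank of S
    Vn = U[:, :n] * np.sqrt(s[:n])             # observability factor V_nu
    Wm = np.sqrt(s[:n])[:, None] * Vt[:n]      # controllability factor W_mu
    H, G = Vn[:p, :], Wm[:, :m]
    # Shift property: deleting the first block row of S_{nu+1,mu} gives V_nu F W_mu.
    shifted = hankel_block(A, nu + 1, mu)[p:, :]
    F = np.linalg.pinv(Vn) @ shifted @ np.linalg.pinv(Wm)
    return F, G, H

# Demonstration on a hypothetical 2nd-order system.
F0 = np.array([[0.0, 1.0], [-0.5, 1.0]])
G0 = np.array([[0.0], [1.0]])
H0 = np.array([[1.0, 0.0]])
A = [H0 @ np.linalg.matrix_power(F0, k) @ G0 for k in range(6)]
F1, G1, H1 = ho_kalman(A, 2, 2)
# The recovered triple reproduces the given Markov parameters.
```

The recovered (F, G, H) generally differs from the original triple by a change of basis in X, which is exactly the equivalence class structure discussed in the next section.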
Intrinsic in the realization from a finite Markov sequence is the
fact that enough data are contained in S to recover the infinite
sequence, i.e., knowledge of {A_1, ..., A_{ν+μ+1}} is sufficient to determine
{A_k}, k = 1,2,.... But in reality the only way to be sure of this is
knowledge of the actual system dimension (or at least an upper bound). A
minimal partial realization is a realization of smallest dimension
determined from a finite Markov sequence {A_k}, k ∈ M, which realizes the
sequence up to M terms. The order of the partial realization is M and
the realization is denoted by Σ(M). The realization induces an extension
of {A_k}, i.e., A_k = H F^{k-1} G for k > M. The following basic result,
analogous to (2.1-5), answers the realizability question when only partial
data are given. For a proof, see Kalman (1971).
Proposition. (2.1-6) (Realizability Criterion) The minimal partial
realization problem of order M possesses a
solution Σ(M) iff there exist positive integers
ν, μ with ν + μ ≤ M such that

    (R)   ρ(S_{ν,μ}) = ρ(S_{ν+1,μ}) = ρ(S_{ν,μ+1})

where dim Σ(M) = ρ(S_{ν,μ}) = n.

In this proposition (R) is designated the rank condition. Also, it
is important to note that when (R) is satisfied the minimal extension
(of Σ(M)) is unique (see Tether (1970) for proof), but Σ(M) is not
unique, because there exist other minimal partial realizations equivalent
to Σ(M) under a change of basis in X.
We must consider three possible cases when only partial data are
available. In the first case enough data are available, with M sufficiently
large for the known n, so that a minimal realization is found. Second,
ν and μ are available such that (R) is satisfied. In this case a minimal
partial realization can be found, but this in no way insures it is also a
minimal realization of the infinite sequence, since the rank of S_{ν,μ}
may increase as ν, μ increase. Third, the rank condition does not hold.
How can a realization be found when no more data are available? The
only possibility in this case is to extend the sequence until (R) is
satisfied, but there can exist many extensions satisfying (R) while
giving nonminimal realizations. For this reason define a minimal
extension as any that corresponds to a minimal (partial) realization.
To obtain minimality we must somehow select the right extension among
the many possible.
Prior to summarizing the main results of Kalman (1971) and Tether
(1970), define the incomplete Hankel array associated with a given
partial sequence {A_k}, k ∈ M, as

    S(M,M) :=  [ A_1  A_2  ...  A_M
                 A_2  A_3  ...  *
                  :              :
                 A_M  *    ...  *  ]

where the asterisks denote positions where no data are available.
rank of S(M,M) is the number of linearly independent rows (columns)
determined by comparing only the data specified elements in each row
(column) with the preceding rows (columns) with the cognizance that
upon the availability of more data this number can only remain the same
or increase. Thus, the rank is a lower bound for any extension when the
* are filled inconsistent with the preservation of the Hankel pattern.
Both Kalman and Tether show that there are three pertinent integers
associated with the incomplete Hankel array. They are defined as n(M),
ν(M), μ(M) and correspond to the rank of S(M,M), the observability index,
and the controllability index of the given data. The latter two are
lower bounds (separately) for ν and μ. Knowledge of either ν(M) or
μ(M) enables us to construct extensions, since they are the least integers
such that (R) holds for all minimal extensions.
It should also be noted that the integers n, ν, μ, ... are actually
nondecreasing functions of the amount of data available, M, and should
be written n(M), ν(M), μ(M), etc., to be precise. However, the argument
†It also follows from this that ρ(S(M,M)) is a lower bound for dim Σ
(see Kalman (1971)).
M will be understood throughout this dissertation in order to maintain
notational simplicity.
There is one more variant of the partial realization problem that
must be considered. A sequence of minimal partial realizations such
that each lower order realization is contained in one of higher order
will be called a nested realization. Symbolically, this is given by
Σ(M) ⊂ Σ(M') for M < M', in the sense that the matrices of Σ(M)
appear as submatrices of the corresponding matrices in Σ(M'). The solution
to this problem is most desirable from the computational viewpoint,
since each higher order model can be realized by calculating just a few
new elements in the corresponding realization. Rissanen (1971) has
given an efficient recursive algorithm to determine this solution.
Another related problem of interest is determining a unique member of a
class of equivalent systems under similarity; it is discussed in the
following section.
2.2 Invariant System Descriptions
In this section we review some of the fundamental ideas encountered
when examining the invariants of multivariable linear systems. The
framework developed here will be used throughout this dissertation in
formulating and solving various realization problems. Not only does
this formulation enable the determination of unique system representations
under some well-known transformations, but it also provides insight into
the structure of the systems considered. First, we briefly define the
essential terminology and then use it to describe some of the more common
sets of canonical forms employed in many recent applications (e.g., Roman
and Bullock (1975a,b), Tse and Weinert (1975)).
For any two sets X and Y, a subset R ⊆ X × Y† is called a binary
relation on X to Y (or, a relation "between" X and Y). Then (x,y)∈R
is usually written as xRy and is read: "x stands in the relation R to
y." If for X = Y this relation is reflexive, symmetric, and transitive,
then it is an equivalence relation E on X, given by xEy for x,y∈X. The
set of all elements z equivalent to x is denoted by E(x) := {z∈X | xEz} and
is called the equivalence class or orbit of x for the equivalence relation
E. The set of all such equivalence classes is called the quotient set
or orbit space and is given by X/E. Thus, the relation E on X partitions
the set X into a family of mutually disjoint subsets or orbits by sending
elements which are related into the same equivalence class.
Consider a fixed group G†† of transformations acting on a set X.
Then the elements x1, x2 of X are equivalent under the action of G iff
there exists a transformation T∈G which maps x1 into x2. This is basically
the "formula" we will apply throughout, i.e., we first formulate the set
of elements (generally the internal system description), then define a
transformation group, and finally determine the orbits under the action
of G. To be more precise, let us first define the function f mapping
a set X into Y as an invariant for E if for x1, x2 ∈ X, x1Ex2 implies
f(x1)=f(x2). In addition, if f(x1)=f(x2) implies x1Ex2, then f is a
†This is the standard Cartesian product, X×Y = {(x,y) | x∈X, y∈Y}.
††Here we mean "group" in the standard algebraic sense, i.e., (G,∘),
where G is a closed set of elements, each possessing an inverse, and
containing the identity element; ∘ is an associative binary operation.
When ∘ is understood, the group is merely denoted by G.
†††Note that an invariant is actually a function, but common usage
refers to its image as the invariant. We will also use this terminology
throughout this dissertation.
complete invariant. In general we will be interested in a complete
system of invariants for E given by the set of invariants {f_i}, where
f†: X → Y1×Y2×...×Yn, each f_i is an invariant for E, and f1(x1)=f1(x2),
..., fn(x1)=fn(x2) imply x1Ex2. Completeness of this set of invariants
means that the set is sufficient to specify the orbit of x, i.e., there
is a one-to-one correspondence between the equivalence classes in X
and the image of f. If the set of complete invariants is independent,
then the map f: X → Y1×...×Yn is surjective. This property means that
corresponding to every set of values of the invariants there always exists
an n-tuple in Y specified by this set. A complete system of independent
invariants will be called an algebraic basis.
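As an elementary illustration of these definitions (a hypothetical example, not one from the text): congruence modulo 3 is an equivalence relation on a set of integers, and f(x) = x mod 3 is a complete invariant for it, so the quotient set is in one-to-one correspondence with the image of f.

```python
# Equivalence relation E on X: x E y iff x - y is divisible by 3.
# f is an invariant (x E y implies f(x) == f(y)) and complete
# (f(x) == f(y) implies x E y).
def f(x):
    return x % 3

X = list(range(-6, 7))
orbits = {}
for x in X:                      # partition X into orbits keyed by f
    orbits.setdefault(f(x), []).append(x)

print(sorted(orbits))            # [0, 1, 2]: one key per equivalence class
print(orbits[0])                 # [-6, -3, 0, 3, 6]
```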
Generally, we consider a subset of X (e.g., in system theory a
controllable system). Correspondingly, let f0 be a function mapping the
subset X0 of X into the set Y; then f0 is a restriction of f if f0(x)=f(x)
for each x∈X0. We can uniquely characterize an equivalence class E(x)
by means of the set of values of the functions f_i(x), i∈n, where the {f_i}
constitute a complete set of invariants for E on X. If the corresponding
complete invariant f is restricted such that its image is itself a
subset of X, then we have specified a set of canonical forms C for
E on X. To be more precise, a canonical form for X under E is a
member of a subset C⊆X such that: (1) for every x∈X there exists one and
only one c∈C for which xEc; and since C is the image of a complete
invariant f, (2) for any x∈X and c1, c2∈C, xEc1 and xEc2 imply
f(x)=f(c1)=f(c2) and c1=c2 (invariance); (3) for any c∈C, if f(x1)=c and
f(x2)=c, then x1Ex2 (completeness). Thus, c=f(x) is a unique member of
†This notation is actually f=(f1,...,fn): X → Y1×...×Yn, but it is
shortened when the set {f_i} is clearly understood.
E(x) for every xeX. With these definitions in mind, our "formula"
becomes
(i) Formulate the set of elements;
(ii) Define the transformation group;
(iii) Determine a set of complete invariants under this
transformation group; and
(iv) Develop the canonical form in terms of the corresponding
invariants. (2.21)
We now apply (2.21) to various restricted sets related to multivariable
systems. This approach is essentially given in Kalman (1971a), Popov
(1972), Rissanen (1974), or Denham (1974). In the sequel we review the
main results of Popov. First, define the set of matrix pairs (F,G)
as
X0 = {(F,G) | F∈K^(n×n), G∈K^(n×m); (F,G) controllable}
The general linear group, which corresponds to a change of basis
in the state space, is specified by the set
GL(n) := {T | T∈K^(n×n); det T ≠ 0}   (2.22)
with the group operation standard matrix multiplication, i.e.,
T1∘T2 = T1T2.
In order to determine the orbits of X0 under the action of GL(n),
it is first necessary to specify the action operator "·":
T·(F,G) := (TFT⁻¹, TG)
†In general the problem of determining a canonical form is quite
difficult. However, in this dissertation we consider restricted sets
which make the problem much simpler. For a thorough discussion of this
problem see Kalman (1973).
or alternately we can say that the action of GL(n) on X0 induces
F → TFT⁻¹
G → TG
The action of GL(n) induces an equivalence relation on X0. We
indicate (F,G) E_T (F̃,G̃) if there exists T∈GL(n) such that (F̃,G̃) = T·(F,G).
Dual results are defined for the observable pair (F,H) and the
analogous set denoted by X̄0.
The third step of (2.21) is established in Popov (1972), but
first consider the following definitions. For a controllable pair (F,G)
define the jth controllability index†† μ_j, j∈m, as the smallest
positive integer such that the vector F^(μ_j) g_j is a linear combination
of its predecessors, where a predecessor of F^i g_j is any vector F^r g_s
with rm+s < im+j. Here, without loss of generality,
we have assumed ρ(G) = m. Throughout this dissertation we use the
following definition of predecessor independence: a row or column vector
of a given array is independent if it is not a linear combination of
its regular predecessors. The following results were established by
Popov (1972)
Proposition. (2.23) (1) The regular vectors are linearly independent;
(2) the controllability indices satisfy the following
relationship, Σ_{j=1}^{m} μ_j = n; (3) there exists
exactly one set of ordered scalars a_jks ∈ K, defined
for j∈m, k∈j-1, s = 0,1,...,min(μ_j, μ_k - 1), and for
j∈m, k = j,...,m, s = 0,1,...,min(μ_j, μ_k) - 1, such that

               j-1  min(μ_j,μ_k-1)                 m   min(μ_j,μ_k)-1
F^(μ_j) g_j =   Σ        Σ        a_jks F^s g_k  +  Σ        Σ        a_jks F^s g_k
               k=1      s=0                        k=j      s=0

†Throughout this dissertation we use the overbar on a set to denote the
dual set.
††These indices are also called the Kronecker indices.
This proposition follows directly from the controllability of (F,G) and
indicates that the regular vectors form a basis, where the a's are the
coefficients of the linear dependencies. The set [{μ_j},{a_jks}], j,k∈m,
s=0,...,μ_j-1, is defined as the controllability invariants of (F,G),
and μ = max(μ_j). The main result of Popov is:
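The indices can be computed by exactly the predecessor scan just described: examine the vectors F^i g_j in the order g_1,...,g_m, Fg_1,...,Fg_m, ..., and record, for each input, the first power whose vector goes dependent. A minimal numerical sketch (the matrices below are hypothetical illustrations):

```python
import numpy as np

def controllability_indices(F, G):
    """mu[j] is the smallest power i for which F^i g_j depends on its
    predecessors; the scan follows the order rm+s of the text."""
    n, m = G.shape
    kept = np.zeros((n, 0))
    mu = [None] * m
    for i in range(n + 1):                    # powers 0, 1, ..., n
        for j in range(m):                    # inputs, in natural order
            if mu[j] is not None:             # chain j already terminated
                continue
            v = np.linalg.matrix_power(F, i) @ G[:, j]
            trial = np.column_stack([kept, v])
            if np.linalg.matrix_rank(trial) > kept.shape[1]:
                kept = trial                  # F^i g_j is regular
            else:
                mu[j] = i                     # first dependent power
    return mu

F = np.array([[0., 1, 0], [0, 0, 1], [1, 0, 0]])
G = np.array([[1., 0], [0, 1], [0, 0]])
mu = controllability_indices(F, G)
print(mu, sum(mu))   # [2, 1] 3 -- the indices sum to n (Proposition 2.23)
```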
Proposition. (2.24) The controllability invariants are a complete
set of independent invariants for (F,G)∈X0 under
the action of GL(n).
The proof of this proposition is given in Popov (1972) and consists of
verifying the invariance, completeness, and independence of [{μ_j},{a_jks}].
Invariance follows directly from Proposition (2.23), since if (F,G) E_T (F̃,G̃),
then F^s g_k can be replaced by TF^s g_k in the given recursion and the
controllability invariants remain unchanged. Completeness is shown by
constructing a T∈GL(n) such that for two pairs of matrices (F,G),
(F̃,G̃)∈X0 with identical controllability invariants, (F̃,G̃) = (TFT⁻¹, TG)
or (F,G) E_T (F̃,G̃). Independence of the controllability invariants is
obtained by constructing a canonical form determined only in terms of
these invariants. Thus, by introducing a finite set of indices {μ_j},
Popov shows that this set along with the {a_jks} are invariants under the
action of GL(n). The main reason for specifying a set of complete and
independent invariants is that it enables us to uniquely characterize the
orbit of (F,G). It should also be noted that dual results hold for the
observable pair (F,H), and it follows that the observability invariants are the
set [{ν_i},{β_ist}], i,s∈p, t=0,...,ν_i-1, where the {ν_i} are the
observability indices.
The last step of (2.21) is to specify the corresponding canonical
forms under GL(n). These forms are commonly called the Luenberger
forms and are specified by the controllability and observability
invariants. They are defined by the pairs (FC,GC), (FR,HR), where the
subscripts C,R reference the fact that the regular vectors span either
the columns of W_(μ+1) or the rows of V_(ν+1):

FC = [e_2 ... e_(q_1) a_1 | e_(q_1+2) ... e_(q_2) a_2 | ... | e_(q_(m-1)+2) ... e_(q_m) a_m]   (2.25)

GC = [e_1  e_(q_1+1)  ...  e_(q_(m-1)+1)]

         j
  q_j =  Σ  μ_s ,   j∈m
        s=1

       [ e^T_2           ]          [ e^T_1           ]
       [   :             ]          [ e^T_(r_1+1)     ]
       [ e^T_(r_1)       ]          [   :             ]
  FR = [ β^T_1           ]     HR = [ e^T_(r_(p-1)+1) ]   (2.26)
       [ e^T_(r_1+2)     ]
       [   :             ]
       [ β^T_p           ]

         i
  r_i =  Σ  ν_s ,   i∈p
        s=1

where a_j, β^T_i are n-column, n-row vectors containing the {a_jks}, {β_ist}
respectively over appropriate indices and zeros in the other places.
respectively over appropriate indices and zeros in the other places.
Luenberger (1967) shows that the transformation, TC, required to
obtain the pair (FC,GC) is determined from the columns of W as
TC = [T_1  T_2  ...  T_m]   (2.27)
where T_j = [g_j  Fg_j  ...  F^(μ_j-1) g_j],  j∈m,
ρ(G) = m, and g_j is the jth column of G.
Similar results hold for the pair (FR,HR), which is specified by TR
constructed from the rows of V.
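For a single-input pair (m = 1, a hypothetical example) the construction reduces to T_C = [g Fg ... F^(n-1)g], and the change of basis produces the companion structure of (2.25):

```python
import numpy as np

F = np.array([[0., 1, 0], [0, 0, 1], [6, -11, 6]])
g = np.array([1., 1, 0])

# Chain of regular vectors for the single input: g, Fg, F^2 g.
T = np.column_stack([g, F @ g, F @ F @ g])
Fc = np.linalg.solve(T, F @ T)       # T^{-1} F T
Gc = np.linalg.solve(T, g)           # T^{-1} g

# Columns of Fc are e2, e3 and the coefficient column a_1 of the
# dependence F^3 g = 6 g - 11 Fg + 6 F^2 g (Cayley-Hamilton here).
print(np.round(Fc))
print(np.round(Gc))                  # e1, as in G_C of (2.25)
```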
Unfortunately Luenberger (1967), in attempting to develop multivariable
system representations, did not determine the invariants under
GL(n). It is essential to use the approach outlined in (2.21) in
order to obtain the corresponding canonical forms, or else it is possible
to obtain erroneous results. The following example, due to Denham (1974),
shows that the Luenberger form, as originally stated, is not canonical.
If we are given the pair (F,G) as

        [ 0  0  1  1 ]        [ 1  0 ]
    F = [ 1  0  2  1 ]    G = [ 0  0 ]
        [ 0  1  2  1 ]        [ 0  0 ]
        [ 0  0  1  1 ]        [ 0  1 ]

these matrices are in the form of (2.25), but it is easily verified
by constructing TC that the controllability invariants are in fact
μ_1 = 2, μ_2 = 2 and a_1 = [-1 -1 -1 1]^T, a_2 = [-2 0 -2 4]^T. The problem
with the Luenberger forms is that the maps π: X0 → X0/E are not well
defined. Thus, the images of the maps are indeed canonical forms, but
as shown here for (F,G)∈X0/E, we need not have π(F,G)=(F,G), i.e., the
mapping does not leave the canonical forms unchanged. The point to
remember is that the invariants are the necessary entities of interest
which must be determined.
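The computation behind this example can be checked directly; the sketch below expresses F²g_1 and F²g_2 in the regular basis {g_1, Fg_1, g_2, Fg_2} (the numerical values of the a-vectors are my reconstruction, with the sign convention of Proposition (2.23)):

```python
import numpy as np

F = np.array([[0., 0, 1, 1],
              [1, 0, 2, 1],
              [0, 1, 2, 1],
              [0, 0, 1, 1]])
G = np.array([[1., 0], [0, 0], [0, 0], [0, 1]])
g1, g2 = G[:, 0], G[:, 1]

# The regular vectors g1, g2, Fg1, Fg2 are independent, so mu1 = mu2 = 2;
# order them as the chains of T_C: [g1 Fg1 | g2 Fg2].
basis = np.column_stack([g1, F @ g1, g2, F @ g2])
assert np.linalg.matrix_rank(basis) == 4

# Coefficients of F^2 g_j in this basis (recursion of Proposition 2.23).
a1 = np.linalg.solve(basis, F @ F @ g1)
a2 = np.linalg.solve(basis, F @ F @ g2)
print(a1)   # close to [-1, -1, -1, 1]
print(a2)   # close to [-2,  0, -2, 4]
```

Since the a-vectors do not match the entries displayed in (F,G), the pair cannot equal its own canonical form, which is Denham's point.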
The procedure to construct the transformation matrix TC of (2.27)
is called the Luenberger second plan. The first Luenberger plan†
consists of examining the columns of W̃_n, given by
W̃_n = W_n Ψ = [g_1  Fg_1  ...  F^(n-1)g_1 | ... | g_m  Fg_m  ...  F^(n-1)g_m]   (2.28)
where Ψ is an nm×nm permutation matrix, for predecessor independence.
Thus, we can define a new set of invariants (under GL(n)) [{μ̃_j}, {ã_jks}]
completely analogous to the controllability invariants. The canonical
forms associated with the invariants obtained in this fashion have
†This procedure amounts to examining the column vectors of W̃_n for
predecessor independence, i.e., examine g_1, Fg_1, ..., F^(n-1)g_1, ...,
g_m, Fg_m, ..., F^(n-1)g_m.
become known as the Bucy forms, which were derived directly from the
results of Langenhop (1964), Luenberger (1967), and Bucy (1968). We
refer the interested reader to these references as well as the recent
survey by Denham (1974). Here we will be satisfied to note that the
procedure of (2.21) applies with the set of controllable pairs (F,G)
restricted to the {μ̃_j} invariants rather than {μ_j}. Analogous to the
Luenberger forms, we define the row and column Bucy forms as (FBR,HBR),
(FBC,GBC) respectively. The row form is given by
        [ L_11                ]          [ e^T_1                    ]
        [ L_21  L_22          ]          [ e^T_(ν̃_1+1)              ]
  FBR = [  :           ⋱      ]    HBR = [   :                      ]   (2.29)
        [ L_p1  ...    L_pp   ]          [ e^T_(ν̃_1+...+ν̃_(p-1)+1) ]

where

         [   0     I  ]            [   0     ]
  L_ii = [   β^T_ii   ]  ;  L_ij = [ β^T_ij  ]

the ν̃_i > 0 and satisfy Σ_{s=1}^{p} ν̃_s = n; and
β_ii, β_ij are ν̃_i-, ν̃_j-row vectors containing the {β̃_ist} invariants.
The transformation, TBR, required to obtain the pair (FBR,HBR) is
formed by stacking the rows h_i, h_iF, ..., h_iF^(ν̃_i-1) for i∈p:
TBR = [T^T_B1 | T^T_B2 | ... | T^T_Bp]^T ,
T_Bi = [h^T_i | (h_iF)^T | ... | (h_iF^(ν̃_i-1))^T]^T   (2.210)
The importance of the Bucy form is that the characteristic equation can
be found by inspection of the block diagonal arrays of FBR†. Since FBR
is block lower triangular, the characteristic equation is given as
χ_FBR(z) = det(zI − FBR) = χ_L11(z) ... χ_Lpp(z)   (2.211)
where the L_ii are the companion matrices of (2.29). Similar results
hold for the pair (FBC,GBC), and the transformation is specified by TBC
constructed from the columns of W_n.
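The factorization (2.211) is easy to check numerically; a sketch with hypothetical companion blocks:

```python
import numpy as np

L11 = np.array([[0., 1], [-2, 3]])     # companion block, chi = z^2 - 3z + 2
L22 = np.array([[5.]])                 # 1x1 block, chi = z - 5
L21 = np.array([[7., -4]])             # subdiagonal coupling (arbitrary)

FBR = np.block([[L11, np.zeros((2, 1))],
                [L21, L22]])           # block lower triangular

chi_F = np.poly(FBR)                               # det(zI - FBR)
chi_prod = np.polymul(np.poly(L11), np.poly(L22))  # chi_L11 * chi_L22
print(np.round(chi_F, 8))
print(np.round(chi_prod, 8))           # the two coefficient vectors agree
```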
This completes the discussion of invariants and canonical forms for
controllable or observable pairs. To extend these results to matrix
triples (internal system descriptions), it is more convenient to examine
an alternate characterization of the corresponding equivalence class:
the Markov sequence of (2.14). This approach was used by Mayne (1972b)
and Rissanen (1974) in order to determine the orbits of Σ under GL(n).
It is obvious that the sequence is invariant under this group action:
A_j = (HT⁻¹)(TFT⁻¹)^(j-1)(TG) = HF^(j-1)G   (2.212)
Consequently every element of A_j can be considered an invariant of Σ
with respect to GL(n); therefore, two systems which are equivalent under
GL(n) possess identical Markov sequences. The converse is also true, i.e.,
any two minimal systems with identical Markov sequences are equivalent.
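The invariance half is immediate to illustrate numerically; a minimal sketch with hypothetical matrices:

```python
import numpy as np

F = np.array([[0.5, 1.0], [0.0, -0.25]])
G = np.array([[1.0], [1.0]])
H = np.array([[1.0, 2.0]])
T = np.array([[2.0, 1.0], [1.0, 1.0]])          # T in GL(2), det T = 1
Ti = np.linalg.inv(T)

# Markov parameters of the original and of the transformed triple.
Ft, Gt, Ht = T @ F @ Ti, T @ G, H @ Ti
markov = [H @ np.linalg.matrix_power(F, j) @ G for j in range(6)]
markov_t = [Ht @ np.linalg.matrix_power(Ft, j) @ Gt for j in range(6)]

same = all(np.allclose(a, b) for a, b in zip(markov, markov_t))
print(same)   # True: the sequence is a GL(n)-invariant, as in (2.212)
```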
The standard approach to investigate a system characterized by its
Markov sequence is to form the Hankel array, S_N,N', where we define
S_i,N', i∈N, and S_N,j, j∈N', as the block rows and columns of S_N,N', and
the block column and row vectors, a_.r or a^T_s., denote the rth column of
S_1,N' or the sth row of S_N,1, for r∈mN', s∈pN. Rissanen (1974) has shown,
by examining the set
X1 = {Σ | Σ controllable and observable with {μ_j} invariants}
under the action of GL(n), that
Proposition. (2.213) The set of controllability invariants and block
column vectors, [{μ_j},{a_jks},{a_.j}] for the
appropriate indices, constitutes an algebraic basis
for any Σ∈X1 under the action of GL(n).
†It should be noted that the Bucy form is not a canonical form if the
transformation group includes a change of basis in either input or output
spaces, while the Luenberger form is still a canonical form.
The proof of this proposition is given in Rissanen (1974) and consists of
showing that any two members of X1 with identical Markov sequences
are equivalent under GL(n). Thus, invariance follows by showing that a
dependent column vector of the Hankel array can be uniquely represented
in terms of the set [{μ_j},{a_jks}]. These parameters remain unchanged
under GL(n); therefore, they are invariants. The block column vectors,
a_.t, satisfy a recursion analogous to (2.23), i.e.,

              j-1  min(μ_j,μ_k-1)                  m   min(μ_j,μ_k)-1
a_.(j+mμ_j) =  Σ        Σ       a_jks a_.(k+ms)  +  Σ        Σ       a_jks a_.(k+ms)
              k=1      s=0                         k=j      s=0

Thus, all dependent block columns can be generated directly from the set
{a_.t} of regular block column vectors. These vectors are invariants under
GL(n), since every column vector of A_j is an invariant as shown in
(2.212). Completeness follows immediately from the above recursion,
since any two members of X1 possessing identical invariants satisfy the
above recursion and therefore have identical Markov sequences.
Independence is shown by constructing the Luenberger form of (2.25) and (2.214)
below directly from these invariants.
The dual result yields another basis on X1, [{ν_i},{β_ist},{a^T_s.}].
The corresponding canonical forms for Σ∈X1 or X̄1 are given by the
Luenberger pairs of (2.25, 2.26) and
HC = [a_.1 ... a_.(1+m(μ_1-1)) | ... | a_.m ... a_.(mμ_m)]   (2.214)
and

       [ a^T_1.            ]
       [   :               ]
       [ a^T_(1+p(ν_1-1)). ]
  GR = [   :               ]
       [ a^T_p.            ]
       [   :               ]
       [ a^T_(pν_p).       ]

and the canonical triples are denoted by ΣC and ΣR respectively.
Rissanen (1974) also shows that a canonical form for the transfer
function can be constructed from the invariants of (2.213). This is
possible because the determination of canonical forms for Σ based on the
Markov parameters is independent of the origin of the A_k's. Rissanen
defines the (left) matrix fraction description (MFD) as
T(z) := B⁻¹(z)D(z)   (2.215)
where
B(z) = Σ_{i=0}^{ν} B_i z^i   for |B_ν| ≠ 0
D(z) = Σ_{i=0}^{ν-1} D_i z^i .
The relationship of the MFD to the Hankel array, S_(ν+1,ν+1), follows by
writing (2.215) as
B(z)T(z) = D(z)   (2.216)
and equating coefficients of the negative powers of z to obtain the
recursion
B_0 A_j + B_1 A_(j+1) + ... + B_ν A_(j+ν) = 0 ,   j = 1,2,...
Expanding over j gives the relation over the block Hankel rows as
[B_0  B_1  ...  B_ν] S_(ν+1,N') = 0_(p×mN')   (2.217)
where the p×p(ν+1) matrix of B_i's is called the coefficient matrix of
B(z). Similarly, equating coefficients of the nonnegative powers of z
gives the recursion
D_k = B_(k+1)A_1 + B_(k+2)A_2 + ... + B_ν A_(ν-k) ,   k = 0,1,...,ν-1
or, expanding over k, gives the relation in terms of the first block Hankel
column as

  [ D_(ν-1) ]   [ B_ν                    ]
  [ D_(ν-2) ]   [ B_(ν-1)  B_ν           ]
  [   :     ] = [   :             ⋱      ]  S_(ν,1)   (2.218)
  [ D_0     ]   [ B_1   B_2   ...   B_ν  ]
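In the scalar case B(z) may be taken to be the characteristic polynomial, and both recursions can then be verified directly. A sketch with a hypothetical 2-state system:

```python
import numpy as np

F = np.array([[0., 1], [-2, 3]])
g = np.array([[1.], [0]])
h = np.array([[1., 1]])
nu = 2

# Markov parameters A_1, A_2, ...  (A[j] holds A_{j+1}).
A = [(h @ np.linalg.matrix_power(F, j) @ g).item() for j in range(2 * nu)]

# B(z) = z^2 - 3z + 2 (characteristic polynomial); B[i] multiplies z^i.
coeffs = np.poly(F)                       # descending powers
B = {i: coeffs[nu - i] for i in range(nu + 1)}

# Negative powers of z: B_0 A_j + B_1 A_{j+1} + B_2 A_{j+2} = 0, j = 1, 2.
for j in range(2):
    assert abs(sum(B[i] * A[j + i] for i in range(nu + 1))) < 1e-9

# Nonnegative powers: D_k = B_{k+1} A_1 + ... + B_nu A_{nu-k}.
D = {k: sum(B[k + j] * A[j - 1] for j in range(1, nu - k + 1))
     for k in range(nu)}

# Check B(z) T(z) = D(z) at a test point.
z = 1.7
Tz = (h @ np.linalg.inv(z * np.eye(2) - F) @ g).item()
Bz = sum(B[i] * z ** i for i in range(nu + 1))
Dz = sum(D[k] * z ** k for k in range(nu))
print(abs(Bz * Tz - Dz) < 1e-9)           # True
```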
The canonical forms for both left and right MFD's are defined by the
polynomial pairs (BR(z),DR(z)) and (BC(z),DC(z)) respectively, where
R and C have the same meaning as in (2.25, 2.26), and the former is
given by

  BR(z) := [ b^T_1(z) ; ... ; b^T_p(z) ]   (2.219)

where the ith row is
  b^T_i(z) = z^(ν_i) e^T_i − Σ_s Σ_t β_ist z^t e^T_s
over the appropriate index ranges; equivalently, with k = i+pν_i, the
coefficient array of BR(z) has rows b^T_i = [B_k1  B_k2  ...  B_k(i+pν_i)  0 ... 0]
given by

  B_kj := { −β_ist ,  j = s+pt a regular row index
          {    1   ,  j = i+pν_i
          {    0   ,  otherwise

and DR(z) is determined from (2.218).
Dual results hold for the corresponding column vectors b_j, j∈m, of the
coefficient array of BC(z) in terms of the controllability invariants.
This completes the discussion of canonical forms for Σ or T(z).
Note that analogous forms can easily be determined for the Bucy forms
if X1 is restricted to the {μ̃_j} invariants. Henceforth, when we refer
to an invariant system description, we will mean any representation
completely specified by an algebraic basis. In the next section we
develop the theory necessary to realize these representations directly
from the Markov sequence.
2.3 Canonical Realization Theory
In this section we develop the theory necessary to solve the
canonical realization problem, i.e., the determination of a minimal
realization from an infinite Markov sequence, directly in a canonical
form for the action of GL(n). Obviously from the previous discussion,
this solution has an advantage over other techniques which do not obtain
Σ in any specific form. From the computational viewpoint, the simplest
realization technique would be to extract only the most essential
information from the Markov sequence: the invariants under GL(n). Not
only do the invariants provide the minimal information required to
completely specify the orbit of Σ, but they simultaneously specify a
unique representation of this orbit: the corresponding canonical form.
Thus, the subsequent theory is developed with one goal in mind: to
extract the invariants from the given sequence.
The following lemma provides the theoretical core of the subsequent
algorithms.
Lemma. (2.31) Let V_N and W_N' be any full rank factors of S_N,N' = V_N W_N'.
Then each row (column) of S_N,N' is dependent iff it is a
dependent row (column) of V_N (W_N').
Proof. From the factorization S_N,N' = V_N W_N' it follows that if the jth
row of S_N,N' is dependent, then there exists an a∈K^(pN), a ≠ 0,
such that
a^T S_N,N' = 0^T_(mN')
Since ρ(W_N') = n, i.e., W_N' is of full row rank, it follows that
a^T S_N,N' W^T_N' = 0^T_n
or a^T V_N (W_N' W^T_N') = 0^T_n ,
but det(W_N' W^T_N') ≠ 0; thus, a^T V_N = 0^T_n, i.e., a dependent row
of S_N,N' is a dependent row of V_N. Conversely, assume that there
exists a nonzero a as before such that
a^T V_N = 0^T_n
Since ρ(W_N') = n, it follows that this expression remains unaltered
if postmultiplied by W_N', i.e.,
a^T V_N W_N' = 0^T_(mN')
and the desired result follows immediately. ∇
The significance of this lemma is that examining the Hankel rows
(columns) for dependencies is equivalent to examining the rows (columns)
of the observability (controllability) matrix. When these rows (columns)
are examined for predecessor independence, the corresponding
indices and coefficients of linear dependence have special meaning:
they are the observability (controllability) invariants. Thus, the
obvious corollary to this lemma is
Corollary. (2.32) If the rows of the Hankel array are examined for
predecessor independence, then the jth (dependent)
row, where j = i+pν_i, i∈p, is given by

         i-1  min(ν_i,ν_s-1)                   p   min(ν_i,ν_s)-1
γ^T_j =   Σ        Σ       β_ist γ^T_(s+pt)  +  Σ        Σ       β_ist γ^T_(s+pt)
         s=1      t=0                          s=i      t=0

where {β_ist} and {ν_i} are the observability invariants
and γ^T_k, k∈pN, is the kth row vector of S_N,N'.
Proof. The proof is immediate from Proposition (2.23) and Lemma (2.31). ∇
Note that similar results hold for the columns of the Hankel array when
examined for predecessor independence.
In the solution to some problems, knowledge of both the controllability
and observability indices is required. Moore and Silverman (1972)
require both sets of indices to design dynamic compensators in order to
solve the exact model matching problem. Similarly, the requirement exists
in the design of pole placement compensators and also stable observers, as
indicated in Brasch and Pearson (1970) and more recently Roman and
Bullock (1975b). In an on-line application, Saridis and Lobbia (1972)
require the controllability invariants as well as the observability
indices to solve the problem of parameter identification and control.
The latter case exemplifies the fact that in some instances it is first
necessary to determine the structural properties of a system from its
external description prior to compensation.
The need for an algorithm which determines both sets of controllability
and observability invariants from an external system description is
apparent. Computationally, the simplest and most efficient technique to
determine these invariants would be some type of Gaussian elimination
scheme which utilizes elementary operations (e.g., see Faddeeva (1959)).
If we perform elementary row operations on V_N such that the predecessor
dependencies of PV_N are identical to those of V_N, and perform column
operations on W_N' so that W_N'E and W_N' have the same dependencies, then
examination of S*_N,N' = P S_N,N' E is equivalent to the examination of S_N,N'.
We define S*_N,N' as the structural array of S_N,N'. This array is
specified by the indices {ν_i} and {μ_j}, which are the least integers such
that the row and column vectors of S*_N,N' are, respectively,

  γ*^T_(i+pa)   { nonzero   a = 0,...,ν_i-1 }    for i∈p
                { zero      a = ν_i,...,N-1 }

  a*_(j+mb)     { nonzero   b = 0,...,μ_j-1 }    for j∈m
                { zero      b = μ_j,...,N'-1 }

These results follow since S*_N,N' has identical predecessor dependencies
as S_N,N'. Then

  S*_N,N' = [ γ*^T_1 ; ... ; γ*^T_pN ]

where γ*^T_s = 0 if the sth row depends on its predecessors. To find the
observability indices, let a be the index of the last nonzero row among
the rows γ*^T_(i+pt), t = 0,...,N-1. Then if γ*^T_i = 0, ν_i = 0;
otherwise ν_i = (a−i)/p + 1. Similar results follow when S*_N,N' is
expressed in terms of the columns a*_(j+mb). The following theorem
specifies the matrices P and E required to obtain S*_N,N'.
Theorem. (2.33) There exist elementary matrices P and E, respectively
lower and upper triangular with unit diagonal elements,
such that S*_N,N' = P S_N,N' E has identical predecessor
dependencies as S_N,N'.
Proof. Let P S_N,N' = Q, where Q is row equivalent to S_N,N' and P = [p_rs].
If the jth row of S_N,N' is dependent on its predecessors, i.e.,

  γ^T_j + Σ_{k=1}^{j-1} a_jk γ^T_k = 0^T ,

then selecting P lower triangular such that

  p_rs = { 0     r < s
         { 1     r = s
         { a_rs  r > s   (a_rs = 0 when the rth row is regular)

gives this relation. From this choice of P it follows that
dependent rows of S_N,N' are zero rows of Q. If the jth row
of S_N,N' is regular, then P unit diagonal lower triangular
insures that the corresponding row of Q is nonzero and regular.
Similar results hold for the columns of S_N,N' with E unit diagonal
upper triangular.
This choice of P does not alter the column dependencies of S_N,N';
for if the ith column of S_N,N' is dependent on its predecessors,
then from the column version of Corollary (2.32) a_.i is uniquely
represented as a linear combination of regular vectors in terms of
the controllability invariants. Since P is unit diagonal lower
triangular, it is the matrix representation of a nonsingular linear
transformation, P a_.i = q_.i, where q_.i is the ith column vector of Q.
Thus, multiplying on the left every vector a_.i in (2.32) by
this P gives, for i = j+mμ_j,

         j-1  min(μ_j,μ_k-1)                  m   min(μ_j,μ_k)-1
q_.i  =   Σ        Σ       a_jks q_.(k+ms)  +  Σ        Σ       a_jks q_.(k+ms)
         k=1      s=0                         k=j      s=0

Thus, we have shown that selecting P with the given structure
does not alter the predecessor column dependencies of S_N,N' or,
equivalently, Q. Since the column vectors of Q satisfy the
above recursion, S_N,N' and Q have identical predecessor column
dependencies; therefore, performing column operations on Q is
analogous to performing them on S_N,N', and so we have S*_N,N' =
(P S_N,N')E = QE, or the predecessor dependencies of S_N,N' and S*_N,N'
are identical. ∇
This theorem shows that the indices can be found by performing a sequence
of elementary lower triangular row and upper triangular column operations
in a specified manner on the Hankel array and examining the nonzero rows
and columns of S*_N,N', the structural array of S_N,N'. The {a_jks} and
{β_ist} are also easily found by inspection from the proper rows of P and
columns of E, as given by
Corollary. (2.34) The sets of invariants {β_ist}, {a_jks}, or more
compactly the sets of n-vectors {β_i}, {a_j}, are given
by the rows of P and columns of E in (2.33) respectively
as
β^T_i = [p_q1  p_q2  ...  p_q(pN)] ,   q = i+pν_i ,  i∈p
a_j   = [e_1t  e_2t  ...  e_(mN')t]^T ,   t = j+mμ_j ,  j∈m
where p_qq = e_tt = 1, the remaining nonzero entries occupy
the positions of the regular rows (columns), and they contain
the coefficients {β_ist} ({a_jks}) to within the sign
convention of the annihilating combination used in (2.33).
Proof. The proof of this corollary is immediate from Theorem (2.33). ∇
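The reduction can be sketched numerically (a simplified stand-in for the exact P and E of (2.33): dependent rows, then columns, are zeroed against their regular predecessors by a least-squares solve, and the indices are read from the surviving nonzero pattern):

```python
import numpy as np

def structural_array(S, p, m):
    """Zero every row of S that depends on its predecessors, then every
    column of the result; read nu_i, mu_j off the nonzero pattern."""
    Q = S.astype(float).copy()
    for A in (Q, Q.T):                    # rows first, then columns (a view)
        kept = []
        for i in range(A.shape[0]):
            r = A[i].copy()
            if kept:
                K = np.array(kept).T
                c = np.linalg.lstsq(K, A[i], rcond=None)[0]
                r = A[i] - K @ c
            if np.linalg.norm(r) < 1e-8:
                A[i] = 0.0                # dependent: zeroed (role of P or E)
            else:
                kept.append(A[i].copy())
    nu = [int(np.count_nonzero(np.any(Q[i::p] != 0, axis=1))) for i in range(p)]
    mu = [int(np.count_nonzero(np.any(Q[:, j::m] != 0, axis=0))) for j in range(m)]
    return Q, nu, mu

# Hypothetical system: p = 2 outputs, m = 1 input, n = 2 states.
F = np.array([[0., 1], [-2, 3]])
G = np.array([[1.], [0]])
H = np.eye(2)
V = np.vstack([H @ np.linalg.matrix_power(F, i) for i in range(3)])
W = np.hstack([np.linalg.matrix_power(F, i) @ G for i in range(3)])
Q, nu, mu = structural_array(V @ W, p=2, m=1)
print(nu, mu)   # [1, 1] [2]
```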
We can also easily extract the sets of invariant block row or column
vectors, {a^T_s.}, {a_.k}, from the Hankel array, and therefore we have a
solution to the canonical realization problem.
Theorem. (2.35) If the generalized Hankel submatrix of rank n is
transformed by elementary row operations to obtain a row
equivalent array, then by proper choice of P the matrix Q
is given by:

  Q = [ T[G  FG  ...  F^(N'-1)G] ]
      [ 0_((pN-n)×mN')           ]

where (F,G) is a controllable pair and det T ≠ 0.
Proof. If Σ is a minimal realization, then it is well known that
ρ(V_N) = ρ(W_N') = n. Since P is an elementary array, it follows
that ρ[PV_N] ≤ min[ρ(P), ρ(V_N)] = n; thus P can be chosen such that

  P V_N = [ T            ]    with det T ≠ 0.
          [ 0_((pN-n)×n) ]

Postmultiplication by W_N' gives

  P V_N W_N' = [ T ] [G  FG  ...  F^(N'-1)G] = P S_N,N' := Q
               [ 0 ]

Multiplication of the arrays gives the desired result. ∇
Corollary. (2.36) If P is selected such that Q is as in (2.35) with the
pair (F,G) in Luenberger column form, then the set
of invariants {a_j}, j∈m, is given by the columns of
TW_N' = [w_1 ... w_(mN')], with
a_j = w_k ,   k = μ_j m + j
Proof. If P is selected in Theorem (2.35) such that T = T_C⁻¹, then it
follows that the column of TW_N' in the (j+mμ_j)th position,
for each j∈m, contains the {a_jks} invariants. ∇
The method of selecting P is given in the ensuing algorithm.
Theorem. (2.310) Given an infinite realizable Markov sequence
from an unknown system, then Σ_C = (F_C,G_C,H_C)_n is a
minimal canonical realization of {A_k}, with
F_C = [W*_1 | W*_2 | ... | W*_m]
G_C the submatrix of (W_(μ+1))_C given by its first m
columns,
H_C = [a_.1 ... a_.(1+m(μ_1-1)) | ... | a_.m ... a_.(mμ_m)]
and W*_j = [w_(j+m) ... w_(j+mμ_j)], j∈m, where w_k is a
column vector of (W_(μ+1))_C.
Proof. Since the sequence is realizable, there exist integers n, ν, μ
satisfying Proposition (2.15). If Q is given as in Corollary
(2.36), then

  Q = [ (W_k)_C ] = [ G_C  F_C G_C  ...  F_C^(k-1) G_C ]    for k ≥ μ+1
      [    0    ]   [              0                   ]

Thus, G_C is obtained immediately from the first m columns of
(W_k)_C. Form two n×n arrays, A and Ā, each constructed by
selecting n regular columns of (W_k)_C, starting with the first
column for A and the (1+m)th column for Ā. The independent
columns of (W_k)_C are indexed by the μ_j and satisfy (2.38);
thus, they are unit columns and A is a permutation matrix, i.e.,
A = [w_1 ... w_(1+m(μ_1-1)) | ... | w_j ... w_(j+m(μ_j-1)) | ...] ,  j∈m,
where it follows from (2.38) that the columns of A form chains
satisfying
[w_j ... w_(j+m(μ_j-1))] = [e_(q_(j-1)+1) ... e_(q_j)]   for q_j = Σ_{s=1}^{j} μ_s.
Since Ā is A shifted m columns to the right, each chain of Ā
is given by [w_(j+m) ... w_(j+mμ_j)], and again each column is unit
except w_(j+mμ_j) = a_j from Corollary (2.36). Thus, F_C := Ā A⁻¹
gives the matrix of (2.25). H_C is obtained directly from
H[G | FG | ... | F^k G] = [a_.1 ... a_.m | ... | a_.(1+mk) ... a_.(m(k+1))]
since multiplication by the unit columns of (F_C,G_C) selects the
n columns of H_C. ∇
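The theorem's construction can be sketched as a column-selection procedure on the Hankel array (a simplified numerical stand-in, not the exact elimination of the text: the regular columns found by a predecessor scan serve as the state basis, and F, G, H are read off by coordinate solves):

```python
import numpy as np

def realize(markov, p, m, n):
    """Minimal realization from Markov parameters (markov[k] = A_{k+1},
    each p x m). Column i*m+j of the block Hankel array represents
    O F^i g_j; the regular ones form the Luenberger chain basis."""
    N = n + 1
    S = np.block([[markov[i + j] for j in range(N)] for i in range(N)])
    sel = []
    for k in range(N * m):                        # predecessor scan
        if np.linalg.matrix_rank(S[:, sel + [k]]) > len(sel):
            sel.append(k)
        if len(sel) == n:
            break
    V = S[:, sel]                                 # basis columns O F^i g_j
    coords = lambda c: np.linalg.lstsq(V, c, rcond=None)[0]
    Fh = np.column_stack([coords(S[:, k + m]) for k in sel])   # one-power shift
    Gh = np.column_stack([coords(S[:, j]) for j in range(m)])
    Hh = S[:p, sel]
    return Fh, Gh, Hh

# Hypothetical SISO system with Markov sequence 1, -2, -8, -20, -44.
F = np.array([[0., 1], [-2, 3]])
G = np.array([[1.], [0]])
H = np.array([[1., 1]])
markov = [H @ np.linalg.matrix_power(F, k) @ G for k in range(5)]

Fh, Gh, Hh = realize(markov, p=1, m=1, n=2)
rebuilt = [Hh @ np.linalg.matrix_power(Fh, k) @ Gh for k in range(5)]
print(all(np.allclose(a, b) for a, b in zip(markov, rebuilt)))   # True
```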
Analogous results hold for the dual Σ_R. It should also be noted that
if the Hankel array is transformed to S̃_N,N' and both rows and columns
examined for predecessor independence as before, i.e.,
S̃_N,N' := Φ S_N,N' Ψ   (2.311)
where Ψ is given in (2.28) and Φ is a permutation array, then all of the
previous theory is applicable. The only exception in this case is that
the Bucy invariants and forms given by Σ_BR and Σ_BC are obtained instead
of the Luenberger forms. These results follow directly from (2.21).
In many applications the characteristic polynomial χ_F(z) is required.
Many efficient classical methods (e.g., see Faddeeva (1959)) exist to
determine χ_F(z) from the system matrix. Even more recently, some
techniques have been developed to extract the characteristic polynomial
directly from the Markov sequence, but in general they are only valid in
the cyclic case (see Candy et al. (1975)). An alternate solution to this
problem is to obtain the Bucy form and use (2.211) to find χ_F(z) by
inspection. It is possible to realize the system directly in Bucy form as
mentioned in the previous paragraph, but in this dissertation we prefer
to take advantage of the structure of the Luenberger form to construct
T_BR or T_BC. Superficially, this method does not appear simple because
the transformation matrix and its inverse must be constructed, but the
following lemma shows that T_BR can almost entirely be written by
inspection from the observability invariants after the {ν̃_i} are known.
Lemma. (2.312) The transformation matrix T_BR, such that
(F_BR, G_BR, H_BR) = (T_BR F_R T_BR⁻¹, T_BR G_R, H_R T_BR⁻¹),
is given by
T_BR = [T^T_B1 | T^T_B2 | ... | T^T_Bp]^T
If the given triple is in Luenberger form, Σ_R, then the
(ν̃_i×n) submatrices T_Bi are

         [ e^T_(r_(i-1)+1)          ]
         [ e^T_(r_(i-1)+2)          ]
         [   :                      ]
  T_Bi = [ e^T_(r_i)                ] ,   ν̃_i ≥ ν_i
         [ e^T_(r_i) F_R            ]
         [   :                      ]
         [ e^T_(r_i) F_R^(ν̃_i-ν_i) ]

where
the ν_i are the observability invariants of Σ_R,
the ν̃_i are the invariants associated with Σ_BR, and
recall r_i = Σ_{s=1}^{i} ν_s, r_0 = 0.
Proof. This lemma is proved by direct construction of the T_Bi's.
Since each T_Bi satisfies, for ν̃_i ≥ ν_i,

  T_Bi = [ h_i ; h_i F_R ; ... ; h_i F_R^(ν̃_i-1) ] ,

then, analogous to property (2.38), it follows that h_i F_R^(ν_i-1) = e^T_(r_i),
and therefore
h_i F_R^(ν_i) = (h_i F_R^(ν_i-1)) F_R = e^T_(r_i) F_R , ... ,
h_i F_R^(ν̃_i-1) = (h_i F_R^(ν_i-1)) F_R^(ν̃_i-ν_i) = e^T_(r_i) F_R^(ν̃_i-ν_i). ∇
In order to construct TBR it is first necessary to find the {ui} from the
rows of [V ]R, but in this case the i 's can generally be found by
inspection while simultaneously building TBR. Also, TBR is generally a
sparse matrix with unit row vectors; therefore, the inverse can easily be
found by solving T RTB = n directly for the unknown elements of TB .
BR BR n BR'
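Lemma (2.3-12) lends itself to a short computational sketch. The Python fragment below is mine, not the dissertation's; it stacks the rows h_iᵀF_Rᵏ chain by chain and inverts T_BR by solving T_BR T_BR⁻¹ = I_n. The Luenberger-form triple used for illustration is the one realized from the Markov sequence of Example (2.4-2).

```python
import numpy as np

def build_T_BR(F_R, H_R, nubar):
    """Stack the rows h_i^T F_R^k, k = 0..nubar_i-1, chain by chain, as in
    Lemma (2.3-12); in Luenberger form the first nu_i rows of each block
    are unit vectors, so T_BR comes out sparse."""
    rows = []
    for i, nb in enumerate(nubar):
        h = H_R[i]
        for _ in range(nb):
            rows.append(h)
            h = h @ F_R
    return np.array(rows)

# Illustrative Luenberger-form triple (from the realization of Example (2.4-2)):
F_R = np.array([[2., 0, 0, 0],
                [0, 0, 1, 0],
                [3, 0, 1, 1],
                [-1, -1, 1, 1]])
H_R = np.eye(4)[[0, 1, 3]]               # h_1 = e_1, h_2 = e_2, h_3 = e_4

T_BR = build_T_BR(F_R, H_R, nubar=[1, 3, 0])
T_inv = np.linalg.solve(T_BR, np.eye(4))  # solve T_BR T_BR^{-1} = I_n directly
F_BR = T_BR @ F_R @ T_inv                 # Bucy form; chi_F(z) readable by inspection
```

Only the last row of T_BR is non-unit here, so the inverse costs essentially one small solve, which is the point of the lemma.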
In the next section we develop some new algorithms which utilize
the theory developed here.
2.4 Some New Realization Algorithms
In this section we present two new algorithms which can be used to
extract both observability and controllability invariants from the given
Markov sequence. Recall from Theorem (2.3-3) that performing row operations
on the Hankel array does not alter the column dependencies; however, it
is possible to obtain the row-equivalent array, Q, in a form such that
the controllability invariants can easily be found.
The first part of the algorithm consists of performing a restricted
Gaussian elimination procedure (see Faddeeva (1959) for details) on the
Hankel array. The procedure is restricted because there are no row or
column interchanges and the leading element, or first nonzero element, of
each row is not necessarily a one. Define the natural order as 1, 2, ... .
Algorithm. (2.4-1)
(1) Form the augmented array: [I_pN | S_N,N' | I_mN'].
(2) Perform the following row operations on S_N,N' to obtain [P | Q | I_mN']:
    (i) Set the first row of Q equal to the first Hankel row.
    (ii) Search the first column of S_N,N' by examining the rows in
         their natural order to obtain the first leading element.
         This element is q_j1.
    (iii) Perform row operations (without interchange) to obtain q_k1 = 0, k > j.
    (iv) Repeat (ii) and (iii) by searching the columns in their
         natural order for leading elements.
    (v) Terminate the procedure after all the leading elements have
        been determined.
    (vi) Check that at least the last p rows of Q are zero. This assures
         that the rank condition, (R), is satisfied.
(3) Obtain the observability and controllability indices† as in
    Theorem (2.3-3).
(4) Obtain α_iᵀ, i ∈ p̄, from the appropriate rows of P as in Corollary (2.3-4)
    and b_iᵀ as in (2.2-19), where β_ij = p_ij.
(5) Perform the following column operations on Q to obtain [P | S*_N,N' | E]:
    (i) Select the leading element in the first column of Q, q_j1.
    (ii) Perform column operations (without interchange) to obtain
         q_js = 0 for s > 1.
    (iii) Repeat (i) and (ii) until the only nonzero elements in each row
          are leading elements.
(6) Obtain a_j, j ∈ m̄, from the appropriate columns of E as in Corollary
    (2.3-4) and b_j from the dual of (2.2-19).
(7) From the invariants construct the Luenberger and MFD forms as in
    Section (2.2).
If we also require the characteristic polynomial, then we must include:
(8) Determine the {ν̄_i}, i ∈ p̄,†† and (simultaneously) construct T_BR as in
    Lemma (2.3-12).
(9) Find T_BR⁻¹ by solving for the non-unit rows in T_BR T_BR⁻¹ = I_n.
Note that the leading elements have been selected from the rows by examining
the columns in their natural order; therefore, the dependent columns are
not zero as in (2.3-3), but are easily found from this form of Q by
inspection. It should also be noted that the leading elements could have
been selected in the j, (j+m), (j+2m), ... columns, thereby facilitating
the determination of the Bucy invariants and forms.

†Alternately it is possible to extract the Bucy invariants from Q by
reordering the columns as in (2.2-8) to obtain Q̃ = QT and examining
the columns for predecessor dependencies.
††Alternately the dual invariants {μ̄_j}, j ∈ m̄, and T_BC could be used.
These indices can be found easily from the columns of Q.
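The row-operation phase of the algorithm can be sketched in Python. This is a sketch of mine under the same restricted-elimination rules (no interchanges, leading elements found by searching columns, then rows, in natural order), exercised on the Markov parameters of Example (2.4-2) below:

```python
import numpy as np

# Markov parameters of Example (2.4-2): p = 3 outputs, m = 2 inputs.
A = [np.array(blk, float) for blk in [
    [[1, 2], [1, 2], [1, 0]],       [[2, 4], [2, 4], [1, 0]],
    [[4, 8], [6, 10], [3, 2]],      [[8, 16], [13, 22], [6, 6]],
    [[16, 32], [28, 48], [13, 16]], [[32, 64], [58, 102], [27, 38]],
    [[64, 128], [119, 214], [56, 86]]]]
p, m, N = 3, 2, 4

# Block Hankel array S_{N,N}: block (i, j) = A_{i+j-1}.
S = np.block([[A[i + j] for j in range(N)] for i in range(N)])

# Restricted Gaussian elimination: no interchanges; for each column in
# natural order, the first row (in natural order) without an assigned
# leading element and with a nonzero entry becomes the leading row, and
# the column is zeroed in the remaining such rows.
Q = S.copy()
lead = {}                              # column -> row of its leading element
free = list(range(p * N))              # rows without a leading element yet
for c in range(m * N):
    piv = next((r for r in free if abs(Q[r, c]) > 1e-9), None)
    if piv is None:
        continue
    lead[c] = piv
    free.remove(piv)
    for k in free:
        Q[k] -= Q[k, c] / Q[piv, c] * Q[piv]

# Observability invariants nu_i: regular rows per output chain (row r lies
# in chain r mod p); controllability invariants mu_j dually for the columns.
nu = [sum(r % p == i for r in lead.values()) for i in range(p)]
mu = [sum(c % m == j for c in lead) for j in range(m)]
print(nu, mu, len(lead))   # → [1, 2, 1] [3, 1] 4
```

The last p rows of Q come out zero, confirming that (R) holds for this sequence, and the invariants agree with step (3) of Example (2.4-2).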
If we consider the alternate method implied in Corollary (2.3-6), then
the following modifications to the preceding steps are required:
(1)* Start with the following augmented array: [I_pN | S_N,N'].
(2)* Obtain [P | Q] as before.
(5)* Perform additional row operations on Q to obtain unit
     columns for each column possessing a leading row element, and
     perform row interchanges such that (2.3-8) is satisfied
     for each j ∈ m̄, i.e., obtain

         Q̃ = [ Q̂ ; 0 ],

     where Q̂ is (n × mN') and the zero block is ((pN−n) × mN').
(6)* Obtain the a_j, j ∈ m̄, as in (2.3-6).
It should be noted that these algorithms are directly related to
those developed by Ho and Kalman (1966), Silverman (1971), and Rissanen
(1971). As in Ho's algorithm, the basis of the first technique is
performing the special equivalence transformation of Theorem (2.3-3)
on S_N,N' to obtain S*_N,N'. The second technique accomplishes the same
objectives by restricting the operations to only the rows of S_N,N', which
is analogous to either the Silverman or the Rissanen method. Two points of
comparison should be noted: (1) the initial storage requirements of the
first method are greater than those of the second if mN' > pN, even though
P and E can be stored in the same locations due to their lower and upper
triangular structure; and (2) P will be altered in the second method,
since row interchanges must be performed in (5)*, whereas it remains
unaltered in the first method, which may be important in some
applications. Consider the following example, which is solved using both
techniques.
Example. (2.4-2)
Let m = 2, p = 3, and the Hankel array be given as

    S_4,4 =
    [  1   2    2   4    4    8     8   16 ]
    [  1   2    2   4    6   10    13   22 ]
    [  1   0    1   0    3    2     6    6 ]
    [  2   4    4   8    8   16    16   32 ]
    [  2   4    6  10   13   22    28   48 ]
    [  1   0    3   2    6    6    13   16 ]
    [  4   8    8  16   16   32    32   64 ]
    [  6  10   13  22   28   48    58  102 ]
    [  3   2    6   6   13   16    27   38 ]
    [  8  16   16  32   32   64    64  128 ]
    [ 13  22   28  48   58  102   119  214 ]
    [  6   6   13  16   27   38    56   86 ]
(1) Form [I_12 | S_4,4 | I_8].
(2) Performing the row operations as in (2.4-1), obtain [P | Q | I_8],
where the leading elements of Q (circled in the original) are
q_11 = 1, q_25 = 2, q_32 = −2, and q_53 = 2:

    Q =
    [ 1   2   2   4   4   8   8   16 ]
    [ 0   0   0   0   2   2   5    6 ]
    [ 0  −2  −1  −4  −1  −6  −2  −10 ]
    [ 0   0   2   2   5   6  12   16 ]   (row 5)

with rows 4 and 6 through 12 of Q zero, and with the rows of P recording
the dependencies given by

    p_4ᵀ = [ −2  0   0  1   0  0  0  0  0  0  0  0 ]
    p_6ᵀ = [  1  1  −1  0  −1  1  0  0  0  0  0  0 ]
    p_8ᵀ = [ −3  0  −1  0  −1  0  0  1  0  0  0  0 ]
(3) The indices are obtained by inspection from the independent rows
and columns of Q in accordance with Theorem (2.3-3) as:

    ν_1 = 1,  ν_2 = 2,  ν_3 = 1
    μ_1 = 3,  μ_2 = 1

and ρ(S_2,3) = ρ(S_3,3) = ρ(S_2,4) = 4, satisfying (R).
(4) The α_iᵀ and b̄_iᵀ are determined from the appropriate rows of P as:

    α_1ᵀ = [  2 |  0  0 ]        (row 4 = 2·row 1)
    α_2ᵀ = [  3 |  0  1  1 ]     (row 8 = 3·row 1 + row 3 + row 5)
    α_3ᵀ = [ −1  −1  1 | 1 ]     (row 6 = −row 1 − row 2 + row 3 + row 5)

and

    b̄_1ᵀ = [ p_41 p_42 p_43 p_44 | 0 0 ] = [ −2  0   0  1 | 0  0 ]
    b̄_2ᵀ = [ p_81 p_82 p_83 p_84 p_85 p_86 p_87 p_88 ] = [ −3  0  −1  0  −1  0  0  1 ]
    b̄_3ᵀ = [ p_61 p_62 p_63 p_64 p_65 p_66 ] = [ 1  1  −1  0  −1  1 ]
(5) Performing the column operations, obtain the structural array
S*_4,4 and E as [P | S*_4,4 | E]; the only nonzero elements of S*_4,4
are its leading elements (circled in the original), s*_11 = 1, s*_25 = 2,
s*_32 = −2, and s*_53 = 2, and E is the upper triangular matrix recording
the column operations.
(6) The a_j and b_j are determined from the appropriate rows and columns
of E; they carry the coefficients of the column dependencies

    s._4 = −s._1 + (3/2) s._2 + s._3,
    s._7 = −(5/4) s._1 − (1/8) s._2 − (1/4) s._3 + (5/2) s._5,

with the remaining dependent columns generated by the dual of the
recursion of Corollary (2.3-2).
(7) The canonical forms Σ_R, B_R(z), D_R(z) and Σ_C, B_C(z), D_C(z) are:

    F_R = [ α_1ᵀ ; e_3ᵀ ; α_2ᵀ ; α_3ᵀ ] =
    [  2   0   0   0 ]
    [  0   0   1   0 ]
    [  3   0   1   1 ]
    [ −1  −1   1   1 ]

    G_R = [ 1 2 ; 1 2 ; 2 4 ; 1 0 ],    H_R = [ e_1ᵀ ; e_2ᵀ ; e_4ᵀ ],

and, dually,

    F_C = [ e_2 | e_3 | a_1. | a_2. ],    G_C = [ e_1 | e_4 ],
    H_C = [ a._1  a._3  a._5 | a._2 ] = [ 1 2 4 2 ; 1 2 6 2 ; 1 1 3 0 ],

where the columns a_1., a_2. of F_C carry the column-dependency
coefficients found in step (6); the MFD pairs B_R(z), D_R(z) and
B_C(z), D_C(z) then follow from these invariants as in Section (2.2).
(8) The {ν̄_i} and T_BR are determined simultaneously as ν̄_1 = 1,
ν̄_2 = 3 (and ν̄_3 = 0), with

    T_BR = [ e_1ᵀ ; e_2ᵀ ; e_3ᵀ ; h_2ᵀ F_R² ] =
    [ 1  0  0  0 ]
    [ 0  1  0  0 ]
    [ 0  0  1  0 ]
    [ 3  0  1  1 ]

(9) T_BR⁻¹ is given by solving T_BR T_BR⁻¹ = I_4 for the last row:

    T_BR⁻¹ =
    [  1  0   0  0 ]
    [  0  1   0  0 ]
    [  0  0   1  0 ]
    [ −3  0  −1  1 ]
(10) Find F_BR and χ_F(z) as

    F_BR = T_BR F_R T_BR⁻¹ =
    [ 2   0  0  0 ]
    [ 0   0  1  0 ]
    [ 0   0  0  1 ]
    [ 2  −1  0  2 ]

and

    χ_F(z) = (z−2)(z³ − 2z² + 1) = z⁴ − 4z³ + 4z² + z − 2.
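The characteristic polynomial can be checked directly against the Markov sequence; this verification is mine, not part of the algorithm. By the Cayley-Hamilton theorem, χ_F(z) = z⁴ − 4z³ + 4z² + z − 2 implies the recursion A_{r+4} = 4A_{r+3} − 4A_{r+2} − A_{r+1} + 2A_r on the Markov parameters:

```python
import numpy as np

# Markov parameters of Example (2.4-2).
A = [np.array(blk) for blk in [
    [[1, 2], [1, 2], [1, 0]],       [[2, 4], [2, 4], [1, 0]],
    [[4, 8], [6, 10], [3, 2]],      [[8, 16], [13, 22], [6, 6]],
    [[16, 32], [28, 48], [13, 16]], [[32, 64], [58, 102], [27, 38]],
    [[64, 128], [119, 214], [56, 86]]]]

# chi_F(F) = 0 and A_r = H F^{r-1} G give the four-term recursion:
for r in range(3):
    assert (A[r + 4] == 4 * A[r + 3] - 4 * A[r + 2] - A[r + 1] + 2 * A[r]).all()
```

Every parameter in the sequence satisfies the recursion, so the realized χ_F(z) is consistent with the data.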
This then completes the first method. If the second method is used instead,
only (5)*, (6)*, and (8)* differ.
(5)* Performing the additional row operations and interchanges to
satisfy (2.3-8) gives Q̃, in which each column possessing a leading
row element has been reduced to a unit column.
(6)* The a_j's are determined from the appropriate columns of Q̃ as in
(2.3-6), yielding the same column dependencies as in step (6) of the
first method.
This completes the algorithms. In the next chapter the first method is
modified to develop a nested algorithm from finite Markov sequences.
CHAPTER 3
PARTIAL REALIZATIONS
One of the main objectives of this research is to provide an
efficient algorithm to solve the realization problem when only partial
data is given. As new data is made available (e.g., an online
application, Mehra (1971)), it must be concatenated with the old
(previous) data and the entire algorithm rerun. What if the rank of
the Hankel array does not change? Effort is wasted, since the previous
solution remains valid. An algorithm which processes only the new data
and augments these results (when required) to the solution is desirable.
Algorithms of this type are nested algorithms.
In this chapter we show how to modify the algorithm of (2.41)
to construct a nested algorithm which processes data sequentially.
The more complex case of determining a partial realization from a fixed
number of Markov parameters arises when the rank condition, abbreviated
(R), is not satisfied. It is shown not only how to determine the minimal
partial realization in this case, but also how to describe the entire
class of partial realizations. In addition, a new recursive technique
is presented to obtain the corresponding class of minimal extensions and
the determination of the characteristic equation is also considered.
3.1 Nested Algorithm
Prior to the work of Rissanen (1971) no earlier recursive methods
appeared in the realization theory literature. Rissanen uses a
I
factorization technique to solve the partial realization problem when
(R) is satisfied. His algorithm not only solves the problem in a
simple manner, but also provides a method for checking (R) simultaneously.
In the scalar case, Rissanen obtains the partial realizations, Z(K),
K=1,2,... imbedded in the nested problem of (2.1), but unfortunately
this is not true in the multivariable case. Also, neither set of
invariants is obtained.
The development of a nested algorithm to solve the partial
realization problem given in this dissertation follows directly from
(2.41) with minor modification. There are two cases of interest when
only a finite Markov sequence is available.
Case I. (R) is satisfied assuring that a unique partial
realization exists; or
Case II. (R) is not satisfied and an extension sequence
must be constructed.
The nested algorithm will be given under the assumption that Case I
holds in order to avoid the unnecessary complications introduced in
the second case. The modified algorithm is given below. The corresponding
row or column operations are performed only on the data-specified
elements.
Partial Realization Algorithm. (3.1-1)
(1) Same as (1) and (2) of Algorithm (2.4-1), except that in (iii)
    q_kj = 0, k > j, is required only for rows k whose leading elements
    have not yet been specified.
(2) If (R) is satisfied for some M* = ν + μ, obtain the invariants as
    before in (3), (4) of (2.4-1) and go to (5). If not, continue.
(3) Add the next piece of data, A_{M+1}, and form S(M+1, M+1).
(4) Multiply S(M+1, M+1) by P. Perform row operations (if necessary)
    using the old leading elements to obtain Q(M+1, M+1). If (R) is
    satisfied, continue. If not, go to (3).
(5) Perform column operations as in (5) of (2.4-1) and obtain the
    invariants and canonical forms as in (6), (7). Go to (3).
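The rank-condition test used in steps (2) and (4) can be sketched as follows. This is a sketch of mine, not the dissertation's implementation: it checks ρ(S(ν,μ)) = ρ(S(ν+1,μ)) = ρ(S(ν,μ+1)) over the splits ν + μ = M of the available data, assuming exact arithmetic suffices.

```python
import numpy as np

def hank(A, i, j):
    """Block Hankel array S(i, j) built from Markov parameters A_1..A_{i+j-1}."""
    return np.block([[A[r + c] for c in range(j)] for r in range(i)])

def rank_condition(A):
    """Return (nu, mu, n) for the first split nu + mu = M = len(A) where
    rho(S(nu,mu)) = rho(S(nu+1,mu)) = rho(S(nu,mu+1)), or None if (R)
    is not yet satisfied by the given data."""
    M = len(A)
    for nu in range(1, M):
        mu = M - nu
        n = np.linalg.matrix_rank(hank(A, nu, mu))
        if (np.linalg.matrix_rank(hank(A, nu + 1, mu)) == n ==
                np.linalg.matrix_rank(hank(A, nu, mu + 1))):
            return nu, mu, int(n)
    return None

# Markov sequence of Example (3.1-2):
A = [np.array(blk, float) for blk in [
    [[1, 2], [1, 2], [1, 0]],  [[2, 4], [2, 4], [1, 0]],
    [[4, 8], [6, 10], [3, 2]], [[8, 16], [13, 22], [6, 6]],
    [[16, 32], [28, 48], [13, 16]]]]

print(rank_condition(A[:2]), rank_condition(A[:3]), rank_condition(A))
# → (1, 1, 2) None (2, 3, 4)
```

This reproduces the behavior seen in Example (3.1-2): (R) first holds after A_1, A_2 (a 2-dimensional realization), fails when A_3 arrives, and next holds with all five parameters, at dimension 4.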
Example (2.42) will be processed to demonstrate the modified algorithm
for comparison. Assume that the Markov parameters are sequentially
available at discrete times, i.e., Al is received, then A2, etc., and
the system is to be realized.
Example. (3.1-2) Let the Markov sequence be given by

    A_1 = [ 1 2 ; 1 2 ; 1 0 ],   A_2 = [ 2 4 ; 2 4 ; 1 0 ],   A_3 = [ 4 8 ; 6 10 ; 3 2 ],
    A_4 = [ 8 16 ; 13 22 ; 6 6 ],   A_5 = [ 16 32 ; 28 48 ; 13 16 ],

and apply the algorithm of (3.1-1). It is found that the rank condition
is first satisfied when A_1, A_2 are processed, i.e.,
(1) Form [I_6 | S(2,2) | I_4].
(2) Performing first the row and then the column operations as in (3.1-1),
obtain [P | S*(2,2) | E], where

    P =
    [  1  0   0  0  0  0 ]
    [ −1  1   0  0  0  0 ]
    [ −1  0   1  0  0  0 ]
    [ −2  0   0  1  0  0 ]
    [ −2  0   0  0  1  0 ]
    [  0  0  −1  0  0  1 ]

and the leading elements of S*(2,2) are s*_11 = 1 and s*_32 = −2.
(3) Indices are: ν_1 = 1, ν_2 = 0, ν_3 = 1; μ_1 = 1, μ_2 = 1.
(4) Invariants are:

    α_1ᵀ = [ 2 | 0 ]        (row 4 = 2·row 1)
    α_2ᵀ = [ 1 | 0 ]        (row 2 = row 1, since ν_2 = 0)
    α_3ᵀ = [ 0 | 1 ]        (row 6 = row 3)

and

    b̄_1ᵀ = [ p_41 p_42 p_43 p_44 0 0 ] = [ −2  0  0  1  0  0 ]
    b̄_2ᵀ = [ 0  p_21 p_22  0  0  0 ]   = [  0 −1  1  0  0  0 ]
    b̄_3ᵀ = [ p_61 p_62 p_63 p_64 p_65 p_66 ] = [ 0  0 −1  0  0  1 ]

while the dependent columns of S*(2,2) give

    s._3 = s._1 + (1/2) s._2,   s._4 = 2 s._2,

i.e., a._1 = [ 1  1/2 ]ᵀ and a._2 = [ 0  2 ]ᵀ.
(5) Canonical forms are:

    F_C = [ a._1 | a._2 ] = [ 1  0 ; 1/2  2 ],   G_C = [ e_1 | e_2 ] = I_2,
    H_C = [ 1 2 ; 1 2 ; 1 0 ],

    B_C(z) = [ 1 2 ; 1 2 ; 1 0 ],   D_C(z) = [ z−1  0 ; −1/2  z−2 ],

and

    F_R = [ 2  0 ; 0  1 ],   G_R = [ 1 2 ; 1 0 ],
    H_R = [ e_1ᵀ ; wᵀ ; e_2ᵀ ] = [ 1 0 ; 1 0 ; 0 1 ],

where† wᵀ = [ 1  0 ], with

    B_R(z) = [ 1 2 ; 0 0 ; 1 0 ],   D_R(z) = [ z−2  0  0 ; −1  1  0 ; 0  0  z−1 ].

†wᵀ is found easily from H_R G_R = A_1, or, solving for the second row of
H_R, wᵀ G_R = [1 2].
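As a quick numerical cross-check of this two-dimensional realization (a check of mine, not part of the algorithm): with G_C = I_2 and H_C = A_1, the state matrix is the unique solution of A_1 F = A_2, and it predicts the next Markov parameter as H F² G:

```python
import numpy as np

A1 = np.array([[1, 2], [1, 2], [1, 0]], float)
A2 = np.array([[2, 4], [2, 4], [1, 0]], float)

# A1 has full column rank, so F is unique if the data admit dimension 2.
F, *_ = np.linalg.lstsq(A1, A2, rcond=None)
assert np.allclose(A1 @ F, A2)          # the data are consistent with n = 2

A3_pred = A1 @ F @ F                    # H F^2 G, the predicted A_3
print(np.round(A3_pred).astype(int))
```

The computed F equals the F_C above, and the predicted A_3 = [4 8; 4 8; 1 0] differs from the A_3 actually received in the sequence, which is why the rank condition fails once A_3 arrives and the realization must grow.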
The rank condition is next satisfied when A_1, ..., A_5 are processed,
i.e., M* = 5, and we obtain [P | Q(5,5)] and then [P | S*(5,5) | E].
The nonzero data-specified rows of Q(5,5), with leading elements
q_11 = 1, q_25 = 2, q_32 = −2, and q_53 = 2 (circled in the original), are

    row 1:   1    2    2    4    4    8    8   16   16   32
    row 2:   0    0    0    0    2    2    5    6   12   16
    row 3:   0   −2   −1   −4   −1   −6   −2  −10   −3  −16
    row 5:   0    0    2    2    5    6   12   16

All remaining data-specified rows of Q(5,5) are zero, so (R) holds with
ρ(S(5,5)) = 4. Performing the column operations then gives
[P | S*(5,5) | E], from which the invariants, dependency coefficients,
and canonical forms are obtained exactly as in the first method.
The results in this case are identical to those of Example (2.42).
Let us examine the nesting properties of this realization algorithm.
Temporarily, we resort to data-dependent notation for this discussion,
with the same symbols as defined in the previous sections; e.g., the
minimal partial realization based on M Markov parameters is given by
Σ(M) := (F(M), G(M), H(M)), and Σ(M+k) is the corresponding realization
based on M+k parameters. We also assume for this discussion that Σ(M) is
in row canonical form; therefore, it can be expressed in terms of the set
of invariants [{ν_i(M)}, {β_ist(M)}, {α_j.ᵀ(M)}]. If Σ(M) is an
n-dimensional minimal partial realization specified by these invariants,
then there are n regular vectors, s̄_{i+pt}ᵀ(M), spanning the rows of
S(M,M). Furthermore, each dependent row vector, s̄_jᵀ(M), is uniquely
represented as a linear combination of regular vectors in terms of the
observability invariants, and it can be generated from the recursion of
Corollary (2.3-2). Similarly, it follows from Proposition (2.2-13) that
the dependent block row vectors, a_j.(M), satisfy an analogous recursion.
The following lemma describes the nesting properties of minimal partial
canonical realizations. Recall that M* is the integer of Proposition
(2.1-6) given by M* = ν + μ.
Lemma. (3.1-3) Let there exist an integer M*(M) ≤ M such that the rank
condition is satisfied, and let Σ(M) be the corresponding minimal partial
canonical realization of {A_r}, r ∈ M̄, specified by the set of invariants
[{ν_i(M)}, {β_ist(M)}, {α_j.ᵀ(M)}]. Then

    ν_i(M)   = ... = ν_i(M+k)
    β_ist(M) = ... = β_ist(M+k)
    α_j.ᵀ(M) = ... = α_j.ᵀ(M+k)

iff ρ(S(M,M)) = ρ(S(M+1,M+1)) = ... = ρ(S(M+k,M+k)) for the given k.

Proof. If ν_i(M) = ... = ν_i(M+k), etc., then the minimal canonical
partial realizations are identical, Σ(M) = Σ(M+1) = ... = Σ(M+k).
It follows that ρ(S(M,M)) = dim Σ(M) = ρ(S(M+1,M+1)) = dim Σ(M+1) = ... =
ρ(S(M+k,M+k)).
Conversely, ρ(S(M,M)) = ρ(S(M+1,M+1)) = ... = ρ(S(M+k,M+k)) implies
dim Σ(M) = dim Σ(M+1) = ... = dim Σ(M+k). Since Σ(M) is a unique minimal
canonical partial realization, so is Σ(M*). Furthermore, since
each realization has the same dimension, each realization has
M*(M) = M*(M+1) = ... = M*(M+k), so that each canonical
realization is equal to Σ(M*); therefore, Σ(M) = Σ(M+1) = ... = Σ(M+k). ∇
Next we examine the case where Σ(M) and Σ(M+k) are of different
dimension. The nesting properties are given in the following lemma.

Lemma. (3.1-4) Let there exist integers M*(M) ≤ M and M*(M+k) ≤ M+k such
that the rank condition is satisfied (separately), and let
Σ(M), Σ(M+k) be minimal partial canonical realizations
of {A_r} for r ∈ M̄ and r ∈ (M+k)‾, respectively, for given k.
If ρ(S(M+k,M+k)) > ρ(S(M,M)), then ν_i(M+k) ≥ ν_i(M), i ∈ p̄.
Furthermore, α_j.ᵀ(M+k) = α_j.ᵀ(M), j = i, i+p, ..., i+p(ν_i(M)−1).

Proof. Since ρ(S(M+k,M+k)) > ρ(S(M,M)), M*(M+k) > M*(M), and therefore
S_{ν(M),μ(M)} is a submatrix of S_{ν(M+k),μ(M+k)}. If the jth row
of S_{ν(M),μ(M)} is regular, it follows that the jth row of
S_{ν(M+k),μ(M+k)} is also regular by the nature of the Hankel
pattern, i.e., the rows of S_{ν(M),μ(M)} are subrows of those of
S_{ν(M+k),μ(M+k)}. The addition of more data (A_{M+1}, ..., A_{M+k}) to
S(M,M) may make previously dependent rows become independent rows,
but previously independent rows remain independent; thus, the
ν_i(M) can only increase or remain the same, i.e., ν_i(M+k) ≥
ν_i(M), i ∈ p̄. The set of regular {α_j.ᵀ(M+k)} is specified by the
ν_i(M+k)'s; therefore, α_j.ᵀ(M+k) = α_j.ᵀ(M), j = i, i+p, ..., i+p(ν_i(M)−1),
since ν_i(M+k) ≥ ν_i(M), i ∈ p̄. ∇
The results of these two lemmas are directly related to the nesting
properties of the partial realization algorithm. First, define J_M as the
set of indices of regular Hankel row vectors based on the M Markov parameters
available, i.e., J_M = {1, 1+p, ..., 1+p(ν_1(M)−1), ..., p, 2p, ..., pν_p(M)},
and similarly denote the row vectors of P, the elementary row matrix
of the previous chapter, by p_jᵀ(M). From Lemma (3.1-3) it follows
that J_M = J_{M+k} and p_{i+pν_i(M)}ᵀ(M) = ... = p_{i+pν_i(M+k)}ᵀ(M+k), since
the observability invariants are identical. The ν_i specify the
elements in J_M and, along with the β_ist, they specify the elements of
p_{i+pν_i(M)}ᵀ(M) (see Corollary (2.3-4)). From Lemma (3.1-4) it is clear
that J_M ⊆ J_{M+k}, since ν_i(M+k) ≥ ν_i(M).
Reconsider Example (3.1-2) to see these properties. In this
case we have M = 2, k = 3, M*(2) = 2, M*(5) = 5, and ρ(S(5,5)) > ρ(S(2,2)) as
in Lemma (3.1-4); therefore, J_2* ⊂ J_5*, since J_2* = {1,3} and J_5* = {1,3,2,5}.
The observability indices are identical except for ν_2(5) > ν_2(2); thus,
{α_1.ᵀ(2), α_3.ᵀ(2)} ⊂ {α_1.ᵀ(5), α_3.ᵀ(5), α_2.ᵀ(5), α_5.ᵀ(5)}, since
α_j.ᵀ(2) = α_j.ᵀ(5) for j = 1, 3. We also know from Example (2.4-2) that
Σ(5) is the solution to the realization problem, and therefore the
properties of Lemma (3.1-3) will hold for {A_M}, M > 5. Table (3.1-5)
summarizes these properties. The results for the dual case also follow
directly. We now proceed to the case of constructing minimal partial
realizations when (R) is not satisfied, i.e., the construction of minimal
extensions.
Table. (3.1-5) Nesting Properties of Algorithm (3.1-1)

    Quantity          Augment M → M+k,        Augment M → M+k,
                      n(M+k) = n(M)           n(M+k) > n(M)
    J_M               J_M = J_{M+k}           J_M ⊂ J_{M+k}
    p_{i+pν_i}ᵀ       equal                   nested
    ν_i               equal                   ν_i(M+k) ≥ ν_i(M)
    β_ist             equal                   nested

where (R) is satisfied for some k, and ⊂ means that the
corresponding invariants, vectors, or indices are nested, i.e.,
contained in a set of higher order.
3.2 Minimal Extension Sequences
In this section we discuss the more common and difficult
problem of obtaining a minimal partial realization from a finite
Markov sequence when (R) is not satisfied. Two different approaches
for the solution of this problem have evolved. The first is based
on constructing an extension sequence so that (R) is satisfied
and the second is based on extracting a set of invariants from
the given data. We will show that these methods are equivalent
in the sense that they may both lead to the same solution. In order
to do this the existing algorithm is extended to obtain the more
general results of Roman and Bullock (1975a). Also a new recursive
method for obtaining the entire class of minimal extensions is
presented. It is shown that the existing algorithm does in fact
yield a particular solution to this problem which is valuable in
many modeling applications.
In the first approach, Kalman (1971b), Tether (1970), and
subsequently Godbole (1972) examine the incomplete Hankel array
to determine if (R) is satisfied. If so, the corresponding minimal
partial realization is found. If not, a minimal extension is
constructed such that (R) holds, and a realization is found as before.
They show that a minimal extension can always be found, but in
general it will be arbitrary. They also show that this extension
must be constructed so that the rank of S(M,M) remains constant
and the existing row or column dependencies are unaltered.
Considerable confusion has resulted from the degrees of freedom
available in the choice of minimal extensions. In fact, initially,
the major motivation for constructing an extension was that it
was necessary in order to be able to apply Ho's algorithm.
Unfortunately, these approaches obscure the possible degrees of
freedom and may lead to the construction of nonminimal extensions,
as shown by Godbole (1972).
Roman and Bullock (1975a) developed the second approach to
the solution of this problem. They show that examining the columns
or rows of the Hankel array for predecessor independence yields
a systematic procedure for extracting either set of invariants
imbedded in the data. They also show that some of these would-be
invariants are actually free parameters which can be used to
describe the entire class of minimal partial realizations. These
results precisely specify the number of, and the intrinsic relationships
between, these free parameters. Unfortunately, Roman and Bullock
did not attempt to connect their results precisely with those of
Kalman (1971b) and Tether (1970). It will be shown that this connection
offers further insight into the problem as well as new results
which completely describe the corresponding class of minimal extensions.
Before we state the algorithm to extract all invariants available
in the data, let us first motivate the technique. When operating
on the incomplete Hankel array, only the elements specified by the
data are used. It is assumed that the as yet unspecified elements
will not alter the existing predecessor dependencies when they are
specified by an extension sequence. Since the predecessor dependencies
are found by examining only the data in S(M,M), we must examine
complete submatrices of S(M,M) in order to extract the invariants
associated with a particular chain (see Roman and Bullock (1975a)).
Therefore, it is possible that a dependent vector, say s̄_iᵀ, of a
submatrix of S(M,M) later corresponds to an independent vector in S(M,M).
When representing any other dependent vector in this submatrix
in terms of regular predecessors, s̄_iᵀ must be included, since it is
a regular vector of S(M,M) under the above assumption. In this
representation the coefficient of linear dependence corresponding to s̄_iᵀ
is arbitrary. Reconsider Example (3.1-2) for {A_i}, i = 1, 2, 3, where we
only consider the (row) map P.
Example. (3.2-1) For A_1, A_2, A_3 of (3.1-2) we have P: S(3,3) → Q(3,3), or

    S(3,3) =                          Q(3,3) =
    [ 1 2   2 4   4  8 ]              [ 1   2    2  4    4  8 ]
    [ 1 2   2 4   6 10 ]              [ 0   0    0  0    2  2 ]
    [ 1 0   1 0   3  2 ]              [ 0  −2   −1 −4   −1 −6 ]
    [ 2 4   4 8        ]              [ 0   0    0  0         ]
    [ 2 4   6 10       ]              [ 0   0    2  2         ]
    [ 1 0   3 2        ]              [ 0   0    0  0         ]
    [ 4 8              ]              [ 0   0                 ]
    [ 6 10             ]              [ 0   0                 ]
    [ 3 2              ]              [ 0   0                 ]

where only the data-specified elements are shown and the leading elements
q_11 = 1, q_25 = 2, q_32 = −2, and q_53 = 2 are circled in the original.
The indices are {ν_1, ν_2, ν_3} = {1, 2, 1}. Since ν_1 = 1, the fourth row of
S(3,3) (or, equivalently, of Q(3,3)) is dependent on its regular predecessors,
as shown in the corresponding 3×4 submatrix (in dashed lines in the
original) of S(3,3) (or Q(3,3)). The second row, say s̄_2ᵀ, of this
submatrix is dependent, yet it is an independent row of S(3,3) (or Q(3,3)).
Now, expand s̄_4ᵀ of this submatrix, i.e.,

    s̄_4ᵀ = β_110 s̄_1ᵀ + β_120 s̄_2ᵀ + β_130 s̄_3ᵀ,

or

    [2 4 4 8] = β_110 [1 2 2 4] + β_120 [1 2 2 4] + β_130 [1 0 1 0].

The solution is β_110 = 2 − β_120, β_130 = 0; thus, the coefficient β_120 is
an arbitrary parameter. Note that this recursion is essentially the
technique given in Roman and Bullock (1975a).
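The underdetermined expansion above can be checked numerically; this verification is mine, not part of the dissertation. Because rows 1 and 2 of the submatrix coincide, the coefficient family (2 − β_120, β_120, 0) solves the expansion for every value of β_120:

```python
import numpy as np

# Regular predecessors of row 4 in the 3x4 submatrix of Example (3.2-1)
# (rows 1-3 of S(3,3) restricted to its first four columns):
S1 = np.array([[1, 2, 2, 4],
               [1, 2, 2, 4],   # dependent in the submatrix, regular in S(3,3)
               [1, 0, 1, 0]], float)
row4 = np.array([2, 4, 4, 8], float)

# beta_110 = 2 - beta_120, beta_130 = 0 works for ANY beta_120:
for b120 in (0.0, 1.0, -3.5):
    beta = np.array([2 - b120, b120, 0.0])
    assert np.allclose(beta @ S1, row4)
```

One free direction in the coefficient space corresponds to exactly one arbitrary parameter, as the example asserts.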
Clearly, if (R) is satisfied as in the previous section, then there
exists a complete submatrix (data are available for each element) of S(M*,M*)
in which every regular vector of S(M,M) is always a regular vector
of the submatrix corresponding to a particular chain; thus, there
will be no arbitrary or free parameters.
The algorithm for the case when (R) is not satisfied may be
illustrated by considering row operations on S(M,M) to obtain Q(M,M),
since the identical technique can be applied to obtain S*(M,M). The
arbitrary (column) parameters are found by performing additional
column operations on Q(M,M). As in Example (3.2-1), we must find
the largest complete submatrix of Q(M,M) for each chain, i.e., if we define
k_i as the index of the block row of S(M,M) containing the vector s̄_{i+pν_i}ᵀ,
then the largest submatrix of data-specified elements corresponding
to the ith chain is given by the first (i + pν_i − 1) rows and m(M+1−k_i)
columns of Q(M,M). Also, we define J_i, i ∈ p̄, as the set of Hankel row
indices corresponding to each row of the (i + pν_i − 1) × m(M+1−k_i)
submatrix of Q(M,M) that is dependent (zero) in the submatrix but is
independent in Q(M,M) itself, i.e., contains a leading element.
In Example (3.2-1) for i = 1, we have
(1 + pν_1 − 1) = 3 and k_1 = 2; thus, m(M+1−k_1) = 4 and the corresponding
submatrix is given by the first 3 rows and 4 columns of Q(3,3), and,
of course, J_1 = {2}.
Arbitrary Parameter Partial Realization Algorithm. (3.2-2)
(1) Perform (1) of Algorithm (3.1-1) to obtain [P | Q(M,M)].
(2) For each i ∈ p̄, determine the largest (i + pν_i − 1) × m(M+1−k_i)
    submatrix of Q(M,M) of data-specified elements and form the set J_i.
(3) For each i ∈ p̄, replace p_{i+pν_i}ᵀ by p_{i+pν_i}ᵀ + Σ_{j∈J_i} b_j p_jᵀ,
    with each b_j an arbitrary scalar.
(4) Determine the corresponding canonical forms incorporating
    these free parameters.
Dual results hold for the columns. The free parameters are
found in analogous fashion by examining the zero columns of the
submatrices of S*(M,M).
Example. (3.2-3) The following example is from Tether (1970).
For m = p = 2 and the Markov sequence

    A_1 = [ 1 1 ; 0 0 ],  A_2 = [ 2 4 ; 0 0 ],  A_3 = [ 3 10 ; 0 1 ],  A_4 = [ 7 22 ; 3 3 ],

(1) form [P | Q(4,4)], with the leading elements of Q(4,4) circled
as before.†

†It should be noted that when (R) is not satisfied, some of the ν_i may not
be defined, i.e., the last independent row of a chain is in the last block
Hankel row. In this case all would-be invariants are arbitrary.
The indices are: ν_1 = 2, ν_2 = 3.
(2) For i = 1, (1 + pν_1 − 1) = 4, k_1 = 3, m(M+1−k_1) = 4; thus, the
corresponding submatrix is constructed from the first 4 rows and 4 columns
of Q(4,4) (small dashes in the original), and J_1 = {2}.
For i = 2, (2 + pν_2 − 1) = 7, k_2 = 4, m(M+1−k_2) = 2; thus, the
corresponding submatrix of Q(4,4) is given by the first 7 rows and
2 columns (large dashes), and J_2 = {2, 4, 6}.
(3) Replacing the fifth and eighth rows of P with p_5ᵀ + b p_2ᵀ and
p_8ᵀ + c p_2ᵀ + d p_4ᵀ + e p_6ᵀ, where b, c, d, e are real scalars, gives

    p_5ᵀ = [ 2    b  3  0    1  0  0  0 ]
    p_8ᵀ = [ 3+e  c  0  d+e  0  e  0  1 ]

The α_iᵀ are:

    α_1ᵀ = [ 2    3 | b    0  0 ]
    α_2ᵀ = [ 3+e  0 | c  d+e  e ]
(4) The canonical form is

    F_R = [ e_2ᵀ ; α_1ᵀ ; e_4ᵀ ; e_5ᵀ ; α_2ᵀ ] =
    [  0   1  0    0   0 ]
    [  2   3  b    0   0 ]
    [  0   0  0    1   0 ]
    [  0   0  0    0   1 ]
    [ 3+e  0  c  d+e   e ]

with H_R and G_R determined from the appropriate rows of the Hankel
array as before.
Corresponding to these realizations is a minimal extension sequence
which can be found by determining the Markov parameters. These parameters
are cumbersome to obtain due to the general complexity of the expressions
in Σ_R or Σ_C; therefore, a technique was developed to determine these
extensions without forming the Markov parameters directly (or the
realization). This method consists of recursively solving simple linear
equations (one unknown each) to obtain the minimal extension. Extensions
constructed in this manner not only eliminate the possibility of
nonminimality as noted in Godbole (1972), but also describe the entire
class of minimal extensions. The method of constructing the minimal
extension sequence evolves easily from the lower triangular-unit diagonal
structure of P. Since a dependent row of Q(M,M) is a zero row, it
follows from Theorem (2.3-3) that

    q_{i+pν_i, j} = p_{i+pν_i}ᵀ s_{.j} = 0   for j ∈ (mM)‾,        (3.2-4)

where recall that p_{i+pν_i, k} = 0 for k > i + pν_i. Thus, by inserting the
unknown extension parameters x_ij(r), for

    A_r = [ x_11(r) ... x_1m(r) ; ... ; x_p1(r) ... x_pm(r) ],

into S(M,M), a system of linear equations is established in terms of the
x_ij(r)'s by (3.2-4). Due to the structure of P, this system of equations
is decoupled and therefore easily solved.
Example. (3.2-5) Reconsider (3.1-2) for A_1, A_2. Since (R) is satisfied,
the extension A_j, j > 2, is unique. We would like to obtain

    A_3 = [ x_11(3)  x_12(3) ; x_21(3)  x_22(3) ; x_31(3)  x_32(3) ].

Since P maps S(2,2) into Q(2,2), we have

    P · [ 1 2   2 4 ;
          1 2   2 4 ;
          1 0   1 0 ;
          2 4   x_11(3) x_12(3) ;
          2 4   x_21(3) x_22(3) ;
          1 0   x_31(3) x_32(3) ]
      = [ 1 2   2 4 ; 0 0  0 0 ; 0 −2  −1 −4 ; 0 0  0 0 ; 0 0  0 0 ; 0 0  0 0 ],

and in this case {ν_1, ν_2, ν_3} = {1, 0, 1}. Thus, using (3.2-4) and solving

    0 = p_4ᵀ s_{.3} = [−2 0 0 1 0 0] [2 2 1 x_11(3) x_21(3) x_31(3)]ᵀ

for x_11(3) gives x_11(3) = 4.
Similarly, solving:

    p_4ᵀ s_{.4} = 0  for x_12(3) gives x_12(3) = 8
    p_6ᵀ s_{.3} = 0  for x_31(3) gives x_31(3) = 1
    p_6ᵀ s_{.4} = 0  for x_32(3) gives x_32(3) = 0

In this example, x_21(3) = x_11(3) and x_22(3) = x_12(3), since ν_2 = 0.
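The decoupled one-unknown solves of this example can be written out as a short sketch (mine; the dependency rows are those found for Σ(2) above, namely row 2 = row 1, row 4 = 2·row 1, and row 6 = row 3 of S(2,2)):

```python
import numpy as np

A2 = np.array([[2, 4], [2, 4], [1, 0]], float)

# Each null row of P, applied to the columns containing A_3, gives one
# equation in one unknown, per (3.2-4):
X = np.zeros((3, 2))
X[0] = 2 * A2[0]      # p_4: row 4 = 2*row 1  =>  x_1j(3) = 2 * A2 row 1
X[1] = X[0]           # nu_2 = 0: x_2j(3) = x_1j(3)
X[2] = A2[2]          # p_6: row 6 = row 3    =>  x_3j(3) = A2 row 3

A3 = X                # the unique extension predicted by Sigma(2)
print(A3)
```

This reproduces x_11(3) = 4, x_12(3) = 8, x_21 = x_11, x_22 = x_12, x_31(3) = 1, and x_32(3) = 0 with no matrix factorization at all, which is the practical payoff of the triangular structure of P.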
Thus, this example shows that the minimal extension sequence can be
found recursively due to the structure of P. Of course, the problem
of real interest is when (R) is not satisfied and (as in Ho's algorithm)
a minimal extension with arbitrary parameters must be constructed.
Minimal Extension Algorithm. (3.2-6)
(1) Perform (1), (2), (3) of Algorithm (3.2-2).
(2) Determine M* = ν + μ. (The values of ν, μ are determined by the
    partial data.)
(3) Recursively construct the minimal extension {A_r}, r = M+1, ..., M*,
    where A_r = [x_ij(r)], by solving the set of equations for the x_ij(r)
    given by

        p_{i+pν_i}ᵀ s_{.j} = 0,   j = m(M+1−k_i)+1, ..., m(M*+1−k_i),
        for each i ∈ p̄,

    and recall that k_i is the index of the block row of S(M,M) containing
    the row vector s̄_{i+pν_i}ᵀ.
Example. (3.2-7) Reconsider (3.2-3) for illustrative purposes.
(1) These results are given in Example (3.2-3).
(2) M* = 6; thus, find

    A_5 = [ x_11(5)  x_12(5) ; x_21(5)  x_22(5) ]   and
    A_6 = [ x_11(6)  x_12(6) ; x_21(6)  x_22(6) ].

(3) Recursively solve p_{i+pν_i}ᵀ s_{.j} = 0 for i = 1, j = 5, 6, 7, 8 and
for i = 2, j = 3, 4, 5, 6:

    p_5ᵀ s_{.5} = 0 gives x_11(5);     p_5ᵀ s_{.6} = 0 gives x_12(5);
    p_8ᵀ s_{.3} = 0 gives x_21(5);     p_8ᵀ s_{.4} = 0 gives x_22(5);
    p_5ᵀ s_{.7} = 0 gives x_11(6);     p_5ᵀ s_{.8} = 0 gives x_12(6);
    p_8ᵀ s_{.5} = 0 gives x_21(6);     p_8ᵀ s_{.6} = 0 gives x_22(6);

and therefore A_5 and A_6 are obtained as functions of the free parameters
b, c, d, e. The number of degrees of freedom is 4, i.e.,
{x_11(5), x_21(5), x_22(5), x_21(6)}.
The technique used to solve the partial realization problem when (R)
is not satisfied was to extract the most pertinent information from the
given data in the form of the invariants, which completely describe
the class of minimal partial realizations. A recursive method to obtain
the corresponding class of minimal extensions was also presented in (3.2-6).
This method is equivalent to that of Kalman (1971b) or Tether (1970), for
if the minimal extension is recursively constructed and Ho's algorithm
is applied to the resulting Hankel array, the corresponding partial
realization will belong to the same class. Note that if the extension is not
constructed in this fashion, it is possible that all the available degrees
of freedom may not be found (see Roman (1975)). It should be noted that
the integers ν and μ are determined from the given data, i.e., knowledge
of the invariants enables the construction of a minimal extension such
that ν and μ can be found. The approach completely resolves the ambiguity
pointed out by Godbole (1972) arising in the Kalman or Tether technique.
The results given above correspond directly to those presented in
Kalman (1971b) and Tether (1970). They have shown that, when (R) is
satisfied, there exist no arbitrary parameters in the minimal partial
realization or corresponding extension. Therefore, the existence of
arbitrary parameters can be used as a check to see if the rank condition
holds. Although it is not essential to construct both sets of invariants,
it is necessary to determine M*, which requires ν and μ; thus, the
algorithm presented has definite advantages over others, since these
integers are simultaneously determined.
In practical modeling applications, the prediction of model
performance is normally necessary; therefore, knowledge of a minimal
extension is required. Also, in some applications the number of
degrees of freedom may not be of interest if only one partial realization
is required rather than the entire class. In this case such a model is
easily found by setting all free parameters to zero, which corresponds to
merely applying Algorithm (3.1-1) directly to the data and obtaining
the corresponding canonical forms as before.
Describing the class of minimal extensions offers some advantages
over the state space representation in that it is coordinate free and
indicates the number of degrees of freedom available directly.
3.3 Characteristic Polynomial Determination by Coordinate Transformation
In this section we obtain the characteristic equation of the entire
class of minimal partial realizations described by F_R or F_C of the
previous section. It is easily obtained by transforming the realized
F_R or F_C into the Bucy form as before. Recall that the advantage of
this representation over the Luenberger form is that it is possible to
find the characteristic polynomial directly by inspection of F_BR in (2.2-11).
Even though it is possible to realize the system directly in Bucy form,
as implied in the discussion of (2.3-12), it has been found that this
method has serious deficiencies when dealing with finite Markov sequences.
If (R) is satisfied, the partial realization is unique. When (R) is not
satisfied, this technique does not yield all degrees of freedom. For
example, reconsider the arbitrary-parameter realization of Example (3.2-3).
This realization is given in Ackermann (1972) as
          [  0    1    0     0    0 ]            [  0    1    0     0    0 ]
          [ -2    3    b     0    0 ]            [ -2    3    0     0    0 ]
    F_R = [  0    0    0     1    0 ]    F_Ack = [  0    0    0     1    0 ]
          [  0    0    0     0    1 ]            [  0    0    0     0    1 ]
          [ 3+e   0    c    d+e   e ]            [ 3+e   0    c    d+e   e ]
Note that one degree of freedom (b = 0) has been lost. Similarly,
Ledwich and Fortmann (1974) have shown by example that this technique
can also lead to nonminimal realizations. These deficiencies arise from
the procedure used for the determination of the Bucy invariants. This
procedure does not account for the possibility that an independent row
vector of a particular chain may actually be dependent if it is compared
with portions of the same length of vectors in different chains. To
circumvent the problem, the previous technique will be used, i.e., the system
is realized directly in Luenberger form and transformed to Bucy form. Not
only does this assure minimality as well as the determination of all possible
degrees of freedom, but T_BR is almost found by inspection, as shown in
(2.3-12). Reconsider the example of the previous section.
Example (3.3-1). Recall that in (3.2-3), m = p = 2, n = 5, ν1 = 2, ν2 = 3,
and the nontrivial rows of F_R are

    f_1^T = [ -2  3  b  0  0 ],    f_2^T = [ 3+e  0  c  d+e  e ]

(1) Simultaneously construct T_BR from (3.3-4) while examining the rows

    e_1^T F_R^2 = [ -2    3   b    0   0 ]
    e_1^T F_R^3 = [ -6    7   3b   b   0 ]
    e_1^T F_R^4 = [ -14  15   7b   3b  b ]

for predecessor independence, which gives

           [   1    0    0    0   0 ]
           [   0    1    0    0   0 ]
    T_BR = [  -2    3    b    0   0 ]
           [  -6    7   3b    b   0 ]
           [ -14   15   7b   3b   b ]

(2) Determine T_BR^{-1} from T_BR T_BR^{-1} = I_n, which gives

                [  1      0      0      0     0   ]
                [  0      1      0      0     0   ]
    T_BR^{-1} = [ 2/b   -3/b    1/b     0     0   ]
                [  0     2/b   -3/b    1/b    0   ]
                [  0      0     2/b   -3/b   1/b  ]

(3) Determine F_BR:

                                  [    0          1         0        0       0  ]
                                  [    0          0         1        0       0  ]
    F_BR = T_BR F_R T_BR^{-1} =   [    0          0         0        1       0  ]
                                  [    0          0         0        0       1  ]
                                  [ 3b+2c+be  2d+2e-3c   c-3d-e   d-2e-2   e+3  ]

(4) Find the characteristic polynomial by inspection:

    χ_FBR(z) = z^5 - (e+3)z^4 - (d-2e-2)z^3 - (c-3d-e)z^2 - (2d+2e-3c)z - (3b+2c+be)
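The construction in Example (3.3-1) can be checked numerically; the sketch below (numpy, with arbitrary values assumed for the free parameters b, c, d, e) builds T_BR from the rows e_1^T F_R^k, forms F_BR = T_BR F_R T_BR^{-1}, and confirms that F_BR is companion, so that the characteristic polynomial can be read off its last row.

```python
import numpy as np

# Arbitrary (assumed) values for the free parameters of Example (3.3-1)
b, c, d, e = 2.0, 0.5, -1.0, 0.3

F_R = np.array([[0.0,   1.0, 0.0, 0.0, 0.0],
                [-2.0,  3.0, b,   0.0, 0.0],
                [0.0,   0.0, 0.0, 1.0, 0.0],
                [0.0,   0.0, 0.0, 0.0, 1.0],
                [3.0+e, 0.0, c,   d+e, e  ]])

# Bucy transformation: the predecessor-independent rows e1^T F_R^k, k = 0,...,4
e1 = np.eye(5)[0]
T_BR = np.vstack([e1 @ np.linalg.matrix_power(F_R, k) for k in range(5)])

F_BR = T_BR @ F_R @ np.linalg.inv(T_BR)

# F_BR is companion: a superdiagonal of ones, coefficients in the last row
assert np.allclose(F_BR[:4], np.eye(5)[1:], atol=1e-9)

# The polynomial read off the last row agrees with det(zI - F_R)
chi = np.concatenate(([1.0], -F_BR[4, ::-1]))
assert np.allclose(chi, np.poly(F_R), atol=1e-8)
```

With these values det T_BR = b^3, so the transformation is invertible exactly when the free parameter b is nonzero; b = 0 is precisely the degree of freedom lost in the direct Bucy realization.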
This example illustrates several interesting points. When this
technique is combined with the algorithm of (3.2-2), it offers a method
which can be used to obtain the solution to the stable realization
problem developed in Roman and Bullock (1975b). Also, if the system
were realized directly in Bucy form, then b = 0 and a degree of freedom is
lost; thus, in Ackermann's example ℓ1 = 1, while ours is ℓ1 = 5. It is
critical that all degrees of freedom be obtained, as shown in this case,
since the system is observable from a single output.
This section concludes the discussion of the deterministic case of
the realization problem. In the next chapter we examine the stochastic
version of the realization problem.
CHAPTER 4
STOCHASTIC REALIZATION VIA INVARIANT SYSTEM DESCRIPTIONS
In this chapter the stochastic realization problem is examined
by specifying an invariant system description under suitable transformation
groups for the realization. Superficially, this may appear
to be a direct extension of results previously developed, but this is
not the case. It will be shown that the general linear group used in
the deterministic case is not the only group action which must be
considered when examining the Markov sequence for the corresponding
stochastic case.
Analogous to the deterministic realization problem, there are
basically two approaches to consider (see Figure 1): (1) realization
from the matrix power spectral density (frequency domain) by performing
the classical spectral factorization; or (2) realization from the
measurement covariance sequence (time domain) and the solution of a set
of algebraic equations. Direct factorization of the power spectral
density (PSD) matrix is inefficient and may not be very accurate.
Recently developed methods of factoring Toeplitz matrices by using fast
algorithms offer some hope, but are quite tedious. Alternatively,
realization from the covariance sequence is facilitated by efficient
realization algorithms and solutions of the Kalman-Szegö-Popov equations.
[Figure: two routes to the stochastic realization -- realization from the
covariance sequence and algebraic methods, and PSD factorization methods.]
Figure 1. Techniques of Solution to the Stochastic Realization Problem.
The problem considered in this chapter is the determination of a
minimal realization from the output sequence of a linear constant
system driven by white noise. The solution to this problem is well known
(e.g., see Mehra (1971)), as diagrammed below in Figure 2. The output
sequence of an assumed linear system driven by white noise is correlated,
and a realization algorithm is applied to obtain a model whose unit
pulse response is the measurement covariance sequence. A set of algebraic
equations is solved in order to determine the remaining parameters of
the white-noise system.
This problem is further complicated by the fact that the covariance
sequence must be estimated from the measurements. From the practical
viewpoint, the realization is highly dependent on the adequacy of the
estimates. Although in realistic situations the covariance-estimation
problem cannot be ignored, it will be assumed throughout this chapter
that perfect estimates are made, in order to concentrate on the realization
portion of the problem.†
In this chapter we present a brief review of the major results
necessary to solve the stochastic realization problem. We use the
†Majumdar (1976) has shown in the scalar case that, even if imperfect
estimates are made, realization theory can successfully be applied.
[Figure: a white-noise input drives a linear constant system; correlation
techniques applied to the output sequence yield the measurement covariance
sequence; a realization algorithm and the solution of the algebraic
equations then give the stochastic realization.]
Figure 2. A Solution to the Stochastic Realization Problem.
algebraic structure of a transformation group acting on a set to obtain
an invariant system description for this problem. A new realization
algorithm is developed to extract this description from the covariance
sequence. Recently published results establishing an alternate approach
to the solution of this problem are also considered.
4.1 Stochastic Realization Theory
Analogous to the deterministic model of (2.1-1), consider a white
noise (WN) model given by

    x_{k+1} = F x_k + w_k                                    (4.1-1)
    y_k = H x_k

where x and y are the real, zero-mean, n-state and p-output vectors,
and w is a real, zero-mean, white Gaussian noise sequence. The noise
is uncorrelated with the state vector x_j, j ≤ k, and

    Cov(w_i, w_j) := E[(w_i - Ew_i)(w_j - Ew_j)^T] = Q δ_ij

where δ_ij is the Kronecker delta. This model is defined by the triple
WN := (F, I, H) of compatible dimensions, with (F,H) observable and F a
nonsingular† stability matrix, i.e., the eigenvalues of F have magnitude
less than 1. The transfer function of (4.1-1) is denoted by T_WN(z).
†In the discussion that follows, the WN model parameters will be used to
obtain a solution to the stochastic realization problem. Denham (1975)
has shown that if the spectral factors of the PSD are of least degree,
i.e., they possess no poles at the origin, then F is a nonsingular matrix.
The corresponding measurement process is given by

    z_k = y_k + v_k                                          (4.1-2)

where z is the p-measurement vector and v is a zero-mean, white
Gaussian noise sequence, uncorrelated with x_j, j ≤ k, with

    Cov(v_i, v_j) = R δ_ij
    Cov(w_i, v_j) = S δ_ij

for R a p×p positive definite covariance matrix and S an n×p cross-covariance
matrix. Thus, a model of this measurement process is
completely specified by the quintuplet (F,H,Q,R,S).
When a correlation technique is applied to the measurement process,
it is necessary to consider the state covariance defined by

    Π_k := Cov(x_k, x_k)

We assume that the processes are wide-sense stationary; therefore,
Π_k = Π, a constant. It is easily shown from (4.1-1) that the
state covariance satisfies the Lyapunov equation (LE)

    Π = F Π F^T + Q                                          (4.1-3)

It is well known (e.g., see Faurre (1967)) that since F is a stability
matrix, corresponding to any positive semidefinite (covariance) matrix Q
there exists a unique, positive semidefinite solution Π to the (LE).
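For a concrete instance of the (LE), the unique solution Π can be computed by vectorizing Π = FΠF^T + Q; the sketch below (numpy, with an assumed stable F and covariance Q) solves (I - F⊗F)vec(Π) = vec(Q) and checks the stated properties.

```python
import numpy as np

# Assumed example data: F a stability matrix, Q symmetric positive definite
F = np.array([[0.5, 0.1],
              [0.0, 0.8]])
Q = np.array([[1.0, 0.2],
              [0.2, 0.5]])
n = F.shape[0]

# Pi = F Pi F^T + Q  <=>  (I - kron(F, F)) vec(Pi) = vec(Q)
Pi = np.linalg.solve(np.eye(n * n) - np.kron(F, F), Q.ravel()).reshape(n, n)

assert np.allclose(Pi, F @ Pi @ F.T + Q)        # satisfies the LE
assert np.allclose(Pi, Pi.T)                    # symmetric
assert np.min(np.linalg.eigvalsh(Pi)) >= 0.0    # positive semidefinite
```

The Kronecker solve is adequate for the small examples in this chapter; for larger n, a dedicated discrete Lyapunov solver would be preferred.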
The measurement covariance is given (in terms of the lag j) by

    C_j := Cov(z_{k+j}, z_k) = Cov(y_{k+j},y_k) + Cov(y_{k+j},v_k) + Cov(v_{k+j},y_k) + Cov(v_{k+j},v_k)    (4.1-4)

and from (4.1-1) it may be shown that

    C_j = H F^{j-1}(F Π H^T + S),    j > 0                   (4.1-5)
    C_0 = H Π H^T + R
The PSD matrix of the measurement process is obtained by taking the
bilateral z-transform of the sequence C_j defined in (4.1-4), which gives

    Φ_Z(z) = H(I-zF)^{-1}Q(I-z^{-1}F^T)^{-1}H^T + zH(I-zF)^{-1}S + z^{-1}S^T(I-z^{-1}F^T)^{-1}H^T + R    (4.1-6)

It is important to note that this expression is the frequency-domain
representation of the measurement process, which can alternately be
expressed directly in terms of the measurement covariance sequence as

    Φ_Z(z) = Σ_{j=-∞}^{∞} C_j z^j
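The equivalence of the closed-form PSD and the covariance-series representation can be checked numerically. The sketch below (numpy, assumed example data) evaluates both at a point on the unit circle; the factors z and z^{-1} weighting the cross terms reflect the lag indexing C_j = HF^{j-1}(FΠH^T + S) of (4.1-5), an assumption made explicit in the code.

```python
import numpy as np

# Assumed example data for the white-noise model (F, H, Q, R, S)
F = np.array([[0.5, 0.2],
              [0.0, 0.6]])
H = np.array([[1.0, 0.5]])
Q = np.array([[0.8, 0.1],
              [0.1, 0.4]])
R = np.array([[0.7]])
S = np.array([[0.3],
              [0.1]])
n = 2
I = np.eye(n)

# State covariance from the Lyapunov equation
Pi = np.linalg.solve(np.eye(n * n) - np.kron(F, F), Q.ravel()).reshape(n, n)

# Measurement covariances: C_0 and C_j = H F^{j-1} (F Pi H^T + S)
C0 = H @ Pi @ H.T + R
C = lambda j: H @ np.linalg.matrix_power(F, j - 1) @ (F @ Pi @ H.T + S)

z = np.exp(1j * 0.7)  # a point on the unit circle
Fz = np.linalg.inv(I - z * F)
Fzc = np.linalg.inv(I - F.T / z)

# Closed form of Phi_Z(z), with z and 1/z weighting the cross terms
phi_closed = H @ Fz @ Q @ Fzc @ H.T + z * H @ Fz @ S + S.T @ Fzc @ H.T / z + R

# Truncated bilateral series  sum_j C_j z^j,  with C_{-j} = C_j^T
phi_series = C0.astype(complex)
for j in range(1, 400):
    phi_series = phi_series + C(j) * z**j + C(j).T * z**(-j)

assert np.allclose(phi_closed, phi_series, atol=1e-10)
```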
Since the measurement process is stationary and z is real, C_{-k} = C_k^T,
and therefore the PSD can be decomposed as

    Φ_Z(z) = Σ_{j=1}^{∞} C_j z^j + C_0 + Σ_{j=1}^{∞} C_j^T z^{-j}        (4.1-7)

Note that {C_j} is analogous to the Markov sequence of the deterministic
realization problem. We define the problem of determining a quintuplet
(F,H,Q,R,S) in (4.1-6) from Φ_Z(z) or {C_j} as the stochastic realization
problem.
In this chapter we are only concerned with the realization from the
measurement covariance sequence. When a realization algorithm is applied
to the covariance sequence, we define the resulting realization as the
Kalman-Szegö-Popov (KSP) model because of the parameter constraints
(to follow) which evolve from the generalized Kalman-Szegö-Popov lemma
(see Popov (1973)†). Thus, we specify the KSP model as the realization
of {C̃_j}‡ defined by the quadruple KSP := (A,B,C,D) of appropriate
dimension, with transfer function T_KSP(z) = zC(I-zA)^{-1}B + D. Note that
since the unit pulse response of the KSP model is simply related to the
measurement covariance sequence, (4.1-7) can be written as the sum
decomposition

    Φ_Z(z) = T_KSP(z) + T_KSP^T(z^{-1}) = zC(I-zA)^{-1}B + D + D^T + z^{-1}B^T(I-z^{-1}A^T)^{-1}C^T    (4.1-8)

The relationship between the KSP model and the stochastic realization
of the measurement process is shown in the following proposition by
Glover (1973).
Proposition (4.1-9). Let Σ_KSP = (A,B,C,D) be a minimal realization of {C̃_j}.
Then the quintuplet (F,H,Q,R,S) is a minimal stochastic
realization of the measurement process specified
by (4.1-1) and (4.1-2) if there exists a positive
definite, symmetric matrix Π and a T ∈ GL(n) such that
the following KSP equations are satisfied:

    Π - A Π A^T = Q
    D + D^T - C Π C^T = R
    B - A Π C^T = S

where A = T^{-1}FT and C = HT.
The proof of this proposition is given in Glover (1973) and
†This book was published in Romanian in 1966, but the English version
became available in 1973.
‡Note that the sequence {C̃_j} is related to the measurement covariance
sequence as C̃_0 = ½C_0 and C̃_j = C_j for j > 0.
corresponds directly to the results presented by Anderson (1969) in
the continuous case. The proof follows by comparing the two distinct
representations of Φ_Z(z) given by (4.1-6) and (4.1-8). Minimality of
(F,H,Q,R,S) is obtained directly from Theorem (3.72) of Rosenbrock
(1970). The KSP equations are obtained by equating the sum decomposition
of (4.1-8) to (4.1-6).
This proposition gives an indirect method to check whether a given
Σ_KSP and stochastic realization (F,H,Q,R,S) correspond to the same
covariance sequence. Attempts to use the KSP equations to construct
all realizations (F,H,Q,R,S) with identical {C_j} from Σ_KSP and T by
choice of all possible symmetric, positive definite matrices Π will
not work in general, because not all Π's correspond to Q,R,S matrices
that have the properties of a covariance matrix, i.e.,

    Λ := Cov([w_k; v_k]) = [ Q    S ]  ≥ 0                   (4.1-10)
                           [ S^T  R ]

First, it is necessary to question whether the stochastic realization problem
always has a solution, or equivalently, when there is a Π so that
(4.1-10) holds. Fortunately, the well-known PSD property, Φ_Z(z) ≥ 0
on the unit circle (see, e.g., Gokhberg and Krein (1960) and Youla (1961)),
is sufficient to insure the existence of a solution. This result is
available in the generalized Kalman-Szegö-Popov lemma (see Popov (1973)).
Proposition (4.1-11). If (F,H) is completely observable, then Φ_Z(z) ≥ 0
on the unit circle is equivalent to the existence
of a quintuplet (F,H,Q̃,R̃,S̃) such that

    Φ_Z(z) = [ zH(I-zF)^{-1}  I_p ] [ Q̃    S̃ ] [ z^{-1}(I-z^{-1}F^T)^{-1}H^T ]
                                    [ S̃^T  R̃ ] [            I_p             ]

where

    [ Q̃    S̃ ]  =  [ V ] [ V^T  W^T ]  ≥  0
    [ S̃^T  R̃ ]     [ W ]
The proof of this proposition is given in Popov (1973) and essentially
consists of showing that there exists a spectral factorization of the given
PSD. Thus, this proposition assures us that there exists at least one
solution to the stochastic realization problem.
Proposition (4.1-9) shows that once Σ_KSP, T, and Π are determined,
a stochastic realization (F,H,Q,R,S) may be specified; however, it
does not show how to determine Π. Recently many researchers (e.g., Glover
(1973), Denham (1974, 1975), Tse and Weinert (1975)) have studied this
problem. They were interested in obtaining only those solutions to the
KSP equations of (4.1-9) which correspond to a stochastic realization
such that Λ ≥ 0 as in (4.1-11). Denham (1975) has shown that any solution
Π* of the KSP equations which corresponds to a factorization as in
(4.1-11), with V = KN, W = N for K ∈ K^{n×p}, N ∈ K^{p×p}, K full rank and N
symmetric positive definite, is in fact a solution of a discrete Riccati
equation. This can readily be seen by substituting (Q̃,R̃,S̃) = (KNN^TK^T, NN^T, KNN^T)
of (4.1-11) into (4.1-9):

    Π* - A Π* A^T = K NN^T K^T                               (4.1-12)
    D + D^T - C Π* C^T = NN^T
    B - A Π* C^T = K NN^T        for A = T^{-1}FT, C = HT, T ∈ GL(n).
Solving the last equation for K and substituting for NN^T yields

    K = (B - A Π* C^T)(D + D^T - C Π* C^T)^{-1}              (4.1-13)

Now substituting (4.1-13) and NN^T into the first equation shows that Π*
satisfies

    Π* = A Π* A^T + (B - A Π* C^T)(D + D^T - C Π* C^T)^{-1}(B - A Π* C^T)^T    (4.1-14)

a discrete Riccati equation. Thus, in this case the stochastic
realization problem can be solved by (1) obtaining a realization
Σ_KSP from {C̃_j}; (2) solving (4.1-14) for Π*; (3) determining NN^T from
(4.1-12) and K from (4.1-13); and (4) determining Q,R,S from K and NN^T.
A quintuplet specified by T and Π* obtained in this manner is guaranteed
to be a stochastic realization, but at the computational expense of solving
a discrete Riccati equation. Note that solutions of the Riccati equation
are well known, and it has been shown that there exists a unique Π*
which gives a stable, minimum-phase spectral factor (e.g., see Faurre
(1970), Willems (1971), Denham (1975), Tse and Weinert (1975)). We
will examine this approach more closely in a subsequent section, but
first we must find an invariant system description for the stochastic
realization.
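The four-step procedure can be sketched numerically. Assuming a KSP quadruple (A, B, C, D) manufactured from a known white-noise model (so the result can be checked), the code below simply iterates the discrete Riccati equation from Π = 0 to a fixed point Π*, then recovers NN^T, K, and (Q, R, S); the example data and the plain fixed-point iteration (rather than a dedicated Riccati solver) are illustrative choices, not the dissertation's algorithm.

```python
import numpy as np

# Assumed true model, used only to manufacture a consistent KSP quadruple
F = np.array([[0.6, 0.1],
              [0.0, 0.5]])
H = np.array([[1.0, 0.0]])
Qt = 0.4 * np.eye(2)
Rt = np.array([[0.5]])
St = np.zeros((2, 1))
n = 2

Pi_true = np.linalg.solve(np.eye(n * n) - np.kron(F, F), Qt.ravel()).reshape(n, n)
A, C = F, H                            # take T = I
B = F @ Pi_true @ H.T + St             # so that C_j = C A^{j-1} B, j > 0
D = 0.5 * (H @ Pi_true @ H.T + Rt)     # D + D^T = C_0

# Step (2): solve the discrete Riccati equation by fixed-point iteration
Pi = np.zeros((n, n))
for _ in range(500):
    W = B - A @ Pi @ C.T
    Pi = A @ Pi @ A.T + W @ np.linalg.inv(D + D.T - C @ Pi @ C.T) @ W.T

# Steps (3)-(4): recover NN^T, K, and then (Q, R, S)
NNT = D + D.T - C @ Pi @ C.T
K = (B - A @ Pi @ C.T) @ np.linalg.inv(NNT)
Qs, Rs, Ss = K @ NNT @ K.T, NNT, K @ NNT

# The recovered quintuplet satisfies the KSP equations of Proposition (4.1-9)
assert np.allclose(Pi - A @ Pi @ A.T, Qs, atol=1e-8)
assert np.allclose(D + D.T - C @ Pi @ C.T, Rs, atol=1e-8)
assert np.allclose(B - A @ Pi @ C.T, Ss, atol=1e-8)
```

By construction the recovered noise covariance block [Qs Ss; Ss^T Rs] has the factored form [K; I]NN^T[K^T I], so it is automatically positive semidefinite.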
4.2 Invariant System Description of the Stochastic Realization
Suppose we obtain two stochastic realizations by different methods
from the same PSD. We would like to know whether or not there is any
way to distinguish between these realizations. To be more precise,
we would like to know whether or not it is possible to uniquely
characterize the class of all realizations possessing the same PSD.
We first approach this problem from a purely algebraic viewpoint.
We define a set of quintuplets more general than the stochastic
realizations, then consider only those transformation groups acting
on this set which leave the PSD, or equivalently {C_j}, invariant, and
finally specify various invariant system descriptions under these
groups which subsequently prove useful in specifying a stochastic
realization algorithm. The groups employed were first presented by
Popov (1973) in his study of hyperstability. The results we obtain
are analogous to those of Popov as well as those obtained in the
quadratic optimization problem (e.g. see Willems (1971)).
Define the set

    X2 := {(F,H,Q,R,S) | F ∈ K^{n×n}, H ∈ K^{p×n}, Q ∈ K^{n×n}, R ∈ K^{p×p}, S ∈ K^{n×p}; Q, R symmetric}

and consider the following transformation group specified by the set

    GK_n := {L | L ∈ K^{n×n}; L symmetric}

and the operation of matrix addition. Let the action of GK_n on X2 be
defined by

    L + (F,H,Q,R,S) := (F, H, Q - FLF^T + L, R - HLH^T, S - FLH^T)        (4.2-1)
This action induces an equivalence relation on X2, written for each pair
(F,H,Q,R,S), (F̄,H̄,Q̄,R̄,S̄) ∈ X2 as (F̄,H̄,Q̄,R̄,S̄) E_L (F,H,Q,R,S) iff there exists
an L ∈ GK_n such that (F̄,H̄,Q̄,R̄,S̄) = L + (F,H,Q,R,S).
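A quick numerical check (numpy, assumed example data) that the action just defined leaves the measurement covariance sequence invariant: shifting (Q, R, S) by any symmetric L moves the Lyapunov solution from Π to Π + L and leaves every C_j of (4.1-5) unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

F = np.array([[0.5, 0.2],
              [0.0, 0.6]])
H = np.array([[1.0, 0.5]])
Q = np.array([[0.8, 0.1],
              [0.1, 0.4]])
R = np.array([[0.7]])
S = np.array([[0.3],
              [0.1]])
n = 2

def lyap(F, Q):
    """Unique solution of Pi = F Pi F^T + Q for a stability matrix F."""
    m = F.shape[0]
    return np.linalg.solve(np.eye(m * m) - np.kron(F, F), Q.ravel()).reshape(m, m)

def covs(F, H, Q, R, S, N=6):
    """Measurement covariances C_0, ..., C_{N-1} from (4.1-5)."""
    Pi = lyap(F, Q)
    C0 = H @ Pi @ H.T + R
    return [C0] + [H @ np.linalg.matrix_power(F, j - 1) @ (F @ Pi @ H.T + S)
                   for j in range(1, N)]

# Act with an arbitrary symmetric L per (4.2-1)
M = rng.standard_normal((n, n))
L = 0.5 * (M + M.T)
Q2 = Q - F @ L @ F.T + L
R2 = R - H @ L @ H.T
S2 = S - F @ L @ H.T

for Ca, Cb in zip(covs(F, H, Q, R, S), covs(F, H, Q2, R2, S2)):
    assert np.allclose(Ca, Cb)
```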
This group and GL(n) are essential to this discussion, but we must
consider their composite action. Therefore, we define the transformation
group GR_n, which is the cartesian product of GL(n) and GK_n:
GR_n := GL(n) × GK_n. The following proposition specifies GR_n.
Proposition (4.2-2). The closed set GR_n and operation ∘ form a group,
where

    GR_n = {(T,L) | T ∈ GL(n); L ∈ GK_n}

and the group operation is given by

    (T̄,L̄) ∘ (T,L) = (T̄T, L + T^{-1}L̄T^{-T})

Proof. The proof of this proposition follows by verifying the standard
group axioms, with respective identity and inverse elements
(I_n, 0_n) and (T^{-1}, -TLT^T). ∇
Let the action of GR_n on X2 be defined by

    (T,L) + (F,H,Q,R,S) := (TFT^{-1}, HT^{-1}, T(Q - FLF^T + L)T^T, R - HLH^T, T(S - FLH^T))        (4.2-3)
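The group operation of Proposition (4.2-2) can be checked against the action itself: applying (T,L) and then (T̄,L̄) agrees with a single application of their composite. The sketch below (numpy, random assumed test data) verifies this.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 3, 2

def sym(M):
    return 0.5 * (M + M.T)

def act(T, L, F, H, Q, R, S):
    """The GR_n action of (4.2-3)."""
    Ti = np.linalg.inv(T)
    return (T @ F @ Ti, H @ Ti,
            T @ (Q - F @ L @ F.T + L) @ T.T,
            R - H @ L @ H.T,
            T @ (S - F @ L @ H.T))

F = rng.standard_normal((n, n))
H = rng.standard_normal((p, n))
Q = sym(rng.standard_normal((n, n)))
R = sym(rng.standard_normal((p, p)))
S = rng.standard_normal((n, p))

# Two random group elements (the +3I shift keeps T well away from singular)
T1, L1 = rng.standard_normal((n, n)) + 3 * np.eye(n), sym(rng.standard_normal((n, n)))
T2, L2 = rng.standard_normal((n, n)) + 3 * np.eye(n), sym(rng.standard_normal((n, n)))

# (T2,L2) o (T1,L1) = (T2 T1, L1 + T1^{-1} L2 T1^{-T})
Tc = T2 @ T1
Lc = L1 + np.linalg.inv(T1) @ L2 @ np.linalg.inv(T1).T

step = act(T2, L2, *act(T1, L1, F, H, Q, R, S))
once = act(Tc, Lc, F, H, Q, R, S)
for a, b in zip(step, once):
    assert np.allclose(a, b)
```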
An element (F̄,H̄,Q̄,R̄,S̄) of the set X2 is said to be equivalent to the
element (F,H,Q,R,S) of X2 if there exists a (T,L) ∈ GR_n such that
(F̄,H̄,Q̄,R̄,S̄) = (T,L) + (F,H,Q,R,S). This relation is reflexive,

    (F,H,Q,R,S) = (I_n, 0_n) + (F,H,Q,R,S)

and symmetric,

    (T^{-1}, -TLT^T) + (F̄,H̄,Q̄,R̄,S̄) = (T^{-1}, -TLT^T) + ((T,L) + (F,H,Q,R,S))
        = ((T^{-1}, -TLT^T) ∘ (T,L)) + (F,H,Q,R,S) = (I_n, 0_n) + (F,H,Q,R,S).

Transitivity follows from (F̄,H̄,Q̄,R̄,S̄) = (T,L) + (F,H,Q,R,S) and

    (F̂,Ĥ,Q̂,R̂,Ŝ) = (T̄,L̄) + (F̄,H̄,Q̄,R̄,S̄) = (T̄,L̄) + ((T,L) + (F,H,Q,R,S)) = ((T̄,L̄) ∘ (T,L)) + (F,H,Q,R,S).

Thus, GR_n induces an equivalence relation on X2, which we denote by
E_TL, and (4.2-3) defines the partitioning of X2 into classes. Note that
our first objective has been satisfied, i.e., two E_TL-equivalent quintuplets
have the same PSD; for if we let the pair (F,H,Q,R,S),
(F̄,H̄,Q̄,R̄,S̄) ∈ X2, then if (T,L) ∈ GR_n,
    Φ̄_Z(z) = H̄(I-zF̄)^{-1}Q̄(I-z^{-1}F̄^T)^{-1}H̄^T + zH̄(I-zF̄)^{-1}S̄ + z^{-1}S̄^T(I-z^{-1}F̄^T)^{-1}H̄^T + R̄
        = (HT^{-1})T(I-zF)^{-1}T^{-1}[T(Q - FLF^T + L)T^T]T^{-T}(I-z^{-1}F^T)^{-1}T^T(HT^{-1})^T
          + z(HT^{-1})T(I-zF)^{-1}T^{-1}T(S - FLH^T) + z^{-1}(S^T - HLF^T)T^T T^{-T}(I-z^{-1}F^T)^{-1}T^T(HT^{-1})^T
          + R - HLH^T                                        (4.2-4)

or

    Φ̄_Z(z) = Φ_Z(z) + H(I-zF)^{-1}[L - FLF^T - zFL(I-z^{-1}F^T) - z^{-1}(I-zF)LF^T - (I-zF)L(I-z^{-1}F^T)](I-z^{-1}F^T)^{-1}H^T

in which the bracketed term vanishes identically, which gives
Φ̄_Z(z) = Φ_Z(z). The measurement covariance sequence is also
invariant under the action of GR_n on X2, because the PSD is also given by
Φ_Z(z) = Σ_{j=-∞}^{∞} C_j z^j. Thus, we will call any two systems represented by the
quintuplets of X2 covariance equivalent if they are E_TL-equivalent.
Clearly, any two covariance-equivalent systems have identical PSD's
(or measurement covariance sequences). Conversely, any two systems with
identical PSD's are covariance equivalent (see Popov (1973) for proof).
In order to uniquely characterize the class of covariance-equivalent
quintuplets, we must determine an invariant system description
for X2 under the action of GR_n. The number of invariants may be found
by counting the parameters. If we define M1 := dim(F,H,Q,R,S) and
M2 := dim(T,L), then there are M1 = n² + np + ½n(n+1) + ½p(p+1) + np parameters
specifying this quintuplet, and GR_n acts on M2 = n² + ½n(n+1) of them; thus,
there exist M1 - M2 = 2np + ½p(p+1) invariants. If we consider the transformation
(T_R,L) ∈ GR_n to the Luenberger row coordinates, then np of these
invariants specify the canonical pair (F_R,H_R) of (2.2-6). The action
of (T_R,L) on Q,R,S is given by
    Q̄_R = T_R(Q - FLF^T + L)T_R^T = Q_R - F_R L_R F_R^T + L_R        (4.2-5)
    R̄_R = R - (HT_R^{-1})(T_R L T_R^T)(HT_R^{-1})^T = R - H_R L_R H_R^T        (4.2-6)
    S̄_R = T_R(S - FLH^T) = S_R - F_R L_R H_R^T               (4.2-7)

where L_R = T_R L T_R^T, F_R = T_R F T_R^{-1}, H_R = H T_R^{-1}, Q_R = T_R Q T_R^T, and S_R = T_R S.
The transformation L_R acts on ½n(n+1) parameters of the total
½n(n+1) + ½p(p+1) + np parameters available in Q_R, R_R, S_R, as shown above for the
given (F_R,H_R). Once this action is completed, the remaining np + ½p(p+1)
parameters are invariants. There are only four possible ways that L_R
can act on the triple (Q_R,R_R,S_R):

    (i) L_R acts only on Q_R;                                (4.2-8)
    (ii) L_R acts first to specify S_R, with the remaining elements
         of L_R acting on Q_R;
    (iii) L_R acts first to specify R_R, with the remaining elements of
          L_R free to act on Q_R or S_R or both; and
    (iv) L_R acts on any combination of elements in Q_R, R_R, S_R.
If we choose to restrict the action of L_R to only the ½n(n+1)
elements of Q_R, then for any choice of Q̄_R (given (F_R,H_R) and any Q_R),
the transformation L_R is uniquely defined. Since F_R is a nonsingular
stability matrix, it is well known (Gantmacher (1959)) from (4.2-5)
that L_R is the unique solution of L_R - F_R L_R F_R^T = Q* for Q* = Q̄_R - Q_R. It
is important to note that the elements of Q̄_R are completely free, but
once they are selected, L_R is fixed by (4.2-5) for any Q_R, and therefore
the np elements of S̄_R and the ½p(p+1) elements of R̄_R are the invariants.
Thus, for a particular choice of Q̄_R we can uniquely specify the equivalence
class of X2 under the action of GR_n, i.e., (F_R, H_R, Q̄_R, R̄_R, S̄_R) is a
canonical form for E_TL-equivalence on X2.
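The canonical form of case (i) can be computed explicitly: choose Q̄_R, solve L_R - F_R L_R F_R^T = Q̄_R - Q_R for L_R, and read off the invariants S̄_R and R̄_R. The sketch below (numpy, assumed data with p = 1) checks that two equivalent triples (Q, R, S) yield the same invariants.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 3, 1

F = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.3, -0.2, 0.5]])   # an assumed stable, nonsingular F_R
H = np.array([[1.0, 0.0, 0.0]])

def sym(M):
    return 0.5 * (M + M.T)

def canonical(Q, R, S, Qbar):
    # Solve L - F L F^T = Qbar - Q  (unique since F is a stability matrix)
    L = np.linalg.solve(np.eye(n * n) - np.kron(F, F),
                        (Qbar - Q).ravel()).reshape(n, n)
    return S - F @ L @ H.T, R - H @ L @ H.T   # the invariants (Sbar_R, Rbar_R)

Q = sym(rng.standard_normal((n, n)))
R = sym(rng.standard_normal((p, p)))
S = rng.standard_normal((n, p))

# An equivalent triple obtained by the action (4.2-1) with a random symmetric L0
L0 = sym(rng.standard_normal((n, n)))
Q2, R2, S2 = Q - F @ L0 @ F.T + L0, R - H @ L0 @ H.T, S - F @ L0 @ H.T

Qbar = np.zeros((n, n))               # an arbitrary fixed choice of Qbar_R
Sbar1, Rbar1 = canonical(Q, R, S, Qbar)
Sbar2, Rbar2 = canonical(Q2, R2, S2, Qbar)

assert np.allclose(Sbar1, Sbar2) and np.allclose(Rbar1, Rbar2)
```

Equality holds because, by linearity and uniqueness of the Lyapunov-type solve, the two triples produce solutions L and L - L0, whose difference exactly cancels the L0 shift.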
On the other hand, if we choose to let L_R act on the np elements
of S_R, then from (4.2-7) only np - ½p(p+1) elements of L_R are uniquely
specified, i.e., since L_R is symmetric and

    L_R H_R^T = [ ℓ_1  ℓ_{r_1+1} ⋯ ℓ_{r_{p-1}+1} ]

there are ½p(p+1) redundant elements in the ℓ_j's. Thus, for any choice
of S̄_R (given (F_R,H_R) and any S_R), np - ½p(p+1) elements of L_R are uniquely
defined by (4.2-7), and the remaining elements of L_R are free to act on Q_R.
In other words, np - ½p(p+1) elements of Q_R are invariants,† as well as
the elements of R_R, since any choice of S̄_R specifies the elements of
L_R in (4.2-6).
Similarly, restricting the action of L_R to act on the elements of R_R
specifies ½p(p+1) elements of L_R from (4.2-6), and we are free to allow
the remaining elements of L_R to act exclusively on Q_R or S_R or both.
Clearly, there are many choices available to distribute the action of L_R
on Q_R, R_R, S_R; however, the important point is that once the choice is made,
the invariants are specified.
Any choice of symmetric Q̄_R is acceptable, since L_R is uniquely
determined from (4.2-5) for given Q̄_R, F_R; but this is not the case when
an S̄_R is selected. First recall that F_R is nonsingular (see footnote,
p. 81). Then if we define S̃_R := S_R - S̄_R, it follows from (4.2-7) that

    F_R^{-1} S̃_R = L_R H_R^T                                 (4.2-9)

and then
†This was pointed out by Luo (1975) and Majumdar (1975).