MULTIPLE MODELING AND CONTROL OF NONLINEAR SYSTEMS
WITH SELF-ORGANIZING MAPS
By
JEONGHO CHO
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2004
Copyright 2004
by
JEONGHO CHO
This document is dedicated to my late father, my inspiration and the one who gave me
every opportunity to realize my dreams.
ACKNOWLEDGMENTS
Being at CNEL has been not only a wonderful academic experience but also a
unique opportunity to meet many colleagues who helped me immensely along the way.
It is a privilege for me to acknowledge the unconditional support of my supervisor,
Dr. Jose C. Principe, who has been a mentor during my years as a graduate student. His
advice, wisdom, and many invaluable lessons in life have made this dissertation possible.
I sincerely appreciate the help offered by the members of my academic committee,
Dr. John G. Harris, Dr. Michael C. Nechyba, and Dr. Loc Vu-Quoc. Their cooperation
and suggestions have considerably improved the quality of my dissertation. I would also
like to express my gratitude to Dr. Mark A. Motter and Dr. Deniz Erdogmus for their
help and for providing many valuable comments during my years at
CNEL.
My deepest recognition goes to my beloved parents, especially to my late father
who helped me in any imaginable way to achieve my objectives and fulfill my dreams.
They have been an inexhaustible source of love and inspiration all my life.
My most special thanks go to my wife, Joonhee, for her patience, understanding
and encouragement, without which it would have been impossible to complete this
dissertation. Finally, I thank my specially beloved son, Minsuh, hoping that the effort of
these years may offer him a more plentiful life in the years to come.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER
1 INTRODUCTION
  1.1 Motivations
  1.2 Review of Literature
  1.3 Objectives and Author's Contribution
  1.4 Outline
2 SYSTEM IDENTIFICATION VIA MULTIPLE MODELS
  2.1 Local Dynamic Modeling
  2.2 SOM-Based Local Modeling
    2.2.1 Reconstruction of State-Space
      2.2.1.1 Delay reconstruction
      2.2.1.2 Estimating an embedding dimension
    2.2.2 The Self-Organizing Map
      2.2.2.1 Competitive process
      2.2.2.2 Cooperative process
      2.2.2.3 Adaptive process
    2.2.3 Modeling Methodology Based on the SOM
  2.3 Input-Output Representation of Systems
    2.3.1 Classical Approach
    2.3.2 Series-Parallel and Parallel Models
    2.3.3 SOM-based Multiple ARX Models
      2.3.3.1 Selection of operating regions with a SOM
      2.3.3.2 Model development procedure
3 MULTIPLE MODEL BASED CONTROL
  3.1 Discrete-Time Control System
  3.2 Inverse Control via Backpropagation Through Model
  3.3 Multiple Inverse Control
  3.4 Multiple PID Control
4 MULTIPLE QUASI-SLIDING MODE CONTROL
  4.1 Introduction to Variable Structure Systems
    4.1.1 Sliding Hyperplane Design
    4.1.2 Sliding Mode Control Law Design
  4.2 Sliding Mode Control in Sampled-Data Systems
    4.2.1 Quasi-Sliding Mode
    4.2.2 Quasi-Sliding Mode Control Using Multiple Models
  4.3 Analysis of Multiple Quasi-Sliding Mode Control with an Imperfect Sensor
5 CASE STUDIES
  5.1 Controlled Chaotic Systems
    5.1.1 The Lorenz System
    5.1.2 The Duffing Oscillator
  5.2 Nonlinear Discrete-Time Systems
    5.2.1 A First-order Plant
    5.2.2 A Laboratory-scale Liquid-level Plant
  5.3 Flight Vehicles
    5.3.1 Missile Dynamics
    5.3.2 LoFLYTE UAV
6 CONCLUSIONS AND FUTURE WORK
  6.1 Summary
  6.2 Future Work
  6.3 Concluding Remarks
APPENDIX
QSMC FOR MIMO SYSTEM
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES
5-1 Lipschitz index of the controlled Lorenz system for determining an embedding dimension.
5-2 Comparison of modeling performance for the controlled Lorenz system.
5-3 Comparison of modeling performance for the controlled Duffing oscillator.
5-4 Comparison of tracking performance for 3 different control tasks: settling time and NRAIS-SSE.
5-5 Lipschitz index of a laboratory-scale liquid-level plant for determining an embedding dimension.
5-6 Comparison of modeling performance for the liquid-level plant.
5-7 Comparison of control performance for the liquid-level plant in a noise-free environment.
5-8 Comparison of control performance for the liquid-level plant in the presence of sensor noise: standard deviation of noise is 4.5e-2.
5-9 Comparison of modeling performance for the missile system.
5-10 Comparison of controller performance for the missile system in the presence of noise having a standard deviation of 4.6e-2.
5-11 Comparison of modeling performance for the lateral motion (p and r) of the LoFLYTE UAV.
LIST OF FIGURES
2-1 Local modeling scheme on the basis of a SOM
2-2 Two data points that are close in (o but distant in (2
2-3 Kohonen's Self-organizing Map
2-4 Nonlinear dynamic model configuration
2-5 A series-parallel model (left) and a parallel model (right)
2-6 Configuration of local linear modeling based on a SOM
3-1 Classical discrete-time control system
3-2 Modeling and control scheme using the TDNN: (a) TDNN modeling of a plant (b) An inverse controller via Backpropagation through (Plant) Model
3-3 Proposed SOM-based inverse control scheme
3-4 PID controlled system
3-5 Overall schematic diagram of the nonlinear PID closed loop control mechanism using multiple controllers
3-6 Block diagram of PID controller for a SISO plant model
4-1 Phase plane plot of a continuous-time second-order variable structure system
4-2 Discrete-time system response with sliding mode control
5-1 The uncontrolled Lorenz system: phase-space trajectory and time-series
5-2 Generalization error vs. number of PEs (left) and learning curve (right)
5-3 Identification of the controlled Lorenz system by multiple models
5-4 Tracking a fixed point reference signal by MIC: (a) yd =0 (b) yd =8 8
5-5 Comparison of control performance varying the number of inverse controllers based on multiple models
5-6 Comparison of tracking performance by multiple model based controllers (MIC, MPIDC, MQSMC) and global inverse controllers (IC-ARX, TDNNC)
5-7 The uncontrolled Duffing oscillator: phase-space trajectory and time-series
5-8 Lipschitz index (left) for the determination of the optimal number of inputs and outputs, and generalization error vs. number of PEs (right)
5-9 Identification of the controlled Duffing oscillator by TDNN (left) and multiple models (right)
5-10 Performance comparison on trajectory tracking by TDNNC, PID-ARX, and MPIDC when the poles of the closed-loop response are placed at (a) 0.9 (b) 0.5 ± 0.5i (c) 0.25 ± 0.25i (d) 0.05 ± 0.05i
5-11 Control performance by TDNNC (left) and MPIDC (right)
5-12 Parameter selection to design multiple models
5-13 Modeling performance using 64 multiple models for a nonlinear first-order plant
5-14 Responses for parameter selection to design QSMC by varying (a) rT and (b) qT
5-15 Comparison of tracking performance using a global controller and multiple model based controllers in the absence of sensor noise. The figure (right) is an enlargement of the figure (left) between 34 and 52 iterations.
5-16 Performance of square-wave tracking in the absence of noise by the MQSMC
5-17 Comparison of performance against noise among TDNNC, MIC, MPIDC and MQSMC
5-18 Sinusoidal and arbitrary signal tracking by the MQSMC: (a) in the absence of sensor noise (b) in the presence of sensor noise, SNR = 20dB
5-19 Modeling performance using multiple models for a liquid-level plant
5-20 Typical input-output characteristic of the second-order liquid-level plant
5-21 Square-wave tracking performance of the liquid-level plant varying the sliding surface and the noise level by the MQSMC
5-22 Control of the liquid-level plant by the MQSMC varying the number of controllers: (a) M = 1 (b) M = 16 (c) M = 36 (d) M = 144
5-23 Control of the liquid-level system with measurement noise by the MQSMC with (a) M = 1 (b) M = 16 (c) M = 144 and (d) the TDNNC
5-24 Tracking an oscillatory reference signal of the liquid-level plant by the TDNNC (left) and the MQSMC (right) in the presence of sensor noise
5-25 Performance assessment on trajectory tracking in a noisy environment
5-26 Modeling performance using multiple models for the missile dynamics
5-27 Tracking various set-point reference signals by (a) TDNNC (b) MIC (c) MPIDC and (d) MQSMC in the absence of measurement noise
5-28 Tracking various set-point reference signals by (a) TDNNC (b) MIC (c) MPIDC and (d) MQSMC in the presence of measurement noise whose standard deviation is 4.6e-2
5-29 Trajectory tracking by TDNNC (left) and MQSMC (right) in the presence of noise whose standard deviation is 2.3e-1
5-30 Set-point tracking behavior by the TDNNC (left) and the MQSMC (right) under parameter variations
5-31 General description of aircraft (left) and LoFLYTE testbed UAV (right)
5-32 Control inputs (3, 3,) used to generate data samples for training the networks
5-33 Modeling of a roll-rate using multiple models
5-34 Comparison for controlling roll-rate and yaw-rate to track the set point in the absence of noise by (a) TDNNC (b) MIC (c) MQSMC
5-35 Comparison for controlling roll-rate and yaw-rate to track the set point in the presence of noise by (a) TDNNC (b) MIC (c) MQSMC
5-36 Performance of controlling roll-rate and yaw-rate to track an arbitrary trajectory with measurement noise (SNR = 20) by (a) TDNNC (b) MIC (c) MQSMC
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
MULTIPLE MODELING AND CONTROL OF NONLINEAR SYSTEMS
WITH SELF-ORGANIZING MAPS
By
Jeongho Cho
December 2004
Chair: Jose C. Principe
Major Department: Electrical and Computer Engineering
This dissertation is concerned with the development and analysis of a nonlinear
approach to modeling and control of nonlinear complex systems. In particular, the
problem of designing a mathematical model of a nonlinear plant using only observed data
is considered.
For the identification of the plants, the concept of multiple models with switching
is employed in order to simplify both the modeling and the controller design, since a
single controller may have difficulty meeting the design specifications when
the dynamics vary considerably over the operating region. For this reason, a Self-
Organizing Map (SOM) is utilized to divide the operating region into local regions as a
modeling infrastructure to construct local models. The SOM selects the local operating
region based on the embedded output, and the local model is built from the embedded
output as well as the embedded control input data samples that fall within the local
region.
Based on the identified multiple models, the problem of designing controllers is
discussed. Each local linear model is associated with a linear controller, which is easy to
design. Switching of the controllers is done synchronously with the active local linear
model that tracks different operating conditions. The effectiveness of the proposed
approach is shown through experiments on modeling complex nonlinear plants such as
chaotic systems, nonlinear discrete-time systems, and flight vehicles. A comparison with
a neural-network-based alternative, the Time Delay Neural Network (TDNN), shows clear
advantages of local modeling and control in terms of performance.
CHAPTER 1
INTRODUCTION
1.1 Motivations
The identification of nonlinear dynamical systems has received considerable
attention since it is an indispensable step towards analysis, simulation, prediction,
monitoring, diagnosis, and controller design for nonlinear systems [21,65,67]. In
particular, the problem of designing a mathematical model of a nonlinear plant using only
observed data has attracted much interest, both from an academic and an industrial point
of view. During the past few years, neural networks as a global model have been
suggested for nonlinear dynamical black-box modeling and successfully applied to the
prediction and modeling of nonlinear processes [11,46,65,73].
Global models, however, have shown some difficulties in cases when the
dynamical system characteristics vary considerably over the operating regime, effectively
bringing the issue of time varying parameters (or nonlinearity) into the design. On the
other hand, local modeling derives a model based on neighboring samples in the
operating space to characterize some operating point or similar feature [23,85]. If a
function f to be modeled is complicated, there is no guarantee that any given global
representation will approximate f equally across all space. Moreover, nonlinear models
are too complex to be used for controller design [70]. Thus, nonlinear control methods
cannot serve all needs of real industrial control problems.
In this case, the dependence on representation can be reduced using local
approximation, where the domain of f is divided into local regions and a separate model is
used for each region [3,13,24]. In a number of local modeling applications, a Self-
Organizing Map (SOM) has been utilized to divide the operating space into local
regions [30,60,74,97]. The SOM is particularly appropriate for switching, because it
converts complex, nonlinear statistical relationships of high-dimensional data into simple
geometric relationships that preserve the topology in the feature space [44]. Thus the role
of the SOM is to discover patterns in the high dimensional state space and divide that
space into a set of regions represented by the weights of each Processing Element (PE).
Linear models and associated techniques for linear control design are typically used
to control the plant under certain specific operating conditions. This type of control is
only valid in a small region around the operating point. For that reason, the concept of
multiple models with switching, according to a change in dynamics, has been an area of
interest in control theory in order to simplify both modeling and controller design
[62,66]. The motivation for this research, therefore, is to explore control strategies using
SOM-based multiple models for nonautonomous and nonlinear systems.
1.2 Review of Literature
There are many examples in the literature in which the local modeling paradigm
has been successfully applied for the modeling of nonlinear autonomous and
nonautonomous systems. Farmer and Sidorowich [23] have shown that local linear
models, despite their simplicity, provide an effective and accurate approximation of
chaotic dynamical systems. Jacobs et al. [38] have proposed the mixtures of expert
models that are composed of several different expert networks and a gating network that
localizes the experts. They showed that a simple model can be built by dividing a vowel
discrimination task into appropriate subtasks. Bottou and Vapnik [3] have proposed using
local learning algorithms instead of training a complex system with all data samples, and
demonstrated that a set of subsystems trained with a subset of data can improve the
performance for an optical character recognition problem. The neural-gas architecture
proposed by Martinetz et al. [53] is similar to the SOM in that the competitive network
divides the input space into a set of smaller regions and then local linear models are created
by an LMS-like rule. They showed that the neural-gas network outperforms MLPs and
RBF networks for time series prediction. The same group [77] used a SOM for the
control of a robotic arm. Murray-Smith et al. [61] similarly have extended RBF networks
where each local model is a linear function of the input and exhibited great success in
control problems. Principe and Wang [74] have successfully modeled a chaotic system
with a SOM-based local linear modeling method. Vesanto [94] and Moshou and Ramon
[60] proposed schemes that essentially follow local linear modeling based on the SOM
topology for nonautonomous systems. Under some conditions, it has been shown that
multiple models can uniformly approximate any system on a closed subset of the state
space provided a sufficient number of local models are given.
Generally, control using multiple models falls into two categories:
global model-based control using local models and multiple model-based control with
switching. Global controller design with the aid of multiple linear models has been
extensively reported in the literature. Gain scheduling has been perhaps the most
common systematic approach to control nonlinear systems in practice due to its simple
design and tuning [47,68,79]. The multiple model adaptive control approach differs from
gain scheduling mainly in the use of an estimator-based scheduling algorithm used to
weight the local controllers. Murray-Smith and Hunt [61] utilized an extended RBF
network where each local model is a linear function of the input and reported great
success for control problems. The overall controller is designed based on the local models
and a validity function to guarantee smooth interpolation. Similarly, Foss et al. [24] and
Gawthrop and Ronco [29] employed model predictive controllers and self-tuning
predictive controllers, respectively, using multiple models. Palizban et al. [70] attempted
to control nonlinear systems with the linear quadratic optimal control technique using
multiple linear models and provided the stability condition for the closed loop system.
Ishigame et al. [37] proposed the sliding mode control scheme based on fuzzy modeling
composed of a weighted average of linear systems to stabilize an electric power system.
In contrast, Narendra et al. [66] proposed the multiple model approach in the
context of adaptive control with switching where local model performance indices have
been used to select the local controller. Subsequently, Narendra and Balakrishnan [64]
proposed different switching and tuning schemes for adaptive control that combine fixed
and adaptive models yielding a fast and accurate response. Principe et al. [75] proposed a
SOM-based local linear modeling strategy and predictive multiple model switching
controller to control a wind tunnel and showed improved performance with decreased
control effort over both the existing controller and an expert human-in-the-loop control.
Later, Narendra and Xiang [63] proved that the adaptive control using multiple models is
globally stable and that the tracking error converges to zero in the deterministic case.
Diao and Passino [20] applied multiple model based adaptive schemes to the fault
tolerant engine control problem. A linear robust adaptive controller and multiple
nonlinear neural network based adaptive controllers were exploited by Chen and
Narendra [11]. Thampi et al. [89,90] have also shown the applicability of the multiple
model approach based on the SOM for flight control.
1.3 Objectives and Author's Contribution
This dissertation is concerned with modeling and control of nonlinear
nonautonomous dynamical systems. The objective of this dissertation is to investigate
whether better results can be obtained by extending the formulation of the control problem
from using just one global model to using several internal models. Thus a multiple
modeling approach is presented and techniques to design controllers based on these
model structures are developed.
The main contributions made by the author with respect to the modeling and
control of nonlinear systems include:
Firstly, an extended version of the SOM-based local modeling scheme for
nonautonomous and nonlinear plants is developed for more general representation of the
underlying dynamical systems and better approximation solely based on input-output
measurements of the plant. Local linear models are derived through competition using the
SOM and they are derived from the data samples corresponding to each of the SOM's
PEs.
Secondly, we investigate several options regarding how to capture the dynamics in
the input-output joint space. It is shown that as the number of dependent variables
increases, accurate modeling with the SOM may become increasingly difficult due to
its memory-based structure. Thus, the SOM is trained to position the local models in the
embedded output space. At any time instant, the model representing the plant dynamics is
chosen by the SOM depending on the history of the plant and then combined with the
previous control inputs.
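The selection step described above can be illustrated with a minimal sketch. It assumes the SOM has already been trained, so that `codebook` holds the PE weights in the embedded output space. The names `select_local_model`, `predict_next_output`, and the `(a, b, c)` parameter layout are hypothetical, chosen only for illustration, and the numerical values are toy data rather than results from this work.

```python
import numpy as np

def select_local_model(codebook, y_embed):
    """Index of the PE whose weight vector is nearest to the embedded output."""
    distances = np.linalg.norm(codebook - y_embed, axis=1)
    return int(np.argmin(distances))

def predict_next_output(codebook, local_params, y_embed, u_embed):
    """One-step prediction with the local linear model of the winning PE.

    local_params[i] = (a_i, b_i, c_i) is an assumed layout: coefficients on
    past outputs, coefficients on past control inputs, and a bias term.
    """
    i = select_local_model(codebook, y_embed)   # chosen from outputs only
    a, b, c = local_params[i]
    return i, float(a @ y_embed + b @ u_embed + c)  # inputs enter here

# Toy data: two PEs in a two-dimensional embedded output space.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
local_params = [
    (np.array([0.5, 0.1]), np.array([1.0]), 0.0),
    (np.array([0.9, -0.2]), np.array([0.5]), 0.1),
]
winner, y_next = predict_next_output(
    codebook, local_params, np.array([0.9, 1.1]), np.array([2.0])
)
print(winner, y_next)
```

Note that the winner is chosen from the output history alone, while the control inputs only enter the selected local model; this mirrors the separation described in the paragraph above.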
Thirdly, the model structure is controller oriented since the dynamics are simpler
locally than globally such that it is easier to develop local models as well as controllers.
For instance, if the system phenomena or behavior changes smoothly with the operating
point, then a linear model (or controller) will always be sufficiently accurate locally
provided that the operating region is sufficiently small, even though the system may
contain complex nonlinearities when viewed globally. These local controllers can then be
"switched" as the system changes operating conditions. Hence, multiple controllers with
switching, such as inverse controllers and PID controllers based on the identified multiple
models, are examined.
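As a rough illustration of switching local controllers, the sketch below pairs each local first-order model with its one-step inverse controller and switches both with the active region. Everything here is an assumption made for the example: the prototypes `CENTERS`, the coefficients in `MODELS`, and a toy piecewise-linear plant that matches the local models exactly (a real plant would not, so tracking would only be approximate).

```python
import numpy as np

# Hypothetical local models y[k+1] = a*y[k] + b*u[k], one per operating region.
CENTERS = np.array([0.0, 2.0])      # assumed region prototypes in output space
MODELS = [(0.5, 1.0), (0.9, 0.5)]   # assumed (a_i, b_i) for each region

def active_region(y):
    """Index of the prototype nearest to the current output."""
    return int(np.argmin(np.abs(CENTERS - y)))

def plant(y, u):
    """Toy piecewise-linear plant that matches the local models exactly."""
    a, b = MODELS[active_region(y)]
    return a * y + b * u

def local_inverse_control(y, reference):
    """One-step inverse controller of the currently active local model."""
    i = active_region(y)
    a, b = MODELS[i]
    return i, (reference - a * y) / b

references = [0.5] * 3 + [2.0] * 3
y, outputs, regions = 0.0, [], []
for r in references:
    i, u = local_inverse_control(y, r)  # controller switches with the model
    y = plant(y, u)
    regions.append(i)
    outputs.append(y)
print(regions, outputs)
```

Because the controller index is taken from the same winner that selects the model, the switching of controllers stays synchronized with the switching of models.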
Finally, in order to obtain a controller that retains low sensitivity to
external disturbances, a sliding mode controller (SMC) is employed using multiple
models. By doing so, one of the difficulties in designing an SMC (that it requires
complete knowledge of the plant to be controlled) can be removed. In addition, we
examine the effect of the modeling error due to the quantization of the state space, as well as
of measurement noise, on the performance of the proposed multiple model based sliding mode
control. It is shown that the switching scheme does not create any additional issue that must be
addressed in order to guarantee BIBO stability of the overall system.
1.4 Outline
This dissertation is divided into six chapters. Chapter 2 reviews the local
dynamic modeling needed for the subsequent study of nonautonomous systems. The SOM
employed as a modeling infrastructure to construct the local models is described briefly.
At the end, the proposed SOM-based multiple ARX modeling scheme is introduced for
the representation of nonlinear nonautonomous dynamic systems. Chapter 3 shows how to
design an inverse control and a PID control framework based on the designed multiple
linear models. A description of a global nonlinear TDNN trained by backpropagation
through a model is also presented for performance comparison. Chapter 4
gives a brief description of Variable Structure Systems (VSS). A quasi-sliding mode
control strategy is then proposed based on multiple models. In addition, an analysis of the multiple
quasi-sliding mode control structure with an imperfect sensor is presented. In Chapter 5,
simulations are conducted assuming both that the plant is unknown and that the only state
available for measurement is the plant output, for two controlled chaotic systems, two
nonlinear discrete-time systems, one missile, and one Unmanned Aerial Vehicle (UAV).
Finally, Chapter 6 presents conclusions based on the preceding analysis and simulation
results and suggests further study.
CHAPTER 2
SYSTEM IDENTIFICATION VIA MULTIPLE MODELS
The idea of multiple modeling is to approximate a nonlinear system with a set of
relatively simple local models valid in certain operating regimes [39]. Because of the
complexity, uncertainty and nonlinearity of a large class of systems, we often cannot
derive appropriate models from first principles, and are not capable of deriving accurate
and complete equations for input-state-output representations of the systems. Hence we
need to resort to input-output data in order to derive the unknown nonlinear system model
[10,35]. The technique of multiple model networks is appealing for modeling complex
nonlinear systems due to its intrinsic simplicity [62,66].
2.1 Local Dynamic Modeling
We begin with a brief overview of a dynamical systems approach to input-output
modeling. When no physical knowledge of the system is available, we have to determine
a model from a finite number of measurements of the system's inputs and outputs. A
dynamical systems approach to "black-box" modeling of autonomous systems, based on Takens'
embedding theorem, was first suggested by Casdagli [7]. The delay embedding offers the
possibility of accessing linear or nonlinear coupling between variables and is a
fundamental tool in nonlinear system identification. The use of delay variables in the
structure of these dynamical models is similar to that originally studied by Leontaritis and
Billings [49], and is common in linear time-series analysis and system identification [96].
When we are trying to understand an irregular sequence of measurements, an
immediate question is what kind of process generates such a series. Under the
deterministic assumption, irregularity can be autonomously generated by the nonlinearity
of the intrinsic dynamics. Let the possible state x of a system be represented by points in
a finite dimensional phase space, $\mathbb{R}^P$. This can be realized by a map of $\mathbb{R}^P$ onto itself:
$$\begin{bmatrix} x_{k+1} \\ \vdots \\ x_{k-P+2} \end{bmatrix} = f\left( \begin{bmatrix} x_k \\ \vdots \\ x_{k-P+1} \end{bmatrix} \right) \qquad (2.1)$$
The predictive mapping is the centerpiece of modeling since once determined, f can be
obtained from the predictive mapping $f_1 : \mathbb{R}^P \to \mathbb{R}$ as
$$x_{k+1} = f_1(\psi_{x,k}) \qquad (2.2)$$
where $\psi_{x,k} = [x_k, x_{k-1}, \ldots, x_{k-P+1}]^T$. In addition, Singer et al. [85] derived the locally linear
prediction based on this relationship as
$$f_1(\psi_{x,k}) \approx \mathbf{a}^T \psi_{x,k} + b \qquad (2.3)$$
The vector and scalar quantities of $\mathbf{a}$ and $b$ are estimated from the selected pairs
$(x_{j+1}, \psi_{x,j})$ in the least square sense, where j is the index of the data samples in the
operating regime, i.e., one model. To obtain a stable solution, more than P pairs must be
selected. In general, the above local model fitting is composed of two steps: a set of
nearby state searches over the signal history and model parameters which, when pieced
together, provide a global modeling of the dynamics in state space. The underlying
dynamics is then approximated as
$$f \approx \bigcup_{i=1,\ldots,N} f_i \qquad (2.4)$$
where N is the number of operating regimes. Based on this approximation of an
autonomous system, local linear models have performed very well in comparative studies
on time series prediction problems and in most cases have generated more accurate
predictions than global methods [84,97]. Moreover, the nonlinear dynamical system can
be identified by a local framework even in the presence of noise if enough data are
available to cover all of the state space since local regions are local averages of the data.
To make the local network less sensitive to noise and outliers, more than one neighbor
can be utilized in local modeling.
2.2 SOM-Based Local Modeling
The SOM is employed as a modeling infrastructure to construct the local models. It
provides a codebook representation of the plant dynamics and organizes the different
dynamic regimes in topological neighborhoods. Thus we can create a set of models that
are local to the data in the Voronoi tessellation created by the SOM. This local model
structure with the SOM is depicted in Figure 2-1.
[Figure 2-1 (block diagram; labels: $u_k$, $y_k$, Reconstruction of State-Space, Model 1, Model 2, ..., Model N, Switching Device (SOM), $y_{k+1}$)]
Figure 2-1. Local modeling scheme on the basis of a SOM.
2.2.1 Reconstruction of State-Space
In many cases of practical interest it is not possible to measure the state variables of
a system directly. Instead, the measuring procedure yields some value $y_k = \varphi(x_k)$ when
the system is in state $x_k$. Here, $\varphi(\cdot)$ is a measurement function which in general
depends on the state variables in a nonlinear way. The time evolution of the state of the
system results in a scalar time series $x_1, x_2, x_3, \ldots$. In order to reconstruct the underlying
dynamics in phase space, delay embedding techniques are commonly used.
2.2.1.1 Delay reconstruction
Delay-coordinate embedding [41,91], a technique developed by the dynamics
community, is one way to help the input-output modeling; it allows one to reconstruct the
internal dynamics of a complicated nonlinear system from a single time series. That is,
one can often use delay-coordinate embedding to infer useful information about internal
(and immeasurable) states using only output information. The reconstruction produced by
delay-coordinate embedding is not, of course, completely equivalent to the internal
dynamics in all situations, or embedding would amount to a general solution to control
theory's observer problem: how to identify all of the internal state variables of a
system and infer their values from the signals that can be observed [87]. However, a single-sensor
reconstruction, if done properly, can still be extremely useful because its results
are guaranteed to be topologically (i.e., qualitatively) identical to the internal dynamics.
This means that conclusions drawn about the reconstructed dynamics are also true of the
internal dynamics of the system inside the black box. In order to reconstruct the
underlying dynamics in phase space, we begin with a scalar observable $y_k$ of the state $x_k$
of a deterministic dynamical system. Then typically we can reconstruct a copy of the
original system by considering blocks
$$\psi_{y,k} = [\varphi(x_k), \varphi(x_{k-\tau}), \ldots, \varphi(x_{k-(d_y-1)\tau})]^T = [y_k, y_{k-\tau}, \ldots, y_{k-(d_y-1)\tau}]^T \tag{2.5}$$
of $d_y$ successive observations of the output. The
reconstruction is governed by two parameters, the embedding dimension $d_y$ and the embedding
delay $\tau$. Note that using $d_y = 1$ merely returns the original time series; one-dimensional
embedding is equivalent to not embedding at all. Proper choice of $d_y$ and $\tau$ is critical to
this type of phase-space reconstruction and must therefore be done wisely; only "correct"
values of these two parameters yield embeddings that are guaranteed by Takens'
theorem [88] and subsequent work by Packard et al. [69] and Casdagli et al. [7] to be
topologically equivalent to the original (unobserved) phase-space dynamics.
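The delay reconstruction in (2.5) can be illustrated with a short sketch. The function name `delay_embed` and its arguments are hypothetical and chosen only for this example; the sketch assumes a uniformly sampled scalar series and simply stacks delayed copies of it.

```python
import numpy as np

def delay_embed(y, d_y, tau=1):
    """Return rows [y_k, y_{k-tau}, ..., y_{k-(d_y-1)tau}] for every valid k."""
    y = np.asarray(y, dtype=float)
    start = (d_y - 1) * tau          # first index k that has a full history
    if start >= len(y):
        raise ValueError("time series too short for this embedding")
    # column j holds the series delayed by j*tau samples
    cols = [y[start - j * tau : len(y) - j * tau] for j in range(d_y)]
    return np.column_stack(cols)

series = np.arange(6.0)                   # toy series 0, 1, 2, 3, 4, 5
psi = delay_embed(series, d_y=3, tau=2)   # rows [4, 2, 0] and [5, 3, 1]
```

With `d_y=1` the function returns the original series as a single column, which matches the remark that one-dimensional embedding is equivalent to not embedding at all.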
2.2.1.2 Estimating an embedding dimension
There has been much work on determining the embedding dimensions of the time
series generated by autonomous dynamical systems in the absence of dynamical noise
[41,91,42,6]. The methods developed for estimating the minimum embedding dimensions
are grounded on Takens' embedding theorem [88] and most of them use the ideas of the
false nearest neighbors technique [42,6]. Later a number of works discussed theoretical
foundations of the delay embedding of the input-output time series [7,6]. This led to the
generalization of the existing method for the case of non-autonomous dynamical systems
[87,6,76].
He and Asada [35] proposed a strategy which is based directly on measurement
data and does not make any assumptions about the intended model architecture or
structure. It requires only that the process behavior can be described by a smooth
function, which is an assumption that must be made in black box nonlinear system
identification. An explanation of this strategy's central idea follows. In the general case, the
task is to determine the relevant inputs of the function
$$y_{k+1} = f(\varphi_1, \varphi_2, \ldots, \varphi_n) \tag{2.6}$$
from a given set of potential inputs $\varphi_1, \varphi_2, \ldots, \varphi_o$ ($o > n$). If the function in (2.6)
is assumed to depend on only $n - 1$ inputs although it actually depends on n inputs, the
data set may contain two (or more) points that are very close (in the extreme case they
can be identical) in the space spanned by the $n - 1$ inputs but differ significantly in the
nth input. This situation is shown in Figure 2-2 for the case n = 2.
[Figure 2-2 (sketch of two data points i and j, with outputs $y_i$ and $y_j$, in the $\varphi_1$-$\varphi_2$ input space)]
Figure 2-2. Two data points that are close in $\varphi_1$ but distant in $\varphi_2$.
The two points i and j are close in the input space spanned by $\varphi_1$ alone but they are
distant in the $\varphi_1$-$\varphi_2$ input space. Because these points are very close in the space
spanned by the $n - 1$ inputs ($\varphi_1$), it can be expected that the associated process outputs $y_i$
and $y_j$ are also close (assuming that the function $f(\cdot)$ is smooth). If one (or several)
relevant inputs are missing, then $y_i$ and $y_j$ can be expected to take totally
different values. In this case, it is possible to conclude that the $n - 1$ inputs are not
sufficient. Thus, the nth input should be included and the investigation may begin again.
In [35] an index is defined based on so-called Lipschitz quotients, which is large if
one or several inputs are missing (the larger the quotients, the more inputs are missing)
and is small otherwise. Thus, using this Lipschitz index the correct embedding
dimensions can be detected at the point where the Lipschitz index ceases to decrease. The
Lipschitz quotients in the one-dimensional case are defined as
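$$l_{ij} = \left| \frac{y_i - y_j}{\varphi_i - \varphi_j} \right| \quad \text{for } i, j = 1, \ldots, L,\; i \neq j \tag{2.7}$$
where L is the number of samples in the data set. For the multidimensional case, the
Lipschitz quotients can be calculated by the straightforward extension of (2.7):
$$l_{ij}^{(n)} = \frac{|y_i - y_j|}{\sqrt{(\varphi_{1,i} - \varphi_{1,j})^2 + \cdots + (\varphi_{n,i} - \varphi_{n,j})^2}} \quad \text{for } i, j = 1, \ldots, L,\; i \neq j \tag{2.8}$$
where n is the number of inputs. The Lipschitz index, then, can be defined as the
maximum occurring Lipschitz quotient
$$l^{(n)} = \max_{i, j,\; i \neq j} \left( l_{ij}^{(n)} \right) \tag{2.9}$$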
As long as n is too small and thus not all relevant inputs are included, the Lipschitz
index will be large because situations as shown in Figure 2-2 will occur. As soon as all
relevant inputs are included, (2.9) stays relatively constant.
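A minimal sketch of the Lipschitz index as the maximum quotient over all sample pairs is given below. The function name `lipschitz_index` and the toy data are assumptions made for illustration only. As a simplification, pairs whose inputs coincide exactly are skipped, although such pairs with different outputs would themselves indicate a missing input.

```python
import numpy as np

def lipschitz_index(phi, y):
    """Largest quotient |y_i - y_j| / ||phi_i - phi_j|| over all pairs i != j."""
    y = np.asarray(y, dtype=float)
    phi = np.asarray(phi, dtype=float).reshape(len(y), -1)     # (L, n)
    dy = np.abs(y[:, None] - y[None, :])                        # (L, L)
    dphi = np.linalg.norm(phi[:, None, :] - phi[None, :, :], axis=2)
    mask = dphi > 0      # drops i == j and exactly coincident inputs (assumption)
    return np.max(dy[mask] / dphi[mask])

# Toy data (not from the text): the output depends on both candidate inputs.
rng = np.random.default_rng(0)
phi = rng.uniform(-1.0, 1.0, size=(200, 2))
y = phi[:, 0] + 2.0 * phi[:, 1]

index_one_input = lipschitz_index(phi[:, :1], y)   # second input left out
index_two_inputs = lipschitz_index(phi, y)         # all relevant inputs used
```

For this toy function the index with both inputs is bounded by the norm of the gradient, while omitting the second input inflates it, which is the behavior the index is meant to detect.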
2.2.2 The Self-Organizing Map
The principal goal of the SOM is to transform an incoming signal pattern of
arbitrary dimension into a one or two-dimensional discrete map, and to perform this
transformation adaptively in a topologically ordered fashion [44]. Figure 2-3 shows
Kohonen's model of a two-dimensional SOM. Each PE in the lattice is fully connected to
all the source PEs in the input layer. This network represents a feedforward structure with
a single computational layer consisting of PEs arranged in rows and columns.
[Figure 2-3 (network diagram; labels: input $x_{1,k}$, Input Layer, Competition Layer, Winning PE)]
Figure 2-3. Kohonen's Self-organizing Map.
The algorithm responsible for the formation of the SOM proceeds first by
initializing the synaptic weights in the network. Once the network has been properly
initialized, there are three essential processes involved in the formation of the SOM:
competition, cooperation and adaptation. Descriptions of these processes follow.
2.2.2.1 Competitive process
Let m denote the dimension of the input space. Let an input vector selected
randomly from the input space be denoted by
$$\mathbf{x}_k = [x_{1,k}, x_{2,k}, \ldots, x_{m,k}]^T \tag{2.10}$$
The synaptic weight vector of each PE in the network has the same dimension as the
input space. Let the synaptic weight vector of PE i be denoted by
$$\mathbf{w}_i = [w_{1,i}, w_{2,i}, \ldots, w_{m,i}]^T, \quad i = 1, \ldots, N \tag{2.11}$$
where N is the total number of PEs in the network.
To find the best match of the input vector $\mathbf{x}_k$ with the synaptic weight vectors $\mathbf{w}_i$,
compare the Euclidean distances between $\mathbf{x}_k$ and each $\mathbf{w}_i$ and select the smallest one as
$$i^o = \arg\min_i \|\mathbf{x}_k - \mathbf{w}_i\|, \quad i = 1, \ldots, N \tag{2.12}$$
which sums up the essence of the competition process among the PEs. According to
(2.12), $i^o$ is the subject of attention because we want the identity of PE $i^o$. The particular
PE $i^o$ that satisfies this condition is called the best-matching unit or winning PE for the
input vector $\mathbf{x}_k$.
2.2.2.2 Cooperative process
The winning PE locates the center of a topological neighborhood of cooperating
PEs. A topological neighborhood can be defined by many methods. In particular, a PE
that is firing tends to excite the PEs in its immediate neighborhood more than those
farther away from it, which is intuitively satisfying. This means that after classification of
the input sample, the adaptation will be done not only for the winning PE but also for the
neighbors of the PE which gives the best response. Let $\Lambda_{i,i^o}$ denote the topological
neighborhood function centered on the winning PE $i^o$; then a typical choice of $\Lambda_{i,i^o}$ is
$$\Lambda_{i,i^o} = \exp\left( -\frac{\|r_i - r_{i^o}\|^2}{2\sigma_k^2} \right) \tag{2.13}$$
where $\|r_i - r_{i^o}\|$ represents the Euclidean distance in the output space between the ith PE
and the winning PE and $\sigma_k$ is the effective width of the topological neighborhood. To
satisfy the requirement that the size of the topological neighborhood shrinks with time, let
the width of the topological neighborhood function decrease with time
$$\sigma_k = \sigma_0 \exp\left( -\frac{k}{\tau_1} \right), \quad k = 1, 2, \ldots \tag{2.14}$$
where $\sigma_0$ is the value of $\sigma$ at the initiation of the SOM algorithm, and $\tau_1$ is a time
constant. Thus, as time (i.e., the number of iterations) increases, the width decreases at an
exponential rate, and the topological neighborhood shrinks in a corresponding manner.
2.2.2.3 Adaptive process
The weights of the winning PE and its neighbors can be trained with a simple Hebbian-like
rule. The neighboring PEs can be trained in proportion to
their activity (Gaussian), or all neighbors within a certain distance can be trained equally.
The learning rule can be described as follows:
$$\mathbf{w}_{i,k+1} = \mathbf{w}_{i,k} + \eta_k \Lambda_{i,i^o} (\mathbf{x}_k - \mathbf{w}_{i,k}) \tag{2.15}$$
Notice that both the learning rate, $\eta_k$, and neighborhood size, $\sigma_k$, are time dependent
and are typically annealed (from large to small) to provide the best performance with the
smallest training time.
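The three processes (competition, cooperation and adaptation) can be put together in a compact sketch. The function names, the lattice size, the initialization from random samples, the exponential learning-rate schedule and the choice of time constant are all assumptions made for this illustration; the text only states that the learning rate and neighborhood width are annealed.

```python
import numpy as np

def train_som(data, rows, cols, n_iter=2000, eta0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM with a Gaussian neighborhood (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_pe = rows * cols
    # lattice position r_i of every PE in the output space
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # assumed initialization: weights copied from randomly chosen samples
    weights = data[rng.integers(0, len(data), n_pe)].copy()
    sigma0 = max(rows, cols) / 2.0 if sigma0 is None else sigma0
    tau1 = n_iter / 3.0                      # assumed time constant
    for k in range(1, n_iter + 1):
        x = data[rng.integers(len(data))]    # input drawn at random
        # competition: winning PE has the closest weight vector
        winner = np.argmin(np.linalg.norm(weights - x, axis=1))
        # cooperation: Gaussian neighborhood with shrinking width
        sigma = sigma0 * np.exp(-k / tau1)
        d2 = np.sum((grid - grid[winner]) ** 2, axis=1)
        neigh = np.exp(-d2 / (2.0 * sigma ** 2))
        # adaptation: move the winner and its neighbors toward the input
        eta = eta0 * np.exp(-k / n_iter)     # assumed annealing schedule
        weights += eta * neigh[:, None] * (x - weights)
    return weights, grid

def winning_pe(weights, x):
    """Index of the best-matching unit for input x."""
    return int(np.argmin(np.linalg.norm(weights - np.asarray(x, dtype=float), axis=1)))

rng = np.random.default_rng(1)
data = rng.uniform(-1.0, 1.0, size=(500, 2))   # toy two-dimensional inputs
weights, grid = train_som(data, rows=4, cols=4)
```

Because every update is a convex combination of a weight vector and a data sample, the codebook stays inside the range of the training data.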
2.2.3 Modeling Methodology Based on the SOM
In this architecture of local linear modeling, the SOM is trained to position the local
models in the embedded output space, $\psi_{y,k} = [y_k, y_{k-1}, \ldots, y_{k-(d_y-1)}]^T$. The SOM
preserves topological relationships in the input space in such a way that neighboring
inputs are mapped to neighboring PEs in the map space. Then, when each PE is extended
with a local model, it can actually learn the mapping $y_{k+1} = f(\psi_{y,k})$ in a supervised way.
Each PE has an associated local model $\{\mathbf{a}_i, b_i\}$ as in (2.3) that represents the approximation
of the local dynamics.
The local model weights $\{\mathbf{a}_i, b_i\}$ are computed directly from the desired signal
samples $y_{k+1}$ and the input samples by a least-squares fit within a Voronoi region centered
at the current winning PE chosen from $\psi_y$. The number of data samples in the region
must be at least equal to the dimension $d_y$ of the basis vector. The design procedure for this
local model is as follows:
1. Apply training data to the SOM and find the winning PE corresponding to each input
$\psi_y$, such that we have winner-input pairs.
2. Use the least-squares fit to find the local linear model coefficients for the winning
PE $i^o$, where the desired output vector $\mathbf{y}_{i^o} \in \mathbb{R}^M$ satisfies
$$y_{j+1} = [\mathbf{a}_{i^o}^T \;\; b_{i^o}] \begin{bmatrix} \psi_{y,j} \\ 1 \end{bmatrix} \quad \text{for } j = 1, \ldots, M \tag{2.16}$$
where $\Theta = [\mathbf{a}_{i^o}^T \;\; b_{i^o}]$ contains the sought linear model coefficients and M is the number of inputs
associated with the winning PE $i^o$.
Specifically, the least-squares problem
$$Y = \Theta X \tag{2.17}$$
is solved for $\Theta$, where $X \in \mathbb{R}^{(d_y+1) \times M}$ is defined as a matrix whose columns are the input vectors
associated with the winning PE, each augmented with a constant 1 for the bias term as in (2.16), and $Y \in \mathbb{R}^M$ is defined as a vector that contains the
target outputs. It is well known that although the least-squares solution obtained from
(2.17) is reasonably good when the noise level is low, the estimates tend to be biased for
higher levels of noise. Addition of a single sample to a cluster can radically change the
distances. Besides, the models will perform very well for that particular training set, with
very low error, because they have memorized the training examples, but they may not perform
well with new data sets. Thus we make use of data samples from the winner as well as
the neighbors to create the local models in order to make them more robust as well as to
improve the generalization of the network. We also take the data samples from the
neighbors in case fewer samples than the dimension of the input are assigned to some Voronoi
region.
In testing, once the winning PE $i^o$ is determined, we select the appropriate local model
from the list of associated models and apply it to obtain the estimated output
$$\hat{y}_{k+1} = \mathbf{a}_{i^o}^T \psi_{y,k} + b_{i^o} \tag{2.18}$$
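The design and testing steps above can be sketched as follows. All names are hypothetical. Two simplifications are assumed: an evenly spaced codebook stands in for a trained SOM, and neighbors are pooled by distance between codebook vectors rather than by lattice position. The logistic map is used only as a convenient toy time series.

```python
import numpy as np

def winners(weights, psi):
    """Index of the closest codebook vector for every row of psi."""
    d = np.linalg.norm(psi[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def fit_local_models(weights, psi, target):
    """Least-squares fit of target ~ a_i^T psi + b_i for every PE i."""
    win = winners(weights, psi)
    n_pe, d = weights.shape
    models = np.zeros((n_pe, d + 1))            # row i holds [a_i, b_i]
    for i in range(n_pe):
        # pool samples from PE i, then from its nearest PEs, until enough
        order = np.argsort(np.linalg.norm(weights - weights[i], axis=1))
        idx = np.array([], dtype=int)
        for j in order:
            idx = np.concatenate([idx, np.flatnonzero(win == j)])
            if len(idx) > d + 1:
                break
        X = np.column_stack([psi[idx], np.ones(len(idx))])   # bias column
        models[i] = np.linalg.lstsq(X, target[idx], rcond=None)[0]
    return models

def predict(weights, models, psi_k):
    """Select the winner's local model and apply it to psi_k."""
    i = int(np.argmin(np.linalg.norm(weights - psi_k, axis=1)))
    return models[i, :-1] @ psi_k + models[i, -1]

# Toy autonomous series (not from the text): logistic map with d_y = 1.
x = np.empty(2001)
x[0] = 0.3
for k in range(2000):
    x[k + 1] = 4.0 * x[k] * (1.0 - x[k])
psi, target = x[:-1, None], x[1:]
weights = np.linspace(0.05, 0.95, 10)[:, None]   # stand-in codebook
models = fit_local_models(weights, psi, target)
```

Each local model is only a line segment, but the ten segments together follow the parabola of the map closely, which is the piecewise approximation idea of (2.4).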
2.3 Input-Output Representation of Systems
The temporal state evolution of an autonomous system is functionally dependent
only on the system state, but a nonautonomous system, such as considered in this work,
allows for an explicit dependence on an independent variable, the control input, in
addition to the system state. For an autonomous system, it is reasonable to assume that
the future behavior of the system can be predicted over some finite interval from a finite
number of observations of past outputs. In contrast, predictions of the behavior of a
nonautonomous system require consideration of not only the "internal" deterministic
dynamics (past outputs), but also of the "external" driving term (future input) [30,75,96].
System identification is a technique that permits building mathematical models of
dynamic systems based on input-output data (measurements). Its main purpose is to
identify a model of an unknown process in order to predict and gain insight into the
behavior of the process [39]. Real-life systems almost always show nonlinear dynamical
behavior. This behavior complicates the task of finding models that accurately describe
these systems. While in a large number of applications a linear model already gives
satisfactory results, there are numerous situations where linear models are not accurate
enough, especially when we deal with very complex systems or require very high
performance. Physical knowledge of the system can be a great aid in finding a nonlinear
model. However, this knowledge is not always available. In these cases we have to
determine a model from a finite number of measurements of the system's inputs and
outputs. This approach to nonlinear system modeling is often referred to as nonlinear
black-box identification. Usually, a nonlinear mapping is fitted from a number of delayed
inputs and outputs to the current output [94]. This results in a nonlinear input-output
model of the system.
2.3.1 Classical Approach
Some common classical approaches for nonlinear nonautonomous system modeling
are based on polynomials, e.g., Kolmogorov-Gabor polynomial models [71], Volterra
Series models [110], Hammerstein models [16,22,33], and so on, for the realization of the
nonlinear mapping.
[Figure 2-4 (block diagram; labels: $u_k$, $y_k$, unit delays $z^{-1}$, Nonlinear static approximator $f(\cdot)$, $y_{k+1}$)]
Figure 2-4. Nonlinear dynamic model configuration.
Normally, a discrete-time nonlinear dynamic system can be described by a NARX
(Nonlinear Auto-Regressive with eXogenous input) model that is an extension of the
linear ARX model, and represents the system by a nonlinear mapping of past input and
output terms to future outputs, that is,
$$y_{k+1} = f(y_k, y_{k-1}, \ldots, y_{k-d_y+1}, u_k, \ldots, u_{k-d_u+1}) \tag{2.19}$$
Here $y_k \in Y \subset \mathbb{R}^p$ is the output vector and $u_k \in U \subset \mathbb{R}^q$ is the input vector. For
simplicity, we will set p = q = 1. Let the $(d_y + d_u)$-dimensional basis vector be
$$\psi_k = [\psi_{y,k}, \psi_{u,k}] = [y_k, \ldots, y_{k-d_y+1}, u_k, \ldots, u_{k-d_u+1}] \tag{2.20}$$
where $\psi_k$ is in the set $\Psi = Y^{d_y} \times U^{d_u}$. Figure 2-4 shows the schematic diagram of the NARX
model. Another nonlinear model is the NOE (Nonlinear Output Error) structure described
by
$$\hat{y}_{k+1} = f(\hat{y}_k, \ldots, \hat{y}_{k-d_y+1}, u_k, \ldots, u_{k-d_u+1}) \tag{2.21}$$
where $\hat{y}$ is the output of the identification model f. A NARMAX (Nonlinear Auto-Regressive
Moving Average with eXogenous input) model is the lagged version of the NARX
model and is represented by
$$y_{k+1} = f(y_k, \ldots, y_{k-d_y+1}, u_k, \ldots, u_{k-d_u+1}) + e_k \tag{2.22}$$
In the above, f can be replaced by neural networks, radial basis function networks or
fuzzy logic systems, which are other methods that have been developed for nonlinear
system identification [5,96]. Narendra and Parthasarathy [65] compared the NOE and
NARX structures and concluded that NARX is preferable. In the neural
network community most identification schemes use the series-parallel model (NARX).
2.3.2 Series-Parallel and Parallel Models
A nonlinear dynamic model can be used in two configurations: a "series-parallel"
model and a "parallel" model. A series-parallel model predicts one or several steps into
the future on the basis of previous plant inputs and plant outputs and ensures that all the
signals are bounded if the plant is BIBO stable. Most published reports use the series-
parallel model because of its resulting stability. A requirement for using this model is that
the plant output is measured during the operation. In particular, in control engineering
applications the series-parallel model plays an important role, e.g., for the design of a
minimum variance or a predictive controller.
[Figure 2-5 (two block diagrams, each with Plant, P(z), and Model, $f(\cdot)$, producing $y_{k+1}$)]
Figure 2-5. A series-parallel model (left) and a parallel model (right).
In contrast, a parallel model is required whenever the plant output cannot be
measured during operation. This is the case when a plant is to be simulated without
coupling to the real system, or when a sensor is to be replaced by a model. Also, for fault
detection and diagnosis the plant output may be compared with the simulated model
output in order to extract information from the residuals. Finally, the parallel model is very
useful when dealing with noisy systems since it avoids problems of bias caused by noise
on the real system output. If the identification model is to be used offline, the parallel
model is obviously more suitable. The parallel model, however, lacks theoretical
verification; hence, it is difficult to utilize its advantages.
The two configurations shown in Figure 2-5 can be distinguished not only in the
model operation phase but also during training. In this research, we follow the series-parallel
model.
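The difference between the two configurations can be seen in a small numerical sketch. The first-order linear plant, the deliberately mismatched model coefficient and the function name `predict_outputs` are assumptions chosen only to keep the example short; the same feedback logic applies when f is nonlinear.

```python
import numpy as np

def predict_outputs(a, b, u, y_meas=None):
    """Predictions of y[k+1] = a*y[k] + b*u[k].

    With y_meas the measured plant output is fed back (series-parallel);
    without it the model feeds back its own prediction (parallel).
    """
    y_hat = np.zeros(len(u) + 1)
    if y_meas is not None:
        y_hat[0] = y_meas[0]
    for k in range(len(u)):
        y_prev = y_meas[k] if y_meas is not None else y_hat[k]
        y_hat[k + 1] = a * y_prev + b * u[k]
    return y_hat

# Toy plant (illustrative values): y[k+1] = 0.9*y[k] + 0.5*u[k], step input.
a_true, b_true = 0.9, 0.5
u = np.ones(100)
y = np.zeros(101)
for k in range(100):
    y[k + 1] = a_true * y[k] + b_true * u[k]

a_model, b_model = 0.85, 0.5                  # slightly wrong model
y_sp = predict_outputs(a_model, b_model, u, y_meas=y)
y_par = predict_outputs(a_model, b_model, u)
err_sp = np.max(np.abs(y_sp - y))             # one-step-ahead error
err_par = np.max(np.abs(y_par - y))           # free-run simulation error
```

In the series-parallel case the model error enters only once per step, whereas in the parallel case it is fed back and accumulates, which is one reason the series-parallel form is easier to train.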
2.3.3 SOM-based Multiple ARX Models
In the interest of modeling the local dynamics of a nonautonomous system in each
region, the local approximation method presented for autonomous systems can be
extended by letting $\psi_{x,k} = \psi_k$ in (2.3), so that $y_{k+1} = f(\psi_k)$. Provided that the necessary
smoothness conditions on $f : \Psi \to Y$ are satisfied, a Taylor series expansion can be
used around the operating point. The first-order approximation about the system's
equilibrium point produces N local predictive ARX models $f_1, \ldots, f_N$ of the plant
described by
$$f_i(\psi_k) = \sum_{j=0}^{d_y-1} a_{i,j}\, y_{k-j} + \sum_{j=0}^{d_u-1} b_{i,j}\, u_{k-j}, \quad i = 1, \ldots, N \tag{2.23}$$
where $a_{i,j}$ and $b_{i,j}$ are the parameters of the ith model. Although higher order Taylor
approximations would improve accuracy, they are not very useful in practice because the
number of parameters in the model increases drastically with the expansion order.
Our proposed methodology is summarized as follows: first, the delayed version of
input-output joint space is decomposed into a set of operating regimes that are assumed to
cover the full operating space. Next, for each operating regime we choose a simple linear
ARX model to capture the dynamics of the region. Consequently, a nonlinear
nonautonomous system is approximated by a concatenation of local linear models
$$f(\psi) \approx \bigcup_{i=1,\ldots,N} f_i(\psi_y, \psi_u) \tag{2.24}$$
2.3.3.1 Selection of operating regions with a SOM
Building local mappings in the full operating space is a time- and memory-consuming
process, which led to the natural idea of quantizing the operating regimes and
building local mappings in positions given by prototype vectors obtained from running
the plant. For quantization of the operating regimes, the k-nearest-neighbor method is
effective but it disregards neighborhood relations, which may affect performance [53]. In
contrast, the SOM is a local framework that tends to limit the
interference phenomenon and preserves the topology of the data using neighborhood
links between PEs. Neighboring PEs in the network compete with each other by means of
mutual lateral interactions, and develop adaptively into specific detectors of different
signal patterns [44]. The training algorithm is simple, robust to missing values, and it is
easy to visualize the map. These properties make SOM a prominent tool in data mining
[94].
In most of the papers discussing local linear models for system identification, the
SOM has been used with a first order expansion around each PE in the output space. The
SOM transforms an incoming signal pattern of arbitrary dimension into a one or two-
dimensional discrete map, and performs this transformation adaptively in a topologically
ordered fashion [44]. The results obtained so far with this methodology have been quite
promising. However, several problems remain to be solved: first, efficiently
partitioning the operating regimes in high dimensional spaces is still a problem due to the
curse of dimensionality [30]; second, it may be hard to find a small number of variables
to characterize the operating regimes due to the possibly large number of local models;
third, all the methods have to be extended for nonautonomous regimes.
The previous work by Principe et al. [75] provided the starting point for the
proposed modeling architecture. The most important difference is how to capture the
dynamics in the input-output joint space, which is fundamental for identifying the
unknown system. Several options are possible, and we have been investigating them:
Firstly, we tried to find the local models by quantizing the input-output joint space
by embedding not only the outputs but also the control inputs using one SOM. This
modification is essential because the purpose is to characterize the system dynamics that
exist in the input-output joint space. However, we encountered some difficulties such as
normalization of the joint space and large dimensionality of the space involved (many
degrees of freedom and large dynamic range of parameters) [13].
Secondly, in order to reduce the approximation error with local models based on a
SOM, we utilized a counter-propagation network by quantizing the input-output joint
space and the desired signal space together [15]. Since the output at each PE is just the
average output for all of the feature vectors that map to that point, local models might be
created for better approximation using the quantization error in the input space and the
average output. This is achieved by coupling each PE with a linear mapping in such a
way that a functional relationship can be established between each Voronoi region in the
input space (of the SOM) and the desired signal. However, this method required a much
larger map to make the estimation error in the desired output space smaller. Additionally,
when noise is added in the input of the SOM, the quantization error in the input Voronoi
region may be magnified by the local models.
Thirdly, as the number of dependent variables is increased, the process becomes
increasingly difficult to model accurately. This led us to think that a model that uses only
a few of the observed variables will be more accurate than a model that uses all the
observed variables. In this scheme, therefore, we let the SOM look at only the current
output and its past values to decide the winner, and create the models with the control
inputs.
Here we will pursue the last option for the following reasons. The competitive
learning rule works best for normalized inputs. The SOM algorithm uses the Euclidean
metric to measure distances between vectors. For example, if one variable has values in
the range [-100, 100] and another in the range [-1, 1], the former almost
completely dominates the map organization because of its greater impact on the measured
distances. Either the measure of distance must be weighted by the inverse of the scales or the
data must be normalized such that each component of the input vectors has unit variance
and zero mean [8]. However, normalization loses information (the mean or the scale can
be important) and it can become meaningless if the data dynamic range (or mean)
changes over time. Therefore we cannot normalize the data (nor create the weighted
Euclidean metric) in this way, since it is not always guaranteed that the mean and the
dynamic range of the data are available.
In addition, as the number of dependent variables is increased, SOM modeling
becomes increasingly difficult because it is basically a memory-based approach that does
not scale up well with the input dimension. This led us to think that a model that uses
only a few of the observed variables will be more accurate than a model that uses all of
the observed variables. When the SOM modeling is done in the output space, we let the
SOM look at only the current output and its past values to decide the winner, which
represents the operating regime, and create the models with the control inputs as shown
in Figure 2-6. In so doing, normalization of the input space is not necessary since the
clustering is performed solely by the history of the output.
Figure 2-6. Configuration of local linear modeling based on a SOM.
2.3.3.2 Model development procedure
After the operating regions are divided by the SOM, the underlying dynamics f is
then approximated as $f \approx \bigcup_{i=1,\ldots,N} f_i$, where N is the number of operating regions. N local
predictive ARX models $f_1, \ldots, f_N$ of the plant are described by
$$f_i(\psi_k) = \sum_{j=0}^{d_y-1} a_{i,j}\, y_{k-j} + \sum_{j=0}^{d_u-1} b_{i,j}\, u_{k-j}, \quad i = 1, \ldots, N \tag{2.25}$$
where $a_{i,j}$ and $b_{i,j}$ are the parameters of the ith model. Then, when each PE of the SOM is
extended with a local model, it can actually learn the mapping $y_{k+1} = f(\psi_{y,k}, \psi_{u,k})$ in a
supervised way. The development of local models is done by directly fitting the
quantized embedded output samples obtained from the SOM and corresponding
embedded control input samples that cover the whole range of operation of the plant.
Each PE has an associated local model $\{\mathbf{a}_i, \mathbf{b}_i\}$, which is computed directly from
the desired signal samples $r_j$ and the input-output samples by a least-squares fit within a
Voronoi region centered at the current winning PE chosen from $\psi_y$. The design
procedure for this local model is as follows:
1. Apply training data to the SOM and find the winning PE corresponding to each input
$\psi_y$, such that we have winner-input pairs.
2. Use the least-squares fit to find the local linear model coefficients for the winning
PE $i^o$, where the desired output vector $\mathbf{r}_{i^o} \in \mathbb{R}^M$ satisfies
$$r_j = [\mathbf{a}_{i^o}^T \;\; \mathbf{b}_{i^o}^T] \begin{bmatrix} \psi_{y,j} \\ \psi_{u,j} \end{bmatrix} \quad \text{for } j = 1, \ldots, M \tag{2.26}$$
where $[\mathbf{a}_{i^o}^T \;\; \mathbf{b}_{i^o}^T]$ contains the sought linear model coefficients and M is the number of data samples
associated with the winning PE $i^o$.
3. In testing, once the winning PE is determined we select the appropriate local model
from the list of associated models. Apply the local model to obtain the estimated
output
$$\hat{y}_{k+1} = \mathbf{a}_{i^o}^T \psi_{y,k} + \mathbf{b}_{i^o}^T \psi_{u,k} \tag{2.27}$$
Our proposed modeling methodology is summarized as follows: first, the delayed
version of input-output joint space is decomposed into a set of operating regions that are
assumed to cover the full operating space. Next, for each operating region we choose a
simple linear ARX model to capture the dynamics of the region. Consequently, a
nonlinear nonautonomous system is approximated by a concatenation of local linear
models.
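As a rough illustration of this methodology, the sketch below selects the operating regime from the output history alone and fits one ARX model per regime from both output and input samples. Everything specific is an assumption made for the example: the toy plant, the embedding sizes, the use of a random subset of samples as a stand-in for a trained SOM codebook, and the omission of neighbor pooling.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3000
u = rng.uniform(-1.0, 1.0, n)
y = np.zeros(n + 1)
for k in range(n):                       # illustrative plant, not from the text
    y[k + 1] = y[k] / (1.0 + y[k] ** 2) + u[k]

d_y, d_u = 2, 1
ks = np.arange(max(d_y, d_u) - 1, n)                        # valid time indices
psi_y = np.column_stack([y[ks - j] for j in range(d_y)])    # [y_k, y_{k-1}]
psi_u = np.column_stack([u[ks - j] for j in range(d_u)])    # [u_k]
target = y[ks + 1]

# Stand-in for a trained SOM over psi_y: 16 prototypes drawn from the samples.
codebook = psi_y[rng.choice(len(psi_y), size=16, replace=False)]

def winner(codebook, psi_y):
    """Winning PE for each embedded output vector (inputs are not used)."""
    d = np.linalg.norm(np.atleast_2d(psi_y)[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(d, axis=1)

win = winner(codebook, psi_y)
X = np.column_stack([psi_y, psi_u])               # regressors [psi_y, psi_u]
theta = np.zeros((len(codebook), d_y + d_u))      # row i holds [a_i, b_i]
for i in range(len(codebook)):
    idx = np.flatnonzero(win == i)
    if idx.size:                                  # skip regions without data
        theta[i] = np.linalg.lstsq(X[idx], target[idx], rcond=None)[0]

y_local = np.sum(theta[win] * X, axis=1)          # switched local ARX prediction
theta_global = np.linalg.lstsq(X, target, rcond=None)[0]
mse_local = np.mean((target - y_local) ** 2)
mse_global = np.mean((target - X @ theta_global) ** 2)
```

On the training data the switched local models cannot do worse than the single global ARX model, since the global coefficients are one admissible choice in every region; how well they generalize depends on the amount of data per region.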
CHAPTER 3
MULTIPLE MODEL BASED CONTROL
Researchers have been interested in control of nonlinear systems for a very long
time. Progress in nonlinear control design, however, has been difficult because of the
intrinsic complexity of the problem [82]. In general, nonlinear control methods are
complex and can be applied only to a narrow class of systems. For example, methods
such as backstepping and feedback linearization can be applied to nonlinear systems with
some specific structure, but not to arbitrary nonlinear systems. Thus, nonlinear control
methods cannot serve all needs of real industrial control problems.
One way to approach the control of a nonlinear system in a wide range of
conditions is to linearize the model at a number of operating points, and then design one
linear feedback controller at each operating region [101]. These local controllers can then
be "switched" or "scheduled" as the system changes operating conditions. The use of
multiple models is not novel in control theory. Multiple Kalman filters were proposed in
the 1960s and 1970s by Magill [51] and Lainiotis [45] to improve the accuracy of the
state estimates in control problems. Fault detection and control in aircraft was proposed
by Maybeck and Pogoda [54], and in the subsequent years Maybeck and Stevens [55]
used the idea extensively in controlling aircraft systems. In all the above cases, no
switching is involved, and only a linear combination of the controls determined by the
different models is used to control the system.
The idea of switching between controllers was most likely introduced for the
first time in the adaptive control literature by Martensson [52]. In the direct switching
schemes, the sequence in which the different controllers are to be tried is pre-determined.
The only determination that has to be made is when to switch from one controller to
another. It was soon realized that such architectures have very little practical utility. In
indirect switching schemes, on the other hand, the outputs of the multiple observers
determine both when to switch and to which controller. Middleton et al. [56]
explicitly proposed the use of multiple models and switching to alleviate the problem of
stabilizing the estimated model in indirect control; this approach was further extended in [59] and
labeled the "hysteresis switching algorithm". The objective in all the above efforts is to
attain stability in adaptive control with minimum past information.
A controller is often highly dependent on a plant model especially when the
controller has been designed from the model. Hence, for those cases, the modeling error
would be a relevant criterion for controller selection. If the number of controllers is
bounded, the delay between the selection of the controller and its activation can be
neglected. Thus the selection of the controller according to the modeling error is feasible.
Narendra and Balakrishnan [64] were the first to propose this idea of using multiple
adaptive models and switching in order to improve the performance of an adaptive
system, while assuring stability. Although it has already been shown that the performance
of a system can be significantly improved using the multiple model adaptive control with
switching, applicability to highly complex systems with this approach has not been
investigated in details. Thus, in this chapter, multiple controller design methodologies are
introduced by extending the multiple model approach for more complex nonlinear
systems.
3.1 Discrete-Time Control System
The field of classical control theory concerns itself with the task of servo or
regulator control of linear analog plants. Design methods for both continuous-time linear
controllers and discrete-time linear controllers obtained by discretizing the plant are well
understood. Figure 3-1 shows a schematic diagram of a classical discrete-time control
system.
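[Block diagram: reference $r_{k+1}$, controller $G(z)$, control signal $u_k$, plant $P(z)$, disturbance $d_k$, and output $y_{k+1}$ with feedback.]
Figure 3-1. Classical discrete-time control system.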
The signal $r_{k+1}$ is the reference signal. We would like the plant output $y_{k+1}$ to track
it as closely as possible. To track the reference signal, the controller uses both $r_{k+1}$ and
$y_k$ to compute the plant control signal $u_k$. Feedback of $y_k$ is used to stabilize the plant,
and to ensure that the controller is both resilient in the face of external disturbances and
able to quickly reduce the output error to zero. The only drawback of these classical design
methods is that they assume
precise knowledge of the plant dynamics. For this reason, a great deal of effort has been
expended to create accurate models of typical plants. As one improved way of modeling,
we proposed multiple models for better approximation of the plant in Chapter 2. This
scheme makes it easier to design the controller for the model which approximates internal
descriptions of the plant when given a finite number of external measurements without
any knowledge of the plant dynamics.
One reason to consider discrete-time systems is that most
complex systems are controlled by computers, which are discrete in nature; this is an
obvious reason for formulating multiple model based control in discrete time. Another is
the fact that the presence of random noise can be dealt with more easily in the case of
discrete-time systems. Since most practical systems have to operate in the presence of
noise, the stability and performance of multiple model based control in such contexts has
to be well understood, if the theory is to find wide applications in practice.
3.2 Inverse Control via Backpropagation Through Model
In order to design a controller, we need to determine a plant model which should
capture the dynamics of the plant well enough that a controller designed to control the
plant model will also control the plant very well. Such a model might be derived from
physics by carefully analyzing the system and determining a set of partial differential
equations which explain its dynamics. Alternatively, the model might be a black-box
implementing some sort of universal transfer function. This function may be tuned by the
adjustment of its internal parameters to capture the dynamics of the system. For nonlinear
unknown systems, a NARX model of sufficient order is a universal dynamic system
approximator. Hence, we implement NARX neural network plant models for
performance comparisons with the proposed local linear models.
The conventional design methods for control systems involve constructing a
mathematical model of the system's dynamics and utilization of analytical techniques for
the derivation of a control law. Such mathematical models comprise sets of linear or
nonlinear differential/difference equations, which are usually derived with a degree of
simplification or approximation. The modeling of physical systems for feedback control
generally involves a balance between model accuracy and model simplicity [102]. Should
a representative mathematical model be difficult to obtain, due to uncertainty or sheer
complexity, conventional techniques prove to be less useful. Also, even though an
accurate model may be produced, the underlying nature of the model may make its
utilization using conventional control design difficult.
Neural networks have therefore been used for different purposes in the context of
control, owing to their ability to learn the essential features of unknown plants by mimicking them
[43,105,78]. Most existing methods are based on inverse control. We will therefore
start by considering a neural network controller, specifically a TDNN, since it is well
suited to creating both a model and a controller when only input-output measurements are
available.
In principle, the TDNN is an extended multilayer perceptron that allows us to
handle temporal patterns and time-variant signals, i.e., signals that are
scaled and translated over time [40]. The idea followed in the TDNN is
the introduction of time delays, which gives the individual PEs the ability to
store the history of their input signals. This way, the network as a whole can adapt not
only to a set of patterns, but also to a set of sequences of patterns. An advantage of the
TDNN is its relatively simple mathematical analysis and the ability to train it with the
backpropagation algorithm. Thus we compare the performance of the proposed control
systems with that of an inverse controller trained through the TDNN.
The algorithm is derived as follows. First we train a TDNN as a model, $f$, by
letting $y_{k+1} = f(y_k, y_{k-1}, \ldots, u_k, u_{k-1}, \ldots)$ as shown in Figure 3-2(a). Then a TDNN
controller is designed based on the created TDNN model, from which we obtain the
Jacobian of the plant. The controller is described by
$u_k = g(u_{k-1}, u_{k-2}, \ldots, r_{k+1}, y_k, y_{k-1}, \ldots, W)$, letting $g$ be the function implemented by the
controller $G(z)$. Here $W$ denotes the weights of the TDNN model, $F(z)$. The controller
parameters in the fixed control structure are adapted by an algorithm that ensures that the
desired performance level is maintained, and the parameters are updated by
backpropagating the error through the model as shown in Figure 3-2(b).
3.3 Multiple Inverse Control
Now we discuss the control problem for the local linear model using an inverse
control framework [17]. The central advantage of such a framework is that an inverse
model can be used directly to build a feed-forward controller. Given a model
$y_{k+1} = f(y_k, y_{k-1}, u_k, u_{k-1})$, the control network is brought on line and the control signal
is calculated at each instant of time by setting the output value $y_{k+1}$ at instant $k+1$
equal to the desired value $r_{k+1}$, i.e., $u_k = f^{-1}(y_k, y_{k-1}, r_{k+1}, u_{k-1})$, while the network is
trained off line as in the
classical inverse control approach. Thus, for the desired behavior, the controller simply
asks the model to predict the action needed.
As stated before, our principal objective is to determine a control input, $u_k$, which
will result in the output $y_{k+1}$ of the plant tracking a specified
sequence $r_{k+1}$ with sufficient accuracy. The system identification block has $N$ predictive models, denoted by
$\{f_i\}_{i=1}^{N}$, in parallel. Corresponding to each model $f_i$, a controller $g_i$ is designed such that
$g_i$ achieves the control objective for $f_i$. Therefore, at every instant one of the models is
selected and the corresponding controller is used to control the actual plant. In order to
control a plant, consider the control problem where the dimension of the input is equal to
that of the output, that is, $p = q$. From (2.23), because $p = q$, and under the assumption
that $b_{i,0}$ is invertible, the control law of an inverse controller for the model $f_i$ can be
directly calculated as
$$u_k = b_{i,0}^{-1}\left[ r_{k+1} - \sum_{j=0}^{d_y-1} a_{i,j}\, y_{k-j} - \sum_{j=1}^{d_u-1} b_{i,j}\, u_{k-j} \right] \qquad (3.1)$$
Therefore, at time instant $k$, the control $u_k$ can be obtained if the future target $r_{k+1}$
for the output is known. The set of local linear models thus simplifies the control design for a
nonlinear plant. So instead of a global neuro-controller as in other adaptive control
schemes [12,18,31,58], here we can function with a group of linear controllers associated
with each identified model, thus taking care of the system over the whole operating
region.
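As a minimal sketch of how an inverse control law of this kind can be evaluated for one local linear model, the Python fragment below assumes a scalar model of the form $y_{k+1} = \sum_j a_j y_{k-j} + \sum_j b_j u_{k-j}$ with $b_0 \neq 0$. The function names and the numerical coefficients are illustrative assumptions, not values taken from this work.

```python
def predict(a, b, y_hist, u_hist):
    """One-step prediction y_{k+1} of an assumed scalar local linear model.

    y_hist = [y_k, y_{k-1}, ...] pairs with a = [a_0, a_1, ...];
    u_hist = [u_k, u_{k-1}, ...] pairs with b = [b_0, b_1, ...].
    """
    return (sum(aj * yj for aj, yj in zip(a, y_hist))
            + sum(bj * uj for bj, uj in zip(b, u_hist)))


def inverse_control(a, b, y_hist, u_past, r_next):
    """Solve the model for u_k so that the predicted y_{k+1} equals r_next.

    u_past = [u_{k-1}, u_{k-2}, ...] holds only inputs that were already applied.
    """
    if b[0] == 0.0:
        raise ValueError("b_0 must be nonzero for the model to be invertible")
    # Part of the prediction that does not depend on the unknown u_k.
    free = (sum(aj * yj for aj, yj in zip(a, y_hist))
            + sum(bj * uj for bj, uj in zip(b[1:], u_past)))
    return (r_next - free) / b[0]


# Illustrative (made-up) coefficients and signal history.
a = [1.2, -0.5]
b = [0.8, 0.3]
y_hist = [0.4, 0.1]      # y_k, y_{k-1}
u_past = [0.2]           # u_{k-1}
r_next = 1.0

u_k = inverse_control(a, b, y_hist, u_past, r_next)
y_pred = predict(a, b, y_hist, [u_k] + u_past)
```

Feeding the computed `u_k` back into `predict` reproduces `r_next`, which is the sense in which the controller inverts the model; whether the plant itself follows depends on how well the selected local model matches it.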
One advantage of this scheme is its simplicity and fast convergence to the
desired response. Another advantage is that the dynamic space is decomposed by
appropriate switching among very simple linear models, which leads to accurate
modeling and control. On the other hand, creating a set of models from embedded input
and output data may cause serious problems in the presence of large noise or outliers, since
a predictive model selected wrongly because of noise may result in poor control. Hence, the selection of the
right model is as important as creating the models and designing the controllers.
Once the right local linear model is determined, the corresponding controller is
designed using (3.1). A schematic diagram of the proposed SOM based inverse control
system is shown in Figure 3-3 where the inverse control seeks to model the inverse of the
plant. A set of controllers appears in series with the plant. The command input, $r_{k+1}$, is
fed to the controller and also provides the desired response. Hence, when the error is
small, the controller transfer function is the inverse of that of the plant.
[Block diagram: plant $P(z)$, embedding blocks, and a bank of $N$ controllers.]
Figure 3-3. Proposed SOM-based inverse control scheme.
Generally, an adaptive controller that meets the specifications is slow to adapt.
Our approach, however, models all the operating regions and automatically divides the
operating space into as many regions as there are PEs. Once the current operating region is
determined by the SOM, the corresponding controller is triggered so that the plant tracks
the desired signal. Moreover, even if the wrong PE is selected as the winning PE due to
noise, a similar dynamic model can be activated, since neighboring SOM PEs represent
neighboring regions in the dynamic space. Thus, the proposed control system can reach
the set point quickly, and even if the dynamic model is not the most appropriate, there is
extra flexibility to match the set point with the least amount of error.
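The switching step described above can be sketched as follows. This is only an illustration under stated assumptions: the codebook, the embedded vector $x = [y_k, y_{k-1}]$, and the two first-order local controllers are hypothetical, and the winner is taken as the PE with the smallest Euclidean distance to the embedded vector.

```python
def winning_pe(codebook, x):
    """Index of the SOM PE whose weight vector is closest to x (Euclidean)."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in codebook]
    return min(range(len(codebook)), key=dists.__getitem__)


def switched_control(codebook, controllers, x, r_next):
    """Pick the controller attached to the winning PE and evaluate it."""
    i = winning_pe(codebook, x)
    return i, controllers[i](x, r_next)


# Hypothetical two-PE map over an embedded vector x = [y_k, y_{k-1}].
codebook = [[0.0, 0.0], [1.0, 1.0]]

# Each controller inverts its own made-up local model y_{k+1} = a*y_k + b*u_k.
controllers = [
    lambda x, r: (r - 0.9 * x[0]) / 1.0,
    lambda x, r: (r - 0.5 * x[0]) / 2.0,
]

pe, u = switched_control(codebook, controllers, [0.9, 1.1], 1.0)
```

Because neighboring PEs are meant to hold similar local models, a slightly wrong winner would still return a control value from a nearby regime rather than from an unrelated one.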
3.4 Multiple PID Control
In linear control theory, despite the development of more sophisticated control
strategies, Proportional-Integral-Derivative (PID) controllers have been extensively
studied by researchers and are widely used in
practice, and their principle is well understood by engineers [95,92]. The PID controller gained its
popularity from the simplicity of having only three parameters. Owing to that simplicity, however, it
has also paid the price of not having an efficient and practical way of determining
optimal gains.
The ideal continuous time PID controller is expressed in Laplace form as follows:
$$G(s) = K_p + \frac{K_i}{s} + K_d s \qquad (3.2)$$
where $K_p$ is the proportional gain, $K_i$ the integral gain, and $K_d$ the derivative gain.
Each of the terms works "independently" of the others¹. The standard PID control
configuration is shown in Figure 3-4. The introduction of integral action facilitates the
achievement of equality between the measured value and the desired value, as a constant
error produces an increasing controller output. The derivative action indicates that
changes in the desired value may be anticipated, and thus an appropriate correction may
be added prior to the actual change. Thus, in simplified terms, the PID controller allows
contributions from present, past and future controller inputs.
1 This is not exactly true since the whole thing operates in the context of a closed-loop. However, at any
instant in time, this is true and makes working with the PID controller much easier than other controller
designs.
[Block diagram: signals $R(s)$, $E(s)$, $U(s)$, $Y(s)$; controller $G(s)$ in series with plant $P(s)$ inside a feedback loop.]
Figure 3-4. PID controlled system.
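To make the roles of the three terms concrete, here is a small sketch of a sampled PID controller. It assumes a fixed sampling period `T`, rectangular (Euler) integration, and a backward-difference derivative; these discretization choices and the class name are assumptions made for illustration and are not taken from the text.

```python
class DiscretePID:
    """Textbook discrete approximation of G(s) = Kp + Ki/s + Kd*s.

    Assumes a fixed sampling period T, rectangular integration and a
    backward-difference derivative (illustrative choices only).
    """

    def __init__(self, kp, ki, kd, T):
        self.kp, self.ki, self.kd, self.T = kp, ki, kd, T
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        # "Past": accumulated error, grows while the error stays nonzero.
        self.integral += error * self.T
        # "Future": local trend of the error; no slope exists on the first sample.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.T
        self.prev_error = error
        # "Present": proportional term, plus the other two contributions.
        return self.kp * error + self.ki * self.integral + self.kd * derivative


pid = DiscretePID(kp=2.0, ki=1.0, kd=0.5, T=0.5)
outputs = [pid.step(e) for e in (1.0, 1.0, 0.0)]
```

With a constant error the integral term keeps raising the output, and when the error drops the derivative term pulls the output down immediately, which mirrors the description of the integral and derivative actions above.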
Extensions of the PID control methodology to nonlinear systems, however, are not
trivial. Usually, a global nonlinear model is linearized and the parameters of
the PID controller are scheduled according to the regime. The difficulties
stem from the fact that the system model is constructed from a nonlinear dynamical
equation. They could be eliminated by a piece-wise linear approximation of the
nonlinear dynamics (as one does in gain scheduling). However, gain scheduling methods
are not robust to model uncertainties, and they cannot be scaled up to regimes that are
not described by a linear model in the initial system identification stage. Gain scheduling also has the
problem of either inefficient use of system identification data due to a large number of
arbitrarily selected operating (linearization) points, or inaccurate modeling due to a
small number of linearization points [83].
The multiple PID control design method described in this section is rooted in the
principle of using local linear models to construct a globally nonlinear (piecewise linear)
system model that is determined completely from the input-output data collected from the
actual plant [12,30]. In addition to previously designed multiple inverse control schemes,
this section illustrates how the multiple models can be united with the well-known linear
PID controller design techniques to obtain a principled and simple nonlinear PID
controller design methodology. The proposed closed loop scheme is illustrated in Figure
3-5 where the SOM determines which linear PID controller contributes to the
instantaneous control input.
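[Block diagram: the SOM, driven by embedded signals, selects one of the PID controllers in series with the plant; reference $r_{k+1}$, output $y_{k+1}$.]
Figure 3-5. Overall schematic diagram of the nonlinear PID closed loop control
mechanism using multiple controllers.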
Once system identification is complete, the design of a globally piece-wise linear
PID control system can be easily accomplished using standard techniques. The literature
has an abundance of PID design methodologies for linear SISO systems including direct
pole-placement techniques and optimal coefficient adjustment according to some criteria
[95,92]. Here the pole-placement technique [4] is used to design a PID controller for each
linear SISO model; it is illustrated briefly in the following.
Assume that the plant is modeled by a set of SISO, second-order systems given
by
$$y_{k+1} = f_i(y_k, y_{k-1}, u_k, u_{k-1}) = a_{1,i}\, y_k + a_{2,i}\, y_{k-1} + b_{1,i}\, u_k + b_{2,i}\, u_{k-1} \qquad (3.3)$$
where $y_k$ is the output of the plant, and $u_k$ is the input to the plant. The controller is
designed by starting with a general PID regulator shown in Figure 3-6 and determining
the coefficients of polynomials $C_i$ and $D_i$ so that the closed-loop system has desired
properties.
[Block diagram: signals $R(z)$, $E(z)$, $U(z)$, $Y(z)$; the PID regulator in series with the plant model inside a feedback loop.]
Figure 3-6. Block diagram of PID controller for a SISO plant model.
Each model's transfer function is given by
$$F_i(z) = \frac{B_i(z)}{A_i(z)} = \frac{b_{1,i} z^2 + b_{2,i} z}{z^2 - a_{1,i} z - a_{2,i}} \qquad (3.4)$$
It is assumed that the polynomials $A_i(z)$ and $B_i(z)$ do not have any common factors.
Note that in order to have a stable zero (zero inside the unit circle), $|b_{1,i}| > |b_{2,i}|$ must be
true. The controller specifications are expressed in terms of a model that gives the desired
response to command signals. The general control equation is
$$C_i(z)U(z) = D_i(z)(R(z) - Y(z)) \qquad (3.5)$$
It is assumed that $C_i$ is monic. To make sure that low-frequency disturbances give
small errors, $C_i$ is chosen as
$$C_i(z) = (z - 1)^L\, \bar{C}_i(z) \qquad (3.6)$$
with a suitable selection of $L$. This puts a $(z-1)$ term in the denominator of the
controller transfer function, guaranteeing integral control action.
The goal of the controller design is to map the values of $a_{1,i}$, $a_{2,i}$, $b_{1,i}$ and $b_{2,i}$ into
the controller coefficients $d_{j,i}$ and $c_i$, subject to the constraint of the model whose
characteristic polynomial is $z^2 + \lambda_1 z + \lambda_2$. The transfer function of the closed-loop
system is $1/(1 + zA_i(z)C_i(z)/(B_i(z)D_i(z)))$; therefore, its characteristic polynomial is
$A_i(z)C_i(z) + z^{-1}B_i(z)D_i(z)$. The characteristic polynomial for the model is $A_m$, with the
addition of a second-order "observer polynomial" term $A_o$. Equating characteristic
polynomials results in the design equation
$$A_i(z)(z-1)^L\, \bar{C}_i(z) + z^{-1}B_i(z)D_i(z) = A_o(z)A_m(z) \qquad (3.7)$$
which can be written as
$$(z^2 - a_{1,i}z - a_{2,i})(z-1)(z + c_i) + z^{-1}(b_{1,i}z^2 + b_{2,i}z)(d_{1,i}z^2 + d_{2,i}z + d_{3,i}) = z^2(z^2 + \lambda_1 z + \lambda_2) \qquad (3.8)$$
This equation is solved for the $d_{j,i}$ and $c_i$. Then the control law difference equation is
$$u_k = (1 - c_i)u_{k-1} + c_i u_{k-2} + d_{1,i}e_k + d_{2,i}e_{k-1} + d_{3,i}e_{k-2} \qquad (3.9)$$
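One way to carry out this design numerically is to match the coefficients of $z^3$, $z^2$, $z^1$ and $z^0$ on both sides of the design equation, which gives four linear equations in the four unknowns $c$, $d_1$, $d_2$, $d_3$. The sketch below does this in plain Python, assuming $L = 1$, the observer polynomial $z^2$, and a second-order model with coefficients $a_1, a_2, b_1, b_2$; the helper names and the numerical values are illustrative assumptions.

```python
def solve_linear(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            raise ValueError("design equation is singular for this model")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x


def pid_pole_placement(a1, a2, b1, b2, lam1, lam2):
    """Match coefficients of z^3..z^0; unknown order is [c, d1, d2, d3]."""
    M = [
        [1.0,       b1,  0.0, 0.0],   # z^3
        [-(1 + a1), b2,  b1,  0.0],   # z^2
        [a1 - a2,   0.0, b2,  b1],    # z^1
        [a2,        0.0, 0.0, b2],    # z^0
    ]
    rhs = [lam1 + 1 + a1, lam2 - a1 + a2, -a2, 0.0]
    return solve_linear(M, rhs)


def polymul(p, q):
    """Multiply polynomials given as coefficient lists, highest power first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


# Made-up local model and desired closed-loop polynomial z^2 + lam1*z + lam2.
a1, a2, b1, b2 = 1.2, -0.5, 1.0, 0.4
lam1, lam2 = -0.6, 0.1
c, d1, d2, d3 = pid_pole_placement(a1, a2, b1, b2, lam1, lam2)

# Rebuild the left-hand side of the design equation and compare with the target.
lhs = polymul(polymul([1.0, -a1, -a2], [1.0, -1.0]), [1.0, c])   # degree 4
bd = polymul([b1, b2], [d1, d2, d3])          # z^-1 * B(z) * D(z), degree 3
lhs = [lhs[0]] + [l + r for l, r in zip(lhs[1:], bd)]
target = [1.0, lam1, lam2, 0.0, 0.0]          # z^2 * (z^2 + lam1*z + lam2)
residual = max(abs(l - t) for l, t in zip(lhs, target))
```

The final `residual` is a self-check: it rebuilds both sides of the design equation from the solved coefficients and should be at rounding-error level whenever the system is nonsingular.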
Given the local linear models as obtained through the use of a SOM and a PID
design technique, the overall closed loop nonlinear PID design reduces to determining the
coefficients of the individual local linear PID controllers using their respective linear
plant model transfer functions. One needs to determine a set of PID coefficients per linear
model. In the competitive SOM approach, the model output depends only on a single
linear model at a given time; therefore, the PID coefficients are set to those values
determined for the instantaneous winning model.
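A sketch of how the switched control law might be run on line is given below. It assumes the difference equation $u_k = (1 - c)u_{k-1} + c\,u_{k-2} + d_1 e_k + d_2 e_{k-1} + d_3 e_{k-2}$ with one coefficient set per local model, and it assumes that the input and error histories are shared across all controllers when the winner changes; the class name and the coefficient values are hypothetical.

```python
class SwitchedPID:
    """Runs u_k = (1-c)u_{k-1} + c*u_{k-2} + d1*e_k + d2*e_{k-1} + d3*e_{k-2},
    taking (c, d1, d2, d3) from whichever local model currently wins."""

    def __init__(self, coeffs):
        self.coeffs = coeffs            # one (c, d1, d2, d3) tuple per model
        self.u1 = self.u2 = 0.0         # u_{k-1}, u_{k-2}
        self.e1 = self.e2 = 0.0         # e_{k-1}, e_{k-2}

    def step(self, winner, e):
        c, d1, d2, d3 = self.coeffs[winner]
        u = ((1 - c) * self.u1 + c * self.u2
             + d1 * e + d2 * self.e1 + d3 * self.e2)
        # Shift the shared histories by one sample.
        self.u2, self.u1 = self.u1, u
        self.e2, self.e1 = self.e1, e
        return u


# Hypothetical coefficient sets for two local models.
bank = SwitchedPID([(0.5, 1.0, 0.0, 0.0), (0.2, 2.0, -1.0, 0.0)])
u_seq = [bank.step(w, e) for w, e in [(0, 1.0), (0, 1.0), (1, 0.0)]]
```

Sharing the histories keeps the control signal continuous in its past values across a switch; resetting them at each switch would be an alternative design, and the text does not say which is used.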
CHAPTER 4
MULTIPLE QUASI-SLIDING MODE CONTROL
Attempts to use traditional control methods, such as inverse control and PID
control, for nonlinear plants will inevitably encounter problems when faced with the
nonlinear nature of these systems. In order to overcome these difficulties in designing
controllers for nonlinear systems, a simplified control method that keeps the advantage of
the conventional approach was proposed in Chapter 3, i.e., the SOM was explored as a
modeling infrastructure, and the controllers were built based on SOM-based multiple
models.
However, these control methods (especially the multiple inverse control scheme) may
show poor control performance in the presence of sensor noise or external disturbances, because
perfect control is achieved only if the plant and the controller are stable,
the model is perfect, and there is no disturbance. Wrong selection of the model caused
by noise can ruin the control mission, since the controller is then determined by a model very
different from the one required for the given task. For this reason, the sliding mode control architecture,
which is a well-known robust control approach, is employed in this chapter in order to
obtain robustness against noise as well as model uncertainties.
4.1 Introduction to Variable Structure Systems
The control of nonlinear systems has been an important research topic and many
approaches have been proposed. While classical control techniques have produced many
highly reliable and effective control systems, great attention has been devoted to the
design of variable structure control systems (VSCS). Variable structure systems (VSS)
are a special class of nonlinear systems characterized by a discontinuous control action
which changes structure upon reaching a set of switching hyperplanes during the control
process to attain improved overall characteristics in the controlled system. During the
sliding mode the VSCS has invariance properties, yielding motion that is remarkably
good in rejecting certain disturbances and parameter variations [93,9,25,86].
Most of the VSCS proposed in the literature have been developed mainly based on
the state-space model with the assumption that all state variables are measurable or on the
input-output model for a linear system. But in some control problems, we are allowed to
access only the input and the output of the nonlinear plant [72]. In this case, an observer
could be used to estimate the unmeasurable state variables if the state equations are
known. Otherwise, this is not possible. Thus, it is the purpose of this chapter to provide a
new technique to design sliding mode control systems based on input-output models of
the considered discrete-time nonlinear system so that the amount of guesswork¹ is
reduced, while attainable performance is increased.
Normally, the design of VSS consists of two parts: First, the sliding surface, which
is usually of lower order than the given process, must be constructed such that the system
performance during sliding mode satisfies the design objectives, in terms of stability,
performance index minimization, linearization of nonlinearities, order reduction, etc.
Second, the switched feedback control is designed such that it satisfies the reaching
condition and thus drives the state trajectory to the sliding surface in finite time and
maintains it there thereafter [93].
1 In most cases, we need to estimate the unknown parameters, unmodeled dynamics and bounded
disturbances. Also, it should be noted that the VSCS works best when the plant is completely known.
4.1.1 Sliding Hyperplane Design
Sliding surfaces can be either linear or nonlinear. The theory of designing linear
switching surfaces for linear dynamic systems has been developed in great depth and
completeness, while the design of sliding surfaces for more general nonlinear systems
remains a largely open problem. For simplicity, we focus only on linear switching
surfaces. Moreover, for surface design, it is sufficient to consider only ideal systems, i.e.,
without uncertainties and disturbances. Consider a general system
$$\dot{x} = A(x) + B(x)u \qquad (4.1)$$
with a sliding surface
$$\mathcal{S} = \{x \mid S(x) = 0\}$$
where $A(x)$, $B(x)$ are general nonlinear functions of $x$, and $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$.
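The equivalent control is found by recognizing that $\dot{S}(x) = 0$ is a necessary
condition for the state trajectory to stay on the sliding surface $S(x) = 0$. Therefore,
setting $\dot{S}(x) = 0$, i.e.,
$$\dot{S}(x) = \frac{\partial S}{\partial x}\dot{x} = \frac{\partial S}{\partial x}A(x) + \frac{\partial S}{\partial x}B(x)u = 0 \qquad (4.2)$$
yields the equivalent control
$$u_{eq} = -\left[\frac{\partial S}{\partial x}B(x)\right]^{-1}\frac{\partial S}{\partial x}A(x) \qquad (4.3)$$
where $(\partial S/\partial x)B(x)$ is nonsingular. When in sliding mode, the dynamics of the system is
governed by
$$\dot{x} = \left\{I - B(x)\left[\frac{\partial S}{\partial x}B(x)\right]^{-1}\frac{\partial S}{\partial x}\right\}A(x) \qquad (4.4)$$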
For example, suppose the system (4.1) is linear and described by
$$\dot{x} = Ax + Bu \qquad (4.5)$$
where $A$ and $B$ are properly dimensioned constant matrices. The switching surface can
be defined as
$$S(x) = C^T x = 0 \qquad (4.6)$$
i.e., $\partial S/\partial x = C^T$, where $C = [c_1, c_2, \ldots, c_m]$ is an $n \times m$ matrix, and then we have
$$u_{eq} = -(C^T B)^{-1}C^T A x \qquad (4.7)$$
and (4.4) becomes
$$\dot{x} = \left(I - B(C^T B)^{-1}C^T\right)Ax = (A - BK)x \qquad (4.8)$$
Equations (4.4) and (4.8) describe the behavior of the systems (4.1) and (4.5), respectively, when they
are restricted to the switching surface, provided the initial condition $x(t_0)$ satisfies $S(x(t_0)) = 0$.
For the linear case, the system dynamics is determined by a suitable choice of the feedback
matrix $K = (C^T B)^{-1}C^T A$. In other words, the choice of the matrix $C$ can be made
without prior knowledge of the form of the control $u$.
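The linear case can be checked numerically on a small example. The sketch below assumes a made-up second-order plant with a scalar input and the surface $s(x) = 4x_1 + x_2$; it computes $K = (C^T B)^{-1} C^T A$, applies $u_{eq} = -Kx$ at a point on the surface, and evaluates $\dot{s}$ there. All numerical values are illustrative.

```python
def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


# Illustrative second-order plant dx/dt = A x + B u with a scalar input.
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [0.0, 1.0]
C = [4.0, 1.0]                      # surface s(x) = 4*x1 + x2 = 0

CB = dot(C, B)                      # scalar C^T B, must be nonzero
CA = [dot(C, [A[0][j], A[1][j]]) for j in range(2)]   # row vector C^T A
K = [caj / CB for caj in CA]        # K = (C^T B)^-1 C^T A


def u_eq(x):
    """Equivalent control u_eq = -K x."""
    return -dot(K, x)


def xdot(x, u):
    """Right-hand side A x + B u."""
    Ax = matvec(A, x)
    return [Ax[i] + B[i] * u for i in range(2)]


x_on_surface = [1.0, -4.0]          # satisfies s(x) = 0
velocity = xdot(x_on_surface, u_eq(x_on_surface))
sdot = dot(C, velocity)             # rate of change of s under u_eq
x1dot = velocity[0]
```

Here `sdot` vanishes, so the state stays on the surface, and `x1dot` equals $-4x_1$: on the surface the motion is the first-order dynamics fixed by the choice of $C$, independent of the original plant poles.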
4.1.2 Sliding Mode Control Law Design
Once the sliding surfaces have been selected, attention must be turned to solving
the reachability problem. This involves the selection of a state feedback control function
$u : \mathbb{R}^n \to \mathbb{R}^m$ which can drive the state $x$ towards the surface and thereafter maintain it
on the surface, as illustrated in Figure 4-1. In other words, the controlled system must
satisfy the reaching conditions.
The classic sufficient condition for sliding mode to appear is to satisfy the
condition $s_i \dot{s}_i < 0$, $i = 1, \ldots, m$, or a similar condition proposed by Utkin [93], i.e.,
$\lim_{s_i \to 0^+} \dot{s}_i < 0$ and $\lim_{s_i \to 0^-} \dot{s}_i > 0$. These reaching laws result in a VSC where individual
switching surfaces and their intersection are all sliding surfaces. This reaching is global
but does not guarantee finite reaching time.
[Phase plane plot: the trajectory passes through a reaching phase and then a sliding phase along the switching line.]
Figure 4-1. Phase plane plot of a continuous-time second-order variable structure system.
Another commonly used reaching law was proposed by Gao & Hung [27]. The law
directly specifies the dynamics of the switching surface by the differential equation
$$\dot{S} = -Q\,\mathrm{sgn}(S) - R\,g(S) \qquad (4.9)$$
where the gains $Q$ and $R$ are diagonal matrices with positive elements, and
$\mathrm{sgn}(S) = [\mathrm{sgn}(s_1), \ldots, \mathrm{sgn}(s_m)]^T$, $g(S) = [g_1(s_1), \ldots, g_m(s_m)]^T$, where
$$\mathrm{sgn}(s_i) = \begin{cases} 1 & \text{if } s_i(x) > 0 \\ 0 & \text{if } s_i(x) = 0 \\ -1 & \text{if } s_i(x) < 0 \end{cases} \qquad (4.10)$$
and the scalar functions $g_i$ satisfy the condition
$$s_i\, g_i(s_i) > 0 \quad \text{when } s_i \neq 0 \qquad (4.11)$$
Various choices of $Q$ and $R$ specify different rates for approaching $S$ and yield
different structures in the reaching law.
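As a worked illustration of how the gains set the approach rate, consider only the scalar case of the reaching law (4.9), with the assumed choice $g(s) = s$, scalar gains $q, r > 0$, and an initial value $s(0) = s_0 > 0$. Under these assumptions the law can be integrated in closed form:

```latex
\dot{s} = -q - r s \quad (s > 0)
\;\Longrightarrow\;
s(t) = \left(s_0 + \frac{q}{r}\right)e^{-rt} - \frac{q}{r},
\qquad
t_{\mathrm{reach}} = \frac{1}{r}\ln\!\left(1 + \frac{r s_0}{q}\right).
```

As $r \to 0$ this tends to $s_0/q$, the reaching time of the constant-rate law alone; the proportional term shortens the reaching time most noticeably when $s_0$ is large.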
For one way of designing controllers, recall that during sliding mode, one can
compute the equivalent control $u_{eq}$ according to (4.3) or (4.7). However, using $u_{eq}$ alone
cannot drive the state towards the sliding surface if the initial conditions of the system are
not on $S$. One popular design method is to augment the equivalent control with a
discontinuous or switched part, i.e.,
$$u = u_{eq} + u_N \qquad (4.12)$$
where $u_{eq}$ is a continuous control defined by (4.3), and $u_N$ is added to satisfy the
reaching condition and may have different forms. For a controller having the structure
of (4.12), we have
$$\dot{S}(x) = \frac{\partial S}{\partial x}\dot{x} = \frac{\partial S}{\partial x}\left[A(x) + B(x)(u_{eq} + u_N)\right]$$
$$= \frac{\partial S}{\partial x}\left[A(x) + B(x)u_{eq}\right] + \frac{\partial S}{\partial x}B(x)u_N$$
$$= \frac{\partial S}{\partial x}B(x)u_N$$
For simplicity, assume $(\partial S/\partial x)B(x) = I$; then $\dot{S}(x) = u_N$. Some often used forms of $u_N$
are relay type of control $u_N = -\alpha\,\mathrm{sgn}(S(x))$, linear continuous feedback control
$u_N = -\alpha S(x)$, and linear feedback control with switching gains $u_N = \Psi x$, where
$\Psi = [\psi_{ij}]$ is an $m \times n$ matrix and
$$\psi_{ij} = \begin{cases} \alpha_{ij} & \text{if } s_i(x)\,x_j > 0 \\ \beta_{ij} & \text{if } s_i(x)\,x_j < 0 \end{cases}$$
The parameters $\alpha$ and $\beta$ are chosen to satisfy the desired reaching condition.
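Another controller design method is to employ the reaching law approach
proposed by Gao & Hung [27]. The control can be directly obtained by computing $\dot{S}(x)$ along the
reaching mode trajectory, i.e.,
$$\dot{S}(x) = \frac{\partial S}{\partial x}\dot{x} = \frac{\partial S}{\partial x}\left[A(x) + B(x)u\right] = -Q\,\mathrm{sgn}(S(x)) - R\,g(S(x)) \qquad (4.13)$$
Hence, we have
$$u = -\left[\frac{\partial S}{\partial x}B(x)\right]^{-1}\left[\frac{\partial S}{\partial x}A(x) + Q\,\mathrm{sgn}(S(x)) + R\,g(S(x))\right] \qquad (4.14)$$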
By this approach, the resulting sliding mode is not preassigned but follows the natural
trajectory on a first-reach-first-switch scheme. The switching takes place depending on
the location of the initial state.
4.2 Sliding Mode Control in Sampled-Data Systems
VSS theory was originally developed from a continuous-time
perspective, and it has since been recognized that directly applying the theory to discrete-time systems
runs into serious problems, such as the limited sampling frequency,
sample/hold effects and discretization errors [2]. Since the switching frequency in
sampled-data systems cannot exceed the sampling frequency, a discontinuous control
does not enable generation of motion in an arbitrary manifold in discrete-time systems
[3,9]. This leads to chattering along the designed sliding surface, or even instability in
case of a too large switching gain [2]. Figure 4-2 illustrates that in discrete-time systems,
the state moves around the sliding surface in a zigzag manner at the sampling frequency.
Much research has been done in this field. Among various concepts of discrete-time
sliding mode, Quasi-Sliding Mode (QSM) is reviewed in this section for the sliding mode
controller design aimed at sampled-data systems.
[Sketch: the state trajectory at samples $k$, $k+1$, $k+2$ zigzags across the sliding surface $s = 0$.]
Figure 4-2. Discrete-time system response with sliding mode control.
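The zigzag motion can be reproduced with a very small simulation. The sketch below assumes the simplest possible setting, a sampled integrator $s_{k+1} = s_k + T u_k$ driven by a relay $u_k = -\alpha\,\mathrm{sgn}(s_k)$; the plant, gain, and sampling period are illustrative assumptions and not a system studied in the text.

```python
def sgn(s):
    """Sign function returning -1, 0 or 1."""
    return (s > 0) - (s < 0)


def simulate_relay(s0, alpha, T, steps):
    """Sampled relay control of an assumed integrator:
    s_{k+1} = s_k - alpha * T * sgn(s_k)."""
    s = [s0]
    for _ in range(steps):
        s.append(s[-1] - alpha * T * sgn(s[-1]))
    return s


trajectory = simulate_relay(s0=1.0, alpha=1.0, T=0.3, steps=12)
tail = trajectory[-6:]      # samples after the surface has been reached
```

After the first crossing the samples in `tail` alternate in sign and never settle at zero: the relay can only switch once per sampling period, so the state overshoots the surface by up to `alpha * T` each time.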
4.2.1 Quasi-Sliding Mode
Many researchers have either addressed the limitations when direct implementation
is done or have proposed designs that take the sampling process into account.
Milosavljevic [57] was among the first researchers to formally state that the sampling
process limits the existence of a true sliding mode. In light of this, the concept of QSM
has been suggested and the conditions for the existence of such mode have been
investigated. Consider a sampled-data SISO system with the predefined sliding surface
$$x_{k+1} = Ax_k + Bu_k, \qquad s_k = C^T x_k = 0 \qquad (4.15)$$
The desired state trajectory of a discrete-time VSC system should have the
following features. First, the trajectory moves monotonically towards the switching
manifold and crosses it in finite time starting from any initial point. Second, the
trajectory crosses the manifold in succession after it first hits the manifold, resulting in a
zigzag motion about the switching surface. Last, the trajectory stays within a
specified band, without the size of each successive zigzagging step increasing. Gao et al.
[28] defined a QSM as the motion of a discrete VSC system satisfying the last two features.
In addition, they named the specified band, $\{x \mid -\Delta < s(x) < +\Delta\}$, which contains the QSM,
the Quasi-Sliding Mode Band (QSMB). For single input systems, the main
approaches for the design of QSM control laws can be categorized into the following two
methods:
Discrete Lyapunov function based approach: Sarpturk et al. [81] noticed that unlike
the case in continuous-time SMC, the switching control in the discrete-time case should
be both upper and lower bounded in an open interval, in order to guarantee the
convergence of sliding mode. Recall that in continuous-time SMC, the control (4.12) is
composed of the equivalent control and a switching control. Converting this control to
discrete-time gives
$$u_k = u_{k,eq} + u_{k,N} \qquad (4.16)$$
Hui & Zak [36] observed that if $u_{k,N}$ is a relay control with a constant amplitude,
the relay must be turned off in some neighborhood of the surface in order to reach the
switching surface; otherwise, the trajectory will chatter around the surface with a chatter
amplitude at least as large as the amplitude of the relay output. The idea of a sliding sector
[26,48] was used to solve this problem, i.e., to specify a region in a neighborhood along
the sliding mode, where linear control is used to keep the state inside the region after it
has reached the region. The switching control is applied only when the system states are outside
the region. In this case, the derived switching surface is different from the sliding
surface. Based on a discrete Lyapunov function,
$$V_k = \frac{1}{2}s_k^2$$
the reaching law is given by
$$s_k(s_{k+1} - s_k) < -\frac{1}{2}(s_{k+1} - s_k)^2 \quad \text{for } s_k \neq 0 \qquad (4.17)$$
which ensures $V_{k+1} < V_k$. Furuta [25] proposed a control law of the type
$$u_k = u_{k,eq} + F_D x_k \qquad (4.18)$$
where the equivalent control $u_{k,eq}$ is the solution of
$$\Delta s_k = s_{k+1} - s_k = 0 \qquad (4.19)$$
and therefore the equivalent control for the system (4.15) is
$$u_{k,eq} = -(C^T B)^{-1}C^T(A - I)x_k \qquad (4.20)$$
$F_D$ is a discontinuous control law which will be zero inside the sliding sector.
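The link between the reaching condition (4.17) and the decrease of the Lyapunov function can be made explicit in one line. Writing $\Delta s_k = s_{k+1} - s_k$ and taking $V_k = \tfrac{1}{2}s_k^2$, the following identity holds:

```latex
V_{k+1} - V_k
  = \tfrac{1}{2}\left(s_{k+1}^2 - s_k^2\right)
  = \tfrac{1}{2}\,\Delta s_k\,(2 s_k + \Delta s_k)
  = s_k\,\Delta s_k + \tfrac{1}{2}(\Delta s_k)^2 .
```

Hence $V_{k+1} < V_k$ exactly when $s_k \Delta s_k < -\tfrac{1}{2}(\Delta s_k)^2$. Unlike the continuous-time condition $s\dot{s} < 0$, the step must not only point towards the surface but also be small enough not to overshoot to a larger $|s|$ on the other side.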
Reaching law based approach: Gao et al. [28] pointed out that the reaching law
(4.17) was incomplete for a satisfactory guarantee of a discrete-time sliding mode, since
it does not ensure that the trajectory moves monotonically towards the switching surface
and stays within a specified band. Thus, they presented an algorithm that
drives the system state to the vicinity of a switching hyperplane in the state space, rather
than to a sector of a different shape. They specified desired properties of the controlled
systems and proposed a reaching law based approach for designing the discrete-time
sliding mode control law. The equivalent form of the reaching law for discrete-time SMC
is extended from the continuous-time reaching law (4.13), and for a SISO system is
$$s_{k+1} - s_k = -qT\,\mathrm{sgn}(s_k) - rT s_k, \quad r > 0,\ q > 0,\ 1 - rT > 0 \qquad (4.21)$$
where $T > 0$ is the sampling period. The state approaches the switching surface at a constant
rate $-qT$, and the term $-rT s_k$ forces the state to approach the switching surface faster
when $s_k$ is large. The inequality for $T$ guarantees that starting from any initial state, the
trajectory will move monotonically towards the switching surface and cross it in finite
time. Then the control law for discrete SMC is derived by comparing
$$s_{k+1} - s_k = C^T x_{k+1} - C^T x_k = C^T A x_k + C^T B u_k - C^T x_k \qquad (4.22)$$
with the reaching law (4.21), which yields
$$u_k = -(C^T B)^{-1}\left[C^T A x_k - C^T x_k + qT\,\mathrm{sgn}(s_k) + rT s_k\right] \qquad (4.23)$$
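A short simulation can illustrate the behavior that this reaching law imposes. The sketch below assumes a sampled double integrator as the plant, the surface $s_k = x_1 + x_2$, and gains $q = 0.5$, $r = 2$ with $T = 0.1$ (so $1 - rT > 0$); the control is obtained by solving $C^T(Ax_k + Bu_k) - C^T x_k = -qT\,\mathrm{sgn}(s_k) - rT s_k$ for the scalar $u_k$. All numerical values are illustrative assumptions.

```python
def sgn(s):
    """Sign function returning -1, 0 or 1."""
    return (s > 0) - (s < 0)


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


# Assumed example: sampled double integrator, sampling period T = 0.1.
T = 0.1
A = [[1.0, T], [0.0, 1.0]]
B = [0.5 * T * T, T]
C = [1.0, 1.0]                 # s_k = x1 + x2
q, r = 0.5, 2.0                # reaching-law gains; 1 - r*T = 0.8 > 0


def control(x):
    """Solve C^T(Ax + Bu) - C^T x = -qT sgn(s) - rT s for the scalar u."""
    s = dot(C, x)
    Ax = [dot(row, x) for row in A]
    return -(dot(C, Ax) - s + q * T * sgn(s) + r * T * s) / dot(C, B)


x = [1.0, 0.5]
s_hist = [dot(C, x)]
for _ in range(60):
    u = control(x)
    x = [dot(A[i], x) + B[i] * u for i in range(2)]
    s_hist.append(dot(C, x))

# Index of the first sample at which s has changed sign.
first_cross = next(k for k in range(1, len(s_hist)) if s_hist[k] * s_hist[0] < 0)
```

In this run `|s_k|` shrinks at every step until the first crossing, after which the samples zigzag about $s = 0$ and, for these values, stay within a band of half-width $qT$.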
4.2.2 Quasi-Sliding Mode Control Using Multiple Models
Now we discuss the design of the control law for local linear models using the
QSM control framework proposed by Gao et al. [28], where the system states move in a
neighborhood around the sliding surface sk = 0. The central advantage of the sliding
mode control strategy is that it is an effective robust control strategy for incompletely
modeled or uncertain systems. Thus, one feature of the proposed control scheme is that
robustness to disturbances can be obtained with simple control logic based on the
linear model for each region. Another feature of the strategy is that it guarantees
convergence of the system output to a vicinity of the predetermined, fixed plane in finite
time, specified a priori by the designer.
Consider one of the local single-input single-output models $\hat{f}$ of the plant $f$,
$$ y_{k+1} = a_1 y_k + a_2 y_{k-1} + \cdots + a_m y_{k-m+1} + b_1 u_k + b_2 u_{k-1} + \cdots + b_n u_{k-n+1} \tag{4.24} $$
Equivalently, the input-output model (4.24) can be written as the state-space model²
$$ \mathbf{x}_{k+1} = \Phi \mathbf{x}_k + \Lambda_1 u_k + \Lambda_2 u_{k-1} + \cdots + \Lambda_n u_{k-n+1} \tag{4.25} $$
where $\mathbf{x}_k = [y_{k-m+1}, \ldots, y_{k-1}, y_k]^T \in \mathbb{R}^m$ is the system state vector, which is available for
measurement, and $\Phi$ and $\Lambda_1, \ldots, \Lambda_n$ have the following forms:
2 In a formal state-space model, past values of the input should be included in the state vector using delay
operators. For simplicity, we include only the system output's past values in the state vector in this
notation.
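$$ \Phi = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_m & a_{m-1} & a_{m-2} & \cdots & a_1 \end{bmatrix}, \quad
\Lambda_1 = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b_1 \end{bmatrix}, \quad
\Lambda_2 = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b_2 \end{bmatrix}, \quad \ldots, \quad
\Lambda_n = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b_n \end{bmatrix} $$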
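As a rough illustration of how the input-output model (4.24) maps to the state-space form (4.25), the following sketch builds the companion matrix and the input vectors from the ARX coefficients and advances the state by one step. The function names and the numerical coefficients in the usage example are hypothetical; the state ordering (oldest output first) follows the definition of the state vector above.

```python
def companion(a):
    """Companion matrix Phi for coefficients a = [a_1, ..., a_m]."""
    m = len(a)
    phi = [[1.0 if j == i + 1 else 0.0 for j in range(m)] for i in range(m - 1)]
    phi.append([float(v) for v in reversed(a)])  # last row: a_m, ..., a_1
    return phi


def input_vectors(b, m):
    """Vectors Lambda_i = [0, ..., 0, b_i]^T for b = [b_1, ..., b_n]."""
    return [[0.0] * (m - 1) + [float(bi)] for bi in b]


def step(a, b, x, u_hist):
    """One step of (4.25); x = [y_{k-m+1}, ..., y_k], u_hist = [u_k, ..., u_{k-n+1}]."""
    phi = companion(a)
    lam = input_vectors(b, len(a))
    x_next = [sum(p * xi for p, xi in zip(row, x)) for row in phi]
    for lam_i, u_i in zip(lam, u_hist):
        x_next = [xn + l * u_i for xn, l in zip(x_next, lam_i)]
    return x_next


# Hypothetical second-order example: the last state entry equals
# a_1 y_k + a_2 y_{k-1} + b_1 u_k + b_2 u_{k-1}, as in (4.24).
x_next = step([0.5, 0.2], [1.0, 0.3], [1.0, 2.0], [3.0, 4.0])
```

The first m-1 entries of the new state are simply shifted outputs, so only the last row of the companion matrix carries model information.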
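Defining the tracking error vector as
$$ \mathbf{e}_{k+1} = \mathbf{r}_{k+1} - \mathbf{x}_{k+1} \tag{4.26} $$
where the desired signal vector is $\mathbf{r}_{k+1} = [d_{k-m+2}, \ldots, d_k, d_{k+1}]^T$, the switching surface is
defined in the space of the tracking error vector by
$$ s_k = c^T \mathbf{e}_k \tag{4.27} $$
where $c = [c_1, c_2, \ldots, c_m]^T$. An equivalent control is then designed to satisfy the ideal
quasi-sliding mode condition, $s_{k+1} = s_k = 0$:
$$ u_{eq} = (c^T \Lambda_1)^{-1} \left\{ c^T (\mathbf{r}_{k+1} - \Phi \mathbf{x}_k) - c^T \Lambda_2 u_{k-1} - \cdots - c^T \Lambda_n u_{k-n+1} \right\} \tag{4.28} $$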
Substituting (4.28) into (4.25), the closed-loop response in the ideal quasi-sliding mode
is given by
$$ \mathbf{x}_{k+1} = \left\{ \Phi - \Lambda_1 (c^T \Lambda_1)^{-1} c^T \Phi \right\} \mathbf{x}_k + \Lambda_1 (c^T \Lambda_1)^{-1} c^T \mathbf{r}_{k+1} \tag{4.29} $$
The system (4.29) can be viewed as a linear system with the input $\mathbf{r}_{k+1}$ and the output
$\mathbf{x}_{k+1}$. To gain insight into the tracking capability of the system, (4.29) can be
rewritten in terms of the tracking error $e_k = d_k - y_k$ as
$$ e_{k+1} = -\frac{c_{m-1}}{c_m} e_k - \frac{c_{m-2}}{c_m} e_{k-1} - \cdots - \frac{c_1}{c_m} e_{k-m+2} \tag{4.30} $$
Note that by designing the switching surface such that the roots of the polynomial
$\lambda^{m-1} + (c_{m-1}/c_m)\lambda^{m-2} + \cdots + (c_1/c_m)$ lie inside the unit circle, the error vanishes and
thus the condition ensures asymptotic convergence to the desired output. The arbitrary
positive scalar $c_m$ also determines the time taken to reach the sliding surface and can be
adjusted to get a faster response.
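The effect of the surface coefficients on the ideal quasi-sliding error dynamics can be checked numerically. The sketch below iterates the recursion e_{k+1} = -(c_{m-1}/c_m) e_k - ... - (c_1/c_m) e_{k-m+2}; the function name and the initial error are assumptions made for illustration. For m = 2 the single root is -c_1/c_2, so c = [1, -3] (the choice used later for the Lorenz system) gives a contraction factor of 1/3.

```python
def ideal_qsm_error(c, e_init, steps):
    """Iterate the ideal quasi-sliding error recursion for c = [c_1, ..., c_m].

    e_init holds the m-1 most recent errors, oldest first (requires m >= 2).
    """
    m = len(c)
    e = list(e_init)
    for _ in range(steps):
        recent = e[-(m - 1):]  # [e_{k-m+2}, ..., e_k], paired with c_1 .. c_{m-1}
        e.append(-sum(ci * ei for ci, ei in zip(c[:-1], recent)) / c[-1])
    return e


# Root at -c_1/c_2 = 1/3 (inside the unit circle): the error decays.
stable = ideal_qsm_error([1.0, -3.0], [1.0], 5)

# Hypothetical counterexample with the root at -3 (outside the unit circle).
unstable = ideal_qsm_error([3.0, 1.0], [1.0], 5)
```

Moving the root toward the origin speeds up the decay, which is the trade-off against overshoot discussed in the robustness analysis.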
The reaching law (4.21) always satisfies the reaching condition such that the
discrete VSC system designed using the reaching law approach is always stable with a
stable ideal quasi-sliding mode [28]. Then the control law is derived by comparing
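$$ s_{k+1} - s_k = c^T \mathbf{e}_{k+1} - c^T \mathbf{e}_k $$
with the reaching law (4.21), which yields
$$ u_k = (c^T \Lambda_1)^{-1} \left[ c^T \mathbf{r}_{k+1} - c^T \Phi \mathbf{x}_k - c^T \Lambda_2 u_{k-1} - \cdots - c^T \Lambda_n u_{k-n+1} + (rT - 1) s_k + qT\,\mathrm{sgn}(s_k) \right] \tag{4.31} $$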
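A minimal sketch of the quasi-sliding mode control law for a single second-order local model (m = n = 2) is given below, assuming the model matches the plant exactly. The plant coefficients, the set point, and the values of qT and rT are hypothetical and chosen only for illustration; with m = 2 the switching function is s_k = c_1 e_{k-1} + c_2 e_k and the scalar c^T Lambda_1 reduces to c_2 b_1.

```python
def sgn(s):
    return (s > 0) - (s < 0)


def qsm_control(c, a, b, y, y_prev, u_prev, d_prev, d, d_next, qT, rT):
    """Control u_k and switching value s_k for a second-order local model."""
    c1, c2 = c
    a1, a2 = a
    b1, b2 = b
    s = c1 * (d_prev - y_prev) + c2 * (d - y)       # s_k = c^T e_k
    bracket = (c1 * d + c2 * d_next                 # c^T r_{k+1}
               - c1 * y - c2 * (a1 * y + a2 * y_prev)  # c^T Phi x_k
               - c2 * b2 * u_prev                   # c^T Lambda_2 u_{k-1}
               + (rT - 1.0) * s + qT * sgn(s))
    return bracket / (c2 * b1), s


def simulate(steps=200, c=(1.0, -3.0), a=(0.5, 0.2), b=(1.0, 0.3),
             qT=0.05, rT=0.1, d=1.0):
    """Closed loop on a hypothetical linear plant identical to the model."""
    y, u_prev, s_hist = [0.0, 0.0], 0.0, []
    for _ in range(steps):
        u, s = qsm_control(c, a, b, y[-1], y[-2], u_prev, d, d, d, qT, rT)
        y.append(a[0] * y[-1] + a[1] * y[-2] + b[0] * u + b[1] * u_prev)
        u_prev = u
        s_hist.append(s)
    return y, s_hist


y, s_hist = simulate()
```

With a perfect model the switching function obeys the reaching law exactly, so the output settles in a small band around the set point; a mismatched model would add the error terms analyzed in Section 4.3.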
A salient feature of the multiple quasi-sliding mode controller is that the multiple
control scheme yields faster convergence to the desired response, and that
VSCS can be employed to control unknown nonlinear plants while gaining robustness against
noise and parameter variations.
4.3 Analysis of Multiple Quasi-Sliding Mode Control with an Imperfect Sensor
For the SOM-based system identification, one needs to quantify the effect of the
modeling error that will occur due to the quantization of state space induced by the SOM,
and also by the wrong selection of the winning model. Consider model (4.24) in the
presence of modeling error and measurement noise. The predicted output becomes:
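$$ \hat{y}_{k+1} = \hat{a}_1 (y_k + \zeta_k) + \hat{a}_2 (y_{k-1} + \zeta_{k-1}) + \cdots + \hat{a}_m (y_{k-m+1} + \zeta_{k-m+1}) + \hat{b}_1 u_k + \hat{b}_2 u_{k-1} + \cdots + \hat{b}_n u_{k-n+1} \tag{4.31} $$
where a wrong local model $(\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_m, \hat{b}_1, \ldots, \hat{b}_n)$ is triggered by the SOM due to
the noisy output measurement $y_k + \zeta_k$. Then, when $s_k = 0$, the overall tracking error
response with an equivalent control is given by
$$ \begin{aligned} e_{k+1} = {} & -\frac{c_{m-1}}{c_m} e_k - \frac{c_{m-2}}{c_m} e_{k-1} - \cdots - \frac{c_1}{c_m} e_{k-m+2} \\ & + (a_1 - \hat{a}_1)(y_k + \zeta_k) + \cdots + (a_m - \hat{a}_m)(y_{k-m+1} + \zeta_{k-m+1}) \\ & + (b_2 - \hat{b}_2) u_{k-1} + \cdots + (b_n - \hat{b}_n) u_{k-n+1} \end{aligned} \tag{4.32} $$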
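For simplicity, consider the error dynamics (4.32) when $m = 2$. Defining the model
parameter error $\Delta_k = [a_1 - \hat{a}_1,\ a_2 - \hat{a}_2,\ b_2 - \hat{b}_2]^T$ and the noise vector $\boldsymbol{\zeta}_k$, which collects
the measurement noise acting on the entries of $\mathbf{z}_k$, we have
$$ e_{k+1} = c\,e_k + \Delta_k^T (\mathbf{z}_k + \boldsymbol{\zeta}_k) \tag{4.33} $$
where $\mathbf{z}_k = [y_k, y_{k-1}, u_{k-1}]^T$, $c = -c_1/c_2$, and $c$ is chosen such that $|c| < 1$. We assume that
$E[\|\boldsymbol{\zeta}_k\|^2] = \gamma^2$ and that $\|\Delta_k\|$ is bounded in proportion to the distance
between the reference vector of the correct PE and that of the neighboring PE selected by
the perturbed output measurements³. Also, it is assumed that the noise $\zeta_k$ is zero-mean
and white. We have the following recursive formula for the tracking mean squared error:
$$ \begin{aligned} E[e_{k+1}^2] &= E\left[ c^2 e_k^2 + 2 c\, e_k \Delta_k^T (\mathbf{z}_k + \boldsymbol{\zeta}_k) + \Delta_k^T (\mathbf{z}_k + \boldsymbol{\zeta}_k)(\mathbf{z}_k + \boldsymbol{\zeta}_k)^T \Delta_k \right] \\ &= c^2 E[e_k^2] + \Delta_k^T \mathbf{z}_k \mathbf{z}_k^T \Delta_k + \Delta_k^T E[\boldsymbol{\zeta}_k \boldsymbol{\zeta}_k^T] \Delta_k \end{aligned} \tag{4.34} $$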
When we take the norm on each side in (4.34), the norm of the tracking error power is
represented by
3 The first assumption states that measurement noise has finite power. The second assumption means model
parameter error is bounded by the distance in the state space.
$$ E[e_{k+1}^2] \le c^2 E[e_k^2] + \|\Delta_k\|^2 \|\mathbf{z}_k\|^2 + \gamma^2 \|\Delta_k\|^2 \tag{4.35} $$
where the Cauchy inequality is used on the second term on the right-hand side. As
$k \to \infty$, using the earlier assumptions on noise and model error bounds, the following
bound on the steady-state tracking error power is obtained:
$$ E[e_\infty^2] \le \frac{\|\Delta_k\|^2 \left( \|\mathbf{z}_k\|^2 + \gamma^2 \right)}{1 - c^2} \tag{4.36} $$
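The step from the recursive inequality (4.35) to the steady-state bound (4.36) can be spelled out as a geometric series. The sketch below assumes, for illustration only, that the model error and regressor norms admit uniform bounds, collected here in a constant B that is not part of the original notation.

```latex
\begin{align*}
E[e_{k+1}^2] &\le c^2\,E[e_k^2] + B,
  \qquad B := \sup_k \|\Delta_k\|^2\bigl(\|\mathbf{z}_k\|^2 + \gamma^2\bigr) \\
             &\le c^{2(k+1)}\,E[e_0^2] + B\sum_{j=0}^{k} c^{2j} \\
             &\xrightarrow[k\to\infty]{} \frac{B}{1-c^2}
  \qquad (|c| < 1).
\end{align*}
```

For instance, with B = 0.01 the bound is about 0.013 for c = 0.5 but about 0.053 for c = 0.9, which illustrates why the bound degrades as c approaches 1.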
Note that the difference between the true model (winning PE) and the neighboring
model (wrong PE) selected because of the noisy input is typically small, since
neighboring SOM PEs represent neighboring regions in the dynamic space. It
should also be noted that the error bound can still be very large if $c$ is chosen close to 1. In
contrast, if $c$ is chosen as small as possible, the closed-loop system may have a very fast
transient response, possibly with a large, unexpected overshoot. Thus $c$ must be chosen
carefully so as not to incur a large error. This problem is discussed later with the
simulation results. If $c$ is set small enough, it follows that, with an appropriate choice of
the design parameters mentioned above, the error is bounded for a given modeling
uncertainty and for measurement noise bounded by $\gamma$. Moreover, this shows that the
switching scheme raises no additional issue for guaranteeing BIBO
stability of the overall system.
CHAPTER 5
CASE STUDIES
To examine the effectiveness of the proposed multiple controller design
methodologies, chaotic systems with an input term, discrete-time systems, and flight
vehicles are considered under the following assumptions:
Assumption 1: The only state available for measurement is $y_k = x_{1,k}$.
Assumption 2: The nonlinear function $f$ is completely unknown.
By assuming that the function $f$ is unknown, we confront a worst-case (least prior
knowledge) control design. If an estimate $\hat{f}$ of $f$ is available, it can be included in the
control design. As can be expected, the better the estimate $\hat{f}$, the better the performance
of the resulting controller. Our objective is to design multiple controllers for unknown
nonlinear plants that guarantee global stability and force the output, $y_k$, to
asymptotically track the desired signal, i.e., $y_k - r_k \to 0$ as $k \to \infty$, without any a priori
knowledge of the plant.
5.1 Controlled Chaotic Systems
Chaotic systems, which arise in many industrial settings, exhibit irregular, complex,
and unpredictable behavior. The control of chaos has recently received considerable attention
in the nonlinear dynamics literature, since removing chaos can improve system
performance, avoid fatigue failure of the system, and lead to predictable system
behavior. In the literature, several design techniques have been applied for the control
and synchronization of a variety of chaotic systems. A generalized synchronization of
chaos via linear transformation [106], adaptive control and synchronization of chaotic
systems using Lyapunov theory [103,107], nonadaptive and adaptive control systems
based on backstepping design techniques [31,32,99,100], and an adaptive variable
structure control system for the tracking of periodic orbits [108] have been considered, to
name a few. The contribution of this work lies in the design of a multiple control system
for the control of chaos based on the theory of inverse control, PID control, and
sliding mode control [14], assuming that the chaotic system is unknown. Given an
unknown chaotic system, the goal is to force it to set points or a stable trajectory.
5.1.1 The Lorenz System
The Lorenz model describes fluid convection and captures some features of
atmospheric dynamics [73]. The controlled model is given by
$$ \begin{aligned} \dot{x}_1 &= \sigma (x_2 - x_1) \\ \dot{x}_2 &= \rho x_1 - x_2 - x_1 x_3 + u \\ \dot{x}_3 &= x_1 x_2 - \beta x_3 \end{aligned} \tag{5.1} $$
where $x_1$, $x_2$, and $x_3$ represent measurements of fluid velocity and horizontal and vertical
temperature variations, respectively. The Lorenz system can exhibit quite complex
dynamics depending on the parameter values. For $0 < \rho < 1$, the origin is the only
equilibrium point and it is stable. For $1 < \rho < \rho^*(\sigma, \beta) := \sigma(\sigma + \beta + 3)/(\sigma - \beta - 1)$, the system has two
stable equilibrium points $(\pm\sqrt{\beta(\rho - 1)}, \pm\sqrt{\beta(\rho - 1)}, \rho - 1)$ and one unstable equilibrium
point at the origin. For $\rho^*(\sigma, \beta) < \rho$, all three equilibrium points become unstable and
the system trajectory exhibits chaotic behavior [109]. As in the commonly studied case, we
select $\sigma = 10$, $\beta = 8/3$, which leads to $\rho^*(\sigma, \beta) = 24.74$. Thus we consider the
system with $\sigma = 10$, $\beta = 8/3$, and $\rho = 28$, which produces the well-known butterfly
chaotic dynamics without any control, i.e., $u = 0$, as shown in Figure 5-1. Here, the
control force, which is uniformly distributed in $|u| \le 100$, is added to the second
equation of (5.1).
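The data generation described here can be sketched as follows, assuming the standard Lorenz equations with the control input added to the second equation and the parameter values quoted in the text. The function names, the random seed, and the choice of holding the input constant over each integration step are assumptions for illustration, not details given in the source.

```python
import random


def lorenz_rhs(x, u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the controlled Lorenz system (input in 2nd equation)."""
    x1, x2, x3 = x
    return (sigma * (x2 - x1), rho * x1 - x2 - x1 * x3 + u, x1 * x2 - beta * x3)


def rk4_step(x, u, h=0.05):
    """One fourth-order Runge-Kutta step with the input held constant."""
    k1 = lorenz_rhs(x, u)
    k2 = lorenz_rhs(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)), u)
    k3 = lorenz_rhs(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)), u)
    k4 = lorenz_rhs(tuple(xi + h * ki for xi, ki in zip(x, k3)), u)
    return tuple(xi + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))


def rho_star(sigma=10.0, beta=8.0 / 3.0):
    """Critical value above which all three equilibria are unstable."""
    return sigma * (sigma + beta + 3.0) / (sigma - beta - 1.0)


def generate(n, x0=(10.0, 10.0, 10.0), u_max=100.0, h=0.05, seed=0):
    """Return n (input, output) pairs with y = x1 and u uniform in [-u_max, u_max]."""
    rng = random.Random(seed)
    x, data = x0, []
    for _ in range(n):
        u = rng.uniform(-u_max, u_max)
        x = rk4_step(x, u, h)
        data.append((u, x[0]))
    return data
```

Calling `generate(7000)` would produce a record of the size used in the text; only the first state is returned as the output, in line with Assumption 1.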
[Figure 5-1 graphic: phase-space trajectory over x1, x2, x3 and time series versus Time (sec).]
Figure 5-1. The uncontrolled Lorenz system: phase-space trajectory and time-series.
After solving (5.1) with the fourth-order Runge-Kutta method with an integration step of
0.05, 7000 samples were generated for analysis. In order to construct a set of
model-based controllers, we first design a model as
$$ \hat{y}_{k+1} = f(y_k, y_{k-1}, u_k, u_{k-1}, u_{k-2}) \tag{5.2} $$
based on the Lipschitz index [35], as shown in Table 5-1.
Table 5-1. Lipschitz index of the controlled Lorenz system for determining an
embedding dimension.
                              Number of inputs
Number of outputs        1        2        3        4        5
        1             1090.9     49.7      5.7      3.3      2.1
        2               10.5      2.9      2.4      2.1      1.6
        3                5.1      2.7      2.0      1.5      1.3
        4                3.2      2.3      1.9      1.4      1.2
        5                3.0      1.8      1.4      1.4      1.2
[Figure 5-2 graphic: generalization error versus number of PEs of one side of a map (left); learning curve versus epochs (right).]
Figure 5-2. Generalization error v.s. Number of PEs (left) Learning curve (right).
The SOM was trained with the output history vector $\psi_{y,k} = [y_k, y_{k-1}]^T$ over
$L = 6000$ samples with the time-decaying parameters $\eta_k = 0.1/(1 + 0.003k)$ and
$\sigma_k = (\sqrt{N}/2)/(1 + 0.003k)$ in (2.14), and then local linear models were built from $\psi_{y,k}$
as well as $\psi_{u,k} = [u_k, u_{k-1}, u_{k-2}]^T$ for each PE. A newly generated sequence of $M = 1000$
samples was applied to the multiple models for the performance test¹. The dimension of the
square SOM was then determined as 8x8 based on the generalization error shown in Figure
5-2, since the error does not decrease much beyond 8 PEs per side. Also, with $N = 8 \times 8$, the
learning curve² for 4000 epochs is shown in Figure 5-2. This curve reflects the overall
closeness of the winning PE to the input samples during the training process, and it
becomes approximately constant at the end of training.
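The way a trained SOM selects a local linear model at run time can be sketched as below. This is a simplified illustration, not the training procedure of (2.14): the weight vectors and local ARX coefficients are hypothetical, squared Euclidean distance is assumed for winner selection, and N in the neighborhood width is assumed to be the total number of PEs (64 for an 8x8 map).

```python
def learning_rate(k):
    """Step size eta_k = 0.1 / (1 + 0.003 k)."""
    return 0.1 / (1.0 + 0.003 * k)


def neighborhood_width(k, n_pes=64):
    """Width sigma_k = (sqrt(N) / 2) / (1 + 0.003 k), N assumed = total PEs."""
    return (n_pes ** 0.5 / 2.0) / (1.0 + 0.003 * k)


def winner(weights, psi_y):
    """Index of the PE whose weight vector is closest to the output history."""
    return min(range(len(weights)),
               key=lambda i: sum((w - p) ** 2 for w, p in zip(weights[i], psi_y)))


def predict(weights, local_models, psi_y, psi_u):
    """One-step prediction using the local linear model of the winning PE."""
    a, b = local_models[winner(weights, psi_y)]
    return (sum(ai * yi for ai, yi in zip(a, psi_y))
            + sum(bi * ui for bi, ui in zip(b, psi_u)))


# Hypothetical two-PE example: each PE owns coefficients (a, b) of a local model.
weights = [(0.0, 0.0), (1.0, 1.0)]
local_models = [([1.0, 0.0], [0.0, 0.0, 0.0]), ([0.5, 0.5], [1.0, 0.0, 0.0])]
y_hat = predict(weights, local_models, (0.9, 1.2), (2.0, 0.0, 0.0))
```

Because only the output history drives the winner selection, a noisy measurement can pick a neighboring PE, which is the failure mode analyzed in Section 4.3.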
1 Identification performance was evaluated by the NRMSE (normalized root mean squared error):
$\mathrm{NRMSE} = \frac{1}{\max(r)} \sqrt{\frac{1}{M} \sum_{k} (r_{k+1} - \hat{y}_{k+1})^2}$
2 The learning curve is defined as the RMS (root mean squared) distortion between the input and the
winning PE: $\sqrt{\frac{1}{L} \sum_{k} \|\psi_{y,k} - \mathbf{w}_{o,k}\|^2}$
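The NRMSE measure of footnote 1 can be computed with a few lines of code. The sketch follows the footnote literally by normalizing with the maximum of the target sequence, which assumes that this maximum is positive; the function and argument names are illustrative.

```python
import math


def nrmse(target, predicted):
    """Root mean squared error normalized by the maximum of the target."""
    m = len(target)
    mse = sum((r - y) ** 2 for r, y in zip(target, predicted)) / m
    return math.sqrt(mse) / max(target)


# Hypothetical three-sample example: one unit error out of three samples.
example = nrmse([0.0, 2.0, 4.0], [1.0, 2.0, 4.0])
```

If the target sequence can be entirely negative, normalizing by the maximum absolute value would be a safer variant.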
The identification result is shown in Figure 5-3 where the dashed line is the output
of the controlled Lorenz system, and the solid line is the output from the multiple models.
As we can see, the multiple models are a very good approximation of the controlled
Lorenz system. In addition, plant modeling performance with 64 multiple models was
compared with a single linear model, ARX, with the same number of inputs used in local
modeling.
[Figure 5-3 graphic: plant and model outputs versus Time (sec).]
Figure 5-3. Identification of the controlled Lorenz system by multiple models.
The multiple models were also compared with a conventional TDNN, adopted as a global
nonlinear model, which was trained by the backpropagation algorithm with a constant
learning rate of 0.001 for 3000 iterations on the same number of inputs and outputs as in
local linear modeling; the results are listed in Table 5-2. The best result with the
proposed method was an NRMSE of 0.0131, while the TDNN³ obtained an NRMSE of
0.0205. On the other hand, a single linear ARX model produced a much higher NRMSE,
0.0486, than the others. This result shows that the proposed multiple linear modeling
scheme outperforms both the linear and the nonlinear global modeling paradigms.
3 The number of PEs in the hidden layer of the TDNN is chosen as 10 by 20 Monte-Carlo simulations
varying the size of the hidden layer.
Table 5-2. Comparison of modeling performance for the controlled Lorenz system.
Methodology NRMSE
ARX (1) 4.8e-2
Multiple ARX (64) 1.3e-2
TDNN (5:10:1) 2.1e-2
Next, we tested the proposed control scheme using multiple models. First, we let
the system converge to the origin, which is one of the equilibria of the Lorenz system,
starting from the initial state $[x_1, x_2, x_3]^T = [10, 10, 10]^T$ using the Multiple Inverse Controllers
(MIC); the controller is activated at 5 sec, and the results are shown in Figure 5-4(a),
where the closed-loop system response is seen to converge to the origin fairly well. This
task is relatively easy since the zero dynamics of the affine system (5.1) are
asymptotically stable at the equilibrium point. The figure also shows the behavior of the
control input. Second, we forced the controlled Lorenz system to a set point, $y_d = 8$,
which is not one of the equilibria. This is a more complicated task than steering the state
of the system to the origin. The regulation results with the proposed MIC are shown in
Figure 5-4(b), where we observe that the states $x_1$ and $x_2$ asymptotically regulate to
$x_1 = 8$ and $x_2 = 8$, respectively, and the state $x_3$ remains bounded. They illustrate that the
conditions $x_2 = x_1$ and $x_3 = x_1^2/\beta$ described in [108] are satisfied even in the case
under consideration.
[Figure 5-4 graphic: panels (a) and (b), system states and control input versus Time (sec).]
Figure 5-4. Tracking a fixed point reference signal by MIC: (a) yd = 0 (b) yd = 8 .
Afterward, the MIC was investigated while increasing the number of controllers and
compared with a single linear inverse controller built from an ARX model as well as a
nonlinear TDNN controller, which is a global controller trained through the TDNN
model, for the same task. The optimal number of PEs in the hidden layer of the TDNN
controller was selected as 40. Figure 5-5 illustrates how the number of controllers affects
control performance. As expected, a single inverse controller based on an ARX model
showed the worst performance with regard to settling time as well as steady-state error,
even though it reached the set-point first. As seen in the figure, the faster the
rise time, the larger the overshoot, which is unwanted in most cases. A smaller
overshoot, faster settling time, and smaller steady-state errors were
obtained by using a larger number of inverse controllers, such as the 16-MIC, which
showed a very similar performance to the TDNN controller. Moreover, the response
using the 64-MIC demonstrated very small steady-state errors. The 144-MIC, however,
reached the set-point much more slowly than the others. It also took much longer to settle to
the desired point. Too many divisions of the state space may cause poor control performance
(frequent switching among controllers), so that the controller may not be capable of
following a fast-changing trajectory.
[Figure 5-5 graphic: step responses; legend: TDNN, Set-point, MIC (1), MIC (16), MIC (64), MIC (144).]
Figure 5-5. Comparison of control performance varying the number of inverse controllers
based on multiple models.
Finally, the proposed multiple control schemes, especially the MIC, were compared
with a linear controller and a nonlinear controller. The PID controller coefficients were
determined to bring the poles of the closed-loop response from the plant output to the
desired output to 0, 0, $0.55 + i0.31$, and $0.55 - i0.31$. The QSM controller was designed by
choosing the switching surface with $c = [1, -3]^T$ and $qT = 0.8$, $rT = 0.001$. The performance
obtained depends on the selection of the coefficients for the PID and QSM controllers.
This issue is discussed in later sections. The step performance
of the designed multiple-model-based control schemes and the TDNN controller in closed-loop
operation with the controlled Lorenz system is provided for the two case studies,
$y_d = 0$ and $y_d = 8$, in Figure 5-6.
[Figure 5-6 graphic: two panels of step responses versus Time (sec); legend: TDNNC, IC-ARX, MIC, MPIDC, MQSMC.]
Figure 5-6. Comparison of tracking performance by multiple model based controllers
(MIC, MPIDC, MQSMC) and global inverse controllers (IC-ARX,
TDNNC).
We can observe from these results that the overshoots of the Inverse Controller based on
the ARX model (IC-ARX) are much larger than the others, which also results in a much
longer settling time. Moreover, although IC-ARX and TDNNC demonstrated very short
reaching times, they exhibited relatively larger steady-state errors than the others. In
contrast, the multiple inverse controller showed a much faster reaching time than
MPIDC and MQSMC, close to that of IC-ARX and TDNNC, together with small steady-state
errors. These comparisons of IC-ARX, TDNNC, and MIC seem
reasonable since they all share an inverse control framework. From these results, it can be
inferred that the proposed control strategy drives the system to the desired set-point
with a fast response, even though the plant with
highly nonlinear characteristics is not known a priori to the controller.
5.1.2 The Duffing Oscillator
Another system considered is the Duffing oscillator,
which displays chaotic behavior, described by
$$ \begin{aligned} \dot{x}_1 &= x_2 \\ \dot{x}_2 &= u - p_1 x_2 - p_2 x_1 - p_3 x_1^3 + p_4 \cos \omega t \end{aligned} \tag{5.3} $$
where $\omega$ is a constant frequency parameter and $p_1$, $p_2$, $p_3$, and $p_4$ are constant parameters
[10,32,34,35].
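A minimal sketch of the controlled Duffing oscillator is given below, assuming the standard form in which the input enters the second state equation additively. The default parameters and the step size of 0.2 are the values quoted in this section; the function names and the constant-input-per-step convention are assumptions for illustration.

```python
import math

P = (0.4, -1.1, 1.0, 1.8)  # [p1, p2, p3, p4] as quoted in the text
OMEGA = 1.8                # forcing frequency as quoted in the text


def duffing_rhs(t, x, u, p=P, omega=OMEGA):
    """Right-hand side of the controlled Duffing oscillator."""
    x1, x2 = x
    p1, p2, p3, p4 = p
    return (x2, u - p1 * x2 - p2 * x1 - p3 * x1 * x1 * x1 + p4 * math.cos(omega * t))


def rk4_step(t, x, u, h=0.2):
    """One fourth-order Runge-Kutta step with the input held constant."""
    k1 = duffing_rhs(t, x, u)
    k2 = duffing_rhs(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k1)), u)
    k3 = duffing_rhs(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k2)), u)
    k4 = duffing_rhs(t + h, tuple(xi + h * ki for xi, ki in zip(x, k3)), u)
    return tuple(xi + h / 6 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))


# One uncontrolled step from rest, for illustration.
x_after = rk4_step(0.0, (0.0, 0.0), 0.0)
```

Passing different values of `p` and `omega` gives the mismatched master oscillator used in the synchronization tasks.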
[Figure 5-7 graphic: phase-space trajectory over x1 and time series versus Time (sec).]
Figure 5-7. The uncontrolled Duffing oscillator: phase-space trajectory and time-series.
We assume that the controlled Duffing oscillator is originally ($u = 0$) in the chaotic state
shown in Figure 5-7, with parameters $\omega = 1.8$ and $[p_1, p_2, p_3, p_4]^T = [0.4, -1.1, 1.0, 1.8]^T$
[32]. In the simulations, the Duffing system was treated as unknown; only time-series
data were generated, with the system excited by a uniformly distributed control input, $|u| \le 5$, via the
fourth-order Runge-Kutta scheme with a fixed time step of 0.2. The embedding
dimension for model construction was chosen as $d_y = 1$ and $d_u = 1$ based on the Lipschitz
index in Figure 5-8. Thus the model is designed as $\hat{y}_{k+1} = f(y_k, y_{k-1}, u_k, u_{k-1})$. Another
parameter to be selected is the size of the SOM, which is trained with $\psi_{y,k} = [y_k, y_{k-1}]^T$.
We chose the number of PEs in the SOM as 8x8 based on the performance obtained while
varying the size of the map, shown in Figure 5-8.
[Figure 5-8 graphic: Lipschitz index versus number of outputs and number of inputs (left); generalization error versus number of PEs of one side of a map (right).]
Figure 5-8. Lipschitz index (left) for the determination of optimal number of inputs and
outputs and Generalization error v.s. Number of PEs (right).
Table 5-3. Comparison of modeling performance for the controlled Duffing oscillator.
Methodology NRMSE
ARX (1) 3.4e-2
Multiple ARX (64) 1.3e-2
TDNN (4:14:1) 1.1e-2
The identification result with the 8x8 square map and its performance comparison with
a global model, the TDNN⁴, are presented in Figure 5-9. In addition, a comparison
with a single linear model is listed in Table 5-3. We observe that the proposed multiple
modeling strategy performs slightly worse than the TDNN, although it demonstrates much better
modeling performance than a single ARX model. The proposed modeling method,
however, has a much simpler structure for modeling chaotic systems than a global
modeling paradigm.
4 A TDNN with 14 PEs in the hidden layer demonstrated the best performance in modeling the Duffing
oscillator.
[Figure 5-9 graphic: two panels of plant and model outputs versus Time (sec).]
Figure 5-9. Identification of the controlled Duffing oscillator by TDNN (left) and
multiple-models (right).
Next, we performed simulations for 3 different control tasks.
1) Control of a Duffing oscillator: For this task, we target the oscillator to follow an
arbitrary trajectory generated by a random control input bounded by 5.
2) Synchronization of two Duffing oscillators: This is to show that the proposed control
scheme is able to synchronize oscillators effectively in spite of model mismatch; the
parameters for the slave oscillator are $p_1 = 0.4$, $p_4 = 1.8$, and $\omega = 1.8$, whereas the
parameters for the master oscillator are $p_1 = 0.41$, $p_4 = 2$, and $\omega = 1.9$.
3) Synchronization of two strictly different second-order oscillators: In this case,
the slave oscillator is a Duffing oscillator with parameters
$p_1 = 0.4$, $p_4 = 1.8$, and $\omega = 1.8$. The master oscillator is a van der Pol oscillator
described by $\dot{x}_1 = x_2$, $\dot{x}_2 = -0.1(x_1^2 - 1)x_2 - x_1^3 + 1.75\cos(0.667t)$.
For these missions, the controllers were designed as follows. The optimal number
of PEs in the hidden layer of the TDNNC was selected as 30 by 20 Monte-Carlo
simulations. The MQSMC was built by choosing the sliding hyperplane with
$c = [1, -15]^T$ and $qT = 0.8$, $rT = 0.001$. The MPIDC was designed by placing the poles
of the closed-loop response at $0.25 \pm i0.25$, which demonstrated the fastest convergence
to the desired trajectory in Figure 5-10.
[Figure 5-10 graphic: panels (a) to (d), tracking responses versus Time (sec).]
Figure 5-10. Performance comparison on trajectory tracking by TDNNC, PID-ARX,
and MPIDC when the poles of the closed-loop response are placed at (a) 0.9
(b) 0.5 ± 0.5i (c) 0.25 ± 0.25i (d) 0.05 ± 0.05i.
From these figures, it should be noted that the MPIDC produces a shorter settling time than
a single PIDC regardless of where the poles are placed. In addition, as the poles
get closer to either the unit circle or the origin, the convergence time gets longer.
Hence, the poles of the error dynamics should be chosen carefully.
We compared the performance of the controllers on a 50-sec-long oscillatory
trajectory with respect to settling time and NRMS of steady-state error (NRMS-SSE) in Table
5-4: the settling time was taken as the time at which the tracking error became bounded within ±0.1, and the
NRMS-SSE was calculated using the tracking error from 30 sec to 50 sec. As seen in the
table, the proposed control strategies outperformed a global controller, which is generally
utilized for unknown system control, in both speed of response and accuracy. In particular,
while the TDNNC demonstrated some difficulty in following the desired path generated
by different dynamics, the van der Pol oscillator, the multiple controllers
accomplished the mission relatively well.
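The two performance measures can be computed as in the following sketch. The source does not state how the steady-state error is normalized, so dividing the RMS error by the largest absolute value of the desired signal in the evaluation window is an assumption, as are the function names.

```python
import math


def settling_time(t, err, band=0.1):
    """Earliest time after which |err| stays within the band until the end.

    Returns None if the error is still outside the band at the last sample.
    """
    last_violation = -1
    for i, e in enumerate(err):
        if abs(e) > band:
            last_violation = i
    if last_violation + 1 >= len(err):
        return None
    return t[last_violation + 1]


def nrms_sse(t, err, desired, t_start=30.0, t_end=50.0):
    """RMS tracking error over [t_start, t_end], normalized (assumption) by
    the largest absolute desired value in the same window."""
    window = [(e, d) for ti, e, d in zip(t, err, desired) if t_start <= ti <= t_end]
    rms = math.sqrt(sum(e * e for e, _ in window) / len(window))
    return rms / max(abs(d) for _, d in window)
```

Measuring the settling time from the last violation of the band, rather than the first entry into it, avoids counting a response that briefly passes through the band during an overshoot.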
Table 5-4. Comparison of tracking performance for 3 different control tasks: settling
time and NRMS-SSE.
Methodology    Measure          Task 1     Task 2     Task 3
               Settling time    6.0sec     3.6sec     10.6sec
               NRMS-SSE         2.8e-2     3.2e-2     6.8e-2
               Settling time    5.0sec     3.6sec     7.8sec
               NRMS-SSE         2.3e-2     2.2e-2     2.7e-2
               Settling time    3.6sec     1.6sec     6.2sec
               NRMS-SSE         1.9e-2     1.9e-2     1.8e-2
MQSMC          Settling time    4.2sec     2.4sec     5.2sec
               NRMS-SSE         2.7e-2     2.6e-2     3.2e-2
Moreover, the MPIDC among the proposed multiple control methodologies showed a
significant reduction of the transient time as well as accuracy improvement in most
missions. Figure 5-11 shows the synchronization of the master and slave oscillator
signals by the TDNNC and MPIDC where the controller is activated at t = 8 sec. It is
seen from the results again that the transient time is shortened without resulting in
overshoots using the MPIDC.
[Figure 5-11 graphic: panels (a) to (c), oscillator output and desired trajectory versus Time (sec).]
Figure 5-11. Control performance by TDNNC (left) and MPIDC (right): The dotted
line is the desired trajectory and the solid line is the output of the
oscillator. (a) Tracking an oscillatory reference signal (b) Synchronization
of two Duffing oscillators (c) Synchronization of a Duffing oscillator and
a van der Pol oscillator.
5.2 Nonlinear Discrete-Time Systems
We have seen how effectively the PID and the inverse controller can be
implemented using multiple models. In this section, nonlinear discrete time systems in a
noisy environment are considered focusing more on multiple model-based control with
sliding mode.
5.2.1 A First-order Plant
Consider the following nonlinear discrete-time plant [5]
,k+l "2,k
x2,k+1 = l+ )2 +x2+uk (5.4)
16 1+ (x2,k 2
Yk = X2,k + Zk
where $u_k$ is the input and $z_k$ is an external disturbance. In (5.4), we considered a SISO
model, assuming that only the output $y_k$ is available for measurement. The output time
series was created by exciting the plant with an input signal uniformly distributed in $|u| \le 0.5$.
[Figure 5-12 graphic: (a) Lipschitz index versus number of outputs and number of inputs; (b) identification performance versus map size.]
Figure 5-12. Parameter selection to design multiple models: (a) Lipschitz index for
determining the embedding dimension (b) Identification performance v.s.
network dimension on independently generated test data for choosing the
size of a map.
The model was assumed to be second order in input and output based on Figure
5-12(a). After quantization of the embedded output space $\psi_y$, a set of models was built
with the input-output data samples for each PE. For testing, 400 independently generated
data samples were used while varying the size of the map. The best size of the map was
determined as 8x8 since the performance did not improve much after 64 PEs (see Figure
5-12(b)). Thus plant identification with 64 multiple models (8x8) was tested in the
absence of sensor noise as well as in the presence of sensor noise with the plant input
signal being uniformly distributed. The result of system identification in the absence of
sensor noise by multiple models is shown in Figure 5-13(a). As we can see, the models
provide a very good approximation of the plant visually based on the error signals.
[Figure 5-13 graphic: (a) plant output, model output, and error versus iterations; (b) modeling error versus SNR for the TDNN and the multiple models.]
Figure 5-13. Modeling performance using 64 multiple models for a nonlinear first order
plant: (a) System identification of the nonlinear plant in the absence of
disturbance by the proposed multiple models. (b) Comparison of robustness
against noise between TDNN and multiple models.
Also, the proposed multiple modeling scheme was compared with a conventional
TDNN that was trained through the backpropagation algorithm with 5000 samples and a
constant learning rate of 0.005. The modeling result with 64 multiple models was an NRMSE
of around 6.5e-4, while the TDNN⁵ obtained an NRMSE of about 6.3e-3. The
performance when the testing data are perturbed by noise is shown in Figure 5-13(b),
where we observe that the SOM is more likely to select the wrong model as the noise
level is increased. Even though the multiple models were more accurate in
identification than a global model in the absence of noise, it should be pointed out that
the multiple modeling strategy is no longer more noise-immune than global models beyond
a certain noise level. Up to that noise level, however, the proposed method is more robust
than a global model.
[Figure 5-14 graphic: plant responses versus iterations for several values of rT (a) and qT (b).]
Figure 5-14. Responses for parameter selection to design QSMC by varying (a) rT and
(b) qT.
Based on the 64 multiple models and the identified TDNN model, we designed
multiple controllers and a TDNN controller, respectively. The TDNN controller was
trained by back-propagating the error through the TDNN model, which has 20 hidden PEs.
The number of hidden PEs in the controller was chosen as 40. The MPIDC was designed
by placing the poles of the closed-loop response at $0.1 \pm i0.3$. The MQSMC was built by
choosing the sliding hyperplane with $c = [1, -2]^T$ and $qT = 0.8$, $rT = 0.01$, after comparing the
parameters as in Figure 5-14. In the figures, we observe that the larger the value of $rT$,
the faster the plant reaches the reference signal. In addition, the smaller the value of
$qT$, the narrower the sliding mode band.
5 The number of PEs in the hidden layer of the TDNN is chosen as 20 by 20 Monte-Carlo simulations
varying the size of the hidden layer.
First, the performance of the controllers was tested on square-wave [1,-1.5, 1, 0]
tracking in the absence of sensor noise and the results are presented in Figure 5-15 where
we can see that all controllers showed very good performance on tracking the reference
signal. Specifically, the MPIDC is the most accurate controller in spite of very slow
convergence, and the MIC is the fastest one even though it shows a little steady-state
error. In contrast, the MQSMC demonstrated the worst performance among the 4
controllers regarding rising time and steady-state error, but it shows its superiority in the
presence of noise later.
[Figure 5-15 graphic: tracking responses versus iterations; legend: TDNNC, MIC, MPIDC, MQSMC.]
Figure 5-15. Comparison of tracking performance using a global controller and
multiple model based controllers in the absence of sensor noise. The figure
(right) is an enlargement of the figure (left) between 34 and 52 iterations.
Figure 5-16 shows the plant responses of the closed-loop control system using the
MQSMC. The trajectories are seen to converge to the desired values of [-1.5 -1.0 -0.5 0.5
1.0 1.5]. The figure also shows the control input, the sliding surface, and the winner activities
switched automatically by the SOM. It can be easily seen that the proposed MQSMC
scheme guarantees the convergence of the system to the quasi-sliding-mode band around
the sliding hyperplane $s_k = 0$.
[Figure 5-16 graphic: plant response, control input, sliding surface, and winning PE versus iterations.]
Figure 5-16. Performance of square-wave tracking in the absence of noise by the
MQSMC.
Next, the robustness of the proposed control scheme was compared with that of a
global controller using TDNN. The standard deviation of the error between the plant
output and the desired output versus the standard deviation of the noise is shown in
Figure 5-17. It is evident that the MQSMC performs best in terms of insensitivity to
disturbances. The MIC structure showed the best performance only in the noise-free
environment. It should be noted that the MIC and the MPIDC began to become less
robust than the TDNNC at the point where the standard deviation of the noise is over
0.04. From this examination we can conclude that the MIC and the MPIDC can be robust
against noise up to certain level. However, wrong selection of the winner, as the amount
of noise is increased, can be devastating for the controller that is designed based on the
predicted model.
[Figure 5-17 graphic: standard deviation of the tracking error for TDNNC, MIC, MPIDC, and MQSMC versus standard deviation of noise (0.000, 0.003, 0.011, 0.023, 0.046, 0.070, 0.093, 0.115).]
Figure 5-17. Comparison of performance against noise among TDNNC, MIC, MPIDC
and MQSMC.
Furthermore we tested the closed loop system for tracking a sinusoidal and an
arbitrary desired output. Once again, the multiple controller networks perfectly track the
desired command except for a transient time of a few time steps, as shown in Figure 5-18,
even if the measurement is corrupted by zero-mean random noise with 20dB of SNR.
Overall, we conclude that the proposed MQSMC approach is the most robust
design technique among the four methods considered. This is evident from Figure 5-17,
where we observe that on average the tracking error of the MQSMC increases at a lower
rate than that of the MIC, the MPIDC, and the TDNNC.
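For reference, measurement noise at a prescribed SNR, such as the 20 dB used above, can be generated as sketched below. The SNR is assumed here to be the ratio of average signal power to noise variance, and the noise is assumed Gaussian; the source only states that it is zero-mean and random. Function names and the seed are illustrative.

```python
import math
import random


def noise_std_for_snr(signal, snr_db):
    """Noise standard deviation giving the requested SNR in dB."""
    power = sum(v * v for v in signal) / len(signal)
    return math.sqrt(power / 10 ** (snr_db / 10.0))


def add_noise(signal, snr_db, seed=0):
    """Return the signal corrupted by zero-mean Gaussian noise (assumption)."""
    rng = random.Random(seed)
    std = noise_std_for_snr(signal, snr_db)
    return [v + rng.gauss(0.0, std) for v in signal]


# A unit-power square wave at 20 dB SNR needs noise of standard deviation 0.1.
std_20db = noise_std_for_snr([1.0, -1.0, 1.0, -1.0], 20.0)
```

With this definition, each 20 dB reduction in SNR multiplies the noise standard deviation by ten.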
[Figure 5-18 graphic: panels (a) and (b), plant output and desired signal versus iterations.]
Figure 5-18. Sinusoidal and arbitrary signal tracking by the MQSMC: (a) in the absence
of sensor noise (b) in the presence of sensor noise, SNR = 20 dB.
5.2.2 Liquid-level Plant
A liquid-level system described by the following second-order equation has been
considered:
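$$
\begin{aligned}
y_{k+1} ={}& 0.9722\,y_k + 0.3578\,u_k - 0.1295\,u_{k-1} - 0.3103\,y_k u_k \\
& - 0.04228\,y_{k-1}^2 + 0.1663\,y_{k-1} u_{k-1} - 0.03259\,y_k^2 y_{k-1} \\
& - 0.03513\,y_k^2 u_{k-1} + 0.3084\,y_k y_{k-1} u_{k-1} \\
& + 0.1087\,y_{k-1} u_k u_{k-1}
\end{aligned}
$$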
This model is obtained through identification of a laboratory-scale liquid-level system
[80,1]. In [80], this model has been used to illustrate theoretical developments for direct
adaptive control. For analysis, 10,000 data samples were generated using a uniformly
distributed control effort with $|u_k| < 1$. Given the prior information on the order of
the plant provided by the Lipschitz index in Table 5-5, a second-order input-output
model described by the following equation was chosen to identify the plant:
$$y_{k+1} = f(y_k, y_{k-1}, u_k, u_{k-1}) \qquad (5.6)$$
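The data-generation step can be sketched in Python as follows. This is a minimal illustration, not the original experimental code: the function names, the zero initial conditions, and the random seed are hypothetical, and the coefficients are taken as printed in the text, with the exponents on the squared terms and the sign of the 0.03259 term assumed where the printed equation is unclear. The sketch simulates the plant with a uniformly distributed input and arranges the samples in the regressor form of Equation 5.6.

```python
import numpy as np

def liquid_level_step(y_k, y_km1, u_k, u_km1):
    """One step of the liquid-level plant; coefficients as printed in the text."""
    return (0.9722 * y_k + 0.3578 * u_k - 0.1295 * u_km1
            - 0.3103 * y_k * u_k
            - 0.04228 * y_km1 ** 2
            + 0.1663 * y_km1 * u_km1
            - 0.03259 * y_k ** 2 * y_km1
            - 0.03513 * y_k ** 2 * u_km1
            + 0.3084 * y_k * y_km1 * u_km1
            + 0.1087 * y_km1 * u_k * u_km1)

def generate_data(n_samples, rng):
    """Return regressors [y_k, y_{k-1}, u_k, u_{k-1}] and targets y_{k+1}."""
    u = rng.uniform(-1.0, 1.0, size=n_samples + 1)   # uniform input, |u| < 1
    y = np.zeros(n_samples + 2)                      # zero initial conditions (assumed)
    for k in range(1, n_samples + 1):
        y[k + 1] = liquid_level_step(y[k], y[k - 1], u[k], u[k - 1])
    ks = np.arange(1, n_samples + 1)
    regressors = np.column_stack([y[ks], y[ks - 1], u[ks], u[ks - 1]])
    return regressors, y[ks + 1]

X, target = generate_data(1000, np.random.default_rng(0))
```

Each row of `X` is one input to the model f in Equation 5.6, and the matching entry of `target` is the plant output one step later.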
Table 5-5. Lipschitz index of a laboratory-scale liquid-level plant for determining an
embedding dimension.
Number of outputs
Number of inputs
262.34
13.49
1.46
1.19
1.19
10.86
1.32
1.16
1.15
1.15
1.80
1.18
1.13
1.10
1.09
1.64
1.14
1.13
1.10
1.09
1.46
1.13
1.12
1.09
1.09
[Figure 5-19 plots: (a) x-axis: number of PEs of one side of a map; (b) x-axis: iterations.]
Figure 5-19. Modeling performance using multiple models for a liquid-level plant: (a)
Identification performance vs. network dimension for choosing the size of a
map of the liquid-level plant (b) Identification of a liquid-level plant using
12x12 multiple models.
The embedded output vector was used for SOM training over 6,000 epochs, and then
both the embedded output vector and the embedded input vector were used to create
the local linear models. In order to test the performance of the proposed modeling
scheme, we applied 400 newly generated data samples to the multiple models.
Figure 5-19 depicts the identification performance as a function of the size of the
network and shows how well the multiple models approximate the liquid-level plant
using 144 models.
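The two-stage procedure described above (cluster on the embedded outputs, then fit one linear model per cluster from outputs and inputs) can be sketched in Python as follows. This is only an illustration under stated assumptions: the codebook is taken as given rather than trained by a SOM, the function names are hypothetical, each local model is fitted by ordinary least squares, and the bias term is an addition made here that the text does not mention.

```python
import numpy as np

def winners(codebook, y_embed):
    """Index of the nearest codebook vector for each embedded output vector."""
    d = ((y_embed[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def design_matrix(y_embed, u_embed):
    """Stack outputs, inputs, and a bias column (the bias is an assumption)."""
    return np.hstack([y_embed, u_embed, np.ones((len(y_embed), 1))])

def fit_local_models(codebook, y_embed, u_embed, target):
    """Fit one least-squares linear model per codebook vector."""
    X = design_matrix(y_embed, u_embed)
    win = winners(codebook, y_embed)
    models = np.zeros((len(codebook), X.shape[1]))
    for i in range(len(codebook)):
        mask = win == i
        if mask.sum() >= X.shape[1]:          # skip nodes with too few samples
            models[i], *_ = np.linalg.lstsq(X[mask], target[mask], rcond=None)
    return models

def predict(codebook, models, y_embed, u_embed):
    """One-step prediction using the model attached to the winning node."""
    X = design_matrix(y_embed, u_embed)
    return (X * models[winners(codebook, y_embed)]).sum(axis=1)
```

Only the embedded outputs select the winner, while the regression uses both outputs and inputs, which mirrors the division of roles described in the paragraph above.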
In addition, we compared the identification performance of the multiple models
with that of a TDNN. The number of inputs and outputs of the network was the same as
in local linear modeling. A single hidden layer of 30 PEs was found to be large enough
for good identification performance, based on 20 Monte Carlo simulations in which the
size of the hidden layer was varied. As seen in Table 5-6, the proposed multiple models
slightly outperformed the nonlinear model in identifying the liquid-level plant (NRMSE
of 3.5e-3 versus 4.3e-3). The table also clearly shows the benefit of using multiple
models rather than a single ARX model.
Table 5-6. Comparison of modeling performance for the liquid-level plant.
Methodology                NRMSE
ARX (1)                    29.2e-3
Multiple Models (12x12)    3.5e-3
TDNN (4:30:1)              4.3e-3
A typical open-loop input-output characteristic of the plant is shown in Figure 5-20.
The large variations in the steady-state gain and time constant with the operating point
are clearly visible in this figure. The performance of the proposed multiple-model-based
controllers was compared to that of a nonlinear TDNN controller designed from the
previously identified TDNN model. The controllers for this plant were designed as
follows. First, the optimal number of PEs in the hidden layer of the TDNN controller
was chosen as 50. Second, the PID controller was designed so that the closed-loop poles
are located at 0.7702 and 0.1298. Third, the sliding hyperplane of the QSM controller
was set to $[c_1, c_2]^T = [1, -1.66]^T$ based on Figure 5-21, which illustrates the effect
of the parameter selection for the sliding surface design under different noise levels.
[Figure 5-20 plot: plant output and input versus iterations (0 to 900).]
Figure 5-20. Typical input-output characteristic of the second-order liquid-level plant.
[Figure 5-21 plot: x-axis: |c1/c2|; curves for SNR = 40, 35, 30, 25, and 20 dB.]
Figure 5-21. Square-wave tracking performance of the liquid-level plant varying the
sliding surface and the noise level by the MQSMC.
In most cases, the controller showed better tracking performance as the pole,
$-c_1/c_2$, was placed closer to the origin inside the unit circle. For instance, the plot
indicates that the pole should be chosen between 0.5 and 0.6 for robustness against
noise at an SNR of 25 dB, since the error changes too slowly above 0.8 and very fast
below 0.5. Thus, the switching surface was chosen as
$s_k = [1, -1.66]\,[e_{k-1}, e_k]^T$ in order to obtain both a small error and a
sufficiently short transient time.
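The link between the surface parameters and the pole can be checked with a short Python sketch. It assumes that the surface has the form s_k = c1 e_{k-1} + c2 e_k, so that staying on s_k = 0 gives e_k = -(c1/c2) e_{k-1}; the function names are hypothetical and the sketch ignores the reaching phase and any noise.

```python
def sliding_surface_pole(c1, c2):
    """Pole of e_k = -(c1/c2) * e_{k-1}, implied by c1*e_{k-1} + c2*e_k = 0."""
    return -c1 / c2

def error_on_surface(e0, c1, c2, n_steps):
    """Error sequence obtained while the state stays on the surface s_k = 0."""
    pole = sliding_surface_pole(c1, c2)
    errors = [e0]
    for _ in range(n_steps):
        errors.append(pole * errors[-1])
    return errors

pole = sliding_surface_pole(1.0, -1.66)         # about 0.602
errors = error_on_surface(1.0, 1.0, -1.66, 10)  # geometric decay of the error
```

With c1 = 1 and c2 = -1.66 the pole lies at about 0.602, just above the 0.5 to 0.6 range named in the text, so the error on the surface shrinks by roughly 40 percent per step.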
[Figure 5-22 plots, panels (a) to (d): x-axis: iterations (0 to 300).]
Figure 5-22. Control of the liquid-level plant by the MQSMC varying the number of
controllers: (a) M = 1 (b) M = 16 (c) M = 36 (d) M = 144.
The control systems were tested for a typical square-wave set-point of amplitude
0.5. Figure 5-22 compares the tracking performance of the MQSMC for different map
sizes. The reference signal is the output of a first-order reference model with transfer
function $G_d(z) = 0.2z/(z - 0.8)$ driven by the set-point. The figure illustrates how the
control performance is affected by the number of QSM controllers: as the number of
controllers is increased, the plant reaches the set-point with smaller steady-state
errors. Although there seems to be no big difference between Figure 5-22(c) and
Figure 5-22(d), the larger number of controllers may work better in a noisy
environment. This will be discussed again later in this section. Additionally, a
comparison of this figure with Figure 5-20, which displays the open-loop response,
establishes the efficacy of the proposed control scheme.
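How the reference signal is generated from the set-point can be sketched in Python. The sketch assumes the reference model is G_d(z) = 0.2z/(z - 0.8), which has unity DC gain and corresponds to the difference equation r_k = 0.8 r_{k-1} + 0.2 s_k. The function name, the zero initial condition, the symmetric square wave, and its period are hypothetical, since the text states only the amplitude of 0.5.

```python
import numpy as np

def reference_model(set_point, a=0.8, b=0.2):
    """Filter the set-point through b*z/(z - a): r_k = a*r_{k-1} + b*s_k."""
    r = np.zeros(len(set_point))
    prev = 0.0                        # zero initial condition (assumed)
    for k, s in enumerate(set_point):
        prev = a * prev + b * s
        r[k] = prev
    return r

k = np.arange(300)
half_period = 75                      # hypothetical; the period is not stated
square = 0.5 * np.where((k // half_period) % 2 == 0, 1.0, -1.0)
reference = reference_model(square)   # smoothed trajectory the plant should follow
```

Because the filter has unity DC gain, the reference settles at the set-point value after each switch, with a time constant set by the pole at 0.8.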
Table 5-7. Comparison of control performance for the liquid-level plant in noise-free
environment.
Methodology        NRMSE
TDNNC (4:50:1)     5.7e-3
MIC (12x12)        1.4e-3
MPIDC (12x12)      10.9e-3
MQSMC (12x12)      5.0e-3
Table 5-8. Comparison of control performance for the liquid-level plant in the presence
of sensor noise: standard deviation of noise is 4.5e-2.
Methodology        Standard deviation of error
TDNNC (4:50:1)     3.8e-2
MIC (12x12)        4.2e-2
MPIDC (12x12)      10.2e-2
MQSMC (12x12)      1.9e-2
Table 5-7 compares the tracking performance of the controllers; the MIC and the
MQSMC outperformed the TDNNC. Although the MPIDC showed the worst overall
performance, it showed the smallest steady-state errors among them. It only exhibited
large errors in the transient phase, since it was not fast enough to track the trajectory.
In addition, the controllers were tested on the same tracking trajectory in the presence
of disturbances whose standard deviation is 4.5e-2. The results are listed in Table 5-8.
[Figure 5-23 plots, panels (a) to (d): x-axis: iterations (0 to 300).]
Figure 5-23. Control of the liquid-level system with measurement noise by the
MQSMC with (a) M = 1 (b) M = 16 (c) M = 144 and (d) the TDNNC.
As in the previous result for controlling the first-order plant, the MQSMC showed
the best robustness against measurement noise, and the MPIDC again showed the worst
performance due to its slow convergence. Figure 5-23 compares the response of a
nonlinear controller with those of the MQSMCs with different numbers of controllers in
a noisy environment. As we can see, the proposed MQSMC is capable of rejecting
measurement noise and following the trajectory with very little variation. Also, the
MQSMC with M = 144 was more stable than the smaller structures. The nonlinear
controller, the TDNNC, tracked the path but showed relatively large chattering on the
trajectory, and it also spent much more control effort than the MQSMC did to
accomplish the same task.
[Figure 5-24 plots: plant output and desired output versus iterations (0 to 300), two panels.]
Figure 5-24. Tracking an oscillatory reference signal of the liquid-level plant by the
TDNNC (left) and the MQSMC (right) in the presence of sensor noise.
[Figure 5-25 diagram: axis label: increasing closed-loop variance.]
Figure 5-25. Performance assessment on a trajectory tracking under noisy environment.
Finally, we evaluated the tracking of an oscillatory reference signal by the TDNNC
and the MQSMC, as depicted in Figure 5-24. The reference signal to be followed was
$0.2\,(0.8\sin(0.15k + \pi/2) + 1.2\cos(0.1k))$. We observed that the MQSMC tracked
the trajectory more smoothly than the TDNNC did. In summary, the proposed MQSMC
is the best of the multiple-model-based controllers in a noisy environment for both
trajectory tracking and set-point tracking, as illustrated in Figure 5-25. On the other
hand, the MPIDC is not suitable for a trajectory tracking problem that requires a fast
transient response while coping with measurement noise, even though it is the best in
set-point tracking in the absence of noise.
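For reproducing the oscillatory tracking test, the reference trajectory can be generated with a few lines of Python. This is a minimal sketch: the function name and the 300-sample horizon are hypothetical, and the phase term is read as pi/2, which is an assumption about the printed expression.

```python
import numpy as np

def oscillatory_reference(k):
    """Reference 0.2*(0.8*sin(0.15k + pi/2) + 1.2*cos(0.1k)); pi/2 phase assumed."""
    k = np.asarray(k, dtype=float)
    return 0.2 * (0.8 * np.sin(0.15 * k + np.pi / 2) + 1.2 * np.cos(0.1 * k))

reference = oscillatory_reference(np.arange(300))
```

The two components have different frequencies, so the reference never settles, which is what makes this test demanding for a controller with a slow transient.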
5.3 Flight Vehicles
Flight vehicles, such as missiles and aircraft, are very complex systems that are
typically non-minimum phase and have aerodynamic coefficients which vary over a wide
dynamic range due to large Mach-altitude fluctuations [43,47]. Control of
high-performance, low-cost UAVs in particular involves the problems of incomplete
measurements, external disturbances, and modeling uncertainties. Nevertheless, the
autopilot for these vehicles is often required to achieve very stringent performance
objectives. In this section, the proposed control algorithms are therefore applied to
missiles and UAVs to show their effectiveness.
5.3.1 Missile Dynamics
We consider a benchmark missile model used widely in the literature [50,104].
This benchmark model can be formulated as
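$$
\begin{bmatrix} \dot{\alpha} \\ \dot{q} \end{bmatrix}
= \begin{bmatrix} g_1 & 1 \\ g_2 & 0 \end{bmatrix}
\begin{bmatrix} \alpha \\ q \end{bmatrix}
+ \begin{bmatrix} g_3 \\ g_4 \end{bmatrix} \delta \qquad (5.7)
$$
and
$$
\begin{aligned}
g_1 &= [0.0001\alpha^2 - 0.0112|\alpha| - 0.2010(2 - M/3)]\,M\cos(\alpha) \\
g_2 &= [0.0152\alpha^2 - 1.3765|\alpha| + 3.6001(-7 + 8M/3)]\,M^2 \\
g_3 &= -0.0403\,M\cos(\alpha) \\
g_4 &= -14.542\,M^2
\end{aligned}
$$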
where $\alpha$ is the angle of attack (deg), $q$ is the pitch rate (deg/s), $M$ is the Mach
number, and $\delta$ is the tail deflection angle (deg). We considered an operating range
of $M = 3$ and $\alpha \in [-8, 8]$ deg. The model was formulated as
$y_{k+1} = f(y_k, y_{k-1}, u_k, u_{k-1})$, where $y_k$ is the angle of attack (AOA) to be
controlled.
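The benchmark dynamics can be sketched in Python under the following assumptions, none of which are stated explicitly in the text: the state equation is read as alpha_dot = g1*alpha + q + g3*delta and q_dot = g2*alpha + g4*delta, the polynomial terms use the angle of attack in degrees with absolute values on the linear terms, cos() acts on that angle converted to radians, and a forward-Euler step with a hypothetical step size is used for integration. All function names are illustrative.

```python
import math

def missile_coefficients(alpha_deg, mach):
    """Return (g1, g2, g3, g4) for an angle of attack in degrees."""
    a = alpha_deg
    cos_a = math.cos(math.radians(a))   # assumes cos() is taken of the angle in degrees
    g1 = (0.0001 * a**2 - 0.0112 * abs(a) - 0.2010 * (2 - mach / 3)) * mach * cos_a
    g2 = (0.0152 * a**2 - 1.3765 * abs(a) + 3.6001 * (-7 + 8 * mach / 3)) * mach**2
    g3 = -0.0403 * mach * cos_a
    g4 = -14.542 * mach**2
    return g1, g2, g3, g4

def missile_derivatives(alpha_deg, q, delta_deg, mach=3.0):
    """Time derivatives of angle of attack (deg/s) and pitch rate (deg/s^2)."""
    g1, g2, g3, g4 = missile_coefficients(alpha_deg, mach)
    alpha_dot = g1 * alpha_deg + q + g3 * delta_deg
    q_dot = g2 * alpha_deg + g4 * delta_deg
    return alpha_dot, q_dot

def euler_step(alpha_deg, q, delta_deg, dt=0.001, mach=3.0):
    """One forward-Euler step; dt is a hypothetical integration step."""
    alpha_dot, q_dot = missile_derivatives(alpha_deg, q, delta_deg, mach)
    return alpha_deg + dt * alpha_dot, q + dt * q_dot
```

Sampling the simulated angle of attack and tail deflection at a fixed rate then yields the sequences y_k and u_k used in the input-output model.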
[Figure 5-26 plots: (a) x-axis: number of PEs of one side of a map; (b) x-axis: iterations (0 to 400).]
Figure 5-26. Modeling performance using multiple models for the missile dynamics: (a)
Identification performance varying the network dimension (b) Identification
of the missile dynamics by 18x18 multiple models.
The SOM was trained with the vector $[y_k, y_{k-1}]$ for various map sizes and
tested with 400 newly generated data samples. Figure 5-26(b) depicts the identification
of the missile system with 18x18 multiple models, which was chosen as the optimal
network dimension as shown in Figure 5-26(a); we observe that the models give a very
good approximation of the plant.
Table 5-9. Comparison of modeling performance for the missile system.
Methodology                NRMSE
Multiple Models (18x18)    6.1e-3
TDNN (4:35:1)              6.3e-3
The modeling performance of the designed multiple models was compared in Table 5-9
with that of a TDNN with one hidden layer of 35 PEs. The proposed method showed
marginally better performance than the nonlinear model did.
Based on the 324 linear models and the nonlinear TDNN model with 35 PEs in the
hidden layer, we designed the multiple-model-based controllers and the
TDNN-model-based controller. The optimal parameters selected for each controller
were 30 hidden PEs for the TDNN controller, desired closed-loop pole locations of 0.3
and 0.7 for the PID controller, and $c_1 = 1$, $c_2 = -1.85$, $qT = 0.51$ for the QSMC.
[Figure 5-27 plots, panels (a) to (d): x-axis: time (sec).]
Figure 5-27. Tracking various set-point reference signals by (a) TDNNC (b) MIC (c)
MPIDC and (d) MQSMC in the absence of measurement noise.