ROBUST STABILITY ANALYSIS METHODS FOR SYSTEMS WITH STRUCTURED AND PARAMETRIC UNCERTAINTIES

By

CHARLES THOMAS BAAB

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2002

Copyright 2002 by Charles Thomas Baab

To Holly and C.J.

ACKNOWLEDGMENTS

I would like to express my sincere appreciation to my advisor, Oscar Crisalle, without whose support this dissertation would not have been possible. I'm also grateful for the opportunity he has given me to obtain a master's degree in electrical engineering. I wish to thank Professors Richard Dickinson, Dinesh Shah, Spyros Svoronos, and Haniph Latchman for serving on my supervisory committee. It was reassuring knowing that no matter how mathematically intense my research became, Dr. Latchman always understood where I was and where I needed to go. I thank V. R. Basker, Jon Engelstad, Serkan Kincal, and H. Mike Mahon, who have led the way, and Chris Meredith and Brian Remark, who will follow. They have all not only contributed greatly to my research but have also been good friends. Finally, I wish to thank my family, in particular my loving wife, Holly, and my wonderful son, C. J., for their unending support.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT

CHAPTER

1 INTRODUCTION
1.1. Motivation
1.2. Objective and Structure of Dissertation

2 A DUALITY PROOF FOR THE MAJOR PRINCIPAL DIRECTION ALIGNMENT PRINCIPLE
2.1. Introduction
2.2. Mathematical Background
2.2.1. The Singular Value Decomposition and Eigenvalue Decomposition
2.2.2. Dual Norms and Dual Vectors
2.2.3. Dual Eigenvector Result
2.2.4. Eigenvector-Singular Vector Equivalence Result
2.3. Statement of the Major Principal Direction Alignment Property
2.4. Modified Statement of the Major Principal Direction Alignment Principle
2.5. Examples
2.5.1. Example 2.1
2.5.2. Example 2.2
2.6. Conclusions

3 MAJOR PRINCIPAL DIRECTION ALIGNMENT WHEN THE MAXIMUM SINGULAR VALUE IS REPEATED AND ITS RELATIONSHIP TO OPTIMAL SIMILARITY SCALING
3.1. Introduction
3.2. Mathematical Background
3.2.1. The Singular Value Decomposition
3.2.2. Statement of the Major Principal Direction Alignment Principle
3.2.3. Affine Sets, Convex Sets, and Convex Functions
3.2.4. Differential Theory
3.2.5. Expression for the Gradient when the Maximum Singular Value is Distinct
3.3. Main Result: Characterization of the Subdifferential when the Maximum Singular Value is Repeated
3.3.1. General Expression for the Subdifferential
3.3.2. Characterization of the Subdifferential as an Ellipsoid
3.4. Determining the Steepest Descent Direction and Conditions for a Minimum
3.5. Attainability of MPDA when the Maximum Singular Value is Repeated
3.6. Reconciling the Results with the PDA Results
3.7. Examples
3.7.1. Example 3.1
3.7.2. Example 3.2
3.7.3. Example 3.3
3.8. Conclusions

4 SPECTRAL RADIUS-MAXIMUM SINGULAR VALUE EQUIVALENCE UNDER OPTIMAL SIMILARITY SCALING
4.1. Introduction
4.2. Mathematical Background
4.2.1. Dual Norms and Dual Vectors
4.2.2. Positive Matrix Result
4.2.3. Major Principal Direction Alignment Property
4.2.4. MPDA as a Control Theory Application
4.3. Main Result: Extension of the Positive Matrix Result to General Complex Matrices
4.4. Example 4.1
4.5. Conclusions

5 GENERALIZATION OF THE NYQUIST ROBUST STABILITY MARGIN AND ITS APPLICATION TO SYSTEMS WITH REAL AFFINE PARAMETRIC UNCERTAINTIES
5.1. Introduction
5.2. Generalization of the Critical Direction Theory
5.2.1. Preliminaries
5.2.2. Analysis of Robust Stability
5.3. Systems with Affine Uncertainty Structure
5.4. Robust Stability and Uncertainty Value-Set Membership
5.5. Computation of the Critical Perturbation Radius
5.6. Intersection of a Ray and Arcs in the Complex Plane
5.7. Examples
5.7.1. Example 5.1: Convex Critical Value Set
5.7.2. Example 5.2: Nonconvex Critical Value Set
5.8. Conclusions

6 ROBUST CONTROLLER SYNTHESIS FOR SYSTEMS WITH NONCONVEX VALUE SETS USING AN EXTENSION OF THE NYQUIST ROBUST STABILITY MARGIN
6.1. Introduction
6.2. Design Methodology
6.3. Design Example
6.4. Conclusion

7 ROBUSTNESS OF CLASSICAL PROPORTIONAL-INTEGRAL CONTROLLER DESIGN METHODS
7.1. Introduction
7.2. Preliminaries
7.2.1. Process Model and Uncertainty Description
7.2.2. Proportional-Integral Control and Controller Tuning Rules
7.3. Analysis of Robust Stability
7.3.1. Conditions for Robust Stability
7.3.2. Parametric Boundaries for Robust Stability
7.3.4. Stability Margins
7.4. Results of Numerical Studies
7.4.1. Region of Stable Perturbations for the ITAE Regulation Tuning Rule
7.4.2. Stability Margins Computation for Each Tuning Rule
7.5. Conclusions

8 CONCLUSIONS AND FUTURE WORK

APPENDIX

A PROOF OF LEMMA 2.1
B PROOF OF THEOREM 2.1
C PROOF OF THEOREM 7.1
D PROOF OF LEMMA 7.3
E PROOF OF THEOREM 7.2
F SIGN CHANGES IN EQUATIONS (7.12A) AND (7.12B)

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ROBUST STABILITY ANALYSIS METHODS FOR SYSTEMS WITH STRUCTURED AND PARAMETRIC UNCERTAINTIES

By Charles Thomas Baab

December 2002

Chairman: Oscar D. Crisalle
Major Department: Chemical Engineering

The major principal direction alignment principle is investigated in detail for the case when the maximum singular value is repeated. A first result is a new proof, based on duality theory, of the necessary and sufficient conditions that ensure equality of the spectral radius and maximum singular value of a matrix; namely, that there must exist at least one aligned pair of major input-output principal-direction vectors. A second result is the development of a novel numerical optimization algorithm to solve the optimal similarity-scaling problem that yields an upper bound for the structured singular value. The algorithm provides a systematic procedure for identifying the steepest-descent search direction even for the case when the singular value is repeated and the underlying optimization problem is locally nondifferentiable. The key theoretical element is the characterization of the subdifferential at every point of nondifferentiability.

The critical-direction theory is extended to include nonconvex critical uncertainty value sets through the introduction of a general definition of the critical perturbation radius. The Nyquist robust stability margin is calculated for systems with affine parametric uncertainty using an explicit map from the parameter space to the Nyquist plane.
A practical design approach based on parameter space methods is introduced. First, the controller parameters that result in robustly stable closed-loop systems are determined. Then, a performance objective is optimized over the set of robustly stabilizing controller parameters, resulting in a robustly stabilizing controller with some optimal performance characteristics.

A formal robustness analysis of popular proportional-integral controller tuning rules for systems approximated by a first-order-plus-time-delay model is presented. The uncertainty in the process model is represented by multiplicative parametric perturbations in the process gain, process time constant, and process time delay. The robustness of the uncertain system is characterized in terms of the set of all perturbations that result in stable closed loops. This set is used to calculate the standard gain and phase margins, and the parametric stability margin, which is a metric of robustness to simultaneous variations in all three system parameters. These margins are used to compare the relative robustness properties of several disturbance-rejection and tracking tuning rules in widespread use.

CHAPTER 1
INTRODUCTION

1.1. Motivation

Uncertainty is a fact of any real-world system. This uncertainty inherently translates to the model of the system used for control design, and is most often in the form of neglected dynamics or variations in model parameters. An important requirement of any control system is that it be robust (i.e., that it function satisfactorily under these uncertainties), and the design of such control systems is known as robust control. An important aspect of the robust control problem is the robust analysis problem, which consists of determining whether a control system satisfies stability and performance requirements given an admissible set of uncertainties.
Robust stability is obviously a necessary requirement for robustness, and has been studied since the earliest days of feedback control, which originated as a means to desensitize control systems to changes in the process as well as to stabilize unstable systems. The classical design techniques focused on frequency-domain methods such as those based on Bode plots and Nyquist plots (Nyquist, 1932) and resulted in the gain and phase stability margins. With the advent of the space race of the 1960s, the focus of control engineers shifted away from frequency-domain robust stability methods to the field of optimal control. In fact, the linear quadratic regulator (LQG) design appeared to give controllers with good stability properties, but in the late 1970s it was found that LQG and other prevailing methods of control design, such as state feedback through observers, lost their stability guarantees under uncertainty.

As a result, $H_\infty$ optimal control (Zames, 1981; Francis and Zames, 1984; Doyle, 1983; Safonov and Verma, 1985; Doyle et al., 1989) was introduced as a framework to deal effectively with robust stability and performance problems. The theory provides a precise formulation and solution to the problem of synthesizing an output feedback compensator that minimizes the $H_\infty$ norm of a prescribed system transfer function. The method considers unstructured uncertainties, where the only information known about the uncertainty is a norm bound.

Typically, more information about the uncertainty is known than a simple norm bound. As a consequence, several robust analysis methods have been developed to consider these structured uncertainties. Possibly the most effective and comprehensive result is the structured singular value ($\mu$) analysis method introduced by Doyle (1982), which considers the problem of robust stability for a known plant subject to a block-diagonal uncertainty structure under feedback.
In general, any block-diagram interconnection of systems and uncertainties can be rearranged into the block-diagonal standard form. The value of $\mu$ corresponds to the smallest uncertainty that will destabilize the system. Unfortunately, calculating $\mu$ is not trivial; in fact, the underlying optimization problem has been proven to be NP-hard (Braatz et al., 1994). However, there is a convex optimization problem that gives a conservative upper bound for $\mu$. In addressing the existence of solutions to the proposed convex optimization, Kouvaritakis and Latchman (1985) introduced the major principal direction alignment (MPDA) property, which gives necessary and sufficient conditions for $\mu$ to equal its upper bound, thus eliminating the conservatism.

Another robust stability analysis method for structured uncertainties is the critical direction theory developed by Latchman and Crisalle (1995) and Latchman et al. (1997), which addresses the problem of robust stability of systems affected by uncertainties that are characterized in terms of arbitrary frequency-domain value sets that are convex. The critical direction theory proposes the Nyquist robust stability margin as a measure of robust stability, which has obvious connections to the Nyquist stability criterion. The advantage of the critical direction theory over the structured singular value theory is that for several common structured uncertainty types there is an analytical expression for the Nyquist robust stability margin. Moreover, even if there is no analytical expression, determining the Nyquist robust stability margin is a tractable problem.

Another type of structured uncertainty is real parametric uncertainty in the process model. The robust stability problem under parametric uncertainty began to receive renewed attention with the seminal result of Kharitonov (1979) on the stability of interval polynomials, which is considered the most important development in the area after the Routh-Hurwitz criterion.
The theory makes it possible to determine whether a linear time-invariant control system containing several uncertain real parameters remains stable as the parameters vary over a set (Bhattacharyya et al., 1995). Accordingly, the parametric stability margin is defined as the length of the smallest perturbation in the parameters that destabilizes the closed loop. The parametric stability margin is useful in controller design as a means of comparing the performance of proposed controllers.

1.2. Objective and Structure of Dissertation

The first goal of this dissertation is to revisit the MPDA principle to strengthen the result when the maximum singular value is repeated. Chapter 2 introduces a revised statement of the MPDA property that fully considers the case of a repeated maximum singular value. An alternative proof is presented that is based on the theory of dual norms and dual vectors, which was the inspiration for the original result.

The MPDA results are also used to determine the upper bound on $\mu$ given by the minimization of the maximum singular value over positive diagonal similarity scalings. When the maximum singular value is distinct, there exists an analytical expression for the gradient of the objective function. The first-order necessary condition for a minimum (i.e., the gradient being identically zero) is equivalent to MPDA; therefore the minimum is a tight upper bound. Chapter 3 investigates this optimization problem when the maximum singular value is repeated, in which case the gradient does not exist and the objective function is nondifferentiable. One result is a method for determining the subdifferential when the maximum singular value is repeated, where the subdifferential represents the set of all subgradients. The necessary condition for a minimum is that zero is an element of the subdifferential.
Furthermore, it is shown that MPDA is still achievable when zero is on the boundary of the subdifferential; otherwise, MPDA is not attainable and the upper bound on $\mu$ is conservative. Finally, Chapter 4 gives a necessary condition for the optimal similarity scaling. The necessary condition requires that the vector of diagonal elements of the similarity scaling be an element of the null space of a matrix formed from the absolute values of the elements of the left and right eigenvectors of the matrix.

The second goal of this dissertation is the extension of the critical direction theory to the more general case where the critical value set is nonconvex. This work is presented in Chapter 5. The key to extending the theory is the introduction of a generalized definition of the critical perturbation radius in a fashion that preserves all previous results. The nonconvexity of the critical value set is observed in a number of interesting problems, including the case studied by Fu (1990) consisting of rational systems where the uncertainty appears affinely in the form of real parameters that belong to a known rectangular polytope. The generalized critical direction theory is applied to this particular class of uncertain systems, and is used to calculate the required Nyquist robust stability margin with high precision and within a computationally manageable framework. Finally, Chapter 6 proposes a practical design approach based on parameter space methods (Siljak, 1989) to illustrate the utility of the Nyquist robust stability margin as a measure of robust stability.

The final goal of this dissertation is to perform a complete robust stability analysis of classical proportional-integral (PI) controller design techniques based on approximate first-order-plus-time-delay models. The uncertain parameters for this problem are the plant gain, plant time constant, and plant time delay. The region of all stabilizing parameter perturbations is determined.
By modeling the uncertainties as multiplicative perturbations, it is shown that the stability properties of the closed-loop system depend only on the time-delay-to-time-constant controller tuning parameter. The results include plots of the classical gain and phase margins and the parametric stability margin as a function of the controller tuning parameter for the PI controller design methods investigated.

CHAPTER 2
A DUALITY PROOF FOR THE MAJOR PRINCIPAL DIRECTION ALIGNMENT PRINCIPLE

2.1. Introduction

The structured singular value, $\mu(M)$, defined as the supremum of the spectral radius of $MU$ over diagonal unitary matrices $U$ (Doyle, 1982), is a widely accepted tool in the robust analysis of linear systems. It considers the problem of robust stability for a known plant subject to a block-diagonal uncertainty structure under feedback. In general, any block-diagram interconnection of systems and uncertainties can be rearranged into the block-diagonal standard form. Calculating $\mu$ is not trivial; in fact, the underlying optimization problem has been proven to be NP-hard (Braatz et al., 1994). The difficulty is that the spectral radius is nonconvex over the set of unitary matrix transformations.

One approach is to consider upper bounds for the spectral radius that can be calculated easily and that, ideally, are attainable so as to eliminate conservatism. The maximum singular value is a reasonable choice for an upper bound because it is invariant under unitary matrix transformations. In addition, the maximum singular value upper bound can be decreased by optimizing over similarity transformations, because the spectral radius is invariant under such transformations. Ultimately, the problem becomes one of conditioning a matrix through optimal similarity and unitary transformations to achieve equality between the spectral radius and maximum singular value. Therefore, determining the conditions under which the upper bound is attained is a significant issue in the field of robust control.
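As a small numerical illustration of the bounding idea described above, the following Python sketch compares the spectral radius of a matrix with the maximum singular value of its diagonally scaled versions $DMD^{-1}$. The matrix `M`, the helper names, the coarse grid of scaling ratios, and the restriction to 2×2 matrices (chosen so that closed-form expressions suffice and only the standard library is needed) are assumptions made for illustration; they are not taken from the dissertation.

```python
import cmath
import math

def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a 2x2 matrix (closed-form quadratic roots)."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))

def max_singular_value_2x2(m):
    """Largest singular value of a 2x2 matrix.

    Uses s1^2 + s2^2 = squared Frobenius norm and s1 * s2 = |det|.
    """
    (a, b), (c, d) = m
    fro2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det = abs(a * d - b * c)
    root = math.sqrt(max(fro2 * fro2 - 4 * det * det, 0.0))
    return math.sqrt((fro2 + root) / 2)

def scale(m, d1, d2):
    """Return D M D^{-1} for the positive diagonal scaling D = diag(d1, d2)."""
    (a, b), (c, d) = m
    return ((a, b * d1 / d2), (c * d2 / d1, d))

M = ((0.0, 4.0), (1.0, 0.0))  # hypothetical example matrix
rho = spectral_radius_2x2(M)

# Only the ratio r = d1/d2 matters for a 2x2 diagonal scaling, so scan r.
ratios = [0.1 * k for k in range(1, 21)]
bounds = [max_singular_value_2x2(scale(M, r, 1.0)) for r in ratios]
best_bound, best_ratio = min(zip(bounds, ratios))

print("spectral radius:", rho)
print("unscaled bound :", max_singular_value_2x2(M))
print("best scaled bound on the grid:", best_bound, "at d1/d2 =", best_ratio)
```

For this particular matrix the unscaled maximum singular value is twice the spectral radius, while one of the scalings on the grid closes the gap. Every scaled bound stays at or above the spectral radius, as the invariance argument in the text requires.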
In addressing the existence of solutions to the proposed optimization, Kouvaritakis and Latchman (1985) introduced the major principal direction alignment (MPDA) property. The result states that the spectral radius of a matrix is equal to the maximum singular value of the matrix if and only if the major input and the major output principal directions of the matrix are aligned. MPDA is a strict condition for a matrix, but it can be used to determine the optimal positive diagonal matrix and unitary matrix that result in equality between the aforementioned definition of $\mu$ and the maximum singular value upper bound.

The proof of the MPDA principle is based on linear algebra arguments, and considers separately the cases of a unique and a repeated maximum singular value. For either case the proof of sufficiency is straightforward. The proof of necessity for the case of a unique maximum singular value is precise but not as clear-cut. On the other hand, the proof of necessity for the case of a repeated maximum singular value is slightly ambiguous.

The inspiration for the MPDA principle is early work on determining when the spectral radius equals the maximum singular value for positive matrices transformed by nonnegative diagonal matrices (Stoer and Witzgall, 1962). One motivation for the focus on positive matrices is that they have good numerical properties (i.e., less round-off error) and therefore may be used for conditioning of matrices. In addition, positive matrices remain positive under transformations by nonnegative diagonal matrices, leading to connections to the Perron roots $\pi(M)$ (positive eigenvalues of largest modulus) of positive matrices $M$ (Ortega, 1987). These results on positive matrices are based on the mathematical concepts of dual norms and dual vectors utilized by Bauer (1962), which lead to elegant proofs for many of the results. It is the goal of this chapter to revisit the MPDA principle from the viewpoint of duality theory.
To facilitate the reading, the next section provides relevant mathematical background, including a discussion of the singular value decomposition, a summary of dual-norm and dual-vector concepts, and a dual eigenvector result. Section 2.3 introduces the original MPDA theorem. Section 2.4 provides a modified statement of the MPDA theorem that explicitly considers a repeated maximum singular value, with a proof based on the dual-norm and dual-vector theory. Several examples are given in Section 2.5 and concluding remarks are made in Section 2.6.

2.2. Mathematical Background

2.2.1. The Singular Value Decomposition and Eigenvalue Decomposition

In this chapter only square matrices are considered; therefore the definitions are specialized to this type of matrix, although the singular value decomposition theory is applicable to general rectangular matrices. The singular value decomposition of an arbitrary matrix $A \in \mathbb{C}^{n \times n}$ is given by

$$A = X(A)\,\Sigma(A)\,Y^*(A) \tag{2.1}$$

where $\Sigma(A) := \mathrm{diag}(\sigma_1(A), \sigma_2(A), \ldots, \sigma_n(A))$ is the diagonal matrix of singular values organized in descending order, and $X(A)$ and $Y(A)$ are unitary matrices. The singular values of a square matrix $A \in \mathbb{C}^{n \times n}$ are given by

$$\sigma_i(A) := \sqrt{\lambda_i(A^*A)}, \quad i = 1, 2, \ldots, n$$

where $\lambda_i(A^*A)$ represents the $i$th eigenvalue of the matrix $A^*A$ and where the singular values are ordered such that

$$\sigma_1(A) \ge \sigma_2(A) \ge \cdots \ge \sigma_n(A) \ge 0$$

The matrices $X(A)$ and $Y(A)$ are of the form

$$X(A) = [\,x_1(A) \;\; x_2(A) \;\; \cdots \;\; x_n(A)\,]$$
$$Y(A) = [\,y_1(A) \;\; y_2(A) \;\; \cdots \;\; y_n(A)\,]$$

where the set of normalized left singular vectors (output principal directions) $\{x_i(A)\}$ and the set of normalized right singular vectors (input principal directions) $\{y_i(A)\}$, for $i = 1, 2, \ldots, n$, constitute orthonormal eigenbases of $AA^*$ and $A^*A$, respectively. Furthermore, a pair of singular vectors $\{x_i(A), y_i(A)\}$ is associated with each singular value $\sigma_i(A)$ through the relationship

$$A\,y_i(A) = \sigma_i(A)\,x_i(A) \tag{2.2}$$

The maximum singular value is denoted $\bar{\sigma}(A)$. It must be noted that the maximum singular value can be associated with a repeated singular value, i.e., $\bar{\sigma}(A) = \sigma_1(A) = \sigma_2(A) = \cdots$.
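The relationship (2.2) can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: it handles nonzero 2×2 matrices, obtains the major input direction $\bar{y}$ as a unit eigenvector of $A^*A$ for its largest eigenvalue using closed-form expressions, and then forms the major output direction as $\bar{x} = A\bar{y}/\bar{\sigma}$. The function name `major_pair_2x2` and the example matrix are hypothetical.

```python
import math

def major_pair_2x2(a):
    """Return (sigma_max, x, y) with A y = sigma_max * x for a nonzero 2x2 matrix A."""
    (a11, a12), (a21, a22) = a
    # Entries of the Hermitian matrix H = A* A.
    h11 = abs(a11) ** 2 + abs(a21) ** 2
    h22 = abs(a12) ** 2 + abs(a22) ** 2
    h12 = a11.conjugate() * a12 + a21.conjugate() * a22
    # Largest eigenvalue of H (closed form for a 2x2 Hermitian matrix).
    lam = 0.5 * ((h11 + h22) + math.sqrt((h11 - h22) ** 2 + 4 * abs(h12) ** 2))
    # A corresponding eigenvector; when H is diagonal pick a coordinate vector.
    if abs(h12) > 1e-14:
        y = (h12, lam - h11)
    else:
        y = (1.0, 0.0) if h11 >= h22 else (0.0, 1.0)
    norm_y = math.sqrt(abs(y[0]) ** 2 + abs(y[1]) ** 2)
    y = (y[0] / norm_y, y[1] / norm_y)
    sigma = math.sqrt(lam)
    # Output direction from relationship (2.2): x = A y / sigma.
    x = ((a11 * y[0] + a12 * y[1]) / sigma, (a21 * y[0] + a22 * y[1]) / sigma)
    return sigma, x, y

A = ((3.0, 0.0), (4.0, 5.0))  # hypothetical example matrix
sigma, x, y = major_pair_2x2(A)
norm_x = math.sqrt(abs(x[0]) ** 2 + abs(x[1]) ** 2)

print("maximum singular value:", sigma)
print("major input direction y :", y)
print("major output direction x:", x, "with norm", norm_x)
```

The output direction is produced here by construction from (2.2); the informative check is that it comes out with unit norm, as a singular vector must. When the maximum singular value is repeated, the eigenvector returned is only one of many valid choices, which is exactly the situation the chapter goes on to examine.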
A maximum left/right singular vector pair (or major output/input principal direction pair) $\{\bar{x}(A), \bar{y}(A)\}$ is any pair of left and right singular vectors $x_i(A)$ and $y_i(A)$ that correspond to the maximum singular value and satisfy (2.2). Necessarily, a major output principal direction and a major input principal direction must be elements of the eigensubspaces of $AA^*$ and $A^*A$, respectively, associated with the maximum singular value.

In this chapter the following definitions are used in relation to the eigenvalue decomposition (Golub & Van Loan, 1983; Isaacson & Keller, 1966; Stewart, 1970). Let $\lambda_i(A)$ be an eigenvalue of $A$; then a right eigenvector $v_i$ of $A$ is any nonzero vector that satisfies

$$A v_i = \lambda_i v_i$$

Furthermore, a left eigenvector $w_i$ of $A$ is any nonzero vector that satisfies

$$w_i^* A = \lambda_i w_i^*$$

The reader is cautioned that some authors use the term left eigenvector for an eigenvector of $A^*$. Finally, an eigenvalue of maximum modulus is any eigenvalue $\lambda_i(A)$ that satisfies $|\lambda_i(A)| = \rho(A)$.

2.2.2. Dual Norms and Dual Vectors

In the theoretical development that follows, the mathematical concepts of dual norms and dual vectors are utilized. These concepts are explained in a paper by Bauer (1962) and are reviewed here to facilitate the theoretical development. Given a vector norm $\|\cdot\|$, its dual vector norm $\|\cdot\|_D$ is defined as

$$\|y\|_D := \max_{\|x\| = 1} \mathrm{Re}\, y^*x = \max_{x \ne 0} \frac{\mathrm{Re}\, y^*x}{\|x\|}$$

For such dual norms the Hölder inequality

$$\|y\|_D \|x\| \ge \mathrm{Re}\, y^*x$$

holds and is sharp, i.e., for any $y_0$ there exists at least one $x_0$, and for any $x_0$ there exists at least one $y_0$, such that the equality holds (Bauer, 1962). If such a pair $(x_0, y_0)$ with $\|y_0\|_D \|x_0\| = \mathrm{Re}\, y_0^*x_0$ also satisfies the scaling condition $\|y_0\|_D \|x_0\| = 1$, it is called a dual pair. Note that the dual vector of $x$ is often written $(x)^D$. A pair $(x_0, y_0)$ is strictly dual, written $y_0 \parallel_D x_0$, if $\|y_0\|_D \|x_0\| = y_0^*x_0 = 1$.

For strictly homogeneous norms (i.e., those satisfying $\|\alpha x\| = |\alpha| \, \|x\|$ for all complex scalars $\alpha$) the Hölder inequality may be sharpened to (Bauer, 1962)

$$\|y\|_D \|x\| \ge |y^*x|$$

For a dual pair $(x_0, y_0)$ under a strictly homogeneous norm it follows that

$$\mathrm{Re}\, y_0^*x_0 = \|y_0\|_D \|x_0\| \ge |y_0^*x_0|$$

which implies that $\mathrm{Re}\, y_0^*x_0 = y_0^*x_0$. Hence, for a strictly homogeneous norm every pair of dual vectors $(x_0, y_0)$ is also a strictly dual pair. In addition, there exists a strict dual $y_0$ for any $x_0 \ne 0$ and a strict dual $x_0$ for any $y_0 \ne 0$. Furthermore, the concept of approximately dual vectors is proposed, such that a pair $(x_0, y_0)$ is approximately dual if $\|y_0\|_D \|x_0\| = |y_0^*x_0| = 1$.

In general, the dual norm of a $p$-norm $\|x\|_p := \left(\sum_i |x_i|^p\right)^{1/p}$ is the associated $q$-norm $\|\cdot\|_q$, where $1/p + 1/q = 1$. So the infinity-norm and the 1-norm are duals, and the dual norm of the 2-norm (Euclidean norm) is itself. For the 2-norm, a pair $(x_0, y_0)$ is dual if $y_0 = x_0 / \|x_0\|_2^2$, and approximately dual if $y_0 = e^{j\theta} x_0 / \|x_0\|_2^2$.

2.2.3. Dual Eigenvector Result

The basis of the following lemma is a result of Bauer (1962) on the field of values of a matrix.

Lemma 2.1. If the spectral radius of a matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum singular value of $A$, then for each normalized right eigenvector $v_i$ associated with an eigenvalue of maximum modulus $\lambda_i(A)$ there exists a normalized left eigenvector $w_i = v_i$ such that $v_i$ and $w_i$ form a dual pair $w_i \parallel_D v_i$.

Proof. Lemma 2.1 is a specialization of Bauer's result to the case of the Euclidean norm, and is therefore stated in terms of the maximum singular value of the matrix. The proof makes use of the dual-norm and dual-vector theory presented earlier. Details are in Appendix A. Q.E.D.

2.2.4. Eigenvector-Singular Vector Equivalence Result

The following lemma is a consequence of Lemma 2.1.

Lemma 2.2. If the spectral radius of a matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum singular value of $A$, then each normalized right eigenvector $v_i$ of $A$ associated with an eigenvalue of maximum modulus $\lambda_i(A)$ is also a right singular vector $y_i$ of $A$ associated with the maximum singular value $\bar{\sigma}(A)$.
Proof. It suffices to prove that $v_i$ is a right eigenvector of $A^*A$ associated with an eigenvalue whose square root is $\bar{\sigma}(A)$, because by definition the right singular vectors $y_i$ are an orthonormal eigenbasis of $A^*A$ and the singular values are the square roots of the eigenvalues of $A^*A$. First, from Lemma 2.1, it follows that for each normalized right eigenvector $v_i$ of $A$ associated with an eigenvalue of maximum modulus $\lambda_i(A)$ there exists a normalized left eigenvector $w_i = v_i$ of $A$. For each such eigenvector $v_i$,

$$A^*A v_i = A^* v_i \lambda_i(A) = (v_i^* A)^* \lambda_i(A) = (w_i^* A)^* \lambda_i(A) = (\lambda_i(A) w_i^*)^* \lambda_i(A) = v_i \lambda_i^*(A) \lambda_i(A) = |\lambda_i(A)|^2 v_i = \lambda_i(A^*A) v_i \tag{2.3}$$

Hence, from (2.3) it follows that $v_i$ is an eigenvector of $A^*A$ with eigenvalue $\lambda_i(A^*A)$. Finally, $\sqrt{\lambda_i(A^*A)} = \sqrt{|\lambda_i(A)|^2} = \sqrt{\rho^2(A)} = \bar{\sigma}(A)$, completing the proof. Q.E.D.

2.3. Statement of the Major Principal Direction Alignment Property

In solving various robust control problems it is necessary to determine the conditions under which the spectral radius of a matrix attains its maximum-singular-value upper bound. The major principal direction alignment (MPDA) property addresses this problem. Consider the singular value decomposition of a matrix $A$ given by (2.1), where $\Sigma(A)$ is the diagonal matrix of singular values organized in descending order, and $X(A)$ and $Y(A)$ are unitary matrices whose columns are the respective output and input principal directions of $A$, arranged in an order conformal with the order of the singular values. The major input principal direction $\bar{y}(A)$ and major output principal direction $\bar{x}(A)$ of a matrix $A$ are defined as input and output principal directions, respectively, corresponding to the maximum singular value $\bar{\sigma}(A)$ of $A$. In addition, the major input principal direction $\bar{y}(A)$ and the major output principal direction $\bar{x}(A)$ are said to be aligned if there exists a real scalar $\theta \in \mathbb{R}$ such that $\bar{y}(A) = e^{j\theta} \bar{x}(A)$. The following statement of the Major Principal Direction Alignment (MPDA) property is found in Kouvaritakis and Latchman (1985).

Theorem 2.1.
The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum singular value of $A$ if and only if the major input and output principal directions of $A$ are aligned.

Proof. The proof consists of two cases, namely, when the maximum singular value is distinct, and when it is repeated. The proof is taken directly from Kouvaritakis and Latchman (1985) and is relegated to Appendix B. Q.E.D.

For the case of a distinct maximum singular value, Theorem 2.1 as stated is accurate and the proof is rigorous. When the maximum singular value is repeated, however, Theorem 2.1 as stated is not entirely accurate and the proof is not rigorous. In the proof, Equation B.5 states that the variable $z$ must assume a given form (i.e., that $z = Y^*(A)w$ must be at least one element of the form). This does not mean that every major input and output principal direction pair results from $z = Y^*(A)w$; instead it should be interpreted as meaning that there is at least one such pair that results from $z = Y^*(A)w$. Hence, when the maximum singular value is repeated, there may exist a major input and output principal direction pair that is not aligned even when the spectral radius equals the maximum singular value. Counterexamples are given in the examples section. A modified statement of MPDA with a proof based on duality arguments is provided in the next section.

2.4. Modified Statement of the Major Principal Direction Alignment Principle

The following theorem is a modification of the MPDA Theorem 2.1 that accurately takes into account the case of a repeated maximum singular value.

Theorem 2.2. The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum singular value of $A$ if and only if there exists a major input and major output principal direction pair of $A$ that is aligned.
Proof. To prove sufficiency, note that alignment of a major input and major output principal direction pair of $A$ implies

$$\bar{y}(A) = e^{j\theta} \bar{x}(A) \tag{2.4}$$

Premultiplication of equation (2.4) by $A$ gives

$$A\bar{y}(A) = e^{j\theta} A\bar{x}(A) \tag{2.5}$$

The singular value decomposition of $A$ implies

$$A\bar{y}(A) = \bar{\sigma}(A)\bar{x}(A) \tag{2.6}$$

Combining equation (2.5) and equation (2.6) gives $A\bar{x}(A) = e^{-j\theta}\bar{\sigma}(A)\bar{x}(A)$, so that $\lambda = e^{-j\theta}\bar{\sigma}(A)$ emerges as an eigenvalue of $A$ with eigenvector $\bar{x}(A)$. Noting that the moduli of the eigenvalues of $A$ are always bounded from above by $\bar{\sigma}(A)$, it follows that $|\lambda| = \rho(A) = \bar{\sigma}(A)$.

To prove necessity, assume $\rho(A) = \bar{\sigma}(A)$; then from Lemma 2.2 it follows that any normalized right eigenvector $v_i$ of $A$ associated with an eigenvalue of maximum modulus $\lambda_i(A)$ is also a right singular vector $y_i$ of $A$ associated with the maximum singular value $\bar{\sigma}(A)$. From equation (2.2), the corresponding left singular vectors are

$$x_i(A) = \frac{A y_i(A)}{\bar{\sigma}(A)} = \frac{A v_i(A)}{\bar{\sigma}(A)} = \frac{\lambda_i(A) v_i(A)}{|\lambda_i(A)|} = e^{j \arg(\lambda_i(A))} y_i(A) = e^{j\theta} y_i(A)$$

Therefore, for each normalized right eigenvector $v_i$ there is a major input/output principal direction pair that is aligned, namely

$$y_i(A) = v_i \quad \text{and} \quad x_i(A) = e^{j\theta} y_i(A) \tag{2.7}$$

where

$$\theta = \arg(\lambda_i(A)) \tag{2.8}$$

Finally, there is always at least one right eigenvector $v_i$ of $A$ associated with an eigenvalue of maximum modulus $\lambda_i(A)$; therefore, there must exist at least one major input/output principal direction pair that is aligned, which completes the proof. Q.E.D.

Theorem 2.2 is a precise statement of the MPDA property. The theorem eliminates any ambiguity that may result when applying the MPDA property as stated in Theorem 2.1 to the case of repeated maximum singular values. In addition, the proof of necessity builds directly on the earlier work on dual vectors and dual norms, and avoids the confusion associated with the earlier proof. This section is concluded with a simple corollary that restates the MPDA property in the duality terminology, namely

Corollary 2.1.
The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum singular value of $A$ if and only if there exists a major input and major output principal direction pair of $A$ that is approximately dual with respect to the Euclidean norm.

Proof. It suffices to show that approximate duality of a major input/output principal direction pair with respect to the Euclidean norm is equivalent to alignment of the pair. By definition, the pair is approximately dual with respect to the Euclidean norm if and only if

$$\bar{x} = e^{j\theta} \bar{y} / \|\bar{y}\|_2^2 \tag{2.9}$$

Principal directions are always normalized; therefore (2.9) is equivalent to $\bar{x} = e^{j\theta}\bar{y}$, which is exactly the condition for alignment, completing the proof. Q.E.D.

2.5. Examples

2.5.1. Example 2.1

Consider the matrix A= 0.90261.0077i 0.6086 +0.2053i 0.6487 + 0.2968i 025860.1506i 1.2588 1.1670i 0.5918 0.4665i 0.1661+0.2372i 0.6442 + 0.2239i 0.16411.4383i with eigenvectors $\{v_1, v_2, v_3\}$ = 0.0687 + 0.1159i 0.1807+ 0.1816i 0.6834 0.0523i 0.8719 + 0.2183i 0.3478 02425i 0.3628 0.1105i 0.3920 + 0.142 li 0.8670 + 0.0538i 0.4576 0.4208i and associated eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ = {2.0000e.6000j ,1.2503e1.6660 ,1.5996e2.555i} and singular value decomposition $A = X\Sigma Y^*$ (2.10), where $X = [x_1\ x_2\ x_3]$ and $Y = [y_1\ y_2\ y_3]$ are 0.0018 + 0.0876i X = 02741+0.8117i 0.2903 0.4173i 0.1556 Y = 0.2965+0.8731i 0.3367 0.1101i 0.4121 + 0.3962i 0.4542 + 0.0343i 0.4806 0.4845i 0.7481 0.0272 0.1299i 0.5404 0.3615i 0.4426 + 0.6853i 0.2421 0.0061i 02491 + 0.4624i 0.6451 0.0400 + 0.3612i 05455 + 0.3927i and the singular values are $\{\sigma_1, \sigma_2, \sigma_3\} = \{2, 2, 1\}$.

The spectral radius equals the maximum singular value, i.e., $|\lambda_1| = \rho(A) = \bar{\sigma}(A) = \sigma_1 = \sigma_2$. In this case the eigenvalue of maximum modulus is unique and nonrepeated, and the maximum singular value is repeated. An inspection of the left and right singular vectors reveals that $x_1 \neq e^{j\theta} y_1$ and $x_2 \neq e^{j\theta} y_2$, which appears to contradict the MPDA Theorem 2.2, which states that there must exist at least one major input/output principal direction pair that is aligned.
This apparent contradiction can be resolved by realizing that (2.10) is only one possible choice of an orthonormal eigenbasis of $A^*A$ whose vectors are right singular vectors. Different orthonormal eigenbases of $A^*A$ are obtained through unitary transformations of the orthonormal bases of the eigenspaces of $A^*A$ associated with each particular singular value. The eigenspace of $A^*A$ associated with a nonrepeated singular value has dimension one; therefore an orthonormal basis consists of only one vector, and the only unitary transformation of this basis is of the form $e^{j\theta}$. On the other hand, the eigenspace of $A^*A$ associated with a repeated singular value has dimension greater than one; therefore an orthonormal basis consists of more than one vector, and a unitary transformation of this basis is a unitary matrix whose size is the dimension of the corresponding eigenspace. Hence, for this example, there must exist a unitary matrix that transforms the singular vector pairs $(x_1, y_1)$ and $(x_2, y_2)$ into pairs $(x_1', y_1')$ and $(x_2', y_2')$ such that at least one of the transformed left singular vectors is aligned with the corresponding transformed right singular vector. The problem becomes finding a matrix $U$ such that

$$[x_1'\ x_2'] = [x_1\ x_2]\,U \tag{2.11}$$

$$[y_1'\ y_2'] = [y_1\ y_2]\,U \tag{2.12}$$

and $x_1' = e^{j\theta} y_1'$, with the unitary constraint $U^*U = I$. The solution can be found by solving the system of equations that equates the moduli of the elements of $x_1'$ and $y_1'$ and that constrains the arguments of the elements of $x_1'$ and $y_1'$ to differ by $\theta$, where the unknowns are the elements of $U$ and the variable $\theta$. Although this is a simple problem in complex algebra, the resulting set of equations has many terms and is relatively cumbersome. (Further theoretical work in this area is discussed in Chapter 3.) Therefore, an alternative method is used to solve the problem.
First, from Lemma 2.2 it follows that the right eigenvector v, is also right singular vector yi; therefore, if U =[u, n2] then the first part of the problem becomes finding a normalized u, such that Y' = v, =[yi y2]u, (2.13) The normalized least squares solution to (2.13) is S[y y2]+v, [0.5548 + 0.8057i1 U [l y, y2 ]v1 0.2072 + 0.0126i where [e]* denotes the MoorePinrose pseudoinverse (Ortega, 1987). The second and last part of the problem is to choose u2 such that U is unitary. One choice is 0.5548 + 0.8057i 0.2075 U =[' 2J 0.2072 + 0.0126i 0.6026 + 0.7705i Now using the relationships (2.11) and (2.12) and defining X' =[x, x; x'] and Y' = [y y2 Y3 yields an alternative singular value decomposition A= X'ZY' where 0.0088 + 0.1345i 0.5540 + 0.0969i 0.4426 + 0.6853i X =[x', x x3]= 0.5964 + 0.67251 0.3042 0.202 i 0.24210.006 li 0.4038 0.1041i 0.7232 0.1649i 0.2491 + 0.4624i 0.0687+0.1159i 0.48310.5764i 0.6451 Y =[y; y; y,]= 0.8719+0.2183i 0.0549 + 0.2386i 0.0400 + 0.3612i 0.3920 + 0.1421i 0.0228 + 0.6114i 0.5455 + 0.3927i and the singular values again are {cr 1, 2 3} = {2,2,1} Finally, the apparent contradiction of Theorem 2.2 is resolved by verifying arg(A) o0.60 j y x,; = e y, = e 0 y1 Note that x; ; e y; even though a2 is equal to the maximum singular value. A reasonable question now is whether it is possible to choose u2 such that x, is also aligned with y2 ? The answer is no. This is proved as follows. By construction y, is an eigenvector of A corresponding to an eigenvalue of maximum modulus, namely v,. Next, it can be shown (see, Theorem 2.1, proof of sufficiency) that alignment of major input and major output principal directions implies that the major principal directions are both necessarily eigenvectors of A. Now, assume that is possible to choose u2 such that x2 and y; are aligned. Necessarily, y2 is also an eigenvector of A which implies that there are two linearly independent eigenvectors, yl and y', associated with the eigenvalue of maximum modulus. 
Hence, the eigenvalue's geometric multiplicity is greater than one. Furthermore, for this case the eigenvalue's algebraic multiplicity is one, but an eigenvalues geometric multiplicity can not exceed its algebraic multiplicity. This is an obvious contradiction, therefore the assumption is false. The result that it is not possible to achieve alignment of all the major input and major output principal directions is not compatible with the original statement of the MPDA property as given in Section 2.3. However, it is compatible with the revised version of the MPDA property of Section 2.4, which allows for a major input/output principal direction pair that is not aligned as long as at least one other input/output principal direction pair that is aligned as is the case in this example. 2.5.2. Example 2.2 Consider the matrix 1.7907 0.8729i 0.0780 + 0.0482i 0.0085 + 0.151 li A = 0.0827 0.0396i 1.6645 1.1040i 0.0475 0.0001i 0.1225 + 0.0888i 0.0258 + 0.0399i 1.68831.0605i with eigenvectors 0.8554 0.0000i 0.0224 0.4144i 0.0187 +0.1611i {viv2,3 0.01450.2681i 0.0631+0.4205i 0.5177+0.7269i 0.4002 0.1901iJ 0.5880 + 05489i J 0.3707 0.1999i and associated eigenvalues {'l ,,,2 3} = {2.0000e "' ,2.0000e W O ,2.0000e .' } and singular value decomposition A = XEY*, where 0.0381 + 0.0008i 0.8954 0.4364i 0.0789 + 0.0124i1 X=[x, x2 3]= 0.95500.2457i 0.04140.0198i 0.09520.1281i 0.1616 0.000i 0.0613 + 0.0444i 0.3854 0.9053i 0.0000 1.0000 0.0000 Y = [Y 2 Y3] = 0.9340+ 0.3268i 0.0000 0.0243 0.1422i 0.11380.0887i 0.0000 0.1537 0.9775i and the singular values are {O(l r,2, 3} = {2,2,2} The spectral radius equals the maximum singular value, i.e. AI I= 21 = l3 = p(A)= o(A)= a, = a2 = 3 where there are two eigenvalues of maximum modulus with one being nonrepeated and the other having a multiplicity of two. The maximum singular value is associated with a repeated singular value of multiplicity three. 
Again, inspection of the left and right singular vectors reveals that x, ejoy,, x,2 e'y,2, and x2 ejy2. From Theorem 2.2, it is known that there is at least one major input/output principal direction pair that is aligned, but it is not apparent if there are more than one. The possibility exists that all three can be aligned through a unitary transformation, because there are three independent eigenvectors associated with the eigenvalue of maximum modulus. The unitary matrix that transforms all three singular vectors such that all input/output principal directions pairs are aligned is not found by solving the resulting system of complex algebraic equations, because the equations are even more cumbersome than would for the previous example. In fact, in this example, the existing singular value decomposition as not transformed at all. Instead, an alternative singular value decomposition is constructed from the three eigenvectors associated with the eigenvalue of maximum modulus. First, one right singular vector is obtained from the eigenvector associated with the eigenvalue that is not repeated, i.e., y' = v, as dictated by Lemma 2.2 and the corresponding left singular is given by x'; = ejazrg(')y = e0.40ojy according to (2.7) and (2.8) of Theorem 2.2. The remaining two right singular vectors are obtained from v2 and v3, the eigenvectors associated with repeated eigenvalue of maximum modulus. It can be shown that both of these eigenvectors are eigenvectors of A'A, but that alone does not make them both right singular vectors, because singular vectors are obtained from the orthonormal eigenbasis of A'A. It is easy to show that v, is normal and orthogonal to v2 and v,. Therefore, the remaining step is to orthonormalize v2 and v3. One such orthonormalization is Y2 = V2 V3 V2V3 V2 V3 ;2V2 3 'V211 The corresponding left singular vectors are then given by eiarg(A) 0.6000 ' X2 = eJ ( Y2 = e 2. 
O2 W(A,) y .6000) x3 = e J( y3 = e 6jy3 The alternative singular value decomposition is A = X'EY'" where 0.7878 0.333ii 0.2525 0.3294i 0.2144 + 0.2240i X' =[x' x x;]= 0.09100.2525i 0.2895+0.3114i 0.1311 +0.8544i 0.29460.3310i 0.7952 + 0.1210i 0.0666 0.3902i 0.8554 0.0000i 0.0224 0.4144i 0.0505 + 0.3059i' Y'=[y, y'2 y;3]= 0.01450.2681i 0.0631+0.4205i 0.5907+0.6311 i 0.4002 0.190li 0.5880 + 0.5489i 0.1654 0.3597i and the singular values are still {oa,,a 2,a3} = {2,2,2} By construction, the input/output principal direction pairs are aligned as follows e 0.4000 j ' x, OO y, = 6000jy, x2=e 32 0.6000 "j x3 = e 3 This result shows that it is possible for several pairs of input/output principal directions to be aligned when there is singular value multiplicity. Note that the alignment factors, however, are not necessarily identical. 2.6. Conclusions This chapter clarifies the implications of the MPDA principle by explicitly considering the case of a repeated maximum singular value. An alternative proof of the necessity of the MPDA property is presented that is based on dual norm and vector theory. This proof shows the ties the MPDA property has to the earlier duality work which partly inspired it. Examples show that the alignment properties of the input/output principal direction pairs associated with maximum singular value are directly related to the eigenvectors associated with eigenvalues of maximum modulus in terms of both the multiplicity and the amount of alignment. CHAPTER 3 MAJOR PRINCIPAL DIRECTION ALIGNMENT WHEN THE MAXIMUM SINGULAR VALUE IS REPEATED AND ITS RELATIONSHIP TO OPTIMAL SIMILARITY SCALING 3.1. Introduction The Major Principal Direction Alignment (MPDA) theory yields a necessary and sufficient condition for the spectral radius of a matrix to equal its maximum singular value (Kouvaritakis and Latchman, 1985). 
This has been proved using duality arguments in Chapter 2, where it is shown that the results hold even for the case of a repeated maximum singular value. The primary reason for the development of the MPDA principle is to solve the structured-singular-value ($\mu$) problem, which is often written in the form (Doyle, 1982)

$$\sup_{U \in \mathcal{U}} \rho(MU) = \mu(M) \leq \inf_{D \in \mathcal{D}} \bar{\sigma}(DMD^{-1}) \tag{3.1}$$

where $M \in \mathbb{C}^{n \times n}$, $\mathcal{U} := \{\operatorname{diag}(e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_n}) \mid 0 \leq \theta_i < 2\pi,\ i = 1, 2, \ldots, n\}$ is the set of diagonal unitary matrices, and $\mathcal{D} := \{\operatorname{diag}(d_1, d_2, \ldots, d_n) \mid d_i > 0,\ i = 1, 2, \ldots, n\}$ is the set of positive diagonal matrices. In equation (3.1), $\rho$ represents the spectral radius, $\mu$ the structured singular value, and $\bar{\sigma}$ the maximum singular value. The supremization over $\mathcal{U}$ is known to be an NP-hard and nonconvex optimization problem (Braatz, 1994); therefore, when using standard optimization techniques there is always the problem of local versus global optima. On the other hand, the infimization over $\mathcal{D}$ can be shown to be a convex optimization problem (Safonov and Verma, 1985; Tzafestas, 1984; Latchman, 1986), and the global minimum can be determined via an appropriate optimization technique. However, as (3.1) implies, in general this yields only an upper bound on $\mu$.

The MPDA theory shows that if the maximum singular value is distinct for a given $D$, then there is an analytic expression for the gradient $\partial\bar{\sigma}(DMD^{-1})/\partial D$. From this expression for the gradient, the condition for a stationary point (i.e., $\partial\bar{\sigma}(DMD^{-1})/\partial D = 0$) implies that the moduli of the input and output principal directions are elementwise equal. Therefore, if at the infimum the maximum singular value is distinct, then the gradient exists and is identically zero, and the moduli of the input and output principal directions are pairwise equal.
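The ordering in (3.1) can be spot-checked numerically. The following is a minimal sketch, assuming NumPy and a randomly generated test matrix; the helper names `spectral_radius` and `sigma_max` are illustrative and not part of the original development. Random sampling of $\mathcal{U}$ and $\mathcal{D}$ produces only crude bounds, so the sketch confirms the inequality without estimating $\mu$ accurately.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# Arbitrary complex test matrix (an assumption made only for illustration).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def spectral_radius(A):
    """Largest eigenvalue modulus of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def sigma_max(A):
    """Maximum singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[0]

lower = 0.0      # largest rho(M U) seen over sampled diagonal unitary U
upper = np.inf   # smallest sigma_max(D M D^{-1}) seen over sampled positive diagonal D
for _ in range(500):
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    U = np.diag(np.exp(1j * theta))
    d = np.exp(rng.uniform(-1.0, 1.0, size=n))
    DMDinv = (d[:, None] * M) / d[None, :]   # entry (i, j) is d_i * M_ij / d_j
    lower = max(lower, spectral_radius(M @ U))
    upper = min(upper, sigma_max(DMDinv))

print("sampled lower bound:", lower)
print("sampled upper bound:", upper)
```

Every sampled value of $\rho(MU)$ lies below every sampled value of $\bar{\sigma}(DMD^{-1})$, because $\rho(MU) = \rho(DMD^{-1}U) \leq \bar{\sigma}(DMD^{-1}U) = \bar{\sigma}(DMD^{-1})$ for diagonal $U$ and $D$.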
In addition, a unitary transformation matrix $U$ (note that the maximum singular value is invariant under unitary transformations) can be determined that shifts the angles of the elements of the input or output principal direction such that MPDA is achieved; the upper bound is then tight and the value of $\mu$ is determined by solving a convex optimization problem. In general, however, the maximum singular value need not be distinct for a given scaling $D$. This work investigates further the situations that arise when the maximum singular value is repeated. Two aspects of this problem are investigated. The first aspect is the effect the repeated maximum singular value has on the optimization over $\mathcal{D}$, with specific interest in gradient search methods. The second aspect is the attainability of MPDA when the maximum singular value is repeated for the optimal scaling. Finally, this work attempts to reconcile the results obtained with those of the principal direction alignment (PDA) principle (Daniel et al., 1986).

3.2. Mathematical Background

3.2.1. The Singular Value Decomposition

The following definitions are associated with the singular value decomposition (Ortega, 1987). In this chapter only square matrices are considered; the definitions are therefore specialized to square matrices, although the singular value decomposition theory is also applicable to rectangular matrices. The singular value decomposition of an arbitrary matrix $A \in \mathbb{C}^{n \times n}$ is given by

$$A = X(A)\Sigma(A)Y^*(A) \tag{3.2}$$

where $\Sigma(A) := \operatorname{diag}(\sigma_1(A), \sigma_2(A), \ldots, \sigma_n(A))$ is the diagonal matrix of singular values placed in descending order, and $X(A)$ and $Y(A)$ are unitary matrices. The singular values of a square matrix $A \in \mathbb{C}^{n \times n}$ are given by $\sigma_i(A) := \sqrt{\lambda_i(A^*A)}$, $i = 1, 2, \ldots, n$, where $\lambda_i(A^*A)$ represents the $i$th eigenvalue of the matrix $A^*A$ and where the singular values are ordered such that $\sigma_1(A) \geq \sigma_2(A) \geq \cdots \geq \sigma_n(A)$. The matrices $X(A)$ and $Y(A)$ are of the form

$$X(A) = [x_1(A)\ \ x_2(A)\ \cdots\ x_n(A)], \qquad Y(A) = [y_1(A)\ \ y_2(A)\ \cdots\ y_n(A)]$$

where the set of normalized left singular vectors (output principal directions) $\{x_i(A)\}$ and the set of normalized right singular vectors (input principal directions) $\{y_i(A)\}$ for $i = 1, 2, \ldots, n$ respectively constitute orthonormal eigenbases of $AA^*$ and $A^*A$, such that

$$AA^* x_i(A) = \sigma_i^2(A) x_i(A) \quad \text{and} \quad A^*A y_i(A) = \sigma_i^2(A) y_i(A) \tag{3.3}$$

Furthermore, a pair of singular vectors $\{x_i(A), y_i(A)\}$ is associated with each singular value $\sigma_i(A)$ through the relationship

$$A y_i(A) = \sigma_i(A) x_i(A) \tag{3.4}$$

The maximum singular value is defined as $\bar{\sigma}(A) := \sigma_1(A)$. The maximum singular value can be associated with a repeated singular value, i.e., $\bar{\sigma}(A) = \sigma_1(A) = \sigma_2(A) = \cdots = \sigma_r(A)$, where $r \leq n$ is the multiplicity. A maximum left/right singular vector pair (or major output/input principal direction pair) $\{\bar{x}(A), \bar{y}(A)\}$ is any pair of left and right singular vectors that correspond to the maximum singular value and satisfy (3.4). Necessarily, a major output principal direction and a major input principal direction must respectively be normalized elements of the eigensubspaces of $AA^*$ and $A^*A$ associated with the maximum singular value. If $\{x_i(A)\}$ for $i = 1, 2, \ldots, r$ and $\{y_i(A)\}$ for $i = 1, 2, \ldots, r$ are orthonormal bases for these eigensubspaces that satisfy (3.4), then any and all major output principal directions and major input principal directions are respectively given by

$$\bar{x}(A) = [x_1(A)\ \ x_2(A)\ \cdots\ x_r(A)]\,u = \sum_{i=1}^{r} x_i(A) u_i \tag{3.5}$$

and

$$\bar{y}(A) = [y_1(A)\ \ y_2(A)\ \cdots\ y_r(A)]\,u = \sum_{i=1}^{r} y_i(A) u_i \tag{3.6}$$

where $u \in \mathbb{C}^r$ satisfies $u^*u = 1$; that is, $u$ must be on the unit sphere in $\mathbb{C}^r$.

3.2.2. Statement of the Major Principal Direction Alignment Principle

The following theorem is a modification of the MPDA principle as proposed in Kouvaritakis and Latchman (1985) that takes into account the case of a repeated maximum singular value.

Theorem 3.1. The spectral radius of a matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum singular value of $A$ if and only if there exists a major input and major output principal direction pair of $A$ that is aligned, i.e.,
there exists a pair $\{\bar{x}(A), \bar{y}(A)\}$ such that

$$\bar{y}(A) = e^{j\theta}\bar{x}(A) \tag{3.7}$$

for some $\theta \in [0, 2\pi)$.

Proof. The proof is given in Kouvaritakis and Latchman (1985), and Chapter 2 offers an alternative proof based on the theory of dual vectors and dual norms. Q.E.D.

Given the optimal matrices $U_o$ and $D_o$, Theorem 3.1 gives a necessary and sufficient condition for the left-hand side (spectral radius) and right-hand side (maximum singular value) of (3.1) to hold with equality. Equation (3.7) requires that the major input and major output principal directions have element-by-element equal moduli and a constant element-by-element phase difference.

3.2.3. Affine Sets, Convex Sets, and Convex Functions

If $x$ and $y$ are different points in $\mathbb{R}^n$, the set of points of the form $(1-\lambda)x + \lambda y = x + \lambda(y - x)$, $\lambda \in \mathbb{R}$, is called the line through $x$ and $y$. A subset $M$ of $\mathbb{R}^n$ is called an affine set if

$$(1-\lambda)x + \lambda y \in M \quad \forall x \in M,\ y \in M,\ \lambda \in \mathbb{R}$$

In general, an affine set has to contain, along with any two different points, the entire line through those points. The intuitive picture is that of an endless uncurved structure, like a line or a plane in space. The subspaces of $\mathbb{R}^n$ are the affine sets that contain the origin. The dimension of a nonempty affine set is defined as the dimension of the subspace parallel to it (the dimension of the empty set is $-1$ by convention). Affine sets of dimension 0, 1 and 2 are called points, lines, and planes, respectively. An $(n-1)$-dimensional affine set in $\mathbb{R}^n$ is called a hyperplane. Hyperplanes and other affine sets may be represented by linear functions and linear equations. For example, given $\beta \in \mathbb{R}$ and a nonzero $b \in \mathbb{R}^n$, the set

$$H = \{x \mid x^T b = \beta\} \tag{3.8}$$

is a hyperplane in $\mathbb{R}^n$. Moreover, every hyperplane may be represented in this way, with $\beta$ and $b$ unique up to a common nonzero multiple. For any nonzero $b \in \mathbb{R}^n$ and any $\beta \in \mathbb{R}$, the sets $\{x \mid x^T b \leq \beta\}$ and $\{x \mid x^T b \geq \beta\}$ are called closed halfspaces. The sets $\{x \mid x^T b < \beta\}$ and $\{x \mid x^T b > \beta\}$ are called open halfspaces. These halfspaces depend only on the hyperplane $H$ given by (3.8).
Therefore, one may speak unambiguously of the open and closed halfspaces corresponding to a given hyperplane. Finally, the intersection of an arbitrary collection of affine sets is again affine. Therefore, given any $S \subset \mathbb{R}^n$ there exists a unique smallest affine set containing $S$. This set is called the affine hull of $S$ and is denoted $\operatorname{aff} S$.

A subset $C$ of $\mathbb{R}^n$ is said to be convex if

$$(1-\lambda)x + \lambda y \in C \quad \forall x \in C,\ y \in C,\ \lambda \in (0,1)$$

All affine sets are convex, as are halfspaces. A vector sum $\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_m x_m$ is called a convex combination of $x_1, x_2, \ldots, x_m$ if the coefficients $\lambda_i$ are all nonnegative and $\lambda_1 + \lambda_2 + \cdots + \lambda_m = 1$. A subset of $\mathbb{R}^n$ is convex if and only if it contains all the convex combinations of its elements. The intersection of all the convex sets containing a given subset $S$ of $\mathbb{R}^n$ is called the convex hull of $S$ and is denoted $\operatorname{conv} S$. Necessarily, $\operatorname{conv} S$ is the smallest convex set containing $S$. In addition, for any $S \subset \mathbb{R}^n$, $\operatorname{conv} S$ consists of all the convex combinations of the elements of $S$. In general, by the dimension of a convex set $C$ one means the dimension of the affine hull of $C$.

A supporting halfspace to a convex set $C$ is a closed halfspace that contains $C$ and has a point of $C$ in its boundary. A supporting hyperplane to $C$ is a hyperplane that is the boundary of a supporting halfspace to $C$. As such, a supporting hyperplane to $C$ is associated with a linear function that achieves its maximum on $C$. The supporting hyperplanes passing through a given point $a \in C$ correspond to vectors $b$ normal to $C$ at $a$, as defined by (3.8).

Let a function $f(d)$ be defined on a convex set $S \subset \mathbb{R}^n$ (note that for the MPDA problem $f(d) := \bar{\sigma}(DMD^{-1})$, where $D = \operatorname{diag}(d)$ and $S$ is the positive orthant such that $D \in \mathcal{D}$). In what follows, it is assumed that the function $f(d)$ is finite on its domain of definition. The function $f(d)$ is said to be convex on $S$ if

$$f(\alpha d_1 + (1-\alpha) d_2) \leq \alpha f(d_1) + (1-\alpha) f(d_2) \quad \forall d_1, d_2 \in S,\ \alpha \in [0,1]$$

A concave function on $S$ is a function whose negative is convex.
An affine function on $S$ is a function that is both convex and concave. The set $\{(d, f(d)) \in \mathbb{R}^{n+1} \mid d \in S\}$ is the graph of the function $f(d)$ defined on the set $S$. The set

$$\operatorname{epi} f := \{(d, \mu) \in \mathbb{R}^{n+1} \mid d \in S,\ \mu \in \mathbb{R},\ \mu \geq f(d)\}$$

is called the epigraph of the function $f(d)$ defined on the set $S$. The epigraph of a convex function is a convex set.

3.2.4. Differential Theory

A vector $\xi$ is said to be a subgradient of $f: S \subset \mathbb{R}^n \to \mathbb{R}$ at $d \in S$ if

$$f(g) \geq f(d) + \xi^T (g - d), \quad \forall g \in S \tag{3.9}$$

This condition, which is referred to as the subgradient inequality, has a simple geometric meaning: it says that the graph of the affine function $h(g) = f(d) + \xi^T(g - d)$ is a nonvertical supporting hyperplane to the convex set $\operatorname{epi} f$ at the point $(d, f(d))$. The set of all subgradients of $f$ at $d$ is called the subdifferential of $f$ at $d$ and is denoted by $\partial f(d)$. The multivalued point-to-set mapping $\partial f: d \to \partial f(d)$ is called the subdifferential of $f$. The set $\partial f(d)$ is closed and convex, since by definition $\xi \in \partial f(d)$ if and only if $\xi$ satisfies a certain infinite system of weak linear inequalities (one for each $g$ in (3.9)). In addition, $\partial f(d)$ is also nonempty and bounded.

The directional derivative of $f$ at $d \in S$ in the direction of $g$, denoted $f'(d; g)$, is defined by the limit

$$f'(d; g) = \lim_{\lambda \downarrow 0} \frac{f(d + \lambda g) - f(d)}{\lambda}$$

if it exists. Notably, for convex functions the directional derivative $f'(d; g)$ exists for all $d \in S$ and for all $g \in \mathbb{R}^n$. Dem'yanov and Vasil'ev (1985) show that the relation

$$f'(d; g) = \max_{\xi \in \partial f(d)} \xi^T g \tag{3.10}$$

holds. The function $f$ is differentiable at $d \in S$ if and only if there exists a vector $\nabla f(d)$ (necessarily unique), called the gradient, for which

$$f(g) = f(d) + \nabla f^T(d)(g - d) + o(\|g - d\|)$$

or, equivalently,

$$\lim_{g \to d} \frac{f(g) - f(d) - \nabla f^T(d)(g - d)}{\|g - d\|} = 0$$

If $f$ is a convex function then $f$ is differentiable at $d \in S$ if and only if the partial derivatives exist.
In addition, the gradient is given by

$$\nabla f(d) = \left[ \frac{\partial f(d)}{\partial d_1}\ \ \frac{\partial f(d)}{\partial d_2}\ \cdots\ \frac{\partial f(d)}{\partial d_n} \right]^T$$

and $f$ has only one subgradient, namely the gradient $\nabla f(d)$, such that

$$\partial f(d) = \{\nabla f(d)\} \tag{3.11}$$

Also,

$$f(g) \geq f(d) + \nabla f^T(d)(g - d), \quad \forall g \in S$$

That is, $\nabla f(d)$ determines the normal of the tangent supporting hyperplane of $\operatorname{epi} f$ at $d$. With the terminology of differential theory thus developed, several important theorems are given that are used in the sequel. The first theorem describes the set of points where $f$ is differentiable. This theorem is used as the basis of the primary assumption made in the results section, namely, that the function being considered (i.e., $f(d) := \bar{\sigma}(DMD^{-1})$) is nondifferentiable only at points in its domain. The second theorem gives a characterization of the subdifferential that is used to construct an expression for $\partial f(d)$ when the function is nondifferentiable.

Theorem 3.2. Let $f$ be a convex function defined on a convex set $S \subset \mathbb{R}^n$, and let $D$ be the set of points where $f$ is differentiable. Then $D$ is a dense subset of $S$, and its complement in $S$, given by $S \setminus D$, is a set of measure zero. Furthermore, the gradient mapping $\nabla f: d \to \nabla f(d)$ is continuous on $D$.

Proof. Two different proofs are given in Dem'yanov and Vasil'ev (1985) and Rockafellar (1972). Both proofs are based on measure theory, and show that there are a countable number of sets where $f$ is not differentiable. Q.E.D.

Theorem 3.2 essentially states that $f$ is differentiable almost everywhere in $S$.

Theorem 3.3. Let $f$ be a convex function defined on a convex set $S \subset \mathbb{R}^n$, and let $D$ be the set of points where $f$ is differentiable. Then

$$\partial f(d) = \operatorname{conv}\left\{ z \in \mathbb{R}^n \mid z = \lim_{k \to \infty} \nabla f(d_k),\ d_k \to d,\ d_k \in D \right\}$$

Proof. Again, two different proofs are given in Dem'yanov and Vasil'ev (1985) and Rockafellar (1972). Both proofs use the continuity of the gradient on $D$ given in Theorem 3.2 to show that the limits exist and that they converge to exposed points of $\partial f(d)$. Therefore $\partial f(d)$ is the convex hull of all such limits. Q.E.D.
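A small numerical illustration of Theorem 3.3 may help fix ideas. The sketch below assumes NumPy and uses a deliberately simple convex function, $f(d) = \max(d_1, d_2)$ on $\mathbb{R}^2$, rather than the singular value objective of this chapter; the function and helper names are illustrative only. At the kink $d = (1, 1)$ the gradients along sequences approaching from either side converge to $(1, 0)$ and $(0, 1)$, and every convex combination of these two limits satisfies the subgradient inequality (3.9), while a vector outside the hull does not.

```python
import numpy as np

def f(d):
    """Simple convex test function: the larger of the two coordinates."""
    return max(d[0], d[1])

def grad_f(d):
    """Gradient of f at points where it exists (d[0] != d[1])."""
    if d[0] == d[1]:
        raise ValueError("f is not differentiable on the line d[0] == d[1]")
    return np.array([1.0, 0.0]) if d[0] > d[1] else np.array([0.0, 1.0])

d0 = np.array([1.0, 1.0])   # point of nondifferentiability

# Gradients along two sequences d_k -> d0 that stay inside the differentiable set.
limits = []
for direction in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    grads = [grad_f(d0 + 10.0 ** (-k) * direction) for k in range(1, 8)]
    limits.append(grads[-1])  # the gradient is constant along each sequence

def is_subgradient(xi, d, trials=2000, seed=0):
    """Sample-based check of the subgradient inequality (3.9)."""
    rng = np.random.default_rng(seed)
    for g in rng.uniform(-5.0, 5.0, size=(trials, 2)):
        if f(g) < f(d) + xi @ (g - d) - 1e-12:
            return False
    return True

# Convex combinations of the limiting gradients pass the check ...
hull_ok = all(
    is_subgradient(t * limits[0] + (1.0 - t) * limits[1], d0)
    for t in np.linspace(0.0, 1.0, 11)
)
# ... while a vector outside the convex hull fails it.
outside_ok = is_subgradient(np.array([1.5, -0.5]), d0)

print(hull_ok, outside_ok)
```

The sampled check is not a proof, but it shows that two limiting gradients suffice to generate the whole subdifferential at this kink.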
As presented, Theorem 3.3 seems of little practical value, because constructing $\partial f(d)$ from it requires the construction of an infinite number of limit sequences. This is not the case, as is shown in the results section. The next two theorems deal with solving the optimization problem of infimizing $f(d)$. One gives conditions for an infimum; the other gives an expression for the steepest descent direction.

Theorem 3.4. For the convex function $f(d)$ to attain its minimum value on $S$ at the point $d_o$, it is necessary and sufficient that

$$0 \in \partial f(d_o)$$

Proof. A detailed proof is given in Dem'yanov and Vasil'ev (1985). Basically, the condition is sufficient because $\operatorname{epi} f$ lies entirely above a horizontal supporting hyperplane at $d_o$. The condition is necessary because if $0 \notin \partial f(d_o)$ then it is possible to find a direction along which $f$ decreases, so that $f(d_o)$ is not optimal. Q.E.D.

The steepest descent direction is given in the following theorem.

Theorem 3.5. If $0 \notin \partial f(d)$, then the subgradient given by

$$\xi_{sd}(d) = \arg\min_{\xi \in \partial f(d)} \|\xi\| \tag{3.12}$$

points in the opposite direction of the steepest descent direction. That is,

$$g(d) = -\frac{\xi_{sd}(d)}{\|\xi_{sd}(d)\|}$$

is the steepest descent direction of $f$ at $d$.

Proof. A detailed proof is given in Dem'yanov and Vasil'ev (1985). The proof is based on finding the direction that gives the smallest directional derivative as given by (3.10). Q.E.D.

Theorem 3.4 and Theorem 3.5 are of great utility for any steepest descent nondifferentiable optimization algorithm.

3.2.5. Expression for the Gradient when the Maximum Singular Value is Distinct

The mathematical background now focuses on the problem at hand, namely performing the infimization

$$\inf_{D \in \mathcal{D}} \bar{\sigma}(DMD^{-1}) \tag{3.13}$$

The objective function $f(d) = \bar{\sigma}(DMD^{-1})$ (where $D = \operatorname{diag}(d)$ and the domain of the objective function is the positive orthant such that $D \in \mathcal{D}$) is convex as was already stated.
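Latchman (1986) has stated that when the maximum singular value is distinct, the gradient exists and is given by a relatively simple expression. The following is the derivation of this expression. After defining $\tilde M := DMD^{-1}$ to simplify the notation, the singular value decomposition and equation (3.3) give

$$\bar\sigma^2(DMD^{-1}) = \bar\sigma^2(\tilde M) = \bar y^*(\tilde M)\,\tilde M^* \tilde M\,\bar y(\tilde M) \tag{3.14}$$

If it exists, the partial derivative of (3.14) with respect to the diagonal element $d_i$ of matrix $D$ is given by

$$\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i} = \frac{\partial \left( \bar y^*(\tilde M)\,\tilde M^* \tilde M\,\bar y(\tilde M) \right)}{\partial d_i}$$

which by the chain rule becomes

$$\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i} = \frac{\partial \bar y^*(\tilde M)}{\partial d_i}\,\tilde M^* \tilde M\,\bar y(\tilde M) + \bar y^*(\tilde M)\,\frac{\partial (\tilde M^* \tilde M)}{\partial d_i}\,\bar y(\tilde M) + \bar y^*(\tilde M)\,\tilde M^* \tilde M\,\frac{\partial \bar y(\tilde M)}{\partial d_i}$$

Using (3.3) gives

$$\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i} = \bar\sigma^2(\tilde M)\,\frac{\partial \bar y^*(\tilde M)}{\partial d_i}\,\bar y(\tilde M) + \bar y^*(\tilde M)\,\frac{\partial (\tilde M^* \tilde M)}{\partial d_i}\,\bar y(\tilde M) + \bar\sigma^2(\tilde M)\,\bar y^*(\tilde M)\,\frac{\partial \bar y(\tilde M)}{\partial d_i}$$

which, because $\bar y^*(\tilde M)\,\bar y(\tilde M) = 1$ is constant, simplifies to

$$\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i} = \bar\sigma^2(\tilde M)\,\frac{\partial \left( \bar y^*(\tilde M)\,\bar y(\tilde M) \right)}{\partial d_i} + \bar y^*(\tilde M)\,\frac{\partial (\tilde M^* \tilde M)}{\partial d_i}\,\bar y(\tilde M) = \bar y^*(\tilde M)\,\frac{\partial (\tilde M^* \tilde M)}{\partial d_i}\,\bar y(\tilde M)$$

Expanding the partial derivative term now gives

$$\begin{aligned}
\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i}
&= \bar y^*(\tilde M)\,\frac{\partial (D^{-1} M^* D^2 M D^{-1})}{\partial d_i}\,\bar y(\tilde M) \\
&= \bar y^*(\tilde M) \left[ \frac{\partial D^{-1}}{\partial d_i} M^* D^2 M D^{-1} + D^{-1} M^* \frac{\partial D^2}{\partial d_i} M D^{-1} + D^{-1} M^* D^2 M \frac{\partial D^{-1}}{\partial d_i} \right] \bar y(\tilde M) \\
&= \bar y^*(\tilde M) \left[ -\frac{1}{d_i^2} E_i M^* D^2 M D^{-1} + 2 d_i D^{-1} M^* E_i M D^{-1} - \frac{1}{d_i^2} D^{-1} M^* D^2 M E_i \right] \bar y(\tilde M)
\end{aligned}$$

where $E_i$ is a diagonal $n \times n$ matrix with a 1 in the $(i,i)$ position, and zeros everywhere else. Since $E_i = d_i E_i D^{-1} = D E_i D / d_i^2 = d_i D^{-1} E_i$, the above equation becomes

$$\begin{aligned}
\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i}
&= \frac{1}{d_i}\,\bar y^*(\tilde M) \left[ -E_i D^{-1} M^* D^2 M D^{-1} + 2 D^{-1} M^* D E_i D M D^{-1} - D^{-1} M^* D^2 M D^{-1} E_i \right] \bar y(\tilde M) \\
&= \frac{2}{d_i}\,\bar y^*(\tilde M) \left[ \tilde M^* E_i \tilde M - \bar\sigma^2(\tilde M) E_i \right] \bar y(\tilde M)
\end{aligned}$$

Using equation (3.4) this becomes

$$\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i} = \frac{2 \bar\sigma^2(\tilde M)}{d_i} \left[ \bar x^*(\tilde M) E_i\,\bar x(\tilde M) - \bar y^*(\tilde M) E_i\,\bar y(\tilde M) \right] = \frac{2 \bar\sigma^2(\tilde M)}{d_i} \left( |\bar x_i(\tilde M)|^2 - |\bar y_i(\tilde M)|^2 \right)$$

Now,

$$\frac{\partial \bar\sigma^2(\tilde M)}{\partial d_i} = 2 \bar\sigma(\tilde M)\,\frac{\partial \bar\sigma(\tilde M)}{\partial d_i}$$

such that an expression for the partial derivative of $\bar\sigma(DMD^{-1})$ with respect to the diagonal element $d_i$ of matrix $D$ is given by

$$\frac{\partial \bar\sigma(DMD^{-1})}{\partial d_i} = \frac{\bar\sigma(DMD^{-1})}{d_i} \left[ |\bar x_i(DMD^{-1})|^2 - |\bar y_i(DMD^{-1})|^2 \right] \tag{3.15}$$

When the maximum singular value $\bar\sigma(DMD^{-1})$ is distinct, the major output principal direction $\bar x(DMD^{-1})$ and the major input principal direction $\bar y(DMD^{-1})$ are determined up to a scaling factor $e^{j\theta}$ by the left singular vector $x_1(DMD^{-1})$ and right singular vector $y_1(DMD^{-1})$, respectively.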
Therefore, $|\bar x_i(DMD^{-1})|$ and $|\bar y_i(DMD^{-1})|$ are unique and the partial derivatives (3.15) exist for $i = 1, 2, \ldots, n$ such that the gradient is given by

$$\nabla f(d) = \nabla \bar\sigma(DMD^{-1}) = \bar\sigma(DMD^{-1})\,D^{-1} \left[ |\bar x(DMD^{-1})|^2 - |\bar y(DMD^{-1})|^2 \right] \tag{3.16}$$

where the absolute value $|\cdot|$ is considered an element-by-element operator when applied to a vector. As the preceding development has verified, when the maximum singular value is distinct the gradient of the objective function $f(d) = \bar\sigma(DMD^{-1})$ exists and is given by (3.16). In addition, the subdifferential is given by (3.11). When the maximum singular value is repeated the gradient no longer exists, but it is possible to determine the subdifferential and therefore a steepest descent direction. This is the main theoretical result of this paper and is given in the next section.

3.3. Main Result: Characterization of the Subdifferential when the Maximum Singular Value is Repeated

3.3.1. General Expression for the Subdifferential

When the maximum singular value is repeated, the major output principal direction $\bar x(DMD^{-1})$ and the major input principal direction $\bar y(DMD^{-1})$ are determined by (3.5) and (3.6). As such, the expression for the gradient given by (3.16) may not be uniquely determined, which implies the objective function may not be differentiable. When the function is nondifferentiable the subdifferential must be determined, as opposed to the gradient. To characterize the subdifferential define the function

$$\nabla f_u(d;u) = \bar\sigma(DMD^{-1})\,D^{-1} \left[ \left| \sum_{i=1}^{r} x_i(DMD^{-1})\,u_i \right|^2 - \left| \sum_{i=1}^{r} y_i(DMD^{-1})\,u_i \right|^2 \right] \tag{3.17}$$

where $u \in C^r$ satisfies $u^*u = 1$, and $\{x_i(DMD^{-1})\}$ for $i = 1, 2, \ldots, r$ is an orthonormal set of left singular vectors and $\{y_i(DMD^{-1})\}$ for $i = 1, 2, \ldots, r$ is an orthonormal set of right singular vectors corresponding to the maximum singular value $\bar\sigma(DMD^{-1})$ of multiplicity $r$. Definition (3.17) represents the evaluation of the gradient function (3.16) for possible values of $\bar x(DMD^{-1})$ and $\bar y(DMD^{-1})$. For different $u$, the function $\nabla f_u(d;u)$ may give different values, such that the gradient is not unique and is therefore undefined.
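The gradient expression (3.16) can be checked numerically against central finite differences. The following is a minimal sketch, assuming NumPy and a randomly generated complex test matrix (which has a distinct maximum singular value with probability one); the helper names `f` and `grad_f` are illustrative and not from the source.

```python
import numpy as np

def f(M, d):
    """Objective f(d) = max singular value of D M D^{-1}, with D = diag(d)."""
    scaled = (d[:, None] * M) / d[None, :]
    return np.linalg.svd(scaled, compute_uv=False)[0]

def grad_f(M, d):
    """Gradient (3.16); only meaningful when the maximum singular value is distinct."""
    scaled = (d[:, None] * M) / d[None, :]
    U, s, Vh = np.linalg.svd(scaled)
    x_bar = U[:, 0]           # major output principal direction
    y_bar = Vh[0, :].conj()   # major input principal direction
    return s[0] * (np.abs(x_bar) ** 2 - np.abs(y_bar) ** 2) / d

# Hypothetical test data (not from the dissertation).
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
d = rng.uniform(0.5, 2.0, n)

g_analytic = grad_f(M, d)

# Central finite differences, one coordinate of d at a time.
h = 1e-6
g_numeric = np.array([(f(M, d + h * e) - f(M, d - h * e)) / (2 * h)
                      for e in np.eye(n)])

print("max abs difference:", np.max(np.abs(g_analytic - g_numeric)))
```

Because the principal directions have unit norm, the analytic gradient also satisfies $d^T \nabla f(d) = 0$, which reflects the invariance of $f$ under a common rescaling of $d$.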
The subdifferential is now characterized in the following theorem.

Theorem 3.6. The subdifferential of the function $f(d) = \bar\sigma(DMD^{-1})$ is given by

$$\partial f(d) = \operatorname{conv}\{ \nabla f_u(d;u) \mid u^*u = 1 \} \tag{3.18}$$

where $\nabla f_u(d;u)$ is defined by (3.17).

Proof. From Theorem 3.3, the subdifferential is given by

$$\partial f(d) = \operatorname{conv}\{ z \in R^n \mid z = \lim_{k \to \infty} \nabla f(d_k),\; d_k \to d,\; d_k \in \mathcal{D} \}$$

where $\mathcal{D}$ is the set of points where $f(d) = \bar\sigma(DMD^{-1})$ is differentiable (i.e., the maximum singular value is distinct). The gradients $\nabla f(d_k)$ are given by (3.16), and are determined from the sequence of major output principal directions $\bar x_k(D_k M D_k^{-1})$ and major input principal directions $\bar y_k(D_k M D_k^{-1})$, which are uniquely determined up to a multiple of $e^{j\theta}$. From the perturbation theory of matrices (Lancaster and Tismenetsky, 1985), analytic perturbations of normal matrices (i.e., $(DMD^{-1})^*(DMD^{-1})$) have continuous eigenvalues and eigenvectors in a neighborhood of the perturbation. Now, the right singular vectors $\{y_i(DMD^{-1})\}$ are an orthonormal eigenbasis of $(DMD^{-1})^*(DMD^{-1})$. Therefore, the right singular vectors are continuous in $D$. This implies that for points where the maximum singular value is nondifferentiable, each major input principal direction $\bar y(DMD^{-1})$ (all of which are given by (3.6)) is the limit of a sequence of major input principal directions $\bar y_k(D_k M D_k^{-1})$ that correspond to a maximum singular value that is differentiable. In addition the converse is true: a sequence of major input principal directions $\bar y_k(D_k M D_k^{-1})$ that corresponds to a sequence of maximum singular values $\bar\sigma_k(D_k M D_k^{-1})$ that are differentiable converges to a major input principal direction $\bar y(DMD^{-1})$ given by (3.6) if the sequence $\bar\sigma_k(D_k M D_k^{-1})$ converges to the maximum singular value $\bar\sigma(DMD^{-1})$. Similar arguments can be made for the left singular vectors and major output principal directions. Therefore, for all $d_k \in \mathcal{D}$, as $d_k \to d$ then $\bar x_k(D_k M D_k^{-1}) \to \bar x(DMD^{-1})$ and $\bar y_k(D_k M D_k^{-1}) \to \bar y(DMD^{-1})$, which are given by (3.5) and (3.6) with $u^*u = 1$.
In addition, for every point where the maximum singular value is nondifferentiable, $\bar x(DMD^{-1})$ and $\bar y(DMD^{-1})$ are given by (3.5) and (3.6), and there exists a sufficiently small perturbation of $d$ such that there exist sequences $\bar x_k(D_k M D_k^{-1}) \to \bar x(DMD^{-1})$ and $\bar y_k(D_k M D_k^{-1}) \to \bar y(DMD^{-1})$ which are uniquely determined up to a multiple of $e^{j\theta}$, such that the gradients $\nabla f(d_k)$ exist. All that is left is to define $\nabla f_u(d;u)$ by (3.17), which represents the limit of $\nabla f(d_k)$ as $d_k \to d$ for some $d_k \in \mathcal{D}$, where all $u$ such that $u^*u = 1$ represent all possible limit sequences $d_k \to d$ for $d_k \in \mathcal{D}$. Q.E.D.

Theorem 3.6 is the natural extension of the gradient result given in Section 3.2.5. For the case when the maximum singular value is distinct, $\nabla f_u(d;u) = \nabla f(d)$ for all $u^*u = 1$ (i.e., $u = u_1 = e^{j\theta}$) such that $\partial f(d) = \operatorname{conv}\{\nabla f(d)\} = \{\nabla f(d)\}$ as expected. On the other hand, when the maximum singular value is repeated $\nabla f(d)$ does not exist. Instead one has $\nabla f_u(d;u)$, which is an extension of equation (3.16) for $\nabla f(d)$, in that $\nabla f_u(d;u)$ represents the vector obtained when equation (3.16) is evaluated at one of the possible major output and input principal directions given by (3.5) and (3.6). Since it is an element of $\partial f(d)$, $\nabla f_u(d;u)$ is a subgradient. In fact, $\nabla f_u(d;u)$ represents a subgradient that is arbitrarily close to some $\nabla f(d_k)$ where $d_k \to d$. That is, $\nabla f_u(d;u)$ for $u^*u = 1$ represents the boundary of $\partial f(d)$.

Note that a repeated maximum singular value does not necessarily guarantee nondifferentiability. Consider the matrix $M = I$, where $f(d) = \bar\sigma(DMD^{-1}) = \bar\sigma(DID^{-1}) = \bar\sigma(I) = 1$ with multiplicity $n$ independent of $d$. The function $\nabla f_u(d;u) = 0$ for all $d$ and $u$, such that $\partial f(d) = \{0\} = \{\nabla f(d)\}$, where the gradient exists and is identically zero. This is an extreme case where the set given by (3.18) degenerates to a point. This and other degenerate cases should be taken into consideration when using Theorem 3.6.

3.3.2. Characterization of the Subdifferential as an Ellipsoid.
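When the maximum singular value is distinct, the subdifferential is given by the point $\partial f(d) = \{\nabla f_u(d;u)\} = \{\nabla f(d)\}$, where $\nabla f(d)$ is given in Section 3.2.5. The next step is to explore the case when the maximum singular value is repeated once (i.e., has multiplicity 2). In this section, it is shown that $\partial f(d)$ is an ellipsoid in this case by examining the properties of the function $\nabla f_u(d;u)$.

To simplify the notation define the vector valued function $g: C^2 \to R^n$ as

$$g(u) := \nabla f_u(d;u)$$

where $d$ is a fixed point such that the maximum singular value $f(d) = \bar\sigma(DMD^{-1})$ has multiplicity 2. From Theorem 3.6, this gives

$$\partial f(d) = \operatorname{conv}\{ g(u) \mid u^*u = 1,\; u \in C^2 \} \tag{3.19}$$

To analyze (3.19), $g(u)$ is expressed in terms of $u = [\,|u_1| e^{j\angle u_1} \;\; |u_2| e^{j\angle u_2}\,]^T$ as

$$g(u) = H \begin{bmatrix} |u_1|^2 \\ \cos(\angle u_1 - \angle u_2)\,|u_1||u_2| \\ \sin(\angle u_1 - \angle u_2)\,|u_1||u_2| \\ |u_2|^2 \end{bmatrix} \tag{3.20}$$

where the elements of the constant matrix $H$ are given by

$$h_{i,1} = \frac{\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i1}|^2 - |y_{i1}|^2 \right) \tag{3.21}$$

$$h_{i,2} = \frac{2\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i1}||x_{i2}| \cos(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}| \cos(\angle y_{i1} - \angle y_{i2}) \right) \tag{3.22}$$

$$h_{i,3} = -\frac{2\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i1}||x_{i2}| \sin(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}| \sin(\angle y_{i1} - \angle y_{i2}) \right) \tag{3.23}$$

and

$$h_{i,4} = \frac{\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i2}|^2 - |y_{i2}|^2 \right) \tag{3.24}$$

for $i = 1, 2, \ldots, n$ with

$$x_1 = \begin{bmatrix} x_{11} \\ x_{21} \\ \vdots \\ x_{n1} \end{bmatrix}, \quad x_2 = \begin{bmatrix} x_{12} \\ x_{22} \\ \vdots \\ x_{n2} \end{bmatrix} \tag{3.25}$$

an orthonormal set of left singular vectors and

$$y_1 = \begin{bmatrix} y_{11} \\ y_{21} \\ \vdots \\ y_{n1} \end{bmatrix}, \quad y_2 = \begin{bmatrix} y_{12} \\ y_{22} \\ \vdots \\ y_{n2} \end{bmatrix} \tag{3.26}$$

an orthonormal set of right singular vectors corresponding to the maximum singular value $\bar\sigma(DMD^{-1})$ of multiplicity 2.

Equation (3.20) is obtained from equation (3.17) when the maximum singular value has multiplicity 2, i.e.

$$g(u) = \nabla f_u(d;u) = \bar\sigma(DMD^{-1})\,D^{-1} \left[ |x_1 u_1 + x_2 u_2|^2 - |y_1 u_1 + y_2 u_2|^2 \right]$$

Using (3.25) for $\{x_1, x_2\}$ and (3.26) for $\{y_1, y_2\}$, and considering one element of $g(u)$, the law of cosines gives

$$\begin{aligned}
g_i(u) = \frac{\bar\sigma(DMD^{-1})}{d_i} \Big[ &|x_{i1}u_1|^2 + 2|x_{i1}u_1||x_{i2}u_2| \cos(\angle(x_{i1}u_1) - \angle(x_{i2}u_2)) + |x_{i2}u_2|^2 \\
&- |y_{i1}u_1|^2 - 2|y_{i1}u_1||y_{i2}u_2| \cos(\angle(y_{i1}u_1) - \angle(y_{i2}u_2)) - |y_{i2}u_2|^2 \Big]
\end{aligned}$$

Using trigonometric and complex number identities gives

$$\begin{aligned}
g_i(u) = {} & \frac{\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i1}|^2 - |y_{i1}|^2 \right) |u_1|^2 \\
& + \frac{2\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i1}||x_{i2}| \cos(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}| \cos(\angle y_{i1} - \angle y_{i2}) \right) \cos(\angle u_1 - \angle u_2)\,|u_1||u_2| \\
& - \frac{2\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i1}||x_{i2}| \sin(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}| \sin(\angle y_{i1} - \angle y_{i2}) \right) \sin(\angle u_1 - \angle u_2)\,|u_1||u_2| \\
& + \frac{\bar\sigma(DMD^{-1})}{d_i} \left( |x_{i2}|^2 - |y_{i2}|^2 \right) |u_2|^2
\end{aligned}$$

which is of the form (3.20) where the elements of $H$ are defined by (3.21)-(3.24), respectively.

There are now three cases to consider. The first case is when $n = 2$. This is a trivial case, in that the optimal similarity scaling is given by the Perron scaling. Therefore, there is no need to further investigate the properties of the subdifferential when $n = 2$. The other two cases are when $n = 3$ and when $n > 3$. As will be discussed shortly, the case when $n = 3$ is a degenerate case of the more general case $n > 3$. Therefore the case when $n > 3$ will be discussed next, followed by the case when $n = 3$.

The first result is that the subdifferential given by (3.19) is contained within an affine set of dimension 3. The result is stated in the following theorem.

Theorem 3.7. For $n > 3$ and $d$ such that the maximum singular value $f(d) = \bar\sigma(DMD^{-1})$ has multiplicity 2, the subdifferential $\partial f(d)$ is contained in the affine set

$$S = \{ z \in R^n \mid Pz = q \}$$

where the elements of the $(n-3) \times n$ matrix

$$P = \begin{bmatrix} p_{1,1} & p_{1,2} & p_{1,3} & 1 & 0 & \cdots & 0 \\ p_{2,1} & p_{2,2} & p_{2,3} & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ p_{n-3,1} & p_{n-3,2} & p_{n-3,3} & 0 & 0 & \cdots & 1 \end{bmatrix}$$

and the vector $q = [\,q_1 \;\; q_2 \;\; \cdots \;\; q_{n-3}\,]^T$ satisfy

$$\begin{bmatrix} p_{i,1} \\ p_{i,2} \\ p_{i,3} \\ q_i \end{bmatrix} = \begin{bmatrix} h_{1,1} & h_{2,1} & h_{3,1} & -1 \\ h_{1,2} & h_{2,2} & h_{3,2} & 0 \\ h_{1,3} & h_{2,3} & h_{3,3} & 0 \\ h_{1,4} & h_{2,4} & h_{3,4} & -1 \end{bmatrix}^{-1} \begin{bmatrix} -h_{i+3,1} \\ -h_{i+3,2} \\ -h_{i+3,3} \\ -h_{i+3,4} \end{bmatrix} \tag{3.27}$$

for $i = 1, 2, \ldots, n-3$, and where $h_{i,j}$ are given by (3.21)-(3.24). Additionally, the dimension of $S$ is 3.

Proof. It is sufficient to show that every $g(u)$ given by (3.20) with $u^*u = 1$ is an element of the affine set $S = \{ z \in R^n \mid Pz = q \}$, because if a set (i.e., $\{ g(u) \mid u^*u = 1 \}$) is contained within an affine set (i.e., $S$) then the convex hull of the set (i.e., $\partial f(d) = \operatorname{conv}\{ g(u) \mid u^*u = 1 \}$) is also contained within the affine set. Therefore, it must be shown that for all $u^*u = 1$, $g(u)$ satisfies each of the $n-3$ linear equations that define the affine set. The first linear equation is

$$[\,p_{1,1} \;\; p_{1,2} \;\; p_{1,3} \;\; 1 \;\; 0 \;\; \cdots \;\; 0\,]\,g(u) = q_1$$

which must hold for all $u^*u = 1$.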
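This becomes

$$p_{1,1} g_1(u) + p_{1,2} g_2(u) + p_{1,3} g_3(u) + g_4(u) = q_1$$

which from equation (3.20) is equivalent to

$$\begin{aligned}
& p_{1,1} \left( h_{1,1}|u_1|^2 + h_{1,2} \cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{1,3} \sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{1,4}|u_2|^2 \right) \\
& + p_{1,2} \left( h_{2,1}|u_1|^2 + h_{2,2} \cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{2,3} \sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{2,4}|u_2|^2 \right) \\
& + p_{1,3} \left( h_{3,1}|u_1|^2 + h_{3,2} \cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{3,3} \sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{3,4}|u_2|^2 \right) \\
& + \left( h_{4,1}|u_1|^2 + h_{4,2} \cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{4,3} \sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{4,4}|u_2|^2 \right) = q_1
\end{aligned}$$

or, collecting terms,

$$\begin{aligned}
& (p_{1,1}h_{1,1} + p_{1,2}h_{2,1} + p_{1,3}h_{3,1} + h_{4,1})\,|u_1|^2 \\
& + (p_{1,1}h_{1,2} + p_{1,2}h_{2,2} + p_{1,3}h_{3,2} + h_{4,2}) \cos(\angle u_1 - \angle u_2)\,|u_1||u_2| \\
& + (p_{1,1}h_{1,3} + p_{1,2}h_{2,3} + p_{1,3}h_{3,3} + h_{4,3}) \sin(\angle u_1 - \angle u_2)\,|u_1||u_2| \\
& + (p_{1,1}h_{1,4} + p_{1,2}h_{2,4} + p_{1,3}h_{3,4} + h_{4,4})\,|u_2|^2 = q_1
\end{aligned} \tag{3.28}$$

Now, equation (3.27) gives

$$\begin{bmatrix} h_{1,1} & h_{2,1} & h_{3,1} & -1 \\ h_{1,2} & h_{2,2} & h_{3,2} & 0 \\ h_{1,3} & h_{2,3} & h_{3,3} & 0 \\ h_{1,4} & h_{2,4} & h_{3,4} & -1 \end{bmatrix} \begin{bmatrix} p_{i,1} \\ p_{i,2} \\ p_{i,3} \\ q_i \end{bmatrix} = \begin{bmatrix} -h_{i+3,1} \\ -h_{i+3,2} \\ -h_{i+3,3} \\ -h_{i+3,4} \end{bmatrix}$$

such that (3.28) becomes

$$(q_1 - h_{4,1} + h_{4,1})\,|u_1|^2 + (-h_{4,2} + h_{4,2}) \cos(\angle u_1 - \angle u_2)\,|u_1||u_2| + (-h_{4,3} + h_{4,3}) \sin(\angle u_1 - \angle u_2)\,|u_1||u_2| + (q_1 - h_{4,4} + h_{4,4})\,|u_2|^2 = q_1$$

or

$$|u_1|^2 + |u_2|^2 = 1$$

which holds for all $u^*u = 1$. Hence, for all $u^*u = 1$, $g(u)$ satisfies the first linear equation that defines $S$. In fact, the preceding arguments hold for all $n-3$ linear equations that define $S$. Therefore, $g(u)$ is contained within the affine set $S$ for all $u^*u = 1$, implying that the subdifferential is contained within $S$. Finally, the $n-3$ linearly independent rows of $P$ are a basis for the orthogonal complement of the subspace parallel to $S$, such that the dimension of the affine set $S$ is 3. Q.E.D.

Theorem 3.7 implies that the last $n-3$ terms of $g(u)$ can be expressed as affine functions of the first 3 terms of $g(u)$, such that the subdifferential $\partial f(d)$ is a 3-dimensional convex set in an $n$-dimensional space. This means that the first 3 terms of $g(u)$ (i.e., $\{g_1(u), g_2(u), g_3(u)\}$) describe $\partial f(d)$. Therefore, to complete the characterization of $\partial f(d)$ it is only necessary to investigate $\operatorname{conv}\{ [\,g_1(u) \;\; g_2(u) \;\; g_3(u)\,]^T \mid u^*u = 1 \}$, and then translate this 3-dimensional set to $R^n$ using the affine functions given in Theorem 3.7.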
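The construction in Theorem 3.7 can be exercised numerically. The sketch below is an illustration under stated assumptions, not the author's implementation: it uses NumPy, takes $D = I$, and builds hypothetical singular vectors from random orthonormal pairs (so that $M = XY^*$ has $\sigma_1 = \sigma_2 = 1$). It forms $H$ from (3.21)-(3.24), with the angle differences taken as $\angle x_{i1} - \angle x_{i2}$ and $\angle u_1 - \angle u_2$, solves (3.27) for $P$ and $q$, and checks that sampled values of $g(u)$ satisfy both (3.20) and $Pg(u) = q$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

def orthonormal_pair(n):
    """Two random orthonormal complex vectors as the columns of an n x 2 array."""
    Z = rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2))
    Q, _ = np.linalg.qr(Z)
    return Q

# Hypothetical data: columns of X are x_1, x_2 and columns of Y are y_1, y_2.
X, Y = orthonormal_pair(n), orthonormal_pair(n)
sigma, d = 1.0, np.ones(n)

def g(u):
    """Equation (3.17) with multiplicity r = 2."""
    return sigma * (np.abs(X @ u) ** 2 - np.abs(Y @ u) ** 2) / d

# Elements of H, equations (3.21)-(3.24).
cx = X[:, 0] * X[:, 1].conj()   # |x_i1||x_i2| exp(j(angle x_i1 - angle x_i2))
cy = Y[:, 0] * Y[:, 1].conj()
H = (sigma / d)[:, None] * np.column_stack([
    np.abs(X[:, 0]) ** 2 - np.abs(Y[:, 0]) ** 2,
    2.0 * (cx.real - cy.real),
    -2.0 * (cx.imag - cy.imag),
    np.abs(X[:, 1]) ** 2 - np.abs(Y[:, 1]) ** 2,
])

def basis(u):
    """The 4-vector multiplying H in (3.20)."""
    beta = np.angle(u[0]) - np.angle(u[1])
    a, b = np.abs(u[0]), np.abs(u[1])
    return np.array([a * a, np.cos(beta) * a * b, np.sin(beta) * a * b, b * b])

# Solve (3.27) for all i = 1..n-3 at once.
A4 = np.column_stack([H[:3].T, [-1.0, 0.0, 0.0, -1.0]])
sol = np.linalg.solve(A4, -H[3:].T)            # shape 4 x (n - 3)
P = np.hstack([sol[:3].T, np.eye(n - 3)])
q = sol[3]

def random_unit_u():
    u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    return u / np.linalg.norm(u)

samples = [random_unit_u() for _ in range(100)]
err_H = max(np.max(np.abs(g(u) - H @ basis(u))) for u in samples)
err_S = max(np.max(np.abs(P @ g(u) - q)) for u in samples)
print("max error in (3.20):", err_H)
print("max error in P g(u) = q:", err_S)
```

Solving the $4 \times 4$ system once with all right-hand sides stacked avoids forming the explicit inverse that appears in (3.27).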
The convex hull $\operatorname{conv}\{ [\,g_1(u) \;\; g_2(u) \;\; g_3(u)\,]^T \mid u^*u = 1 \}$ is now shown to be a 3-dimensional ellipsoid, and thus $\partial f(d)$ is a 3-dimensional ellipsoid. Consider equation (3.20). Even though $u = [\,|u_1| e^{j\angle u_1} \;\; |u_2| e^{j\angle u_2}\,]^T$ has 4 parameters (i.e., $|u_1|$, $|u_2|$, $\angle u_1$, and $\angle u_2$), the function $g(u)$ with $u^*u = 1$ is a function of only 2 parameters. One of the parameters is $\chi_1 = |u_1|$ and the other is $\theta_1 = \angle u_1 - \angle u_2$. The reason $|u_2|$ is not a third parameter is that $u^*u = 1$ necessarily requires $|u_2| = \sqrt{1 - |u_1|^2}$. Now consider a fixed value of $\chi_1$. The terms $\{g_1(u), g_2(u), g_3(u)\}$ are of the form

$$\begin{aligned}
g_1(u) &= e_{1,1} + e_{1,2} \cos(\theta_1) + e_{1,3} \sin(\theta_1) \\
g_2(u) &= e_{2,1} + e_{2,2} \cos(\theta_1) + e_{2,3} \sin(\theta_1) \\
g_3(u) &= e_{3,1} + e_{3,2} \cos(\theta_1) + e_{3,3} \sin(\theta_1)
\end{aligned}$$

which is a parametric representation of a 2-dimensional ellipse in a 3-dimensional space centered at $[\,e_{1,1} \;\; e_{2,1} \;\; e_{3,1}\,]^T$. To satisfy $u^*u = 1$, $|u_1|$ must be an element of $[0,1]$. Therefore, varying $\chi_1$ over its admissible range of 0 to 1 generates a set of ellipses which form the surface of an ellipsoid. This ellipsoid is given by

$$E = \{ z \in R^3 \mid (z - c)^T B (z - c) = 1 \} \tag{3.29}$$

The center $c$ of $E$ is given by

$$c = \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = 0.5 \begin{bmatrix} h_{1,1} + h_{1,4} \\ h_{2,1} + h_{2,4} \\ h_{3,1} + h_{3,4} \end{bmatrix} \tag{3.30}$$

and the matrix $B$, which characterizes the length of the axes of $E$ and its orientation, has the form

$$B = \begin{bmatrix} b_{1,1} & b_{1,2} & b_{1,3} \\ b_{1,2} & b_{2,2} & b_{2,3} \\ b_{1,3} & b_{2,3} & b_{3,3} \end{bmatrix} \tag{3.31}$$

where the 6 parameters $\{b_{1,1}, b_{2,2}, b_{3,3}, b_{1,2}, b_{1,3}, b_{2,3}\}$ can also be expressed in terms of the constants $h_{i,j}$. These expressions can be obtained by picking six different values of $u$ with $u^*u = 1$, setting $z = [\,g_1(u) \;\; g_2(u) \;\; g_3(u)\,]^T$, and then solving the resulting system of six linear equations in terms of $\{b_{1,1}, b_{2,2}, b_{3,3}, b_{1,2}, b_{1,3}, b_{2,3}\}$ obtained from (3.29). Unfortunately these expressions are very cumbersome, and therefore in practice it is easier to solve the system of six linear equations resulting from the numerical data of the particular problem.

The following theorem combines Theorem 3.7 and the above result that $\{g_1(u), g_2(u), g_3(u)\}$ with $u^*u = 1$ is an ellipsoid to give a useful characterization of $\partial f(d)$.
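The six-point procedure for $B$ described above can be sketched as follows. This is a hedged illustration assuming NumPy, $D = I$, and hypothetical random orthonormal singular vectors; the helper names (`z`, `row`, `random_unit_u`) are not from the source. Each sampled point on the ellipsoid surface contributes one linear equation in the six unknown entries of the symmetric matrix $B$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

def orthonormal_pair(n):
    """Two random orthonormal complex vectors as the columns of an n x 2 array."""
    Z = rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2))
    Q, _ = np.linalg.qr(Z)
    return Q

# Hypothetical singular vectors for a maximum singular value of multiplicity 2.
X, Y = orthonormal_pair(n), orthonormal_pair(n)
sigma, d = 1.0, np.ones(n)

def z(u):
    """First three entries of g(u), equation (3.17) with r = 2."""
    return (sigma * (np.abs(X @ u) ** 2 - np.abs(Y @ u) ** 2) / d)[:3]

def random_unit_u():
    u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    return u / np.linalg.norm(u)

# Center (3.30): c_i = 0.5 * (h_{i,1} + h_{i,4}) for i = 1, 2, 3.
h1 = sigma / d * (np.abs(X[:, 0]) ** 2 - np.abs(Y[:, 0]) ** 2)
h4 = sigma / d * (np.abs(X[:, 1]) ** 2 - np.abs(Y[:, 1]) ** 2)
c = 0.5 * (h1 + h4)[:3]

def row(w):
    """Coefficients of (b11, b22, b33, b12, b13, b23) in the quadratic form w^T B w."""
    return np.array([w[0] ** 2, w[1] ** 2, w[2] ** 2,
                     2 * w[0] * w[1], 2 * w[0] * w[2], 2 * w[1] * w[2]])

# Six sample points on the surface give six linear equations w^T B w = 1.
A6 = np.array([row(z(random_unit_u()) - c) for _ in range(6)])
b11, b22, b33, b12, b13, b23 = np.linalg.solve(A6, np.ones(6))
B = np.array([[b11, b12, b13],
              [b12, b22, b23],
              [b13, b23, b33]])

# Fresh samples should also lie on the fitted surface (3.29).
resid = max(abs((z(u) - c) @ B @ (z(u) - c) - 1.0)
            for u in (random_unit_u() for _ in range(200)))
print("max residual of (3.29) on fresh samples:", resid)
```

If the six sample points happen to be poorly placed, the $6 \times 6$ system can be ill conditioned; drawing more than six points and solving in the least-squares sense is one possible safeguard.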
Theorem 3.8. For $n > 3$ and $d$ such that the maximum singular value $f(d) = \bar\sigma(DMD^{-1})$ has multiplicity 2, the subdifferential $\partial f(d)$ is given by

$$\partial f(d) = \{ z \in R^n \mid Pz = q,\; ([\,z_1 \;\; z_2 \;\; z_3\,]^T - c)^T B\,([\,z_1 \;\; z_2 \;\; z_3\,]^T - c) \le 1 \}$$

where the constants $P$ and $q$ are given by (3.27) and $c$ and $B$ by (3.30) and (3.31).

Proof. The subdifferential $\partial f(d)$ is just the convex hull of the 3-dimensional ellipsoid $E$ given by (3.29), translated to $R^n$ by making it an element of the affine set given by Theorem 3.7. The convex hull of $E$ is the union of itself and its interior, which is given by $\operatorname{conv} E = \{ z \in R^3 \mid (z - c)^T B (z - c) \le 1 \}$. Q.E.D.

To complete the ellipsoidal characterization of $\partial f(d)$ when the maximum singular value has multiplicity 2, the case of $n = 3$ is now discussed. When $n = 3$, $g_3(u)$ is an affine function of $g_1(u)$ and $g_2(u)$, such that $\operatorname{aff} \partial f(d)$ becomes a 2-dimensional plane in 3-dimensional space. The effect is that the 3-dimensional ellipsoid $E$ is degenerate in that it has an axis of length zero, because it is required to be a subset of a 2-dimensional plane. Consequently, $\operatorname{conv} E = E$ such that $\partial f(d)$ becomes a 2-dimensional ellipse including its interior in a 3-dimensional space. Also, $\partial f(d)$ has no interior in $R^3$ (i.e., there are no elements of $\partial f(d)$ that are not also on the boundary of $\partial f(d)$).

Finally, note that degenerate cases are possible. Consider the matrix $M = \operatorname{diag}([\,1 \;\; 1 \;\; 0 \;\; 0\,]^T)$, for which the maximum singular value has multiplicity 2. The above analysis gives $H = 0$ such that equation (3.27) is not meaningful. For this case $\partial f(d)$ is no longer contained within a 3-dimensional affine set, but is actually $\partial f(d) = \{0\}$, which is a special ellipsoid whose axes are all of length zero.

Theorem 3.8 and the preceding paragraph concerning the case of $n = 3$ give the desired ellipsoidal characterization of $\partial f(d)$ when the maximum singular value is repeated once. The next logical step is to extend the results of this section to the case when the maximum singular value is repeated more than once.
Unfortunately, the preceding ellipsoidal characterization no longer holds and the only characterization of $\partial f(d)$ is that given by Theorem 3.6. As is shown in the next section, this still has some utility in determining a steepest descent direction.

3.4. Determining the Steepest Descent Direction and Conditions for a Minimum

When the maximum singular value is distinct the gradient exists and the steepest descent direction is given by $-\nabla f(d) / \|\nabla f(d)\|$. Furthermore, the necessary and sufficient condition for a minimum is $\nabla f(d) = 0$. When the maximum singular value is repeated, the results of the previous section together with Theorem 3.4 and Theorem 3.5 can be used in a steepest descent optimization algorithm. First the case when the maximum singular value is repeated once is considered, because the ellipsoidal characterization of $\partial f(d)$ results in a convex optimization problem for determining the steepest descent direction. This is followed by the more general case when the maximum singular value is repeated more than once.

Using Theorem 3.5 and the ellipsoidal characterization of $\partial f(d)$ given by Theorem 3.8, the subgradient that gives the steepest descent direction is now given by the optimization

$$\xi_{sd}(d) = \arg\min_{\xi} \|\xi\|_2 \tag{3.32}$$

such that

$$P\xi = q \tag{3.33a}$$

and

$$([\,\xi_1 \;\; \xi_2 \;\; \xi_3\,]^T - c)^T B\,([\,\xi_1 \;\; \xi_2 \;\; \xi_3\,]^T - c) \le 1 \tag{3.33b}$$

Optimization (3.32) with constraints (3.33a) and (3.33b) represents the minimum distance from the origin to the ellipsoid $\partial f(d)$. The objective function of optimization (3.32) is convex in the $n$ parameters $\{\xi_1, \xi_2, \ldots, \xi_n\}$ and the constraints (3.33a) and (3.33b) are convex sets. This $n$-dimensional optimization can be reduced to a 3-dimensional optimization by incorporating the equality constraints (3.33a) into the objective function (3.32), because by Theorem 3.7, (3.33a) implies that $\{\xi_4, \xi_5, \ldots, \xi_n\}$ are affine functions of $\{\xi_1, \xi_2, \xi_3\}$.
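One way to solve the minimum-norm problem (3.32)-(3.33) numerically is sketched below. The method (eliminating $\xi_4, \ldots, \xi_n$ through $P\xi = q$, then bisecting on the Lagrange multiplier of the ellipsoid constraint) is a choice made for this illustration and is not stated in the source; it assumes NumPy, a positive definite $B$, and hypothetical numerical data. The function name `min_norm_subgradient` is likewise illustrative.

```python
import numpy as np

def min_norm_subgradient(P3, q, B, c, iters=200):
    """Minimum-norm point of {xi : P xi = q, (xi[:3]-c)^T B (xi[:3]-c) <= 1}.

    P3 holds the first three columns of P, so that xi[3:] = q - P3 @ xi[:3].
    """
    A = np.eye(3) + P3.T @ P3          # Hessian of ||z||^2 + ||q - P3 z||^2 (up to 2)
    b = P3.T @ q

    def phi(z):
        return (z - c) @ B @ (z - c)   # ellipsoid constraint value

    z = np.linalg.solve(A, b)          # unconstrained minimizer
    if phi(z) > 1.0:
        # Constraint is active: KKT gives (A + lam B) z = b + lam B c, lam > 0.
        def z_of(lam):
            return np.linalg.solve(A + lam * B, b + lam * (B @ c))

        lo, hi = 0.0, 1.0
        while phi(z_of(hi)) > 1.0:     # phi decreases as lam grows
            hi *= 2.0
        for _ in range(iters):         # bisection on the multiplier
            mid = 0.5 * (lo + hi)
            if phi(z_of(mid)) > 1.0:
                lo = mid
            else:
                hi = mid
        z = z_of(hi)
    return np.concatenate([z, q - P3 @ z])

# Hypothetical data with n = 4 (one affine constraint); not from the dissertation.
P3_demo = np.array([[0.5, -1.0, 0.25]])
q_demo = np.array([0.3])
B_demo = np.diag([4.0, 9.0, 16.0])
c_demo = np.array([1.0, 0.5, -0.2])

xi_sd = min_norm_subgradient(P3_demo, q_demo, B_demo, c_demo)
direction = -xi_sd / np.linalg.norm(xi_sd)   # steepest descent direction (Theorem 3.5)
print("xi_sd =", xi_sd)
print("steepest descent direction =", direction)
```

If the returned vector is zero, the origin belongs to the set and, by Theorem 3.4, the point is optimal, so no descent direction should be formed.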
The optimization given by (3.32) and (3.33) becomes

$$\xi_{sd}(d) = \arg\min \|\xi\|_2^2 = \arg\min_{\xi_1, \xi_2, \xi_3} \left[ \xi_1^2 + \xi_2^2 + \xi_3^2 + \sum_{i=1}^{n-3} \left( q_i - p_{i,1}\xi_1 - p_{i,2}\xi_2 - p_{i,3}\xi_3 \right)^2 \right] \tag{3.34}$$

such that

$$([\,\xi_1 \;\; \xi_2 \;\; \xi_3\,]^T - c)^T B\,([\,\xi_1 \;\; \xi_2 \;\; \xi_3\,]^T - c) \le 1 \tag{3.35}$$

where the terms $\{\xi_4, \xi_5, \ldots, \xi_n\}$ of $\xi_{sd}(d)$ are obtained from the affine functions of $\{\xi_1, \xi_2, \xi_3\}$. The objective function of optimization (3.34) is a positive semidefinite quadratic function and is therefore convex. In addition, the constraint (3.35) is a convex set. Therefore, determining the steepest descent direction when the maximum singular value is repeated once reduces to a simple 3-dimensional convex quadratic optimization over a convex set.

Finally, from Theorem 3.8 the necessary and sufficient condition for a minimum, i.e. $0 \in \partial f(d)$, reduces to

$$c^T B c \le 1, \quad q = 0 \tag{3.36}$$

because $[\,z_1 \;\; z_2 \;\; z_3\,]^T = 0$ must be an element of the ellipsoid, and when $[\,z_1 \;\; z_2 \;\; z_3\,]^T = 0$ the terms $\{z_4, z_5, \ldots, z_n\}$ are zero only when $q = 0$ (i.e., the affine set $S = \{ z \in R^n \mid Pz = q \}$ must pass through the origin).

Now consider the case when the maximum singular value is repeated more than once. From Theorem 3.5, the steepest descent direction is obtained from the smallest subgradient in the Euclidean norm. From Theorem 3.6, all subgradients are given by the convex hull of $\nabla f_u(d;u)$ for $u^*u = 1$, which means every subgradient can be expressed as the linear combination $\lambda \nabla f_u(d;u_1) + (1 - \lambda) \nabla f_u(d;u_2)$ with $u_1 \ne u_2$, $\lambda \in [0,1]$, $u_1^*u_1 = 1$ and $u_2^*u_2 = 1$. Therefore the optimization problem given by (3.12) to determine the subgradient used to obtain the steepest descent direction can be written in the form

$$\xi_{sd}(d) = \arg\min_{\lambda, u_1, u_2} \left\| \lambda \nabla f_u(d;u_1) + (1 - \lambda) \nabla f_u(d;u_2) \right\| \tag{3.37a}$$

with the constraints

$$u_1 \ne u_2, \quad \lambda \in [0,1], \quad u_1^*u_1 = 1 \quad \text{and} \quad u_2^*u_2 = 1 \tag{3.37b}$$

Unfortunately, the objective function of optimization (3.37) is nonconvex in the components of the complex vectors $u_1$ and $u_2$, and therefore has all of the associated difficulties, such as local versus global minima. In addition, from Theorem 3.4 the necessary and sufficient condition for a minimum is given by $\xi_{sd}(d) = 0$.

3.5.
Attainability of MPDA when the maximum singular value is repeated

When the maximum singular value is distinct, the necessary condition for an infimum of (3.13) is $\nabla f(d) = 0$, where the gradient is given by (3.16). This implies that the moduli of the major input and major output principal directions are elementwise equal. Furthermore, a unitary transformation matrix $U$ can be determined that shifts the angles of the elements of the input and output principal directions such that MPDA is achieved and the upper bound for $\mu$ is nonconservative. In general MPDA is not possible when the maximum singular value is repeated, and the upper bound on $\mu$ given by the infimum (3.13) is conservative. Therefore, the goal of this section is to determine sufficient conditions under which MPDA is attainable when the maximum singular value is repeated. These conditions are important because they result in a nonconservative upper bound for $\mu$.

A sufficient condition for attainability of MPDA is that there exists a major input and major output principal direction pair with elementwise equal moduli. This is equivalent to the existence of $u$ such that $\nabla f_u(d;u) = 0$. In contrast, the less stringent sufficient condition for a minimum is $0 \in \partial f(d)$, whereas the condition $\nabla f_u(d;u) = 0$ is equivalent to 0 being an element of the surface of $\partial f(d)$. For the case when the maximum singular value has multiplicity 2 this becomes the condition that 0 is on the surface of the ellipsoid. In other words

$$c^T B c = 1 \tag{3.38a}$$

and

$$q = 0 \tag{3.38b}$$

Equations (3.38a) and (3.38b) represent the sufficient conditions for attainability of MPDA when the maximum singular value has multiplicity 2. When the maximum singular value is repeated more than once the sufficient condition for attainability of MPDA becomes

$$\min_{u} \|\nabla f_u(d;u)\| = 0 \tag{3.39}$$

with $u^*u = 1$. Condition (3.39) is not as convenient as (3.38), but is still useful as a method for determining attainability of MPDA and thus the conservatism of the upper bound of $\mu$.

3.6.
Reconciling the Results with the PDA Results

The principal direction alignment (PDA) principle (Daniel et al., 1986) states that the infimum of (3.1) occurs at a stationary point of the largest singular value for which a stationary point exists, starting with the maximum singular value. If the maximum singular value is repeated then there is no stationary point (the maximum singular value is nondifferentiable), and an attempt is made to find a stationary point of the second largest singular value, and so on. This statement is not entirely accurate. Consider the case when, at the infimum, the singular value is repeated and therefore the gradient does not exist. As such the gradient cannot be 0 and there is no stationary point, but it is possible to have a repeated maximum singular value and still achieve MPDA, as demonstrated by Example 3.3. As such, the infimum occurs at a nonstationary point, contradicting the PDA theory.

The PDA theory can be rectified as follows. First, a more accurate statement than stating that the infimum occurs at a stationary point (i.e., when all the partial derivatives are zero) of a singular value is to state that the infimum occurs at a point where there exists a left and right singular vector pair with elementwise equal moduli. The work of the previous section gives the conditions under which it is possible to equate the moduli when a singular value is repeated. If the moduli can be equated, then MPDA is achieved; otherwise it is necessary to use the PDA algorithm by infimizing the next singular value.

3.7. Examples

The following three examples demonstrate the results of the previous sections. The first example shows how to determine the steepest descent direction. The second example demonstrates the conditions for a minimum. The third example illustrates the conditions for which MPDA is attainable.

3.7.1. Example 3.1.
Let $M = AB^*$, where

$$A = \begin{bmatrix}
0.1582 - 0.3074i & 0.3252 + 0.3078i \\
0.4198 - 0.5890i & 0.0843 - 0.0067i \\
0.2182 + 0.0182i & 0.7031 + 0.1455i \\
0.1039 - 0.4891i & 0.4090 + 0.0507i \\
0.0765 + 0.2315i & 0.3256 - 0.0301i
\end{bmatrix}$$

and

$$B = \begin{bmatrix}
0.3681 - 0.3181i & 0.2366 + 0.2711i \\
0.2708 + 0.0371i & 0.0536 + 0.3304i \\
0.4548 + 0.5280i & 0.0931 - 0.2255i \\
0.3127 - 0.1501i & 0.0013 - 0.0917i \\
0.2842 + 0.0444i & 0.8244 + 0.1044i
\end{bmatrix}$$

In performing the infimization $\inf \bar\sigma(DMD^{-1})$, consider the point $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ corresponding to $D = I$. The maximum singular value $\bar\sigma(DMD^{-1}) = \bar\sigma(M)$ is repeated (i.e., $\sigma_1(M) = \sigma_2(M) = 1$ with $\sigma_3(M) = \sigma_4(M) = \sigma_5(M) = 0$). Therefore, the objective function $f(d) = \bar\sigma(DMD^{-1})$ is nondifferentiable at $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ and the results of this chapter are used to efficiently solve the optimization by either determining a steepest descent direction from the point $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ or by determining if the point satisfies the optimality and MPDA conditions. First, the ellipsoidal characterization of the subdifferential is obtained using the method of Section 3.3.2.
An orthonormal set of right singular vectors corresponding to the repeated maximum singular value is

$$\{y_1, y_2\} = \left\{
\begin{bmatrix} 0.1106 - 0.1679i \\ 0.1893 - 0.4442i \\ 0.3526 - 0.6134i \\ 0.2971 + 0.1422i \\ 0.0002 + 0.3428i \end{bmatrix},
\begin{bmatrix} 0.5229 + 0.0786i \\ 0.0843 + 0.5385i \\ 0.1979 - 0.1544i \\ 0.0567 + 0.5552i \\ 0.2160 - 0.0469i \end{bmatrix}
\right\}$$

and an orthonormal set of left singular vectors corresponding to the repeated maximum singular value is

$$\{x_1, x_2\} = \left\{
\begin{bmatrix} 0.0000 + 0.0000i \\ 0.2151 + 0.1955i \\ 0.0322 + 0.3960i \\ 0.0690 - 0.2280i \\ 0.3825 - 0.7447i \end{bmatrix},
\begin{bmatrix} 0.6051 + 0.0000i \\ 0.3141 - 0.0597i \\ 0.1384 - 0.6067i \\ 0.1529 + 0.2204i \\ 0.1730 - 0.2060i \end{bmatrix}
\right\}$$

Using these sets of left and right singular vectors and equations (3.21)-(3.24) gives

$$H = \begin{bmatrix}
0.0404 & 0.0893 & 0.1930 & 0.0865 \\
0.1486 & 0.6221 & 0.0195 & 0.1949 \\
0.3427 & 0.8006 & 0.0148 & 0.3243 \\
0.0517 & 0.2036 & 0.2458 & 0.2395 \\
0.5834 & 0.4713 & 0.0481 & 0.0235
\end{bmatrix}$$

From Theorem 3.8 the ellipsoidal characterization of the subdifferential is given by

$$\partial f(d) = \{ z \in R^n \mid Pz = q,\; ([\,z_1 \;\; z_2 \;\; z_3\,]^T - c)^T B\,([\,z_1 \;\; z_2 \;\; z_3\,]^T - c) \le 1 \}$$

where the elements of the matrix

$$P = \begin{bmatrix}
1.2180 & 0.6214 & 0.0927 & 1.0000 & 0.0000 \\
0.2180 & 0.3786 & 0.9073 & 0.0000 & 1.0000
\end{bmatrix}$$

and the vector

$$q = \begin{bmatrix} 0.2251 \\ 0.2251 \end{bmatrix}$$

are obtained from (3.27), the matrix

$$B = \begin{bmatrix}
106.6882 & 13.8175 & 21.7949 \\
13.8175 & 32.1680 & 17.6270 \\
21.7949 & 17.6270 & 15.3504
\end{bmatrix}$$

is obtained by the method mentioned after equation (3.31), and the vector

$$c = \begin{bmatrix} 0.0231 \\ 0.1718 \\ 0.0092 \end{bmatrix}$$

is given by (3.30). The point $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ is not optimal, because the necessary optimality condition $q = 0$ of (3.36) is not satisfied. Consequently, MPDA does not hold either. Therefore, the next step is to find a steepest descent direction in order to decrease the objective function in the next step of an iterative optimization algorithm. The subgradient that gives the steepest descent direction is obtained by solving the simple 3-parameter optimization given by (3.34) and (3.35) and is determined to be

$$\xi_{sd}(d) = \begin{bmatrix} 0.0264 \\ 0.1069 \\ 0.0795 \\ 0.1338 \\ 0.1877 \end{bmatrix}$$

Finally, the steepest descent direction is

$$g(d) = -\frac{\xi_{sd}(d)}{\|\xi_{sd}(d)\|} = \begin{bmatrix} 0.0988 \\ 0.3996 \\ 0.2970 \\ 0.5002 \\ 0.7015 \end{bmatrix}$$

3.7.2. Example 3.2.

The following example is taken from Daniel et al. (1986).
Let $M = AB^*$, where

$$A = \begin{bmatrix}
0.65012 + 0.00000i & 0.00000 + 0.00000i \\
0.45970 + 0.00000i & 0.45970 + 0.00000i \\
0.45970 + 0.00000i & 0.00000 + 0.45970i \\
0.39322 + 0.00000i & 0.53729 + 0.53729i
\end{bmatrix}$$

and

$$B = \begin{bmatrix}
0.00000 + 0.00000i & 0.65012 + 0.00000i \\
0.45970 + 0.00000i & 0.45970 + 0.00000i \\
0.45970 + 0.00000i & 0.00000 - 0.45970i \\
0.53729 - 0.53729i & 0.39332 + 0.00000i
\end{bmatrix}$$

Again, in performing the infimization $\inf \bar\sigma(DMD^{-1})$ the point $d = [\,1 \;\; 1 \;\; 1 \;\; 1\,]^T$ corresponding to $D = I$ has a maximum singular value $\bar\sigma(DMD^{-1}) = \bar\sigma(M)$ that is repeated (i.e., $\sigma_1(M) = \sigma_2(M) = 1$, with $\sigma_3(M) = \sigma_4(M) = 0$). Therefore, the objective function $f(d) = \bar\sigma(DMD^{-1})$ is nondifferentiable at $d = [\,1 \;\; 1 \;\; 1 \;\; 1\,]^T$ and the results of this chapter are used to solve the optimization by either determining a steepest descent direction from the point $d = [\,1 \;\; 1 \;\; 1 \;\; 1\,]^T$ or by determining if the point satisfies the optimality and MPDA conditions. The ellipsoidal characterization of the subdifferential is given by

$$\partial f(d) = \{ z \in R^n \mid Pz = q,\; ([\,z_1 \;\; z_2 \;\; z_3\,]^T - c)^T B\,([\,z_1 \;\; z_2 \;\; z_3\,]^T - c) \le 1 \}$$

where

$$P = [\,1 \;\; 1 \;\; 1 \;\; 1\,], \quad q = 0,$$

$$B = \begin{bmatrix}
3.5575 & 1.0750 & 0.0000 \\
1.0750 & 6.0773 & 1.0750 \\
0.0000 & 1.0750 & 3.5576
\end{bmatrix}$$

and

$$c = \begin{bmatrix} 0.0548 \\ 0.0749 \\ 0.0548 \end{bmatrix}$$

The point $d = [\,1 \;\; 1 \;\; 1 \;\; 1\,]^T$ is optimal, because the necessary optimality conditions $q = 0$ and $c^T B c = 0.0378 < 1$ of (3.36) are satisfied, implying $0 \in \partial f(d)$. This means the upper bound $\inf \bar\sigma(DMD^{-1})$ is 1.0000. On the other hand, the MPDA attainability condition $c^T B c = 1$ is not satisfied. Therefore, MPDA is not attainable and the upper bound is conservative, i.e. $\mu(M) < \inf \bar\sigma(DMD^{-1}) = 1$, and either the principal direction alignment (PDA) method proposed in Daniel et al. (1986) or a direct attempt at solving the lower bound $\sup \rho(MU)$ must be used to obtain an exact value of the structured singular value.

3.7.3. Example 3.3.

This last example shows that even though the maximum singular value is repeated at the optimum it may still be possible to attain MPDA and thus eliminate the conservatism in the upper bound of $\mu$.
Consider the matrix

$$M = \begin{bmatrix}
0.0274 + 0.2253i & 0.0622 + 0.0571i & 0.0597 + 0.0705i & 0.0147 + 0.0149i & 0.1624 - 0.1333i \\
0.2201 - 0.2277i & 0.2355 - 0.0394i & 0.1303 + 0.0643i & 0.0632 + 0.1792i & 0.3688 - 0.1437i \\
0.4758 + 0.2550i & 0.1977 - 0.198i & 0.1025 + 0.0008i & 0.1533 + 0.1583i & 0.2666 + 0.1838i \\
0.1192 - 0.0574i & 0.2418 - 0.0274i & 0.1239 - 0.2037i & 0.3778 - 0.3278i & 0.0824 + 0.3762i \\
0.0974 - 0.3482i & 0.1610 + 0.1308i & 0.1589 - 0.0976i & 0.4272 - 0.1706i & 0.1610 + 0.0723i
\end{bmatrix}$$

The point $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ corresponding to $D = I$ has a maximum singular value $\bar\sigma(DMD^{-1}) = \bar\sigma(M)$ that is repeated (i.e., $\sigma_1(M) = \sigma_2(M) = 1$, with $\sigma_3(M) = \sigma_4(M) = \sigma_5(M) = 0$). Therefore, the objective function $f(d) = \bar\sigma(DMD^{-1})$ is nondifferentiable at $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ and the results of this chapter are used to efficiently solve the optimization by either determining a steepest descent direction from the point $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ or by determining if the point satisfies the optimality and MPDA conditions. The ellipsoidal characterization of the subdifferential is given by

$$\partial f(d) = \{ z \in R^n \mid Pz = q,\; ([\,z_1 \;\; z_2 \;\; z_3\,]^T - c)^T B\,([\,z_1 \;\; z_2 \;\; z_3\,]^T - c) \le 1 \}$$

where

$$P = \begin{bmatrix}
0.0828 & 0.9106 & 0.4221 & 1.0000 & 0.0000 \\
0.9172 & 1.9106 & 0.5779 & 0.0000 & 1.0000
\end{bmatrix}, \quad
q = \begin{bmatrix} 0.0000 \\ 0.0000 \end{bmatrix},$$

$$B = \begin{bmatrix}
37.4471 & 7.5773 & 25.1289 \\
7.5773 & 28.1213 & 6.3079 \\
25.1289 & 6.3079 & 26.8994
\end{bmatrix}$$

and

$$c = \begin{bmatrix} 0.2398 \\ 0.0632 \\ 0.2010 \end{bmatrix}$$

The point $d = [\,1 \;\; 1 \;\; 1 \;\; 1 \;\; 1\,]^T$ is optimal, because it satisfies the necessary optimality conditions (3.36). Furthermore, the MPDA attainability conditions (3.38) are also satisfied, namely $q = 0$ and $c^T B c = 1$. Therefore, the upper bound $\inf \bar\sigma(DMD^{-1}) = 1.0000$ is tight and the structured singular value is exactly $\mu(M) = 1.0000$ even though the maximum singular value is repeated such that the objective function is nondifferentiable.

3.8. Conclusions

The MPDA principle approach to solving the structured singular value problem is investigated. In the infimization that gives an upper bound to $\mu$, a repeated maximum singular value results in a nondifferentiability of the objective function.
Therefore, efficient gradient descent optimization algorithms that use the analytical expression for the gradient must be modified. The first result of this paper is a characterization of the subdifferential, which represents the set of all subgradients or generalized gradients. In addition, for the case of a once repeated maximum singular value it is shown that the subdifferential is in fact a 3-dimensional ellipsoid in an $n$-dimensional space. Using results from nondifferentiable optimization theory, the steepest descent direction is obtained from this characterization of the subdifferential to facilitate the optimization. Furthermore, conditions for optimality are presented which are based on zero being an element of the subdifferential. Finally, attainability of MPDA at the optimum is shown to be equivalent to zero being on the boundary of the subdifferential, enhancing the PDA results when the maximum singular value is repeated.

CHAPTER 4
SPECTRAL RADIUS MAXIMUM SINGULAR VALUE EQUIVALENCE UNDER OPTIMAL SIMILARITY SCALING

4.1. Introduction

It is well known that the maximum singular value of a matrix is an upper bound of the spectral radius (i.e., $\rho(M) \le \bar\sigma(M)$ where $M \in C^{n \times n}$). Determining the conditions under which the upper bound is attained is a significant issue in the field of robust control. One approach is to seek properties of matrices that are necessary and sufficient for equality of the spectral radius and the maximum singular value. Another approach uses optimization to condition the matrix through similarity and unitary transformations in order to increase the spectral radius and decrease the maximum singular value upper bound so that equality is achieved.

Previous work deals with the optimal conditioning of matrices from a numerical accuracy standpoint (Bauer, 1963) and focuses on similarity transformations using nonnegative diagonal matrices. The scaling problem for nonnegative matrices yields a very elegant and precise result.
It provides a closed-form expression for the optimal similarity scaling matrix for which the Perron root (largest positive eigenvalue of a positive matrix) equals the least upper bound subordinate to an absolute norm. In addition, there are analytical expressions for the elements of the optimal diagonal matrix that involve the Perron eigenvectors of the given positive matrix (Stoer and Witzgall, 1962). The relationship to the present work is that the least upper bound of the matrix subordinate to the Euclidean norm is the maximum singular value of a matrix. The previous results are based on earlier work that derives a necessary condition for the least upper bound of a matrix to equal the modulus of an eigenvalue of the matrix, namely, that the corresponding right and left eigenvectors are dual (Bauer, 1962). Unfortunately, for the general case of complex matrices, there are no equivalent analytical results on optimal scaling by positive diagonal matrices, although there exist several numerical algorithms. From a robust control perspective, the structured singular value $\mu$ (defined as $\sup_{U \in \mathcal{U}} \rho(MU)$ where $\mathcal{U} := \{ \mathrm{diag}(e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_n}) \mid 0 \le \theta_i < 2\pi \}$) is a widely accepted tool in the robust analysis of linear systems. It considers the problem of robust stability for a known plant subject to a block-diagonal uncertainty structure under feedback. In general, any block-diagram interconnection of systems and uncertainties can be rearranged into the block-diagonal standard form. Calculating $\mu$ is not trivial; in fact the problem has been proven to be NP-hard (Braatz et al., 1994). The difficulty is that the spectral radius is nonconvex over the set of unitary matrix transformations. One approach is to consider upper bounds for the spectral radius that can be calculated easily, and ideally should be attainable to eliminate conservatism. The maximum singular value is a reasonable choice for an upper bound because it is invariant under unitary matrix transformations.
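As a small numerical illustration of these bounds, the following sketch checks, for a hypothetical 2×2 complex matrix, that $\rho(MU) \le \bar\sigma(MU) = \bar\sigma(M)$ as $U$ ranges over a grid of diagonal unitary matrices. The matrix `M`, the grid size, and all helper names are assumptions introduced only for illustration; closed-form 2×2 formulas are used so that no numerical library is required, and a grid search only approximates the supremum.

```python
import cmath
import math

def spectral_radius_2x2(M):
    """Largest eigenvalue modulus of a 2x2 complex matrix (quadratic formula)."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + root) / 2), abs((tr - root) / 2))

def sigma_max_2x2(M):
    """Largest singular value; sigma1^2 and sigma2^2 are the roots of t^2 - s*t + p^2 = 0."""
    (a, b), (c, d) = M
    s = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2  # sigma1^2 + sigma2^2
    p = abs(a * d - b * c)                                      # sigma1 * sigma2
    return math.sqrt((s + math.sqrt(max(s * s - 4 * p * p, 0.0))) / 2)

def times_diag_unitary(M, t1, t2):
    """Return M U with U = diag(exp(j*t1), exp(j*t2))."""
    u = (cmath.exp(1j * t1), cmath.exp(1j * t2))
    return [[M[i][k] * u[k] for k in range(2)] for i in range(2)]

# Hypothetical example matrix (not taken from the text).
M = [[1 + 1j, 2.0], [0.5j, -1.0]]
bound = sigma_max_2x2(M)

# Coarse grid over the two phase angles of U.
grid = [2 * math.pi * k / 60 for k in range(60)]
rho_sup = max(
    spectral_radius_2x2(times_diag_unitary(M, t1, t2))
    for t1 in grid
    for t2 in grid
)
print(f"grid sup of rho(MU) = {rho_sup:.4f}, sigma_max(M) = {bound:.4f}")
```

The gap between the two printed numbers, if any, is the conservatism that the similarity scaling discussed in this chapter is meant to reduce.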
In addition, the maximum singular value upper bound can be decreased by optimizing over similarity transformations because the spectral radius is invariant under such transformations. Ultimately, the problem becomes one of conditioning a matrix through optimal similarity and unitary transformations to achieve equality between the spectral radius and the maximum singular value. In addressing the existence of solutions to the proposed optimization, Kouvaritakis and Latchman (1985) introduce the major principal direction alignment (MPDA) property. The result states that the spectral radius of a matrix is equal to the maximum singular value of the matrix if and only if a major input principal direction and a major output principal direction of the matrix are aligned. MPDA is a strict condition for a matrix, but it can be used to determine the optimal positive diagonal matrix and unitary matrix that result in equality between the aforementioned definition of $\mu$ and the maximum singular value upper bound for the case when the maximum singular value is distinct. It is the goal of this work to establish relationships between results obtained from different perspectives of the same spectral-radius/maximum-singular-value equivalence problem. To this end, the earlier work by Bauer (1963) on positive matrices is extended to the class of general complex matrices. The results are necessary conditions for equality that are used to improve the calculation of $\mu$ through its upper bound.

4.2. Mathematical Background
4.2.1. Dual Norms and Dual Vectors
In the theoretical development that follows, the mathematical concepts of dual norms and dual vectors are utilized. These concepts are explained in a paper by Bauer (1962) and are reviewed here to facilitate the theoretical development.
Given a vector norm $\|\cdot\|$, its dual vector norm $\|\cdot\|_D$ is defined as $\|y\|_D := \max_{x \ne 0} \dfrac{\mathrm{Re}\, y^* x}{\|x\|} = \max_{\|x\| = 1} \mathrm{Re}\, y^* x$. For such dual norms the Hölder inequality $\|y\|_D \|x\| \ge \mathrm{Re}\, y^* x$ holds and is sharp, i.e., for any $y_0$ there exists at least one $x_0$, and for any $x_0$ there exists at least one $y_0$, such that the equality holds (Bauer, 1962). If such a pair $(x_0, y_0)$ with $\|y_0\|_D \|x_0\| = \mathrm{Re}\, y_0^* x_0$ also satisfies the scaling condition $\|y_0\|_D \|x_0\| = 1$, it is called a dual pair. Note that the dual vector of $x$ is often written $(x)^D$. A pair $(x_0, y_0)$ is strictly dual, written $y_0 \parallel_D x_0$, if $\|y_0\|_D \|x_0\| = y_0^* x_0 = 1$. For strictly homogeneous norms (i.e., those satisfying $\|\alpha x\| = |\alpha| \|x\|$ for all complex scalars $\alpha$) the Hölder inequality may be sharpened to (Bauer, 1962) $\|y\|_D \|x\| \ge |y^* x|$. For a dual pair $(x_0, y_0)$ under a homogeneous norm it follows that $\mathrm{Re}\, y_0^* x_0 = \|y_0\|_D \|x_0\| \ge |y_0^* x_0|$, which implies that $\mathrm{Re}\, y_0^* x_0 = y_0^* x_0$. Hence, for a strictly homogeneous norm every pair of dual vectors $(x_0, y_0)$ is also a strictly dual pair. In addition, there exists a strict dual $y_0$ for any $x_0 \ne 0$ and a strict dual $x_0$ for any $y_0 \ne 0$. In general, the dual norm of a p-norm $\|x\|_p := (\sum_i |x_i|^p)^{1/p}$ is the associated q-norm $\|\cdot\|_q$ where $1/p + 1/q = 1$. So the infinity-norm and the 1-norm are duals, and the dual norm of the 2-norm (Euclidean norm) is itself. For the 2-norm, a pair $(x_0, y_0)$ is dual if $y_0 = x_0 / \|x_0\|_2^2$.

4.2.2. Positive Matrix Result
Early work on determining when the spectral radius equals the maximum singular value is concerned with positive matrices transformed by nonnegative diagonal matrices, because they have good numerical properties (i.e., smaller round-off errors) and therefore may be used for conditioning of matrices. In addition, positive matrices remain positive under transformation by nonnegative diagonal matrices, leading to connections with Perron roots $\pi(P)$ (positive eigenvalues of largest modulus) of positive matrices $P \in \mathbb{R}_+^{n \times n}$ (note, $\mathbb{R}_+$ is the set of positive real numbers).
From this perspective, Stoer and Witzgall (1962) show that for the positive matrix $P$ and nonnegative diagonal matrices $D$, $\pi(P) = \min_{D \in \mathcal{D}} \mathrm{lub}(D^{-1} P D)$ (4.1) where $\mathcal{D} := \{ \mathrm{diag}(d_1, d_2, \ldots, d_n) \mid d_i > 0,\ i = 1, 2, \ldots, n \}$, and $\mathrm{lub}(A) := \max_{x \ne 0} \dfrac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\|$ is the least upper bound norm of a matrix $A \in \mathbb{C}^{n \times n}$ subordinate to the vector norm $\|\cdot\|$. It is noted that the least upper bound norm is equivalent to the induced matrix norm, and that when the subordinating norm is the Euclidean norm then $\mathrm{lub}(A) = \bar\sigma(A)$. In developing the result it is necessary to make use of a result from Bauer (1962) that states that if $\lambda$ is an eigenvalue of $A$, then $|\lambda| = \mathrm{lub}(A)$ (4.2) is only possible if a right and a left $\lambda$-eigenvector are dual with respect to the norm to which the bound norm is subordinate, where by definition a left $\lambda$-eigenvector $w$ of $A$ satisfies the relation $w^* A = \lambda w^*$ (Golub & Van Loan, 1983; Isaacson & Keller, 1966; Stewart, 1970). The reader is cautioned that some authors use the term left eigenvector for an eigenvector of $A^T$. Let $D_0$ be the minimizing $D$ of (4.1); then $\pi(P) = \pi(D_0^{-1} P D_0) = \mathrm{lub}(D_0^{-1} P D_0)$ (4.3). From the result of Bauer (1962), (4.3) is only possible if the right and left eigenvectors of $D_0^{-1} P D_0$ are dual. Now, if $v > 0$ and $w > 0$ are the right and left Perron vectors of $P$ (note that the greater-than operator ">" is understood as an element-wise operation when applied to a vector), then it is straightforward to show that $D^{-1} v > 0$ and $D w > 0$ are the right and left Perron vectors of $D^{-1} P D$. Therefore, any $D_0$ that minimizes (4.1) such that equality is achieved must also make the vectors $D_0^{-1} v$ and $D_0 w$ dual, where $v > 0$ and $w > 0$ are the right and left Perron vectors of $P$. The problem now reduces to transforming the positive vectors $v > 0$ and $w > 0$ to dual vectors $D_0^{-1} v$ and $D_0 w$, where $D_0$ is a nonnegative diagonal matrix.
Stoer and Witzgall (1962) state that for absolute norms (i.e., norms that depend only on the moduli of their components (Bauer et al., 1961)) there exists one, and up to positive multiples only one, nonsingular nonnegative diagonal matrix $D_0$ such that $D_0^{-1} v$ and $D_0 w$ form a dual pair. For p-norms, which are necessarily absolute norms, the positive vectors $y > 0$ and $x > 0$ are dual if $(y_i)^q = (x_i)^p$, $i = 1, 2, \ldots, n$, and $\frac{1}{p} + \frac{1}{q} = 1$. Therefore, the matrix $D_0 = \mathrm{diag}\left( \left( v_1^p / w_1^q \right)^{1/(p+q)}, \left( v_2^p / w_2^q \right)^{1/(p+q)}, \ldots, \left( v_n^p / w_n^q \right)^{1/(p+q)} \right)$ (4.4) makes $D_0^{-1} v$ and $D_0 w$ a dual pair for any right and left Perron eigenvectors $v > 0$ and $w > 0$. Duality is only a necessary condition for (4.2). Therefore, to show that (4.1) holds it suffices to show that (4.3) holds for those matrices $D_0^{-1} P D_0$ whose right and left Perron vectors $D_0^{-1} v$ and $D_0 w$ are dual, where $D_0$ is given by (4.4). Using the definitions of eigenvalues and eigenvectors it can be shown that $\mathrm{Re}\{ (w^* D_0)(D_0^{-1} P D_0)(D_0^{-1} v) \} = \pi(D_0^{-1} P D_0)\, \mathrm{Re}\{ (w^* D_0)(D_0^{-1} v) \}$ (4.5) and from the definition of duality of vectors it is true that $\dfrac{\mathrm{Re}\{ (w^* D_0)(D_0^{-1} v) \}}{\|D_0 w\|_D \|D_0^{-1} v\|} = 1$ (4.6). Combining (4.5) and (4.6) gives $\dfrac{\mathrm{Re}\{ (w^* D_0)(D_0^{-1} P D_0)(D_0^{-1} v) \}}{\|D_0 w\|_D \|D_0^{-1} v\|} = \pi(D_0^{-1} P D_0)$ (4.7). Using the bilinear characterization of the least upper bound $\mathrm{lub}(A) := \max_{x, y \ne 0} \dfrac{\mathrm{Re}\{ y^* A x \}}{\|y\|_D \|x\|}$ (4.8), Stoer and Witzgall (1962) show that there is a maximizing pair for (4.8) in the positive orthant, and that the only maximizing pair in the positive orthant for $\mathrm{lub}(D_0^{-1} P D_0)$ is the pair $D_0^{-1} v$ and $D_0 w$, such that $\mathrm{lub}(D_0^{-1} P D_0) = \dfrac{\mathrm{Re}\{ (w^* D_0)(D_0^{-1} P D_0)(D_0^{-1} v) \}}{\|D_0 w\|_D \|D_0^{-1} v\|}$, which from (4.7) equals $\pi(D_0^{-1} P D_0)$. Therefore, (4.1) holds where the minimizing $D$ is given by (4.4). The relationship of Stoer and Witzgall's positive matrix result to the spectral-radius/maximum-singular-value problem can be shown by specifying the least upper bound norm to be subordinate to the Euclidean norm, i.e., $\mathrm{lub}(A) = \bar\sigma(A)$ (4.9) where $A \in \mathbb{C}^{n \times n}$.
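The positive matrix result can be checked numerically for the Euclidean norm ($p = q = 2$), in which case the duality condition above reduces to $d_i = (v_i / w_i)^{1/2}$. The sketch below is only an illustration under stated assumptions: the 3×3 matrix `P` is hypothetical, the Perron root and Perron vectors are approximated by plain power iteration, and the maximum singular value of a positive matrix $B$ is obtained as the square root of the Perron root of $B^T B$ (valid here because $B^T B$ is itself positive). All function names are introduced for this example.

```python
import math

def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def perron(P, iters=500):
    """Power iteration for a positive matrix: (Perron root, positive Perron vector)."""
    x = [1.0] * len(P)
    root = 0.0
    for _ in range(iters):
        y = matvec(P, x)
        root = max(y)                # x is scaled so that max(x) == 1
        x = [yi / root for yi in y]
    return root, x

def sigma_max_positive(B):
    """Largest singular value of a positive matrix via the Perron root of B^T B."""
    n = len(B)
    BtB = [[sum(B[m][i] * B[m][k] for m in range(n)) for k in range(n)]
           for i in range(n)]
    root, _ = perron(BtB)
    return math.sqrt(root)

# Hypothetical positive matrix (not taken from the text).
P = [[2.0, 1.0, 4.0],
     [0.5, 3.0, 1.0],
     [1.0, 0.2, 1.5]]

root, v = perron(P)                  # right Perron vector: P v = root * v
_, w = perron(transpose(P))          # left Perron vector:  w^T P = root * w^T

# Euclidean-norm scaling d_i = sqrt(v_i / w_i), then B = D0^{-1} P D0.
d = [math.sqrt(vi / wi) for vi, wi in zip(v, w)]
n = len(P)
B = [[P[i][k] * d[k] / d[i] for k in range(n)] for i in range(n)]

print("Perron root of P        :", root)
print("sigma_max before scaling:", sigma_max_positive(P))
print("sigma_max after scaling :", sigma_max_positive(B))
```

After scaling, the largest singular value should coincide with the Perron root up to the accuracy of the iteration, whereas the unscaled matrix generally shows a strictly larger value.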
Combining (4.9) and the fact that the Perron root of a positive matrix is the spectral radius, (4.1) becomes $\rho(P) = \min_{D \in \mathcal{D}} \bar\sigma(D^{-1} P D)$ (4.10) for positive matrices $P$ and positive diagonal matrices $D$. In addition, from (4.4), there is an analytical expression for the optimizing $D_0$ given by $D_0 = \mathrm{diag}\left( (v_1 / w_1)^{1/2}, (v_2 / w_2)^{1/2}, \ldots, (v_n / w_n)^{1/2} \right)$ (4.11) where $v > 0$ and $w > 0$ are right and left Perron vectors of $P$. Clearly, (4.10) shows that for positive matrices there is a simple similarity transformation for which the spectral radius attains its maximum singular value upper bound.

4.2.3. Major Principal Direction Alignment Property
In solving various robust control problems it is necessary to determine the conditions under which the spectral radius of a matrix attains its maximum singular value upper bound. The major principal direction alignment (MPDA) property addresses this problem (Kouvaritakis and Latchman, 1985). Consider the singular value decomposition of a square matrix $A \in \mathbb{C}^{n \times n}$ given by $A = X(A) \Sigma(A) Y^*(A)$, where $\Sigma(A)$ is the diagonal matrix of singular values placed in descending order, and $X(A)$ and $Y(A)$ are unitary matrices whose columns are the respective output and input principal directions of $A$, arranged in an order conformal with the order of the singular values (Lancaster and Tismenetsky, 1985). Now, define a major input principal direction $y(A)$ and a major output principal direction $x(A)$ of a matrix $A$ respectively as normalized input and output principal directions corresponding to the maximum singular value $\bar\sigma(A)$ of $A$. The MPDA property is given in the following theorem.

Theorem 4.1. The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum singular value of $A$ if and only if there exist a major input principal direction and a major output principal direction of $A$ which are aligned such that $x(A) = e^{j\theta} y(A)$.
Proof. Given by Kouvaritakis and Latchman (1985). An alternative proof based on dual norms and dual vectors is given in Chapter 2. Q.E.D.

4.2.4.
MPDA as a Control Theory Application
One area in the field of robust control that makes use of the spectral-radius/maximum-singular-value equivalence problem is the stability analysis of multivariable feedback systems in the presence of structured uncertainties. Of particular interest is the stability of diagonally perturbed systems for which the uncertainty is represented by the complex diagonal matrix $\Delta \in \{ \mathrm{diag}(\delta_1, \delta_2, \ldots, \delta_n) \mid |\delta_i| \le p_i,\ p_i \in \mathbb{R}_+,\ i = 1, 2, \ldots, n \}$. This class of systems is especially amenable to spectral-radius-preserving similarity scaling, and through simple transformations is representative of the more general class of full structured uncertainties. Using Nyquist arguments in the complex plane, it can be shown that $\sup_{\Delta} \rho(M\Delta) < 1$ (4.12) is a necessary and sufficient stability condition, where the complex matrix $M$ is a function of the system's transfer function matrix evaluated at a particular frequency. The optimization problem (4.12) is nonconvex, but it can be simplified by introducing the following positive diagonal similarity scaling: $\rho(M\Delta) = \rho(D^{-1} M D \Delta) \le \bar\sigma(D^{-1} M D \Delta)$. Furthermore, using geometric arguments based on the MPDA principle, it can be shown that the supremizing diagonal matrix $\Delta_{opt}$ has the form $\Delta_{opt} = QU$ where $Q = \mathrm{diag}(q_1, q_2, \ldots, q_n)$ with $q_i \in \mathbb{R}$, and $U \in \mathcal{U} := \{ \mathrm{diag}(e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_n}) \mid 0 \le \theta_i < 2\pi,\ i = 1, 2, \ldots, n \}$. The optimization problem (4.12) becomes equivalent to $\sup_{\Delta} \rho(M\Delta) = \sup_{U \in \mathcal{U}} \rho(MQU) \le \inf_{D \in \mathcal{D}} \bar\sigma(D^{-1} M Q D)$ (4.13) and the necessary and sufficient stability condition becomes $\inf_{D \in \mathcal{D}} \bar\sigma(D^{-1} M Q D) < 1$ (4.14). Furthermore, using MPDA arguments it can be shown that the optimizing $D_0$ in (4.14) results in the equality $\sup_{U \in \mathcal{U}} \rho(MQU) = \bar\sigma(D_0^{-1} M Q D_0)$ when the maximum singular value is distinct at the infimum.

4.3.
Main Result Extension of the Positive Matrix Result to General Complex Matrices
The positive matrix result of Stoer and Witzgall as stated by (4.1) and specialized to the Euclidean norm by (4.10) gives a positive diagonal similarity scaling (4.11) that results in equality of the spectral radius and maximum singular value of a positive matrix. When applied to robust control problems that involve complex matrices, the positive matrix result is usually only suboptimal. Therefore, it is necessary to extend the result to the class of complex matrices. Unfortunately, much of the theoretical development is dependent on the characteristic properties of positive matrices. Therefore, when generalizing the result to complex matrices it is not possible to state explicitly that there exists a similarity scaling that will result in equality of the spectral radius and maximum singular value of a matrix. Nevertheless, it is possible to determine necessary conditions for the existence of a positive diagonal similarity scaling that leads to equality. The result is given in the following theorem.

Theorem 4.2. Let $A \in \mathbb{C}^{n \times n}$ have a right eigenvector $v$ and a left eigenvector $w$ associated with an eigenvalue $\lambda(A)$ of maximum modulus such that $|\lambda(A)| = \rho(A)$, and let $D_0 = \mathrm{diag}(d_{0,1}, d_{0,2}, \ldots, d_{0,n})$ be defined as $D_0 := \arg\min_{D \in \mathcal{D}} \bar\sigma(D^{-1} A D)$ where $\mathcal{D} := \{ \mathrm{diag}(d_1, d_2, \ldots, d_n) \mid d_i > 0,\ i = 1, 2, \ldots, n \}$. Then if $\rho(A) = \min_{D \in \mathcal{D}} \bar\sigma(D^{-1} A D)$ (4.15) the following three conditions hold:
i) $[d_{0,1}^2\ d_{0,2}^2\ \cdots\ d_{0,n}^2]^T \in \mathrm{null}(N) = \ker(N)$ (4.16) where
$N = \begin{bmatrix} |v_1||w_1|^2 - |w_1| & |v_1||w_2|^2 & \cdots & |v_1||w_n|^2 \\ |v_2||w_1|^2 & |v_2||w_2|^2 - |w_2| & \cdots & |v_2||w_n|^2 \\ \vdots & \vdots & \ddots & \vdots \\ |v_n||w_1|^2 & |v_n||w_2|^2 & \cdots & |v_n||w_n|^2 - |w_n| \end{bmatrix}$ (4.17)
ii) $\arg(v_i) = \arg(w_i)$, $i = 1, 2, \ldots, n$ (4.18)
and either
iiia) $\sum_{i=1}^{n} |v_i||w_i| = 1$ (4.19a)
or
iiib) $|w_i| = 0$ for at least one $i \in \{1, 2, \ldots, n\}$ (4.19b)

Proof. Assume (4.15) holds where $D_0$ is an optimizing $D$ such that $|\lambda(A)| = \bar\sigma(D_0^{-1} A D_0)$ (4.20) where $\lambda(A)$ is an eigenvalue of maximum modulus. Following the development of the positive matrix result, a necessary condition for (4.20) to hold is that the corresponding right and left eigenvectors of $D_0^{-1} A D_0$ be dual with respect to the Euclidean norm. Given that $v$ and $w$ are a pair of right and left eigenvectors of the $\lambda(A)$ eigenvalue of $A$, then $D_0^{-1} v$ and $D_0 w$ are corresponding right and left eigenvectors of $D_0^{-1} A D_0$. Therefore, the necessary condition is that $D_0^{-1} v$ and $D_0 w$ are dual with respect to the Euclidean norm. In the mathematical background section it is stated that this is equivalent to requiring $D_0^{-1} v = \dfrac{D_0 w}{\|D_0 w\|_2^2}$ (4.21). Using the notation $w = [\,|w_1| e^{j\arg(w_1)},\ |w_2| e^{j\arg(w_2)},\ \ldots,\ |w_n| e^{j\arg(w_n)}\,]^T$ and $v = [\,|v_1| e^{j\arg(v_1)},\ |v_2| e^{j\arg(v_2)},\ \ldots,\ |v_n| e^{j\arg(v_n)}\,]^T$, the necessary condition (4.21) is equivalent to the set of scalar equalities
$\dfrac{|v_i| e^{j\arg(v_i)}}{d_{0,i}} = \dfrac{d_{0,i} |w_i| e^{j\arg(w_i)}}{d_{0,1}^2 |w_1|^2 + d_{0,2}^2 |w_2|^2 + \cdots + d_{0,n}^2 |w_n|^2}$, $i = 1, 2, \ldots, n$ (4.22-1) to (4.22-n)
which leads directly to necessary condition (4.18). Given that (4.18) is satisfied, (4.22) can be rearranged as
$\begin{bmatrix} |v_1||w_1|^2 - |w_1| & |v_1||w_2|^2 & \cdots & |v_1||w_n|^2 \\ |v_2||w_1|^2 & |v_2||w_2|^2 - |w_2| & \cdots & |v_2||w_n|^2 \\ \vdots & \vdots & \ddots & \vdots \\ |v_n||w_1|^2 & |v_n||w_2|^2 & \cdots & |v_n||w_n|^2 - |w_n| \end{bmatrix} \begin{bmatrix} d_{0,1}^2 \\ d_{0,2}^2 \\ \vdots \\ d_{0,n}^2 \end{bmatrix} = 0$ (4.23)
from which necessary condition (4.16) becomes apparent. For the null space of the matrix given in (4.23) to be nontrivial, its determinant must be 0 (i.e., the matrix must be rank deficient). First, note that if any $|w_i| = 0$ then the corresponding column $i$ is composed of only zeros, making the matrix rank deficient and resulting in part iiib) of necessary condition (4.19). For the case when no $|w_i| = 0$ for $i = 1, 2, \ldots, n$, the determinant can be determined using elementary row and column operations to obtain a matrix that is sparse and has the same determinant.
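Multiplying column 1 by $-|w_i|^2 / |w_1|^2$ and adding the result to each column $i$ for $i = 2, 3, \ldots, n$ gives the matrix
$\begin{bmatrix} |v_1||w_1|^2 - |w_1| & |w_2|^2/|w_1| & |w_3|^2/|w_1| & \cdots & |w_n|^2/|w_1| \\ |v_2||w_1|^2 & -|w_2| & 0 & \cdots & 0 \\ |v_3||w_1|^2 & 0 & -|w_3| & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ |v_n||w_1|^2 & 0 & 0 & \cdots & -|w_n| \end{bmatrix}$
Now, for $i = 2, 3, \ldots, n$, multiplying row $i$ by $|w_i| / |w_1|$ and adding the result to row 1 gives the matrix
$\begin{bmatrix} (|v_1||w_1| + |v_2||w_2| + \cdots + |v_n||w_n| - 1)|w_1| & 0 & 0 & \cdots & 0 \\ |v_2||w_1|^2 & -|w_2| & 0 & \cdots & 0 \\ |v_3||w_1|^2 & 0 & -|w_3| & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ |v_n||w_1|^2 & 0 & 0 & \cdots & -|w_n| \end{bmatrix}$
for which the determinant is $(-1)^{n-1} (|v_1||w_1| + |v_2||w_2| + \cdots + |v_n||w_n| - 1)\,|w_1||w_2| \cdots |w_n|$. So, for the case when no $|w_i| = 0$, the determinant is zero and the null space is nontrivial only when part iiia) of necessary condition (4.19), $\sum_{i=1}^{n} |v_i||w_i| = 1$, is satisfied. Q.E.D.

4.4. Example 4.1
The following example demonstrates the result of Theorem 4.2. Consider the matrix
$A = \begin{bmatrix} 0.5259 + 0.6358j & 0.3090 - 1.3791j & 0.2031 + 0.2317j & 0.1016 + 0.9524j \\ 0.4712 + 0.1832j & 0.7383 - 0.5966j & 0.3174 - 0.1128j & 0.2840 - 0.2127j \\ 0.0290 - 0.1034j & 0.7906 + 2.0522j & 0.4991 - 0.6463j & 0.0584 + 0.1540j \\ 0.0925 - 0.2759j & 0.5359 + 0.7832j & 0.2490 + 0.0831j & 0.0694 - 0.1919j \end{bmatrix}$
where the eigenvalue of maximum modulus is $\lambda(A) = 1.6507 - 1.1293j$ such that $|\lambda(A)| = \rho(A) = 2$. Performing the minimization on the right-hand side of (4.15) gives $\min_{D \in \mathcal{D}} \bar\sigma(D^{-1} A D) = 2$ with
$D_0 := \arg\min_{D \in \mathcal{D}} \bar\sigma(D^{-1} A D) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0.5 & 0 & 0 \\ 0 & 0 & 1.2 & 0 \\ 0 & 0 & 0 & 0.7 \end{bmatrix}$
such that equation (4.15) holds. Therefore, the three necessary conditions of Theorem 4.2 must be satisfied. First, the right and left eigenvectors of $A$ associated with the eigenvalue $\lambda(A) = 1.6507 - 1.1293j$ of maximum modulus are, respectively,
$v = \begin{bmatrix} 0.0845 - 0.2857j \\ 0.3690 - 0.2780j \\ 0.1441 + 0.7943j \\ 0.0167 + 0.2140j \end{bmatrix}, \quad w = \begin{bmatrix} 0.0567 - 0.1918j \\ 0.9912 - 0.7467j \\ 0.0672 + 0.3704j \\ 0.0229 + 0.2932j \end{bmatrix}$
For condition i) the matrix $N$ given by (4.17) is
$N = \begin{bmatrix} -0.1881 & 0.4589 & 0.0422 & 0.0258 \\ 0.0185 & -0.5294 & 0.0655 & 0.0400 \\ 0.0323 & 1.2432 & -0.2620 & 0.0698 \\ 0.0086 & 0.3305 & 0.0304 & -0.2756 \end{bmatrix}$
and
$N \begin{bmatrix} d_{0,1}^2 \\ d_{0,2}^2 \\ d_{0,3}^2 \\ d_{0,4}^2 \end{bmatrix} = N \begin{bmatrix} 1 \\ 0.25 \\ 1.44 \\ 0.49 \end{bmatrix} = 0$
such that condition i) is satisfied. Finally, it is easy to show that conditions ii) and iiia) of Theorem 4.2 are satisfied; for iiia), the moduli of the components listed above give $\sum_i |v_i||w_i| \approx 0.0596 + 0.5733 + 0.3039 + 0.0631 \approx 1.000$.

4.5.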
Conclusions
In this paper we recover the dual-norm arguments for the case of complex $A$ and obtain an exact and closed-form expression for the optimal $D$ matrix. This result has independent value in terms of the mathematical completeness of the extension to the case of complex matrices, as well as potential algorithmic improvements in computing the optimal scaling matrices.

CHAPTER 5
GENERALIZATION OF THE NYQUIST ROBUST STABILITY MARGIN AND ITS APPLICATION TO SYSTEMS WITH REAL AFFINE PARAMETRIC UNCERTAINTIES

5.1. Introduction
The critical-direction theory developed by Latchman and Crisalle (1995) and Latchman et al. (1997) addresses the problem of robust stability of systems affected by uncertainties that can be characterized in terms of frequency-domain value sets. The approach introduces the Nyquist robust stability margin $k_N(\omega)$ as a scalar measure of robustness analogous to the structured singular value $\mu$ (Doyle, 1982) and the multivariable stability margin $k_m$ (Safonov, 1982) within the value-set paradigm. This chapter extends the critical direction theory to the more general case where the critical value set may be nonconvex. The key to extending the theory is the introduction of a generalized definition of the critical perturbation radius in a fashion that preserves all previous results. The nonconvexity of the critical value set is observed in a number of interesting problems, including the case studied by Fu (1990) consisting of rational systems where the uncertainty appears affinely in the form of real parameters that belong to a known rectangular polytope. The generalized critical direction theory is applied to this particular class of uncertain systems, and is used to calculate the required Nyquist robust stability margin with high precision and in the context of a computationally manageable framework.
The robust stability problem studied by Fu is part of an extensive literature on systems where the uncertainty appears in the form of parameters that vary in prescribed real intervals, a situation of relevance to many engineering problems. Early advances in this field are due to Kharitonov (1978, 1979), who derived necessary and sufficient conditions for the robust stability of interval polynomials, that is, polynomials with independent coefficients that take values in closed real intervals. An extension of Kharitonov's theorem to rational interval plants is proposed in Chapellat et al. (1989), where the objective is to assess the stability of a family of plants by testing a subset of extreme plants or extreme segments. The number of extreme plants required to determine robust stability depends on the functional relationship between the uncertain parameters and their bounding interval sets. Comprehensive results based on extreme plants or segments are known to exist only for a restricted set of uncertainty structures. A detailed account of Kharitonov-like methods can be found in Barmish (1994) and in the references therein. For contextual value, it is worth mentioning that many of the methods proposed are based on determining the stability of a set of Kharitonov plants (or extreme plants) derived from the interval bounding-set description. For example, Chapellat et al. (1989) and Bartlett et al. (1990) give conditions that use 32 Kharitonov segments or edges. Barmish et al. (1992) prove that when using first-order compensators it is necessary and sufficient that sixteen of the extreme plants be stable; furthermore, under certain conditions only eight or twelve plants are necessary.
In this chapter the generalized critical direction theory is applied to systems with affine parametric uncertainty, and exploits earlier results of Fu (1990) regarding the mapping of the uncertain parameters from their polytopic domain to the Nyquist plane to develop a computationally tractable algorithm for calculating the Nyquist robust stability margin. The chapter is organized as follows. Section 5.2 generalizes the critical direction theory for systems with nonconvex critical value sets. Sections 5.3 through 5.8 are concerned with the application of the generalized theory to the case of affine uncertain rational systems with real polytopic parametric uncertainties. Section 5.3 introduces a precise definition of the uncertain system considered, and Section 5.4 derives two robust stability theorems for these types of systems. Section 5.5 presents a systematic method for calculating the critical perturbation radius, and Section 5.6 provides two examples of the analysis method, including the case of a convex and the case of a nonconvex critical value set. Overall conclusions are given in Section 5.7.

5.2. Generalization of the Critical Direction Theory
5.2.1. Preliminaries
Consider the single-input single-output linear time-invariant system $g(s) = g_0(s) + \delta(s)$ (5.1) where $g_0(s)$ is a known nominal transfer function, and $\delta(s) \in \Delta$ is an unknown perturbation belonging to a known perturbation family $\Delta$. The focus of this analysis is on the robust stability of the closed-loop system that results when the uncertain system (5.1) is configured in the unity negative feedback control structure shown in Figure 5.1.

Figure 5.1. Unity feedback control scheme for an uncertain plant g(s).

The following standard assumptions are made throughout this chapter:
(A1) The nominal transfer function $g_0(s)$ is stable under unity negative feedback.
(A2) The set of allowable perturbations $\Delta$ is such that $g(s)$ and $g_0(s)$ have the same number of open-loop unstable poles.
The robust stability analysis is based on a frequency-domain description of the uncertain perturbations using value sets. The uncertainty value set of $g(s)$ at frequency $\omega$ is defined as $\mathcal{V}(\omega) := \{ g(j\omega) \mid g(j\omega) = g_0(j\omega) + \delta(j\omega),\ \delta(s) \in \Delta \}$ and $\mathcal{V}(\omega)$ is said to lie on the Nyquist plane. A generic uncertainty value set is shown in Figure 5.2.

Figure 5.2. Schematic of an uncertainty value set $\mathcal{V}(\omega_i)$ (shaded area), and the critical perturbation radius $\rho_c(\omega_i)$ at a frequency $\omega_i$, drawn on the axes Re $g(j\omega)$ and Im $g(j\omega)$. Also shown in the figure are the critical line $r(\omega_i)$ (dashed line), and the nonconvex critical uncertainty value set $\mathcal{V}_c(\omega_i)$, which in this case is the union of two disjoint straight-line segments (shown by the dotted lines).

The critical-direction theory advanced in Latchman and Crisalle (1995) and in Latchman et al. (1997) is based on the observation that the smallest destabilizing perturbations occur along the critical direction $d_c(j\omega) := -\dfrac{1 + g_0(j\omega)}{|1 + g_0(j\omega)|}$, which is interpreted as the unit vector with origin at the nominal point $g_0(j\omega)$ and pointing towards the critical point $-1 + j0$ (cf. Figure 5.2). This direction in turn defines the critical line $r(\omega) := \{ g_0(j\omega) + \alpha\, d_c(j\omega) \mid \alpha \in \mathbb{R}_+ \}$ where $\mathbb{R}_+$ denotes the nonnegative real numbers. The critical line $r(\omega)$ is interpreted as a ray that originates at the nominal point $g_0(j\omega)$ and passes through the critical point $-1 + j0$. The intersection of the uncertainty value set with the critical line determines the critical uncertainty value set $\mathcal{V}_c(\omega) := \mathcal{V}(\omega) \cap r(\omega)$, which may be (i) a single straight-line segment or a single isolated point (in which case $\mathcal{V}_c(\omega)$ is a convex set) or (ii) a union of disjoint straight-line segments and isolated points (in which case $\mathcal{V}_c(\omega)$ is a nonconvex set). Figure 5.2 shows the case of a nonconvex critical uncertainty value set.
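The geometric objects just introduced are simple to evaluate numerically. The following minimal sketch computes the critical direction and a point on the critical line at one frequency; the nominal plant $g_0(s) = 1/(s+1)$, the frequency, and the function names are hypothetical choices made only for illustration.

```python
def critical_direction(g0):
    """Unit vector d_c from the nominal point g0(jw) toward the critical point -1 + j0."""
    return -(1 + g0) / abs(1 + g0)

def critical_line_point(g0, alpha):
    """Point g0 + alpha * d_c on the critical line, for alpha >= 0."""
    if alpha < 0:
        raise ValueError("alpha must be nonnegative")
    return g0 + alpha * critical_direction(g0)

# Hypothetical nominal plant g0(s) = 1/(s + 1), evaluated at s = jw with w = 1 rad/s.
w = 1.0
g0 = 1 / (1j * w + 1)

d_c = critical_direction(g0)
# Moving a distance |1 + g0| along the critical line reaches the critical point.
z = critical_line_point(g0, abs(1 + g0))
print("d_c =", d_c)
print("point at alpha = |1 + g0|:", z)
```

The second printed value illustrates why $|1 + g_0(j\omega)|$ is the natural length scale along the critical line: it is the distance from the nominal point to the critical point.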
Finally, the boundary of the uncertainty value set is denoted $\partial\mathcal{V}(\omega)$, and the set of critical boundary intersections $B_c(\omega)$ is defined as $B_c(\omega) := \{ \partial\mathcal{V}(\omega) \cap r(\omega) \} \setminus g_0(j\omega)$ where "$\setminus$" is the set-difference operator. For the special case where $\partial\mathcal{V}(\omega) \cap r(\omega)$ contains $g_0(j\omega)$ as its only element, the following definition is applied: $B_c(\omega) := \{ g_0(j\omega) \}$. Note that to determine $B_c(\omega)$ it is necessary to have knowledge of the uncertainty value set boundary only along the critical line. Clearly, $B_c(\omega)$ contains a single element if $\mathcal{V}_c(\omega)$ is a convex set, and contains at least two elements if $\mathcal{V}_c(\omega)$ is nonconvex. When the critical value set $\mathcal{V}_c(\omega)$ is convex (as in the case of value sets that are star-shaped with respect to the nominal point, for example), the critical perturbation radius is defined as (Latchman and Crisalle, 1995; Latchman et al., 1997) $\rho_c(\omega) := \max \{ \alpha \mid z = g_0(j\omega) + \alpha\, d_c(j\omega) \in \mathcal{V}_c(\omega) \}$ (5.2). Definition (5.2) states that the critical perturbation radius for the case of a convex set $\mathcal{V}_c(\omega)$ is simply the distance along the critical direction between the nominal point $g_0(j\omega)$ and the uncertainty value set boundary $\partial\mathcal{V}(\omega)$. Note also that the perturbation radius captures the "size" of the uncertainty that is relevant for stability analysis. Definition (5.2) is not suitable, however, for the case of nonconvex critical value sets $\mathcal{V}_c(\omega)$. In this chapter the following generalization of the definition of the critical perturbation radius is proposed, which is applicable to both the convex and nonconvex cases:
$\rho_c(\omega) := \begin{cases} |1 + g_0(j\omega)| - \xi(\omega) & \text{if } -1 + j0 \notin \mathcal{V}(\omega) \\ |1 + g_0(j\omega)| + \xi(\omega) & \text{otherwise} \end{cases}$ (5.3)
where $\xi(\omega) = \min_{z \in B_c(\omega)} |1 + z|$ (5.4) represents the distance from $-1 + j0$ to the point in $B_c(\omega)$ that is closest to the critical point $-1 + j0$.
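A short numerical sketch may help to contrast definitions (5.2) and (5.3) on a nonconvex critical value set. Everything specific in it is a hypothetical assumption made for illustration: the nominal point, the two disjoint segments along the critical line, and the function name. The set-membership flag is passed in by the caller, since deciding it is a separate problem.

```python
def critical_perturbation_radius(g0, boundary_intersections, critical_point_in_value_set):
    """Generalized critical perturbation radius, following definition (5.3).

    g0                          -- nominal point g0(jw) as a complex number
    boundary_intersections      -- points of B_c(w), as complex numbers
    critical_point_in_value_set -- True if -1 + j0 belongs to the value set
    """
    dist_to_nominal = abs(1 + g0)
    xi = min(abs(1 + z) for z in boundary_intersections)   # definition (5.4)
    if critical_point_in_value_set:
        return dist_to_nominal + xi
    return dist_to_nominal - xi

# Hypothetical data: g0 = 1, so d_c = -1 and the critical point sits at alpha = 2.
g0 = 1.0 + 0.0j
# Hypothetical nonconvex critical value set: alpha in [0, 0.4] U [2.5, 3.0],
# i.e. points z = g0 - alpha on the real axis; -1 + j0 (alpha = 2) is not in it.
alphas = [0.4, 2.5, 3.0]                 # segment end points other than g0 itself
B_c = [g0 - a for a in alphas]

rho_general = critical_perturbation_radius(g0, B_c, critical_point_in_value_set=False)
rho_convex_definition = max(alphas)      # what (5.2) would return on this set

print("rho_c from (5.3):", rho_general)
print("rho_c from (5.2):", rho_convex_definition)
```

In this constructed case the closest boundary intersection lies beyond the critical point, so (5.3) yields a radius smaller than $|1 + g_0|$ while (5.2) yields a larger one; the two definitions therefore lead to different conclusions about whether the critical point is reached.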
The upper statement in definition (5.3) states that when $-1 + j0$ is not an element of $\mathcal{V}(\omega)$, the critical perturbation radius $\rho_c(\omega)$ is defined as the difference between two distances, namely, the distance from the critical point $-1 + j0$ to the nominal point $g_0(j\omega)$ (represented by $|1 + g_0(j\omega)|$) and the distance from the critical point $-1 + j0$ to the closest critical-boundary intersection (represented by $\xi(\omega)$). On the other hand, when $-1 + j0$ is an element of $\mathcal{V}(\omega)$, the lower statement in (5.3) states that the critical perturbation radius is taken as the sum of the two distances in question. Observe that when the critical uncertainty value set is convex, $B_c(\omega)$ has only one element (i.e., there is only one critical boundary intersection), and definition (5.3) becomes equivalent to definition (5.2). Note also that to compute the critical perturbation radius from (5.3) it is necessary to have full knowledge of the set of critical boundary intersections $B_c(\omega)$ and to be able to evaluate whether the set membership condition $-1 + j0 \in \mathcal{V}(\omega)$ holds; both of these issues are completely resolved in Section 5.3 and Section 5.4 of this chapter for the case of systems with real affine parametric uncertainties. For either definition it can be shown that $\rho_c(\omega) \ge 0$ for all frequencies. Finally, the Nyquist robust stability margin $k_N(\omega) := \dfrac{\rho_c(\omega)}{|1 + g_0(j\omega)|}$ (5.5) is defined as the ratio of the critical perturbation radius to the distance between the nominal point $g_0(j\omega)$ and the critical point $-1 + j0$ measured along the critical direction. Note that $k_N(\omega) \ge 0$ for all frequencies.

5.2.2. Analysis of Robust Stability
The analysis of the robust stability of the uncertain feedback system being considered can be resolved in terms of the following theorem.

Theorem 5.1. Consider the uncertain system $g(s)$ given in (5.1) with assumptions (A1) and (A2).
Then, the closed-loop system is robustly stable under unity feedback if and only if $-1 + j0 \notin \mathcal{V}(\omega)\ \ \forall \omega$ (5.7).

Theorem 5.1 is simply a restatement of the well-known zero-exclusion principle (Barmish, 1994), and it gives a necessary and sufficient condition for the robust stability of the closed loop in question. However, Theorem 5.1 does not provide a measure of the degree of robust stability of the loop, a quantity that would be most useful as the basis for the synthesis of optimally robust controllers or for the assessment of the relative merits of alternative control schemes. The critical direction theory seeks to quantify the robust stability of such systems in terms of the Nyquist robust stability margin (5.5), which plays a role analogous to that of the structured singular value (Doyle, 1982) and of the multivariable stability margin (Safonov, 1982). Efficiency in the analysis is obtained through the realization that it suffices to verify condition (5.7) only for value-set points that lie along the critical direction; more precisely, the set membership condition (5.7) holds if and only if $-1 + j0 \notin \mathcal{V}_c(\omega)$ holds. These observations lead to the following key result of the critical direction theory.

Theorem 5.2. Consider the uncertain system $g(s)$ given in (5.1) with assumptions (A1) and (A2). Then the closed-loop system is robustly stable under unity feedback if and only if $k_N(\omega) < 1\ \ \forall \omega$ (5.8).
Proof. A complete proof is given in Latchman and Crisalle (1995) for the case where $\mathcal{V}_c(\omega)$ is convex. For the nonconvex case, in which the generalized definition (5.3) of $\rho_c(\omega)$ is utilized, the proof is extended as follows. From Theorem 5.1 the uncertain closed-loop system is stable if and only if $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$. Therefore, to prove that (5.8) is sufficient for robust stability, we must show that if $k_N(\omega) < 1\ \forall \omega$ then $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$. To prove by contradiction, assume that $k_N(\omega) < 1\ \forall \omega$ and that $\exists\, \omega$ such that $-1 + j0 \in \mathcal{V}(\omega)$.
Then applying definitions (5.3) and (5.5) for a frequency at which $-1 + j0 \in \mathcal{V}(\omega)$ gives $k_N(\omega) = \dfrac{\rho_c(\omega)}{|1 + g_0(j\omega)|} = \dfrac{|1 + g_0(j\omega)| + \xi(\omega)}{|1 + g_0(j\omega)|} = 1 + \dfrac{\xi(\omega)}{|1 + g_0(j\omega)|}$ where $\xi(\omega)$ is the nonnegative real scalar given by (5.4). Hence, $k_N(\omega) \ge 1$ for at least one frequency, which contradicts the assumption. Therefore, if $k_N(\omega) < 1\ \forall \omega$ then it follows that $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$. To prove that (5.8) is necessary for robust stability, one must show that if $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$ then $k_N(\omega) < 1\ \forall \omega$. To establish this, note that if $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$ then by definitions (5.3) and (5.5) $k_N(\omega) = \dfrac{\rho_c(\omega)}{|1 + g_0(j\omega)|} = \dfrac{|1 + g_0(j\omega)| - \xi(\omega)}{|1 + g_0(j\omega)|} = 1 - \dfrac{\xi(\omega)}{|1 + g_0(j\omega)|}$ where $\xi(\omega)$ is given by (5.4). In this case, however, since $-1 + j0 \notin \mathcal{V}(\omega)$ it follows that $-1 + j0 \notin B_c(\omega)$, and thus $\xi(\omega)$ must necessarily be a positive number. Using this fact in the above equality leads to the conclusion that $k_N(\omega) < 1\ \forall \omega$. Q.E.D.

From Theorem 5.2 it follows that the scalar $k_N(\omega)$ serves to quantify the robust stability of the closed-loop system. The computation of $k_N(\omega)$ requires knowledge of the critical perturbation radius $\rho_c(\omega)$ defined in (5.3). The challenging task in a given problem is in fact the calculation of the critical perturbation radius. When $\mathcal{V}_c(\omega)$ is convex, definition (5.2) indicates that $\rho_c(\omega)$ represents the distance between the point $g_0(j\omega)$ and the (unique) point where the critical line intersects the boundary of $\mathcal{V}(\omega)$. On the other hand, when $\mathcal{V}_c(\omega)$ is nonconvex there are multiple points where the critical line intersects the boundary of $\mathcal{V}(\omega)$. In such cases, definition (5.3) indicates that $\rho_c(\omega)$ is a function of the distance between $g_0(j\omega)$ and the boundary-intersection point that is closest to the critical point $-1 + j0$.
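To make the use of Theorem 5.2 concrete, the following sketch evaluates $k_N(\omega)$ on a frequency grid for the simplest convex case. It rests on several assumptions that are not part of the text: the nominal plant is taken as $g_0(s) = 2/(s+1)$, the value set at every frequency is taken to be a disk of fixed radius centered at $g_0(j\omega)$ (so that $\rho_c(\omega)$ equals the disk radius by definition (5.2)), and a finite logarithmic grid stands in for "all frequencies", which makes the check indicative rather than a proof.

```python
def nominal(w):
    """Hypothetical nominal plant g0(s) = 2/(s + 1) evaluated at s = jw."""
    return 2 / (1j * w + 1)

def nyquist_margin(w, radius):
    """k_N(w) = rho_c(w) / |1 + g0(jw)| for a disk value set, where rho_c(w) = radius."""
    return radius / abs(1 + nominal(w))

def robustly_stable(radius, freqs):
    """Grid version of condition (5.8): k_N(w) < 1 at every sampled frequency."""
    return all(nyquist_margin(w, radius) < 1 for w in freqs)

# Logarithmic grid from 0.01 to 100 rad/s (a finite sample of the frequency axis).
freqs = [10 ** (-2 + 4 * k / 200) for k in range(201)]

for radius in (0.5, 1.2):
    worst = max(nyquist_margin(w, radius) for w in freqs)
    print(f"radius = {radius}: max k_N on grid = {worst:.4f}, "
          f"stable on grid = {robustly_stable(radius, freqs)}")
```

For this hypothetical plant $|1 + g_0(j\omega)|$ decreases toward 1 at high frequency, so the largest margin occurs at the upper end of the grid; a disk radius above 1 therefore violates (5.8) there, while a radius of 0.5 does not.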
Since in many cases the convexity of $\mathcal{V}_c(\omega)$ at any given frequency may not be known a priori, the generalized critical radius definition allows the application of the critical direction theory without conservatism to a more general class of uncertain systems, including the case of real affine uncertain systems discussed in ensuing sections. The Nyquist robust stability margin $k_N(\omega)$ computed using the general definition (5.3) for $\rho_c(\omega)$ is attractive from an analysis standpoint because through Theorem 5.2 it gives necessary and sufficient conditions for robust stability. On the other hand, if $k_N(\omega)$ is computed using equation (5.2) for $\rho_c(\omega)$, then the condition $k_N(\omega) < 1 \;\forall \omega$ is only sufficient for robust stability when the set $\mathcal{V}_c(\omega)$ is nonconvex. From a control design point of view, however, it may be advantageous to adopt the computationally simpler definition (5.2) even for the case where $\mathcal{V}_c(\omega)$ is nonconvex, and accept the result as a suboptimal design, as is done in the context of the structured singular value paradigm, where control design is based on an upper bound rather than on the exact value of the structured singular value. It must be remarked, however, that when $\mathcal{V}_c(\omega)$ is in fact convex, using definition (5.2) for $\rho_c(\omega)$ makes the resulting condition $k_N(\omega) < 1 \;\forall \omega$ both necessary and sufficient for robust stability; and in such cases the results are not conservative. It must also be emphasized that the uncertainty value set $\mathcal{V}(\omega)$ itself does not have to be convex for the critical uncertainty value set $\mathcal{V}_c(\omega)$ to be convex.

5.3. Systems with Affine Uncertainty Structure

In this section the generalized critical direction theory is specialized to systems with real parametric uncertainties that appear in an affine fashion, namely, an uncertain rational function of the form

$$ g(s,q) = \frac{n_0(s) + \sum_{i=1}^{p} q_i\, n_i(s)}{d_0(s) + \sum_{i=1}^{p} q_i\, d_i(s)}, \qquad q \in Q \tag{5.9a} $$

where $n_0(s) := \sum_{k \ge 0} n_{0k} s^k$ and $d_0(s) := \sum_{k \ge 0} d_{0k} s^k$ are known nominal polynomials, $n_i(s) = \sum_{k \ge 0} n_{ik} s^k$ and $d_i(s) = \sum_{k \ge 0} d_{ik} s^k$ are known perturbation polynomials, and $q = [q_1 \; q_2 \; \cdots \; q_p]^T \in \mathbb{R}^p$ is a vector of real perturbation parameters belonging to the bounded rectangular polytope $Q = \{\, q \in \mathbb{R}^p : q_i \ldots$
