Title Page
 Table of Contents
 List of Tables
 List of Figures
 List of Symbols
 Two-level optimization method....
 A strong version of the discrete...
 Examples on nonconvex organiza...
 The synthesis of process flowsheets....
 Branch and sound strategy with...
 Evolutionary synthesis of process...
 Conclusions and recommendations...
 Biographical sketch

Title: Optimization of nonconvex systems and the synthesis of optimum process flowsheets
Full Citation
Permanent Link: http://ufdc.ufl.edu/UF00086028/00001
 Material Information
Title: Optimization of nonconvex systems and the synthesis of optimum process flowsheets
Physical Description: xvi, 192 leaves. : illus. ; 28 cm.
Language: English
Creator: Stephanopoulos, George, 1947-
Publisher: George Stephanopoulos
Publication Date: 1974
Copyright Date: 1974
Subject: Chemical engineering -- Mathematical models   ( lcsh )
Chemical porcesses -- Mathematical models   ( lcsh )
System analysis   ( lcsh )
Mathematical optimization   ( lcsh )
Chemical Engineering thesis Ph. D
Dissertations, Academic -- Chemical Engineering -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
Thesis: Thesis--University of Florida.
Bibliography: Bibliography: leaves 187-190.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00086028
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: oclc - 14174287
alephbibnum - 000585036
notis - ADB3668


This item has the following downloads:

optimizationofno00steprich ( PDF )

Table of Contents
    Title Page
        Page i
        Page ii
        Page iii
    Table of Contents
        Page iv
        Page v
        Page vi
    List of Tables
        Page vii
    List of Figures
        Page viii
        Page ix
        Page x
    List of Symbols
        Page xi
        Page xii
        Page xiii
        Page xiv
        Page xv
        Page xvi
        Page 1
        Page 2
        Page 3
        Page 4
        Page 4a
    Two-level optimization method. Dual gaps. Resolution using method of multipliers
        Page 5
        Page 6
        Page 7
        Page 8
        Page 9
        Page 10
        Page 11
        Page 12
        Page 13
        Page 14
        Page 15
        Page 16
        Page 17
        Page 18
        Page 19
        Page 20
        Page 21
        Page 22
        Page 23
        Page 24
        Page 25
        Page 26
        Page 27
        Page 28
        Page 29
        Page 30
        Page 31
        Page 32
        Page 33
        Page 34
        Page 35
        Page 36
    A strong version of the discrete minimum principle
        Page 37
        Page 38
        Page 39
        Page 40
        Page 41
        Page 42
        Page 43
        Page 44
        Page 45
        Page 46
        Page 47
        Page 48
        Page 49
        Page 50
        Page 51
        Page 52
        Page 53
        Page 54
        Page 55
    Examples on nonconvex organization
        Page 56
        Page 57
        Page 58
        Page 59
        Page 60
        Page 61
        Page 62
        Page 63
        Page 64
        Page 65
        Page 66
        Page 67
        Page 68
        Page 69
        Page 70
        Page 71
        Page 72
        Page 73
        Page 74
        Page 75
        Page 76
        Page 77
        Page 78
        Page 79
        Page 80
        Page 81
        Page 82
    The synthesis of process flowsheets. A general review
        Page 83
        Page 84
        Page 85
        Page 86
        Page 87
        Page 88
        Page 89
    Branch and sound strategy with list techniques for the synthesis of optimal separation schemes
        Page 90
        Page 91
        Page 92
        Page 93
        Page 94
        Page 95
        Page 96
        Page 97
        Page 98
        Page 99
        Page 100
        Page 101
        Page 102
        Page 103
        Page 104
        Page 105
        Page 106
        Page 107
        Page 108
        Page 109
        Page 110
        Page 111
        Page 112
        Page 113
        Page 114
        Page 115
        Page 116
        Page 117
        Page 118
        Page 119
        Page 120
        Page 121
        Page 122
        Page 123
        Page 124
        Page 125
        Page 126
        Page 127
        Page 128
        Page 129
        Page 130
        Page 131
        Page 132
        Page 133
    Evolutionary synthesis of process flowsheets
        Page 134
        Page 135
        Page 136
        Page 137
        Page 138
        Page 139
        Page 140
        Page 141
        Page 142
        Page 143
        Page 144
        Page 145
        Page 146
        Page 147
        Page 148
        Page 149
        Page 150
        Page 151
        Page 152
        Page 153
        Page 154
        Page 155
        Page 156
        Page 157
        Page 158
        Page 159
        Page 160
        Page 161
        Page 162
        Page 163
        Page 164
        Page 165
        Page 166
        Page 167
        Page 168
        Page 169
        Page 170
        Page 171
        Page 172
        Page 173
        Page 174
        Page 175
        Page 176
        Page 177
        Page 177a
    Conclusions and recommendations for further research
        Page 178
        Page 179
        Page 180
        Page 181
        Page 182
        Page 183
        Page 184
        Page 185
        Page 186
        Page 187
        Page 188
        Page 189
        Page 190
    Biographical sketch
        Page 191
        Page 192
        Page 193
        Page 194
Full Text






Dedicated to

The Memory of My Father

Nicholas Stephanopoulos

Mall hai s -thr~e ways o6 acting Itiu-%ef:
Fi~tty, O kimed cLtatiom, RtJLE5 L, tfie
iqob'1jcLt; Seccmwde-y, Oki Zvtetticil, tdLi 5
~iztzc ects~i.est; cuzd Tblii-dJ, or,
expe.iekice; -tlhLs i,5 -th'c -t-te.LCs t.



The author wishes to express his gratitude to:

The chairman of his supervisory committee, Dr. Arthur W. Westerberg,

Associate Professor of Chemical Engineering, for his firm support,

sound advice, and guidance throughout what has been the most

stimulating and enjoyable period of his academic life;

The members of his supervisory committee: Dr. F.P. May,

Professor of Chemical Engineering, Dr. D.W. Kirmse, Assistant Professor

of Chemical Engineering, Dr. T.E. Bullock, Associate Professor of

Electrical Engineering and Dr. M.E. Thomas, Professor of Industrial

and Systems Engineering;

The faculty and staff of the Department of Chemical Engineering

for their assistance;

The National Science Foundation, which provided financial support

through the grants GK-18633 and 41606 and the Graduate School for a

research assistantship;

Jeanette for her encouragement through the lows and the sharing

of the highs of the life.



ACKNOWLEDGMENTS ......................................... iii

LIST OF TABLES ............................................ vii

LIST OF FIBGURES ........................................... viii

LIST OF SYMBOLS ........................................... xi

ABSTRACT ........................................... ...... xiv


I. INTRODUClION ................................


II.1. Review of the Previous Works on the Non-
convex Optimization .....................

11.2. Statement of the Problem and the Two-Level
Procedure ............................... 7

11.3. Dual Gaps ............................... 10

11.4. Resolution of Dual Gaps Using the Method
of Multipliers .......................... 13

11.5. Computational Separability .............. 22

11.6. The Algorithm ........................... 24

11.7. A Discrete Minimum Principle ............ 26

II.8. Discussion .............................. 34

PRINCIPLE ...................................... 37

III.1. Review of Previous Works ............... 37

11I.2. Statement of the Problem and Its
Lagrangian Formulation ................. 39




111.3. Development of a Stronger Version of
the Discrete Minimum Principle ......... 45

111.4. The Algorithm and Computational
Characteristics ........................ 51

111.5. Discussion ............................ 53


IV.1. Numerical Examples ...................... 56

IV.l.a. Two-stage examples ............. 56

IV.l.b. Three-stage example with
recycle ........................ 60

IV.l.c. Soland's example ............... 64

IV.l.d. Jackson and Horn's counter-
example ............. ......... 64

IV.l.e. Denn's counterexample .......... 69

IV.2. The Design of a Heat Exchange Network ... 72

REVIEW ....................... .......... ....... 83

OPTIMAL SEPARATION SCHEMES ..................... 90

VI.1. Previous IUorks on the Synthesis of
Separation Schemes ...................... 90

VI.2. Statement of the Problem and the List
Techniques for the Representation of the
Separation Operations.................... 93

VI.3. Branch and Bound Strategy ............... 100

VI.4. Examples ................................ 107

VI.4.a. Example 1: n-butylene
purification system ............ 108

VI.4.b. Example 2: olefins-parrafins
separation system .............. 121



VI.5. Discussion .............................. 132


VII.1. A General Philosophy on Evolutionary
Synthesis .............................. 134

VII.2. Evolutionary Synthesis of Optimal Multi-
component Separation Sequences ......... 143

VII.2.a. Representation of separation
sequences as binary trees .... 143

VII.2.b. Neighboring flowsheets and the
evolutionary rules for a
separation sequence .......... 146

VII.2.c. Polish strings and their repre-
sentation of separation
sequences .................... 150

VII.2.d. Proof of the completeness of
the evolutionary rules ....... 155

VII.2.e. Evolutionary strategy ........ 161

VII.3. Examples of Evolutionary Synthesis 163

VII.3.a. Example 1: synthesis of a
solids' separation system .... 163

VII.3.b. Example 2: synthesis of a
multicomponent separation
sequence ..................... 167

VII.4. Discussion ............................. 174

RESEARCH ....................................... 178

APPENDICES ................................................ 183

APPENDIX A ...................................... 184

BIBLIOGRAPHY .............................................. 187

BIOGRAPHICAL SKETCH ....................................... 191


Table Page

1 Points Generated for the Two-Stage Example ....... 58

2 Points Generated for the Modified Two-Stage
Example ............ ................. ............. 61

3 Points Generated for Soland's Example ............ 65

4 The Numbers of Distinct Separators B(N) and
Distinct Flowsheets F(N) for a Mixture of r
Components and One Separation Method ............. 96

5 Initial Feed to the n-Butylene Purification
System ........................................... 109

6 Generated Flowsheets and Their Minimum Costs ..... 120

7 Specifications of the Initial Feed and the Desired
Products for Example 2 ........................... 122

8 Specification ef the Solids in Example 1 ......... 165

9 The Evolutionary Steps Taken During the Synthesis
of the Solids' Separation System ................. 168

10 Table Indicating the Effectiveness of the Proposed
Evolutionary Strategy in Synthesizing an Optimal
Separation Sequence (One Separation Method Used).. 176

11 Table Indicating the Effectiveness of the Proposed
Evolutionary Strategy in Synthesizing an Optimal
Separation Sequence (Three Separation Methods
Used) ............................................ 177


Fi ure Page

1 Geometric View of the Success of the Dual Approach .. 12

2 Geometric View of the Failure of the Dual Approach
to Yield the Primal Solution ....................... 14

3 Improvement of the Dual Bound h (X) for the
Penalized Problem Over the Dual Bound h(A) for the
Unpenalized Problem for the Same Multipliers X ..... 36

4 Relationship of Penalty Constant K Required for
Three Lagrangian Based Optimization Algorithms ..... 55

5 Two-stage Example .................................. 57

6 Three-stage Example with Recycle ................... 62

7 Jackson and Horn's Counterexample ................. 66

8 Heat Recovery Process .............................. 74

9 Uncoupled Heat Recovery Process Showing the Enthalphy
Variables Used in the Two-level Optimization Method. 75

10 Diagram Showing Notation for a Single Exchanger .... 78

11 All the Distinct Separators Generated and the Basic
Flowsheet for a Fictitious Example of 4 Components
Using 2 Separation Methods ......................... 98

12 Two-stage Example to Demonstrate the Insensitivity
of the Dual Function to the Values of the Lagrange
Multipliers .............................. .......... 106

13 Generation of the Basic Flowsheet for the Synthesis
of the n-Butylene Purification System .............. 111

14 Generation of all the Flowsheets Starting with
Separator 1 for the n-Butylene Purification System
Example ............................................ 112

LIST OF FIGURES (continued)

Figure Page
15 Generation of all the Flowsheets Starting with
Separators 2 or 5 for the n-Butylene Purification
System Example ..................................... 114

16 All the Distinct Separators Employed During the
Synthesis by Branch and Bound of the n-Butylene
Purification System ................................ 116

17 Nearly Optimum Flowsheets Retained at the End of
the Branch and Bound Synthesis of the n-Butylene
Purification System ................................ 119

18 All the Distinct Separators Employed During the
Synthesis by Branch and Bound of the Olfins,
Paraffins Separation System ........................ 123

19 Generation of the Basic Flowsheet for the Synthesis
of the Olefins, Paraffins Separation System ........ 127

20 Nearly Optimum Flowsheets Retained at the End of
the Branch and Bound Synthesis of the Olefins,
Paraffins Separation System ........................ 128

21 Generation of all Flowsheets Starting with
Separator 1 for the Olefins, Paraffins Separation
System ............................................. 129

22 Generation of all Flowsheets Starting with
Separators 2 or 3 for the Olefins, Paraffins
Separation System .................................. 130

23 Generation of all Flowsheets Starting with
Separator 4 for the Olefins, Paraffins Separation
System ............................................. 131

24 An Illustrative Diagram Showing Two Alternate Sets
of Evolutionary Rules for a Family of Flowsheets A.. 137

25 Flowsheet A and Its Neighboring Flowsheets B,C,D
and E Resulting from A with Simple Structural
Modifications ...................................... 139

26 A Separation Sequence (A), Its Corresponding Binary
Tree (B) and the Skeleton Structure (C) Corresponding
to This Tree ....................................... 144

LIST OF FIGURES (continued)

Figure Page

27 Flowsheet (A), a Down Neighbor (B) to It, and a
New Separator Type Neighbor (C) to It .............. 147

28 Separation Sequences Corresponding to the Binary
Trees of Figures 27 A and B ......................... 149

29 A Binary Tree for the Algebraic Expression
3x2y + z2/u ............. ....................... 151

30 Interpreting a Polish String Development of an
Algebraic Expression Using a Stack of Operands and
a Set of Operators ................................. 153

31 The Operators of a Polish String and Their Operands,
and the Schematic Representation of the Generation
of the Neighboring Flowsheets by Applying Rules 1
and 2 .............................................. 156

32 Flowsheet A-1 and Its Neighboring Flowsheets A-2,
A-3 and A-4 Generated Using Rules 1 and 2 .......... 158

33 A Schematic Representation of a System for the
Separation of Solids and the Corresponding Binary
Tree ............................................... 166

34 Flowsheets Generated During the Evolutionary
Synthesis of the n-Butylene Purification System
Starting from Flowsheet (a) ........................ 170

35 The Evolution Process for the n-Butylene Purification
System Starting from the Flowsheet (a) ............. 171

36 The Evolution Process for the n-Butylene Purification
System Starting from the Flowsheet (k) ............. 172

37 Flowsheets Generated During the Evolutionary
Synthesis of the n-Butylene Purification System
Starting from Flowsheet (k) ....................... 173














h .




= Heat transfer area of the i-th heat exchanger.

= Flowrates of hot streams in a heat exchange network.

= Heat capacity of stream i.

= Cost of the i-th heat exchanger.

= Set of feasible dual variables or multipliers.

= Scalar return function.

= Scalar return function for the augmented problem, may
be subscripted.

= Stage transformation function, for stage i.

= Scalar return function for subsystem j.

= Vector of equality constraint functions, may be subscripted.

= Supporting hyperplane for the set R.

= Hessian matrix of subsystem i.

= Hamiltonian function. Subscripted with i refers to the
Hamiltonian for the subsystem i.

= Hamiltonian function for the augmented problem.

= Dual function.

= Vector of inequality constraint functions for subsystem j.

= The set of i such that stream i is an input to subsystem j.

= Penalty constant, always nonnegative.

= Lagrangian function.

= Lagrangian function for the augmented problem, may be

. = Sub-lagrangian for subsystem i.
= Sub-lagrangian for subsystem i after the linear
approximation of nonseparable terms.

0(j) = The set of i such that stream i is an output of
subsystem j.

P(q) = Scalar quantity = gT(q)g(q).

Q = Heat duty of an exchanger.

q = Composite vector variable (xlu), may be subscripted.
R = The set (z ,z) such that z w(z), z c Z, may be

S = Constrained variable set, may be subscripted.

s = Direction vector, may be subscripted.

T ,T = Temperatures of a cold stream at the entrance and the
Sexit of a heat exchange network.

t. = Stage transformation function, for stage i.

ta tb,tc = Temperatures at which the hot streams a,b,c are available.

u = System decision variable vector, may be subscripted.

w(z) = Minimum {F(x); g(x) = z, x e S}.

w (z) = Minimum {F (x); g(x) = z, x c S}.

x = Vector variable associated with interconnecting streams,
may be subscripted.

y = Vector variable associated with streams leaving the
system, may be subscripted.

Z = The set of z such that 3 x c S 3 g(x) = z.

z = Perturbation vector, may be subscripted.

Greek Letters

a = Scalar parameter.

i. = Positive constant, usually = 0.6.

B = Scalar parameter.

Yi = Positive constant.

AT = Log mean temperature difference.

= Small positive quantity.

0 = System decision variable vector, may be subscripted.

\ = Vector of Lagrange multipliers, may be subscripted.

. = Vector of Lagrange multipliers for the augmented problem.

v = Vector of Kuhn-Tucker multipliers, may be subscripted.

i = Scalar parameter.

Mathematical Symbols

3 = Such that.

3,$ = There exists, there does not exist.

V = For all.

P = Zero vector.

C = Subset of.

A = Intersection with.

U = Union with.

Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment
of the Requirements for the Degree of Doctor of Philosophy



George Stephanopoulos

June, 1974

Chairman: Arthur W. Westerberg
Major Department: Chemical Engineering

Previous efforts in the area of optimization of chemical

processes have not accounted for the nonconvexities commonly en-

countered in such systems. These nonconvexities cause many of the

proposed large scale optimization strategies to fail. The subject

also of the chemical process synthesis was largely ignored until

recently despite its importance in the chemical engineering practice.

This dissertation presents techniques to overcome the deficiencies

of two related and often studied optimization methods in the presence

of nonconvexities and develops strategies for the synthesis of

process flowsheets.

Many Lagrangian based methods for the optimization of large scale

systems require that the Lagrangian function possess a saddle point.

The two-level optimization method may not be generally applicable to

chemical process design problems due to the mathematical character

of commonly encountered constraint sets and objective functions, which

commonly do not allow the existence of saddle points for the Lagrangian

function. The dissertation presents a method to overcome these

shortcomings of the two-level optimization procedure by employing

Hestenes' method of multipliers. The objective function is augmented

by a penalty term which is the sum of the squares of the connection

constraints multiplied by a positive constant. This penalty term

under certain conditions turns every stationary point of the

Lagrangian into a saddle point thus securing the success of the two-

level method. The separability of the initial system which was lost

because of the penalty term is regained by expanding the nonseparable

terms into a Taylor Series and retaining only the linear part of

it. As a direct extension of the above strategy the dissertation

presents the development of a stronger version of the discrete minimum

principle. Two algorithms have been developed to implement these

theoretical results. The success of the new methods has been

demonstrated in several small size numerical examples and in the

design of a simple heat recovery network on which the previous methods


The dissertation also develops two strategies for the synthesis

of chemical process flowsheets. The first is a branch and bound

strategy which exploits the bounding properties of the dual and primal

values which can be obtained for the flowsheet objective function.

The flowsheets are constructed in a block-by-block building procedure,

and this method was used to synthesize optimal multicomponent separation

sequences. In the method list processing techniques are used to

develop the distinct separators which can occur. At tie end of the

synthesis a small number of nearly optimum flowsheets is retained.

Further screening among them is possible to locate the optimum

flowsheet. This strategy is demonstrated in two different examples

with very encouraging results.

The second synthesis strategy, the evolutionary strategy, is

considered next and is systematized for use in the synthesis of

general process flowsheets. The evolutionary synthesis procedure is

broken into four subtasks: a) the development of a starting flowsheet,

b) the generation of evolutionary rules to produce the structural

modifications to be considered during the synthesis, c) the developing

of the proper evolutionary strategy to lead to the optimum solution

in the most effective manner, and d) the screening among the current

flowsheet and the alternative flowsheets generated. Notions such

as that of the neighboring flowsheets and the evolutionary strategy

help to put evolutionary synthesis in a correct perspective. The

evolutionary approach is illustrated with the synthesis of two

distinct separation problems with very encouraging results.



The essential concern, purpose, and culmination of engineering

is design. In chemical engineering this is the prevailing and most

important factor of its scope. Very often, a final design is achieved

without due consideration to all aspects of the design morphology.

This is necessitated by the complexity of the design problem and

the state of the limited engineering advances in certain areas.

Proper design procedure includes the three essential stages of synthesis,

analysis and evaluation. The design process is complicated by the

interrelationships existing among these stages. Frequently these

interrelationships are complex and cause design to be an iterative

process, requiring the special attention of the designer and the

development of flexible strategies which will lead to good solutions.

Analysis is a term which is equally familiar to both practicing

engineers and students of engineering. It has been developed de-

ductively and quantitatively to a high degree. Strategies have been

developed to analyze whole processes and very effective, sophisticated

methods have been proposed to resolve the complicated, time consuming

activities of the design process.

Considerable work has been done and is in progress on optimization

theory for large structured systems. Both theory and applications

have found very fertile ground in chemical engineering. The particular

feature of chemical processes (i.e., sparseness and complexity)

have caused the development of highly effective strategies for

analysis and optimization. Chemical process design has been the

instigator of such developments.

The synthesis stage of the design procedure was largely ignored

until recently in the chemical engineering literature despite its

importance in chemical engineering practice. During the last few

years the importance of creativity, innovation and invention in

designing chemical processes has been stressed and has received the

proper attention.

Because synthesis is such an important step, it became the

principal coal of this thesis. Initially we were exploring the use

of licGalliard's (McGalliard, 1971) approach to structural synthesis

for the synthesis of optimal multi component separation schemes. This

strategy involves the development of dual bounds for alternate flow-

sheets. Thus the first problem of concern was a good initial

estimation of the Lagrange multipliers to be used for the evaluation

of dual bounds. A more thorough and detailed investigation of the

physical meaning of the Lagrange multipliers and the discovery of

Hestenes' method in the literature (Hestenes 1969) led to the

development of a new method to overcome the deficiencies of Lasdon's

(Lasdon, 1964) two-level optimization method.

Then it became evident that a stronger version of the discrete

minimum principle could also be established. Thus two new strategies

evolved which can very likely be used effectively (as their application

to several small examples has indicated) to optimize large-scale

systems in the presence of nonconvexities. These approaches are of

particular interest to a chemical engineer, since the cost functions

used in process design involve the throughput to a unit raised to

the 0.6 power which is a characteristic nonconvex function.

Furthermore we explored the use of the bounding properties of

the primal and dual functions in connection with list processing

techniques to generate very good alternate solutions to the multi-

component separation problem. Finally, the evolutionary approach to

the synthesis of optimal process flowsheets was systematized, and

all these principles and ideas were illustrated in the synthesis

of an optimal separation scheme.

Thus, in Chapter II we discuss the two-level optimization method,

its theoretical foundations, its advantages and its drawbacks. The

generation of dual gaps because of 'onconvexities and their resolution

using Hestenes' method of multipliers is discussed. An algorithmic

procedure employing these ideas is described. In Chapter III the

"classical" discrete minimum principle is outlined, and its short-

comings because of nonconvexities are defined. Again Hestenes' multi-

pliers are used to develop a strong version of the discrete minimum

principle, and a new algorithmic procedure is proposed. In Chapter IV

the theories developed in Chapters II and III are tested on some

simple numerical examples and also some examples drawn from the

chemical engineering literature. In Chapter V a general review is

presented of previous works in the area of process flowsheet synthesis.

Then a branch and bound strategy, using list processing techniques,

is outlined for the synthesis of optimal multicomponent separation

sequences in Chapter VI, while Chapter VII develops a systematic

evolutionary approach for the synthesis of process flowsheets.


Chapter VII concludes with an illustrative presentation of the principles

governing the evolution of the design on a separation problem.

Finally in Chapter VIII we summarize the results of this thesis and

outline a program for further related research in the area of

chemical process design.



II.1. Review of Previous Works on the Nonconvex Optimization.

The Lagrangian approach, representing a dual method, has often

been proposed for solving optimization problems. In many engineering

design problems, the system to be optimized comprises several fairly

complex subsystems or units which are connected together by sets of

equality constraints. By appending the equality constraints to the

objective function with Lagrange multipliers, the Lagrange function,

for fixed multiplier values, is separable, leaving a subproblem for

each subsystem. The approach is therefore very attractive for this type

of problem. It can fail however because of the presence of a dual gap

at the solution point (Everett, 1963; Lasdon, 1970). Unfortunately

this failure is common in engineering design problems; therefore, the

development of a method to resolve dual gaps is of great importance.

Gaps may arise for various reasons, and a quite thorough treat-

ment of their causes and resolutions is presented by Greenberg (1969).

He reviewed the use of nonlinear supports (Gould, 1969; Bellman-Karush,

1961). the use of surrogates (Greenberg-Pierskalla, 1970; Glover, 1968),

the use of cuts via dominance and efficiency concepts (Loane, 1971;

Greenberg, 1969), and the use of branch and bound method designed for

finite separable problems. It should be noted that in all these methods

to resolve gaps, inequality connection constraints are emphasized.

Bellmore, Greenberg and Jarvis (1970) have examined the use of

penalty functions to resolve gaps, and they reviewed the relationship

between the original problem and the augmented one along with proposed

solution procedures.

Falk and Soland (1969) have proposed an algorithm to solve the

separable nonconvex programming problems where the constraints are only

upper and lower bounds on the variables. Soland (1971) has extended

the algorithm to include inequality constraints of a more general form.

Both the algorithms are of a branch and bound type and solve a sequence

of problems with convex objective functions. These problems correspond

to successive partitions of the feasible set. Greenberg (1973), in a

recent publication, has provided a sharper lower bound on the optimum

value with less computation using the Generalized Lagrange Multiplier

method, and the Falk-Soland algorithm can be modified appropriately.

All the above methods have a major drawback: the separability of

the objective function and the constraints, if it existed in the

original problem, is destroyed after the proposed modifications. Only

the last method reported by Greenberg (1969) and using a branch and

bound technique preserves the separability, but it is only applicable

for finite problems. Separability is a characteristic providing many

advantages for the solution of a large system, and it is very desirable

to preserve it.

In applying the Lagrangian approach and structural sensitivity

analysis (McGalliard, Westerberg ,1972) in engineering design problems,

ue want to resolve the dual gap problem by keeping the structural

characteristics of the system.

In the present work an algorithm is developed which makes use of

the penalty function approach together with a linear approximation of

the nonseparable terms. The original problem is replaced by a sequence

of problems, each one yielding a tighter dual bound on the optimum

solution. The solutions to the above problems form a nondecreasing

sequence of real numbers, bounded from above. Under certain conditions

developed later on, this sequence of solutions converges to the solution

of the original problem. The structural characteristics of the original

system are preserved: each subproblem is solved separately.

11.2. Statement of the Problem and the Two-Level Procedure

As a basis for the description of the two-level optimization

procedure. of the encountered dual "gaps", and of their proposed reso-

lution, we will consider the following model which represents many

engineering system design problems (as opposed to resource allocation


I(j) = {i ; stream i is an input to subsystem j}

0(j) = {i ; stream i is an output of subsystem j}

x. is a vector variable associated with the interconnecting
Stream i.

u. is a vector decision variable associated with
Ssubsystem j.

Define vector variable qu as

q = (Xk ... xk [u ) ki e I(j). (II-1)

The transformation equations which connect the subsystems are

xi ti(qj) i O(j) j = 1,...,n .


The overall system return function is the sum of the subsystem

return functions:

F = Y f (qj).


The system constraint set, excepting the interconnection constraints

(11-2), is separable, i.e.,

q Sj = {qj hj(qj) 01

j=l . ,n.

Thus, the overall optimization problem is

Minimize F = I f (q.)


subject to xi = ti(qj) i E 0(j) j = 1...,n

qj Sj

, j = 1 .. ,n.

The Lagrange

given by

function for the above problem described by (11-4) is

n n
L = fj(qj) + 1 (t (q) -
j=1 j=1 iSO(j)

x.) (11-5)

where the hi are Lagrange multipliers. Rearranging the terms in (11-5)

the Lagrangian can be written as follows:

L = I { f.(qj) +
j=1 ie0(j)

- xix}


For fixed x the problem of minimizing L

equivalent to solving the subproblems

becomes separable, and it is

Minimize Zj(qjQ


subject to qu S..

Let us now

function is

define a dual function and a dual problem. The dual

h(A) = X (minimum k.(q j;)),
j=1 qjES. j


and the dual problem is

Maximize h(A)


subject to A E D D = {X; h(A) exists } .

The two-level optimization procedure requires the following two

distinct operations:

First level: Calculate h(A) by solving the n subproblems (11-7)

Second level: Adjust the multipliers,A, so as to satisfy the inter-
connection constraints in (11-4)

In effect this procedure solves the dual problem described by (11-9).

The important question concerning this procedure relates to the

existence of a saddle point for the Lagrange function of the problem.

The following theorems provide the theoretical basis and give some

answers to the saddle point existence question. For further details

and proofs of the theorems, see Lasdon (1970).

Theorem 1: Let XAeE A point (q OA) is a constrained saddle point

for L(q,X) if and only if 1) qO minimizes L(q,Ao) over S and 2)

x. t.(qj) = 0, ieo(j), j=l,...,n.

Theorem 2: If (qo ,x) is a saddle point for L, then qO solves the

primal problem described by (11-4).

Theorem 3: h(A) is concave over any convex set of D.

Theorem 4: h(X) < F(q) for all qeS such that x ti(qj) = 0, icO(j),

j=1,...,n and for all AeD.

11.3. Dual Gaps

The basic drawback of the Lagrangian approach, and therefore of

the two-level optimization procedure, is its failure to find the solution

of a problem when the solution is in a "dual gap".

To provide additional insight into the relationship between the

primal and the dual problems and to demonstrate the formation of gaps,

we will give at first a geometric interpretation of the procedure and

second the theoretical justification of the failure of the two-level

approach in certain cases (Lasdon, 1970 ; Rockafellar, 1967 ).

Consider the family of the perturbed problems, with perturbations

Minimize F(q) = X f (qj)
j=l J

subject to xi ti(qj) = zi icO(j) j=,...,n

qj S. j=l,... n

The primal problem corresponds to zi = 0, ie0(j), j=1,...,n. Assuming

continuity of the objective function and of the constraint functions,

let us define

w(z) = minimum {F(q); xi ti(qj) = zi, iEO(j), j=l,...,n, qcS}

where z = [z, for all iEO(j), j=1,...,n].

The domain of w is

Z = {z; there exists a qcS such that x. ti(q ) = zi, iEO(j), j=l,...,n) .

Consider now the set RCEm+l

R = {(z ,z); z > w(z), zeZ}.

If w and Z are convex then R is convex. We shall call this space,

containing R, the HZ space.

The following theorem demonstrates the connection between duality
and supporting hyperplanes for the set R, see Lasdon (1970) in WZ.

Theorem 5: If eS and AEEm, then q minimizes L(q,2) over S, if and

only if H = {(z ,z); z z = L(q,2)} is a supporting hyperplane for

the set R at the point, (F(q), xi ti(qj) = zi, ie0(j), j=l,...,n).
Supporting hyperplanes exist for every point and therefore for
the solution point, if R is a convex set (Fig. 1). In the case of


i ~

5~42 qh \
'a "`-I
4a 9




(w(I (,2j)



I nh(X wO)
h (,\)= w (Z


hyperp aon

at (w(i0 O)

Figure 1. Geometric View of the Success of the Dual Approach

Cul~ -I __ .I- ----l*ur~a .-iL~l-1 9-U9~~C*-j




nonconvex R sets, there are regions consisting of constraint vectors

that are not generated by any vector X (Fig. 2). Optimum solutions

for constraints inside such inaccessible regions cannot be discovered

by straightforward application of the two-level optimization procedure,

and must be sought by other means.

The following corollary (Greenberg, 1969) helps to anticipate

the existence of such an inaccessible region, which will be termed a

"dual gap".

Corollary: A dual gap arises if some choice of Lagrange multipliers

XED produces at least two solutions to the Lagrangian problem (11-8)

with distinct zi = xi ti(qj) i 0, iEO(j), j=l,...,n.

11.4. Resolution of Dual Gaps Using the Method of Multipliers

Let us assume that the solution to problem (11-4) is in a dual

gap. According to Theorem 5, the two-level approach fails to find the

solution. The method of multipliers developed by Hestenes (1969)

will be used to cure this shortcoming of the two-level method.

Let us make the following notational changes:

ij = x t (qj), icO(j) j=1 ...,n

9 [gi 2 m gi ] ; j=l,...,n ; ik 0(j) ; k=1,... ,m

g = [g j, 92 n... ]

q = [qTl qT2.... qT]

hT = [h, hT ... h
1' 2' n

= w(o)

Gapp p

Figure 2. Geometric View of the Failure of the Dual Approach
to Yield the Primal Solution

Then problem (11-4) is written as follows:

Min F = y f (qj) = F(q)
s.t. g(q) = 0

q S

where S = S1 x S2 x ... x Sn

Consider now the following augmented problem:

Min Ft(q,k) = F(q) + K.gT(q)g(q)

s.t. g(q)= 0
q E S (II-P2)

K.> 0 .

Let us now examine the relationship between problems (II-PI) and (II-P2).

Let us make the following assumptions:

A-l: Objective function F(q) and constraints g(q) are of class C".

A-2: The set S is compact and nonempty.

Theorem 6 (Hestenes, 1969 ): Let qo be a nonsingular solution to the

problem (II-P1), then there exists a multiplier vector A and a constant

Ki such that the point q0 is an unconstrained minimum of the Lagrangian

of problem (II-P2).

Proof: Since q is the solution of the problem (II-P1), the following

conditions are true:

g(qo)= O,

and AL = AqT 2

0 where L = F(q) ATg(q) v Th(q)

Aq > 0 for all permissible variations Aq,

such that

Ag = Aq = 0 and vj [hi i j,,...,r

Note: The Kuhn-Tucker multipliers vj are always nonpositive at the

solution to (II-P1). The Lagrangian of the problem (II-P2) is:

L. = L + KigT(q)g(q).

At the point q :

aL 0

AL. = AL + Ki

lq j2q T q
_q q i~l

For permissible variations Aq such that Ag = 0 and v iAh 0,

AL. > 0 and for nonpermissible variations Aq, it is possible that

AL < 0, but there exists a K. > 0 such that

AL. > 0

and the proof is complete.

From Theorem 6 we conclude that there exist a vector A and a Ki > 0

qq q

such that the point (q ,0) is a saddle point for the Lagrangian Li,

of the augmented problem (II-P2). Therefore the problem (!I-P1)
can be solved using the two-level method, if it is augmented by the
penalty term KigT(q)g(q). Define:

w(z) = Min {F(q):g(q) = z}

wi(z) = Min {F (q,Ki):g(q) = z}

Let us examine now how the value of Ki affects the wi(z) and under

what conditions a proper choice of K. yields a supporting hyperplane

for the set R., where

Ri = {(z ,z):z >_ wi(z), zEZ}

at the solution point (wi(0),0). The following theoretical treatment

will demonstrate the effect of the penalty term on the resolution of
dual gaps via a geometrical representation.

Lemma 1:w(z) < w.(z) for z f 0 and w(O) = wi(O). More generally, if
K. > K. then w.(z) > w.(z) and w.(0) = w.(O).

Proof: The equality for z = 0 is direct since the penalty term is then
zero. Let q minimize F(q) with g(q) = z, then q minimizes Fi with

g(q) = z because if this is not true and q minimizes Fi with g(q) = z,

F(q) + KigT(q)g(q) < F(q) + KigT()g(q)

and consequently

F(q) < F(q)

which is not true since by assumption q minimizes F(q) with g(q) = z.


Min F.(q,K.) Min {F(q) + K.z z
i qES

T *
> Min {F(q) + K.z = Min F*(q,K) .
qcS 1 1 1

Therefore w.(z) > wi(z) for z 0 Q.E.D.

Lemma 1 implies that the curve w(z) vs. z (in the one-dimensional

case) is moved upwards as K. increases, keeping always the same value

at z = 0.

Lemma 2: If w(z) is a continuous function with continuous first de-

rivatives, K. > K., and
J 1

H(K ) = Max h*(A) = Max Min L (q,X,K ) = Max Min [F AT(q)]
A A qeS P qES

H(Kj) > H(Ki) and H(Kj) = H(Ki) only at the

solution of the problem (II-P1).

Proof: Let

* *T *
H(K ) = Fj(qj) ) g(q ))

* *T *
H(Ki) = Fi( (i)) (i) g(q (i)

Assume that H(K.) C H(K.), then

* *T * .*T *
Fj(q (j)) g(q ) < F(q ) g(q(i))

F(qj)) A*(g(q*) > F (qj )

F (q j)) A~ i)g(q(j)) < Fi(q i))



** *
The last inequality implies that the point (F (q(j)), g(qj)) lies

below or on the hyperplane, defined by:

H = {(z,z)I .(i)z = H(Ki)}

Let us first examine the case where it lies below the hyperplane. Then

(Fj(qj)) g(qj))) Ri


R = (zZ) Iz *(z) zZ
Ri = {(z ,z) Iz wCi(z) zEZ1


(F i((j)) g(q(j))) E Rj


Rj = {(zo,) Izo > w (Z) zeZ }

Because of lemma 1 R.C: R. and (R. R.) ( R. = (w(O),0). Thus
J 1 1 J J


(Fj(q( ) g(q( ))) E Ri

which contradicts the above result. Therefore H(Kj) 4 H(Ki).
* *
In the case that the point (F.(q(j)), g(q(j))) lies on the hyperplane,


(Ri Rj) n Ri = (w(O),O)

we conclude that H(Kj) = H(K.) only at the solution of (II-Pl).

Theorem 7: If w(z) is finite for all qes and is continuous with con-

tinuous first derivatives and existing finite one-sided directional

second derivatives in any direction s, at z = 0, then there exists a
* *
K finite such that for all K > K H(K) = H(K) = primal solution.

Proof: Since w(z) is continuous with continuous first derivatives at

z = 0, there exists a hyperplane tangent to w(z) at the point (w(0),0).

This is not a supporting hyperplane for the set R, since there is no

saddle point for the Lagrangian of the problem (II-P1) inside the dual

gap. This hyperplane is described by:

z z = w(0) with some z w(z).

Now we have to find a K. such that the above hyperplane is a supporting

hyperplane for the set R., i.e.,

T *
z = w(0) + ? z < w*(z).

We note that

w*(z) = Min {F(q) + Kig (q)g(q)g(q) = z}


w(z) + K.zz Tz > w(O)

and there must be a

Ki. > w(0) w(z) + X z for all EZ {z 3ES g(q) = z}

For z / 0 and with w(z) > -m for zeZ, a finite value of K. exists,

say L The only potential problem might occur at z = 0. IJe can use

L'Hospital's rule to find the limit as z 0 along any direction s,


w(O) w(Bs) + T(Bs) 1 2w(Bs) = finite ,
Lim x = Lim T 2 2
z+O )-0T (gs) (ss) 2

if this limit exists. If the limit does not exist, then the finiteness

of the one-sided directional second derivatives in any direction s, at

z = 0, yields:

Lim a = a. = finite, along the direction s..
Bj.-0 J J

Therefore choose K., such that

K. ? max [az, a. for every direction sj] = finite.

Therefore there exists a finite K = K. such that the hyperplane tangent

to the w(z) at the point (w(0),0), with a slope \, is a supporting

hyperplane for the set R.. Therefore the dual solution H(K ) equals

the primal solution.

Now, for any K > K H(K) H(K ) but it cannot be that H(K) H(K )

since that leads to H(K) > primal solution. Therefore H(K) = H(K ).

If 3w(z)/3z is to be discontinuous and w(z) is continuous at z = 0,

then K has to be +m, and the dual approach fails to give a solution

for the problem (II-P2), but as K increases, approaching +-, it

produces tighter lower bounds on the primal solution. The method can

fail altogether if w(z) is discontinuous at z = 0.

11.5. Computational Separability

The previously described resolution of the dual gaps suffers from

a serious drawback, namely, the separability, which existed in problem

(II-P1), has been destroyed in problem (II-P2) because of the cross-

product terms in the penalty term. Separability preserves the struc-

tural characteristics of a system and induces properties which are

always desired in solving a large-scale problem such as many of those

in engineering design.

In the following paragraphs an algorithm is proposed which resolves

the dual gap by preserving a computational separability of the system.

Using the initial notation, consider the penalty term

T n T
Km(g(q)-z)T (g(q)-z) = K M X [ti (j) xi zi [ti(qj)-xi- z]
j=1 iEO(j) 1

n s 2 2 2
= K [t2 +x +z2 -2t. z. -2tt xir
j=l ic0(j) r=l ir r r ir ir

+2ir ir]

where z is the deviation from satisfying the constraints. Under

appropriate assumptions it has been seen that for large enough Km,

the solution of the problem can occur at z. = 0 for all icO(j) and


Each term of the above triple summation consists of separable
2 2
terms tl (qj), Xi -2tir(q.)z.r, and -2Xi zi, and a crossproduct

-2tir(q )xir which is not separable. Expand the crossproduct in a

Taylor Series and consider the following linear approximation around

the point, ir, ti = t (q ):

tir (q)x. = t. x. + t. x. + t. x.
Sr ) ir ir ir ir r ir ir

Then the Lagrangian of the problem (II-P2) with constraints g(q) = z

(instead of g(q) = 0) and Lagrange multipliers p takes the following

approximate separable form:

L* n +1 2 2 A
L f.(q.) + K [t. +z. -2t. z. +2t. x. -2t. x. ]
lm j (1 J j m Lr 1 ir ir ir ir ir ir ir ir
m =l ic(j) r=l

icO(j) icl(j)

n ,
S mj (qj. )

Thus the problem of minimizing this form for L for fixed K and 1 is

equivalent to solving the subproblems:

min mj (qj,l)

j=1 ,...,n

s.t. q E S..

11.6. The Algorithm

Using the multiplier method (Section II-4) along with the above-

mentioned linear approximation for the crossproduct terms, the

following algorithmic procedure is developed which resolves the dual

gap by preserving a computational separability of a large-scale system.

In Section 11-7 a minimum principle is developed which completes the

theoretical foundation on which the algorithm is based.

Step 1: Assume a value for the penalty constant K .

Step 2: Assume values for Lagrange multipliers X (we shall relate

these to p in the step 4).

Step 3: Assume a point (xiti(qi)) for each icO(j) and j=1,2,...,n.

Step 4: Put zi = ti(qj) xi and Pi = 2Kmzi + Xi. Form the subproblems

based on the linear approximation for the crossproduct term.

Step 5: Find q. for j=1,2,...,n which solves

Mi n *
qjES m

Step 6: Update xi, t (qj) and iterate from step 4 until zi = t.(qj) xi

Step 7: Update the Lagrange multipliers A and go to step 3 until the

Max Min L has been attained. Check the constraints. If
i qeS

satisfied stop, otherwise update Km and go to step 2.

The algorithm described above requires that a sequence of Max-Min

problems be solved, each possibly requiring a large number of

iterations. Consequently the total number of iterations for

convergence may become excessively large for practical applications.

In order to accelerate convergence, we have adopted the

modifications on the updating rules proposed by Miele et al. (1972) for

the method of the multipliers. The adopted modifications are the


(i) Shorten the length of a cycle of computations. A cycle of

computations is defined to be the sequence of iterations in which the

multiplier A and the penalty constant Km are held unchanged, while the

vector qj for the subsystem j is viewed as unconstrained. The number

of iterations permitted in each cycle depends on the unconstrained

optimization technique which is employed, and it is AN = number of

iterations = 1 for the ordinary gradient algorithm and the modified

quasi-linearization algorithm (Miele et al. 1972) and AN = dimension

of vector qj, for the conjugate-gradient algorithm.

(ii) Improve the estimate of the Lagrange multipliers A, using

the formula

A(i+1) X(i) + 2 1 g(q)

where B is a scalar parameter determined so as to produce some optimum

effect, namely, to minimize the error in the necessary conditions for


(iii) Select the penalty constant Ki in an appropriate fashion.
The method used to select K depends largely on the unconstrained
optimization method which is employed. If the ordinary gradient method
is employed, the penalty constant is given by the formula

Km = 2P(q)/Pq(q) Pq(q) ,

where P(q) = gT (q)g(q) and P (q) = P If the conjugate-gradient
or the modified quasi-linearization algorithm is used, the penalty
constant is updated by the formula

K(i+ ) = min (KoK(i)) if P(q) <_ Q(q,A)
m 0 m

K(i+1) = max (K ,rKi) if P(q) > Q(q,X)
M 0 5

where: q(q,X) = F (q,X) F (q,X) F (q,A) A 1
q q q q 3q

and K /q pg /Pq(q) Pq(q)

K = ) ^ /PT
[ q T q q q

See Miele et al. (1972) for further information on these rules.

11.7 A Discrete Ninimum Principle
In this section we shall give a theoretical justification for the
algorithm given in Section 11.6. Consider the following two Lagrangian

L = F(q) + K gT ((q q) Tg(q)
m M



= F(q) + K (g(q) z)T (g(q) z) -T(g(q) z)


Geometrically (II-11) is the Lagrange function resulting from moving

the origin to z in the WZ space and adding the penalty term for

deviation from that new origin. Theorem 8 indicates a relationship

which exists between them.

Theorem 8: If q solves the problem

h _() = Min Lm(q,K m,) (II-P3)

resulting in

g(q) = z (11-12)

then q also solves, for K and z fixed at these same values, the problem

Min {Lm(q,Km,-,z)jg(q) = z} (II-P4)

Conversely, if one can find a z which permits (II-P4) to have a solution,

say q, then q solves a problem of form (II-P3).

Proof: Suppose that q solves (II-P3) and results in g(q) = z. Then by

Everett's main theorem (Everett ,1963 ), q solves the problem

irn {F(q) + K gT(q)g(q) g(q) = z} (-13)

which is equivalent to solving the problem

Min {F(q)lg(q) = z} (11-14)

since the term KgT (q)g(q) = K z i s constant for z fixed. Problem

(II-P4) is equivalent to problem (11-14); thus q solves problem (II-P4).

If z is fixed and permits (II-P4) to have a solution, say q,

then one has solved the problem

lin {F(q) + KmgT(q)g(q) (2K z + ~)T g(q) + K zTz + Tz}

which is equivalent to solving the problem

Min {F(q) K q)(q) gTg(q)} (11-15)


A = 2K z + p (11-16)

Problem (11-15) is a problem of form (II-P3), and consequently q solves

a problem of form (II-P3).

Theorem 8 says problems (II-P3) and (II-P4) are equivalent, in that

they give rise to the same (z,q) values. The following corollary

follows directly from this observation.

Corollary 8.1: If and only if 2 permits (II-P4) to have a solution,

then the set Rm corresponding to problem (II-P3) has a supporting hyper-

plane at the point (w (z),z) with a "slope" given by (11-16).

The following corollary also follows directly.

Corollary 8.2: If z permits (II-P4) to have a solution, say q, then

T- T-
h(X) = (q,K ,i z) + Km zT + Az (II-17)

where A is given by (11-16).

Proof: If z permits (II-P4) to have a solution, q, then

g(q) = z (IT-18)


L (q,Kmvuz) = F(q) (11-19)

Equation (11-17) follows immediately from (II-10), (11-18) and

(11-19). Q.E.D.

Note that corollary 8.2 gives us a formula to calculate a dual bound,

h(A), to w(O) if we have in fact solved (II-P4) rather than (II-P3).

The result following exposes some very useful properties of problem

(II-P4) for the type of engineering design problems being considered

in this paper. It also constitutes the theoretical reasoning of the

algorithm presented in Section 11-6.
Result:T ~T
Result: If the point q [q ...q n] and the multipliers p solve

problem (II-P4), then each subproblem j, j=1,...,n resulting after the

Taylor Series linear approximation of the crossproduct terms is

minimized with respect to the corresponding qj at the point qj.

Proof: Consider a system of two stages and the following family of


Min F fl(xl,u ) + f2(x2,u2)

subject to: x2 t((xl,ul) = z

h1 (x1 ,ut) 0

h2(x2,u2) 2 0

where zeZ' = {zlx2 t1(xl,ul) = z, hl(x1l,u) < 0, and h2(x2,u2) 0}.

The Lagrangian function for the augmented problem is:

L = fl(x1Ul)+ f2(x2,u2) + Km[ X t1(x1,l) z]T[x -t (x',u)- z]

-T) z T T
[2 t1(x1u) z] -vlhl (x1,u) V2h2(x2,u2)

= L + KM[x2 t1(xl,u1) z]T[x2 t1(x,u) z] ,

where V1 and v2 are Kuhn-Tucker multipliers.

Suppose that L is minimized at the point (x1 ,ul ,X,2). At this point

the Hessian matrix of the second derivatives of m must be positive
definite, ie.,
definite, i.e.,

2 atT at a2t
S 1 t ___
2 + 2K -2K 2 1 E t- z]
ax 1 ax T ax -

2 K tT 2 2t 1
T + 2Km u 2Km [x2 1- z]


m T


2 3T 2tI 9 t
xx + 2Km x 2Km [X2 tl- z]
axu 1 2u ax1ul

2 at at a2 1
2 + 2K ul u 2K [x- t z]
a-2 n u 1 m T[2- tl- z]
I. I du 1 9u1

- 2Km u
m T



-2Km sul

a2L 2
- + 2K I 2 L
22 Tm
ax2 ax23u 2


must be positive definite.

Note that A contains terms which are not separable, each of which

contains the term x t z. Since u and ( lUil'X2 2, ) minimize Lm

with -2 t1(x1'ul) z = 0, the matrix A reduces to the following


32 L Lt at
L+ 2K t
x 1 ax
1 1

T +

at 3t t
2K t
m u T

_ _2L 1t
x + 2KL i l Iu
x1uT m 9X1 uT


- 2K -
m 3xT


tT 1
2K 2
mu au

- 2K
m T


Since A must be positive definite the principal minor matrices of A

must be positive definite, i.e.

2L T at
D2L+ 2Km 1 T
ax2 m 1 x3t
1 1

22L at1
xu + 2Km Il x
axTau m u 1 XT

2L atT ;t
XL + 2K 1
x1 2,T m 2x1 uT
O2L D t 1 atl

i2 m a
1 1

- 2K
m Bx1

m 2u1

22L + 2K I
2 m

x ou2


posi tive

2 2

A2 = positive definite
x2,u2 DaL 3 L
22 2
ax23u2 aU2

Consider now the sub-Lagrangians ml and m2. After the linear

approximation of the crossproduct terms -2Kmx tl( (x,u), they take the

following form:

fml= l(Xl'Ul) Ttl(X1Ul) vhl(xul) + K (x1u )tl(xlul)

2Kmx2 tl(x1,ul) 2Kmz tl

-T T T KT2 T
22 = f2(x2'u2) + 1 x2 2(2'u2) +Kmx2x2 -2Kmt( X1 )x2 + 2Km 2

where xl,ul,X2 is the point around which the Taylor Series expansion

takes place. At the point 1 ,ulx2,u2 the following conditions are


zml n Lm -ml u Lm
-0 0
axl ax1 aul xul

m2 -= 0 and m 0
5x2 5x2 Su Su
2 x2 2 2

Also the Hessian matrices of ml and Sm2' which are precisely Al and A2
respectively, are positive definite. Thus we conclude that ml and
respectively, are positive definite. Thus we conclude that S>. and S.

are minimized at the point (xl,ul ,X2,U2) which minimizes Lm. By

induction the same result is found to apply to any number oF stages.

Since the primal problem (II-P1) corresponds to that member of

the class of problems (II-P1') with z = 0, the above mentioned result

applies to (II-P1) for a proper Km and i.

11.8. Discussion

We should note the following two observations which constitute

the essence of the proposed algorithm: 1) The use of a Taylor Series

expansion really provided two results. The first is, of course, that

the problem becomes separable. The second is evident only when one

realizes that other devices could effect the separability feature; for

example, one could rewrite the crossproduct term -2x. t. as

-2xir(xir zir) which is also separable for fixed zir. Unfortunately,

in this form in contrast to the proposed one, the subproblem, for

the unit containing x. as a variable, becomes a concave minimization

in xir as Km increases. 2) For the Taylor Series expansion approach,

we have found that problem (II-P4), and not (II-P3) to which it is

equivalent, has the desired property that the subproblems must always

be minimized if the overall problem is minimized, even if K is not

sufficiently large.
2 2
We should also note that the convex quadratic terms t. and x.
ir ir
produced by the penalty term will tend to improve and could dominate

the behavior of the subproblems as Km increases, making them easier

to solve.

The introduction of the quadratic terms in the objective function

also offers another advantage. It desensitizes the dual function with


respect to the multipliers A. For given multipliers \,

h (x) = Min L*
xi,ui ,i= ... n

h(X) = Min L
xi,ui,i= .... ,n

is closer to the primal solution than the

(see Figure 3). This is a characteristic

which can be of importance for a dual bounding procedure, such as

the one used in structural sensitivity analysis (McGalliard, Westerberg,


Figure 3. Improvement of the Dual Bound h (X) for the
Penalized Problem Over the Dual Bound h(A) for
the Unpenalized Problem for the Same Multipliers X

i' 9




The discrete form of Pontryagin's Minimum Principle proposed by

a number of authors has been shown by others in the past to be

fallacious; only a weak result can be obtained. Due to the mathe-

matical character of the objective function and the stage transformation

equations, only a small class of chemical engineering problems have

been solved by the strong discrete minimum principle. This chapter

presents a method to overcome the previous shortcomings of the strong

principle. An algorithmic procedure is developed which uses this new

version. Numerical examples are provided to clarify the approach

and demonstrate its usefulness.

III.1. Review of Previous Works

Pontryagin's minimum principle (Pontryagin et al., 1962) is a

well-known method to solve a wide class of extremal problems associated

with given initial conditions. A discrete analog of the minimum

principle, where the differential equations are substituted by dif-

rerence equations, is not valid in general but only in certain almost

trivial cases. Rozonoer (1959) first pointed out this fact. Katz (1962)

and Fan and Wang (1964) later on developed a discrete minimum principle

which was shown to be fallacious by Horn and Jackson (1965a), by means

of simple counterexamples. As was pointed out by Horn and Jackson

(1965a, b) and lucidly presented by Denn (1969), the failure of a

strong minimum principle lies in the fact that we cannot deduce the

nature of the stationary values of the Hamiltonian from a consideration

of first-order variations only. Inclusion of the second-order terms,

does not help to draw a general conclusion about the nature of the

stationary points in advance. A weak minimum principle which relates

the solution of the problem to a stationary point of the Hamiltonian

exists and is valid (Horn, 1961 ;Jackson, 1964).

In the case of control systems described by differential equations,

time, by its evolution on a continuum, has a "convexifying" effect

(Halkin, 1966) which does not make necessary the addition of some

convexity assumptions to the specification of the problem. Thus a

strong minimum principle can be applied for these problems, requiring

the minimization of the Hamitonian even in the case that a continuous

problem is solved by discretizing it with respect to the time and

using a strong discrete minimum principle. For discrete, staged systems

described by difference equations, the evolution of the system does

not have any "convexifying" effect and, in order to obtain a minimum

principle, we must add some convexity assumptions to the problem

specification or reformulate the problem in an equivalent form which

possesses inherently the convexity assumptions. This present work

belongs to the second class.

In the present chapter we propose to show a strong version of

the minimum principle which relates the solution of the problem to a

minimum point, rather than a stationary point, of the Hamiltonian.

This is attained through the use of the Hestenes' method of multipliers,

a technique used effectively in Chapter II to resolve the dual gaps

of the two-level optimization method. This method turns a stationary

point of the Hamiltonian into a minimum point, thus minimum seeking

algorithms can be used.

111.2. Statement of the Problem, and Its Lagrangian Formulation

As a basis for the description of the minimum principle, the two-

level optimization approach, their success and failure, and the

development of a strong minimum principle, consider the following

sequential unconstrained problem. (Constraints and recycles do not

change the following results (Westerberg, 1973), and we want to keep

the presentation here as simple as possible.)

Min F = i i(xi,ui) (III-P

subject to

xi+1 = fi(xi,ui)

xI = xo (given)

For every i=2,...,N the vector valued function fi(xi,ui) is given and

satisfies the following conditions:

a. the function f. is defined for all x. and u. ,
1 1 1

b. for every ui the function fi(xi,ui) is twice continuously dif-

ferentiable with respect to xi,

c. the fi(xi,ui) and all its first and second partial derivatives

are uniformly bounded.

These conditions correspond to the usual "smoothness" assumptions.

The Lagrangian function for this problem (III-P1) is given by:

L = i (xiiu. +[) i+ fi(xiui)] (x -
i-i =1 i +

N T T To T
S {Ti(x u) xi f. (x.,u )} + xxl ( l


The solution to the problem (III-P1) is a stationary point of the

Lagrangian function L. The necessary conditions for a stationary

point of the Lagrangian are:

L D i + fi
;x. axi 1i axi 9 i+1

+--A = 0,
aui aui i+l

- f (xiu) = 0

0, i=1,...,N

i= ,.. .,N

i=1 ....N. N

From equation (II1-2) we have the defining equations for the multipliers,

3m af
i- + th (III-5)

with the natural boundary condition,

xN+1 = 0 .







Equation (111-4) simply necessitates the satisfaction of the connection


The Lagrangian approach constitutes a unifying and general pre-

sentation of the necessary conditions which must be satisfied at the

solution of a problem. For the solution of the necessary conditions

different strategies have been developed. In a tutorial presentation

(Westerberg, 1973) the relationship of the different strategies to

solve problem (III-PI), with the Lagrangian approach, is established

and it is shown that methods such as sensitivity analysis, discrete

minimum principle, and the two-level optimization method are simply

different techniques to solve the same necessary conditions, eqs.

(III-1), (111-2) and (III-3).

Let us define the stage Hamiltonian H. as follows:

Hi = .i(xi,u ) + T fi(x 'ui) i=1,...,N (II -7)

and the overall Hamiltonian by:

H i Hi+ ?TX


L = Hi i 1 ixi+ XTx
i=1 1i=

and the necessary conditions (111-2), (111-3) yield:

DL 1 = O i=,...,N (II-8)
Xi 3x i

S 0 i=,...N (III-9)
aui au

Thus, in order to solve problem (III-PI) by the discrete minimum

principle, we require that each stage Hamiltonian be at a stationary


Consider the point (x,2 .. ,xN; ,u'2, ,u,) and the variations

6ulU2,... ,6uN around the previous point with respect to the controls,

such that S1uijl where the second index j denotes the jth element

of the control vector ui. 6x1 will be taken as zero. The variational

equations corresponding to the connection constraints of the problem

(III-P1), with up to the second order terms included, are:

2 2
i1 T f + 1 (6x T fi
6xi1 (6ui) + -(x ) + (6u.)T ( ) + (x -- x
i+l T 2 u u2 x2 (u2 i
au. ax u ax
1 i i

+ (6ui)T i iT (6xi) + O(E2

For the considered sequential system, the solution of the above system

with respect to the 6x.'s, i=1,...,N, is straightforward and yields

the following general formula in terms of the variations in the controls


6xT 1 x -- y(6u ) + n k xk (u )T_]2 (u )
=1 k=R+1 k au =1 k=+1 k 'Bu

f T T T2 f
S l (6ukT k sts 2 n-l ax+t
2 .=1 k,m=1 v=+1 z x u 3k s=k+1 xs ax2 tm+l
T -

u m T2 =1 k=1 v=+1 x uk =k sk+l s ax u
in IV


The variation in the objective function F caused by the variations
in the controls is given by:

N Fi i 2 2i
SF ((u.) + i (6x.) + 1 (6u. (6ui)

1 =
T 1 T 2 2 1
i=1 u ax 3u
i i i

1 T 2 i T 32 i 2
+ (xi)T x (6x.) + (Xui) T (Ox1) + 0(c ).
i au 3ax

Substituting into the last expression for 6xi's, i=2,...,N with their

equals from eqs. (III-10) and, noting that the stage Hamiltonians
H., i=l,...,N, are given by eq. (III-7), we find:

N H (6, T 2 (6 2
6F = (u ) + (6u ) (u )
=1 au a u
SN T -1 -1 f 3

lN z fT f 2HZFzI9 I T fro
2 (k Su x 1 a2 9 Xum
S=l k,m=1 k s=k+l 3x t=m+l t au

1 N N af -1 af] 32H
+ (6uk)T 3x T (6u)
=k k=l uk Ls=k+l Ss xSu u" (III-11)

Unlike the continuous case (Denn, 1969; Halkin, 1966) there is no
general way in which the last two terms in eq. (III-11) can be made to

vanish. Therefore, the variations considered in the controls may well

produce 6F < 0, or 6F > 0, or 6F = 0. Thus it is evident from the
above that a strong minimum principle, which requires that the solution
to the problem (III-P1) minimizes the stage Hamiltonians H z=1,...,N,
is not generally available (Horn and Jackson, 1965b; Denn, 1969;

Halkin, 1966). A weaker form of the discrete minimum principle can

be used and requires that the solution to the problem (III-P1) makes

the stage Hamiltonians stationary. The examples presented by Horn

and Jackson, which counter the strong minimum principle, are such

that the stage Hamiltonians do not possess a minimum stationary point

whereas the problem itself does. However, there exist special cases

where the strong minimum principle can be applied (Horn and Jackson,

1965b; Denn, 1969).

At this point a further clarification is required. The strong

discrete minimum principle fails for physically staged steady-state

systems. Halkin (1966) has shown that it always succeeds for discrete-

time systems, obtained as an approximation to the continuous time

systems, since the time increment t can become arbitrarily small and

make the last two terms in eq. (III-11) vanish. The success or the

failure of the strong discrete minimum principle can also be explained

in terms of convexity (or directional convexity) or lack of it for the

sets of the reachable states of the system (Halkin, 1966; Holtzman

and Halkin, 1966). This constitutes an important characteristic and

in fact it is the basis for the development of a strong version of

the discrete minimum principle, which follows.

The two-level optimization procedure is an infeasible decomposition

strategy of a Lagrangian nature which solves the problem (III-Pl).

The similarity between the Lagrangian and the weak discrete minimum

principle is a well-known fact and in a recent note (Schock and Luus,

1972) the similarity between the two-level optimization procedure and

the discrete minimum principle has been pointed out. Further insight

in the relationship between the last two methods will be given later

in this chapter, and a stronger version of the discrete minimum

principle will be developed, based on a method to overcome the

shortcomings of the two-level optimization procedure developed in

Chapter II.

111.3. Development of a Stronger Version of the Discrete
Minimum Principle

In this section we will develop a stronger version of the discrete

minimum principle using Hestenes'method of multipliers and following

a procedure similar to the one that was developed in Chapter II to

overcome the dual gaps of the two-level optimization procedure.

As a basis for the presentation we will use problem (III-P1).

Consider now the augmented problem (III-P2)

N T To
iin F = i(iui) + K E [xi1- f [x1- fi] K(x -x) (x1-xo)
i=1 i=l

subject to (!II-P2)

Xi+l = fi(xi'ui) i=1,..., N

xl = xo (given)

with K > 0.

The Hamiltonian for this problem is given by:

+ T T
S= F + i+l fi(xi ui) + llx

In the following theorems we will establish some important properties

of H

Theorem 9: If (1, ...,N ) is a stationary point of the Hamiltonian H

for the problem (III-P1), then it is also a stationary point for the

Hamiltonian H of the augmented problem (III-P2) for any real value

of the parameter K.

Proof: From the equations defining H and 11 we take,

H = H + K I [x+- f]T[xi+ fi] + K(xl x)T(x- ) (II-12)

Differentiation with respect to the controls yields:

N [xil f.]T
3H 3H + 2K i+1 u [xi+ fi] "
Du Su u +1 1

Evaluating the derivatives at the point (0,,...,.N), since 0 = 0

and x i f. = 0 for i=l,...,N, and we find

H 3H
3u u

Theorem 10: Let u=0 be a local isolated minimum of F with all the

connection constraints satisfied. Then there exists a real valued
2 *
parameter K such that u- 0 and H > 0 (i.e., the matrix
0Su 0 u2 0

is positive definite) for K > K .

Proof: In Theorem 9 it was shown that- =- 0 is independent of the

constant K; therefore, it will be valid and for K > K Let us
2 2
consider next the matrix 2 H /0u :

2* 2H N [x i+ fiLT [xil fJ
32 2 + 2K u 1
u2 u =2 u
Du Do i-1 Do

The solution of problem (III-P1) is a stationary point of the

Hamiltonian H. The nature of the stationary point for the discrete

case cannot be predetermined and thus 2 0. Assuming that we
have a nonsingular problem, i.e.

[xi+l fkT D[xi+1 fi]
uT f- / 0 i=1 ...,N
B u T

and we conclude that

T 2H T a2 N 3[xi+l fi T
(su)T 2H (6u) = (6u) 2 (6u) + 2K Y T (6u)
au Du i=l Du

D[xi+l f] 6 > 0
1+1 1T (6u) > 0
L Tau

for a large enough K so that the second term which is always positive

prevails over any negativity of the first term. Q.E.D.

The second theorem implies that any stationary point of the

Hamiltonian H can be made a minimum point by choosing the parameter K

large enough.

Theorems 9 and 10 establish the following result: For a large

enough K, a local solution to problem (III-P2) which is also a local

solution to problem (III-P1), minimizes the Hamiltonian H of the

augmented problem (III-P2).

The above result requires that we minimize the overall problem

Hamiltonian, H The available weak minimum principle permits us to

solve the overall problem by solving one problem per stage per

iteration since H decomposes into a sum of stage Hamiltonians.

Unfortunately H does not decompose, as we shall now see.

Let us see now how we can simplify the above result and relate

the solution of the problem (iII-P1) or (III-P2) to stage Hamiltonians.

Consider the penalty term in the objective function of the problem


K [x+- fT[xi ] f K [x x+ + f1 2x fx-
i=1 =

We note that each member of the summation includes separable terms,

e.g., x+ xi+l and f i f and nonseparable terms such as the cross-

product x Tifi. The following develops a strategy to decompose

computationally the problem of minimizing H This approach is moti-

vated by the work on solving nonconvex problems using the two-level


Expand the crossproduct term in a Taylor Series and consider the

following linear approximation around the point, il f = (xui):

T T ? T T ^
xi1 f + i i i+ i + xi+ i

Then the Hamiltonian of the augmented problem takes the following
form H for H*
form H for H :

H = (x.,ui) + K i [x+x. f+ + 2x+ 2x+ 2
.1 i+ i+l + i i-i- i i +ll1 1+1+

+ K(x- x)T(x- xo) + N + T o
x1 1x)- 1 iIi+fi + A il1
No T -T A T + Kx
E [ + Kfif. + 2x l 2T + +xx]
i~ 1*1 11 i+1 i i i-i i+l 1 1

= H + Ax + Kx N+ 2Kx fN + Kxx (III-13)

H. + Kx x + Kfif + 2Kx f 2Kx f 2x f,}+ A
1 1 1 1 1 j+l i i+l i ii i+l i

Let us now establish the following result which also constitutes

the basis of the proposed algorithm (given later in this chapter).

-T -T -T -T
Theorem 11: If the point u = (uT,u,... ,u) minimizes the overall
Hamiltonian H then each stage Hamiltonian Hi, i=1,...,N, resulting

after the Taylor Series linear approximation of the crossproduct

terms is minimized with respect to the corresponding control variable

u at the point ui

Proof: At the solution point we require feasibility, i.e.

xi+l = fi(xi'ui)


(This requirement is evident from the algorithm to be presented in

the next section.) From eq. (111-13) it is clear then that

Dui ui

i=l ,... N ,

and therefore a stationary point of IH with respect to u. is a

stationary point of H. with respect to u.. The Hessian matrix of

the second derivatives of H must be positive definite at the point

u, i.e.

2 H*

must be positive definite. Thus, all the submatrices on the diagonal
2* 2* u*
must be positive definite, i.e., 2 ... 2 2 must be
ul su2 c2

2 *
positive definite. But 2

2- therefore all the 1 i=1,...,N
u 3au
1 i

are positive definite.

Theorem 11 implies that the minimization of the H with respect

to the control variables can be replaced by the problems

Iiin Hi

i=1 . N .

111.4. The Algorithm and Computational Characteristics

The theorems of section 111.3. imply the following algorithm

which constitutes a stronger version of the now available discrete

minimum principle.

Step 1: Assume a value for the penalty constant K.

Step 2: Assume values for the control variables ul... ,uN

Step 3: Using the values u, ...,uN, solve the state equations


xi+l fi(xi'ui)

i=l .. ,N

fon:ard and find x2,...,xN. Let x2,...,xN be the found values

(x, = x, given constant).

Step 4: Using the values l ,... ,uN and x ,...,xI, solve the adjoint


3<^. aofT
1 ax ax i+l
S i...

SN+l= 0

backwards. Let Al...,N+A be the found values.

Step 5: Formulate the Hamiltonian H of the augmented problem (III-P2)

and expand the crossproduct terms around the point

(xI ... ,xN;u .... UN). Formulate the stage Hamiltonian Hi,

i=1,...,. Minimize all Hi, i=1,...,N and find optimal values

for the controls, say u., i=1,...,N. If the minimization

procedure fails to produce a minimum for at least one H.

increase K and go to step 2.

Step 6: If luij uij| < E for all j and for i=1,...,N (j denotes the

jth component of the vector ui) stop, the solution is found,

otherwise assume new values for the controls putting

New u. = u. i=1,...,N

and go back to step 3.

The effective use of the above algorithm depends largely on the value

of the constant K. Very small values of K will not produce the

necessary convexity of the stage Hamiltonians, while a very large K

will mask the objective function and will make the algorithm insensitive

to the descent direction of the objective function. Miele et al. (1972)

have suggested a method for choosing a proper value for the constant

K in the context of the Hestenes method of multipliers. As will be

discussed in section III-5, the two methods are related and this

method for choosing K should be satisfactory here.

III.5. Discussion and Conclusions

As mentioned before, the failure of the strong discrete minimum

principle was caused by the fact that the stationary points of the

stage Hamiltonian with respect to the controls are not always minima

points. The method proposed in the previous sections of this chapter

turns every stationary point of the Hamiltonian into a minimum point

for a large enough K. Thus we can find the solution of the original

problem at a local minimum point of the stage Hamiltonians. This

constitutes a stronger result than that currently available, where

we must search for a stationary point of the stage Hamiltonians.

The method used in this chapter to establish a stronger version

of the discrete minimum principle parallels in many respects the method

used in Chapter II to resolve the dual gaps in the two-level opti-

mization procedure. The source of the shortcomings for both the

methods (minimum principle and two-level method) is the nonconvexity

of the objective function and/or the stage transformation equations.

The two-level optimization method fails if either the objective

function or transformation equations or both are nonconvex with respect

to the control and/or the state variables.

For the resolution of the dual gaps of the two-level method, the

same penalty term was used, multiplied by a positive constant K.

It was required that K be large enough so that the stage sub-Lagrangians

become locally convex with respect to the control and state variables.

Since, in the present work, we have required that the K be large

enough to turn the stage Hamiltonians convex with respect to the

controls only, we conclude that the K required by the strengthened

form of the minimum principle to solve a nonconvex problem is at

most as large as the K required by the two-level optimization method

to resolve dual gaps.

Fig. 4 compares the values of K required to solve a nonconvex

problem for the three methods discussed, namely, weak discrete

minimum principle, the strengthened form of discrete minimum principle

developed in this work, and the two-level optimization method.

Note, KTLM > KSMP > 0, where KTLM is the least value of K required

for the two-level method and KSMp is the least value of K for the

minimum principle developed here. Thus it should be clear that,

although the methods are related, they do not have equivalent short-

comings; it is possible to find problems which can be solved by the

weak minimum principle or even the method developed here, and not by

the two-level method. However given a K 2 KTLM' all three methods




IV.1. Numerical Examples

IV.l.a. Two-stage Examples

Consider the following two stages example, Fig. 5, taken from

McGalliard's thesis (McGalliard ,1971 ) and described by:

0.6 0.6
Min F = -t3(x2,u2) + x6 + 2u, + x26 + 5u2

s.t. x2 = t2(xl,u1) = 3x1 + 3ul

x3 = t3(x2,u2) = 2x2 + 2u2

x, 0

u1 0

X1 < 3

x2 0_

2> 0


ix+ 2u1 < 4 Lx2+ 2u2 < 4

Using the regular two-level optimization procedure, a table

generated, see Table 1, and the solution is found to be:

x.- = 0 or 3

u = 0

x = 4

u = 0


I |>


Figure 5. Two-stage Example




x2 x2 u1 x2 u2 Min L

+ Cm





















4 0




Min tL2 h(2)



- 9.3

- 7.1

- 6.3




and Max Min [F A(x2- 3x -3ul)] = -4.84
A xl,UlX 2,U2

Two solutions are produced with

z1 = x2 3x1 3u1 = + 4 3(3) + 0 = -5

z2 = x2 3x1 3u = + 4 3(0) + 0 = 4

and from the corollary in Section 111.3. we conclude that a dual gap


Introduce the penalty term K(x2- 3x1- 3ul)2 in the objective F

and make a linear approximation of the crossproduct terms x2x1 and

x2u1. Following the algorithm in Section 111.5., we find the solution:

x1 = 4/3 x2 = 4 with A = 0.5 and K = 10

u1 = 0 u2 = 0 and

F = Max Fin
X x1,u ,x2,u2

[F-A(x2- 3x1-3ul)]= -4.51 -

Substitute unit 2, in example 1, by another described as follows:

2 = t3(x2,u2) + 2x6 + 2u2

t3(x2,u2) = 2x2 + 3u2

x 2xu2 2
S2 = u2 < 2

x2+u2 j 4

x1,u ,x2,u2

The problem becomes:

Min F = x16 + 2ul 2x2 u2 + 2x 6

s.t. x2 = 3x1 + 3

x1,l U S1

x2,u2 E S2

The two-level procedure yields the solution (see Table 2)

x1 = 0 and 3 x2 = 4 with Max Min
A x u1 ,x2,u2

U1 = 0

[F-A(x2- 3x1-3ul)]= -2.54

u2 = 0

and a dual gap is detected again.

The proposed method yields the solution

x2 = 4

u2 = 0

F = Max Min
X x1,ulx2,u2

for A = 0.8 and K = 10

[F-h(x2- 3x1- 3ul)] = -2.07

with all constraints satisfied.

IV.l.b. Three-stage example with recycle

In the two-stage example, let us insert an additional stage

with a recycle loop from this new stage to stage 1, Fig 6.

x = 4/3

u- = 0

Mi n




A2 U1 x2 2 Mlif n1 R 1 h(22)











-* cx,















- 7.56

- 4.06

- 2.56

- 2.42

- 0.767

- 0.031















-+- oo

->+ 0



- 9.56

- 6.06


- 4.42

- 3.137

- 2.561

- 2.546

- 3.00

-- C

Figure 6. Three-stage Examnple with Recycle

6 .) IX,, u,3

The problem is:

M in F = x'6 + 2u + x26 + 5u2 -4x u + x4


x2 = 3x1 + 3ul

x3 = 2x2 + 2u2

u = u3/4

x i, e S1

x2,u2 e S2

x3,'u3 0
x3'u3 E S3 x 4

x3+u3 < 6

Since the solution of this problem is in a dual gap, we apply the

proposed resolution which yields

x- = 0.67 x2 = 2 x3 = 4

u1 = 0 u2 = 0 u3 = 0

for \1 = 1.0 2 = 1.0 ,3 = 0.5 and K = 10.

Iiin F = Max Min [F-I (x2- 3x1- 3u1) 2 (x3-2x2-2u2)
The x1,u1,x2,u2.x3,u3 13 X1,X2 u3 X l,ux2 ,U2 X3,U3

= -11.96

with all constraints satisfied.

IV.l.c. Soland's example

This example is taken from Soland (1971):

Min f = -12x1 7x2 + x2

s.t. ((x) = -2x1 + 2 x2 0

0 5 x 2

The solution found by Soland

the generation of 12 points,

subproblems is:

x = 0

(which is not the global minimum) after

which necessitated the solution of 23

and f = -10

x2 = 2

The algorithm proposed in this work generated a sequence of 8 points,

solving 8 subproblems, and found the solution:

x = 0.718

and f = -16.7387

x2 = 1.46847

The sequence of generated points is shown in Table 3. The starting

point is X1 = 2.00, x2 = 0.0 .

IV.l.d. Jackson and Horn's counterexample

Consider the following two-stage example, Fig. 7, taken from

Horn and Jackson (1965a). This example is the counterexample which

demonstrated the fallacy of the strong minimum principle and is



A x x2 h(A)

-60 2.00 0.025 -919.04

-58.9 1.7365 0.0393 -713.07

-25.5 1.6513 0.06146 -182.64

-18.9 1.5511 0.07873 -108.35

-12.5 1.4318 0.1719 78.52

- 6.5 1.2799 0.3435 27.25

- 1.0 1.06 0.6326 22.16

+ 3.0 0.718 1.46847 16.7389

1 ^ 0 j' i S V.J

U *4 i T

2.L V
11^clac.a-tB; ,~.ia;waiKK

r* %

; I-3~t e~LIA~i~i mT~~hi> I.~

Figure 7. Jackson and Horn's Counterexample

5L,2" I

U N i T



described by:

Min x2
1 02

subject to

1 o 1 1 12
x = x1 26 (1)2

1 o 1+
x2 = x2

x2 = x + (x + (62)2

x2 = arbitrary

xo = xo = 1
x1 2
The solution to this problem is 61 = 2 = 0 which gives the smallest
value of x1. It, however, does not minimize the Hamiltonian for the

stage 1 since this Hamiltonian has only one stationary point which
is a maximum. Let us now apply the procedure of the strong discrete
minimum principle developed in the previous sections. Thus we have:

H K[xo 201 (12 2 + K[x0 + 2 2K(x)[x 291 (1)2]
o o 1 1 -1 o 1 1 1O

-2K(x2)[xo + 1] + K(xl)[xo 21 (61)2] + K( )[x2 ]

+A 1 1 1 11 2 2o 1
iXl 2)1 (1 + X2x2 + XA2


H* x + (x2)2 + (62)2 + K(x (x2 2K(x 2 1 )2-

-2K(x )[x + 1] + K(Xl)[x 2 1 (1 ] + K(x )[x2 + ]

1 1 1 1
1 1 x 2x2


=1 + 1 o ( ~12
A1 21+K(x1) 2K~x1 -26 [- 6 )

A2 = 2(x2) + 2K(x2) 2K[x2 + 1

The first necessary conditions yield:

1 K(13+ 6K(12 + [10K 2Kx + 2K(x)
Ml I

- Al91

+ [-4Kxo + 2Kx2 + 4K(x2) 2K(x2) 2X1 + A] = 0

-- 2(2) = 0

1. Put K = 1

2. Assume, 6

which gives 2 = 0 .

2 = 0

1 1 1 3
3. Find, x 1 2
S2 2

and x2 = 1 + 1 = 2

4. Also, A 1 and = 2(2) = 4
1 A2

~1 ~2
5. Minimize H1 and H2 which give: 1 = 0.947 and 6 = 0

|11 O11 = |.0531 > E = 0.001

Go back to step 2,

2a. Assume 1 = 0.947 and a2 = 0

1 1
3a. Find, x1 =- 1.343 and x2 = 1.947

1 1
4a. Find, A = 1 and X1 = 3.894
-2* 2
5a. Minimize H1 and H2 and find: 61 = 0.9 2 = 0

Go back to step 2a and assume 1 = 0.9 62 = 0. Continue in

the same way until 6e 1 < E and 2 6 2 < This finally

leads, in 7 iterations, to the solution 1 = 2 = 0 which is the

solution of the problem.

Note that if K = 0.1 the nonconvexity of the first connection

constraint with respect to 01 does not disappear. The stronger version

of the minimum principle developed in this paper does not succeed in

giving the solution.

IV.I.e. Denn's counterexample

Consider the following example taken from Denn (1969), which is

also a counterexample to Katz' strong minimum principle and is described


Minimize x1
1 2
u ,u

subject to

1 o o 1 1)2
x1 = x(1 + x2) 2 ()
1 o o 1
x2 = 4x1 2x2 + u

2 1 1 1 ,22
S= x (1 + x2) (u

2 1 1 2
x2 = 4xI 2x2 + u
2 1 2x2

x- =- 1/4

and x = 1

1 2
where u and u2 are to be chosen subject to:

1 2
0 < u u ,u

This is a constrained problem but it can be handled equally as well as

the unconstrained problem by the developed method. The inequalities

will be handled through Kuhn-Tucker multipliers.

The above problem has a solution at the point ui = 1 and u2 = u,

but the Hamiltonian for the first stage does not have a minimum and

thus Katz' strong minimum principle cannot be used. In fact the

Hamiltonian of the first stage possesses one stationary point which

is a maximum. Let us now apply the algorithm developed in the

present work. Thus we take:

= Al(u )2 + ( 1+ )u1 + A1 x(1 + x12)+ 42x1 2A2x2 + K(x1)2((1+12)2
S 2 11 221 2 12

+ 16K(x)2 +K(x(x11,2 16Kx x1 +u + Kx21(1 + x2) 2Kx1x1(1+x2)

1 2 1 22 1 2
+ -^ (1 + x)(2) Kx(1 + x)(u2) + 4K12 8Kxx2

^2 I2 1^2 122 I2 1-2
2Kx2x2 + 4Kx2x2 4K4Kx1u + 8Kx1u + 2Kx2u 4Kx2u


H = 22 2 Kx2 2 + ( 4 + Kx 2) K(x2 + K(u2)
H2 1 2 A1(u ) A u Kx1 Kx1(u ) 2K(x

- ^2 2^u ^2 K >2j +21 2)K
2Kx2u 2 u 2) + Kx1x1(1 x12) 2Kxxl(l+x2) 1 12)

2 1 2 142 1 1 2 12 1 2
Kx(1+x2)(u2) 4Kx1x2 8Kx1x2 2Kxx2 + 4Kx2x2

I 2 1 2 12 21 2
4Kx12 + SKx1u2 + 2Kx2u 4Kx12u

The necessary conditions to be satisfied are:

3H 1 1 and
-1 1(ul) + 2 + = 0 and

21 2 2 2 =1 1

S K(u A Kx + K 2Kx Kx1 +2)
1 -1
+ 8Kxl 4Kx2=0

The adjoint variables satisfy the following equations:

A1 (1+x(2) + 4A + 2K(x1)(1+x1 2 + 32K(x1) 16Kx 2KX^(1+x2)

K(+x )(u2) 8Kx 2 8Kv2

12 2 -2
Kx1u + 4Kx2 4Ku

2 2 2 A A
X2 = 2x + K + K(u2) 2Kx^(1 + x2)

2 = 2K(x) 2Ku + 4Kx1 .

Note that in the above equations the variables pl and p2 are Kuhn-

Tucker multipliers through which we handle the inequalities
1 2 *
1 2
u u and u > u For p1 and 12 we have P2 < 0.

Starting with K = 10 and initial assumptions ul = 2 and u2 = 2,

we apply the same algorithmic procedure as presented in the previous

example, and after 5 iterations we find the solution

u = 1 and u

which is the solution for the problem.

IV.2. The Design of a Heat Exchange Network

Avery and Foss (1971) have demonstrated that the method of two-

level optimization may not be generally applicable to chemical process

design problems due to the mathematical character of commonly en-

countered objective functions. Since the cost functions used in

chemical engineering design problems typically include a term

x where x is the throughput in a unit, nonconvex problems arise.

The method for resolving the nonconvexities, as described in

Chapter II will be applied in this section to the design of a

heat exchange network presented by Avery and Foss and possessing

inherent nonconvexities.

The cold stream D of Figure 8 is to be heated from temperature

T to T Three hot streams having flowrates a,b,c and temperatures
o p

ta, tb' tc may be used for heating; here they are considered to have

sufficient heat capacity and availability to accomplish the task.

The distribution of the total heat load among the three exchangers

is to be accomplished at the minimum cost of equipment, which is related

to the total heat transfer surface area A by

3 3 ac.
C = i Ai Ai (IV-I)
i=l i=l

The parameters y and a are positive constants; typically a = 0.6.

It is easily demonstrated that this problem has a minimum cost solution

and it is unique.

Let xI and x2 be the enthalpies of stream D entering and leaving

respectively, the first heat exchanger (see Figure 9). Similarly

yl and y2 are the enthalpies for the second exchanger and z, and z2

for the third. Note that x1 and z2 are known since the flowrate and

Cold Stre mYw

Figure 8. Heat Recovery Process

Hot Streams


t b s i'c

a 4

Stream D

Figure 9. Uncoupled Heat Recovery Process Showing the
Enthalpy Variables Used in the Two-level
Optimization Method


temperatures of stream D are known at the entrance, to the first
exchanger and the exit of the third exchanger. Thus, considering

x2Y1' Y2 and z1 as the design variables, the minimization problem
can be formulated as follows:

Min [C1(X2) + C2(1',2) + C3(z)]
x2Y1 Y2,z1

s.t. x2 =

Y2 = Z1 (IV-2)

The Lagrangian for this problem is:

L = Ci- X(x2 Y1) A2(Y2 z1)

= {C1(x2) X1x2} + {C2(Y1,Y2) + XAIY X2Y2} + {C3(zl)+ 2Z }

= 1(x,2'1) + Z2(yy2',X1,A2) + j3(z1,X2) (IV-3)

For given values of the multipliers X and x2 the two-level optimization

method requires the minimization of the sublagrangians 1l' k2 and 93.

A sufficient condition for a function to be at a minimum point is
that the Hessian matrix be positive definite at that point. The

Hessian for 22 is defined as
L22 2C
3 C2 3 C2 d e

H2 = Q2 Q3yl 1 (IV-4)
32C2 2C2 e d2
Yl yQ Y712

and the conditions assuring that both eigenvalues are positive are

d1 > 0 d2 > 0 (IV-5)

A (d d2 e2)> 0 (IV-6)

In the above, the heat duty Q = y2 yl in exchanger 2 has been

substituted for the variable y2.

d d2 and e may be determined from equation (IV-1), and the

following relations, which are suitable for the determination of heat

exchange area. Figure 10 identifies the notation used here.

Q = CD (T2 T1) = CB (t2 t1)

Q =Y2 Y

T1= Y1/CD

A =

(t2 T2) (t T1) CD
AT = r f- I
Tm (t2 T2) r
n 2) B
(t T)
1 1

Without loss of generality, it is convenient to take r > 1; identical

conclusions hold when r 1. It may be shown that

dl = q[(a-l) (rs-1)2 + p {(rs)2 1}]

d2 = q[(ra-) (s-1)2 + p(s2-1)]

e = q[(a-l) (rs-l) (s-l) + p(rs2-_)]

/d ka


it' Ma
Y, ;1, xx
U .

Figure 10. Diagram Showing Notation for a Single Exchanger


62 CDOt2-Yl >
61 CDt2-rQ-Y1


q = Ka p2/62 0

K I C a
K=Y L-2 >0

Substitution of di, d2 and e into equation (IV-6) gives

A (dd2-e ) = q p s (r-1)

which implies that A is never positive, and that, regardless of the

parameters of the problem, the stationary point can never be a
minimum. Therefore the two-level method fails.

Consider now the following augmented problem:

2* 2 2
Min F = Cl(X2) + C2(YIy2) + C3(z ) + K(x K(x Ky (y2-z1)

s.t. x2 Yl = 0 and y2 z, = 0

with K > 0.

The coupled terms x2y, and Y2z1 are expanded in a Taylor Series

around the point x2,Y1,Y2,z1 and are approximated by the linear

terms in the expansion. The Lagrangian of the above problem becomes:


*~ 2 *~
L {C1(x2) (X + 2K Y)x2 + K x + K x2}

+ {C2(y1'Y2) + ( 2K*x2)Yl (2 2Kzl)Y2

*2 *2 *-~ *~
+ K Y1 + K Y2 + K x21 + K Y2z}
*~ 2 *~ ~

+ {C3( ) + ()2 2K y2)z1 + K z1 + K y2z}

= I + 2 + 93

The necessary conditions for the minimum of C are

VL = 0 and V L = 0


v = [x2yl'y2,z1-]T

and = ( T,2 A2)

This results in the following equations

0 _

RZ2 R2
y1 y Y2

= 0
a z

x2 = Yl and Y2 = z ;

that is, each subsystem is at a stationary point with respect to the

variables associated with it. The Hessian matrix for the new sub-
agrangian 2 of the second exchanger is:
lagrangian S2 of the second exchanger is:


2 C2 *
--+ 2K

a2C2 *
+ 2K
ay IQ

a2C2 *
_ + 2K

Q2C2 *
S+ 4K
a 2

The terms d1 d2 and e

dI = dI + 2K

are easily found to be

*= d *+
d2 = d2 + 4K

* *
e = e + 2K

* ** *2
A = [dd2 (e ) ]

= 4(K*)2 + (4d1 + 2d2 4e) K* + (d d2-e2)

= 4(K*)2 + r2K + P3


r2 = 4dl + 2d2 4e

3 = dd2 -

A has roots

K -F2 (F 2 16F3)

It was shown before that F 5 0



(22 163) >

and the roots are real and of opposite sign. Let K1 > 0 and

K2 0 then for K > K A > 0. Also for

K > d1/2 K3 d1 > 0

and for

K > d2/2 K4 d2 > 0

Therefore if K = max {K, K3, K4} we find H2 > 0 and the stationary

point for unit 2 is a minimum as required.

Thus we see that the curvature of a sublagrangian at the sta-

tionary point can be determined not only by the sub-unit cost

function, C2 in this case, but also by the numerical value of the

penalty factor K

The above example demonstrates the applicability of the developed

strategy in a wide class of chemical engineering design problems

where the conventional two-level method fails to provide the solution.



Processing systems are characterized by two distinct features.

The first is the nature of the process units and their interconnections,

and the second is the capacities and the operating conditions of

the units. Synthesis of optimal processing schemes requires a search

over the whole space of structural alternatives as well as over the

space of the design variables for each structural configuration.

The synthesis of an optimal process flowsheet represents a great

challenge for a chemical engineer in the area of systems engineering.

A number of significant contributions have moved the problem from

the state of art to the state of semi-art where the personal capa-

bilities of the designer have been organized and systematized in a

logical manner. The size of the problem is very large and the methods

currently available simply give good solutions which are happily

accepted by practicing designers. The progress of mathematical systems

theory in this area has been very slow and no major thrusts have been

attained. Thus the chemical engineer has to resort to techniques

that are available for use with his own experience and intuition.

We shall attempt to present the major contributions to the

solution of the synthesis problem in order to relate and compare the

synthesis strategies proposed in this work, and presented in Chapters VI

and VII, with the general trends of thought already established by

other researchers in this area.

University of Florida Home Page
© 2004 - 2010 University of Florida George A. Smathers Libraries.
All rights reserved.

Acceptable Use, Copyright, and Disclaimer Statement
Last updated October 10, 2010 - - mvs