Citation
Optimization of nonconvex systems and the synthesis of optimum process flowsheets

Material Information

Title:
Optimization of nonconvex systems and the synthesis of optimum process flowsheets
Creator:
Stephanopoulos, George, 1947-
Publisher:
George Stephanopoulos
Publication Date:
Copyright Date:
1974
Language:
English
Physical Description:
xvi, 192 leaves. : illus. ; 28 cm.

Subjects

Subjects / Keywords:
Chemical engineering ( jstor )
Design engineering ( jstor )
Heuristics ( jstor )
Lagrange multipliers ( jstor )
Lagrangian function ( jstor )
Mathematical vectors ( jstor )
Minimum principle ( jstor )
Objective functions ( jstor )
Purification ( jstor )
Separators ( jstor )
Chemical Engineering thesis Ph. D
Chemical engineering -- Mathematical models ( lcsh )
Chemical processes -- Mathematical models ( lcsh )
Dissertations, Academic -- Chemical Engineering -- UF
Mathematical optimization ( lcsh )
System analysis ( lcsh )
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis--University of Florida.
Bibliography:
Bibliography: leaves 187-190.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
14174287 ( OCLC )
022916622 ( AlephBibNum )
ADB3668 ( NOTIS )


Full Text





















OPTIMIZATION OF NONCONVEX SYSTEMS
AND THE SYNTHESIS OF OPTIMUM
PROCESS FLOWSHEETS










By


GEORGE STEPHANOPOULOS


A DISSERTATION PRESENTED TO THE GRADUATE
COUNCIL OF THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA
1974



























Dedicated to


The Memory of My Father

Nicholas Stephanopoulos




















Man has three ways of acting wisely:
Firstly, on meditation, this is the
noblest; Secondly, on imitation, this is
the easiest; and Thirdly, on
experience; this is the bitterest.

Confucius














ACKNOWLEDGMENT


The author wishes to express his gratitude to:

The chairman of his supervisory committee, Dr. Arthur W. Westerberg, Associate Professor of Chemical Engineering, for his firm support, sound advice, and guidance throughout what has been the most stimulating and enjoyable period of his academic life;

The members of his supervisory committee: Dr. F.P. May, Professor of Chemical Engineering, Dr. D.W. Kirmse, Assistant Professor of Chemical Engineering, Dr. T.E. Bullock, Associate Professor of Electrical Engineering, and Dr. M.E. Thomas, Professor of Industrial and Systems Engineering;

The faculty and staff of the Department of Chemical Engineering for their assistance;

The National Science Foundation, which provided financial support through the grants GK-18633 and 41606, and the Graduate School for a research assistantship;

Jeanette, for her encouragement through the lows and for sharing the highs of life.














TABLE OF CONTENTS


Page

ACKNOWLEDGMENTS ......................................... iii

LIST OF TABLES ............................................ vii

LIST OF FIGURES ........................................... viii

LIST OF SYMBOLS ........................................... xi

ABSTRACT ........................................... ...... xiv

CHAPTERS:

I. INTRODUCTION ................................

II. TWO LEVEL OPTIMIZATION METHOD. DUAL GAPS AND
THEIR RESOLUTION USING METHOD OF MULTIPLIERS ... 5

II.1. Review of the Previous Works on the Non-
convex Optimization .....................

II.2. Statement of the Problem and the Two-Level
Procedure ............................... 7

II.3. Dual Gaps ............................... 10

II.4. Resolution of Dual Gaps Using the Method
of Multipliers .......................... 13

II.5. Computational Separability .............. 22

II.6. The Algorithm ........................... 24

II.7. A Discrete Minimum Principle ............ 26

II.8. Discussion .............................. 34

III. A STRONG VERSION OF THE DISCRETE MINIMUM
PRINCIPLE ...................................... 37

III.1. Review of Previous Works ............... 37

III.2. Statement of the Problem and Its
Lagrangian Formulation ................. 39



III.3. Development of a Stronger Version of
the Discrete Minimum Principle ......... 45

III.4. The Algorithm and Computational
Characteristics ........................ 51

III.5. Discussion ............................ 53

IV. EXAMPLES OF NONCONVEX OPTIMIZATION ............. 56

IV.1. Numerical Examples ...................... 56

IV.1.a. Two-stage examples ............. 56

IV.1.b. Three-stage example with
recycle ........................ 60

IV.1.c. Soland's example ............... 64

IV.1.d. Jackson and Horn's
counterexample ................. 64

IV.1.e. Denn's counterexample .......... 69

IV.2. The Design of a Heat Exchange Network ... 72

V. SYNTHESIS OF PROCESS FLOWSHEETS. A GENERAL
REVIEW ....................... .......... ....... 83

VI. BRANCH AND BOUND STRATEGY FOR THE SYNTHESIS OF
OPTIMAL SEPARATION SCHEMES ..................... 90

VI.1. Previous Works on the Synthesis of
Separation Schemes ...................... 90

VI.2. Statement of the Problem and the List
Techniques for the Representation of the
Separation Operations.................... 93

VI.3. Branch and Bound Strategy ............... 100

VI.4. Examples ................................ 107

VI.4.a. Example 1: n-butylene
purification system ............ 108

VI.4.b. Example 2: olefins-paraffins
separation system .............. 121









VI.5. Discussion .............................. 132

VII. EVOLUTIONARY SYNTHESIS OF PROCESS FLOWSHEETS ... 134

VII.1. A General Philosophy on Evolutionary
Synthesis .............................. 134

VII.2. Evolutionary Synthesis of Optimal Multi-
component Separation Sequences ......... 143

VII.2.a. Representation of separation
sequences as binary trees .... 143

VII.2.b. Neighboring flowsheets and the
evolutionary rules for a
separation sequence .......... 146

VII.2.c. Polish strings and their repre-
sentation of separation
sequences .................... 150

VII.2.d. Proof of the completeness of
the evolutionary rules ....... 155

VII.2.e. Evolutionary strategy ........ 161

VII.3. Examples of Evolutionary Synthesis 163

VII.3.a. Example 1: synthesis of a
solids' separation system .... 163

VII.3.b. Example 2: synthesis of a
multicomponent separation
sequence ..................... 167

VII.4. Discussion ............................. 174

VIII. CONCLUSIONS AND RECOMMENDATIONS FOR FURTHER
RESEARCH ....................................... 178

APPENDICES ................................................ 183

APPENDIX A ...................................... 184

BIBLIOGRAPHY .............................................. 187

BIOGRAPHICAL SKETCH ....................................... 191













LIST OF TABLES


Table Page

1 Points Generated for the Two-Stage Example ....... 58

2 Points Generated for the Modified Two-Stage
Example ............ ................. ............. 61

3 Points Generated for Soland's Example ............ 65

4 The Numbers of Distinct Separators B(N) and
Distinct Flowsheets F(N) for a Mixture of N
Components and One Separation Method ............. 96

5 Initial Feed to the n-Butylene Purification
System ........................................... 109

6 Generated Flowsheets and Their Minimum Costs ..... 120

7 Specifications of the Initial Feed and the Desired
Products for Example 2 ........................... 122

8 Specification of the Solids in Example 1 ......... 165

9 The Evolutionary Steps Taken During the Synthesis
of the Solids' Separation System ................. 168

10 Table Indicating the Effectiveness of the Proposed
Evolutionary Strategy in Synthesizing an Optimal
Separation Sequence (One Separation Method Used).. 176

11 Table Indicating the Effectiveness of the Proposed
Evolutionary Strategy in Synthesizing an Optimal
Separation Sequence (Three Separation Methods
Used) ............................................ 177













LIST OF FIGURES


Figure Page

1 Geometric View of the Success of the Dual Approach .. 12

2 Geometric View of the Failure of the Dual Approach
to Yield the Primal Solution ....................... 14

3 Improvement of the Dual Bound h*(λ) for the
Penalized Problem Over the Dual Bound h(λ) for the
Unpenalized Problem for the Same Multipliers λ ..... 36

4 Relationship of Penalty Constant K Required for
Three Lagrangian Based Optimization Algorithms ..... 55

5 Two-stage Example .................................. 57

6 Three-stage Example with Recycle ................... 62

7 Jackson and Horn's Counterexample ................. 66

8 Heat Recovery Process .............................. 74

9 Uncoupled Heat Recovery Process Showing the Enthalpy
Variables Used in the Two-level Optimization Method. 75

10 Diagram Showing Notation for a Single Exchanger .... 78

11 All the Distinct Separators Generated and the Basic
Flowsheet for a Fictitious Example of 4 Components
Using 2 Separation Methods ......................... 98

12 Two-stage Example to Demonstrate the Insensitivity
of the Dual Function to the Values of the Lagrange
Multipliers .............................. .......... 106

13 Generation of the Basic Flowsheet for the Synthesis
of the n-Butylene Purification System .............. 111

14 Generation of all the Flowsheets Starting with
Separator 1 for the n-Butylene Purification System
Example ............................................ 112








15 Generation of all the Flowsheets Starting with
Separators 2 or 5 for the n-Butylene Purification
System Example ..................................... 114

16 All the Distinct Separators Employed During the
Synthesis by Branch and Bound of the n-Butylene
Purification System ................................ 116

17 Nearly Optimum Flowsheets Retained at the End of
the Branch and Bound Synthesis of the n-Butylene
Purification System ................................ 119

18 All the Distinct Separators Employed During the
Synthesis by Branch and Bound of the Olefins,
Paraffins Separation System ........................ 123

19 Generation of the Basic Flowsheet for the Synthesis
of the Olefins, Paraffins Separation System ........ 127

20 Nearly Optimum Flowsheets Retained at the End of
the Branch and Bound Synthesis of the Olefins,
Paraffins Separation System ........................ 128

21 Generation of all Flowsheets Starting with
Separator 1 for the Olefins, Paraffins Separation
System ............................................. 129

22 Generation of all Flowsheets Starting with
Separators 2 or 3 for the Olefins, Paraffins
Separation System .................................. 130

23 Generation of all Flowsheets Starting with
Separator 4 for the Olefins, Paraffins Separation
System ............................................. 131

24 An Illustrative Diagram Showing Two Alternate Sets
of Evolutionary Rules for a Family of Flowsheets A.. 137

25 Flowsheet A and Its Neighboring Flowsheets B,C,D
and E Resulting from A with Simple Structural
Modifications ...................................... 139

26 A Separation Sequence (A), Its Corresponding Binary
Tree (B) and the Skeleton Structure (C) Corresponding
to This Tree ....................................... 144










27 Flowsheet (A), a Down Neighbor (B) to It, and a
New Separator Type Neighbor (C) to It .............. 147

28 Separation Sequences Corresponding to the Binary
Trees of Figures 27 A and B ......................... 149

29 A Binary Tree for the Algebraic Expression
3x2y + z2/u ............. ....................... 151

30 Interpreting a Polish String Development of an
Algebraic Expression Using a Stack of Operands and
a Set of Operators ................................. 153

31 The Operators of a Polish String and Their Operands,
and the Schematic Representation of the Generation
of the Neighboring Flowsheets by Applying Rules 1
and 2 .............................................. 156

32 Flowsheet A-1 and Its Neighboring Flowsheets A-2,
A-3 and A-4 Generated Using Rules 1 and 2 .......... 158

33 A Schematic Representation of a System for the
Separation of Solids and the Corresponding Binary
Tree ............................................... 166

34 Flowsheets Generated During the Evolutionary
Synthesis of the n-Butylene Purification System
Starting from Flowsheet (a) ........................ 170

35 The Evolution Process for the n-Butylene Purification
System Starting from the Flowsheet (a) ............. 171

36 The Evolution Process for the n-Butylene Purification
System Starting from the Flowsheet (k) ............. 172

37 Flowsheets Generated During the Evolutionary
Synthesis of the n-Butylene Purification System
Starting from Flowsheet (k) ....................... 173













LIST OF SYMBOLS


$A_i$ = Heat transfer area of the i-th heat exchanger.

$a, b, c$ = Flowrates of hot streams in a heat exchange network.

$C_i$ = Heat capacity of stream i.

$c_i$ = Cost of the i-th heat exchanger.

$D$ = Set of feasible dual variables or multipliers.

$F$ = Scalar return function.

$F^*$ = Scalar return function for the augmented problem, may be subscripted.

$f_i$ = Stage transformation function, for stage i.

$f_j$ = Scalar return function for subsystem j.

$g$ = Vector of equality constraint functions, may be subscripted.

$H$ = Supporting hyperplane for the set R.

$H_i$ = Hessian matrix of subsystem i.

$H$ = Hamiltonian function. Subscripted with i refers to the Hamiltonian for the subsystem i.

$H^*$ = Hamiltonian function for the augmented problem.

$h$ = Dual function.

$h_j$ = Vector of inequality constraint functions for subsystem j.

$I(j)$ = The set of i such that stream i is an input to subsystem j.

$K$ = Penalty constant, always nonnegative.

$L$ = Lagrangian function.

$L^*$ = Lagrangian function for the augmented problem, may be subscripted.






$\ell_i$ = Sub-Lagrangian for subsystem i.
-I?
= Sub-lagrangian for subsystem i after the linear
approximation of nonseparable terms.

$O(j)$ = The set of i such that stream i is an output of subsystem j.

$P(q)$ = Scalar quantity = $g^T(q)\, g(q)$.

$Q$ = Heat duty of an exchanger.

$q$ = Composite vector variable $(x \mid u)$, may be subscripted.

$R$ = The set of $(z_0, z)$ such that $z_0 \ge w(z)$, $z \in Z$, may be subscripted.

$S$ = Constrained variable set, may be subscripted.

$s$ = Direction vector, may be subscripted.

T ,T = Temperatures of a cold stream at the entrance and the exit of a heat exchange network.

$t_i$ = Stage transformation function, for stage i.

$t_a, t_b, t_c$ = Temperatures at which the hot streams a, b, c are available.

$u$ = System decision variable vector, may be subscripted.

$w(z)$ = Minimum $\{F(x);\ g(x) = z,\ x \in S\}$.

$w^*(z)$ = Minimum $\{F^*(x);\ g(x) = z,\ x \in S\}$.

$x$ = Vector variable associated with interconnecting streams, may be subscripted.

$y$ = Vector variable associated with streams leaving the system, may be subscripted.

$Z$ = The set of z such that $\exists\, x \in S \ni g(x) = z$.

$z$ = Perturbation vector, may be subscripted.


Greek Letters

$\alpha$ = Scalar parameter.

$\alpha_i$ = Positive constant, usually = 0.6.

$\beta$ = Scalar parameter.

$\gamma_i$ = Positive constant.

$\Delta T$ = Log mean temperature difference.

$\varepsilon$ = Small positive quantity.

$\theta$ = System decision variable vector, may be subscripted.

$\lambda$ = Vector of Lagrange multipliers, may be subscripted.

$\lambda^*$ = Vector of Lagrange multipliers for the augmented problem.

$\nu$ = Vector of Kuhn-Tucker multipliers, may be subscripted.

i = Scalar parameter.


Mathematical Symbols

$\ni$ = Such that.

$\exists, \nexists$ = There exists, there does not exist.

$\forall$ = For all.

$\mathbf{0}$ = Zero vector.

$\subset$ = Subset of.

$\cap$ = Intersection with.

$\cup$ = Union with.







Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment
of the Requirements for the Degree of Doctor of Philosophy



OPTIMIZATION OF NONCONVEX SYSTEMS
AND THE SYNTHESIS OF OPTIMUM
PROCESS FLOWSHEETS

by

George Stephanopoulos

June, 1974

Chairman: Arthur W. Westerberg
Major Department: Chemical Engineering

Previous efforts in the area of optimization of chemical processes have not accounted for the nonconvexities commonly encountered in such systems. These nonconvexities cause many of the proposed large scale optimization strategies to fail. The subject of chemical process synthesis was also largely ignored until recently, despite its importance in chemical engineering practice. This dissertation presents techniques to overcome the deficiencies of two related and often studied optimization methods in the presence of nonconvexities and develops strategies for the synthesis of process flowsheets.

Many Lagrangian based methods for the optimization of large scale systems require that the Lagrangian function possess a saddle point. The two-level optimization method may not be generally applicable to chemical process design problems due to the mathematical character of commonly encountered constraint sets and objective functions, which commonly do not allow the existence of saddle points for the Lagrangian function. The dissertation presents a method to overcome these shortcomings of the two-level optimization procedure by employing Hestenes' method of multipliers. The objective function is augmented by a penalty term which is the sum of the squares of the connection constraints multiplied by a positive constant. This penalty term, under certain conditions, turns every stationary point of the Lagrangian into a saddle point, thus securing the success of the two-level method. The separability of the initial system, which was lost because of the penalty term, is regained by expanding the nonseparable terms into a Taylor series and retaining only the linear part. As a direct extension of the above strategy, the dissertation presents the development of a stronger version of the discrete minimum principle. Two algorithms have been developed to implement these theoretical results. The success of the new methods has been demonstrated in several small numerical examples and in the design of a simple heat recovery network on which the previous methods fail.

The dissertation also develops two strategies for the synthesis of chemical process flowsheets. The first is a branch and bound strategy which exploits the bounding properties of the dual and primal values which can be obtained for the flowsheet objective function. The flowsheets are constructed in a block-by-block building procedure, and this method was used to synthesize optimal multicomponent separation sequences. In this method, list processing techniques are used to develop the distinct separators which can occur. At the end of the synthesis a small number of nearly optimum flowsheets is retained. Further screening among them is possible to locate the optimum flowsheet. This strategy is demonstrated in two different examples with very encouraging results.

The second synthesis strategy, the evolutionary strategy, is considered next and is systematized for use in the synthesis of general process flowsheets. The evolutionary synthesis procedure is broken into four subtasks: a) the development of a starting flowsheet, b) the generation of evolutionary rules to produce the structural modifications to be considered during the synthesis, c) the development of the proper evolutionary strategy to lead to the optimum solution in the most effective manner, and d) the screening among the current flowsheet and the alternative flowsheets generated. Notions such as that of the neighboring flowsheets and the evolutionary strategy help to put evolutionary synthesis in a correct perspective. The evolutionary approach is illustrated with the synthesis of two distinct separation problems with very encouraging results.













CHAPTER I

INTRODUCTION


The essential concern, purpose, and culmination of engineering is design. In chemical engineering this is the prevailing and most important factor of its scope. Very often, a final design is achieved without due consideration to all aspects of the design morphology. This is necessitated by the complexity of the design problem and the limited state of engineering advances in certain areas. Proper design procedure includes the three essential stages of synthesis, analysis and evaluation. The design process is complicated by the interrelationships existing among these stages. Frequently these interrelationships are complex and cause design to be an iterative process, requiring the special attention of the designer and the development of flexible strategies which will lead to good solutions.

Analysis is a term which is equally familiar to both practicing engineers and students of engineering. It has been developed deductively and quantitatively to a high degree. Strategies have been developed to analyze whole processes, and very effective, sophisticated methods have been proposed to resolve the complicated, time consuming activities of the design process.

Considerable work has been done and is in progress on optimization theory for large structured systems. Both theory and applications have found very fertile ground in chemical engineering. The particular features of chemical processes (i.e., sparseness and complexity) have caused the development of highly effective strategies for analysis and optimization. Chemical process design has been the instigator of such developments.

The synthesis stage of the design procedure was largely ignored until recently in the chemical engineering literature despite its importance in chemical engineering practice. During the last few years the importance of creativity, innovation and invention in designing chemical processes has been stressed and has received the proper attention.

Because synthesis is such an important step, it became the principal goal of this thesis. Initially we were exploring the use of McGalliard's (McGalliard, 1971) approach to structural synthesis for the synthesis of optimal multicomponent separation schemes. This strategy involves the development of dual bounds for alternate flowsheets. Thus the first problem of concern was a good initial estimation of the Lagrange multipliers to be used for the evaluation of dual bounds. A more thorough and detailed investigation of the physical meaning of the Lagrange multipliers and the discovery of Hestenes' method in the literature (Hestenes, 1969) led to the development of a new method to overcome the deficiencies of Lasdon's (Lasdon, 1964) two-level optimization method.

Then it became evident that a stronger version of the discrete minimum principle could also be established. Thus two new strategies evolved which can very likely be used effectively (as their application to several small examples has indicated) to optimize large-scale systems in the presence of nonconvexities. These approaches are of particular interest to a chemical engineer, since the cost functions used in process design involve the throughput to a unit raised to the 0.6 power, which is a characteristic nonconvex function.
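The nonconvexity of the 0.6-power cost law can be checked numerically. The following is a minimal sketch, not taken from the dissertation: the function names, the unit cost coefficient and the interval [1, 9] are illustrative assumptions. It compares the cost at the midpoint of an interval with the average of the end-point costs; a concave cost lies above its chord, which is the opposite of what convexity requires.

```python
def unit_cost(throughput: float, exponent: float = 0.6) -> float:
    """Power-law cost of a unit; the coefficient of 1.0 is an illustrative assumption."""
    return throughput ** exponent


def midpoint_excess(a: float, b: float) -> float:
    """Cost at the midpoint minus the average of the end-point costs.

    A positive value means the cost curve lies above its chord on [a, b],
    which is what concavity (and hence nonconvexity) looks like.
    """
    return unit_cost(0.5 * (a + b)) - 0.5 * (unit_cost(a) + unit_cost(b))


excess = midpoint_excess(1.0, 9.0)
print(f"midpoint excess on [1, 9]: {excess:.4f}")  # positive for a concave cost
```

The same property is the familiar economy of scale: doubling the throughput less than doubles the cost.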

Furthermore we explored the use of the bounding properties of the primal and dual functions in connection with list processing techniques to generate very good alternate solutions to the multicomponent separation problem. Finally, the evolutionary approach to the synthesis of optimal process flowsheets was systematized, and all these principles and ideas were illustrated in the synthesis of an optimal separation scheme.

Thus, in Chapter II we discuss the two-level optimization method, its theoretical foundations, its advantages and its drawbacks. The generation of dual gaps because of nonconvexities and their resolution using Hestenes' method of multipliers is discussed. An algorithmic procedure employing these ideas is described. In Chapter III the "classical" discrete minimum principle is outlined, and its shortcomings because of nonconvexities are defined. Again Hestenes' multipliers are used to develop a strong version of the discrete minimum principle, and a new algorithmic procedure is proposed. In Chapter IV the theories developed in Chapters II and III are tested on some simple numerical examples and also some examples drawn from the chemical engineering literature. In Chapter V a general review is presented of previous works in the area of process flowsheet synthesis. Then a branch and bound strategy, using list processing techniques, is outlined for the synthesis of optimal multicomponent separation sequences in Chapter VI, while Chapter VII develops a systematic evolutionary approach for the synthesis of process flowsheets. Chapter VII concludes with an illustrative presentation of the principles governing the evolution of the design on a separation problem. Finally, in Chapter VIII we summarize the results of this thesis and outline a program for further related research in the area of chemical process design.












CHAPTER II

TWO-LEVEL OPTIMIZATION METHOD. DUAL GAPS.
RESOLUTION USING METHOD OF MULTIPLIERS


II.1. Review of Previous Works on the Nonconvex Optimization.

The Lagrangian approach, representing a dual method, has often been proposed for solving optimization problems. In many engineering design problems, the system to be optimized comprises several fairly complex subsystems or units which are connected together by sets of equality constraints. By appending the equality constraints to the objective function with Lagrange multipliers, the Lagrange function, for fixed multiplier values, is separable, leaving a subproblem for each subsystem. The approach is therefore very attractive for this type of problem. It can fail, however, because of the presence of a dual gap at the solution point (Everett, 1963; Lasdon, 1970). Unfortunately this failure is common in engineering design problems; therefore, the development of a method to resolve dual gaps is of great importance.

Gaps may arise for various reasons, and a quite thorough treatment of their causes and resolutions is presented by Greenberg (1969). He reviewed the use of nonlinear supports (Gould, 1969; Bellman-Karush, 1961), the use of surrogates (Greenberg-Pierskalla, 1970; Glover, 1968), the use of cuts via dominance and efficiency concepts (Loane, 1971; Greenberg, 1969), and the use of a branch and bound method designed for finite separable problems. It should be noted that in all these methods to resolve gaps, inequality connection constraints are emphasized.









Bellmore, Greenberg and Jarvis (1970) have examined the use of penalty functions to resolve gaps, and they reviewed the relationship between the original problem and the augmented one along with proposed solution procedures.

Falk and Soland (1969) have proposed an algorithm to solve separable nonconvex programming problems where the constraints are only upper and lower bounds on the variables. Soland (1971) has extended the algorithm to include inequality constraints of a more general form. Both algorithms are of a branch and bound type and solve a sequence of problems with convex objective functions. These problems correspond to successive partitions of the feasible set. Greenberg (1973), in a recent publication, has provided a sharper lower bound on the optimum value with less computation using the Generalized Lagrange Multiplier method, and the Falk-Soland algorithm can be modified appropriately.

All the above methods have a major drawback: the separability of the objective function and the constraints, if it existed in the original problem, is destroyed after the proposed modifications. Only the last method reported by Greenberg (1969), which uses a branch and bound technique, preserves the separability, but it is only applicable to finite problems. Separability is a characteristic providing many advantages for the solution of a large system, and it is very desirable to preserve it.

In applying the Lagrangian approach and structural sensitivity analysis (McGalliard and Westerberg, 1972) in engineering design problems, we want to resolve the dual gap problem while keeping the structural characteristics of the system.








In the present work an algorithm is developed which makes use of the penalty function approach together with a linear approximation of the nonseparable terms. The original problem is replaced by a sequence of problems, each one yielding a tighter dual bound on the optimum solution. The solutions to the above problems form a nondecreasing sequence of real numbers, bounded from above. Under certain conditions developed later on, this sequence of solutions converges to the solution of the original problem. The structural characteristics of the original system are preserved: each subproblem is solved separately.
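The penalty idea can be illustrated on a toy problem. The sketch below is not the algorithm of this dissertation: it applies the standard method of multipliers and minimizes the augmented function jointly, without the linear approximation that restores separability. The problem data are hypothetical: one connection constraint u - x = 0, a concave return -u² for the first subsystem, a convex return 10(x - 1)² for the second, and both variables restricted to [0, 2]. The penalty constant K = 5 and the update λ ← λ + 2K(u - x) are assumptions of the sketch.

```python
K = 5.0  # penalty constant; it must exceed 10/9 here for the augmented function to be convex


def return_function(u, x):
    """F = f1(u) + f2(x) with a concave f1 and a convex f2 (toy data)."""
    return -u ** 2 + 10.0 * (x - 1.0) ** 2


def minimize_augmented(lam, k=K):
    """Minimize F + lam*(u - x) + k*(u - x)**2 jointly over u and x.

    For k > 10/9 the function is a strictly convex quadratic, so the
    stationary point computed below is its global minimizer.
    """
    u = (2.0 * k - lam) / (1.8 * k - 2.0)
    x = 1.0 + u / 10.0
    assert 0.0 <= u <= 2.0 and 0.0 <= x <= 2.0, "minimizer left the box [0, 2]"
    return u, x


def method_of_multipliers(lam=0.0, k=K, tol=1e-10, max_iter=100):
    """Alternate between minimizing the augmented function and updating lam."""
    for _ in range(max_iter):
        u, x = minimize_augmented(lam, k)
        residual = u - x  # violation of the connection constraint
        if abs(residual) < tol:
            break
        lam += 2.0 * k * residual  # multiplier update
    return lam, u, x


lam_opt, u_opt, x_opt = method_of_multipliers()
print(lam_opt, u_opt, x_opt, return_function(u_opt, x_opt))
```

For these data the iterates approach u = x = 10/9 with multiplier 20/9, which is the constrained minimum of the toy problem, even though the unpenalized return function is concave in u.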


II.2. Statement of the Problem and the Two-Level Procedure

As a basis for the description of the two-level optimization procedure, of the encountered dual "gaps", and of their proposed resolution, we will consider the following model which represents many engineering system design problems (as opposed to resource allocation problems):

$I(j) = \{i;\ \text{stream } i \text{ is an input to subsystem } j\}$

$O(j) = \{i;\ \text{stream } i \text{ is an output of subsystem } j\}$

$x_i$ is a vector variable associated with the interconnecting stream i.

$u_j$ is a vector decision variable associated with subsystem j.

Define vector variable $q_j$ as

$$q_j = (x_{k_1}, x_{k_2}, \ldots \mid u_j), \qquad k_i \in I(j). \tag{II-1}$$
J








The transformation equations which connect the subsystems are

$$x_i = t_i(q_j), \qquad i \in O(j), \quad j = 1,\ldots,n. \tag{II-2}$$


The overall system return function is the sum of the subsystem return functions:

$$F = \sum_{j=1}^{n} f_j(q_j). \tag{II-3}$$


The system constraint set, excepting the interconnection constraints (II-2), is separable, i.e.,

$$q_j \in S_j = \{q_j \mid h_j(q_j) \le 0\}, \qquad j = 1,\ldots,n.$$


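Thus, the overall optimization problem is

$$
\begin{aligned}
\text{Minimize}\quad & F = \sum_{j=1}^{n} f_j(q_j) \\
\text{subject to}\quad & x_i = t_i(q_j), \qquad i \in O(j), \quad j = 1,\ldots,n \\
& q_j \in S_j, \qquad j = 1,\ldots,n.
\end{aligned}
\tag{II-4}
$$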


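The Lagrange function for the above problem described by (II-4) is given by

$$L = \sum_{j=1}^{n} f_j(q_j) + \sum_{j=1}^{n} \sum_{i \in O(j)} \lambda_i^T \big(t_i(q_j) - x_i\big) \tag{II-5}$$

where the $\lambda_i$ are Lagrange multipliers. Rearranging the terms in (II-5), the Lagrangian can be written as follows:

$$L = \sum_{j=1}^{n} \Big\{ f_j(q_j) + \sum_{i \in O(j)} \lambda_i^T t_i(q_j) - \sum_{i \in I(j)} \lambda_i^T x_i \Big\} = \sum_{j=1}^{n} \ell_j(q_j, \lambda). \tag{II-6}$$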


For fixed $\lambda$ the problem of minimizing L becomes separable, and it is equivalent to solving the subproblems

$$
\begin{aligned}
\text{Minimize}\quad & \ell_j(q_j, \lambda) \\
\text{subject to}\quad & q_j \in S_j.
\end{aligned}
\tag{II-7}
$$


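Let us now define a dual function and a dual problem. The dual function is

$$h(\lambda) = \sum_{j=1}^{n} \Big( \min_{q_j \in S_j} \ell_j(q_j, \lambda) \Big), \tag{II-8}$$

and the dual problem is

$$
\begin{aligned}
\text{Maximize}\quad & h(\lambda) \\
\text{subject to}\quad & \lambda \in D, \qquad D = \{\lambda;\ h(\lambda) \text{ exists}\}.
\end{aligned}
\tag{II-9}
$$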


The two-level optimization procedure requires the following two distinct operations:

First level: Calculate $h(\lambda)$ by solving the n subproblems (II-7).

Second level: Adjust the multipliers, $\lambda$, so as to satisfy the interconnection constraints in (II-4).

In effect this procedure solves the dual problem described by (II-9).
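The two levels can be sketched on a small convex problem for which the procedure succeeds. Everything in the sketch is a hypothetical illustration rather than an example from this work: subsystem 1 has the decision u1, the transformation x1 = u1 and the return (u1 - 3)²; subsystem 2 receives the stream x1 and has the return (x1 - 1)². The subproblems are solved in closed form, and the second level uses a fixed-step ascent on the multiplier, which is one simple choice among many adjustment rules.

```python
def solve_subproblem_1(lam):
    """First level, subsystem 1: minimize (u1 - 3)**2 + lam*u1 over u1."""
    u1 = 3.0 - lam / 2.0
    return u1, (u1 - 3.0) ** 2 + lam * u1


def solve_subproblem_2(lam):
    """First level, subsystem 2: minimize (x1 - 1)**2 - lam*x1 over x1."""
    x1 = 1.0 + lam / 2.0
    return x1, (x1 - 1.0) ** 2 - lam * x1


def two_level(lam=0.0, step=0.5, tol=1e-10, max_iter=200):
    """Alternate between the subproblems and the multiplier adjustment."""
    for _ in range(max_iter):
        u1, l1 = solve_subproblem_1(lam)
        x1, l2 = solve_subproblem_2(lam)
        residual = u1 - x1  # t_1(q_1) - x_1, the slope of h at lam
        if abs(residual) < tol:
            break
        lam += step * residual  # second level: ascent step on the dual function
    return lam, u1, x1, l1 + l2


lam_opt, u1_opt, x1_opt, dual_value = two_level()
primal_value = (u1_opt - 3.0) ** 2 + (x1_opt - 1.0) ** 2
print(lam_opt, u1_opt, x1_opt, dual_value, primal_value)
```

Because both returns are convex, the dual value at the final multiplier coincides with the primal value and the interconnection constraint is satisfied. The nonconvex case, where this agreement is lost, is the subject of the sections that follow.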








The important question concerning this procedure relates to the

existence of a saddle point for the Lagrange function of the problem.

The following theorems provide the theoretical basis and give some

answers to the saddle point existence question. For further details

and proofs of the theorems, see Lasdon (1970).

Theorem 1: Let $\lambda^0 \in E^m$. A point $(q^0, \lambda^0)$ is a constrained saddle point for $L(q, \lambda)$ if and only if 1) $q^0$ minimizes $L(q, \lambda^0)$ over $S$ and 2) $x_i - t_i(q_j) = 0$, $i \in O(j)$, $j = 1,\ldots,n$.

Theorem 2: If $(q^0, \lambda^0)$ is a saddle point for $L$, then $q^0$ solves the primal problem described by (II-4).

Theorem 3: $h(\lambda)$ is concave over any convex subset of $D$.

Theorem 4: $h(\lambda) \le F(q)$ for all $q \in S$ such that $x_i - t_i(q_j) = 0$, $i \in O(j)$, $j = 1,\ldots,n$, and for all $\lambda \in D$.


II.3. Dual Gaps

The basic drawback of the Lagrangian approach, and therefore of

the two-level optimization procedure, is its failure to find the solution

of a problem when the solution is in a "dual gap".

To provide additional insight into the relationship between the primal and the dual problems and to demonstrate the formation of gaps, we will first give a geometric interpretation of the procedure and then the theoretical justification of the failure of the two-level approach in certain cases (Lasdon, 1970; Rockafellar, 1967).

Consider the family of perturbed problems, with perturbations $z_i$:

Minimize $F(q) = \sum_{j=1}^{n} f_j(q_j)$

subject to $x_i - t_i(q_j) = z_i$, $i \in O(j)$, $j = 1,\ldots,n$

$q_j \in S_j$, $j = 1,\ldots,n$

The primal problem corresponds to $z_i = 0$, $i \in O(j)$, $j = 1,\ldots,n$. Assuming continuity of the objective function and of the constraint functions, let us define

$w(z) = \min \{ F(q) : x_i - t_i(q_j) = z_i,\ i \in O(j),\ j = 1,\ldots,n,\ q \in S \}$

where $z^T = [z_i^T$, for all $i \in O(j)$, $j = 1,\ldots,n]$.

The domain of $w$ is

$Z = \{ z : \text{there exists a } q \in S \text{ such that } x_i - t_i(q_j) = z_i,\ i \in O(j),\ j = 1,\ldots,n \}$.

Consider now the set $R \subset E^{m+1}$,

$R = \{ (z_0, z) : z_0 \ge w(z),\ z \in Z \}$.

If $w$ and $Z$ are convex then $R$ is convex. We shall call this space, containing $R$, the WZ space.

The following theorem demonstrates the connection between duality and supporting hyperplanes for the set $R$ in WZ; see Lasdon (1970).

Theorem 5: If $\bar q \in S$ and $\bar\lambda \in E^m$, then $\bar q$ minimizes $L(q, \bar\lambda)$ over $S$ if and only if $H = \{ (z_0, z) : z_0 - \bar\lambda^T z = L(\bar q, \bar\lambda) \}$ is a supporting hyperplane for the set $R$ at the point $(F(\bar q), \bar z)$, where $\bar z_i = \bar x_i - t_i(\bar q_j)$, $i \in O(j)$, $j = 1,\ldots,n$.
Supporting hyperplanes exist at every point, and therefore at the solution point, if $R$ is a convex set (Fig. 1). In the case of nonconvex $R$ sets, there are regions consisting of constraint vectors that are not generated by any vector $\lambda$ (Fig. 2). Optimum solutions for constraints inside such inaccessible regions cannot be discovered by straightforward application of the two-level optimization procedure, and must be sought by other means.

Figure 1. Geometric View of the Success of the Dual Approach
[Labeled elements: $w(z)$; supporting hyperplane at $(w(z_1), z_1)$; supporting hyperplane at $(w(0), 0)$; $h(\lambda) = w(0)$]

The following corollary (Greenberg, 1969) helps to anticipate the existence of such an inaccessible region, which will be termed a "dual gap".

Corollary: A dual gap arises if some choice of Lagrange multipliers $\lambda \in D$ produces at least two solutions to the Lagrangian problem (II-8) with distinct $z_i = x_i - t_i(q_j) \ne 0$, $i \in O(j)$, $j = 1,\ldots,n$.
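A minimal numerical sketch of the corollary follows. The one-variable problem used here (minimize $-q^2$ subject to $q = 0$ on $S = [-1, 1]$), the grids, and the function names are assumptions introduced only for illustration. The Lagrangian is taken as $F(q) - \lambda g(q)$, and the dual function is evaluated by brute force on a grid.

```python
# Toy nonconvex problem (an assumption for illustration):
#   minimize F(q) = -q**2  subject to  g(q) = q = 0,  q in S = [-1, 1].
# Primal solution: q = 0, F = 0.

GRID = [-1.0 + 2.0 * k / 400 for k in range(401)]   # discretized S
LAMS = [-2.0 + 4.0 * k / 80 for k in range(81)]     # multiplier grid

def dual_value(lam):
    """h(lam) = min over q in S of F(q) - lam * g(q)."""
    return min(-q * q - lam * q for q in GRID)

def lagrangian_minimizers(lam, tol=1e-12):
    """All grid points attaining h(lam)."""
    h = dual_value(lam)
    return [q for q in GRID if abs((-q * q - lam * q) - h) <= tol]

best_dual = max(dual_value(lam) for lam in LAMS)
primal = 0.0
print(best_dual, primal - best_dual, lagrangian_minimizers(0.0))
```

In this sketch the best dual value is attained at $\lambda = 0$, where the Lagrangian has two minimizers, $q = -1$ and $q = 1$, with distinct nonzero constraint values. The dual bound stays strictly below the primal value, which is the situation the corollary describes.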


II.4. Resolution of Dual Gaps Using the Method of Multipliers

Let us assume that the solution to problem (11-4) is in a dual

gap. According to Theorem 5, the two-level approach fails to find the

solution. The method of multipliers developed by Hestenes (1969)

will be used to cure this shortcoming of the two-level method.

Let us make the following notational changes:


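$g_{ij} = x_i - t_i(q_j)$, $i \in O(j)$, $j = 1,\ldots,n$

$g_j^T = [g_{i_1 j}^T, g_{i_2 j}^T, \ldots, g_{i_m j}^T]$; $j = 1,\ldots,n$; $i_k \in O(j)$; $k = 1,\ldots,m$

$g^T = [g_1^T, g_2^T, \ldots, g_n^T]$

$q^T = [q_1^T, q_2^T, \ldots, q_n^T]$

$h^T = [h_1^T, h_2^T, \ldots, h_n^T]$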



























Figure 2. Geometric View of the Failure of the Dual Approach
to Yield the Primal Solution
[Labeled elements: $w(0)$; gap]








Then problem (II-4) is written as follows:

Min $F = \sum_{j=1}^{n} f_j(q_j) = F(q)$

s.t. $g(q) = 0$ (II-P1)

$q \in S$

where $S = S_1 \times S_2 \times \ldots \times S_n$.


Consider now the following augmented problem:

Min $F_i^*(q, K_i) = F(q) + K_i g^T(q) g(q)$

s.t. $g(q) = 0$

$q \in S$ (II-P2)

$K_i > 0$.


Let us now examine the relationship between problems (II-P1) and (II-P2).

Let us make the following assumptions:

A-1: Objective function F(q) and constraints g(q) are of class C".

A-2: The set S is compact and nonempty.


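Theorem 6 (Hestenes, 1969): Let $q^0$ be a nonsingular solution to the problem (II-P1). Then there exist a multiplier vector $\lambda$ and a constant $K_i$ such that the point $q^0$ is an unconstrained minimum of the Lagrangian of problem (II-P2).

Proof: Since $q^0$ is the solution of the problem (II-P1), the following conditions are true:

$g(q^0) = 0$, $\partial L / \partial q = 0$, where $L = F(q) - \lambda^T g(q) - \nu^T h(q)$,

and $\Delta L = \Delta q^T \frac{\partial^2 L}{\partial q^2} \Delta q > 0$ for all permissible variations $\Delta q$,

such that

$\Delta g = \frac{\partial g}{\partial q} \Delta q = 0$ and $\nu_j \left[ \frac{\partial h_j}{\partial q} \right] \Delta q = 0$, $j = 1,\ldots,r$.

Note: The Kuhn-Tucker multipliers $\nu_j$ are always nonpositive at the solution to (II-P1). The Lagrangian of the problem (II-P2) is:

$L_i^* = L + K_i g^T(q) g(q)$.

At the point $q^0$:

$\partial L_i^* / \partial q = 0$

$\Delta L_i^* = \Delta L + 2 K_i \Delta q^T \left[ \frac{\partial g}{\partial q} \right]^T \left[ \frac{\partial g}{\partial q} \right] \Delta q$.

For permissible variations $\Delta q$ such that $\Delta g = 0$ and $\nu_j \Delta h_j = 0$, $\Delta L_i^* > 0$; for nonpermissible variations $\Delta q$, it is possible that $\Delta L < 0$, but there exists a $K_i > 0$ such that

$\Delta L_i^* > 0$

and the proof is complete.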

From Theorem 6 we conclude that there exist a vector $\lambda$ and a $K_i > 0$ such that the point $(q^0, \lambda)$ is a saddle point for the Lagrangian $L_i^*$ of the augmented problem (II-P2). Therefore the problem (II-P1) can be solved using the two-level method, if it is augmented by the penalty term $K_i g^T(q) g(q)$. Define:

$w(z) = \min_{q \in S} \{ F(q) : g(q) = z \}$

$w_i^*(z) = \min_{q \in S} \{ F_i^*(q, K_i) : g(q) = z \}$

Let us examine now how the value of $K_i$ affects $w_i^*(z)$ and under what conditions a proper choice of $K_i$ yields a supporting hyperplane for the set $R_i^*$, where

$R_i^* = \{ (z_0, z) : z_0 \ge w_i^*(z),\ z \in Z \}$,

at the solution point $(w_i^*(0), 0)$. The following theoretical treatment will demonstrate the effect of the penalty term on the resolution of dual gaps via a geometrical representation.

Lemma 1: $w(z) < w_i^*(z)$ for $z \ne 0$ and $w(0) = w_i^*(0)$. More generally, if $K_j > K_i$ then $w_j^*(z) > w_i^*(z)$ for $z \ne 0$ and $w_j^*(0) = w_i^*(0)$.

Proof: The equality for $z = 0$ is direct since the penalty term is then zero. Let $\bar q$ minimize $F(q)$ with $g(q) = z$; then $\bar q$ minimizes $F_i^*$ with $g(q) = z$, because if this is not true and $\hat q$ minimizes $F_i^*$ with $g(q) = z$, then

$F(\hat q) + K_i g^T(\hat q) g(\hat q) < F(\bar q) + K_i g^T(\bar q) g(\bar q)$

and consequently, since both penalty terms equal $K_i z^T z$,

$F(\hat q) < F(\bar q)$

which is not true since by assumption $\bar q$ minimizes $F(q)$ with $g(q) = z$.

Similarly:

$w_j^*(z) = \min_{q \in S} F_j^*(q, K_j) = \min_{q \in S} \{ F(q) + K_j z^T z \} > \min_{q \in S} \{ F(q) + K_i z^T z \} = \min_{q \in S} F_i^*(q, K_i) = w_i^*(z)$.

Therefore $w_j^*(z) > w_i^*(z)$ for $z \ne 0$. Q.E.D.

Lemma 1 implies that the curve $w_i^*(z)$ vs. $z$ (in the one-dimensional case) is moved upwards as $K_i$ increases, keeping always the same value at $z = 0$.


Lemma 2: If $w(z)$ is a continuous function with continuous first derivatives, $K_j > K_i$, and

$H(K_\ell) = \max_{\lambda} h_\ell^*(\lambda) = \max_{\lambda} \min_{q \in S} L_\ell^*(q, \lambda, K_\ell) = \max_{\lambda} \min_{q \in S} [F_\ell^* - \lambda^T g(q)]$

then

$H(K_j) \ge H(K_i)$, and $H(K_j) = H(K_i)$ only at the solution of the problem (II-P1).

Proof: Let

$H(K_j) = F_j^*(q^*_{(j)}) - \lambda^{*T}_{(j)} g(q^*_{(j)})$

$H(K_i) = F_i^*(q^*_{(i)}) - \lambda^{*T}_{(i)} g(q^*_{(i)})$

Assume that $H(K_j) < H(K_i)$; then

$F_j^*(q^*_{(j)}) - \lambda^{*T}_{(j)} g(q^*_{(j)}) < F_i^*(q^*_{(i)}) - \lambda^{*T}_{(i)} g(q^*_{(i)})$

$F_j^*(q^*_{(j)}) - \lambda^{*T}_{(j)} g(q^*_{(j)}) \ge F_j^*(q^*_{(j)}) - \lambda^{*T}_{(i)} g(q^*_{(j)})$

$F_j^*(q^*_{(j)}) - \lambda^{*T}_{(i)} g(q^*_{(j)}) < F_i^*(q^*_{(i)}) - \lambda^{*T}_{(i)} g(q^*_{(i)})$

The last inequality implies that the point $(F_j^*(q^*_{(j)}), g(q^*_{(j)}))$ lies below or on the hyperplane defined by:

$H = \{ (z_0, z) \mid z_0 - \lambda^{*T}_{(i)} z = H(K_i) \}$

Let us first examine the case where it lies below the hyperplane. Then

$(F_j^*(q^*_{(j)}), g(q^*_{(j)})) \notin R_i^*$

where

$R_i^* = \{ (z_0, z) \mid z_0 \ge w_i^*(z),\ z \in Z \}$

But,

$(F_j^*(q^*_{(j)}), g(q^*_{(j)})) \in R_j^*$

where

$R_j^* = \{ (z_0, z) \mid z_0 \ge w_j^*(z),\ z \in Z \}$

Because of Lemma 1, $R_j^* \subset R_i^*$ and $(R_i^* - R_j^*) \cap R_j^* = (w(0), 0)$. Thus

$(F_j^*(q^*_{(j)}), g(q^*_{(j)})) \in R_i^*$

which contradicts the above result. Therefore $H(K_j) \not< H(K_i)$.

In the case that the point $(F_j^*(q^*_{(j)}), g(q^*_{(j)}))$ lies on the hyperplane, because

$(R_i^* - R_j^*) \cap R_i^* = (w(0), 0)$

we conclude that $H(K_j) = H(K_i)$ only at the solution of (II-P1).

Theorem 7: If $w(z)$ is finite for all $q \in S$ and is continuous with continuous first derivatives and existing finite one-sided directional second derivatives in any direction $s$, at $z = 0$, then there exists a finite $K^*$ such that for all $K > K^*$, $H(K) = H(K^*)$ = primal solution.

Proof: Since $w(z)$ is continuous with continuous first derivatives at $z = 0$, there exists a hyperplane tangent to $w(z)$ at the point $(w(0), 0)$. This is not a supporting hyperplane for the set $R$, since there is no saddle point for the Lagrangian of the problem (II-P1) inside the dual gap. This hyperplane is described by:

$z_0 - \lambda^T z = w(0)$, with $z_0 > w(z)$ for some $z$.

Now we have to find a $K_i$ such that the above hyperplane is a supporting hyperplane for the set $R_i^*$, i.e.,

$z_0 = w(0) + \lambda^T z \le w_i^*(z)$.

We note that

$w_i^*(z) = \min_{q \in S} \{ F(q) + K_i g^T(q) g(q) \mid g(q) = z \} = w(z) + K_i z^T z$

therefore,

$w(z) + K_i z^T z - \lambda^T z \ge w(0)$

and there must be a

$K_i \ge \frac{w(0) - w(z) + \lambda^T z}{z^T z}$ for all $z \in Z = \{ z \mid \exists q \in S,\ g(q) = z \}$.

For $z \ne 0$ and with $w(z) > -\infty$ for $z \in Z$, a finite value of $K_i$ exists, say $\alpha_z$. The only potential problem might occur at $z = 0$. We can use L'Hospital's rule to find the limit as $z \to 0$ along any direction $s$, namely

$\lim_{z \to 0} \alpha_z = \lim_{\beta \to 0} \frac{w(0) - w(\beta s) + \lambda^T (\beta s)}{(\beta s)^T (\beta s)} = -\frac{1}{2} \frac{1}{s^T s} \lim_{\beta \to 0} \frac{\partial^2 w(\beta s)}{\partial \beta^2}$ = finite,

if this limit exists. If the limit does not exist, then the finiteness of the one-sided directional second derivatives in any direction $s$, at $z = 0$, yields:

$\lim_{\beta_j \to 0} \alpha_z = \alpha_j$ = finite, along the direction $s_j$.

Therefore choose $K_i$ such that

$K_i \ge \max [\alpha_z,\ \alpha_j \text{ for every direction } s_j]$ = finite.

Therefore there exists a finite $K^* = K_i$ such that the hyperplane tangent to $w(z)$ at the point $(w(0), 0)$, with a slope $\lambda$, is a supporting hyperplane for the set $R_i^*$. Therefore the dual solution $H(K^*)$ equals the primal solution.

Now, for any $K > K^*$, $H(K) \ge H(K^*)$, but it cannot be that $H(K) > H(K^*)$ since that leads to $H(K)$ > primal solution. Therefore $H(K) = H(K^*)$.

If $\partial w(z) / \partial z$ is discontinuous and $w(z)$ is continuous at $z = 0$, then $K$ has to be $+\infty$, and the dual approach fails to give a solution for the problem (II-P2), but as $K$ increases, approaching $+\infty$, it produces tighter lower bounds on the primal solution. The method can fail altogether if $w(z)$ is discontinuous at $z = 0$.
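The effect described by Theorem 7 can be observed on a small example. The sketch below is only an illustration under stated assumptions: the problem (minimize $-q^2$ subject to $q = 0$ on $S = [-1, 1]$), the grids, and the function name `H` are chosen here and do not come from the text. For this problem $w(z) = -z^2$, so the limit in the proof gives $-\frac{1}{2} w''(0) = 1$ as the threshold penalty constant.

```python
# Toy problem assumed for illustration: F(q) = -q**2, g(q) = q, S = [-1, 1].
# The penalized objective is F(q) + K * g(q)**2; the primal solution is 0.

GRID = [-1.0 + 2.0 * k / 400 for k in range(401)]   # discretized S
LAMS = [-2.0 + 4.0 * k / 80 for k in range(81)]     # multiplier grid

def H(K):
    """H(K) = max over lam of min over q of F(q) + K*g(q)**2 - lam*g(q)."""
    def h(lam):
        return min(-q * q + K * q * q - lam * q for q in GRID)
    return max(h(lam) for lam in LAMS)

for K in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(K, H(K))
```

On this example the computed dual value rises with $K$ until it reaches the primal value at $K = 1$ and remains there for larger $K$, in line with Lemma 2 and Theorem 7.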


II.5. Computational Separability

The previously described resolution of the dual gaps suffers from

a serious drawback, namely, the separability, which existed in problem

(II-P1), has been destroyed in problem (II-P2) because of the cross-

product terms in the penalty term. Separability preserves the struc-

tural characteristics of a system and induces properties which are

always desired in solving a large-scale problem such as many of those

in engineering design.

In the following paragraphs an algorithm is proposed which resolves

the dual gap by preserving a computational separability of the system.

Using the initial notation, consider the penalty term

$K_m (g(q) - z)^T (g(q) - z) = K_m \sum_{j=1}^{n} \sum_{i \in O(j)} [t_i(q_j) - x_i - z_i]^T [t_i(q_j) - x_i - z_i]$

$= K_m \sum_{j=1}^{n} \sum_{i \in O(j)} \sum_{r=1}^{s} [t_{ir}^2 + x_{ir}^2 + z_{ir}^2 - 2 t_{ir} z_{ir} - 2 t_{ir} x_{ir} + 2 x_{ir} z_{ir}]$

where $z$ is the deviation from satisfying the constraints. Under appropriate assumptions it has been seen that for large enough $K_m$, the solution of the problem can occur at $z_i = 0$ for all $i \in O(j)$ and $j = 1,\ldots,n$.

Each term of the above triple summation consists of separable terms $t_{ir}^2(q_j)$, $x_{ir}^2$, $-2 t_{ir}(q_j) z_{ir}$, and $+2 x_{ir} z_{ir}$, and a crossproduct $-2 t_{ir}(q_j) x_{ir}$ which is not separable. Expand the crossproduct in a Taylor Series and consider the following linear approximation around the point $\bar x_{ir}$, $\bar t_{ir} = t_{ir}(\bar q_j)$:

$t_{ir}(q_j) x_{ir} \approx \bar t_{ir} x_{ir} + t_{ir}(q_j) \bar x_{ir} - \bar t_{ir} \bar x_{ir}$
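A short numerical check of this linearization is given below as a sketch; the function name and the sample values are assumptions for illustration. The error of the first-order expansion of a product is exactly $(t - \bar t)(x - \bar x)$, so it is of second order in the distance from the expansion point, and each retained term depends on only one of $t$ and $x$.

```python
def linearized_product(t, x, t_bar, x_bar):
    """First-order Taylor expansion of t*x around (t_bar, x_bar)."""
    return t_bar * x + t * x_bar - t_bar * x_bar

t_bar, x_bar = 1.5, -0.8
for d in (1e-1, 1e-2, 1e-3):
    t, x = t_bar + d, x_bar + d
    err = t * x - linearized_product(t, x, t_bar, x_bar)
    # err equals (t - t_bar) * (x - x_bar) up to rounding
    print(d, err)
```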


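Then the Lagrangian of the problem (II-P2) with constraints $g(q) = z$ (instead of $g(q) = 0$) and Lagrange multipliers $\mu$ takes the following approximate separable form:

$L_m^* \approx \sum_{j=1}^{n} \{ f_j(q_j) + K_m \sum_{i \in O(j)} \sum_{r=1}^{s} [t_{ir}^2 + z_{ir}^2 - 2 t_{ir} z_{ir} + 2 \bar t_{ir} \bar x_{ir} - 2 t_{ir} \bar x_{ir}]$

$+ K_m \sum_{i \in I(j)} \sum_{r=1}^{s} [x_{ir}^2 + 2 x_{ir} z_{ir} - 2 \bar t_{ir} x_{ir}] + \sum_{i \in O(j)} \mu_i^T (t_i(q_j) - z_i) - \sum_{i \in I(j)} \mu_i^T x_i \}$

$= \sum_{j=1}^{n} \ell_{mj}^*(q_j, \mu)$

Thus the problem of minimizing this form for $L_m^*$ for fixed $K_m$ and $\mu$ is equivalent to solving the subproblems:

$\min_{q_j} \ell_{mj}^*(q_j, \mu)$, $j = 1,\ldots,n$

s.t. $q_j \in S_j$.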



II.6. The Algorithm

Using the multiplier method (Section II.4) along with the above-mentioned linear approximation for the crossproduct terms, the following algorithmic procedure is developed which resolves the dual gap by preserving a computational separability of a large-scale system. In Section II.7 a minimum principle is developed which completes the theoretical foundation on which the algorithm is based.


Step 1: Assume a value for the penalty constant $K_m$.

Step 2: Assume values for Lagrange multipliers $\lambda$ (we shall relate these to $\mu$ in step 4).

Step 3: Assume a point $(\bar x_i, \bar t_i(\bar q_j))$ for each $i \in O(j)$ and $j = 1,2,\ldots,n$.

Step 4: Put $z_i = \bar t_i(\bar q_j) - \bar x_i$ and $\mu_i = 2 K_m z_i + \lambda_i$. Form the subproblems based on the linear approximation for the crossproduct term.

Step 5: Find $q_j$ for $j = 1,2,\ldots,n$ which solves

$\min_{q_j \in S_j} \ell_{mj}^*$

Step 6: Update $\bar x_i$, $\bar t_i(\bar q_j)$ and iterate from step 4 until $z_i = t_i(q_j) - x_i$.

Step 7: Update the Lagrange multipliers $\lambda$ and go to step 3 until the $\max_{\lambda} \min_{q \in S} L$ has been attained. Check the constraints. If satisfied stop, otherwise update $K_m$ and go to step 2.
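The steps above can be sketched in Python on a two-unit example. Everything specific in this sketch is an assumption made for illustration and is not taken from the text: the unit functions, the bounds, the fixed penalty constant `K = 10` (no update of $K_m$ is attempted), the closed-form subproblem solutions, and the multiplier update $\lambda \leftarrow \lambda + 2 K_m (t - x)$, which is the plain Hestenes rule with $\beta = K_m$. Unit 1 is nonconvex, so the unpenalized two-level method would always return a bound of its feasible set.

```python
# Toy two-unit problem (an assumption for illustration, not from the text):
#   unit 1: f1(u) = -u**2 (nonconvex), output t(u) = u, u in [-3, 3]
#   unit 2: f2(x) = 2*(x - 1)**2,       input x,      x in [-3, 3]
#   interconnection: t(u) - x = 0; the solution is u = x = 2, lam = 4.

def clip(v, lo=-3.0, hi=3.0):
    return max(lo, min(hi, v))

def solve(K=10.0, lam=0.0, outer=40, inner=5000, tol=1e-12):
    t_bar, x_bar = 0.0, 0.0                       # Step 3: starting point
    for _ in range(outer):                        # Step 7: loop over lam
        for _ in range(inner):                    # Steps 4 to 6
            z = t_bar - x_bar                     # Step 4
            mu = 2.0 * K * z + lam
            # Step 5, unit 1: min (K-1)u^2 + (mu - 2K z - 2K x_bar) u
            u = clip((2.0 * K * (z + x_bar) - mu) / (2.0 * (K - 1.0)))
            # Step 5, unit 2: min 2(x-1)^2 + K x^2 + (2K z - 2K t_bar - mu) x
            x = clip((4.0 + 2.0 * K * (t_bar - z) + mu) / (4.0 + 2.0 * K))
            done = abs(u - t_bar) + abs(x - x_bar) < tol
            t_bar, x_bar = u, x                   # Step 6
            if done:
                break
        lam += 2.0 * K * (t_bar - x_bar)          # Step 7 with beta = K
    return t_bar, x_bar, lam

u, x, lam = solve()
print(u, x, lam)
```

In this sketch each unit solves a one-dimensional convex quadratic, so the clipped stationary point is its exact minimizer. The choice of `K` matters: for this example the inner iteration only converges for `K > 2`, and the simple multiplier update only contracts for `K > 4`.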


The algorithm described above requires that a sequence of Max-Min

problems be solved, each possibly requiring a large number of

iterations. Consequently the total number of iterations for

convergence may become excessively large for practical applications.

In order to accelerate convergence, we have adopted the

modifications on the updating rules proposed by Miele et al. (1972) for

the method of the multipliers. The adopted modifications are the

following:

(i) Shorten the length of a cycle of computations. A cycle of computations is defined to be the sequence of iterations in which the multiplier $\lambda$ and the penalty constant $K_m$ are held unchanged, while the vector $q_j$ for the subsystem $j$ is viewed as unconstrained. The number of iterations permitted in each cycle depends on the unconstrained optimization technique which is employed: $\Delta N$ = number of iterations = 1 for the ordinary gradient algorithm and the modified quasi-linearization algorithm (Miele et al., 1972), and $\Delta N$ = dimension of vector $q_j$ for the conjugate-gradient algorithm.

(ii) Improve the estimate of the Lagrange multipliers $\lambda$, using the formula

$\lambda^{(i+1)} = \lambda^{(i)} + 2 \beta g(q)$

where $\beta$ is a scalar parameter determined so as to produce some optimum effect, namely, to minimize the error in the necessary conditions for optimality.







(iii) Select the penalty constant $K_m$ in an appropriate fashion. The method used to select $K_m$ depends largely on the unconstrained optimization method which is employed. If the ordinary gradient method is employed, the penalty constant is given by the formula

$K_m = 2 P(q) / [P_q^T(q) P_q(q)]$,

where $P(q) = g^T(q) g(q)$ and $P_q(q) = \partial P / \partial q$. If the conjugate-gradient or the modified quasi-linearization algorithm is used, the penalty constant is updated by the formula


K(i+ ) = min (KoK(i)) if P(q) <_ Q(q,A)
m 0 m


K(i+1) = max (K ,rKi) if P(q) > Q(q,X)
M 0 5

T
where: q(q,X) = F (q,X) F (q,X) F (q,A) A 1
q q q q 3q

and
and K /q pg /Pq(q) Pq(q)

K = ) ^ /PT
[ q T q q q


See Miele et al. (1972) for further information on these rules.

II.7. A Discrete Minimum Principle

In this section we shall give a theoretical justification for the algorithm given in Section II.6. Consider the following two Lagrangian functions:

$L_m^*(q, K_m, \lambda) = F(q) + K_m g^T(q) g(q) - \lambda^T g(q)$ (II-10)

and

$\mathcal{L}_m(q, K_m, \mu, z) = F(q) + K_m (g(q) - z)^T (g(q) - z) - \mu^T (g(q) - z)$ (II-11)

Geometrically (II-11) is the Lagrange function resulting from moving the origin to $z$ in the WZ space and adding the penalty term for deviation from that new origin. Theorem 8 indicates a relationship which exists between them.


Theorem 8: If $\bar q$ solves the problem

$h_m^*(\bar\lambda) = \min_{q \in S} L_m^*(q, K_m, \bar\lambda)$ (II-P3)

resulting in

$g(\bar q) = \bar z$ (II-12)

then $\bar q$ also solves, for $K_m$ and $\bar z$ fixed at these same values, the problem

$\min_{q \in S} \{ \mathcal{L}_m(q, K_m, \bar\mu, \bar z) \mid g(q) = \bar z \}$ (II-P4)

Conversely, if one can find a $\bar z$ which permits (II-P4) to have a solution, say $\bar q$, then $\bar q$ solves a problem of form (II-P3).


Proof: Suppose that $\bar q$ solves (II-P3) and results in $g(\bar q) = \bar z$. Then by Everett's main theorem (Everett, 1963), $\bar q$ solves the problem

$\min_{q \in S} \{ F(q) + K_m g^T(q) g(q) \mid g(q) = \bar z \}$ (II-13)

which is equivalent to solving the problem

$\min_{q \in S} \{ F(q) \mid g(q) = \bar z \}$ (II-14)

since the term $K_m g^T(q) g(q) = K_m \bar z^T \bar z$ is constant for $\bar z$ fixed. Problem (II-P4) is equivalent to problem (II-14); thus $\bar q$ solves problem (II-P4).

If $\bar z$ is fixed and permits (II-P4) to have a solution, say $\bar q$, then one has solved the problem

$\min_{q \in S} \{ F(q) + K_m g^T(q) g(q) - (2 K_m \bar z + \bar\mu)^T g(q) + K_m \bar z^T \bar z + \bar\mu^T \bar z \}$

which is equivalent to solving the problem

$\min_{q \in S} \{ F(q) + K_m g^T(q) g(q) - \bar\lambda^T g(q) \}$ (II-15)

where

$\bar\lambda = 2 K_m \bar z + \bar\mu$ (II-16)

Problem (II-15) is a problem of form (II-P3), and consequently $\bar q$ solves a problem of form (II-P3).

Theorem 8 says problems (II-P3) and (II-P4) are equivalent, in that

they give rise to the same (z,q) values. The following corollary

follows directly from this observation.


Corollary 8.1: If and only if $\bar z$ permits (II-P4) to have a solution, then the set $R_m^*$ corresponding to problem (II-P3) has a supporting hyperplane at the point $(w_m^*(\bar z), \bar z)$ with a "slope" given by (II-16).

The following corollary also follows directly.








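Corollary 8.2: If $\bar z$ permits (II-P4) to have a solution, say $\bar q$, then

$h_m^*(\bar\lambda) = \mathcal{L}_m(\bar q, K_m, \bar\mu, \bar z) + K_m \bar z^T \bar z - \bar\lambda^T \bar z$ (II-17)

where $\bar\lambda$ is given by (II-16).

Proof: If $\bar z$ permits (II-P4) to have a solution, $\bar q$, then

$g(\bar q) = \bar z$ (II-18)

and

$\mathcal{L}_m(\bar q, K_m, \bar\mu, \bar z) = F(\bar q)$ (II-19)

Equation (II-17) follows immediately from (II-10), (II-18) and (II-19). Q.E.D.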

Note that corollary 8.2 gives us a formula to calculate a dual bound,

h(A), to w(O) if we have in fact solved (II-P4) rather than (II-P3).

The result following exposes some very useful properties of problem

(II-P4) for the type of engineering design problems being considered

in this paper. It also constitutes the theoretical reasoning of the algorithm presented in Section II.6.

Result: If the point $\bar q^T = [\bar q_1^T, \ldots, \bar q_n^T]$ and the multipliers $\bar\mu$ solve problem (II-P4), then each subproblem $j$, $j = 1,\ldots,n$, resulting after the Taylor Series linear approximation of the crossproduct terms is minimized with respect to the corresponding $q_j$ at the point $\bar q_j$.


Proof: Consider a system of two stages and the following family of problems:

Min $F = f_1(x_1, u_1) + f_2(x_2, u_2)$ (II-P1')

subject to: $x_2 - t_1(x_1, u_1) = z$

$h_1(x_1, u_1) \le 0$

$h_2(x_2, u_2) \le 0$

where $z \in Z' = \{ z \mid x_2 - t_1(x_1, u_1) = z,\ h_1(x_1, u_1) \le 0,\ \text{and } h_2(x_2, u_2) \le 0 \}$.

The Lagrangian function for the augmented problem is:

$\mathcal{L}_m = f_1(x_1, u_1) + f_2(x_2, u_2) + K_m [x_2 - t_1(x_1, u_1) - z]^T [x_2 - t_1(x_1, u_1) - z]$

$- \mu^T [x_2 - t_1(x_1, u_1) - z] - \nu_1^T h_1(x_1, u_1) - \nu_2^T h_2(x_2, u_2)$

$= L + K_m [x_2 - t_1(x_1, u_1) - z]^T [x_2 - t_1(x_1, u_1) - z]$,

where $\nu_1$ and $\nu_2$ are Kuhn-Tucker multipliers.


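Suppose that $\mathcal{L}_m$ is minimized at the point $(\bar x_1, \bar u_1, \bar x_2, \bar u_2)$. At this point the Hessian matrix of the second derivatives of $\mathcal{L}_m$ must be positive definite, i.e., with the variables ordered as $(x_1, u_1, x_2, u_2)$,

$A = \begin{pmatrix} A_{11} & A_{12} & -2K_m \frac{\partial t_1^T}{\partial x_1} & 0 \\ A_{12}^T & A_{22} & -2K_m \frac{\partial t_1^T}{\partial u_1} & 0 \\ -2K_m \frac{\partial t_1}{\partial x_1^T} & -2K_m \frac{\partial t_1}{\partial u_1^T} & \frac{\partial^2 L}{\partial x_2^2} + 2K_m I & \frac{\partial^2 L}{\partial x_2 \partial u_2^T} \\ 0 & 0 & \frac{\partial^2 L}{\partial u_2 \partial x_2^T} & \frac{\partial^2 L}{\partial u_2^2} \end{pmatrix}$

with

$A_{11} = \frac{\partial^2 L}{\partial x_1^2} + 2K_m \frac{\partial t_1^T}{\partial x_1} \frac{\partial t_1}{\partial x_1^T} - 2K_m \frac{\partial^2 t_1}{\partial x_1^2} [x_2 - t_1 - z]$

$A_{12} = \frac{\partial^2 L}{\partial x_1 \partial u_1^T} + 2K_m \frac{\partial t_1^T}{\partial x_1} \frac{\partial t_1}{\partial u_1^T} - 2K_m \frac{\partial^2 t_1}{\partial x_1 \partial u_1^T} [x_2 - t_1 - z]$

$A_{22} = \frac{\partial^2 L}{\partial u_1^2} + 2K_m \frac{\partial t_1^T}{\partial u_1} \frac{\partial t_1}{\partial u_1^T} - 2K_m \frac{\partial^2 t_1}{\partial u_1^2} [x_2 - t_1 - z]$

must be positive definite.

Note that $A$ contains terms which are not separable, each of which contains the term $x_2 - t_1 - z$. Since $\bar\mu$ and $(\bar x_1, \bar u_1, \bar x_2, \bar u_2)$ minimize $\mathcal{L}_m$ with $\bar x_2 - t_1(\bar x_1, \bar u_1) - z = 0$, the matrix $A$ reduces to the same form with

$A_{11} = \frac{\partial^2 L}{\partial x_1^2} + 2K_m \frac{\partial t_1^T}{\partial x_1} \frac{\partial t_1}{\partial x_1^T}$, $A_{12} = \frac{\partial^2 L}{\partial x_1 \partial u_1^T} + 2K_m \frac{\partial t_1^T}{\partial x_1} \frac{\partial t_1}{\partial u_1^T}$, $A_{22} = \frac{\partial^2 L}{\partial u_1^2} + 2K_m \frac{\partial t_1^T}{\partial u_1} \frac{\partial t_1}{\partial u_1^T}$.

Since $A$ must be positive definite the principal minor matrices of $A$ must be positive definite, i.e.

$A_1 = A_{x_1, u_1} = \begin{pmatrix} \frac{\partial^2 L}{\partial x_1^2} + 2K_m \frac{\partial t_1^T}{\partial x_1} \frac{\partial t_1}{\partial x_1^T} & \frac{\partial^2 L}{\partial x_1 \partial u_1^T} + 2K_m \frac{\partial t_1^T}{\partial x_1} \frac{\partial t_1}{\partial u_1^T} \\ \frac{\partial^2 L}{\partial u_1 \partial x_1^T} + 2K_m \frac{\partial t_1^T}{\partial u_1} \frac{\partial t_1}{\partial x_1^T} & \frac{\partial^2 L}{\partial u_1^2} + 2K_m \frac{\partial t_1^T}{\partial u_1} \frac{\partial t_1}{\partial u_1^T} \end{pmatrix}$ = positive definite

$A_2 = A_{x_2, u_2} = \begin{pmatrix} \frac{\partial^2 L}{\partial x_2^2} + 2K_m I & \frac{\partial^2 L}{\partial x_2 \partial u_2^T} \\ \frac{\partial^2 L}{\partial u_2 \partial x_2^T} & \frac{\partial^2 L}{\partial u_2^2} \end{pmatrix}$ = positive definite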


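Consider now the sub-Lagrangians $\ell_{m1}^*$ and $\ell_{m2}^*$. After the linear approximation of the crossproduct terms $-2K_m x_2^T t_1(x_1, u_1)$, they take the following form:

$\ell_{m1}^* = f_1(x_1, u_1) + \mu^T t_1(x_1, u_1) - \nu_1^T h_1(x_1, u_1) + K_m t_1^T(x_1, u_1) t_1(x_1, u_1) - 2K_m \bar x_2^T t_1(x_1, u_1) + 2K_m z^T t_1(x_1, u_1)$

$\ell_{m2}^* = f_2(x_2, u_2) - \mu^T x_2 - \nu_2^T h_2(x_2, u_2) + K_m x_2^T x_2 - 2K_m t_1^T(\bar x_1, \bar u_1) x_2 - 2K_m z^T x_2$

where $(\bar x_1, \bar u_1, \bar x_2)$ is the point around which the Taylor Series expansion takes place. At the point $(\bar x_1, \bar u_1, \bar x_2, \bar u_2)$ the following conditions are satisfied:

$\frac{\partial \ell_{m1}^*}{\partial x_1} = \frac{\partial \mathcal{L}_m}{\partial x_1} = 0$, $\frac{\partial \ell_{m1}^*}{\partial u_1} = \frac{\partial \mathcal{L}_m}{\partial u_1} = 0$,

$\frac{\partial \ell_{m2}^*}{\partial x_2} = \frac{\partial \mathcal{L}_m}{\partial x_2} = 0$ and $\frac{\partial \ell_{m2}^*}{\partial u_2} = \frac{\partial \mathcal{L}_m}{\partial u_2} = 0$.

Also the Hessian matrices of $\ell_{m1}^*$ and $\ell_{m2}^*$, which are precisely $A_1$ and $A_2$ respectively, are positive definite. Thus we conclude that $\ell_{m1}^*$ and $\ell_{m2}^*$ are minimized at the point $(\bar x_1, \bar u_1, \bar x_2, \bar u_2)$ which minimizes $\mathcal{L}_m$. By induction the same result is found to apply to any number of stages.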

Since the primal problem (II-P1) corresponds to that member of the class of problems (II-P1') with $z = 0$, the above mentioned result applies to (II-P1) for a proper $K_m$ and $\mu$.


II.8. Discussion

We should note the following two observations which constitute the essence of the proposed algorithm: 1) The use of a Taylor Series expansion really provided two results. The first is, of course, that the problem becomes separable. The second is evident only when one realizes that other devices could effect the separability feature; for example, one could rewrite the crossproduct term $-2 x_{ir} t_{ir}$ as $-2 x_{ir} (x_{ir} - z_{ir})$, which is also separable for fixed $z_{ir}$. Unfortunately, in this form, in contrast to the proposed one, the subproblem for the unit containing $x_{ir}$ as a variable becomes a concave minimization in $x_{ir}$ as $K_m$ increases. 2) For the Taylor Series expansion approach, we have found that problem (II-P4), and not (II-P3) to which it is equivalent, has the desired property that the subproblems must always be minimized if the overall problem is minimized, even if $K_m$ is not sufficiently large.

We should also note that the convex quadratic terms $t_{ir}^2$ and $x_{ir}^2$ produced by the penalty term will tend to improve and could dominate the behavior of the subproblems as $K_m$ increases, making them easier to solve.

The introduction of the quadratic terms in the objective function also offers another advantage. It desensitizes the dual function with respect to the multipliers $\lambda$. For given multipliers $\lambda$,

$h^*(\lambda) = \min_{x_i, u_i,\ i = 1,\ldots,n} L^*$

is closer to the primal solution than the

$h(\lambda) = \min_{x_i, u_i,\ i = 1,\ldots,n} L$

(see Figure 3). This is a characteristic which can be of importance for a dual bounding procedure, such as the one used in structural sensitivity analysis (McGalliard and Westerberg, 1972).













































Figure 3. Improvement of the Dual Bound $h^*(\lambda)$ for the
Penalized Problem Over the Dual Bound $h(\lambda)$ for
the Unpenalized Problem for the Same Multipliers $\lambda$













CHAPTER III

A STRONG VERSION OF
THE DISCRETE MINIMUM PRINCIPLE


The discrete form of Pontryagin's Minimum Principle proposed by

a number of authors has been shown by others in the past to be

fallacious; only a weak result can be obtained. Due to the mathematical

character of the objective function and the stage transformation

equations, only a small class of chemical engineering problems have

been solved by the strong discrete minimum principle. This chapter

presents a method to overcome the previous shortcomings of the strong

principle. An algorithmic procedure is developed which uses this new

version. Numerical examples are provided to clarify the approach

and demonstrate its usefulness.


III.1. Review of Previous Works

Pontryagin's minimum principle (Pontryagin et al., 1962) is a

well-known method to solve a wide class of extremal problems associated

with given initial conditions. A discrete analog of the minimum

principle, where the differential equations are replaced by difference

equations, is not valid in general but only in certain almost

trivial cases. Rozonoer (1959) first pointed out this fact. Katz (1962)

and Fan and Wang (1964) later on developed a discrete minimum principle

which was shown to be fallacious by Horn and Jackson (1965a), by means

of simple counterexamples. As was pointed out by Horn and Jackson








(1965a, b) and lucidly presented by Denn (1969), the failure of a

strong minimum principle lies in the fact that we cannot deduce the

nature of the stationary values of the Hamiltonian from a consideration

of first-order variations only. Inclusion of the second-order terms

does not help to draw a general conclusion about the nature of the

stationary points in advance. A weak minimum principle which relates

the solution of the problem to a stationary point of the Hamiltonian

exists and is valid (Horn, 1961 ;Jackson, 1964).

In the case of control systems described by differential equations,

time, by its evolution on a continuum, has a "convexifying" effect

(Halkin, 1966) which does not make necessary the addition of some

convexity assumptions to the specification of the problem. Thus a

strong minimum principle can be applied for these problems, requiring

the minimization of the Hamiltonian even in the case that a continuous

problem is solved by discretizing it with respect to the time and

using a strong discrete minimum principle. For discrete, staged systems

described by difference equations, the evolution of the system does

not have any "convexifying" effect and, in order to obtain a minimum

principle, we must add some convexity assumptions to the problem

specification or reformulate the problem in an equivalent form which

possesses inherently the convexity assumptions. This present work

belongs to the second class.

In the present chapter we propose to show a strong version of

the minimum principle which relates the solution of the problem to a

minimum point, rather than a stationary point, of the Hamiltonian.

This is attained through the use of the Hestenes' method of multipliers,

a technique used effectively in Chapter II to resolve the dual gaps








of the two-level optimization method. This method turns a stationary

point of the Hamiltonian into a minimum point, thus minimum seeking

algorithms can be used.


III.2. Statement of the Problem, and Its Lagrangian Formulation

As a basis for the description of the minimum principle, the two-

level optimization approach, their success and failure, and the

development of a strong minimum principle, consider the following

sequential unconstrained problem. (Constraints and recycles do not

change the following results (Westerberg, 1973), and we want to keep

the presentation here as simple as possible.)

Min $F = \sum_{i=1}^{N} \phi_i(x_i, u_i)$ (III-P1)

subject to

$x_{i+1} = f_i(x_i, u_i)$

$x_1 = x^0$ (given)

For every i=2,...,N the vector valued function $f_i(x_i, u_i)$ is given and satisfies the following conditions:

a. the function $f_i$ is defined for all $x_i$ and $u_i$,

b. for every ui the function fi(xi,ui) is twice continuously dif-

ferentiable with respect to xi,

c. the fi(xi,ui) and all its first and second partial derivatives

are uniformly bounded.

These conditions correspond to the usual "smoothness" assumptions.









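The Lagrangian function for this problem (III-P1) is given by:

$L = \sum_{i=1}^{N} \phi_i(x_i, u_i) + \sum_{i=1}^{N} \lambda_{i+1}^T [f_i(x_i, u_i) - x_{i+1}] + \lambda_1^T (x^0 - x_1)$

$= \sum_{i=1}^{N} \{ \phi_i(x_i, u_i) - \lambda_i^T x_i + \lambda_{i+1}^T f_i(x_i, u_i) \} + \lambda_1^T x^0 - \lambda_{N+1}^T x_{N+1}$ (III-1)

The solution to the problem (III-P1) is a stationary point of the Lagrangian function $L$. The necessary conditions for a stationary point of the Lagrangian are:

$\frac{\partial L}{\partial x_i} = \frac{\partial \phi_i}{\partial x_i} - \lambda_i + \frac{\partial f_i^T}{\partial x_i} \lambda_{i+1} = 0$, $i = 1,\ldots,N$ (III-2)

$\frac{\partial L}{\partial u_i} = \frac{\partial \phi_i}{\partial u_i} + \frac{\partial f_i^T}{\partial u_i} \lambda_{i+1} = 0$, $i = 1,\ldots,N$ (III-3)

$x_{i+1} - f_i(x_i, u_i) = 0$, $i = 1,\ldots,N$ (III-4)

From equation (III-2) we have the defining equations for the multipliers,

$\lambda_i = \frac{\partial \phi_i}{\partial x_i} + \frac{\partial f_i^T}{\partial x_i} \lambda_{i+1}$ (III-5)

with the natural boundary condition,

$\lambda_{N+1} = 0$. (III-6)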









Equation (III-4) simply necessitates the satisfaction of the connection

constraints.

The Lagrangian approach constitutes a unifying and general pre-

sentation of the necessary conditions which must be satisfied at the

solution of a problem. For the solution of the necessary conditions

different strategies have been developed. In a tutorial presentation

(Westerberg, 1973) the relationship of the different strategies to

solve problem (III-P1), with the Lagrangian approach, is established

and it is shown that methods such as sensitivity analysis, discrete

minimum principle, and the two-level optimization method are simply

different techniques to solve the same necessary conditions, eqs.

(III-1), (III-2) and (III-3).

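Let us define the stage Hamiltonian $H_i$ as follows:

$H_i = \phi_i(x_i, u_i) + \lambda_{i+1}^T f_i(x_i, u_i)$, $i = 1,\ldots,N$ (III-7)

and the overall Hamiltonian by:

$H = \sum_{i=1}^{N} H_i + \lambda_1^T x^0$

Then,

$L = \sum_{i=1}^{N} H_i - \sum_{i=1}^{N} \lambda_i^T x_i + \lambda_1^T x^0$

and the necessary conditions (III-2), (III-3) yield:

$\frac{\partial L}{\partial x_i} = \frac{\partial H_i}{\partial x_i} - \lambda_i = 0$, $i = 1,\ldots,N$ (III-8)

$\frac{\partial L}{\partial u_i} = \frac{\partial H_i}{\partial u_i} = 0$, $i = 1,\ldots,N$ (III-9)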








Thus, in order to solve problem (III-P1) by the discrete minimum principle, we require that each stage Hamiltonian be at a stationary point.
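The stationarity condition can be made concrete with a short sketch. The scalar stage functions `phi` and `f`, the starting state, and the controls below are hypothetical choices made only for illustration. The backward pass applies the multiplier recursion with $\lambda_{N+1} = 0$ and returns $\partial H_i / \partial u_i$ for each stage; along a trajectory that satisfies the stage equations this quantity equals the derivative of $F$ with respect to $u_i$, which is a standard property of such adjoint recursions and can be checked by finite differences.

```python
# Hypothetical scalar stage functions, chosen only for illustration.
def phi(x, u):   return x * x + u * u          # stage cost
def f(x, u):     return x + u - 0.1 * x * x    # stage transformation
def phi_x(x, u): return 2.0 * x
def phi_u(x, u): return 2.0 * u
def f_x(x, u):   return 1.0 - 0.2 * x
def f_u(x, u):   return 1.0

def simulate(x0, us):
    """Forward pass: x_1 = x0, x_{i+1} = f(x_i, u_i)."""
    xs = [x0]
    for u in us:
        xs.append(f(xs[-1], u))
    return xs

def total_cost(x0, us):
    xs = simulate(x0, us)
    return sum(phi(x, u) for x, u in zip(xs, us))

def hamiltonian_u_gradients(x0, us):
    """Backward pass: lam_{N+1} = 0, lam_i = phi_x + f_x * lam_{i+1};
    returns dH_i/du_i = phi_u + f_u * lam_{i+1} for each stage."""
    xs = simulate(x0, us)
    lam_next = 0.0
    grads = [0.0] * len(us)
    for i in reversed(range(len(us))):
        grads[i] = phi_u(xs[i], us[i]) + f_u(xs[i], us[i]) * lam_next
        lam_next = phi_x(xs[i], us[i]) + f_x(xs[i], us[i]) * lam_next
    return grads

x0, us = 1.0, [0.3, -0.2, 0.5]
print(hamiltonian_u_gradients(x0, us))
```

A gradient-based search on the controls that drives these values to zero reaches a stationary point of every stage Hamiltonian; whether that point is also a minimum of each $H_i$ is the question taken up in the remainder of this chapter.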

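Consider the point $(\bar x_1, \bar x_2, \ldots, \bar x_N; \bar u_1, \bar u_2, \ldots, \bar u_N)$ and the variations $\delta u_1, \delta u_2, \ldots, \delta u_N$ around the previous point with respect to the controls, such that $|\delta u_{ij}| \le \varepsilon$, where the second index $j$ denotes the $j$th element of the control vector $u_i$. $\delta x_1$ will be taken as zero. The variational equations corresponding to the connection constraints of the problem (III-P1), with up to the second order terms included, are:

$\delta x_{i+1} = \frac{\partial f_i}{\partial u_i^T} \delta u_i + \frac{\partial f_i}{\partial x_i^T} \delta x_i + \frac{1}{2} \delta u_i^T \frac{\partial^2 f_i}{\partial u_i^2} \delta u_i + \frac{1}{2} \delta x_i^T \frac{\partial^2 f_i}{\partial x_i^2} \delta x_i + \delta u_i^T \frac{\partial^2 f_i}{\partial u_i \partial x_i^T} \delta x_i + o(\varepsilon^2)$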


For the considered sequential system, the solution of the above system

with respect to the $\delta x_i$'s, $i=1,\ldots,N$, is straightforward and yields

the following general formula in terms of the variations in the controls

only:



6xT 1 x -- y(6u ) + n k xk (u )T_]2 (u )
=1 k=R+1 k au =1 k=+1 k 'Bu

f T T T2 f
S l (6ukT k sts 2 n-l ax+t
2 .=1 k,m=1 v=+1 z x u 3k s=k+1 xs ax2 tm+l
T -


u m T2 =1 k=1 v=+1 x uk =k sk+l s ax u
in IV


(III-10)








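The variation in the objective function $F$ caused by the variations in the controls is given by:

$\delta F = \sum_{i=1}^{N} \left[ \frac{\partial \phi_i}{\partial u_i^T} \delta u_i + \frac{\partial \phi_i}{\partial x_i^T} \delta x_i + \frac{1}{2} \delta u_i^T \frac{\partial^2 \phi_i}{\partial u_i^2} \delta u_i + \frac{1}{2} \delta x_i^T \frac{\partial^2 \phi_i}{\partial x_i^2} \delta x_i + \delta u_i^T \frac{\partial^2 \phi_i}{\partial u_i \partial x_i^T} \delta x_i \right] + o(\varepsilon^2)$.

Substituting into the last expression for the $\delta x_i$'s, $i = 2,\ldots,N$, with their equals from eqs. (III-10) and noting that the stage Hamiltonians $H_i$, $i = 1,\ldots,N$, are given by eq. (III-7), we find: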

N H (6, T 2 (6 2
6F = (u ) + (6u ) (u )
=1 au a u
SN T -1 -1 f 3

lN z fT f 2HZFzI9 I T fro
2 (k Su x 1 a2 9 Xum
S=l k,m=1 k s=k+l 3x t=m+l t au


1 N N af -1 af] 32H
+ (6uk)T 3x T (6u)
=k k=l uk Ls=k+l Ss xSu u" (III-11)



Unlike the continuous case (Denn, 1969; Halkin, 1966) there is no general way in which the last two terms in eq. (III-11) can be made to vanish. Therefore, the variations considered in the controls may well produce $\delta F < 0$, or $\delta F > 0$, or $\delta F = 0$. Thus it is evident from the above that a strong minimum principle, which requires that the solution to the problem (III-P1) minimizes the stage Hamiltonians $H_i$, $i = 1,\ldots,N$, is not generally available (Horn and Jackson, 1965b; Denn, 1969; Halkin, 1966). A weaker form of the discrete minimum principle can








be used and requires that the solution to the problem (III-P1) makes

the stage Hamiltonians stationary. The examples presented by Horn

and Jackson, which counter the strong minimum principle, are such

that the stage Hamiltonians do not possess a minimum stationary point

whereas the problem itself does. However, there exist special cases

where the strong minimum principle can be applied (Horn and Jackson,

1965b; Denn, 1969).

At this point a further clarification is required. The strong

discrete minimum principle fails for physically staged steady-state

systems. Halkin (1966) has shown that it always succeeds for discrete-

time systems, obtained as an approximation to the continuous time

systems, since the time increment $\Delta t$ can become arbitrarily small and

make the last two terms in eq. (III-11) vanish. The success or the

failure of the strong discrete minimum principle can also be explained

in terms of convexity (or directional convexity) or lack of it for the

sets of the reachable states of the system (Halkin, 1966; Holtzman

and Halkin, 1966). This constitutes an important characteristic and

in fact it is the basis for the development of a strong version of

the discrete minimum principle, which follows.

The two-level optimization procedure is an infeasible decomposition
strategy of a Lagrangian nature which solves the problem (III-P1).
The similarity between the Lagrangian approach and the weak discrete
minimum principle is well known, and a recent note (Schock and Luus,
1972) has pointed out the similarity between the two-level optimization
procedure and the discrete minimum principle. Further insight into the
relationship between the last two methods will be given later in this
chapter, and a stronger version of the discrete minimum principle will
be developed, based on a method to overcome the shortcomings of the
two-level optimization procedure developed in Chapter II.


III.3. Development of a Stronger Version of the Discrete
Minimum Principle

In this section we will develop a stronger version of the discrete

minimum principle using Hestenes' method of multipliers and following

a procedure similar to the one that was developed in Chapter II to

overcome the dual gaps of the two-level optimization procedure.

As a basis for the presentation we will use problem (III-P1).

Consider now the augmented problem (III-P2)


\[
\min F^* = \sum_{i=1}^{N} \phi_i(x_i,u_i)
         + K \sum_{i=1}^{N} [x_{i+1}-f_i]^T [x_{i+1}-f_i]
         + K (x_1-x^0)^T (x_1-x^0)
\]

subject to                                                   (III-P2)

\[ x_{i+1} = f_i(x_i,u_i), \qquad i=1,\ldots,N \]

\[ x_1 = x^0 \quad \text{(given)} \]

with K > 0.


The Hamiltonian for this problem is given by:

\[ H^* = F^* + \sum_{i=1}^{N} \lambda_{i+1}^T f_i(x_i,u_i) + \lambda_1^T x^0 \]

In the following theorems we will establish some important properties
of $H^*$.


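Theorem 9: If $(\bar u_1,\ldots,\bar u_N)$ is a stationary point of the Hamiltonian $H$
for the problem (III-P1), then it is also a stationary point for the
Hamiltonian $H^*$ of the augmented problem (III-P2) for any real value
of the parameter K.

Proof: From the equations defining $H$ and $H^*$ we take

\[
H^* = H + K \sum_{i=1}^{N} [x_{i+1}-f_i]^T [x_{i+1}-f_i] + K (x_1-x^0)^T (x_1-x^0)
\qquad \text{(III-12)}
\]

Differentiation with respect to the controls yields:

\[
\frac{\partial H^*}{\partial u} = \frac{\partial H}{\partial u}
  + 2K \sum_{i=1}^{N} \frac{\partial [x_{i+1}-f_i]^T}{\partial u}\,[x_{i+1}-f_i]
\]

Evaluating the derivatives at the point $(\bar u_1,\ldots,\bar u_N)$, since $\partial H/\partial u = 0$
and $x_{i+1} - f_i = 0$ for $i=1,\ldots,N$, we find

\[ \frac{\partial H^*}{\partial u} = \frac{\partial H}{\partial u} = 0 \]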


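Theorem 10: Let $\bar u$ be a local isolated minimum of F with all the
connection constraints satisfied. Then there exists a real-valued
parameter $K_0$ such that $\partial H^*/\partial u = 0$ and $\partial^2 H^*/\partial u^2 > 0$ at $\bar u$
(i.e., the matrix $\partial^2 H^*/\partial u^2$ is positive definite) for $K > K_0$.

Proof: In Theorem 9 it was shown that $\partial H^*/\partial u = \partial H/\partial u = 0$ is
independent of the constant K; therefore, it is also valid for $K > K_0$.
Let us consider next the matrix $\partial^2 H^*/\partial u^2$ at the feasible point,
where $x_{i+1} - f_i = 0$:

\[
\frac{\partial^2 H^*}{\partial u^2} = \frac{\partial^2 H}{\partial u^2}
  + 2K \sum_{i=1}^{N} \frac{\partial [x_{i+1}-f_i]^T}{\partial u}\,
                       \frac{\partial [x_{i+1}-f_i]}{\partial u^T}
\]

The solution of problem (III-P1) is a stationary point of the
Hamiltonian H. The nature of the stationary point for the discrete
case cannot be predetermined, and thus $\partial^2 H/\partial u^2 \gtrless 0$. Assuming
that we have a nonsingular problem, i.e.

\[
\frac{\partial [x_{i+1}-f_i]^T}{\partial u}\,\frac{\partial [x_{i+1}-f_i]}{\partial u^T} \ne 0,
\qquad i=1,\ldots,N,
\]

we conclude that

\[
(\delta u)^T \frac{\partial^2 H^*}{\partial u^2} (\delta u)
 = (\delta u)^T \frac{\partial^2 H}{\partial u^2} (\delta u)
 + 2K \sum_{i=1}^{N} (\delta u)^T \frac{\partial [x_{i+1}-f_i]^T}{\partial u}\,
   \frac{\partial [x_{i+1}-f_i]}{\partial u^T} (\delta u) > 0
\]

for a large enough K, so that the second term, which is always positive,
prevails over any negativity of the first term. Q.E.D.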

The second theorem implies that any stationary point of the
Hamiltonian H can be made a minimum point of the augmented
Hamiltonian $H^*$ by choosing the parameter K large enough.
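As a minimal illustration of Theorem 10, consider a single scalar control. The symbols $a$ and $j_i$ below are introduced only for this sketch and do not appear in the original: $-a$ (with $a > 0$) stands for the negative curvature $\partial^2 H/\partial u^2$ at a feasible stationary point, and $j_i$ for the derivative $\partial [x_{i+1}-f_i]/\partial u$ of the i-th connection constraint.

```latex
\[
\frac{\partial^2 H^*}{\partial u^2} = -a + 2K \sum_{i=1}^{N} j_i^2 > 0
\quad\Longleftrightarrow\quad
K > K_0 = \frac{a}{2 \sum_{i=1}^{N} j_i^2}
\]
```

Under these assumptions the threshold grows with the negative curvature of H and shrinks as the connection constraints become more sensitive to the control. If every $j_i$ vanishes (the singular case excluded in the proof), no finite K suffices.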

Theorems 9 and 10 establish the following result: for a large
enough K, a local solution to problem (III-P2), which is also a local
solution to problem (III-P1), minimizes the Hamiltonian $H^*$ of the
augmented problem (III-P2).

The above result requires that we minimize the overall problem
Hamiltonian $H^*$. The available weak minimum principle permits us to
solve the overall problem by solving one problem per stage per
iteration, since H decomposes into a sum of stage Hamiltonians.
Unfortunately $H^*$ does not decompose, as we shall now see.

Let us see now how we can simplify the above result and relate
the solution of the problem (III-P1) or (III-P2) to stage Hamiltonians.
Consider the penalty term in the objective function of the problem
(III-P2):

\[
K \sum_{i=1}^{N} [x_{i+1}-f_i]^T [x_{i+1}-f_i]
 = K \sum_{i=1}^{N} \left[ x_{i+1}^T x_{i+1} + f_i^T f_i - 2 x_{i+1}^T f_i \right]
\]

We note that each member of the summation includes separable terms,
e.g., $x_{i+1}^T x_{i+1}$ and $f_i^T f_i$, and nonseparable terms such as the
crossproduct $x_{i+1}^T f_i$. The following develops a strategy to decompose
computationally the problem of minimizing $H^*$. This approach is
motivated by the work on solving nonconvex problems using the two-level
approach.

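Expand the crossproduct term in a Taylor series and consider the
following linear approximation around the point $(\hat x_{i+1}, \hat f_i)$, where
$\hat f_i = f_i(\hat x_i, \hat u_i)$:

\[
x_{i+1}^T f_i \approx \hat x_{i+1}^T f_i + x_{i+1}^T \hat f_i - \hat x_{i+1}^T \hat f_i
\]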
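The linear approximation of the crossproduct can be checked numerically. The sketch below, with hypothetical helper names and made-up vectors, shows that the error of the linearization equals the second-order term $(x_{i+1}-\hat x_{i+1})^T (f_i - \hat f_i)$, so it vanishes at the expansion point.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def linearized_cross(x, f, x_hat, f_hat):
    """First-order expansion of x^T f around (x_hat, f_hat)."""
    return dot(x_hat, f) + dot(x, f_hat) - dot(x_hat, f_hat)


def linearization_error(x, f, x_hat, f_hat):
    """Exact crossproduct minus its linear approximation."""
    return dot(x, f) - linearized_cross(x, f, x_hat, f_hat)


# Illustrative values only (not taken from the thesis).
x_hat, f_hat = [1.0, 2.0], [0.5, -1.0]
x, f = [1.1, 1.9], [0.6, -0.8]

err = linearization_error(x, f, x_hat, f_hat)
second_order = dot([a - b for a, b in zip(x, x_hat)],
                   [a - b for a, b in zip(f, f_hat)])
```

Because the neglected term is a product of two deviations, the approximation becomes exact as the iterates settle on a feasible point, which is what the algorithm later relies on.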


Then the Hamiltonian of the augmented problem takes the following
form $\tilde H^*$ for $H^*$:








T T T T F
H = (x.,ui) + K i [x+x. f+ + 2x+ 2x+ 2
.1 i+ i+l + i i-i- i i +ll1 1+1+


+ K(x- x)T(x- xo) + N + T o
x1 1x)- 1 iIi+fi + A il1
1=1
N
No T -T A T + Kx
E [ + Kfif. + 2x l 2T + +xx]
i~ 1*1 11 i+1 i i i-i i+l 1 1




= H + Ax + Kx N+ 2Kx fN + Kxx (III-13)
i=l





H. + Kx x + Kfif + 2Kx f 2Kx f 2x f,}+ A
1 1 1 1 1 j+l i i+l i ii i+l i


Let us now establish the following result, which also constitutes
the basis of the proposed algorithm (given later in this chapter).

Theorem 11: If the point $\bar u^T = (\bar u_1^T, \bar u_2^T, \ldots, \bar u_N^T)$ minimizes the overall
Hamiltonian $H^*$, then each stage Hamiltonian $\tilde H_i$, $i=1,\ldots,N$, resulting
after the Taylor series linear approximation of the crossproduct
terms, is minimized with respect to the corresponding control variable
$u_i$ at the point $\bar u_i$.

Proof: At the solution point we require feasibility, i.e.

\[ x_{i+1} = f_i(x_i,u_i) \]

(This requirement is evident from the algorithm to be presented in
the next section.) From eq. (III-13) it is clear then that

\[
\frac{\partial H^*}{\partial u_i} = \frac{\partial \tilde H_i}{\partial u_i},
\qquad i=1,\ldots,N,
\]

and therefore a stationary point of $H^*$ with respect to $u_i$ is a
stationary point of $\tilde H_i$ with respect to $u_i$. The Hessian matrix of
the second derivatives of $H^*$ must be positive definite at the point
$\bar u$, i.e. $\partial^2 H^*/\partial u^2$ must be positive definite. Thus, all the
submatrices on the diagonal must be positive definite, i.e.,

\[
\frac{\partial^2 H^*}{\partial u_1^2},\ \frac{\partial^2 H^*}{\partial u_2^2},\ \ldots,\ \frac{\partial^2 H^*}{\partial u_N^2}
\]

must be positive definite. But

\[
\frac{\partial^2 H^*}{\partial u_i^2} = \frac{\partial^2 \tilde H_i}{\partial u_i^2},
\]

therefore all the $\partial^2 \tilde H_i/\partial u_i^2$, $i=1,\ldots,N$, are positive definite.

Theorem 11 implies that the minimization of $H^*$ with respect
to the control variables can be replaced by the problems

\[ \min_{u_i} \tilde H_i, \qquad i=1,\ldots,N. \]


III.4. The Algorithm and Computational Characteristics

The theorems of Section III.3 imply the following algorithm,
which constitutes a stronger version of the currently available discrete
minimum principle.

Step 1: Assume a value for the penalty constant K.

Step 2: Assume values $\hat u_1,\ldots,\hat u_N$ for the control variables.

Step 3: Using the values $\hat u_1,\ldots,\hat u_N$, solve the state equations

\[ x_1 = x^0, \qquad x_{i+1} = f_i(x_i,u_i), \quad i=1,\ldots,N \]

        forward and find $x_2,\ldots,x_N$. Let $\hat x_2,\ldots,\hat x_N$ be the values
        found ($\hat x_1 = x^0$, a given constant).

Step 4: Using the values $\hat u_1,\ldots,\hat u_N$ and $\hat x_1,\ldots,\hat x_N$, solve the adjoint
        equations

\[
\lambda_i = \frac{\partial \phi_i}{\partial x_i} + \frac{\partial f_i^T}{\partial x_i}\,\lambda_{i+1},
\quad i=1,\ldots,N, \qquad \lambda_{N+1} = 0
\]

        backwards. Let $\hat\lambda_1,\ldots,\hat\lambda_{N+1}$ be the values found.

Step 5: Formulate the Hamiltonian $H^*$ of the augmented problem (III-P2)
        and expand the crossproduct terms around the point
        $(\hat x_1,\ldots,\hat x_N; \hat u_1,\ldots,\hat u_N)$. Formulate the stage Hamiltonians
        $\tilde H_i$, $i=1,\ldots,N$. Minimize all $\tilde H_i$, $i=1,\ldots,N$, and find optimal
        values for the controls, say $\tilde u_i$, $i=1,\ldots,N$. If the minimization
        procedure fails to produce a minimum for at least one $\tilde H_i$,
        increase K and go to step 2.

Step 6: If $|\tilde u_{ij} - \hat u_{ij}| < \varepsilon$ for all j and for $i=1,\ldots,N$ (j denotes the
        jth component of the vector $u_i$), stop; the solution is found.
        Otherwise assume new values for the controls, putting

\[ \text{new } \hat u_i = \tilde u_i, \qquad i=1,\ldots,N, \]

        and go back to step 3.
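Steps 3 and 4 amount to one forward sweep through the state equations and one backward sweep through the adjoint equations. The sketch below illustrates both sweeps for scalar states only. The function names and the toy stage model ($f_i = 2x_i + u_i$, $\phi_i = x_i^2$) are assumptions made for illustration and are not taken from the thesis.

```python
def forward_sweep(x0, controls, f):
    """Step 3: return [x_1, ..., x_{N+1}] with x_1 = x0 and x_{i+1} = f(i, x_i, u_i)."""
    xs = [x0]
    for i, u in enumerate(controls, start=1):
        xs.append(f(i, xs[-1], u))
    return xs


def backward_sweep(xs, controls, dphi_dx, df_dx):
    """Step 4: return [lambda_1, ..., lambda_{N+1}] with lambda_{N+1} = 0 and
    lambda_i = dphi_i/dx_i + (df_i/dx_i) * lambda_{i+1} (scalar states)."""
    n = len(controls)
    lams = [0.0] * (n + 1)              # last entry is lambda_{N+1} = 0
    for i in range(n, 0, -1):           # stages N, N-1, ..., 1
        x, u = xs[i - 1], controls[i - 1]
        lams[i - 1] = dphi_dx(i, x, u) + df_dx(i, x, u) * lams[i]
    return lams


# Hypothetical two-stage system: f_i = 2 x_i + u_i, phi_i = x_i^2.
controls = [0.0, 1.0]
xs = forward_sweep(1.0, controls, lambda i, x, u: 2.0 * x + u)
lams = backward_sweep(xs, controls,
                      dphi_dx=lambda i, x, u: 2.0 * x,
                      df_dx=lambda i, x, u: 2.0)
```

For this toy system $\lambda_1$ equals the total derivative of $x_1^2 + x_2^2$ with respect to $x_1$, which is the sensitivity interpretation the adjoint variables are expected to carry.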


The effective use of the above algorithm depends largely on the value
of the constant K. Very small values of K will not produce the
necessary convexity of the stage Hamiltonians, while a very large K
will mask the objective function and make the algorithm insensitive
to the descent direction of the objective function. Miele et al. (1972)
have suggested a method for choosing a proper value for the constant
K in the context of the Hestenes method of multipliers. As will be
discussed in Section III.5, the two methods are related and this
method for choosing K should be satisfactory here.


III.5. Discussion and Conclusions

As mentioned before, the failure of the strong discrete minimum

principle was caused by the fact that the stationary points of the

stage Hamiltonian with respect to the controls are not always minima

points. The method proposed in the previous sections of this chapter

turns every stationary point of the Hamiltonian into a minimum point

for a large enough K. Thus we can find the solution of the original

problem at a local minimum point of the stage Hamiltonians. This

constitutes a stronger result than that currently available, where

we must search for a stationary point of the stage Hamiltonians.

The method used in this chapter to establish a stronger version

of the discrete minimum principle parallels in many respects the method

used in Chapter II to resolve the dual gaps in the two-level optimization
procedure. The source of the shortcomings for both the

methods (minimum principle and two-level method) is the nonconvexity

of the objective function and/or the stage transformation equations.

The two-level optimization method fails if either the objective

function or transformation equations or both are nonconvex with respect

to the control and/or the state variables.

For the resolution of the dual gaps of the two-level method, the
same penalty term was used, multiplied by a positive constant K.
It was required that K be large enough so that the stage sub-Lagrangians
become locally convex with respect to the control and state variables.
Since, in the present work, we have required that K be large
enough to make the stage Hamiltonians convex with respect to the
controls only, we conclude that the K required by the strengthened
form of the minimum principle to solve a nonconvex problem is at
most as large as the K required by the two-level optimization method
to resolve dual gaps.

Fig. 4 compares the values of K required to solve a nonconvex
problem for the three methods discussed, namely, the weak discrete
minimum principle, the strengthened form of the discrete minimum
principle developed in this work, and the two-level optimization method.
Note that $K_{TLM} \ge K_{SMP} \ge 0$, where $K_{TLM}$ is the least value of K required
for the two-level method and $K_{SMP}$ is the least value of K for the
minimum principle developed here. Thus it should be clear that,
although the methods are related, they do not have equivalent
shortcomings; it is possible to find problems which can be solved by the
weak minimum principle or even the method developed here, and not by
the two-level method. However, given a $K \ge K_{TLM}$, all three methods
succeed.














CHAPTER IV

EXAMPLES ON NONCONVEX OPTIMIZATION


IV.1. Numerical Examples


IV.1.a. Two-stage Examples

Consider the following two-stage example, Fig. 5, taken from
McGalliard's thesis (McGalliard, 1971) and described by:

\[ \min F = -t_3(x_2,u_2) + x_1^{0.6} + 2u_1 + x_2^{0.6} + 5u_2 \]

s.t.

\[ x_2 = t_2(x_1,u_1) = 3x_1 + 3u_1 \]

\[ x_3 = t_3(x_2,u_2) = 2x_2 + 2u_2 \]

\[ S_1 = \{\, x_1 \ge 0,\ u_1 \ge 0,\ x_1 \le 3,\ x_1 + 2u_1 \le 4 \,\} \]

\[ S_2 = \{\, x_2 \ge 0,\ u_2 \ge 0,\ x_2 + 2u_2 \le 4 \,\} \]

Using the regular two-level optimization procedure, a table is
generated (see Table 1), and the solution is found to be:

\[ x_1 = 0 \text{ or } 3, \qquad u_1 = 0, \qquad x_2 = 4, \qquad u_2 = 0 \]


Figure 5. Two-stage Example











TABLE 1

POINTS GENERATED FOR THE TWO-STAGE EXAMPLE

    λ         u1         h(λ)

   2.0       0.5        -18.07
   1.425     0.5        -12.1
   1.0       0.5         -9.3
   0.667     0.5, 0      -7.1
   0.5       0           -6.3
   0.215     0           -4.84
   0.0       0           -5.7
  -0.5       0           -7.7








and

\[ \max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -4.84 \]

Two solutions are produced with

\[ z_1 = x_2 - 3x_1 - 3u_1 = 4 - 3(3) - 3(0) = -5 \]

\[ z_2 = x_2 - 3x_1 - 3u_1 = 4 - 3(0) - 3(0) = 4 \]

and from the corollary in Section III.3 we conclude that a dual gap
exists.
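The dual gap in this example can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions, not the procedure used in the thesis. It assumes the feasible sets $S_1$ and $S_2$ are the polytopes listed with the problem statement, evaluates each sub-Lagrangian only at the polytope vertices (each sub-Lagrangian is concave, so its minimum over a polytope is attained at a vertex), and takes the multiplier term as $+\lambda(x_2 - 3x_1 - 3u_1)$, a sign convention chosen here because it reproduces the values tabulated in Table 1.

```python
# Vertices of the (assumed) feasible polytopes S1 and S2.
S1_VERTICES = [(0.0, 0.0), (3.0, 0.0), (3.0, 0.5), (0.0, 2.0)]
S2_VERTICES = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0)]


def sub1(lam, x1, u1):
    """Stage-1 sub-Lagrangian: x1^0.6 + 2 u1 - lam * (3 x1 + 3 u1)."""
    return x1 ** 0.6 + 2.0 * u1 - lam * (3.0 * x1 + 3.0 * u1)


def sub2(lam, x2, u2):
    """Stage-2 sub-Lagrangian: -t3 + x2^0.6 + 5 u2 + lam * x2, t3 = 2 x2 + 2 u2."""
    return -(2.0 * x2 + 2.0 * u2) + x2 ** 0.6 + 5.0 * u2 + lam * x2


def dual(lam):
    """Dual function h(lam): sum of the two vertex minima."""
    m1 = min(sub1(lam, x, u) for x, u in S1_VERTICES)
    m2 = min(sub2(lam, x, u) for x, u in S2_VERTICES)
    return m1 + m2


def primal(x1, u1, u2):
    """Objective F with the connection x2 = 3 x1 + 3 u1 enforced."""
    x2 = 3.0 * x1 + 3.0 * u1
    return x1 ** 0.6 + 2.0 * u1 - (2.0 * x2 + 2.0 * u2) + x2 ** 0.6 + 5.0 * u2


# Coarse scan of the multiplier; the grid is an arbitrary illustrative choice.
best_dual = max(dual(k / 1000.0) for k in range(-500, 2001))
primal_value = primal(4.0 / 3.0, 0.0, 0.0)
dual_gap = primal_value - best_dual
```

With these assumptions the best dual value stays near -4.84 while the feasible point $x_1 = 4/3$, $u_1 = u_2 = 0$ gives about -4.51, so the two-level method alone cannot close the gap.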

Introduce the penalty term $K(x_2 - 3x_1 - 3u_1)^2$ in the objective F
and make a linear approximation of the crossproduct terms $x_2x_1$ and
$x_2u_1$. Following the algorithm in Section III.5, we find the solution:

\[ x_1 = 4/3, \quad u_1 = 0, \quad x_2 = 4, \quad u_2 = 0 \qquad \text{with } \lambda = 0.5 \text{ and } K = 10 \]

and

\[
\min_{x_1,u_1,x_2,u_2} F = \max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -4.51
\]

Substitute unit 2, in example 1, by another described as follows:

\[ \phi_2 = -t_3(x_2,u_2) + 2x_2^{0.6} + 2u_2 \]

\[ t_3(x_2,u_2) = 2x_2 + 3u_2 \]

\[ S_2 = \{\, x_2 \ge 0,\ u_2 \ge 0,\ u_2 \le 2,\ x_2 + u_2 \le 4 \,\} \]


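The problem becomes:

\[ \min F = x_1^{0.6} + 2u_1 - 2x_2 - u_2 + 2x_2^{0.6} \]

s.t.

\[ x_2 = 3x_1 + 3u_1 \]

\[ (x_1,u_1) \in S_1, \qquad (x_2,u_2) \in S_2 \]

The two-level procedure yields the solution (see Table 2)

\[ x_1 = 0 \text{ and } 3, \qquad u_1 = 0, \qquad x_2 = 4, \qquad u_2 = 0 \]

with

\[ \max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -2.54 \]

and a dual gap is detected again.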

The proposed method yields the solution

\[ x_1 = 4/3, \quad u_1 = 0, \quad x_2 = 4, \quad u_2 = 0 \qquad \text{for } \lambda = 0.8 \text{ and } K = 10 \]

\[
\min_{x_1,u_1,x_2,u_2} F = \max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -2.07
\]

with all constraints satisfied.


IV.1.b. Three-stage example with recycle

In the two-stage example, let us insert an additional stage
with a recycle loop from this new stage to stage 1, Fig. 6.










TABLE 2

POINTS GENERATED FOR THE MODIFIED TWO-STAGE EXAMPLE

    λ         u1         Min ℓ1      Min ℓ2      h(λ)

   +∞        0.5         -∞          -2.0        -∞
   3         0.5        -28.56       -2.0       -30.56
   2         0.5        -18.06       -2.0       -20.06
   1         0.5         -7.56       -2.0        -9.56
   0.667     0.5, 0      -4.06       -2.0        -6.06
   0.5       0           -2.56       -2.0        -4.56
   0.484     0           -2.42       -2.0        -4.42
   0.3       0           -0.767      -2.37       -3.137
   0.218     0           -0.031      -2.53       -2.561
   0.215     0            0          -2.546      -2.546
   0.1       0            0          -3.00       -3.00
   -∞        0            0          -∞          -∞

































Figure 6. Three-stage Example with Recycle








The problem is:



M in F = x'6 + 2u + x26 + 5u2 -4x u + x4

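s.t.

\[ x_2 = 3x_1 + 3u_1 \]

\[ x_3 = 2x_2 + 2u_2 \]

\[ u_1 = u_3/4 \]

\[ (x_1,u_1) \in S_1, \qquad (x_2,u_2) \in S_2 \]

\[ (x_3,u_3) \in S_3 = \{\, x_3 \ge 0,\ u_3 \ge 0,\ x_3 \le 4,\ x_3 + u_3 \le 6 \,\} \]

Since the solution of this problem is in a dual gap, we apply the
proposed resolution, which yields

\[ x_1 = 0.67, \quad x_2 = 2, \quad x_3 = 4 \]

\[ u_1 = 0, \quad u_2 = 0, \quad u_3 = 0 \]

for $\lambda_1 = 1.0$, $\lambda_2 = 1.0$, $\lambda_3 = 0.5$ and K = 10. Then

\[
\min_{x_1,u_1,x_2,u_2,x_3,u_3} F
 = \max_{\lambda_1,\lambda_2,\lambda_3}\ \min_{x_1,u_1,x_2,u_2,x_3,u_3}
   [F - \lambda_1(x_2 - 3x_1 - 3u_1) - \lambda_2(x_3 - 2x_2 - 2u_2) - \lambda_3(u_1 - u_3/4)]
 = -11.96
\]

with all constraints satisfied.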








IV.1.c. Soland's example

This example is taken from Soland (1971):

\[ \min f = -12x_1 - 7x_2 + x_2^2 \]

s.t.

\[ g(x) = -2x_1^4 + 2 - x_2 = 0 \]

\[ 0 \le x_1 \le 2 \]

The solution found by Soland (which is not the global minimum) after
the generation of 12 points, which necessitated the solution of 23
subproblems, is:

\[ x_1 = 0, \quad x_2 = 2 \quad \text{and} \quad f = -10 \]

The algorithm proposed in this work generated a sequence of 8 points,
solving 8 subproblems, and found the solution:

\[ x_1 = 0.718, \quad x_2 = 1.46847 \quad \text{and} \quad f = -16.7387 \]

The sequence of generated points is shown in Table 3. The starting
point is $x_1 = 2.00$, $x_2 = 0.0$.
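The two reported points can be compared directly. The following sketch is not the algorithm of this work: it assumes the constraint is the equality $x_2 = 2 - 2x_1^4$, evaluates f at both reported points, and scans the objective along the constraint curve for $x_1$ in [0, 1] (the range over which $x_2$ stays nonnegative, an assumed bound). The grid size is an arbitrary illustrative choice.

```python
def f(x1, x2):
    """Objective of Soland's example."""
    return -12.0 * x1 - 7.0 * x2 + x2 ** 2


def x2_on_constraint(x1):
    """x2 that satisfies the (assumed) equality -2 x1^4 + 2 - x2 = 0."""
    return 2.0 - 2.0 * x1 ** 4


soland_point = (0.0, 2.0)          # point reported for Soland's method
thesis_point = (0.718, 1.46847)    # point reported for the proposed algorithm

# Brute-force scan along the constraint curve, x1 in [0, 1].
n = 100000
best_f, best_x1 = min((f(k / n, x2_on_constraint(k / n)), k / n)
                      for k in range(n + 1))
```

Under these assumptions both reported points lie on the constraint curve, and the scan locates its minimum close to the second point, in line with the statement that the first solution is not the global minimum.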


IV.1.d. Jackson and Horn's counterexample

Consider the following two-stage example, Fig. 7, taken from

Horn and Jackson (1965a). This example is the counterexample which

demonstrated the fallacy of the strong minimum principle and is









TABLE 3

POINTS GENERATED FOR SOLAND'S EXAMPLE

    λ         x1         x2          h(λ)

  -60.0      2.00       0.025      -919.04
  -58.9      1.7365     0.0393     -713.07
  -25.5      1.6513     0.06146    -182.64
  -18.9      1.5511     0.07873    -108.35
  -12.5      1.4318     0.1719      -78.52
   -6.5      1.2799     0.3435      -27.25
   -1.0      1.06       0.6326      -22.16
   +3.0      0.718      1.46847     -16.7389

















Figure 7. Jackson and Horn's Counterexample






described by:

\[ \min_{\theta^1,\theta^2} x_1^2 \]

subject to

\[ x_1^1 = x_1^0 - 2\theta^1 - \tfrac{1}{2}(\theta^1)^2 \]

\[ x_2^1 = x_2^0 + \theta^1 \]

\[ x_1^2 = x_1^1 + (x_2^1)^2 + (\theta^2)^2 \]

\[ x_2^2 = \text{arbitrary} \]

\[ x_1^0 = x_2^0 = 1 \]

where superscripts on x and $\theta$ denote the stage and subscripts the state
component. The solution to this problem is $\theta^1 = \theta^2 = 0$, which gives the
smallest value of $x_1^2$. It does not, however, minimize the Hamiltonian for
stage 1, since this Hamiltonian has only one stationary point, which
is a maximum. Let us now apply the procedure of the strong discrete
minimum principle developed in the previous sections. Thus we have:


H K[xo 201 (12 2 + K[x0 + 2 2K(x)[x 291 (1)2]
o o 1 1 -1 o 1 1 1O


-2K(x2)[xo + 1] + K(xl)[xo 21 (61)2] + K( )[x2 ]


+A 1 1 1 11 2 2o 1
iXl 2)1 (1 + X2x2 + XA2




68




H* x + (x2)2 + (62)2 + K(x (x2 2K(x 2 1 )2-


-2K(x )[x + 1] + K(Xl)[x 2 1 (1 ] + K(x )[x2 + ]


1 1 1 1
1 1 x 2x2

with

\[ \lambda_1^1 = 1 + 2K x_1^1 - 2K\left[x_1^0 - 2\theta^1 - \tfrac{1}{2}(\theta^1)^2\right] \]

\[ \lambda_2^1 = 2 x_2^1 + 2K x_2^1 - 2K\left[x_2^0 + \theta^1\right] \]


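The first necessary conditions yield:

\[
\frac{\partial \tilde H_1}{\partial \theta^1}
 = K(\theta^1)^3 + 6K(\theta^1)^2
 + \left[10K - 2Kx_1^0 + 2K\hat x_1^1 - \lambda_1^1\right]\theta^1
 + \left[-4Kx_1^0 + 2Kx_2^0 + 4K\hat x_1^1 - 2K\hat x_2^1 - 2\lambda_1^1 + \lambda_2^1\right] = 0
\]

\[ \frac{\partial \tilde H_2}{\partial \theta^2} = 2\theta^2 = 0 \]

which gives $\theta^2 = 0$.

1. Put K = 1.

2. Assume $\hat\theta^1 = 1$, $\hat\theta^2 = 0$.

3. Find $\hat x_1^1 = 1 - 2 - \tfrac{1}{2} = -\tfrac{3}{2}$ and $\hat x_2^1 = 1 + 1 = 2$.

4. Also, $\lambda_1^1 = 1$ and $\lambda_2^1 = 2(2) = 4$.

5. Minimize $\tilde H_1$ and $\tilde H_2$, which give $\tilde\theta^1 = 0.947$ and $\tilde\theta^2 = 0$;

   $|\tilde\theta^1 - \hat\theta^1| = 0.053 > \varepsilon = 0.001$.

   Go back to step 2.

2a. Assume $\hat\theta^1 = 0.947$ and $\hat\theta^2 = 0$.

3a. Find $\hat x_1^1 = -1.343$ and $\hat x_2^1 = 1.947$.

4a. Find $\lambda_1^1 = 1$ and $\lambda_2^1 = 3.894$.

5a. Minimize $\tilde H_1$ and $\tilde H_2$ and find $\tilde\theta^1 = 0.9$, $\tilde\theta^2 = 0$.

Go back to step 2a and assume $\hat\theta^1 = 0.9$, $\hat\theta^2 = 0$. Continue in
the same way until $|\tilde\theta^1 - \hat\theta^1| < \varepsilon$ and $|\tilde\theta^2 - \hat\theta^2| < \varepsilon$. This finally
leads, in 7 iterations, to the solution $\theta^1 = \theta^2 = 0$, which is the
solution of the problem.

Note that if K = 0.1 the nonconvexity of the first connection
constraint with respect to $\theta^1$ does not disappear, and the stronger version
of the minimum principle developed in this paper does not succeed in
giving the solution.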
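The iteration above can be sketched in code for the first stage, the second stage giving $\theta^2 = 0$ at once. The sketch uses the cubic stationarity condition for $\tilde H_1$ with $x_1^0 = x_2^0 = 1$ and the adjoint values of steps 4 and 4a. Newton's method started from the current estimate is assumed here as the local minimizer; the thesis does not say how the stage Hamiltonians were minimized, so the iteration count of this sketch need not match the count reported above. Function names are illustrative.

```python
def stage1_gradient(theta, K, x1_hat, x2_hat, lam1, lam2, x1_0=1.0, x2_0=1.0):
    """Derivative of the linearized stage-1 Hamiltonian and its slope."""
    a = 10.0 * K - 2.0 * K * x1_0 + 2.0 * K * x1_hat - lam1
    b = (-4.0 * K * x1_0 + 2.0 * K * x2_0 + 4.0 * K * x1_hat
         - 2.0 * K * x2_hat - 2.0 * lam1 + lam2)
    g = K * theta ** 3 + 6.0 * K * theta ** 2 + a * theta + b
    dg = 3.0 * K * theta ** 2 + 12.0 * K * theta + a   # curvature of the stage Hamiltonian
    return g, dg


def minimize_stage1(theta_start, K, x1_hat, x2_hat, lam1, lam2):
    """Local minimization by Newton's method on the stationarity condition."""
    theta = theta_start
    for _ in range(50):
        g, dg = stage1_gradient(theta, K, x1_hat, x2_hat, lam1, lam2)
        if dg <= 0.0:
            raise ValueError("stage Hamiltonian not locally convex; increase K")
        step = g / dg
        theta -= step
        if abs(step) < 1e-12:
            break
    return theta


def iterate(K=1.0, theta=1.0, eps=1e-3, max_iter=500):
    """Steps 2 to 6 for the counterexample; returns the sequence of theta^1 values."""
    history = [theta]
    for _ in range(max_iter):
        x1_hat = 1.0 - 2.0 * theta - 0.5 * theta ** 2   # step 3: forward sweep
        x2_hat = 1.0 + theta
        lam1, lam2 = 1.0, 2.0 * x2_hat                  # step 4: adjoints at feasibility
        new_theta = minimize_stage1(theta, K, x1_hat, x2_hat, lam1, lam2)
        history.append(new_theta)
        if abs(new_theta - theta) < eps:                # step 6: convergence test
            break
        theta = new_theta
    return history


history = iterate()
_, curvature_K1 = stage1_gradient(0.0, 1.0, 1.0, 1.0, 1.0, 2.0)
_, curvature_K01 = stage1_gradient(0.0, 0.1, 1.0, 1.0, 1.0, 2.0)
```

At the solution $\theta^1 = 0$ the curvature returned by `stage1_gradient` reduces to $10K - 1$, so in this sketch it is positive for K = 1 and vanishes for K = 0.1, which is consistent with the remark that K = 0.1 does not remove the nonconvexity.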


IV.1.e. Denn's counterexample

Consider the following example taken from Denn (1969), which is

also a counterexample to Katz' strong minimum principle and is described

by:



Minimize x1
1 2
u ,u








subject to

1 o o 1 1)2
x1 = x(1 + x2) 2 ()
1 o o 1
x2 = 4x1 2x2 + u


2 1 1 1 ,22
S= x (1 + x2) (u

2 1 1 2
x2 = 4xI 2x2 + u
2 1 2x2

x- =- 1/4

and x = 1

where $u^1$ and $u^2$ are to be chosen subject to:

\[ 0 < u^* \le u^1, u^2 \]

This is a constrained problem, but it can be handled by the developed
method just as well as an unconstrained problem. The inequalities
will be handled through Kuhn-Tucker multipliers.

The above problem has a solution at the point $u^1 = 1$ and $u^2 = u^*$,
but the Hamiltonian for the first stage does not have a minimum and
thus Katz' strong minimum principle cannot be used. In fact the
Hamiltonian of the first stage possesses one stationary point, which
is a maximum. Let us now apply the algorithm developed in the
present work. Thus we take:








= Al(u )2 + ( 1+ )u1 + A1 x(1 + x12)+ 42x1 2A2x2 + K(x1)2((1+12)2
S 2 11 221 2 12


+ 16K(x)2 +K(x(x11,2 16Kx x1 +u + Kx21(1 + x2) 2Kx1x1(1+x2)

1 2 1 22 1 2
+ -^ (1 + x)(2) Kx(1 + x)(u2) + 4K12 8Kxx2

^2 I2 1^2 122 I2 1-2
2Kx2x2 + 4Kx2x2 4K4Kx1u + 8Kx1u + 2Kx2u 4Kx2u

and


H = 22 2 Kx2 2 + ( 4 + Kx 2) K(x2 + K(u2)
H2 1 2 A1(u ) A u Kx1 Kx1(u ) 2K(x

- ^2 2^u ^2 K >2j +21 2)K
2Kx2u 2 u 2) + Kx1x1(1 x12) 2Kxxl(l+x2) 1 12)


2 1 2 142 1 1 2 12 1 2
Kx(1+x2)(u2) 4Kx1x2 8Kx1x2 2Kxx2 + 4Kx2x2

I 2 1 2 12 21 2
4Kx12 + SKx1u2 + 2Kx2u 4Kx12u


The necessary conditions to be satisfied are:

3H 1 1 and
-1 1(ul) + 2 + = 0 and


21 2 2 2 =1 1
Du


S K(u A Kx + K 2Kx Kx1 +2)
3u
1 -1
+ 8Kxl 4Kx2=0








The adjoint variables satisfy the following equations:


A1 (1+x(2) + 4A + 2K(x1)(1+x1 2 + 32K(x1) 16Kx 2KX^(1+x2)


K(+x )(u2) 8Kx 2 8Kv2




12 2 -2
Kx1u + 4Kx2 4Ku




2 2 2 A A
X2 = 2x + K + K(u2) 2Kx^(1 + x2)


2 = 2K(x) 2Ku + 4Kx1 .


Note that in the above equations the variables $\mu_1$ and $\mu_2$ are
Kuhn-Tucker multipliers through which we handle the inequalities
$u^1 \ge u^*$ and $u^2 \ge u^*$. For $\mu_1$ and $\mu_2$ we have $\mu_1, \mu_2 \le 0$.

Starting with K = 10 and initial assumptions $u^1 = 2$ and $u^2 = 2$,
we apply the same algorithmic procedure as presented in the previous
example, and after 5 iterations we find the solution

\[ u^1 = 1 \quad \text{and} \quad u^2 = u^* \]

which is the solution for the problem.

IV.2. The Design of a Heat Exchange Network

Avery and Foss (1971) have demonstrated that the method of two-level
optimization may not be generally applicable to chemical process
design problems due to the mathematical character of commonly
encountered objective functions. Since the cost functions used in
chemical engineering design problems typically include a term
$x^\alpha$, where x is the throughput in a unit, nonconvex problems arise.
The method for resolving the nonconvexities, as described in
Chapter II, will be applied in this section to the design of a
heat exchange network presented by Avery and Foss and possessing
inherent nonconvexities.

The cold stream D of Figure 8 is to be heated from temperature
$T_o$ to $T_p$. Three hot streams having flowrates a, b, c and temperatures
$t_a$, $t_b$, $t_c$ may be used for heating; here they are considered to have
sufficient heat capacity and availability to accomplish the task.
The distribution of the total heat load among the three exchangers
is to be accomplished at the minimum cost of equipment, which is related
to the total heat transfer surface area A by

\[ C = \sum_{i=1}^{3} C_i = \sum_{i=1}^{3} \gamma_i A_i^{\alpha_i} \qquad \text{(IV-1)} \]

The parameters $\gamma$ and $\alpha$ are positive constants; typically $\alpha = 0.6$.
It is easily demonstrated that this problem has a minimum cost solution
and that it is unique.

Let $x_1$ and $x_2$ be the enthalpies of stream D entering and leaving,
respectively, the first heat exchanger (see Figure 9). Similarly,
$y_1$ and $y_2$ are the enthalpies for the second exchanger and $z_1$ and $z_2$
for the third. Note that $x_1$ and $z_2$ are known, since the flowrate and
temperatures of stream D are known at the entrance to the first
exchanger and the exit of the third exchanger. Thus, considering
$x_2$, $y_1$, $y_2$ and $z_1$ as the design variables, the minimization problem
can be formulated as follows:

\[ \min_{x_2,y_1,y_2,z_1} [C_1(x_2) + C_2(y_1,y_2) + C_3(z_1)] \]

\[ \text{s.t.} \quad x_2 = y_1, \qquad y_2 = z_1 \qquad \text{(IV-2)} \]


Figure 8. Heat Recovery Process


Figure 9. Uncoupled Heat Recovery Process Showing the
          Enthalpy Variables Used in the Two-level
          Optimization Method


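The Lagrangian for this problem is:

\[ L = \sum_{i=1}^{3} C_i - \lambda_1(x_2 - y_1) - \lambda_2(y_2 - z_1) \]

\[ \phantom{L} = \{C_1(x_2) - \lambda_1 x_2\} + \{C_2(y_1,y_2) + \lambda_1 y_1 - \lambda_2 y_2\} + \{C_3(z_1) + \lambda_2 z_1\} \]

\[ \phantom{L} = \ell_1(x_2,\lambda_1) + \ell_2(y_1,y_2,\lambda_1,\lambda_2) + \ell_3(z_1,\lambda_2) \qquad \text{(IV-3)} \]

For given values of the multipliers $\lambda_1$ and $\lambda_2$ the two-level optimization
method requires the minimization of the sublagrangians $\ell_1$, $\ell_2$ and $\ell_3$.
A sufficient condition for a stationary point of a function to be a
minimum is that the Hessian matrix be positive definite at that point.
The Hessian for $\ell_2$ is defined as

\[
H_2 = \begin{bmatrix}
\dfrac{\partial^2 C_2}{\partial Q^2} & \dfrac{\partial^2 C_2}{\partial Q\,\partial y_1} \\[2mm]
\dfrac{\partial^2 C_2}{\partial y_1\,\partial Q} & \dfrac{\partial^2 C_2}{\partial y_1^2}
\end{bmatrix}
= \begin{bmatrix} d_1 & e \\ e & d_2 \end{bmatrix}
\qquad \text{(IV-4)}
\]

and the conditions assuring that both eigenvalues are positive are

\[ d_1 > 0, \qquad d_2 > 0 \qquad \text{(IV-5)} \]

\[ \Delta \equiv (d_1 d_2 - e^2) > 0 \qquad \text{(IV-6)} \]

In the above, the heat duty $Q = y_2 - y_1$ in exchanger 2 has been
substituted for the variable $y_2$.

$d_1$, $d_2$ and e may be determined from equation (IV-1) and the
following relations, which are suitable for the determination of heat
exchange area. Figure 10 identifies the notation used here.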

\[ Q = C_D (T_2 - T_1) = C_B (t_2 - t_1) \]

\[ Q = y_2 - y_1 \]

\[ T_1 = y_1 / C_D \]

\[ A = \frac{Q}{U\,\Delta T_m} \]

\[
\Delta T_m = \frac{(t_2 - T_2) - (t_1 - T_1)}{\ln \dfrac{(t_2 - T_2)}{(t_1 - T_1)}},
\qquad r = \frac{C_D}{C_B}
\]

Without loss of generality, it is convenient to take r > 1; identical
conclusions hold when $r \le 1$. It may be shown that

\[ d_1 = q[(\alpha-1)(rs-1)^2 + p\{(rs)^2 - 1\}] \]

\[ d_2 = q[(\alpha-1)(s-1)^2 + p(s^2-1)] \]

\[ e = q[(\alpha-1)(rs-1)(s-1) + p(rs^2-1)] \]
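The exchanger relations above (heat duty, log-mean temperature difference, area, and the cost law of eq. IV-1) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and all numerical values are hypothetical, the hot stream is taken to enter at $t_2$ and leave at $t_1$, and the heat-capacity flowrates and the heat transfer coefficient U are treated as constants.

```python
import math


def exchanger_area(y1, y2, t2, C_D, C_B, U):
    """Area A = Q / (U * dTm) for one exchanger.

    y1, y2 : enthalpies of the cold stream D entering and leaving
    t2     : hot-stream inlet temperature
    C_D, C_B : heat-capacity flowrates of the cold and hot streams
    """
    Q = y2 - y1                     # heat duty
    T1 = y1 / C_D                   # cold inlet temperature
    T2 = T1 + Q / C_D               # cold outlet temperature
    t1 = t2 - Q / C_B               # hot outlet temperature
    hot_end, cold_end = t2 - T2, t1 - T1
    if hot_end <= 0.0 or cold_end <= 0.0:
        raise ValueError("temperature cross: no positive driving force")
    if abs(hot_end - cold_end) < 1e-12:
        dTm = hot_end               # limit of the log-mean when both ends are equal
    else:
        dTm = (hot_end - cold_end) / math.log(hot_end / cold_end)
    return Q / (U * dTm)


def exchanger_cost(area, gamma, alpha=0.6):
    """Cost law C = gamma * A**alpha of eq. (IV-1)."""
    return gamma * area ** alpha


# Hypothetical data: r = C_D / C_B = 2.
area = exchanger_area(y1=100.0, y2=140.0, t2=150.0, C_D=2.0, C_B=1.0, U=0.5)
cost = exchanger_cost(area, gamma=10.0)
```

Because $\alpha < 1$, doubling the area less than doubles the cost; this economy of scale is the concavity that produces the nonconvex sublagrangians discussed in this section.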




















Figure 10. Diagram Showing Notation for a Single Exchanger








where

62 CDOt2-Yl >
61 CDt2-rQ-Y1

52



2
q = Ka p2/62 0


K I C a
K=Y L-2 >0

Substitution of di, d2 and e into equation (IV-6) gives


A (dd2-e ) = q p s (r-1)

which implies that $\Delta$ is never positive and that, regardless of the
parameters of the problem, the stationary point can never be a
minimum. Therefore the two-level method fails.

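Consider now the following augmented problem:

\[ \min F^* = C_1(x_2) + C_2(y_1,y_2) + C_3(z_1) + K^*(x_2 - y_1)^2 + K^*(y_2 - z_1)^2 \]

\[ \text{s.t.} \quad x_2 - y_1 = 0 \quad \text{and} \quad y_2 - z_1 = 0 \]

with $K^* > 0$.

The coupled terms $x_2 y_1$ and $y_2 z_1$ are expanded in a Taylor series
around the point $(\hat x_2, \hat y_1, \hat y_2, \hat z_1)$ and are approximated by the linear
terms in the expansion. The Lagrangian of the above problem becomes:

\[
\begin{aligned}
L^* ={}& \{C_1(x_2) - (\lambda_1 + 2K^*\hat y_1)x_2 + K^*x_2^2 + K^*\hat x_2\hat y_1\} \\
      &+ \{C_2(y_1,y_2) + (\lambda_1 - 2K^*\hat x_2)y_1 - (\lambda_2 + 2K^*\hat z_1)y_2
         + K^*y_1^2 + K^*y_2^2 + K^*\hat x_2\hat y_1 + K^*\hat y_2\hat z_1\} \\
      &+ \{C_3(z_1) + (\lambda_2 - 2K^*\hat y_2)z_1 + K^*z_1^2 + K^*\hat y_2\hat z_1\} \\
   ={}& \ell_1^* + \ell_2^* + \ell_3^*
\end{aligned}
\]

The necessary conditions for the minimum of C are

\[ \nabla_v L^* = 0 \quad \text{and} \quad \nabla_\lambda L^* = 0 \]

where

\[ v = [x_2, y_1, y_2, z_1]^T \quad \text{and} \quad \lambda = (\lambda_1, \lambda_2)^T \]

This results in the following equations

\[ \frac{\partial \ell_1^*}{\partial x_2} = 0, \qquad
   \frac{\partial \ell_2^*}{\partial y_1} = \frac{\partial \ell_2^*}{\partial y_2} = 0, \qquad
   \frac{\partial \ell_3^*}{\partial z_1} = 0, \]

\[ x_2 = y_1 \quad \text{and} \quad y_2 = z_1 ; \]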


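that is, each subsystem is at a stationary point with respect to the
variables associated with it. The Hessian matrix for the new
sublagrangian $\ell_2^*$ of the second exchanger is:

\[
H_2^* = \begin{bmatrix}
\dfrac{\partial^2 C_2}{\partial Q^2} + 2K^* & \dfrac{\partial^2 C_2}{\partial Q\,\partial y_1} + 2K^* \\[2mm]
\dfrac{\partial^2 C_2}{\partial y_1\,\partial Q} + 2K^* & \dfrac{\partial^2 C_2}{\partial y_1^2} + 4K^*
\end{bmatrix}
\]

The terms $d_1^*$, $d_2^*$ and $e^*$ are easily found to be

\[ d_1^* = d_1 + 2K^*, \qquad d_2^* = d_2 + 4K^*, \qquad e^* = e + 2K^* \]

and

\[
\begin{aligned}
\Delta^* &= [d_1^* d_2^* - (e^*)^2] \\
         &= 4(K^*)^2 + (4d_1 + 2d_2 - 4e)K^* + (d_1 d_2 - e^2) \\
         &= 4(K^*)^2 + \Gamma_2 K^* + \Gamma_3
\end{aligned}
\]

where

\[ \Gamma_2 = 4d_1 + 2d_2 - 4e, \qquad \Gamma_3 = d_1 d_2 - e^2 \]

$\Delta^*$ has roots

\[ K^* = \frac{-\Gamma_2 \pm (\Gamma_2^2 - 16\Gamma_3)^{1/2}}{8} \]

It was shown before that $\Gamma_3 \le 0$; thus

\[ (\Gamma_2^2 - 16\Gamma_3) \ge \Gamma_2^2 \ge 0 \]

and the roots are real and of opposite sign. Let $K_1 > 0$ and
$K_2 \le 0$; then for $K^* > K_1$, $\Delta^* > 0$. Also, for

\[ K^* > -d_1/2 \equiv K_3, \qquad d_1^* > 0 \]

and for

\[ K^* > -d_2/2 \equiv K_4, \qquad d_2^* > 0 \]

Therefore if $K^* > \max\{K_1, K_3, K_4\}$ we find $H_2^* > 0$ and the stationary
point for unit 2 is a minimum as required.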
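The choice of the penalty factor described above can be sketched in a few lines. The function names and the sample values of $d_1$, $d_2$ and e are hypothetical, chosen only so that $d_1 d_2 - e^2 \le 0$; they are not taken from the heat exchanger problem.

```python
import math


def penalty_threshold(d1, d2, e):
    """Return max{K1, K3, K4}: K1 is the larger root of
    4 K^2 + Gamma2 K + Gamma3 = 0, K3 = -d1/2 and K4 = -d2/2."""
    gamma2 = 4.0 * d1 + 2.0 * d2 - 4.0 * e
    gamma3 = d1 * d2 - e * e
    disc = gamma2 ** 2 - 16.0 * gamma3
    if disc < 0.0:
        raise ValueError("no real roots: the determinant is positive for every K")
    k1 = (-gamma2 + math.sqrt(disc)) / 8.0
    return max(k1, -d1 / 2.0, -d2 / 2.0)


def is_positive_definite(d1, d2, e, k):
    """Check the penalized 2x2 Hessian [[d1 + 2k, e + 2k], [e + 2k, d2 + 4k]]."""
    d1s, d2s, es = d1 + 2.0 * k, d2 + 4.0 * k, e + 2.0 * k
    return d1s > 0.0 and d2s > 0.0 and d1s * d2s - es * es > 0.0


# Hypothetical curvature data with d1*d2 - e^2 = -4.25 <= 0.
d1, d2, e = -1.0, 2.0, 1.5
k_min = penalty_threshold(d1, d2, e)
```

For these sample numbers the unpenalized Hessian is indefinite, and it becomes positive definite as soon as the penalty factor exceeds the computed threshold.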

Thus we see that the curvature of a sublagrangian at the stationary
point can be determined not only by the sub-unit cost
function, $C_2$ in this case, but also by the numerical value of the
penalty factor $K^*$.

The above example demonstrates the applicability of the developed

strategy in a wide class of chemical engineering design problems

where the conventional two-level method fails to provide the solution.













CHAPTER V

THE SYNTHESIS OF PROCESS FLOWSHEETS.
A GENERAL REVIEW.


Processing systems are characterized by two distinct features.

The first is the nature of the process units and their interconnections,

and the second is the capacities and the operating conditions of

the units. Synthesis of optimal processing schemes requires a search

over the whole space of structural alternatives as well as over the

space of the design variables for each structural configuration.

The synthesis of an optimal process flowsheet represents a great

challenge for a chemical engineer in the area of systems engineering.

A number of significant contributions have moved the problem from

the state of art to the state of semi-art where the personal capabilities
of the designer have been organized and systematized in a

logical manner. The size of the problem is very large and the methods

currently available simply give good solutions which are happily

accepted by practicing designers. The progress of mathematical systems

theory in this area has been very slow and no major thrusts have been

attained. Thus the chemical engineer has to resort to techniques

that are available for use with his own experience and intuition.

We shall attempt to present the major contributions to the

solution of the synthesis problem in order to relate and compare the

synthesis strategies proposed in this work, and presented in Chapters VI

and VII, with the general trends of thought already established by

other researchers in this area.




Full Text

PAGE 1

OPTIMIZATIOI^ OF NOFCONVF.X SYS i EMS Af^iD THE SYNTHESIS OF OPTIMUM PROCESS FLOWSHEETS By GEORGE STEPHANOPOULOS A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1974

PAGE 2

Dedicated to The Memory of My Father Nicholas Stephanopoulos Mcui licii, .t'^^£e imiji o^ acting mImzZij: ?Luttij, on meditation , tJiLs ^i the nobloAt; Sdcondlij, on imitation, -tlii-i Zi, the. acibie^t; cmd Jlii.\dhj, on expefviencd; tliib i^ tlit bittcfiest. Con-^aciiiS

PAGE 3

ACKNOWLEDGMENT The author wishes to expv^ess his gratitude to: The chairman of his supervisory committee, Dr. Arthur W. Westerberg, Associate Professor of Chemical Engineering, for iiis firm support, sound advice, and guidance throughout what has been the most stimulating and enjoyable period of his academic life; The members of his supervisory committee: Dr. P.P. May, Professor of Chemical Engineering, Dr. D.W. Kirmse, Assistant Professor of Chemical Engineering, Dr. T.E. Bullock, Associate Professor of Electrical Engineering and Dr. M.E. Thomas, Professor of Industrial and Systems Engineering; The faculty and staff of the Department of Chemical Engineering for their assistance; The National Science Foundation, which provided financial support through the grants GK-18633 and 41606 and the Graduate School for a research assistantship; Jeanette for her encouragement through the lows and the sharing of the highs of the life. T n

PAGE 4

TABLE OF CONTENTS Page ACKNOWLEDGMENTS 1 i i LIST or TABLES vi i LIST OF FIGURES vii1 LIST OF SYMBOLS xi ABSTRACT xi v CHAPTERS: I . INTRODUCTION 1 II. TWO LEVEL OPTIMIZATION METHOD. DUAL GAPS AND THEIR RESOLUTION USING METHOD OF MULTIPLIERS ... 5 11. 1. Review of the Previous Works on the Nonconvex Optimization 5 1 1. 2. Statement of the Problem and the TwoLevel Procedure 7 11. 3. Dual Gaps 10 11. 4. Resolution of Dual Gaps Using the Method of Multipl iers 13 1 1. 5. Computational Separability 22 1 1. 6. The Algorithm 24 11. 7. A Discrete Minimum Principle 26 1 1. 8. Discussion 34 III. A STRONG VERSION OF THE DISCRETE MINIMUM PRINCIPLE 37 111.1. Review of Previous Works 37 1 1 1. 2. Statement of the Problem and Its Lagrangian Formulation 39 iv

PAGE 5

TABLE OF CONTLFnITS (continued) III. 5. Discussion IV. EXAMPLES OF NONCONVEX OPTIMIZATION IV.l. Numerical Examples IV. 1 .a. Two-stage examples IV.l.b. Three-stage example with recycle VI. 1. Previous Works on the Synthesis of Separation Schemes VI 2 Statement of the Problem and the List Techniques for the Representation of the Separation Operations ' VI. 3. Branch and Bound Strategy VI. 4. Examples VI. 4. a. Example 1: n-butylene purification system Page III 3 Development of a Stronger Version of the Discrete Minimum Principle 1 1 1. 4. The Algorithm and Computational ^^ Characteristics 53 55 55 56 60 IV. I.e. Soland's example IV.l.d. Jackson and Morn's counter^^ example IV. I.e. Denn's counterexample ^^ IV.?-. The Design of a Heat Exchange Network ... V. SYNTHESIS OF PROCESS FLOVISHEETS. A GENERAL REVIEW VI BRANCH AND BOUND STRATEGY FOR THE SYNTHESIS OF OPTIMAL SEPARATION SCHEMES 72 83 90 93 100 107 108 VI. 4. b. Example 2: olefi ns-parraf ins separation system

PAGE 6

TABLE OF CONTENTS (continued) Page VI. 5. Discussion 132 VII. EVOLUTIONARY SYNTHESIS OF PROCESS FLOWSHEETS ... 134 VII. 1. A General Philosophy on Evolutionary Synthesis 134 VII. 2. Evolutionary Synthesis of Optimal Multicomponent Separation Sequences 143 VI 1. 2. a. Representation of separation sequences as binary trees .... 143 VII. 2. b. Neighboring flowsheets and the evolutionary rules for a separation sequence 146 VII .2. c. Polish strings and their representation of separation sequences 1 50 VII. 2. d. Proof of the completeness of the evolutionary rules i'"^'--. VII. 2. e. Evolutionary strategy 161 VII. 3. Examples of Evol utionary Synthesis . 163 VI 1. 3. a. Lxample 1: synthesis of a solids' separation system .... 153 VII. 3. b. Example 2: synthesis of a mul ti component separation sequence 1 67 VII. 4. Discussion 174 VIII. CONCLUSIONS AND RECOMMENDATIONS FOR FURTHER RESEARCH 1 78 APPENDICES 183 APPENDIX A 184 BIBLIOGRAPHY 187 BIOGRAPHICAL SKETCH 191 VI


LIST OF TABLES

Table
1   Points Generated for the Two-Stage Example
2   Points Generated for the Modified Two-Stage Example
3   Points Generated for Soland's Example
4   The Numbers of Distinct Separators B(N) and Distinct Flowsheets F(N) for a Mixture of N Components and One Separation Method
5   Initial Feed to the n-Butylene Purification System
6   Generated Flowsheets and Their Minimum Costs
7   Specifications of the Initial Feed and the Desired Products for Example 2
8   Specification of the Solids in Example 1
9   The Evolutionary Steps Taken During the Synthesis of the Solids' Separation System
10  Table Indicating the Effectiveness of the Proposed Evolutionary Strategy in Synthesizing an Optimal Separation Sequence (One Separation Method Used)
11  Table Indicating the Effectiveness of the Proposed Evolutionary Strategy in Synthesizing an Optimal Separation Sequence (Three Separation Methods Used)


LIST OF FIGURES

Figure
1   Geometric View of the Success of the Dual Approach
2   Geometric View of the Failure of the Dual Approach to Yield the Primal Solution
3   Improvement of the Dual Bound h*(λ) for the Penalized Problem Over the Dual Bound h(λ) for the Unpenalized Problem for the Same Multipliers λ
4   Relationship of Penalty Constant K Required for Three Lagrangian Based Optimization Algorithms
5   Two-stage Example
6   Three-stage Example with Recycle
7   Jackson and Horn's Counterexample
8   Heat Recovery Process
9   Uncoupled Heat Recovery Process Showing the Enthalpy Variables Used in the Two-level Optimization Method
10  Diagram Showing Notation for a Single Exchanger
11  All the Distinct Separators Generated and the Basic Flowsheet for a Fictitious Example of 4 Components Using 2 Separation Methods
12  Two-stage Example to Demonstrate the Insensitivity of the Dual Function to the Values of the Lagrange Multipliers
13  Generation of the Basic Flowsheet for the Synthesis of the n-Butylene Purification System
14  Generation of all the Flowsheets Starting with Separator 1 for the n-Butylene Purification System Example
15  Generation of all the Flowsheets Starting with Separators 2 or 5 for the n-Butylene Purification System Example
16  All the Distinct Separators Employed During the Synthesis by Branch and Bound of the n-Butylene Purification System
17  Nearly Optimum Flowsheets Retained at the End of the Branch and Bound Synthesis of the n-Butylene Purification System
18  All the Distinct Separators Employed During the Synthesis by Branch and Bound of the Olefins, Paraffins Separation System
19  Generation of the Basic Flowsheet for the Synthesis of the Olefins, Paraffins Separation System
20  Nearly Optimum Flowsheets Retained at the End of the Branch and Bound Synthesis of the Olefins, Paraffins Separation System
21  Generation of all Flowsheets Starting with Separator 1 for the Olefins, Paraffins Separation System
22  Generation of all Flowsheets Starting with Separators 2 or 3 for the Olefins, Paraffins Separation System
23  Generation of all Flowsheets Starting with Separator 4 for the Olefins, Paraffins Separation System
24  An Illustrative Diagram Showing Two Alternate Sets of Evolutionary Rules for a Family of Flowsheets A
25  Flowsheet A and Its Neighboring Flowsheets B, C, D and E Resulting from A with Simple Structural Modifications
26  A Separation Sequence (A), Its Corresponding Binary Tree (B) and the Skeleton Structure (C) Corresponding to This Tree
27  Flowsheet (A), a Down Neighbor (B) to It, and a New Separator Type Neighbor (C) to It
28  Separation Sequences Corresponding to the Binary Trees of Figures 27 A and B
29  A Binary Tree for the Algebraic Expression 3x^2 y + z^2/u
30  Interpreting a Polish String Development of an Algebraic Expression Using a Stack of Operands and a Set of Operators
31  The Operators of a Polish String and Their Operands, and the Schematic Representation of the Generation of the Neighboring Flowsheets by Applying Rules 1 and 2
32  Flowsheet A-1 and Its Neighboring Flowsheets A-2, A-3 and A-4 Generated Using Rules 1 and 2
33  A Schematic Representation of a System for the Separation of Solids and the Corresponding Binary Tree
34  Flowsheets Generated During the Evolutionary Synthesis of the n-Butylene Purification System Starting from Flowsheet (a)
35  The Evolution Process for the n-Butylene Purification System Starting from the Flowsheet (a)
36  The Evolution Process for the n-Butylene Purification System Starting from the Flowsheet (k)
37  Flowsheets Generated During the Evolutionary Synthesis of the n-Butylene Purification System Starting from Flowsheet (k)


LIST OF SYMBOLS

$A_i$ = Heat transfer area of the i-th heat exchanger.
$a, b, c$ = Flowrates of hot streams in a heat exchange network.
$C_i$ = Heat capacity of stream i.
$c_i$ = Cost of the i-th heat exchanger.
$D$ = Set of feasible dual variables or multipliers.
$F$ = Scalar return function.
$F^*$ = Scalar return function for the augmented problem, may be subscripted.
$f_i$ = Stage transformation function, for stage i.
$f_j$ = Scalar return function for subsystem j.
$g$ = Vector of equality constraint functions, may be subscripted.
$H$ = Supporting hyperplane for the set R.
$H_i$ = Hessian matrix of subsystem i.
$\mathcal{H}$ = Hamiltonian function. Subscripted with i refers to the Hamiltonian for the subsystem i.
$\mathcal{H}^*$ = Hamiltonian function for the augmented problem.
$h$ = Dual function.
$h_j$ = Vector of inequality constraint functions for subsystem j.
$I(j)$ = The set of i such that stream i is an input to subsystem j.
$K$ = Penalty constant, always nonnegative.
$L$ = Lagrangian function.
$L^*$ = Lagrangian function for the augmented problem, may be subscripted.
$l_i$ = Sub-Lagrangian for subsystem i.
$\ell_i$ = Sub-Lagrangian for subsystem i after the linear approximation of nonseparable terms.
$O(j)$ = The set of i such that stream i is an output of subsystem j.
$P(q)$ = Scalar quantity $= g^T(q)\,g(q)$.
$Q$ = Heat duty of an exchanger.
$q$ = Composite vector variable $(x \mid u)$, may be subscripted.
$R$ = The set $(z_0, z)$ such that $z_0 \ge w(z)$, $z \in Z$, may be subscripted.
$S$ = Constrained variable set, may be subscripted.
$s$ = Direction vector, may be subscripted.
$T$ (subscripted) = Temperatures of a cold stream at the entrance and the exit of a heat exchange network.
$t_i$ = Stage transformation function, for stage i.
$t_a, t_b, t_c$ = Temperatures at which the hot streams a, b, c are available.
$u$ = System decision variable vector, may be subscripted.
$w(z)$ = Minimum $\{F(x);\ g(x) = z,\ x \in S\}$.
$w^*(z)$ = Minimum $\{F^*(x);\ g(x) = z,\ x \in S\}$.
$x$ = Vector variable associated with interconnecting streams, may be subscripted.
$y$ = Vector variable associated with streams leaving the system, may be subscripted.
$Z$ = The set of z such that $\exists\, x \in S \ni g(x) = z$.
$z$ = Perturbation vector, may be subscripted.

Greek Letters

$\alpha$ = Scalar parameter.
$\alpha_i$ = Positive constant, usually = 0.6.
$\beta$ = Scalar parameter.
[symbol illegible] = Positive constant.
$\Delta T$ = Log mean temperature difference.
$\varepsilon$ = Small positive quantity.
[symbol illegible] = System decision variable vector, may be subscripted.
$\lambda$ = Vector of Lagrange multipliers, may be subscripted.
$\mu$ = Vector of Lagrange multipliers for the augmented problem.
$\nu$ = Vector of Kuhn-Tucker multipliers, may be subscripted.
[symbol illegible] = Scalar parameter.

Mathematical Symbols

$\exists$ [remainder of this list is illegible in the source]


Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

OPTIMIZATION OF NONCONVEX SYSTEMS AND THE SYNTHESIS OF OPTIMUM PROCESS FLOWSHEETS

by
George Stephanopoulos
June, 1974

Chairman: Arthur W. Westerberg
Major Department: Chemical Engineering

Previous efforts in the area of optimization of chemical processes have not accounted for the nonconvexities commonly encountered in such systems. These nonconvexities cause many of the proposed large scale optimization strategies to fail. The subject of chemical process synthesis was also largely ignored until recently despite its importance in chemical engineering practice. This dissertation presents techniques to overcome the deficiencies of two related and often studied optimization methods in the presence of nonconvexities and develops strategies for the synthesis of process flowsheets.

Many Lagrangian based methods for the optimization of large scale systems require that the Lagrangian function possess a saddle point. The two-level optimization method may not be generally applicable to chemical process design problems due to the mathematical character of commonly encountered constraint sets and objective functions, which commonly do not allow the existence of saddle points for the Lagrangian function. The dissertation presents a method to overcome these shortcomings of the two-level optimization procedure by employing Hestenes' method of multipliers. The objective function is augmented by a penalty term which is the sum of the squares of the connection constraints multiplied by a positive constant. This penalty term under certain conditions turns every stationary point of the Lagrangian into a saddle point, thus securing the success of the two-level method. The separability of the initial system, which was lost because of the penalty term, is regained by expanding the nonseparable terms into a Taylor series and retaining only the linear part of it. As a direct extension of the above strategy the dissertation presents the development of a stronger version of the discrete minimum principle. Two algorithms have been developed to implement these theoretical results. The success of the new methods has been demonstrated in several small size numerical examples and in the design of a simple heat recovery network on which the previous methods fail.

The dissertation also develops two strategies for the synthesis of chemical process flowsheets. The first is a branch and bound strategy which exploits the bounding properties of the dual and primal values which can be obtained for the flowsheet objective function. The flowsheets are constructed in a block-by-block building procedure, and this method was used to synthesize optimal multicomponent separation sequences. In the method, list processing techniques are used to develop the distinct separators which can occur. At the end of the synthesis a small number of nearly optimum flowsheets is retained. Further screening among them is possible to locate the optimum flowsheet. This strategy is demonstrated in two different examples with very encouraging results.

The second synthesis strategy, the evolutionary strategy, is considered next and is systematized for use in the synthesis of general process flowsheets. The evolutionary synthesis procedure is broken into four subtasks: a) the development of a starting flowsheet, b) the generation of evolutionary rules to produce the structural modifications to be considered during the synthesis, c) the development of the proper evolutionary strategy to lead to the optimum solution in the most effective manner, and d) the screening among the current flowsheet and the alternative flowsheets generated. Notions such as that of the neighboring flowsheets and the evolutionary strategy help to put evolutionary synthesis in a correct perspective. The evolutionary approach is illustrated with the synthesis of two distinct separation problems with very encouraging results.


CHAPTER I

INTRODUCTION

The essential concern, purpose, and culmination of engineering is design. In chemical engineering this is the prevailing and most important factor of its scope. Very often, a final design is achieved without due consideration to all aspects of the design morphology. This is necessitated by the complexity of the design problem and the limited state of engineering advances in certain areas.

Proper design procedure includes the three essential stages of synthesis, analysis and evaluation. The design process is complicated by the interrelationships existing among these stages. Frequently these interrelationships are complex and cause design to be an iterative process, requiring the special attention of the designer and the development of flexible strategies which will lead to good solutions.

Analysis is a term which is equally familiar to both practicing engineers and students of engineering. It has been developed deductively and quantitatively to a high degree. Strategies have been developed to analyze whole processes, and very effective, sophisticated methods have been proposed to resolve the complicated, time consuming activities of the design process.

Considerable work has been done and is in progress on optimization theory for large structured systems. Both theory and applications have found very fertile ground in chemical engineering. The particular features of chemical processes (i.e., sparseness and complexity) have caused the development of highly effective strategies for analysis and optimization. Chemical process design has been the instigator of such developments.

The synthesis stage of the design procedure was largely ignored until recently in the chemical engineering literature despite its importance in chemical engineering practice. During the last few years the importance of creativity, innovation and invention in designing chemical processes has been stressed and has received the proper attention.

Because synthesis is such an important step, it became the principal goal of this thesis. Initially we were exploring the use of McGalliard's (McGalliard, 1971) approach to structural synthesis for the synthesis of optimal multicomponent separation schemes. This strategy involves the development of dual bounds for alternate flowsheets. Thus the first problem of concern was a good initial estimation of the Lagrange multipliers to be used for the evaluation of dual bounds. A more thorough and detailed investigation of the physical meaning of the Lagrange multipliers and the discovery of Hestenes' method in the literature (Hestenes, 1969) led to the development of a new method to overcome the deficiencies of Lasdon's (Lasdon, 1964) two-level optimization method. Then it became evident that a stronger version of the discrete minimum principle could also be established. Thus two new strategies evolved which can very likely be used effectively (as their application to several small examples has indicated) to optimize large-scale systems in the presence of nonconvexities. These approaches are of particular interest to a chemical engineer, since the cost functions used in process design involve the throughput to a unit raised to the 0.6 power, which is a characteristic nonconvex function.

Furthermore, we explored the use of the bounding properties of the primal and dual functions in connection with list processing techniques to generate very good alternate solutions to the multicomponent separation problem. Finally, the evolutionary approach to the synthesis of optimal process flowsheets was systematized, and all these principles and ideas were illustrated in the synthesis of an optimal separation scheme.

Thus, in Chapter II we discuss the two-level optimization method, its theoretical foundations, its advantages and its drawbacks. The generation of dual gaps because of nonconvexities and their resolution using Hestenes' method of multipliers is discussed. An algorithmic procedure employing these ideas is described. In Chapter III the "classical" discrete minimum principle is outlined, and its shortcomings because of nonconvexities are defined. Again Hestenes' multipliers are used to develop a strong version of the discrete minimum principle, and a new algorithmic procedure is proposed. In Chapter IV the theories developed in Chapters II and III are tested on some simple numerical examples and also on some examples drawn from the chemical engineering literature. In Chapter V a general review is presented of previous works in the area of process flowsheet synthesis. Then a branch and bound strategy, using list processing techniques, is outlined for the synthesis of optimal multicomponent separation sequences in Chapter VI, while Chapter VII develops a systematic evolutionary approach for the synthesis of process flowsheets. Chapter VII concludes with an illustrative presentation of the principles governing the evolution of the design on a separation problem. Finally, in Chapter VIII we summarize the results of this thesis and outline a program for further related research in the area of chemical process design.


CHAPTER II

TWO-LEVEL OPTIMIZATION METHOD. DUAL GAPS. RESOLUTION USING METHOD OF MULTIPLIERS

II.1. Review of Previous Works on the Nonconvex Optimization

The Lagrangian approach, representing a dual method, has often been proposed for solving optimization problems. In many engineering design problems, the system to be optimized comprises several fairly complex subsystems or units which are connected together by sets of equality constraints. By appending the equality constraints to the objective function with Lagrange multipliers, the Lagrange function, for fixed multiplier values, is separable, leaving a subproblem for each subsystem. The approach is therefore very attractive for this type of problem. It can fail, however, because of the presence of a dual gap at the solution point (Everett, 1963; Lasdon, 1970). Unfortunately this failure is common in engineering design problems; therefore, the development of a method to resolve dual gaps is of great importance.

Gaps may arise for various reasons, and a quite thorough treatment of their causes and resolutions is presented by Greenberg (1969). He reviewed the use of nonlinear supports (Gould, 1969; Bellman-Karush, 1961), the use of surrogates (Greenberg-Pierskalla, 1970; Glover, 1963), the use of cuts via dominance and efficiency concepts (Loane, 1971; Greenberg, 1969), and the use of a branch and bound method designed for finite separable problems. It should be noted that in all these methods to resolve gaps, inequality connection constraints are emphasized.

Bellmore, Greenberg and Jarvis (1970) have examined the use of penalty functions to resolve gaps, and they reviewed the relationship between the original problem and the augmented one along with proposed solution procedures.

Falk and Soland (1969) have proposed an algorithm to solve separable nonconvex programming problems where the constraints are only upper and lower bounds on the variables. Soland (1971) has extended the algorithm to include inequality constraints of a more general form. Both algorithms are of a branch and bound type and solve a sequence of problems with convex objective functions. These problems correspond to successive partitions of the feasible set. Greenberg (1973), in a recent publication, has provided a sharper lower bound on the optimum value with less computation using the Generalized Lagrange Multiplier method, and the Falk-Soland algorithm can be modified appropriately.

All the above methods have a major drawback: the separability of the objective function and the constraints, if it existed in the original problem, is destroyed after the proposed modifications. Only the last method reported by Greenberg (1969), which uses a branch and bound technique, preserves the separability, but it is only applicable to finite problems. Separability is a characteristic providing many advantages for the solution of a large system, and it is very desirable to preserve it. In applying the Lagrangian approach and structural sensitivity analysis (McGalliard and Westerberg, 1972) in engineering design problems, we want to resolve the dual gap problem while keeping the structural characteristics of the system.

In the present work an algorithm is developed which makes use of the penalty function approach together with a linear approximation of the nonseparable terms. The original problem is replaced by a sequence of problems, each one yielding a tighter dual bound on the optimum solution. The solutions to the above problems form a nondecreasing sequence of real numbers, bounded from above. Under certain conditions developed later on, this sequence of solutions converges to the solution of the original problem. The structural characteristics of the original system are preserved: each subproblem is solved separately.

II.2. Statement of the Problem and the Two-Level Procedure

As a basis for the description of the two-level optimization procedure, of the encountered dual "gaps", and of their proposed resolution, we will consider the following model, which represents many engineering system design problems (as opposed to resource allocation problems):

$I(j) = \{i;\ \text{stream } i \text{ is an input to subsystem } j\}$
$O(j) = \{i;\ \text{stream } i \text{ is an output of subsystem } j\}$
$x_i$ is a vector variable associated with the interconnecting stream i.
$u_j$ is a vector decision variable associated with subsystem j.

Define vector variable $q_j$ as

$$q_j = (x_{k_1}, x_{k_2}, \ldots \mid u_j), \qquad k_i \in I(j). \tag{II-1}$$

The transformation equations which connect the subsystems are

$$x_i = t_i(q_j), \qquad i \in O(j),\quad j = 1,\ldots,n. \tag{II-2}$$

The overall system return function is the sum of the subsystem return functions:

$$F = \sum_{j=1}^{n} f_j(q_j). \tag{II-3}$$

The system constraint set, excepting the interconnection constraints (II-2), is separable, i.e.,

$$q_j \in S_j = \{q_j \mid h_j(q_j) \le 0\}, \qquad j = 1,\ldots,n.$$

Thus, the overall optimization problem is

$$\begin{aligned}
\text{Minimize}\quad & F = \sum_{j=1}^{n} f_j(q_j) \\
\text{subject to}\quad & x_i = t_i(q_j), \quad i \in O(j),\ j = 1,\ldots,n \\
& q_j \in S_j, \quad j = 1,\ldots,n.
\end{aligned} \tag{II-4}$$

The Lagrange function for the problem described by (II-4) is given by

$$L = \sum_{j=1}^{n} f_j(q_j) + \sum_{j=1}^{n} \sum_{i \in O(j)} \lambda_i^T \left( t_i(q_j) - x_i \right), \tag{II-5}$$

where the $\lambda_i$ are Lagrange multipliers. Rearranging the terms in (II-5), the Lagrangian can be written as follows:

$$L = \sum_{j=1}^{n} \Big\{ f_j(q_j) + \sum_{i \in O(j)} \lambda_i^T t_i(q_j) - \sum_{i \in I(j)} \lambda_i^T x_i \Big\} = \sum_{j=1}^{n} \ell_j(q_j; \lambda). \tag{II-6}$$

For fixed $\lambda$ the problem of minimizing L becomes separable, and it is equivalent to solving the subproblems

$$\begin{aligned}
\text{Minimize}\quad & \ell_j(q_j; \lambda) \\
\text{subject to}\quad & q_j \in S_j,
\end{aligned} \qquad j = 1,\ldots,n. \tag{II-7}$$

Let us now define a dual function and a dual problem. The dual function is

$$h(\lambda) = \sum_{j=1}^{n} \Big( \min_{q_j \in S_j} \ell_j(q_j; \lambda) \Big), \tag{II-8}$$

and the dual problem is

$$\begin{aligned}
\text{Maximize}\quad & h(\lambda) \\
\text{subject to}\quad & \lambda \in D, \qquad D = \{\lambda;\ h(\lambda) \text{ exists}\}.
\end{aligned} \tag{II-9}$$

The two-level optimization procedure requires the following two distinct operations:

First level: Calculate $h(\lambda)$ by solving the n subproblems (II-7).
Second level: Adjust the multipliers, $\lambda$, so as to satisfy the interconnection constraints in (II-4).

In effect this procedure solves the dual problem described by (II-9).
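The two levels described above can be sketched on a small, purely hypothetical problem that is not taken from the dissertation. The example assumes two subsystems in series: subsystem 1 has decision u, output stream x = t(u) = u and return (u - 1)^2, and subsystem 2 has input stream x and return (x - 3)^2, with both variables confined to the box [0, 5]. The multiplier update by gradient ascent on h(λ), the step size and all function names are assumptions made for illustration; the text above only requires that the second level adjust λ until the interconnection constraints are satisfied.

```python
"""Two-level (dual decomposition) sketch on a hypothetical two-subsystem problem."""

LOWER, UPPER = 0.0, 5.0  # assumed box defining the sets S_1 and S_2


def clip(value, lower=LOWER, upper=UPPER):
    """Project a scalar onto the interval [lower, upper]."""
    return max(lower, min(upper, value))


def solve_subproblem_1(lam):
    # Minimize l_1(u; lam) = (u - 1)^2 + lam * t(u), with t(u) = u, over S_1.
    # The objective is a convex parabola, so clipping its stationary point is exact.
    u = clip(1.0 - lam / 2.0)
    return u, (u - 1.0) ** 2 + lam * u


def solve_subproblem_2(lam):
    # Minimize l_2(x; lam) = (x - 3)^2 - lam * x over S_2.
    x = clip(3.0 + lam / 2.0)
    return x, (x - 3.0) ** 2 - lam * x


def two_level(lam=0.0, step=0.5, iterations=60):
    for _ in range(iterations):
        # First level: solve the independent subproblems for the fixed multiplier.
        u, _ = solve_subproblem_1(lam)
        x, _ = solve_subproblem_2(lam)
        # Second level: the constraint residual t(u) - x is the slope of h(lam),
        # so stepping along it increases the dual function.
        lam += step * (u - x)
    u, l1 = solve_subproblem_1(lam)
    x, l2 = solve_subproblem_2(lam)
    return lam, u, x, l1 + l2


lam, u, x, dual_value = two_level()
primal_value = (u - 1.0) ** 2 + (x - 3.0) ** 2
print(f"lambda = {lam:.6f}, u = {u:.6f}, x = {x:.6f}")
print(f"dual value = {dual_value:.6f}, primal value = {primal_value:.6f}")
```

Because this toy problem is convex, the Lagrangian has a saddle point and the dual value meets the primal value once the residual vanishes. The nonconvex case, in which this agreement can fail, is the subject of the sections that follow.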

The important question concerning this procedure relates to the existence of a saddle point for the Lagrange function of the problem. The following theorems provide the theoretical basis and give some answers to the saddle point existence question. For further details and proofs of the theorems, see Lasdon (1970).

Theorem 1: Let $\lambda^0 \in E^m$. A point $(q^0, \lambda^0)$ is a constrained saddle point for $L(q, \lambda)$ if and only if
1) $q^0$ minimizes $L(q, \lambda^0)$ over S, and
2) $x_i^0 - t_i(q_j^0) = 0$, $i \in O(j)$, $j = 1,\ldots,n$.

Theorem 2: If $(q^0, \lambda^0)$ is a saddle point for L, then $q^0$ solves the primal problem described by (II-4).

Theorem 3: $h(\lambda)$ is concave over any convex subset of D.

Theorem 4: $h(\lambda) \le F(q)$ for all $q \in S$ such that $x_i - t_i(q_j) = 0$, $i \in O(j)$, $j = 1,\ldots,n$, and for all $\lambda \in D$.

II.3. Dual Gaps

The basic drawback of the Lagrangian approach, and therefore of the two-level optimization procedure, is its failure to find the solution of a problem when the solution is in a "dual gap". To provide additional insight into the relationship between the primal and the dual problems and to demonstrate the formation of gaps, we will give first a geometric interpretation of the procedure and then the theoretical justification of the failure of the two-level approach in certain cases (Lasdon, 1970; Rockafellar, 1967).

Consider the family of perturbed problems, with perturbations $z_i$:

$$\begin{aligned}
\text{Minimize}\quad & F(q) = \sum_{j=1}^{n} f_j(q_j) \\
\text{subject to}\quad & x_i - t_i(q_j) = z_i, \quad i \in O(j),\ j = 1,\ldots,n \\
& q_j \in S_j, \quad j = 1,\ldots,n.
\end{aligned}$$

The primal problem corresponds to $z_i = 0$, $i \in O(j)$, $j = 1,\ldots,n$. Assuming continuity of the objective function and of the constraint functions, let us define

$$w(z) = \text{minimum } \{F(q);\ x_i - t_i(q_j) = z_i,\ i \in O(j),\ j = 1,\ldots,n,\ q \in S\},$$

where $z^T = [z_i^T,\ \text{for all } i \in O(j),\ j = 1,\ldots,n]$. The domain of w is

$$Z = \{z;\ \text{there exists a } q \in S \text{ such that } x_i - t_i(q_j) = z_i,\ i \in O(j),\ j = 1,\ldots,n\}.$$

Consider now the set $R \subset E^{m+1}$,

$$R = \{(z_0, z);\ z_0 \ge w(z),\ z \in Z\}.$$

If w and Z are convex then R is convex. We shall call this space, containing R, the WZ space. The following theorem demonstrates the connection between duality and supporting hyperplanes for the set R in WZ; see Lasdon (1970).

Theorem 5: If $\bar{q} \in S$ and $\bar{\lambda} \in E^m$, then $\bar{q}$ minimizes $L(q, \bar{\lambda})$ over S if and only if

$$H = \{(z_0, z);\ z_0 - \bar{\lambda}^T z = L(\bar{q}, \bar{\lambda})\}$$

is a supporting hyperplane for the set R at the point $(F(\bar{q}), z)$ with $z_i = \bar{x}_i - t_i(\bar{q}_j)$, $i \in O(j)$, $j = 1,\ldots,n$.

Supporting hyperplanes exist for every point, and therefore for the solution point, if R is a convex set (Fig. 1). In the case of nonconvex R sets, there are regions consisting of constraint vectors that are not generated by any vector $\lambda$ (Fig. 2). Optimum solutions for constraints inside such inaccessible regions cannot be discovered by straightforward application of the two-level optimization procedure, and must be sought by other means. The following corollary (Greenberg, 1969) helps to anticipate the existence of such an inaccessible region, which will be termed a "dual gap".

[Figure 1. Geometric View of the Success of the Dual Approach]

Corollary: A dual gap arises if some choice of Lagrange multipliers $\lambda \in D$ produces at least two solutions to the Lagrangian problem (II-8) with distinct $z_i = x_i - t_i(q_j) \ne 0$, $i \in O(j)$, $j = 1,\ldots,n$.

II.4. Resolution of Dual Gaps Using the Method of Multipliers

Let us assume that the solution to problem (II-4) is in a dual gap. According to Theorem 5, the two-level approach fails to find the solution. The method of multipliers developed by Hestenes (1969) will be used to cure this shortcoming of the two-level method.

Let us make the following notational changes:

$$\begin{aligned}
g_{ij} &= x_i - t_i(q_j), \quad i \in O(j),\ j = 1,\ldots,n \\
g_j^T &= [g_{i_1 j}^T,\ g_{i_2 j}^T,\ \ldots,\ g_{i_m j}^T]; \quad j = 1,\ldots,n;\ i_k \in O(j);\ k = 1,\ldots,m \\
g^T &= [g_1^T,\ g_2^T,\ \ldots,\ g_n^T] \\
q^T &= [q_1^T,\ q_2^T,\ \ldots,\ q_n^T] \\
h^T &= [h_1^T,\ h_2^T,\ \ldots,\ h_n^T]
\end{aligned}$$

[Figure 2. Geometric View of the Failure of the Dual Approach to Yield the Primal Solution]

Then problem (II-4) is written as follows:

$$\begin{aligned}
\text{Min}\quad & F = \sum_{j=1}^{n} f_j(q_j) = F(q) \\
\text{s.t.}\quad & g(q) = 0 \\
& q \in S
\end{aligned} \tag{II-P1}$$

where $S = S_1 \times S_2 \times \cdots \times S_n$.

Consider now the following augmented problem:

$$\begin{aligned}
\text{Min}\quad & F_i^*(q, K_i) = F(q) + K_i\, g^T(q)\, g(q) \\
\text{s.t.}\quad & g(q) = 0 \\
& q \in S \\
& K_i > 0.
\end{aligned} \tag{II-P2}$$

Let us now examine the relationship between problems (II-P1) and (II-P2). Let us make the following assumptions:

A-1: The objective function F(q) and the constraints g(q) are of class $C''$.
A-2: The set S is compact and nonempty.

Theorem 6 (Hestenes, 1969): Let $q_0$ be a nonsingular solution to the problem (II-P1); then there exist a multiplier vector $\lambda$ and a constant $K_i$ such that the point $q_0$ is an unconstrained minimum of the Lagrangian $L_i^*$ of problem (II-P2).

Proof: Since $q_0$ is the solution of the problem (II-P1), the following conditions are true:

$$g(q_0) = 0, \qquad \left.\frac{\partial L}{\partial q}\right|_{q_0} = 0, \qquad \text{where } L = F(q) - \lambda^T g(q) - \nu^T h(q),$$

and

$$\Delta L = \Delta q^T \left.\frac{\partial^2 L}{\partial q^2}\right|_{q_0} \Delta q \ge 0$$

for all permissible variations $\Delta q$ such that

$$\Delta g = \frac{\partial g}{\partial q}\,\Delta q = 0 \quad \text{and} \quad \nu_j^T \Delta h_j \ge 0, \qquad j = 1, 2, \ldots, n.$$

Note: The Kuhn-Tucker multipliers $\nu_j$ are always nonpositive at the solution to (II-P1).

The Lagrangian of the problem (II-P2) is

$$L_i^* = L + K_i\, g^T(q)\, g(q).$$

At the point $q_0$:

$$\left.\frac{\partial L_i^*}{\partial q}\right|_{q_0} = \left.\frac{\partial L}{\partial q}\right|_{q_0} = 0
\qquad \text{and} \qquad
\Delta L_i^* = \Delta L + K_i\, \Delta q^T \left.\frac{\partial^2 (g^T g)}{\partial q^2}\right|_{q_0} \Delta q.$$

For permissible variations $\Delta q$ such that $\Delta g = 0$ and $\nu_j^T \Delta h_j \ge 0$, $\Delta L_i^* \ge 0$; for nonpermissible variations $\Delta q$ it is possible that $\Delta L < 0$, but there exists a $K_i > 0$ such that $\Delta L_i^* \ge 0$, and the proof is complete.

From Theorem 5 we conclude that there exist a vector $\lambda$ and a $K_i > 0$ such that the point $(q_0, \lambda)$ is a saddle point for the Lagrangian $L_i^*$ of the augmented problem (II-P2). Therefore the problem (II-P1) can be solved using the two-level method, if it is augmented by the penalty term $K_i\, g^T(q)\, g(q)$.

Define:

$$w(z) = \min_{q \in S} \{F(q) : g(q) = z\}, \qquad w_i^*(z) = \min_{q \in S} \{F_i^*(q, K_i) : g(q) = z\}.$$

Let us examine now how the value of $K_i$ affects $w_i^*(z)$ and under what conditions a proper choice of $K_i$ yields a supporting hyperplane for the set $R_i^*$, where

$$R_i^* = \{(z_0, z) : z_0 \ge w_i^*(z),\ z \in Z\},$$

at the solution point $(w_i^*(0), 0)$. The following theoretical treatment will demonstrate the effect of the penalty term on the resolution of dual gaps via a geometrical representation.

Lemma 1: $w(z) < w_i^*(z)$ for $z \ne 0$ and $w(0) = w_i^*(0)$. More generally, if $K_j > K_i$ then $w_j^*(z) > w_i^*(z)$ for $z \ne 0$ and $w_j^*(0) = w_i^*(0)$.

Proof: The equality for z = 0 is direct since the penalty term is then zero. Let $\hat{q}$ minimize F(q) with g(q) = z; then $\hat{q}$ minimizes $F_i^*$ with g(q) = z, because if this is not true and $\tilde{q}$ minimizes $F_i^*$ with g(q) = z, then

$$F(\tilde{q}) + K_i\, g^T(\tilde{q})\, g(\tilde{q}) < F(\hat{q}) + K_i\, g^T(\hat{q})\, g(\hat{q})$$

and consequently

$F(\tilde{q}) < F(\hat{q})$, which is not true since by assumption $\hat{q}$ minimizes F(q) with g(q) = z. Similarly:

$$\min_{q \in S} F_j^*(q, K_j) = \min_{q \in S} \{F(q) + K_j z^T z\} > \min_{q \in S} \{F(q) + K_i z^T z\} = \min_{q \in S} F_i^*(q, K_i).$$

Therefore $w_j^*(z) > w_i^*(z)$ for $z \ne 0$. Q.E.D.

Lemma 1 implies that the curve w(z) vs. z (in the one-dimensional case) is moved upwards as $K_i$ increases, keeping always the same value at z = 0.

Lemma 2: If w(z) is a continuous function with continuous first derivatives, $K_j > K_i$, and

$$H(K_i) = \max_{\lambda} h_i^*(\lambda) = \max_{\lambda} \min_{q \in S} L_i^*(q, \lambda, K_i) = \max_{\lambda} \min_{q \in S} \left[ F_i^* - \lambda^T g(q) \right],$$

then $H(K_j) \ge H(K_i)$, and $H(K_j) = H(K_i)$ only at the solution of the problem (II-P1).

Proof: Let $H(K_i) = F_i^*(q_{(i)}) - \lambda_{(i)}^T g(q_{(i)})$ [the intermediate expressions are illegible in the source], and consider the position of the point $(F_j^*(q_{(j)}), g(q_{(j)}))$ relative to the hyperplane $\{(z_0, z) \mid z_0 - \lambda_{(i)}^T z = H(K_i)\}$. Let us first examine the case where it lies below the hyperplane. Then

$$(F_j^*(q_{(j)}),\ g(q_{(j)})) \notin R_i^*, \qquad \text{where } R_i^* = \{(z_0, z) \mid z_0 \ge w_i^*(z),\ z \in Z\}.$$

But

$$(F_j^*(q_{(j)}),\ g(q_{(j)})) \in R_j^*, \qquad \text{where } R_j^* = \{(z_0, z) \mid z_0 \ge w_j^*(z),\ z \in Z\},$$

and because of Lemma 1, $R_j^* \subset R_i^*$, the boundaries of the two sets having only the point $(w(0), 0)$ in common. Thus

$$(F_j^*(q_{(j)}),\ g(q_{(j)})) \in R_i^*,$$

which contradicts the above result. Therefore $H(K_j) \ge H(K_i)$. In the case that the point $(F_j^*(q_{(j)}), g(q_{(j)}))$ lies on the hyperplane, because the boundaries of $R_j^*$ and $R_i^*$ meet only at $(w(0), 0)$, we conclude that $H(K_j) = H(K_i)$ only at the solution of (II-P1).

Theorem 7: If w(z) is finite for all $q \in S$ and is continuous with continuous first derivatives and existing finite one-sided directional second derivatives in any direction s, at z = 0, then there exists a finite $K_0$ such that for all $K > K_0$, $H(K) = H(K_0)$ = primal solution.

Proof: Since w(z) is continuous with continuous first derivatives at z = 0, there exists a hyperplane tangent to w(z) at the point $(w(0), 0)$. This is not a supporting hyperplane for the set R, since there is no saddle point for the Lagrangian of the problem (II-P1) inside the dual gap. This hyperplane is described by

$$z_0 - \lambda^T z = w(0), \qquad \text{with some } z_0 > w(z).$$

Now we have to find a $K_i$ such that the above hyperplane is a supporting hyperplane for the set $R_i^*$, i.e.,

$$z_0 = w(0) + \lambda^T z \le w_i^*(z).$$

We note that

$$w_i^*(z) = \min_{q \in S} \{F(q) + K_i\, g^T(q)\, g(q) \mid g(q) = z\}$$

Since g(q) = z throughout this minimization, $w_i^*(z) = w(z) + K_i z^T z$; therefore,

$$w(z) + K_i z^T z - \lambda^T z \ge w(0),$$

and there must be a $K_i$ such that

$$K_i \ge \frac{w(0) - w(z) + \lambda^T z}{z^T z} \qquad \text{for all } z \in Z = \{z \mid \exists\, q \in S,\ g(q) = z\}.$$

For $z \ne 0$ and with $w(z) > -\infty$ for $z \in Z$, a finite value of $K_i$ exists, say a. The only potential problem might occur at z = 0. We can use L'Hospital's rule to find the limit as $z \to 0$ along any direction s, namely

$$\lim_{z \to 0} a = \lim_{\beta \to 0} \frac{w(0) - w(\beta s) + \lambda^T (\beta s)}{(\beta s)^T (\beta s)} = \lim_{\beta \to 0} \left[ -\frac{1}{2\, s^T s}\, \frac{\partial^2 w(\beta s)}{\partial \beta^2} \right] = \text{finite},$$

if this limit exists. If the limit does not exist, then the finiteness of the one-sided directional second derivatives in any direction s, at z = 0, yields

$$\lim_{\beta_j \to 0} a = a_j = \text{finite}, \qquad \text{along the direction } s_j.$$

Therefore choose $K_i$ such that

$$K_i > \max\,[a,\ a_j \text{ for every direction } s_j] = \text{finite}.$$

Therefore there exists a finite $K_0 = K_i$ such that the hyperplane tangent to w(z) at the point $(w(0), 0)$, with a slope $\lambda$, is a supporting hyperplane for the set $R_i^*$. Therefore the dual solution $H(K_0)$ equals the primal solution.
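The behaviour stated in Lemma 2 and Theorem 7 can be checked numerically on a deliberately simple scalar problem. The sketch below is a hypothetical illustration and not one of the dissertation's examples: it assumes F(q) = -q^2, g(q) = q and S = [-1, 1], so that w(z) = -z^2 is concave, the tangent hyperplane at z = 0 has slope λ = 0, and the ratio (w(0) - w(z) + λz)/z² equals 1 for every z ≠ 0. The grids, the brute-force minimization and the function names are assumptions made for illustration.

```python
"""Dual gap and its removal by a quadratic penalty, on a hypothetical scalar problem.

Primal problem: minimize F(q) = -q^2 subject to g(q) = q = 0, q in S = [-1, 1].
The primal solution is q = 0 with F = 0.
"""

Q_GRID = [-1.0 + i / 200.0 for i in range(401)]    # points of S, includes q = 0
LAM_GRID = [-3.0 + j / 20.0 for j in range(121)]   # trial multipliers, includes 0


def augmented_lagrangian(q, lam, K):
    # L*(q, lam, K) = F(q) + K g(q)^2 - lam g(q), with g(q) = q.
    return -q * q + K * q * q - lam * q


def dual_function(lam, K):
    # h*(lam): minimum of the augmented Lagrangian over S (brute-force grid search).
    return min(augmented_lagrangian(q, lam, K) for q in Q_GRID)


def best_dual_bound(K):
    # H(K): maximum of h*(lam) over the trial multipliers.
    return max(dual_function(lam, K) for lam in LAM_GRID)


bounds = {K: best_dual_bound(K) for K in (0.0, 0.5, 1.0, 2.0)}
for K, H in bounds.items():
    print(f"K = {K:3.1f}   H(K) = {H:6.3f}")
```

With K = 0 the best dual bound lies strictly below the primal value of zero, which is the dual gap. The bound rises with K and reaches the primal value once K reaches the threshold given by the ratio above (here 1), after which further increases of K leave it unchanged.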

PAGE 39

Now, for any $K > K^*$, $H(K) \ge H(K^*)$, but it cannot be that $H(K) > H(K^*)$ since that leads to $H(K) >$ primal solution. Therefore $H(K) = H(K^*)$.

If $\partial w(z)/\partial z$ is discontinuous and $w(z)$ is continuous at $z = 0$, then $K$ has to be $+\infty$, and the dual approach fails to give a solution for the problem (II-P2); but as $K$ increases, approaching $\infty$, it produces tighter lower bounds on the primal solution. The method can fail altogether if $w(z)$ is discontinuous at $z = 0$.

II.5. Computational Separability

The previously described resolution of the dual gaps suffers from a serious drawback: the separability which existed in problem (II-P1) has been destroyed in problem (II-P2) because of the crossproduct terms in the penalty term. Separability preserves the structural characteristics of a system and induces properties which are always desired in solving a large-scale problem such as many of those in engineering design. In the following paragraphs an algorithm is proposed which resolves the dual gap while preserving a computational separability of the system.

Using the initial notation, consider the penalty term

$$K_m (g(q) - z)^T (g(q) - z) = K_m \sum_{j=1}^{n} \sum_{i \in O(j)} [t_i(q_j) - x_i - z_i]^T [t_i(q_j) - x_i - z_i]$$

$$= K_m \sum_{j=1}^{n} \sum_{i \in O(j)} \sum_{r} \left[ t_{ir}^2 + x_{ir}^2 + z_{ir}^2 - 2 t_{ir} z_{ir} - 2 t_{ir} x_{ir} + 2 x_{ir} z_{ir} \right]$$

where $z$ is the deviation from satisfying the constraints. Under

PAGE 40

23 appropriate assumptions it has been seen that for large enough K , the solution of the problem can occur at z. =0 for all icO(j) and j=l,...,n. Each term of the above triple summation consists of separable 2 2 terms tt (q-)' x. , -2t. (q.)z, and -2x. z, and a crossproduct -2t. (q.)x. v.'hich is not separable. Expand the crossproduct in a ir^j ir '^ Taylor Series and consider the follovn'ng linear approximation around the point, x. , t. = t. (q ) ^ p ir ir ir J t. (q.)x. == t. X. + t. X. + t. X. 1 r J 1 r 1 r 1 r "i r ir i r i r Then the Lagrangian of the problem (II-P2) with constraints g(q) = z (instead of g(q) = 0) and Lagrange multipliers u takes the following approximate separable form: m >T J-I 2 , 2 f.(q.) + K y ) [t. +z. -2t. z. +2t. X. -2t. x. ] J J m T^o(j) p=] ir ir Tr ir ir ir ir ir-" r. 1 n + K y y [x -2t. X. -2x. z. ] m . v / -x '^T ir ir ir ir ir-^ lelj) r=l icO(j) ' ' ' i£l(j) ' ' y a (q-.ii) . J-1 mj J Thus the problem of minimizing this form for ^ for fixed K and y is ^ ^ m m ^ equivalent to solving the subproblems:

PAGE 41

$$\min_{q_j} \ \ell_{mj}(q_j, \mu) \quad \text{s.t. } q_j \in S_j, \qquad j = 1, \ldots, n.$$

II.6. The Algorithm

Using the multiplier method (Section II-4) along with the above-mentioned linear approximation for the crossproduct terms, the following algorithmic procedure is developed, which resolves the dual gap while preserving a computational separability of a large-scale system. In Section II-7 a minimum principle is developed which completes the theoretical foundation on which the algorithm is based.

Step 1: Assume a value for the penalty constant $K_m$.
Step 2: Assume values for the Lagrange multipliers $\lambda$ (we shall relate these to $\mu$ in Step 4).
Step 3: Assume a point $(\bar{x}_i, \bar{t}_i(\bar{q}_j))$ for each $i \in O(j)$ and $j = 1, 2, \ldots, n$.
Step 4: Put $z_i = \bar{t}_i(\bar{q}_j) - \bar{x}_i$ and $\mu_i = \lambda_i - 2 K_m z_i$, in agreement with eq. (II-16). Form the subproblems $\ell_{mj}$ based on the linear approximation for the crossproduct term.
Step 5: Find $q_j$ for $j = 1, 2, \ldots, n$ which solves $\min_{q_j \in S_j} \ell_{mj}$.
Step 6: Update $\bar{x}_i$, $\bar{t}_i(\bar{q}_j)$ and iterate from Step 4 until $z_i = t_i(q_j) - x_i$.

PAGE 42

Step 7: Update the Lagrange multipliers $\lambda$ and go to Step 3 until the Max Min has been attained. Check the constraints. If satisfied stop, otherwise update $K_m$ and go to Step 2.

The algorithm described above requires that a sequence of Max-Min problems be solved, each possibly requiring a large number of iterations. Consequently the total number of iterations for convergence may become excessively large for practical applications. In order to accelerate convergence, we have adopted the modifications of the updating rules proposed by Miele et al. (1972) for the method of multipliers. The adopted modifications are the following:

(i) Shorten the length of a cycle of computations. A cycle of computations is defined to be the sequence of iterations in which the multiplier $\lambda$ and the penalty constant $K$ are held unchanged, while the vector $q_j$ for the subsystem $j$ is viewed as unconstrained. The number of iterations permitted in each cycle depends on the unconstrained optimization technique which is employed: $\Delta N$ = number of iterations = 1 for the ordinary gradient algorithm and the modified quasi-linearization algorithm (Miele et al., 1972), and $\Delta N$ = dimension of the vector $q_j$ for the conjugate-gradient algorithm.

(ii) Improve the estimate of the Lagrange multipliers $\lambda$, using the formula

$$\lambda^{(m+1)} = \lambda^{(m)} - 2 \beta\, g(q)$$

where $\beta$ is a scalar parameter determined so as to produce some optimum effect, namely, to minimize the error in the necessary conditions for optimality.

PAGE 43

26 (iii) Select the penalty constsnt K in an appropriate fashion. The method used to select K depends largely on the unconstrained optimization method which is employed. If the ordinay^y gradient method is employed, the penalty constant is given by the formula K^ = 2P(q)/P;j(q) P^lq) , T 3P(q) where P(q) =^ g (q)g(q) and P (q) = — ^. If the conjugate-gradient q dq or the modified quasi -1 inearizati on algorithm is used, the penalty constant is updated by the formula ^^^^ = min (K jS^^) if P(q) < Q(q,,\) ni where: and K^""^^^ max (K ,ttK^^^) if P(q) > Q(q,X) m m ^ ^ " ^^^ ' Q(q,A) = F'J(q,A) F (q,A) , F (q,A) = § " ^ A,ti > 1 q q q aq oq Mai 9q ) g(q) 9q /Pq(q) Pq(q) See Miele et^ a]_. (1972) for further information on these rules. II . 7 A Discrete Minimum Principle In this section we shall give a theoretical justification for the algorithm given in Section 1 1. 6. Consider the following two Lagrangian functions : L* F(q) + K^^g"^(q)g(q) A"^g(q) (11-10)

PAGE 44

and

$$\mathcal{L}^* = F(q) + K_m (g(q) - z)^T (g(q) - z) - \mu^T (g(q) - z) \qquad \text{(II-11)}$$

Geometrically, (II-11) is the Lagrange function resulting from moving the origin to $z$ in the WZ space and adding the penalty term for deviation from that new origin. Theorem 8 indicates a relationship which exists between them.

Theorem 8: If $q$ solves the problem

$$h^*(\lambda) = \min_{q \in S} L^*(q, K_m, \lambda) \qquad \text{(II-P3)}$$

resulting in

$$g(q) = z \qquad \text{(II-12)}$$

then $q$ also solves, for $K_m$ and $z$ fixed at these same values, the problem

$$\min_{q \in S} \{ \mathcal{L}^*(q, K_m, \mu, z) \mid g(q) = z \}. \qquad \text{(II-P4)}$$

Conversely, if one can find a $z$ which permits (II-P4) to have a solution, say $q$, then $q$ solves a problem of form (II-P3).

Proof: Suppose that $q$ solves (II-P3) and results in $g(q) = z$. Then by Everett's main theorem (Everett, 1963), $q$ solves the problem

$$\min_{q \in S} \{ F(q) + K_m g^T(q) g(q) \mid g(q) = z \} \qquad \text{(II-13)}$$

which is equivalent to solving the problem

PAGE 45

$$\min_{q \in S} \{ F(q) \mid g(q) = z \} \qquad \text{(II-14)}$$

since the term $K_m g^T(q) g(q) = K_m z^T z$ is constant for $z$ fixed. Problem (II-P4) is equivalent to problem (II-14); thus $q$ solves problem (II-P4).

If $z$ is fixed and permits (II-P4) to have a solution, say $q$, then one has solved the problem

$$\min_{q \in S} \{ F(q) + K_m g^T(q) g(q) - (2 K_m z + \mu)^T g(q) + K_m z^T z + \mu^T z \}$$

which is equivalent to solving the problem

$$\min_{q \in S} \{ F(q) + K_m g^T(q) g(q) - \lambda^T g(q) \} \qquad \text{(II-15)}$$

where

$$\lambda = 2 K_m z + \mu. \qquad \text{(II-16)}$$

Problem (II-15) is a problem of form (II-P3), and consequently $q$ solves a problem of form (II-P3).

Theorem 8 says problems (II-P3) and (II-P4) are equivalent, in that they give rise to the same $(z, q)$ values. The following corollary follows directly from this observation.

Corollary 8.1: If and only if $z$ permits (II-P4) to have a solution, then the set $R$ corresponding to problem (II-P3) has a supporting hyperplane at the point $(w^*(z), z)$ with a "slope" given by (II-16).

The following corollary also follows directly.

PAGE 46

Corollary 8.2: If $z$ permits (II-P4) to have a solution, say $q$, then

$$h^*(\lambda) = \mathcal{L}^*(q, K_m, \mu, z) + K_m z^T z - \lambda^T z \qquad \text{(II-17)}$$

where $\lambda$ is given by (II-16).

Proof: If $z$ permits (II-P4) to have a solution, $q$, then

$$g(q) = z \qquad \text{(II-18)}$$

and

$$\mathcal{L}^*(q, K_m, \mu, z) = F(q). \qquad \text{(II-19)}$$

Equation (II-17) follows immediately from (II-10), (II-18) and (II-19). Q.E.D.

Note that Corollary 8.2 gives us a formula to calculate a dual bound, $h^*(\lambda)$, to $w(0)$ if we have in fact solved (II-P4) rather than (II-P3).

The result following exposes some very useful properties of problem (II-P4) for the type of engineering design problems being considered in this paper. It also constitutes the theoretical reasoning of the algorithm presented in Section II-6.

Result: If the point $q = [q_1, \ldots, q_n]$ and the multipliers $\mu$ solve problem (II-P4), then each subproblem $j$, $j = 1, \ldots, n$, resulting after the Taylor series linear approximation of the crossproduct terms is minimized with respect to the corresponding $q_j$ at the point $q_j$.

Proof: Consider a system of two stages and the following family of problems:

PAGE 47

30 Mill F f-,(x, ,u-,) + t' (x„,u„) subject to: x^ t, (x,,u,) = z h^(x^,u^) < h^{x„,u^) c here zeZ' = {zlx„ t.(x-,,LiJ = z, h-,(x, ,u,) < 0, and h„(x„,u„) < 0} '2 n^'^T 1 r 1 2^"2' 2^ The Lagrangian function for the augmented problem is = f^(x-|.u^)+ f^{x.^,u,^) + K^_[x2 t^(x^,U|) z] [x2-t^(x-|^u^)z] ij [x^ t^(x-|,u^) z] Vih^(x^ ,u^) v,^^^{x^^,u^) = L + K,^[x2 t-|(x^,u^) z] [X2 t^(x^,u^) z] , v;here v, and v„ are Kuhn-Tucker multipliers. * Suppose that '^ is minimized at the point (x, ,0-, ,x„,u„) . At this point the Hessian matrix of the second derivatives of ^ must be positive m definite, i.e.,

PAGE 48

'-i CM +

PAGE 49

32 niust he positive definite. Note that A contains terms which are not separable, each of which con tains the term x^ t-| z. Since u and (x, ji-, jX^^/u^) minimize with Xg t,(x-,,u,) z =0, the t^atrix A reduces to the following lorm: ^^1 ot, oL.x2 3X, .^T r^C.< Ot -J Ot 1 dxjsu^ "^ '^h 3xj 2K 9t^ ^1 , „ T -^m 3X-, . T C 1 o U -J I d U 1 3u2' ' >" 3^ 3uj 3t 2K 3U 2K 3t^ dt 2K rr, cU ^ + 2r( I --^-^ 0A„ oX^oUp A 3xj9u2 ^2, 3 L Since A inust be positive definite the principal minor matrices of A rr.ust be positive definite, i.e. A, V"i .2, 3t-, 3t, ^ + 2K v-^ -4 3x2 ^ 3><1 3xj ^2. ot-i ot-i -4-^ + 2K -^-4 3x^3u^ "^ ^^1 3x| p,2, atn St-, )x^3u] '^ ^^1 9u^ g2|^ St-j 3t-| K" "f" 2K "K~" — ^ ^,,2 m 3Ui T dU-, 1 cU-, positive definite and

PAGE 50

33 X^.Ur, .4 2K I m 3x^3u„ -A 9X2SU2 positive definite Consider now the sub-Lagrangians i , and Z ^. After the linear T approximation of the crossproduct terms -2K x^t. (x-, .u-, ) , they take the following form: "ml ^ ^V^r^^ "^ t^(x-j.u^) vjh^(x-|,u^) + K^^tj(x^,u^)t^(x^,u^) 2K x^tJx^.u,) 2K z"^t., m 2 p 1 r m 1 -T. T, <2 02n2\-2'"2' ^m2 " ''2(^2'^2^ "^ ^'-"^^ " ''-'o^'?^^')'^')) + '^m>^9X9 " ^K^t^tx^ ,Ut)x, + 2K,,z'> V2^2 ^.J(x^,G^)x2 + 2k/x2 where x,,U-,,X2 is the point around which the Taylor Series expansicr takes place. At the point x-,,u,,x„,u the following conditions are satisfied: 3x, 3x, = dl ml 3a, 3Ut 3Un = m2 _ m dXo 3x^ and m2 _ m 9u2 9 Up = Also the Hessian matrices of I . and I ^, which are precisely A-, and A„ ml m2 ^ -'I 2 respectively, are positive definite. Thus we conclude that Z -, and £ ^.^

PAGE 51

are minimized at the point $(\bar{x}_1, \bar{u}_1, \bar{x}_2, \bar{u}_2)$ which minimizes $\mathcal{L}^*$. By induction the same result is found to apply to any number of stages. Since the primal problem (II-P1) corresponds to that member of the class of problems (II-P1') with $z = 0$, the above-mentioned result applies to (II-P1) for a proper $K$ and $\mu$.

II.8. Discussion

We should note the following two observations, which constitute the essence of the proposed algorithm:

1) The use of a Taylor series expansion really provided two results. The first is, of course, that the problem becomes separable. The second is evident only when one realizes that other devices could effect the separability feature; for example, one could rewrite the crossproduct term $-2 x_{ir} t_{ir}$ as $-2 x_{ir} (x_{ir} + z_{ir})$, which is also separable for fixed $z_{ir}$. Unfortunately, in this form, in contrast to the proposed one, the subproblem for the unit containing $x_{ir}$ as a variable becomes a concave minimization in $x_{ir}$ as $K$ increases.

2) For the Taylor series expansion approach, we have found that problem (II-P4), and not (II-P3) to which it is equivalent, has the desired property that the subproblems must always be minimized if the overall problem is minimized, even if $K_m$ is not sufficiently large.

We should also note that the convex quadratic terms $t_{ir}^2$ and $x_{ir}^2$ produced by the penalty term will tend to improve and could dominate the behavior of the subproblems as $K$ increases, making them easier to solve.

The introduction of the quadratic terms in the objective function also offers another advantage. It desensitizes the dual function with

PAGE 52

respect to the multipliers $\lambda$. For given multipliers $\lambda$,

$$h^*(\lambda) = \min_{x_i, u_i,\ i = 1, \ldots, n} L^*$$

is closer to the primal solution than

$$h(\lambda) = \min_{x_i, u_i,\ i = 1, \ldots, n} L$$

(see Figure 3). This is a characteristic which can be of importance for a dual bounding procedure, such as the one used in structural sensitivity analysis (McGalliard and Westerberg, 1972).

PAGE 53

Figure 3. Improvement of the Dual Bound $h^*(\lambda)$ for the Penalized Problem Over the Dual Bound $h(\lambda)$ for the Unpenalized Problem for the Same Multipliers $\lambda$
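The behavior sketched in Figure 3, together with the closing of the dual gap for a large enough penalty constant (Theorem 7), can be reproduced numerically. The following Python sketch uses a hypothetical one-variable problem that does not appear in this work: minimize $F(q) = -q^2$ over $S = [-1, 2]$ subject to $g(q) = q = 0$, so that the primal solution is $w(0) = 0$. The dual functions are evaluated by brute force on a grid; the grid sizes and the value $K = 2$ are arbitrary assumptions made only for illustration.

```python
import numpy as np

# Hypothetical toy problem (an assumption, not taken from the thesis):
#   minimize F(q) = -q**2 over S = [-1, 2] subject to g(q) = q = 0.
q_grid = np.linspace(-1.0, 2.0, 3001)     # discretized set S
lam_grid = np.linspace(-5.0, 5.0, 1001)   # trial multipliers lambda


def dual_function(K):
    """Return min over q in S of F(q) + K*g(q)**2 - lam*g(q) for each lam on the grid."""
    F = -q_grid**2
    g = q_grid
    # Rows index the multipliers, columns index the points of S.
    lagrangian = (F[None, :] + K * g[None, :] ** 2
                  - lam_grid[:, None] * g[None, :])
    return lagrangian.min(axis=1)


h_plain = dual_function(K=0.0)   # ordinary dual function h(lambda)
h_pen = dual_function(K=2.0)     # penalized dual function h*(lambda)

primal = 0.0                     # w(0) = F(0)
print("best unpenalized dual bound:", h_plain.max())
print("best penalized dual bound:  ", h_pen.max())
print("primal solution:            ", primal)
```

For this toy problem the unpenalized Lagrangian is concave in $q$, so its minimum always sits at an end point of $S$ and the best dual bound is $-2$, leaving a dual gap of 2. With $K = 2$ the penalized Lagrangian is convex in $q$ and the bound reaches the primal value. For every fixed multiplier the penalized bound is at least as large as the unpenalized one, which is the improvement the figure depicts.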

PAGE 54

CHAPTER III

A STRONG VERSION OF THE DISCRETE MINIMUM PRINCIPLE

The discrete form of Pontryagin's Minimum Principle proposed by a number of authors has been shown by others in the past to be fallacious; only a weak result can be obtained. Due to the mathematical character of the objective function and the stage transformation equations, only a small class of chemical engineering problems have been solved by the strong discrete minimum principle. This chapter presents a method to overcome the previous shortcomings of the strong principle. An algorithmic procedure is developed which uses this new version. Numerical examples are provided to clarify the approach and demonstrate its usefulness.

III.1. Review of Previous Works

Pontryagin's minimum principle (Pontryagin et al., 1962) is a well-known method to solve a wide class of extremal problems associated with given initial conditions. A discrete analog of the minimum principle, where the differential equations are substituted by difference equations, is not valid in general but only in certain almost trivial cases. Rozonoer (1959) first pointed out this fact. Katz (1962) and Fan and Wang (1964) later on developed a discrete minimum principle which was shown to be fallacious by Horn and Jackson (1965a), by means of simple counterexamples. As was pointed out by Horn and Jackson

PAGE 55

(1965a,b) and lucidly presented by Denn (1969), the failure of a strong minimum principle lies in the fact that we cannot deduce the nature of the stationary values of the Hamiltonian from a consideration of first-order variations only. Inclusion of the second-order terms does not help to draw a general conclusion about the nature of the stationary points in advance. A weak minimum principle, which relates the solution of the problem to a stationary point of the Hamiltonian, exists and is valid (Horn, 1961; Jackson, 1964).

In the case of control systems described by differential equations, time, by its evolution on a continuum, has a "convexifying" effect (Halkin, 1966), which makes it unnecessary to add convexity assumptions to the specification of the problem. Thus a strong minimum principle can be applied for these problems, requiring the minimization of the Hamiltonian even in the case that a continuous problem is solved by discretizing it with respect to time and using a strong discrete minimum principle. For discrete, staged systems described by difference equations, the evolution of the system does not have any "convexifying" effect and, in order to obtain a minimum principle, we must add some convexity assumptions to the problem specification or reformulate the problem in an equivalent form which inherently possesses the convexity assumptions. The present work belongs to the second class.

In the present chapter we propose to show a strong version of the minimum principle which relates the solution of the problem to a minimum point, rather than a stationary point, of the Hamiltonian. This is attained through the use of Hestenes' method of multipliers, a technique used effectively in Chapter II to resolve the dual gaps

PAGE 56

of the two-level optimization method. This method turns a stationary point of the Hamiltonian into a minimum point; thus minimum-seeking algorithms can be used.

III.2. Statement of the Problem and Its Lagrangian Formulation

As a basis for the description of the minimum principle and the two-level optimization approach, their success and failure, and the development of a strong minimum principle, consider the following sequential unconstrained problem. (Constraints and recycles do not change the following results (Westerberg, 1973), and we want to keep the presentation here as simple as possible.)

$$\min F = \sum_{i=1}^{N} \phi_i(x_i, u_i) \qquad \text{(III-P1)}$$

subject to

$$x_{i+1} = f_i(x_i, u_i), \quad i = 1, \ldots, N$$

$$x_1 = x^0 \ \text{(given)}$$

For every $i = 1, \ldots, N$ the vector valued function $f_i(x_i, u_i)$ is given and satisfies the following conditions:

a. the function $f_i$ is defined for all $x_i$ and $u_i$,
b. for every $u_i$ the function $f_i(x_i, u_i)$ is twice continuously differentiable with respect to $x_i$,
c. the $f_i(x_i, u_i)$ and all its first and second partial derivatives are uniformly bounded.

These conditions correspond to the usual "smoothness" assumptions.

PAGE 57

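The Lagrangian function for this problem (III-P1) is given by

$$L = \sum_{i=1}^{N} \{ \phi_i(x_i, u_i) - \lambda_i^T x_i + \lambda_{i+1}^T f_i(x_i, u_i) \} + \lambda_1^T x^0 - \lambda_{N+1}^T x_{N+1} = \sum_{i=1}^{N} \ell_i(x_i, u_i, \lambda_i, \lambda_{i+1}) + \lambda_1^T x^0 - \lambda_{N+1}^T x_{N+1}. \qquad \text{(III-1)}$$

The solution to the problem (III-P1) is a stationary point of the Lagrangian function $L$. The necessary conditions for a stationary point of the Lagrangian are:

$$\frac{\partial L}{\partial x_i} = \frac{\partial \phi_i}{\partial x_i} - \lambda_i + \frac{\partial f_i^T}{\partial x_i} \lambda_{i+1} = 0, \quad i = 1, \ldots, N \qquad \text{(III-2)}$$

$$\frac{\partial L}{\partial u_i} = \frac{\partial \phi_i}{\partial u_i} + \frac{\partial f_i^T}{\partial u_i} \lambda_{i+1} = 0, \quad i = 1, \ldots, N \qquad \text{(III-3)}$$

$$\frac{\partial L}{\partial \lambda_{i+1}} = f_i(x_i, u_i) - x_{i+1} = 0, \quad i = 1, \ldots, N \qquad \text{(III-4)}$$

From equation (III-2) we have the defining equations for the multipliers,

$$\lambda_i = \frac{\partial \phi_i}{\partial x_i} + \frac{\partial f_i^T}{\partial x_i} \lambda_{i+1} \qquad \text{(III-5)}$$

with the natural boundary condition,

$$\lambda_{N+1} = 0 \qquad \text{(III-6)}$$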

PAGE 58

Equation (III-4) simply necessitates the satisfaction of the connection constraints. The Lagrangian approach constitutes a unifying and general presentation of the necessary conditions which must be satisfied at the solution of a problem. For the solution of the necessary conditions different strategies have been developed. In a tutorial presentation (Westerberg, 1973) the relationship of the different strategies to solve problem (III-P1) with the Lagrangian approach is established, and it is shown that methods such as sensitivity analysis, the discrete minimum principle, and the two-level optimization method are simply different techniques to solve the same necessary conditions, eqs. (III-1), (III-2) and (III-3).

Let us define the stage Hamiltonian $H_i$ as follows:

$$H_i = \phi_i(x_i, u_i) + \lambda_{i+1}^T f_i(x_i, u_i), \quad i = 1, \ldots, N \qquad \text{(III-7)}$$

and the overall Hamiltonian by:

$$H = \sum_{i=1}^{N} H_i + \lambda_1^T x^0.$$

Then, with $\lambda_{N+1} = 0$,

$$L = \sum_{i=1}^{N} H_i - \sum_{i=1}^{N} \lambda_i^T x_i + \lambda_1^T x^0$$

and the necessary conditions (III-2), (III-3) yield:

$$\frac{\partial L}{\partial x_i} = \frac{\partial H_i}{\partial x_i} - \lambda_i = 0, \quad i = 1, \ldots, N \qquad \text{(III-8)}$$

$$\frac{\partial L}{\partial u_i} = \frac{\partial H_i}{\partial u_i} = 0, \quad i = 1, \ldots, N \qquad \text{(III-9)}$$
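The multiplier recursion (III-5), (III-6) and the stationarity condition (III-9) can be checked numerically. The Python sketch below does so for a hypothetical scalar three-stage system; the functions `phi` and `f`, the initial state and the control values are illustrative assumptions, not data from this work. It solves the state equations forward and the multiplier equations backward, and then compares $\partial H_i / \partial u_i$ with a finite-difference derivative of $F$ with respect to $u_i$ taken with the states eliminated through the connection constraints.

```python
import numpy as np

# Hypothetical scalar stage functions (assumptions made for illustration only).
def phi(x, u):   return x**2 + u**2            # stage objective phi_i
def phi_x(x, u): return 2.0 * x
def phi_u(x, u): return 2.0 * u
def f(x, u):     return 0.5 * x + u + 0.1 * x * u   # stage transformation f_i
def f_x(x, u):   return 0.5 + 0.1 * u
def f_u(x, u):   return 1.0 + 0.1 * x


def forward(x0, u):
    """Solve x_{i+1} = f_i(x_i, u_i) forward; returns x_1, ..., x_{N+1}."""
    xs = [x0]
    for ui in u:
        xs.append(f(xs[-1], ui))
    return np.array(xs)


def objective(x0, u):
    """F = sum of phi_i(x_i, u_i) with the states eliminated."""
    xs = forward(x0, u)
    return sum(phi(xs[i], u[i]) for i in range(len(u)))


def multipliers(xs, u):
    """Backward recursion (III-5); lam[N] holds lambda_{N+1} = 0 as in (III-6)."""
    N = len(u)
    lam = np.zeros(N + 1)
    for i in range(N - 1, -1, -1):
        lam[i] = phi_x(xs[i], u[i]) + f_x(xs[i], u[i]) * lam[i + 1]
    return lam


def hamiltonian_gradient(x0, u):
    """dH_i/du_i = dphi_i/du_i + (df_i/du_i) * lambda_{i+1} for every stage."""
    xs = forward(x0, u)
    lam = multipliers(xs, u)
    return np.array([phi_u(xs[i], u[i]) + f_u(xs[i], u[i]) * lam[i + 1]
                     for i in range(len(u))])


def finite_difference_gradient(x0, u, h=1e-6):
    """Central-difference derivative of F with respect to each control."""
    grad = np.zeros(len(u))
    for i in range(len(u)):
        up, um = u.copy(), u.copy()
        up[i] += h
        um[i] -= h
        grad[i] = (objective(x0, up) - objective(x0, um)) / (2.0 * h)
    return grad


x0 = 1.0
u = np.array([0.3, -0.2, 0.5])
grad_H = hamiltonian_gradient(x0, u)
grad_fd = finite_difference_gradient(x0, u)
print("dH_i/du_i :", grad_H)
print("dF/du_i   :", grad_fd)
```

The two gradients agree to within the finite-difference error, which is why making every stage Hamiltonian stationary in its own control is the same as making $F$ stationary subject to the connection constraints. Note that the check says nothing about whether the stationary point is a minimum of $H_i$.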

PAGE 59

42 Thus, in order to solve problem (IIIPI) by the discrete niiniiiium principle, we require that each stage Hami 1 tonian be at a stationary point. Consider the point (x-, ,x„, . . . ,Xj,,;u-, ,u„, . . . ,'u,,) and the variations 6li, ,6u„, . . . ,6Uf, around the previous point with respect to the controls, such that |6u. .| <. r. , where the second index j denotes the jth element of the control vector u-. ^x-, will be taken as zero. The variational equations corresponding to the connection constraints of the problem (IJI-Pl), with up to the second order terms included, are; 3f. 8f. . J 9^f. . J 3^f . ox.,-, = -y (6u.) + -^i&x.) + f (6u.)' -y^ (6u ) + ^^ (6x.)' — ^ (ox ) 9^+ (fiu.)"^ '-J (6x.) + 0(e2) )u.9x^ For the considered sequential system, the solution of the above system with respect to the ox.'s, i=l,...,N, is straightforward and yields the following general formula in terms of the variations in the controls only: ^^+1 i I 1=] i 9f k=£+l ^(6u ) 4 I 9u| ^ ^ ^=1 i I 1 1=] k,m=l V=£+l ^\ T (6u,) J'4 9u, i 9f' n — k=£+l ''^k P-1 9f t9 f, 9u^ n =k+l 9x, J /f. 9x 9f 9u ' "^ '^ £=1 k=l i 9f nil (J X (ou, ) ;; — £-1 df] n — ^ s = k+l^^ •1 9f| t=m+l^'\ T (6u, 9x^9uJ (III-IO)

PAGE 60

43 The variation in the objective function F caused by the variations in the controls is given by: i=l 9(i) 1 .. ^T " ^i 3U. 1 (6u.) +^ (6x.) +^ (6u.)' ^-^ (5u.) IX. 1 8u 1 .. ^T " '''i J ° ^M (6x. + ~ {6x.)' -y(6x ) + (6u.) ^ ,.. ^ ' 9X. ' 8u.3xl + 0(r/) Substituting into the last expression for 6x. 's, i=2,...,N with their equals from eqs. (III-IO) and, noting that the stage Hami 1 tonians H., i=l,...,N, are given by eq. (III-7), we find: 5F= I 1=] -f (6u ) + ^ (6u )' —~ (6u 9uJ -^ 9uf N Ji I I (5uJ £=1 k,m=l N H T H k' Su, •1 Sf T n ^ £=1 k=l ^ ^^k s-k+1 °^ £-1 Bf"'" s=k+l s S^H, £-1 3f| t=m+l ^^t 3f.2 3x^9u^ T (^^) f (6u ) T m 1 1 111 Unlike the continuous case (Denn, 1969; Hal kin, 1966) there is no general way in which the last two terms in eq. (1 11-11) can be made to vanish. Therefore, the variations considered in the controls may well produce 6F < 0, or 6F > 0, or 6F = 0. Thus it is evident from the above that a strong minimum principle, which requires that the solution to the problem (III-Pl) minimizes the stage Hamiltonians H^, !?,=1,...,N, is not generally available (Horn and Jackson, 1965b; Denn, 1969; Halkin, 1966). A weaker form of the discrete minimum principle can

PAGE 61

be used and requires that the solution to the problem (III-P1) makes the stage Hamiltonians stationary. The examples presented by Horn and Jackson, which counter the strong minimum principle, are such that the stage Hamiltonians do not possess a minimum stationary point whereas the problem itself does. However, there exist special cases where the strong minimum principle can be applied (Horn and Jackson, 1965b; Denn, 1969).

At this point a further clarification is required. The strong discrete minimum principle fails for physically staged steady-state systems. Halkin (1966) has shown that it always succeeds for discrete-time systems obtained as an approximation to continuous-time systems, since the time increment t can become arbitrarily small and make the last two terms in eq. (III-11) vanish.

The success or the failure of the strong discrete minimum principle can also be explained in terms of convexity (or directional convexity), or lack of it, for the sets of the reachable states of the system (Halkin, 1966; Holtzman and Halkin, 1966). This constitutes an important characteristic and in fact it is the basis for the development of a strong version of the discrete minimum principle, which follows.

The two-level optimization procedure is an infeasible decomposition strategy of a Lagrangian nature which solves the problem (III-P1). The similarity between the Lagrangian and the weak discrete minimum principle is a well-known fact, and in a recent note (Schock and Luus, 1972) the similarity between the two-level optimization procedure and the discrete minimum principle has been pointed out. Further insight into the relationship between the last two methods will be given later in this chapter, and a stronger version of the discrete minimum

PAGE 62

principle will be developed, based on a method to overcome the shortcomings of the two-level optimization procedure developed in Chapter II.

III.3. Development of a Stronger Version of the Discrete Minimum Principle

In this section we will develop a stronger version of the discrete minimum principle using Hestenes' method of multipliers and following a procedure similar to the one that was developed in Chapter II to overcome the dual gaps of the two-level optimization procedure. As a basis for the presentation we will use problem (III-P1). Consider now the augmented problem (III-P2):

$$\min F^* = \sum_{i=1}^{N} \phi_i(x_i, u_i) + K \sum_{i=1}^{N} [x_{i+1} - f_i]^T [x_{i+1} - f_i] + K (x_1 - x^0)^T (x_1 - x^0) \qquad \text{(III-P2)}$$

subject to

$$x_{i+1} = f_i(x_i, u_i), \quad i = 1, \ldots, N$$

$$x_1 = x^0 \ \text{(given)}$$

with $K > 0$. The Hamiltonian for this problem is given by:

$$H^* = F^* + \sum_{i=1}^{N} \lambda_{i+1}^T f_i(x_i, u_i) + \lambda_1^T x^0.$$

In the following theorems we will establish some important properties of $H^*$.

Theorem 9: If $(\bar{u}_1, \ldots, \bar{u}_N)$ is a stationary point of the Hamiltonian $H$ for the problem (III-P1), then it is also a stationary point for the

PAGE 63

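Hamiltonian $H^*$ of the augmented problem (III-P2) for any real value of the parameter $K$.

Proof: From the equations defining $H$ and $H^*$ we have

$$H^* = H + K \sum_{i=1}^{N} [x_{i+1} - f_i]^T [x_{i+1} - f_i] + K (x_1 - x^0)^T (x_1 - x^0). \qquad \text{(III-12)}$$

Differentiation with respect to the controls yields

$$\frac{\partial H^*}{\partial u} = \frac{\partial H}{\partial u} + 2K \sum_{i=1}^{N} \frac{\partial [x_{i+1} - f_i]^T}{\partial u} [x_{i+1} - f_i].$$

Evaluating the derivatives at the point $(\bar{u}_1, \ldots, \bar{u}_N)$, since $\partial H / \partial u = 0$ and $x_{i+1} - f_i = 0$ for $i = 1, \ldots, N$, we find

$$\frac{\partial H^*}{\partial u} = \frac{\partial H}{\partial u} = 0.$$

Theorem 10: Let $u = \bar{u}$ be a local isolated minimum of $F$ with all the connection constraints satisfied. Then there exists a real valued parameter $K^*$ such that $\partial H^* / \partial u = 0$ and $\partial^2 H^* / \partial u^2 > 0$ (i.e., the matrix is positive definite) for $K > K^*$.

Proof: In Theorem 9 it was shown that $\partial H^* / \partial u = 0$ is independent of the constant $K$; therefore, it will be valid for $K > K^*$. Let us consider next the matrix $\partial^2 H^* / \partial u^2$ at the point $\bar{u}$, where $x_{i+1} - f_i = 0$:

$$\frac{\partial^2 H^*}{\partial u^2} = \frac{\partial^2 H}{\partial u^2} + 2K \sum_{i=1}^{N} \frac{\partial [x_{i+1} - f_i]^T}{\partial u} \frac{\partial [x_{i+1} - f_i]}{\partial u^T}.$$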

PAGE 64

The solution of problem (III-P1) is a stationary point of the Hamiltonian $H$. The nature of the stationary point for the discrete case cannot be predetermined, and thus $\partial^2 H / \partial u^2 \gtrless 0$. Assuming that we have a nonsingular problem, i.e.

$$\frac{\partial [x_{i+1} - f_i]}{\partial u} \ne 0, \quad i = 1, \ldots, N,$$

we conclude that

$$(\delta u)^T \frac{\partial^2 H^*}{\partial u^2} (\delta u) = (\delta u)^T \frac{\partial^2 H}{\partial u^2} (\delta u) + 2K \sum_{i=1}^{N} (\delta u)^T \frac{\partial [x_{i+1} - f_i]^T}{\partial u} \frac{\partial [x_{i+1} - f_i]}{\partial u^T} (\delta u) > 0$$

for a large enough $K$, so that the second term, which is always positive, prevails over any negativity of the first term. Q.E.D.

The second theorem implies that any stationary point of the Hamiltonian $H$ can be made a minimum point by choosing the parameter $K$ large enough. Theorems 9 and 10 establish the following result:

For a large enough $K$, a local solution to problem (III-P2) which is also a local solution to problem (III-P1) minimizes the Hamiltonian $H^*$ of the augmented problem (III-P2).

The above result requires that we minimize the overall problem Hamiltonian, $H^*$. The available weak minimum principle permits us to solve the overall problem by solving one problem per stage per

PAGE 65

iteration, since $H$ decomposes into a sum of stage Hamiltonians. Unfortunately $H^*$ does not decompose, as we shall now see.

Let us see now how we can simplify the above result and relate the solution of the problem (III-P1) or (III-P2) to stage Hamiltonians. Consider the penalty term in the objective function of the problem (III-P2):

$$K \sum_{i=1}^{N} [x_{i+1} - f_i]^T [x_{i+1} - f_i] = K \sum_{i=1}^{N} [x_{i+1}^T x_{i+1} + f_i^T f_i - 2 x_{i+1}^T f_i]$$

We note that each member of the summation includes separable terms, e.g., $x_{i+1}^T x_{i+1}$ and $f_i^T f_i$, and nonseparable terms such as the crossproduct $x_{i+1}^T f_i$. The following develops a strategy to decompose computationally the problem of minimizing $H^*$. This approach is motivated by the work on solving nonconvex problems using the two-level approach.

Expand the crossproduct term in a Taylor series and consider the following linear approximation around the point $\bar{x}_{i+1}$, $\bar{f}_i = f_i(\bar{x}_i, \bar{u}_i)$:

$$x_{i+1}^T f_i \approx x_{i+1}^T \bar{f}_i + \bar{x}_{i+1}^T f_i - \bar{x}_{i+1}^T \bar{f}_i$$

Then the Hamiltonian of the augmented problem takes the following form:

PAGE 66

$$H^* = \sum_{i=1}^{N} \phi_i(x_i, u_i) + K \sum_{i=1}^{N} [x_{i+1}^T x_{i+1} + f_i^T f_i + 2 \bar{x}_{i+1}^T \bar{f}_i - 2 x_{i+1}^T \bar{f}_i - 2 \bar{x}_{i+1}^T f_i] + K (x_1 - x^0)^T (x_1 - x^0) + \sum_{i=1}^{N} \lambda_{i+1}^T f_i + \lambda_1^T x^0$$

$$= \sum_{i=1}^{N} \{ \phi_i + K [f_i^T f_i + 2 \bar{x}_{i+1}^T \bar{f}_i - 2 \bar{x}_{i+1}^T f_i - 2 x_i^T \bar{f}_{i-1}] + \lambda_{i+1}^T f_i + K x_i^T x_i \} + \lambda_1^T x^0 + K x_{N+1}^T x_{N+1} - 2K x_{N+1}^T \bar{f}_N + K x^{0T} x^0$$

$$= \sum_{i=1}^{N} H_i^* + \lambda_1^T x^0 + K x_{N+1}^T x_{N+1} - 2K x_{N+1}^T \bar{f}_N + K x^{0T} x^0 \qquad \text{(III-13)}$$

where $\bar{f}_0 \equiv x^0$ and we have defined a stage Hamiltonian

$$H_i^* = \{ \phi_i + K x_i^T x_i + K f_i^T f_i + 2K \bar{x}_{i+1}^T \bar{f}_i - 2K \bar{x}_{i+1}^T f_i - 2K x_i^T \bar{f}_{i-1} \} + \lambda_{i+1}^T f_i.$$

Let us now establish the following result, which also constitutes the basis of the proposed algorithm (given later in this chapter).

Theorem 11: If the point $\bar{u} = (\bar{u}_1, \bar{u}_2, \ldots, \bar{u}_N)$ minimizes the overall Hamiltonian $H^*$, then each stage Hamiltonian $H_i^*$, $i = 1, \ldots, N$, resulting after the Taylor series linear approximation of the crossproduct terms is minimized with respect to the corresponding control variable $u_i$ at the point $\bar{u}_i$.

Proof: At the solution point we require feasibility, i.e.

$$\bar{x}_{i+1} = \bar{f}_i, \quad i = 1, \ldots, N.$$

PAGE 67

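(This requirement is evident from the algorithm to be presented in the next section.) From eq. (III-13) it is clear then that

$$\frac{\partial H^*}{\partial u_i} = \frac{\partial H_i^*}{\partial u_i}, \quad i = 1, \ldots, N,$$

and therefore a stationary point of $H^*$ with respect to $u_i$ is a stationary point of $H_i^*$ with respect to $u_i$. The Hessian matrix of the second derivatives of $H^*$ must be positive definite at the point $\bar{u}$, i.e.

$$\frac{\partial^2 H^*}{\partial u^2} = \begin{bmatrix} \dfrac{\partial^2 H^*}{\partial u_1^2} & \cdots & \dfrac{\partial^2 H^*}{\partial u_1 \partial u_N} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 H^*}{\partial u_N \partial u_1} & \cdots & \dfrac{\partial^2 H^*}{\partial u_N^2} \end{bmatrix}$$

must be positive definite. Thus, all the submatrices on the diagonal must be positive definite, i.e.,

$$\frac{\partial^2 H^*}{\partial u_1^2},\ \frac{\partial^2 H^*}{\partial u_2^2},\ \ldots,\ \frac{\partial^2 H^*}{\partial u_N^2}$$

must be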

PAGE 68

positive definite. But $\dfrac{\partial^2 H^*}{\partial u_i^2} = \dfrac{\partial^2 H_i^*}{\partial u_i^2}$; therefore all the $\dfrac{\partial^2 H_i^*}{\partial u_i^2}$, $i = 1, \ldots, N$, are positive definite.

Theorem 11 implies that the minimization of $H^*$ with respect to the control variables can be replaced by the problems

$$\min_{u_i} H_i^*, \quad i = 1, \ldots, N.$$

III.4. The Algorithm and Computational Characteristics

The theorems of Section III.3 imply the following algorithm, which constitutes a stronger version of the now available discrete minimum principle.

Step 1: Assume a value for the penalty constant $K$.
Step 2: Assume values for the control variables $\bar{u}_1, \ldots, \bar{u}_N$.
Step 3: Using the values $\bar{u}_1, \ldots, \bar{u}_N$, solve the state equations $x_{i+1} = f_i(x_i, u_i)$, $i = 1, \ldots, N$, forward and find $x_2, \ldots, x_{N+1}$. Let $\bar{x}_2, \ldots, \bar{x}_{N+1}$ be the found values ($\bar{x}_1 = x^0$, given constant).
Step 4: Using the values $\bar{u}_1, \ldots, \bar{u}_N$ and $\bar{x}_1, \ldots, \bar{x}_{N+1}$, solve the adjoint equations

PAGE 69

$$\lambda_i = \frac{\partial \phi_i}{\partial x_i} + \frac{\partial f_i^T}{\partial x_i} \lambda_{i+1}, \quad i = 1, 2, \ldots, N; \qquad \lambda_{N+1} = 0$$

backwards. Let $\bar{\lambda}_1, \ldots, \bar{\lambda}_{N+1}$ be the found values.
Step 5: Formulate the Hamiltonian $H^*$ of the augmented problem (III-P2) and expand the crossproduct terms around the point $(\bar{x}_1, \ldots, \bar{x}_{N+1}; \bar{u}_1, \ldots, \bar{u}_N)$. Formulate the stage Hamiltonians $H_i^*$, $i = 1, \ldots, N$. Minimize all $H_i^*$, $i = 1, \ldots, N$, and find optimal values for the controls, say $u_i^*$, $i = 1, \ldots, N$. If the minimization procedure fails to produce a minimum for at least one $H_i^*$, increase $K$ and go to Step 2.
Step 6: If $|u_{ij}^* - \bar{u}_{ij}| < \epsilon$ for all $j$ and for $i = 1, \ldots, N$ ($j$ denotes the $j$th component of the vector $u_i$), stop; the solution is found. Otherwise assume new values for the controls, putting New $\bar{u}_i = u_i^*$, $i = 1, \ldots, N$, and go back to Step 3.

The effective use of the above algorithm depends largely on the value of the constant $K$. Very small values of $K$ will not produce the necessary convexity of the stage Hamiltonians, while a very large $K$ will mask the objective function and will make the algorithm insensitive to the descent direction of the objective function. Miele et al. (1972)

PAGE 70

have suggested a method for choosing a proper value for the constant $K$ in the context of the Hestenes method of multipliers. As will be discussed in Section III-5, the two methods are related and this method for choosing $K$ should be satisfactory here.

III.5. Discussion and Conclusions

As mentioned before, the failure of the strong discrete minimum principle was caused by the fact that the stationary points of the stage Hamiltonian with respect to the controls are not always minimum points. The method proposed in the previous sections of this chapter turns every stationary point of the Hamiltonian into a minimum point for a large enough $K$. Thus we can find the solution of the original problem at a local minimum point of the stage Hamiltonians. This constitutes a stronger result than that currently available, where we must search for a stationary point of the stage Hamiltonians.

The method used in this chapter to establish a stronger version of the discrete minimum principle parallels in many respects the method used in Chapter II to resolve the dual gaps in the two-level optimization procedure. The source of the shortcomings for both methods (minimum principle and two-level method) is the nonconvexity of the objective function and/or the stage transformation equations. The two-level optimization method fails if either the objective function or the transformation equations or both are nonconvex with respect to the control and/or the state variables. For the resolution of the dual gaps of the two-level method, the same penalty term was used, multiplied by a positive constant $K$. It was required that $K$ be large enough so that the stage sub-Lagrangians

PAGE 71

become locally convex with respect to the control and state variables. Since, in the present work, we have required only that K be large enough to make the stage Hamiltonians convex with respect to the controls, we conclude that the K required by the strengthened form of the minimum principle to solve a nonconvex problem is at most as large as the K required by the two-level optimization method to resolve dual gaps. Fig. 4 compares the values of K required to solve a nonconvex problem by the three methods discussed, namely the weak discrete minimum principle, the strengthened form of the discrete minimum principle developed in this work, and the two-level optimization method. Note that $K_{TLM} \ge K_{MP} > 0$, where $K_{TLM}$ is the least value of K required for the two-level method and $K_{MP}$ is the least value of K for the minimum principle developed here. Thus it should be clear that, although the methods are related, they do not have equivalent shortcomings: it is possible to find problems which can be solved by the weak minimum principle, or even by the method developed here, but not by the two-level method. However, given a $K > K_{TLM}$, all three methods succeed.
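The effect of K on the curvature of a stationary point can be seen on a toy problem. The sketch below is not taken from the thesis: it assumes a hypothetical one-variable problem, minimize f(u) = -u**2 subject to g(u) = u - 1 = 0, and the function names are illustrative only. The ordinary Lagrangian has curvature -2 in u, so its stationary point is a maximum; adding the penalty K*g(u)**2 changes the curvature to 2K - 2, which is positive once K exceeds 1, while the stationary point itself stays at the constrained solution u = 1.

```python
def curvature(K: float) -> float:
    """Second derivative in u of L(u) = -u**2 + lam*(u - 1) + K*(u - 1)**2.

    The multiplier term is linear in u, so it does not affect the curvature.
    """
    return -2.0 + 2.0 * K


def stationary_point(lam: float, K: float) -> float:
    """Solve dL/du = -2u + lam + 2K(u - 1) = 0 for u (requires K != 1)."""
    return (2.0 * K - lam) / (2.0 * K - 2.0)


if __name__ == "__main__":
    lam = 2.0  # multiplier value for which u = 1 is a stationary point
    for K in (0.0, 0.5, 2.0, 10.0):
        kind = "minimum" if curvature(K) > 0 else "maximum"
        u = stationary_point(lam, K)
        print(f"K = {K:5.1f}: stationary u = {u:.3f} ({kind})")
```

In this toy case the threshold is K = 1; a K far above the threshold still gives a minimum but lets the penalty dominate the objective, which is the trade-off noted in the discussion of the algorithm.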

PAGE 72


PAGE 73

CHAPTER IV
EXAMPLES ON NONCONVEX OPTIMIZATION

IV.1. Numerical Examples

IV.1.a. Two-stage Examples

Consider the following two-stage example, Fig. 5, taken from McGalliard's thesis (McGalliard, 1971) and described by:

Min F = -t3(x2,U2) + x°-^ + 2u^ + x^"^ + Su^ s.t. x„ = tAx-,u.) = 3X-, + 3lu ^3 " ^3^^2''^2^ ^ ^^2 ^ ^'^2 X-] > u > x^ 5 3 x-,+ 2u^ < 4

PAGE 74

Figure 5. Two-stage Example

PAGE 75

TABLE 1. POINTS GENERATED FOR THE TWO-STAGE EXAMPLE

PAGE 76

and

$$\max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -4.84$$

Two solutions are produced, with

$$z_1 = x_2 - 3x_1 - 3u_1 = 4 - 3(3) + 0 = -5$$
$$z_2 = x_2 - 3x_1 - 3u_1 = 4 - 3(0) + 0 = 4$$

and from the corollary in Section III.3 we conclude that a dual gap exists.

Introduce the penalty term $K(x_2 - 3x_1 - 3u_1)^2$ in the objective F and make a linear approximation of the cross-product terms $x_2 x_1$ and $x_2 u_1$. Following the algorithm in Section III.5, we find the solution

$$x_1 = 4/3, \quad x_2 = 4, \quad u_1 = u_2 = 0 \quad \text{with } \lambda = 0.5 \text{ and } K = 10$$

and

$$\min_{x_1,u_1,x_2,u_2} F = \max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -4.51$$

Substitute unit 2 in example 1 by another, described as follows:

0.6 f^ = t2(x2,Li2) + 2x2' + 2U2 t^(x2!U2) = 2x2 + 3U2 XpjUp 2 U2 < 2 X2+U2 < 4

PAGE 77

The problem becomes:

0.5 , „ r, ^ ,, 0.6 s.t. Xp = 3x-] + 3 x^,u^ e S^ 9 '9 ^ 9

The two-level procedure yields the solution (see Table 2)

$$x_1 = 0 \text{ and } 3, \quad x_2 = 4, \quad u_1 = u_2 = 0$$

with

$$\max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -2.54$$

and a dual gap is detected again. The proposed method yields the solution

$$x_1 = 4/3, \quad x_2 = 4, \quad u_1 = u_2 = 0 \quad \text{for } \lambda = 0.8 \text{ and } K = 10$$

and

$$\min_{x_1,u_1,x_2,u_2} F = \max_{\lambda}\ \min_{x_1,u_1,x_2,u_2} [F - \lambda(x_2 - 3x_1 - 3u_1)] = -2.07$$

with all constraints satisfied.

IV.1.b. Three-stage example with recycle

In the two-stage example, let us insert an additional stage with a recycle loop from this new stage to stage 1, Fig. 6.

PAGE 78

TABLE 2. POINTS GENERATED FOR THE MODIFIED TWO-STAGE EXAMPLE

PAGE 79


PAGE 80

The problem is:

.,. c 0-6 , o ^ 0.5 ^ . , 0.4 i-nn F = X, + 2u, + x„ -f 5Up 4x^ u^ + x s.t. Xp = Sx-j + 3Ui Xo = 2Xj^ + 2Up ^1 = ^3/4 x-j ,u.| £ S, ^2''2 ^ ^2 '3''^3 ^ ^3 X3,U3 . X3< 4 X3+U3 < 6

Since the solution of this problem lies in a dual gap, we apply the proposed resolution, which yields

$$x_1 = 0.67, \quad x_2 = 2, \quad x_3 = 4, \quad u_1 = u_2 = u_3 = 0$$

for $\lambda_1 = 1.0$, $\lambda_2 = 1.0$, $\lambda_3 = 0.5$ and K = 10. Then

$$\min F = \max_{\lambda_1,\lambda_2,\lambda_3}\ \min_{x_1,u_1,x_2,u_2,x_3,u_3} [F - \lambda_1(x_2 - 3x_1 - 3u_1) - \lambda_2(x_3 - 2x_2 - 2u_2) - \lambda_3(u_1 - u_3/4)] = -11.96$$

with all constraints satisfied.

PAGE 81

IV.1.c. Soland's example

This example is taken from Soland (1971):

$$\min f = -12x_1 - 7x_2 + x_2^2$$
$$\text{s.t.} \quad g(x) = -2x_1^4 + 2 - x_2 = 0, \quad 0 \le x_1 \le 2, \quad 0 \le x_2 \le 3$$

The solution found by Soland (which is not the global minimum), after the generation of 12 points, which necessitated the solution of 23 subproblems, is:

$$x_1 = 0, \quad x_2 = 2 \quad \text{and} \quad f = -10$$

The algorithm proposed in this work generated a sequence of 8 points, solving 8 subproblems, and found the solution:

$$x_1 = 0.718, \quad x_2 = 1.46847 \quad \text{and} \quad f = -16.7387$$

The sequence of generated points is shown in Table 3. The starting point is $x_1 = 2.00$, $\lambda = 0.0$.

IV.1.d. Jackson and Horn's counterexample

Consider the following two-stage example, Fig. 7, taken from Horn and Jackson (1965a). This example is the counterexample which demonstrated the fallacy of the strong minimum principle and is

PAGE 82

TABLE 3. POINTS GENERATED FOR SOLAND'S EXAMPLE

PAGE 83


PAGE 84

described by:

$$\min_{\theta^1,\theta^2} x_1^2$$

subject to

$$x_1^1 = x_1^0 - 2\theta^1 - \tfrac{1}{2}(\theta^1)^2$$
$$x_2^1 = x_2^0 + \theta^1$$
$$x_1^2 = x_1^1 + (x_2^1)^2 + (\theta^2)^2$$
$$x_2^2 = \text{arbitrary}$$
$$x_1^0 = x_2^0 = 1$$

Here superscripts denote the stage and subscripts the component of the state vector. The solution to this problem is $\theta^1 = \theta^2 = 0$, which gives the smallest value of $x_1^2$. It does not, however, minimize the Hamiltonian for stage 1, since this Hamiltonian has only one stationary point, which is a maximum. Let us now apply the procedure of the strong discrete minimum principle developed in the previous sections. Thus we have:

H* = K[x° -26^-1 (e^)2]2 ,. K[x° + e"']^ 2K(x])[x° 29^ \ (B^^j 2K(x^)[x°+ e'] + K(xj)[x° 2g' 1 (Q^)h + K(x])[x° + 6^] + A ,x-j 2 a, 6 "o A-j (G ) + A„Xp + A 9

and

PAGE 85

H* x] + (xl)2 + (6^)2 4 K(xJ)2 + Kixlf 2K(x.])[x° 28^ kb^)^] ,yL-i . -] l\r,0 o?1 1 ,2^2.1-,r .0 , 'J. 2K(xp[x^ + 0'] + K(x;)[xlj' 26' i(9')^] + K(x^j[x^ + o'] J 1 ,11 wi th a] 1 + 2I((x]) 2K[x° 29^ i (eb"] aI = 2(xh + 2K(xl) 21([x° + 6^] The first necessary conditions yield: m }K(6^)^ + 6K(e^)^ + [lOK 2;
PAGE 86

5. Minimize $H_1^*$ and $H_2^*$, which gives $\tilde{\theta}^1 = 0.947$ and $\tilde{\theta}^2 = 0$. Since $|\tilde{\theta}^1 - \theta^1| = 0.053 > \varepsilon = 0.001$, go back to step 2.

2a. Assume $\theta^1 = 0.947$ and $\theta^2 = 0$.

3a. Find $x_1^1 = -1.343$ and $x_2^1 = 1.947$.

4a. Find $\lambda_1^1 = 1$ and $\lambda_2^1 = 3.894$.

5a. Minimize $H_1^*$ and $H_2^*$ and find $\tilde{\theta}^1 = 0.9$, $\tilde{\theta}^2 = 0$.

Go back to step 2a and assume $\theta^1 = 0.9$, $\theta^2 = 0$. Continue in the same way until $|\tilde{\theta}^1 - \theta^1| < \varepsilon$ and $|\tilde{\theta}^2 - \theta^2| < \varepsilon$. This finally leads, in 7 iterations, to the solution $\theta^1 = \theta^2 = 0$, which is the solution of the problem.

Note that if K = 0.1 the nonconvexity of the first connection constraint with respect to $\theta^1$ does not disappear. In that case the stronger version of the minimum principle developed in this work does not succeed in giving the solution.

IV.1.e. Denn's counterexample

Consider the following example, taken from Denn (1969), which is also a counterexample to Katz' strong minimum principle and is described by:

$$\min_{u^1,u^2} x_1^2$$

PAGE 87

subject to

$$x_1^1 = x_1^0(1 + x_2^0) - \tfrac{1}{2}(u^1)^2$$
$$x_2^1 = 4x_1^0 - 2x_2^0 + u^1$$
$$x_1^2 = x_1^1(1 + x_2^1) - \tfrac{1}{2}(u^2)^2$$
$$x_2^2 = 4x_1^1 - 2x_2^1 + u^2$$
$$x_1^0 = -1/4 \quad \text{and} \quad x_2^0 = 1$$

where $u^1$ and $u^2$ are to be chosen subject to

$$u^* \le u^1, u^2$$

This is a constrained problem, but the developed method handles it as readily as an unconstrained one. The inequalities will be handled through Kuhn-Tucker multipliers. The above problem has a solution at the point $u^1 = 1$ and $u^2 = u^*$, but the Hamiltonian for the first stage does not have a minimum, and thus Katz' strong minimum principle cannot be used. In fact the Hamiltonian of the first stage possesses one stationary point, which is a maximum. Let us now apply the algorithm developed in the present work. Thus we take:

PAGE 88

71 H* = ^ a](J)2 + (A^ + i,^)u^ + A^xjd 4 x^) + 4A^x] 2}\x\ + K(x])^(U:4)^ + 16K(x])^ + 4K(X2)^ 16Kx]x2 u-jU* + Kx^x](l + x^) ~ 2Kx^x](l+X2; + § xhl + xl)(G^) Kx!(l + xl)(G^) + 4Kx!x^ 8KxJx^ 2 '^1 rz " "'^-^1^2 2KX2X2 + 4KX2X2 4Kx]u^ + 8Kxju^ + 2KX2U^ 4Kx2U^ and "2 = x^ \ A^(u^)^ + A^u^ + Kx^ -^ I (u^)^ + Kx^(u2) + K(x^)2 + Klu^) 2Kx2U^ u(u* u^) + Kx^x](l + x^) 2Kx^x](l+X2) + | x](1+X2)(l/; Kx](l+X2)(u^) + 4Kx]x2 SKxjx^ 2KX2X2 + 4KX2X2 -1-2 -1 2 -1-2 -1 2 4Kx]u + SKx^u + 2KX2U 4KX2U The necessary conditions to be satisfied are: dW, 9u \= \\[u^) + A^ + y^ = and ^ Uu) At(u ) + A^ + Kx^ + K 2Kx^ + y, Kx^-(1 + X2; 8u 2 "^1 7 ^2 1 + 8Kx] 4KX2 =

PAGE 89

The adjoint variables satisfy the following equations:

\ = >4(1+X2) + 4A^ + 2K(x^)(Hx2)" + 32K(x]) IGKx^ 2Kx^(l+X2) K(1+X2)(G^) 8KX2 2^^^^ ' X^x] 2A2 + 2K(x])^(Hx2) + 8K(X2) 16Kx] 2Kx^x] 1 -2 -2 -2 Kx]u + 4KX2 4Ku A 2x^ + K + K(u^) 2Kx](l + x^) A^ = 2K(X2) 2Ku^ + 4KX2

Note that in the above equations the variables $\mu_1$ and $\mu_2$ are Kuhn-Tucker multipliers through which we handle the inequalities $u^1 \ge u^*$ and $u^2 \ge u^*$. For $\mu_1$ and $\mu_2$ we have $\mu_1, \mu_2 \ge 0$.

Starting with K = 10 and initial assumptions $u^1 = 2$ and $u^2 = 2$, we apply the same algorithmic procedure as presented in the previous example, and after 5 iterations we find the solution $\tilde{u}^1 = 1$ and $\tilde{u}^2 = u^*$, which is the solution of the problem.

IV.2. The Design of a Heat Exchange Network

Avery and Foss (1971) have demonstrated that the method of two-level optimization may not be generally applicable to chemical process

PAGE 90

design problems due to the mathematical character of commonly encountered objective functions. Since the cost functions used in chemical engineering design problems typically include a term $x^a$, where x is the throughput in a unit, nonconvex problems arise. The method for resolving the nonconvexities, as described in Chapter II, will be applied in this section to the design of a heat exchange network presented by Avery and Foss and possessing inherent nonconvexities.

The cold stream D of Figure 8 is to be heated from its inlet temperature to a specified final temperature. Three hot streams having flowrates a, b, c and temperatures $t_a$, $t_b$, $t_c$ may be used for heating; here they are considered to have sufficient heat capacity and availability to accomplish the task. The distribution of the total heat load among the three exchangers is to be accomplished at the minimum cost of equipment, which is related to the heat transfer surface area A by

$$C = \sum_{i=1}^{3} C_i = \sum_{i=1}^{3} \gamma_i A_i^{a} \qquad \text{(IV-1)}$$

The parameters $\gamma_i$ and a are positive constants; typically a = 0.6. It is easily demonstrated that this problem has a minimum cost solution and that it is unique.

Let $x_1$ and $x_2$ be the enthalpies of stream D entering and leaving, respectively, the first heat exchanger (see Figure 9). Similarly, $y_1$ and $y_2$ are the enthalpies for the second exchanger and $z_1$ and $z_2$ for the third. Note that $x_1$ and $z_2$ are known since the flowrate and

PAGE 91


PAGE 92


PAGE 93

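temperatures of stream D are known at the entrance to the first exchanger and at the exit of the third exchanger. Thus, considering $x_2$, $y_1$, $y_2$ and $z_1$ as the design variables, the minimization problem can be formulated as follows:

$$\min_{x_2,y_1,y_2,z_1} [C_1(x_2) + C_2(y_1,y_2) + C_3(z_1)] \quad \text{s.t.} \quad x_2 = y_1, \quad y_2 = z_1 \qquad \text{(IV-2)}$$

The Lagrangian for this problem is:

$$L = \sum_{i=1}^{3} C_i - \lambda_1(x_2 - y_1) - \lambda_2(y_2 - z_1)$$
$$= \{C_1(x_2) - \lambda_1 x_2\} + \{C_2(y_1,y_2) + \lambda_1 y_1 - \lambda_2 y_2\} + \{C_3(z_1) + \lambda_2 z_1\}$$
$$= \ell_1(x_2,\lambda_1) + \ell_2(y_1,y_2,\lambda_1,\lambda_2) + \ell_3(z_1,\lambda_2) \qquad \text{(IV-3)}$$

For given values of the multipliers $\lambda_1$ and $\lambda_2$, the two-level optimization method requires the minimization of the sublagrangians $\ell_1$, $\ell_2$ and $\ell_3$. A sufficient condition for a function to be at a minimum point is that the Hessian matrix be positive definite at that point. The Hessian for $\ell_2$ is defined as

$$H_2 = \begin{bmatrix} \partial^2 \ell_2/\partial Q^2 & \partial^2 \ell_2/\partial Q\,\partial y_1 \\ \partial^2 \ell_2/\partial y_1\,\partial Q & \partial^2 \ell_2/\partial y_1^2 \end{bmatrix} \equiv \begin{bmatrix} d_1 & e \\ e & d_2 \end{bmatrix} \qquad \text{(IV-4)}$$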

PAGE 94

and the conditions assuring that both eigenvalues are positive are

$$d_1 > 0, \quad d_2 > 0, \quad \Delta \equiv (d_1 d_2 - e^2) > 0 \qquad \text{(IV-5)}$$

In the above, the heat duty $Q = y_2 - y_1$ in exchanger 2 has been substituted for the variable $y_2$. $d_1$, $d_2$ and e may be determined from equation (IV-1) and the following relations, which are suitable for the determination of heat exchange area. Figure 10 identifies the notation used here.

Q = C„ (T^ T, R / 1 Q = 72 Yi ^1 = ^l/S A DAT AT t2 T^) (t^ T^) £n (t2 T^) r = 7^ / 1

Without loss of generality, it is convenient to take r > 1; identical conclusions hold when r < 1. It may be shown that

$$d_1 = q[(a-1)(rs-1)^2 + p\{(rs)^2 - 1\}]$$
$$d_2 = q[(a-1)(s-1)^2 + p(s^2 - 1)]$$
$$e = q[(a-1)(rs-1)(s-1) + p(rs^2 - 1)]$$

PAGE 95

Figure 10. Diagram Showing Notation for a Single Exchanger

PAGE 96

where

p =^ £n — > q = K a p""^/62 > 1 K = Y c_ 'a U(r-T) > .

Substitution of $d_1$, $d_2$ and e into equation (IV-6) gives

$$\Delta \equiv (d_1 d_2 - e^2) = -q^2 p^2 s^2 (r-1)^2$$

which implies that $\Delta$ is never positive and that, regardless of the parameters of the problem, the stationary point can never be a minimum. Therefore the two-level method fails.

Consider now the following augmented problem:

$$\min F^* = C_1(x_2) + C_2(y_1,y_2) + C_3(z_1) + K^*(x_2 - y_1)^2 + K^*(y_2 - z_1)^2$$
$$\text{s.t.} \quad x_2 - y_1 = 0 \quad \text{and} \quad y_2 - z_1 = 0$$

with $K^* > 0$. The coupled terms $x_2 y_1$ and $y_2 z_1$ are expanded in a Taylor series around a point $(\bar{x}_2, \bar{y}_1, \bar{y}_2, \bar{z}_1)$ and are approximated by the linear terms in the expansion. The Lagrangian of the above problem becomes:

PAGE 97

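$$L^* = \{C_1(x_2) - (\lambda_1 + 2K^*\bar{y}_1)x_2 + K^* x_2^2 + K^*\bar{x}_2\bar{y}_1\}$$
$$+ \{C_2(y_1,y_2) + (\lambda_1 - 2K^*\bar{x}_2)y_1 - (\lambda_2 + 2K^*\bar{z}_1)y_2 + K^* y_1^2 + K^* y_2^2 + K^*\bar{x}_2\bar{y}_1 + K^*\bar{y}_2\bar{z}_1\}$$
$$+ \{C_3(z_1) + (\lambda_2 - 2K^*\bar{y}_2)z_1 + K^* z_1^2 + K^*\bar{y}_2\bar{z}_1\}$$
$$= \ell_1^* + \ell_2^* + \ell_3^*$$

The necessary conditions for the minimum of $L^*$ are $\nabla_v L^* = 0$ and $\nabla_\lambda L^* = 0$, where $v = [x_2, y_1, y_2, z_1]$ and $\lambda = [\lambda_1, \lambda_2]$. This results in the following equations:

$$\frac{\partial \ell_1^*}{\partial x_2} = 0, \quad \frac{\partial \ell_2^*}{\partial y_1} = \frac{\partial \ell_2^*}{\partial y_2} = 0, \quad \frac{\partial \ell_3^*}{\partial z_1} = 0, \quad x_2 = y_1 \quad \text{and} \quad y_2 = z_1$$

that is, each subsystem is at a stationary point with respect to the variables associated with it. The Hessian matrix for the new sublagrangian $\ell_2^*$ of the second exchanger is: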

PAGE 98

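$$H_2^* = \begin{bmatrix} \partial^2 C_2/\partial Q^2 + 2K^* & \partial^2 C_2/\partial Q\,\partial y_1 + 2K^* \\ \partial^2 C_2/\partial y_1\,\partial Q + 2K^* & \partial^2 C_2/\partial y_1^2 + 4K^* \end{bmatrix}$$

The terms $d_1^*$, $d_2^*$ and $e^*$ are easily found to be

$$d_1^* = d_1 + 2K^*, \quad d_2^* = d_2 + 4K^* \quad \text{and} \quad e^* = e + 2K^*$$

Then

$$\Delta^* = [d_1^* d_2^* - (e^*)^2] = 4(K^*)^2 + (4d_1 + 2d_2 - 4e)K^* + (d_1 d_2 - e^2) = 4(K^*)^2 + r_2 K^* + r_3$$

where

$$r_2 = 4d_1 + 2d_2 - 4e \quad \text{and} \quad r_3 = d_1 d_2 - e^2$$

$\Delta^*$ has roots

$$K^* = \frac{-r_2 \pm (r_2^2 - 16 r_3)^{1/2}}{8}$$

It was shown before that $r_3 \le 0$;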

PAGE 99

thus $(r_2^2 - 16 r_3)^{1/2} \ge |r_2|$ and the roots are real and of opposite sign. Let $K_1 \ge 0$ and $K_2 \le 0$ be these roots; then for $K^* > K_1$, $\Delta^* > 0$. Also, for $K^* > -d_1/2 \equiv K_3$ we have $d_1^* > 0$, and for $K^* > -d_2/4 \equiv K_4$ we have $d_2^* > 0$. Therefore if $K^*$ exceeds $\max\{K_1, K_3, K_4\}$ we find that $H_2^*$ is positive definite, and the stationary point for unit 2 is a minimum as required. Thus we see that the curvature of a sublagrangian at the stationary point is determined not only by the sub-unit cost function, $C_2$ in this case, but also by the numerical value of the penalty factor $K^*$.

The above example demonstrates the applicability of the developed strategy to a wide class of chemical engineering design problems where the conventional two-level method fails to provide the solution.
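The threshold on the penalty factor can be evaluated numerically. The sketch below is an illustration under stated assumptions, not code from the thesis: the function names and the sample curvature values d1, d2, e are hypothetical, and only the relations d1* = d1 + 2K*, d2* = d2 + 4K*, e* = e + 2K* and the quadratic 4(K*)^2 + r2 K* + r3 are taken from the text.

```python
import math


def penalty_threshold(d1: float, d2: float, e: float) -> float:
    """Smallest K* bound above which the augmented 2x2 Hessian is positive definite.

    Uses Delta* = 4 K^2 + r2 K + r3 with r2 = 4 d1 + 2 d2 - 4 e and
    r3 = d1 d2 - e^2, plus the diagonal conditions K > -d1/2 and K > -d2/4.
    """
    r2 = 4.0 * d1 + 2.0 * d2 - 4.0 * e
    r3 = d1 * d2 - e * e
    disc = r2 * r2 - 16.0 * r3
    # If the quadratic has no real root, Delta* is positive for every K.
    k1 = (-r2 + math.sqrt(disc)) / 8.0 if disc >= 0 else -math.inf
    return max(k1, -d1 / 2.0, -d2 / 4.0)


def is_positive_definite(d1: float, d2: float, e: float, K: float) -> bool:
    """Check the three conditions d1* > 0, d2* > 0 and Delta* > 0 directly."""
    d1s, d2s, es = d1 + 2.0 * K, d2 + 4.0 * K, e + 2.0 * K
    return d1s > 0 and d2s > 0 and d1s * d2s - es * es > 0


if __name__ == "__main__":
    d1, d2, e = -1.0, -2.0, 1.5  # hypothetical values with d1*d2 - e**2 < 0
    k_min = penalty_threshold(d1, d2, e)
    print(f"threshold K* = {k_min:.4f}")
    print(is_positive_definite(d1, d2, e, k_min + 0.01))
    print(is_positive_definite(d1, d2, e, k_min - 0.01))
```

Any K* strictly above the returned value satisfies all three conditions; in practice one would pick a value with some margin, keeping in mind that a very large penalty masks the cost function.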

PAGE 100

CHAPTER V
THE SYNTHESIS OF PROCESS FLOWSHEETS. A GENERAL REVIEW.

Processing systems are characterized by two distinct features. The first is the nature of the process units and their interconnections, and the second is the capacities and the operating conditions of the units. Synthesis of optimal processing schemes requires a search over the whole space of structural alternatives as well as over the space of the design variables for each structural configuration.

The synthesis of an optimal process flowsheet represents a great challenge for a chemical engineer in the area of systems engineering. A number of significant contributions have moved the problem from the state of art to the state of semi-art, where the personal capabilities of the designer have been organized and systematized in a logical manner. The size of the problem is very large, and the methods currently available simply give good solutions which are happily accepted by practicing designers. The progress of mathematical systems theory in this area has been very slow and no major advances have been made. Thus the chemical engineer has to resort to techniques that can be used together with his own experience and intuition.

We shall attempt to present the major contributions to the solution of the synthesis problem in order to relate and compare the synthesis strategies proposed in this work, and presented in Chapters VI and VII, with the general trends of thought already established by other researchers in this area.

PAGE 101

Rudd (1958) proposed an approach to system synthesis based on process design decomposition, whereby a design problem for which no previous technology existed is broken down into a sequence of subdesign problems until the level of the available technology is reached. Thus the original problem is decomposed into two smaller problems, $S_1$ and $S_2$, which are united through a set of artificially imposed tear constraints. This decomposition continues until existing technology for the subtasks is reached. The selection among the alternate structures arising from the decomposition of a task into subtasks was made by comparing the economic performance of the structures generated. The economic performance of subtasks of unknown technology was estimated by an iterative scheme starting with a first approximation of the system's performance based on previously solved, similar design problems.

In a second step, Masso and Rudd (1969) used heuristics (rules of thumb) to define the points of decomposition and the set of artificially imposed tear constraints. The heuristics, rules based on the previous experience, ingenuity and intuition of the designer and updated through experience with the current problem, speed up the synthesis process, but they are fallible and they do not guarantee an optimal solution. This procedure was applied without complete success to the synthesis of heat exchange networks. The usefulness of a branch and bound strategy in screening the various structural alternatives, in conjunction with the above strategy, was examined by Lee, Masso and Rudd (1970).

The design decomposition supported by heuristic decisions came to its advanced state with the works of Siirola and Rudd (1971) and Siirola, Powers and Rudd (1971). It was applied to the invention of chemical process

PAGE 102

flowsheets. The synthesis of the flowsheet starts at its very heart, the chemical reactor. Reactor support systems such as preparation of the feed, separation of the products, purification, and heating or cooling of the feed or the products are identified. The reasonableness of the various routings for the components through the subunits identified is tested according to a set of process heuristics, while the overall material flow is selected using linear programming to maximize the sum of heuristic scores from all the feasible routings. This strategy enables the synthesis of complicated integrated systems and has been applied to a process for the production of monochlorodecane by direct chlorination. An approach that relies so heavily on heuristics can only be successful if satisfactory heuristics can be developed. Nevertheless it produces good solutions quickly.

In the electrical engineering literature a general approach is to postulate a network which must contain the desired optimum network as a subset of the one postulated. A similar approach has been followed by various researchers for the synthesis of chemical process flowsheets. The structure of a processing system is determined by a connection matrix with elements expressing the interconnections between each pair of subsystems. From this point of view, the optimal synthesis of a chemical process flowsheet is the determination of the optimal connection matrix expressing the interconnections among the subsystems, subject to the satisfaction of the optimal design conditions. Thus integrated complex systems are first generated in which all the available subsystems are included. Split fractions, which are defined as the fraction of the output from one unit used as an input to another unit, are assigned to each stream connecting two units, and the optimal flowsheet is

PAGE 103

generated by a direct optimization of the initial complex system with respect to the design variables of each unit and the split fractions for each stream. The various methods which follow the above strategy differ in the way they tackle this large-scale optimization problem. Umeda and Ichikawa (1972) use an infeasible method through the two-level optimization procedure. Osakada and Fan (1973) employ a multilevel technique which at the first level minimizes the objective functions of the separated noninteracting subsystems, while at the second level the solutions of the subproblems are coordinated to render the optimum of the overall system. Umeda, Hirai and Ichikawa (1972) optimize the initial, large interconnected system using the complex algorithm. Umeda, Shindo and Tazaki (1972) have employed a feasible decomposition method at two levels. Ichikawa and Fan (1973) use a strategy based on the weak discrete minimum principle which results in a feasible decomposition. As an application of the theory of Chapter II of the present work, a variation of the two-level method was proposed (Stephanopoulos and Westerberg, 1974) to account for the nonconvexities of chemical engineering cost functions or connection constraints. Any method following the above general approach has to solve a large optimization problem, and such methods can be valuable only when combined with heuristics (to disregard certain connections) and efficient decomposition techniques.

A third general approach is that of the evolutionary procedure. This is probably the method most likely to be used by a process designer. The evolutionary procedure consists of a finite sequence of structural changes, each of which leads to a better flowsheet.

PAGE 104

This procedure ends when no flowsheet can be found which can be generated from the last flowsheet by simple, one-step structural changes and which also has a lower cost. Then we say that we have found a "local minimum". Thus a processing system is devised, analyzed and changed in one or more ways so as to improve it. These steps are repeated until no further improvement in the flowsheet can be made. Such a strategy maintains the good points of the current flowsheet, while the less promising parts of the flowsheet are changed in an effort to improve the overall process. This procedure resembles the hill climbing search techniques used by optimization algorithms to maximize a function of continuous variables. In the synthesis problem we have to deal with discrete variables which can change drastically from one flowsheet to another. As is the case with the optimization of functions of continuous variables, the optimum reached by an evolutionary synthesis is a "local optimum" unless the cost function over the various processes is in some sense unimodal. Furthermore, repeated application of an evolutionary scheme from very different starting flowsheets will reveal more than one local optimum. Also, as King et al. argued, "It is questionable whether one can speak of a global optimum design for an open-ended process synthesis problem, ..."

King, Gantz and Barnes (1972) applied this strategy as a succession of alterations on that portion of the most current flowsheet which could be changed for the greatest advantage. In the same work two examples of evolutionary synthesis were given. In the first example, the synthesis of a demethanizer tower in an ethylene plant, the evolution steps were identified by the designer based on his previous

PAGE 105

experience in this particular processing area. In the second example the computer was used to implement the logic of the evolution steps. In both examples the identification of the portion of the process to be altered was guided by heuristics, while a second level of heuristics was used to guide the selection of the replacement parts. This approach, being highly heuristic, cannot guarantee the optimum solution, but it can "help to structure the thinking of the design engineer" (King, Gantz and Barnes, 1972) and thus produce good solutions.

McGalliard and Westerberg (1972) have proposed an algorithmic approach to screen the alternatives developed quickly, without necessitating the optimization of the entire process. It is based on Lasdon's two-level optimization method and the bounding characteristics of the dual and primal functions. They also pointed out that the basic structural modifications to a certain process can be to replace a unit with a different one, to delete a unit, and to generate a recycle. Thus, since the change is local, the sensitivity information from the current structure is presented via the stream prices (Lagrange multipliers).

Pho and Lapidus (1973) have presented a strategy which systematically generates all possible heat exchange networks for problems of the type defined earlier by Rudd and coworkers. Their work can also be viewed as an evolutionary strategy, where the system continuously evolves to better systems in an organized manner. Thus, for the synthesis of a heat exchange network, starting from a situation with no matches between the streams in question, all alternate first matches are examined. A set of structural changes is then available.

PAGE 106

The most promising is retained and the rest are rejected. In this fashion the synthesis continues in an evolutionary sense, with a simultaneous screening of the various alternatives open at a certain point. The evaluation of the alternatives is heuristic and follows the look-ahead strategy used in chess.

For the solution of particular synthesis problems, specialized techniques have been developed in addition to the general strategies mentioned above. Thus dynamic programming in conjunction with heuristics was used to synthesize separation schemes (Hendry and Hughes, 1972), reaction paths (Powers and Jones, 1973), reactor networks (Aris, 1964) and heat exchange networks (Westbrook, 1951). Hendry, Rudd and Seader (1973) presented a detailed account of all the specialized methods developed for such particular problems as the synthesis of heat exchange networks, separation trains, reaction paths, reactor networks, etc.

PAGE 107

CHAPTER VI
BRANCH AND BOUND STRATEGY WITH LIST TECHNIQUES FOR THE SYNTHESIS OF OPTIMAL SEPARATION SCHEMES

Multicomponent separations arise very frequently in chemical processing systems, and so techniques for the synthesis of multicomponent separation sequences play an important part in the synthesis of optimal processing schemes. In this chapter we intend to develop a scheme that will enable us to relax most of the restrictions inherent in the methods already proposed, while retaining their attractive characteristics. Thus Section VI.1 presents the previous works on the synthesis of optimal separation schemes, while Section VI.2 states the problem and develops the list techniques for the representation of separation trains. Section VI.3 develops the branch and bound strategy, while Section VI.4 presents two illustrative examples of the proposed strategy. Section VI.5 summarizes and discusses the proposed method.

VI.1. Previous Works on the Synthesis of Separation Schemes

With respect to the synthesis of multicomponent separation schemes, two developments are the most significant: the works by Thompson and King (1972) and by Hendry and Hughes (1972). These works represent very distinctly the general trends and attitudes in the synthesis area and, because of their relevance to the present work, will be examined further in some detail.

PAGE 108

The approach proposed by Thompson and King (1972) is highly heuristic in nature. The first step is to identify a feasible set of products for the process. The synthesis begins with the choice of an initial separator. All possible first separations are identified and the cost of each one is estimated. These estimation parameters are updated during the synthesis. The cheapest is selected as the first separator and is rigorously costed. This procedure is repeated to choose a second separator, then a third, and so forth until no further separators are needed. With an updating of the cost estimation parameters, the entire procedure is repeated until no change is found in the process. The identification of alternatives and their selection is guided by heuristics. After the separation sequence has been established, a new one is produced based on a new set of products, and this continues until no lower cost process can be found.

For the prediction of the cost of a unit separating product i from product j using method k, estimation parameters $\beta_{ijk}$ are introduced. The estimated cost of a separator is quite simply $(\beta_{ijk})(N_{ijk})$, where $N_{ijk}$ is the number of stages required for the above separation. Initially all the $\beta$'s take the low value of $1.0/stage in order to permit the consideration of all the viable alternatives. When a separation unit is selected, its $\beta_{ijk}$ is updated using the real cost of the separator:

$$\beta_{ijk} = \text{real cost of separator} / N_{ijk}$$

This approach, being almost entirely heuristic, cannot guarantee the optimality of the separation sequences derived. Its success in producing good solutions quickly depends largely on the success of the heuristics used.
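The cost-estimation rule just described can be sketched in a few lines. This is a minimal illustration rather than Thompson and King's implementation: the class name, the key layout (product i, product j, method k) and the sample numbers are assumptions made here, while the initial value of 1.0 per stage and the update rule (real cost divided by number of stages) come from the text.

```python
from collections import defaultdict


class CostEstimator:
    """Keeps one estimation parameter beta per (i, j, k) separation."""

    def __init__(self, initial_beta: float = 1.0):
        # Every unseen separation starts at the deliberately low initial value.
        self.beta = defaultdict(lambda: initial_beta)

    def estimate(self, key, n_stages: int) -> float:
        """Estimated cost = beta_ijk * N_ijk."""
        return self.beta[key] * n_stages

    def update(self, key, real_cost: float, n_stages: int) -> None:
        """After rigorous costing: beta_ijk = real cost / N_ijk."""
        self.beta[key] = real_cost / n_stages


if __name__ == "__main__":
    est = CostEstimator()
    key = ("A", "B", "distillation")  # hypothetical separation
    print(est.estimate(key, 20))      # optimistic first estimate
    est.update(key, real_cost=5000.0, n_stages=20)
    print(est.estimate(key, 10))      # estimate after one rigorous costing
```

Starting every parameter low keeps all alternatives in play at first; each rigorous costing then makes later estimates for that separation more realistic.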

PAGE 109

Hendry and Hughes (1972) developed a strategy based on the use of dynamic programming. Unlike the previous method, this one is basically an algorithmic approach. In the first step all the two-component separations are evaluated, e.g.

A/B, B/C, C/D, D/E, etc.

The generation of all the three-component separations follows, e.g.

A/BC, AB/C, B/CD, BC/D, etc.

Since the cost for the separation of the remaining two components is known from the first step, the optimal sequence for the separation of three components into three products is easily derived. This procedure can continue up to the number of components in the original mixture. In fact this approach is an exhaustive enumeration over the set of all the possible separators, which incidentally is much smaller than the set of all possible flowsheets. As will be shown later, the above procedure can be improved so that a smaller class of separators needs to be examined.

This approach, since it employs the principle of dynamic programming, cannot be applied to cyclic systems, and it requires large computation time. Furthermore, we must assume that essentially perfect splits take place, to permit the analysis of a unit without consideration of the units preceding it, and that mass separating agents will have to be recovered in the immediate successor unit.

In this chapter we intend to develop a scheme that will enable us to relax most of the above restrictions and assumptions while retaining the attractive characteristics of the various methods.
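The recursion underlying the dynamic programming approach can be sketched for a single ranked list with perfect splits. The code below is an illustrative sketch, not the Hendry and Hughes program: the function name, the toy separator cost (feed size times a made-up difficulty for the adjacent pair being split) and the sample data are all assumptions.

```python
from functools import lru_cache


def best_sequence(ranked, difficulty):
    """Return (total cost, splits) of the cheapest sequence giving pure components.

    ranked     -- tuple of single-letter component names in ranked order
    difficulty -- dict mapping an adjacent pair (light, heavy) to a cost factor
    """

    def separator_cost(i, j, k):
        # Toy cost model (assumption): feed size times difficulty of the split.
        return (j - i + 1) * difficulty[(ranked[k], ranked[k + 1])]

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == j:                      # a single component needs no separator
            return 0.0, ()
        best = None
        for k in range(i, j):           # split between positions k and k + 1
            left_cost, left_splits = solve(i, k)
            right_cost, right_splits = solve(k + 1, j)
            total = separator_cost(i, j, k) + left_cost + right_cost
            split = ("".join(ranked[i:k + 1]), "".join(ranked[k + 1:j + 1]))
            if best is None or total < best[0]:
                best = (total, (split,) + left_splits + right_splits)
        return best

    return solve(0, len(ranked) - 1)


if __name__ == "__main__":
    difficulty = {("A", "B"): 1.0, ("B", "C"): 3.0, ("C", "D"): 2.0}
    cost, splits = best_sequence(("A", "B", "C", "D"), difficulty)
    print(cost, splits)
```

Each sub-mixture is costed once and reused, which is why the work grows with the number of distinct separators rather than with the number of complete flowsheets. With the toy data the difficult B/C split is deferred until its feed is smallest.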

PAGE 110

VI.2. Statement of the Problem and the List Techniques for the Representation of the Separation Operations

The synthesis of separation sequences is a basic subproblem in the overall synthesis of chemical processing systems. The general multicomponent separation problem can be stated as follows:

"Select the proper separations (separation method and point of split) and their sequencing in order to minimize the overall venture cost of a separation process, which separates a mixture of N components into a prespecified number of product streams, using S different separation methods."

The maximum number of alternative processing schemes which can be derived for the above problem is very large and is given by the following formula (Thompson and King, 1972):

$$\frac{[2(N-1)]!}{N!\,(N-1)!}\, S^{N-1}$$

Consider the following mixture of five components A, B, C, D, E which must be separated into its five pure components. The first question we are confronted with is which separation methods will be considered, and how many. The answer comes from a close examination of the physical properties of the components, trying to exploit differences in their physical properties, and from the engineering judgement of the designer, bearing in mind that an increase in the number of separation methods considered causes an immense increase in the number of plausible alternatives. Suppose that two methods are selected, method α and method β. The ranked lists for the two methods are formed. The ranked list is a list of the components of the initial mixture, in

PAGE 111

decreasing order with respect to a physical property exploited by a certain separation method (e.g., boiling point for distillation, solubility for extraction, size of solid particles for screening). Let the ranked lists for methods α and β be:

RL(1): A, B, C, D, E and RL(2): B, D, A, C, E

In developing the separation train two main problems arise: which separation method will be selected, and where the split will be performed. Thus, using method α, the following separators can be generated at the beginning of the separation train:

A/BCDE, AB/CDE, ABC/DE, ABCD/E

It is obvious that by selecting one of the above separators as the first one, the subsequent separators are restricted to a more specific class of separators. Thus a decision made at the first separator is going to affect the performance of the entire process, not only through the performance of the first stage itself but also through the restrictions that the first decision imposes on the subsequent separators. The answer to this dilemma will be given later by the approach developed.

Hendry and Hughes (1972) have proposed the use of list techniques to represent a separation operation. The basis for such a development is the ranked lists. Forming the ranked lists of the components

PAGE 112

for the separation methods considered by the designer, all possible combinations of components which may be formed during the separation process may be derived. Furthermore, all the distinct separators which may be part of a plausible flowsheet can be derived. This result follows from a consideration of the separation operations as "ranked list splitters" (Hendry and Hughes, 1972), each of which takes a ranked list and splits it into two adjacent sections. The components in each of these sections must then undergo further processing, perhaps using a different separation method, until the prespecified products are produced. Thus, the ranked lists define the various ways in which the separations may take place.

The following fact should be pointed out: the number of distinct separators which are generated from the ranked lists for a mixture of N components using S separation methods is much smaller than the number of flowsheets the separators can generate when they are interconnected to form a feasible process. In Table 4, the corresponding numbers for different mixtures are shown with only one separation method considered. As N increases the following limits are approached:

\[ \frac{F(N+1)}{F(N)} \to 4 \qquad \text{and} \qquad \frac{B(N+1)}{B(N)} \to 1 \]

For N ≥ 3 and S ≥ 2 the number of distinct flowsheets is much larger than the number of distinct separators. Thus working within the set of distinct separators is an advantage, since the search space is smaller. Based on the ranked lists for all the separation methods considered, we can develop all the distinct separators which will

PAGE 113

TABLE 4. THE NUMBERS OF DISTINCT SEPARATORS B(N) AND DISTINCT FLOWSHEETS F(N) FOR A MIXTURE OF N COMPONENTS AND ONE SEPARATION METHOD

Number of Components (N) | Number of Distinct Separators B(N) | Number of Distinct Flowsheets F(N)

PAGE 114

constitute the elements of every feasible process. First, all the N-component separators are generated for every ranked list. This step produces N-1 different separators for every separation method considered, and groups of components containing from one up to N-1 components. At the second stage all the separators with (N-1) or fewer components are generated from the component groups derived at the first step, which in turn produce smaller groups. This procedure continues until all the n-component separators (n = 2, 3, ..., N) have been generated. In Figure 11, all the distinct separators for a 4-component mixture are shown with two separation methods, α and β, considered.

Hendry and Hughes (1972) have discussed in detail the various characteristics and consequences of the use of ranked lists. Thus for components with very close boiling points, or solubilities in a certain solvent, etc., it may be reasonable that they be ranked equally, thus implying that they cannot be separated using that method. Similarly, it often happens that a particular separation technique cannot be applied when certain components are present in the mixture. Consequently, the ranked list should not contain those components.

After using the ranked lists to generate all possible separators which will be part of any feasible separation scheme, we shall proceed to develop, in the next section, a strategy which allows the generation of a very small number of "nearly optimum" flowsheets or, when carried even further, the generation of the globally best flowsheet subject to the assumptions defining the problem.
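The list-splitting procedure of this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used in this work: the function names `distinct_separators` and `flowsheet_count` are hypothetical, a separator is represented as a (method, top group, bottom group) triple, and it is assumed that every method ranks every component of the mixture (no equal rankings and no excluded components).

```python
from math import factorial


def distinct_separators(ranked_lists):
    """Enumerate distinct separators as (method, top_group, bottom_group).

    ranked_lists maps a method name to its ranked list of components.
    Assumption: every method ranks every component of the mixture.
    """
    mixture = frozenset(next(iter(ranked_lists.values())))
    pending, seen, separators = [mixture], {mixture}, set()
    while pending:
        group = pending.pop()
        for method, ranking in ranked_lists.items():
            # Restrict the ranked list to the components present in this group.
            ordered = [c for c in ranking if c in group]
            # Every cut between adjacent components is one separator.
            for cut in range(1, len(ordered)):
                top, bottom = frozenset(ordered[:cut]), frozenset(ordered[cut:])
                separators.add((method, top, bottom))
                # Groups with two or more components need further separation.
                for sub in (top, bottom):
                    if len(sub) > 1 and sub not in seen:
                        seen.add(sub)
                        pending.append(sub)
    return separators


def flowsheet_count(n, s):
    """Number of sequences for n components and s methods (Thompson and King, 1972)."""
    return factorial(2 * (n - 1)) // (factorial(n) * factorial(n - 1)) * s ** (n - 1)


if __name__ == "__main__":
    one_method = {"alpha": list("ABCD")}
    two_methods = {"alpha": list("ABCDE"), "beta": list("BDACE")}
    print(len(distinct_separators(one_method)), flowsheet_count(4, 1))
    print(len(distinct_separators(two_methods)), flowsheet_count(5, 2))
```

For a single ranked list of four components the sketch enumerates 10 distinct separators, while the formula of Thompson and King gives 5 sequences. The second call uses the two ranked lists RL(1) and RL(2) of the five-component example.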

PAGE 115

Figure 11. All the Distinct Separators Generated and the Basic Flowsheet for a Fictitious Example of 4 Components Using 2 Separation Methods (panels: 4-Component Separators, 3-Component Separators, 2-Component Separators)

PAGE 116

Figure 11. (continued)

PAGE 117

VI.3. Branch and Bound Strategy

The goal of the branch and bound strategy is to obtain the optimum separation scheme by enumerating as few separators as possible. Hendry and Hughes (1972), with the dynamic programming scheme they employed, have to search over the whole set of distinct separators. The branch and bound strategy of the present method is based on the bounding properties of the dual and the primal return function evaluations. These bounding characteristics were used by McGalliard and Westerberg (1972) to determine whether a feasible modification to a given feasible structure will improve the system return without having to optimize fully the modified system. In the present work the primal and dual values are used in connection with a branch and bound scheme to restrict the search space of distinct separators and to yield eventually the optimum structure. It is a step by step building procedure of the optimum flowsheet.

Since the two-level optimization method (Lasdon, 1964) is the theoretical foundation of the proposed strategy, let us describe this method by considering the following sequential problem (for more details on the two-level method see Chapter II):

\[ \text{Minimize} \quad \phi = \sum_{i=1}^{n} \phi_i(x_i,u_i) \qquad \text{(VI-P1)} \]
\[ \text{s.t.} \quad x_1 = x^0 \ \text{(given)} \]
\[ x_{i+1} = f_i(x_i,u_i), \qquad i = 1,\dots,n \]
\[ (x_i,u_i) \in S_i, \qquad i = 1,\dots,n \]

PAGE 118

The Lagrangian of this problem is

\[ L = \sum_{i=1}^{n} \ell_i(x_i,u_i,\lambda_i,\lambda_{i+1}) + \lambda_1^T x^0, \qquad \ell_i = \phi_i - \lambda_i^T x_i + \lambda_{i+1}^T f_i \qquad \text{(VI-1)} \]

where the natural boundary condition on $\lambda_{n+1}$ yields $\lambda_{n+1} = 0$. Next we define the problem (VI-P2) as

\[ h(\lambda_1,\dots,\lambda_n) = \sum_{i=1}^{n} \min_{(x_i,u_i) \in S_i} \ell_i + \lambda_1^T x^0 \qquad \text{(VI-P2)} \]

and finally the dual problem (VI-P3)

\[ \text{Maximize} \quad h(\lambda_1,\dots,\lambda_n) \qquad \text{(VI-P3)} \]

The two-level method involves solving (VI-P3) by first guessing the multipliers and then iterating until $h(\lambda_1,\dots,\lambda_n)$ is maximized. The following theorem (Lasdon, 1970) establishes the dual function as a lower bound of the optimum value:

Theorem: $h(\lambda_1,\dots,\lambda_n) \le \phi(x_1,u_1,x_2,u_2,\dots,x_n,u_n)$ for all $(x_i,u_i)$, $i = 1,\dots,n$, satisfying the constraints of problem (VI-P1) and for all $\lambda_i \in D$, $i = 1,\dots,n$, where D is the domain of definition of the Lagrange multipliers.
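The bounding property stated in the theorem can be checked numerically on a small problem. The sketch below is an illustration only, under several assumptions that are not part of this work: a two-stage system with scalar states, small discrete decision sets, and made-up cost functions `phi1` and `phi2`. The feed state is fixed at its given value, so only the multiplier on the connection between the two stages appears.

```python
import itertools

X0 = 1.0                              # given feed state x_1 = x^0
U = [0.0, 0.5, 1.0, 1.5, 2.0]         # admissible decisions u_1 and u_2
X2 = [X0 + u for u in U]              # admissible stage-2 inputs (covers every f1 value)


def f1(x, u):
    return x + u                      # stage-1 output, which must equal x_2


def phi1(x, u):
    return u ** 2                     # hypothetical stage-1 cost


def phi2(x, u):
    return (x + u - 4.0) ** 2 + 0.5 * u   # hypothetical stage-2 cost


def primal_optimum():
    """Minimum of phi1 + phi2 with the coupling x_2 = f1(x_1, u_1) enforced."""
    return min(phi1(X0, u1) + phi2(f1(X0, u1), u2)
               for u1, u2 in itertools.product(U, U))


def dual(lam):
    """h(lam): each stage minimizes its own sublagrangian independently."""
    stage1 = min(phi1(X0, u1) + lam * f1(X0, u1) for u1 in U)
    stage2 = min(phi2(x2, u2) - lam * x2 for x2, u2 in itertools.product(X2, U))
    return stage1 + stage2


if __name__ == "__main__":
    best = primal_optimum()
    for lam in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"lambda={lam:+.1f}  h={dual(lam):.3f}  primal optimum={best:.3f}")
```

Whatever multiplier is tried, the printed value of h should not exceed the primal optimum, because the optimal primal point is itself a candidate in each of the two independent minimizations.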

PAGE 119

A direct consequence of the above theorem is the following inequality (lower bound)

\[ h(\lambda_1,\dots,\lambda_n) \le \phi(x_1^{*},u_1^{*},\dots,x_n^{*},u_n^{*}) \]

where $x_i^{*}, u_i^{*}$, $i = 1,\dots,n$, is the optimum solution of problem (VI-P1). Furthermore, it is obvious that (upper bound)

\[ \phi(\hat{x}_1,\hat{u}_1,\dots,\hat{x}_n,\hat{u}_n) \ge \phi(x_1^{*},u_1^{*},\dots,x_n^{*},u_n^{*}) \]

where $(\hat{x}_1,\hat{u}_1,\dots,\hat{x}_n,\hat{u}_n)$ satisfy the constraints of (VI-P1). The lower bound will be called the dual bound and the upper bound the primal bound.

Construction of the optimum flowsheet begins by defining a basic flowsheet. The basic flowsheet can be any separation train which produces the desired products. If the same separation problem has been solved in the past, then the flowsheet available in the literature can be the basic flowsheet. If it is a new problem, then the basic flowsheet can be developed as follows. For given Lagrange multipliers the sublagrangians $\ell_i$, as defined in equation (VI-1), are minimized with respect to the input and decision variables for every N-component separator. As the first separator we retain the separator with the smallest value of $\ell_i$. In the 4-component example of section VI.1, let (A/BCD) be the first separator thus selected. The second separator can be any of the separators 7, 8, 9 or 10 in Figure 11. Following the same procedure we choose separator 7, (B/CD), as the second separator, with the smallest $\ell_i$. In the same way we choose separator 24, (C/D), as the last separator. Then the basic flowsheet is given in Figure 11A. Developing the basic flowsheet we

PAGE 120

have used the heuristic "choose the potentially cheapest as the next separator." Thompson and King have proposed essentially this heuristic, and its implications when using the infeasible decomposition strategy of the two-level optimization method will be discussed later. The sum $h_1 = \ell_1 + \ell_2 + \ell_3$ is a dual bound to the optimum value for the basic flowsheet. Next we compute a primal bound for this flowsheet. Let this be $\theta_1$. Thus if $\theta^{*}$ is the minimum cost for the basic flowsheet we have

\[ h_1 \le \theta^{*} \le \theta_1 \]

Once the basic flowsheet and its dual and primal bounds are established, the screening of the other alternate structures proceeds in the following systematic way. If any of the 4-component separators 2, 3, 4, 5 or 6 has a sublagrangian $\ell_i > \theta_1$, all the flowsheets starting from this separator are disregarded since they will incur a greater cost. Thus we develop a tree, each branch of which proceeds as long as the sum of the sublagrangians of the separators contained in this branch does not exceed the primal bound $\theta_1$ of the basic flowsheet. In Figures 14 and 15 this strategy is demonstrated in the n-butylene purification example of 6 components using two separation methods, distillation (method α) and extractive distillation (method β). At the end of the above procedure a small number of flowsheets is retained, all of which have a dual bound lower than $\theta_1$. The optimal flowsheet is a member of this family of flowsheets.

At this point all the retained flowsheets are further examined by the designer. All of the flowsheets may be considered as candidates for the final process and the choice of one of them may be based on many other factors such

PAGE 121

as operability, controllability, etc. If desired, further screening of the flowsheets to produce the optimum proceeds in the following manner. The flowsheet with the lowest dual bound is established as the basic flowsheet and a new primal bound is computed. If the dual bound of any of the remaining flowsheets exceeds the primal bound of the new basic flowsheet, this flowsheet is rejected. By continuously updating the dual bound of the basic flowsheet through a change in the Lagrange multipliers, the gap between the primal and the dual bounds of the current basic flowsheet becomes smaller and the screening procedure more effective. In this manner we can continue until very few, and probably only one, flowsheet is left which, because of the procedure followed, is the "global" optimum. The mathematical rigor of the two-level optimization method guarantees this result.

Additional flexibility can be introduced at the first level of screening by changing the basic flowsheet at the proper time. If, for example, an attractive alternate structure is established during the first screening, a new basic flowsheet may be established based on this flowsheet. A new primal bound is computed and hopefully is lower than the primal bound of the previous basic flowsheet. Only under this condition would the change in the basic flowsheet be useful. It is obvious that such a variation in the initial strategy is highly heuristic. The decision that a certain structure will yield a better primal bound, and therefore will be a better basic flowsheet, depends largely on the intuition and the experience of the designer. Despite the heuristic nature of such decisions, the mathematical

PAGE 122

rigor of the overall strategy is not lost, since the actual screening is based on the rigorous upper and lower bounds of the optimum costs.

The set of all the distinct separators can also be reduced with the use of certain heuristics. If a certain split is very easy using distillation, the separator with the same split but using extractive distillation is not considered (Hendry and Hughes, 1972). Furthermore, certain separators are physically infeasible; e.g., the feed to an extractive distillation column is immiscible or the two streams in an extraction unit are miscible. Finally, rejection may result if, during the simulation of the separation units, difficulties in the performance of units are encountered, such as very large internal flows, very high temperatures (Thompson and King, 1972), etc. In all the above cases the separators are excluded either in advance or during the construction of the flowsheets.

The success of the branch and bound strategy depends largely on the proper selection of the Lagrange multiplier values and the efficient estimation of a primal bound for the basic flowsheet. The problem of the synthesis of a separation sequence has certain characteristics which can make it insensitive to the value of the Lagrange multipliers. If the inputs and the outputs for each unit are constrained to a small range of acceptable values, then whatever the values of the Lagrange multipliers are, the range of possible values for the overall dual bound is very limited. This limitation in values can be understood better by considering the simple two-stage example in Figure 12. The dual bound is $h(\lambda)$ defined by

PAGE 123

PAGE 124

\[ h(\lambda) = \min\{\phi_1(x_1,u_1) + \lambda y_1 \mid y_1 = f_1(x_1,u_1),\ y_1 \in Y\} + \min\{\phi_2(x_2,u_2) - \lambda x_2 \mid y_2 = f_2(x_2,u_2),\ x_2 \in Y\} \]

Since both $y_1$ and $x_2$ are restricted to a very small allowable set of values, Y, the contribution $\lambda(y_1 - x_2)$ to the dual bound is very small regardless of the value of $\lambda$. This characteristic was exploited in this work and the Lagrange multipliers were set equal to zero. A comparison of the values of the $\ell_i$'s with $\lambda = 0$ and the values of the objective functions $\phi_i$ for the optimal primal solution showed no significant difference, justifying this approach.

The value of the primal bound of the basic flowsheet is also very important for an effective branch and bound strategy. It must lie as close as possible to the dual bound for an effective screening. This cannot be guaranteed by an arbitrary set of input and control variables for each unit. Thus repeated iterations may be needed to establish a good primal bound. In the case that a similar separation sequence has been synthesized in the past, the optimum cost for that sequence will be a very good primal bound.

VI.4. Examples

In order to illustrate the strategy proposed in the previous section, two separation problems were tackled. These examples were taken from the existing literature in order to allow for comparison and also to lend a certain realism to the procedure adopted.
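The screening step of the strategy of Section VI.3 can be sketched as a depth-first tree search. This is a minimal sketch and not the program used for the examples: the name `branch_and_bound` and the callable `sublagrangian` are hypothetical, the sublagrangian minima are assumed to be nonnegative (so that a partial sum which already exceeds the primal bound can never recover), and the selection of the basic flowsheet itself is left out.

```python
def branch_and_bound(mixture, ranked_lists, sublagrangian, primal_bound):
    """Return (dual_bound, separators) for every complete sequence whose
    sum of sublagrangians does not exceed primal_bound.

    sublagrangian(sep) is assumed to return the minimized sublagrangian of
    separator sep = (method, top_components, bottom_components).
    """
    retained = []

    def expand(pending, chosen, bound):
        # Single-component groups are finished products; drop them.
        pending = [g for g in pending if len(g) > 1]
        if not pending:
            retained.append((bound, chosen))
            return
        # Always split the first pending group, so each flowsheet appears once.
        group, rest = pending[0], pending[1:]
        for method, ranking in ranked_lists.items():
            ordered = [c for c in ranking if c in group]
            for cut in range(1, len(ordered)):
                sep = (method, tuple(ordered[:cut]), tuple(ordered[cut:]))
                new_bound = bound + sublagrangian(sep)
                if new_bound > primal_bound:
                    continue  # this branch cannot beat the basic flowsheet
                expand(rest + [frozenset(sep[1]), frozenset(sep[2])],
                       chosen + [sep], new_bound)

    expand([frozenset(mixture)], [], 0.0)
    return sorted(retained, key=lambda item: item[0])


if __name__ == "__main__":
    # Hypothetical cost: proportional to the number of components fed.
    def feed_size(sep):
        return float(len(sep[1]) + len(sep[2]))

    for bound, seps in branch_and_bound("ABCD", {"alpha": "ABCD"}, feed_size, 8.0):
        print(bound, seps)
```

With the made-up cost above and a primal bound of 8, only the sequence that first splits (AB/CD) survives the screening; with an infinite primal bound all five sequences of the 4-component, one-method problem are retained.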

PAGE 125

VI.4.a. Example 1: n-butylene purification system

This example was taken from Hendry and Hughes (1972) and its purpose is to illustrate the approach developed in the previous sections. The composition of the initial mixture, the flowrates and the destination of each component are given in the work mentioned above and are shown in Table 5. Two separation methods will be considered, distillation and extractive distillation with a furfural-water mixture (4% water by weight). For the simulation of the separation units, the short-cut programs developed by Thompson and King (1972) were used. The thermodynamic routines from the same work were used to calculate the thermodynamic properties of the above hydrocarbon mixture. The objective function to be minimized is the venture cost given by:

V = 0.5 * (OP) + 0.25 * (CAP)

where
V = venture cost, $/yr.
OP = operating cost, $/yr.
CAP = capital cost, $

PAGE 126

TABLE 5. INITIAL FEED TO THE n-BUTYLENE PURIFICATION SYSTEM

Component | Mole Fraction | Destination
Propane

PAGE 127

The optimization variables for each unit were the pressure and the reflux ratio factor as the decision variables of a column, and the temperature and pressure of the feed to the unit as the input variables. The ranked lists for the two separation methods are as follows:

Distillation (method α): C3, B1, NB, B2T, B2C, C5
Extractive Distillation (method β): C3, NB, B1, B2T, B2C, C5

The following heuristic was used (Hendry and Hughes, 1972) to reduce the number of possible separators: "Because of the relatively small amounts of propane and pentane present in the mixture and the ease of separation of these components from others by using straight fractionation, extractive distillation splits with either of these components as keys are prohibited." 64 distinct separators can be generated. In Figure 13 the generation of the basic flowsheet is shown, while in Figures 14 and 15 the branch and bound strategy for this example is developed starting with the nodes which were not rejected after the basic flowsheet was established. The numbers in a circle in Figures 13, 14 and 15 correspond to the distinct separators shown in Figures 16a, 16b and 16c. At the end four flowsheets (including the basic), shown in Figure 17, were retained. From a total of 64 distinct separators only 43 had to be optimized. Further screening shows that flowsheet (III), Figure 17, is the optimum with a cost of 861,400 $/yr. This is the same solution found by Hendry and Hughes. We should point out that separate optimization of the flowsheets retained yields the results shown in Table 6. We can notice the relatively small differences in cost among the four retained flowsheets. Thus for engineering purposes we could have accepted any of the 4 as the optimum or nearly optimum flowsheet.

PAGE 128


PAGE 129


PAGE 130


PAGE 131


PAGE 132


PAGE 133


PAGE 134


PAGE 135


PAGE 136

Figure 17. Nearly Optimum Flowsheets Retained at the End of the Branch and Bound Synthesis of the n-Butylene Purification System

PAGE 137

TABLE 6. GENERATED FLOWSHEETS AND THEIR MINIMUM COSTS

Flowsheet    Optimum Cost ($/yr)
I            884,828
II           877,512
III          851,400
IV           869,476

PAGE 138

VI.4.b. Example 2: Olefins, paraffins separation system

This example was taken from Thompson and King (1972) and is a separation problem of olefins and paraffins into pure products. The specifications of the initial feed and the desired products are shown in Table 7. Two separation methods were considered: distillation (method α) and extractive distillation with tetrahydrofuran (method β). For the calculation of the thermodynamic properties and the simulation of the distillation and the extractive distillation units the same routines as in the previous example were used. The objective function remained the same, i.e., the venture cost, and the optimization variables were as in the previous example. The ranked lists for the two methods are as follows:

RL (method α): C2, P1, C3, B1, C4, C5
RL (method β): C2, C3, P1, C4, B1, C5

Applying the same procedure of the branch and bound strategy, 70 distinct separators were generated. Only 34 had to be optimized; see Figures 18a, 18b and 18c. Figure 19 shows the generation of the basic flowsheet, which is flowsheet (V); see Figure 20. The bounds on this flowsheet are:

dual bound = 602,760 $/year
primal bound = 791,804 $/year

At the end the flowsheets shown in Figure 20 were retained (besides the basic). Figures 21, 22 and 23 show the flowsheets generated (retained and rejected) during the branch and bound strategy from the open nodes 1, 2, 3 and 4 after the basic flowsheet was established.

PAGE 139

TABLE 7. SPECIFICATIONS OF THE INITIAL FEED AND THE DESIRED PRODUCTS FOR EXAMPLE 2

Feed rate = 100 lb moles/hr.

Component

PAGE 140


PAGE 142


PAGE 143


PAGE 144


PAGE 145

Figure 20. Nearly Optimum Flowsheets Retained at the End of the Branch and Bound Synthesis of the Olefins, Paraffins Separation System (dual bounds legible in the figure include 602,760 for flowsheet (V), 642,058 and 685,632)

PAGE 146


PAGE 147


PAGE 148


PAGE 149

All the retained flowsheets are nearly optimum. Further screening yielded flowsheet (V) as the optimum one.

VI.5. Discussion

Use of the primal and dual bounds in a branch and bound strategy, along with a list processing representation of the separation processes, leads to an efficient algorithmic procedure for the synthesis of a small number of nearly optimum multicomponent separation sequences or, furthermore, for the synthesis of the "globally optimum" separation sequence. The algorithmic nature of the procedure guarantees the optimality of the synthesized sequence, unlike the heuristic methods (Thompson and King, 1972; Siirola et al., 1971).

No major assumptions concerning the serial nature of the separation operations are required, as is the case with dynamic programming. For an extractive distillation unit, the removal of the solvent does not have to take place in the next step, and the recycling of certain streams in a separation sequence can be easily handled. Thus, mass separating agents can be handled as normal products.

In Examples 1 and 2, the composition of the feed to a separator has been considered known, which implies that perfect separation has been assumed. This restriction is easily relaxed. Since the products must follow certain purity specifications, the range of compositions each component can have in the feed mixture to a unit is very limited. Thus relaxing the restriction of specified compositions for the feed to each unit does not appreciably increase the difficulty of the optimization of each separator.

PAGE 150

The computation time required by the new procedure is reduced compared to the time required by a dynamic programming scheme, which requires a complete search over the whole set of the distinct separators. (For Example 1 only 43 separators from 64 possible were optimized to produce 4 nearly optimum flowsheets, while for Example 2, 36 separators from 70 possible were optimized to generate 5 flowsheets.)

The basic characteristic of the proposed approach is the capability to produce alternate "nearly optimum" separation sequences quickly and by examining only a subset of the set of all the possible separators. As Example 1 illustrated, all the generated sequences differ very little in their minimum cost. Any one of them is almost as good as any other, and the selection of the most suitable sequence can be based on other factors that the designer considers important besides the preliminary economics of the unit. At the same time the procedure can, if desired, develop the economically best sequence by screening the final set of the retained flowsheets.

The strategy proposed by Thompson and King (1972) consisted of a set of decisions in a sequence and at different stages. Thus first a decision was made for the first separator, then for the second separator, and so forth. Despite the iterative character of the procedure, it never assesses the implications of a decision on the subsequent decisions. This characteristic was of course recognized by them and defended because of the desired speed of their method. Unlike this procedure, the branch and bound strategy proposed in this work effectively considers the consequences of a decision on the following decisions, so that the final process is an optimal entity.

PAGE 151

CHAPTER VII
EVOLUTIONARY SYNTHESIS OF PROCESS FLOWSHEETS

In this chapter we try to systematize the logic of evolutionary synthesis. The evolutionary procedure is presented as four subtasks: finding an initial structure, creating a set of rules on how to alter structures, developing a strategy to apply these rules, and finally providing a means to compare structures which are generated. Thus, a sequence of structural modifications based on certain evolutionary rules will ultimately lead to a locally optimum flowsheet. The advantages and disadvantages of the approach will be pointed out and finally all the ideas will be illustrated on the synthesis of separation sequences.

VII.1. A General Philosophy on Evolutionary Synthesis

We shall now consider in some detail the synthesis of optimal structures by evolving from inferior ones through small modifications. Given this goal one must partition the problem into a set of subproblems for which one can develop solutions. We have partitioned the evolutionary synthesis task into the following four subtasks, which will be discussed in more detail.

1. To evolve to improved structures, one must have an initial structure which satisfies the design goals.
2. Given an initial structure one needs a set of rules to modify a structure, creating what we shall term its "neighboring" structures.

PAGE 152

3. Having the rules to generate modifications, the efficiency of the method will hinge on the strategy used to apply them to evolve to improved structures.
4. Finally one needs a means by which to compare structures to decide whether one represents an improvement over another.

We shall now try to deal with these subtasks directly and propose certain guidelines which can be used to satisfy them.

The initial structure. Attempting to locate an optimal structure by evolving to it bears many similarities to optimizing a function of continuous variables. The initial structure corresponds to the initial point and, as our experience with the latter problem shows, can have a profound effect on the success and efficiency of the method. The better the initial flowsheet, the closer we are to the optimum solution and therefore the faster we attain it. But the development of a good initial flowsheet is not a trivial problem. In fact many of the proposed synthesis strategies deal with this problem. The AIDES system (Siirola and Rudd, 1971; Siirola, Powers and Rudd, 1971), for example, develops alternate good initial flowsheets.

Two general approaches exist which can be used to determine a good starting flowsheet: the algorithmic and the heuristic. The first is mathematically rigorous for the conditions stated for the problem and guarantees to produce very reliable alternatives. The branch and bound strategy of Chapter VI can be viewed as an algorithmic procedure to generate good initial flowsheets. In this case it guarantees that the optimum is a member of the family of the flowsheets generated while any of the other members of this family is

PAGE 153

nearly optimum. The algorithmic approach requires extensive computation times. The second, the heuristic approach, does not guarantee any result in a mathematically rigorous manner but tends to produce very good initial flowsheets quickly. Based on rules of thumb which are a result of past experience, ingenuity, and intuition on the part of the designer, heuristics are effective and are likely the more efficient way to develop good starting flowsheets for the evolutionary approach. In case the same process has been synthesized in the past, the available literature can provide a suitable flowsheet to start the evolutionary strategy. This is because existing plants, although they are not optimal, are structured and operate with a high degree of effectiveness and efficiency. The generation of the initial flowsheet resembles the opening part of the game of chess, where the movements are guided by experience and have been well studied.

Once a starting flowsheet is available, the evolution to better flowsheets can begin. Rules are needed to guide the evolution in two different ways: first to develop the possible structural changes that are permitted in the current flowsheet, and second to guide the evolution in the most promising direction. We will call the first set of rules evolutionary rules, while the second set of rules comprises the evolutionary strategy.

The evolutionary rules. Consider a space containing a node for every possible structure which is expected to meet the design goals. Figure 24A shows such a space. Two nodes will be close to each other in this space if they are judged to be very similar, having only a single "small" structural difference between them. Then the evolutionary

PAGE 154

Figure 24. An Illustrative Diagram Showing Two Alternate Sets of Evolutionary Rules for a Family of Flowsheets. Panel (A). Nodes: flowsheets; edges: evolutionary rules.

PAGE 155

138 rules we wish to invent can be viewed as a set of edges joining these nodes and forming a network as in Figure 24A. Thus the basic notion of the neighboring flowsheet is developed. A flowsheet B will be called neighboring flowsheet to a flowsheet A if B has been generated from A with the smallest possible number of meaningful structural modifications to A. Thus around flowsheet A there is a family of flowsheets with the smallest number of structural differences from A; these are the neighboring flowsheets of A. We will view the evolutionary rules as the guidelines needed to develop all the neighboring flowsheets of the current flowsheet. Figures 24A and B show the generation of the neighboring flowsheets with two different sets of evolutionary rules. Figure 25 shows also the generation of close neighbors using simple structural modifications. With this discussion we are motivated to require the set of evolutionary rules invented have at least the following properties. 1. Efficiency: the rules should invent legitimate candidate flowsheets only and not ones which are infeasible or undesired for other reasons, such as requiring technology we wish to avoid or which we do not have. For the network of nodes representing the feasible flowsheets, this requirement means no edges lead out of a node which fails to connect to another node. 2. Completeness: to guarantee by repeated application, the potential to generate all possible flowsheets for the system under examination. This property says that the network (graph) representing the structures as nodes and the rules as edges must be connected. 3. Reversibility: the rules should satisfy the requirement that if flowsheet A is a neighbor of flowsheet B, then B is a neighbor of A.

PAGE 156

Again the graph has a corresponding property; that is, the graph should be undirected. The last property we are inclined to impose cannot be proved but certainly seems desirable.

4. Intuitive reasonableness: the rules should connect structures which a design engineer considers to be actual "neighbors". The rules of the network in Figure 24A are appealing in this sense whereas the ones in Figure 24B are not.

The first three properties just stated guarantee that, starting from any feasible flowsheet, we can generate in a reasonable manner all possible flowsheets which are feasible and meaningful. Thus by evaluation of all the alternatives one could locate the optimum solution. However, we wish to examine here whether a properly selected scheme can lead from the starting flowsheet to the optimum by enumerating only a small subset of the set of all the flowsheets, and the fourth property appeals to one's intuition as a helpful one for this goal.

The evolutionary strategy. The evolutionary strategy is one that will hopefully lead the designer to the optimum flowsheet by an effective use of the evolutionary rules. As such it must be fast and yield good, if not optimum, results. There are different variations that an evolutionary strategy can follow, depending largely on the search time that can be spent and the quality of the solution that is desired. The general tendency is to move from a flowsheet to one of its neighboring flowsheets, and in particular to the most promising one. Several approaches suggest themselves.

1. One might generate, using the evolutionary rules, all neighboring flowsheets to the current flowsheet, and size and cost them rigorously.

PAGE 157

139

PAGE 158

141 Among the alternatives, the one effecting the greatest improvement would be chosen as the next current flowsheet.

2. Another approach is to generate all the neighboring flowsheets but to use heuristics to choose the most likely best next flowsheet. This flowsheet is evaluated and, if it represents an improvement, is chosen as the next current flowsheet. The other flowsheets would not be evaluated. If it fails, the second most likely flowsheet is investigated, and so forth. Quite good heuristics are usually available for chemical process design and this approach could prove very effective.

3. The evolutionary rules could be applied selectively, applying only some until a locally optimum structure is found using only them. Then the other rules could be applied to see if they lead to improvements. This approach is rather like a univariate search for continuous variable optimization.

4. If the total number of flowsheets is small, or perhaps only when investigating a structure which appears to be locally optimum, one could evaluate not only neighbors but also neighbors of the neighbors (second level of neighbors), and so forth. This idea is suggested by the look-ahead strategy in chess, where as many look-ahead steps are investigated as time and space permit. Pho and Lapidus (1973) also used this idea to generate networks of heat exchangers as nodes of a tree and search for the optimum.

Evaluation of the Flowsheets. The last subtask into which we partitioned evolutionary synthesis is providing a means to evaluate flowsheets so they may be compared. Again many options are possible, and we shall suggest a few.

PAGE 159

142 1. Evaluate the flowsheet at a reasonable set of values for the decision variables only. This corresponds to a case study approach and is useful if large differences occur in the results.

2. Optimize each structure by adjusting the decision variables to minimize the objective function. Clearly considerable effort is required here, but the fact that neighboring flowsheets are only slightly different may provide good starting conditions for this optimization.

3. Use primal and dual bounding on each structure as suggested by McGalliard and Westerberg (1972). These bounds provide a range within which the optimum value of the objective function must occur for a flowsheet. Nonoverlapping ranges permit easy comparison. The interesting feature here is that a quite good dual bound for a slightly modified structure is often easy to obtain, requiring only that the modified portion of the structure be analyzed.

Again a mixture of approaches is possible. If a case study approach indicates a major reduction in the objective function, then it can be used until the screening becomes more difficult. Then more and more precise evaluations can be used. Only the final structure may require a detailed optimized evaluation of it and its neighbors to prove that it is the best locally.

Whichever of the above strategies is adopted, better and better flowsheets are continuously synthesized until a local optimum is found. This same procedure can be repeated from different starting points. Depending on the nature of the problem one or more local solutions may result.
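The first of the strategies listed above (generate every neighbor, evaluate each, move to the best one while it improves on the current flowsheet) can be sketched as a simple descent over a neighbor graph. This is a minimal illustration, not the procedure used in this work: the function name `evolve`, the toy neighbor table and the toy costs are all hypothetical, and any flowsheet representation with a neighbor generator and a cost evaluator could be substituted.

```python
def evolve(start, neighbors, cost):
    """Move repeatedly to the cheapest neighbor until none improves.

    `neighbors(f)` returns the neighboring flowsheets of f and `cost(f)`
    returns its evaluated cost; both are supplied by the caller.
    Returns the local optimum, its cost, and the path followed.
    """
    current, current_cost = start, cost(start)
    path = [current]
    while True:
        candidates = [(cost(n), n) for n in neighbors(current)]
        if not candidates:
            return current, current_cost, path
        best_cost, best = min(candidates, key=lambda pair: pair[0])
        if best_cost >= current_cost:
            # No neighbor is cheaper: a local optimum has been reached.
            return current, current_cost, path
        current, current_cost = best, best_cost
        path.append(current)


# Hypothetical four-flowsheet network (undirected, connected) and costs.
TOY_NEIGHBORS = {
    "s0": ["s1", "s2"],
    "s1": ["s0", "s3"],
    "s2": ["s0", "s3"],
    "s3": ["s1", "s2"],
}
TOY_COST = {"s0": 10.0, "s1": 8.0, "s2": 7.0, "s3": 9.0}

best, best_cost, path = evolve("s0", TOY_NEIGHBORS.__getitem__, TOY_COST.__getitem__)
print(best, best_cost, path)
```

Restarting `evolve` from several starting flowsheets corresponds to the repeated searches mentioned above, which may return more than one local solution.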

PAGE 160

143 VII.2. Evolutionary Synthesis of Optimal Multicomponent Separation Sequences

In this section the previously developed notions of evolutionary synthesis will be illustrated on the synthesis of multicomponent separation sequences. The problem of synthesizing such a system was formulated in Chapter VI. Representation techniques for separation schemes, the nature of the neighboring flowsheets, the evolutionary rules and the evolution strategy will be developed here in the framework of evolutionary synthesis.

VII.2.a. Representation of separation sequences as binary trees

The notion of a tree is not new. It has been used extensively in different areas. Formally we will define a binary tree as a "finite set of nodes which either is empty, or consists of a root and two disjoint binary trees", which will be called the upper and the lower subtrees of the root. A separation sequence can be represented as a binary tree. Every separator receives one feed and produces two product streams which either are final products or have to be treated further. Thus the set of separators in a separation sequence satisfies the definition of a binary tree as given here. Consider the separation sequence in Figure 26A. The binary tree corresponding to this flowsheet is given as Figure 26B, where α represents a separator using separation method α and β represents a separator using separation method β. The intermediate nodes of a binary tree will represent separation operators, while the terminal nodes represent basic products (to be called basic operands) which are to be produced from the initial mixture. The edges correspond to streams connecting the different separators. The representation of

PAGE 161

144 Figure 26. A Separation Sequence (A), Its Corresponding Binary Tree (B) and the Skeleton Structure (C) Corresponding to This Tree

PAGE 162

145 a separation scheme by a binary tree is a compact and computationally very useful approach. Binary trees or their equivalent polish strings are easily handled in a computer. If we replace all operators by an asterisk (*) and all basic operands by a dot (.) in a tree, we shall define the resulting tree as the "skeleton structure" equivalent to the original tree. Figure 26C represents the skeleton structure of the tree in Figure 26B. We shall also define the "up operand count" for an operator as the total number of basic operands in the upper subtree emanating from that operator and the "down operand count" as the total number of basic operands in its lower subtree. These counts are given on the tree branches in Figure 26B. For a binary tree representing a separation train the up operand count denotes how many components are in the top product stream for that separator, and the down operand count how many components are in the bottom product stream.

Furthermore, if we convert a given skeleton structure back to an actual separation scheme by specifying the kind of separator used in each intermediate node, the positions of the various components of the initial mixture on the binary tree are automatically specified through the use of the ranked lists. For example, if the ranked lists for methods α and β are:

RL(α): ABCDE and RL(β): BDAEC

the distribution of the components A, B, C, D and E on the tree in Figure 26B is fixed as shown. Thus, operator α1 has one up operand

PAGE 163

146 and four down operands. From the ranked list for separation method α these are (A) and (B,C,D,E) respectively. Similarly, for operator β2, which receives the mixture B, C, D and E, there are 2 up operands and 2 down operands, and from the ranked list for method β these must be (B,D) and (E,C) respectively. In the same way we find the operands for operators α3 and β4.

VII.2.b. Neighboring flowsheets and the evolutionary rules for a separation sequence

Given a certain separation scheme, all the neighboring flowsheets must be generated. For a separation sequence a neighboring flowsheet must retain as much as possible of the current structure and differ only in a small portion of it (Property 4). Usually this type of change is localized by considering two consecutive separators. Consider the binary tree in Figure 27A. Operator α4 performs the split ...D/E and operator α3 the split ...C/D. We can make a simple modification to the above structure by exchanging the relative positions of operators α4 and α3, and therefore the relative position of these two splits in the overall structure. By doing so the structure in Figure 27B results. Comparing these two structures we notice that they differ as a result of one simple structural modification in an isolated part of the overall structure. We shall therefore consider these two flowsheets to be neighbors of each other. We shall refer to structure 27B as a down neighbor to structure 27A because the two separators interchanged are in a "downward" relationship to one another. Property 3 for our evolution rules requires that if structure 27B is a neighbor to 27A, then 27A is a neighbor to 27B. In particular we shall call it an up neighbor.

PAGE 164

147 Figure 27. Flowsheet (A), a Down Neighbor (B) to It, and a New Separator Type Neighbor (C) to It
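The way the ranked lists fix the position of every component once the skeleton structure and the separator types are chosen (section VII.2.a) can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple encoding `(method, lower_subtree, upper_subtree)`, the names `place_components`, `operand_count`, `products` and `to_polish`, and the use of "a" and "b" for methods α and β are all hypothetical. The skeleton is the one implied by the operand counts quoted for Figure 26B.

```python
# Ranked lists from the text; "a" and "b" stand for methods alpha and beta.
RANKED = {"a": "ABCDE", "b": "BDAEC"}

# Hypothetical encoding: "." is a basic operand; an operator is a tuple
# (method, lower_subtree, upper_subtree).
SKELETON = ("a", ("b", ("b", ".", "."), ("a", ".", ".")), ".")


def operand_count(tree):
    """Number of basic operands (terminal nodes) in a subtree."""
    return 1 if tree == "." else operand_count(tree[1]) + operand_count(tree[2])


def place_components(tree, feed):
    """Replace every '.' by the component that the ranked lists force there."""
    if tree == ".":
        (component,) = feed  # exactly one component must remain
        return component
    method, lower, upper = tree
    ordered = sorted(feed, key=RANKED[method].index)
    n_up = operand_count(upper)  # the "up operand count" of this operator
    return (method,
            place_components(lower, ordered[n_up:]),
            place_components(upper, ordered[:n_up]))


def products(tree):
    """All components contained in a placed subtree, as a string."""
    return tree if isinstance(tree, str) else products(tree[1]) + products(tree[2])


def to_polish(tree):
    """Prefix listing, lower subtree first (an assumed convention)."""
    if isinstance(tree, str):
        return [tree]
    return [tree[0]] + to_polish(tree[1]) + to_polish(tree[2])


placed = place_components(SKELETON, "ABCDE")
print(placed)
print(" ".join(to_polish(placed)))
```

With these ranked lists the root separator sends (A) up and (B,C,D,E) down, and the β separator below it sends (B,D) up and (E,C) down, which agrees with the operand sets stated in the text.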

PAGE 165

148 To illustrate further the above relationship between these structures consider the flowsheets they represent, Figures 28A and B respectively. In the first flowsheet, Fig. 28A, the ...D/E split takes place before the ...C/D... split, while the opposite is the case with the second flowsheet, Fig. 28B. Thus we can say that the difference between them is only in the order in which these two splits occur.

A third kind of neighboring flowsheet is generated by substituting an operator with a different kind of operator, for example using separation method β instead of α to attain a certain split. The structure in Figure 27C is a neighboring flowsheet of this type to the one in Figure 27A.

When a tree has more than a single type of separation operator present in it, the interchange (either upward or downward) of two unlike operators may lead to a considerable reshuffling of the basic operands. We shall still use the rules developed for a single operator type to generate neighbors for different operator types, as there appears to be no obvious method to make a smaller change to a structure in this case.

Thus it is evident from the above that three evolutionary rules are needed to generate all the neighboring flowsheets to a given flowsheet. These rules are further discussed in section VII.2.c. from the polish strings point of view.

The evolutionary rules for a separation sequence as discussed above satisfy the first three basic properties required in section VII.1. Thus, Property 1: all the rules create physically consistent and feasible flowsheets since any binary tree resulting from the application

PAGE 166

149 Figure 28. Separation Sequences Corresponding to the Binary Trees of Figures 27A and B

PAGE 167

150 of the previous rules corresponds to a separation sequence. Property 2: by repeated application of the evolutionary rules, all possible flowsheets can be generated. The proof of this result is given in section VII.2.d. This guarantees the completeness of the evolutionary rules. Property 3: if a flowsheet B is an up neighbor of A, then A is a down neighbor of B. These characteristics are also met by the third evolutionary rule, where a separator is replaced by a different kind of separator. The "existence" of property 4 can only be argued and not proved, as we have done in presenting these rules.

VII.2.c. Polish strings and their representation of separation sequences

Binary trees were introduced to represent separation sequences, but for computation purposes polish strings have been used to provide a compact and elegant way to represent binary trees. Polish strings, invented by the Polish logician Lukasiewicz (Knuth, 1968), consist of a string of operators and basic operands. In computer science they are used to represent mathematical expressions; an operator is a basic mathematical operation such as multiply or subtract, and a basic operand is an alphabetic or numerical character. For example the algebraic expression

$3x^2y + z^2/u$

can be represented by the binary tree shown in Figure 29, which in turn is represented by the following polish string (using FORTRAN operators):

+ / u ** 2 z * y * ** 2 x 3

PAGE 168

151 Figure 29. A Binary Tree for the Algebraic Expression $3x^2y + z^2/u$
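A polish string such as the one for $3x^2y + z^2/u$ can be evaluated mechanically by scanning it from the right end, pushing operands on a stack and applying each operator to the two topmost entries. The sketch below is illustrative only: the function name `evaluate`, the token list and the numerical values are hypothetical, and the operand order (the lower stack entry is taken as the left-hand argument) is an assumption inferred from the string itself, in which `** 2 x` must mean x**2.

```python
import operator

# FORTRAN-style operator symbols mapped to Python functions.
OPS = {
    "+": operator.add,
    "/": operator.truediv,
    "*": operator.mul,
    "**": operator.pow,
}


def evaluate(tokens, values):
    """Evaluate a polish string by scanning it from right to left."""
    stack = []
    for tok in reversed(tokens):
        if tok in OPS:
            first = stack.pop()    # operand written directly after the operator
            second = stack.pop()   # operand written after that
            # Assumed convention: the result is `second OP first`.
            stack.append(OPS[tok](second, first))
        elif tok in values:
            stack.append(values[tok])
        else:
            stack.append(float(tok))
    assert len(stack) == 1, "a well-formed string leaves exactly one operand"
    return stack[0]


TOKENS = "+ / u ** 2 z * y * ** 2 x 3".split()
VALUES = {"x": 2, "y": 5, "z": 4, "u": 8}   # hypothetical numbers

result = evaluate(TOKENS, VALUES)
print(result)  # 3*2**2*5 + 4**2/8
```

For the values chosen here the direct formula gives 3·4·5 + 16/8 = 62, which is what the stack evaluation returns.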

PAGE 169

152 The +, /, **, and * are the operators while u, z, 2, y, x, and 3 are the basic operands. A polish string is interpreted in reverse order (usually referred to as reverse polish), which we can do here to aid in interpreting such a string. First an empty stack for operands is created, and if an operand is encountered it is placed on this stack. If an operator is encountered, the operator is immediately applied to the last two operands on the stack. These two operands are removed from the stack and then the result of the operation is placed back on the stack. These steps for the above string are given in Figure 30.

Since a separation sequence is a binary tree, it can be represented by a polish string. In such a polish string an operator is a separator employing one of the available separation methods while a basic operand is a component of the initial mixture. Consider the separation train in Fig. 26B. The polish string representing this flowsheet is:

α1 β2 β4 C E α3 D B A

Since every separation scheme can be represented as a binary tree, there is a polish string for every separation sequence. Before presenting the evolutionary rules in terms of polish strings let us give some further characterizations. Consider the following string

α1 α2 α4 • α3 • • • •

In the following string the operands which would be on an operand stack if the string were being interpreted can be counted by starting

PAGE 170

153

PAGE 171

154 from the right end of the string.

1   2   3   4   3   4   3   2   1
α1  α2  α4  •   α3  •   •   •   •

Note that when an operator is encountered the number of operands decreases by one, since an operator employs two operands and produces a single new operand. Following the count of the operands, the two operands of the operator α1 are readily identified. Thus for operator α1 we have:

α1  [α2 α4 • α3 • • •]  [•]
first operand: α2 α4 • α3 • • •; second operand: the final •

In a similar way the operands of the operators α2, α3 and α4 are identified. The full picture is given in Figure 31.

Let us now see how the evolutionary rules can be used along with a polish string representation in a manner suitable for computer programming.

Rule 1: (Generation of down neighbors). If an operator in a polish string is followed directly by a second operator, a down neighbor exists. Move the first operator to a position just before the second operand of the second operator.

Rule 2: (Generation of up neighbors). If an operator is immediately preceded by a basic operand, an up neighbor exists. Simply reverse the steps taken in rule 1; the operator being examined here will become the "first" operator of rule 1.
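The operand count and rules 1 and 2 stated above translate almost directly into list operations. The following sketch is an illustration under assumptions, not the program used in this work: tokens are plain strings, a dot marks a basic operand (as in the skeleton structure), and the helper names `operand_counts`, `subtree_end`, `down_neighbors` and `up_neighbors` are hypothetical.

```python
def is_op(tok):
    """Assumed convention: every token other than '.' is an operator."""
    return tok != "."


def operand_counts(tokens):
    """Operands on the stack after each token, counting from the right end."""
    counts, depth = [], 0
    for tok in reversed(tokens):
        depth += -1 if is_op(tok) else 1  # an operator uses two, returns one
        counts.append(depth)
    return counts[::-1]


def subtree_end(tokens, i):
    """Index one past the complete operand (subtree) starting at tokens[i]."""
    need = 1
    while need:
        need += 1 if is_op(tokens[i]) else -1
        i += 1
    return i


def down_neighbors(tokens):
    """Rule 1: move an operator to just before the second operand of the
    operator that directly follows it."""
    result = []
    for i in range(len(tokens) - 1):
        if is_op(tokens[i]) and is_op(tokens[i + 1]):
            j = subtree_end(tokens, i + 2)  # start of that second operand
            result.append(tokens[:i] + tokens[i + 1:j] + [tokens[i]] + tokens[j:])
    return result


def up_neighbors(tokens):
    """Rule 2: the reverse move, for an operator preceded by a basic operand."""
    result = []
    for k in range(1, len(tokens)):
        if is_op(tokens[k]) and not is_op(tokens[k - 1]):
            # Find the operator whose second operand starts at position k.
            for p in range(k):
                if is_op(tokens[p]) and subtree_end(tokens, p + 1) == k:
                    result.append(tokens[:p] + [tokens[k]] + tokens[p:k] + tokens[k + 1:])
    return result


STRING = "a1 a2 a4 . a3 . . . .".split()
print(operand_counts(STRING))
for s in down_neighbors(STRING) + up_neighbors(STRING):
    print(" ".join(s))
```

Applying `down_neighbors` to any string returned by `up_neighbors` recovers the original string among the results, which is the reversibility required by Property 3.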

PAGE 172

155 Rule 3: (Generation of neighbors with a different kind of separator). In order to generate all the neighbors to a given separation train by introducing a new kind of separator, introduce the new operator (corresponding to the new kind of separator) at every possible position occupied by any other operator in the given string.

In Figure 31 the movement of operators to generate neighboring flowsheets using rules 1 and 2 is also shown, while the flowsheets are shown in Figure 32. Note that applying a rule requires very simple programming operations. For rule 1, an operator is removed from a string and a number of elements are moved up one position until the second operand of the second operator is reached, at which point the operator is inserted back in the string. The simple counting strategy discussed earlier locates directly the second operand of the second operator as the elements are being moved. The reverse steps give rule 2.

VII.2.d. Proof of the completeness of the evolutionary rules

As stated in section VII.1., the evolutionary rules must have the characteristic of completeness, i.e. all possible flowsheets can be reached using repeated applications of rules 1, 2 and 3 from any initial flowsheet. The proof of the completeness of the evolutionary rules given here for separation systems will be by providing an algorithm which can convert any given original flowsheet to any other target flowsheet. Three steps constitute the heart of the algorithm:

Step I: Convert the original flowsheet to a binary tree structure with the same kind of operators (i.e. α) by at most n-1 repeated applications of rule 3 only, where n is the number

PAGE 173

156

PAGE 174


PAGE 175

158

PAGE 176

159 of components in the original mixture.

Step II: Change the skeleton structure of this flowsheet to that of the target flowsheet by a finite number of applications of rules 1 and 2 only.

Step III: Convert the final binary tree structure to the target flowsheet by at most n-1 repeated applications of rule 3 only.

It is clear from the above that only step II requires a proof that it can be accomplished, since the validity of steps I and III is self-evident. To prove step II, first list the basic operands according to the ranked list of the separation method used (i.e., method α). We will then need the following identification scheme for the operators. Index each operator in the original structure by the number of the components in the original mixture which are lighter than the heavy key for that operator. Thus α_j is an operator denoting a separator with j basic components lighter than the heavy key of the split for that separator. Next we note that operator α_j in the new structure is that operator which represents the same split as in the original structure and thus also has j components lighter than its heavy key. Figs. 27A and 27B illustrate the application of rule 1 with the operators identified in both.

Identifying the operators in this manner, we observe that in all flowsheets α_j never changes its vertical position relative to all other operators. The difference in flowsheets can be characterized by where the operators are relative to each other in the horizontal direction only. Thus an operator α_j is either before α_k, after α_k,

PAGE 177

160 or on a path parallel to it. It is always above if j
PAGE 178

161 VII.2.e. Evolutionary Strategy

In the previous sections a compact way to represent a separation sequence and a set of evolutionary rules were developed. It still remains to establish a search strategy with which, starting from a good initial flowsheet, we can move quickly and effectively in a systematic way to the optimum.

Repeated application of rules 1, 2 and 3 can generate all the possible separation sequences, since these rules possess the property of completeness. If each flowsheet is optimally sized and costed, the optimum flowsheet is easily identified. In principle this exhaustive enumeration strategy will provide the optimum solution to a separation problem of any size. However, as the number of components in the initial mixture and/or the number of the separation methods considered increases, the number of the possible alternatives grows so large that an exhaustive enumeration becomes prohibitive. Therefore certain guides are required to define the direction which will be most promising for yielding the optimum solution, thus reducing significantly the search space. These guides are heuristic in nature.

For the synthesis of separation sequences, the most promising direction is that of the neighboring flowsheet which has the lowest cost. Thus, consider the initial flowsheet and its neighboring flowsheets derived by using rules 1 and 2. These flowsheets are optimally sized and costed and the one that exhibits the lowest cost is retained while the rest are rejected. The neighboring flowsheets for this new base flowsheet are developed and evaluated. We always move towards the direction of the cheapest neighbor, provided of course that this neighbor has a lower cost than the current best

PAGE 179

162 flowsheet. This procedure ends when no up or down neighboring flowsheet to the current best flowsheet has a lower cost. Then rule 3 is applied to generate a new class of neighboring flowsheets. We proceed in the same manner until no neighboring flowsheet can be generated with lower cost. Then we have a local optimum.

This approach can be improved by generating a second level of neighboring flowsheets. Thus neighbors of the neighboring flowsheets to a given flowsheet are generated and evaluated. From all the flowsheets the one with the lowest cost is retained. Such a variation requires more computing time but also improves the ability of the strategy to locate the best flowsheet. As mentioned in section VII.1., Pho and Lapidus (1973) used a similar look-ahead strategy with a variable number of look-ahead steps for the synthesis of heat exchange networks. In their approach the solution found depends very much on the number of the look-ahead steps, since whole classes of flowsheets are disregarded once a certain direction is chosen. With the strategy proposed in this work flowsheets are neglected only temporarily, and the possibility exists that they will be generated later as neighboring flowsheets of other flowsheets.

Another variation of the evolutionary strategy, making use of heuristics and thus potentially requiring less computing time, is the following. The structures for the neighboring flowsheets to the current flowsheet are derived. The most promising is chosen and only this one is sized and costed. If the neighbor represents an improvement, it immediately becomes the new basic flowsheet. The selection of the most promising neighbor is heuristic and the success of the method depends on the experience of the designer in the particular area of

PAGE 180

163 the synthesis problem. Heuristics which can help the synthesis of multicomponent separation sequences using this last evolutionary strategy are given by Thompson and King (1972) and King (1971).

VII.3. Examples of Evolutionary Synthesis

In the following two sections the ideas developed previously will be tested on the synthesis of separation systems. The first example is the synthesis of a separation train for solids using two methods of separation: screening, and the use of dense liquids in which some of the rocks float and others sink. It is a simple example of large size (1344 flowsheets possible) where the main interest is to demonstrate the use of the evolutionary rules and of the evolutionary strategy as developed in the previous sections.

The second example is the synthesis of a liquid multicomponent separation scheme for a real industrial problem. It was taken from Hendry and Hughes (1972) and represents a real process (Buell and Boatright, 1947) as encountered by the practicing engineer. Unlike the first example, the separation units are more complicated and their computation time is longer. More complex cost functions were used, and thermodynamic routines were needed to calculate the thermodynamic properties of the mixture. While the first example was solved by hand, the second required the use of a computer.

VII.3.a. Example 1: synthesis of a solids' separation system

The problem is to separate a mixture of solids into separate homogeneous piles, each consisting of the same type of solids. Consider a mixture of 100 tons of solids A, B, C, D, E and F. Each type of solids has a different size and a different specific gravity. The

PAGE 181

164 specifications of this mixture of solids and the corresponding sizes and specific gravities for each type are shown in Table 8. We shall consider two separation methods, the use of screens (method α) and the use of dense liquids in which some of the rocks float and others sink (method β). It is clear that a screen can separate the rocks put across it into two piles: those which get through the screen go into one pile and those which are too large go into the other. Using a dense liquid also gives two resulting piles: those which are less dense than the liquid and float, and the rest which sink. Figure 33 illustrates a structure, i.e., a system of separators, which could separate the pile of rocks into six piles each containing only one type of rock.

We shall use the following cost functions for the two types of separation methods:

C_α = weight of the input stream handled ($)
C_β = weight of the input stream / (specific gravity difference of the two rock types next to the point of the split) ($)

The cost of the structure in Figure 33 is $236, with individual costs as follows:

Separator α1 Cost: $100
Separator β2 Cost: $60
Separator α3 Cost: $15
Separator β4 Cost: $41
Separator α5 Cost: $20
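The two cost functions and the total quoted for Figure 33 can be written out as a short sketch. The function names `screen_cost` and `dense_liquid_cost` are hypothetical, and because the entries of Table 8 are not reproduced here, the feed weight and specific gravity difference used in the last line are made-up numbers chosen only to show the form of the calculation; only the five individual costs are taken from the text.

```python
def screen_cost(feed_tons):
    """Screening: cost in dollars equals the weight of the input stream."""
    return feed_tons


def dense_liquid_cost(feed_tons, sg_difference):
    """Dense liquid: input weight divided by the specific gravity difference
    of the two rock types adjacent to the split."""
    return feed_tons / sg_difference


# Individual separator costs as listed in the text for Figure 33 (dollars).
reported_costs = [100, 60, 15, 41, 20]
total = sum(reported_costs)
print(total)

# Hypothetical illustration: 30 tons split where the specific gravities
# differ by 0.5 would cost $60 with the dense-liquid method.
example = dense_liquid_cost(30.0, 0.5)
print(example)
```

The sum of the listed costs reproduces the $236 quoted for the structure; a screen receiving the whole 100-ton feed costs $100, consistent with the first entry.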

PAGE 182

165 TABLE 8 SPECIFICATIONS OF THE SOLIDS IN EXAMPLE 1 Rocks

PAGE 183

166 Figure 33. A Schematic Representation of a System for the Separation of Solids and the Corresponding Binary Tree

PAGE 184

167 The evolutionary rules 1, 2 and 3 developed earlier were used for the evolutionary synthesis of the above system. Applying rules 1 or 2 suggests that we interchange α1 and β2, β2 and α3, α1 and β4, or β4 and α5, thus creating four neighboring structures. Rule 3 would result in changing an α separator into a β separator or the reverse; β2 becomes α2, α3 becomes β3, and so forth. Five additional neighboring structures can be created. Twelve interchanges were made in the order indicated in Table 9 to locate a local minimum structure costing $200.3. An exhaustive enumeration of all the feasible flowsheets through a branch and bound strategy showed that this is also the global solution to the problem.

VII.3.b. Synthesis of a multicomponent separation scheme

The n-butylene purification system, taken from Hendry and Hughes (1972) and discussed in Chapter VI, will be used to illustrate further the principles of the systematic evolutionary procedure developed here. The initial feed contains Propane (C3), n-Butane (NB), Butene-1 (B1), trans-Butene-2 (B2T), cis-Butene-2 (B2C) and Pentane (C5). Distillation (α) and extractive distillation (β) will again be the separation methods considered.

To generate the initial flowsheet we will use the strategy employed to establish the basic flowsheet in the branch and bound method in Chapter VI. Thus the sublagrangians of the 6-component separators are evaluated and the one with the least value is chosen as the first separator. In a similar manner we find the second, third and fourth separators for the initial flowsheet. The basis of such an approach is the fact that the separation units, being effectively constrained,

PAGE 185

168 TABLE 9 THE EVOLUTIONARY STEPS TAKEN DURING THE SYNTHESIS OF THE SOLIDS SEPARATION SYSTEM Number

PAGE 186

169 yield values for the dual bounds which are insensitive to the Lagrange multipliers and do not differ very much from the optimal primal values. For the n-butylene purification system the initial flowsheet thus created is flowsheet (a) in Figure 34. The minimum cost for this flowsheet is found to be $884,828/yr. If we apply rules 1 and 2 to the operators of this flowsheet, the neighboring flowsheets (b), (c), and (d) are generated. These flowsheets and their minimum costs are shown in Figure 34. Flowsheet (c) has the lowest cost of all the neighboring flowsheets; thus we move from (a) to (c), which becomes the new basic flowsheet. The neighbors of (c) are flowsheets (a), (e) and (f). We move to flowsheet (e), which has the lowest cost. None of the neighbors of flowsheet (e) generated by using rules 1 and 2, namely flowsheets (g) and (h), has a lower cost than (e). At this point we generate the neighbors of (e) using rule 3, obtaining flowsheets (i) and (j). Similarly, we find that none of these new neighbors has a lower optimum cost than (e); thus we conclude that flowsheet (e) is a local optimum. In fact this is the globally optimum flowsheet, as shown by Hendry and Hughes through an exhaustive enumeration guided by Dynamic Programming. Figure 35 shows the evolution from flowsheet (a) to the optimum flowsheet (e). Starting with all possible initial flowsheets, the same optimum, flowsheet (e), was always located. In Fig. 36 the evolution is shown from flowsheet (k), which is a very bad starting point, to the optimum, while the newly generated flowsheets are shown in Figure 37. For this simple problem the maximum number of flowsheets that had to be optimized by starting with the worst possible first flowsheet

PAGE 187

170

PAGE 188

171

PAGE 189

172

PAGE 190

173

PAGE 191

174 was 19. This number should be compared to 227, the total number of flowsheets which can be generated (Hendry and Hughes, 1972), to give a measure of the success of the proposed method on this particular synthesis problem.

VII.4. Discussion

An approach to organize and systematize the evolutionary synthesis of process flowsheets has been presented. The principal value of the present work lies in the fact that the evolutionary thinking of the designer can be structured in a logical manner so that it does not depend solely on his intuition. The systematic development of structural modifications through evolutionary rules and the establishment of favorable or more promising directions are the two major factors of the proposed approach.

A first attempt to synthesize optimal multicomponent separation sequences has been successful. Work is currently in progress to develop evolutionary rules for heat exchange networks which also allow for stream splitting, and for integrated separation schemes with heat recovery networks.

The experience with the synthesis of separation schemes has been encouraging. Promising directions, once established, move the process quickly to very attractive flowsheets. Information accumulated during the beginning exploratory steps of the evolution can be used very effectively in a heuristic sense to accelerate the process and to limit the search space. In this respect evolutionary synthesis has the advantage of an information feedback procedure to improve the effectiveness of the decision making at a given stage of the synthesis problem.

PAGE 192

175 The reasonability of the evolutionary approach can be demonstrated by comparing the total number of feasible flowsheets for a given separation problem to the number of flowsheets examined. Thus, for an initial mixture of n components using only one method of separation, Table 10 demonstrates the effectiveness of the proposed separation sequence. Table 11 corresponds to a situation where three separation methods are used. In both tables the column "Ratio" denotes how many flowsheets are not examined per flowsheet examined. Thus for n=6 and one separation method, we examine on the average at most 22 flowsheets out of the 42 possible. For the calculation of the number of possible flowsheets the following formula was used (Thompson and King, 1972):

$N = \dfrac{[2(n-1)]!}{n!\,(n-1)!}\, S^{n-1}$

where S is the number of the employed separation methods. The assumptions for the estimation of an average search and the formulas giving the number of the examined flowsheets during an average search are given in Appendix A.

With this approach matched by an efficient strategy for developing good initial flowsheets, it may prove to be a very useful tool in the hands of a designer. It is not intended only as a completely computerized synthesis strategy but also as an approach where the process design engineer, interacting with the computer, organizes and structures his thinking as a means to achieve "optimum" solutions.
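The counts quoted in this chapter can be checked against the Thompson and King expression, taken here in its usual form $N = [2(n-1)]!\,S^{n-1} / (n!\,(n-1)!)$ for n components and S separation methods. The short sketch below is only a check of that arithmetic; the function name `n_sequences` is hypothetical.

```python
from math import factorial


def n_sequences(n, s=1):
    """Number of possible separation sequences for n components and
    s separation methods (Thompson and King, 1972)."""
    skeletons = factorial(2 * (n - 1)) // (factorial(n) * factorial(n - 1))
    return skeletons * s ** (n - 1)


for n in range(2, 8):
    print(n, n_sequences(n, 1), n_sequences(n, 2), n_sequences(n, 3))
```

For six components the expression gives 42 sequences with one method, the figure quoted above, and 1344 with two methods, the size quoted for the solids example of section VII.3.a.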

PAGE 193

176

PAGE 194

177

PAGE 196

CHAPTER VIII CONCLUSIONS AND RECOMMENDATIONS FOR FURTHER RESEARCH

In this dissertation two problems in chemical engineering process design were tackled. The first involved the optimization of nonconvex systems, very commonly encountered in chemical engineering, using a modified version of the two-level optimization method and the discrete minimum principle. The second was concerned with the development of efficient strategies for the synthesis of process flowsheets.

Hestenes' method of multipliers was employed successfully to overcome the deficiencies of the two-level optimization procedure and to develop a stronger version of the discrete minimum principle. The shortcomings of both methods are caused by the presence of nonconvexities either in the objective functions or in the constraints on the units of the flowsheet. The essence of the proposed method is that the nature of the stationary points of the Lagrangian function can be altered by introducing a penalty term. Thus, maxima or saddle points can be turned into minima by simply increasing the value of the penalty constant. This secures the existence of a saddle point for the Lagrangian function and therefore guarantees the solution of the initial problem through the use of a Lagrangian based method. The mathematical assumptions which were made to prove the validity of

178

PAGE 197

179 the proposed method are satisfied by most of the systems of interest. The new procedure has been illustrated by several numerical examples and its usefulness for the chemical engineering process design has been demonstrated on the design of a heat recovery netv/ork. Although these illustrations and demonstrations concern rather small examples for which the computations were made by hand, they are not sufficient to establish the general effectiveness of the methods, from a computation-time-required point of view, for large problems. Thus we suggest that a study of the computational characteristics of the new methods be undertaken for large problems. At the present an example of a somewhat larger problem is under way (Stephanopoulos and Westerberg, 1974) concerning the synthesis of an integrated system. The stronger version of the discrete minimum principle is probably faster than the new algorithm for the two-level optimization method because the problem is always feasible and does not require a second level of iteration. This second level of iteration doss not require exceedingly lengthy computation times because, for a large enough value of the penalty constant, the sub-lagrangians are likely well behaved and the minimum points should be found quickly. As it has already been mentioned the introduction of the penalty term desensitizes the Lagrangian function with respect to the Lagrange multipliers. This characteristic is of great importance in situations v/here the primal and the dual values are used as upper and lower bounds of the objective function respectively, as in the case with the branch and bound strategy for the synthesis of optimal multi component separation sequences described in Chapter VI. The loss of separability (which is only overcome computationally) will however mean that the

PAGE 198

180 penalty tern can only be used to refine con^plete flowsheet dual bounds. The units will be tied together at the solution. With respect to the synthesis of process flowsheets two general procedures were developed. The first makes use of the bounding properties of the primal and dual values in a branch and bound scheme to synthesize the optimum sequences of multi component separation problems. The second develops a systematic approach to evolutionary synthesis. The branch and bound strategy was applied to two separation problems with satisfactory results. The number of the optimized separators was considerably reduced compared with previous approaches while the flowsheets generated ware nearly optimum so that further screening was not necessary and the decision for the best could be made based on other considerations. Thus for the n-butylene purification system, from 64 possible separators, only 43 separators were optimized to generate 4 nearly optimum flowsheets out of 227 possible. For the olefins, paraffins separation systen, from 70 possible separators, only 36 v/ere optimized to generate 5 nearly optimum flowsheets. As mentioned in Chapter VI the success of the branch and bound strategy depends on the good solution of the following three pr-oblems: a) the generation of a good basic flowsheet, b) the computation of a satisfactory primal bound and c) the good estimation of the initial Lagrange multipliers. There exists fertile and promising ground for each of the above three problems. Thus it is suggested that new techniques be developed, or existing techniques be tested in their effectiveness to produce satisfactory basic flowsheets for the synthesis of general integrated process systems. The AIDES system

PAGE 199

181 (Siirola, Powers, and Rudd, 1971) or extensions of it may very well constitute a good approach in establishing a basic flowsheet. The computation of a satisfactory primal bound for the basic flowsheet is not a trivial problem. Sometimes repeated iterations are needed to establish a good, tight primal bound. One approach which suggests itself, as McGalliard (1971) noted, is to refine the information reversal procedure by applying a reversal sensitivity analysis at the point of occurence of system infeasibility in the primal calculation. Clearly this would be a local procedure and could incur a sizable computational load, but a modest improvement in the effectiveness of the resulting primal bound could justify its use. The problem of good estimation of the Lagrange multipliers is not critical for the synthesis of separation sequences since the decision variables are tightly constrained. But for more general systems which do not have the above property, the estimation of good initial Lagrange multipliers becomies very important. One alternative is to introduce in the problem a penalty term which will desensitize the dependence of the dual value on the Lagrange multipliers. This would slow down the procedure. It is suggested that new alternative methods to estimate good initial Lagrange multipliers be developed. The work on evolutionary synthesis has paved the ground for more extended research in this area. The basic principles and notions have been underlined and preliminary work has shown the usefulness of the approach. Further work is needed to establish evolutionary rules for other systems such as heat exchangers, reactor networks, separation sequences with heat recovery networks etc. The simple examples on which evolutionary synthesis was illustrated demonstrate that the

PAGE 200

182 evoluticriQry approach may constitute a powerful tool for this purpose, Further v;ork is needed to verify this preliminary result in more complicated integrated systems.
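The central mechanism recalled in this chapter, that a penalty term can turn a maximum or saddle point of the Lagrangian into a minimum, can be seen on a one-variable toy problem. The sketch below is not taken from the dissertation: it assumes the problem "minimize -x^2 subject to x - 1 = 0", a Lagrangian of the form f + λh, and a quadratic penalty K h^2. The sign and scaling conventions used in the thesis may differ. At the stationary point x = 1, λ = 2 the curvature with respect to x is -2 + 2K, so the point is a maximum for K = 0 and a minimum once K exceeds 1.

```python
def augmented_lagrangian(x: float, multiplier: float, penalty: float) -> float:
    """Toy augmented Lagrangian for: minimize -x**2 subject to x - 1 = 0.

    The problem, the sign convention and the K*h**2 penalty form are
    assumptions made for this illustration only.
    """
    violation = x - 1.0
    return -x ** 2 + multiplier * violation + penalty * violation ** 2


def curvature_at_stationary_point(penalty: float, step: float = 1e-3) -> float:
    """Second derivative in x at the stationary point x = 1, multiplier = 2."""
    x_star, multiplier_star = 1.0, 2.0

    def phi(x: float) -> float:
        return augmented_lagrangian(x, multiplier_star, penalty)

    # Central second difference; exact for a quadratic up to rounding error.
    return (phi(x_star + step) - 2.0 * phi(x_star) + phi(x_star - step)) / step ** 2


if __name__ == "__main__":
    for penalty in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(penalty, round(curvature_at_stationary_point(penalty), 6))
```

Negative curvature means the stationary point is a maximum of the Lagrangian in x, and positive curvature means it has become a minimum, which is the condition a Lagrangian-based method needs.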


APPENDICES


APPENDIX A

In section VII.4 the reasonability and the efficiency of the evolutionary procedure were demonstrated by comparing the total number of feasible flowsheets for a given separation problem to the number of flowsheets examined during an average search. Thus, for an initial mixture of N components using S methods of separation, the number of the feasible flowsheets is given by the following formula:

$$F = \frac{[2(N-1)]!}{N!\,(N-1)!}\; S^{\,N-1}$$

In order to estimate the number of flowsheets examined during an average evolutionary search, some assumptions must be made. The first and the basic assumption is that we will generate and evaluate all the possible neighbors using rules 1, 2 and 3 at each step of the evolution.

For a mixture of N components using only one method of separation, the smallest number of neighboring flowsheets that can be generated during the evolutionary procedure is the number of pairwise interchanges needed to prove that the initial flowsheet is optimal:

$$M_S = N-2$$

The largest number of examined flowsheets is

$$M_L = \frac{(N-2)\,(N-1)\,(N-2)}{2}$$

because, as assumed, at each evolutionary step all possible neighbors are generated by (N-2) pairwise interchanges using rules 1 and 2. Also, for each of the above neighbors there will be at most

$$(N-2) + (N-3) + \dots + 1 = \frac{(N-2)\,(N-1)}{2}$$

(see step D of the algorithm in section VII.2.d) consecutive structural changes to produce any of the flowsheets. Therefore the actual number of flowsheets examined is between these two bounds, and on the average we assume the arithmetic mean

$$M_A = \frac{(N-2)\,[(N-2)\,(N-1) + 2]}{4}$$

When more than one separation method is considered, the number of flowsheets examined at each step in an average evolutionary search increases by the factor (N-1)(S-1), since the application of rule 3 at each stage generates (N-1)(S-1) new flowsheets.

In Table 10 the smallest ($M_S$), the largest ($M_L$) and the average ($M_A$) numbers of flowsheets examined are shown for mixtures of N components (with different values of N) employing only one separation method, while in Table 11 $M_A$ is shown for cases where three separation methods have been considered. Values of the $F/M_A$ ratio in the above tables show the potential effectiveness of the evolutionary procedure.
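The three estimates for a single separation method can be tabulated directly from the expressions above. This is a minimal sketch: the function name `search_bounds` is an assumption, and the smallest, largest and average counts are taken to be N-2, (N-2)(N-1)(N-2)/2 and their arithmetic mean, as read from this appendix.

```python
def search_bounds(n: int) -> tuple:
    """Smallest, largest and average numbers of flowsheets examined.

    Applies to a mixture of n components and one separation method, using the
    expressions of Appendix A as read here. The function name is illustrative.
    """
    if n < 2:
        raise ValueError("need at least two components")
    smallest = n - 2
    # At most (n-2) + (n-3) + ... + 1 consecutive structural changes per neighbor.
    changes_per_neighbor = (n - 2) * (n - 1) // 2
    largest = (n - 2) * changes_per_neighbor
    average = (smallest + largest) / 2
    return smallest, largest, average


if __name__ == "__main__":
    for n in range(3, 11):
        smallest, largest, average = search_bounds(n)
        # The closed form (n-2)[(n-2)(n-1)+2]/4 should agree with the mean.
        closed_form = (n - 2) * ((n - 2) * (n - 1) + 2) / 4
        print(n, smallest, largest, average, closed_form)
```

For N=6 the average comes out as 22, which agrees with the figure quoted in section VII.4.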


These estimates of the average numbers of flowsheets examined during an evolutionary synthesis of a multicomponent separation sequence are rather conservative. There are two reasons. First, one is likely to start the evolution with a flowsheet which is not far from the optimum. This will eliminate a large number of flowsheets which need not be considered. Second, it is also likely that one will not use the strategy of generating and evaluating all neighboring flowsheets. The evolutionary strategy may instead examine only the neighbors generated by a subset of the evolutionary rules, or move to the first improved flowsheet found, evaluating most or all of the neighbors only when a locally minimum flowsheet is reached. The strategy of using only a subset of the rules was applied in the second example synthesized by evolution in Chapter VII, and the strategy of moving to the first improved flowsheet found was applied in the first example.
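The "move to the first improved flowsheet found" strategy mentioned above can be sketched as a generic first-improvement local search. Everything in the sketch is a stand-in: a flowsheet is represented by a tuple giving an order of separation tasks, neighbors are produced by interchanging adjacent tasks, and `toy_cost` is a hypothetical cost with no physical meaning. The evolutionary rules 1 to 3 of Chapter VII and real separator costs are not reproduced here.

```python
from typing import Callable, Iterator, Tuple

Sequence = Tuple[int, ...]


def adjacent_interchanges(state: Sequence) -> Iterator[Sequence]:
    """Yield every neighbor obtained by swapping two adjacent tasks."""
    for i in range(len(state) - 1):
        swapped = list(state)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        yield tuple(swapped)


def first_improvement_search(start: Sequence, cost: Callable[[Sequence], float]):
    """Move to the first cheaper neighbor until no neighbor improves."""
    current, current_cost = tuple(start), cost(tuple(start))
    evaluated = 0
    improved = True
    while improved:
        improved = False
        for neighbor in adjacent_interchanges(current):
            evaluated += 1
            neighbor_cost = cost(neighbor)
            if neighbor_cost < current_cost:
                # Accept immediately without examining the remaining neighbors.
                current, current_cost = neighbor, neighbor_cost
                improved = True
                break
    # On exit every neighbor of `current` has been evaluated: a local minimum.
    return current, current_cost, evaluated


def toy_cost(order: Sequence) -> float:
    """Hypothetical cost: later positions weigh more heavily (illustration only)."""
    return sum(position * load for position, load in enumerate(order))


if __name__ == "__main__":
    print(first_improvement_search((1, 3, 2, 4), toy_cost))
```

Because each accepted move strictly lowers the cost, the search terminates. All neighbors are evaluated only at the final, locally minimal sequence, which is the behavior described in the text.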


BIBLIOGRAPHY

Avery, C.O., and A.S. Foss, "A Shortcoming of the Multilevel Optimization Technique," AIChE J., 17 (4), p. 998 (1971).
Aris, R., Discrete Dynamic Programming, Blaisdell, N.Y. (1964).
Bellman, R., and W. Karush, "On a New Functional Transform in Analysis: The Maximum Transform," Bulletin of American Mathematical Society, 67 (5), p. 501 (1961).
Bellmore, M., H.J. Greenberg, and J.J. Jarvis, "Generalized Penalty Function Concepts in Mathematical Optimization," Oper. Res., 18 (2), p. 229 (1970).
Brosilow, C., and L. Lasdon, "A Two-Level Optimization Technique for Recycle Processes," AIChE-I.Chem.E. Symposium Series, No. 4 (1965).
Buell, C.K., and R.G. Boatright, "Furfural Extractive Distillation for Separation and Purification of C4 Hydrocarbons," Ind. Eng. Chem., 39 (6), p. 695 (1947).
Denn, M.M., Optimization by Variational Methods, McGraw-Hill, New York, N.Y. (1969).
Everett, H., III, "Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources," Oper. Res., 11 (3), p. 399 (1963).
Falk, J.E., and R.M. Soland, "An Algorithm for Separable Nonconvex Programming Problems," Manag. Science, 15 (9), p. 560 (1969).
Fan, L.T., and C.S. Wang, The Discrete Maximum Principle, Wiley, New York, N.Y. (1964).
Glover, F., "A Multiphase-Dual Algorithm for Zero-One Integer Programming Problems," Oper. Res., 16 (4), p. 883 (1968).
Gould, F.J., "Extensions of Lagrange Multipliers in Nonlinear Programming," S.I.A.M. Journal, Applied Mathematics, 17 (6), p. 1280 (1969).
Greenberg, H.J., "Lagrangian Duality Gaps: Their Source and Resolution," Technical Report CP-59005, Southern Methodist University, April (1969).


Greenberg, H.J., "Bounding Nonconvex Programs by Conjugates," Oper. Res., 21 (1), p. 345 (1973).
Greenberg, H.J., and W.P. Pierskalla, "Surrogate Mathematical Programs," Oper. Res., 18 (5), p. 483 (1970).
Halkin, H., "A Maximum Principle of the Pontryagin Type for Systems Described by Nonlinear Difference Equations," S.I.A.M. Journal, Control, 4 (1), p. 90 (1966).
Hendry, J.E., and R.R. Hughes, "Generating Separation Process Flowsheets," Chem. Eng. Progr., 68 (6), p. 69 (1972).
Hendry, J.E., D.F. Rudd and J.D. Seader, "Synthesis in the Design of Chemical Processes," AIChE J., 19 (1), p. 1 (1973).
Hestenes, M.R., "Multiplier and Gradient Methods," J. of Optimiz. Theory and Applications, 4 (5), p. 303 (1969).
Holtzman, J.M., and H. Halkin, "Directional Convexity and the Maximum Principle for Discrete Systems," S.I.A.M. Journal, Control, 4 (2), p. 253 (1966).
Horn, F., "Uber das Problem der Optimalen Ruhrkesselkaskade fur Chemische Reactionen," Chem. Eng. Sci., 15 (2), p. 175 (1961).
Horn, F., and R. Jackson, "Discrete Maximum Principle," Ind. and Eng. Chem., Fundamentals, 4 (1), p. 110 (1965a).
Horn, F., and R. Jackson, "On Discrete Analogues of Pontryagin's Maximum Principle," Intern. J. on Control, 1 (3), p. 339 (1965b).
Ichikawa, A., and L.T. Fan, "Optimal Synthesis of Process Systems. Necessary Conditions for Optimal System and Its Use in Synthesis of Systems," Chem. Eng. Sci., 28 (2), p. 357 (1973).
Jackson, R., "Some Algebraic Properties of Optimization Problems in Complex Chemical Plants," Chem. Eng. Sci., 19 (1), p. 19 (1964).
Katz, S., "Best Operating Points for Staged Systems," Ind. and Eng. Chem., Fundamentals, 1 (2), p. 225 (1962).
King, C.J., Separation Processes, McGraw-Hill Book Co., New York, N.Y. (1971).
King, C.J., D.W. Gantz and F.J. Barnes, "Systematic Evolutionary Process Synthesis," Ind. and Eng. Chem., Process Design and Development, 11 (2), p. 271 (1972).
Knuth, D.E., The Art of Computer Programming, Vol. 1, Addison-Wesley Publishing Company, Reading, Mass. (1968).


Lasdon, L.S., "A Multi-Level Technique for Optimization," Ph.D. Thesis, Case Institute of Technology (1964); Systems Res. Center Report SRC 50-C-64-19.
Lasdon, L.S., Optimization Theory for Large Systems, MacMillan Publishing Co., New York, N.Y. (1970).
Lee, K.F., A.H. Masso, and D.F. Rudd, "Branch and Bound Synthesis of Integrated Process Designs," Ind. and Engin. Chem., Fundamentals, 9 (1), p. 48 (1970).
Loane, E.P., "An Algorithm to Solve Finite Separable Constraint Optimization Problems," Oper. Res., 19 (6), p. 1477 (1971).
Masso, A.H., and D.F. Rudd, "The Synthesis of System Designs: II. Heuristic Structuring," AIChE J., 15 (1), p. 10 (1969).
McGalliard, R.L., "Structural Sensitivity Analysis in Design Synthesis," Ph.D. Thesis, University of Florida (1971).
McGalliard, R.L., and A.W. Westerberg, "Structural Sensitivity Analysis in Design Synthesis," Chem. Eng. J., 4 (4), p. 127 (1972).
Miele, A., P.E. Moseley, A.V. Levy and G.M. Coggins, "On the Method of Multipliers for Mathematical Programming Problems," J. of Optimiz. Theory and Applications, 10 (1), p. 1 (1972).
Osakada, K., and L.T. Fan, "Synthesis of an Optimal Large-Scale Interconnected System by Structural Parameter Method Coupled with Multilevel Technique," Can. J. of Chem. Eng., 51 (1), p. 94 (1973).
Pho, T.K., and L. Lapidus, "Topics in Computer-Aided Design: Part II. Synthesis of Optimal Heat Exchanger Networks by Tree Searching Algorithms," AIChE J., 19 (5), p. 1182 (1973).
Pontryagin, L.S., V.A. Boltyanskii, R.V. Gamkrelidze, and E.F. Mischenko, The Mathematical Theory of Optimal Processes, John Wiley and Sons, New York, N.Y. (1962).
Powers, G.J., and R.L. Jones, "Reaction Path Synthesis Strategies," AIChE J., 19 (6), p. 1204 (1973).
Rockafellar, R.T., "Non-Linear Programming," American Mathematical Society Summer Seminar on the Mathematics of the Decision Sciences, Stanford University, July-August (1967).
Rozonoer, L.I., "The Maximum Principle of L.S. Pontryagin in Optimal System Theory: Part III," Automation Remote Control, 20 (12), p. 1517 (1959).
Rudd, D.F., "The Synthesis of System Designs: I. Elementary Decomposition Theory," AIChE J., 14 (2), p. 343 (1968).


Schock, A.V., and R. Luus, "Relationship of the Two-Level Optimization Procedure to the Discrete Maximum Principle," AIChE J., 18 (3), p. 659 (1972).
Siirola, J.J., and D.F. Rudd, "Computer-Aided Synthesis of Chemical Process Designs," Ind. and Eng. Chem., Fundamentals, 10 (3), p. 353 (1971).
Siirola, J.J., G.J. Powers, and D.F. Rudd, "Synthesis of Systems Designs: III. Toward a Process Concept Generator," AIChE J., 17 (3), p. 677 (1971).
Soland, R.M., "An Algorithm for Separable Nonconvex Programming Problems: II. Nonconvex Constraints," Manag. Science, 17 (11), p. 750 (1971).
Stephanopoulos, G., and A.W. Westerberg, "A Multilevel Technique for the Synthesis of Integrated Systems in the Presence of Nonconvexities," in preparation (1974).
Thompson, R.W., and C.J. King, "Synthesis of Separation Schemes," Technical Report, AEC Contract No. W-7405-eng-48, Lawrence Berkeley Laboratory, July (1972).
Umeda, T., A. Shindo, and E. Tazaki, "Optimal Design of Chemical Process by Feasible Decomposition Method," Ind. and Eng. Chem., Process Des. and Develop., 11 (1), p. 1 (1972).
Umeda, T., and A. Ichikawa, "Synthesis of Optimal Processing Systems by a Method of Decomposition," Paper presented in the 71st Nat. Mtg. AIChE, Dallas, Texas, February (1972).
Umeda, T., A. Hirai, and A. Ichikawa, "Synthesis of Optimal Processing Systems by an Integrated Approach," Chem. Eng. Sci., 27 (5), p. 795 (1972).
Westbrook, G.T., "Use This Method to Size Each Stage for Best Operation," Hydrocarbon Processing Petroleum Refiner, 40 (1), p. 201 (1961).
Westerberg, A.W., Decomposition of Large-Scale Problems, D.M. Himmelblau, editor, pp. 379-397, North Holland/American Elsevier, Amsterdam (1973).


BIOGRAPHICAL SKETCH

George Stephanopoulos, the oldest son of Nicholas and Elisabeth, was born June 1, 1947, in Kalamata, Messinia, Greece. Upon his graduation from the Paralia Gymnasium in June, 1965, he began his university studies at the National Technical University of Athens, Athens, Greece. In June, 1970, following the receipt of his Diploma in Chemical Engineering, he entered the Graduate School of McMaster University, Hamilton, Ontario, Canada, to work for a Master of Engineering degree in Chemical Engineering, which was granted to him in September, 1971.

In September, 1971, he enrolled in the Graduate School of the University of Florida, where he was initially awarded a teaching assistantship in the Department of Chemical Engineering. In June, 1972, he was awarded a research assistantship under Dr. A.W. Westerberg and in September, 1973, a Graduate School Research Assistantship for the completion of his studies at the University of Florida.

He was the recipient of the following awards:
1. First Award of the Greek Mathematical Society (1965).
2. Study Scholarships from the National Scholarships Establishment (1965-70).
3. Progress Scholarships from the National Technical University (1967-70).
4. Texaco Oil Co. research fellowship (1969).
5. First Award for Graduating Students of the National Bank of Greece (1970).
6. "THOMAIDION" Award from National Technical University (1970).
7. "CHRYSOVERGION" Award from National Technical University (1970).

Mr. Stephanopoulos has accepted a faculty position as an assistant professor in Chemical Engineering at the University of Minnesota.


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Arthur W. Westerberg, Chairman
Associate Professor of Chemical Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Michael E. Thomas
Professor of Industrial and Systems Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Thomas E. Bullock
Associate Professor of Electrical Engineering

This dissertation was submitted to the Graduate Faculty of the College of Engineering and to the Graduate Council, and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

June, 1974

Dean, College of Engineering

Dean, Graduate School

