On the automated optimal design of constrained structures


Material Information

On the automated optimal design of constrained structures
Physical Description:
ix, 254 leaves. : illus. ; 28 cm.
Hornbuckle, Jerry Clyde, 1942-
Publication Date:


Subjects / Keywords:
Structural dynamics   ( lcsh )
Structural engineering   ( lcsh )
Girders   ( lcsh )
Buckling (Mechanics)   ( lcsh )
Structural design   ( lcsh )
Engineering Sciences thesis Ph. D
Dissertations, Academic -- Engineering Sciences -- UF
bibliography   ( marcgt )
non-fiction   ( marcgt )


Thesis -- University of Florida.
Bibliography: leaves 244-253.
Statement of Responsibility:
Jerry C. Hornbuckle.
General Note:
General Note:

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 000585119
oclc - 14184224
notis - ADB3751
System ID:

Full Text








Copyright by

Jerry C. Hornbuckle



To my grandmother, Mrs. Florence Hornbuckle,

and my wife, Carolyn. Without the love, confidence, and

patient understanding of Granny Hornbuckle and Carolyn

my graduate studies would never have been attempted.


To Dr. Robert L. Sierakowski and Dr. William H. Boykin, Jr.,

for guiding my research and for being more than just advisors.

To Dr. Gene W. Hemp, Dr. Ibrahim K. Ebcioglu, and Dr. John

M. Vance, for their assistance, support, and for serving on my advisory


To Dr. Lawrence E. Malvern and Dr. Martin A. Eisenberg, for

always finding the time to offer advice and explanations on questions

related to solid mechanics and academics.

To the departmental office staff for their kind assistance

with administrative problems and clerical support.

To Randell A. Crowe, Charles D. Myers, and J. Eric Schonblom

for attentive discussions of many little problems and for assistance

in preparing for the qualifying examination.




ABSTRACT . . . viii

I INTRODUCTION . ,. . .. 1

1.0 Survey Papers . .. .. 1
1.1 Historical Development: Optimal Columns 6
1.2 Historical Development: Optimal Static Beams 9
1.3 Historical Development: Optimal Dynamical Beams 12
1.4 Scope of the Dissertation . .. 14


2.0 Introduction . . .. 16
2.1 Problem Classification Criteria . 16

2.1.0 Problem Classification Guidelines .. 18
2.1.1 Governing Equations of the System .. 18
2.1.2 Constraints . ... 19
2.1.3 Cost Functionals . ... 21
2.2 Methods: Continuous Systems . ... 26
2.2.0 Special Variational Methods .. 27
2.2.1 Energy Methods . .. 27
2.2.2 Pontryagin's Maximum Principle .. 29
2.2.3 Method of Steepest Ascent/Descent .. 31
2.2.4 Transition Matrix: Aeroelasticity Problems 31
2.2.5 Other Miscellaneous Methods .. 32

2.3 Methods: Discrete Systems . ... 33

2.3.0 Mathematical Programming . .. 34
2.3.1 Discrete Solution Approximations .. 35
2.3.2 Segmentwise-Constant Approximations .. 37
2.3.3 Complex Structures with Frequency
Constraints . 38
2.3.4 Finite Element Approximations .. 40
2.4 Closure . . .. 41




3.0 Introduction . . .. 43
3.1 Problem Statement and Necessary Conditions ... 44
3.2 Mathematical Programming: Gradient
Projection Method . ... 48
3.3 Gradient Projection Methods Applied to
the Maximum Principle . ... 53
3.4 Maximum Principle Algorithm . .. 60
3.5 Solution Methods . .. 63


4.0 Introduction . . 66
4.1 Problem Statement . .. 66
4.2 Structural System. .... . 67
4.3 Unmodified Application of the Maximum Principle 70
4.4 Results: Geometric Control Constraints .. 79
4.5 Inequality Stress Constraints . .. 89
4.6 Results: Stress Constraints Included. .. 93


5.0 Introduction . . .. .. 101
5.1 Problem Statement . .. 101
5.2 Structural System . .. 102
5.3 Analysis of the Problem . ... 109
5.4 Application of the Maximum Principle .. 121
5.5 Results: Geometric Control Constraints 132
5.6 Inequality Stress Constraints . ... 148

AN EXAMPLE . . 155

6.0 Introduction . . .. 155
6.1 Finite Element Problem Statement. ... 155
6.2 Mathematical Programming: Gradient
Projection Method . .. 157
6.3 Results . ... 162


7.0 Introduction . .. 173
7.1 Computer Program Convergence Features .. 1.73
7.2 :!,.:rical Instatilibies for Cantilever
Beam Example . . 175
7.3 Numerical Instabilities for Column
Buckling Example . .... 183



8.0 Summary and Conclusions . ... 186
8.1 Recommendations ... . 186




BIBLIOGRAPHY ..... .... . 244


Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy



Jerry C. Hornbuckle

August, 1974

Chair-man: Dr. William H. Boykin, Jr.
Co-Chairman: Dr. Robert L. Sierakowski
Major Department: Engineering Sciences

Pontryagin's Maximum Principle is applied to the optimal design

of elastic structures, subject to both hard inequality constraints and

subsidiary conditions. By analyzing the maximum principle as a non-

linear programming problem, an explicit formulation is derived for the

Lagrangian multiplier functions that adjoin the constraints to the cost

functional. With this result the usual necessary conditions for opti-

mality can themselves be used directly in an algorithm for obtaining

a solution.

A survey of general methods and problems in the optimal design

of elastic structures shows that there are two general problem types

depending upon whether or not the cost functional is an eigenvalue.

An example problem of each type is included with the solutions obtained

by the method of quasilinearization. In the first example, a minimum

deflection beam problem, classical Maximum Principle techniques are

used. The eigenvalue problem is exemplified by the maximization of the

buckling load of a column and uses the explicit multiplier function


formulation mentioned above. Since the problem considered is conserva-

tive, it is therefore described mathematically by a self-adjoint system;

under this condition it is shown that the minimum weight problem is

identical to the maximum buckling load problem.

In order to demonstrate the theory for the programming techniques

used, the beam problem is also solved by using a finite element repre-

sentation of the structure. From a comparison to the maximum principle

solution it is shown that the form of the optimal solution obtained is

dependent upon the magnitude of the tolerance used with the numerical

solution scheme. Furthermore, it is shown that convergence by the

quasilinearization algorithm is related to the respective curvatures of

the initial guess and the solution.

Recommendations for additional investigations pertinent to this

study are also included.



1.0 Survey Papers

It is exceedingly difficult to write a general introduction to

the field of structural optimization for two basic reasons: (i) there

is no conventionally accepted nomenclature, and (ii) there is also no

conventionally accepted classification of problem types or character-

istics. In marked contrast, when one considers the calculus of varia-

tions, "cost functional, system equations, kinematic and natural bound-

ary conditions, adjoint variables, Hamiltonian, etc.," all have well-

defined, universally accepted meanings. Additionally, there is no con-

fusion when speaking of the problem types of Mayer, Lagrange, and Bolza.

This common language and categorization of problems does not exist in

structural optimization.

Instead, the field tends to branch and fragment into very special-

ized sub-disciplines that are oriented towards applications. ':hile these

branches are related to the general field, the techniques and methods of

one branch can seldom be applied to another. Moreover, as a result of

the tendency to an applications orientation, solutions are generally ad

hoc and not useful for other problems even within the same branch. The

lack of any definitive unification of the subject cannot be blamed either

on being recently developed or in receiving too little attention.

This is readily seen by considering the survey papers described in the

following paragraphs.

The earliest comprehensive survey paper is Wasiutynski and

Brandt (1963). Although their excellent historical development is

dominated by Russian and Eastern European references, the authors do

include a higher percentage of papers by Western authors than is

encountered in the typical paper from Eastern Europe. A more funda-

mental criticism is that too little is said regarding problem types or

solution methods.

Chronologically, the next survey paper is Gerard (1966).

The theme of this review is aerospace applications, with a particular

orientation to the design-management, decision-making process. Most of

the papers cited treat specialized aerospace structures and applica-

tions; however, the author does try to generalize by introducing a

design index D, a material efficiency parameter M, and a structural

efficiency parameter S. After defining the expressions for M and S

corresponding to several structural elements, design charts are pre-

sented which show regions of possible application for various materials.

Unfortunately the design charts do not satisfy expectations aroused by

the introduction of the three general parameters.

Rozvany (1966) presents a similar paper pertaining to structures

in civil engineering. This paper is less comprehensive and more ori-

ented towards specific structural applications than either of the pre-

ceding survey papers. The author postulates several "interrelated

qu3nti.ies (or parameters)" which could perhaps be used to generalize

structural optimization into a more rational methodology. However,

these quantities--loading (L), material (M), geometry (G), initial

behavior (IB), and design behavior (SB)--are only applied to an abstract

discussion of concepts.

Barnett (1966) has a readable short survey of the field that

dwells more upon theoretical aspects. He postulates a general problem

in which the cost is minimized subject to a system in equilibrium with

its loads, while "behavior constraints on strength, stiffness, and

stability" are satisfied. Uniform strength design introduces the dis-

cussion of optimal trusses; virtual work theorems that were derived

originally for trusses are then applied to simultaneous plastic collapse

problems. Following a brief discussion of the plastic collapse load

bounding theorems, there is a short treatment of elastic stability

problems and material merit indices. Barnett's stiffness design example

exhibits several important features worth noting. Specifically, the

example is to minimize the weight of a beam subject to some given load

where the deflection at a certain point is specified. Virtual work is

used to handle the subsidiary deflection condition. Necessary condi-

tions are obtained from the calculus of variations, but more signif-

icantly, the Schwarz inequality provides a sufficient condition for

global optimality. Barnett concludes his survey with a section stating

that multiload designs satisfying "all three behavior criteria" are more

easily solved in "design space" by mathematical programming techniques

described by Schmidt (1966).

As a sequel to the comprehensive survey (1638 to 1962) by

Wasiutynski and Brandt, Sheu and Prager (1968b) present a complete review

of developments from 1962 to 1968. This paper contains three major

sections: general background, methodology, and specific problems.

In the first section they state that the "well-posed problem of optimal

structural design" requires specification of the

(1) purpose of the structure (load and environment),

(2) geometric design constraints (limits to design

(3) behavioral design constraints (limits to the "state" of
the structure),

(4) design objective (cost functional).

The methodology section is not noteworthy, but the third section lists

what is in their opinion the specific problem types: static compliance,

dynamic compliance, buckling load, plastic collapse load, multipurpose/

multiconstraint structures, optimal layout (e.g., trusses), reinforced/

prestressed structures, and from the background section, probabilistic

problems. Their concluding remarks succinctly summarize the paramount

difficulty of the subject: realistic problems are too complicated for

precise analytical treatment. While progress is being made in analyt-

ical treatment of simple structures, the authors opine that realistic

problems require mathematical programming techniques. However, they do

Feel that analytical treatment is desirable to provide "a deeper insight

into the analytical nature of optimality."

A related survey by Wang (1968) on distributed parameter systems,

". .. whose dynamical behaviors are describable by partial differential

equations, integral eqlu-jtiLns, or functional differential equations,"

consists entirely of a bibliography. While not pertinent to the disser-

tation, it is mentioned here for completeness.

Prager (1970) provides another survey which is not comprehensive,

nor does he present any new results as claimed. For example, Prager

and Taylor (1968) used the principle of minimum potential energy

and the assumption that stiffness is proportional to specific mass to

prove global optimality. It could hardly be called a new development

in 1970. At the same time, the author does present an excellent example

of a multipurpose optimal design problem. Prager also treats "segment-

wise constant" approximations and the optimal layout of trusses.

The final survey paper, Troitskii (1971), is an unusual review

of methods in the calculus of variations. Whereas some of the earlier

surveys present lengthy lists of references but contain little method-

ology or theory, this survey is just the opposite. Part of what makes

it unique is that the author believed only eight articles merited cita-

tion--all of them by Troitskii. This shortcoming is more than overcome

by a thorough classification of optimal control problems in the calculus

of variations. Troitskii bases the classification on "certain character-

istics of control problems": types of constraints, properties of the

governing dynamical equations of the system, type of cost functional, and

possible state discontinuities. From these four criteria he postulates

five principal classes of problems; however, it is the criteria that

are important and not the specific problem type.

By comparing what the authors of the aforementioned surveys

believe to be the important types of problems, it is readily apparent

that there is little agreement on which characteristics of structural

optimization problems are significant.

1.1 Historical Development:
Optimal Columns

The beginning of structural optimization is generally attributed

to Galileo's studies in 1638 of the bending strength of beams. Accord-

ing to Barnett (1968), Galileo considered a constant-width cantilever

beam under a tip load as part of a study of "solids of equal resistance."

In requiring the maximum stress in each cross section to be constant

throughout the beam, the height must be a parabolic function of position

along the beam. While this appears to be the origin of the field, a prob-

lem that received more attention is the buckling of a column.

Using the newly developed calculus of variations, in 1773 Lagrange

attempted to apply variational techniques to the problem of finding that

distribution of a homogeneous material along the length of a column which

maximizes the buckling load. Truesdell (1968) relates that through an

insufficient mathematical formulation Lagrange showed the optimal form

to be a circular cylinder. Clausen (1851) provides the earliest known

solution to this problem for the simply supported case. As described in

Todhunter and Pearson (1893, pp. 325-329), Clausen minimized the volume

of the column with the differential equation for buckled deflection

treated as a subsidiary condition. Assuming all cross sections to be

similar, after several variable transformations and complicated manip-

ulation, he obtained an implicit, analytical solution.

The next development was Greenhill (1881), according to Keller

and Niordson (1966). Greenhill determined the height of a uniform

prismatic column, beyond which the column buckled due solely to its own

weight. Timoshenko and Gere (1961) reproduce the solution in which the

deflection is expressed as the integral of a Bessel function of the

first kind (of the negative one-third order).

Blasius (1913) introduces his paper with a uniform strength and

a minimum deflection beam problem. For a given load and amount of mate-

rial, the cross-sectional area distribution is determined which maxi-

mizes the buckling load of a circular column. The solution is identical

to that obtained by Clausen. In addition, Blasius also obtained solu-

tions for columns having rectangular cross sections and discussed the

effect of different boundary conditions on the results.

For the next few decades, structural optimization appears to

have been directed towards applications in the aircraft industry, where

aircraft structural problems and results are presented in the format of

a design handbook. Feigen (1952) is a good example of this, consider-

i:; the buckling of a thin-wall column. Given a constant load and wall

thickness, he required the variable inside diameter to be chosen such

that the buckling load is maximized. Wall thickness is selected to make

local buckling and Euler buckling occur at the same load. Solid tapered

columns having blunt ends are also treated for assumed stiffness dis-


Renewed interest was aroused by Keller (1960), who examined the

problem from the point of view of the theory of elasticity, and in

choosing the cross-sectional shape to give the maximum stiffness.

Neglecting the r--.bht of the column, he obtained via the former that

twisting the column does not affect the buckling. Of all convex cross

sections, the equilateral triangle is shown to have the largest second

moment of area relative to a centroidal axis. Hence, from the defin-

ition of buckling load, the "best" cross-sectional shape is the equi-

lateral triangle. Keller also obtained Clausen's implicit, analytical

solution. Subsequently, Tadjbakhshand Keller (1962) generalized the

problem to a general eigenvalue problem and boundary conditions subject

to a subsidiary equality constraint. The latter corresponds to spec-

ifying the volume (or weight) of material to be distributed in an opti-

mal manner. Using the H6lder inequality they demonstrate global opti-

mality of the eigenvalue for the hinged-hinged column.

Keller and Niordson (1966) examine the case of a vertical column

fixed at the base, subject to a vertical load at the tip and the column's

own weight. It is also assumed that all cross sections are similar.

Their approach is to state the problem as a simultaneous, dual optimiza-

tion of the Rayleigh quotient. The eigenvalue is minimized with respect

to the eigenfunction and maximized with respect to cross-sectional area

distribution. Specifying the volume of material available is treated

as a subsidiary equality constraint. From the maximum lowest eigenvalue

the maximum height at buckling is determined. Solutions are obtained by

an iterative technique employed with integral equations.

In a brief note, Taylor (1967), suggests that ener, methods may

link optimum column problems to the traditional eigenvalue problems of

mechanics. Prager and Taylor (1968) classify problems in optimal struc-

tural design and demonstrate global optimality using energy principles.

Unfortunately their assumption of thin-wall construction limits the

results to structures where the stiffness is proportional to the specific

mass density. The consequence of this assumption is that in the energy

formulation the resulting control law and governing equations are

decoupled, and hence easily solved. Huang and Sheu (1968) apply this

same thin-wall assumption to the problem treated by Keller and Niordson.

However, the former seek the maximum end load instead of the maximum

height. The authors also attempt to limit the maximum allowable stress

and obtain solutions by a finite-difference method. Further discussion

of sandwich (thin-wall construction) columns is given by Taylor and Liu

(1968). Basically, this paper is an elaboration of the techniques

described by Prager and Taylor when applied to columns. Extensive

results are provided for various cases.

Post-buckling behavior for columns subject to conservative

loads is considered by Gajewski and Zyczkowski (1970). A nonconcerva-

tive problem is treated by Plaut (1971b). The first of these two papers

is lengthy but is much too narrow in scope to be particularly useful.

In the second paper, the Ritz method is applied to an energy functional,

obtaining the "best" form of the assumed approximation to the optimal


1.2 Historical Development:
ODtimal Static Beams

That beam problems played a role in the early developments of

structural optimization has already been indicated in the preceding sec-

tion. No attempt is made in what follows to present a complete history,

but merely to outline the type of problems that have been considered.

Opatowski (1944) has an outstanding paper that deals with

cantilever beams of uniform strength. Besides providing numerous refer-

ences to earlier studies, the author treats the problem with impres-

sive mathematical rigor. The beam is considered to deflect under its

own weight and a transverse tip load; bending deflection is described

by a Volterra integral equation which is solved exactly for various

assumed classes of variable cross-sectional geometry. This paper is

representative of earlier papers in that it contains extensive analysis

and analytical results, but little numerical data. Barnett's work (1961)

and its sequal (1963a) apply the calculus of variations to more real-

istic I-beams. One problem considered is maximizing the weight sub-

ject to general, unspecified loads for a specified deflection at a given

point. The Schwarz inequality is used to derive a sufficient condition

for global optimality. Also included is a comparison of uniform strength

beams to the minimum deflection beam for several different cases of

applied load and geometry. The paper is concluded with various minimum

weight design examples in which both bending and shear stiffness are con-


Haug and Kirmser's (1967) work is one of the most comprehensive

studies of minimumweightbeam problem published. Wlnilc it may succeed in

handling any conceivable problem and in employing the most realistic

stress constraints, this very generality requires so many variables and

conditions that the mathematics is complicated almost beyond reason.

Another study of minimum weight beams (Huang and Tang, 1969) is

important for several reasons. By dividing the beam into many segments

having constant properties, and determining the necessary conditions

that must be satisfied by every segment, it appears that the authors

are using the same methods that were used to solve the Brachistochrone

problem in the seventeenth century. In their treatment, multiple

loads must produce a specified deflection at a given point; these

"segmentwise constant" approximations and multiple load problems have

recently received more attention. Of further interest in this paper is

the derivation of the multiple load optimality condition using

Pontryagin's Maximum Principle.

While no new results or techniques are contained in Citron's

(1969, pp. 154-166), the author gives a very readable minimum weight

beam example. The problem is simple, described in detail completely,

and provides an excellent example of how control theory is applied to

an optimal problem in structural design. An analysis of intermediate

beams which may form plastic hinges is provided by Gjelsvik (1971).

It shows that if hinges are placed at all points of the beam where the

bending moment is zero, this makes the plastic or elastic minimum

weight beam statically determinate. Both the elastic and plastic beam

designs are shown to be fully stressed, i.e., are uniform strength beams.

Application of generalized vector space techniques is character-

istic of more recent papers. Bhargava and Duffin's (1973) is such a study.

It treats the maximum strength of a cantilever beam on an elastic

Since these early papers are not readily available, this statement
is based upon descriptions of them given by historical references
cited in Appendix A.

foundation subject to an upper bound on weight. Although involving

more advanced mathematics than normally required by variational tech-

niques, vector space methods may also provide more powerful analyt-

ical tools.

1.3 Historical Development:
Optimal Dynamical Beams

The title of this section is a misnomer. On the basis of the

related literature, a more realistic terminology would be "Optimal

Quasistatic Beams." Papers dealing with the optimization of dynamical

beams ultimately contain some assumption or given condition that effec-

tively transforms the problem to an equivalent static case. Simple

harmonic motion is frequently assumed to remove time dependence from the

governing equation. For example, Barnett (1963b)minimizes the tip of

deflection of a cantilever beam accelerating uniformly upwards, with the

total '.Aicht of the beam specified. The optimal solution for cross-

sectional area distribution is specified by a nonlinear integral equa-

tion solved by successive approximations. Dynamics enters the problem

only as a time invariant inertia load which converts the problem to a

static beam subject to a body force.

Niordson (1964) finds the tapering of a simply supported beam

of given volume and length which maximizes the fundamental frequency of

free vibration. Assuming that all cross sections are similar, Niordson

expresses the desired frequency as the Rayleigh quotient. This is

obtained by the equation for spatial dependence associated with the

usual separation of variables technique. In this case it is assumed

that deflection, shear, and rotational inertia effects are small.

Solution of the conditions for optimality were obtained by successive

approximations. This approach results in a problem identical in form

to the eigenvalue problem associated with maximizing the Euler buckling

load of a column.

A very specialized problem is treated in Brach (1968). Cross-

sectional area and stiffness are proportional, and have both upper and

lower bounds; material properties and length of the beam are specified

constants. The object is to make the fundamental frequency of free

vibration stationary, without any constraints on the weight. Instead

of using the Rayleigh quotient, Brach uses the total potential energy.

His solution method is ad hoc and not generally useful. A more useful

approach is described by Icerman (1969) for structures subject to per-

iodic loads. Necessary and sufficient conditions for minimum weight

are obtained from the principle of minimum potential energy. Amplitude

and frequency of the applied load is specified as well as "dynamic

response," defined as the potential energy associated with the load

amplitude displaced a distance equal to the displacement amplitude at

the point of application. It is also required that the load's frequency

be less than the fundamental natural frequency. Subject to these con-

straints, the structure's weight is to be minimized. Trusses and

segmentwise constant approximations are also treated. Once again a

dynamical problem is effectively transformed into a quasistatic problem.

Realistic treatment of the dynamic response of beams subject to

time dependent loads is given in Plaut (1970). An upper bound on the

response of the beam is obtained from the "largest possible displace-

ment of the beam under a static concentrated unit load." This inequal-

ity is determined from the time derivative of the total energy of the

beam and the Schwarz inequality. Plaut minimizes the upper bound on

the response for specified total weight and relationship between specific

mass and stiffness. Despite the consideration of a truly dynamical

problem, this approach has two weaknesses indicated by the author.

First, the upper bound need not be close to the exact answer; secondly,

there is no demonstration that minimizing the upper bound also minimizes

the actual response of the beam subject to dynamic loading.

Another paper worthy of mention is Brach and Walters (1970).

They maximize the fundamental natural frequency which is expressed as

a Rayleigh quotient that includes the effect of shear. Standard varia-

tional methods are employed to derive necessary conditions, but no

solutions nor examples are given. The authors do, however, suicest

using the method of quasilinearization. This paper is another example

of a quasistatic, dynamical beam problem.

1.4 Scope of the Dissertation

This dissertation is primarily concerned with the application

of Pontryagip,'s Maximum Principle to problems in structural optimization.

Only elastic materials are considered; however, various types of con-

straints are treated.

The theoretical development of the dissertation pertains to

problems described by an ordinary differential equation but is based

upon a numerical technique normally used with systems described by

a finite number of discrete quantities. For this reason, examples are

included for both types of systems.

No new solution techniques have been developed for the nonlinear

two point boundary value problems which characteristically arise in

optimization problems for continuous systems. Solutions are obtained

by a standard quasilinearization method. However, a modified "feasible

direction" numerical algorithm for use with discrete systems is

described and an example included to demonstrate its operation. This

serves to illustrate the application of the theory on which the algo-

rithm is based to the theoretical development associated with contin-

uous systems. Furthermore, it provides a comparison between the

solution to a problem described as a continuous system, and alternately

by a discrete approximation.

Additionally, it is the intent of this dissertation to replace

some of the confusion in classification of problem types contained in

the various survey papers with an organization based upon mathematical

attributes. The result is a logical approach to the formulation of

optimization problems for elastic structures.



2.0 Introduction

The first chapter presents a broad view of structural optimiza-

tion and the historical development of two general types of problems

that are used as examples in later chapters. Before the mathematical

theory is developed in the next chapters, general problems and methods

in structural optimization itself are briefly outlined in this chapter.

The classification of problem types vis-a-vis mathematical attributes

is discussed first. This is followed by short descriptions of the

major methods of structural optimization for both continuous and dis-

crete systems. Appropriate references are cited for each section.

2.1 Problem Classification Criteria

Perhaps the major source of difficulty in classifying struc-

tural optimization problems lies in the translation from the physics

involved to a mathematical representation. A single physical concept

when transformed to mathematics may become more than one mathematical

attribute. For example, consider the class of conservative problems.

Since energy is conserved this immediately prohibits dissipative mate-

rials, time varient constraints, and nonholonomic constants. More

important, consider the following statements from Lanczos (1962,

p. 226):

all the equations of mathematical physics which
do not involve any energy losses are deducible from
a "principle of least action," that is the principle
of making a certain scalar quantity a minimum or a
maximum .all the differential equations which are
self-adjoint, are deducible from a minimum-maximum
principle and vice versa.

However, it is shown in Chapter V that to be self-adjoint systems places

requirements on both the differential equation and the boundary condi-

tion. Thus, the single physics attribute of being a conservative sys-

tem is described mathematically by expressions involving the cost func-

tional, the governing system of differential equations, boundary condi-

tions, and constraints.

As a result of this lack of similarity in descriptions, a choice

must be made as to which realm will be used for classification of prob-

lems. Since all problems are ultimately transformed to mathematics,

mathematical characteristics are selected as the criteria. On the basis

of survey paper contents and the many related papers, it is felt that

the proper (not necessarily the best, nor all inclusive) characteristics

for classification of problem types are:

(i) cost functional

(ii) system equation and boundary conditions

(iii) control constraints

(iv) behavioral constraints (state and/or control constraints)

Subsequent discussion is in terms of these four characteristics.

Exceptions to these descriptors are readily acknowledged, e.g., whether

the problem is deterministic or probabilistic. An example of the latter

may be seen in Moses and Kinser (1967). These exceptions do not serve

as negating counterexamples but instead indicate the requirement for

additional descriptors and verify the difficulty of the task, suggest-

ing the need for further comprehensive study.

2.1.0 Problem Classification Guidelines

In the following sections each criterion is briefly discussed.

Some of the various characteristics of each are mentioned, and where

appropriate references exist the citation is given. It must again be

emphasized that the following is not all inclusive; it is an attempt to

categorize the types of problems existing in the literature according

to the four mathematical descriptors postulated. Moreover, the descrip-

tors are not discussed in the order given but instead are treated in

the order normally encountered during a problem solution.

2.1.1 Governing Equations of the System

The immediate question to be answered is whether the structural

system is described by a set of continuous functions or a set of dis-

crete constants. Bending deflection of simple structural elements is

an example of the former; design of trusses is a good example of the

latter. In general, variational techniques are employed with contin-

uous systems while mathematical progr-irming techniques are most fre-

quently applied to the discrete systems. However, variational methods

can be applied to the approximation of continuous systems by discrete

elements. This usually takes the form of either a finite element or

"segmentwise constant" approximation of the continuous structure.

References treating the different systems described above are included

in the section on methods.

2.1.2 Constraints

Two of the descriptors are postulated to be control constraints

and behavioral constraints. A further consideration is whether the

constraint is defined by an equality relationship or by an inequality

expression. Equality constraints are handled by a long-known technique

entitled "isoperimetric constraints." Valentine's (1937) work is known

to contain the initial development of a technique for converting inequal-

ity constraints to equality constraints. The introduction of slack

variables increases the number of variables in the problem to be solved

but at the same time permits all of the isoperimetric techniques to be

used. A detailed application of this approach is presented in Appendix B.

It should also be noted that isoperimetric constraints are sometimes

referred to as "accessory or subsidiary conditions."

Most real structural optimization problems possess an isoperi-

metric constraint as well as inequality constraints dependent upon the

control u and/or the state x of the structural system. Typically the

control constraints are the result of geometrical limitations or

restrictions to the types of available materials. Behavioral constraints

are related to the state of the system and may depend solely upon the

state (of deformation), or in the case of most stress constraints,

jointly upon the state and control variables. With this distinction the

constraints may be classified vis-a-vis the two criteria and optimal

control characteristics as follows:

By convention all vectors are column vectors unless indicated otherwise.

(i) unconstrained

(ii) 4(u) < 0 control constraint

(iii) O(x,u) < 0
Behavioral constraints
(iv) 0(x) 0)

All of this discussion pertains to both continuous and discrete systems.

No references for unconstrained optimization problems are given.

It may be that in some problems the unconstrained structural optimiza-

tion solutions have either infinite stiffness and finite weight, or

finite stiffness and zero weight. A discussion of this can be found in

Salinas (1968, pp. 23-26).

Investigation of control constraints led to the development of

the maximum principle. Although reserving discussion of the method for

a later section, the classical reference detailing the derivation of

the principle is given here for completeness. Rozonoer (1959) treats con-

trol constraints but only as related to the development of the maximum


Most of the literature concerns either bounded control problems

or a more general form of constraint which can be classified as a behav-

ioral constraint. The latter is a mixed constraint which depends upon

both the state and control variables. Breakwell (1959) is a lucid paper

dealing with this type of constraint. References which treat state

and/or mixed constraints are Bryson et al. (1963), and Speyer and

Bryson (1968). Constraints which depend upon only the state are not

treated in this dissertation; an example of such a constraint is to

determine the optimal solution for some problem subject to an upper

bound on deflection of the structure at any point.

2.1.3 Cost Functionals

There are two basic types of cost functionals that occur in the

field of structural optimization. They were first identified by Prager

(1969) but not for the proper reasons. Using Prager's notation, they are

J = Min f F(p) dV
f G(W) dV
JQ = Min
f H(J ) dV

where F, G, and H are scalar functionals of -. The latter functional

represents a Rayleigh quotient associated with an eigenvalue problem.

It can be reduced to the first type of functional shown above by choos-

ing a normalization of the eigenfunction such that the numerator

equals unity for all admissible variations (see section 5.3). This

normalization is thereafter treated as a subsidiary constraint.

What actually distinguishes the second functional from the

first is not that the functional is a quotient, but that the extrem-

ization of an eigenvalue requires a dual extremization (see section 5.3).

In terms of a state x and control u, the fundamental eigenvalue is

given by minimization of the Rayleigh quotient with respect to the

eigenfunction x, or

f G(x,u) dV
J = Min
x f H(x,u) dV

where u represents some specified design parameter. If the desired

result is to maximize the cost JQ with respect to all admissible u, it

is observed that a second extremization is required; for example, see

Keller and Niordson (1966). Thus, a more appropriate manner of classi-

fying cost functionals is on the basis of whether the problem statement

implies a single or a double extremization. Hence the two basic types

of cost functionals encountered in structural optimization are

J = Min f F(x,u) dV
u V

f G(x,u) dV
J = Max Minr
u x f H(x,u) dV

where x must satisfy an equilibrium condition of the state, and u is

subject to some admissibility requirements.

There is a special case related to these two in which the weight

is to be minimized for a specified eigenvalue. This problem is

treated in Icerman (1969) with a mathematical discussion of such a

variational problem presented in Irving and Mullineux (1959, p. 394).

In terms of the two cost functionals, the special case is

J = Min f {G(x,u) J H(x,u)} dV
uV V

where J is a specified constant. This approach is frequently employed

in eigenvalue problems to avoid the inherent difficulties associated

with the dual extremization problem.

There have also been many papers published that consider

"multi-purpose structures," e.g., Prager (1969), Martin (1970), and

Prager and Shield (1968). The cost function for such problems is

defined as

J = Z a.J.(x,u)

where a. are positive constants, serving as weighting parameters.

While perhaps demonstrating much potential, no significant results

obtained with this approach have so far been published. What problems

have been solved are too simple; indeed the authors indicate the need

for using a discrete approximation and mathematical programming tech-

niques in realistic applications.

A subject closely related to "multi-purpose structures" is

that of multiple constraints. It is mentioned here only because most

papers on the latter also include the former--see Martin (1971). The

idea of multiple constraints is not new; in both variational and math-

ematical programming fields there exist standard techniques for han-

dling multiple constraints.

A recent Russian paper (Salukvadze, 1971) suggests an alter-

nate to the "multi-purpose" cost functional. Instead of treating a

vector functional that requires the choosing of weighting coefficients,

it is suggested that the several functionals be combined into one.

Given a system and vector cost functionals

x = f(x,u,t)

J.[u] = J.(x,u,t) i = l,...,k

Let uOPT denote the optimal solution for which J. assumes the optimal
-OPT 1
value on the trajectory of the system. For each of the J. there is a
different u PT. These k values J. can be thought of as components of
-OPT 1

a vector r where

= (1)1 [ (kl )T
"I PTJ k"'"

For any arbitrary u the result

r[u] = {J [u] ... J [u]}

is just some vector functional.
Vector r represents a constant point in the space of

(J ,...'Jk). Since no choice of u can optimize all of the J. simul-

taneously, that is, to attain the point r in J.-space, the best alter-
native is to minimize the distance between r[u] and r That distance

is defined by the Euclidean norm. To avoid the question of incon-

sistent dimensions the functionals are reduced to dimensionless form.


k k J[u] J.M
J[u] = i -- ___
i=1 J.
i PT

and PyT is that function u which minimizes the functional J[uj.

Mathematically speaking,

uOT = ARGMIN {J[u]}

This type of vector cost functional is much more appealing

than the type treated in the papers on multi-purpose structures.

It also suggests an entirely new field of study: the more realistic

t T
Superscript "T", e.g., u denotes the transpose of the vector u.

choice of cost functionals. The mathematics of a problem seldom

accommodates financial considerations. For example, the design which

requires the least material may reduce the cost of materials at an

overwhelming expense in manufacturing or fabrication. When aesthetic

appeal and environmental impact are included--as must be done in any

real, commercial application--the selection of an appropriate cost

functional is an almost insurmountable task. However, a simple exten-

sion of Salukvadze's composite cost functional may reduce the difficul-

ties to operations research considerations.

In problems where it is desired to optimize simultaneously

several different functionals, not all having the same dimensions, the

concept of a generalized inner product may prove useful. It is defined

in terms of a metric operator A; for some general vector z

z ||2= (z,z)A =TAz

and symbol means "is defined by." With reference to the vector

cost functional, A represents a set of scale factors which converts

all of the separate cost functionals to a common dimension. This is

where the operations research enters--relating material expense to

fabrication to sociological considerations and so forth--to determine

the metric A. For a vector c whose elements are functionals,

E = r[u] r

where r[u] and r are defined above, the composite cost functional is

J[u] = (E,Ac)

The weakness in this method is that an optimal solution must be

obtained for each individual cost functional prior to attempting a

solution to the composite problem. Additionally, some of the more

abstract cost objectives may be difficult to quantify in a meaningful

manner. Despite these shortcomings this approach does suggest inter-

esting applications.

2.2 Methods: Continuous Systems

The problems characterized by mathematical functions, in con-

trast to those represented by a set of discrete constants, are normally

treated by variational techniques. Many books on this subject have

been published; the better authors include Elsgolc (1961), Gelfand and

Fomin (1963), Dreyfus (1965), Hestenes (1966), Denn (1969), Luenberger

(1969), and Bryson and Ho (1969). An excellent summary paper is avail-

able in Berg (1962).

To see how these techniques are applied, three papers are

recommended. The first is Blasius (1913), which provides sufficient

detail and explanation to make it quite worthwhile. Although it does

include several examples that involve subsidiary conditions, no inequal-

ity constraints are treated. An example that includes inequality con-

straints to the control variable is contained in Brach (1968). A much

more general application of variational principles is presented in

Oden and Reddy (1974). In this paper a dual-complementary variational

principle is developed for a particular class of problems. It is shown

that the canonical equations obtained are the Euler-1.a-range equations

for a certain functional.

2.2.0 Special Variational Methods

Besides the ordinary variational method, more specialized

techniques have been developed to the point where they are recognized

as independent methods in their own right. In the following sections

these methods are identified and a number of representative references


2.2.1 Energy Methods

The oldest of these methods is the energy method. It originated

with the principle of minimum potential energy, and was later extended

to include the representation of eigenvalues through the energy func-

tional. A good discussion of the former is available in Fung (1965) or

Przemieniecki (1968); the best general treatment of the latter is avail-

able in either Gould (1957, Chapter 4) or in Mikhlin and Smolitskiy

(1967, Chapter 3).

The principle of minimum potential energy is frequently used

with simple problems to prove that a necessary condition for optimality

is also sufficient. Prager and Taylor (1968) contains such a proof

for the global, maximum stiffness design of an elastic structure of

sandwich construction; two papers that also consider this problem are

Huang (1968) and Taylor (1969). Specific application of the energy

method to an eigenvalue problem is demonstrated in Taylor and Liu (1968).

A much more general discussion of the energy method is provided in

Salinas (1968). Further extensions of the method are presented in

Masur (1970), in which the principle of minimum complementary energy

is applied to problems of the optimum stiffness and strength of elastic

structures. In these problems a necessary condition for optimality

is that the strain energy density be constant throughout the structure.

This condition is also sufficient for optimality in certain classes of

structures that satisfy a specific relationship between the strain

energy density and design variables.

Many of the energy problems belong to the class of problems

having quadratic cost functionals. The significance of this character-

istic is that the Euler-Lagrange equations derived from such functionals

are linear.

A more recent energy method development is the concept of

"mutual potential energy." Mutual potential energy techniques resemble

those of the principle of minimum potential energy. In both methods

a cost functional is defined over the entire domain occupied by the

structure and optimized with respect to the control variable. If it

is desired that the optimal solution be required to have a specified

deflection at a certain point, this condition corresponds to a sub-

sidiary state constraint when using the principle of minimum potential

energy. The mutual potential energy method incorporates this type of

localized constraint into the cost functional which is defined over

the entire domain of the structure. By itself this alone is advanta-

geous; however, for certain types of problems the mutual potential

energy method also provides both a necessary and sufficient condition

for global optimality. In the way of a critical comment, either the

method has received too little attention, or else it does not effi-

ciently handle problems more difficult than the simple examples presented.

Four papers that are representative of the literature associated with

this method are Shield and Prager (1970), Chern (1971a, 1971b), and

Plaut (1971c).

Another recent development is that the class of problems for

which energy methods is applicable has been expanded to include cer-

tain types of nonconservative systems. Together with the mutual poten-

tial energy concepts, this suggests that perhaps the classical energy

method is a special case of a more generalized method. If a technique

can be developed which uses the adjoint variables to transform a gen-

eral nonconservative system into an equivalent self-adjoint form, the

method might be deduced. Some papers pertaining: to the subject are

Prasad and Herrmann (1969), Wu (1973), and Barston (1974).

2.2.2 Pontryagin's Maximum Principle

There are many textbooks which derive, explain, and give examples

for the maximum principle. The original (Pontryagin et al., 1962)

requires a knowledge of functional analysis. A condensed form of this

same material is available in Rozonoer (1959). Denn (1969) provides

another point of view in which the principle is derived from Green's

functions. In this manner, the sensitivity to variations is readily

observed. To understand Denn's treatment requires only a knowledge of

the solution of differential equations.

Shortly after Pontryagin's book was published, many papers

devoted to the theoretical aspects of the maximum principle were pub-

lished. Some of the more readable ones are Kopp (1962, 1963), Roxin

(1963), and Halkin (1963). Another early paper (Breakwell, 1959)

appears to be a completely independent derivation of the maximum prin-

ciple. Although quite general in the mathematical sense, the examples

presented are trajectory optimization problems and not a general type

of mathematical problem. This may be an explanation for what seems to

be a lack of recognition for a significant achievement.

The application of PMP to problems in structural optimization

is relatively recent. When the method is used, one of two difficulties

is often encountered. The first is errors in the formulation of the

optimal control problem; the second is that once a well-posed, non-

linear two-point boundary value problem (TPBVP) is obtained, it is dif-

ficult to solve. An example of the first is provided by Dixon (1968)--

the correction was given in Boykin and Sierakowski (1972). De Silva

(1972) provides a clear presentation on the application of PTIP to a

specific problem, but includes no data because a solution could not be

obtained. Despite the failure to determine the solution, this paper

is worthwhile for its lucid discussion of the PMP application. Another

paper that gives a good specific application of PMP is Maday (1973).

Although much analysis is presented very little is said regarding the

solution techniques.

All of the above references are applicable only to systems

that are described by ordinary differential equations, in contrast with

the calculus of variations which also handles problems described by

partial differential equations. Since many of the problems of mathe-

matical physics involve partial differential equations, an extension of

PMP to include this class of problems is the next logical development.

Some work has already been done, for example, Barnes (1971) and

Komkov (1972). A survey of these "distributed parameter systems"--

see Section 1.0--is presented in Wang (1968).

2.2.3 Method of Steepest Ascent/Descent

This method is frequently cited in the literature for trajec-

tory optimization, and occasionally in references related to optimal

structures. When the method of quasilinearization converged for the

dissertation example problems, there was no need to investigate other

methods such as the method of steepest ascent. Consequently, little is

said about it. According to the references, it is applied in a straight-

forward manner. Furthermore, the example problem solutions presented

seem to be real problems and not academically simple. The following

four papers treat the method in general with trajectory optimization

applications: Bryson and Denham (1962, 1964), Bryson et al. (1963), and

Hanson (1968). In Haug et al. (1969), the method of steepest ascent is

derived in detail, completely discussed, and compared to the maximum

principle. Several structural optimization problems are then solved

by the method of steepest ascent. Although no exciting results are

obtained, use of the method is clearly illustrated by the applications

to realistic structural problems.

2.2.4 Transition Matrix:
Aeroelasticity Problems

For the past few years a group at Stanford University has studied

the optimization of structures subject to dynamic or aerodynamic con-

straints. The general problem of their interest is that of minimizing

the weight of a given structure for specified eigenvalue, subject to

inequality constraints on control.

Three types of solution techniques are used once the necessary

conditions for minimum weight are determined. Exact solutions are

obtained for most of the problems because they are so simple that

analytical methods are applicable. More complicated problems are

solved by a "transition matrix" method described in Bryson and Ho (1969).

On the basis of convergence difficulties reported in the references,

this method should be used with caution. Results have been obtained

only for very simple problems. However, these results are corroborated

by data obtained from a discrete approximation method. Five papers

that are representative of this work are McIntosh and Eastep (1968),

Ashley and McIntosh (1968), McIntosh et al. (1969), Ashley et al. (1970),

and Weisshaar (1970).

2.2.5 Miscellaneous Methods

The preceding sections have briefly outlined the methods of

structural optimization most frequently encountered in the literature.

Appropriate, representative references have also been given. Not all

methods are listed; while some are omitted for not being generally use-

ful, others are omitted for not being generally used. Two examples of

the latter are the "modified quasilinearization" and "sequential gradient-

restoration" algorithms described in Miele et al. (1972) and in Hennig

and Miele (1972). At some later time these methods may be acknowledged

as major methods that are applicable to many different or important prob-

lems, but for now they are mentioned only in passing.

2.3 Methods: Discrete Systems

Discrete systems are described by a set of discrete constants

instead of the set of functions associated with continuous systems.

The classic example of a discrete system is a pin-connected truss, where

the state x. and control u. are the stress and cross-sectional area,
1 1
respectively, for each member i. A discrete system also arises in the

approximation of a continuous system.

Several references that present a good discussion of general

methods applied to discrete systems are available. Most of these exist

in the form of an edited collection of papers by various authors on the

topics of their acknowledged expertise. Four such publications are

Gellatly (1970), Gellatly and Berke (1971), Pope and Schmidt (1971),

and Gallagher and Zienkiewicz (1973). Another report, Melosh and Luik

(1967), provides a good exposition of the difficulties associated with

the analysis portion of least weight structural design. It also con-

tains a brief comparison of various mathematical programming methods.

McNeill (1971) is the last reference to be cited in the section

on general methods for the optimization of discrete systems. Minimum

weight design of general structures is treated in a mathematically

precise formulation. Legendre's necessary condition is combined with

the concepts of convex functions and sets to derive the necessary and

sufficient conditions for global optimality. Fully stressed designs

and constraints to eigenvalues are also discussed. In summary, this

paper provides a good example of the general mathematical problem that

must be solved in the optimization of discrete systems.

While certain variational methods may be applied to discrete

systems, the most frequently used technique is mathematical programming.

In the following sections, this method and other major methods are dis-

cussed and representative references cited.

2.3.0 Mathematical Programming

The general method of mathematical programming is discussed

in section 3.2 of the dissertation, and the solution of an example

problem using this method is detailed in Chapter VL. In the literature

related to this subject, a very readable textbook is available--Fox

(1971). This book complements the theory with numerous discussions

pertaining to numerical techniques and methods that can be employed to

overcome certain difficulties that may arise. Although it does contain

flowcharts of several algorithms, there are few specific examples given.

For a discussion of the general theory, two alternatives to this book

exist in the form of papers: Schmidt (1966, 1968). The first is

written in a conversational style, contains no mathematics, and is

intended to provide only a general description of the subject. The

latter paper is theoretical in content.

An excellent application to a realistic problem is to be found

in Stroud et al. (1971). Ths paper contains little discussion of the

method itself, but does demonstrate an application that allows a concep-

tual visualization of the solution. The approach is to assume the

solution to be a linear combination of specified functions, and to

choose the weighting coefficients to minimize the cost. Mathematical

programming is employed to determine the optimal set of coefficients.

This approach resembles Galerkin's method, and though not mathemat-

ically rigorous, it may provide a useful approximation to large,

unwieldy problems.

2.3.1 Discrete Solution Approximations

In the previous section a paper is cited that contains an

approximate solution obtained by Galerkin's method. The use of the

Galerkin or Rayleigh-Ritz approximate solution techniques is suffi-

ciently widespread to be considered a general method. For both methods,

the solution is assumed to be a linear combination of the solution to

the linear part of the governing equation and a set of prescribed func-

tions. This approximate solution does not satisfy the given equation

exactly but produces some residual function. A cost function that

depends upon the residual is then minimized with respect to the unknown

coefficients. The two weighted residual methods mentioned above have

different cost functions, but the methods are identical for linear

equations--see Cunningham (1958, p. 158).

The advantage to using these methods is that after assuming

the particular form of solution, the problem of solving for the weight-

ing coefficients may be much simpler than the original problem. In the

case of Stroud et al. (1971), the coefficients were obtained by mathe-

matical programming techniques. However, the weakness of the method

is the restricted function space of possible solutions. With the coef-

ficents obtained by these methods the resulting solution is the best

approximation that is possible from the set of solution functions

prescribed. There is no guarantee that the approximation even resembles

the true solution.

The flutter of a panel is solved using Galerkin's method in

Plaut (1971a). No general developments are presented and the assumed

solution functions are trivially simple. However, this paper does

provide an application of the method to obtain an approximate solution

to a very difficult optimization problem involving the stability of

a nonconservative system. A similar problem is treated in a more theo-

retical manner in Plaut (1971b) using a modified Rayleigh-Ritz method.

"Segmentwise-constant" control functions are assumed also; this partic-

ular approximation is discussed with more detail in the following sec-

tion. Additional nonconservative problems are treated in Leipholz

(1972), applying Galerkin's approximate solution to the energy method.

Several simple examples are included.

Nonconservative elastic stability problems of elastic continue

are treated in Prasad and Herrmann (1969) using adjoint systems. This

approach is more realistic than the segmentwise-constant control assump-

tion described in the following section. Solutions for the state and

adjoint system are assumed, such that approximation process resembles

the Rayleigh-Ritz method. However, only a single type of nonconserva-

tive system is considered. Extension to several other types of noncon-

servative elastic continue problems is given in Dubey (1970). Varia-

tional equations corresponding to both the Galerkin and Rayleigh-Ritz

methods are derived. Furthermore, the condition for equivalence of the

two methods is shown to be that the admissible velocity field must

satisfy a natural boundary condition over that portion of the body's

surface where tractions are prescribed.

2.3.2 Segmentwise-Constant Approximations

The definitive characteristic of this method is approximating

the structural system by a number of discrete segments, where within

each segment the control function has a constant value. In general,

the constant value of the control differs from segment to segment. For

the many papers on this method that have been published, the procedure

is the same. An optimality condition (necessary in all cases but also

sufficient in some) or cost functional is derived for the continuous

system. After defining the segmentwise-constant approximation, the

condition or functional is reformulated in terms of the discrete param-

eters. Most of the papers use so few elements that solving for the

discrete values of the control parameter poses no difficulties. Although

this method does simplify the mathematical problem to be solved, the

crudeness of the approximation is not appealing. Five papers which

treat a variety of problems using this approximation are cited below.

Minimum weight of sandwich structures subject to static loads

is discussed in Sheu and Prager (1968a). In Sheu (1968) the same type

of structure is considered. It differs from the first problem by

requiring point masses to be supported such that the total structure has

a prescribed fundamental frequency of free vibration. Icerman (1969)

treats the problem of elastic structures subject to a concentrated load

of harmonically varying amplitude. The minimum weight design is obtained

subject to a compliance constraint related to the applied load, and

which is effectively a boundary condition on displacement at the point

of application. A truss problem is also included.

The concept of a compliance constraint is pursued further in

Chern and Prager (1970). The minimum weight design for sandwich con-

struction beams under alternative loads is found, subject to this type

of constraint. The paper uses up to eight segments, thereby obtaining

a more realistic approximation to the continuous problem. Minimum weight

design of elastic structures subject to body forces and a prescribed

deflection is discussed in Chern (1971a). This investigation is no-

table in that it considers applied loads that are functions of the design


2.3.3 Complex Structures with
Frequency Constraints

On the basis of useful application, perhaps the most important

class of discrete structural optimization problems is the minimum weight

design of complex structures subject to natural frequency constraints.

Since most real structures are built with many structural elements of

various types, and are not realistically described by any single type,

this approach is more appropriate from the aspect of modeling the struc-

ture. Furthermore, many structures must be designed to avoid certain

natural frequencies because of resonance or self-induced oscillations;

this situation indicates that the natural frequency constraint is also


Many different solution schemes have been developed which are

usually based upon general mathematical programming techniques.

Typically, a design is iteratively altered to minimize the weight with

a subsequent increase in frequency until a constraint is violated.

At that point the design process uses an iteration which simultaneously

reduces both weight and frequency. These two processes are repeated

sequentially until no further weight reduction is possible.

Although circumstances may require the use of many elements,

the number of them may itself be a critical factor. Some of the schemes

require a matrix inversion as part of the eigenvalue problem solution

associated with the frequency constraint. If the number of elements

becomes too large, the size of the matrix to be inverted likewise be-

comes excessively large. When that occurs the matrix inversion can

require excessive amounts of computer time. Another possible difficulty

is that the inverse matrix itself is not sufficiently accurate, such

that the subsequent calculations are not acceptable. However, for

structures such as reinforced shells composed of different types of

structural elements, this method may be the most applicable.

Many papers have been published pertaining to this class of

structural optimization problem. Because the method is inherently

oriented towards applications, the references are cited in chronolog-

ical order without additional comments. Interested readers are referred

to: Turner (1967), Zarghamee (1968), Turner (1969), De Silva (1969),

Rubin (1970), Fox and Kapoor (1970), McCart et al. (1970), and Rudisill

and Bhatia (1971).

2.3.4 Finite Element Approximations

There is an unfortunate ambiguity to the label "finite elements"

that occurs because these words are used to describe two completely

different entities. In papers cited in the preceding section they are

used to indicate the discrete structural elements of finite dimensions

which comprise the complex structure. The analysis of such systems of

structural elements has been accomplished by ordinary matrix methods

during the last three decades. However, during the past decade another

method has been developed and named "the finite element method."

In this method a continuum is divided into small, finite ele-

ments over which a particular form of approximation of either the dis-

placement and/or force is assumed. A number of nodes common to one or

more element is prescribed; continuity is required to exist at these

nodes but not necessarily elsewhere. An equilibrium equation is derived

for each element, and then all of the individual equations are combined

into a single equilibrium equation for the entire system. lihc result-

ing equation is a linear algebraic equation whose unknowns are displace-

ments and/or forces at the nodes. Once the matrix equation is inverted,

the nodal displacements and/or forces are used with the assumed approx-

imation form to describe the state of the structure throughout each and

every element, and hence the system. Hereafter this method is referred

to as the "finite element method."

The most frequent application of the finite element method is

to problems having complicated loads, geometry, and response. Generally

speaking, the method is employed wherever the physical system is too

complex to be described adequately by a single differential equation

and boundary conditions. For a complete theoretical development of the

finite element method and numerous examples, see Zienkiewicz (1971).

With respect to structural optimization the method is employed

to simplify the problem to be solved. Very little has been published

on this subject, but the papers available cover a wide spectrum of tech-

niques. For example, Dupuis (1971) combines the finite element and

segmentwise-constant methods as applied to minimum weight beam design.

A similar application to column buckling is contained in Simitses et al.

(1973). Another paper, Wu (1973), is a study of two classical noncon-

servative stability problems. Although adapted to stability consider-

ations, this presentation is the best exposition available in the open


In Chapter VI a minimum deflection beam problem is solved with

the combined methods of finite elements and mathematical programming.

2.4 Closure

In the preceding sections of this chapter, general problem

types and methods are discussed. Only those methods that appear to

have attained some standard of acceptance are presented. It must be

acknowledged that other areas of important study exist but are perhaps

overlooked as not being pertinent to the general subject area of the

dissertation. As an example, Dorn et al. (1964) treats the optimal

layout of trusses--an important subject but not related to the general

problem to be considered in this dissertation. In addition only

elastic structures have been considered although there are numerous

publications on optimal design of inelastic structures. References

that are representative of this subject are: Drucker and Shield

(1957a, 1957b), Hu and Shield (1961), Shield (1963), Prager and Shield

(1967), and Mayeda and Prager (1967).

On considering the various references mentioned above it would

appear that there are two possible pitfalls in structural optimization

that should be avoided. The first is the confusing of method of opti-

mization with the solution techniques employed to obtain a solution to

the resulting TPBVP. In order to avoid possible errors the two should

be dealt with independently, unless it is clearly advantageous to

relate one to the other. Besides this it must be recognized that any

solution obtained is "optimal" only with respect to the given condi-

tions of the particular problem. Any change in the problem statement

invalidates the applicability of that solution. The change may lead to

a more desirable solution, but the original solution is no less valid.

Simitses (1973) is an example where this situation is not acknowledged.

In this paper the thickness of a thin reinforced circular plate of spec-

ified weight and diameter is determined such that the average deflec-

tion due to a uniform load is minimized. An earlier paper which did

not include stiffening is cited with the implication that the optimum

solution for the unstiffened plate is not correct. The point made above

is that both of these solutions are optimum under the respective condi-

tions of the two problems. Neither solution is more, or less, valid

than the other.



3.0 Introduction

This chapter contains the development of two distinct methods

used in the theory of optimal processes, into a more general method.

The first section defines precisely the problem to be considered.

This includes the necessary conditions for an optimal solution given by

the calculus of variations. Several mathematical programming techniques

are described in the second section along with a numerical algorithm

called the gradient projection method. The application of this numer-

ical method to the solution of the necessary conditions from Pontryagin's

Maximum Principle (PMP) is detailed in Section 3.3. Results of this

approach are shown to be consistent with the necessary conditions, given

in Section 3.1; these results provide a clarifying insight to the math-

ematical processes entailed in the maximum principle, and an explicit

formulation for the Lagrangian multiplier functions. This explicit for-

mulation is used in the next section to show the necessary conditions

may then be regarded as an algorithm. The final section contains a brief

summary of solution methods.

The main theoretical development of the dissertation is contained

in the first three sections. It is well known that the problems encoun-

tered in the calculus of variations are equivalent to the optimization of

a functional (in the sense of mathematical programming problems) under

certain restrictions upon the variations. A good exposition of this is

available in Luenberger (1969). With this equivalence in mind, it is

noted that the PMP is itself worded as a constrained optimization prob-

lem. When treated with what is normally regarded as a numerical method,

the gradient projection method, an explicit formulation of the atten-

dant Lagrangian multipliers is obtained. This form satisfies all of the

calculus of variation necessary conditions and allows one to use them

in a most straightforward fashion. As a result, these necessary condi-

tions may be directly used in the form of an algorithm to obtain a solu-

tion. Furthermore, it is believed that treating the PMP as a mathemat-

ical programming problem in conjunction with the gradient projection

method helps to explain the effect of combined control-state constraints

upon the maximum principle.

3.1 Problem Statement and
Necessary Conditions

A general problem which represents a large class of structural

optimization problems is treated in the sequel. The functional

J = / L0(x,u) dt (3.1.1)

is to be minimized with respect to the control u(t) where the state

x(t) must satisfy certain boundary conditions and a differential con-

straint; in addition, an inequality constraint involving both the state

and control must be satisfied. For

uT(t) = [u (t) u2(t) ... u (t)] (3.1.2)
-- 1 2 m

x (t) = [xl(t) x2(t) ... x (t)] (3.1.3)

the subsidiary conditions to minimizing the cost function J are:

x = f(x,u) (3.1.4)

Specified Boundary Conditions on x(t) (3.1.5)

(x,u) < 0 = l,...,q (3.1.6)

Terminal time tF is considered to be constant; allowing it to be

unspecified requires only a slight modification to the following


This problem is a particular form of a very general one treated

by Hestenes (1966). His results are a set of necessary conditions which

must be satisfied by the optimal solution and include the maximum prin-

ciple. To obtain the necessary conditions, the inequality constraints

are converted to equality constraints in the manner of Valentine (1937).

These constraints and the differential constraints are then adjoined to

the cost function via Lagrangian multiplier functions ,j(t) and Pi(t)


(x,u) + s2(t) = 0

where the slack variables s (t) are defined such that

The symbol denotes "is defined by."
The symbol "= denotes "is defined by."

A 1 ]
s(t) = [-(P(x,u)] 0 0

tF t
J = J Lo(x,.u)dt + p (t) [x-f(x,u)] dt +
0 0
+ f P,(t) [PQ(x,u) + S2(t)] dt

Implied summation convention is used whenever a vector formulation leads

to possible ambiguities in later developments. Integrating the second

integral by parts gives a result that leads to the variational Hamiltonian.

t t
=T. TF dF
J = P- +/ [Lo-p- + -x 2] dt
0 0

H(x,u,2) = L (x,u) p (t) f(x,u) (3.1.7)

the variational Hamiltonian, and H which will include terms arising

from the inequality constraint.

H =- H- H-


H = (t) f(x,u) Lo(x,u) Zi(t) pk(x,u)


J = T x F [H* + x pIs ] dt
0 0

With the e'.c:eption of the maximum principle, all of Hestenes' necessary

conditions are obtained from the requirement that the first variation

of the cost function vanish. In the following, "6x" designates


"the variation of x"; a subscript vector designates the partial deriva-

tive with respect to that vector, with the result itself a column vector.


tF t
6J = p 6x f
0 0

[x (H* ) +

T *
+ 6u H 2s6s] dt = 0
u k z k

To derive the PMP requires an extensive mathematical development and is

not included since it contributes nothing to the present discussion.

However, the necessary conditions are listed in order to be available

for later reference.

x = H = f(x,u)


0 = Px t
1 1

0 = p.6x. t

Specified Boundary

= 0
- i

= t o

Conditions on x(

0 =H

0 = uz(t) p((x,u) P(t) ( 0

H(x PT' ~PT' ) H(OpT' u, e)

The optimal solution must satisfy these six conditions together with the

inequality constraint (3.1.6).

The PMP states that along the optimal trajectory, each instant

of time t, state XOPT(t) and adjoint state p(t), treated as fixed, the

optimal control UPT(t) is that admissible control which minimizes the

variational Hamiltonian. In the present context, admissibility requires

that u(t) be piecewise continuous, the set of admissible controls being

denoted by Q. Hence the PMP indicates that

u (t) = ARGMIN [H(x O, u p)] (3.1.8)

Notice that the necessary conditions suggest nothing about how

a solution is obtained, but merely indicate certain functional relation-

ships that must be satisfied. However, equation (3.1.8) seems to inti-

mate that solution of the necessary condition of P!1P involves a mathe-

matical programming problem.

3.2 Mathematical Programming:
Gradient Projection Method

Having shown that the PMP from the calculus of variations

approach to an optimization problem may perhaps be related to a mathe-

matical programming problem, the latter will be discussed in general

terms. Consider a nonlinearly constrained optimization problem

-xo = ARGECMIN [F(x)]

subject to

gj(x) 0 j = l,...,m

where Q denotes the set of admissible state components x., i = l,...,n,

and to be admissible requires only the satisfaction of the m inequalities.

Necessary conditions which xOPT must satisfy are given in the Kuhn-Tucker


(i) constraints are satisfied gj (xPT) < 0

(ii) multipliers exist such that X. 0

and for all j = l,...,m AXg (xPT) = 0
(iii) and VF(x T)+ E V g(x ) =0
Sj=l -

Observe that if I denotes the set of indices associated with active

constraints, the first two conditions may be written as

j e IA g.(x) = 0 and A. > 0

j IA gj.(x) < 0 and X. = 0
A J- J

Fox (1971, pp. 168-176) presents a very readable proof of this theorem;

a more mathematical proof using vector space concepts is available in

Luenberger (1969).

Many methods for obtaining a numerical solution to the nonlinear

programming problem described by the first two equations of this section

have been developed. The gradient projection method by Rosen (1960) is

used frequently in structural optimization. Basic to the method is the

orthogonal projection of the cost function gradient onto a subspace

defined by the normal vectors of the active constraints. An inherent

part of the algorithm is the concept of a "feasible," "usable" direction.

Any direction d is feasible if an increment x in that direction improves

the cost function, i.e., decreases F(x). Direction d is said to be

usable if it also satisfies the constraints. As long as a feasible,

usable direction exists, the cost function may be improved. A constrained

optimal solution xPT occurs at that point where no feasible direction

is also usable, i.e., any attempt to improve the cost violates a con-

straint. In Appendix B these concepts are used in a concise proof of the

Kuhn-Tucker conditions.

Fox (1971) derives the matrix P which projects the cost function

gradient into the subspace defined by vectors normal to the active con-

straints. This is equivalent to subtracting all components parallel to

vectors that are normal to surfaces of active constraints from the nega-

tive gradient of the cost function. Recalling the definition of set IA,

consider r constraints to be active such that

IA = {al ,a2... ,}

Define a vector whose elements are the corresponding nonzero

Lagrangian multipliers, and another vector whose elements are the active

constraint functions

A1 = [\ \ ... \ ]
1 C2 r
N [g g, ... g]
1 2 r

From the N vector, a matrix N is introduced, each column of xhiCh is the

gradient of an active constraint. Hence, N is an (nx r) matrix where

N N = [N..] i = 1,...,n

and (3.2.1)

N = = j = 1,. ,r
ij 1x. x.
1 1

With these definitions of A and N, the third Kuhn-Tucker condition can

be written as

V F(x) + N X = 0

At any feasible point x where g (x) : 0, the direction which best

improves the cost function is the negative gradient of the cost. If

those directions which lead to constraint violations are subtracted from

-V F(x), the projection matrix P is obtained. Directions causing a

constraint violation are specified by the gradients of active constraints,
i.e., the columns of N What is required of S, the projection of the

gradient, is that

S = (-V F(x)) Nx A (3.2.2)
x x

where A are scalar coefficients to be determined such that S is ortho-

gonal to each column of N or

(N)T S=0
(N )r S = 0

When the matrix equivalent to N is used together with the S expres-

sion (3.2.2), the result is

N (-V F(x) N X) = 0

such that the X which satisfies this orthogonality condition is:

A = (NTN)-1 NT(V F(x)) (3.2.3)

Unless the active boundary surface normals V xg(x) are linearly depen-

dent, the matrix (N N) is nonsingular. Conversely, if this matrix is

singular the active constraints are not linearly independent; however,

this is not a condition encountered in most real cases.

Substitution of the X expression into the S equation leads

directly to the projection matrix P:

S= -P V F(x)
X -


P = I N(NTN)-1 NT (3.2.4)

where I is the identity matrix. The direction S which best improves

the cost is given in terms of P, where P and N are given by (3.2.4)

and (3.2.1). If no constraints are active at a point x, then N is a

null matrix, P reduces to an identity matrix, and the direction of best

improvement is coincident with the direction of steepest descent.

In the algorithm associated with this method the starting point

must be a feasible point where g.(x) : 0 for all j = l,...,m. The

design then proceeds in the S direction until the solution is satisfied

to within a specified position tolerance E. Necessary conditions

generally programmed in a computer program are:

Is. i i = 1,...,n

A. > 0 j A

A. = 0 j I

It is readily seen that for S, P, and A defined as above, these are

completely equivalent to the Kuhn-Tucker conditions.

3.3 Gradient Projection Method Applied
to the Maximum Principle

Based upon the preceding discussion, the similarity between PMP

and the-mathematical programming problem can be discussed. The maximum

principle states that the optimal control uOPT minimizes the variational

Hamiltonian with respect to all admissible u. Or, at each time

0 t t tF, uPT minimizes H(x,u,p) with respect to u for given x and p

and where OPT is subject to constraints 4 (x,u) < 0, k = l,...,q.

Treating this as a mathematical programming problem, the following

correspondences are recognized

x U

F(x) % H(x,u,p)

gj(x) % (x,u)

j v<(t)

V F(x) 'b H
x- u

S H plus constraints

Continuing to identify corresponding quantities, at each time t, let

IA denote the set of active constraints, taken to be r in number.

IA = {al, 02' ...' r }

Then NT T = [ U ..1. ** ]

A' = [) (t) PQ (t) ... (t)0
1 2 r

N T H j] = Jj = l,...,n
3 j = 1,... ,m


Furthermore, define (H ) to be the gradient of H with respect to u

where all components that cause a constraint violation have been removed.

Since projection matrix P removes cost function gradient components that

lead to constraint violations, consider its use in the maximum principle.

With the correspondent to N identified as then

P = I Tu (3.3.1)
u -- u

(Hu) = PHu (3.3.2)

From the Kuhn-Tucker conditions, this implies that along the

optimal trajectory (t,XOPT, u OPT)

= (Hu)p = 0 (3.3.3)

Similarly, at each time

g(t) (- T )- T H (3.3.4)
T-- T u
u u -

from which it follows

(. T )) + H T Hu = 0
u u -

u + Hu = 0

(H + _To)u = 0 (3.3.5)

( -H ) 0

Or, H = 0 (3.3.6)

Hence the control law from Hestenes' necessary conditions can be derived

from the PMP condition by treating it as a nonlinearly constrained math-

ematical programming problem. While using the gradient projection method

in the derivation, it is seen that equation (3.3.5) is equivalent to the

third Kuhn-Tucker condition. The second Kuhn-Tucker condition is iden-

tical to Hestenes' necessary condition on the Lagrangian multipliers used

to adjoin the inequality constraints to the cost function. Satisfaction

of the inequality is implied by requiring the first Kuhn-Tucker condition

to be fulfilled, where

X. > 0 i(t) > 0 (3.3.7)

Sj g(x) =0 -+ (t)>(x,u) = 0 (3.3.8)

g.(x) W 0 z 4 (x,u) : 0 (3.3.9)

Thus by treating the solution of the necessary conditions of the max-

imum principle as a programming problem with inequality constraints,

using the gradient projection matrix, and by requiring satisfaction of

the Kuhn-Tucker conditions, an explicit formula for Hestenes'

Lagrangian multiplier functions has been derived. It is further demon-

strated that with the u (t) so defined satisfaction of the extremum con-

trol law condition is implied. However, before this treatment can be

accepted as valid, it must also be shown that the system of canonical

differential equations is unchanged.

Consider the state system equations

x = H = f(x,u)


H = T (t)f(x,u) L (xu) (t) (x,u)

It is obvious that the explicit form of p(t) has absolutely no effect

upon the state system equation expressed in canonical form.

Demonstrating that the adjoint system equation is unchanged

requires the method described by Bryson et al. (1964). Consider the

general problem of Section 3.1 again, but with only the differential

constraints adjoined to the cost function, i.e.,
tF t
Min {J = px + f (H xT)dt} (3.3.10)
u 0 0

subject to: pk(x,u) < 0 9 = l,...,q (3.1.6)

where H(x,u,p) = L (x,u) T (t)f(x,u) (3.1.7)

and x = f(x,u) (3.1.4)

Again let IA denote the set of indices associated with r active

constraints at any time t

IA = {al a, ...,ar

4 (
k(x,u) = 0 E IA A-> v(t) > 0

and p is defined as before

-1 a 2 r

The problem can then be thought of as minimizing (3.3.10) subject to

p(x,u) = 0

While on the constraint surfaces defined by this equation the variations

in control 6u(t) and state 6x(t) are not independent but instead are

related through the subsidiary requirement that

6O(x,u) = 0


Sx 4 (x,u) + 6uT T(x,u) = 0 (3.3.11)
--- -- __U -

This imposes a restriction to the admissible variations. For cost func-

tion (3.3.10) to be a minimum, it is necessary that its first variation

vanish, i.e.,

tF tI
6J = 6x + f 6xT H + 6u H 6xT] dt = 0 (3.3.12)
-0 0 u

It has already been shown that

H = (H + ) = 0
u u

which will be used to advantage shortly, after having added and sub-
tracted the term 6u (1u 4T) from the integrand of (3.3.12).

+ 6UT RT1 6uT(u
t t
T F ^F
0=E6x + f 6xT(H uTH +

+ uT ~( )u ~T (T4) )udt

Rearranging terms gives
tF tF
0 =px +f pxT(H i) +
0 0

T T T T).dt
+ 5u (H + (j 4i)u) cu (ii) dt


T T *
(H + (iT) = (H + I )u =(-H) =0
(Ti )u =


0 = T tF + F 6x T(H ) u dt
0 0x -

It is here that the restrictions imposed by the active constraints are

applied; from (3.3.11)

Su t = 6x x

such that

T~6x + F _xT(Hx- ) + x _A dt
0 0

t t
0 = x + f/ 6x (H p + ) dt
+f x x-
0 0 -

Applying Euler's lemma, for arbitrary variations in the state which

satisfies the constraints,

(H- + P) =

which by the following manipulations is shown to be the adjoint system

equation of Hestenes.

i = + (T)x

(-H- _Tx


Thus, the explicit formulation for pL(t) obtained by applying the gradient

projection method to the PMP satisfies all the necessary conditions of


It may happen that in some cases the constraint upon control

does not depend upon the state. It can be shown that the y(t) explicit

formulation is equally valid in this instance. By examination of

equations (3.3.1) through (3.3.9) it can be verified that all the neces-

sary conditions except the adjoint system equation are satisfied. To

demonstrate the latter, recall that when on a constraint boundary the

first variation of both the cost functional and the constraint function

must vanish. That is, for

K(u) = 0


J = 0


65 = uT 6 T = 0 (3.3.13)

To derive the desired equivalence, the same term must be added and

subtracted from the integrand of 6J as before, again arriving at the

t t
T I / T T T
0 = p 6x, + f Fx (H P) 6Tu I dt
0 0 -0

When the constraint variation (3.3.13) is introduced into this last

equation, then by Euler's lemma

(Hx p) = 0

Since it was stipulated that (u) is not a function of x, the equation

may be written

(H +1 4)_ = 0

S-H )-


Thus, the expression for I(t) is valid when the constraint inequality

depends only upon the control u(t).

3.4 Maximum Principle Algorithm

In the introduction to this chapter it was stated that the

Lagrange type problem from the calculus of variations is equivalent to

an ordinary mathematical programming problem based on the Kuhn-Tucker

conditions. Furthermore, when inequality constraints are present the

necessary conditions are equivalent to the Kuhn-Tucker conditions.

It was demonstrated in the preceding section that if the PMP is itself

treated as a mathematical programming problem, application of the

gradient projection method provides an explicit solution for the

Lagrangian multipliers associated with active constraints. This explicit

solution for P(t) also satisfies all of the other necessary conditions

for an optimal solution. The ability to determine i (t) explicitly

in terms of parameters and functions that describe the problem suggests

the possibility of converting the necessary conditions of an optimal

solution into an algorithm for obtaining it.

Ensuing discussion of the algorithmic form of the necessary

conditions contains the implicit assumption that all equations are valid

along the optimal trajectory. It is further assumed that the problem

under consideration is that one described in equations (3.1.1 3.1.6).

The algorithm requires that x(t) and p(t) be known at each time

0 ( t < tF for which the solution procedure is as follows.

(i) Use PMP on the variational Hamiltonian to determine an optimal

control u (t) independent of constraints.

u (t) = ARGMIN [H(x,u,p)]

Evaluating the inequality constraints with u = u reveals which of

the k = l,...,q constraints are active. Let r denote the number

of active constraints and IA the set of indices associated with


IA = {a a2, ., a }

9 (x,u) = 0 C IA

P (x,u) < 0 I1 IA

From this the vectors whose elements are the nonzero Lagrangian

multipliers and corresponding constraint functions are defined,

respectively, at the instant of time t.

PT(t) = [u J ... ]
a 1 42 r

T (x'u) = [ a1 a2 a' a I

(ii) Having identified which of the q constraints are active, r

components of OPT are specified by _(x,u) = 0. They may be

solved by using the implicit function theorem, which requires

T to be of rank r. This in turn requires the r constraints

which are active at point x(t) to be linearly independent.

To determine the remaining (m- r) components of uOPT requires

that be known at time t, but

(t) =- ( T )-1 T Hu
T -- T u
u U -

This value of is used to determine the "constrained" Hamil-


H = (H + P )
(iii) With the nonzero Lagrangian multipliers V known and H conse-

quently defined, the remaining (m-r) unknown components of

OPT are determined from the control law for the constrained

system, i.e,, from

H =0

Once uOPT is completely known, the adjoint system equations

are determined by


The process outlined above then allows uOPT to be written as

u = ARGMIN [H(x Tu,)

since the u obtained in this fashion satisfies (x,u) < 0 which is the

only requirement for being admissible. However it must be recalled

that these equations are valid along the optimal trajectory; it remains

to be shown that this algorithm may be employed in some manner to obtain

that optimal trajectory and to demonstrate their satisfaction along it.

3.5 Solution Methods

Necessary conditions from the calculus of variations provides

a Two Point Boundary Value Problem (TPBVP) to be solved, which is in

general, nonlinear. For all but the most simple problems no analytical

solution is possible and if any solution is to be obtained a computer

must be used with some numerical method. A discussion of the available

methods and their relative advantages/disadvantages is not included here

due to the availability of such discussions in the literature, e.g.,

Bullock (1966). All of the methods involve some iterative scheme, and

for optimal control can be separated into two general categories.

(i) Indirect methods. Schemes which require an initial guess of

the state's solution: In these methods the starting point is

an initial guess of the time history of the solution. The con-

trol associated with the solution is a subsequent calculation.

Iteration continues until the state satisfies some criterion

connoting convergence; the final control history at conver-

gence is the optimal control.

(ii) Direct methods. Schemes which require an initial guess of the

control function: The starting point for these methods is an

initial guess of the control time history. For this class of

methods the state associated with the control is a subsequent

calculation. Iteration continues until the control satisfies

some convergence criterion.

The method of quasilinearization was selected, based upon the

success of Boykin and Sierakowski (1972) in applying it to a constrained

structural optimization problem. Excellent convergence for their

problem, the capability to handle nonlinear systems, and the avail-

ability as an IBM SHARE program, ABS QUAS1, dictated its selection.

In the application to the examples in Chapters IV and V the program

required no modification. As a result, a detailed discussion of the

method of quasilinearization is not included.

The problem discussed in preceding sections of this chapter

falls into the general class of problems that QUAS1 handles, that is,

Y = (Y,t)

with the boundary condition of the form

B Y(O) + B Y(t ) + C = 0
k- r F -Q

where tF, square matrices B and Br, and vector C are specified,

constant quantities. The specific form of B B and C depend upon the
2 r o-Q
given boundary conditions. As described in algorithm form

x = f(x-, oPT(xP)) = Gl(x,E)

P = -H (x P (x,)) = G (x,p)
x -OPT -2

In terms of the general QUASI nomenclature,

= g(Y, t) = _
1G2 ('-)

Boundary conditions are determined by those specified for the original

system and by the necessary conditions outlined in the first section

of this chapter.

In Chapter VI the problem treated by Boykin and Sierakowski (1972)

is solved by the gradient projection method applied to a finite element


formulation for the description of the structural system. This is

a method of the second kind mentioned above. Results of the two methods

are compared.



4.0 Introduction

A structural optimization problem has been selected for its

simplicity and stated as an optimal control problem. The maximum prin-

ciple is applied, giving a nonlinear TPBVP of the Mayer type. Among the

earliest expository papers on the maximum principle, Rozonodr (1959) gives

an excellent treatment to a similar type of problem; his technique is

used to obtain both the variational Hamiltonian and adjoint variable

boundary conditions. It is shown that no finite solution exists for the

situation of unconstrained control. Numerical solutions for constrained

control are obtained by the method of quasilinearization. Constraints

include both geometric limitations to control as well as maximum stress

limits that become mixed constraints depending upon both state and con-

trol variables.

4.1 Problem Statement

A cantilever beam of variable rectangular cross section is to be

designed for minimum tip deflection due solely to its own weight. The

material is specified to the extent that the modulus E and density p are

constants. Length L is specified but the design variables, height h(x)

and width w(x), may be chosen independently of each other, subject to

hard constraints upon the allowable dimensions. That is

a < w(x) < c
b < h(x) d d

If Y(x) denotes the deflection of the centerline, the problem is:

given E, p, L, and the constraints, find h(x) and w(x) to minimize

Y(L). The particular form of differential constraints to be satisfied

will be derived in the next section.

4.2 Structural System

Small deflections are assumed in order to use linear Bernoulli-

Euler bending theory. Basic conventions assumed for this example are

depicted in Figure 4.1; with these conventions the governing equation

is derived using standard strength of materials considerations. The

result is

EIB(x)Y"(x) = M(x) (4.2.1)

MB(x) = yf (T-x)w(T) h(T)d

IB(x) = 1 w(x) h3(x)

and y = pg. Kinematic boundary conditions to be satisfied by the solu-

tion of (4.2.1) are:

Y(O) = 0

Y'(0) = 0

/ x h (x)


/ w(x)
dW = yw(x)h(x)dx

V (x)

MB (x)

Note: Y(x) is centerline deflection, positive downward.

Structural Conventions

Figure 4.1

Design variables and the related constraints (4.1.1) are put into

dimensionless form such that

v= = h + b/d < v < 1
1 d 1

v = W a/c s v2 1
2 c 2

IB(x) = 1 cd3 v 2
B 12 1 2

Replacing the independent variable with a dimensionless equivalent, and

using the control components allows the governing equation to be put

into a dimensionless form. For

t = -

1 E d2
(2)2 u = f (T-t) U1(T)u2(T)dT
yL2 t


u.(t) = vi(x(t))

':-ien constant CB is defined, the usual kinematical relationships for

a beam may be written in a simple dimensionless form; that is, let

1 E d 2 -1
C (d) (Units = Length )
B 12 yL2

x2 = CB
31 B 12

x3 = CBUU2Y

"l2 = CB 2t (u1u2Y)
u1d2 B (u1u2Y)






These state component definitions are used with the natural boundary

conditions to obtain

x3(1) = 0

x4(l) = 0

From (4.2.2), the state component definitions, and the above boundary

conditions it follows that

Xl = x2

2 = x3/uuu2

x3 = 4

x4 = UlU2

x1(0) = 0

x2(0) = 0

x3(1) = 0

x4(l) = 0

These equations and boundary conditions are used in the following section

to precisely state the problem. The solution and results are given later.

4.3 Unmodified Application of
the Maximum Principle

In terms of the state variables defined in the preceding section,

the problem can be stated with more mathematical precision. Find

Up (t) = ARGMIN [x (1)]
subject to:

subject to:

(i) differential constraints

(ii) kinematic boundary conditions

natural boundary conditions

(iii) hard geometric constraints

x = f(x,u)

xl(0) = x2(0) = 0

x3(1) = x4(l) = 0

b/d : ul(t) 5 1

a/c < u2(t) < 1

According to terminology in the calculus of variations this is a Mayer

type problem. Among the early papers concerning the PMP, Rozonoer (1959)

applies the PMP to a similar problem giving a geometric interpretation

to the function of the adjoint variables.

In Rozonoir's problem the cost is a generalization of the

ordinary Mayer problem, in the sense that the cost function is a linear

combination of the terminal state components. It can be shown via the

calculus of variations that to minimize

where c is

J = c x(tF)
-r-- F

a vector of prescribed constants, the necessary conditions are

H = T(t)f(x,u,t)

x= H = f

p = -H =

0 =H

o = p (0) x(0)

0 = (tF) [r + P(tF)

H(OPT', uOPT' p,t) H(xOPT' u, t)

That is,

Min[J] Max [H]
u u

If we consider the cantilever beam problem, the forms in the

necessary conditions are

c = [1 0 0 ]T

H = pl2 + P2X3/U 2 + 3X4 + p UU2

t =1



from which
from which

l = 0

P2 = -P1

= -P2/ulU2

P4 = -P3

pl(l) = -1

P2(1) = 0

P3(0) = 0

4(0) = 0

Adjoint variables Pl(t) and p2(t) can be integrated by inspection

Pl(t) = -1

p3(t) = (l-t)

0 < t 1

such that

H = x (l-t)x3/uu2 + P3X4 + PUU2


With this result, the necessary condition for control to minimize tip

deflection is

Hu = 3(l-t)x3/uu2 + 2 = 0

H = (l-t)x3/u3u2 + p41 = 0
u2 3 12

At first this appears to be a contradiction since the two equations can

be satisfied only by the trivial solution because they have equivalent


p4u = -3(1-t)x3
P4 1 2 3

p u = (1-t)x
P4UU2 3

Further examination, however, leads to the conclusion that when the

control is completely unconstrained there is no horizontal tangent

plane to the surface H = H(ul,u2).

When the geometric constraints to the control are included,

a constrained minimum may exist. If such is the case, the maximum

value of H occurs on the boundary of admissible control space. To that

end, PMP is employed along the control space boundary to determine

uPT (t) at each time t. Before detailing this procedure, it is neces-

sary to first consider some structural aspects of the problem.

By definition the control components are positive, which in

turn implies

Load: 4 > 0 k i4 = u l2

Shear: x4 < 0 k x4 >0 x (1) = 0

Moment: x3 0 x < 0 x3(1) = 0

Furthermore, since p2(t) 5 0 from (4.3.1)

P3 > 0 3 > 0 P3(0) = 0

p4 < 0 4 P4 < 0 P4(0) = 0

This exercise makes it possible to use the information of the sense for

x3 and p4 to simplify the search for OPT on the control boundary. By

arranging the Hamiltonian in the following fashion

H = -2 + 34 + P4[u1U2 (l-t)x3/p4ul2]

it is observed that both terms in the bracketed expression are positive.

This and the p4 outside the leading bracket allows the following


Max [H] I Min [ul2 (1-t)x3/P4UU2]
uE 9U u eaU

u =PT ARGMIN [$(u)]


[(u) = [u2 + F2(t)/uu2 ] (4.3.3)


F2(t) = (l-t)x3/P4 > 0

At each position t along the beam, the state and adjoint variables must

satisfy the appropriate differential equations, and u is specified by

the preceding three equations.

Control space boundary rU is illustrated in Figure 4.2, where

the ul axis is treated as the ordinate since ul(t) and u2(t) correspond

b/d < u 1 1

a/c < u < 1



b/d -

0 a/c 1

Figure 4.2 Admissible Control Space

to the height and width, respectively, of the cross section of the

beam at position t.

Along the constant u edges of 3U, let uI = u where u has the
1 1 c c
value of either b/d or unity. If $ (Uc,u2) has a minimum point

dl d2$
--= 0 and -- > 0
du2 du2

The value of

D(u ,u ) = u u + F2(t)/u3u

2 c 22 2
du u F (t)/u3u2
d c c 2

du2 c2

u2 which satisfies the first

u = F(t)/u2
2 c

Furthermore, it is observed that only one extremum of Q(u) exists along

ul = Uc and that it is a minimum. Hence, either 4(uc,U ) has a minimum

on the constant u1 edge or is monotonically decreasing/increasing. If


u 2 a/c

1 i u2

then along the constant ul edge, H has its maximum value at a corner of

the rectangular 3U. On the other hand, if

a/c < u2 < 1

then H has its maximum value on the line u = u interior to the
1 c

condition is

Similarly, along the u2 edges of U, denote u2 = u where u

has the value of either a/c or unity. If $(ul,uc) has a minimum point

de d2,
S0 and 7- > 0
du1 du


$(Ulu ) = 1u + F2(t)/u3u

d- u 3F2(t) /uu
du1 c 1c
d2 12F2(t)/ulu
du? c

It is observed that $(u1,uc) has only one extremum along u2 = uc,

it is a minimum, and occurs at the point ul = ul where

u = + {3F2(t)/u2}
1 c

Thus, by the same argument posed in the preceding paragraph, if either

U1 < b/d or 1 ul

then along the constant u2 edge, H has its maximum value at a corner of

the rectangular DU. Wherever

b/d < u1 < 1

H has its maximum value on the line u2 = u interior to the endpoints.
2 c

On the basis of these arguments, the following system was

solved by the method of quasilinearization:

X1 = x2 x1(0) = 0

2 = x3/uu2 x2(0) = 0

3 = x4 x3(1) = 0

4 = ul2 x4(1) = 0

3 = -P2/3U2 ''P3(0)= 0

4 = -P3 P4(0) = 0

u = ARGMIN [UU2 (l-t)x3/p4ulu2]
U 1U

The beam is represented by 100 intervals composing the range 0 < t 5 1,

which is separated by 101 "mesh points." An initial guess of the

solution x(t) and p(t) is chosen; it is selected to satisfy the bound-

ary conditions. This guess is not a solution and does not satisfy the

differential equations. The and j equations are linearized about the

initial guess, then the resulting linear TPBVP is solved to obtain new

x(t) and p(t) functions which more closely satisfy the differential

equations. At each time t corresponding to a mesh point, H is numer-

ically evaluated along each of the four straight line segments com-

posing 3U to determine 4OPT. The point (ul,u2) on 3U which gives

H(u;x,p,t) its maximum value is u T. This process is repeated until

the x(t) and p(t) iterate satisfies the differential equations to

within a specified tolerance. The equations necessary to use the IBM

program available are given in Appendix C, in the form of a subroutine



4.4 Results: Geometric Control Constraints

For the most part, no major difficulties were encountered in

using quasilinearization to obtain a solution to the sixth order sys-

tem derived in the previous section. Certain parameter values did

engender numerical instability. These cases, the source of the diffi-

culty, and its circumvention are discussed in Chapter VII. Moreover,

all calculations were done in double precision as necessitated by matrix

inversion accuracy requirements.

Parameter values selected to illustrate the solution method are:

u : b/d = 0.25
u2 : a/c = 0.20

The measure of error of satisfaction of the differential equations in

the TPBVP is in terms of the general system

= f.(Y,x) Y Y.(x), i = ,...,n
dx 1 -1

E-1. i = Max IdY. f.(Y,x)dx| (4.4.2)

Deflection of a uniform beam due to its own weight was used to infer

an initial guess which satisfies all boundary conditions:

xl(t) = t4

x2(t) = t3

x3(t) = 1 t2
0 < t < 1 (4.4.3)
x4(t) = -1 + t

P3(t) = t2

P4(t)= -t3

With these specified parameter values and initial guess of the solu-

tion, the program converged to a solution in five iterations. From

this run a tolerance was selected for all subsequent cases; the follow-

ing tabulation provides the data used in its selection:

Iteration ERROR Tip Deflection (Cost)

1 .2028 .7387749327 x 10-

2 .1704 .3192152426 x 10-

3 .6533 x 10-1 .2847993812 x 10-

4 .3031 x 10-5 .2853731846 x 10-

5 .1129 x 10-10 .2853719983 x 10-1

It is seen from these tabular data that there is little improvement in

cost (tip deflection) as a result of the fifth iteration. For this
reason a value for the tolerance was selected as 0.5 x 10 which corre-

sponds to about six significant digits in the cost functional.

Recall from the previous section that no unconstrained minimum

exists. With the control bounds included, the intuitive solution is

one in which the cross-sectional area is maximum near the root, and

reduces to a minimum at the tip. Recalling that for the optimal control,

u =NMax [H] Min [i]

A sequence of illustrations in Figure 4.3 demonstrates the location of

uPT on 5U for several stations along the beam. Constant contours of

D(u) are plotted on the admissible control space at five distinct posi-

tions. If an extremal point exists interior to DU some lines of constant

0(u) contours must be closed curves in u-space. This is impossible for

this example.

x/L .1


0 1 1




x/L = .7

O Minimum $(u) u-pT

O Maximum O(u)

-->- Direction of Increasing 0(u)


Figure 4.3 Contour Plots of 0(u) at Various Stations
Along the Beam

x/L = .5



x/L = 1.

x/L = .3

The first illustration is for the t = 0.1 cross section, near

the root of the beam. Since uPT occurs at the point of minimum 0(u),

the optimum value for both ul and u2 is unity, the maximum allowable

dimensions for both height and width. Constant contour lines indicate

that ((u) is mathematically decreasing in either direction of u-space.

Lines of constant 0(u) are also plotted for the cross section of

t = 0.3. The optimum control has the maximum admissible value for

height ul but u2 has a value somewhat less than unity. However, there

are still no contour lines which are closed curves.

At the midpoint cross section the minimum 0(u) point occurs

at ul = 1 and u2 = a/c. Although the surface Q(u) forms a scoop-like

shape, there are still no closed curve contours, and hence no extremal

interior to admissible u-space. The next cross section at which 0(u)

is displayed occurs at t = 0.7. On this section, u2 is still at its

lower bound but ul is no longer at the maximum allowable value of unity

as shown in Figure 4.3. In the last of the sequence, d(u) contours for

the cross section at the tip of the beam are displayed. The point of

minimum q(u) occurs where both components of control have their minimum

allowable values. Again, no contour lines of constant 0(u) form a

closed curve indicating the existence of an interior extremal point.

This sequence of illustrations indicates two things. First,

the lack of closed curve contour lines of <(u) verifies that uPT exists

on U. With further study it may be possible to obtain some condition

on H = 0 which implies the equations corresponding to (4.3.2) can

never yield a finite, unconstrained optimum. Such a condition would

define the class of structures whose unconstrained solution is the

"zero volume solution" frequently described in the literature on struc-

tural optimization. Secondly, at t = 0 the point u PT occurs at
u = [1,1] the point of maximum cross-sectional area; as t increases

from zero to one, the point uPT moves along the ul = 1 boundary of aU

to the u2 lower bound, and then down the u2 = a/c boundary of DU to ul

lower bound. By the time t = 1 the optimal cross-sectional area is the

minimum allowable area. As a result of the prescribed form of 3U,

if u (t) follows this particular path as t increases from zero to one,
each component of u (t) has its own distinct region of transition.
That is, at any value of t, if b/d < ul < 1, then u2 must be on either

its upper or lower bound. Conversely, if u2 is in transition where

a/c < u2 < 1, then ul must be on one of its bounds.

This effect is seen most clearly in Figure 4.4, where uOPT is

displayed for the example case parameter values specified by (4.4.1).

The profiles are displayed on a two-view drawing as a plan-form of the

beam might appear. State components corresponding to this beam are

shown in Figure 4.5, representing dimensionless deflection, slope,

moment, and shear, respectively. As observed in Figure 4.4, there are

five distinct regions of the beam:

(i) 0 < t .25

uI =1
controls onupper bounds
u2 = 1

(ii) .25 < t < .52

u = 1
u2 transition


a/c = .20



Figure 4.4 Plan-form Views of Optimal Solution for
b/d = .20 and a/c = .25


2 u













0 .2 .4 .6 .8 1.0

Figure 4.5

State Components of Optimal Solution for
b/d = .20 and a/c = .25







(iii) .52 < t < .59

uI = 1 on upper bound

u2 = a/c on lower bound

(iv) .59 < t < .90

b/d 1 u transition
u = a/c

(v) .90 < t < 1.00

ul = b/d
controls on lower bounds
u = a/c

The curves that show the intercept locations as a function of

parameter values b/d have been called "correlation curves" in earlier

studies. When the width is allowed to vary also, the second parameter

a/c is introduced. For the sake of comparison to previous studies, the

intercept/correlation curves are plotted as dependent upon b/d and

parametric in a/c. However, it would be just as correct to do the


Intercept location curves described above are shown in

Figure 4.6. The heavy black curve is the case where a/c = 1, a beam

of constant uniform width--the case cited from earlier literature.

Another special case is represented by dashed lines, correspondin;- to

a/c = 0 which is the case corresponding to a minimum allowable thick-

ness equal to zero. Dashed lines are used because these data are an

extrapolation: convergence problems encountered for parameter values

less than 0.1 prevented obtaining numerical results.



o c


a 0

(d 4



A discussion on the convergence difficulties experienced by

the quasilinearization algorithm for parameter values approaching zero

is presented in the chapter on numerical instabilities. In that dis-

cussion, isolation of the source of difficulty is reported; it is pos-

sible that this difficulty may be a general result applicable to all

problems to be solved by the method of quasilinearization. A solu-

tion for this case is later obtained by finite element techniques.

Note that since 0 < a/c < 1 these two cases represent limits to the

solutions of the problem. In addition, if the four intercept locations

are plotted versus a/c and parametric in b/d, curves r and r appear
a c

as "horizontal vees" with rb and rd lines that are nearly parallel.

It is interesting to note from the figure depicting the solu-

tion of this case as a plan-form, that in the central region of the

beam, the height is greater than the width. This result can be antic-

ipated since such a configuration gives a greater bending resistance

per unit weight.

With further reference to Figure 4.4, the transition of ui(t)

is seen to be almost a linear taper, whereas the u2(t) transition

exhibits a much more pronounced curvature. To generalize from this

specific case of given values of b/d and a/c to arbitrary values

requires the introduction of four quantities characterizing the solu-

tion. These quantities are the values of t at the points where the

transitions intercept the bounds on ul and u2; since t represents a

normalized position x/L, these quantities can be thought of as an

intercept location expressed as percent of the beam's length. They

are defined with reference to the five distinct regions of the beam

previously given, where

rb design

rd design

ra design

r design

such that the five

(i) 0 < t

(ii) rc < t

(iii) r < t
a -

(iv) rd < t

(v) rb t
b -

tes u (t)

tes ul(t)

tes u2(t)

tes u2(t)


< r
- c

< r
< rd

< rb

< 1.0

intercept with lower bound

intercept with upper bound

intercept with lower bound

intercept with upper bound


control on upper bounds

u2 transition

control on upper/lower bounds

u1 transition

control on lower bounds

4.5 Inequality Stress Constraints

This section treats an inequality limit to allowable normal

and shear stresses associated with bending. Using the ordinary

strength of materials formulations it can be shown that for the rec-

tangular cross section these constraints take the form

1 MB(x)
Sh(x) MAX

1 VB(x)
8 IB(X) h2(x) MAX

When dimensionless quantities are introduced

6 Lx3 u u2 < MAX

3 < 2
2 YL x4/ulu2 MAX

the inequalities may be written in the required form for mixed con-

straints, i.e., as a function of both control and state components:

i1(xu) = x3/u u2 0 (4.5.1)

h2(xu) = x4/uu2 To 0 (4.5.2)

1 MAX d
0 6 yL

2 'MAX
T0 3 yL

The two stress constraints place restrictions upon the minimum

cross-sectional dimensions to keep the normal and shear stresses less

than prescribed values. Specifically, from the constraints (4.5.1)

and (4.5.2), two control inequalities must be satisfied at each station

t, and these inequalities depend upon the state of the structural sys-

tem. The inequalities are:

ulu2> x3/o0

uu2 > x4/T x4(t) < 0

from which can be derived boundary arcs in u-space:

ul(u2) = (x3/O0u2) 0

uT(U2) = 4/TO > 0

Both of these boundary arcs are hyperbolas restricted to the first

quadrant of u-space. Depending upon the location of the arcs, vis-a-vis

the rectangular 3U, inclusion of stress constraints has one of three

effects in determining what u is admissible. At any station t in some

structural state x(t),

(i) if 0 ,T0 is too small the stress boundary arc lies entirely

above rectangular 3U; all geometrically admissible u violate

the stress constraints.

(ii) if 0,T0 is too large the stress boundary arc lies entirely

below rectangular DU; all geometrically admissible u satisfy

the stress constraints.

(iii) for some range of 0,T 0 the stress boundary arc divides the

rectangular DU into two regions: the upper region consists of

geometrically admissible u that satisfy the stress constraint,

u in the lower region are geometrically admissible but violate

the stress constraint.

Inclusion of stress constraints alters the admissible control

space from the rectangular shape previously considered to a shape that

may contain a stress boundary arc as part of its boundary. Consider the

normal stress boundary arc specified by (4.5.3) to be a part of 4U.

Then to find uPT in the manner outlined in Section 4.3, (u) must be

evaluated along ul = u l(u2). If a minimum exists along the orthogonal

projection of u l(u2) on the O(u) surface, then


Min (u) U -
ul ul- a20(u)
> 0