Differential Constraint Scaling in Penalty Function Optimization




A I I E Transactions. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/uiie19

Differential Constraint Scaling in Penalty Function Optimization
Donald L. Keefer (Gulf Research & Development Company) and Byron S. Gottfried (University of Pittsburgh). Published online: 09 Jul 2007.

To cite this article: Donald L. Keefer & Byron S. Gottfried (1970) Differential Constraint Scaling in Penalty Function Optimization, A I I E Transactions, 2:4, 281-289

To link to this article: http://dx.doi.org/10.1080/05695557008974766



Differential Constraint Scaling in Penalty Function Optimization¹

DONALD L. KEEFER, Gulf Research & Development Company

BYRON S. GOTTFRIED, University of Pittsburgh

Abstract: Penalty function optimization techniques are generally very successful in solving nonlinear programming problems. However, they can become computationally ineffective if certain of the constraints tend to dominate the entire constraint set. Under these circumstances, computational efficiency can often be restored by multiplying each constraint by an appropriate scale factor. This article presents a heuristic algorithm for computing such scale factors at periodic intervals during the computation. The method is applied to several sample problems, including some problems which are intentionally ill-scaled and others which are of a more realistic nature. The method is shown to be beneficial in all cases.

One of the more powerful nonlinear programming techniques which has recently been developed is based upon the penalty function concept. The strategy behind such algorithms is to transform the problem

Extremize y(X)

subject to g_j(X) = 0, j = 1, 2, …, k   [1]

into an equivalent unconstrained problem

z(X, λ) = y(X) ± λσ(X),   [2]

where the plus sign is chosen for a minimization problem and the minus sign for a maximization. The function σ(X) is a nonnegative penalty function, which is constructed from the given set of constraints, and λ is some positive constant, known as the penalty coefficient, which fixes the cost of constraint violation. If y(X), g_j(X), j = 1, 2, …, m, and z(X, λ) are continuously varying functions of X, then the desired solution to [1] can be obtained by solving a sequence of unconstrained problems [2] with successively greater values for λ. Several schemes for constructing the z(X, λ) function, and several hillclimbing procedures for carrying out the subsequent unconstrained

optimizations, are discussed by Fiacco and McCormick (1).
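As a concrete sketch of this sequence-of-unconstrained-problems strategy, the fragment below (in Python, with a toy objective and constraint of my own choosing, not taken from the article) minimizes y + λg² by simple gradient descent, raising λ by a factor of 8 between levels as the article's experiments later do:

```python
# toy problem (not from the article): minimize y = (x1-2)^2 + (x2-1)^2
# subject to g = x1 + x2 - 2 = 0; the constrained optimum is (1.5, 0.5)
def penalty_minimize(x1, x2, lam=1.0, factor=8.0, levels=8, iters=4000):
    """Solve a sequence of unconstrained problems z = y + lam*g^2,
    raising the penalty coefficient lam between levels."""
    for _ in range(levels):
        step = 0.25 / (1.0 + 4.0 * lam)            # stable step for this quadratic
        for _ in range(iters):
            g = x1 + x2 - 2.0                      # equality constraint value
            d1 = 2.0 * (x1 - 2.0) + 2.0 * lam * g  # dz/dx1
            d2 = 2.0 * (x2 - 1.0) + 2.0 * lam * g  # dz/dx2
            x1, x2 = x1 - step * d1, x2 - step * d2
        lam *= factor                              # next penalty level
    return x1, x2

x_star = penalty_minimize(0.0, 0.0)
```

For each finite λ the unconstrained minimizer violates the constraint slightly; as λ grows the iterates approach the constrained optimum, which is the mechanism the text describes next.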

A particularly popular penalty function algorithm is based upon extremizing the unconstrained function

z(X, λ) = y(X) ± λ Σ_{j=1}^{m} δ_j g_j²(X).   [3]

The δ_j's are switching functions, determined according to

δ_j = 1 if constraint j is an equality, or an inequality which is violated (g_j > 0); δ_j = 0 if constraint j is a satisfied inequality (g_j ≤ 0).   [4]

Notice that as the penalty coefficient becomes larger, the penalty function is driven to zero by the extremization of z(X, λ). This is the mechanism which causes the constraints to be satisfied at the end of the sequence of optimizations.

Equation 3 represents an exterior penalty function algorithm, since the optimal solution point is usually approached "from the outside," i.e., from an infeasible region. One of the attractive features of this algorithm is that it does not require that the constraints be satisfied until the end of the computation, thus avoiding some of the rigors of other algorithms which must maintain feasibility at all times.

In a computational sense, the algorithm suggested by [3] works very well when each of the constraints has about the same sensitivity to a given movement within the hyperspace (the term "sensitivity" will be defined precisely in the next section). Stated differently, the method works well when each of the violated constraints tends toward feasibility at about the same rate. However, when one or more constraints are significantly

¹ Presented at the 38th ORSA Meeting in Detroit, Michigan, October 30, 1970.

December 1970 AIIE Transactions 281


more sensitive than the remaining constraints, the penalty function can take on ridge-like properties which tax the abilities of even the best hillclimbing procedures. Under these conditions, the computation becomes inefficient. Moreover, the computation may terminate at a false optimum. Adverse conditions of this type are particularly common in problems of high dimensionality with a large number of constraints.

To see this problem more clearly, disregard the objective function y(X) for the moment and consider only the two equality constraints

g1(x1, x2) = x1 + x2 = 0

g2(x1, x2) = x1 − x2 = 0

shown in Figure 1. The corresponding penalty function, which must approach zero at the solution point, is

σ(x1, x2) = (x1 + x2)² + (x1 − x2)² = 2(x1² + x2²),

whose level curves are circles centered at the origin, as shown in Figure 2. Clearly, the two constraints can both be satisfied only at the origin. This point can be obtained very easily by minimizing σ(x1, x2) using any one of several known hillclimbing procedures. (The fact that σ(x1, x2) is twice differentiable enlarges considerably the number of available hillclimbing procedures.)

On the other hand, consider the two equality constraints

g1'(x1, x2) = x1 + x2 = 0

g2'(x1, x2) = 100(x1 − x2) = 0.

Figure 1. Constraints g1(x1, x2) and g2(x1, x2).

Figure 2. Level curves of σ(x1, x2).

These two constraints are mathematically equivalent to the original pair. In fact, a plot of g1'(x1, x2) = 0 and g2'(x1, x2) = 0 is identical to Figure 1. Notice, however, that at any point which is equally distant from the two lines, inclusion of the factor 100 causes g2'(x1, x2) to be much larger in magnitude than g1'(x1, x2). Stated differently, g2'(x1, x2) is said to dominate g1'(x1, x2). This same dominance is exhibited in the magnitudes of the gradients of g2'(x1, x2) and g1'(x1, x2). The corresponding penalty function is

σ'(x1, x2) = 10,001(x1² + x2²) − 19,998 x1x2,

as shown in Figure 3. Observe the sharp ridge-like features of this penalty function, in contrast to the well-behaved function in the previous figure.

The computational problems which can be caused by constraint dominance are readily appreciated if one attempts to reach the feasible point, 0, starting at point A in Figure 3. Suppose one follows some path which approximates that of steepest descent. Since the partial derivatives of (g2')² are much greater in magnitude than those of (g1')², the path toward 0 will first be directed toward B, following which it will "zig-zag" down the valley toward 0. It is widely recognized that numerical hillclimbing procedures tend to become highly inefficient when applied to functions which exhibit such steep valleys or ridges, even, in some cases, terminating at false optima.
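This effect is easy to reproduce numerically. The sketch below (illustrative starting point and tolerance, not from the article) applies fixed-step steepest descent to the well-behaved penalty σ = 2(x1² + x2²) and to the ridge-like σ' = 10,001(x1² + x2²) − 19,998 x1x2; the ill-scaled version needs orders of magnitude more iterations to drive the gradient below tolerance:

```python
def descend(grad, x, step, tol=1e-6, max_iters=2_000_000):
    """Fixed-step steepest descent; returns the number of iterations
    needed to drive the gradient norm below tol."""
    for k in range(max_iters):
        g1, g2 = grad(x)
        if (g1 * g1 + g2 * g2) ** 0.5 < tol:
            return k
        x = (x[0] - step * g1, x[1] - step * g2)
    return max_iters

# gradients of the two penalty functions in the text
grad_balanced = lambda x: (4.0 * x[0], 4.0 * x[1])              # sigma  = 2(x1^2 + x2^2)
grad_scaled = lambda x: (20002.0 * x[0] - 19998.0 * x[1],       # sigma' = 10001(x1^2 + x2^2)
                         -19998.0 * x[0] + 20002.0 * x[1])      #          - 19998 x1 x2

start = (1.0, 0.5)                                   # illustrative starting point
n_balanced = descend(grad_balanced, start, step=0.25)        # step = 1/L, L = 4
n_scaled = descend(grad_scaled, start, step=1.0 / 40000.0)   # L = 40000 for sigma'
```

The Hessian of σ' has eigenvalues 4 and 40,000, so a step size safe for the ridge direction makes progress along the valley floor extremely slow, which is exactly the zig-zagging behavior described above.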

The above problem is, of course, contrived to demonstrate the undesirable effects of sharp ridges and valleys which are caused by constraint imbalance. Obviously, these particular difficulties can be eliminated simply by removing the factor of 100 from g2'. In many problems of practical interest, however, such ridges and valleys do



occur, despite the analyst's best efforts to eliminate them. Thus, the problem is more general, and more serious, than one might suspect from the elementary nature of the above example.

The difficulties caused by ill-scaled constraints can be minimized, if not eliminated, by multiplying each constraint by an appropriate scale factor, thereby altering the sensitivity of the penalty function to each of the constraints. Thus the algorithm suggested by [3] can be modified by writing

z(X, a, λ) = y(X) ± λ σ(X, a),   [5]

where the penalty function σ(X, a) is given by

σ(X, a) = Σ_{j=1}^{m} δ_j a_j g_j²(X)   [6]

and the a_j are the constraint scale factors. In the following section an algorithm is presented for evaluating the a_j's, based upon the sensitivities of the corresponding g_j²(X)'s to a movement in the hyperspace.

THE ALGORITHM

Consider, for the time being, a maximization problem having only equality constraints. It will be assumed that each constraint has continuous first partial derivatives within the region of interest.

The proposed algorithm proceeds by computing a sequence of maxima of z(X, a, λ) with respect to the policy vector X for increasing values of λ. The function z(X, a, λ) is constructed in such a manner that maximization of z can only be accomplished by forcing each g_j² to zero for λ sufficiently large. For each g_j², a corresponding scale factor a_j is sought which will cause each product a_j g_j² to move toward feasibility at about the same rate.

From a geometric viewpoint, the desired a_j's should tend to smooth any sharp features of the penalty function which result from constraint imbalance. Referring to the previous example, the scale factors should be chosen so that the penalty function whose contours are shown in Figure 3 will be transformed into a penalty function with contours resembling those of Figure 2. Note that such smoothing of the contours tends to direct the gradient of the penalty function more toward the desired minimum.

Figure 3. Level curves of σ'(x1, x2). (Not drawn to scale.)

To obtain some measure of the variation of g_j²(X) in the neighborhood of some point X0, it is convenient to define a sensitivity function, S_j, as

S_j(X0) = |g_j(X0)| Σ_{i=1}^{n} |∂g_j(X0)/∂x_i|.   [7]

This choice is not the only one possible, but it is simple and it has worked successfully.

Now a plausible way to scale the constraints would be to compute some average sensitivity function for all of the constraints, S̄, and compare each S_j with S̄. If S_j is smaller than S̄, then g_j² should be multiplied by some a_j > 1, which would in turn increase the scaled value of S_j. Conversely, if S_j is larger than S̄, then a_j should be chosen such that 0 < a_j < 1.
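For the ill-scaled pair of the earlier example, this sensitivity measure makes the dominance quantitative. A minimal sketch (the evaluation point is chosen for illustration only):

```python
# sensitivity of a constraint at a point: |g_j| * sum of |partial derivatives|
def sensitivity(gj, grad):
    return abs(gj) * sum(abs(d) for d in grad)

x = (1.0, 0.0)   # a point equally distant from both constraint lines
s1 = sensitivity(x[0] + x[1], (1.0, 1.0))                 # g1' = x1 + x2
s2 = sensitivity(100.0 * (x[0] - x[1]), (100.0, -100.0))  # g2' = 100(x1 - x2)
# s2 / s1 = 10,000: g2' dominates g1' by four orders of magnitude
```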

Before discussing how a_j and S̄ might be determined, it should be pointed out that the above strategy is plausible only when X0 is not close to any of the constraint surfaces. Suppose, however, that X0 is very near a constraint surface (as, for example, point B in Figure 3). Then it is likely that |g_j(X0)|, and hence S_j(X0), will be comparatively small, despite the fact that the partial derivatives of g_j(X0) may be large in magnitude. This is known as the "constraint ridge" problem.

Application of the above strategy under these conditions results in a choice of a_j > 1. This causes an increase in the scaled value of g_j, and hence of its gradient, which is precisely the wrong result. Rather than render the contours of the penalty function more circular, the above procedure further increases the severity of the constraint ridge problem, resulting in zig-zagging along the ridge.

To overcome the constraint ridge problem, the sensitivity function, S_j, must be prevented from taking on small values at some X0 where g_j(X0) is nearly satisfied but

Σ_{i=1}^{n} |∂g_j(X0)/∂x_i|

is large. This can be accomplished simply by lower bounding |g_j(X0)| at unity.

In practice, it is a good idea also to lower bound the summation term



at unity. The reason for this is less apparent than the lower bounding of |g_j(X0)|. Occasionally, however, a constraint whose first partial derivatives vary considerably with X takes on a small value for the summation term at some particular X0. Lower bounding the summation term at unity thus prevents that constraint from being overscaled.

Thus, the expression for the sensitivity function becomes

S_j(X0) = Γ_j(X0) G_j(X0),   [8]

where

Γ_j(X0) = Max [ |g_j(X0)|, 1 ]   [9]

and

G_j(X0) = Max [ Σ_{i=1}^{n} |∂g_j(X0)/∂x_i|, 1 ].   [10]

The method used for computing S̄ is somewhat arbitrary and depends, among other things, on how often a new set of scale factors is to be determined. The arguments presented above which pertain to the utility of the sensitivity functions are based upon the local behavior of the constraints at some arbitrary point X0. This suggests the possibility of determining a new set of sensitivity functions and a new set of scale factors whenever a move has been made within the hyperspace. To do so, however, would result in an alteration of the objective function during the course of each unconstrained maximization. Even if this were not so, the effort involved in computing a new set of S_j's whenever a move is made within the hyperspace could easily offset the advantage to be gained by scaling.

As an alternative, one could determine new sensitivity functions and new scale factors only at the start of every penalty level (i.e., whenever a new value is assigned to λ and a new hillclimbing problem is initiated). Thus the strategy will be implemented globally rather than locally. If the increase in λ at successive penalty levels is not too large, then the successive optimal solution points will not be too far apart. Under these conditions the global application of the above strategy is not unreasonable.

Consider, then, that X0 represents the starting point of a given penalty level. Except for the first penalty level, X0 will also represent the solution point of the previous penalty level. Let a_j* represent the scale factors at the end of the previous penalty level, j = 1, 2, …, m (note that a_j* = 1 at the start of the first penalty level), and let a_j be the scale factors at the start of the current penalty level. It will be required that the penalty function vary continuously, even between successive penalty levels. Hence,

σ(X0, a) = σ(X0, a*),   [11]

which can be written

Σ_{j=1}^{m} a_j g_j²(X0) = σ(X0, a*).   [12]

If the a_j are chosen such that

a_j = S̄ / S_j(X0),   [13]

then Equation 12 becomes

S̄ Σ_{j=1}^{m} g_j²(X0) / S_j(X0) = σ(X0, a*).   [14]

Solving for S̄, one obtains

S̄ = σ(X0, a*) / [ Σ_{j=1}^{m} g_j²(X0) / S_j(X0) ],   [15]

which is the desired expression.

To summarize the algorithm: the sensitivity functions S_j, j = 1, 2, …, m, are calculated at the start of each penalty level in accordance with Equations 8-10. A value for S̄ is then obtained from Equation 15, and each a_j is determined using Equation 13. These values of a_j are then held constant throughout the penalty level.
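A compact sketch of this per-penalty-level computation (in Python, with hypothetical helper names; the two constraints are the g1', g2' pair of the earlier example, evaluated at an arbitrary illustrative point with previous scale factors a_j* = 1):

```python
def sensitivity(gj, grad):
    """Eqs. 8-10: both factors of S_j lower-bounded at unity."""
    return max(abs(gj), 1.0) * max(sum(abs(d) for d in grad), 1.0)

def scale_factors(g_vals, g_grads, sigma_prev):
    """Eq. 15 gives S_bar from the continuity condition; Eq. 13 gives a_j."""
    S = [sensitivity(g, d) for g, d in zip(g_vals, g_grads)]
    s_bar = sigma_prev / sum(g * g / s for g, s in zip(g_vals, S))
    return [s_bar / s for s in S]

# start of a penalty level at X0 = (2, 1)
g_vals = [2.0 + 1.0, 100.0 * (2.0 - 1.0)]   # g1' = 3, g2' = 100
g_grads = [(1.0, 1.0), (100.0, -100.0)]
sigma_prev = sum(g * g for g in g_vals)      # sigma(X0, a*) with a_j* = 1
a = scale_factors(g_vals, g_grads, sigma_prev)
# a[0] > 1 boosts the weak constraint, a[1] < 1 damps the dominant one,
# and the continuity condition of Eq. 11 is preserved by construction
```

By Equation 13 every scaled sensitivity a_j S_j equals S̄, so the scaled constraints move toward feasibility at comparable rates, while Equation 15 ensures the penalty value at X0 is unchanged when the new factors are switched in.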

When implementing the algorithm, it has been found that an automatic on-off provision can occasionally be beneficial for certain ill-scaled problems. This proceeds as follows: when a constraint ridge has been encountered during a penalty level in which scaling has been employed, then the next penalty level proceeds without scaling. The computation continues without scaling as long as σ continues to decrease. Once σ becomes relatively constant, the scaling is resumed at the next penalty level. It is therefore possible to alternate repeatedly between scaling and no scaling in successive penalty levels. Alteration of the penalty function in this manner can result in movement away from the constraint ridge. This can be beneficial in problems with severe ridges, as it causes the optimizer to explore a portion of the region where the influence of the ridge is less dominant.

Although the above discussion has been limited to equality constraints, inequalities may be treated in a similar manner. However, since inequality constraints are satisfied when negative, the value of |g_j| used in computing S_j must be set equal to 1 whenever g_j ≤ 1. This lower-bounding procedure causes inequality constraints which are satisfied to be scaled just as satisfied equality constraints. Domination of the constraint system by satisfied inequality constraints having large derivatives is thereby prevented.
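The lower-bounding rule for both constraint types can be sketched as follows (hypothetical helper name, written to match the rule just stated):

```python
def bounded_magnitude(gj, is_equality):
    """|g_j| as used in computing S_j: lower-bounded at unity. For an
    inequality (satisfied when negative) the bound applies whenever
    g_j <= 1, so satisfied inequalities cannot dominate the scaling."""
    if is_equality:
        return max(abs(gj), 1.0)
    return gj if gj > 1.0 else 1.0

val = bounded_magnitude(-500.0, False)   # deeply satisfied inequality
```

A deeply satisfied inequality with a large derivative thus contributes a magnitude of 1, not 500, so its scale factor is not inflated at the expense of the violated constraints.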

NUMERICAL APPLICATIONS

Four sample problems were devised to test the effectiveness of the differential constraint scaling scheme



described in the previous section. Constraint ridges were artificially created in these problems by multiplying individual constraints by a factor of 10³. These problems are shown below:

SAMPLE PROBLEM 1

Maximize y = 5x1² + x2³ + 2x1x2 + x3²

subject to g1 = x3 − 4 = 0

g2 = 10³(x2 − x3 − 4) = 0

g3 = x1² + x2² + x3² − 81 ≤ 0

and 0 ≤ x_i ≤ 10, i = 1, 2, 3.

Answer: y(1, 8, 4) = 549.
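The stated answer can be checked directly against the problem statement (a quick verification, not part of the original article):

```python
# Sample Problem 1 at the stated optimum (1, 8, 4)
x1, x2, x3 = 1.0, 8.0, 4.0
y = 5.0 * x1**2 + x2**3 + 2.0 * x1 * x2 + x3**2   # objective: 5 + 512 + 16 + 16
g1 = x3 - 4.0                                      # equality, must be 0
g2 = 1.0e3 * (x2 - x3 - 4.0)                       # equality, must be 0
g3 = x1**2 + x2**2 + x3**2 - 81.0                  # inequality, must be <= 0
```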

SAMPLE PROBLEM 2

Maximize y = x1 + x2 + x3

subject to g1 = 10³(x1² + x2² + x3² − 9) = 0

and 0 ≤ x_i ≤ 5, i = 1, 2, 3.

Answer: y(2, 2, 1) = 5.

SAMPLE PROBLEM 3

Maximize y = x1 + x2 + x3³

subject to g1 = x1² + x2² + x3² − 4 ≤ 0

and 0 ≤ x_i ≤ 10, i = 1, 2, 3.

Answer: y(1, 1, √2) = 2 + 2√2.

SAMPLE PROBLEM 4

Maximize y = x1² + x2² + x3²

subject to g1 = (x1 − x2)² = 0

g2 = 10³(x1 + x2 − 10) = 0

and 0 ≤ x_i ≤ 10, i = 1, 2, 3.

Answer: y(5, 5, 10) = 150.

Each of the sample problems was solved on a digital computer using the exterior penalty function technique suggested by Equation 3. Each problem was solved with and without constraint scaling using two different hillclimbing techniques: the direct search algorithm of Hooke and Jeeves (4), known as "pattern search," and the "variable metric" gradient method of Fletcher and Powell (2). Ten solutions were obtained for each problem using each hillclimbing method both with and without the scaling procedure. Thus, forty solutions were obtained for each problem.

Each set of ten solutions was obtained from the same set of ten randomly generated starting points. These starting points are given in Table 1. Note that Problems 1, 3, and 4 were all started from the same set of starting points, whereas Problem 2 was started from a different set of 10 randomly generated points. The reason for this is that the upper bounds imposed on the independent variables in Problem 2 were different from those in the other three problems.

The results of the computation are summarized in Table 2 and are presented in greater detail in Tables 3-10. In Table 2, a "satisfactory solution" must have a value of σ0 less than a specified tolerance, where σ0 is the value of the unscaled penalty function at the end of the optimization, and the values of the independent variables must be reasonably close to the known solution. Thus, solutions which are grossly infeasible and solutions which are feasible but not optimal are not included.

The initial value for the penalty coefficient, λ0, is shown in Tables 3-10 for each problem at each starting point. For each of the problems, the penalty coefficient was increased by a factor of 8 at each penalty level.

Table 1: Starting points for sample problems

Problems 1, 3, 4

Run    x1        x2        x3
1      6.7685    2.3368    3.1042
2      7.5939    7.6255    7.4077
3      5.8170    8.2322    7.0402
4      8.1519    5.5494    9.9293
5      9.6318    8.4266    3.8737
6      7.4023    9.5507    0.68323
7      8.1434    2.7116    2.9784
8      3.4665    3.9934    2.7614
9      0.62801   8.9153    7.8396
10     6.7999    0.24352   0.26182

Problem 2

Run    x1        x2        x3
1      3.3843    1.1684    1.5521
2      3.7970    3.8127    3.7039
3      2.9085    4.1161    3.5201
4      4.0759    2.7747    4.9647
5      4.8159    4.2133    1.9368
6      3.7012    4.7753    0.34161
7      4.0717    1.3558    1.4892
8      1.7333    1.9967    1.3807
9      0.31400   4.4576    3.9198
10     3.4000    0.12176   0.13091


Table 2: Number of satisfactory solutions from 10 starting points

           Pattern Search             Variable Metric Search
Problem    Scaling    No Scaling      Scaling    No Scaling
1          10         0               10         0
2          10         0               10         2
3          10         10              7          2
4          10         2               10         2


Table 3: Problem number 1 (pattern search). [The per-run values of x1, x2, x3, y, σ0, IOBJ, and λ0, with and without scaling, are illegible in the source scan.]

Table 4: Problem number 1 (variable metric search). [Per-run values illegible in the source scan.]

Table 5: Problem number 2 (pattern search). [Per-run values illegible in the source scan.]


Table 6: Problem number 2 (variable metric search). [Per-run values illegible in the source scan.]

Table 7: Problem number 3 (pattern search). [Per-run values illegible in the source scan.]

Table 8: Problem number 3 (variable metric search). Runs marked × are nonoptimum solution points, despite small values of σ0. [Per-run values illegible in the source scan.]


Table 9: Problem number 4 (pattern search). [Per-run values illegible in the source scan.]

Examination of the results reveals that the performance of the optimizer with scaling was at least comparable to, and usually much better than, its performance without scaling. In most cases, satisfactory solutions could not be obtained without the use of the scaling technique (realizing, of course, that these problems are purposely ill-scaled). Even in the pattern search solutions of Sample Problem 3, where the unscaled solutions were all correct, there was an advantage in using the scale factors; namely, the total number of functional evaluations from all 10 starting points was reduced by 15 percent. (The number of functional evaluations required to obtain a given solution is denoted by IOBJ in Tables 3-10.) Computer running time was reduced proportionately.

In addition to the simple, intentionally ill-scaled problems cited above, the scaling algorithm was tested on three additional problems of a more comprehensive nature, using pattern search as the hillclimbing technique. Unlike the earlier problems, these were not intended to be ill-scaled. In fact, considerable effort was made to remove any apparent constraint imbalance prior to the actual computation. Two of these (Problems 5 and 6) involved the optimization of nonlinear models representing petroleum refinery operations. Problem 7 was concerned with the optimization of a nonlinear chemical separation process, as discussed in a forthcoming paper by Gottfried, Bruggink, and Harwood (3).

Table 10: Problem number 4 (variable metric search). [Per-run values illegible in the source scan.]

Some statistics describing these three problems are presented in Table 11. Also shown for each problem is the percent reduction in functional evaluations when the scaling technique was employed. Clearly, use of the scaling technique was of significant benefit in improving the computational efficiency of the algorithm.

It is significant to note that the first four problems had obvious constraint scaling deficiencies which could easily be removed. However, complex problems often possess similar undesirable characteristics which are not readily discernible. The procedure described in this article is specifically intended for such problems.
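Because the scale factors enter through the penalty terms, the role they play can be illustrated with a generic exterior penalty function carrying per-constraint weights. This is a sketch only: the function name and the quadratic loss are illustrative assumptions, and the weights w merely stand in for the sensitivity-based scale factors defined earlier in the article.

```python
# Illustrative exterior penalty function with per-constraint scale
# factors. The weights w stand in for the article's sensitivity-based
# scale factors; the quadratic loss is an assumed, conventional choice.

def penalized_objective(y, g, w, r):
    """Exterior penalty: y(x) plus r times the sum of scaled squared violations.

    y : objective value at the current point
    g : inequality-constraint values, feasible when g[i] >= 0
    w : per-constraint scale factors (all 1.0 recovers the unscaled form)
    r : penalty parameter, increased between unconstrained minimizations
    """
    violation = sum(wi * min(0.0, gi) ** 2 for gi, wi in zip(g, w))
    return y + r * violation
```

Setting every weight to 1.0 recovers the unscaled penalty; when the constraints are badly imbalanced, one term then dominates the sum or vanishes from it, which is precisely the difficulty the scaling procedure is meant to correct.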

[Tables 9 and 10 report, for each of Runs 1-10 with and without scaling, the solution values x1, x2, x3, the objective y, the penalty-function value, IOBJ, and the final penalty parameter.]



REFERENCES

(1) FIACCO, A. V., AND MCCORMICK, G. P., Nonlinear Programming: Sequential Unconstrained Minimization Techniques, John Wiley and Sons, Inc., New York, New York, 1968.

(2) FLETCHER, R., AND POWELL, M. J. D., "A Rapidly Convergent Descent Method for Minimization," Computer Journal, Volume 6, 1963, pages 163-168.

(3) GOTTFRIED, B. S., BRUGGINK, P. R., AND HARWOOD, E. R., "Chemical Process Optimization Using Penalty Functions," pending publication in I&EC Process Design & Development.

(4) HOOKE, R., AND JEEVES, T. A., "Direct Search Solution of Numerical and Statistical Problems," J. Assoc. Comp. Mach., Volume 8, 1961, pages 212-229.

Table 11: Benefits of constraint scaling in three realistic problems

Problem  Description                  No. ind.   No. equal.    No. inequal.   Percent reduction in
                                     variables  constraints   constraints    functional evaluations
                                                                             due to scaling
   5     Refinery operations             15          0             16              54%
   6     Refinery operations             10          2              1              34%
   7     Chemical separation process      7          2              2              35%

CONCLUSION

In closing, it should be pointed out that the algorithm presented in this paper is of a heuristic nature. Both the manner in which the sensitivity functions are defined and the frequency with which they are recalculated result from arbitrary, though plausible, decisions. The technique could quite possibly be improved by altering the specific strategy in some manner which might be suggested by a more comprehensive analysis of the algorithm.

The method has, however, been shown to be consistently beneficial in all of the test problems. In some cases, use of the scaling technique was crucial to whether or not an optimal solution could be obtained. In other cases, where optimal solutions had been obtained without scaling, use of this method still resulted in an appreciable increase in computational efficiency. The method can be implemented on a digital computer without undue difficulty. Its use with exterior penalty function algorithms as described herein is strongly recommended.

Dr. Gottfried, associate professor of industrial engineering at the University of Pittsburgh, is currently doing research in mathematical modeling techniques and applied optimization theory as well as writing a textbook in optimization. A former faculty member of Carnegie-Mellon University, he was also employed by Gulf Research & Development Company, NASA, and Westinghouse Electric Corp. Dr. Gottfried has a BS degree from Purdue University, an MS from the University of Michigan, and a PhD from Case-Western Reserve University. He is an associate member of ORSA and AIChE.

Mr. Keefer is an engineer in the Economics and Computer Science Division of the Gulf Research and Development Company, where he is concerned with the development of nonlinear optimization techniques and their application to management and engineering problems. He holds a BS degree in mechanical engineering from Carnegie-Mellon University and an MS degree in mechanical engineering from Stanford University. He is an associate member of the American Society of Mechanical Engineers and a member of Phi Kappa Phi, Tau Beta Pi, and Pi Tau Sigma.
