Optimization Problems with Constraints - Introduction to Theory, Numerical Methods and Applications

Dr. Abebe Geletu
Ilmenau University of Technology, Department of Process Optimization
TU Ilmenau




Optimization problems with constraints

A general form of a constrained optimization problem is

(NLP)   min_x f(x)
        s.t.  hᵢ(x) = 0, i = 1, 2, …, p;
              gⱼ(x) ≤ 0, j = 1, 2, …, m,

where f, hᵢ, gⱼ : Rⁿ → R are at least once-differentiable functions and x ∈ Rⁿ.

Feasible set of NLP:

• x ∈ Rⁿ is a feasible point of NLP if hᵢ(x) = 0, i = 1, 2, …, p and gⱼ(x) ≤ 0, j = 1, 2, …, m.

• The set of all feasible points of NLP,

F := {x ∈ Rⁿ | hᵢ(x) = 0, i = 1, …, p; gⱼ(x) ≤ 0, j = 1, …, m},

is known as the feasible set of NLP.


Optimization problems with constraints ....

• Sometimes it is convenient to write the constraints of NLP in a compact form as

h(x) = 0,  g(x) ≤ 0,

where

h(x) = (h₁(x), h₂(x), …, hₚ(x))ᵀ  and  g(x) = (g₁(x), g₂(x), …, gₘ(x))ᵀ.

Example:

(NLP1)  min_x { (1/2)x₁² + x₁x₂² }
        s.t.  x₁x₂² − 1 = 0,
              −x₁² + x₂ ≤ 0,
              x₂ ≥ 0.


Optimization problems with constraints ....

In the example problem NLP1 above:
• there is only one equality constraint h₁(x) = x₁x₂² − 1, and
• two inequality constraints g₁(x) = −x₁² + x₂ ≤ 0 and g₂(x) = −x₂ ≤ 0.

▶ Observe that the point xᵀ = (1, 1) is a feasible point, while the point (0, 0) is not feasible (infeasible); i.e., xᵀ = (0, 0) does not belong to the feasible set

F = {(x₁, x₂)ᵀ ∈ R² | x₁x₂² − 1 = 0, −x₁² + x₂ ≤ 0, x₂ ≥ 0}.

Optimal Solution

A point x* ∈ Rⁿ is an optimal solution of the constrained problem NLP if and only if
(i) x* is feasible for NLP, i.e. x* ∈ F, and
(ii) f(x) ≥ f(x*) for all x ∈ F.


Optimization problems with constraints ....

• For the example problem NLP1, the point x*ᵀ = (1, 1) is an optimal solution.

▶ In general, it is not easy to determine an optimal solution of a constrained optimization problem like NLP.

Question:

(Q1) How do we verify that a given point x ∈ Rⁿ is an optimal solution of NLP? (This is about optimality criteria.)
(Q2) What methods are available to find an optimal solution of NLP? (This is about the computation of optimal solutions.)

Definition (Active Constraints)

• Let x be a feasible point of NLP. An inequality constraint gᵢ(x) ≤ 0 of NLP is said to be active at x if

gᵢ(x) = 0.

• The set A(x) = {i ∈ {1, 2, …, m} | gᵢ(x) = 0} denotes the index set of active constraints at x.


Optimization with constraints ... Optimality criteria

• For the example problem NLP1, the constraint g₁ = −x₁² + x₂ is active at the point x*ᵀ = (1, 1), while g₂ = −x₂ is not active at x*. Hence, A(x*) = {1}.

Descent direction

A vector d is a descent direction for the objective function f at the point x if

f(x + d) ≤ f(x).

Moving from the point x in the direction d decreases the function f.

• Any vector d that satisfies dᵀ∇f(x) ≤ 0 is a descent direction. To verify this, recall the first-order Taylor approximation of f at the point x:

f(x + d) ≈ f(x) + dᵀ∇f(x).

It follows that f(x + d) − f(x) ≈ dᵀ∇f(x) ≤ 0, i.e. f(x + d) − f(x) ≤ 0. Hence, f(x + d) ≤ f(x); therefore, d is a descent direction.

▶ NB: If d is a descent direction, then d̃ = αd, α > 0, is also a descent direction.
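As a quick numerical illustration (the quadratic f below is an example chosen here, not from the slides), d = −∇f(x) satisfies dᵀ∇f(x) < 0, and a small step along it decreases f:

```python
# Descent-direction check for f(x) = x1^2 + x2^2 (illustrative example).
def f(x):
    return x[0]**2 + x[1]**2

def grad_f(x):
    return [2 * x[0], 2 * x[1]]

x = [1.0, 1.0]
g = grad_f(x)
d = [-g[0], -g[1]]                 # negative gradient direction

slope = d[0] * g[0] + d[1] * g[1]  # d^T grad f(x)
print(slope)                       # -8.0 < 0, so d is a descent direction

alpha = 0.1                        # small step length
x_new = [x[0] + alpha * d[0], x[1] + alpha * d[1]]
print(f(x_new) < f(x))             # True: the step decreased f
```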


Optimization with constraints ... Optimality criteria

Feasible direction

Let x be a feasible point (i.e. x ∈ F) and let d be a vector in Rⁿ. If
(i) hᵢ(x + d) = 0, i = 1, …, p, and
(ii) gⱼ(x + d) ≤ 0, j = 1, …, m,
then d is said to be a feasible direction for the NLP.

• Let x be a feasible point. If a vector d satisfies

dᵀ∇hᵢ(x) = 0, i = 1, …, p  and  dᵀ∇gⱼ(x) < 0, j ∈ A(x),

then d̃ = αd is a feasible direction at x, for some α > 0. To verify this, use the first-order Taylor approximations at the point x:

i = 1, …, p:  hᵢ(x + αd) ≈ hᵢ(x) + α dᵀ∇hᵢ(x) = 0 + 0  ⇒  hᵢ(x + d̃) = 0,

j ∈ A(x):  gⱼ(x + αd) ≈ gⱼ(x) + α dᵀ∇gⱼ(x) = 0 + α dᵀ∇gⱼ(x) < 0  ⇒  gⱼ(x + d̃) ≤ 0,

j ∉ A(x) (inactive constraints):  gⱼ(x + αd) ≈ gⱼ(x) + α dᵀ∇gⱼ(x) with gⱼ(x) < 0  ⇒  gⱼ(x + d̃) ≤ 0 for 0 < α ≤ −gⱼ(x)/(dᵀ∇gⱼ(x)) if dᵀ∇gⱼ(x) > 0.


Optimization with constraints ... Optimality criteria

Optimality condition

If x* is an optimal solution of NLP, then there is no vector d ∈ Rⁿ which is both a descent direction for f and a feasible direction at x*. That is, if x* is an optimal solution of NLP, then the system of inequalities

dᵀ∇f(x*) < 0,  dᵀ∇hᵢ(x*) = 0, i = 1, …, p;  dᵀ∇gⱼ(x*) < 0, j ∈ A(x*),

equivalently

[−∇f(x*)]ᵀd > 0,  [∇hᵢ(x*)]ᵀd = 0, i = 1, …, p;  [∇gⱼ(x*)]ᵀd < 0, j ∈ A(x*),   (1)

has no solution d.

Farkas' Theorem

Given any set of vectors c, aᵢ, bⱼ ∈ Rⁿ, i = 1, …, p; j = 1, …, l, exactly one of the following two systems has a solution:

System I:  cᵀd > 0,  aᵢᵀd = 0, i = 1, …, p;  bⱼᵀd < 0, j = 1, …, l.

System II:  c = Σᵢ₌₁ᵖ λᵢaᵢ + Σⱼ₌₁ˡ μⱼbⱼ,  μ ≥ 0.

Now if we let c = −∇f(x*), aᵢ = ∇hᵢ(x*), i = 1, …, p, and bⱼ = ∇gⱼ(x*), j ∈ A(x*), then, since x* is an optimal point, only System II has a solution.


Optimization with constraints ... Optimality criteria

• Hence, if x* is an optimal solution of NLP, then there exist vectors λ ∈ Rᵖ, λᵀ = (λ₁, λ₂, …, λₚ), and μ ∈ Rˡ, μᵀ = (μ₁, μ₂, …, μₗ) ≥ 0, such that

−∇f(x*) = Σᵢ₌₁ᵖ λᵢ∇hᵢ(x*) + Σⱼ₌₁ˡ μⱼ∇gⱼ(x*),

where l = #A(x*). If we let μⱼ = 0 for j ∈ {1, …, m} \ A(x*), then we can write

−∇f(x*) = Σᵢ₌₁ᵖ λᵢ∇hᵢ(x*) + Σⱼ₌₁ᵐ μⱼ∇gⱼ(x*).

The Karush-Kuhn-Tucker (KKT) optimality conditions

If x* is a minimum point of NLP, then there are λ ∈ Rᵖ and μ ∈ Rᵐ, μ ≥ 0, such that the following hold true:

∇f(x*) + Σᵢ₌₁ᵖ λᵢ∇hᵢ(x*) + Σⱼ₌₁ᵐ μⱼ∇gⱼ(x*) = 0   (optimality)

h(x*) = 0   (feasibility)

g(x*) ≤ 0

μ ≥ 0   (nonnegativity)

μⱼgⱼ(x*) = 0, j = 1, …, m.   (complementarity)


Optimization with constraints ... Optimality criteria

• The function

L(x, λ, μ) = f(x) + Σᵢ₌₁ᵖ λᵢhᵢ(x) + Σⱼ₌₁ᵐ μⱼgⱼ(x)

is called the Lagrange function associated with the NLP.

Example: Solve the following optimization problem:

(NLP2)  min_x { f(x) = x₁² − x₂² }   (2)
        s.t.                         (3)
        x₁ + 2x₂ + 1 = 0             (4)
        x₁ − x₂ ≤ 3.                 (5)

Solution: Lagrange function

L(x, λ, μ) = (x₁² − x₂²) + λ(x₁ + 2x₂ + 1) + μ(x₁ − x₂ − 3).


Optimization with constraints ... Optimality criteria

Optimality condition:

∂L/∂x₁ = 0 ⇒ 2x₁ + λ + μ = 0 ⇒ x₁ = −(1/2)(λ + μ)   (6)
∂L/∂x₂ = 0 ⇒ −2x₂ + 2λ − μ = 0 ⇒ x₂ = (1/2)(2λ − μ)   (7)

Feasibility:

h(x) = 0 ⇒ x₁ + 2x₂ + 1 = 0 ⇒ −(1/2)(λ + μ) + (2λ − μ) + 1 = 0 ⇒ λ = μ − 2/3

Complementarity:

μg(x) = 0 ⇒ μ(x₁ − x₂ − 3) = 0 ⇒ μ(−(1/2)(λ + μ) − (1/2)(2λ − μ) − 3) = 0

⇒ μ(−(1/2)[(μ − 2/3) + μ] − (1/2)[2(μ − 2/3) − μ] − 3) = 0

⇒ μ[−(3/2)μ − 2] = 0 ⇒ μ = 0 or μ = −4/3.

μ = −4/3 < 0 is not allowed. Hence, μ* = 0 is the only possibility. As a result, λ* = μ* − 2/3 = −2/3.


Optimization with constraints ... Optimality criteria

Now, using μ* = 0 and λ* = −2/3:

x₁* = −(1/2)(λ* + μ*) = −(1/2)(−2/3 + 0) = 1/3   (8)
x₂* = (1/2)(2λ* − μ*) = (1/2)(2 × (−2/3) − 0) = −2/3.   (9)

The point x*ᵀ = (1/3, −2/3) is a candidate for optimality.

Observe that the inequality constraint g(x) = x₁ − x₂ − 3 ≤ 0 is not active at x*.

• In general, the KKT conditions above are only necessary optimalityconditions.
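The KKT conditions can be verified numerically at the candidate point of NLP2; a plain-Python sketch:

```python
# Check the KKT conditions of NLP2 at x* = (1/3, -2/3), lam* = -2/3, mu* = 0.
x1, x2 = 1/3, -2/3
lam, mu = -2/3, 0.0

grad_f = (2 * x1, -2 * x2)   # gradient of f(x) = x1^2 - x2^2
grad_h = (1.0, 2.0)          # gradient of h(x) = x1 + 2*x2 + 1
grad_g = (1.0, -1.0)         # gradient of g(x) = x1 - x2 - 3

# Optimality: grad f + lam * grad h + mu * grad g = 0
stat = tuple(gf + lam * gh + mu * gg
             for gf, gh, gg in zip(grad_f, grad_h, grad_g))
print(stat)                                          # approximately (0.0, 0.0)

print(abs(x1 + 2 * x2 + 1) < 1e-12)                  # feasibility of h
print(x1 - x2 - 3 <= 0)                              # feasibility of g
print(mu >= 0 and abs(mu * (x1 - x2 - 3)) < 1e-12)   # nonnegativity, complementarity
```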


Optimization with constraints ... Optimality criteria

Sufficient optimality conditions

Suppose the functions f, hᵢ, i = 1, …, p; gⱼ, j = 1, …, m are twice differentiable. Let x* be a feasible point of NLP. If there are Lagrange multipliers λ* and μ* ≥ 0 such that:
(i) the KKT conditions are satisfied at (x*, λ*, μ*); and
(ii) the Hessian of the Lagrangian

∇²ₓL(x*, λ*, μ*) = ∇²f(x*) + Σᵢ₌₁ᵖ λᵢ*∇²hᵢ(x*) + Σⱼ₌₁ᵐ μⱼ*∇²gⱼ(x*)

is positive definite (i.e. dᵀ∇²ₓL(x*, λ*, μ*)d > 0) for every d ≠ 0 from the subspace

S = {d ∈ Rⁿ | dᵀ∇hᵢ(x*) = 0, i = 1, …, p; dᵀ∇gⱼ(x*) = 0 for j ∈ A(x*) with μⱼ* > 0},

then x* is an optimal solution of NLP.

For the problem NLP2, since g(x*) < 0 we have

• S = {d ∈ R² | dᵀ∇h(x*) = 0} = {d ∈ R² | (d₁, d₂)(1, 2)ᵀ = 0} = {d ∈ R² | d₁ = −2d₂}.

• ∇²ₓL(x*, λ*, μ*) = [2 0; 0 −2]. Let dᵀ = (d₁, d₂) ∈ S, d ≠ 0 (note that if d₁ ≠ 0, then d₂ ≠ 0 and vice versa). Then

(d₁, d₂)[2 0; 0 −2](d₁, d₂)ᵀ = (−2d₂, d₂)[2 0; 0 −2](−2d₂, d₂)ᵀ = 8d₂² − 2d₂² = 6d₂² > 0.

Therefore, x*ᵀ = (1/3, −2/3) is an optimal solution of NLP2.
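The positive-definiteness of the Lagrangian Hessian on the subspace S can also be checked numerically by sampling directions with d₁ = −2d₂ (an illustrative sketch):

```python
# Check d^T H d > 0 on S = {d : d1 = -2*d2} for the Lagrangian Hessian of NLP2.
import numpy as np

H = np.array([[2.0, 0.0],
              [0.0, -2.0]])        # Hessian of the Lagrangian at (x*, lam*, mu*)

vals = []
for d2 in [1.0, -0.5, 3.0]:        # sample nonzero directions in S
    d = np.array([-2.0 * d2, d2])  # d1 = -2*d2
    vals.append(d @ H @ d)         # equals 6*d2^2

print(vals)                        # all positive: H is positive definite on S
```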


Numerical methods for constrained optimization problems

▶ Except in some simple cases, solving a constrained optimization problem by hand is not a trivial task.

• As a result, there are currently several numerical methods to determine approximate solutions of constrained optimization problems. Some well-known gradient-based methods are:
(I) the sequential quadratic programming (SQP) method,
(II) interior point methods,
(III) penalty function methods,
(IV) the augmented Lagrangian method, etc.

▶ SQP-based algorithms are highly favored for the solution of constrained nonlinear optimization problems.
▶ In the SQP method, a sequence of iterates x⁽ᵏ⁺¹⁾ = xᵏ + αₖdᵏ is generated, where the search directions dᵏ are determined by solving quadratic optimization (programming) subproblems.
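As an illustration, SciPy's SLSQP routine (an SQP-type implementation) recovers the solution of the example NLP2 computed earlier; a minimal sketch:

```python
# Solve NLP2 with an SQP-type method (SciPy's SLSQP) - illustrative sketch.
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 - x[1]**2
constraints = [
    {"type": "eq",   "fun": lambda x: x[0] + 2 * x[1] + 1},  # h(x) = 0
    {"type": "ineq", "fun": lambda x: 3 - x[0] + x[1]},      # x1 - x2 <= 3 as fun >= 0
]

res = minimize(f, x0=np.array([0.0, 0.0]), method="SLSQP",
               constraints=constraints)
print(res.x)   # close to (1/3, -2/3), matching the KKT analysis
```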


Quadratic optimization (programming) problems

• Quadratic optimization problems appear in several engineering applications, e.g. least-squares approximation (regression), model predictive control, the SQP method for NLP, etc.

(I) Quadratic programming problems with equality constraints

(QP)  min_x { f(x) = (1/2)xᵀQx + qᵀx }   (10)
      s.t.                               (11)
      Ax = b,                            (12)

where:
• Q ∈ Rⁿˣⁿ is a symmetric (i.e. Qᵀ = Q) positive definite matrix;
• x ∈ Rⁿ is the (unknown) optimization variable; q ∈ Rⁿ is a given vector;
• A ∈ Rᵐˣⁿ and b ∈ Rᵐ are given; i.e. there are m equality constraints;
• rank(A) = m.


Quadratic optimization ....

The Lagrange function for (QP):

L(x, λ) = (1/2)xᵀQx + qᵀx + λᵀ(Ax − b).

Karush-Kuhn-Tucker conditions:

∂L/∂x = 0 ⇒ Qx + Aᵀλ = −q   (13)
∂L/∂λ = 0 ⇒ Ax = b.          (14)

This implies

[Q Aᵀ; A 0][x; λ] = [−q; b].

• If Q is positive definite and rank(A) = m, then [Q Aᵀ; A 0] is invertible and

[x; λ] = [Q Aᵀ; A 0]⁻¹[−q; b]

⇒ x* = −Q⁻¹q + Q⁻¹Aᵀ(AQ⁻¹Aᵀ)⁻¹(b + AQ⁻¹q),
   λ* = −(AQ⁻¹Aᵀ)⁻¹(b + AQ⁻¹q).
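Solving the KKT system above amounts to one linear solve; a NumPy sketch in which Q, q, A, b are made-up illustrative data:

```python
# Equality-constrained QP: min (1/2) x^T Q x + q^T x  s.t.  Ax = b,
# solved via the KKT linear system (the data below are illustrative).
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])          # symmetric positive definite
q = np.array([1.0, -2.0])
A = np.array([[1.0, 1.0]])          # one equality constraint, rank(A) = 1
b = np.array([1.0])

n, m = Q.shape[0], A.shape[0]
KKT = np.block([[Q, A.T],
                [A, np.zeros((m, m))]])
rhs = np.concatenate([-q, b])

sol = np.linalg.solve(KKT, rhs)
x_star, lam_star = sol[:n], sol[n:]
print(x_star)                        # solution of the QP
print(A @ x_star - b)                # ~0: the constraint holds
```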


Quadratic optimization (programming) problems

Example: A material flow from two reservoirs R1 and R2 flows into a third reservoir R3. The rate of flow from R1 is measured to be F₁ᴹ = 6.5 m³/h and from R2, F₂ᴹ = 14.7 m³/h. As a result of these two inflows, the volume in R3 is measured to change at a rate F₃ᴹ = 19.8 m³/h. The flow-rate measurement devices of the reservoirs are known to incur measurement errors, with standard deviations σ₁ = 0.1 m³/h, σ₂ = 0.2 m³/h and σ₃ = 0.3 m³/h. State an optimization problem to determine the actual flow rates.

Solution: Let F₁, F₂ and F₃ be the true flow rates.
• Note that Fᵢ = Fᵢᴹ ± ηᵢσᵢ, i = 1, 2, 3; the term ±ηᵢσᵢ = Fᵢ − Fᵢᴹ represents the measurement error.
⇒ To minimize all these measurement errors (weighted by the measurement accuracy):

min_{F₁,F₂,F₃} { ((F₁ − F₁ᴹ)/σ₁)² + ((F₂ − F₂ᴹ)/σ₂)² + ((F₃ − F₃ᴹ)/σ₃)² }

s.t. F₁ + F₂ = F₃.
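This reconciliation problem is an equality-constrained QP with a diagonal weighting and has a closed-form solution; the projection formula used below is the standard weighted least-squares result, not stated on the slide:

```python
# Flow-rate reconciliation: min sum ((Fi - Fi_M)/sigma_i)^2  s.t.  F1 + F2 = F3.
import numpy as np

F_M   = np.array([6.5, 14.7, 19.8])  # measured flow rates in m^3/h
sigma = np.array([0.1, 0.2, 0.3])    # standard deviations of the devices
a     = np.array([1.0, 1.0, -1.0])   # constraint a^T F = 0, i.e. F1 + F2 - F3 = 0

# Minimizing (F - F_M)^T W (F - F_M) with W = diag(1/sigma_i^2) subject to
# a^T F = 0 gives F = F_M - (a^T F_M / (a^T W^{-1} a)) * W^{-1} a.
W_inv = sigma**2
F = F_M - (a @ F_M) / (a @ (W_inv * a)) * (W_inv * a)

print(F)              # reconciled flow rates, [6.4, 14.3, 20.7]
print(a @ F)          # ~0: the mass balance F1 + F2 = F3 is satisfied
```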


Quadratic optimization (programming) problems

(II) Quadratic programming problems with equality and inequality constraints

(QP)  min_x { f(x) = (1/2)xᵀQx + qᵀx }   (15)
      s.t.                               (16)
      aᵢᵀx = bᵢ, i = 1, 2, …, p          (17)
      aᵢᵀx ≥ bᵢ, i = p + 1, …, m,        (18)

where Q is symmetric positive definite. If we set

A = [a₁, a₂, …, aₚ]ᵀ, aᵀ = (b₁, b₂, …, bₚ),
B = [aₚ₊₁, aₚ₊₂, …, aₘ]ᵀ, bᵀ = (bₚ₊₁, bₚ₊₂, …, bₘ),

then (QP) can be compactly written as:

(QP)  min_x { f(x) = (1/2)xᵀQx + qᵀx }   (19)
      s.t.                               (20)
      Ax = a                             (21)
      Bx ≥ b.                            (22)

• In general, inequality-constrained (QP)s may not be easy to solve analytically.


Quadratic optimization ... Active Set Method

Active Set Method

Strategy:
• Start from an arbitrary point x⁰.
• Find the next iterate by setting x⁽ᵏ⁺¹⁾ = xᵏ + αₖdᵏ, where αₖ is a step length and dᵏ is a search direction.

Question:
• How do we determine the search direction dᵏ?
• How do we determine the step length αₖ?

(A) Determination of the search direction:
• At the current iterate xᵏ, determine the index set of active constraints

Aₖ = {i | aᵢᵀxᵏ − bᵢ = 0, i = p + 1, …, m} ∪ {1, 2, …, p}.



Quadratic optimization ... Active Set Method

• Solve the direction-finding problem

min_d { (1/2)(xᵏ + d)ᵀQ(xᵏ + d) + qᵀ(xᵏ + d) }
s.t. aᵢᵀ(xᵏ + d) = bᵢ, i ∈ Aₖ.

Expand:
• (1/2)(xᵏ + d)ᵀQ(xᵏ + d) + qᵀ(xᵏ + d) = (1/2)dᵀQd + (1/2)dᵀQxᵏ + (1/2)(xᵏ)ᵀQd + (1/2)(xᵏ)ᵀQxᵏ + qᵀd + qᵀxᵏ
• aᵢᵀ(xᵏ + d) = bᵢ ⇒ aᵢᵀd = bᵢ − aᵢᵀxᵏ = 0.
• Simplify these expressions (using Qᵀ = Q) and drop the constant terms to obtain:

min_d { (1/2)dᵀQd + [Qxᵏ + q]ᵀd }
s.t. aᵢᵀd = 0, i ∈ Aₖ.

Set A to be the matrix with rows aᵢᵀ, i ∈ Aₖ.

• Solve this problem using the KKT conditions to obtain dᵏ.


Quadratic optimization ... Active Set Method

(B) Determination of the step length:

Once dᵏ is obtained, compute αₖ as

αₖ = min[ 1, (bᵢ − aᵢᵀxᵏ)/(aᵢᵀdᵏ) for i ∉ Aₖ with aᵢᵀdᵏ < 0 ].
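One iteration of the scheme above (a direction from the reduced KKT system, then the step-length rule) can be sketched as follows; Q, q, A, b and the starting point are illustrative data, not from the slides:

```python
# One active-set iteration for min (1/2) x^T Q x + q^T x  s.t.  a_i^T x >= b_i
# (illustrative data; all constraints here are inequalities, p = 0).
import numpy as np

Q = np.array([[2.0, 0.0],
              [0.0, 2.0]])
q = np.array([-2.0, -5.0])
A = np.array([[1.0, -2.0],          # rows a_i^T of the inequality constraints
              [-1.0, -2.0],
              [-1.0, 2.0]])
b = np.array([-2.0, -6.0, -2.0])

x = np.array([2.0, 0.0])            # feasible starting point
active = [i for i in range(len(b)) if abs(A[i] @ x - b[i]) < 1e-9]

# Direction finding: min (1/2) d^T Q d + (Qx + q)^T d  s.t.  a_i^T d = 0, i in A_k
Aw = A[active]
m = len(active)
KKT = np.block([[Q, Aw.T], [Aw, np.zeros((m, m))]])
rhs = np.concatenate([-(Q @ x + q), np.zeros(m)])
d = np.linalg.solve(KKT, rhs)[:2]

# Step length: alpha = min(1, (b_i - a_i^T x)/(a_i^T d)) over inactive i with a_i^T d < 0
ratios = [(b[i] - A[i] @ x) / (A[i] @ d)
          for i in range(len(b)) if i not in active and A[i] @ d < 0]
alpha = min([1.0] + ratios)
x_new = x + alpha * d
print(x_new)                        # the next (still feasible) iterate
```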


Quadratic optimization ... Active Set Method

Example: Application Problem

A parallel-flow heat exchanger of a given length is to be designed. The conducting tubes, all of the same diameter, are enclosed in an outer shell of diameter D = 1 m (see figure). The diameter of each of the conducting tubes should not exceed 60 mm. Determine the number of tubes and the tube diameter that give the largest surface area.
